Here is the link to the page. We have three pricing tests running here. PayPal conversion is the goal. http://unbouncepages.com/personal-medicine-plus-mvp/ Conversion rate so far is 26%, but I am confused about the significance of the confidence level. What does the confidence level need to be, and how many views (n) do we need for the test to be significant?
Hey, I'm a conversion optimization consultant so I can offer some help here.
Whether a conversion rate is "good" or not is a relative measure. If you run a test and get it to 35% from 26%, that's good! If you go from 26% to 10%, then you know you could be doing better. Comparing your conversion rate to industry averages rarely tells you anything meaningful or actionable, so the best thing you can do is keep testing and compare against your own baseline.
The confidence level is the likelihood that the difference you detected is real rather than random noise. For example, if you flip a coin 10 times it will likely land on one side more than the other. If you stopped testing after 10 flips, you might assume that one side has a higher probability than the other... but we both know that's wrong. The real result (50/50) would only begin to emerge after many more flips, and even then you would never have 100% confidence.
So you don't want to stop flipping (i.e., testing) too soon, because you might draw the wrong conclusion ("the coin landed on tails 6 out of 10 times, therefore tails beats heads by 20%"). You also want to know when to stop and what the smallest meaningful difference is ("the coin landed on tails 49 out of 100 times, and the 2% difference is not significant, so we have no evidence that tails behaves differently from heads"). The confidence level tells you when you've reached this point.
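To make the coin example concrete, here's a quick sketch (plain Python, no external libraries) of the exact two-sided binomial test behind that intuition. The function name is mine, not from any A/B testing tool:

```python
from math import comb

def binomial_p_value(successes, n, p=0.5):
    """Two-sided exact binomial p-value: the probability of seeing a
    result at least as extreme as the observed one, if the coin is fair."""
    prob = lambda k: comb(n, k) * p**k * (1 - p)**(n - k)
    observed = prob(successes)
    # Sum the probabilities of every outcome no more likely than the observed one
    return sum(prob(k) for k in range(n + 1) if prob(k) <= observed + 1e-12)

# 6 tails out of 10 flips: p ~ 0.75, nowhere near significant
print(binomial_p_value(6, 10))
# 60 tails out of 100 flips: closer, but still above the 0.05 cutoff
print(binomial_p_value(60, 100))
```

Notice how the same 60/40 split gets less "random-looking" as the sample grows; that is exactly why sample size matters before you draw conclusions.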
Before you start any A/B test, you need to know how many "flips" (the sample size, i.e., visitors) are needed to reach a result that is likely to be true, and what the minimum significant change is (is a 2% increase significant, or is it just random fluctuation?). Here is a handy calculator for working this out before you start your test: http://www.evanmiller.org/ab-testing/sample-size.html
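If you're curious what that calculator is doing under the hood, here's a rough sketch of the standard two-proportion sample-size formula in plain Python. The linked calculator may use slightly different assumptions, so treat the numbers as ballpark, not gospel:

```python
import math
from statistics import NormalDist

def sample_size_per_variant(baseline, mde, alpha=0.05, power=0.80):
    """Visitors needed per variation to detect an absolute lift of `mde`
    over a `baseline` conversion rate, with a two-sided test."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # ~1.96 for 95% confidence
    z_beta = z.inv_cdf(power)            # ~0.84 for 80% power
    p1, p2 = baseline, baseline + mde
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(numerator / mde ** 2)

# Baseline of 26%, hoping to detect an absolute lift of 5 points (to 31%):
print(sample_size_per_variant(0.26, 0.05))  # on the order of 1,300 per variant
```

The key takeaway: the smaller the lift you want to detect, the more visitors you need, and the relationship is quadratic, so halving the minimum detectable effect roughly quadruples the required traffic.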
If you (or anyone reading) would like some help in setting up and running A/B tests to increase conversions, get in touch!
A 95% significance threshold simply means there is a 5% chance that the result you're seeing is due to random variation in the sample you're analyzing. In other words, you can be 95% confident that the variation really does perform better (or worse) than the original page you're testing.
If you want to run a test that really has an impact you should:
1. Compute the sample size (here a tool: http://www.evanmiller.org/ab-testing/sample-size.html )
2. Run the test for at least 7 days
3. Record at least 300 conversions per variation
4. Check significance here ( http://www.evanmiller.org/ab-testing/chi-squared.html )
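For anyone who wants to see what the chi-squared check in step 4 actually computes, here's a minimal sketch in plain Python (a 2x2 test with 1 degree of freedom, no Yates correction; the linked calculator may differ slightly). The function name and sample numbers are illustrative, not from any real test:

```python
from statistics import NormalDist

def chi_squared_p(conv_a, n_a, conv_b, n_b):
    """P-value for a 2x2 chi-squared test comparing two conversion rates."""
    fail_a, fail_b = n_a - conv_a, n_b - conv_b
    # Expected counts under the null hypothesis that both rates are equal
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    chi2 = 0.0
    for observed, expected in [
        (conv_a, n_a * p_pool), (fail_a, n_a * (1 - p_pool)),
        (conv_b, n_b * p_pool), (fail_b, n_b * (1 - p_pool)),
    ]:
        chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, the chi-squared tail equals the
    # two-sided tail of a standard normal at sqrt(chi2)
    return 2 * (1 - NormalDist().cdf(chi2 ** 0.5))

# 300 vs 360 conversions out of 1,200 visitors each: significant
print(chi_squared_p(300, 1200, 360, 1200))
# 300 vs 310 out of 1,200 each: not significant
print(chi_squared_p(300, 1200, 310, 1200))
```

A p-value below 0.05 corresponds to the 95% confidence threshold mentioned below.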
The more certain you want to be about the result, the higher the threshold: 95% is the accepted standard, while 99.5% significance means you can be quite sure the result is not due to chance.
I run hundreds of tests for my customers, and what I can tell you is that the best tests are the ones with a big difference between the two pages (a 20% increase or more). You should run the test for at least one week, make sure you don't make the sample "dirty" (e.g., changing traffic sources or sending a special-offer newsletter mid-test), and collect at least 300 conversions for each variation.
If you happen to run a good test, what you will see in the chart in your A/B testing tool is that the line representing the winning variation stays above the losing variation's line throughout.
That's the sign of a very good test!
And by the way, don't forget that each test you run teaches you something about your customers.
Hope this helped.
26% is a satisfactory conversion rate. It differs for everyone; there is no universal benchmark. Can you improve on it? Surely! Try bringing in more layers of analytics, like heatmaps and multivariate testing, to see which tweaks have the most impact.