A/B testing is a lot like driving a car.
When you first start out, you double-check everything. You adjust your mirrors and check your lights before every drive. But as you get more comfortable behind the wheel, you start to let some of the more delicate details go. You take shortcuts.
You start making mistakes.
Just as most accidents reportedly happen close to home, the more old-hat you become at A/B testing, the more likely you are to start letting things slip through the cracks.
Today, you’ll learn about five of the most common mistakes in A/B testing, so whether it’s your first or fiftieth test, you can avoid them and get back on track.
Before we dive into what they are, though, let’s clear the air about the test itself.
Commonly used in conversion rate optimization, A/B testing is an experimental method in which two versions differing in a single variable, such as the size of a CTA, the length of copy, or the timing of a triggered offer, are split-tested between user groups.
The purpose is to determine which variant of a webpage or process, referred to as the “A” and “B” variant respectively, yields the most desirable results.
Usually, but not always, those results are conversions, such as sales or subscriptions, but they can include any target behavior the marketer is interested in generating, such as social shares.
And while it’s a powerful tool for increasing sales and optimizing an online business, A/B testing isn’t impervious to mistakes. Here are just five of the most common pitfalls.
5 common mistakes to avoid during A/B testing
A/B testing is an easy experiment to run and great for improving your performance in the digital sphere, but that very ease makes it tempting to cut corners, and cut corners can hurt your bottom line. That's why we've put together these simple (but costly) mistakes and how you can avoid them.
1. Skipping the hypothesis
It’s entirely possible to run an A/B test without a hypothesis, but similar to the creation of instant rice, just because you can do something doesn’t mean you should.
Your hypothesis is a definitive statement about what is being tested, why it’s being tested, and what the expected results are.
Having a hypothesis forces you to think out the rationale of your experiment — which means cutting back on unnecessary tests and ensuring that the tests you do run produce actionable insights or results, even if your hypothesis is proven wrong.
Any hypotheses you form should, ideally, be based on insights and data from prior tests or analytics. The more data-backed your reasoning, the better.
For instance, a strong hypothesis might look like this:
“Triggering the pop-up offer on Page XYZ five seconds later will improve email subscription conversion rates. This outcome is expected because increased lag will allow users more time to engage with the content before being asked for input.”
A weaker hypothesis, on the other hand, would look like this:
“Adding a five-second delay to the pop-up offer will improve email conversion rates.”
As a rule of thumb, your hypothesis should answer the what, how, and why about your A/B test. Another way to think about this rule and frame your hypothesis is the “if, then, and because” formula.
“[If] we change this variable, [then] this result will happen, [because] of this reason.”
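If it helps to keep yourself honest, the formula can even be captured as structured data so every test starts from a complete hypothesis. A minimal sketch in Python (the class and field names here are ours, not a standard):

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    change: str     # [if] we change this variable...
    expected: str   # [then] this result will happen...
    rationale: str  # [because] of this reason

    def statement(self) -> str:
        """Render the hypothesis as a single if/then/because sentence."""
        return f"If {self.change}, then {self.expected}, because {self.rationale}."


# The strong pop-up example from above, expressed in this template:
popup = Hypothesis(
    change="we trigger the pop-up offer on Page XYZ five seconds later",
    expected="email subscription conversion rates will improve",
    rationale="users get more time to engage before being asked for input",
)
```

If you can't fill in all three fields, you're not ready to run the test.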
Don’t skip the hypothesis. It’s a small, but essential, step to making your A/B test as efficient as possible.
2. Excluding the control
All experiments need a control group. Your control group is made up of users who continue to receive and interact with the original, unchanged version.
If you’re running an experiment to determine the optimal length of a form, for instance, one of the pages you test should include the original form length.
This way, when you’re conducting your test, you can get an accurate picture of whether or not changing the length produces enough of a conversion lift to justify the change.
Which isn’t to say that you can’t test multiple variants. You just need to create additional pages for your variants and keep your control as part of the lineup. If your control is already outperforming your variants and you skip over it and roll out the changes anyway, you could end up losing conversions in the long run and be none the wiser.
Remember, even small rate changes can translate into significant financial gains (or losses).
As another example, consider this: testing against the control helped Arenaturist reveal a 52% performance difference.
But what if they had enacted their changes without the control group and the original horizontal layout actually outperformed the variants? They might have erroneously cut their performance instead of uplifting their conversions.
And they’d have no way of pinpointing why.
Best practices dictate splitting your traffic evenly across your control and your variants. If you’re testing one variant against the control, the split should be 50/50. If you’re testing two against the control, it should be roughly 33/33/33.
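One common way to get an even, stable split is deterministic hash-based bucketing, so each user always lands in the same variant on repeat visits. A sketch of the idea, assuming a string user ID (the function and variant names are illustrative, not from any particular testing tool):

```python
import hashlib


def assign_variant(user_id: str, variants=("control", "B", "C")) -> str:
    """Deterministically bucket a user into one of the variants.

    Hashing the user ID spreads traffic evenly across the variants
    and guarantees a returning user always sees the same version,
    which a naive random.choice() call would not.
    """
    digest = hashlib.sha256(user_id.encode()).hexdigest()
    bucket = int(digest, 16) % len(variants)
    return variants[bucket]
```

Because the assignment depends only on the user ID, you can reproduce any user's bucket later during analysis.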
But while you can test as many variants as you want, be wary of making the next mistake and testing too many variables.
3. Overloading variables in a testing period
Running a test is exciting: it inspires hope for more revenue, shorter sales cycles, better customer lifetime value, and so many other things.
It’s so exciting that you might be tempted to test every possible element at once. After all, the more you improve your website, the closer you’ll get to your end goal, right?
But there’s a pitfall in this approach: if you change too many variables at once, you won’t be able to measure precisely which variables are responsible for the performance difference.
For instance, let’s go back to the pop-up example.
Instead of changing just the timing, you change the graphics, the copy, and the button placement, too. One variant performs better than the other. But why? Which element was responsible for the change? It could’ve been all of them or a single component, but if you’re testing them at the same time, your best guess is all you have.
It also makes your A/B test more like a multivariate test. Multivariate testing shares many of the same core mechanics with an A/B test, but it requires a lot more traffic to be statistically valid.
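To see why more variables mean more traffic, you can estimate the sample size each variant needs before the test can detect a given lift. A rough per-variant estimate using the standard normal approximation for comparing two conversion rates (the function name and defaults are our own choices):

```python
import math
from statistics import NormalDist


def sample_size_per_variant(p_base: float, mde: float,
                            alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate visitors needed per variant to detect an absolute
    lift of `mde` over a baseline conversion rate `p_base`,
    at significance level `alpha` with the given statistical power."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    p_var = p_base + mde
    variance = p_base * (1 - p_base) + p_var * (1 - p_var)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / mde ** 2)
```

Every extra variant adds another full sample of that size to your traffic bill, and smaller effects inflate the requirement quadratically, which is exactly why multivariate tests demand so much more traffic.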
Speaking of validity, are you checking yours?
4. Overlooking external validity during analysis
The traditional model of a scientific experiment is a highly controlled environment like a laboratory. This allows researchers to isolate variables more precisely, but it also creates barriers to generalizing the data to the real world.
This concern is referred to as validity, which is typically distinguished into “internal” and “external” forms. Internal validity refers to the strength of the experiment’s protocols themselves.
External validity, conversely, refers to the generalizability of those results outside of the experiment.
And while marketing experiments like A/B tests naturally have higher external validity than what you’d encounter in a laboratory test, they’re not exempt from validity issues.
When you’re analyzing the data, you need to identify and account for any factors that could impact the results and change the external validity.
As an example: running the test during holiday time frames, when traffic is naturally heavier, or conducting the analysis during a limited-time promotion. In both cases, your website could be experiencing a traffic flow that’s far from the norm.
If you don’t address the validity issues before implementing changes, you could vastly undercut your “normal” performance when conditions return to baseline, such as when the holiday season is over.
So check your time frames and compare them against the rest of the year. If they’re out of the ordinary, you could be facing test results with low external validity.
And those could spell out even lower returns on investment in the long run.
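That time-frame check can be as simple as comparing daily traffic during the test window against a rest-of-year baseline. A quick sanity-check sketch, assuming you have daily visitor counts on hand (the threshold of two standard deviations is our own rule of thumb, not an industry standard):

```python
from statistics import mean, stdev


def traffic_is_typical(test_window: list[float],
                       baseline_days: list[float],
                       z_threshold: float = 2.0) -> bool:
    """Return True if average daily traffic during the test window
    is within `z_threshold` standard deviations of the baseline,
    i.e. the test probably ran under typical conditions."""
    mu = mean(baseline_days)
    sigma = stdev(baseline_days)
    z = abs(mean(test_window) - mu) / sigma
    return z <= z_threshold
```

If this flags your test window as atypical, treat the results as low in external validity and consider rerunning the test under normal conditions.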
By the way, what is your long-run plan? If you don’t have one, you could be making the last mistake on this list today.
5. Creating ambiguous end goals
Why are you running an A/B test? If you say it’s to improve conversions, that’s not actionable or narrow enough.
Nebulous end goals like “improve conversions” or “reduce bounce rate” are too open-ended. How will you know when the experiment has yielded the results you want? When you convert at 0.5% more? When you reduce bounce rate by 2%?
Your end goal dictates when you stop testing and start rolling out the changes (if any). Setting concrete parameters for your test, such as a 2.5% or more revenue change, prevents you from wasting time and money on unnecessary experiments.
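Part of that stopping decision is confirming the lift you see is real rather than noise. A common way to check is a two-proportion z-test comparing the control's conversion rate against the variant's; here is a minimal sketch (the function name and alpha default are our own):

```python
from math import sqrt
from statistics import NormalDist


def lift_is_significant(conv_a: int, n_a: int,
                        conv_b: int, n_b: int,
                        alpha: float = 0.05) -> bool:
    """Two-proportion z-test: does variant B's conversion rate differ
    significantly from control A's, at significance level `alpha`?"""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)          # pooled rate under H0
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))      # two-sided
    return p_value < alpha
```

Pair a check like this with the sample size you committed to up front; peeking at significance early and stopping the moment it flips is its own well-known pitfall.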
Borrowing from the earlier example, the corresponding goal to accompany your strong hypothesis might look like this:
“Adding the delay to the pop-up offer should increase email sign-ups by X%, and thus put more leads into the sales funnel.”
That’s not all having a quantifiable end goal can do for you, either. It also allows you to align your testing and analysis with your desired results more closely.
For instance, if your end goal is to increase sign-ups by X%, you’ll know to measure the size of your list and not the click-through rate on the pop-up.
As the adage often attributed to Peter Drucker goes, “If you can’t measure it, you can’t improve it.”
So make sure your goals can be measured. Your results will be easier to justify to stakeholders and more actionable.
A/B testing is an easy and critical tool in your conversion optimization arsenal, but you need to make sure you’re using the instrument to its full potential.
Starting with a strong hypothesis, testing against the control, limiting the number of variables you test, checking your validity, and creating concrete end goals are just a few steps you can take to keep that tool honed.
The better your tool, the better your results. And the results are the point, aren’t they?