Marketing Science

A/B Testing in Theory and Practice

Posted by Dave Ingram on July 31, 2015

The other day my niece asked me what the steps of the scientific method are. I responded with “form a hypothesis and run an experiment”, but she wasn’t satisfied. There was something with a “P” word in there and she wanted to get all of the steps right. When I googled the steps, numerous elementary school websites appeared, expounding on the scientific method, which both gave me hope for our next generation of scientists and got me wondering who else was missing some of the steps.

I was thinking about this exchange later, and how essential it is for marketers to think about and follow the steps of the scientific method when working on their websites and other marketing materials. The proliferation of A/B testing tools over the past few years has made one step much easier, but the value of these tools is greatly diminished without a proper grounding in the scientific method.

So without further ado, here are the steps as defined on Wikipedia:

  1. Make Observations
  2. Think of Interesting Questions
  3. Formulate Hypotheses
  4. Develop Testable Predictions
  5. Gather Data to Test Predictions
  6. Refine, Alter, Expand or Reject Hypotheses
  7. Repeat 4 through 6 as necessary
  8. Develop General Theories

The Scientific Method

I want to talk through each of these, but the tl;dr of this article is that most examples I see focus only on running lots of experiments (step 5) and miss doing it in such a way that general theories can be established. This is a greatly missed opportunity and should be remedied posthaste.

Make Observations

You’ll never see anything interesting unless you stop, open your eyes, and pay attention. Opportunities for getting to know your audience abound in both the physical world and online. Surveying your users and performing user testing (made far easier with services like usertesting.com) are excellent places to start. Web analytics tools such as Google Analytics or Kissmetrics are also indispensable in this regard.

There are entire professions dedicated to both user testing and advanced analytics, and this step should be taken very seriously. If you’re not taking enough time on this today, be sure to block off time to just stop and observe.

Think of Interesting Questions

As you observe, questions will undoubtedly begin to form. Write them down and continue to ask more. If you see something interesting, ask why, then ask why again. In fact, ask why five times and you’ll undoubtedly get somewhere of interest.

Formulate Hypotheses

Only with a well-formed hypothesis will you be able to make accurate statements in the following steps. A good hypothesis has the end goal of a general theory in mind, not just a question of “which image will work better in my call to action?”

A good hypothesis could look like “my audience will respond better to casual than formal language.” What makes this a good one? It leads to easily testable predictions as well as the potential of a general theory.

Develop Testable Predictions

A testable prediction is a smaller unit than a hypothesis. In other words, if the hypothesis says “I think this is the general rule”, the prediction says “using this specific piece of language will perform better than this other specific piece of language.” Once you get that far, you’re ready to launch an experiment.

Gather Data to Test Predictions

Running the actual experiment can be done in many different ways. There are dedicated A/B testing solutions for the web such as Optimizely or Visual Website Optimizer, and many other tools, such as personalization solutions, email marketing tools and landing page tools, include A/B testing as well. There are also many open source testing frameworks that can be used to perform this step.

The important part is that you set up a system that allows you to run a lot of tests often. This will pay dividends over time, as you’ll see in the next steps.
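Whatever tool you use, each test comes down to comparing conversion counts for two variants. Here is a minimal sketch of that comparison using a two-proportion z-test, which is one common choice and not necessarily what any particular tool does internally. The function name and the visitor counts are made up for illustration.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare variant B against variant A.

    conv_*: number of conversions, n_*: number of visitors.
    Returns (lift, z, two-sided p-value), where lift = rate_b - rate_a.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the assumption that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_b - rate_a, z, p_value

# Invented numbers: 100 of 1000 visitors convert on A, 150 of 1000 on B.
lift, z, p_value = two_proportion_z_test(100, 1000, 150, 1000)
print(f"lift={lift:.3f} z={z:.2f} p={p_value:.4f}")
```

A small p-value only says this one prediction held up in this one test. It says nothing yet about the broader hypothesis.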

Refine, Alter, Expand or Reject Hypotheses

A single experiment isn’t enough to form a general theory or to reject a hypothesis. There are far too many confounding conditions for that to be the case. For example, let’s say you run an experiment on your home page using the hypothesis that casual language is better than formal language, and your testable prediction is that the phrase “Hey y’all, how can we help?” will perform better than “Visit our support page for assistance”. It is indeed true that one phrase is more casual and the other more formal, but perhaps the former phrase is too colloquial for your audience and has simply gone too far. Casual language in general may indeed be better than formal, but not in this specific case.

Other things can “go wrong” with your test as well. For example, a large conference in the Southern United States during the week that you run this test may lead to a large group of people finding the “hey y’all” language to be preferable, but the next week, as the characteristics of your audience change, this may no longer be the case. So as you see results from an initial test, you should refine and expand your original hypothesis until it is precise enough to be tested. Surprising effects can also be found in this way, often leading to cases where personalization can be employed to speak to different kinds of people in different ways.
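One way to spot this kind of audience effect is to break the results down by segment instead of looking only at the overall numbers. The sketch below assumes you can export one row per visitor with a segment label, the variant shown, and whether they converted. The segment names and the data are invented for illustration.

```python
from collections import defaultdict

def rates_by_segment(rows):
    """Conversion rate per (segment, variant).

    rows: iterable of (segment, variant, converted) tuples.
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [conversions, visitors]
    for segment, variant, converted in rows:
        counts[(segment, variant)][0] += int(converted)
        counts[(segment, variant)][1] += 1
    return {key: conv / n for key, (conv, n) in counts.items()}

# Invented example data, not real results.
rows = (
    [("conference", "casual", True)] * 30 + [("conference", "casual", False)] * 70
    + [("conference", "formal", True)] * 10 + [("conference", "formal", False)] * 90
    + [("everyone_else", "casual", True)] * 10 + [("everyone_else", "casual", False)] * 90
    + [("everyone_else", "formal", True)] * 12 + [("everyone_else", "formal", False)] * 88
)

for key, rate in sorted(rates_by_segment(rows).items()):
    print(key, round(rate, 2))
```

In this made-up data the casual variant wins overall, but only because of the conference visitors. For everyone else the formal variant does slightly better, which is exactly the kind of result that should send you back to refine the hypothesis.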

Repeat 4 through 6 as necessary

Eventually you’ll want to either accept or reject your overall hypothesis, and the importance of the question will help dictate how many different tests should be run. For example, if you’re trying to determine the overall tone of language to use in all marketing materials, this warrants many carefully planned tests, as it could affect the overall trajectory of your business. If, however, you’re simply trying to determine the optimal size or color of a specific button, the risk of getting that wrong is likely smaller and the question can be answered in a few short iterations.

Develop General Theories

General theories are the gold you are mining for when embarking on an optimization plan. In the physical sciences, generalized theories about physics led to a world where man could walk on the moon. While perhaps less lofty, generalized theories of audience behavior can lead to far greater benefits in business than simply testing everything you can think of. If you can prove over many experiments that a certain tone of voice in your marketing material truly leads to a substantial lift in conversions, then you can use that knowledge in all your materials and move on to testing other questions.
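To make “over many experiments” concrete, here is a rough sketch of one way to combine the lifts from several tests of the same hypothesis into a single estimate. It uses an inverse-variance weighted average (a simple fixed-effect approach), which assumes every test is measuring the same underlying effect. That is my assumption for the example, not a rule, and the function name and numbers are invented.

```python
import math

def pooled_lift(experiments):
    """Inverse-variance weighted average lift across experiments.

    experiments: list of (conv_a, n_a, conv_b, n_b) tuples, one per test.
    Returns (pooled_lift, standard_error).
    """
    weight_sum = 0.0
    weighted_lift_sum = 0.0
    for conv_a, n_a, conv_b, n_b in experiments:
        rate_a = conv_a / n_a
        rate_b = conv_b / n_b
        # Variance of the difference between two independent rates.
        variance = rate_a * (1 - rate_a) / n_a + rate_b * (1 - rate_b) / n_b
        weight = 1 / variance  # assumes neither rate is exactly 0 or 1
        weight_sum += weight
        weighted_lift_sum += weight * (rate_b - rate_a)
    return weighted_lift_sum / weight_sum, math.sqrt(1 / weight_sum)

# Invented results from three tests of casual (B) versus formal (A) copy.
tests = [(100, 1000, 150, 1000), (40, 500, 45, 500), (210, 2000, 260, 2000)]
lift, std_err = pooled_lift(tests)
print(f"pooled lift={lift:.4f} +/- {1.96 * std_err:.4f}")
```

Larger, less noisy tests get more weight. If the individual lifts disagree wildly, that is a sign the hypothesis needs refining before you treat the pooled number as a general theory.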

In Conclusion

Running lots of experiments can have some value, but keeping in mind the greater picture and thinking like a scientist will provide far greater rewards in the long term. Work towards general theories about your brand, your market or your audience and you will get far more out of your investment in optimization and personalization.