A/B/n and Multivariate testing

“The difference between theory and practice is larger in practice than the difference between theory and practice in theory.” — Jan L.A. van de Snepscheut

We already discussed that A/B testing is a way to compare two versions of something to figure out which performs better. Now, let’s look at A/B/n and Multivariate tests a bit closer:

A/B/n testing is an extension of A/B testing in which more than two versions are compared against each other at once. We use it when we have more than one proposed variation to test our hypothesis. A major advantage is that it lets us test all the variations at the same time, so they are all exposed to the same seasonality and other external effects. If we tested the variations one by one in separate A/B tests, each test could be exposed to different external effects, which would make their comparative results hard to interpret and rely on.

Example of an A/B/C/D test in which Airbnb tested four different wordings for a CTA button on the property listing page:

Multivariate testing (MVT) is also an extension of A/B testing where we create multiple variations of multiple elements and then test all the combinations simultaneously. We use it to measure interaction effects between multiple independent elements and to understand the contribution of each element to see which combination of variations works best.

An example of an MVT in which Microsoft Office tested combinations of elements (title, hero image, CTA, description and link location) to find out which combination works best. In the end, the version on the right was selected as the winner.

Even though they have different technical names, in practice A/B/n and Multivariate tests are just extensions of A/B testing, and we use the term A/B testing as an umbrella term for testing different versions regardless of the number of variations or elements being tested.

If we think of it that way, in the end A/B/n tests are just A/B tests with more than two variations and Multivariate tests are A/B/n tests where we make changes on multiple elements at the same time.

The only difference when it comes to Multivariate tests is how we technically set up the test. Rather than creating every combination of the variations of multiple elements manually, we only need to define the factors for the elements and add them to our test. The testing software then generates the combinations, which makes the setup easier and more efficient for us.
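To make the "combination part" more concrete, here is a minimal sketch of how a tool might expand factors into combinations. The factor names and variation labels are made up for illustration, and real testing tools will have their own setup interfaces.

```python
from itertools import product

# Hypothetical factors: each element on the page and its variations.
factors = {
    "title": ["Title A", "Title B"],
    "hero_image": ["Image A", "Image B"],
    "cta": ["Buy now", "Learn more", "Try it free"],
}


def build_combinations(factors):
    """Return every combination of variations as a list of dicts."""
    names = list(factors)
    return [dict(zip(names, values)) for values in product(*factors.values())]


combinations = build_combinations(factors)
print(len(combinations))  # 2 * 2 * 3 = 12 combinations
print(combinations[0])
```

Note how quickly the number of combinations grows: adding one more element with two variations would already double it to 24.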

We need to be aware of the following when we are conducting A/B/n and Multivariate tests:

  • The more variations we have, the more time and traffic we need, because the traffic is divided among all the variations. We therefore need either enough traffic or enough time to reach satisfactory results.
  • The higher the number of variations, the higher the risk of making a false decision, because the probability of at least one false positive grows with each additional variation. Even though it is easier to get a significant result when we have more variations, our chances of picking a winner that is not a real winner are also higher.
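The traffic point can be illustrated with a rough back-of-the-envelope sketch. It assumes traffic is split evenly across all variations (including the control) and that we already know how many visitors each variation needs. The function name and the numbers are hypothetical, not taken from a real test.

```python
import math


def days_to_finish(daily_visitors, num_variations, visitors_needed_per_variation):
    """Estimate test duration in days, assuming an even traffic split."""
    visitors_per_variation_per_day = daily_visitors / num_variations
    return math.ceil(visitors_needed_per_variation / visitors_per_variation_per_day)


# Hypothetical numbers: 10,000 visitors per day, 20,000 needed per variation.
for n in (2, 4, 8):
    print(n, "variations:", days_to_finish(10_000, n, 20_000), "days")
```

With the same daily traffic, doubling the number of variations doubles the time we need to wait.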

Google’s 41 shades of blue test is a good “bad example” of this: in 2009, when Google could not decide which shade of blue would generate the most clicks on its search results page, it decided to test 41 shades of blue. In the end, they got significant results and found a winner, but the probability of a false positive was so high that it is hard to believe the outcome of the test was reliable.
In general, this test is a bit controversial, since there was no solid hypothesis behind it, no contrast between the variations from the users’ perspective, no expected change in user behavior and a high probability of getting significant results purely by coincidence.
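To get a feeling for how fast the false positive risk grows, here is a simplified sketch. It assumes that every variation is compared against one baseline at a 5% significance level and that the comparisons are independent, which is only an approximation in practice. We do not know how Google actually analyzed its test, so the 40 comparisons below are purely an assumption for illustration.

```python
def family_wise_error_rate(num_comparisons, alpha=0.05):
    """Probability of at least one false positive across all comparisons.

    Simplifying assumption: the comparisons are independent and each one
    is tested at the same significance level alpha.
    """
    return 1 - (1 - alpha) ** num_comparisons


# 1 comparison = classic A/B test, 3 = A/B/C/D test,
# 40 = 41 variations compared against one baseline (assumed setup).
for comparisons in (1, 3, 40):
    rate = family_wise_error_rate(comparisons)
    print(f"{comparisons} comparisons: {rate:.1%}")
```

Under these assumptions, a test with 40 comparisons has roughly an 87% chance of showing at least one "significant" difference even if none of the variations really differs from the baseline.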

To mitigate these limitations, as a best practice, we should make sure that we have a solid hypothesis and all the variations are in line with it, rather than testing all the options that spring to our mind. The important thing is to optimize the number of variations by making sure that the variations make sense and we have enough contrast between all our variations.

Here you can read the previous article in this A/B testing essentials mini series which is about “How much change can I test with A/B testing and when is it too much?”: https://aybalacoskun.medium.com/how-much-change-can-i-test-with-a-b-testing-and-when-is-it-too-much-1b1f91931e46

We’ll be discussing how to come up with the right hypothesis next time. So, stay tuned!

References:

https://www.nngroup.com/articles/multivariate-testing/

https://cxl.com/blog/multivariate-tests/

https://www.fastcompany.com/90181713/google-equates-design-with-endless-testing-theyre-wrong

Product Owner for A/B Testing @idealo
