A/B Testing

A data professional might use an A/B test to analyze one of the following metrics: 

  • Average revenue per user: How much revenue does a user generate for a website?
  • Average session duration: How long does a user remain on a website?
  • Click rate: If a user is shown an ad, does the user click on it?
  • Conversion rate: If a user is shown an ad, will that user convert into a customer?

Example: Average revenue per user  

Imagine we’re a data professional who works for an online footwear retailer. The company is trying to grow its business and is researching the average revenue per user on its website. Our team leader asks us to conduct an A/B test to determine whether increasing the size of the “Purchase” button has any effect on average revenue.

A typical A/B test has at least three main features: 

  1. Test design 
  2. Sampling 
  3. Hypothesis testing 

Test design   

First, let’s discuss the fundamental design of an A/B test. 

Randomized controlled experiment 

An A/B test is a basic version of what’s known as a randomized controlled experiment. In a randomized controlled experiment, test subjects are randomly assigned to a control group and a treatment group. The treatment is the new change being tested in the experiment. The control group is not exposed to the treatment. The treatment group is exposed to the treatment. The difference in metric values between the two groups measures the treatment’s effect on the test subjects.

Note: Ideally, exposure to the treatment is the only significant difference between the two groups. This test design allows researchers to control for other factors that might influence the test results and draw causal conclusions about the effect of the treatment. 

In our example, group A is the control group, group B is the treatment group, and the treatment is displaying a larger “Purchase” button. By making the website versions for A and B identical except for the size of the “Purchase” button, we minimize the chance that any observed difference in average revenue is due to other features such as page layout or background. 

Randomization, or randomly assigning test subjects to the control group or treatment group, also helps control the potential effect of other factors on the outcome of the experiment. 
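To make random assignment concrete, here is a minimal sketch in Python. The function name `assign_groups` and the user IDs are hypothetical and chosen only for illustration; a real experiment would typically assign users as they arrive on the site rather than from a fixed list.

```python
import random

def assign_groups(user_ids, seed=None):
    """Randomly split users into a control group (A) and a treatment group (B)."""
    rng = random.Random(seed)      # a seed makes the split reproducible
    shuffled = list(user_ids)
    rng.shuffle(shuffled)          # shuffle so group membership is random
    midpoint = len(shuffled) // 2
    return {"A": shuffled[:midpoint], "B": shuffled[midpoint:]}

# Hypothetical user IDs, for illustration only.
groups = assign_groups(range(1, 11), seed=42)
print("Control (A):  ", groups["A"])
print("Treatment (B):", groups["B"])
```

Because each user lands in A or B purely by chance, factors such as device type or shopping habits should be spread roughly evenly across the two groups.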

Sampling  

Random selection helps us create a representative sample that reflects the characteristics of the overall user population. 

We’ll also need to choose a sample size that is appropriate for our A/B test. In general, the larger the sample size, the more precise the results. However, the choice of sample size depends on both the goal of the analysis and the available budget.
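One common way to pick a sample size for comparing two means is a power calculation based on the normal approximation. The sketch below is an illustration, not the method prescribed by the course: the function name, the 5% significance level, the 80% power, and the revenue figures are all assumptions made for the example.

```python
import math
from statistics import NormalDist

def sample_size_per_group(std_dev, min_detectable_diff, alpha=0.05, power=0.80):
    """Approximate users needed per group for a two-sided test of two means.

    Uses the normal approximation n = 2 * ((z_alpha + z_beta) * sd / diff)^2,
    assuming both groups share the same standard deviation.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for significance
    z_beta = NormalDist().inv_cdf(power)           # critical value for power
    n = 2 * ((z_alpha + z_beta) * std_dev / min_detectable_diff) ** 2
    return math.ceil(n)

# Assumed values: revenue per user has a standard deviation of 20,
# and we want to detect a difference of at least 2 between A and B.
n = sample_size_per_group(std_dev=20.0, min_detectable_diff=2.0)
print("Users needed per group:", n)
```

The formula shows the trade-off directly: halving the difference we want to detect roughly quadruples the number of users we need, which is where budget enters the decision.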

Hypothesis testing 

For the purpose of our example, let’s say we run the online test, collect our data, and discover that group B has a higher average revenue per user than group A. 

The next step is to determine whether the observed difference in our data is statistically significant or due to chance. A/B tests use two-sample hypothesis tests to draw conclusions about statistical significance. To determine whether the observed difference in average revenue per user is statistically significant, we conduct a two-sample t-test. We formulate our hypotheses as follows: 

  • H0: There is no difference in average revenue per user between A and B
  • Ha: There is a difference in average revenue per user between A and B 
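As a rough illustration of how these hypotheses can be tested, the sketch below computes the statistic for Welch's version of the two-sample t-test, which does not assume equal variances in the two groups. The revenue figures are made up for the example, and the p-value uses a normal approximation that is only reasonable for large samples. In practice, a library routine such as SciPy's `scipy.stats.ttest_ind` with `equal_var=False` gives the p-value from the t-distribution.

```python
import math
from statistics import NormalDist, mean, variance

def welch_t_test(sample_a, sample_b):
    """Return (t statistic, degrees of freedom, approximate two-sided p-value)."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each group's mean (sample variance / n).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t_stat = (mean(sample_b) - mean(sample_a)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    # Assumption: normal approximation to the t-distribution (large samples only).
    p_approx = 2 * (1 - NormalDist().cdf(abs(t_stat)))
    return t_stat, df, p_approx

# Made-up revenue per user, for illustration only.
revenue_a = [42.0, 38.5, 51.0, 47.2, 39.9, 44.1, 40.3, 49.8]
revenue_b = [48.3, 45.0, 55.6, 52.1, 43.7, 50.9, 47.4, 54.2]

t_stat, df, p_approx = welch_t_test(revenue_a, revenue_b)
print(f"t = {t_stat:.3f}, df = {df:.1f}, approximate p = {p_approx:.4f}")
```

A positive t statistic here means group B's average is higher than group A's. We would reject H0 only if the p-value falls below the significance level chosen before running the test.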

Results

Suppose our t-test returns a p-value below our chosen significance level (5% is a common choice). In that case, we reject the null hypothesis and conclude that the observed increase in average revenue per user is statistically significant.

The results of our A/B test help us decide whether or not to recommend a design change for our company’s website. 


Disclaimer: Like most of my posts, this content is intended solely for educational purposes and was created primarily for my personal reference. At times, I may rephrase original texts, and in some cases, I include materials such as graphs, equations, and datasets directly from their original sources.

I typically reference a variety of sources and update my posts whenever new or related information becomes available. For this particular post, the primary source was Google Advanced Data Analytics Professional Certificate.