Questions tagged [ab-test]

A/B testing, also known as split or bucket testing, is a controlled comparison of the effectiveness of variants of a website, email, or other commercial product.

An A/B test, also called a split or bucket test, is a colloquial term for a controlled experiment in which users are randomly exposed to one of several variants of a product, often a website feature.

The response (dependent) variable is most often count data (such as clicks on links or sales) but may be a continuous measure (such as time on site). Count data are sometimes converted to rates for analysis.
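As a rough illustration of analyzing counts as rates, the sketch below runs a two-sided two-proportion z-test on conversion counts using only the standard library. The function name and the counts are hypothetical, and the z-test is just one common choice for this kind of data; it assumes independent users and samples large enough for the normal approximation.

```python
import math


def two_proportion_z_test(successes_a, trials_a, successes_b, trials_b):
    """Two-sided z-test for a difference between two conversion rates.

    Hypothetical helper for illustration; relies on the normal approximation.
    """
    rate_a = successes_a / trials_a
    rate_b = successes_b / trials_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (successes_a + successes_b) / (trials_a + trials_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / trials_a + 1 / trials_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_a, rate_b, z, p_value


# Made-up counts: 200 of 5000 users clicked in A, 250 of 5000 in B.
rate_a, rate_b, z, p_value = two_proportion_z_test(200, 5000, 250, 5000)
print(f"rate A={rate_a:.3f}, rate B={rate_b:.3f}, z={z:.2f}, p={p_value:.4f}")
```

For small samples or very low rates, an exact test may be more appropriate than this approximation.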

Because they create temporary variants of 'live' websites, online A/B tests must overcome several challenges not common in traditional experiments on human preference. For example, differential caching of test versions may degrade website performance for some versions. Users may be shown multiple variants if they return to a website and are not successfully identified with cookies or by login information. Moreover, nonhuman activity (search engine crawlers, email harvesters, and botnets) may be mistaken for human users.
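One common way to reduce the risk of a returning user seeing multiple variants is to derive the assignment deterministically from a stable identifier instead of storing a random draw. The sketch below is a minimal illustration, assuming such an identifier (for example a login ID) is available; `assign_variant`, the experiment name, and the user IDs are hypothetical.

```python
import hashlib


def assign_variant(user_id, experiment, variants=("A", "B")):
    """Map a stable user identifier to a variant, identically on every visit.

    Hypothetical helper: hashing the experiment name together with the user ID
    keeps assignments independent across experiments.
    """
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Interpret the first 8 bytes as an integer and reduce it to a bucket index.
    bucket = int.from_bytes(digest[:8], "big") % len(variants)
    return variants[bucket]


# The same user always lands in the same variant for a given experiment.
first_visit = assign_variant("user-42", "homepage-video")
return_visit = assign_variant("user-42", "homepage-video")
print(first_visit, return_visit)
```

This does not help with anonymous visitors who clear cookies, and it does nothing about nonhuman traffic, which has to be filtered separately.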

Useful References:

Kohavi, Ron, Randal M. Henne, and Dan Sommerfield. "Practical Guide to Controlled Experiments on the Web: Listen to Your Customers not to the HiPPO." (2007).

Kohavi, Ron, et al. "Trustworthy online controlled experiments: five puzzling outcomes explained." Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2012.

51 questions

2 answers

Analyzing A/B test results which are not normally distributed, using independent t-test

I have a set of results from an A/B test (one control group, one feature group) which do not fit a normal distribution. In fact, the distribution more closely resembles the Landau distribution. I believe the independent t-test requires that the…

dataset statistics ab-test

asked Aug 04 '14 at 22:27

teebszet

2 answers

A/B testing: How to calculate p-value on post test segments?

My question on A/B testing is about doing post-test segmentation analysis. For example: I run an A/B test on my website to track bounce rate. On the treatment group, I put a video to explain my company. On the control group, I put just plain…

statistics ab-test experiments hypothesis-testing

asked Nov 14 '17 at 02:14

jxn

1 answer

What does Prec@1 in fastText mean?

In the paper "Bag of Tricks for Efficient Text Classification", which is popular right now, the authors calculate prec@1 for the datasets in the experiments section. What does that mean?

machine-learning nlp ab-test

asked Aug 09 '16 at 12:57

Hima Varsha

2 answers

AB testing : When AA testing doesn't work

After 6 months of AB testing on our CRM tool (Oracle Responsys, but this could be true of any tool), the test exhibited some weird results, so we decided to pause everything and do some good old AA testing. AA testing consists of dividing…

statistics ab-test experiments

asked Aug 08 '16 at 18:17

WNG

4 answers

What is the minimum size of the test set?

The mean of a population of binary values can be sampled with about 1000 samples at 95% confidence, and 3000 samples at 99% confidence. Assuming a binary classification problem, why is the 80/20% rule always used, and not the fact that with a few…

statistics cross-validation ab-test

asked May 13 '16 at 13:24

zzzbbx

3 answers

What are the methods to ensure that the population split for A/B test is random?

Before launching an A/B test, what are the methods to ensure that the population split into control and target groups is random for a particular label, say purchase rate?

data-mining statistics data experiments ab-test

asked Feb 26 '16 at 13:27

vick

1 answer

A/B testing randomization step

Let's say I want to measure a medicine's impact on height. So I randomly split my user base into two groups (control and experiment, obviously). I calculated the average height for both groups before the experiment and found out there's a 2 cm…

statistics ab-test

asked May 30 '16 at 04:48

Helene

3 answers

A/B Testing (Binomial Distribution vs Random Distribution)

When performing an A/B test for the number of clicks for users viewing (each view is an impression) two variants of an ad, a binomial distribution can be assumed where each variant has a constant click-through rate. Example: Two Ads, -> Ad one has…

statistics distribution descriptive-statistics ab-test

asked Nov 10 '20 at 08:04

DD.

2 answers

How to control false positives in sequential A/B testing while keeping a low sample size?

I am working on a planned sequence of n independent A/B tests (that is, run a maximum of n tests or stop earlier if a good improvement is found), and in order to keep the significance level within an acceptable level (0.05), I'm considering ways of controlling…

data-mining ab-test

asked Jul 20 '15 at 16:00

antonio-campari

2 answers

Is it scientifically correct to derive conclusions unrelated to the hypothesis from A/B test data?

Consider a software A/B test with the hypothesis that "the addition of feature F is predicted to increase metric X". At the end of the test, the data doesn't show any significant change in X, but it does show a significant increase in Y - something…

ab-test

asked Mar 28 '17 at 03:29

Jon Burgess

1 answer

How would I chi-squared test these simple results from A/B experiment?

I have results from an A/B experiment where users could do one of three things: Watch, Interact, or Nothing. My data is like this:

   | Watch  | Nothing | Interact
A: | 327445 | 271602  | 744702
B: | 376455 | 140737  | 818204

I tried to use the…

scikit-learn ab-test

asked Apr 28 '16 at 02:00

OneChillDude

3 answers

Analysis of Split (A/B) tests using Poisson and/or Binomial Distribution

Cross posting this from Cross Validated: I've seen this question asked before, but I have yet to come across a definitive source answering the specific questions: What's the most appropriate statistical test to apply to a small A/B test? What's the…

r ab-test

asked Aug 08 '14 at 13:44

brycemcd

1 answer

A/B test results contradictory with offline machine learning model performance

This seems to be a common problem when bringing machine learning models to production. Let's say we have an optimized machine learning model which achieves a decent performance metric on the unseen test dataset. We are quite satisfied with that, and…

machine-learning ab-test

asked Oct 02 '20 at 13:53

CathyQian

1 answer

Permutation test on two groups

I am trying to use a permutation test to test my hypothesis. I want to make sure I am understanding the concept of permutation correctly. I have a control group and an experimental group. Then I combine them and resample from the combined dataset randomly, calculating…

statistics ab-test permutation-test hypothesis-testing

asked Jul 02 '20 at 23:23

haneulkim

0 answers

Firebase AB testing algorithm

We have run an AB test in Firebase which has the following results: I was also building my own Bayesian AB-test suite and was wondering how they came to these conclusions. What I was doing was querying the data of this test for the Control Group…

bayesian ab-test implementation

asked Apr 10 '20 at 09:35

Boris Mulder
