3

Consider a software A/B test with the hypothesis that "the addition of feature F is predicted to increase metric X".

At the end of the test, the data doesn't show any significant change in X, but it does show a significant increase in Y - something that wasn't expected or even considered at the beginning of the experiment.

At this point, is it scientifically valid to say that F increases Y, or should a new A/B test be designed and executed?

DaL
Jon Burgess

2 Answers

5

This looks analogous to drug testing, where reporting side effects observed during trials is very important: the increase in Y plays the role of a side effect. Some famous drugs began as research into a side effect; Viagra is probably the best-known case, having been spun off from a drug developed as an angina medication. So in the write-up of your experiment you should definitely report the apparent effect on Y. However, if the effect on Y is commercially important, you still need to go back and run an experiment built around a hypothesis that explicitly predicts the increase in Y, in order to validate the effect properly.

Robert de Graaf
2

The problem is the move from testing a single hypothesis to testing several. One could claim that X and Y are symmetric: if we were willing to examine X, why shouldn't we examine Y? The difference is that, since Y wasn't part of the original plan, there may be many other variables that could equally have been examined: Y1, Y2, ..., Yn.

Suppose we have n extra variables, all purely random. If n is large enough, it becomes very likely that at least one of them will appear to be correlated with F. If you also consider pairs of variables, the number of options grows to O(n^2). The more complex your hypothesis set, the more options you have and the more likely you are to get a false correlation.
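To put rough numbers on this, here is a minimal sketch, assuming the n metrics are independent, F truly affects none of them, and each is tested at the same level alpha. Under those assumptions each p-value is uniform on [0, 1], so the chance of at least one spurious "significant" result is 1 - (1 - alpha)^n. The function names and the 0.05 level are illustrative choices, not something taken from the question.

```python
import random


def family_wise_error_rate(n_metrics, alpha=0.05):
    """Chance of at least one false positive among n independent null metrics."""
    return 1.0 - (1.0 - alpha) ** n_metrics


def simulated_false_positive_rate(n_metrics, alpha=0.05, trials=20000, seed=0):
    """Monte Carlo check of the formula above.

    Assumption: under the null hypothesis a p-value is uniform on [0, 1],
    so each metric is modelled by a single uniform random draw.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # One simulated experiment: does any metric look significant by chance?
        if any(rng.random() < alpha for _ in range(n_metrics)):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    for n in (1, 5, 20, 50):
        analytic = family_wise_error_rate(n)
        simulated = simulated_false_positive_rate(n)
        print(f"n={n:3d}  analytic={analytic:.3f}  simulated={simulated:.3f}")
```

With 20 unplanned metrics the analytic value is already about 0.64, so a single "significant" surprise is more likely than not even when F does nothing. Real metrics are usually correlated, which changes the exact figure but not the direction of the effect.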

This doesn't mean that you should ignore the result regarding Y. Many discoveries were accidental. As Robert de Graaf suggested, you can run another experiment to check the Y-F relation. You can also apply multiple hypothesis testing techniques to your current results to estimate whether the new relation is still significant.
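As one possible illustration of such a technique, the sketch below applies two standard corrections, Bonferroni and Holm, to a list of p-values. The metric names and p-values are made up for the example, and the correction is only meaningful if the list includes every metric that was inspected, not just the ones that looked interesting.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Reject a hypothesis only if its p-value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def holm_reject(p_values, alpha=0.05):
    """Holm step-down procedure: controls the same error rate, less conservatively."""
    m = len(p_values)
    # Visit hypotheses from the smallest p-value to the largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        # The threshold loosens as hypotheses are rejected: alpha/m, alpha/(m-1), ...
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # once one test fails, all larger p-values fail as well
    return reject


if __name__ == "__main__":
    # Hypothetical p-values for every metric that was looked at after the test.
    metrics = ["X", "Y", "Y2", "Y3"]
    p_values = [0.2, 0.001, 0.013, 0.04]
    for name, bonf, holm in zip(metrics,
                                bonferroni_reject(p_values),
                                holm_reject(p_values)):
        print(f"{name}: bonferroni={bonf} holm={holm}")
```

In this made-up example Y survives both corrections, which would strengthen the case for it, while Y3 (p = 0.04) would look significant on its own but survives neither. A follow-up experiment with Y as the stated hypothesis remains the cleaner confirmation.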

DaL