Say you have an experiment and you want to test whether there is a difference in the treatment response between two groups. For instance, it could be that one group received a drug and one received a placebo, and you want to test whether the group receiving the drug has a significant difference in recovery time for some illness. You also might want to test whether one landing page has a higher clickthrough rate than another.

What happens if we use the group as a feature in a model? We wind up inheriting all the assumptions of the model. So let’s say is an indicator for treatment vs placebo and is the recovery time for an illness. Let’s say we use linear regression

(1)

then if we want to test whether a learned is significant, that is if there is a significant difference between the two groups, we wind up requiring all the standard assumptions of linear regression: since we only have a single feature this becomes linearity and iid Gaussian residuals. Linearity is a strong assumption. This issue will come up for other models as well, even if the assumptions are different.

On the other hand, if we do a two sample t-test, the primary assumption we’re making is that the sample mean of each group is Gaussian, which will usually hold in sufficiently large samples for iid data due to the central limit theorem.