I've been analyzing a data set of ~400k records and 9 variables The dependent variable is binary. I've fitted a logistic regression, a regression tree, a random forest, and a gradient boosted tree. All of them give virtual identical goodness of fit numbers when I validate them on another data set.
Why is this so? I'm guessing that it's because my observations to variable ratio is so high. If this is correct, at what observation to variable ratio will different models start to give different results?