I hope Math Stack Exchange is the right place for this question, even though it comes from an AI point of view.

Say I have a machine learning model and, for robustness of results, I initialize it with three different random seeds, train each initialization to convergence, and report the average performance over those three runs. My question: when calculating the standard deviation of the three performance values, should I use the population std (dividing by $N$) or the sample std (dividing by $N-1$)?

Hope you can help!

  • Not sure I get the situation. Are you saying you only have $3$ data points? Not sure what you hope to get out of such a small sample... Why not just report the three values? – lulu May 28 '21 at 11:03
  • No, I have one model and I initialize the parameters of that model three times, each time with a different random seed. After training I report the average performance of the three training runs to have results that are more robust w.r.t. the random seed that is used in training. Does that clarify? – frederik May 28 '21 at 11:12
  • Still sounds like three data points to me: just the three averages you get, one for each seed. Or is there further randomization, given the initial seed? Anyway, what you are saying is not clear. How many data points do you believe you have? What is the difference between the two $\sigma$'s? In practice, if these numbers are very different, your sample is too small. – lulu May 28 '21 at 11:16
  • What is the meaning of "population std"? It is not standard terminology as far as I know – Aleksejs Fomins May 28 '21 at 11:18
  • @AleksejsFomins It is a standard term, see, e.g., this. Just a question of whether you regard the mean as an unknown, in which case you divide by $N-1$, or you regard it as known, in which case you divide by $N$. $N$ being the number of points in your sample. – lulu May 28 '21 at 11:20
  • @lulu Ok, I stand corrected. I did not know the std estimator for a known mean had a name :). But in this case, clearly, OP is interested in the sample std, as the mean is not known a priori. Still, I agree that a std computed from only three values is not very useful for representing the true variance. – Aleksejs Fomins May 28 '21 at 11:25
  • @AleksejsFomins Yes, that's exactly right. It's a sample, so presumably the "true" mean is unknown, so the sample $\sigma$ is the better measure (dividing by $N-1$ makes the variance estimate unbiased). But the sample size is critical. – lulu May 28 '21 at 11:28
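
To make the distinction discussed in the comments concrete, here is a minimal Python sketch that computes both quantities for three runs. The scores are hypothetical placeholders, not values from the question, and only the standard library is assumed. It computes the $N$ and $N-1$ versions by hand and checks them against `statistics.pstdev` and `statistics.stdev`.

```python
import math
import statistics

# Hypothetical final scores from three random seeds
# (illustrative values only, not taken from the question).
scores = [0.81, 0.84, 0.78]

n = len(scores)
mean = statistics.fmean(scores)

# Sum of squared deviations from the mean of the three runs.
ss = sum((x - mean) ** 2 for x in scores)

# Population std: divide by N (treats the three runs as the whole population).
pop_std = math.sqrt(ss / n)

# Sample std: divide by N - 1 (treats the runs as a sample; the mean is estimated).
sample_std = math.sqrt(ss / (n - 1))

# The standard library implements the same two definitions.
assert math.isclose(pop_std, statistics.pstdev(scores))
assert math.isclose(sample_std, statistics.stdev(scores))

print(f"mean       = {mean:.4f}")
print(f"population = {pop_std:.4f}")
print(f"sample     = {sample_std:.4f}")
```

With $N = 3$ the two results always differ by the factor $\sqrt{3/2} \approx 1.22$, whatever the scores are, which is one way to see why the comments stress that the sample is small.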
