
Choose positive integers $D$ and $N$. Roll a fair $D$-sided die $N$ times, recording the number of times each of the $D$ outcomes is rolled, say $r_1, r_2, \ldots, r_D.$ What are the asymptotics of $\min(r_1, r_2, \ldots, r_D)$?

The $r_i$ are not independent, so I can't directly apply the Fisher–Tippett–Gnedenko theorem, but that's probably a good starting point.

If it makes things easier, you can assume $N \gg D.$ Of course the leading term is $N/D$, but what’s the second-order term? Maybe $\asymp \sqrt{N/D}$?

Charles
    This corresponds to the minimum of a uniform multinomial. It does not look easy. See e.g. https://www.mdpi.com/2073-8994/13/11/2173 and https://projecteuclid.org/journals/annals-of-mathematical-statistics/volume-21/issue-3/Distribution-of-Maximum-and-Minimum-Frequencies-in-a-Sample-Drawn/10.1214/aoms/1177729800.full – leonbloy Mar 28 '24 at 15:01

1 Answer


Doing some Monte Carlo on the case $D=6$

\\ One trial: roll a D-sided die N times; return (N/D - min count)/sqrt(N/D)
f(N,D=6)=my(v=vectorsmall(D)); for(i=1,N, v[random(D)+1]++); (N/D-vecmin(Vec(v)))/sqrt(N/D)
\\ Use gp2c for better performance

the error (the amount by which the minimum count falls short of $N/D$) does seem to be right around $\sqrt{N/D}$. Out of many trials (with $N/D$ varying from $10^7$ to $10^9$) I find the error divided by $\sqrt{N/D}$ to be around 1 to 1.5 on average.* I don't see enough variability between trials with $10^7$ and $10^9$ to guess what the third asymptotic term might look like.
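
For readers without PARI/GP, the same experiment can be sketched in Python. This is a hedged port, not the code used for the figures quoted here: the names `normalized_shortfall` and `mean_shortfall` are illustrative, and the pure-Python loop is only practical for much smaller $N$ than the $10^7$ to $10^9$ range above.

```python
import math
import random


def normalized_shortfall(n_rolls, n_sides=6, rng=random):
    """One trial: return (N/D - min_i r_i) / sqrt(N/D)."""
    counts = [0] * n_sides
    # Roll the die n_rolls times and tally each face (faces are 0..n_sides-1).
    for face in rng.choices(range(n_sides), k=n_rolls):
        counts[face] += 1
    mean_count = n_rolls / n_sides
    # The minimum count can never exceed the mean, so this is non-negative.
    return (mean_count - min(counts)) / math.sqrt(mean_count)


def mean_shortfall(n_rolls, n_trials, n_sides=6, seed=0):
    """Average the normalized shortfall over n_trials independent trials."""
    rng = random.Random(seed)  # fixed seed so runs are reproducible
    samples = [normalized_shortfall(n_rolls, n_sides, rng)
               for _ in range(n_trials)]
    return sum(samples) / n_trials
```

For instance, `mean_shortfall(6000, 100)` averages 100 trials at $N/D = 1000$; larger values of $N/D$ need a compiled or vectorized implementation to run in reasonable time.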

So my guess is that, out of $N$ rolls of a $D$-sided die (with $D$ fixed and $N\to\infty$), the count of the face that is rolled least often is $$ N/D - \alpha\sqrt{N/D} + o\left(\sqrt{N/D}\right) $$ where $1 \le \alpha \le 3/2$ is a fixed constant independent of $D$.

* Using bootstrap resampling to estimate this ratio, I get a 95% confidence interval right around (1, 1.5). For $N/D = 10^7$, where I have more samples, it's more like (1.05, 1.45).
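
The footnote does not say which bootstrap variant was used. As a minimal sketch, assuming a percentile bootstrap for the mean of the per-trial ratios, the interval could be computed along these lines; the function name `bootstrap_mean_ci` and its parameters are hypothetical.

```python
import random


def bootstrap_mean_ci(samples, n_boot=2000, level=0.95, seed=0):
    """Percentile-bootstrap confidence interval for the mean of samples.

    Assumption: percentile method; the original answer does not specify one.
    """
    rng = random.Random(seed)
    n = len(samples)
    # Resample with replacement n_boot times and record each resample's mean.
    means = sorted(sum(rng.choices(samples, k=n)) / n for _ in range(n_boot))
    tail = (1.0 - level) / 2.0
    lower = means[int(tail * n_boot)]
    upper = means[min(n_boot - 1, int((1.0 - tail) * n_boot))]
    return lower, upper
```

Here `samples` would be the list of observed values of $(N/D - \min_i r_i)/\sqrt{N/D}$, one per trial.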

Charles