
Based on the birthday paradox:

Let $d$ be a collection of elements chosen at random, with replacement, from a set of $n$ distinct elements. Then:

a) What is the expected number of unique elements in $d$ (the remaining elements being repetitions of these)?

b) What is the expected maximum frequency of occurrence of an element in $d$?

c) How large must $d$ be so that every one of the $n$ distinct elements appears in $d$ at least once?

For simplicity, let

$n = \{0, 1, 2, 3, \dots, 99\}$

$d = \text{100 elements chosen at random from } n$

joriki
crypt
  • a) Use linearity of expectation. c) This is just the coupon collector's problem. b) is a rather more difficult question in comparison. Not exactly a duplicate, but see "The Coupon Collector's Most Collected Coupon" for the related question of what happens if you do not stop after 100 pulls but instead stop once your collection is complete. – JMoravitz Feb 09 '23 at 13:05
  • How you count collisions will affect (a): is it the number of values which appear at least two times, the number of pairs of selections with the same value, or the number of selections which share a value with at least one other? – Henry Feb 09 '23 at 13:51
  • @Henry updated the question. – crypt Feb 10 '23 at 03:22
  • Hi! Is this a homework question, something you came up with on your own, ...? – Brian Tung Feb 10 '23 at 04:21
  • Not homework, just curious. I thought I would try to solve these after studying the birthday paradox. IMO a) is similar to the number of people with a shared/non-shared birthday. I can't guess the rest. – crypt Feb 10 '23 at 04:41

1 Answer


(a) If you draw $k$ times with replacement from $n$ possible values, the expected number of values drawn exactly once is $k\left(1-\dfrac1n\right)^{k-1}$, which for fixed $n$ is maximised when $k=n-1$ or $k=n$. In the case $n=k=100$ this is about $36.97$.

Similarly, the expected number of values drawn zero times is $n\left(1-\dfrac1n\right)^{k}$, which with $n=k=100$ is about $36.60$, so the expected number of distinct values that appear is $n-n\left(1-\dfrac1n\right)^{k}$, about $63.40$. The expected number of values drawn two or more times is $n- (k+n-1)\left(1-\dfrac1n\right)^{k-1}$, which with $n=k=100$ is about $26.42$.
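As a quick numerical check of the expressions in (a), here is a minimal Python sketch. The function name `expected_counts` and the dictionary keys are made up for illustration; the formulas are the ones stated above, with "repeated" obtained by subtracting the other two counts from $n$.

```python
def expected_counts(n: int, k: int) -> dict:
    """Expected number of values drawn 0 times, exactly once, and 2+ times
    when drawing k times with replacement from n equally likely values."""
    q = 1 - 1 / n                    # chance that one draw misses a given value
    never = n * q ** k               # values never drawn
    once = k * q ** (k - 1)          # values drawn exactly once
    repeated = n - never - once      # values drawn two or more times
    return {
        "never": never,
        "once": once,
        "repeated": repeated,
        "distinct": n - never,       # values drawn at least once
    }


counts = expected_counts(100, 100)
for name, value in counts.items():
    print(f"{name}: {value:.2f}")
# never: 36.60
# once: 36.97
# repeated: 26.42
# distinct: 63.40
```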

(b) I am not aware of a closed form expression but, for $n=k=100$, simulation suggests that the value drawn most often is drawn an average of about $4.23$ times.
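A simulation of the kind mentioned in (b) could look like the sketch below. This is not necessarily the code behind the quoted figure; the function name `mean_max_frequency`, the trial count, and the seed are all assumptions chosen for illustration.

```python
import random
from collections import Counter


def mean_max_frequency(n: int, k: int, trials: int, seed: int = 0) -> float:
    """Monte Carlo estimate of the expected count of the most frequent value
    when drawing k times with replacement from n equally likely values."""
    rng = random.Random(seed)        # fixed seed so runs are repeatable
    total = 0
    for _ in range(trials):
        draws = rng.choices(range(n), k=k)        # k uniform draws with replacement
        total += max(Counter(draws).values())     # count of the most common value
    return total / trials


estimate = mean_max_frequency(100, 100, trials=20_000)
print(f"{estimate:.2f}")
```

With this many trials the estimate should land close to the value quoted above, though the last digit will vary with the seed.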

(c) If you draw until each of $n$ values has been drawn at least once, this is the coupon collector's problem, and the expected number of draws needed is $n\,H_n=n\sum\limits_{m=1}^n \frac1m$, which with $n=100$ is about $518.74$.
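The coupon collector figure can be checked both from the formula and by simulation. The sketch below is illustrative only: `expected_draws` and `simulate_draws` are hypothetical names, and the trial count and seed are arbitrary choices.

```python
import random


def expected_draws(n: int) -> float:
    """n * H_n: expected draws until all n values have appeared."""
    return n * sum(1 / m for m in range(1, n + 1))


def simulate_draws(n: int, trials: int, seed: int = 0) -> float:
    """Average number of draws needed to see every one of n values."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        seen = set()
        draws = 0
        while len(seen) < n:             # keep drawing until the set is complete
            seen.add(rng.randrange(n))
            draws += 1
        total += draws
    return total / trials


exact = expected_draws(100)
simulated = simulate_draws(100, trials=5_000)
print(f"formula: {exact:.2f}")           # formula: 518.74
print(f"simulation: {simulated:.2f}")
```

The number of draws varies a lot from run to run (its standard deviation is above 100 for $n=100$), so the simulated average only agrees with the formula to within a few draws at this trial count.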

Henry
  • Where can I find more information on, or a derivation of, the formula given in (a)? – crypt Feb 24 '23 at 04:32
  • @crypt The probability that a particular value is drawn from $n$ exactly once in $k$ draws is the binomial ${k \choose 1}(\frac1n)^1(\frac{n-1}{n})^{k-1}$, and the probability that it is never drawn is ${k \choose 0}(\frac1n)^0(\frac{n-1}{n})^{k}$. So the expected numbers of values drawn exactly once and drawn exactly zero times are $n$ times these; then simplify. – Henry Feb 24 '23 at 11:09