
I am trying to prove the following:

There are N balls in a vessel, of which M are red and N - M are white $(0\leq M \leq N)$. From this vessel n balls are drawn at random without replacement. X is the discrete random variable that counts the red balls drawn.

a) Show that:

$$p(X = k)=\frac{\binom Mk \binom {N-M}{n-k}}{\binom Nn}$$ (hypergeometric distribution with the parameters N, M and n).

b) Show that $$E(X) = n \frac{M}{N}$$


I have tried to prove this using the solution of the proof listed at:

Proof that the hypergeometric distribution with large $N$ approaches the binomial distribution.

but I am not sure whether the following is the right solution.


Proof:

$$\begin{eqnarray} \frac{\binom{M}{x} \binom{N-M}{n-x}}{\binom{N}{n}} &=& \frac{M!}{\color{green}{x!} \cdot (M-x)!} \cdot \frac{(N-M)!}{\color{green}{(n-x)!} \cdot (N-M-(n-x))!} \cdot \frac{\color{green}{n!} \cdot (N-n)!}{N!} \\ \\ &=& \color{green}{\binom{n}{x}} \cdot \frac{M!/(M-x)!}{N!/(N-x)!} \cdot \frac{(N-M)! \cdot (N-n)!}{(N-x)! \cdot (N-M-(n-x))!} \\ \\ &=& \binom{n}{x} \cdot \frac{M!/(M-x)!}{N!/(N-x)!} \cdot \frac{(N-M)!/(N-M-(n-x))!}{(N-n+(n-x))!/(N-n)! } \\ \\ &=& \binom{n}{x} \cdot \prod_{k=1}^x \frac{(M-x+k)}{(N-x+k)} \cdot \prod_{m=1}^{n-x}\frac{(N-M-(n-x)+m)}{(N-n+m) } \end{eqnarray} $$ because, with $p = M/N$ held fixed as $N \to \infty$,

$\lim_{N \to \infty} \prod_{k=1}^x \frac{M-x+k}{N-x+k} = \prod_{k=1}^x \lim_{N \to \infty} \frac{M-x+k}{N-x+k} = \prod_{k=1}^x p = p^x$

and

$\lim_{N \to \infty} \prod_{m=1}^{n-x}\frac{(N-M-(n-x)+m)}{(N-n+m) } = \prod_{m=1}^{n-x} (1-p) = (1-p)^{n-x}$

$$\Rightarrow\quad p(X = x)= \binom{n}{x} \cdot \prod_{k=1}^x \frac{(M-x+k)}{(N-x+k)} \cdot \prod_{m=1}^{n-x}\frac{(N-M-(n-x)+m)}{(N-n+m) }$$

$$\xrightarrow{N \to \infty}\ \binom{n}{x} p^x (1-p)^{n-x} \qquad \text{q.e.d.}$$

SlaMath
  • 93

1 Answer


Note: This answer requires prior knowledge of the binomial coefficient $\binom xy$

What you proved is that as $N \to \infty$ in a hypergeometric distribution, the distribution approaches the binomial distribution. But this is not what you want; you simply want to find the probability mass function of the hypergeometric distribution.
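To make that limit concrete, here is a small numerical sketch. The values of `n`, `k`, and `p` are illustrative assumptions (they are not from the question), and `gap` is a hypothetical helper name. It keeps $M/N = p$ fixed and shows the hypergeometric probability moving toward the binomial one as $N$ grows.

```python
from math import comb

def hypergeom_pmf(k, N, M, n):
    # P(X = k): k red balls among n draws without replacement (N balls, M red).
    return comb(M, k) * comb(N - M, n - k) / comb(N, n)

def binom_pmf(k, n, p):
    # P(Y = k) for Y ~ Binomial(n, p).
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Illustrative values (assumptions, not taken from the question).
n, k, p = 10, 3, 0.3

def gap(N):
    # Keep M/N = 0.3 fixed as N grows, using integer arithmetic for M.
    M = 3 * N // 10
    return abs(hypergeom_pmf(k, N, M, n) - binom_pmf(k, n, p))

for N in (20, 200, 2000, 20000):
    print(N, gap(N))
```

The printed gaps shrink as $N$ increases, which is the statement about large $N$, not a derivation of the mass function itself.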

You define a hypergeometric distribution as such:

There are $N$ balls in a vessel, of which $M$ are red and $N - M$ are white $(0\le M\le N)$. From this vessel $n$ balls are drawn at random without replacement. $X$ is the discrete random variable that counts the red balls drawn.

When all outcomes are equally likely, $P(\text{event}) = \cfrac{\text{number of outcomes of interest}}{\text{number of possible outcomes}}$. Here, $P(X=k) = \cfrac{\text{number of ways to draw $k$ red balls in $n$ total draws}}{\text{number of ways to perform $n$ draws}}$.

Looking first at the denominator: if we are drawing $n$ balls from a vessel with $N$ balls, then we "choose" $n$ from $N$. Thus there are $\binom Nn$ ways to perform $n$ draws, which matches the denominator of the formula we want to show.

On to the numerator. If we drew $k$ red balls in $n$ draws, then we necessarily drew $n-k$ white balls as well. (Hopefully this is intuitive: there are only red and white balls, so the number of red balls we draw plus the number of white balls we draw should equal the total number of balls we draw, $k + [n-k] = n$.) Since there are $M$ red balls (and thus $N-M$ white balls) to choose from, there are $\binom{M}{k}$ ways to choose the $k$ red balls and $\binom{N-M}{n-k}$ ways to choose the $n-k$ white balls, so the number of ways to draw exactly $k$ red balls is $\binom{M}{k}\binom{N-M}{n-k}$. To convince you that we should simply multiply the two binomial coefficients, consider that for fixed $k$, the choice of red balls does not restrict the choice of white balls. That is, for each different way we can choose $k$ red balls from $M$, there are $\binom{N-M}{n-k}$ ways to choose the white balls. (If you're not convinced yet, consider making a sandwich where you have $3$ choices of bread type and $3$ choices of meat. Then you have $3\times 3=9$ ways to make your sandwich.)

Thus, $P(X=k)= \cfrac{\binom{M}{k}\binom{N-M}{n-k}}{\binom Nn}$
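If you want to sanity-check the counting argument, a brute-force sketch like the following may help. The vessel size `N = 8`, `M = 3`, `n = 4` is an arbitrary small example chosen for illustration, and the function names are hypothetical. It lists every $n$-subset of the balls, counts those with exactly $k$ red balls, and compares the ratio to the formula.

```python
from itertools import combinations
from math import comb

def pmf_formula(k, N, M, n):
    # The closed form derived above.
    return comb(M, k) * comb(N - M, n - k) / comb(N, n)

def pmf_by_enumeration(k, N, M, n):
    # Label balls 0..N-1; the first M are red, the rest white.
    colours = ["R"] * M + ["W"] * (N - M)
    # Every n-subset of the balls is an equally likely draw.
    draws = list(combinations(range(N), n))
    hits = sum(1 for d in draws if sum(colours[i] == "R" for i in d) == k)
    return hits / len(draws)

# Arbitrary small example (an assumption for illustration only).
N, M, n = 8, 3, 4
for k in range(0, M + 1):
    print(k, pmf_formula(k, N, M, n), pmf_by_enumeration(k, N, M, n))
```

The two columns agree for every $k$, because the enumeration counts exactly the $\binom Mk\binom{N-M}{n-k}$ favourable subsets out of $\binom Nn$.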

The question does not ask for the support of the distribution, but we need it to calculate the expected value anyway. The formula gives a nonzero probability exactly for $\max(0,\, n-(N-M)) \le k \le \min(n, M)$; this is the full range $k = 0, 1, 2, \ldots, n$ when $n\le M$ and $n \le N-M$.
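As a quick check on the range of $k$, here is a sketch with exact fractions. The example `N = 10`, `M = 7`, `n = 5` is an assumption picked so that $k = 0$ and $k = 1$ are impossible (there are only $3$ white balls), and `support` is a hypothetical helper name. It lists the values of $k$ with nonzero probability and confirms that the probabilities sum to $1$ and that the mean equals $nM/N$ for this example.

```python
from fractions import Fraction
from math import comb

def pmf(k, N, M, n):
    # Exact hypergeometric probability for 0 <= k <= n.
    # math.comb returns 0 when the lower index exceeds the upper one.
    return Fraction(comb(M, k) * comb(N - M, n - k), comb(N, n))

def support(N, M, n):
    # k cannot exceed n or M, and n - k cannot exceed the N - M white balls.
    return range(max(0, n - (N - M)), min(n, M) + 1)

# Example values (assumptions for illustration only).
N, M, n = 10, 7, 5
ks = list(support(N, M, n))
total = sum(pmf(k, N, M, n) for k in ks)
mean = sum(k * pmf(k, N, M, n) for k in ks)
print(ks, total, mean)
```

For this example the script prints `[2, 3, 4, 5] 1 7/2`, and $7/2 = 5 \cdot 7/10$.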

You can find the proof of the expectation here: Expected Value of a Hypergeometric Random Variable
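For completeness, here is one way to get the expectation directly from the mass function. This is a sketch and not necessarily the argument used in the linked proof. It assumes $1 \le n \le N$ and $M \ge 1$, and uses the identity $k\binom Mk = M\binom{M-1}{k-1}$ together with Vandermonde's identity $\sum_j \binom{M-1}{j}\binom{N-M}{n-1-j} = \binom{N-1}{n-1}$.

$$\begin{aligned} E(X) &= \sum_{k} k\,\frac{\binom Mk \binom{N-M}{n-k}}{\binom Nn} = \frac{M}{\binom Nn}\sum_{k\ge 1} \binom{M-1}{k-1}\binom{N-M}{n-k} \\ &= \frac{M\binom{N-1}{n-1}}{\binom Nn} = M\cdot\frac nN = n\,\frac MN. \end{aligned}$$

The last step uses $\binom Nn = \frac Nn \binom{N-1}{n-1}$. For $M = 0$ the result holds trivially, since then $X = 0$.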

jeremy909
  • 1,048