16

What is the distribution of the random variable $X$ given $$X = Y + Z,$$ where $Y \sim$ Binomial($n$, $P_Y$) and $Z \sim$ Binomial($n$, $P_Z$)?


For the special case when $P_Y = P_Z = P$, I think that $X \sim$ Binomial($2n$, $P$) is correct. If $P_Y \neq P_Z$, the distribution might simply be Binomial$\left(2n, \frac{P_Y + P_Z}{2}\right)$, but I can't prove it.

If the problem is more complicated than I expect and we can't derive the whole distribution, can we at least say something about the mean and variance of $X$?

tortue
Remi.b

4 Answers

11

Assuming $Y$ and $Z$ are independent, $X$ is the sum of $2n$ independent Bernoulli trials, $n$ with success probability $P_Y$ and $n$ with success probability $P_Z$, so it is a special case of the Poisson binomial distribution.
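For a concrete check, here is a small sketch in Python with NumPy/SciPy (not part of the original answer; the parameters are illustrative). It verifies that the PMF of $X = Y + Z$ obtained by convolving the two binomial PMFs matches the Poisson binomial PMF built from $2n$ Bernoulli factors:

    import numpy as np
    from scipy.stats import binom

    # Illustrative parameters, chosen only for this check.
    n, p_y, p_z = 4, 0.3, 0.7

    # PMF of X = Y + Z: convolve the two binomial PMFs.
    pmf_sum = np.convolve(binom.pmf(np.arange(n + 1), n, p_y),
                          binom.pmf(np.arange(n + 1), n, p_z))

    # Poisson binomial PMF: convolve the 2n Bernoulli PMFs one factor at a time.
    pmf_pb = np.array([1.0])
    for p in [p_y] * n + [p_z] * n:
        pmf_pb = np.convolve(pmf_pb, [1.0 - p, p])

    print(np.allclose(pmf_sum, pmf_pb))  # True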

6

Assuming $Y$ and $Z$ are independent, $X=Y+Z$ has mean $E[Y]+E[Z] = n P_Y + n P_Z$ and variance $\text{Var}(Y) + \text{Var}(Z) = n P_Y (1-P_Y) + n P_Z (1 - P_Z)$. The characteristic function is $$ \left( P_Y {{\rm e}^{it}}+1-P_Y \right) ^{n}\left( P_Z {{\rm e}^{it}}+1-P_Z \right) ^{n}$$ But unless $P_Y = P_Z$, there is no special name for the distribution of $X$.
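As a quick numerical sanity check of these moments (a sketch in Python with NumPy/SciPy, using made-up parameters, not part of the original answer), one can compute the exact PMF of $X$ by convolution and compare its mean and variance with $nP_Y + nP_Z$ and $nP_Y(1-P_Y) + nP_Z(1-P_Z)$:

    import numpy as np
    from scipy.stats import binom

    # Hypothetical parameters used only for this check.
    n, p_y, p_z = 5, 0.2, 0.6

    support = np.arange(2 * n + 1)
    # Exact PMF of X = Y + Z via convolution of the two binomial PMFs.
    pmf_x = np.convolve(binom.pmf(np.arange(n + 1), n, p_y),
                        binom.pmf(np.arange(n + 1), n, p_z))

    mean_x = np.sum(support * pmf_x)
    var_x = np.sum((support - mean_x) ** 2 * pmf_x)

    print(np.isclose(mean_x, n * p_y + n * p_z))                         # True
    print(np.isclose(var_x, n * p_y * (1 - p_y) + n * p_z * (1 - p_z)))  # True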

EDIT: Maple does come up with a closed form for the probability mass function involving the associated Legendre function of the first kind:

$$\mathbb P(X=x) = \begin{cases} \dfrac{n!}{x!}\, P_n^{x-n}\!\left(\dfrac{2 P_Y P_Z - P_Y - P_Z}{P_Y - P_Z}\right) (P_Z - P_Y)^n \left(\dfrac{(1-P_Z)(1-P_Y)}{P_Z P_Y}\right)^{(n-x)/2} & \text{if } 0 \le x \le n \\[2ex] \dfrac{n!}{(2n-x)!}\, P_n^{n-x}\!\left(\dfrac{2 P_Y P_Z - P_Y - P_Z}{P_Y - P_Z}\right) (P_Z - P_Y)^n \left(\dfrac{(1-P_Z)(1-P_Y)}{P_Z P_Y}\right)^{(n-x)/2} & \text{if } n \le x \le 2n \end{cases}$$

EDIT: In response to Shakil's request, here is the Maple code:

> sum(binomial(n,k)*P[Z]^k*(1-P[Z])^(n-k)*
    binomial(n,x-k)*P[Y]^(x-k)*(1-P[Y])^(n-(x-k)),k=0..x) assuming x>=0,x<=n;
> simplify(%);
> sum(binomial(n,k)*P[Z]^k*(1-P[Z])^(n-k)*
    binomial(n,x-k)*P[Y]^(x-k)*(1-P[Y])^(n-(x-k)),k=x-n..n) assuming x>=n,x<=2*n;
> simplify(%);
Robert Israel
5

See the binomial sum variance inequality. Here is an excerpt from the Wikipedia page.

In probability theory and statistics, the sum of independent binomial random variables is itself a binomial random variable if all the component variables share the same success probability. If success probabilities differ, the probability distribution of the sum is not binomial.
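To connect this with the conjecture in the question (a short direct calculation, not part of the quoted excerpt): writing $\bar P = (P_Y + P_Z)/2$, the sum $X$ has the same mean as Binomial$(2n, \bar P)$, namely $2n\bar P$, but its variance satisfies

$$\operatorname{Var}(X) \;=\; nP_Y(1-P_Y) + nP_Z(1-P_Z) \;=\; 2n\bar P(1-\bar P) - \frac{n(P_Y-P_Z)^2}{2} \;\le\; 2n\bar P(1-\bar P),$$

with equality only when $P_Y = P_Z$. So when $P_Y \neq P_Z$ the variance is strictly smaller than that of Binomial$(2n, \bar P)$, and the two distributions cannot coincide.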

-3

In the limit as $n \to \infty$, your binomials become Gaussian, and since it seems you are implicitly assuming your two binomials are independent, the sum of two independent Gaussians is Gaussian with mean and variance given by the sums of the corresponding parameters. So yes, in the limit as $n \to \infty$, your distribution will converge to Binomial$(2n, (P_A+P_B)/2)$.

Also you are right, if $P_A = P_B = P$ and you assume independence, then the distribution is precisely Binomial$(2n,P)$.

However, if $P_A \neq P_B$ and you assume independence, then the exact distribution is different from Binomial$(2n,(P_A + P_B)/2)$. If you let $X = X_A + X_B$ be the random variable that is the sum of your two binomials, then $P(X = k)$ is the sum over all the ways of getting $X_A = k_A$ and $X_B = k_B$ with $k_A + k_B = k$. It is easy to write this summation down if you know the binomial PMF and summation notation. However, I'm inclined to believe there is no closed-form formula for it, unless it's something exotic like a hypergeometric expression.
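Spelled out in the notation of this answer, that summation is

$$\mathbb P(X = k) \;=\; \sum_{k_A=\max(0,\,k-n)}^{\min(n,\,k)} \binom{n}{k_A} P_A^{\,k_A}(1-P_A)^{\,n-k_A}\,\binom{n}{k-k_A} P_B^{\,k-k_A}(1-P_B)^{\,n-(k-k_A)},$$

which is exactly the convolution that the Maple computation in Robert Israel's answer evaluates.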

user2566092
  • "In the limit as n→∞, your binomials become Gaussian" Sorry but this is simultaneously vague and wrong. – Did Feb 17 '15 at 22:18
  • @Did You're a smart guy, you must know it can't be both vague and wrong at the same time. And in a certain sense, binomial does converge to Gaussian. Maybe I'm wrong in the conclusions about what that particular convergence means though. Maybe continuity corrections etc. could save my argument, I don't know. But I thought that convergence to a Gaussian was enough to say what I was saying. – user2566092 Feb 17 '15 at 23:10
  • And by Gaussian I mean normal distribution – user2566092 Feb 17 '15 at 23:15
  • Vague: "binomials become Gaussian", meaning what? Wrong: Binomial $(n,p)$ distributions do not converge to anything, gaussian or not, when $n\to\infty$. – Did Feb 17 '15 at 23:33
  • @Did http://en.wikipedia.org/wiki/De_Moivre%E2%80%93Laplace_theorem . Maybe the tails don't work for convergence, and if not, then my apologies. But it works between the tails. And from the theorem, I guess I meant convergence in terms of ratios of probabilities going to 1. – user2566092 Feb 18 '15 at 00:16
  • The De Moivre-Laplace theorem says that certain scaled and translated binomials converge to Gaussians. The binomials themselves do not converge (except in the trivial case of $p=0$). – Robert Israel Feb 18 '15 at 01:59