2

I feel like I have been bashing my head against the wall on something that I thought would be easy. Not from a math background so this might actually be trivial.

I have a set of numbers $S$ with size $n$. I want to know the odds that a randomly chosen subset $B$ of $S$ with size $m$ has the same median as $S$.

For example, if $S$ is $\{1, 2, 3, 4, 5, 6\}$, and $m=4$, then of the $15$ ways to make $B$, $5$ share the median of $3.5$.
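A quick way to sanity-check a count like this is plain enumeration. The snippet below is only a minimal sketch; the helper name `count_same_median` is made up for illustration, and it simply lists every size-$m$ subset and compares medians.

```python
from itertools import combinations
from statistics import median

def count_same_median(S, m):
    """Return (matches, total) over all size-m subsets of S."""
    target = median(S)
    subsets = list(combinations(sorted(S), m))
    # Count the subsets whose median equals the median of S.
    matches = sum(1 for B in subsets if median(B) == target)
    return matches, len(subsets)

print(count_same_median(range(1, 7), 4))  # (5, 15)
```

This only scales to small $n$, but it is handy for testing any closed-form guess.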

I thought that I would just have to find the odds of getting the middle two numbers being paired appropriately (e.g. $3$ with $4$ or $2$ with $5$ above) but I am struggling to find how to relate this with arbitrary $m$ and $n$. I am also aware that with the way the median works this will differ between odd/even $n$ and $m$ and honestly I haven't even gotten there yet.

Aig
  • 5,725
  • If $n=2b+1$ and $m=2a+1$ are odd, one can directly count the number of size-$m$ subsets of $S$ with the same median: one element of the subset must be the median of $S$, and then $a$ elements must be drawn from the smallest $b$ elements of $S$ while $a$ elements are drawn from the largest $b$ elements of $S$, resulting in $\binom ba^2$ possible subsets in total. – Greg Martin Apr 19 '24 at 02:49

2 Answers

2

If $m$ is odd and $n$ is even, then the probability is $0$. That’s because the median of $B$ will be an element of $S$, while the median of $S$ won’t.

If $m$ and $n$ are both odd, then the median of $S$ is one of its elements. $B$ has to contain this element as well, with $\frac{m-1}2$ other elements chosen from the “left half” of $S$ and $\frac{m-1}2$ elements from the “right half”. The probability is then

$$\frac{\displaystyle\binom{\frac{n-1}2}{\frac{m-1}2}^2}{\displaystyle\binom nm}.$$
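As a rough illustration, the odd/odd formula can be evaluated exactly with Python's `math.comb` and `fractions.Fraction`. The function name `prob_both_odd` is just a label chosen here, and the sketch assumes $S=\{1,\dots,n\}$ (or any $n$ distinct numbers).

```python
from fractions import Fraction
from math import comb

def prob_both_odd(n, m):
    """Probability for odd n and odd m: C((n-1)/2, (m-1)/2)^2 / C(n, m)."""
    if n % 2 == 0 or m % 2 == 0:
        raise ValueError("this formula assumes n and m are both odd")
    half_n, half_m = (n - 1) // 2, (m - 1) // 2
    # Pick half_m elements on each side of the median of S.
    return Fraction(comb(half_n, half_m) ** 2, comb(n, m))

print(prob_both_odd(5, 3))  # 2/5
```

For $n=5$, $m=3$ the four matching subsets are $\{1,3,4\},\{1,3,5\},\{2,3,4\},\{2,3,5\}$ out of $10$.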

If $m$ and $n$ are both even, then the median of $S$ isn’t an element of $S$. The two “middle” elements of $B$ must be at the same distance from the median of $S$. If the “left middle” element has $k$ elements of $S$ to its left, then by symmetry the “right middle” element has $k$ elements of $S$ to its right, and we have to choose $\frac m2-1$ elements of $B$ from each of these two groups. The probability in this case is

$$\frac{\displaystyle\sum_{k= \frac m2-1}^{\frac n2-1} \binom{k}{\frac m2-1}^2 }{\displaystyle\binom nm}.$$

For example, putting $m=4, n=6$ in this formula we get $$\frac{1^2+2^2}{15}=\frac 13,$$ which agrees with your data.
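The same check can be done mechanically. This is a small sketch of the even/even sum, again assuming $S=\{1,\dots,n\}$; `prob_both_even` is a hypothetical name used only here.

```python
from fractions import Fraction
from math import comb

def prob_both_even(n, m):
    """Probability for even n and even m, using the sum over k."""
    if n % 2 or m % 2:
        raise ValueError("this formula assumes n and m are both even")
    h = m // 2 - 1
    # k runs from m/2 - 1 up to n/2 - 1 inclusive.
    count = sum(comb(k, h) ** 2 for k in range(h, n // 2))
    return Fraction(count, comb(n, m))

print(prob_both_even(6, 4))  # 1/3
```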

If $m$ is even and $n\ge3$ is odd, then $\frac m2$ elements of $B$ must come from the “left half” of $S$ and $\frac m2$ from the “right half”, so the median of $S$ itself cannot belong to $B$. The two “middle” elements of $B$, again, must be equidistant from the median of $S$. Then the probability is equal to

$$\frac{\displaystyle\sum_{k= \frac m2-1}^{\frac {n-3}2} \binom{k}{\frac m2-1}^2 }{\displaystyle\binom nm}.$$
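For a concrete feel, here is a minimal sketch of this last sum, assuming $S=\{1,\dots,n\}$ with $n$ odd; the name `prob_even_m_odd_n` is invented for this example.

```python
from fractions import Fraction
from math import comb

def prob_even_m_odd_n(n, m):
    """Probability for odd n >= 3 and even m, using the sum over k."""
    if n % 2 == 0 or n < 3 or m % 2:
        raise ValueError("this formula assumes odd n >= 3 and even m")
    h = m // 2 - 1
    # k runs from m/2 - 1 up to (n-3)/2 inclusive.
    count = sum(comb(k, h) ** 2 for k in range(h, (n - 3) // 2 + 1))
    return Fraction(count, comb(n, m))

print(prob_even_m_odd_n(5, 2))  # 1/5
```

For $n=5$, $m=2$ the two matching subsets are $\{2,4\}$ and $\{1,5\}$ out of $10$.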

Aig
  • 5,725
  • 1
    You may note that your first statement "If $m$ and $n$ are of different parity, then the probability is $0$. It’s because the median of one set will be an element of $S$, and of the other one won’t." is not always correct. Please see the remark in my answer and let me know your feedback (@GadRaganas). – Amir Apr 19 '24 at 12:55
  • 2
    @Amir true, I corrected the answer. – Aig Apr 19 '24 at 14:00
  • 2
    You are welcome! – Amir Apr 19 '24 at 14:14
  • Wow you guys are rockstars! It is unfortunate that I cannot push the button twice. It was pissing me off that in some of my attempts I was getting a -1 and a -2 term here and there and I could not get it down right. Where is a good starting point to read about this sort of stuff? – Gad Raganas May 03 '24 at 01:26
  • 1
    @GadRaganas For example, here. More suggestions here. – Aig May 03 '24 at 10:52
1

The answer for a general finite set $S$, including the special case $S=\{1,\dots, n\}$, is provided here in three cases. Let $P_M$ denote the set of pairs $(a,b)\in S\times S$ with $a<b$ such that $a+b=2\,\text{Median}(S)$. Moreover, define the sets $S(<c)=\{x\in S: x<c\}$ and $S(>c)=\{x\in S: x>c\}$ for any $c \in S$.

Remark: In an earlier version of @Aig's answer, presented for $S=\{1,\dots, n\}$, the probability for the case that $m$ is even and $n$ is odd was taken to be $0$, which is not always correct. Indeed, for $S=\{1,\dots, 2k+1\}$, whose median is $k+1$, and $m=2$, there are $k$ sets $\{1, 2k+1\},\{2, 2k\}, \dots ,\{k, k+2\}$ whose medians are all $k+1$, so the probability is not zero. The number of such sets for any even $m$ can be obtained using the formula given for Case 2, by considering that $|S(<c)|=c-1$, $|S(>c)|=n-c$, and $$P_M=\big\{(1, 2k+1),(2, 2k), \dots ,(k, k+2)\big\}.$$

Case 1: Both $m$ and $n$ are odd numbers

The probability is

$$\frac{\begin{pmatrix} \frac{n-1}{2} \\ \frac{m-1}{2} \end{pmatrix} ^2}{\begin{pmatrix} n \\ m \end{pmatrix}}$$

Case 2: $m$ is even ($n$ can be even or odd)

The probability is

$$\frac{\sum_{(a,b)\in P_M}\begin{pmatrix} |S(<a)| \\ \frac{m-2}{2} \end{pmatrix}\begin{pmatrix} |S(>b)| \\ \frac{m-2}{2} \end{pmatrix}}{\begin{pmatrix} n \\ m \end{pmatrix}}.$$

In this case, the median of the subset can equal the median of $S$ only if there are pairs in $S$ whose average is equal to the median of $S$, that is, only if $P_M\neq \emptyset$.
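To make Case 2 concrete for a set that is not $\{1,\dots,n\}$, here is a rough sketch that evaluates the numerator and compares it with brute-force enumeration. The function names are made up for illustration, and the code assumes the elements of $S$ are distinct integers so that the comparison with twice the median is exact.

```python
from itertools import combinations
from math import comb

def doubled_median(xs):
    """Twice the median of a sorted sequence (avoids fractions)."""
    n, mid = len(xs), len(xs) // 2
    return 2 * xs[mid] if n % 2 else xs[mid - 1] + xs[mid]

def count_even_m(S, m):
    """Numerator of the Case 2 formula: sum over pairs (a, b) in P_M."""
    xs = sorted(set(S))
    n, h, target = len(xs), (m - 2) // 2, doubled_median(xs)
    total = 0
    for i in range(n):
        for j in range(i + 1, n):
            if xs[i] + xs[j] == target:  # (xs[i], xs[j]) is in P_M
                # i elements lie below a, n - 1 - j elements lie above b.
                total += comb(i, h) * comb(n - 1 - j, h)
    return total

def brute_force(S, m):
    """Count matching subsets by direct enumeration, for comparison."""
    xs = sorted(set(S))
    target = doubled_median(xs)
    return sum(1 for B in combinations(xs, m) if doubled_median(B) == target)

S = [1, 2, 4, 7, 8, 10]
print(count_even_m(S, 4), brute_force(S, 4))  # 4 4
```

Here the median of $S$ is $5.5$ and $P_M=\{(1,10),(4,7)\}$; only the pair $(4,7)$ contributes, giving $2\cdot 2=4$ subsets out of $\binom 64=15$.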

Case 3: $m$ is odd and $n$ is even

The probability is $0$: the median of $B$ is an element of $B\subseteq S$, whereas the median of $S$, being the average of its two distinct middle numbers, is not in $S$.

Amir
  • 11,124