3

Suppose I have a set of $n$ real numbers, $\{x_1, x_2, \dots, x_n\}$.

I choose a uniformly random subset of size $m \le n$.

What is the expected maximum of the subset in terms of $n$, $m$ and $x_i$?

1 Answers1

2

Let us assume $x_1 \leq x_2 \leq \cdots \leq x_n$.

With this ordering, note that the maximum of a $m$ element subset must be $\geq x_m$.

Total number of ways of choosing a $m$ element subset is $\dbinom{n}m$.

Out of these, $x_m$ is will be a maximum on $1$ occasion i.e. when the subset chosen is $\{x_1,x_2,\ldots,x_m\}$.

In general, $x_k$ will be a maximum on $\dbinom{k-1}{m-1}$. This is so since fixing $x_k$, the remaining $m-1$ $x_i$'s required to form the $m$ element set must all be less than or equals $x_k$ i.e. we need $i < k$. Hence, the number of ways to choose $m-1$ $x_i$'s such that $i < k$ is $\dbinom{k-1}{m-1}$.

Hence, the expected maximum of a $m$ element subset is $$\dfrac{\displaystyle \sum_{k=m}^{n} \dbinom{k-1}{m-1} x_k}{\dbinom{n}{m}}$$

  • And this expression simplifies nicely when the set is ${1,2,\dots,n}$: http://math.stackexchange.com/questions/65398/why-does-this-expected-value-simplify-as-shown –  Jun 25 '12 at 22:59
  • @ByronSchmuland Just to make sure I am not imagining things didn't you make the same comment to this post around $10$ minutes back? I find that the timestamp says $1$ minute ago instead of $10$ minutes ago. –  Jun 25 '12 at 23:03
  • Yes I deleted and re-commented to correct a typo $m\to n$. –  Jun 25 '12 at 23:04
  • @ByronSchmuland Oh Ok. Thanks. I just wanted to make sure that I was normal! –  Jun 25 '12 at 23:05