
The problem statement and the official answer were not originally written in English, by the way.

$7$ piggy banks and $7$ corresponding keys are given.

Each key is stored in a piggy bank.

A key can be obtained by opening its piggy bank with the corresponding key, or by breaking that piggy bank.

Evaluate the expected number of piggy banks that need to be broken to obtain all the keys.

The official answer is given by the following formula.

$$ 1+{1\over 7}+{1\over 6}+{1\over 5}+{1\over 4}+{1\over 3}+{1\over 2}={363\over 140} ~~~\left[\text{piggy banks}\right] $$
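As a quick sanity check on the arithmetic only (not on where the terms come from), the sum can be evaluated exactly with rational numbers. This is a minimal sketch; the variable names are illustrative.

```python
from fractions import Fraction

# Number of piggy banks in the problem.
n = 7

# Sum 1 + 1/2 + ... + 1/7 exactly, using rational arithmetic.
harmonic = sum(Fraction(1, k) for k in range(1, n + 1))

print(harmonic)  # 363/140
```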

I can understand that at least one piggy bank must be broken to obtain the first key, but I cannot see what the remaining terms on the LHS of the equation mean.

To begin with, the definition of an expected value that I know is the following, which does not seem to relate to this problem.

$$ E[X]=\sum_{i=1}^{n}p_i x_i~~~\text{or}~~~E[X]=\int_{-\infty}^{+\infty}x\cdot\operatorname{pdf}(x)~\mathrm{d}x $$

If we act optimally, the key inside a piggy bank falls into one of two cases.

Either it is a key that has not been obtained yet, or it is the key to the piggy bank that was broken most recently.

Hence, once we have obtained a key and $k$ unbroken piggy banks remain, the probability that the next bank has to be broken is $~{1\over k+1}~$.

tangent_26

1 Answer


The number of piggy banks you need to break is equal to the number of "cycles" in the permutation. For example, imagine you break bank 1 and it contains the key to bank 6. Then you open bank 6 and it has the key to bank 1. This is a cycle of length 2. You should be able to see that you will have to break exactly one bank per cycle. If you get lucky and get a cycle that contains all seven banks, you will only have to break one bank.
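The claim can be checked with a small simulation. The sketch below assumes the keys are placed one per bank, so the placement is a permutation; banks are numbered from 0, and `key_in_bank`, `banks_broken` and `count_cycles` are illustrative names, not part of the original problem.

```python
import random

def banks_broken(key_in_bank):
    """Simulate the process: use a key whenever one fits, break otherwise.

    key_in_bank[i] is the bank opened by the key stored inside bank i.
    """
    n = len(key_in_bank)
    opened, keys, broken = set(), set(), 0
    while len(opened) < n:
        usable = [b for b in keys if b not in opened]
        if usable:
            bank = usable[0]  # we hold a key to an unopened bank
        else:
            # No usable key in hand: break any unopened bank.
            bank = next(b for b in range(n) if b not in opened)
            broken += 1
        opened.add(bank)
        keys.add(key_in_bank[bank])
    return broken

def count_cycles(perm):
    """Count the cycles of a permutation given as a list."""
    seen, cycles = [False] * len(perm), 0
    for start in range(len(perm)):
        if not seen[start]:
            cycles += 1
            i = start
            while not seen[i]:
                seen[i] = True
                i = perm[i]
    return cycles

# The example above, 0-indexed: banks 0 and 5 hold each other's keys,
# every other bank holds its own key.
example = [5, 1, 2, 3, 4, 0, 6]
print(banks_broken(example), count_cycles(example))  # 6 6

# One cycle through all seven banks: a single break suffices.
print(banks_broken([1, 2, 3, 4, 5, 6, 0]))  # 1

# Random placements: the two counts always agree.
rng = random.Random(0)
for _ in range(1000):
    perm = list(range(7))
    rng.shuffle(perm)
    assert banks_broken(perm) == count_cycles(perm)
```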

The math for the average number of cycles has been shown in several other posts (for example, "Expected number of cycles in permutation"), but I will attempt a new explanation.

Let $X_i$ denote a random variable that tells what fraction of bank $i$'s "cycle" is accounted for by bank $i$. For example, if bank 1 is in a cycle of length four, then it accounts for one fourth of its cycle, and $X_1=\frac 14$. A cycle of length $k$ contains $k$ banks, each contributing $\frac 1k$, so every cycle contributes exactly 1 and the total number of cycles is just $\sum X_i$. Each $X_i$ has the same distribution, so by linearity of expectation the expected value of this sum is just $7 E[X_1]$.

The probability of bank 1 being in a cycle of length 1 is just the probability that key 1 is in bank 1, $P(X_1 = 1) = \frac 17$.

The probability of bank 1 being in a cycle of length exactly 2 is the probability that bank 1 gets a key other than 1, times the probability that that bank gets key 1, so $P(X_1 = \frac 12) = \frac 67 \cdot \frac 16 = \frac 17$.

The probability of bank 1 being in a cycle of exactly length 3 is the probability that the first two banks get keys other than bank 1's or their own, and the third bank gets bank 1's key: $P(X_1 = \frac 13) = \frac 67 \cdot \frac 56 \cdot \frac 15 = \frac 17$.

Similarly, $P(X_1=\frac 1k) = \frac 17$ for all $1 \leq k \leq 7$.

The expected value is then $E[X_1] = \sum_{k=1}^{7} \frac 17 \cdot \frac 1k$, and the expected total number of cycles is 7 times this value, which is just the sum of the first seven terms of the harmonic series, $\frac{363}{140}$.
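Both steps can be confirmed by brute force over all $7! = 5040$ placements. The following is a sketch with banks numbered from 0 and hypothetical helper names; it computes the number of cycles as $\sum X_i$, exactly as defined above.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations
from math import factorial

n = 7
perms = list(permutations(range(n)))

def cycle_length_of(perm, start):
    """Length of the cycle of `perm` that contains `start`."""
    length, i = 1, perm[start]
    while i != start:
        i = perm[i]
        length += 1
    return length

# Distribution of the cycle length of bank 0 (bank 1 in the text):
# each length 1..7 should occur in 6! = 720 placements, i.e. with probability 1/7.
length_counts = Counter(cycle_length_of(p, 0) for p in perms)
print(sorted(length_counts.items()))

# Number of cycles written as the sum of X_i = 1 / (length of i's cycle),
# averaged over all placements.
total = sum(
    sum(Fraction(1, cycle_length_of(p, i)) for i in range(n)) for p in perms
)
expected_cycles = total / len(perms)
print(expected_cycles)  # 363/140
```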

3rdMoment
  • It's not intuitive to me that $X_a$ and $X_b$ are independent (i.e. that $E \left[\sum_{i=1}^{7}X_i\right] = \sum_{i=1}^{7}E[X_i]$ holds)... – tangent_26 Oct 24 '22 at 01:51
  • They are not independent, but you don't need independence. $E[X+Y] = E[X]+E[Y]$ whether or not X and Y are independent. – 3rdMoment Oct 24 '22 at 03:24
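The point about linearity can be illustrated by enumeration. This is a sketch with banks numbered from 0 and illustrative helper names: the expectations of $X_0$ and $X_1$ add, even though the two variables are clearly dependent (if one bank lies in a 7-cycle, so does the other).

```python
from fractions import Fraction
from itertools import permutations

n = 7
perms = list(permutations(range(n)))

def x(perm, i):
    """X_i: one over the length of the cycle containing bank i."""
    length, j = 1, perm[i]
    while j != i:
        j = perm[j]
        length += 1
    return Fraction(1, length)

def expect(f):
    """Exact expectation of f over all equally likely placements."""
    return sum(f(p) for p in perms) / len(perms)

e0 = expect(lambda p: x(p, 0))
e1 = expect(lambda p: x(p, 1))
e_sum = expect(lambda p: x(p, 0) + x(p, 1))
print(e_sum == e0 + e1)  # True: linearity holds

# Dependence: compare P(X_0 = 1/7 and X_1 = 1/7) with the product of marginals.
seventh = Fraction(1, n)
p0 = Fraction(sum(1 for p in perms if x(p, 0) == seventh), len(perms))
p1 = Fraction(sum(1 for p in perms if x(p, 1) == seventh), len(perms))
p_joint = Fraction(
    sum(1 for p in perms if x(p, 0) == seventh and x(p, 1) == seventh),
    len(perms),
)
print(p_joint, p0 * p1)  # 1/7 1/49
```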