I need to show that, given a fair coin (heads and tails each have probability $0.5$, and all tosses are i.i.d.), the following holds:
$$E[M] \geq \log_2(n) - O(1)$$
where $M$ is the length of the longest consecutive run of tails and $n$ is the number of throws.
I tried a bunch of ways, but I got stuck towards the end.
One way was partitioning the "word", which is the sequence of $0$'s and $1$'s representing tails and heads respectively, into blocks of length $\log_2(n)$, and trying to use Markov's inequality to show that:
$$ \begin{align} E[M] &\geq \log_2(n)\, P(M\geq \log_2(n)) \geq \log_2(n)\, P(\text{at least one block is all zeroes}) \\&\geq \log_2(n) \left(1-\left(1-\frac 1 n \right)^{\frac {n} {\log_2(n)}}\right) \end{align} $$
but that led me nowhere...
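A quick numerical check suggests why this bound stalls. By Bernoulli's inequality, $(1-\frac1n)^{n/\log_2 n}\ge 1-\frac{1}{\log_2 n}$, so the right-hand side never exceeds $1$, no matter how large $n$ is. The sketch below just evaluates it; the function name `markov_block_bound` is made up for illustration.

```python
import math


def markov_block_bound(n: int) -> float:
    """Evaluate log2(n) * (1 - (1 - 1/n) ** (n / log2(n))) for n >= 2."""
    log_n = math.log2(n)
    return log_n * (1.0 - (1.0 - 1.0 / n) ** (n / log_n))


if __name__ == "__main__":
    # Compare the bound with log2(n) for a few powers of two.
    for exponent in (4, 8, 16, 32):
        n = 2 ** exponent
        print(f"n = 2^{exponent}: bound = {markov_block_bound(n):.4f}, "
              f"log2(n) = {exponent}")
```

So a single application of Markov's inequality at the threshold $\log_2(n)$ can only give $E[M]\ge$ a constant; summing tail probabilities over many thresholds is one way around this.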
Any help would be appreciated!
This answer gives a proof of this via generating functions and complex analysis. I suspect that there aren't any proofs which are both simple and rigorous. – Mike Earnest Jan 03 '25 at 00:15
1 Answer
In Analytic Combinatorics, Flajolet and Sedgewick give a rather involved proof that $E[M_n]=\log n-O(1)$, using generating functions and complex analysis. I can do no better than to cite this answer by Markus Scheuer for a great explanation of Flajolet and Sedgewick's method.
In this answer, I will just use elementary methods to prove the following looser bound for the mean:
$$ E[M_n]\ge \log n-O(\log \log n). $$
Throughout, $\log$ denotes $\log_2$.

My strategy is to write $E[M_n]=\sum_{t=1}^\infty P(M_n\ge t)$, and then find a lower bound for $P(M_n\ge t)$. Breaking the string into $\lfloor n/t\rfloor$ disjoint blocks, each with $t$ flips, we see that $\{M_n\ge t\}$ occurs as long as at least one block is all tails. Therefore,
$$ P(M_n\ge t) \ge 1-\left(1-\frac1{2^t}\right)^{\lfloor n/t\rfloor } \approx 1-\left(1-\frac1{2^t}\right)^{n/t}. $$

Let $t=\log n - k$, so that $2^{-t}=2^k/n$. The game we are playing is to see how small we can choose $k$ to be such that $P(M_n\ge \log n-k)\approx 1$, in an appropriate sense. Using the inequality $(1-x/n)^n\le e^{-x}$, valid when $|x|\le n$,
$$ \begin{align} P(M_n\ge \log n-k) \ge 1-\left(1-\frac{2^k}n\right)^{n/(\log n-k)} \ge 1-\exp\left(-\frac{2^k}{\log n-k}\right). \end{align} $$

Now, let $\log^{(i)}(x)$ denote the $i$-fold iteration $\log \log \cdots \log x$, and let $k=\log^{(2)} n+\log^{(3)} n$. For this choice of $k$, we have
$$ \begin{align} P(M_n\ge \log n-k) &\ge 1-\exp\left(-\frac{2^{\log^{(2)}n +\log^{(3)} n}}{\log n-\log^{(2)}n -\log^{(3)}n}\right) \\&\approx 1-\exp\left( -\frac {\log n\cdot \log^{(2)}n} {\log n\color{gray}{-0-0}}\right) \\&\ge 1-\frac1{\log n}, \end{align} $$
where the last step uses $e^{-x}\le 2^{-x}$ for $x\ge 0$.

We see that, for all $t$ up to $\log n-\log^{(2)}n-\log^{(3)} n$, we have $P(M_n\ge t)\ge 1-\frac1{\log n}$. This allows us to get the following lower bound for $E[M_n]=\sum_t P(M_n\ge t)$:
$$ \begin{align} E[M_n] &\ge \sum_{t=1}^{\log n-\log^{(2)}n-\log^{(3)} n}P(M_n\ge t) \\&\ge (\log n-\log^{(2)}n-\log^{(3)}n)\cdot \left(1-\frac1{\log n}\right) \\&=\log n-O(\log^{(2)}n). \end{align} $$
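As a sanity check on the block argument, here is a minimal Python sketch that compares the exact value of $E[M_n]$ for small $n$ with the un-approximated block bound $\sum_{t=1}^{n}\bigl(1-(1-2^{-t})^{\lfloor n/t\rfloor}\bigr)$. The exact value comes from counting strings with no run of $t$ tails via a standard recurrence, which is not part of the argument above. The function names `count_no_run`, `expected_longest_run` and `block_lower_bound` are made up for illustration.

```python
import math


def count_no_run(n: int, t: int) -> int:
    """Count length-n binary strings with no run of t consecutive tails (t >= 1)."""
    f = [0] * (n + 1)
    for m in range(n + 1):
        if m < t:
            f[m] = 2 ** m              # too short to contain a run of length t
        elif m == t:
            f[m] = 2 ** t - 1          # only the all-tails string is excluded
        else:
            # Sliding-window form of f(m) = f(m-1) + ... + f(m-t).
            f[m] = 2 * f[m - 1] - f[m - t - 1]
    return f[n]


def expected_longest_run(n: int) -> float:
    """Exact E[M_n] = sum over t of P(M_n >= t), computed with integers."""
    total = 2 ** n
    numerator = sum(total - count_no_run(n, t) for t in range(1, n + 1))
    return numerator / total


def block_lower_bound(n: int) -> float:
    """Sum over t of 1 - (1 - 2^-t)^floor(n/t), the disjoint-block bound."""
    return sum(1.0 - (1.0 - 0.5 ** t) ** (n // t) for t in range(1, n + 1))


if __name__ == "__main__":
    for n in (16, 64, 256):
        print(f"n = {n}: log2(n) = {math.log2(n):.3f}, "
              f"E[M_n] = {expected_longest_run(n):.3f}, "
              f"block bound = {block_lower_bound(n):.3f}")
```

For $n=3$ the exact value is $11/8$ and the block bound is $5/4$, which can be checked by listing all eight strings.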