
In our lecture we prove the following statement (the Erdős–Rényi law of runs):

We consider the probability space $(\Omega:=\{0,1\}^{\mathbb{N}},\mathcal{F},\mathbb{P})$ and define a Bernoulli experiment of length $n$ with success probability $p$. Let $R_n$ be the length of the longest run, i.e. $$ R_n:=\max\left\{l-k\mid 0\leq k<l\leq n, \frac{S_l-S_k}{l-k}=1\right\} $$ where $S_l$ and $S_k$ denote the number of successes up to the $l$-th and $k$-th step, respectively. Then $$ \mathbb{P}\left(\lim\limits_{n\to\infty}\frac{R_n}{\ln(n)}\text{ exists and equals }\frac{1}{\ln\left(\frac{1}{p}\right)}\right)=1. $$

The proof relies on the heuristic assumption that the longest run of the Bernoulli experiment of length $n$ is unique, so that we can use the relation $$1=np^{R_n}\implies R_n=\frac{\ln(n)}{\ln\left(\frac{1}{p}\right)}.$$ To make this concrete: if we conduct the experiment $n$ times, then there is exactly one tuple $\omega\in\Omega$ which contains $R_n$-many $1$'s in a row, e.g. $\omega=(0,1,0,\underset{R_n\text{-many}}{\underbrace{1,1,1,1,\dots,1}},1,1,0,0,1,0,1,1,0,\dots)$.

If we conduct the experiment $n+1$ times, then there is exactly one tuple $\omega'\in\Omega$ which contains $R_{n+1}$-many $1$'s in a row, e.g. $\omega'=(0,1,0,\underset{R_{n+1}\text{-many}}{\underbrace{1,1,1,1,\dots,1}},1,1,0,0,1,0,1,1,0,\dots)$. And so on...
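For what it's worth, here is a quick simulation of my own (a sketch with $p=1/2$, so the claimed limit is $1/\ln 2\approx 1.4427$) that illustrates the limit the theorem asserts:

```python
import math
import random

def longest_run(bits):
    """Length of the longest run of 1's in a 0/1 sequence."""
    best = cur = 0
    for b in bits:
        cur = cur + 1 if b == 1 else 0
        best = max(best, cur)
    return best

random.seed(0)
p = 0.5  # success probability; the claimed limit is 1/ln(1/p) = 1/ln 2
for n in (10**3, 10**4, 10**5, 10**6):
    bits = [1 if random.random() < p else 0 for _ in range(n)]
    rn = longest_run(bits)
    print(f"n={n:>8}  R_n={rn:>3}  R_n/ln(n)={rn / math.log(n):.4f}")
```

The ratio $R_n/\ln(n)$ hovers near $1.44$ for large $n$, though the convergence is slow and the fluctuations are clearly visible.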

I don't understand why we can simply make this assumption. Maybe someone more familiar with this can explain it to me?

Philipp
  • When you say it is unique, do you mean, there is only one $R$ that $R_n=R$ for all $n$? That would be a reasonable guess, but is clearly a false statement. Also I don't know why $1=np^{R_n}$ is considered, so it makes it a bit harder to guess the argument – FShrike Oct 12 '22 at 14:38
  • @FShrike each $n$ has its own unique $R_n$. – Philipp Oct 12 '22 at 14:42
  • Well. There can only be one maximum value – FShrike Oct 12 '22 at 15:07
  • @FShrike, I think my last comment was a bit sloppy. I hope my edit clarifies it. – Philipp Oct 12 '22 at 16:14

1 Answer


The proof relies on the heuristic assumption that the longest run of the Bernoulli experiment of length $n$ is unique

Your sample space $\Omega:=\{0,1\}^{\mathbb{N}}$ is infinite, but it seems to me that any such "heuristic assumption" ought to be consistent with the behavior of longest runs in finite experiments with sample spaces $\Omega_n=\{0,1\}^n$, $n\in\mathbb{N}$, in the limit as $n\to\infty.$

In particular, defining $X_n$ to be the number of occurrences of the longest run of $1$s in a sequence of $n$ i.i.d. Bernoulli(1/2) random variables, it can be shown$^\dagger$ that $$\begin{align}P[X_n=1] &={1\over 2 \log(2)}+\delta_n\tag{1}\\[2ex] &=0.7213...+\delta_n\end{align}$$
where $|\delta_n|<10^{-5}$ for all $n>30000$. So the assumption of a unique longest run fails in roughly $28\%$ of these experiments as $n\to\infty$.


Formulas for $P[X_n=1]$ can be obtained by noting that the length-$n$ binary sequences with a unique longest run are in bijection with the compositions of $n+1$ having a unique largest part. Calling the number of such sequences $a(n)$, we thus have $P[X_n=1]={a(n)/ 2^n}$. The sequence $a(n)$ is listed at OEIS A097979, which gives both a generating function and a Mathematica program. The following pictures show the "almost convergence" of $P[X_n=1]$, which has interesting oscillations as $n\to\infty$:
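For small $n$, the count $a(n)$ can be checked by brute force. Here is a sketch of my own (function names are my choice; the all-zero sequence, which has no run of $1$s, is not counted):

```python
from itertools import product

def longest_run(bits):
    """Length of the longest run of 1's."""
    best = cur = 0
    for b in bits:
        cur = cur + 1 if b else 0
        best = max(best, cur)
    return best

def runs_of_length(bits, L):
    """Number of maximal runs of 1's having length exactly L."""
    count = cur = 0
    for b in tuple(bits) + (0,):  # trailing 0 terminates a final run
        if b:
            cur += 1
        else:
            if cur == L:
                count += 1
            cur = 0
    return count

def a(n):
    """Count length-n binary sequences whose longest run of 1's occurs exactly once."""
    total = 0
    for bits in product((0, 1), repeat=n):
        L = longest_run(bits)
        if L > 0 and runs_of_length(bits, L) == 1:
            total += 1
    return total

for n in range(1, 13):
    print(n, a(n), a(n) / 2**n)
```

The ratios $a(n)/2^n$ drift toward the value near $0.72$ discussed above, though $n\le 12$ is far too small to see the limiting oscillations.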

*(image: probability that the longest run is unique)*

The sequence has been proved$^\dagger$ to oscillate around $1/(2\log 2)$ (solid red line) with ever-decreasing amplitude that converges to $0.00000713...$ as $n\to\infty.$ (For $n>30$ it's infeasible to directly count the sequences that have a unique longest run, so for larger $n$ up to $10000$ I used a SageMath translation of the Mathematica program, and for $n>10000$ I used the generating function (with help from ask.sagemath). The plot on the right shows only every tenth point up to $n=10000$, and every hundredth point thereafter; the changeover at $10000$ is visible in the spacing of the dots, which also decreases as the slope decreases.)


$^\dagger$ I learned this from an answer to my question about a conjectured limit distribution of $X_n$ (which turns out not to exist, except "almost"). The proof is in "Brands, J. J. A. M., Steutel, F. W., & Wilms, R. J. G. (1994). On the number of maxima in a discrete sample. Statistics & Probability Letters, 20(3), 209–217. doi:10.1016/0167-7152(94)90044-2". The paper is behind a paywall, but a preprint is available.

r.e.s.