
In our lecture we prove the following statement (the Erdős–Rényi law of runs):

We consider the probability space $(\Omega:=\{0,1\}^{\mathbb{N}},\mathcal{F},\mathbb{P})$ and define a Bernoulli experiment of length $n$ with success probability $p$. Let $R_n$ be the length of the longest run, i.e. $$ R_n:=\max\left\{l-k\mid 0\leq k<l\leq n, \frac{S_l-S_k}{l-k}=1\right\} $$ where $S_l$ and $S_k$ denote the number of successes up to the $l$-th and $k$-th step, respectively. Then $$ \mathbb{P}\left(\lim\limits_{n\to\infty}\frac{R_n}{\ln(n)}\text{ exists and equals }\frac{1}{\ln\left(\frac{1}{p}\right)}\right)=1. $$

The proof relies on the heuristic assumption that the longest run of the Bernoulli experiment of length $n$ is unique, so that we can use the relation $$1=np^{R_n}\implies R_n=\frac{\ln(n)}{\ln\left(\frac{1}{p}\right)}.$$ To make this concrete: if we conduct the experiment $n$ times, then there is exactly one tuple $\omega\in\Omega$ which contains $R_n$-many $1$'s in a row, e.g. $\omega=(0,1,0,\underset{R_n\text{-many}}{\underbrace{1,1,1,1,\dots,1}},1,1,0,0,1,0,1,1,0,\dots)$.

If we conduct the experiment $n+1$ times, then there is exactly one tuple $\omega'\in\Omega$ which contains $R_{n+1}$-many $1$'s in a row, e.g. $\omega'=(0,1,0,\underset{R_{n+1}\text{-many}}{\underbrace{1,1,1,1,\dots,1}},1,1,0,0,1,0,1,1,0,\dots)$. And so on...
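For what it's worth, here is a quick simulation of my own (a sketch with $p=1/2$, so the claimed limit is $1/\ln 2\approx 1.4427$) that illustrates the limit the theorem asserts:

```python
import math
import random

def longest_run(bits):
    """Length of the longest run of 1's in a 0/1 sequence."""
    best = cur = 0
    for b in bits:
        cur = cur + 1 if b == 1 else 0
        best = max(best, cur)
    return best

random.seed(0)
p = 0.5  # success probability; the claimed limit is 1/ln(1/p) = 1/ln 2
for n in (10**3, 10**4, 10**5, 10**6):
    bits = [1 if random.random() < p else 0 for _ in range(n)]
    rn = longest_run(bits)
    print(f"n={n:>8}  R_n={rn:>3}  R_n/ln(n)={rn / math.log(n):.4f}")
```

The ratio $R_n/\ln(n)$ hovers near $1.44$ for large $n$, though the convergence is slow and the fluctuations are clearly visible.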

I don't understand why we can simply make this assumption. Maybe someone more familiar with this can explain it to me?

Philipp
  • When you say it is unique, do you mean, there is only one $R$ that $R_n=R$ for all $n$? That would be a reasonable guess, but is clearly a false statement. Also I don't know why $1=np^{R_n}$ is considered, so it makes it a bit harder to guess the argument – FShrike Oct 12 '22 at 14:38
  • @FShrike each $n$ has its own unique $R_n$. – Philipp Oct 12 '22 at 14:42
  • Well. There can only be one maximum value – FShrike Oct 12 '22 at 15:07
  • @FShrike, I think my last comment was a bit sloppy. I hope my edit clarifies it. – Philipp Oct 12 '22 at 16:14

1 Answer


The proof relies on the heuristic assumption that the longest run of the Bernoulli experiment of length $n$ is unique

Your sample space $\Omega:=\{0,1\}^{\mathbb{N}}$ is infinite, but it seems to me that any such "heuristic assumption" ought to be consistent with the behavior of longest runs in finite experiments with sample spaces $\Omega_n=\{0,1\}^n$, $n\in\mathbb{N}$, in the limit as $n\to\infty.$

In particular, defining $X_n$ to be the number of occurrences of the longest run of $1$s in a sequence of $n$ i.i.d. Bernoulli(1/2) random variables, it can be shown$^\dagger$ that $$\begin{align}P[X_n=1] &={1\over 2 \log(2)}+\delta_n\tag{1}\\[2ex] &=0.7213...+\delta_n\end{align}$$
where $|\delta_n|<10^{-5}$ for all $n>30000$. So the assumption of a unique longest run fails in roughly $28\%$ of these experiments as $n\to\infty$.


Formulas for $P[X_n=1]$ can be obtained by noting that the length-$n$ binary sequences with a unique longest run are in bijection with the compositions of $n+1$ having a unique largest part. Calling the number of such sequences $a(n)$, we thus have $P[X_n=1]={a(n)/ 2^n}$. The sequence $a(n)$ is listed at OEIS A097979, which gives both a generating function and a Mathematica program. The following pictures show the "almost convergence" of $P[X_n=1]$, which has interesting oscillations as $n\to\infty$:
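For small $n$, the count $a(n)$ can be checked by brute force. Here is a sketch of my own (function names are my choice; the all-zero sequence, which has no run of $1$s, is not counted):

```python
from itertools import product

def longest_run(bits):
    """Length of the longest run of 1's."""
    best = cur = 0
    for b in bits:
        cur = cur + 1 if b else 0
        best = max(best, cur)
    return best

def runs_of_length(bits, L):
    """Number of maximal runs of 1's having length exactly L."""
    count = cur = 0
    for b in tuple(bits) + (0,):  # trailing 0 terminates a final run
        if b:
            cur += 1
        else:
            if cur == L:
                count += 1
            cur = 0
    return count

def a(n):
    """Count length-n binary sequences whose longest run of 1's occurs exactly once."""
    total = 0
    for bits in product((0, 1), repeat=n):
        L = longest_run(bits)
        if L > 0 and runs_of_length(bits, L) == 1:
            total += 1
    return total

for n in range(1, 13):
    print(n, a(n), a(n) / 2**n)
```

The ratios $a(n)/2^n$ drift toward the value near $0.72$ discussed above, though $n\le 12$ is far too small to see the limiting oscillations.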

*(image: probability that the longest run is unique)*

The sequence has been proved$^\dagger$ to oscillate around $1/(2\log 2)$ (solid red line) with ever-decreasing amplitude that converges to $0.00000713...$ as $n\to\infty.$ (For $n>30$ it's infeasible to directly count the sequences that have a unique longest run, so for larger $n$ up to $10000$ I used a SageMath translation of the Mathematica program, and for $n>10000$ I used the generating function (with help from ask.sagemath). The plot on the right shows only every tenth point up to $n=10000$, and every hundredth point thereafter; the changeover at $10000$ is visible in the spacing of the dots, which also decreases as the slope decreases.)


$^\dagger$ I learned this from an answer to my question about a conjectured limit distribution of $X_n$ (which turns out not to exist, except "almost"). The proof is in "Brands, J. J. A. M., Steutel, F. W., & Wilms, R. J. G. (1994). On the number of maxima in a discrete sample. Statistics & Probability Letters, 20(3), 209–217. doi:10.1016/0167-7152(94)90044-2". The paper is behind a paywall, but a preprint is available.

r.e.s.