A coin is tossed 100 times. How many instances of at least 5 heads in a row do we expect to see?

No overlaps: we are counting maximal runs of at least 5 heads, so, for example, a run of 6 does not count as 2 runs of 5.

I have received an answer to this question from someone with a PhD in Statistics, yet their theoretical answer does not agree with my code simulation.

Theoretically, the answer would be

$ \frac{96}{32} - \frac{95}{64} + \frac{94}{128} - \frac{93}{256} + \frac{92}{512} - ...$

since we expect $\frac{96}{32}=3$ instances of 5 heads in a row (if we allow overlap), and if we apply the Inclusion-Exclusion Principle, we can correct for double counts of runs of 6, 7, 8, etc...
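
As a sanity check on the value of this series, here is a minimal sketch that evaluates it exactly with Python's `fractions` module. The helper name `alternating_series` and its parameters are assumptions for illustration, and the sum is cut off at the last window that fits in 100 tosses (the omitted tail is negligible).

```python
from fractions import Fraction

def alternating_series(n_tosses=100, run_length=5):
    """Sum 96/2^5 - 95/2^6 + 94/2^7 - ... down to the window of n_tosses heads."""
    total = Fraction(0)
    sign = 1
    for k in range(run_length, n_tosses + 1):
        windows = n_tosses - k + 1  # places where k consecutive tosses can start
        total += sign * Fraction(windows, 2 ** k)
        sign = -sign
    return total

series_value = alternating_series()
print(float(series_value))  # a little above 2
```

So the series itself really is about 2; any disagreement with the simulation has to come from the reasoning behind the series, not from the arithmetic.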

The problem is, this answer is approx 2, but my coded simulation always results in about 1.5:

import numpy as np
import pandas as pd
def random_list(length):
    random_list = np.zeros(length) #list of "length" zeros
    for i in range(len(random_list)):
        random_value = np.random.random() #instantiate random value for each i
        if random_value > 0.5: #for approximately half of the random values
            random_list[i] = 1
        else:
            random_list[i] = 0
    return random_list #random_list is now a random list of zeros and ones


def count_ones(array):
    runs = 0
    i = 0
    while i < (len(array) - 1): #iterate over each index in list
        if array[i] == 1: #find a value of one
            j = i + 1 #the next value is j
            while array[j] == 1: #iterate over indices until we hit a zero or end of array
                if j == (len(array)-1):
                    break #break out of loop if we are at the end of the list
                j += 1
            k = j #we now have either the first zero after a list of ones, or we are at the end of the list
            ones = k - i #how many ones in a row
            if ones >= 5:
                runs += 1 #count this as a run of 5
        else: #if array[i] == 0
            k = i # necessary so that code after if/else conditional runs
        i = k + 1 # loop will iterate over index after k
    return runs


def average_runs(trials):
    results = np.zeros(trials)
    for i in range(trials): #do it a large number of times
        array = random_list(100)
        runs = count_ones(array)
        results[i] = runs
    average = sum(results)/len(results) #take the average
    return average


average_runs(1000)

Can anyone explain why the simulation and theory do not agree?

Try that for real, either with an actual coin or a computer simulation and you should find that even runs of 10, let alone five, are common.
I took my results to a Cambridge PhD in number theory, who said 'Yes… that's what most of my students find.'

My own trial started not with mere binary coin tosses, but roulette spins… though in either case, runs of 12 or 13 in a row were not uncommon.

If your particular simulation doesn't agree with whichever theory you're following, what does that suggest?

Almost separately, why did you need such complex code for such a simple problem? — Robbie Goodwin, May 28 '25 at 20:56
You don't need numpy, pandas, or any if/while logic to test this with python, by the way. Your code works great, but I had fun trying to write something hopefully simpler: https://colab.research.google.com/drive/19e5IeaPmDonNir5Em7m8mO65PRXbKZme?usp=sharing — TylerW, May 29 '25 at 00:29
You might enjoy reading about Shannon's communications theory. — Carl Witthoft, May 29 '25 at 15:27
3k views in 2 days even without bounty... that's crazy. btw, I'm kinda disappointed not seeing a generating-function answer :) — Quý Nhân, May 30 '25 at 13:31
There's a bug in your code: count_ones won't count a run of exactly $5$ heads right at the end. Therefore the number your code is estimating is actually $1.5$ :). Unsolicited advice.. if you find yourself writing a for loop that iterates over a numpy array, either there is a better way to do what you want to achieve or you shouldn't be using numpy. If you're interested in some faster/more compact implementations, I compared a few to your approach here. (@TylerW you may be interested too) — Izaak van Dongen, Jun 04 '25 at 15:57
I’m voting to close this question because Math.SE is not a coding site. — amWhy, Jun 21 '25 at 20:22

Christophe Boilley · Answer 1 · 2025-05-31T10:12:11.017

33

There is no triple count here, because you cannot have $A_i$ and $A_{i+2}$ without $A_{i+1}$ (where $A_i$ is the event that 5 consecutive heads start at toss $i$). So the answer is the expected number of windows of 5 consecutive heads (overlaps allowed) minus the expected number of overlaps, which is exactly the expected number of windows of 6 consecutive heads: $$\frac{96}{32}-\frac{95}{64}=\frac{97}{64}\approx 1.52$$
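
The two-term formula can be checked exactly on short sequences by brute force. The sketch below is not part of the original answer: the names `count_runs`, `exact_expectation` and `formula` are made up for illustration, and the generalisation from 100 tosses to $n$ tosses, $\frac{n-4}{32}-\frac{n-5}{64}$, is an assumption that follows the same argument.

```python
from fractions import Fraction
from itertools import groupby, product

def count_runs(tosses, min_len=5):
    """Count maximal runs of heads (1s) whose length is at least min_len."""
    return sum(1 for value, group in groupby(tosses)
               if value == 1 and len(list(group)) >= min_len)

def exact_expectation(n, min_len=5):
    """Average count_runs over all 2**n equally likely toss sequences."""
    total = sum(count_runs(seq, min_len) for seq in product((0, 1), repeat=n))
    return Fraction(total, 2 ** n)

def formula(n):
    """Expected windows of 5 heads minus expected windows of 6 heads."""
    return Fraction(n - 4, 32) - Fraction(n - 5, 64)

# Exhaustive comparison for every length from 5 to 12 tosses.
for n in range(5, 13):
    assert exact_expectation(n) == formula(n)

print(formula(100))  # 97/64
```

Enumerating all $2^{100}$ sequences is out of reach, so this only confirms the pattern for small $n$; the value for 100 tosses still rests on the linearity-of-expectation argument.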

edited May 31 '25 at 10:12

answered May 28 '25 at 08:14

Christophe Boilley

Just to comment, you have my upvote especially, because the question was not just what the right number is, but "why was this logic wrong" and you did that beautifully in just 14 words. – CR Drost May 29 '25 at 16:34
@CRDrost how did you count 14? I spent some time figuring it out but nothing came out. Help? – htmlcoderexe May 30 '25 at 08:36
@htmlcoderexe see the edit history. – Integreek May 30 '25 at 16:45
Great answer. Can you add the definition of $A_i$? – Taladris May 31 '25 at 09:16
Oh sorry the actual count was 15 and was based on taking “A-sub-i-plus-one” etc. as one single word – CR Drost Jun 01 '25 at 02:45

score 22 · Answer 2 · answered May 28 '25 at 08:44

The infinite sum seems like overkill to me.

A toss "starts a run" of five or more consecutive heads if it and the next four coins are heads, and the previous coin, if any, was tails (otherwise it is just part of a longer run).

So the very first toss has a $\frac{1}{2^5} = \frac{1}{32}$ chance of starting a run, the last four cannot start a run at all, and the remaining 95 have a $\frac{1}{2^6} = \frac{1}{64}$ chance of starting a run. Any run must start in exactly one place, so this covers all the possibilities. Total: 97/64 = 1.515625.
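
The "start of a run" idea translates directly into a counter, which can then be used in a simulation. This is only a sketch under stated assumptions: the names `count_run_starts` and `simulate`, the fixed seed and the number of trials are all illustrative choices, not taken from the question's code.

```python
import random

def count_run_starts(tosses, min_len=5):
    """Count tosses that start a run: heads here and on the next min_len - 1
    tosses, with the previous toss (if any) being tails."""
    starts = 0
    for i in range(len(tosses) - min_len + 1):
        previous_is_tail = (i == 0) or (tosses[i - 1] == 0)
        if tosses[i] == 1 and previous_is_tail and all(tosses[i:i + min_len]):
            starts += 1
    return starts

def simulate(trials=10000, n=100, seed=0):
    """Estimate the expected number of runs of at least 5 heads in n tosses."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        tosses = [1 if rng.random() < 0.5 else 0 for _ in range(n)]
        total += count_run_starts(tosses)
    return total / trials

estimate = simulate()
print(estimate)  # should land near 97/64 = 1.515625, up to sampling noise
```

Because the counter looks at whole windows of five tosses, a run of exactly five heads that ends on the very last toss is counted like any other run.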

answered May 28 '25 at 08:44

Toph

If the sequence contains 10 (or more) consecutive heads, then both the first and the sixth coin start a run of five but your method wouldn't count it for the sixth coin. This does make a difference but nowhere near enough to get to around 2 instead. – quarague May 29 '25 at 12:30
See the nice thing about the Original Asker including a computer program is that it perfectly encapsulates what OA meant by his definitions and on those definitions, the first and sixth will not each start his, "a run of at least 5," but it is regarded as one run of 10. – CR Drost May 29 '25 at 16:30

Especially Lime · Answer 3 · 2025-05-28T08:45:44.777

13

Here is another way to see that their "theoretical answer" is wrong (and Christophe Boilley's is correct, although verifying that our two expressions are exactly equal is not so easy).

What is the expected number of runs of exactly $k$? There are $101-k$ places this can start. In all but two of these, the probability of having a run of exactly $k$ starting there is $(\frac12)^{k+2}$, since we need $k$ heads with a tail either side. The two exceptions are when the run starts at the first coin or ends at the 100th coin, and these have probability $(\frac12)^{k+1}$. Thus the overall expected number of runs of exactly $k$ is $(103-k)\times(\frac12)^{k+2}$, for any $k\leq 99$. (For $k=100$ the probability is $(\frac12)^{100}$; what goes wrong with the above is that the "two exceptions" coincide.)

Thus the total expectation of the number of runs of at least $5$ is $$\frac{98}{2^7}+\frac{97}{2^8}+\cdots+\frac{5}{2^{100}}+\frac{4}{2^{101}}+\frac{1}{2^{100}}\approx 1.516.$$
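
The sum can be evaluated exactly rather than approximately. Here is a minimal sketch using Python's `fractions` module; the function name `expected_runs_exactly` is an assumption for illustration, and it simply encodes the per-$k$ expectation derived above, with $k=100$ handled as the special case.

```python
from fractions import Fraction

def expected_runs_exactly(k, n=100):
    """Expected number of maximal runs of exactly k heads in n tosses."""
    if k == n:
        # The whole sequence is heads; the two boundary cases coincide.
        return Fraction(1, 2 ** n)
    return Fraction(n + 3 - k, 2 ** (k + 2))

total = sum(expected_runs_exactly(k) for k in range(5, 101))
print(total)         # 97/64
print(float(total))  # 1.515625
```

The exact total is $\frac{97}{64}$, which matches the two-term expression $\frac{96}{32}-\frac{95}{64}$.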

edited May 28 '25 at 08:45

answered May 28 '25 at 08:37

Especially Lime

To be sure, the series does sum to $97/64$ (see https://www.wolframalpha.com/input?i=%2898%2F2%5E7+%2B+97%2F2%5E8+%2B+...+%2B+4%2F2%5E%28101%29%29+%2B+%281%2F2%5E%28100%29%29). To do this "by hand", see e.g. https://math.stackexchange.com/questions/894998/explanation-of-the-formulas-for-sums-sum-nrn-and-sum-n2-rn – ronno May 28 '25 at 18:21
@Especially Lime Is the expected number of runs of exactly $k$ not equal to $\frac{101-k}{2^k}$? – Abhay Agarwal Jun 09 '25 at 16:28
@AbhayAgarwal no, because you need to take into account not only the probability of a given set of $k$ coins being heads, but also that the coin(s) either side are tails. – Especially Lime Jun 10 '25 at 07:50
