Randomized Pascal's triangle: What is the average of all the numbers?

Question

Let's build a variation of Pascal's triangle. We write $1$'s going down the sides, as usual. Then for each number inside the triangle, we flip a biased coin, with probability of Heads $p$.

Heads: write the sum of the two numbers above
Tails: write the absolute value of the difference between the two numbers above

Here is an example, with $p=\frac13$. Heads (sum) are in $\color{blue}{\text{blue}}$, Tails (difference) are in $\color{red}{\text{red}}$.

$$ 1\\ 1\quad 1\\ 1\quad \color{blue}{2}\quad 1\\ 1\quad \color{red}{1}\quad \color{red}{1}\quad 1\\ 1\quad \color{red}{0}\quad \color{blue}{2}\quad \color{red}{0}\quad 1\\ 1\quad \color{red}{1}\quad \color{red}{2}\quad \color{red}{2}\quad \color{red}{1}\quad 1\\ 1\quad \color{blue}{2}\quad \color{blue}{3}\quad \color{red}{0}\quad \color{red}{1}\quad \color{red}{0}\quad 1\\ 1\quad \color{red}{1}\quad \color{blue}{5}\quad \color{red}{3}\quad \color{blue}{1}\quad \color{red}{1}\quad \color{red}{1}\quad 1\\ \cdots $$

What is the expected average of all the numbers in the triangle, as a function of $p$?

That is,

What is $L_p=\lim\limits_{n\to\infty}\text{(expectation of the arithmetic mean of all numbers in the first $n$ rows)}$ as a function of $p$?

$p=1$ and $p=0$

When $p=1$, it's just the normal Pascal's triangle ("NPT"). Therefore $L_1=\infty$.

When $p=0$, each number in the coin-based Pascal's triangle ("CPT") is $1$ just if the corresponding number in the NPT is odd, otherwise $0$ (this is because the sum and difference of two numbers have the same parity). The proportion of odd numbers in the NPT approaches $0$ as the number of rows approaches infinity. Therefore $L_0=0$.
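The two claims in the $p=0$ case can be checked numerically. The sketch below (function names are hypothetical, chosen for illustration) builds the all-differences triangle, compares it with the parity of the binomial coefficients, and prints the fraction of odd entries in the first $2^k$ rows of the NPT.

```python
from math import comb

def difference_triangle(n_rows):
    """Rows 0..n_rows-1 of the p = 0 triangle (every inner entry is |a - b|)."""
    rows = [[1]]
    for _ in range(n_rows - 1):
        prev = rows[-1]
        inner = [abs(a - b) for a, b in zip(prev, prev[1:])]
        rows.append([1] + inner + [1])
    return rows

def odd_fraction(n_rows):
    """Fraction of odd entries in the first n_rows rows of Pascal's triangle."""
    odd = sum(comb(n, k) % 2 for n in range(n_rows) for k in range(n + 1))
    return odd / (n_rows * (n_rows + 1) // 2)

rows = difference_triangle(64)
# Each entry of the p = 0 triangle should equal the parity of C(n, k).
matches = all(rows[n][k] == comb(n, k) % 2
              for n in range(64) for k in range(n + 1))
print("matches parity of Pascal's triangle:", matches)
for k in (4, 6, 8):
    print(2 ** k, "rows: odd fraction", odd_fraction(2 ** k))
```

The first $2^k$ rows contain $3^k$ odd entries out of $2^k(2^k+1)/2$, so the printed fractions shrink roughly like $2\cdot(3/4)^k$.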

$0<p<1$

I used Excel to simulate $500$ row triangles with $p=0.12$, $p=0.07$ and $p=0.01$.
I also include $p=0$ for comparison.

$A(n)=\text{arithmetic mean of all numbers in the first $n$ rows}$.
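For readers who want to reproduce this kind of experiment outside Excel, here is a minimal Python sketch of one random triangle and its running average $A(n)$. The function name `running_average` and the `seed` parameter are assumptions made for illustration, not part of the original spreadsheet.

```python
import random

def running_average(p, n_rows, seed=None):
    """Return [A(1), ..., A(n_rows)] for a single random triangle."""
    rng = random.Random(seed)
    row, total, count, averages = [1], 0, 0, []
    for _ in range(n_rows):
        total += sum(row)
        count += len(row)
        averages.append(total / count)
        # Heads (probability p): sum; Tails: absolute difference.
        inner = [a + b if rng.random() < p else abs(a - b)
                 for a, b in zip(row, row[1:])]
        row = [1] + inner + [1]
    return averages

print(running_average(0.07, 200, seed=1)[-1])
```

A single run gives one sample of $A(n)$; estimating the expectation in $L_p$ would require averaging over many independent triangles.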

Perhaps there is a critical value of $p$, above which $L_p=\infty$, and below which $L_p<\infty$.

Or perhaps $L_p=\infty$ for all $p>0$, but for very small positive values of $p$, the triangle takes a while to "get going" towards infinity.

Context

This question was inspired by a question about a variation of Pascal's triangle in which half the time we write $0$ instead of adding the two numbers above.

Update

This question is now posted on MO.

I assume there is only one possible triangle for $p=0$ and $p=1$, but an infinite number of triangles are possible for any non-zero $p$. And one of those infinitely many triangles for a fixed non-zero $p$ is the same as the triangle that you got with $p=1$ (with $L = \infty$). So, are you talking about an average of averages (or some kind of integral of the PDF) when you talk about non-zero $p$? — Srini, Sep 19 '24 at 21:15
I simulated both $p=0.12$ and $p=0.07$ with $10$ triangles each and $20,000$ rows in each triangle. Two conclusions: For $p=0.12$, it does blow up; for $p=0.07$, the average reaches around (but slightly more than) $1.5$. I tried analyzing it analytically, but couldn't due to $2$ unpleasant entities (absolute value in the subtraction operation and the fact that the sides are not random). I am trying a heuristic approach to answer just the question: will the average NOT blow up if $p$ is less than some threshold? I will post it if I make any progress and you see any value in it. — Srini, Sep 21 '24 at 00:10
@Srini Yes, answering that sub-question would be very useful. Regarding the side $1$'s, they can be considered to be random also, but no matter if they have Heads or Tails, they are always $1$ (because beyond the sides, there are just $0$'s). — Dan, Sep 21 '24 at 00:13

Einar Rødland · Answer 1 · 2024-09-22T10:25:45.170

See update at end for extended simulations, and how these indicate that the average may diverge for all $p>0$. I see this is also the conclusion that Srini is leaning towards.

To fix a notation, let $X_{i,j}$ correspond to $\binom{i+j}{i}$ in the sense that $X_{i,0}=X_{0,j}=1$ and $$ X_{i,j}= \begin{cases} X_{i-1,j}+X_{i,j-1} & \text{with probability $p$}\\ |X_{i-1,j}-X_{i,j-1}| & \text{with probability $1-p$} \end{cases} $$ for $i,j\ge1$, and where row $n$ of Pascal's triangle consists of $X_{i,j}$ for $i+j=n$.

First, we may note that the two alternatives are identical modulo 2: ie, which $X_{i,j}$ are odd or even is not random.

Next, if $p>1/2$, the expected value of $X_{i,j}$ is at least $p\cdot(X_{i-1,j}+X_{i,j-1})$, and so $$ E\left[\sum_{i+j=n} X_{i,j}\right] \ge 2p\cdot E\left[\sum_{i+j=n-1} X_{i,j}\right] $$ which makes the expected average increase exponentially. I'm sure this could be extended to $p=1/2$ and maybe below by taking into account the variability of the values.
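The step from the conditional expectation to the row sums can be spelled out as follows (a sketch; $S_n$ is shorthand introduced here for the row sum $\sum_{i+j=n}X_{i,j}$). Given the two entries above, $a=X_{i-1,j}$ and $b=X_{i,j-1}$,

$$ E[X_{i,j}\mid a,b] = p\,(a+b)+(1-p)\,|a-b| \ge p\,(a+b). $$

Summing over the interior entries of row $n\ge2$, every interior entry of row $n-1$ appears twice and each of its two side $1$'s appears once, while row $n$ itself contributes two side $1$'s:

$$ E[S_n] \ge 2 + p\,\bigl(2E[S_{n-1}]-2\bigr) = 2p\,E[S_{n-1}] + 2(1-p) \ge 2p\,E[S_{n-1}]. $$

Since $S_1=2$, this gives $E[S_n]\ge 2(2p)^{n-1}$. Row $n$ has only $n+1$ entries, so the expected row average is at least $2(2p)^{n-1}/(n+1)$, which diverges when $p>1/2$.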

My next thought was to analyse $X_{i,j}$ for fixed $i$. For example, $X_{1,j}$ is a sequence that randomly increases or decreases by 1, and so in the long run there will be a distribution of values: ie it will converge to a probability distribution if $p<1/2$. This approach could be extended to $(X_{1,j},\ldots,X_{r,j})$, and we could ask for which $p$ this will converge towards an equilibrium distribution. However, before getting properly started on this analysis, I made some interesting discoveries.
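As an illustration of the simplest case, the sketch below simulates $X_{1,j}$ on its own: the neighbour on the side is always $1$, so the value moves up by $1$ with probability $p$ and otherwise becomes $|x-1|$. The comparison value $1/(2(1-2p))$ is not from the analysis above; it is what a detailed-balance calculation for this reflecting walk appears to give for the long-run mean when $p<1/2$, and should be treated as an assumption. Since the parity of $X_{1,j}$ alternates deterministically, it is the time average that is compared here.

```python
import random

def first_diagonal_mean(p, steps, seed=0):
    """Time average of X_{1,j} for j = 1..steps, starting from X_{1,0} = 1."""
    rng = random.Random(seed)
    x, total = 1, 0
    for _ in range(steps):
        # The other parent is the side entry X_{0,j} = 1.
        x = x + 1 if rng.random() < p else abs(x - 1)
        total += x
    return total / steps

p = 0.2
print("simulated:", first_diagonal_mean(p, 200_000))
print("conjectured stationary mean:", 1 / (2 * (1 - 2 * p)))
```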

When making plots of $X_{i,j}$ for fixed $i$, I saw that when $p$ was low, ie well below $1/2$, the average was dominated by large spikes, while most of the $X_{i,j}$ would tend to remain small. Eg here is a segment $X_{14,5000..6000}$ from a simulation with $p=0.2$:

Spikes can arise by chance through a pattern of heads (when values are added), and then these spikes tend to carry on to subsequent values. This can be seen in a heatmap for $p=0.11$ where values have been transformed by $\log_2(X+1)$ for better illustration:

So, what drives the averages is how frequently these spikes arise, and how long they continue, which depends on $p$. Here is a case with $p=0.092$. The average value is just $9.3$, but this is primarily due to a small portion of values that are much larger. Note that the scale of the heatmap varies from map to map. Due to the logarithmic scale, a difference of $10$ on the scale corresponds to a factor of appr $1000$.

Since the spikes are highly sporadic, you will have to run the simulation for a while to see the long-term behaviour, particularly for small $p$. Here is a case for $p=0.095$ for which the average value is $1358.4$, and it seems like the average is set to increase indefinitely as it keeps running:

Here is a case of $p=0.10$ where the increase is even clearer:

I zoomed in on a $120\times60$ region from a $20000\times10000$ simulation with $p=0.09$ which shows one out of two major blow-up points in that simulation. Note that the colour scale here shows the actual values, ie not log-transformed as in the previous maps.

It looks as if when a certain pattern arises, a blow-up is almost inevitable, and once that happens it will spread out and thus eventually dominate the triangle. If that is true, the effect of $p$ is just to influence how long it takes for that pattern to occur, but no matter how rare it is, it will eventually take over, and so the average value will eventually diverge (with probability 1).

One way to think about this is that, if at some point there is a spike value, say of size $A$, with other values being mostly small, this starts what is basically a new random triangle but with values $A$ times as large. Then, you can get another random spike within this with an even greater value, and so on. However, it does seem like it takes a while for the initial spikes to appear, but then subsequent values tend to grow more quickly.

Update

I coded it in C to be able to do longer and bigger simulations, and the mean seems to start diverging even as $p$ gets below $0.09$. Here are a few results:

100000 x 100000 Pascal with p = 0.085000
Final mean = 21397033537218698805248.000000
Total mean = 851727491377168532897792.000000
250000 x 1000000 Pascal with p = 0.080000
Final mean = 29375911547.915989
Total mean = 79375851.833287
1000000 x 1000000 Pascal with p = 0.080000
Final mean = 62104889838672863782624815384565539227294683665492982388038245023744.000000
Total mean = 33037472267506384764204401071820950891726503895517883310943854657536.000000

Beware that, since spikes are now quite infrequent, the simulations have to run for a long time. The total mean is across all $N\times M$ values while the final mean is just the final column.
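The C program itself is not shown, so the following Python sketch is only a guess at how the two statistics might be computed while keeping a single line of the array in memory. The function name, the indexing ($0\le i\le n$, $0\le j\le m$) and the choice of which line counts as "final" are all assumptions.

```python
import random

def total_and_final_mean(p, n, m, seed=None):
    """Simulate X[i][j] for 0 <= i <= n, 0 <= j <= m, one line at a time.

    Returns (mean over all values, mean over the last line generated).
    """
    rng = random.Random(seed)
    line = [1] * (m + 1)              # X[0][j] = 1 along the side
    total = sum(line)
    for _ in range(n):
        x, new = 0, []                # 0 outside the triangle, so X[i][0] = 1
        for y in line:
            x = x + y if rng.random() < p else abs(x - y)
            new.append(x)
        line = new
        total += sum(line)
    count = (n + 1) * (m + 1)
    return total / count, sum(line) / (m + 1)

print(total_and_final_mean(0.08, 200, 200, seed=1))
```

Python integers do not overflow, but the final division converts to a float, so for the extreme magnitudes quoted above one would want to report, say, the base-10 logarithm instead.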

I am increasingly inclined to believe that the average will eventually start diverging whenever $p>0$, that it is just a matter of how long it takes for spikes to appear.

The simulations were written in Python and run in Jupyter with plots made using the matplotlib.pyplot package. Here is the core code:

import random
from math import log
from matplotlib import pyplot as plt

class Pascal(list):
    "Generates random array [0..d][0..n]"
    def __init__(self, p, n=1000, d=1):
        self.p = p
        self.n = n+1
        self.append([1]*self.n)  # Side of the triangle: all 1's
        self.increase(d)

    def incdim(self):
        "Appends the next diagonal, computed from the previous one"
        XX = self[-1]
        X, x = [], 0  # Start from the 0 outside the triangle, so X[0] is always 1
        for i in range(self.n):
            y = XX[i]
            x = x+y if random.random()<self.p else abs(x-y)
            X.append(x)
        self.append(X)

    def increase(self, d):
        "Extends the array until it holds diagonals 0..d"
        for _ in range(d+1-len(self)):
            self.incdim()

# Plot single diagonal
P = Pascal(0.2, 10000, 100)
X = P[14]
trans = lambda x: x  # Untransformed values
#trans = lambda x: log(1+x)/log(2)  # Log-transformed values
plt.figure(figsize=(15,5))
plt.plot([trans(_) for _ in X[5000:6000]])  # Segment of X selected

# Heatmap
R, N = 2, 5000  # Size R*N x N
off = 0  # Trim the first off columns/rows (keep ratio)
Q = Pascal(0.095, R*N, N)
D = [[log(1+x)/log(2) for x in X[off:-1-(R-1)*off]] for X in Q[off:]]
plt.figure(figsize=(20,8))
plt.imshow(D, cmap='hot', interpolation='nearest')  # Heatmap
plt.colorbar()  # Adds a legend to show intensity
If run stand-alone, ie not in Jupyter, you probably have to add plt.show() after each plot to display them.

Very interesting. "It looks as if when a certain pattern arises, a blow-up is almost inevitable, and once that happens it will spread out and thus eventually dominate the triangle." If that certain pattern arises, and then we change $p$ to exactly $0$ from then on, will there still be growth towards infinity? — Dan, Sep 23 '24 at 04:37
@Dan: The only way to get a larger value is through addition, so if $p=0$ the maximal value may spread, but there will never be any larger values. But it seems that, once you have a spike value that is large enough, it may spread out a bit without differences with smaller values being able to reduce it too much, and eventually there will be an addition that produces a larger value. So in the simulations, I see a lot of spikes arising and then fizzling out, but after a while there is a spike that happens to get heads in just the right places, and then blows up. — Einar Rødland, Sep 23 '24 at 07:08
Reminds me of the origin of life on planet earth. Very primitive life forms for billions of years, then an explosion of life. — Dan, Sep 23 '24 at 07:21
If $p$ is very small (e.g. $10^{-100!}$), perhaps the blow-ups will be so rare that, when they do occur, they will be diluted by the vast seas of $0$'s. — Dan, Sep 23 '24 at 09:10
@Dan: Even a single spike, if it spreads in both directions, will tend to span out into a Sierpinski-like triangle. Although the triangle created eventually spans most of the triangle, all except a fixed region on both sides, it still has an average density of zero within. However, it provides an ever increasing number of neighbouring values that may be added to produce ever greater values. I suspect that is why, once an initial spike starts to grow, it quickly blows up. — Einar Rødland, Sep 23 '24 at 17:07

Srini · Answer 2 · 2024-10-07T17:55:46.537

Sorry I had to delete my previous answer due to some legitimate flaws identified in it by Einar.

This time, I am just posting a few thoughts to analyze it slightly differently. I haven't completed my analysis fully and won't likely be able to post a confident stance on this before the bounty deadline. Nevertheless, this is an interesting enough topic to explore independent of such worldly non-mathematical constraints :)

Firstly, instead of filling the triangle row by row, I started filling the triangle from the sides, along what I call SL (Slant Lines SL-$1,2,3,...$). Refer to the picture below.

By symmetry arguments, we can analyze just one half of the triangle. Here I am looking at only left half.
Red line is all $1$s and can be populated with no other inputs.
Points on Blue line can be populated from top to bottom using red line as the input and the previous value of the blue point. The very first point needed is called Seed. The seed is derived from contributions from red line and some part of the right half. Let it be random for our purposes.
Points on Green line can be populated from top to bottom using Blue line as the input and the previous value of the Green point. The Green seed is derived from contributions from Blue line and some part of the right half. Let it be random for our purposes. Similarly the pink line.

Now, we can model each line as a feedback system as shown below:

In the following analysis, $p$ is considered very small (i.e. dominated by the absolute difference operation, with some occasional sum operations). For example, with $p=0$, the Blue line is an alternating pattern of $0$s and $1$s. The Green line is another alternating pattern, but with more consecutive $0$s and $1$s. I post a few sample scenarios below in this feedback system. In the above paragraphs, where I mention "blah line can be used as the input", that is shown as $X_{k}$. The seed is shown in green and the output $Y_{k}$ is shown in blue if it is an absolute difference operation and red if it is a sum operation.

As can be seen in a few sample scenarios, the absolute difference operation can either sustain the Expectation of the input or it can make it grow (increase the expectation) or it can kill it (reduce the expectation), based on the alignment and value of the seed. I show them as sustain, growth and death. These are far from exhaustive, but we already see all the possibilities.

$2$ interesting things to see are:

When an occasional Sum operation is introduced, it disrupts the natural course of the absolute difference pattern (be it sustain, growth or death), and it can reverse the course of the pattern depending on where the effect ends. But during the window of its effect, it does seem to increase the expectation (because no negative values are allowed).
The seed is another thing that decides the course in the absence (or minimal presence) of red spikes, and it is contributed by an upward Rhombus of points starting from the seed point.

With this type of analysis, concentrating on slant rows and feedback modeling, it seems natural that the expectation will grow as we go from the outer slant lines towards the inner ones, IF we frequently introduce red spikes. What needs to be seen is whether, when the red spikes are very infrequent and the natural sustain/growth/death pattern dominates, other seed values ($\ne0, \ne1$, which haven't been analyzed but are more common along the center line) naturally tend to increase the expectation if they are considered random. If the answer to that is yes, all triangles with $p \ne 0$ should blow up. I will post an update if I have an answer to that.

Update $\mathbf{1}$: larger initial seed values are indistinguishable from red spikes in terms of their effect, in the sense that they have some transient effect of increasing expectation, but importantly they are just one time effects and don't matter in the grand scheme. Currently, I don't believe in my earlier hypothesis (based on incorrect modeling) that the blow up is guaranteed. I will post more updates.

Update $\mathbf{2}$: I ran $10$ iterations with $8000$ rows and $p=0.1$ until first $2500$ rows, but $p=0$ for the rest until row $8000$. At least in these $10$ trials for $p=0.1 \to 0$, it is suggesting that we need sustained Sum operations to keep the fire going. The average reached at row $2500$ always drops when it reaches row $8000$.

However, things get interesting when I try the same thing with $p=0.12$ with $2500$ rows and $p=0$ for the rest until $8000$ rows. In some cases, the blowup increases and in some cases, the blow up decreases. Roughly even split of these cases. But note that we are interested in average of averages.
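An experiment of this kind, with $p$ switched to $0$ after a given row, could be sketched as below. The names `row_stats` and `p_of_row` are hypothetical, and the row counts are scaled down so the example runs quickly; it is not the Perl script used for the results above.

```python
import random

def row_stats(p_of_row, n_rows, seed=None):
    """Yield (row index, row maximum, running average) for a row-dependent p."""
    rng = random.Random(seed)
    row, total, count = [1], 0, 0
    for n in range(n_rows):
        total += sum(row)
        count += len(row)
        yield n, max(row), total / count
        p = p_of_row(n + 1)           # coin bias used to build the next row
        inner = [a + b if rng.random() < p else abs(a - b)
                 for a, b in zip(row, row[1:])]
        row = [1] + inner + [1]

switch = 250                          # rows from this index on use p = 0
stats = list(row_stats(lambda n: 0.1 if n < switch else 0.0, 1000, seed=1))
maxes = [m for _, m, _ in stats]
print("average at switch:", stats[switch - 1][2], "at end:", stats[-1][2])
print("row maximum at switch:", maxes[switch - 1], "at end:", maxes[-1])
```

Once $p=0$, each new entry is at most the larger of its two parents, so the row maximum can never increase; whether the running average rises or falls afterwards depends on how the existing large values spread.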

Update $\mathbf{3}$: I got rid of the seed business altogether by switching from left-half analysis to both-halves analysis, as shown in the image below. As you can see, with this approach, we always start with a seed of $1$. If we want to push this line of analysis even further, we can consider the averages of the rhombus formed with the reflected grid of the triangle points WLOG. That way, each SL is of the same length, and the process of moving from the red line to the pink line and further is just a matter of starting with the red line of all $1$s, restarting with a seed of $1$, and repeating.

That's all on that front. No revelation, except that the feedback system became simpler.

$\color{red}{\text{Next}}$, a change of gears to another type of analysis. I wanted to see what interesting things are revealed if I switch from a uniform RV with $p = \frac1v$ to introducing a sum operation deterministically once in every $v$ dots (along horizontal rows still, my simulation is not ready for slant rows). There are of course $v$ phases of implementing this train.

Interesting things to note:

With random $p$, we expect blow up from $p = \frac1v = \frac11$ to $\frac{1}{12}$. We expect no blow up from $v = 13$ onwards.
With deterministic $v$ method, $v=9,10,11$ don't blow up, but $v=12$ blows up for one particular phase. Note that my blowup threshold in simulation was $20,000$ for the triangle average (nothing to do with the choice of $20,000$ rows, just coincidental).
I initially suspected a density of introducing the sum operation would be the only factor, but it is clear from the trial with $v=12$, it is not the case.
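To make the deterministic variant concrete, here is one possible reading of it as a Python sketch: a single running counter over all interior dots, in row order, with a sum whenever the counter is congruent to the phase modulo $v$. Whether the counter restarts on each row in the simulation above is not stated, so this counting convention (and the function names) should be taken as an assumption.

```python
def deterministic_triangle(v, phase, n_rows):
    """Triangle with a sum at every v-th interior dot and differences elsewhere."""
    rows, counter = [[1]], 0
    for _ in range(n_rows - 1):
        prev, inner = rows[-1], []
        for a, b in zip(prev, prev[1:]):
            # Assumed convention: one counter running across all interior dots.
            inner.append(a + b if counter % v == phase else abs(a - b))
            counter += 1
        rows.append([1] + inner + [1])
    return rows

def triangle_average(rows):
    """Arithmetic mean of all entries in the given rows."""
    return sum(map(sum, rows)) / sum(map(len, rows))

for phase in range(3):
    print(phase, triangle_average(deterministic_triangle(9, phase, 300)))
```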

Then, I wanted to take the very first sort-of-unexpected result, i.e. $v=9$, and see what happens if instead of introducing $1$ sum operation in exactly $v$ dots, I introduced $2$ sum operations in $2v$ dots. Now, for $v=9$, for each phase ($0$ to $8$), we have the gap between the first and the second dot in a window of $18$ dots as another variable that is between $1$ and $9$. Of course, a gap of $9$ between $2$ dots in a window of $18$ dots corresponds to the earlier simulation of $1$ sum operation in $9$ dots. So with the phase and gap combinations, we have $9 \cdot 9 = 81$ combinations. The following picture shows that.

Interesting things to note:

Gap = $9$ is as seen before not blowing up (which is what led me to the gap analysis).
BUT, gap = $1$ for $v=9$ is also somehow good. This means two consecutive sum operations followed by $16$ difference operations in a window of $18$ dots to make a particular pattern of $p = \frac19$.

OK, hope you are still with me. Next I took $v=13$, which from random simulations shows no blow-up AND from the deterministic simulation with $1$ sum every $13$ dots ALSO shows no blow-up. I repeated the $2$ sums in $26$ dots approach as I did before. Interestingly, there is a blow-up in some cases. I am not pasting the entire $13 \cdot 13 = 169$ results.

Update $\mathbf{4}$: Envelope and Monotonicity

I simulated $200000$ rows for $v=13$ (both for random and one of the deterministic scenarios) to establish any monotonicity patterns in the average creep. It is illuminating to know that zooming out is helpful. The triangle average row by row is very noisy. I used a window of $500$ rows to detect the envelope on this noisy data. I think this gives enough resolution and enough smoothness. Of course one can do another envelope of envelopes to make it even smoother, but that will lose details.
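One simple way to compute such an envelope is to take the maximum over consecutive, non-overlapping windows. The sketch below assumes that definition; the exact method used for the plots is not stated, so the helper `envelope` is illustrative only.

```python
def envelope(values, window):
    """Maximum of each consecutive, non-overlapping window of the given length."""
    return [max(values[k:k + window]) for k in range(0, len(values), window)]

def is_non_decreasing(values):
    """True if the sequence never drops from one point to the next."""
    return all(a <= b for a, b in zip(values, values[1:]))

noisy = [1, 3, 2, 5, 4, 0, 6, 2]
print(envelope(noisy, 2))
print(is_non_decreasing(envelope(noisy, 2)))
```

Applied to the row-by-row triangle average with `window=500`, a check like `is_non_decreasing` gives a crude test for a monotone trend in the envelope.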

Interesting things to note:

The overall shapes for random and deterministic (at least for the one case I studied) are not too different.
There is an overall continuous upward creeping trend, but the rate is definitely much slower after about $50000$ rows. I can't tell for certain that it is going to blow up, but it seems like it. Each day I seem to be flip-flopping on which way I lean, but I lean towards blowup with this extra simulation data :)

Update $\mathbf{5}$: Concluding remarks

I ran a(nother) $2$ million rows simulation for $v=13$ (random), which hasn't completed (may take a couple more days). But I got data for ~$800,000$ rows. The simulation result is as shown below:

Interesting things to note:

Even though there is a sharp average increasing event after $700,000$ rows, it is exaggerated due to the range of my y-axis. If we zoom out sufficiently, we are actually looking at gradual growths. So, in my mind, the explosive blow-up scenarios similar to Dan's analogy of evolution is not strongly seen. The following is my hypothesis (not proof) to explain the empirical evidence of gradual growth.
Going back to my feedback model, we can view it as an accumulator of average patterns (sustaining, growth and death). If the accumulator sees more growth patterns than death patterns, it will lead to increase. Sustaining patterns are neutral.
By definition, Sum events are average increasing patterns. I already demonstrated with some examples that Absolute Difference patterns can be sustaining, growth or death patterns depending on alignment.
The key thing to note is that the Sum operation has a second effect: besides increasing the average, it also disrupts/reverses patterns. By random/symmetry arguments, I argue that occasional Sum operations will introduce equal amounts of growth and death patterns due to frequent pattern reversals. By that, the net effect is equivalent to a sustaining pattern.
But given that Sum always increases the average, the so called net sustaining pattern is on the current average (and not initial average) which is being nudged up, however slowly. This explains why we see constant slow growth.
Hence I conclude that there will be slow creep up no matter how small $p$ is. It is a matter of scale and if we were to zoom out to a scale of billions or trillions of rows, we will still observe what I observed in $3$ different zoom scales ($20,000$, $200,000$ and $2,000,000$).
What I need to justify is why there are certain $v$s beyond which the accumulation is happening at an accelerated pace. But that does not preclude my conclusion that no matter how small $p$ is, we can justify an increase in average.

Update $\mathbf{6}$: Postscript

The $2$ million rows simulation finished. I don't know when it finished, as my poor Perl script was stretched beyond its original intent (a few thousand rows). Something really interesting happened around $1.65$ million rows. It suddenly ramped up towards what seemed like a definite blowup. Then it started ramping down at around $1.77$ million rows. So there is something happening beyond what my hypothesis can explain. It is quite possible that the random number generator was highly skewed at that point, creating dense Sum events. At this point, the only path left in my approach is to come up with a theory for accelerated growth for dense/frequent Sum events. I don't have such a theory.

I would be very interested in evidence or proof that the blow up is not guaranteed. That would imply that there exists a mysterious critical $p$-value. (I received your chat invite, but unfortunately I am blocked by a firewall from entering chat rooms.) — Dan, Sep 25 '24 at 01:10
Hi @Dan, I am slowly trying to get there. I posted another update. I am going to try a few more of my similar thought experiments in my simulation. Ideally, with the chat, I was hoping that we could discuss a few ideas and systematically attack this, without the constant fear of the big brother bot. Don't worry I won't clutter up the discussion. I will try to publish whatever I think is interesting in the body of my answer. — Srini, Sep 25 '24 at 03:46
@Dan, I don't know enough math to see if a real uniform distribution with $p=\frac{1}{13}$ can be analyzed with the deterministic method that showed blow up. Obviously there are infinitely many patterns and I am only talking about $169$ of them. Can't tell how the averaging works out, with all the crazy binomial combinations to consider for $3$ dots in $39$ dots, $4$ dots in $52$ dots etc. But just something to think about. — Srini, Sep 26 '24 at 21:42
With the deterministic model, taking $v=9$, phase $=0$ for example, average after $5000$ rows is $3.477$, average after $20000$ rows is $3.502$. How do we know whether this will eventually lead to blow-up or not? — Dan, Sep 26 '24 at 22:28
@Dan, Assuming fluctuating averages, I will look out for monotonicity in the envelope (as we go from $5000$ rows to $20000$ rows), suggesting a slow blow-up. That's within the capabilities of my simulation. Unfortunately, increasing rows significantly slows down the simulation, as it's currently in Perl, not C. — Srini, Sep 26 '24 at 22:50
OK. In all of your data, going from $5000$ to $20000$ rows, there is either blow-up, or a slight increase. I believe there is no instance of a decrease. Does this suggest that there will blow-up in all cases? — Dan, Sep 26 '24 at 22:58
@Dan, I will post monotonicity analysis with random and deterministic setups in the next day or so, let's see what it reveals. Plus I have a few more things to try out. Example, formula for deterministic row spike density to slant row density, a possible approximation for an analytical answer with some simple deterministic case etc. As I said earlier, I am very slowly trying to get to some conclusion, but this is a difficult problem after all. Hope MO guys crack it easily — Srini, Sep 26 '24 at 23:06
@Dan, Quick feedback on monotonicity analysis (more data possibly tomorrow if it warrants): I need to simulate WAY more points. For the particular case that you asked about: For $v=9$, phase = $0$, had I sampled at row $3500$ or $4000$ (instead of $5000$) and row $20,000$, we would have seen a drop. The row by row average is very noisy. So I compared maximum values across consecutive windows of $100$, $500$, $1000$ etc. That is not noisy, but clearly not monotonic for either $v=9$ or $v=13$. They dip for thousands of rows and then come back up. With only $20000$ rows, there is insufficient data to see trends. — Srini, Sep 27 '24 at 04:00
@Dan, I started a $2$ Million rows run with $v=13$. Probably will take a few days to finish. Currently at $523,000$ rows and there is a steady but super slow creep up. Even if it finishes successfully maintaining this trend, this is just about what we can prove with simulations alone. We need one more indicator to prove a definite blow up, most likely some kind of analytical hypothesis to seal the conclusion. In search of that elusive supporting theory... — Srini, Sep 28 '24 at 19:43
@Dan, I wasn't sure if you get notified when answers are modified. So letting you know that I have posted my final conclusions. I might still tinker with this when I have time, but for now, this is it. — Srini, Sep 29 '24 at 22:08
Thanks for the update (I don't get notified when answers are modified, but I check for updates occasionally). Interesting data, and interpretation. — Dan, Sep 29 '24 at 22:42
@Dan, added the final chart for the sake of completeness. Mystery not solved :) — Srini, Oct 07 '24 at 17:57
Thank you for your efforts! Maybe some day someone will solve this mystery. — Dan, Oct 07 '24 at 21:07
