Expected value of random expressions

Question

I'm trying to solve this math puzzle: write numbers $1$ to $N$ in a row. Randomly insert $+$ or $\times$ between two adjacent numbers with equal probability. What is the expected value of the expression if the expression is evaluated as an ordinary arithmetic expression? For instance, $1 + 2 \times 3$ will be evaluated as $1 + (2\times3)$.

Initially I thought that it was a simple recursion, but then I realized that $\times$ would change the precedence of the entire expression and then got stuck. It's easy to find a solution for a small enough $N$ with code, but I'm curious how one can solve it with math.

Thanks,

Clarification requested. If $n = 3$, is $1 + 2 \times 3$ evaluated as $(1 + 2) \times 3$ or $1 + (2 \times 3)$? That is, does multiplication take precedence over addition, or is the order of operations strictly left to right? — user2661923, Jan 21 '22 at 06:32
Would Polish Notation or reverse polish notation be helpful? Then you don't need to worry about precedence — perpetuallyconfused, Jan 21 '22 at 06:38
@perpetuallyconfused regardless of which notation is used, the OP (i.e. the original poster) must first clarify what the problem composer's intent is. — user2661923, Jan 21 '22 at 06:39
Also, is it that you insert an operator between each adjacent pair? Or do you insert one operator and concatenate the numbers, eg if $n = 4$ we might have $12+34$ — perpetuallyconfused, Jan 21 '22 at 06:40
@user2661923 I'm not totally sure you need to worry about that. Both of those types of expressions can be rewritten as $1 \ldots N \{+,\times\}^{n-1}$ for some (equally likely) element in $\{+, \times\}^{n-1}$ (slightly abusing notation, my apologies) — perpetuallyconfused, Jan 21 '22 at 06:47
@user2661923 It will be evaluated just as an arithmetic expression. $1 + 2 \times 3$ will be evaluated as $1 + (2 \times 3)$. — user159566, Jan 21 '22 at 07:08
@user2661923 Yes, I did. I have yet to see any patterns, though. I'll plot the numbers to see if there's any pattern. — user159566, Jan 21 '22 at 07:13
Correction from previous comment, which I have deleted. For a problem like this, my first try would be to recognize that there are $n-1$ gaps, each of which will be either $+$ or $\times$. For each $n \in \{2,3,\cdots, 10\}$, I would brute force cycle through the $2^{n-1}$ possibilities, computing the expected value. Then, I would look for a pattern in the data, try to form a hypothesis around the perceived pattern, and then try to prove the hypothesis. — user2661923, Jan 21 '22 at 07:14
@perpetuallyconfused RPN looks a brilliant idea! I'll explore it. — user159566, Jan 21 '22 at 07:29
@user159566 awesome! My post answers a slightly different question than what you asked, but I still think RPN can help — perpetuallyconfused, Jan 21 '22 at 07:59

Chris Grossack · Accepted Answer · 2022-01-21T11:21:08.157

Welcome to MSE!

First, (as you mentioned) this is the kind of thing we can code up and just check for small $n$. Maybe we get lucky and hit something in the OEIS (as perpetuallyconfused also suggested).

I wrote up some sage code to do exactly this, which you can find here, and we find

$$ \begin{array}{|c|c|c|c|c|c|c|c|c|c|c|} \hline n & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\ \hline \mathbb{E}[\text{infix}] & 1 & 5/2 & 6 & 59/4 & 157/4 & 469/4 & 1599/4 & 6221/4 & 27359/4 & 33595 \\ \hline \mathbb{E}[\text{rpn}] & 1 & 5/2 & 6 & 63/4 & 189/4 & 327/2 & 2589/4 & 23133/8 & 115113/8 & 631083/8 \\ \hline \end{array} $$

Interestingly, it seems neither of these sequences (where we take numerators/denominators for fractions) is in the OEIS! You should absolutely submit them, and if you run the attached code for a bit longer you can probably get a pretty long sequence. Maybe through $n=25$ for both? Notice this shows that it does matter whether we use infix or RPN.
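The Sage code itself isn't reproduced here, but a minimal Python sketch of the brute-force computation for the infix row might look like the following. The names `evaluate` and `expected_infix` are illustrative and not taken from the linked code; the sketch only covers the infix case.

```python
from fractions import Fraction
from itertools import product


def evaluate(n, ops):
    """Evaluate 1 ops[0] 2 ops[1] ... n, with * binding tighter than +."""
    total = 0  # sum of the products that are already complete
    term = 1   # product currently being built; starts at the number 1
    for value, op in zip(range(2, n + 1), ops):
        if op == "*":
            term *= value   # extend the current product
        else:
            total += term   # a + closes the current product
            term = value
    return total + term


def expected_infix(n):
    """Average over all 2^(n-1) equally likely operator choices."""
    values = [evaluate(n, ops) for ops in product("+*", repeat=n - 1)]
    return Fraction(sum(values), len(values))


if __name__ == "__main__":
    print([str(expected_infix(n)) for n in range(1, 8)])
```

Because the work doubles with every extra number, this is only practical for small $n$; exact fractions are used so the results can be compared directly with the table.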

But you ask about tackling the infix sequence mathematically. You're on the right track, thinking about recurrences, but the location of parentheses makes things tricky to reason about. The key insight is to realize that a $+$ totally separates the things on either side of it. After all, any $\times$s next to the $+$ will take priority over the $+$, so will stay on their side. But $+$s next to the plus can be computed in whatever order we like!

The plan, then, will be to look for the rightmost $+$. As a concrete example, say we have

$$ 1 \ \square\ 2 \ \square \ 3 \ \square \ 4 \ \square \ 5 \quad \mathbf{+} \quad 6 \times 7 \times 8 \times 9 $$

where the $\square$s are allowed to be either $+$ or $\times$.

This naturally separates into a sum of two things: a smaller, self-similar case which we can handle by recursion, and the falling factorial $9 \times 8 \times 7 \times 6$, which we'll write as $9^{\underline{4}}$.

In general, let's write $X_n$ for the random variable where we put operations between $1, 2, \ldots, n$. Then $X_n$ either has no $+$s, in which case it's $n!$, or it has a rightmost $+$, say between $(n-k)$ and $(n-k+1)$. Then we see that

$$ \mathbb{E}[X_1] = 1 $$

$$ \mathbb{E}[X_n] = \mathbb{P}[\text{no $+$s}] \ n! \ + \sum_{k=1}^{n-1} \mathbb{P} \big [ \text{rightmost + falls in the $k$th slot from the right} \big ] \left ( \mathbb{E}[X_{n-k}] + n^{\underline{k}} \right ) $$

but we know

the probability of getting no $+$s is $2^{-(n-1)}$
the probability that the rightmost $+$ falls in the $k$th slot from the right is $2^{-k}$

Putting these together, we see that

$$ \mathbb{E}[X_n] = \frac{n!}{2^{n-1}} + \sum_{k=1}^{n-1} \frac{1}{2^k} \left ( \mathbb{E}[X_{n-k}] + n^{\underline{k}} \right ). $$
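As a sanity check, the recurrence can be evaluated exactly. This is a small Python sketch with illustrative function names; it should reproduce the infix row of the table above for small $n$.

```python
from fractions import Fraction
from math import factorial


def falling_factorial(n, k):
    """n * (n-1) * ... * (n-k+1), a product of k factors."""
    return factorial(n) // factorial(n - k)


def expected_values(n_max):
    """Return [E[X_1], ..., E[X_n_max]] as exact fractions."""
    e = [None, Fraction(1)]  # index 0 unused, so e[n] = E[X_n]
    for n in range(2, n_max + 1):
        # Case with no + at all: the expression is n!.
        value = Fraction(factorial(n), 2 ** (n - 1))
        # Rightmost + sits in the k-th slot from the right.
        for k in range(1, n):
            value += Fraction(1, 2 ** k) * (e[n - k] + falling_factorial(n, k))
        e.append(value)
    return e[1:]


if __name__ == "__main__":
    for n, value in enumerate(expected_values(10), start=1):
        print(n, value)
```

Unlike brute force, this needs only on the order of $n^2$ arithmetic operations, so going out to $n = 25$ or beyond is cheap.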

Next, we approximate

$$ \sum_{k=1}^{n-1} \frac{n^{\underline{k}}}{2^k} = n! \sum_{k=1}^{n-1} \frac{1}{2^k (n-k)!} \approx \frac{n!}{2^{n-2}} $$

This is because the $(n-k)!$ terms crush everything except when $n-k$ is small. If we just take the $k = n-1$ and $k=n-2$ terms, each contributes $\frac{n!}{2^{n-1}}$, so $\frac{n!}{2^{n-2}}$ is a good approximation to the whole sum. We're among friends, so let's say we're off by a constant factor $C$ (which you should think of as being $\approx 1$).
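For a rough idea of how big that constant factor is, one can reindex the sum with $j = n-k$ and compare it to the series for $e^2$. This is a side calculation under the assumption that $n$ is large, not something the rest of the argument relies on:

$$ n! \sum_{k=1}^{n-1} \frac{1}{2^k (n-k)!} = \frac{n!}{2^n} \sum_{j=1}^{n-1} \frac{2^j}{j!} \approx \frac{n!}{2^n} \left( e^2 - 1 \right) = \frac{e^2 - 1}{4} \cdot \frac{n!}{2^{n-2}} $$

So for this particular sum the factor tends to $\frac{e^2-1}{4} \approx 1.597$, which is a constant of order $1$ as claimed.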

$$ \mathbb{E}[X_n] = \left(C + \tfrac{1}{2}\right) \frac{n!}{2^{n-2}} + \sum_{k=1}^{n-1} \frac{\mathbb{E}[X_{n-k}]}{2^k} $$

where the $\tfrac{1}{2}$ comes from the $\frac{n!}{2^{n-1}}$ term. This is a recurrence that we actually have a shot at solving.

To do this, let's rewrite our sum as

$$ \mathbb{E}[X_n] = C \frac{n!}{2^{n-2}} + \frac{1}{2^n} \sum_{k=1}^{n-1} 2^k \mathbb{E}[X_k]. $$

(where we've also quietly absorbed the $\tfrac{1}{2}$ into the arbitrary constant)

Now, importantly, we guess. We know that we have an $n!$ term out front, and it seems reasonable that this would be asymptotically larger than the $\mathbb{E}[X_k]$s. So let's guess that, actually, $\mathbb{E}[X_n] \approx C \frac{n!}{2^{n-2}}$. Then notice, we can (sketchily) prove this by induction!

$$ \begin{align} \mathbb{E}[X_n] &= C \frac{n!}{2^{n-2}} + \frac{1}{2^n} \sum_{k=1}^{n-1} 2^k \mathbb{E}[X_k] \\ &= C \frac{n!}{2^{n-2}} + \frac{1}{2^n} \sum_{k=1}^{n-1} 2^k C \frac{k!}{2^{k-2}} \\ &= C \frac{n!}{2^{n-2}} + \frac{1}{2^n} \sum_{k=1}^{n-1} 2^2 C k! \\ &= \frac{C}{2^{n-2}} \sum_{k=1}^{n} k! \end{align} $$

But it's well known that $\sum_{k=1}^n k! = n! (1 + O(1/n))$, which (modulo a bunch of annoying details) gives us the claim.
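In case the factorial estimate isn't familiar, here is one short way to see it (valid for $n \geq 2$). Divide by $n!$, split off the two largest terms, and bound each remaining term by $\frac{(n-2)!}{n!}$:

$$ \frac{1}{n!} \sum_{k=1}^{n} k! = 1 + \frac{1}{n} + \sum_{k=1}^{n-2} \frac{k!}{n!} \leq 1 + \frac{1}{n} + (n-2) \cdot \frac{(n-2)!}{n!} = 1 + \frac{1}{n} + \frac{n-2}{n(n-1)} \leq 1 + \frac{2}{n} $$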

How good is this, I hear you asking?

Well, now that we know our answer looks like $C \frac{n!}{2^{n-2}}$ we can use sage to approximate $C$ numerically. Indeed, computing with the first $25$ terms sage guesses that

$$ \mathbb{E}[X_n] \approx 2.193 \frac{n!}{2^{n-2}} $$

and we can check how well this does. For instance, we can compute the true value of

$$\mathbb{E}[X_{15}] = \frac{1443523063}{4} = 360880765.75$$

The approximate formula above (using the full guess $C = 2.1930450175980556$) gives

$$C \frac{15!}{2^{15-2}} = 350071869.79$$
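Using the two numbers just quoted, the approximation at $n = 15$ comes out low by about $3\%$. To look at other values of $n$, here is a minimal Python sketch that recomputes $\mathbb{E}[X_n]$ exactly from the recurrence and prints the normalized ratio $\mathbb{E}[X_n] \cdot 2^{n-2} / n!$, which is the quantity $C$ is standing in for. The function names are illustrative, and this is not the Sage fit used above.

```python
from fractions import Fraction
from math import factorial


def expected_value(n):
    """E[X_n] from the recurrence, with the sum reindexed by j = m - k."""
    e = {1: Fraction(1)}
    for m in range(2, n + 1):
        tail = sum(
            # 2^-(m-j) * (E[X_j] + m!/j!), where m!/j! is the falling factorial
            Fraction(1, 2 ** (m - j)) * (e[j] + factorial(m) // factorial(j))
            for j in range(1, m)
        )
        e[m] = Fraction(factorial(m), 2 ** (m - 1)) + tail
    return e[n]


def normalized_ratio(n):
    """E[X_n] * 2^(n-2) / n!, as an exact fraction."""
    return expected_value(n) * Fraction(2) ** (n - 2) / factorial(n)


if __name__ == "__main__":
    for n in (5, 10, 15, 20, 25):
        print(n, float(normalized_ratio(n)))
```

If the ratio is still drifting at the largest $n$ you try, a single fitted constant will only ever be accurate near the range it was fit on.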

Of course, with more persistence, we can try to make all of the approximations used in this answer precise. This will be quite tricky, but has the added benefit of being "correct". In particular, keeping more careful track of lower order terms and constants will give a more precise answer, at the cost of being quite difficult.
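One possible first step in that direction, offered as a sketch rather than a proof: set $b_n = 2^n \, \mathbb{E}[X_n] / n!$ and multiply the exact recurrence by $2^n / n!$, reindexing the sums with $j = n-k$. This gives

$$ b_n = 2 + \sum_{j=1}^{n-1} \frac{2^j}{j!} + \frac{1}{n!} \sum_{j=1}^{n-1} j! \, b_j. $$

The first sum tends to $e^2 - 1$. If the $b_j$ stay bounded, the last term is $O(1/n)$, because $\sum_{j<n} j!$ is $O\big((n-1)!\big)$. That would suggest

$$ \mathbb{E}[X_n] \sim \frac{e^2 + 1}{4} \cdot \frac{n!}{2^{n-2}}, \qquad \frac{e^2+1}{4} \approx 2.097, $$

with a relative correction of order $1/n$. That correction would be consistent with a constant fitted over the first $25$ terms coming out somewhat above $2.097$. The boundedness assumption and the size of the error term still need to be checked carefully.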

That said, I'm not much of an analyst, so there are plenty of people who can probably do just that! With any luck, one of them answers too.

I hope this helps ^_^

perpetuallyconfused · Answer 2 · 2022-01-21T07:52:02.790

I'm pretty sure this is incorrect actually. I assumed that the precedence rules would play nicely with RPN, but forgot that you might need to shuffle the input numbers as well. However, I'm going to leave this up since 1) it answers a similar question and 2) might be fixable with some slightly more careful arguments.

Every expression $1 \square_1 2 \square_2 3 \cdots \square_{n-1} N$ where $\square_i \in \{+,\times\}$ with equal probability can be rewritten unambiguously as $N, \ldots, 1, 2, \diamond_1, \diamond_2, \ldots, \diamond_{n-1}$ with $\diamond_i \in \{+,\times\}$ using Reverse Polish Notation. This should help us with the thorny issue of precedence that you raise. Now, it should be a relatively simple recursion argument.

Let $S_i$ be the random variable corresponding to the value of the game played with $N = i$. Apologies in advance for mixing RPN and "standard" notation. Then $E(S_1) = 1, E(S_2) = \frac{3 + 2}{2} = \frac{5}{2}$. Then $S_3 = 3, (2 \cdot S_2), \diamond$, and so we have

\begin{align} E(S_3) & = E[\frac{1}{2}(3, (2 \cdot S_2), +) + \frac{1}{2}(3, (2 \cdot S_2), \times)] \\ & = \frac{1}{2} ([3, 5, +] + [3,5,\times]) \\ & = \frac{1}{2} (8 + 15) \\ & = \frac{23}{2}. \end{align}

Generally, then, we have the recursive formula \begin{align} E(S_N) & = E[\frac{1}{2}( (N,((N-1) \cdot S_{N-1}),+) + (N,((N-1) \cdot S_{N-1}),\times) )] \\ & = \frac{1}{2}\left[ (N, \left[ (N - 1) \cdot E(S_{N-1})\right], +) + (N, \left[ (N - 1) \cdot E(S_{N-1})\right], \times) \right] \end{align} with initial condition $E(S_1) = 1$.

Now, does this have a closed form solution? I'm not sure, and it's late here so I'll leave it at that for now. If you know the first few values of the sequence you could always try oeis.org
