This problem has been asked many times, but I could not find a completely satisfactory answer.

The statement is as follows: we have a standard 6-sided die. Each time we roll it, if we get a number $i \neq 6$ we win $i$ dollars; if we roll a 6, we lose all previous earnings. The question is to determine the optimal stopping rule and the expected payoff under the optimal strategy.

If we assume that there is an upper bound $M$ on the money that we can make, using backward induction we can derive $$ S[n] = \max\Big(n, \frac{1}{6}\sum_{i = 1}^{5}S[n + i]\Big), \qquad n < M, $$ where $S[n]$ is the expected payoff of the game, given that we follow the optimal strategy and that we have accumulated $n$ dollars, and where earnings are capped so that $S[n] = M$ for $n \geq M$. Using a short Python script, we observe that if we set, for instance, $M = 30$, it is optimal to stop if $n \geq 15$:

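import numpy as np

def cut(x, M):
    # Cap the accumulated earnings at the upper bound M.
    if x >= M: return M
    return x

def dice_game(M):
    # S[n] = expected payoff under the optimal strategy with n dollars accumulated.
    S = np.zeros(M + 1, dtype=float)
    S[M] = M
    for n in range(M - 1, -1, -1):
        for k in range(1, 6):
            S[n] += S[cut(n + k, M)]
        S[n] *= 1/6  # rolling a 6 contributes 0 to the continuation value
        S[n] = max(n, S[n])
    print(S)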

>>> dice_game(30)
[ 6.15373793  6.53152262  6.93267954  7.35859956  7.81046053  8.28916532
  8.79823077  9.33962108  9.91411966 10.52162637 11.16139403 11.85262346
 12.58796296 13.36111111 14.16666667 15.         16.         17.
 18.         19.         20.         21.         22.         23.
 24.         25.         26.         27.         28.         29.
 30.        ]
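As a sanity check that the threshold is not an artifact of the particular choice $M = 30$, here is a minimal sketch that repeats the backward induction for several caps and reports the smallest $n$ at which stopping is optimal. The helper names `solve` and `threshold` are my own, chosen for illustration, and the code uses plain Python lists rather than NumPy.

```python
def solve(M):
    """Backward induction with earnings capped at M; returns the list S[0..M]."""
    S = [0.0] * (M + 1)
    S[M] = float(M)
    for n in range(M - 1, -1, -1):
        # Continuation value: rolls 1..5 move to n + k (capped at M), a 6 pays 0.
        cont = sum(S[min(n + k, M)] for k in range(1, 6)) / 6
        S[n] = max(float(n), cont)
    return S

def threshold(M):
    """Smallest n for which stopping is optimal, i.e. S[n] == n."""
    S = solve(M)
    return next(n for n in range(M + 1) if S[n] == n)

if __name__ == "__main__":
    for M in (30, 50, 100):
        print(M, threshold(M), round(solve(M)[0], 8))
```

For these values of $M$ the reported threshold and $S[0]$ agree, because for $n \leq 14$ the recursion only ever looks at $S[15], \dots, S[19]$, which equal $15, \dots, 19$ as soon as the cap is large enough. This is only numerical evidence, not a proof.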

However, I'd like to prove it mathematically. In this answer, a proof is given, but I was wondering if there are different ways of proving it.

Another common answer is to argue as follows: if we have accumulated $n$ dollars, then after one more roll we expect to hold $\frac{5}{6}(n + 3) = \frac{5n}{6} + 2.5$ dollars, and thus we should stop whenever $n \geq \frac{5n}{6} + 2.5$, which happens if and only if $n \geq 15$. The problem I see with this reasoning is that $\frac{5n}{6} + 2.5$ only accounts for the expected holdings after a single additional roll; it does not include the expected future earnings from continuing to play. In other words, since $S[m] \geq m$ for every $m$ (stopping is always allowed), we'll always have that:

$$\frac{1}{6}\sum_{i = 1}^{5}S[n + i] \geq \frac{5n}{6} + 2.5.$$
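To see how large the gap in this inequality actually is, the following minimal sketch compares the continuation value with the one-step value in the capped model. The helper names `solve` and `gaps` are hypothetical, and the comparison is restricted to $n + 5 \leq M$, since closer to the cap the truncation itself pushes the continuation value down.

```python
def solve(M):
    """Backward induction with earnings capped at M; returns the list S[0..M]."""
    S = [0.0] * (M + 1)
    S[M] = float(M)
    for n in range(M - 1, -1, -1):
        cont = sum(S[min(n + k, M)] for k in range(1, 6)) / 6
        S[n] = max(float(n), cont)
    return S

def gaps(M):
    """For each n with n + 5 <= M, return (n, continuation value - one-step value)."""
    S = solve(M)
    out = []
    for n in range(0, M - 4):
        cont = sum(S[n + k] for k in range(1, 6)) / 6  # includes future play
        one_step = 5 * n / 6 + 2.5                     # ignores future play
        out.append((n, cont - one_step))
    return out

if __name__ == "__main__":
    for n, gap in gaps(30):
        print(n, round(gap, 6))
```

With $M = 30$ the difference is strictly positive for $n \leq 13$ and vanishes (up to rounding) for $14 \leq n \leq 25$, because from $n = 14$ onward every state reachable by one roll is already a stopping state.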

My questions are the following:

  1. Why is the direct argument "$n \geq \frac{5n}{6} + 2.5$" valid, given what I've stated above?

  2. Could someone give a formal proof of $$ S[n] = n \iff n \geq 15, $$ different from the one that can be found here (EDIT: there is nothing wrong with this proof, I just wanted to see alternative proofs, if possible)?

  • The linked proof looks optimal to me. What's wrong with it? You are correct regarding your criticism of the heuristic argument. You can't ignore the potential for future earnings. – lulu Nov 05 '23 at 19:14
  • @lulu There is nothing wrong! I just wanted to see alternative proofs. – user_12345 Nov 05 '23 at 19:19
  • The reason ignoring “continuation value” still gets the right answer is that past 15 any roll you could ever take will have negative incremental EV. There’s no way to build positive EV strategizing around a string of negative EV gambles (usually… cf optional stopping theorem). – spaceisdarkgreen Nov 06 '23 at 21:41