Draw successively from a deck. Score = no. of black cards minus no. of red cards. Optimal strategy convergence?

Question

During an interview, I was asked the following problem:

We shuffle a deck of 52 cards (26 red and 26 black cards), and play the following game: starting with a score of $0$, we draw a card from the deck. If it is red, we decrement our score by $1$, and if it is black we increment it by $1$. We can decide to stop playing whenever we like, and our gain will be our score at that moment. What is the best strategy, and what is the associated expected gain?

With the help of the interviewer I managed to find most of the solution. Suppose we have already played for a while and observed how many red and black cards came up. We know there remain $b$ black cards and $r$ red cards, so that our current score is $(26-b) - (26-r) = r - b$. Should we continue playing or should we keep our gain? We should continue if and only if the expected additional gain from continuing is positive. Let $f(b, r)$ denote the expected additional gain, under optimal play, given that there remain $b$ black cards and $r$ red cards (so the expected final score is $r - b + f(b, r)$). With $f(0, 0) = 0$, it satisfies \begin{equation*} f(b, r) = \max\left\{0,\; \frac{b}{r+b}\cdot \left[1 + f(b-1, r)\right]+\frac{r}{r+b}\cdot \left[-1 + f(b, r-1)\right]\right\} \end{equation*} because stopping immediately adds nothing to our score, while if we choose to continue, there is a probability of $\frac{b}{b + r}$ of drawing a black card, which increases our score by 1 and leaves us with $b - 1$ black and $r$ red cards. This scenario leads to an expected additional gain of $1 + f(b - 1, r)$. Similarly, there is a probability of $\frac{r}{b + r}$ of drawing a red card, decreasing our score by 1 and leaving us with $b$ black and $r - 1$ red cards, resulting in an expected additional gain of $-1 + f(b, r - 1)$.

We easily see that $f(b=26, r=0) = 26$ and $f(b=0, r=26) = 0$, and we can compute any $f(b, r)$ in $O(b\times r)$ time complexity using memoization.
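As a rough sketch of that memoized computation, the Python below fills the table bottom-up. It assumes $f(b, r)$ is read as the expected additional gain from the current position (so stopping is worth $0$), which is the reading that matches the boundary values above. The function name `expected_gain` is made up for illustration.

```python
def expected_gain(b: int, r: int) -> float:
    """Expected additional gain under optimal play with b black, r red cards left.

    Assumption of this sketch: stopping is worth 0 extra, continuing is worth
    the probability-weighted value of the next draw.
    """
    # table[i][j] holds the value with i black and j red cards remaining.
    table = [[0.0] * (r + 1) for _ in range(b + 1)]
    for i in range(b + 1):
        for j in range(r + 1):
            if i + j == 0:
                continue  # empty deck: nothing left to gain
            cont = 0.0
            if i > 0:  # draw a black card: +1, then i - 1 black remain
                cont += i / (i + j) * (1 + table[i - 1][j])
            if j > 0:  # draw a red card: -1, then j - 1 red remain
                cont += j / (i + j) * (-1 + table[i][j - 1])
            table[i][j] = max(0.0, cont)  # stop (0) or continue (cont)
    return table[b][r]


if __name__ == "__main__":
    print(expected_gain(26, 26))
```

Under these assumptions, `expected_gain(26, 26)` comes out at roughly $2.62$, and the table takes $O(b \times r)$ time and memory to fill.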

But I wasn't able to answer the follow-up question: the original game asks us to compute $U_{26}:=f(26, 26)$. What happens to $U_n$ as $n\to +\infty$? According to ChatGPT, my problem is strongly related to a Brownian bridge, but I couldn't find a rigorous result. Could you help me?

In general, the expected value is always positive, since you can always play until the end of the deck to get 0. The number of ways to get an end result of zero is the Catalan number, $\frac1{27}\binom{52}{26},$ so the probability of being required to get $0$ is $\frac1{27}.$ That is the probability that you never have more black than red. https://en.wikipedia.org/wiki/Catalan_number — Thomas Andrews, Jan 29 '25 at 15:57

Mike Earnest · Answer 1 · 2025-01-31T18:15:25.833

I can prove that, asymptotically, $$ \sqrt{\frac n{2e}} \le U_n \le \frac{\sqrt{\pi n}}2. $$ That is, $0.4289\le U_n/\sqrt n\le 0.8862$ when $n$ is large enough. I computed $U_n$ for $n$ up to $500$ using your recursive formula, and based on the results, it appears $U_n$ grows like $C\sqrt n$, where $C\approx 0.5210$.

Lower bound

Consider the following strategy for the game with $n$ black cards and $n$ red cards. For a particular integer $s>0$, you stop only when your score reaches $+s$, or when the deck is exhausted. The expected value of this strategy is equal to $s$ multiplied by the probability that the score ever reaches $+s$. Using the reflection principle, you can show that the probability of the score ever reaching $+s$ is $$ \binom{2n}{n+s}\Big/\binom{2n}n =\exp\left(\frac{-s^2}{n}\right)+o(1)\qquad \text{as }n\to\infty. $$ In particular, for all $s>0$, we get a lower bound $U_n\ge s\binom{2n}{n+s}/\binom{2n}n$, because the optimal strategy is at least as good as these particular strategies. It turns out the best choice of $s$ is $\sqrt{n/2}$ (rounded to an integer). For this choice of $s$, we get $$ U_n \ge \sqrt{n/2}\cdot \left[ \exp\left( \frac{-(\sqrt{n/2})^2}{n}\right)+o(1) \right] \approx \sqrt{\frac n{2e}}. $$
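For a numerical feel, here is a small sketch that evaluates the exact value $s\binom{2n}{n+s}/\binom{2n}n$ of each threshold strategy and picks the best integer $s$ by brute force. The helper names `threshold_value` and `best_threshold` are illustrative only, and the sample values of $n$ are arbitrary.

```python
from math import comb, e, sqrt


def threshold_value(n: int, s: int) -> float:
    """Expected gain of 'stop at +s, otherwise play to the end' with n + n cards."""
    return s * comb(2 * n, n + s) / comb(2 * n, n)


def best_threshold(n: int) -> tuple[int, float]:
    """Brute-force search for the integer s in 1..n with the largest value."""
    best_s = max(range(1, n + 1), key=lambda s: threshold_value(n, s))
    return best_s, threshold_value(n, best_s)


if __name__ == "__main__":
    for n in (26, 500, 2000):
        s, value = best_threshold(n)
        # Compare the normalized lower bound with the limit 1/sqrt(2e).
        print(n, s, sqrt(n / 2), value / sqrt(n), 1 / sqrt(2 * e))
```

The printed columns let you compare the best integer threshold with $\sqrt{n/2}$, and the normalized bound with $1/\sqrt{2e}\approx 0.4289$.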

Upper bound

Imagine an omniscient player who knows exactly how the deck is shuffled. This player would wait until the score is at its maximum and stop there. Certainly, the value of the game is no better than the expected winnings of an omniscient player, so an upper bound for $U_n$ is the expected value of the maximum score of a shuffled deck. This expected maximum is well known to be asymptotically $\sqrt{\pi n}/2$. I wrote a proof in an earlier answer.
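As a sanity check rather than a proof, the omniscient player's expected winnings can be computed exactly for finite $n$. The sketch below combines the reflection-principle probability from the lower-bound argument, $P(\max \ge s) = \binom{2n}{n+s}/\binom{2n}n$, with the tail-sum formula $E[\max] = \sum_{s\ge 1} P(\max\ge s)$. The function name `expected_max_score` is hypothetical.

```python
from math import comb, pi, sqrt


def expected_max_score(n: int) -> float:
    """Exact expected maximum score over a shuffled deck of n black, n red cards."""
    total = comb(2 * n, n)
    # Tail-sum formula: E[max] = sum over s >= 1 of P(max >= s),
    # where P(max >= s) = C(2n, n + s) / C(2n, n) by the reflection principle.
    return sum(comb(2 * n, n + s) for s in range(1, n + 1)) / total


if __name__ == "__main__":
    for n in (26, 500):
        # Exact finite-n value next to the asymptotic expression sqrt(pi n)/2.
        print(n, expected_max_score(n), sqrt(pi * n) / 2)
```

For $n = 1$ this gives $1/2$ (the maximum is $1$ if the black card comes first and $0$ otherwise), which matches a direct enumeration of the two possible orders.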
