How to maximize the expected number of corrected guesses?

Question

A, B are to play heads or tails for $N$ rounds. They win a round if both guess correctly.

A and B are allowed to communicate their strategy before the game starts.
A knows the full sequence of $N$ results right after the game starts, before making the first guess.
A and B make their guesses simultaneously, and know each others' previous guesses, as well as the correct results of previous rounds.

How to design an algorithm that maximizes the expected number of correct guesses in this game? An obvious solution that's better than random guessing would be for A to spend the first $\lceil{N/2}\rceil$ rounds communicating the results of the last half of the game to B, giving an expected $N/2\times (1/2)^2+N/2=5N/8$ wins. Would there be better solutions?

I'd be very surprised if there is a better method. If they're not allowed to communicate directly then the only way they can communicate is via their 'guesses'. But, for a particular round, communicating via a guess and guaranteeing a success are mutually exclusive, so they can only guarantee success in at most half of the rounds. Not sure how to make that argument rigorous though - I guess it'd be something to do with not being able to encode the information of a binary string in a shorter binary string. — , Mar 14 '19 at 00:43
Related (not the same) https://math.stackexchange.com/questions/2866221/9-bits-game-a-brain-teaser-on-information-theory-or-cryptography — leonbloy, Mar 14 '19 at 19:53

antkam · Accepted Answer · 2019-03-16T14:06:58.270

Here is a provable improvement inspired by the answer by @leonbloy (which for a shorthand I will call the $LB$ strategy - hope you don't mind!) I haven't calculated the exact success rate but my guess is slightly over $70$%.

The way I understood the $LB$ strategy, the key idea is that $A$ knows what $B$ will say every timeslot (obviously except the initial timeslot), so $A$ already knows if $B$ will be right or wrong at timeslot $t$. If $B$ will be right, $A$ helps them score. If $B$ will be wrong, then $A$ might as well tell $B$ the next coin. This works for $B$ because $B$ can also tell what $A$ was thinking. This fits the standard concept of "if we're gonna be wrong, lets be 'maximally' wrong together" for this type of game.

My improvement is based on blocks of $3$ coins. In each block, there will be a majority, and that is what $A$ tells $B$. So:

Step $1: A$ tells $B$ the majority in the next block.
Step $2:$ Within each block, $B$ guesses the majority every time.
Step $3a:$ If all $3$ coins are the same, $A$ helps them score $3$ times. At the end of which, they are back to the state of knowledge at the beginning of the game, so go back to Step $1$ for the next coin.
Step $3b:$ If only $2$ of the $3$ coins are the same, $A$ helps them score those $2$ timeslots. For the remaining timeslots (the "bad" coin), $A$ knows $B$ will be wrong, so $A$ tells $B$ the majority of the next block. Then go to Step $2$.

The analysis is easier if we start from Step $2$:

In case of $3b$ (which happens with prob $3/4$), they score $2$ coins in a block of $3$.

In case of $3a$ (which happens with prob $1/4$), they score all $3$ coins in the block (say timeslots $T, T+1, T+2$), but has to spend the next timeslot ($T+3$) just for $A$ to tell $B$ the majority in the next block ($T+4, T+5, T+6$). There is a $1/4$ chance they got $T+3$ right by sheer luck. So among $[T, T+3]$, they score $3$ for sure and an additional $1/4$ by expectation, for a total expected value of ${13 \over 4} = 3.25$ out of $4$.

Since ${3.25 \over 4} > {2 \over 3}$, this is strictly better than the $LB$ strategy in the average case.

In fact it is also strictly better in the worst (adversarial) case. My worst case is $2/3$ for the sequence $THHTHHTHHT...$ while for $LB$ the worst case is $1/2$ for the sequence $THTHTHT...$

The exact time-average analysis is a bit messy: Because the two analysis cases $3a, 3b$ require different amounts of time ($4$ vs $3$ timeslots), I don't think I can simply say the time-average is ${3 \over 4} {2 \over 3} + {1 \over 4} {3.25 \over 4} = {1\over 2} + {13 \over 64} = {45 \over 64} = 0.703125.$ But it should be pretty close (and my guess: slightly higher).

In my head I can model this as a $5$-state Markov Chain, but I haven't gone to the trouble of actually solving it. My guess is that the time-average is a weighted average of the form $b {2 \over 3} + a {3.25 \over 4}$ where $a+b=1$, and they represent fraction of time spent in each case. Although $1/4$ of the cases are type $3a$, we actually spend $a > 1/4$ fraction of time there because each case $3a$ is really $4$ timeslots long - and this is why I'm guessing the correct exact answer $> 45/64$. I.e. instead of $a:b = 1 : 3$ (proportion of each case), we need some more rescaling to account for the different time lengths, e.g. $a:b = 1 \times 4 : 3 \times 3 = 4:9.$ For this guess (which is just a guess!) the time-average $\approx 0.712$.

This idea can also be generalized. E.g. if we use blocks-of-$5$, then in the best case we score ${5.25 \over 6}$ (prob $1/16$), in the second best case we score ${4 \over 5}$ (prob $5/16$), and in the last case we score ${3 \over 5}$ and have $2$ timeslots to talk - what a luxury! I have no idea how best to use so much "bandwidth"! :) If we don't use the second bad coin well, the time-average is dragged down by the ${3 \over 5}$ case, but I'd think there is a way to use it e.g. to give more info about the next block or even the next next block. I haven't yet figured out a way to make this better than the block-of-$3$ case.

Nice. I think the key of the improvement is that it takes advantage of $B$ knowing right vs bad guesses. — leonbloy, Mar 16 '19 at 14:58
@leonbloy - You mean B knows that A knows that B will be wrong? :) Yeah, but I got that idea from your Answer! — antkam, Mar 16 '19 at 15:47
A very wild (probably wrong) conjecture, using (abusing) Fano's inequality, gives me a bound $\approx 0.773$ . Not bad... — leonbloy, Mar 16 '19 at 15:59

leonbloy · Answer 2 · 2019-03-23T02:17:31.400

Inspired in antkam's answer, here's another idea to investigate.

Let's pick some binary error-correcting code $(n,k)$, not necessarily linear, with not too small $n$.

Proposal 1: pick $2^k$ random tuples as codewords, with $n/k \approx 4.5 $. For example, $n=41$, $k=9$.

Proposal 2: pick some BCH code with $ k \approx t$. For example, let us take a BCH $(255,45)$ code, which has $t=43$.

The strategy is: the sequence is divided in blocks of length $n$. In each block, we mark the $m$ 'miss bits' (those which were not correctly guessed). It $m\ge k$ we label the last $k$ of them as 'information bits'; if $m<k$ we label additional $k-m$ hit bits (the last ones) as information bits.

$A$ looks ahead, finds the codeword that is nearest (Hamming distance) from the next block, and uses the $k$ information bits in this block to code it. The remaining bits are copied from $C$.

$B$ simply picks that codeword (and, after knowing the results, deduce the code for the next block).

The analysis seems easier with the random code (proposal $1$), though probably the BCH code (or something similar) would perform better.

The Hamming distance between the codeword and the $C$ block will correspond to the minimum of $2^k$ $Binom(n,1/2)$. This concentrates around

$$ t^*= \frac{n}{2} - \sqrt{n k \log(2) /2} \tag 1$$

with $ t^* \approx k \iff n/k \approx 4.5$. Granted this, in each block we'll have $m \approx k$ , i.e., we'll have approximately as many missed bits as information bits are needed (which is what we want). If that is so, we'd attain a score of $1-k/n \approx 0.777$.

For the case of the BCH code, I suggested taking $t\approx k$, in the hope that the distance from a random tuple to a codeword would concentrate at (or less than) the value $t$. But this needs more elaboration (or at least some simulation).

Update: some simulations partially support the above (a little too optimistic) conjecture, though $n/k \approx 4$ seem to perform better. A random code with $n=57,k=14$ attains a hit rate $r=0.753$. For smaller sizes, a punctured/truncated BCH code performs a little better; for example: $n=23,k=6$ ($BCH(31,6)$ punctured) gives $r=0.740$ ; random: $0.731$). It seems that random codes perform roughly the same (or better!) than BCH codes for large sizes.

Some Octave/Matlab code:

NC = 45; KC=11;            %  (n,k) code parameters 
N = 1000;                  % total tentative number of coins
NB = floor(N/NC+1/2);      % Number of blocks in message
N = NB * NC;               % total number of coins adjusted
NT = 100 ; % number of independent tries

mindist = zeros(1,3*KC); % distribution of minimal distances

for t = 1:NT
 CW=randint(2^KC,NC);  %  codewords
 %% For BCH, comment the previous line and uncomment the following two
 %NCNP =63; KCNP =16; % BCH (n,k) nonpunctured parameters (greater or equal than NC KP)
 %CW=bchenco(dec2bin(0:2^KCNP - 1) - '0',NCNP,KCNP)(1:2^KC,1:NC); % 2^KC codewords

 C = randint(NB,NC);
 for b = 1:NB
    % nearest codeword index in nci, distance in ncd
   [ncd,nci]= min(sum(mod(bsxfun(@plus,C(b,:),CW),2) , 2)) ;
   mindist(ncd+1)++;
 endfor
endfor
mindist /= sum(mindist);

hitrate=1-((0:size(mindist,2)-1)+max((KC-(0:size(mindist,2)-1))*1/2,0))*mindist' / NC

Edit: fixed the hitrate calculation (a little up) : when A has to use "good" bits ($m<k$) to send the message, the probability of coincidence for those bits is $1/2$ (not $1/4$ as I initially assumed).

Added: These values seem consistent with the bound I conjectured in a comment, thus:

The goal of $A$ is to use the "missed rounds" (those not guessed by both) to pass information to $B$ about the other coins. Let $p$ be the miss probability. Then, $A$ would like to pass to $B$ an average of $p$ bits of information for each round: $I(A;B)=p$ bits. Applying Fano inequality, we get the critical value:

$$ h(p) = H(B|A) = H(B) - I(A;B)= 1 - p \tag 2$$

with $h(p)=- p \log_2(p)- (1-p) \log_2(1-p)$. The root occurs at $p =0.2271\cdots$, which corresponds to a hit rate around $0.773$.

Added (2019-03-23): In this answer I show that the distribution of the minimum of $k=2^{\beta n}$ Binomials $(n,1/2)$ asymptotically concentrates around the root of $h(d/n)=1 - \beta$. This proves that the random coding strategy is asympotically optimal, attaining the bound given by Fano inequality above.

leonbloy · Answer 3 · 2019-03-15T21:46:25.083

I think you can do better.

Here's a simple scheme that gives, asymptotically, $2N/3$ wins.

Let $C_i \in \{0,1\}$ ($i=1 \cdots N$) be the coin results.

Let player $A$ make her choices thus:

$$ A_i = \begin{cases} C_i & \text{if } C_i=C_{i+1} \text{ or } C_i = A_{i-1}\\ C_{i+1} & \text{ elsewhere} \end{cases}$$

(Here we assume an extra dummy value $C_{N+1}=C_N$).

And let player $B$ just copy the previous value from $A$: $$B_i= A_{i-1}$$

(Here we assume an extra dummy value $A_0 = 0$).

We can model this as a Markov Chain with two states ($S_1$ if $A_{i-1}=C_i$ and $S_0$ if $A_{i-1}\ne C_i$ ) . We have the transitions $S_1 \to S_1$ (prob : $1/2$), $S_1 \to S_0$ (prob : $1/2$), $S_0 \to S_1$ (prob : $1$). Hence, asymptotically $P(S_1)=2/3$. Also, we score one point when in state $S_1$, zero points when in state $S_0$. Hence the expected score is $2N/3$.

Edit: More in detail, if $s_n$ is the probability of being in state $S_1$ after $n$ rounds, and $e_n$ is the expected score, then

$$ s_n = \frac23 + \frac13 \frac{1}{(-2)^n} $$

and

$$ e_n = \frac23 n - \frac{1-(-\frac12)^n}{9} = \frac23 n - \frac{1}{9} + O(2^{-n})$$

with $ \frac23 n - \frac1{6} \le e_n \le \frac23 n - \frac1{12}$

I don't know if this is optimal. Notice, BTW, that here $B$ doesn't use the knowledge of previous results.

I simulated this strategy and it works. Very clever. Is this is a known sort of technique? — Jair Taylor, Mar 15 '19 at 00:20
@JairTaylor - The basic idea is that if A knows B will be wrong, then A knows the timeslot is gonna be wasted and might as well use it to tell something useful. (See my answer for another variation.) This is a direct consequence of the scoring method in the game: you only score if everyone is right. For many games of this sort (search for "prisoners" and "hats" here at MSE), the "general technique" is "if we're gonna be wrong (i.e. not score) then lets be 'maximally' wrong together". — antkam, Mar 16 '19 at 14:12

score 0 · Answer 4 · answered Jun 01 '25 at 00:36

There is a more or less definitive answer (with references) to this question, presented by Mike Earnest in black-and-white game post.

The original solution to the problem was given by Gossner, Hernandez, Neyman in their 2003 article, available here: Online Matching Pennies. It proves that you can get a hit rate as high as 0.8107...; however, the proof is quite non-elementary. It is worth noting that, using Hamming and Golay covering codes, you can get relatively elementary algorithms for 0.7 and 0.75 rates.

How to maximize the expected number of corrected guesses?

4 Answers4

Linked