12

What's the probability of getting heads on the second toss given that the first toss was a head? (Trying to refresh my probability a bit.) I've seen this analyzed like this:

  • HH 1/4
  • HT 1/4
  • TH 1/4
  • TT 1/4

So since we are given information (head on first flip), TT goes away and we are left with:

  • HH 1/3
  • HT 1/3
  • TH 1/3

So we could say that HH now has a 1/3 probability. Should we not also get rid of TH, since we know that the first flip is a head? So now we have:

  • HH 1/2
  • HT 1/2
Ole
  • 4
    Often the question is asked "what is the probability of getting heads on both tosses, given that you got at least one head". In that case you toss out TT and keep the three with heads (HH, HT, TH), and of those three exactly one has 2 heads, so the probability is 1/3. I think you are confusing the analysis of two different problems. But your reasoning is correct. Given a choice between trusting something you vaguely remember but which makes no sense, and something that makes sense to you, trust yourself. You may be wrong for other reasons, but if you don't remember the vague stuff, it's definitely wrong. – fleablood Dec 26 '17 at 23:09
  • @fleabood - Excellent point - Thanks! – Ole Dec 26 '17 at 23:21
  • 4
    You should probably specify that you assume the coin to be fair. Otherwise if the probability of heads is unknown the problem gets much more complicated (and interesting). For example you could assume a prior distribution on the probability of heads; in this case observing a head would update that prior and increase your expectation that the second toss will also be heads. – Luca Citi Dec 27 '17 at 00:18
  • 2
    The two events are utterly unrelated. If you have trouble picturing that, consider this example: Let's say you are about to toss a coin. Now, I assert that, back in 1993, at 2:34 PM, on Thursday afternoon, in Bogota, someone flipped a coin. I now tell you: ok, the flip you're about to perform is the second of that sequence. Obviously, you'd think I'm nuts - the two are *not connected in any way at all*. But similarly, the flip you just did has utterly no connection, in any way whatsoever, to the flip you are about to perform. This is quite deep and hard to truly understand. .... – Fattie Dec 27 '17 at 17:40
  • 1
    Your example is trivial, but, the same mistake is often made in sophisticated analysis of various problems. – Fattie Dec 27 '17 at 17:40
  • 1
    Your second step is not correct. Heads on the first flip would remove both TT and TH, which leaves HH and HT, or (simplified) H and T. But, as mentioned by others, before your first throw, there is a 1/4 probability of throwing HH in two throws (1 of 4 equally weighted outcomes). Once the first throw has been determined to be a head, throwing HH has the same probability as throwing H in a single toss, which is what you are doing. – Michael Richardson Dec 27 '17 at 18:07

6 Answers

17

Yeah, that is right. You can also use a concept called independence; if the two coin tosses are independent, then knowing that the first one is heads does not change at all the probability of heads for the second one.

E-A
  • 2
    "Also use"? Isn't independence the assumption underlying everything here? – zhw. Dec 26 '17 at 23:18
  • 3
    @zhw. Yes, but the other solutions that explicitly list the four possible outcomes and say they are equally likely don't require explicit mention of independence. I think this answer, which makes it explicit, is better. – Ethan Bolker Dec 27 '17 at 01:13
12

If we are given the information that "the first coin was a head" then, from HH, HT, TH, TT, we would remove both TT and TH. That leaves only HH and HT, so the probability that the second flip is also a head is 1/2.
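The elimination argument can be checked by brute force. The following is only a minimal sketch, assuming a fair coin so that all four outcomes are equally likely; the helper name `conditional_probability` is made up for illustration. It also runs the "at least one head" variant mentioned in the comments on the question, to show where the 1/3 comes from.

```python
from fractions import Fraction
from itertools import product

# Assumption: fair coin, so HH, HT, TH, TT are equally likely.
OUTCOMES = ["".join(p) for p in product("HT", repeat=2)]


def conditional_probability(event, given):
    """P(event | given) under a uniform distribution over OUTCOMES."""
    kept = [o for o in OUTCOMES if given(o)]
    favourable = [o for o in kept if event(o)]
    return Fraction(len(favourable), len(kept))


# Given "first toss is heads": TT and TH are removed, HH and HT remain.
p_first_head = conditional_probability(
    event=lambda o: o[1] == "H", given=lambda o: o[0] == "H"
)

# Given "at least one head": only TT is removed, so HH, HT, TH remain.
p_at_least_one = conditional_probability(
    event=lambda o: o == "HH", given=lambda o: "H" in o
)

print(p_first_head)    # 1/2
print(p_at_least_one)  # 1/3
```

The two conditions keep different sets of outcomes, which is why the two questions have different answers.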

user247327
  • 1
    This is absolutely wrong. There's no connection, whatsoever, between the mentioned previous toss and the toss in question. Consider all the coin-tosses ever made by humans. (I'd guess there would be about 100 million such coin-tosses - let's number them 0 to 100,000,000.) If you choose one (or more) of those coin tosses ......... well, *you can't*. You can stop there. You can't just choose one or more of those 100,000,000 coin-tosses and say "consider this set...". There's utterly no difference between that coin toss and the other 99,999,999 extant coin tosses. – Fattie Dec 27 '17 at 17:45
  • 1
    Which post is this a response to? It comes right after mine and is indented, so it appears to be in response to mine, but what it says, except for the "This is absolutely wrong", seems to support what I said. Yes, there is no connection between the previous toss and the toss in question. That is why the second toss has probability 1/2 of heads and 1/2 of tails, just as I said. – user247327 Dec 27 '17 at 17:57
  • Hey User24 ... happy new year .. this sentence: " If we are given the information that "the first coin was a head" " is the sentence I mean that is "absolutely wrong". That information (ie: the information: "the first coin was a head") is of entirely no connection, whatsoever, to the question at hand. If I said "In the 13th drawing of 2008 of the Eurolottery, the first number was a 9, and hence ..." that would be equally utterly unrelated. – Fattie Dec 27 '17 at 18:06
3

The two events are unrelated; the outcome of the second is (as mentioned) independent of the first. So the probability of the second being heads is 1/2.

The probability of both being heads is 1/4.

If you did 49 flips - and they all came up heads - the probability of the next one being heads is still 1/2.
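A quick simulation can make this plausible. This is a rough sketch, assuming a fair coin with independent flips (which the comments below question for a real coin), and using a streak of 5 rather than 49 so that enough runs survive the filter. The function name and parameters are illustrative only.

```python
import random


def head_rate_after_streak(streak, trials, seed=0):
    """Estimate P(next flip is H | previous `streak` flips were all H)."""
    rng = random.Random(seed)
    kept = 0        # runs whose first `streak` flips were all heads
    heads_next = 0  # of those, runs whose following flip was also heads
    for _ in range(trials):
        flips = [rng.random() < 0.5 for _ in range(streak + 1)]
        if all(flips[:streak]):
            kept += 1
            heads_next += flips[streak]
    if kept == 0:
        raise ValueError("no run started with the requested streak")
    return heads_next / kept


# Should land close to 0.5 up to sampling noise.
print(head_rate_after_streak(streak=5, trials=200_000))
```

Only about 1 run in 32 starts with five heads, but among those runs the next flip is heads roughly half the time.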

Bucky
  • 2
    My gut says the chance of a 1 in 2^49 event occurring is far lower than my running into an unfair coin. – JTP - Apologise to Monica Dec 27 '17 at 02:34
  • I would want to check the two sides of the coin for if there is a tails or not, but otherwise, @JoeTaxpayer, I would argue that even a significantly biased coin isn't likely to be sufficiently biased to overcome a small prior. A coin with 0.55 probability of coming up heads would be 2 orders of magnitude more likely to produce such a sequence (and at 0.6, 4 orders of magnitude), but I would assume that such coins would be difficult to produce and be far less common than a normal coin with <1% bias. This is, of course, assuming a random coin, and not a coin your friend handed to you, giggling – timuzhti Dec 27 '17 at 04:12
  • In other words, I wouldn't be able to distinguish a designed-to-be-biased coin with a normal coin with only 50 flips, whatever the sequence is. Trick coins with only one side are an obvious exception, assuming I can observe it. – timuzhti Dec 27 '17 at 04:15
  • One weird remark about conditional probabilities here: @JoeTaxpayer would be completely right a priori. You could decide to flip a coin 50 times, and call it unfair if you get more than, say, 35 heads, given that you only expect 25 for a fair coin.

    However, given that I already have 49 heads, we're already in a very unlikely case, and it's equally unlikely that we're in the 1 in 2^50 event of the sequence "HHH....HH" (50 heads) as in the 1 in 2^50 event of the sequence "HH...HT" and so the 50th throw is exactly like the first one.

    – CompuChip Dec 27 '17 at 10:58
  • 1
    @CompuChip: No, that's not the correct way to think about it. Suppose your goal is to maximize your accuracy rate. Well, if there are only fair coins in the world then whatever you do will give you 50% accuracy. But if there are unfair coins in the world, and one of them is flipped 50 times, then your strategy will be strictly inferior to the one that always guesses the most common side so far. So if a random coin is picked from some distribution of coins, each of whose flips are independent and identically distributed for that particular coin, and you see only Hs, the optimal strategy is to guess H again. – user21820 Dec 27 '17 at 11:18
  • @Alpha3031: See my above comment. Also, it is slightly different if the coin could be an adversary, namely it is a random process that decides which side it comes up based on both the history of its flips and your guesses. In that case guessing the majority is clearly not the optimal, but I'm not sure whether there is an optimal strategy. You could guess randomly with probability based on the past flips, in which case you would get 50% accuracy on an optimal adversarial coin but poorer accuracy on memoryless coins, with the worst absolute discrepancy for a 3:1 coin. – user21820 Dec 27 '17 at 11:31
  • @user21820, Of course, but I was talking about the probability of the coin being biased (or the probability of the next result being a heads being significantly higher than 0.5) , not the most likely next result. – timuzhti Dec 27 '17 at 11:55
  • @Alpha3031: Well if you use probability theory to compute likelihoods in the sense of confidence in belief, and you see 49 H in a row, it is sensible to have very high confidence that the coin is biased towards producing H. If you on the other hand view probability theory differently and think that since the coin is already fixed the probability of the coin being biased is either $1$ or $0$ then it is equally moot to talk about the probability of head on the next flip, because it will be $1$ or $0$ as well. – user21820 Dec 27 '17 at 13:40
  • @user21820 Yes, you can reject the null hypothesis with p<0.05, and it's 2 orders of magnitude more likely for the coin to be biased with P(Head)=0.55 given that you get 50 heads vs if you don't. My point is, though that such biased coins are rare enough (i.e. Much less than 1% of all coins) that even given this evidence, it is not more likely that your coin is biased than it is unbiased. The change in expected probability is due to a shift in the tail: median probability is unchanged, and even expected probability changes only very, very slightly. The priors are simply too small. – timuzhti Dec 27 '17 at 14:06
  • @Alpha3031: Okay I see why we differ. You are imposing an experience-based prior where the rarity of biased coins implies unlikeliness of the coin you get to be biased. I am not, because I do not assume anything (much less that the coin is randomly chosen from all possible coins) when the information is not given. However, I take an information theoretic approach; the simplest explanation for the data (coin flips I observe) is that it is biased. By the way the null hypothesis kind of analysis is flawed, but I don't have the time to explain it now. – user21820 Dec 27 '17 at 14:10
  • @user21820 Yes, the null based test is completely wrong, because the test hypotheses can also be rejected under the same p value. I also understand your information theoretic approach , and it does depend on the experiment setup as to which priors are appropriate (obviously, your generalised 2-state RNG is different from my physical, 2 sided coin). Thank you for taking the time to clarify though. – timuzhti Dec 27 '17 at 14:41
  • @Alpha3031: Yeap I think we're largely in agreement. You're welcome to the Logic chat-room if you wish to discuss further. =) – user21820 Dec 27 '17 at 15:33
2

In general, for two events $A, B$ $$ P(A\mid B)=\frac{P(A\cap B)}{P(B)}. $$ Let $A$ be the event of heads on the second toss, and $B$ the event of first toss being a head. Then $$ P(A\mid B)=\frac{1/4}{1/2}=1/2. $$ The numerator is explained by noting that of the four possible sequences of two tosses (all equiprobable), we want $HH$.
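The same computation can be written out numerically. This is just a sketch under the fair-coin assumption stated in the code; the dictionary `prob` and the variable names are illustrative and mirror the events $A$ and $B$ above.

```python
from fractions import Fraction

# Assumption: a fair coin and independent tosses, so each of the four
# two-toss sequences has probability 1/4.
prob = {seq: Fraction(1, 4) for seq in ("HH", "HT", "TH", "TT")}

# A = heads on the second toss, B = heads on the first toss.
p_a_and_b = sum(p for seq, p in prob.items() if seq[0] == "H" and seq[1] == "H")
p_b = sum(p for seq, p in prob.items() if seq[0] == "H")
p_a_given_b = p_a_and_b / p_b

print(p_a_and_b, p_b, p_a_given_b)  # 1/4 1/2 1/2
```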

1

If the coin is a fair coin, the results of the first toss and the second are independent, so there are exactly two possibilities for the second toss: H and T. The probability of getting H is 1/2. Don't forget, the coin may have been tossed thousands of times before the one we care about. None of those affects the result; there's nothing special about the last of those pre-tosses.

0

What confuses our intuition in this case is the difference between probabilities of events in the future and events that already appeared (past). If all the coin tosses lay in the future, the probability of a straight sequence of H decreases the more coin tosses we plan to do. So for only one toss the probability of H is 1/2, for two tosses (HH) it is 1/4, and so on, so that e.g. for 10 tosses the sequence HHHHHHHHHH has a probability of around 0.001. (Mathematically, this is P(AB) = P(A) * P(B) for independent events A and B.)

The probability decreases with every toss we add, so we expect that the likelihood of a T coming up next increases. But this is only true when all tosses lie in the future. In that case the space of possible events comprises all strings of H and T with a length given by the number of tosses we plan (2 ^ number_of_tosses members). The reason why the probability of a straight sequence of H's is so small with a large number of tosses is that it is compared against all sequences where one or more T's appear anywhere in the sequence (and not just one at the end).

But this is not true when we already have a prefix string of H's as a given (past) (mathematically P(A|B)). This is because then the event space for the next toss is the one with only two members, H...HH and H...HT. And here the likelihood of the string of only H's is just 1/2. And it is so for any length of the prefix string (however unlikely it was to get such a prefix in the first place), because the event space for the outcome of the next toss only ever has two members.
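The difference between the two event spaces can be seen by listing them. The sketch below assumes a fair coin and takes n = 10 tosses, matching the example above; the variable names are chosen for illustration.

```python
from fractions import Fraction
from itertools import product

n = 10  # number of planned tosses (illustrative choice)
sequences = ["".join(s) for s in product("HT", repeat=n)]

# All tosses in the future: one sequence out of 2**n is all heads.
p_all_heads = Fraction(sequences.count("H" * n), len(sequences))

# First n - 1 tosses already known to be heads: only two sequences remain.
with_prefix = [s for s in sequences if s.startswith("H" * (n - 1))]
p_last_head_given_prefix = Fraction(with_prefix.count("H" * n), len(with_prefix))

print(p_all_heads)               # 1/1024
print(with_prefix)               # ['HHHHHHHHHH', 'HHHHHHHHHT']
print(p_last_head_given_prefix)  # 1/2
```

The unconditional space has 1024 members, while the space conditioned on the nine-head prefix has just two, which is where the 1/2 comes from.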

ThomasH