100 prisoners and a lightbulb

Question

100 prisoners are imprisoned in solitary cells. Each cell is windowless and soundproof. There's a central living room with one light bulb; the bulb is initially off. No prisoner can see the light bulb from his or her own cell. Each day, the warden picks a prisoner equally at random, and that prisoner visits the central living room; at the end of the day the prisoner is returned to his cell. While in the living room, the prisoner can toggle the bulb if he or she wishes. Also, the prisoner has the option of asserting the claim that all 100 prisoners have been to the living room. If this assertion is false (that is, some prisoners still haven't been to the living room), all 100 prisoners will be shot for their stupidity. However, if it is indeed true, all prisoners are set free and inducted into MENSA, since the world can always use more smart people. Thus, the assertion should only be made if the prisoner is 100% certain of its validity.

Before this whole procedure begins, the prisoners are allowed to get together in the courtyard to discuss a plan. What is the optimal plan they can agree on, so that eventually, someone will make a correct assertion?

http://www.ocf.berkeley.edu/~wwu/riddles/hard.shtml

I was wondering does anyone know the solution for the 4,000 days solution. Also, anyone got any ideas on how to solve this?

Just wanted to know if anyone got any ideas how to solve this. http://www.siam.org/ is offering prize money for a proof of a optimal solution for it. I suppose to make it clear the people want a plan that would mean on average it would take least amount of time to free the prisoner. The point is the average run time is as low as possible.

Depends entirely on the values of the outcomes (freedom, death) and cost of another day in prison. — Alex Becker, Mar 04 '12 at 15:39
Why did someone downvote this. The riddle is well defined and if you click on the link it's okay. Plus there is a riddle tag. — steadyboy, Mar 04 '12 at 15:46
@AlexBecker no . The point is on average you want a strategry that takes the least amount of time so everyone is freed. — steadyboy, Mar 04 '12 at 15:47
@Alex The cost of death is infinity; the cost of another day in prison is 1; and the value of freedom doesn't matter, because any strategy that avoids infinite punishment will result in freedom. — Sophie Swett, Mar 04 '12 at 15:53
That SIAM link is a bit unhelpful, steadyboy! Can you be more specific? — TonyK, Mar 04 '12 at 16:33
@AlexBecker, it's a hard condition that failure must he impossible. Here's a simple strategy that provides a small postive chance of success and zero risk of failure: prisoner 0 turns the lamp on on days divisible by 100 and off on all other days. No other prisoner ever turns it on. Prisoner $n$ for $0<n<99$ turns the lamp off if he's in the living room and it's not day $n$ modulo 100. Prisoner 99 declares success if he's in the living room on day 99 modulo 100 and sees the light being on. Expected running time about $100^{101}$ days; will succeed eventually with probability 1. — hmakholm left over Monica, Mar 04 '12 at 16:34
@TonyK: Indeed. Google turns up only one other mention of this and the wording is almost identical with no link to any "prize". Color me skeptical for various reasons. — cardinal, Mar 04 '12 at 20:58
This question is highly flawed. You state no prisoner can see the light bulb from his/her own cell but I am wondering can they see the light? If so, then any of the $100$ prisoners that can can be a "counter" and any "new" person that enters the central living area can toggle the light to indicate it is their first time in the room. Problems are you didn't mention if the warden can also toggle the light to mess up the system. You also stated prisoners can be shot but you didn't say with what. Perhaps a camera? — David, Jan 03 '15 at 18:31
I have a hard time thinking of a [nice] plan. But if I were forced into this I would advise each prisoner to decline a second invite. Remove random, 101 days minimum. Sounds like Alcatraz. — StackOverFowl, Jul 11 '17 at 19:15

score 23 · Answer 1 · answered Mar 04 '12 at 16:35

23

The "standard" solution for the puzzle is the following. Prisoners choose one of them, lets call him "The Counter". The job of The Counter is to count the prisoners who have been in living room. Other prisoners' job is to give a signal to The Counter that they have been in the living room. This is done as follows. The Counter is the only person who can turn the light off. Others can only turn the light on and do it only once. That is if a prisoner walks into the living and the light bulb is off and he has never turned the bulb on, then he turns it on. The bulb will stay on, until The Counter comes and turns it off. When The Counter has turned the bulb off 99 times, he knows all prisoners have been in the living room.

I don't know what is efficiency of this solution, but in your link it is stated that the standard solution takes 27-28 years on average. So I suspect that is efficiency of this solution.

answered Mar 04 '12 at 16:35

Levon Haykazyan

4,916

1

Back of the envelope calculuation: Each time the bulb is off we have to wait until a yet uncounted prisoner comes around. This takes longer as time goes by, but in total we should expect about $$\frac{100}{99}+\frac{100}{98}+\frac{100}{97}+\cdots+\frac{100}{1} \approx 100\log 99$$ days of waiting in this phase. Once it is on we must wait for the counter to show up, on average 100 days per prisoner we count. This works out to between 28 and 29 years, but the logarithmic estimate may be slightly inflated. – hmakholm left over Monica Mar 04 '12 at 17:11
1

@Henning: Each time the Counter is waiting to go to the living room, the probability that the light will be turned on is the probability that one of the uncounted prisoners gets to the living room before the Counter. If there are $u$ uncounted prisoners, this probability is $\dfrac{u}{u+1}$. Thus, the number of passes needed for the Counter to see the light on is $\dfrac{u+1}{u}$. – robjohn Mar 04 '12 at 23:56
@Henning: Therefore, the expected number of days to count all $99$ other prisoners is $$\left(\frac{100}{99}+\frac{99}{98}+\frac{98}{97}+\dots+\frac32+\frac21\right) \cdot100$$ and this is approximately $(99+\log(99)+\gamma)\cdot100\approx10417$ days, which is about $28.5$ years. – robjohn Mar 04 '12 at 23:56
@Henning: now that I reread your answer, I notice something I missed before, and I believe that our answers are the same. Sorry for the interruption :-) – robjohn Mar 05 '12 at 00:19
3

If we would not prescribe the person doing the counting, but instead take the person who is with the warden the second day as the counter, we would improve right? – Jan Apr 09 '17 at 13:49

hmakholm left over Monica · Answer 2 · 2012-03-05T03:03:34.017

Here's one way to improve on Levon's solution. I don't have time to do a full probability analysis just now, but my very rough estimates suggest that with careful tuning, techniques like this should be able to get the average running time down to around 4000 days:

It's not certain but "pretty likely" that every prisoner gets to visit the living room at least once during the first, say 800 days. So divide the strategy into several phases:

Day 1 to 800: Everyone toggles the lamp the first time they see the living room. If we're lucky, after day 800 there will be 50 prisoners who have ever turned off the lamp. They are the "active" prisoners. (If we're unlucky and there's someone who hasn't been in the living room yet, there will be less than 50 active prisoners).
Day 801 to 1600: Ever prisoner who is active after the first phase toggles the lamp the first time they see the living room during this period. (The prisoner who enters on day 801 makes sure the lamp is already off before he does his part -- if so, something will have gone wrong but proceed according to plan anyway). The prisoners who turned the lamp off during this phase are still active. If we're lucky there are 25 off them.
Day 1601 to 2400: Halve the number of active prisones once again. This time those who turned the lamp on stay active; there are now 13 active prisoners unless something went wrong.
Day 2401 to 6000: Count the 13 still active prisoners using the standard algorithm, with a Counter selected in advance.

If anybody finds themselves still in prison on day 6001, everyone shrugs and starts the entire algorithm over from day 1. Since 3600 days should be plenty of time to count 13 active prisoners, the main risk is that someone who needed to didn't get to participate in one of the first three phases, and since the risk of that is small, it won't affect the expected running time of the algorithm much.

With the parameters given here (and several approximation shortcuts in the calculation) this strategy has an average running time of 4353 days. But there's room for some optimizations by tweaking the numbers. It appears that 800 days is close to optimal for the first halving, but the later halvings can be shortened a bit without incurring quite as large a total risk of missing an active prisoner and having to restart.

EDIT: After a semi-systematic search for better parameters it seems that the best this method can get is an average running time of 4227 days (about 11½ years), which is achieved by three halving phases of 840, 770 and 700 days, followed by a counting phase of 2200 days. Neither more halving phases nor fewer ones improved the overall efficiency. Back to the drawing board ...

Paul Sinclair · Answer 3 · 2016-05-15T22:23:39.903

The shortest time solutions that I am aware of are based on this scheme, first suggested by Paul Hammond here.

Time is broken into Rounds, and each round is further broken into two Stages. In Paul Hammond's original scheme, the first stage in each round is 2600 days, while the second is 2700 days.

Each prisoner is considered to start with a virtual "token". If the light is off, they can leave their token in the room by turning the light on (so they no longer have a token themselves). They can pick up a token left in the room by turning the light off. At the initial meeting, A head counter and 9 assistant counters are picked. Everyone else is a "drone". Drones never pick up a token (except on the last day of a stage). They can only leave one.

During the first stage, the drones leave their token in the room when the light is off. If the light is on, or they have already given away their token, they do nothing. All counters pick up any tokens left in the room until they have 10 tokens (including their own). After that, they do nothing in this stage. On the last day of the stage, whoever visits the room picks up any token that was there (if this was a drone, and he had not yet given his own token away, then he will have two tokens to pass on in the next round, which he must do one at a time).

During the 2nd stage, tokens are passed 10 at a time from assistant counters to the head counter. Drones never turn the light on or off. Asssistant counters can only turn on the light if they have 10 tokens to pass. After passing, they become drones. The head counter turns the light off on his next visit, and adds 10 tokens to his inventory. If he has all 100 tokens, he declares everyone has visited. As with the first stage, if the light is on on the last day of the stage, the visitor picks up the 10 tokens there, and saves them for stage 2 of the next round, when he will act as an assistant counter.

If the round is unsuccessful, an additional round with new stage 1 and stage 2 take place, and so on, until the head counter finally reaches 100.

Paul Hammond originally had them start over with each new round (everyone gets their tokens back and all counters reset their counts to just their own token). Computer simulations estimate the expected run time at 3964 days. AlexH suggested saving the tokens as I described above. This change and tweaking of stage lengths (which need not be the same between rounds) produced a claimed estimated expected run time of 3600 days. However, the person posting that result did not say what stage lengths he used. However a later variant of the scheme which picks the counters during a special 40 day period at the start and uses initial stage lengths of 2000 and 1500, while all later stages are 300 days each, results in an estimated expected run time of 3489 days.

A later scheme appears to have dropped that under 3400 days, but I have not yet examined that scheme.

score 4 · Answer 4 · answered Oct 20 '14 at 12:51

A survey of various strategies can be found at https://www.math.washington.edu/~morrow/336_11/papers/yisong.pdf. It describes a strategy ("two-stage counting") which is believed, based on simulations, to yield expected run times between 3500 and 4000 days, and references a hybrid strategy by a Bertram Felgenhauer which is supposedly proved to run in 3949 days average.

Petar Donchev · Answer 5 · 2016-03-17T12:50:50.937

The halving strategy seems to work well - that is in a frrst phase of x1 days everyone toggles the lamp on first entry (lamp is always turned off on the last day of a phase). On the next phase i+1 everyone that toggled the lamp to a certain state in phase i, continue. This certain state depends on the number of players in phase i - if it was even (as initially - 100), then everyone who turned off the lamp continue for i+1, if it was odd - then everyone that turned it on - this is to ensure noone "slipping". This should continue until there is a phase with 2 players. Then the if one of the two enters and sees the lamp on, he knows that he is second, and in retrospect that in each previous phase all the active players managed to enter the room, including the first phase, so everyone has been at least once there.

The phases are with 100, 50, 25, 13, 7, 4, and 2 players, so if we try to solve the following minimization problem - minimize the sum of the phase lengths, with the constraint that the total probability of all phase players go into the rooms at least once on each phase is exactly (or at least - depending on what minimization algorithm is used) 0.5. Of course, a different target probability can be set.

There are two approaches: 1. solve a constrained problem where objective is total days, and constraint is total probability = 0.5 (or something else). 2. solve unconstrained problem where the objective function somehow includes total length and target probability (this allows greater variety of methods).

In all cases there is a general problem that the actual domain is integers and minimizing functions over integers is hard to do, so at least I defaulted to approximations. Additionally it seems that the constrained task is hard and the unconstrained task is nonconvex.

For Nelder-Mead and somewhat contrived target function I get different solutions (all very close to p=0.5) depending on starting point. The minimum, achieved from a sane starting point is around 3439 days. (See below for sane starting point).

SLSQP and COBYLA seem to behave more consistently once some numerical problems are resolved (both in objective function and constraint) and give 3455.

Regarding sane initial guess - it seems reasonable to guess that if target probability is 0.5. then we can pick initial phase lengths so that all phase probabilities are equal - to 0.5^(1/7). Experimentation confirmed that this may be a good initial guess.

P.S. Of course this is minimizing only the length for this particular scheme. I have no guess on a more generic approach. — Petar Donchev, Mar 15 '16 at 12:11
I found this paper that features an O(n*ln(n)^2) solution - that would be 2120 day.. — Petar Donchev, Mar 15 '16 at 13:01
$O$ notation doesn't work that way. It cannot give you a specific value for any fixed $n$. It only tells you how the sequence grows as $n$ increases. In this case, William Wu produced an upper bound formula for the expected time, and used that formula to justify $O(n(\ln n)^2)$. If you calculate the upper bound formula for $n = 100$, the value is $7585$ days. But in fact, if you read through the Wu Forums thread, you will see that computer testing of this scheme estimates the expected time to be around $4250$ or $4500$ days. Some other schemes given are $< 4000$ days. — Paul Sinclair, May 15 '16 at 19:38

score 2 · Answer 6 · answered Nov 03 '21 at 05:52

My solution to this was a more generalised version of the above. At any given time, each prisoner has a value. This starts at 1 for all prisoners except one of them that has a value of 29.

On each night the light bulb has a power of two value that is agreed by the prisoners at the beginning. When a prisoner enters the room (on any day except day one), if the light is on they increase their value by the light-bulb value of the night before.

They then expand their value in binary. If they have a 1 in the power-of-two slot for the value on the subsequent night, they ensure the light is left on and subtract this from their own value. If they have a zero in that slot they ensure the light is left off.

The sequencing of light bulb values is chosen by drawing from a random distribution that reflects the expected density of the powers-of-two values present in the prisoner population as a whole given the previous choices. (The prisoners have to have prodigious memories for this algorithm).

If any prisoner finds themselves worth 128, they announce success.

David · Answer 7 · 2015-01-08T05:54:00.133

1

A random person chosen could mean that all $100$ prisoners will never all be chosen in their lifetime so it seems to me that there is no minimum amount of time to satisfy the "$100$% certain" situation (in the general case) so what is the point of this question? It may take anywhere from $100$ days to a lifetime. Why don't they just each get a number assigned to them from $1$ to $100$ and make a mark in the living room when they are in it? Prisoners are good at that type of stuff. Perhaps scratch a binary (stick men) code?

Also, what happens if one of the unchosen ones dies before all $100$ are chosen? How is it verified if all the prisoners have visited the living room or not? If it is written down somewhere then maybe the prisoners should find out where and use that info.

Too many unknowns in this badly worded question to come up with a reasonable answer.

edited Jan 08 '15 at 05:54

answered Jan 03 '15 at 18:24

David

1,710

4

You can do it as a practical exercise (highly unlikely to ever occur in real life) or you can do it as a mathematical puzzle. In the latter case you idealize the problem so that it works as designed. – David K Jan 08 '15 at 06:07
2

Changing the problem by "idealizing" it will then change the answer depending on what "liberties" the person solving it takes. This problem could have been worded much better to clearly state any restrictions and reasonable assumptions. Without those, people can just make assumptions (even unreasonable ones). For example, if you allow me liberties I have a $100$ day solution. I will assume the warden is new and doesn't know the prisoners so when they are "randomly" called, the prisoners will "come forward" in order (from $1$ to $100$) that they preagreed on. What do I win for my solution? – David Jan 08 '15 at 14:01
1

I think most people would interpret "imprisoned in solitary cells" to mean that the prisoners spend all their time there except the explicitly specified visits to the living room, so one can hardly "step forward" when one's day arrives. You can assume the prisoners can communicate reliably via the light switch and only via the light switch, or you can assume other communication; if you don't assume well-defined limits on that communication then the problem becomes uninteresting, but no doubt there are interesting variations that can be devised. – David K Jan 08 '15 at 16:54
1

My point is either ask the question clearly or expect to have answers to "changed" questions cuz many different interpretations/variations are possible. I've seen a similar problem on a show called Brain Games I think. They were able to toggle 3 lights with 3 switches but could not see the lights and they had only 1 opportunity to ID which was light 1, which was 2, and which was 3. The common problem I see with word problems like this is they have flaws in them that can be exploited easily. My solution was to just set up mirrors to see which lights were being toggled. The host "forgot" that. – David Jan 08 '15 at 19:03
1

Some mathematical problems are much easier to state with words, as in this case. If you provide the statement as some kind of Markov chain; it becomes less "engaging" and perhaps harder to visualize. Steven Pinker (indirectly) holds that this way of conveying ideas is much better than the directly mathematical version. – OFRBG Feb 06 '17 at 11:45

Brownbat · Answer 8 · 2023-05-15T23:14:11.477

Here's a good discussion of the "standard" answer to the problem, using one prisoner to "count" the others.

https://medium.com/i-math/100-prisoners-and-a-light-bulb-573426272f4c

The main difference from Levon's answer is that the counter isn't preselected, the first prisoner to enter the room becomes the counter. The group gets to start collecting counters slightly earlier that way.

The other nice thing about this article is it talks about a strategy for dealing with problems of this class. Imagine a simpler problem, reduce the 100 to something manageable (in the article's case, 5 prisoners).

Note that solutions for 1-3 prisoners are fairly trivial.

1 prisoner: leave after day 1
2 prisoners: leave if ever your first visit to the living room is not day 1
3 prisoners: now incorporate light bulb if your first visit is not day 1, turn the bulb on if your first visit is not on day 1 and the bulb is on, then leave

For 4 prisoners, is there any faster approach than having the first prisoner become the counter? Can you use the regimes method at that level?

I wrote some short functions to benchmark the standard strategy.

import random
def generate_ledger(total_prisoners=100, ledger_length=200):
    '''Generates a warden's ledger of randomly selected prisoners'''
    ledger = []
    for _ in range(ledger_length):
        ledger.append(random.randint(0, total_prisoners-1))
    return ledger
def freedom(ledger=[], total_prisoners=100):
    '''Checks whether the prisoners, using the standard strategy, can escape
    given a certain ledger'''
if len(ledger) &lt; total_prisoners:
    return False

counter = ledger[0]  # The first prisoner is the counter
visited = set()  # Set to keep track of prisoners who have turned on bulb
bulb = False  # bulb starts off

visited.add(counter)  # counter will visit first but never turn off own bulb

for prisoner in ledger[1:]:
    # counter checks and turns off the light
    if prisoner == counter:
        if bulb:
            bulb = False
            # counter checks if all prisoners have visited
            if len(visited) == total_prisoners:
                return True
    elif prisoner not in visited and not bulb:
        visited.add(prisoner)
        bulb = True

return False  # If all prisoners didn't visit, return False



for ledger_length in range(6, 50):
    #  considers ledgers of lengths in range, testing total number of times
    #  in order to generate an estimate of success for the standard strategy
    #  against a random ledger of that length
    count = 0
    total = 10000
    for i in range(total):
        ledger = generate_ledger(total_prisoners=4, ledger_length=ledger_length)
        if freedom(ledger, total_prisoners=4):
            count += 1
    print(f"ledger length {ledger_length}: {count}/{total}")

Output:

ledger length 6: 0/10000
ledger length 7: 11/10000
ledger length 8: 82/10000
ledger length 9: 199/10000
ledger length 10: 407/10000
ledger length 11: 730/10000
ledger length 12: 1129/10000
ledger length 13: 1566/10000
ledger length 14: 2184/10000
...
ledger length 23: 7166/10000
...
ledger length 26: 8270/10000
...
ledger length 30: 9155/10000
...
ledger length 33: 9508/10000
ledger length 34: 9574/10000
...
ledger length 41: 9920/10000

An alternative solution would divide the game into rounds of three and use parity bits. So each prisoner toggles the bulb on every odd visit they make to the room. On the fourth night, ...

[This version was non-optimal and confusing, removing in lieu of the TRY+TEST strategy.]

EDIT 16 MAY: The TRY+TEST Strategy

A simpler, more successful, and more generalizable strategy would be to leave the light on for prisoner D if all is well, turn it off if it's not.

So whoever arrives on day 1 is A, turns bulb on. If A returns day 2, turns bulb off, that round is a wash. If someone else arrives day 2, call him B, leave bulb on. For round three, if A or B returns, turn bulb off. Else C leaves on. On the next turn, if D sees a lit bulb, D knows all others visited. If unlit, someone visited twice and we must try again from scratch. You can vary this strategy by dividing each round into a "try" period for the first three days, then a "test" period where you're leaving the bulb on just waiting for D to arrive.

So you could have rounds of length 7, then at round_index % 7 == 0, you basically reset the state. A arrives and turns bulb on. %7==1, B leaves on, A turns off. %7==2, C leaves on, A or B turn off. %7 in [3-6], everybody ignores the bulb but D. Iff D sees a lit bulb, exits.

Implemented as follows:

def freedom_test_1(ledger=[], total_prisoners=4, round_length=7):
    '''Checks whether the prisoners, using the alternate strategy, can escape
    given a certain ledger
Turn bulb on / keep bulb on if ABC in beginning of round
Keep bulb on until D sees it or round resets

This time with greater lengths for D to check'''

# round_length = how long to check for D before giving up and resetting

OFF = False
ON = True
visitors = set()
bulb = OFF
for turn, prisoner in enumerate(ledger):
    if turn % round_length == 0:
        # reset state
        visitors = set()
        bulb = OFF
    if turn % round_length in [0, 1, 2]:
        if prisoner not in visitors:
            visitors.add(prisoner)
            bulb = ON
        else:
            bulb = OFF
    else:
        if prisoner not in visitors and bulb is ON:
            return True
return False

So for round lengths of [6, 7, 8] I found these results:

Adding the default "counter" strategy as Method 0 for comparison:

There are tradeoffs with round lengths on the try and test strategy, but it significantly outperforms counter until round lengths of 28 or so.

100 prisoners and a lightbulb

8 Answers8

Linked

Related