Both Matt Parker and the Youtube Veritasium channel have reviewed the surprising best practice for the 100 prisoners puzzle. The puzzle is a way to find a solution that will let the prisoners win with the most probability. If you have not seen this already please see the video here before some spoilers below.
There is a way we can test this solution and that is using a computer script to simulate this riddle for numerous randomized boxes… First of all, how can we represent these in the computer? An array of numbers should work well enough so that array[0] would be the first one, etc. For example, in Python:
>>> list(range(100));
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99]
A strategy function can be tested with each numbered prisoner, and given any strategy and randomized boxes:
def wins(strategy, boxes):
"""Return true when this strategy works with the given boxes."""
for prisoner in range(100):
#numbered prisoner
if not strategy(prisoner, boxes):
#print('%s lost, did not choose their number.'% (prisoner,))
return False
return True
This works with a switchable function since Python can take a function (“stategy” above) as an argument to a function. Now, to make a function that can be used to test a strategy given the prisoner number and boxes array:
def randomStrategy( prisonerNum, boxes ):
""" Return true upon success choosing the prisonerNum. """
chosen = []
for i in range(50):
c = random.choice(boxes)
while c in chosen:
#choose another.
c = random.choice(boxes)
if c == prisonerNum:
return True
chosen.append(c)
return False #Did not find it.
def numberStrategy( prisonerNum, boxes ):
boxchecking = prisonerNum
for i in range(50):
found = boxes[boxchecking]
if found == prisonerNum:
return True
else:
boxchecking = found
return False # Did not find it.
Now one can check this success vs failure rate to check:
successes = 0
fails = 0
for i in range(10000):
boxes = list(range(100))
random.shuffle(boxes)
if wins(randomStrategy, boxes):
successes += 1
else:
fails += 1
print('%s success %s fails, %s' % (successes, fails, successes/(successes+fails)))
This outputs 0 successes, which would be astronomically small chance of winning in the random choice way:
0 success 10000 fails, 0.0
Try it again with the number of the next box strategy:
successes = 0
fails = 0
for i in range(10000):
boxes = list(range(100))
random.shuffle(boxes)
#if wins(randomStrategy, boxes):
if wins(numberStrategy, boxes):
successes += 1
else:
fails += 1
print('%s success %s fails, %s' % (successes, fails, successes/(successes+fails)))
it gives generally about 0.3 or 30% success:
3101 success 6899 fails, 0.3101
This shows that the calculated value over thousands of tries, gives the same value as calculated in the video. Nice solution!
Other strategies/possibilites
One other possibility that I had supposed before looking at their answer, could be using the timing of choice of box. There is no constraint on the time that a prisoner can take in the room, so suppose the prisoner points a direction they are starting from, and then waits 1 second for drawing a 1, 2 seconds for drawing a 2, etc. Timing the time within the room before the next prisoner is called in, would tell if the larger numbers are on that side, which could give a hint to the next user even though they cannot communicate after going in to the room. I was able to build a proof of concept here – first of all starting out on the numbered unchanged boxes with box 1=1, 2=2, 3=prisoner 3 card etc. Obviously this basic case should work for this circumstance and is better than completely random choice, it should always work:
import random
def timeStrategy( prisonerNum, lastTime, lastdirection, boxes):
times = 0
expectedhalf = 1229 #sum(list(range(0,50)))
if lastDirection > 0:#forward
if lastTime < expectedhalf:
# Forward if lower.
if prisonerNum < 50:
for i in range(50):
found = boxes[i]
times += found
if found == prisonerNum:
return [times, 1]
else:
for i in range(50):
found = boxes[99-i]
times += found
if found == prisonerNum:
return [times, -1]
else:
#forward is high, try other side.
if prisonerNum < 50:
for i in range(50):
found = boxes[99-i]
times += found
if found == prisonerNum:
return [times, -1]
else:
for i in range(50):
found = boxes[i]
times += found
if found == prisonerNum:
return [times, 1]
else: #was backward
if lastTime < 99:
#backward if lower
if prisonerNum < 50:
for i in range(50):
found = boxes[99-i]
times += found
if found == prisonerNum:
return [times, -1]
else:
for i in range(50):
found = boxes[i]
times += found
if found == prisonerNum:
return [times, 1]
else:
#backward is high, try other side.
if prisonerNum < 50:
for i in range(50):
found = boxes[i]
times += found
if found == prisonerNum:
return [times, 1]
else:
for i in range(50):
found = boxes[99-i]
times += found
#print('times add %s' % (times,))
if found == prisonerNum:
return [times, -1]
#Did not get it.
return [0,0]
This returns a new time and direction + or -1 after each prisoner, and to find a summation with the numbered boxes I use:
successes = 0
fails = 0
for i in range(10000):
boxes = list(range(100))
#random.shuffle(boxes)
lastTime = 1
lastDirection = 1 # -1 for backward from end.
fail = False
for prisonerNum in range(100):
print(prisonerNum)
[lastTime, lastDirection] = timeStrategy(prisonerNum, lastTime, lastDirection, boxes)
print([lastTime, lastDirection])
if lastTime == 0 and lastDirection == 0:
# fail
fail = True
if not fail:
successes += 1
else:
fails += 1
print('%s success %s fails, %s' % (successes, fails, successes/(successes+fails)))
This outputs:
10000 success 0 fails, 1.0
Yeah! A strategy that works if the warden does no shuffling at all! If random.shuffle() is run in the above and uncommented, their chances go to 0. What if the warden is lazy and just swaps a few of them, what are their chances? Modifying the code to just shuffle by swapping a few of them, there is still a good chance of success:
successes = 0
fails = 0
for i in range(10000):
boxes = list(range(100))
#random.shuffle(boxes)
for i in range(3): #shuffle a number of times:
ind1 = random.choice(range(100))
ind2 = random.choice(range(100))
a = boxes[ind1]
b = boxes[ind2]
boxes[ind1] = b
boxes[ind2] = a
#Two values have now been switched.
lastTime = 1
lastDirection = 1 # -1 for backward from end.
fail = False
for prisonerNum in range(100):
print(prisonerNum)
[lastTime, lastDirection] = timeStrategy(prisonerNum, lastTime, lastDirection, boxes)
print([lastTime, lastDirection])
if lastTime == 0 and lastDirection == 0:
# fail
fail = True
if not fail:
successes += 1
else:
fails += 1
print('%s success %s fails, %s' % (successes, fails, successes/(successes+fails)))
So with 3 swaps, an averaging of experiments provides:
1232 success 8768 fails, 0.1232
With only 1 swap, chances are even better:
4979 success 5021 fails, 0.4979
And if the warden somewhat thoroughly shuffles by swapping positions of 10 of them:
9 success 9991 fails, 0.0009
There is still a chance and it it not infinitesimally small. This is obviously not as good as their solution but it does provide much improvement if the warden is a lazy shuffler and is giving the prisoners as much time as they want.
Thinking outside the box?
One other possible solution could come from the fact that there is no time limit specified per user. Each prisoner could pull out a drawer with the number and use the drawer to start digging below the floor… or push the ceiling tiles to the side to escape?