Riddler Solitair

09 may 2022

Riddler is a puzzle column in the infamous blog FiveThirtyEigth by Nate Silver. This column publishes puzzles in the fields of logic, mathematics, and probability theory. Each week the Riddler presents two puzzles: the simpler Riddler Express and the more challenging Riddler Classic. Having become a bit overconfident as earlier puzzles of my creation - which have also been published in STAtOR - were published in Riddler 538, I sent in a puzzle that unexpectedly became a large success with surprising solutions of adepts of Riddler 538. The puzzle that was published as Riddler Classic reads as follows.

Riddler solitaire is played with 11 cards: an ace, a two, a three, a four, a five, a six, a seven, an eight, a nine, a 10 and a joker. Each card is worth its face value in points, while the ace counts for 1 point. To play a game, you shuffle the cards so they are randomly ordered, and then turn them over one by one. You start with 0 points, and as you flip over each card your score increases by that card’s points — as long as the joker hasn’t shown up. The moment the joker appears, the game is over and your score is 0. The key is that you can stop any moment and walk away with a nonzero score.

What strategy maximizes your expected number of points?

Extra credit: With an optimal strategy, how many points would you earn on average in a game of Riddler solitaire?

The determination of the optimal stopping rule is not extremely difficult, in contrast to the determination of the maximum average score per game. A lot of submissions came to the idea to see what the effect is on the current score without a joker when the player does not stop but draws a single additional card. The reason is as follows. Imagine that you have a current total of p points and that you have drawn c cards up until now. In this situation, there are 10-c non-joker cards left in the game with a total value of 55-p points, where each of these cards has an average number of (55-p)/(10-c) points. The next card you draw has a 1/(11-c) chance of being a joker and a (10-c)/(11-c) chance of being a non-joker. Thus, we have the expected gain of your current score of p points.

while the expected loss of your current score is given by

The expected increase T is larger than the expected loss A if 55-p>p, i.e. p<27.5. This leads to the conclusion that one should only continue if their current score is below 28 points, and to stop the moment your score is above 28 points. That is indeed the optimal stopping rule, as can be proven using Markov decision theory (the so-called “one-stage-look-ahead” rule is optimal for stopping problems with the property that the process stays in the collection of unfavorable states once they have entered it). More generally, for the case of N non-joker cards with respective values 1 until N, the above reasoning then gives that the optimal stopping rule is to stop once your score is above N(N+1)/4 and continue otherwise. It is worth noting that the optimal strategy is not dependent on the number of cards already drawn, only on the number of points already gained.

A lot of submissions of a solution to the puzzle used Monte Carlo simulation to find the maximum average score per game, where some found the surprising and peculiar result that under the optimal stopping rule, the chance to draw a joker is 50%. One submission even showed that this probability is indeed equal to 50% by having the computer cycle through all possible permutations of the 11 cards in a clever way, by which they could calculate the distribution of the final score under the optimal stopping rule. Where I personally had considered a dynamic programming recursion with a multi-dimensional state als alternative to the simulation, there was a submission with a different approach to calculate the maximum average score per game without simulation. To that end, it was first noted that for the optimal stopping rule with a stopping threshold of 28 points, one has to have at least 4 cards and at most 7, where the final score can not be more than 37 (=27+10) points. The expected value can then be calculated using the formula

where a_c(p) denotes the number of possible combinations of c distinct non-joker cards where the sum of the values of the cards is equal to p and less than 28 if one of these c cards would be removed. These values a_c(p) are determined by enumeration. This leads to the expected value 15.453 for the final score under the optimal stopping rule (with a standard deviation of the final score of 15.547). Note that the last formula also embraces the probability distribution of the final score: the probability of a final score of p points has the values 0.5, 0.0952, 0.0826, 0.0751, 0.0643, 0.0536, 0.0442, 0.0356, 0.0249, 0.0162 and 0.0083 for p=0, 28, 29, ... ,37. Indeed the probability of picking the joker under the optimal strategy is exactly 50%.

Finally, I want to give some comments on the challenging puzzle about which had so many interesting reactions sent in. Some submissions applied a combination of a Monte Carlo simulation and a regression analysis. Through this, the probability of a final score of zero for the case of 11 cards could be calculated as a regression of the stopping value s.

and for the more general case of N non-joker cards with the respective values of 1 to N the maximum expected value of the final score can be approximated by the regression curve

with the respective approximating values of 0.49996 for s=28 and 15.45 for N=10.

The surprising trade-off between the probability of 50% of having a final score of zero and the expected value of the final score on the optimal stopping rule also seems applicable for the more general case of N cards (this can be easily verified for N=2 and N=3). It would be interesting to see further research into this. Until this point, it is assumed that there is only one joker in the cards. What happens when there are two jokers? It is then simple to check that the average number of points per game is maximal by stopping the moment the number of accumulated points is more or equal than N(N+1)/6 if N is the number of cards with the values 1 to N. However, what is the story for other results? What happens if there are cards with duplicate values? These could form a number of interesting subjects for students to research. Especially for the readers of Sector, I would like to give an intriguing twist on the card game, see also the thread "The Devil's Card Game" on the site 10kdiver.com, a very interesting site about finance and investing. This twist involves probability, betting styles, investor psychology and utility concepts from economics. You have 11 cards again: 10 'double' cards and 1 'joker' card. The cards are indistinguishable and thoroughly shuffled. Starting with a bankroll of $1000, you get to draw the cards one by one. Each time you can bet any proportion of your current bankroll on the next card to be drawn. If you draw a double card, you get back twice the money bet on that card. However, the game is over and the money bet on the card is lost when you draw the joker card. You can decide to stop at any moment as long as the joker card has not been drawn. What strategy would you use and what is the underlying criterion?

Henk Tijms is an emeritus-professor Operations Research at the Vrije Universiteit and author of a number of books about Operations Research and probability theory. E-mail: h.c.tijms@xs4all.nl