A probabilistic pub quiz for nerds
The typical pub quiz has a “true or false” round. You know the game: the quizmaster tries to trick you with statements that are often believed to be true, but which are in fact false. In this game, you’re rewarded for accuracy, but not for your confidence -- and that’s unfortunate, because drunk people just love high-stakes gambling! In this post, I show the “Jim scoring” system, a simple way to inject risk and alcoholic overconfidence into your “true or false” quiz round.
As a refresher, here’s a traditional “true or false” question:
If you answer right, you get a point. If you answer wrong, you get no points. If you don’t know the answer, just have a punt, because you can’t lose anything. But now consider the following variant, and consider it carefully:
Instead of asking for a black-and-white answer, this asks you for your level of confidence. Now, you’re rewarded based on two things: your correctness, and your level of confidence. The scoring system is as follows:
|Your answer|Score if the statement is true|Score if the statement is false|
|---|---|---|
|80-100% likely|3 pts|-7 pts|
|60-80% likely|2 pts|-3 pts|
|40-60% likely|0 pts|0 pts|
|20-40% likely|-3 pts|2 pts|
|0-20% likely|-7 pts|3 pts|
The Jim scoring system sure looks odd at first glance! There’s a “magic sequence” of numbers: 3, 2, 0, -3, -7. Stare at it for a few seconds, and you might spot a pattern. But why is this a “good” scoring system?
It turns out the Jim scoring system has a very nice property: the optimal strategy is to choose the option that matches your true belief. You can’t “cheat” by pretending to be more confident than you truly are in order to gain points. To see why this scoring system rewards true reporting, think about your expected score given your belief. I’ll explain with another example:
Perhaps you’ve heard something like this before -- but was it ⅓ of bones, or ¼, or ½ ..? Was it the feet, or one foot, or the hands ..? Let’s say you think it’s 70% likely that “Approximately ⅓ of human bones are in the feet”. Then your expected score for picking “60-80% likely” is calculated as (70% × 2 pts) + (30% × -3 pts). This expected score comes out at 0.5, which is higher than your expected score for any other answer. Here’s a plot of your expected score, given your belief and your answer:
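If you'd rather check the numbers than squint at a chart, here is a minimal Python sketch of the same calculation. It assumes the scoring table above; the names `JIM_SCORES`, `expected_score` and `best_band` are made up for this illustration.

```python
# Jim scoring table: answer band -> (points if true, points if false).
JIM_SCORES = {
    "80-100% likely": (3, -7),
    "60-80% likely": (2, -3),
    "40-60% likely": (0, 0),
    "20-40% likely": (-3, 2),
    "0-20% likely": (-7, 3),
}


def expected_score(belief, band):
    """Expected points for answering `band` when you believe the
    statement is true with probability `belief` (between 0 and 1)."""
    if_true, if_false = JIM_SCORES[band]
    return belief * if_true + (1 - belief) * if_false


def best_band(belief):
    """The answer band with the highest expected score for this belief."""
    return max(JIM_SCORES, key=lambda band: expected_score(belief, band))


# Print the best answer for a few beliefs, avoiding the band boundaries
# (at exactly 20%, 40%, 60% or 80%, two neighbouring bands tie).
for belief in (0.1, 0.3, 0.5, 0.7, 0.9):
    band = best_band(belief)
    print(f"belief {belief:.0%}: answer '{band}', "
          f"expected score {expected_score(belief, band):.2f}")
```

For a belief of 70%, this picks "60-80% likely" with an expected score of 0.5, matching the hand calculation above.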
Notice that the “0-20% likely” answer is optimal precisely when your belief is in the range 0-20%, and so on for the other answers. Now that you’ve had time to think about it, and you know the theory, you can have a go:
The Jim scoring system is discrete: it asks you to put your belief into one of five categories. But if you’re a real nerd, you can use a continuous scoring system. Here is one such system:
- You enter an answer a between 0 and 1 (that is, “0% likely” to “100% likely”).
- If the statement is true, your score is log(a).
- If the statement is false, your score is log(1 - a).
If you want to see why this works, suppose your believed probability is p. Then your expected score is p × log(a) + (1 - p) × log(1 - a). To maximize this, set its derivative with respect to a, which is p/a - (1 - p)/(1 - a), to zero. Solving gives a = p -- that is, you should answer with your true believed probability.
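If calculus isn't your thing, you can also check this by brute force. The sketch below tries every answer from 0.001 to 0.999 and keeps the one with the highest expected score. The function names and the grid resolution are my own choices for illustration, and the natural logarithm is assumed (any base gives the same best answer).

```python
import math


def expected_log_score(p, a):
    """Expected score for answering `a` when your true belief is `p`."""
    return p * math.log(a) + (1 - p) * math.log(1 - a)


def best_answer(p, steps=1000):
    """Grid-search the answer in (0, 1) that maximizes the expected score.

    The endpoints 0 and 1 are skipped because log(0) is undefined.
    """
    candidates = [i / steps for i in range(1, steps)]
    return max(candidates, key=lambda a: expected_log_score(p, a))


for p in (0.1, 0.5, 0.7):
    print(f"belief {p}: best answer {best_answer(p)}")
```

For each belief tried here, the best answer on the grid is the belief itself.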
But people in pubs don’t like logarithms -- they like quizzes and gambling. The Jim scoring system adds some fun gambling to the quiz. It’s funny to see the effects of alcohol: it biases your confidence as well as your accuracy, so drunk people end up collecting a lot of -7s. Watch the confidence ratings become more extreme towards the end of the quiz.
Try it out in your next family quiz over Zoom. But in the meantime, I’ll leave you to test your confidence on this 10-question quiz. Post your final score on Twitter: