At the end of each year, there are usually media stories that compile lists of famous people who have passed away. These lists usually cause us to pause and reflect. Lists like Celebrity Death Pool 2013 on Ranker, however, give us an opportunity to make (macabre) predictions about recent celebrity deaths.
We were interested in whether “wisdom of the crowd” methods could be applied to aggregate the individual predictions. The wisdom of the crowd is about making more complete and more accurate predictions, and both completeness and accuracy seem relevant here. Being complete means building an aggregate list that identifies as many celebrity deaths as possible. Being accurate means, in a list where only some predictions are borne out, placing those who do die near the top of the list.
Our Ranker data involved the lists provided by a total of 27 users up until early in 2013. (Some of them were made after at least one celebrity, Patti Page, had passed away, but we thought they still provided useful predictions about other celebrities.) Some users predicted as many as 25 deaths, while some made a single prediction. The median number of predictions was eight, and, in total, 99 celebrities were included in at least one list. At the time of posting, six of the 99 celebrities have passed away.
One way to measure how well a user made predictions is to work down their list, keeping track of every time they correctly predicted a recent celebrity death. This approach to scoring is shown for all 27 users in the graph below. Each blue circle corresponds to a user, and represents their final tally. The location of the circle on the x-axis corresponds to the total length of their list, and the location on the y-axis corresponds to the total number of correct predictions they made. The blue lines leading up to the circles track the progress for each user, working down their ranked lists. We can see that the best any user did was predict two out of the current six deaths, and most users currently have no more than one correct prediction in their list.
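For readers who want to reproduce this kind of scoring, here is a minimal sketch in Python. It is not the code behind our graphs. The function name `cumulative_hits` and the example user list are made up for illustration, and only the names of the celebrities come from this post.

```python
from itertools import accumulate

def cumulative_hits(ranked_list, deceased):
    """Running count of correct predictions at each position of a ranked list."""
    return list(accumulate(1 if name in deceased else 0 for name in ranked_list))

# Hypothetical user list, invented for illustration (not taken from the Ranker data).
deceased = {"Hugo Chavez", "Patti Page"}
user_list = ["Fidel Castro", "Hugo Chavez", "Zsa Zsa Gabor", "Patti Page"]

curve = cumulative_hits(user_list, deceased)   # one value per list position
final_point = (len(user_list), curve[-1])      # (x, y) of the user's circle
```

The `curve` values trace a user's line as we work down their list, and `final_point` gives the position of their circle: list length on the x-axis, correct predictions on the y-axis.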
To try and find some wisdom in this crowd of users, we applied an approach to combining rank data developed as part of our general research into human decision-making, memory, and individual differences. The approach is based on classic models in psychology that go all the way back to the work of Thurstone in 1931, but has some modern tweaks. Our approach allows for individual differences, and naturally identifies expert users, upweighting their opinions in determining the aggregated crowd list. A paper describing the nuts and bolts of our modeling approach can be found here (but note we used a modified version for this problem, because users only provide their “Top-N” responses, and they get to choose N, which is the length of their list).
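To give a feel for the Thurstonian idea, here is a rough sketch of how such a model might generate one user's Top-N list. It only simulates data and is not the inference procedure we used to build the aggregate list. The function name, the latent values, and the placeholder celebrity names are all assumptions for illustration. Each celebrity has a latent strength, each user perceives it with Gaussian noise, and a smaller noise level plays the role of greater expertise.

```python
import random

def thurstonian_top_n(latent_means, noise_sd, n, rng):
    """Simulate one user's Top-N list under a simple Thurstonian model.

    latent_means: dict mapping item -> latent strength (hypothetical values)
    noise_sd:     the user's noise level; smaller means more expert
    n:            how many items the user chooses to list
    rng:          a random.Random instance, so runs are repeatable
    """
    # The user perceives each latent strength with Gaussian noise.
    perceived = {item: rng.gauss(mu, noise_sd) for item, mu in latent_means.items()}
    # Items are ranked from highest to lowest perceived strength.
    ranked = sorted(perceived, key=perceived.get, reverse=True)
    return ranked[:n]

# Placeholder items and strengths, invented for illustration.
latent = {"celebrity_a": 2.0, "celebrity_b": 1.0, "celebrity_c": 0.0}

expert_list = thurstonian_top_n(latent, 0.0, 2, random.Random(0))  # noise-free user
noisy_list = thurstonian_top_n(latent, 3.0, 3, random.Random(0))   # very noisy user
```

A noise-free user simply reports the true order, while a noisy user's list can scramble it. Fitting the model runs this logic in reverse, inferring the latent order and each user's noise level from the observed lists.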
The net result of our modeling is a list of all 99 celebrities, in an order that combines the rankings provided by everybody. The top 5 in our aggregated list, for the morbidly curious, are Hugo Chavez (already a correct prediction), Fidel Castro, Zsa Zsa Gabor, Abe Vigoda, and Kirk Douglas. We can assess the wisdom of the crowd in the same way we did individuals, by working down the list, and keeping track of correct predictions. This assessment is shown by the green line in the graph below. Because the list includes all 99 celebrities, it will always find the six who have already recently passed away, and the names of those celebrities are shown at the top, in the place they occur in the aggregated list.
The interesting part of assessing the wisdom of the crowd is how early in the list it makes correct predictions about recent celebrity deaths. Thus, the more quickly the green line goes up as it moves to the right, the better the predictions of the crowd. From the graph, we can see that the crowd is currently performing quite well, and is certainly above the “chance” line, represented by the dotted diagonal. (This line corresponds to the average performance of a randomly ordered list.)
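The chance line has a simple form. If a list of 99 celebrities containing six deaths is ordered at random, the expected number of correct predictions among the first k entries is k × 6/99. A minimal sketch, with a function name chosen for illustration:

```python
def chance_line(n_items, n_hits):
    """Expected cumulative hits at each position of a randomly ordered list."""
    return [k * n_hits / n_items for k in range(1, n_items + 1)]

# 99 celebrities and six deaths, as in this post.
expected = chance_line(99, 6)
a_third_down = expected[32]   # expected hits after the first 33 names
at_the_end = expected[-1]     # a full list always finds all six
```

A crowd or user list beats chance at a given position when its running count of correct predictions is larger than the corresponding value in `expected`.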
We can also see that the crowd is performing as well as, or better than, all but one of the individual users. Their blue circles are shown again along with crowd performance. Circles that lie above and to the left of the green line indicate users outperforming the crowd, and there is only one of these. Interestingly, predicting celebrity deaths by using age, and starting with the oldest celebrity first, does not perform well. This seemingly sensible heuristic is assessed by the red line, but is outperformed by the crowd and many users.
Of course, it is only May, so the predictions made by users on Ranker have time to be borne out. Our wisdom of the crowd predictions are locked in, and we will continue to update the assessment graphs.
– Michael Lee