
Combining Preferences for Pizza Toppings to Predict Sales

The world’s most expensive pizza, auctioned for $4,200 as a charity gift in 2007, was topped with edible gold, lobster marinated in cognac, champagne-soaked caviar, smoked salmon, and medallions of venison. While most of us prefer (or can only afford to prefer) more humble ingredients, our preferences are similarly diverse. Ranker has a Tastiest Pizza Toppings list that asks people to express their preferences. At the time of writing there are 29 re-ranks of this list, and a total of 64 different ingredients mentioned. Edible gold, by the way, is not one of them.

Equipped with this data about popular pizza toppings, we were interested in finding out whether pizzerias were actually selling the toppings that people say they want. We also wanted to see if we could predict sales for individual ingredients by looking at one list that combined all of the responses about pizza topping preferences. This “Ultimate List” contains all of the toppings that appeared in individual lists (known as re-ranks) and is ordered in a way that reflects how many times each ingredient was mentioned and where it ranked on individual lists. Many of the re-ranks only list a few ingredients, so it is fitting to combine lists and rely on the “wisdom of the crowd” to get a more complete ranking of many possible ingredients.

As a real-world test of how people’s preferences correspond to sales, we used Strombolini’s New York Pizzeria’s list of its top 10 selling ingredients. Pepperoni, cheese, sausage and mushrooms topped the list, followed by pineapple, bacon, ham, shrimp, onion, and green peppers. All of these ingredients, save for shrimp, are included in the Ranker lists, so we considered the 9 overlapping ingredients and measured how close each user’s preference list was to the pizzeria’s sales list.

To compare lists, we used a standard statistical measure known as Kendall’s tau, which counts how many times we would need to swap two neighboring items in one list (known as a pair-wise swap) before the two lists are identical. A Kendall’s tau of zero means the two lists are exactly the same. The larger the Kendall’s tau value becomes, the further one list is from the other.

The figure shows, using little stick people, the Kendall’s tau distances between users’ lists and the Strombolini’s sales list. The green dot corresponds to a perfect tau of zero, and the red dot is the highest possible tau (reached when two lists are exact opposites of each other). The dotted line is provided as a reference to show how likely each Kendall’s tau value is by chance (that is, how often different Kendall’s tau values occur for random lists of the ingredients). It is clear that there are large differences in how close individual users’ lists came to the sales-based list. It is also clear that many users produced rankings that were quite different from the sales-based list.
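A chance baseline of this kind can be worked out exactly for a short list. The following is a minimal Python sketch, not necessarily how the dotted line in the figure was produced. It assumes every ordering of the 9 shared toppings is equally likely, and the function name `inversion_counts` is our own, chosen for illustration.

```python
from math import factorial


def inversion_counts(n):
    """Return a list c where c[k] is the number of orderings of n items
    that are exactly k pair-wise swaps away from a fixed reference order."""
    counts = [1]  # one item: a single ordering, zero swaps
    for size in range(2, n + 1):
        # Placing the next item can add between 0 and size - 1 new swaps.
        new_counts = [0] * (len(counts) + size - 1)
        for k, ways in enumerate(counts):
            for extra in range(size):
                new_counts[k + extra] += ways
        counts = new_counts
    return counts


N_TOPPINGS = 9  # toppings shared by the Ranker lists and the sales list
counts = inversion_counts(N_TOPPINGS)
total = factorial(N_TOPPINGS)  # number of possible orderings
chance = [c / total for c in counts]  # probability of each tau value

max_tau = len(counts) - 1  # distance between two exactly reversed lists
mean_tau = sum(k * p for k, p in enumerate(chance))

print(f"largest possible tau: {max_tau}")
print(f"average tau for a random list: {mean_tau:.1f}")
print(f"chance of a random list scoring 7 or better: {sum(chance[:8]):.4f}")
```

For 9 items the largest possible distance is 9 × 8 / 2 = 36, and a random list lands, on average, halfway between a perfect match and a complete reversal.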

When we combined all of the users’ lists into a single crowd ranking, the combined list came out to be: cheese, pepperoni, bacon, mushrooms, sausage, onion, pineapple, ham, and green peppers. This is a Kendall’s tau of 7 pair-wise swaps from the Strombolini list, as shown in the figure by the blue dot representing the crowd. This means the combined list is closer to the sales list than the lists of all but one of the individual users.
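The distance of 7 can be checked by hand or with a few lines of code. The sketch below is an illustration in Python rather than the code used for the analysis. It relies on the fact that the minimum number of neighboring swaps equals the number of item pairs that the two lists place in opposite order. The function name `kendall_tau_distance` is our own.

```python
from itertools import combinations


def kendall_tau_distance(list_a, list_b):
    """Count the pairs of items that list_a and list_b order differently.

    Both lists must contain the same items, each exactly once.
    """
    position_in_b = {item: i for i, item in enumerate(list_b)}
    ranks = [position_in_b[item] for item in list_a]
    # A pair is out of order when the earlier item in list_a comes later in list_b.
    return sum(
        1 for i, j in combinations(range(len(ranks)), 2) if ranks[i] > ranks[j]
    )


# Strombolini's top sellers, with shrimp removed because it is not in the Ranker lists.
sales = ["pepperoni", "cheese", "sausage", "mushrooms", "pineapple",
         "bacon", "ham", "onion", "green peppers"]

# The combined "wisdom of the crowd" list reported above.
crowd = ["cheese", "pepperoni", "bacon", "mushrooms", "sausage",
         "onion", "pineapple", "ham", "green peppers"]

print(kendall_tau_distance(crowd, sales))  # prints 7
```

The seven disagreements are easy to read off: cheese ahead of pepperoni; bacon ahead of mushrooms, sausage, and pineapple; mushrooms ahead of sausage; and onion ahead of pineapple and ham.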

Our “wisdom of the crowd” analysis, combining all the users’ lists, used the same approach we previously applied to predicting celebrity deaths using Ranker data. It is a “Top-N” variant of the psychological approach developed in our work modeling decision-making and individual differences for ranking lists, and has the nice property of naturally incorporating individual differences.

This analysis is an early example of a couple of interesting ideas. One is that it is possible to extract relatively complete information from a set of incomplete opinions provided by many people. The other is that this combined knowledge can be compared to, and possibly be predictive of, real-world ground truths, like whether more pizzas have bacon or green peppers on them. It may never begin to explain, however, why someone would waste champagne-soaked caviar as a pizza topping.