
Hierarchical Clustering of a Ranker list of Beers

This is a guest post by Markus Pudenz.

Ranker is currently exploring ways to visualize the millions of votes collected on various topics each month. I’ve recently begun using hierarchical cluster analysis to produce taxonomies (also known as dendrograms), and applied these techniques to Ranker’s Best Beers from Around the World list. A dendrogram lets one visualize the relationships in voting patterns (scroll down to see what a dendrogram looks like). Hierarchical clustering breaks the list down into related groups based on users’ voting patterns, grouping together items that the same users voted on similarly. The algorithm is agglomerative, meaning it starts with individual items and combines them iteratively until one large cluster (all of the beers in the list) remains.
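To make the agglomerative idea concrete, here is a minimal sketch in plain Python. It is not the code behind the dendrogram in this post: the vote data is invented, and the choices of Euclidean distance and average linkage are assumptions, since the post does not say which distance or linkage was used.

```python
from itertools import combinations

# Hypothetical votes: one row per beer, one column per voter
# (+1 = Vote Up, -1 = Vote Down, 0 = no vote). Invented for illustration.
votes = {
    "Guinness":          [1, 1, 0, -1, 0],
    "Guinness Original": [1, 1, 0, -1, -1],
    "Stone IPA":         [-1, 0, 1, 1, 1],
    "Chimay Blue":       [0, -1, 1, 1, 0],
}

def distance(a, b):
    """Euclidean distance between two vote vectors (an assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def average_linkage(c1, c2):
    """Mean pairwise distance between the members of two clusters."""
    pairs = [(a, b) for a in c1 for b in c2]
    return sum(distance(votes[a], votes[b]) for a, b in pairs) / len(pairs)

def agglomerate(items):
    """Start with one cluster per item and repeatedly merge the closest pair.

    Returns the merge history as (left cluster, right cluster, height) tuples.
    """
    clusters = [(name,) for name in items]
    merges = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda pair: average_linkage(*pair))
        height = average_linkage(c1, c2)
        # Replace the two merged clusters with their union.
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
        merges.append((c1, c2, height))
    return merges

merges = agglomerate(votes)
for left, right, height in merges:
    print(f"{height:.2f}: {left} + {right}")
```

Each recorded height is the distance at which two clusters were joined, which is what the vertical axis of a dendrogram shows. With this toy data, the two Guinness varieties merge first, at the lowest height.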

Every beer in our dendrogram is related to another at some level, whether in the original cluster or further down the dendrogram. See the height axis on the left side? The lower a cluster sits on the axis, the closer the relationship between its beers. For example, the cluster containing Guinness and Guinness Original is the lowest in this dendrogram, indicating that these two beers have the closest relationship based on the voting patterns. Regarding our list, voters have the option to Vote Up or Vote Down any beer they want. Let’s start at the top of the dendrogram and work our way down.

Hierarchical Clustering of Beer Preferences

Looking at the first split of the clusters, one can observe that the cluster on the right contains beers that would generally be considered well known, including Guinness, Sam Adams, Heineken and Corona. In fact, the cluster on the right includes seven of the top ten beers from the list. The fact that most of our popular beers are in this right cluster indicates a strong order effect, with voters more likely to select popular beers when ranking their favorites. For example, if someone selects a beer that is in the top ten, then another beer they select is also more likely to be in the top ten.

As we examine the right cluster further, the first split divides it into two smaller clusters. In the left cluster, we can see, unsurprisingly, that a drinker who likes Guinness is more likely to vote for another variety of Guinness. This left cluster consists almost entirely of Guinness varieties, with the exception of Murphy’s Irish Stout. The right cluster contains a larger variety of beer makers, including Sam Adams, Stella Artois and Pyramid. Unlike the left cluster, this right cluster contains no stouts. The only brewer in this right cluster with multiple varieties is Sam Adams, with Boston Lager and Octoberfest, meaning drinkers in this cluster were not as brand loyal as those in the left cluster and were more likely to select a beer from a different brewer. Overall, this cluster from the first split in the dendrogram shows a clearly defined divide between drinkers who prefer a heavier beer (stout) and those who prefer lighter beers like lagers, pilseners, pale ales or hefeweizens.

Conversely, drinkers who vote for beers in the left cluster are more likely to vote for other beers that are not as popular; only three of the top ten beers are in this cluster. In addition, because the cluster is larger, its range of beer styles and brewers is more varied than in the right cluster. The left cluster splits into three smaller clusters before splitting further. The second of these clusters is clearly distinct. It consists almost entirely of Belgian-style beers, the only exception being Pliny the Elder, an IPA. La Fin du Monde is a Belgian-style tripel from Quebec, with the remaining brewers from Belgium. One split within this cluster consists entirely of beer varieties from Chimay, indicating a strong relationship; voters who select Chimay are more likely to also select a different style from Chimay when ranking their favorites. Our remaining clusters have a little more variety. Our first cluster, the smallest of the three, has a strong representation from California, with varieties from Stone, Sierra Nevada and Anchor Steam taking four out of six nodes in the cluster. Stone IPA and Stone Arrogant Bastard Ale have the strongest relationship in this cluster. Our third cluster, the largest of the three, has even more variety than the first. We see an especially strong relationship between Hoegaarden and Leffe.

I was also curious as to whether the beers in the top ten were associated with larger or smaller breweries. As the following list shows, there is a fairly even split between the larger conglomerates like AB InBev, Diageo and MillerCoors, and independent breweries like New Belgium and Sierra Nevada.

  1. Guinness (Diageo)
  2. Newcastle (Heineken)
  3. Sam Adams Boston Lager (Boston Beer Company)
  4. Stella Artois (AB InBev)
  5. Fat Tire (New Belgium Brewing Company)
  6. Sierra Nevada Pale Ale (Sierra Nevada Brewing Company)
  7. Blue Moon (MillerCoors)
  8. Stone IPA (Stone Brewing Company)
  9. Guinness Original (Diageo)
  10. Hoegaarden Witbier (AB InBev)

Markus Pudenz

An Opinion Graph of the World’s Beers

One of the strengths of Ranker’s data is that we collect such a wide variety of opinions from users that we can put opinions about many different subjects into a graph format. Graphs are useful because they let you go beyond the individual relationships between items and see overall patterns. In anticipation of Cinco de Mayo, I produced the opinion graph of beers below, based on votes on lists such as our Best World Beers list. Connections in this graph represent significant correlations between sentiment towards the connected beers, and they vary in strength. A layout algorithm (ForceAtlas in Gephi) placed beers that were more related closer to each other and beers that had fewer or weaker connections further apart. I also ran a classification algorithm that clustered beers according to preference, and colored the graph according to these clusters. Click on the graph below to expand it.

Ranker's Beer Opinion Graph
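For readers curious how the connections in an opinion graph like this can be derived, here is a minimal sketch in plain Python. It is not the pipeline used for the graph above: the sentiment scores are invented, and the fixed cutoff of 0.5 is an assumption standing in for a proper significance test. The layout and coloring steps, which were done in Gephi, are not reproduced here.

```python
from itertools import combinations
from math import sqrt

# Hypothetical sentiment scores: one row per beer, one column per voter
# (+1 = liked, -1 = disliked, 0 = no opinion). Invented for illustration.
votes = {
    "Miller Lite": [1, 1, -1, -1, 0, -1],
    "Coors Light": [1, 1, -1, 0, -1, -1],
    "Stone IPA":   [-1, -1, 1, 1, 1, 0],
    "Chimay":      [-1, 0, 1, 1, 0, 1],
}

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Assumed cutoff; a real analysis would test each correlation for significance.
THRESHOLD = 0.5

# Keep an edge only where sentiment towards two beers is strongly correlated.
# The correlation doubles as the edge weight (connection strength).
edges = {}
for beer_a, beer_b in combinations(votes, 2):
    r = pearson(votes[beer_a], votes[beer_b])
    if r >= THRESHOLD:
        edges[(beer_a, beer_b)] = r

for (beer_a, beer_b), weight in edges.items():
    print(f"{beer_a} -- {beer_b}: {weight:.2f}")
```

With this toy data, the two light beers are connected to each other, as are Stone IPA and Chimay, while no edge links the light beers to the craft beers. An edge list like this is the kind of input a layout tool needs to pull related beers together.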

One of the fun things about graphs is that different people will see different patterns.  Among the things I learned from this exercise are:

  • The opposite of light beer, from a taste perspective, isn’t dark beer. Rather, light beers like Miller Lite are most opposite to craft beers like Stone IPA and Chimay.
  • Coors Light is the light beer that is closest to the mainstream cluster. Stella Artois, Corona, and Heineken are also reasonable bridge beers between the main cluster and the light beer world.
  • The classification algorithm revealed six main taste/opinion clusters, which I would label: Really Light Beers (e.g. Natural Light), Lighter Mainstream Beers (e.g. Blue Moon), Stout Beers (e.g. Guinness), Craft Beers (e.g. Stone IPA), Darker European Beers (e.g. Chimay), and Lighter European Beers (e.g. Leffe Blonde). The interesting parts of the classifications are the cases on the edge, such as how Newcastle Brown Ale appeals to both Guinness and Heineken drinkers.
  • Seeing beers graphed according to opinions made me wonder if companies consciously position their beers accordingly. Is Pyramid Hefeweizen successfully appealing to the Sam Adams drinker who wants a bit of European flavor? Is Anchor Steam supposed to appeal to both the Guinness drinker and the craft beer drinker? I’m not sure I know enough about the marketing of beers to answer this, but I’d be curious whether beer companies place their beers in the same space that this opinion graph does.

These are just a few observations based on my own limited beer drinking experience.  I tend to be more of a whiskey drinker, and hope more of you will vote on our Best Tasting Whiskey list, so I can graph that next.  I’d love to hear comments about other observations that you might make from this graph.

– Ravi Iyer


The List of the Day Off

It’s Labor Day once again! For those of you reading this outside of the US, a bit of background. Labor Day was originally a celebration held by labor unions on September 5, 1882. (It was made a national holiday in 1894 as an olive branch to unions, following the deaths of workers during the Pullman Strike.)

Today, it is largely recognized in the US by giving employees a Monday off. Even employees who may not actually deserve a day off, like these nearly unemployable video game characters.

Think we’re being too hard on Mario? Consider this: He knows all these pipes are broken, and are warping him to another dimension instead of carrying water and sewage where they’re supposed to go, but he never bothers to fix them. Yet he still seems to have a lot of time left over for parties and go-kart rides…

Labor Day is also universally recognized as the end of the “summer” season, and therefore the last time you can acceptably leave the house in white pants until May of the following year.

NOTE: Wearing a sport coat over a futuristic jumpsuit is acceptable only on April Fool’s Day, Halloween and for a few hours on Yom Kippur.

On Ranker, Americans recognized the holiday much as they did in real life. First, by voting on their favorite places to eat barbecue. (This list has actually become a real nail-biter with some later additions rocketing up the charts. Will Murphysboro, Illinois’ 17th Street Bar and Grill overtake Moe’s Original Bar B Que?)

The weekend also saw a lot of activity on the Best Tasting Light Beers list (Less Filling!), in which 27 nominated brews are duking it out. This was particularly surprising, as none of us at Ranker were aware that light beer had a taste.

Yes, there’s nothing more satisfying on a hot day than a tall, frothy mug of Hop Water.

As this weekend also marks the symbolic end of Summer 2011, Ranker users enjoyed taking a look back at some of the season’s hottest hit songs. Adele’s holding strong in the top spot with “Rolling in the Deep,” which has, as of today, received not a single downvote.

The next 5 items on the list? Smarmy, winking, ironic hipster covers of “Rolling in the Deep.”