by    in Opinion Graph, Rankings

Ranky Goes to Washington?

Something pretty cool happened last week here at Ranker, and it had nothing to do with the season premiere of the “Big Bang Theory”, which we’re also really excited about. Cincinnati’s number one digital paper used our widget to create a votable list of ideas mentioned in Cincinnati Mayor John Cranley’s first State of the City address. As of right now, 1,958 voters have cast 5,586 votes on the list of proposals from Mayor Cranley (not surprisingly, “fixing streets” ranks higher than the “German-style beer garden” that’s apparently also an option).

Now, our widget is used by thousands of websites to either take one of our votable lists or create their own and embed it on their site, but this was the very first time Ranker was used to directly poll people on public policy initiatives.

Here’s why we’re loving this idea: we feel confident that Ranker lists are the most fun and reliable way to poll people at scale about a list of items within a specific context. That’s what we’ve been obsessing about for the past 6 years. But we also think this could lead to a whole new way for people to weigh in in fairly  large numbers on complex public policy issues on an ongoing basis, from municipal budgets to foreign policy. That’s because Ranker is very good at getting a large number of people to cast their opinion about complex issues in ways that can’t be achieved at this scale through regular polling methods (nobody’s going to call you at dinner time to ask you to rank 10 or 20 municipal budget items … and what is “dinner time” these days, anyway?).  It may not be a representative sample, but it may be the only sample that matters, given that the average citizen of Cincinnati will have no idea about the details within the Mayor’s speech and likely will give any opinion simply to move a phone survey conversation along about a topic they know little about.

Of course, the democratic process is the best way to get the best sample (there’s little bias when it’s the whole friggin voting population!) to weigh in on public policy as a whole. But elections are very expensive and infrequent, and the focus of their policy debates is the broadest possible relative to their geographical units, meaning that micro-issues like these will often get lost in the same tired partisan debates.

Meanwhile, society, technology, and the economy no longer operate on cycles consistent with election cycles: the rate and breadth of societal change is such that the public policy environment specific to an election quickly becomes obsolete, and new issues quickly need sorting out as they emerge, something our increasingly polarized legislative processes have a hard time doing.

Online polls are an imperfect, but necessary, way to evaluate public policy choices on an ongoing basis. Yes, they are susceptible to bias, but good statistical models can overcome a lot of such bias and in a world where the response rates for telephone polls continue to drop, there simply isn’t an alternative.  All polling is becoming a function of statistical modeling applied to imperfect datasets.  Offline polls are also expensive, and that cost is climbing as rapidly as response rates are dropping. A poll with a sample size of 800 can cost anywhere between $25,000 and $50,000 depending on the type of sample and the response rate.  Social media is, well, very approximate. As we’ve covered elsewhere in this blog, social media sentiment is noisy, biased, and overall very difficult to measure accurately.

In comes Ranker. The cost of that Cincinnati.com Ranker widget? $0. Its sample size? Nearly 2,000 people, or roughly 2 to 4x the average sample size of current political polls. Ranker is also the best way to get people to quickly and efficiently express a meaningful opinion about a complex set of issues, and we have collected thousands of precise opinions about conceptually complex topics like the scariest diseases and the most important life goals by making the act of providing opinions entertaining, within a context that makes simple actions meaningful.
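To make the comparison concrete, here is a small back-of-the-envelope sketch in Python. It uses only the figures quoted in this post (the 800-person poll costing $25,000 to $50,000, and the 1,958 voters and 5,586 votes on the Cincinnati list); the function name is ours and purely illustrative.

```python
# Figures quoted in the post; nothing here is new data.
OFFLINE_SAMPLE = 800
OFFLINE_COST_LOW, OFFLINE_COST_HIGH = 25_000, 50_000
WIDGET_VOTERS = 1_958
WIDGET_VOTES = 5_586


def cost_per_respondent(total_cost: float, respondents: int) -> float:
    """Average cost of one completed response."""
    return total_cost / respondents


low = cost_per_respondent(OFFLINE_COST_LOW, OFFLINE_SAMPLE)
high = cost_per_respondent(OFFLINE_COST_HIGH, OFFLINE_SAMPLE)
size_ratio = WIDGET_VOTERS / OFFLINE_SAMPLE
votes_per_voter = WIDGET_VOTES / WIDGET_VOTERS

print(f"Offline poll: ${low:.2f} to ${high:.2f} per respondent")
print(f"Widget sample is {size_ratio:.2f}x the size of an 800-person poll")
print(f"Each widget voter cast about {votes_per_voter:.2f} votes")
```

Keep in mind that a larger sample is not automatically a more representative one, which is why the statistical modeling mentioned above still matters.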

Politics is the art of the possible, and we shouldn’t let the impossibility of perfect survey precision preclude the possibility of using technology to improve civic engagement at scale.  If you are an organization seeking to poll public opinion about a particular set of issues that may work well in a list format, we’d invite you to contact us.

– Ravi Iyer

Can Colbert bring young Breaking Bad Fans to The Late Show?

I have to admit that I thought it was a joke at first when I heard the news that Stephen Colbert is leaving The Colbert Report and is going to host the Late Show, currently hosted by David Letterman.  The fact that he won’t be “in character” in the new show makes it more intriguing, even as it brings tremendous change to my entertainment universe.  However, while it will take some getting used to, looking at Ranker data on the two shows reveals how the change really does make sense for CBS.

Despite the ire of those who disagree with The Colbert Report’s politics, CBS is definitely addressing a need to compete better for younger viewers, who are less likely to watch TV on the major networks.  Ranker users tend to be in the 18-35 year old age bracket and The Colbert Report ranks higher than the Late Show on most every list that they both are on including the Funniest TV shows of 2012 (19 vs. 28), Best TV Shows of All-Time (186 vs. 197), and Best TV Shows of Recent Memory (37 vs. 166).  Further, people who tend to like The Colbert Report also seem to like many of the most popular shows around like Breaking Bad, Mad Men, Game of Thrones, and 30 Rock.  In contrast, correlates of the Late Show include older shows like The Sopranos and 60 Minutes.  There is some overlap as fans of both shows like The West Wing and The Daily Show, indicating that Colbert may be able to appeal to current fans as well as new audiences.

Colbert Can Expand Late Show's Audience to New Groups, yet Retain Many Current Fans.

I’ll be sad to see “Stephen Colbert” the character go.  But it looks like my loss is CBS’ gain.

– Ravi Iyer

Lists are the Best way to get Opinion Graph Data: Comparing Ranker to State & Squerb

I was recently forwarded an article about Squerb, which shares an opinion we have long agreed with.  Specifically…

“Most sites rely on simple heuristics like thumbs-up, ‘like’ or 1-5 stars,” stated Squerb founder and CEO Chris Biscoe. He added that while those tools offer a quick overview of opinion, they don’t offer much in the way of meaningful data.

It reminds me a bit of State, another company building an opinion graph that connects more specific opinions to specific objects in the world.  They too are built upon the idea that existing sources of big data opinions, e.g. mining tweets and Facebook likes, have inherent limitations.  From this Wired UK article:

Doesn’t Twitter already provide a pretty good ‘opinion network’? Alex thinks not. “The opinions out there in the world today represent a very thin slice. Most people are not motivated to express their opinion and the opinions out there for the most part are very chaotic and siloed. 98 percent of people never get heard,” he told Wired.co.uk.

I think more and more people who try to parse Facebook and Twitter data for deeper Netflix AltGenre-like opinions will realize the limitations of such data, and attempt to collect better opinion data.  In the end, I think collecting better opinion data will inevitably involve the list format that Ranker specializes in.  Lists have a few important advantages over the methods that Squerb and State are using, which include slick interfaces for tagging semantic objects with adjectives.  The advantages of lists include:

  • Lists are popular and easily digestible.  There is a reason why every article on Cracked is a list.  Lists appeal to the masses, which is precisely the audience that Alex Asseily is trying to reach on State.  To collect mass opinions, one needs a site that appeals to the masses, which is why Ranker has focused on growth as a consumer destination site that currently collects millions of opinions.
  • Lists provide the context of other items.  It’s one thing to think that Army of Darkness is a good movie.  But how does it compare to other Zombie Movies?  Without context, it’s hard to compare people’s opinions as we all have different thresholds for different adjectives.  The presence of other items lets people consider alternatives they may not have considered in a vacuum and allows better interpretation of non-response.
  • Lists provide limits to what is being considered.  For example, consider the question of whether Tom Cruise is a good actor.  Is he one of the Best Actors of All-time?  One of the Best Action Stars?  One of the Best Actors Working Today?  Ranker data shows that people’s answers usually depend on the context (e.g. Tom Cruise gets a lot of downvotes as one of the best actors of all-time, but is indeed considered one of the best action stars).
  • Lists are useful, especially in a mobile friendly world.

In short, collecting opinions using lists produces both more data and better data.  I welcome companies that seek to collect semantic opinion data as the opportunity is large and there are network effects such that each of our datasets is more valuable when other datasets with different biases are available for mashups.  As others realize the importance of opinion graphs, we likely will see more companies in this space and my guess is that many of these companies will evolve along the path that Ranker has taken, toward the list format.

– Ravi Iyer

by    in About Ranker, Opinion Graph, Pop Culture, Rankings

Ranker’s Rankings API Now in Beta

Increasingly, people are looking for specific answers to questions as opposed to webpages that happen to match the text they type into a search engine.  For example, if you search for the capital of France or the birthdate of Leonardo Da Vinci, you get a specific answer.  However, the questions that people ask are increasingly about opinions, not facts, as people are understandably more interested in what the best movie of 2013 was, as opposed to who the producer for Star Trek: Into Darkness was.

Enter Ranker’s Rankings API, which is now in beta, as we’d love the input of potential users of our API to help improve it.  Our API returns aggregated opinions about specific movies, people, tv shows, places, etc.  As an input, we can take a Wikipedia, Freebase, or Ranker ID.  For example, below is a request for information about Tom Cruise, using his Ranker ID from his Ranker page (contact us if you want to use other IDs to access).
http://api.ranker.com/rankings/?ids=2257588&type=RANKER

In the response to this request, you’ll get a set of Rankings for the requested object, including a set of list names (e.g. "listName":"The Greatest 80s Teen Stars"), list urls (e.g. "listUrl":"http://www.ranker.com/crowdranked-list/45-greatest-80_s-teen-stars" – note that the domain, www.ranker.com, is implied), item names (e.g. "itemName":"Tom Cruise"), position of the item on this list (e.g. "position":21), number of items on the list (e.g. "numItemsOnList":70), the number of people who have voted on this list (e.g. "numVoters":1149), the number of positive votes for this item (e.g. "numUpVotes":245) vs. the number of negative votes (e.g. "numDownVotes":169), and the Ranker list id (e.g. "listId":584305).  Note that results are cached so they may not match the current page exactly.

Here is a snippet of the response for Tom Cruise.

[ { "itemName" : "Tom Cruise",
"listId" : 346881,
"listName" : "The Greatest Film Actors & Actresses of All Time",
"listUrl" : "http://www.ranker.com/crowdranked-list/the-greatest-film-actors-and-actresses-of-all-time",
"numDownVotes" : 306,
"numItemsOnList" : 524,
"numUpVotes" : 285,
"numVoters" : 5305,
"position" : 85
},
{ "itemName" : "Tom Cruise",
"listId" : 542455,
"listName" : "The Hottest Male Celebrities",
"listUrl" : "http://www.ranker.com/crowdranked-list/hottest-male-celebrities",
"numDownVotes" : 175,
"numItemsOnList" : 171,
"numUpVotes" : 86,
"numVoters" : 1937,
"position" : 63
},
{ "itemName" : "Tom Cruise",
"listId" : 679173,
"listName" : "The Best Actors in Film History",
"listUrl" : "http://www.ranker.com/crowdranked-list/best-actors",
"numDownVotes" : 151,
"numItemsOnList" : 272,
"numUpVotes" : 124,
"numVoters" : 1507,
"position" : 102
}

…CLIPPED….
]

What can you do with this API?  Consider this page about Tom Cruise from Google’s Knowledge Graph.  It tells you his children, his spouse(s), and his movies.  But our API will tell you that he is one of the hottest male celebrities, an annoying A-List actor, an action star, a short actor, and an 80s teen star.  His name comes up in discussions of great actors, but he tends to get more downvotes than upvotes on such lists, and even shows up on lists of “overrated” actors.
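As a rough illustration of how a client might work with this data, here is a minimal Python sketch. It builds a request URL in the shape shown above and summarizes the three entries from the response snippet (trimmed to the fields used, with listUrl omitted). The helper names build_request_url and summarize are our own for this example and are not part of the API; the sketch parses an embedded copy of the snippet rather than calling the live endpoint.

```python
import json
from urllib.parse import urlencode

API_BASE = "http://api.ranker.com/rankings/"  # endpoint quoted in this post


def build_request_url(item_id: int, id_type: str = "RANKER") -> str:
    """Build a request URL in the same shape as the example request."""
    return f"{API_BASE}?{urlencode({'ids': item_id, 'type': id_type})}"


# The three entries from the response snippet above, trimmed for brevity.
SAMPLE_RESPONSE = """
[
  {"itemName": "Tom Cruise", "listId": 346881,
   "listName": "The Greatest Film Actors & Actresses of All Time",
   "numDownVotes": 306, "numItemsOnList": 524,
   "numUpVotes": 285, "numVoters": 5305, "position": 85},
  {"itemName": "Tom Cruise", "listId": 542455,
   "listName": "The Hottest Male Celebrities",
   "numDownVotes": 175, "numItemsOnList": 171,
   "numUpVotes": 86, "numVoters": 1937, "position": 63},
  {"itemName": "Tom Cruise", "listId": 679173,
   "listName": "The Best Actors in Film History",
   "numDownVotes": 151, "numItemsOnList": 272,
   "numUpVotes": 124, "numVoters": 1507, "position": 102}
]
"""


def summarize(entry: dict) -> dict:
    """Reduce one ranking entry to a few easy-to-read numbers."""
    up, down = entry["numUpVotes"], entry["numDownVotes"]
    return {
        "list": entry["listName"],
        "rank": f'{entry["position"]} of {entry["numItemsOnList"]}',
        "upvote_share": up / (up + down),
        "net_votes": up - down,
    }


summaries = [summarize(entry) for entry in json.loads(SAMPLE_RESPONSE)]

print(build_request_url(2257588))
for s in summaries:
    print(f'{s["list"]}: rank {s["rank"]}, '
          f'{s["upvote_share"]:.0%} upvotes, net {s["net_votes"]:+d}')
```

On all three of these lists the net vote count is negative, which is the "more downvotes than upvotes" pattern described above.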

We can provide this information, not just about actors, but also about politicians, books, places, movies, tv shows, bands, athletes, colleges, brands, food, beer, and more.  We will tend to have more information about entertainment related categories, for now, but as the domains of our lists grow, so too will the breadth of opinion related information available from our API.

Our API is free and no registration is required, though we would request that you provide links and attributions to the Ranker lists that provide this data.  We likely will add some free registration at some point.  There are currently no formal rate limits, though there are obviously practical limits so please contact us if you plan to use the API heavily as we may need to make changes to accommodate such usage.  Please do let me know (ravi a t ranker) your experiences with our API and any suggestions for improvements as we are definitely looking to improve upon our beta offering.

– Ravi Iyer

Ranker Opinion Graph: the Best Froyo Toppings

It’s hard to resist a cold treat on a hot summer afternoon, and frozen yogurt shops with their array of flavors and toppings have a little something for everyone. Once you’re done agonizing over whether you want New York cheesecake or wild berry froyo (and trying a sample of each at least twice), it’s time for the topping bar. But which topping should you choose? We asked people to vote for their favorite frozen yogurt toppings on Ranker from a list of 32 toppings, and they responded with over 7,500 votes.

The Top 5 Frozen Yogurt Toppings (by number of upvotes):
1. Oreo (235 votes)
2. Strawberries (225 votes)
3. Brownie bits (223 votes)
4. Hot fudge (216 votes)
5. Whipped cream (201 votes)

But let’s be honest, who can choose just ONE topping for their froyo? Using Gephi and data from Ranker’s Opinion Graph, we ran a cluster analysis on people’s favorite froyo topping votes to determine which toppings people like to eat together. In the graph, larger circles mean a topping was more often liked alongside other toppings. Most of the versatile toppings were either a syrup (like strawberry sauce) or chocolate candy (like Reese’s Pieces).

The 10 Most Versatile Froyo Toppings:

1. Strawberry sauce
2. Snickers
3. Magic Shell
4. White Chocolate chips
5. Peanut butter chips
6. Butterscotch syrup
7. Candies Nestle Butterfinger Bar
8. Reese’s Pieces
9. M&Ms
10. Brownie bits
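
For readers who want to try this kind of analysis, the co-voting idea behind the versatility ranking can be sketched in a few lines of Python. This is a minimal sketch under our own assumptions: the ballots below are hypothetical, and we treat "versatility" as weighted degree (the total number of co-upvotes a topping shares with every other topping), which may not be exactly how the circle sizes in the Gephi graph were computed.

```python
from collections import Counter
from itertools import combinations

# Hypothetical ballots: each set holds the toppings one voter upvoted.
ballots = [
    {"Oreo", "Brownie bits", "Hot fudge"},
    {"Strawberries", "Strawberry sauce", "Sliced almonds"},
    {"Strawberry sauce", "Brownie bits", "Snickers"},
    {"Snickers", "Strawberry sauce", "Hot fudge"},
]


def co_upvote_edges(ballots):
    """Count how often each pair of toppings is upvoted by the same voter."""
    edges = Counter()
    for ballot in ballots:
        # Sorting gives every pair one canonical (a, b) key.
        for a, b in combinations(sorted(ballot), 2):
            edges[(a, b)] += 1
    return edges


def versatility(edges):
    """Weighted degree: total co-upvotes a topping shares with all others."""
    degree = Counter()
    for (a, b), weight in edges.items():
        degree[a] += weight
        degree[b] += weight
    return degree


edges = co_upvote_edges(ballots)
degree = versatility(edges)
for topping, score in degree.most_common(3):
    print(topping, score)
```

The resulting edge list (topping pair plus weight) is the kind of table that a graph tool such as Gephi can import for layout and clustering.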

 

Using the modularity clustering tool in Gephi, we were then able to sort toppings into groups based on which toppings people were most likely to upvote together. We identified 4 kinds of froyo topping lovers:

1. Fruit and Nuts (Blue): This cluster is all about the fruits and nuts. These people love Strawberry sauce, sliced almonds, and Maraschino cherries.

2. Chocolate (Purple): This cluster encompasses all things chocolate. These people love Magic Shell, Brownie bits, and chocolate syrup.

3. Sugar candy (Green): This cluster is made up of pure sugar. These people love gummy worms, Rainbow sprinkles, and Skittles.

4. Salty and Cake (Red): This cluster encompasses cake bites and toppings that have a salty taste to them. These people like Snickers, Cheesecake bits, and Caramel Syrup.
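
Modularity clustering looks for groupings where co-votes fall inside groups more often than chance would predict. The sketch below scores a given grouping with the standard modularity measure, Q = sum over groups of (internal edge weight / total edge weight) minus (group degree / twice the total edge weight) squared. It only scores a partition; it does not reproduce the search that Gephi performs to find one. The edge weights and the two-group assignment are hypothetical.

```python
from collections import defaultdict


def modularity(edges, community):
    """Modularity Q of a partition of an undirected weighted graph.

    edges: {(node_a, node_b): weight}, each pair listed once.
    community: {node: community_label}.
    """
    total = sum(edges.values())          # total edge weight, m
    internal = defaultdict(float)        # edge weight inside each community
    degree_sum = defaultdict(float)      # summed node degree per community
    for (a, b), weight in edges.items():
        degree_sum[community[a]] += weight
        degree_sum[community[b]] += weight
        if community[a] == community[b]:
            internal[community[a]] += weight
    return sum(
        internal[c] / total - (degree_sum[c] / (2 * total)) ** 2
        for c in degree_sum
    )


# Hypothetical co-upvote graph: two tight groups joined by a single edge.
edges = {
    ("Strawberry sauce", "Sliced almonds"): 1,
    ("Sliced almonds", "Maraschino cherries"): 1,
    ("Strawberry sauce", "Maraschino cherries"): 1,
    ("Magic Shell", "Brownie bits"): 1,
    ("Brownie bits", "Chocolate syrup"): 1,
    ("Magic Shell", "Chocolate syrup"): 1,
    ("Maraschino cherries", "Magic Shell"): 1,
}
fruit = {"Strawberry sauce", "Sliced almonds", "Maraschino cherries"}
two_groups = {n: ("fruit" if n in fruit else "chocolate")
              for pair in edges for n in pair}
one_group = {n: "all" for n in two_groups}

q_split = modularity(edges, two_groups)
q_single = modularity(edges, one_group)
print(f"two clusters: Q = {q_split:.3f}; one cluster: Q = {q_single:.3f}")
```

A higher Q means the grouping captures more of the co-voting structure, so the two-cluster split scores better here than lumping everything together.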

 

Some additional thoughts:

  • Banana was a strange topping that was only linked with Snickers.
  • People who like nuts like both fruit and items from the salty category.
  • People who like blueberries only like other fruits.
  • People who like sugar items like gummy worms also like chocolate, but don’t particularly like fruit.

 

– Kate Johnson

How Netflix’s AltGenre Movie Grammar Illustrates the Future of Search Personalization

I recently got sent this Atlantic article on how Netflix reverse engineered Hollywood by a few contacts, and it happens to mirror my long term vision for how Ranker’s data fits into the future of search personalization.  Netflix’s goal, to put “the right title in front of the right person at the right time,” is very similar to what Apple, Bing, Google, and Facebook are attempting to do with regards to personalized contextual search.  Rather than you having to type in “best kitchen gadgets for mothers”, applications like Google Now and Cue (bought by Apple) hope to eventually be able to surface this information to you in real time, knowing not only when your mother’s birthday is, but also that you tend to buy kitchen gadgets for her, and knowing what the best rated kitchen gadgets that aren’t too complex and are in your price range happen to be.  If the application was good enough, a lot of us would trust it to simply charge our credit card and send the right gift.  But obviously we are a long way from that reality.

Netflix’s altgenre movie grammar (e.g. Irreverent Werewolf Movies Of The 1960s) gives us a glimpse of the level of specificity that would be required to get us there.  Consider what you need to know to buy the right gift for your mom.  You aren’t just looking for a kitchen gadget, but one with specific attributes.  In altgenre terminology, you might be looking for “best simple, beautifully designed kitchen gadgets of 2014 that cost between $25 and $100” or “best kitchen gadgets for vegetarian technophobes”.  Google knows that simple text matching is not going to get it the level of precision necessary to provide such answers, which is why semantic search, where the precise meaning of pages is mapped, has become a strategic priority.

However, the universe of altgenre equivalents in the non-movie world is nearly endless (e.g. Netflix has thousands of ways just to classify movies), which is where Ranker comes in, as one of the world’s largest sources for collecting explicit cross-domain altgenre-like opinions.  Semantic data from sources like Wikipedia, DBpedia, and Freebase can help you put together factual altgenres like “of the 60s” or “that starred Brad Pitt”, but you need opinion ratings to put together subtler data like “guilty pleasures” or “toughest movie badasses”.  Netflix’s success is proof of the power of this level of specificity in personalizing movies.  Consider how they produced this knowledge: not through running machine learning algorithms on their endless stream of user behavior data, but rather by soliciting explicit ratings along these dimensions by paying “people to watch films and tag them with all kinds of metadata” using a “36-page training document that teaches them how to rate movies on their suggestive content, goriness, romance levels, and even narrative elements like plot conclusiveness.”  Some people may think that with enough data, TripAdvisor should be able to tell you which cities are “cool”, but big data is not always better data.  Most data scientists will tell you the importance of defining the features in any recommendation task (see this article for technical detail on this), rather than assuming that a large amount of data will reveal all of the right dimensions.  The wrong level of abstraction can make prediction akin to trying to predict who will win the Super Bowl by knowing the precise position and status of every cell in every player on every NFL team.  Netflix’s system allows them to make predictions at the right level of abstraction.

The future of search needs a Netflix grammar that goes beyond movies.  It needs to be able to understand not only which movies are dark versus gritty, but also which cities are better babymoon destinations versus party cities and which rock singers are great vocalists versus great frontmen.  Ranker lists actually have a similar grammar to Netflix movies, except that we apply this grammar beyond the movie domain.  In a subsequent post, I’ll go into more detail about this, but suffice it to say for now that I’m hopeful that our data will eventually play a similar role in the personalization of non-movie content that Netflix’s microtagging plays in film recommendations.

– Ravi Iyer

 

Why Topsy/Twitter Data may never predict what matters to the rest of us

Recently Apple paid a reported $200 million for Topsy and some speculate that the reason for this purchase is to improve recommendations for products consumed using Apple devices, leveraging the data that Topsy has from Twitter.  This makes perfect sense to me, but the utility of Twitter data in predicting what people want is easy to overstate, largely because people often confuse bigger data with better data.  There are at least 2 reasons why there is a fairly hard ceiling on how much Twitter data will ever allow one to predict about what regular people want.

1.  Sampling – Twitter has a ton of data, with daily usage of around 10%.  Sample size isn’t the issue here as there is plenty of data, but rather the people who use Twitter are a very specific set of people.  Even if you correct for demographics, the psychographic of people who want to share their opinion publicly and regularly (far more people have heard of Twitter than actually use it) is way too unique to generalize to the average person, in the same way that surveys of landline users cannot be used to predict what psychographically distinct cellphone users think.

2. Domain Comprehensiveness – The opinions that people share on Twitter are biased by the medium, such that they do not represent the spectrum of things many people care about.  There are tons of opinions on entertainment, pop culture, and links that people want to promote, since they are easy to share quickly, but very little information on people’s important life goals or the qualities we admire most in a person or anything where people’s opinions are likely to be more nuanced.  Even where we have opinions in those domains, they are likely to be skewed by the 140 character limit.

Twitter (and by extension, companies that use their data like Topsy and DataSift) has a treasure trove of information, but people working on next generation recommendations and semantic search should realize that it is a small part of the overall puzzle given the above limitations.  The volume of information gives you a very precise measure of a very specific group of people’s opinions about very specific things, leaving out the vast majority of people’s opinions about the vast majority of things.  When you add in the bias introduced by analyzing 140 character natural language, there is a great deal of variance in recommendations that likely will have to be provided by other sources.

At Ranker, we have similar sampling issues, in that we collect much of our data at Ranker.com, but we are actively broadening our reach through our widget program, which now collects data on thousands of partner sites.  Our ranked list methodology certainly has bias too, which we attempt to mitigate by combining voting and ranking data.  The key is not in the volume of data, but rather in the diversity of data, which helps mitigate the bias inherent in any particular sampling/data collection method.

Similarly, people using Twitter data would do well to consider issues of data diversity and not be blinded by large numbers of users and data points.  Certainly Twitter is bound to be a part of understanding consumer opinions, but the size of the dataset alone will not guarantee that it will be a central part.  Given these issues, either Twitter will start to diversify the ways that it collects consumer sentiment data or the best semantic search algorithms will eventually use Twitter data as but one narrowly targeted input of many.

– Ravi Iyer

by    in Opinion Graph, Pop Culture, Rankings

Examining Regional Voting Differences with Ranker’s Polling Widget

Ranker has a new program where we offer a polling widget to partner sites who want the engagement of a poll in list format (as opposed to the standard radio button poll).  Currently, sites that use our poll (e.g. TheNextWeb or CBC) are seeing 20-50% of visitors engaging in the poll and an increase in returning visitors who want to keep track of results.  We also give partners prominent placement on Ranker.com (details of that here), but a benefit that is less obvious is the potential insights from one’s users that one can gain from the data behind a poll.  To illustrate what is possible, I’m going to use data from one of our regular widget users, Phish.net, who posted this poll on Phish’s best summer concert jams.

One piece of data that Ranker can give partners is a regional breakdown of voters.  Unsurprisingly, there were strong regional differences in voting behavior with voters from the northeast often choosing a jam from their New Jersey show, voters from the west coast often choosing a jam from their Hollywood Bowl show, voters from the south often choosing a jam from their Maryland show, voters from the midwest often choosing a jam from their Chicago show, and voters from the mountain region often choosing a jam from their show at The Gorge.  However, the interesting thing to me was that the leading jam in every region was Tweezer – Lake Tahoe from July 31st.  As someone who believes that better crowdsourced answers are produced by aggregating across bias and who has only been to 1 Phish concert, I’m definitely going to have to check out this jam.  Perhaps the answer is obvious to more experienced Phish fans, but the results of the poll are certainly instructive to the more casual music fan who wants a taste of Phish.
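
A regional breakdown like the one described above amounts to grouping votes by region and then comparing the leaders. Here is a minimal Python sketch of that idea. The vote log is hypothetical (only the name Tweezer – Lake Tahoe comes from the actual poll, and the other entries are placeholders), and the function names are ours rather than anything the widget exposes.

```python
from collections import Counter, defaultdict

TAHOE = "Tweezer - Lake Tahoe"

# Hypothetical vote log: (voter region, jam upvoted).
votes = [
    ("northeast", TAHOE), ("northeast", TAHOE), ("northeast", TAHOE),
    ("northeast", "New Jersey show jam"), ("northeast", "New Jersey show jam"),
    ("west coast", TAHOE), ("west coast", TAHOE), ("west coast", TAHOE),
    ("west coast", "Hollywood Bowl show jam"),
    ("west coast", "Hollywood Bowl show jam"),
    ("midwest", TAHOE), ("midwest", TAHOE),
    ("midwest", "Chicago show jam"),
]


def regional_breakdown(votes):
    """Tally upvotes per jam within each region."""
    by_region = defaultdict(Counter)
    for region, jam in votes:
        by_region[region][jam] += 1
    return by_region


def regional_leaders(by_region):
    """Return the single most-upvoted jam in each region."""
    return {region: counts.most_common(1)[0][0]
            for region, counts in by_region.items()}


by_region = regional_breakdown(votes)
leaders = regional_leaders(by_region)

# A jam that leads in every region is a consensus pick across local bias.
consensus = set(leaders.values())
for region, counts in by_region.items():
    print(region, counts.most_common(2))
print("leads everywhere:", consensus if len(consensus) == 1 else "no single jam")
```

The runner-up in each region is where the local favorite shows up, while the overall leader is the one that survives aggregation across regions.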

Below are the results of the poll in graphical format.  Notice how the shows cluster based on venue and geography except for Tweezer – Lake Tahoe which is directly in the center of the graph.

If you’re interested in running a widget poll on your site, the benefits are more clearly spelled out here and you can email us at “widget at ranker.com”.  We’d love to provide similar region based insights for your polls as well.

– Ravi Iyer

 

Why We Still Play Board Games: An Opinion Graph Analysis

It’s hard reading studies about people my age when research scientists haven’t agreed upon a term for us yet. In one study I’m a member of “Gen Y” (lazy), in another I’m from the “iGeneration” (Orwellian), or worse still, a “Millennial” (…no). You beleaguered and cynical 30-somethings had things easy with the “Generation X” thing. Let the record reflect that no one from my generation is even remotely okay with any of these terms. Furthermore, we all collectively check out whenever we hear the term “aughties”.

I’m whining about the nomenclature only because there’s a clear need for distinction between my generation and those who have/will come before/after us. This isn’t just from a cultural standpoint (although calling us “Generation Spongebob” might be the most ubiquitous touchstone you could get), but from a technical one. If this Kaiser Family Foundation study is to be believed (via NYT), 8-18 year olds today are the first to spend the majority of their waking hours interacting with the internet.

Yet despite this monumental change, there are still many childhood staples that have not been forsaken by an increasingly digital generation. One of the most compelling examples of this anomaly lies in board games. In a day and age where Apple is selling two billion apps a month (Apple), companies peddling games for our increasingly elusive away-from-keyboard time are still holding their own. For example, Hasbro’s board-and-card game based revenue grew to $1.19 billion over the course of the last fiscal year (a 2% gain over the previous year).

What drove this growth? Hasbro’s earnings report primarily credits this growth to three products: Magic: The Gathering, Twister, and Battleship. All of these products have been mainstays of their line-up for quite some time (prepare to feel old: if Magic: The Gathering were a child, it could buy booze this year), so what’s compelling people to keep buying? Fortunately, Ranker has some pretty in-depth data on all of these products, based on people who vote on its best board games list, which receives thousands of opinions each month, as well as voting on other Ranker lists.

Twister’s continuous sales were the easiest to explain: users who expressed interest in the game were most likely to be a fan of other board games (Candy Land, Chutes and Ladders, Monopoly and so forth). Twister also correlated with many other programs/products with fairly universal appeal (Friends, Gremlins). This would seem to indicate that the chief reason for Twister’s continued high sales lies in its simplicity and ubiquity. The game is a cultural touchstone for that reason: more than any other game on the list, it’s the one hardest to picture a childhood without.

Battleship’s success lies in the same roots: our data shows great overlap between fans of the game and fans of Mouse Trap, Monopoly, etc. But Battleship has attracted fans of a different stripe: interest in films such as Doom, Independence Day, and Terminator was highly correlated with the game. In all likelihood, this is due to the recent silver-screen adaptation of the game. Although the movie only fared modestly within the United States, the film clearly did propel the game back into the public consciousness, which translated nicely into sales.

Finally, Magic: The Gathering’s success came from support of another nature. Interest in Magic correlated primarily with other role-play and strategy games (Settlers of Catan, Dominion, Heroscape). Simply put, most fans of Magic are likely to enjoy other traditionally “nerdy” games. The large correlation overlap between Magic and other role-playing games is a testament to how voraciously this group consumes these products.

The crowd-sourced information we have here neatly divides the consumers of each game into three pools. With this sort of individualized knowledge, targeting and marketing to each archetype of consumer is a far easier task.
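
For the curious, here is one simple way to measure whether interest in two items goes together. This is only a sketch under stated assumptions: the post does not say which statistic was used for the analysis above, so we swap in the phi coefficient (the Pearson correlation for two yes/no variables), and the upvote data below is hypothetical.

```python
from math import sqrt


def phi_coefficient(x, y):
    """Phi coefficient for two equal-length lists of 0/1 values."""
    n11 = sum(1 for a, b in zip(x, y) if a and b)          # liked both
    n10 = sum(1 for a, b in zip(x, y) if a and not b)      # liked x only
    n01 = sum(1 for a, b in zip(x, y) if not a and b)      # liked y only
    n00 = sum(1 for a, b in zip(x, y) if not a and not b)  # liked neither
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    return (n11 * n00 - n10 * n01) / denom if denom else 0.0


# Hypothetical upvotes from eight voters (1 = upvoted, 0 = did not).
magic      = [1, 1, 1, 1, 0, 0, 0, 0]
catan      = [1, 1, 1, 0, 1, 0, 0, 0]
candy_land = [0, 0, 0, 1, 1, 1, 1, 0]

magic_vs_catan = phi_coefficient(magic, catan)
magic_vs_candy_land = phi_coefficient(magic, candy_land)
print(f"Magic vs. Settlers of Catan: {magic_vs_catan:+.2f}")
print(f"Magic vs. Candy Land: {magic_vs_candy_land:+.2f}")
```

A positive value means fans of one item tend to be fans of the other; a negative value means the audiences rarely overlap.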

– Eamon Levesque

An Opinion Graph of the World’s Beers

One of the strengths of Ranker’s data is that we collect such a wide variety of opinions from users that we can put opinions about a wide variety of subjects into a graph format.  Graphs are useful as they let you go beyond the individual relationships between items and see overall patterns.  In anticipation of Cinco de Mayo, I produced the below opinion graph of beers, based on votes on lists such as our Best World Beers list.  Connections in this graph represent significant correlations between sentiment towards connected beers, which vary in terms of strength.  A layout algorithm (force atlas in Gephi) placed beers that were more related closer to each other and beers that had fewer/weaker connections further apart.  I also ran a classification algorithm that clustered beers according to preference and colored the graph according to these clusters.  Click on the below graph to expand it.

Ranker's Beer Opinion Graph
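
The graph construction described above can be sketched roughly as follows. This is an illustrative sketch, not the exact pipeline behind the graph: the votes are hypothetical, and a fixed cutoff on the Pearson correlation stands in for the significance test, since the details of that test are not given here. Layout and clustering would then happen in a tool like Gephi.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y) if var_x and var_y else 0.0


# Hypothetical votes from six voters: +1 = upvote, -1 = downvote.
votes = {
    "Stone IPA":   [1, 1, 1, -1, -1, -1],
    "Chimay":      [1, 1, 1, -1, -1, 1],
    "Miller Lite": [-1, -1, -1, 1, 1, 1],
}

CUTOFF = 0.5  # assumed stand-in for a significance test

correlations = {
    (a, b): pearson(votes[a], votes[b])
    for a, b in combinations(sorted(votes), 2)
}
# Edges connect beers whose sentiment is positively correlated.
edges = {pair: r for pair, r in correlations.items() if r >= CUTOFF}
# The most negative correlation marks the pair of "most opposite" beers.
most_opposite = min(correlations, key=correlations.get)

for (a, b), r in edges.items():
    print(f"edge: {a} <-> {b} (r = {r:.2f})")
print("most opposite:", most_opposite)
```

The edge weights are what a force-directed layout uses to pull related beers together, so strongly correlated beers end up near each other on the page.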

One of the fun things about graphs is that different people will see different patterns.  Among the things I learned from this exercise are:

  • The opposite of light beer, from a taste perspective, isn’t dark beer.  Rather, light beers like Miller Lite are most opposite craft beers like Stone IPA and Chimay.
  • Coors Light is the light beer that is closest to the mainstream cluster.  Stella Artois, Corona, and Heineken are also reasonable bridge beers between the main cluster and the light beer world.
  • The classification algorithm revealed six main taste/opinion clusters, which I would label: Really Light Beers (e.g. Natural Light), Lighter Mainstream Beers (e.g. Blue Moon), Stout Beers (e.g. Guinness), Craft Beers (e.g. Stone IPA), Darker European Beers (e.g. Chimay), and Lighter European Beers (e.g. Leffe Blonde).  The interesting parts about the classifications are the cases on the edge, such as how Newcastle Brown Ale appeals to both Guinness and Heineken drinkers.
  • Seeing beers graphed according to opinions made me wonder if companies consciously position their beers accordingly.  Is Pyramid Hefeweizen successfully appealing to the Sam Adams drinker who wants a bit of European flavor?  Is Anchor Steam supposed to appeal to both the Guinness drinker and the craft beer drinker?  I’m not sure if I know enough about the marketing of beers to know the answer to this, but I’d be curious if beer companies place their beers in the same space that this opinion graph does.

These are just a few observations based on my own limited beer drinking experience.  I tend to be more of a whiskey drinker, and hope more of you will vote on our Best Tasting Whiskey list, so I can graph that next.  I’d love to hear comments about other observations that you might make from this graph.

– Ravi Iyer
