A Psychographic Interests Platform

In today’s fragmented world, psychographic profiling is more important to marketers than ever. Yet the art of figuring out how to effectively reach and convert audiences remains a daunting task. Ranker Insights brings precision and depth to what has typically been the very “fuzzy” exercise of psychographic profiling.

Each month over 35 million people visit Ranker to cast their votes on thousands of online polls about Films, Celebrities, TV, Music, Sports, and more, providing a treasure trove of proprietary self-reported preference data across 1.1 million interests.

Powered by Ranker’s unprecedented data collection engine, the Ranker Insights platform was developed to provide data-driven audience insights, both at scale and in precise context. Contact us to learn more about how we can work with you on a custom API basis.

Ranker Insights has made a portion of its data available for free here.

Can Colbert bring young Breaking Bad Fans to The Late Show?

I have to admit that I thought it was a joke at first when I heard the news that Stephen Colbert is leaving The Colbert Report and is going to host the Late Show, currently hosted by David Letterman.  The fact that he won’t be “in character” in the new show makes it more intriguing, even as it brings tremendous change to my entertainment universe.  However, while it will take some getting used to, looking at Ranker data on the two shows reveals how the change really does make sense for CBS.

Despite the ire of those who disagree with The Colbert Report’s politics, CBS is definitely addressing a need to compete better for younger viewers, who are less likely to watch TV on the major networks. Ranker users tend to be in the 18-35 year old age bracket, and The Colbert Report ranks higher than the Late Show on almost every list that includes both, including the Funniest TV shows of 2012 (19 vs. 28), Best TV Shows of All-Time (186 vs. 197), and Best TV Shows of Recent Memory (37 vs. 166). Further, people who like The Colbert Report also seem to like many of the most popular shows around, such as Breaking Bad, Mad Men, Game of Thrones, and 30 Rock. In contrast, correlates of the Late Show include older shows like The Sopranos and 60 Minutes. There is some overlap, as fans of both shows like The West Wing and The Daily Show, indicating that Colbert may be able to appeal to current fans as well as new audiences.

Colbert Can Expand Late Show's Audience to New Groups, yet Retain Many Current Fans.

I’ll be sad to see “Stephen Colbert” the character go.  But it looks like my loss is CBS’ gain.

– Ravi Iyer

Ranker Opinion Graph: the Best Froyo Toppings

It’s hard to resist a cold treat on a hot summer afternoon, and frozen yogurt shops with their array of flavors and toppings have a little something for everyone. Once you’re done agonizing over whether you want New York cheesecake or wild berry froyo (and trying a sample of each at least twice), it’s time for the topping bar. But which topping should you choose? We asked people to vote for their favorite frozen yogurt toppings on Ranker from a list of 32 toppings, and they responded with over 7,500 votes.

The Top 5 Frozen Yogurt Toppings (by number of upvotes):
1. Oreo (235 votes)
2. Strawberries (225 votes)
3. Brownie bits (223 votes)
4. Hot fudge (216 votes)
5. Whipped cream (201 votes)

But let’s be honest, who can choose just ONE topping for their froyo? Using Gephi and data from Ranker’s Opinion Graph, we ran a cluster analysis on people’s favorite froyo topping votes to determine which toppings people like to eat together. In the graph, larger circles mean more likes with other toppings. Most of the versatile toppings were either a syrup (like strawberry sauce) or a chocolate candy (like Reese’s Pieces).

The 10 Most Versatile Froyo Toppings:

1. Strawberry sauce
2. Snickers
3. Magic Shell
4. White Chocolate chips
5. Peanut butter chips
6. Butterscotch syrup
7. Candies Nestle Butterfinger Bar
8. Reese’s Pieces
9. M&Ms
10. Brownie bits
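As a rough illustration of how a "versatility" ranking like this could be derived, the sketch below counts how often two toppings are upvoted by the same voter and then sums those co-upvotes per topping. The voter data is invented, and treating versatility as weighted degree in the co-upvote graph is an assumption based on the "larger circles mean more likes with other toppings" description, not a documented Ranker or Gephi formula.

```python
from collections import Counter
from itertools import combinations

# Hypothetical upvote data: voter -> set of toppings they upvoted.
upvotes = {
    "voter_a": {"Oreo", "Brownie bits", "Hot fudge"},
    "voter_b": {"Strawberries", "Strawberry sauce", "Brownie bits"},
    "voter_c": {"Oreo", "Strawberry sauce", "Hot fudge"},
    "voter_d": {"Strawberry sauce", "Oreo"},
}


def co_upvote_edges(upvotes):
    """Count, for each pair of toppings, how many voters upvoted both."""
    edges = Counter()
    for toppings in upvotes.values():
        # Sort so each pair is always stored in the same order.
        for a, b in combinations(sorted(toppings), 2):
            edges[(a, b)] += 1
    return edges


def versatility(edges):
    """Weighted degree: total co-upvotes a topping shares with all others."""
    degree = Counter()
    for (a, b), weight in edges.items():
        degree[a] += weight
        degree[b] += weight
    return degree


edges = co_upvote_edges(upvotes)
ranking = versatility(edges).most_common()  # most versatile toppings first
```

The resulting edge list is also the kind of weighted, undirected graph that a tool like Gephi can lay out and cluster.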

 

Using the modularity clustering tool in Gephi, we were then able to sort toppings into groups based on which toppings people were most likely to upvote together. We identified 4 kinds of froyo topping lovers:

1. Fruit and Nuts (blue): This cluster is all about the fruits and nuts. These people love Strawberry sauce, sliced almonds, and Maraschino cherries.

2. Chocolate (purple): This cluster encompasses all things chocolate. These people love Magic Shell, Brownie bits, and chocolate syrup.

3. Sugar candy (green): This cluster is made up of pure sugar. These people love gummy worms, Rainbow sprinkles, and Skittles.

4. Salty and Cake (red): This cluster encompasses cake bites and toppings that have a salty taste to them. These people like Snickers, Cheesecake bits, and Caramel Syrup.
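Gephi's modularity tool is based on the Louvain method, which searches for the grouping of nodes that maximizes a score called modularity (Q). The sketch below does not perform that search; it only computes Q for a partition you hand it, which is enough to see why grouping toppings that are upvoted together scores higher than lumping everything into one cluster. The edge weights and cluster labels are invented for illustration.

```python
def modularity(edges, community):
    """Newman modularity Q of a partition of an undirected weighted graph.

    edges: dict mapping (node_a, node_b) -> weight, each pair listed once
    community: dict mapping node -> cluster label
    """
    total = sum(edges.values())  # m: total edge weight in the graph

    internal = {}        # edge weight that falls inside each cluster
    cluster_degree = {}  # summed weighted degree of each cluster's nodes
    for (a, b), w in edges.items():
        if community[a] == community[b]:
            internal[community[a]] = internal.get(community[a], 0) + w
        # Each edge contributes its weight to the degree of both endpoints.
        for node in (a, b):
            c = community[node]
            cluster_degree[c] = cluster_degree.get(c, 0) + w

    # Q = sum over clusters of (internal share) - (expected share by chance)
    return sum(
        internal.get(c, 0) / total - (cluster_degree[c] / (2 * total)) ** 2
        for c in cluster_degree
    )


# Hypothetical co-upvote graph: two tight groups joined by one weak link.
edges = {
    ("Magic Shell", "Brownie bits"): 1,
    ("Brownie bits", "Chocolate syrup"): 1,
    ("Magic Shell", "Chocolate syrup"): 1,
    ("Gummy worms", "Rainbow sprinkles"): 1,
    ("Rainbow sprinkles", "Skittles"): 1,
    ("Gummy worms", "Skittles"): 1,
    ("Chocolate syrup", "Gummy worms"): 1,
}
two_clusters = {
    "Magic Shell": "chocolate", "Brownie bits": "chocolate",
    "Chocolate syrup": "chocolate", "Gummy worms": "sugar",
    "Rainbow sprinkles": "sugar", "Skittles": "sugar",
}
one_cluster = {node: "all" for node in two_clusters}

q_split = modularity(edges, two_clusters)
q_lumped = modularity(edges, one_cluster)
```

A partition with Q near zero is no better than chance, so a clustering tool would prefer the two-cluster split here.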

 

Some additional thoughts:

  • Banana was a strange topping that was only linked with Snickers.
  •  People who like nuts like both fruit and items from the salty category.
  •  People who like blueberries only like other fruits.
  • People who like sugar items like gummy worms also like chocolate, but don’t particularly like fruit.

 

– Kate Johnson


Hierarchical Clustering of a Ranker list of Beers

This is a guest post by Markus Pudenz.

Ranker is currently exploring ways to visualize the millions of votes collected on various topics each month. I’ve recently begun using hierarchical cluster analysis to produce taxonomies (also known as dendrograms), and applied these techniques to Ranker’s Best Beers from Around the World. A dendrogram allows one to visualize relationships in voting patterns (scroll down to see what a dendrogram looks like). Hierarchical clustering breaks the list down into related groups based on users’ voting patterns, grouping items that were voted on similarly by the same users. The algorithm is agglomerative, meaning it starts with individual items and combines them iteratively until one large cluster (all of the beers in the list) remains.
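To make the agglomerative idea concrete, here is a minimal pure-Python sketch. The vote vectors are invented, and the choice of Euclidean distance with average linkage is an assumption for illustration; the post does not say which distance or linkage was actually used.

```python
from itertools import combinations

# Hypothetical votes: beer -> votes from five users (+1 up, -1 down, 0 none).
votes = {
    "Guinness":          [1, 1, 0, -1, 0],
    "Guinness Original": [1, 1, 0, -1, 0],
    "Chimay Blue":       [-1, 0, 1, 1, 0],
    "Chimay Red":        [-1, 0, 1, 1, 1],
    "Stone IPA":         [0, -1, 1, 0, 1],
}


def distance(u, v):
    """Euclidean distance between two vote vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5


def agglomerate(votes):
    """Average-linkage agglomerative clustering.

    Returns the merges in order as (cluster_a, cluster_b, height), where each
    cluster is a tuple of item names and height is the linkage distance.
    """
    clusters = [(name,) for name in votes]  # start: every item on its own
    merges = []

    def linkage(pair):
        # Average distance over all cross-cluster item pairs.
        a, b = pair
        dists = [distance(votes[x], votes[y]) for x in a for y in b]
        return sum(dists) / len(dists)

    while len(clusters) > 1:
        # Merge the two closest clusters, then repeat.
        a, b = min(combinations(clusters, 2), key=linkage)
        merges.append((a, b, linkage((a, b))))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


merges = agglomerate(votes)
for a, b, height in merges:
    print(f"{height:.2f}: {a} + {b}")
```

Each recorded height corresponds to where a join would be drawn on a dendrogram's vertical axis, so the earliest (lowest) merges are the most closely related items.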

Every beer in our dendrogram is related to another at some level, whether in the original cluster or further down the dendrogram. See the height axis on the left side? The lower a cluster is on the axis, the closer the relationship between its beers. For example, the cluster containing Guinness and Guinness Original is the lowest in this dendrogram, indicating that these two beers have the closest relationship based on voting patterns. Regarding our list, voters have the option to Vote Up or Vote Down any beer they want. Let’s start at the top of the dendrogram and work our way down.

Hierarchical Clustering of Beer Preferences

Looking at the first split of the clusters, one can observe the cluster on the right contains beers that would generally be considered well-known including Guinness, Sam Adams, Heineken and Corona. In fact, the cluster on the right includes seven of the top ten beers from the list. The fact that most of our popular beers are in this right cluster indicates that there is a strong order effect with voters more likely to select beers that are more popular when ranking their favorite beers. For example, if someone selects a beer that is in the top ten, then another beer they select is also more likely to be in the top ten. As we examine the right cluster further, the first split divides the cluster into two smaller clusters. In the left cluster, we can clearly see, unsurprisingly, that a drinker who likes Guinness is more likely to vote for another variety of Guinness. This left cluster is comprised almost entirely of Guinness varieties with the exception of Murphy’s Irish Stout. The right cluster lists a larger variety of beer makers including Sam Adams, Stella Artois and Pyramid. In addition, none of the beers in this right cluster are stouts as with the left cluster. The only brewer in this right cluster with multiple varieties is Sam Adams with Boston Lager and Octoberfest meaning drinkers in this cluster were not as brand loyal as in the left cluster. Drinkers in this cluster were more likely to select a beer variety from a different brewer. When reviewing this cluster from the first split in the dendrogram, there is clearly a defined split between those drinkers who prefer a heavier beer (stout) as opposed to those who prefer lighter beers like lagers, pilseners, pale ales or hefeweizen.

Conversely, for beers in the left cluster, drinkers are more likely to vote for other beers that are not as popular with only three of the top ten beers in this cluster. In addition, because of the larger size, the range of beer styles and brewers for this cluster is more varied as opposed to those in the right cluster. The left cluster splits into three smaller clusters before splitting further. One cluster that is clearly distinct is the second of these clusters. This cluster is comprised almost entirely of Belgian style beers with the only exception being Pliny the Elder, an IPA. La Fin du Monde is a Belgian style tripel from Quebec with the remaining brewers from Belgium. One split within this cluster is comprised entirely of beer varieties from Chimay indicating a strong relationship; voters who select Chimay are more likely to also select a different style from Chimay when ranking their favorites.  Our remaining clusters have a little more variety. Our first cluster, the smallest of the three, has a strong representation from California with varieties from Stone, Sierra Nevada and Anchor Steam taking four out of six nodes in the cluster. Stone IPA and Stone Arrogant Bastard Ale have the strongest relationship in this cluster. Our third cluster, the largest of the three, has even more variety than the first. We see a strong relationship especially with Hoegaarden and Leffe.

I was also curious as to whether the beers in the top ten were associated with larger or smaller breweries. As the following list shows, there is an even split between larger conglomerates like AB InBev, Diageo, and MillerCoors and independent breweries like New Belgium and Sierra Nevada.

  1. Guinness (Diageo)
  2. Newcastle (Heineken)
  3. Sam Adams Boston Lager (Boston Beer Company)
  4. Stella Artois (AB InBev)
  5. Fat Tire (New Belgium Brewing Company)
  6. Sierra Nevada Pale Ale (Sierra Nevada Brewing Company)
  7. Blue Moon (MillerCoors)
  8. Stone IPA (Stone Brewing Company)
  9. Guinness Original (Diageo)
  10. Hoegaarden Witbier (AB InBev)

Markus Pudenz

Why We Still Play Board Games: An Opinion Graph Analysis

It’s hard reading studies about people my age when research scientists haven’t agreed upon a term for us yet. In one study I’m a member of “Gen Y” (lazy), in another I’m from the “iGeneration” (Orwellian), or worse still, a “Millennial” (…no). You beleaguered and cynical 30-somethings had things easy with the “Generation X” thing. Let the record reflect that no one from my generation is even remotely okay with any of these terms. Furthermore, we all collectively check out whenever we hear the term “aughties”.

I’m whining about the nomenclature only because there’s a clear need for distinction between my generation and those who have/will come before/after us. This isn’t just from a cultural standpoint (although calling us “Generation Spongebob” might be the most ubiquitous touchstone you could get), but from a technical one. If this Kaiser Family Foundation study is to be believed (via NYT), 8-18 year olds today are the first to spend the majority of their waking hours interacting with the internet.

Yet despite this monumental change, there are still many childhood staples that have not been forsaken by an increasingly digital generation. One of the most compelling examples of this anomaly lies in board games. In a day and age where Apple is selling two billion apps a month (Apple), companies peddling games for our increasingly elusive away-from-keyboard time are still holding their own. For example, Hasbro’s board-and-card-game revenue grew to $1.19 billion over the course of the last fiscal year (a 2% gain from the year before).

What drove this growth? Hasbro’s earnings reports primarily credit this growth to three products: Magic: The Gathering, Twister, and Battleship. All of these products have been mainstays of the company’s line-up for quite some time (prepare to feel old: if Magic: The Gathering were a child, it could buy booze this year), so what’s compelling people to keep buying? Fortunately, Ranker has some pretty in-depth data on all of these products, based on people who vote on its best board games list, which receives thousands of opinions each month, as well as on other Ranker lists.

Twister’s continuous sales were the easiest to explain: users who expressed interest in the game were most likely to be fans of other board games (Candy Land, Chutes and Ladders, Monopoly and so forth). Twister also correlated with many other programs/products with fairly universal appeal (Friends, Gremlins). This would seem to indicate that the chief reason for Twister’s continued high sales lies in its simplicity and ubiquity. The game is a cultural touchstone for that reason: more than any other game on the list, it’s the one hardest to picture a childhood without.

Battleship’s success lies in the same roots: our data shows great overlap between fans of the game and fans of Mouse Trap, Monopoly, etc. But Battleship has also attracted fans of a different stripe: interest in films such as Doom, Independence Day, and Terminator was highly correlated with the game. In all likelihood, this is due to the recent silver-screen adaptation of the game. Although the movie only fared modestly within the United States, the film clearly did propel the game back into the public consciousness, which translated nicely into sales.

Finally, Magic: The Gathering’s success came from support of another nature. Interest in Magic correlated primarily with other role-play and strategy games (Settlers of Catan, Dominion, Heroscape). Simply put, most fans of Magic are likely to enjoy other traditionally “nerdy” games. The large correlation overlap between Magic and other role-playing games is a testament to how voraciously this group consumes these products.

The crowd-sourced information we have here neatly divides the consumers of each game into three pools. With this sort of individualized knowledge, targeting and marketing to each archetype of consumer is a far easier task.

– Eamon Levesque

An Opinion Graph of the World’s Beers

One of the strengths of Ranker’s data is that we collect such a wide variety of opinions from users that we can put opinions about many different subjects into a graph format. Graphs are useful as they let you go beyond the individual relationships between items and see overall patterns. In anticipation of Cinco de Mayo, I produced the opinion graph of beers below, based on votes on lists such as our Best World Beers list. Connections in this graph represent significant correlations between sentiment towards connected beers, which vary in terms of strength. A layout algorithm (Force Atlas in Gephi) placed beers that were more related closer to each other and beers that had fewer/weaker connections further apart. I also ran a classification algorithm that clustered beers according to preference and colored the graph according to these clusters.

Ranker's Beer Opinion Graph
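The edges of a graph like this one can be sketched in a few lines: compute the correlation between the votes that the same users cast on each pair of beers, and keep the pairs that are strongly positively correlated. The votes below are invented, and the fixed 0.5 cutoff is a stand-in assumption for whatever significance test was actually applied.

```python
from itertools import combinations
from math import sqrt

# Hypothetical votes: beer -> votes from the same six users (+1 up, -1 down).
votes = {
    "Miller Lite":   [1, 1, 1, -1, -1, -1],
    "Natural Light": [1, 1, 1, -1, -1, 1],
    "Stone IPA":     [-1, -1, -1, 1, 1, 1],
    "Chimay":        [-1, -1, 1, 1, 1, 1],
}


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def opinion_edges(votes, threshold=0.5):
    """Keep beer pairs whose vote correlation is at least `threshold`."""
    edges = {}
    for a, b in combinations(votes, 2):
        r = pearson(votes[a], votes[b])
        if r >= threshold:
            edges[(a, b)] = r  # edge weight = correlation strength
    return edges


edges = opinion_edges(votes)
```

In this toy data the two light beers end up connected to each other, as do the two craft/European beers, while the light and craft groups are negatively correlated and so share no edge. A force-directed layout would then pull connected beers together and let unconnected ones drift apart.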

One of the fun things about graphs is that different people will see different patterns.  Among the things I learned from this exercise are:

  • The opposite of light beer, from a taste perspective, isn’t dark beer. Rather, light beers like Miller Lite are most opposite craft beers like Stone IPA and Chimay.
  • Coors Light is the light beer that is closest to the mainstream cluster. Stella Artois, Corona, and Heineken are also reasonable bridge beers between the main cluster and the light beer world.
  • The classification algorithm revealed six main taste/opinion clusters, which I would label: Really Light Beers (e.g. Natural Light), Lighter Mainstream Beers (e.g. Blue Moon), Stout Beers (e.g. Guinness), Craft Beers (e.g. Stone IPA), Darker European Beers (e.g. Chimay), and Lighter European Beers (e.g. Leffe Blonde). The interesting parts about the classifications are the cases on the edge, such as how Newcastle Brown Ale appeals to both Guinness and Heineken drinkers.
  • Seeing beers graphed according to opinions made me wonder if companies consciously position their beers accordingly. Is Pyramid Hefeweizen successfully appealing to the Sam Adams drinker who wants a bit of European flavor? Is Anchor Steam supposed to appeal to both the Guinness drinker and the craft beer drinker? I’m not sure if I know enough about the marketing of beers to know the answer to this, but I’d be curious if beer companies place their beers in the same space that this opinion graph does.

These are just a few observations based on my own limited beer drinking experience.  I tend to be more of a whiskey drinker, and hope more of you will vote on our Best Tasting Whiskey list, so I can graph that next.  I’d love to hear comments about other observations that you might make from this graph.

– Ravi Iyer

Ranker Uses Big Data to Rank the World’s 25 Best Film Schools

NYU, USC, UCLA, Yale, Juilliard, Columbia, and Harvard top the Rankings.

Does USC or NYU have a better film school?  “Big data” can provide an answer to this question by linking data about movies and the actors, directors, and producers who have worked on specific movies, to data about universities and the graduates of those universities.  As such, one can use semantic data from sources like Freebase, DBPedia, and IMDB to figure out which schools have produced the most working graduates.  However, what if you cared about the quality of the movies they worked on rather than just the quantity?  Educating a student who went on to work on The Godfather must certainly be worth more than producing a student who received a credit on Gigli.

Leveraging opinion data from Ranker’s Best Movies of All-Time list in addition to widely available semantic data, Ranker recently produced a ranked list of the world’s 25 best film schools, based on credits on movies within the top 500 movies of all-time. USC produces the most film credits by graduates overall, but when film quality is taken into account, NYU (208 credits) actually produces more credits among the top 500 movies of all-time, compared to USC (186 credits). UCLA, Yale, Juilliard, Columbia, and Harvard take places 3 through 7 on Ranker’s list. Several professional schools that focus on the arts also place in the top 25 (e.g. London’s Royal Academy of Dramatic Art) as well as some well-located high schools (New York’s Fiorello H. LaGuardia High School & Beverly Hills High School).

The World’s Top 25 Film Schools

  1. New York University (208 credits)
  2. University of Southern California (186 credits)
  3. University of California – Los Angeles (165 credits)
  4. Yale University (110 credits)
  5. Juilliard School (106 credits)
  6. Columbia University (100 credits)
  7. Harvard University (90 credits)
  8. Royal Academy of Dramatic Art (86 credits)
  9. Fiorello H. LaGuardia High School of Music & Art (64 credits)
  10. American Academy of Dramatic Arts (51 credits)
  11. London Academy of Music and Dramatic Art (51 credits)
  12. Stanford University (50 credits)
  13. HB Studio (49 credits)
  14. Northwestern University (47 credits)
  15. The Actors Studio (44 credits)
  16. Brown University (43 credits)
  17. University of Texas – Austin (40 credits)
  18. Central School of Speech and Drama (39 credits)
  19. Cornell University (39 credits)
  20. Guildhall School of Music and Drama (38 credits)
  21. University of California – Berkeley (38 credits)
  22. California Institute of the Arts (38 credits)
  23. University of Michigan (37 credits)
  24. Beverly Hills High School (36 credits)
  25. Boston University (35 credits)
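The counting behind a list like this can be sketched as a join between three tables: who worked on which film, who attended which school, and which films make the quality cut. Every record below is an invented placeholder (not Ranker, Freebase, DBPedia, or IMDB data), and the function and variable names are hypothetical.

```python
from collections import Counter

# Invented placeholder records for illustration only.
top_films = {"Film A", "Film B"}  # stand-in for the top-500 movie list
credits = [                       # (person, film) pairs
    ("Person 1", "Film A"),
    ("Person 1", "Film C"),
    ("Person 1", "Film D"),
    ("Person 1", "Film E"),
    ("Person 2", "Film A"),
    ("Person 2", "Film B"),
    ("Person 3", "Film B"),
]
alumni = {                        # person -> schools attended
    "Person 1": ["School X"],
    "Person 2": ["School Y"],
    "Person 3": ["School Y"],
}


def credits_by_school(credits, alumni, films=None):
    """Count film credits per school, optionally restricted to `films`."""
    counts = Counter()
    for person, film in credits:
        if films is not None and film not in films:
            continue  # skip credits on films outside the quality list
        for school in alumni.get(person, []):
            counts[school] += 1
    return counts


by_quantity = credits_by_school(credits, alumni)
by_quality = credits_by_school(credits, alumni, films=top_films)
```

In this toy data School X leads on raw credit count but School Y leads once only the top films count, which mirrors the quantity-versus-quality distinction the post draws between USC and NYU.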

“Clearly, there is a huge effect of geography, as prominent New York- and Los Angeles-based high schools appear to produce more graduates who work on quality films compared to many colleges and universities,” says Ravi Iyer, Ranker’s Principal Data Scientist, a graduate of the University of Southern California.

Ranker is able to combine factual semantic data with an opinion layer because Ranker is powered by a Virtuoso triple store with over 700 million triples of information that are processed into an entertaining list format for users on Ranker’s consumer-facing website, Ranker.com. Each month, over 7 million unique users interact with this data – ranking, listing and voting on various objects – effectively adding a layer of opinion data on top of the factual data from Ranker’s triple store. The result is a continually growing opinion graph that connects factual and opinion data. As of January 2013, Ranker’s opinion graph included over 30,000 nodes with over 5 million edges connecting these nodes.

– Ravi Iyer

Predicting Box Office Success a Year in Advance from Ranker Data

A number of data scientists have attempted to predict movie box office success from various datasets. For example, researchers at HP Labs were able to use tweets around the release date plus the number of theaters that a movie was released in to predict 97.3% of movie box office revenue in the first weekend. The Hollywood Stock Exchange, which lets participants bet on the box office revenues and infers a prediction, predicts 96.5% of box office revenue in the opening weekend. Wikipedia activity predicts 77% of box office revenue according to a collaboration of European researchers. Ranker runs lists of anticipated movies each year, often for more than a year in advance, and so the question I wanted to analyze in our data was how predictive Ranker data is of box office success.

However, since the above researchers have already shown that online activity at the time of the opening weekend predicts box office success during that weekend, I wanted to build upon that work and see if Ranker data could predict box office receipts well in advance of opening weekend.  Below is a simple scatterplot of results, showing that Ranker data from the previous year predicts 82% of variance in movie box office revenue for movies released in the next year.

Predicting Box Office Success from Ranker Data

The above graph uses votes cast in 2011 to predict revenues from our Most Anticipated 2012 Films list. While our data is not as predictive as Twitter data collected leading up to opening weekend, the remarkable thing about this result is that most votes (8,200 votes from 1,146 voters) were cast 7-13 months before the actual release date. I look forward to doing the same analysis on our Most Anticipated 2013 Films list at the end of this year.
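A figure like "predicts 82% of variance" is usually an R² value. The sketch below shows how such a number could be computed from a one-variable least-squares fit. The vote counts and revenues are invented, and using a simple linear fit is an assumption, since the post does not describe the exact model.

```python
def r_squared(x, y):
    """Share of the variance in y explained by a least-squares line on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)

    # Fit y = intercept + slope * x by ordinary least squares.
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Compare leftover (residual) variation with total variation in y.
    residual = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    return 1 - residual / syy


# Hypothetical data: upvotes a film received on an "anticipated" list the
# year before release, and its eventual box office revenue in $ millions.
upvotes = [120, 300, 450, 800, 950]
revenue = [40, 95, 180, 260, 410]

print(f"R^2 = {r_squared(upvotes, revenue):.2f}")
```

An R² of 1 would mean revenue falls exactly on a straight line through the vote counts, while 0 would mean the votes carry no linear information about revenue.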

– Ravi Iyer


Validating Ranker’s Aggregated Data vs. a Gallup Poll of Best Colleges

We were talking to someone in the market research field about the credibility of Ranker’s aggregated rankings, and they were intrigued and suggested that we validate our data by comparing the aggregated results of one of our lists to the results achieved by a traditional research company using traditional market research methodologies. Companies like Gallup often do not survey the same types of questions that we ask at Ranker, in part due to the inherent difficulties of open-ended polling via random digit dialing. You can’t realistically call someone up at dinner time and ask them to list their 50 favorite TV shows. You could ask them to name one favorite, but doing that, you can end up with headlines like “Americans admire Glenn Beck more than they admire the Pope.” However, one question that both Gallup and Ranker have asked concerns the nation’s top colleges/universities. How do Ranker’s results compare to Gallup’s data? Below are our results, side by side.

Ranker vs Gallup Best US Colleges

From a market researcher’s perspective, this is good news for Ranker data. Our algorithms have successfully replicated the top 4 results from the Gallup poll exactly, at a fraction of the cost. This likely occurs because Ranker data is largely collected from users who find our website via organic search, so while our data is not a representative probability sample (assuming such a thing still exists in a world where people screen their calls on cellphones), our users tend to be more representative than the motivated Yelp user or the intellectual Quora user. If you compare Ranker’s best movies list to Rotten Tomatoes’ aggregated opinion list (Toy Story 2 and Man on Wire are #1 & #2!?!?), you get a sense of the importance of having relatively representative data.

In addition, the fact that our lists are derived from a combination of methodologies (listing, reranking, + voting), means that the error associated with each method somewhat cancels out.  Indeed, one might argue that Ranker’s top dream colleges list is better than Gallup’s for precisely this reason as individuals are often tempted to list their alma mater or their local school as the best college, and the long tail of answers might actually contain more pertinent information.  Aggregating ranked lists from motivated users and combining that data with casual voters might actually be the best way to answer a question like this.

– Ravi Iyer


A Look Inside the Ranker Data Tool

You may have looked through some of the more fascinating, insightful posts here on the Ranker Data Blog and thought… how can he possibly come up with some of these connections?

Well, to be perfectly fair, the Ranker data tool does a lot of the heavy lifting. It allows me to quickly look through topics that have received a lot of up or down votes on Ranker, and make quick comparisons to other topics easily.

And here’s a quick look at how it all works…

We start by picking a general category we want and a specific item (or “node” in this case) from that topic. So under the category of TV, I’m going to pick the item “Boardwalk Empire.”

Now the tool knows that I only want to look at people who voted on “Boardwalk Empire.” The next step involves the tool looking for correlations – that is, relationships between “Boardwalk Empire” votes and other votes cast on Ranker. I could compare votes cast for or against “Boardwalk Empire” with votes cast on pretty much any other subject – films, foods, people, gadgets… you name it. Sometimes, this can be very interesting, as in this post, where we correlated people’s taste in breakfast cereals vs. films and tv shows.

But for the sake of explanation, let’s look at a more direct comparison, which usually yields more interesting results. So we’ll compare votes on “Boardwalk Empire” to votes on other TV shows, to see how well we can predict what fans of HBO’s Prohibition drama might also enjoy on the tube.

The results are pretty standard, and really show off exactly what the tool can do. When searching “Boardwalk Empire” correlated with other TV shows, here’s what I see:

Those percentages to the right represent what we call the “Lift %,” which basically just means: how much more likely is a “Boardwalk Empire” fan to enjoy a given show than a random person who does not have an opinion about “Boardwalk Empire”? I’d ask Ravi to explain it to you directly, but his answer would likely involve fractals, and I don’t want to put you through that.

Trust me on this part, though… The higher the Lift %, the MORE likely a “Boardwalk Empire” fan will also enjoy whatever show we’re discussing.
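For the curious, here is one way a lift percentage along these lines could be computed. The vote data is invented, and the exact formula (the like rate among fans divided by the like rate among people with no vote on the anchor show, minus one) is an assumption based on the informal description above, not Ranker's documented calculation.

```python
# Hypothetical votes: user -> {show: +1 (voted up) or -1 (voted down)}.
votes = {
    "u1": {"Boardwalk Empire": 1, "Deadwood": 1},
    "u2": {"Boardwalk Empire": 1, "Deadwood": 1},
    "u3": {"Boardwalk Empire": 1, "Deadwood": -1},
    "u4": {"Boardwalk Empire": 1},
    "u5": {"Deadwood": 1},
    "u6": {"Thundercats": 1},
    "u7": {"Deadwood": -1},
    "u8": {"Thundercats": 1},
}


def lift_percent(votes, anchor, target):
    """How much more likely (in %) fans of `anchor` are to like `target`
    than users who have not voted on `anchor` at all.

    Assumes both groups are non-empty and that at least one user in the
    comparison group likes `target` (otherwise this divides by zero).
    """
    fans = [v for v in votes.values() if v.get(anchor) == 1]
    no_opinion = [v for v in votes.values() if anchor not in v]

    fan_rate = sum(v.get(target) == 1 for v in fans) / len(fans)
    base_rate = sum(v.get(target) == 1 for v in no_opinion) / len(no_opinion)
    return (fan_rate / base_rate - 1) * 100


lift = lift_percent(votes, "Boardwalk Empire", "Deadwood")
```

Here half of the “Boardwalk Empire” fans voted up “Deadwood” versus a quarter of everyone else, so the lift works out to 100%: fans are twice as likely to like it. A lift of 0% would mean fans are no more likely than anyone else.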

Keeping that in mind, most of the results seem fairly predictable and straightforward. A “Boardwalk Empire” fan would naturally be likely to enjoy “The Shield” or “The Killing,” two different hard-edged crime dramas with occasionally similar themes. Similarly, “Deadwood” seems an obvious fit – both are violent HBO series exploring crime in different periods of American history. In fact, there are really only two outliers that make this list kind of compelling… What the hell are “Thundercats” and “Police Squad!” doing there?

There’s probably a very reasonable explanation for this. Maybe a big chunk of people went to the “Boardwalk Empire” page and then immediately voted on their favorite ’80s cartoon series as well? It’s possible, but seems unlikely, as there aren’t any other animated shows in the Top 10 (or even 20!) of this group. Maybe people who like “Boardwalk Empire” – or crime shows more generally – also enjoy occasionally making light of a very serious subject by throwing on the adventures of Detective Frank Drebin of “Police Squad!” To investigate this, I’d probably look at a similar chart for the show “Police Squad!” and see if a lot of more serious crime fare appeared.

And what do you know? It does! Along with the expected other comedy series from the same era – “Welcome Back Kotter,” “WKRP in Cincinnati” and so on, sure enough we see that “Police Squad!” fans have also voted positively on “The Sopranos,” “Boardwalk Empire” and even “Miami Vice.” We could certainly do more research to confirm, but this definitely points me towards a preliminary hypothesis – fans of crime shows don’t really differentiate between funny or serious content. They just like the topic of crime and criminals.

To keep investigating, I’d probably look at some other crime dramas and comedies to see if I also got similar results. If, say, “The Wire” fans also tended to enjoy “Pink Panther” movies, or fans of “Hackers” also cited “Sneakers” as a favorite film, we’d be on our way to a full-fledged theory. But that’s a blog post for a different day, kids. Now it’s time for bed.

– Lon
