by   Ranker
Staff
in Data Science, Market Research, Opinion Graph, Pop Culture

An Opinion Graph of the World’s Beers

One of the strengths of Ranker’s data is that we collect such a wide variety of opinions from users that we can put opinions about many different subjects into a graph format.  Graphs are useful because they let you go beyond the individual relationships between items and see overall patterns.  In anticipation of Cinco de Mayo, I produced the opinion graph of beers below, based on votes on lists such as our Best World Beers list.  Connections in this graph represent significant correlations between sentiment towards the connected beers, and they vary in strength.  A layout algorithm (ForceAtlas in Gephi) placed beers that were more related closer to each other and beers that had fewer or weaker connections further apart.  I also ran a classification algorithm that clustered beers according to preference and colored the graph according to these clusters.
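For readers who want to try something similar, here is a minimal sketch of how edges in a graph like this could be derived.  Everything in it is an assumption for illustration: the votes are made up, the +1/-1 encoding is hypothetical, and a plain correlation cutoff (min_r) stands in for the significance test mentioned above.  It is not Ranker's actual pipeline.

```python
from itertools import combinations
from math import sqrt

# Hypothetical votes: voter -> {beer: +1 (upvote) or -1 (downvote)}
votes = {
    "ann": {"Miller Lite": 1,  "Coors Light": 1,  "Stone IPA": -1, "Chimay": -1},
    "bob": {"Miller Lite": 1,  "Coors Light": 1,  "Stone IPA": -1, "Chimay": -1},
    "cat": {"Miller Lite": -1, "Coors Light": -1, "Stone IPA": 1,  "Chimay": 1},
    "dan": {"Miller Lite": -1, "Coors Light": 1,  "Stone IPA": 1,  "Chimay": 1},
    "eve": {"Miller Lite": 1,  "Coors Light": 1,  "Stone IPA": -1, "Chimay": 1},
}

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)

def opinion_edges(votes, min_r=0.5, min_shared=3):
    """Return {(item_a, item_b): r} for item pairs whose votes correlate at or above min_r."""
    items = sorted({item for ballot in votes.values() for item in ballot})
    edges = {}
    for a, b in combinations(items, 2):
        # Only voters who voted on both items contribute to the correlation.
        shared = [(v[a], v[b]) for v in votes.values() if a in v and b in v]
        if len(shared) < min_shared:
            continue
        r = pearson([s[0] for s in shared], [s[1] for s in shared])
        if r >= min_r:
            edges[(a, b)] = round(r, 2)
    return edges

edges = opinion_edges(votes)
print(edges)
```

With this toy data, only two pairs clear the cutoff: Chimay with Stone IPA, and Coors Light with Miller Lite.  An edge list of this kind is the sort of input a layout tool such as Gephi can work from.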

Ranker's Beer Opinion Graph

One of the fun things about graphs is that different people will see different patterns.  Among the things I learned from this exercise are:

  • The opposite of light beer, from a taste perspective, isn’t dark beer.  Rather, light beers like Miller Lite are most opposite to craft beers like Stone IPA and Chimay.
  • Coors Light is the light beer that is closest to the mainstream cluster.  Stella Artois, Corona, and Heineken are also reasonable bridge beers between the main cluster and the light beer world.
  • The classification algorithm revealed six main taste/opinion clusters, which I would label: Really Light Beers (e.g. Natural Light), Lighter Mainstream Beers (e.g. Blue Moon), Stout Beers (e.g. Guinness), Craft Beers (e.g. Stone IPA), Darker European Beers (e.g. Chimay), and Lighter European Beers (e.g. Leffe Blonde).  The interesting parts about the classifications are the cases on the edge, such as how Newcastle Brown Ale appeals to both Guinness and Heineken drinkers.
  • Seeing beers graphed according to opinions made me wonder if companies consciously position their beers accordingly.  Is Pyramid Hefeweizen successfully appealing to the Sam Adams drinker who wants a bit of European flavor?  Is Anchor Steam supposed to appeal to both the Guinness drinker and the craft beer drinker?  I’m not sure if I know enough about the marketing of beers to know the answer to this, but I’d be curious if beer companies place their beers in the same space that this opinion graph does.

These are just a few observations based on my own limited beer drinking experience.  I tend to be more of a whiskey drinker, and hope more of you will vote on our Best Tasting Whiskey list, so I can graph that next.  I’d love to hear comments about other observations that you might make from this graph.

– Ravi Iyer

Ranker Uses Big Data to Rank the World’s 25 Best Film Schools

NYU, USC, UCLA, Yale, Juilliard, Columbia, and Harvard top the rankings.

Does USC or NYU have a better film school?  “Big data” can provide an answer to this question by linking data about movies and the actors, directors, and producers who have worked on specific movies, to data about universities and the graduates of those universities.  As such, one can use semantic data from sources like Freebase, DBpedia, and IMDb to figure out which schools have produced the most working graduates.  However, what if you cared about the quality of the movies they worked on rather than just the quantity?  Educating a student who went on to work on The Godfather must certainly be worth more than producing a student who received a credit on Gigli.

Leveraging opinion data from Ranker’s Best Movies of All-Time list in addition to widely available semantic data, Ranker recently produced a ranked list of the world’s 25 best film schools, based on credits on movies within the top 500 movies of all-time.  USC produces the most film credits by graduates overall, but when film quality is taken into account, NYU (208 credits) actually produces more credits among the top 500 movies of all-time than USC (186 credits).  UCLA, Yale, Juilliard, Columbia, and Harvard take places 3 through 7 on Ranker’s list.  Several professional schools that focus on the arts also place in the top 25 (e.g. London’s Royal Academy of Dramatic Art), as do some well-located high schools (New York’s Fiorello H. LaGuardia High School and Beverly Hills High School).
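The counting behind a ranking like this can be sketched in a few lines.  The data structures below are hypothetical stand-ins (placeholder people and schools, and a three-movie set in place of the real top 500); the actual analysis joined semantic data with Ranker's opinion data, which this sketch does not attempt to reproduce.

```python
from collections import Counter

# Hypothetical stand-in for the top 500 movies from the opinion-ranked list.
top_movies = {"The Godfather", "Goodfellas", "Star Wars"}

# Hypothetical (person, movie) credits, as might come from a semantic data source.
credits = [
    ("Person A", "The Godfather"),
    ("Person A", "Gigli"),          # not in the top list, so it earns no credit
    ("Person B", "Goodfellas"),
    ("Person B", "The Godfather"),
    ("Person C", "Star Wars"),
]

# Hypothetical person -> schools attended.
alma_maters = {
    "Person A": ["School X"],
    "Person B": ["School Y"],
    "Person C": ["School X", "School Z"],
}

def rank_schools(credits, alma_maters, top_movies):
    """Count credits on top-ranked movies per school, highest count first."""
    counts = Counter()
    for person, movie in credits:
        if movie not in top_movies:
            continue  # quality filter: only credits on top-ranked movies count
        for school in alma_maters.get(person, []):
            counts[school] += 1
    # Sort by count descending, then by name so ties are ordered predictably.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

ranking = rank_schools(credits, alma_maters, top_movies)
print(ranking)
```

Dropping the top_movies filter would give the quantity-only ranking, which is the comparison the USC versus NYU result rests on.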

The World’s Top 25 Film Schools

  1. New York University (208 credits)
  2. University of Southern California (186 credits)
  3. University of California – Los Angeles (165 credits)
  4. Yale University (110 credits)
  5. Juilliard School (106 credits)
  6. Columbia University (100 credits)
  7. Harvard University (90 credits)
  8. Royal Academy of Dramatic Art (86 credits)
  9. Fiorello H. LaGuardia High School of Music & Art (64 credits)
  10. American Academy of Dramatic Arts (51 credits)
  11. London Academy of Music and Dramatic Art (51 credits)
  12. Stanford University (50 credits)
  13. HB Studio (49 credits)
  14. Northwestern University (47 credits)
  15. The Actors Studio (44 credits)
  16. Brown University (43 credits)
  17. University of Texas – Austin (40 credits)
  18. Central School of Speech and Drama (39 credits)
  19. Cornell University (39 credits)
  20. Guildhall School of Music and Drama (38 credits)
  21. University of California – Berkeley (38 credits)
  22. California Institute of the Arts (38 credits)
  23. University of Michigan (37 credits)
  24. Beverly Hills High School (36 credits)
  25. Boston University (35 credits)

“Clearly, there is a huge effect of geography, as prominent New York and Los Angeles based high schools appear to produce more graduates who work on quality films compared to many colleges and universities,” says Ravi Iyer, Ranker’s Principal Data Scientist, a graduate of the University of Southern California.

Ranker is able to combine factual semantic data with an opinion layer because Ranker is powered by a Virtuoso triple store with over 700 million triples of information that are processed into an entertaining list format for users on Ranker’s consumer-facing website, Ranker.com.  Each month, over 7 million unique users interact with this data (ranking, listing and voting on various objects), effectively adding a layer of opinion data on top of the factual data from Ranker’s triple store.  The result is a continually growing opinion graph that connects factual and opinion data.  As of January 2013, Ranker’s opinion graph included over 30,000 nodes with over 5 million edges connecting these nodes.

– Ravi Iyer

by   Ranker
Staff
in Data Science, Market Research, Pop Culture, prediction

Predicting Box Office Success a Year in Advance from Ranker Data

A number of data scientists have attempted to predict movie box office success from various datasets.  For example, researchers at HP Labs were able to use tweets around the release date plus the number of theaters that a movie was released in to predict 97.3% of movie box office revenue in the first weekend.  The Hollywood Stock Exchange, which lets participants bet on box office revenues and infers a prediction, predicts 96.5% of box office revenue in the opening weekend.  Wikipedia activity predicts 77% of box office revenue according to a collaboration of European researchers.  Ranker runs lists of anticipated movies each year, often for more than a year in advance, and so the question I wanted to analyze in our data was how predictive Ranker data is of box office success.

However, since the above researchers have already shown that online activity at the time of the opening weekend predicts box office success during that weekend, I wanted to build upon that work and see if Ranker data could predict box office receipts well in advance of opening weekend.  Below is a simple scatterplot of results, showing that Ranker data from the previous year predicts 82% of variance in movie box office revenue for movies released in the next year.

Predicting Box Office Success from Ranker Data

The above graph uses votes cast in 2011 to predict revenues from our Most Anticipated 2012 Films list.  While our data is not as predictive as Twitter data collected leading up to opening weekend, the remarkable thing about this result is that most votes (8,200 votes from 1,146 voters) were cast 7-13 months before the actual release date.  I look forward to doing the same analysis on our Most Anticipated 2013 Films list at the end of this year.
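A figure like “predicts 82% of variance” is usually read as the R² of a regression, and the sketch below shows how that number could be computed for a simple one-predictor linear fit.  The vote counts and revenues are invented for illustration, and the original analysis may have used a different model or transformation of the data.

```python
def r_squared(xs, ys):
    """R^2 of an ordinary least-squares line fit of ys on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = my - slope * mx
    # Residual sum of squares: variance in ys the fitted line fails to explain.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return 1 - ss_res / syy

# Hypothetical data: net upvotes on an "anticipated films" list the year before release,
# and box office revenue (in millions of dollars) once each film came out.
prior_year_votes = [120, 340, 560, 900, 1500]
box_office = [30, 95, 110, 260, 400]

r2 = r_squared(prior_year_votes, box_office)
print(f"Share of revenue variance explained by prior-year votes: {r2:.2f}")
```

An R² of 1.0 would mean the votes line up perfectly with revenue, while a value near 0 would mean they carry essentially no linear signal.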

– Ravi Iyer

by   Ranker
Staff
in Data Science

Crowdsourcing Objective Answers to Subjective Questions – Nerd Nite Los Angeles

A lot of the questions on Ranker are subjective, but that doesn’t mean that we cannot use data to bring some objectivity to this analysis.  In the same way that Yelp crowdsources answers to subjective questions about restaurants and TripAdvisor crowdsources answers to subjective questions about hotels, Ranker crowdsources answers to a broader assortment of relatively subjective questions such as the Tastiest Pizza Toppings, the Best Cruise Destination, and the Worst Way to Die.

A few weeks ago, I did an informal talk on the Wisdom of Crowds approach that Ranker takes to crowdsource such answers at a Los Angeles bar as part of “Nerd Nite”.  The gist of it is that one can crowdsource objective answers to subjective questions by asking diverse groups of people questions in diverse ways.  Greater diversity, when aggregated effectively, enables the error inherent in answering any subjective question to be minimized.  For example, we know intuitively that relying on only the young or only the elderly or only people in cities or only people who live in rural areas gives us biased answers to subjective questions.  But when all of these diverse groups agree on a subjective question, there is reason to believe that there is an objective truth that they are responding to.  Below is the video of that talk.
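The error-cancelling argument can be illustrated with a small simulation.  This is a sketch under loud assumptions: the groups, their biases, and the “true” score are all invented, and the biases are deliberately chosen to roughly offset one another, which is the situation in which aggregating diverse groups helps most.

```python
import random
from statistics import mean

rng = random.Random(0)   # fixed seed so the run is repeatable
TRUE_SCORE = 7.0         # hypothetical underlying score for one item, on a 0-10 scale

# Hypothetical systematic bias of each group (assumed to roughly cancel out overall).
group_bias = {"young": 2.0, "elderly": -1.5, "urban": 1.0, "rural": -1.5}

def sample_ratings(bias, n=200):
    """Simulate n ratings from one group: truth + group bias + individual noise."""
    return [TRUE_SCORE + bias + rng.gauss(0, 1) for _ in range(n)]

# Average rating within each group, and how far each lands from the true score.
group_means = {g: mean(sample_ratings(b)) for g, b in group_bias.items()}
single_group_error = {g: abs(m - TRUE_SCORE) for g, m in group_means.items()}

# Aggregate across groups by weighting every group equally.
pooled_error = abs(mean(list(group_means.values())) - TRUE_SCORE)

for group, err in single_group_error.items():
    print(f"{group:8s} alone is off by {err:.2f}")
print(f"all groups pooled are off by {pooled_error:.2f}")
```

If every group leaned the same way, pooling would not remove the bias, which is why the diversity of the groups matters as much as their number.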

If you want to see a more formal version of this talk, I’ll be speaking at greater length on Ranker’s methodologies at the Big Data Innovation Summit in San Francisco this Friday.

– Ravi Iyer

by   Ranker
Staff
in New Features

New Features on Ranker

As usual, we are hard at work here in the Internet factory trying to make our site better and better. Pretty soon, Ranker.com is going to be able to walk your dog and drop your kids off at soccer practice. Those jet-pack days are not here yet, but we do have some other fun, new stuff for you to gaze upon. And maybe also use.

Do you sometimes see a list and think, “I want to rank that, but who has the TIME?” Not to fear, good people. We’ve actually reduced the amount of time it takes to register your opinion in list-form. With science!

Say you go to a list, any list. You start voting, all casual-like… and suddenly this tab pops out of the right side of your browser. Every time you vote something up, the number on the counter goes up, too! If you click on the green button that says “Your Votes,” a starter-list will come sliding in from the right side. You can literally re-order items, delete items, add items, write copy, and even add images and/or videos RIGHT THERE. And then publish your re-rank. RIGHT THERE. I mean, think of the time you just saved. Now you finally have time to cut your toenails. You’re welcome.

This isn’t technically new… but it’s something that looks new, so we’re counting it. The trigger buttons that switch how you view lists have moved! Not that exciting, I guess.

But in case you wondered why your ability to change ‘blog view’ to ‘info view’ or ‘info view’ to ‘slideshow’ wasn’t where it was supposed to be, just move your eyeballs to the left side of the screen and look right at the top of the list. There they are! Everything is still as it should be, only different.

We’ve gone and moved into the fine year of 2010 by finally making a functioning mobile version of our site. You can browse our lists more easily now, vote more easily and now you can actually re-rank a list… on your PHONE!

Unfortunately, we don’t yet support creating your own, brand-new lists on mobile, and there are some other user features of the site that we still won’t support on mobile… BUT it should be a lot easier to find, read, and vote on all your favorite lists.

Have you ever just wanted to copy and paste a list you see on Ranker? Maybe to a blog post, email, or just to Facebook? Well, now you can!

If you go into the ‘more options’ dropdown at the top of every list and select “Paste to Clipboard,” a popup with the list as it appears in text format will appear, allowing you to copy it and paste it wherever you want, free of formatting. If you want to paste it to Facebook, make sure to select the checkbox that says, “Check here to copy for Facebook.”

Enjoy. Check back soon for more fun new features!

by   Ranker
Staff
in interest graph, Opinion Graph

A Battle of Taste Graphs: Baltimore Ravens Fans vs. San Francisco 49ers Fans

Super Bowl Sunday is a day when two cities and two fan groups are competing for bragging rights, even as the Baltimore Ravens and San Francisco 49ers themselves do the playing.  You might be interested in understanding these teams’ fans better through an exploration of their fans’ taste graphs, from a recent post on our data blog, which examines correlations between votes on lists like the Top NFL Teams of 2012 and non-sports lists like our list of delicious vegetables (yum!).

For one, there is absolutely zero consensus where music is concerned. 49ers fans listen to an eclectic mixture of genres: up-and-coming rappers like Kendrick Lamar sit right next to INXS and 90s Brit-poppers Pulp. Yet where the Ravens are concerned, classic rock is still king: Hendrix, CCR, and Neil Young are an undisputed top three. The 49ers also have the Ravens utterly beat in terms of culinary taste. Monterey Jack and Cosmos are fairly clear favorites among their fans, while Baltimore’s fans stick to staples: coffee, bell peppers, and ham are the only food items that correlated enough to even be tracked.

 A Snapshot from Ranker’s Data Mining Tool

TV tastes also varied between the two teams: Ravens fans stuck almost exclusively to comedic fare (Pinky and the Brain, Rugrats, Mythbusters and Louie correlated strongly), while 49ers fans stuck to more structured, dramatic shows, such as The Walking Dead and Dexter.

Read the full post here over on our data blog.

– Ravi Iyer

by   Ranker
Staff
in Data Science, Pop Culture

On Touchdowns and Tastes: This Sunday’s Conflict Of Fan-Interests

 

helmet images courtesy of http://nfl-franchises.findthedata.org

 

The greatest moment of fear in my childhood came on the eve of my first ever family trip to Manhattan. It wasn’t the flight or the crowds or the crime rate that had seven-year-old me scared. I was terrified because I had been brought up to believe that any and all Yankees fans were villainous scum, lowest of the low, the nadir of human development. Visiting the city and actually interacting with people from New York had an effect on me akin to realizing that there wasn’t a Santa Claus: I was faced with the reality that not all Yankees fans are evil. It just wasn’t mathematically feasible. You can’t run a city of 8 million people without having some people who don’t suck. This, of course, is a key part of the unspoken acknowledgement all (nonviolent & sane) sports fans have; that sports fandom is a mostly regional thing, and that there’s no point in thinking those who back another team are truly inferior, or even all that different from you.

However, if you told that to anyone from Baltimore or San Francisco right now, they’d likely try to argue for the ideological superiority of their respective squad. With the Super Bowl literally on the horizon, this is not a time where people deal in shades of gray. But are there any real, quantifiable differences between the fans of the Ravens and the 49ers? Anything else on the line in this contest?

Weirdly enough, yes. The Ranker correlation data for supporters of the Ravens and the 49ers is strikingly dissimilar. You’d think that there would be some commonalities between the likes and dislikes of the two teams’ fans, even just those that stem from the demographic features of “football fans”. But no, the pop culture tastes of the two fan bases have a strikingly minuscule amount of overlap.  Let us examine some of the correlations based on user behavior at Ranker.com.

For one, there is absolutely zero consensus where music is concerned. 49ers fans listen to an eclectic mixture of genres: up-and-coming rappers like Kendrick Lamar sit right next to INXS and 90s Brit-poppers Pulp. Yet where the Ravens are concerned, classic rock is still king: Hendrix, CCR, and Neil Young are an undisputed top three. The 49ers also have the Ravens utterly beat in terms of culinary taste. Monterey Jack and Cosmos are fairly clear favorites among their fans, while Baltimore’s fans stick to staples: coffee, bell peppers, and ham are the only food items that correlated enough to even be tracked.

 A Snapshot from Ranker’s Data Mining Tool

TV tastes also varied between the two teams: Ravens fans stuck almost exclusively to comedic fare (Pinky and the Brain, Rugrats, Mythbusters and Louie correlated strongly), while 49ers fans stuck to more structured, dramatic shows, such as The Walking Dead and Dexter.

Some of these differences can be explained away geographically (In-N-Out Burger, a prominent correlated item for the 49ers, isn’t going to appeal to anyone on the east coast since they just don’t have it), but when the data is stacked up, there is a very noticeable dissimilarity in interests between the two teams’ fans. One could, of course, use this data to try to advocate for the superiority of one team over the other (I won’t even get into the far more extensive video game tastes of 49ers fans). However, the far more intriguing question at hand lies in what we all really watch the Super Bowl for: the ads.

If, as the data suggests, there is such a difference between the interests of the average 49ers fan and the average Ravens fan, how will the ads attempt to bridge this gap? Since I couldn’t give a damn about the score (neither team is the Pats, so who cares), I’ll instead be keeping track of which team’s interests are catered to by the adverts. On Sunday, one team will win on the field, and another during the commercials.

– Eamon Levesque

by   Ranker
Staff
in Data Science, interest graph, Opinion Graph

The Opinion Graph predicts more than the Interest Graph

At Ranker, we keep track of talk about the “interest graph” because we have our own parallel graph of relationships between objects in our system, which we call an “opinion graph”.  I was recently sent this video concerning the power of the interest graph to drive personalization.

The points made in the video are very good, about how the interest graph is more predictive than the social graph, as far as personalization goes.  I love my friends, but the kinds of things they read and the kinds of things I read are very different and while there is often overlap, there is also a lot of diversity.  For example, trying to personalize my movie recommendations based on my wife’s tastes would not be a satisfying experience.  Collaborative filtering using people who have common interests with me is a step in the right direction and the interest graph is certainly an important part of that.

However, you can predict more about a person with an opinion graph versus an interest graph. The difference is that while many companies can infer from web behavior what people are interested in, perhaps by looking at the kinds of articles and websites they consume, a graph of opinions actually knows what people think about the things they are reading about.  Anyone who works with data knows that the more specific a data point is, the more you can predict, as the amount of “error” in your measurement is reduced.  Reduced measurement error is far more important for prediction than sample size, which is a point that gets lost in the drive toward bigger and bigger data sets.  Nate Silver often makes this point in talks and in his book.

For example, if you know someone reads articles about Slumdog Millionaire, then you can serve them content about Slumdog Millionaire.  That would be a typical use case for interest graph data.  Using collaborative filtering, you can find out what other Slumdog Millionaire fans like and serve them appropriate content.  With opinion graph data, of the type we collect at Ranker, you might be able to differentiate between a person who thinks that Slumdog Millionaire is simply a great movie versus someone who thinks the soundtrack was one of the best ever.  If you liked the movie, we would predict that you would also like Fight Club.  But if you liked the soundtrack, you might instead be interested in other music by A.R. Rahman.
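The difference can be made concrete with a toy sketch.  The data model here (an Opinion record carrying the list a vote was cast on, and a hand-written table of related items) is purely hypothetical and only mirrors the example above; it says nothing about how Ranker actually stores or queries its graph.

```python
from collections import namedtuple

# An opinion records not just the item but the context (which list) and the vote.
Opinion = namedtuple("Opinion", ["item", "list_name", "vote"])

# Hypothetical table of related items, keyed by (item, list context).
related = {
    ("Slumdog Millionaire", "Best Movies"): ["Fight Club"],
    ("Slumdog Millionaire", "Best Soundtracks"): ["Other music by A.R. Rahman"],
}

def recommend_from_interest(item):
    """Interest graph: we only know the person engaged with the item,
    so every kind of related content is a candidate."""
    return sorted({rec for (it, _), recs in related.items() if it == item for rec in recs})

def recommend_from_opinion(opinion):
    """Opinion graph: we know what the person liked about the item,
    so only content related to that specific opinion is returned."""
    if opinion.vote <= 0:
        return []  # a downvote is not a reason to recommend more of the same
    return related.get((opinion.item, opinion.list_name), [])

print(recommend_from_interest("Slumdog Millionaire"))
print(recommend_from_opinion(Opinion("Slumdog Millionaire", "Best Soundtracks", 1)))
```

The interest-based call returns both candidates, while the opinion-based call narrows the result to the soundtrack-related one.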

Simply put, the opinion graph can predict more about people than the interest graph can.

– Ravi Iyer

by   Ranker
Staff
in New Features

Latest Features on Ranker

There are a lot of neat little things we’ve been working on around here in the lab. Things that make it easier and more fun to make the lists you want to make. Take a peek:

Send A Note

We were sitting around the other day in the conference room and someone said, “Hey, wouldn’t it be cool if users could talk to each other? Like email, sorta?”

So we decided that you guys should totally get in on this whole “electronic” form of communication. Now, if you read a list you like, or are intrigued by the genius behind “5 Ways To Make Homemade Spam”, you can go to their profile page and send the list-maker a note and let them know that there are actually 6 ways! And, because we believe in the goodness of the human spirit, we are sure you guys won’t use this new power for evil.


PS. You need to be logged in to see this new feature!

 

Adding Items

Remember that time you made your favorite movie list? But you couldn’t remember ALL your favorite movies, because you’re not a damned robot, right? And then you were looking at someone else’s favorite movie list – or maybe perusing the Best Movies of All Time list – and you saw Piranha II: The Spawning listed there. That is totally one of your favorite movies, but you forgot until just now! Well, we have a way for you to add it to your own list with a single click. If you click that blue ‘+’ button, you will get a dropdown with any relevant lists of yours that Piranha II might be good to add to. Select your favorite movie list from the dropdown and POW, that James Cameron classic is now on your own list, too!


PS. You need to be logged in to see this new feature!

 

SlideShow View

You already know that you have two choices for how your list displays on Ranker. You can write lots of lovely words for the internet to read with big pictures… or you can just create easily digestible stacked lists with small images. Now we give you a third option… Slideshow! Build your list like normal in Edit and put in nice pretty images that will look good big (this view supports any commentary you might want to add, too!). Choose the ‘slideshow view’ option from your ‘list options’ popup, and when you publish, your list will display one beautiful item at a time!


 

Filtering Lists

We have so many lists on Ranker. So. Many. And sometimes it’s overwhelming, we know. God, we know. But we’ve been tagging lists (and so have you) for the last few years and we finally went ahead and made use of those tags. Now, when you go into any of the big category tabs on Ranker (film, tv, people, etc) you will see a little array of blue buttons on the top of the right sidebar. You can use these little buttons to sort and filter the content of that category in a million different ways! Each new filter button will narrow down your results until you find the exact lists you are looking for. Go try it!


 

Stylish Copy

One of the things we’ve never really had so much around here is the ability to dress up the things you guys are writing on your blog view lists. Bolding, italics, stuff like that. Well, fret no more! We now support a simple text styling interface in Edit.

When you are building your lists, and you want to write stuff… just click on the text field for your item. There is a whole little string of new tools there that allows you to make your text a lot fancier! And easy! Always easy!


by   Ranker
Staff
in Data Science

Mitt Romney Should Have Advertised on the X-Files

With the election recently behind us, many political analysts are conducting analyses of the campaigns, examining what worked and what didn’t.  One specific area where the Obama team is getting praise is in their unprecedented use of data to drive campaign decisions, and even more specifically, how they used data to micro-target fans who watched specific TV shows.  From this New York Times article concerning the Obama Team’s TV analytics:

“Culling never-before-used data about viewing habits, and combining it with more personal information about the voters the campaign was trying to reach and persuade than was ever before available, the system allowed Mr. Obama’s team to direct advertising with a previously unheard-of level of efficiency, strategists from both sides agree….

[They] created a new set of ratings based on the political leanings of categories of people the Obama campaign was interested in reaching, allowing the campaign to buy its advertising on political terms as opposed to traditional television industry terms…..

[They focused] on niche networks and programs that did not necessarily deliver large audiences but, as Mr. Grisolano put it, did provide the right ones.”

 

The Obama team focused more on undecided/apolitical voters in an effort to get them to the polls.  Given that some Mitt Romney supporters have blamed a lack of turnout of supporters for the results of the election, perhaps Romney would have been smart to have created a ranked list of TV shows, based on how much fans of the shows supported Romney, and then placed positive/motivating ads on those shows in an effort to increase turnout of his base.  Where would Romney get such data?  From Ranker!

Mitt Romney is on many votable Ranker lists (e.g. Most Influential People of 2012) and based on people who voted on those lists and also lists such as our Best Recent TV Shows list, we can examine which TV shows are positively or negatively associated with Mitt Romney.  Below are the top positive results from one of our internal tools.

As you can see, the X-Files appears to be the highest correlated show, by a fair margin.  I don’t watch the X-Files, so I wasn’t sure why this correlation exists, but I did a bit of research, and found this article exploring how the X-Files supported a number of conservative themes, such as the persistence of evil, objective truth, and distrust of government (also see here).  The article points out that in one episode, right-wing militiamen are depicted as being heroic, which would never happen in a more liberal-leaning plot.  Perhaps if you are a conservative politician seeking to motivate your base, you should consider running ads on reruns of the X-Files, or if you run a television station that shows X-Files reruns, consider contacting your local conservative politicians and leveraging this data.
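For anyone curious how a ranked list like this could be produced from cross-list votes, here is a minimal sketch.  The votes, the names “Candidate”, “Show A” and “Show B”, and the association measure itself (the gap in candidate support between a show's fans and its detractors) are all assumptions for illustration; Ranker's internal tool may compute its correlations quite differently.

```python
# Hypothetical votes across lists: voter -> {item: +1 (upvote) or -1 (downvote)}
votes = {
    "v1": {"Candidate": 1,  "Show A": 1,  "Show B": -1},
    "v2": {"Candidate": 1,  "Show A": 1,  "Show B": 1},
    "v3": {"Candidate": -1, "Show A": -1, "Show B": 1},
    "v4": {"Candidate": -1, "Show A": 1,  "Show B": 1},
    "v5": {"Candidate": 1,  "Show B": -1},
}

def support_gap(votes, target, show):
    """Share of target upvoters among the show's fans minus the share among its detractors.
    Returns None when either group is empty, since no comparison is possible."""
    fans, detractors = [], []
    for ballot in votes.values():
        if target in ballot and show in ballot:
            group = fans if ballot[show] > 0 else detractors
            group.append(ballot[target] > 0)
    if not fans or not detractors:
        return None
    return sum(fans) / len(fans) - sum(detractors) / len(detractors)

def rank_shows(votes, target):
    """Rank every non-target item by its support gap, most positive first."""
    shows = sorted({item for ballot in votes.values() for item in ballot} - {target})
    scored = [(show, support_gap(votes, target, show)) for show in shows]
    return sorted((pair for pair in scored if pair[1] is not None), key=lambda pair: -pair[1])

ranking = rank_shows(votes, "Candidate")
for show, gap in ranking:
    print(f"{show}: {gap:+.2f}")
```

In this toy data, Show A comes out positively associated with the candidate and Show B negatively, which is the kind of split an ad buyer would be looking for.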

You may notice that this list contains more classic/rerun shows (e.g. Leave it to Beaver) than current shows.  This appears to be part of a general trend where conservatives on Ranker tend to positively vote for classic TV, a subject we’ll cover in a future blog post.  The possibility of advertising on reruns is part of what we would like to highlight in this post, as ads are likely relatively cheap and audiences can be more easily targeted, a tactic which the Obama campaign has been praised for.  At Ranker, we’re hopeful that more advertisers will seek value in the long-tail and mid-tail and will seek to mimic the tactics of the Obama campaign, as our data is uniquely suited for such psychographic targeting.

– Ravi Iyer