Collecting and Connecting Millions of Opinions


Ranker Insights is the Most Precise Data for Entertainment, Personalities, Sports, Brands and More

Ranker is a leading digital media company that ranks opinions on (almost) everything through our vote-based user experience. Our rankings don’t just collect opinions; they contextualize them. Through context, Ranker can distinguish users who prefer an actor’s talent from those who prefer their attractiveness, for example, or fans who like a college for its academics from those who like it for its athletics, along with the millions upon millions of correlations therein.

So we created Ranker Insights: Ranker’s first-party analytics platform, which turns data from users’ votes into actionable intelligence with countless applications.

by    in Data

Baby Bomb – Here’s How We Knew Bridget Jones’s Baby Would Tank

Doc is going to be honest here. He was probably never going to buy a ticket for Bridget Jones’s Baby… mostly because Doc believes in restricting oneself to just an apostrophe when a possessive word ends in “s.” But also because the travails of a winsome Anglo-dumpling with a journaling fixation never held much personal appeal.

But movies that Doc doesn’t personally care for make bank all the time, and clearly there were plenty in Hollywood (or at least at Universal Pictures) who were convinced that the franchise’s devoted fanbase would turn out for another spin on the Bridget-go-round. And why not? Over the past couple of years, Sequels That No One Asked For actually have been a pretty safe bet, especially the ones targeting women over 25. My Big Fat Greek Wedding 2 wasn’t the surprise smash of the original, but it more than made its budget back, grossing a respectable $60 million in the U.S. And last year’s Second Best Exotic Marigold Hotel took in over $80 million worldwide, with a little over a third of that total coming from the U.S. With this summer’s sleeper hit Bad Moms proving the strength of the women-over-25 market, and credible critical response, most experts were looking at Bridget Jones’s Baby opening at $15 million, if not higher.

bluegraph

Of course, the gang here at Ranker are not “most experts.” And accordingly, Doc can say that we had a pretty strong idea that Bridget Jones’s Baby was due for a troubled birth and a sickly, blighted existence on this earth. How’d we know?

Easy. We pulled up Ranker Insights, and dug into the numbers on Bridget Jones’s Diary, the first and best-regarded of Bridget’s misadventures. After all, the fanbase for Bridget Jones’s Diary seems like an obvious—really, the obvious—group for the movie to market to. And we learned all sorts of interesting things, like that it’s the 66th best rainy-day movie, and that Bridget’s stateside popularity is strongest in the southeast, then wanes as you move north and west across the country.

And then we pulled up the list of other films that Bridget Jones fans were most likely to voice their approval of. The first on the list, unsurprisingly, is Bridget Jones: The Edge of Reason, the (widely derided) first sequel to Diary. But how about those next six titles? See if you can spot any pattern…

  • Love Actually
  • Elizabeth
  • About a Boy
  • Notting Hill
  • Sense and Sensibility
  • Four Weddings and a Funeral

moviecollage-1

You’re a smart cookie—you see where Doc is going with this, yes? No fewer than five of those six movies feature the harried, boyish stammerings of one Hugh John Mungo Grant. (And in related news: Mungo? MUNGO? Doc swears he isn’t making this stuff up.) Yes, Love Actually additionally features Grant’s Bridget Jones co-star Colin Firth, which probably accounts for its placement at #2 on the list after Edge of Reason. But otherwise, the message is clear as day: Above all others, Bridget Jones fans love, love, love them some Hugh Grant.

mungo80

This would be just peachy, except for the tiny, easily-overlooked detail that Hugh Grant decided he wanted no part of Bridget Jones’s Baby, and isn’t in the movie.   Even if you’ve just seen the film’s traditional three-shot poster, you know that the role of “handsome douche” previously filled by Grant is this time assayed by Patrick Dempsey (nee McDreamy). Now Bridget Jones fans don’t seem to have anything especially against Dempsey. On the list of TV shows most liked by Bridget Jones fans, Grey’s Anatomy ranks #9. (It’s still behind Pinky & the Brain and Golden Girls, so go figure.) But there’s no comparison between their mild affection for Dempsey and their deep and abiding passion for Hugh Grant. Their feelings for Grant’s co-stars Renee Zellweger and Colin Firth similarly pale by comparison. After Edge of Reason, the top Zellweger film on the list is Chicago, at #19. Zellweger’s breakthrough film, Jerry Maguire, sits at #532.

Wouldn’t you think that if the Bridget Jones fanbase was really devoted to Renee Zellweger, they’d be more inclined to like Jerry Maguire than, say, The Mighty Ducks or American History X? But no. Apparently, fans of Bridget Jones would rather watch Ed Norton curb-stomp a dude than see Renee Zellweger “complete” Tom Cruise. Good stuff to bear in mind when you’re planning your next at-home double feature.

graphwithfaces-1

And so there was zero astonishment around Ranker HQ when Bridget Jones’s Baby didn’t even crack $9 million in its opening weekend. Doc takes no joy in being right about this stuff. He wants all the movies to do well, what with a rising tide lifting all boats and everything. But when you blow it this big, and this obviously, you deserve to get called on it.

So for future reference, trying to sustain a movie franchise after shedding its fans’ favorite character/actor is a lousy idea. And that’s the only truth Doc has for you this week, baby.

by    in Data

Big Data Shows Movie Fans Love Tom Hanks, Just Not in Sequels

It’s summertime. And when it comes to big-budget movies, that also means it’s sequel time. We’ve already seen remarkable successes like Captain America: Civil War and Finding Dory, and a few flops (at least relative to their budgets) like Teenage Mutant Ninja Turtles 2 and Independence Day: Resurgence. This got us at Ranker Insights thinking: what goes into making a successful sequel? The truth is, a lot of factors contribute, and the box office success of the original is just one of them. From solid, open-ended plot lines and apparent depth of main characters to built-in fan bases and predictably bankable actors, big data suggests many factors come into play when creating a flourishing movie franchise. However, this much seems certain: you’re probably better off casting anyone but the man voters consider the greatest actor of all time.

Allow us to explain. Big data can tell you big things when it comes to making a great film. But if you’re planning on getting the most bang for your buck on your original idea, even the smallest details can be a big deal. For instance, let’s take a look at the top 30 of the Best Movie Characters of All Time. Notice anything? Sure, you see all the memorable characters you would expect to see near the top: Forrest Gump, Indiana Jones, James Bond, and Bruce Wayne/Batman are all in the top 10. This makes sense, especially when you consider that their names are usually in the titles of the movies they star in. Look a little closer, and further analyze the films from which these characters came. Of the top 30, 22 were strong enough to star in a sequel or trilogy. Now, let’s look at the eight that didn’t return to entertain you once again. What do all these movies have in common? That’s right. They all involve the indisputably lovable Thomas Jeffrey Hanks.

Why is this, you ask? Good question. Certainly Toy Story was a smashing success, and it went on to spawn not one but two great sequels. Toy Story 2 was even voted 8th on Ranker’s list of Best Movie Sequels. But for obvious reasons, that franchise featured just his voice, not his face. The only sequel in which Tom Hanks had to actually act, Angels & Demons (the follow-up to The Da Vinci Code), produced far less favorable results. While Angels & Demons still proved to be a box office success, it took in only about two-thirds of what its predecessor did. And as for the character Hanks portrayed, Robert Langdon, well, he is nowhere to be found on the Best Movie Characters of All Time list.

It doesn’t seem to be Tom’s choice of directors either, as the Tom Hanks/Steven Spielberg combo is a whopping 975% more likely to be liked by Tom Hanks fans, with the Tom Hanks/Ron Howard team coming in second at 809%. And it’s not like these fans are averse to the idea of sequels either. Voters who like Tom also like their action, adventure, and animated sequels. In fact, Tom fanatics are 549% more likely to enjoy Captain America: The Winter Soldier; 258% more likely to have high praise for Back to the Future Part II; and 396% more likely to be a fan of the previously mentioned Toy Story 2. Heck, the analytics show that voters on the Greatest Actor & Actress in Entertainment History list are up for a sequel of any kind: they’re 38% more likely to vote up the universally agreed-upon clunker, Crocodile Dundee in Los Angeles. Maybe Tom Hanks-related sequels were meant not to be seen, but simply heard.
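For readers wondering how a figure like "549% more likely" could be computed, here is a minimal sketch of one common reading: compare the rate at which fans of one thing like another thing against the rate among everyone else. The function name and the vote counts are hypothetical, and this is an assumption about the arithmetic, not a description of how Ranker Insights actually calculates its numbers.

```python
def percent_more_likely(fans_liking, fans_total, others_liking, others_total):
    """Percent by which fans are more likely than non-fans to like an item.

    Computed as (fan rate / baseline rate - 1) * 100, with the ratio
    rearranged so that it uses a single division.
    """
    ratio = (fans_liking * others_total) / (fans_total * others_liking)
    return (ratio - 1.0) * 100.0


# Hypothetical counts, chosen only to illustrate the arithmetic:
# 300 of 1,000 fans like the sequel, versus 500 of 10,000 other voters.
lift = percent_more_likely(fans_liking=300, fans_total=1000,
                           others_liking=500, others_total=10000)
print(f"{lift:.0f}% more likely")
```

Note that under this reading "100% more likely" means twice the baseline rate, so a three-digit percentage is a large multiple rather than a probability.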

Perhaps it’s just a demographic thing? Nope, that doesn’t seem to matter either. In fact, Toy Story 2 drops in the rankings to number 9 among international voters and further still to number 10 among female voters. Judging by the data mined from Actors You Would Watch Read The Phone Book, Tom Hanks fans are 200% (or more) more likely to listen to Robert De Niro, Harrison Ford, Johnny Depp or Liam Neeson go through the names from A to Z, and all four know a thing or two about sequels. However, with Hanks ranking sixth on that same list, we can confidently deduce that the reason for so few sequels from the actor is probably not his acting itself.

In all likelihood, it’s just a content thing. Most of Hanks’s roles have a historical end, or at the very least, a distinctive one. The stories he stars in just don’t lend themselves to sequels. Voters must agree, as there is nary a Hanks movie to be found on Ranker’s list of Movies That Need Sequels. Saving Private Ryan? Saved. Catch Me If You Can? Caught. Philadelphia? Finished. So don’t hold your breath waiting for Forrest Gumper or Sully 2: Nursing Home Boogaloo, regardless of how well Sully does upon its release in early September. These Hanks vehicles just don’t seem to be in demand, success be damned.

Now, Ranker Insights would never be one to tell you how to create a successful movie franchise, because frankly, that would be a thankless job. But if your job is to create a character that is memorable enough to secure a sequel, big data shows your main character should probably be a Hanks-less one. He seems to be the epitome of Mr. One-and-Done.

by    in Data

Using Data To Determine The Best Months Of The Year

Why do people like some months more than others? For many, it is all about the holidays:

“I love the scents of winter! For me, it’s all about the feeling you get when you smell pumpkin spice, cinnamon, nutmeg, gingerbread and spruce.” – Taylor Swift

while for others, it is about avoiding the cold

“A lot of people like snow. I find it to be an unnecessary freezing of water.” – Carl Reiner

and for some more disaffected souls, it is about the specifics

“August used to be a sad month for me. As the days went on, the thought of school starting weighed heavily upon my young frame.“ – Henry Rollins

Presumably all of these preferences and all of this angst are reflected in Ranker’s Best Months of the Year list. The graphic below provides a visualization of the opinions of Ranker users. Each row is a different person, and their (sometimes incomplete) ranking of the months is shown from best to worst, left to right. The months are color-coded by the four seasons: spring has hues of green, summer is yellow, fall has the rustic earth hues of brown, and winter is blue.

BestMonthsOriginal

The patchwork quilt of colors and hues makes it clear that different people have different opinions. We wanted to understand the structure of these individual differences, using cognitive data analysis.

To do this, we used a simple model of how people produce rankings—known as a Thurstonian model, going back to the 1920s in psychology—that we have previously applied successfully to Ranker data. Rather than assuming everybody’s rankings were based on a shared opinion, we allowed this version of the model to have groups or clusters of people, and for each group to have their own preferences for the months. We didn’t want to pre-determine the number of groups, and so we allowed our model to make this inference directly from the data. Our modeling approach thus involves two sorts of interacting uncertainties: about how many groups there are, and about which people belong to which group. Bayesian statistical methods are well suited to handling these sorts of uncertainties.
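To make the idea concrete, here is a small sketch of how a Thurstonian model with groups produces rankings: each group has a latent value for every month, each person adds their own random noise to their group's values, and the person's ranking is simply the noisy values sorted from best to worst. The group names, the latent values, and the function name are all invented for illustration. The sketch only simulates rankings; it does not perform the Bayesian inference of group number and membership described above.

```python
import random


def thurstonian_ranking(means, sd=1.0, rng=random):
    """Draw one ranking: add Gaussian noise to each item's latent value, then sort."""
    draws = {item: rng.gauss(mu, sd) for item, mu in means.items()}
    return sorted(draws, key=draws.get, reverse=True)  # best first


# Hypothetical latent values for two made-up groups (not fitted to Ranker data).
group_means = {
    "summer_lovers": {"January": -1.5, "April": 0.0, "July": 2.0, "October": 0.5},
    "holiday_lovers": {"January": 0.0, "April": -0.5, "July": -2.0, "October": 1.5},
}

rng = random.Random(42)  # fixed seed so the simulation is repeatable
for person in range(4):
    group = rng.choice(sorted(group_means))  # which group this person belongs to
    print(group, thurstonian_ranking(group_means[group], rng=rng))
```

People in the same group tend to produce similar rankings without producing identical ones, which is the kind of patchwork the model has to explain when it works backwards from rankings to groups.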

For fans of Bayesian cognitive graphical models — we know you’re out there — the final model we used is shown in the figure below. For non-fans of Bayesian cognitive graphical models — we KNOW you’re out there — there are three important parts. The variable gamma at the top corresponds to how many groups there are, the variables z to the side correspond to which of these groups each individual belongs, and all of this is inferred from the rankings people gave, represented by the variables at the bottom.

GraphicalModel

The figure below shows the first key insight from the model. It shows the probability that there are 1, 2, …, 17 groups, ranging from everybody having the same opinion about the best months, to everyone having their own unique opinion. There is uncertainty about how many groups the rankings reveal, but the most likely answer is that there are four.

Gamma

Assuming there are four groups, the figure below organizes the ranking data  by grouping together the people most likely to belong to each group. Group 1 shows a preference for late summer and early fall, and hates cold weather. Group 2 shows a preference for the holidays. They like fall and Christmas time and despise hot weather. Group 3 loves the summertime and hates the winter. We had a look at where these people were from, and it probably comes as no surprise they’re all from the north-east of the US. The last group, a bit like Henry Rollins, stands out as a consensus of one.

BestMonths

This analysis shows how cognitive models with individual differences can help understand opinion groupings, and deal with difficult questions like how many groups exist. One especially interesting feature of the Best Months list is that at least one of the groups is defined more by what comes at the bottom of their lists than the top. People in group 1 don’t agree very precisely on which months they like, but they all agree they don’t like winter months. This shows that it is not just the top few items on a Ranker list that carry useful information: what comes at the bottom can be just as informative. Both what you love and hate matters.

“When I was young, I loved summer and hated winter. When I got older I loved winter and hated summer. Now that I’m even older, and wiser, I hate both summer and winter.” – Jarod Kintz

 

Crystal Velasquez and Michael Lee

by    in Data

According to Big Data, Millennials Don’t Care Much About America’s Pastime

Does Respect for the Past Bode Well for Baseball’s Future?
Breaking Down the Big Data of the Greatest Baseball Players of All Time List

How much does America’s Pastime’s current popularity factor into the rankings of who are the greatest baseball players of all time? And, what factors beyond simple player statistics come into play when one makes their own list? Well, the resulting Ranker data speaks – or rather, cheers – volumes when it comes to players of past generations. While nostalgia might have some effect on the voting, is the lack of current players represented on the list a sign that voters have an unwavering respect for the legends of the past, or is our national pastime becoming just that? Past its time.

Ranker asked participants upfront to list the best baseball players only by their on-field accomplishments. Nearly 115,000 votes from almost 7,500 participants have chimed in, and it’s no surprise who was the consensus top pick. With a lifetime batting average of .342 and #1 in all-time OPS (on-base plus slugging percentage), the voters made their choice clear: Babe Ruth. Anyone who has had a casual conversation around this topic knows the Great Bambino is always one of the first names mentioned when it comes to ranking the greatest players of all time, and he’s usually a favorite across all ages.

Whether you are an astute baseball statistical historian, have been sitting in your team’s bleachers since you were a child, or are one of the nearly 60 million people who play fantasy sports, you probably have at least a passing opinion about who is the best of all time. According to Ranker’s data, your top 5 has some combination of the Babe, Stan Musial, Ted Williams, Mickey Mantle, and Willie Mays or Hank Aaron, the last of whom was the most recent of the group to retire, all the way back in 1976. Once you break down the demographics even a little bit further, that’s when things start to get interesting.

Gone, but not forgotten.

The most glaring finding at first glance is that there’s nary an active player on the all-time list’s starting roster. In fact, it isn’t until you get down to #44 that you’ll find someone who is still an active player, Ichiro Suzuki. For the record, Ichiro is ranked only #76 on Ranker’s Top CURRENT Baseball Players List. Does this imply that voters know and respect their history? Or could it be that the current crop of baseball players isn’t well represented because they aren’t being watched? Television ratings data suggests that a steady decline in viewership over the years might be a factor in the voting. While Major League Baseball as an entity is as strong as ever (just have a look at some of the salaries they’re handing out), people aren’t as interested in the game as they used to be.

How much does a voter’s age factor into the results? A deeper dive into the big data analytics suggests quite a bit. Baby Boomers are 184% more likely to have Mel Ott on their list than any other age group because, you know, they’ve actually seen him play. If you’re between the ages of 30 and 49, you are a whopping 305% more likely to have Sadaharu Oh of the Yomiuri Giants on your list (which suggests that internationally, fans aren’t only passionate about their soccer). Millennials, meanwhile, must enjoy a good quote: they are 248% and 234% more likely to vote for the non sequitur machine Yogi Berra and the forever quirky Rickey Henderson, respectively. And while Ranker doesn’t have analytics to suggest that voters in the 30-49 age demographic were all mustache enthusiasts, they were 281% more likely to include Rollie Fingers on their list.

However, those stats focus on specific characters in the game that a certain demographic is drawn to. Where are the Mike Trouts (#1 with people under the age of 29 on the Top Current Baseball Players List), Clayton Kershaws (#2), or players who have brand recognition among fans like Troy Tulowitzki (#20)? All of them, gaudy numbers and all, failed to crack the top 100. In fact, the only other active players on the list (besides the aging Ichiro) were the also-aging Albert Pujols (#48) and Miguel Cabrera (#90). Maybe, there’s just not a large (or long) enough sample size to include current players on this list of all-time greats.

Is today’s game yesterday’s news?

Perhaps voters are just into something else. When you look at the voting demographics, young voters are the least represented participants, with the majority being aged 30 and up. But with nearly 23% of the votes, you would think at least a couple more current players would sneak in, wouldn’t you? Perhaps baseball just doesn’t resonate with this new generation. They’re gravitating toward playing lacrosse, playing on their video game consoles, or even fiddling with their smartphones. As a recent article in the Wall Street Journal suggests, younger people are just tuning out.

So who’s got next?

The times may have changed, but according to Ranker data, the best baseball players really haven’t. From Cobb in the dead-ball era and Satchel Paige of the Negro Leagues to various International Leagues and beyond, the voters know that the greatest all-time baseball was played beyond just the Major Leagues here in the States. Records were made to be broken, but which of the best baseball players of today do you think will eventually break into the all-time list? Only time (and the fickle, under the age of 30 voters) will tell. So if you should happen to ask a Millennial if they saw the game last night, just don’t expect them to inquire who won. You’ll probably just get a “who cares?”

by    in Popular Lists

Why do Ranker voters think Ellen should be president?

Yesterday, Ellen talked about being voted #1 on our list of Celebrities who should run for President.

What is it that makes a celebrity “president”-worthy?  Because Ranker polls about each person along dozens of dimensions (e.g. cool vs. hot vs. good actor vs. trustworthy vs. ?), we can see how ratings on other lists relate to being voted as someone who should run for president.  For example, below we can see that being seen as “cool” is only weakly related to being seen as presidential, with actors like Tom Hanks and Clint Eastwood scoring as relatively cool, but not relatively presidential.

CoolVsPresident

Being good at your job seems to relate moderately to being seen as presidential. For example, below you can see how being seen as a good actor positively relates to being seen as presidential, with people like Meryl Streep, Leonardo DiCaprio, and Morgan Freeman scoring well on both fronts.

GoodActorVsPresident

Being seen as presidential also relates well to likability. Below you can see how the men people want to have a beer with, like Johnny Depp, Morgan Freeman, and DiCaprio, also tend to be people they rate well as potential presidential candidates.

BeerVsPresident

It seems to relate best to trust as people like Ellen, Meryl Streep, and Morgan Freeman seem to be rated as both Trustworthy and as someone who should run for President.  Notice how the items below form a fairly straight line going up and to the right.

TrustVsPresident
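One way to put a number on "a fairly straight line going up and to the right" is a correlation coefficient between the scores on two lists. The sketch below computes a Pearson correlation by hand; the function name and the scores for the placeholder celebrities A through E are made up for illustration, and the source does not say which statistic, if any, was used behind these charts.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical scores (0 to 100) for five placeholder celebrities, A through E.
presidential = [90, 70, 60, 40, 20]
trustworthy = [85, 75, 55, 45, 25]   # tracks "presidential" closely
cool = [50, 80, 30, 70, 45]          # only loosely related

print(f"trust vs. president: {pearson(trustworthy, presidential):.2f}")
print(f"cool vs. president:  {pearson(cool, presidential):.2f}")
```

A value near 1 corresponds to points hugging a rising line, while a value near 0 corresponds to a shapeless cloud.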

In all, looking at the relationship between Ranker lists yields comparable results to what political scientists find drives evaluations of presidential candidates.  People want a president who is competent, likable, and trustworthy.  And clearly Ellen fits all three buckets as she ranks as one of the best comedians of all-time, someone people would want to have a beer with, and as trustworthy.  Hence, Ranker users vote her as the #1 Celebrity Who Should Run for President.

Ravi Iyer

by    in Popular Lists

Ranker Users Predict Final Four Teams Accurately Based on Limited Bias

In 2015, Ranker’s voters predicted seven teams in the NCAA tournament’s Elite Eight. With the field for the Sweet 16 now set, we can see how well our rankings can predict how far a particular team will go in this year’s tournament. This has been a historically tumultuous season of college basketball. Top-10 teams lost regularly, upsets were commonplace, and no teams were safe.

We can use Ranker’s data to see which teams are having a year that matches their historical reputation as a powerhouse, and which are not. Ranker visitors drew a clear line around North Carolina, Michigan State, Kansas and Villanova as favorites to make it into the Final Four. Kentucky is notable because it ranks highest in the overall best college programs poll, but is not predicted by our voters to end up in the Final Four. Villanova, which is not ranked among the top historical teams, is the main outlier in the other direction: it lacks the historical standing of Kentucky and UNC, yet is expected to have a good tournament showing.

The rankings provide an insight into how our voting data is based on the current season instead of a bias towards teams based on their longstanding reputations.

 

Here are our results from the 2015 tournament:

Screen Shot 2016-03-08 at 12.25.50 PM

 

Here are our results for this year’s tournament:

Screen Shot 2016-03-08 at 10.43.38 AM

 

 

by    in Popular Lists

Duke and Kentucky Among Teams with the Most Annoying Fans

With March Madness tipping off, we turn to Ranker’s voters to learn more about college basketball and what to expect in this year’s tournament!

Which college basketball fan base wears its pride the best way? We all know the traditional powers in college basketball, but sometimes their gloating can be a bit much. In two separate lists, Ranker visitors ranked which college basketball team was the best, and which had the most annoying fans. When we combine these two lists, we can see which team is best respected for its prowess on the court and how this relates to how annoying its fans are to the rest of the world. As it happens, powerhouse bluebloods Duke and Kentucky are ranked among the top teams both for being historically successful and for having annoying fans. The most successful team with only moderately annoying fans is North Carolina. The respected team with the least annoying fan base was Villanova. Ohio State and Florida stand out for having annoying fans without being particularly respected as programs overall.

 

Screen Shot 2016-03-01 at 1.08.08 PM

 

by    in Popular Lists

Combining Best and Worst Lists to find Polarizing TV Shows

Ranker lists are expressions of people’s opinions, and it is possible for people to have opposite views. The same movie, television show, song, or celebrity can be loved or hated by different groups of people. (If this is not immediately obvious, think about Donald Trump for a moment). Social psychology has long been interested in differences of opinion, and has gathered all sorts of evidence that people will take more extreme views in an argument (attitude polarization), that they will focus on evidence that reinforces what they already believe (confirmation bias), and that they tend to judge new items and experiences based on their previous knowledge (apperception).

Ranker can provide evidence of polarization, since people’s ranks can express different opinions about the same items. This polarization can be especially clear when looking at “best” and “worst” lists on the same general topic. At the moment, it is easy to imagine Donald Trump at the top of both a “Best Presidential Candidates” and a “Worst Presidential Candidates” list. About the only way to explain this pattern of opinions is to identify Trump as a polarizing person. He doesn’t lead to one opinion or attitude. He polarizes people into “lovers” and “haters”.

Previously, we have developed cognitive models to analyze Ranker lists as diverse as the Soccer World Cup, movie box office takings, and how people feel about pizza toppings. None of these models, however, allowed for polarization. The assumption has always been that each item was perceived in a similar way by everybody. So, we extended our cognitive modeling approach to allow for polarizing items, perceived by some users with a “positive spin” and by others with a “negative spin”.

Not wanting to give Trump any more publicity, we decided to test the new model by looking at people’s opinions of recent TV shows. The two lists we looked at were The Best New TV Series of 2015 and The Most Disappointing New TV Shows of 2015. Together these lists involve 22 users (17 in the best list and 5 in the worst list) ranking a total of 67 shows, with 14 shows appearing on both best and worst lists. Some of the lists had as few as 3 shows, while others had as many as 27, with an average of about 9 shows per list.

Our new model assumes each TV show is represented in one of two ways. One possibility is that everybody has the same opinion, and the show is not polarizing. This means that if a TV show is good, for example, people put it high in their best list, and either put it low in their worst list or don’t list it there at all. On the other hand, if a TV show is bad, people put it high in their worst list, and either put it low in their best list or don’t mention it there at all. The new possibility in our model is that a show is polarizing, and so some people believe it is good while others believe it is bad. These polarizing shows need two separate representations: one for the “lovers”, and one for the “haters”.
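As a rough intuition for what "polarizing" looks like in the raw lists, the sketch below flags any show that lands near the top of a meaningful share of both the best and the worst lists. This is a simple counting heuristic standing in for the cognitive model, not the model itself; the function names, the `top_n` and `threshold` settings, and the placeholder show names are all assumptions made for illustration.

```python
def top_share(rankings, item, top_n=3):
    """Fraction of rankings that place `item` within the first `top_n` positions."""
    if not rankings:
        return 0.0
    hits = sum(1 for ranking in rankings if item in ranking[:top_n])
    return hits / len(rankings)


def polarizing_items(best_lists, worst_lists, top_n=3, threshold=0.3):
    """Items that appear near the top of enough 'best' AND enough 'worst' lists."""
    in_best = {item for ranking in best_lists for item in ranking}
    in_worst = {item for ranking in worst_lists for item in ranking}
    return sorted(
        item for item in in_best & in_worst
        if top_share(best_lists, item, top_n) >= threshold
        and top_share(worst_lists, item, top_n) >= threshold
    )


# Hypothetical user lists, each ordered from rank 1 downward.
best_lists = [["Show A", "Show B", "Show C"], ["Show B", "Show A", "Show D"]]
worst_lists = [["Show A", "Show E"], ["Show E", "Show D"]]

print(polarizing_items(best_lists, worst_lists, top_n=2))
```

Here "Show A" is flagged because it sits high on both kinds of list, while "Show D" appears on both but only near the bottom of the best lists, so it reads as mildly disliked rather than polarizing.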

TVShowBlog

The model we created determined which shows were polarizing and which were not, and how each should be represented on a scale from best to worst. The results are summarized in the graph. The shows are listed from best at the top to worst at the bottom. If a show is not polarizing, it is listed once in gray. If a show is polarizing, it is listed twice: once in green for its positive form, and once in red for its negative form. The graph also summarizes the Ranker data that led to these conclusions. The green circles indicate when a show was included in the “best” list, starting from rank 1 on the left, to lower ranks moving to the right. The larger the area of the circle, the more people ranked the show in that position. The red crosses indicate when a show was included in the “worst” list, again starting from rank 1 on the left, and again with the size of the cross indicating how often it was ranked in that position.

It is clear from the figure that shows identified as polarizing (Better Call Saul, Empire, Ballers, Backstrom, and so on) generally were included in high positions on both the “best” and “worst” lists. Other shows are not polarizing: Last Man on Earth is consistently highly rated, and Schitt’s Creek seems to review itself with its name. A good question for the producers, marketers, and consumers of these TV shows is why some are polarizing. Better Call Saul, which is perhaps the most polarizing show in our results, is a nice example. It has a “lover” representation at the top of the overall list, and a “hater” representation near the bottom. One possibility is that the polarization arises because Better Call Saul was created as a spin-off prequel to Breaking Bad, and many people would argue that Breaking Bad is one of the greatest television series of all time (and we’d agree). We guess that the people who had a negative opinion of Better Call Saul were die-hard fans of Breaking Bad, and found it didn’t match their lofty expectations. On the other hand, people with positive opinions of Better Call Saul probably evaluated it largely independently of Breaking Bad, as a good new crime television series.

Whatever the causes of polarization, it seems clear that Ranker data provide useful measures, and we think our modeling approach can lead to deeper insights. Finding what is polarizing, and identifying the “lovers” and “haters” should apply not just to TV shows, but to rappers, directors, songs, and everything else where not everybody feels the same way about everything. There is lots for us to do. Or, as Donald Trump has it: “If you’re going to be thinking, you may as well think big.”

– Crystal Velazquez and Michael Lee

by    in Data, Data Science, Popular Lists

Applying Machine Learning to the Diversity within our Worst Presidents List

Ranker visitors come from a diverse array of backgrounds, perspectives and opinions.  The diversity of the visitors, however, is often lost when we look at the overall rankings of the lists, due to the fact that the rankings reflect a raw average of all the votes on a given item–regardless of how voters behave on multiple other items.  It would be useful then, to figure out more about how users are voting across a range of items, and to recreate some of the diversity inherent in how people vote on the lists.

Take, for instance, one of our most popular lists: Ranking the Worst U.S. Presidents, which has been voted on by over 60,000 people and comprises over half a million votes.

In this partisan age, it is easy to imagine that such a list would create some discord. So when we look at the average voting behavior of all the voters, the list itself has some inconsistencies. For instance, the five worst-rated presidents alternate along party lines, which is unlikely to represent a historically accurate account of which presidents are actually the worst. The result is a list that represents our partisan opinions about our nation’s presidents:

 

ListScreenShot

 

The list itself provides an interesting glimpse of what happens when two parties collide in voting for the worst presidents, but we are missing interesting data that can inform us about how diverse our visitors are.  So how can we reconstruct the diverse groups of voters on the list such that we can see how clusters of voters might be ranking the list?

To solve this, we turn to a common machine learning technique referred to as “k-means clustering.” K-means clustering takes the voting data for each user, summarizes it as a single voting pattern, and then groups that user with other users whose voting patterns are similar. The k-means algorithm is not given any information whatsoever from me as the data scientist, and has no real idea what the data mean at all. It is just looking at each Ranker visitor’s votes and looking for people who vote similarly, then clustering the patterns according to the data itself. K-means can be run with as many clusters as you like, and there are ways to determine how many clusters should be used. Once the clusters are drawn, I re-rank the presidents for each cluster using Ranker’s algorithm, and then we can see how different clusters ranked the presidents.
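To make the idea concrete, here is a minimal sketch in plain Python. It is not Ranker’s actual pipeline: the president names and votes are made up, each vote is assumed to be coded as +1 (upvote), -1 (downvote) or 0 (no vote), and the per-cluster re-ranking uses a simple average vote as a stand-in for Ranker’s algorithm, which is not described here.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vote vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(rows):
    """Column-wise mean of a non-empty list of vote vectors."""
    return [sum(column) / len(rows) for column in zip(*rows)]


def kmeans(rows, k, max_iterations=100, seed=0):
    """Plain k-means: returns (labels, centroids) for the given vote vectors."""
    rng = random.Random(seed)
    # Start from k randomly chosen voters as the initial cluster centers.
    centroids = [list(row) for row in rng.sample(rows, k)]
    labels = None
    for _ in range(max_iterations):
        # Assign every voter to the nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(row, centroids[c]))
            for row in rows
        ]
        # Move each centroid to the mean of the voters assigned to it.
        for c in range(k):
            members = [row for row, label in zip(rows, new_labels) if label == c]
            if members:
                centroids[c] = mean_vector(members)
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
    return labels, centroids


def rank_cluster(rows, labels, cluster, items):
    """Order items by average vote within one cluster (assumed stand-in ranking)."""
    members = [row for row, label in zip(rows, labels) if label == cluster]
    scores = mean_vector(members)
    order = sorted(range(len(items)), key=lambda i: scores[i], reverse=True)
    return [items[i] for i in order]


# Hypothetical data: one row per voter, one column per president.
items = ["President A", "President B", "President C", "President D"]
votes = [
    [1, 1, -1, -1],
    [1, 1, 0, -1],
    [1, 0, -1, -1],
    [-1, -1, 1, 1],
    [-1, 0, 1, 1],
    [0, -1, 1, 1],
]

labels, centroids = kmeans(votes, k=2)
for cluster in sorted(set(labels)):
    print(cluster, rank_cluster(votes, labels, cluster, items))
```

With this toy data the first three voters and the last three voters land in separate clusters, and each cluster produces its own “worst” ordering. On real data the result depends on the starting centroids, so it is common to run the algorithm several times with different seeds.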

As it happens, there are some differences in how clusters of Ranker visitors voted on the list.  In a two-cluster analysis, we find two groups of people with almost completely opposite voting behavior.

(*Note: because this list asks voters to pick the worst presidents, it is not asking them to rank the presidents from best to worst; it is more a ranking of how much worse each president is compared to the others.)

The k-means analysis found one cluster that appears to think Republican presidents are worst:

ClusterOneB

Here is the other cluster, with opposite voting behavior:

ClusterTwoB

In this two-cluster analysis, the shape of the data is pretty clear, and fits our preconceived picture of how partisan voters might be voting on the list. But there is a bias toward recent presidents, and the lists do not mimic academic lists and polls ranking the worst presidents.

To explore the data further, I used a five-cluster analysis, in other words, looking for five different types of voters in the data.

Here is what the five-cluster analysis returned:

FiveClusterRankings

The results show a little more diversity in how the clusters ranked the presidents. Again, we see some clusters that are more or less voting along party lines based on recent presidents (Clusters 5 and 4). Clusters 1 and 3 are also interesting in that the algorithm seems to be picking up clusters of visitors who are voting for people who have not been president (Hillary Clinton, Ben Carson), and who thankfully were never president (Adolf Hitler). Clusters 2 and 3 are most interesting to me, however, as they seem to show a greater resemblance to the academic lists of worst presidents (for reference, see Wikipedia’s rankings of presidents), but the clusters tend toward a more historical bent on how we think of these presidents. I think of this as a more informed partisanship.
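One way to put a number on “resemblance” between a cluster’s ranking and a reference list is Spearman’s rank correlation. This was not necessarily part of the analysis above; the sketch below uses placeholder names (P1 to P5) rather than real rankings, and assumes both lists contain exactly the same items with no ties.

```python
def spearman_rho(ranking_a, ranking_b):
    """Spearman rank correlation between two orderings of the same items.

    Returns 1.0 for identical orderings and -1.0 for exactly reversed ones.
    Assumes no ties and that both lists contain the same items.
    """
    if len(set(ranking_a)) != len(ranking_a) or set(ranking_a) != set(ranking_b):
        raise ValueError("rankings must contain the same items, without repeats")
    n = len(ranking_a)
    if n < 2:
        raise ValueError("need at least two items to compare rankings")
    position_b = {item: index for index, item in enumerate(ranking_b)}
    # Sum of squared differences between each item's position in the two lists.
    d_squared = sum(
        (index - position_b[item]) ** 2 for index, item in enumerate(ranking_a)
    )
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical rankings, worst first.
reference = ["P1", "P2", "P3", "P4", "P5"]   # e.g. an academic list
cluster_x = ["P2", "P1", "P3", "P5", "P4"]   # close to the reference
cluster_y = ["P5", "P4", "P3", "P2", "P1"]   # the reference reversed

print(spearman_rho(reference, cluster_x))
print(spearman_rho(reference, cluster_y))
```

A cluster whose correlation with the reference list is close to 1 orders the presidents much as the reference does, while a value near -1 indicates a nearly opposite ordering.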

By understanding the diverse sets of users that make up our crowdranked lists, we are able to improve our overall rankings, and also provide a more nuanced understanding of how different groups’ opinions compare, beyond the demographic groups we currently expose on our Ultimate Lists. Such analyses help us identify outliers and agenda pushers in the voting patterns, and allow us to rebalance our sample to make lists that more closely resemble a national average.

– Glenn Fox

 

 
