by    in Data

Voters Gonna Vote – Has Liberal Hollywood Produced (Half) a Nation of Haters?

jill_hillyes_02

Okay, Doc is going to be up-front about this: Regardless of the outcome, Doc hereby agrees to abide by the certified results of the upcoming presidential election. There, he said it. Anderson Cooper, you can stop calling now.

But seriously, election season is manna from heaven for data analysis junkies like Doc and his pals. So many polls! So many data points! So many trends and subsets and margins of error! It’s going to be hard to get back to normal after so many months of high-leverage number crunching.

Doc may just have to hunker down and re-review the Brexit referendum results until the withdrawal tremors subside.

But Doc was curious to see how the Clinton/Trump battle royal was playing out in pop culture. What cultural preferences look to be ascendant, and which might be in decline? What popular passions are driving the voting blocs inside the U.S.?

As there always are, there were some surprises and some confirmations of conventional wisdom. (It will come as no surprise, for instance, that Trump fans dig Ted Nugent and The Patriot.) But underlying all of these specific preferences and antipathies, the data suggests an unsettling meta-question that, on most days, Doc would prefer not to ask.

Being a man of science, however, he is going to swallow hard, bite the bullet and wonder out loud: Is pop culture really just for liberals/Democrats? Has Hollywood failed the conservative/Republican half of the country?

jill_melgop_01

Now, Doc has heard AM talk radio hosts voice those sentiments before, but he always assumed that it was just angry right-wing shock jocks demonizing Hollywood for effect. But the numbers from Ranker Insights suggest that there may be something to it.

Let’s start with some basics: Ranker users, by and large, go onto sites like Ranker to voice support for their favorite personalities and cultural products. Most users giving us data are doing so because they like a certain item, pushing it up the respective lists with their votes, and occasionally voting down entries they feel are less deserving.
If you look at Ranker’s correlation data, you’ll see the majority of cross-referenced preferences are positive: If a user likes A, she’ll probably like B, C and maybe D, but probably won’t like Z. But overall, it usually paints an upbeat picture: People like more things than they dislike.

Trump fans? Not so much. Not much at all, actually.

Crunch the numbers and you discover that you’re through the looking glass: They’re more likely to vote against stuff than for it. Of the movies that have a strong enough correlation to Trump fans, 70% of those movies summon an aggregate dislike, rather than support. Look at the music preferences, and 81% of them are more likely to be downvotes than upvotes.

It’s hard to stress just how unusual this is. Hillary’s breakdowns are a lot more typical. Of the 95 musical acts and releases that correlate with Hillary fans, 61, or 64%, are positive associations. Among the movies, 52% are positive associations… a lot closer to the norm for Ranker fan categories.
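
For readers who want to check the arithmetic, here is a minimal sketch of how a like/dislike split such as the 64% figure falls out of the raw counts. The function name is hypothetical and not part of Ranker Insights; the only numbers taken from the article are the 95 correlated musical acts and the 61 positive associations.

```python
def positive_share(positive: int, total: int) -> int:
    """Return the percentage of associations that are positive, rounded to a whole number."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * positive / total)


# Figures from the article: 61 of the 95 music associations for Hillary fans are positive.
hillary_music = positive_share(61, 95)

# Print the split in the same like/dislike form the article uses.
print(f"{hillary_music}/{100 - hillary_music}")
```

The same calculation, fed different counts, would give the other splits quoted in this post. Those underlying counts are not published here, so they are not reproduced.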

Well, Doc thought to himself, maybe that’s just Trump, who’s notoriously polarizing and liable to elevate the “haters” among his constituents.

Nope.

Doc ran the numbers for a (slightly) less polarizing guy, our last Republican Prez, George W. Bush.  Now even though W. has repeatedly declined to endorse Trump’s Republican candidacy, the pop culture numbers of their fans are comparable: W fans are more likely to dislike a movie (65/35) or musical artist (76/24) than they are to like them.

What about Ronald Reagan?  As our sunniest and most fondly-remembered Republican President—as well as the only one to make the leap to the White House from Hollywood—surely those fans would be better disposed towards pop culture, right?  The numbers are a little better, but still, fans of the Gipper dislike more movies (53/47) and music (61/39) than they actually like at all.

(We also ran numbers for fans of moderate Republican poster boy Paul Ryan, but the sample was small enough that Doc doesn’t feel confident enough to share them. But for the record, they looked a lot like Trump’s.)

downvotegraph_02

Others on the Democratic/liberal side, on the other hand, reflect the numbers of Clinton fans. Fans of President Obama like more music than they dislike by a 71/29 spread, and like more movies by a 55/45 margin. (President Obama fans also watch enough television for TV to get a cross-tab; they’re more likely to like a given show by a 63/37 breakdown.) Looking at fans of Clinton’s spirited primary rival Bernie Sanders shows that they’re also more likely to like a musical act (64/36) or TV show (67/33) than dislike it; Sanders fans do, however, show their angry maverick side when it comes to movies, which are more likely to be rejected than embraced, by a Trump-like margin of 84/16. (It’s probably too late to turn the tide, but if Trump wants to try and find that elusive common ground with the Bernie voters, he might consider hosting a cross-country series of screenings of Tommy Boy.)

Here are a few more numbers to round out the picture. Among Ranker users, all current politicians draw negative overall approval numbers. That is to say: Ranker is not a particular haven of Hillary/Bernie fans versus Trump/Bush fans. Of all of the politicians mentioned in this article, only Reagan has an overall net positive approval rating. Ranker users overall disapprove of Trump (60/40) by virtually the same margin they disapprove of Hillary (57/43). So it’s not like Ranker users have a particular love for Hillary herself. But her fans, like most Ranker users, like more pop culture than they dislike. Not so for Trump and his Republican cohorts.

Finally, the results for Trump/Republicans are true only of fans of the politicians and not fans of the pop culture itself; the relationship isn’t reciprocal.

Doc is sorry if that sounds confusing, but here’s what it means.

As noted above, Trump fans are most likely, by far, also to be fans of Ted Nugent. However, when you cross reference with fans of the Motor City Madman himself, the Hollywood/pop culture antipathy vanishes. Turns out, fans of The Nuge like movies, TV shows and music in proportions that look a lot more like Democrats/everyone else than they resemble Trump fans. The same is true for fans of The Patriot: As a group, their preferences are a lot more typical than those of the Trump fans. Basically, it’s the fans of Republican politicians—and only Republican politicians—who are more likely to reject popular culture than they are to embrace it.

As for what this means in a larger cultural sense (and what Washington and/or Hollywood might want to do about it), Doc is a lot less sure of himself. Doc admits, he always considered pop culture to be one of those things that binds us together even when politics pulls us apart.

jill_yawrong

But even that may be wishful thinking, as it turns out. All election season, we’ve been told that we’re a divided nation. Maybe Doc shouldn’t have been surprised to see it played out so starkly in Ranker’s data, but there it is. It’s too much to expect us to all like the same movies and music, but Doc thought that everybody at least liked movies and music in general. He can hear the response now, cutting into his train of thought, just like during those debates – “WRONG.”

Either way, Doc encourages you to vote on Nov. 8, both at your designated polling place and on Ranker.com.

Why Batman This Halloween? The Anatomy of a Batman Fan

So it’s Halloween time.  What are you going as?  Wait!  Let Doc take a guess.

Batman.  Right?

Unless you’ve got a lot of green hair dye lying around, in which case you’re probably going as The Joker. Or Harley Quinn, if you’re into bats. Well, baseball bats.

harley-quinn

See? It was a bat joke. No, a bat joke.

Y’see, the National Retail Federation recently announced its top selling Halloween costumes for 2016 (Doc always wakes up extra early to get to the press conference). And topping the list for “millennials” (the 18 – 34 crowd) is Batman and his bat-ilk. Captain America: Civil War and Deadpool may have outpaced Batman v. Superman: Dawn of Justice and Suicide Squad at the box office, but when it comes time to don a costume themselves, millennials are drawn to Gotham.

Now, a franchise doesn’t just claw its way to the top of the Halloween heap by appealing to just one segment of our media-hungry millennials. The folks dressing up in Batman costumes include everyone from the person who’s been collecting comics for 20 years (Batman costumes are #5 on the list for the 35 and over crowd) to the college freshman who just saw Suicide Squad last week.

MOST POPULAR HALLOWEEN COSTUMES FOR MILLENNIALS (National Retail Federation 2016)
1. Batman Character
2. Witch
3. Animal (Cat, Dog, Bunny, etc.)
4. Tie: Marvel Superhero (Deadpool, Spider-Man, etc.) AND DC Superhero (Wonder Woman, Superman)
5. Vampire
6. Video Game Character
7. Slasher Movie Villain (Freddy, Jason, Michael Myers, etc)
8. Pirate
9. Yoda
10. Zombie

So who are these people?  How do we separate the comics fans from the movie fans?  Obviously, there’s going to be some overlap, but thanks to Ranker Insights, it’s not hard to see that we’re talking about some pretty distinct groups.

Here we go. There are two different kinds of Batman fans. First, let’s talk about the movie fans.  On Ranker’s Best Movie Characters of All Time, The Joker ranks as #6 and Batman himself is #11. (Doc guesses Warner Bros was right to give Jack Nicholson top billing over Michael Keaton back in 1989.)  As you’d expect, when you narrow the list to millennials, though, The Joker jumps outright to #1, and Batman uses his utility belt to climb to #7.

dark-knight-heath-ledger

Dig deeper into that data, and you find some stuff that’s surprising, and some less so.  For instance, when fans of Batman as a movie character are cross-listed against Ranker’s list of Best Movies of All Time, the result is a lot of love for Batman movies; they’re 4 or 5 times more likely to be boosters of the Chris Nolan trilogy and the Nicholson/Keaton outing.  Doc was less than stunned by this finding. But if there’s one thing this group loves, it’s epic franchise filmmaking.  After the Batman films, the movies most likely to be admired by movie-Batman fans are Peter Jackson’s Lord of the Rings trilogy, The Godfather parts I and II, and Terminator 2: Judgment Day.  It’s only when you get to #11 that you find a stand-alone film, Nolan’s Inception.

So what about the folks who are fans of Batman as a comic book character? As with their movie-fan brethren, The Dark Knight tops the list of movies these guys are likely to admire. But after that, the list is a bunch of films that, for the most part, you’ll find on or near the AFI top 100: Citizen Kane, Chinatown, Ben-Hur, Amadeus. Add in a sprinkling of cult classics (The Big Lebowski, The Truman Show, The Princess Bride) and animation milestones (Finding Nemo and Who Framed Roger Rabbit?), plus one more superhero flick: significantly, NOT a Batman movie but Spider-Man 2, for Doc’s money by far the best of the Tobey Maguire trilogy. This is an impressive bunch of films, with tastes that run both deep and broad. Those tastes may be more refined than the movie fans’, but they’re also less intense. A Batman movie fan is about 7 times more likely than the average Ranker to vote up The Dark Knight; the Batman comic-book fans are only about 2.5 times as likely to vote to push The Dark Knight up the lists.
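
Ranker Insights doesn’t spell out how a figure like “7 times more likely” is computed. One common reading is a lift ratio: the fans’ upvote rate divided by the upvote rate among all voters. The sketch below assumes that reading. The function name and every count in it are invented for illustration and are not Ranker data.

```python
def lift(fan_upvotes: int, fan_voters: int, all_upvotes: int, all_voters: int) -> float:
    """How many times more likely fans are to upvote an item than voters overall.

    Computed as (fan_upvotes / fan_voters) / (all_upvotes / all_voters),
    rearranged into a single division to limit floating-point error.
    """
    if min(fan_voters, all_upvotes, all_voters) <= 0:
        raise ValueError("voter and upvote totals must be positive")
    return (fan_upvotes * all_voters) / (fan_voters * all_upvotes)


# Made-up counts: 10% of all voters upvote the film,
# versus 70% of movie-Batman fans and 25% of comic-book-Batman fans.
movie_fan_lift = lift(fan_upvotes=70, fan_voters=100, all_upvotes=1000, all_voters=10000)
comic_fan_lift = lift(fan_upvotes=25, fan_voters=100, all_upvotes=1000, all_voters=10000)

print(movie_fan_lift, comic_fan_lift)
```

With these invented counts the two groups come out at 7 and 2.5 times the baseline, the same shape of gap described above.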

The picture gets clearer once you look at the kinds of TV shows the two fandoms watch.  According to Ranker Insights, Batman movie fans love one show above all others, and that show is… Scrubs.

Wait, what?  Doc did not see that one coming.

But the numbers don’t lie. If you like movie Batman, you’re almost three times more likely than the average Ranker to call Scrubs one of the better shows of the past 20 years. Less strongly, the tastes of this group overlap with How I Met Your Mother, Supernatural, procedurals like Law & Order and Criminal Minds, and Sesame Street. Sesame Street? Hmmmm… a picture is starting to come into focus here.

How about the comic-book fans? The show they’re most likely to overlap with ain’t Scrubs… it’s The Wire. The rest of the list has the grown-up sensibility of the group’s movie list: Band of Brothers and Justified appear near the top, and the comedies (Arrested Development and It’s Always Sunny…) have a lot more of a TV-MA feel.

How about each fandom’s all-time TV characters? Batman himself unsurprisingly tops both lists, but after that, the movie fans go for pretty lighthearted icons… Family Guy’s Peter Griffin, Ron Swanson of Parks & Recreation, Homer Simpson and even Carlton Banks from The Fresh Prince. The comic fans run in the opposite direction: serious-minded anti-heroes like Tony Soprano, Walter White and Don Draper. For comic relief, this group turns to Peter Falk’s Columbo and Fred Gwynne’s Herman Munster.

Okay, one group likes Scrubs, Sesame Street, Supernatural, Peter Griffin and Carlton Banks.  The other likes The Wire, Breaking Bad, Mad Men and The Munsters.  What do we draw from this?  The fault line here is age. Doc wonders if the movie-Batman fans have even seen an episode of The Munsters.

You even see it in the overlaps with non-pop culture lists like The Greatest Minds of All Time.  For the movie fans, the top answers are MLK, Abe Lincoln, Mozart and Einstein… in other words, the great minds you learn about in grade school and high school.  The comic book fans line up behind Immanuel Kant, Socrates, Hippocrates, Plato and a bunch of other guys whose work you have to go to college in order to blow off.

venndiagram2

And Doc’s got one more telling fact, not based on cross-referencing any single list, but the range of lists that the fans tend to vote on.  Across all of Ranker, enough movie-Batman fans have voted to create cross-listings with 181 other lists.  The comic book fans have voted enough to be cross-listed with 134 lists.  Of those 134 lists, 38 of them (28%) are lists that, not to put too fine a point on it, rank female celebrities and/or characters on physical attractiveness.  Across the movie fan voting, only 18 lists (10%) have a similar focus.

Doc isn’t sure if the movie fans are more enlightened, or just haven’t hit puberty yet.  In any event, those comic book fans can’t get enough of weighing in on the top animated sex symbols or which actresses cross their legs most spectacularly.

Doc admits it, he was looking for something to maybe cut against the stereotype of the horny comic-book geek obsessed with women he has no chance at (partly because some of them are fictional), but the data paints a picture that supports it: Older, better educated, with more refined tastes, except for an unmistakable emphasis on completely unattainable fantasy women.

wolf

Meanwhile, the movie generation is young, in the first flush of fandom, relying on the consistency of franchises to point them towards movies they’re gonna like, and then probably going home to finish up their bio homework while watching Scrubs reruns.

So that’s the citizenry of Bat nation this Halloween. As time passes and Ranker absorbs the consensus around Suicide Squad and the character’s continuing evolution in the Affleck Era, its contours will probably shift a little. We’ll keep watching. Until then, Doc is gonna let his freak flag fly and power up the 1966 Adam West/Burt Ward camp classic. POW. THWACK. BYE.

Baby Bomb – Here’s How We Knew Bridget Jones’s Baby Would Tank

Doc is going to be honest here. He was probably never going to buy a ticket for Bridget Jones’s Baby… mostly because Doc believes in restricting oneself to just an apostrophe when a possessive word ends in “s.” But also because the travails of a winsome Anglo-dumpling with a journaling fixation never held much personal appeal.

But movies that Doc doesn’t personally care for make bank all the time, and clearly there were plenty in Hollywood (or at least at Universal Pictures) who were convinced that the franchise’s devoted fanbase would turn out for another spin on the Bridget-go-round. And why not? Over the past couple of years, Sequels That No One Asked For actually have been a pretty safe bet, especially the ones targeting women over 25. My Big Fat Greek Wedding 2 wasn’t the surprise smash of the original, but it more than made its budget back, grossing a respectable $60 million in the U.S. And last year’s Second Best Exotic Marigold Hotel took in over $80 million worldwide, with a little over a third of that total coming from the U.S. With this summer’s sleeper hit Bad Moms proving the strength of the women-over-25 market, and credible critical response, most experts were looking at Bridget Jones’s Baby opening at $15 million, if not higher.

bluegraph

Of course, the gang here at Ranker are not “most experts.” And accordingly, Doc can say that we had a pretty strong idea that Bridget Jones’s Baby was due for a troubled birth and a sickly, blighted existence on this earth. How’d we know?

Easy. We pulled up Ranker Insights, and dug into the numbers on Bridget Jones’s Diary, the first and best-regarded of Bridget’s misadventures. After all, the fanbase for Bridget Jones’s Diary seems like an obvious—really, the obvious—group for the movie to market to. And we learned all sorts of interesting things, like that it’s the 66th best rainy-day movie, and that Bridget’s stateside popularity is strongest in the southeast, then wanes as you move north and west across the country.

And then we pulled up the list of other films that Bridget Jones fans were most likely to voice their approval of. The first on the list, unsurprisingly, is Bridget Jones: The Edge of Reason, the (widely derided) first sequel to Diary. But how about those next six titles? See if you can spot any pattern…

  • Love Actually
  • Elizabeth
  • About a Boy
  • Notting Hill
  • Sense and Sensibility
  • Four Weddings and a Funeral

moviecollage-1

You’re a smart cookie—you see where Doc is going with this, yes? No fewer than five of those six movies feature the harried, boyish stammerings of one Hugh John Mungo Grant. (And in related news: Mungo? MUNGO? Doc swears he isn’t making this stuff up.) Yes, Love Actually additionally features Grant’s Bridget Jones co-star Colin Firth, which probably accounts for its placement at #2 on the list after Edge of Reason. But otherwise, the message is clear as day: Above all others, Bridget Jones fans love, love, love them some Hugh Grant.

mungo80

This would be just peachy, except for the tiny, easily-overlooked detail that Hugh Grant decided he wanted no part of Bridget Jones’s Baby, and isn’t in the movie.   Even if you’ve just seen the film’s traditional three-shot poster, you know that the role of “handsome douche” previously filled by Grant is this time assayed by Patrick Dempsey (nee McDreamy). Now Bridget Jones fans don’t seem to have anything especially against Dempsey. On the list of TV shows most liked by Bridget Jones fans, Grey’s Anatomy ranks #9. (It’s still behind Pinky & the Brain and Golden Girls, so go figure.) But there’s no comparison between their mild affection for Dempsey and their deep and abiding passion for Hugh Grant. Their feelings for Grant’s co-stars Renee Zellweger and Colin Firth similarly pale by comparison. After Edge of Reason, the top Zellweger film on the list is Chicago, at #19. Zellweger’s breakthrough film, Jerry Maguire, sits at #532.

Wouldn’t you think that if the Bridget Jones fanbase was really devoted to Renee Zellweger, they’d be more inclined to like Jerry Maguire than, say, The Mighty Ducks or American History X? But no. Apparently, fans of Bridget Jones would rather watch Ed Norton curb-stomp a dude than see Renee Zellweger “complete” Tom Cruise. Good stuff to bear in mind when you’re planning your next at-home double feature.

graphwithfaces-1

And so there was zero astonishment around Ranker HQ when Bridget Jones’s Baby didn’t even crack $9 million in its opening weekend. Doc takes no joy in being right about this stuff. He wants all the movies to do well, what with a rising tide lifting all boats and everything. But when you blow it this big, and this obviously, you deserve to get called on it.

So for future reference, trying to sustain a movie franchise after shedding its fans’ favorite character/actor is a lousy idea. And that’s the only truth Doc has for you this week, baby.

Big Data Shows Movie Fans Love Tom Hanks, Just Not in Sequels

It’s summertime. And when it comes to big-budget movies, that also means it’s sequel time. We’ve already seen remarkable successes like Captain America: Civil War and Finding Dory, and a few flops (at least, based on their allotted budget) like Teenage Mutant Ninja Turtles 2 and Independence Day: Resurgence. This got us at Ranker Insights thinking: what goes into making a successful sequel? The truth is, there are a lot of extenuating circumstances that contribute. The box office success of the original just happens to be one of them. From solid, open-ended plot lines and apparent depth of main characters to preordained fan bases and predictably bankable actors, big data suggests many factors come into play when creating a flourishing movie franchise. However, this much seems certain: you’re probably better off casting anyone but the actor voters consider the greatest of all time.

Allow us to explain. Big data can tell you big things when it comes to making a great film. But if you’re planning on getting the most bang for your buck on your original idea, even the smallest minutiae can make a big difference. For instance, let’s take a look at the top 30 of the Best Movie Characters of All Time. Notice anything? Sure, you see all the memorable characters you would expect to see near the top: Forrest Gump, Indiana Jones, James Bond, and Bruce Wayne/Batman are all in the top 10. This makes sense, especially when you consider their names are usually in the title of the movies their characters star in. Look a little closer, and further analyze the films from which these characters came. Of the top 30, 22 were strong enough to star in a sequel or trilogy. Now, let’s look at the eight that didn’t return to entertain you once again. What do all these movies have in common? That’s right. They all involve the indisputably lovable Thomas Jeffrey Hanks.

Why is this, you ask? Good question. Certainly Toy Story was a smashing success, and went on to spawn not one but two great sequels. Toy Story 2 was even voted 8th on Ranker’s list of Best Movie Sequels. But for obvious reasons, that franchise just featured his voice, not his face. The only franchise in which Tom Hanks returned for a sequel and actually had to act, The Da Vinci Code, produced far less favorable results. While Angels & Demons still proved to be a box office success, it took in only about 2/3 of what its predecessor did. And as for the character Hanks portrayed, Robert Langdon, well, he is nowhere to be found on the Best Movie Characters of All Time list.

It doesn’t seem to be Tom’s choice of directors either, as Tom Hanks/Steven Spielberg collaborations are a whopping 975% more likely to be liked by Tom Hanks fans, with the Tom Hanks/Ron Howard team coming in a close second at 809%. And it’s not like these fans are averse to the idea of sequels either. Voters who like Tom all like their action, adventure, and animated sequels. In fact, Tom fanatics are 549% more likely to enjoy Captain America: The Winter Soldier; 258% more likely to have high praise for Back to the Future Part II; and 396% more likely to be a fan of the previously mentioned Toy Story 2. Heck, the analytics show that voters on the Greatest Actor & Actress in Entertainment History list are up for a sequel of any kind: they’re 38% more likely to vote up the universally agreed-upon clunker, Crocodile Dundee in Los Angeles. Maybe Tom Hanks-related sequels were meant not to be seen, but simply heard.

Perhaps it’s just a demographic thing? Nope, that doesn’t seem to matter either. In fact, Toy Story 2 even drops in the rankings to number 9 among international voters and even further to 10 among female voters. Judging by the data mined from Actors You Would Watch Read The Phone Book, Tom Hanks fans are at least 200% more likely to want to listen to Robert De Niro, Harrison Ford, Johnny Depp or Liam Neeson go through the names from A to Z, and all four know a thing or two about sequels. However, with Hanks ranking sixth on that same list, we can now confidently deduce that the reason for so few sequels from the actor is probably not his acting itself.

In all likelihood, it’s just a content thing. Most of Hanks’ roles have a historical end, or at the very least, a distinctive one. The stories he stars in just don’t lend themselves to sequels. Voters must agree, as there is nary a Hanks movie to be found on Ranker’s list of Movies That Need Sequels. Saving Private Ryan? Saved. Catch Me If You Can? Caught. Philadelphia? Finished. So don’t hold your breath waiting for Forrest Gumper or Sully 2: Nursing Home Boogaloo, regardless of how well Sully does upon release in early September. These Hanks vehicles just don’t seem to be in demand, success be damned.

Now, Ranker Insights would never be one to tell you how to create a successful movie franchise, because frankly, that would be a thankless job. But if your job is to create a character that is memorable enough to secure a sequel, big data shows your main character should probably be a Hanks-less one. He seems to be the epitome of Mr. One-and-Done.

Using Data To Determine The Best Months Of The Year

Why do people like some months more than others? For many, it is all about the holidays:

“I love the scents of winter! For me, it’s all about the feeling you get when you smell pumpkin spice, cinnamon, nutmeg, gingerbread and spruce.” – Taylor Swift

while for others, it is about avoiding the cold:

“A lot of people like snow. I find it to be an unnecessary freezing of water.” – Carl Reiner

and for some more disaffected souls, it is about the specifics:

“August used to be a sad month for me. As the days went on, the thought of school starting weighed heavily upon my young frame.” – Henry Rollins

Presumably all of these preferences and this angst are reflected in Ranker’s Best Months of the Year list. The graphic below provides a visualization of the opinions of Ranker users. Each row is a different person, and their (sometimes incomplete) ranking of the months is shown from best to worst, left to right. The months are color coded by the four seasons: spring has the hues of green, summer is yellow, fall has the rustic earth hues of brown, and winter is blue.

BestMonthsOriginal

The patchwork quilt of colors and hues makes it clear that different people have different opinions. We wanted to understand the structure of these individual differences, using cognitive data analysis.

To do this, we used a simple model of how people produce rankings—known as a Thurstonian model, going back to the 1920s in psychology—that we have previously applied successfully to Ranker data. Rather than assuming everybody’s rankings were based on a shared opinion, we allowed this version of the model to have groups or clusters of people, and for each group to have their own preferences for the months. We didn’t want to pre-determine the number of groups, and so we allowed our model to make this inference directly from the data. Our modeling approach thus involves two sorts of interacting uncertainties: about how many groups there are, and about which people belong to which group. Bayesian statistical methods are well suited to handling these sorts of uncertainties.
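
To make the generative idea concrete, here is a minimal sketch of how a Thurstonian model with groups produces rankings: each month has a latent appeal for a given group, each person draws a noisy version of those appeals, and the ranking is simply the months sorted by the noisy draws. Everything specific here is an assumption for illustration: the two groups, their latent means, the noise level, and the function names. It only simulates rankings and does not perform the Bayesian inference described above.

```python
import random

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def thurstonian_ranking(means, noise_sd, rng):
    """Draw one ranking: sample a noisy appeal for each item around its
    latent mean, then order the items from highest to lowest sample."""
    samples = {item: rng.gauss(mu, noise_sd) for item, mu in means.items()}
    return sorted(samples, key=samples.get, reverse=True)


# Two made-up groups: one whose latent appeal peaks in July,
# one whose latent appeal peaks in December (distance wraps around the year).
GROUP_MEANS = {
    "summer": {m: -abs(i - 6) for i, m in enumerate(MONTHS)},
    "holiday": {m: -min(abs(i - 11), 12 - abs(i - 11)) for i, m in enumerate(MONTHS)},
}


def simulate_person(rng, noise_sd=1.0):
    """Pick a group at random (the role played by z), then draw that person's ranking."""
    group = rng.choice(sorted(GROUP_MEANS))
    return group, thurstonian_ranking(GROUP_MEANS[group], noise_sd, rng)


rng = random.Random(2016)  # fixed seed so the sketch is repeatable
for _ in range(3):
    group, ranking = simulate_person(rng)
    print(group, ranking[:3])  # each simulated person's group and top three months
```

Inference runs this story in reverse: given only the rankings, it estimates how many groups there are, which group each person belongs to, and what each group’s latent preferences look like.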

For fans of Bayesian cognitive graphical models — we know you’re out there — the final model we used is shown in the figure below. For non-fans of Bayesian cognitive graphical models — we KNOW you’re out there — there are three important parts. The variable gamma at the top corresponds to how many groups there are, the variables z to the side correspond to which of these groups each individual belongs, and all of this is inferred from the rankings people gave, represented by the variables at the bottom.

GraphicalModel

The figure below shows the first key insight from the model. It shows the probability that there are 1, 2, …, 17 groups, ranging from everybody having the same opinion about the best months, to everyone having their own unique opinion. There is uncertainty about how many groups the rankings reveal, but the most likely answer is that there are four.

Gamma
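
As a rough sketch of how probabilities like these can be read off a fitted model: if inference is done by sampling (an assumption, since this post does not say how the model was fit), each posterior sample carries a value for the number of groups, and the share of samples at each value approximates its probability. The sample values below are invented for illustration and are not the real model output.

```python
from collections import Counter


def posterior_over_groups(group_count_samples):
    """Turn sampled group counts into approximate posterior probabilities."""
    if not group_count_samples:
        raise ValueError("need at least one sample")
    counts = Counter(group_count_samples)
    total = len(group_count_samples)
    return {k: counts[k] / total for k in sorted(counts)}


# Made-up samples of the number of groups, one per posterior draw.
samples = [4, 4, 3, 4, 5, 4, 3, 4, 6, 4]
posterior = posterior_over_groups(samples)

# The most likely number of groups is the value with the highest probability.
most_likely = max(posterior, key=posterior.get)
print(posterior, most_likely)
```

The spread of the probabilities matters as much as the peak: a flat distribution would mean the rankings say little about how many groups there are.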

Assuming there are four groups, the figure below organizes the ranking data  by grouping together the people most likely to belong to each group. Group 1 shows a preference for late summer and early fall, and hates cold weather. Group 2 shows a preference for the holidays. They like fall and Christmas time and despise hot weather. Group 3 loves the summertime and hates the winter. We had a look at where these people were from, and it probably comes as no surprise they’re all from the north-east of the US. The last group, a bit like Henry Rollins, stands out as a consensus of one.

BestMonths

This analysis shows how cognitive models with individual differences can help understand opinion groupings, and deal with difficult questions like how many groups exist. One especially interesting feature of the Best Months list is that at least one of the groups is defined more by what comes at the bottom of their lists than the top. People in group 1 don’t agree very precisely on which months they like, but they all agree they don’t like winter months. This shows that it is not just the top few items on a Ranker list that carry useful information: what comes at the bottom can be just as informative. Both what you love and hate matters.

“When I was young, I loved summer and hated winter. When I got older I loved winter and hated summer. Now that I’m even older, and wiser, I hate both summer and winter.” – Jarod Kintz

 

Crystal Velasquez and Michael Lee

According to Big Data, Millennials Don’t Care Much About America’s Pastime

Does Respect for the Past Bode Well for Baseball’s Future?
Breaking Down the Big Data of the Greatest Baseball Players of All Time List

How much does America’s Pastime’s current popularity factor into the rankings of who are the greatest baseball players of all time? And, what factors beyond simple player statistics come into play when one makes their own list? Well, the resulting Ranker data speaks – or rather, cheers – volumes when it comes to players of past generations. While nostalgia might have some effect on the voting, is the lack of current players represented on the list a sign that voters have an unwavering respect for the legends of the past, or is our national pastime becoming just that? Past its time.

Ranker asked participants upfront to rank the best baseball players only by their on-field accomplishments. Almost 7,500 participants have chimed in with nearly 115,000 votes, and it’s no surprise who was the consensus top pick. With a lifetime batting average of .342 and #1 in all-time OPS (on-base plus slugging percentage), the voters made their choice clear: Babe Ruth. Anyone who has had a casual conversation around this topic knows the Great Bambino is always one of the first names mentioned when it comes to ranking the greatest players of all time, and he’s usually a favorite across all ages.

Whether you are an astute baseball statistical historian, have been sitting in your team’s bleachers since you were a child, or are one of the nearly 60 million people who play fantasy sports, you probably have at least a passing opinion about who is the best of all time. According to Ranker’s data, your top 5 has some combination of the Babe, Stan Musial, Ted Williams, Mickey Mantle, and Willie Mays or Hank Aaron, the latter being the most recent retiree of the group, all the way back in 1976. Once you break down the demographics even a little bit further, that’s when things start to get interesting.

Gone, but not forgotten.

The most glaring data at first glance is that there’s nary an active player on the all-time list’s starting roster. In fact, it isn’t until you get down to #44 that you’ll find an active player, Ichiro Suzuki. For the record, Ichiro is ranked only #76 on Ranker’s Top CURRENT Baseball Players List. Does this imply that voters know and respect their history? Or could it be that the current crop of baseball players isn’t well represented because they aren’t being watched? Television ratings data suggests that a steady decline in viewership over the years might play a factor in the voting. While Major League Baseball as an entity is as strong as ever (just have a look at some of the salaries they’re handing out), people aren’t as interested in the game as they used to be.

How much does a voter’s age factor into the results? A deeper dive into the big data analytics suggests quite a bit. Baby Boomers are 184% more likely to have Mel Ott on their list than any other age group because, you know, they’ve actually seen him play. If you’re between the ages of 30-49, you are a whopping 305% more likely to have Sadaharu Oh of the Yomiuri Giants on your list (which suggests that internationally, fans aren’t only passionate about their soccer). If you’re a Millennial, you must enjoy a good quote: Millennials are 248% and 234% more likely to vote for the non sequitur machine Yogi Berra and the forever quirky Rickey Henderson, respectively. Ranker doesn’t have analytics to suggest that voters in the 30-49 age demographic were all mustache enthusiasts, but they were 281% more likely to include Rollie Fingers on their list.
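The post does not spell out how the "X% more likely" figures are calculated. One plausible reading, sketched here as an assumption rather than as Ranker's actual method, compares the share of voters in one group who included a player with the share among everyone else. The function name and the toy counts are hypothetical.

```python
def percent_more_likely(group_yes, group_total, others_yes, others_total):
    """Return how much more likely (in percent) a group is to include an item
    than all other voters combined. Hypothetical helper, not Ranker's formula."""
    group_rate = group_yes / group_total
    others_rate = others_yes / others_total
    if others_rate == 0:
        raise ValueError("baseline rate is zero; the comparison is undefined")
    return (group_rate / others_rate - 1.0) * 100.0


# Toy counts (invented for illustration): 142 of 500 voters in the group
# included the player, versus 100 of 1,000 voters in every other group.
lift = percent_more_likely(142, 500, 100, 1000)
print(f"{lift:.0f}% more likely")
```

Under this reading, a figure of 184% means the group's inclusion rate is 2.84 times the baseline rate, not 1.84 times.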

However, those stats focus on specific characters in the game that a certain demographic is drawn to. Where are the Mike Trouts (#1 with people under the age of 29 on the Top Current Baseball Players List), Clayton Kershaws (#2), or players who have brand recognition among fans like Troy Tulowitzki (#20)? All of them, gaudy numbers and all, failed to crack the top 100. In fact, the only other active players on the list (besides the aging Ichiro) were the also-aging Albert Pujols (#48) and Miguel Cabrera (#90). Maybe, there’s just not a large (or long) enough sample size to include current players on this list of all-time greats.

Is today’s game yesterday’s news?

Perhaps voters are just into something else. When you look at the voting demographics, young voters are the least represented participants, with the majority being aged 30 and up. But with nearly 23% of the votes, you would think at least a couple more current players would sneak in, wouldn’t you? Perhaps baseball just doesn’t resonate with this new generation. They’re gravitating toward playing lacrosse, playing on their video game consoles, or even fiddling with their smartphones. As a recent article in the Wall Street Journal even suggests, younger people are just tuning out.

So who’s got next?

The times may have changed, but according to Ranker data, the best baseball players really haven’t. From Cobb in the dead-ball era and Satchel Paige of the Negro Leagues to various International Leagues and beyond, the voters know that the greatest all-time baseball was played beyond just the Major Leagues here in the States. Records were made to be broken, but which of the best baseball players of today do you think will eventually break into the all-time list? Only time (and the fickle under-30 voters) will tell. So if you should happen to ask a Millennial if they saw the game last night, just don’t expect them to inquire who won. You’ll probably just get a “who cares?”

Collecting and Connecting Millions of Opinions


Ranker Insights is the Most Precise Data for Entertainment, Personalities, Sports, Brands and More

Ranker is a leading digital media company that ranks opinions on (almost) everything through our vote-based user experience. Our rankings don’t just collect opinions, they contextualize them. Through context, Ranker can discern users who prefer an actor’s talent vs. their attractiveness, for example, or fans who like a college for its academics vs. athletics, along with the millions upon millions of correlations therein.

Thus, we created Ranker Insights: Ranker’s first-party analytics platform that turns data from users’ votes into actionable intelligence with countless applications.

by    in Data, Data Science, Popular Lists

Applying Machine Learning to the Diversity within our Worst Presidents List

Ranker visitors come from a diverse array of backgrounds, perspectives and opinions.  The diversity of the visitors, however, is often lost when we look at the overall rankings of the lists, because the rankings reflect a raw average of all the votes on a given item, regardless of how voters behave on other items.  It would be useful, then, to figure out more about how users are voting across a range of items, and to recreate some of the diversity inherent in how people vote on the lists.

Take, for instance, one of our most popular lists: Ranking the Worst U.S. Presidents, which has been voted on by over 60,000 people and comprises over half a million votes.

In this partisan age, it is easy to imagine that such a list would create some discord. So when we look at the average voting behavior of all the voters, the list itself has some inconsistencies.  For instance, the five worst-rated presidents alternate along party lines–which is unlikely to represent a historically accurate account of which presidents are actually the worst.  The result is a list that represents our partisan opinions about our nation’s presidents:

 

ListScreenShot

 

The list itself provides an interesting glimpse of what happens when two parties collide in voting for the worst presidents, but we are missing interesting data that can inform us about how diverse our visitors are.  So how can we reconstruct the diverse groups of voters on the list such that we can see how clusters of voters might be ranking the list?

To solve this, we turn to a common machine learning technique referred to as “k-means clustering.” K-means clustering takes the voting data for each user, summarizes it into a result, and then finds other users with similar voting patterns.  The k-means algorithm is not given any information whatsoever from me as the data scientist, and has no real idea what the data mean at all.  It is just looking at each Ranker visitor’s votes and looking for people who vote similarly, then clustering the patterns according to the data itself.  K-means can parse the data into as many clusters as you like, and there are ways to determine how many clusters should be used.  Once the clusters are drawn, I re-rank the presidents for each cluster using Ranker’s algorithm, and then we can see how different clusters ranked the presidents.
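For readers who want to see the mechanics, here is a minimal sketch of k-means on vote data, written in plain Python. Several things here are assumptions and not taken from the analysis itself: each voter is encoded as a vector with +1 for an upvote, -1 for a downvote and 0 for no vote; the starting centroids are picked with a simple farthest-first rule; and, since Ranker's own re-ranking algorithm is not described, each cluster's ranking is just the items sorted by that cluster's mean vote. The item names and votes are invented.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vote vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100):
    """Plain Lloyd's algorithm. Returns (labels, centroids)."""
    # Farthest-first seeding (an assumption): start from the first voter,
    # then repeatedly add the voter farthest from the centroids chosen so far.
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centroids),
        )
        centroids.append(list(farthest))

    labels = None
    for _ in range(max_iter):
        # Assignment step: each voter joins the nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: each centroid becomes the mean vote of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties out
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def rank_items(items, centroid):
    """Stand-in re-ranking: highest mean vote first (upvote = 'worse' here)."""
    return [name for name, _ in sorted(zip(items, centroid), key=lambda t: -t[1])]


# Invented toy data: six voters, four placeholder presidents.
items = ["President A", "President B", "President C", "President D"]
votes = [
    [1, 1, -1, -1],
    [1, 1, -1, 0],
    [1, 0, -1, -1],
    [-1, -1, 1, 1],
    [-1, -1, 1, 0],
    [-1, 0, 1, 1],
]

labels, centroids = kmeans(votes, k=2)
for j, centroid in enumerate(centroids):
    print("Cluster", j + 1, rank_items(items, centroid))
```

On this toy data the two clusters come out as mirror images of each other, which is the same shape the two-cluster analysis shows on the real list.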

As it happens, there are some differences in how clusters of Ranker visitors voted on the list.  In a two-cluster analysis, we find two groups of people with almost completely opposite voting behavior.

(*Note that since this is a list of votes on the worst president, the rankings are not asking voters to rank the presidents from best to worst; it is more a ranking of how much worse each president is compared to the others.)

The k-means analysis found one cluster that appears to think Republican presidents are worst:

ClusterOneB

Here is the other cluster, with opposite voting behavior:

ClusterTwoB

In this two-cluster analysis, the shape of the data is pretty clear, and fits our preconceived picture of how partisan voters might be voting on the list.  But there is a bias toward recent presidents, and the lists do not mimic academic lists and polls ranking the worst presidents.

To explore the data further, I used a five-cluster analysis, in other words, looking for five different types of voters in the data.
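The post does not say how the number of clusters was chosen. One common check, offered here purely as an illustration and not as the method used for this list, is the silhouette score: it compares how close each voter is to their own cluster with how close they are to the nearest other cluster, and higher is better. The sketch below scores two hand-made groupings of the same invented voters; all names and data are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vote vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def silhouette_score(points, labels):
    """Mean silhouette over all points; singleton clusters contribute 0."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("the silhouette score needs at least two clusters")

    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a lone point scores 0 by convention
        # a: mean distance to the other members of the point's own cluster
        # (the point's distance to itself is 0, so dividing by n - 1 is right).
        a = sum(euclidean(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster.
        b = min(
            sum(euclidean(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


# Invented toy votes: +1 upvote, -1 downvote, 0 no vote.
votes = [
    [1, 1, -1, -1],
    [1, 1, -1, 0],
    [1, 0, -1, -1],
    [-1, -1, 1, 1],
    [-1, -1, 1, 0],
    [-1, 0, 1, 1],
]

two_groups = silhouette_score(votes, [0, 0, 0, 1, 1, 1])
three_groups = silhouette_score(votes, [0, 0, 2, 1, 1, 1])
print(round(two_groups, 2), round(three_groups, 2))
```

In practice you would run the clustering for several values of k and compare the scores, alongside a judgment about which groupings are actually interpretable.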

Here is what the five-cluster analysis returned:

FiveClusterRankings

The results show a little more diversity in how the clusters ranked the presidents.  Again, we see some clusters that are more or less voting along party lines based on recent presidents (Clusters 5 and 4).  Clusters 1 and 3 are also interesting in that the algorithm seems to be picking up clusters of visitors who are voting for people who have not been president (Hillary Clinton, Ben Carson) and, thankfully, never were (Adolf Hitler).  Clusters 2 and 3 are most interesting to me, however, as they seem to show a greater resemblance to the academic lists of worst presidents (for reference, see Wikipedia’s rankings of presidents), with the clusters tending toward a more historical bent on how we think of these presidents. I think of this as a more informed partisanship.

By understanding the diverse sets of users that make up our crowdranked lists, we are able to improve our overall rankings and also provide a more nuanced understanding of how different group opinions compare, beyond the demographic groups we currently expose on our Ultimate Lists.  Such analyses help us identify outliers and agenda pushers in the voting patterns, and allow us to rebalance our sample to make lists that more closely resemble a national average.

  • Glenn Fox

 

 

by    in Data Science, Popular Lists, Rankings

In Good Company: Varieties of Women we would like to Drink With


They say you’re defined by the company you keep.  But how are you defined by the company you want to keep?

The list “Famous Women You’d Want to Have a Beer With”  provides an interesting way to examine this idea.  In other words, how people vote on this list can define something about what kind of person is doing the voting.

We can think of people as having many traits, or dimensions.  The traits and dimensions that are most important to the voters will be given higher rankings.  For instance, some people may rank the list thinking about how funny the person is, and so may be more inclined to rate comedians higher than dramatic actresses.  Others may vote just on attractiveness, or based on singing talent, etc.  It may be the case that some people rank comedians and singers in a certain way, whereas others would only spend time with models and actresses.  By examining how people rank the various celebrities along these dimensions, we can learn something about the people doing the voting.

The rankings on the site, however, are based on the sum of all of the voters’ behavior on the list, so the final rankings do not tell us how certain types of people are voting on the list.  While we could manually go through the list to sort the celebrities according to their traits (i.e., put comedians with comedians, singers with singers), we would risk using our own biases to put voters into categories where they do not naturally belong.  It would be much better to let the voters’ own voting decide how the celebrities should be clustered.  To do this, we can use some fancy-math techniques from machine learning, called clustering algorithms, to let a computer examine the voting patterns and then tell us which patterns are similar across voters.  In other words, we use the algorithm to find patterns in the voting data, put similar patterns together into groups of voters, and then examine how the different groups of voters ranked the celebrities.  How each group ranked the celebrities tells us something about the group, and about the type of people they would like to keep them company.

As it happens, using this approach actually finds unique clusters, or groups, in the voting data, and we can then guess for ourselves how the voters from each group can be defined based on the company they wish to keep.

Here are the results:

Cluster 1:

Cluster4_MakeCelebPanels

Cluster 1 includes women known to be funny, including established comedians like Carol Burnett and Ellen DeGeneres. What is interesting is that Emma Stone and Jennifer Lawrence are also included. Both are highly ranked on lists based on physical attractiveness, but they also have a reputation for being funny, and the clustering algorithm is showing us that they are often categorized alongside other funny women as well.  Among the clusters, this cluster has the highest proportion of female voters, which may explain why the celebrities are ranked along dimensions other than attractiveness.

 

Cluster 2:

Cluster1_MakeCelebPanels

Cluster 2 appears to consist of celebrities that are more in the nerdy camp, with Yvonne Strahovski and Morena Baccarin, both of whom play roles on shows popular with science fiction fans.  At the bottom of this list we see something of a contrarian streak as well, with downvotes handed out to some of the best-known celebrities who rank highly on the list overall.

Cluster 3:

Cluster2_MakeCelebPanels

Cluster 3 is a bit more of a puzzle.  The celebrities tend to be a bit older, and come from a wide variety of backgrounds that are less known for a single role or attribute.  This cluster could be basing their votes more on the celebrity’s degree of uniqueness, which is somewhat in contrast with the bottom ranked celebrities who represent the most common and regularly listed female celebrities on Ranker.

Cluster 4:

Cluster3_MakeCelebPanels

We would also expect a list such as this to be heavily correlated with physical attractiveness, or perhaps with the celebrity’s role as a model.  Cluster 4 is perhaps the best example of this, and likely represents our youngest cluster.  The top-ranked women are from the entertainment sector and are known for their looks, whereas the bottom-ranked people are from politics or comedy, or are older and probably less well known to the younger voters.  As we might expect, cluster 3 also has a high proportion of younger voters.

Here is the list of the top and bottom ten for each cluster (note that the order within these lists is not particularly important since the celebrities’ scores will be very close to one another):

TopCelebsPerClusterTable
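A table like the one above can be built once every voter has a cluster label. The sketch below is only an illustration of the bookkeeping: it assumes each voter record carries a cluster label, a gender field and one vote per celebrity (+1, -1 or 0), and it ranks by mean vote within the cluster, which is a stand-in for whatever scoring Ranker actually uses. The record layout, names and votes are all invented.

```python
from collections import defaultdict


def summarize_clusters(voters, items, n=2):
    """Return, per cluster, the top-n and bottom-n items by mean vote and
    the share of female voters. Hypothetical record layout."""
    grouped = defaultdict(list)
    for voter in voters:
        grouped[voter["cluster"]].append(voter)

    summary = {}
    for cluster, members in grouped.items():
        # Mean vote per item among this cluster's members.
        means = [
            sum(m["votes"][i] for m in members) / len(members)
            for i in range(len(items))
        ]
        ranked = [name for name, _ in sorted(zip(items, means), key=lambda t: -t[1])]
        summary[cluster] = {
            "top": ranked[:n],
            "bottom": ranked[-n:],  # lowest-scoring item comes last
            "female_share": sum(m["gender"] == "F" for m in members) / len(members),
        }
    return summary


# Invented toy data with placeholder celebrities.
items = ["Comedian A", "Comedian B", "Model A", "Model B"]
voters = [
    {"cluster": 1, "gender": "F", "votes": [1, 1, -1, -1]},
    {"cluster": 1, "gender": "F", "votes": [1, 0, -1, -1]},
    {"cluster": 1, "gender": "M", "votes": [1, 1, 0, -1]},
    {"cluster": 4, "gender": "M", "votes": [-1, 0, 1, 1]},
    {"cluster": 4, "gender": "F", "votes": [-1, -1, 1, 0]},
]

summary = summarize_clusters(voters, items, n=2)
for cluster in sorted(summary):
    print(cluster, summary[cluster])
```

The same grouping step is what lets us say things like "this cluster has the highest proportion of female voters" without ever feeding gender into the clustering itself.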

 

In the end, the adage that we are defined by the company we keep appears to have some merit, and can be detected with machine learning approaches.  Though not a perfect split among the groups, there were trends in each group that drew the people of the cluster together.  We are using these approaches to help improve the site and to provide better content to our visitors.

 

–Glenn R. Fox, PhD

 

 

A Ranker World of Comedy Opinion Graph: Who Connects the Funny Universe?

In the previous post, we showed how a Gephi layout algorithm was able to capture different domains in the world of comedy across all of the Ranker lists tagged with the word “funny”.  However, these algorithms also give us information about the roles that individuals play within clusters. The size of the node indicates that node’s ability to connect other nodes, so bigger nodes indicate individuals who serve as a gateway between different nodes and categories.  These are the nodes that you would want to target if you wanted to reach the broadest audience, as people who like these comedic individuals also like many others.  It is sort of like having that one friend who knows everyone send out the event invite, instead of sending it to a smaller group of friends in your own social network and hoping it gets around. So who connects the comedic universe?

The short answer: Dave Chappelle (click to enlarge)

Chappelle

Dave Chappelle is the superconnector. He has both the largest number of direct connections and the largest number of overall connections. If you want to reach the most people, go to him. If you want to connect people between different kinds of comedy, go to him.  He is the center of the comedic universe. He’s not the only one with connections though.

Top 10 Overall Connectors

  1. Dave Chappelle 
  2. Eddie Izzard 
  3. John Cleese 
  4. Ricky Gervais
  5. Rowan Atkinson
  6. Eric Idle
  7. Billy Connolly
  8. Bill Hicks
  9. It’s Always Sunny In Philadelphia
  10. Sarah Silverman
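The two notions of connection used above, direct connections and the ability to act as a gateway, correspond to standard graph measures. The post does not name the exact metric behind the node sizes, so treat the following as an assumption: direct connections are counted as node degree, and the gateway role is measured with betweenness centrality, computed here with Brandes' algorithm in plain Python. The tiny graph and its node names are invented and do not come from the Ranker data.

```python
from collections import deque


def build_graph(edges):
    """Turn a list of undirected edges into a dict of adjacency sets."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def degree(graph):
    """Number of direct connections per node."""
    return {node: len(neighbors) for node, neighbors in graph.items()}


def betweenness(graph):
    """Brandes' algorithm for an unweighted, undirected graph.
    Counts, for each node, the shortest paths between other pairs that
    pass through it (split evenly when several shortest paths tie)."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        # Breadth-first search from s, counting shortest paths (sigma).
        stack = []
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}
        dist = {v: -1 for v in graph}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if dist[w] < 0:  # first time we reach w
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating dependencies.
        delta = {v: 0.0 for v in graph}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both ends, so halve the totals.
    return {v: total / 2 for v, total in score.items()}


# Invented toy network: one hub bridging two tight pairs.
graph = build_graph([
    ("Standup 1", "Standup 2"),
    ("Standup 1", "Hub"),
    ("Standup 2", "Hub"),
    ("Hub", "Sitcom 1"),
    ("Hub", "Sitcom 2"),
    ("Sitcom 1", "Sitcom 2"),
])

degrees = degree(graph)
gateway = betweenness(graph)
print(max(degrees, key=degrees.get), max(gateway, key=gateway.get))
```

In this toy network the hub wins on both measures, which is the situation described for the top connector above. The two measures can disagree on larger graphs, which is why it is worth reporting both.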

 

We can also look at who the biggest connectors are between different comedy domains.

  • Contemporary TV Shows: It’s Always Sunny in Philadelphia, ALF, and The Daily Show are the strongest connectors. They provide bridges to all 6 other comedy domains.
  • Contemporary Comedians on American Television: Dave Chappelle, Eddie Izzard and Ricky Gervais are the strongest connectors. They provide bridges to all 6 other comedy domains.
  •  Classic Comedians: John Cleese and Eric Idle are the strongest connectors. They provide bridges to all 6 other comedy domains.
  • Classic TV Shows: The Muppet Show and Monty Python’s Flying Circus are the strongest connectors. They provide bridges to Classic TV Comedians, Animated TV shows, and Classic Comedy Movies.
  • British Comedians: Rowan Atkinson is the strongest connector. He serves as a bridge to all of the other 6 comedy domains.
  • Animated TV Shows: South Park is the strongest connector. It serves as a bridge to Classic Comedians, Classic TV Shows, and British Comedians.
  • Classic Comedy Movies: None of the nodes in this domain were strong connectors to other domains, though National Lampoon’s Christmas Vacation was the strongest node in this network.

 

 
