by    in Data Science, Popular Lists, Rankings

In Good Company: Varieties of Women We Would Like to Drink With

They say you’re defined by the company you keep.  But how are you defined by the company you want to keep?

The list “Famous Women You’d Want to Have a Beer With”  provides an interesting way to examine this idea.  In other words, how people vote on this list can define something about what kind of person is doing the voting.

We can think of people as having many traits, or dimensions.  Celebrities who score well on the traits that matter most to a voter will be ranked higher by that voter.  For instance, some people may rank the list by how funny each person is, and so may be more inclined to rate comedians higher than dramatic actresses.  Others may vote purely on attractiveness, on singing talent, and so on.  It may be the case that some people rank comedians and singers in a certain way, whereas others would only spend time with models and actresses.  By examining how people rank the various celebrities along these dimensions, we can learn something about the people doing the voting.

The rankings on the site, however, are based on the sum of all voters’ behavior on the list, so the final rankings do not tell us how particular types of people are voting.  While we could manually go through the list and sort the celebrities according to their traits (comedians with comedians, singers with singers), we would risk letting our own biases put celebrities, and the voters who favor them, into categories where they do not naturally belong.  It would be much better to let the voters’ own voting decide how the celebrities should be clustered.  To do this, we can use some fancy-math techniques from machine learning, called clustering algorithms, to let a computer examine the voting patterns and tell us which patterns are similar across voters.  In other words, we use the algorithm to find patterns in the voting data, group voters with similar patterns together, and then examine how each group of voters ranked the celebrities.  How each group ranked the celebrities tells us something about the group, and about the type of people its members would like to keep them company.
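To make the idea concrete, here is a minimal sketch of grouping voters by their voting patterns. The post does not say which clustering algorithm was used, so plain k-means is an assumption here, and the celebrity labels, votes, and starting centers are made-up toy data rather than anything from the actual list.

```python
# Hypothetical toy data: each row is one voter, each column one celebrity.
# +1 = upvote, -1 = downvote, 0 = did not vote. Labels are illustrative only.
CELEBS = ["comedian_a", "comedian_b", "model_a", "model_b"]
VOTES = [
    [1, 1, -1, 0],
    [1, 1, 0, -1],
    [1, 0, -1, -1],
    [-1, 0, 1, 1],
    [-1, -1, 1, 1],
    [0, -1, 1, 1],
]


def squared_distance(a, b):
    """Squared Euclidean distance between two vote vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(rows):
    """Column-wise average of a group of vote vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def kmeans(rows, initial_centers, iterations=20):
    """Plain k-means (an assumed stand-in for the post's unnamed algorithm)."""
    centers = [list(c) for c in initial_centers]
    labels = [0] * len(rows)
    for _ in range(iterations):
        # Assign every voter to the nearest cluster center.
        labels = [
            min(range(len(centers)), key=lambda k: squared_distance(row, centers[k]))
            for row in rows
        ]
        # Move each center to the average of its voters.
        for k in range(len(centers)):
            members = [row for row, label in zip(rows, labels) if label == k]
            if members:  # keep the old center if a cluster empties out
                centers[k] = mean_vector(members)
    return labels, centers


def ranking_for_cluster(center):
    """Celebrities ordered from most to least liked by one cluster."""
    order = sorted(range(len(center)), key=lambda i: center[i], reverse=True)
    return [CELEBS[i] for i in order]


labels, centers = kmeans(VOTES, initial_centers=[VOTES[0], VOTES[3]])
for k, center in enumerate(centers):
    print("cluster", k, ranking_for_cluster(center))
```

Each cluster center is the average vote of the voters in that cluster, so sorting it gives that group's own ranking of the celebrities, which is the kind of per-group list discussed in this post.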

As it happens, using this approach actually finds unique clusters, or groups, in the voting data, and we can then guess for ourselves how the voters from each group can be defined based on the company they wish to keep.

Here are the results:

Cluster 1:

Cluster4_MakeCelebPanels

Cluster 1 includes women known to be funny, among them established comedians like Carol Burnett and Ellen DeGeneres.  Interestingly, Emma Stone and Jennifer Lawrence are also included.  Both rank highly on lists based on physical attractiveness, but they also have a reputation for being funny, and the clustering algorithm shows that voters often group them with other funny women.  Among the clusters, this one has the highest proportion of female voters, which may explain why the celebrities are ranked along dimensions other than attractiveness.

 

Cluster 2:

Cluster1_MakeCelebPanels

Cluster 2 appears to consist of celebrities that are more in the nerdy camp, with Yvonne Strahovski and Morena Baccarin, both of whom play roles on shows popular with science fiction fans.  At the bottom of this list we see something of a contrarian streak as well, with downvotes handed out to some of the best-known celebrities who rank highly on the list overall.

Cluster 3:

Cluster2_MakeCelebPanels

Cluster 3 is a bit more of a puzzle.  The celebrities tend to be a bit older, and come from a wide variety of backgrounds that are less known for a single role or attribute.  This cluster could be basing their votes more on the celebrity’s degree of uniqueness, which is somewhat in contrast with the bottom ranked celebrities who represent the most common and regularly listed female celebrities on Ranker.

Cluster 4:

Cluster3_MakeCelebPanels

We would also expect a list such as this to be heavily correlated with physical attractiveness, or perhaps with the celebrity’s work as a model.  Cluster 4 is perhaps the best example of this, and likely represents our youngest cluster.  The top-ranked women are from the entertainment sector and are known for their looks, whereas the bottom-ranked people come from politics or comedy, or are older and probably less well known to the younger voters.  As we might expect, Cluster 3 also has a high proportion of younger voters.

Here is the list of the top and bottom ten for each cluster (note that the order within these lists is not particularly important, since the celebrities’ scores will be very close to one another):

TopCelebsPerClusterTable

 

In the end, the adage that we are defined by the company we keep appears to have some merit, and it can be detected with machine learning approaches.  Though the groups were not perfectly split, there were trends in each group that drew the people of the cluster together.  We are using these approaches to help improve the site and to provide better content for our visitors.

 

–Glenn R. Fox, PhD

 

 

by    in Data

Why Ranker Data is Better than Facebook’s and Twitter’s

 By Clark Benson (CEO, Ranker)

It’s unlikely you’ll be pouring freezing water over your head for it, but the marketing world is experiencing its own Peak Oil crisis.

Yes, you read correctly: we don’t have enough data. At least not enough good data.

Pull up to any marketing RSS and you’ll read the same story: the world is awash in golden insights, companies are able to “know” their customers in real time and predict more and better about their own market … blablabla.

Here’s what you won’t read: it’s really, really hard. And it’s getting harder, for the simple reason that we are all positively drenched in … overwhelmingly bad data. Noisy, incomplete, out of context, approximate, downright misleading data. “Big Data” = (Mostly) Bad Data as it tends to draw explicit behavior from implicit and noisy sources like social media or web visits.

Traditional market research methods are getting less reliable due to dropping response rates, especially among young, tech-savvy consumers. To counteract this trend, marketing research firms have hired hundreds of PhDs to refine the math in their models and try to build a better picture of the zeitgeist, leveraging social media and implicit web behavior. This has proven to be a dangerous proposition, as modeling and research firms have fallen prey to statistics’ number one rule: garbage in, garbage out.

No amount of genius mathematical skills can fix Bad Data, and simple statistical models on well measured data will trump extensive algorithms on badly measured data every single time. Sophisticated statistical models might help in political polling, where people are far more predictable based on party and demographics, but they won’t do anything to help traditional marketing research, where people’s tastes and positions are less entrenched and evolve more rapidly.

Parsing the exact sentiment behind a “like”, a follow or a natural language tweet is extremely difficult, as analysts often lack control over the sample population they are covering, as well as any context about why the action occurred, and what behavior or opinion triggered it. Since there is no negative sentiment to use as control, there is no ability to unconfound good with popular. Natural language processing algorithms can’t sort out sarcasm, which reigns supreme on social media, and even the best algorithms can’t reliably categorize the sentiment of more than 50% of Twitter’s volume of posts. Others have pointed out the issues with developing a more than razor-thin understanding of consumer mindsets and preferences based on social media data. What does a Facebook “Like” mean, exactly? If you “like” Coca-Cola on Facebook, does it mean that you like the product or the company? And does it necessarily mean you don’t like Pepsi? And what is a “like” worth? Nobody knows.

This is where we come in. We at Ranker have developed a very good answer to this issue: the “opinion graph”, which is a more precise version of the “interest graph” that advertisers are currently using.

Ranker is a popular website (top 200, with 18 million unique visitors and 300 million pageviews per month) that crowdsources answers to questions, using the popular list format.  Visitors to Ranker can view, rank and vote on items on around 400,000 lists. Unlike more ambiguous data points based on Facebook likes or Twitter tweets, Ranker solicits precise and explicit opinions from users about questions like the most annoying celebrities, the best guilty pleasure movies, the most memorable ad slogans, the top dream colleges, or the best men’s watch brands.

It’s very simple: instead of the vaguely positive act of “liking” a popular actor on Facebook, Ranker visitors cast 8 million votes every month and thus directly express whether they think someone is “hot”, “cool”, one of the “best actors of all-time”, or just one of the “best action stars”. Not only that, they also vote on other lists of items seemingly unrelated to their initial interest: best cars, best beers, most annoying TV shows, etc.

As a result, Ranker has been building since 2008 the world’s largest opinion graph, with 50,000 nodes (topics) and 20 million edges (statistically significant connections between 2 items). Thanks to our massive sample and our rich database of correlations, we can tell you that people who like “Modern Family” are 5x more likely to dine at “Chipotle” than non-fans, or people who like the Nissan 370Z also like oddball comedy movies such as “Napoleon Dynamite” and “Big Lebowski”, and TV shows such as “Dexter” and “Weeds”.
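A claim like "5x more likely" can be read as a ratio of two rates: how often fans of one item like a second item, divided by how often non-fans do. The sketch below shows that arithmetic on made-up data. The user IDs and likes are hypothetical, and it leaves out the statistical significance testing mentioned above, so it illustrates the idea rather than Ranker's actual pipeline.

```python
# Hypothetical data: the set of items each user has upvoted somewhere on the site.
USER_LIKES = {
    "u1": {"Modern Family", "Chipotle"},
    "u2": {"Modern Family", "Chipotle"},
    "u3": {"Chipotle"},
    "u4": {"Dexter"},
    "u5": {"Weeds"},
    "u6": set(),
    "u7": {"Dexter", "Weeds"},
}


def likelihood_ratio(user_likes, fan_of, target):
    """How many times more likely fans of `fan_of` are to like `target` than non-fans.

    Assumes there is at least one fan, one non-fan, and one non-fan who likes
    the target (otherwise the ratio is undefined).
    """
    fans = [likes for likes in user_likes.values() if fan_of in likes]
    non_fans = [likes for likes in user_likes.values() if fan_of not in likes]
    fan_rate = sum(target in likes for likes in fans) / len(fans)
    non_fan_rate = sum(target in likes for likes in non_fans) / len(non_fans)
    return fan_rate / non_fan_rate


ratio = likelihood_ratio(USER_LIKES, fan_of="Modern Family", target="Chipotle")
print(f"Fans are about {ratio:.1f}x as likely to like the target as non-fans")
```

In this toy data both fans like the target (a rate of 2 out of 2) while only one of five non-fans does, which is where the ratio comes from. With real data, small groups make such ratios noisy, which is presumably why only statistically significant connections are kept as edges.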

Our exclusive Ranker “FanScope” about the show “Mad Men” lays out this capability in more detail below:

Mad Men Data

How good is it? Pretty good. Like “we predicted the outcome of the World Cup better than Nate Silver’s FiveThirtyEight and Betfair” good.

Our opinion data is also much more precise than Facebook’s, since we not only know that someone who likes Coke is very likely to rank “Jaws” as one of his/her top movies of all time, but we’re able to differentiate between those who like to drink Coke, and those who like Coca-Cola as a company:

jaws chart

We’re also able to differentiate between people who always like Pepsi better than Coke overall, and those who like to drink Coke but just at the movie theater:

  • 47% of Pepsi fans on Ranker vote for (vs. against) Coke on Best Sodas of All Time
  • 65% of Pepsi fans on Ranker vote for (vs. against) Coke on Best Movie Snacks

That’s the kind of specific relationship you can’t get using Facebook data or Twitter messages.
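The Pepsi and Coke figures above are context-specific rates: the same group of voters, the same item, but tallied separately per list. Here is a rough sketch of that tally. The vote records are invented, and treating anyone who upvoted Pepsi on any list as a "Pepsi fan" is an assumption made for illustration.

```python
from collections import defaultdict


def record(voters, list_name, item, direction):
    """Helper that builds hypothetical (voter, list, item, direction) records."""
    return [(voter, list_name, item, direction) for voter in voters]


# Hypothetical votes, not Ranker data.
VOTE_RECORDS = (
    record(["u1", "u2", "u3", "u4"], "Best Sodas", "Pepsi", "up")
    + record(["u5"], "Best Sodas", "Pepsi", "down")
    + record(["u1", "u5"], "Best Sodas", "Coke", "up")
    + record(["u2", "u3", "u4"], "Best Sodas", "Coke", "down")
    + record(["u1", "u2", "u3", "u5"], "Best Movie Snacks", "Coke", "up")
    + record(["u4"], "Best Movie Snacks", "Coke", "down")
)


def upvote_share_by_list(votes, fan_item, target_item):
    """Share of `fan_item` fans voting for (vs. against) `target_item`, per list."""
    # Assumption: a fan is anyone who upvoted `fan_item` on any list.
    fans = {voter for voter, _, item, d in votes if item == fan_item and d == "up"}
    counts = defaultdict(lambda: [0, 0])  # list name -> [upvotes, total votes]
    for voter, list_name, item, direction in votes:
        if voter in fans and item == target_item:
            counts[list_name][1] += 1
            if direction == "up":
                counts[list_name][0] += 1
    return {name: ups / total for name, (ups, total) in counts.items()}


shares = upvote_share_by_list(VOTE_RECORDS, fan_item="Pepsi", target_item="Coke")
for list_name, share in shares.items():
    print(f"{share:.0%} of Pepsi fans vote for Coke on {list_name}")
```

Because every vote carries the list it was cast on, the same pair of items can produce different numbers in different contexts, which a context-free "like" cannot do.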

By collecting millions of discrete opinions each month on thousands of diverse topics, Ranker is the only company able to combine internet-level scale (hundreds of thousands surveyed on millions of opinions each month) with market research-level precision (e.g. adjective specific opinions about specific objects in a specific context).

We can poll questions that are too specific (e.g. most memorable slogans) or not lucrative enough (most annoying celebrities) for other pollsters. And we use the same types of mathematical models to address sampling challenges that all pollsters (internet or not internet based) currently have, working with some of the world’s leading academics who study crowdsourcing, such as our Chief Data Scientist Ravi Iyer, and UC Irvine Cognitive Sciences professor Michael Lee.

Our data suggests you won’t be dropping gallons of iced water on your face over it. But if you’re a marketer or an advertiser, we predict it’s likely you will want to pay close attention.

by    in Data

Men and Women Both Lie—But They Do It For Different Reasons


We all tell white lies now and then (yes you do, don’t lie!) but did you know that men and women lie for different reasons? The data from our list of Things People Lie About All the Time shows a pattern that may hint at this difference.

The poll lists 49 common lies and asks respondents to vote “yes” if they’ve lied about that in the past 6 months or “no” if they have not. According to votes cast by over 350 people, women are more likely to lie about things that “keep the peace socially” while men are more likely to lie over matters of “self-preservation.”

On the list, women are 8 times more likely than men to lie about “being too swamped to hang out” and 4 times more likely to claim that their “phone died.” These results imply that women may be more likely to feel guilty about canceling on friends or having alone time.

In contrast, men were 2 times more likely to admit to saying things like “Oh yeah! That makes sense!” when they did not understand something and 5 times more likely to say, “No officer, I do not know why you pulled me over,” when, presumably, they did know why. These types of lies could point to men’s desire to show themselves in the best possible light and cover up wrongdoing.
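Figures like "4 times more likely" come from comparing the share of "yes" votes in each group for the same item. A small sketch of that comparison, along with the "most popular lies per group" ranking, is below. The response counts are invented for illustration and are not the poll's actual numbers.

```python
def make(gender, lie, yes_count, no_count):
    """Helper that builds hypothetical (gender, lie, answered_yes) responses."""
    return [(gender, lie, True)] * yes_count + [(gender, lie, False)] * no_count


# Hypothetical responses, not the real poll data.
RESPONSES = (
    make("women", "I'm fine", 9, 1) + make("men", "I'm fine", 8, 2)
    + make("women", "My phone died", 4, 6) + make("men", "My phone died", 1, 9)
    + make("women", "That makes sense", 2, 8) + make("men", "That makes sense", 4, 6)
)


def yes_rates(responses, gender):
    """Share of 'yes' votes per lie within one group."""
    counts = {}
    for who, lie, said_yes in responses:
        if who == gender:
            yes, total = counts.get(lie, (0, 0))
            counts[lie] = (yes + said_yes, total + 1)
    return {lie: yes / total for lie, (yes, total) in counts.items()}


def times_more_likely(responses, lie, group_a, group_b):
    """Ratio of group_a's yes-rate to group_b's yes-rate for one lie."""
    return yes_rates(responses, group_a)[lie] / yes_rates(responses, group_b)[lie]


def top_lies(responses, gender, n):
    """The n lies with the highest yes-rate in one group."""
    rates = yes_rates(responses, gender)
    return sorted(rates, key=rates.get, reverse=True)[:n]


phone_ratio = times_more_likely(RESPONSES, "My phone died", "women", "men")
print(f"Women are {phone_ratio:.0f}x as likely as men to say 'My phone died'")
print("Top lie for women:", top_lies(RESPONSES, "women", 1))
print("Top lie for men:", top_lies(RESPONSES, "men", 1))
```

Two groups can differ a lot on individual items and still share the same most popular ones, because the ratio compares groups on one item while the ranking compares items within one group.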

Differences aside, both men and women voted similarly on many items on this list. In fact, the top 3 most popular lies were the same for both men and women.

The Top 3 Lies for BOTH Men and Women Are:

1. I’m Fine

2. I’m 5 Minutes Away

3. Yeah, I’m Listening.

Which goes to show that men and women may be able to see eye-to-eye after all… just as long as they don’t ask each other how they are doing, where they are and whether or not they are listening.

by    in Trends

Bruno Mars Is Disliked, But Mostly by Older People

Bruno Mars With His Grammy

Bruno Mars is #98 out of the 468 worst bands of all time, as voted on by over 12.5k voters. But it turns out that older people dislike him way more than younger people.

Super Bowl XLVIII is upon us! Woop, woop! Time to prepare for head-spinning sensory overload: brawny men knocking each other’s lights out, sassy cheerleaders, flashy TV ads… it’s almost too much to handle. And what about the crown jewel of the day’s entertainment: the halftime performer, Bruno Mars?

Will Peter Gene Hernandez, aka Bruno Mars, have the charisma to command a crowd of 100,000 screaming fans, not to mention the 110 million Americans who are expected to watch the game from home?

According to our data, it’s not looking good. At least not on the surface. Bruno Mars is currently ranked #98 on our list of The Worst Bands of All Time.

As of right now, 12,627 people have voted on this list, which means that there are a whole lot of haters out there. Compound that with the fact that people always complain about the Super Bowl halftime performer, and it’s looking like Bruno Mars may not get a lot of love for his performance.


Puppy Bowl FTW | Love for Beyonce | Standing up for Amurika

However, when we slice the data up a little, it actually looks like things may not be that bad for the incredibly short crooner.

Why’s that, you ask?

1. Not to be taken lightly: Bruno Mars Has Got Some Serious Dance Moves.

Even though he was voted as one of the worst bands of all-time, people also acknowledge that he’s got some moves, which bodes well for him as a performer—especially in a situation where he is expected to wow the crowd. He was voted #29 out of 54 on this list of the best dancing singers.

Bruno Mars Dancing GIF
Bruno’s got moves.

2. Young people like Bruno Mars way more than this list would have you believe.  

Statistically Speaking: If we isolate the votes coming from only young people (ages 30 and under, that is), Bruno Mars would drop all the way down to #381 on the list of worst bands of all time.

Only 5 out of 12 young people who voted on this list upvoted Bruno Mars. That’s about 40% who agree that he should be considered one of the worst bands of all time.

Compare that to, say, Justin Bieber, who received 16 upvotes for every 19 people who voted on him, which is close to 85%.

*Or, in Plain English if You’re Starting to Get a Headache: Being #381 on a ‘worst bands’ list is way better than being #1. Young people voting him down on this list means that a lot of them do not think that he is the worst.
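Slicing a list by age amounts to throwing away votes from everyone outside the age group and re-ranking on what is left. The sketch below does this on invented votes. Ranking purely by upvote share is an assumption (the exact formula behind the site's rankings is not described here), and the artist labels and ages are hypothetical.

```python
def ballots(item, age, ups, downs):
    """Helper that builds hypothetical votes; 'up' is 1 for an upvote, 0 for a downvote."""
    return [{"item": item, "age": age, "up": 1}] * ups + [
        {"item": item, "age": age, "up": 0}
    ] * downs


# Hypothetical votes on a 'worst bands' list: an upvote means "yes, they belong here".
WORST_LIST_VOTES = (
    ballots("artist_x", 25, ups=1, downs=2) + ballots("artist_x", 60, ups=2, downs=1)
    + ballots("artist_y", 22, ups=2, downs=0) + ballots("artist_y", 55, ups=1, downs=1)
    + ballots("artist_z", 28, ups=1, downs=1) + ballots("artist_z", 65, ups=0, downs=2)
)


def rank_by_upvote_share(votes, keep_voter):
    """Rank items by share of upvotes, counting only voters whose age passes `keep_voter`."""
    tallies = {}
    for vote in votes:
        if keep_voter(vote["age"]):
            ups, total = tallies.get(vote["item"], (0, 0))
            tallies[vote["item"]] = (ups + vote["up"], total + 1)
    shares = {item: ups / total for item, (ups, total) in tallies.items()}
    return sorted(shares, key=shares.get, reverse=True)


everyone = rank_by_upvote_share(WORST_LIST_VOTES, lambda age: True)
young = rank_by_upvote_share(WORST_LIST_VOTES, lambda age: age <= 30)
older = rank_by_upvote_share(WORST_LIST_VOTES, lambda age: age > 50)
print("all voters:", everyone)
print("30 and under:", young)
print("over 50:", older)
```

In this toy data the same artist lands at the bottom of the "worst" ranking among young voters and at the top among older voters, which is the pattern the age breakdown in this post describes.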

3. Old people hating on Bruno Mars is making him rise in the ‘worst of’ rankings. 

Bruno Mars loses the mic GIF
What do you have to say about that, Bruno?

Let’s look at how old people (over the age of 50) feel about our buddy Bruno. If we strip out all of the young and middle-aged voters, Bruno Mars would climb up to #82 on the list of worst bands of all time. Remember, getting closer to #1 on a ‘worst’ list is not a good thing.

The ratio for this demographic is much higher. 3 out of 5 old-timers who voted on this list upvoted Bruno Mars as the worst band of all time. That’s 60% for those of you who are keeping track.

While we can crunch the numbers for preferences according to age, it must be noted that we do not have a specific reason as to why people voted this way. Why do people over 50 dislike Bruno Mars? Is it his cavalier attitude, his voluminous hair, his sexy lyrics? His widely-publicized cocaine bust?

Bruno Mars’ mug shot actually isn’t that bad.

Either way–we’d bet that Bruno would rather be pleasing young’uns than winning the hearts of old folks. They are the ones, after all, who will be more likely to pay to see him perform live (they’ll also be way more likely to pirate his music, but that’s another conversation).

So, while you are eating your triple beany cheese nachos and downing Bud heavies this weekend and one of your friends starts to complain about how much he hates Bruno Mars…you can think to yourself (or gently point out) that maybe he’s just too old to understand him.

Ranker Opinion Graph: the Best Froyo Toppings

It’s hard to resist a cold treat on a hot summer afternoon, and frozen yogurt shops with their array of flavors and toppings have a little something for everyone. Once you’re done agonizing over whether you want New York cheesecake or wild berry froyo (and trying a sample of each at least twice), it’s time for the topping bar. But which topping should you choose? We asked people to vote for their favorite frozen yogurt toppings on Ranker from a list of 32 toppings, and they responded with over 7,500 votes.

The Top 5 Frozen Yogurt Toppings (by number of upvotes):
1. Oreo (235 votes)
2. Strawberries (225 votes)
3. Brownie bits (223 votes)
4. Hot fudge (216 votes)
5. Whipped cream (201 votes)

But let’s be honest, who can choose just ONE topping for their froyo? Using Gephi and data from Ranker’s Opinion Graph, we ran a cluster analysis on people’s favorite froyo topping votes to determine which toppings people like to eat together. In the graph, larger circles mean more likes with other toppings. Most of the versatile toppings were either a syrup (like strawberry sauce) or chocolate candy (like Reese’s Pieces).

The 10 Most Versatile Froyo Toppings:

1. Strawberry sauce
2. Snickers
3. Magic Shell
4. White Chocolate chips
5. Peanut butter chips
6. Butterscotch syrup
7. Candies Nestle Butterfinger Bar
8. Reese’s Pieces
9. M&Ms
10. Brownie bits
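One way to picture "versatility" is as a graph in which two toppings are linked whenever the same person upvotes both, and a topping's circle grows with the number of other toppings it is linked to. The sketch below builds such a graph from invented votes. Reading circle size as the count of distinct linked toppings is an assumption, since the post does not spell out the exact measure Gephi was set to display.

```python
from collections import Counter
from itertools import combinations

# Hypothetical data: the toppings each voter upvoted.
TOPPING_UPVOTES = {
    "u1": {"strawberries", "almonds", "strawberry sauce"},
    "u2": {"brownie bits", "hot fudge", "strawberry sauce"},
    "u3": {"gummy worms", "sprinkles"},
    "u4": {"strawberries", "strawberry sauce"},
}


def co_upvote_edges(upvotes):
    """Count, for every pair of toppings, how many voters upvoted both."""
    edges = Counter()
    for toppings in upvotes.values():
        # Sorting makes (a, b) and (b, a) land on the same key.
        for a, b in combinations(sorted(toppings), 2):
            edges[(a, b)] += 1
    return edges


def versatility(edges):
    """Number of distinct other toppings each topping is linked to."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return degree


edges = co_upvote_edges(TOPPING_UPVOTES)
for topping, links in versatility(edges).most_common(3):
    print(topping, "is linked to", links, "other toppings")
```

The edge counts (how many voters share a pair) are the kind of weighted connections a tool like Gephi can take as input for layout and clustering.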

 

Using the modularity clustering tool in Gephi, we were then able to sort toppings into groups based on which toppings people were most likely to upvote together. We identified 4 kinds of froyo topping lovers:

1. Fruit and Nuts (blue): This cluster is all about the fruits and nuts. These people love strawberry sauce, sliced almonds, and maraschino cherries.

2. Chocolate (purple): This cluster encompasses all things chocolate. These people love Magic Shell, brownie bits, and chocolate syrup.

3. Sugar candy (green): This cluster is made up of pure sugar. These people love gummy worms, rainbow sprinkles, and Skittles.

4. Salty and Cake (red): This cluster encompasses cake bites and toppings that have a salty taste to them. These people like Snickers, cheesecake bits, and caramel syrup.
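Modularity is a score for how well a grouping separates a graph: it is high when most of the link weight falls inside groups rather than between them. Gephi's modularity tool searches for a grouping with a high score, while the sketch below only computes the score for a grouping you hand it, which is enough to see why one split of toppings can beat another. The toppings, link weights, and groupings are invented for illustration.

```python
# Hypothetical co-upvote graph: (topping, topping) -> number of shared voters.
EDGES = {
    ("strawberries", "almonds"): 1,
    ("strawberries", "cherries"): 1,
    ("almonds", "cherries"): 1,
    ("brownie bits", "hot fudge"): 1,
    ("brownie bits", "chocolate syrup"): 1,
    ("hot fudge", "chocolate syrup"): 1,
    ("cherries", "brownie bits"): 1,  # the single link between the two groups
}


def modularity(edges, community_of):
    """Modularity Q = sum over groups of (inside / m) - (degree / 2m)^2.

    `inside` is the link weight within a group, `degree` is the total weight
    attached to the group's toppings, and m is the total weight in the graph.
    """
    total = sum(edges.values())
    inside = {}
    degree = {}
    for (a, b), weight in edges.items():
        ca, cb = community_of[a], community_of[b]
        degree[ca] = degree.get(ca, 0) + weight
        degree[cb] = degree.get(cb, 0) + weight
        if ca == cb:
            inside[ca] = inside.get(ca, 0) + weight
    return sum(
        inside.get(c, 0) / total - (degree[c] / (2 * total)) ** 2 for c in degree
    )


fruit_vs_chocolate = {
    "strawberries": "fruit", "almonds": "fruit", "cherries": "fruit",
    "brownie bits": "chocolate", "hot fudge": "chocolate", "chocolate syrup": "chocolate",
}
one_big_group = {topping: "all" for topping in fruit_vs_chocolate}

q_split = modularity(EDGES, fruit_vs_chocolate)
q_single = modularity(EDGES, one_big_group)
print(f"fruit vs. chocolate split: {q_split:.3f}")
print(f"everything in one group:   {q_single:.3f}")
```

The fruit versus chocolate split scores higher than lumping everything together, because six of the seven links fall inside a group. A clustering tool tries many such groupings and keeps the one that scores best.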

 

Some additional thoughts:

  • Banana was a strange topping that was only linked with Snickers.
  • People who like nuts like both fruit and items from the salty category.
  • People who like blueberries only like other fruits.
  • People who like sugar items like gummy worms also like chocolate, but don’t particularly like fruit.

 

– Kate Johnson