by   Ranker
in Popular Lists

Will 2015 be the year that better data eclipses bigger data?

Data is a tool, not an end, but understandably, some people are really into their tools. They like to describe how many zettabytes their data takes up every picosecond, requiring even more tools that allow them to analyze that data ever faster. It’s very, very cool. But just like the engines on those Lamborghinis I see idling in Los Angeles traffic on the way to the office, I have to question how truly useful all that engineering is.

Do we really need zettabytes of data to produce the insight that I might, in my weaker moments, click on a link advertising photos of singles in my area or detailing “13+ Things You Shouldn’t Eat in a Restaurant”? [These are actual headlines served by content recommendation companies that leverage enormous datasets on web behavior.] Does Facebook really need all my likes, interests, and friends to know to serve me clickbait, or is the single biggest predictor of whether I might generate a click for an advertiser the fact that I have enjoyed clickbait in the past?  If 8% of internet users account for 85% of banner ad clicks, how effective can the plethora of data scientists who work on advertising actually be, over and above a simple cookie that identifies that 8% and removes banner ads for everyone else?

Rather than simply declaring, in rather cliched form, that “big data is dead”, I have a solution: Better Data.  If I want to know what to buy my wife for Christmas, I can analyze everything she has done on the internet for the past 10 years…or I could just ask her.  If I want to know who is going to win the World Cup, I could analyze the statistics of every player and team in every situation and create an algorithm that scores their collective talents…or I could just ask people who they think will win.  Small datasets with rich variables that incorporate lots of information intelligently (e.g. stock prices) almost always out-perform complex algorithms performed on low-level datasets.

Evidence for this is found not only in the fact that algorithms cannot reliably beat the stock market (though they can make money by beating slower, dumber algorithms), but that the world’s biggest companies like Google, Facebook, and Baidu are emphasizing “Deep Learning” artificial intelligence as primary initiatives.  Deep learning attempts to encode the patterns hiding in lots of low level data points (e.g. pixel colors) into higher-order variables that human beings find meaningful (e.g. a cat or a smiling friend), effectively creating better smaller datasets.  The excitement over deep learning is an acknowledgment that zettabytes of data yield far less meaningful information about a person than the average human can get from a 15 minute conversation.  Deep learning may someday allow Google to read our email with the same sophistication as a human, but the average toddler still far outpaces the most sophisticated deep learning algorithms. And it still needs good data to be trained on.  It will never be able to take all the videos ever uploaded onto YouTube and predict much variance in the direction of the stock market because the data is not there. If you want to predict the stock market, you need better data on companies.  If you want to predict what a person will buy or better yet, what really motivates them, you need to ask them questions about what motivates them.

How can we create better datasets?  Think less like an engineer and more like someone writing a biography.  Rather than trying ever more technological solutions to squeeze knowledge from a stone, think about what is missing in our understanding of the average person.  If, through some combination of deep learning and data aggregation, I am able to fully understand 1% or 25% or 100% of a person’s online behavior, I still will only understand that part of their world that is revealed through their online behavior.  How can we start to ask people what their most meaningful moments from college were, what annoys them most, or what makes them happiest in their quiet moments?  Dating sites probably have some of the best data around because they ask meaningful questions, even given the relatively low number of people who use those sites as compared to Gmail or Facebook, and the sharpness of the insights that they are able to produce is no accident.  The OK Cupid blog (better data) will always be more interesting than the Facebook data blog (bigger data) until Facebook is able to collect data more meaningful than the generic “like”.

2015 is an exciting time to be working on data.  Tools are more accessible than ever, such that many engineers can find a tutorial and learn to run any algorithm in a weekend.  Data is more ubiquitous and accessible than ever as well. But the world doesn’t need yet another company that takes publicly accessible data and mines it for sentiment, while throwing off stats about how big their data is.  Think like a biographer, figure out what nobody else is asking, and create meaningful data.

– Ravi Iyer

by   Ranker
in Opinion Graph, Ranker Comics

A Cluster Analysis of the Superpower Opinion Graph produces 5 Superhero types

If you could have one superpower, which would you choose?  Data from the Ranker list “Badass Superpowers We’d Give Anything to Have” improves on the age-old classroom ice breaker question by letting people rank all of the superpowers in order of how much they would want them.  Because really, unless you’re one of the X-men, you probably would have more than one power. So, if you could have a collection of superpowers, what kind of superhero would you be?

Using Gephi and data from Ranker’s Opinion Graph, we ran a cluster analysis on people’s votes on the superpowers list to determine what groupings of superpowers different people wanted.

This analysis grouped superpowers into 5 clusters, which we interpreted to represent unique superhero types.
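
For readers who want to try something similar, here is a minimal sketch of grouping list items by who voted for them. It is not the Gephi analysis described above: the vote data is made up, and the method (Jaccard similarity between items, then connected components over a similarity threshold) is a simple stand-in for the clustering Gephi performs. The function names and the threshold value are assumptions chosen for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy data: voter -> set of superpowers they voted up.
votes = {
    "u1": {"flight", "super speed", "super strength"},
    "u2": {"flight", "super speed"},
    "u3": {"super strength", "flight"},
    "u4": {"control fire", "control water"},
    "u5": {"control fire", "control water", "control weather"},
    "u6": {"control weather", "control water"},
}


def jaccard(a, b):
    """Share of voters two items have in common, from 0.0 to 1.0."""
    return len(a & b) / len(a | b)


def cluster_items(votes, threshold=0.3):
    """Group items whose voter sets overlap at or above the threshold."""
    # Invert the data: item -> set of voters who picked it.
    voters_by_item = defaultdict(set)
    for voter, items in votes.items():
        for item in items:
            voters_by_item[item].add(voter)

    # Link two items when enough of the same people voted for both.
    neighbors = {item: set() for item in voters_by_item}
    for a, b in combinations(voters_by_item, 2):
        if jaccard(voters_by_item[a], voters_by_item[b]) >= threshold:
            neighbors[a].add(b)
            neighbors[b].add(a)

    # Each connected component of the linked items is one cluster.
    clusters, seen = [], set()
    for start in neighbors:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbors[node] - component)
        seen |= component
        clusters.append(component)
    return clusters


for cluster in cluster_items(votes):
    print(sorted(cluster))
```

On this toy data the physical powers and the elemental powers fall into separate groups because no voter picked from both sets. Real vote data is much noisier, which is why a dedicated tool like Gephi is a better fit for the full list.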


The Overall Superpower Opinion Graph




The 5 Types of Superheroes


1. The Creationist God: This superhero type is characterized by creation and destruction, Old-Testament Christian God-style. Notable superpowers: the ability to create/destroy worlds, die and come back to life, have gods’ weapons (Thor’s Hammer, Zeus’ Thunderbolt), remove others’ senses, and resurrect the dead.


2. The Time Lord: This superhero type is basically The Doctor from Doctor Who. Notable superpowers: omnipotence, travel to other dimensions, open portals to anywhere, and travel beyond the omniverse.


3. The Elementalist: This superhero type has the ability to manipulate the elements and use them as weapons to their advantage. Notable superpowers: manipulation of water, fire, weather, and plants, the ability to shapeshift, and the ability to shoot ice, lightning, and fire.


4. The Superhuman: This superhero type is humans+, with enhanced human senses and decreased human limitations. Notable superpowers: sense danger, x-ray vision, walk through walls, super speed, mind reading, flight, super strength, and enhanced flexibility.


5. The Zen Master: This superhero type sounds a bit like being permanently on mind-altering psychoactive substances crossed with Gandhi. Notable superpowers: speech empowerment, spiritual enlightenment, and infinite appetite!


-Kate Johnson

by   Ranker
in API Documentation, Developers

Ranker Widget oEmbed Documentation


oEmbed is a format for allowing an embedded representation of a URL on third party sites. The simple API allows a website to display embedded content (such as photos or videos) when a user posts a link to that resource, without having to parse the resource directly.

To find out more information please review the oEmbed specification.


oEmbed endpoint URL

You can use the API endpoint to request the embed code for a list from its ID. The response will return in JSON format.

Request Type HTTP(S) GET
Authorization None
URL //
Response Format JSON


List ID Discovery

Ranker supports discovery of List IDs. Each list page that supports embedding will have one or more links to an embed (or widget) page. These links can be found under ‘embed’ or ‘widget’ links, or via ‘</>’ icons. Each widget page displays the List ID both in the page’s URL and beneath the Embed Code area.


Endpoint Parameters

All parameters are sent via query string parameters and must be urlencoded.

format List format type (string [1]). Defaults to grid.
rows Sizes widget to display this many rows (number [2]). Defaults to 20.
headername Show the list name (boolean). Defaults to True.
headerimage Show the list image (boolean). Defaults to False.
headerusername Show the list author name (boolean). Defaults to False.
headercriteria Show the list criteria (boolean). Defaults to False.
headerbgcolor Specify the widget header background color (hex [3]). Defaults to ffffff.
headerfontface Specify the widget header font-face (string [4]). Defaults to Arial.
headerfontcolor Specify the widget header foreground color (hex [3]). Defaults to 000000.
listfontface Specify the widget body font-face (string [4]). Defaults to Arial.
listfontcolor Specify the widget body foreground color (hex [3]). Defaults to 000000.
listdisplaydescriptions Show list item descriptions (boolean). Defaults to True.
listslidebgcolor Specify the slideshow format background color (hex [3]). Defaults to ffffff.
listdisplaythumbnails Show slideshow thumbnail carousel (boolean). Defaults to False.
listflatbuttons Show vote buttons in a flat style using your listfontcolor (boolean). Defaults to False.
footerbgcolor Specify the widget footer background color (hex [3]). Defaults to 1e3e66.
footerfontcolor Specify the widget footer foreground color (hex [3]). Defaults to ffffff.
footersharing Show the widget footer sharing options [5] (boolean). Defaults to True.

[1] Possible format types are ‘grid’ for poll lists and ‘slideshow’ for photo lists.
[2] Rows setting only applies to grid format widgets.
[3] Hex values should NOT include the #.
[4] Possible font-face values are ‘arial’, ‘helvetica’, ‘verdana’, ‘geneva’, ‘georgia’ or ‘times’ only.
[5] Widget sharing options share YOUR page, not the original embedded list.
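
As a rough illustration of how the parameters above combine into a request, here is a small Python sketch that builds a urlencoded query string. Several things in it are assumptions, not part of this documentation: the endpoint URL is a placeholder (the real one is not shown here), the `id` parameter name for the List ID is a guess, and booleans are assumed to be sent as lowercase `true`/`false`.

```python
from urllib.parse import urlencode

# Placeholder only: substitute the real oEmbed endpoint URL.
OEMBED_ENDPOINT = "https://example.com/oembed"


def build_oembed_url(list_id, **params):
    """Build a request URL from a List ID and optional widget parameters."""
    # "id" is an assumed name for the List ID parameter.
    query = {"id": list_id}
    for name, value in params.items():
        if isinstance(value, bool):
            # Assumed encoding for boolean parameters.
            value = "true" if value else "false"
        elif name.endswith("color"):
            # Hex values should not include the leading '#'.
            value = str(value).lstrip("#")
        query[name] = value
    # urlencode takes care of escaping each value for the query string.
    return OEMBED_ENDPOINT + "?" + urlencode(query)


print(build_oembed_url(517518, format="grid", rows=10,
                       headerbgcolor="#1e3e66", footersharing=False))
```

Stripping the `#` in the helper keeps callers from tripping over the hex rule, since color values copied from CSS usually include it.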


Example endpoint URL



Response Parameters

provider_name Name of the embedded content provider, Ranker.
provider_url URL of the embedded content provider,
author_name List author name.
author_url List author URL.
cache_age How long this embedded content will be cached for.
version Version of this oEmbed response, 1.0.
type Type of embedded content, rich.
title Name of embedded list.
html Embed code content. This is what you will use on your pages.


Example Response

{
  "provider_name": "Ranker",
  "provider_url": "",
  "author_name": "Ranky",
  "author_url": "",
  "cache_age": "100",
  "version": "1.0",
  "type": "rich",
  "title": "The Funniest Seinfeld Quotes",
  "html": "<a class=\"rnkrw-widget\" data-rnkrw-id=\"517518\" data-rnkrw-format=\"grid\" data-rnkrw-rows=\"999\" href=\" quotes/desertrat89\">The Funniest Seinfeld Quotes</a><script id=\"rnkrw-loader\" type=\"text/javascript\" async src=\"//\"></script>"
}

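
Once a response comes back, the only field most pages need is `html`. The sketch below shows one way to pull it out in Python. The sample payload is a shortened, made-up version of the example response (the embed markup is abbreviated), and `extract_embed_html` is a hypothetical helper name, not part of any Ranker library.

```python
import json

# Shortened stand-in for a real response; the html value is abbreviated.
sample_response = json.dumps({
    "provider_name": "Ranker",
    "version": "1.0",
    "type": "rich",
    "title": "The Funniest Seinfeld Quotes",
    "html": '<a class="rnkrw-widget" data-rnkrw-id="517518">'
            "The Funniest Seinfeld Quotes</a>",
})


def extract_embed_html(response_text):
    """Return the embed code from an oEmbed JSON response body."""
    data = json.loads(response_text)
    # The documentation describes the response type as rich.
    if data.get("type") != "rich":
        raise ValueError("unexpected oEmbed type: %r" % data.get("type"))
    return data["html"]


print(extract_embed_html(sample_response))
```
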
by   Ranker
in Opinion Graph

Characteristics of people who are not annoyed by Bill O’Reilly

On today’s The O’Reilly Factor (video below), Bill O’Reilly lamented the fact that he was only #10 on Ranker’s Most Annoying TV Hosts list and decided that he would make it his New Year’s Resolution to become the #1 most annoying person on our list. While I may not share O’Reilly’s politics, I like him as a person, even as he does annoy me from time to time, and would like to help him reach his goals. I enjoy working with the Ranker dataset as it lets me answer very specific questions, like whether people who think the show 24 is overrated are also convinced that George W. Bush was a terrible person—or, in this case, I can study the people who specifically disagree that O’Reilly is annoying, in the hopes that O’Reilly can find these people and work to annoy them more.

Who does O’Reilly need to work harder to annoy? From our opinion graph of 20+ million edges (so named because we can connect not only vague “likes” or “interests,” but specifically whether someone thinks something is best, worst, hot, annoying, overrated, etc.), we have hundreds of specific opinions that characterize people who don’t find O’Reilly annoying. Here are a few chosen findings about these people:

People who are NOT annoyed by O’Reilly tend to…
– find liberals like Jon Stewart, Rachel Maddow, and Bill Maher annoying.
– believe that John Wayne and Humphrey Bogart are among the Best Actors in Film History.
– enjoy movies like The Sound of Music and Toy Story.
– watch America’s Got Talent, Cops, Dirty Jobs, Deadliest Catch, Home Improvement, and Extreme Makeover: Home Edition.
– listen to Lynyrd Skynyrd, Boston, and Elvis.
– enjoy comedians like Bob Hope, Jeff Foxworthy, Joan Rivers, and Billy Crystal.
– be attracted to  Carrie Underwood, Jessica Simpson, Brooklyn Decker, and Sarah Palin.

Thanks to big data, these audiences are all readily targetable online—and if O’Reilly really wants to annoy these people, he might want to study our biggest pet peeves list for ideas (e.g. chewing with his mouth open might work on TV). We hope this list will help O’Reilly with his ambitions for 2015, and please do reach out to us if you need more market research on how to annoy people more.

– Ravi Iyer

by   Ranker
in Data, Opinion Graph

The Opinion Graph Connections between 24, George W. Bush, Jack Bauer, and Rachel Maddow.

As someone whose roots are in political psychology, I’m always interested in seeing how the Ranker dataset shows how our values are reflected in our entertainment choices.  We’ve seen many instances where politicians have cited 24 in the case for or against torture, but are politics reflected in attitudes toward 24 amongst the public?  Using data from users who have voted on multiple Ranker lists, including our lists polling for The Worst Person in History, the Greatest TV Characters of All-Time, the most Overrated TV shows and The Biggest Hollywood Douchebags, the clear answer is yes.

People who think George W. Bush is one of the worst people in history, also tend to think that 24 is one of the most overrated TV shows of all-time.

People who think Bush is a terrible person also think 24 is overrated.

…and people who think Jack Bauer is one of the best TV Characters of All-Time also think that Rachel Maddow is one of Hollywood’s Biggest Douchebags.

People who think Jack Bauer is a great TV character also think Rachel Maddow is a douchebag.

– Ravi Iyer

ps. …and these are just a few of the relationships between 24 and politicians in our opinion graph, which all tell the same basic story.

by   Ranker
in Opinion Graph

The Clear Split Between AMD and Intel CPU Fans

Recently, Tom’s Hardware used the Ranker widget to poll for their Reader’s Choice awards.  Among the topics they polled was the best CPUs, and while I knew that there would likely be a preference for AMD or Intel, the two largest manufacturers, I didn’t realize that the choice would be so stark.  I’m a relative novice compared to most of the people who voted in this poll, so perhaps this would not surprise them, but voting for an AMD CPU made one, on average, 80% less likely to vote for an Intel CPU, and vice versa.  Below is a taxonomy of votes, with items that are voted on similarly closer together, based on a hierarchical cluster analysis of the votes on this list, so you can visualize the split for yourself.
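
To make a figure like “80% less likely” concrete, here is a small sketch of one way such a number can be computed: compare the rate of Intel votes among people who voted for AMD with the rate among people who did not. The ballots below are invented and collapsed to brand level for simplicity (the real poll was on individual CPUs), and the exact calculation behind the figure above may differ.

```python
def relative_likelihood(ballots, given, target):
    """Ratio of P(target | given) to P(target | not given)."""
    with_given = [b for b in ballots if given in b]
    without_given = [b for b in ballots if given not in b]
    # Share of each group that also voted for the target item.
    p_with = sum(target in b for b in with_given) / len(with_given)
    p_without = sum(target in b for b in without_given) / len(without_given)
    return p_with / p_without


# Hypothetical ballots: each set holds the brands one voter upvoted.
ballots = (
    [{"AMD", "Intel"}]      # 1 voter picked both
    + [{"AMD"}] * 9         # 9 picked AMD only
    + [{"Intel"}] * 5       # 5 picked Intel only
    + [{"Other"}] * 5       # 5 picked neither
)

ratio = relative_likelihood(ballots, given="AMD", target="Intel")
print(f"AMD voters are {1 - ratio:.0%} less likely to vote for Intel")
```

The helper assumes both groups are non-empty and that at least one non-AMD voter picked the target; real data would need guards against dividing by zero.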



– Ravi Iyer

by   Ranker
in Data Science, Pop Culture, prediction

Ranker Predicts Spurs to beat Cavaliers for 2015 NBA Championship

The NBA Season starts tonight and building on the proven success of our World Cup and movie box office predictions, as well as the preliminary success of our NFL predictions, Ranker is happy to announce our 2015 NBA Championship Predictions, based upon the aggregated data from basketball fans who have weighed in on our NBA and basketball lists.

Ranker’s 2015 NBA Championship Predictions as Compared to ESPN and FiveThirtyEight

For comparison’s sake, I included the current ESPN power rankings as well as the teams that FiveThirtyEight gives the highest percentage chance of winning the championship.  As with any sporting event, chance will play a large role in the outcome, but the premise of producing our predictions regularly is to validate our belief that the aggregated opinions of many will generally outperform expert opinions (ESPN) or models based on non-opinion data (e.g. player performance data plays a large role in FiveThirtyEight’s predictions).  Our ultimate goal is to prove the utility of crowdsourced data: while something like NBA predictions is a crowded space where many people attempt to answer the same question, Ranker produces the world’s only significant data model for equally important questions, such as determining the world’s best DJs, everyone’s biggest turn-ons, or the best cheeses for a grilled cheese sandwich.

– Ravi Iyer

by   Ranker
in Opinion Graph, Rankings

Characteristics of People who are less Afraid of Ebola

Ebola is everywhere in the news these days, even as Ebola trails other causes of death by wide margins.  Clearly the risks are great, so some amount of fear is certainly justified, but many have taken it to levels that do not make sense scientifically, making back-of-the-envelope projections for its spread based on anecdotal evidence and/or positing that it’s only a matter of time before the virus evolves into an airborne disease, as diseases regularly mutate to enable more killing in movies.  Regardless of whether Ebola warrants fear or outright panic, the consensus is that it is scary, as also evidenced by its clear #1 ranking on Ranker’s Scariest Diseases of All Time list.  Yet, among those who are fearful, I couldn’t help but wonder, what are the characteristics of people who tend to be less afraid than others?  Using the metadata associated with users who voted and reranked this list, in combination with their other activity on the site, here are a few things I found.

– Ebola fear appears to be slightly less prevalent in the Northeast, as compared to other regions of the US, and slightly more prevalent in the South.

– Older people tend to be slightly less afraid of Ebola, often expressing more fear of Alzheimer’s.

– International visitors to this list are half as likely to vote for Ebola, as compared to Americans.

– People who are afraid of Ebola are 4.4x as likely to be afraid of Dengue Fever.

– People who are afraid of Strokes, Parkinson’s Disease, Muscular Dystrophy, Influenza, and/or Depression are about half as likely to believe that Ebola is one of the world’s scariest diseases.

Bear in mind that these results are based on degree of fear and ALL people are afraid of Ebola.  The fear in some groups is simply less pronounced and only the last 3 results are statistically significant based on classical statistical methods.  There are plausible explanations for all of the above, ranging from the fact that conservative areas of the country are likely more responsive to potential threats, to the fact that losing one’s mind over time to Alzheimer’s really may be much scarier for older people versus a quick death, to the fact that people who are afraid of foreign diseases prevalent in tropical areas likely fear other foreign diseases prevalent in tropical areas.
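
For anyone curious what a significance check on this kind of comparison can look like, here is a sketch of a two-proportion z-test, one common classical method for comparing vote rates between two groups. The post does not say which test was actually used, so treat the choice of test as an assumption. The counts are invented for illustration and are not Ranker data.

```python
import math


def two_proportion_z_test(hits_a, n_a, hits_b, n_b):
    """Return (z, two-sided p-value) for a difference in two proportions."""
    p_a = hits_a / n_a
    p_b = hits_b / n_b
    # Pooled rate under the null hypothesis that both groups are the same.
    pooled = (hits_a + hits_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: Ebola votes among 100 US and 100 international voters.
z, p = two_proportion_z_test(hits_a=60, n_a=100, hits_b=30, n_b=100)
print(f"z = {z:.2f}, p = {p:.5f}")
```

A small p-value says a gap this large would be unlikely if the two groups really voted at the same rate. It says nothing about why the gap exists, which is where the explanations above come in.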

To me the most interesting fact is that people who are afraid of more common everyday diseases, including Influenza, which kills thousands every year, appear to be less afraid of Ebola than others.  Human beings are wired to be more afraid of the new and spectacular, as much psychological research has shown.  That fear kept many of our ancestors alive, so I wouldn’t dismiss it as wrong.  But it is interesting to observe that perhaps some of us are less wired in this way than others.

– Ravi Iyer

by   Ranker
in Opinion Graph, Rankings

Ranky Goes to Washington?

Something pretty cool happened last week here at Ranker, and it had nothing to do with the season premiere of the “Big Bang Theory”, which we’re also really excited about. Cincinnati’s number one digital paper used our widget to create a votable list of ideas mentioned in Cincinnati Mayor John Cranley’s first State of the City. As of right now, 1,958 voters cast 5,586 votes on the list of proposals from Mayor Cranley (not surprisingly, “fixing streets” ranks higher than the “German-style beer garden” that’s apparently also an option).

Now, our widget is used by thousands of websites to either take one of our votable lists or create their own and embed it on their site, but this was the very first time Ranker was used to directly poll people on public policy initiatives.

Here’s why we’re loving this idea: we feel confident that Ranker lists are the most fun and reliable way to poll people at scale about a list of items within a specific context. That’s what we’ve been obsessing about for the past 6 years. But we also think this could lead to a whole new way for people to weigh in in fairly  large numbers on complex public policy issues on an ongoing basis, from municipal budgets to foreign policy. That’s because Ranker is very good at getting a large number of people to cast their opinion about complex issues in ways that can’t be achieved at this scale through regular polling methods (nobody’s going to call you at dinner time to ask you to rank 10 or 20 municipal budget items … and what is “dinner time” these days, anyway?).  It may not be a representative sample, but it may be the only sample that matters, given that the average citizen of Cincinnati will have no idea about the details within the Mayor’s speech and likely will give any opinion simply to move a phone survey conversation along about a topic they know little about.

Of course, the democratic process is the best way to get the best sample (there’s little bias when it’s the whole friggin voting population!) to weigh in on public policy as a whole. But elections are very expensive, infrequent, and the focus of their policy debates is the broadest possible relative to their geographical units, meaning that micro-issues like these will often get lost in the same tired partisan debates.

Meanwhile, society, technology, and the economy no longer operate on cycles consistent with election cycles: the rate and breadth of societal change is such that the public policy environment specific to an election quickly becomes obsolete, and new issues quickly need sorting out as they emerge, something our increasingly polarized legislative processes have a hard time doing.

Online polls are an imperfect, but necessary, way to evaluate public policy choices on an ongoing basis. Yes, they are susceptible to bias, but good statistical models can overcome a lot of such bias and in a world where the response rates for telephone polls continue to drop, there simply isn’t an alternative.  All polling is becoming a function of statistical modeling applied to imperfect datasets.  Offline polls are also expensive, and that cost is climbing as rapidly as response rates are dropping. A poll with a sample size of 800 can cost anywhere between $25,000 and $50,000 depending on the type of sample and the response rate.  Social media is, well, very approximate. As we’ve covered elsewhere in this blog, social media sentiment is noisy, biased, and overall very difficult to measure accurately.

In comes Ranker. The cost of that Ranker widget? $0. Its sample size? Nearly 2,000 people, or anywhere from 2 to 4x the average sample size of current political polls. Ranker is also the best way to get people to quickly and efficiently express a meaningful opinion about a complex set of issues, and we have collected thousands of precise opinions about conceptually complex topics like the scariest diseases and the most important life goals by making the act of providing opinions entertaining, within a context that makes simple actions meaningful.

Politics is the art of the possible, and we shouldn’t let the impossibility of perfect survey precision preclude the possibility of using technology to improve civic engagement at scale.  If you are an organization seeking to poll public opinion about a particular set of issues that may work well in a list format, we’d invite you to contact us.

– Ravi Iyer
