Why Topsy/Twitter Data may never predict what matters to the rest of us

Recently Apple paid a reported $200 million for Topsy, and some speculate that the purchase is meant to improve recommendations for products consumed on Apple devices by leveraging the data that Topsy has from Twitter. This makes perfect sense to me, but the utility of Twitter data in predicting what people want is easy to overstate, largely because people often confuse bigger data with better data. There are at least two reasons why there is a fairly hard ceiling on how much Twitter data will ever allow one to predict about what regular people want.

1.  Sampling – Twitter has a ton of data, with daily usage of around 10%. Sample size isn’t the issue here, as there is plenty of data; the issue is that the people who use Twitter are a very specific set of people. Even if you correct for demographics, the psychographic profile of people who want to share their opinions publicly and regularly (far more people have heard of Twitter than actually use it) is too distinctive to generalize to the average person, in the same way that surveys of landline users cannot be used to predict what psychographically distinct cellphone users think.

2. Domain Comprehensiveness – The opinions that people share on Twitter are biased by the medium, such that they do not represent the full spectrum of things many people care about. There are tons of opinions on entertainment, pop culture, and links that people want to promote, since these are easy to share quickly, but very little information on people’s important life goals, the qualities we admire most in a person, or anything else where people’s opinions are likely to be more nuanced. Even where we have opinions in those domains, they are likely to be skewed by the 140-character limit.

Twitter (and by extension, companies that use its data, like Topsy and DataSift) has a treasure trove of information, but people working on next-generation recommendations and semantic search should realize that it is a small part of the overall puzzle given the above limitations. The volume of information gives you a very precise measure of a very specific group of people’s opinions about very specific things, leaving out the vast majority of people’s opinions about the vast majority of things. When you add in the bias introduced by analyzing 140-character natural language, a great deal of the variance in what people want will likely have to be explained by other sources.
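To make the "precise but narrow" point concrete, here is a minimal sketch with entirely hypothetical numbers (the 10% share of public "sharers" and both like-rates are assumptions chosen for illustration, not measurements). It compares the expected error of a huge sample drawn only from sharers with that of a small sample drawn from everyone.

```python
import math

def rmse_of_estimate(sample_rate, true_rate, n):
    """Root-mean-square error of a sample proportion: sqrt(bias^2 + variance)."""
    bias = sample_rate - true_rate
    variance = sample_rate * (1 - sample_rate) / n
    return math.sqrt(bias ** 2 + variance)

# Hypothetical population: all three numbers are illustrative assumptions.
SHARER_SHARE = 0.10   # fraction of people who post opinions publicly
SHARER_LIKES = 0.70   # fraction of sharers who like some product
OTHERS_LIKE = 0.30    # fraction of everyone else who likes it

# What the whole population actually thinks.
true_rate = SHARER_SHARE * SHARER_LIKES + (1 - SHARER_SHARE) * OTHERS_LIKE

# A million data points, but only from sharers.
big_biased = rmse_of_estimate(SHARER_LIKES, true_rate, n=1_000_000)

# A thousand data points, drawn from the whole population.
small_representative = rmse_of_estimate(true_rate, true_rate, n=1_000)

print(f"true rate:                   {true_rate:.2f}")
print(f"error, 1,000,000 sharers:    {big_biased:.3f}")
print(f"error, 1,000 representative: {small_representative:.3f}")
```

Under these assumptions the million-person sample is extremely precise about sharers, yet its error about the general population is dominated by bias, and no amount of extra volume shrinks that.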

At Ranker, we have similar sampling issues, in that we collect much of our data at Ranker.com, but we are actively broadening our reach through our widget program, which now collects data on thousands of partner sites. Our ranked list methodology certainly has bias too, which we attempt to mitigate by combining voting and ranking data. The key is not the volume of data, but rather the diversity of data, which helps mitigate the bias inherent in any particular sampling/data collection method.

Similarly, people using Twitter data would do well to consider issues of data diversity and not be blinded by large numbers of users and data points.  Certainly Twitter is bound to be a part of understanding consumer opinions, but the size of the dataset alone will not guarantee that it will be a central part.  Given these issues, either Twitter will start to diversify the ways that it collects consumer sentiment data or the best semantic search algorithms will eventually use Twitter data as but one narrowly targeted input of many.

– Ravi Iyer

by    in New Features

Twitter Giveaway

You’re a Ranker fan, you love making lists, and now we want to reward you with a treat. So, we’re hosting a simple Twitter promo these next few days to give away two brand new copies of the most anticipated video game of the year, The Elder Scrolls V: Skyrim, to two separate, random Ranker Twitter followers if we can reach 5000 Twitter followers by the release date, 11-11-11.

All you need to do to be entered to win is click the follow button below to follow Ranker. If you are already a follower of Ranker’s Twitter account, you’ve already been entered to win. Thank you from the bottom of our hearts and good luck in the giveaway.


Skyrim is going to be one of the hottest games of the year, with one of the largest worlds to explore. If you’re like us, you think it might even be the best RPG. Another reason we think you might like a copy is that there continues to be a lot of anticipatory chatter about it in the Ranker community.

We are definitely giving away one copy of Skyrim next week, and if we reach our goal of 5000 followers before Skyrim hits stores, we’ll give away a second copy to a different follower, essentially doubling your chances of winning. BONUS! Be sure to let all your gamer friends know what’s going down this week at Ranker ASAP. All it takes is one click to follow the Ranker account to be entered to win a copy of Skyrim. It’s that simple. As of this afternoon we are 84% of the way there and only need a few more followers to join our Ranker Twitter Army to reach our goal.

Let your gaming buddies know, share this story on Facebook and Google Plus, retweet our tweets or send out a message with the following text:  

Calling all my gamer tweeps. Follow @Ranker by Nov 11th to win 1 of 2 copies of The Elder Scrolls V: Skyrim.  

Thank you and good luck.

And now for the entertainment:

by    in Popular Lists

List of the Day: Kim Kardashian's Divorce Explodes the Internet

When professional celebrity Kim Kardashian married sports star Kris Humphries in a two-hour E! TV special, it seemed like a match made in Reality TV Producer heaven. I mean, after all the suffering and heartbreak that poor Kim had gone through, what with a leaked sex tape that happened to slingshot her into stardom and on-again, off-again relationships with multiple football players, the nation wept to see her finally settle down with her hulking Prince Charming.

Now all our hopes and dreams of a happy ending for little Kim have come crashing down. With the announcement today that Kim and Kris’ epic 72-day-old marriage was coming to an end, the internet exploded in response. It is with great remorse that Ranker user Ariel Kana has put together the Funniest Reactions to Kim Kardashian’s Divorce.

Godspeed, Kim. We all hope that your heart heals and you learn to love again. Hopefully in time for next May’s sweeps.

by    in Pop Culture, Popular Lists

Scarlett Johansson's Naked, Leaked List of the Day

If you’ve been living under a rock (in which case, welcome to our above-ground world! We have lots of delicious insects up here, but I would highly recommend our smoked meats), then you haven’t heard that Scarlett Johansson broke one of the cardinal rules of being a famous attractive female celebrity: she, at some point, texted nude pictures of herself to somebody (which, for the sake of all of us, we’re hoping wasn’t Sean Penn). 

And because the internet is a cesspool of humanity that cares about nothing more than “boobs [cleaned up for language] or GTFO”, some crafty hackers are to blame for the sexty pictures, making Scarlett Johansson one of the few female celebrities who no longer has to “GTFO” of the internet.

Is it really “news” that she probably looks amazing naked?

We won’t link you to the pictures, because all you have to do to see them is go to any website ever (except maybe Disney.com). But the internet really reacted to them like a bunch of creeps. Every guy on Twitter is talking about stuff that’s waaay too personal to post, except for today, for some reason. 

So, today’s list of the day has one thing to illustrate/explain:

“… not a man in the world doesn’t want to see Scarlett Johansson naked, but do we really all have to let each other know what we’re ‘doing’ to these new leaked naked photos of her via social networks? This is the most the internet has ever instantly exploded over leaked nude pictures of a famous hot celebrity, and it was weird, awkward and creepy as hell.”

Well said, Robert Wabash. Well said. In honor of the biggest T.M.I. day in internet history…

The Worst, and Creepiest, Internet Reactions to Scarlett Johansson 

Internet, you guys are creeps! Also, we agree. But T.M.I. Seriously.

by    in Popular Lists

ApocaList of the Week

7-17-11

We have been holed up in the Ranker offices on Wilshire in Los Angeles for 3 days now, waiting out Carmageddon, hoping and praying that some other people are left out there…alive. We have little water left, and are subsisting on a thin gruel made of coffee grounds and old “Best of 2008” lists. (Best song: Rihanna and T.I.’s “Live Your Life”? What were you thinking, people of 3 years ago?)

Last night, it got really quiet, and Brian thought he could make a break for it, but he was only about 20 feet out the door when one of the cars got him. He made it back inside and seemed okay, but he’s…different somehow. Changed. This morning, I thought I caught him sipping on motor oil and making “vroom” noises, but it could just be the stress getting to me. I haven’t been sleeping.

We’re going to continue to wait, for as long as we can last. In the meantime, Ranker users outside of LA, who managed to escape this dys-autopian nightmare, have made some lists about other stuff that happened this week. Check ‘em out.

Spotification

This week’s hot new music startup was Spotify, the subscription music service that’s already been a big hit in Europe and has FINALLY landed on American shores after working out deals with all the record labels. Users can stream music to their computers for free (with ads), pay $5 a month to dump the ads or $10 a month to stream music to their mobile devices.

The library is pretty amazing, but before you dive in and start collecting your favorite post-pop-emocore-abilly songs into a playlist, check out these Spotify Tips, Tricks and Hints to make sure you’re, you know, doing it right. Can you imagine if you were sharing that Best Reggae Jams playlist publicly, and accidentally had left some Rocksteady in there? Shock! Horror!

Always good advice.

PS: Still can’t get into Spotify? We also have some thoughts on Turntable.fm, which is open to everyone!

Gluttony: A Celebration

Good news, everyone! New Jersey resident and soon-to-be-national hero Donna Simpson, who currently weighs in at a solid 700 pounds, has announced her intention to gain the additional 300 pounds needed to secure the world record! Plus she’s promised to do most of the actual required eating in front of a webcam, so all of us amateur gluttons can enjoy her achievements vicariously.

If this all sounds vaguely familiar, it’s probably because Homer Simpson (no relation…probably…) hatched a similar scheme back in the ’90s, with somewhat disappointing results.

Still waiting for his special dialing wand

To commemorate Donna’s historic attempt to eat a metric ton of bacon, Ranker user Barbara Gaston threw together this list of Great Historical Gluttons. Hey, she’s sharing a list with Elvis Presley! The King! It’s a compliment!

New Movie Trailers

Tons of new movie trailers debuted this week, in part because a new “Harry Potter” film opened, so they know a lot of people will be in theaters waiting to see if the kids get back to the Shire. (That’s what it’s about, yes?) They’re all on our 2011 Movie Trailer list, including this new spot for Martin Scorsese’s 3D adventure story “Hugo.” LET’S WATCH!

Sacha Baron Cohen’s 3D nude wrestling scene, I predict, will cause some controversy…

Happy Birthday Twitter

Five years ago this week, Twitter (then called Twttr) was introduced to the public. Hard to believe it’s been that long! Before then, if you wanted to know what someone had had for lunch, and if it was delicious, you’d have to actually call them up and ask them! Not that anyone ever did that. Because, really, let’s be honest, who cares? But still…Twitter…woooo!

There are, after all, lots of historic, awesome, funny and important tweets worth remembering. Like that time Ice-T insulted singer Aimee Mann with language we would not dream of repeating on a corporate-type blog.

Aimee Mann can eat a hot bowl of…oh, hey, kids, stay in school!

OK, that provided a few moments of distraction from the horrorscape that is post-Carmageddon Los Angeles. (Thinking we should start calling it “New Los Angeles.” Sounds more post-Carmageddon-y.)

I’ll send word if I can. Hopefully the US government still exists and the military can get some tanks through to us. Also, please, if you see my wife, tell her… oh God… I hear engines revving… I think they’re in the building… I… Oh no…

[End Transmission]

by    in Popular Lists

Roger Ebert's List of the Day

With the untimely death of Jackass star Ryan Dunn came a slew of internet reactions ranging from the devastated to the outright self-righteous and disrespectful. Roger Ebert walks a fine line. The man who said that not liking “Lost in Translation” says more about you than it does about the film and also said that “video games are not art” put out a few reactions to Ryan Dunn’s death via his Twitter account. One of them wasn’t exactly “nice”.

This, of course, led to outrage in the vein of “how could he do this?!” and “what a jerk!*” (*few people were using the word “jerk”). But why? Reacting like this neglects Roger Ebert’s past as a man who says whatever he wants whenever he wants. As one of the greatest minds in film criticism history, the man has a right to his opinion, and keep in mind what he does for a living.

He’s a critic.

So, today’s list of the day shows us all exactly why we should come to expect inflammatory comments from Roger Ebert. He always makes people angry and he always says what he wants. This one’s for you, Roger. 

The 13 Most Inflammatory Roger Ebert Statements