
From Ranker Labs: A Deeper Look at the Worst Movies List

Perhaps you didn’t know Ranker had a whole large laboratory full of scientists in neatly pressed white coats doing crazy, some might even say Willy Wonka-esque experiments. We try to keep that sort of thing fairly under wraps. The government’s been sort of cracking down on evil science ever since that Freeze Ray incident a few years back… you know the one I mean…

A rare glimpse behind the curtain at how Ranker lists are made. Photo by RDECOM.

Anyway, recently, our list technicians have been playing around with CrowdRanked lists. We get a lot of Ranker users giving us their opinion on these lists.

(Ranker’s CrowdRankings invite our community members to all gather together and make lists about one topic. Then everyone else can come in and vote on what they think. When it’s all been going on for a while, and a bunch of people have participated, you get a list that’s a fairly definitive guide to that topic.)

One list that has interested us in particular is this one: The Worst Movies of All Time. Almost 70 people have contributed their own lists of the worst films ever, and thousands of other members of the Ranker community have voted.

And what do we learn from this list? Everyone really, really, really hates “Gigli.” I mean, hates it. That movie is no good at all.

Ben Affleck does his impression of everyone watching more than 5 minutes of ‘Gigli.’

It comes in at #2 right now, with almost 700 votes upholding its general crapitude. The only movie topping it in votes right now is Mariah Carey’s vanity project, “Glitter,” which, to be fair, barely qualifies as “a movie.”

But our scientists – because they are seriously all about science – thought, there must be something more we can do with this data now that we’ve collected it. And wouldn’t you know, they came up with something. They call it “Factor Analysis.” I call it “The thing on my desk I’m supposed to write about after I have a few more cups of coffee.”

So What Is Factor Analysis Anyway?

Here’s how the technicians explained it to me…

We’re going to perform a statistical analysis of the votes we collected on the “Worst Movies Ever” list. (Just the votes, not the lists people made nominating movies.) To do this, we’re going to break up the list of movies into groups based on similarities in people’s voting patterns. (That is, if a lot of people voted for both “Twilight” and “From Justin to Kelly,” we might group them together. If a lot of those same people voted against “Catwoman,” we’d put that in a separate group.)
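To make the grouping idea a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the votes are made up (+1 for a vote up the "worst" list, -1 for a vote down, 0 for no vote), and `pearson` and `correlation_matrix` are hypothetical helper names, not anything from Ranker's actual tools. The point is only that movies whose votes rise and fall together across the same voters get a correlation near 1, and those are the ones a factor analysis tends to bundle together.

```python
from math import sqrt

# Made-up votes: one row per voter, one column per movie.
# +1 = voted it up as a "worst movie", -1 = voted it down, 0 = no vote.
MOVIES = ["Twilight", "From Justin to Kelly", "Catwoman"]
VOTES = [
    [1, 1, -1],
    [1, 1, -1],
    [-1, -1, 1],
    [1, 1, 0],
    [-1, 0, 1],
]

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def column(j):
    """All votes cast on movie number j."""
    return [row[j] for row in VOTES]

def correlation_matrix():
    """Movie-by-movie table of how similarly each pair was voted on."""
    k = len(MOVIES)
    return [[pearson(column(i), column(j)) for j in range(k)] for i in range(k)]

corr = correlation_matrix()
```

In this toy data, "Twilight" and "From Justin to Kelly" correlate strongly with each other and negatively with "Catwoman," which mirrors the example above.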

Sometimes, you’ll be able to look at the grouping and the common thread between those choices will be obvious. Of course the same people hated “Lady in the Water” and “The Last Airbender.” They can’t stand M. Night Shyamalan (or, perhaps more accurately, they can’t stand what he has become). Not exactly a shocking twist there.

The Airbender gains his abilities by harnessing the power of constant downvotes.

But other times, the groupings will not be quite as obvious, and that’s where the analysis can get more intriguing. Once we collect enough data, we’ll be able to make all kinds of weird connections between movies, and maybe figure out a more Unified Theory of Bad Movies than currently exists! (Hey, a blogger can dream…)

When doing this kind of factor analysis, you first must determine the number of groups that exist in your data. We used something called Cattell’s scree test to determine the number of groups. (This is fancy-talk for saying: “We plot everything on a graph like the one below, and look for the elbow – the point where the steep drop-off between factors levels out.”)

The “Eigenvalue” that you see along the Y axis there is a measure of the importance of each factor: roughly, how much of the variation in the votes that factor accounts for. It helps us differentiate significant factors (the “signal”) from insignificant ones (the “noise”).
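For the curious, here is a rough sketch of where those eigenvalues come from, written in Python with NumPy. The data is synthetic (200 pretend voters and 6 pretend movies driven by 2 hidden "tastes"), and `scree_eigenvalues` and `factors_before_biggest_drop` are hypothetical names. The second function is only a crude stand-in for eyeballing the elbow on the plot: it counts the factors that sit above the single biggest drop.

```python
import numpy as np

def scree_eigenvalues(votes):
    """Eigenvalues of the movie-by-movie correlation matrix, largest first."""
    corr = np.corrcoef(votes, rowvar=False)
    return np.linalg.eigvalsh(corr)[::-1]

def factors_before_biggest_drop(eigenvalues):
    """Crude elbow finder: count the factors sitting above the largest gap."""
    drops = eigenvalues[:-1] - eigenvalues[1:]
    return int(np.argmax(drops)) + 1

# Synthetic data: 200 voters, 6 movies driven by 2 hidden "tastes" plus noise.
rng = np.random.default_rng(0)
tastes = rng.normal(size=(200, 2))
pattern = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]], dtype=float)
votes = tastes @ pattern.T + 0.3 * rng.normal(size=(200, 6))

eigenvalues = scree_eigenvalues(votes)
n_factors = factors_before_biggest_drop(eigenvalues)
```

Plotting `eigenvalues` against factor number gives the scree plot. On real, messier data the elbow is often a judgment call, which is part of why the step is usually done by eye.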

Once we decide how many factors we have, it’s time to actually extract them, which means determining which movies “load” on which factors. It sounds precise and mathematical, but there’s some amount of subjectivity that still comes into play. For example, let’s say you were talking about your favorite foods. (Yes, yes, we all love “bacon,” but be serious.)

One way to group them would be on a spectrum from spicy to bland foods. But you could also choose to go from very exotic foods to more ordinary, everyday ones. Or starting with healthy foods and moving into junk food. Each view would be a legitimate way to classify food, so a decision must be made on some level about how to “rotate” the factor solution.

In our case, we chose what’s called the “varimax rotation,” which keeps the factors independent of one another and pushes each movie to load strongly on as few factors as possible, preventing a ton of overlap. This allows us to break up the movies into interesting sub-groups, rather than just having one big list of “bad” films (which is where we started out).
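Here is a sketch of what a varimax rotation does, again in Python with NumPy. It uses a commonly circulated SVD-based iteration for varimax, not necessarily what our stats package runs under the hood, and the loadings are invented: six pretend movies that cleanly belong to two factors, deliberately "tilted" so that every movie appears to load on both.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-6):
    """Rotate a (movies x factors) loading matrix toward simple structure."""
    n_rows, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Direction in which the varimax criterion improves.
        target = rotated ** 3 - rotated @ np.diag((rotated ** 2).sum(axis=0)) / n_rows
        u, singular_values, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt  # an orthogonal matrix, so factors stay uncorrelated
        new_criterion = singular_values.sum()
        if new_criterion < criterion * (1 + tol):
            break  # no meaningful improvement, stop iterating
        criterion = new_criterion
    return loadings @ rotation

# Invented example: six movies that cleanly belong to two factors...
clean = np.array([[0.9, 0.0], [0.6, 0.0], [0.3, 0.0],
                  [0.0, 0.9], [0.0, 0.6], [0.0, 0.3]])
# ...viewed from a tilted angle, so every movie seems to load on both.
angle = np.pi / 6
tilt = np.array([[np.cos(angle), -np.sin(angle)],
                 [np.sin(angle), np.cos(angle)]])
muddled = clean @ tilt
rotated = varimax(muddled)
```

After rotation, each pretend movie loads almost entirely on one factor again. Because the rotation is orthogonal, it only changes the viewing angle: the total amount each movie is explained by the factors stays the same.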

Doing that yields the chart below.

Along the top, you can see the factors that were extracted. The higher the number a film gets for a certain component, the more closely aligned it is with that component. Using these charts, we can then place movies in “Factors,” or categories, with relative ease.
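Placing movies into factors from a chart like that can be sketched in a few lines of Python. The loadings here are invented for illustration and are NOT the numbers from our chart, and `assign_factors` is a hypothetical helper: it simply files each movie under whichever factor it loads on most strongly.

```python
# Invented loadings for illustration only (columns = Factors 1, 2, 3).
LOADINGS = {
    "Monster a Go-Go": [0.81, 0.05, 0.02],
    "Gigli": [0.10, 0.77, 0.04],
    "Glitter": [0.12, 0.64, 0.08],
    "The English Patient": [0.03, 0.09, 0.72],
}

def assign_factors(loadings):
    """Group movies by the factor (1-based) with the largest absolute loading."""
    groups = {}
    for movie, row in loadings.items():
        best = max(range(len(row)), key=lambda i: abs(row[i]))
        groups.setdefault(best + 1, []).append(movie)
    return groups

groups = assign_factors(LOADINGS)
```

This gives the "which movies belong together" part. Naming the groups is still a job for humans.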

Unfortunately, the program can only get us this far – we can see the factors, but we can’t tell why certain items apply to certain factors and not others.

So What Can Factor Analysis Tell Us About the Worst Movies?

First, our lab rats managed to split the entire Worst Movies List (containing 70 total films) into 5 different categories.

Category 1 (we called it “Factor 1”) contained the most movies overall, so whatever the common thread was, we knew that it must be something that people immediately identified with “bad movies.” Some of the titles that most closely correlated with Factor 1 were:

– “Monster a Go-Go”
– “Manos: The Hands of Fate”
– “Crossover”
– “The Final Sacrifice”
– “Zombie Nation”

Check out the full group here on our Complete List of Factor 1 Bad Movies.

We decided that “Classic B-Movie Horror” was the best way to describe this grouping. Of the group, 1965’s “Monster a Go-Go” was the most representative item, and it didn’t really overlap with any of the other groups. The film is a fairly standard horror/sci-fi matinee of the time. An astronaut crashes back to Earth having suffered radiation poisoning, and then goes on a rampage.

So when most Rankers think about what makes a movie “bad,” they tend to think of older, low-budget movies that fail at being scary, and maybe have a sci-fi element as well.

Factor 2 was a bit harder to pin down. Lots more movies seemed to fall into or overlap with this category, but it was a bit tricky to pinpoint what they had in common. Representative Factor 2 movies included:

– “Glitter”
– “SuperBabies: Baby Geniuses 2”
– “From Justin to Kelly”
– “Catwoman”

and the most representative of all for Factor 2 was “Gigli.” (See all the movies relating to Factor 2 here.)

We settled on “Cheesiness” as a good common thread for these movies. (Especially if you continue on down the list: “Battlefield Earth,” “The Room,” “Batman and Robin,” “Superman IV: The Quest for Peace”…yeesh…)

Note here that “Gigli” was the film that most closely correlated to Factor 2 (what we have deemed “cheesy movies”), and “Glitter” was also considered highly cheesy. Yet “Glitter” is the overall most popular “Worst Movie” on the list, when going by straight votes. This seems to indicate that “Gigli” was hated SOLELY because it is cheesy, while “Glitter” commits numerous cinematic crimes, including cheesiness.

Factor 3 had even fewer films that closely correlated, but it was very simple to figure out what they all had in common. Consider the movies that were most representative of Factor 3:

– “The English Patient”
– “The Family Stone”
– “Far and Away”
– “Legends of the Fall”
– “The Fountain”
– “Eyes Wide Shut” (oh come on are you guys kidding it’s freaking Kubrick!)
– “What Dreams May Come”

Check out the full group here on our Complete List of Factor 3 Bad Movies.

Let’s call this the “Self-Important Pretension” group. People who hate movies that are self-consciously “artsy” and “important” REALLY hate those movies, and will pretty much always pick them over other bad movies from other genres. These folks are just outnumbered by the people who think it’s worse to be old-fashioned or cheesy than pompous. (At least, people ON RANKER.)

Factors 4 and 5 are sort of interesting. It’s definitely harder to make a clear-cut distinction between these two groups when you’re just looking at the films. We know they are distinct, because of the voting patterns that created them. But consider the actual movies:

– “Star Wars: Episode I: The Phantom Menace”
– “Transformers: Revenge of the Fallen”
– “Indiana Jones and the Kingdom of the Crystal Skull”
– “Spider-Man”
– “Godzilla” (the 1998 Matthew Broderick version)
– “Star Wars: Episode II: Attack of the Clones”
– “Pearl Harbor”

(Here’s the complete Factor 4 list.)

Biggest disappointments? That was our first thought. But then check out Factor 5:

– “Forrest Gump”
– “Indiana Jones and the Temple of Doom”
– “Million Dollar Baby”
– “Avatar”
– “Quantum of Solace”

(All the Factor 5 movies are listed here.)

Certainly, if you didn’t like Best Picture winners “Forrest Gump” and “Million Dollar Baby,” or Best Picture nominee “Avatar,” you considered them disappointments, right? “Quantum of Solace” was the lukewarm follow-up to “Casino Royale,” one of the best Bond films of all time. And “Temple of Doom” is the sequel to arguably the best adventure movie ever made, “Raiders of the Lost Ark.”

So how come the movies in Factor 4 closely correlated with one another, and the movies in Factor 5 closely correlated with one another, if they’re BOTH groups of disappointing films? Maybe they disappointed different people, or they disappointed people in different ways?

One theory: Factor 4 films are entries in above-average franchises that are considered not as good as the other films. (This doesn’t quite apply to “Pearl Harbor,” unless you consider Michael Bay Movies to be a franchise. As I do.) The people who agreed on voting for these films felt that the worst thing a movie can do is disappoint fans of other, similar movies.

For example, movies starring Ben Affleck…

This would make Factor 5 the “overhyped” category. Everyone’s “supposed” to love “Million Dollar Baby” and “Avatar” and “Forrest Gump.” And the people who don’t like them feel a curmudgeonly sense of kinship around some of these titles. (One would expect “The English Patient,” then, to fall into this factor. Unfortunately for our theory, it’s most closely aligned with Factor 3, the “Self-Important Pretension” category.)

More theories as to the strange circumstances of Factor 4 and 5 are certainly welcome. We just thought it was kind of an intriguing puzzle.

There were 3 movies that seemed to coalesce into a “Factor 6,” but we didn’t have enough data, and too few films correlated with it, to create a true category in any meaningful sense. So it may forever elude us what “Waterworld,” “The Postman” and “Road House” have in common. Aside from kicking ass, amiright? R-r-right?

Movies That Scored High in Multiple Factors

Some movies didn’t closely align with any single group, but nonetheless scored high for numerous different factors. For example, take “Masters of the Universe,” the ill-fated live-action ’80s adaptation of the He-Man line of toys. It was somewhat aligned with Factor 1 – the “Classic B-Movie Horror” group – as well as Factor 3 – the “Self-Important Pretension” group. Now that is just weird. I mean, yes, He-Man is kind of a blowhard, with all that “I Have the Power!” stuff. But I don’t really think of it as terribly similar to “The English Patient” when all is said and done.

Also, consider “Lady in the Water.” It aligns fairly closely with Factors 1, 2 AND 3, and even makes a showing in Factor 4. This is a movie upon which haters of every kind of movie can agree.
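Spotting these multi-factor movies can be sketched as a simple threshold check in Python. The loadings are invented (they are not our real numbers), `cross_loaders` is a hypothetical helper, and the 0.3 cutoff is an assumption: a common rule of thumb rather than a setting taken from our analysis.

```python
# Invented loadings (columns = Factors 1 through 5); not Ranker's real numbers.
LOADINGS = {
    "Masters of the Universe": [0.41, 0.10, 0.38, 0.02, 0.05],
    "Lady in the Water": [0.35, 0.33, 0.31, 0.22, 0.04],
    "Monster a Go-Go": [0.81, 0.05, 0.02, 0.01, 0.03],
}

def cross_loaders(loadings, threshold=0.3):
    """Return movies that load at or above the threshold on two or more factors."""
    result = {}
    for movie, row in loadings.items():
        strong = [i + 1 for i, value in enumerate(row) if abs(value) >= threshold]
        if len(strong) >= 2:
            result[movie] = strong
    return result

multi_factor_movies = cross_loaders(LOADINGS)
```

Lowering the threshold would pull in weaker showings too, like the pretend Factor 4 loading for "Lady in the Water" in this toy table.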

A Look at Things to Come

So, that’s how we’ve gotten started with using Factor Analysis on some of our CrowdRanked lists. Isn’t it very very very interesting, such that you’d like to tell all of your friends about what you’ve just read? If only there were some kind of digital environment where people could socially interact and share hypertextual links to information that they enjoy with their friends…

Be sure to check out the next edition of Ranker Labs, coming in a few weeks, when we’ll apply some Factor Analysis to ANOTHER one of our big CrowdRanked lists – History’s Worst People.