Ranker Uses Big Data to Rank the World’s 25 Best Film Schools

NYU, USC, UCLA, Yale, Juilliard, Columbia, and Harvard top the rankings.

Does USC or NYU have a better film school?  “Big data” can provide an answer by linking data about movies, and the actors, directors, and producers who worked on them, to data about universities and their graduates.  Semantic data from sources like Freebase, DBpedia, and IMDb can show which schools have produced the most working graduates.  But what if you care about the quality of the movies they worked on rather than just the quantity?  Educating a student who went on to work on The Godfather must certainly be worth more than educating one who received a credit on Gigli.

Leveraging opinion data from Ranker’s Best Movies of All-Time list in addition to widely available semantic data, Ranker recently produced a ranked list of the world’s 25 best film schools, based on graduates’ credits on the top 500 movies of all time.  USC produces the most film credits by graduates overall, but when film quality is taken into account, NYU graduates have more credits on the top 500 movies (208) than USC graduates do (186).  UCLA, Yale, Juilliard, Columbia, and Harvard take places 3 through 7 on Ranker’s list.  Several professional schools that focus on the arts also place in the top 25 (e.g. London’s Royal Academy of Dramatic Art), as do some well-located high schools (New York’s Fiorello H. LaGuardia High School and Beverly Hills High School).
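The counting step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Ranker's actual pipeline: the function name `rank_schools`, the data shapes, and the toy people, films, and schools are all hypothetical placeholders standing in for the semantic data (who attended which school, who is credited on which movie) and the opinion data (which movies make the top list).

```python
from collections import Counter


def rank_schools(alumni, credits, top_movies):
    """Rank schools by how many credits their graduates hold on top movies.

    alumni:     dict mapping a person to the schools they attended
    credits:    iterable of (person, movie) pairs, one per film credit
    top_movies: iterable of movie titles that count as "top" films
    (All three shapes are assumptions made for this sketch.)
    """
    top = set(top_movies)
    counts = Counter()
    for person, movie in credits:
        if movie not in top:
            continue  # credits on films outside the top list are ignored
        for school in alumni.get(person, ()):
            counts[school] += 1
    # Highest count first; ties are broken alphabetically for a stable order.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy data, entirely made up for illustration.
alumni = {
    "person_1": ["School A"],
    "person_2": ["School B"],
    "person_3": ["School A", "School B"],
}
credits = [
    ("person_1", "Film X"),
    ("person_1", "Film Y"),
    ("person_2", "Film X"),
    ("person_2", "Film Z"),  # Film Z is not a top film, so it is skipped
    ("person_3", "Film Y"),
]
top_movies = ["Film X", "Film Y"]

ranking = rank_schools(alumni, credits, top_movies)
print(ranking)
```

Passing every movie as `top_movies` gives the quantity-only ranking, while passing only a top list gives the quality-weighted one, which is the distinction the USC and NYU comparison rests on.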

The World’s Top 25 Film Schools

  1. New York University (208 credits)
  2. University of Southern California (186 credits)
  3. University of California – Los Angeles (165 credits)
  4. Yale University (110 credits)
  5. Juilliard School (106 credits)
  6. Columbia University (100 credits)
  7. Harvard University (90 credits)
  8. Royal Academy of Dramatic Art (86 credits)
  9. Fiorello H. LaGuardia High School of Music & Art (64 credits)
  10. American Academy of Dramatic Arts (51 credits)
  11. London Academy of Music and Dramatic Art (51 credits)
  12. Stanford University (50 credits)
  13. HB Studio (49 credits)
  14. Northwestern University (47 credits)
  15. The Actors Studio (44 credits)
  16. Brown University (43 credits)
  17. University of Texas – Austin (40 credits)
  18. Central School of Speech and Drama (39 credits)
  19. Cornell University (39 credits)
  20. Guildhall School of Music and Drama (38 credits)
  21. University of California – Berkeley (38 credits)
  22. California Institute of the Arts (38 credits)
  23. University of Michigan (37 credits)
  24. Beverly Hills High School (36 credits)
  25. Boston University (35 credits)

“Clearly, there is a huge effect of geography, as prominent New York- and Los Angeles-based high schools appear to produce more graduates who work on quality films compared to many colleges and universities,” says Ravi Iyer, Ranker’s Principal Data Scientist, a graduate of the University of Southern California.

Ranker is able to combine factual semantic data with an opinion layer because Ranker is powered by a Virtuoso triple store with over 700 million triples of information, which are processed into an entertaining list format for users on Ranker’s consumer-facing website, Ranker.com.  Each month, over 7 million unique users interact with this data – ranking, listing, and voting on various objects – effectively adding a layer of opinion data on top of the factual data from Ranker’s triple store.  The result is a continually growing opinion graph that connects factual and opinion data.  As of January 2013, Ranker’s opinion graph included over 30,000 nodes with over 5 million edges connecting these nodes.

– Ravi Iyer


Validating Ranker’s Aggregated Data vs. a Gallup Poll of Best Colleges

We were talking to someone in the market research field about the credibility of Ranker’s aggregated rankings, and they were intrigued and suggested that we validate our data by comparing the aggregated results of one of our lists to the results achieved by a traditional research company using traditional market research methodologies.  Companies like Gallup often do not ask the same types of questions that we ask at Ranker, in part due to the inherent difficulties of open-ended polling via random digit dialing.  You can’t realistically call someone up at dinner time and ask them to list their 50 favorite TV shows.  You could ask them to name one favorite, but doing that, you can end up with headlines like “Americans admire Glenn Beck more than they admire the Pope.”  However, one question that both Gallup and Ranker have asked concerns the nation’s top colleges/universities.  How do Ranker’s results compare to Gallup’s data?  Below are our results, side by side.

Ranker vs Gallup Best US Colleges

From a market researcher’s perspective, this is good news for Ranker data.  Our algorithms have successfully replicated the top 4 results from the Gallup poll exactly, at a fraction of the cost.  This likely occurs because Ranker data is largely collected from users who find our website via organic search, so while our data is not a representative probability sample (assuming such a thing still exists in a world where people screen their calls on cellphones), our users tend to be more representative than the motivated Yelp user or the intellectual Quora user.  If you compare Ranker’s best movies list to Rotten Tomatoes’ aggregated opinion list (Toy Story 2 and Man on Wire are #1 & #2!?!?), you get a sense of the importance of having relatively representative data.

In addition, the fact that our lists are derived from a combination of methodologies (listing, reranking, and voting) means that the errors associated with each method somewhat cancel out.  Indeed, one might argue that Ranker’s top dream colleges list is better than Gallup’s for precisely this reason, as individuals are often tempted to list their alma mater or their local school as the best college, and the long tail of answers might actually contain more pertinent information.  Aggregating ranked lists from motivated users and combining that data with casual voters might actually be the best way to answer a question like this.
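To make the idea of combining methodologies concrete, here is a minimal Python sketch. It is not Ranker's algorithm, which the post does not describe. As an assumption for illustration, it scores ranked lists with a Borda-style count, scores votes by the share of up-votes, and blends the two with an arbitrary `weight`. All function names, college names, and numbers are hypothetical.

```python
from collections import defaultdict


def normalize(scores):
    """Scale scores so the best item gets 1.0 (assumed convention)."""
    top = max(scores.values(), default=0)
    return {k: v / top for k, v in scores.items()} if top else dict(scores)


def borda_scores(ranked_lists):
    """Borda-style points: position i in a list of length n earns n - i."""
    points = defaultdict(float)
    for ranked in ranked_lists:
        n = len(ranked)
        for i, item in enumerate(ranked):
            points[item] += n - i
    return normalize(points)


def vote_scores(votes):
    """Share of up-votes per item; votes maps item -> (up, down)."""
    return {
        item: up / (up + down)
        for item, (up, down) in votes.items()
        if up + down > 0
    }


def combine(list_scores, voter_scores, weight=0.5):
    """Blend both signals and return items ordered best first."""
    items = set(list_scores) | set(voter_scores)
    combined = {
        item: weight * list_scores.get(item, 0.0)
        + (1 - weight) * voter_scores.get(item, 0.0)
        for item in items
    }
    return sorted(combined, key=lambda item: (-combined[item], item))


# Toy data, made up for illustration: two user-submitted ranked lists
# and casual up/down votes on the same items.
ranked_lists = [
    ["College A", "College B", "College C"],
    ["College B", "College A"],
]
votes = {
    "College A": (30, 10),
    "College B": (45, 5),
    "College C": (5, 15),
}

order = combine(borda_scores(ranked_lists), vote_scores(votes))
print(order)
```

In this toy data the two ranked lists leave College A and College B tied, and the casual votes break the tie. That is one small way in which a second method can compensate for the limits of the first.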

– Ravi Iyer