The best kittens, technology, and video games blog in the world.

Saturday, July 04, 2009

Plan 9 from Outer Forks

Fatty watching himself on TV by cloudzilla from flickr (CC-BY)

I love IMDB. For one, it might be one of the very few non-personalized recommendation systems that actually work. Scores on IMDB correlate very highly with likelihood of me enjoying a movie, especially if I apply a correction for genres that are overrated (like very old and very long movies) or underrated (like zombie flicks). In any case IMDB is vastly more accurate than professional critics' reviews.

The most drastic example of IMDB and pro critics disagreeing was Transformers: Revenge of the Fallen, which I absolutely loved! It got an okish 6.5 score from IMDB - which is accurate enough: it's a really nice action-packed movie vaguely following Transformers canon, with as much concern for the plot as a typical action movie (that is, not terribly much), and perhaps a far higher aircraft-carrier-to-robot ratio than you'd expect from the title.

It was also universally panned by the pro critics, getting a Rotten Tomatoes score of just 20%. Seriously, critics? Is it really one of the worst movies in existence? Did you expect anything other than robots meaninglessly fighting each other and destroying valuable stuff? It's almost as if critics panned Transformers 2 to signal their sophisticated taste, not to provide good service to the audience.

Anyway, back to my point. IMDB not only provides a great recommendation service, it also makes a lot of the underlying data available in convenient formats. Yes, they charge a lot of money for some of the data, but even the free portion is very useful.

Some voting patterns are very interesting. Here are two examples. First, one of the "so bad it's good" movies - Troll 2. Normal movies have a sort-of Gaussian distribution of votes. OK, not really Gaussian, but with a fairly definite single peak. Not so with the "so bad it's good" movies - these tend to have mostly 1s and 10s, with perhaps a few 2s and 3s, but amusingly very rarely many 9s and 8s - it's either very low, or a 10!

The second interesting bit is a demographically polarizing movie like Twilight. People of different ages and genders tend to like pretty much the same movies.

Top 100 for men, and Top 100 for women contain virtually the same movies, just slightly rearranged - neither stereotypically feminine chick flicks nor stereotypically masculine action flicks get into top lists, and movies rarely have very different male and female scores, or much different scores by age group. But exceptions do happen, here's one:

Under-18 men scored it almost 2 points lower than under-18 women! I have a perfectly good explanation for the Twilight ratings effect - normally movies are watched by people who like the genre, so only men who like chick flicks and only women who like action movies watch them. So there will be very little discernible gender-genre bias. But with Twilight, millions of girlfriends worldwide must have forced their boyfriends to watch Twilight against their will, to which the aforementioned boyfriends reacted in a passive-aggressive way by downvoting Twilight on IMDB. I don't have any way of testing this theory, but I watched Twilight of my own free will, and liked it a lot.

Anyway, back to "so bad it's good" movies. I really like the genre, but it's difficult to tell the "so bad it's good" movies from straightforwardly bad movies. So, like the good hacker I am, I decided to grab IMDB's database and find such movies. Here's the list, the criteria being: at least 10% of votes are 1s, at least 10% of votes are 10s, ordered by number of votes.

| # | Votes | 10s % | 1s % | IMDB score | Title |
|---|-------|-------|------|------------|-------|
| 1 | 77257 | 12.6% | 11.0% | 6.2 | The Blair Witch Project |
| 3 | 68560 | 30.4% | 10.5% | 7.6 | Fahrenheit 9/11 |
| 4 | 38459 | 20.0% | 19.2% | 5.5 | Sex and the City |
| 6 | 24265 | 31.7% | 23.0% | 5.1 | High School Musical |
| 7 | 24044 | 10.6% | 10.3% | 5.1 | The Pink Panther |
| 8 | 22873 | 14.9% | 12.1% | 5.6 | House of 1000 Corpses |
| 9 | 22324 | 18.6% | 27.5% | 3.9 | Freddy Got Fingered |
| 10 | 21704 | 15.3% | 12.4% | 5.0 | White Chicks |
| 11 | 19898 | 14.5% | 10.7% | 7.6 | Midnight Express |
| 14 | 18410 | 11.6% | 10.3% | 6.3 | Funny Games U.S. |
| 15 | 17779 | 12.2% | 10.0% | 4.7 | Queen of the Damned |
| 16 | 17606 | 10.5% | 27.9% | 3.8 | In the Name of the King: A Dungeon Siege Tale |
| 17 | 17103 | 25.3% | 10.4% | 5.7 | Kung Pow: Enter the Fist |
| 18 | 16840 | 13.6% | 54.9% | 2.5 | You Got Served |
| 19 | 16743 | 14.7% | 35.0% | 2.8 | Spice World |
| 20 | 16706 | 10.5% | 74.0% | 1.6 | From Justin to Kelly |
| 21 | 16596 | 17.2% | 10.8% | 5.7 | Southland Tales |
| 22 | 16338 | 16.2% | 31.9% | 3.6 | Plan 9 from Outer Space |
| 23 | 16191 | 19.1% | 11.8% | 5.5 | Step Up 2: The Streets |
| 24 | 16179 | 15.1% | 40.5% | 3.8 | Get Rich or Die Tryin' |
| 25 | 15728 | 22.4% | 44.0% | 3.7 | High School Musical 3: Senior Year |
| 26 | 15417 | 12.5% | 38.1% | 3.4 | Little Man |
| 27 | 15177 | 15.0% | 11.0% | 5.6 | Alvin and the Chipmunks |
| 28 | 14938 | 13.5% | 16.0% | 3.7 | Super Mario Bros. |
| 29 | 14431 | 10.8% | 14.6% | 5.4 | Star Wars: The Clone Wars |
| 30 | 13960 | 11.0% | 10.4% | 4.9 | DOA: Dead or Alive |
| 32 | 13643 | 23.6% | 27.6% | 4.5 | High School Musical 2 |
| 33 | 13377 | 35.3% | 61.5% | 1.3 | Jonas Brothers: The 3D Concert Experience |
| 34 | 12764 | 11.2% | 43.2% | 3.3 | Dragonball Evolution |
| 36 | 12596 | 18.3% | 26.7% | 4.2 | Stomp the Yard |
| 37 | 12560 | 17.8% | 11.3% | 6.1 | Cannibal Holocaust |
| 40 | 12168 | 10.7% | 12.7% | 4.9 | An American Haunting |
| 41 | 12006 | 13.0% | 11.9% | 4.4 | Stay Alive |
| 42 | 11770 | 14.0% | 18.4% | 3.5 | Grease 2 |
| 44 | 11554 | 29.3% | 10.4% | 8.3 | Le salaire de la peur |
| 46 | 11257 | 12.2% | 10.4% | 5.0 | Just My Luck |
| 47 | 11150 | 10.6% | 24.7% | 3.7 | Prom Night |
| 48 | 11054 | 11.4% | 18.9% | 4.5 | The Marine |
| 50 | 10742 | 10.9% | 77.7% | 1.7 | Who's Your Caddy? |
| 51 | 10589 | 11.7% | 16.7% | 4.7 | Meet Dave |
| 52 | 10397 | 19.2% | 11.9% | 6.1 | Salò o le 120 giornate di Sodoma |
| 54 | 9774 | 12.4% | 20.6% | 4.0 | Big Momma's House 2 |
| 55 | 9255 | 45.0% | 10.2% | 8.1 | La passion de Jeanne d'Arc |
| 56 | 9234 | 15.3% | 10.9% | 5.8 | Georgia Rule |
| 57 | 9145 | 11.1% | 71.9% | 1.9 | Troll 2 |
| 58 | 9121 | 13.1% | 11.0% | 5.1 | Strange Wilderness |
| 59 | 9085 | 12.2% | 16.9% | 4.3 | Black Christmas |
| 60 | 8906 | 12.0% | 13.0% | 5.3 | The Ten |
| 61 | 8888 | 17.9% | 15.8% | 4.7 | The Lizzie McGuire Movie |
| 62 | 8840 | 12.0% | 19.9% | 3.3 | Universal Soldier: The Return |
| 63 | 8678 | 18.4% | 45.3% | 2.9 | Material Girls |
| 64 | 8654 | 18.5% | 22.8% | 3.8 | Pokémon: The First Movie |
| 66 | 8419 | 10.1% | 17.5% | 3.8 | On Deadly Ground |
| 67 | 8378 | 13.3% | 13.4% | 5.7 | Last Days |
| 68 | 8210 | 12.7% | 76.4% | 1.4 | SuperBabies: Baby Geniuses 2 |
| 69 | 8179 | 20.1% | 58.3% | 2.9 | Hannah Montana: The Movie |
| 70 | 8103 | 12.0% | 52.7% | 3.1 | Beverly Hills Chihuahua |
| 71 | 8045 | 12.2% | 29.0% | 3.6 | Soul Plane |
| 73 | 8013 | 41.1% | 11.9% | 8.1 | Idi i smotri |
| 74 | 7898 | 14.6% | 32.4% | 2.7 | Hercules in New York |
| 75 | 7854 | 23.1% | 11.6% | 6.4 | 17 Again |
| 76 | 7804 | 20.6% | 19.5% | 5.2 | What the #$*! Do We (K)now!? |
| 77 | 7726 | 26.4% | 10.9% | 7.1 | The Birth of a Nation |
| 79 | 7684 | 10.7% | 23.7% | 3.3 | Problem Child 2 |
| 80 | 7643 | 25.6% | 15.0% | 5.1 | Raise Your Voice |
| 81 | 7605 | 12.6% | 11.2% | 5.0 | Eight Crazy Nights |
| 82 | 7588 | 25.4% | 12.4% | 8.1 | Du rififi chez les hommes |
| 83 | 7563 | 11.4% | 51.8% | 2.7 | Hannah Montana/Miley Cyrus: Best of Both Worlds Concert Tour |
| 84 | 7499 | 12.6% | 20.3% | 4.0 | Van Wilder 2: The Rise of Taj |
| 86 | 7237 | 12.7% | 55.4% | 2.7 | Larry the Cable Guy: Health Inspector |
| 88 | 7095 | 31.4% | 10.4% | 4.9 | Good Burger |
| 91 | 7043 | 44.5% | 10.2% | 8.0 | Les enfants du paradis |
| 92 | 6930 | 23.9% | 18.3% | 3.9 | Mighty Morphin Power Rangers: The Movie |
| 93 | 6925 | 23.2% | 13.1% | 5.8 | Pink Flamingos |
| 94 | 6915 | 45.5% | 16.8% | 5.8 | Kurtlar vadisi - Irak |
| 95 | 6766 | 13.5% | 12.9% | 4.4 | Yours, Mine and Ours |
| 97 | 6714 | 12.9% | 40.7% | 3.1 | Are We Done Yet? |
| 98 | 6687 | 20.9% | 57.1% | 3.7 | Expelled: No Intelligence Allowed |
| 99 | 6586 | 27.9% | 11.6% | 8.1 | Tengoku to jigoku |
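
For anyone who wants to play with the same idea, the selection criteria can be sketched in a few lines of Python. This is only an illustration, not the script that produced the list: the Movie record, the share and polarized helpers, and the sample vote counts are all made up, and it assumes the per-score vote counts have already been pulled out of IMDB's data into memory.

```python
from typing import List, NamedTuple


class Movie(NamedTuple):
    """Hypothetical record: one title plus its vote histogram."""
    title: str
    votes_by_score: List[int]  # index 0 = number of 1s, ..., index 9 = number of 10s


def share(movie: Movie, score: int) -> float:
    """Fraction of all votes that gave exactly `score` (1 to 10)."""
    total = sum(movie.votes_by_score)
    return movie.votes_by_score[score - 1] / total if total else 0.0


def polarized(movies: List[Movie], threshold: float = 0.10) -> List[Movie]:
    """Keep movies with at least `threshold` 1s AND at least `threshold` 10s,
    ordered by total number of votes, most voted first."""
    hits = [
        m for m in movies
        if share(m, 1) >= threshold and share(m, 10) >= threshold
    ]
    return sorted(hits, key=lambda m: sum(m.votes_by_score), reverse=True)


# Invented sample data, not real IMDB numbers.
SAMPLE = [
    Movie("Polarizing C", [50, 10, 10, 10, 10, 10, 10, 10, 10, 70]),
    Movie("Ordinary B", [10, 20, 50, 100, 200, 300, 200, 80, 30, 10]),
    Movie("Polarizing A", [300, 20, 20, 20, 20, 20, 20, 20, 20, 540]),
]

print([m.title for m in polarized(SAMPLE)])
# prints ['Polarizing A', 'Polarizing C']
```

The single-peaked "Ordinary B" drops out because only 1% of its votes are 1s, while both two-peaked titles survive and come out ordered by vote count.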

Trivia from the list:
  • Transformers 2 almost got to the list by having 9.9% 1 votes.
  • The most controversial movie in every way is Jonas Brothers: The 3D Concert Experience with 35.3% 10s, 61.5% 1s, and only 3.2% everything else. It also has a huge 5.0 to 1.3 under-18 gender gap, but as you can see even most teenage females aren't huge fans of it. It's also the worst scored movie on the list, with a score of just 1.3, which suggests IMDB's filters treat most of the 10s as attempts at ballot stuffing and throw them away.
  • The highest rated controversial movie is Le salaire de la peur with IMDB score of 8.3 and somehow still more 1s (10.4%) than Transformers 2.
  • The highest rated controversial movies that seem popular (by number of votes they received) are Fahrenheit 9/11 and Twilight, both of which I liked, and not because of the "so bad it's good" effect, and The Blair Witch Project, which I hated with a passion for being so unbelievably boring.
  • The lowest scored movie from the list I watched was Troll 2 and I liked it because of the "so bad it's good" effect.
  • Celebrity movies are huge on the list, like Crossroads, Glitter, Spice World, Hannah Montana: The Movie, and the aforementioned Jonas Brothers movie. The only one of them I watched was Crossroads, and I genuinely enjoyed it, in a semi-ironic way.
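
A quick sanity check on the Jonas Brothers numbers from the trivia: a plain average of the votes can be bounded even without the full breakdown. The sketch below assumes the remaining 3.2% of votes fall somewhere between 2 and 9 (their exact split isn't given here), and mean_bounds is just a hypothetical helper name.

```python
def mean_bounds(share_ones: float, share_tens: float):
    """Lowest and highest possible plain average when the leftover votes
    are all 2s (lowest case) or all 9s (highest case)."""
    rest = 1.0 - share_ones - share_tens
    base = 1 * share_ones + 10 * share_tens
    return base + 2 * rest, base + 9 * rest


# 61.5% 1s and 35.3% 10s, as listed for the Jonas Brothers movie.
low, high = mean_bounds(0.615, 0.353)
print(f"{low:.2f} to {high:.2f}")
# prints 4.21 to 4.43
```

So an unweighted average would land somewhere around 4.2 to 4.4, far above the published 1.3, which fits the guess that a large share of the 10s gets discounted by whatever weighting IMDB applies.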


Divided Mind said...

Have you tried Criticker? It totally works for me. Took it some time (like two hundred movies rated) to calibrate to my taste, but now it's rarely off by more than 5 percentage points when calculating PSIs (probable score indicators). Oh, and it's got IMDb ratings imported there as a virtual user, so you might find out just how much you really agree with IMDb.

taw said...

Criticker is strangely calibrated - 90 is "Alright", and 50 is "Terrible". Isn't 50 supposed to mean "Average"?

Divided Mind said...

Actually it analyses your scores and spans deciles along your scale, therefore assigning them to 10 tiers. That way it works a charm however you rate your movies, eg. you can use /10, /100 or whatever scale, you can consistently overrate movies, etc. Takes some getting used to, but makes sense if you think about it.

Divided Mind said...

Of course it uses the normalized scale when calculating likeness &c. (converting back when presenting you with a PSI). Wouldn't make much sense otherwise, given peculiarities of people's rating habits, now would it?

Divided Mind said...

BTW, rotten tomatoes are also there. And, surprisingly, I have higher compatibility with tomatoes than with imdb. Still pretty low, but higher.

taw said...

Well, it's full of fail as I mostly remember the good movies, not the shitty ones, so if movie falls in the middle of my scale it's probably very good, not average.

Fortunately it was kind enough to give me a few shitty movie suggestions, so it's not that bad.

Divided Mind said...

Oh, and btw, if you consistently only watch good flicks you can remap tier names and colours ;)

taw said...

Isn't that the point of movie recommendation services? Watching mostly good (or so-bad-they're-good) films ;-)?