Sunday, September 2, 2007

Netflix Prize readings.

A pretty interesting set of blog articles from people participating in the Netflix Prize:
  • The Netflix Prize: 300 Days Later - Is it even possible to reach a root-mean-squared error (RMSE) of 0.85-ish on the dataset they provided? How much better for the customers is that than just taking the average rating for the movie? Not too much, and we may need a lot more information about the users than their simple rating history to make further improvements. Should we be investing so much hope in recommender systems? Perhaps not, and it could even be dangerous.
  • Netflix Update: Try This at Home - Some simple approaches to the dataset that describe each movie and each user by scores on an arbitrary number of attributes. Like action movies and hate chick flicks? You'll probably like movies that are very action-y and very much not chick-flick-y.
  • Netflix Prize Results and Source Code - An account of the author's efforts in the prize, with lots of references to interesting subjects at the bottom. He brings up the idea of clustering the movies as a potential boost to the recommendations, which brings it in line with the Netflix Update article above. Seems reasonable to me: if you like romantic comedies a lot, you'll probably rate a given romantic comedy pretty highly, provided that other people who like romantic comedies rate it highly as well. Yay, I've invented Naive Bayes!
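
To make two of the ideas above concrete for myself, here is a rough Python sketch: the "just take the average rating for the movie" baseline scored by RMSE, and the attribute idea as a plain dot product. The ratings, the two attributes, and every function name here are made up for illustration; none of this comes from the linked articles or from the real Netflix data.

```python
import math
from collections import defaultdict

# Toy (user, movie, rating) triples, invented for illustration only.
ratings = [
    ("alice", "Die Hard", 5), ("bob", "Die Hard", 3),
    ("alice", "Notting Hill", 1), ("bob", "Notting Hill", 3),
]

def rmse(predicted, actual):
    """Root-mean-squared error between two equal-length sequences."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))

def movie_means(triples):
    """Average rating per movie: the 'just take the average' baseline."""
    totals = defaultdict(lambda: [0.0, 0])  # movie -> [sum, count]
    for _, movie, rating in triples:
        totals[movie][0] += rating
        totals[movie][1] += 1
    return {movie: s / n for movie, (s, n) in totals.items()}

means = movie_means(ratings)
# Scored on the same ratings the means came from, purely to keep this short;
# a real evaluation would use held-out ratings.
baseline_rmse = rmse([means[m] for _, m, _ in ratings],
                     [r for _, _, r in ratings])

def attribute_score(user_tastes, movie_traits):
    """Dot product of a user's tastes with a movie's attribute scores."""
    return sum(u * t for u, t in zip(user_tastes, movie_traits))

# Hypothetical attributes: (action-y, chick-flick-y).
action_fan = [1.0, -1.0]
action_movie = [0.9, 0.1]
romcom = [0.1, 0.9]
print(baseline_rmse)
print(attribute_score(action_fan, action_movie), attribute_score(action_fan, romcom))
```

The action fan scores the action movie above zero and the romantic comedy below it, which is the whole trick; the hard part the articles deal with is learning those attribute numbers from the ratings instead of making them up.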
Now I'm intrigued. Bad news for a person with no free time.
