Netflix on Facebook – The slow revolution of recommendation engines
TechCrunch reports that Netflix is now using Facebook Connect: Now, a Facebook user with Netflix can share their movie ratings with friends, view friends’ ratings through their news feed, and can add movies to their queue directly from their friends’ news feed.
My prediction is that this will revolutionize media recommendation algorithms. Why? Read the post I wrote some time ago called “Does the Netflix challenge have it backwards?” where I raised this question: Should Netflix spend a million dollars on their algorithm? Or should they rethink their primitive rating system on which their algorithm is based? Below, I am rewriting that post including some new ideas…
Netflix is trying to guess people’s taste based on a rating system of 1-5 stars. Is that a very good database to build an algorithm on in the first place? Couldn’t they spend a million dollars to make a better database? What if the “five star good-or-bad” rating system is a fundamentally flawed scale to use in predicting what we are likely to watch?
The “five star good-or-bad” scale has good uses, but recommendation is not one of them. It is not a question of accuracy; it’s a matter of depth. For example, imagine trying to map out all your favorite restaurants using only a five star rating system. You couldn’t portray a very good idea of your culinary taste could you? There are so many factors like cost and atmosphere that would be completely ignored.
You might ask – “so why does Yelp use a five star system to rate restaurants?” That’s different – it’s not user preference, it’s restaurant popularity – and this is the important difference that Netflix has missed. The five star system is useful for sites like Yelp to rate restaurants, but it’s a poor choice for capturing diverse individual preferences. Your Netflix movie preferences are more analogous to the latter, and yet Netflix is still using a five star system. It doesn’t make sense.
Netflix already knows that there are psychological influences (like the anchoring effect) that skew the accuracy of a five star rating system (see this Wired article). Programmers do a lot of fancy gymnastics to account for effects like that, but there could be a better approach. What if Netflix used one of these rating systems instead of a five-star system?
What if they used tag keywords? What if the user could play an online game where they would fight for the movies they liked the best? What if users bet on which movie they thought would get higher ratings? What if they were given a random sampling of reviews and were asked to agree or disagree? (update 20090325: What if they used Facebook Connect to mine comments and direct user-friend interaction? – now implemented)

The five star rating system’s early predecessors like Slashdot needed a rating system to provide the best content to fit a community of readers. But Netflix users are not voters in a democracy. They are niche choosers. Netflix isn’t distilling a single set of content; they are tailoring content to individual users, so it doesn’t make sense to use the same five star rating system. Netflix is using the ratings of each movie as building blocks to define what niche a person is in. But it would be better to use a rich language of associative cues to define a rich web of micro-genres without worrying about each person’s preference about each movie. It is more important to get a rich rendering of a micro-genre web. THEN you can guess which nodes of that web the user will most likely be attracted to.
At first it seems stupid to gather data by having the user play a loose association game, because the data is very low-resolution – any given “review” (collected by keyword or playing a game) seems inconclusive and almost arbitrary. But on the other hand, this data has personality, and when you have a huge sample, even wild fluctuations in arbitrary choices average out to produce meaningful results.
It’s the old trick that a room full of people guessing the number of jellybeans in a jar will all be really far off, but their average will be scary accurate.
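To make the jellybean trick concrete, here is a minimal sketch in Python. It is a toy simulation, not anything Netflix does: the jar size, crowd size, and noise level are made-up numbers, and it assumes each guess is independent and off in a random direction rather than skewed one way. If everybody in the room shares the same bias, averaging won't rescue you.

```python
import random
import statistics

def simulate_crowd(true_count, n_guessers, noise_sd, seed=0):
    """Simulate independent, unbiased but very noisy guesses (an assumption)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    # Each person's guess is the true count plus large random noise, floored at zero.
    guesses = [max(0.0, rng.gauss(true_count, noise_sd)) for _ in range(n_guessers)]
    crowd_estimate = statistics.mean(guesses)
    # How far off a typical individual is, versus how far off the average is.
    individual_error = statistics.mean([abs(g - true_count) for g in guesses])
    crowd_error = abs(crowd_estimate - true_count)
    return crowd_estimate, individual_error, crowd_error

# Hypothetical numbers: 1000 beans, 500 guessers, each typically off by hundreds.
crowd_estimate, individual_error, crowd_error = simulate_crowd(
    true_count=1000, n_guessers=500, noise_sd=400
)
print(f"typical individual miss: {individual_error:.0f} beans")
print(f"crowd average miss:      {crowd_error:.0f} beans")
```

With independent noise, the error of the average shrinks roughly with the square root of the number of guessers, which is why a big crowd of sloppy guesses can beat any one careful guess.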

Statistics of jelly beans in a jar: A crowd can make an accurate guess collectively even though each person's guess is inaccurate
The upside of this method is that you gather a much richer database of information. You won’t have a very good idea how the user feels about any given movie, but you WILL start to home in on which movies belong to which niches. Then you can judge which niches the user is likely to be attracted to. Netflix is asking people to guess the jellybeans to the nearest 1/5 of the jar, and THAT is ruining their whole crowd sample. Rather than focusing on each individual’s unique “rating thumbprint” and comparing those thumbprints, they should be establishing a detailed map of micro-genres and then deciding which micro-genres the user is most attracted to. In trying to be accurate about each user’s opinion on each movie, Netflix loses the detail in their rendering of a micro-genre web.
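Here is a rough sketch of what "map the micro-genres first, then place the user on the map" could look like. Everything in it is hypothetical: the tags stand in for whatever a keyword or association game would collect from the crowd, the tag sets are invented for illustration, and the scoring rule (average overlap with the tags of movies the user already picked) is just one simple choice, not Netflix's algorithm.

```python
from collections import Counter

# Hypothetical crowd-sourced tags; each tag is a tiny "micro-genre" cue.
movie_tags = {
    "Alien": {"space", "horror", "slow-burn"},
    "Moon": {"space", "slow-burn", "lonely"},
    "The Thing": {"horror", "paranoia", "snow"},
    "Fargo": {"snow", "crime", "dark-comedy"},
    "In Bruges": {"crime", "dark-comedy"},
}

def tag_profile(liked, movie_tags):
    """Count how often each tag appears among the movies a user picked."""
    profile = Counter()
    for movie in liked:
        profile.update(movie_tags[movie])
    return profile

def recommend(liked, movie_tags, top_n=2):
    """Rank unseen movies by how strongly their tags overlap the user's profile."""
    profile = tag_profile(liked, movie_tags)
    scores = {}
    for movie, tags in movie_tags.items():
        if movie in liked:
            continue
        # Average profile weight per tag, so heavily tagged movies get no free boost.
        scores[movie] = sum(profile[tag] for tag in tags) / len(tags)
    # Highest score first; ties are broken alphabetically for a stable order.
    return sorted(scores, key=lambda m: (-scores[m], m))[:top_n]

picks = recommend({"Alien", "Fargo"}, movie_tags)
print(picks)
```

Notice that no star rating appears anywhere. The user never says how much they liked anything; the sketch only needs to know which nodes of the tag web they keep landing on.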
—-

I find it odd that you didn’t mention Amazon’s recommendation engine (considering you spoke so much about it in your longtail post) – just a thought.
Also, remember the uproar that Netflix went through following their previous fiasco with Facebook (i.e. Beacon).
SELFPOST:
@distail @disrail – thanks for the thoughts. Yeah, I copied most of this post quickly from my old post about Netflix, so I didn’t add a lot about Amazon’s recommendation engine. I just read a little more on that though – http://www.readwriteweb.com/archives/recommendation_longtail.php
Yeah, I had to think back for a minute but I remember the Beacon fiasco. They’ll have to be more careful this time not to post any info behind the user’s back. No matter how careful they are, I’ll bet they get some flak anyway – this stuff is an increasingly touchy privacy issue.
Thanks distail-Thanks disrail –(DOH! Cheers, disrail)
its disrail btw