I Know What You Want – Recommendation Systems

Do you remember the days of scrolling through the TV channels or standing at the DVD shelf trying to decide what to watch? I do. What I cannot remember so well is exactly when we stopped needing to do this. Like many great innovations, recommendation algorithms have crept in gradually, making the content we view online more tailored to what we want.

This advancement has been gradual, as big players like YouTube, Netflix and Amazon have amassed unfathomable amounts of data and used it to carefully curate a highly personalised experience for their users. In this article, I will mainly focus on recommendation systems for movies and TV, but there are many other applications that I am sure you will have encountered.

Netflix even ran a public competition, the Netflix Prize, which launched in 2006 and awarded its $1M grand prize in 2009, to find a better algorithm for predicting user ratings.

Since then, the technology has continued to improve, but let’s have a look at the core mechanics of how these recommendation systems work. Broadly, there are two main types:

  • Content-Based Filtering
  • Collaborative Filtering

Content-Based Filtering works by finding movies whose attributes are similar to those of movies a given user has rated highly. For example, if a user rates Harry Potter 5/5, the system will look for other movies about magical schools (not sure how many of these there are), but it might also find other movies with Daniel Radcliffe as the lead actor. This works well for suggesting the other Harry Potter movies, but it can also surface very different movies such as The Woman In Black, a horror movie that might not be a good suggestion for the younger audiences that enjoyed Harry Potter.
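To make the Harry Potter example concrete, here is a minimal Python sketch of the idea. It assumes each movie is described by a small set of hand-picked tags (the catalogue, the tags, the "Hypothetical Space Documentary" entry and the function names `cosine_similarity` and `recommend` are all illustrative, not taken from any real system), and it ranks the other titles by how much their tags overlap with a movie the user rated highly.

```python
import math

# Hand-picked, illustrative attribute tags (an assumption, not a real catalogue).
catalogue = {
    "Harry Potter and the Philosopher's Stone": {"magic school", "fantasy", "family", "Daniel Radcliffe"},
    "Harry Potter and the Chamber of Secrets": {"magic school", "fantasy", "family", "Daniel Radcliffe"},
    "The Woman in Black": {"horror", "Daniel Radcliffe"},
    "Hypothetical Space Documentary": {"documentary", "space"},
}


def cosine_similarity(tags_a, tags_b):
    """Cosine similarity between two tag sets treated as 0/1 vectors."""
    if not tags_a or not tags_b:
        return 0.0
    return len(tags_a & tags_b) / math.sqrt(len(tags_a) * len(tags_b))


def recommend(liked_title, catalogue):
    """Rank every other title by similarity to one the user rated highly."""
    liked_tags = catalogue[liked_title]
    scores = [
        (title, cosine_similarity(liked_tags, tags))
        for title, tags in catalogue.items()
        if title != liked_title
    ]
    # Highest similarity first.
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


ranking = recommend("Harry Potter and the Philosopher's Stone", catalogue)
for title, score in ranking:
    print(f"{score:.2f}  {title}")
```

With these tags the other Harry Potter film comes out on top, but The Woman In Black still scores above zero purely because of the shared lead actor, which is exactly the weakness described above. A fuller system would typically weight the tags and combine all of a user's ratings rather than a single liked title.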

Collaborative Filtering tackles this problem differently. Instead of using the labelled attributes of the movies, it relies on the collaborative efforts of the community to find patterns in the user ratings. With a perfect dataset (imagine every person on the planet rated every movie they have ever seen out of 5), you can see how the system would likely find a person just like you who has seen some films that you haven’t. It’s a bit like sending a clone of yourself to watch things and then tell you which of them you will like! Without that perfect dataset we can still make good suggestions, and as more users join the service, the system only gets better. How this actually happens is a bit more complicated. Essentially, it involves generating Latent-Features that are used within a Matrix Factorisation process to reproduce, as closely as possible, the known data we have:

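| User  | Movie A | Movie B | Movie C |
|-------|---------|---------|---------|
| Kyle  | 5       | 1       | 2       |
| Dave  | 2       | 5       | ?       |
| Sarah | 4       | ?       | 3       |
| Gary  | 5       | 1       | ?       |

User-Movie-Rating Matrix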

In the above table, we can see how different people have rated different movies, with a ? marking a rating we don’t have. We can then generate 2 other matrices containing the “Latent-Features”, one for the users and one for the movies. Note that we don’t know what these features will actually represent, but we hope to iteratively adjust their values to get closer to a solution.
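For readers who like notation, the setup can be written down as follows. The symbols are my own choice rather than anything standard to this article: R is the 4×3 rating matrix from the table, P holds one row of k Latent-Features per user, Q holds one row of k Latent-Features per movie, and K is the set of (user, movie) pairs whose rating we actually know. The number of features k is something we pick ourselves.

```latex
R \approx P\,Q^{\top},
\qquad P \in \mathbb{R}^{4 \times k},
\quad Q \in \mathbb{R}^{3 \times k}

\hat{r}_{ui} = p_u \cdot q_i = \sum_{f=1}^{k} p_{uf}\, q_{if}

\text{error} = \sum_{(u,i) \in K} \left( r_{ui} - \hat{r}_{ui} \right)^2
```

The error only sums over the known ratings, so the ? cells never pull the values in any direction. They are simply read off from the product once the known cells fit well.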

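We can then multiply (Dot-Product) the 2 Latent-Feature matrices together, compare the resulting matrix to the known ratings in the one shown above and adjust the values to reduce the difference. Repeating this many times using Gradient-Descent should hopefully result in the Dot-Product filling in the missing ratings too. For example, Gary’s known ratings match Kyle’s exactly (5 and 1), so we would expect it to show that Gary would rate Movie C as roughly a 2, just as Kyle did.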
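The loop described above can be sketched in a few lines of plain Python. This is only an illustration under assumptions of my own: 2 Latent-Features per user and movie, a fixed learning rate, a fixed number of passes, a fixed random seed and no regularisation. The names `predict` and `factorise` are made up for this example. Real systems work on far larger, sparser data and usually rely on a dedicated library.

```python
import random

# Ratings from the table above; None marks a rating we do not know yet.
users = ["Kyle", "Dave", "Sarah", "Gary"]
movies = ["Movie A", "Movie B", "Movie C"]
ratings = [
    [5, 1, 2],
    [2, 5, None],
    [4, None, 3],
    [5, 1, None],
]


def predict(user_vec, movie_vec):
    """Dot product of one user's and one movie's latent features."""
    return sum(p * q for p, q in zip(user_vec, movie_vec))


def factorise(ratings, n_features=2, learning_rate=0.01, n_steps=20000, seed=0):
    """Fit user and movie latent-feature matrices by gradient descent.

    n_features, learning_rate, n_steps and seed are illustrative choices.
    """
    rng = random.Random(seed)
    n_users, n_movies = len(ratings), len(ratings[0])
    # Start from small random values; what they represent is not known in advance.
    user_f = [[rng.uniform(0.1, 1.0) for _ in range(n_features)] for _ in range(n_users)]
    movie_f = [[rng.uniform(0.1, 1.0) for _ in range(n_features)] for _ in range(n_movies)]
    # Only the known ratings are compared against the prediction.
    known = [
        (u, m, r)
        for u, row in enumerate(ratings)
        for m, r in enumerate(row)
        if r is not None
    ]
    for _ in range(n_steps):
        for u, m, r in known:
            error = r - predict(user_f[u], movie_f[m])
            for f in range(n_features):
                pu, qm = user_f[u][f], movie_f[m][f]
                # Nudge both features in the direction that shrinks the error.
                user_f[u][f] += learning_rate * error * qm
                movie_f[m][f] += learning_rate * error * pu
    return user_f, movie_f


user_f, movie_f = factorise(ratings)
gary_c = predict(user_f[users.index("Gary")], movie_f[movies.index("Movie C")])
print(f"Predicted rating for Gary on Movie C: {gary_c:.2f}")
```

Because Gary and Kyle agree on Movie A and Movie B, their Latent-Features should end up very similar, so the printed prediction should land close to Kyle’s rating of 2. The exact figure depends on the starting values and the settings chosen.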

This is a very high-level overview, but if you have an interest in this topic or can see how your business would benefit from a recommender system, let us know!