3. Content-based Recommender Systems
Let's continue our introduction to recommender systems by digging a bit deeper into content-based recommender systems.
The main idea behind content-based recommender systems is to recommend items to a user A that are similar to previous items rated "highly" by A.
A content-based recommendation process starts by extracting relevant key-features from the items in the catalog and then building an item profile for each of the items using those key-features. For example, let's consider a catalog from a retailer that sells geometric shapes. The figure below shows how each item in the catalog can be mapped to a profile using key-features extracted from attributes that describe the item.
The process tracks the items that a user "likes", using both explicit and implicit ratings, and then uses the profiles of those liked items to build (infer) a user profile. With reference to the example shown in the figure below, we could infer that the user likes the color red, circles, and triangles.
Once we have the user's profile, we can match it against the items in the catalog, find the items that are most similar to it, and recommend those to the user.
But how do you build item profiles? An item profile is a set of features (key-words) that describe the item. Here are some examples:
- Movies: actors, director, title, plot
- Clothing: product type (coat, jacket, jumper, etc.), brand, style, color, material, etc.
Even though the item profile is a set of features, it is convenient to think of it as a vector. The vector has one entry for each feature, and its entries can be either boolean or real values. Defining the profile is a very domain-specific task that starts by specifying the item's attributes from which key-features (key-words) should be extracted. In the simplified example depicted in the figures above, the attributes are the color and the shape of the items. In the case of movies, these attributes might be the title, the actors, the director, and the plot. Note that key-words can also be extracted from free-text fields like a movie's plot or an item's description, using text classification tools.
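As a minimal sketch of the vector representation just described, here is a boolean item profile in Python. The feature list is hypothetical, echoing the geometric-shapes example above:

```python
# Hypothetical key-features from the geometric-shapes catalog example.
FEATURES = ["red", "blue", "circle", "triangle"]

def item_profile(attributes):
    """Map an item's attribute set to a boolean feature vector."""
    return [1.0 if f in attributes else 0.0 for f in FEATURES]

red_circle = item_profile({"red", "circle"})
blue_triangle = item_profile({"blue", "triangle"})
print(red_circle)     # [1.0, 0.0, 1.0, 0.0]
print(blue_triangle)  # [0.0, 1.0, 0.0, 1.0]
```

In a real system the features would of course be far more numerous, and real-valued entries (e.g. TF-IDF scores for plot key-words) would replace the 0/1 flags.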
Once we have built the items' profiles, the next step is to build the user's profile. How can we do that? Let's say we have a user that has rated items with profiles: i1, i2, ... ,in. A simple way of building the user's profile is to calculate the average of the rated items' profiles (see figure below).
The next figure shows how this technique can be applied to our simplified geometric shapes example.
This simple way of building the user's profile doesn't take into account that a user might like certain items more than others. In that case, you might want to use a weighted average where the weight is equal to the rating given by the user for a specific item.
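The simple (unweighted) average can be sketched in a few lines, again using made-up profile vectors from the geometric-shapes example:

```python
def average_profile(item_profiles):
    """User profile as the per-feature average of rated item profiles."""
    n = len(item_profiles)
    dims = len(item_profiles[0])
    return [sum(vec[k] for vec in item_profiles) / n for k in range(dims)]

# Hypothetical liked items, as [red, blue, circle, triangle] vectors:
liked = [[1.0, 0.0, 1.0, 0.0],   # a red circle
         [1.0, 0.0, 0.0, 1.0]]   # a red triangle
print(average_profile(liked))    # [1.0, 0.0, 0.5, 0.5]
```

The resulting profile says the user always liked red items, and liked circles and triangles half the time each.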
Let's consider a simple movie example to see how you can build the user's profile within a 5 star rating system. Suppose a user has watched 5 movies, 2 of which featuring actor A, and 3 of them featuring actor B. Let's also suppose that "actor" is the only meaningful key-feature for describing a movie item, and that the user has rated the movies as follows:
- the two movies featuring actor A were rated 3 and 5
- the three movies featuring actor B were rated 1, 2, and 4
Since we are using a 5 star rating system, it seems apparent that the user liked at least one movie featuring actor A (the one rated 5 stars) and one movie featuring actor B (the one rated 4 stars). It also appears that the user didn't like the movies rated 1 and 2 stars, as these are low ratings on the scale, and we would like to capture this fact somehow. Normalizing ratings helps us distinguish ratings that are actually negative from those that are actually positive. Users are very different from each other; some are more "generous" with their ratings than others. For one user, 4 stars may be an unusually positive rating, while for another user 4 stars could be just an average rating. To capture this nuance we baseline each user's ratings against their average rating: we normalize by subtracting the user's average rating (3 in this case) from each rating value. The normalized ratings become:
- actor A's movies normalized ratings: (3-3) = 0, (5-3) = 2
- actor B's movies normalized ratings: (1-3) = -2, (2-3) = -1, (4-3) = 1
Notice that the normalized ratings seem to capture the intuition that the user did not like the movies rated 1 star and 2 stars, as those ratings are now negative values. After performing this normalization we can compute the profile's weights.
- actor A profile weight: (0+2) / 2 = 1
- actor B profile weight: (-2-1+1) / 3 = -2/3
Therefore the user's profile vector would look like this:
It's worth noticing that this indicates a mild preference for actor A and a mild negative preference for actor B.
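The worked example above can be reproduced in a few lines of Python (the dictionary layout is just one possible encoding of the ratings):

```python
# Star ratings grouped by the actor featured in each movie.
ratings = {"A": [3, 5], "B": [1, 2, 4]}

# The user's average rating across all five movies (here, 3.0).
all_ratings = [r for rs in ratings.values() for r in rs]
mean = sum(all_ratings) / len(all_ratings)

# Per-actor profile weight: average of the normalized ratings.
weights = {actor: sum(r - mean for r in rs) / len(rs)
           for actor, rs in ratings.items()}
print(weights)  # A -> 1.0, B -> -2/3
```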
Now that we are able to build profiles for both users and items, the next step is to recommend items to the user. The key step here is to take a pair, consisting of the user's profile and an item's profile, and determine how similar they are. But what is similarity, exactly? It may not seem like something we can easily quantify, but it turns out to be easy to measure. Remember that both the user's profile and the item's profile are vectors in a multi-dimensional space (with each key-feature being a dimension). A good measure of the similarity between two vectors is the angle between them, which can be computed using the cosine similarity formula: sim(u, i) = cos(θ) = (u · i) / (||u|| ||i||).
The figure below shows two vectors, v and i, in a 2-dimensional space for simplicity.
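Cosine similarity takes only a few lines to implement. Here it is applied to hypothetical profile vectors from the geometric-shapes example:

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between vectors u and v (1.0 = same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

user = [1.0, 0.0, 0.5, 0.5]  # likes red; mildly likes circles and triangles
item = [1.0, 0.0, 1.0, 0.0]  # a red circle
print(round(cosine_similarity(user, item), 3))  # 0.866
```

A value close to 1 means the item points in nearly the same direction as the user's tastes; a value near 0 means the profiles share little.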
So the way a content-based recommender system makes predictions is as follows: given a user x and their profile, the recommender system computes the cosine similarity between the user's profile and all the items in the catalog. It then picks the top-k items with the highest cosine similarity (the ones most similar to the user's profile) and recommends those to the user.
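The whole prediction step can be sketched as follows; the catalog items and vectors are made up for illustration:

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

def recommend(user_vec, catalog, k=2):
    """Return the names of the k catalog items most similar to the user."""
    scored = sorted(catalog.items(),
                    key=lambda kv: cosine(user_vec, kv[1]),
                    reverse=True)
    return [name for name, _ in scored[:k]]

catalog = {"red circle":    [1.0, 0.0, 1.0, 0.0],
           "blue triangle": [0.0, 1.0, 0.0, 1.0],
           "red triangle":  [1.0, 0.0, 0.0, 1.0]}
user = [1.0, 0.0, 0.5, 0.5]
print(recommend(user, catalog))  # ['red circle', 'red triangle']
```

In practice you would precompute item vectors and use an indexed similarity search rather than scoring the whole catalog on every request, but the logic is the same.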
That is the (simple) theory behind content-based recommender systems. Let's wrap up by looking at the pros and cons of this technique.
- No need for data on other users.
You can start making recommendations from day one.
- Ability to recommend to users with unique tastes.
When using collaborative filtering you need to find similar users. The problem is that a user with very unique tastes may not be similar to any other user.
- Ability to recommend new and unpopular items.
As the item profile depends entirely on the item's features and not on the ratings from other users, we can make recommendations for an item as soon as it becomes available.
- Easy to explain.
Content-based recommendations are easy to explain to the user.
- Finding the appropriate features could be hard and it's a very domain specific task.
- Never recommends items outside the user's content profile.
You need to mitigate this by finding ways to increase recommendation diversity (and therefore sales diversity), in order to spark new interests in the user. This is often achieved by introducing a certain level of serendipity.
- Cold start for new users.
How do you build a profile for a new user? A new user hasn't rated any items yet, so there is no user profile. In most practical situations one of two approaches is used. The first approach initializes new users with an "average profile" based on the existing users' profiles; over time, as the user rates more items, the profile becomes more individualized. The second approach recommends the most popular items to the user and builds the user's profile from scratch as they start rating items.
In the next article, I will dig deeper into another popular technique used by modern recommender systems (often used in conjunction with content-based recommender systems): memory-based collaborative filtering. Stay tuned!
This article is written by Riccardo Saccomandi, Co-founder and CTO of Kickdynamic.