ML - Content Based Recommender System
Recommendation systems are an important part of many digital platforms, from suggesting movies on Netflix to recommending products on Amazon. Among the different recommendation approaches, Content-Based Recommender Systems focus on the characteristics of items and the preferences of users to generate personalized recommendations. They use information about a user’s past behavior and item features to recommend similar items.
This information can include explicit feedback such as ratings, or implicit feedback such as clicks, views or time spent on content. Based on this data, the system builds a user profile, which is then used to find items that closely match the user's preferences. As the user continues to interact with the system, the profile is updated, which leads to more accurate and relevant suggestions.

Components of a Content-Based Recommender
1. User Profile
The User Profile is a representation of the user’s preferences. We model it as a feature vector, capturing characteristics of items the user liked or interacted with.
- A utility matrix is used to represent interactions between users and items (ratings, clicks, likes).
- The system analyzes items previously rated or liked by the user to identify key features like genre, actors, etc.
- These features are then aggregated to form the user's profile vector.
Example: If a user likes action movies directed by Christopher Nolan and starring Christian Bale, their profile may have high weights for the features “action”, “Christopher Nolan” and “Christian Bale”.
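The aggregation step can be sketched as follows. This is a minimal illustration, assuming each item is described by binary feature flags and the profile is the rating-weighted average of the rated items' vectors. The feature list, the item vectors, the ratings and the helper `build_user_profile` are all hypothetical, and other weighting schemes are equally valid.

```python
# Hypothetical feature vocabulary; the order defines the vector positions.
FEATURES = ["action", "romance", "Christopher Nolan", "Christian Bale"]

# Hypothetical item profiles: 1 if the feature applies to the movie, else 0.
ITEM_FEATURES = {
    "The Dark Knight": [1, 0, 1, 1],
    "Inception":       [1, 0, 1, 0],
    "The Notebook":    [0, 1, 0, 0],
}

def build_user_profile(ratings, item_features, n_features):
    """Return the rating-weighted average of the rated items' feature vectors."""
    profile = [0.0] * n_features
    total_weight = sum(ratings.values())
    if total_weight == 0:
        return profile  # no usable feedback yet
    for title, rating in ratings.items():
        for k, value in enumerate(item_features[title]):
            profile[k] += rating * value
    return [weight / total_weight for weight in profile]

# Assumed ratings on a 1-5 scale.
user_ratings = {"The Dark Knight": 5, "Inception": 4, "The Notebook": 1}
user_profile = build_user_profile(user_ratings, ITEM_FEATURES, len(FEATURES))

for name, weight in zip(FEATURES, user_profile):
    print(f"{name}: {weight:.2f}")
```

Features shared by highly rated items end up with the largest weights, which matches the intuition in the example: "action" and "Christopher Nolan" dominate this profile, while "romance" stays close to zero.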
2. Item Profile
Each item is also represented as a vector of relevant features. Key features depend on the domain:
- Movies: genre, director, actors, release year, IMDb rating.
- Books: author, genre, publication year, keywords.
- Products: brand, category, specifications, price.
The item profile captures the essence of what the item is about. This information is later compared with the user profile to measure similarity.
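One simple way to turn item attributes into a vector is one-hot style encoding over a shared vocabulary. The sketch below assumes only categorical attributes (genre, director, actors). The helpers `build_vocabulary` and `encode_item` and the record layout are hypothetical, and numeric attributes such as release year or price would need their own scaling.

```python
def build_vocabulary(items):
    """Collect every distinct categorical feature value, in sorted order."""
    vocab = set()
    for item in items.values():
        vocab.add(item["genre"])
        vocab.add(item["director"])
        vocab.update(item["actors"])
    return sorted(vocab)

def encode_item(item, vocabulary):
    """Return a 0/1 vector marking which vocabulary entries the item has."""
    present = {item["genre"], item["director"], *item["actors"]}
    return [1 if term in present else 0 for term in vocabulary]

# Illustrative movie records; the dictionary layout is an assumption.
movies = {
    "The Dark Knight": {
        "genre": "action",
        "director": "Christopher Nolan",
        "actors": ["Christian Bale"],
    },
    "The Notebook": {
        "genre": "romance",
        "director": "Nick Cassavetes",
        "actors": ["Ryan Gosling"],
    },
}

vocabulary = build_vocabulary(movies)
item_profiles = {title: encode_item(m, vocabulary) for title, m in movies.items()}

print(vocabulary)
for title, vector in item_profiles.items():
    print(title, vector)
```

Because every item is encoded against the same vocabulary, all item vectors have the same length and can be compared directly with a user profile built over the same features.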
3. Utility Matrix
The Utility Matrix represents the preferences of users for different items. Each row corresponds to a user and each column corresponds to an item. The matrix is usually only partially filled (sparse), as users rarely rate or interact with all available items.
User / Movie | Inception | The Dark Knight | Interstellar | The Notebook |
---|---|---|---|---|
User A | 5 | 4 | 5 | 1 |
User B | 4 | 5 | ? | ? |
Some cells in the matrix are blank (shown as "?") because users do not give feedback on every item. The goal of a recommendation system is not to fill in every blank cell, but to recommend items that the user is likely to prefer.
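As a rough sketch, the table above can be stored as a mapping from users to rating rows, with `None` standing in for the "?" cells. The helper names `unrated_items` and `sparsity` are hypothetical. Real systems typically use a sparse matrix format instead of nested lists.

```python
MOVIES = ["Inception", "The Dark Knight", "Interstellar", "The Notebook"]

# None marks a missing rating (the "?" cells in the table).
utility_matrix = {
    "User A": [5, 4, 5, 1],
    "User B": [4, 5, None, None],
}

def unrated_items(user, matrix, items):
    """Return the items the user has not rated yet (recommendation candidates)."""
    return [title for title, rating in zip(items, matrix[user]) if rating is None]

def sparsity(matrix):
    """Return the fraction of cells that have no rating."""
    cells = [rating for row in matrix.values() for rating in row]
    return sum(rating is None for rating in cells) / len(cells)

print(unrated_items("User B", utility_matrix, MOVIES))
print(f"Sparsity: {sparsity(utility_matrix):.2f}")
```

The unrated items are the candidates the recommender scores for that user, so for User B only "Interstellar" and "The Notebook" need to be ranked.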
Making Recommendations
Once the user and item profiles are created, the system must determine how well each item aligns with the user's preferences. Two common approaches are:
Method 1: Cosine Similarity
Cosine similarity measures the cosine of the angle between the user vector and the item vector. The smaller the angle (closer to 0), the closer the cosine is to 1 and the higher the similarity.
How it works:
- A user vector might include positive weights for preferred features (e.g., genres, actors) and negative weights for disliked features.
- The cosine of the angle between vectors indicates how closely an item matches the user’s taste.
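Formula:
$$\text{similarity}(\vec{u}, \vec{i}) = \cos(\theta) = \frac{\vec{u} \cdot \vec{i}}{\|\vec{u}\| \, \|\vec{i}\|}$$
Where:
- $\vec{u}$ is the user profile vector
- $\vec{i}$ is the item profile vector
- $\theta$ is the angle between the two vectors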
A higher similarity score indicates a better match and such items are recommended to the user.
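A small sketch of scoring and ranking with cosine similarity, using only the standard library. The user profile weights, the three-feature layout and the candidate vectors are made-up values for illustration. The negative weight on "romance" models a disliked feature.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Assumed feature order: [action, romance, Christopher Nolan]
user_profile = [0.9, -0.5, 0.8]

# Hypothetical item profiles for movies the user has not rated yet.
candidates = {
    "Interstellar":  [0, 0, 1],
    "The Notebook":  [0, 1, 0],
    "Batman Begins": [1, 0, 1],
}

# Rank candidates from most to least similar to the user profile.
ranked = sorted(
    candidates,
    key=lambda title: cosine_similarity(user_profile, candidates[title]),
    reverse=True,
)

for title in ranked:
    score = cosine_similarity(user_profile, candidates[title])
    print(f"{title}: {score:.3f}")
```

With these numbers, the item that matches both liked features ranks first, and the item that matches only the disliked feature gets a negative score and ranks last.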
Method 2: Classification-Based Approach
Instead of calculating similarities, we can treat recommendation as a classification problem, predicting whether a user will like or dislike an item.
Example classifier: Decision Tree
- Features: Genre, director, actors, duration, etc.
- At each node, the tree asks a question like "Is the genre Action?"
- The tree refines decisions at each level until it predicts a binary outcome: like or dislike.

This approach can be extended using other classifiers. It works well when we have labeled data and want more interpretable rules for recommendations.
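To make the tree idea concrete, here is a tiny hand-rolled decision tree over yes/no features that chooses each question by Gini impurity. It is a teaching sketch, not a production implementation. The feature names and the labeled history are invented for illustration, and in practice a library class such as scikit-learn's `DecisionTreeClassifier` would normally be used instead.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(rows, labels, features):
    """Pick the yes/no feature that gives the lowest weighted Gini impurity."""
    best_feature, best_score = None, gini(labels)
    for f in features:
        yes = [y for r, y in zip(rows, labels) if r[f]]
        no = [y for r, y in zip(rows, labels) if not r[f]]
        if not yes or not no:
            continue  # this question does not separate the data
        score = (len(yes) * gini(yes) + len(no) * gini(no)) / len(labels)
        if score < best_score:
            best_feature, best_score = f, score
    return best_feature

def build_tree(rows, labels, features):
    """Return a leaf label, or a nested dict with 'question', 'yes' and 'no'."""
    feature = best_split(rows, labels, features)
    if feature is None:
        return Counter(labels).most_common(1)[0][0]  # majority label as leaf
    remaining = [f for f in features if f != feature]
    yes = [(r, y) for r, y in zip(rows, labels) if r[feature]]
    no = [(r, y) for r, y in zip(rows, labels) if not r[feature]]
    return {
        "question": feature,
        "yes": build_tree([r for r, _ in yes], [y for _, y in yes], remaining),
        "no": build_tree([r for r, _ in no], [y for _, y in no], remaining),
    }

def predict(tree, row):
    """Follow the questions down the tree until a leaf label is reached."""
    while isinstance(tree, dict):
        tree = tree["yes"] if row[tree["question"]] else tree["no"]
    return tree

# Hypothetical labeled history for one user: (movie features, feedback).
FEATURES = ["is_action", "directed_by_nolan", "over_150_min"]
history = [
    ({"is_action": 1, "directed_by_nolan": 1, "over_150_min": 1}, "like"),
    ({"is_action": 1, "directed_by_nolan": 0, "over_150_min": 0}, "like"),
    ({"is_action": 0, "directed_by_nolan": 1, "over_150_min": 1}, "like"),
    ({"is_action": 0, "directed_by_nolan": 0, "over_150_min": 0}, "dislike"),
    ({"is_action": 0, "directed_by_nolan": 0, "over_150_min": 1}, "dislike"),
    ({"is_action": 1, "directed_by_nolan": 0, "over_150_min": 1}, "like"),
]
rows = [features for features, _ in history]
labels = [label for _, label in history]

tree = build_tree(rows, labels, FEATURES)
new_movie = {"is_action": 0, "directed_by_nolan": 1, "over_150_min": 0}
print(tree)
print(predict(tree, new_movie))
```

The learned tree can be read directly as rules (first "Is it an action movie?", then "Is it directed by Nolan?"), which is the interpretability benefit mentioned above.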
Advantages of Content-Based Recommendation
- Personalized: Gives suggestions based on individual preferences.
- No cold start for items: Since recommendations depend on item features, new items can be recommended if their features are known.
- User independence: Unlike collaborative filtering, content-based methods don’t need data from other users.
Content-Based Recommender Systems offer a way to suggest items by analyzing user behavior and item attributes. They come with limitations but their effectiveness can be increased when combined with other techniques in hybrid models.
Another type of recommendation system is Collaborative Filtering, which you can read about in: Collaborative Filtering in Machine Learning