Recommendation Systems


Notes from the Recommendation Systems course by Prof. Arpit Rana @ DAIICT and the MMD course @ Stanford.

Generating explanations for recommendations

1. Recommendation as Rating Prediction

a. Memory-based (Neighbour-based) methods

a.1 User-User Collaborative Filtering
  • Predict the rating of a user for an item based on how the user's peers rated it.
  • First, find the neighbours of the target user. One simple way is to build, for each pair of users, vectors from the ratings on items rated by both of them, and compute the cosine similarity of the two vectors. Neighbours are then chosen based on this similarity.
  • Also note that for binary like/dislike data we can use an XNOR-style agreement count (not Jaccard) to compute similarity, because here the value 0 represents dislike rather than a missing rating. If both users dislike the same item, they may be somewhat similar in a local view, and Jaccard would ignore that agreement.
  • For any user, we can either pick the top-k most similar users as neighbours (k-Nearest Neighbours, e.g. top 100), or set a threshold and say that any user with similarity greater than the threshold is a neighbour of the target user.
  • Now, once we have similar neighbours, we can do one of the following:
    1. Take the simple or weighted average of the neighbours' ratings for the target item $i$ (whose rating is unknown). Simple average of ratings: $\hat{r}_{ui} = \frac{1}{|N(u)|}\sum_{v \in N(u)} r_{vi}$. Weighted average based on user-user similarity: $\hat{r}_{ui} = \frac{\sum_{v \in N(u)} \mathrm{sim}(u,v)\, r_{vi}}{\sum_{v \in N(u)} |\mathrm{sim}(u,v)|}$. An issue with this is that ratings are subjective: for the same level of appreciation, different people may give different ratings, and this difference in how appreciation is expressed is not captured here.
    2. Use normalized ratings for users: subtract each neighbour's mean rating before averaging and add the target user's mean back, i.e. $\hat{r}_{ui} = \bar{r}_u + \frac{\sum_{v \in N(u)} \mathrm{sim}(u,v)\,(r_{vi} - \bar{r}_v)}{\sum_{v \in N(u)} |\mathrm{sim}(u,v)|}$ (see the sketch after this list).
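To make the steps above concrete, here is a minimal sketch of user-user CF (a toy example, not from the course slides): it computes cosine similarity over co-rated items, keeps the top-k neighbours who rated the target item, and combines their mean-centred ratings into a similarity-weighted average. The ratings matrix, k=2, and the helper names `cosine_sim` / `predict` are illustrative assumptions.

```python
import numpy as np

# Toy ratings matrix: rows = users, columns = items, 0 = not rated (hypothetical data).
R = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [1, 0, 0, 4],
    [0, 1, 5, 4],
], dtype=float)

def cosine_sim(u, v):
    """Cosine similarity of users u and v over the items rated by both."""
    both = (R[u] > 0) & (R[v] > 0)
    if not both.any():
        return 0.0
    a, b = R[u, both], R[v, both]
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def predict(u, i, k=2):
    """Mean-centred, similarity-weighted average over the top-k neighbours of u."""
    mean_u = R[u, R[u] > 0].mean()
    # Candidate neighbours: users (other than u) who have rated item i.
    cands = [v for v in range(R.shape[0]) if v != u and R[v, i] > 0]
    # Keep the top-k most similar candidates.
    neigh = sorted(cands, key=lambda v: cosine_sim(u, v), reverse=True)[:k]
    num = den = 0.0
    for v in neigh:
        s = cosine_sim(u, v)
        mean_v = R[v, R[v] > 0].mean()
        num += s * (R[v, i] - mean_v)   # deviation from the neighbour's own mean
        den += abs(s)
    return mean_u if den == 0 else mean_u + num / den

print(predict(u=0, i=2))   # predicted rating of user 0 for item 2
```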

b. Model-based methods

2. Recommendation as a User-Based Classification Problem

For each possible rating value, we count how many neighbours have given that value to the item. But instead of counting each neighbour as 1, we multiply the count by the user similarity, i.e. the indicator (delta) is 1 when the neighbour gave rating r and is weighted by the user-user similarity. We then predict the rating value that receives the maximum (weighted) vote.

  • In short, for each rating value r this sums the similarity weights of the neighbours who gave rating r to the item, i.e. $\hat{r}_{ui} = \arg\max_{r} \sum_{v \in N(u)} \mathrm{sim}(u,v)\,\delta(r_{vi} = r)$, and predicts the r with the largest total.

  • Extremes are handled better (i.e. the recommendation is less risky) by regression than by classification. E.g. neighbour ratings of 1 and 5 average out to 3 under regression, whereas classification must predict either 1 or 5. So when the neighbourhood is large, regression reduces this risk more effectively (see the sketch below).
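A small sketch contrasting the two prediction rules; the list of (rating, similarity) pairs for the neighbours of the active user is hypothetical.

```python
from collections import defaultdict

# Hypothetical neighbour ratings for the target item, with their similarity
# to the active user: (rating, similarity).
neighbours = [(1, 0.9), (5, 0.8), (4, 0.3)]

# Regression: similarity-weighted average of neighbour ratings.
num = sum(sim * r for r, sim in neighbours)
den = sum(abs(sim) for _, sim in neighbours)
regression_pred = num / den

# Classification: weighted vote -- each neighbour adds its similarity weight
# to the rating value it gave; predict the value with the largest total.
votes = defaultdict(float)
for r, sim in neighbours:
    votes[r] += sim
classification_pred = max(votes, key=votes.get)

print(regression_pred)       # ~3.05: a middling, lower-risk value
print(classification_pred)   # 1: an extreme value wins the vote
```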

a.2 Item-Item Collaborative Filtering

a.3 User-User vs Item-Item Collaborative Filtering

The following criteria need to be considered while choosing one of the techniques:

  1. Accuracy: A small number of high-confidence neighbours is far preferable to a large number of loosely coupled neighbours. Consider whether #users ≫ #items (favours item-item, e.g. Amazon) or #items ≫ #users (favours user-user, e.g. research-article recommendation).
  2. Efficiency: Depends on the ratio of the number of users to the number of items: the similarity model is cheaper to compute and store over the smaller of the two sets.
  3. Stability: Frequency and amount of change in the users and items of the system. E.g. if items are constantly changing (like research papers), then user-user is preferable. If the list of users keeps changing while the list of items is fairly static, the item-based approach is preferable, e.g. online shopping applications.
  4. Justifiability: Depends on the explainability of the recommendation approach. A list of neighbouring items is more justifiable than a list of neighbouring users (most users are unknown to the active user).
  5. Serendipity: Depends on the ability of the recommendation approach to offer surprising recommendations. The item-based approach relies on items similar to the active user's own items and is therefore less likely to produce surprising recommendations. The user-based approach relies on the opinions of peers with similar tastes: if a user likes one item outside her usual taste, that item may be recommended to her peers.

a.4 Components of Memory-based (Neighbour-based) methods

  1. Rating Normalization: Mean-centering, Z-score. Mean-centering determines whether a rating is positive or negative with respect to the user's mean rating: $h(r_{ui}) = r_{ui} - \bar{r}_u$. The Z-score also takes the spread of an individual's rating scale into account: $h(r_{ui}) = \frac{r_{ui} - \bar{r}_u}{\sigma_u}$.

  2. Computation of Similarity Weights: e.g. Cosine Vector similarity, Pearson correlation, Adjusted Cosine. Note the significance of adjusted cosine in the item-based setting, i.e. for the similarity between items i and j: $\mathrm{sim}(i,j) = \frac{\sum_{u \in U_{ij}} (r_{ui} - \bar{r}_u)(r_{uj} - \bar{r}_u)}{\sqrt{\sum_{u \in U_{ij}} (r_{ui} - \bar{r}_u)^2}\,\sqrt{\sum_{u \in U_{ij}} (r_{uj} - \bar{r}_u)^2}}$, where $U_{ij}$ is the set of all users that have rated both items i and j (a small sketch follows this list).

  3. Selecting Neighbors: e.g. top-N filtering or threshold filtering, as discussed above for user-user CF.
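A minimal sketch of the first two components (a toy example, not the course's code): it mean-centres and z-scores a small ratings matrix, then computes the adjusted cosine between two items over $U_{ij}$. The data and the `adjusted_cosine` helper are illustrative assumptions.

```python
import numpy as np

# Toy ratings matrix: rows = users, columns = items, np.nan = not rated (hypothetical data).
R = np.array([
    [5.0, 3.0, 4.0],
    [4.0, np.nan, 5.0],
    [1.0, 2.0, np.nan],
    [2.0, 1.0, 3.0],
])

user_means = np.nanmean(R, axis=1, keepdims=True)
user_stds  = np.nanstd(R, axis=1, keepdims=True)

mean_centred = R - user_means              # positive/negative w.r.t. each user's mean
z_scores     = mean_centred / user_stds    # also accounts for each user's rating spread

def adjusted_cosine(i, j):
    """Adjusted cosine between items i and j over users who rated both (U_ij)."""
    both = ~np.isnan(R[:, i]) & ~np.isnan(R[:, j])
    a = mean_centred[both, i]
    b = mean_centred[both, j]
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(np.round(z_scores, 2))
print(adjusted_cosine(0, 2))
```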

Content-Based Approaches

From Stanford MMD:

http://snap.stanford.edu/class/cs246-2015/slides/07-recsys1.pdf