As we all know, matrix factorization (MF) is an effective method for rating prediction in recommender systems, thanks to the work of Yehuda Koren. My question is: why does MF work for this task? What is the physical meaning behind it?
2 Answers
The idea in matrix factorization is to find the latent variables which connect the input and the output. Suppose that we are interested in a movie recommendation system, and that movies "live" on a one-dimensional axis, with romantic comedies at one end and action movies at the other. Each "input" and "output" movie can be rated on this scale (say from $-1$ to $1$). Given the input movies, we can estimate a person's location $p$ on the axis (say a point in $[-1,1]$) by averaging the "ratings" $r_i$ of the movies they saw, which is a normalized inner product between the indicator vector of the movies seen and the vector of ratings: $p \approx \frac{1}{n} \sum_{i=1}^n r_i$, where $1,\ldots,n$ are the movies seen by the person. Given the location, we can estimate how much that person will like other movies: movie $t$ will be liked by an amount proportional to $p r_t$ (this is just a simplistic linear model).
In this example, the recommendation matrix can be decomposed as an outer product $A_{it} = r_i r_t$. More generally, the ratings could depend on several latent variables, and this is the general situation described by matrix factorization. The number of latent variables is the "middle" dimension in the factorization. If you have found a good factorization with only a few variables, then you have a succinct explanation of your data that could have predictive power.
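The one-dimensional example above can be sketched in a few lines of NumPy. This is only an illustration under made-up assumptions: the movie positions in `r` and the list `seen` are hypothetical values chosen for the example, not data from any real system.

```python
import numpy as np

# Hypothetical positions r_t of five movies on the one-dimensional axis
# (assumed convention: -1 = pure action, +1 = pure romantic comedy).
r = np.array([0.9, 0.7, -0.8, -0.6, 0.1])

# Assume the person has seen movies 0 and 1.
seen = [0, 1]

# Estimated location of the person: the average of r_i over the movies seen.
p = r[seen].mean()

# Predicted affinity for every movie t, proportional to p * r_t.
scores = p * r

# The recommendation matrix A[i, t] = r_i * r_t is an outer product,
# so it has rank 1: a single latent variable explains all of it.
A = np.outer(r, r)
rank = np.linalg.matrix_rank(A)
```

Averaging the rows of `A` that correspond to the seen movies gives the same vector as `scores`, which is the sense in which the matrix "decomposes" through the single latent variable $p$.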
A rough intuition (other than the basic data compression concept) is as follows. Recommender systems that use matrix factorization methods, e.g. SVD, can be seen as a set of linear weights of natural features found via unsupervised means. A "feature", e.g. for a movie database, is a large set of movies with varying weights. The features don't necessarily have recognizable human meanings, but they could be given loose interpretations based on correlations with other variables associated with the subjects, such as "movies that women like", "movies that men like", "movies that kids like", "movies that Republicans like", et cetera. An individual's tastes are then represented approximately as a linear weighted combination of all these "factors". Hence it's very similar to statistics-based factor analysis where, roughly, causes and effects are decomposed into weighted "factors". Some accounts of the Netflix contest refer to this concept.
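To make the "weighted combination of factors" picture concrete, here is a minimal sketch using a truncated SVD in NumPy. The ratings matrix `R` is invented for illustration and is assumed to be fully observed; real rating data has missing entries, so practical systems typically fit the factors only on the observed ratings rather than calling a plain SVD.

```python
import numpy as np

# Hypothetical, fully observed ratings: rows = users, columns = movies.
R = np.array([
    [5.0, 4.0, 1.0, 1.0],
    [4.0, 5.0, 2.0, 1.0],
    [1.0, 1.0, 5.0, 4.0],
    [2.0, 1.0, 4.0, 5.0],
])

# Thin SVD: R = U @ diag(s) @ Vt, singular values in descending order.
U, s, Vt = np.linalg.svd(R, full_matrices=False)

k = 2  # number of latent factors kept (the "middle" dimension)

# Each row of `factors` is one factor: a weight for every movie.
factors = Vt[:k, :]

# Each row of `user_weights` holds one user's weights on the k factors.
user_weights = U[:, :k] * s[:k]

# A user's ratings are approximated by a weighted sum of the factors.
R_hat = user_weights @ factors

# Frobenius norm of what the k factors fail to explain.
err = np.linalg.norm(R - R_hat)
```

Here `factors[0]` and `factors[1]` play the role of the "features" described above: they are just weight vectors over movies, and any human-readable label for them would have to come from inspecting which movies get large weights.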