Feature representation (encoding) \[\Phi: \text{item}^{(m)} \rightarrow \mathbf{x}^{(m)}=\begin{pmatrix}x_1^{(m)} \\ \vdots \\ x_N^{(m)}\end{pmatrix}\]
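As a hypothetical illustration of \(\Phi\) (the genre list and duration scaling below are invented, not from the notes), an item such as a movie might be encoded as genre indicators plus a scaled duration feature:

```python
import numpy as np

# Hypothetical encoding Phi: item -> N-dimensional feature vector x.
GENRES = ["action", "comedy", "drama"]

def encode(item: dict) -> np.ndarray:
    """Phi(item): one-hot genre indicators plus a scaled duration feature."""
    genre_flags = [1.0 if g in item["genres"] else 0.0 for g in GENRES]
    return np.array(genre_flags + [item["duration_min"] / 100.0])

x = encode({"genres": {"action", "drama"}, "duration_min": 125})
# x = [1.0, 0.0, 1.0, 1.25]  (N = 4 features for this toy encoding)
```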
Design matrix \(\mathbf{X}\)
| item \(m\) ↓ \ feature \(n\) → | 1 | 2 | … | \(N\) |
|---|---|---|---|---|
| 1 | \(x^{(1)}_1\) | \(x^{(1)}_2\) | … | \(x^{(1)}_N\) |
| 2 | \(x^{(2)}_1\) | \(x^{(2)}_2\) | … | \(x^{(2)}_N\) |
| ⋮ | ⋮ | ⋮ | ⋱ | ⋮ |
| \(M\) | \(x^{(M)}_1\) | \(x^{(M)}_2\) | … | \(x^{(M)}_N\) |
Ratings \(\mathbf{y}\)
| \(y^{(m)}\) |
|---|
| \(y^{(1)}\) |
| \(y^{(2)}\) |
| ⋮ |
| \(y^{(M)}\) |
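Stacking one encoded vector per item as rows gives \(\mathbf{X}\) (shape \(M \times N\)), with the ratings collected in \(\mathbf{y}\). A minimal NumPy sketch with invented numbers:

```python
import numpy as np

# Hypothetical toy data: M = 4 items, N = 3 features; all values invented.
X = np.array([
    [1.0, 0.0, 2.5],   # x^(1)
    [0.0, 1.0, 1.0],   # x^(2)
    [1.0, 1.0, 0.5],   # x^(3)
    [0.0, 0.0, 3.0],   # x^(4)
])                      # design matrix, shape (M, N)
y = np.array([3.5, 2.0, 4.0, 1.5])   # ratings y^(1)..y^(M), shape (M,)
M, N = X.shape
```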
Learn a predictor \(f\) that maps an \(N\)-dimensional vector representation of an item (a row of \(\mathbf{X}\)) to an output value (the corresponding element of \(\mathbf{y}\)):
\[f\left(\mathbf{x}^{(m)}\right) \rightarrow y^{(m)}\]
Hypothesis, e.g. linear: \(f(\mathbf{x}^{(m)})=\boldsymbol{\theta}^T\mathbf{x}^{(m)}\)
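Applied row-wise, the linear hypothesis is a single matrix-vector product. A sketch continuing the toy arrays above (the zero initialisation of \(\boldsymbol{\theta}\) is an assumption, not part of the notes):

```python
import numpy as np

def predict(X, theta):
    """Linear hypothesis f(x^(m)) = theta^T x^(m), evaluated for every row of X."""
    return X @ theta                 # predictions, shape (M,)

theta = np.zeros(N)                  # assumed starting point, shape (N,)
y_hat = predict(X, theta)
```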
Loss function: \(\mathcal{L}=\sum_{m=1}^{M}\left(y^{(m)}-f(\mathbf{x}^{(m)})\right)^2\)
\(+\) regularisation
Cost function: \[J(\boldsymbol{\theta})=\frac{1}{2M}\sum_{m=1}^{M}\left(y^{(m)}-\boldsymbol{\theta}^T\mathbf{x}^{(m)}\right)^2+\frac{\lambda}{2}\|\boldsymbol{\theta}\|^2\]
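The cost translates directly into code; `lam` stands for the regularisation weight \(\lambda\):

```python
def cost(theta, X, y, lam):
    """J(theta) = (1/2M) * sum_m (y^(m) - theta^T x^(m))^2 + (lam/2) * ||theta||^2."""
    M = X.shape[0]
    r = y - X @ theta                # residuals, shape (M,)
    return (r @ r) / (2 * M) + (lam / 2) * (theta @ theta)
```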
Minimise \(J(\boldsymbol{\theta})\): the cost is quadratic in \(\boldsymbol{\theta}\), so it can be solved analytically by setting \(\nabla J = 0\), or iteratively by gradient descent.
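Both routes as a sketch: setting \(\nabla J = 0\) for the cost above gives the regularised normal equations \((\mathbf{X}^T\mathbf{X} + \lambda M\,\mathbf{I})\,\boldsymbol{\theta} = \mathbf{X}^T\mathbf{y}\), while gradient descent iterates with \(\nabla J = \frac{1}{M}\mathbf{X}^T(\mathbf{X}\boldsymbol{\theta} - \mathbf{y}) + \lambda\boldsymbol{\theta}\). The step size `alpha` and iteration count are illustrative hyperparameters, not values from the notes:

```python
import numpy as np

def fit_analytic(X, y, lam):
    """Closed form: solve (X^T X + lam*M*I) theta = X^T y."""
    M, N = X.shape
    return np.linalg.solve(X.T @ X + lam * M * np.eye(N), X.T @ y)

def fit_gradient_descent(X, y, lam, alpha=0.1, steps=5000):
    """Iterative: theta <- theta - alpha * grad J(theta)."""
    M, N = X.shape
    theta = np.zeros(N)
    for _ in range(steps):
        grad = X.T @ (X @ theta - y) / M + lam * theta
        theta -= alpha * grad
    return theta

# J is convex, so for a small enough step size the two agree
# (lam = 0.1 is a hypothetical choice):
# fit_analytic(X, y, lam=0.1) ≈ fit_gradient_descent(X, y, lam=0.1)
```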