Supervised Learning Setting

Feature representation (encoding) \[\Phi: \text{item}^{(m)} \rightarrow \mathbf{x}^{(m)}=\begin{pmatrix}x_1^{(m)} \\ \vdots \\ x_N^{(m)}\end{pmatrix}\]
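A minimal sketch of such an encoding \(\Phi\) in Python; the genre vocabulary and the item structure are hypothetical, purely for illustration:

```python
# Minimal sketch of a feature encoding Phi: item -> x (hypothetical genre features).
GENRES = ["action", "comedy", "drama"]  # assumed vocabulary, so N = 3

def phi(item):
    """Map an item (here: a dict with a 'genres' set) to an N-dimensional vector."""
    return [1.0 if g in item["genres"] else 0.0 for g in GENRES]

movie = {"title": "Example Movie", "genres": {"action", "drama"}}
print(phi(movie))  # [1.0, 0.0, 1.0]
```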

Training Data

Design matrix \(\mathbf{X}\), with rows indexed by item \(m = 1,\dots,M\) and columns by feature \(n = 1,\dots,N\):

\[\mathbf{X}=\begin{pmatrix}x_1^{(1)} & x_2^{(1)} & \cdots & x_N^{(1)} \\ x_1^{(2)} & x_2^{(2)} & \cdots & x_N^{(2)} \\ \vdots & \vdots & \ddots & \vdots \\ x_1^{(M)} & x_2^{(M)} & \cdots & x_N^{(M)}\end{pmatrix}\]

Ratings \(\mathbf{y}\), one rating per item, aligned with the rows of \(\mathbf{X}\):

\[\mathbf{y}=\begin{pmatrix}y^{(1)} \\ y^{(2)} \\ \vdots \\ y^{(M)}\end{pmatrix}\]
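Stacking the encoded items row-wise yields \(\mathbf{X}\), with the corresponding ratings collected in \(\mathbf{y}\); a minimal NumPy sketch with made-up values:

```python
import numpy as np

# Hypothetical training data: M = 4 items, N = 3 features each.
X = np.array([
    [1.0, 0.0, 1.0],   # x^(1)
    [0.0, 1.0, 0.0],   # x^(2)
    [1.0, 1.0, 0.0],   # x^(3)
    [0.0, 0.0, 1.0],   # x^(4)
])
# One rating per item, aligned with the rows of X.
y = np.array([4.0, 2.0, 3.5, 5.0])

M, N = X.shape  # M = 4, N = 3
```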

Learning Task

Learn a predictor, \(f\), that maps an \(N\)-dimensional vector representation of an item (row in \(\mathbf{X}\)) to an output value (element in \(\mathbf{y}\))

\[f\left(\mathbf{x}^{(m)}\right) \rightarrow y^{(m)}\]

  • \(y^{(m)} \in \{1,2,3,4,5\} \rightarrow\) classification
  • \(y^{(m)} \in \mathbb{R} \rightarrow\) regression
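To make the distinction concrete, a small sketch assuming a 1–5 star scale; rounding a real-valued score is only one simplistic way to obtain class labels, used here purely for illustration:

```python
import numpy as np

def predict_regression(theta, x):
    """Regression: real-valued output, y in R."""
    return float(theta @ x)

def predict_classification(theta, x):
    """Classification (illustrative): snap the score to a star in {1,...,5}."""
    return int(np.clip(np.rint(theta @ x), 1, 5))

theta = np.array([2.0, 1.0, 3.0])  # hypothetical parameters
x = np.array([1.0, 0.0, 1.0])
print(predict_regression(theta, x))      # 5.0
print(predict_classification(theta, x))  # 5
```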

Regression Problem

  • Hypothesis, e.g. linear: \(f(\mathbf{x}^{(m)})=\boldsymbol{\theta}^T\mathbf{x}^{(m)}\)

  • Loss function: \(\mathcal{L}=\sum_{m=1}^{M}\left(y^{(m)}-f(\mathbf{x}^{(m)})\right)^2\)

  • \(+\) regularisation (a penalty on \(\boldsymbol{\theta}\) to avoid overfitting)

  • Cost function: \[J(\boldsymbol{\theta})=\frac{1}{2M}\sum_{m=1}^{M}\left(y^{(m)}-\boldsymbol{\theta}^T\mathbf{x}^{(m)}\right)^2+\frac{\lambda}{2}\|\boldsymbol{\theta}\|^2\]

  • Minimise \(J(\boldsymbol{\theta})\): solve analytically or by gradient descent (see the sketch below)
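A minimal NumPy sketch of both routes for the cost \(J(\boldsymbol{\theta})\) above, assuming the regulariser penalises every component of \(\boldsymbol{\theta}\) exactly as the formula is written:

```python
import numpy as np

def fit_analytic(X, y, lam):
    """Closed-form minimiser of J: solve (X^T X + lam*M*I) theta = X^T y."""
    M, N = X.shape
    return np.linalg.solve(X.T @ X + lam * M * np.eye(N), X.T @ y)

def fit_gradient_descent(X, y, lam, lr=0.1, steps=5000):
    """Gradient descent on J; grad J = -(1/M) X^T (y - X theta) + lam * theta."""
    M, N = X.shape
    theta = np.zeros(N)
    for _ in range(steps):
        grad = -(X.T @ (y - X @ theta)) / M + lam * theta
        theta -= lr * grad
    return theta
```

Because \(J\) is convex, the two routes should agree up to the descent tolerance; with the toy \(\mathbf{X}\) and \(\mathbf{y}\) from the earlier sketch, `fit_analytic(X, y, lam=0.1)` and `fit_gradient_descent(X, y, lam=0.1)` return essentially the same \(\boldsymbol{\theta}\).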

Summary

  • Difficult to design expressive features
  • No ratings or other user information required
  • For personalised recommendations, data from other users is not leveraged
  • No cold-start or sparsity problem
  • New and less well-known items are also recommended
  • The serendipity effect is not really supported