Collaborative Filtering

Rating Matrix

\(\mathbf{Y}: M \text{ users} \times N \text{ items}\)

item n → user m ↓	1	2	\(N\)
1	\(y^{(1)}_1\)	\(y^{(1)}_2\)	\(y^{(1)}_N\)
2	\(y^{(2)}_1\)	\(y^{(2)}_2\)	\(y^{(2)}_N\)
⋮
\(M\)	\(y^{(M)}_1\)	\(y^{(M)}_2\)	\(y^{(M)}_N\)

Matrix Filling Task

Rating matrix is very sparse

item n → user m ↓	1	2	\(N\)
1	5		3
2		1
⋮			2
\(M\)	1

Collaborative Filtering - The Principle

Using the Nearest Neighbour approach

User-based vs. item-based

The Nearest Neighbour Approach

Calculate the unknown rating as the average rating of the other users weighted by similarity

E.g. by cosine similarity

\[s_{m,m'}=\frac{\mathbf{x}^{(m)} \cdot \mathbf{x}^{(m')}}{|\mathbf{x}^{(m)}||\mathbf{x}^{(m')}|}\]

by row → user-based
By columns → item-based

Similarity of Users 1 and 2

Only consider items rated by both users

\[ \begin{align*} s_{1,2} &= \frac{\mathbf{x}^{(1)} \cdot \mathbf{x}^{(2)}}{|\mathbf{x}^{(1)}||\mathbf{x}^{(2)}|} \\ &= \frac{1\cdot 5+3\cdot 4+1 \cdot 4+5 \cdot 1}{\sqrt{1^2+3^2+1^2+5^2}\cdot \sqrt{5^2+4^2+4^2+1^2}} = 0.57 \end{align*} \]

Similarity Matrix

Calculation of an Unknown Rating

Predicted rating for user \(m=2\), item \(n=4\): \[ \begin{align*} \hat{y}_{2,4} &= \frac{1}{s_{2,3}+s_{2,5}+s_{2,6}} \left(s_{2,3}\cdot y_{3,4}+ s_{2,5} \cdot y_{5,4}+s_{2,6}\cdot y_{6,4}\right) \\ &= \frac{1}{0.73+0.65+1.0} \left( 0.73 \cdot 4+0.65 \cdot 5+1.0 \cdot 4 \right) \\ &= 4.27 \end{align*} \]

Matrix Factorisation

For a matrix \(\mathbf{Y}:M\times N\) of rank \(K\) there exist \(\mathbf{U}: N\times K\) and \(\mathbf{V}: M\times K\), such that

\[\mathbf{Y}=\mathbf{U}\mathbf{V}^T\]

Matrix Factorisation

\[\mathbf{Y}=\mathbf{U}\mathbf{V}^T\]

Problem: \(\mathbf{Y}\) is sparse
Approach:
- Calculate \(\mathbf{U}\) and \(\mathbf{V}\) based on available entries in \(\mathbf{Y}\)
- Use \(\mathbf{U}\) and \(\mathbf{V}\) to predict unknown ratings \(\mathbf{\hat{Y}}\)

Factorisation Machines

\[ \begin{align*} \text{Cost: } J(U,V) &= \frac{1}{2}\sum_{(m,n) \text{ where}\\ y_{m,n}\neq 0} \left(y_{m,n}- \left(\mathbf{u}^{(m)}\cdot \mathbf{v}^{(n)}+b_u^{(m)}+b_v^{(n)}\right)\right)^2 \\ &+ \frac{\lambda}{2}\sum||\mathbf{u}^{(m)}||^2 + \frac{\lambda}{2}\sum||\mathbf{v}^{(n)}||^2 \end{align*} \]

Minimise by alternating least squares or stochastic gradient descent

Summary

No item features needed
User ratings required
Current interests infered from historic user behavior
Sparsity
Cold start problems
Users’s range of interests can be expanded