Pranking with Ranking Koby Crammer and Yoram Singer
![Page 1: Pranking with Ranking Koby Crammer and Yoram Singer](https://reader036.fdocuments.net/reader036/viewer/2022062323/56816719550346895ddb8d8c/html5/thumbnails/1.jpg)
Pranking with Ranking
Koby Crammer and Yoram Singer
Lecture: Dudu Yanay
![Page 2: Pranking with Ranking Koby Crammer and Yoram Singer](https://reader036.fdocuments.net/reader036/viewer/2022062323/56816719550346895ddb8d8c/html5/thumbnails/2.jpg)
The Problem
Input: Each instance is associated with a rank or a rating, i.e. an integer from 1 to K.
Goal: To find a rank-prediction rule that assigns to each instance a rank as close as possible to the instance's true rank.
Similar problems:
◦ Classification.
◦ Regression.
![Page 3: Pranking with Ranking Koby Crammer and Yoram Singer](https://reader036.fdocuments.net/reader036/viewer/2022062323/56816719550346895ddb8d8c/html5/thumbnails/3.jpg)
Natural Setting For…
Information retrieval.
Collaborative filtering: Predict a user's rating of new items (books, movies, etc.) given the user's past ratings of similar items.
![Page 4: Pranking with Ranking Koby Crammer and Yoram Singer](https://reader036.fdocuments.net/reader036/viewer/2022062323/56816719550346895ddb8d8c/html5/thumbnails/4.jpg)
Possible Solutions
To cast the rating problem as a regression problem.
To reduce the total order to a set of preferences over pairs.
◦ Time consuming, since it might require increasing the sample size from $n$ to $O(n^2)$.
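The growth from $n$ to $O(n^2)$ can be seen by enumerating the pairs directly. The following is a minimal sketch, not taken from the slides; the function name `preference_pairs` and the sample ranks are hypothetical.

```python
from itertools import combinations

def preference_pairs(ranks):
    """Return index pairs (i, j) where item i is ranked strictly above item j."""
    pairs = []
    for i, j in combinations(range(len(ranks)), 2):
        if ranks[i] > ranks[j]:
            pairs.append((i, j))
        elif ranks[j] > ranks[i]:
            pairs.append((j, i))
        # items with equal ranks express no preference and are skipped
    return pairs

ranks = [3, 1, 2, 5, 4]            # hypothetical: five items with distinct ranks
pairs = preference_pairs(ranks)
print(len(ranks), "items ->", len(pairs), "preference pairs")  # 5 items -> 10 preference pairs
```

With distinct ranks every one of the $n(n-1)/2$ index pairs yields a preference, which is where the quadratic sample size comes from.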
![Page 5: Pranking with Ranking Koby Crammer and Yoram Singer](https://reader036.fdocuments.net/reader036/viewer/2022062323/56816719550346895ddb8d8c/html5/thumbnails/5.jpg)
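Let's try another approach…
Online Algorithm (Littlestone 1988):
◦ Each hypothesis $h_i(x)$ can be computed in polynomial time.
◦ If the problem is separable, then after polynomially many failures ("no" answers) the learner makes no more mistakes. Meaning: $h_i(x) = f(x)$.
[Diagram: Teacher and Learner. The teacher sends $x_1$, the learner answers $h(x_1)$, and the teacher replies yes/no according to $y_1 = f(x_1)$; the exchange repeats with $x_2$, $h(x_2)$ and $y_2 = f(x_2)$.]
Animation from Nader Bshouty's course.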
![Page 6: Pranking with Ranking Koby Crammer and Yoram Singer](https://reader036.fdocuments.net/reader036/viewer/2022062323/56816719550346895ddb8d8c/html5/thumbnails/6.jpg)
The PERCEPTRON algorithm
[Figure: animation in the $(x_1, x_2)$ plane showing the weight vector $(w_1, w_2)$.]
Animation from Nader Bshouty's course.
![Page 7: Pranking with Ranking Koby Crammer and Yoram Singer](https://reader036.fdocuments.net/reader036/viewer/2022062323/56816719550346895ddb8d8c/html5/thumbnails/7.jpg)
The PERCEPTRON algorithm
1) $w_0 \leftarrow 0$; $i \leftarrow 0$;
2) Get $(x_i, y_i)$.
3) Predict $z = \mathrm{sign}(w_i^T x_i)$.
4) If Mistake ($z \ne y_i$):
   4.2) $w_{i+1} \leftarrow w_i + y_i x_i$;
   4.3) $i \leftarrow i + 1$;
5) Goto 2.
A slide from Nader Bshouty's course.
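The steps on this slide can be sketched in Python roughly as follows. This is an illustration under assumptions, not the course's code: the function name `perceptron`, the `max_passes` parameter, the repeated passes over a finite list (the slide describes a stream), and the choice to treat a score of exactly 0 as the label -1 are all mine.

```python
import numpy as np

def perceptron(examples, max_passes=100):
    """Mistake-driven perceptron for labels +1 / -1.

    Returns the final weight vector and the number of mistakes made.
    """
    w = np.zeros(len(examples[0][0]))
    mistakes = 0
    for _ in range(max_passes):
        clean_pass = True
        for x, y in examples:
            x = np.asarray(x, dtype=float)
            z = 1 if w @ x > 0 else -1      # sign(w.x); assumption: a tie counts as -1
            if z != y:                       # mistake: move w toward y * x
                w = w + y * x
                mistakes += 1
                clean_pass = False
        if clean_pass:                       # no mistakes in a full pass: stop
            break
    return w, mistakes

# Hypothetical, linearly separable toy data.
examples = [((1.0, 1.0), 1), ((2.0, 0.5), 1), ((-1.0, -1.0), -1), ((-0.5, -2.0), -1)]
w, mistakes = perceptron(examples)
print(w, mistakes)
```

On this toy data the first example is misclassified by the zero vector, so a single update is made and every later prediction is correct.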
![Page 8: Pranking with Ranking Koby Crammer and Yoram Singer](https://reader036.fdocuments.net/reader036/viewer/2022062323/56816719550346895ddb8d8c/html5/thumbnails/8.jpg)
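Perceptron mistake bound: if $\|w\|_2 = 1$, $|w^T x| \ge \gamma$ and $\|x\|_2 \le R$ for every example $x$, then
$\#\text{Mistakes} \le \left(\frac{R}{\gamma}\right)^2$
[Figure: the $(x_1, x_2)$ plane with the unit vector $w$, the margin $|w^T x|$ and the radius $R$.]
A slide from Nader Bshouty's course.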
![Page 9: Pranking with Ranking Koby Crammer and Yoram Singer](https://reader036.fdocuments.net/reader036/viewer/2022062323/56816719550346895ddb8d8c/html5/thumbnails/9.jpg)
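PRank algorithm - The model
Input: A sequence $(x^1, y^1), (x^2, y^2), \dots, (x^t, y^t), \dots$
◦ $x^i \in \mathbb{R}^n$, $y^i \in \{1, 2, \dots, k\}$ with ">" as the order relation.
Output: A ranking rule $H_{w,b} : \mathbb{R}^n \to \{1, 2, \dots, k\}$ where:
◦ $w \in \mathbb{R}^n$.
◦ $b = (b_1, b_2, \dots, b_{k-1}, b_k)$ with $b_1 \le b_2 \le \dots \le b_{k-1} \le b_k = \infty$.
◦ $H_{w,b}(x) = \min_{r \in \{1, 2, \dots, k\}} \{ r : w \cdot x - b_r < 0 \}$.
Ranking loss after T rounds is: $\sum_{t=1}^{T} |\hat{y}^t - y^t|$, where $y^t$ is the TRUE rank of the instance in round $t$ and $\hat{y}^t = H_{w^t, b^t}(x^t)$.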
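The ranking rule and the ranking loss defined above can be sketched as follows. The function names and the sample weights, thresholds and instances are hypothetical; the list `b` holds only $b_1, \dots, b_{k-1}$, with $b_k = \infty$ handled by the fall-through return.

```python
import numpy as np

def predict_rank(w, b, x):
    """Smallest 1-based rank r with w.x - b_r < 0; b holds b_1..b_{k-1}."""
    score = np.dot(w, x)
    for r, b_r in enumerate(b, start=1):
        if score - b_r < 0:
            return r
    return len(b) + 1            # only b_k = +infinity is above the score

def ranking_loss(predicted, true):
    """Sum over rounds of |predicted rank - true rank|."""
    return sum(abs(p - t) for p, t in zip(predicted, true))

w = [1.0, 0.0]                   # hypothetical weights
b = [-1, 0, 1, 2]                # hypothetical thresholds, so k = 5
xs = [(0.5, 3.0), (5.0, 0.0), (-2.0, 0.0)]
predicted = [predict_rank(w, b, x) for x in xs]
print(predicted)                          # [3, 5, 1]
print(ranking_loss(predicted, [3, 4, 3])) # 3
```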
![Page 10: Pranking with Ranking Koby Crammer and Yoram Singer](https://reader036.fdocuments.net/reader036/viewer/2022062323/56816719550346895ddb8d8c/html5/thumbnails/10.jpg)
PRank algorithm - The update rule
Given an input instance-rank pair $(x, y)$, if $H_{w,b}(x) = y$ then:
◦ $\forall r \in \{1, \dots, y-1\}:\ w \cdot x > b_r$.
◦ $\forall r \in \{y, \dots, k-1\}:\ w \cdot x < b_r$.
Let's represent the above inequalities by the TRUE rank vector $(y_1, \dots, y_{k-1}) = (+1, \dots, +1, -1, \dots, -1)$, where $y_r = +1$ if $r < y$, else $y_r = -1$:
$H_{w,b}(x) = y \iff \forall r:\ y_r (w \cdot x - b_r) > 0$
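The TRUE rank vector and the resulting correctness test can be sketched in a few lines. The names `rank_vector` and `is_correct` and the sample score and thresholds are hypothetical.

```python
def rank_vector(y, k):
    """Entries y_1..y_{k-1}: +1 where r < y, -1 otherwise."""
    return [1 if r < y else -1 for r in range(1, k)]

def is_correct(score, b, y):
    """True when y_r * (score - b_r) > 0 for every threshold b_1..b_{k-1}."""
    y_vec = rank_vector(y, len(b) + 1)
    return all(y_r * (score - b_r) > 0 for y_r, b_r in zip(y_vec, b))

print(rank_vector(3, 5))                     # [1, 1, -1, -1]
print(is_correct(0.5, [-1, 0, 1, 2], y=3))   # True
print(is_correct(0.5, [-1, 0, 1, 2], y=4))   # False
```

Here `score` stands for $w \cdot x$; with the score 0.5 lying between the thresholds 0 and 1, only rank 3 satisfies all $k-1$ inequalities.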
![Page 11: Pranking with Ranking Koby Crammer and Yoram Singer](https://reader036.fdocuments.net/reader036/viewer/2022062323/56816719550346895ddb8d8c/html5/thumbnails/11.jpg)
PRank algorithm - The update rule
Given an input instance-rank pair $(x, y)$, if $H_{w,b}(x) \ne y$ then $\exists r:\ y_r (w \cdot x - b_r) \le 0$.
So, let's "move" the values of $w \cdot x$ and $b_r$ towards each other:
◦ $b_r \leftarrow b_r - y_r$.
◦ $w \leftarrow w + \left(\sum_r y_r\right) x$, where the sum is only over the indices $r$ for which there was a prediction error, i.e., $y_r (w \cdot x - b_r) \le 0$.
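A small worked instance may make the update concrete. The numbers are hypothetical and not from the slides: take $k = 5$, thresholds $b = (-1, 0, 1, 2)$, a score $w \cdot x = 0.5$ and a true rank $y = 5$.

```latex
\begin{align*}
\text{prediction:}\quad & \min\{ r : 0.5 - b_r < 0 \} = 3 \ne 5 \\
(y_1, \dots, y_4) &= (+1, +1, +1, +1) \\
y_r (w \cdot x - b_r) &= (1.5,\ 0.5,\ -0.5,\ -1.5)
  \quad\Rightarrow\quad \text{errors at } r = 3, 4 \\
b &\leftarrow (-1,\ 0,\ 1 - 1,\ 2 - 1) = (-1,\ 0,\ 0,\ 1) \\
w &\leftarrow w + (y_3 + y_4)\, x = w + 2x
\end{align*}
```

After the update the score becomes $0.5 + 2\|x\|^2$ while the two offending thresholds each drop by 1, so the score and the thresholds have moved towards each other.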
![Page 12: Pranking with Ranking Koby Crammer and Yoram Singer](https://reader036.fdocuments.net/reader036/viewer/2022062323/56816719550346895ddb8d8c/html5/thumbnails/12.jpg)
The update rule - Illustration
[Figure: the rank axis 1 to 5, marking the predicted rank and the correct interval.]
![Page 13: Pranking with Ranking Koby Crammer and Yoram Singer](https://reader036.fdocuments.net/reader036/viewer/2022062323/56816719550346895ddb8d8c/html5/thumbnails/13.jpg)
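The PRank algorithm
1) $w^0 \leftarrow 0$; $b^0 \leftarrow (0, \dots, 0, \infty)$;
2) Get $(x^t, y^t)$.
3) Predict $z = \min_{r \in \{1, 2, \dots, k\}} \{ r : w^t \cdot x^t - b^t_r < 0 \}$.
4) If Mistake ($z \ne y^t$), then for $r = 1, \dots, k-1$:
   4.1) Building the TRUE rank vector: if $y^t \le r$ then $y^t_r \leftarrow -1$, else $y^t_r \leftarrow 1$.
   4.2) Checking which threshold prediction is wrong: if $(w^t \cdot x^t - b^t_r)\, y^t_r \le 0$ then $\tau^t_r \leftarrow y^t_r$, else $\tau^t_r \leftarrow 0$.
   Updating the hypothesis:
   4.3) $w^{t+1} \leftarrow w^t + \left(\sum_r \tau^t_r\right) x^t$;
   4.4) $b^{t+1}_r \leftarrow b^t_r - \tau^t_r$;
   4.5) $t \leftarrow t + 1$;
5) Goto 2.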
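The whole loop can be sketched in Python as below. This is an illustrative reading of the slide, not the authors' code: the function name `prank_train`, the single pass over a finite list, and the zero initialization of the first $k-1$ thresholds (with $b_k = \infty$ left implicit) are assumptions.

```python
import numpy as np

def prank_train(examples, k):
    """Run PRank once over (x, y) pairs with y in 1..k.

    Returns (w, b, loss): weights, thresholds b_1..b_{k-1}, cumulative ranking loss.
    """
    w = np.zeros(len(examples[0][0]))
    b = np.zeros(k - 1)                       # b_k = +infinity is implicit
    r = np.arange(1, k)
    loss = 0
    for x, y in examples:
        x = np.asarray(x, dtype=float)
        score = w @ x
        below = np.nonzero(score - b < 0)[0]  # thresholds strictly above the score
        z = int(below[0]) + 1 if below.size else k
        if z != y:
            loss += abs(z - y)
            y_r = np.where(y <= r, -1, 1)                    # TRUE rank vector
            tau = np.where((score - b) * y_r <= 0, y_r, 0)   # wrong thresholds only
            w = w + tau.sum() * x                            # update the weights
            b = b - tau                                      # update the thresholds
    return w, b, loss

# Hypothetical one-dimensional stream with k = 3.
w, b, loss = prank_train([([1.0], 1), ([1.0], 1)], k=3)
print(w, b, loss)
```

In this toy run the first example is predicted as rank 3 instead of 1 (loss 2), both thresholds move up by 1 and the weight moves to -2, and the second, identical example is then ranked correctly.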
![Page 14: Pranking with Ranking Koby Crammer and Yoram Singer](https://reader036.fdocuments.net/reader036/viewer/2022062323/56816719550346895ddb8d8c/html5/thumbnails/14.jpg)
PRank Analysis – Consistent Hypothesis
First, we need to show that the output hypothesis of PRank is acceptable. Meaning, if $w^f$ and $b^f$ are the final ranking rule then $b^f_1 \le b^f_2 \le \dots \le b^f_{k-1}$.
Proof – By induction: Since the initialization of the thresholds is such that $b^0_1 \le b^0_2 \le \dots \le b^0_{k-1}$, it suffices to show that the claim holds inductively.
Lemma 1 (Order Preservation): Let $w^t$ and $b^t$ be the current ranking rule, where $b^t_1 \le b^t_2 \le \dots \le b^t_{k-1}$, and let $(x^t, y^t)$ be an instance-rank pair fed to PRank on round $t$. Denote by $w^{t+1}$ and $b^{t+1}$ the resulting ranking rule after the update of PRank; then $b^{t+1}_1 \le b^{t+1}_2 \le \dots \le b^{t+1}_{k-1}$.
![Page 15: Pranking with Ranking Koby Crammer and Yoram Singer](https://reader036.fdocuments.net/reader036/viewer/2022062323/56816719550346895ddb8d8c/html5/thumbnails/15.jpg)
Lemma 1 – Proof
[Figure: two panels, Option 1 and Option 2, each showing the predicted rank and the correct interval on the rank axis.]
![Page 16: Pranking with Ranking Koby Crammer and Yoram Singer](https://reader036.fdocuments.net/reader036/viewer/2022062323/56816719550346895ddb8d8c/html5/thumbnails/16.jpg)
PRank Analysis – Mistake bound
Theorem 2: Let $(x^1, y^1), \dots, (x^T, y^T)$ be an input sequence for PRank where $x^t \in \mathbb{R}^n$ and $y^t \in \{1, \dots, k\}$. Denote $R^2 = \max_t \|x^t\|^2$. Assume that there is a ranking rule $v^* = (w^*, b^*)$ with $b^*_1 \le b^*_2 \le \dots \le b^*_{k-1}$ of a unit norm that classifies the entire sequence correctly with margin $\gamma = \min_{r,t} \{ (w^* \cdot x^t - b^*_r)\, y^t_r \} > 0$. Then, the rank loss of the algorithm, $\sum_{t=1}^{T} |\hat{y}^t - y^t|$, is at most
$\frac{(k-1)(R^2 + 1)}{\gamma^2}$
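To get a feel for the bound, one can plug in a candidate rule and a small sequence. The sketch below is only illustrative: the function names and the toy data are hypothetical, and it assumes that "unit norm" refers to the concatenated vector $(w^*, b^*)$, so the rule is rescaled before the margin is measured.

```python
import numpy as np

def normalized_margin(w_star, b_star, examples):
    """Margin of (w*, b*) after rescaling the concatenated vector to unit norm."""
    w_star = np.asarray(w_star, dtype=float)
    b_star = np.asarray(b_star, dtype=float)   # b*_1..b*_{k-1}
    norm = np.sqrt(w_star @ w_star + b_star @ b_star)
    w_star, b_star = w_star / norm, b_star / norm
    r = np.arange(1, len(b_star) + 1)
    gamma = np.inf
    for x, y in examples:
        y_r = np.where(y <= r, -1, 1)          # TRUE rank vector for this example
        score = w_star @ np.asarray(x, dtype=float)
        gamma = min(gamma, np.min((score - b_star) * y_r))
    return gamma

def loss_bound(k, r_squared, gamma):
    """(k - 1)(R^2 + 1) / gamma^2 from Theorem 2."""
    return (k - 1) * (r_squared + 1) / gamma ** 2

# Hypothetical separable data in one dimension, k = 3.
examples = [([1.0], 1), ([2.0], 2), ([3.0], 3)]
gamma = normalized_margin([1.0], [1.5, 2.5], examples)
r_squared = max(float(np.dot(x, x)) for x, _ in examples)
print(gamma, loss_bound(3, r_squared, gamma))
```

Rescaling $(w^*, b^*)$ by a positive constant leaves the ranking rule unchanged, which is why normalizing before measuring the margin is harmless.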
![Page 17: Pranking with Ranking Koby Crammer and Yoram Singer](https://reader036.fdocuments.net/reader036/viewer/2022062323/56816719550346895ddb8d8c/html5/thumbnails/17.jpg)
Experiments
Comparison between:
◦ PRank.
◦ MultiClass Perceptron – MCP.
◦ Widrow-Hoff (online regression) – WH.
Datasets:
◦ Synthetic.
◦ EachMovie.
![Page 18: Pranking with Ranking Koby Crammer and Yoram Singer](https://reader036.fdocuments.net/reader036/viewer/2022062323/56816719550346895ddb8d8c/html5/thumbnails/18.jpg)
Synthetic Dataset
Randomly generated points $x = (x_1, x_2) \in [0, 1]^2$, drawn uniformly at random.
Each point was assigned a rank $y \in \{1, \dots, 5\}$ according to:
$y = \max_r \{ r : 10\,(x_1 - 0.5)(x_2 - 0.5) + \varepsilon > b_r \}$, where $b = (-\infty, -1, -0.1, 0.25, 1)$
◦ $\varepsilon \sim N(0, 0.125)$ - noise.
Generated 100 sequences of instance-rank pairs, each of length 7000.
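A generator for one such sequence might look like the sketch below. The function name `make_synthetic` and the `seed` parameter are hypothetical, and the code assumes that 0.125 in $N(0, 0.125)$ is the standard deviation of the noise, which the slide does not state explicitly.

```python
import numpy as np

def make_synthetic(m, seed=0):
    """Draw m points uniformly from [0, 1]^2 and rank them with noisy thresholds."""
    rng = np.random.default_rng(seed)
    b = np.array([-np.inf, -1.0, -0.1, 0.25, 1.0])
    x = rng.uniform(0.0, 1.0, size=(m, 2))
    noise = rng.normal(0.0, 0.125, size=m)    # assumption: 0.125 is the std. dev.
    value = 10 * (x[:, 0] - 0.5) * (x[:, 1] - 0.5) + noise
    # b is sorted, so max{r : value > b_r} equals the number of thresholds below value
    y = (value[:, None] > b[None, :]).sum(axis=1)
    return x, y

x, y = make_synthetic(7000)
print(x.shape, np.bincount(y, minlength=6)[1:])   # counts of ranks 1..5
```

Because $b_1 = -\infty$, every point exceeds at least one threshold, so the ranks always fall in $\{1, \dots, 5\}$.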
![Page 19: Pranking with Ranking Koby Crammer and Yoram Singer](https://reader036.fdocuments.net/reader036/viewer/2022062323/56816719550346895ddb8d8c/html5/thumbnails/19.jpg)
EachMovie Dataset
Collaborative filtering dataset. Contains ratings of movies provided by 61,265 people.
6 possible ratings: 0, 0.2, 0.4, 0.6, 0.8, 1.
Only people with at least 100 ratings were considered.
One person was chosen at random to provide the TRUE ranks, and the other people's ratings were used as features (-0.5, -0.3, -0.1, 0.1, 0.3, 0.5).
![Page 20: Pranking with Ranking Koby Crammer and Yoram Singer](https://reader036.fdocuments.net/reader036/viewer/2022062323/56816719550346895ddb8d8c/html5/thumbnails/20.jpg)
EachMovie Dataset – cont'
Batch setting:
◦ Ran PRank over the training data as an online algorithm and used its last hypothesis to rank the unseen data.
![Page 21: Pranking with Ranking Koby Crammer and Yoram Singer](https://reader036.fdocuments.net/reader036/viewer/2022062323/56816719550346895ddb8d8c/html5/thumbnails/21.jpg)
Thank You
![Page 22: Pranking with Ranking Koby Crammer and Yoram Singer](https://reader036.fdocuments.net/reader036/viewer/2022062323/56816719550346895ddb8d8c/html5/thumbnails/22.jpg)
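PERCEPTRON Theorem: if $\|w\|_2 = 1$, $|w^T x| \ge \gamma$ and $\|x\|_2 \le R$, then $\#\text{Mistakes} \le \left(\frac{R}{\gamma}\right)^2$.
Proof (here $w_t$ is the hypothesis after $t$ mistakes, $\theta_t$ is the angle between $w$ and $w_t$, and $(a_i, y_i)$ is the example that triggers update $i$):
$\cos(\theta_t) = \frac{w^T w_t}{\|w_t\|_2}$
$w^T w_{i+1} = w^T (w_i + y_i a_i) = w^T w_i + y_i (w^T a_i) = w^T w_i + |w^T a_i| \ge w^T w_i + \gamma$
$\Rightarrow\ w^T w_t \ge t\gamma$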
![Page 23: Pranking with Ranking Koby Crammer and Yoram Singer](https://reader036.fdocuments.net/reader036/viewer/2022062323/56816719550346895ddb8d8c/html5/thumbnails/23.jpg)
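PERCEPTRON Theorem: if $\|w\|_2 = 1$, $|w^T x| \ge \gamma$ and $\|x\|_2 \le R$, then $\#\text{Mistakes} \le \left(\frac{R}{\gamma}\right)^2$.
Proof (continued; $w_t$ is the hypothesis after $t$ mistakes and $(a_i, y_i)$ is the example that triggers update $i$):
$\cos(\theta_t) = \frac{w^T w_t}{\|w_t\|_2}$, with $w^T w_t \ge t\gamma$
$\|w_{i+1}\|_2^2 = \|w_i + y_i a_i\|_2^2 = \|w_i\|_2^2 + \|y_i a_i\|_2^2 + 2 y_i (w_i^T a_i) \le \|w_i\|_2^2 + \|a_i\|_2^2 \le \|w_i\|_2^2 + R^2$
$\Rightarrow\ \|w_t\|_2^2 \le t R^2$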
![Page 24: Pranking with Ranking Koby Crammer and Yoram Singer](https://reader036.fdocuments.net/reader036/viewer/2022062323/56816719550346895ddb8d8c/html5/thumbnails/24.jpg)
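PERCEPTRON Theorem: if $\|w\|_2 = 1$, $|w^T x| \ge \gamma$ and $\|x\|_2 \le R$, then $\#\text{Mistakes} \le \left(\frac{R}{\gamma}\right)^2$.
Proof (conclusion; $w_t$ is the hypothesis after $t$ mistakes):
$w^T w_t \ge t\gamma$ and $\|w_t\|_2^2 \le t R^2$, so
$1 \ge \cos(\theta_t) = \frac{w^T w_t}{\|w_t\|_2} \ge \frac{t\gamma}{\sqrt{t}\, R} = \frac{\sqrt{t}\, \gamma}{R}$
$\Rightarrow\ \#\text{Mistakes} = t \le \left(\frac{R}{\gamma}\right)^2$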