Prediction, Learning and Games - Chapter 4: Randomized prediction

Walid Krichene

November 14, 2013
Experts framework

Recall. On iteration $t$:

- experts reveal their advice $f_{i,t}$
- the forecaster makes a decision $p_t = \sum_{i=1}^N w_{i,t} f_{i,t}$
- the losses $\ell(f_{i,t}, y_t)$ and $\ell(p_t, y_t)$ are revealed
- the forecaster updates the weights $w_{i,t+1}$

$\ell(\cdot, y_t)$ is convex.

Definition (Regret).
$$R_{i,T} = L_T - L_{i,T} = \sum_{t=1}^T \ell(p_t, y_t) - \ell(f_{i,t}, y_t)$$

Goal: $R_T / T = o(1)$.
Multiplicative weight algorithms

Definition (Hedge algorithm).
$$w_{i,t+1} \propto w_{i,t} \exp(-\gamma\, \ell(f_{i,t}, y_t))$$

Average regret: $\dfrac{R_T}{T} \le \dfrac{\ln N}{\gamma T} + \dfrac{\gamma}{8}$

Taking $\gamma = O\big(\sqrt{\ln N / T}\big)$ yields $\dfrac{R_T}{T} \le O\big(\sqrt{\ln N / T}\big) + O\big(\sqrt{\ln N / T}\big) = O\big(\sqrt{\ln N / T}\big)$.

Small losses: $\dfrac{R_T}{T} \le \dfrac{1}{T}\Big(\dfrac{\gamma}{1 - e^{-\gamma}} - 1\Big) L_i^* + \dfrac{1}{T}\,\dfrac{\ln N}{1 - e^{-\gamma}}$
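The Hedge update is easy to implement directly. A minimal sketch (the function name is illustrative; losses are assumed to lie in $[0,1]$, and the weights are renormalized each round so they form a probability distribution):

```python
import math

def hedge_weights(losses, gamma):
    """Hedge update: w_{i,t+1} proportional to w_{i,t} * exp(-gamma * loss of expert i).

    `losses` is a list of rounds, each a list of per-expert losses in [0, 1].
    Returns the final weight vector, normalized to a probability distribution."""
    n = len(losses[0])
    w = [1.0 / n] * n                      # uniform initial weights
    for loss in losses:
        w = [wi * math.exp(-gamma * li) for wi, li in zip(w, loss)]
        total = sum(w)
        w = [wi / total for wi in w]       # renormalize each round
    return w

# Expert 0 always suffers loss 0 and expert 1 loss 1, so almost all
# weight concentrates on expert 0 after a few rounds.
w = hedge_weights([[0.0, 1.0]] * 10, gamma=0.5)
```

The forecaster then predicts $p_t = \sum_i w_{i,t} f_{i,t}$ (or, in the randomized setting below, samples $I_t \sim w_t$).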
Motivation for randomization

What if the decision set is non-convex? E.g. $\{0, 1\}$.
Also: any deterministic algorithm can incur $\Omega(T)$ loss.
Solution (to both): randomize.
Setting

- Actions $\{1, 2, \ldots, N\} = [N]$
- the forecaster maintains a probability distribution $(p_{i,t})_{i \in [N]}$
- it randomly picks an action $I_t \sim p_t$
- the losses $\ell(i, y_t)$ are revealed
- the sequence $y_1, \ldots, y_T$ can be fixed a priori (oblivious opponent) or can depend on the player's decisions.
Regret

Definition (Expected loss, conditioned on past plays).
$$\bar{\ell}(p_t, Y_t) = \sum_{i=1}^N \ell(i, Y_t)\, p_{i,t}$$

Definition (Expected regret).
$$R_{i,T} = \sum_{t=1}^T \bar{\ell}(p_t, Y_t) - \ell(i, Y_t)$$
Regret

Note: $p_{t+1}$ only depends on $p_t$ and $\ell(i, Y_t)$ (not on the forecaster's randomization).

Strategy: bound the expected regret
$$\frac{\bar{R}_T}{T} \le B(T)$$
then with high probability ($\ge 1 - \delta$)
$$\frac{R_T}{T} \le B(T) + \sqrt{\frac{-\ln \delta}{T}}$$
Back to experts

Can think of this as:

- the strategy space is the simplex on actions: $\Delta_N$
- every expert has constant advice: $f_{i,t}$ is the $i$-th vertex of the simplex
- the decision of the forecaster is a convex combination of vertices
- the loss function $\ell(\cdot, Y_t)$ is linear: the (expected) loss incurred by the forecaster is
$$\sum_i p_{t,i}\, \ell(f_{i,t}, Y_t) = \ell\Big(\sum_i p_{t,i} f_{i,t},\, Y_t\Big) = \ell(p_t, Y_t)$$

Write the expected loss as $\langle \ell_t, p_t \rangle$ where $\ell_{t,i} = \ell(f_{i,t}, Y_t)$.
Hedge algorithm

Regularized greedy algorithm: Hedge is the solution to
$$p_{t+1} = \arg\min_{p \in \Delta_N} \langle \ell_t, p \rangle + \frac{1}{\gamma} D_{KL}(p \,\|\, p_t)$$
where $D_{KL}$ is the Kullback-Leibler divergence $D_{KL}(p \,\|\, q) = \sum_i p_i \ln \frac{p_i}{q_i}$.

Also
$$p_{t+1} = \arg\min_{p \in \Delta_N} \langle L_t, p \rangle - \frac{1}{\gamma} H(p)$$
where $H$ is the entropy $H(p) = -\sum_i p_i \ln p_i$.

Connection with stochastic optimization (last week).
The greedy strategy is called fictitious play.
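The first characterization can be checked with a short Lagrangian computation (a sketch; stationarity on the simplex recovers the multiplicative update):

```latex
\mathcal{L}(p, \lambda)
  = \langle \ell_t, p \rangle
  + \frac{1}{\gamma} \sum_i p_i \ln \frac{p_i}{p_{t,i}}
  + \lambda \Big( \sum_i p_i - 1 \Big),
\qquad
\frac{\partial \mathcal{L}}{\partial p_i}
  = \ell_{t,i} + \frac{1}{\gamma} \Big( \ln \frac{p_i}{p_{t,i}} + 1 \Big) + \lambda = 0 .
```

Solving for $p_i$ gives $p_i \propto p_{t,i}\, e^{-\gamma \ell_{t,i}}$, and the constraint $\sum_i p_i = 1$ fixes the constant: exactly the Hedge update. The second characterization gives $p_{t+1,i} \propto e^{-\gamma L_{t,i}}$, the same distribution written with cumulative losses.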
Follow the perturbed leader

The perturbed problem is called follow the perturbed leader.
They prove it in a more general case.

Theorem (Theorem 4.2). Let
$$I_t = \arg\min_i \; L_{i,t-1} + Z_{i,t}$$
Then
$$\frac{\bar{R}_T}{T} \le \frac{1}{T} \Big( \mathbb{E} \max_i Z_{i,1} + \mathbb{E} \max_i (-Z_{i,1}) + \sum_{t=1}^T \int F_t(z)\big(f_Z(z) - f_Z(z - \ell_t)\big)\, dz \Big)$$
where $F_t(z) = \ell(i_t(z), y_t)$ and $f_Z$ is the density of $Z$.
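A minimal sketch of the rule $I_t = \arg\min_i L_{i,t-1} + Z_{i,t}$ (the exponential distribution for $Z$ is one common choice here, not prescribed by the slide; the theorem only assumes $Z$ has a density $f_Z$):

```python
import random

def fpl_action(cum_losses, eta, rng=random):
    """Follow the perturbed leader: I_t = argmin_i  L_{i,t-1} + Z_{i,t},
    with fresh i.i.d. perturbations Z_{i,t} drawn each round."""
    perturbed = [L + rng.expovariate(eta) for L in cum_losses]
    return min(range(len(cum_losses)), key=lambda i: perturbed[i])

rng = random.Random(0)
# With a large gap in cumulative losses, the perturbation essentially
# never flips the leader, so action 0 is chosen every round.
picks = [fpl_action([0.0, 100.0], eta=1.0, rng=rng) for _ in range(100)]
```

With comparable cumulative losses the perturbation randomizes the pick, which is what defeats the deterministic $\Omega(T)$ lower bound.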
Internal regret

- regret analyzed so far: external regret. Compare your cumulative loss to the cumulative loss of a single expert / action.
- internal regret: swap all instances of action $i$ with $j$.

Definition (Internal regret).
$$R_{i,j,T} = \sum_{t=1}^T p_{i,t}\big(\ell(i, Y_t) - \ell(j, Y_t)\big)$$

Internal regret: $\max_{i,j} R_{i,j,T}$
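The definition can be computed directly from a play history. A sketch (function and variable names are illustrative):

```python
def internal_regret(ps, losses):
    """R_{i,j,T} = sum_t p_t[i] * (l_t[i] - l_t[j]) for every ordered pair (i, j).

    `ps` is the sequence of distributions p_t, `losses` the loss vectors l_t."""
    n = len(ps[0])
    R = [[0.0] * n for _ in range(n)]
    for p, l in zip(ps, losses):
        for i in range(n):
            for j in range(n):
                if i != j:
                    R[i][j] += p[i] * (l[i] - l[j])
    return R

# Putting half the mass on action 0 while action 1 is strictly better
# gives positive internal regret for the swap 0 -> 1.
R = internal_regret(ps=[[0.5, 0.5]] * 4, losses=[[1.0, 0.0]] * 4)
```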
Internal regret

In a sense, stronger than external regret:
$$\sum_i R_{i,j,T} = \sum_{t=1}^T \sum_i p_{t,i}\,(\ell_{t,i} - \ell_{t,j}) = \sum_{t=1}^T \langle p_t, \ell_t \rangle - \ell_{t,j} = R_{j,T}$$
so $\max_j R_{j,T} \le N \max_{i,j} R_{i,j,T}$.

The weighted average forecaster can have large internal regret: $R_T = \Omega(T)$.
But it can be adapted to have small internal regret.
Minimizing internal regret

Define modified strategies $p_t^{i \to j}$:
$$p_{t,i}^{i \to j} \leftarrow 0, \qquad p_{t,j}^{i \to j} \leftarrow p_{t,i} + p_{t,j}$$

- Apply an algorithm that minimizes external regret to a set of new experts $i \to j$, $i \neq j$: $O(N^2)$ experts for the new algorithm. An action is $p_t^{i \to j}$.
- This results in a sequence of meta-strategies $\mu_t \in \Delta_{N^2}$.
- The fixed point $p_t = \sum_{(i,j)} p_t^{i \to j}\, \mu_{t,(i,j)}$ minimizes internal regret.
- It can be computed using Gaussian elimination (write $p_t = A(\mu_t)\, p_t$).
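The fixed-point step can be sketched as follows. Writing $A(\mu_t) = \sum_{(i,j)} \mu_{t,(i,j)}\, S^{i \to j}$, with $S^{i \to j}$ the linear map realizing $p \mapsto p^{i \to j}$, is one reading of the slide's $p_t = A(\mu_t) p_t$; the construction and the elimination details below are illustrative, not the book's code:

```python
def swap_matrix(mu, n):
    """A(mu) = sum_{(i,j)} mu[(i,j)] * S^{i->j}: S^{i->j} keeps every
    coordinate k != i and sends the mass of coordinate i onto j, so
    A(mu) p equals the mu-mixture of the modified strategies p^{i->j}."""
    A = [[0.0] * n for _ in range(n)]
    for (i, j), m in mu.items():
        for k in range(n):
            if k != i:
                A[k][k] += m      # coordinates other than i are unchanged
        A[j][i] += m              # the mass of i is moved onto j
    return A

def fixed_point(A):
    """Solve p = A p with sum(p) = 1 by Gaussian elimination on (I - A) p = 0,
    replacing the last equation with the normalization constraint."""
    n = len(A)
    M = [[(1.0 if r == c else 0.0) - A[r][c] for c in range(n)] for r in range(n)]
    M[-1] = [1.0] * n                          # normalization row
    b = [0.0] * (n - 1) + [1.0]
    for col in range(n):                       # forward elimination, partial pivoting
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n):
                M[r][c] -= f * M[col][c]
            b[r] -= f * b[col]
    p = [0.0] * n
    for r in range(n - 1, -1, -1):             # back substitution
        s = sum(M[r][c] * p[c] for c in range(r + 1, n))
        p[r] = (b[r] - s) / M[r][r]
    return p

# All meta-weight on the single swap 0 -> 1 forces all mass onto action 1.
p = fixed_point(swap_matrix({(0, 1): 1.0}, n=2))
```

Since each $S^{i \to j}$ preserves total mass and $\mu_t$ sums to one, $A(\mu_t)$ is a stochastic matrix and the fixed point is its stationary distribution.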
Generalized regret

- experts react to the prediction: $f_{i,t}(I_t)$
- also, an activation function $A_{i,t}(k)$

Definition (Generalized regret).
$$r_{i,t} = \sum_{k=1}^N p_{t,k}\, A_{i,t}(k)\big(\ell(k, Y_t) - \ell(f_{i,t}(k), Y_t)\big)$$

External regret is a special case: set $A_{i,t}(k) = 1$ identically, and $f_{i,t}(k) = i$.
Internal regret is a special case: consider experts $(i, j)$, $i \neq j$, and set
$$f_{(i,j),t}(k) = \begin{cases} k & \text{if } k \neq i \\ j & \text{otherwise} \end{cases}$$
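Both special cases can be checked numerically. A sketch with illustrative numbers (the distribution and losses are made up for the example):

```python
def generalized_regret_step(p, loss, A, f):
    """One-round generalized regret: sum_k p[k] * A(k) * (loss[k] - loss[f(k)])."""
    return sum(p[k] * A(k) * (loss[k] - loss[f(k)]) for k in range(len(p)))

p, loss = [0.2, 0.3, 0.5], [0.9, 0.1, 0.4]

# External regret against expert i = 1: A == 1 identically and f(k) = 1,
# which reduces to <p, loss> - loss[1].
ext = generalized_regret_step(p, loss, A=lambda k: 1, f=lambda k: 1)

# Internal regret for the pair (i, j) = (0, 1): A == 1 and f(k) = k except
# f(0) = 1; only the k = 0 term survives, giving p[0] * (loss[0] - loss[1]).
intl = generalized_regret_step(p, loss, A=lambda k: 1, f=lambda k: 1 if k == 0 else k)
```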
Generalized regret

Given a potential $\phi$, there exists a forecaster that satisfies the Blackwell condition (Theorem 4.3).

Recall the Blackwell condition:
$$\langle r_t, \nabla\phi(R_{t-1}) \rangle \le 0$$

From the Blackwell condition one can derive a bound on the regret.
Next week

Chapter 5 (Efficient forecasting for special classes of experts)
or Chapter 6 (Limited information: multi-armed bandit versions)