An introduction to Gaussian processes for probabilistic inference Dr. Richard E. Turner ([email protected]) Computational and Biological Learning Lab, Department of Engineering, University of Cambridge

Page 1: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

An introduction to Gaussian processes for probabilistic inference

Dr. Richard E. Turner ([email protected])

Computational and Biological Learning Lab, Department of Engineering, University of Cambridge

Motivation: non-linear regression

[Figure, built up over five slides; one build marks a point with "?".]

Can we do this with a plain old Gaussian?

Gaussian distribution

$$p(y \mid \Sigma) \propto \exp\left(-\tfrac{1}{2}\, y^{\top} \Sigma^{-1} y\right)$$

$$\Sigma = \begin{bmatrix} 1 & 0.7 \\ 0.7 & 1 \end{bmatrix}$$

[Figure: axes y1 and y2, each running from −3 to 3.]
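As a rough sketch of what the formula on this slide computes, the snippet below evaluates the unnormalised density for the 2×2 covariance shown. The function name and the two test points are illustrative assumptions, not part of the slides, and the closed-form inverse used here holds only for this particular form of Σ with |ρ| < 1.

```python
import math


def unnormalised_density(y1, y2, rho):
    """Return exp(-1/2 * y^T Sigma^{-1} y) for Sigma = [[1, rho], [rho, 1]]."""
    # For this 2x2 Sigma the inverse is [[1, -rho], [-rho, 1]] / (1 - rho**2),
    # so the quadratic form can be written out by hand.
    quad = (y1 * y1 - 2.0 * rho * y1 * y2 + y2 * y2) / (1.0 - rho * rho)
    return math.exp(-0.5 * quad)


rho = 0.7  # off-diagonal entry of Sigma from the slide

# Two hypothetical points at the same distance from the origin.
same_sign = unnormalised_density(1.0, 1.0, rho)
opposite_sign = unnormalised_density(1.0, -1.0, rho)

# With a positive off-diagonal entry, points where y1 and y2 agree in sign
# receive more density than points where they disagree.
print(same_sign > opposite_sign)
```

The comparison illustrates why a positive off-diagonal entry produces contours elongated along the y1 = y2 direction.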

Gaussian distribution

$$p(y \mid \Sigma) \propto \exp\left(-\tfrac{1}{2}\, y^{\top} \Sigma^{-1} y\right)$$

$$\Sigma = \begin{bmatrix} 1 & 0.6 \\ 0.6 & 1 \end{bmatrix}$$

[Figure: axes y1 and y2, each running from −3 to 3.]

Gaussian distribution

$$p(y \mid \Sigma) \propto \exp\left(-\tfrac{1}{2}\, y^{\top} \Sigma^{-1} y\right)$$

$$\Sigma = \begin{bmatrix} 1 & 0.4 \\ 0.4 & 1 \end{bmatrix}$$

[Figure: axes y1 and y2, each running from −3 to 3.]

Gaussian distribution

$$p(y \mid \Sigma) \propto \exp\left(-\tfrac{1}{2}\, y^{\top} \Sigma^{-1} y\right)$$

$$\Sigma = \begin{bmatrix} 1 & 0.1 \\ 0.1 & 1 \end{bmatrix}$$

[Figure: axes y1 and y2, each running from −3 to 3.]

Gaussian distribution

$$p(y \mid \Sigma) \propto \exp\left(-\tfrac{1}{2}\, y^{\top} \Sigma^{-1} y\right)$$

$$\Sigma = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

[Figure: axes y1 and y2, each running from −3 to 3.]
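The slides step the off-diagonal entry of Σ through 0.7, 0.6, 0.4, 0.1 and 0. One way to see what that entry controls is to draw samples for each value and estimate the correlation between y1 and y2. The sketch below does this with the standard library only; the helper names, sample size and seed are assumptions made for illustration.

```python
import math
import random


def sample_pairs(rho, n, rng):
    """Draw n pairs (y1, y2) from a zero-mean Gaussian with Sigma = [[1, rho], [rho, 1]]."""
    scale = math.sqrt(1.0 - rho * rho)
    pairs = []
    for _ in range(n):
        y1 = rng.gauss(0.0, 1.0)
        # y2 has unit variance and covariance rho with y1.
        y2 = rho * y1 + scale * rng.gauss(0.0, 1.0)
        pairs.append((y1, y2))
    return pairs


def sample_correlation(pairs):
    """Pearson correlation coefficient of a list of (a, b) pairs."""
    n = len(pairs)
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    return cov / math.sqrt(var_a * var_b)


rng = random.Random(0)  # fixed seed so the run is repeatable
estimates = {
    rho: sample_correlation(sample_pairs(rho, 20000, rng))
    for rho in (0.7, 0.6, 0.4, 0.1, 0.0)
}
for rho, estimate in estimates.items():
    print(f"off-diagonal {rho:.1f} -> sample correlation {estimate:.2f}")
```

Because both diagonal entries are 1, the off-diagonal entry is exactly the correlation, so the estimates should sit close to the values on the slides and approach zero for the identity covariance.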

Gaussian distribution

$$p(y_2 \mid y_1, \Sigma) \propto \exp\left(-\tfrac{1}{2}\,(y_2 - \mu_*)\,\Sigma_*^{-1}\,(y_2 - \mu_*)\right)$$

[Figure: axes y1 and y2, each running from −3 to 3.]
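The slide does not spell out µ∗ and Σ∗. For a zero-mean Gaussian the standard conditioning result gives µ∗ = Σ21 Σ11⁻¹ y1 and Σ∗ = Σ22 − Σ21 Σ11⁻¹ Σ12, which for Σ = [[1, ρ], [ρ, 1]] reduces to µ∗ = ρ y1 and Σ∗ = 1 − ρ². The sketch below applies this, assuming the 0.7 covariance from the earlier slides and an arbitrary observed value y1 = 1; the function name is hypothetical.

```python
def condition_on_y1(sigma, y1):
    """Mean and variance of y2 given y1 for a zero-mean 2-D Gaussian with covariance sigma."""
    s11, s12 = sigma[0]
    s21, s22 = sigma[1]
    mu_star = s21 / s11 * y1           # Sigma_21 Sigma_11^{-1} y1
    var_star = s22 - s21 * s12 / s11   # Sigma_22 - Sigma_21 Sigma_11^{-1} Sigma_12
    return mu_star, var_star


sigma = [[1.0, 0.7], [0.7, 1.0]]  # assumed: the covariance from the earlier slides
mu_star, var_star = condition_on_y1(sigma, 1.0)  # y1 = 1.0 is an arbitrary illustrative value
print(mu_star, var_star)
```

The conditional variance does not depend on the observed value of y1, only on Σ, and it is smaller than the prior variance of y2 whenever the off-diagonal entry is non-zero.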

New visualisation

$$\Sigma = \begin{bmatrix} 1 & 0.9 \\ 0.9 & 1 \end{bmatrix}$$

[Figure, left panel: axes y1 and y2, each running from −3 to 3. Right panel: y (from −3 to 3) plotted against variable index (1, 2).]
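Judging from the axis labels, the right-hand panel re-plots a single draw (y1, y2) as two points, with the variable index on the horizontal axis and the value on the vertical axis. A minimal sketch of that re-indexing follows; the function names and the seed are assumptions, and the plotting itself is left out so that the example needs only the standard library.

```python
import math
import random


def draw_pair(rho, rng):
    """One draw (y1, y2) from a zero-mean Gaussian with Sigma = [[1, rho], [rho, 1]]."""
    y1 = rng.gauss(0.0, 1.0)
    y2 = rho * y1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
    return [y1, y2]


def as_index_series(sample):
    """Pair each component with its 1-based variable index: (index, value) points."""
    return [(i + 1, value) for i, value in enumerate(sample)]


rng = random.Random(1)  # fixed seed so the run is repeatable
for _ in range(3):
    sample = draw_pair(0.9, rng)
    # Left panel: the sample is one point (y1, y2).
    # Right panel: the same sample is the two points (1, y1) and (2, y2).
    print([round(v, 2) for v in sample], "->",
          [(i, round(v, 2)) for i, v in as_index_series(sample)])
```

With an off-diagonal entry of 0.9 the two values in each draw tend to be close, so the segment joining the two points in the index view is usually nearly flat.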

New visualisation

$$\Sigma = \begin{bmatrix} 1 & 0.9 & 0.8 & 0.6 & 0.4 \\ 0.9 & 1 & 0.9 & 0.8 & 0.6 \\ 0.8 & 0.9 & 1 & 0.9 & 0.8 \\ 0.6 & 0.8 & 0.9 & 1 & 0.9 \\ 0.4 & 0.6 & 0.8 & 0.9 & 1 \end{bmatrix}$$

[Figure, left panel: axes y1 and y5, each running from −3 to 3. Right panel: y (from −3 to 3) plotted against variable index (1 to 5).]
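To make the five-variable case concrete, the sketch below draws one sample from a zero-mean Gaussian with the 5×5 covariance above and lists it against the variable index. It uses a hand-written Cholesky factorisation so that only the standard library is needed; this is one common way to sample, not necessarily how the slides were produced, and the function names and seed are assumptions.

```python
import math
import random

# Covariance matrix from the slide.
SIGMA = [
    [1.0, 0.9, 0.8, 0.6, 0.4],
    [0.9, 1.0, 0.9, 0.8, 0.6],
    [0.8, 0.9, 1.0, 0.9, 0.8],
    [0.6, 0.8, 0.9, 1.0, 0.9],
    [0.4, 0.6, 0.8, 0.9, 1.0],
]


def cholesky(a):
    """Lower-triangular L with L L^T = a, for a symmetric positive-definite matrix a."""
    n = len(a)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(a[i][i] - s)
            else:
                lower[i][j] = (a[i][j] - s) / lower[j][j]
    return lower


def sample(lower, rng):
    """One draw y = L z with z ~ N(0, I), so that the covariance of y is L L^T."""
    z = [rng.gauss(0.0, 1.0) for _ in lower]
    return [sum(lower[i][k] * z[k] for k in range(i + 1)) for i in range(len(lower))]


L = cholesky(SIGMA)
rng = random.Random(0)  # fixed seed so the run is repeatable
y = sample(L, rng)

# The "variable index" view: one value per index, 1 to 5.
for index, value in enumerate(y, start=1):
    print(index, round(value, 2))
```

Neighbouring indices have covariance 0.9 while the first and last have only 0.4, so adjacent values in a draw tend to be close and the sequence looks like a smooth curve sampled at five points.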

Page 102: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y5

1 2 3 4 5−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 .9 .8 .6 .4

.9 1 .9 .8 .6

.8 .9 1 .9 .8

.6 .8 .9 1 .9

.4 .6 .8 .9 1

Page 103: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y5

1 2 3 4 5−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 .9 .8 .6 .4

.9 1 .9 .8 .6

.8 .9 1 .9 .8

.6 .8 .9 1 .9

.4 .6 .8 .9 1

Page 104: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

New visualisation

−2 0 2−3

−2

−1

0

1

2

3

y1

y5

1 2 3 4 5−3

−2

−1

0

1

2

3

variable index

y

Σ =

1 .9 .8 .6 .4

.9 1 .9 .8 .6

.8 .9 1 .9 .8

.6 .8 .9 1 .9

.4 .6 .8 .9 1

Page 105: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

New visualisation

[Figure: left panel with axes y1 and y20, each spanning −3 to 3; right panel plots y (−3 to 3) against variable index 1 to 20; Σ is shown as a 20 × 20 image with row and column ticks at 1, 10, 20.]

Page 145: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

Regression using Gaussians

[Figure: y (−3 to 3) plotted against variable index 1 to 20; Σ is shown as a 20 × 20 image with row and column ticks at 1, 10, 20.]

Page 148: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

Regression using Gaussians

[Figure: y (−3 to 3) plotted against variable index 1 to 20; Σ is shown as a 20 × 20 image with row and column ticks at 1, 10, 20.]

$\Sigma(x_1, x_2) = K(x_1, x_2) + I\sigma_y^2$

$K(x_1, x_2) = \sigma^2 \exp\left(-\frac{1}{2l^2}(x_1 - x_2)^2\right)$
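The two formulas above translate directly into code. The following is a minimal sketch, assuming the inputs are the 20 variable indices and using illustrative hyper-parameter values (`sigma`, `ell`, `sigma_y` are hypothetical names and settings, not values from the slides).

```python
import numpy as np

def se_kernel(x1, x2, sigma=1.0, ell=1.0):
    """K(x1, x2) = sigma^2 * exp(-(x1 - x2)^2 / (2 * ell^2)), for all pairs."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    diff = x1[:, None] - x2[None, :]
    return sigma**2 * np.exp(-0.5 * (diff / ell) ** 2)

def build_covariance(x, sigma=1.0, ell=1.0, sigma_y=0.1):
    """Sigma = K + I * sigma_y^2 evaluated on the inputs x."""
    x = np.asarray(x, dtype=float)
    return se_kernel(x, x, sigma, ell) + sigma_y**2 * np.eye(len(x))

# Assumption: the inputs are the variable indices 1..20.
x = np.arange(1, 21)
Sigma = build_covariance(x, sigma=1.0, ell=3.0, sigma_y=0.1)
```

Entries of `Sigma` decay smoothly away from the diagonal, which is the banded structure one would expect in the 20 × 20 image.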

Page 149: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

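Regression: probabilistic inference in function space

[Figure: plot with y axis from −3 to 3; Σ is shown as an image with row and column ticks at 1, 10, 20.]

$\Sigma(x_1, x_2) = K(x_1, x_2) + I\sigma_y^2$

$K(x_1, x_2) = \sigma^2 \exp\left(-\frac{1}{2l^2}(x_1 - x_2)^2\right)$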

Page 150: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

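Regression: probabilistic inference in function space

[Figure: plot with y axis from −3 to 3; Σ is shown as an image with row and column ticks at 1, 10, 20.]

Non-parametric (∞-parametric)

$p(y \mid \theta) = \mathcal{N}(0, \Sigma)$

$\Sigma(x_1, x_2) = K(x_1, x_2) + I\sigma_y^2$

$K(x_1, x_2) = \sigma^2 \exp\left(-\frac{1}{2l^2}(x_1 - x_2)^2\right)$

Parametric model

$y(x) = f(x; \theta) + \sigma_y \varepsilon$

$\varepsilon \sim \mathcal{N}(0, 1)$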

Page 151: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

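Regression: probabilistic inference in function space

[Figure: plot with y axis from −3 to 3, annotated "function estimate with uncertainty"; Σ is shown as an image with row and column ticks at 1, 10, 20.]

Non-parametric (∞-parametric)

$p(y \mid \theta) = \mathcal{N}(0, \Sigma)$

$\Sigma(x_1, x_2) = K(x_1, x_2) + I\sigma_y^2$

$K(x_1, x_2) = \sigma^2 \exp\left(-\frac{1}{2l^2}(x_1 - x_2)^2\right)$

Parametric model

$y(x) = f(x; \theta) + \sigma_y \varepsilon$

$\varepsilon \sim \mathcal{N}(0, 1)$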

Page 152: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

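Regression: probabilistic inference in function space

[Figure: plot with y axis from −3 to 3; Σ is shown as an image with row and column ticks at 1, 10, 20.]

Non-parametric (∞-parametric)

$p(y \mid \theta) = \mathcal{N}(0, \Sigma)$

$\Sigma(x_1, x_2) = K(x_1, x_2) + I\sigma_y^2$

$K(x_1, x_2) = \sigma^2 \exp\left(-\frac{1}{2l^2}(x_1 - x_2)^2\right)$

Annotation on the equations: noise ($\sigma_y^2$)

Parametric model

$y(x) = f(x; \theta) + \sigma_y \varepsilon$

$\varepsilon \sim \mathcal{N}(0, 1)$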

Page 153: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

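Regression: probabilistic inference in function space

[Figure: plot with y axis from −3 to 3; Σ is shown as an image with row and column ticks at 1, 10, 20.]

Non-parametric (∞-parametric)

$p(y \mid \theta) = \mathcal{N}(0, \Sigma)$

$\Sigma(x_1, x_2) = K(x_1, x_2) + I\sigma_y^2$

$K(x_1, x_2) = \sigma^2 \exp\left(-\frac{1}{2l^2}(x_1 - x_2)^2\right)$

Annotations on the equations: noise ($\sigma_y^2$), horizontal-scale ($l$)

Parametric model

$y(x) = f(x; \theta) + \sigma_y \varepsilon$

$\varepsilon \sim \mathcal{N}(0, 1)$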

Page 154: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

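Regression: probabilistic inference in function space

[Figure: plot with y axis from −3 to 3; Σ is shown as an image with row and column ticks at 1, 10, 20.]

Non-parametric (∞-parametric)

$p(y \mid \theta) = \mathcal{N}(0, \Sigma)$

$\Sigma(x_1, x_2) = K(x_1, x_2) + I\sigma_y^2$

$K(x_1, x_2) = \sigma^2 \exp\left(-\frac{1}{2l^2}(x_1 - x_2)^2\right)$

Annotations on the equations: noise ($\sigma_y^2$), horizontal-scale ($l$), vertical-scale ($\sigma^2$)

Parametric model

$y(x) = f(x; \theta) + \sigma_y \varepsilon$

$\varepsilon \sim \mathcal{N}(0, 1)$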

Page 155: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

Mathematical Foundations: Definition

Gaussian process = generalization of multivariate Gaussian distribution to infinitely many variables.

Definition: a Gaussian process is a collection of random variables, any finite number of which have (consistent) Gaussian distributions.

A Gaussian distribution is fully specified by a mean vector, $\mu$, and covariance matrix $\Sigma$:

$\mathbf{f} = (f_1, \ldots, f_n) \sim \mathcal{N}(\mu, \Sigma)$, indices $i = 1, \ldots, n$

A Gaussian process is fully specified by a mean function $m(\mathbf{x})$ and covariance function $K(\mathbf{x}, \mathbf{x}')$:

$f(\mathbf{x}) \sim \mathcal{GP}(m(\mathbf{x}), K(\mathbf{x}, \mathbf{x}'))$, indices $\mathbf{x}$

Page 156: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

Mathematical Foundations: Marginalisation

Q1. A GP is "like" a Gaussian distribution with an infinitely long mean vector and an "infinite by infinite" covariance matrix, so how do we represent it on a computer?

Page 157: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

Mathematical Foundations: Marginalisation

Q1. A GP is "like" a Gaussian distribution with an infinitely long mean vector and an "infinite by infinite" covariance matrix, so how do we represent it on a computer?

We are saved by the marginalisation property:

Page 161: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

Mathematical Foundations: Marginalisation

Q1. A GP is "like" a Gaussian distribution with an infinitely long mean vector and an "infinite by infinite" covariance matrix, so how do we represent it on a computer?

We are saved by the marginalisation property:

Only need to represent finite dimensional projections of GPs on computer.

Page 162: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

Mathematical Foundations: Marginalisation

Q1. A GP is "like" a Gaussian distribution with an infinitely long mean vector and an "infinite by infinite" covariance matrix, so how do we represent it on a computer?

We are saved by the marginalisation property:

Q2. Why do we use covariances and not precisions to express modelling choices?

Only need to represent finite dimensional projections of GPs on computer.

Page 166: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

Mathematical Foundations: Marginalisation

Q1. A GP is "like" a Gaussian distribution with an infinitely long mean vector and an "infinite by infinite" covariance matrix, so how do we represent it on a computer?

We are saved by the marginalisation property:

Q2. Why do we use covariances and not precisions to express modelling choices?

Only need to represent finite dimensional projections of GPs on computer.

not the same

Page 167: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

Mathematical Foundations: Marginalisation

Q1. A GP is "like" a Gaussian distribution with an infinitely long mean vector and an "infinite by infinite" covariance matrix, so how do we represent it on a computer?

We are saved by the marginalisation property:

Q2. Why do we use covariances and not precisions to express modelling choices?

Only need to represent finite dimensional projections of GPs on computer.

Entries in a precision matrix depend on what other data we are considering

not the same

Page 168: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

Mathematical Foundations: Marginalisation

Q1. A GP is "like" a Gaussian distribution with an infinitely long mean vector and an "infinite by infinite" covariance matrix, so how do we represent it on a computer?

We are saved by the marginalisation property:

Q2. Why do we use covariances and not precisions to express modelling choices?

Only need to represent finite dimensional projections of GPs on computer.

Precisions tell us about the conditional independence relationships between pairs of variables (is a given pair independent given all other variables?).

Entries in a precision matrix depend on what other data we are considering

not the same
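Both answers can be checked numerically. The sketch below uses the standard marginalisation property of Gaussians (the marginal over a subset of variables has the corresponding sub-block of the covariance) and a squared-exponential kernel with illustrative settings; the input locations and hyper-parameters are assumptions chosen for the example.

```python
import numpy as np

def se_kernel(x1, x2, sigma=1.0, ell=1.0):
    """Squared-exponential covariance between all pairs of inputs."""
    diff = x1[:, None] - x2[None, :]
    return sigma**2 * np.exp(-0.5 * (diff / ell) ** 2)

# Hypothetical inputs: two points, then the same two plus a third.
x_small = np.array([0.0, 1.0])
x_large = np.array([0.0, 1.0, 2.0])

K_small = se_kernel(x_small, x_small)
K_large = se_kernel(x_large, x_large)

# Q1: marginalising out the third variable leaves the covariance of the
# first two unchanged, so a finite set of inputs can be handled in isolation.
cov_same = np.allclose(K_large[:2, :2], K_small)

# Q2: the precision (inverse covariance) of the first two variables
# changes once a third variable is considered.
P_small = np.linalg.inv(K_small)
P_large = np.linalg.inv(K_large)
prec_same = np.allclose(P_large[:2, :2], P_small)
```

Here `cov_same` is true and `prec_same` is false: covariance entries depend only on the pair of inputs, whereas precision entries depend on which other variables are included.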

Page 169: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

Mathematical Foundations: Regression

Q3. What's the formal justification for how we were using GPs for regression?

Page 170: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

Mathematical Foundations: Regression

Q3. What's the formal justification for how we were using GPs for regression?

Generative model (like non-linear regression)

Page 172: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

Mathematical Foundations: Regression

Q3. What's the formal justification for how we were using GPs for regression?

Generative model (like non-linear regression)

place GP prior over the non-linear function

(smoothly wiggling functions expected)

Page 173: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

Mathematical Foundations: Regression

Q3. What's the formal justification for how we were using GPs for regression?

Generative model (like non-linear regression)

place GP prior over the non-linear function (smoothly wiggling functions expected)

since the sum of two Gaussians is a Gaussian, the model induces a GP over the observations y(x)
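The argument can be written out as a short derivation. This is a sketch assuming a zero-mean GP prior and independent Gaussian noise, which matches the earlier expressions $p(y \mid \theta) = \mathcal{N}(0, \Sigma)$ and $\Sigma = K + I\sigma_y^2$; the Kronecker delta notation is introduced here for illustration.

```latex
\begin{align*}
y(x) &= f(x) + \sigma_y \varepsilon, & \varepsilon &\sim \mathcal{N}(0, 1) \\
f(x) &\sim \mathcal{GP}\bigl(0, K(x, x')\bigr) \\
\Rightarrow \quad y(x) &\sim \mathcal{GP}\bigl(0, K(x, x') + \sigma_y^2 \, \delta_{x, x'}\bigr)
\end{align*}
```

Evaluated at a finite set of inputs, the last line gives the covariance matrix $\Sigma = K + I\sigma_y^2$.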

Page 174: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

Mathematical foundations: Prediction

Q4. How do we make predictions?

Page 179: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

Mathematical foundations: Prediction

Q4. How do we make predictions?

linear in the data

predictive mean

Page 180: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

Mathematical foundations: Prediction

Q4. How do we make predictions?

prior uncertainty

predictive uncertainty

reduction inuncertainty

linear in the data

predictive mean predictive covariance

predictions more confident than prior
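The prediction equations themselves did not survive extraction, so the following is only a minimal sketch of the standard Gaussian conditioning result that the annotations describe, not a reproduction of the slide. It assumes the kernel K(x1, x2) = σ² exp(−(x1 − x2)²/(2l²)) and noise level σ_y from the hyper-parameter slides; the function names `se_kernel` and `gp_predict`, the toy data, and all hyper-parameter values are hypothetical.

```python
import numpy as np

def se_kernel(xa, xb, sigma=1.0, length=1.0):
    """Kernel matrix K(xa_i, xb_j) = sigma^2 exp(-(xa_i - xb_j)^2 / (2 length^2))."""
    diff = xa[:, None] - xb[None, :]
    return sigma**2 * np.exp(-0.5 * diff**2 / length**2)

def gp_predict(x_train, y_train, x_test, sigma=1.0, length=1.0, sigma_y=0.1):
    """Predictive mean and covariance of f at x_test given noisy observations."""
    # Covariance of the observed data: K + I * sigma_y^2.
    K_xx = se_kernel(x_train, x_train, sigma, length) + sigma_y**2 * np.eye(len(x_train))
    K_sx = se_kernel(x_test, x_train, sigma, length)   # test-vs-train covariance
    K_ss = se_kernel(x_test, x_test, sigma, length)    # prior uncertainty at x_test
    # Solve linear systems instead of forming an explicit inverse.
    mean = K_sx @ np.linalg.solve(K_xx, y_train)       # linear in the data
    reduction = K_sx @ np.linalg.solve(K_xx, K_sx.T)   # reduction in uncertainty
    cov = K_ss - reduction                             # predictive uncertainty
    return mean, cov

# Hypothetical toy data, chosen only to exercise the function.
x_train = np.array([-2.0, -0.5, 1.0, 2.5])
y_train = np.array([0.3, -0.8, 1.1, 0.2])
x_test = np.linspace(-3.0, 3.0, 7)
mean, cov = gp_predict(x_train, y_train, x_test)
```

Because the mean is a fixed matrix applied to `y_train`, scaling the data scales the prediction, and because `reduction` is positive semi-definite, the predictive variances cannot exceed the prior variances.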

What effect do the hyper-parameters have?

[Figure: sequence of panels plotting y against x, tick labels from −3 to 3, with the covariance matrix Σ displayed alongside]

Non-parametric (∞-parametric)

$p(y \mid \theta) = \mathcal{N}(0, \Sigma)$

$\Sigma(x_1, x_2) = K(x_1, x_2) + I \sigma_y^2$

$K(x_1, x_2) = \sigma^2 \exp\left(-\frac{1}{2 l^2} (x_1 - x_2)^2\right)$

Parametric model

$y(x) = f(x; \theta) + \sigma_y \epsilon, \quad \epsilon \sim \mathcal{N}(0, 1)$
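The non-parametric model above can be sketched numerically: build Σ from the kernel and draw data sets y from N(0, Σ). This is an illustrative sketch only. The input grid, the hyper-parameter values, the random seed and the helper name `noisy_covariance` are assumptions, not values taken from the slides.

```python
import numpy as np

def noisy_covariance(x, sigma=1.0, length=1.0, sigma_y=0.1):
    """Sigma = K + I * sigma_y^2, with K(x1, x2) = sigma^2 exp(-(x1 - x2)^2 / (2 length^2))."""
    diff = x[:, None] - x[None, :]
    K = sigma**2 * np.exp(-0.5 * diff**2 / length**2)
    return K + sigma_y**2 * np.eye(len(x))

rng = np.random.default_rng(0)          # arbitrary fixed seed for repeatability
x = np.linspace(-3.0, 3.0, 50)          # hypothetical input grid
Sigma = noisy_covariance(x, sigma=1.0, length=1.0, sigma_y=0.1)

# Draw three data sets y ~ N(0, Sigma) using a Cholesky factor of Sigma.
L = np.linalg.cholesky(Sigma)
samples = L @ rng.standard_normal((len(x), 3))
```

Each column of `samples` is one draw of y at the grid points. Changing `sigma`, `length` or `sigma_y` and redrawing is one way to explore what each hyper-parameter does.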

What effect do the hyper-parameters have?

[Figure: sequence of panels plotting y against x, tick labels from −3 to 3, with the covariance matrix Σ displayed alongside]

short horizontal length-scale

Non-parametric (∞-parametric)

$p(y \mid \theta) = \mathcal{N}(0, \Sigma)$

$\Sigma(x_1, x_2) = K(x_1, x_2) + I \sigma_y^2$

$K(x_1, x_2) = \sigma^2 \exp\left(-\frac{1}{2 l^2} (x_1 - x_2)^2\right)$

Parametric model

$y(x) = f(x; \theta) + \sigma_y \epsilon, \quad \epsilon \sim \mathcal{N}(0, 1)$

What effect do the hyper-parameters have?

[Figure: sequence of panels plotting y against x, tick labels from −3 to 3, with the covariance matrix Σ displayed alongside]

long horizontal length-scale

Non-parametric (∞-parametric)

$p(y \mid \theta) = \mathcal{N}(0, \Sigma)$

$\Sigma(x_1, x_2) = K(x_1, x_2) + I \sigma_y^2$

$K(x_1, x_2) = \sigma^2 \exp\left(-\frac{1}{2 l^2} (x_1 - x_2)^2\right)$

Parametric model

$y(x) = f(x; \theta) + \sigma_y \epsilon, \quad \epsilon \sim \mathcal{N}(0, 1)$

Page 232: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

long horizontal length-scale

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f(x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2 exp(− 1

2l2(x1 − x2)2

)K(x1, x2) = σ2 exp

(− 1

2l2(x1 − x2)2

)

Page 233: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

long horizontal length-scale

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f(x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2 exp(− 1

2l2(x1 − x2)2

)K(x1, x2) = σ2 exp

(− 1

2l2(x1 − x2)2

)

Page 234: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

long horizontal length-scale

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f(x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2 exp(− 1

2l2(x1 − x2)2

)K(x1, x2) = σ2 exp

(− 1

2l2(x1 − x2)2

)

Page 235: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

long horizontal length-scale

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f(x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2 exp(− 1

2l2(x1 − x2)2

)K(x1, x2) = σ2 exp

(− 1

2l2(x1 − x2)2

)

Page 236: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

long horizontal length-scale

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f(x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2 exp(− 1

2l2(x1 − x2)2

)K(x1, x2) = σ2 exp

(− 1

2l2(x1 − x2)2

)

Page 237: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

long horizontal length-scale

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f(x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2 exp(− 1

2l2(x1 − x2)2

)K(x1, x2) = σ2 exp

(− 1

2l2(x1 − x2)2

)

Page 238: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

long horizontal length-scale

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f(x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2 exp(− 1

2l2(x1 − x2)2

)K(x1, x2) = σ2 exp

(− 1

2l2(x1 − x2)2

)

Page 239: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

long horizontal length-scale

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f(x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2 exp(− 1

2l2(x1 − x2)2

)K(x1, x2) = σ2 exp

(− 1

2l2(x1 − x2)2

)

Page 240: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

long horizontal length-scale

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f(x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2 exp(− 1

2l2(x1 − x2)2

)K(x1, x2) = σ2 exp

(− 1

2l2(x1 − x2)2

)

Page 241: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

long horizontal length-scale

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f(x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2 exp(− 1

2l2(x1 − x2)2

)K(x1, x2) = σ2 exp

(− 1

2l2(x1 − x2)2

)

Page 242: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

long horizontal length-scale

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f(x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2 exp(− 1

2l2(x1 − x2)2

)K(x1, x2) = σ2 exp

(− 1

2l2(x1 − x2)2

)

Page 243: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

What effect do the hyper-parameters have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Σ(x1, x2) = K(x1, x2) + Iσ2y

long horizontal length-scale

p(y|θ) = N (0,Σ)

Non-parametric (∞-parametric) Parametric model

y(x) = f(x; θ) + σyε

ε ∼ N (0, 1)K(x1, x2) = σ2 exp(− 1

2l2(x1 − x2)2

)K(x1, x2) = σ2 exp

(− 1

2l2(x1 − x2)2

)

Page 244: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

What effect do the hyper-parameters have?

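[Figure: panel plotting y against x, with x from −3 to 3. Label: small vertical scale]

Non-parametric (∞-parametric):

$p(y \mid \theta) = \mathcal{N}(0, \Sigma)$

$\Sigma(x_1, x_2) = K(x_1, x_2) + I\sigma_y^2$

$K(x_1, x_2) = \sigma^2 \exp\left(-\frac{1}{2l^2}(x_1 - x_2)^2\right)$

$\Sigma =$ [matrix not recovered from the slide]

Parametric model:

$y(x) = f(x; \theta) + \sigma_y \epsilon$, $\epsilon \sim \mathcal{N}(0, 1)$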

Page 265: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

What effect do the hyper-parameters have?

$K(x_1, x_2) = \sigma^2 \exp\left(-\frac{1}{2l^2}(x_1 - x_2)^2\right)$

• Hyper-parameters have a strong effect

– $l$ controls the horizontal length-scale

– $\sigma^2$ controls the vertical scale of the data

• ⇒ we need automatic ways of learning the hyper-parameters from data

[Figure: panels plotting y against x, with x from −3 to 3. Labels: l = short, l = medium, l = long]
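The two roles of the hyper-parameters can be read directly off the kernel: K(x, x) = σ² sets the prior variance at any point, and K(x, x + d) / K(x, x) = exp(−d² / (2l²)) sets how strongly values a distance d apart are correlated. A small sketch, assuming NumPy, with the helper name `prior_summary` and the chosen values of l being illustrative only:

```python
import numpy as np

def se_kernel(x1, x2, sigma=1.0, length_scale=1.0):
    """Squared-exponential kernel evaluated for all pairs of inputs."""
    diff = np.asarray(x1)[:, None] - np.asarray(x2)[None, :]
    return sigma**2 * np.exp(-0.5 * diff**2 / length_scale**2)

def prior_summary(sigma, length_scale, distance=1.0):
    """Prior std of f(x) and correlation between f(x) and f(x + distance)."""
    k = se_kernel([0.0, distance], [0.0, distance], sigma, length_scale)
    std = np.sqrt(k[0, 0])      # vertical scale: depends only on sigma
    corr = k[0, 1] / k[0, 0]    # horizontal scale: depends only on l
    return std, corr

# Hypothetical short, medium, and long length-scales.
for l in (0.3, 1.0, 3.0):
    std, corr = prior_summary(sigma=1.0, length_scale=l)
    print(f"l = {l}: std = {std:.2f}, correlation at distance 1 = {corr:.3f}")
```

A short length-scale makes points one unit apart nearly independent, while a long one makes them almost perfectly correlated. Changing `sigma` rescales the standard deviation and leaves the correlation untouched.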

Page 269: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

How do we choose the hyper-parameters?

idea: use probability distributions to represent plausibility of hyper-parameters (uncertainty) given the data

(Bayes' Rule) [equation not recovered from the slide; its terms are annotated as follows]

– what we know after seeing the data

– (likelihood) what the data tell us = how well did θ predict the data we observed = likelihood of the parameters

– (prior) what we knew before seeing the data
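The equation on this slide did not survive extraction. Assuming it is the standard form of Bayes' rule applied to the hyper-parameters θ and the data y, with the likelihood p(y | θ) = N(0, Σ) used on the earlier slides, it would read:

```latex
p(\theta \mid y) = \frac{p(y \mid \theta)\, p(\theta)}{p(y)},
\qquad
p(y) = \int p(y \mid \theta)\, p(\theta)\, \mathrm{d}\theta
```

Here p(θ | y) is what we know after seeing the data, p(y | θ) is the likelihood of the parameters, and p(θ) is the prior. The normalising constant p(y) does not depend on θ, so comparing hyper-parameter settings only needs the product of likelihood and prior.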

Page 270: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

How do we choose the hyper-parameters?

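[Figure, left panel: axes y1 and y2, each from −2 to 2; labels "data" and "contours of" (the rest of the second label is not recoverable). Right panel: log-likelihood (vertical axis, −3.5 to −2) against length-scale, l (horizontal axis, 0.5 to 5).]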
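A curve of log-likelihood against length-scale, like the one in the right-hand panel, could be computed along the following lines. This is a sketch assuming NumPy and the zero-mean model p(y | θ) = N(0, Σ) with Σ = K + σ_y² I. The two input locations, the observed values (y1, y2), the fixed σ and σ_y, and the helper name `log_likelihood` are all made up for illustration.

```python
import numpy as np

def log_likelihood(y, x, length_scale, sigma=1.0, sigma_y=0.1):
    """log N(y; 0, Sigma) with Sigma = K + sigma_y^2 * I and a squared-exponential K."""
    diff = x[:, None] - x[None, :]
    cov = sigma**2 * np.exp(-0.5 * diff**2 / length_scale**2)
    cov = cov + sigma_y**2 * np.eye(len(x))
    chol = np.linalg.cholesky(cov)                 # cov = L @ L.T
    alpha = np.linalg.solve(chol, y)               # L^{-1} y, so alpha @ alpha = y^T cov^{-1} y
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))  # log |cov|
    n = len(y)
    return -0.5 * alpha @ alpha - 0.5 * log_det - 0.5 * n * np.log(2.0 * np.pi)

x = np.array([0.0, 1.0])   # hypothetical input locations
y = np.array([0.8, 0.6])   # hypothetical observations (y1, y2)

# Evaluate the log-likelihood on a grid of length-scales and pick the maximiser.
length_scales = np.linspace(0.5, 5.0, 46)
curve = np.array([log_likelihood(y, x, l) for l in length_scales])
best = length_scales[np.argmax(curve)]
```

`best` is the grid point with the highest log-likelihood, which is a maximum-likelihood estimate of l restricted to the grid. Multiplying the likelihood by a prior over l before maximising, or normalising the product, would follow the Bayesian recipe from the preceding slides.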

Page 289: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

How do we choose the hyper-parameters?

−2 −1 0 1 2−2

−1.5

−1

−0.5

0

0.5

1

1.5

2

y1

y2

0.5 1 1.5 2 2.5 3 3.5 4 4.5 5

−3.5

−3

−2.5

−2

length−scale, l

log

−lik

elih

oo

d

Page 290: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

How do we choose the hyper-parameters?

−2 −1 0 1 2−2

−1.5

−1

−0.5

0

0.5

1

1.5

2

y1

y2

0.5 1 1.5 2 2.5 3 3.5 4 4.5 5

−3.5

−3

−2.5

−2

length−scale, l

log

−lik

elih

oo

d

Page 291: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

How do we choose the hyper-parameters?

[Figure, built up over pages 291 to 313: data plotted as y against x (x from 0 to 20, y from −0.5 to 3), first alone, then annotated "length scale l = 1", then shown alongside a panel of log-likelihood against length-scale l (l from 2 to 20, log-likelihood from −10 to 20).]
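The log-likelihood curve plotted against the length-scale can be reproduced in outline with a few lines of NumPy. This is a minimal sketch rather than the code behind the slides: it assumes a zero-mean GP with a squared-exponential covariance and Gaussian observation noise, and the data, the noise variance and the names `sq_exp_kernel` and `log_marginal_likelihood` are all made up for illustration.

```python
import numpy as np

def sq_exp_kernel(x1, x2, length_scale, signal_var=1.0):
    """Squared-exponential covariance between two 1-D input arrays (assumed kernel)."""
    diff = x1[:, None] - x2[None, :]
    return signal_var * np.exp(-0.5 * diff**2 / length_scale**2)

def log_marginal_likelihood(x, y, length_scale, signal_var=1.0, noise_var=0.1):
    """log p(y | x, l) for a zero-mean GP with Gaussian observation noise."""
    n = len(x)
    K = sq_exp_kernel(x, x, length_scale, signal_var) + noise_var * np.eye(n)
    L = np.linalg.cholesky(K)                            # K = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # alpha = K^{-1} y
    return (-0.5 * y @ alpha
            - np.sum(np.log(np.diag(L)))                 # equals -0.5 * log|K|
            - 0.5 * n * np.log(2.0 * np.pi))

# Toy data, invented for this sketch (not the data shown on the slides).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 20.0, 25)
y = np.sin(0.5 * x) + 0.3 * rng.standard_normal(x.size)

# Evaluate the log-likelihood on a grid of length-scales and pick the maximiser.
length_scales = np.linspace(0.5, 20.0, 40)
log_liks = np.array([log_marginal_likelihood(x, y, l) for l in length_scales])
best_l = length_scales[np.argmax(log_liks)]
```

Plotting `log_liks` against `length_scales` gives a curve of the same kind as the right-hand panel; `best_l` is the maximum-likelihood length-scale on this grid.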

Page 314: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

How do we choose the hyper-parameters?

[Figure, built up over pages 314 to 316: three panels plotted against length-scale l (l from 2 to 20, vertical axes from 0 to 1), showing the likelihood, the prior and the posterior.]
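The likelihood, prior and posterior panels are related by Bayes' rule, with the posterior over the length-scale proportional to likelihood times prior. A minimal grid-based sketch follows; the likelihood curve and the prior are hypothetical stand-ins, and `grid_posterior` is a name introduced here, not something from the slides.

```python
import numpy as np

def grid_posterior(log_lik, prior, grid):
    """Normalised posterior density on a uniform 1-D grid: p(l | y) ∝ p(y | l) p(l)."""
    log_post = log_lik + np.log(prior)
    log_post -= log_post.max()            # avoid underflow before exponentiating
    post = np.exp(log_post)
    step = grid[1] - grid[0]              # uniform grid spacing
    return post / (post.sum() * step)     # Riemann-sum normalisation

# Hypothetical inputs: a log-likelihood peaked at l = 8 and a prior favouring short l.
grid = np.linspace(2.0, 20.0, 181)
log_lik = -0.5 * ((grid - 8.0) / 2.0) ** 2
prior = np.exp(-grid / 10.0)              # unnormalised; the constant cancels

posterior = grid_posterior(log_lik, prior, grid)
posterior_mode = grid[np.argmax(posterior)]
```

Because this prior decays with l, the posterior mode sits slightly below the likelihood peak.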

Page 317: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

Why does Bayesian inference work?

[Figure, built up over pages 317 to 325: likelihood against length-scale l (l from 2 to 20, likelihood from 0 to 1), annotated with a "complex" model that fits every training point, a straight-line-like "simple" model, and the "best" model. A second sketch, over all possible datasets, marks the "complex", "best" and "simple" models.]

Why does Bayesian inference work? Occam’s Razor.

Page 326: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

How do we make predictions now?

$$p(y' \mid y_{1:N}, M) = \int d\theta \; p(y' \mid y_{1:N}, \theta, M)\, p(\theta \mid y_{1:N}, M)$$

• for every setting of the parameters...

• make a prediction for the testing points (as before), $p(y' \mid y_{1:N}, \theta, M)$

• weight the prediction by the probability of $\theta$ under the posterior, $p(\theta \mid y_{1:N}, M)$

• sum up for each parameter setting
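The recipe in the bullets above can be sketched numerically by replacing the integral with a sum over a grid of parameter settings. This is an illustration under stated assumptions, not the method used for the slides: θ is taken to be a single length-scale, the kernel is squared-exponential with unit signal variance, the prior over the grid is flat, and the data and the helper `fit_and_predict` are invented.

```python
import numpy as np

def kernel(a, b, l):
    """Squared-exponential covariance with unit signal variance (assumed)."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * d**2 / l**2)

def fit_and_predict(x, y, x_new, l, noise_var=0.1):
    """Return log p(y | l) and the predictive mean and variance at x_new."""
    K = kernel(x, x, l) + noise_var * np.eye(len(x))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    log_lik = (-0.5 * y @ alpha - np.log(np.diag(L)).sum()
               - 0.5 * len(x) * np.log(2.0 * np.pi))
    Ks = kernel(x_new, x, l)
    mean = Ks @ alpha
    v = np.linalg.solve(L, Ks.T)
    var = 1.0 + noise_var - np.sum(v**2, axis=0)   # prior variance k(x*, x*) = 1
    return log_lik, mean, var

# Toy data, invented for this sketch.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([0.1, 0.9, 0.8, 0.2, -0.7])
x_new = np.array([2.5])

# For every parameter setting: predict, then weight by the posterior.
grid = np.linspace(0.5, 5.0, 46)
results = [fit_and_predict(x, y, x_new, l) for l in grid]
log_liks = np.array([r[0] for r in results])
means = np.array([r[1] for r in results])        # shape (46, 1)
variances = np.array([r[2] for r in results])    # shape (46, 1)

weights = np.exp(log_liks - log_liks.max())      # flat prior, so posterior ∝ likelihood
weights /= weights.sum()

# Moments of the resulting mixture of Gaussian predictions.
mix_mean = weights @ means
mix_var = weights @ (variances + means**2) - mix_mean**2
```

The averaged variance `mix_var` is never smaller than the weighted average of the per-setting variances, because disagreement between the per-setting means adds to it.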

Only need two rules for Bayesian computation: the product and sum rules

$$p(x, y) = p(x \mid y)\, p(y) = p(y \mid x)\, p(x), \qquad p(x) = \sum_{y} p(x, y)$$
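A small discrete example may make the two rules concrete. The joint table below is hypothetical, chosen only so that its entries sum to one.

```python
import numpy as np

# Hypothetical joint distribution p(x, y): rows index x (2 values), columns index y (3 values).
joint = np.array([[0.10, 0.20, 0.10],
                  [0.30, 0.05, 0.25]])

# Sum rule: marginalise out the other variable.
p_x = joint.sum(axis=1)
p_y = joint.sum(axis=0)

# Product rule rearranged: conditional = joint / marginal.
p_x_given_y = joint / p_y            # column j holds p(x | y = j)
p_y_given_x = joint / p_x[:, None]   # row i holds p(y | x = i)
```

Multiplying either conditional by its marginal recovers `joint`, which is the product rule read in both directions.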

Page 328: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

What effect does the form of the covariance function have?

[Figure: sample functions plotted as y against x (y from −3 to 3), with the covariance matrix Σ shown alongside. Panels are added one at a time, up to ten, over pages 328 to 337 and again over pages 338 to 348.]

Laplacian covariance function

Brownian motion

Ornstein-Uhlenbeck

$$K(x_1, x_2) = \sigma^2 \exp\left(-\frac{1}{2l^2}\,|x_1 - x_2|\right)$$
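Sample functions like those in the panels can be drawn by building the covariance matrix Σ on a grid of inputs and multiplying its Cholesky factor by standard normal noise. The sketch below uses the kernel exactly as written on the slide; the input grid, the hyper-parameter values, the small jitter term and the name `laplacian_kernel` are assumptions made for illustration.

```python
import numpy as np

def laplacian_kernel(x1, x2, sigma2=1.0, l=1.0):
    """K(x1, x2) = sigma^2 * exp(-|x1 - x2| / (2 l^2)), as written on the slide."""
    return sigma2 * np.exp(-np.abs(x1[:, None] - x2[None, :]) / (2.0 * l**2))

x = np.linspace(0.0, 10.0, 200)          # assumed input grid
Sigma = laplacian_kernel(x, x)

# Small diagonal jitter (an assumption) keeps the factorisation numerically stable.
L = np.linalg.cholesky(Sigma + 1e-10 * np.eye(x.size))

rng = np.random.default_rng(1)
samples = L @ rng.standard_normal((x.size, 5))   # each column is one draw from N(0, Sigma)
```

Draws from this covariance are continuous but rough, in contrast to the smooth draws a squared-exponential covariance would give.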

Page 349: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

What effect does the form of the covariance function have?

[Figure: sample functions plotted as y against x (y from −3 to 3), with the covariance matrix Σ shown alongside. Panels are added one at a time, up to ten, over pages 349 to 358; the build restarts on page 359.]

Rational Quadratic

$$K(x_1, x_2) = \sigma^2 \left(1 + \frac{1}{2\alpha l^2}\,|x_1 - x_2|\right)^{-\alpha}$$

Page 361: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

K(x1, x2) = σ2(

1 + 1

2αl2|x1 − x2|

)−α

Rational Quadratic

Page 362: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

K(x1, x2) = σ2(

1 + 1

2αl2|x1 − x2|

)−α

Rational Quadratic

Page 363: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

K(x1, x2) = σ2(

1 + 1

2αl2|x1 − x2|

)−α

Rational Quadratic

Page 364: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

K(x1, x2) = σ2(

1 + 1

2αl2|x1 − x2|

)−α

Rational Quadratic

Page 365: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

K(x1, x2) = σ2(

1 + 1

2αl2|x1 − x2|

)−α

Rational Quadratic

Page 366: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Rational Quadratic

K(x1, x2) = σ2(

1 + 1

2αl2|x1 − x2|

)−α

Page 367: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Rational Quadratic

K(x1, x2) = σ2(

1 + 1

2αl2|x1 − x2|

)−α

Page 368: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Rational Quadratic

K(x1, x2) = σ2(

1 + 1

2αl2|x1 − x2|

)−α

Page 369: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

What effect does the form of the covariance function have?

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

−3

−2

−1

0

1

2

3

x

y

Σ =

Rational Quadratic

K(x1, x2) = σ2(

1 + 1

2αl2|x1 − x2|

)−α

Page 370: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

What effect does the form of the covariance function have?

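[Figure: panel plotting y against x, axis ticks from −3 to 3, shown with the covariance matrix Σ]

Periodic: sinusoid × squared exponential

$$K(x_1, x_2) = \sigma^2 \cos\big(\omega (x_1 - x_2)\big) \exp\left(-\frac{1}{2 l^2}(x_1 - x_2)^2\right)$$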
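A short sketch of the sinusoid × squared exponential covariance above, again drawing samples on a grid. The name `periodic_kernel`, the default hyperparameters and the grid are assumptions made for illustration. Because the squared exponential factor makes Σ numerically close to singular, this sketch samples through an eigendecomposition with small negative eigenvalues clipped to zero rather than through a Cholesky factor.

```python
import numpy as np

def periodic_kernel(x1, x2, sigma=1.0, omega=2.0 * np.pi, length=1.0):
    """Cosine times squared exponential covariance for all pairs in x1 and x2."""
    d = x1[:, None] - x2[None, :]                # pairwise x1 - x2
    return sigma**2 * np.cos(omega * d) * np.exp(-d**2 / (2.0 * length**2))

x = np.linspace(-3.0, 3.0, 100)
Sigma = periodic_kernel(x, x)

# Sample from N(0, Sigma) using Sigma = V diag(w) V^T.
eigvals, eigvecs = np.linalg.eigh(Sigma)
eigvals = np.clip(eigvals, 0.0, None)            # remove tiny negative round-off
rng = np.random.default_rng(0)
z = rng.standard_normal((x.size, 3))
samples = ((eigvecs * np.sqrt(eigvals)) @ z).T   # shape (3, 100)
```

Here `omega` sets the oscillation frequency and `length` sets how quickly the oscillation decorrelates with distance.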

Page 391: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

The covariance function has a large effect

[Figure: panels plotting y against x, axis ticks from −3 to 3, grouped under the labels OU, RQ and periodic]

Page 395: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

The covariance function has a large effect

Health warnings: hard to compute (need approximations); often results are very sensitive to the priors.

[Figure: panels plotting y against x, axis ticks from −3 to 3, grouped under the labels OU, RQ and periodic]

Bayesian model comparison: prior over models, marginal likelihood
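The equation these two labels annotate did not survive extraction. As a sketch, the standard form of Bayesian model comparison is given below; the notation is assumed here rather than taken from the slide, with M_i a candidate covariance function (for example OU, RQ or periodic), θ its hyperparameters and y the observed data.

$$p(M_i \mid y) = \frac{p(y \mid M_i)\, p(M_i)}{\sum_j p(y \mid M_j)\, p(M_j)}, \qquad p(y \mid M_i) = \int p(y \mid \theta, M_i)\, p(\theta \mid M_i)\, d\theta$$

Here p(M_i) is the prior over models and p(y | M_i) is the marginal likelihood. The integral over θ is what generally requires approximation, and its value depends on the hyperparameter prior p(θ | M_i), which is one way to read the two health warnings.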

Page 396: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

Basis function view of Gaussian processes

[Figure: three panels over x from −10 to 10: coefficients, basis functions, and f(x)]

$$\gamma \sim \mathcal{N}(0, 1)$$
$$g(x) = e^{-\frac{1}{l^2}(x - \mu)^2}$$
$$f(x) = \gamma\, g(x)$$
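A short worked step, not on the slide, showing what this construction implies: scaling a fixed basis function by a standard Gaussian coefficient gives function values that are jointly Gaussian with zero mean, and their covariance follows from E[γ] = 0 and E[γ^2] = 1.

$$\mathbb{E}[f(x)] = g(x)\, \mathbb{E}[\gamma] = 0$$
$$\operatorname{Cov}\big(f(x), f(x')\big) = g(x)\, g(x')\, \mathbb{E}[\gamma^2] = g(x)\, g(x')$$

With a single basis function this covariance has rank one, so every sample is a rescaled copy of g.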

Page 406: An introduction to Gaussian processes for ... - gpss.ccgpss.cc/gpss13/assets/Sheffield-GPSS2013-Turner.pdf · An introduction to Gaussian processes for probabilistic inference Dr.

Basis function view of Gaussian processes

[Figure: three panels over x from −10 to 10: coefficients, basis functions, and f(x)]

$$\gamma_k \sim \mathcal{N}(0, 1)$$
$$g_k(x) = \frac{1}{\sqrt{K}}\, e^{-\frac{1}{l^2}(x - k/K)^2}$$
$$f(x) = \sum_{k=1}^{K} \gamma_k\, g_k(x)$$
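The construction above can be sketched in a few lines of NumPy: build the basis functions g_k on a grid, draw the coefficients γ_k, and sum. The helper name `basis_matrix`, the grid, and the values of K and l are assumptions chosen for illustration; the centres k/K follow the formula as written on the slide. The last line computes the covariance implied by independent standard Gaussian coefficients, Cov(f(x), f(x')) = Σ_k g_k(x) g_k(x').

```python
import numpy as np

def basis_matrix(x, K, length=1.0):
    """Return the (len(x), K) matrix with entries g_k(x) for k = 1..K."""
    centres = np.arange(1, K + 1) / K            # centres k/K, as in the formula
    diff = x[:, None] - centres[None, :]
    return np.exp(-(diff**2) / length**2) / np.sqrt(K)

x = np.linspace(-10.0, 10.0, 201)
K = 10
Phi = basis_matrix(x, K)

rng = np.random.default_rng(0)
gamma = rng.standard_normal(K)                   # gamma_k ~ N(0, 1)
f = Phi @ gamma                                  # f(x) = sum_k gamma_k g_k(x)

Sigma = Phi @ Phi.T                              # implied covariance of f on the grid
```

Since Σ is built from K basis functions its rank is at most K, which is one way to see why a finite basis restricts the functions the model can represent.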


Basis function view of Gaussian processes

[Figure: three stacked panels over x ∈ [−10, 10]: coefficients (axis −2 to 2), basis functions (axis 0 to 0.1), and the resulting function f(x) (axis −0.5 to 0.5).]

$\gamma_k \sim \mathcal{N}(0, 1)$

$g_k(x) = \frac{1}{\sqrt{K}} e^{-\frac{1}{l^2}(x - k/K)^2}$

$f(x) = \sum_{k=1}^{K} \gamma_k g_k(x)$


Basis function view of Gaussian processes

[Figure: sample function f(x) plotted against x over [−10, 10]; f(x) axis from −0.5 to 0.5.]

$\gamma_k \sim \mathcal{N}(0, 1)$

$g_k(x) = \frac{1}{\sqrt{K}} e^{-\frac{1}{l^2}(x - k/K)^2}$

$f(x) = \sum_{k=1}^{K} \gamma_k g_k(x)$

$m(x) = \langle f(x) \rangle = \left\langle \sum_k \gamma_k g_k(x) \right\rangle = \sum_k \langle \gamma_k \rangle g_k(x) = 0$

$K(x, x') = \langle f(x) f(x') \rangle$

$= \left\langle \sum_k \gamma_k g_k(x) \sum_{k'} \gamma_{k'} g_{k'}(x') \right\rangle$

$= \sum_{k,k'} \langle \gamma_k \gamma_{k'} \rangle g_k(x) g_{k'}(x')$

$= \sum_{k,k'} \delta_{k,k'} g_k(x) g_{k'}(x')$

$= \sum_k g_k(x) g_k(x')$

$= \frac{1}{K} \sum_k e^{-\frac{1}{l^2}(x - k/K)^2 - \frac{1}{l^2}(x' - k/K)^2}$

$K(x, x') \longrightarrow \int du \, e^{-\frac{1}{l^2}(x - u)^2 - \frac{1}{l^2}(x' - u)^2}$ as $K \to \infty$

$K(x, x') \propto e^{-\frac{1}{2 l^2}(x - x')^2}$

Gaussian processes ≡ models with ∞ parameters
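The limiting argument can be checked numerically with a rough sketch. One assumption differs from the slide: instead of centres k/K, the basis functions are placed on an evenly spaced grid u_k over a wide interval, each weighted by the grid spacing, so that the sum approximates the integral over all u. Completing the square in u gives the closed form sqrt(π l²/2) · exp(−(x − x′)²/(2 l²)), which the finite sum should approach. The function names, the interval half-width and the number of centres are all illustrative choices.

```python
import math

def finite_basis_cov(x, xp, l, half_width=20.0, n_centres=4001):
    """sum_k du * exp(-(x-u_k)^2/l^2 - (xp-u_k)^2/l^2) on an even grid of centres u_k."""
    du = 2 * half_width / (n_centres - 1)
    total = 0.0
    for k in range(n_centres):
        u = -half_width + k * du
        total += math.exp(-((x - u) ** 2 + (xp - u) ** 2) / l ** 2)
    return total * du

def se_limit(x, xp, l):
    """Closed form of the integral: sqrt(pi l^2 / 2) * exp(-(x - xp)^2 / (2 l^2))."""
    return math.sqrt(math.pi * l ** 2 / 2) * math.exp(-((x - xp) ** 2) / (2 * l ** 2))

l = 1.5
for x, xp in [(0.0, 0.0), (0.0, 1.0), (-2.0, 1.0)]:
    print(x, xp, finite_basis_cov(x, xp, l), se_limit(x, xp, l))
```

The two printed columns should agree closely, and the dependence on x and x′ only through x − x′ shows that the limiting covariance is stationary.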


Basis function view of Gaussian Processes

[Diagram, built up over several slides, relating basis functions to covariance functions. Labels, in order of appearance:]

• basis function

• squared exponential

• rational quadratic

• mixture of SEs

• any stationary covariance (Fourier basis)

• covariance even function

• data = Fourier series with Gaussian coefficients

• linear regression


Basis function view of Gaussian Processes

Basis function view useful because:

• connects to classical approaches

• basis function generative model is useful theoretically

– Q: are draws from an SE continuous and differentiable?

– A: they are (almost surely, almost everywhere) as the basis functions are

• thinking about the basis function generative model is useful practically

– Q: how could I construct a periodic covariance?

– A: use periodic basis functions with Gaussian weights
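The answer to the last question can be made concrete with a small sketch. Assume, purely for illustration, a function built from cosines and sines at integer multiples of a base frequency ω, with independent zero-mean Gaussian weights of variance s_j². Its covariance is Σ_j s_j² cos(jω(x − x′)), which is periodic with period 2π/ω. The variance decay 1/j², the number of harmonics, and the function name are assumptions made for the example.

```python
import math

def periodic_cov(x, xp, variances, omega=1.0):
    """Covariance of f(x) = sum_j a_j cos(j w x) + b_j sin(j w x),
    with independent a_j, b_j ~ N(0, variances[j-1])."""
    return sum(v * (math.cos(j * omega * x) * math.cos(j * omega * xp)
                    + math.sin(j * omega * x) * math.sin(j * omega * xp))
               for j, v in enumerate(variances, start=1))

variances = [1.0 / j ** 2 for j in range(1, 11)]   # assumed decay of the weights
period = 2 * math.pi                               # period for omega = 1

k1 = periodic_cov(0.3, 1.1, variances)
k2 = periodic_cov(0.3 + period, 1.1, variances)    # shift one input by a full period
k3 = sum(v * math.cos(j * (0.3 - 1.1))             # same value written as a function of x - x'
         for j, v in enumerate(variances, start=1))
```

Here `k1` and `k2` agree, showing the periodicity, and `k1` equals `k3`, showing that the covariance depends only on x − x′.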


Making new covariance functions from old

• (positive) linear combinations of covariance functions

– e.g. scale mixture of SE → rational quadratic

• multiplication of covariance functions

– e.g. periodic SE

• derivative of GP = GP

– new basis: [equation not recovered]; new covariance: [equation not recovered]

• integral of GP = GP

– new basis: [equation not recovered]; new covariance: [equation not recovered]

• filtering a GP = GP
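The first two rules can be checked numerically on a small example. The sketch below is an illustration rather than a proof: it builds Gram matrices on eight assumed input points and tests positive definiteness by attempting a Cholesky factorisation. The periodic kernel used here is the common exp(−2 sin²(π(x − x′)/p)/l²) form, chosen as an assumption; the weights 0.5 and 2.0, the period 3.3 and all function names are likewise illustrative.

```python
import math

def se(x, xp, l=1.0):
    """Squared exponential covariance."""
    return math.exp(-((x - xp) ** 2) / (2 * l ** 2))

def periodic(x, xp, period=3.3, l=1.0):
    """A common periodic covariance (assumed form, not taken from the slides)."""
    return math.exp(-2 * math.sin(math.pi * (x - xp) / period) ** 2 / l ** 2)

def gram(kernel, xs, jitter=1e-10):
    """Gram matrix of the kernel on xs, with a small diagonal jitter."""
    return [[kernel(a, b) + (jitter if i == j else 0.0)
             for j, b in enumerate(xs)] for i, a in enumerate(xs)]

def is_positive_definite(A):
    """Attempt a Cholesky factorisation; it succeeds only for positive definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][m] * L[j][m] for m in range(j))
            if i == j:
                if s <= 0.0:
                    return False
                L[i][i] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return True

def k_sum(a, b):      # positive linear combination
    return 0.5 * se(a, b) + 2.0 * periodic(a, b)

def k_prod(a, b):     # product of covariance functions
    return se(a, b) * periodic(a, b)

def k_bad(a, b):      # a negative weight: not guaranteed to be valid
    return se(a, b) - 2.0 * periodic(a, b)

xs = [float(i) for i in range(8)]
print(is_positive_definite(gram(k_sum, xs)))
print(is_positive_definite(gram(k_prod, xs)))
print(is_positive_definite(gram(k_bad, xs)))
```

The last case has negative entries on the diagonal of its Gram matrix, which is why the word "positive" in the first rule matters.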


Higher dimensional input spaces

[Figure: surface plot of y (axis −2 to 2) over two inputs x1 and x2, each ranging from 0 to 100.]


Computational cost

• prediction task

– train on N points

– test on M points

• prediction equations

$\mu_M = K_{MN} K_{NN}^{-1} y_N$

$\Sigma_{MM} = K_{MM} - K_{MN} K_{NN}^{-1} K_{NM}$

• Full cost O((N + M)^3); just variances O(N^2 M)

• Without special structure, computation is limited to O(1000) variables

⇒ Computational cost is a major limitation of GPs
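The prediction equations can be sketched as follows, computing only the predictive means and variances. This is a minimal illustration in standard-library Python, not a reference implementation: the SE covariance with unit length-scale, the small diagonal `noise` term added to K_NN for numerical stability, the toy data, and all function names are assumptions. The Cholesky factorisation of K_NN is the O(N³) step; each test point then costs O(N²) for its variance, which matches the O(N² M) figure above.

```python
import math

def se(x, xp, l=1.0):
    """Squared exponential covariance (assumed kernel for this sketch)."""
    return math.exp(-((x - xp) ** 2) / (2 * l ** 2))

def cholesky(A):
    """Lower-triangular L with L L^T = A, for positive definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][m] * L[j][m] for m in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_chol(L, b):
    """Solve (L L^T) z = b by forward then backward substitution."""
    n = len(L)
    w = [0.0] * n
    for i in range(n):
        w[i] = (b[i] - sum(L[i][m] * w[m] for m in range(i))) / L[i][i]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (w[i] - sum(L[m][i] * z[m] for m in range(i + 1, n))) / L[i][i]
    return z

def gp_predict(x_train, y_train, x_test, noise=1e-6):
    """Predictive means and variances at x_test (diagonal of Sigma_MM only)."""
    K_nn = [[se(a, b) + (noise if i == j else 0.0)
             for j, b in enumerate(x_train)] for i, a in enumerate(x_train)]
    L = cholesky(K_nn)                     # O(N^3), done once
    alpha = solve_chol(L, y_train)         # K_NN^{-1} y_N
    means, variances = [], []
    for x_star in x_test:                  # O(N^2) per test point
        k_star = [se(x_star, a) for a in x_train]
        means.append(sum(k * a for k, a in zip(k_star, alpha)))
        v = solve_chol(L, k_star)          # K_NN^{-1} k_star
        variances.append(se(x_star, x_star) - sum(k * vi for k, vi in zip(k_star, v)))
    return means, variances

x_train = [-2.0, -1.0, 0.0, 1.5]
y_train = [math.sin(x) for x in x_train]
means, variances = gp_predict(x_train, y_train, x_test=[-1.0, 0.5, 6.0])
```

At a training input the predictive variance collapses towards the noise level, while far from the data the prediction reverts to the prior (mean 0, variance close to 1).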


Beyond regression: classification

Idea: points near each other in the input space tend to have similar labels

Class probability related to latent function:

$\pi(x) = p(y = 1 \mid f(x)) = \Phi(f(x))$

Logistic link function a typical choice: $\Phi(f) = \frac{1}{1 + \exp(-f)}$

Likelihood non-Gaussian ⇒ prediction analytically intractable ⇒ requires approximations
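As a small illustration of the link function, the sketch below squashes some latent values f(x) through the logistic function to obtain class probabilities and then samples binary labels. The latent values are hypothetical numbers picked for the example, not draws from a GP, and the function name `logistic` is an illustrative choice.

```python
import math
import random

def logistic(f):
    """Phi(f) = 1 / (1 + exp(-f)), written to avoid overflow for large |f|."""
    if f >= 0:
        return 1.0 / (1.0 + math.exp(-f))
    e = math.exp(f)
    return e / (1.0 + e)

latent = [-3.0, -0.5, 0.0, 0.5, 3.0]      # hypothetical values of f(x) at five inputs
probs = [logistic(f) for f in latent]     # pi(x) = p(y = 1 | f(x))

rng = random.Random(1)                    # fixed seed, for reproducibility only
labels = [1 if rng.random() < p else 0 for p in probs]
```

Because the labels depend on f only through this non-linear squashing, the posterior over f is no longer Gaussian, which is the source of the intractability noted above.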


Beyond regression

GPs are useful whenever a prior over functions is required

• dimensionality reduction

• time-series models (Kalman filter)

• clustering

• active learning

• reinforcement learning

• ...


Summary

• Gaussian process: collection of random variables, any finite subset of which is jointly Gaussian distributed

• Easy to use

– Predictions correspond to models with infinite numbers of parameters

• GPs have many standard methods as special cases

• Problem: O(N^3) complexity

– approximation methods for N > 2000 or special covariance functions

• Great reference: Rasmussen & Williams www.gaussianprocess.org/