prof. dr. Lambert Schomaker

Page 1: prof. dr. Lambert Schomaker

prof. dr. Lambert Schomaker

Bayes and continuous PDFs

Kunstmatige Intelligentie / RuG

Page 2: prof. dr. Lambert Schomaker

discrete vs continuous

Bayes' theorem is usually introduced on the basis of discrete probability distributions (alarm? true/false)

… in a set-theoretic framework

but: numbers along a dimension can be considered as points in a set: {x ∈ ℝ}

Page 3: prof. dr. Lambert Schomaker

Bayes revisited

P(C|x) = P(x|C) P(C) / P(x)

where C is a “class” of observations and x is an observed scalar feature

P(C) is the prior probability of finding that class

P(x) is the prior probability of the observable value of x, regardless of class (how likely that value is overall)

P(x|C) is the probability of finding x in case of C, i.e. given class C
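
As a minimal sketch of the formula in the discrete (alarm true/false) setting, the following Python computes P(C|x) for a two-class problem. The function name `posterior`, its parameter names and all numbers are hypothetical and chosen only for illustration; they do not come from the slides.

```python
def posterior(p_x_given_c, p_c, p_x_given_not_c):
    """Return P(C|x) for a two-class problem via Bayes' rule."""
    # P(x) by the law of total probability over C and not-C
    p_x = p_x_given_c * p_c + p_x_given_not_c * (1.0 - p_c)
    return p_x_given_c * p_c / p_x


# Hypothetical numbers: C = "burglary", x = "the alarm rings"
p = posterior(p_x_given_c=0.9, p_c=0.01, p_x_given_not_c=0.05)
print(p)
```

With a small prior P(C), the posterior stays modest even when P(x|C) is high, because P(x) is dominated by the far more common "not C" case.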

Page 4: prof. dr. Lambert Schomaker

Bayes & continuous PDFs

P(C|x) = P(x|C) P(C) / P(x)

where C is a “class” of observations and x is an observed scalar feature

If x is a real number:

P(x|C) is the probability density function (PDF) or histogram of feature values observed for class C

P(x) is the PDF of x overall, i.e. over all possible classes
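
For reference, the continuous case can be written out with densities. The lowercase p(·) for a density and the class index i are notation assumed here, not taken from the slides; the second equation is the law of total probability applied to the overall PDF of x.

```latex
P(C \mid x) = \frac{p(x \mid C)\, P(C)}{p(x)},
\qquad
p(x) = \sum_{i} p(x \mid C_i)\, P(C_i)
```

Note that P(C|x) is still a proper probability, even though p(x|C) and p(x) are densities.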

Page 5: prof. dr. Lambert Schomaker

Example: temperature classification

Classes C:

Cold: P(x|C)

Normal: P(x|N)

Warm: P(x|W)

Hot: P(x|H)

[Figure: the class PDFs P(x|C), P(x|N), P(x|W), P(x|H) and the overall P(x), labeled “likelihood of x values”]

Page 6: prof. dr. Lambert Schomaker

Bayes: probability “blow up”

Classes C:

Cold: P(x|C)

Normal: P(x|N)

Warm: P(x|W)

Hot: P(x|H)

[Figure: the posteriors P(C|x), P(N|x), P(W|x), P(H|x)]
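
The temperature example can be sketched in Python as follows. This is only an illustration under stated assumptions: the Gaussian shape of each class PDF, the means and standard deviations (in degrees Celsius), the equal priors, and the names `gauss_pdf`, `CLASSES`, `PRIORS` and `posteriors` are all hypothetical and not taken from the slides.

```python
import math


def gauss_pdf(x, mean, std):
    """Density of a 1-D Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


# Hypothetical class models: (mean, std) in degrees Celsius
CLASSES = {
    "Cold": (0.0, 5.0),
    "Normal": (15.0, 5.0),
    "Warm": (25.0, 4.0),
    "Hot": (35.0, 5.0),
}
# Hypothetical equal priors P(class)
PRIORS = {name: 0.25 for name in CLASSES}


def posteriors(x):
    """Return P(class|x) for every class via Bayes' rule."""
    joint = {c: gauss_pdf(x, m, s) * PRIORS[c] for c, (m, s) in CLASSES.items()}
    p_x = sum(joint.values())  # P(x): PDF of x over all classes
    return {c: j / p_x for c, j in joint.items()}


for x in (-10.0, 0.0, 20.0, 40.0):
    post = posteriors(x)
    best = max(post, key=post.get)
    print(x, best, post[best])
```

Under these assumptions the density P(x|Cold) at x = -10 is small, yet the posterior for Cold is close to 1 there, because no other class explains that value: the division by P(x) is what "blows up" the probability.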

Page 7: prof. dr. Lambert Schomaker

P(C|x) = P(x|C) P(C) / P(x)

[Figure: the input PDF P(x|C) (“in”) and the Bayesian output P(C|x) (“out”)]

The Bayesian output has a nice plateau, even with an irregular PDF shape …

Page 8: prof. dr. Lambert Schomaker

Puzzle

So if Bayes is optimal and can be used for continuous data too, why has it become popular so late, i.e., much later than neural networks?

Page 9: prof. dr. Lambert Schomaker

Why Bayes has become popular so late…

Note: the example was 1-dimensional

A PDF (histogram) with 100 bins for one dimension will cost 10,000 cells for two dimensions, 1,000,000 for three, etc.

Ncells = Nbins^ndims
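
The growth in the number of histogram cells is easy to check numerically. A minimal sketch, where the function name `n_cells` is hypothetical and the choice of dimensions to print is arbitrary:

```python
def n_cells(n_bins, n_dims):
    """Number of histogram cells for n_bins per dimension in n_dims dimensions."""
    return n_bins ** n_dims


for d in (1, 2, 3, 7):
    print(d, n_cells(100, d))
```

Every one of those cells needs enough observations to give a usable probability estimate, which is why the count matters.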

[Figure: P(x) plotted against x]

Page 10: prof. dr. Lambert Schomaker

Why Bayes has become popular so late…

Ncells = Nbins^ndims

Yes… but you could use n-dimensional theoretical distributions (Gauss, Weibull etc.) instead of empirically measured PDFs…

Page 11: prof. dr. Lambert Schomaker

Why Bayes has become popular so late…

… use theoretical distributions instead of empirically measured PDFs…

still, the dimensionality is a problem:

– 20 samples needed to estimate a 1-dim. Gaussian PDF

– 400 samples needed to estimate a 2-dim. Gaussian!, etc.

massive amounts of labeled data are needed to estimate probabilities reliably!
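
The two figures above (20 and 400) fit the pattern 20^ndims. A minimal sketch that extrapolates this pattern, assuming the "etc." continues the same way; the function name `samples_needed` and the default of 20 samples per dimension are illustrative assumptions rather than a rule stated in the slides:

```python
def samples_needed(n_dims, per_dim=20):
    """Extrapolate the 20 -> 400 pattern: per_dim samples per dimension, multiplied out."""
    return per_dim ** n_dims


for d in (1, 2, 3, 7):
    print(d, samples_needed(d))
```

Under this assumption the requirement grows exponentially with the number of features, so even a modest feature vector calls for far more labeled examples than are usually available.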

Page 12: prof. dr. Lambert Schomaker

Labeled (ground truthed) data

0.1 0.54 0.53 0.874 8.455 0.001 –0.111 risk

0.2 0.59 0.01 0.974 8.40 0.002 –0.315 risk

0.11 0.4 0.3 0.432 7.455 0.013 –0.222 safe

0.2 0.64 0.13 0.774 8.123 0.001 –0.415 risk

0.1 0.17 0.59 0.813 9.451 0.021 –0.319 risk

0.8 0.43 0.55 0.874 8.852 0.011 –0.227 safe

0.1 0.78 0.63 0.870 8.115 0.002 –0.254 risk

. . . . . . . .

Example: client evaluation in insurance
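
To illustrate what labeled data is used for, the sketch below estimates the class priors P(risk) and P(safe) and the per-class feature means from the seven rows shown above. The names `ROWS`, `class_priors` and `class_means` are hypothetical, and the use of per-class means as a first step toward a class PDF is an assumption, not the method of the slides.

```python
from collections import Counter, defaultdict

# The seven rows shown above: seven feature values followed by a class label.
ROWS = [
    ([0.1, 0.54, 0.53, 0.874, 8.455, 0.001, -0.111], "risk"),
    ([0.2, 0.59, 0.01, 0.974, 8.40, 0.002, -0.315], "risk"),
    ([0.11, 0.4, 0.3, 0.432, 7.455, 0.013, -0.222], "safe"),
    ([0.2, 0.64, 0.13, 0.774, 8.123, 0.001, -0.415], "risk"),
    ([0.1, 0.17, 0.59, 0.813, 9.451, 0.021, -0.319], "risk"),
    ([0.8, 0.43, 0.55, 0.874, 8.852, 0.011, -0.227], "safe"),
    ([0.1, 0.78, 0.63, 0.870, 8.115, 0.002, -0.254], "risk"),
]


def class_priors(rows):
    """Estimate P(class) as the relative frequency of each label."""
    counts = Counter(label for _, label in rows)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def class_means(rows):
    """Per-class mean of each feature column."""
    grouped = defaultdict(list)
    for features, label in rows:
        grouped[label].append(features)
    return {
        label: [sum(col) / len(col) for col in zip(*vectors)]
        for label, vectors in grouped.items()
    }


priors = class_priors(ROWS)
means = class_means(ROWS)
print(priors)
print(means["safe"])
```

With only two "safe" rows, the estimates for that class rest on very little evidence, which is the practical face of the data requirement discussed earlier.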

Page 13: prof. dr. Lambert Schomaker

Success of speech recognition

massive amounts of data

increased computing power

cheap computer memory

allowed for the use of Bayes in hidden Markov models for speech recognition

similarly (but slower): application of Bayes in script recognition

Page 14: prof. dr. Lambert Schomaker

Global Structure:

year

title

date

date and number of entry (Rappt)

redundant lines between paragraphs

jargon-words: Notificatie, Besluit, fiat

imprint with page number

XML model

Page 15: prof. dr. Lambert Schomaker

Local probabilistic structure:

P(“Novb 16 is a date” | “sticks out to the left” & is left of “Rappt ”) ?