Bregman Information Bottleneck NIPS’03, Whistler December 2003 Koby Crammer Hebrew University of...


Bregman Information Bottleneck

NIPS’03, Whistler December 2003

Koby Crammer, Hebrew University of Jerusalem

Noam Slonim, Princeton University

Motivation

• Extend the IB for a broad family of representations

• Relation to the Exponential family

Hello, world

Multinomial distribution

Vectors

Outline

• Rate-Distortion Formulation

• Bregman Divergences

• Bregman IB

• Statistical Interpretation

• Summary

Information Bottleneck

X   T   Y

X: [ p(y=1|X) … p(y=n|X) ]

T: [ p(y=1|T) … p(y=n|T) ]

• Input

• Variables

• Distortion

Rate-Distortion Formulation

• Boltzmann Distribution:

• Markov + Bayes

• Marginal

Self-Consistent Equations
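The equations on these slides did not survive in the transcript. For orientation, the three bullets of the Rate-Distortion Formulation slide (Boltzmann distribution, Markov + Bayes, marginal) match the standard IB self-consistent equations, sketched below in the usual notation. The symbols β (trade-off parameter) and Z(x, β) (partition function) are assumptions taken from the standard formulation, not from the transcript.

```latex
\begin{align}
p(t \mid x) &= \frac{p(t)}{Z(x,\beta)}
  \exp\!\Big(-\beta\, D_{\mathrm{KL}}\big[\,p(y \mid x)\,\big\|\,p(y \mid t)\,\big]\Big)
  && \text{(Boltzmann distribution)} \\
p(y \mid t) &= \frac{1}{p(t)} \sum_{x} p(y \mid x)\, p(t \mid x)\, p(x)
  && \text{(Markov + Bayes)} \\
p(t) &= \sum_{x} p(t \mid x)\, p(x)
  && \text{(marginal)}
\end{align}
```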

Bregman Divergences

[Figure: graph of a convex function f with the points (u, f(u)), (v, f(v)) and the tangent-line point (v, f(u)+f'(u)(v-u)); the vertical gap at v is Bf(v||u)]

• Functional: Bf(v||u) = f(v) - (f(u) + f'(u)(v-u))

• Bregman Function: f: S → R

• Input

• Variables

• Distortion

Bregman IB: Rate-Distortion Formulation

• Boltzmann Distribution:

• Prototypes: convex combination of input vectors

• Marginal

Self-Consistent Equations
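The three ingredients named on the previous slide (a Boltzmann distribution over clusters, prototypes as convex combinations of the input vectors, and the cluster marginal) can be iterated as a fixed-point scheme. The following is a minimal sketch under stated assumptions, not the authors' algorithm: the function name `bregman_ib`, the deterministic initialization from evenly spaced data points, the fixed iteration count, and the clipping constant are all illustrative choices.

```python
import numpy as np

def bregman_ib(V, p_x, n_clusters, beta, divergence, n_iter=50):
    """Iterate Boltzmann / marginal / prototype updates (illustrative sketch)."""
    V = np.asarray(V, dtype=float)
    n = V.shape[0]
    # Assumption: start from evenly spaced data points and a uniform marginal.
    idx = np.linspace(0, n - 1, n_clusters).astype(int)
    prototypes = V[idx].copy()
    p_t = np.full(n_clusters, 1.0 / n_clusters)
    for _ in range(n_iter):
        # Boltzmann distribution: p(t|x) proportional to p(t) exp(-beta * B_f(v_x || v_t))
        D = np.array([[divergence(v, c) for c in prototypes] for v in V])
        logits = np.log(p_t) - beta * D
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        p_t_given_x = np.exp(logits)
        p_t_given_x /= p_t_given_x.sum(axis=1, keepdims=True)
        # Marginal: p(t) = sum_x p(x) p(t|x); clipped to avoid log(0)
        p_t = np.clip(p_x @ p_t_given_x, 1e-12, None)
        # Prototypes: convex combination of inputs with weights p(x|t)
        p_x_given_t = p_t_given_x * p_x[:, None] / p_t
        prototypes = p_x_given_t.T @ V
    return p_t_given_x, prototypes, p_t

# Usage with the soft k-means divergence B_f(v||u) = (1/2)||v - u||^2
half_sq = lambda v, c: 0.5 * np.sum((v - c) ** 2)
V = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
p_x = np.full(4, 0.25)
q, protos, p_t = bregman_ib(V, p_x, n_clusters=2, beta=10.0, divergence=half_sq)
print(q.argmax(axis=1), protos, p_t)
```

With two well-separated groups and a moderately large beta, the soft assignments are nearly hard and each prototype sits at the mean of its group.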

Special Cases

• Information Bottleneck

Bregman function: f(x) = x log(x) - x

Domain: Simplex

Divergence: Kullback-Leibler

• Soft K-means

Bregman function: f(x) = (1/2) x^2

Domain: R^n

Divergence: Squared Euclidean distance (up to the factor 1/2) [Still, Bialek, Bottou, NIPS 2003]
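Both special cases can be checked numerically from the general definition Bf(v||u) = f(v) - f(u) - f'(u)(v-u). The sketch below assumes the Bregman functions are applied coordinate-wise and summed over the vector; the helper names are illustrative.

```python
import numpy as np

def bregman_divergence(f, grad_f, v, u):
    """B_f(v||u) = f(v) - f(u) - <grad f(u), v - u>."""
    v = np.asarray(v, dtype=float)
    u = np.asarray(u, dtype=float)
    return f(v) - f(u) - np.dot(grad_f(u), v - u)

# Information Bottleneck case: f(x) = sum_i (x_i log x_i - x_i), gradient log x
def neg_entropy(x):
    return np.sum(x * np.log(x) - x)

def neg_entropy_grad(x):
    return np.log(x)

# Soft k-means case: f(x) = (1/2) ||x||^2, gradient x
def half_sq_norm(x):
    return 0.5 * np.dot(x, x)

def half_sq_norm_grad(x):
    return x

p = np.array([0.8, 0.2])
q = np.array([0.6, 0.4])

kl = np.sum(p * np.log(p / q))  # Kullback-Leibler divergence on the simplex
b_kl = bregman_divergence(neg_entropy, neg_entropy_grad, p, q)
b_sq = bregman_divergence(half_sq_norm, half_sq_norm_grad, p, q)

print(kl, b_kl)                          # the two values agree on the simplex
print(b_sq, 0.5 * np.sum((p - q) ** 2))  # half the squared Euclidean distance
```

On the simplex the linear terms cancel because both vectors sum to one, which is why the first Bregman divergence reduces exactly to Kullback-Leibler.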

Bregman IB

Information Bottleneck

Bregman Clustering

Rate-Distortion

Exponential Family

Exponential Family

• Expectation parameters:

• Examples (single dimension): Normal

Poisson

• Expectation parameters:

• Properties :

Exponential Family and Bregman Divergences

Illustration

• Expectation parameters:

• Properties :

Exponential Family and Bregman Divergences

• Distortion:

• Data vectors and prototypes: expectation parameters

• Question: For which exponential-family distribution do we have ?

Answer: Poisson
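A quick numerical check of the link between the Poisson distribution and the Bregman function f(x) = x log(x) - x from the Special Cases slide: the Kullback-Leibler divergence between two Poisson distributions, computed by summing over counts, coincides with the Bregman divergence between their means (the expectation parameters). This is a sketch; the truncation point `k_max` and the function names are assumptions made for illustration.

```python
import math

def poisson_kl(lam1, lam2, k_max=200):
    """KL(Poisson(lam1) || Poisson(lam2)), summed numerically over k < k_max."""
    total = 0.0
    for k in range(k_max):
        log_p = -lam1 + k * math.log(lam1) - math.lgamma(k + 1)
        log_q = -lam2 + k * math.log(lam2) - math.lgamma(k + 1)
        total += math.exp(log_p) * (log_p - log_q)
    return total

def bregman_xlogx(v, u):
    """Bregman divergence for f(x) = x log(x) - x, whose derivative is log(x)."""
    f = lambda x: x * math.log(x) - x
    return f(v) - f(u) - math.log(u) * (v - u)

kl_numeric = poisson_kl(3.0, 5.0)
b = bregman_xlogx(3.0, 5.0)
print(kl_numeric, b)  # both equal 3*log(3/5) - 3 + 5 up to truncation error
```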

Back to Distributional Clustering

Product of Poisson Distributions

Illustration

[Figure: the sequence "a a b a a a b a a a" shown with Pr(a) = .8 and Pr(b) = .2, alongside the values 60 and 40 for a and b; label: Multinomial Distribution]

Back to Distributional Clustering

• Information Bottleneck: Distributional clustering of Poisson distributions

• (Soft) k-means: (Soft) Clustering of Normal distributions

• Distortion

• Input: Observations

• Output: Parameters of Distribution

• IB functional: EM [Elidan & Friedman, before]

Maximum Likelihood Perspective

• Posterior:

• Partition Function:

Weighted β-norm of the Likelihood

• β → ∞, most likely cluster governs

• β → 0, clusters collapse into a single prototype
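The two limits can be seen directly from a Boltzmann-form posterior p(t|x) proportional to p(t) exp(-β d(x,t)). The sketch below assumes that form and uses made-up distortion values for a single input x; `soft_assignment` is an illustrative helper, not a function from the talk.

```python
import numpy as np

def soft_assignment(distortions, p_t, beta):
    """Posterior over clusters: p(t|x) proportional to p(t) * exp(-beta * d(x, t))."""
    logits = np.log(p_t) - beta * np.asarray(distortions, dtype=float)
    logits -= logits.max()  # numerical stability
    w = np.exp(logits)
    return w / w.sum()

d = [0.2, 1.0, 3.0]              # hypothetical distortions of x to three prototypes
p_t = np.array([0.2, 0.3, 0.5])  # hypothetical cluster marginal

near_zero = soft_assignment(d, p_t, beta=1e-6)  # close to p(t), independent of x
large = soft_assignment(d, p_t, beta=1e3)       # concentrates on the nearest cluster

print(near_zero)
print(large)
```

When β is near zero every input receives the same posterior p(t), so all prototypes become the same weighted mean of the data; when β is large the cluster with the smallest distortion takes essentially all the mass.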

Back to Self-Consistent Equations

Summary

• Bregman Information Bottleneck: clustering/compression for many representations and divergences

• Statistical Interpretation

Clustering of distributions from the exponential family

EM-like formulation

• Current Work:

Algorithms

Characterize distortion measures which also yield Boltzmann distributions

General distortion measures