High-Dimensional Unsupervised Selection and Estimation of a Finite Generalized Dirichlet Mixture...

High-Dimensional Unsupervised Selection and Estimation of a Finite Generalized Dirichlet Mixture Model Based on Minimum Message Length, by Nizar Bouguila and Djemel Ziou. Discussion led by Qi An, Duke University Machine Learning Group

Page 1: High-Dimensional Unsupervised Selection and Estimation of a Finite Generalized Dirichlet Mixture model Based on Minimum Message Length by Nizar Bouguila.

High-Dimensional Unsupervised Selection and Estimation of a Finite Generalized Dirichlet Mixture Model Based on Minimum Message Length

by Nizar Bouguila and Djemel Ziou

Discussion led by Qi An

Duke University Machine Learning Group

Outline

• Introduction

• The generalized Dirichlet mixture

• The minimal message length (MML) criterion

• Fisher information matrix and priors

• Density estimation and model selection

• Experimental results

• Conclusions

Introduction

• How to determine the number of components in a mixture model for high-dimensional data?
  – Stochastic and resampling (Slow)
    • Implementation of model selection criteria
    • Fully Bayesian way
  – Deterministic (Fast)
    • Approximate Bayesian criteria
    • Information/coding theory concepts
      – Minimal message length (MML)
      – Akaike’s information criterion (AIC)

The generalized Dirichlet distribution

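• A d-dimensional generalized Dirichlet distribution is defined to be

$p(X_1, \ldots, X_d) = \prod_{i=1}^{d} \frac{\Gamma(\alpha_i + \beta_i)}{\Gamma(\alpha_i)\Gamma(\beta_i)} X_i^{\alpha_i - 1} \left(1 - \sum_{j=1}^{i} X_j\right)^{\gamma_i}$

where $\sum_{i=1}^{d} X_i < 1$ and $0 < X_i < 1$, $\alpha_i > 0$, $\beta_i > 0$, $\gamma_i = \beta_i - \alpha_{i+1} - \beta_{i+1}$ for $i < d$, and $\gamma_d = \beta_d - 1$.

It can be reduced to the Dirichlet distribution when $\beta_i = \alpha_{i+1} + \beta_{i+1}$.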
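The density can be evaluated numerically in a few lines. The following is a minimal Python sketch, assuming the standard Connor and Mosimann form of the generalized Dirichlet density with exponents $\gamma_i = \beta_i - \alpha_{i+1} - \beta_{i+1}$ for $i < d$ and $\gamma_d = \beta_d - 1$. The function names `gdd_logpdf` and `dirichlet_logpdf` are illustrative and do not come from the paper. The final lines check the reduction to the ordinary Dirichlet when $\beta_i = \alpha_{i+1} + \beta_{i+1}$.

```python
import math

def gdd_logpdf(x, alpha, beta):
    """Log-density of a d-dimensional generalized Dirichlet at point x."""
    d = len(x)
    if len(alpha) != d or len(beta) != d:
        raise ValueError("x, alpha and beta must have the same length")
    # Support: every X_i in (0, 1) and X_1 + ... + X_d < 1.
    if any(xi <= 0.0 for xi in x) or sum(x) >= 1.0:
        return -math.inf
    logp = 0.0
    cumulative = 0.0
    for i in range(d):
        cumulative += x[i]
        if i < d - 1:
            gamma_i = beta[i] - alpha[i + 1] - beta[i + 1]
        else:
            gamma_i = beta[i] - 1.0  # assumed convention for the last exponent
        # Normalizing constant of the i-th factor.
        logp += (math.lgamma(alpha[i] + beta[i])
                 - math.lgamma(alpha[i]) - math.lgamma(beta[i]))
        # Power terms of the i-th factor.
        logp += (alpha[i] - 1.0) * math.log(x[i])
        logp += gamma_i * math.log(1.0 - cumulative)
    return logp

def dirichlet_logpdf(x_full, a):
    """Log-density of an ordinary Dirichlet; x_full includes the last coordinate."""
    log_norm = math.lgamma(sum(a)) - sum(math.lgamma(ai) for ai in a)
    return log_norm + sum((ai - 1.0) * math.log(xi) for ai, xi in zip(a, x_full))

# With beta_i = alpha_{i+1} + beta_{i+1} the two log-densities should agree.
alpha = [2.0, 3.0, 4.0]
beta = [12.0, 9.0, 5.0]  # 12 = 3 + 9 and 9 = 4 + 5
x = [0.2, 0.3, 0.1]
gd_value = gdd_logpdf(x, alpha, beta)
dir_value = dirichlet_logpdf(x + [1.0 - sum(x)], alpha + [beta[-1]])
```

Working in log space with `math.lgamma` avoids overflow of the Gamma functions when the parameters are large.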

The generalized Dirichlet distribution

For the generalized Dirichlet distribution:

The GDD has a more general covariance structure than the DD, and it is conjugate to the multinomial distribution.
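The moments of the GDD can be explored by simulation. The sketch below assumes the stick-breaking construction of the generalized Dirichlet from independent Beta variables, $Z_i \sim \mathrm{Beta}(\alpha_i, \beta_i)$ with $X_1 = Z_1$ and $X_i = Z_i \prod_{k<i}(1 - Z_k)$, which gives the mean $E[X_i] = \frac{\alpha_i}{\alpha_i + \beta_i} \prod_{k<i} \frac{\beta_k}{\alpha_k + \beta_k}$. Neither the construction nor the function names `sample_gdd` and `gdd_mean` appear on the slide, and the parameter values are arbitrary.

```python
import random

def sample_gdd(alpha, beta, rng):
    """Draw one generalized Dirichlet vector by stick-breaking."""
    x = []
    remaining = 1.0  # part of the unit stick not yet assigned
    for a, b in zip(alpha, beta):
        z = rng.betavariate(a, b)
        x.append(z * remaining)
        remaining *= 1.0 - z
    return x

def gdd_mean(alpha, beta):
    """Closed-form mean implied by the stick-breaking construction."""
    mean = []
    remaining = 1.0  # expected length of the remaining stick
    for a, b in zip(alpha, beta):
        mean.append(remaining * a / (a + b))
        remaining *= b / (a + b)
    return mean

rng = random.Random(0)
alpha = [2.0, 3.0, 1.5]
beta = [4.0, 2.0, 3.0]
samples = [sample_gdd(alpha, beta, rng) for _ in range(20000)]
empirical = [sum(s[i] for s in samples) / len(samples) for i in range(len(alpha))]
exact = gdd_mean(alpha, beta)
```

Comparing `empirical` with `exact` is a quick sanity check on the parameterization before fitting a mixture.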

GDD vs. Gaussian

• The GDD has a smaller number of parameters to estimate, so the estimation can be more accurate.

• The GDD is defined on the support [0,1] and can be extended to a compact support [A,B], which is more appropriate for data that are naturally bounded.

Beta distribution:

Beta type-II distribution:

The two densities coincide under the change of variable u = v/(1+v).
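The equivalence can be checked by a change of variables. The forms below are the standard Beta and Beta type-II densities, written with assumed parameter names $a$ and $b$ because the slide's own notation is not recoverable from the transcript; $C$ denotes the shared normalizing constant.

```latex
\begin{align*}
f_U(u) &= C\, u^{a-1} (1-u)^{b-1}, \quad 0 < u < 1,
  \qquad C = \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)} \\
f_V(v) &= C\, \frac{v^{a-1}}{(1+v)^{a+b}}, \quad v > 0 \\
u = \frac{v}{1+v} &\;\Rightarrow\; 1 - u = \frac{1}{1+v},
  \qquad \frac{du}{dv} = \frac{1}{(1+v)^{2}} \\
f_U\!\left(\frac{v}{1+v}\right) \frac{du}{dv}
  &= C\, \frac{v^{a-1}}{(1+v)^{a-1}} \cdot \frac{1}{(1+v)^{b-1}} \cdot \frac{1}{(1+v)^{2}}
   = C\, \frac{v^{a-1}}{(1+v)^{a+b}} = f_V(v)
\end{align*}
```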

A GDD mixture model

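A generalized Dirichlet mixture model with M components is $p(X|\Theta) = \sum_{j=1}^{M} P_j\, p(X|\alpha_j)$, where each $p(X|\alpha_j)$ takes the form of the GDD, the $P_j$ are the mixture weights, and $\Theta$ collects all the parameters.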
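A mixture density is a weighted sum of component densities, which is best evaluated in log space. The following is a minimal Python sketch of that step, assuming the per-component log-densities have already been computed; the helper names `mixture_logpdf` and `responsibilities` and the numbers in the example are illustrative only.

```python
import math

def mixture_logpdf(weights, component_logps):
    """log of sum_j P_j * p(X | alpha_j), computed with log-sum-exp."""
    terms = [math.log(w) + lp
             for w, lp in zip(weights, component_logps) if w > 0.0]
    top = max(terms)
    if top == -math.inf:
        return -math.inf
    return top + math.log(sum(math.exp(t - top) for t in terms))

def responsibilities(weights, component_logps):
    """Posterior probability of each component for one data point.

    Assumes the point has nonzero density under at least one component.
    """
    total = mixture_logpdf(weights, component_logps)
    return [w * math.exp(lp - total) if w > 0.0 else 0.0
            for w, lp in zip(weights, component_logps)]

# Three components whose densities at some point X are 2.0, 1.0 and 0.5.
weights = [0.5, 0.3, 0.2]
component_logps = [math.log(2.0), math.log(1.0), math.log(0.5)]
log_density = mixture_logpdf(weights, component_logps)
resp = responsibilities(weights, component_logps)
```

The responsibilities are the quantities an EM algorithm would compute in its E-step for each data point.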

The MML criterion

• The message length is defined as minus the logarithm of the posterior probability.

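• After placing an explicit prior over the parameters, the message length for a mixture of distributions is given as

$\mathrm{MessLen} \simeq -\log h(\Theta) - \log p(\mathcal{X}|\Theta) + \frac{1}{2}\log|F(\Theta)| + \frac{N_p}{2}\left(1 + \log \kappa_{N_p}\right)$

where $h(\Theta)$ is the prior, $p(\mathcal{X}|\Theta)$ is the likelihood of the data set $\mathcal{X}$, $|F(\Theta)|$ is the determinant of the Fisher information matrix, $N_p$ is the number of free parameters, and $\kappa_{N_p}$ is the optimal quantization constant.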
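Once the four ingredients are available, the message length is simple arithmetic. The following is a minimal Python sketch, assuming the usual Wallace and Freeman form: minus log prior, minus log likelihood, plus half the log-determinant of the Fisher information, plus $\frac{N_p}{2}(1 + \log \kappa_{N_p})$. The function name `message_length` is hypothetical, and the default $\kappa = 1/12$ (the exact one-dimensional lattice constant) is an assumption applied to every dimension.

```python
import math

def message_length(log_prior, log_likelihood, log_det_fisher, n_params,
                   log_kappa=None):
    """Approximate message length from its four ingredients (all in nats)."""
    if log_kappa is None:
        # Assumption: approximate the quantization constant by 1/12.
        log_kappa = math.log(1.0 / 12.0)
    return (-log_prior
            - log_likelihood
            + 0.5 * log_det_fisher
            + 0.5 * n_params * (1.0 + log_kappa))

# Made-up numbers for two candidate models fitted to the same data.
small_model = message_length(-5.0, -1200.0, 30.0, n_params=14)
large_model = message_length(-9.0, -1195.0, 55.0, n_params=21)
preferred = "small" if small_model < large_model else "large"
```

The candidate with the shorter message is preferred, so a small gain in likelihood does not justify a model whose extra parameters cost more to encode.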

Fisher Information matrix

• The Fisher information matrix is the expected value of the Hessian of minus the logarithm of the likelihood

where

Prior distribution

• Assume independence between the different components

Mixture weights

GDD parameters

Place a Dirichlet distribution and a generalized Dirichlet distribution on P and α, respectively, with parameters set to 1.
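For the mixture weights, the choice of unit parameters has a simple consequence. The block below is a short worked form, assuming the prior factorizes over the weights and the per-component GDD parameters as the independence assumption suggests; the symbols $h(\cdot)$, $P_j$, $\alpha_j$ and $M$ are notation chosen here, and the form of $h(\alpha_j)$ is left unspecified because it is not recoverable from the transcript.

```latex
\begin{align*}
h(\Theta) &= h(P) \prod_{j=1}^{M} h(\alpha_j) \\
h(P) &= \frac{\Gamma\!\left(\sum_{j=1}^{M} 1\right)}{\prod_{j=1}^{M} \Gamma(1)}
        \prod_{j=1}^{M} P_j^{\,1-1} = \Gamma(M) = (M-1)!
\end{align*}
```

A Dirichlet prior with all parameters equal to 1 is therefore uniform over the weight simplex, and its contribution to the message length depends only on $M$.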

Message length

• After obtaining the Fisher information and specifying the prior distribution, the message length can be expressed as

Estimation and selection algorithm

• The authors use an EM algorithm to estimate the mixture parameters.

• To overcome the computational issues and the local maxima problem, they implement a fairly sophisticated initialization algorithm.

• The whole algorithm is summarized on the next page.
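The overall selection loop can be sketched generically. The authors' actual algorithm is not reproduced in this transcript, so the following is only a minimal Python outline under stated assumptions: `fit` stands for an EM run with a given number of components, `score` stands for the message length of the fitted model, and both are supplied by the caller. The name `select_num_components` and the stand-in functions in the example are hypothetical.

```python
def select_num_components(candidates, fit, score):
    """Return (M, message_length, params) for the candidate with the shortest message."""
    best = None
    for m in candidates:
        params = fit(m)         # placeholder for EM estimation with m components
        length = score(params)  # placeholder for the message length of that fit
        if best is None or length < best[1]:
            best = (m, length, params)
    return best

# Stand-ins for illustration only: a fake "fit" and a fake criterion
# whose minimum is at five components.
def toy_fit(m):
    return {"M": m}

def toy_score(params):
    return (params["M"] - 5) ** 2 + 10

best_m, best_length, best_params = select_num_components(range(1, 11), toy_fit, toy_score)
```

Keeping estimation and scoring as separate callables makes it easy to swap in another criterion for comparison.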

Experimental results

The correct numbers of mixture components are 5, 6, and 7, respectively.

Experimental results

• Web mining:
  – Train with multiple classes of labels
  – Use the model to predict the label of a test sample
  – Use the frequencies of the top 200 words

Conclusions

• An MML-based criterion is proposed to select the number of components in generalized Dirichlet mixtures.

• Full dimensionality of the data is used.

• Generalized Dirichlet mixtures allow more modeling flexibility than mixtures of Gaussians.

• The results indicate clearly that the MML and LEC model selection methods outperform the other methods.