An Infinite Factor Model Hierarchy Via a Noisy-Or Mechanism


Page 1

An Infinite Factor Model Hierarchy Via a Noisy-Or Mechanism

Aaron C. Courville, Douglas Eck and Yoshua Bengio
NIPS 2009

Presented by Lingbo Li
ECE, Duke University

May 21, 2010

Note: all tables and figures taken from the original paper

Page 2

Outline

• Motivations
• Latent Factor Modeling
• A Hierarchy of Latent Features Via a Noisy-Or Mechanism
• Inference
• Experiments
• Conclusions

Page 3

Motivations

• Indian Buffet Process (IBP): a factorial representation of data.
• Music tag data (Last.fm), used to organize playlists, e.g. RADIOHEAD: alternative, rock, alternative rock, indie, electronic, britpop, british, and indie rock.
• In the IBP, latent features are independent across object instances.
• Dependency between latent factors: some features tend to co-occur, e.g. 'alternative' + 'indie' is more likely than 'alternative' + 'classical'.
• Extend infinite latent factor models to two unbounded layers of factors.
• 'Upper-layer factors express correlations between lower-layer factors via a noisy-or mechanism.'

Page 4

Latent Factor Modeling

• $N$ objects $x_n$, model parameters $\Theta$, and binary feature variables $z_{nk} \in \{0, 1\}$.

• Features: $z_{nk} = 1$ means feature $k$ is active for object $n$; $z_{nk} = 0$ means inactive.

• The model is summarized as $p(X, Z \mid \Theta, \mu) = \prod_n p(x_n \mid z_n, \Theta) \prod_{n,k} p(z_{nk} \mid \mu_k)$, with $z_{nk} \sim \mathrm{Bernoulli}(\mu_k)$ and $\mu_k \sim \mathrm{Beta}(\alpha/K, 1)$ (sketched in code below).

• The features $z_{nk}$ are mutually independent given the $\mu_k$.
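As an illustration, a minimal sketch (not from the paper; all names are illustrative) of sampling from this finite-K Beta-Bernoulli feature model:

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, alpha = 10, 5, 2.0                    # objects, features, concentration

# mu_k ~ Beta(alpha/K, 1); z_nk ~ Bernoulli(mu_k), independent given mu
mu = rng.beta(alpha / K, 1.0, size=K)
Z = (rng.random((N, K)) < mu).astype(int)   # N x K binary feature matrix
print(Z.sum(axis=0))                        # how often each feature is active
```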

Page 5

Latent Factor Modeling

• As $K \rightarrow \infty$, the IBP gives the distribution of an unbounded binary feature matrix $Z$ by marginalizing out the feature probabilities $\mu_k$.

• Stick-breaking construction for the IBP: order the feature probabilities so that $\mu_{(1)} > \mu_{(2)} > \cdots$.

• Factor probabilities are expressed as: $\nu_k \sim \mathrm{Beta}(\alpha, 1)$, $\mu_{(k)} = \nu_k \, \mu_{(k-1)} = \prod_{j=1}^{k} \nu_j$ (see the sketch below).
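A sketch of this stick-breaking recursion, a direct transcription of the formula above (the truncation level is an assumption for illustration):

```python
import numpy as np

def ibp_stick_breaking(alpha, K_trunc, rng):
    """mu_(k) = prod_{j<=k} nu_j with nu_j ~ Beta(alpha, 1), decreasing in k."""
    nu = rng.beta(alpha, 1.0, size=K_trunc)
    return np.cumprod(nu)

rng = np.random.default_rng(0)
mu = ibp_stick_breaking(alpha=2.0, K_trunc=20, rng=rng)
Z = (rng.random((10, 20)) < mu).astype(int)  # feature matrix under truncation
```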

Page 6

A Hierarchy of Latent Features Via a Noisy-OR Mechanism

Extend to two layers of binary latent features:

an upper-layer binary latent feature matrix $U$ with elements $u_{nj} \in \{0, 1\}$,

a lower-layer binary latent feature matrix $Z$ with elements $z_{nk} \in \{0, 1\}$.

• The weight matrix $\Lambda$ connects every element of $U$ to every element of $Z$: each $\lambda_{jk} \in [0, 1]$, and $P(z_{nk} = 1 \mid u_n, \Lambda) = 1 - \prod_j (1 - \lambda_{jk})^{u_{nj}}$ (see the code sketch below).

• The active $u_{nj}$ can be interpreted as 'the possible causes of the activation of the individual $z_{nk}$'.
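A sketch of the noisy-or link written out in code (the array names U and Lam are illustrative):

```python
import numpy as np

def noisy_or_prob(U, Lam):
    """P(z_nk = 1 | u_n, Lambda) = 1 - prod_j (1 - lam_jk)^(u_nj).

    U:   (N, J) binary upper-layer features
    Lam: (J, K) noisy-or weights in [0, 1]
    """
    log_fail = U @ np.log1p(-Lam)  # sum_j u_nj * log(1 - lam_jk)
    return 1.0 - np.exp(log_fail)

rng = np.random.default_rng(0)
U = rng.integers(0, 2, size=(4, 3))
Lam = rng.uniform(0.0, 0.9, size=(3, 5))
Z = (rng.random((4, 5)) < noisy_or_prob(U, Lam)).astype(int)
```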

Page 7

A Hierarchy of Latent Features Via a Noisy-OR Mechanism

Page 8

A Hierarchy of Latent Features Via a Noisy-OR Mechanism

• Define an additional random matrix of trial outcomes: inactive upper-layer features ($u_{nj} = 0$) are automatic failures, while active upper-layer features ($u_{nj} = 1$) fail with probability $1 - \lambda_{jk}$.

• For each $(n, k)$, run trials $j = 1, 2, \ldots$ (simulated below):

Success: $z_{nk} = 1$, no further trials.

Failure: move on to trial $j + 1$.

$z_{nk} = 0$ if all trials fail, which happens with probability $\prod_j (1 - \lambda_{jk})^{u_{nj}}$.
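A sketch of this trial process for a single (n, k) pair; the quick Monte Carlo check confirms that the trials reproduce the noisy-or marginal:

```python
import numpy as np

def z_via_trials(u, lam_k, rng):
    """First-success trials: inactive features fail automatically;
    an active feature j succeeds with probability lam_k[j]."""
    for j in range(len(u)):
        if u[j] == 1 and rng.random() < lam_k[j]:
            return 1                 # success: no further trials
    return 0                         # every trial failed

rng = np.random.default_rng(0)
u, lam_k = np.array([1, 0, 1]), np.array([0.3, 0.8, 0.5])
est = np.mean([z_via_trials(u, lam_k, rng) for _ in range(200000)])
print(est)                           # ~ 1 - (1 - 0.3)(1 - 0.5) = 0.65
```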

Page 9

A Hierarchy of Latent Features Via a Noisy-OR Mechanism

• Posterior distributions for the model parameters $\Lambda$ and $\mu$ follow from conjugacy, using the counts (update sketched below):

$m_j$: number of times $u_{nj}$ is active,

$s_{jk}$: number of times that the $j$-th trial was a success for $z_{nk}$,

$f_{jk}$: number of times that the $j$-th trial was a failure for $z_{nk}$ despite $u_{nj}$ being active.

• Integrate out $\Lambda$.
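Under a Beta prior on each $\lambda_{jk}$ (the Beta(a, b) hyperparameters below are an illustrative assumption), these counts give the usual conjugate update, sketched here:

```python
import numpy as np

def draw_lambda(s, f, a=1.0, b=1.0, rng=None):
    """lam_jk | trials ~ Beta(a + s_jk, b + f_jk), elementwise over (j, k)."""
    rng = rng or np.random.default_rng()
    return rng.beta(a + s, b + f)

s = np.array([[12, 3], [0, 7]])      # s_jk: successes of trial j for z_.k
f = np.array([[4, 9], [2, 1]])       # f_jk: failures while u_.j was active
print(draw_lambda(s, f))
```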

Page 10

Inference

• Based on blocked Gibbs sampling and the IBP semi-ordered slice sampler (a structural sketch follows).

• Semi-ordered slice sampling of the upper-layer IBP.

• Semi-ordered slice sampling of the lower-layer factor model.

• An efficient extended blocked Gibbs sampler over the entire model, without approximation.
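A high-level sketch of how such a sweep might be organized; every update below is a stand-in stub, not the paper's actual sampler:

```python
import numpy as np

def slice_sample_upper(U):   # stub: semi-ordered slice step, upper-layer IBP
    return U

def slice_sample_lower(Z):   # stub: semi-ordered slice step, lower layer
    return Z

def resample_weights():      # stub: conjugate draw of Lambda (or integrate out)
    pass

U = np.zeros((10, 3), dtype=int)
Z = np.zeros((10, 5), dtype=int)
for sweep in range(100):     # one blocked Gibbs sweep per iteration
    U = slice_sample_upper(U)
    Z = slice_sample_lower(Z)
    resample_weights()
```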

Page 11

Experiments (I)

• MNIST dataset: 1000 examples of images of the digit 3, preprocessed by projecting onto the first 64 PCA components.

• 500 examples are used for training and the remaining 500 for testing.

• Each data object is modeled with a linear-Gaussian likelihood.

• Random noise (std = 0.5) is added to the preprocessed test set (setup sketched below).

• Recover the noise-free version.
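A sketch of the data setup described above; the use of scikit-learn and fetch_openml is my assumption, not the paper's pipeline:

```python
import numpy as np
from sklearn.datasets import fetch_openml
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
mnist = fetch_openml("mnist_784", version=1, as_frame=False)
threes = mnist.data[mnist.target == "3"][:1000]     # 1000 images of the digit 3

X = PCA(n_components=64).fit_transform(threes)      # first 64 PCA components
X_train, X_test = X[:500], X[500:]                  # 500 train / 500 test
X_noisy = X_test + rng.normal(0.0, 0.5, X_test.shape)  # std-0.5 noise
# The model's task is to recover X_test given X_noisy.
```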

Page 12

Page 13

Experiments (II)

• Music tags: tags and tag frequencies are extracted from the social music website Last.fm (http://www.last.fm/) using the Audioscrobbler web service.

• Dataset: 1000 artists with a vocabulary of 100 tags, representing a total of 312,134 counts.

• Goal: reduce the noisy collection of tags to a sparse representation for each artist.

• The tag counts are modeled with a binomial likelihood, where $C$ is the limit on the number of possible counts achievable; here $C = 100$ (sketched below).
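A sketch of a binomial count model of this shape; the noisy-or-style link from features to tag probabilities below is illustrative, since the slide's equation did not survive the transcript:

```python
import numpy as np

rng = np.random.default_rng(0)
C = 100                                 # cap on achievable counts per tag
N, K, D = 1000, 8, 100                  # artists, latent features, tags

Z = rng.integers(0, 2, size=(N, K))     # binary artist features
W = rng.uniform(0.0, 0.3, size=(K, D))  # illustrative feature-to-tag weights

theta = 1.0 - np.exp(Z @ np.log1p(-W))  # per-artist, per-tag probability
X = rng.binomial(C, theta)              # x_nd ~ Binomial(C, theta_nd)
```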

Page 14

Experiments (II)

• Both layers are sparse.
• Most features at the upper layer use one to three tags.
• Most features at the lower layer cover a broader range of tags.

Tags with the two most probable factors at the upper layer:

Page 15

Experiments (II)

• Comparison among a generalized linear model, the IBP, and the two-layer noisy-or IFM.

• The test data contain 600 artist-tag collections with 90% of the tags missing; the missing tags are imputed from the remaining 10%.

• For the generalized linear model:

• Both the IBP and the noisy-or model perform better than the generalized latent linear model.

Page 16

Conclusions

• A Bayesian nonparametric version of the noisy-or mechanism.

• Extends infinite latent factor models to two or more unbounded layers of factors.

• Efficient inference via a Gibbs sampling procedure.

• Performance is compared with the standard IBP construction.