Lifelong Topic Modelling presentation


Transcript of Lifelong Topic Modelling presentation

Lifelong Topic Modelling: Paper Review Presentation

Daniele Di Mitri

Department of Knowledge Engineering, University of Maastricht

22nd May 2015

Daniele Di Mitri (DKE) Lifelong Topic Modelling 22nd May 2015 1 / 13

Chosen paper

Chen, Zhiyuan, and Bing Liu. "Topic Modeling using Topics from Many Domains, Lifelong Learning and Big Data." Proceedings of the 31st ICML Conference, 2014.


Outline

1 Topic modelling: LDA description; LDA limitations

2 Topic modelling using knowledge: Knowledge-Based Topic Modelling

3 Lifelong Topic Modelling: lifelong learning approach; the proposed algorithm; incorporation of knowledge

4 Evaluation

5 Summary


Latent Dirichlet Allocation (LDA): some useful background

[Figure: four example topics as word distributions (gene 0.04, dna 0.02, genetic 0.01; life 0.02, evolve 0.01, organism 0.01; brain 0.04, neuron 0.02, nerve 0.01; data 0.02, number 0.02, computer 0.01), shown alongside documents with their topic proportions and assignments]

• Each topic is a distribution over words

• Each document is a mixture of corpus-wide topics

• Each word is drawn from one of those topics

Figure: David Blei, Probabilistic Topic Models, 2012
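The bullets above summarise LDA's generative process; a minimal sketch in Python (hypothetical vocabulary, topic count and hyperparameter values, not the paper's code):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy vocabulary and dimensions
vocab = ["gene", "dna", "life", "evolve", "brain", "neuron", "data", "computer"]
n_topics, n_words, doc_len = 3, len(vocab), 20
alpha, beta = 0.5, 0.1  # Dirichlet hyperparameters (illustrative values)

# Each topic is a distribution over words
topics = rng.dirichlet([beta] * n_words, size=n_topics)

# Each document is a mixture of corpus-wide topics
theta = rng.dirichlet([alpha] * n_topics)

# Each word is drawn from one of those topics
doc = []
for _ in range(doc_len):
    z = rng.choice(n_topics, p=theta)      # sample a topic assignment
    w = rng.choice(n_words, p=topics[z])   # sample a word from that topic
    doc.append(vocab[w])

print(doc)
```

Inference (e.g. Gibbs sampling, as used later in LTM) runs in the opposite direction: it recovers the topics and mixtures from observed documents.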


LDA limitations

As an unsupervised model, LDA can produce incoherent topics

Example

LDA sample topics

D1 = {price, color, cost, life}
D2 = {cost, picture, price, expensive}
D3 = {price, money, customer, expensive}

These topics have incoherent words: color, life, picture, customer


Can we use knowledge? Some related works

SUPERVISED

Topic models in supervised settings, e.g. Blei & McAuliffe (2007): all prior knowledge is assumed correct; uses "regions" and "labels".

UNSUPERVISED

Knowledge-Based Topic Modelling, e.g. GK-LDA (Chen et al., 2013) and DF-LDA (Andrzejewski et al., 2009): these typically assume the given knowledge is correct, and they do not automatically extract and target prior knowledge.


Can we do better? A fully automatic system to mine prior knowledge and deal with inconsistencies

INTUITION

If we find a set of words common to two domains, these can serve as prior knowledge

Example

D1 ∩ D2 = {price, cost}
D2 ∩ D3 = {price, expensive}

These are prior knowledge sets (pk-sets)

Example (D1 improved)

D1′ = {price, cost, expensive, color}
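The intersection intuition can be sketched directly (topic sets taken from the slides; plain pairwise intersections stand in for the paper's frequent itemset mining):

```python
from itertools import combinations

# Sample LDA topics from three domains (from the slides)
D1 = {"price", "color", "cost", "life"}
D2 = {"cost", "picture", "price", "expensive"}
D3 = {"price", "money", "customer", "expensive"}

# Words shared between two domains can serve as prior knowledge;
# keep the length-2 subsets of each shared set as pk-sets
pk_sets = set()
for A, B in combinations([D1, D2, D3], 2):
    shared = A & B
    pk_sets.update(frozenset(p) for p in combinations(sorted(shared), 2))

print(pk_sets)
```

Here D1 ∩ D3 = {price} yields no pair, so only {price, cost} and {price, expensive} survive as pk-sets.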


Lifelong Learning approach, in 4 "simple" steps

1 Given a set of domains D = {D1, ..., Dn}, run plain LDA(Di) on each to generate prior topics (p-topics), whose union forms S

2 Given a test domain Dt, run LTM(Dt) to generate c-topics At

3 For each aj ∈ At, find the matching topics Mtj ⊆ S (the high-level knowledge for aj)

4 Mine Mtj to generate pk-sets of length 2

Why Lifelong Learning? The knowledge learnt with LTM is retained and added to (or replaces parts of) the initial prior topics S.
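The four steps above can be sketched end to end (all helpers are hypothetical stand-ins: topics are simplified to plain word sets, and a word-overlap test replaces the KL-divergence matching):

```python
from itertools import combinations
from collections import Counter

def run_lda(domain):
    # Stand-in: assume each domain corpus already yields its p-topics
    return domain["topics"]

def match_topics(topic, S, min_overlap=2):
    # Stand-in for symmetrised-KL matching: keep prior topics
    # sharing at least min_overlap words with the c-topic
    return [s for s in S if len(topic & s) >= min_overlap]

def mine_pk_sets(matched, min_support=2):
    # Frequent Itemset Mining restricted to itemsets of length 2
    counts = Counter()
    for m in matched:
        counts.update(frozenset(p) for p in combinations(sorted(m), 2))
    return {pair for pair, c in counts.items() if c >= min_support}

# Step 1: run LDA on every domain; the union of p-topics forms S
domains = [{"topics": [{"price", "cost", "color"}]},
           {"topics": [{"price", "cost", "expensive"}]}]
S = [t for d in domains for t in run_lda(d)]

# Step 2: c-topics for the test domain (stand-in for LTM(Dt))
A_t = [{"price", "cost", "money"}]

# Steps 3-4: match each c-topic against S, then mine pk-sets
for a_j in A_t:
    M_tj = match_topics(a_j, S)
    pk_sets = mine_pk_sets(M_tj)
    print(a_j, "->", pk_sets)
```

With these toy inputs, only the pair {price, cost} recurs across both matched prior topics, so it is the single mined pk-set.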


LTM algorithm

1 Runs GibbsSampling(Dt, ∅) (equivalent to plain LDA) for N iterations

2 Runs GibbsSampling(Dt, Kt) for N iterations, adding the knowledge Kt

3 Kt is updated at each iteration: each c-topic aj ∈ At is matched to prior topics sk ∈ S by minimum symmetrised KL-divergence, and Frequent Itemset Mining generates the frequent itemsets of length 2 (the pk-sets)
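The matching in step 3 can be sketched with numpy (toy topic-word distributions with hypothetical numbers; a small epsilon avoids log(0)):

```python
import numpy as np

def kl(p, q):
    # Kullback-Leibler divergence KL(p || q)
    return float(np.sum(p * np.log(p / q)))

def sym_kl(p, q):
    # Symmetrised KL-divergence
    return kl(p, q) + kl(q, p)

eps = 1e-12
# Toy topic-word distributions over a 3-word vocabulary
a_j = np.array([0.50, 0.30, 0.20])  # c-topic from the test domain
s_1 = np.array([0.48, 0.32, 0.20])  # similar prior topic
s_2 = np.array([0.05, 0.05, 0.90])  # dissimilar prior topic
a_j, s_1, s_2 = [(v + eps) / (v + eps).sum() for v in (a_j, s_1, s_2)]

# The prior topic with minimum symmetrised KL-divergence is the match
matched = min((s_1, s_2), key=lambda s: sym_kl(a_j, s))
print(sym_kl(a_j, s_1), sym_kl(a_j, s_2))
```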


How does LTM incorporate knowledge?

NB: in the sampler, the count is incremented not by 1 but by a proportion, which is stored in a matrix and determined using Pointwise Mutual Information.

PMI(w1, w2) = log( P(w1, w2) / (P(w1) P(w2)) )
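The PMI above can be estimated from document co-occurrence counts; a small sketch over a hypothetical mini-corpus:

```python
import math

# Hypothetical mini-corpus: each document as its set of words
docs = [
    {"price", "cost", "expensive"},
    {"price", "cost"},
    {"price", "color"},
    {"color", "picture"},
]
n = len(docs)

def p(*words):
    # Fraction of documents containing all the given words
    return sum(1 for d in docs if set(words) <= d) / n

def pmi(w1, w2):
    return math.log(p(w1, w2) / (p(w1) * p(w2)))

print(pmi("price", "cost"))   # > 0: co-occur more often than chance
print(pmi("price", "color"))  # < 0: co-occur less often than chance
```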


Evaluation

Tested against 4 baseline algorithms: LDA, DF-LDA, GK-LDA and AKL

Average Topic Coherence used as the quality measure

Figure: Results of tests in settings 1 & 2


In summary

Lifelong Topic Modelling

Learn prior knowledge

Fault tolerance

First Lifelong Learning Topic model

Big Data ready

However... some points for improvement

Text corpora should be diversified (only Amazon reviews were used)

The focus is mainly on the flow of the algorithm

The 2nd test setting and the test with Big Data are not fully reported


Thank you!

Q&A
