
A PLSA-based Language Model for Conversational Telephone Speech

David Mrva and Philip C. Woodland

2004/12/08 邱炫盛

Outline

• Language Model
• PLSA Model
• Experimental Results
• Conclusion

Language Model

• The task of a language model is to calculate the probability $P(w_i \mid h_i)$ of a word given its history $h_i$.

• n-gram model
  – The range of dependencies is limited to n words
  – Longer-range information is ignored

$$P(w_i \mid h_i) \approx P(w_i \mid w_{i-1}, \ldots, w_{i-n+1})$$
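To make the n-gram definition concrete, here is a minimal sketch (not from the paper) of a maximum-likelihood bigram model; the toy sentence and whitespace tokenisation are illustrative assumptions:

```python
from collections import Counter

# Maximum-likelihood bigram model: P(w_i | w_{i-1}) = c(w_{i-1}, w_i) / c(w_{i-1})
def train_bigram(tokens):
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens[:-1], tokens[1:]))
    def prob(prev, word):
        return bigrams[(prev, word)] / unigrams[prev] if unigrams[prev] else 0.0
    return prob

p = train_bigram("the cat sat on the mat".split())
print(p("the", "cat"))  # 0.5: "the" occurs twice, followed by "cat" once
```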

Language Model (cont.)

• Topic-based language models
  – Latent Semantic Analysis
  – Topic-based language model
  – PLSA-based language model

PLSA Model

• PLSA is a general machine learning technique for modeling the co-occurrences of events.

• Here: the co-occurrence of words and documents

• Hidden variable = aspect

• PLSA in this paper is a mixture of unigram distributions.

PLSA Model (cont.)

[Figure: Graphical Model Representation. Left: a document d directly generates a word w, with probabilities P(d) and P(w|d). Right: the aspect model, in which d selects a hidden topic t with P(t|d) and t generates w with P(w|t).]

PLSA Model (cont.)

[Figure: document d_i generates its words w_1, w_2, w_3, ..., w_j as a mixture of the topic-conditional unigram distributions P(w_j|z_1), P(w_j|z_2), ..., P(w_j|z_K), with mixture weights P(z_1|d_i), P(z_2|d_i), ..., P(z_K|d_i).]

PLSA Model (cont.)

$$P(w_j \mid d_i) = P(z_1 \mid d_i)P(w_j \mid z_1) + P(z_2 \mid d_i)P(w_j \mid z_2) + \ldots + P(z_K \mid d_i)P(w_j \mid z_K) = \sum_{k=1}^{K} P(z_k \mid d_i)\,P(w_j \mid z_k)$$

$$L = \log \prod_{i=1}^{N} \prod_{j=1}^{M} P(w_j \mid d_i)^{n(d_i,w_j)} = \sum_{i=1}^{N} \sum_{j=1}^{M} n(d_i,w_j) \log \sum_{k=1}^{K} P(z_k \mid d_i)\,P(w_j \mid z_k)$$

M: number of words in the vocabulary
N: number of documents in the training collection
K: number of aspects or topics
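As an illustration of this likelihood (a sketch, not code from the paper), the following computes L from a count matrix and the two PLSA parameter matrices; the array names and shapes are assumptions:

```python
import numpy as np

def plsa_log_likelihood(n, p_w_given_z, p_z_given_d):
    """L = sum_{i,j} n(d_i, w_j) * log sum_k P(z_k | d_i) P(w_j | z_k).

    n:           (N, M) count matrix, n[i, j] = n(d_i, w_j)
    p_w_given_z: (K, M) topic-conditional unigram distributions
    p_z_given_d: (N, K) per-document topic weights
    """
    p_w_given_d = p_z_given_d @ p_w_given_z          # (N, M) mixture of unigrams
    return float(np.sum(n * np.log(p_w_given_d + 1e-12)))  # small floor avoids log(0)
```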

PLSA Model (cont.)

Step E: introduce an auxiliary distribution $\tilde p(z_k \mid d_i, w_j)$ over the hidden aspect. By Jensen's inequality, the log-likelihood of each word is lower-bounded:

$$\log p(w_j \mid d_i) = \log \sum_{k=1}^{K} \tilde p(z_k \mid d_i, w_j)\, \frac{p(w_j, z_k \mid d_i)}{\tilde p(z_k \mid d_i, w_j)} \;\ge\; \sum_{k=1}^{K} \tilde p(z_k \mid d_i, w_j) \log \frac{p(w_j, z_k \mid d_i)}{\tilde p(z_k \mid d_i, w_j)} = \tilde H$$

PLSA Model (cont.)

$\tilde H$ is maximised with respect to $\tilde p$ under the constraint $\sum_{k=1}^{K} \tilde p(z_k \mid d_i, w_j) = 1$. The gap $\log p(w_j \mid d_i) - \tilde H \ge 0$ vanishes exactly when $\tilde p(z_k \mid d_i, w_j)$ equals the true posterior $p(z_k \mid d_i, w_j)$. Because the word and the document are conditionally independent given the aspect,

$$p(w_j, z_k \mid d_i) = P(w_j \mid z_k)\, P(z_k \mid d_i)$$

so the E-step posterior is

$$p(z_k \mid d_i, w_j) = \frac{P(w_j \mid z_k)\, P(z_k \mid d_i)}{\sum_{l=1}^{K} P(w_j \mid z_l)\, P(z_l \mid d_i)}$$
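A minimal NumPy sketch of this E-step (illustrative, not from the paper); the (N, M, K) array layout is an assumption:

```python
import numpy as np

def e_step(p_w_given_z, p_z_given_d):
    """Topic posterior p(z_k | d_i, w_j) for every (document, word) pair.

    Returns an (N, M, K) array proportional to P(z_k | d_i) * P(w_j | z_k),
    normalised over k."""
    joint = p_z_given_d[:, None, :] * p_w_given_z.T[None, :, :]   # (N, M, K)
    return joint / joint.sum(axis=2, keepdims=True)
```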

PLSA Model (cont.)

Step M: maximise the expected complete-data log-likelihood

$$E[L_c] = \sum_{i=1}^{N} \sum_{j=1}^{M} n(d_i, w_j) \sum_{k=1}^{K} p(z_k \mid d_i, w_j) \log \big[ P(w_j \mid z_k)\, P(z_k \mid d_i) \big]$$

subject to the normalisation constraints, enforced with Lagrange multipliers $\tau_{z_k}$ and $\tau_{d_i}$:

$$H = E[L_c] + \sum_{k=1}^{K} \tau_{z_k} \Big(1 - \sum_{j=1}^{M} P(w_j \mid z_k)\Big) + \sum_{i=1}^{N} \tau_{d_i} \Big(1 - \sum_{k=1}^{K} P(z_k \mid d_i)\Big)$$

PLSA Model (cont.)

Taking derivatives of the Lagrangian and setting them to zero gives the re-estimation formulas:

$$P(w_j \mid z_k) = \frac{\sum_{i=1}^{N} n(d_i, w_j)\, p(z_k \mid d_i, w_j)}{\sum_{m=1}^{M} \sum_{i=1}^{N} n(d_i, w_m)\, p(z_k \mid d_i, w_m)}$$

$$P(z_k \mid d_i) = \frac{\sum_{j=1}^{M} n(d_i, w_j)\, p(z_k \mid d_i, w_j)}{n(d_i)}$$

where $n(d_i) = \sum_{j=1}^{M} n(d_i, w_j)$ is the length of document $d_i$.
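A matching NumPy sketch of these re-estimation formulas (illustrative; the array layout follows the E-step sketch above):

```python
import numpy as np

def m_step(n, posterior):
    """Re-estimate P(w_j | z_k) and P(z_k | d_i) from the E-step posteriors.

    n:         (N, M) counts n(d_i, w_j)
    posterior: (N, M, K) p(z_k | d_i, w_j)
    """
    weighted = n[:, :, None] * posterior                   # n(d_i,w_j) p(z_k|d_i,w_j)
    p_w_given_z = weighted.sum(axis=0)                     # (M, K), summed over documents
    p_w_given_z /= p_w_given_z.sum(axis=0, keepdims=True)  # normalise over the vocabulary
    p_z_given_d = weighted.sum(axis=1) / n.sum(axis=1, keepdims=True)  # divide by n(d_i)
    return p_w_given_z.T, p_z_given_d                      # (K, M) and (N, K)
```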

PLSA Model (cont.)

Use PLSA in a language model: the $P(z_k \mid d_i)$ are used as mixture weights when calculating the word probability. On the test set, the history $h_i$ is used instead of $d_i$, and the weights are re-estimated with the topic-conditional distributions $P(w \mid z_k)$ held fixed:

$$p(z_k \mid h_i, w) = \frac{P(w \mid z_k)\, p(z_k \mid h_i)}{\sum_{q=1}^{K} P(w \mid z_q)\, p(z_q \mid h_i)}$$

$$\hat p(z_k \mid h_i) = \frac{1}{n(h_i)} \sum_{w \in h_i} n(h_i, w)\, p(z_k \mid h_i, w)$$
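A sketch of this folding-in procedure under the stated assumptions (fixed P(w|z), EM on the topic weights only); the iteration count and uniform initialisation are illustrative choices:

```python
import numpy as np

def reestimate_topic_weights(history_counts, p_w_given_z, iters=10):
    """Re-estimate p(z_k | h) on a test history with P(w|z) held fixed.

    history_counts: (M,) word counts n(h, w) over the history
    p_w_given_z:    (K, M) fixed topic-conditional unigrams from training
    """
    K = p_w_given_z.shape[0]
    p_z = np.full(K, 1.0 / K)                      # uniform initial topic weights
    for _ in range(iters):
        # E-step: p(z_k | h, w) proportional to P(w | z_k) p(z_k | h)
        post = p_w_given_z * p_z[:, None]          # (K, M)
        post /= post.sum(axis=0, keepdims=True)
        # M-step: p(z_k | h) = (1/n(h)) sum_w n(h, w) p(z_k | h, w)
        p_z = (post * history_counts[None, :]).sum(axis=1) / history_counts.sum()
    return p_z
```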

PLSA Model (cont.)

Because of recognition errors in the history, and because not enough information is available to the model about the topic of the document, each word is weighted by its confidence score and the estimate is smoothed with a prior topic distribution:

$$p(z_k \mid h, w_i) = \frac{P(w_i \mid z_k)\, p(z_k \mid h)}{\sum_{q=1}^{K} P(w_i \mid z_q)\, p(z_q \mid h)}$$

$$\hat p(z_k \mid h) = \frac{1}{b + \sum_{i} cs(i)} \Big( b\, p(z_k) + \sum_{i} cs(i)\, p(z_k \mid h, w_i) \Big)$$

cs(i): the confidence score of the i-th word
b: the weight of the prior topic distribution
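A hedged sketch of one plausible reading of this update, in which each word's posterior is scaled by its confidence score and the estimate is smoothed towards the prior topic distribution with weight b; the exact functional form in the paper may differ:

```python
import numpy as np

def confidence_weighted_topic_weights(word_ids, conf, p_w_given_z, p_z_prior, b, iters=10):
    """Topic weights from a recognized history: confidence-scaled counts,
    smoothed with the prior topic distribution p(z_k) using weight b.

    word_ids: indices of the recognized history words into the vocabulary
    conf:     (n_words,) confidence scores cs(i)
    """
    p_z = p_z_prior.copy()
    for _ in range(iters):
        post = p_w_given_z[:, word_ids] * p_z[:, None]   # (K, n_words)
        post /= post.sum(axis=0, keepdims=True)          # p(z_k | h, w_i)
        # blend confidence-weighted posteriors with the prior
        p_z = (b * p_z_prior + (post * conf[None, :]).sum(axis=1)) / (b + conf.sum())
    return p_z
```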

PLSA Model (cont.)

• PLSA accounts for the whole document history of a word, irrespective of the document length.
• PLSA has no means of representing word order, because it is a mixture of unigram distributions.

Combine the n-gram with PLSA:

$$P(w_i \mid h_i) = \frac{P_{n\text{-}gram}(w_i \mid h_i)\, P_{PLSA}(w_i \mid h_i)}{P_{unigram}(w_i)}$$

When PLSA is used in decoding, a Viterbi-based decoder is not suitable. Two-pass decoder:
• First pass: n-gram, outputting a confidence score
• Second pass: PLSA, rescoring the lattices
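A small sketch of this combination (illustrative); renormalising over the vocabulary, which the slide's formula leaves implicit, is an assumption here:

```python
import numpy as np

def combined_word_prob(p_ngram, p_plsa, p_unigram):
    """Combine per-vocabulary-word probabilities:
    P(w | h) proportional to P_ngram(w | h) * P_PLSA(w | h) / P_unigram(w)."""
    scores = p_ngram * p_plsa / p_unigram
    return scores / scores.sum()   # renormalise so the result sums to one
```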

PLSA Model (cont.)

• During re-scoring, the PLSA history comprises all segments in a document except the current segment.
• The PLSA history is fixed for all words in a given segment.
• The "history" is referred to as the "context" (ctx): it contains both past and future words.

Experimental Results

Two test sets:
• NIST's Hub5 speech-to-text evaluation 2002 (eval02)
  – Switchboard I and II
  – 62k words, 19k from Switchboard I
• NIST's Rich Transcription Spring 2003 CTS speech-to-text evaluation (eval03)
  – Switchboard II phase 5 and Fisher
  – 74k words, 36k from Fisher

Experimental Results (cont.)

• The perplexity reduction is greater when PLSA's training text is related to the test set.
• The perplexity with the reference context (ref.ctx, b=10) is lower than with the recognized context (rec.ctx, b=10).
• b=10 is the best value.
• Use of confidence scores makes the PLSA model less sensitive to b.
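For reference, a one-line sketch of the perplexity (PP) measure used in these comparisons; the per-word log-probabilities are assumed to come from the combined model:

```python
import math

def perplexity(log_probs):
    """PP = exp(-(1/N) * sum_i log P(w_i | ctx_i)) over the N test words."""
    return math.exp(-sum(log_probs) / len(log_probs))
```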

Experimental Results (cont.)

• Baseline: an n-gram trained on 20M words of Fisher transcripts, with the number of classes in the class-based model increased to 500.
• PLSA: 750 aspects, 100 EM iterations.
• eval03 was separated into eval03dev and eval03tst.
  – The interpolation weights of the word-based and class-based n-grams were set to minimize perplexity.
  – A slight improvement was obtained when side-based documents were used.

Experimental Results (cont.)

• b=100 is the best value.
  – The PLSA model needs much more data to estimate the topic of Fisher than of Switchboard I.
• Having a long context is very important.

Conclusion

• PLSA with the suggested modifications reduces the perplexity of a language model.

• Future work:
  – Re-score lattices to calculate WERs
  – Combine the semantics-oriented model with a syntax-based language model