The MEI Team August 23, 2000

Mandarin-English Information (MEI): Investigating Translingual Speech Retrieval Johns Hopkins University Center of Language and Speech Processing Summer Workshop 2000 The MEI Team August 23, 2000


Page 1: The MEI Team August 23, 2000

Mandarin-English Information (MEI): Investigating Translingual Speech Retrieval

Johns Hopkins University Center of Language and Speech Processing

Summer Workshop 2000

The MEI Team

August 23, 2000

Page 2: The MEI Team August 23, 2000

MEI Team

• Senior Members
– Helen Meng, Chinese University of Hong Kong
– Erika Grams, Advanced Analytic Tools
– Sanjeev Khudanpur, Johns Hopkins University
– Gina-Anne Levow, University of Maryland
– Douglas Oard, University of Maryland
– Patrick Schone, US Department of Defense
– Hsin-Min Wang, Academia Sinica, Taiwan

• Students
– Berlin Chen, National Taiwan University
– Wai-Kit Lo, Chinese University of Hong Kong
– Karen Tang, Princeton University
– Jianqiang Wang, University of Maryland

Page 3: The MEI Team August 23, 2000

Outline
• Motivation
• Background
• The Multi-scale Paradigm
– multi-scale query processing
– multi-scale document indexing
– multi-scale retrieval
• The Perfect Retrieval Myth
• Experiments and Findings
• Conclusions and Future Work

Page 4: The MEI Team August 23, 2000

Motivation

• Monolingual speech retrieval applications are emerging, e.g.
– http://speechbot.research.compaq.com

[Chart: Internet-accessible Radio and Television Stations, English vs. Other Languages; values shown: 529 and 1367. Source: www.real.com, Feb 2000]

Page 5: The MEI Team August 23, 2000

Motivation (cont): Internet User Population

[Charts: Internet user population by language, 2000 and 2005; labeled segments include English and Chinese. Source: Global Reach]

Page 6: The MEI Team August 23, 2000

MEI: The Big Picture

Diagram components:
• English Text Query (Exemplar)
• English-to-Chinese Translation
• Mandarin Audio News Broadcasts
• Mandarin Audio Indexing (ASR)
• Retrieval Engine
• Ranked List of Mandarin Spoken Documents
• Speech-to-Speech Translation
• English Spoken Documents
• Interactive Refinement

Page 7: The MEI Team August 23, 2000

Concept Demo

Karen, Erika

Page 8: The MEI Team August 23, 2000

Two Prevailing Problems in CL-SDR
• Translation problem
– out-of-vocabulary (OOV) in translation
– too many translations
• Recognition problem
– OOV in recognition
– acoustic confusions
• Solution: subword units may help
– transliteration, e.g. Northern Ireland --> /bei3 ai4 er3 lan2/ (in query)
– recognition of subword units, e.g. Iraq --> a rock (in document)

Page 9: The MEI Team August 23, 2000

Background for Mandarin Speech Recognition

• 400 syllables – full phonological coverage in Mandarin Chinese

• 6,800 characters – full textual coverage in written Chinese (GB-coded)

– each character pronounced as a syllable

• Unknown number of Chinese words
– one to several characters per word
– character combinations create different meanings
– ambiguity in word tokenization

Page 10: The MEI Team August 23, 2000

OOV and Acoustic Confusions in Mandarin SDR

Query: …Iraq...

Page 11: The MEI Team August 23, 2000

Subwords for Retrieval

Pros
• Character n-grams
– robust to word-level mismatches due to different tokenization
• Syllable n-grams
– robust to word/character-level mismatches due to homophones
• Partial matches possible

Con
• Subwords contain reduced lexical knowledge cf. words
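The robustness claims above can be illustrated with a minimal Python sketch. The two tokenizations of 北爱尔兰 (Northern Ireland) and the tone-less syllable sequences for Kosovo are illustrative assumptions, and the helper name `ngrams` is hypothetical rather than part of the MEI system.

```python
def ngrams(units, n=2):
    """Return the overlapping n-grams of a sequence of units as tuples."""
    return [tuple(units[i:i + n]) for i in range(len(units) - n + 1)]

# Two hypothetical word tokenizations of the same character string.
tokens_a = ["北", "爱尔兰"]
tokens_b = ["北爱尔兰"]

# Word-level index terms disagree, so a word index finds no match.
word_overlap = set(tokens_a) & set(tokens_b)

# Character bigrams ignore word boundaries, so both tokenizations agree.
char_bigrams_a = ngrams("".join(tokens_a))
char_bigrams_b = ngrams("".join(tokens_b))

# Partial match: two syllable renderings that differ in the last syllable
# still share one syllable bigram.
shared = set(ngrams(["ke", "suo", "wo"])) & set(ngrams(["ke", "suo", "fo"]))
```

Here the word-level overlap is empty while the character bigram lists are identical, and the two syllable sequences still share the bigram (ke, suo).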

Page 12: The MEI Team August 23, 2000

The MEI Investigation
• Use of a multi-scale representation for crosslingual spoken document retrieval (CL-SDR)
• Words and subwords

Research Challenges
• Multi-scale query translation
• Multi-scale audio indexing
• Multi-scale retrieval

Page 13: The MEI Team August 23, 2000

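Query by Example

English Newswire Exemplars:

President Bill Clinton and Chinese President Jiang Zemin engaged in a spirited, televised debate Saturday over human rights and the Tiananmen Square crackdown, and announced a string of agreements on arms control, energy and environmental matters. There were no announced breakthroughs on American human rights concerns, including Tibet, but both leaders accentuated the positive …

Mandarin Audio Stories:

美国总统克林顿的助手赞扬中国官员允许电视现场直播克林顿和江泽民在首脑会晤后举行的联合记者招待会。。特别是一九八九镇压民主运动的决定。他表示镇压天安门民主运动是错误的 , 他还批评了中国对西藏精神领袖达 国家安全事务助理伯格表示 , 这次直播让中国人第一次在种公开的论坛上听到围绕敏感的人权问题的讨论。在记者招待会上 …

[English gloss: Aides to US President Clinton praised Chinese officials for allowing live television coverage of the joint press conference that Clinton and Jiang Zemin held after their summit. … in particular the 1989 decision to suppress the democracy movement. He said that suppressing the Tiananmen democracy movement was wrong, and he also criticized China's … toward the Tibetan spiritual leader Da… National security adviser Berger said the live broadcast let Chinese people hear, for the first time in such a public forum, a discussion of the sensitive issue of human rights. At the press conference …]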

Page 14: The MEI Team August 23, 2000

Evaluation Collection

• Development Collection: TDT-2
– 2265 manually segmented stories
– 17 topics, variable number of exemplars
– timeline labels: Jan 98, Mar 98, Jun 98

• Evaluation Collection: TDT-3
– 3371 manually segmented stories
– 56 topics, variable number of exemplars
– timeline labels: Oct 98, Dec 98

• English text topic exemplars: Associated Press, New York Times
• Mandarin audio broadcast news: Voice of America
• Exhaustive relevance assessment based on event overlap

Page 15: The MEI Team August 23, 2000

Abstract Task Model

Diagram components:
• American English Text Exemplar
• Mandarin Chinese Broadcast News
• Cross-Language Speech Retrieval
• Ranked List of News Stories

Page 16: The MEI Team August 23, 2000

Evaluation of Ranked Lists

Rank  Story          Relevance Judgment
1     VOA 0427.22    Relevant
2     VOA 0521.14    Not
3     VOA 0604.39    Not
4     VOA 0419.12    Relevant
5     VOA 0527.13    Not
6     VOA 0513.17    Relevant
…
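As a rough illustration of how a ranked list with relevance judgments becomes a single number, the following Python sketch computes uninterpolated average precision for the six stories listed above. It assumes, purely for illustration, that the list ends after six stories and that no other relevant stories exist; the function name and the `total_relevant` parameter are hypothetical.

```python
def average_precision(judgments, total_relevant=None):
    """Uninterpolated average precision for one ranked list.

    judgments: booleans in rank order (True = relevant).
    total_relevant: number of relevant documents in the collection;
    defaults to the number found in the list (an assumption).
    """
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(judgments, start=1):
        if relevant:
            hits += 1
            # Precision measured at the rank of each relevant document.
            precision_sum += hits / rank
    if total_relevant is None:
        total_relevant = hits
    return precision_sum / total_relevant if total_relevant else 0.0

# Judgments for the six stories on this slide, in rank order.
ranked = [True, False, False, True, False, True]
ap = average_precision(ranked)
```

Under those assumptions the relevant stories sit at ranks 1, 4 and 6, so the value is (1/1 + 2/4 + 3/6) / 3 = 2/3. Averaging this quantity over topics gives the mean uninterpolated average precision used on later slides.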

Page 17: The MEI Team August 23, 2000

Recall-Precision Graph

[Plot: Interpolated Precision (y-axis, 0.0 to 1.0) vs. Recall (x-axis, 0.0 to 1.0)]

Page 18: The MEI Team August 23, 2000

Variation Across Exemplars

[Plot: Interpolated Precision (y-axis, 0.0 to 1.0) vs. Recall (x-axis, 0.0 to 1.0)]

Page 19: The MEI Team August 23, 2000

Average Across Exemplars

[Plot: Mean Interpolated Precision (y-axis, 0.0 to 1.0) vs. Recall (x-axis, 0.0 to 1.0); value shown: 0.353]

Page 20: The MEI Team August 23, 2000

Variation Across Topics

[Bar chart: Mean Uninterpolated Average Precision (y-axis, 0.0 to 1.0) by Topic (x-axis)]

Page 21: The MEI Team August 23, 2000

Comparing Two Systems

[Bar chart: Mean Uninterpolated Average Precision (y-axis, 0.0 to 1.0) by Topic (x-axis); series: System A, System B]

Page 22: The MEI Team August 23, 2000

Significance Testing

• Statistical significance
– Null hypothesis: mean average precision across topics is drawn from same distribution
– Paired 2-tailed t-test, significant if p<0.05
– For System A vs. System B, p=0.94

• Meaningful differences
– Rule of thumb: 5-10% relative
– For System A vs. System B, relative difference is <1%
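The test described above can be sketched in plain Python. This is a minimal illustration, not the workshop's tooling: the per-topic scores are made-up placeholders, the function names are hypothetical, and the p-value comes from numerically integrating the Student t density with Simpson's rule rather than from a statistics library.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    return math.exp(log_c) / math.sqrt(df * math.pi) * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=10000):
    """Two-tailed p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    upper = abs(t)
    if upper == 0.0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1.0 - 2.0 * total * h / 3.0)

def paired_t_test(a, b):
    """Paired t-test on per-topic scores; returns (t statistic, p-value)."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    if var == 0.0:
        raise ValueError("all paired differences are identical")
    t = mean / math.sqrt(var / n)
    return t, two_tailed_p(t, n - 1)

def relative_difference(a, b):
    """Relative difference of mean score of a with respect to b."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return (mean_a - mean_b) / mean_b

# Hypothetical per-topic average precision for two systems.
system_a = [0.42, 0.61, 0.35, 0.58, 0.47]
system_b = [0.40, 0.63, 0.36, 0.55, 0.48]
t_stat, p_value = paired_t_test(system_a, system_b)
rel = relative_difference(system_a, system_b)
```

Following the slide, a difference would be reported only if `p_value` falls below 0.05 and, by the rule of thumb, if `abs(rel)` reaches roughly 5-10%.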

Page 23: The MEI Team August 23, 2000

Translingual and Multi-Scale Query Processing

Page 24: The MEI Team August 23, 2000

Components shown in the system diagram:
• Query side: English Exemplar ("President Bill Clinton and…"), Named Entity Tagging, Term Selection, Term Translation (with Bilingual Term List), Query Construction
• Document side: Mandarin Audio, Story Boundaries, Speech Recognition, Document Construction
• Retrieval and scoring: Mandarin IR System, Ranked List, Relevance Judgments, Evaluation (Mean Uninterpolated Average Precision)
• Source labels on the diagram: BBN, U Mass, LDC, Cornell, Dragon

Page 25: The MEI Team August 23, 2000

Multi-Scale Query Translation

• Words and Phrases

(Gina, Sanjeev)

• Subwords

(Helen, Wai-Kit, Berlin, Karen)

Page 26: The MEI Team August 23, 2000

Bilingual Term List

• Combination of
– LDC English-Chinese bilingual term list
– Chinese-English Translation Assistance File (CETA) [inverted]

Total English Terms            199,444
Total Translation Pairs        395,216
Phrasal Terms                   81,127
Phrasal Translation Pairs      105,750

Term             # translations
human            7
right(s)         30
human rights     1

Page 27: The MEI Team August 23, 2000

Query Term Selection

• Tagged named entities (BBN Identifinder)
– Person: partners of Goldman, Sachs, & Co.
– Organization: UN Security Council

• Dictionary-based “phrases”
– translatable multi-word units, e.g. “Wall Street”, “best interests”, “guiding principles”, “human rights”
– automatic tagging: greedy, left-to-right, max match

• Chi-squared filtering
– Compared to English background model
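The greedy, left-to-right, maximum-match tagging mentioned above can be sketched as follows. This is a minimal illustration under assumptions: the phrase list holds only the examples from this slide, and the function name, the `max_len` cap, and the lowercase matching are hypothetical choices.

```python
def tag_phrases(tokens, phrase_list, max_len=4):
    """Greedy left-to-right maximum-match phrase tagging.

    At each position, take the longest dictionary phrase that starts
    there; otherwise emit the single token and move on.
    """
    phrases = {tuple(p.lower().split()) for p in phrase_list}
    out = []
    i = 0
    while i < len(tokens):
        match_len = 1
        # Try the longest candidate first, down to two tokens.
        for n in range(min(max_len, len(tokens) - i), 1, -1):
            if tuple(t.lower() for t in tokens[i:i + n]) in phrases:
                match_len = n
                break
        out.append(" ".join(tokens[i:i + match_len]))
        i += match_len
    return out

phrase_list = ["Wall Street", "best interests", "guiding principles", "human rights"]
tokens = "concerns about human rights on Wall Street".split()
terms = tag_phrases(tokens, phrase_list)
```

With this input, "human rights" and "Wall Street" come out as single terms and the remaining words stay as individual tokens.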

Page 28: The MEI Team August 23, 2000

Query Term Translation

• Named entities
– if absent from dictionary, translate individual terms
– e.g. “Security Council” versus “First Bank of Siam”

• Numeric Expressions
– special processing for digits
– e.g. “12:30 pm, June 15, 1969”

• Remaining terms
– Consult bilingual term list, lemmatize if necessary
– e.g. “televised” translates as “television”

Page 29: The MEI Team August 23, 2000

Query Construction

• Unbalanced queries
– Use all plausible translations for each term

• Balanced queries
– Pseudo-term weight: average of translations’ weights

• Structured queries
– Recompute pseudo-term weight from translations’ term frequency and document frequency
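The three strategies above differ in how one English term's translations contribute to a document score. The Python sketch below is one possible reading, not the workshop's implementation: it assumes a toy tf-idf weight, treats the structured query as pooling the translations' term frequencies and taking the document frequency of their union, and uses placeholder term names (`zh_a`, `zh_b`) and a made-up four-document collection.

```python
import math

def tfidf(tf, df, n_docs):
    """Toy term weight (an assumption; any IR weighting could be used)."""
    return tf * math.log(n_docs / df) if tf and df else 0.0

def unbalanced(doc_tf, postings, n_docs, translations):
    """Every translation is its own query term; weights simply add up."""
    return sum(tfidf(doc_tf.get(t, 0), len(postings.get(t, ())), n_docs)
               for t in translations)

def balanced(doc_tf, postings, n_docs, translations):
    """Pseudo-term weight is the average of the translations' weights."""
    if not translations:
        return 0.0
    return unbalanced(doc_tf, postings, n_docs, translations) / len(translations)

def structured(doc_tf, postings, n_docs, translations):
    """Pseudo-term weight recomputed from pooled tf and union df."""
    tf = sum(doc_tf.get(t, 0) for t in translations)
    docs = set().union(*(postings.get(t, set()) for t in translations))
    return tfidf(tf, len(docs), n_docs)

# Hypothetical collection of 4 documents and one document's term counts.
n_docs = 4
postings = {"zh_a": {1, 2}, "zh_b": {2, 3}}   # term -> ids of docs containing it
doc_tf = {"zh_a": 2}                          # counts in the document being scored
translations = ["zh_a", "zh_b"]               # translations of one English term

u = unbalanced(doc_tf, postings, n_docs, translations)
b = balanced(doc_tf, postings, n_docs, translations)
s = structured(doc_tf, postings, n_docs, translations)
```

The point of the comparison is that an unbalanced query lets terms with many translations dominate the score, while the balanced and structured forms collapse each English term into a single pseudo-term.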

Page 30: The MEI Team August 23, 2000

Strategies in Query Translation

• Phrase based translation is significantly better
• Named entities and numeral translations are (barely) helpful
• Balanced translation matches Structured queries
– also extends easily to subword units

[Bar chart: Mean Average Precision (y-axis, 0.2 to 0.8) by Strategy (x-axis): Words, Phrases, NE/NUMEX, Balanced]

Page 31: The MEI Team August 23, 2000

Untranslatable Terms

Terms     # (by token)     # (by type)
total     87,004           12,402
OOV       3,028            1,122

Example untranslatable terms (# of occurrences):
suharto 97
netanyahu 88
starr 62
arafat 50
bjp 45
vajpayee 44
estrada 44
….
hsu 19
zemin 7

Page 32: The MEI Team August 23, 2000

Subword Transliteration

English Query Exemplar

Mandarin Audio Document

……..Kosovo…...

…../ke-suo-fo/….

Sound alike --> match in phonetic space?

Kosovo (/ke1-suo3-wo4/, /ke1-suo3-fo2/, /ke1-suo3-fu1/, /ke1-suo3-fu2/)

Page 33: The MEI Team August 23, 2000

Subword Transliteration Procedure (1)

Named Entities
• PinYin / Wade-Giles spellings, e.g. Wang Jianqiang, Wang Hsinmin
– Syllables, e.g. wang jian qiang, wang xin min
• Acquire English pronunciation, e.g. christopher
– PRONLEX lookup
– Spelling-to-pronunciation generation
– English phones, e.g. /kk rr ih ss tt aa ff er/
• Transformation-based error-driven learning [Brill 1994]
– PRONLEX: 85K (train), 4.5K (test)
– 82% (phoneme), 45% (word)

Page 34: The MEI Team August 23, 2000

Subword Transliteration Procedure (2)

• English phones: /kk rr ih ss tt aa ff er/

• Cross-lingual Phonological Rules
– Syllable nuclei insertion
– Handle consonant clusters, word-final consonants, etc.
– Result: /kk ax rr ih ss ax tt aa ff er/

• Cross-lingual Phonetic Mapping: English phones to Chinese “phones”
– Transformation-based error-driven learning
– 4800 words (train) [Chen H. H., NTU; WWW]
– FST aligns Eng / Chin phones
– Result: /k e l i s i t uo f u/

• Chinese phone lattice generation
– Syllable bigram language model
– N-best syllable sequence hypotheses

• N=1 (one-best hypothesis)
– /ji li si te fu/ (hyp)
– /ke li si tuo fu/ (ref)
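The syllable nuclei insertion step can be sketched with a single toy rule that reproduces the slide's christopher example. This is only an illustration under assumptions: the real rule set is not given in the slides, the vowel inventory here covers just the phones in the example, and the names `VOWELS`, `insert_nuclei`, and `handle_final` are hypothetical.

```python
# Phones treated as syllable nuclei (an assumption covering only the example).
VOWELS = {"ih", "aa", "er", "ax"}

def insert_nuclei(phones, filler="ax", handle_final=False):
    """Insert a filler vowel after any consonant followed by a consonant.

    With handle_final=True a filler is also appended after a word-final
    consonant (one guess at how such consonants might be handled).
    """
    out = []
    for i, phone in enumerate(phones):
        out.append(phone)
        if phone in VOWELS:
            continue
        nxt = phones[i + 1] if i + 1 < len(phones) else None
        if nxt is None:
            if handle_final:
                out.append(filler)
        elif nxt not in VOWELS:
            out.append(filler)
    return out

english = "kk rr ih ss tt aa ff er".split()
with_nuclei = insert_nuclei(english)
```

Applied to the phones for christopher, this breaks up the /kk rr/ and /ss tt/ clusters, so every consonant is followed by a vowel before the phones are mapped to Chinese syllables.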

Page 35: The MEI Team August 23, 2000

Cross Lingual Phonetic Matching

• Documents are indexed with syllable bigrams (in addition to words and character bigrams if necessary)
• Query terms are translated as words where possible, phonetically where necessary

[Bar chart: Mean Average Precision (y-axis, 0.2 to 0.8) by Indexing Terms (x-axis): Word, Char, Syllable; series: no CLPM, CLPM]

Page 36: The MEI Team August 23, 2000

Multi-Scale Query Construction

Helen

Page 37: The MEI Team August 23, 2000

Multi-Scale Query Construction: Objectives

Bag of English query terms (selected) --> Query Construction --> Multi-scale query representation in Chinese

Multi-scale representation integrates:
• translated phrases, named entities, numeric expressions, translated terms
• transliterated syllables
• words, characters and syllable n-grams

Page 38: The MEI Team August 23, 2000

Multi-Scale Query Construction Procedures

English Bag of Terms:
Israeli <Ph>Prime Minister</Ph> <NE>Benjamin Netanyahu</NE>

• words + syl bigrams: Chinese Translations and Transliteration
– ne-tan tan-ya ya-hu
• char + syl bigrams: Character bigrams and Transliterations
– ne-tan tan-ya ya-hu
• syl bigrams: Syllable bigrams and Transliterations
– yi-se se-lie shou-xiang ben-jie jie-ming ne-tan tan-ya ya-hu
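The syllable-bigram form of the example query can be reproduced with a short Python sketch. It assumes that each translated or transliterated unit arrives as its own syllable sequence and that bigrams do not cross unit boundaries, which matches the bigram list on the slide; the function names are hypothetical.

```python
def syllable_bigrams(syllables):
    """Overlapping syllable bigrams of one unit, written as 'a-b'."""
    return ["-".join(syllables[i:i + 2]) for i in range(len(syllables) - 1)]

def syl_bigram_query(units):
    """Concatenate the bigrams of each unit; bigrams never cross units."""
    query = []
    for unit in units:
        query.extend(syllable_bigrams(unit))
    return query

# Syllable sequences for the slide's example, one list per unit.
units = [
    ["yi", "se", "lie"],          # Israeli
    ["shou", "xiang"],            # Prime Minister
    ["ben", "jie", "ming"],       # Benjamin
    ["ne", "tan", "ya", "hu"],    # Netanyahu (transliterated)
]
query = " ".join(syl_bigram_query(units))
```

The other two query forms would, on this reading, place word or character-bigram terms alongside the transliteration bigrams in the same bag of terms.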

Page 39: The MEI Team August 23, 2000

Multi-Scale Audio Document Indexing

Hsin-min, Helen, Berlin, and Wai-kit

Page 40: The MEI Team August 23, 2000

Previous Chinese Example

Page 41: The MEI Team August 23, 2000

Audio Document Indexing Objectives

• Augment words with subword-based indexing
• Dragon word recognition outputs are provided
• Character-based indexing
– Characters derived from Dragon’s recognized words
• Syllable-based indexing
– Syllables derived by pronunciation lookup using Dragon’s recognized words
• Address Dragon’s ASR errors
– Augment with alternative (word/char/syl) hypotheses, e.g. syllable lattice [Chen & Wang, ICASSP-2000]

Page 42: The MEI Team August 23, 2000

Syllable Lattice Development

• Dragon’s recognition accuracies
– Evaluated against anchor scripts
– 82.0% (word), 87.9% (char), 92.1% (syl)
– Syllable substitution errors (5.2%)

• Develop a syllable recognizer to produce lattice representation

• MEI’s syllable recognition accuracy
– Trained on Hub4 Mandarin (VOA, 11 hours, 1997)
– 70.2% (syl) !!!

Diagram labels: Dragon’s syl, Alternative syl

Page 43: The MEI Team August 23, 2000

Strategy

• Improve MEI’s syllable recognizer

• Design a structure for document indexing which incorporates– Dragon’s word / character / syllable hypotheses

– MEI’s syllable hypotheses

(hopefully complementary to Dragon’s syllables)

Page 44: The MEI Team August 23, 2000

MEI Syllable Recognizer: Improve Acoustic Models

Diagram components: VOA Audio for Doc i; Dragon Outputs for Doc i; Forced Alignment; Speaker Adaptation; Baseline Acoustic Models; Speaker-Adapted Acoustic Models; Syllable Recognition; MEI Syllables for Doc i

• Forced alignment with Dragon’s output for each document
• Blind speaker adaptation with Dragon’s syllables
• MEI syllable accuracy: 70.2% (original) --> 87.7% !!!

Page 45: The MEI Team August 23, 2000

MEI Syllable Recognizer: Incorporate Language Model

Diagram components: VOA Audio for Doc i; Dragon Outputs for Doc i; Forced Alignment; Speaker Adaptation; Baseline Acoustic Models; Speaker-Adapted Acoustic Models; 1998 Xinhua Language Models; Syllable Recognition; MEI Syllables for Doc i

• Syllable trigram language model
• MEI syllable accuracy: 70.2% --> 87.7% --> 90.0% !!!

Page 46: The MEI Team August 23, 2000

Audio Document Indexing with Multiple Syllable Recognition Outputs

• Two separate recognition outputs: Dragon’s syl, MEI’s syl
• The revised syllable lattice: Dragon’s syl, MEI’s syl

Page 47: The MEI Team August 23, 2000

Multi-scale Audio Document Indexing

MEI’s syl

Dragon’s word

Dragon’s syl

Dragon’s chr

Page 48: The MEI Team August 23, 2000

Fusion of Words and Subwords in Multi-Scale Retrieval

Wai-Kit Lo, Pat Schone

Page 49: The MEI Team August 23, 2000

Loose Coupling

• Merging ranked lists from separate runs
• For each query and document pair, the score is recalculated as

$S(Q_i, D_j) = \sum_k w_k \, S_k(Q_i, D_j)$

– $w_k$ are the weights for different retrieval runs
– $k$ denotes a retrieval run at some scale (word, characters, syllables, combinations)
– $S_k(Q_i, D_j)$ is a rank-based score between query $i$ and document $j$ in retrieval run $k$
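A weighted merge of ranked lists along these lines can be sketched in Python. This is a minimal sketch under assumptions: the slides do not say how the rank-based score is computed, so a simple linear function of rank is used here, documents missing from a run contribute nothing, and the run names, weights, and document ids are made up.

```python
def rank_scores(ranked_docs):
    """Rank-based score: 1.0 for the top document, decreasing linearly."""
    n = len(ranked_docs)
    return {doc: (n - pos) / n for pos, doc in enumerate(ranked_docs)}

def loose_coupling(runs, weights):
    """Fuse several ranked lists by a weighted sum of rank-based scores.

    runs: run name -> list of document ids in rank order.
    weights: run name -> weight for that run.
    Returns document ids sorted by fused score (ties broken by id).
    """
    fused = {}
    for name, ranked in runs.items():
        for doc, score in rank_scores(ranked).items():
            fused[doc] = fused.get(doc, 0.0) + weights[name] * score
    return sorted(fused, key=lambda d: (-fused[d], d))

# Hypothetical word-level and character-bigram runs for one query.
runs = {
    "word": ["d1", "d2", "d3"],
    "char2": ["d2", "d1", "d4"],
}
weights = {"word": 0.4, "char2": 0.6}
merged = loose_coupling(runs, weights)
```

Because the merge only needs each run's output list, the individual retrieval runs can use entirely separate indexes, which is what makes this coupling "loose".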

Page 50: The MEI Team August 23, 2000

Loose Coupling

[Bar chart: retrieval performance (y-axis, 0.2 to 0.8) for pairwise fusion of runs drawn from Word, Char2, Syl2 and MEISyl2]

Page 51: The MEI Team August 23, 2000

Tight Coupling

• Unified indexing of words and subword n-grams
• For query and documents
– Combine terms at different scales to form a multi-scale query/document representation, e.g. yi-se se-lie shou-xiang ben-jie jie-ming ne-tan tan-ya ya-hu
• Multi-scale retrieval produces a single ranked list

Page 52: The MEI Team August 23, 2000

Loose vs Tight Coupling

• Tight coupling combines document scores before ranking
– may need weight optimization
• Loose coupling combines lists post-hoc
– outperforms individual lists

[Bar chart: Mean Average Precision (y-axis, 0.2 to 0.8) by Retrieval Method (x-axis): Words, Char2, Loose, Tight]

Page 53: The MEI Team August 23, 2000

The Perfect Retrieval Myth

Erika, Helen, Hsin-Min, Jian Qiang, Berlin

Page 54: The MEI Team August 23, 2000

The Perfect Retrieval Myth
• 100% Average Precision = ALL relevant docs and ZERO non-relevant docs retrieved

Is corrupted by ...
• Query Processing (English Newswire Article)
– Term selection
– Translation errors
– Translation ambiguity
– OOV
• Document Processing (Mandarin Audio Files)
– Speech recognition errors
– Word tokenization ambiguity
– OOV
• Differences in News Sources

Page 55: The MEI Team August 23, 2000

“Bounds” on Word-Based Systems

• Using Mandarin VOA documents as exemplars
– matched condition
• Using Xinhua text documents as exemplars
– source mismatch
• Using manual translations of NYT documents as exemplars

[Bar chart: Mean Average Precision (y-axis, 0.2 to 0.8) by Query Exemplars (x-axis): VOA, Xinhua, NYT; series: ASR Output, Anchor Scripts]

Page 56: The MEI Team August 23, 2000

“Bounds” on Subword-Based Systems

• Character bigrams for indexing
– marginally outperforms word-based systems
• Syllable bigrams
– are quite competitive, though somewhat behind
• Mean average precision ~0.6 is a good CL-SDR target

[Bar chart: Mean Average Precision (y-axis, 0.2 to 0.8) by Query Exemplars (x-axis): Xinhua, NYT; series: Words, Char, Syllable]

Page 57: The MEI Team August 23, 2000

TDT-2 Results

Page 58: The MEI Team August 23, 2000

Retrieval Performance on TDT2

[Bar chart: Mean Average Precision (y-axis, 0.2 to 0.8) on TDT2 for: Words, Char2, Syllables (CLPM), Words (CLPM), Char2 (CLPM), Char 1+2+3, Words+Char (L), Words+Char (T), MEI Syllables Lattice]

Page 59: The MEI Team August 23, 2000

TDT-3 Results

Page 60: The MEI Team August 23, 2000

Retrieval Performance on TDT3

[Bar chart: Mean Average Precision (y-axis, 0.2 to 0.8) for: Words, Char2, Syl2 (CLPM), Words (CLPM), Char2 (CLPM), Char 1+2+3, Words+Char (L), Words+Char (T), MEI Syllables Lattice, Char2 (SR Err); series: TDT2, TDT3]

Page 61: The MEI Team August 23, 2000

Summary and Conclusions
• Novel multi-scale paradigm for CL-SDR
– ameliorates the translation and recognition OOV problems
• Multi-scale query and document processing
– cross-lingual subword transliteration procedure (CLPM)
– query and document construction embeds words / characters / syllables
– balanced and structured queries
• Multi-scale retrieval
– tight and loose coupling strategies to fuse words and subwords for retrieval

Page 62: The MEI Team August 23, 2000

Summary and Conclusions (2)

• Extensive experiments on TDT-2, TDT-3
– character bigrams typically outperform words or syllable bigrams in retrieval
– fusion of word and subword units shows potential in multi-scale retrieval
– syllable lattice needs further investigation

Page 63: The MEI Team August 23, 2000

Future Work

• Word-subword fusion techniques merit further investigation

• Multi-scale query expansion for retrieval performance improvement (Wai-Kit)

• Incorporation of acoustic scores in syllable lattice representation for documents

Page 64: The MEI Team August 23, 2000

END