Maximum Likelihood Linear Regression for Speaker Adaptation of Continuous Density Hidden Markov Models

C. J. Leggetter and P. C. Woodland

Department of Engineering, University of Cambridge, Trumpington Street, Cambridge CB2 1PZ, U.K.

Computer Speech and Language (1995)

Presented by Hsu Ting-Wei, 2006.03.16

Introduction

• Speaker adaptation techniques fall into two main categories:

– Speaker normalization

• The input speech is normalized to match the speaker that the system is trained to model

– Model adaptation techniques

• The parameters of the model set are adjusted to improve the modeling of the new speaker

• MAP method

– Only updates the parameters of models which are observed in the adaptation data

• MLLR method (Maximum Likelihood Linear Regression)

– All model states can be adapted even if no model-specific data is available

[Figure: diagram labelled "Speaker" and "HMM Models"; the speaker says “Hello!”]

MLLR’s adaptation approach

• This method requires an initial speaker independent continuous density HMM system

• MLLR takes some adaptation data from a new speaker and updates the model mean parameters to maximize the likelihood of the adaptation data

• The other HMM parameters are not adapted since the main differences between speakers are assumed to be characterized by the means

MLLR’s adaptation approach (cont.)

• Consider the case of a continuous density HMM system with Gaussian output distributions.

• A particular distribution $s$ is characterized by a mean vector $\mu_s$ and a covariance matrix $C_s$.

• Given a parameterized speech frame vector $o$, the probability density of that vector being generated by distribution $s$ is

$$b_s(o) = \frac{1}{(2\pi)^{n/2}\,|C_s|^{1/2}} \exp\left(-\tfrac{1}{2}\,(o-\mu_s)'\,C_s^{-1}\,(o-\mu_s)\right)$$

where $n$ is the dimension of the observation vector.
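As a minimal sketch of the Gaussian output density described on this slide, the log of b_s(o) can be evaluated with NumPy as follows. The function name `gaussian_log_density` and the numeric values are illustrative assumptions, not part of the original slides; working in the log domain is a common choice to avoid underflow.

```python
import numpy as np


def gaussian_log_density(o, mean, cov):
    """Return log b_s(o) for a Gaussian with mean vector `mean` and covariance `cov`."""
    n = o.shape[0]
    diff = o - mean
    # Mahalanobis term (o - mu)' C^{-1} (o - mu), via a linear solve
    # rather than an explicit matrix inverse.
    mahalanobis = diff @ np.linalg.solve(cov, diff)
    # slogdet returns (sign, log|C|); the sign is +1 for a valid covariance.
    _, log_det = np.linalg.slogdet(cov)
    return -0.5 * (n * np.log(2.0 * np.pi) + log_det + mahalanobis)


# Illustrative values only (not taken from the slides).
o = np.array([1.0, 2.0])
mu_s = np.array([0.5, 1.5])
C_s = np.array([[2.0, 0.3], [0.3, 1.0]])
log_b = gaussian_log_density(o, mu_s, C_s)
```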

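• We use the following equation to adapt the mean:

$$\hat{\mu}_s = A_s \mu_s + b_s$$

• We can simplify it to

$$\hat{\mu}_s = W_s \xi_s$$

where $W_s$ is the $n \times (n+1)$ transformation matrix (with $W_s = [\,b_s \;\; A_s\,]$ when $\omega = 1$) and $\xi_s$ is the $(n+1) \times 1$ extended mean vector, formed by stacking the offset term and the mean values of the distribution to be adapted:

$$\xi_s = [\omega, \mu_1, \ldots, \mu_n]'$$

– Offset $\omega = 1$: include an offset in the regression. Offset $\omega = 0$: ignore offsets. The offset is a term that can be included when the recording environment of the adaptation speaker differs from that of the initial models [reference].

• So the probability density function for the adapted system becomes the original density with $\mu_s$ replaced by $W_s \xi_s$:

$$b_s(o) = \frac{1}{(2\pi)^{n/2}\,|C_s|^{1/2}} \exp\left(-\tfrac{1}{2}\,(o-W_s\xi_s)'\,C_s^{-1}\,(o-W_s\xi_s)\right)$$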

• The transformation matrices are calculated to maximize the likelihood of the adaptation data

• The transformation matrices can be implemented using the forward–backward algorithm

• A more general approach is adopted in which the same transformation matrix is used for several distributions.

• If some of the distributions are not observed in the adaptation data, a transformation may still be applied (global transformation)
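The idea of one shared transformation can be sketched as follows: a single n x (n+1) matrix W is applied to the extended mean vector of every distribution, including distributions that never appeared in the adaptation data. This is only an illustration under assumed names (`extended_mean`, `adapt_means`, and the state labels are hypothetical) and made-up numbers.

```python
import numpy as np


def extended_mean(mean, offset=1.0):
    """Build the extended mean vector xi = [omega, mu_1, ..., mu_n]'."""
    return np.concatenate(([offset], mean))


def adapt_means(W, means, offset=1.0):
    """Apply one shared n x (n+1) transform W to every mean (dict: name -> vector)."""
    return {name: W @ extended_mean(mu, offset) for name, mu in means.items()}


# Illustrative transform: W = [b A], so that W xi = A mu + b when offset = 1.
A = np.array([[1.1, 0.0], [0.2, 0.9]])
b = np.array([0.3, -0.1])
W = np.hstack([b[:, None], A])

# Hypothetical model set: one state seen in the adaptation data, one unseen.
means = {"seen_state": np.array([1.0, -2.0]), "unseen_state": np.array([0.5, 0.5])}
adapted = adapt_means(W, means)
```

Because the transform is estimated once for the whole class, the unseen state is moved in the same way as the seen one.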

Estimation of MLLR regression matrices

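• 1. Definition of auxiliary function

– Assume the adaptation data, $O$, is a series of $T$ observations: $O = o_1 \ldots o_T$, where each $o_t$ is a speech frame vector.

– Denote the current set of model parameters by $\lambda$ and a re-estimated set of model parameters as $\bar{\lambda}$.

– All possible state sequences of length $T$ are denoted by the set $\Theta$.

– The total likelihood of the model set generating the observation sequence (the objective function) is

$$F(O|\lambda) = \sum_{\theta \in \Theta} F(O, \theta|\lambda)$$

– Define an auxiliary function, which we want to maximize (Maximum-Likelihood). Evaluating it is the E-step:

$$Q(\lambda, \bar{\lambda}) = \sum_{\theta \in \Theta} F(O, \theta|\lambda)\,\log F(O, \theta|\bar{\lambda})$$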

Estimation of MLLR regression matrices (cont.)

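• 2. Maximization of auxiliary function

– Define $S$ as the set of all state distributions in the system, and $\gamma_s(t)$ as the a posteriori probability of occupying state $s$ at time $t$ given that the observation sequence $O$ is generated:

$$\gamma_s(t) = \frac{1}{F(O|\lambda)} \sum_{\theta \in \Theta} F(O, \theta_t = s|\lambda)$$

– Only the means are adapted, so every term that does not involve the output densities is absorbed into the constant (the auxiliary function is only related to the means). Here $b_j$ denotes the output density of the adapted system:

$$Q(\lambda, \bar{\lambda}) = \text{constant} + \sum_{\theta \in \Theta} F(O, \theta|\lambda) \sum_{t=1}^{T} \log b_{\theta_t}(o_t)$$

• 2. Maximization of auxiliary function (cont.)

$$Q(\lambda, \bar{\lambda}) = \text{constant} + \sum_{j=1}^{S} \sum_{t=1}^{T} \Big[\sum_{\theta \in \Theta} F(O, \theta_t = j|\lambda)\Big] \log b_j(o_t)$$

$$Q(\lambda, \bar{\lambda}) = \text{constant} + F(O|\lambda) \sum_{j=1}^{S} \sum_{t=1}^{T} \gamma_j(t)\,\log b_j(o_t) \qquad (4)$$

– $F(O|\lambda)$ and $\gamma_j(t)$ are known: they are computed from the current model $\lambda$. The next step expands the $\log b_j(o_t)$ term in (4).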
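The occupation probabilities used in the auxiliary function are normally obtained with the forward-backward algorithm. The sketch below assumes a simplified HMM with an initial distribution `pi`, a transition matrix `A`, and a table `B[t, s]` of output densities already evaluated for each frame and state; these names and values are illustrative, not from the slides. No scaling is applied, so this form is only suitable for short sequences.

```python
import numpy as np


def occupation_probabilities(pi, A, B):
    """Return (gamma, total) where gamma[t, s] is the posterior probability of
    occupying state s at time t and total is the likelihood F(O | lambda).

    pi[s]   : initial state probabilities
    A[i, j] : transition probability from state i to state j
    B[t, s] : output density b_s(o_t), already evaluated
    """
    T, S = B.shape
    alpha = np.zeros((T, S))
    beta = np.ones((T, S))
    # Forward pass: alpha[t, s] = F(o_1..o_t, theta_t = s | lambda).
    alpha[0] = pi * B[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[t]
    # Backward pass: beta[t, s] = F(o_{t+1}..o_T | theta_t = s, lambda).
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[t + 1] * beta[t + 1])
    total = alpha[-1].sum()
    gamma = alpha * beta / total
    return gamma, total


# Illustrative 2-state, 3-frame example.
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.2, 0.8]])
B = np.array([[0.5, 0.1], [0.4, 0.3], [0.1, 0.6]])
gamma, total = occupation_probabilities(pi, A, B)
```

Each row of `gamma` sums to one, since every frame must be generated by some state.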

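• 2. Maximization of auxiliary function (cont.)

Expanding $\log b_j(o_t)$ for the adapted Gaussian with mean $\hat{\mu}_j = W_j \xi_j$:

$$\begin{aligned}
Q(\lambda, \bar{\lambda}) &= \text{constant} + F(O|\lambda) \sum_{j=1}^{S} \sum_{t=1}^{T} \gamma_j(t)\,\log b_j(o_t) \\
&= \text{constant} + F(O|\lambda) \sum_{j=1}^{S} \sum_{t=1}^{T} \gamma_j(t)\,\log\left[\frac{1}{(2\pi)^{n/2}|C_j|^{1/2}}\, e^{-\frac{1}{2}(o_t-\hat{\mu}_j)'C_j^{-1}(o_t-\hat{\mu}_j)}\right] \\
&= \text{constant} - \tfrac{1}{2} F(O|\lambda) \sum_{j=1}^{S} \sum_{t=1}^{T} \gamma_j(t)\left[n\log(2\pi) + \log|C_j| + (o_t-\hat{\mu}_j)'C_j^{-1}(o_t-\hat{\mu}_j)\right] \\
&= \text{constant} - \tfrac{1}{2} F(O|\lambda) \sum_{j=1}^{S} \sum_{t=1}^{T} \gamma_j(t)\left[n\log(2\pi) + \log|C_j| + (o_t-W_j\xi_j)'C_j^{-1}(o_t-W_j\xi_j)\right] \\
&= \text{constant} - \tfrac{1}{2} F(O|\lambda) \sum_{j=1}^{S} \sum_{t=1}^{T} \gamma_j(t)\left[n\log(2\pi) + \log|C_j| + h(o_t, j)\right]
\end{aligned}$$

where $h(o_t, j) = (o_t - W_j\xi_j)'\,C_j^{-1}\,(o_t - W_j\xi_j)$.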

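• 2. Maximization of auxiliary function (cont.): M-step

Differentiate $Q$ with respect to $W_s$ and set the result to zero. Only the quadratic term for state $s$ depends on $W_s$:

$$\begin{aligned}
\frac{d}{dW_s} Q(\lambda, \bar{\lambda}) &= \frac{d}{dW_s}\left[\text{constant} - \tfrac{1}{2} F(O|\lambda) \sum_{j=1}^{S} \sum_{t=1}^{T} \gamma_j(t)\left[n\log(2\pi) + \log|C_j| + (o_t-W_j\xi_j)'C_j^{-1}(o_t-W_j\xi_j)\right]\right] \\
&= -\tfrac{1}{2} F(O|\lambda) \sum_{t=1}^{T} \gamma_s(t)\,\frac{d}{dW_s}\left[(o_t-W_s\xi_s)'C_s^{-1}(o_t-W_s\xi_s)\right] \\
&= -\tfrac{1}{2} F(O|\lambda) \sum_{t=1}^{T} \gamma_s(t)\cdot(-2)\,C_s^{-1}(o_t-W_s\xi_s)\,\xi_s' \\
&= F(O|\lambda) \sum_{t=1}^{T} \gamma_s(t)\,C_s^{-1}\,[o_t - W_s\xi_s]\,\xi_s' = 0
\end{aligned}$$

Rearranging gives the general form for estimating $W_s$:

$$\sum_{t=1}^{T} \gamma_s(t)\,C_s^{-1}\,o_t\,\xi_s' = \sum_{t=1}^{T} \gamma_s(t)\,C_s^{-1}\,W_s\,\xi_s\,\xi_s' \qquad (5)$$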

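• 3. Re-estimation formula for tied regression matrices

– General form:

$$\sum_{t=1}^{T} \gamma_s(t)\,C_s^{-1}\,o_t\,\xi_s' = \sum_{t=1}^{T} \gamma_s(t)\,C_s^{-1}\,W_s\,\xi_s\,\xi_s' \qquad (5)$$

– If $W_s$ is shared by $R$ states $\{s_1, \ldots, s_R\}$, then equation (5) becomes:

$$\sum_{t=1}^{T} \sum_{r=1}^{R} \gamma_{s_r}(t)\,C_{s_r}^{-1}\,o_t\,\xi_{s_r}' = \sum_{t=1}^{T} \sum_{r=1}^{R} \gamma_{s_r}(t)\,C_{s_r}^{-1}\,W_s\,\xi_{s_r}\,\xi_{s_r}' \qquad (6)$$

– Rewrite equation (6) as:

$$\sum_{t=1}^{T} \sum_{r=1}^{R} \gamma_{s_r}(t)\,C_{s_r}^{-1}\,o_t\,\xi_{s_r}' = \sum_{r=1}^{R} V^{(r)}\,W_s\,D^{(r)}$$

where $V^{(r)} = \sum_{t=1}^{T} \gamma_{s_r}(t)\,C_{s_r}^{-1}$ and $D^{(r)} = \xi_{s_r}\,\xi_{s_r}'$, with dimensions $[(n+1) \times 1][1 \times (n+1)] = (n+1) \times (n+1)$.

– When there is not enough adaptation data, states that are closely related can be grouped into the same class, and the data collected for that class is used to estimate $W_s$.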

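• 3. Re-estimation formula for tied regression matrices (cont.)

$$\sum_{t=1}^{T} \sum_{r=1}^{R} \gamma_{s_r}(t)\,C_{s_r}^{-1}\,o_t\,\xi_{s_r}' = \sum_{r=1}^{R} V^{(r)}\,W_s\,D^{(r)} \qquad (7)$$

– The left-hand side of (7) is denoted by the $n \times (n+1)$ matrix $Z$, and the right-hand side by the $n \times (n+1)$ matrix $Y$, with elements

$$y_{ij} = \sum_{p=1}^{n} \sum_{q=1}^{n+1} w_{pq} \left[\sum_{r=1}^{R} v_{ip}^{(r)}\,d_{qj}^{(r)}\right]$$

– If all covariances are diagonal, and since $D^{(r)}$ is symmetric, then

$$\sum_{r=1}^{R} v_{ip}^{(r)}\,d_{qj}^{(r)} = \begin{cases} \sum_{r=1}^{R} v_{ii}^{(r)}\,d_{jq}^{(r)} & \text{when } i = p \\ 0 & \text{when } i \neq p \end{cases}$$

so that

$$z_{ij} = y_{ij} = \sum_{q=1}^{n+1} w_{iq}\,g_{jq}^{(i)}, \qquad \text{where } g_{jq}^{(i)} = \sum_{r=1}^{R} v_{ii}^{(r)}\,d_{jq}^{(r)}$$

– Note that $z_{ij}$ and $g_{jq}^{(i)}$ are not dependent on $W_s$, and both can be computed from the observation vectors and the model parameters. Hence $W_s$ can be computed from the system of simultaneous equations

$$w_i' = G^{(i)-1}\,z_i'$$

where $w_i$ and $z_i$ are the $i$th rows of $W_s$ and $Z$, and $G^{(i)}$ is the matrix with elements $g_{jq}^{(i)}$.

– These equations can be solved using Gaussian elimination or LU decomposition methods, and are calculated on a row-by-row basis.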
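The row-by-row solution for a tied regression matrix with diagonal covariances can be sketched as follows. The function name `estimate_tied_transform`, its argument layout, and the synthetic data are assumptions made for illustration; the occupation probabilities `gamma` would normally come from a forward-backward pass. `np.linalg.solve` is used for each row, which relies on an LU factorization internally.

```python
import numpy as np


def estimate_tied_transform(gamma, obs, means, variances, offset=1.0):
    """Estimate one n x (n+1) matrix W shared by R states (diagonal covariances).

    gamma     : (R, T) occupation probabilities gamma_{s_r}(t)
    obs       : (T, n) observation vectors o_t
    means     : (R, n) state means
    variances : (R, n) diagonals of the state covariances
    """
    R, n = means.shape
    xi = np.hstack([np.full((R, 1), offset), means])   # extended means, (R, n+1)
    occ = gamma.sum(axis=1)                            # sum_t gamma_r(t)
    weighted_obs = gamma @ obs                         # sum_t gamma_r(t) o_t, (R, n)
    # Z = sum_r C_r^{-1} (sum_t gamma_r(t) o_t) xi_r'
    Z = np.einsum("ri,rj->ij", weighted_obs / variances, xi)
    D = np.einsum("rj,rq->rjq", xi, xi)                # D^(r) = xi_r xi_r'
    W = np.zeros((n, n + 1))
    for i in range(n):
        v_ii = occ / variances[:, i]                   # v_ii^(r) for every r
        G_i = np.einsum("r,rjq->jq", v_ii, D)          # G^(i) = sum_r v_ii^(r) D^(r)
        W[i] = np.linalg.solve(G_i, Z[i])              # solve G^(i) w_i' = z_i'
    return W


# Synthetic check: noise-free frames generated from a known transform.
rng = np.random.default_rng(0)
R, n = 4, 2
means = rng.normal(size=(R, n))
variances = rng.uniform(0.5, 2.0, size=(R, n))
W_true = np.hstack([rng.normal(size=(n, 1)), np.eye(n) + 0.1 * rng.normal(size=(n, n))])
state_of_frame = np.repeat(np.arange(R), 3)
xi_all = np.hstack([np.ones((R, 1)), means])
obs = xi_all[state_of_frame] @ W_true.T
gamma = (state_of_frame[None, :] == np.arange(R)[:, None]).astype(float)
W_est = estimate_tied_transform(gamma, obs, means, variances)
```

At least n+1 states with linearly independent extended means are needed in the class, otherwise G^(i) is singular and the solve fails.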

Special cases of MLLR

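• 1. Least squares regression

– Equation (6), for a regression matrix tied across $R$ states, is

$$\sum_{t=1}^{T} \sum_{r=1}^{R} \gamma_{s_r}(t)\,C_{s_r}^{-1}\,o_t\,\xi_{s_r}' = \sum_{t=1}^{T} \sum_{r=1}^{R} \gamma_{s_r}(t)\,C_{s_r}^{-1}\,W_s\,\xi_{s_r}\,\xi_{s_r}' \qquad (6)$$

– If all the covariances of the distributions tied to the same transformation are the same, equation (6) becomes

$$\sum_{t=1}^{T} \sum_{r=1}^{R} \gamma_{s_r}(t)\,o_t\,\xi_{s_r}' = W_s \sum_{t=1}^{T} \sum_{r=1}^{R} \gamma_{s_r}(t)\,\xi_{s_r}\,\xi_{s_r}' \qquad (8)$$

– If each speech frame is assigned to exactly one distribution (e.g. by Viterbi alignment) so that

$$\gamma_{s_r}(t) = \begin{cases} 1, & \text{if } o_t \text{ is assigned to state distribution } s_r \\ 0, & \text{otherwise} \end{cases}$$

then, writing $\theta_t$ for the state to which frame $t$ is assigned, equation (8) becomes

$$\sum_{t=1}^{T} \delta_{\theta_t}\,o_t\,\xi_{\theta_t}' = W_s \sum_{t=1}^{T} \delta_{\theta_t}\,\xi_{\theta_t}\,\xi_{\theta_t}' \qquad (9)$$

$$\delta_{\theta_t} = \begin{cases} 1, & \text{if } \theta_t \in \{s_1, \ldots, s_R\} \\ 0, & \text{otherwise} \end{cases}$$

Special cases of MLLR (cont.)

• 1. Least squares regression (cont.)

– Defining the matrices $X$ and $Y$ over the frames assigned to the regression class as

$$X = [\xi_{\theta_1}, \ldots, \xi_{\theta_T}], \qquad Y = [o_1, \ldots, o_T]$$

– Substituting $X$ and $Y$ in equation (9) and rearranging, the estimate of the regression matrix is the standard least squares estimate

$$W_s = Y X' (X X')^{-1}$$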
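A minimal sketch of the least squares special case, assuming every frame has already been aligned to one state (for example by Viterbi alignment) and that all tied covariances are equal. The function name `least_squares_transform` and the synthetic data are illustrative assumptions. Instead of forming the inverse of X X' explicitly, the code solves the equivalent linear system.

```python
import numpy as np


def least_squares_transform(obs, aligned_means, offset=1.0):
    """Return W = Y X' (X X')^{-1}.

    obs           : (T, n) frames o_t assigned to the regression class
    aligned_means : (T, n) mean of the state each frame is aligned to
    """
    T = obs.shape[0]
    X = np.vstack([np.full((1, T), offset), aligned_means.T])  # (n+1, T)
    Y = obs.T                                                  # (n, T)
    # W (X X') = Y X'  <=>  (X X') W' = X Y'   (X X' is symmetric)
    return np.linalg.solve(X @ X.T, X @ Y.T).T


# Synthetic example: 4 hypothetical states, 2-dimensional features.
rng = np.random.default_rng(0)
state_means = rng.normal(size=(4, 2))
state_of_frame = np.array([0, 0, 1, 2, 2, 3, 3, 1])
aligned_means = state_means[state_of_frame]
obs = 1.2 * aligned_means + 0.5 + 0.05 * rng.normal(size=aligned_means.shape)
W_ls = least_squares_transform(obs, aligned_means)
```

The first column of `W_ls` holds the estimated offsets and the remaining n columns the linear part of the transform.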

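• 2. Single variable linear regression

– If all the features in the mean vector are independent, the modification of each mean component can be calculated by simple single variable linear regression. The regression matrix then reduces to

$$W_s = \begin{bmatrix} w_{1,1} & w_{1,2} & 0 & \cdots & 0 \\ w_{2,1} & 0 & w_{2,3} & \cdots & 0 \\ \vdots & \vdots & & \ddots & \vdots \\ w_{n,1} & 0 & \cdots & 0 & w_{n,n+1} \end{bmatrix}$$

– Collecting the non-zero elements into the vector $\hat{w}_s = [w_{1,1}, \ldots, w_{n,1}, w_{1,2}, w_{2,3}, \ldots, w_{n,n+1}]'$ and defining the $n \times 2n$ matrix $D_s$ made up of elements of the extended mean vector,

$$D_s = \begin{bmatrix} \omega & 0 & \cdots & 0 & \mu_1 & 0 & \cdots & 0 \\ 0 & \omega & \cdots & 0 & 0 & \mu_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots & \vdots & & \ddots & \vdots \\ 0 & \cdots & 0 & \omega & 0 & \cdots & 0 & \mu_n \end{bmatrix}$$

the adapted mean is $\hat{\mu}_s = D_s \hat{w}_s$.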

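• 2. Single variable linear regression (cont.)

– The adapted output density becomes

$$b_s(o) = \frac{1}{(2\pi)^{n/2}\,|C_s|^{1/2}} \exp\left(-\tfrac{1}{2}\,(o-D_s\hat{w}_s)'\,C_s^{-1}\,(o-D_s\hat{w}_s)\right)$$

– M-step: differentiate with respect to $\hat{w}_s$ and set to zero:

$$\frac{d}{d\hat{w}_s} Q(\lambda, \bar{\lambda}) = F(O|\lambda) \sum_{t=1}^{T} \gamma_s(t)\,D_s'\,C_s^{-1}\,[o_t - D_s\hat{w}_s] = 0$$

$$\sum_{t=1}^{T} \gamma_s(t)\,D_s'\,C_s^{-1}\,o_t = \sum_{t=1}^{T} \gamma_s(t)\,D_s'\,C_s^{-1}\,D_s\,\hat{w}_s$$

– For a transformation tied across $R$ states:

$$\sum_{t=1}^{T} \sum_{r=1}^{R} \gamma_{s_r}(t)\,D_{s_r}'\,C_{s_r}^{-1}\,o_t = \left[\sum_{t=1}^{T} \sum_{r=1}^{R} \gamma_{s_r}(t)\,D_{s_r}'\,C_{s_r}^{-1}\,D_{s_r}\right] \hat{w}_s$$

$$\hat{w}_s = \left[\sum_{t=1}^{T} \sum_{r=1}^{R} \gamma_{s_r}(t)\,D_{s_r}'\,C_{s_r}^{-1}\,D_{s_r}\right]^{-1} \left[\sum_{t=1}^{T} \sum_{r=1}^{R} \gamma_{s_r}(t)\,D_{s_r}'\,C_{s_r}^{-1}\,o_t\right]$$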
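The single variable (diagonal) case can be sketched in the same style: each state contributes an n x 2n matrix holding the offset term and its mean components, and the 2n unknowns (n offsets followed by n scale factors) are found from one linear solve. The function name `estimate_diagonal_transform`, the argument layout, and the synthetic data are assumptions for illustration, and diagonal covariances are assumed.

```python
import numpy as np


def estimate_diagonal_transform(gamma, obs, means, variances, offset=1.0):
    """Estimate w_hat = [offsets (n), scales (n)] for a transform tied over R states.

    gamma     : (R, T) occupation probabilities
    obs       : (T, n) observation vectors
    means     : (R, n) state means
    variances : (R, n) diagonals of the state covariances
    """
    R, n = means.shape
    lhs = np.zeros((2 * n, 2 * n))
    rhs = np.zeros(2 * n)
    occ = gamma.sum(axis=1)            # sum_t gamma_r(t)
    weighted_obs = gamma @ obs         # sum_t gamma_r(t) o_t
    for r in range(R):
        # D_r = [omega * I | diag(mu_r)], so that D_r w_hat is the adapted mean.
        D = np.hstack([offset * np.eye(n), np.diag(means[r])])
        Dt_Cinv = D.T / variances[r]   # D_r' C_r^{-1} for a diagonal C_r
        lhs += occ[r] * (Dt_Cinv @ D)
        rhs += Dt_Cinv @ weighted_obs[r]
    return np.linalg.solve(lhs, rhs)


# Synthetic check: noise-free frames from known offsets and scales.
rng = np.random.default_rng(1)
means = rng.normal(size=(3, 2))
variances = rng.uniform(0.5, 2.0, size=(3, 2))
true_offsets = np.array([0.3, -0.2])
true_scales = np.array([1.1, 0.9])
state_of_frame = np.array([0, 1, 2, 0, 1, 2])
obs = true_offsets + means[state_of_frame] * true_scales
gamma = (state_of_frame[None, :] == np.arange(3)[:, None]).astype(float)
w_hat = estimate_diagonal_transform(gamma, obs, means, variances)
```

This variant has only 2n parameters per regression class, compared with n(n+1) for the full matrix.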

Defining regression classes

• When regression matrices are tied across mixture components, each matrix is associated with many mixture components.

• For the tied approach to be effective it is desirable to put all the mixture components which will use similar transforms into the same class.

• Two approaches for defining regression classes were considered:

– Based on broad phonetic classes

• All mixture components in any model representing the same broad phonetic class (e.g. fricatives, nasals, etc.) were placed in the same regression class.

– Based on clustering of mixture components

• The mixture components were compared using a likelihood measure and similar components placed in the same regression class.
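The broad phonetic class approach amounts to a lookup from each mixture component's phone to a class label. The sketch below is only an illustration: the phone symbols, the `BROAD_CLASS` table, and the component identifiers are hypothetical and not taken from the slides.

```python
from collections import defaultdict

# Hypothetical phone-to-class table, for illustration only.
BROAD_CLASS = {
    "s": "fricative", "z": "fricative", "f": "fricative",
    "m": "nasal", "n": "nasal",
    "aa": "vowel", "iy": "vowel",
}


def regression_classes(components):
    """Group mixture components by the broad phonetic class of their phone.

    components : iterable of (component_id, phone) pairs
    """
    classes = defaultdict(list)
    for component_id, phone in components:
        classes[BROAD_CLASS[phone]].append(component_id)
    return dict(classes)


# Hypothetical component identifiers: phone, state index, mixture index.
components = [
    ("s_2_m1", "s"), ("s_2_m2", "s"), ("m_3_m1", "m"),
    ("iy_2_m1", "iy"), ("z_4_m1", "z"),
]
classes = regression_classes(components)
```

One regression matrix would then be estimated per class, using the adaptation data of all components in that class.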

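Experiment: Full regression matrix vs. diagonal regression matrix

[Figure: results labelled "diagonal" and "full"; annotation: the full matrix has a lot of parameters]

Experiment: Full matrix using global regression class

[Figure: result labelled "adapted"]

Experiment: Supervised vs. unsupervised

[Figure: results labelled "supervised" and "unsupervised"]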

Conclusion

• MLLR can be applied to continuous density HMMs with a large number of Gaussians and is effective with small amounts of adaptation data.
