Results of Tagalog Vowel Speech Recognition Using Continuous HMM

Arnel C. Fajardo, Ph.D. student

(Under the supervision of Professor Yoon-Joong Kim)

Basic structure of HTK

Step 1. Data Preparation

Speech data for training and testing (wave files)

625 wave files: 25 sets, 5 sets per speaker

- Training data: 5 sets per speaker, 25 sets
- Test data (speaker dependent):
  Test 1: 5 sets
  Test 2: 10 sets

Format of the speech data: *.wav, 16 kHz, 16-bit, linear PCM

a1001.wav => "a"
e1001.wav => "e"
i1001.wav => "i"
o1001.wav => "o"
u1001.wav => "u"
…………

Variables:
- 2 tests: 5 speakers (1 set each) and 10 speakers (1 set each)
- Hmmdefs: m5 and m6

Compute Feature Vectors

• Use: HCopy -C configs\HCopy.config -S scripts\HCopy.scp
• HCopy.exe
– Computes the features from each wave file and saves them in the same folder.
– MFCC features were used.

-C configs\HCopy.config : configuration file for computing the features
-S scripts\HCopy.scp : script file listing each wave file and its feature file

• HCopy
• Number of input files: 3
  Waveform files: *.wav
  Configuration file: HCopy.config
  Script file: HCopy.scp
• Number of output files: 1
  MFCC file: *.mfc

Create Hcopy.config in ….Configs/Hcopy.config and write:

# Coding parameters
SOURCEKIND = WAVEFORM
SOURCEFORMAT = NIST
SOURCERATE = 625
TARGETKIND = MFCC_0
TARGETRATE = 100000.0
SAVECOMPRESSED = T
SAVEWITHCRC = T
WINDOWSIZE = 250000.0
USEHAMMING = T
PREEMCOEF = 0.97
NUMCHANS = 26
CEPLIFTER = 22
NUMCEPS = 12
ENORMALISE = F
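HTK expresses SOURCERATE, TARGETRATE and WINDOWSIZE in units of 100 ns. The following Python sketch (the helper names are hypothetical, not part of HTK) shows how the values in this configuration relate to the 16 kHz recordings: a sample period of 625 units, a 10 ms frame shift and a 25 ms analysis window.

```python
# HTK time parameters are written in units of 100 ns,
# so one second corresponds to 10,000,000 units.
HTK_UNITS_PER_SECOND = 10_000_000


def sample_period_htk(sample_rate_hz):
    """Sample period in 100 ns units (the value given to SOURCERATE)."""
    return HTK_UNITS_PER_SECOND / sample_rate_hz


def htk_to_ms(value):
    """Convert a duration in 100 ns units to milliseconds."""
    return value / (HTK_UNITS_PER_SECOND / 1000)


print(sample_period_htk(16000))  # SOURCERATE for 16 kHz audio
print(htk_to_ms(100000.0))       # TARGETRATE as a frame shift in ms
print(htk_to_ms(250000.0))       # WINDOWSIZE as a window length in ms
```

With these settings consecutive 25 ms windows overlap by 15 ms.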

Script file: Scripts/Hcopy.scp
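An HCopy script file holds one line per utterance, with the source wave file followed by the target feature file. As a sketch only, the list could be generated in Python as follows; the data\ folder and the function name are assumptions for illustration, since the slides do not show the actual paths.

```python
from pathlib import PureWindowsPath


def hcopy_script_lines(wav_paths):
    """Return one 'source target' line per wave file.

    Each .mfc target is placed next to its .wav source, matching the
    note that features are saved in the same folder.
    """
    lines = []
    for wav in wav_paths:
        mfc = PureWindowsPath(wav).with_suffix(".mfc")
        lines.append(f"{wav} {mfc}")
    return lines


# Hypothetical folder name; the file names follow the a1001.wav pattern.
for line in hcopy_script_lines([r"data\a1001.wav", r"data\e1001.wav"]):
    print(line)
```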

Prepare the master label file
Master label file (word-level transcriptions): mlfs/words.mlf
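A master label file starts with the header #!MLF!#, followed for each utterance by a quoted label-file pattern, the labels, and a line containing a single period. The sketch below builds such a file from the file-name convention described earlier (the first character of the name is the vowel). It assumes each transcription consists of the vowel alone; the slides do not show whether sil labels were also written, and the function name is hypothetical.

```python
def build_words_mlf(file_stems):
    """Build the text of a word-level MLF from utterance names.

    Assumption: the first character of each name is the spoken vowel,
    as in a1001 => "a".
    """
    lines = ["#!MLF!#"]
    for stem in file_stems:
        lines.append(f'"*/{stem}.lab"')  # pattern matching the label file
        lines.append(stem[0])            # the word label (the vowel)
        lines.append(".")                # end of this transcription
    return "\n".join(lines) + "\n"


print(build_words_mlf(["a1001", "e1001"]), end="")
```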

Model list (HMM model name list): modelList/wordList

Generate the initial master macro file (hmmdefs):
HCompV -C configs\config -f 0.01 -m -S scripts\train.scp -M wordHmms\m0\ wordHmms\proto

• HCompV.exe
• Number of inputs: 3
  Input 1: -C configs/config // parameters for computing the features
           -f 0.01 // the variance floor macro (called vFloors) is computed as 0.01 times the global variance
           -m // the mean and the variance are computed
  Input 2: -S scripts/train.scp // list of mfc feature files used in training
  Input 3: wordHmms/proto // the handwritten HMM prototype
• Number of outputs: 1
  Output 1: -M wordHmms/m0 // directory for the results
            vFloors: variance floor macro
            proto: HMM prototype with its Gaussian parameters filled in
            hmmdefs: written manually from proto
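To illustrate what the -f 0.01 option describes, the sketch below computes a global mean and variance over a few toy feature frames and derives the floor as 0.01 times the variance. It is an illustration in plain Python, not HCompV's code; the population (divide by N) variance and the function names are assumptions.

```python
def global_mean_and_variance(frames):
    """Per-dimension mean and variance over all feature frames."""
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    var = [sum((f[d] - mean[d]) ** 2 for f in frames) / n for d in range(dim)]
    return mean, var


def variance_floor(var, scale=0.01):
    """Variance floor: a fixed fraction of the global variance."""
    return [scale * v for v in var]


# Toy two-dimensional frames, for illustration only.
frames = [[0.0, 2.0], [2.0, 6.0]]
mean, var = global_mean_and_variance(frames)
print(mean, var, variance_floor(var))
```

During re-estimation the floor keeps any variance from shrinking toward zero when a state sees little data.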

Input 1: configs/config (script/Hcopy.config => configs/config)

wordHmms/m0/vFloors: global constant values used for computing b_j(o_t)

Input 2: scripts/train.scp

Input 3: wordHmms/proto, the general HMM model (prototype) for monophone speech
It has 3 emitting states. Note: NumStates is 5 because states 1 and 5 are the non-emitting entry and exit states.
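A prototype of this shape can be written by hand or generated. The Python sketch below produces a 5-state, left-to-right prototype in HTK's text format, in which only states 2 to 4 carry output distributions. The vector size of 13 assumes MFCC_0 with NUMCEPS = 12 (12 cepstra plus C0), as in the HCopy configuration above; the 0.6/0.4 transition values are placeholders, and the actual prototype used in the slides is not shown.

```python
def make_proto(vec_size=13, num_states=5, kind="MFCC_0", name="proto"):
    """Return the text of a left-to-right HTK prototype HMM."""
    zeros = " ".join(["0.0"] * vec_size)
    ones = " ".join(["1.0"] * vec_size)
    lines = [
        f"~o <VecSize> {vec_size} <{kind}>",
        f'~h "{name}"',
        "<BeginHMM>",
        f"<NumStates> {num_states}",
    ]
    # Only the inner states (2 .. num_states - 1) are emitting.
    for state in range(2, num_states):
        lines += [
            f"<State> {state}",
            f"<Mean> {vec_size}",
            " " + zeros,
            f"<Variance> {vec_size}",
            " " + ones,
        ]
    lines.append(f"<TransP> {num_states}")
    for i in range(1, num_states + 1):
        row = [0.0] * num_states
        if i == 1:
            row[1] = 1.0          # entry state always moves to state 2
        elif i < num_states:
            row[i - 1] = 0.6      # self-loop (placeholder value)
            row[i] = 0.4          # move to the next state (placeholder)
        # the exit state keeps an all-zero row
        lines.append(" " + " ".join(f"{p:.1f}" for p in row))
    lines.append("<EndHMM>")
    return "\n".join(lines) + "\n"


print(make_proto(), end="")
```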

wordHmms/proto + global means and variances => wordHmms/m0/proto
Result of the HCompV command: wordHmms/m0/proto

wordHmms/m0/hmmdefs: Master Macro File (MMF), written manually from wordHmms/m0/proto
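The manual step of writing hmmdefs from proto amounts to copying the prototype once per model name and renaming each copy. A minimal sketch, assuming the proto text contains a single ~h "proto" line and that the global options before it are kept once at the top of the file; the function name and this layout are assumptions, not taken from the slides.

```python
def build_hmmdefs(proto_text, model_names, proto_name="proto"):
    """Clone the prototype definition once per model name."""
    marker = f'~h "{proto_name}"'
    # Everything before the marker is the header (for example the ~o line);
    # everything after it is the model body to be copied.
    header, body = proto_text.split(marker, 1)
    models = [f'~h "{name}"{body}' for name in model_names]
    return header + "".join(models)


# Shortened stand-in for wordHmms/m0/proto, for illustration only.
proto = '~o <VecSize> 13 <MFCC_0>\n~h "proto"\n<BeginHMM>\n<EndHMM>\n'
print(build_hmmdefs(proto, ["a", "e", "i", "o", "u"]), end="")
```

In practice the model names would be read from modelList/wordList so that both files stay consistent.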

Step 2. Training
HERest -C configs\config -I mlfs\words.mlf -S scripts\train.scp -H wordHmms\m0\hmmdefs -M wordHmms\m1 modelList\wordList

HERest
Number of inputs: 5
  -C configs/config // parameters for the features
  -I mlfs/words.mlf // master label file (word labels for each speech file)
  modelList/wordList // word name list (HMM list)
  -S scripts/train.scp // mfc file list for training
  -H wordHmms/m0/hmmdefs // hmmdefs (a set of HMM prototypes) for all words
Number of outputs: 1
  -M wordHmms/m1 // directory for the re-estimated hmmdefs

Input 1: configs/config (configuration for wordHmms/m1)
Input 2: mlfs/words.mlf
Input 3: modelList/wordList
Input 4: scripts/train.scp
Input 5: wordHmms/m0/hmmdefs (MMF)

Output 1: wordHmms/m1/hmmdefs, the result of
HERest -C configs\config -I mlfs\words.mlf -S scripts\train.scp -H wordHmms\m0\hmmdefs -M wordHmms\m1 modelList\wordList

Re-estimate hmmdefs:
HERest -C configs/config -I mlfs/words.mlf -S scripts/train.scp -H wordHmms/m1/hmmdefs -M wordHmms/m2 modelList/wordList

HERest -C configs/config -I mlfs/words.mlf -S scripts/train.scp -H wordHmms/m2/hmmdefs -M wordHmms/m3 modelList/wordList

HERest -C configs/config -I mlfs/words.mlf -S scripts/train.scp -H wordHmms/m3/hmmdefs -M wordHmms/m4 modelList/wordList

HERest -C configs/config -I mlfs/words.mlf -S scripts/train.scp -H wordHmms/m4/hmmdefs -M wordHmms/m5 modelList/wordList
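The repeated HERest passes differ only in the input and output directories, so they can be generated in a loop. The sketch below builds the command lines for m0 through m5; the function name is hypothetical, and the subprocess call is left commented out because it requires HTK to be installed and the output directories to exist.

```python
def herest_commands(first=0, last=5):
    """Build one HERest command per pass, from m<first> up to m<last>."""
    commands = []
    for i in range(first, last):
        commands.append([
            "HERest",
            "-C", "configs/config",
            "-I", "mlfs/words.mlf",
            "-S", "scripts/train.scp",
            "-H", f"wordHmms/m{i}/hmmdefs",   # models from the previous pass
            "-M", f"wordHmms/m{i + 1}",       # directory for the new models
            "modelList/wordList",
        ])
    return commands


for cmd in herest_commands():
    print(" ".join(cmd))
    # import subprocess; subprocess.run(cmd, check=True)  # needs HTK on PATH
```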

Step 3. Recognition Test
HVite -C configs/config -S scripts/test.scp -H wordHmms/m5/hmmdefs -w dic/tag_Net -i mlfs/recOutWordm5.mlf dic/dict modelList/wordList

HVite
Number of inputs: 6
  -C configs/config // parameters for mfc
  modelList/wordList // HMM name list
  -S scripts/test.scp // mfc vector list for testing
  -w dic/tag_Net // word network for recognition
  dic/dict // pronouncing dictionary
  -H wordHmms/m5/hmmdefs // a set of HMMs
Number of outputs: 1
  -i mlfs/recOutWordm5.mlf // result of recognition

dic/dict: writing a pronouncing dictionary
Each line has the form: Word [outsym] models
– Word: the word to be recognized
– [outsym]: the string to output when the word is recognized
– models: the list of HMM models
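To make the Word [outsym] models layout concrete, the following sketch splits one dictionary line into its three parts. The example entries are hypothetical, since the contents of dic/dict are not shown in the slides, and the parser handles only this simple whitespace-separated form.

```python
def parse_dict_line(line):
    """Split a dictionary line into (word, outsym, models).

    outsym is None when no bracketed output symbol is present.
    """
    tokens = line.split()
    word = tokens[0]
    rest = tokens[1:]
    outsym = None
    if rest and rest[0].startswith("[") and rest[0].endswith("]"):
        outsym = rest[0][1:-1]   # text between the brackets
        rest = rest[1:]
    return word, outsym, rest


# Hypothetical entries for illustration.
print(parse_dict_line("a [a] a"))
print(parse_dict_line("sil [] sil"))
```

An empty outsym such as [] means that nothing is written to the output when that word is recognized.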

BNF grammar rules
  $  : variable
  {} : zero or more repetitions
  <> : one or more repetitions
  [] : optional

Grammar:
  $words = a | e | i | o | u;
  (sil $words sil)

HParse -C configs/config dic/tag_v_Gram dic/tag_Net

Inputs: dic/tag_v_Gram (grammar file), configs/config
Result of applying HParse to tag_v_Gram: dic/tag_Net (word network)

HVite -C configs/config -S scripts/test.scp -H wordHmms/m5/hmmdefs -w dic/tag_Net -i mlfs/recOutWordm5.mlf dic/dict modelList/wordList

Inputs: configs/config, scripts/test.scp, modelList/wordList
Output: mlfs/recOutWordm5.mlf

Step 4. Recognition Results
HResults -I mlfs/words.mlf modelList/wordList mlfs/recOutWordm5.mlf

First test: 5 sets (each set represents 1 speaker) => 5 speakers
Second test: 10 sets (each set represents 1 speaker) => 10 speakers
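Among other figures, HResults reports the percentage of correct labels, %Correct = H / N × 100, where H is the number of correctly recognized labels and N is the number of labels in the reference. For isolated vowels with one label per utterance, this can be sketched as below. The toy labels are illustrative only and are not results from these experiments; the function name is hypothetical.

```python
def percent_correct(reference, recognized):
    """Percentage of utterances whose recognized label matches the reference.

    Both arguments map an utterance name to a single label.
    """
    total = len(reference)
    hits = sum(
        1 for utt, label in reference.items()
        if recognized.get(utt) == label
    )
    return 100.0 * hits / total


# Illustrative labels only, not data from the slides.
reference = {"a1001": "a", "e1001": "e", "i1001": "i", "o1001": "o"}
recognized = {"a1001": "a", "e1001": "i", "i1001": "i", "o1001": "o"}
print(percent_correct(reference, recognized))
```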

Comparison of m5 and m6 (hmmdefs): slight difference

HERest -C configs\config -I mlfs\words.mlf -S scripts\train.scp -H wordHmms\m4\hmmdefs -M wordHmms\m5 modelList\wordList

HERest -C configs\config -I mlfs\words.mlf -S scripts\train.scp -H wordHmms\m5\hmmdefs -M wordHmms\m6 modelList\wordList

m5 m6
