Results of Tagalog vowel Speech recognition using Continuous HMM

Arnel C. Fajardo, Ph.D. student

(Under the supervision of Professor Yoon-Joong Kim)

Basic structure of HTK

1. Data Preparation

Speech data for training and testing (wave files)

625 wave files in 25 sets, 5 sets per speaker

- Training data: 5 sets per speaker, 25 sets
- Test data (speaker dependent):
  Test 1: 5 sets
  Test 2: 10 sets

Format of speech data: *.wav, 16 kHz, 16-bit, linear PCM

a1001.wav => "a"
e1001.wav => "e"
i1001.wav => "i"
o1001.wav => "o"
u1001.wav => "u"
...

Variables:
2 tests: 5 speakers (1 set each) and 10 speakers (1 set each)
hmmdefs: m5, m6

Compute Feature Vectors

• Use: HCopy -C configs\HCopy.config -S scripts\HCopy.scp
• HCopy.exe
  – Computes the features from each wave file and saves them in the same folder.
  – MFCC features were used.

-C configs\HCopy.config : configuration file for computing the features

-S scripts\HCopy.scp : script file listing each wave file and its feature file

• HCopy

• Number of input files: 3
  Waveform files: *.wav
  Configuration file: Hcopy.config
  Script file: Hcopy.scp

Number of output files: 1
  MFCC file: *.mfc

Create Hcopy.config in ….Configs/Hcopy.config and write:

# Coding parameters
SOURCEKIND = WAVEFORM
SOURCEFORMAT = NIST
SOURCERATE = 625
TARGETKIND = MFCC_0
TARGETRATE = 100000.0
SAVECOMPRESSED = T
SAVEWITHCRC = T
WINDOWSIZE = 250000.0
USEHAMMING = T
PREEMCOEF = 0.97
NUMCHANS = 26
CEPLIFTER = 22
NUMCEPS = 12
ENORMALISE = F
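HTK expresses time-valued parameters such as SOURCERATE, TARGETRATE and WINDOWSIZE in units of 100 ns. As a minimal sketch (the helper names are illustrative and not part of HTK), the following checks that SOURCERATE = 625 corresponds to the 16 kHz recordings and converts the window and frame-shift settings to milliseconds. The frame count is only an approximation of what HCopy produces.

```python
# Illustrative helpers (not part of HTK): convert HTK time parameters,
# which are given in units of 100 ns, into more familiar quantities.

HTK_UNIT_SECONDS = 100e-9  # one HTK time unit = 100 nanoseconds


def sample_rate_hz(source_rate: float) -> float:
    """Sampling frequency implied by SOURCERATE (sample period in 100 ns units)."""
    return 1.0 / (source_rate * HTK_UNIT_SECONDS)


def to_milliseconds(htk_time: float) -> float:
    """Convert a duration in 100 ns units to milliseconds."""
    return htk_time * HTK_UNIT_SECONDS * 1000.0


def frames_in(duration_s: float, window: float, shift: float) -> int:
    """Approximate number of full analysis windows in a signal of the given length."""
    total = duration_s / HTK_UNIT_SECONDS
    if total < window:
        return 0
    return int((total - window) // shift) + 1


if __name__ == "__main__":
    print(round(sample_rate_hz(625)))          # sampling frequency in Hz
    print(to_milliseconds(250000.0))           # WINDOWSIZE in ms
    print(to_milliseconds(100000.0))           # TARGETRATE (frame shift) in ms
    print(frames_in(1.0, 250000.0, 100000.0))  # frames in a 1 s recording
```

With these settings each feature vector covers a 25 ms Hamming window, and a new vector is produced every 10 ms.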

Script file: Scripts/Hcopy.scp
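An HCopy script file lists one source file and one target file per line. The sketch below builds such lines for a few wave files; the `data` directory and the function name are assumptions made for illustration, since the actual file list is not reproduced in this transcript.

```python
# Sketch: build the lines of an HCopy script file, one "source target" pair per line.
# The directory name "data" is hypothetical; only the naming pattern comes from the slides.
from pathlib import PurePosixPath


def hcopy_script_lines(wav_paths):
    """Map each wave file to a feature file with the same name and a .mfc extension."""
    lines = []
    for wav in wav_paths:
        wav_path = PurePosixPath(wav)
        mfc_path = wav_path.with_suffix(".mfc")  # feature file stays in the same folder
        lines.append(f"{wav_path} {mfc_path}")
    return lines


if __name__ == "__main__":
    example = ["data/a1001.wav", "data/e1001.wav", "data/i1001.wav"]
    print("\n".join(hcopy_script_lines(example)))
```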

Prepare the master label file

Master label file (word-level transcriptions): mlfs/words.mlf
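A master label file starts with the `#!MLF!#` header, followed by one entry per utterance: a quoted label-file pattern, the labels, and a line containing a single period. The sketch below generates such text under two assumptions that the slides do not confirm: the first character of each file name is the vowel spoken (as in a1001.wav => "a"), and each entry holds only that one label.

```python
# Sketch: write word-level transcriptions in HTK master label file (MLF) layout.
# Assumption: the first character of each file name is the vowel spoken,
# following the a1001.wav => "a" naming convention, with one label per file.
from pathlib import PurePosixPath


def mlf_text(wav_names):
    out = ["#!MLF!#"]
    for name in wav_names:
        stem = PurePosixPath(name).stem   # e.g. "a1001"
        out.append(f'"*/{stem}.lab"')     # pattern matching the label file name
        out.append(stem[0])               # the vowel label
        out.append(".")                   # a lone period ends each entry
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    print(mlf_text(["a1001.wav", "e1001.wav"]), end="")
```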

Model list: modelList/wordList

HMM model name list

Generate the initial master macro file (hmmdefs)

HCompV -C configs\config -f 0.01 -m -S scripts\train.scp -M wordHmms\m0\ wordHmms\proto

• HCompV.exe

Number of inputs: 3
Input 1: -C configs/config // parameters for computing the features
  -f 0.01 // the variance floor macro (called vFloors) is computed as 0.01 times the global variance
  -m // the means as well as the variances are computed
Input 2: -S scripts/train.scp // list of MFCC feature files used in training
Input 3: wordHmms/proto // the handwritten HMM prototype

• Number of outputs: 1
Output 1: -M wordHmms/m0 // directory for the results
  // vFloors: variance floor macro
  // proto: HMM prototype with valued GMM
  // hmmdefs: written manually from proto
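The statistics involved in this step can be sketched in a few lines: a global mean and variance per feature dimension over all training frames, and a variance floor equal to 0.01 times the global variance. This is only an illustration of the arithmetic, not HCompV's implementation; the toy vectors are made up, and dividing by the number of frames is an assumption about the normalization.

```python
# Sketch of the statistics described for HCompV: global mean and variance
# per feature dimension, plus a variance floor of 0.01 times the global variance.
# The toy vectors below are invented for illustration; they are not real MFCC data.

def global_stats(frames, floor_scale=0.01):
    """Return (means, variances, floors), each a list with one value per dimension."""
    if not frames:
        raise ValueError("at least one frame is required")
    n = len(frames)
    dim = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dim)]
    # Assumption: population variance (divide by n).
    variances = [sum((f[d] - means[d]) ** 2 for f in frames) / n for d in range(dim)]
    floors = [floor_scale * v for v in variances]
    return means, variances, floors


if __name__ == "__main__":
    toy_frames = [[1.0, 2.0], [3.0, 6.0]]
    means, variances, floors = global_stats(toy_frames)
    print(means, variances, floors)
```

Flooring keeps any re-estimated variance from collapsing toward zero when a state sees little data.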

Input 1: configs/config

script/Hcopy.config => configs/config

wordHmms/m0/vFloors

Global constant values used for computing b_j(o_t)

Input 2: scripts/train.scp

Input 3: wordHmms/proto

General HMM model (prototype) for monophone speech. It has 3 emitting states.

Note: NumStates is 5 because HTK also counts the non-emitting entry and exit states (states 1 and 5).
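In the HTK definition format, only states 2 to 4 of a 5-state model carry output distributions; states 1 and 5 have none. The sketch below generates a prototype of that shape. The vector size of 13 is inferred from MFCC_0 with NUMCEPS = 12, while the zero means, unit variances and transition probabilities are placeholder assumptions, not the values used in the experiment.

```python
# Sketch: generate the text of an HTK prototype HMM definition.
# Assumptions: 13-dimensional MFCC_0 vectors (NUMCEPS = 12 plus C0),
# zero means, unit variances, and illustrative transition probabilities.

def proto_text(name="proto", vec_size=13, num_states=5, kind="MFCC_0"):
    zeros = " ".join(["0.0"] * vec_size)
    ones = " ".join(["1.0"] * vec_size)
    lines = [
        f"~o <VecSize> {vec_size} <{kind}>",
        f'~h "{name}"',
        "<BeginHMM>",
        f"<NumStates> {num_states}",
    ]
    for state in range(2, num_states):        # states with output distributions
        lines += [
            f"<State> {state}",
            f"<Mean> {vec_size}",
            f" {zeros}",
            f"<Variance> {vec_size}",
            f" {ones}",
        ]
    lines.append(f"<TransP> {num_states}")
    for i in range(num_states):
        row = [0.0] * num_states
        if i == 0:
            row[1] = 1.0                       # entry state moves to the first emitting state
        elif i < num_states - 1:
            row[i], row[i + 1] = 0.6, 0.4      # stay in the state or move right
        lines.append(" " + " ".join(f"{p:.1f}" for p in row))
    lines.append("<EndHMM>")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(proto_text(), end="")
```

HCompV then replaces the placeholder means and variances with the global values estimated from the training data.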

wordHmms/proto + global means and variances => wordHmms/m0/proto

Shows the result of the HCompV command for wordHmms/m0/proto

Input 3: wordHmms/m0/hmmdefs, the master macro file (MMF)
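Since hmmdefs is written by hand from the prototype, one way to avoid copy-and-paste slips is to generate it. The sketch below copies a single prototype definition under each model name. The function name and the model names in the example (the five vowels plus sil) are assumptions, because the actual wordList is not reproduced in this transcript.

```python
# Sketch: build an hmmdefs (MMF) text by copying one prototype definition
# under each model name. The model names used in the example are assumed
# from the vowel grammar; the real wordList is not shown in the transcript.

def clone_prototype(proto_definition, model_names, proto_name="proto"):
    marker = f'~h "{proto_name}"'
    header, found, body = proto_definition.partition(marker)
    if not found:
        raise ValueError("prototype name not found in definition")
    parts = [header.rstrip("\n")] if header.strip() else []
    for name in model_names:
        # Same states and transitions, different model name.
        parts.append(f'~h "{name}"' + body.rstrip("\n"))
    return "\n".join(parts) + "\n"


if __name__ == "__main__":
    tiny_proto = '~o <VecSize> 2 <MFCC_0>\n~h "proto"\n<BeginHMM>\n<EndHMM>\n'
    print(clone_prototype(tiny_proto, ["a", "e", "i", "o", "u", "sil"]), end="")
```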

Step 2. Training

HERest -C configs\config -I mlfs\words.mlf -S scripts\train.scp -H wordHmms\m0\hmmdefs -M wordHmms\m1 modelList\wordList

HERest
Number of inputs: 5
-C configs/config // parameters for the features
-I mlfs/words.mlf // master label file: word and speech file
modelList/wordList // word name list (HMM list)
-S scripts/train.scp // MFCC file list for training
-H wordHmms/m0/hmmdefs // hmmdefs (a set of HMM prototypes) for all words

Number of outputs: 1
-M wordHmms/m1 // re-estimated hmmdefs

Input 1: configs/config

Configuration for wordHmms/m1

Input 3: modelList/wordList

Input 2: mlfs/words.mlf

Input 4: scripts/train.scp

Input 5: wordHmms/m0/hmmdefs (MMF)

Output 1

HERest -C configs\config -I mlfs\words.mlf -S scripts\train.scp -H wordHmms\m0\hmmdefs -M wordHmms\m1 modelList\wordList

Result: wordHmms/m1/hmmdefs

Re-estimate hmmdefs:

HERest -C configs/config -I mlfs/words.mlf -S scripts/train.scp -H wordHmms/m1/hmmdefs -M wordHmms/m2 modelList/wordList

HERest -C configs/config -I mlfs/words.mlf -S scripts/train.scp -H wordHmms/m2/hmmdefs -M wordHmms/m3 modelList/wordList

HERest -C configs/config -I mlfs/words.mlf -S scripts/train.scp -H wordHmms/m3/hmmdefs -M wordHmms/m4 modelList/wordList

HERest -C configs/config -I mlfs/words.mlf -S scripts/train.scp -H wordHmms/m4/hmmdefs -M wordHmms/m5 modelList/wordList
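Each pass reads the hmmdefs from the previous directory and writes the re-estimated models to the next one, so the sequence of commands can be generated rather than typed. The sketch below builds the argument lists for the passes from m0 to m5 using only the flags shown in the slides. The function name is illustrative, and nothing is executed.

```python
# Sketch: build the HERest command lines for successive re-estimation passes.
# Only flags that appear in the slides are used; the commands are returned
# as argument lists and are not executed here.

def herest_commands(first=0, last=5, root="wordHmms"):
    commands = []
    for i in range(first, last):
        commands.append([
            "HERest",
            "-C", "configs/config",
            "-I", "mlfs/words.mlf",
            "-S", "scripts/train.scp",
            "-H", f"{root}/m{i}/hmmdefs",   # models from the previous pass
            "-M", f"{root}/m{i + 1}",       # directory for the re-estimated models
            "modelList/wordList",
        ])
    return commands


if __name__ == "__main__":
    for cmd in herest_commands():
        print(" ".join(cmd))
```

On a machine with HTK installed, each list could be passed to `subprocess.run(cmd, check=True)`, provided the output directories already exist.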

Step 3. Recognition Test

HVite -C configs/config -S scripts/test.scp -H wordHmms/m5/hmmdefs -w dic/tag_Net -i mlfs/recOutWordm5.mlf dic/dict modelList/wordList

HVite
Number of inputs: 6
-C configs/config // parameters for the MFCC features
modelList/wordList // HMM name list
-S scripts/test.scp // MFCC vector list for testing
-w dic/tag_Net // word network for recognition
dic/dict // pronouncing dictionary
-H wordHmms/m5/hmmdefs // a set of HMMs

Number of outputs: 1
-i mlfs/recOutWordm5.mlf // result of recognition

dic/dict: writing a pronouncing dictionary

Word [outsym] models
– Word: word to be recognized
– [outsym]: string to output when the word is recognized
– models: HMM model list
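The `Word [outsym] models` layout can be illustrated with a small generator. It assumes whole-word models, so that each word is pronounced by a single HMM of the same name, and it gives sil an empty output symbol so that it would not appear in the recognized output. Both choices are assumptions, since the actual dictionary is not reproduced in this transcript.

```python
# Sketch: produce dictionary lines in the "Word [outsym] models" layout.
# Assumptions: each word maps to one HMM with the same name, and the
# optional output symbols are supplied by the caller.

def dict_lines(words, outsyms=None):
    outsyms = outsyms or {}
    lines = []
    for word in sorted(words):                # keep entries in sorted order
        if word in outsyms:
            lines.append(f"{word} [{outsyms[word]}] {word}")
        else:
            lines.append(f"{word} {word}")    # no outsym: the word itself is output
    return lines


if __name__ == "__main__":
    print("\n".join(dict_lines(["a", "e", "i", "o", "u", "sil"], {"sil": ""})))
```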

BNF

Grammar rules
$ : variable
{} : zero or more repetitions
<> : one or more repetitions
[] : optional

$words = a | e | i | o | u;
(sil $words sil)
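Because this grammar has a single alternation and no repetition, the set of word sequences it accepts is small enough to enumerate. The sketch below lists them and checks whether a given sequence is accepted. It only mirrors this particular grammar and is not a general BNF parser.

```python
# Sketch: enumerate the word sequences accepted by the grammar
#   $words = a | e | i | o | u;
#   (sil $words sil)
# This mirrors the one grammar above; it is not a general BNF parser.

WORDS = ("a", "e", "i", "o", "u")


def accepted_sequences(words=WORDS):
    """Every accepted utterance is sil, then exactly one vowel, then sil."""
    return [("sil", w, "sil") for w in words]


def is_accepted(sequence, words=WORDS):
    return tuple(sequence) in accepted_sequences(words)


if __name__ == "__main__":
    for seq in accepted_sequences():
        print(" ".join(seq))
```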

HParse -C configs/config dic/tag_v_Gram dic/tag_Net

(dic/tag_v_Gram)

HParse -C configs/config dic/tag_v_Gram dic/tag_Net

Results of applying HParse to tag_v_Gram:

dic/tag_Net configs/config

HVite -C configs/config -S scripts/test.scp -H wordHmms/m5/hmmdefs -w dic/tag_Net -i mlfs/recOutWordm5.mlf dic/dict modelList/wordList

configs/config, scripts/test.scp, modelList/wordList

HVite -C configs/config -S scripts/test.scp -H wordHmms/m5/hmmdefs -w dic/tag_Net -i mlfs/recOutWordm5.mlf dic/dict modelList/wordList

mlfs/recOutWordm5.mlf

Step 4. Recognition results

HResults -I mlfs/words.mlf modelList/wordList mlfs/recOutWordm5.mlf

First test: 5 sets (each set represents 1 speaker) => 5 speakers

Step 4. Recognition results

HResults -I mlfs/words.mlf modelList/wordList mlfs/recOutWordm5.mlf

Second test: 10 sets (each set represents 1 speaker) => 10 speakers
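For isolated vowels, scoring reduces to comparing one reference label with one recognized label per utterance. The sketch below computes the percentage of correctly recognized utterances. It is a simplification of what HResults does, which aligns label sequences and counts substitutions, deletions and insertions. The labels in the example are toy values and not results from the experiment.

```python
# Sketch: score isolated-word recognition by comparing one reference label
# with one recognized label per utterance. The labels in the example are
# toy values, not results from the experiment.

def percent_correct(reference, recognized):
    """reference and recognized map utterance names to a single label."""
    if not reference:
        raise ValueError("reference transcriptions are required")
    hits = sum(
        1 for utt, label in reference.items() if recognized.get(utt) == label
    )
    return 100.0 * hits / len(reference)


if __name__ == "__main__":
    toy_reference = {"a1001": "a", "e1001": "e", "i1001": "i", "o1001": "o"}
    toy_recognized = {"a1001": "a", "e1001": "e", "i1001": "e", "o1001": "o"}
    print(percent_correct(toy_reference, toy_recognized))
```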

Comparison of m5 and m6 (hmmdefs): slight difference

HERest -C configs\config -I mlfs\words.mlf -S scripts\train.scp -H wordHmms\m4\hmmdefs -M wordHmms\m5 modelList\wordList

HERest -C configs\config -I mlfs\words.mlf -S scripts\train.scp -H wordHmms\m5\hmmdefs -M wordHmms\m6 modelList\wordList

m5 m6

END