Fixed-Location Pulse Linear Prediction Coding Vocoder Model...

Research ArticleFixed-Location Pulse Linear Prediction Coding Vocoder Modelfor Low Bit Rate Speech Coding

Zhen Ma

School of Information Engineering Binzhou University China

Correspondence should be addressed to Zhen Ma 13954380080139com

Received 26 September 2018 Revised 4 December 2018 Accepted 13 December 2018 Published 31 December 2018

Academic Editor Paolo Crippa

Copyright copy 2018 Zhen Ma This is an open access article distributed under the Creative Commons Attribution Licensewhich permits unrestricted use distribution and reproduction in any medium provided the original work is properlycited

An arbitrary-location pulse determination algorithm based on multipulse linear prediction coding (MP-LPC) is presented Thisalgorithm can determine all the amplitudes of the pulses at a time according to given pulse locations without the use of analysis-by-synthesis This ensures that the pulses are optimal in a least-square sense providing the theoretical foundation to improvethe quality of synthesized speech A fixed-location pulse linear prediction coding (FLP-LPC) method is proposed based on thearbitrary-location pulse determination algorithm Simulation of the algorithm in MATLAB showed the superior quality of thespeech synthesized using pulses in different locations and processed using the arbitrary-location pulse determination algorithmThe algorithm improved speech quality without affecting coding time which was approximately 15 of the coding time for MP-LPC Pulse locations in FLP-LPC are fixed and do not need to be transmitted with only LSF gain and 16 pulse amplitudes requiringcoding and transmission FLP-LPC allows the generation of synthesized speech similar to G729 coded speech at a rate of 25 kbps

1 Introduction

As one of the most common ways to exchange ideas speechcoding has been the topic of extensive research for manyyears [1ndash4] Inmobile communication systems alongwith theexplosive growth in the number of users a dramatic increasein telephone traffic has led to limited bandwidth allotted toeach speech channel

The quality of synthesized speech and the coding rate arecontradictory factors in communication systems Parametercoding focuses on and extracts the parameters of vocal tractmodel and excitation which can synthesize speech at bitrates below approximately 4 kbps The synthesized speechhas intelligibility although it sounds unnatural Parametercoding includes duality excitation linear prediction mixedexcitation linear prediction (MELP) [5] (McCree and Barn-well 1995) waveform interpolation (WI) [6] sinusoidaltransform coding (STC) [7] andmultiband excitation (MBE)[8] Hybrid coding can achieve good synthesized speechquality at coding rates ranging from 4 to 16 kbps which isused in different areas Hybrid coding includes multipulse

linear prediction coding (MP-LPC) [9] and code-excitedlinear prediction [4 10]

MP-LPC is a typical analysis-by-synthesis linear predic-tive coding (ABS-LPC) method in which dozens of pulsesare selected as excitation signals [11] MP-LPC can achievegood synthesized speech quality at low coding rates howeverits pulse determination algorithm is complex MP-LPC wasimproved by simplifying the amplitudes and locations ofexcitation pulses In regular-pulse excitation linear predictioncoding (RPE-LPC) [12] equispaced pulses are used as theexcitation signal Therefore only the amplitude of pulsesand the first pulse location need to be determined andtransmitted Inmultipulse maximum likelihood quantization(MP-MLQ) [13] the locations of pulses can be either allodd or all even and the amplitudes of pulses can be thesigns (plusmn1) In RPE-LPC and MP-MLQ the locations orlocations and amplitudes of pulses become more regularand less information about them needs to be transmittedIn the present study we present an arbitrary-location pulsedetermination algorithm based on conventional MP-LPC Inthis algorithm pulse locations can be assigned arbitrarily

HindawiMathematical Problems in EngineeringVolume 2018 Article ID 1732151 7 pageshttpsdoiorg10115520181732151

2 Mathematical Problems in Engineering

without searching through the analysis-by-synthesis proce-dure These pulses with arbitrarily assigned locations andoptimal amplitudes can produce speech approaching originalspeech in a least-square sense Based on the arbitrary-locationpulse determination algorithm a fixed-location pulse linearprediction coding (FLP-LPC) method is presented Thearbitrary-location pulse determination algorithm can syn-thesize speech with better quality within a shorter codingtime than the sequential method FLP-LPC not only yieldssynthesized speech of high quality and short coding time butcan also reduce the coding rate

2 MP-LPC

For a speech frame of length N M pulses are used as theexcitation signal denoted as 119901(119899) = sum119872119896=1 119892119896lowast120575(119899minus119899119896) where119892119896 and 119899119896 are the amplitude and location of the kth pulserespectively The key of MP-LPC is to determine 119892119896 and 119899119896to minimize the perceptual weighted error between originaland synthesized speech

The corresponding synthesized speech signal is

119904 (119899) = 1199040 (119899) + 119872sum119896=1

119892119896 lowast ℎ (119899 minus 119899119896) (1)

where 1199040(119899) is the zero-input response of the linear prediction(LP) filter that represents the effect of the previous frame andℎ(119899) is the unit impulse response of the LP filter

The error between 119904(119899) and 119904(119899) is119890119904 (119899) = 119904 (119899) minus 119904 (119899) = 119890119899 (119899) minus 119872sum

119896=1

119892119896 lowast ℎ (119899 minus 119899119896) (2)

where 119890119899(119899) = 119904(119899) minus 1199040(119899) is the equivalent speech withoutthe effect of the previous frame that is the excitation signalfiltered by the LP filter 119890119904(119899) filtered by the perceptuallyweighted filter is

119890 (119899) = [119890119899 (119899) minus 119872sum119896=1

119892119896 lowast ℎ (119899 minus 119899119896)] lowast 119908 (119899)

= 119890119908 (119899) minus 119872sum119896=1

119892119896 lowast ℎ119908 (119899 minus 119899119896) (3)

where ℎ119908(119899) is the unit impulse response of the perceptuallyweighted filter The perceptually weighted mean square error119864119898 is119864119898 = 119873sum

119899=1

1198902 (119899) = 119873sum119899=1

[119890119908 (119899) minus 119872sum119896=1

119892119896 lowast ℎ119908 (119899 minus 119899119896)]2

(4)

Themain idea of MP-LPC is to minimize 119864119898 by selectingthe appropriate 119892119896 and 119899119896 in the excitation signal By settingthe partial derivative of 119864119898 to zero M linear equations andM nonlinear equations can be obtained Solving these 2Mequations at a time is complicated Therefore in MP-LPC asequential method is used to determine the amplitude and

location of one pulse in each iteration AfterM iterations theamplitudes and locations ofM pulses can be determinedThe119899119895 of the jth pulse is the location to maximize the followingformula

1198772119890ℎ (119899119895)119877ℎℎ (119899119895 119899119895) (5)

Next 119892119895 is determined according to the 119899119895 determinedabove as follows

119892119895 = 119877119890ℎ (119899119895)119877ℎℎ (119899119895 119899119895) (6)

where

119877119890ℎ (119899119895) = 119873sum119899=1

119890119908 (119899) lowast ℎ119908 (119899 minus 119899119895) (7)

119877ℎℎ (119899119896 119899119895) = 119873sum119899=1

ℎ119908 (119899 minus 119899119896) lowast ℎ119908 (119899 minus 119899119895) (8)

3 Arbitrary-Location PulseDetermination Algorithm

From the above the main idea of MP-LPC is to determine 119892119896and 119899119896 for 119890(119899) = 119890119908(119899) minussum119872119896=1 119892119896 lowastℎ119908(119899 minus 119899119896) = 0 that is

119872sum119896=1

119892119896 lowast ℎ119908 (119899 minus 119899119896) = 119890119908 (119899) (9)

The aboveM equations can be written compactly as

119867120573 = 119864 (10)

where

119867(1198991 1198992 119899119872)

=[[[[[[[[[

ℎ119908 (1 minus 1198991) sdot sdot sdot ℎ119908 (1 minus 119899119872)sdot sdotsdot sdotsdot sdot

ℎ119908 (119873 minus 1198991) sdot sdot sdot ℎ119908 (119873 minus 119899119872)

]]]]]]]]]119873times119872

(11)

120573 = [1198921 1198922 119892119872]119879 (12)

119864 = [119890119908 (1) 119890119908 (2) 119890119908 (119873)]119879 (13)

It can be inferred that 119867 will remain unchanged once1198991 1198992 119899119872 are determined (even though assigned arbi-trarily) in the beginning of the search Therefore formula(10) can be regarded as a linear system If 119873 = 119872 and119867 is nonsingular then linear system (10) has a uniquesolution However in most cases 119873 gt 119872 and the system isoverdetermined and may not be consistent therefore it does

Mathematical Problems in Engineering 3

not always have a unique solution To determine 119892119896 is simplyequivalent to finding a least-squares solution 120573 of (10)

10038171003817100381710038171003817119867120573 minus 11986410038171003817100381710038171003817 = min120573

1003817100381710038171003817119867120573 minus 1198641003817100381710038171003817 (14)

The smallest norm least-squares solution of the abovelinear system can be determined by H+ the MoorendashPenrosegeneralized inverse of H [14]

120573 = 119867+119864 (15)

Several methods can be used to calculate theMoorendashPen-rose generalized inverse including orthogonal projection theorthogonalization method the iterative method and singularvalue decomposition (SVD) [14]

The locations of pulses in the excitation signal can beassigned arbitrarily which has no effect on the existence ofthe smallest norm least-squares solution of (10) The processof the arbitrary-location pulse determination algorithm canbe summarized as follows

Step 1 Assign pulse locations 119899119894 119894 = 1 119872 arbitrarily

Step 2 Calculate the unit impulse response matrix H

Step 3 Calculate the pulse amplitude vector 120573 120573 = 119867+119864The location of pulses in the excitation signal has no effect

on the existence of the smallest norm least-squares solution of(10) namely the synthesized speech can always approach theoriginal speech in a least-square sense regardless of the pulselocations Moreover the transmission of pulse locations willincrease the coding rate leading to the waste of bandwidthAs discussed above for different speech frames the pulsesin fixed locations but with different amplitudes can be usedas the excitation signal We therefore developed a methodin which only the pulse amplitudes and LPC parametersare coded and the pulse locations do not need to be codedThis method is called fixed-location pulse linear predictioncoding

4 Results

41 Performance Evaluation of the Arbitrary-Location PulseDetermination Algorithm To test the effect of a combinationof different pulse locations on the quality of synthesizedspeech processed by the proposed arbitrary-location pulsedetermination algorithm five sections of speech from fivedifferent speakers with equal content were analyzed Thesampling frequency was 8000 Hz There were 813 framesin the five sections of speech and each frame included 160samples The speech frames were analyzed in 50 trials andthe locations of pulses were arbitrarily selected at each trialthe amplitudes of all pulses were then calculated using theproposed arbitrary-location pulse determination algorithm

The pulses at different locations and the synthesizedspeech generated by each for the same frame of speechare shown in Figure 1 The residual signal and originalspeech are shown in Figures 1(a) and 1(b) respectively For

direct comparison with the sequential method the samelocations of pulses were used in the arbitrary-location pulsedetermination algorithm and in the sequential method Theexcitation signals obtained by the sequential method andarbitrary-location pulse determination algorithm (in thesame locations but with different amplitudes) are shown inFigures 1(c) and 1(e) and the corresponding synthesizedspeech is shown in Figures 1(d) and 1(f) These two methodswere capable of generating synthesized speech close to theoriginal speechwith signal to noise ratios (SNR) of 171471 and199547 indicating that the proposed method is superior tothe sequential method Here SNR is defined as

119878119873119877 = 10 log[ sum119873minus1119896=0 1198782119900 (119896)sum119873minus1119896=0 (119878119900 (119896) minus 119878119903 (119896))2] (16)

where 119878119900(119896) and 119878119903(119896) are the original and synthesized speechsignals respectively Figures 1(g)ndash1(p) show the speech syn-thesized with five strings of pulses in different locationsillustrating the superior SNRs obtained with the proposedmethod compared with those of the sequential methodThese results indicate that high-quality synthesized speechcan be obtained with pulses in different locations and theamplitudes were calculated using the proposed method Themean SNR for the 40650 trials performed was 192937 Themean coding time of the sequential method was 01224 andthat of the proposed method was 00019 only 155 of thecoding time for sequential method

For the above speeches 50 trials were conducted for allthe speech frames At each trial 8 10 12 14 16 18 2022 24 28 and 32 pulses with arbitrarily selected locationswere extracted and the amplitudes were calculated usingthe proposed algorithm The results are shown in Figure 2which shows that the box heights decrease and the mediansincrease (Figure 2(a)) The increase in the mean SNR at alltrials and the decrease in standard deviation with increasingpulse number indicate that the greater number of pulses leadsto a more steady solving process and a more modest effectof the assigned pulse locations As mentioned by Ma et al[15] for a certain frame of speech when the pulse numberreaches a certain value a continuing increase of pulses willnot improve the quality of synthesized speech Regarding theproposed algorithmwhen the pulse number reaches a certainvalue the rank of H does not increase subsequently and doesnot contribute significantly to solving for amplitudes Theaverage coding times for different pulse numbers are shownin Figure 2(b) An increase in the pulse number results ina gradual increase in the average coding time The averagecoding time for 32 pulses was 00028 s which was lowerthan that of the sequential method at 01224 s The excitationsignals obtained with different numbers of pulses and thesynthesized speeches for the same frame of speech are shownin Figure 3 The results show that these synthesized speecheswere close to the original speech and had good SNRs Figures2 and 3 indicate thatwhen the pulse number is greater than 16the quality of synthesized speech is improved in a nonobviousmanner


0 50 100 150minus02

0

02

(a)0 50 100 150

minus02

0

02

(b)0 50 100 150

minus01

0

01

(c)0 50 100 150

minus02

0

02 SNR=171471

(d)

0 50 100 150minus01

0

01

(e)0 50 100 150

minus02

0

02 SNR=199547

(f)0 50 100 150

minus01

0

01

(g)0 50 100 150

minus02

0

02SNR=263215

(h)

0 50 100 150minus01

0

01

(i)0 50 100 150

minus02

0

02SNR=257581

(j)0 50 100 150

minus01

0

01

(k)0 50 100 150

minus02

0

02SNR=248960

(l)

0 50 100 150minus01

0

01

(m)0 50 100 150

minus02

0

02SNR=247275

(n)0 50 100 150

minus01

0

01

(o)0 50 100 150

minus02

0

02SNR=245586

(p)

Figure 1 (a) Residual signal (b) original speech (c) multipulse excitation signal obtained by the sequential method (d) speech synthesizedusing the excitation signal in (c) (e)multipulse excitation signal obtained by the presentedmethodwith the same locations as the signal in (c)(f) speech synthesized using the excitation signal in (e) (gndashp) the pulses with the locations arbitrarily assigned and the speeches synthesized

42 Performance Evaluation of FLP-LPC

421 Performance Evaluation of Unquantized FLP-LPC Inthe proposed method the locations of pulses can be assignedarbitrarily and do not need to be calculated using analgorithm Therefore the pulses at fixed locations but withdifferent amplitudes can be used as the excitation signal forevery speech frame The proposed method and sequentialmethod [11] were used to process the same speech spokenby a female and male from Chinese Central Television newsbroadcasting (25343950 s) The PESQ MOS specified in theITU standard P862 and SNR were used to evaluate speechquality Average SNR and PESQ MOS as a function of pulsenumber are shown in Figure 4

The speech obtained with FLP-LPCwasmore natural andintelligible than that obtained with MP-LPC Average SNRand PESQ MOS for MP-LPC and FLP-LPC increase withpulse number But they increase insignificantly when thepulse numbers are more than 18

422 Coding Scheme of FLP-LPC and Performance Evalua-tion The present results indicate that for a speech frameof 20 ms 16 pulses are sufficient to produce good-qualitysynthesized speech Therefore in the coding scheme of FLP-LPC 16 evenly distributed pulses are used as the excitationsignal First the amplitudes of pulses are normalized andthen the gain and normalized amplitudes are coded Thecoded parameters are LSF gain and pulse amplitude LSF andnormalized amplitudes are multistage vector quantized andthe gain is quantized using 4 bits The specific bit allocationof FLP-LPC is shown in Table 1

The test speeches included 20 sets of samples from twomen and two women with a sample frequency of 8000 HzThere were background noises in the recording such asa creaking door and automobile noises The test speechesincluded a database composed of 1560 sentences from 83men and 83 women the content of which was selectedfrom Peoplersquos Daily These speeches were coded using FLP-LPC G7231 and G729 and the PESQ MOS values were


5

10

15

20

25

30

35

40

SNR

Pulse number16 18 20 22 24 28 32

m=141550=44953

m=144212=43606

m=146707=42096

m=149287=41331

m=151452=40176

m=152482=37266

m=153649=35191

(a)

16 18 20 22 24 28 32Pulse number

times10minus3

0

1

2

3

Aver

age c

odin

g tim

e

(b)

Figure 2 (a) SNR distribution with different pulse numbers (b) average coding time with different pulse numbers

Table 1 FLP-LPC bit allocation

Parameter BitsLSF 18Gain 4Pulse amplitudes 28total 50

Table 2 Performance comparison of FLP-LPC with G7231 andG729

Codingmethodstandard

Codingrate (kbps) PESQ MOS

FLP-LPC 25 3731G7231 53 3497G729 8 3765

calculated as shown in Table 2 FLP-LPC synthesized speechwith similar quality to that generated with G729 and superiorto that of G7231 at a coding rate of 25 kbps

5 Conclusion

To solve the problems associated with MP-LPC an arbitrary-location pulse determination algorithm was developed in thecurrent study In this algorithm the pulse amplitudes aredetermined by solving a linear system of equations underthe premise of arbitrarily assigned locations of pulses Theexistence of a smallest norm least-squares solution of thislinear system is not affected by the locations of pulses in theexcitation signal Tests performed in different speech framesshowed that pulse combinations with different locations canbe used as the excitation signal to synthesize high-qualityspeech yielding better results than those obtained withthe conventional sequential method The sequential methoddetermines one pulse at a time which ensures that theadded pulse is optimal at every iteration whereas it does notguarantee that the combination of pulses after all iterationsis optimal In the proposed algorithm the combination ofpulses determined at a time is optimal in a least-square sensewhich provides the theoretical basis for ensuring the qualityof synthesized speech The proposed algorithm does notincrease coding time to improve synthesized speech qualitywhich is 15 of the coding time in MP-LPC To study theeffect of pulse number on the quality of synthesized speechthe excitation signals with different numbers of pulses were


0 50 100 150minus02

0

02

(a)0 50 100 150

minus02

0

02

(b)0 50 100 150

minus01

0

01

16pulses

(c)0 50 100 150

minus02

0

02 SNR=201730

(d)

0 50 100 150minus01

0

0118pulses

(e)0 50 100 150

minus02

0

02 SNR=203125

(f)0 50 100 150

minus01

0

0120pulses

(g)0 50 100 150

minus02

0

02 SNR=208017

(h)

0 50 100 150minus01

0

01 22pulses

(i)0 50 100 150

minus02

0

02 SNR=209049

(j)0 50 100 150

minus01

0

0124pulses

(k)0 50 100 150

minus02

0

02 SNR=206357

(l)

0 50 100 150minus01

0

0128pulses

(m)0 50 100 150

minus02

0

02 SNR=211795

(n)0 50 100 150

minus01

0

0132pulses

(o)0 50 100 150

minus02

0

02 SNR=217628

(p)

Figure 3 (a) residual signal (b) original speech (cndashp) excitation signals with 16 18 20 22 24 28 and 32 pulses and the speeches synthesized

5 10 15 20 25 300

10

20

FLP-LPCMP-LPC

(a)

5 10 15 20 25 300

2

4

6

FLP-LPCMP-LPC

(b)

Figure 4 (a) average SNR and (b) PESQ MOS as a function of pulse number for MP-LPC and FLP-LPC

calculated for the same frames of speech The results showedthat 16 pulses were sufficient to generate 20 ms long speech

In improved MP-LPC methods such as RPE-LPC andMP-MLQ the locations or amplitudes or both of pulsesbecome more regular to reduce the information on theexcitation signals to be transmitted FLP-LPC was developedto further reduce the coding rate The method is based onthe premise that the pulse vector in a least-square sense isindependent of the locations of pulses in excitation signal

In FLP-LPC the locations of pulses are fixed and only theamplitudes of pulses need to be determined through thearbitrary-location pulse determination algorithm The pulselocations do not need coding or transmitting which reducesthe coding rate without affecting the quality of synthesizedspeech Our results show that the SNR and the PESQ MOS ofthe speech synthesized with FLP-LPCwere higher than thoseof speech generated by MP-LPC Moreover we developedan FLP-LPC coding scheme in which the pulses are evenly


distributed FLP-LPC can synthesize speech with qualitysimilar to that of G729 and superior to that generated byG7231 at a coding rate of 25 kbps In conclusion FLP-LPCcan synthesize speech of high quality with a short codingtime and can reduce the coding rate however it has thedrawback of a biggermemory requirement for calculating theMoorendashPenrose generalized inverse

Data Availability

The speech data used to support the findings of this studywere supplied by Chinese Linguistic Data Consortium underlicense and so cannot be made freely available Requestsfor access to these data should be made to Mengyi Sunservicechineseldcorg

Conflicts of Interest

The author declares that they have no competing interests

Acknowledgments

Project supported byMinistry of Education of ChinaHuman-ities and Social Sciences Research Project (18YJCZH129)Natural Science Foundation of Shandong (ZR2014FL005)Binzhou University Scientific Research Fund Project(2016Y29)

References

[1] F Lahouti A R Fazel A H Safavi-Naeini andA K KhandanildquoSingle and double frame coding of speech LPC parametersusing a lattice-based quantization schemerdquo IEEE Transactionson Audio Speech and Language Processing vol 14 no 5 pp1624ndash1632 2006

[2] A Mouchtaris K Karadimou and P Tsakalides ldquoMultires-olution SourceFilter Model for Low Bitrate Coding of SpotMicrophone Signalsrdquo EURASIP Journal on Audio Speech andMusic Processing vol 2008 pp 1ndash16 2008

[3] M Deriche and D Ning ldquoA novel audio coding schemeusing warped linear prediction model and the discrete wavelettransformrdquo IEEE Transactions on Audio Speech and LanguageProcessing vol 14 no 6 pp 2039ndash2048 2006

[4] N Ku C Yeh and S Hwang ldquoAn efficient algebraic codebooksearch for ACELP speech coderrdquo EURASIP Journal on AudioSpeech and Music Processing vol 2014 no 1 2014

[5] A V McCree and T P Barnwell ldquoA Mixed Excitation LPCVocoder Model for Low Bit Rate Speech Codingrdquo IEEE Trans-actions on Audio Speech and Language Processing vol 3 no 4pp 242ndash250 1995

[6] W B Kleijn ldquoEncoding Speech Using Prototype WaveformsrdquoIEEE Transactions on Audio Speech and Language Processingvol 1 no 4 pp 386ndash399 1993

[7] R JMcauly and T F Quatieri ldquoSpeech analysissynthesis basedon a sinusoidal representationrdquo IEEE Transactions on SignalProcessing vol 34 no 4 pp 744ndash754 1986

[8] D W Griffin and J S Lim ldquoMultiband Excitation VocoderrdquoIEEE Transactions on Signal Processing vol 36 no 8 pp 1223ndash1235 1988

[9] BS Jr Atal ldquoRemde A new model of LPC excitation for pro-ducing natural-sounding speech at low bit ratesrdquo in Proceedingsof the IEEE International Conference on Acoustics Speech andSignal Processing pp 614ndash617 Paris France 1982

[10] M Schroeder and B Atal ldquoCode-excited linear predic-tion(CELP) High-quality speech at very low bit ratesrdquo inProceedings of the IEEE International Conference on AcousticsSpeech and Signal Processing pp 937ndash940 Tampa FL USA

[11] S Singhal and B S Atal ldquoAmplitude Optimization and PitchPrediction in Multipulse Codersrdquo IEEE Transactions on SignalProcessing vol 37 no 3 pp 317ndash327 1989

[12] P Kroon E F Deprettere and R J Sluyter ldquoRegular-PulseExcitationmdashA Novel Approach to Effective and Efficient Mul-tipulse Coding of Speechrdquo IEEE Transactions on Signal Process-ing vol 34 no 5 pp 1054ndash1063 1986

[13] S-W Yoon H-G Kang Y-C Park and D-H Youn ldquoAnefficient transcoding algorithm for G7231 and G729A speechcoders Interoperability between mobile and IP networkrdquoSpeech Communication vol 43 no 1-2 pp 17ndash31 2004

[14] D SerreMatrices Theory andApplications vol 216 ofGraduateTexts in Mathematics Springer New York NY USA 2002

[15] Z Ma Y Cao and J Zang ldquoResearch on MPLPC excited-pulseabstract algorithmrdquo in Proceedings of the 2009 InternationalSymposium on Computational Intelligence and Design ISCID2009 pp 489ndash492 China December 2009

Hindawiwwwhindawicom Volume 2018

MathematicsJournal of


Mathematical Problems in Engineering

Applied MathematicsJournal of


Probability and StatisticsHindawiwwwhindawicom Volume 2018

Journal of


Mathematical PhysicsAdvances in

Complex AnalysisJournal of


OptimizationJournal of



Engineering Mathematics

International Journal of


Operations ResearchAdvances in

Journal of


Function SpacesAbstract and Applied AnalysisHindawiwwwhindawicom Volume 2018

International Journal of Mathematics and Mathematical Sciences


Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Hindawiwwwhindawicom Volume 2018Volume 2018

Numerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisNumerical AnalysisAdvances inAdvances in Discrete Dynamics in

Nature and SocietyHindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Dierential EquationsInternational Journal of

Volume 2018


Decision SciencesAdvances in


AnalysisInternational Journal of


Stochastic AnalysisInternational Journal of

Submit your manuscripts atwwwhindawicom


without searching through the analysis-by-synthesis proce-dure These pulses with arbitrarily assigned locations andoptimal amplitudes can produce speech approaching originalspeech in a least-square sense Based on the arbitrary-locationpulse determination algorithm a fixed-location pulse linearprediction coding (FLP-LPC) method is presented Thearbitrary-location pulse determination algorithm can syn-thesize speech with better quality within a shorter codingtime than the sequential method FLP-LPC not only yieldssynthesized speech of high quality and short coding time butcan also reduce the coding rate

2 MP-LPC

For a speech frame of length N M pulses are used as theexcitation signal denoted as 119901(119899) = sum119872119896=1 119892119896lowast120575(119899minus119899119896) where119892119896 and 119899119896 are the amplitude and location of the kth pulserespectively The key of MP-LPC is to determine 119892119896 and 119899119896to minimize the perceptual weighted error between originaland synthesized speech

The corresponding synthesized speech signal is

119904 (119899) = 1199040 (119899) + 119872sum119896=1

119892119896 lowast ℎ (119899 minus 119899119896) (1)

where 1199040(119899) is the zero-input response of the linear prediction(LP) filter that represents the effect of the previous frame andℎ(119899) is the unit impulse response of the LP filter

The error between 119904(119899) and 119904(119899) is119890119904 (119899) = 119904 (119899) minus 119904 (119899) = 119890119899 (119899) minus 119872sum

119896=1

119892119896 lowast ℎ (119899 minus 119899119896) (2)

where 119890119899(119899) = 119904(119899) minus 1199040(119899) is the equivalent speech withoutthe effect of the previous frame that is the excitation signalfiltered by the LP filter 119890119904(119899) filtered by the perceptuallyweighted filter is

119890 (119899) = [119890119899 (119899) minus 119872sum119896=1

119892119896 lowast ℎ (119899 minus 119899119896)] lowast 119908 (119899)

= 119890119908 (119899) minus 119872sum119896=1

119892119896 lowast ℎ119908 (119899 minus 119899119896) (3)

where ℎ119908(119899) is the unit impulse response of the perceptuallyweighted filter The perceptually weighted mean square error119864119898 is119864119898 = 119873sum

119899=1

1198902 (119899) = 119873sum119899=1

[119890119908 (119899) minus 119872sum119896=1

119892119896 lowast ℎ119908 (119899 minus 119899119896)]2

(4)

Themain idea of MP-LPC is to minimize 119864119898 by selectingthe appropriate 119892119896 and 119899119896 in the excitation signal By settingthe partial derivative of 119864119898 to zero M linear equations andM nonlinear equations can be obtained Solving these 2Mequations at a time is complicated Therefore in MP-LPC asequential method is used to determine the amplitude and

location of one pulse in each iteration AfterM iterations theamplitudes and locations ofM pulses can be determinedThe119899119895 of the jth pulse is the location to maximize the followingformula

1198772119890ℎ (119899119895)119877ℎℎ (119899119895 119899119895) (5)

Next 119892119895 is determined according to the 119899119895 determinedabove as follows

119892119895 = 119877119890ℎ (119899119895)119877ℎℎ (119899119895 119899119895) (6)

where

119877119890ℎ (119899119895) = 119873sum119899=1

119890119908 (119899) lowast ℎ119908 (119899 minus 119899119895) (7)

119877ℎℎ (119899119896 119899119895) = 119873sum119899=1

ℎ119908 (119899 minus 119899119896) lowast ℎ119908 (119899 minus 119899119895) (8)

3 Arbitrary-Location PulseDetermination Algorithm

From the above the main idea of MP-LPC is to determine 119892119896and 119899119896 for 119890(119899) = 119890119908(119899) minussum119872119896=1 119892119896 lowastℎ119908(119899 minus 119899119896) = 0 that is

119872sum119896=1

119892119896 lowast ℎ119908 (119899 minus 119899119896) = 119890119908 (119899) (9)

The aboveM equations can be written compactly as

119867120573 = 119864 (10)

where

119867(1198991 1198992 119899119872)

=[[[[[[[[[

ℎ119908 (1 minus 1198991) sdot sdot sdot ℎ119908 (1 minus 119899119872)sdot sdotsdot sdotsdot sdot

ℎ119908 (119873 minus 1198991) sdot sdot sdot ℎ119908 (119873 minus 119899119872)

]]]]]]]]]119873times119872

(11)

120573 = [1198921 1198922 119892119872]119879 (12)

119864 = [119890119908 (1) 119890119908 (2) 119890119908 (119873)]119879 (13)

It can be inferred that 119867 will remain unchanged once1198991 1198992 119899119872 are determined (even though assigned arbi-trarily) in the beginning of the search Therefore formula(10) can be regarded as a linear system If 119873 = 119872 and119867 is nonsingular then linear system (10) has a uniquesolution However in most cases 119873 gt 119872 and the system isoverdetermined and may not be consistent therefore it does



10038171003817100381710038171003817119867120573 minus 11986410038171003817100381710038171003817 = min120573

1003817100381710038171003817119867120573 minus 1198641003817100381710038171003817 (14)


120573 = 119867+119864 (15)







4 Results








0 50 100 150minus02

0

02

(a)0 50 100 150

minus02

0

02

(b)0 50 100 150

minus01

0

01

(c)0 50 100 150

minus02

0

02 SNR=171471

(d)

0 50 100 150minus01

0

01

(e)0 50 100 150

minus02

0

02 SNR=199547

(f)0 50 100 150

minus01

0

01

(g)0 50 100 150

minus02

0

02SNR=263215

(h)

0 50 100 150minus01

0

01

(i)0 50 100 150

minus02

0

02SNR=257581

(j)0 50 100 150

minus01

0

01

(k)0 50 100 150

minus02

0

02SNR=248960

(l)

0 50 100 150minus01

0

01

(m)0 50 100 150

minus02

0

02SNR=247275

(n)0 50 100 150

minus01

0

01

(o)0 50 100 150

minus02

0

02SNR=245586

(p)








5

10

15

20

25

30

35

40

SNR

Pulse number16 18 20 22 24 28 32

m=141550=44953

m=144212=43606

m=146707=42096

m=149287=41331

m=151452=40176

m=152482=37266

m=153649=35191

(a)

16 18 20 22 24 28 32Pulse number

times10minus3

0

1

2

3

Aver

age c

odin

g tim

e

(b)







FLP-LPC 25 3731G7231 53 3497G729 8 3765


5 Conclusion



0 50 100 150minus02

0

02

(a)0 50 100 150

minus02

0

02

(b)0 50 100 150

minus01

0

01

16pulses

(c)0 50 100 150

minus02

0

02 SNR=201730

(d)

0 50 100 150minus01

0

0118pulses

(e)0 50 100 150

minus02

0

02 SNR=203125

(f)0 50 100 150

minus01

0

0120pulses

(g)0 50 100 150

minus02

0

02 SNR=208017

(h)

0 50 100 150minus01

0

01 22pulses

(i)0 50 100 150

minus02

0

02 SNR=209049

(j)0 50 100 150

minus01

0

0124pulses

(k)0 50 100 150

minus02

0

02 SNR=206357

(l)

0 50 100 150minus01

0

0128pulses

(m)0 50 100 150

minus02

0

02 SNR=211795

(n)0 50 100 150

minus01

0

0132pulses

(o)0 50 100 150

minus02

0

02 SNR=217628

(p)


5 10 15 20 25 300

10

20

FLP-LPCMP-LPC

(a)

5 10 15 20 25 300

2

4

6

FLP-LPCMP-LPC

(b)







Data Availability




Acknowledgments


References























Journal of












Journal of







Volume 2018






Volume 2018










10038171003817100381710038171003817119867120573 minus 11986410038171003817100381710038171003817 = min120573

1003817100381710038171003817119867120573 minus 1198641003817100381710038171003817 (14)


120573 = 119867+119864 (15)







4 Results








0 50 100 150minus02

0

02

(a)0 50 100 150

minus02

0

02

(b)0 50 100 150

minus01

0

01

(c)0 50 100 150

minus02

0

02 SNR=171471

(d)

0 50 100 150minus01

0

01

(e)0 50 100 150

minus02

0

02 SNR=199547

(f)0 50 100 150

minus01

0

01

(g)0 50 100 150

minus02

0

02SNR=263215

(h)

0 50 100 150minus01

0

01

(i)0 50 100 150

minus02

0

02SNR=257581

(j)0 50 100 150

minus01

0

01

(k)0 50 100 150

minus02

0

02SNR=248960

(l)

0 50 100 150minus01

0

01

(m)0 50 100 150

minus02

0

02SNR=247275

(n)0 50 100 150

minus01

0

01

(o)0 50 100 150

minus02

0

02SNR=245586

(p)








5

10

15

20

25

30

35

40

SNR

Pulse number16 18 20 22 24 28 32

m=141550=44953

m=144212=43606

m=146707=42096

m=149287=41331

m=151452=40176

m=152482=37266

m=153649=35191

(a)

16 18 20 22 24 28 32Pulse number

times10minus3

0

1

2

3

Aver

age c

odin

g tim

e

(b)







FLP-LPC 25 3731G7231 53 3497G729 8 3765


5 Conclusion



0 50 100 150minus02

0

02

(a)0 50 100 150

minus02

0

02

(b)0 50 100 150

minus01

0

01

16pulses

(c)0 50 100 150

minus02

0

02 SNR=201730

(d)

0 50 100 150minus01

0

0118pulses

(e)0 50 100 150

minus02

0

02 SNR=203125

(f)0 50 100 150

minus01

0

0120pulses

(g)0 50 100 150

minus02

0

02 SNR=208017

(h)

0 50 100 150minus01

0

01 22pulses

(i)0 50 100 150

minus02

0

02 SNR=209049

(j)0 50 100 150

minus01

0

0124pulses

(k)0 50 100 150

minus02

0

02 SNR=206357

(l)

0 50 100 150minus01

0

0128pulses

(m)0 50 100 150

minus02

0

02 SNR=211795

(n)0 50 100 150

minus01

0

0132pulses

(o)0 50 100 150

minus02

0

02 SNR=217628

(p)


5 10 15 20 25 300

10

20

FLP-LPCMP-LPC

(a)

5 10 15 20 25 300

2

4

6

FLP-LPCMP-LPC

(b)







Data Availability




Acknowledgments


References























Journal of












Journal of







Volume 2018






Volume 2018









0 50 100 150minus02

0

02

(a)0 50 100 150

minus02

0

02

(b)0 50 100 150

minus01

0

01

(c)0 50 100 150

minus02

0

02 SNR=171471

(d)

0 50 100 150minus01

0

01

(e)0 50 100 150

minus02

0

02 SNR=199547

(f)0 50 100 150

minus01

0

01

(g)0 50 100 150

minus02

0

02SNR=263215

(h)

0 50 100 150minus01

0

01

(i)0 50 100 150

minus02

0

02SNR=257581

(j)0 50 100 150

minus01

0

01

(k)0 50 100 150

minus02

0

02SNR=248960

(l)

0 50 100 150minus01

0

01

(m)0 50 100 150

minus02

0

02SNR=247275

(n)0 50 100 150

minus01

0

01

(o)0 50 100 150

minus02

0

02SNR=245586

(p)








5

10

15

20

25

30

35

40

SNR

Pulse number16 18 20 22 24 28 32

m=141550=44953

m=144212=43606

m=146707=42096

m=149287=41331

m=151452=40176

m=152482=37266

m=153649=35191

(a)

16 18 20 22 24 28 32Pulse number

times10minus3

0

1

2

3

Aver

age c

odin

g tim

e

(b)







FLP-LPC 25 3731G7231 53 3497G729 8 3765


5 Conclusion



0 50 100 150minus02

0

02

(a)0 50 100 150

minus02

0

02

(b)0 50 100 150

minus01

0

01

16pulses

(c)0 50 100 150

minus02

0

02 SNR=201730

(d)

0 50 100 150minus01

0

0118pulses

(e)0 50 100 150

minus02

0

02 SNR=203125

(f)0 50 100 150

minus01

0

0120pulses

(g)0 50 100 150

minus02

0

02 SNR=208017

(h)

0 50 100 150minus01

0

01 22pulses

(i)0 50 100 150

minus02

0

02 SNR=209049

(j)0 50 100 150

minus01

0

0124pulses

(k)0 50 100 150

minus02

0

02 SNR=206357

(l)

0 50 100 150minus01

0

0128pulses

(m)0 50 100 150

minus02

0

02 SNR=211795

(n)0 50 100 150

minus01

0

0132pulses

(o)0 50 100 150

minus02

0

02 SNR=217628

(p)


5 10 15 20 25 300

10

20

FLP-LPCMP-LPC

(a)

5 10 15 20 25 300

2

4

6

FLP-LPCMP-LPC

(b)







Data Availability




Acknowledgments


References























Journal of












Journal of







Volume 2018






Volume 2018









5

10

15

20

25

30

35

40

SNR

Pulse number16 18 20 22 24 28 32

m=141550=44953

m=144212=43606

m=146707=42096

m=149287=41331

m=151452=40176

m=152482=37266

m=153649=35191

(a)

16 18 20 22 24 28 32Pulse number

times10minus3

0

1

2

3

Aver

age c

odin

g tim

e

(b)







FLP-LPC 25 3731G7231 53 3497G729 8 3765


5 Conclusion



0 50 100 150minus02

0

02

(a)0 50 100 150

minus02

0

02

(b)0 50 100 150

minus01

0

01

16pulses

(c)0 50 100 150

minus02

0

02 SNR=201730

(d)

0 50 100 150minus01

0

0118pulses

(e)0 50 100 150

minus02

0

02 SNR=203125

(f)0 50 100 150

minus01

0

0120pulses

(g)0 50 100 150

minus02

0

02 SNR=208017

(h)

0 50 100 150minus01

0

01 22pulses

(i)0 50 100 150

minus02

0

02 SNR=209049

(j)0 50 100 150

minus01

0

0124pulses

(k)0 50 100 150

minus02

0

02 SNR=206357

(l)

0 50 100 150minus01

0

0128pulses

(m)0 50 100 150

minus02

0

02 SNR=211795

(n)0 50 100 150

minus01

0

0132pulses

(o)0 50 100 150

minus02

0

02 SNR=217628

(p)


5 10 15 20 25 300

10

20

FLP-LPCMP-LPC

(a)

5 10 15 20 25 300

2

4

6

FLP-LPCMP-LPC

(b)







Data Availability




Acknowledgments


References























Journal of












Journal of







Volume 2018






Volume 2018









0 50 100 150minus02

0

02

(a)0 50 100 150

minus02

0

02

(b)0 50 100 150

minus01

0

01

16pulses

(c)0 50 100 150

minus02

0

02 SNR=201730

(d)

0 50 100 150minus01

0

0118pulses

(e)0 50 100 150

minus02

0

02 SNR=203125

(f)0 50 100 150

minus01

0

0120pulses

(g)0 50 100 150

minus02

0

02 SNR=208017

(h)

0 50 100 150minus01

0

01 22pulses

(i)0 50 100 150

minus02

0

02 SNR=209049

(j)0 50 100 150

minus01

0

0124pulses

(k)0 50 100 150

minus02

0

02 SNR=206357

(l)

0 50 100 150minus01

0

0128pulses

(m)0 50 100 150

minus02

0

02 SNR=211795

(n)0 50 100 150

minus01

0

0132pulses

(o)0 50 100 150

minus02

0

02 SNR=217628

(p)


5 10 15 20 25 300

10

20

FLP-LPCMP-LPC

(a)

5 10 15 20 25 300

2

4

6

FLP-LPCMP-LPC

(b)







Data Availability




Acknowledgments


References























Journal of












Journal of







Volume 2018






Volume 2018










Data Availability




Acknowledgments


References























Journal of












Journal of







Volume 2018






Volume 2018















Journal of












Journal of







Volume 2018






Volume 2018








Fixed-Location Pulse Linear Prediction Coding Vocoder Model...

Documents

Transcript of Fixed-Location Pulse Linear Prediction Coding Vocoder Model...