Audio Coding: MPEG1 Layers I, II, III, MPEG2, MPEG4
Sherida Subrati, Anthony Caliendo
Date posted: 21-Dec-2015
Overview
• Explanation of Codecs
  • MPEG1 – Layer I, II, III (Differences)
  • MPEG2 – Basic Overview
  • MPEG4 – Possible Applications
• Applications & Sound Samples Used
• Results & Explanation
  • File Size, Bitrate, & Quality
  • Waveform Comparison
• Summary & Questions
Sub-Band Coding Overview
• Size of sub-bands varies
• Varying application of psychoacoustic model
MPEG1 – Layer I & II
• Time Frequency Mapping
  • Polyphase Filter Bank
  • 32 Equal Bands
• Psychoacoustic Model
  • 512-point FFT (Layer I) & 1024-point FFT (Layer II)
  • Tonal & Noise Masking
• Quantizer
  • Scale Factor: 6 bits
  • Layer II – Allows 3 successive scale factors & uses 1-3 depending on how much they differ
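
The 6-bit scale factor can be illustrated with a small sketch. It assumes the scale factor table usually described for Layers I and II, in which index i maps to 2 · 2^(-i/3) (roughly 2 dB per step, with 63 of the 64 codes in use); the slides do not give the table, and the function names here are hypothetical.

```python
# Sketch of 6-bit scale factor selection for one block of sub-band samples.
# Assumption (not stated on the slides): index i maps to 2 * 2**(-i / 3),
# i.e. about 2 dB per step, with indices 0..62 in use.

NUM_SCALE_FACTORS = 63  # a 6-bit index allows 64 codes; 63 assumed used


def scale_factor(index: int) -> float:
    """Return the scale factor value for a 6-bit index."""
    if not 0 <= index < NUM_SCALE_FACTORS:
        raise ValueError("index out of range")
    return 2.0 * 2.0 ** (-index / 3.0)


def choose_scale_factor_index(samples: list[float]) -> int:
    """Pick the largest index (smallest scale factor) that still covers the peak.

    Samples are assumed to be normalized to the range -1.0..1.0.
    """
    peak = max(abs(s) for s in samples)
    best = 0
    for index in range(NUM_SCALE_FACTORS):
        if scale_factor(index) >= peak:
            best = index  # still large enough, keep searching for a tighter fit
        else:
            break  # the table is decreasing, so no later entry can cover the peak
    return best


if __name__ == "__main__":
    block = [0.5, -0.9, 0.1]
    idx = choose_scale_factor_index(block)
    print(idx, scale_factor(idx))
```

In Layer II, three such indices would be computed for successive blocks of a sub-band, and one, two, or three of them transmitted depending on how much they differ.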
MPEG1 – Layer I & II Diagram
Images from Peter Noll, MPEG Digital Audio Coding Standards
MPEG1 – Layer III
• Time Frequency Mapping
  • Switched Hybrid Filter Bank
  • 32 sub-bands further sub-divided using a 6- or 18-point modified DCT (MDCT)
• Psychoacoustic Model
  • Variable FFT
  • Tonal & Noise Masking
• Quantizer
  • Non-uniform Scale Factors
  • Huffman Coding, Bit Reservoir, & Iterative Analysis
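
To see what the extra sub-division buys, the nominal frequency resolution of the two filter banks can be compared. This is a rough sketch that treats every band as an equal slice of 0 to fs/2 and assumes the 44.1 kHz sampling rate of the test samples; the helper name is hypothetical.

```python
# Nominal frequency resolution of the filter banks at a given sampling rate.
# Assumption: each band or frequency line is an equal-width slice of 0..fs/2.

SUBBANDS = 32          # polyphase filter bank, all layers
LINES_LONG = 18        # lines per sub-band with the 18-point transform
LINES_SHORT = 6        # lines per sub-band with the 6-point transform


def line_spacing_hz(sample_rate_hz: float, num_lines: int) -> float:
    """Width of each band when 0..fs/2 is split into num_lines equal parts."""
    return (sample_rate_hz / 2.0) / num_lines


if __name__ == "__main__":
    fs = 44_100.0
    print("Layer I/II :", line_spacing_hz(fs, SUBBANDS), "Hz per sub-band")
    print("Layer III long :", line_spacing_hz(fs, SUBBANDS * LINES_LONG), "Hz per line")
    print("Layer III short:", line_spacing_hz(fs, SUBBANDS * LINES_SHORT), "Hz per line")
```

The long transform gives finer frequency resolution for stationary signals, while the short one trades frequency resolution for better time resolution, which is why the filter bank is switched.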
MPEG1 – Layer III Diagram
Images from Peter Noll, MPEG Digital Audio Coding Standards
MPEG2 – General Overview
• 5.1 Channel Support
• Advanced Audio Coding (AAC)
  • Optional Preprocessing
  • Bit-stream Formatter
  • Prediction – helps to optimize quantizer
  • Noiseless Coding
  • 3 Profiles
    • Main – Variable length DCT, noiseless coding, etc.
    • Low Complexity – No temporal noise shaping & time domain prediction
    • Sampling Rate Scalability – preprocessor allows for sampling rates of 6, 12, 18, & 24 kHz
MPEG4 – General Overview
• Consists of all previous MPEG iterations
• Uses 3 Core Coders
  • Parametric coding for low bit rate speech
  • Analysis-by-synthesis for medium bit rates
  • Sub-band/Transform coding for high bit rates
• Low Delay (LD) Encoding / Decoding
• Quality Scalability
Applications & Sound Samples
• Applications
  • AVI2MP.EXE
  • LAMEwin32
  • Nero MPEG4 AAC
  • Goldwave
• Hardware
  • Pentium III – 1.0 GHz
  • 512 MB RAM
  • Win2K SP3
• Sound Samples
  • PCM 16-bit Stereo 44.1 kHz
    • Clubbed to Death (Kurayamino Mix) – Rob D
    • Man Who Sold The World – Nirvana
  • PCM 8-bit Mono 44.1 kHz
    • Voice Sample
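
For reference, the uncompressed data rate of the two source formats follows directly from sampling rate, bit depth, and channel count. The sketch below works that out and compares it with a coded bitrate; the function names and the 128 kbps and 32 kbps comparison points are illustrative choices, not results from the tests.

```python
# Uncompressed PCM data rate and the compression ratio at a given coded bitrate.

def pcm_bitrate_kbps(sample_rate_hz: int, bits_per_sample: int, channels: int) -> float:
    """Raw PCM bitrate in kbps (1 kbps = 1000 bits per second)."""
    return sample_rate_hz * bits_per_sample * channels / 1000.0


def compression_ratio(source_kbps: float, coded_kbps: float) -> float:
    """How many times smaller the coded stream is than the PCM source."""
    return source_kbps / coded_kbps


if __name__ == "__main__":
    music = pcm_bitrate_kbps(44_100, 16, 2)   # 16-bit stereo music samples
    voice = pcm_bitrate_kbps(44_100, 8, 1)    # 8-bit mono voice sample
    print(f"music source: {music:.1f} kbps, ratio at 128 kbps: {compression_ratio(music, 128):.2f}")
    print(f"voice source: {voice:.1f} kbps, ratio at 32 kbps: {compression_ratio(voice, 32):.2f}")
```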
Results – File Size vs Bitrate
[Chart: Sample 2, File Size vs Bitrate. X-axis: Bitrate (kbps), 32 to 192. Y-axis: file size, 0 to 7,000,000.]
[Chart: Sample 3, File Size vs Bitrate. X-axis: Bitrate (kbps), 16 to 96. Y-axis: file size, 0 to 80,000.]
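
For a constant-bitrate encode, file size should grow roughly linearly with bitrate, since size is bitrate times duration. The sketch below shows that relationship; the 60-second duration is a hypothetical value (the slides do not state the sample lengths), and container headers or tags are ignored.

```python
# Expected size of a constant-bitrate file, ignoring headers and tags.

def expected_file_size_bytes(bitrate_kbps: float, duration_s: float) -> float:
    """Size in bytes: kbps -> bits per second, times seconds, divided by 8."""
    return bitrate_kbps * 1000 * duration_s / 8


if __name__ == "__main__":
    duration_s = 60.0  # hypothetical clip length, not taken from the slides
    for kbps in (32, 64, 128, 192):
        size = expected_file_size_bytes(kbps, duration_s)
        print(f"{kbps:>3} kbps -> {size:,.0f} bytes")
```

Doubling the bitrate doubles the expected size, so a measured curve that departs from a straight line would point to variable-bitrate encoding or fixed per-file overhead.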
Results – Encode Time vs Bitrate
[Chart: Sample 1, Encode Time vs Bitrate. X-axis: Bitrate (kbps), 32 to 192. Y-axis: encode time, 0 to 160.]
[Chart: Sample 2, Encode Time vs Bitrate. X-axis: Bitrate (kbps), 32 to 192. Y-axis: encode time, 0 to 120.]
Results – Quality vs Bitrate
[Chart: Sample 2, Quality vs Bitrate. X-axis: Bitrate (kbps), 32 to 192. Y-axis: quality, 0 to 6.]
[Chart: Sample 3, Quality vs Bitrate. X-axis: Bitrate (kbps), 16 to 96. Y-axis: quality, 0 to 6.]
Sample Sounds
• Music Sample
  • Original Sound
  • Sample 2 Play list
    • S2-M4LT-064S
    • S2-M4LT-080S
    • S2-M4LT-096S
• Voice Sample
  • Original Sound
  • Sample 3 Play list
    • S3-M4LT-016M
    • S3-M4LT-024M
    • S3-M4LT-032M
Summary
• MPEG1 – Layers I & II have limited options & are not efficient in terms of size versus quality
• MPEG1 – Layer III offers excellent quality at low bitrates but has large overhead
• MPEG2 – Much more comprehensive
• MPEG4 – Encompasses all previous iterations & has new capabilities to increase its lifespan