T.-H. Tsai and Y.-C. Yang
-
Low power and cost effective VLSI design for an MP3 audio decoder using an optimized synthesis-subband approach
T.-H. Tsai and Y.-C. Yang
Department of Electrical Engineering, National Central University, Taiwan ROC
IEE Proceedings on Computers and Digital Techniques
-
Abstract
An optimized approach to MPEG layer-3 (MP3) audio decoding is presented, with the main focus on the synthesis subband. Since the synthesis subband is the most power-consuming component in decoding, a cost-effective architecture is proposed based on system-design considerations. By means of a combined algorithm and architecture design, the synthesis subband achieves a high throughput with reduced memory requirements and hardware complexity. Its two-stage pipeline architecture allows 100% hardware utilization and is suitable for low-power implementation. In addition, a chip design in a 0.35 µm process has been completed. It occupies a die area of about 2.7 × 3.2 mm² with a transistor count of 157,469 and a low power dissipation of only 2.92 mW.
-
What's the problem
MPEG layer-3 (MP3) coding has been widely applied to current digital audio broadcasting and multimedia applications
A cost-effective and low-power implementation will largely reduce the hardware and computational complexity
From the MP3 decoder point of view, the computational load depends on the realization of the synthesis subband
-
Outline
Introduction of synthesis subband
Implementation considerations and analysis
Proposed method and architecture
Results and comparison
Conclusion
-
Introduction (1)
Elementary concept of MP3
Multirate subband-based coding techniques
The encoder performs analysis subband filtering with 32 equally spaced filterbanks, based on a psychoacoustical model
The decoder performs synthesis subband filtering
Most fast-algorithm techniques interpret synthesis subband filtering as a modified discrete cosine transform (MDCT) with some additional windowing operations
-
Introduction (2)
One popular method
Translate the DCT into an FFT kernel
Advantage
Because of the specific symmetric and recursive properties of the FFT equations, the number of multiplications and additions can be reduced
Disadvantage
These methods have complex control and irregular data flow, which introduce a high hardware cost
The proposed design
Reduced memory requirements and hardware complexity
High efficiency with 100% hardware utilization using a two-stage pipeline architecture
-
Introduction (3)
MP3 decoding flow
The hybrid filter bank is divided into the inverse modified discrete cosine transform with dynamic windowing and overlap (DWIMDCT) and the synthesis subband filterbank
Start
Get bit stream, find header
Decode Side information
Decode Scale factors
Decode Huffman data
Requantize Spectrum
Reorder Spectrum
Joint Stereo Processing(if necessary)
Alias reduction
IMDCT and windowing
Sub-band synthesis
Output PCM samples
-
Introduction (4)
Synthesis-subband decoding flow
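The synthesis-subband decoding flow named on this slide corresponds to the polyphase synthesis filter of the MPEG-1 audio standard: a 1024-sample FIFO, a 512-entry U vector, a 512-coefficient D window, and a final sum of 16 partial vectors. The following is a minimal Python sketch of that reference flow as commonly described, not the optimized design proposed in this talk. The class name `SynthesisSubband` is hypothetical, and `D_WINDOW` is a placeholder of all ones, since the real window is a 512-entry table from the standard that is not reproduced here.

```python
import math

# Matrixing coefficients N[i][k] = cos((16 + i)(2k + 1) * pi / 64), 64 rows x 32 columns.
N_MATRIX = [[math.cos((16 + i) * (2 * k + 1) * math.pi / 64) for k in range(32)]
            for i in range(64)]

# ASSUMPTION: placeholder window of all ones. A real decoder uses the
# 512 D-window coefficients tabulated in the MPEG-1 audio standard.
D_WINDOW = [1.0] * 512


class SynthesisSubband:
    """Reference-style synthesis filter for one channel (hypothetical helper)."""

    def __init__(self, window=D_WINDOW):
        self.v = [0.0] * 1024  # FIFO holding 16 blocks of 64 samples
        self.window = window

    def process(self, s):
        """Turn 32 subband samples into 32 PCM output samples."""
        assert len(s) == 32
        # 1. Matrixing: 64 new values, pushed into the front of the FIFO
        #    while the oldest 64 values fall off the end.
        new = [sum(row[k] * s[k] for k in range(32)) for row in N_MATRIX]
        self.v = new + self.v[:960]
        # 2. Build the 512-entry U vector from selected 32-sample slices of V.
        u = []
        for i in range(8):
            u.extend(self.v[i * 128: i * 128 + 32])
            u.extend(self.v[i * 128 + 96: i * 128 + 128])
        # 3. Apply the D window element by element.
        w = [u[i] * self.window[i] for i in range(512)]
        # 4. Sum the 16 partial vectors w0 ... w15 into 32 output samples.
        return [sum(w[j + 32 * i] for i in range(16)) for j in range(32)]
```

The direct matrixing step costs 64 × 32 multiply-accumulates per call, which is the kind of load that fast-transform reformulations aim to reduce.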
-
Implementation analysis
Design target
Delivering the required high performance at the minimum cost and the smallest silicon area
The performance is determined by real-time constraints
-
Implementation analysis (cont.)
MOPS = Fs × C × N
Fs: sample frequency
C: total number of numerical calculations per sample
N: number of audio channels
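As a rough illustration of the MOPS estimate, the sketch below evaluates the product of the three factors in Python. The function name `required_mops` and the example figures (44.1 kHz, 100 operations per sample, stereo) are hypothetical and do not come from the paper. Dividing by 10^6 to express the result in millions of operations per second is an assumed unit convention.

```python
def required_mops(sample_rate_hz, ops_per_sample, channels):
    """Estimate MOPS as Fs x C x N, scaled to millions of operations per second."""
    return sample_rate_hz * ops_per_sample * channels / 1e6


# Hypothetical figures for illustration only, not measured values.
mops = required_mops(44_100, 100, 2)
print(f"{mops:.2f} MOPS")
```

Any reduction in C, the per-sample operation count, lowers the required throughput in direct proportion.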
-
Implementation consideration
In the synthesis subband, the IMDCT can be broken into an FFT, a data shift, preprocessing and post-processing
Three considerations
In the initial transform, real-number computation is also translated into complex-number computation
Data shift, preprocessing and post-processing still contain complex multiplications
FFT algorithms always need many multipliers, and the recursive butterfly process leads to complex interconnection and routing
-
Proposed method
Normal IMDCT
Proposed IMDCT
Requires about [fraction missing in source] of the multiply-accumulate computations
Required size of the RAM buffer can be reduced to only 512 words per channel ([fraction missing in source] of the original)
-
Architecture
IMDCT
IPQMF
-
Architecture (cont.)
Pipeline architecture
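To make the 100% utilization claim for a two-stage pipeline concrete, here is a toy scheduling model in Python. It assumes, without confirmation from the slides, that stage 1 is the IMDCT, stage 2 is the IPQMF, and both take the same time per block. The function names are hypothetical and the model says nothing about the actual hardware timing.

```python
def pipeline_schedule(num_blocks):
    """Return, per time slot, the block index each stage works on (None = idle)."""
    slots = []
    for t in range(num_blocks + 1):
        stage1 = t if t < num_blocks else None  # assumed IMDCT stage, block t
        stage2 = t - 1 if t >= 1 else None      # assumed IPQMF stage, block t - 1
        slots.append((stage1, stage2))
    return slots


def utilization(slots):
    """Fraction of stage-slots that are busy."""
    busy = sum(stage is not None for slot in slots for stage in slot)
    return busy / (2 * len(slots))


schedule = pipeline_schedule(4)
steady_state = schedule[1:-1]  # drop the fill slot and the drain slot
print(utilization(steady_state))
```

In this model only the first and last slots leave a stage idle, so for a continuous audio stream both stages stay busy in every steady-state slot.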
-
Memory configuration (1)
-
Memory configuration (2)
Data conflicts in IMDCT and IPQMF
-
Memory configuration (3)
Memory data access with pipeline operation
-
Results and comparison (1)
-
Results and comparison (2)
-
Results and comparison (3)
-
Conclusion
By means of a novel algorithm and architecture, the synthesis subband achieves better performance
It also achieves a high throughput, with a low memory requirement and low hardware complexity
-
[Figure labels, synthesis-subband data flow: Sub-band samples (32 subbands x 18 samples) -> IMDCT (outputs 0 ... 63) -> 16 x 64-bit FIFO = 1024 samples (blocks 0 ... 15, indices 0 ... 1023) -> U vector (0 ... 511) x D window (0 ... 511) -> w0, w1, w2, ..., w15 (each 0 ... 31) -> output = Sum(w0 ~ w15)]