MP3 Optimization Exploiting Processor Architecture and Using Better Algorithms

Mancia Anguita

Universidad de Granada

J. Manuel Martinez – Lechado

Vitelcom Mobile Technology

Abstract

An application’s execution time depends on the processor architecture and clock frequency, the computational complexity of the algorithm, the choice of compiler and optimization options, and how well the programmer explicitly and implicitly exploits the processor architecture. This article quantifies the influence of these factors for an MP3 decoder through experimental results.

Outline

- What’s the problem?
- MP3 decoder overview
- MP3 decoder implementations
- Performance comparison
- Experiment results
- Conclusion

What’s the problem?

What factors can influence an application’s execution time?
- The executing processor’s architecture and clock frequency
- The computational complexity of the algorithm
- The compiler
- The programmer’s skill

But how much influence do these factors exert on overall performance?

MP3 decoder overview (1)

MP3 decoder overview (2)

Preprocessing
- Finds frames in the bitstream
- Extracts their compressed audio data and information
  - Huffman tables, scale factors

Requantization
- Reconstructs the original frequency line samples $xr_i$ from the quantized values $is_i$ by using the scale factors extracted from preprocessing
  - $xr_i = \operatorname{sign}(is_i)\,|is_i|^{4/3} \times 2^{C_j/4}$
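The requantization formula can be sketched directly in code. This is a minimal illustration, not the decoder from the slides: the function name `requantize` and the convention of passing the exponent `c_j` as a single precomputed number are assumptions made for the example.

```python
import math

def requantize(is_values, c_j):
    """Apply xr_i = sign(is_i) * |is_i|^(4/3) * 2^(c_j/4) to each value.

    is_values: quantized integer samples (hypothetical input layout).
    c_j: exponent for this group of samples, assumed already computed
         from the scale factors.
    """
    scale = 2.0 ** (c_j / 4.0)
    # copysign restores the sign that abs() removed.
    return [math.copysign(abs(v) ** (4.0 / 3.0), v) * scale for v in is_values]
```

The call to `**` with a fractional exponent is the expensive part of this stage, which is why it is a natural target for the table-based optimization described for the basic version.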

MP3 decoder overview (3)

Huffman decoding
- Huffman encoding is a lossless coding scheme
- The decoding process is based on several Huffman tables that map Huffman codes to symbols
  - 17 different tables in total
- The significant part of the processing
  - Handling the compressed audio bitstream
  - Searching the Huffman tables
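To make the table-search cost concrete, here is a minimal sketch of a bit-by-bit Huffman decoder. The table `TOY_TABLE` is a made-up prefix code for illustration only; it is not one of the tables from the MP3 standard, and the string-of-bits input is an assumption chosen for readability.

```python
# Hypothetical prefix code, NOT a table from the MP3 standard.
TOY_TABLE = {"0": "a", "10": "b", "110": "c", "111": "d"}

def huffman_decode(bits, table):
    """Decode a string of '0'/'1' characters one bit at a time."""
    symbols = []
    code = ""
    for b in bits:
        code += b
        # One table search per input bit: this is the slow part.
        if code in table:
            symbols.append(table[code])
            code = ""
    if code:
        raise ValueError("bitstream ends in the middle of a code word")
    return symbols
```

For example, `huffman_decode("010110111", TOY_TABLE)` walks the bits and emits one symbol each time the accumulated bits match a code word. Because a search happens for every bit read, long code words cost several searches each.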

MP3 decoder overview (4)

Reordering
- The encoder reorders short blocks to make the Huffman coding more efficient
- The decoder reverses this reordering

Stereo decoding
- Exploits redundancies between the different stereo channels
- When using single channel or dual channel, no stereo processing is necessary
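As one illustration of stereo decoding, the sketch below assumes mid/side (MS) stereo, one of the joint stereo modes available in MP3; the slides do not say which mode is meant, and intensity stereo is not shown. The function name `ms_stereo_decode` is hypothetical.

```python
import math

def ms_stereo_decode(mid, side):
    """Recover left/right from mid/side frequency lines (MS stereo assumed).

    left = (mid + side) / sqrt(2), right = (mid - side) / sqrt(2)
    """
    inv_sqrt2 = 1.0 / math.sqrt(2.0)  # multiply instead of dividing per sample
    left = [(m + s) * inv_sqrt2 for m, s in zip(mid, side)]
    right = [(m - s) * inv_sqrt2 for m, s in zip(mid, side)]
    return left, right
```

The loop body is the same arithmetic applied to every pair of samples, which is the kind of regular vector operation that lends itself to SIMD instructions.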

MP3 decoder overview (5)

Alias reduction
- In the encoder, it is necessary to negate the alias effects of the polyphase filter bank
- Consists of eight butterfly calculations for each pair of adjacent subbands
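The eight butterflies per subband boundary can be sketched as follows. This assumes a long block laid out as a flat list of 576 frequency lines (32 subbands of 18 lines) and uses the coefficients c_i given in the MPEG-1 Layer III specification; the function name `alias_reduce` and the list layout are assumptions for the example.

```python
import math

# Coefficients c_i from the MPEG-1 Layer III specification.
C = [-0.6, -0.535, -0.33, -0.185, -0.095, -0.041, -0.0142, -0.0037]
CS = [1.0 / math.sqrt(1.0 + c * c) for c in C]
CA = [c / math.sqrt(1.0 + c * c) for c in C]

def alias_reduce(xr):
    """Apply 8 butterflies across each of the 31 subband boundaries.

    xr: 576 frequency lines (assumed layout: subband sb occupies
        indices 18*sb .. 18*sb+17). Returns a new list.
    """
    out = list(xr)
    for sb in range(1, 32):
        for i in range(8):
            lo = 18 * sb - 1 - i  # line just below the boundary
            hi = 18 * sb + i      # line just above the boundary
            a, b = out[lo], out[hi]
            out[lo] = a * CS[i] - b * CA[i]
            out[hi] = b * CS[i] + a * CA[i]
    return out
```

Since CS[i]² + CA[i]² = 1, each butterfly is a plane rotation, so the stage preserves the total energy of the 576 lines.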

IMDCT

MP3 decoder overview (6)

Frequency inversion
- To compensate for frequency inversions, this stage negates every odd sample in all odd subbands
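A minimal sketch of this stage, assuming the IMDCT output is held as 32 lists of 18 time samples (the function name and data layout are assumptions for illustration):

```python
def frequency_inversion(subbands):
    """Negate every odd-indexed sample in every odd-indexed subband.

    subbands: list of 32 lists, each with 18 time samples (assumed layout).
    Returns a new nested list; the input is left unchanged.
    """
    return [
        [-v if (sb % 2 == 1 and i % 2 == 1) else v for i, v in enumerate(samples)]
        for sb, samples in enumerate(subbands)
    ]
```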

Synthesis polyphase filter bank

MP3 decoder implementations (1)

Standard version
- Implements MP3 following the documentation
- Uses only the tables specified in the standard

Basic version
- Improves on the standard version
- Replaces some instructions with others that take fewer clock cycles
  - e.g., replaces floating-point divisions with multiplications and some integer multiply instructions with shifts
- Replaces computationally intensive library functions with tables
- Library functions that use special processor instructions replace slower high-level programmer code
- Uses loop unrolling to improve some loops
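As a sketch of the "library function replaced by a table" idea, the example below precomputes |is|^(4/3) once so that requantization needs only a lookup and a multiply per sample. The names `POW43` and `requantize_table` are hypothetical, and the table size of 8207 entries is an assumption (the largest magnitude the Huffman stage is taken to produce is 8206); the slides do not say which functions were tabulated.

```python
# Assumed upper bound on |is_i| + 1; hypothetical, chosen for the example.
TABLE_SIZE = 8207

# Built once at start-up instead of calling pow() for every sample.
POW43 = [i ** (4.0 / 3.0) for i in range(TABLE_SIZE)]

def requantize_table(is_values, c_j):
    """xr_i = sign(is_i) * |is_i|^(4/3) * 2^(c_j/4), using the lookup table."""
    scale = 2.0 ** (c_j / 4.0)
    out = []
    for v in is_values:
        mag = POW43[abs(v)] * scale  # lookup + multiply, no pow() in the loop
        out.append(-mag if v < 0 else mag)
    return out
```

The same trade of memory for computation applies to the other substitutions on the slide: a division by a constant `d` becomes a multiplication by a precomputed `1.0 / d`, and an integer multiplication by a power of two such as `n * 8` becomes `n << 3`.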

MP3 decoder implementations (2)

SIMD version
- Improves on the basic version using SIMD extensions
- MP3 decoding is based on vector operations, so it can benefit from SIMD instructions
  - Requantization, stereo processing, IMDCT, and synthesis filter bank
- Uses SIMD to improve memory initializations and block transfers

MP3 decoder implementations (3)

Algorithm version
- Improves on the basic version with better algorithms
- Synthesis polyphase filter bank
  - Konstantinides’ method reduces the number of operations by transforming the matrixing operation into a 32-point DCT and some reordering operations
- IMDCT
  - Marovich’s method reduces the IMDCT to a fast DCT and some data-copying operations
- Huffman decoding
  - A tree-clustering algorithm can speed up the search process
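The idea behind the synthesis filter bank improvement can be checked numerically. The sketch below assumes the standard matrixing definition V[i] = Σ_k cos((16+i)(2k+1)π/64)·S[k] for i = 0..63 and shows that the same 64 outputs follow from a 32-point DCT plus sign changes and reordering. It is written in the spirit of the method named on the slide, not as a reproduction of it; the function names are hypothetical, and the DCT here is computed naively, whereas a real implementation would use a fast DCT.

```python
import math

def matrixing_direct(s):
    """Direct matrixing: 64 outputs, each a sum over 32 subband samples."""
    return [
        sum(math.cos((16 + i) * (2 * k + 1) * math.pi / 64) * s[k] for k in range(32))
        for i in range(64)
    ]

def dct32(s):
    """Unnormalized 32-point DCT-II (naive O(N^2) version for clarity)."""
    return [
        sum(s[k] * math.cos((2 * k + 1) * n * math.pi / 64) for k in range(32))
        for n in range(32)
    ]

def matrixing_via_dct(s):
    """Same 64 outputs from 32 DCT values plus reordering and negation."""
    x = dct32(s)
    v = [0.0] * 64            # v[16] stays 0: cos((2k+1)*pi/2) = 0
    for i in range(16):
        v[i] = x[16 + i]
    for i in range(17, 49):
        v[i] = -x[48 - i]     # cosine symmetry about n = 32
    for i in range(49, 64):
        v[i] = -x[i - 48]     # cosine shifted by a full pi
    return v
```

The direct form needs 64 × 32 multiply-adds per call, while the DCT form needs only the 32 DCT outputs; replacing the naive `dct32` with a fast DCT reduces the count further.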

MP3 decoder implementations (4)

Algorithm-SIMD version
- Based on the SIMD version, combined with a SIMD implementation of the improved algorithms
- Uses the IMDCT and synthesis algorithms and clustering Huffman decoding

Performance comparison (0)

Optimization operations

Performance comparison (1)

O2
- Includes classical optimizations that are processor independent
- Includes inline function expansion

G6
- This switch optimizes code for Pentium Pro, PII, and PIII, generating code that is compatible with earlier processors

G7
- This switch optimizes code for Pentium IV, generating code that is compatible with earlier processors

QxK
- Allows vectorization using the SSE and MMX instructions included in PIII and P4

Arch:SSE
- Uses SSE and cmov instructions

Performance comparison (2)

Test platform

Test MP3 file

Note: We measure processor clock cycles instead of time, so the results are independent of the processor clock frequency

Experiment results (1)

Experiment results (2)

Experiment results (3)

Conclusion

- Exploiting architecture features can be as important as choosing the right algorithms
- The programmer can exploit architecture features to a higher degree than the compiler
- The best optimization choice depends on the application

[Figure: synthesis polyphase filter bank data flow. Sub-band samples (32 subbands x 18 samples) feed a DCT that produces 64 values (0 to 63). These enter a 16 x 64 FIFO (= 1024 samples, indices 0 to 1023). A U vector of 512 values (0 to 511) is multiplied element by element with the D window, giving 16 vectors w0 to w15 of 32 samples each, which are summed, Sum(w0 ~ w15), to produce output samples 0 to 31.]