L27:Lower Power Algorithm for Multimedia Systems 1999. 8 성균관대학교 조 준 동 .
![Page 1: L27:Lower Power Algorithm for Multimedia Systems 1999. 8 성균관대학교 조 준 동 .](https://reader036.fdocuments.net/reader036/viewer/2022062515/56649d2b5503460f94a0136f/html5/thumbnails/1.jpg)
L27: Lower Power Algorithm for Multimedia Systems
1999. 8
Sungkyunkwan University, Jun-Dong Cho http://vada.skku.ac.kr
![Page 2: L27:Lower Power Algorithm for Multimedia Systems 1999. 8 성균관대학교 조 준 동 .](https://reader036.fdocuments.net/reader036/viewer/2022062515/56649d2b5503460f94a0136f/html5/thumbnails/2.jpg)
Contents
• Algorithmic Effects on Low Power
• Low Power Management
• Low Power Applications
– Low Power Video Processor
– Single Chip Video Camera
– Vector Quantization
– Data Encoding
– CDMA Searcher
– Viterbi Decoder
![Page 3: L27:Lower Power Algorithm for Multimedia Systems 1999. 8 성균관대학교 조 준 동 .](https://reader036.fdocuments.net/reader036/viewer/2022062515/56649d2b5503460f94a0136f/html5/thumbnails/3.jpg)
Low Power Algorithm
![Page 4: L27:Lower Power Algorithm for Multimedia Systems 1999. 8 성균관대학교 조 준 동 .](https://reader036.fdocuments.net/reader036/viewer/2022062515/56649d2b5503460f94a0136f/html5/thumbnails/4.jpg)
Algorithm Selection
• Example: 8x8 matrix DCT
![Page 5: L27:Lower Power Algorithm for Multimedia Systems 1999. 8 성균관대학교 조 준 동 .](https://reader036.fdocuments.net/reader036/viewer/2022062515/56649d2b5503460f94a0136f/html5/thumbnails/5.jpg)
Strength Reduction: DIGLOG multiplier

$$C_{mult}(n) = 253\,n^2, \qquad C_{add}(n) = 214\,n,$$

where $n$ = word length in bits.

$$A = 2^j + A_R, \qquad B = 2^k + B_R$$

$$A \cdot B = (2^j + A_R)(2^k + B_R) = 2^{j+k} + 2^j B_R + 2^k A_R + A_R B_R$$

| | 1st Iter | 2nd Iter | 3rd Iter |
|---|---|---|---|
| Worst-case error | -25% | -6% | -1.6% |
| Prob. of Error < 1% | 10% | 70% | 99.8% |

With an 8 by 8 multiplier, the exact result can be obtained in at most seven iteration steps (worst case).
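The iteration can be sketched in a few lines of Python. This is a minimal illustration of the decomposition A = 2^j + A_R, B = 2^k + B_R, not the hardware design from the slide: the function name `diglog_multiply` and the way passes are counted are assumptions made for the example. Each pass adds the three shift-only terms and leaves the residual product A_R·B_R for the next pass.

```python
def diglog_multiply(a: int, b: int, iterations: int) -> int:
    """Approximate a * b for non-negative integers using only shifts and adds.

    Each pass splits the operands as a = 2**j + a_r and b = 2**k + b_r,
    accumulates 2**(j+k) + 2**j * b_r + 2**k * a_r, and defers the
    residual product a_r * b_r to the next pass.
    """
    result = 0
    for _ in range(iterations):
        if a == 0 or b == 0:
            break  # residual product is zero, so the result is already exact
        j = a.bit_length() - 1  # position of the leading one of a
        k = b.bit_length() - 1  # position of the leading one of b
        a_r = a - (1 << j)
        b_r = b - (1 << k)
        result += (1 << (j + k)) + (b_r << j) + (a_r << k)
        a, b = a_r, b_r
    return result
```

Calling `diglog_multiply(a, b, 1)` gives the first approximation, which always undershoots because the non-negative term A_R·B_R is dropped. Raising `iterations` shrinks the error until the residual reaches zero and the product is exact.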
![Page 6: L27:Lower Power Algorithm for Multimedia Systems 1999. 8 성균관대학교 조 준 동 .](https://reader036.fdocuments.net/reader036/viewer/2022062515/56649d2b5503460f94a0136f/html5/thumbnails/6.jpg)
Logarithmic Number System

$$L_x = \log_2 |x|$$

$$L_{A \cdot B} = L_A + L_B, \qquad L_{A/B} = L_A - L_B$$

$$L_{A^2} = L_A \ll 1, \qquad L_{\sqrt{A}} = L_A \gg 1$$

--> Significant Strength Reduction
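A small Python sketch shows how these identities turn multiplication, division, squaring, and square root into an add, a subtract, and two one-bit shifts. The fixed-point width `FRAC_BITS` and all function names are assumptions chosen for illustration; sign handling and addition in the log domain are left out.

```python
import math

FRAC_BITS = 16  # assumed number of fractional bits in the stored logarithm


def lns_encode(x: float) -> int:
    """Return L_x = log2|x| as a fixed-point integer (x must be non-zero)."""
    return round(math.log2(abs(x)) * (1 << FRAC_BITS))


def lns_decode(l: int) -> float:
    """Return the magnitude |x| = 2 ** L_x."""
    return 2.0 ** (l / (1 << FRAC_BITS))


def lns_mul(la: int, lb: int) -> int:
    return la + lb  # multiplication becomes an addition


def lns_div(la: int, lb: int) -> int:
    return la - lb  # division becomes a subtraction


def lns_square(la: int) -> int:
    return la << 1  # squaring doubles the logarithm


def lns_sqrt(la: int) -> int:
    return la >> 1  # square root halves the logarithm
```

For example, `lns_decode(lns_mul(lns_encode(6), lns_encode(7)))` returns a value close to 42. The small deviation comes from rounding the logarithms to `FRAC_BITS` fractional bits.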
![Page 7: L27:Lower Power Algorithm for Multimedia Systems 1999. 8 성균관대학교 조 준 동 .](https://reader036.fdocuments.net/reader036/viewer/2022062515/56649d2b5503460f94a0136f/html5/thumbnails/7.jpg)
Switching Activity Reduction
(a) Average activity in a multiplier as a function of the constant value
(b) Parallel and serial implementations of an adder tree.
![Page 8: L27:Lower Power Algorithm for Multimedia Systems 1999. 8 성균관대학교 조 준 동 .](https://reader036.fdocuments.net/reader036/viewer/2022062515/56649d2b5503460f94a0136f/html5/thumbnails/8.jpg)
System-Level Solutions
• System management, System partitioning, Algorithm selection
• Precompute physical capacitance of Interconnect and switching activity (number of bus accesses)
• Regularity: to minimize the power in the control hardware and the interconnection network.
• Modularity: to exploit data locality through distributed processing units, memories and control.
– Spatial locality: an algorithm can be partitioned into natural clusters based on connectivity
– Temporal locality: short average lifetimes of variables (less temporary storage; a high probability that future accesses go to data referenced in the recent past).
• Few memory references: since references to memories are expensive in terms of power.
![Page 9: L27:Lower Power Algorithm for Multimedia Systems 1999. 8 성균관대학교 조 준 동 .](https://reader036.fdocuments.net/reader036/viewer/2022062515/56649d2b5503460f94a0136f/html5/thumbnails/9.jpg)
System-Level Solutions - cont.
• Simulator: Instruction-level Energy Estimation
• Software: Energy Efficient Algorithms
• OS: Voltage Scheduling Algorithms
• OS: Multiprocessing for Energy
• Microprocessor: Dynamic Caches
![Page 10: L27:Lower Power Algorithm for Multimedia Systems 1999. 8 성균관대학교 조 준 동 .](https://reader036.fdocuments.net/reader036/viewer/2022062515/56649d2b5503460f94a0136f/html5/thumbnails/10.jpg)
Processor Systems: High Power
• Thinkpad (Pentium) 0.3 Hours/AA
• InfoPad (ARM) 0.8 Hours/AA
• Toshiba Portable (486) 0.9 Hours/AA
• Newton (ARM) 2.0 Hours/AA
Operations per Battery Life: Minimize Energy Consumed per Operation
Operations per Second: Maximize Throughput (Operations/second)
![Page 11: L27:Lower Power Algorithm for Multimedia Systems 1999. 8 성균관대학교 조 준 동 .](https://reader036.fdocuments.net/reader036/viewer/2022062515/56649d2b5503460f94a0136f/html5/thumbnails/11.jpg)
DPM vs SPM
• DPM (Dynamic Power Management): stops the clock that the clock generators supply to a specific unit, so that unit no longer switches.
• SPM (Static Power Management): when the system remains idle for a significant period of time, it is shut down.
Identify power hungry modules and look for opportunities to reduce power
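The SPM rule can be illustrated as a simple timeout policy. This is a sketch under stated assumptions, not the policy of any particular system: the function name `spm_energy`, the idea of a fixed wake-up energy, and every numeric value a caller passes in are hypothetical.

```python
def spm_energy(idle_periods, timeout, p_idle, p_off, e_wakeup):
    """Energy in joules spent over a list of idle periods (seconds).

    The system idles at p_idle watts until `timeout` seconds have passed,
    then shuts down to p_off watts and pays e_wakeup joules to restart.
    """
    total = 0.0
    for t in idle_periods:
        if t <= timeout:
            total += p_idle * t  # too short to trigger a shutdown
        else:
            total += p_idle * timeout + p_off * (t - timeout) + e_wakeup
    return total
```

Passing `float("inf")` as the timeout models a system that never shuts down. Comparing that figure with a finite timeout shows whether the idle periods are long enough to repay the wake-up cost.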
![Page 12: L27:Lower Power Algorithm for Multimedia Systems 1999. 8 성균관대학교 조 준 동 .](https://reader036.fdocuments.net/reader036/viewer/2022062515/56649d2b5503460f94a0136f/html5/thumbnails/12.jpg)
Vdd vs Delay
• Use Variable Voltage Scaling or Scheduling for Real-time Processing
• Use architecture optimization to compensate for slower operation, e.g., Parallel Processing and Pipelining to increase concurrency and shorten the critical path.
• Scale down device sizes to compensate for delay (Interconnects do not scale proportionately and can become dominant)
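The trade-off between supply voltage and parallelism can be made concrete with a first-order CMOS model. The model is an assumption brought in for illustration and does not come from the slides: gate delay is taken as proportional to Vdd / (Vdd - Vt)^2 and dynamic power as C · Vdd^2 · f. The 5 V reference supply and the 0.7 V threshold are likewise hypothetical.

```python
def gate_delay(vdd, vt=0.7):
    """Relative gate delay under the assumed model Vdd / (Vdd - Vt)**2."""
    return vdd / (vdd - vt) ** 2


def vdd_for_slowdown(slowdown, vdd_ref=5.0, vt=0.7):
    """Lowest supply whose delay is `slowdown` times the delay at vdd_ref."""
    target = slowdown * gate_delay(vdd_ref, vt)
    lo, hi = vt + 1e-6, vdd_ref  # delay falls monotonically as Vdd rises here
    for _ in range(100):  # bisection on the supply voltage
        mid = (lo + hi) / 2
        if gate_delay(mid, vt) > target:
            lo = mid  # still too slow, raise the supply
        else:
            hi = mid
    return hi


def relative_power(cap_ratio, vdd, freq_ratio, vdd_ref=5.0):
    """Dynamic power C * Vdd**2 * f, normalised to the reference design."""
    return cap_ratio * (vdd / vdd_ref) ** 2 * freq_ratio
```

Two parallel datapaths, each clocked at half rate, can tolerate twice the gate delay. Under this model `vdd_for_slowdown(2.0)` comes out at roughly 3.1 V, and `relative_power(2.0, vdd_for_slowdown(2.0), 0.5)` is a little under 40% of the reference power, even with the switched capacitance doubled. Real designs add routing and multiplexing overhead that this sketch ignores.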
![Page 13: L27:Lower Power Algorithm for Multimedia Systems 1999. 8 성균관대학교 조 준 동 .](https://reader036.fdocuments.net/reader036/viewer/2022062515/56649d2b5503460f94a0136f/html5/thumbnails/13.jpg)
Power PC 603 Strategy
• Baseline: supply the right voltage and the right frequency to each part of the system. If the system has to wait for some input, only a small circuit needs to stay awake; it wakes up the main circuit when the input occurs.
• PowerPC 603 is a 2-issue processor (2 instructions read at a time) with 5 parallel execution units. 4 modes:
– Full on mode for full speed
– Doze mode in which the execution units are not running
– Nap mode which also stops the bus clocking
– Sleep mode which stops the clock generator with or without the PLL (20-100mW).
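The four modes can be written down as a small lookup table. This is only a sketch of the list above: the names `Mode`, `STOPPED`, and `deepest_mode` are hypothetical, and treating each deeper mode as stopping everything the shallower modes stop is an assumption read from the word "also" in the Nap entry.

```python
from enum import Enum


class Mode(Enum):
    FULL_ON = 0
    DOZE = 1
    NAP = 2
    SLEEP = 3


# Blocks stopped in each mode, as listed on the slide.
# Assumption: deeper modes are cumulative.
STOPPED = {
    Mode.FULL_ON: set(),
    Mode.DOZE: {"execution units"},
    Mode.NAP: {"execution units", "bus clock"},
    Mode.SLEEP: {"execution units", "bus clock", "clock generator"},
}


def deepest_mode(must_stay_running):
    """Return the deepest mode that keeps every listed block running."""
    needed = set(must_stay_running)
    allowed = [m for m in Mode if not (STOPPED[m] & needed)]
    return max(allowed, key=lambda m: m.value)
```

For instance, `deepest_mode({"bus clock"})` returns `Mode.DOZE`, because Nap and Sleep would both stop the bus clock.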
![Page 14: L27:Lower Power Algorithm for Multimedia Systems 1999. 8 성균관대학교 조 준 동 .](https://reader036.fdocuments.net/reader036/viewer/2022062515/56649d2b5503460f94a0136f/html5/thumbnails/14.jpg)
Power PC 603 Power Management
![Page 15: L27:Lower Power Algorithm for Multimedia Systems 1999. 8 성균관대학교 조 준 동 .](https://reader036.fdocuments.net/reader036/viewer/2022062515/56649d2b5503460f94a0136f/html5/thumbnails/15.jpg)
TI Structures
• Two DSPs: TMS320C541 and TMS320C542 reduce power, chip count, and system cost for wireless communication applications
• C54X DSPs, 2.7V, 5V, Low-Power Enhanced Architecture DSP (LEAD) family: with three different power-down modes, these devices are well suited to wireless communications products such as digital cellular phones, personal digital assistants, and wireless modems; low power on voice coding and decoding
• The TMS320LC548 features:
– 15-ns (66 MIPS) or 20-ns (50 MIPS) instruction cycle times
– 3.0- and 3.3-V operation
– 32K 16-bit words of RAM and 2K 16-bit words of boot ROM on-chip
– Integrated Viterbi accelerator that reduces Viterbi butterfly update to four instruction cycles for GSM channel decoding
– Powerful single-cycle instructions (dual operand, parallel instructions, conditional instructions)
![Page 16: L27:Lower Power Algorithm for Multimedia Systems 1999. 8 성균관대학교 조 준 동 .](https://reader036.fdocuments.net/reader036/viewer/2022062515/56649d2b5503460f94a0136f/html5/thumbnails/16.jpg)
InfoPad Architecture, UC-Berkeley
[Figure labels: Speech Recognizer, "PadServer", Wireless Basestation, InfoPad, Web Browser, Internet]
• Maintain state in the network, not on the Pad
• Transmit audio and raw bitmaps across the wireless link
• Example: hand-held speech-enabled web browser
• Perform all computation in the network to minimize client energy dissipation
![Page 17: L27:Lower Power Algorithm for Multimedia Systems 1999. 8 성균관대학교 조 준 동 .](https://reader036.fdocuments.net/reader036/viewer/2022062515/56649d2b5503460f94a0136f/html5/thumbnails/17.jpg)
InfoPad Hardware Flexibility
[Figure labels: Radio, RX Packet, Packet Header, 10 MIPS μProcessor (Control, Statistics, Reliability, Debugging), Frame-buffer update, Frame Buffer]
• Only header sent to microprocessor
• Entire packet routed to dedicated hardware
• Embedded software responsible for high-level functions
• Main data-flow handled by custom low-power ASICs
• Use hardware/software integration to provide energy-efficient high-level functionality
![Page 18: L27:Lower Power Algorithm for Multimedia Systems 1999. 8 성균관대학교 조 준 동 .](https://reader036.fdocuments.net/reader036/viewer/2022062515/56649d2b5503460f94a0136f/html5/thumbnails/18.jpg)
Multimedia I/O Terminal.
![Page 19: L27:Lower Power Algorithm for Multimedia Systems 1999. 8 성균관대학교 조 준 동 .](https://reader036.fdocuments.net/reader036/viewer/2022062515/56649d2b5503460f94a0136f/html5/thumbnails/19.jpg)
Multimedia I/O terminal
![Page 20: L27:Lower Power Algorithm for Multimedia Systems 1999. 8 성균관대학교 조 준 동 .](https://reader036.fdocuments.net/reader036/viewer/2022062515/56649d2b5503460f94a0136f/html5/thumbnails/20.jpg)
InfoPad Evolution
Total Power: ~7 W
Where did the power go?
• No local computation?
• Commercial radios
• Commercial DC/DC
• Inefficient implementation
[Figure labels: Intercom, Energy-Efficient Processors, InfoPad]
• High-level system design optimizes complete solution and drives new research
![Page 21: L27:Lower Power Algorithm for Multimedia Systems 1999. 8 성균관대학교 조 준 동 .](https://reader036.fdocuments.net/reader036/viewer/2022062515/56649d2b5503460f94a0136f/html5/thumbnails/21.jpg)
Power-Down Techniques
![Page 22: L27:Lower Power Algorithm for Multimedia Systems 1999. 8 성균관대학교 조 준 동 .](https://reader036.fdocuments.net/reader036/viewer/2022062515/56649d2b5503460f94a0136f/html5/thumbnails/22.jpg)
Low Power Memory