Architectural-level Design Exploration for Power Aware System
![Page 1: Architectural-level Design Exploration for Power Aware System](https://reader036.fdocuments.net/reader036/viewer/2022081603/56814647550346895db35627/html5/thumbnails/1.jpg)
Architectural-level Design Exploration for Power Aware System
Dexin Li
October 2000
![Page 2: Architectural-level Design Exploration for Power Aware System](https://reader036.fdocuments.net/reader036/viewer/2022081603/56814647550346895db35627/html5/thumbnails/2.jpg)
Background
• Component-level low-power design alone cannot meet system-level design goals
• A system needs not only low-power designs but also power-aware features
![Page 3: Architectural-level Design Exploration for Power Aware System](https://reader036.fdocuments.net/reader036/viewer/2022081603/56814647550346895db35627/html5/thumbnails/3.jpg)
Motivation
• System architecture is important for power-aware system designs
  – Our micro-rover example shows that the bus and bus interfaces consume 25.65% of total system power
• As more low-power design techniques and low-power components are adopted, architectural optimization becomes more important.
![Page 4: Architectural-level Design Exploration for Power Aware System](https://reader036.fdocuments.net/reader036/viewer/2022081603/56814647550346895db35627/html5/thumbnails/4.jpg)
Application Example
• Microrover: a robot exploring Mars
  – Solar power: 15 W at noon
  – Electronics system:
    • Processor, microcontroller
    • Camera
    • Radio-frequency modem
    • Non-volatile memory/hard drive
    • Scientific equipment: APXS and ASI/MET
    • Bus drivers
  – System tasks:
    • Steering and driving
    • Capturing pictures and sending compressed data
    • Performing scientific experiments, storing data on media, and sending data
![Page 5: Architectural-level Design Exploration for Power Aware System](https://reader036.fdocuments.net/reader036/viewer/2022081603/56814647550346895db35627/html5/thumbnails/5.jpg)
Previous Work
• Many low-power design techniques
  – Voltage scaling, frequency scaling, clock gating
  – Bus encoding, bus segmentation
  – Algorithm transformation, imprecise arithmetic
• Other power-aware methodologies
  – PACT: on-demand control of power consumption and performance
  – µAMPS: adaptive energy-aware distributed microsensors
![Page 6: Architectural-level Design Exploration for Power Aware System](https://reader036.fdocuments.net/reader036/viewer/2022081603/56814647550346895db35627/html5/thumbnails/6.jpg)
IMPACCT methodology
• A framework to enable power-aware design
  – Behavioral-level optimization
    • Scheduling, partitioning, migration
  – Architectural-level design exploration
    • Constraint-driven design space exploration
    • Meets power and performance constraints
• Different views of system behavior lead to different solutions
  – Static: system behaviors are known prior to architecture exploration
  – Hybrid (mixed): solutions are prepared for a few scenarios, and one is picked at run time
  – Dynamic: system behaviors are determined and the design space is explored, both at run time
![Page 7: Architectural-level Design Exploration for Power Aware System](https://reader036.fdocuments.net/reader036/viewer/2022081603/56814647550346895db35627/html5/thumbnails/7.jpg)
Assumptions for the problem
• Use COTS components to construct the system
• Communication
  – Un-directional
  – Components’ stand-alone time is absorbed into communication time (at coarse granularity)
• Static view of the system behavior
![Page 8: Architectural-level Design Exploration for Power Aware System](https://reader036.fdocuments.net/reader036/viewer/2022081603/56814647550346895db35627/html5/thumbnails/8.jpg)
Problem statement
• Design a tool (algorithm) that produces an architectural topology and a power management scheme that satisfy system-level power, workload, and schedule constraints.
• Input:
– Component property
– Workload graph
– Behavioral schedule
– System constraints
• Output:
– A feasible architecture
– power management scheme
![Page 9: Architectural-level Design Exploration for Power Aware System](https://reader036.fdocuments.net/reader036/viewer/2022081603/56814647550346895db35627/html5/thumbnails/9.jpg)
Component property
• Component name
• Power modes
• Communication bandwidth
• Mapping table
  – Performance / power
  – Clock frequency / power
  – Supply voltage / power
• Bus interface
  – Maximum fanout
  – Root node eligibility (whether the component can be the root node)
![Page 10: Architectural-level Design Exploration for Power Aware System](https://reader036.fdocuments.net/reader036/viewer/2022081603/56814647550346895db35627/html5/thumbnails/10.jpg)
Workload graph
• A representation of communication
  – Vertices: components
  – Edges: workload (data transfer rate)
  – Weight: required communication bandwidth
[Figure: workload graph connecting rf, sc1, hd, cpu2, cpu1, mc1, and cam, with each edge labeled by its required bandwidth]
![Page 11: Architectural-level Design Exploration for Power Aware System](https://reader036.fdocuments.net/reader036/viewer/2022081603/56814647550346895db35627/html5/thumbnails/11.jpg)
Behavioral schedule
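• Mission-level schedules
• From behavioral scheduling or system specification
• Communication-active and stand-alone-active
  – Granularity related
  – Here they are assumed to be the same
[Figure: schedule chart with one row per component (CPU1, MC1, MC2, CPU2, CAM, RF, HD, SC1, SC2) over a time axis in minutes, marked at 10, 20, 30, 40, and 50]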
![Page 12: Architectural-level Design Exploration for Power Aware System](https://reader036.fdocuments.net/reader036/viewer/2022081603/56814647550346895db35627/html5/thumbnails/12.jpg)
System constraints
• Power
  – Maximum power, constant
  – Maximum power, function of time
  – Power range, constant or function of time
• Protocol
  – Topology: e.g., tree for the 1394 bus
  – Communication bandwidth
    • 100, 200, or 400 Mbps for different 1394 bus components
    • Up to 80% of bandwidth for isochronous transfers
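The three forms of power constraint listed above (constant maximum, time-dependent maximum, and a range) can be checked with one small routine. The following is a minimal sketch, not the tool's actual code: the names `Bound`, `bound_at`, `meets_power_constraint`, and the `budget` function with its 30-minute switch point are all assumptions made for illustration.

```python
from typing import Callable, Optional, Union

# A bound is either a constant (watts) or a function of time (minutes -> watts).
Bound = Union[float, Callable[[float], float]]


def bound_at(bound: Bound, t: float) -> float:
    """Evaluate a constant or time-dependent bound at time t."""
    return bound(t) if callable(bound) else float(bound)


def meets_power_constraint(power: float, t: float, max_power: Bound,
                           min_power: Optional[Bound] = None) -> bool:
    """Return True if `power` lies within the allowed limits at time t."""
    if power > bound_at(max_power, t):
        return False
    if min_power is not None and power < bound_at(min_power, t):
        return False
    return True


def budget(t: float) -> float:
    """Hypothetical time-dependent budget: 15 W before minute 30, 12 W after."""
    return 15.0 if t < 30 else 12.0


print(meets_power_constraint(14.9, 0, 15.0))          # constant maximum
print(meets_power_constraint(14.0, 40, budget))       # maximum as a function of time
print(meets_power_constraint(9.0, 0, 15.0, 10.0))     # range with a minimum
```

Treating a constant and a function of time as the same kind of bound keeps the caller from having to branch on the constraint type.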
![Page 13: Architectural-level Design Exploration for Power Aware System](https://reader036.fdocuments.net/reader036/viewer/2022081603/56814647550346895db35627/html5/thumbnails/13.jpg)
Output topology
• A feasible topology meets all system constraints, if any
[Figure: output tree topology over rf, sc1, sc2, hd, cpu2, cpu1, mc1, mc2, and cam]
![Page 14: Architectural-level Design Exploration for Power Aware System](https://reader036.fdocuments.net/reader036/viewer/2022081603/56814647550346895db35627/html5/thumbnails/14.jpg)
Output-PM scheme
• Power management scheme
  – Works together with the output topology
  – Indicates results for each component at each schedule interval:
    • Power mode
    • Power consumption
    • Required bandwidth
  – Used as feedback to behavioral scheduling or software development
![Page 15: Architectural-level Design Exploration for Power Aware System](https://reader036.fdocuments.net/reader036/viewer/2022081603/56814647550346895db35627/html5/thumbnails/15.jpg)
Problem Formulation
• Tool elements:
  – Component library (CL)
  – Topology generator (TG)
  – Power management inspector (PMI)
  – Power calculator (PC)
• Given the workload graph, TG first generates a graph from which different topologies are extracted. PMI assigns a working mode to each component and checks whether the modes form a legal combination. PC computes the power of the entire system and checks whether it meets the power constraints. If it does, the problem is solved; if not, different working modes or a different topology are tried and checked again.
![Page 16: Architectural-level Design Exploration for Power Aware System](https://reader036.fdocuments.net/reader036/viewer/2022081603/56814647550346895db35627/html5/thumbnails/16.jpg)
Component Model
• Component composition:
  – Functional unit (FU)
  – Bus interface (BI)
• Power management model:
  – Layered power modes
  – Mode correspondence between FU and BI
[Figure: component model with the FU on the application side and the BI on the bus media side; mode labels shown: LNK, PHY, SUS, Full-on, sleep, Deep sleep]
• Assume that whenever the FU is working, it is communicating with other components.
![Page 17: Architectural-level Design Exploration for Power Aware System](https://reader036.fdocuments.net/reader036/viewer/2022081603/56814647550346895db35627/html5/thumbnails/17.jpg)
Bus Model
• Sender and receiver
• Service layers
• Transfer properties (modes, speed, bandwidth)
• Configuration process
[Figure: sender and receiver stacks, each with application, TRS, LNK, and PHY layers, connected through the bus media]
![Page 18: Architectural-level Design Exploration for Power Aware System](https://reader036.fdocuments.net/reader036/viewer/2022081603/56814647550346895db35627/html5/thumbnails/18.jpg)
Configuration Management I
• Power mode constraints:
  – Intra-component constraints
  – Inter-component constraints
[Figure: component model (FU and BI with mode labels LNK, PHY, SUS, Full-on, sleep, Deep sleep) next to a three-node chain A, B, C marked yes/no]
• Data is to be transferred from node A to node C, so node B cannot be put in SUS mode.
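The inter-component rule above (a node on the route between sender and receiver may not be suspended) can be sketched as a path check on the bus tree. This is an illustrative sketch only: the function names, the adjacency-dictionary representation, and the choice to test every node on the path, endpoints included, are assumptions.

```python
def path_in_tree(adj, src, dst):
    """Return the list of nodes from src to dst in a tree, or None if unreachable."""
    stack = [(src, [src])]
    seen = {src}
    while stack:
        node, path = stack.pop()
        if node == dst:
            return path
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append((nxt, path + [nxt]))
    return None


def transfer_is_legal(adj, modes, src, dst):
    """A transfer is legal only if no node along its path is in SUS mode."""
    path = path_in_tree(adj, src, dst)
    if path is None:
        return False
    return all(modes[node] != "SUS" for node in path)


# The three-node chain from the slide: A - B - C.
chain = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}

print(transfer_is_legal(chain, {"A": "LNK", "B": "SUS", "C": "LNK"}, "A", "C"))
print(transfer_is_legal(chain, {"A": "LNK", "B": "PHY", "C": "LNK"}, "A", "C"))
```

With B suspended the check fails; with B in PHY mode it passes.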
![Page 19: Architectural-level Design Exploration for Power Aware System](https://reader036.fdocuments.net/reader036/viewer/2022081603/56814647550346895db35627/html5/thumbnails/19.jpg)
Configuration Management II
• Bandwidth constraints:
  – Chain A-B-C: data is to be transferred from node A to node C at 400 Mbps, so node B’s transfer speed must be 400 Mbps too
  – Chain A-B-C-D:
    • Data transfer rates: A to D, 150 Mbps; B to D, 80 Mbps
    • Bandwidth for C: no less than 230 Mbps (150 + 80)
    • For the FireWire bus: 400 Mbps
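The arithmetic in the four-node example can be reproduced by summing, for each node, the rates of all transfers whose route passes through it. This is a minimal sketch under the assumption that the nodes form a simple chain; `node_loads` and `BUS_MAX_MBPS` are names invented here.

```python
BUS_MAX_MBPS = 400  # top FireWire speed quoted in the slides


def node_loads(chain, transfers):
    """Sum the rate of every transfer over each node it touches or passes through.

    chain     -- node names in bus order, e.g. ["A", "B", "C", "D"]
    transfers -- {(src, dst): rate_in_mbps}
    """
    loads = {node: 0 for node in chain}
    for (src, dst), rate in transfers.items():
        i, j = sorted((chain.index(src), chain.index(dst)))
        for node in chain[i:j + 1]:
            loads[node] += rate
    return loads


loads = node_loads(["A", "B", "C", "D"], {("A", "D"): 150, ("B", "D"): 80})
print(loads["C"])                                   # bandwidth node C must sustain
print(max(loads.values()) <= BUS_MAX_MBPS)          # does the bus limit hold?
```

Node C carries both transfers, so its load is 150 + 80 = 230 Mbps, which still fits within the 400 Mbps limit.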
![Page 20: Architectural-level Design Exploration for Power Aware System](https://reader036.fdocuments.net/reader036/viewer/2022081603/56814647550346895db35627/html5/thumbnails/20.jpg)
Low power design techniques I
• Bus segmentation
  – Improves communication bandwidth
  – Reduces power by disabling unused components or clusters
  – Enables other low-power design techniques
[Figure: bus segmentation]
![Page 21: Architectural-level Design Exploration for Power Aware System](https://reader036.fdocuments.net/reader036/viewer/2022081603/56814647550346895db35627/html5/thumbnails/21.jpg)
Low power design techniques II
• Clock scaling and voltage scaling
  – Trade-off between performance and power
  – Two or more frequency or voltage levels to select from
  – Extra hardware is needed to implement the techniques
![Page 22: Architectural-level Design Exploration for Power Aware System](https://reader036.fdocuments.net/reader036/viewer/2022081603/56814647550346895db35627/html5/thumbnails/22.jpg)
Using low power design techniques
• Bus segmentation with clock scaling
  – With a clustered bus, we can keep the same bandwidth by lowering the clock frequency for the communication
[Figure: a 400 Mbps bus segmented into a 200 Mbps cluster, a 100 Mbps cluster, and a suspended cluster]
![Page 23: Architectural-level Design Exploration for Power Aware System](https://reader036.fdocuments.net/reader036/viewer/2022081603/56814647550346895db35627/html5/thumbnails/23.jpg)
Algorithm I
• Creating the Communication-Scheduling Table (CST)
  – Obtain combined information of both schedule and communication
  – Used for finding the constraint set for each component
  – Format:
    • CST: (tuple1, tuple2, ...)
    • tuple: (workload_path, (interval, required_bandwidth))
    • Example: (('cpu2','mc1'),((20,30), 10)), (('cpu2','mc2'),((0,15), 20)), (('cpu1','cam'),((10,20), 20)), ...
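One plausible way to build such a table is to intersect the active intervals of the two endpoints of every workload edge. The sketch below makes that assumption explicit; the schedule values are hypothetical and were chosen only so that the result reproduces the three example tuples, and `overlap` and `build_cst` are names introduced here.

```python
def overlap(a, b):
    """Return the common part of two (start, end) intervals, or None if disjoint."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start < end else None


def build_cst(workload, schedule):
    """Combine workload edges with component schedules.

    workload -- {(comp_a, comp_b): required_bandwidth}
    schedule -- {component: [(start, end), ...]} active intervals
    Returns a list of (workload_path, (interval, required_bandwidth)) tuples.
    """
    cst = []
    for (u, v), bandwidth in workload.items():
        for iu in schedule.get(u, []):
            for iv in schedule.get(v, []):
                common = overlap(iu, iv)
                if common is not None:
                    cst.append(((u, v), (common, bandwidth)))
    return cst


# Hypothetical inputs, not taken from the rover data.
workload = {("cpu2", "mc1"): 10, ("cpu2", "mc2"): 20, ("cpu1", "cam"): 20}
schedule = {
    "cpu2": [(0, 30)], "mc1": [(20, 30)], "mc2": [(0, 15)],
    "cpu1": [(10, 20)], "cam": [(10, 20)],
}

cst = build_cst(workload, schedule)
for entry in cst:
    print(entry)
```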
![Page 24: Architectural-level Design Exploration for Power Aware System](https://reader036.fdocuments.net/reader036/viewer/2022081603/56814647550346895db35627/html5/thumbnails/24.jpg)
Algorithm II
• Building the constraint set
  – Find legal modes
    • Working mode
    • Power mode
    • Bandwidth level
  – Constrained by
    • Topology
    • System schedule
    • Communication
• Example: "Cam: ON: LNK" and "Cam: WL: 120"
  – The camera must be working in at least link-layer-on (LNK) mode
  – The required bandwidth is 120 Mbps, so the bus driver must work at 200 Mbps or higher
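The camera example can be turned into a small routine that derives, for each component and interval, a minimum mode, a workload, and the lowest bus speed that covers the workload. This is a sketch under assumptions: the dictionary layout, the rule that workloads sharing a component and interval are summed, and the names `build_constraint_set` and `min_bus_speed` are not from the slides, and the second CST entry is hypothetical.

```python
BUS_SPEEDS = (100, 200, 400)  # Mbps levels quoted for 1394 components


def min_bus_speed(required):
    """Lowest available bus speed that is at least `required` Mbps."""
    for speed in BUS_SPEEDS:
        if speed >= required:
            return speed
    raise ValueError(f"{required} Mbps exceeds the fastest bus speed")


def build_constraint_set(cst):
    """Derive per-component, per-interval constraints from CST tuples."""
    cs = {}
    for (u, v), (interval, bandwidth) in cst:
        for comp in (u, v):
            entry = cs.setdefault(comp, {}).setdefault(
                interval, {"ON": "LNK", "WL": 0})
            entry["WL"] += bandwidth  # assumption: concurrent workloads add up
    for intervals in cs.values():
        for entry in intervals.values():
            entry["speed"] = min_bus_speed(entry["WL"])
    return cs


cst = [
    (("cpu1", "cam"), ((10, 20), 120)),
    (("cpu1", "mc1"), ((10, 20), 30)),   # hypothetical second transfer
]
constraints = build_constraint_set(cst)
print(constraints["cam"][(10, 20)])
```

For the camera this yields LNK mode, a workload of 120 Mbps, and a 200 Mbps bus level, matching the example above.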
![Page 25: Architectural-level Design Exploration for Power Aware System](https://reader036.fdocuments.net/reader036/viewer/2022081603/56814647550346895db35627/html5/thumbnails/25.jpg)
Algorithm III
• Enumerating topology
  1. Start from workload graph G
  2. Add some redundant edges to G to get G’
  3. Extract a valid topology T from G’
  4. Append T to the topology library TL
• Complexity
  – Pick |Et| edges out of |Eg|, where |Et| is the number of edges in the tree and |Eg| is the number of edges in the graph
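A brute-force reading of these steps is to try every choice of |V| - 1 edges from the augmented graph and keep those that form a tree. The sketch below does exactly that with a union-find cycle check; it is one possible realization rather than the tool's actual generator, and it ignores protocol rules such as maximum fanout. The function names and the four-node example graph are hypothetical.

```python
from itertools import combinations
from math import comb


def is_spanning_tree(nodes, edges):
    """True if `edges` (exactly len(nodes) - 1 of them) contain no cycle."""
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            return False  # this edge would close a cycle
        parent[root_u] = root_v
    return True  # |V| - 1 acyclic edges always connect all nodes


def enumerate_topologies(nodes, edges):
    """Yield every tree topology that can be extracted from the graph."""
    for subset in combinations(edges, len(nodes) - 1):
        if is_spanning_tree(nodes, subset):
            yield subset


# Hypothetical augmented graph G': a square of four nodes plus both diagonals.
nodes = ["cpu", "cam", "rf", "hd"]
edges = [("cpu", "cam"), ("cam", "rf"), ("rf", "hd"), ("hd", "cpu"),
         ("cpu", "rf"), ("cam", "hd")]

library = list(enumerate_topologies(nodes, edges))
print(comb(len(edges), len(nodes) - 1), "candidate edge sets,", len(library), "trees")
```

The number of candidates grows as the binomial coefficient C(|Eg|, |Et|), which is why the number of redundant edges added to G needs to stay small.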
![Page 26: Architectural-level Design Exploration for Power Aware System](https://reader036.fdocuments.net/reader036/viewer/2022081603/56814647550346895db35627/html5/thumbnails/26.jpg)
Enumerating topology
![Page 27: Architectural-level Design Exploration for Power Aware System](https://reader036.fdocuments.net/reader036/viewer/2022081603/56814647550346895db35627/html5/thumbnails/27.jpg)
Algorithm IV
• Traversing power management schemes
  – Grouping nodes into three classes:
    • Transferring (C1)
    • Passing (C2)
    • Idle (C3)
  – Traverse different combinations
  – Try bus segmentation and clock scaling techniques

| Scheme | C1 | C2 | C3 |
|---|---|---|---|
| 1 | Full-on | Full-on | Full-on |
| 2 | Full-on | Full-on | PHY |
| 3 | Full-on | PHY | PHY |
| 4 | Full-on | PHY | SUS |
| 5 | Clustered Full-on | Clustered PHY | SUS |
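The grouping and the scheme table can be sketched as two small steps: classify each node from the routes of the active transfers, then map each class to the mode of a chosen scheme. This is an illustrative sketch; the function names and the assumption that routes are given as node lists are introduced here, while the scheme rows follow the table above.

```python
# Scheme rows from the slide: modes for (C1 transferring, C2 passing, C3 idle).
SCHEMES = [
    ("Full-on", "Full-on", "Full-on"),
    ("Full-on", "Full-on", "PHY"),
    ("Full-on", "PHY", "PHY"),
    ("Full-on", "PHY", "SUS"),
    ("Clustered Full-on", "Clustered PHY", "SUS"),
]


def classify(nodes, paths):
    """Label each node C1 (endpoint of a transfer), C2 (on a route), or C3 (idle)."""
    cls = {node: "C3" for node in nodes}
    for path in paths:
        for node in path[1:-1]:
            if cls[node] != "C1":      # an endpoint role takes precedence
                cls[node] = "C2"
        for node in (path[0], path[-1]):
            cls[node] = "C1"
    return cls


def assign_modes(classes, scheme):
    """Translate node classes into power modes for one scheme row."""
    mode_of = dict(zip(("C1", "C2", "C3"), scheme))
    return {node: mode_of[c] for node, c in classes.items()}


classes = classify(["A", "B", "C", "D"], [["A", "B", "C"]])
print(classes)
print(assign_modes(classes, SCHEMES[3]))
```

Walking the rows in order tries the least aggressive scheme first and moves toward clustering and suspension only when the power constraint requires it.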
![Page 28: Architectural-level Design Exploration for Power Aware System](https://reader036.fdocuments.net/reader036/viewer/2022081603/56814647550346895db35627/html5/thumbnails/28.jpg)
Algorithm: top level
1. Read in component property, workload graph, system schedule, and system constraints
2. Create the Communication-Scheduling Table
3. Build the constraint set
4. Enumerate topologies, building topology library TL
5. For Ti in TL:
6.     For interval in schedule:
7.         Traverse power management schemes PMSi
8.         Run power_calculator to find the power P for PMSi
9.         If P satisfies power_constraint:
10.            Print “find a feasible solution”, Ti, PMSi
11.            Stop
12. Print “can’t find a feasible solution”
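The search loop can be sketched as a nested iteration over topologies, schedule intervals, and power management schemes. The sketch below is an interpretation, not the tool's code: it assumes a topology is accepted only when every schedule interval has at least one scheme that meets the power constraint, and the callables `schemes_for`, `power_of`, and `power_ok`, as well as the toy power numbers, are hypothetical stand-ins for the scheme traversal, the power calculator, and the constraint check.

```python
def explore(topologies, intervals, schemes_for, power_of, power_ok):
    """Return (topology, {interval: scheme}) for the first feasible design, else None."""
    for topo in topologies:
        chosen = {}
        for interval in intervals:
            for pms in schemes_for(topo, interval):
                if power_ok(power_of(topo, interval, pms), interval):
                    chosen[interval] = pms
                    break
            else:
                break  # no scheme fits this interval: give up on this topology
        else:
            return topo, chosen  # every interval found a feasible scheme
    return None


# Toy data: system power in watts per (topology, scheme).
POWER = {("T1", "full"): 16.0, ("T1", "low"): 15.5,
         ("T2", "full"): 16.0, ("T2", "low"): 14.0}

result = explore(
    topologies=["T1", "T2"],
    intervals=[(0, 10), (10, 20)],
    schemes_for=lambda topo, interval: ["full", "low"],
    power_of=lambda topo, interval, pms: POWER[(topo, pms)],
    power_ok=lambda power, interval: power <= 15.0,
)
print(result if result else "can't find a feasible solution")
```

Returning the per-interval schemes along with the topology gives the caller both outputs named in the problem statement.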
![Page 29: Architectural-level Design Exploration for Power Aware System](https://reader036.fdocuments.net/reader036/viewer/2022081603/56814647550346895db35627/html5/thumbnails/29.jpg)
Example
• FireWire 1394 bus architecture
  – Tree topology
  – Transfer speed: 100, 200, or 400 Mbps
• Application: micro-rover
  – 9 nodes
  – System schedule: walking, taking pictures, walking and collecting scientific data
  – Workload graph
  – Power constraints:
    • Constant value
    • Function of time
    • A range with max and min values or functions
[Figure: workload graph over RF, CAM, NVM/HD, SC1, SC2, CPU1, CPU2, MC1, and MC2, with each edge labeled by its required bandwidth]
[Figure: system schedule with one row per component (CPU1, MC1, MC2, CPU2, CAM, RF, HD, SC1, SC2) over a time axis in minutes, marked at 10, 20, 30, 40, and 50]
![Page 30: Architectural-level Design Exploration for Power Aware System](https://reader036.fdocuments.net/reader036/viewer/2022081603/56814647550346895db35627/html5/thumbnails/30.jpg)
Experimental methodology
• Constraint-driven design space exploration
• A pre-given schedule from the behavioral level breaks the iteration loop
• Enlarge the exploration space by adding some edges to the original graph
• Use both scheduling and communication information as knowledge to build the constraint set
[Figure: tool flow linking schedule, workload, constraint set, topology, topology iterator, power modes traversor, power calculator, component library, and solution]
![Page 31: Architectural-level Design Exploration for Power Aware System](https://reader036.fdocuments.net/reader036/viewer/2022081603/56814647550346895db35627/html5/thumbnails/31.jpg)
Experiments
• Experiment 1:
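[Figure: workload graph over CAM, CPU, RF, HD, MC, and SC with edge weights 80, 40, 30, 120, 80, and 30, and the resulting topology over the same six nodes]
MAX_POWER constraint = 15.0 W
Actual MAX_POWER = 14.9 W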
![Page 32: Architectural-level Design Exploration for Power Aware System](https://reader036.fdocuments.net/reader036/viewer/2022081603/56814647550346895db35627/html5/thumbnails/32.jpg)
Experiments
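[Figure: output topology over MC, SC, HD, CAM, CPU, and RF]
MAX_POWER constraint = 14.0 W
Actual MAX_POWER = 13.94 W
[Figure: schedule with one row per component (CAM, CPU, MC, SC, RF, HD) over a time axis in minutes, marked at 10, 20, 30, 40, and 50]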
![Page 33: Architectural-level Design Exploration for Power Aware System](https://reader036.fdocuments.net/reader036/viewer/2022081603/56814647550346895db35627/html5/thumbnails/33.jpg)
Experiments
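| Component / Interval | 0-10 | 10-30 | 30-40 | 40-50 | 50-60 |
|---|---|---|---|---|---|
| CAM | LNK200 | PHY | PHY | PHY | PHY |
| CPU | LNK200 | LNK200 | PHY | LNK200 | LNK100 |
| MC | LNK200 | LNK200 | LNK100 | PHY | PHY |
| SC | LNK200 | LNK200 | PHY | PHY | PHY |
| RF | PHY | LNK200 | PHY | PHY | LNK100 |
| HD | PHY | PHY | LNK100 | LNK200 | PHY |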
![Page 34: Architectural-level Design Exploration for Power Aware System](https://reader036.fdocuments.net/reader036/viewer/2022081603/56814647550346895db35627/html5/thumbnails/34.jpg)
Experiments
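| Component / Interval | 0-10 | 10-30 | 30-40 | 40-50 | 50-60 |
|---|---|---|---|---|---|
| CAM | LNK100 | SUS | SUS | SUS | SUS |
| CPU | LNK100 | LNK100 | SUS | LNK200 | LNK100 |
| MC | LNK100 | LNK100 | LNK100 | SUS | SUS |
| SC | LNK100 | LNK100 | SUS | SUS | SUS |
| RF | SUS | LNK100 | SUS | SUS | LNK100 |
| HD | SUS | SUS | LNK100 | LNK200 | SUS |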
![Page 35: Architectural-level Design Exploration for Power Aware System](https://reader036.fdocuments.net/reader036/viewer/2022081603/56814647550346895db35627/html5/thumbnails/35.jpg)
Experimental Results
[Figure: topology over SC2, MC1, HD, CAM, CPU1, RF, CPU2, MC2, and SC1, and a plot of power (W, axis from 7 to 15) versus time (min, axis from 10 to 60) with the power constraints marked]
![Page 36: Architectural-level Design Exploration for Power Aware System](https://reader036.fdocuments.net/reader036/viewer/2022081603/56814647550346895db35627/html5/thumbnails/36.jpg)
[Figure: tree topology over rf, sc1, hd, cpu2, sc2, cpu1, mc1, cam, and mc2]
![Page 37: Architectural-level Design Exploration for Power Aware System](https://reader036.fdocuments.net/reader036/viewer/2022081603/56814647550346895db35627/html5/thumbnails/37.jpg)
[Figure: tree topology over rf, sc1, hd, cpu2, sc2, cpu1, mc1, cam, and mc2]
![Page 38: Architectural-level Design Exploration for Power Aware System](https://reader036.fdocuments.net/reader036/viewer/2022081603/56814647550346895db35627/html5/thumbnails/38.jpg)
Summary and future work
• A tool to explore the design space for power-aware architecture
• Meets different kinds of power constraints
• Incorporates low-power design techniques
• Interacts with behavioral scheduling to refine the solution
• Future work: hybrid and dynamic exploration
![Page 39: Architectural-level Design Exploration for Power Aware System](https://reader036.fdocuments.net/reader036/viewer/2022081603/56814647550346895db35627/html5/thumbnails/39.jpg)
Algorithm
1. Read in component property, communication graph, system schedule, and system constraints;
2. Construct searching graph (SG); if |SG| > Max_SG then stop;
3. Construct schedule intervals Si;
4. Enumerate all the topologies Ti from the searching graph SG;
5. For each Ti do
6. {   if Ti is topologically illegal then next Ti;
7.     Build configuration constraints set (CCS) for each component;
8.     Initialize first schedule interval S1, all components in Full-on modes;
9.     For each Si do
10.    {   if (Si != S1) copy power modes sets (PMSi) from previous interval;
11.        While (PMSi not exhausted)
12.        {   If PMSi is legal then run power_calculator
13.            {   if system power satisfies power constraints then next Si;
14.                Else next Ti;
15.            } else
16.            {   find next PMSi; }
17.        }
18.        Next Ti;
19.    }   print “find a solution:”; output Ti, PMS; stop
20. }
21. Go to step 2;