Identifying Program Power Phase Behavior Using Power Vectors

60
Identifying Program Power Phase Behavior Using Power Vectors Canturk Isci & Margaret Martonosi WWC-6 10.27.2003 Austin, TX

description

Identifying Program Power Phase Behavior Using Power Vectors. Canturk Isci & Margaret Martonosi. WWC-6 10.27.2003 Austin, TX. Power Phase Behavior. Existence of distinguishable intervals during an application ’ s execution lifetime such that: - PowerPoint PPT Presentation

Transcript of Identifying Program Power Phase Behavior Using Power Vectors

Page 1: Identifying Program Power Phase Behavior Using Power Vectors

Identifying Program Power Phase Behavior Using Power Vectors

Canturk Isci & Margaret Martonosi

WWC-6

10.27.2003Austin, TX

Page 2: Identifying Program Power Phase Behavior Using Power Vectors

2

Power Phase BehaviorPower Phase Behavior

Existence of distinguishable intervals during an application’s execution lifetime such that:

They share significantly higher resemblance within themselves in terms of power behavior the application exhibits on a given processorThis similarity is carried out by not only the total processor power, but also the distribution of power into processor sub-units

(Filtered) VPR Power Breakdowns

0

1

2

3

4

5

6

0 100 200 300 400 500Time (s)

Po

wer

[W

atts

]

(Filtered) GAP Power Breakdowns

0

1

2

3

4

5

6

0 50 100 150 200 250Time (s)

Po

wer

[W

atts

]

(Filtered) Gcc Power Breakdowns

0

1

2

3

4

5

0 100 200Time (s)

Po

wer

[W

atts

]

Page 3: Identifying Program Power Phase Behavior Using Power Vectors

3

Our Power Phase AnalysisOur Power Phase AnalysisGoal:

Identify phases in program power behavior

Determine execution points that correspond to these phases

Define small set of power signatures that represent overall power behavior

Page 4: Identifying Program Power Phase Behavior Using Power Vectors

4

Our ApproachOur ApproachOur Approach – Outline:

Collect samples of estimated power values for processor sub-units <Power Vectors> at application runtime

Define a power vector similarity metric

Group sampled program execution into phases

Determine execution points and representative signature vectors for each phase group

Analyze the accuracy of our approximation

Page 5: Identifying Program Power Phase Behavior Using Power Vectors

5

MotivationMotivationCharacterizing power behavior:

Future power-aware architectures and applications

Dynamic power/thermal management

Architecture research

Utilizing power vectors:

Direct relation to actual processor power consumption

Acquired at runtime

Identify program phases with no knowledge of application

Page 6: Identifying Program Power Phase Behavior Using Power Vectors

6

0

5

10

15

20

25

30

35

40

45

50

Vector(0) Vector(3) Vector(6) Vector(9) Vector(12) Vector(15) Vector(18) Vector(21) Vector(24)

RETIRE

Schedule Inst Queue2

Inst Queue1

Rename Allocation

Ucode ROM

1st Level BPU Trace Cache

Inst Dec

FP Regfile INT Regfile FP Exec

INT Exec

Data TLB MEM control

MOB

L1 cache ITLB & Fetch

2nd Level BPU L2 Cache

Bus Control

Generating Power VectorsGenerating Power Vectors

POWERCLIENT

POWERSERVER Voltage readings

via RS232 to logging machine

Convert voltage to measured power Convert access rates to component powers

1mV/Adc conversion

Counter based access rates

over ethernet

22 Entries of each power vector sample

Page 7: Identifying Program Power Phase Behavior Using Power Vectors

7

Power Vector Similarity MetricPower Vector Similarity Metric How to quantify the ‘power behavior

dissimilarity’ between two execution points?1. Consider solely total power difference

2. Consider manhattan distance between the corresponding 2 vectors

3. Consider manhattan distance between the corresponding 2 vectors normalized

4. Consider a combination of (2) & (3)

Construct a “similarity matrix” to represent similarity among all pairs of execution points

Each entry in the similarity matrix:

Page 8: Identifying Program Power Phase Behavior Using Power Vectors

8

Gcc Component Power Breakdowns

0

1

2

3

4

5

6

0 50 100 150 200 250Time (s)

Po

we

r [W

att

s] L1 cache

TraceCache

RETIRE

Gcc Total Power

0

10

20

30

40

50

60

0 50 100 150 200 250

Po

wer

[W

atts

]

MEASURED POWER COUNTER ESTIMATED POWER

Gzip Total Power

0

10

20

30

40

50

60

44 88 132 176 220 264 308 352 396 440Time (s)

Pow

er [W

atts

]

MEASURED POWER COUNTER ESTIMATED POWER

Gzip Component Power Breakdowns

0

1

2

3

4

5

6

44 144 244 344 444Time (s)

Pow

er [W

atts

] L1 cache

INT Exec

RETIRE

Gcc

& G

zip

Mat

rix

Plo

tsG

cc &

Gzi

p M

atri

x P

lots

(Sim

ilar

to S

imPoin

ts w

ork

of

Sh

erw

ood

et

al.)

Page 9: Identifying Program Power Phase Behavior Using Power Vectors

9

Gcc Component Power Breakdowns

0

1

2

3

4

5

6

0 50 100 150 200 250Time (s)

Po

we

r [W

att

s] L1 cache

TraceCache

RETIRE

Gcc Elaboration: Very variant power

Almost identical power behavior at 30, 50, 180s.

Although 88s, 110s, 140s, 210s and 230s show similar total power; 88, 210 and 230 share higher similarity.

Gcc Total Measured Power

0

10

20

30

40

50

60

0 50 100 150 200 250Time (s)

Po

wer

[W

atts

]

Gcc Total Power

0

10

20

30

40

50

60

0 50 100 150 200 250

Pow

er [W

atts

]

MEASURED POWER COUNTER ESTIMATED POWER

Gcc

& G

zip

Mat

rix

Plo

tsG

cc &

Gzi

p M

atri

x P

lots

Page 10: Identifying Program Power Phase Behavior Using Power Vectors

10Gcc

& G

zip

Mat

rix

Plo

tsG

cc &

Gzi

p M

atri

x P

lots

Gzip Elaboration: Much regular power behavior

Spurious similarities are again distinguished by the similarity analysis

GZIP Total Measured Power

0

10

20

30

40

50

60

44 94 144 194 244 294 344 394 444

Time (s)

Po

wer

[W

atts

]

Gzip Total Power

0

10

20

30

40

50

60

44 88 132 176 220 264 308 352 396 440Time (s)

Pow

er [W

atts

]

MEASURED POWER COUNTER ESTIMATED POWER

Gzip Component Power Breakdowns

0

1

2

3

4

5

6

44 144 244 344 444Time (s)

Pow

er [W

atts

] L1 cache

INT Exec

RETIRE

Page 11: Identifying Program Power Phase Behavior Using Power Vectors

11

Grouping Execution PointsGrouping Execution Points “Thresholding Algorithm”:

Define a threshold of similarity < % of max dissimilarity>

Start from first execution point (0,0) and identify ones in the fwd execution path that lie within threshold for both normalized and absolute metrics

Tag the corresponding execution points (j,j) as the same group

Find next untagged execution point (r,r) and do the same along forward path

Rule: A tagged execution point cannot add new elements to its group!We demonstrate the outcome of thresholding with Grouping Matrices

Page 12: Identifying Program Power Phase Behavior Using Power Vectors

12

Gzip Grouping MatricesGzip Grouping Matrices

Gzip has 974 power vectors Cluster vectors based on similarity

using “thresholding” Max Gzip power dissimilarity: 47.35W

Page 13: Identifying Program Power Phase Behavior Using Power Vectors

13

Gzip Group Distribution for Threshold = 1%

0

50

100

150

200

250

300

0 50 100 150 200 250 300 350 400 450 500

Time (s)

Gro

up #

0

10

20

30

40

50

Gzip Group Distribution for Threshold = 10%

05

1015202530354045

0 50 100 150 200 250 300 350 400 450 500

Time (s)

Gro

up #

0

10

20

30

40

50

Generated Group DistributionsGenerated Group Distributions

Page 14: Identifying Program Power Phase Behavior Using Power Vectors

14

Representative Vectors & Execution PointsRepresentative Vectors & Execution Points

We have each execution point assigned to a group For Each Group:

For Each Execution Point:

We can represent whole execution with as many power vectors as the number of generated groups

Define a representative vector as the average of all instances of that group

Select the execution point that started the group (The earliest point in each group)

Assign the corresponding group’s representative vector as that point’s power vector

Assign the power vector of the selected execution point for that group as that point’s power vector

Page 15: Identifying Program Power Phase Behavior Using Power Vectors

15

Rec

onst

ruct

ing

Pow

er T

race

R

econ

stru

ctin

g P

ower

Tra

ce

wit

h R

epre

sen

tati

ve V

ecto

rs:

wit

h R

epre

sen

tati

ve V

ecto

rs:

RECONSTRUCTED GZIP POWER for Threshold=1% <254 Vectors>

5

15

25

35

45

55

0 50 100 150 200 250 300 350 400 450 500Time (s)

Po

wer

[W

]

TOTAL_MEASURED_POWER TOTAL_MODELED_POWER RECONSTRUCTED_POWER(Representative Vectors)

RECONSTRUCTED GZIP POWER for Threshold=10% <33 Vectors>

5

15

25

35

45

55

0 50 100 150 200 250 300 350 400 450 500Time (s)

Po

we

r [W

]

TOTAL_MEASURED_POWER TOTAL_MODELED_POWER RECONSTRUCTED_POWER(Representative Vectors)

Page 16: Identifying Program Power Phase Behavior Using Power Vectors

16

Rec

onst

ruct

ing

Pow

er T

race

R

econ

stru

ctin

g P

ower

Tra

ce

wit

h S

elec

ted

Exe

cuti

on P

oin

ts:

wit

h S

elec

ted

Exe

cuti

on P

oin

ts:

RECONSTRUCTED GZIP POWER for Threshold=10% <33 Vectors>

5

15

25

35

45

55

0 50 100 150 200 250 300 350 400 450 500Time (s)

Po

we

r [W

]

TOTAL_MEASURED_POWER TOTAL_MODELED_POWER RECONSTRUCTED_POWER(Vectors Based on Selected Execution Points)

RECONSTRUCTED GZIP POWER for Threshold=1% <254 Vectors>

5

15

25

35

45

55

0 50 100 150 200 250 300 350 400 450 500Time (s)

Po

we

r [W

]

TOTAL_MEASURED_POWER TOTAL_MODELED_POWER RECONSTRUCTED_POWER(Vectors Based on Selected Execution Points)

Page 17: Identifying Program Power Phase Behavior Using Power Vectors

17

Component Power CharacterizationsComponent Power Characterizations

GZIP Modeled Power - Vector Components

0

10

20

30

40

50

60

44 66 88 110132

154176

198220

242264

286308

330352

374396

418440

462

Ve

cto

r C

om

po

ne

nts

Bus Control L2 Cache 2nd Level BPU ITLB & Fetch L1 cache MOB MEM control Data TLB INT Exec FP Exec INT Regfile FP Regfile Inst Dec Trace Cache 1st Level BPU Ucode ROM Allocation Rename Inst Queue1 Inst Queue2 Schedule RETIRE TOTAL_MODELED_POWER

GZIP Reconstructed Power - Vector Components

0

10

20

30

40

50

60

44 66 88 110132

154176

198220

242264

286308

330352

374396

418440

462

Vec

tor

Co

mp

on

ents

Bus Control_R L2 Cache_R 2nd Level BPU_R ITLB & Fetch_R L1 cache_R MOB_R MEM control_R Data TLB_R INT Exec_R FP Exec_R INT Regfile_R FP Regfile_R Inst Dec_R Trace Cache_R 1st Level BPU_R Ucode ROM_R Allocation_R Rename_R Inst Queue1_R Inst Queue2_R Schedule_R RETIRE_R RECONSTRUCTED_POWER

(Representative Vectors)GZIP Reconstructed Power - Vector Components

0

10

20

30

40

50

60

44 66 88 110132

154176

198220

242264

286308

330352

374396

418440

462

Ve

cto

r C

om

po

ne

nts

Bus Control_R L2 Cache_R 2nd Level BPU_R ITLB & Fetch_R L1 cache_R MOB_R MEM control_R Data TLB_R INT Exec_R FP Exec_R INT Regfile_R FP Regfile_R Inst Dec_R Trace Cache_R 1st Level BPU_R Ucode ROM_R Allocation_R Rename_R Inst Queue1_R Inst Queue2_R Schedule_R RETIRE_R RECONSTRUCTED_POWER

(Vectors Based on Selected Execution Points)

Page 18: Identifying Program Power Phase Behavior Using Power Vectors

18

GZIP Reconstructed Power - Absolute Errors

0

1

2

3

4

5

6

7

8

9

10

44 66 88 110 132 154 176 198 220 242 264 286 308 330 352 374 396 418 440 462

Time(s)

Ab

solu

te D

iffe

ren

ce

0

10

20

30

40

50

60

Po

wer

[W

]

(Vectors Based on Selected Execution Points)GZIP Reconstructed Power - Absolute Errors

0

1

2

3

4

5

6

7

8

9

10

44 66 88 110 132 154 176 198 220 242 264 286 308 330 352 374 396 418 440 462

Ab

solu

te D

iffe

ren

ce

0

10

20

30

40

50

60

Time (s)

Po

wer

[W

]

(Representative Vectors)

Approximation ErrorApproximation Error

Due to thresholding algorithm Errors for selected exec. points are bounded with the threshold Max Error: 4.71W & RMS Error: 3.08W

As representative vectors are group centroids Cumulative errors for repr. vectors are lower Max Error: 7.10W & RMS Error: 2.31W

Error in total power< Σ(Component errors)

Page 19: Identifying Program Power Phase Behavior Using Power Vectors

19

ConclusionConclusion Presented a power oriented methodology to

identify program phases that uses power vectors generated during program runtime

Provided a similarity metric to quantify power behavior similarity of different execution samples

Demonstrated our representative sampling technique to characterize program power behavior

Can be useful for power & characterization research: Power Phase identification/prediction Reduced power simulation Dynamic power/thermal management

Page 20: Identifying Program Power Phase Behavior Using Power Vectors

20

Related WorkRelated Work Dhodapkar and Smith [ISCA’02]

Working set signatures to detect phase changes Sherwood et. al. [PACT’01,ASPLOS’02,ISCA’30]

Similarity analysis based on program basic block profiles to identify phases

Todi [WWC’01] Clustering based on counter information to identify

similar behavior

Our work in comparison Power oriented Power behavior similarity metric Runtime No information about the application is required Bounded approximation error with thresholding

Page 21: Identifying Program Power Phase Behavior Using Power Vectors

21

EOP

Page 22: Identifying Program Power Phase Behavior Using Power Vectors

22

Power Vector Components More detail on Power Vectors

Different Similarity Metrics Similarity matrices and equations for all discussed techniques

Similarity & Grouping Matrices Exemplified description of the two matrices and plots

Current & Future Research Discussion of the ongoing and future research Includes also some new ideas and some things to do to make

our current analysis solid

Questions & Rebuttals Starts with the discussion of presented work Discusses some shortcomings, things that need to be done to

improve and to verify that it is unique and solid(Some parts of current work also discusses these issues)

Also provides some answers to reviewers’ questions Includes some possible new ideas

EXTRA SLIDESEXTRA SLIDES

Page 23: Identifying Program Power Phase Behavior Using Power Vectors

23

Defining ComponentsDefining Components

Page 24: Identifying Program Power Phase Behavior Using Power Vectors

24

P4 Architecture vs LayoutP4 Architecture vs Layout

Components to Model:

1) Bus Control2) L2 Cache3) 2nd Level BPU4) ITLB & Ifetch5) L1 Cache

6) MOB7) Mem Control8) DTLB9) Int EXE10)FP EXE11) Int RF

12)FP RF13)Decode14)Trace $15)1st Level BPU16)Microcode ROM17)Allocation

18)Rename19) Inst-n Qs20)Schedule21) Inst-n Qs22)Retirement

Back Back

Page 25: Identifying Program Power Phase Behavior Using Power Vectors

25

Defining Events Defining Events Access Rates Access Rates We determined 24 events to approximate access rates

for 22 components Used Several Heuristics to represent each access rate

Examples:

Need to rotate counters 4 times to collect all event data Used 15 counters & 4 rotations to collect all event data

Page 26: Identifying Program Power Phase Behavior Using Power Vectors

26

Access Rates Access Rates Component Powers Component Powers

“Performance Counter based Access Rate estimations are used as proxy for max component power weighting together with microarchitectural details in order to estimate processor sub-unit powers”

EX: Trace cache delivers 3 uops/cycle in deliver mode and 1 uop/cycle in build mode:

Power(TC)=[Access-Rate(TC)/3 + Access-Rate(ID)] x MaxPower(TC) + Non-gated TC CLK power

Total power is computed as the sum of all 22 component powers + measured idle power (8W):

Page 27: Identifying Program Power Phase Behavior Using Power Vectors

27

Counter Access HeuristicsCounter Access Heuristics 1) BUS CONTROL:

No 3rd Level cache BSQ allocations ~ IOQ allocations Metric1: Bus accesses from all agents

Event: IOQ_allocationCounts various types of bus transactions

Should account for BSQ as wellaccess based rather than duration

MASK:Default req. type, all read (128B) and write (64B) types, include OWN,OTHER and PREFETCH

Metric2: Bus Utilization(The % of time Bus is utilized)Event: FSB_data_activity

Counts DataReaDY and DataBuSY events on BusMask:

Count when processor or other agents drive/read/reserve the busExpression: FSB_data_activity x BusRatio / Clocks Elapsed

To account for clock ratios

Page 28: Identifying Program Power Phase Behavior Using Power Vectors

28

Counter Access HeuristicsCounter Access Heuristics 2) L2 Cache:

Metric: 2nd Level cache referencesEvent: BSQ_cache_reference

Counts cache ref-s as seen by bus unitMASK:

All MESI read misses (LD & RFO)2nd level WR misses

3) 2nd Level BPU: Metric 1: Instructions fetched from L2 (predict)

Event: ITLB_ReferenceCounts ITLB translations

Mask:All hits, misses & UC hits

Metric 2: Branches retired (history update)Event: branch_retired

Counts branches retiredMask:

Count all Taken/NT/Predicted/MissP

Page 29: Identifying Program Power Phase Behavior Using Power Vectors

29

Counter Access HeuristicsCounter Access Heuristics 4) ITLB & I-Fetch:

etc……… 10) FP Execution:

Metric: FP instructions executedevent1: packed_SP_uop

counts packed single precision uopsevent2: packed_DP_uop

counts packed single precision uopsevent3: scalar_SP_uop

counts scalar double precision uopsevent4: scalar_DP_uop

counts scalar double precision uopsevent5: 64bit_MMX_uop

counts MMX uops with 64bit SIMD operandsevent6: 128bit_MMX_uop

counts integer SSE2 uops with 128bit SIMD operandsevent7: x87_FP_UOP

counts x87 FP uopsevent8: x87_SIMD_moves_uop

counts x87, FP, MMX, SSE, SSE2 ld/st/mov uops Back Back

Page 30: Identifying Program Power Phase Behavior Using Power Vectors

30

Similarity Based on Total PowerSimilarity Based on Total Power

Page 31: Identifying Program Power Phase Behavior Using Power Vectors

31

Similarity Based on Absolute Power Similarity Based on Absolute Power VectorsVectors

Page 32: Identifying Program Power Phase Behavior Using Power Vectors

32

Similarity Based on Normalized Power Similarity Based on Normalized Power VectorsVectors

Page 33: Identifying Program Power Phase Behavior Using Power Vectors

33

Similarity Based on Both Absolute and Similarity Based on Both Absolute and Normalized Power VectorsNormalized Power Vectors

Back Back

Page 34: Identifying Program Power Phase Behavior Using Power Vectors

Similarity Matrix ExampleSimilarity Matrix Example

Consider 4 vectors, each with 4 dimensions:

1

2

3

5

2

1

5

3

2

2

4

4

4

3

5

1

0 6 3 7

6 0 3 6

3 3 0 7

7 6 7 0

322214543

:2 &1 VectorsBetween Distance

:nCalculatio DistanceManhattan Exemplary

Log all distances in the similarity matrix

0 6 3 7

6 0 3 6

3 3 0 7

7 6 7 0

Color-scale from black to white (only for upper diagonal)

Page 35: Identifying Program Power Phase Behavior Using Power Vectors

35

Interpreting Similarity Matrix PlotInterpreting Similarity Matrix Plot

Back Back

Level of darkness at any location (r,c) shows the amount of similarity between vectors –samples– r & c.

i.e. 0 & 2

All samples are perfectly similar to themselves

All (r,r) are black

Vertically above the diagonal shows similarity of the sample at the diagonal to previous samples

i.e. 1 vs. 0

Horizontally right of the diagonal shows similarity of the sample at the diagonal to future samples

i.e. 1 vs. 2,3

Page 36: Identifying Program Power Phase Behavior Using Power Vectors

Grouping Matrix ExampleGrouping Matrix Example

Consider same 4 vectors:

0 6 3 76 0 3 63 3 0 77 6 7 0

togethergrouped becan 5.3 distance with VectorsThreshold 50%

together grouped becan none 0.7, Threshod 10%

7 :VectorsBetween Distance Maximum

Mark execution pairs with distance ≤ Threshold

2

1

5

3

2

2

4

4

4

3

5

1

1

2

3

5

0 6 3 76 0 3 63 3 0 77 6 7 0

0 6 3 76 0 3 63 3 0 77 6 7 0

0 6 3 76 0 3 63 3 0 77 6 7 0 Back Back

Page 37: Identifying Program Power Phase Behavior Using Power Vectors

37

Current & Future ResearchCurrent & Future Research

FOLLOWING SLIDES DISCUSS ONGOING RESEARCH RELATED TO POWER PHASES. PLANS FOR FUTURE RESEARCH ARE ALSO DISCUSSED

Page 38: Identifying Program Power Phase Behavior Using Power Vectors

38

THE BIG PICTURETHE BIG PICTURE

Performance Monitoring

Real Power Measurement

PowerModeling

ThermalModeling

Performance Monitoring

Real Power Measurement

PowerModeling

ThermalModeling

PowerPhases

ProgramProfiling

ProgramStructure

To Estimate component power & temperature breakdowns for P4 at runtime…

To analyze how power phase behavior relates to program structure

Bottom line…

Page 39: Identifying Program Power Phase Behavior Using Power Vectors

39

PowerModeling

PowerPhases

ProgramProfiling

ProgramStructure

PowerPhases

ProgramProfiling

ProgramStructure

Phase BranchPhase Branch

Power Phase Behavior Similarity Based on Power Vectors Identifying similar program regions

Profiling Execution Flow Sampling process’ execution “PCsampler” LKM

Program Structure Execution vs. Code space Power Phases Exec. Phases

NOT YET?

Page 40: Identifying Program Power Phase Behavior Using Power Vectors

40

PowerModeling

PowerPhases

ProgramProfiling

ProgramStructure

PowerPhases

POWER PHASE BEHAVIORPOWER PHASE BEHAVIOR

Power Phase Behavior Similarity Based on Power Vectors Identifying similar program regions

Profiling Execution Flow Sampling process’ execution “PCsampler” LKM

Program Structure Execution vs. Code space Power Phases Exec. Phases

NOT YET

Page 41: Identifying Program Power Phase Behavior Using Power Vectors

41

Identifying Power PhasesIdentifying Power PhasesMost of the methodology is ready

Complete Gzip case in WWCExtensibility to other benchmarks

Generated similarity metrics for several Performing phase identification with

thresholdingRepeatibilty of the experimentSeveral other possible ideas such as:

Thresholding + k-means clustering Two-pass thresholding PCA for dimension reduction (or SVD?) Manhattan = L1 norm

Euclidian (L2) – not interestingChebyschev (Linf) - ??

Page 42: Identifying Program Power Phase Behavior Using Power Vectors

42

PowerModeling

PowerPhases

ProgramProfiling

ProgramStructure

ProgramProfiling

Program Execution ProfileProgram Execution Profile

ProgramStructure

Power Phase Behavior Similarity Based on Power Vectors Identifying similar program regions

Profiling Execution Flow Sampling process’ execution “PCsampler” LKM

Program Structure Execution vs. Code space Power Phases Exec. Phases

NOT YET

Page 43: Identifying Program Power Phase Behavior Using Power Vectors

43

Program Execution ProfileProgram Execution Profile

Sample program flow simultaneously with power Our LKM implementation: “PCsampler”

Not Finished…

Generate code space similarity in parallel with power space similarity

Relative comparisons of methods for: Complexity Accuracy Applicability, etc.

Page 44: Identifying Program Power Phase Behavior Using Power Vectors

44

CURRENT STATECURRENT STATE

Sample PC Binding to functionsReacquire PID

Those SPECs, Runspec: always in fixed address at ELF_program interpreterBenches: change pid between datasets

Verify PC with objdump So we can make sure it is the PC

we’re sampling

Page 45: Identifying Program Power Phase Behavior Using Power Vectors

45

Initial Data: PC?? Trace For gzip-sourceInitial Data: PC?? Trace For gzip-source

Gzip-Source PC Distribution

134515000

134520000

134525000

134530000

134535000

134540000

134545000

0 20 40 60 80 100 120 140 160Sample

Me

mo

ry L

oc

ati

on

Correspond to:<send_bits><bit_reverse><lm_init><longest_match><fill_window>

functions

Back Back

Page 46: Identifying Program Power Phase Behavior Using Power Vectors

46

Generality of the technique All Spec benchmarks show

distinct phase behavior:Repeatability of the experiment

Need to be able to arrive at similar phase behavior in order to characterize an application

Correlation between vector components Inherent redundancy in power vectors Could be removed with PCA

Alternative norms for similarityApplicability of selected execution points

DiscussionDiscussionFrom here onFrom here on

Page 47: Identifying Program Power Phase Behavior Using Power Vectors

47

Similarity Analysis for Other Similarity Analysis for Other Applications?Applications? SPECs show similar applicable behavior

Not always phase-like, i.e. twolf has more like a power gradient

Results for other benchmarks: <NOT READY> Gcc & Twolf:

# of groups w.r.t. thresholdsErrors plots for reconstructed & selected vectors

Apply to other applications: Desktop applications

Will follow the bursty behavior, maybe determine action signatures??

Saving, computation, streaming, etc. Ghostscript might be interesting

Correlation between phases vs. locations of images

Page 48: Identifying Program Power Phase Behavior Using Power Vectors

48

Dependence of Results to the Dependence of Results to the Applied Power Model?Applied Power Model?The generic technique requires only

sufficiently detailed power breakdowns that add up to total power Doesn’t matter how you acquire the ‘power

vectors’ otherwiseIf you use some other characterization

data other than power vectors Can still perform phase analysis, but cannot

provide a direct estimation for reconstructed power behavior

Maybe use some kind of mapping?Log measured power data as well as characterization metrics?

Would still be unable to predict component-wise

Page 49: Identifying Program Power Phase Behavior Using Power Vectors

49

Possibility for Other Processors?Possibility for Other Processors?

Most recent processors are keen on power management There will be enough power variability to exploit for power

phase analysis Porting the power estimation to other architectures

Requires significant effort to Define power related metricsImplement counter reader and power estimation user and kernel SW

Porting to same architecture, different implementation More straightforward

Reevaluate max/idle/gated power estimates Experiences with other architectures:

Castle project for Pentium Pro (P6) Few watts of variationLow dimensionality

IBM Power3 IIVery low measured power variation

Page 50: Identifying Program Power Phase Behavior Using Power Vectors

50

Other Statistical Techniques?Other Statistical Techniques?

Alternative measures of distance: Different norms

Canberra Distance

Squared Chi-squared distance, etc.Other similarity metrics:

Pearson’s correlationcoefficient

Cosine similarity

N

i

N

crN iPViPVLcr

1 22

1

)()(,

22

1)()(

)()(),(i

iPViPV

iPViPV

cr

crcrC

22

122

)()(),(i

iPViPV crcrPC

22

1

222

1

2

22

1

)()(

)()(

),(i

ci

r

icr

iPViPV

iPViPV

crCS

Page 51: Identifying Program Power Phase Behavior Using Power Vectors

51

SPEClite vs. SimPoints SPEClite vs. SimPoints vs. Power Vectorsvs. Power VectorsApproaches of 3 methods:

SPEClite: Performance counts (runtime) (Perf. Oriented) (k-means partitioning for vectors normalized to 0 mean and unit variance) (PCA to create reduced vectors) (selects vectors closest to centroids)

SimPoints: Basic block accesses (Simulation) (Perf. Oriented) (k-means partitioning for normalized basic block vectors) (selects vectors closest to centroids –except for ‘Early’ Simpoints)

Power Vectors:Component power consumptions (Runtime) (Power Oriented) (threshold based partitioning for ‘normalized + absolute’ vectors) (Selects earliest vector of each group)

Page 52: Identifying Program Power Phase Behavior Using Power Vectors

52

Why Power Vectors w.r.t. Others?Why Power Vectors w.r.t. Others?

Provides a direct interpretation for power consumption Could be used to identify specific power behavior for

dynamic power/thermal management Power phases might be not a perfect

translation of performance phases <CURRENT WORK INVESTIGATES>

I.e. same basic block accesses during different architectural states

Generated at runtime Easy repeatability, etc.

Thresholding provides upper bound estimate for the power approximation with selected execution points

Page 53: Identifying Program Power Phase Behavior Using Power Vectors

53

Modified StuffModified Stuff

FOLLOWING SLIDES I MODIFIED FROM THE ORIGINAL, BUT STILL KEEP ‘EM

Page 54: Identifying Program Power Phase Behavior Using Power Vectors

54

Our Power Phase AnalysisOur Power Phase Analysis Goal:

Identify phases in program power behavior Determine execution points

that correspond to these phases Define small set of power signatures

that represent overall power behavior

Our Approach: Collect samples of estimated power values for processor sub-

units <Power Vectors> at application runtime Define a similarity metric regarding these power vectors Process the outcome of this similarity metric to group sampled

execution points (/Power Vectors) into phases Determine execution points and representative signature

vectors for each phase group Quantify the closeness of our approximation based on these

vectors to original power behavior

Page 55: Identifying Program Power Phase Behavior Using Power Vectors

55

MotivationMotivation Characterizing power behavior:

Future power-aware architectures and applications Dynamic power/thermal management

Multiconfigurable hardwareThread scheduling, DVS, DFSRecurring phase prediction

Architecture researchRepresentative –reduced– simulation points

Utilizing power vectors: Direct relation to actual processor power consumption Acquired at runtime

Similarity relations generated quicklyEasy repeatability for different datasets/compilationsIdentify (recurring) phases over large scales of execution

Identify program phases with no knowledge of application (i.e. no basic block profile, PC sampling, code space info, etc.)

Page 56: Identifying Program Power Phase Behavior Using Power Vectors

56

Power Vector Similarity MetricPower Vector Similarity Metric How to quantify the ‘power behavior

dissimilarity’ between two execution points?1. Consider solely total power difference

Conceals significant phase information2. Consider manhattan distance between the

corresponding 2 vectors Vectors with small magnitudes are inherently closer

3. Consider manhattan distance between the corresponding 2 vectors normalized Indifferent to magnitude of power consumption

4. Consider a combination of (2) & (3) Restricts us to both absolute and ratio-wise similarities

Construct a “similarity matrix” to represent similarity among all pairs of execution points

Each entry in the similarity matrix:

Page 57: Identifying Program Power Phase Behavior Using Power Vectors

57

Gcc Component Power Breakdowns

0

1

2

3

4

5

6

0 50 100 150 200 250Time (s)

Po

we

r [W

att

s] L1 cache

TraceCache

RETIRE

Gcc Total Power

0

10

20

30

40

50

60

0 50 100 150 200 250

Po

wer

[W

atts

]

MEASURED POWER COUNTER ESTIMATED POWER

Gzip Total Power

0

10

20

30

40

50

60

44 88 132 176 220 264 308 352 396 440Time (s)

Pow

er [W

atts

]

MEASURED POWER COUNTER ESTIMATED POWER

Gzip Component Power Breakdowns

0

1

2

3

4

5

6

44 144 244 344 444Time (s)

Pow

er [W

atts

] L1 cache

INT Exec

RETIRE

Gcc

& G

zip

Mat

rix

Plo

tsG

cc &

Gzi

p M

atri

x P

lots

Gcc Elaboration: Very variant power

Almost identical power behavior at 30, 50, 180s.

Although 88s, 110s, 140s, 210s and 230s show similar total power; 88, 210 and 230 share higher similarity.

Gcc Total Measured Power

0

10

20

30

40

50

60

0 50 100 150 200 250Time (s)

Po

wer

[W

atts

]

Gzip Elaboration: Much regular power behavior

Spurious similarities such as 100-150s and 200-280 are distinguished by the similarity analysis

GZIP Total Measured Power

0

10

20

30

40

50

60

44 94 144 194 244 294 344 394 444

Time (s)

Po

wer

[W

atts

]

Page 58: Identifying Program Power Phase Behavior Using Power Vectors

58

Similarity for Simplicity?Similarity for Simplicity? So, we can identify similar power phases:

I.e. informally: if similarity matrix(r,c) is DARK Execution points r & c have similar power behavior

2 Questions: 1) How do we group the execution points (power vectors) based on their similarity? 2) Could we represent power behavior with reasonable accuracy, with a small number of ‘signature’ vectors?

Our answer to Q.1: “Thresholding Algorithm”: Define a threshold of similarity < % of max dissimilarity> Start from first execution point (0,0) and identify ones in the fwd execution path that lie within threshold for both

normalized and absolute metrics Tag the corresponding execution points (j,j) as the same group Find next untagged execution point (r,r,) and do the same along fwd path Rule: A tagged execution point cannot add new elements to its group!

We demonstrate the outcome of thresholding with Grouping Matrices

Page 59: Identifying Program Power Phase Behavior Using Power Vectors

59

Gzip Grouping MatricesGzip Grouping Matrices

Gzip has 974 power vectors Cluster vectors based on similarity

using “thresholding” Max Gzip power dissimilarity: 47.35W

Page 60: Identifying Program Power Phase Behavior Using Power Vectors

60

ConclusionConclusion Presented a power oriented methodology to

identify program phases that uses power vectors generated during program runtime

Provided a similarity metric to quantify power behavior similarity of different execution samples

Demonstrated our representative sampling technique to characterize program power behavior Representative vectors for program power signatures Execution points for representative simulation

Can be useful for power & characterization research: Power Phase identification/prediction Reduced power simulation Dynamic power/thermal management