Page 1:

Discovering and Exploiting Program Phases

Timothy Sherwood, Erez Perelman, Greg Hamerly, Suleyman Sair, Brad Calder

CSE 231 Presentation by Justin Ma

Page 2:

400 Million Instructions

[Diagram labels: Benchmark (SPEC2000), New Compiler, Simulator, New Processor (Non-Existent Processor)]

Page 3:

400 Million Instructions

• Suppose you have a time budget…
• Less than half a second of execution time
• What would you simulate?
  – Beginning?
  – Middle?
  – End?

Page 4:

400 Million Instructions

[Figure panels: gzip, gcc]

Programs exhibit diverse modes of behavior

Page 5:

400 Million Instructions

• Suppose you have a time budget…
• Less than half a second of execution time
• What would you simulate?
  – Beginning?
  – Middle?
  – End?
  – Samples of different modes of behavior

Page 6:

Program Phases

• Observation: programs exhibit various modes of periodic behavior

• These modes are program phases

• Challenge: Extract these automatically

Page 7:

Phase Basics

• Intervals – slices of time
• Phases – intervals with similar behavior

[Figure: IPC versus time (instruction count)]


Page 9:

Defining “Similar Behavior”

• Metric for comparing intervals?
  – Cache misses?
  – IPC?
  – Branch misprediction rates?
• Problem: Performance alone is too architecture-dependent

Page 10:

Defining “Similar Behavior”

• Code path traversal
  – Directly affects time-varying behavior
  – Execute same code, same performance
  – Architecture-independent
• Metrics for code path traversal
  – Frequency of branches
  – Frequency of function calls
  – Frequency of basic block executions

Page 11:

Basic Block Vector

[Diagram: control-flow graph with B1 at the top, B2 and B3 in the middle, and B4 at the bottom]

Time t
B1 B2 B3 B4
0  0  0  0

Page 12:

Basic Block Vector

[Diagram: control-flow graph with B1 at the top, B2 and B3 in the middle, and B4 at the bottom]

Time t
B1 B2 B3 B4
1  1  0  1

Page 13:

Basic Block Vector

[Diagram: control-flow graph with B1 at the top, B2 and B3 in the middle, and B4 at the bottom]

Time t
B1 B2 B3 B4
2  1  1  2

Page 14:

Basic Block Vector

[Diagram: control-flow graph with B1 at the top, B2 and B3 in the middle, and B4 at the bottom]

Time t
B1 B2 B3 B4
2  1  1  2

Time t + 1
B1 B2 B3 B4
0  0  0  0

Page 15:

Basic Block Vector

[Diagram: control-flow graph with B1 at the top, B2 and B3 in the middle, and B4 at the bottom]

Time t
B1 B2 B3 B4
2  1  1  2

Time t + 1
B1 B2 B3 B4
1  1  0  1

Page 16:

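Basic Block Vector

[Diagram: control-flow graph with B1 at the top, B2 and B3 in the middle, and B4 at the bottom]

Time t
B1 B2 B3 B4
2  1  1  2

Time t + 1
B1 B2 B3 B4
2  2  0  2

Only the B2 and B3 counts differ between the two vectors, so only those terms appear:

Manhattan distance = |1 – 2| + |1 – 0| = 2
Euclidean distance = sqrt((1 – 2)^2 + (1 – 0)^2) = sqrt(2)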
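The counting and distance steps on this slide can be sketched in Python. This is a minimal illustration, not the authors' tooling: the function names and the two block traces are hypothetical, chosen only so that the resulting vectors match the ones shown for time t and time t + 1.

```python
from collections import Counter
from math import sqrt

BLOCKS = ["B1", "B2", "B3", "B4"]  # block names taken from the slide's diagram


def basic_block_vector(trace, blocks=BLOCKS):
    """Count how many times each basic block executes within one interval."""
    counts = Counter(trace)
    return [counts[b] for b in blocks]


def manhattan(u, v):
    """Sum of absolute per-block differences."""
    return sum(abs(a - b) for a, b in zip(u, v))


def euclidean(u, v):
    """Square root of the sum of squared per-block differences."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical execution traces (assumption: two passes through the graph per interval).
interval_t = ["B1", "B2", "B4", "B1", "B3", "B4"]
interval_t1 = ["B1", "B2", "B4", "B1", "B2", "B4"]

bbv_t = basic_block_vector(interval_t)    # [2, 1, 1, 2]
bbv_t1 = basic_block_vector(interval_t1)  # [2, 2, 0, 2]

print(manhattan(bbv_t, bbv_t1))  # 2
print(euclidean(bbv_t, bbv_t1))  # sqrt(2), about 1.414
```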

Page 17:

Basic Block Similarity Matrix

• gzip

Page 18:

Basic Block Similarity Matrix

• gcc

BBV similarity between intervals reflects performance similarity
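A similarity matrix of this kind could be built by comparing every pair of interval BBVs. The sketch below is an assumption-laden illustration: the slides do not say which distance the plots use, so Manhattan distance is picked here, and each BBV is first normalized to sum to 1 so that intervals are compared by their mix of blocks. The function name and sample vectors are hypothetical.

```python
def similarity_matrix(bbvs):
    """Pairwise Manhattan distances between normalized BBVs (0 means identical mix)."""

    def normalize(v):
        total = sum(v)
        return [x / total for x in v] if total else list(v)

    norm = [normalize(v) for v in bbvs]
    n = len(norm)
    return [
        [sum(abs(a - b) for a, b in zip(norm[i], norm[j])) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical intervals: the third has the same block mix as the first.
bbvs = [[2, 1, 1, 2], [2, 2, 0, 2], [4, 2, 2, 4]]
for row in similarity_matrix(bbvs):
    print(["%.3f" % d for d in row])
```

Small entries mark pairs of intervals that execute nearly the same code, which is what shows up as same-shaded regions in the plots.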

Page 19:

Automatic Phase Classification

• Classify intervals into phases
  – We do not know which BBVs correspond to particular phases a priori
• k-means clustering
  – Iterative clustering algorithm
  – Dimension reduction
    • Random linear projection
  – Try different k values
    • Use the Bayesian Information Criterion (BIC) to choose the best k
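The projection and clustering steps listed above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the projection matrix uses uniform random entries in [-1, 1], the target dimension is arbitrary, and k-means is seeded with a deterministic farthest-first rule instead of random restarts. The step that tries several k values and scores them with BIC is left out. All names and sample data are hypothetical.

```python
import random


def random_projection(vectors, out_dim, seed=0):
    """Multiply each vector by a random in_dim x out_dim matrix."""
    rng = random.Random(seed)
    in_dim = len(vectors[0])
    matrix = [[rng.uniform(-1.0, 1.0) for _ in range(out_dim)] for _ in range(in_dim)]
    return [
        [sum(v[i] * matrix[i][j] for i in range(in_dim)) for j in range(out_dim)]
        for v in vectors
    ]


def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeans(points, k, max_iterations=100):
    """Plain k-means; returns (labels, centroids)."""
    # Farthest-first seeding (a simplification chosen for determinism).
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(farthest))

    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical BBVs from four intervals.
bbvs = [[20, 18, 2, 20], [22, 19, 3, 22], [4, 0, 4, 4], [5, 1, 4, 5]]
projected = random_projection(bbvs, out_dim=2)
labels, centroids = kmeans(projected, k=2)
print(labels)
```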

Page 20:

Automatic Phase Classification

Page 21:

Automatic Phase Classification

Clustering accurately distinguishes phases automatically

Page 22:

SimPoint

• Simulate large programs on a budget

• Perform detailed simulation on representative code snippets
  – Choose centroid interval from each phase (10 million instructions)
• Extrapolate large program performance
  – Weighted by frequency of phase
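The selection and extrapolation steps above can be sketched as follows. This is a hedged illustration, not the SimPoint tool itself: the function names, the one-dimensional sample points, and the per-interval CPI values are hypothetical. CPI is used as the metric because, with equal-length intervals, whole-program CPI is a plain weighted average of per-interval CPI.

```python
def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def choose_simulation_points(points, labels, centroids):
    """For each phase, pick the interval nearest the centroid and the phase's weight."""
    chosen = {}
    for phase, centroid in enumerate(centroids):
        members = [i for i, label in enumerate(labels) if label == phase]
        if not members:
            continue
        representative = min(members, key=lambda i: sq_dist(points[i], centroid))
        weight = len(members) / len(points)  # fraction of intervals in this phase
        chosen[phase] = (representative, weight)
    return chosen


def estimate_metric(chosen, simulate):
    """Weighted average of a metric measured only on the chosen intervals."""
    return sum(weight * simulate(rep) for rep, weight in chosen.values())


# Hypothetical clustered intervals: four in phase 0, three in phase 1.
points = [[1.0], [2.0], [3.0], [2.5], [10.0], [11.0], [13.0]]
labels = [0, 0, 0, 0, 1, 1, 1]
centroids = [[2.125], [34.0 / 3.0]]

chosen = choose_simulation_points(points, labels, centroids)

# Stand-in for detailed simulation: made-up CPI for the two chosen intervals.
fake_cpi = {1: 0.8, 5: 2.0}
print(chosen)
print(estimate_metric(chosen, lambda interval: fake_cpi[interval]))
```

Only the chosen intervals are passed to `simulate`, which is where the saving in simulation time comes from.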

Page 23:

SimPoint

• Simulate 400 million instructions total

Accurate estimate despite the instruction budget

Page 24:

Why SimPoint Succeeds

• Program behavior varies over time

• SimPoint intelligently chooses which intervals to simulate

• Regularity within program phases allows accurate extrapolation

Page 25:

Online Classification

• Detect phases as the program is running

• Applications
  – Thread scheduling
  – Power management
  – Predicting future phases
• Challenges
  – One pass of input
  – Limited storage
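The two challenges above (a single pass and bounded storage) suggest a classifier along the following lines. This sketch is an assumption, not the method presented in the talk: it keeps a fixed-size table of phase signatures, matches each new interval against the table with a Manhattan-distance threshold, and evicts the least recently matched entry when the table is full. The class name, threshold, and table size are hypothetical.

```python
def manhattan(u, v):
    return sum(abs(a - b) for a, b in zip(u, v))


class OnlinePhaseClassifier:
    """One-pass phase classifier with a bounded signature table (illustrative only)."""

    def __init__(self, threshold, max_phases):
        self.threshold = threshold
        self.max_phases = max_phases
        self.signatures = {}  # phase id -> signature; insertion order tracks recency
        self.next_id = 0

    def classify(self, bbv):
        total = sum(bbv) or 1
        signature = [x / total for x in bbv]  # normalize by interval length

        match = next(
            (pid for pid, stored in self.signatures.items()
             if manhattan(signature, stored) <= self.threshold),
            None,
        )
        if match is not None:
            # Re-insert so this phase becomes the most recently used entry.
            self.signatures[match] = self.signatures.pop(match)
            return match

        if len(self.signatures) >= self.max_phases:
            oldest = next(iter(self.signatures))  # least recently used entry
            del self.signatures[oldest]

        phase_id = self.next_id
        self.next_id += 1
        self.signatures[phase_id] = signature
        return phase_id


# Hypothetical stream of interval BBVs, seen one at a time.
stream = [[2, 1, 1, 2], [2, 1, 1, 2], [0, 5, 0, 1], [2, 1, 1, 2]]
classifier = OnlinePhaseClassifier(threshold=0.2, max_phases=4)
print([classifier.classify(bbv) for bbv in stream])  # [0, 0, 1, 0]
```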

Page 26:

Online Classification

Page 27:

Online Classification

High variance in metrics across full trace

Low variance shows online classification succeeds in finding phases

Page 28:

Conclusions

• Phases are a vital abstraction
  – Performance varies greatly within a program
  – Attributable to different modes of behavior
• Can discover phases automatically
  – Offline: k-means clustering
  – Online
• Code path characterization
  – Strong correlation with actual performance
  – SimPoint exploits this with great success

Page 30:

Outline

• Introduction (motivate)
• Basics (definitions, BBV, BBMatrix)
• Offline Phase Classification
  – SimPoints
• Online Phase Classification
• Conclusions

Page 31:

Limitations of Clustering

Page 32:

Bayesian Information Criterion

• Fit to Gaussians
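For reference, the criterion in its general form is shown below, in the convention where a higher score is better. The slide gives no formula, so this is the textbook definition and not necessarily the exact variant used in the talk. The symbols are assumptions introduced here: \hat{L}(M) is the maximized likelihood of the data under model M (here, a clustering with each cluster fit to a Gaussian), p is the number of free parameters, and n is the number of intervals.

```latex
\mathrm{BIC}(M) = \ln \hat{L}(M) - \frac{p}{2} \ln n
```

The second term penalizes models with more parameters, so adding clusters raises the score only when the fit improves enough to pay for them.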

Page 33:

Self-Modifying Code

[Figure labels: Self-modifying code, Program Phases, 85°]

Page 34:

Learning Phases
