IFRA: Instruction Footprint Recording & Analysis for Post-Silicon Bug Localization. Sung-Boem Park, Subhasish Mitra. Robust Systems Group, Departments of Electrical Eng. & Computer Sc., Stanford University

Page 1

IFRA

Instruction Footprint Recording & Analysis

for Post-Silicon Bug Localization

Sung-Boem Park

Subhasish Mitra

Robust Systems Group

Departments of Electrical Eng. & Computer Sc.

Stanford University

Page 2

Key Message

Post-silicon bug localization – Major bottleneck

Pinpoint from system failure

Bug location, exposing stimulus

Existing schemes – Expensive & not scalable

IFRA – New technique for processors

Eliminates limitations of existing techniques

96% accuracy

1% area, ~0% performance impact


Page 3

Outline

Motivation

IFRA Overview

Simulation Results

Conclusion


Page 4

Microprocessor Development Flow

“Post-silicon cost & complexity is rising faster than design cost”

S. Yerramilli, VP, Intel, ITC06 Invited Address

Pre-Silicon

Post-Silicon

Pre-Silicon Verification

Design

Manufacturing Test

POST-SILICON VALIDATION

Post-Silicon Validation Costs: 35% of Development Time, 25% of Design Resources

Page 5

Post-Silicon Validation Steps

Detect – Run test content in system

e.g., OS, games, functional tests

Localize – Pinpoint from system failure (e.g., crash)

Bug location – e.g., ALU, decoder, scheduler

Exposing stimulus – e.g., instruction sequence

Dominates cost [Josephson DAC06]

Root cause & Fix

Optical probing, patch / circuit edit / respin

Page 6

Post-Silicon Bug Types [Josephson DAC06]

Functional bugs – Incorrect logic implementation

e.g., design errors

Short localization time – e.g., hours to days

Electrical bugs / circuit marginalities

e.g., speed-path, noise, races, hold time

Some voltage / temp / frequency corners

LONG localization time – e.g., days to weeks

Our focus: electrical bugs

Page 7

Existing Post-Silicon Bug Localization Flows

Tester-based

Detect in system

Reproduce failure on tester: 2 days

Localize on tester: 3 days

Not always possible

System-based

Detect in system

Localize failure in system: 1 to 4 weeks

Major problems: failure reproduction, system-level simulation

Page 8

IFRA vs. Existing Techniques

Techniques compared: trace buffers, clock manipulation, checkpoint + replay, scan techniques, IFRA

Criteria: Intrusive? Failure reproduction? System-level simulation? Area impact?

IFRA: not intrusive, no failure reproduction, no system-level simulation, 1% area impact

Page 9

Instruction Footprint Recording & Analysis

Design Phase

Insert recorders inside chip design

Post-Si Validation

Record special info. in recorders / Run tests

Failure detected? (Yes / No)

Scan out recorder contents

Post-analyze offline

Localized Bug: (location, stimulus)

Non-intrusive

No failure reproduction: single test run sufficient

No system simulation: self-consistency against test program binary

Page 10

Outline

Motivation

IFRA Overview

Hardware Support

Automated Post-Analysis Techniques

Simulation Results

Conclusion


Page 11

IFRA Hardware in Superscalar Processor

[Figure: Alpha 21264 pipeline with IFRA hardware added. Stages: FETCH, DECODE, DISPATCH, ISSUE, EXECUTE, COMMIT, separated by pipeline registers. Blocks shown: Branch Predictor, I-Cache, I-TLB, Fetch Queue; Decoders; Reg Rename, Phys Regfile; Instruction Window; 2x Br, 2x ALU, MUL, 2x LSU, D-Cache, D-TLB, FPU; Reorder Buffer, Reg Map; Reg Map, Reg Free.]

IFRA additions: ID assignment, recorders (six shown), post-trigger generator

Part of scan chain

Slow wire: no at-speed routing

Scan chain

Page 12

Recording Operation Example

[Figure: INST1 and INST2 move through FETCH (Branch Predictor, I-Cache, I-TLB, Fetch Queue), ID Assignment, and DECODE (Decoder), separated by pipeline registers. Each instruction is tagged with an ID (ID1, ID2) that travels with it.]

Recorder 1: ID1 + Auxiliary Info: PC1, then ID2 + Auxiliary Info: PC2

Recorder 2: ID1 + Auxiliary Info: Decoded bits1, then ID2 + Auxiliary Info: Decoded bits2

Instruction Footprints

Special ID assignment rule

Page 13

Special Rule for Instruction ID Assignment

Simplistic ID assignment inadequate

Speculation + flushes, out-of-order execution

PC does not work for loops

Special ID assignment rule – formal proof in paper

ID width: $\log_2(4n)$ bits

n = max. instructions in flight

e.g., 8 bits for Alpha-like processor (n=64)

No timestamp or global synchronization required
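The ID width formula on this slide can be checked with a few lines of Python. This is only a sketch: the function name `id_width_bits` is hypothetical, and rounding up for values of n that are not powers of two is an assumption, since the slide gives only the n=64 case.

```python
def id_width_bits(max_in_flight: int) -> int:
    """Bits needed for an instruction ID, following the slide's log2(4n) formula.

    Rounds up when 4n is not a power of two (an assumption; the slide only
    shows n = 64).
    """
    if max_in_flight < 1:
        raise ValueError("max_in_flight must be positive")
    # (x - 1).bit_length() equals ceil(log2(x)) for any integer x >= 1.
    return (4 * max_in_flight - 1).bit_length()


# Alpha-like processor from the slide: n = 64 instructions in flight.
print(id_width_bits(64))  # 8
```

For n = 64 this reproduces the 8-bit ID quoted on the slide.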


Page 14

Instruction Footprint Recorder Design

Dominated by memory

Simple control logic

Idle cycle compaction

Circular buffer control

Serialization

Stop / Start recording

No high-speed global routing

Contents scanned out after failure detection

[Figure: recorder block diagram. Inputs: instruction ID + auxiliary info., post-trigger signal. Blocks: control logic, circular buffer. Output: to slow scan chain.]
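The recorder behavior listed on this slide (circular buffer, idle cycle compaction, stop on post-trigger, scan-out) can be illustrated with a small software model. This is a toy sketch, not the hardware design: the class `FootprintRecorder`, its method names, and the three-entry capacity are all hypothetical.

```python
from collections import deque
from typing import Deque, List, Optional, Tuple

Footprint = Tuple[int, int]  # (instruction ID, auxiliary info)


class FootprintRecorder:
    """Toy software model of one recorder: a circular buffer plus simple control."""

    def __init__(self, capacity: int) -> None:
        # A bounded deque drops its oldest entry when full, like a circular buffer.
        self._buffer: Deque[Footprint] = deque(maxlen=capacity)
        self._recording = True

    def clock(self, footprint: Optional[Footprint]) -> None:
        """Called once per cycle; None models an idle cycle."""
        if not self._recording or footprint is None:
            return  # idle-cycle compaction: idle cycles take no storage
        self._buffer.append(footprint)

    def post_trigger(self) -> None:
        """Stop recording so the buffered history is preserved."""
        self._recording = False

    def scan_out(self) -> List[Footprint]:
        """Return the contents oldest-first, as if shifted out after a failure."""
        return list(self._buffer)


rec = FootprintRecorder(capacity=3)
for fp in [(0, 0x100), None, (1, 0x104), (2, 0x108), None, (3, 0x10C)]:
    rec.clock(fp)
rec.post_trigger()
rec.clock((4, 0x110))  # ignored: recording has stopped
print(rec.scan_out())  # [(1, 260), (2, 264), (3, 268)]
```

The oldest footprint is overwritten once the buffer fills, and nothing recorded after the post-trigger displaces the history leading up to the failure.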

Page 15

What to Record?

Pipeline stage | Auxiliary information | Bits per recorder | Number of recorders
Fetch | PC | 32 | 4
Decode | Decoding results | 4 | 4
Dispatch | 2-bit residue of reg. name | 6 | 4
Issue | 3-bit residue of operands | 6 | 4
Execution (ALU, MUL) | 3-bit residue of result | 3 | 4
Execution (Branch) | None | 0 | 2
Execution (Load/Store unit) | 3-bit residue of result, 32-bit memory address | 35 | 2
Commit | Exceptions | ~0 | 4

Total required storage for all recorders: 60 KBytes
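Several rows of the table record residues rather than full values. A minimal sketch of how a residue supports a consistency check follows. It assumes a k-bit residue means the value modulo 2^k - 1 (modulo 7 for 3 bits), a common choice for residue codes that the slides do not state, and it uses unbounded Python integers, so register wrap-around is not modeled. The function names are hypothetical.

```python
def residue(value: int, bits: int = 3) -> int:
    """Residue of value modulo 2**bits - 1 (assumed modulus, e.g. 7 for 3 bits)."""
    return value % ((1 << bits) - 1)


def add_is_consistent(a: int, b: int, recorded_residue: int, bits: int = 3) -> bool:
    """Check a recorded result residue against the residues of the operands.

    Relies on residue(a + b) == residue(residue(a) + residue(b)).
    """
    expected = residue(residue(a, bits) + residue(b, bits), bits)
    return expected == recorded_residue


good_result = 20 + 22            # 42, residue 0
bad_result = good_result ^ 0b100  # single bit-flip gives 46, residue 4

print(add_is_consistent(20, 22, residue(good_result)))  # True
print(add_is_consistent(20, 22, residue(bad_result)))   # False
```

Under this assumed modulus, a single bit-flip changes the value by a power of two, which is never a multiple of 7, so the residue always changes.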

Page 16

Post-Trigger Generation

[Figure: timeline of code execution starting at t=0. Error after a billion cycles (e.g., speedpath). Failure after 2 billion cycles (e.g., crash).]

Too much storage overhead to store 1 billion cycles

Page 17

Post-Trigger Generation

[Figure: timeline of code execution starting at t=0. Error after a billion cycles (e.g., speedpath). Failure after 2 billion cycles (e.g., crash).]

Need to capture in recorder storage

Early failure detection necessary

Early failure detection techniques (post-triggers)

Classical error detection – residue, parity

Deadlock & segfault detection

Special early warnings to pause recording

Details in paper

Page 18

IFRA Area Impact

1% chip-level area impact

Synopsys Design Compiler synthesis

Alpha 21264-like processor: 2MB L2 cache

TSMC 130nm technology

No global at-speed routing

Area dominated by circular buffers in recorders

Total recorder storage: 60 KBytes

Page 19

Outline

Motivation

IFRA Overview

Hardware Support

Post-Analysis Techniques

Simulation Results

Conclusion


Page 20

Post-Analysis Overview

Link footprints

Test program binary

Footprints from recorders

Run high-level analysis

Run low-level analysis

List of bug location-stimulus pairs

Control-flow analysis

Data-dependency analysis

Decoding analysis

Load/Store analysis

Residue consistency check

(Not covered today – Details in paper)

Page 21

Linking Footprints from Recorder Contents

[Figure: contents of the fetch-stage, execution-stage, and commit-stage recorders over time, linked to the test program binary. Fetch-stage entries pair IDs with PCs (e.g., ID: 0 PC0), other entries pair IDs with auxiliary info (e.g., ID: 0 AUX0), and the binary maps PC0 … PC7 to INST0 … INST7. The same IDs (0, 4, 5, 6, 7) recur within each recorder.]

Special ID assignment rule ensures:

Uncommitted instructions uniquely identified

Relative orders of identical IDs maintained

Even under flushes & out-of-order execution
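A heavily simplified sketch of the linking step is given below. It is not the algorithm from the paper: the special ID assignment rule is not given on the slides, so this version simply assumes every recorder holds the same window of instructions with no flushed entries, and it relies only on the stated property that the relative order of identical IDs is maintained. The function `link_footprints` and the sample data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Entry = Tuple[int, str]   # (instruction ID, auxiliary info)
Key = Tuple[int, int]     # (instruction ID, rank among identical IDs, newest = 0)


def link_footprints(recorders: Dict[str, List[Entry]]) -> Dict[Key, Dict[str, str]]:
    """Group entries from different recorders that belong to one instruction.

    Each recorder list is ordered oldest to newest. Entries match when they
    carry the same ID and the same rank counted from the newest entry.
    """
    linked: Dict[Key, Dict[str, str]] = defaultdict(dict)
    for stage, entries in recorders.items():
        seen: Dict[int, int] = defaultdict(int)
        for inst_id, aux in reversed(entries):  # walk newest first
            rank = seen[inst_id]
            seen[inst_id] += 1
            linked[(inst_id, rank)][stage] = aux
    return dict(linked)


# ID 2 is reused; the execute-stage order differs from the fetch order.
fetch = [(2, "PC0"), (3, "PC1"), (0, "PC2"), (1, "PC3"), (2, "PC4")]
execute = [(2, "AUX0"), (0, "AUX2"), (3, "AUX1"), (2, "AUX4"), (1, "AUX3")]

links = link_footprints({"fetch": fetch, "execute": execute})
print(links[(2, 0)])  # {'fetch': 'PC4', 'execute': 'AUX4'}
print(links[(2, 1)])  # {'fetch': 'PC0', 'execute': 'AUX0'}
```

Handling flushed instructions and recorders that cover different windows requires the actual ID assignment rule, which the slides defer to the paper.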

Page 22

Debug Example

Link footprints

High-level analysis

Low-level analysis

[Figure: decision tree whose nodes are shown as "?", leading to the result below.]

Bug locations + exposing stimulus

Page 23

Debug Example – Decision 1

Test Program Binary

Fetch-stage recorder

Serial execution trace

R0 ← R3 + R6

R5 ← R0 + R6

……

R0 ← R1 + R2

Page 24

Debug Example – Question 1

Serial execution trace

R0 ← R3 + R6

R5 ← R0 + R6

……

R0 ← R1 + R2

RAW hazard

Producer of R0

Consumer of R0

R0=3

Issue-stage recorder

R0=5

Execute-stage recorder

Residue of values mismatch?
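Question 1 can be read as a comparison between two recorded residues for the same register. The sketch below uses the two values shown on the slide (3 and 5). Which recorder holds the producer's residue and which holds the consumer's is an assumption here, and the `RawPair` type and `residue_mismatch` function are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RawPair:
    """Residues recorded for one read-after-write dependency on a register."""
    register: str
    producer_result_residue: int   # assumed to come from the execute-stage recorder
    consumer_operand_residue: int  # assumed to come from the issue-stage recorder


def residue_mismatch(pair: RawPair) -> bool:
    """True when the consumer did not see the value the producer wrote."""
    return pair.producer_result_residue != pair.consumer_operand_residue


pair = RawPair("R0", producer_result_residue=5, consumer_operand_residue=3)
print(residue_mismatch(pair))  # True
```

A mismatch indicates that the value was corrupted or taken from the wrong source somewhere between the producer and the consumer, which is what the later questions narrow down.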

Page 25

Debug Example – Question 2

Serial execution trace

R0 ← R3 + R6

R5 ← R0 + R6

……

R0 ← R1 + R2

RAW hazard

Producer of R0

Consumer of R0

Dispatch-stage recorder

R0=P5

R0=P2

Residue of phys. reg. names mismatch?

Page 26

Debug Example – Question 3

Serial execution trace

R0 ← R3 + R6

R5 ← R0 + R6

……

R0 ← R1 + R2

RAW hazard

Producer of R0

Consumer of R0

Dispatch-stage recorder

R0=P5

R0=P5 (previous producer)

Residue of phys. reg. name match with previous producer?

Page 27

Debug Example – Result

[Figure: dispatch-stage block diagram. Blocks: Arch. Dest. Reg, Pipeline Register, Decoder, Read Circuit, Write Circuit, Reg. Mapping, rest of pipeline reg., rest of modules in dispatch stage. Annotations: bug location, stimulates bug, propagates to failure.]

R0 ← R1 + R2

R0 ← R3 + R6

R5 ← R0 + R6

……

Page 28

Outline

Motivation

IFRA Overview

Simulation Results

Conclusion


Page 29

Experimental Setup

SimpleScalar architectural simulator

Alpha 21264 configuration

Augmented with ~1K error injection points

Error model – single bit-flips

Hard-to-repeat electrical bugs

Both flip-flops & combinational logic

Stimulus

SPECint 2000 benchmarks

Page 30

Experimental Flow

Warm up for a million cycles

Inject error

Any failure detected? (Yes / No)

Short error latency? (Yes / No)

Masked/silent error

Post-analyze

Outcomes: exact localization, localization with candidates, complete miss

100K simulation runs, 800 post-analysis runs

Page 31

IFRA Bug Localization Results

Localization resolution

Bug exposing stimulus

One of 200 erroneous design blocks

Avg. block size: 10K 2-input NAND gates

Correct localization (96%)

Exact localization (78%)

Localization with avg. 6 candidates (22%)

Complete miss (4%)
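The percentages on this slide can be combined into overall figures. The short calculation below assumes the 78% and 22% figures are fractions of the correctly localized cases (they sum to 100%), which the slide layout suggests but does not state. The variable names are illustrative.

```python
correct = 0.96                   # correct localization
exact_given_correct = 0.78       # assumed: share of the correct cases
candidates_given_correct = 0.22  # assumed: share of the correct cases
total_blocks = 200               # design blocks in the localization resolution
avg_candidates = 6               # average candidate blocks when not exact

exact_overall = correct * exact_given_correct
candidates_overall = correct * candidates_given_correct
search_reduction = total_blocks / avg_candidates

print(f"exact localization overall: {exact_overall:.1%}")        # 74.9%
print(f"candidate-list cases overall: {candidates_overall:.1%}")  # 21.1%
print(f"search space reduction: {search_reduction:.1f}x")         # 33.3x
```

Under that reading, roughly three quarters of all analyzed failures are pinned to a single block, and the remaining correct cases narrow 200 blocks down to about 6.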

Page 32

Outline

Motivation

IFRA Overview

Simulation Results

Conclusion


Page 33

Conclusion

IFRA

Inexpensive

1% area, no expensive logic analyzers

No failure reproduction or system simulation

Effective

96% accuracy

Practical

Alpha processor demonstration


Page 34

Acknowledgement

Bob Gottlieb, Intel

Nagib Hakim, Intel

Ted Hong, Stanford University

Doug Josephson, Intel

Onur Mutlu, Microsoft Research

Priyadarshan Patra, Intel

Eric Rentschler, AMD

Jason Stinson, Intel
