Center for Embedded Computer Systems cecs.uci/~spark

Center for Embedded Computer Systems, http://www.cecs.uci.edu/~spark. Dynamic Conditional Branch Balancing during the High-Level Synthesis of Control-Intensive Designs. Sumit Gupta (1), Nikil Dutt (1), Rajesh Gupta (2), Alex Nicolau (1). (1) School of Information and Computer Science, University of California, Irvine. (2) Department of Computer Science and Engineering, University of California, San Diego. Supported by Semiconductor Research Corporation.


Page 1: Center for Embedded Computer Systems cecs.uci/~spark

Center for Embedded Computer Systems

http://www.cecs.uci.edu/~spark

Dynamic Conditional Branch Balancing during the High-Level Synthesis of Control-Intensive Designs

Sumit Gupta (1), Nikil Dutt (1), Rajesh Gupta (2), Alex Nicolau (1)

(1) School of Information and Computer Science, University of California, Irvine

(2) Department of Computer Science and Engineering, University of California, San Diego

Supported by Semiconductor Research Corporation

Page 2: Center for Embedded Computer Systems cecs.uci/~spark

High Level Synthesis: From Behavior to Hardware

[Figure: a behavioral description is translated into a control-data flow graph (an If Node on condition c with true and false branches) and then into hardware consisting of Control, a Data path with an ALU, and Memory.]

x = a + b;
c = a < b;
if (c) then d = e - f;
else g = h + i;
j = d × g;
l = e + x;

Our approach targets descriptions with nested conditionals and loops
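The behavioral example above can be written down as a small operation list to see which operations depend on which. This is only an illustrative sketch: the tuple layout and the block labels ("entry", "then", "else", "join") are assumptions made here, not names from the slides.

```python
# Each operation: (destination, source operands, basic block label).
# The block labels are hypothetical names chosen for this sketch.
OPS = [
    ("x", ("a", "b"), "entry"),  # x = a + b
    ("c", ("a", "b"), "entry"),  # c = a < b
    ("d", ("e", "f"), "then"),   # d = e - f
    ("g", ("h", "i"), "else"),   # g = h + i
    ("j", ("d", "g"), "join"),   # j = d x g
    ("l", ("e", "x"), "join"),   # l = e + x
]


def data_dependencies(ops):
    """Map each destination to the earlier destinations it reads."""
    defined = set()
    deps = {}
    for dest, sources, _block in ops:
        deps[dest] = sorted(s for s in sources if s in defined)
        defined.add(dest)
    return deps


DEPS = data_dependencies(OPS)
```

In this sketch `l` depends only on `x`, so nothing in the conditional holds it back, while `j` needs both `d` and `g`.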

Page 3: Center for Embedded Computer Systems cecs.uci/~spark

Synthesizing Control-Intensive Designs

Programming style and control constructs have tremendous impact on the quality of HLS results
  Operation placement is for programming convenience: not optimized for synthesis

Restructure and duplicate code using Parallelizing Compiler Transformations: Speculative Code Motions

We present heuristics that carefully guide and increase the scope of the speculative code motions, particularly operation duplication (Conditional Speculation)

Page 4: Center for Embedded Computer Systems cecs.uci/~spark

Toolbox Approach to Scheduling

[Figure: Scheduling Framework made of two parts.]
  Heuristics: Scheduling, Code Motion, Dynamic CSE, Loop Transformations
  Transformations Toolbox: Percolation/Trailblazing, Speculative Code Motions, CSE/IVA/Copy Prop, Operation Chaining, Loop Transformations

Scheduling Heuristics employ Code Transformations from Transformations Toolbox

Page 5: Center for Embedded Computer Systems cecs.uci/~spark

Scheduling using Speculative Code Motions

[Figure: control-flow graph with basic blocks BB 0 to BB 9 and addition operations a, b, c, d, shown before and after code motion. Labelled code motions: Speculate, Across If Block. A Resource Allocation legend is shown.]

Page 6: Center for Embedded Computer Systems cecs.uci/~spark

Scheduling using Speculative Code Motions

[Figure: the same control-flow graph (BB 0 to BB 9, addition operations a, b, c, d) before and after code motion. Labelled code motions: Across If Block, Conditional Speculation. A Resource Allocation legend is shown.]

Conditional Speculation duplicates operations into the branches of a conditional block
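A minimal sketch of the duplication step, assuming a design is just a dictionary from basic block names to operation lists. The function name `conditional_speculate` and the block layout are hypothetical, and the sketch deliberately ignores the data-dependency and resource checks a real scheduler would need.

```python
def conditional_speculate(blocks, branch_names, join_name, op):
    """Move `op` out of the join block and duplicate it into every branch."""
    if op not in blocks[join_name]:
        raise ValueError(f"{op!r} is not in block {join_name!r}")
    blocks[join_name].remove(op)
    for name in branch_names:
        blocks[name].append(op)  # one copy per conditional branch
    return blocks


# Hypothetical if-then-else: BB1 and BB2 are the branches, BB3 is the join block.
blocks = {
    "BB1": ["a"],
    "BB2": ["b"],
    "BB3": ["d"],
}
conditional_speculate(blocks, ["BB1", "BB2"], "BB3", "d")
```

After the call, `d` appears once in each branch and no longer in the join block, which is the duplication the slide describes.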

Page 7: Center for Embedded Computer Systems cecs.uci/~spark

Increasing the Scope of Code Motions

[Figure: Original Design and Scheduled Design of an if-then-else (If Node with T and F branches) over basic blocks BB 0 to BB 4, with operations a to e and scheduling steps S0 to S3. The scheduled design contains an Unbalanced Conditional; the Longest Path is marked. A Resource Allocation legend is shown.]

A scheduling step is a set of concurrent operations inside a basic block

A basic block is a sequence of scheduling steps with no control branches/merges between them
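The two definitions map naturally onto a small data model. The sketch below is an assumption about one possible representation (a basic block as a list of steps, a step as a list of operation names); the helper names and the example contents are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class BasicBlock:
    name: str
    # A basic block is a sequence of scheduling steps;
    # each step is a list of operations that execute concurrently.
    steps: list = field(default_factory=list)


def is_unbalanced(branches):
    """A conditional is unbalanced when its branches differ in step count."""
    return len({len(bb.steps) for bb in branches}) > 1


def longest_path_cycles(entry, branches, exit_bb):
    """Steps on the longest path through entry, one branch, and exit."""
    return len(entry.steps) + max(len(bb.steps) for bb in branches) + len(exit_bb.steps)


# Hypothetical scheduled if-then-else.
bb0 = BasicBlock("BB0", [["cond"]])
bb1 = BasicBlock("BB1", [["a"], ["b"]])  # two steps
bb2 = BasicBlock("BB2", [["c", "d"]])    # one step with two concurrent operations
bb3 = BasicBlock("BB3", [["e"]])
```

With these contents the branches have two steps and one step, so the conditional is unbalanced and the longest path takes four steps.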

Page 8: Center for Embedded Computer Systems cecs.uci/~spark

Insert New Scheduling Step in Shorter Branch

[Figure: Original Design and Scheduled Design of the if-then-else over BB 0 to BB 4, with operations a to e and scheduling steps S0 to S3. A Resource Allocation legend is shown.]

Page 9: Center for Embedded Computer Systems cecs.uci/~spark

Insert New Scheduling Step in Shorter Branch

[Figure: Original Design and Scheduled Design of the if-then-else over BB 0 to BB 4, with scheduling steps S0 to S3. In the scheduled design, operation e appears in both conditional branches. A Resource Allocation legend is shown.]

Insert scheduling steps into shorter conditional branch

Enables further code compaction
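Padding the shorter branch can be sketched as follows, assuming each branch is a list of steps and an inserted step starts out empty. The function name `balance_branches` and the example contents are hypothetical; the padding does not lengthen the longest path because no branch grows beyond the longest one.

```python
def balance_branches(branches):
    """Pad every branch with empty steps up to the length of the longest branch.

    branches: dict mapping block name -> list of steps (each a list of ops).
    Returns a dict mapping block name -> number of steps inserted.
    """
    target = max(len(steps) for steps in branches.values())
    inserted = {}
    for name, steps in branches.items():
        missing = target - len(steps)
        steps.extend([] for _ in range(missing))  # fresh empty step each time
        inserted[name] = missing
    return inserted


# Hypothetical unbalanced conditional: BB1 has two steps, BB2 has one.
branches = {"BB1": [["a"], ["b"]], "BB2": [["c", "d"]]}
inserted = balance_branches(branches)
```

The empty step created in `BB2` is the slot into which a later code motion could place a duplicated operation.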

Page 10: Center for Embedded Computer Systems cecs.uci/~spark

Organization of Scheduling Heuristics

Scheduler
  IR Walker: Traverses Design to find next basic block to schedule
  Candidate Provider: Traverses Design to find Candidate Operations to schedule
  Scheduling Heuristic: Chooses one of the Candidate Operations to schedule
  Candidate Mover: Moves, duplicates and schedules chosen Operation
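The four components can be wired together in a toy loop. Everything in this sketch is an assumption for illustration: the design is a dictionary with a block order and a dependency map, operations are treated as freely movable, and alphabetical order stands in for a real priority function.

```python
def ir_walker(design):
    """Yield basic blocks in the order they should be scheduled."""
    yield from design["order"]


def candidate_provider(design, scheduled):
    """Return unscheduled operations whose predecessors are all scheduled."""
    return [op for op, preds in design["deps"].items()
            if op not in scheduled and all(p in scheduled for p in preds)]


def scheduling_heuristic(candidates):
    """Choose one candidate; alphabetical order stands in for a real priority."""
    return min(candidates) if candidates else None


def candidate_mover(schedule, block, op):
    """Place the chosen operation into the block being scheduled."""
    schedule.setdefault(block, []).append(op)


def run_scheduler(design, ops_per_block=1):
    schedule, scheduled = {}, set()
    for block in ir_walker(design):
        for _ in range(ops_per_block):
            op = scheduling_heuristic(candidate_provider(design, scheduled))
            if op is None:
                break
            candidate_mover(schedule, block, op)
            scheduled.add(op)
    return schedule


# Hypothetical design: b depends on a; c is independent.
design = {"order": ["BB0", "BB1"], "deps": {"a": [], "b": ["a"], "c": []}}
schedule = run_scheduler(design, ops_per_block=2)
```

Keeping the components separate is what lets a heuristic be swapped without touching traversal or code motion.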

Page 11: Center for Embedded Computer Systems cecs.uci/~spark

Organization of Scheduling Heuristics

Scheduler: Scheduling Heuristic, Candidate Mover, Candidate Provider, IR Walker

BBDT: Branch Balancing During Traversal

BBDCM: Branch Balancing During Code Motion

Check if BBDCM will Enable Code Motion

Page 12: Center for Embedded Computer Systems cecs.uci/~spark

BBDT: Get Next Step to Schedule

Schedule Design starting from first basic block in Design

On each call, returns next step in current basic block

If last step in basic block is reached
  If current BB is a Branch of a Conditional C
    If basic blocks in other branches of C are scheduled and have more scheduling steps,
      Insert new step in currBB
  Traverse design and get next basic block
  Return first step from next basic block

[Figure: if-then-else (If Node with T and F branches) over basic blocks BB 0 to BB 5, with operations a to f.]
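The traversal rules above can be sketched as a small walker class. This is one reading of the slide, not the SPARK implementation: the class names are hypothetical, every block is assumed to have at least one step, and "other branches are scheduled and have more scheduling steps" is interpreted as "all other branches are scheduled and at least one is longer".

```python
class BasicBlock:
    def __init__(self, name, num_steps, conditional=None):
        self.name = name
        self.steps = [[] for _ in range(num_steps)]
        self.conditional = conditional  # list of all branch blocks of C, or None
        self.scheduled = False


class BBDTWalker:
    def __init__(self, blocks):
        self.blocks = blocks  # basic blocks in scheduling order
        self.bb_index = 0
        self.step_index = -1

    def next_step(self):
        """Return (block, step index) of the next step to schedule, or None."""
        if self.bb_index >= len(self.blocks):
            return None
        curr = self.blocks[self.bb_index]
        if self.step_index + 1 < len(curr.steps):
            self.step_index += 1
            return curr, self.step_index
        # Last step of the current basic block has been reached.
        if curr.conditional:
            others = [bb for bb in curr.conditional if bb is not curr]
            if all(bb.scheduled for bb in others) and any(
                    len(bb.steps) > len(curr.steps) for bb in others):
                curr.steps.append([])  # insert new step in currBB
                self.step_index += 1
                return curr, self.step_index
        curr.scheduled = True
        self.bb_index += 1
        if self.bb_index == len(self.blocks):
            return None
        self.step_index = 0
        return self.blocks[self.bb_index], 0


# Hypothetical conditional: BB2 has two steps, BB3 has one.
bb2, bb3 = BasicBlock("BB2", 2), BasicBlock("BB3", 1)
bb2.conditional = bb3.conditional = [bb2, bb3]
walker = BBDTWalker([bb2, bb3])
trace = []
while (item := walker.next_step()) is not None:
    trace.append((item[0].name, item[1]))
```

In this run the walker hands out a second step for `BB3` that did not exist in the input, because `BB2` was already scheduled with more steps.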

Page 13: Center for Embedded Computer Systems cecs.uci/~spark

BBDT: Insert Scheduling Steps while Getting Next Step to Schedule

Schedule Design starting from first basic block in Design

On each call, returns next step in current basic block

If last step in basic block is reached
  If current BB is a Branch of a Conditional C
    If basic blocks in other branches of C are scheduled and have more scheduling steps,
      Insert new step in currBB
  Traverse design and get next basic block
  Return first step from next basic block

[Figure: if-then-else (If Node with T and F branches) over basic blocks BB 0 to BB 5, with operations a to f. The conditional is marked as an Unbalanced Conditional.]

Page 14: Center for Embedded Computer Systems cecs.uci/~spark

BBDT: Insert Scheduling Steps while Getting Next Step to Schedule

[Figure: if-then-else (If Node with T and F branches) over basic blocks BB 0 to BB 5, with operations a to f.]

Schedule Design starting from first basic block in Design

On each call, returns next step in current basic block

If last step in basic block is reached
  If current BB is a Branch of a Conditional C
    If basic blocks in other branches of C are scheduled and have more scheduling steps,
      Insert new step in currBB
  Traverse design and get next basic block
  Return first step from next basic block

Page 15: Center for Embedded Computer Systems cecs.uci/~spark

Scope of BBDT

[Figure: nested conditionals (If Node 1 and If Node 2, each with T and F branches) over basic blocks BB 0 to BB 7, with operations a to f. Legend: Resource Constraints, Scheduled, Being Scheduled.]

Scheduling order: BB1, BB3 and BB4

After scheduling the step in BB4, BBDT adds one new step in BB3 & BB4 since number of steps in BB1 is larger

Page 16: Center for Embedded Computer Systems cecs.uci/~spark

Scope of BBDT

[Figure: nested conditionals (If Node 1 and If Node 2, each with T and F branches) over basic blocks BB 0 to BB 7, with operations a to f. Legend: Resource Constraints, Scheduled, Being Scheduled.]

Scheduling order: BB1, BB3 and BB4

After scheduling the step in BB4, BBDT adds one new step in BB3 & BB4 since number of steps in BB1 is larger

The new step in BB4 is now scheduled
  Operation f can be duplicated up into BB1, BB3 and BB4

Page 17: Center for Embedded Computer Systems cecs.uci/~spark

Scope of BBDT

Inserts steps (Balances Branches) after scheduling last branch of conditional
  Needs all other branches to be scheduled already
  To get an accurate picture of the number of steps and the resource utilization in all the conditional branches

Continues to schedule the new scheduling step in the last branch

Page 18: Center for Embedded Computer Systems cecs.uci/~spark

Limitations of BBDT

[Figure: nested conditionals (If Node 1 and If Node 2, each with T and F branches) over basic blocks BB 0 to BB 7, with operations a to f. Legend: Resource Constraints, Scheduled, Being Scheduled.]

Scheduling order: BB1, BB3 and BB4

BBDT can add new steps/balance branches after scheduling last branch in conditional

BBDT adds new step in BB3 after scheduling BB4, so we cannot do CS of operation f

Page 19: Center for Embedded Computer Systems cecs.uci/~spark

Limitations of BBDT

[Figure: nested conditionals (If Node 1 and If Node 2, each with T and F branches) over basic blocks BB 0 to BB 7, with operations a to f. Legend: Resource Constraints, Scheduled, Being Scheduled.]

Scheduling order: BB1, BB3 and BB4

BBDT can add new steps/balance branches after scheduling last branch in conditional

BBDT adds new step in BB3 after scheduling BB4, so we cannot do CS of operation f

No backtracking/re-scheduling of basic blocks/branches that have already been scheduled

Page 20: Center for Embedded Computer Systems cecs.uci/~spark

BBDCM: Branch Balancing During Code Motion

Operates while checking if an operation can be conditionally speculated into the scheduling step under consideration

Checks resource utilization of all the basic blocks that the operation will be duplicated in
  Tries to find Idle Resources in all branches of the conditional
  Inserts new step whenever possible if there is no idle resource: This is the Modification to Balance Branches

Page 21: Center for Embedded Computer Systems cecs.uci/~spark

Idle Resources

A resource is said to be Idle in a scheduling step if
  There is no operation scheduled on it in that step
  For multi-cycle resources
    if there is no operation scheduled on previous step(s)
    if there is no operation scheduled on next step(s)

[Figure: if-then-else (If Node with T and F branches) over basic blocks BB 0 to BB 5, with operations a, d and e and a 2 Cycle Multiplier. The Previous Step and Next Step around the step under consideration are marked.]
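One way to turn the idle-resource definition into a check, assuming a resource is described by the set of steps at which operations start on it and that every operation on it takes `latency` steps. The function name and this representation are assumptions for the sketch.

```python
def is_idle(start_steps, step, latency=1):
    """Check whether a resource can accept a new operation starting at `step`.

    start_steps: set of step indices at which operations already start
                 on this resource.
    latency:     number of steps each operation occupies the resource.

    An operation started up to latency-1 steps earlier is still running in
    `step`, and one starting up to latency-1 steps later would overlap the
    new operation, so both the previous and the next step(s) are checked.
    """
    window = range(step - latency + 1, step + latency)
    return all(s not in start_steps for s in window)


# Single-cycle adder: only the step itself matters.
adder_free = is_idle({0}, 1, latency=1)
# 2-cycle multiplier: an operation started in the previous step is still running.
mult_blocked_by_previous = not is_idle({0}, 1, latency=2)
# 2-cycle multiplier: an operation starting in the next step would overlap.
mult_blocked_by_next = not is_idle({3}, 2, latency=2)
```

For `latency=1` the window shrinks to the single step, which matches the first condition on the slide.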

Page 22: Center for Embedded Computer Systems cecs.uci/~spark

BBDCM: Allow Conditional Speculation?

[Figure: nested conditionals (If Node 1 and If Node 2, each with T and F branches) over basic blocks BB 0 to BB 7, with operations a to f. Legend: Resource Constraints, Scheduled, Being Scheduled.]

While considering CS in last conditional BB
  Find Idle Resources in each basic block to duplicate in
  If no idle resource in any BB
    If number of steps in BB ≤ number of steps in any other BB
      Insert New scheduling step

Allow CS?

Page 23: Center for Embedded Computer Systems cecs.uci/~spark

BBDCM: Allow Conditional Speculation?

[Figure: nested conditionals (If Node 1 and If Node 2, each with T and F branches) over basic blocks BB 0 to BB 7, with operations a to f. Legend: Resource Constraints, Scheduled, Being Scheduled.]

While considering CS in last conditional BB
  Find Idle Resources in each basic block to duplicate in
  If no idle resource in any BB
    If number of steps in BB ≤ number of steps in any other BB
      Insert New scheduling step

Allow CS?

Page 24: Center for Embedded Computer Systems cecs.uci/~spark

BBDCM: Allow Conditional Speculation?

[Figure: nested conditionals (If Node 1 and If Node 2, each with T and F branches) over basic blocks BB 0 to BB 7, with operations a to f; operation f is shown duplicated. Legend: Resource Constraints, Scheduled, Being Scheduled.]

While considering CS in last conditional BB
  Find Idle Resources in each basic block to duplicate in
  If no idle resource in any BB
    If number of steps in BB ≤ number of steps in any other BB
      Insert New scheduling step

BBDCM inserts new scheduling steps while applying code motions if it enables Conditional Speculation
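The BBDCM check can be sketched under strong simplifying assumptions: only one resource type is tracked, each branch is a list of per-step usage counts for that resource, and the slide's condition is read literally as "this block has no more steps than at least one other block". The function name `try_conditional_speculation` is hypothetical.

```python
def try_conditional_speculation(branches, capacity):
    """Decide whether an operation can be duplicated into every branch.

    branches: dict name -> list of per-step usage counts for one resource type.
    capacity: number of units of that resource available per step.
    Returns dict name -> step index that receives the copy, or None if
    Conditional Speculation is refused. New steps are only committed when
    every branch can host a copy.
    """
    plan, to_insert = {}, []
    for name, usage in branches.items():
        idle = [i for i, used in enumerate(usage) if used < capacity]
        if idle:
            plan[name] = idle[0]  # reuse an existing idle resource
            continue
        others = [u for n, u in branches.items() if n != name]
        if any(len(usage) <= len(other) for other in others):
            plan[name] = len(usage)  # index of the step to be inserted
            to_insert.append(name)
        else:
            return None  # no idle resource and no room to balance
    for name in to_insert:
        branches[name].append(0)  # insert new scheduling step
    for name, step in plan.items():
        branches[name][step] += 1  # place the duplicated operation
    return plan


# Hypothetical case with one adder: BB1 has an idle second step, BB3 and BB4 are full.
branches = {"BB1": [1, 0], "BB3": [1], "BB4": [1]}
plan = try_conditional_speculation(branches, capacity=1)

# Hypothetical refusal: BB1 is full and already longer than the only other branch.
refused = try_conditional_speculation({"BB1": [1, 1, 1], "BB3": [1]}, capacity=1)
```

In the first case a step is inserted into `BB3` and `BB4` and each branch receives one copy; in the second the speculation is refused and nothing is modified.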

Page 25: Center for Embedded Computer Systems cecs.uci/~spark

Implementation: SPARK High Level Synthesis Framework

C Input => RTL VHDL Output

VHDL => Logic Synthesis Results

Customizable Scheduler

Modular toolbox of transformations

Heuristics select transformations

Branch Balancing Algorithms Integrate with Scheduling Heuristics

Page 26: Center for Embedded Computer Systems cecs.uci/~spark

Experiments: Target Applications

Design         # of Ifs   # of Loops   # Non-Empty Basic Blocks   # of Operations
MPEG-1 pred1   4          2            17                         123
MPEG-1 pred2   11         6            45                         287
GIMP tiler     11         2            35                         150

MPEG-1 Prediction Block

GIMP Image Processing software

Page 27: Center for Embedded Computer Systems cecs.uci/~spark

Experimental Results

[Figure: three bar charts (MPEG-1 Pred1 Function, MPEG-1 Pred2 Function, GIMP Tiler Function) plotting Normalized Values from 0 to 1 for Number of States and Cycles on Longest Path. Configurations compared: All Code Motions except CS; + Conditional Spec (CS); CS+BBDT: Add Steps during Traversal; CS+BBDCM: Add steps during CS; All Code Motions+CS+BBDT+BBDCM.]

Effectiveness of Conditional Speculation is limited without Branch Balancing Algorithms

Inserting Scheduling Steps during Traversal (BBDT) and while applying Conditional Speculation (BBDCM) improves Scheduling Results by 30-40%

Page 28: Center for Embedded Computer Systems cecs.uci/~spark

Conclusions

Two Branch Balancing Techniques to Increase the Effectiveness of Code Motions, specifically Conditional Speculation
  Manage resource utilization of multiple basic blocks
  Insert Scheduling Steps in Unbalanced Conditional Branches dynamically during scheduling

Implemented in comprehensive High-Level Synthesis framework: Synthesizes Behavioral C to RTL VHDL

Demonstrated effectiveness on large industrial applications

With Profiling Information: insert steps in less taken Conditional Branches

Page 29: Center for Embedded Computer Systems cecs.uci/~spark

Recent Related Work

Scheduling designs with conditionals
  Condition Vector List Scheduling [Wakabayashi 89]
  Path Based Scheduling [Camposano 91]
  Symbolic Scheduling [Radivojevic 96]
  WaveSched Scheduler [Lakshminarayana 98]
  Basic Block Control Graph Scheduling [Santos 99]

Early work was on data-intensive DSP algorithms
  Pipelining, Algorithmic transformations

Page 30: Center for Embedded Computer Systems cecs.uci/~spark

Thank You