Explaining issues with DCremoval( )

Common problems to avoid

03-Feb-07 DC removal Lecture 1, M. Smith, ECE, University of Calgary, Canada

Tackled today

Testing the performance of the CPP version
First assembly version, using I-ALU operations: testing and timing
Details of the code

DCRemoval( )

Not as complex as FIR, but many of the same requirements
Easier to handle
You use the same ideas in optimizing FIR over Labs 2 and 3
Two issues: speed and accuracy. Develop suitable tests for the CPP code and check that the various assembly language versions satisfy the same tests

Memory intensive
Addition intensive
Loops for main code
FIFO implemented as circular buffer
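
The "FIFO implemented as circular buffer" callout can be illustrated with a minimal C++ sketch. The names `CircularFifo` and `kFifoSize` are hypothetical and are not taken from the lecture code; the size of 128 is assumed from the 128-entry arrays declared later in the slides.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical names; the lecture's actual code is not reproduced here.
constexpr std::size_t kFifoSize = 128;  // assumed from the 128-entry arrays

struct CircularFifo {
    std::array<int, kFifoSize> data{};  // zero-initialised storage
    std::size_t newest = 0;             // slot that receives the next sample

    // Overwrite the oldest sample; no other element is moved.
    void push(int sample) {
        assert(newest < kFifoSize);     // index invariant
        data[newest] = sample;
        newest = (newest + 1) % kFifoSize;  // wrap around at the end
    }

    // Memory- and addition-intensive part: one load and one add per element.
    long sum() const {
        long total = 0;
        for (int value : data) total += value;
        return total;
    }
};
```

With a circular index, each new sample costs one store and one index update, whatever the buffer length.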

Call and return test

Basically, if the code gets here, we probably did not crash the system.

I use a cut-and-paste approach to develop code variants. This test is (embarrassingly) useful.

Initially we expect the code to fail to work correctly

If the code works initially, then it is doing so by accident

Use XF_CHECK_EQUAL( )

Expected to fail

NOTE: This test is just a “cut-and-paste” version of the C++ test with three changes of function name

Timing test

Normalized the timing tests to “process the function once”
Need to develop various other routines to make the tests work: a DoNothing loop, and loops that run the C++ and assembly code routines
May not be performing the timing correctly, but gives initial concepts

Loop counts timed: once, 10 times, 100 times
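
Normalizing a timing test to "process the function once" can be sketched as follows. This is a host-side illustration only: `std::chrono::steady_clock` stands in for whatever timer the lecture's tests use, and the helper names `nanoseconds_per_call` and `normalized_cost` are hypothetical.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: average time per call, in nanoseconds, over `repeats` calls.
template <typename Func>
double nanoseconds_per_call(Func&& func, int repeats) {
    assert(repeats > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < repeats; ++i) {
        func();
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / repeats;  // normalise to "process the function once"
}

// Subtract the cost of an empty loop so that mainly the function itself is counted.
template <typename Func, typename Baseline>
double normalized_cost(Func&& func, Baseline&& do_nothing, int repeats) {
    return nanoseconds_per_call(func, repeats) -
           nanoseconds_per_call(do_nothing, repeats);
}
```

Calling `nanoseconds_per_call` with repeat counts of 1, 10 and 100 mirrors the three loop counts on the slide; if the per-call figures disagree, that is a hint that the measurement includes fixed overhead.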

Other functions needed to run the test

Do Nothing loop. Careful: may be optimized to “nothing”
C++ function loop
J-ALU function loop
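
One common way to stop a Do Nothing loop from being optimized to "nothing" is to route the loop through a `volatile` variable. The sketch below is an assumption about how this could be done in C++, not the lecture's routine; the name `do_nothing_loop` is hypothetical.

```cpp
#include <cassert>

// Hypothetical name. The volatile counter forces the compiler to keep
// every iteration, even though the loop body does no useful work.
int do_nothing_loop(int repeats) {
    assert(repeats >= 0);
    volatile int iterations = 0;
    for (int i = 0; i < repeats; ++i) {
        iterations = iterations + 1;  // volatile access cannot be removed
    }
    return iterations;
}
```

The volatile read and write add a small cost of their own, so the baseline is an empty loop plus one memory access per pass rather than a truly empty loop.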

Use build failure information to determine assembly code function name

Required name for void DCremovalASM_JALU(int *, int *)

_DCremoval_JALU__FPiT1

Proper test run and exit – lib_prog_term

Yellow indicates that there are NO failures but some expected failures

All successes and failures are shown in the console window

Quick look at the code

void DCremovalASM(int *, int *)

Setting up the static arrays
Defining and then setting pointers
Moving incoming parameters into the FIFO
Summing the FIFO values
Performing (FAST) division
Returning the correct values
Updating the FIFO in preparation for the next time this function is called: discarding the oldest value, and “rippling” the FIFO to make the “newest” FIFO slot empty
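
The steps listed above can be sketched in C++ as follows. This is a sketch under stated assumptions, not the lecture's implementation: the name `DCremovalSketch` is hypothetical, the buffer size of 128 is taken from the later array declarations, and the "correct values" returned are assumed to be the inputs with the FIFO average subtracted.

```cpp
#include <cassert>

// Hypothetical C++ sketch of the listed steps; names and details are assumptions.
constexpr int kBufferSize = 128;

void DCremovalSketch(int* left, int* right) {
    assert(left != nullptr && right != nullptr);
    // Static arrays keep their contents between calls (zero-initialised).
    static int left_fifo[kBufferSize];
    static int right_fifo[kBufferSize];

    // Move the incoming values into the newest (last) FIFO slot.
    left_fifo[kBufferSize - 1] = *left;
    right_fifo[kBufferSize - 1] = *right;

    // Sum the FIFO values.
    long left_sum = 0;
    long right_sum = 0;
    for (int i = 0; i < kBufferSize; ++i) {
        left_sum += left_fifo[i];
        right_sum += right_fifo[i];
    }

    // Divide by the buffer size to estimate the DC level, then return
    // the inputs with that estimate removed (assumed behaviour).
    *left -= static_cast<int>(left_sum / kBufferSize);
    *right -= static_cast<int>(right_sum / kBufferSize);

    // Ripple the FIFO: discard the oldest value and free the newest slot.
    for (int i = 0; i < kBufferSize - 1; ++i) {
        left_fifo[i] = left_fifo[i + 1];
        right_fifo[i] = right_fifo[i + 1];
    }
}
```

The ripple loop moves 127 values per channel on every call, which is why the FIFO update is the obvious target for later optimization.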

Developing the assembly code: static arrays, “section data1”

1) .section data1;

In later algorithms we will show that using multiple data sections in different parts of TigerSHARC memory allows us to bring in 256 bits of data per cycle

2) .align 4;

Later we will use the ability to bring in 4 words (32 bits each) of data at the same time.

This works best when the array starts on a 4-word boundary
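
As a C++ analogue of the alignment requirement (not TigerSHARC assembler syntax), `alignas` can request a boundary of four 32-bit words. The sketch assumes a byte-addressed host, where four words are 16 bytes; the names `aligned_array`, `starts_on_quad_word_boundary` and `quad_word_group` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// C++ analogue only: alignas(16) requests a 16-byte boundary, i.e. four
// 32-bit words on a byte-addressed host.
alignas(16) std::int32_t aligned_array[128];

// True when the address sits on a boundary of four 32-bit words.
bool starts_on_quad_word_boundary(const void* address) {
    return reinterpret_cast<std::uintptr_t>(address) % 16 == 0;
}

// Pointer to the n-th group of four words; each group is itself aligned.
const std::int32_t* quad_word_group(std::size_t n) {
    assert(n < 128 / 4);
    return &aligned_array[4 * n];
}
```

Because the array starts on a four-word boundary, every group of four consecutive words starting at a multiple-of-four index is aligned as well.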

3) .var array[128];

The .var syntax declares “word” arrays. Other syntax is used for short int and byte arrays.

NOTE: .align 4 is reused before the next array

4) .var array[128];

The array is “static” (known in this file only) because we don’t globalize the name. TRUE or FALSE?

KEY: the switch between data and program memory is “really key”

Define the (250) register names for code maintainability (and marking) ease

Actual static array declaration
DEFINE pointers into arrays
DEFINE temps
DEFINE Inpars
SET pointers into arrays

Value into FIFO buffer

RISC processor LOAD and STORE architecture (MIPS-like rather than CISC):
1. Use the pointer value (which came in J4) to read the “left value”, passed in by reference, into a register
2. Now place this value into the last element of the FIFO array (make sure that you are not one element out)

NOTE: BUFFERSIZE - 1 is converted BY THE ASSEMBLER and does not happen at run time
Using index with pre-modify offset: J2 is not changed
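
The two steps and the NOTE can be mirrored in C++. In this sketch the value 128 for `BUFFERSIZE` is an assumption based on the 128-entry arrays, and the function name `store_newest` is hypothetical; `fifo_base` plays the role of the unchanged base pointer.

```cpp
#include <cassert>

constexpr int BUFFERSIZE = 128;  // assumed value, matching the 128-entry arrays

// Store the incoming sample in the last FIFO slot without moving the pointer.
void store_newest(int* fifo_base, const int* left_in) {
    assert(fifo_base != nullptr && left_in != nullptr);
    const int left_value = *left_in;      // step 1: load via the pointer
    constexpr int last = BUFFERSIZE - 1;  // folded at build time, not at run time
    static_assert(last == 127, "last valid index, not BUFFERSIZE");
    fifo_base[last] = left_value;         // step 2: base + offset store; base unchanged
}
```

The `static_assert` guards against the "one element out" mistake: index `BUFFERSIZE` would be one past the end of the array.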

Perform sum

Hardware loop 1
Set up an index i_J8 to be used as an offset into the array. Note how this syntax follows C++
Set up LOOP COUNTER 0
Perform test and jump

Perform sum

Hardware loop 2
Set up LOOP COUNTER 0
Division by 128 is performed by shift (what did C++ do?)
Note that with the I-ALU you can only shift by 1 bit (not a barrel shifter).
Perform test and jump
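
Division by 128 with single-bit shifts can be sketched in C++ as a loop of seven shifts. The function name `divide_by_128_with_shifts` is hypothetical, and the sketch assumes that `>>` on a negative `int` is an arithmetic shift (guaranteed from C++20, implementation-defined in earlier standards).

```cpp
#include <cassert>

// Divide by 128 using seven single-bit shifts, as an ALU without a barrel
// shifter would. Assumes >> on a negative int is an arithmetic shift
// (guaranteed from C++20, implementation-defined before that).
int divide_by_128_with_shifts(int sum) {
    constexpr int kShiftCount = 7;  // 2^7 == 128
    static_assert((1 << kShiftCount) == 128, "seven shifts, not eight");
    for (int i = 0; i < kShiftCount; ++i) {
        sum = sum >> 1;  // one bit per pass
    }
    return sum;
}
```

One point relevant to "what did C++ do?": C++ integer division truncates toward zero, while an arithmetic shift rounds toward negative infinity, so the two can differ for negative sums (for example, `-1 / 128` is 0 but shifting -1 right leaves -1). Tests that compare the C++ and assembly versions need to allow for this or use inputs where both agree.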

Some obvious multiple instructions. Can they go wrong?

Note: the add occurs whether or not the jump occurs
Should this be a predicted or non-predicted jump?
One shift too many?

Correcting INPARS and then updating the FIFO buffer

Adjust the INPARS (remember: int *)
Update FIFO memory using load / store approach: SLOW

Adjust tests for expected success

Run the tests

Examine the timing

In “debug” mode, we are already “beating” the compiler

Questions
Why is C++ slower? Is it doing something that we (in ignorance) don’t know we need to do?
What happens with “release mode”?

Can you explain this 10% change in the results depending on how many tests?

Timing with all the tests

Timing Test only

Tackled today

What are the basic characteristics of a DSP algorithm?
A near perfect “starting” example
DCRemoval( ) has many of the features of the FIR filters used in all the Labs
Testing the performance of the CPP version
First assembly version, using I-ALU operations: testing and timing
Code will be examined in more detail in the next lecture