Explaining issues with DCremoval( )
Common problems to avoid
03-Feb-07 DC removal Lecture 1, M. Smith, ECE, University of Calgary,
Canada
Tackled today
Testing the performance of the CPP version
First assembly version, using I-ALU operations: testing and timing
Details of the code
DCRemoval( )
Not as complex as FIR, but has many of the same requirements
Easier to handle
You will use the same ideas when optimizing FIR over Labs 2 and 3
Two issues: speed and accuracy. Develop suitable tests for the CPP code and check that the various assembly language versions satisfy the same tests
Memory intensive
Addition intensive
Loops for main code
FIFO implemented as circular buffer
Call and return test
Basically, if the code gets here, we probably did not crash the system
I use a cut-and-paste approach to develop code variants. This test is (embarrassingly) useful.
Initially we expect the code to fail to work correctly
If the code works initially, then it is doing so by accident
Use XF_CHECK_EQUAL( )
Expected to fail
NOTE: This test is just a “cut-and-paste” version of the C++ test with three changes of function name
Timing test
Normalized the timing tests to “process the function once”
Need to develop various other routines to make the tests work: a DoNothing loop, and loops that run the C++ and assembly code routines
May not be performing the timing correctly, but gives the initial concepts
Once
10 times
100 times
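The normalization idea above can be sketched on a host PC as follows. This is only an illustration under stated assumptions: `doNothing`, `routineUnderTest`, `nsPerCall` and `normalizedNsPerCall` are hypothetical names, and `std::chrono` stands in for whatever cycle counter the lab framework actually uses.

```cpp
#include <chrono>

// Hypothetical names throughout; a host-side sketch, not the lecture's test code.
volatile int g_calls = 0;   // volatile: the compiler must keep every update

// Baseline "do nothing" body. The volatile update stops the optimizer
// from reducing the baseline loop to nothing at all.
void doNothing() { g_calls = g_calls + 1; }

// Stand-in for the routine being timed (C++ or assembly version).
void routineUnderTest() {
    g_calls = g_calls + 1;
    volatile int work = 0;
    for (int i = 0; i < 128; ++i) work = work + i;
}

// Run f `repeats` times and return the elapsed time per call in nanoseconds.
double nsPerCall(void (*f)(), int repeats) {
    using Clock = std::chrono::steady_clock;
    const Clock::time_point start = Clock::now();
    for (int i = 0; i < repeats; ++i) f();
    const std::chrono::duration<double, std::nano> elapsed = Clock::now() - start;
    return elapsed.count() / repeats;
}

// Time per call with the loop-and-call overhead (the baseline) subtracted.
double normalizedNsPerCall(void (*f)(), int repeats) {
    return nsPerCall(f, repeats) - nsPerCall(doNothing, repeats);
}
```

Calling `normalizedNsPerCall(routineUnderTest, 1)`, then with 10 and 100 repeats, mirrors the "Once / 10 times / 100 times" columns; the three figures should agree if the overhead is being removed correctly.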
Other functions needed to run the test
Do Nothing
Careful: may be optimized to “nothing”
C++ function loop
J-ALU function loop
Use build failure information to determine assembly code function name
Required name for void DCremovalASM_JALU(int *, int *)
_DCremoval_JALU__FPiT1
Proper test run and exit – lib_prog_term
Yellow indicates that there are NO failures but some expected failures
All successes and failures shown in console window
Quick look at the code
void DCremovalASM(int *, int *)
Setting up the static arrays
Defining and then setting pointers
Moving incoming parameters into the FIFO
Summing the FIFO values
Performing (FAST) division
Returning the correct values
Updating the FIFO in preparation for the next time this function is called: discarding the oldest value, and “rippling” the FIFO to make the “newest” FIFO slot empty
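The steps above can be sketched in C++ as follows. This is a minimal sketch, not the course's reference code: the names `DCremovalSketch`, `removeDC`, `leftFifo` and `rightFifo` are hypothetical, and it assumes that "returning the correct values" means writing each input sample minus the FIFO average back through the two pointers.

```cpp
constexpr int BUFFERSIZE = 128;   // matches the 128-entry arrays in the slides
constexpr int SHIFT = 7;          // 2^7 == 128, so >> 7 divides by 128

// Static arrays: zero-initialized and visible in this file only.
static int leftFifo[BUFFERSIZE];
static int rightFifo[BUFFERSIZE];

// Hypothetical helper that handles one channel.
static int removeDC(int* fifo, int sample) {
    fifo[BUFFERSIZE - 1] = sample;            // newest value into the last slot

    int sum = 0;                              // sum the FIFO values
    for (int i = 0; i < BUFFERSIZE; ++i) sum += fifo[i];

    // "Fast" division. Assumes >> on a negative int is an arithmetic shift
    // (guaranteed from C++20, usual on mainstream compilers before that).
    int average = sum >> SHIFT;

    // Ripple the FIFO: discard the oldest value, freeing the newest slot.
    for (int i = 0; i < BUFFERSIZE - 1; ++i) fifo[i] = fifo[i + 1];

    return sample - average;                  // assumed meaning of "DC removed"
}

// Both parameters are passed by reference (int *) and updated in place.
void DCremovalSketch(int* left, int* right) {
    *left = removeDC(leftFifo, *left);
    *right = removeDC(rightFifo, *right);
}
```

With a constant input of 128 on one channel, the output falls by one on each call (127, 126, ...) and reaches 0 on the 128th call, once the FIFO is full of that constant.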
Developing the assembly code
static arrays: “section data1”
1) .section data1;
In later algorithms we will show that using multiple data sections in different parts of TigerSHARC memory allows us to bring in 256 bits of data per cycle
Developing the assembly code
static arrays: “section data1”
2) .align 4;
Later we will use the ability to bring in 4 words (32 bits each) of data at the same time.
Works best when the array starts on a 4 word boundary
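As a rough C++ analogy to the `.align 4` directive, the following sketch requests a 4-word boundary for a static array. It assumes a byte-addressed host with 32-bit words, so four words become 16 bytes; `fifoArray` and `startsOnQuadWordBoundary` are hypothetical names.

```cpp
#include <cstdint>

// Hypothetical array; 16 bytes == four 32-bit words on a byte-addressed host.
alignas(16) static std::int32_t fifoArray[128];

// True when p sits on a 4-word (16-byte) boundary.
bool startsOnQuadWordBoundary(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}
```

Only every fourth element of the array passes the check, which is why a 4-word fetch wants the array itself to start on such a boundary.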
Developing the assembly code
static arrays: “section data1”
3) .var array[128];
The .var syntax allows declaring “word” arrays. There is other syntax for short int and byte arrays
NOTE: reused .align 4 before the next array
Developing the assembly code
static arrays: “section data1”
4) .var array[128];
Array is “static” (known in this file only) as we don’t globalize the name. TRUE or FALSE?
KEY: the switch between data and program memory is “really key”
Define the (250) register names for ease of code maintainability (and marking)
Actual static array declaration
DEFINE pointers into arrays
DEFINE temps
DEFINE Inpars
SET pointers into arrays
Value into FIFO buffer
RISC processor LOAD and STORE architecture: MIPS-like rather than CISC
1. Use the pointer value (came in J4) to read the “left value”, passed in by reference, into a register
2. Now place this value into the last element of the FIFO array (make sure that you are not one element out)
NOTE: BUFFERSIZE - 1 is evaluated BY THE ASSEMBLER and does not happen at run time
Using index with pre-modify offset: J2 is not changed
Perform sum
Hardware loop 1
Set up an index i_J8 to be used as offset into Array. Note how this syntax follows C++
Set up LOOP COUNTER 0
Perform test and jump
Perform sum
Hardware loop 2
Set up LOOP COUNTER 0
Division by 128 is performed by shift (What did C++ do?)
Note that with the I-ALU you can only shift by 1 bit (not a barrel shifter).
Perform test and jump
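A small sketch of the shift-based division, for comparison with what C++ does with `/`. The function names are hypothetical, and the seven one-bit shifts only model the "shift by 1 bit" restriction described above; they are not the lecture's assembly code.

```cpp
// Divide by 128 using seven one-bit arithmetic shifts (2^7 == 128).
// Assumes >> on a negative int is an arithmetic shift (guaranteed from
// C++20, usual on mainstream compilers before that).
int divideBy128ByShifts(int sum) {
    for (int i = 0; i < 7; ++i) sum = sum >> 1;
    return sum;
}

// The C++ operator, which truncates toward zero.
int divideBy128ByOperator(int sum) {
    return sum / 128;
}
```

The two agree for non-negative sums and for exact multiples of 128. For other negative sums they differ by one, because the shift rounds toward minus infinity while `/` rounds toward zero: a sum of -129 gives -2 by shifting and -1 by division. That difference is one reason the accuracy tests should include negative inputs.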
Some obvious multiple instructions. Can they go wrong?
Note: Add occurs whether the jump does or does not occur
Should this be a predicted or non-predicted jump?
One shift too many?
Correcting INPARS and then updating the FIFO buffer
Adjust the INPARS
remember int *
Update FIFO memory using load / store approach
SLOW
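One way to see why the load / store ripple is slow: it moves BUFFERSIZE - 1 values on every call. A circular index avoids that copy by overwriting the oldest slot in place. The sketch below is an illustration only, not the version developed in these slides; `CircularFifo`, `push` and `sum` are hypothetical names.

```cpp
constexpr int BUFFERSIZE = 128;

// Hypothetical circular FIFO: no values are moved when a sample arrives.
struct CircularFifo {
    int data[BUFFERSIZE] = {};   // zero-initialized storage
    int oldest = 0;              // index of the oldest value, overwritten next

    // Replace the oldest value with the newest and advance the index,
    // wrapping back to 0 at the end of the array.
    void push(int sample) {
        data[oldest] = sample;
        oldest = (oldest + 1) % BUFFERSIZE;
    }

    // The sum does not depend on where the FIFO "starts".
    int sum() const {
        int total = 0;
        for (int i = 0; i < BUFFERSIZE; ++i) total += data[i];
        return total;
    }
};
```

The cost moves from copying 127 values per call to one store and one index update, at the price of a wrap-around test on the index.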
Adjust tests for expected success
Run the tests
Examine the timing
In “debug” mode, we are already “beating” the compiler
Questions
Why is C++ slower? Is it doing something that we (in ignorance) don’t know we need to do?
What happens with “release mode”?
Can you explain this 10% change in the results depending on how many tests?
Timing with all the tests
Timing Test only
Tackled today
What are the basic characteristics of a DSP algorithm?
A near perfect “starting” example
DCRemoval( ) has many of the features of the FIR filters used in all the Labs
Testing the performance of the CPP version
First assembly version, using I-ALU operations: testing and timing
Code will be examined in more detail in the next lecture