Outline What is a “Soft” Processor What is the NIOS II? Architecture for NIOS II, what are...


Outline

• What is a “Soft” Processor?
• What is the NIOS II?
• Architecture for NIOS II, what are the implications
  ○ TigerSHARC vs. NIOS II
  ○ Pipeline Issues
  ○ Issues related to FIR
• Hardware acceleration, using FPGA logic


What is a “Soft” Processor?

• Processor implemented in VHDL, Verilog, etc., and downloaded onto FPGA hardware
• Can implement many parallel processors on one FPGA
• Can use additional FPGA resources on the same chip that are not part of the processor core

NIOS II is a “Soft” Processor


Why “Soft” Processor?

• Higher level of design reuse
• Reduced obsolescence risk
• Simplified design update or change
• Increased design implementation options
• Lower latency between processor and FPGA components


What is NIOS II?

• Software-defined processor
• The processor core is loaded onto an FPGA
• Programmed using ‘normal’ programming tools (C, asm), not hardware description languages
• Can use the rest of the FPGA hardware for accelerating parts of the code


How Is NIOS II Implemented?

• The custom FPGA logic that interacts with the processor is implemented in Altera Quartus II
• The Avalon Interface bus (common instruction/data bus) is implemented in Quartus II
• The architecture is generated in Quartus II and then used for programming in the Eclipse IDE


NIOS II IDE

Coding is implemented in Eclipse rather than VisualDSP.


The Different NIOS II Cores

There are 3 cores available from Altera:

• NIOS II/e: Economical Core
• NIOS II/s: Standard Core
• NIOS II/f: Fast Core


What’s the Difference between the Cores?

• An LE is equivalent to an 8-to-1 NAND gate + 1 D flip-flop
• An ALM is equivalent to 2 LEs


Comparison of TigerSHARC and NIOS II architecture


TigerSHARC Architecture


NIOS II Architecture

- Thirty-two 32-bit general registers, six 32-bit control registers
- Variable cache size, depending on how much FPGA space you have
- ALU: 32-bit, two inputs to one output; does shifts, logic and arithmetic. The shifter is not a separate unit as in TigerSHARC


Avalon Interface

- Separate address, data and control lines
- Up to 1024-bit data width per transfer; can be set to any width (need not be a power of 2)
- One transfer per clock cycle


NIOS II/f pipeline

• Six stages
• One instruction can be dispatched and/or retired per cycle
• Dynamic branch prediction: 2-bit branch history table (no BTB like in TigerSHARC)


NIOS II/f pipeline

The pipeline stalls for:

• Multi-cycle instructions
• Cache misses
• Data dependencies (2 cycles between calculating and using result)

Mispredicted branch penalty: 3 cycles


Hardware multiply

• Can use different options for multiplier (at the processor design stage)
  - No h/w multiply (saves FPGA gates)
    ○ Speed depends on algorithm
  - Use embedded multipliers (if FPGA has those)
    ○ 1-5 cycles (depends on FPGA)
  - Implement multipliers on FPGA gates
    ○ 11 cycles
• Division: 4-66 cycles on hardware
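
When the core is built without a hardware multiplier, multiplication has to be done in software, which is why its speed depends on the algorithm. A minimal sketch of one common choice, shift-and-add, is shown below. The name soft_mul32 is hypothetical, and this is not claimed to be the routine the NIOS II toolchain actually links in.

```c
#include <stdint.h>

/* Shift-and-add multiply (illustrative sketch, hypothetical name).
 * Loops once per bit of b up to its highest set bit, adding a shifted
 * copy of a whenever the current bit of b is 1.
 * Returns the low 32 bits of a * b (wraps on overflow). */
uint32_t soft_mul32(uint32_t a, uint32_t b)
{
    uint32_t product = 0;

    while (b != 0) {
        if (b & 1u) {
            product += a;   /* this bit of b contributes a << bit_index */
        }
        a <<= 1;            /* next bit of b weighs twice as much */
        b >>= 1;
    }
    return product;
}
```

The loop runs longer for larger values of b, so the cycle count of a software multiply varies with the operands as well as with the algorithm.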


Compare to TigerSHARC

• No support for parallel instructions
• No support for SIMD operations
• Multicycle instructions stall the pipeline

All the above limitations can be overcome by using FPGA space unoccupied by the processor itself


Comparison of NIOS II and TigerSHARC on an FIR Algorithm


Integer FIR algorithm

    int coeff[] = {1, 2, 3, 4, 5, 6, 7, 8};
    int data1[] = {1, 0, 0, 0, 0, 0, 0, 0};
    int output[8];
    int i = 0, j = 0, k = 0;

    for (k = 0; k < 8; k++)
        output[k] = 0;

    for (j = 0; j < 8; j++) {
        for (i = 0; i < 8; i++) {
            output[j] += data1[i] * coeff[7 - i];
        }
    }
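
The loop on this slide accumulates the same 8-term dot product into every output[j], which is enough for timing the inner loop. For comparison, a more conventional FIR formulation, where each output sample uses a sliding window over the input, might look like the sketch below. The function name fir_int and its parameters are hypothetical and are not taken from the slides.

```c
#include <stddef.h>

/* Direct-form integer FIR (illustrative sketch, hypothetical name):
 *   y[n] = sum over k of coeff[k] * x[n - k],  k = 0 .. taps-1
 * Input samples before x[0] are treated as zero. */
void fir_int(const int *x, size_t n_samples,
             const int *coeff, size_t taps, int *y)
{
    for (size_t n = 0; n < n_samples; n++) {
        int acc = 0;
        /* Stop at k == n so that x[n - k] never indexes before x[0]. */
        for (size_t k = 0; k < taps && k <= n; k++) {
            acc += coeff[k] * x[n - k];
        }
        y[n] = acc;
    }
}
```

Feeding this function a unit impulse returns the coefficients in order, which is a quick sanity check for any FIR implementation.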


Speed analysis

0        movi r4,8               i = 8
1 Loop:  ldw  r2,0(r6)           load data
2        ldw  r3,0(r7)           load coefficient
3        addi r4,r4,-1           i--
4        addi r6,r6,4            dataPt++
5        mul  r2,r2,r3           data = data * coeff
6        addi r7,r7,-4           coeffPt--
7        stall                   data stall: waiting for multiplication result
8        add  r5,r5,r2           output += data
9        bne  r4,zero,0x10002a0  loop while i != 0

The branch will mispredict 2 times at the beginning and 1 time at the end of the loop (wasting 3 cycles each time).
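
The "2 at the beginning, 1 at the end" pattern follows from how a 2-bit saturating counter behaves on a loop-closing branch. The sketch below models a single counter, assuming it starts in the "strongly not taken" state; the function name and the initial state are assumptions for illustration, not details taken from the NIOS II documentation.

```c
/* Count mispredictions of a loop-closing branch under a 2-bit
 * saturating counter (illustrative sketch, hypothetical name).
 * Counter values 0 and 1 predict "not taken"; 2 and 3 predict "taken".
 * Assumption: the counter starts at 0 (strongly not taken). */
int loop_branch_mispredictions(int iterations)
{
    int counter = 0;
    int mispredictions = 0;

    for (int i = 0; i < iterations; i++) {
        /* The branch jumps back on every pass except the last one. */
        int taken = (i < iterations - 1);
        int predicted_taken = (counter >= 2);

        if (predicted_taken != taken) {
            mispredictions++;
        }

        /* Saturating update toward the actual outcome. */
        if (taken && counter < 3) {
            counter++;
        } else if (!taken && counter > 0) {
            counter--;
        }
    }
    return mispredictions;
}
```

Under these assumptions any loop with at least three iterations sees three mispredictions, so the penalty is a fixed 3 * 3 = 9 cycles regardless of the number of taps.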


Speed analysis

• 9 cycles per iteration, except the first two (branch predicted not taken) and the last (branch predicted taken): those will be 9+3=12 cycles
• 1 data stall: can be removed by moving the instruction from line 4 to line 7
• Speed: 8 cycles * (N-3) + 11 cycles * 3 = 8*(N-3)+33 cycles
• For 1024-tap FIR: 8201 cycles
• Clock cycle is 3 times longer (200MHz vs 600MHz)
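
The effect of moving the instruction from line 4 into the stall slot can be reproduced with a very small in-order issue model. The sketch below assumes one instruction issues per cycle, that only mul has a delayed result (2 cycles between calculating and using it), and that loads and branches cost nothing extra. These are simplifications chosen to match the slide's counts, and all names are hypothetical.

```c
#include <stddef.h>

#define NUM_REGS 32
#define NONE (-1)

typedef struct {
    int dest;         /* register written, or NONE */
    int src1, src2;   /* registers read, or NONE */
    int extra_delay;  /* extra cycles before dest can be read */
} Instr;

/* In-order, single-issue model: an instruction waits until all of its
 * source registers are ready. Returns total cycles for the sequence. */
int count_cycles(const Instr *prog, size_t n)
{
    int ready[NUM_REGS] = {0};  /* earliest cycle each register is readable */
    int cycle = 0;

    for (size_t i = 0; i < n; i++) {
        int issue = cycle;
        if (prog[i].src1 != NONE && ready[prog[i].src1] > issue)
            issue = ready[prog[i].src1];
        if (prog[i].src2 != NONE && ready[prog[i].src2] > issue)
            issue = ready[prog[i].src2];
        if (prog[i].dest != NONE)
            ready[prog[i].dest] = issue + 1 + prog[i].extra_delay;
        cycle = issue + 1;
    }
    return cycle;
}

/* Loop body in the original order (the stall is implied, not listed). */
static const Instr original_body[] = {
    {2, 6, NONE, 0},   /* ldw  r2,0(r6)  */
    {3, 7, NONE, 0},   /* ldw  r3,0(r7)  */
    {4, 4, NONE, 0},   /* addi r4,r4,-1  */
    {6, 6, NONE, 0},   /* addi r6,r6,4   */
    {2, 2, 3, 2},      /* mul  r2,r2,r3  (assumed 2-cycle result delay) */
    {7, 7, NONE, 0},   /* addi r7,r7,-4  */
    {5, 5, 2, 0},      /* add  r5,r5,r2  */
    {NONE, 4, 0, 0},   /* bne  r4,zero   */
};

/* Same body with "addi r6,r6,4" moved into the former stall slot. */
static const Instr rescheduled_body[] = {
    {2, 6, NONE, 0},   /* ldw  r2,0(r6)  */
    {3, 7, NONE, 0},   /* ldw  r3,0(r7)  */
    {4, 4, NONE, 0},   /* addi r4,r4,-1  */
    {2, 2, 3, 2},      /* mul  r2,r2,r3  */
    {7, 7, NONE, 0},   /* addi r7,r7,-4  */
    {6, 6, NONE, 0},   /* addi r6,r6,4   */
    {5, 5, 2, 0},      /* add  r5,r5,r2  */
    {NONE, 4, 0, 0},   /* bne  r4,zero   */
};

int original_loop_cycles(void)
{
    return count_cycles(original_body,
                        sizeof original_body / sizeof original_body[0]);
}

int rescheduled_loop_cycles(void)
{
    return count_cycles(rescheduled_body,
                        sizeof rescheduled_body / sizeof rescheduled_body[0]);
}
```

With these assumptions the original order takes 9 cycles per iteration and the rescheduled order takes 8, matching the figures above. The model ignores branch misprediction penalties and any result delay on loads.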


Speed comparison

• 8201 NIOS II cycles equivalent to 24603 TigerSHARC cycles

• Lab3 timing:
  - 56000 cycles Debug mode
  - 13000 unoptimized ASM
  - 4000 Optimized ASM

This is worse than the unoptimized assembly from Lab3, but no hardware acceleration was used, so the result is not that bad.
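
The 8201 and 24603 figures can be re-derived with a few lines of C. The function names are hypothetical; the formula is the one from the speed analysis, 8*(N-3) + 11*3, and it only makes sense for N of at least 3.

```c
/* NIOS II cycle estimate for an N-tap FIR loop with the data stall
 * removed: (N - 3) iterations at 8 cycles, plus 3 iterations that
 * each pay the 3-cycle misprediction penalty (8 + 3 = 11 cycles).
 * Hypothetical helper name; valid for taps >= 3. */
long nios2_fir_cycles(long taps)
{
    return 8 * (taps - 3) + 11 * 3;
}

/* Convert NIOS II cycles at 200 MHz into the number of TigerSHARC
 * cycles (at 600 MHz) that elapse in the same wall-clock time. */
long tigersharc_equivalent_cycles(long nios2_cycles)
{
    const long nios2_mhz = 200;
    const long tigersharc_mhz = 600;
    return nios2_cycles * tigersharc_mhz / nios2_mhz;
}
```

For 1024 taps this gives 8201 NIOS II cycles, which corresponds to 24603 TigerSHARC cycles, between the Debug-mode and unoptimized-assembly timings listed above.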


Hardware Acceleration

• The profiling tool in Eclipse can show how long each function takes
• If a function takes too long, it can be sped up by:
  ○ Custom instructions
  ○ Hardware Acceleration
• Hardware Acceleration takes the function and transforms it into FPGA circuitry


Hardware Acceleration

• Can be done using the C2H compiler from Altera
• Trades off logic size for speed-up

Table 1. User Application Results Example

Algorithm                     Speed Increase      System fMAX  System Resource
                              (vs. Nios II CPU)   (MHz)        Increase (1)
Autocorrelation               41.0x               115          124%
Bit Allocation                42.3x               110          152%
Convolution Encoder           13.3x               95           133%
Fast Fourier Transform (FFT)  15.0x               85           208%
High Pass Filter              42.9x               110          181%
Matrix Rotate                 73.6x               95           106%
RGB to CMYK                   41.5x               120          84%
RGB to YIQ                    39.9x               110          158%


Conclusion

• “Soft” processors such as the NIOS II offer another alternative in the embedded system scene.
• The NIOS II offers the advantage of added configurability and customization, which blurs the line between FPGAs and DSPs.

