David Hansen and James Michelussi. Introduction Discrete Fourier Transform (DFT) Fast Fourier...
David Hansen and James Michelussi
Introduction
- Discrete Fourier Transform (DFT)
- Fast Fourier Transform (FFT)
- FFT Algorithm – Applying the Mathematics
- Implementations of DFT and FFT
- Hardware Benchmarks
- Conclusion
DFT
- Builds on the Fourier analysis introduced by Jean Baptiste Joseph Fourier in 1807
- Allows a sampled (discrete), periodic signal to be transformed from the time domain to the frequency domain
- Computed as the correlation between the time-domain signal and N cosine and N sine waves

$$X(k) = \frac{1}{N}\sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi nk/N} = \frac{1}{N}\sum_{n=0}^{N-1} x(n)\left[\cos\left(\frac{2\pi nk}{N}\right) - j\sin\left(\frac{2\pi nk}{N}\right)\right]$$

$$W_N = e^{-j 2\pi / N}$$

- X(k) = DFT frequency signal
- N = number of sample points
- x(n) = time-domain signal
- W_N = twiddle factor
DFT (Walking Speed)
- Why is this important? Where is this used?
  - Allows machines to calculate the frequency-domain representation of a signal
  - Allows signals to be convolved by simply multiplying their transforms together
  - Used in digital spectral analysis for speech, imaging and pattern recognition, as well as signal manipulation using filters
- But the DFT requires N² multiplications!
FFT (Jet Speed)
- J. W. Cooley and J. W. Tukey are given credit for bringing the FFT to the world in the 1960s
- Simply an algorithm for calculating the DFT more efficiently
- Takes advantage of symmetry and periodicity in the twiddle factors, and uses a divide-and-conquer method
  - Symmetry: $W_N^{r+N/2} = -W_N^{r}$
  - Periodicity: $W_N^{r+N} = W_N^{r}$
- Requires only (N/2)log2(N) multiplications!
  - Faster computation times
  - More precise results due to less round-off error
FFT Algorithm
- Several different types of FFT algorithms (Radix-2, Radix-4, DIT and DIF)
- Focus on Radix-2 using the Decimation in Time (DIT) method
  - Breaks down the DFT calculation into a number of 2-point DFTs
  - Each 2-point DFT uses an operation called the Butterfly
  - These groups are then re-combined with another group of two, and so on, for log2(N) stages
  - Using the DIT method, the input time-domain points must be reordered using bit reversal
Butterfly Operation
$$W_N = e^{-j 2\pi / N}$$
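A radix-2 DIT butterfly takes two complex points and one twiddle factor, and produces their sum and difference after a single complex multiply. The sketch below is one common way to write it in C; the function name `butterfly` and the use of `double complex` are assumptions made for illustration, not the authors' implementation.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* One radix-2 decimation-in-time butterfly, updating the pair in place:
     a' = a + w*b
     b' = a - w*b
   where w is the twiddle factor for this pair. */
static void butterfly(double complex *a, double complex *b, double complex w)
{
    assert(a != NULL && b != NULL);
    double complex t = w * (*b);  /* the only complex multiply */
    *b = *a - t;                  /* lower output */
    *a = *a + t;                  /* upper output */
}
```

With w = 1 this reduces to a plain 2-point DFT: the inputs (1 + 2j, 3 − j) become (4 + j, −2 + 3j).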
Bit Reversal
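Bit reversal maps each input index to the index obtained by reading its binary representation backwards. A minimal sketch in C follows; the name `reverse_bits` is hypothetical and only mirrors the role a bit-reversal helper plays in a radix-2 DIT FFT.

```c
#include <assert.h>

/* Reverse the lowest `bits` bits of `index`.
   For an 8-point FFT, bits = 3, so index 1 (001) maps to 4 (100). */
static unsigned reverse_bits(unsigned index, unsigned bits)
{
    unsigned reversed = 0;
    for (unsigned i = 0; i < bits; i++) {
        reversed = (reversed << 1) | (index & 1u);  /* shift in the lowest bit */
        index >>= 1;
    }
    return reversed;
}
```

For 8 points the input order 0, 1, 2, 3, 4, 5, 6, 7 becomes 0, 4, 2, 6, 1, 5, 3, 7.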
8-Point Radix-2 FFT Example
8-Point Radix-2 FFT Example
David Hansen
Implementations of DFT and FFT
DFT Implementation
- Nested For Loop, (N/2)*N Iterations… O(N²)
- 63027.41 Cycles / Sample (123 cycles per inner loop iteration)
- Obvious Inefficiencies, cos and sin math.h functions
- Efficient assembly coding could reduce the inner loop to 3 cycles per iteration (1,536 cycles / sample)
    for (r = 0; r <= samples / 2; r++) {
        float re = 0.0f, im = 0.0f;
        float part = (float)r * -2.0f * PI / (float)samples;
        for (k = 0; k < samples; k++) {
            float theta = part * (float)k;
            re += data_in[k] * cos(theta);
            im += data_in[k] * sin(theta);
        }
        /* re and im now hold the real and imaginary parts of frequency bin r */
    }
C++ FFT Implementation

    void fft_float(unsigned NumSamples, float *RealIn, float *ImagIn,
                   float *RealOut, float *ImagOut)
    {
        for (i = 0; i < NumSamples; i++) {
            // Iterate over the samples and perform the bit-reversal
            j = ReverseBits(i, NumBits);
        }

        BlockEnd = 1;
        // Following loop iterates Log2(NumSamples) times
        for (BlockSize = 2; BlockSize <= NumSamples; BlockSize <<= 1) {
            // Perform Angle Calculations (Using math.h sin/cos)

            // Following 2 loops together iterate NumSamples/2 times
            for (i = 0; i < NumSamples; i += BlockSize) {
                for (j = i, n = 0; n < BlockEnd; j++, n++) {
                    // Perform butterfly calculations
                }
            }
            BlockEnd = BlockSize;
        }
    }
C++ FFT Implementation
- Bit-Reverse For Loop: N iterations
- Nested For Loops
  - First Outer Loop: Log2(N) iterations
    - Made use of sin/cos math.h functions
  - Second Outer Loop: N / BlockSize iterations
  - Inner Loop: BlockSize/2 iterations
- O(N + Log2(N) * N/BlockSize * BlockSize/2) = O(N + N*Log2(N))
- 193.84 Cycles / Sample
Assembly FFT Implementation
- Bit-Reverse Address Generation
  - Hide Bit-Reverse operation inside first and second FFT Stages
- Sin and Cos values stored in a Look-Up-Table
  - 256 Kbyte LUT added to Data1
  - Needed to grow Data1 Memory Space using LDF file
- Interleaved Real and Imaginary Arrays
  - Quad Reads Load 2 Complex Points per Cycle
- Supports the Real FFT for input signals with no Imaginary component
  - 40% Algorithm-based Savings
Assembly FFT Implementation
- Special Butterfly Instruction
  - Can perform addition/subtraction in parallel in one compute block
  - Speeds up the inner-most loop
- VLIW and SIMD Operations
  - Performs simultaneous operations in both compute blocks
  - Loop unrolling and instruction scheduling keep the entire processor busy with instructions
- 11.35 Cycles per Sample
Assembly FFT Implementation

    _BflyLoop:
    q[j2+=4]=r27:26; k5=k5+k9; fr6=r30*r12; fr16=r6-r7;;
    yr3:0=q[j0+=4]; k3=k5 and k4; fr15=r23*r4; fr24=r8+r18, fr26=r8-r18;;
    xr3:0=q[j0+=4]; r5:4=l[k7+k3]; fr7=r31*r13; fr25=r9+r19, fr27=r9-r19;;
    q[j1+=4]=r25:24; fr14=r30*r13; fr17=r14+r15;;
    q[j2+=4]=r27:26; k5=k5+k9; fr6=r2*r4; fr18=r6-r7;;
    yr11:8=q[j0+=4]; k3=k5 and k4; fr15=r31*r12; fr24=r20+r16, fr26=r20-r16;;
    xr11:8=q[j0+=4]; r13:12=l[k7+k3]; fr7=r3*r5; fr25=r21+r17, fr27=r21-r17;;
    q[j1+=4]=r25:24; fr14=r2*r5; fr19=r14+r15;;
    q[j2+=4]=r27:26; k5=k5+k9; fr6=r10*r12; fr16=r6-r7;;
    yr23:20=q[j0+=4]; k3=k5 and k4; fr15=r3*r4; fr24=r28+r18, fr26=r28-r18;;
    xr23:20=q[j0+=4]; r5:4=l[k7+k3]; fr7=r11*r13; fr25=r29+r19, fr27=r29-r19;;
    q[j1+=4]=r25:24; fr14=r10*r13; fr17=r14+r15;;
    q[j2+=4]=r27:26; k5=k5+k9; fr6=r22*r4; fr18=r6-r7;;
    yr31:28=q[j0+=4]; k3=k5 and k4; fr15=r11*r12; fr24=r0+r16, fr26=r0-r16;;
    xr31:28=q[j0+=4]; r13:12=l[k7+k3]; fr7=r23*r5; fr25=r1+r17, fr27=r1-r17;;
    .align_code 4;
    if NLC0E, jump _BflyLoop;
DC FFT Test
[Figure panels: FFT Source Array; FFT Output Magnitude]
Audio FFT Test
[Figure panels: FFT Source Array; FFT Output Magnitude]
1024 Point DFT / FFT Comparison
| Implementation | Cycles per Sample |
|---|---|
| DFT implemented in C | 63,027.41 |
| DFT implemented in Assembly | 1,536 |
| FFT implemented in C | 193.85 |
| FFT implemented in Assembly | 11.35 |
1024 Point Radix-2 FFT Hardware Comparison
| Processor Architecture | Cycles per Sample | Processor Frequency | Execution Time |
|---|---|---|---|
| ADSP-21369 (SHARC) | 8.98 | 400 MHz | 22.99 µs |
| TigerSHARC (website) | 9.16 | 600 MHz | 15.63 µs |
| TigerSHARC (our results) | 11.35 | 600 MHz | 19.37 µs |
| TMS320C6000™ | 14.125 | 350 MHz | 41.33 µs |
| TMS320DM644x™ | 7.59 | 594 MHz | 13.08 µs |
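The execution times in this comparison appear to follow directly from the cycle counts and clock rates for an N = 1024 point transform. The symbols $t_{exec}$, $c$ (cycles per sample) and $f_{clk}$ (processor frequency) are introduced here only for illustration:

$$t_{exec} = \frac{c \cdot N}{f_{clk}}$$

For the TigerSHARC result reported by the authors:

$$t_{exec} = \frac{11.35 \times 1024}{600 \times 10^{6}\ \text{Hz}} = \frac{11622.4}{600 \times 10^{6}}\ \text{s} \approx 19.37\ \mu\text{s}$$

The same formula reproduces the other rows, for example $8.98 \times 1024 / (400 \times 10^{6}) \approx 22.99\ \mu\text{s}$ for the ADSP-21369.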
Conclusion
- The FFT algorithm is very useful when computing the frequency domain on a DSP.
  - The FFT is much faster than a regular DFT algorithm
  - The FFT is more precise because it accumulates fewer round-off errors
  - The timed coding examples further support this claim and demonstrate how to code the algorithm
- The Radix-2 FFT isn't the fastest, but it uses a less complex addressing and twiddle factor routine
- In this case (unlike in school), F is better than D.