FPGA Accelerated 3-D Tomography
Richard Dorrance
Progress Update: 09/07/12
Outline
Introduction to Tomography
Reconstruction Methods
– Analytical
  o Backprojection
  o Filtered Backprojection
– Algebraic
  o Algebraic Reconstruction Technique (ART)
  o Simultaneous Iterative Reconstruction Technique (SIRT)
  o Simultaneous Algebraic Reconstruction Technique (SART)
Modeling Performance of Reconstruction Methods
Future Work
Tomography
Cross-sectional imaging technique using transmission or reflection data from multiple angles
Basis for CAT scan, MRI, PET, SPECT, ET, etc.
Computed Tomography (CT): a form of tomographic reconstruction on computers
Cross-Sections by X-Ray Projections
Project X-ray through biological tissue; measure total absorption of ray by tissue
Projection $P_\theta(t)$ is the Radon transform of object function $f(x,y)$:
$$P_\theta(t) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x,y)\,\delta(x\cos\theta + y\sin\theta - t)\,dx\,dy$$
Total set of projections is called a sinogram
Shepp-Logan Phantom
Standard test image for tomographic reconstructions
Example Image with Projections
Image:
1 3 1
1 2 1
4 1 1
Projections:
Rows: 5, 4, 6
Diagonals: 4, 2, 4, 4, 1
Columns: 6, 6, 3
Anti-diagonals: 1, 4, 7, 2, 1
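The projections of the example image can be recomputed with a short Python sketch. It assumes rays aligned with the rows, columns and the two diagonal directions of the pixel grid, as a discrete stand-in for the Radon integral at four angles. The function name `projections` and the ordering of the diagonal rays are illustrative choices, not taken from the slides.

```python
def projections(img):
    """Sum a square image along rows, diagonals, columns and anti-diagonals."""
    n = len(img)
    rows = [sum(row) for row in img]
    cols = [sum(img[i][j] for i in range(n)) for j in range(n)]
    diag = [0] * (2 * n - 1)  # rays with constant (col - row), bottom-left first
    anti = [0] * (2 * n - 1)  # rays with constant (row + col), top-left first
    for i in range(n):
        for j in range(n):
            diag[j - i + n - 1] += img[i][j]
            anti[i + j] += img[i][j]
    return rows, diag, cols, anti


image = [[1, 3, 1],
         [1, 2, 1],
         [4, 1, 1]]
rows, diag, cols, anti = projections(image)
print(rows, diag, cols, anti)
```

Every projection sums to 15, the total of the image, which is a quick sanity check on any discrete projector.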
CT Reconstruction
Restore image from projection data
Inverse Radon transform
Most common algorithm is filtered backprojection
– “Smear” each projection over image plane
Accuracy of reconstruction depends on the number of detectors and projection angles
Figure panels: Original, 4 Angles, 16 Angles, 64 Angles, 256 Angles
Analytical Reconstruction Methods
(Filtered) Backprojection Pseudo Code:
– Input: sinogram sino(θ, N)
– Output: image img(x,y)
for each θ
    filter sino(θ,:) ; only for FBP
    for each x
        for each y
            n = x*cos(θ) + y*sin(θ)
            img(x,y) = sino(θ,n) + img(x,y)
Backprojection (Step 1)
0 0 0
0 0 0
0 0 0
Backprojection (Step 2)
5 5 5
4 4 4
6 6 6
Backprojection (Step 3)
9 9 6
6 8 8
10 8 10
Backprojection (Step 4)
15 15 9
12 14 11
16 14 13
Backprojection (Step 5)
16 19 16
16 21 13
23 16 14
Backprojection vs. Original
Final Step: normalize image power
– Divide each pixel by θ·N (here 4 angles · 3 = 12)
Backprojection:
1.33 1.58 1.33
1.33 1.75 1.08
1.92 1.33 1.17
Original:
1 3 1
1 2 1
4 1 1
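The five backprojection steps and the final normalization can be reproduced with a few lines of Python. This is a sketch for the 3×3 example only: the four projections are hard-coded from the example image, and the helper names `backproject` and `normalize` are hypothetical.

```python
def backproject(n, rows, diag, cols, anti):
    """Smear each projection over the image plane and accumulate."""
    img = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            img[i][j] = (rows[i]               # row ray through pixel (i, j)
                         + diag[j - i + n - 1]  # diagonal ray
                         + cols[j]              # column ray
                         + anti[i + j])         # anti-diagonal ray
    return img


def normalize(img, num_angles):
    """Divide each pixel by (number of angles) * N."""
    n = len(img)
    return [[v / (num_angles * n) for v in row] for row in img]


rows, diag, cols, anti = [5, 4, 6], [4, 2, 4, 4, 1], [6, 6, 3], [1, 4, 7, 2, 1]
smeared = backproject(3, rows, diag, cols, anti)
result = [[round(v, 2) for v in row] for row in normalize(smeared, 4)]
print(smeared)
print(result)
```

The accumulated image matches Step 5, and the normalized image matches the backprojection shown next to the original. The blur relative to the original is what the filtering step of FBP is meant to reduce.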
Note On Filtering
Figure panels: No Filtering, With Filtering
Filtered Backprojection (Step 1)
0 0 0
0 0 0
0 0 0
Filtered projections:
Rows: 1.22, -0.73, 1.61
Diagonals: 1.06, -0.84, 0.39, 0.61, -1.22
Columns: 1.16, 0.49, 0
Anti-diagonals: -0.55, -0.06, 1.55, -0.84, -0.11
Filtered Backprojection (Step 2)
1.22 1.22 1.22
-0.73 -0.73 -0.73
1.61 1.61 1.61
Filtered Backprojection (Step 3)
1.61 1.83 0
-1.57 -0.34 -0.12
2.67 0.77 2
Filtered Backprojection (Step 4)
0.45 2.32 0
-0.41 0.15 -0.12
3.83 1.26 2
Filtered Backprojection (Step 5)
-0.1 2.26 1.55
-0.47 1.7 -0.96
5.38 0.42 1.89
Filtered Backprojection vs. Original
Filtered backprojection:
-0.1 2.26 1.55
-0.47 1.7 -0.96
5.38 0.42 1.89
Original:
1 3 1
1 2 1
4 1 1
Conventional Algebraic Reconstruction Methods
Problem Formulation
We want to formulate it as a linear inverse problem:
$$Ax = b$$
where $x$ is a column vector of length $N^2$ representing the pixels of the original image, $A$ is an $M \times N^2$ matrix representing the data acquisition process, and $b$ is a column vector of length $M$ representing the measured projection data.
We want to find a solution such that:
$$x = A^{-1}_{\mathrm{left}}\, b$$
Notes on the Discretized Image x
The discretized image is denoted by:
$$X \in \mathbb{R}^{N \times N}$$
and by:
$$x = \mathrm{vec}(X) \in \mathbb{R}^{N^2 \times 1}$$
where $x$ is obtained by stacking the columns of $X$.
Notes on the projection data b
There are a total of d detectors and θ projection angles, so that a total of M = d · θ measurements are used.
Then the measured projection data is denoted by:
$$B \in \mathbb{R}^{d \times \theta}$$
and by:
$$b = \mathrm{vec}(B) \in \mathbb{R}^{d\theta \times 1} = \mathbb{R}^{M \times 1}$$
where $b$ is obtained by stacking the columns of $B$.
Notes on the Acquisition Matrix A
The acquisition of projection data b from x is modeled by:
$$A \in \mathbb{R}^{M \times N^2}$$
where:
$$b_i = \sum_{j=1}^{N^2} a_{i,j}\, x_j, \qquad i = 1, 2, \ldots, M$$
$a_{i,j}$ is the contribution of pixel j to projection i.
Also, let:
$$A_i = \left(A_{i,:}\right)^T$$
be a column vector that represents the ith ray, which computes the value of the ith projection.
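To make the model concrete, here is a small Python sketch that builds a binary acquisition matrix for the 3×3 example image and evaluates b = Ax. It assumes four ray directions (rows, diagonals, columns, anti-diagonals) with weight 1 for every pixel on a ray, so the number of rays per direction varies and M = 16 rather than d · θ with a fixed d. The names `acquisition_matrix`, `vec` and `matvec` are hypothetical.

```python
def acquisition_matrix(n):
    """Binary A, one row per ray: n rows, 2n-1 diagonals, n columns, 2n-1 anti-diagonals."""
    num_rays = 2 * n + 2 * (2 * n - 1)
    A = [[0] * (n * n) for _ in range(num_rays)]
    for i in range(n):          # pixel row
        for j in range(n):      # pixel column
            pixel = j * n + i   # column-stacked index, matching x = vec(X)
            ray_ids = (
                i,                              # row ray
                n + (j - i + n - 1),            # diagonal ray
                n + (2 * n - 1) + j,            # column ray
                2 * n + (2 * n - 1) + (i + j),  # anti-diagonal ray
            )
            for r in ray_ids:
                A[r][pixel] = 1  # a_ij = 1 when pixel j lies on ray i
    return A


def vec(X):
    """Stack the columns of X into one list."""
    n = len(X)
    return [X[i][j] for j in range(n) for i in range(n)]


def matvec(A, x):
    """Dense matrix-vector product b = A x."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


X = [[1, 3, 1],
     [1, 2, 1],
     [4, 1, 1]]
A = acquisition_matrix(3)
b = matvec(A, vec(X))
print(b)
```

Each column of this A contains exactly four ones, because every pixel lies on one ray per direction. Real acquisition matrices use fractional weights such as ray-pixel intersection lengths, which this sketch does not model.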
Iterative Reconstruction Algorithm
Let $x^{(k)}$ denote the kth estimation of the reconstruction.
Then:
$$x^{(k+1)} = x^{(k)} + \lambda A^T \left(b - A x^{(k)}\right)$$
where the relaxation factor λ is a scalar.
Proof of Convergence [1]
Let
$$\Delta = I - \lambda A^T A$$
Then
$$\begin{aligned}
x^{(k+1)} &= \Delta x^{(k)} + \lambda A^T b \\
&= \Delta^2 x^{(k-1)} + (\Delta + I)\,\lambda A^T b \\
&= \Delta^{k+1} x^{(0)} + \left(\Delta^k + \cdots + \Delta + I\right)\lambda A^T b
\end{aligned}$$
Proof of Convergence [2]
If $A^T A$ is positive definite and λ is chosen so that the spectral radius of Δ is less than 1, then:
$$\lim_{k \to \infty} \Delta^{k+1} = 0$$
and
$$\lim_{k \to \infty} \left(\Delta^k + \cdots + \Delta + I\right) = (I - \Delta)^{-1}$$
Proof of Convergence [3]
Therefore:
$$\begin{aligned}
\lim_{k \to \infty} x^{(k)} &= (I - \Delta)^{-1} \lambda A^T b \\
&= \left(A^T A\right)^{-1} A^T b \\
&= A^{-1}_{\mathrm{left}}\, b
\end{aligned}$$
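The update rule and its limit can be checked numerically on a tiny system. The sketch below is an illustration only: the 3×2 matrix, the right-hand side and λ = 0.5 are made-up values chosen so that the spectral radius of Δ is 0.5, and the function name `landweber` is a label for this plain gradient-style iteration rather than a name used in the slides.

```python
def matvec(A, x):
    """Dense matrix-vector product."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def landweber(A, b, lam, iterations):
    """Iterate x <- x + lam * A^T (b - A x), starting from x = 0."""
    At = transpose(A)
    x = [0.0] * len(A[0])
    for _ in range(iterations):
        residual = [bi - yi for bi, yi in zip(b, matvec(A, x))]
        step = matvec(At, residual)
        x = [xi + lam * si for xi, si in zip(x, step)]
    return x


# A^T A = [[2, 1], [1, 2]] has eigenvalues 1 and 3, so with lam = 0.5
# the matrix I - lam * A^T A has eigenvalues 0.5 and -0.5.
A = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]
b = [1.0, 2.0, 3.0]
x = landweber(A, b, lam=0.5, iterations=100)
print(x)
```

Here (A^T A)^{-1} A^T b = [1, 2], and the iterate approaches that vector geometrically. Choosing λ so that the spectral radius is 1 or more (for example λ = 1 for this A) makes the same loop diverge.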
# of Projections needed for ART
Reconstruction on a square grid (N×N) with N detectors
Assuming a circular reconstruction region, we can ignore pixels outside this region:
$$\text{\# of pixels} \approx \frac{\pi N^2}{4}$$
$$N_{ART} = \frac{\text{\# of pixels}}{\text{\# of detectors}} = \frac{\pi N^2 / 4}{N} = \frac{\pi N}{4}$$
# of Projections needed for FBP [1]
Reconstructing region with diameter L
Sampling interval is at least:
$$\Delta\omega = \frac{1}{L}$$
with a maximum frequency of:
$$\omega_{\max} = \frac{1}{2} \cdot \frac{N}{L}$$
Due to polar sampling, the density of samples decreases as we go outward on the polar grid
# of Projections needed for FBP [2]
To ensure a sampling rate of at least Δω everywhere:
$$\omega_{\max} \cdot \frac{\pi}{N_{FBP}} \le \Delta\omega \quad\Longrightarrow\quad \frac{N}{2L} \cdot \frac{\pi}{N_{FBP}} \le \frac{1}{L}$$
therefore:
$$N_{FBP} \ge \frac{\pi N}{2}$$
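As a quick numeric illustration, the two projection-count estimates can be evaluated for a given grid size. The sketch assumes the results N_ART = πN/4 and N_FBP = πN/2 as read from the derivations above; the function names are hypothetical.

```python
import math


def projections_art(n):
    """Projection angles for ART: (pi * n^2 / 4) unknowns over n detectors."""
    return math.pi * n / 4


def projections_fbp(n):
    """Projection angles for FBP from the polar sampling argument."""
    return math.pi * n / 2


for n in (512, 1024, 2048):
    print(n, round(projections_art(n)), round(projections_fbp(n)))
```

Under these estimates FBP needs twice as many projection angles as ART for the same N.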
Matrix Formulation with Normalization
Introduce diagonal matrices V and W:
$$x^{(k+1)} = x^{(k)} + \lambda W A^T V \left(b - A x^{(k)}\right)$$
V: diagonal matrix of the inverse of the row sums
$$V_{i,i} = \frac{1}{\sum_{j=1}^{N^2} a_{i,j}}$$
W: diagonal matrix of the inverse of the column sums
$$W_{j,j} = \frac{1}{\sum_{i=1}^{M} a_{i,j}}$$
Reconstruction Methods
Algebraic Reconstruction Technique
– Update image after each ray is processed
Simultaneous Iterative Reconstruction Technique
– Update image after all rays are processed
Simultaneous Algebraic Reconstruction Technique
– Update image after all rays in a single projection angle are processed
ART
Image update method:
– After each ray is processed
Pseudocode:
for k = 1:K
    for i = 1:M
        $x^{(i+1)} = x^{(i)} + \lambda W A_i V_{i,i} \left(b_i - A_i^T x^{(i)}\right)$
    end
    $x^{(k)} = x^{(i+1)}$
end
ART (Iterations 1-6)
Iteration 1:
1 3 0.83
1 1.83 0.75
4.33 1.25 1
Iteration 2:
1 3.03 1.06
0.97 2 1.03
3.94 0.97 1
Iteration 3:
1 2.99 0.98
1.01 2 0.99
4.02 1.01 1
Iteration 4:
1 3 1
1 2 1
4 1 1
Iteration 5:
1 3 1
1 2 1
4 1 1
Iteration 6:
1 3 1
1 2 1
4 1 1
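A ray-by-ray update for the 3×3 example can be sketched in Python as follows. The assumptions are loud ones: binary ray weights, λ = 1, a starting image equal to the mean pixel value 15/9 ≈ 1.67, rays visited in the order rows, diagonals, columns, anti-diagonals, and the classic Kaczmarz form in which each ray residual is divided by the number of pixels on the ray. The names `make_rays` and `art` are hypothetical.

```python
def make_rays(n):
    """Pixel lists per ray, ordered: rows, diagonals, columns, anti-diagonals."""
    rows = [[(i, j) for j in range(n)] for i in range(n)]
    diag = [[(i, i + d) for i in range(n) if 0 <= i + d < n] for d in range(1 - n, n)]
    cols = [[(i, j) for i in range(n)] for j in range(n)]
    anti = [[(i, s - i) for i in range(n) if 0 <= s - i < n] for s in range(2 * n - 1)]
    return rows + diag + cols + anti


def art(rays, b, n, x0, lam=1.0, sweeps=1):
    """Kaczmarz-style ART: correct the image immediately after each ray."""
    x = [[x0] * n for _ in range(n)]
    for _ in range(sweeps):
        for ray, b_i in zip(rays, b):
            estimate = sum(x[i][j] for i, j in ray)         # A_i^T x
            correction = lam * (b_i - estimate) / len(ray)  # residual / row sum
            for i, j in ray:
                x[i][j] += correction
    return x


rays = make_rays(3)
b = [5, 4, 6, 4, 2, 4, 4, 1, 6, 6, 3, 1, 4, 7, 2, 1]
first = art(rays, b, 3, x0=15 / 9, sweeps=1)
second = art(rays, b, 3, x0=15 / 9, sweeps=2)
print([[round(v, 2) for v in row] for row in first])
print([[round(v, 2) for v in row] for row in second])
```

Under these assumptions the first two sweeps agree with Iterations 1 and 2 listed above to two decimals.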
SIRT
Image update method:
– After all rays are processed
Pseudocode:
for k = 1:K
    $x^{(k+1)} = x^{(k)} + \lambda W A^T V \left(b - A x^{(k)}\right)$
end
SIRT (Iterations 1-6, λ = 0.5)
Iteration 1:
0.67 3.5 0.66
0.83 2.17 0.33
5.83 0.67 0.33
Iteration 2:
0.78 3.43 0.86
0.76 2.08 1.01
4.28 0.85 0.94
Iteration 3:
0.94 3.2 0.91
0.87 2.04 0.99
4.12 0.91 1.02
Iteration 4:
0.97 3.1 0.95
0.94 2.02 1
4.05 0.96 1.01
Iteration 5:
0.99 3.05 0.97
0.97 2.01 1
4.03 0.98 1.01
Iteration 6:
0.99 3.03 0.99
0.98 2.01 1
4.01 0.99 1
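The SIRT update x ← x + λ W Aᵀ V (b − A x) can be sketched in Python for the same 3×3 example. This is an illustration under assumptions: binary ray weights, a zero starting image, and V and W taken over the full set of 16 rays. The slides do not state their starting image or weighting details, so the intermediate values printed here are not expected to match the table above digit for digit. The names `make_rays` and `sirt` are hypothetical.

```python
def make_rays(n):
    """Pixel lists per ray, ordered: rows, diagonals, columns, anti-diagonals."""
    rows = [[(i, j) for j in range(n)] for i in range(n)]
    diag = [[(i, i + d) for i in range(n) if 0 <= i + d < n] for d in range(1 - n, n)]
    cols = [[(i, j) for i in range(n)] for j in range(n)]
    anti = [[(i, s - i) for i in range(n) if 0 <= s - i < n] for s in range(2 * n - 1)]
    return rows + diag + cols + anti


def sirt(rays, b, n, lam=0.5, iterations=6):
    """Update the image once per iteration, after all rays are processed."""
    x = [[0.0] * n for _ in range(n)]
    # W: inverse column sums of binary A = 1 / (number of rays through a pixel)
    hits = [[0] * n for _ in range(n)]
    for ray in rays:
        for i, j in ray:
            hits[i][j] += 1
    for _ in range(iterations):
        back = [[0.0] * n for _ in range(n)]
        for ray, b_i in zip(rays, b):
            # V (b - A x): ray residual divided by the row sum of A
            scaled = (b_i - sum(x[i][j] for i, j in ray)) / len(ray)
            for i, j in ray:  # A^T: spread the scaled residual along the ray
                back[i][j] += scaled
        x = [[x[i][j] + lam * back[i][j] / hits[i][j] for j in range(n)]
             for i in range(n)]
    return x


def max_error(x, truth):
    return max(abs(x[i][j] - truth[i][j]) for i in range(3) for j in range(3))


truth = [[1, 3, 1], [1, 2, 1], [4, 1, 1]]
rays = make_rays(3)
b = [5, 4, 6, 4, 2, 4, 4, 1, 6, 6, 3, 1, 4, 7, 2, 1]
x6 = sirt(rays, b, 3, iterations=6)
x200 = sirt(rays, b, 3, iterations=200)
print([[round(v, 2) for v in row] for row in x6])
print(max_error(x6, truth), max_error(x200, truth))
```

Because the whole image is corrected only once per pass, SIRT typically needs more iterations than the ray-by-ray or angle-by-angle schemes to reach the same error.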
SART
Image update method:
– After all rays in a single projection angle are processed
Pseudocode:
for k = 1:K
    for θ = 1:Θ
        $x^{(\theta+1)} = x^{(\theta)} + \lambda W_\theta A_\theta^T V_\theta \left(b_\theta - A_\theta x^{(\theta)}\right)$
    end
    $x^{(k)} = x^{(\theta+1)}$
end
SART (Step 1, Iteration 1, Theta 1)
Current image:
1.67 1.67 1.67
1.67 1.67 1.67
1.67 1.67 1.67
Computed row projections: 5, 5, 5
SART (Step 2, Iteration 1, Theta 1)
Updated image:
1.67 1.67 1.67
1.33 1.33 1.33
2 2 2
Corrections: (5 - 5)/3 = 0; (4 - 5)/3 = -0.33; (6 - 5)/3 = 0.33
SART (Step 1, Iteration 1, Theta 2)
Current image:
1.67 1.67 1.67
1.33 1.33 1.33
2 2 2
Computed diagonal projections: 2, 3.33, 5, 3, 1.67
SART (Step 2, Iteration 1, Theta 2)
Updated image:
1.33 2.17 1
0.67 1 1.83
4 1.33 1.67
Corrections: (4 - 2)/1 = 2; (2 - 3.33)/2 = -0.67; (4 - 5)/3 = -0.33; (4 - 3)/2 = 0.5; (1 - 1.67)/1 = -0.67
SART (Step 1, Iteration 1, Theta 3)
Current image:
1.33 2.17 1
0.67 1 1.83
4 1.33 1.67
Computed column projections: 6, 4.5, 4.5
SART (Step 2, Iteration 1, Theta 3)
Updated image:
1.33 2.67 0.5
0.67 1.5 1.33
4 1.83 1.17
Corrections: (6 - 6)/3 = 0; (6 - 4.5)/3 = 0.5; (3 - 4.5)/3 = -0.5
SART (Step 1, Iteration 1, Theta 4)
Current image:
1.33 2.67 0.5
0.67 1.5 1.33
4 1.83 1.17
Computed anti-diagonal projections: 1.33, 3.33, 6, 3.17, 1.17
SART (Step 2, Iteration 1, Theta 4)
Updated image:
1 3 0.83
1 1.83 0.75
4.33 1.25 1
Corrections: (1 - 1.33)/1 = -0.33; (4 - 3.33)/2 = 0.33; (7 - 6)/3 = 0.33; (2 - 3.17)/2 = -0.59; (1 - 1.17)/1 = -0.17
SART (Iterations 1-6)
Iteration 1:
1 3 0.83
1 1.83 0.75
4.33 1.25 1
Iteration 2:
1 3.03 1.06
0.97 2 1.03
3.94 0.97 1
Iteration 3:
1 2.99 0.98
1.01 2 0.99
4.02 1.01 1
Iteration 4:
1 3 1
1 2 1
4 1 1
Iteration 5:
1 3 1
1 2 1
4 1 1
Iteration 6:
1 3 1
1 2 1
4 1 1
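The worked SART steps for Iteration 1 can be replayed with a short Python sketch. It assumes binary ray weights, λ = 1, a uniform starting image of 15/9 ≈ 1.67, and the angle order rows, diagonals, columns, anti-diagonals. Within one angle every pixel lies on exactly one ray, so the per-angle column-sum weight is 1 and is left implicit. The names `make_angles` and `sart_sweep` are hypothetical.

```python
def make_angles(n):
    """One list of rays per projection angle: rows, diagonals, columns, anti-diagonals."""
    rows = [[(i, j) for j in range(n)] for i in range(n)]
    diag = [[(i, i + d) for i in range(n) if 0 <= i + d < n] for d in range(1 - n, n)]
    cols = [[(i, j) for i in range(n)] for j in range(n)]
    anti = [[(i, s - i) for i in range(n) if 0 <= s - i < n] for s in range(2 * n - 1)]
    return [rows, diag, cols, anti]


def sart_sweep(x, angles, b, lam=1.0):
    """One pass over all angles; the image changes once per angle."""
    history = []
    for rays, measured in zip(angles, b):
        new_x = [row[:] for row in x]
        for ray, b_i in zip(rays, measured):
            # Residuals use the image from before this angle (simultaneous update)
            correction = lam * (b_i - sum(x[i][j] for i, j in ray)) / len(ray)
            for i, j in ray:
                new_x[i][j] += correction
        x = new_x
        history.append(x)
    return x, history


angles = make_angles(3)
b = [[5, 4, 6], [4, 2, 4, 4, 1], [6, 6, 3], [1, 4, 7, 2, 1]]
start = [[15 / 9] * 3 for _ in range(3)]
final, history = sart_sweep(start, angles, b)
for image in history:
    print([[round(v, 2) for v in row] for row in image])
```

The four printed images correspond to the Step 2 images for Theta 1 through Theta 4, and the last one is the Iteration 1 result.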
Modeling Performance (CPU, GPU, FPGA)
Write C pseudo code for matrix-vector multiplication and vector-vector addition
Convert C pseudo code to application-specific pseudo code (CPU = x86, GPU = OpenCL/CUDA)
Model latency and throughput of pseudo code given:
– CPU architecture:
  o Cache structure, frequency, total # of threads, etc.
– Image reconstruction problem:
  o N, d, θ, A matrix sparsity (α), # of iterations, etc.
C Pseudo Code (Ax = b)
/* b = A*x for a dense, row-major matrix A with M rows and N columns */
void matvec(const float *A, const float *x, float *b, int M, int N)
{
    const float *Apos = A;          /* walks through A row by row */
    for (int i = 0; i < M; i++)
    {
        const float *xpos = x;      /* restart at x[0] for every row */
        float btemp = 0.0f;         /* running dot product for row i */
        for (int j = 0; j < N; j++)
        {
            btemp += (*Apos++) * (*xpos++);
        }
        b[i] = btemp;
    }
}
x86 Pseudo Code (Ax = b)
loop_i:                        ;
    fldz                       ; btemp = 0
    mov   eax, hXXXX           ; j = N
loop_j:                        ;
    fld   dword ptr [edx]      ; A_ij
    add   edx, 4               ; Apos++
    fmul  dword ptr [ecx]      ; A_ij*x_j
    add   ecx, 4               ; xpos++
    faddp st(1), st            ; btemp = btemp + A_ij*x_j
    dec   eax                  ; j--
    jnz   short loop_j         ; loop if j~=0
    fstp  dword ptr [ebx]      ; b_i = btemp (store and pop the FPU stack)
    add   ebx, 4               ; bpos++
    dec   esi                  ; i--
    jnz   short loop_i         ; loop if i~=0
Results for CPUs [1]

| Processor | Xeon E5405 [1] | Xeon E5405 [1] | Xeon E5405 [1] | Xeon E5405 [1] |
|---|---|---|---|---|
| Architecture | Penryn | Penryn | Penryn | Penryn |
| Operating Frequency | 2.00 GHz | 2.00 GHz | 2.00 GHz | 2.00 GHz |
| Number of Cores | 4 | 4 | 4 | 4 |
| Number of Threads per Core | 1 | 1 | 1 | 1 |
| Total Threads Used | 1 | 1 | 1 | 1 |
| Reconstruction Specifics | | | | |
| Number of Pixels (NxN) | 1024x1024 | 1024x1024 | 512x512 | 512x512 |
| Number of Detectors (D) | 1024 | 1024 | 512 | 512 |
| Number of Angles (θ) | 140 | 140 | 140 | 140 |
| Matrix Sparsity (α) | 0.098% | 0.098% | 0.195% | 0.195% |
| Number of Iterations | 30 | 30 | 30 | 30 |
| Loop Unrolling | Yes | Yes | Yes | Yes |
| SIMD or Floating Point? | Floating Point | SIMD | Floating Point | SIMD |
| Reconstruction Time | | | | |
| Reported [s] | 24.174 | 6.639 | 6.087 | 1.650 |
| Estimated [s] | 22.478 | 6.307 | 5.613 | 1.570 |
| Accuracy [%] | 92.982% | 94.987% | 92.214% | 95.180% |

[1] J.I. Agulleiro, E.M. Garzon, I. Garcia, J.J. Fernandez, "Multi-core Desktop Processors Make Possible Real-Time Electron Tomography," 2011 19th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP), pp. 127-132, Feb. 2011.
Results for CPUs [2]

| Processor | Xeon 3.4 [2] | Xeon 3.4 [2] | Xeon 3.4 [2] |
|---|---|---|---|
| Architecture | NetBurst | NetBurst | NetBurst |
| Operating Frequency | 3.40 GHz | 3.40 GHz | 3.40 GHz |
| Number of Cores | 1 | 1 | 1 |
| Number of Threads per Core | 1 | 1 | 1 |
| Total Threads Used | 1 | 1 | 1 |
| Reconstruction Specifics | | | |
| Number of Pixels (NxN) | 2048x2048 | 1024x1024 | 512x512 |
| Number of Detectors (D) | 2048 | 1024 | 512 |
| Number of Angles (θ) | 88 | 88 | 88 |
| Matrix Sparsity (α) | 0.049% | 0.195% | 0.977% |
| Number of Iterations | 10 | 10 | 10 |
| Loop Unrolling | Yes | Yes | Yes |
| SIMD or Floating Point? | Floating Point | Floating Point | Floating Point |
| Reconstruction Time | | | |
| Reported [s] | 4.512 | 2.227 | 1.336 |
| Estimated [s] | 5.488 | 2.558 | 1.509 |
| Accuracy [%] | 121.630% | 114.875% | 112.953% |

[2] D.C. Diez, H. Mueller, A.S. Frangakis, "Implementation and Performance Evaluation of Reconstruction Algorithms on Graphics Processors," Journal of Structural Biology, vol. 157, no. 1, pp. 288-295, Jan. 2007.
Results for CPUs [3]

| Processor | P4 2.40A [2] | P4 2.40A [2] | P4 2.40A [2] |
|---|---|---|---|
| Architecture | Prescott | Prescott | Prescott |
| Operating Frequency | 2.40 GHz | 2.40 GHz | 2.40 GHz |
| Number of Cores | 1 | 1 | 1 |
| Number of Threads per Core | 2 | 2 | 2 |
| Total Threads Used | 2 | 2 | 2 |
| Reconstruction Specifics | | | |
| Number of Pixels (NxN) | 2048x2048 | 1024x1024 | 512x512 |
| Number of Detectors (D) | 2048 | 1024 | 512 |
| Number of Angles (θ) | 88 | 88 | 88 |
| Matrix Sparsity (α) | 0.049% | 0.195% | 0.977% |
| Number of Iterations | 10 | 10 | 10 |
| Loop Unrolling | Yes | Yes | Yes |
| SIMD or Floating Point? | Floating Point | Floating Point | Floating Point |
| Reconstruction Time | | | |
| Reported [s] | 5.449 | 2.637 | 1.609 |
| Estimated [s] | 5.687 | 2.542 | 1.448 |
| Accuracy [%] | 104.355% | 96.401% | 89.958% |
Results for CPUs [4]

| Processor | 2x X5550 [3] | 4x X7460 [3] | 4x X7560 [3] |
|---|---|---|---|
| Architecture | Nehalem | Core | Nehalem |
| Operating Frequency | 2.66 GHz | 2.66 GHz | 2.27 GHz |
| Number of Cores | 4 | 6 | 8 |
| Number of Threads per Core | 2 | 1 | 2 |
| Total Threads Used | 16 | 24 | 64 |
| Reconstruction Specifics | | | |
| Number of Pixels (NxN) | 512x512 | 512x512 | 512x512 |
| Number of Detectors (D) | 512 | 512 | 512 |
| Number of Angles (θ) | 414 | 414 | 414 |
| Matrix Sparsity (α) | 0.391% | 0.391% | 0.391% |
| Number of Iterations | 1 | 1 | 1 |
| Loop Unrolling | No | No | No |
| SIMD or Floating Point? | Floating Point | Floating Point | Floating Point |
| Reconstruction Time | | | |
| Reported [s] | 0.138 | 0.099 | 0.045 |
| Estimated [s] | 0.150 | 0.098 | 0.044 |
| Accuracy [%] | 109.059% | 98.765% | 97.041% |

[3] H.G. Hofmann, B. Keck, C. Rohkohl, J. Hornegger, "Comparing Performance of Many-core CPUs and GPUs for Static and Motion Compensated Reconstruction of C-arm CT Data," Medical Physics, vol. 38, no. 1, pp. 468-473, Jan. 2011.
Future Work
Modeling performance of SART on GPUs and FPGAs