Member of the Helmholtz Association
Computation of Mutual Information Metric for Image Registration on Multiple GPUs
Andrew V. Adinetz¹, Markus Axer², Marcel Huysegoms², Stefan Köhnen², Jiri Kraus³, Dirk Pleiter¹
March 26, 2014
¹ JSC, Forschungszentrum Jülich; ² INM-1, Forschungszentrum Jülich; ³ NVIDIA GmbH
Presented at the HeteroPar'13 workshop of Euro-Par'13
Outline
• Brain Image Registration
• Multi-GPU Implementation
  • system memory
  • listupdate
• Performance Evaluation
• Conclusion
GPU Technology Conference 2014, March 26, 2014
Preparation of the brain
Pushing the limits for a cellular brain model
BigBrain: first high-resolution brain model at microscopic scale
• 7404 histological sections stained for cell bodies
• scanned with a flatbed scanner
• original resolution 10 × 10 × 20 μm³ (11,000 × 13,000 pixels)
• downscaling to 20 μm isotropic
• removal of artifacts
• 1 terabyte
In cooperation with Alan Evans, McGill, Montreal
Amunts et al. (2013) Science
Image Registration
• Registration = process of image alignment
[Figure: ITK workflow]
Mutual Information Metric

$$MI(I_f, I_m) = \sum_{i,j} p(i,j) \, \log_2 \frac{p(i,j)}{p_f(i) \, p_m(j)}$$

$$p_f(i) = \sum_{j} p(i,j), \qquad p_m(j) = \sum_{i} p(i,j)$$

• i, j: pixel values (0 .. 255)
• successful for multi-modal registration
Two Image Cross-Histogram
• main computational kernel
• transform can be complex (1000+ parameters)
• GPU implementation: 1 pixel/thread, atomics

    for (int y = 0; y < fixed_sz_y; y++)
      for (int x = 0; x < fixed_sz_x; x++) {
        int i = bin(fixed[y][x]);                  // bin of the fixed-image pixel
        float x1 = transform_x(x, y);              // position in the moving image
        float y1 = transform_y(x, y);
        int j = bin(interpolate(moving, x1, y1));  // bin of the interpolated moving pixel
        histogram[i][j]++;                         // atomic on GPU
      }
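The loop above leaves `bin`, the transform and `interpolate` unspecified. A minimal CPU reference, under several assumptions of this example, could look as follows: 8-bit images whose pixel value is used directly as the bin, an affine transform standing in for the possibly much more complex one, bilinear interpolation, and pixels that map outside the moving image being skipped. The names `affine_t` and `cross_histogram` are hypothetical.

```c
#include <stddef.h>

#define NUM_BINS 256 /* assumed: 8-bit pixels, value == bin */

/* Hypothetical affine transform: x1 = a*x + b*y + tx, y1 = c*x + d*y + ty. */
typedef struct { float a, b, c, d, tx, ty; } affine_t;

/* Bilinear interpolation. Returns 0 (and leaves *out untouched) when (x, y)
 * lies outside the image, 1 otherwise. */
static int interpolate(const unsigned char *img, int w, int h,
                       float x, float y, float *out)
{
    if (x < 0.0f || y < 0.0f || x > (float)(w - 1) || y > (float)(h - 1))
        return 0;
    int x0 = (int)x, y0 = (int)y;      /* truncation equals floor for x, y >= 0 */
    int x1 = x0 + 1 < w ? x0 + 1 : x0; /* clamp at the right/bottom border */
    int y1 = y0 + 1 < h ? y0 + 1 : y0;
    float fx = x - (float)x0, fy = y - (float)y0;
    float top = (1.0f - fx) * img[y0 * w + x0] + fx * img[y0 * w + x1];
    float bot = (1.0f - fx) * img[y1 * w + x0] + fx * img[y1 * w + x1];
    *out = (1.0f - fy) * top + fy * bot;
    return 1;
}

/* Accumulates the cross-histogram into hist (NUM_BINS * NUM_BINS counters,
 * zeroed by the caller). Returns the number of pixels that were counted. */
size_t cross_histogram(const unsigned char *fixed, int fw, int fh,
                       const unsigned char *moving, int mw, int mh,
                       const affine_t *t, unsigned int *hist)
{
    size_t counted = 0;
    for (int y = 0; y < fh; y++)
        for (int x = 0; x < fw; x++) {
            int i = fixed[(size_t)y * fw + x];
            float x1 = t->a * x + t->b * y + t->tx;
            float y1 = t->c * x + t->d * y + t->ty;
            float v;
            if (!interpolate(moving, mw, mh, x1, y1, &v))
                continue; /* maps outside the moving image */
            int j = (int)(v + 0.5f); /* round to the nearest bin */
            if (j > NUM_BINS - 1)
                j = NUM_BINS - 1;
            hist[(size_t)i * NUM_BINS + j]++; /* would be an atomic add on a GPU */
            counted++;
        }
    return counted;
}
```

With the identity transform and identical images, all counts land on the diagonal of the histogram.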
Large Data Size
• Large-area Polarimeter: size 3,000 × 3,000 px; pixel size 60 × 60 µm; file size 30 MB
• Polarizing Microscope: size 100,000 × 100,000 px; pixel size 1.6 × 1.6 µm; file size 40 GB
Need multiple GPUs!
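A back-of-the-envelope check of why one GPU is not enough can be sketched as below. The 4 bytes per pixel are an assumption of this example (it happens to reproduce the 40 GB figure), and the 6 GB per device is the GPU memory size listed on the Test Hardware slide. The helper names are hypothetical.

```c
#include <stdint.h>

/* Bytes needed for a width x height image. */
uint64_t image_bytes(uint64_t width, uint64_t height, uint64_t bytes_per_pixel)
{
    return width * height * bytes_per_pixel;
}

/* Smallest number of devices whose combined memory holds total_bytes. */
uint64_t min_devices(uint64_t total_bytes, uint64_t bytes_per_device)
{
    return (total_bytes + bytes_per_device - 1) / bytes_per_device;
}
```

Under these assumptions a single 100,000 × 100,000 px image occupies 40 × 10⁹ bytes, so at least 7 devices with 6 GB each would be needed for one image alone, before counting the second image and any buffers.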
Multi-GPU Mutual Information
• Domain decomposition
  • distribute fixed and moving images
  • histogram contributions summed up
• Moving image: how to handle?
  • irregular access pattern
• Approaches
  • System memory replication (sysmem)
  • Listupdate (listupdate)
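The domain decomposition can be sketched on the host side as follows. Splitting the image into horizontal strips of rows is an assumption of this example (the slides do not say how the images are partitioned), and `strip_bounds` and `sum_histograms` are hypothetical helper names. In a real multi-GPU run the summation would be a reduction across devices or ranks.

```c
#include <stddef.h>

/* Rows [*begin, *end) owned by `rank` out of `nranks` when `height` rows are
 * split into contiguous strips; leftover rows go to the first ranks. */
void strip_bounds(int height, int nranks, int rank, int *begin, int *end)
{
    int base = height / nranks;
    int extra = height % nranks;
    *begin = rank * base + (rank < extra ? rank : extra);
    *end = *begin + base + (rank < extra ? 1 : 0);
}

/* Sums per-rank partial histograms (stored back to back, nbins counters each)
 * into total. */
void sum_histograms(const unsigned int *partial, int nranks, size_t nbins,
                    unsigned long long *total)
{
    for (size_t k = 0; k < nbins; k++) {
        unsigned long long s = 0;
        for (int r = 0; r < nranks; r++)
            s += partial[(size_t)r * nbins + k];
        total[k] = s;
    }
}
```

Because histogram counts are additive, each device can process its own part of the fixed image independently; only the moving-image accesses cross strip boundaries.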
System Memory Replication
• Replicate entire moving image in pinned host RAM
  • accessible to GPU
+ easy to implement
– system memory accesses are slower
– cannot use texture interpolation
• Optimizations
  • moving image halo in GPU RAM
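The halo optimization can be pictured as a per-access decision: rows inside the device's own strip or its halo are read from GPU RAM, and everything else falls back to the replicated copy in pinned host memory. The sketch below shows only that decision, in plain C. The row-strip layout, the interpretation of the halo size as a percentage of the strip height, and the names `strip_t`, `classify_row` and `halo_rows_from_percent` are assumptions of this example.

```c
/* Where a moving-image row would be read from. */
typedef enum { SRC_DEVICE, SRC_SYSMEM, SRC_OUTSIDE } source_t;

typedef struct {
    int height;    /* rows in the full moving image           */
    int own_begin; /* first row of this device's strip        */
    int own_end;   /* one past the last row of the strip      */
    int halo_rows; /* extra rows kept in GPU RAM on each side */
} strip_t;

/* Halo size derived from a percentage of the strip height (assumption). */
int halo_rows_from_percent(int own_rows, int percent)
{
    return own_rows * percent / 100;
}

/* Classifies an access to moving-image row `row`. */
source_t classify_row(const strip_t *s, int row)
{
    if (row < 0 || row >= s->height)
        return SRC_OUTSIDE; /* not part of the image at all */
    int lo = s->own_begin - s->halo_rows;
    int hi = s->own_end + s->halo_rows;
    if (lo < 0)
        lo = 0;
    if (hi > s->height)
        hi = s->height;
    return (row >= lo && row < hi) ? SRC_DEVICE  /* strip or halo: GPU RAM  */
                                   : SRC_SYSMEM; /* replicated copy in host RAM */
}
```

A larger halo moves more accesses into the `SRC_DEVICE` case at the cost of additional GPU memory per device.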
Listupdate
• On remote access
  • "send message"
• "On receiving message"
  • compute contributions
• Active messaging variant
  • buffering
  • relies on undocumented features
• Listupdate
  • chunking
  • buffer size bounded
  • communication-computation overlap

    typedef struct {
      float movingCoords[2];   // position in the moving image
      short destRank;          // destination rank
      unsigned char fixedBin;  // bin of the fixed-image pixel; unsigned so that 0 .. 255 fits
    } message_t;
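The message flow can be sketched with a bounded buffer on the sending side and a handler on the receiving side. This is a sequential CPU illustration under assumptions of this example: the names `msg_buffer_t`, `push_message` and `handle_messages` are hypothetical, the receiver owns a strip of rows starting at `row_offset`, and a nearest-neighbour lookup stands in for the interpolation. The struct uses `unsigned char` for the bin so that values 0 .. 255 fit.

```c
#include <stddef.h>

#define NUM_BINS 256 /* assumed: one bin per pixel value */

typedef struct {
    float movingCoords[2];
    short destRank;
    unsigned char fixedBin;
} message_t;

/* Bounded message buffer (hypothetical helper type). */
typedef struct {
    message_t *data;
    size_t count;
    size_t capacity;
} msg_buffer_t;

/* "Send message": returns 1 if stored, 0 if the buffer is full and has to be
 * exchanged before more pixels can be processed. */
int push_message(msg_buffer_t *buf, float x, float y, short dest,
                 unsigned char bin)
{
    if (buf->count == buf->capacity)
        return 0;
    message_t *m = &buf->data[buf->count++];
    m->movingCoords[0] = x;
    m->movingCoords[1] = y;
    m->destRank = dest;
    m->fixedBin = bin;
    return 1;
}

/* "On receiving message": the owner of the moving-image strip computes the
 * histogram contribution on behalf of the sender. */
void handle_messages(const msg_buffer_t *buf, const unsigned char *strip,
                     int width, int row_offset, int rows, unsigned int *hist)
{
    for (size_t k = 0; k < buf->count; k++) {
        const message_t *m = &buf->data[k];
        float fx = m->movingCoords[0] + 0.5f; /* round to nearest pixel */
        float fy = m->movingCoords[1] + 0.5f - (float)row_offset;
        if (fx < 0.0f || fy < 0.0f)
            continue;
        int x = (int)fx, y = (int)fy;
        if (x >= width || y >= rows)
            continue; /* not in this strip */
        unsigned char j = strip[(size_t)y * width + x];
        hist[(size_t)m->fixedBin * NUM_BINS + j]++;
    }
}
```

The return value of `push_message` is what makes chunking necessary: once the bounded buffer is full, the pending messages have to be exchanged and handled before the next part of the image is processed.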
Writeout: Atomics vs Grouping
• Atomics
  • determine write position using atomics
  • warp-aggregated increment
• Grouping
  • write to per-pixel buffer
  • group (compress)
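The two writeout strategies can be emulated sequentially on the CPU to show what each one computes. This is only a sketch under assumptions of this example: a "warp" is modelled as a group of 32 consecutive items, a plain counter stands in for the GPU atomic, and the function names are hypothetical. In the atomics variant each warp counts its valid items and reserves all of its slots with one counter update. In the grouping variant every item already sits in its own per-pixel slot and a running prefix sum over the valid flags compresses them.

```c
#include <stddef.h>

#define WARP_SIZE 32

/* Atomics variant with warp-aggregated increment, emulated sequentially.
 * Packs the valid items into out, stores their number in *out_count and
 * returns how many counter updates (atomics on a GPU) were needed. */
size_t writeout_warp_aggregated(const int *items, const unsigned char *valid,
                                size_t n, int *out, size_t *out_count)
{
    size_t counter = 0, updates = 0;
    for (size_t base = 0; base < n; base += WARP_SIZE) {
        size_t end = base + WARP_SIZE < n ? base + WARP_SIZE : n;
        size_t votes = 0;
        for (size_t t = base; t < end; t++)
            votes += valid[t] ? 1 : 0;
        if (votes == 0)
            continue;
        size_t slot = counter; /* one atomic add of `votes` per warp on a GPU */
        counter += votes;
        updates++;
        for (size_t t = base; t < end; t++)
            if (valid[t])
                out[slot++] = items[t];
    }
    *out_count = counter;
    return updates;
}

/* Grouping variant: compress the per-pixel buffer. The running value of pos
 * is the exclusive prefix sum of the valid flags, i.e. the packed position.
 * Returns the number of packed items. */
size_t writeout_grouping(const int *per_pixel, const unsigned char *valid,
                         size_t n, int *out)
{
    size_t pos = 0;
    for (size_t t = 0; t < n; t++) {
        if (valid[t])
            out[pos] = per_pixel[t];
        pos += valid[t] ? 1 : 0;
    }
    return pos;
}
```

Both variants yield the same packed list here; on a GPU they differ in cost, since the atomics variant contends on a shared counter while grouping needs an extra buffer and a compaction pass.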
Chunk Processing and Overlap
[Figure: timeline of overlapping chunk pipelines with the stages "Process chunk", "Group", "Exchange" and "Handle messages"; inset showing the fixed image with x and y axes and origin (0,0), with regions numbered 1 and 2]
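One way to pick a chunk size is to assume the worst case in which every pixel of a chunk produces a message, so that a chunk of whole rows must not exceed the message buffer. The sketch below does that arithmetic. Chunking by whole rows, the worst-case rule and the helper names are assumptions of this example, not the scheme described in the talk.

```c
#include <stddef.h>

/* Largest number of image rows per chunk such that the chunk still fits the
 * buffer even if every pixel generates one message. Returns 0 if not even a
 * single row fits. */
size_t rows_per_chunk(size_t width, size_t buffer_capacity)
{
    return width ? buffer_capacity / width : 0;
}

/* Number of chunks needed to cover `height` rows. */
size_t num_chunks(size_t height, size_t rows)
{
    return rows ? (height + rows - 1) / rows : 0;
}
```

For a 3,000-pixel-wide strip and a buffer of 100,000 messages this gives 33 rows per chunk and 91 chunks for 3,000 rows; with several chunks in flight, the exchange of one chunk can overlap with the processing of the next.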
Listupdate
+ computation-communication overlap
– hard to implement
– chunk processing (or won't fit into buffer)
• Optimizations
  • buffers: AoS vs. SoA
  • atomics vs. grouping
  • using multiple streams

    typedef struct {
      float movingCoords[2];   // position in the moving image
      short destRank;          // destination rank
      unsigned char fixedBin;  // bin of the fixed-image pixel; unsigned so that 0 .. 255 fits
    } message_t;
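The AoS vs. SoA choice for the message buffers can be illustrated as below. The struct and function names are hypothetical and the conversion routine exists only to show the correspondence between the two layouts. With an array of structures each message is one padded record. With a structure of arrays each field lives in its own contiguous array, so neighbouring GPU threads writing the same field touch neighbouring addresses.

```c
#include <stddef.h>

/* Array of structures: one record per message. */
typedef struct {
    float movingCoords[2];
    short destRank;
    unsigned char fixedBin;
} message_aos_t;

/* Structure of arrays: one array per field, all of length `count`.
 * The arrays are owned by the caller. */
typedef struct {
    float *movingX;
    float *movingY;
    short *destRank;
    unsigned char *fixedBin;
    size_t count;
} messages_soa_t;

/* Copies n AoS messages into caller-provided SoA storage. */
void aos_to_soa(const message_aos_t *in, size_t n, messages_soa_t *out)
{
    for (size_t k = 0; k < n; k++) {
        out->movingX[k] = in[k].movingCoords[0];
        out->movingY[k] = in[k].movingCoords[1];
        out->destRank[k] = in[k].destRank;
        out->fixedBin[k] = in[k].fixedBin;
    }
    out->count = n;
}
```

The AoS record carries 11 bytes of payload but is typically padded to a multiple of the `float` alignment, whereas the SoA arrays store exactly the payload bytes per field.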
Benchmark setup
[Figure: fixed image with x and y axes and origin (0,0); labelled regions: "Remote access" and "Mask"]
Test Hardware
• JUDGE
  • 256-node GPU cluster
  • Each M2070 node:
    • 2x M2070 (Fermi) GPU, each 6 GB RAM
    • 12-core X5650 CPU @ 2.67 GHz, 96 GB RAM
• JuHydra
  • single-node Kepler machine
    • 2x K20X (Kepler) GPU, each 6 GB RAM
    • 16-core E5-2650 CPU @ 2 GHz, 64 GB RAM
Baseline: Full Replication (M2070)
[Plot: runtime in seconds (0 to 1) versus rotation angle (0 to 178.2 in steps of 5.4); series: 1 GPU, 2 GPUs, 4 GPUs, ideal scalability]
Sysmem on Fermi
[Plot: runtime in seconds (0 to 1.2) versus rotation angle (0 to 178.2 in steps of 5.4); series: 1 GPU, 2 GPUs baseline, 2 GPUs]
Sysmem on Fermi: Explanation
• No sysmem accesses, good coalescing
• Few sysmem accesses, bad coalescing
• Many sysmem accesses, bad coalescing
• Mostly sysmem accesses, good coalescing
Sysmem on Fermi: PCI-E Queries
[Plot: runtime in seconds (0 to 1.2) and Sysmem_queries (0 to 120,000,000) versus rotation angle (0 to 178.2 in steps of 5.4); series: 2 GPUs baseline, 2 GPUs, total Sysmem_queries]
Sysmem: Halo Sizes
[Plot: time in seconds (0 to 0.8) versus angle in degrees (0 to 36 in steps of 1.8); series, all on 2 K20X: baseline, sysmem, 5% halo, 10% halo, 15% halo, 20% halo, 25% halo]
mostly quantitative, not qualitative difference
Listupdate: Multiple Streams
4 streams look the best
[Plot: time in seconds (0 to 0.9) versus angle in degrees (0 to 178.2 in steps of 5.4); series, all on 2 K20X: 1 stream, 2 streams, 3 streams, 4 streams]
Listupdate: AoS vs SoA, Atomics vs Group
SoA + atomics looks best
[Plot: time in seconds (0 to 1.2) versus angle in degrees (0 to 178.2 in steps of 5.4); series, all on 2 K20X: SoA, AoS, compress]

    typedef struct {
      float movingCoords[2];   // position in the moving image
      unsigned char fixedBin;  // bin of the fixed-image pixel; unsigned so that 0 .. 255 fits
    } message_t;
Sysmem vs. Listupdate: Fermi
[Plot: time in seconds (0 to 2.5) versus angle in degrees (0 to 178.2 in steps of 5.4); series, all on 4 M2070: SoA, baseline, sysmem, 25% halo]
on Fermi, sysmem is better
Sysmem vs. Listupdate: Kepler (Closeup)
[Plot: time in seconds (0 to 0.8) versus angle in degrees (0 to 36 in steps of 1.8); series, all on 2 K20X: SoA, baseline, sysmem, 25% halo]
on Kepler, listupdate is better
Conclusions
• Fermi
  • performance limited by atomics
  • system memory replication is better
• Kepler
  • 10x faster than Fermi
  • no longer dominated by atomics
  • listupdate (atomic, SoA, 4 streams) is better
• Future work
  • Compression
  • Trials on real images
Questions?
• INM-1 at FZJ: http://www.fz-juelich.de/inm/inm-1/EN/Home/home_node.html
• NVIDIA Application Lab at FZJ: http://www.fz-juelich.de/ias/jsc/nvlab
• Andrew V. Adinetz: [email protected]
• Jiri Kraus: [email protected]
• Dirk Pleiter: [email protected]