Real-Time 3-D Wavelet Lifting

Post on 07-Aug-2015

David Barina, Pavel Zemcik

Faculty of Information Technology, Brno University of Technology

3-D Wavelet Lifting

[Figure: the eight subbands of one 3-D decomposition level: LLL, LLH, LHL, LHH, HLL, HLH, HHL, HHH]

1-D Wavelet Lifting

[Figure: lifting scheme with four lifting steps α, β, γ, δ]

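\[
\tilde{P}(z) =
\begin{bmatrix} 1 & \alpha\,(1 + z^{-1}) \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ \beta\,(1 + z) & 1 \end{bmatrix}
\begin{bmatrix} 1 & \gamma\,(1 + z^{-1}) \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ \delta\,(1 + z) & 1 \end{bmatrix}
\begin{bmatrix} \zeta & 0 \\ 0 & 1/\zeta \end{bmatrix}
\]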
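The factorization above reads as four lifting steps followed by a scaling. A minimal Python sketch of that 1-D scheme follows. It rests on assumptions the slides do not state: the numeric constants are the commonly quoted CDF 9/7 values, the α and γ steps update odd samples while β and δ update even samples, borders are mirrored, and even (low-pass) outputs are scaled by ζ. The names `lift`, `forward`, and `inverse` are hypothetical.

```python
# Assumed CDF 9/7 lifting constants; the slides do not list numeric values.
ALPHA, BETA, GAMMA, DELTA = -1.586134342, -0.052980118, 0.882911075, 0.443506852
ZETA = 1.149604398

def lift(x, coeff, parity):
    """In place: x[i] += coeff * (x[i-1] + x[i+1]) for every i of the given parity."""
    n = len(x)
    for i in range(parity, n, 2):
        left = x[i - 1] if i > 0 else x[i + 1]        # mirror at the left edge
        right = x[i + 1] if i + 1 < n else x[i - 1]   # mirror at the right edge
        x[i] += coeff * (left + right)

def forward(signal):
    """Four lifting steps followed by scaling; needs at least two samples."""
    x = [float(v) for v in signal]
    for coeff, parity in ((ALPHA, 1), (BETA, 0), (GAMMA, 1), (DELTA, 0)):
        lift(x, coeff, parity)
    # Even positions hold low-pass, odd positions hold high-pass coefficients.
    return [v * ZETA if i % 2 == 0 else v / ZETA for i, v in enumerate(x)]

def inverse(coeffs):
    """Undo the scaling, then undo the lifting steps in reverse order."""
    x = [v / ZETA if i % 2 == 0 else v * ZETA for i, v in enumerate(coeffs)]
    for coeff, parity in ((DELTA, 0), (GAMMA, 1), (BETA, 0), (ALPHA, 1)):
        lift(x, -coeff, parity)
    return x
```

Each step only reads samples of the opposite parity, so it can run in place and is undone by repeating it with the negated coefficient. Under these assumptions a constant signal yields high-pass coefficients close to zero.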

Naive Approaches

Naive Horizontal

foreach dimension do /* X, Y, Z axis */
    foreach lifting do
        foreach sample do
            step;
        end
    end
end
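The loop order of the naive horizontal approach can be sketched in Python as below. This is an illustration only, not the authors' implementation: it assumes a flat list in X-fastest order, the commonly quoted CDF 9/7 constants, mirrored borders, and at least two samples along each axis. The name `naive_horizontal` is hypothetical.

```python
# Assumed CDF 9/7 lifting constants; the slides do not list numeric values.
ALPHA, BETA, GAMMA, DELTA = -1.586134342, -0.052980118, 0.882911075, 0.443506852
ZETA = 1.149604398
STEPS = ((ALPHA, 1), (BETA, 0), (GAMMA, 1), (DELTA, 0))  # (coefficient, parity)

def naive_horizontal(volume, shape):
    """In-place 3-D transform of a flat, X-fastest volume; shape = (nx, ny, nz)."""
    nx, ny, nz = shape
    strides = (1, nx, nx * ny)
    for axis in range(3):                        # foreach dimension: X, Y, Z
        n, stride = shape[axis], strides[axis]
        o0, o1 = [a for a in range(3) if a != axis]
        # Start offsets of every 1-D line that runs along the current axis.
        starts = [i * strides[o0] + j * strides[o1]
                  for j in range(shape[o1]) for i in range(shape[o0])]
        for coeff, parity in STEPS:              # foreach lifting step
            for start in starts:                 # foreach sample
                for k in range(parity, n, 2):
                    left = k - 1 if k > 0 else k + 1        # mirrored border
                    right = k + 1 if k + 1 < n else k - 1
                    volume[start + k * stride] += coeff * (
                        volume[start + left * stride]
                        + volume[start + right * stride])
        for start in starts:                     # scaling as one more pass
            for k in range(n):
                volume[start + k * stride] *= ZETA if k % 2 == 0 else 1.0 / ZETA
```

Every lifting step sweeps the whole volume before the next one starts, so along the Y and Z axes consecutive accesses are `nx` or `nx * ny` elements apart.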

[Figure: memory address split into tag (MSB), set, and offset (LSB)]

Comparison: Strides

[Plot: time per voxel (20 ns to 220 ns) against number of voxels (1 k to 1 G); series: unchanged stride, prime stride]
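One plausible reading of the stride comparison is that a power-of-two stride maps successive accesses to the same cache set, while a prime stride spreads them over many sets. The Python sketch below illustrates the address-to-set mapping. The cache geometry (64-byte lines, 64 sets), the 4-byte element size, and the names `cache_set` and `sets_touched` are assumptions for illustration, not values from the slides.

```python
LINE_BYTES = 64   # assumed cache line size in bytes
NUM_SETS = 64     # assumed number of cache sets

def cache_set(address):
    """Set index: the address bits between the line offset (LSB) and the tag (MSB)."""
    return (address // LINE_BYTES) % NUM_SETS

def sets_touched(stride_elems, count, elem_bytes=4):
    """Number of distinct sets hit by `count` accesses spaced `stride_elems` apart."""
    return len({cache_set(i * stride_elems * elem_bytes) for i in range(count)})
```

With these assumed parameters, `sets_touched(1024, 64)` returns 1, because every access lands in set 0, whereas `sets_touched(1031, 64)` returns 28.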

Naive Approaches

Naive Vertical

foreach dimension do /* X, Y, Z axis */
    foreach sample do
        foreach lifting do
            step;
        end
    end
end

- huge number of cache misses
- three passes through the data
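The vertical order along a single axis can be sketched in Python as follows. This is a hedged illustration rather than the authors' code: it assumes the commonly quoted CDF 9/7 constants, an even-length signal, and mirrored borders. The names `vertical_1d`, `horizontal_1d`, and `mirror` are hypothetical. The vertical variant reads each sample pair once and applies all four lifting steps with a lag of two pairs, and the horizontal variant is included only as a reference for comparison.

```python
# Assumed CDF 9/7 lifting constants; the slides do not list numeric values.
ALPHA, BETA, GAMMA, DELTA = -1.586134342, -0.052980118, 0.882911075, 0.443506852
ZETA = 1.149604398

def mirror(i, n):
    """Reflect index i into 0..n-1 (whole-sample symmetric extension)."""
    period = 2 * (n - 1)
    i = abs(i) % period
    return i if i < n else period - i

def vertical_1d(signal):
    """Single pass over an even-length signal: all lifting steps per sample pair."""
    n = len(signal)
    x_even = x_odd = d1 = s1 = d2 = 0.0          # state carried between pairs
    out = []
    for p in range(-2, n // 2 + 2):              # two extra pairs on each side
        a = signal[mirror(2 * p, n)]             # x[2p]
        b = signal[mirror(2 * p + 1, n)]         # x[2p+1]
        new_d1 = x_odd + ALPHA * (x_even + a)    # first predict, pair p-1
        new_s1 = x_even + BETA * (d1 + new_d1)   # first update, pair p-1
        new_d2 = d1 + GAMMA * (s1 + new_s1)      # second predict, pair p-2
        new_s2 = s1 + DELTA * (d2 + new_d2)      # second update, pair p-2
        if p >= 2:                               # earlier outputs are warm-up
            out += [new_s2 * ZETA, new_d2 / ZETA]
        x_even, x_odd, d1, s1, d2 = a, b, new_d1, new_s1, new_d2
    return out

def horizontal_1d(signal):
    """Reference: one full pass over the signal per lifting step."""
    x = [float(v) for v in signal]
    n = len(x)
    for coeff, parity in ((ALPHA, 1), (BETA, 0), (GAMMA, 1), (DELTA, 0)):
        for i in range(parity, n, 2):
            left = x[i - 1] if i > 0 else x[i + 1]
            right = x[i + 1] if i + 1 < n else x[i - 1]
            x[i] += coeff * (left + right)
    return [v * ZETA if i % 2 == 0 else v / ZETA for i, v in enumerate(x)]
```

Both functions should produce the same interleaved coefficients. The difference is that the vertical variant keeps only five values of state and touches each input sample once per axis.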

Comparison: Naive Approaches

[Plot: time per voxel (20 ns to 160 ns) against number of voxels (0 to 250 M); series: horizontal, vertical]

2-D Approach

2-D Slices

foreach slice do
    foreach sample do
        foreach lifting do step; /* X axis */
        foreach lifting do step; /* Y axis */
    end
end
/* Z axis */
foreach sample do
    foreach lifting do step;
end

- 4² core with SIMD

Comparison: Slices

[Plot: time per voxel (0 ns to 160 ns) against number of voxels (0 to 250 M); series: naive horizontal, naive vertical, slices]

3-D Approach

True 3-D

foreach sample do
    foreach lifting do step; /* X axis */
    foreach lifting do step; /* Y axis */
    foreach lifting do step; /* Z axis */
end

- 2³ cube
- 4³ with SIMD

3-D Single-Loop Approach

[Figure: volume axes x, y, z and the associated buffers: buffer x, buffer y, buffer z]

Overall Comparison

[Plot: time per voxel (0 ns to 160 ns) against number of voxels (0 to 250 M); series: naive horizontal, naive vertical, core 4², core 2³, core 4³]

Conclusions

                Intel Core2         AMD Opteron
method          time    speedup     time    speedup
naive horiz.    159.8    1.0        105.7   1.0
naive vert.     100.1    1.6         73.5   1.4
core 4²          53.8    2.9         41.0   2.5
core 2³          23.3    6.8         21.7   4.7
core 4³          13.5   11.7         12.9   8.0

- core = streaming unit
- CPU cache friendly = single-loop approach
- SIMD friendly