Static Image Filtering on Commodity Graphics Processors
![Page 1: Static Image Filtering on Commodity Graphics Processors](https://reader034.fdocuments.net/reader034/viewer/2022051401/56814af0550346895db7fff8/html5/thumbnails/1.jpg)
Static Image Filtering on Commodity Graphics Processors
Peter Djeu
May 1, 2003
![Page 2: Static Image Filtering on Commodity Graphics Processors](https://reader034.fdocuments.net/reader034/viewer/2022051401/56814af0550346895db7fff8/html5/thumbnails/2.jpg)
Filters from Computer Vision
• Mean (a.k.a. average) filter
  – each element in a neighborhood is given equal weight; a simple image smoother
• Gaussian
  – a neighborhood is weighted by a 2-D Gaussian, with the peak at the center; a better image smoother
• Laplacian of Gaussian
  – the Gaussian filter is applied, and then the Laplacian (a spatial derivative) is applied; good for edge detection
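A minimal sketch of how the first two kernels could be built, in plain Python. The function names and the `sigma` parameter are assumptions made for illustration; they do not come from the slides.

```python
import math

def mean_kernel(size):
    """size x size kernel in which every element has equal weight."""
    weight = 1.0 / (size * size)
    return [[weight] * size for _ in range(size)]

def gaussian_kernel(size, sigma):
    """size x size kernel sampled from a 2-D Gaussian peaked at the center.

    size is expected to be odd; sigma (standard deviation in pixels)
    is an assumed parameter.
    """
    half = size // 2
    raw = [[math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))
            for dx in range(-half, half + 1)]
           for dy in range(-half, half + 1)]
    total = sum(sum(row) for row in raw)
    # Normalize so the weights sum to 1 and overall brightness is preserved.
    return [[v / total for v in row] for row in raw]
```

For example, `gaussian_kernel(5, 1.0)` gives a 5 x 5 kernel whose largest weight sits on the center element, while `mean_kernel(5)` weights all 25 elements equally.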
![Page 3: Static Image Filtering on Commodity Graphics Processors](https://reader034.fdocuments.net/reader034/viewer/2022051401/56814af0550346895db7fff8/html5/thumbnails/3.jpg)
The Convolution Kernel
• We want to transmit pixel information from neighbors to a central pixel
• Use the convolution kernel as a window to frame the work that needs to be done
1/161 ×
16 26 16
26 41 26
16 26 16
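The window idea can be sketched as a direct CPU loop: for every pixel, the kernel frames a neighborhood and the output is the weighted sum of what it covers. The function name `convolve`, the `scale` argument, and the clamp-to-edge border handling are assumptions of this sketch. The example call scales the integer weights by the reciprocal of their sum so that a flat image stays flat; that normalization is also an assumption.

```python
def convolve(image, kernel, scale=1.0):
    """Apply `kernel` to `image` (a list of rows of numbers).

    Each output pixel is the weighted sum of the neighbors framed by the
    kernel window, multiplied by `scale`. Reads outside the image are
    clamped to the nearest edge pixel (an assumed border policy).
    The kernel is not flipped; for symmetric kernels this makes no difference.
    """
    height, width = len(image), len(image[0])
    half_h, half_w = len(kernel) // 2, len(kernel[0]) // 2
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0.0
            for ky, row in enumerate(kernel):
                for kx, weight in enumerate(row):
                    sy = min(max(y + ky - half_h, 0), height - 1)
                    sx = min(max(x + kx - half_w, 0), width - 1)
                    acc += weight * image[sy][sx]
            out[y][x] = acc * scale
    return out

# Integer weights as shown on the slide, normalized by their sum (assumption).
WEIGHTS = [[16, 26, 16], [26, 41, 26], [16, 26, 16]]
SCALE = 1.0 / sum(sum(row) for row in WEIGHTS)
flat = [[10] * 4 for _ in range(4)]
smoothed = convolve(flat, WEIGHTS, SCALE)
```

The four nested loops make the cost proportional to image size times kernel size, which is why the sequential CPU version is straightforward but slow for large kernels.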
![Page 4: Static Image Filtering on Commodity Graphics Processors](https://reader034.fdocuments.net/reader034/viewer/2022051401/56814af0550346895db7fff8/html5/thumbnails/4.jpg)
Filtering on a CPU vs. a GPU
• CPU
  – sequential and straightforward
• GPU
  – not so straightforward if the goal is to exploit parallelism and maintain good locality
  – a pixel's output value depends on the weighted values of its neighbors, so there is a dependency across elements
![Page 5: Static Image Filtering on Commodity Graphics Processors](https://reader034.fdocuments.net/reader034/viewer/2022051401/56814af0550346895db7fff8/html5/thumbnails/5.jpg)
Pixel Buffers in GPUs
• GPUs do not have indirect addressing to memory, so results have to be stored in pixel buffers. The card is really rendering to an off-screen frame (writing).
• The GPU can then treat the pixel buffer as a texture for rendering (reading).
![Page 6: Static Image Filtering on Commodity Graphics Processors](https://reader034.fdocuments.net/reader034/viewer/2022051401/56814af0550346895db7fff8/html5/thumbnails/6.jpg)
Proposal for the GPU Algorithm
1. Store original into pb1.
2. For each element ki in the convolution kernel {
3.     Copy pb1 into pb2, scaling by ki in the process (use Cg shader).
4.     Based on the location of ki, render pb2 into pb3 with a certain offset. The blending is a single add.
   }
5. return pb3
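The loop above can be emulated on the CPU to check that scale, offset, and add produce the same result as a per-pixel weighted sum. In this sketch, Python lists stand in for the pixel buffers pb1, pb2, and pb3; the function name, the offset sign convention, and the choice to treat everything outside the image as zero are assumptions, and no Cg or P-buffer calls are involved.

```python
def filter_by_shift_and_add(image, kernel):
    """CPU emulation of the proposed multi-pass algorithm."""
    height, width = len(image), len(image[0])
    half_h, half_w = len(kernel) // 2, len(kernel[0]) // 2
    pb1 = [list(row) for row in image]              # step 1: store original
    pb3 = [[0.0] * width for _ in range(height)]    # accumulation target
    for ky, krow in enumerate(kernel):              # step 2: each element ki
        for kx, ki in enumerate(krow):
            # Step 3: copy pb1 into pb2, scaling by ki in the process.
            pb2 = [[ki * v for v in row] for row in pb1]
            # Step 4: add pb2 into pb3, offset according to ki's location.
            dy, dx = ky - half_h, kx - half_w
            for y in range(height):
                for x in range(width):
                    sy, sx = y + dy, x + dx
                    if 0 <= sy < height and 0 <= sx < width:
                        pb3[y][x] += pb2[sy][sx]
    return pb3                                      # step 5
```

Every pass over pb2 and pb3 touches each pixel independently of the others, which is the property the GPU version would rely on; the number of passes equals the number of kernel elements.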
![Page 7: Static Image Filtering on Commodity Graphics Processors](https://reader034.fdocuments.net/reader034/viewer/2022051401/56814af0550346895db7fff8/html5/thumbnails/7.jpg)
The Ups and Downs
• This technique may be fast because…
  – parallelism is completely possible during the scaling stage and the blending
  – since most convolution kernels have symmetry, a little bit of preprocess could mean
• On the other hand…
  – as image size grows, cache misses may become more prominent, since we manipulate the whole image
  – when translating, coordinates are interpolated, not mapped
• Tiling? Can a good tile size be determined experimentally?
![Page 8: Static Image Filtering on Commodity Graphics Processors](https://reader034.fdocuments.net/reader034/viewer/2022051401/56814af0550346895db7fff8/html5/thumbnails/8.jpg)
Current Progress
• P-Buffers are frustrating
  – wglReleasePbufferDCARB() returning type PFNWGLRELEASEPBUFFERDCARBPROC
• Lots of low-level implementation / debugging, very much on a hardware level
• (Naïve) CPU implementation is complete and working, and P-Buffers are almost done
![Page 9: Static Image Filtering on Commodity Graphics Processors](https://reader034.fdocuments.net/reader034/viewer/2022051401/56814af0550346895db7fff8/html5/thumbnails/9.jpg)
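Results (in real-time seconds)
CPU, Gaussian filter, RGB, 24-bit Targa images

| Image | x (px) | y (px) | 5 x 5 kernel | 11 x 11 kernel | 31 x 31 kernel |
|---|---|---|---|---|---|
| Quake | 256 | 256 | 0.4 | 1.5 | 10.5 |
| Fruit | 512 | 480 | 1.4 | 5.4 | 41.7 |
| Ruins03 | 735 | 485 | 2.0 | 7.8 | 59.7 |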
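One way to read these timings is to divide each by the number of kernel-element applications (image pixels times kernel elements). The sketch below does that arithmetic with the figures from the table; the "nanoseconds per tap" metric and all names are assumptions for illustration, and the three RGB channels are not counted separately.

```python
RESULTS = {
    # image name: (x, y, {kernel side: seconds})
    "Quake":   (256, 256, {5: 0.4, 11: 1.5, 31: 10.5}),
    "Fruit":   (512, 480, {5: 1.4, 11: 5.4, 31: 41.7}),
    "Ruins03": (735, 485, {5: 2.0, 11: 7.8, 31: 59.7}),
}

def ns_per_tap(x, y, side, seconds):
    """Time per (pixel, kernel element) pair, in nanoseconds."""
    taps = x * y * side * side
    return seconds / taps * 1e9

for name, (x, y, times) in RESULTS.items():
    for side, seconds in times.items():
        cost = ns_per_tap(x, y, side, seconds)
        print(f"{name:8s} {side:2d} x {side:<2d} {cost:6.1f} ns per tap")
```

The per-tap cost comes out at a couple of hundred nanoseconds for every row, so the CPU time grows roughly in proportion to image size times kernel size, as the sequential loop predicts.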
![Page 10: Static Image Filtering on Commodity Graphics Processors](https://reader034.fdocuments.net/reader034/viewer/2022051401/56814af0550346895db7fff8/html5/thumbnails/10.jpg)
[Chart: Time (s) versus Kernel Size (elts). Series: Quake, Fruit, Ruins03. Horizontal axis 0 to 1200 elements; vertical axis 0 to 70 s.]
![Page 11: Static Image Filtering on Commodity Graphics Processors](https://reader034.fdocuments.net/reader034/viewer/2022051401/56814af0550346895db7fff8/html5/thumbnails/11.jpg)
[Chart: Time (s) versus Image Size (x*y) using a (31 x 31) kernel. Horizontal axis 0 to 400000 pixels; vertical axis 0 to 70 s.]
![Page 12: Static Image Filtering on Commodity Graphics Processors](https://reader034.fdocuments.net/reader034/viewer/2022051401/56814af0550346895db7fff8/html5/thumbnails/12.jpg)
![Page 13: Static Image Filtering on Commodity Graphics Processors](https://reader034.fdocuments.net/reader034/viewer/2022051401/56814af0550346895db7fff8/html5/thumbnails/13.jpg)
![Page 14: Static Image Filtering on Commodity Graphics Processors](https://reader034.fdocuments.net/reader034/viewer/2022051401/56814af0550346895db7fff8/html5/thumbnails/14.jpg)
![Page 15: Static Image Filtering on Commodity Graphics Processors](https://reader034.fdocuments.net/reader034/viewer/2022051401/56814af0550346895db7fff8/html5/thumbnails/15.jpg)
Applications?
• Super fast filtering techniques on 2-D images may provide tools or insight for traditionally more complex problems involving 2-D images, like categorization / classification