Parallel Video Processingcourses.csail.mit.edu/18.337/2012/projects/nealwadhwa_18... ·...
Transcript of Parallel Video Processingcourses.csail.mit.edu/18.337/2012/projects/nealwadhwa_18... ·...
![Page 1: Parallel Video Processingcourses.csail.mit.edu/18.337/2012/projects/nealwadhwa_18... · 2015-08-21 · Parallel Video Processing I Processing is on uncompressed video I Uncompressed](https://reader030.fdocuments.net/reader030/viewer/2022040917/5e926dac21e1b7457e4434a3/html5/thumbnails/1.jpg)
Parallel Video Processing
Neal Wadhwa
December 10, 2012
![Page 2: Parallel Video Processingcourses.csail.mit.edu/18.337/2012/projects/nealwadhwa_18... · 2015-08-21 · Parallel Video Processing I Processing is on uncompressed video I Uncompressed](https://reader030.fdocuments.net/reader030/viewer/2022040917/5e926dac21e1b7457e4434a3/html5/thumbnails/2.jpg)
Parallel Video Processing
I Processing is on uncompressed video
I Uncompressed videos uses huge amounts of space.
I 1080p at 30 FPS is one gigabyte per second
I Lots of algorithms are easy to parallelize due to independenceof processing in space or time.
![Page 3: Parallel Video Processingcourses.csail.mit.edu/18.337/2012/projects/nealwadhwa_18... · 2015-08-21 · Parallel Video Processing I Processing is on uncompressed video I Uncompressed](https://reader030.fdocuments.net/reader030/viewer/2022040917/5e926dac21e1b7457e4434a3/html5/thumbnails/3.jpg)
Parallel Video Processing
I Processing is on uncompressed video
I Uncompressed videos uses huge amounts of space.
I 1080p at 30 FPS is one gigabyte per second
I Lots of algorithms are easy to parallelize due to independenceof processing in space or time.
![Page 4: Parallel Video Processingcourses.csail.mit.edu/18.337/2012/projects/nealwadhwa_18... · 2015-08-21 · Parallel Video Processing I Processing is on uncompressed video I Uncompressed](https://reader030.fdocuments.net/reader030/viewer/2022040917/5e926dac21e1b7457e4434a3/html5/thumbnails/4.jpg)
Parallel Video Processing
I Processing is on uncompressed video
I Uncompressed videos uses huge amounts of space.
I 1080p at 30 FPS is one gigabyte per second
I Lots of algorithms are easy to parallelize due to independenceof processing in space or time.
![Page 5: Parallel Video Processingcourses.csail.mit.edu/18.337/2012/projects/nealwadhwa_18... · 2015-08-21 · Parallel Video Processing I Processing is on uncompressed video I Uncompressed](https://reader030.fdocuments.net/reader030/viewer/2022040917/5e926dac21e1b7457e4434a3/html5/thumbnails/5.jpg)
Parallel Video Processing
I Processing is on uncompressed video
I Uncompressed videos uses huge amounts of space.
I 1080p at 30 FPS is one gigabyte per second
I Lots of algorithms are easy to parallelize due to independenceof processing in space or time.
![Page 6: Parallel Video Processingcourses.csail.mit.edu/18.337/2012/projects/nealwadhwa_18... · 2015-08-21 · Parallel Video Processing I Processing is on uncompressed video I Uncompressed](https://reader030.fdocuments.net/reader030/viewer/2022040917/5e926dac21e1b7457e4434a3/html5/thumbnails/6.jpg)
Example: Motion Magnification
I DSP based method to magnify subtle motions
I Here are some cool examples of motion magnification.
I Switch to video.
I FFT-based algorithm lends itself to being parallelized.
I Try to parallelize and see how far we can get
![Page 7: Parallel Video Processingcourses.csail.mit.edu/18.337/2012/projects/nealwadhwa_18... · 2015-08-21 · Parallel Video Processing I Processing is on uncompressed video I Uncompressed](https://reader030.fdocuments.net/reader030/viewer/2022040917/5e926dac21e1b7457e4434a3/html5/thumbnails/7.jpg)
Example: Motion Magnification
I DSP based method to magnify subtle motions
I Here are some cool examples of motion magnification.
I Switch to video.
I FFT-based algorithm lends itself to being parallelized.
I Try to parallelize and see how far we can get
![Page 8: Parallel Video Processingcourses.csail.mit.edu/18.337/2012/projects/nealwadhwa_18... · 2015-08-21 · Parallel Video Processing I Processing is on uncompressed video I Uncompressed](https://reader030.fdocuments.net/reader030/viewer/2022040917/5e926dac21e1b7457e4434a3/html5/thumbnails/8.jpg)
Example: Motion Magnification
I DSP based method to magnify subtle motions
I Here are some cool examples of motion magnification.
I Switch to video.
I FFT-based algorithm lends itself to being parallelized.
I Try to parallelize and see how far we can get
![Page 9: Parallel Video Processingcourses.csail.mit.edu/18.337/2012/projects/nealwadhwa_18... · 2015-08-21 · Parallel Video Processing I Processing is on uncompressed video I Uncompressed](https://reader030.fdocuments.net/reader030/viewer/2022040917/5e926dac21e1b7457e4434a3/html5/thumbnails/9.jpg)
Example: Motion Magnification
I DSP based method to magnify subtle motions
I Here are some cool examples of motion magnification.
I Switch to video.
I FFT-based algorithm lends itself to being parallelized.
I Try to parallelize and see how far we can get
![Page 10: Parallel Video Processingcourses.csail.mit.edu/18.337/2012/projects/nealwadhwa_18... · 2015-08-21 · Parallel Video Processing I Processing is on uncompressed video I Uncompressed](https://reader030.fdocuments.net/reader030/viewer/2022040917/5e926dac21e1b7457e4434a3/html5/thumbnails/10.jpg)
Example: Motion Magnification
I DSP based method to magnify subtle motions
I Here are some cool examples of motion magnification.
I Switch to video.
I FFT-based algorithm lends itself to being parallelized.
I Try to parallelize and see how far we can get
![Page 11: Parallel Video Processingcourses.csail.mit.edu/18.337/2012/projects/nealwadhwa_18... · 2015-08-21 · Parallel Video Processing I Processing is on uncompressed video I Uncompressed](https://reader030.fdocuments.net/reader030/viewer/2022040917/5e926dac21e1b7457e4434a3/html5/thumbnails/11.jpg)
Outline of algorithm - three stages
I 1. Spatially decompose each frame.
I 2. Temporally process each pixel in each decomposition level
I 3. Reconstruct each frame
I Every stage is easy to parallelize individually
I Serial algorithm takes several hours on high resolution videos.
![Page 12: Parallel Video Processingcourses.csail.mit.edu/18.337/2012/projects/nealwadhwa_18... · 2015-08-21 · Parallel Video Processing I Processing is on uncompressed video I Uncompressed](https://reader030.fdocuments.net/reader030/viewer/2022040917/5e926dac21e1b7457e4434a3/html5/thumbnails/12.jpg)
Outline of algorithm - three stages
I 1. Spatially decompose each frame.
I 2. Temporally process each pixel in each decomposition level
I 3. Reconstruct each frame
I Every stage is easy to parallelize individually
I Serial algorithm takes several hours on high resolution videos.
![Page 13: Parallel Video Processingcourses.csail.mit.edu/18.337/2012/projects/nealwadhwa_18... · 2015-08-21 · Parallel Video Processing I Processing is on uncompressed video I Uncompressed](https://reader030.fdocuments.net/reader030/viewer/2022040917/5e926dac21e1b7457e4434a3/html5/thumbnails/13.jpg)
Outline of algorithm - three stages
I 1. Spatially decompose each frame.
I 2. Temporally process each pixel in each decomposition level
I 3. Reconstruct each frame
I Every stage is easy to parallelize individually
I Serial algorithm takes several hours on high resolution videos.
![Page 14: Parallel Video Processingcourses.csail.mit.edu/18.337/2012/projects/nealwadhwa_18... · 2015-08-21 · Parallel Video Processing I Processing is on uncompressed video I Uncompressed](https://reader030.fdocuments.net/reader030/viewer/2022040917/5e926dac21e1b7457e4434a3/html5/thumbnails/14.jpg)
Outline of algorithm - three stages
I 1. Spatially decompose each frame.
I 2. Temporally process each pixel in each decomposition level
I 3. Reconstruct each frame
I Every stage is easy to parallelize individually
I Serial algorithm takes several hours on high resolution videos.
![Page 15: Parallel Video Processingcourses.csail.mit.edu/18.337/2012/projects/nealwadhwa_18... · 2015-08-21 · Parallel Video Processing I Processing is on uncompressed video I Uncompressed](https://reader030.fdocuments.net/reader030/viewer/2022040917/5e926dac21e1b7457e4434a3/html5/thumbnails/15.jpg)
Outline of algorithm - three stages
I 1. Spatially decompose each frame.
I 2. Temporally process each pixel in each decomposition level
I 3. Reconstruct each frame
I Every stage is easy to parallelize individually
I Serial algorithm takes several hours on high resolution videos.
![Page 16: Parallel Video Processingcourses.csail.mit.edu/18.337/2012/projects/nealwadhwa_18... · 2015-08-21 · Parallel Video Processing I Processing is on uncompressed video I Uncompressed](https://reader030.fdocuments.net/reader030/viewer/2022040917/5e926dac21e1b7457e4434a3/html5/thumbnails/16.jpg)
Outline of algorithm - Spatial Decomposition
I Say you have frames F1, . . . ,Fn
I Decompose each frame into different spatial bands
Fi → (Di ,1, . . . ,Di ,k)
uses k times as much space as the original frame.
I Decomposition is performing by FFT, multiplying by filtersand applying IFFT
Di ,j = F−1{Tj ×F{Fi}}
I Transform is invertible.
![Page 17: Parallel Video Processingcourses.csail.mit.edu/18.337/2012/projects/nealwadhwa_18... · 2015-08-21 · Parallel Video Processing I Processing is on uncompressed video I Uncompressed](https://reader030.fdocuments.net/reader030/viewer/2022040917/5e926dac21e1b7457e4434a3/html5/thumbnails/17.jpg)
Outline of algorithm - Spatial Decomposition
I Say you have frames F1, . . . ,FnI Decompose each frame into different spatial bands
Fi → (Di ,1, . . . ,Di ,k)
uses k times as much space as the original frame.
I Decomposition is performing by FFT, multiplying by filtersand applying IFFT
Di ,j = F−1{Tj ×F{Fi}}
I Transform is invertible.
![Page 18: Parallel Video Processingcourses.csail.mit.edu/18.337/2012/projects/nealwadhwa_18... · 2015-08-21 · Parallel Video Processing I Processing is on uncompressed video I Uncompressed](https://reader030.fdocuments.net/reader030/viewer/2022040917/5e926dac21e1b7457e4434a3/html5/thumbnails/18.jpg)
Outline of algorithm - Spatial Decomposition
I Say you have frames F1, . . . ,FnI Decompose each frame into different spatial bands
Fi → (Di ,1, . . . ,Di ,k)
uses k times as much space as the original frame.
I Decomposition is performing by FFT, multiplying by filtersand applying IFFT
Di ,j = F−1{Tj ×F{Fi}}
I Transform is invertible.
![Page 19: Parallel Video Processingcourses.csail.mit.edu/18.337/2012/projects/nealwadhwa_18... · 2015-08-21 · Parallel Video Processing I Processing is on uncompressed video I Uncompressed](https://reader030.fdocuments.net/reader030/viewer/2022040917/5e926dac21e1b7457e4434a3/html5/thumbnails/19.jpg)
Outline of algorithm - Spatial Decomposition
I Say you have frames F1, . . . ,FnI Decompose each frame into different spatial bands
Fi → (Di ,1, . . . ,Di ,k)
uses k times as much space as the original frame.
I Decomposition is performing by FFT, multiplying by filtersand applying IFFT
Di ,j = F−1{Tj ×F{Fi}}
I Transform is invertible.
![Page 20: Parallel Video Processingcourses.csail.mit.edu/18.337/2012/projects/nealwadhwa_18... · 2015-08-21 · Parallel Video Processing I Processing is on uncompressed video I Uncompressed](https://reader030.fdocuments.net/reader030/viewer/2022040917/5e926dac21e1b7457e4434a3/html5/thumbnails/20.jpg)
Outline of algorithm - Spatial Decomposition
I Create decompositionfor every frame
x
y
Space (x)
Spa
ce (y)
Time (t)
Levels
Orientations
Space (x)
Spa
ce(y
)
Time (t)
![Page 21: Parallel Video Processingcourses.csail.mit.edu/18.337/2012/projects/nealwadhwa_18... · 2015-08-21 · Parallel Video Processing I Processing is on uncompressed video I Uncompressed](https://reader030.fdocuments.net/reader030/viewer/2022040917/5e926dac21e1b7457e4434a3/html5/thumbnails/21.jpg)
Outline of algorithm - Spatial Decomposition
I Create decompositionfor every frame
x
y
Space (x)
Spa
ce (y)
Time (t)
Levels
Orientations
Space (x)
Spa
ce(y
)
Time (t)
![Page 22: Parallel Video Processingcourses.csail.mit.edu/18.337/2012/projects/nealwadhwa_18... · 2015-08-21 · Parallel Video Processing I Processing is on uncompressed video I Uncompressed](https://reader030.fdocuments.net/reader030/viewer/2022040917/5e926dac21e1b7457e4434a3/html5/thumbnails/22.jpg)
Outline of algorithm - Spatial Decomposition
I Create decompositionfor every frame
x
y
Space (x)
Spa
ce (y)
Time (t)
Levels
Orientations
Space (x)
Spa
ce(y
)
Time (t)
![Page 23: Parallel Video Processingcourses.csail.mit.edu/18.337/2012/projects/nealwadhwa_18... · 2015-08-21 · Parallel Video Processing I Processing is on uncompressed video I Uncompressed](https://reader030.fdocuments.net/reader030/viewer/2022040917/5e926dac21e1b7457e4434a3/html5/thumbnails/23.jpg)
Outline of Algorithm
I For every pixel in every level, values contain motion signal
![Page 24: Parallel Video Processingcourses.csail.mit.edu/18.337/2012/projects/nealwadhwa_18... · 2015-08-21 · Parallel Video Processing I Processing is on uncompressed video I Uncompressed](https://reader030.fdocuments.net/reader030/viewer/2022040917/5e926dac21e1b7457e4434a3/html5/thumbnails/24.jpg)
Outline of Algorithm
I Bandpass from 100 Hz to 120Hz
I Add bandpassed signal to original signal
0 50 100 150 200 250 300
−3
−2
−1
0
1
2
3
Time (t)0 50 100 150 200 250 300
−3
−2
−1
0
1
2
3
Time (t)
Bandpass 100-120 Hz Magnified)
I Amplifies only selected frequency
![Page 25: Parallel Video Processingcourses.csail.mit.edu/18.337/2012/projects/nealwadhwa_18... · 2015-08-21 · Parallel Video Processing I Processing is on uncompressed video I Uncompressed](https://reader030.fdocuments.net/reader030/viewer/2022040917/5e926dac21e1b7457e4434a3/html5/thumbnails/25.jpg)
Outline of Algorithm
I Bandpass from 100 Hz to 120Hz
I Add bandpassed signal to original signal
0 50 100 150 200 250 300
−3
−2
−1
0
1
2
3
Time (t)0 50 100 150 200 250 300
−3
−2
−1
0
1
2
3
Time (t)
Bandpass 100-120 Hz Magnified)
I Amplifies only selected frequency
![Page 26: Parallel Video Processingcourses.csail.mit.edu/18.337/2012/projects/nealwadhwa_18... · 2015-08-21 · Parallel Video Processing I Processing is on uncompressed video I Uncompressed](https://reader030.fdocuments.net/reader030/viewer/2022040917/5e926dac21e1b7457e4434a3/html5/thumbnails/26.jpg)
Easy to Parallelize
I Parallelize spatial decomposition over frames
I Parallelize temporal filtering over pixels.
I Difficulty lies in how to store data over cores.
![Page 27: Parallel Video Processingcourses.csail.mit.edu/18.337/2012/projects/nealwadhwa_18... · 2015-08-21 · Parallel Video Processing I Processing is on uncompressed video I Uncompressed](https://reader030.fdocuments.net/reader030/viewer/2022040917/5e926dac21e1b7457e4434a3/html5/thumbnails/27.jpg)
Easy to Parallelize
I Parallelize spatial decomposition over frames
I Parallelize temporal filtering over pixels.
I Difficulty lies in how to store data over cores.
![Page 28: Parallel Video Processingcourses.csail.mit.edu/18.337/2012/projects/nealwadhwa_18... · 2015-08-21 · Parallel Video Processing I Processing is on uncompressed video I Uncompressed](https://reader030.fdocuments.net/reader030/viewer/2022040917/5e926dac21e1b7457e4434a3/html5/thumbnails/28.jpg)
Easy to Parallelize
I Parallelize spatial decomposition over frames
I Parallelize temporal filtering over pixels.
I Difficulty lies in how to store data over cores.
![Page 29: Parallel Video Processingcourses.csail.mit.edu/18.337/2012/projects/nealwadhwa_18... · 2015-08-21 · Parallel Video Processing I Processing is on uncompressed video I Uncompressed](https://reader030.fdocuments.net/reader030/viewer/2022040917/5e926dac21e1b7457e4434a3/html5/thumbnails/29.jpg)
Matlab vs. JuliaI Relatively easy to port code to Julia
I Compare serial performance of matlab vs. julia at differentimage sizes.
103
104
105
106
107
101
102
103
104
105
Frame Size in Pixels
RunningTim
e
Comparison of Serial Matlab and Julia Code
MatlabJulia
I Julia is slightly slower, but comparable.I Not surprising since main processing occurs in ffts (in libfftw).I Uses 400 GB at largest problem size, 1600x1600x300.
![Page 30: Parallel Video Processingcourses.csail.mit.edu/18.337/2012/projects/nealwadhwa_18... · 2015-08-21 · Parallel Video Processing I Processing is on uncompressed video I Uncompressed](https://reader030.fdocuments.net/reader030/viewer/2022040917/5e926dac21e1b7457e4434a3/html5/thumbnails/30.jpg)
Matlab vs. JuliaI Relatively easy to port code to JuliaI Compare serial performance of matlab vs. julia at different
image sizes.
103
104
105
106
107
101
102
103
104
105
Frame Size in Pixels
RunningTim
e
Comparison of Serial Matlab and Julia Code
MatlabJulia
I Julia is slightly slower, but comparable.I Not surprising since main processing occurs in ffts (in libfftw).I Uses 400 GB at largest problem size, 1600x1600x300.
![Page 31: Parallel Video Processingcourses.csail.mit.edu/18.337/2012/projects/nealwadhwa_18... · 2015-08-21 · Parallel Video Processing I Processing is on uncompressed video I Uncompressed](https://reader030.fdocuments.net/reader030/viewer/2022040917/5e926dac21e1b7457e4434a3/html5/thumbnails/31.jpg)
Matlab vs. JuliaI Relatively easy to port code to JuliaI Compare serial performance of matlab vs. julia at different
image sizes.
103
104
105
106
107
101
102
103
104
105
Frame Size in Pixels
RunningTim
e
Comparison of Serial Matlab and Julia Code
MatlabJulia
I Julia is slightly slower, but comparable.
I Not surprising since main processing occurs in ffts (in libfftw).I Uses 400 GB at largest problem size, 1600x1600x300.
![Page 32: Parallel Video Processingcourses.csail.mit.edu/18.337/2012/projects/nealwadhwa_18... · 2015-08-21 · Parallel Video Processing I Processing is on uncompressed video I Uncompressed](https://reader030.fdocuments.net/reader030/viewer/2022040917/5e926dac21e1b7457e4434a3/html5/thumbnails/32.jpg)
Matlab vs. JuliaI Relatively easy to port code to JuliaI Compare serial performance of matlab vs. julia at different
image sizes.
103
104
105
106
107
101
102
103
104
105
Frame Size in Pixels
RunningTim
e
Comparison of Serial Matlab and Julia Code
MatlabJulia
I Julia is slightly slower, but comparable.I Not surprising since main processing occurs in ffts (in libfftw).
I Uses 400 GB at largest problem size, 1600x1600x300.
![Page 33: Parallel Video Processingcourses.csail.mit.edu/18.337/2012/projects/nealwadhwa_18... · 2015-08-21 · Parallel Video Processing I Processing is on uncompressed video I Uncompressed](https://reader030.fdocuments.net/reader030/viewer/2022040917/5e926dac21e1b7457e4434a3/html5/thumbnails/33.jpg)
Matlab vs. JuliaI Relatively easy to port code to JuliaI Compare serial performance of matlab vs. julia at different
image sizes.
103
104
105
106
107
101
102
103
104
105
Frame Size in Pixels
RunningTim
e
Comparison of Serial Matlab and Julia Code
MatlabJulia
I Julia is slightly slower, but comparable.I Not surprising since main processing occurs in ffts (in libfftw).I Uses 400 GB at largest problem size, 1600x1600x300.
![Page 34: Parallel Video Processingcourses.csail.mit.edu/18.337/2012/projects/nealwadhwa_18... · 2015-08-21 · Parallel Video Processing I Processing is on uncompressed video I Uncompressed](https://reader030.fdocuments.net/reader030/viewer/2022040917/5e926dac21e1b7457e4434a3/html5/thumbnails/34.jpg)
Matlab Parfor
I Parfor gives factor of two improvement when used with 12cores.
I Parfor processing on frames and on temporal processing
103
104
105
106
107
100
101
102
103
104
105
Frame Size in Pixels
RunningTim
e
Comparison of Serial Matlab and Matlab with parfor
Serial Matlabparfor Matlab
I Only 2x improvement
![Page 35: Parallel Video Processingcourses.csail.mit.edu/18.337/2012/projects/nealwadhwa_18... · 2015-08-21 · Parallel Video Processing I Processing is on uncompressed video I Uncompressed](https://reader030.fdocuments.net/reader030/viewer/2022040917/5e926dac21e1b7457e4434a3/html5/thumbnails/35.jpg)
Matlab Parfor
I Parfor gives factor of two improvement when used with 12cores.
I Parfor processing on frames and on temporal processing
103
104
105
106
107
100
101
102
103
104
105
Frame Size in Pixels
RunningTim
e
Comparison of Serial Matlab and Matlab with parfor
Serial Matlabparfor Matlab
I Only 2x improvement
![Page 36: Parallel Video Processingcourses.csail.mit.edu/18.337/2012/projects/nealwadhwa_18... · 2015-08-21 · Parallel Video Processing I Processing is on uncompressed video I Uncompressed](https://reader030.fdocuments.net/reader030/viewer/2022040917/5e926dac21e1b7457e4434a3/html5/thumbnails/36.jpg)
Matlab Parfor
I Parfor gives factor of two improvement when used with 12cores.
I Parfor processing on frames and on temporal processing
103
104
105
106
107
100
101
102
103
104
105
Frame Size in Pixels
RunningTim
e
Comparison of Serial Matlab and Matlab with parfor
Serial Matlabparfor Matlab
I Only 2x improvement
![Page 37: Parallel Video Processingcourses.csail.mit.edu/18.337/2012/projects/nealwadhwa_18... · 2015-08-21 · Parallel Video Processing I Processing is on uncompressed video I Uncompressed](https://reader030.fdocuments.net/reader030/viewer/2022040917/5e926dac21e1b7457e4434a3/html5/thumbnails/37.jpg)
Julia spawnat vs. Matlab parforI Parllelize the spatial decomposition and reconstruction in Julia
I Faster than serial Julia for large problem size.
103
104
105
106
107
100
101
102
103
104
Frame Size in Pixels
RunningTim
e
Comparison of Serial Matlab and Matlab with parfor
parfor MatlabSerial JuliaParallelize Spatial Decomposition in Julia
I The spatial decomposition is extremely fast, but reordereddata for temporal filtering is very, very slow.
I In serial code, temporal processing uses 14% of time.I In parallel code, temporal processing uses 50% of time.
![Page 38: Parallel Video Processingcourses.csail.mit.edu/18.337/2012/projects/nealwadhwa_18... · 2015-08-21 · Parallel Video Processing I Processing is on uncompressed video I Uncompressed](https://reader030.fdocuments.net/reader030/viewer/2022040917/5e926dac21e1b7457e4434a3/html5/thumbnails/38.jpg)
Julia spawnat vs. Matlab parforI Parllelize the spatial decomposition and reconstruction in JuliaI Faster than serial Julia for large problem size.
103
104
105
106
107
100
101
102
103
104
Frame Size in Pixels
RunningTim
e
Comparison of Serial Matlab and Matlab with parfor
parfor MatlabSerial JuliaParallelize Spatial Decomposition in Julia
I The spatial decomposition is extremely fast, but reordereddata for temporal filtering is very, very slow.
I In serial code, temporal processing uses 14% of time.I In parallel code, temporal processing uses 50% of time.
![Page 39: Parallel Video Processingcourses.csail.mit.edu/18.337/2012/projects/nealwadhwa_18... · 2015-08-21 · Parallel Video Processing I Processing is on uncompressed video I Uncompressed](https://reader030.fdocuments.net/reader030/viewer/2022040917/5e926dac21e1b7457e4434a3/html5/thumbnails/39.jpg)
Julia spawnat vs. Matlab parforI Parllelize the spatial decomposition and reconstruction in JuliaI Faster than serial Julia for large problem size.
103
104
105
106
107
100
101
102
103
104
Frame Size in Pixels
RunningTim
e
Comparison of Serial Matlab and Matlab with parfor
parfor MatlabSerial JuliaParallelize Spatial Decomposition in Julia
I The spatial decomposition is extremely fast, but reordereddata for temporal filtering is very, very slow.
I In serial code, temporal processing uses 14% of time.I In parallel code, temporal processing uses 50% of time.
![Page 40: Parallel Video Processingcourses.csail.mit.edu/18.337/2012/projects/nealwadhwa_18... · 2015-08-21 · Parallel Video Processing I Processing is on uncompressed video I Uncompressed](https://reader030.fdocuments.net/reader030/viewer/2022040917/5e926dac21e1b7457e4434a3/html5/thumbnails/40.jpg)
Julia spawnat vs. Matlab parforI Parllelize the spatial decomposition and reconstruction in JuliaI Faster than serial Julia for large problem size.
103
104
105
106
107
100
101
102
103
104
Frame Size in Pixels
RunningTim
e
Comparison of Serial Matlab and Matlab with parfor
parfor MatlabSerial JuliaParallelize Spatial Decomposition in Julia
I The spatial decomposition is extremely fast, but reordereddata for temporal filtering is very, very slow.
I In serial code, temporal processing uses 14% of time.
I In parallel code, temporal processing uses 50% of time.
![Page 41: Parallel Video Processingcourses.csail.mit.edu/18.337/2012/projects/nealwadhwa_18... · 2015-08-21 · Parallel Video Processing I Processing is on uncompressed video I Uncompressed](https://reader030.fdocuments.net/reader030/viewer/2022040917/5e926dac21e1b7457e4434a3/html5/thumbnails/41.jpg)
Julia spawnat vs. Matlab parforI Parllelize the spatial decomposition and reconstruction in JuliaI Faster than serial Julia for large problem size.
103
104
105
106
107
100
101
102
103
104
Frame Size in Pixels
RunningTim
e
Comparison of Serial Matlab and Matlab with parfor
parfor MatlabSerial JuliaParallelize Spatial Decomposition in Julia
I The spatial decomposition is extremely fast, but reordereddata for temporal filtering is very, very slow.
I In serial code, temporal processing uses 14% of time.I In parallel code, temporal processing uses 50% of time.
![Page 42: Parallel Video Processingcourses.csail.mit.edu/18.337/2012/projects/nealwadhwa_18... · 2015-08-21 · Parallel Video Processing I Processing is on uncompressed video I Uncompressed](https://reader030.fdocuments.net/reader030/viewer/2022040917/5e926dac21e1b7457e4434a3/html5/thumbnails/42.jpg)
Change processing to use 3-tap primal domain temporalfilter
I Makes temporal processing more local to avoidcommunication overhead.
I Store temporally close pixels on same processors
103
104
105
106
107
100
101
102
103
104
Frame Size in Pixels
RunningTim
e
Comparison of Serial Matlab and Matlab with parfor
parfor MatlabSerial Julia3Tap Filter instead of FFT
I 2.5x faster than Matlab, 5x faster than serial JuliaI Matlab parfor fails to capitalize on this
![Page 43: Parallel Video Processingcourses.csail.mit.edu/18.337/2012/projects/nealwadhwa_18... · 2015-08-21 · Parallel Video Processing I Processing is on uncompressed video I Uncompressed](https://reader030.fdocuments.net/reader030/viewer/2022040917/5e926dac21e1b7457e4434a3/html5/thumbnails/43.jpg)
Change processing to use 3-tap primal domain temporalfilter
I Makes temporal processing more local to avoidcommunication overhead.
I Store temporally close pixels on same processors
103
104
105
106
107
100
101
102
103
104
Frame Size in Pixels
RunningTim
e
Comparison of Serial Matlab and Matlab with parfor
parfor MatlabSerial Julia3Tap Filter instead of FFT
I 2.5x faster than Matlab, 5x faster than serial JuliaI Matlab parfor fails to capitalize on this
![Page 44: Parallel Video Processingcourses.csail.mit.edu/18.337/2012/projects/nealwadhwa_18... · 2015-08-21 · Parallel Video Processing I Processing is on uncompressed video I Uncompressed](https://reader030.fdocuments.net/reader030/viewer/2022040917/5e926dac21e1b7457e4434a3/html5/thumbnails/44.jpg)
Change processing to use 3-tap primal domain temporalfilter
I Makes temporal processing more local to avoidcommunication overhead.
I Store temporally close pixels on same processors
103
104
105
106
107
100
101
102
103
104
Frame Size in Pixels
RunningTim
e
Comparison of Serial Matlab and Matlab with parfor
parfor MatlabSerial Julia3Tap Filter instead of FFT
I 2.5x faster than Matlab, 5x faster than serial Julia
I Matlab parfor fails to capitalize on this
![Page 45: Parallel Video Processingcourses.csail.mit.edu/18.337/2012/projects/nealwadhwa_18... · 2015-08-21 · Parallel Video Processing I Processing is on uncompressed video I Uncompressed](https://reader030.fdocuments.net/reader030/viewer/2022040917/5e926dac21e1b7457e4434a3/html5/thumbnails/45.jpg)
Change processing to use 3-tap primal domain temporalfilter
I Makes temporal processing more local to avoidcommunication overhead.
I Store temporally close pixels on same processors
103
104
105
106
107
100
101
102
103
104
Frame Size in Pixels
RunningTim
e
Comparison of Serial Matlab and Matlab with parfor
parfor MatlabSerial Julia3Tap Filter instead of FFT
I 2.5x faster than Matlab, 5x faster than serial JuliaI Matlab parfor fails to capitalize on this