ACM Multimedia October 20, 2009

Manipulating Lossless Video in the Compressed Domain William Thies 1 , Steven Hall 2 , Saman Amarasinghe 2 1 Microsoft Research India 2 Massachusetts Institute of Technology ACM Multimedia October 20, 2009


Page 1: ACM Multimedia October 20, 2009

Manipulating Lossless Video in the Compressed Domain

William Thies1, Steven Hall2, Saman Amarasinghe2

1 Microsoft Research India
2 Massachusetts Institute of Technology

ACM Multimedia

October 20, 2009

Page 2: ACM Multimedia October 20, 2009

Processing in the Compressed Domain

• Multimedia archives are growing rapidly
– Monsters vs. Aliens production: 100 TB
– Facebook photos: 400 TB
– YouTube: 600 TB

• How to analyze or modify the data?

[Diagram: Typical practice: Compressed Input → Uncompress → Process → Recompress → Compressed Output. Compressed-domain transformation: Compressed Input → Process → Compressed Output. Annotation: lossless prior to distribution.]

Page 3: ACM Multimedia October 20, 2009

Prior Work: Focus on Lossy Formats

• DCT-based spatial compression (JPEG, MPEG stills)
– Resizing [Dugad & Ahuja 2001] [Mukherjee & Mitra 2002]
– Edge detection [Shen & Sethi 1996]
– Image segmentation [Feng & Jiang 2003]
– Shearing and rotating inner blocks [Shen & Sethi 1998]
– Linear combinations of pixels [Smith & Rowe 1996]

• DCT-based temporal compression (MPEG video)
– Captioning [Nang, Kwon, & Hong 2000]
– Reversal [Vasudev 1998]
– Distortion detection [Dorai, Ratha, & Bolle 2000]
– Transcoding [Acharya & Smith 1998]

• Almost no work on lossless formats
– Transpose and rotation of black/white images [Shoji 1995; Misra et al. 1999]
– Pattern matching in compressed text [Farach & Thorup 1998; Navarro 2003]
– Modifying pitch and playback of audio [Levine 1998]

Page 4: ACM Multimedia October 20, 2009

Prior Work: Focus on Lossy Formats

Our Focus: Regular Processing of LZ77-Compressed Data Streams

Page 5: ACM Multimedia October 20, 2009

Example

Transformation: to lowercase

Input: O O O O L A L A L A

Output: o o o o l a l a l a

Page 6: ACM Multimedia October 20, 2009

Example

Input: O O O O L A L A L A

Compressed Input: [animation frame: compression of the input in progress]

Output: o o o o l a l a l a

Page 7: ACM Multimedia October 20, 2009

Example

Input: O O O O L A L A L A

Compressed Input: [animation frame: the trailing L A L A is being replaced by the token 4 2]

Output: o o o o l a l a l a

Page 8: ACM Multimedia October 20, 2009

Example

Input: O O O O L A L A L A

Compressed Input: O O O O L A, followed by a “Repeat Token” (count 4, distance 2)

Output: o o o o l a l a l a
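The repeat token can be made concrete with a small decoder. This is a minimal sketch, not the authors' code. Representing a literal as a one-character string and a repeat token as a Python tuple `(distance, count)` is an assumption made for illustration, as is the function name `decode`.

```python
def decode(tokens):
    """Expand literals and (distance, count) repeat tokens into a flat list."""
    out = []
    for tok in tokens:
        if isinstance(tok, tuple):
            distance, count = tok
            # Copy one item at a time so a repeat may overlap its own output.
            for _ in range(count):
                out.append(out[-distance])
        else:
            out.append(tok)
    return out


# Fully compressed form of the slide's input (hypothetical encoding).
compressed = ["O", (1, 3), "L", "A", (2, 4)]
print("".join(decode(compressed)))  # prints OOOOLALALA
```

Copying item by item matters: a token such as (distance 1, count 3) refers to data that the token itself is still producing.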

Page 9: ACM Multimedia October 20, 2009

Example

Input: O O O O L A L A L A

Compressed Input: [animation frame: the run of O's is being replaced by a second “Repeat Token” (count 3, distance 1)]

Output: o o o o l a l a l a

Page 10: ACM Multimedia October 20, 2009

Example

Input: O O O O L A L A L A

Compressed Input: O, (count 3, distance 1), L A, (count 4, distance 2)

Output: o o o o l a l a l a

Page 11: ACM Multimedia October 20, 2009

Example

Input: O O O O L A L A L A

Compressed Input: O, (count 3, distance 1), L A, (count 4, distance 2)

Compressed Output: o, (count 3, distance 1), l a, (count 4, distance 2)

Output: o o o o l a l a l a

Page 12: ACM Multimedia October 20, 2009

Example

Input: O O O O L A L A L A

Compressed Input: O, (count 3, distance 1), L A, (count 4, distance 2)

Compressed Domain Transformation

Compressed Output: o, (count 3, distance 1), l a, (count 4, distance 2)

Output: o o o o l a l a l a

Page 13: ACM Multimedia October 20, 2009

Example

Compressed Input: O, (count 3, distance 1), L A, (count 4, distance 2)

Compressed Domain Transformation

Compressed Output: o, (count 3, distance 1), l a, (count 4, distance 2)
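For a filter that maps one item to one item, the example suggests the compressed-domain version only has to touch the literals. The sketch below assumes a hypothetical token encoding (one-character strings for literals, `(distance, count)` tuples for repeat tokens); the function names are illustrative and not taken from the talk.

```python
def decode(tokens):
    """Expand literals and (distance, count) repeat tokens into a flat list."""
    out = []
    for tok in tokens:
        if isinstance(tok, tuple):
            distance, count = tok
            for _ in range(count):
                out.append(out[-distance])
        else:
            out.append(tok)
    return out


def to_lowercase_compressed(tokens):
    """Apply a 1-to-1 filter to literals only; repeat tokens pass through unchanged."""
    return [tok if isinstance(tok, tuple) else tok.lower() for tok in tokens]


compressed_input = ["O", (1, 3), "L", "A", (2, 4)]
compressed_output = to_lowercase_compressed(compressed_input)
print("".join(decode(compressed_output)))  # prints oooolalala
```

Only three literals are processed instead of ten characters, which is where a speedup proportional to the compression ratio would come from.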

Page 14: ACM Multimedia October 20, 2009

Our Contributions

• Handle the general case
– Produce and consume more than one data item
– Split and join data streams

• Implement in a compiler
– Programmer thinks in terms of uncompressed data
– Compiler translates to work on compressed data
– Relies on StreamIt programming language

• Evaluate on video processing tasks
– 12 videos in Apple Animation format
– Adjust colors or overlay two videos
– Speedups proportional to compression ratio (median 15x)

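[Figure: the Compressed Domain Transformation from the preceding example, mapping the compressed input (O L A with repeat tokens) to the compressed output (o l a with the same repeat tokens).]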

Page 15: ACM Multimedia October 20, 2009

In This Talk

• StreamIt Language

• Compressed Domain Transformation

• Experimental Evaluation

Page 16: ACM Multimedia October 20, 2009

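The StreamIt Language

void->void pipeline FMRadio(float freq1, float freq2, int N) {
    add AtoD();
    add FMDemod();
    // Duplicate the demodulated signal into N parallel filter branches.
    add splitjoin {
        split duplicate;
        for (int i=0; i<N; i++) {
            add pipeline {
                add LowPassFilter(freq1 + i*(freq2-freq1)/N);
                add HighPassFilter(freq2 + i*(freq2-freq1)/N);
            }
        }
        join roundrobin();
    }
    // Combine the branch outputs and send the result to the speaker.
    add Adder();
    add Speaker();
}

[Diagram: stream graph for FMRadio: AtoD → FMDemod → Duplicate splitter → three parallel pipelines (LPF1 → HPF1, LPF2 → HPF2, LPF3 → HPF3) → RoundRobin joiner → Adder → Speaker.]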

Page 17: ACM Multimedia October 20, 2009

The StreamIt Language

• Applications
– DES and Serpent [PLDI 05]
– MPEG-2 [IPDPS 06]
– SAR, DSP benchmarks, JPEG, …

• Programmability
– StreamIt Language (CC 02)
– Teleport Messaging (PPOPP 05)
– Programming Environment in Eclipse (P-PHEC 05)

• Domain Specific Optimizations
– Linear Analysis and Optimization (PLDI 03)
– Optimizations for bit streaming (PLDI 05)
– Linear State Space Analysis (CASES 05)

• Architecture Specific Optimizations
– Compiling for Communication-Exposed Architectures (ASPLOS 02 & 06, dasCMP 07)
– Phased Scheduling (LCTES 03)
– Cache Aware Optimization (LCTES 05)
– Load-Balanced Rendering (Graphics Hardware 05)

• Migrating Legacy Code to a Stream Representation
– Using a Dynamic Analysis (MICRO 07)

[Diagram: the FMRadio stream graph: AtoD → FMDemod → Duplicate splitter → (LPF1 → HPF1, LPF2 → HPF2, LPF3 → HPF3) → RoundRobin joiner → Adder → Speaker.]

Page 18: ACM Multimedia October 20, 2009

Language Primitives

• Filter: pop N, push M (example: pop 2, push 1)

• Splitter: roundrobin(N,M) (example: roundrobin(1,1))

• Joiner: roundrobin(N,M) (example: roundrobin(2,2))

Model of computation also known as cyclo-static dataflow
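The three primitives can be mimicked on ordinary lists to make the pop/push rates and round-robin weights concrete. This is a minimal sketch in Python rather than StreamIt; the function names `run_filter`, `roundrobin_split`, and `roundrobin_join` are hypothetical, and real StreamIt programs operate on unbounded streams rather than finite lists.

```python
def run_filter(items, work, pop, push):
    """Fire `work` on consecutive windows of `pop` items; each firing yields `push` items."""
    out = []
    for i in range(0, len(items) - pop + 1, pop):
        produced = work(items[i:i + pop])
        assert len(produced) == push  # static rates, as in cyclo-static dataflow
        out.extend(produced)
    return out


def roundrobin_split(items, weights):
    """Deal items to the outputs in turn, weights[k] items at a time to output k."""
    outs = [[] for _ in weights]
    i = 0
    while i < len(items):
        for out, w in zip(outs, weights):
            out.extend(items[i:i + w])
            i += w
    return outs


def roundrobin_join(inputs, weights):
    """Collect weights[k] items from input k in turn until some input runs short."""
    out = []
    cursors = [0] * len(inputs)
    while all(c + w <= len(s) for c, s, w in zip(cursors, inputs, weights)):
        for k, (s, w) in enumerate(zip(inputs, weights)):
            out.extend(s[cursors[k]:cursors[k] + w])
            cursors[k] += w
    return out


print(run_filter([1, 2, 3, 4], lambda w: [w[0] + w[1]], pop=2, push=1))
print(roundrobin_split(list("abcdefgh"), (2, 2)))
```

Because every rate is fixed, a compiler can reason about exactly which input positions feed which output positions, which is what the compressed-domain rules in the talk rely on.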

Page 19: ACM Multimedia October 20, 2009

Example: Video Compositing

[Diagram: Source 1 and Source 2 feed a roundrobin(1,1) joiner; the joined stream enters the MultiplyPixels filter (pop 2, push 1), which produces the Output.]
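The compositing graph can be sketched on uncompressed data as a round-robin interleave followed by a 2-to-1 filter. The code below is illustrative only: the 8-bit pixel range and the `a * b // 255` blend are assumptions (the slide names the filter MultiplyPixels but does not give its arithmetic), and the function names are hypothetical.

```python
def multiply_pixels(window):
    """2-to-1 filter: pop two pixels, push their product scaled to 0..255 (assumed 8-bit)."""
    a, b = window
    return [a * b // 255]


def composite(source1, source2):
    """roundrobin(1,1) joiner followed by the MultiplyPixels filter."""
    # Joiner: take one pixel from each source in turn.
    joined = [p for pair in zip(source1, source2) for p in pair]
    # Filter: consume the joined stream two items at a time.
    out = []
    for i in range(0, len(joined), 2):
        out.extend(multiply_pixels(joined[i:i + 2]))
    return out


print(composite([255, 128, 0], [255, 255, 77]))  # prints [255, 128, 0]
```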

Page 20: ACM Multimedia October 20, 2009

In This Talk

• StreamIt Language

• Compressed Domain Transformation

• Experimental Evaluation

Page 21: ACM Multimedia October 20, 2009

Transforming Windows of Data

Input: O O O O L A L A L A

HyphenatePairs

Output: O O – O O – L A – L A – L A –

Page 22: ACM Multimedia October 20, 2009

Transforming Windows of Data

Input: O O O O L A L A L A

HyphenatePairs

Output: O O – O O – L A – L A – L A –

Page 23: ACM Multimedia October 20, 2009

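Transforming Windows of Data

Input: O O O O L A L A L A

Compressed Input: O, (count 3, distance 1), L A, (count 4, distance 2)

Output: O O – O O – L A – L A – L A –

Compressed Output: [partially built in this frame] L A –, (count 6, distance 3)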

Page 24: ACM Multimedia October 20, 2009

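Transforming Windows of Data

Input: O O O O L A L A L A

Compressed Input: O, (count 3, distance 1), L A, (count 4, distance 2)

Output: O O – O O – L A – L A – L A –

Compressed Output: [partially built in this frame] L A –, (count 6, distance 3)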

Page 25: ACM Multimedia October 20, 2009

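Transforming Windows of Data

Input: O O O O L A L A L A

Compressed Input: O, (count 3, distance 1), L A, (count 4, distance 2)

Coarsened, Expanded: O O, (count 2, distance 2), L A, (count 4, distance 2)

Compressed Output: O O –, (count 3, distance 3), L A –, (count 6, distance 3)

Output: O O – O O – L A – L A – L A –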

Page 26: ACM Multimedia October 20, 2009

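General Case: Filters

[Diagram: a repeat token with distance D and count N arrives at a filter that pops I items and pushes O items per execution.]

Coarsen: the token becomes distance D’, count N’, where
D’ = LCM(D, I)
N’ = N – (D’ – D)

Translate: the filter emits a repeat token with distance D’·O/I and count N’’·O/I, where
N’’ = N’ – N’ % I

The remaining N’ % I items are passed to the filter as ordinary items.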
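The coarsen and translate steps can be sketched as follows. This is an illustrative Python reading of the slide, not the authors' compiler output. The token encoding (strings for literals, `(distance, count)` tuples for repeat tokens) and all function names are assumptions. Two simplifications are my own: the sketch expands a few extra items when the input is not aligned to a window boundary, and it keeps a fully decoded input history, so it demonstrates the token arithmetic rather than the speedup.

```python
from math import gcd


def decode(tokens):
    """Expand literals and (distance, count) repeat tokens into a flat list."""
    out = []
    for tok in tokens:
        if isinstance(tok, tuple):
            distance, count = tok
            for _ in range(count):
                out.append(out[-distance])
        else:
            out.append(tok)
    return out


def apply_filter_compressed(tokens, work, pop, push):
    """Run a stateless filter (pop I, push O) directly on a token stream."""
    hist = []     # decoded input so far (kept whole only to keep the sketch short)
    pending = []  # input items waiting for a full window
    out = []

    def feed(item):
        hist.append(item)
        pending.append(item)
        if len(pending) == pop:
            out.extend(work(list(pending)))
            pending.clear()

    for tok in tokens:
        if not isinstance(tok, tuple):
            feed(tok)
            continue
        dist, count = tok
        coarse = dist * pop // gcd(dist, pop)     # D' = LCM(D, I)
        expand = coarse - dist                    # items to expand: D' - D
        expand += -(len(pending) + expand) % pop  # assumption: also align to a window
        expand = min(expand, count)
        for _ in range(expand):
            feed(hist[-dist])
        rest = count - expand                     # N'
        whole = rest - rest % pop                 # N'' = N' - N' % I
        if whole:
            out.append((coarse * push // pop, whole * push // pop))
            for _ in range(whole):                # keep history consistent
                hist.append(hist[-dist])
        for _ in range(rest - whole):             # leftover N' % I items
            feed(hist[-dist])
    return out


def hyphenate_pairs(window):
    return [window[0], window[1], "-"]


tokens = ["O", (1, 3), "L", "A", (2, 4)]
print(apply_filter_compressed(tokens, hyphenate_pairs, pop=2, push=3))
```

On the HyphenatePairs example this yields literals O O - followed by a repeat of distance 3 and count 3, then L A - followed by a repeat of distance 3 and count 6, matching the compressed output on the preceding slide (with an ASCII hyphen standing in for the dash).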

Page 27: ACM Multimedia October 20, 2009

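Splitting Streams

[Diagram: roundrobin(1,1) splitter.]

Input: L A L A L A L A L A

Compressed Input: L A, (count 8, distance 2)

Output: L L L L L and A A A A A

Compressed Output: L, (count 4, distance 1) and A, (count 4, distance 1)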

Page 28: ACM Multimedia October 20, 2009

Splitting Streams

[Diagram: roundrobin(2,2) splitter applied to the input L A L A L A L A L A; the compressed input shown in this frame begins L A.]

Page 29: ACM Multimedia October 20, 2009

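Splitting Streams

[Diagram: roundrobin(2,2) splitter.]

Coarsened, Expanded Input: L A L A, (count 6, distance 4)

Compressed Output: L A, (count 4, distance 2) and L A, (count 2, distance 2)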

Page 30: ACM Multimedia October 20, 2009

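Splitting and Joining: Transpose

[Figure: animation frame of a transpose expressed with a splitter and a joiner over a stream of O and X items; the weights shown are 1, 1 and 4, 4.]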

Page 31: ACM Multimedia October 20, 2009

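Splitting and Joining: Transpose

[Figure: animation frame of a transpose expressed with a splitter and a joiner over a stream of O and X items; the weights shown are 1, 1 and 4, 4.]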

Page 32: ACM Multimedia October 20, 2009

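Splitting and Joining: Transpose

[Figure: animation frame of a transpose expressed with a splitter and a joiner over a stream of O and X items; the weights shown are 1, 1 and 4, 4.]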

Page 33: ACM Multimedia October 20, 2009

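Splitting and Joining: Transpose

[Figure: animation frame of the same transpose on compressed data; the weights shown are 1, 1 and 4, 4, and the repeat-token values visible in this frame are 3 1 and 12.]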

Page 34: ACM Multimedia October 20, 2009

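Splitting and Joining: Transpose

[Figure: final animation frame of the same transpose on compressed data; the weights shown are 1, 1 and 4, 4, and the repeat-token values visible in this frame are 3 1 and 12.]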

Page 35: ACM Multimedia October 20, 2009

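General Case: Joiners

[Diagram: two input streams, carrying repeat tokens (distance D1, count N1) and (distance D2, count N2), enter a roundrobin joiner with weights W1 and W2.]

If D1 % W1 = 0 and D2 % W2 = 0 and D1/W1 = D2/W2, the joiner emits a repeat token with distance D1·(W1+W2)/W1 and count N’.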

Page 36: ACM Multimedia October 20, 2009

In This Talk

• StreamIt Language

• Compressed Domain Transformation

• Experimental Evaluation

Page 37: ACM Multimedia October 20, 2009

Implementation

• Implemented subset of transformations in StreamIt
– User can change graph connectivity + filter functions

• Supported file format: Apple Animation (part of .MOV)
– Standard format for interchange of lossless video
– Compression: run-length encoding within a line + difference encoding between frames

• Emit executable plugins for MEncoder and Blender
– Allows integration with standard video editing workflow

[Diagram: the two supported graph shapes: a 1-to-1 filter, and a 1-to-1 joiner with a 2-to-1 filter.]

Page 38: ACM Multimedia October 20, 2009

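Experimental Methodology

• Evaluated on 12 videos drawn from Internet video, computer animation, and stock digital television content

• Two classes of transformations:
1. Color adjustment: inverse, brightness, contrast
2. Composite transformations: alpha-under, multiply

[Figure: sample frames illustrating alpha-under (+) and multiply (x) compositing, with the stream graph for each class: a 1-to-1 filter, and a roundrobin(1,1) joiner feeding a 2-to-1 filter.]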

Page 39: ACM Multimedia October 20, 2009

Results: Execution Time

[Plot: Speedup (log scale, 1x to 1000x) versus Compression Factor (log scale, 1x to 1000x) for Brightness, Contrast, Inverse, and Compositing. Axis labels shown: “Compression Factor” and “Compression Factor Following Re-compression”.]

Color Adjustment: 2.5x to 471x (median 17x)

Compositing: 1.1x to 32x (median 6.6x)

Compression factor was low (≤1.1x) for one of the source videos

Page 40: ACM Multimedia October 20, 2009

Results: File Bloat

[Plot: File Bloat Relative to Recompression (0x to 6x) versus Compression Factor (log scale, 1x to 1000x) for Brightness, Contrast, Inverse, and Compositing. Axis labels shown: “Compression Factor” and “Compression Factor Following Re-compression”. Annotations: masked out areas not re-compressed; saturated colors not re-compressed.]

Page 41: ACM Multimedia October 20, 2009

Opportunity: Ignoring “Dead” Data

• Some pixels in composite frames do not depend on both input frames
– Example: digital television mask (a low-performance case)

• If two data streams are multiplied, and one of them is repeatedly zero, then the repeat can be copied to the output (regardless of the values in the other stream)
– We expect this would fix performance of our outlier cases
– Requires pattern matching on stream graph

[Diagram: multiply compositing (x) expressed as a roundrobin(1,1) joiner feeding a 2-to-1 filter.]

Page 42: ACM Multimedia October 20, 2009

Extension to Other File Formats

• High-efficiency mappings
– Flic Video
– Microsoft RLE
– Targa (with run-length encoding)

• Medium-efficiency mappings (re-arranges data by color or by byte)
– Open EXR
– Planar RGB

• Low-efficiency mappings (performs Huffman coding prior to LZ77)
– ZIP
– GZIP
– PNG

Page 43: ACM Multimedia October 20, 2009

Conclusions

• New method for direct processing of lossless-encoded data streams
– Relies on LZ77 compression and stream programming model
– Supports operations on windows of data
– Supports splitting, joining, and reordering data

• Preliminary implementation in an automatic compiler
– Write program on uncompressed data, run on compressed data

• Good speedups in the context of video processing
– 15x speedup (median) on color adjustment and compositing
– Across 12 videos in Apple Animation format
– May prove useful as more content is authored in lossless formats

• Scope for extending technique, finding new applications

Page 44: ACM Multimedia October 20, 2009

Extra Slides

Page 45: ACM Multimedia October 20, 2009

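General Case: Splitters

[Diagram: a repeat token with distance D and count N arrives at a roundrobin splitter with weights U and V.]

Coarsen: the token becomes distance D’, count N’, where
D’ = LCM(D, U+V)
N’ = N – (D’ – D)

Translate: the splitter emits one repeat token on each output, with distance D’·U/(U+V) and count N’’·U/(U+V) on the first output, and distance D’·V/(U+V) and count N’’·V/(U+V) on the second, where
N’’ = N’ – N’ % (U+V)

The remaining N’ % (U+V) items are passed to the splitter as ordinary items.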
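The splitter rule can be sketched in the same style as the filter rule. This is an illustrative Python reading of the slide, not the authors' implementation. The `(distance, count)` tuple encoding and the function names are assumptions. The sketch also expands extra items when the input position is not aligned to a multiple of U+V, and it keeps a fully decoded input history for simplicity.

```python
from math import gcd


def decode(tokens):
    """Expand literals and (distance, count) repeat tokens into a flat list."""
    out = []
    for tok in tokens:
        if isinstance(tok, tuple):
            distance, count = tok
            for _ in range(count):
                out.append(out[-distance])
        else:
            out.append(tok)
    return out


def split_compressed(tokens, u, v):
    """roundrobin(u, v) splitter applied directly to a token stream."""
    period = u + v
    hist = []        # decoded input so far (kept whole only to keep the sketch short)
    outs = ([], [])  # token streams for the two outputs
    pos = 0          # number of input items consumed

    def feed(item):
        nonlocal pos
        hist.append(item)
        outs[0 if pos % period < u else 1].append(item)
        pos += 1

    for tok in tokens:
        if not isinstance(tok, tuple):
            feed(tok)
            continue
        dist, count = tok
        coarse = dist * period // gcd(dist, period)  # D' = LCM(D, U+V)
        expand = coarse - dist                       # items to expand: D' - D
        expand += -(pos + expand) % period           # assumption: also align to a cycle
        expand = min(expand, count)
        for _ in range(expand):
            feed(hist[-dist])
        rest = count - expand                        # N'
        whole = rest - rest % period                 # N'' = N' - N' % (U+V)
        if whole:
            cycles_back = coarse // period
            cycles = whole // period
            if u:
                outs[0].append((cycles_back * u, cycles * u))
            if v:
                outs[1].append((cycles_back * v, cycles * v))
            for _ in range(whole):                   # keep history consistent
                hist.append(hist[-dist])
            pos += whole
        for _ in range(rest - whole):                # leftover N' % (U+V) items
            feed(hist[-dist])
    return outs


first, second = split_compressed(["L", "A", (2, 8)], 2, 2)
print(first, second)
```

For the roundrobin(2,2) example from the talk, the first output decodes to L A L A L A and the second to L A L A. The sketch leaves the two leftover items on the first output as literals, whereas the slide folds them into a single repeat token (count 4, distance 2); the decoded streams are the same.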