US20090263030


Transcript of US20090263030

Page 1: US20090263030

US 20090263030A1

(12) Patent Application Publication (10) Pub. No.: US 2009/0263030 A1 (19) United States

Ramasastry et al. (43) Pub. Date: Oct. 22, 2009

(54) METHODS AND APPARATUSES FOR COMPRESSING DIGITAL IMAGE DATA

(76) Inventors: Jayaram Ramasastry, Woodinville, CA (US); Partho Choudhury, Maharashtra (IN); Ramesh Prasad, Maharashtra (IN)

Correspondence Address: PERKINS COIE LLP PATENT-SEA P.O. BOX 1247 SEATTLE, WA 98111-1247 (US)

(21) Appl. No.: 12/277,207

(22) Filed: Nov. 24, 2008

Related U.S. Application Data

(63) Continuation of application No. 11/077,106, filed on Mar. 9, 2005, now Pat. No. 7,522,774.

(60) Provisional application No. 60/552,153, filed on Mar. 10, 2004, provisional application No. 60/552,356, filed on Mar. 10, 2004, provisional application No. 60/552,270, filed on Mar. 10, 2004.

Publication Classification

(51) Int. Cl. G06K 9/36 (2006.01) H04M 3/42 (2006.01)

(52) U.S. Cl. 382/232; 455/414.1

(57) ABSTRACT

Methods and apparatuses for compressing digital image data are described herein. In one embodiment, a wavelet transform is performed on each pixel of a frame to generate multiple wavelet coefficients representing each pixel in a frequency domain. The wavelet coefficients of a sub-band of the frame are iteratively encoded into a bit stream based on a target transmission rate, where the sub-band of the frame is obtained from a parent sub-band of a previous iteration. The encoded wavelet coefficients satisfy a predetermined threshold based on a predetermined algorithm while the wavelet coefficients that do not satisfy the predetermined threshold are ignored in the respective iteration. Other methods and apparatuses are also described.

[Representative figure: Server 101 (Data Acquisition, Encoder, Optional Decoder) connected through Network 102 (e.g., wired and/or wireless) to Client 103 and Client 104, each with a Decoder and an Optional Encoder.]

Page 2: US20090263030

Patent Application Publication Oct. 22, 2009 Sheet 1 of 18 US 2009/0263030 A1

Page 3: US20090263030
Page 4: US20090263030

Patent Application Publication Oct. 22, 2009 Sheet 3 of 18 US 2009/0263030 A1

Physical Layer (W-CDMA, CDMA 1.X, cdma2000, GSM-GPRS, UMTS, iDEN) (1)

Data Link Control (DLC) (2)

Streaming protocol stack (RTP, RTSP, RTCP, SDP) (4)

Third party ISO protocol stack (TCP/IP/UDP) (3)

Billing and other ancillary services (5)

Network Aware Layer (NAL) (6)

Application Layer APIs for QwikStream™, QwikVu™ and QwikText™ (7)

Constant Generation Engine (8)

Data Repository (9)

FIG. 3

Page 5: US20090263030

Patent Application Publication Oct. 22, 2009 Sheet 4 of 18 US 2009/0263030 A1

Raw YUV color frame data 401

Wavelet Transform filter bank 402

Source Encoder (ARIES) 403

Channel encoding (Tree partitioning, CRC, RCPC) 404

Compressed File (e.g., .qvx file) 405

FIG. 4A

Page 6: US20090263030

Patent Application Publication Oct. 22, 2009 Sheet 5 of 18 US 2009/0263030 A1

Compressed Image (.qvx file format)

Channel decoding (Tree merging, CRC, RCPC)

Source Decoder (I-ARIES)

Inverse Wavelet Transform

Raw YUV data

FIG. 4B

Page 7: US20090263030

Patent Application Publication Oct. 22, 2009 Sheet 6 of 18 US 2009/0263030 A1

501 Perform a wavelet transformation on each image pixel to transform the pixel into one or more coefficients in one or more wavelet maps

502 Encode each wavelet map by representing the significance, sign and bit plane information of the pixel using a single bit in a bit stream

503 Encode the significant bits into a context variable dependent upon the information represented by the bit and the location of the coefficient being coded (e.g., the probability of occurrence of a predetermined set of bits immediately preceding the current bit)

Transmit the content of the context variable as a bit stream as an output representing the encoded pixels

FIG. 5
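As a rough illustration of the first step of FIG. 5, the sketch below performs one level of a two-dimensional wavelet decomposition and splits the result into four sub-bands. It is only a sketch under stated assumptions: the figure does not name a filter, so the unnormalized Haar filter is swapped in here for simplicity, and the function names (`haar_1d`, `haar_2d`, `split_subbands`) and the quadrant naming convention (HL at top right, LH at bottom left) are hypothetical choices, not taken from the publication.

```python
def haar_1d(values):
    """One level of an unnormalized Haar transform: pair averages, then pair differences."""
    half = len(values) // 2
    averages = [(values[2 * i] + values[2 * i + 1]) / 2 for i in range(half)]
    details = [(values[2 * i] - values[2 * i + 1]) / 2 for i in range(half)]
    return averages + details


def haar_2d(frame):
    """Apply haar_1d to every row, then to every column, of a 2-D list with even sides."""
    rows = [haar_1d(row) for row in frame]
    cols = [haar_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]  # transpose back to row-major order


def split_subbands(coeffs):
    """Return the four quadrants of a one-level decomposition (naming is an assumption)."""
    h, w = len(coeffs) // 2, len(coeffs[0]) // 2
    return {
        "LL": [row[:w] for row in coeffs[:h]],  # coarse approximation
        "HL": [row[w:] for row in coeffs[:h]],  # horizontal detail
        "LH": [row[:w] for row in coeffs[h:]],  # vertical detail
        "HH": [row[w:] for row in coeffs[h:]],  # diagonal detail
    }


if __name__ == "__main__":
    bands = split_subbands(haar_2d([[1, 3], [5, 7]]))
    print(bands)
```

Applying `haar_2d` again to the LL quadrant would give the next, coarser decomposition level, which is one way to obtain the parent and child sub-bands that the later figures refer to.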

Page 8: US20090263030

Patent Application Publication Oct. 22, 2009 Sheet 7 of 18 US 2009/0263030 A1

(HH)

Fig. 6

Page 9: US20090263030

Patent Application Publication Oct. 22, 2009 Sheet 8 of 18 US 2009/0263030 A1

Page 10: US20090263030

Patent Application Publication Oct. 22, 2009 Sheet 9 of 18 US 2009/0263030 A1

801 Determine a number of iterations (nl) based on a number of quantization levels, which may be determined on the largest wavelet coefficient, and set an initial quantization threshold T=2^nl

802 Populate all insignificant pixels in IPQ, all insignificant pixels having descendants in ISQ, and all significant pixels in SPQ

803 For each type I entry of ISQ, if the entry is significant with respect to a current quantization threshold, remove the respective entry from ISQ and append it in the SPQ

804 For each type I entry of ISQ, if the entry is insignificant with respect to a current quantization threshold, remove the respective entry from ISQ and append it in the IPQ

805 If the respective type I entry includes descendants, remove the entry from the ISQ and append it at the end of ISQ as type II entry for next iteration; otherwise, the entry is purged

806 For each type II entry of ISQ, if the entry is significant with respect to a current quantization threshold, all offspring of the current ISQ entry are appended to the end of ISQ as type I entries for next iteration

807 Remove any entry in IPQ that is significant with respect to the current quantization threshold and append it in the SPQ

FIG. 8
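The threshold setup of step 801 and the IPQ-to-SPQ move of step 807 can be sketched as follows. This is a deliberately reduced illustration, not the procedure of FIG. 8 itself: it assumes the exponent is n = floor(log2(max |c|)), as in SPIHT-style coders (the figure only states that the initial threshold is a power of two), it starts with every coefficient in the IPQ, and it leaves out the ISQ and its type I / type II entries entirely. The function names are hypothetical.

```python
def initial_threshold(coeffs):
    """Step 801 under the assumption n = floor(log2(max |c|)) and T = 2**n."""
    largest = int(max(abs(c) for c in coeffs))
    if largest < 1:
        raise ValueError("all coefficients are below 1; nothing to encode")
    n = largest.bit_length() - 1  # floor(log2(largest)) for integers >= 1
    return n, 1 << n


def run_passes(coeffs):
    """Move coefficient indices from the IPQ to the SPQ while the threshold halves."""
    n, threshold = initial_threshold(coeffs)
    ipq = list(range(len(coeffs)))  # indices that are insignificant so far
    spq = []                        # indices found significant, in discovery order
    history = []
    for _ in range(n + 1):          # thresholds 2**n, 2**(n-1), ..., 1
        newly = [i for i in ipq if abs(coeffs[i]) >= threshold]
        ipq = [i for i in ipq if abs(coeffs[i]) < threshold]
        spq.extend(newly)           # step 807: significant entries leave the IPQ
        history.append((threshold, newly))
        threshold //= 2
    return history, ipq, spq


if __name__ == "__main__":
    for threshold, newly in run_passes([34, -20, 3, 9, -1, 0])[0]:
        print(threshold, newly)
```

Large coefficients are picked up in early passes and small ones late, which is what allows a stream built this way to be cut short while still carrying the most important information.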

Page 11: US20090263030

Patent Application Publication Oct. 22, 2009 Sheet 10 of 18 US 2009/0263030 A1

[Figure, number illegible in the scan. Legible block labels: Raw YUV color frame data; Wavelet Transform filter bank; ME/MC*; Bypass ME/MC* for I frames; CABAC coded motion information; Source Encoder (ARIES I/II); I-ARIES I/II; Channel encoding (Tree partitioning, CRC, RCPC); Compressed File; Optional Streaming data.]

Page 12: US20090263030

Patent Application Publication Oct. 22, 2009 Sheet 11 of 18 US 2009/0263030 A1

[Figure. Legible block labels: Raw YUV color frame data; ME/MC*; Bypass ME/MC* for I frames; CABAC coded motion information; Wavelet Transform (DWT); Transform (I-DWT); Source Encoder (ARIES I/II); I-ARIES I/II; Channel encoding (Tree partitioning, CRC, RCPC); Compressed File; Optional Streaming data.]

FIG. 9B

Page 13: US20090263030

Patent Application Publication Oct. 22, 2009 Sheet 12 of 18 US 2009/0263030 A1

Streaming data (Optional)

Compressed Video (.qsx file format)

CABAC coded motion information

Channel decoding (Tree merging, CRC, RCPC)

Source Decoder (I-ARIES I/II)

Frame Buffer

MC*

Bypass MC* for I frames

Inverse Wavelet Transform

Raw YUV data

FIG. 10A

Page 14: US20090263030

Patent Application Publication Oct. 22, 2009 Sheet 13 of 18 US 2009/0263030 A1

Streaming data (Optional)

Compressed Video (.qsx file format)

CABAC coded motion information

Channel decoding (Tree merging, CRC, RCPC)

Source Decoder (I-ARIES I/II)

Frame Buffer

MC*

Bypass MC* for I frames

Inverse Wavelet Transform

Raw YUV data

FIG. 10B

Page 15: US20090263030

Patent Application Publication Oct. 22, 2009 Sheet 14 of 18 US 2009/0263030 A1

1101 Identify a reference frame (e.g., the first frame 11 or an I-frame)

1102 Perform a ME/MC on the coarsest subbands as parent subbands of a current frame other than the I-frame with respect to the identified reference frame to generate one or more motion vectors for the coarsest subbands

1103 Estimate the spatial shifting of pixels of child subbands using the motion vectors of the parent subbands to determine a search area of the child subbands

1104 Perform a ME/MC for the child subbands to determine the motion vectors of the child subbands

More child subbands? (Yes: repeat for the next child subbands)

1105 Perform compression on the predicted/compensated data into compressed data (e.g., see FIGS. 5 and 8)

FIG. 11
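The coarse-to-fine search of steps 1102 to 1104 can be illustrated with a small block-matching sketch. Several things here are assumptions rather than details from the figure: the matching cost is the sum of absolute differences (SAD), the child sub-band is taken to have twice the resolution of its parent so the parent vector is doubled, and the names `sad`, `block_search`, and `child_motion_vector` are hypothetical. Sub-bands are plain 2-D lists.

```python
def sad(ref, cur, rx, ry, cx, cy, size):
    """Sum of absolute differences between a block of ref and a block of cur."""
    return sum(
        abs(ref[ry + j][rx + i] - cur[cy + j][cx + i])
        for j in range(size)
        for i in range(size)
    )


def block_search(ref, cur, cx, cy, size, center=(0, 0), radius=2):
    """Return the (dx, dy) inside the window around `center` with the lowest SAD."""
    height, width = len(ref), len(ref[0])
    best_cost, best_mv = None, None
    for dy in range(center[1] - radius, center[1] + radius + 1):
        for dx in range(center[0] - radius, center[0] + radius + 1):
            rx, ry = cx + dx, cy + dy
            if not (0 <= rx <= width - size and 0 <= ry <= height - size):
                continue  # candidate block would fall outside the reference band
            cost = sad(ref, cur, rx, ry, cx, cy, size)
            if best_cost is None or cost < best_cost:
                best_cost, best_mv = cost, (dx, dy)
    return best_mv


def child_motion_vector(parent_mv, ref_child, cur_child, cx, cy, size, radius=1):
    """Double the parent vector (assumed dyadic sub-bands), then refine it locally."""
    predicted = (2 * parent_mv[0], 2 * parent_mv[1])
    return block_search(ref_child, cur_child, cx, cy, size, predicted, radius)
```

Because the child search only covers a small window around the scaled parent vector, its cost grows with the refinement radius rather than with the full range of possible motion, which is the point of determining the search area from the parent sub-band.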

Page 16: US20090263030

Patent Application Publication Oct. 22, 2009 Sheet 15 of 18 US 2009/0263030 A1

[Figure. Legible labels: Boundary of the Search Area for refinement MVs; k = level of sub-band; o = orientation (LL, HL, LH, HH); Block Neighborhood; Refinement Vector for level k, orientation o; Motion Vector.]

Page 17: US20090263030

Patent Application Publication Oct. 22, 2009 Sheet 16 of 18 US 2009/0263030 A1

[Figure. Legible label: Integer Motion Prediction.]

Page 18: US20090263030
Page 19: US20090263030

Patent Application Publication Oct. 22, 2009 Sheet 18 of 18 US 2009/0263030 A1

[Figure legend: Block currently being tested; Matching block; Motion Vector (identical colors denote MVs of the same block); Displaced MV to translate matching block to the relative position of macroblock currently being tested.]

Fig. 15: OBMC when current block being tested is in 4MV mode

Page 20: US20090263030

US 2009/0263030 A1

METHODS AND APPARATUSES FOR COMPRESSING DIGITAL IMAGE DATA

[0001] This application claims the benefit of U.S. Provisional Application No. 60/552,153, filed Mar. 10, 2004, U.S. Provisional Application No. 60/552,356, filed Mar. 10, 2004, and U.S. Provisional Application No. 60/552,270, filed Mar. 10, 2004. The above-identified applications are hereby incorporated by reference in their entirety.

FIELD OF THE INVENTION

[0002] The present invention relates generally to multimedia applications. More particularly, this invention relates to compressing digital image data.

BACKGROUND OF THE INVENTION

[0003] A variety of systems have been developed over the past decade for the encoding and decoding of audio/video data for transmission over wireline and/or wireless communication systems. Most systems in this category employ standard compression/transmission techniques, such as, for example, the ITU-T Rec. H.264 (also referred to as H.264) and ISO/IEC Rec. 14496-10 AVC (also referred to as MPEG4) standards. However, due to their inherent generality, they lack the specific qualities needed for seamless implementation on low power, low complexity systems (such as hand held devices including, but not restricted to, personal digital assistants and smart phones) over noisy, low bit rate wireless channels.

[0004] Due to the likely business models rapidly emerging in the wireless market, in which cost incurred by the consumer is directly proportional to the actual volume of transmitted data, and also due to the limited bandwidth, processing capability, storage capacity and battery power, efficiency and speed in compression of audio/video data to be transmitted is a major factor in the eventual success of any such multimedia content delivery system. Most systems in use today are retrofitted versions of identical systems used on higher end desktop workstations. Unlike desktop systems, where error control is not a critical issue due to the inherent reliability of cable LAN/WAN data transmission, and bandwidth may be assumed to be almost unlimited, transmission over limited capacity wireless networks requires integration of such systems that may leverage suitable processing and error-control technologies to achieve the level of fidelity expected of a commercially viable multimedia compression and transmission system.

[0005] Conventional video compression engines, or codecs, can be broadly classified into two categories. One class of coding strategies, known as a download-and-play (D&P) profile, not only requires the entire file to be downloaded onto the local memory before playback, leading to a large latency time (depending on the available bandwidth and the actual file size), but also makes stringent demands on the amount of buffer memory to be made available for the downloaded payload. Even with the more sophisticated streaming profile, the physical limitations on current generation transmission equipment at the physical layer force service providers to incorporate a pseudo-streaming capability, which requires an initial period of latency (at the beginning of transmission), and continuous buffering henceforth, which imposes a strain on the limited processing capabilities of the hand-held processor. Most commercial compression solutions in the market today do not possess a progressive transmission capability, which means that transmission is possible only until the last integral frame, packet or bit before bandwidth drops below the minimum threshold. In case of video codecs, if the connection breaks before the transmission of the current frame, this frame is lost forever.

[0006] Another drawback in conventional video compression codecs is the introduction of blocking artifacts due to the block-based coding schemes used in most codecs. Apart from the degradation in subjective visual quality, such systems suffer from poor performance due to bottlenecks introduced by the additional de-blocking filters. Yet another drawback is that, due to the limitations in the word size of the computing platform, the coded coefficients are truncated to an approximate value. This is especially prominent along object boundaries, where Gibbs' phenomenon leads to the generation of a visual phenomenon known as mosquito noise. Due to this, the blurring along the object boundaries becomes more prominent, leading to degradation in overall frame quality.

[0007] Additionally, the local nature of motion prediction in some codecs introduces motion-induced artifacts, which cannot be easily smoothed by a simple filtering operation. Such problems arise especially in cases of fast motion clips and systems where the frame rate is below that of natural video (e.g., 25 or 30 fps non-interlaced video). In either case, the temporal redundancy between two consecutive frames is extremely low (since much of the motion is lost in between the frames), leading to poorer tracking of the motion across frames. This effect is cumulative in nature, especially for a longer group of frames (GoF).

[0008] Furthermore, mobile end-user devices are constrained by low processing power and storage capacity. Due to the limitations on the silicon footprint, most mobile and hand-held systems in the market have to time-share the resources of the central processing unit (microcontroller or RISC/CISC processor) to perform all their DSP, control and communication tasks, with little or no provision for a dedicated processor to take the video/audio processing load off the central processor. Moreover, most general-purpose central processors lack the unique architecture needed for optimal DSP performance. Therefore, a mobile video-codec design must have minimal client-end complexity while maintaining consistency on the efficiency and robustness front.

SUMMARY OF THE INVENTION

[0009] Methods and apparatuses for compressing digital image data are described herein. In one embodiment, a wavelet transform is performed on each pixel of a frame to generate multiple wavelet coefficients representing each pixel in a frequency domain. The wavelet coefficients of a sub-band of the frame are iteratively encoded into a bit stream based on a target transmission rate, where the sub-band of the frame is obtained from a parent sub-band of a previous iteration. The encoded wavelet coefficients satisfy a predetermined threshold based on a predetermined algorithm while the wavelet coefficients that do not satisfy the predetermined threshold are ignored in the respective iteration.

[0010] Other features of the present invention will be apparent from the accompanying drawings and from the detailed description which follows.
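The idea in paragraph [0009] of encoding iteratively against a target rate can be sketched as a bit budget applied to threshold passes. The sketch below is an illustration under assumptions, not the claimed method: the budget `max_bits` stands in for the target transmission rate, the threshold starts at the largest power of two not exceeding the largest coefficient magnitude and halves each iteration, only significance and sign bits are emitted (no refinement bits, no sub-band tree, no entropy coding), and the name `encode_to_budget` is hypothetical.

```python
def encode_to_budget(coeffs, max_bits):
    """Emit significance and sign bits, coarsest threshold first, within max_bits."""
    largest = int(max(abs(c) for c in coeffs))
    threshold = 1 << (largest.bit_length() - 1) if largest >= 1 else 0
    significant = [False] * len(coeffs)
    bits = []
    while threshold >= 1:
        for i, c in enumerate(coeffs):
            if significant[i]:
                continue  # already coded in an earlier iteration
            if len(bits) + 2 > max_bits:
                return bits  # budget reached: stop, keeping a decodable prefix
            if abs(c) >= threshold:
                significant[i] = True
                bits.append(1)                  # significant at this threshold
                bits.append(1 if c < 0 else 0)  # sign bit
            else:
                bits.append(0)                  # ignored in this iteration
        threshold >>= 1
    return bits


if __name__ == "__main__":
    full = encode_to_budget([34, -20, 3, 9], max_bits=1000)
    short = encode_to_budget([34, -20, 3, 9], max_bits=9)
    print(full)
    print(short)
```

A stream cut off by a smaller budget is a prefix of the stream produced with a larger one, so a receiver can reconstruct a coarser approximation from whatever part arrives.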

BRIEF DESCRIPTION OF THE DRAWINGS

[0011] The present invention is illustrated by way of example and not limitation in the figures of the accompanying drawings in which like references indicate similar elements.
