US 20090263030A1
(12) Patent Application Publication (10) Pub. No.: US 2009/0263030 A1 (19) United States
Ramasastry et al. (43) Pub. Date: Oct. 22, 2009
(54) METHODS AND APPARATUSES FOR COMPRESSING DIGITAL IMAGE DATA
(76) Inventors: Jayaram Ramasastry, Woodinville, CA (US); Partho Choudhury, Maharashtra (IN); Ramesh Prasad, Maharashtra (IN)
Correspondence Address: PERKINS COIE LLP PATENT-SEA P.O. BOX 1247 SEATTLE, WA 98111-1247 (US)
(21) Appl. No.: 12/277,207
(22) Filed: Nov. 24, 2008
Related U.S. Application Data
(63) Continuation of application No. 11/077,106, filed on Mar. 9, 2005, now Pat. No. 7,522,774.
(60) Provisional application No. 60/552,153, filed on Mar. 10, 2004, provisional application No. 60/552,356, filed on Mar. 10, 2004, provisional application No. 60/552,270, filed on Mar. 10, 2004.
Publication Classification
(51) Int. Cl. G06K 9/36 (2006.01) H04M 3/42 (2006.01)
(52) U.S. Cl. 382/232; 455/414.1
(57) ABSTRACT
Methods and apparatuses for compressing digital image data are described herein. In one embodiment, a wavelet transform is performed on each pixel of a frame to generate multiple wavelet coefficients representing each pixel in a frequency domain. The wavelet coefficients of a sub-band of the frame are iteratively encoded into a bit stream based on a target transmission rate, where the sub-band of the frame is obtained from a parent sub-band of a previous iteration. The encoded wavelet coefficients satisfy a predetermined threshold based on a predetermined algorithm while the wavelet coefficients that do not satisfy the predetermined threshold are ignored in the respective iteration. Other methods and apparatuses are also described.
[Representative figure, block labels as legible: Server 101 (Encoder, Data Acquisition, Optional Decoder); Network 102 (e.g., wired and/or wireless); Client 103 (Optional Encoder, Decoder); Client 104 (Optional Encoder, Decoder)]
Patent Application Publication Oct. 22, 2009 Sheet 1 of 18 US 2009/0263030 A1
(1) Physical Layer (W-CDMA, CDMA 1.X, cdma2000, GSM-GPRS, UMTS, iDEN)
(2) Data Link Control (DLC)
(3) Third party ISO protocol stack (TCP/IP/UDP)
(4) Streaming protocol stack (RTP, RTSP, RTCP, SDP)
(5) Billing and other ancillary services
(6) Network Abstraction Layer (NAL)
(7) Application Layer APIs for QwikStreamTM, QwikVuTM and QwikTexiM
(8) Constant Generation Engine
(9) Data Repository
FIG. 3
401 Raw YUV color frame data
402 Wavelet Transform filter bank
403 Source Encoder (ARIES)
404 Channel encoding (Tree partitioning, CRC, RCPC)
405 Compressed File (e.g., .qvx file)
FIG. 4A
Compressed Image (.qvx file format)
Channel decoding (Tree merging, CRC, RCPC)
Source Decoder (I-ARIES)
Inverse Wavelet Transform
Raw YUV data
FIG. 4B
501 Perform a wavelet transformation on each image pixel to transform the pixel into one or more coefficients in one or more wavelet maps
502 Encode each wavelet map by representing the significance, sign and bit plane information of the pixel using a single bit in a bit stream
503 Encode the significant bits into a context variable dependent upon the information represented by the bit and its location of the coefficient being coded (e.g., the probability of occurrence of a predetermined set of bits immediately preceding the current bit)
Transmit the content of the context variable as a bit stream as an output representing the encoded pixels
FIG. 5
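The context-dependent coding of block 503 can be illustrated with a small adaptive model. The sketch below is an illustration only and is not the coder described in this application: the function name `context_probabilities`, the choice of the two preceding bits as the context, and the Laplace-smoothed counts are all assumptions made for brevity. For each bit it estimates the probability that the bit is 1 given the bits immediately preceding it, which is the kind of estimate an arithmetic coder would consume.

```python
from collections import defaultdict


def context_probabilities(bits, order=2):
    """Estimate P(bit == 1) for each bit from the `order` bits that precede it.

    Hypothetical helper for illustration; counts start at [1, 1] per context
    (Laplace smoothing) and are updated after each bit is seen.
    """
    counts = defaultdict(lambda: [1, 1])  # per context: [zeros seen, ones seen]
    estimates = []
    for i, bit in enumerate(bits):
        # The context is the (up to) `order` bits immediately before position i.
        context = tuple(bits[max(0, i - order):i])
        zeros, ones = counts[context]
        estimates.append(ones / (zeros + ones))
        counts[context][bit] += 1  # adapt the model with the bit just coded
    return estimates
```

On a run of identical bits the estimate for the repeated context moves away from 0.5, which is what allows an entropy coder to spend fewer bits on predictable symbols.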
[Sub-band diagram; only the sub-band label (HH) is legible]
FIG. 6
801 Determine a number of iterations (nl) based on a number of quantization levels, which may be determined on the largest wavelet coefficient, and set an initial quantization threshold T=2^nl
802 Populate all insignificant pixels in IPQ, all insignificant pixels having descendants in ISQ, and all significant pixels in SPQ
803 For each type I entry of ISQ, if the entry is significant with respect to a current quantization threshold, remove the respective entry from ISQ and append it in the SPQ
804 For each type I entry of ISQ, if the entry is insignificant with respect to a current quantization threshold, remove the respective entry from ISQ and append it in the IPQ
805 If the respective type I entry includes descendants, remove the entry from the ISQ and append it at the end of ISQ as type II entry for next iteration; otherwise, the entry is purged
806 For each type II entry of ISQ, if the entry is significant with respect to a current quantization threshold, all offspring of the current ISQ entry are appended to the end of ISQ as type I entries for next iteration
807 Remove any entry in IPQ that is significant with respect to the current quantization threshold and append it in the SPQ
FIG. 8
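The threshold handling in blocks 801 and 807 can be sketched in a few lines. This is a deliberately reduced illustration, not the procedure of FIG. 8: it keeps only a flat list standing in for the IPQ and one standing in for the SPQ, omits the ISQ and all descendant handling, and assumes that the initial exponent is the floor of the base-2 logarithm of the largest coefficient magnitude and that the threshold is halved on each iteration. The function name `significance_passes` is hypothetical.

```python
import math


def significance_passes(coeffs):
    """Return [(threshold, indices that first become significant), ...].

    Simplified sketch: no ISQ, no tree of descendants. Assumes the initial
    threshold is 2**n with n = floor(log2(max |coefficient|)) and that the
    threshold is halved after every pass.
    """
    largest = max(abs(c) for c in coeffs)
    if largest < 1:
        return []  # nothing reaches the smallest threshold of 1
    n = int(math.log2(largest))  # 2**n <= largest < 2**(n + 1)
    insignificant = list(range(len(coeffs)))  # stand-in for the IPQ
    significant = []                          # stand-in for the SPQ
    history = []
    threshold = 2 ** n
    while threshold >= 1:
        # Entries at or above the current threshold move to the significant list.
        promoted = [i for i in insignificant if abs(coeffs[i]) >= threshold]
        insignificant = [i for i in insignificant if abs(coeffs[i]) < threshold]
        significant.extend(promoted)
        history.append((threshold, promoted))
        threshold //= 2
    return history
```

Stopping the loop early yields a coarser approximation built from the largest coefficients only, which is the property that makes this family of coders suitable for a target transmission rate.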
[Block diagram; figure number not legible. Block labels as legible: Raw YUV color frame data; Wavelet Transform filter bank; ME/MC*; Bypass ME/MC* for I frames; CABAC coded motion information; Source Encoder (ARIES I/II); I-ARIES I/II; Channel encoding (Tree partitioning, CRC, RCPC); Compressed File; Optional Streaming data]
[Block diagram. Block labels as legible: Raw YUV color frame data; MC; ME/MC*; Bypass ME/MC* for I frames; CABAC coded motion information; Wavelet Transform (DWT); Transform (I-DWT); I-ARIES I/II; Source Encoder (ARIES I/II); Channel encoding (Tree partitioning, CRC, RCPC); Compressed File; Optional Streaming data]
FIG. 9B
[Block diagram. Block labels as legible: Streaming data (Optional); Compressed Video (.qsx file format); CABAC coded motion information; Channel decoding (Tree merging, CRC, RCPC); Source Decoder (I-ARIES I/II); Frame Buffer; MC*; Bypass MC* for I frames; Inverse Wavelet Transform; Raw YUV data]
FIG. 10A
[Block diagram. Block labels as legible: Streaming data (Optional); Compressed Video (.qsx file format); CABAC coded motion information; Channel decoding (Tree merging, CRC, RCPC); Source Decoder (I-ARIES I/II); Frame Buffer; MC*; Bypass MC* for I frames; Inverse Wavelet Transform; Raw YUV data]
FIG. 10B
1101 Identify a reference frame (e.g., the first frame or an I-frame)
1102 Perform a ME/MC on the coarsest subbands as parent subbands of a current frame other than the I-frame with respect to the identified reference frame to generate one or more motion vectors for the coarsest subbands
1103 Estimate the spatial shifting of pixels of child subbands using the motion vectors of the parent subbands to determine a search area of the child subbands
1104 Perform a ME/MC for the child subbands to determine the motion vectors of the child subbands
More child subbands? (decision block; the "Yes" branch is labeled in the figure)
1105 Perform compression on the predicted/compensated data into compressed data (e.g., see FIGS. 5 and 8)
FIG. 11
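Blocks 1102 through 1104 describe a coarse-to-fine search in which a parent sub-band's motion vector limits the search area in the child sub-band. The sketch below illustrates that idea under stated assumptions and is not the method claimed here: it uses exhaustive block matching with a sum-of-absolute-differences cost, assumes the child sub-band has twice the resolution of its parent so the parent vector is doubled, and refines within a window of one sample. The names `sad`, `search` and `refine_child` are hypothetical, and a vector (dy, dx) points from a block in the current frame to its match in the reference frame.

```python
def sad(ref, cur, top, left, dy, dx, size):
    """Sum of absolute differences between a block of cur and the displaced block of ref."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[top + y][left + x] - ref[top + y + dy][left + x + dx])
    return total


def search(ref, cur, top, left, size, center, radius):
    """Best (dy, dx) within `radius` of `center`; displacements leaving the frame are skipped."""
    height, width = len(ref), len(ref[0])
    best, best_cost = None, None
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            if top + dy < 0 or top + dy + size > height:
                continue
            if left + dx < 0 or left + dx + size > width:
                continue
            cost = sad(ref, cur, top, left, dy, dx, size)
            if best_cost is None or cost < best_cost:
                best, best_cost = (dy, dx), cost
    return best


def refine_child(parent_mv, ref_child, cur_child, top, left, size, radius=1):
    """Scale the parent vector to the child resolution (assumed factor 2), then refine locally."""
    predicted = (2 * parent_mv[0], 2 * parent_mv[1])
    return search(ref_child, cur_child, top, left, size, predicted, radius)
```

Because the refinement window is small, the child search evaluates only (2 * radius + 1) squared candidates instead of the full search range, which is the computational saving a hierarchical scheme of this kind is after.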
[Motion-search diagram; figure label not fully legible. Legend entries as legible: Boundary of the Search Area for refinement MVs; k = level of sub-band; o = orientation (LL, HL, LH, HH); Block Neighborhood; Refinement Vector for level k, orientation o; Motion Vector]
[Motion-prediction diagram; only the panel title "Integer Motion Prediction" is legible]
[Legend entries as legible: Block currently being tested; Matching block; Motion Vector (identical colors denote MVs of the same block); Displaced MV to translate matching block to the relative position of macroblock currently being tested]
FIG. 15: OBMC when current block being tested is in 4MV mode
METHODS AND APPARATUSES FOR COMPRESSING DIGITAL IMAGE DATA
[0001] This application claims the benefit of U.S. Provisional Application No. 60/552,153, filed Mar. 10, 2004, U.S. Provisional Application No. 60/552,356, filed Mar. 10, 2004, and U.S. Provisional Application No. 60/552,270, filed Mar. 10, 2004. The above-identified applications are hereby incorporated by reference in their entirety.
FIELD OF THE INVENTION
[0002] The present invention relates generally to multimedia applications. More particularly, this invention relates to compressing digital image data.
BACKGROUND OF THE INVENTION
[0003] A variety of systems have been developed for the encoding and decoding of audio/video data for transmission over wireline and/or wireless communication systems over the past decade. Most systems in this category employ standard compression/transmission techniques, such as, for example, the ITU-T Rec. H.264 (also referred to as H.264) and ISO/IEC Rec. 14496-10 AVC (also referred to as MPEG4) standards. However, due to their inherent generality, they lack the specific qualities needed for seamless implementation on low power, low complexity systems (such as hand held devices including, but not restricted to, personal digital assistants and smart phones) over noisy, low bit rate wireless channels.
[0004] Due to the likely business models rapidly emerging in the wireless market, in which cost incurred by the consumer is directly proportional to the actual volume of transmitted data, and also due to the limited bandwidth, processing capability, storage capacity and battery power, efficiency and speed in compression of audio/video data to be transmitted is a major factor in the eventual success of any such multimedia content delivery system. Most systems in use today are retrofitted versions of identical systems used on higher end desktop workstations. Unlike desktop systems, where error control is not a critical issue due to the inherent reliability of cable LAN/WAN data transmission, and bandwidth may be assumed to be almost unlimited, transmission over limited capacity wireless networks requires integration of such systems that may leverage suitable processing and error-control technologies to achieve the level of fidelity expected of a commercially viable multimedia compression and transmission system.
[0005] Conventional video compression engines, or codecs, can be broadly classified into two categories. One class of coding strategies, known as a download-and-play (D&P) profile, not only requires the entire file to be downloaded onto the local memory before playback, leading to a large latency time (depending on the available bandwidth and the actual file size), but also makes stringent demands on the amount of buffer memory to be made available for the downloaded payload. Even with the more sophisticated streaming profile, the current physical limitations on current generation transmission equipment at the physical layer force service providers to incorporate a pseudo-streaming capability, which requires an initial period of latency (at the beginning of transmission), and continuous buffering henceforth, which imposes a strain on the limited processing capabilities of the hand-held processor. Most commercial compression solutions in the market today do not possess a progressive transmission capability, which means that transmission is possible only until the last integral frame, packet or bit before bandwidth drops below the minimum threshold. In case of video codecs, if the connection breaks before the transmission of the current frame, this frame is lost forever.
[0006] Another drawback in conventional video compression codecs is the introduction of blocking artifacts due to the block-based coding schemes used in most codecs. Apart from the degradation in subjective visual quality, such systems suffer from poor performance due to bottlenecks introduced by the additional de-blocking filters. Yet another drawback is that, due to the limitations in the word size of the computing platform, the coded coefficients are truncated to an approximate value. This is especially prominent along object boundaries, where Gibbs' phenomenon leads to the generation of a visual phenomenon known as mosquito noise. Due to this, the blurring along the object boundaries becomes more prominent, leading to degradation in overall frame quality.
[0007] Additionally, the local nature of motion prediction in some codecs introduces motion-induced artifacts, which cannot be easily smoothed by a simple filtering operation. Such problems arise especially in cases of fast motion clips and systems where the frame rate is below that of natural video (e.g., 25 or 30 fps non-interlaced video). In either case, the temporal redundancy between two consecutive frames is extremely low (since much of the motion is lost in between the frames itself), leading to poorer tracking of the motion across frames. This effect is cumulative in nature, especially for a longer group of frames (GoF).
[0008] Furthermore, mobile end-user devices are constrained by low processing power and storage capacity. Due to the limitations on the silicon footprint, most mobile and hand-held systems in the market have to time-share the resources of the central processing unit (microcontroller or RISC/CISC processor) to perform all its DSP, control and communication tasks, with little or no provisions for a dedicated processor to take the video/audio processing load off the central processor. Moreover, most general-purpose central processors lack the unique architecture needed for optimal DSP performance. Therefore, a mobile video-codec design must have minimal client-end complexity while maintaining consistency on the efficiency and robustness front.
SUMMARY OF THE INVENTION
[0009] Methods and apparatuses for compressing digital image data are described herein. In one embodiment, a wavelet transform is performed on each pixel of a frame to generate multiple wavelet coefficients representing each pixel in a frequency domain. The wavelet coefficients of a sub-band of the frame are iteratively encoded into a bit stream based on a target transmission rate, where the sub-band of the frame is obtained from a parent sub-band of a previous iteration. The encoded wavelet coefficients satisfy a predetermined threshold based on a predetermined algorithm while the wavelet coefficients that do not satisfy the predetermined threshold are ignored in the respective iteration.
[0010] Other features of the present invention will be apparent from the accompanying drawings and from the detailed description which follows.
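As an illustration of the wavelet step summarized in paragraph [0009], the sketch below applies one level of a two-dimensional Haar transform to a small frame and splits the result into four sub-bands. It is a minimal sketch under assumptions, not the filter bank of this application: the Haar filter, the unnormalized average/half-difference form, the function names `haar_1d`, `haar_2d` and `split_subbands`, and the quadrant naming (LL top-left, HL top-right, LH bottom-left, HH bottom-right) are all choices made here for clarity.

```python
def haar_1d(values):
    """One Haar level on an even-length list: pairwise averages, then pairwise half-differences."""
    half = len(values) // 2
    averages = [(values[2 * i] + values[2 * i + 1]) / 2 for i in range(half)]
    details = [(values[2 * i] - values[2 * i + 1]) / 2 for i in range(half)]
    return averages + details


def haar_2d(frame):
    """Apply haar_1d to every row, then to every column, of a frame with even width and height."""
    rows = [haar_1d(row) for row in frame]
    cols = [haar_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]  # transpose back to row-major order


def split_subbands(coeffs):
    """Split a one-level coefficient map into its four quadrants (naming is an assumed convention)."""
    h, w = len(coeffs) // 2, len(coeffs[0]) // 2
    return {
        "LL": [row[:w] for row in coeffs[:h]],  # low-pass in both directions
        "HL": [row[w:] for row in coeffs[:h]],  # horizontal detail
        "LH": [row[:w] for row in coeffs[h:]],  # vertical detail
        "HH": [row[w:] for row in coeffs[h:]],  # diagonal detail
    }
```

Applying `haar_2d` again to the LL quadrant would produce the next, coarser level, which gives the parent and child sub-band relationship that the summary refers to.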
BRIEF DESCRIPTION OF THE DRAWINGS
[0011] The present invention is illustrated by way of example and not limitation in the figures of the accompanying drawings in which like references indicate similar elements.