IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 16, NO. 10, OCTOBER 2006 1245

Effective Fades and Flashlight Detection Based on Accumulating Histogram Difference

Xueming Qian, Guizhong Liu, and Rui Su

Abstract—Scene change detection is a fundamental step in automatic video indexing, browsing and retrieval. Fade in and fade out are two kinds of gradual scene changes that are more difficult to detect than abrupt changes. The salient character of the flashlight effect is a luminance change caused by the abrupt appearance or disappearance of an illumination source. Shot boundary detection performance is unsatisfactory for video sequences containing flashlights if no flashlight discrimination strategy is adopted. In this paper, an effective fades and flashlight detection method based on the accumulating histogram difference (AHD) is proposed for both compressed and uncompressed videos. The fades detection method is derived from the mathematical models of fades: the AHDs of all pairs of consecutive frames within fade transitions can be classified into six cases. The flashlight detection method is based on the AHD and energy variation characteristics: the AHD and energy variation for the starting and ending frames of a flashlight follow certain regularities, which can also be expressed as cases. The fades and flashlight detection problems are thus converted into case matching problems. Experimental results on several test video sequences with different bit rates show the effectiveness of the proposed AHD-based fades and flashlight detection method.

Index Terms—Accumulating histogram difference (AHD), dc image, fade in/out, flashlight, gradual scene change detection, video.

I. INTRODUCTION

DUE TO rapid improvements in compression technology, the expansion of low-cost storage media, and the explosion of audio, video, graphics, and image information over the internet, digital video libraries are becoming a reality. Searching a huge multimedia database for the video sections that interest us is not easy. The demand for fast video access and browsing has also grown in areas such as video conferencing, multimedia database browsing systems, remote video-based education, and video-on-demand systems [1]–[3]. The fundamental task of video browsing, indexing, abstraction and retrieval is parsing video into shots, where a shot is defined as a sequence of frames captured from a single camera operation [1].

Manuscript received August 10, 2005; revised January 19, 2006 and June 15, 2006. This work was supported in part by the National Natural Science Foundation of China (NSFC) under Project 60572045, in part by the Ministry of Education of China Doctorate Program under Project 20050698033, and in part by Microsoft Research Asia. This paper was recommended by Associate Editor I. Ahmad.

The authors are with the School of Electronics and Information Engineering, Xi’an Jiaotong University, Xi’an 710049, China (e-mail: [email protected]; [email protected]; [email protected]).

Color versions of Figs. 1–3 and 5–7 are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TCSVT.2006.881858

In the past decade, a great deal of research has been done on shot boundary detection [4]. Existing methods can be classified into edge-matching-based methods [5]–[8], [30], pixel-based methods, and histogram-based methods [9], [10], [19], [28], [29]. Since videos are often stored in compressed form, many shot boundary detection methods use features extracted directly from the compressed domain. These algorithms include macroblock (MB) type based methods [11], [12], motion vector (MV) based methods [13], [14], and dc image based methods [15], [16].

There are two types of transitions between shots: abrupt shot transitions (also referred to as cuts) and gradual shot transitions such as fades, dissolves and wipes. A cut is an instantaneous change from one shot to another. During a fade transition, a shot gradually appears from or disappears to a solid color image [19]. A dissolve occurs when the first shot fades out while the second shot fades in. The flashlight effect is caused by abrupt luminance variation [20].

The cut is the simplest and most common way of moving from one shot to the next. Abrupt transitions are comparatively easy to detect because the two frames at a shot boundary are completely uncorrelated. However, cut detection performance on certain videos, such as news sequences, is unsatisfactory when no method for discriminating flashlights from cut frames is adopted. The main goal of flashlight determination is to distinguish flashlight effects from cuts once a suspected abrupt-change frame pair has been detected [30]. Moreover, flashlight frames themselves can indicate highlight events in video sequences, which can facilitate content-based video browsing, indexing and retrieval [32]. Gradual transitions are more difficult to detect because the difference between adjacent frames belonging to two successive shots is trivial. Gradual transitions are often used at scene boundaries to emphasize a change in the content of the sequence [4]. Fades are used to denote transitions in time, and fade in/out combinations can also indicate the relative mood and pace between shots [23]. Hence, detecting fades in video sequences is useful for semantic-based video analysis, indexing and retrieval.

In this paper we propose an algorithm for fades and flashlight detection based on the accumulating histogram difference (AHD). The rest of the paper is organized as follows. In Section II, related work on fades and flashlight detection is briefly reviewed. In Section III, an AHD-based fades and flashlight detection approach is derived from the mathematical models of these effects. For compressed videos, the AHD of the dc image of the luminance component is used instead of that of the original image. Comparisons with other fades and flashlight detection methods, together with discussions, are presented in Section IV. Finally, conclusions are drawn in Section V.

1051-8215/$20.00 © 2006 IEEE


II. RELATED WORK ON FADES AND FLASHLIGHT DETECTION

Fades and flashlight detection is important for content- and semantic-based video analysis, indexing and retrieval. Recently, a great deal of work has been done in these fields; however, many problems remain unsolved. A brief review of related work on fades and flashlight detection is given below.

A. Related Work for Fades Detection

During a fade transition, a shot gradually appears from or disappears to a solid color image. The dynamic ranges and contrasts of the frames during a fade change with certain regularities, which can be used to identify fades.

The standard deviations of the luminance frames during fades exhibit a certain pattern, which Lienhart [33] uses to determine fade transitions. Similarly, the fades detection algorithm proposed by Alattar et al. [17] analyzes the first and second derivatives of the luminance variance curve. However, this method is sensitive to motion and noise. Fernando et al. [18] make full use of the statistical features of both the luminance and chrominance components to detect fades. However, this method is not very effective when the solid color is very close to the mean of the original sequence. Truong et al. [23] improve the fades detection algorithm by first detecting the solid color frames and then checking all the spikes in the second derivative curve.

In [21], candidate fades are identified by determining the solid color frames during fade transitions using an inter-frame correlation coefficient, because the correlation coefficient of two consecutive frames suddenly falls to zero when a solid color frame appears. The candidate fade regions are then validated using a luminance variance constraint.

The fades detection algorithm used in [20] is based on the visual rhythm of histogram (VRH). Fade regions can be identified by detecting the inclined edges in the VRH image. This algorithm is ineffective for fades whose solid color frames are not solid black or white; in those circumstances, there are two crossing edges rather than one inclined edge in the VRH image.

The fades detection method proposed in [5] and [6] by Zabih et al. compares the entering and exiting edge pixels of the edge images. Fades are detected based on the observation that during a fade in there are many entering pixels and few exiting pixels, while during a fade out there are many exiting pixels and few entering pixels.

In [22], hidden Markov models based on the histogram difference and a normalized directional moment feature of wavelet-transformed images are used for abrupt and gradual scene change detection. The moment feature of a fade pair has a “W” shape with two valleys.

The fades detection method proposed by Nam et al. [24] fits the fade transition with a B-spline curve and uses the goodness of fit to determine the presence of a fade.

In [19], fades are detected by the variation of the dynamic range during the fade: the horizontal span of the histogram should decrease for a fade out and increase for a fade in. For robust detection, each luminance frame is partitioned into four equal-sized regions and the histogram spans of these regions are considered separately, to identify the starting (ending) frame of a fade out (in).

Existing fades detection methods have two disadvantages. The first is that they may rely heavily on the solid color frames during fade transitions; however, in real video sequences many fades contain only near-solid-color frames, or no solid color frames at all. The second is the assumption that the fade covers the entire frame: fades occurring in only part of the image, such as captions fading in from (or fading out to) a stationary background, may not be detected. To improve fades detection performance, an AHD-based method is proposed below.

B. Related Work for Flashlight Detection

The presence of flashlights or lightning changes the luminance and chrominance abruptly across the sequence [30]. Under this condition, traditional abrupt scene change detection methods that rely on feature differences between two adjacent frames are unsatisfactory [30].

In previous work, flashlight and cut discrimination methods have been proposed by studying the behaviors of flashlight scenes and cuts, under the assumption that flashlight effects last only a few frames and that the statistical characteristics of the frame features return to their original state after the flashlight [30]. Yeo et al. [15] proposed a flashlight detection method that analyzes the magnitude of the difference of two adjacent dc images: two consecutive sharp peaks appear that are sufficiently higher than the averaged value within a sliding window. Nakajima et al. [26], [27] use the high correlation between the frames before and after a flashlight to distinguish flashlights from other scene changes. In [29], Zhang et al. used a twin-threshold technique to detect cuts and flashlights: first, an adaptive threshold is selected based on the intensity histogram difference; then a “flash model” and a “cut model” are used to distinguish abrupt changes caused by flashlights from those caused by cuts.

Pei and Chu [11] proposed a flashlight detection method that analyzes the influence of the flashlight effect on the number of intra-coded MBs, which indicates the occurrence of a flashlight. However, this method is ineffective in detecting local and weak flashlight frames, in which the number of intra-coded MBs does not vary significantly.

In [20] and [28], luminance-related characteristics of flashlight effects are used to identify flashlights. Truong et al. [28] proposed a flashlight detection method that can detect consecutive flashlight frames by determining substantial global luminance changes, under the condition that a flashlight starts with a large luminance increase, is followed by a period of constant high luminance, and ends with a luminance decrease. Due to the brightness characteristics of the flashlight effect, long white vertical stripes appear in the visual rhythm (VR) image [20], and the flashlight can be identified by finding the corresponding vertical stripes. However, this algorithm assumes that the flashlight is strong enough; when the flashlight is weak, local, or accompanied by heavy shadows, there are no clear white stripes in the VR image.


The feature-based flashlight detection method presented in [7] assumes that the contours of the dominant objects still exist in the flashlight frames; flashlights are detected by edge matching and filtering. To improve the detection performance, edge direction and matched edges are taken into consideration during edge matching [30]. This method is effective and robust in detecting strong flashlights, even when they are located at shot boundaries. However, its performance degrades for local flashlight frames and flashlight frames with heavy shadows because of falsely generated edges.

All in all, existing flashlight detection algorithms have limitations in identifying local or weak flashlights, or flashlights with heavy shadows. Hence, by analyzing the mathematical model and studying the behavior of the flashlight effect, an AHD-based flashlight detection method is proposed in this paper.

III. PROPOSED FADES AND FLASHLIGHT DETECTION ALGORITHM

In the following, the characteristic behavior of the AHD for flashlight and fade transitions in a video sequence is derived from the commonly used mathematical models.

A. Mathematical Models for Fades

In video editing and production, proportions of two picture signals are simply added together so that the two pictures appear to merge on the output screen. Very often this process is used to move from picture A to picture B. In such a case, the proportions of the two signals are such that the contribution of picture B changes from zero to 100% as the contribution of picture A changes from 100% to zero. This is called a dissolve [19]. When picture A is a solid color image the transition is called a fade in, and when picture B is a solid color image it is known as a fade out [5]. Mathematically, fade in and fade out can be described by (1) and (2), respectively [19], shown at the bottom of the page, where C denotes the gray value of the solid (monochrome) color, f(x, y, n) is the resultant video signal, A(x, y) is picture A, B(x, y) is picture B, n is the frame index, L is the length of the fading sequence, N is the length of the total sequence, and α is a non-decreasing function that controls the fade rate. It is often assumed that α changes linearly from 0 to 1 during the fade, so that α(n) = n/L. As fade out detection is similar to fade in detection, we focus on fade in detection in the remainder of this section.

B. Mathematical Model for Flashlight

Flashlight is a point light source whose illuminating energy is perceptible to human eyes and video cameras. The illumination sources in a scene containing flashlight effects can be classified into two types: the scene illumination sources and the flashlight source. According to the additive rule of illumination sources and the relationship between illumination models and video signals [31], a general mathematical model for a flashlight scene can be expressed as follows:

f(x, y, n) = f_S(x, y, n) + f_F(x, y, n),  FLS <= n <= FLE;  f(x, y, n) = f_S(x, y, n), otherwise     (3)

where FLS and FLE are the starting and ending frames of a flashlight, respectively, f_S(x, y, n) is produced by the scene illumination source, and f_F(x, y, n) is generated by the flashlight source. According to the non-negativity of illumination sources, we have f_F(x, y, n) >= 0.

C. Accumulating Histogram Difference Based Fades Detection

From the mathematical model for fade in, as shown in (1), the difference of two consecutive frames n+1 and n is expressed as follows:

f(x, y, n+1) - f(x, y, n) = (B(x, y) - C)(α(n+1) - α(n)) = (B(x, y) - C)/L     (4)

where 0 <= n < L.

Let h_n(j), j = 0, 1, ..., J, denote the normalized intensity histogram of the image f(x, y, n), where j and J are the histogram bin index and the maximum histogram bin, respectively. (Here, each histogram is quantized into 64 bins for noise suppression and fast calculation, namely J = 63. Detailed discussions on the number of histogram bins and the resulting performance are given in Section IV-F.) The histogram difference HD_n(j) of the two consecutive frames n and n+1 is

HD_n(j) = h_{n+1}(j) - h_n(j),  j = 0, 1, ..., J.     (5)

Correspondingly, the AHD at frame n is represented by

AHD_n(j) = Σ_{i=0}^{j} HD_n(i).     (6)

f(x, y, n) = C (1 - α(n)) + B(x, y) α(n), 0 <= n <= L;  f(x, y, n) = B(x, y, n), L < n <= N     (1)

f(x, y, n) = A(x, y, n), 0 <= n < N - L;  f(x, y, n) = A(x, y) (1 - α(n - N + L)) + C α(n - N + L), N - L <= n <= N     (2)
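As a sanity check of the fade in model, the short synthesis below follows the prose description of (1) with a linear α(n) = n/L. This is a sketch under our reading of the model, not the authors' code; the function name and the toy picture are illustrative assumptions.

```python
import numpy as np

def fade_in(B, C, L):
    # Sketch of fade in model (1): frame n is C*(1 - alpha) + B*alpha with
    # linear alpha = n/L; B is the fully revealed picture, C the solid-color
    # gray value, L the fade length. Illustrative only.
    frames = []
    for n in range(L + 1):
        alpha = n / L
        frames.append(C * (1.0 - alpha) + B.astype(np.float64) * alpha)
    return frames

# Toy example: fade in from black (C = 0) to a horizontal gradient picture.
B = np.tile(np.arange(64, dtype=np.uint8) * 4, (48, 1))
seq = fade_in(B, C=0, L=10)
assert np.allclose(seq[0], 0)    # first frame is the solid color
assert np.allclose(seq[-1], B)   # last frame is picture B
```

A fade out can be synthesized symmetrically by swapping the roles of the picture and the solid color, as in (2).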


For an arbitrary frame n, by the properties of the normalized histogram we have

Σ_{j=0}^{J} h_n(j) = 1  and  Σ_{j=0}^{J} h_{n+1}(j) = 1.     (7)

Thus

AHD_n(J) = Σ_{j=0}^{J} (h_{n+1}(j) - h_n(j)) = 0.     (8)
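The histogram, histogram difference and AHD of (5)–(8) can be computed in a few lines. The sketch below assumes 64 bins over the 8-bit gray range; the helper names are ours, not the paper's.

```python
import numpy as np

def norm_hist(img, bins=64):
    # Normalized intensity histogram h_n(j), j = 0..J with J = bins - 1.
    counts, _ = np.histogram(img, bins=bins, range=(0, 256))
    return counts / img.size

def ahd(frame_n, frame_n1, bins=64):
    # HD_n(j) = h_{n+1}(j) - h_n(j) as in (5); the running sum over the
    # bins gives AHD_n(j) as in (6).
    hd = norm_hist(frame_n1, bins) - norm_hist(frame_n, bins)
    return np.cumsum(hd)

# Both histograms sum to 1, so the last AHD bin is zero, cf. (7)-(8).
a = np.random.default_rng(0).integers(0, 256, (48, 64), dtype=np.uint8)
b = np.clip(a.astype(int) + 30, 0, 255).astype(np.uint8)  # brightened frame
assert abs(ahd(a, b)[-1]) < 1e-9
assert np.all(ahd(a, b) <= 1e-9)  # uniform brightening: AHD <= 0 in every bin
```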

From (4), it is observed that, for fade in, the fade type can be exclusively identified by the solid color gray value C and the ending frame B. In the remainder of this section, our fades detection algorithm is described in detail according to the relationships in (4)–(6). For simplicity, we divide the pixels of a frame during a fade transition into three sets, denoted S1, S2 and S3, according to the relationship of B(x, y) to C. Note that S1, S2 and S3 remain unchanged throughout the fade, because they are determined by the solid color frame A (or B) and the clearest frame B (or A) of the transition. For fade in, they are defined as follows:

S1 = {(x, y) : B(x, y) > C},  S2 = {(x, y) : B(x, y) < C},  S3 = {(x, y) : B(x, y) = C}.     (9)

The gray values of the pixels in S1 keep increasing during the fade in. Let h1_n(j) and p1 denote the normalized histogram of the pixels in S1 and the fraction of the image these pixels occupy, respectively. During a fade in, h1 extends from gray level C toward gray levels larger than C. Similarly, let p2, h2_n(j) and p3, h3_n(j) denote the fractions and histograms of the pixels in the sets S2 and S3, respectively. From the above we have the following relationships:

h_n(j) = p1 h1_n(j) + p2 h2_n(j) + p3 h3_n(j),  p1 + p2 + p3 = 1.     (10)

Let G_max and G_min denote the maximum and minimum gray values of the ending frame B in the case of fade in, or of the starting frame A in the case of fade out. In this paper, fade in is identified by analyzing the relationship among C, G_min and G_max. In brief, three cases can be identified, and the AHD characteristics of each case are analyzed in detail as follows.

• Case 1: C <= G_min. In this case, C is the lowest gray value during the fade in transition, for example a fade in from a solid black frame. During the fade in, the gray level of each pixel increases. Let Gmin(n) and Gmax(n) denote the minimum and maximum gray values of frame n. For any two consecutive frames n and n+1 belonging to a fade in transition, their minimum and maximum gray values satisfy Gmin(n+1) >= Gmin(n) and Gmax(n+1) >= Gmax(n). The span of the histogram expands as the frame number increases during the fade in. Therefore, at the leftmost side of the histogram difference HD_n it holds that HD_n(j) <= 0, and at the rightmost side it holds that HD_n(j) >= 0. So the AHD has the following character:

AHD_n(j) <= 0,  for all j = 0, 1, ..., J.     (11)

We now turn to a proof of this property. By virtue of (4), in this case C <= B(x, y), so f(x, y, n+1) - f(x, y, n) = (B(x, y) - C)/L >= 0 for all (x, y). Thus, for every pixel we can always find a nonnegative quantity δ(x, y) = (B(x, y) - C)/L such that

f(x, y, n+1) = f(x, y, n) + δ(x, y).     (12)

Since no pixel of frame n+1 is darker than the corresponding pixel of frame n, for an arbitrary bin j the number of pixels with gray values in bins 0, ..., j cannot increase, and the cumulative histograms satisfy

Σ_{i=0}^{j} h_{n+1}(i) <= Σ_{i=0}^{j} h_n(i),  for all j.     (13)

Thus, we get AHD_n(j) = Σ_{i=0}^{j} (h_{n+1}(i) - h_n(i)) <= 0.

• Case 2: C >= G_max. In this case, C is the highest gray value during the fade in transition, e.g., a fade in from a solid white frame. It is the inverse of Case 1. Correspondingly, the AHD has the following property (with a proof similar to that of Case 1):

AHD_n(j) >= 0,  for all j = 0, 1, ..., J.     (14)

• Case 3: G_min < C < G_max. During the fade in, the gray levels of the pixels in S1 gradually increase from lower values to higher ones; the gray values of the pixels in S2 gradually decrease from higher levels to lower ones; and those of the pixels in S3 remain unchanged. So at both the leftmost and rightmost sides of HD_n it satisfies HD_n(j) >= 0, while around the gray level of C it satisfies HD_n(j) <= 0. Correspondingly, the minimum and maximum gray values of frames n and n+1 satisfy Gmin(n+1) <= Gmin(n) and Gmax(n+1) >= Gmax(n). In this case, the AHD has a zero cross point, denoted by j_z, and the following properties:

AHD_n(j) >= 0 for 0 <= j <= j_z;  AHD_n(j) <= 0 for j_z < j < J;  AHD_n(J) = 0.     (15)

Fig. 1. Fade in example and its histogram characters for Case 1: C <= G_min. (a) Frame 38808. (b) Frame 38813. (c) Frame 38823. (d) Original histograms. (e) Histogram differences. (f) Accumulating histogram differences.

The proof of (15) can be partitioned into three parts. From the definition of the three sets S1, S2 and S3, the union of them is the universal set and the intersection of any two of them is the null set; hence, each set can be handled independently. The nonnegative part of (15) is a special case of (14) for the set S2, whose pixels only decrease; the nonpositive part is a special case of (11) for the set S1, whose pixels only increase; and the pixels of S3 contribute nothing to the difference.

Fig. 1 shows an example for Case 1. Three frames (Frame 38808, Frame 38813 and Frame 38823) of a fade in transition from the test video sequence riscos are shown in Fig. 1(a)–(c). Their histograms, histogram differences and AHDs are shown in Fig. 1(d)–(f). At the leftmost side in Fig. 1(e) it holds that HD_n(j) <= 0, and at the rightmost side HD_n(j) >= 0. It is too complex to extract salient features for fades detection directly from the histogram difference. However, the AHDs during a fade in transition have a consistent property, as shown in (11) and Fig. 1(f).

Fig. 2(a)–(c) shows three frames (Frame 31930, Frame 31936 and Frame 31942, extracted from the test video eyeexam) during a fade in for Case 3. Their histograms, histogram differences and AHDs are shown in Fig. 2(d)–(f). It is observed that the histogram spans of the frames during the fade in extend from the middle part of the histogram toward 0 and 255 as the frame index n increases. Correspondingly, the AHDs have identical characteristics, as shown in (15). The complete cases and histogram characters for fade in and fade out are listed in Table I. The zero cross point j_z for Case 3 or Case 6 must be determined beforehand, as the bin at which the AHD changes sign (16).
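The case tests of Table I reduce to sign checks on the AHD. The following sketch classifies a consecutive-frame AHD into the three fade in cases; the tolerance `eps` and the zero-cross rule are our assumptions, since the paper's exact thresholding and formula (16) are not reproduced in this extraction.

```python
import numpy as np

def fade_in_case(ahd, eps=1e-4):
    """Classify one AHD curve into the fade in cases of Section III-C:
    case 1 (C <= Gmin): AHD(j) <= 0 for all j;
    case 2 (C >= Gmax): AHD(j) >= 0 for all j;
    case 3 (Gmin < C < Gmax): AHD >= 0 left of a zero cross, <= 0 right of it.
    Returns (case, j_z); j_z is None except for case 3, 0 means no match."""
    if np.all(ahd <= eps) and np.any(ahd < -eps):
        return 1, None
    if np.all(ahd >= -eps) and np.any(ahd > eps):
        return 2, None
    neg = np.where(ahd < -eps)[0]
    pos = np.where(ahd > eps)[0]
    if len(neg) and len(pos) and pos.max() < neg.min():
        return 3, int(pos.max())  # zero cross between the sign regions
    return 0, None

# A uniformly brightened pair behaves like Case 1 (all gray values increase).
a = np.random.default_rng(1).integers(40, 200, (32, 32))
hist = lambda x: np.histogram(x, bins=64, range=(0, 256))[0] / x.size
ahd_pair = np.cumsum(hist(a + 20) - hist(a))
assert fade_in_case(ahd_pair)[0] == 1
```

In practice the per-frame decisions are then smoothed (the paper applies a radius-2 median filter) before a run of consistent cases is declared a fade.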

D. Accumulating Histogram Difference Based Flashlight Detection

The characteristics of the AHD during fade transitions are listed in Table I. The flashlight detection problem can likewise be converted into a case-related one. According to the mathematical model of flashlight described in (3), the energy variation and AHD characteristics of the flashlight effect are analyzed in detail. The salient feature of flashlight effects is the luminance variation: a flashlight appears with an energy increase and disappears with an energy decrease. We call the starting and ending frames of a flashlight effect a flashlight pair, and let FLS, FLE and (FLS, FLE) denote the starting frame, the ending frame and the flashlight pair, respectively. Flashlight effects can be identified by determining the corresponding flashlight pairs by virtue of the AHD and energy variation characteristics.

Let E(n) and ΔE(n) denote the normalized energy and the energy variation at the nth frame, respectively, namely

E(n) = (1/(H W)) Σ_{x} Σ_{y} f(x, y, n)^2     (17)

ΔE(n) = E(n+1) - E(n)     (18)

where H and W are the height and width of an image.
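Under one plausible reading of (17) and (18), the normalized energy is the mean squared luminance and the variation is its first difference; the paper's exact normalization may differ, so the sketch below is illustrative.

```python
import numpy as np

def norm_energy(frame):
    # Normalized energy per our reading of (17): mean squared luminance
    # over the H x W frame (normalization constant is an assumption).
    f = frame.astype(np.float64)
    return (f * f).mean()

def energy_variation(curr, nxt):
    # Energy variation per (18): difference of consecutive normalized energies.
    return norm_energy(nxt) - norm_energy(curr)

dark = np.full((48, 64), 50, dtype=np.uint8)
lit = np.full((48, 64), 180, dtype=np.uint8)
assert energy_variation(dark, lit) > 0   # flashlight appears: energy rises
assert energy_variation(lit, dark) < 0   # flashlight ends: energy falls
```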


Fig. 2. Fade in example and its histogram characters for Case 3: G_min < C < G_max. (a) Frame 31930. (b) Frame 31936. (c) Frame 31942. (d) Original histograms. (e) Histogram differences. (f) Accumulating histogram differences.

TABLE I
FADE CASES AND THEIR HISTOGRAM PROPERTIES

The flashlight effect detection problem is thus converted into identifying flashlight pairs according to the AHD and energy variation characteristics. Let us first analyze the AHD and energy variation for the two consecutive frames FLS - 1 and FLS when the flashlight appears. Assuming the scene component is unchanged across the pair, the frame difference and the energy variation are expressed as follows:

f(x, y, FLS) - f(x, y, FLS - 1) = f_F(x, y, FLS) >= 0     (19)

ΔE(FLS - 1) = E(FLS) - E(FLS - 1) > 0.     (20)

From the above, the following relationships can be derived for the starting frame:

AHD_{FLS-1}(j) <= 0 for all j,  and  ΔE(FLS - 1) > 0.     (21)

Similarly, the AHD and energy variation characteristics for the ending frame FLE satisfy

AHD_{FLE}(j) >= 0 for all j,  and  ΔE(FLE) < 0.     (22)
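Conditions (21) and (22) can be checked directly per frame pair. In this sketch the tolerance `eps` and the helper names are assumptions; a full detector would additionally localize candidate pairs with a sliding window, as the paper suggests for suspected abrupt changes.

```python
import numpy as np

def _hist(img, bins=64):
    return np.histogram(img, bins=bins, range=(0, 256))[0] / img.size

def is_flash_start(prev, curr, eps=1e-4):
    # FLS conditions sketched in (21): every AHD bin of the (prev, curr)
    # pair is <= 0 and the energy variation is positive.
    ahd = np.cumsum(_hist(curr) - _hist(prev))
    de = (curr.astype(float) ** 2).mean() - (prev.astype(float) ** 2).mean()
    return bool(np.all(ahd <= eps) and de > 0)

def is_flash_end(prev, curr, eps=1e-4):
    # FLE conditions of (22): AHD bins all >= 0, energy variation negative.
    ahd = np.cumsum(_hist(curr) - _hist(prev))
    de = (curr.astype(float) ** 2).mean() - (prev.astype(float) ** 2).mean()
    return bool(np.all(ahd >= -eps) and de < 0)

scene = np.random.default_rng(2).integers(30, 120, (48, 64)).astype(np.uint8)
flash = np.clip(scene.astype(int) + 80, 0, 255).astype(np.uint8)
assert is_flash_start(scene, flash) and is_flash_end(flash, scene)
```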

Fig. 3(a)–(c) shows three consecutive frames of a flashlight effect lasting only one frame. The corresponding histograms and AHDs are shown in Fig. 3(d)–(e). According to the mathematical model for flashlight, the pixel value at each location increases due to the brightness effect of the flashlight (i.e., f_F(x, y, n) >= 0 for all (x, y)) when the flashlight appears. Hence, the histogram of Fig. 3(b) is shifted to the right compared with those of Fig. 3(a) and (c). Correspondingly, each bin of the AHD at frame FLS - 1 is consistently less than or equal to zero; similarly, each bin of the AHD at frame FLE is consistently larger than or equal to zero. Hereinafter, we make full use of the characteristics expressed in (21) and (22) for flashlight effect detection.


Fig. 3. Flashlight frames with heavy shadow and their histogram characters. (a) Frame before flashlight appears. (b) Flashlight frame. (c) Frame after flashlight disappears.

E. AHD for dc Images

Unlike existing fades detection algorithms, the proposed AHD-based algorithm does not rely heavily on monochrome frames, because the AHD characteristics of every pair of consecutive frames during a fade transition are similar and case-related. The fades and flashlight detection problem is thereby converted into a case matching problem, which is suitable for both compressed and uncompressed videos. Videos are usually stored in compressed formats to save storage space and cost, and MPEG coding is often used. Experiments show that in video decoding approximately 40% of the CPU time is spent on the inverse discrete cosine transform (IDCT), even with fast discrete cosine transform (DCT) algorithms [25]. Therefore, using the compressed video bit stream directly for fades detection is of greater practical use. In the following, we concentrate on fades and flashlight detection in compressed video using the AHDs of dc images. A dc image consists of the dc coefficients of all the 8 × 8 blocks of a frame, resulting in an image 1/64 the size of the original [15].

The reason that we use the AHDs of dc images rather than those of the original images for fades and flashlight detection is based on two facts. The first is the lower computational cost, since only a small fraction of the original data needs to be decoded to construct the dc images. The second is that the AHD is a highly compact feature, and the difference between the AHD characteristics of the dc images and those of the original images has little influence on the fades and flashlight detection (see Sections IV-D and IV-E).

F. Flowchart for Fades Detection in Compressed Videos

The cases and AHD properties listed in Table I are deduced from the mathematical models by assuming that each frame during a fades transition is stationary and noise free. In order to detect fades robustly, some post-processing techniques must be adopted. In brief, our AHD-based fades detection algorithm consists of the following three steps, which are labeled by boxes with dashed lines in Fig. 4(a).

• Fades Cases Determination for Compressed Video Using the AHD of the dc Image:
Let FNum denote the total number of frames in a video sequence. Before the fades cases determination, the case type information of every frame is initialized, and the maximum number of histogram bins is set to 64. In order to obtain the cases of all the frames more accurately, a median filter with radius 2 is applied to the AHDs. The fade case of each frame is then determined from the AHD of its dc image according to the properties listed in Table I: the ith frame is assigned the csth case type (Case 1–Case 6) when its AHD satisfies the corresponding properties. Moreover, the maximum and minimum gray values of each dc image are recorded to calculate the dynamic range variation in the third step.

• Post-Processing for Fades Cases:
After the case type of each frame is determined, a median filter with radius 2 is applied to the case type sequence for robust fades detection. The median filter used here is different from that used in the first step, where a median filter


Fig. 4. Flowcharts of AHD-based fades and flashlight detection in compressed video sequences. (a) Flowchart for fades detection. (b) Flowchart for flashlight detection.


with radius 2 is carried out on the AHD of a dc image to suppress noise.
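A minimal sketch of this post-processing, assuming the per-frame case labels are filtered directly with a radius-2 (five-sample) median window, so that isolated mislabeled frames inside a fade run are absorbed by the run's dominant case; the function name is ours:

```python
import numpy as np

def median_filter_cases(cases, radius=2):
    """Radius-2 median filter over per-frame case labels (a sketch).

    Each label is replaced by the median of a window of up to
    2 * radius + 1 labels centered on it (truncated at the ends).
    """
    cases = np.asarray(cases)
    out = cases.copy()
    for i in range(len(cases)):
        lo, hi = max(0, i - radius), min(len(cases), i + radius + 1)
        out[i] = np.median(cases[lo:hi])
    return out

# One frame mislabeled 0 and one mislabeled 3 inside a Case-1 run
# are both restored to 1 by the filter.
print(median_filter_cases([1, 1, 0, 1, 1, 1, 3, 1, 1]).tolist())
# [1, 1, 1, 1, 1, 1, 1, 1, 1]
```

A mode (majority) filter would be the strictly categorical analogue; the median behaves the same way here because the spurious labels are isolated.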

• Fades Detection Based on the AHD Cases Information:
The third step is to find the starting and ending frames of each fades transition according to the obtained case type information. From Table I, a candidate fades transition can be found by identifying consecutive frames with the same case type (Case 1–Case 6). As fades transitions usually last for many frames and exhibit obvious dynamic range changes, a minimum length and a dynamic range variation (DRV) constraint can be used to verify the candidate fades transitions. The minimum length and the DRV threshold D_R are set by experimentation; detailed information is given in Section IV-F. Let Fs and Fe denote the starting and ending frames that have an identical case type (from Case 1 to Case 6). They delimit a fade in (out) transition if the conditions in (23), shown at the bottom of the page, are satisfied.
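The step above amounts to scanning for runs of identical case labels and keeping those that pass the length and DRV tests. The following is a sketch under our own assumptions (label 0 means "no fade case"; the exact form of condition (23) is not reproduced, only its spirit):

```python
from itertools import groupby

def detect_fades(case_seq, dyn_range, l_min, d_r):
    """Find candidate fades as runs of identical case type (1..6).

    case_seq[i]  : case type of frame i (0 = no fade case, an assumption)
    dyn_range[i] : dynamic range (max - min gray value) of frame i's dc image
    A run [fs, fe] is accepted if it spans at least l_min frames and the
    dynamic range changes by at least d_r across it.
    """
    fades, i = [], 0
    for case, grp in groupby(case_seq):
        n = len(list(grp))
        fs, fe = i, i + n - 1
        i += n
        if case == 0 or n < l_min:
            continue                              # too short or not a fade case
        if abs(dyn_range[fe] - dyn_range[fs]) >= d_r:
            fades.append((fs, fe, case))          # (start, end, case type)
    return fades

# A Case-1 run of 5 frames with a large dynamic range drop is kept;
# the 2-frame Case-3 run fails the minimum length test.
cases = [0, 0, 1, 1, 1, 1, 1, 0, 3, 3, 0]
dr    = [200, 200, 180, 140, 100, 60, 20, 200, 200, 195, 200]
print(detect_fades(cases, dr, l_min=5, d_r=100))  # [(2, 6, 1)]
```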

The starting and ending frames, the case type information of each fade, and the total number of fades transitions in a video sequence (denoted as FadeNum) are recorded during fades detection, as shown in Fig. 4(a). Different from the existing fades detection methods, the proposed AHD-based method does not rely heavily on solid color frames: not only fades with solid color frames but also fades without them can be correctly detected. Furthermore, the fades cases are recognized during detection.

G. Flowchart for Flashlight Detection in Compressed Videos

The AHD and energy variation characteristics of the starting and ending frames of a flashlight pair, as shown in (21) and (22), are obtained under ideal conditions. In order to improve the flashlight detection performance, a minimum energy variation threshold ΔE_min is adopted (in this paper, ΔE_min is set to 5). Two frames are expected to be a flashlight pair only if their energy variation satisfies this minimum energy variation constraint. Moreover, a positive quantity τ (in this paper, τ is set to 0.002) is added to or subtracted from the AHD to resist noise. Hence, (21) and (22) are modified into (24) and (25)

AHD_FLS(j) ≤ τ for every histogram bin j    (24)

AHD_FLE(j) ≥ −τ for every histogram bin j    (25)

Many observations show that flashlight effects last only several frames [20], [29]. For ordinary video signals with a frame rate of 25 frames/s, we set a flashlight duration threshold for flashlight

TABLE II
FADES NUMBERS FOR DIFFERENT TYPES OF CASES

detection. Frames FLS and FLE denote the starting and ending frames of a flashlight pair; they should satisfy the following relationships:

frame FLS satisfies (24) and frame FLE satisfies (25).    (26)

Let FLCase(t) indicate the flashlight case at frame t. FLCase(t) = 1 implies that frame t is a possible starting frame of a flashlight pair, FLCase(t) = −1 means that frame t is a possible ending frame of a flashlight pair, and FLCase(t) = 0 shows that frame t is neither a starting nor an ending frame of a flashlight. The flowchart of the proposed AHD-based flashlight detection method in compressed videos is shown in Fig. 4(b), which consists of two main steps. The first step is the flashlight cases determination by virtue of the AHD and energy variation characteristics. The second step is the flashlight pair identification by finding the starting and ending frames. The luminance of a flashlight effect may change gradually; for example, it may gradually increase and then decrease. For consecutive frames with gradually increasing luminance, the FLCase of each frame is consistently equal to 1, so the real starting frame is indicated by the first frame whose FLCase equals 1. Similarly, the ending frame of a flashlight pair with gradually decreasing luminance is determined by the last frame whose FLCase equals −1. This is labeled in Fig. 4(b) by a box with dash-dotted lines.

IV. EXPERIMENTAL RESULTS AND DISCUSSION

A. Test Video Sequences

In order to evaluate the performance of the AHD-based fades detection method, eight test video sequences are used: “eyeexam,” “riscos,” “docon,” “culture,” “animals,” “harmony,” “bebes,” and “don qui.” In total, 119 fades transitions exist in these video sequences. The detailed case type information of these fades transitions is shown in Table II. The flashlight detection experiments are carried out on two test video sequences, “news1” and “jornal,” in which 116 flashlight pairs exist in total. The test video sequences are compressed by MPEG-2, with frame size 352 × 288 and frame rate 25 frames/s.

Fe − Fs + 1 ≥ the minimum fades length, and the dynamic range variation over [Fs, Fe] ≥ D_R    (23)


TABLE III
FADES DETECTION RESULTS FOR THE 8 VIDEO SEQUENCES WITH DIFFERENT METHODS

In order to assess the robustness and effectiveness of the proposed AHD-based flashlight and fades detection method, the test video sequences are also converted into sequences with different bit rates.

B. Performance Evaluation Criteria

The comparison between an algorithm's output and the ground truth is based on the number of missed detections M_d and the number of false alarms F_a; the corresponding recall and precision are given by

Recall = C / (C + M_d) × 100%    (27)

Precision = C / (C + F_a) × 100%    (28)

where C stands for the number of correctly detected fades (or flashlight pairs), and the sum C + M_d is the total number of fades transitions (or flashlight pairs).
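Equations (27) and (28) translate directly into code. The counts below are our own hypothetical example, back-computed to match the AHD DC averages quoted later (recall 91.60%, precision 90.08% over 119 fades); they are not taken from the paper's tables:

```python
def recall_precision(c, m_d, f_a):
    """Recall and precision in percent, as in (27) and (28)."""
    recall = 100.0 * c / (c + m_d)      # C / (C + M_d)
    precision = 100.0 * c / (c + f_a)   # C / (C + F_a)
    return recall, precision

# Hypothetically, 109 of 119 fades found (10 missed) with 12 false alarms:
r, p = recall_precision(109, 10, 12)
print(round(r, 2), round(p, 2))   # 91.6 90.08
```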

C. Performance Comparison With Different Fades Detection Methods

We compare the AHD-based fades detection algorithms with VRH [20], Truong et al. [23] (denoted as Truong), and Zabih et al. [5], [6] (denoted as Zabih). Their detection results for each test video sequence are shown in Table III, where AHD DC and AHD ORG denote the fades detection methods using the AHD characteristics of the dc images and of the original images, respectively. The total number of frames and the number of fades transitions of each sequence are also given in Table III. From Table III, we find that the numbers of fades correctly detected by our method are larger than those detected by the VRH, Truong, and Zabih methods. It is also observed that VRH is not effective in detecting fades transitions of some case types in the test video sequence eyeexam; in total, 48 fades are missed. In each of those circumstances, two crossing edges rather than one inclined edge exist for a fades transition in the VRH image. The fades falsely detected by Zabih are caused by very fast motion or object occlusions, where most edges are blurred or occluded. The average computational complexities of the different fades detection methods on the test video sequences are listed in Table IV, where the speeds (in milliseconds per frame) and gains (relative to the AHD DC-based method) are listed. In this paper, all the compared algorithms are implemented in

TABLE IV
SPEEDS AND GAINS COMPARISON FOR DIFFERENT FADES DETECTION METHODS

ANSI C in the Microsoft Visual C++ environment. The average computational costs are obtained by running the programs on a P4 2.8-GHz PC with 1-GB RAM. The average computational costs of our AHD ORG and AHD DC methods are 43.5439 and 9.7491 ms per frame, respectively. From Table IV, we find that the computational costs of the Truong, VRH, Zabih, and AHD ORG methods are 3.1973, 4.6130, 14.2892, and 4.4665 times that of AHD DC, respectively.

In order to compare the performances of the different methods for each case, the detailed information for fade in, fade out, and the six cases is listed in Table V, which reports the recall and precision of each case as well as the recall and precision for fade in and fade out. It is difficult to classify the falsely detected fades of VRH, Zabih, and Truong, so their per-case recalls and precisions are not listed in Table V. We find that VRH is very sensitive to the gray values of the solid color frames during fades transitions; the VRH-based fades detection method is not effective in detecting fades with case types 3 and 6. During experimentation, we also observe that VRH, Zabih, and Truong are ineffective in detecting fades where only a small fraction of the image undergoes the fades transition, for example, captions fading in and out on a solid black background. However, our algorithms can detect those fades very well. The average recall and precision of AHD DC are, respectively, 91.60% and 90.08%, which are close to those of AHD ORG, whose recall is 94.96% and precision is 88.28%. The missed fades are caused by fast motions. The falsely detected fades are shots with gradually changing luminance.

D. Performance Comparison With Different Flashlight Detection Methods

The flashlight pairs in the test videos news1 and jornal are manually labeled. The energy variation distribution of the


TABLE V
PERFORMANCES COMPARISON FOR EACH OF THE CASES WITH DIFFERENT METHODS

frames when flashlights appear and disappear is shown in Fig. 5. We find that the energy variation of about 40% of the flashlight pairs is less than 10, so a simple energy-constraint-based method cannot provide satisfactory detection performance [28]. That is why we set the energy variation constraint ΔE_min = 5 in this paper.

We compare the AHD-based flashlight detection algorithm with the methods proposed by Yeo et al. [15], Truong et al. [28], VR [20], and Heng et al. [30]. Note that the Heng flashlight detection is carried out on the frames suspected to be abrupt scene changes, which are detected by the scene change detection method proposed by Yeo et al. [15]. The corresponding recalls and precisions of the flashlight detection methods are shown in Table VI. Comparatively, the average recall and precision of AHD DC are 93.10% and 93.10%, and those of AHD ORG are 97.41% and 92.62%, respectively, which are both higher than

Fig. 5. Energy variation distribution for the flashlight pairs in the video sequences news1 and jornal.

those of VR, Yeo, Truong, and Heng. The computational costs of the different flashlight detection methods are listed in Table VII, from which we find that the speed of the AHD DC-based flashlight detection method is the closest to that of Yeo.

During experimentation, we find that the Heng method is effective in detecting strong or local flashlights located at shot boundaries; this method is designed to discriminate cuts from flashlight frames after suspected abrupt shot boundaries have been detected. The VR method is good at detecting strong or global flashlights, but it is not very effective in detecting local flashlights or flashlights with heavy shadows. Several flashlight effects are not detected by the proposed AHD-based method because the starting or ending frames of those flashlights are located at shot boundaries, which leaves the flashlights unpaired. Several frames that are not caused by flashlights are falsely detected as flashlights due to global luminance changes.

The average processing speeds are 31.5603 and 7.6527 ms per frame for the AHD ORG and AHD DC-based flashlight detection methods, which are faster than the corresponding AHD ORG and AHD DC-based fades detection methods, at 43.5439 and 9.7491 ms per frame, respectively. This is because they only need to calculate the AHDs of the frames whose energy variations are no less than ΔE_min, as shown in Fig. 4(b), instead of calculating those of all the frames. It is obvious that the performances of the AHD DC and AHD ORG-based fades and flashlight detection methods are very close; however, AHD DC saves considerable computing time compared with AHD ORG.

E. Performance Evaluation for the Videos With Different Bit Rates

In order to examine the robustness of our AHD-based fades and flashlight detection algorithm against bit rates, each of the originally compressed videos is reencoded into variable bit rate (VBR) sequences with 80% and 50% of the bits of the original ones (denoted as VBR80 and VBR50, respectively), and into constant bit rate (CBR) sequences at 1.0 Mb/s, 0.5 Mb/s, and 50 kb/s (denoted as CBR1.0M, CBR0.5M, and CBR50K, respectively). The corresponding fades and flashlight detection performances


TABLE VI
FLASHLIGHT DETECTION PERFORMANCES FOR DIFFERENT VIDEO SEQUENCES WITH DIFFERENT FLASHLIGHT DETECTION METHODS

TABLE VII
SPEEDS AND GAINS COMPARISON FOR DIFFERENT FLASHLIGHT DETECTION METHODS

of these sequences are shown in Tables VIII and IX. From these two tables, we find that the proposed AHD-based fades and flashlight detection methods are insensitive to the bit rates. The numbers of missed detections and false alarms are the same for the original video, VBR80, VBR50, CBR1.0M, and CBR0.5M. For CBR50K, another three fades transitions are missed, because serious distortions are produced in this circumstance. The performance degrades little even when the test video sequences are coded at very low bit rates, which shows the robustness of the proposed AHD-based fades and flashlight detection method against bit rates. The robustness of our AHD-based fades and flashlight detection method relies

Fig. 6. Recall and precision against D_R for the test video eyeexam using AHD DC.

on two facts. First, the quantized AHD is a compact representation of the difference information between two consecutive frames, which is insensitive to noise. Second, the post-processing applied to the AHD further improves its robustness against noise.

F. Discussion on Parameter Selection

The experimental results of the proposed AHD-based fades detection method listed in Tables III and V are obtained under the minimum fades transition length and minimum dynamic range variation constraints, whose values are set by experimentation. We choose a small minimum length so as to detect those fades whose durations are comparatively short. The dynamic range variation threshold D_R is selected by analyzing the precisions and recalls obtained by varying D_R from 0 to 255. For example, the recall and precision curves against D_R for eyeexam are shown in Fig. 6, from which we find that the precision decreases and the recall increases with the increment of D_R. Comparatively, the better performance, with recall 91.60% and precision 90.08%, is achieved under the chosen D_R for the test video sequences.

The AHD-based flashlight detection results listed in Tables VI and IX are obtained under the thresholds τ = 0.002 and ΔE_min = 5. We now analyze the influence of τ and ΔE_min on the flashlight detection performance. With ΔE_min fixed, let the parameter τ take six values in (0, 0.003]. The corresponding curves of recall and precision against τ are shown in Fig. 7(a). The recall increases and the precision decreases with increasing τ; the precision decreases because some very fast motion frames are falsely detected as flashlights. Comparatively, τ = 0.002 gives a better tradeoff between recall and precision on the test video sequences. Similarly, with τ fixed, let the parameter ΔE_min take eight values in [0, 30]. The recall and precision curves are shown in Fig. 7(b), from which we find that the precision increases and the recall decreases with an increasing energy threshold ΔE_min. Comparatively, a better recall and precision are achieved for ΔE_min = 5.

We now discuss the influence of the number of histogram bins on the performance of the AHD-based fades and flashlight detection method. The AHD is quantized into 16, 32, 64, 128, and 256


TABLE VIII
AHD-BASED FADES DETECTION PERFORMANCES UNDER DIFFERENT BIT RATES

TABLE IX
AHD-BASED FLASHLIGHT DETECTION PERFORMANCES UNDER DIFFERENT BIT RATES

Fig. 7. Recall and precision against τ and ΔE_min. (a) Recall and precision with different τ. (b) Recall and precision with different ΔE_min.

TABLE X
FLASHLIGHT AND FADES DETECTION PERFORMANCES FOR DIFFERENT BINS

bins, respectively. The corresponding recalls and precisions for fades on eyeexam and for flashlights on jornal are shown in Table X. We find that the numbers of missed fades and flashlights decrease as the number of histogram bins decreases, while the numbers of false detections increase. From Table X, we also find that the performances at 128, 64, and 32 bins are almost the same, which shows that the AHD-based method is insensitive to the number of histogram bins.

V. CONCLUSION

In this paper, we proposed an AHD-based fade in/out and flashlight detection method. From the mathematical models, and by comparing the gray values of the corresponding monochrome color with the maximum and minimum gray values of the solid color frames during fades transitions, the AHD characteristics of fades can be classified into six cases. The fades detection is


converted into a case matching and validation problem. The flashlight detection problem is also converted into a case-related one according to the AHD and energy variation characteristics of a flashlight pair.

Comparison with other fades and flashlight detection methods on several video sequences shows the effectiveness of the proposed AHD-based method. Moreover, experimental results on the same test video sequences coded at different bit rates show its robustness.

ACKNOWLEDGMENT

The authors would like to thank the Associate Editor I. Ahmad and the anonymous reviewers for providing valuable comments and suggestions that were used in preparing the revised manuscript.

REFERENCES

[1] H. J. Zhang, J. Wu, D. Zhong, and S. Smoliar, “An integrated system for content-based video retrieval and browsing,” Pattern Recognit., vol. 30, pp. 643–658, 1997.

[2] J. Fan, A. K. Elmagarmid, X. Zhu, W. G. Aref, and L. Wu, “ClassView: Hierarchical video shot classification, indexing, and accessing,” IEEE Trans. Multimedia, vol. 6, no. 1, pp. 70–86, Feb. 2004.

[3] W. Tavanapong and J. Zhou, “Shot clustering techniques for story browsing,” IEEE Trans. Multimedia, vol. 6, no. 4, pp. 517–527, Aug. 2004.

[4] J. Boreczky and L. Rowe, “Comparison of video shot boundary detection techniques,” in Proc. SPIE Conf. Storage and Retrieval for Image and Video Databases IV, 1996, vol. 2670, pp. 170–179.

[5] R. Zabih, J. Miller, and K. Mai, “A feature-based algorithm for detecting and classifying scene breaks,” in Proc. ACM Multimedia’95, 1995, pp. 189–200.

[6] R. Zabih, J. Miller, and K. Mai, “A feature-based algorithm for detecting and classifying production effects,” Multimedia Syst., vol. 7, pp. 119–128, 1999.

[7] W. J. Heng and K. N. Ngan, “Integrated shot boundary detection using object-based techniques,” in Proc. IEEE Int. Conf. Image Process., 1999, vol. 3, pp. 289–293.

[8] U. Gargi, R. Kasturi, and S. H. Strayer, “Performance characterization of video-shot-change detection methods,” IEEE Trans. Circuits Syst. Video Technol., vol. 10, no. 2, pp. 1–13, Feb. 2000.

[9] C. F. Lam and M. C. Lee, “Video segmentation using color difference histogram,” in Lecture Notes in Computer Science 1464. New York: Springer-Verlag, 1998, pp. 159–174.

[10] A. Hampapur, R. Jain, and T. Weymouth, “Production model based digital video segmentation,” Multimedia Tools Applicat., vol. 1, no. 1, pp. 9–46, 1995.

[11] S. C. Pei and Y. Z. Chou, “Efficient MPEG compressed video analysis using macroblock type information,” IEEE Trans. Multimedia, vol. 1, no. 4, pp. 321–333, Dec. 1999.

[12] S. C. Pei and Y. Z. Chou, “Effective wipe detection in MPEG compressed video using macroblock type information,” IEEE Trans. Multimedia, vol. 4, no. 3, pp. 309–319, Sep. 2002.

[13] A. Akutsu, “Video indexing using motion vectors,” in Proc. SPIE Vis. Commun. Image Process., 1992, vol. 1818, pp. 1522–1530.

[14] P. Bouthemy, M. Gelgon, and F. Ganansia, “A unified approach to shot change detection and camera motion characterization,” IEEE Trans. Circuits Syst. Video Technol., vol. 9, no. 7, pp. 1030–1044, Oct. 1999.

[15] B. L. Yeo and B. Liu, “Rapid scene analysis on compressed video,” IEEE Trans. Circuits Syst. Video Technol., vol. 5, no. 6, pp. 533–544, Dec. 1995.

[16] K. Shen and E. J. Delp, “A fast algorithm for video parsing using MPEG compressed sequences,” in Proc. IEEE Int. Conf. Image Process., Oct. 1995, pp. 252–255.

[17] A. M. Alattar, “Detecting fade regions in uncompressed video sequences,” in Proc. ICASSP’97, vol. 4, pp. 3025–3028.

[18] W. A. C. Fernando, C. N. Canagarajah, and D. R. Bull, “Automatic detection of fade in and fade out in video sequences,” in Proc. ISCAS, 1999, vol. 4, pp. 255–258.

[19] W. A. C. Fernando, C. N. Canagarajah, and D. R. Bull, “Fade-in and fade-out detection in video sequences using histograms,” in Proc. ISCAS, 2000, vol. 4, pp. 709–712.

[20] S. J. F. Guimarães, M. Couprie, A. de A. Araújo, and N. J. Leite, “Video segmentation based on 2D image analysis,” Pattern Recognit. Lett., vol. 24, no. 7, pp. 947–957, Apr. 2003.

[21] S. Porter, M. Mirmehdi, and B. Thomas, “Temporal video segmentation and classification of edit effects,” Image Vis. Comput., vol. 21, no. 13–14, pp. 1097–1106, Dec. 2003.

[22] J. H. Park, S. Y. Park, S. J. Kang, and W. H. Cho, “Content-based scene change detection of video sequence using hierarchical hidden Markov model,” in Proc. LNAI, 2003, vol. 2843, pp. 426–433.

[23] B. T. Truong, C. Dorai, and S. Venkatesh, “Improved fade and dissolve detection for reliable video segmentation,” in Proc. IEEE Int. Conf. Image Process. (ICIP 2000), 2000, vol. 3, pp. 961–964.

[24] J. Nam and A. H. Tewfik, “Dissolve transition detection using B-splines interpolation,” in Proc. IEEE Int. Conf. Multimedia Expo (ICME), Jul. 2000, vol. 3, pp. 1349–1352.

[25] E. Feig and S. Winograd, “Fast algorithms for the discrete cosine transform,” IEEE Trans. Signal Process., vol. 40, no. 9, pp. 2174–2193, Sep. 1992.

[26] Y. Nakajima, K. Ujihara, and A. Yoneyama, “Universal scene change detection on MPEG-coded data domain,” Vis. Commun. Image Process., vol. 3024, pp. 992–1003, 1997.

[27] M. Sugano, Y. Nakajima, H. Yanagihara, and A. Yoneyama, “A fast scene change detection on MPEG coding parameter domain,” in Proc. IEEE ICIP’98, Oct. 1998, vol. 1, pp. 888–892.

[28] B. T. Truong and S. Venkatesh, “Determining dramatic intensification via flashing lights in movies,” in Proc. IEEE Int. Conf. Multimedia Expo, 2001, pp. 60–63.

[29] D. Zhang, W. Qi, and H. J. Zhang, “A new shot boundary detection algorithm,” in Proc. PCM 2001, vol. 2195, pp. 63–70.

[30] W. J. Heng and K. N. Ngan, “High accuracy flashlight scene determination for shot boundary detection,” Signal Process.: Image Commun., vol. 18, no. 3, pp. 203–219, Mar. 2003.

[31] Y. Wang, J. Ostermann, and Y.-Q. Zhang, Video Processing and Communications. Englewood Cliffs, NJ: Prentice-Hall, 2002, pp. 116–120.

[32] C. G. M. Snoek and M. Worring, “Multimedia event-based video indexing using time intervals,” IEEE Trans. Multimedia, vol. 7, no. 4, pp. 638–647, Aug. 2005.

[33] R. Lienhart, “Comparison of automatic shot boundary detection algorithms,” in Proc. SPIE Conf. Storage and Retrieval for Image and Video Databases VII, 1999, vol. 3656, pp. 290–301.

Xueming Qian received the B.S. and M.S. degrees from Xi’an University of Technology, Xi’an, China, in 1999 and 2004, respectively. He is currently working toward the Ph.D. degree at the School of Electronics and Information Engineering, Xi’an Jiaotong University, Xi’an, China.

From 1999 to 2001, he was an Assistant Engineer at Shannxi Daily. His research interests include video/image communication and transmission, video analysis, processing and compression, mobile wireless multimedia communication, error resilience and error concealment techniques, and semantic-based video analysis, indexing, and retrieval.

Guizhong Liu received the B.S. and M.S. degrees in computational mathematics from Xi’an Jiaotong University, Xi’an, China, in 1982 and 1985, respectively, and the Ph.D. degree in mathematics and computing science from Eindhoven University of Technology, Eindhoven, The Netherlands, in 1989.

He is currently a Full Professor with the School of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an, China. His research interests include nonstationary signal analysis and processing, image processing, audio and video compression, and inversion problems.

Rui Su received the B.S. degree from the National University of Defense Technology, Changsha, China, in 1996, and the M.S. degree from Xi’an Jiaotong University, Xi’an, China, in 2004, where he is currently working toward the Ph.D. degree at the School of Electronics and Information Engineering.

From 1996 to 2001, he was an Engineer at the Xi’an Satellite Control Center. His research interests include video compression techniques, rate control and communication, image analysis and processing, VLSI architecture and implementation for video technology, and multiprocessor systems.