Gradient Fusion Method for Night Video Enhancement

Yunbo Rao, Yuhong Zhang, and Jianping Gou

ETRI Journal, Volume 35, Number 5, October 2013. © 2013 Yunbo Rao et al.

    To resolve video enhancement problems, a novel method of gradient domain fusion wherein gradient domain frames of the background in daytime video are fused with nighttime video frames is proposed. To verify the superiority of the proposed method, it is compared to conventional techniques. The implemented output of our method is shown to offer enhanced visual quality.

    Keywords: Video enhancement, gradient domain, fusion.

I. Introduction

It is well known that video enhancement has been an active topic in computer vision in recent years [1]. However, low contrast and complex backgrounds create video enhancement problems that are tough to resolve. The goal of video enhancement is to improve the interpretability or perceptibility of visual information under typical nighttime conditions (minimal light) for human viewers. Video enhancement has numerous applications in which digital video systems, such as video surveillance, general identity verification, traffic, criminal justice systems, and civilian or military video processing, are used to recognize and track objects [2].

Understanding nighttime video is challenging for the following reasons. Firstly, color-based methods will fail if the color of the moving objects is similar to that of the background. Secondly, one pixel from a low-quality image may be important even if the local variance is small, such as a headlight or taillight. Thirdly, poor-quality video devices are used [3].

Manuscript received Dec. 11, 2012; revised Feb. 4, 2013; accepted Feb. 27, 2013.
Yunbo Rao (phone: +86 159 0817 7003, [email protected]) is with the School of Information and Software Engineering, University of Electronic Science and Technology of China, Sichuan, China.
Yuhong Zhang ([email protected]) was with the Department of Electrical Engineering and Computer Science, Northwestern University, Evanston, Illinois, USA, and is now with the College of Information Science and Engineering, Henan University of Technology, Henan, China.
Jianping Gou ([email protected]) was with the School of Information and Software Engineering, University of Electronic Science and Technology of China, Sichuan, China, and is now with the College of Science and Engineering, Jiangsu University, Jiangsu, China.
http://dx.doi.org/10.4218/etrij.13.0212.0550

Traditional video enhancement techniques can be classified into two broad categories: spatial-domain video enhancement, such as histogram equalization (HE), power-law transformation, or tone mapping [1], and transform-domain video enhancement [2], [4]. The traditional HE techniques utilize the image histogram to obtain a single-indexed mapping that modifies the pixel values using the probability density function and cumulative distribution function of the image [1]. Additionally, these techniques attempt to alter the spatial histogram of an image to closely match a uniform distribution. Traditional HE methods have some limitations, including the following: 1) Irregular histogram distribution frequently results in over-enhancement; 2) It is difficult to achieve a well-balanced enhancement effect over different parts of an image. Another method is power-law transformation, which is global image/video processing with low calculation complexity. This method can be performed to enhance the remaining dark regions of the image/video while keeping the bright parts relatively unchanged. Tone mapping functions are also used to enhance nighttime videos. In general, a nonlinear tone mapping function is used to attenuate image details and to adjust the contrast of large-scale features [5]. It can be expressed as

$$m(x, \Psi) = \frac{\log\left(\frac{x}{x_{\mathrm{Max}}}(\Psi - 1) + 1\right)}{\log(\Psi)}, \qquad (1)$$

where $x_{\mathrm{Max}}$ is the maximum level of the input illumination and $\Psi$ controls the attenuation profile. This mapping function exhibits a similar characteristic to that of the traditional power-law transformation.
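For illustration, here is a minimal NumPy sketch of the tone mapping in (1), assuming the reconstructed form above (inputs normalized to [0, x_max] and psi > 1):

```python
import numpy as np

def tone_map(x, psi, x_max=1.0):
    """Nonlinear tone mapping per eq. (1): compress illumination x
    logarithmically; a larger psi attenuates bright regions more.
    x may be a scalar or an array of illumination values in [0, x_max]."""
    return np.log((x / x_max) * (psi - 1.0) + 1.0) / np.log(psi)

# Example: a dark pixel (0.05) is lifted far more than a bright one (0.8).
print(tone_map(np.array([0.05, 0.8]), psi=100.0))
```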

The traditional methods have two main limitations. Firstly, a dark video/image may not reveal all the details and can look unnatural. Secondly, since the information in some dark areas of the videos is already lost, it cannot be recovered.

Due to the drawbacks of the traditional methods, some researchers have presented methods that combine daytime images of the background with nighttime images of the same scene to enhance visual footage. Stathaki [6] proposed a pixel-based method to enhance nighttime video according to

$$F(x, y) = A_1(x, y)\,L_{\mathrm{n}}(x, y) + A_2(x, y)\,L_{\mathrm{db}}(x, y), \qquad (2)$$

where $F(x, y)$, $L_{\mathrm{n}}(x, y)$, and $L_{\mathrm{db}}(x, y)$ are the illumination of the final enhanced video frame, the nighttime video frame, and the daytime background, respectively, and $A_1(x, y)$ and $A_2(x, y)$ are weighting factors with values in [0, 1]. This produces desirable enhancement results for the background regions. However, in the foreground, where moving objects are found, combining the daytime and nighttime images can result in severe ghost patterns.
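For concreteness, a minimal NumPy sketch of this pixel-based fusion; the constant weights a1 and a2 are illustrative only, since eq. (2) allows per-pixel weight maps:

```python
import numpy as np

def pixel_fusion(l_night, l_day_bg, a1=0.6, a2=0.4):
    """Pixel-based fusion per eq. (2): weighted sum of the nighttime
    frame and daytime background illumination, both floats in [0, 1].
    a1 and a2 are illustrative constants; eq. (2) allows per-pixel maps
    A1(x, y) and A2(x, y)."""
    return np.clip(a1 * l_night + a2 * l_day_bg, 0.0, 1.0)
```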

    A denighting method [7] uses the illumination ratios of the daytime background and nighttime background videos to enhance the nighttime videos. The illumination component of the enhanced nighttime video is obtained by

$$L(x, y) = L_{\mathrm{n}}(x, y)\,\frac{L_{\mathrm{db}}(x, y)}{L_{\mathrm{nb}}(x, y)}, \qquad (3)$$

where $L(x, y)$ represents the enhanced illumination component, $L_{\mathrm{nb}}(x, y)$ represents the illumination component of the nighttime background images, and $L_{\mathrm{n}}(x, y)$ denotes the illumination component of the nighttime videos. The drawback is that the illumination ratios of the daytime background images and nighttime background images can be much smaller than 1; if so, the enhanced results lose static illumination, such as highway lighting.
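A minimal sketch of this denighting scaling; the small epsilon is my addition (not in [7]) to guard against division by zero in fully dark background pixels:

```python
import numpy as np

def denighting(l_night, l_day_bg, l_night_bg, eps=1e-6):
    """Denighting per eq. (3): scale nighttime illumination by the
    daytime/nighttime background ratio. eps is an added safeguard
    against division by zero, not part of the original method."""
    return l_night * (l_day_bg / (l_night_bg + eps))
```

Where the ratio falls below 1 (for example, at lamps lit only at night), the output is dimmed, which is exactly the drawback noted above.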

    Raskar and others [8] proposed a gradient domain method to combine daytime background and nighttime video frames as

$$G(x, y) = N_i(x, y)\,w_i(x, y) + D(x, y)\,(1 - w_i(x, y)), \qquad (4)$$

where $G(x, y)$ is the mixed gradient field, $N_i(x, y)$ is the nighttime video gradient field, $D(x, y)$ is the daytime background gradient field, and $w_i(x, y)$ is the importance image. The method has two shortcomings: One, color shift problems commonly exist in gradient-based approaches; Two, the computational complexity is high.
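Equation (4) is a per-pixel linear blend of the two gradient fields; a sketch:

```python
import numpy as np

def mix_gradients(n_grad, d_grad, w):
    """Importance-weighted gradient mixing per eq. (4): w near 1 keeps
    the nighttime gradient, w near 0 takes the daytime background.
    n_grad, d_grad, w: arrays of identical shape (one per direction)."""
    return n_grad * w + d_grad * (1.0 - w)
```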

    II. Methodology

To overcome the aforementioned problems, a novel method is proposed. A flowchart of the proposed method is shown in Fig. 1. Firstly, we convert the input frame from the red, green, blue (RGB) color model to the hue, saturation, intensity (HSI) color model. Secondly, we extract the intensity (I) component and obtain the gradient image. Thirdly, we fuse daytime gradient frames with nighttime gradient frames. Finally, we reconstruct the color image according to the HSI color space. Experimental results show that the videos enhanced by our method have improved visual quality, exceeding the quality achieved through conventional techniques.

    1. Conversion of RGB to HSI Color Space

    The RGB color space consists of three additive primaries: red, green, and blue. Illumination can be calculated by red, green, and blue using standard equations. To convert an image from RGB color to an illumination image, the following equation is used.

$$f(x, y) = 0.298\,r(x, y) + 0.587\,g(x, y) + 0.114\,b(x, y). \qquad (5)$$

In this letter, our method is based on the illumination component of the images, so we must convert the input RGB frames to another color space. The HSI color model is an ideal tool for developing image processing algorithms, and its color-perceiving properties closely resemble those of the human visual system. It is assumed that the RGB values have been normalized to the range [0, 1] and that the hue angle is measured with respect to the red axis of the HSI space. An example of the RGB components and their converted HSI components is shown in Fig. 2.
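As a small illustration (a sketch, not the letter's code), eq. (5) and the HSI intensity component can be computed as follows:

```python
import numpy as np

def rgb_illumination(rgb):
    """Illumination per eq. (5); rgb is an H x W x 3 float array in [0, 1]."""
    return 0.298 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]

def hsi_intensity(rgb):
    """I component of the HSI model: the mean of the R, G, B channels."""
    return rgb.mean(axis=-1)
```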

    2. Gradient-Based Fusion Method

The gradient of frame $f(x, y)$ at location $(x, y)$ is defined as

$$\nabla f = \begin{bmatrix} G_x \\ G_y \end{bmatrix} = \begin{bmatrix} \partial f / \partial x \\ \partial f / \partial y \end{bmatrix}. \qquad (6)$$

It is well known from vector analysis that the gradient vector points in the direction of the maximum rate of change of $f$ at coordinates $(x, y)$. Computation of the gradient of a frame is based on obtaining the partial derivatives $\partial f / \partial x$ and $\partial f / \partial y$ at every pixel location. In general, we can use Roberts cross-gradient operators, Prewitt operators, or Sobel operators. Herein, our method uses Sobel operators to obtain the frame gradient in the horizontal and vertical directions. The experimental results are shown in Fig. 1.
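Using SciPy's Sobel filters, the two gradient components can be obtained as follows (a sketch; the letter does not specify an implementation):

```python
import numpy as np
from scipy import ndimage

def sobel_gradients(frame):
    """Horizontal and vertical gradients of an intensity frame via the
    Sobel operators, approximating df/dx and df/dy in eq. (6)."""
    gx = ndimage.sobel(frame, axis=1)  # horizontal (x) direction
    gy = ndimage.sobel(frame, axis=0)  # vertical (y) direction
    return gx, gy
```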

After obtaining the horizontal- and vertical-direction gradients of the daytime background and the nighttime video frames, we consider how to perform video enhancement by combining the gradient images. The proposed method is composed of the two components described below.

Fig. 1. Flowchart of the proposed method. [Figure: nighttime and daytime video frames are decomposed into H, S, and I components; X and Y gradients of the I components are fused, and the fusion gradient is reconstructed into the enhanced result.]

Fig. 2. (a) Original nighttime video frame with dark scene, (b) I component extracted, (c) H component extracted, and (d) S component extracted.

1) Fuse the x and y directions (horizontal and vertical directions) of the daytime background gradient and the x and y directions of the nighttime video frame gradient. The proposed gradient fusion is described as follows.

$$G_{x,y} = \begin{bmatrix} G_x \\ G_y \end{bmatrix} = \begin{bmatrix} G_x^{\mathrm{db}} + G_x^{\mathrm{nvideo}} \\ G_y^{\mathrm{db}} + G_y^{\mathrm{nvideo}} \end{bmatrix}, \qquad (7)$$

where $G_x$ and $G_y$ represent the enhanced gradient of a nighttime video frame in the x direction and y direction, respectively. The enhanced gradient combines the daytime background gradient and the nighttime video frame gradient.

2) After obtaining the enhanced gradient, to successfully obtain the color restoration and enhancement results, we further fuse the enhanced gradients $G_x$ and $G_y$. The proposed gradient fusion method is described as follows.

$$G_{\mathrm{enhancement}} = \begin{cases} \alpha\,G_x + (1-\alpha)\,G_y, & \text{if } G_x \ge G_y, \\ (1-\alpha)\,G_x + \alpha\,G_y, & \text{if } G_x < G_y, \end{cases} \qquad (8)$$

where $G_{\mathrm{enhancement}}$ represents the final enhanced gradient result. Symbol $\alpha$ represents a weighting factor that combines the daytime and nighttime gradients. Normally, $\alpha$ is set to a large value close to 1. From (8), it is clear that our method produces desirable visual quality and that the gradient fusion result satisfies the following requirements: (i) if the x-direction gradient is larger than the y-direction gradient, the region needs less enhancement; (ii) if the y-direction gradient is larger than the x-direction gradient, the y-direction gradient is used to enhance the nighttime frames.
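Read together, (7) and (8) amount to the following sketch. This reflects my reading of the reconstructed equations; in particular, comparing G_x and G_y by magnitude per pixel is an assumption:

```python
import numpy as np

def fuse_gradients(gx_db, gy_db, gx_night, gy_night, alpha=0.9):
    """Gradient fusion per eqs. (7) and (8): sum daytime-background and
    nighttime gradients per direction, then blend the two directions
    with weight alpha (close to 1), favoring the larger gradient."""
    gx = gx_db + gx_night                       # eq. (7), x direction
    gy = gy_db + gy_night                       # eq. (7), y direction
    fused_x = alpha * gx + (1.0 - alpha) * gy   # case G_x >= G_y
    fused_y = (1.0 - alpha) * gx + alpha * gy   # case G_x <  G_y
    # Per-pixel magnitude comparison is an assumption of this sketch.
    return np.where(np.abs(gx) >= np.abs(gy), fused_x, fused_y)
```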

3. Image Reconstruction

The gradient vector field $G$ may not be integrable. In this letter, we first use one of the direct methods in [5] to recover the image from the gradient information. To restore the color information, we also perform image reconstruction in all three color channels separately and apply a linear color restoration based on the chromatic information of the original nighttime frames.
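The direct solver of [5] is not reproduced here; as a stand-in, this sketch recovers an image from a (possibly non-integrable) gradient field by Jacobi iteration on the Poisson equation lap(f) = div(G):

```python
import numpy as np

def reconstruct_from_gradients(gx, gy, iters=2000):
    """Least-squares image reconstruction from a gradient field (gx, gy)
    via Jacobi iterations on the Poisson equation; a slow but simple
    stand-in for the direct method of [5]."""
    div = np.zeros_like(gx)              # divergence of the gradient field
    div[:, 1:] += gx[:, 1:] - gx[:, :-1]
    div[1:, :] += gy[1:, :] - gy[:-1, :]
    f = np.zeros_like(gx)
    for _ in range(iters):
        up    = np.vstack([f[:1, :], f[:-1, :]])   # borders replicated
        down  = np.vstack([f[1:, :], f[-1:, :]])
        left  = np.hstack([f[:, :1], f[:, :-1]])
        right = np.hstack([f[:, 1:], f[:, -1:]])
        f = (up + down + left + right - div) / 4.0
    return f
```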

III. Experimental Results and Conclusion

To demonstrate the performance of the proposed method, we conduct experiments using video footage referred to as Highway and Gate. Due to limited space, only the results for Gate are displayed in this letter. Related methods are implemented, and their results are shown for comparison.

Fig. 3. (a) Original daytime background with high illumination, (b) original nighttime video frame with dark scene, (c) enhanced nighttime video sequence using power-law transform in [2], (d) HE, (e) enhanced nighttime video sequence using tone mapping function in [5], and (f) our method.

Fig. 4. (a) Enhanced nighttime video sequence using pixel-based method in [6], (b) enhanced nighttime video sequence using denighting method in [7], (c) enhanced nighttime video sequence using gradient domain in [8], and (d) our method.

Figure 3 shows the results of our method, the method in [2], HE, and the method in [5]. As shown in Figs. 3(c) through 3(e), the images may not reveal all the details, and some information located in the dark areas might already be lost. In contrast, Fig. 3(f) shows that our method works efficiently: our algorithm can compensate for illumination information, for example, on the wall region shown in Fig. 3(f). We also compare fusion methods. Figure 4 shows the results of our method, the method in [6], the method in [7], and the method in [8]. The results of the method in [6] show over-illumination and color shift problems. The results of the method in [7] show a loss of light owing to the nighttime illumination exceeding the daytime illumination in the inside gate region. The results of the method in [8] show that a threshold must be set manually. However, our method fully resolves these problems and obtains enhanced illumination. The peak signal-to-noise ratio (PSNR) is used to objectively evaluate contrast in the enhanced nighttime videos [1]. As Table 1 shows, the PSNR of our method is higher than that of the other methods in most cases.

Table 1. PSNR (dB) of enhanced videos.

Sequence   Our method   Method in [6]   Method in [7]   Method in [8]
Gate       35.62        30.61           33.62           29.98
Highway    36.45        32.12           34.42           30.81
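PSNR can be computed as below; the letter does not state which frame serves as the reference, so that choice is left to the caller:

```python
import numpy as np

def psnr(reference, enhanced, peak=255.0):
    """Peak signal-to-noise ratio in dB between two same-size frames."""
    err = reference.astype(np.float64) - enhanced.astype(np.float64)
    mse = np.mean(err ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
```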

In this letter, we proposed a gradient-domain fusion video enhancement technique, in which gradient domain frames of the background in daytime video are fused with frames of nighttime video. Subjective and objective evaluations showed our method to be effective.

    References

[1] Y.B. Rao and L.T. Chen, "A Survey of Video Enhancement Techniques," Int. J. Electr. Eng. Inform., vol. 3, no. 1, 2012, pp. 71-99.

[2] R.C. Gonzalez and R.E. Woods, Digital Image Processing, 3rd ed., NJ: Prentice Hall, 2007.

[3] E.P. Bennett and L. McMillan, "Video Enhancement Using Per-Pixel Virtual Exposures," ACM Trans. Graphics, vol. 24, no. 3, July 2005, pp. 845-852.

[4] Q. Wang and R.K. Ward, "Fast Image/Video Contrast Enhancement Based on Weighted Threshold Histogram Equalization," IEEE Trans. Consum. Electron., vol. 53, no. 2, 2007, pp. 757-764.

[5] F. Durand and J. Dorsey, "Fast Bilateral Filtering for the Display of High-Dynamic-Range Images," ACM Trans. Graphics, vol. 21, no. 3, 2002, pp. 257-266.

[6] T. Stathaki, Image Fusion: Algorithms and Applications, Academic Press, 2008.

[7] A. Yamasaki et al., "Denighting: Enhancement of Nighttime Image for a Surveillance Camera," 19th Int. Conf. Pattern Recog., 2008.

[8] R. Raskar, A. Ilie, and J.Y. Yu, "Image Fusion for Context Enhancement and Video Surrealism," Proc. SIGGRAPH 2005, ACM, New York, NY, USA, 2005.
