Multispectral image fusion for improved RGB representation based on perceptual attributes
V. TSAGARIS and V. ANASTASSOPOULOS
Electronics and Computers Division, Physics Department, University of Patras,
Patras 26500, Greece; Email: [email protected]
(Received 15 October 2003; in final form 17 February 2005)
International Journal of Remote Sensing, Vol. 26, No. 15, 10 August 2005, pp. 3241–3254. ISSN 0143-1161 print/ISSN 1366-5901 online © 2005 Taylor & Francis. DOI: 10.1080/01431160500127609. http://www.tandf.co.uk/journals
A pixel-level fusion technique for RGB representation of multispectral images is
proposed. The technique results in highly correlated RGB components, a property
observed in natural colour images and closely related to the colour
perception attributes of the human eye. Accordingly, specific properties for the
covariance matrix of the final RGB image are demanded. Mutual information is
employed as an objective criterion for quality refinement. The method provides
dimensionality reduction, while the resulting RGB colour image is perceptually
of high quality. Comparisons with existing techniques are carried out using both
subjective and objective measures.
1. Introduction
Various approaches can be found in the literature for pixel-level fusion (Pohl and
Van Genderen 1998). Fusion at pixel-level means processing at the raw-data level, as
shown in figure 1. The great variety of image fusion methods can be justified by the
complexity of the problem, the different types of data involved and the different
aims of each application. Fusion can be employed to provide improved visual
interpretation, by means of combining different spectral characteristics or image
modalities. This is desirable in various applications, such as medical imaging and
remote sensing. Pixel-level fusion techniques can also be used to improve the
efficiency of classification and detection algorithms. In general, pixel-level fusion
methods can be classified into linear methods (Achalakul and Taylor 2001), non-
linear methods (Matsopoulos et al. 1994, Matsopoulos and Marshall 1995,
Mukhopadhyay and Chanda 2001), optimization techniques (Solberg et al. 1996),
neural networks (Zhang et al. 2001, Shkvarko et al. 2001) and image pyramids (Liu
et al. 2001).
The proposed fusion method belongs to the linear category. The core idea of
this method is to yield a final colour image with maximum information from the
dataset and enhanced visual features compared with the source multispectral bands.
This is achieved by transforming the multispectral data into the 3D RGB space, by
means of preserving the basic correlation properties of the RGB components
existing in natural colour images. For this purpose the key attributes of human
colour perception along with the main properties of natural colour images are
presented. These concepts are incorporated into the proposed method by imposing
specific restrictions on the covariance matrix of the final colour image. Simul-
taneously, the non-diagonal terms of this matrix are adjusted for achieving
maximum mutual information between the original multispectral bands and the
final RGB image. Cholesky decomposition is employed to derive the transformation
of the source multispectral data into the RGB space. The colour image resulting from
fusing the source multispectral bands is suitable for display on any RGB device
and no additional transformation is needed (Rast et al. 1991, Pohl and Van
Genderen 1998, Achalakul and Taylor 2001, Tyo et al. 2003).
The paper is organized as follows. Section 2 provides background material on
human colour perception and the RGB colour space. Section 3 discusses principal
component analysis and introduces the key concept of the proposed fusion
technique. The multispectral dataset used in this work is described in § 4.
Experimental results as well as subjective and objective performance evaluation of
the proposed fusion technique are presented in the same section. Finally, the
conclusions are drawn in § 5.
2. Human colour perception and the RGB colour space
2.1 Colour perception
Colour is a rich and complex experience, usually caused by the vision system
responding differently to different wavelengths of light. The study of colour is
essential in the design and development of colour vision devices. The use of colour in
image displays is not only pleasant for the human eye, but it also enables the user to
perceive more information. The human eye can perceive only a few dozen grey
levels, yet it has the ability to distinguish between thousands of colours.
There are two main types of receptor in the retina, called rods and cones. Colour
perception is based on the activity of cones. Studies of the genetics of colour vision
support the idea that there are three types of cones, called S cones, M cones and L
cones (with peak sensitivity at short, medium and long wavelength, respectively).
They are occasionally called blue, green and red cones, but this nomenclature is
misleading because the sensation of red is not caused by the stimulation of red cones
only. The first two receptors have peak sensitivities at quite similar wavelengths. The
third receptor, the S cone, has a different peak sensitivity. The response of a receptor
to incoming light can be obtained by summing the product of the sensitivity and the
Figure 1. Information fusion can be carried out at different processing levels: (a) raw data fusion, (b) feature fusion and (c) decision fusion.
spectral radiance of the light over all the wavelengths in the visible region of the
electromagnetic spectrum.
2.2 The RGB colour space
Various colour spaces have been standardized for different practical reasons, namely
RGB, YIQ, HSV, Lab, etc. The RGB colour space is the dominant colour space and
the most frequently used in colour cameras, scanners, displays, etc. Its advantages
are its simplicity as well as the fact that other colour representations have to be
transformed to RGB in order to be displayed on a colour monitor. The single
wavelength primaries used in the RGB colour space are 645.16 nm for red, 526.3 nm
for green and 444.44 nm for blue. The colour matching functions for the primaries of
the RGB system are depicted in figure 2. The negative values mean that subtractive
matching is required to match colour lights at the same wavelength with the RGB
primaries. On the other hand, the RGB colour matching functions present
similarities to the raw L, M and S responses of the cones.
One of the properties that characterize the RGB space in applications involving a
natural colour image is the high degree of correlation between its components. The
term high correlation means that if the intensity changes, all three components will
change accordingly. This is a consequence of the overlapping sensitivity curves of
the different types of cone in the human eye (Forsyth and Ponce 2002 p. 105), as well
as the colour matching functions for the primary colours of the RGB system given in
figure 2. This high correlation is studied in this paper using the two-dimensional
correlation coefficient r, given by:
$$ r = \frac{\sum_m \sum_n \left(A_{mn}-\bar{A}\right)\left(B_{mn}-\bar{B}\right)}{\sqrt{\sum_m \sum_n \left(A_{mn}-\bar{A}\right)^2 \, \sum_m \sum_n \left(B_{mn}-\bar{B}\right)^2}} \quad (1) $$
Figure 2. Colour matching functions for the primaries of the RGB system (data obtained from the Colour & Vision Research Laboratories, University College London, UK, www-cvrl.ucsd.edu).
where $\bar{A}$ is the mean of A, $\bar{B}$ is the mean of B, and A, B can be any two
components of the multispectral entity. The values of the correlation coefficient
satisfy $-1 \le r \le 1$.
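In practice the coefficient of equation (1) is simple to compute. The following sketch (an illustrative Python/NumPy implementation, not part of the original paper; `A` and `B` stand for any two equal-sized bands) mirrors the definition term by term:

```python
import numpy as np

def corr2(A, B):
    """Two-dimensional correlation coefficient of equation (1):
    mean-removed cross-products over the product of the bands' norms."""
    A = A.astype(float) - A.mean()
    B = B.astype(float) - B.mean()
    return float((A * B).sum() / np.sqrt((A ** 2).sum() * (B ** 2).sum()))
```

A band compared with itself gives r = 1 and with its negative gives r = −1, matching the bounds −1 ≤ r ≤ 1.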
2.3 Correlation properties in natural colour images
The correlation properties of natural colour images were examined by means of the
Imagine Macmillan database (Macmillan Technical Publishing, www.mcp.com),
which consists of a great variety of images covering several themes. In this work a
set of 100 images is used and the images are categorized subjectively according to
their variety of colour. The selected scenes were those displaying perceptually rich
information content. The first category consists of images with rich colour; this
term describes natural colour images with a great variety of colours and hues. The
other three categories are formed by images in which one of the primary
colours is dominant. For example, a jet ski in the sea is an image where the blue colour
is dominant. The statistical results for the entire set, consisting of 25 images in each
category, can be found in table 1. The evaluation of the correlation matrix is based
on equation (1). The degree of correlation is high (higher than 0.6) for all pairs of
colour components and all types of image. The selection of the correlation
coefficients was based on their mean value, which is a sufficient estimator since the
corresponding variance is quite small.
3. Linear pixel-level fusion techniques
In the context of colour image fusion of multispectral data, most of the linear
methods employ principal components analysis (PCA) (Rast et al. 1991, Achalakul
and Taylor 2001, Tyo et al. 2003). The main drawback of these methods is that the
principal components cannot be used to produce an RGB image without an
additional transformation. This transformation is used to convert the pixels’ values
from an appropriate chosen colour space (e.g. IHS, HSV) to RGB values. In this
section the basic properties of PCA are presented. In contrast, the core
idea of the proposed transformation originates from PCA, but the correlation
properties among the RGB components of the obtained fused image are similar to
those of natural colour images and, thus, no additional transformation for RGB
representation is necessary. In this way the dimensionality of the multispectral
vector space is reduced because only three components are used to represent the
information contained in the multispectral image set.
3.1 Multidimensional image representation and dimensionality reduction
The statistical properties of a multispectral entity with MN pixels per channel and K
different channels can be studied if each pixel is described by a vector whose
components are the individual spectral responses to each multispectral channel:
Table 1. Correlation coefficient for natural colour images.
              Corr(R, G)   Corr(R, B)   Corr(G, B)
Mean          0.8487       0.7040       0.8849
Variance      0.0317       0.0990       0.0175
$$ \mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_K \end{bmatrix} \quad (2) $$
with mean vector given by $\mathbf{m}_x = E\{\mathbf{x}\} = \frac{1}{MN}\sum_{i=1}^{MN} \mathbf{x}_i$. The dimensionality of the mean
value mx is K, with its components corresponding to the mean value of each
multispectral channel. While the mean vector is used to define the average or
expected position of the pixels in the vector space, the covariance matrix describes
their scatter
$$ C_x = \frac{1}{MN}\sum_{i=1}^{MN} \mathbf{x}_i \mathbf{x}_i^T - \mathbf{m}_x \mathbf{m}_x^T \quad (3) $$
The covariance matrix can be used to quantify the correlation between the
multispectral bands. In the case of a high degree of correlation the corresponding
off-diagonal elements in the covariance matrix will be large. The diagonal elements
of the covariance matrix are the variances of the multispectral components.
The correlation between the different multispectral components can also be
described by means of the correlation coefficient given by equation (1). The
correlation coefficient r is related to the corresponding covariance matrix element,
since it is the covariance divided by the product of the standard deviations of the
corresponding multispectral components ($r_{ij} = c_{ij}/\sigma_i\sigma_j$). The correlation coefficient
matrix Rx has as elements the correlation coefficient between the ith and jth
multispectral component. Accordingly, all the diagonal elements will be 1 and the
matrix is symmetric
$$ R_x = \begin{bmatrix} 1 & r_{12} & \cdots & r_{1K} \\ r_{21} & 1 & \cdots & r_{2K} \\ \vdots & & \ddots & \vdots \\ r_{K1} & r_{K2} & \cdots & 1 \end{bmatrix} \quad (4) $$
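For a K-band image stored as a (K, M, N) array, both Cx of equation (3) and Rx of equation (4) can be estimated in a few lines. A minimal sketch (the array name `bands` is an illustrative choice):

```python
import numpy as np

def cov_and_corr(bands):
    """Estimate the covariance matrix Cx (equation (3)) and the correlation
    coefficient matrix Rx (equation (4)) of a (K, M, N) multispectral stack."""
    K = bands.shape[0]
    X = bands.reshape(K, -1).astype(float)    # one row per band, MN columns
    m = X.mean(axis=1, keepdims=True)         # mean vector m_x
    Cx = (X @ X.T) / X.shape[1] - m @ m.T     # E{x x^T} - m_x m_x^T
    s = np.sqrt(np.diag(Cx))                  # per-band standard deviations
    Rx = Cx / np.outer(s, s)                  # r_ij = c_ij / (s_i s_j)
    return Cx, Rx
```

By construction Rx is symmetric with a unit diagonal, as equation (4) requires.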
In the literature several different linear transforms can be found, based on the
statistical properties of the vector representation. An important case is the Karhunen–
Loève transform, also known as principal components analysis (PCA). For this
transformation the matrix Cx is real and symmetric, so finding a set of
orthonormal eigenvectors is always possible. Let $e_i$ and $\lambda_i$, $i = 1, 2, \ldots, K$, be the
eigenvectors and the corresponding eigenvalues of Cx, arranged in descending order of eigenvalue.
Furthermore, let A be a matrix whose rows are formed by the eigenvectors of Cx
ordered so that the first row of A is the eigenvector corresponding to the largest
eigenvalue, and the last row is the eigenvector corresponding to the smallest one.
The matrix A is the transformation matrix that maps x into vectors denoted by y as
follows
$$ \mathbf{y} = A\left(\mathbf{x} - \mathbf{m}_x\right) \quad (5) $$
The mean of y resulting from that transformation is zero and the covariance matrix
Cy is given by
$$ C_y = A C_x A^T \quad (6) $$
The resulting covariance matrix Cy will be diagonal and the elements along the main
diagonal are the eigenvalues of Cx; that is
$$ C_y = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_K \end{bmatrix} \quad (7) $$
The off-diagonal elements of the covariance matrix are zero, denoting that the
elements of the vector population y are uncorrelated. This transformation
establishes a new coordinate system whose origin is at the centroid of the population
and whose axes are in the direction of the eigenvectors of Cx. The transformation in
equation (5) is therefore a rotation that aligns the data with the eigenvectors, and
this alignment is exactly the mechanism that decorrelates the data. The transform is
optimal in the sense that the first principal component carries the largest percentage
of the total variance, and thus the largest percentage of visual information; displayed
as a greyscale image, it has the highest contrast. This property does not hold in the
case of a colour image. If the three principal components are used to establish an
RGB image (first component as red, second as green and third as blue), the result is
not optimal for the human visual system. The first principal component (red) will
exhibit a high degree of contrast, the second (green) will display only a limited range
of the available brightness values, whilst the third (blue) will demonstrate an even
smaller range. In addition, the three components displayed as R, G and B are totally
uncorrelated, an assumption that does not hold for natural images
(Chavez 1989, Gonzalez and Woods 2002, Forsyth and Ponce 2002).
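For comparison with the proposed method, the PCA rotation of equations (5)–(7) can be sketched as follows (an illustrative NumPy implementation, not the authors' code):

```python
import numpy as np

def pca_transform(bands):
    """PCA / Karhunen-Loeve rotation of equation (5): the rows of A are the
    eigenvectors of Cx in descending eigenvalue order, so the covariance of
    y = A (x - m_x) is the diagonal matrix of equation (7)."""
    K = bands.shape[0]
    X = bands.reshape(K, -1).astype(float)
    mx = X.mean(axis=1, keepdims=True)
    Cx = np.cov(X, bias=True)
    lam, V = np.linalg.eigh(Cx)        # eigh returns ascending eigenvalues
    A = V[:, ::-1].T                   # eigenvectors as rows, descending order
    Y = A @ (X - mx)                   # y = A (x - m_x)
    return Y.reshape(bands.shape), A, lam[::-1]
```

The covariance of the transformed components is diagonal, which is exactly the total decorrelation that the text argues is undesirable for direct RGB display.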
3.2 The proposed method
A different approach for RGB image formation using multispectral data is not to
totally decorrelate the data, but to control the correlation between the colour
components of the final image. This is achieved by means of the covariance matrix.
The proposed transformation distributes the energy of the source multispectral
bands, so that the correlation between the RGB components of the final image is
similar to that of natural colour images. In this way no additional transformation is needed and the result can be displayed directly on any RGB device. This can be
achieved using a linear transformation of the form
$$ \mathbf{y} = A^T \mathbf{x} \quad (8) $$
where x and y are the population vectors of the source and the final images,
respectively. The relation between the covariance matrices is
$$ C_y = A^T C_x A \quad (9) $$
where Cx is the covariance of the vector population x and Cy is the covariance of the
arising vector population y. The required values for the elements in the resulting
covariance matrix Cy are based on the study of natural colour images as explained
previously. The selection of a covariance matrix based on the statistical properties of
natural colour images guarantees that the resulting colour image will be pleasing for
the human eye. The RGB correlation coefficients depend on the scenes depicted in
the images. However, since a large variety of images with different scenes,
perceptually pleasing for the observer, have been chosen from the database, the
mean value of the correlation coefficients is not affected by the selection of the
scenes. The matrices Cx and Cy are of the same dimension and, if they are known,
the transformation matrix A can be evaluated using the Cholesky factorization
method. Accordingly, a symmetric positive definite matrix S can be decomposed by
means of an upper triangular matrix Q, so that
$$ S = Q^T Q \quad (10) $$
The matrices Cx, Cy using the above factorization can be written as
$$ C_x = Q_x^T Q_x, \qquad C_y = Q_y^T Q_y \quad (11) $$
and equation (9) becomes
$$ Q_y^T Q_y = A^T Q_x^T Q_x A = \left(Q_x A\right)^T \left(Q_x A\right) \quad (12) $$

thus

$$ Q_y = Q_x A \quad (13) $$

and the transformation matrix A is

$$ A = Q_x^{-1} Q_y \quad (14) $$
The final form of the transformation matrix A implies that the proposed
transformation depends on the statistical properties of the original multispectral
dataset. Additionally, in the design of the transformation the statistical properties of
natural colour images are taken into account. The resulting population vector y is of
the same order as the original population vector x, but only three of the components
of y will be used for colour representation.
The evaluation of the desired covariance matrix Cy for the transformed vector is
based on the statistical properties of natural colour images, discussed in § 2.3, and on
requirements imposed by the user or the visual expert. The relation between the
covariance Cy and the correlation coefficient matrix Ry is given by
$$ C_y = \Sigma R_y \Sigma^T \quad (15) $$
where
$$ \Sigma = \begin{bmatrix} \sigma_{y1} & 0 & 0 & \cdots & 0 \\ 0 & \sigma_{y2} & 0 & \cdots & 0 \\ 0 & 0 & \sigma_{y3} & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & \sigma_{yK} \end{bmatrix} \quad (16) $$
is the diagonal matrix with the standard deviations of the new components on the
main diagonal, and
$$ R_y = \begin{bmatrix} 1 & r_{R,G} & r_{R,B} & \cdots & 0 \\ r_{R,G} & 1 & r_{G,B} & \cdots & 0 \\ r_{R,B} & r_{G,B} & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{bmatrix} \quad (17) $$
is the desired correlation coefficient matrix.
The steps that one has to follow in order to apply the proposed method can be
summarized as follows:
1. Determine the desired Ry in equation (17) and evaluate the corresponding Cy
from equation (15).
2. Evaluate Cx from the source multispectral data.
3. Calculate Qx and Qy from Cx and Cy using Cholesky decomposition.
4. Evaluate the required transformation matrix A using equation (14).
For high visual quality the final colour image produced by the transformation
must have a high degree of contrast. In other words the energy of the original data
must be sustained and equally distributed in the RGB components of the final
colour image. This requirement is expressed as follows
$$ \sum_{i=1}^{K} \sigma_{x_i}^2 = \sum_{i=1}^{3} \sigma_{y_i}^2 \quad (18) $$
with $\sigma_{y1} \approx \sigma_{y2} \approx \sigma_{y3}$. The remaining K−3 bands should have negligible
energy (contrast) and will not be used in forming the final colour image. Their
variance can be set to small values, say $\sigma_{yi} = 10^{-4}\,\sigma_{y1}$ for $i = 4, \ldots, K$.
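The four steps, together with the energy distribution of equation (18), can be sketched end to end. This is one reading of the method under the stated assumptions (the three colour components share the total source variance equally; the remaining bands receive negligible variance), not the authors' implementation:

```python
import numpy as np

def fusion_matrix(Cx, Ry_rgb, eps=1e-4):
    """Steps 1-4 of the proposed method: build the target covariance Cy
    from the desired RGB correlation block (equation (15)), then recover
    A = Qx^{-1} Qy from Cholesky factors (equation (14))."""
    K = Cx.shape[0]
    # Step 1: desired correlation matrix Ry (equation (17)); identity
    # outside the 3x3 RGB block.
    Ry = np.eye(K)
    Ry[:3, :3] = Ry_rgb
    # Equation (18): total source energy split equally over the three
    # colour components; bands 4..K get variance (eps * sigma_y1)^2.
    s = np.full(K, eps)
    s[:3] = 1.0
    s *= np.sqrt(np.trace(Cx) / 3.0)
    Sigma = np.diag(s)
    Cy = Sigma @ Ry @ Sigma.T          # equation (15)
    # Steps 3-4: upper-triangular Cholesky factors, so Cx = Qx^T Qx.
    Qx = np.linalg.cholesky(Cx).T
    Qy = np.linalg.cholesky(Cy).T
    return np.linalg.solve(Qx, Qy)     # A = Qx^{-1} Qy (equation (14))
```

With y = A^T x, the resulting covariance A^T Cx A equals Cy, so the requested RGB correlations hold by construction.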
3.3 Selection of primary bands
The selection of the initial spectral bands that will help to determine the
primary axes for projecting the multispectral information is of paramount
importance for the visual quality of the final colour image. In principal components
analysis this selection is based on the three largest eigenvalues and the
corresponding eigenvectors. Consequently, the direction of the three new axes
used for projection is different from that of the initial bands and simultaneously
the information is totally decorrelated. According to the proposed transfor-
mation, the projection is carried out giving special importance to those of the initial
axes (spectral bands) which possess the largest amount of energy. It is preferable
that these bands have the smallest possible correlation with all the other spectral
bands.
A method proposed by Chavez et al. (1982) takes into consideration the
previously mentioned requirements. Specifically, this selection method is based on
the optimum index factor (OIF), which considers the source spectral bands in
triplets and is defined as
$$ \mathrm{OIF} = \frac{\sum_{i=1}^{3} \sigma_i}{\sum_{j=1}^{3} \left|r_j\right|} \quad (19) $$
where si is the standard deviation of each of the three selected bands and rj is the
correlation coefficient between any pair formed by these bands. The OIF is
evaluated for all possible three-band combinations, and the group of bands with the
highest OIF is selected for projecting the information content of the
multispectral data.
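The triplet search of equation (19) can be sketched as follows, assuming a precomputed vector of band standard deviations `s` and correlation coefficient matrix `R` (illustrative names):

```python
import numpy as np
from itertools import combinations

def best_oif_triplet(s, R):
    """Evaluate the OIF of equation (19) for every three-band group and
    return the group with the highest value together with that value."""
    best_trio, best_oif = None, -np.inf
    for trio in combinations(range(len(s)), 3):
        i, j, k = trio
        oif = (s[i] + s[j] + s[k]) / (abs(R[i, j]) + abs(R[i, k]) + abs(R[j, k]))
        if oif > best_oif:
            best_trio, best_oif = trio, oif
    return best_trio, best_oif
```

With equal standard deviations, the least mutually correlated triplet wins, as the definition suggests.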
In this work the information quality of each of the original bands is assessed using
the factor MEMC, which stands for maximum energy minimum correlation, defined
on each separate source band as
$$ \mathrm{MEMC} = \frac{\sigma_i}{\left| \sum_{j=1,\, j \ne i}^{K} r_{i,j} \right|} \quad (20) $$
for each band $i = 1, \ldots, K$, where $\sigma_i$ is the standard deviation of the band and $r_{i,j}$ is the
correlation coefficient between band i and the rest of the bands. The three source
spectral bands with the largest MEMC index span the maximum of the original
spectral space. According to the proposed method the source spectral bands are
ordered with descending MEMC index before applying the transformation given by
equation (14).
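The MEMC ranking of equation (20) can be sketched under the same illustrative inputs (per-band standard deviations `s` and correlation coefficient matrix `R`):

```python
import numpy as np

def memc_order(s, R):
    """MEMC index of equation (20) for each band: its standard deviation
    divided by the absolute sum of its correlations with every other band;
    bands are returned in descending MEMC order."""
    K = len(s)
    memc = np.array([s[i] / abs(sum(R[i, j] for j in range(K) if j != i))
                     for i in range(K)])
    order = np.argsort(memc)[::-1]     # descending MEMC
    return order, memc
```

The first three entries of `order` are the primary bands. Note the ranking obtained in the paper also depends on the actual band standard deviations, which are not reproduced here.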
3.4 Objective performance evaluation
The performance evaluation of image fusion methods and the testing of the achieved
results is a relatively complex issue because of the different sources of data and the
different aims of fusion processes. The method proposed in this work aims to derive
a colour image of improved quality and fidelity that will be used mainly for visual
interpretation. Therefore, the overall performance evaluation is based on perceptual
evaluation as in Achalakul and Taylor (2001), Tyo et al. (2003), Bogogni and
Hansen (2001) and Toet and Franken (2003). In recent years, a few objective
measures for the evaluation of fusion methods have been proposed (Xydeas and
Petrovic 2000, Qu et al. 2002). These measures have been developed for the
assessment of greyscale fusion techniques, thus their use in colour fusion is not
straightforward. A numerical quality assessment of image fusion based on mutual
information has been recently introduced in Qu et al. (2002).
Each source multispectral band X and each colour component of the final
colour image Y, can be treated as discrete random variables distributed according
to probabilities pX(x) and pY(y), respectively. Thus, the mutual information
shared by a source multispectral image and one of the final colour components is
given by
$$ I_{XY} = \sum_x \sum_y p_{XY}(x,y) \log \frac{p_{XY}(x,y)}{p_X(x)\, p_Y(y)} \quad (21) $$
It can be proved that mutual information is always nonnegative and vanishes
only if $p_{XY}(x,y) = p_X(x)\,p_Y(y)$. Therefore, it can be interpreted as a measure of the
statistical dependence between the variables X and Y. The physical significance of
mutual information is that it quantifies the amount of common information
between two images. In Qu et al. (2002) the mutual information between each source
image and the final greyscale image is evaluated. The fusion performance measure is
the total mutual information.
In this work the total mutual information between the original multispectral
bands and each colour component of the final image is evaluated. An iteration
process is employed in order to maximize the total mutual information by adjusting
the elements of the resulting correlation matrix. For this purpose each element of the
correlation coefficient matrix $R_y$ takes its value in a range defined by the corresponding
variance given in table 1. In this way not only the perceptual attributes, related to the
correlated bands in the RGB colour space, are incorporated in the method, but also
the objective condition of the maximization of mutual information is satisfied.
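The mutual information of equation (21) can be estimated from a joint histogram. A minimal sketch for two co-registered bands (the bin count is an illustrative choice):

```python
import numpy as np

def mutual_information(X, Y, bins=64):
    """Estimate the mutual information of equation (21) from the joint
    histogram of two images; the result is in bits."""
    pxy, _, _ = np.histogram2d(X.ravel(), Y.ravel(), bins=bins)
    pxy /= pxy.sum()                              # joint probabilities
    px = pxy.sum(axis=1, keepdims=True)           # marginal p_X
    py = pxy.sum(axis=0, keepdims=True)           # marginal p_Y
    nz = pxy > 0                                  # treat 0 log 0 as 0
    return float((pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])).sum())
```

An image shares maximal information with itself (its own entropy) and little with an independent image; the measure used in the text sums such terms over all source bands and the three colour components of the fused image.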
4. Experimental procedure
4.1 Multispectral data description
The multispectral dataset used in this work consists of four multispectral bands and
is available from Space Imaging (www.spaceimaging.com) and was acquired by the
IKONOS-2 sensor. Each band has a radiometric resolution of 11 bits per pixel and a
size of 2001×2001 pixels. The ground resolution provided by IKONOS-2 for the
multispectral imagery is 4 m. The spectral range of the sensor is depicted in table 2.
The area covered in this multispectral image is mainly an urban area with a
structured road network, a forest, a stadium, a park, etc. The correlation among the
source multispectral components is shown in table 3. Obviously, a high degree of
correlation is present mainly between the three components that lie in the visible
region of the electromagnetic spectrum.
The selection of primary bands for projecting the multispectral information is
carried out here on a perceptual as well as on a statistical basis. According to the
perceptual approach bands 1, 2 and 3, given in table 2, are used to determine
Table 2. Spectral range of IKONOS-2 data.
Band number         Spectral range (μm)
1 (blue)            0.45–0.52
2 (green)           0.51–0.60
3 (red)             0.63–0.70
4 (near infrared)   0.76–0.85
Table 3. Correlation coefficient matrix for IKONOS data.
Band number   1       2       3       4
1             1.000   0.986   0.959   0.393
2             0.986   1.000   0.983   0.385
3             0.959   0.983   1.000   0.274
4             0.393   0.385   0.274   1.000
the projection directions, since their spectral range lies in the visible region of the
electromagnetic spectrum. In this way the information contained in band 4 (near
infrared) is distributed on the three primary bands, thus sustaining all the visually
perceivable information of the original bands.
Statistically based band selection is implemented using both OIF and MEMC
indexes. The values of the OIF index for all combinations of source spectral bands
are displayed in table 4, while index MEMC is shown for each of the four bands in
table 5. The OIF index indicates that bands 2, 3 and 4 are the most important for
primary axes evaluation. On the other hand the MEMC index designates bands 1, 4
and 2.
4.2 Experimental results
The fusion results are demonstrated in figure 3. In the upper left image a false colour
composite using only the first three bands of the data is shown. The image in
figure 3(b) is derived from PCA analysis. The two images in the second row of
figure 3 have resulted according to the proposed method using the perceptual
selection and the MEMC index, respectively. The transformation matrix A, calculated by means of equation (14), was based on the following correlation
coefficient matrix
$$ R_y = \begin{bmatrix} 1 & 0.8487 & 0.7040 & 0 \\ 0.8487 & 1 & 0.8849 & 0 \\ 0.7040 & 0.8849 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \quad (22) $$
according to table 1. The resulting transformation matrix A is shown in table 6 for the
case of using channels 1, 2 and 3 as primary bands. Evidently, in the perceptual
case the final colour image possesses more natural colours, while the image resulting
from the MEMC selection is more expressive for the human eye.
Subjectively, the proposed method produces a colour image that can be
characterized as a rich colour image with similar properties to those of natural
colour images. Another important property of the colour images resulting from the
proposed transformation is that areas with the same spectral signature (urban area,
sea, forestry, etc.) are depicted with variations of the same colour. In other words,
Table 4. OIF index for the multispectral dataset.
Band combination   OIF
1, 2, 3            365
1, 2, 4            644
2, 3, 4            680
Table 5. MEMC index for the multispectral dataset.
Band number   MEMC (×10⁴)
1             5.646
2             5.530
3             5.418
4             5.570
the overall colour balance of the area is preserved but the colours are presented with
more shading. In addition, the proposed method outperforms the other two, because
the resulting image has a greater variety of colours and hues that are perceivable by
the human eye, especially in dark areas of the source images.
Table 6. Transformation matrix A for IKONOS data in the case of selecting multispectral bands 1, 2 and 3 for projection.

$$ A = \begin{bmatrix} 1.1024 & -2.5622 & -1.7097 & 0.0383 \\ 0 & 3.5480 & -0.3595 & -0.2209 \\ 0 & 0 & 3.1217 & 0.1761 \\ 0 & 0 & 0 & 0.0433 \end{bmatrix} $$

Figure 3. Detail image: (a) false colour composite of the first three bands; (b) first three PC components from PCA transformation; (c) proposed method; (d) proposed method with MEMC selection.

The objective evaluation of the proposed method is based on the measure
described in § 3.4 and the results are displayed in table 7. The proposed method
outperforms PCA-based approaches in both realizations. The amount of information
conveyed in the fused colour image, as described by mutual information, is
greater in all cases. In addition, the information of the source multispectral images is
almost uniformly distributed among the RGB components of the fused image. This
property is a result of the attributes of the human visual system that have been
incorporated in the proposed transformation.
Further improvement in the visual quality of the resulting colour image could be
achieved by means of a post-processing step. This step includes the use of a
histogram equalization technique in order to improve the contrast of the final colour
image. In the case of PCA, the second and third principal components always have a
narrow dynamic range because of the energy compaction property of the KL transform.
The proposed method outperforms PCA because the three components
produced by the transform in equation (8) have improved contrast. The results reveal that the
visual appearance along with the discrimination capability is enhanced.
5. Conclusions
The main purpose of this paper is to present a new fusion technique for
multispectral images. The fusion process results in a colour image suitable for
visual interpretation and provides a novel scheme for the display of multispectral
imagery. Its basic idea is to control the terms of the covariance matrix of the output
colour image so that attributes related to human colour perception are incorporated.
For this purpose correlation properties in natural colour images are taken into
consideration in the design of the transform. The perceptual attributes of the
obtained image are not sensitive to small variations of the correlation coefficient
values. In addition, the proposed technique is well suited for direct representation to
any RGB-based device.
The projecting directions have been derived according to the MEMC index
introduced in this paper. This index reveals the multispectral bands that play a
dominating role in the proposed transformation, since they have the maximum
energy and the smallest correlation among all the other bands. In order to establish
the proposed fusion technique, both subjective and objective performance
evaluations have been carried out. The objective evaluation is based on mutual
information and justifies the proposed method as meaningful. Specifically, the total
mutual information is proposed and used as a measure for maximizing the
information conveyed from the source multispectral bands to the final colour image.
Subjectively, the experimental results demonstrate that the proposed method
produces a colour image with a large variety of colours and hues. In this way the
ability of the human eye to perceive millions of colours is fully exploited. Another
main advantage of the technique is that the resulting colour image is formed in the
RGB colour space and no further transformation is needed.
Table 7. Mutual information between source multispectral bands and the final colour image.
        PCA                      Proposed                 Proposed using MEMC
Band    Red    Green  Blue      Red    Green  Blue      Red    Green  Blue
1       1.833  0.265  0.270     3.884  0.775  0.498     4.053  1.272  0.589
2       1.893  0.296  0.231     1.901  1.079  0.589     0.594  1.126  0.596
3       1.268  0.387  0.239     1.359  1.109  0.825     1.907  0.901  0.781
4       0.617  0.831  0.162     0.592  0.258  0.251     1.362  0.656  0.744
Sum     5.611  1.779  0.902     7.736  3.221  2.163     7.916  3.955  2.710
Acknowledgments
The authors thank the referees for their comments and suggestions that have helped
to improve this paper. This work was partly supported by the European Social Fund
(ESF), Operational Program for Educational and Vocational Training II (EPEAEK II), and the Program HERAKLEITOS of the Ministry of Education and Religious
Affairs, Greece.
References
ACHALAKUL, T. and TAYLOR, S., 2001, Real-time multi-spectral image fusion. Concurrency
and Computation: Practice and Experience, 13, pp. 1063–1081.
BOGOGNI, L. and HANSEN, M., 2001, Pattern selective colour image fusion. Pattern
Recognition, 34, pp. 1515–1526.
CHAVEZ, P.S., 1989, Radiometric calibration of Landsat Thematic Mapper multi-
spectral images. Photogrammetric Engineering and Remote Sensing, 55,
pp. 1285–1294.
CHAVEZ, P.S., BERLIN, G.L. and SOWERS, L.B., 1982, Statistical methods for selecting
LandSat MSS ratios. Journal of Applied Photographic Engineering, 8, pp. 23–30.
FORSYTH, D. and PONCE, J., 2002, Computer Vision (Englewood Cliffs, NJ: Prentice Hall).
GONZALEZ, R.C. and WOODS, R.E., 2002, Digital Image Processing (New York: Addison-
Wesley).
LIU, Z., TSUKADA, K., HANASAKI, K., HO, Y.K. and DAI, Y.P., 2001, Image fusion by using
steerable pyramid. Pattern Recognition Letters, 22, pp. 929–939.
MATSOPOULOS, G.K. and MARSHALL, S., 1995, Application of morphological pyramids:
Fusion of MR and CT phantoms. Journal of Visual Communications and Image
Representation, 6, pp. 196–207.
MATSOPOULOS, G.K., MARSHALL, S. and BRUNT, J.N.H., 1994, Multiresolution morpholo-
gical fusion of MR and CT images of the human brain. IEEE Proceedings on Vision,
Image and Signal Processing, 141, pp. 137–142.
MUKHOPADHYAY, S. and CHANDA, B., 2001, Fusion of 2D grayscale images using multiscale
morphology. Pattern Recognition, 34, pp. 1939–1949.
POHL, C. and VAN GENDEREN, J.L., 1998, Multisensor image fusion in remote sensing:
concepts, methods and applications. International Journal of Remote Sensing, 19,
pp. 823–854.
QU, G., ZHANG, D. and YAN, P., 2002, Information measure for performance of image fusion.
Electronics Letters, 38, pp. 313–315.
RAST, M., JASKOLLA, M. and ARANSON, F., 1991, Comparative digital analysis of Seasat-SAR
and LandSat-TM data for Iceland. International Journal of Remote Sensing, 12,
pp. 527–544.
SHKVARKO, Y.V., SHMAILY, Y.S., JAIME-RIVAS, R. and TORRES-CISNEROS, M., 2001, System
fusion in passive sensing using a modified hopfield network. Journal of the Franklin
Institute, 338, pp. 405–427.
SOLBERG, A., TAXT, T. and JAIN, A., 1996, A Markov random field model for classification of
multisource satellite imagery. IEEE Transactions on Geoscience and Remote Sensing,
34, pp. 100–113.
TOET, A. and FRANKEN, E.M., 2003, Perceptual evaluation of different image fusion schemes.
Displays, 24, pp. 25–37.
TYO, J.C., KONSOLAKIS, A., DIERSEN, D. and OLSEN, R.C., 2003, Principal components based
display strategy for spectral imagery. IEEE Transactions on Geoscience and Remote
Sensing, 41, pp. 708–718.
XYDEAS, C.S. and PETROVIC, V., 2000, Objective image fusion performance measure.
Electronics Letters, 36, pp. 308–309.
ZHANG, Z., SUAN, S. and ZHENG, F., 2001, Image fusion based on median filters and SOFM
neural networks: A three-step scheme. Signal Processing, 81, pp. 1325–1330.