
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, VOL. 49, NO. 11, NOVEMBER 2011 4163

Pixel Unmixing in Hyperspectral Data by Means of Neural Networks

Giorgio A. Licciardi and Fabio Del Frate, Senior Member, IEEE

Abstract—Neural networks (NNs) are recognized as very effective techniques when facing complex retrieval tasks in remote sensing. In this paper, the potential of NNs has been applied in solving the unmixing problem in hyperspectral data. In its complete form, the processing scheme uses an NN architecture consisting of two stages: the first stage reduces the dimension of the input vector, while the second stage performs the mapping from the reduced input vector to the abundance percentages. The dimensionality reduction is performed by the so-called autoassociative NNs, which yield a nonlinear principal component analysis of the data. The evaluation of the whole performance is carried out for different sets of experimental data. The first one is provided by the Airborne Hyperspectral Scanner. The second set consists of images from the Compact High-Resolution Imaging Spectrometer on board the Project for On-Board Autonomy satellite, and it includes multiangle and multitemporal acquisitions. The third set is represented by Airborne Visible/InfraRed Imaging Spectrometer measurements. A quantitative performance analysis has been carried out in terms of effectiveness in the dimensionality reduction phase and in terms of the accuracy in the final estimation. The results obtained, when compared with those produced by appropriate benchmark techniques, show the advantages of this approach.

Index Terms—Autoassociative neural networks (AANNs), dimensionality reduction, hyperspectral, NNs, nonlinear principal components, pixel unmixing.

I. INTRODUCTION

Neural networks (NNs) appeared on the scene of remote sensing at the beginning of the 1990s when it was proven that, in multisource analysis, where we do not always know the distribution functions, they could be more appropriate than traditional statistical algorithms [1]. Since then, the use of NNs has spread in the remote sensing community, leading to an increasing number of studies reported in literature in recent years [2]. Many of them have shown considerable advantages of NNs over other methods. In brief, the rapid growth of neural approaches in remote sensing is mainly due to their largely demonstrated ability to learn the complex patterns characterizing both the forward radiative transfer model and the inversion problem [3]. Their capability to generalize in noisy environments makes NNs robust techniques when the input data are incomplete or incorrect [4]. Moreover, NNs can be flexible and capable of positively combining different types of data, with no need of assumptions about the distributions of the data set used [5].

Manuscript received September 30, 2010; revised February 1, 2011 and April 1, 2011; accepted May 19, 2011. Date of publication August 1, 2011; date of current version October 28, 2011.

G. A. Licciardi was with the Computer Science, Systems and Production Department, Tor Vergata University, 00133 Rome, Italy. He is now with the GIPSA-Lab, Grenoble Institute of Technology, 38402 Grenoble, France (e-mail: [email protected]).

F. Del Frate is with the Computer Science, Systems and Production Department, Tor Vergata University, 00133 Rome, Italy (e-mail: [email protected]).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TGRS.2011.2160950

Also, in the specific context of the pixel unmixing of multispectral images, NNs have already been proven to be an effective approach [6]–[8]. In particular, although a heavy learning computation can be required, the multilayer perceptron (MLP) models have provided competitive performance with respect to other techniques such as the linear unmixing model and fuzzy c-means classifiers [9], [10]. Considering the hyperspectral data, the unmixing problem may become more complex due to the high dimensionality of the input vector, i.e., the number of spectral bands in the acquired data. For such data, the design of effective procedures aiming at lowering the size of the data, preserving as much as possible their information content, can be one of the key steps for the success of the whole unmixing procedure [11]. In fact, a high number of spectral bands exhibit high correlation, adding a redundancy that may obscure information relevant for abundance estimation and may decrease the accuracy of final products. Dealing with an NN inversion scheme, the extraction of representative features plays an even more crucial role. A network with fewer inputs has fewer adaptive parameters to be determined, which need a smaller training set to be properly constrained. This leads to a network with improved generalization properties providing smoother mappings. In addition, a network with fewer weights may be faster to train. All of these benefits make the reduction of the dimension of the input data a recommended procedure in many NN applications [12].

As far as the use of NNs in the field of hyperspectral data is concerned, they have already been recognized as representing a rather competitive family of algorithms for image classification [13]. Moreover, NNs have already been successfully applied for the design of one of the first end-to-end processing schemes dedicated to hyperspectral imagery provided by the Compact High-Resolution Imaging Spectrometer (CHRIS)-Project for On-Board Autonomy (PROBA) satellite [14]. However, despite the recognized potential of NNs in inversion problems, their exploitation in hyperspectral pixel unmixing has so far been rather sparsely investigated, and the management of a high-dimension input was confirmed to be one of the crucial issues [15], [16].

The objective of this paper is to present a new methodology in designing robust schemes for hyperspectral unmixing using NNs. The main technical innovation is that MLP topologies are exploited for both tasks of feature reduction and abundance estimation, providing an effective solution to the management of a high-dimension input vector for pixel unmixing. The rationale of using MLP models relies on two main factors: they have been found to have the best-suited topology for classification and inversion problems [17], and they are able to perform both the feature extraction and the unmixing with one single architecture. In fact, the final implementation chains the two tasks into a rather compact scheme and is characterized by a high level of automation. In the study, both airborne and satellite images are used for the evaluation of the performance.

0196-2892/$26.00 © 2011 IEEE

II. METHODOLOGY

A. Training MLP NNs

If, on one hand, NNs are recognized as being universal approximators [18], on the other hand, their inappropriate use might lead to an undesired overfitting of the network to the training set. This means that the network becomes tailored too much to the learning examples and does not provide a satisfactory performance when it is asked to generalize over new patterns. Hence, in NN design, two main issues have to be addressed: 1) when to stop the training algorithm, and 2) how many neurons have to be included in the hidden layers of the topology. For the first aspect, we considered the so-called early stopping algorithm [12]. According to this algorithm, training and test sets are involved in the learning phase. The test set consists of examples not belonging to the training one. The network parameters are iteratively changed to minimize the error on the training set, but the network performance is, in parallel, evaluated also on the test set. The training of the net is stopped when the error on the test set reaches its minimum. When this procedure is not applied, overtraining of the net is very likely, and despite the fact that the error on the training data set may be smaller, the generalization capability is reduced. For the second task, a grid search method aiming at minimizing the mean-square error (MSE) was carried out. The MSE is given by

$$
\mathrm{MSE} = \frac{\sum_{p \in P} \sum_{j \in N} \left(t_{pj} - o_{pj}\right)^{2}}{P}
$$

where $P$ is the number of training patterns, $N$ is the number of output neurons, $t_{pj}$ is the desired (true) output of neuron $j$ for pattern $p$, and $o_{pj}$ is the actual output. It should be mentioned that most of the neural simulations were provided by the Stuttgart Neural Network Simulator package [19].
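The early stopping rule and the MSE defined in this subsection can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SNNS procedure used in the paper: the names `train_step`, `predict`, and `patience` are hypothetical, and the `patience` window (how many epochs to wait before concluding that the test error has passed its minimum) is an assumption, since the text only states that training stops when the test error reaches its minimum.

```python
import copy


def mse(targets, outputs):
    """Squared error summed over patterns and output neurons, divided by P."""
    n_patterns = len(targets)
    total = 0.0
    for target_row, output_row in zip(targets, outputs):
        for t, o in zip(target_row, output_row):
            total += (t - o) ** 2
    return total / n_patterns


def train_with_early_stopping(model, train_step, predict, train_set, test_set,
                              max_epochs=1000, patience=20):
    """Train on train_set and keep the model with the lowest test-set MSE.

    train_step(model, inputs, targets) is assumed to update model in place
    for one epoch; predict(model, inputs) is assumed to return one output
    row per input pattern. Both are hypothetical callables.
    """
    train_x, train_t = train_set
    test_x, test_t = test_set
    best_model = copy.deepcopy(model)
    best_error = mse(test_t, predict(model, test_x))
    epochs_since_best = 0
    epochs_run = 0
    for epoch in range(1, max_epochs + 1):
        train_step(model, train_x, train_t)
        epochs_run = epoch
        test_error = mse(test_t, predict(model, test_x))
        if test_error < best_error:
            # New minimum of the test error: remember this network state.
            best_error = test_error
            best_model = copy.deepcopy(model)
            epochs_since_best = 0
        else:
            epochs_since_best += 1
            if epochs_since_best >= patience:
                break  # test error has stopped improving
    return best_model, best_error, epochs_run
```

The same routine could be called once per candidate hidden-layer size to reproduce the kind of grid search mentioned above, keeping the size that gives the lowest test MSE.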

B. AANNs

Autoassociative NNs (AANNs) have already been used in remote sensing for data dimensionality reduction in [20]. However, in that case, the authors considered the AANN model in a rather different application, i.e., for the retrieval of atmospheric variables from ground-based microwave radiometry. AANNs are MLP networks where the targets used to train the network are simply the input vectors themselves so that the network is attempting to map each input vector onto itself. For this reason, the network is said to form an autoassociative mapping. The number of units in the central bottleneck layer is significantly reduced with respect to the number of input (or output) units; therefore, in general, a perfect reconstruction of all inputs is not possible. However, if network training finds an acceptable solution, i.e., a solution which gives an error below a predefined threshold, a good representation of the input must exist in the bottleneck layer. In other words, data compression caused by the network bottleneck may force hidden units to represent significant features in the data.

Fig. 1. AANN topology.

As shown in Fig. 1, the network can be viewed as two successive functional mappings. The first mapping projects the original d-dimensional data onto a lower dimensional subspace defined by the activations of the units in the central hidden layer. Because of the presence of nonlinear units, this mapping is essentially arbitrary and, in particular, not restricted to being linear, as for the standard principal component analysis (PCA). Similarly, the second half of the network defines an arbitrary functional mapping from the lower dimensional space back into the original d-dimensional space. The described network performs a nonlinear PCA (NLPCA). It might be thought that these arbitrary mappings can be performed by a net with one hidden layer, obtained from the one in Fig. 1 by removing the mapping and demapping layers, provided that the bottleneck layer has nonlinear (sigmoid) activation functions. However, it was shown by Bourlard and Kamp [21] that, in this case, PCA is the one providing the minimum error on the representation of the data in a subspace with reduced dimensionality. There is therefore no advantage in using the net with one hidden layer to perform dimensionality reduction. It also has to be noted that, as the output has to simply replicate the input, no independent target data are provided, and no a priori knowledge is needed for the implementation of the learning phase. This implies that the AANN training can be performed in a fully automatic way and that all pixels in the image can be considered for this task, which has actually been the technique adopted in this paper.

As to the number of nonlinear principal components to be included into the final processing scheme, we observed that PCA is still a consolidated technique for dimensionality reduction in hyperspectral data. For this reason, it was used for two purposes: first, it was the benchmark in evaluating the dimensionality reduction performance of the AANN technique on hyperspectral data, and second, it was used to decide the number of units to be considered in the AANN bottleneck layer. In particular, the number was set equal to the number of PCA components enabling the representation of at least 99% of the variance of the data.

Fig. 2. Cascade NN architecture performing feature extraction (left stage) and pixel unmixing (right stage).
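The rule for sizing the bottleneck and the shape of the AANN can be illustrated with a short NumPy sketch. It is a minimal example under assumptions: the function names are hypothetical, the weights are random (no training is shown), and sigmoid units are assumed in every layer, which the text does not state for the bottleneck and output layers.

```python
import numpy as np


def bottleneck_size_from_pca(pixels, variance_fraction=0.99):
    """Smallest number of principal components reaching the variance share.

    pixels has shape (n_pixels, n_bands).
    """
    covariance = np.cov(pixels, rowvar=False)
    eigenvalues = np.linalg.eigvalsh(covariance)[::-1]  # descending order
    eigenvalues = np.clip(eigenvalues, 0.0, None)       # guard round-off
    cumulative = np.cumsum(eigenvalues) / eigenvalues.sum()
    # Index of the first cumulative share >= variance_fraction, plus one.
    return int(np.searchsorted(cumulative, variance_fraction) + 1)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_aann(n_bands, n_mapping, n_bottleneck, rng):
    """Random weights for an input-mapping-bottleneck-demapping-output net."""
    sizes = [n_bands, n_mapping, n_bottleneck, n_mapping, n_bands]
    return [(rng.normal(0.0, 0.1, size=(a, b)), np.zeros(b))
            for a, b in zip(sizes[:-1], sizes[1:])]


def aann_forward(layers, pixels):
    """Return (bottleneck activations, reconstructed spectra)."""
    hidden = pixels
    code = None
    for index, (weights, bias) in enumerate(layers):
        hidden = sigmoid(hidden @ weights + bias)
        if index == 1:  # output of the second weight layer is the bottleneck
            code = hidden
    return code, hidden
```

In use, `bottleneck_size_from_pca` would be applied to all image pixels, and its result passed as `n_bottleneck`; the bottleneck activations returned by `aann_forward` play the role of the nonlinear principal components once the network has been trained to reproduce its input.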

C. Unmixing Algorithm

Once feature extraction was performed by means of the AANN, the reduced measurement vector could be used as input to a new MLP scheme for a pixel-based fuzzy classification procedure (Fig. 2). When adequate and available, a priori knowledge was exploited for two purposes: to define the number and the typology of the endmembers and to select the pure training samples (pixels) necessary for the learning phase of the NN performing the fuzzy classification. For the first task, this means that the most important types of land cover characterizing the considered test areas were identified. For the second task, the a priori information available guided the selection of a set of individual seed pixels in the image. At this point, to generate a statistically significant number of training pixels, other representative pure pixels, belonging to the same land cover type, were automatically extracted. The selection condition was that their Euclidean distance from the seed pixels was below a predefined threshold. It has to be noted that providing reliable examples is essential for the success of the network learning phase. As the selection of the training pixels was based on image photo-interpretation, the exact identification of mixed pixels could be critical. Hence, the strategy of using only pure pixels, also considered by Foody in [7], seemed more robust and appropriate in minimizing photo-interpretation errors. However, in one of the three experiments considered (Airborne Visible/InfraRed Imaging Spectrometer (AVIRIS) data), it was not possible to extract individual pure pixels in the image. In this case, the training of the NN performing the fuzzy classification was carried out using pure reference spectra made available by the United States Geological Survey (USGS) spectral library. In any case, as only pure pixels were considered in the learning phase, the following standard coding was adopted for the output vector: the output unit associated to the actual land cover class had a value of “1,” while the remaining ones had a value of “0.” Although the network is trained with binary values in the output vector, the activation functions of its processing units are real-valued sigmoidal functions, providing an output value in the range [0, 1]. Therefore, such a value can be considered correlated to the fuzzy membership value. More analytically, the abundance $a_i$ corresponding to the $i$th class is given by the following expression:

$$
a_i = \frac{o_i}{\sum_{k=1}^{M} o_k}
$$

where $o_k$ indicates the NN output associated to the $k$th endmember and $M$ is the total number of endmembers.
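The steps of this subsection (growing the training set around seed pixels, coding the targets, and converting network outputs into abundances) can be sketched as follows. This is a minimal Python illustration with hypothetical function names. One assumption is made explicit: a pixel is accepted when its Euclidean distance to at least one seed of the class is below the threshold, since the text does not say whether the distance is taken to any seed or to all of them.

```python
import math


def grow_training_set(pixels, seeds, threshold):
    """Indices of pixels closer than threshold to at least one seed spectrum."""
    selected = []
    for index, spectrum in enumerate(pixels):
        if any(math.dist(spectrum, seed) < threshold for seed in seeds):
            selected.append(index)
    return selected


def one_hot_target(class_index, n_classes):
    """Target vector for a pure pixel: 1 for its class, 0 elsewhere."""
    return [1.0 if k == class_index else 0.0 for k in range(n_classes)]


def abundances(outputs):
    """Normalize the M network outputs so that they sum to one."""
    total = sum(outputs)
    if total == 0:
        raise ValueError("all network outputs are zero")
    return [o / total for o in outputs]
```

For example, network outputs of 0.25, 0.125, and 0.125 for three endmembers give abundances of 0.5, 0.25, and 0.25. Because sigmoid outputs are nonnegative, the normalized values are nonnegative and sum to one.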

D. NN Performance Evaluation

The results yielded by the NN scheme have been qualitatively and quantitatively evaluated both in terms of the NNs’ capability of representing the hyperspectral data with a reduced number of components and in terms of the accuracy obtained on the endmember abundance estimation. For the dimensionality reduction results, a performance index was introduced and computed by considering the pixel mean percentage reconstruction error obtained over the whole image, including all bands. More precisely, we defined the reconstruction mean error (RME) as

$$
\mathrm{RME} = \frac{\sum_{1}^{M} \sum_{1}^{B} \frac{v_t - v_r}{v_t}}{M \times B}
$$

where $M$ is the overall number of pixels in the image, $B$ is the number of bands, $v_t$ is the true original band value for the specific pixel, and $v_r$ is the reconstructed value. Note that, in one case (Airborne Hyperspectral Scanner (AHS) data), an analysis of the robustness to variable atmospheric path contribution was carried out. In fact, the scattering of atmospheric molecules and aerosols and the absorption of gases such as water vapor can weaken the upwelling signals received by the sensors [22]. For the dimensionality reduction, the PCA and minimum noise fraction (MNF) algorithms have been considered as benchmarks for the assessment of the NN methodology. In fact, they are still commonly used approaches for dimensionality reduction of hyperspectral data [23], [24]. As far as the unmixing performance is concerned, the quantitative evaluation was made by considering the results provided by the network on a set of individual pixels, not used in the training phase. Some mixed pixels interpreted based on the available ground truth (in most cases, an image at higher spatial resolution) were carefully selected for this purpose. With regard to the benchmark algorithm used for comparison, the linear spectral unmixing (LSU) algorithm provided by ENVI software has been chosen. It belongs to the class of inversion algorithms based on squared-error minimization [25]. The algorithm has been set up to enforce the full-additivity and nonnegativity constraints.
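The RME index can be computed directly from the original and reconstructed data. The NumPy sketch below is an illustration with a hypothetical function name. It takes the absolute value of each relative difference so that errors of opposite sign do not cancel; this is an assumption, since the formula as printed shows no absolute value. It also assumes that no original band value is zero.

```python
import numpy as np


def reconstruction_mean_error(original, reconstructed):
    """Mean relative reconstruction error over all pixels and bands.

    Both arrays have shape (n_pixels, n_bands); original must be nonzero.
    """
    original = np.asarray(original, dtype=float)
    reconstructed = np.asarray(reconstructed, dtype=float)
    if original.shape != reconstructed.shape:
        raise ValueError("arrays must have the same shape")
    relative = np.abs(original - reconstructed) / original
    # The mean over all elements equals the double sum divided by M x B.
    return float(relative.mean())
```

For a two-pixel, two-band example with original values (10, 20) and (40, 50) and reconstructed values (9, 22) and (40, 45), the relative errors are 0.1, 0.1, 0, and 0.1, which average to 0.075.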

III. EXPERIMENTAL DATA

A. INTA-AHS Data

The airborne INTA-AHS instrument data set has been acquired in the framework of the European Space Agency (ESA) AGRISAR measurement campaign [26]. The test site is the area of Durable Environmental Multidisciplinary Monitoring Information Network (DEMMIN). This is a consolidated test site located in Mecklenburg-Western Pomerania, North-East Germany, which is based on a group of farms within a farming association, covering approximately 25 000 ha. The fields are very large in this area (on average, 200–250 ha). The main crops grown are wheat, barley, rape, maize, and sugar beet. The altitudinal range within the test site is around 50 m. The AHS has 80 spectral channels available in the visible, shortwave infrared (SWIR), and thermal infrared, with a pixel size of 5.2 m. In this paper, the acquisition taken on June 6, 2006, has been considered. At that time, five bands in the SWIR region became blind due to loose bonds in the detector array, so they were not used in this paper [27]. An RGB image of the DEMMIN test site taken by the AHS instrument is shown in Fig. 3. Note that the AGRISAR campaign included the collection of extensive ground truth data that, together with a 50-cm resolution orthophoto, formed the a priori knowledge for the algorithm training and the result assessment.

Fig. 3. RGB image of the DEMMIN test site taken by the AHS instrument. R: band 6. G: band 4. B: band 2.

B. CHRIS-PROBA Data

The payload of the PROBA-1 satellite includes the CHRIS instrument, which provides acquisitions up to 62 narrow and quasi-contiguous spectral bands with a spatial resolution of 34–40 m (mode 1) [28]. In this paper, only acquisitions in mode 3, recommended for land-use and land-change investigations, have been considered. This configuration is characterized by 18 spectral bands covering the visible and near-infrared ranges at 18 m of spatial resolution. Bands for atmospheric correction are present (442 and 490 nm), and also, the red edge range (between 700 and 800 nm) is well sampled in order to produce accurate identification and monitoring of crops and green areas. From the scientific point of view, one of the most interesting capabilities of this satellite consists in tilting the spacecraft on two axes (along and across track) during the target overpass, allowing quasi-simultaneous acquisitions from five different angles. More particularly, the images are acquired when the zenith angle of the platform, with respect to the fly-by position, is one of the following: +55°, +36°, 0°, −36°, or −55°. This multiple-view-angle imaging capability, in addition to the high spectral and spatial sensor resolution, permits the collection of a large amount of data in order to perform new studies regarding the land and the atmosphere. The area between Frascati and Tor Vergata is a test site for the PROBA-1 mission and has also been considered for this paper. This is a mainly flat area located in the southeast of Rome, Italy, which presents an interesting heterogeneous landscape consisting of both natural and artificial areas (see Fig. 4). The multiangular acquisitions given by CHRIS have been combined with a multitemporal data set. We considered three acquisition dates with a fly-by zenith angle (FZA) of 0°: February 28, 2006, August 19, 2006, and October 9, 2006. Such dates provided 54 (18 × 3) inputs and seemed particularly suitable in sampling the crops’ growth cycle and, hence, in catching the differences among the multitemporal signatures associated to each land cover type. Additionally, the acquisition at an FZA of 36° on the second date (month of August) was also considered for an overall number of 72 inputs to be exploited for the unmixing scheme. The choice of the second date stems from the fact that the multiangular information should carry the best contribution when the agricultural fields are at the stage of full development. Note that all data have been atmospherically calibrated. A VHR panchromatic QuickBird image represented the available ground truth.

Fig. 4. RGB image of the Tor Vergata–Frascati test site taken by CHRIS-PROBA. R: band 6. G: band 4. B: band 2.
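The way the 54 and 72 inputs arise can be shown by stacking the acquisitions along the band axis. The sketch below is only illustrative: the function name is hypothetical, and it assumes that the acquisitions have already been co-registered onto a common pixel grid, a step that is not described in this section.

```python
import numpy as np

N_BANDS = 18  # CHRIS mode 3 bands per acquisition


def stack_acquisitions(acquisitions):
    """Concatenate co-registered (rows, cols, bands) cubes along the bands."""
    return np.concatenate(acquisitions, axis=-1)


# Three dates at FZA 0 deg give 3 x 18 = 54 inputs per pixel; adding the
# August acquisition at FZA 36 deg gives 4 x 18 = 72 inputs per pixel.
rows, cols = 2, 3
nadir_cubes = [np.zeros((rows, cols, N_BANDS)) for _ in range(3)]
off_nadir_cube = np.zeros((rows, cols, N_BANDS))
multitemporal = stack_acquisitions(nadir_cubes)
full_input = stack_acquisitions(nadir_cubes + [off_nadir_cube])
```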

C. AVIRIS Data

The AVIRIS data set used in this paper was obtained from the AVIRIS-JPL website (http://aviris.jpl.nasa.gov). It was acquired over the Cuprite site during a 1997 campaign. AVIRIS is an optical sensor delivering calibrated images of the upwelling spectral radiance in 224 contiguous spectral bands with wavelengths from 0.4 to 2.5 µm. Cuprite is located in Nevada, approximately 200 km northwest of Las Vegas, with a relatively undisturbed acid-sulfate hydrothermal system exhibiting well-exposed alteration mineralogy consisting principally of kaolinite, alunite, and hydrothermal silica. Cuprite has been used as a geologic remote sensing test site since the early 1980s; many studies have been published [29], [30], and its geology has been mapped in detail [31]. In this paper, the original 224 bands were reduced to 134, selecting the spectral range 1.0–2.5 µm containing very diagnostic absorption features. Noisy bands were also discarded. On the Cuprite test site, nonlinear mixtures of endmembers dominate the image, and they prevented the identification of pure pixels by visual inspection. Hence, the training phase was carried out by using the mineral spectral samples taken from the USGS spectral library. The selection of the types of signatures was carried out in agreement with those identified in [32]. These correspond to the following 18 main spectral typologies: diopside, hypersthene, lizardite, illite, montmorillonite, goethite, hematite, jarosite, dolomite, calcite, chlorite, muscovite, alunite, dickite, halloysite, kaolinite, nontronite, and olivine.

Fig. 5. RGB image of the Tor Vergata–Frascati test site taken by CHRIS-PROBA. R: band 6. G: band 4. B: band 2.

IV. RESULTS AND DISCUSSIONS

A. AHS Data

In the case of the AHS data, the PCA showed that the first five principal components contained almost 99.9% of the whole statistical information. For this reason, five principal components have also been considered for the nonlinear dimensionality reduction performed by the AANN. In Fig. 5, we show the result of the analysis aiming at selecting the suitable number of neurons for the intermediate layers. We can see that the minimization of the cost function was obtained with 25 neurons. This led to the following topology for the AANN: 75 nodes for both input and output layers, 25 nodes for the mapping and demapping hidden layers, and five nodes for the bottleneck layer. As anticipated in Section II, a similar quantitative analysis, yielding plots such as the one shown in Fig. 5, has been carried out each time an MLP topology had to be designed. However, to be concise, in the rest of this paper, only the final selected configurations will be provided.

We observed that, with some minor exceptions, the components extracted by means of the AANN are generally not seriously disturbed by atmospheric effects. On the other hand, this problem seems to be more visible in the PCA components, and it seriously degrades the MNF components. This is shown in Fig. 6, where we show, for each technique, the two components that appear as the most affected by the atmospheric effects. We see how the level of disturbance is slight for NLPCA, more significant for the PCA, and even dramatic for MNF.

Fig. 6. Atmospheric effects on the first (up) and fourth (down) principal components extracted by means of the different considered techniques. (Left) NLPCA. (Center) PCA. (Right) MNF.

In Fig. 7, we examine the accuracy in reconstructing the original spectra starting from the five extracted principal components. Two examples of reconstructed signatures are shown. The plots have been obtained by averaging over pixels of the same area of interest. The two land cover types considered are water and trees. In the case of water (about 650 pixels), it can be noted that the NLPCA is significantly more effective than PCA in encoding the spectral information. In particular, the behavior in the visible and near infrared, with strong curvatures, is better reproduced. In contrast, the forest case (again about 650 pixels) is an example where the two techniques are rather comparable, even though the PCA shows slight discrepancies with the true spectra at the higher wavelengths. Trends similar to those shown in Fig. 7 have been observed for the other land cover types. An RME value of 0.051 was computed for the NLPCA technique versus a value of 0.101 for the PCA method; hence, the NLPCA reduced the reconstruction error of the linear approach by about 50%.

Fig. 7. Original spectral signatures and spectral signatures reconstructed by using NLPCA and PCA. Examples for water (up) and tree (down) land cover types. Values between bands 25 and 60 have been skipped because they are not significant.

The five components extracted from each pixel spectral signature have then been used for the implementation of the unmixing algorithm via an MLP topology of 5-25-25-10. About 45 000 pixels were considered for the training phase, and about 50 000 were considered for the test set. The training phase lasted fewer than 150 epochs; hence, a limited computational burden was required. It must be remembered that decreasing the complexity of the NN training phase was one of the objectives of this paper. In the implementation of the final automatic unmixing scheme, the whole NN architecture consists of the following layers: one input layer of 75 units; four hidden layers of 25, 5, 25, and 25 neurons, respectively; and an output vector of ten components. The first two hidden layers extract the five nonlinear principal components, which feed the subsequent hidden layers performing the abundance estimation. Ten different land cover types were identified in the scene: maize, winter wheat, winter barley, rape, sugar beet, pasture, water, artificial areas, coniferous trees, and deciduous trees. The abundance estimation was obtained starting from the fuzzy classification of the NN algorithm, adopting the procedure previously explained.

The technique was compared through a quantitative exercise with the LSU method. For such an evaluation, we derived the ground truth for 40 separate individual pixels by photo-interpretation of a 50-cm spatial resolution orthophoto taken over the same area. In Fig. 8, we show two examples of the individual selected pixels whose abundances were estimated and verified. The corresponding results in terms of percentages of each endmember are given in Table I. In the first example, it can be noted that the radiance measured by the sensor stems from the mixed contribution of three different endmembers: mostly from pasture and, for a lower percentage, from artificial area and deciduous trees. The LSU is better than NN in retrieving the artificial percentage and also detects the significant presence of deciduous trees. However, it fails in the abundance estimation of the other agricultural endmembers, particularly pasture. On the other hand, NN correctly assigns a high percentage value to pasture and low values to the other low-vegetated classes.

Fig. 8. Examples of AHS individual pixels (in the red box) for which ground truth was evaluated by photo-interpretation and compared with the unmixing results provided by LSU and NN. For the correspondence with Table I, example 1 is on the left, and example 2 is on the right.

TABLE I. ABUNDANCE ESTIMATION (IN PERCENT) FOR TWO PIXELS (THE RED BOXES IN FIG. 8) EXTRACTED FROM THE AHS IMAGE

This behavior can be explained by considering that standard unmixing methodologies such as LSU require the endmembers to be as uncorrelated as possible. If, conversely, the components are strictly connected, the mixing process between the different components is essentially nonlinear [20]. In our case, most of the endmembers correspond to crops, which may be correlated to each other, leading the linear unmixing technique to a wrong result. On the other hand, the NN unmixing technique should be able to exploit nonlinear dependencies more effectively and should provide good accuracy even if the chosen endmembers are closely correlated. The results of example 2, showing certain confusion in the LSU performance, seem to confirm this assumption. A more general quantitative assessment is provided in Table II, which reports the mean and the standard deviation values of each endmember, considering the whole set of the 40 pixels selected for the quantitative evaluation. In particular, Table II gives the quantitative assessment in terms of root-mean-square error (rmse), computed by considering all classes. For each class, the rmse value is given by

$$
\mathrm{rmse} = \sqrt{\frac{\sum_{i \in N} \left(a_{ei} - a_{ti}\right)^{2}}{N}}
$$

where $N$ is the number of pixels selected for the performance evaluation (in this case 40), $a_{ei}$ is the estimated abundance for pixel $i$, and $a_{ti}$ is the corresponding true abundance as calculated via photo-interpretation. Note that the percentage mean value of some classes is zero. In fact, it was difficult to find pixels characterized by some specific contributions by photo-interpretation.

TABLE II. ABUNDANCE STATISTICS AND RMSE VALUES OBTAINED IN THE QUANTITATIVE PERFORMANCE EVALUATION FOR THE AHS DATA

From the reported values, we can see that NN appears clearly more effective than LSU. Averaging the rmse values over all of the endmembers, we obtained a more concise performance parameter, which is 0.065 for NN and 0.329 for LSU. Therefore, the NN approach reduces the error obtained with LSU by a factor of about five.
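The per-class rmse and its average over the endmembers can be sketched as follows. This is a minimal Python illustration with hypothetical function names; the input dictionaries map each class name to the list of abundances for the evaluation pixels and are an assumed data layout, not one described in the paper.

```python
import math


def class_rmse(estimated, true):
    """Root-mean-square error between estimated and true abundances."""
    n_pixels = len(true)
    squared = sum((e - t) ** 2 for e, t in zip(estimated, true))
    return math.sqrt(squared / n_pixels)


def mean_rmse(estimated_by_class, true_by_class):
    """Average of the per-class rmse values over all endmembers."""
    values = [class_rmse(estimated_by_class[name], true_by_class[name])
              for name in true_by_class]
    return sum(values) / len(values)
```

With the averages reported in the text, the ratio between the two methods is 0.329 / 0.065, i.e., about 5.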

B. CHRIS-PROBA Data

In the case of CHRIS-PROBA, a multitemporal–multiangular data set has been considered. The final selectedAANN topology was 72-25-5-25-72. A total of 36 inputs aregiven by the acquisitions taken on two days, but with the FZAonly at 0◦, the remaining inputs are provided by the acquisitiontaken in a different date but at two FZAs (0◦ and 36◦). InFig. 9, the five NLPCA and PCA components are shown. Itcan be observed that the nonlinear components seem to focuson specific features of the scene, which is rather interesting.For example, components 2 and 3 appear rather suitable inanalyzing the building distribution, while component 1 is moresensitive to natural and agricultural areas, and component 5 issensitive to the road network. Note that components 4 and 5clearly detect the presence of cloud in one of the acquisitions,while this is not the case for the other three components. We cansee in Fig. 9 that the PCA results did not show such a specificproperty. For example, none of the five PCA components is ableto highlight the urban fabric as the third NLPCA componentdoes. These results can be, at least partially, explained with thecapability of the NLPCA in detecting nonlinear dependenciesamong the data. Also for the CHRIS-PROBA imagery, themore quantitative evaluation considering the overall error inthe reconstruction of the original band information was carriedout. In this case, parameter B includes the multitemporaland multiangular acquisitions. We obtained an RME valueof 0.006 for the NLPCA and 0.083 for the PCA, so in thiscase, the improvement of the NLPCA with respect to PCAis even more significant. For the case of the CHRIS-PROBAdata, the network topology estimating the abundances fromthe five nonlinear components is 5-36-36-11. The followingdifferent classes have been considered: vineyards, pasture,permanent crops, industrial, dark asphalt, maize, built-up area,bright asphalt, and agricultural area. 
It should be noted that three agricultural areas have been differentiated in the network

Fig. 9. Five principal components (first to fifth from the top to the bottom) obtained for the multitemporal–multiangular CHRIS imagery using (left) PCA and (right) NLPCA.

Fig. 10. Examples of PROBA individual pixels (in the red box) for which the ground truth was evaluated by photo-interpretation and compared with the unmixing results provided by LSU and NN. For the correspondence with Table III, example 1 is on the left, and example 2 is on the right.

output layer according to the growth cycle, but they have been incorporated within the same class for abundance estimation. The NN used for abundance estimation has been obtained using training and test sets of 4000 and 1200 patterns, respectively. With only five inputs, the number of epochs necessary to train the network was again around 150, far fewer than required when performing the abundance estimation directly from the 72 measurements. This confirmed that dimensionality reduction significantly lowered the complexity of the training phase. In this exercise, the ground truth in terms of percentages of abundances was determined for 50 selected individual pixels by visual inspection using a VHR panchromatic QuickBird image. It should be observed that the considered CHRIS-PROBA imagery pixel size is 18 m, so the panchromatic QuickBird data, characterized by less than 1-m resolution, can be recognized as suitable for producing the


TABLE III
ABUNDANCE ESTIMATION (IN PERCENT) FOR TWO PIXELS (THE RED BOXES IN FIG. 10) EXTRACTED FROM THE CHRIS-PROBA IMAGERY

TABLE IV
ABUNDANCE STATISTICS AND RMSE VALUES OBTAINED IN THE QUANTITATIVE PERFORMANCE EVALUATION FOR THE CHRIS-PROBA DATA

ground truth reference. In Fig. 10, we report two examples extracted from the selected pixels whose abundances were computed by interpretation of the VHR data. In Table III, the corresponding results obtained with the two methods are shown. We can see that, in the first example, the pasture estimation of NN is rather accurate, while some confusion between built-up and bright asphalt appears. Also, in example 2, the performance of NN is positive: the percentage of bright asphalt is detected with a good approximation, and the underestimation of the built-up area with respect to the industrial area can be recognized as a minor error. On the other hand, the LSU does not provide a good performance, especially in the first example where the industrial area is given as the prevailing class. In commenting on these results, we have to note that the CHRIS-PROBA data set differs from AHS in two (linked) respects: spatial resolution is lower, and endmembers correspond to categories whose spectral signatures can be more easily distinguished. In the AHS scene, most of the endmembers correspond to agricultural classes. In this second test site, we have more heterogeneity in terms of spectral signatures. These factors should, in principle, facilitate the unmixing analysis via a linear approach. However, in the CHRIS-PROBA case, we have to manage rather different types of measurements: hyperspectral, multitemporal, and multiangular. Therefore, it is possible that only the NNs have the appropriate flexibility to manage such variability in one single input quantity. Finally, we evaluated the rmse over the entire set of the pixels selected for the performance assessment. As for the AHS case, we were not able to include all endmembers in the considered ground truth mixed pixels. Table IV reports the mean and the standard deviation values

of each endmember, considering the whole set of measured CHRIS-PROBA pixels. With regard to the two dominant categories (pasture and built-up areas), the NN methodology gives an rmse well below the standard deviation value, which supports the effectiveness of the approach. The mean NN rmse value computed over all of the endmembers is 0.161, and again, it is better than the value of 0.391 obtained with LSU.
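The merging of the three agricultural outputs into a single class, as described in this subsection, can be sketched as below. The ordering of the 11 network outputs is not given in the text, so the index layout (eight one-to-one outputs followed by three agricultural growth-stage outputs) is purely hypothetical, as are the function and constant names.

```python
# Nine land cover classes listed in the text.
CLASSES = [
    "vineyards", "pasture", "permanent crops", "industrial", "dark asphalt",
    "maize", "built-up area", "bright asphalt", "agricultural area",
]

# Hypothetical output layout: indices 0-7 map one-to-one onto the first
# eight classes; indices 8-10 are the three agricultural growth stages.
AGRI_OUTPUTS = (8, 9, 10)


def merge_agricultural(outputs):
    """Collapse the 11 network outputs into the 9 reported classes."""
    if len(outputs) != 11:
        raise ValueError("expected the 11 outputs of the 5-36-36-11 network")
    merged = list(outputs[:8]) + [sum(outputs[i] for i in AGRI_OUTPUTS)]
    return dict(zip(CLASSES, merged))


# Made-up output vector, used only to exercise the function.
example_outputs = [0.0, 0.6, 0.0, 0.0, 0.0, 0.0, 0.2, 0.1, 0.05, 0.03, 0.02]
abundances = merge_agricultural(example_outputs)
```

Summing the three outputs is one plausible way to incorporate them within the same class; the paper does not state the exact rule.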

C. AVIRIS Data

The 134-band data set was reduced using the NLPCA approach. In this case, we found the best topology to be 134-50-6-50-134. This means that we reduced the original data set to six nonlinear principal components. In Fig. 11, we show two examples of spectral signatures reconstructed from the six NLPCA components compared with the original ones. For the training phase of the NN performing the unmixing, the six NLPCs corresponding to the reference spectra of the minerals were computed and given as input for the training of a network with topology 6-45-45-18. The spectra from the USGS spectral library were subdivided into 36 samples for the training set and 18 for the test set. For this exercise, the performance evaluation could only be carried out at a more qualitative level as the exact composition values of the single pixel were not available. In Fig. 12, we can see the image fractions of the most significant minerals. The maps shown agree with the results reported in [32].
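As a small worked example of the topology notation, the number of trainable parameters implied by the two AVIRIS networks can be counted as follows. The sketch assumes fully connected layers with one bias per unit, which the text does not state; the function name is hypothetical.

```python
def count_parameters(topology, with_bias=True):
    """Count weights (and optionally biases) of a fully connected network."""
    total = 0
    for n_in, n_out in zip(topology, topology[1:]):
        total += n_in * n_out      # one weight per connection
        if with_bias:
            total += n_out         # one bias per unit (assumption)
    return total


AANN_TOPOLOGY = [134, 50, 6, 50, 134]   # dimensionality reduction stage
UNMIXING_TOPOLOGY = [6, 45, 45, 18]     # abundance estimation stage

aann_params = count_parameters(AANN_TOPOLOGY)
unmixing_params = count_parameters(UNMIXING_TOPOLOGY)

# Input bands per retained nonlinear component.
compression_ratio = AANN_TOPOLOGY[0] / AANN_TOPOLOGY[2]
```

Under these assumptions, the unmixing network has 3213 parameters and the AANN has 14240, while the reduction from 134 bands to six components corresponds to a ratio of about 22.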

V. CONCLUSION

In this paper, a novel approach based on NNs for the extraction of pixel abundances from hyperspectral data has been developed. The NN performs both the dimensionality reduction procedure and the final unmixing. The final scheme is a single architecture chaining the two operations in an automatic mode. However, the procedures need to be designed separately, with special care in avoiding overfitting effects. The methodology has been applied to the following: the images acquired by the INTA-AHS instrument, characterized by 75 working bands, over a German test site; a set of satellite CHRIS-PROBA images taken over the extra-urban area of the Tor Vergata University campus, Rome, Italy; and the AVIRIS imagery taken on the Cuprite test site, NV. We have noted that the AANN approach is more suitable than other techniques such as PCA in eliminating nonlinear correlations in the data (and hence in optimizing the design of the subsequent inversion schemes). In particular, in the AHS data, we have observed a better robustness to atmospheric effects. For the CHRIS-PROBA data, the NLPCA technique seems to have interesting capabilities in feature extraction. It often happened that single NLPCA components were polarized on a specific land cover type, e.g., urban fabric, which could be of great use for object detection problems. The interesting overall performance of NLPCA highlights the potential of this technique in reducing the dimensionality of hyperspectral data, even if only for storage or transmission purposes. The unmixing results show that the reduced vector, when used as input for a subsequent dedicated NN module, yields accurate pixel abundance estimates, in every case better than those obtained with LSU. In fact, from the quantitative comparison with the


Fig. 11. Two examples of original (solid line) spectral signatures and spectral signatures reconstructed by using NLPCA (dashed line) from the AVIRIS data set on Cuprite, NV.

Fig. 12. On the left, examples of distribution of (a) alunite, (b) calcite, (c) kaolinite, (d) muscovite, (e) halloysite, and (f) jarosite derived from the NN unmixing algorithm. On the right, the classification map obtained in [32]. Note that the dashed yellow box approximately shows the common area for the two studies.

ground truth, both in AHS and in the CHRIS-PROBA data, the observed rmse on the abundance estimation is significantly lower than that obtained using the LSU approach. In the case of AHS, the NN exploits its inherent characteristic of being a nonlinear model, which is more appropriate in dealing with highly correlated endmembers. In the case of the CHRIS-PROBA data, the NN approach seems to be more effective in the management of the different types of information simultaneously available with the hyperspectral–multitemporal and multiangular acquisition modes. The experiment with AVIRIS imagery basically confirmed the effectiveness of the approach on a third type of hyperspectral data. We note that, so far, NN hyperspectral unmixing schemes have been mainly applied to airborne data, so the inclusion of the spaceborne data can be considered as an additional innovative result provided by this

paper. In particular, even if the analysis of a multitemporal and multiangular configuration might represent a rather uncommon scenario, it has to be noted that the availability of satellite multiconfiguration data is continuously increasing. From this perspective, the impact of the presented technique could be even more significant.

ACKNOWLEDGMENT

The authors would like to thank the anonymous reviewers for their helpful comments. The AHS data set was provided within the ESA CAT-1 project n. 6519, while the CHRIS-PROBA data set was provided within the ESA CAT-1 project n. 3075. The Cuprite AVIRIS data set was provided by JPL (http://aviris.jpl.nasa.gov).


REFERENCES

[1] J. A. Benediktsson, P. H. Swain, and O. K. Ersoy, “Neural network approaches versus statistical methods in classification of multisource remote sensing data,” IEEE Trans. Geosci. Remote Sens., vol. 28, no. 4, pp. 540–552, Jul. 1990.

[2] J. F. Mas and J. J. Flores, “The application of artificial neural networks to the analysis of remotely sensed data,” Int. J. Remote Sens., vol. 29, no. 3, pp. 617–663, 2008.

[3] M. S. Dawson, “Applications of electromagnetic scattering models to parameter retrieval and classification,” in Microwave Scattering and Emission Models and Their Applications, A. K. Fung, Ed. Norwood, MA: Artech House, 1994, ch. 12.

[4] F. Del Frate and G. Schiavon, “A combined natural orthogonal functions/neural network technique for the radiometric estimation of atmospheric profiles,” Radio Sci., vol. 33, no. 2, pp. 405–410, 1998.

[5] J. A. Benediktsson and J. R. Sveinsson, “Feature extraction for multisource data classification with artificial neural networks,” Int. J. Remote Sens., vol. 18, no. 4, pp. 727–740, 1997.

[6] A. Baraldi, E. Binaghi, P. Blonda, P. A. Brivio, and A. Rampini, “Comparison of the multi-layer perceptron with neurofuzzy techniques in the estimation of cover class mixture in remotely sensed data,” IEEE Trans. Geosci. Remote Sens., vol. 39, no. 5, pp. 994–1005, May 2001.

[7] G. M. Foody, “Relating the land-cover composition of mixed pixels to artificial neural network classification output,” Photogramm. Eng. Remote Sens., vol. 62, no. 5, pp. 491–499, 1996.

[8] R. Fernandes, R. Fraser, R. Latifovic, J. Cihlar, J. Beaubien, and Y. Du, “Approaches to fractional land cover and continuous field mapping: A comparative assessment over the BOREAS study region,” Remote Sens. Environ., vol. 89, no. 2, pp. 234–251, Jan. 2004.

[9] P. M. Atkinson, M. E. J. Cutler, and H. Lewis, “Mapping sub-pixel proportional land cover with AVHRR imagery,” Int. J. Remote Sens., vol. 18, no. 4, pp. 917–935, 1997.

[10] W. Liu and E. Y. Wu, “Comparison of non-linear mixture models: Sub-pixel classification,” Remote Sens. Environ., vol. 94, no. 2, pp. 145–154, Jan. 2005.

[11] N. Keshava, “A survey of spectral unmixing algorithms,” Lincoln Lab. J., vol. 14, no. 1, pp. 55–78, 2003.

[12] C. Bishop, Neural Networks for Pattern Recognition. New York: Oxford Univ. Press, 1995.

[13] G. Licciardi, F. Pacifici, D. Tuia, S. Prasad, T. West, F. Giacco, C. Thiel, J. Inglada, E. Christophe, J. Chanussot, and P. Gamba, “Decision fusion for the classification of hyperspectral data: Outcome of the 2008 GRS-S Data Fusion Contest,” IEEE Trans. Geosci. Remote Sens., vol. 47, no. 11, pp. 3857–3865, Nov. 2009.

[14] R. Duca and F. Del Frate, “Hyperspectral and multi-angle CHRIS PROBA images for the generation of land cover maps,” IEEE Trans. Geosci. Remote Sens., vol. 46, no. 10, pp. 2857–2866, Oct. 2008.

[15] J. L. Crespo, R. J. Duro, and F. López Peña, “Gaussian synapse ANNs in multi- and hyperspectral image data analysis,” IEEE Trans. Instrum. Meas., vol. 52, no. 3, pp. 724–732, Jun. 2003.

[16] J. Plaza, A. Plaza, R. Perez, and P. Martinez, “On the use of small training sets for neural network-based characterization of mixed pixels in remotely sensed hyperspectral images,” Pattern Recognit., vol. 42, no. 11, pp. 3032–3045, Nov. 2009.

[17] S.-Y. Hsu, T. Masters, M. Olson, M. Tenorio, and T. Grogan, “Comparative analysis of five neural network models,” Remote Sens. Rev., vol. 6, no. 1, pp. 319–329, 1992.

[18] K. Hornik, M. Stinchcombe, and H. White, “Multilayer feedforward networks are universal approximators,” Neural Netw., vol. 2, no. 5, pp. 359–366, 1989.

[19] A. Zell, G. Mamier, M. Vogt, N. Mache, R. Hübner, S. Döring, K. U. Herrmann, T. Soyez, M. Schmalzl, T. Sommer, A. Hatzigeorgiou, D. Posselt, T. Schreiner, B. Kett, G. Clemente, and J. Wieland, “SNNS Stuttgart Neural Network Simulator user manual,” Univ. Stuttgart, Inst. Parallel Distrib. High Perform. Syst., Stuttgart, Germany, Rep. N6/95, 1995.

[20] F. Del Frate and G. Schiavon, “Nonlinear principal component analysis for the radiometric inversion of atmospheric profiles by using neural networks,” IEEE Trans. Geosci. Remote Sens., vol. 37, no. 5, pp. 2335–2342, Sep. 1999.

[21] H. Bourlard and Y. Kamp, “Auto-association by multilayer perceptrons and singular value decomposition,” Biol. Cybern., vol. 59, no. 4/5, pp. 291–294, 1988.

[22] R. S. Fraser and Y. J. Kaufman, “The relative importance of aerosol scattering and absorption in remote sensing,” IEEE Trans. Geosci. Remote Sens., vol. GRS-23, no. 5, pp. 625–633, Sep. 1985.

[23] M. D. Farrell and R. M. Mersereau, “On the impact of PCA dimension reduction for hyperspectral detection of difficult targets,” IEEE Geosci. Remote Sens. Lett., vol. 2, no. 2, pp. 192–195, Apr. 2005.

[24] A. Green, M. Berman, P. Switzer, and M. D. Craig, “A transformation for ordering multispectral data in terms of image quality with implications for noise removal,” IEEE Trans. Geosci. Remote Sens., vol. 26, no. 1, pp. 65–74, Jan. 1988.

[25] N. Keshava and J. F. Mustard, “Spectral unmixing,” IEEE Signal Process. Mag., vol. 19, no. 1, pp. 44–57, Jan. 2002.

[26] “AGRISAR final Rep.,” Document ID 19974/06/I-LG, 2008.

[27] J. A. Gómez, E. de Miguel, Ó. Gutiérrez de la Cámara, and A. Fernández-Renau, “Status of the INTA AHS sensor,” in Proc. 5th EARSeL Workshop Imag. Spectrosc., Bruges, Belgium, Apr. 23–25, 2007.

[28] M. J. Barnsley, J. J. Settle, M. A. Cutter, D. R. Lobb, and F. Teston, “The PROBA/CHRIS mission: A low-cost smallsat for hyperspectral multiangle observations of the Earth surface and atmosphere,” IEEE Trans. Geosci. Remote Sens., vol. 42, no. 7, pp. 1512–1520, Jul. 2004.

[29] A. F. H. Goetz and V. Srivastava, “Mineralogical mapping in the Cuprite mining district,” in Proc. AIS Data Anal. Workshop, JPL Publ. 85-41, 1985, pp. 22–29.

[30] G. A. Swayze, R. N. Clark, S. Sutley, and A. J. Gallagher, “Ground-truthing AVIRIS mineral mapping at Cuprite, Nevada,” in Proc. Summaries 3rd Annu. JPL Airborne Geosci. Workshop, vol. 1, AVIRIS Workshop, JPL Publ. 92-14, 1992, pp. 47–49.

[31] G. A. Swayze, “The hydrothermal and structural history of the Cuprite mining district, southwestern Nevada: An integrated geological and geophysical approach,” Ph.D. dissertation, Univ. Colorado, Boulder, CO, 1997, 341 p.

[32] R. N. Clark, G. A. Swayze, K. E. Livo, R. F. Kokaly, S. J. Sutley, J. B. Dalton, R. R. McDougal, and C. A. Gent, “Imaging spectroscopy: Earth and planetary remote sensing with the USGS Tetracorder and expert systems,” J. Geophys. Res., vol. 108, no. E12, pp. 5.1–5.44, Dec. 2003, DOI: 10.1029/2002JE001847.

Giorgio A. Licciardi received the M.S. degree in telecommunication engineering and the Ph.D. degree in “geoinformation” from the Tor Vergata University, Rome, Italy, in 2005 and 2010, respectively.

He is currently a Postdoctoral Fellow with the Grenoble Institute of Technology (INPG), Grenoble, France, where he is conducting his research at the Laboratoire Grenoblois de l’Image, de la Parole, du Signal et de l’Automatique (GIPSA-Lab). His research includes information extraction from remote sensing data and multispectral and hyperspectral image analysis. He is also a European Space Agency Category-1 Principal Investigator for Earth observation data.

Dr. Licciardi serves as a Referee for several scientific journals such as the IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING and the IEEE GEOSCIENCE AND REMOTE SENSING LETTERS.

Fabio Del Frate (M’03–SM’09) received the Laurea degree in electronic engineering and the Ph.D. degree in computer science from the Tor Vergata University, Rome, Italy, in 1992 and 1997, respectively.

From September 1995 to June 1996, he was a Visiting Scientist with the Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge. In 1998 and 1999, he was a Research Fellow with the ESRIN establishment, European Space Agency (ESA), Frascati, Italy, where he was engaged in projects concerning end-to-end remote sensing applications. He is currently a Research Professor with the Tor Vergata University, where he teaches courses on electromagnetics and neural networks (NNs). He has been a PI in several remote sensing projects supported by ESA. He is the author or coauthor of more than 100 proceeding and journal papers with a special focus on the applications of NNs to remote sensing inversion problems. His main research topics include retrieval and classification algorithms for land cover from satellite data, oil spill detection in SAR imagery, retrieval of atmospheric variables with microwave radiometry, and data exploitation for the PROBA and OMI missions.

Dr. Del Frate has been a session organizer and a member of technical committees in different international conferences. He serves as an Associate Editor of the IEEE GEOSCIENCE AND REMOTE SENSING LETTERS.
