A Comparative Study of Remotely Sensed Data Classification


  • 8/2/2019 A Comparative Study of Remotely Sensed Data Classification


    A Comparative Study of Remotely Sensed Data Classification Using Principal Components Analysis and Divergence

    Chih-Cheng Hung*!, Ahmed Fahsi!, Wubishet Tadesse! and Tommy Coleman!

    *Department of Mathematics and Computer Science, !Center for Hydrology, Soil Climatology, and Remote Sensing
    Alabama A&M University, Normal, AL 35762 U.S.A.

    ABSTRACT

    This paper investigates principal components analysis (PCA) and divergence for transforming and selecting data bands for multispectral image classification. As the principal components are independent of one another, a color combination of the first three components can be useful in providing maximum visual separability of image features. Therefore, principal components analysis is used to generate a new set of data. Divergence, a measurement of statistical separability, is employed as a method of feature selection to choose the optimal m-band subset from the n-band data for use in the automated classification process. Classification accuracy assessment is carried out using large-scale aerial photographs. Classification results on the Landsat Thematic Mapper (TM) data show that PCA is a more effective approach than divergence.

    1. INTRODUCTION

    Image classification is the process of automatically categorizing each pixel of an image into one of several classes. It is an important tool used to analyze remotely sensed data of the earth and to extract useful thematic information. Several different approaches, including per-pixel, textural, and contextual algorithms, have already been developed for remotely sensed multispectral image classification. However, a major challenge for researchers in the field of image processing is to increase the classification accuracy of automated interpretation of remotely sensed multispectral data.

    Image classification can be done by either a pixel-based or a region-based approach. In the latter case, the image must be divided into homogeneous regions and a set of meaningful features has to be defined. Once these features are defined, image regions (blocks) can be categorized using pattern recognition techniques [1]. However, image segmentation has proven to be an elusive goal [2]. In pixel-based classification, spectral information (pixel value) is used to classify each pixel in the image. One of the main drawbacks of this method is that each pixel is treated independently, without consideration for its neighbors. In most natural scenes, objects that have similar spectral responses tend to cluster. Hence, groups of like pixels should occur together.

    Principal components analysis (PCA), which has been widely used in pattern recognition and remote sensing applications, mathematically establishes a new set of variables that describe the variance in the original data set. As the principal components are independent of one another, a combination of the first three components is useful in providing maximum visual separability of image features [4]. Therefore, principal components analysis can be used in image classification to improve the accuracy. Divergence, a measurement of statistical separability, is employed as a method of feature selection to choose the optimal subset of the original data set for use in the automated classification process. This study compares the classification accuracy obtained with PCA and with divergence. The site selected to conduct this analysis is a 15 km by 9 km area in the Huntsville region, Northern Alabama, U.S.A.

    The organization of this paper is as follows. An image classification scheme is briefly described in section 2. Section 3 gives a brief description of the principal components analysis based on the contents of [5]. The divergence analysis is sketched in section 4. Classification accuracy assessment is discussed in section 5. Results are shown in section 6. Conclusion and discussion then follow.

    0-7803-4053-1/97/$10.00 © 1997 IEEE


    2. IMAGE CLASSIFICATION SYSTEM

    Supervised and unsupervised classification are the most common methods used in image classification. In this study we used the supervised classification technique. This method is usually divided into two stages [6]: the training stage and the classification stage. The training stage is used to determine the spectral signature of the optimal number of spectral classes. The labeled classes produced by the training process are then used in the classification stage, in which each unknown pixel is assigned to one of them. A classified image appears as a mosaic of uniform parcels. Each pixel in the classified image is identified by a value internally and a color externally.
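The two stages described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, assuming the Gaussian maximum likelihood classifier that Section 6 applies and equal prior probabilities for all classes. The function names `train_signatures` and `classify`, and the pixels-by-bands array layout, are hypothetical and do not come from the original study.

```python
import numpy as np

def train_signatures(pixels, labels):
    """Training stage: estimate a spectral signature (mean vector and
    covariance matrix) for every labeled class.
    pixels: (num_pixels, num_bands) array; labels: (num_pixels,) array."""
    signatures = {}
    for c in np.unique(labels):
        samples = pixels[labels == c]
        signatures[c] = (samples.mean(axis=0), np.cov(samples, rowvar=False))
    return signatures

def classify(pixels, signatures):
    """Classification stage: assign each pixel to the class with the highest
    Gaussian log-likelihood (equal priors assumed for this sketch)."""
    classes = list(signatures)
    scores = []
    for c in classes:
        mean, cov = signatures[c]
        diff = pixels - mean
        inv = np.linalg.inv(cov)
        _, logdet = np.linalg.slogdet(cov)
        # Squared Mahalanobis distance of every pixel to the class mean.
        maha = np.einsum("ij,jk,ik->i", diff, inv, diff)
        scores.append(-0.5 * (logdet + maha))
    best = np.argmax(np.stack(scores), axis=0)
    return np.array(classes)[best]
```

Each pixel is scored independently here, which is exactly the per-pixel limitation noted in the introduction: no neighborhood information is used.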

    3. PRINCIPAL COMPONENTS ANALYSIS

    Principal components analysis is a multivariate statistical transformation technique which is based on statistical properties of vector representations. PCA provides a systematic means of reducing the dimensionality of multispectral data. PCA has been used in image data compression [7], image enhancement [8], and pattern classification [9]. To perform the PCA, the axes of the spectral space are rotated, changing the coordinates of each pixel in spectral space, and the data values as well. In other words, each principal component is formed through a linear combination of the input bands. The new axes are parallel to the axes of the ellipse (in an n-dimensional histogram, a hyperellipsoid is formed if the distribution of each input band is normal or near normal). If there is significant correlation among the original image bands, most of the image information will be contained in the first few bands (principal components) after the PCA transformation. These principal components are uncorrelated and independent.

    The first principal component shows the direction and length of the widest transect of the ellipse. Therefore, it measures the highest variation within the data. The second principal component is the widest transect of the ellipse that is orthogonal to the first principal component. Hence, the second principal component describes the largest amount of variance in the data that is not described by the first principal component [10]. In n-dimensional space, there are n principal components. Each successive principal component is the widest transect of the ellipse that is orthogonal to the previous components in the n-dimensional space and accounts for a decreasing amount of the variation in the data which is not already accounted for by the previous principal components.

    Mathematically, let $X^T = [x_1, x_2, \ldots, x_n]$ be an n-dimensional random variable with mean vector $M$ and covariance matrix $C$, and let $A$ be a matrix whose rows are formed from the eigenvectors of $C$, ordered so that the first row of $A$ is the eigenvector corresponding to the largest eigenvalue and the last row is the eigenvector corresponding to the smallest eigenvalue [5]. The PCA transformation is then defined as:

    $$Y = A(X - M)$$

    where $Y = [y_1, y_2, \ldots, y_n]^T$, superscript $T$ denotes the transpose, and each $y_i$ is the $i$-th principal component.

    4. DIVERGENCE ANALYSIS
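Before the divergence measure is developed, the transformation Y = A(X - M) from Section 3 can be illustrated with a short NumPy sketch. It assumes the image has been flattened to a pixels-by-bands array and that the covariance matrix is estimated from the image itself. The function name `pca_transform` is hypothetical, and this is not the implementation used in the study.

```python
import numpy as np

def pca_transform(X):
    """Apply Y = A (X - M) to every pixel.
    X: (num_pixels, n) array, one row per pixel and one column per band.
    Returns the transformed pixels, the matrix A, and the eigenvalues."""
    M = X.mean(axis=0)                      # mean vector
    C = np.cov(X, rowvar=False)             # n x n covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)    # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]       # reorder: largest eigenvalue first
    eigvals = eigvals[order]
    A = eigvecs[:, order].T                 # rows of A are the eigenvectors of C
    Y = (X - M) @ A.T                       # row-wise form of Y = A (X - M)
    return Y, A, eigvals
```

Keeping only the first three columns of `Y` gives the three components that the study passes to the classifier. The ratio `eigvals[:3].sum() / eigvals.sum()` indicates how much of the total variance those components retain.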

    Statistical methods of feature selection are used to quantitatively select the subset of bands that provides the greatest degree of statistical separability between two classes [12, 13]. Since remotely sensed data consist of several spectral bands, each band represents a feature in n-dimensional feature space. In other words, each pixel of an n-band image is a point in n-dimensional feature space. To reduce the computation time while maintaining the same classification accuracy, how many bands (features) should be selected for the classification process? This is the problem known in pattern recognition as feature selection. Several measures of statistical separability have been used in the machine processing of remotely sensed data [11].

    Signature separability is a statistical measure of distance between two signatures. The greater the statistical separability of the classes, the smaller the probability of error. Separability can be calculated for any combination of bands that will be used in the classification, thereby ruling out any bands that do not contribute to the classification accuracy. Divergence is one of the popular measures of statistical separability. Divergence is a covariance-weighted distance measure between class means collected in the training phase of the supervised classification. The degree of divergence between class $c_i$ and class $c_j$ is computed as [11]:

    $$\mathrm{Diverg}(c_i, c_j) = 0.5\,\mathrm{Tr}\left[(C_i - C_j)(C_j^{-1} - C_i^{-1})\right] + 0.5\,\mathrm{Tr}\left[(C_i^{-1} + C_j^{-1})(M_i - M_j)(M_i - M_j)^T\right]$$

    where Tr is the trace of a matrix, $C_i$ and $C_j$ are the covariance matrices for classes i and j, $M_i$ and $M_j$ are the mean vectors for classes i and j, and superscript $T$ denotes the transpose.

    The average divergence is usually computed, since more than two classes are defined in the training stage in practical applications. The computation involves taking the average over all possible pairs of classes. Assuming that m classes are already defined in the training phase, there are $m(m-1)/2$ such pairs, and the average divergence can be expressed as:

    $$\mathrm{Diverg}_{avg} = \frac{2}{m(m-1)} \sum_{i=1}^{m-1} \sum_{j=i+1}^{m} \mathrm{Diverg}(c_i, c_j)$$

    Using the average divergence, the subset of bands having the maximum average divergence would be selected as the most appropriate set for the classification. However, to bound the range of the divergence, a transformed divergence is normally used [12]. The transformed divergence scales the degree of divergence to lie between 0 and 2000. The saturated value of 2000 indicates excellent separability, while a low value suggests poor separability. In this study, we used the transformed divergence defined as:

    $$\mathrm{Diverg}_T(c_i, c_j) = 2000\left(1 - \exp\left(-\mathrm{Diverg}(c_i, c_j)/8\right)\right)$$
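The divergence, its transformed form, and the band-subset search can be sketched as follows. This is a hedged illustration and not the authors' code. It assumes class signatures are given as (mean vector, covariance matrix) pairs, and it ranks subsets by the average of the transformed divergence over all class pairs, a choice the text does not state explicitly. All function names are hypothetical.

```python
import numpy as np
from itertools import combinations

def divergence(Mi, Ci, Mj, Cj):
    """Divergence between two classes from their means and covariances."""
    Ci_inv, Cj_inv = np.linalg.inv(Ci), np.linalg.inv(Cj)
    d = (Mi - Mj).reshape(-1, 1)
    term_cov = 0.5 * np.trace((Ci - Cj) @ (Cj_inv - Ci_inv))
    term_mean = 0.5 * np.trace((Ci_inv + Cj_inv) @ d @ d.T)
    return term_cov + term_mean

def transformed_divergence(div):
    """Scale a divergence value into the range 0 to 2000."""
    return 2000.0 * (1.0 - np.exp(-div / 8.0))

def average_transformed_divergence(signatures, bands):
    """Average over all class pairs, using only the selected bands.
    signatures: list of (mean, covariance) tuples, one per class."""
    idx = np.array(bands)
    values = []
    for (Mi, Ci), (Mj, Cj) in combinations(signatures, 2):
        d = divergence(Mi[idx], Ci[np.ix_(idx, idx)],
                       Mj[idx], Cj[np.ix_(idx, idx)])
        values.append(transformed_divergence(d))
    return float(np.mean(values))

def best_band_subset(signatures, n_bands, m):
    """Exhaustively search all m-band subsets of the n bands."""
    return max(combinations(range(n_bands), m),
               key=lambda bands: average_transformed_divergence(signatures, bands))
```

For 7 TM bands and m = 3 the exhaustive search evaluates only 35 subsets, so no heuristic search is needed at this scale. Band indices in the sketch are zero-based, so the paper's bands 1, 3 and 5 would correspond to `(0, 2, 4)`.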

    5. CLASSIFICATION ACCURACY ASSESSMENT

    Various methods have been developed to evaluate classification accuracy. In our analysis, we used the technique developed by [13], which is based on a pixel-by-pixel comparison. This technique compares a number of sampled pixels to their corresponding ground truth data to determine the accuracy of the classification. The number of pixels to sample is determined from a predicted accuracy, which is itself estimated using a preliminary sampling scheme. For this preliminary estimate, one hundred points were randomly sampled by overlaying a regular grid onto the classified image. Comparison of the 100 sampled points to their ground truth (i.e., large-scale photographs) points yielded an 81% accuracy. This predicted accuracy was then used to determine the number, N, of pixels to sample to evaluate the classification accuracy. This number is determined by the binomial probability theory as [14]:

    $$N = \frac{Z^2 \, p \, q}{E^2}$$

    where N is the number of points (i.e., pixels) to be sampled, p is the predicted accuracy (81), q = 100 - p, Z = 2 is the standard normal deviate (1.96, rounded) for the two-sided confidence interval of 95%, and E = 5 is the allowed error.

    Once the number of points to be sampled was determined, it was necessary to select the most efficient and objective sampling design. For this, we used a stratified systematic sampling at three elements per stratum, which is considered adequate for land use/cover classification accuracy assessment [15]. The sampling procedure was carried out using statistical random tables. The N points were then identified on both images classified by PCA and divergence. The ground truth for both images was extracted from the 1:10,000 aerial photographs.

    Error matrices were generated for both images to quantify and assess their classification accuracy. The Kappa statistic has traditionally been used to evaluate the classification accuracy. It is determined as [12]:

    $$\hat{K} = \frac{N \sum_{i=1}^{r} x_{ii} - \sum_{i=1}^{r} (x_{i+} \, x_{+i})}{N^2 - \sum_{i=1}^{r} (x_{i+} \, x_{+i})}$$

    where r is the number of rows in the matrix, $x_{ii}$ is the number of observations along the matrix diagonal, $x_{i+}$ and $x_{+i}$ are the marginal totals for row i and column i, respectively, and N is the total number of observations.

    6. RESULTS
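As a reference when reading the accuracy figures reported in this section, the sample-size formula and the Kappa statistic of Section 5 can be sketched in Python. The function names are hypothetical, and the small error matrix in the usage comment is made up purely for illustration; it is not data from the study.

```python
import numpy as np

def sample_size(p, Z=2.0, E=5.0):
    """Number of points to sample, N = Z^2 * p * q / E^2, with q = 100 - p.
    p and E are expressed in percent, as in the text."""
    q = 100.0 - p
    return Z ** 2 * p * q / E ** 2

def overall_accuracy(error_matrix):
    """Fraction of observations that lie on the matrix diagonal."""
    x = np.asarray(error_matrix, dtype=float)
    return np.trace(x) / x.sum()

def kappa(error_matrix):
    """Kappa statistic computed from a square error (confusion) matrix."""
    x = np.asarray(error_matrix, dtype=float)
    N = x.sum()                                       # total observations
    diagonal = np.trace(x)                            # sum of x_ii
    chance = np.sum(x.sum(axis=1) * x.sum(axis=0))    # sum of x_i+ * x_+i
    return (N * diagonal - chance) / (N ** 2 - chance)

# Hypothetical 2-class error matrix, for illustration only:
# kappa([[45, 5], [10, 40]]) evaluates to 0.7, overall accuracy to 0.85.
```

With the values stated in Section 5 (p = 81, Z = 2, E = 5), `sample_size(81)` evaluates to about 246 points. The paper itself does not report the resulting N, so this figure is only what the formula implies.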

    To quantitatively compare the classified results derived from the PCA and divergence methods, 7 bands of TM multispectral images were used. The supervised classification approach was employed to create 6 spectral classes: cotton (C), forest (F), water (W), pasture (P), grass (G), and residential (R). The maximum likelihood classifier was then applied to the first three principal components for the PCA and to bands 1, 3 and 5 for the divergence analysis. Classification results are shown in Figure 1. Visual inspection of Figure 1 shows that the classified image from the divergence method exhibits a large amount of noise, expressed by scattered pixels of cotton inside forested areas. This anomaly does not appear on the classified image generated by the PCA.

    Statistical analysis of the classification accuracy is presented in the error matrices below (Table 1). These error matrices were established by comparing the classified images to the 1:10,000 aerial photographs. Classes listed in the columns denote the computer classification results, while classes in the rows represent reference (actual) results.

    The overall accuracy for divergence is 78%, while the accuracy for PCA is 81%. The Kappa statistic is 70% and 74% for divergence and PCA, respectively. Most confusion is shown between cotton and grass and between cotton and residential for both images. The classified image using divergence presents more confusion between cotton and forest, also shown visually in Figure 1 (scattered cotton pixels inside forested areas), while the classified image using PCA presents slightly higher confusion between pasture and grass.

    Table 1. Error matrices of the classified images from divergence (a) and PCA (b).

    7. CONCLUSION AND DISCUSSION

    A major challenge for researchers in the field of image processing and remote sensing is to increase classification accuracy in the automated interpretation of remotely sensed multispectral data. Principal components analysis and divergence for multispectral image classification were compared in this study. Classification results on the Landsat Thematic Mapper (TM) data showed that PCA is an effective approach for automated image classification. Although the overall classification accuracy shows only a slight difference between these two approaches, we believe that if random groups of pixels instead of single pixels were evaluated, a higher classification accuracy would have resulted for the PCA. This was visually detected when examining the classified images derived from the PCA and divergence methods; the divergence method resulted in an image with a large number of wrongly classified scattered pixels (e.g., a large number of cotton pixels scattered in forested areas).

    8. ACKNOWLEDGMENTS

    This work was supported by Grant No. NCCW-0084 from the National Aeronautics and Space Administration (NASA), Washington, DC. Any use of trade, product or firm names is for descriptive purposes only and does not imply endorsement by the U.S. Government. The authors wish to thank Mr. Donvilla Williams for helping produce the output product and Dr. John Adams for proofreading it.

    9. REFERENCES

    IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. PAMI-4, No. 3, 304-306, 1982.
    [3] Z. Zhang, "A new spatial classification algorithm for high ground resolution images," Proceedings of IGARSS '88 Symposium, Edinburgh, Scotland, Sep. 13-16, 1988.
    [4] A. A. D. Canas and M. E. Barnett, "The Generation and Interpretation of False-Colour Composite Principal Component Images," Int. J. Remote Sensing, Vol. 6, No. 6, 867-881, 1985.
    [5] ERDAS, IMAGINE: Field Guide (Third Edition), ERDAS, Inc., Atlanta, Georgia, 1995.


    [6] T. M. Lillesand and R. W. Kiefer, Remote Sensing and Image Interpretation (2nd Edition), John Wiley & Sons, 1987.
    [7] R. C. Gonzalez and R. E. Woods, Digital Image Processing, Addison-Wesley, 1993.
    [8] A. R. Gillespie, Digital Techniques of Image Enhancement, in Remote Sensing in Geology, edited by B. S. Siegal and A. R. Gillespie, New York, Wiley, pp. 139-226, 1980.
    [9] M. Shimura and T. Imai, Nonsupervised Classification Using the Principal Component, Pattern Recognition, Vol. 5, pp. 353-363, 1973.
    [10] P. J. Taylor, Quantitative Methods in Geography: An Introduction to Spatial Analysis, Boston, Massachusetts: Houghton Mifflin Company, 1977.
    [11] P. H. Swain and S. M. Davis (eds.), Remote Sensing: The Quantitative Approach, McGraw-Hill, 1978.
    [12] J. R. Jensen, Introductory Digital Image Processing: A Remote Sensing Perspective (2nd Edition), Prentice-Hall, 1996.
    [13] K. Fitzpatrick-Lins, The Accuracy of Selected Land Use and Land Cover Maps at Scales of 1:250,000 and 1:100,000, Journal of Research, U.S. Geological Survey, 6: 169-173, 1980.
    [14] G. W. Snedecor and W. G. Cochran, Statistical Methods, Ames: Iowa State University Press, 202-211 and 516-517, 1967.
    [15] M. Assafi, A. Fahsi, and M. Azerzak, Utilisation des images HRV de Spot pour la classification en mode d'occupation du sol de la ville de Casablanca (Maroc), Société Française de Photogrammétrie et de Télédétection, 145 (1): 8-17, 1997.
