
    Spectral–Spatial Hyperspectral Image Segmentation Using Subspace Multinomial Logistic Regression and Markov Random Fields

    Jun Li, José M. Bioucas-Dias, Member, IEEE, and Antonio Plaza, Senior Member, IEEE

    Abstract: This paper introduces a new supervised segmentation algorithm for remotely sensed hyperspectral image data which integrates the spectral and spatial information in a Bayesian framework. A multinomial logistic regression (MLR) algorithm is first used to learn the posterior probability distributions from the spectral information, using a subspace projection method to better characterize noise and highly mixed pixels. Then, contextual information is included using a multilevel logistic Markov–Gibbs Markov random field prior. Finally, a maximum a posteriori segmentation is efficiently computed by the α-expansion min-cut-based integer optimization algorithm. The proposed segmentation approach is experimentally evaluated using both simulated and real hyperspectral data sets, exhibiting state-of-the-art performance when compared with recently introduced hyperspectral image classification methods. The integration of subspace projection methods with the MLR algorithm, combined with the use of spatial–contextual information, represents an innovative contribution in the literature. This approach is shown to provide accurate characterization of hyperspectral imagery in both the spectral and the spatial domain.

    Index Terms: Hyperspectral image segmentation, Markov random field (MRF), multinomial logistic regression (MLR), subspace projection method.


    I. INTRODUCTION

    SUPERVISED classification (and segmentation) of high-dimensional data sets such as remotely sensed hyperspectral images is a difficult endeavor [1]. Obstacles, such as the Hughes phenomenon [2], appear as the data dimensionality increases. This is because the number of training samples used for the learning stage of the classifier is generally very limited compared with the number of available spectral bands. In order to circumvent this problem, several feature selection [3] and extraction [4] methods have been combined with machine learning techniques that are able to perform accurately in the presence of limited training sets, including support vector

    Manuscript received March 5, 2011; revised June 13, 2011; accepted July 10,2011. Date of publication August 30, 2011; date of current version February 24,2012. This work was supported in part by the European Commission under theMarie Curie Training Grant MEST-CT-2005-021175 and in part by the MRTN-CT-2006-035927 and AYA2008-05965-C04-02 Projects.

    J. Li and A. Plaza are with the Hyperspectral Computing Laboratory, Department of Technology of Computers and Communications, University of Extremadura, 10071 Cáceres, Spain (e-mail: junli@unex.es; aplaza@unex.es).

    J. M. Bioucas-Dias is with the Instituto de Telecomunicações and the Instituto Superior Técnico, Technical University of Lisbon, 1049-001 Lisbon, Portugal (e-mail: bioucas@lx.it.pt).

    Color versions of one or more of the figures in this paper are available onlineat http://ieeexplore.ieee.org.

    Digital Object Identifier 10.1109/TGRS.2011.2162649

    machines (SVMs) [5], [6] or multinomial logistic regression(MLR)-based classifiers [7], [8].

    Due to sensor design considerations, the wealth of spectral information in hyperspectral data is often not complemented by extremely fine spatial resolution. This (and other phenomena, such as the presence of mixtures of components at different scales) leads to the problem of mixed pixels, which represent a challenge for accurate hyperspectral image classification [9]. In order to address this issue, subspace projection methods [10] have been shown to be a powerful class of statistical pattern classification algorithms [11]. These methods can handle the high dimensionality of an input data set by bringing it to the right subspace without losing the original information that allows for the separation of classes.

    In this context, subspace projection methods can provide competitive advantages by separating classes which are very similar in the spectral sense, thus addressing the limitations imposed on the classification process by the presence of highly mixed pixels. The idea of applying subspace projection methods to improve classification relies on the basic assumption that the samples within each class approximately lie in a lower dimensional subspace. Thus, each class may be represented by a subspace spanned by a set of basis vectors, while the classification criterion for a new input sample is its distance from the class subspace [12]–[14]. Recently, several subspace projection methods have been specifically designed for improving hyperspectral data characterization [3], [15]–[17], obtaining successful results. Another recent trend is to combine spectral and spatial–contextual information [8], [9], [18]–[22]. In some of these works, Markov random fields (MRFs) have obtained great success in characterizing spatial information in remotely sensed data sets [23], [24]. MRFs exploit the continuity, in the probability sense, of neighboring labels. The basic assumption is that, in a hyperspectral image, two neighboring pixels are very likely to have the same class label.
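    The classification-by-subspace-distance criterion described above can be sketched as follows. This is an illustrative sketch only, not the paper's exact estimator: the function names are hypothetical, and an SVD of the per-class training samples is assumed as one common way to obtain the basis vectors spanning each class subspace.

```python
import numpy as np

def class_subspace_bases(X_by_class, r):
    """For each class, fit an r-dimensional subspace to its training
    samples via SVD and return an orthonormal basis (d x r)."""
    bases = {}
    for k, X in X_by_class.items():        # X: (n_k, d) samples of class k
        U, _, _ = np.linalg.svd(X.T, full_matrices=False)
        bases[k] = U[:, :r]                # leading left singular vectors
    return bases

def classify_by_residual(x, bases):
    """Assign x to the class whose subspace leaves the smallest
    projection residual ||x - U U^T x||."""
    dists = {k: np.linalg.norm(x - U @ (U.T @ x)) for k, U in bases.items()}
    return min(dists, key=dists.get)
```

    A new sample is thus labeled by the subspace it is closest to, which is robust to class similarity in the full spectral space because each class is compared only against its own low-dimensional representation.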

    In this paper, we propose a new Bayesian approach to hyperspectral image segmentation which combines spectral and spatial information. The algorithm implements the following two main steps: 1) learning, where the posterior probability distributions are modeled by an MLR combined with a subspace projection method, and 2) segmentation, which infers an image of class labels from a posterior distribution built on the learned subspace classifier and on a multilevel logistic (MLL) prior on the image of labels. The final output of the algorithm is based

    0196-2892/$26.00 © 2011 IEEE


    on a maximum a posteriori (MAP) segmentation process which is computed via an efficient min-cut-based integer optimization technique. The main novelty of the proposed work is the integration of a subspace projection method with the MLR, further combined with spatial–contextual information, which will be shown to provide a good characterization of the content of hyperspectral imagery in both the spectral and the spatial domain. The proposed Bayesian method exhibits good discriminatory capability when dealing with ill-posed problems, i.e., limited training samples versus high dimensionality of the input data. In addition, we emphasize that the proposed approach provides the class posterior probabilities on which the complete posterior is built, so that the final MAP segmentation can benefit from the inclusion of both the spectral and the spatial information available in the original hyperspectral data. As will be shown by our experimental results, the accuracies achieved by our approach are competitive or superior to those provided by many other state-of-the-art supervised classifiers for hyperspectral image analysis. Furthermore, an important innovative contribution of this work with regard to our previous work in [8] and [22] is that the proposed method applies a subspace projection method instead of considering the full spectral information as an input to the MLR model, thus circumventing some limitations of the techniques described in [8] and [22] due to the high dimensionality of the input data and the presence of highly mixed pixels, which are now tackled simultaneously by our newly proposed approach.
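    The energy minimized in the MAP segmentation step combines a data term (negative log of the class posteriors produced by the learning step) with an MLL/Potts smoothness term over neighboring labels. The sketch below illustrates that energy; note that the paper's solver is the α-expansion min-cut algorithm, whereas here a simple iterated conditional modes (ICM) sweep is used purely as an illustrative stand-in, and all names are hypothetical.

```python
import numpy as np

def icm_segmentation(post, mu=1.0, n_iter=10):
    """Approximate MAP labeling under an MLL (Potts) spatial prior.

    post: (H, W, K) array of class posteriors p(y_i = k | x_i),
          e.g., from the subspace MLR learning step.
    mu:   spatial smoothness weight of the MLL prior.
    """
    H, W, K = post.shape
    unary = -np.log(post + 1e-12)          # data term per pixel and class
    y = post.argmax(axis=2)                # spectral-only initialization
    for _ in range(n_iter):
        for i in range(H):
            for j in range(W):
                # 4-connected neighbors inside the image
                nbrs = [y[a, b] for a, b in
                        ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
                        if 0 <= a < H and 0 <= b < W]
                # unary cost plus penalty for disagreeing with neighbors
                costs = [unary[i, j, k] + mu * sum(n != k for n in nbrs)
                         for k in range(K)]
                y[i, j] = int(np.argmin(costs))
    return y
```

    With mu = 0, the result reduces to the purely spectral (pixelwise) classification; increasing mu trades spectral evidence for spatial smoothness, which is exactly how isolated noisy labels inside homogeneous regions get corrected.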

    The remainder of this paper is organized as follows.Section II formulates the problem. Section III describes theproposed approach. Section IV reports segmentation resultsbased on simulated and real hyperspectral data sets in compar-ison with other state-of-the-art supervised classifiers. Finally,Section V concludes with some remarks and hints at plausiblefuture research lines.


    II. PROBLEM FORMULATION

    Before describing our proposed approach, let us first define some of the notation that will be used throughout this paper:

    S ≡ {1, . . . , n}                                   Set of integers indexing the n pixels of an image.
    K ≡ {1, . . . , K}                                   Set of K classes.
    x ≡ (x1, . . . , xn) ∈ R^(d×n)                       Image in which the pixels are d-dimensional vectors.
    y ≡ (y1, . . . , yn) ∈ L^n                           Image of labels.
    D^(k)_l(k) ≡ {(y1, x1), . . . , (y_l(k), x_l(k))}    Set of labeled samples for class k, with size l(k).
    X^(k)_l(k) ≡ {x1, . . . , x_l(k)}                    Set of feature vectors in D^(k)_l(k).
    D_l ≡ {D^(1)_l(1), . . . , D^(K)_l(K)}               Set of labeled samples, with size l = Σ_{k=1}^{K} l(k).

    With the aforementioned definitions in place, the goal of classification is to assign a label y_i ∈ K to each pixel vector x_i, with i ∈ S. This process results in an image of class labels y, and we will call this assignment a labeling. In turn, the goal of segmentation is to partition the set S such that the pixels in each subset S_k, with S = ∪_k S_k, share some common property, e.g., they represent the same type of land cover. Notice that, given a labeling y, the collection S_k = {i ∈ S | y_i = k}, for k ∈ K, is a partition of S. On the other hand, given the segmentation S_k for k ∈ K, the image y defined by y_i = k if i ∈ S_k, for i ∈ S, is a labeling. As a result, there is a one-to-one relationship between labelings and segmentations. Without loss of generality, in this paper, we use the term classification when the spatial information in the original scene is not used in the labeling process. Similarly, we use the term segmentation when the spatial information in the original scene is used to produce such a labeling.
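    The one-to-one correspondence between labelings and segmentations described above can be made concrete with a small sketch (the function names are illustrative assumptions, and pixels are indexed 0, . . . , n-1 for convenience):

```python
def labeling_to_partition(y, K):
    """S_k = {i in S | y_i = k}: the partition of S induced by a labeling y."""
    return {k: {i for i, yi in enumerate(y) if yi == k} for k in range(K)}

def partition_to_labeling(parts, n):
    """Recover the labeling y from a partition {S_k} of S = {0, ..., n-1}."""
    y = [None] * n
    for k, Sk in parts.items():
        for i in Sk:
            y[i] = k
    return y
```

    Applying one function and then the other returns the original object, which is precisely the bijection between labelings and segmentations that the text relies on.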

    In a Bayesian framework, the labeling process is usually conducted by maximizing the posterior distribution as follows:

    p(y|x) ∝ p(x|y)p(y)    (1)

    where p(x|y) is the likelihood fu