Download - Data Analysis Methods and Applications: Hyperspectral Band ...Hyperspectral Imagery (HSI) Hyperspectral sensors generate imagery in the electromagnetic spectrum, capturing aspects


INTRODUCTION HYPERSPECTRAL BAND SELECTION CLASSIFICATION OF DATA ON GRASSMANNIANS FUTURE DIRECTIONS

Data Analysis Methods and Applications: Hyperspectral Band Selection and Data Classification on Embedded Grassmannians

Sofya Chepushtanova

Department of Mathematics, Colorado State University

February 10, 2014

SOFYA CHEPUSHTANOVA COLORADO STATE UNIVERSITY 1 OF 48


Outline

1. Introduction
   Motivation
   Sparse SVMs

2. Hyperspectral Band Selection
   Hyperspectral Imagery (HSI)
   Algorithm
   Computational Results
   Future Work

3. Classification of Data on Grassmannians
   Grassmannian Framework
   Algorithm
   Application to HSI
   Future Work

4. Future Directions


Motivation

Application-driven research

Algorithms for Threat Detection (ATD) program (launched in 2009): developing novel mathematical and statistical methods to extract meaningful information from large data streams

Big data: massive, high-dimensional, complex

Growing demand for geometric data analysis, classification, and dimension reduction models

Dimension reduction - how?
Feature extraction: transforms the data to a lower dimensional space, using manifold learning techniques
Feature selection: identifies the relevant set of features while maintaining or improving the performance of a prediction model


Support Vector Machines

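Training data $x_i \in \mathbb{R}^n$ with class labels $d_i \in \{-1, +1\}$, $i = 1, \dots, m$; $D = \mathrm{diag}(d_i)$ and $X$ is the $m \times n$ data matrix.
Separating hyperplane $P = \{x : w^T x + b = 0\}$; $w \in \mathbb{R}^n$ is normal to $P$.
Points on $w^T x + b = \pm 1$ are support vectors.
The optimal $P$ has the largest margin $2/\|w\|_2$.

SVM:

$$\min_{w, b, \xi} \ \frac{\|w\|_2^2}{2} + C e^T \xi \quad \text{s.t.} \quad D(Xw + be) + \xi \ge e, \quad \xi \ge 0.$$

Decision function: $f(x) = \mathrm{sgn}(w^T x + b)$

[Figure: Class +1 and Class -1 separated by the optimal separating hyperplane $w^T x + b = 0$, showing the margin, the planes $w^T x + b = \pm 1$, the normal vector $w$, the support vectors, and misclassified points.]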


Nonlinear SVM: Kernel Trick

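$\Phi : x \in \mathbb{R}^N \mapsto \Phi(x) \in \mathbb{R}^{N'}$, $N' > N$.
Kernel function $K_{ij} = K(x_i, x_j) = \Phi(x_i)^T \Phi(x_j)$.

[Figure: the map $\Phi$ from the input space to the feature space.]

The decision function is $f(x) = \mathrm{sgn}\left(\sum_{i=1}^m \alpha_i d_i K(x_i, x) + b\right)$.
RBF: $K(x_i, x) = \exp(-\gamma \|x_i - x\|^2)$; polynomial: $K(x_i, x) = (x_i^T x + 1)^n$.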
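The two kernels and the kernel decision function can be sketched in a few lines of NumPy. This is only an illustration: the function names, the toy points, and the values of `alpha`, `d`, and `b` are assumptions made here, not quantities from the talk.

```python
import numpy as np

def rbf_kernel(X, Z, gamma):
    """K[i, j] = exp(-gamma * ||X[i] - Z[j]||^2)."""
    sq = (X**2).sum(axis=1)[:, None] + (Z**2).sum(axis=1)[None, :] - 2.0 * X @ Z.T
    return np.exp(-gamma * np.maximum(sq, 0.0))  # clip tiny negative round-off

def poly_kernel(X, Z, degree):
    """K[i, j] = (X[i]^T Z[j] + 1)^degree."""
    return (X @ Z.T + 1.0) ** degree

def decision(alpha, d, b, K_train_test):
    """f(x) = sgn(sum_i alpha_i d_i K(x_i, x) + b), one value per test column."""
    return np.sign((alpha * d) @ K_train_test + b)

# Toy data (hypothetical): three points in R^2.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
K_rbf = rbf_kernel(X, X, gamma=0.5)
K_poly = poly_kernel(X, X, degree=2)

# Hypothetical multipliers and labels, used only to exercise the formula.
alpha = np.array([1.0, 1.0, 0.0])
d = np.array([1.0, -1.0, 1.0])
labels = decision(alpha, d, 0.0, K_rbf)
```

Only kernel values appear in the decision function, so the feature map itself never has to be formed explicitly.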


Arbitrary-Norm Separating Hyperplane

Dual norm

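For a norm $\|\cdot\|$ on $\mathbb{R}^n$, the dual norm is $\|x\|' := \max_{\|y\| = 1} x^T y$.

Example: for $p, q \in [1, \infty]$ with $1/p + 1/q = 1$, the $p$-norm and $q$-norm are dual.

Theorem (Mangasarian, 1998)

Let $q \in \mathbb{R}^n$ be any point not on the plane $P := \{x \mid w^T x + b = 0\}$, $0 \ne w \in \mathbb{R}^n$, $b \in \mathbb{R}$, and let $p(q)$ be a projection of $q$ onto $P$ in the norm $\|\cdot\|$. Then the distance between $q$ and $p(q)$ is given by:

$$\|q - p(q)\| = \frac{|w^T q + b|}{\|w\|'}.$$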
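A quick numerical check of the distance formula, for the 2-norm (which is its own dual) and for the $\infty$-norm (whose dual is the 1-norm). The closed-form projection points used below are standard constructions assumed here for illustration; they are not given on the slide, and the values of `w`, `b`, and `q` are arbitrary.

```python
import numpy as np

def distance_to_plane(q, w, b, dual_ord):
    """|w^T q + b| divided by the dual norm of w (dual_ord selects that norm)."""
    return abs(w @ q + b) / np.linalg.norm(w, ord=dual_ord)

w = np.array([3.0, -4.0])
b = 1.0
q = np.array([2.0, 1.0])
r = w @ q + b  # signed residual of q with respect to the plane

# Closest point in the 2-norm: move along w.
p2 = q - r / (w @ w) * w
# A closest point in the infinity-norm: move every coordinate by the same amount.
pinf = q - r / np.abs(w).sum() * np.sign(w)

dist_2 = np.linalg.norm(q - p2, ord=2)
dist_inf = np.linalg.norm(q - pinf, ord=np.inf)
```

Both `p2` and `pinf` lie on the plane, and the measured distances agree with `distance_to_plane(q, w, b, 2)` and `distance_to_plane(q, w, b, 1)` respectively.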


Sparse SVMs

Corollary

$$\|q - p(q)\|_\infty = \frac{|w^T q + b|}{\|w\|_1}$$

(where $\|x\|_1 = \sum_{i=1}^n |x_i|$ and $\|x\|_\infty = \max_i \{|x_i|\}$)

If the $\ell_\infty$-norm is used to measure the distance between the planes, then the margin is given by $2/\|w\|_1$, which yields the following sparse SVM (SSVM):

$$\min_{w, b, \xi} \ \|w\|_1 + C e^T \xi \quad \text{s.t.} \quad D(Xw + be) + \xi \ge e, \quad \xi \ge 0.$$


Sparse SVMs

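SSVM ⇒ LP (with $w = w^+ - w^-$, $w^+, w^- \ge 0$, and $\|w\|_1$ written as $e^T(w^+ + w^-)$):

$$\min_{w^+, w^-, b, \xi} \ e^T(w^+ + w^-) + C e^T \xi \quad \text{s.t.} \quad D(X(w^+ - w^-) + be) + \xi \ge e, \quad w^+, w^-, \xi \ge 0.$$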
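The LP form can be handed to any linear programming solver. The sketch below uses SciPy's `linprog` with the HiGHS backend, which is a stand-in chosen for illustration and not the primal-dual interior point code used in the talk. The function name `train_ssvm`, the toy data, and the value of `C` are assumptions.

```python
import numpy as np
from scipy.optimize import linprog

def train_ssvm(X, d, C=1.0):
    """Solve the SSVM linear program; variables are ordered [w+, w-, b, xi]."""
    m, n = X.shape
    DX = d[:, None] * X
    # Objective: e^T (w+ + w-) + C e^T xi (b has zero cost).
    c = np.concatenate([np.ones(2 * n), [0.0], C * np.ones(m)])
    # D(X(w+ - w-) + b e) + xi >= e, rewritten as A_ub z <= b_ub.
    A_ub = np.hstack([-DX, DX, -d[:, None], -np.eye(m)])
    b_ub = -np.ones(m)
    # w+, w-, xi are nonnegative; b is free.
    bounds = [(0, None)] * (2 * n) + [(None, None)] + [(0, None)] * m
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    z = res.x
    return z[:n] - z[n:2 * n], z[2 * n]

# Toy data (hypothetical): only the first feature separates the classes.
X = np.array([[2.0, 0.3], [3.0, -0.1], [-2.0, 0.2], [-3.0, -0.4]])
d = np.array([1.0, 1.0, -1.0, -1.0])
w, b = train_ssvm(X, d, C=10.0)
```

On this toy problem the second weight comes out as zero, which is the sparsity effect the 1-norm objective is meant to produce.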

Sparsity of the $\ell_1$-norm:

[Figure, two panels. First panel: Class −1 and Class +1 points in the $(x_1, x_2)$ plane with the 2-norm and 1-norm hyperplanes. Second panel: the feasible set, the 1-norm locus, the 2-norm locus, and the solutions to the 1-norm and 2-norm SVMs in the $(w_1, w_2)$ plane.]


Hyperspectral Imagery (HSI)

Hyperspectral sensors generate imagery in the electromagnetic spectrum, capturing aspects that are imperceptible to the human eye.

The radiance of materials is measured within each pixel area at a very large number of contiguous spectral wavelength bands.

Spatial and spectral information is contained in data cubes.

Each pixel is a vector $x \in \mathbb{R}^n$.

[Figure: a hyperspectral data cube with axes X (columns of pixels), Y (rows of pixels), and Z (bands), and a plot of spectral radiance versus band index for the classes Alfalfa, Corn-notill, Corn-min, Corn, Grass-Pasture, Grass-Trees, Grass-Pasture-Mowed, Hay-windrowed, Oats, Soybeans-notill, Soybeans-min, Soybeans-clean, Wheat, Woods, Bldg-Grass-Trees-Drives, Stone-steel Towers.]


Hyperspectral Imagery (HSI)

Advantage: rich detailed radiance information
Disadvantage: huge amount of data (more is not always better)

Band selection: identify a subset of bands that contain the most discriminatory information → use them for further analysis

Methods
1. Filters: all bands → filter → band subset → predictor
2. Wrappers: all bands → space of band subsets → predictor (wrapper) → band subset
3. Embedded algorithms: all bands → predictor → band subset


Band Selection via SSVMs (Collaborators: M. Kirby and C. Gittins)

A linear SSVM is the basic model for band selection. We solve it by the primal-dual interior point method, which allows one to monitor the variation of the primal and dual variables simultaneously.

A weight ratio criterion for embedded band selection makes it easy to distinguish the non-zero weights from the zero weights.

The bagging (Bootstrap AGGregatING) approach is employed to enhance the robustness of SSVMs.

We extend the binary band selection to the multiclass case.

The SSVM algorithm is an effective technique for embedded band selection ⇒ high accuracies in numerical experiments.


Recall: Sparse Linear SVMs

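Training data $x_i \in \mathbb{R}^n$ with class labels $d_i \in \{-1, +1\}$, $i = 1, \dots, m$; $D = \mathrm{diag}(d_i)$ and $X$ is the $m \times n$ data matrix.
Separating hyperplane $P = \{x : w^T x + b = 0\}$; $w \in \mathbb{R}^n$ is normal to $P$.
Points on $w^T x + b = \pm 1$ are support vectors.
The optimal $P$ has the largest margin $2/\|w\|_1$.

SSVM:

$$\min_{w, b, \xi} \ \|w\|_1 + C e^T \xi \quad \text{s.t.} \quad D(Xw + be) + \xi \ge e, \quad \xi \ge 0.$$

Decision function: $f(x) = \mathrm{sgn}(w^T x + b)$

[Figure: Class +1 and Class -1 separated by the optimal separating hyperplane $w^T x + b = 0$, showing the margin, the planes $w^T x + b = \pm 1$, the normal vector $w$, the support vectors, and misclassified points.]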


Sparsity in w

Comparison of weights for sparse SVM and standard SVM models using two classes of a hyperspectral data set.

[Figure: sparse SVM weights and standard SVM weights plotted against wavelength (µm), from 0.6 to 2.4 µm; the weight axis is in units of $10^{-3}$.]

Weight ratio criterion
The resulting weights of the model $w_1, w_2, \dots, w_l$ are ordered s.t.:

$$|w_{i_1}| \ge |w_{i_2}| \ge \dots \ge |w_{i_l}|.$$

The key feature of this sparse approach is that

$$\frac{|w_{i_k}|}{|w_{i_{k+1}}|} = O(1)$$

save for where the weights transition to zero:

$$\frac{|w_{i_{k^*}}|}{|w_{i_{k^*+1}}|} = O(10^M).$$
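A minimal sketch of the weight ratio criterion: sort the weights by magnitude and cut at the first ratio that jumps by more than a chosen factor. The function name and the default threshold of 10 are assumptions made for this example; the weights are the five largest Corn-min versus Woods values reported later in the talk.

```python
import numpy as np

def weight_ratio_cut(w, ratio_threshold=10.0):
    """Return indices (sorted by decreasing |w|) kept before the first large ratio drop."""
    mags_all = np.abs(np.asarray(w, dtype=float))
    order = np.argsort(-mags_all)          # i_1, i_2, ... with |w_{i_1}| >= |w_{i_2}| >= ...
    mags = mags_all[order]
    if mags.size == 0 or mags[0] == 0.0:
        return order[:0]                   # nothing to keep
    for k in range(mags.size - 1):
        # A zero next weight or a ratio above the threshold marks the transition.
        if mags[k + 1] == 0.0 or mags[k] / mags[k + 1] > ratio_threshold:
            return order[: k + 1]
    return order                           # no transition found, keep everything

bands = np.array([29, 41, 28, 42, 27])
weights = np.array([1.4249e-03, 1.3191e-03, 3.5594e-08, 1.6342e-09, 1.3258e-09])
kept_bands = bands[weight_ratio_cut(weights)]
```

Here the ratio between the second and third weights is several orders of magnitude, so only bands 29 and 41 survive.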


Bootstrap Aggregating (Breiman, 1996)

To enforce stability: sample with replacement from the $n$-dimensional training data to compute $N$ SSVM models:

band 1: $[w_1^1 \ w_1^2 \ \cdots \ w_1^N]$
band 2: $[w_2^1 \ w_2^2 \ \cdots \ w_2^N]$
...
band j: $[w_j^1 \ w_j^2 \ \cdots \ w_j^N]$
...
band n: $[w_n^1 \ w_n^2 \ \cdots \ w_n^N]$

To reduce the number of bands, we eliminate those with at least 95% of “zeros” in the samples:

$$\#\{|w_k^j| < \text{tolerance}, \ j = 1, \dots, N\} \ge 0.95 N \quad \text{for the } k\text{-th band.}$$
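The bootstrap sampling and the 95% elimination rule can be sketched as follows. The helper names, the tolerance of `1e-8`, and the toy weight matrix are assumptions for illustration; the talk does not state the tolerance value.

```python
import numpy as np

def bootstrap_indices(m, N, seed=0):
    """N index sets of size m drawn with replacement from range(m)."""
    rng = np.random.default_rng(seed)
    return rng.integers(0, m, size=(N, m))

def bands_to_keep(W, tol=1e-8, frac=0.95):
    """W has shape (n_bands, N): row k holds the weight of band k in each of N models."""
    N = W.shape[1]
    zero_counts = (np.abs(W) < tol).sum(axis=1)
    # A band is eliminated when at least frac * N of its weights are "zeros".
    return np.flatnonzero(zero_counts < frac * N)

# Toy weight matrix (hypothetical): 3 bands, N = 20 bootstrap models.
W = np.zeros((3, 20))
W[0, :] = 1e-3     # band 0: nonzero in every model
W[2, :2] = 5e-4    # band 2: nonzero in 2 of 20 models (18 zeros, below 95%)
keep = bands_to_keep(W)
idx = bootstrap_indices(m=50, N=20)
```

Band 1 is zero in all 20 models and is dropped; bands 0 and 2 remain for the next stage.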


Algorithm

Input: Set of bands $S = \{1, 2, \dots, n\}$

1. Sample with replacement from train data $X$ ⇒ $X_1, X_2, \dots, X_N$
2. Train $N$ SSVM models $f_j(x)$ ⇒ $N$ weight vectors $w^j$
3. Remove the $k$-th band if $\#\{|w_k^i| < \text{tol}, \ i = 1, \dots, N\} \ge 0.95 N$ ⇒ $S = S \setminus k$, $X_{new} = X(:, S)$
4. Train an SSVM on $X_{new}$ ⇒ $w$, rank $w$ values → $w^r$
5. In $w^r$ compare magnitude orders: if $|w^r_{i_k}| / |w^r_{i_{k+1}}| > 10$ for some $i_k = i_k^*$, remove bands starting from $i_k^* + 1$, update $S$
6. Train a final SSVM model $f$ on $X_{new} = X_{new}(:, S)$

Return: band set $S$, model $f$


Multiclass Band Selection

One-against-one (OAO) SSVMs

$k$ classes → $\binom{k}{2} = \frac{k(k-1)}{2}$ binary classifiers → majority voting to assign class to a testing point.
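One-against-one voting can be sketched as below. The pairwise classifiers here are simple one-dimensional thresholds invented for the example (not SSVMs), and breaking ties by taking the lowest class index is an assumption.

```python
from itertools import combinations
import numpy as np

def oao_predict(x, classifiers, n_classes):
    """classifiers maps a pair (a, b), a < b, to f with f(x) > 0 for class a, else class b."""
    votes = np.zeros(n_classes, dtype=int)
    for (a, b), f in classifiers.items():
        votes[a if f(x) > 0 else b] += 1
    # np.argmax returns the lowest index on ties (an assumed tie-break rule).
    return int(np.argmax(votes)), votes

# Hypothetical 1-D problem: three classes centered at 0, 5, and 10.
centers = [0.0, 5.0, 10.0]
classifiers = {
    (a, b): (lambda x, t=(centers[a] + centers[b]) / 2: 1 if x < t else -1)
    for a, b in combinations(range(3), 2)
}
label, votes = oao_predict(4.0, classifiers, n_classes=3)
```

With three classes there are 3(3−1)/2 = 3 pairwise classifiers, and the point 4.0 collects two votes for class 1.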

Methods
Method I: Rank selected bands by the frequency of their occurrence.

Method II: Rank bands in each two-class subset by magnitude and take the superset of the M top bands.

Method III: The Ward’s Linkage Strategy Using Mutual Information (WaLuMI) method (Martinez-Uso et al, 2007) is a filter method that we employ as a pre-selection step.


Spatial Smoothing

Adopted from Zare & Gader, 2008:
After a testing pixel X has been assigned a class vote via OAO SSVMs, spatial smoothing can be done by summing class votes over the eight-connected neighborhood of the pixel X.

Example neighborhoods (class labels of the eight neighbors of X):

1 1 1      1 1 1      1 1 1
1 X 1      1 X 2      2 X 2
1 1 1      2 2 2      3 3 3
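A minimal sketch of the vote summation over the eight-connected neighborhood. Whether the center pixel's own votes are counted is not stated on the slide, so it is exposed as an assumed flag; the function name, the zero padding at the image border, and the one-vote-per-pixel toy input are also assumptions.

```python
import numpy as np

def smooth_votes(votes, include_center=True):
    """votes has shape (rows, cols, n_classes); returns neighborhood vote sums."""
    rows, cols, _ = votes.shape
    padded = np.pad(votes, ((1, 1), (1, 1), (0, 0)))  # zero votes outside the image
    total = np.zeros_like(votes)
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0 and not include_center:
                continue
            total += padded[1 + dr : 1 + dr + rows, 1 + dc : 1 + dc + cols]
    return total

# First example neighborhood: all eight neighbors are class 1, X was voted class 2.
labels = np.array([[1, 1, 1],
                   [1, 2, 1],
                   [1, 1, 1]])
votes = np.eye(3, dtype=int)[labels - 1]      # one vote per pixel, classes 1..3
smoothed = smooth_votes(votes)
new_label = int(np.argmax(smoothed[1, 1])) + 1
```

The center pixel ends up with eight votes for class 1 against one for class 2, so smoothing relabels it as class 1.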


AVIRIS Indian Pines Data Set

[Figure: two 145 × 145 image panels of the Indian Pines scene with a class legend: Background, Alfalfa, Corn-notill, Corn-min, Corn, Grass-Pasture, Grass-Trees, Grass-Pasture-Mowed, Hay-windrowed, Oats, Soybeans-notill, Soybeans-min, Soybeans-clean, Wheat, Woods, Bldg-Grass-Trees-Drives, Stone-steel Towers.]

Airborne Visible/Infrared Imaging Spectrometer (AVIRIS): collected in an agricultural area of northern Indiana in 1992.

145 × 145 images, 220 spectral bands (ranging from 0.4 to 2.5 µm).

Ground truth is known for 49% of the pixels.

16 classes ranging from 20 to 2468 pixels.


Comparison with Other Methods

1. WaLuMI: hierarchical clustering approach that exploits band correlation using a mutual information (MI) criterion (Martinez-Uso et al, 2007).

2. B-SPICE: simultaneous band selection and endmember detection (Zare & Gader, 2008).

3. Lasso Logistic Regression:

$$\min_{\beta_0, \beta} \ -\frac{1}{m} \sum_{i=1}^m \left[ y_i(\beta_0 + x_i^T \beta) - \log\left(1 + e^{\beta_0 + x_i^T \beta}\right) \right] + \lambda \|\beta\|_1.$$


Binary Band Selection

Weight magnitudes:

Corn-min and Woods        Corn-notill and Grass/Trees
Band   Weight             Band   Weight
29     1.4249e-03         1      1.0202e-03
41     1.3191e-03         9      9.6991e-04
28     3.5594e-08         5      6.5283e-04
42     1.6342e-09         29     8.3022e-09
27     1.3258e-09         32     4.2466e-09

Accuracy rates (%) for binary band selection:

                                   All bands   SSVM Algorithm           WaLuMI + SSVM            Lasso Logistic Regression
Classes                            Accuracy    # Bands Kept  Accuracy   # Bands Kept  Accuracy   # Bands Kept  Accuracy
Corn-min and Woods                 100.00      2             100.00     2             99.9       12            100.00
Corn-notill and Grass/Trees        99.73       12            99.73      12            100        19            98.9
Soybeans-notill and Soybeans-min   89.58       179           89.23      -             -          127           89.52


Binary Band Selection

Spectral signatures and weights of selected bands for Corn-min & Woods, Corn-notill & Grass/Trees, and Soybeans-notill & Soybeans-min.

[Figure: three panels, one per class pair, each plotting spectral radiance and weight against band index (20 to 220).]


Multiclass Band Selection

Number of bands selected for each of the $\binom{16}{2}$ subsets (pairs of classes) and number of occurrences of each band.

[Figure, two panels. One shows the number of bands selected for each pair of classes (class number versus class number, color scale from 0 to 170). The other shows the occurrence number of each band versus band index.]


Multiclass Band Selection

Accuracy plots for OAO SSVM before and after spatial smoothing obtained by Methods I and III.

[Figure: classification accuracy (%) versus number of bands for Method I, Method I + Spatial Smoothing, Method III, and Method III + Spatial Smoothing.]


Multiclass Band Selection

Accuracy results for multiclass band selection (%) and comparison with other methods

                 Method I      Method II     Method III        Comparison
# Bands Kept     (frequency)   (top bands)   (WaLuMI + SSVM)   B-SPICE + RVM   WaLuMI + NN
220              98.36         -             98.36             93.9            -
80               97.14         -             96.89             -               -
57               95.66         97.3          96.22             -               -
34               93.15         -             93.03             86.4            80
19               91.20         -             92.57             82.5            81
10               84.37         -             93.07             -               81


Future Work

Apply the algorithm to other data sets (not necessarily HSI).

Consider using kernel SSVMs instead of linear ones: the resulting dimension reduction is not in the number of input space features but in the number of kernel functions, so it is interesting to investigate how a feature selection tool can be built into the nonlinear predictor.


References

V. N. Vapnik, The nature of statistical learning theory, New York: Springer, 1995.

L. Breiman, Bagging predictors, Machine Learning, 24, pp. 123-140, 1996.

O. L. Mangasarian, Arbitrary-norm separating plane, Operations Research Letters, 24, pp. 15-23, 1997.

J. Bi, K. P. Bennett, M. Embrechts, C. M. Breneman, and M. Song, Dimensionality reduction via sparse support vector machines, Journal of Machine Learning Research, 3, pp. 1229-1243, 2003.

O. L. Mangasarian, Exact 1-norm support vector machines via unconstrained convex differentiable minimization, Journal of Machine Learning Research, 7, pp. 1517-1530, 2006.

A. Zare and P. Gader, Hyperspectral band selection and endmember detection using sparsity promoting priors, IEEE Geoscience and Remote Sensing Letters, vol. 5, no. 2, pp. 256-260, 2008.

A. Martinez-Uso, F. Pla, J. M. Sotoca, and P. Garcia-Sevilla, Clustering-based hyperspectral band selection using information measures, IEEE Transactions on Geoscience and Remote Sensing, vol. 45, no. 12, pp. 4158-4171, 2007.

S. Chepushtanova, C. Gittins, and M. Kirby, Band Selection in Hyperspectral Imagery Using Sparse Support Vector Machines, submitted.


Classification of Data on Grassmannians (Collaborator: M. Kirby)

Set-to-set pattern recognition: a set of points from a class characterizes the variability of the class information.

Grassmann manifolds $G(k, n)$ (collections of $k$-dimensional subspaces of $\mathbb{R}^n$) provide a geometric framework for characterizing sets of points.

Subspaces can be realized as points in Euclidean space via multidimensional scaling.

Sparse support vector machine identifies optimal dimensions of embedded subspaces.


Grassmann Manifold

Definition

The Grassmann manifold $G(k, n)$ is the collection of all $k$-dimensional linear subspaces of $\mathbb{R}^n$, $1 \le k \le n$.

Example: $G(1, n)$ is the set of all lines going through the origin of $\mathbb{R}^n$ (projective space $\mathbb{RP}^{n-1}$).

Note 1: An element of $G(k, n)$ can be represented by an $n \times k$ orthogonal matrix $U$ ($U^T U = I_k$).

Note 2: The matrix representation on $G(k, n)$ is not unique: we say $U_1 = U_2$ if $\mathrm{span}(U_1) = \mathrm{span}(U_2)$.


Constructing Points on G(k, n):

1. Data points in $\mathbb{R}^n$, dimension $k$: sample data.
2. Form $n \times k$ “tall-skinny” matrices $Y_1, Y_2, \dots, Y_N$.
3. SVD: $Y_i = U_i \Sigma_i V_i^T$.
4. $U_1, U_2, \dots, U_N$ represent points on $G(k, n)$ (or take $U_j V_j^T = \arg\min_{P^T P = I} \|Y_j - P\|_F$).
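The construction can be sketched with NumPy's thin SVD. How the $k$ data points are sampled for each matrix is not specified in the talk; drawing random subsets of $k$ points without replacement is an assumption of this example, as are the function names and the random toy data.

```python
import numpy as np

def grassmann_point(Y):
    """Orthonormal basis U (n x k) of the column span of an n x k matrix Y (thin SVD)."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return U

def polar_factor(Y):
    """U V^T, the closest matrix with orthonormal columns to Y in the Frobenius norm."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return U @ Vt

def sample_points(data, k, N, seed=0):
    """data: (num_samples, n) points of one class; returns N bases of shape (n, k)."""
    rng = np.random.default_rng(seed)
    points = []
    for _ in range(N):
        idx = rng.choice(data.shape[0], size=k, replace=False)  # assumed sampling scheme
        points.append(grassmann_point(data[idx].T))
    return points

# Toy data (hypothetical): 50 points in R^6, subspaces of dimension k = 3.
data = np.random.default_rng(1).standard_normal((50, 6))
points = sample_points(data, k=3, N=10)
Y = data[:3].T
U0 = grassmann_point(Y)
Q = polar_factor(Y)
```

Both `U0` and `Q` span the same subspace as the columns of `Y`, so either one can serve as the representative of that point on $G(k, n)$.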


Geodesic distance

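[Figure: the principal angle $\theta$ between a vector $u_p \in \mathrm{span}(U_i)$ and a vector $v_p \in \mathrm{span}(U_j)$.]

Principal angles

$0 \le \theta_1 \le \theta_2 \le \dots \le \theta_k \le \pi/2$ given by

$$\cos\theta_p = \max_{u_p \in \mathrm{span}(U_i)} \ \max_{v_p \in \mathrm{span}(U_j)} u_p^T v_p,$$

where $u_p^T u_p = 1$, $v_p^T v_p = 1$, $u_p^T u_q = 0$, $v_p^T v_q = 0$, $q = 1, \dots, p - 1$.

Geodesic distance (or arclength)

$$d_G(U_i, U_j) = \|\theta\|_2 = \sqrt{\sum_{p=1}^k \theta_p^2}$$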
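In practice the principal angles are usually computed from the singular values of $U_i^T U_j$, which equal $\cos\theta_p$ when both matrices have orthonormal columns. The sketch below uses that standard fact; the function names and the two example subspaces of $\mathbb{R}^3$ are chosen here for illustration.

```python
import numpy as np

def principal_angles(U1, U2):
    """Principal angles (ascending) between span(U1) and span(U2), orthonormal columns."""
    s = np.linalg.svd(U1.T @ U2, compute_uv=False)  # cosines, in descending order
    return np.arccos(np.clip(s, -1.0, 1.0))         # clip guards against round-off

def geodesic_distance(U1, U2):
    """d_G = 2-norm of the vector of principal angles."""
    return np.linalg.norm(principal_angles(U1, U2))

# Two planes in R^3 sharing the e1 direction and tilted by pi/3 about it.
t = np.pi / 3
U1 = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
U2 = np.array([[1.0, 0.0], [0.0, np.cos(t)], [0.0, np.sin(t)]])
theta = principal_angles(U1, U2)
dist = geodesic_distance(U1, U2)
```

The shared direction gives a zero angle and the tilt gives $\pi/3$, so the geodesic distance between the two planes is $\pi/3$.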


Embedding G(k, n) in Rd via Multidimensional Scaling (MDS)

Classical MDS (Mardia):

Input: Distance matrix $D \in \mathbb{R}^{N \times N}$ with $D_{ij} = d_G(U_i, U_j)$.
1. Compute $B = HAH$, where $H = I - \frac{1}{N} e e^T$ and $A_{ij} = -\frac{1}{2} D_{ij}^2$ ($e$ is a vector of $N$ ones).
2. Compute the spectral decomposition of $B$: $B = \Gamma \Lambda \Gamma^T$.
3. Set $X := \Gamma \Lambda^{1/2}$.

Output: $X$, a configuration of points in $\mathbb{R}^d$, where $d = \mathrm{rank}(B) = \mathrm{rank}(X) \le N - 1$. (Note: $Be = 0e$.)

Note: if $B$ is positive semidefinite, the configuration preserves the geodesic distances; otherwise we adopt the resulting scaling as the best approximation we can obtain.
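The three MDS steps translate directly into NumPy. In this sketch only the eigenpairs with positive eigenvalues are kept, which is one way to read "best approximation" when $B$ is not positive semidefinite; that choice, the tolerance, and the function name are assumptions. The check uses ordinary Euclidean distances between four planar points instead of Grassmannian distances.

```python
import numpy as np

def classical_mds(D, tol=1e-10):
    """Classical MDS: returns the configuration X (rows are points) and all eigenvalues of B."""
    N = D.shape[0]
    H = np.eye(N) - np.ones((N, N)) / N        # centering matrix I - (1/N) e e^T
    B = H @ (-0.5 * D**2) @ H
    evals, evecs = np.linalg.eigh((B + B.T) / 2)  # symmetrize against round-off
    order = np.argsort(evals)[::-1]               # largest eigenvalues first
    evals, evecs = evals[order], evecs[:, order]
    keep = evals > tol                            # drop zero and negative eigenvalues
    X = evecs[:, keep] * np.sqrt(evals[keep])     # X = Gamma Lambda^(1/2)
    return X, evals

# Four corners of a 3 x 4 rectangle; their Euclidean distances are exactly embeddable.
P = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0], [3.0, 4.0]])
D = np.linalg.norm(P[:, None, :] - P[None, :, :], axis=2)
X, evals = classical_mds(D)
D_embedded = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
```

Because these distances come from points in the plane, $B$ is positive semidefinite with two positive eigenvalues, and the recovered configuration reproduces `D` up to rotation and translation.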


Algorithm

1. Compute points and geodesic distance matrix on $G(k, n)$
2. Embed subspaces in Euclidean space (MDS) preserving distances
3. Feature (dimension) selection and classification via SSVMs


Application to HSI

[Figure: two 145 × 145 image panels of the Indian Pines scene with a class legend: Background, Alfalfa, Corn-notill, Corn-min, Corn, Grass-Pasture, Grass-Trees, Grass-Pasture-Mowed, Hay-windrowed, Oats, Soybeans-notill, Soybeans-min, Soybeans-clean, Wheat, Woods, Bldg-Grass-Trees-Drives, Stone-steel Towers.]

Airborne Visible/Infrared Imaging Spectrometer (AVIRIS): collected in an agricultural area of northern Indiana in 1992.

145 × 145 images, 220 spectral bands (ranging from 0.4 to 2.5 µm).

Ground truth is known for 49% of the pixels.

16 classes ranging from 20 to 2468 pixels.


Configurations in Euclidean Space

2 Classes: Corn-notill (blue) and Grass/Pasture (red).
Dimensions correspond to the two top eigenvalues of B (MDS).
Solid dots - training set, hollow dots - testing set

[Figure: six scatter-plot panels of the embedded points, for k = 1, 2, 3, 5, 10, 15.]


Configurations in Euclidean Space

3 Classes: Corn-notill (blue), Grass/Pasture (red) and Grass/Trees (green).
Dimensions correspond to the two top eigenvalues of B (MDS).
Solid dots - training set, hollow dots - testing set

[Figure: six scatter-plot panels of the embedded points, for k = 1, 2, 3, 5, 10, 15.]


Classification

SSVM applied to the configuration of points on G(15, 220) embedded in Euclidean space:

Corn-notill (blue) and Grass/Pasture (red)

Dimensions correspond to the two largest absolute values of the sparse weight vector w

[Figure: scatter plot of the embedded points in the two selected dimensions.]


Classification

Accuracy as a function of k:

[Figure: two panels plotting training-set and testing-set accuracy (0.5 to 1) against the dimension of subspaces k (2 to 20), one for Corn-notill and Grass/Pasture and one for Corn-notill, Grass/Pasture, and Grass/Trees.]


Feature Selection in Embedded Spaces

Corn-notill versus Grass/Pasture: N = 200 constructed points on G(k, 220)

Columns: k = dimension of subspaces; d = dimension of feature space of embedded points; neg = number of negative eigenvalues of B; zero = number of zero eigenvalues of B.

k    d     neg   zero   Features selected                                  Number of features selected
1    131   68    1      1-3,5-7,10                                         7
2    156   43    1      1-6,8,11                                           8
3    126   73    1      1-6,10-13,16-18,20,23,43,39,47,62,74               20
5    147   52    1      1,3,6,9,14,15,18,19,34,37,39,42,52,63              14
10   195   4     1      1,4,5,8,15,28,38,65,71                             9
20   199   0     1      1,3,24,31,63                                       5
25   199   0     1      1,2,8,14                                           4


Future Work

Use other distances provided by the principal angles, for instance, projection F-norm, $\|\sin\theta\|_2$.

Compare results of the multiclass case with the literature.

Determine computationally the optimal number of constructed points on $G(k, n)$ for training and testing.

Apply the method to other HSI and medical data sets.


References

J.-M. Chang, et al., Recognition of digital images of the human face at ultra low resolution via illumination spaces, Proceedings of the 8th Asian Conference on Computer Vision - Volume Part II, pp. 733-743, 2007.

A. Edelman, T. A. Arias, and S. T. Smith, The geometry of algorithms with orthogonality constraints, SIAM J. Matrix Anal. Appl., 20(2), pp. 303-353, 1998.

K. V. Mardia, J. T. Kent, and J. M. Bibby, Multivariate Analysis, Academic Press, 1979.

V. N. Vapnik, The nature of statistical learning theory, New York: Springer, 1995.

O. L. Mangasarian, Exact 1-norm support vector machines via unconstrained convex differentiable minimization, Journal of Machine Learning Research, 7, pp. 1517-1530, 2006.


Ellipsoidal Separation: Motivation

Important application: medical diagnosis. In particular, we are interested in diagnosis of neonatal sepsis, for the data set collected in Yale-New Haven Hospital’s Neonatal Intensive Care Unit (NICU).

We expect points from the same class to be close to each other, i.e., to be enclosed in a hull or ball. Ellipsoids, being affine deformations of balls, make the separation procedure scaling invariant.

Ellipsoids are simple convex sets.

Ellipsoidal separation can be modelled as a semidefinite program (SDP), which can be solved efficiently.


Ellipsoids: Facts

An ellipsoid is the image of the unit ball $\{x : x^T x \le 1\}$ under an affine transformation.

Given a center $c$ and an $n \times n$ symmetric positive definite matrix $E$ ($E \succ 0$), we can define an ellipsoid as $\{x \in \mathbb{R}^n \mid (x - c)^T E (x - c) \le 1\}$.

The condition $E \succ 0$ is crucial for ellipsoids: if it is not satisfied, the inequality above may describe any quadratic set.
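The two descriptions of an ellipsoid can be checked against each other numerically. In the sketch below, the matrix A and the center c are hypothetical, and A is assumed invertible. Under that assumption the image of the unit ball under x = A u + c is the set defined by E = (A A^T)^{-1}.

```python
import numpy as np

# Hypothetical invertible map (upper triangular, nonzero diagonal) and center.
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.5, 0.5],
              [0.0, 0.0, 1.0]])
c = np.array([1.0, -2.0, 0.5])

# Matrix of the quadratic form describing the image of the unit ball.
E = np.linalg.inv(A @ A.T)

# Sample points u with ||u|| <= 1 (points outside the ball are pulled onto the sphere).
rng = np.random.default_rng(2)
U = rng.standard_normal((500, 3))
U /= np.maximum(1.0, np.linalg.norm(U, axis=1, keepdims=True))

# Map the ball through the affine transformation x = A u + c.
X = U @ A.T + c

# Evaluate (x - c)^T E (x - c) for every mapped point.
vals = np.einsum("ij,jk,ik->i", X - c, E, X - c)

print("largest value of the quadratic form:", vals.max())
print("smallest eigenvalue of E:", np.linalg.eigvalsh(E).min())
```

Every value of the quadratic form stays at or below 1 up to rounding, and all eigenvalues of E are positive, as required for a bounded ellipsoid.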


Feasibility Problem (Boyd and Vandenberghe 2004)

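\[
\begin{array}{ll}
\text{find} & P,\ q,\ r \\
\text{s.t.} & x_i^T P x_i + q^T x_i + r \ge 1, \quad i = 1, \dots, N, \\
            & y_i^T P y_i + q^T y_i + r \le -1, \quad i = 1, \dots, M, \\
            & P \prec 0.
\end{array}
\]

Note: the constraint $P \prec 0$ can be expressed as $P \preceq -I$ (due to homogeneity of $f(x) = x^T P x + q^T x + r$ in $P, q, r$).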
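The homogeneity argument in the note can be made concrete with a small sketch. Multiplying a strictly separating triple (P, q, r) by a large enough positive scalar yields the normalized constraints with right-hand sides 1 and -1, and P no larger than -I. The toy data and the starting triple below are hypothetical.

```python
import numpy as np

def f(P, q, r, Z):
    """Evaluate z^T P z + q^T z + r for every row z of Z."""
    return np.einsum("ij,jk,ik->i", Z, P, Z) + Z @ q + r

# Hypothetical data: x-points inside the square [-1, 1]^2, y-points on a circle of radius 3.
rng = np.random.default_rng(3)
X = rng.uniform(-1.0, 1.0, size=(30, 2))
angles = np.linspace(0.0, 2.0 * np.pi, 30, endpoint=False)
Y = 3.0 * np.column_stack([np.cos(angles), np.sin(angles)])

# A strictly separating triple with a small scale: f > 0 on X, f < 0 on Y, P negative definite.
P0, q0, r0 = -0.01 * np.eye(2), np.zeros(2), 0.04

# Smallest margin among the three strict conditions.
margin = min(f(P0, q0, r0, X).min(),
             (-f(P0, q0, r0, Y)).min(),
             -np.linalg.eigvalsh(P0).max())
if margin <= 0:
    raise ValueError("starting triple does not strictly separate the data")

# Rescale: f is linear in (P, q, r), so every margin is multiplied by t.
t = 1.0 / margin
P1, q1, r1 = t * P0, t * q0, t * r0

print("min f on X:", f(P1, q1, r1, X).min())
print("max f on Y:", f(P1, q1, r1, Y).max())
print("largest eigenvalue of P:", np.linalg.eigvalsh(P1).max())
```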


Non-separable Data

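\[
\begin{array}{ll}
\underset{P,\,q,\,r}{\text{minimize}} & e^T \xi + e^T \tau \\
\text{subject to} & x_i^T P x_i + q^T x_i + r \ge 1 - \xi_i, \quad i = 1, \dots, N, \\
                  & y_i^T P y_i + q^T y_i + r \le -1 + \tau_i, \quad i = 1, \dots, M, \\
                  & P \preceq -I, \quad \xi, \tau \ge 0.
\end{array}
\]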
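For a fixed feasible triple (P, q, r), the constraints force each slack to be at least a hinge-type quantity, so the smallest objective value for that triple can be evaluated directly. The sketch below does only this evaluation and does not solve the semidefinite program itself, which would need an SDP solver such as CVX. The function names, the toy points, and the triple are hypothetical.

```python
import numpy as np

def quad(P, q, r, Z):
    """Evaluate z^T P z + q^T z + r for every row z of Z."""
    return np.einsum("ij,jk,ik->i", Z, P, Z) + Z @ q + r

def slack_objective(P, q, r, X, Y):
    """Smallest e^T xi + e^T tau attainable for a fixed (P, q, r) with P <= -I."""
    if np.linalg.eigvalsh(P).max() > -1 + 1e-9:
        raise ValueError("P must satisfy P <= -I")
    xi = np.maximum(0.0, 1.0 - quad(P, q, r, X))    # violations on the x-points
    tau = np.maximum(0.0, 1.0 + quad(P, q, r, Y))   # violations on the y-points
    return xi.sum() + tau.sum(), xi, tau

# Hypothetical toy data that is not separable by the chosen ellipsoid.
X = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
Y = np.array([[3.0, 0.0], [1.0, 1.0], [0.0, 0.5]])

# Fixed triple: f(z) = 2 - ||z||^2.
P, q, r = -np.eye(2), np.zeros(2), 2.0

total, xi, tau = slack_objective(P, q, r, X, Y)
print("xi  =", xi)      # slack needed by each x-point
print("tau =", tau)     # slack needed by each y-point
print("objective =", total)
```

Here only the x-point (2, 0) and the y-points (1, 1) and (0, 0.5) need nonzero slack.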


References

S. Boyd and L. Vandenberghe, Convex Optimization, Cambridge University Press, New York, NY, USA, 2004.

M. Grant and S. Boyd, Graph implementations for nonsmooth convex programs, in V. Blondel, S. Boyd, and H. Kimura, editors, Recent Advances in Learning and Control, Lecture Notes in Control and Information Sciences, pp. 95-110, Springer-Verlag Limited, 2008, http://stanford.edu/~boyd/graph_dcp.html.

M. Grant and S. Boyd, CVX: Matlab software for disciplined convex programming, version 2.0 beta, http://cvxr.com/cvx, September 2013.


Topological Data Analysis (TDA)

Basic idea: to describe the “shape of the data” by finding clusters, holes, tunnels, etc.

Persistent homology (PH): a rapidly growing branch of TDA.

PH can be applied to a data set to capture the persistence of topological structure across scales.

Application of PH to hyperspectral remote sensing data analysis. See, e.g., Afra Zomorodian and Gunnar Carlsson, Computing persistent homology, Discrete Comput. Geom. 33 (2005), no. 2, 249–274.


Persistent Homology

Encoded in the form of a parameterized version of a Betti number, called a barcode: a set of line segments, each representing the range of parameter values over which a topological feature persists.
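As a small illustration of a barcode, the sketch below computes only the 0-dimensional bars (connected components) of a point cloud. It assumes a Vietoris-Rips style filtration in which two points are joined once the scale parameter reaches their distance. Every component is born at scale 0, and a bar ends when its component merges into another one. The function name and the sample points are hypothetical, and higher-dimensional features such as holes and tunnels are not handled.

```python
import math
from itertools import combinations

def h0_barcode(points):
    """Return the 0-dimensional bars (birth, death) for a list of points."""
    n = len(points)
    parent = list(range(n))            # union-find forest over the points

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    # All pairwise edges, processed in order of increasing length.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )

    bars = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:                   # two components merge: one bar ends here
            parent[ri] = rj
            bars.append((0.0, length))
    bars.append((0.0, math.inf))       # the last component never dies
    return bars

points = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (7.0, 0.0)]
bars = h0_barcode(points)
for birth, death in bars:
    print(birth, death)
```

For these four collinear points the finite bars end at 1, 2, and 4, the gaps between neighbouring points, and one bar is infinite.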


THANK YOU FOR YOUR ATTENTION!
