Post on 25-Dec-2015
1
Information Extraction Principles for Hyperspectral Data
David Landgrebe
Professor of Electrical & Computer Engineering
Purdue University
landgreb@ecn.purdue.edu
Outline
• A Historical Perspective
• Data and Analysis Factors
• Hyperspectral Data Characteristics
• Examples
• Summary of Key Factors
2
Brief History
REMOTE SENSING OF THE EARTH
Atmosphere - Oceans - Land
1957 - Sputnik
1958 - National Space Act - NASA formed
1960 - TIROS I
1960 - 1980 Some 40 Earth Observational Satellites Flown
3
Image Pixels
Enlarged 10 Times
Thematic Mapper Image
4
Three Generations of Sensors
[Figure: three panels.
MSS, 1968 - 6-bit data. Relative Response (0 to 50) vs. Band No. (1 to 4); curves for Green Veg. and Bare Soil.
TM, 1975 - 8-bit data. Relative Response (0 to 200) vs. Band No. (1 to 7); curves for Green Veg. and Bare Soil.
Hyperspectral, 1986 - 10-bit data. Relative Radiance Response (0 to 2500) vs. Wavelength (0.4 to 2.4 µm); curves for Water, Emerging Crop, Trees, and Soil.]
5
Systems View
[Diagram blocks: Sensor; On-Board Processing; Preprocessing; Data Analysis; Information Utilization; Human Participation with Ancillary Data; Ephemeris, Calibration, etc.]
6
Scene Effects on Pixel
7
Data Representations
[Figure: three panels.
Image Space.
Spectral Space - response (0 to 3000) vs. Wavelength (0.40 to 2.40 µm); curves for Water, Trees, and Soil.
Feature Space - scatter of Water, Trees, and Soil samples; one axis labeled Band 0.60.]
• Image Space - Geographic Orientation
• Feature Space - For Use in Pattern Analysis
• Spectral Space - Relate to Physical Basis for Response
8
Data Classes
9
SCATTER PLOT FOR TYPICAL DATA
[Figure: BiPlot of Channels 4 vs 3. Axis label: Channel 3. Axis ticks: 30 to 210 and 17 to 102.]
10
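BHATTACHARYYA DISTANCE
$$B = \underbrace{\frac{1}{8}(\mu_1-\mu_2)^T\left[\frac{\Sigma_1+\Sigma_2}{2}\right]^{-1}(\mu_1-\mu_2)}_{\text{Mean Difference Term}} + \underbrace{\frac{1}{2}\ln\frac{\left|\frac{1}{2}(\Sigma_1+\Sigma_2)\right|}{\sqrt{|\Sigma_1|\,|\Sigma_2|}}}_{\text{Covariance Term}}$$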
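As a rough illustration of how the two terms behave, the following sketch computes the Bhattacharyya distance between two Gaussian classes with NumPy. The function name and the two-band class statistics are hypothetical, chosen only for the example.

```python
import numpy as np

def bhattacharyya_distance(mu1, cov1, mu2, cov2):
    """Return (mean_difference_term, covariance_term); their sum is B."""
    mu1, mu2 = np.asarray(mu1, float), np.asarray(mu2, float)
    cov1, cov2 = np.asarray(cov1, float), np.asarray(cov2, float)
    avg_cov = 0.5 * (cov1 + cov2)
    diff = mu1 - mu2
    # (1/8) (mu1 - mu2)^T [(S1 + S2)/2]^-1 (mu1 - mu2)
    mean_term = 0.125 * diff @ np.linalg.solve(avg_cov, diff)
    # (1/2) ln( |(S1 + S2)/2| / sqrt(|S1| |S2|) ), via log-determinants
    _, logdet_avg = np.linalg.slogdet(avg_cov)
    _, logdet1 = np.linalg.slogdet(cov1)
    _, logdet2 = np.linalg.slogdet(cov2)
    cov_term = 0.5 * (logdet_avg - 0.5 * (logdet1 + logdet2))
    return mean_term, cov_term

# Hypothetical class statistics: same covariance, different means.
cov_a = [[4.0, 1.0], [1.0, 3.0]]
mean_term, cov_term = bhattacharyya_distance([10.0, 20.0], cov_a,
                                             [12.0, 23.0], cov_a)
print(mean_term, cov_term)

# Hypothetical class statistics: same mean, different covariances.
mean_term2, cov_term2 = bhattacharyya_distance([10.0, 20.0], cov_a,
                                               [10.0, 20.0], np.eye(2))
print(mean_term2, cov_term2)
```

In the first case only the mean difference term is nonzero. In the second, the classes share a mean and are separable only through the covariance term.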
11
Vegetation in Spectral Space
Laboratory Data: Two classes of vegetation
12
Scatter Plots of Reflectance
[Figure: Scatter of 2-Class Data. Reflectance - % (10 to 24) vs. Wavelength - µm (0.65 to 0.72); samples shown for Class 1 and Class 2 at 0.67 µm and at 0.69 µm.]
13
Vegetation in Feature Space
[Figure: Samples from Two Classes. % Reflectance at 0.69 µm (16 to 23) vs. % Reflectance at 0.67 µm (10 to 15); samples for Class 1 and Class 2.]
14
Hughes Effect
[Figure: MEAN RECOGNITION ACCURACY (0.50 to 0.75) vs. MEASUREMENT COMPLEXITY n (Total Discrete Values), n from 1 to 1000, with one curve for each of m = 2, 5, 10, 20, 50, 100, 200, 500, 1000.]
G.F. Hughes, "On the mean accuracy of statistical pattern recognizers," IEEE Trans. Inform. Theory, Vol. IT-14, pp. 55-63, 1968.
15
A Simple Measurement Complexity Example
16
Classifiers of Varying Complexity
• Quadratic Form
$$g_i(X) = -\frac{1}{2}(X-\mu_i)^T \Sigma_i^{-1}(X-\mu_i) - \frac{1}{2}\ln|\Sigma_i|$$
• Fisher Linear Discriminant - Common class covariance
$$g_i(X) = -\frac{1}{2}(X-\mu_i)^T \Sigma^{-1}(X-\mu_i)$$
• Minimum Distance to Means - Ignores second moment
$$g_i(X) = -\frac{1}{2}(X-\mu_i)^T (X-\mu_i)$$
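The three discriminant functions on this slide can be sketched in a few lines of NumPy. This is a minimal sketch, not a reference implementation: the function names, the sample pixel, the class statistics, and the use of a simple average as the common covariance are all assumptions made for illustration. A pixel is assigned to the class with the largest g_i(X).

```python
import numpy as np

def quadratic_g(x, mu, cov):
    # -1/2 (x - mu)^T cov^-1 (x - mu) - 1/2 ln|cov|
    d = x - mu
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * d @ np.linalg.solve(cov, d) - 0.5 * logdet

def fisher_g(x, mu, common_cov):
    # Same form, but one covariance shared by all classes.
    d = x - mu
    return -0.5 * d @ np.linalg.solve(common_cov, d)

def min_distance_g(x, mu):
    # Ignores second moments entirely.
    d = x - mu
    return -0.5 * d @ d

# Hypothetical two-class, two-band example.
x = np.array([1.6, 0.0])
mus = [np.array([0.0, 0.0]), np.array([3.0, 0.0])]
covs = [np.eye(2), 25.0 * np.eye(2)]
common_cov = 0.5 * (covs[0] + covs[1])  # assumption: simple average

labels = {
    "quadratic": int(np.argmax([quadratic_g(x, m, c) for m, c in zip(mus, covs)])),
    "fisher": int(np.argmax([fisher_g(x, m, common_cov) for m in mus])),
    "min_distance": int(np.argmax([min_distance_g(x, m) for m in mus])),
}
print(labels)
```

With these made-up statistics the quadratic form picks the first class while the two simpler classifiers pick the second, which shows how the amount of second-moment information used can change the decision.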
17
Classifier Complexity - con’t
• Correlation Classifier
$$g_i(X) = \frac{X^T \mu_i}{\sqrt{X^T X \; \mu_i^T \mu_i}}$$
• Spectral Angle Mapper
$$g_i(X) = \cos^{-1}\left(\frac{X^T \mu_i}{\sqrt{X^T X \; \mu_i^T \mu_i}}\right)$$
• Matched Filter - Constrained Energy Minimization
$$g_i(X) = \frac{X^T C_b^{-1} \mu_i}{\mu_i^T C_b^{-1} \mu_i}$$
• Other types - “Nonparametric”
Parzen Window Estimators
Fuzzy Set - based
Neural Network implementations
K Nearest Neighbor - K-NN
etc.
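The correlation, spectral angle, and matched filter measures can be sketched as follows. Function names and the three-band spectra are hypothetical, and C_b is taken here to be a background matrix supplied by the analyst, which is an assumption about the slide's notation.

```python
import numpy as np

def correlation_g(x, mu):
    # Normalized inner product between pixel x and class mean mu.
    return (x @ mu) / np.sqrt((x @ x) * (mu @ mu))

def spectral_angle_g(x, mu):
    # Angle (radians) between x and mu; clip guards against rounding.
    return np.arccos(np.clip(correlation_g(x, mu), -1.0, 1.0))

def matched_filter_g(x, mu, c_b):
    # x^T C_b^-1 mu / (mu^T C_b^-1 mu); equals 1 when x == mu.
    w = np.linalg.solve(c_b, mu)
    return (x @ w) / (mu @ w)

# Hypothetical class mean, and the same spectrum under doubled illumination.
mu = np.array([0.2, 0.4, 0.6])
x = 2.0 * mu
c_b = np.array([[1.0, 0.2, 0.1],
                [0.2, 1.0, 0.2],
                [0.1, 0.2, 1.0]])  # made-up background matrix

print(correlation_g(x, mu), spectral_angle_g(x, mu), matched_filter_g(x, mu, c_b))
```

The correlation and angle measures ignore the overall brightness of the pixel, whereas the matched filter output scales with it.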
18
Covariance Coefficients to be Estimated
• Assume a 5 class problem in 6 dimensions
• Normal maximum likelihood - estimate coefficients a and b
• Ignore correlation between bands - estimate coefficients b
• Assume common covariance - estimate coefficients c and d
• Assume common covariance and ignore correlation between bands - estimate coefficients d

Class 1, Class 2, Class 3, Class 4, Class 5 (one matrix per class):
b
a b
a a b
a a a b
a a a a b
a a a a a b

Common Covar.:
d
c d
c c d
c c c d
c c c c d
c c c c c d
19
EXAMPLE SOURCES OF CLASSIFICATION ERROR
[Figure: samples of class 1 and class 2, showing the decision boundary defined by the diagonal covariance classifier and the decision boundary defined by the Gaussian ML classifier.]
20
Number of Coefficients to be Estimated
• Assume 5 classes and p features

No. of        Class Covar.      Diagonal Class      Common Covar.     Diagonal Common
Features p    (a & b above)     Covar. (b above)    (c & d above)     Covar. (d above)
              5{(p+1)p/2}       5p                  {(p+1)p/2}        p
5             75                25                  15                5
10            275               50                  55                10
20            1050              100                 210               20
50            6375              250                 1275              50
200           100,500           1000                20,100            200
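The entries in this table follow directly from the four formulas. A short sketch that reproduces them (the function and key names are hypothetical):

```python
def coefficient_counts(p, n_classes=5):
    """Number of covariance coefficients to estimate for p features."""
    full = p * (p + 1) // 2  # distinct entries of one symmetric p x p matrix
    return {
        "class_covar": n_classes * full,        # a & b
        "diagonal_class_covar": n_classes * p,  # b only
        "common_covar": full,                   # c & d
        "diagonal_common_covar": p,             # d only
    }

for p in (5, 10, 20, 50, 200):
    print(p, coefficient_counts(p))
```

The full class covariance count grows as the square of p, which is why the number of training samples needed rises so quickly with dimensionality.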
21
Intuition and Higher Dimensional Space
Borsuk’s Conjecture: If you break a stick in two, both pieces are shorter than the original.
Keller’s Conjecture: It is possible to use cubes (hypercubes) of equal size to fill an n-dimensional space, leaving no overlaps nor underlaps.
Counter-examples to both have been found for higher dimensional spaces.
Science, Vol. 259, 1 Jan 1993, pp. 26-27
22
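The Geometry of High Dimensional Space
The Volume of a Hypercube concentrates in the corners
$$\frac{V_{\text{hypersphere}}}{V_{\text{hypercube}}} = \frac{\pi^{d/2}}{d\,2^{d-1}\,\Gamma(d/2)} \;\longrightarrow\; 0 \quad \text{as } d \to \infty$$
[Figure: this ratio (0 to 1) vs. dimension d, for d = 1 to 7.]
The Volume of a Hypersphere concentrates in the outer shell
$$\frac{V_d(r) - V_d(r-\epsilon)}{V_d(r)} = \frac{r^d - (r-\epsilon)^d}{r^d} = 1 - \left(1 - \frac{\epsilon}{r}\right)^d \;\longrightarrow\; 1 \quad \text{as } d \to \infty$$
[Figure: this fraction (0 to 1) vs. dimension d, for d = 1 to 11.]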
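Both limits are easy to check numerically. The sketch below evaluates the sphere-to-cube volume ratio (a hypersphere of radius r inside a hypercube of side 2r) and the fraction of hypersphere volume lying in an outer shell of relative thickness ε/r. The function names and the shell thickness of 0.2 are assumptions for illustration.

```python
import math

def sphere_to_cube_ratio(d):
    # pi^(d/2) / (d * 2^(d-1) * Gamma(d/2)); independent of r
    return math.pi ** (d / 2) / (d * 2 ** (d - 1) * math.gamma(d / 2))

def outer_shell_fraction(d, eps_over_r):
    # 1 - (1 - eps/r)^d
    return 1.0 - (1.0 - eps_over_r) ** d

for d in (1, 2, 3, 5, 7, 10, 20):
    print(d,
          round(sphere_to_cube_ratio(d), 6),
          round(outer_shell_fraction(d, 0.2), 6))
```

The first column of results falls toward 0 and the second rises toward 1 as d grows.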
23
Some Implications
High dimensional space is mostly empty. Data in high dimensional space is mostly in a lower dimensional structure.
Normally distributed data will have a tendency to concentrate in the tails; Uniformly distributed data will concentrate in the corners.
24
How can that be?
Volume of a hypersphere = $\dfrac{2 r^d}{d}\,\dfrac{\pi^{d/2}}{\Gamma(d/2)}$
Differential Volume at r = $\dfrac{dV}{dr} = \dfrac{2\pi^{d/2}}{\Gamma(d/2)}\, r^{d-1}$
[Figure: Volume of shell (Surface of Hypersphere), 0 to 80, vs. Distance from Class Mean, r, 0 to 5; curves labeled 1, 2, 3, 4, 5.]
25
How can that be? (continued)
The Probability Mass at r = $\dfrac{r^{d-1}\, e^{-r^2/2}}{2^{d/2-1}\,\Gamma(d/2)}$
[Figure: Probability mass in shell (Probability Density of Distance r), 0 to 0.8, vs. Distance from Class Mean, r, 0 to 5; curves labeled 1, 2, 3, 4, 5, 10, 15, 20.]
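A small numerical sketch of this radial density, for unit-variance normal data in d dimensions, shows the probability mass moving away from the class mean as d increases. The function names and the grid search for the peak are illustrative assumptions.

```python
import math

def radial_density(r, d):
    # r^(d-1) exp(-r^2 / 2) / (2^(d/2 - 1) Gamma(d/2))
    return r ** (d - 1) * math.exp(-r * r / 2) / (2 ** (d / 2 - 1) * math.gamma(d / 2))

def peak_radius(d, step=0.001, r_max=10.0):
    # Coarse grid search for the radius where the density is largest.
    grid = [i * step for i in range(1, int(r_max / step))]
    return max(grid, key=lambda r: radial_density(r, d))

for d in (2, 5, 10, 20):
    print(d, round(peak_radius(d), 3), round(math.sqrt(d - 1), 3))
```

The peak sits near sqrt(d - 1), so for d = 20 most samples lie more than four standard deviations from the mean even though the density of the normal itself is highest at the mean.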
26
MORE ON GEOMETRY
• The diagonals in high dimensional spaces become nearly orthogonal to all coordinate axes
$$\cos\theta_d = \pm\frac{1}{\sqrt{d}}$$
Implication: The projection of any cluster onto any diagonal, e.g., by averaging features, could destroy information
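To get a feel for how quickly the diagonal turns away from the axes, the angle can be evaluated directly from cos θ = 1/√d (taking the positive sign). The function name is hypothetical.

```python
import math

def diagonal_axis_angle_deg(d):
    # Angle between the main diagonal (1, 1, ..., 1) and any coordinate axis.
    return math.degrees(math.acos(1.0 / math.sqrt(d)))

for d in (2, 3, 10, 100, 200):
    print(d, round(diagonal_axis_angle_deg(d), 2))
```

At d = 2 the angle is 45 degrees. By d = 200 it is within about 4 degrees of a right angle.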
27
STILL MORE GEOMETRY
• The number of labeled samples needed for supervised classification increases rapidly with dimensionality
In a specific instance, it has been shown that the number of samples required increases linearly with dimensionality for a linear classifier and as the square of the dimensionality for a quadratic classifier. It has been estimated that the number increases exponentially for a non-parametric classifier.
• For most high dimensional data sets, lower dimensional linear projections tend to be normal or a combination of normals.
28
A HYPERSPECTRAL DATA ANALYSIS SCHEME
[Diagram blocks: 200 Dimensional Data; Class Conditional Feature Extraction; Feature Selection; Classifier/Analyzer; Class-Specific Information.]
29
Finding Optimal Feature Subspaces
• Feature Selection (FS)
• Discriminant Analysis Feature Extraction (DAFE)
• Decision Boundary Feature Extraction (DBFE)
• Projection Pursuit (PP)
Available in MultiSpec via WWW at: http://dynamo.ecn.purdue.edu/~biehl/MultiSpec/
Additional documentation via WWW at: http://dynamo.ecn.purdue.edu/~landgreb/publications.html
30
Hyperspectral Image of DC Mall
HYDICE Airborne System
1208 Scan Lines, 307 Pixels/Scan Line
210 Spectral Bands in 0.4-2.4 µm Region
155 Megabytes of Data
(Not yet Geometrically Corrected)
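The quoted data volume is consistent with the image dimensions if each sample is stored in 2 bytes. The storage size is an assumption, since the slide does not state it.

```python
scan_lines, pixels_per_line, bands = 1208, 307, 210
bytes_per_sample = 2  # assumption: 16-bit storage per sample

total_bytes = scan_lines * pixels_per_line * bands * bytes_per_sample
megabytes = total_bytes / 1e6  # decimal megabytes
print(total_bytes, round(megabytes, 1))
```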
31
Define Desired Classes
Training areas designated by polygons outlined in white
32
Thematic Map of DC Mall
Legend: Roofs, Streets, Grass, Trees, Paths, Water, Shadows

Operation                    CPU Time (sec.)       Analyst Time
Display Image                18
Define Classes                                     < 20 min.
Feature Extraction           12
Reformat                     67
Initial Classification       34
Inspect and Mod. Training                          ≈ 5 min.
Final Classification         33
Total                        164 sec = 2.7 min.    ≈ 25 min.

(No preprocessing involved)
33
Hyperspectral Potential - Simply Stated
• Assume 10 bit data in a 100 dimensional space.
• That is $(1024)^{100} \approx 10^{300}$ discrete locations
Even for a data set of $10^6$ pixels, the probability of any two pixels lying in the same discrete location is vanishingly small.
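The scale of these numbers can be checked with logarithms. The sketch below counts the discrete locations and bounds the chance that any two of 10^6 pixels coincide, under the simplifying (and unrealistic) assumption that pixels fall uniformly at random over all locations. Real data are far from uniform, so this shows only the order of magnitude.

```python
import math

bits, dims, n_pixels = 10, 100, 10**6

# log10 of (2^bits)^dims, the number of discrete locations
log10_locations = dims * bits * math.log10(2)

# Union bound: P(any two pixels coincide) <= (number of pairs) / (locations)
pairs = n_pixels * (n_pixels - 1) // 2
log10_collision_bound = math.log10(pairs) - log10_locations

print(round(log10_locations, 2), round(log10_collision_bound, 2))
```

The exact count of locations is slightly above 10^301, in line with the order-of-magnitude figure quoted on the slide.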
34
Summary - Limiting Factors
[Diagram blocks: Sensor; On-Board Processing; Preprocessing; Data Analysis; Information Utilization; Human Participation with Ancillary Data; Ephemeris, Calibration, etc.]
• Scene - The most complex and dynamic part
• Sensor - Also not under analyst’s control
• Processing System - Analyst’s choices
35
Limiting Factors
Scene - Varies from hour to hour and sq. km to sq. km
Sensor - Spatial Resolution, Spectral bands, S/N
Processing System -
• Classes to be labeled
- Exhaustive,
- Separable,
- Informational Value
• Number of samples to define the classes
• Complexity of the Classifier
• Features to be used
36
Source of Ancillary Input
Possibilities
• Ground Observations
• “Imaging Spectroscopy”
- From the Ground
- Of the Ground
• Previously Gathered Spectra
• “End Members”
Image Space
Spectral Space
Feature Space
37
Use of Ancillary Input
A Key Point:
• Ancillary input is used to label training samples.
• Training samples are then used to compute class quantitative descriptions
Result:
• This reduces or eliminates the need for many types of preprocessing by normalizing out the difference between class descriptions and the data