Adaptive Algorithms for Optimal Classification and Compression of Hyperspectral Images
Tamal Bose* and Erzsébet Merényi#
*Wireless@VT, Bradley Dept. of Electrical and Computer Engineering, Virginia Tech
#Electrical and Computer Engineering, Rice University
Outline
• Motivation
• Signal Processing System
• Adaptive Differential Pulse Code Modulation (ADPCM) Scheme
• Transform Scheme
• Results
• Conclusion
Status Quo
• Raw data (limited onboard processing)
• Unreliable links
• Unacceptable latencies
• Delay in science and discovery
• Restricts deep space missions
• High stress, reduced productivity
(Diagram: Mission Control ↔ Mission Scientists)
High-Speed Real-Time On-Board Signal Processing: Impact on Science
State-of-the-art signal processing algorithms to:
• enable onboard science
• detect events and take necessary action; e.g., collecting and processing data upon detecting dust storms on Mars
• process and filter science data with "machine intelligence"; e.g., data compression with signal classification metrics, so that certain features can be preserved
Concept: current science/technology plans
• Scientific processing and data analysis
• Data compression/filtering
• Autonomous mission control; e.g., automatic landing-site identification, instrument control, etc.
• Cognitive-radio-based communications to optimize power, cost, bandwidth, processing speed, etc.
Objectives
Computationally efficient signal processing algorithms with the following features:
• Adaptive-filter-based algorithms that continuously adapt to new environments, inputs, events, disturbances, etc.
• Modular algorithms suitable for implementation on distributed processors
• Cognitive algorithms that learn from their environment; a high degree of artificial intelligence built in for mission technology and for science data gathering/processing
11/12/2007
DSP Algorithms: Impact
A large body of knowledge has been developed for on-board processing, in two main classes (filtering and classification):
• Adaptive filtering algorithms (EDS, FEDS, CG, and many variants)
• Algorithms for 3-D data de-noising, filtering, compression, and coding
• Algorithms for hyperspectral image clustering, classification, and onboard science (HyperEye)
• Algorithms for joint classification and compression
(System diagram) The data acquisition subsystem, a hyperspectral imager on the spacecraft, feeds the HyperEye IDU, a "precision" manifold learning system. HyperEye performs supervised classification (continuous production of surface cover maps) and unsupervised clustering with novelty detection, and raises alerts for navigation decisions sent to the decision/control subsystem. Inputs are labeled and unlabeled remote sensing observations and training data from the environment (Mars, Earth, … planet surfaces).
Intelligent Data Understanding in the on-board context
(Diagram) HyperEye, the "precision" learner. On-board component: an Artificial Neural Net core — a Self-Organizing Map (unsupervised) with non-standard capabilities plus a supervised SOM-hybrid classifier. Products: alerts, cluster extraction from the SOM (discovery), and supervised class maps with class statistics, all fed to the on-board autonomous decision system. On-ground component: human interaction — evaluation (by domain expert, ANN expert, …), feedback to learning, visualization & summary, and decision control.
HyperEye: an Intelligent Data Understanding environment — a "precision" manifold learning system.
Specific Goals (this talk)
• Maximize compression ratio with classification metrics
• Minimize mean square error under some constraints
• Minimize classification error
Signal Processing System
Tools & Algorithms:
• Digital filters
• Coefficient adaptation algorithms
• Neural nets, SOMs
• Pulse code modulators
• Image transforms
• Nonlinear optimizers
• Entropy coding
Scheme I
• ADPCM is used for compression
• SOM mapping is used for clustering
• A genetic algorithm is used to minimize the cost function
• Compression is done along the spatial and/or spectral domain
ADPCM system
Prediction error: $e = s - \hat{s}$
Reconstruction: $\tilde{s} = \hat{s} + \tilde{e}$
Reconstruction error = quantization error: $\tilde{s} - s = \tilde{e} - e = q$
Cost function:
$$J_n(\mathbf{w}) = \sum_{i=1}^{n} \lambda^{\,n-i}\, e_i^2(\mathbf{w})$$
Several different algorithms are used for adaptive filtering:
• Least Mean Square (LMS)
• Recursive Least Squares (RLS)
• Euclidean Direction Search (EDS)
• Conjugate Gradient (CG)

The quantizer is adaptive:
• Jayant quantizer
• Lloyd-Max optimum quantizer
• Custom quantizer as needed
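The Jayant quantizer adapts its step size sample by sample: after coding a sample, the step is multiplied by a factor that depends on which level was used (grow for outer levels, shrink for inner ones). A minimal sketch; the level count, multiplier values, and step-size bounds are illustrative assumptions, not values from the talk:

```python
import numpy as np

def jayant_quantize(x, levels=8, delta0=0.1, d_min=1e-4, d_max=10.0):
    """Sketch of a Jayant adaptive quantizer. The step size delta is
    rescaled after every sample by a level-dependent multiplier."""
    half = levels // 2
    # one multiplier per magnitude level: <1 shrinks for inner levels,
    # >1 grows for outer levels (assumed values)
    mult = np.linspace(0.9, 2.0, half)
    delta = delta0
    codes, recon = [], []
    for s in x:
        c = int(np.clip(np.floor(s / delta), -half, half - 1))  # quantizer index
        codes.append(c)
        recon.append((c + 0.5) * delta)          # mid-rise reconstruction level
        level = min(abs(c if c >= 0 else c + 1), half - 1)
        delta = float(np.clip(delta * mult[level], d_min, d_max))  # Jayant update
    return np.array(codes), np.array(recon)
```

The multiplier table is the design knob: it trades tracking speed for granular noise, which is why the talk also lists Lloyd-Max and custom quantizers as alternatives.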
Predictor Footprint
$C(i,j,k)$ represents the prediction coefficients; $R$ is the prediction window over which $C(i,j,k)$ is nonzero.
(Diagram: filter coefficient positions and the position to be predicted in the $(i,j)$ plane.)
Cubic filter:
$$d(n_1,n_2,n_3) = \sum_{(l,m,n)\in R} c(l,m,n)\, u(n_1-l,\; n_2-m,\; n_3-n)$$
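The ADPCM loop can be sketched in one dimension: the predictor forms $\hat{s}$ from past reconstructed samples, the prediction error is quantized, and both encoder and decoder update the same adapted coefficients from the reconstructed signal. A simplified 1-D sketch (the talk's predictor is 3-D over the window $R$; the LMS update, uniform quantizer, and step size here are assumptions for illustration):

```python
import numpy as np

def adpcm_1d(x, order=4, mu=0.01, delta=0.05):
    """1-D ADPCM sketch: LMS-adapted linear predictor + uniform quantizer.
    The predictor runs on *reconstructed* samples so that the decoder,
    seeing only the quantized errors, can track the same coefficients."""
    w = np.zeros(order)                    # predictor coefficients
    buf = np.zeros(order)                  # past reconstructed samples
    recon = np.zeros_like(x, dtype=float)
    for n in range(len(x)):
        pred = w @ buf                     # \hat{s}(n)
        e = x[n] - pred                    # prediction error e = s - \hat{s}
        eq = delta * np.round(e / delta)   # quantized error \tilde{e}
        recon[n] = pred + eq               # \tilde{s} = \hat{s} + \tilde{e}
        w += mu * eq * buf                 # LMS update on the quantized error
        buf = np.roll(buf, 1)
        buf[0] = recon[n]
    return recon
```

Because the quantizer is in the loop, the reconstruction error per sample is bounded by half the quantizer step regardless of how well the predictor has converged.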
EDS Algorithm
The least squares cost function:
$$J_n(\mathbf{w}) = \sum_{i=0}^{n} \lambda^{\,n-i}\left[d(i) - \mathbf{x}^T(i)\,\mathbf{w}\right]^2 = \sum_{i=0}^{n} \lambda^{\,n-i} d^2(i) - 2\,\mathbf{w}^T\mathbf{r}(n) + \mathbf{w}^T\mathbf{Q}(n)\,\mathbf{w}$$
with
$$\mathbf{Q}(n) = \sum_{i=0}^{n}\lambda^{\,n-i}\,\mathbf{x}(i)\,\mathbf{x}^T(i), \qquad \mathbf{r}(n) = \sum_{i=0}^{n}\lambda^{\,n-i}\, d(i)\,\mathbf{x}(i)$$
An iterative algorithm for minimizing $J_n(\mathbf{w})$ has the form
$$\mathbf{w}(n+1) = \mathbf{w}(n) + \alpha\,\mathbf{g}$$
The cost function at the next step is $J_n(\mathbf{w} + \alpha\mathbf{g})$. We find the $\alpha$ that minimizes it; setting
$$\frac{\partial J_n(\mathbf{w}+\alpha\mathbf{g})}{\partial\alpha} = 2\,\mathbf{g}^T\mathbf{Q}(n)\,(\mathbf{w}+\alpha\mathbf{g}) - 2\,\mathbf{g}^T\mathbf{r}(n) = 0$$
gives
$$\alpha = \frac{\mathbf{g}^T\left(\mathbf{r}(n) - \mathbf{Q}(n)\,\mathbf{w}(n)\right)}{\mathbf{g}^T\mathbf{Q}(n)\,\mathbf{g}}$$
In EDS the search direction $\mathbf{g}$ cycles through the Euclidean (coordinate) directions.
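The direction search above can be sketched directly: for each Euclidean direction $\mathbf{g}=\mathbf{e}_k$, the closed-form $\alpha$ moves $\mathbf{w}$ to the minimum of the quadratic cost along that coordinate. A minimal sketch of one EDS sweep over a fixed $\mathbf{Q}$ and $\mathbf{r}$ (the recursive time update of $\mathbf{Q}(n)$ and $\mathbf{r}(n)$ as samples arrive is omitted):

```python
import numpy as np

def eds_step(w, Q, r):
    """One EDS sweep: minimize J(w) = const - 2 w^T r + w^T Q w along each
    Euclidean (coordinate) direction in turn, using the closed-form alpha."""
    for k in range(len(w)):
        g = np.zeros_like(w)
        g[k] = 1.0                           # Euclidean direction e_k
        denom = g @ Q @ g                    # g^T Q g  (equals Q[k, k])
        if denom > 0:
            alpha = g @ (r - Q @ w) / denom  # optimal step along g
            w = w + alpha * g
    return w
```

Repeated sweeps drive $\mathbf{w}$ toward the least-squares solution $\mathbf{Q}\mathbf{w}=\mathbf{r}$ without ever inverting $\mathbf{Q}$, which is the appeal for on-board hardware.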
Self-Organizing Map (SOM)
• Unsupervised neural network
• A mapping from a high-dimensional input data space onto a regular two-dimensional array of neurons
• The neurons of the map are connected to adjacent neurons by the topology (rectangular or hexagonal)
• One neuron wins the competition; its weights and those of its neighborhood are then updated
(Diagram: input layer connected through weights to the competition/output layer.)
Source: http://www.generation5.org/content/2004/aisompic.asp
The learning process of the SOM Competition A winning neuron is selected when {output(i)=<input, weight>} = the shortest Euclidean distance between input vector and weights
UpdateUpdate the weight values of the winning neuron and its neighborhood
RepeatAs the learning proceeds, the learning rate and the size of the neighborhooddecreases gradually
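The competition/update/repeat cycle can be sketched as follows. The grid size, linear decay schedules, and Gaussian neighborhood are common illustrative choices, not the talk's "non-standard capabilities":

```python
import numpy as np

def train_som(data, rows=8, cols=8, iters=2000, lr0=0.5, sigma0=3.0, seed=0):
    """Minimal SOM training loop: competition (nearest weight vector wins),
    update (winner and neighborhood move toward the input), repeat with
    decaying learning rate and neighborhood radius."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=(rows, cols, data.shape[1]))
    gy, gx = np.mgrid[0:rows, 0:cols]            # grid coordinates
    for t in range(iters):
        frac = t / iters
        lr = lr0 * (1 - frac)                    # learning rate decays
        sigma = sigma0 * (1 - frac) + 0.5        # neighborhood shrinks
        x = data[rng.integers(len(data))]
        d = np.linalg.norm(w - x, axis=2)        # competition
        wy, wx = np.unravel_index(np.argmin(d), d.shape)
        h = np.exp(-((gy - wy) ** 2 + (gx - wx) ** 2) / (2 * sigma ** 2))
        w += lr * h[..., None] * (x - w)         # neighborhood update
    return w

def som_label(data, w):
    """Cluster assignment: index of each sample's best-matching unit."""
    flat = w.reshape(-1, w.shape[-1])
    return np.argmin(np.linalg.norm(data[:, None, :] - flat[None], axis=2), axis=1)
```

`som_label` is the piece the GA-ADPCM scheme uses: it turns an image (original or decompressed) into a cluster-label map that the fitness function can compare.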
GA-ADPCM Algorithm
1. Apply SOM mapping to the original image.
2. Generate an initial population of ADPCM coefficient sets.
3. Run ADPCM (LMS, EDS, RLS, etc.) processing using these sets of coefficients.
4. Apply SOM mapping to the decompressed images.
5. Calculate the fitness scores (clustering errors) between the decompressed images and the original image.
(Diagram: population of coefficient sets → ADPCM → SOM → fitness scores; individuals are then sorted by score.)
6. Sort the fitness scores and choose the 50% fittest individuals.
7. Apply the genetic operations (crossover and mutation) to create the new coefficient population.
8. Go to Step 3 and repeat this loop until the termination condition is achieved.
9. The termination condition is a clustering error smaller than a given threshold.
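Steps 2-9 form a standard generational GA loop, which can be sketched as below. The `fitness` callback stands in for the full ADPCM-compress/decompress-then-SOM pipeline of Steps 3-5; the population size, mutation scale, and one-point crossover are assumed parameters, not the talk's settings:

```python
import numpy as np

def run_ga(fitness, pop_size=20, n_coef=9, gens=30, threshold=0.0, seed=0):
    """GA skeleton for Steps 2-9: score each coefficient set, keep the
    fittest 50%, refill by crossover + mutation, stop at the generation
    limit or when the best clustering error drops below the threshold.
    `fitness` maps a coefficient vector to an error (smaller is better)."""
    rng = np.random.default_rng(seed)
    pop = rng.normal(scale=0.1, size=(pop_size, n_coef))
    for _ in range(gens):
        scores = np.array([fitness(ind) for ind in pop])
        order = np.argsort(scores)                 # smaller error = fitter
        if scores[order[0]] < threshold:
            break                                  # Step 9: terminate early
        parents = pop[order[: pop_size // 2]]      # Step 6: keep fittest 50%
        children = []
        while len(children) < pop_size - len(parents):
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, n_coef)          # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            child += rng.normal(scale=0.01, size=n_coef)  # mutation
            children.append(child)
        pop = np.vstack([parents, children])       # Step 7: new population
    scores = np.array([fitness(ind) for ind in pop])
    return pop[np.argmin(scores)]
```

Keeping the parents in the next population makes the loop elitist, so the best clustering error never increases between generations; this matches the monotone fitness curves shown in the results.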
Fitness function: F = Ce / N
Ce is the number of incorrectly clustered pixels and N is the total number of pixels in the image, so F is the fraction of incorrectly clustered pixels. Ce is obtained by the following steps:
1. Calculate Cm = Co − Cg, where Co is the matrix containing the clustering result of the original image and Cg is the matrix containing the clustering result of the image after ADPCM compression and decompression; Cm is the difference between the two clustered images.
2. Set all the nonzero entries of Cm to 1 and add them together to get the clustering error Ce.

Example:
Co = [1 2 3; 2 1 3; 3 1 2],  Cg = [1 2 2; 1 1 3; 1 1 3]
Co − Cg = [0 0 1; 1 0 0; 2 0 −1]  →  binarized Cm = [0 0 1; 1 0 0; 1 0 1],  Ce = 4
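The two steps above amount to counting label disagreements between the two cluster maps; a sketch using the 3 × 3 example from the slide:

```python
import numpy as np

def clustering_error(co, cg):
    """Fitness F = Ce / N: the fraction of pixels whose cluster label
    changed after compression and decompression."""
    cm = (np.asarray(co) != np.asarray(cg)).astype(int)  # binarized Co - Cg
    ce = cm.sum()                                        # Ce: mismatch count
    return ce / cm.size                                  # F = Ce / N

# The slide's example: Ce = 4 mismatches out of N = 9 pixels
co = [[1, 2, 3], [2, 1, 3], [3, 1, 2]]
cg = [[1, 2, 2], [1, 1, 3], [1, 1, 3]]
print(clustering_error(co, cg))
```

Note this counts *any* label change; it does not attempt to match permuted cluster labels between the two maps, consistent with the subtraction-based definition on the slide.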
Transform Domain Scheme
• An image transform is used for compression: DFT, DCT, DST, DWT, etc.
• Parameters (block size, number of bits) can be adjusted by the cost function
• Compression is done along the spectral domain, the spatial domain, or both
• Quantization: uniform, non-uniform, optimum, custom, etc.
• Bit allocation: non-uniform

Transform-Domain Algorithm
• Method I: fix the number of quantization bits, adjust the block size (DCT length)
• Method II: fix the block size (DCT length), adjust the number of quantization bits
• Several other combinations
Results
Hyperspectral cubes: Lunar Crater Volcanic Field (LCVF) and Jasper Ridge (JR)
(Image: one frame of the hyperspectral cube)
(Images, 60 × 60 pixels each)
One block of the original image · Clustered image by LMS · Clustered image by EDS
Clustered original image · Clustered image by GA-LMS · Clustered image by GA-EDS
Clustered results comparison between ADPCM and GA-ADPCM
Fitness score for GA-LMS · Fitness score for GA-EDS (fitness scores = clustering errors)
(Plots: max, min, and average fitness score vs. generation over 5 generations; scores span roughly 0.08–0.11 for GA-LMS and 0.06–0.15 for GA-EDS.)
Block index  (1,1)     (1,2)    (1,3)    (1,4)    (1,5)    (1,6)    (1,7)    (1,8)
LMS          0.04248   0.06030  0.05127  0.02759  0.04248  0.03906  0.05713  0.06934
GA-LMS       0.034424  0.04956  0.04077  0.02319  0.02759  0.03735  0.04907  0.06567
EDS          0.052734  0.08593  0.07080  0.03955  0.05664  0.05127  0.06519  0.09277
GA-EDS       0.04541   0.07056  0.06128  0.03223  0.04834  0.05078  0.05835  0.08057
Clustering error comparison between ADPCM and GA-ADPCM
(Plot: classified error % vs. column number, comparing ADPCM LMS with GA+ADPCM LMS.)
Block size=16, Classes=4 · Block size=32, Classes=4 · Block size=64, Classes=4
Block size=16, Classes=3 · Block size=32, Classes=6 · Block size=64, Classes=8
(Plots, one per configuration above: classified error % vs. column number, comparing ADPCM LMS with GA+ADPCM LMS.)
Clustering error comparison between LMS and GA-LMS
(Plots: classified error % vs. column number, comparing ADPCM EDS with GA+ADPCM EDS.)
Clustering error comparison between EDS and GA-EDS
Block size=16, Classes=4 · Block size=32, Classes=4 · Block size=64, Classes=4
Clustering results between uncompressed image and transformed image
(Images, 120 × 120 pixels)
Clustered image of original image · Clustered image after transform
(Two plots: average data value vs. band index 64–69, one curve per cluster 1–6.)
Mean spectral signatures of the SOM clusters identified in the Jasper Ridge image.
Left: from the original image. Right: from the image after applying DCT compression and decompression
Clustering errors using different block sizes in JR:
Iteration      1       2       3       4       5
Block size     768     384     192     96      48
Cluster error  0.0980  0.0728  0.0469  0.0317  0.0218

Clustering errors using different numbers of bits in JR:
Iteration          1       2       3       4
Cluster error      0.1292  0.0657  0.0367  0.0223
Compression ratio  6.4:1   5.33:1  4.2:1   4:1
Spectral signature comparison (mean, STD, envelope) of the whole hyperspectral data set: LCVF uncompressed data · LCVF after ADPCM compression · LCVF after DCT compression
Classification accuracy

Data Set  Run C  Run A  Run E  Run M  Run B.0  Avg.   Std.
D1c16     84.9   86.3   85.1   82.4   84.6     84.66  1.46
D1c8b3    77.3%

                Run 1  Run 2  Run 5.1  Run 5.2  Run 6  Avg.   Std.
LCVF benchmark  86.01  86.03  86.05    86.15    86.1   86.07  0.06

             Run 1
DCT194b8hb4  63.5%
Measuring the effect of compression on classification accuracy. Data: hyperspectral image of the Lunar Crater Volcanic Field, 196 spectral bands, 614 × 420 pixels. Classifications were done for 23 known surface cover types. The original uncompressed data are labeled "LCVF"; "D1c16" is a compressed-then-decompressed data set using ADPCM; "DCT194b8hb4" is a compressed-then-decompressed data set using DCT (8-bit quantization for significant data, 4-bit for insignificant data); "D1c8b3" uses ADPCM with 3-bit Jayant quantization.
Conclusion
• New algorithms have been developed and implemented that use the concept of classification-metric-driven compression.
• The GA-ADPCM algorithm was simulated:
  - Optimized the adaptive filter in an ADPCM system using a GA
  - Reduced the clustering error
  - Drawback: increased computational cost
• The Feedback-Transform algorithm was simulated:
  - Selects the optimal block size (DCT length) and number of quantization bits to balance low clustering error, computational complexity, and memory usage
• Compression along the spectral domain preserves the spectral signatures of the clusters.
• Results using the above algorithms are promising.
Acknowledgments
Graduate students: Mike Larsen (USU), Kay Thamvichai (USU), Mike Mendenhall (Rice), Li Ling (Rice), Bei Xei (VT), B. Ramkumar (VT)
NASA AISR Program