Transcript of Hyperspectral Unmixing Via Sparse Regression: Optimization Problems and Algorithms

  • José M. Bioucas-Dias

    Instituto de Telecomunicações

    Instituto Superior Técnico

    Technical University of Lisbon

    Portugal

    Hyperspectral Unmixing Via Sparse Regression

    Optimization Problems and Algorithms

    MAHI, Nice 2012


    Joint work with Mário A. T. Figueiredo

  • Outline

    Introduction to hyperspectral unmixing

    Hyperspectral unmixing via sparse regression: recovery guarantees, convex and nonconvex solvers

    Improving hyperspectral sparse regression: structured sparsity, dictionary pruning

    Solving convex sparse hyperspectral unmixing and related convex inverse problems with ADMM

    Concluding remarks

  • Hyperspectral imaging (and mixing)


  • Linear mixing model (LMM)

    Incident radiation interacts with only one component
    (checkerboard-type scenes):

      y_j = M alpha_j + n_j

    Hyperspectral linear unmixing: estimate the mixing matrix M and
    the fractional abundances alpha_j.

  • Hyperspectral unmixing

    AVIRIS of Cuprite, Nevada, USA
    R – ch. 183 (2.10 µm), G – ch. 193 (2.20 µm), B – ch. 207 (2.34 µm)

    [Figure: abundance maps of Alunite, Kaolinite, and Kaolinite #2,
    estimated with VCA [Nascimento, B-D, 2005].]

  • NIR tablet imaging

[Lopes et al., 2010]

  • Spectral linear unmixing (SLU)

    Given N spectral vectors of dimension m, subject to the LMM

      y_j = M alpha_j + n_j,   j = 1, ..., N

    determine:

    The mixing matrix M (endmember spectra)

    The fractional abundance vectors alpha_j

    ANC: abundance nonnegativity constraint (alpha_j >= 0)
    ASC: abundance sum-to-one constraint (1^T alpha_j = 1)

    SLU is a blind source separation (BSS) problem.
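
    To make the data model concrete, here is a minimal numpy sketch (all
    sizes illustrative, not from the talk) that draws ANC/ASC-compliant
    abundances and generates pixels through the LMM:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    m, p, N = 224, 5, 1000          # bands, endmembers, pixels (illustrative)

    M = rng.uniform(0.0, 1.0, (m, p))            # endmember spectra (columns)
    alpha = rng.dirichlet(np.ones(p), size=N).T  # abundances: >= 0, sum to 1
    noise = 1e-3 * rng.standard_normal((m, N))

    Y = M @ alpha + noise                        # LMM: y_j = M alpha_j + n_j

    assert np.all(alpha >= 0)                    # ANC
    assert np.allclose(alpha.sum(axis=0), 1.0)   # ASC
    ```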

  • Geometrical view of SLU

    Under the ANC and ASC, the abundance vectors live in the probability
    simplex, so the data vectors live in a (p-1)-simplex whose vertices
    are the columns of M.

    Inferring M <=> inferring the vertices of the simplex

  • Pure pixels / Minimum volume simplex (MVS)

    [Figure: three simulated data sets. Pure pixels: MVS works.
    No pure pixels: MVS works. No pure pixels, highly mixed:
    MVS does not work.]

    [B-D et al., IEEE JSTARS, 2012]

  • Sparse regression-based SLU

    Key observation: spectral vectors can be expressed as linear
    combinations of a few pure spectral signatures obtained from a
    (potentially very large) spectral library A.
    [Iordache, B-D, Plaza, 11, 12]

    Unmixing: given y and A, find the sparsest solution of

      min_x ||x||_0   s.t.  ||A x - y||_2 <= delta,  x >= 0

    Advantage: sidesteps endmember estimation.

    Disadvantage: combinatorial problem!

  • Sparse reconstruction/compressive sensing

    Key result: a sparse signal is exactly recoverable from an
    underdetermined linear system of equations in a computationally
    efficient manner via convex/nonconvex programming.

    [Candes, Romberg, Tao, 06], [Candes, Tao, 06], [Donoho, Tao, 06],
    [Blumensath, Davies, 09]

    (P0):  min_x ||x||_0   s.t.  A x = y

    A k-sparse x is the unique solution of (P0) under suitable
    conditions on A; solving (P0) directly is NP-hard.

  • Sparse reconstruction/compressive sensing

    Optimization strategies to cope with (P0) NP-hardness:

    Convex relaxation
    BP – Basis Pursuit [Chen et al., 2001]
    BPDN – BP denoising
    LASSO [Tibshirani, 1996]

    Approximation algorithms
    Bayesian CS [Ji et al., 2008]
    CoSaMP – Compressive Sampling Matching Pursuit [Needell, Tropp, 2009]
    IHT – Iterative Hard Thresholding [Blumensath, Davies, 09]
    GDS – Gradient Descent Sparsification [Garg, Khandekar, 2009]
    HTP – Hard Thresholding Pursuit [Foucart, 10]
    MP – Message Passing [Vila, Schniter, 2012]
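
    As a concrete instance of the convex-relaxation route, here is a
    minimal proximal-gradient (ISTA) sketch for the BPDN/LASSO objective
    min_x (1/2)||Ax - y||_2^2 + lambda ||x||_1. This is a generic
    textbook solver, not the talk's SUnSAL algorithm (which appears
    later):

    ```python
    import numpy as np

    def soft_threshold(v, t):
        """Proximity operator of t * ||.||_1 (elementwise soft threshold)."""
        return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

    def ista_bpdn(A, y, lam, n_iter=500):
        """Minimize 0.5*||Ax - y||^2 + lam*||x||_1 by proximal gradient."""
        L = np.linalg.norm(A, 2) ** 2      # Lipschitz constant of the gradient
        x = np.zeros(A.shape[1])
        for _ in range(n_iter):
            grad = A.T @ (A @ x - y)       # gradient of the quadratic term
            x = soft_threshold(x - grad / L, lam / L)
        return x
    ```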

  • Exact recovery of sparse vectors

    Recovery guarantees are linked with the restricted isometry
    property (RIP). Restricted isometry constant delta_k: the smallest
    delta such that

      (1 - delta) ||x||_2^2 <= ||A x||_2^2 <= (1 + delta) ||x||_2^2

    for every k-sparse x.

    Many SR algorithms ensure exact recovery provided that the relevant
    RI constant is small enough; this condition is satisfied for random
    matrices provided that the number of measurements satisfies
    m >= C k ln(n/k). The constant differs per algorithm [Foucart, 10]:

    Algorithm:  BP      HTP   CoSaMP   GDS   IHT
    Ratio:      9.243   9     27.08    18    12
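
    A quick back-of-the-envelope sketch of what the table implies,
    assuming the tabulated ratio is the constant C in m >= C k ln(n/k)
    (an assumption about how to read the slide; k and n are
    illustrative):

    ```python
    import numpy as np

    ratios = {"BP": 9.243, "HTP": 9, "CoSaMP": 27.08, "GDS": 18, "IHT": 12}
    k, n = 5, 1000                  # sparsity and dictionary size (illustrative)
    for name, c in ratios.items():
        m_min = c * k * np.log(n / k)
        print(f"{name:7s} needs roughly m >= {m_min:.0f} measurements")
    ```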

  • Example: Gaussian matrices; signed signals

    [Figure: signal-to-reconstruction error, SRE (dB), versus ||x||_0
    (number of active columns of A), for SNR = 25 dB and SNR = 100 dB.]

    Algorithms: BPDN (SUnSAL) [B-D, Figueiredo, 2010];
    EM-BG-AMP / EM-GM-AMP [Vila, Schniter, 2012]

  • Example: Gaussian matrices; non-negative signals

    [Figure: SRE (dB) versus ||x||_0 (number of active columns of A),
    for SNR = 25 dB and SNR = 100 dB.]

    Algorithms: BPDN (SUnSAL); EM-BG-AMP

  • Hyperspectral libraries

    Hyperspectral libraries exhibit poor RI constants
    (mutual coherence close to 1). [Iordache, B-D, Plaza, 11, 12]

    [Figure: library signatures, normalized and mean removed;
    BP recovery guarantee for p = 10.]
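
    Mutual coherence is cheap to check directly. A minimal sketch
    (A assumed to be an m x n numpy array holding the library):

    ```python
    import numpy as np

    def mutual_coherence(A):
        """max over i != j of |<a_i, a_j>| / (||a_i|| ||a_j||)."""
        An = A / np.linalg.norm(A, axis=0, keepdims=True)  # unit-norm columns
        G = np.abs(An.T @ An)                              # absolute Gram matrix
        np.fill_diagonal(G, 0.0)                           # ignore self-products
        return G.max()

    # For hyperspectral libraries this value is typically close to 1,
    # which is what makes the RI constants poor.
    ```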

  • Example: hyperspectral library; non-negative signals

    [Figure, two panels: SRE (dB) versus ||x||_0 (number of active
    columns of A), for BPDN and EM-GM-AMP at SNR = 30 dB and 100 dB.]

    Algorithms: BPDN (SUnSAL); EM-BG-AMP

  • Phase transition curves [Donoho, Tanner, 2005]

    Results linked with the k-neighborliness property of polytopes.

    [Figure: phase transition curves, undersampling factor versus
    fractional sparsity, for Gaussian matrices and the USGS library;
    BPDN and EM-BG-AMP; SNR = ∞ and SNR = 30 dB.]

    (Enough for many hyperspectral applications.)

  • Real data – AVIRIS Cuprite

    [Iordache, B-D, Plaza, 11, 12]

  • Sparse reconstruction of hyperspectral data: summary

    Bad news: hyperspectral libraries have poor RI constants.

    Surprising fact: convex programs (BP, BPDN, LASSO, ...) yield much
    better empirical performance than non-convex state-of-the-art
    competitors.

    Good news: hyperspectral mixtures are highly sparse, very often p <= 5.

    Directions to improve hyperspectral sparse reconstruction:

    Structured sparsity + subspace structure
    (pixels in a given data set share the same support)

    Spatial contextual information
    (pixels belong to an image)

  • Beyond l1 pixelwise regularization

    Rationale: introduce new sparsity-inducing regularizers to
    counter the sparse regression limits imposed by the high
    coherence of the hyperspectral libraries.

    Let's rewrite the LMM in matrix form as

      Y = A X + N,

    where the i-th row of Y is the i-th image band and the j-th column
    of X holds the regression coefficients of pixel j.

  • Constrained total variation sparse regression (CTVSR)
    [Iordache, B-D, Plaza, 11]

      min_X (1/2) ||A X - Y||_F^2 + lambda ||X||_{1,1} + lambda_TV TV(X)
      s.t.  X >= 0,

    where TV(X) sums the total variation of each abundance band
    (the i-th band being the image formed by the i-th row of X).

    Related work: [Zhao, Wang, Huang, Ng, Plemmons, 12]
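
    For reference, a minimal sketch of evaluating the anisotropic TV
    term on an abundance cube (the bands x rows x cols layout is an
    assumption for illustration):

    ```python
    import numpy as np

    def tv_aniso(X):
        """Anisotropic TV summed over all abundance bands:
        sum_i TV(X_i), with TV(X_i) the sum of absolute horizontal
        and vertical first differences of band i."""
        dh = np.abs(np.diff(X, axis=2)).sum()   # horizontal differences
        dv = np.abs(np.diff(X, axis=1)).sum()   # vertical differences
        return dh + dv
    ```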

  • Illustrative examples with simulated data: SUnSAL-TV

    [Figure: original data cube; original abundance of EM5;
    SUnSAL estimate; SUnSAL-TV estimate.]

  • Constrained collaborative sparse regression (CCSR)

    Multiple measurements share the same support: the columns of X are
    jointly sparse, i.e., only a few rows of X are non-null.

    [Illustration: Y = A X with a row-sparse X.]

    [Turlach, Venables, Wright, 2004], [Iordache, B-D, Plaza, 11, 12]

    Theoretical guarantees (superiority of multichannel): the
    probability of recovery failure decays exponentially in the number
    of channels. [Eldar, Rauhut, 11]
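
    The collaborative penalty is typically the l_{2,1} norm, whose
    proximity operator shrinks whole rows of X jointly, which is
    exactly what enforces a shared support across pixels. A minimal
    sketch:

    ```python
    import numpy as np

    def prox_l21(X, t):
        """Proximity operator of t * sum_k ||row_k(X)||_2.
        Rows whose l2 norm falls below t are zeroed jointly across
        all pixels, enforcing a common support."""
        norms = np.linalg.norm(X, axis=1, keepdims=True)
        scale = np.maximum(1.0 - t / np.maximum(norms, 1e-12), 0.0)
        return X * scale
    ```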

  • Illustrative examples with simulated data: CSUnSAL

    [Figure: original X; estimate without collaborative sparseness;
    estimate with collaborative sparseness.]

  • Multiple measurements

    The multiple measurement vector (MMV) problem: recover a row-sparse
    X from Y = A X. Let k denote the number of non-null rows of X.

    MMV has a unique solution iff

      k < (spark(A) - 1 + rank(Y)) / 2

    [Feng, 1997], [Chen, Huo, 2006], [Davies, Eldar, 2012]

    The rank(Y) term is the MMV gain over the single-measurement case.
    If rank(Y) = k, the above bound is achieved using the multiple
    signal classification (MUSIC) algorithm.

  • Endmember identification with MUSIC

    (noiseless measurements) Y = A X

    MUSIC algorithm:

    1) Compute U, the eigenvectors associated with the first p
    eigenvalues of Y Y^T (a basis for the signal subspace).

    2) Compute the projection error of each normalized library column
    onto that subspace, and set the detected endmember index set to the
    columns with (near-)zero error.
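
    A minimal numpy sketch of steps 1-2 (the function name and the use
    of eigh are illustrative choices, not the talk's code):

    ```python
    import numpy as np

    def music_projection_errors(Y, A, p):
        """Eigendecompose Y @ Y.T, keep the p leading eigenvectors U,
        and score each normalized library column by the energy it
        leaves outside span(U)."""
        _, V = np.linalg.eigh(Y @ Y.T)          # eigenvalues in ascending order
        U = V[:, -p:]                           # p leading eigenvectors
        An = A / np.linalg.norm(A, axis=0)      # unit-norm library columns
        residual = An - U @ (U.T @ An)          # part outside the signal subspace
        return np.linalg.norm(residual, axis=0) # small error => likely endmember
    ```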

  • Examples (simulated data)

    A – USGS (pairwise spectral angles >= 3°), p = 10, N = 5000,
    X uniform over the simplex.

    [Figure, two panels: projection error per library column, with the
    detected endmembers marked. Top: SNR = ∞; bottom: Gaussian iid
    noise, SNR = 25 dB. In both cases a clear gap separates the true
    endmembers from the remaining library columns.]

  • Examples (simulated data)

    [Figure: projection error per library column under colored noise,
    SNR = 25 dB.]

    Cause of the large projection errors: poor identification of the
    signal subspace.

    Cure: identify the signal subspace.

  • Signal subspace identification

    [Figure, two panels: projection errors under colored noise,
    SNR = 25 dB, with an incorrect signal subspace versus with the
    signal subspace identified with HySime [B-D, Nascimento, 2008].]

  • Proposed MUSIC – Collaborative SR algorithm

    MUSIC-CSR algorithm [B-D, 2012]:

    1) Estimate the signal subspace using, e.g., the HySime algorithm.

    2) Compute the projection errors and define the (pruned) index set.

    3) Solve the collaborative sparse regression optimization on the
    pruned library. [B-D, Figueiredo, 2012]

    Related work: CS-MUSIC [Kim, Lee, Ye, 2012] (N < k and iid noise)

  • MUSIC-CSR results

    A – USGS (pairwise spectral angles >= 3°), Gaussian-shaped noise,
    SNR = 25 dB, k = 5, m = 300.

    [Figure: accumulated abundances per library column, with the true
    endmembers marked.]

    MUSIC-CSR: SNR = 11.7 dB, computation time ~ 10 sec.
    CSR: SNR = 0 dB, computation time ~ 600 sec.

  • Results with CUPRITE

    • size: 350x350 pixels
    • spectral library: 302 materials (minerals) from the USGS library
    • spectral bands: 188 out of 224 (noisy bands were removed)
    • spectral range: 0.4 – 2.5 µm
    • spectral resolution: 10 nm
    • "validation" map: Tetracorder

    [Figure: projection error per column of the library.]

  • Results with real data

    Note: good spatial distribution of the endmembers.

    Processing times: 2.6 ms/pixel using the full library;
    0.22 ms/pixel using the pruned library with 40 members.

  • Convex optimization problems in SLU

    Constrained least squares (CLS):

      min_x (1/2) ||A x - y||_2^2   s.t.  x >= 0  (ANC)

    Fully constrained least squares (FCLS):

      min_x (1/2) ||A x - y||_2^2   s.t.  x >= 0, 1^T x = 1  (ANC + ASC)

    Constrained sparse regression (CSR):

      min_x (1/2) ||A x - y||_2^2 + lambda ||x||_1   s.t.  x >= 0

  • Convex optimization problems in SLU

    Constrained basis pursuit (CBP):

      min_x ||x||_1   s.t.  A x = y,  x >= 0

    CBP denoising (CBPDN):

      min_x ||x||_1   s.t.  ||A x - y||_2 <= delta,  x >= 0

    Constrained total variation (CTV):

      min_X (1/2) ||A X - Y||_F^2 + lambda ||X||_{1,1} + lambda_TV TV(X)
      s.t.  X >= 0

    Constrained collaborative sparse regression (CCSR):

      min_X (1/2) ||A X - Y||_F^2 + lambda ||X||_{2,1}   s.t.  X >= 0

  • Convex optimization problems in SLU

    Structure of the optimization problems: minimize a sum of convex
    functions over a convex set.

    Source of difficulties: large scale; nonsmoothness.

    Line of attack: alternating direction method of multipliers (ADMM)
    [Glowinski, Marrocco, 75], [Gabay, Mercier, 76]

  • Alternating Direction Method of Multipliers (ADMM)

    Unconstrained (convex) optimization problem:

      min_x f1(x) + f2(G x)

    ADMM [Glowinski, Marrocco, 75], [Gabay, Mercier, 76]:

      x_{k+1} = argmin_x f1(x) + (mu/2) ||G x - z_k - d_k||_2^2
      z_{k+1} = argmin_z f2(z) + (mu/2) ||G x_{k+1} - z - d_k||_2^2
      d_{k+1} = d_k - (G x_{k+1} - z_{k+1})

    Interpretations: variable splitting + augmented Lagrangian + NLBGS;
    Douglas-Rachford splitting on the dual [Eckstein, Bertsekas, 92];
    split-Bregman approach [Goldstein, Osher, 08]
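
    A generic sketch of these iterations, with the two subproblem
    solvers passed in as functions (all names illustrative):

    ```python
    import numpy as np

    def admm(x_update, z_update, G, z0, n_iter=200):
        """Scaled-form ADMM for min_x f1(x) + f2(Gx).
        x_update(v): argmin_x f1(x) + (mu/2)*||G x - v||^2
        z_update(v): argmin_z f2(z) + (mu/2)*||z - v||^2 (prox of f2/mu)
        z0 must have the shape of G @ x."""
        z, d = z0.copy(), np.zeros_like(z0)   # splitting var and scaled dual
        x = None
        for _ in range(n_iter):
            x = x_update(z + d)               # f1 subproblem
            Gx = G @ x
            z = z_update(Gx - d)              # f2 subproblem (prox step)
            d = d - (Gx - z)                  # scaled dual update
        return x
    ```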

  • A Cornerstone Result on ADMM [Eckstein, Bertsekas, 1992]

    Consider the problem min_x f1(x) + f2(G x). Let f1 and f2 be
    closed, proper, and convex, and let G have full column rank.

    Let (x_k, z_k, d_k) be the sequence produced by ADMM with mu > 0;
    then, if the problem has a solution, say x*, the sequence x_k
    converges to x*.

    The theorem also allows for inexact minimizations, as long as the
    errors are absolutely summable.

    Convergence rate: O(1/k) [He, Yuan, 2011]

  • ADMM for two or more functions [Figueiredo, B-D, 09]

    Consider a more general problem:

      min_x  g_1(H_1 x) + ... + g_J(H_J x),

    with g_j proper, closed, convex functions and H_j arbitrary
    matrices.

    There are many ways to write this as min f1(x) + f2(G x); we adopt
    f1 = 0, f2(z_1, ..., z_J) = sum_j g_j(z_j), and G = [H_1; ...; H_J]
    (the H_j stacked).

    Another approach in [Goldfarb, Ma, 09, 11]

  • Applying ADMM to more than two functions

    Conditions for easy applicability:

    inexpensive matrix inversion (of sum_j H_j^T H_j)

    inexpensive proximity operators (of the g_j)
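
    Two proximity operators that recur in the SLU problems above and
    satisfy the second condition; minimal sketches:

    ```python
    import numpy as np

    def prox_l1(v, t):
        """prox of t*||.||_1: elementwise soft threshold (inexpensive)."""
        return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

    def prox_nonneg(v):
        """prox of the indicator of the first orthant:
        projection onto x >= 0."""
        return np.maximum(v, 0.0)
    ```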

  • Constrained sparse regression (CSR)

    Problem:

      min_x (1/2) ||A x - y||_2^2 + lambda ||x||_1   s.t.  x >= 0

    Equivalent formulation (template min f1(x) + f2(z) s.t. z = x):

      f1(x) = (1/2) ||A x - y||_2^2
      f2(z) = lambda ||z||_1 + i_+(z), with i_+ the indicator of the
      first orthant

    Matrix inversion (A^T A + mu I)^{-1} can be precomputed
    (typical sizes 200~300 x 500~1000).

    Proximity operator of f2: soft threshold followed by the mapping
    onto the nonnegative orthant.

    Spectral unmixing by split augmented Lagrangian (SUnSAL)
    [B-D, Figueiredo, 2010]

    Related algorithm (split-Bregman view) in [Szlam, Guo, Osher, 2010]
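
    Putting the template together, a minimal SUnSAL-style sketch
    (numpy; mu, lambda, and the iteration count are illustrative; the
    matrix inverse is precomputed once, as noted above):

    ```python
    import numpy as np

    def sunsal(A, y, lam, mu=0.1, n_iter=300):
        """ADMM sketch for min_x 0.5*||Ax - y||^2 + lam*||x||_1 s.t. x >= 0.
        Splitting: f1 = quadratic term, f2 = lam*||.||_1 + indicator of x >= 0."""
        n = A.shape[1]
        # Precomputed once; A is small (e.g. 200~300 x 500~1000), so this is cheap.
        F = np.linalg.inv(A.T @ A + mu * np.eye(n))
        Aty = A.T @ y
        z = np.zeros(n)
        d = np.zeros(n)
        for _ in range(n_iter):
            x = F @ (Aty + mu * (z + d))    # quadratic subproblem
            v = x - d
            # prox of f2: soft threshold, then project onto x >= 0
            z = np.maximum(np.sign(v) * np.maximum(np.abs(v) - lam / mu, 0.0),
                           0.0)
            d = d - (x - z)                 # scaled dual update
        return z
    ```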

  • Concluding remarks

    The sparse regression framework, used with care, yields effective
    hyperspectral unmixing results.

    Critical factors: high mutual coherence of the hyperspectral
    libraries; non-linear mixing and noise; acquisition and calibration
    of hyperspectral libraries.

    Favorable factor: hyperspectral mixtures are highly sparse.

    ADMM is a very flexible and efficient tool for solving the
    hyperspectral sparse regression optimization problems.

  • References

    J. Bioucas-Dias, A. Plaza, N. Dobigeon, M. Parente, Q. Du, P. Gader, and J. Chanussot,

    "Hyperspectral unmixing overview: geometrical, statistical, and sparse regression-based

    approaches", IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing,

vol. 5, no. 2, pp. 354-379, 2012.

    S. Chen, D. Donoho, and M. Saunders, “Atomic decomposition by basis pursuit,” SIAM Rev., vol.

    43, no. 1, pp. 129–159, 2001.

    M. V. Afonso, J. Bioucas-Dias, and M. Figueiredo, “An augmented Lagrangian approach to the

    constrained optimization formulation of imaging inverse problems”, IEEE Transactions on Image

    Processing, vol. 20, no. 3, pp. 681-695, 2011.

    J. Bioucas-Dias and M. Figueiredo, “Alternating direction algorithms for constrained sparse

    regression: Application to hyperspectral unmixing”, in 2nd IEEE GRSS Workshop on

Hyperspectral Image and Signal Processing (WHISPERS'2010), Reykjavik, Iceland, 2010.

    D. Gabay and B. Mercier, “A dual algorithm for the solution of nonlinear variational problems via

    finite-element approximations,” Comput. Math. Appl., vol. 2, pp. 17–40, 1976.

  • M.-D. Iordache, J. Bioucas-Dias, and A. Plaza, "Sparse unmixing of hyperspectral data",

    IEEE Transactions on Geoscience and Remote Sensing, vol. 49, no. 6, pp. 2014-2039, 2011.

    References

    E. Esser, “Applications of Lagrangian-based alternating direction methods and connections to split-

    Bregman,” Comput. Appl. Math., Univ. California, Los Angeles, Tech. Rep. 09–31, 2009.

    R. Glowinski and A. Marroco, “Sur l’approximation, par éléments finis d’ordre un, et la

    résolution, par pénalisation-dualité, d’une classe de problèmes de Dirichlet non linéaires,” Rev.

    Française d’Automat. Inf. Recherche Opérationnelle, vol. 9, pp. 41–76, 1975.

    T. Goldstein and S. Osher, “The split Bregman algorithm for regularized problems,” Comput.

    Appl. Math., Univ. California, Los Angeles, Tech. Rep. 08–29, 2008.

    J. Eckstein and D. Bertsekas, “On the Douglas-Rachford splitting method and the proximal point

    algorithm for maximal monotone operators”, Mathematical Programming, vol. 55, pp. 293-318, 1992.

    B. He and X. Yuan, “On convergence rate of the Douglas-Rachford operator splitting method”,

    2011.

  • References

    J. Nascimento and J. Bioucas-Dias, "Vertex component analysis: a fast algorithm to unmix

    hyperspectral data", IEEE Transactions on Geoscience and Remote Sensing, vol. 43, no. 4, pp. 898-910,

    2005.

    M.-D. Iordache, J. Bioucas-Dias, and A. Plaza, "Hyperspectral unmixing with sparse group lasso",

    in IEEE International Geoscience and Remote Sensing Symposium (IGARSS'11), Vancouver,

    Canada, 2011.

    M.-D. Iordache, J. Bioucas-Dias and A. Plaza, "Total variation spatial regularization for sparse

    hyperspectral unmixing", IEEE Transactions on Geoscience and Remote Sensing, vol. PP, no. 99,

    pp. 1-19, 2012.

    C. Li, T. Sun, K. Kelly, and Y. Zhang, “A Compressive Sensing and Unmixing Scheme for

    Hyperspectral Data Processing”, IEEE Transactions on Image Processing, 2011

    G. Martin, J. Bioucas-Dias, and A. Plaza, “Hyperspectral coded aperture (HYCA): a new

    technique for hyperspectral compressive sensing”, IEEE International Geoscience and

    Remote Sensing Symposium (IGARSS'12), Munich, Germany, 2012.

  • References

    B. Turlach, W. Venables, and S. Wright, “Simultaneous variable selection,” Technometrics,

    vol. 47, no. 3, pp. 349-363, 2005.

    R. Tibshirani, “Regression shrinkage and selection via the lasso”, Journal of the Royal Statistical

    Society, Series B (Methodological), vol. 58, no. 1, pp. 267-288, 1996.

    X. Zhao, F. Wang, T. Huang, M. Ng and R. Plemmons, “Deblurring and Sparse Unmixing For

    Hyperspectral Images”, Preprint, May 2012.

    Q. Zhang, R. Plemmons, D. Kittle, D. Brady, S. Prasad, “Joint segmentation and reconstruction of

    hyperspectral data with compressed measurements”, Applied Optics, 2011.

    P. Sprechmann, I. Ramirez, G. Sapiro, and Y. Eldar, “Collaborative hierarchical sparse

    modeling”, Annual Conference on Information Sciences and Systems (CISS),

    pp. 1-6, 2010.