Using Bayesian Neural Networks for UQ of Hyperspectral ... ·...

40
Sandia National Laboratories is a multimis- sion laboratory managed and operated by National Technology and Engineering So- lutions of Sandia, LLC., a wholly owned subsidiary of Honeywell International, Inc., for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-NA-0003525. SAND NO. 2019-3409 C Using Bayesian Neural Networks for UQ of Hyperspectral Image Target Detection AUTHORS Daniel Ries, Braden Smith,Josh Zollweg, Lauren Hund

Transcript of Using Bayesian Neural Networks for UQ of Hyperspectral ... ·...

Page 1: Using Bayesian Neural Networks for UQ of Hyperspectral ... · SandiaNationalLaboratoriesisamultimis-sionlaboratorymanagedandoperatedby NationalTechnologyandEngineeringSo-lutionsofSandia,LLC.,awhollyowned

Sandia National Laboratories is a multimis-sion laboratory managed and operated byNational Technology and Engineering So-lutions of Sandia, LLC., a wholly owned

subsidiary of Honeywell International, Inc.,for the U.S. Department of Energy’s National

Nuclear Security Administration under contractDE-NA-0003525. SAND NO. 2019-3409 C

Using Bayesian Neural Networks for UQof Hyperspectral Image Target Detection

AUTHORS

Daniel Ries, Braden Smith,Josh Zollweg,Lauren Hund

Page 2: Using Bayesian Neural Networks for UQ of Hyperspectral ... · SandiaNationalLaboratoriesisamultimis-sionlaboratorymanagedandoperatedby NationalTechnologyandEngineeringSo-lutionsofSandia,LLC.,awhollyowned

Outline

1. What is a neural network?2. What is a Bayesian neural network (BNN)?3. Estimating BNNs with variational inference4. Hyperspectral images and Megascene5. BNN on Megascene6. Concluding remarks

Page 3: Using Bayesian Neural Networks for UQ of Hyperspectral ... · SandiaNationalLaboratoriesisamultimis-sionlaboratorymanagedandoperatedby NationalTechnologyandEngineeringSo-lutionsofSandia,LLC.,awhollyowned

What is a neural network?

Page 4: Using Bayesian Neural Networks for UQ of Hyperspectral ... · SandiaNationalLaboratoriesisamultimis-sionlaboratorymanagedandoperatedby NationalTechnologyandEngineeringSo-lutionsofSandia,LLC.,awhollyowned

What is a neural network?

Page 5: Using Bayesian Neural Networks for UQ of Hyperspectral ... · SandiaNationalLaboratoriesisamultimis-sionlaboratorymanagedandoperatedby NationalTechnologyandEngineeringSo-lutionsofSandia,LLC.,awhollyowned

Logistic Regression as a Neural Network

Suppose Y1, ...,Yn are binary random variables, and x1, ...,xn arep-length vectors

If want to estimate P(Yi = 1|xi), we could use logistic regression:

Yiiid∼ Bernoulli(πi)

log(

πi1− πi

)= x′

Page 6: Using Bayesian Neural Networks for UQ of Hyperspectral ... · SandiaNationalLaboratoriesisamultimis-sionlaboratorymanagedandoperatedby NationalTechnologyandEngineeringSo-lutionsofSandia,LLC.,awhollyowned

Logistic Regression as a Neural Network

𝑥2

.

.

.

𝑥1

𝑥𝑝

𝑃 𝑌 = 1 𝒙𝑌

𝑖=1

𝑝

𝑥𝑖𝑤𝑖 + 𝑏

= 𝑧

𝑔(𝜋)𝑓 𝑧 = 𝜋

= 𝜋

Page 7: Using Bayesian Neural Networks for UQ of Hyperspectral ... · SandiaNationalLaboratoriesisamultimis-sionlaboratorymanagedandoperatedby NationalTechnologyandEngineeringSo-lutionsofSandia,LLC.,awhollyowned

A Simple Neural Network

𝑥2

.

.

.

𝑥1

𝑥𝑝

𝑃 𝑌 = 1 𝒙𝑌

𝑖=1

𝑝

𝑥𝑖𝑤1𝑖 + 𝑏1

= 𝑧11

𝑖=1

𝑝

𝑥𝑖𝑤2𝑖 + 𝑏2

= 𝑧12

𝑖=1

2

𝑣1𝑖𝑤3𝑖 + 𝑏3

= 𝑧21

𝑓2 𝑧21 = 𝜋 𝑔(𝜋)

𝑓1 𝑧11 = 𝑣11

𝑓1 𝑧12 = 𝑣12= 𝜋

Page 8: Using Bayesian Neural Networks for UQ of Hyperspectral ... · SandiaNationalLaboratoriesisamultimis-sionlaboratorymanagedandoperatedby NationalTechnologyandEngineeringSo-lutionsofSandia,LLC.,awhollyowned

More Complicated Neural Network

𝑥2

.

.

.

𝑥1

𝑥𝑝

𝑃(𝑌 = 1|𝒙) 𝑌

𝑖=1

𝑝

𝑥𝑖𝑤11𝑖 + 𝑏11

= 𝑧11

𝑖=1

𝑝

𝑥𝑖𝑤12𝑖 + 𝑏12

= 𝑧12

𝑖=1

𝑝

𝑥𝑖𝑤1𝑖 + 𝑏13

= 𝑧13

𝑖=1

𝑝

𝑥𝑖𝑤1𝐾1𝑖 + 𝑏1𝐾1

= 𝑧1𝐾1

𝑖=1

𝐾1

𝑣1𝑖𝑤21𝑖 + 𝑏21

= 𝑧21

𝑖=1

𝐾1

𝑣1𝑖𝑤22𝑖 + 𝑏22

= 𝑧22

𝑖=1

𝐾1

𝑣1𝑖𝑤23𝑖 + 𝑏23

= 𝑧23

𝑖=1

𝐾1

𝑣1𝑖𝑤2𝐾2𝑖 + 𝑏2𝐾2

= 𝑧2𝐾2

𝑖=1

𝐾2

𝑣𝐿𝑖𝑤𝐿𝑖 + 𝑏𝐿

= 𝑧𝐿

.

.

.

.

.

.

𝑓 𝑧𝐿

𝑓1 𝑧1𝑖= 𝑣1𝑖

𝑓𝐿 𝑧𝐿𝑖= 𝑣𝐿𝑖

= 𝜋

𝑔(𝜋)

Page 9: Using Bayesian Neural Networks for UQ of Hyperspectral ... · SandiaNationalLaboratoriesisamultimis-sionlaboratorymanagedandoperatedby NationalTechnologyandEngineeringSo-lutionsofSandia,LLC.,awhollyowned

What is a Bayesian Neural Network?

For those of you who “do machine learning, not statistics”:BNNs are simply NNs that have an implied loss function (youdon’t have to choose and you get UQ for free!)

For those of you who “do statistics, not machine learning”:BNNs are simply a highly nonlinear model where you put a prioron your parameters (and a Bernoulli likelihood for our binarydata)

For those of you who “do both”:You’re not fooling anyone, I know you associate with one or theother.

For those of you who “do neither”:You are already “resting your eyes.”

Page 10: Using Bayesian Neural Networks for UQ of Hyperspectral ... · SandiaNationalLaboratoriesisamultimis-sionlaboratorymanagedandoperatedby NationalTechnologyandEngineeringSo-lutionsofSandia,LLC.,awhollyowned

What is a Bayesian Neural Network?

For those of you who “do machine learning, not statistics”:BNNs are simply NNs that have an implied loss function (youdon’t have to choose and you get UQ for free!)

For those of you who “do statistics, not machine learning”:BNNs are simply a highly nonlinear model where you put a prioron your parameters (and a Bernoulli likelihood for our binarydata)

For those of you who “do both”:You’re not fooling anyone, I know you associate with one or theother.

For those of you who “do neither”:You are already “resting your eyes.”

Page 11: Using Bayesian Neural Networks for UQ of Hyperspectral ... · SandiaNationalLaboratoriesisamultimis-sionlaboratorymanagedandoperatedby NationalTechnologyandEngineeringSo-lutionsofSandia,LLC.,awhollyowned

What is a Bayesian Neural Network?

For those of you who “do machine learning, not statistics”:BNNs are simply NNs that have an implied loss function (youdon’t have to choose and you get UQ for free!)

For those of you who “do statistics, not machine learning”:BNNs are simply a highly nonlinear model where you put a prioron your parameters (and a Bernoulli likelihood for our binarydata)

For those of you who “do both”:You’re not fooling anyone, I know you associate with one or theother.

For those of you who “do neither”:You are already “resting your eyes.”

Page 12: Using Bayesian Neural Networks for UQ of Hyperspectral ... · SandiaNationalLaboratoriesisamultimis-sionlaboratorymanagedandoperatedby NationalTechnologyandEngineeringSo-lutionsofSandia,LLC.,awhollyowned

What is a Bayesian Neural Network?

For those of you who “do machine learning, not statistics”:BNNs are simply NNs that have an implied loss function (youdon’t have to choose and you get UQ for free!)

For those of you who “do statistics, not machine learning”:BNNs are simply a highly nonlinear model where you put a prioron your parameters (and a Bernoulli likelihood for our binarydata)

For those of you who “do both”:You’re not fooling anyone, I know you associate with one or theother.

For those of you who “do neither”:You are already “resting your eyes.”

Page 13: Using Bayesian Neural Networks for UQ of Hyperspectral ... · SandiaNationalLaboratoriesisamultimis-sionlaboratorymanagedandoperatedby NationalTechnologyandEngineeringSo-lutionsofSandia,LLC.,awhollyowned

What is a Bayesian Neural Network?

For those of you who “do machine learning, not statistics”:BNNs are simply NNs that have an implied loss function (youdon’t have to choose and you get UQ for free!)

For those of you who “do statistics, not machine learning”:BNNs are simply a highly nonlinear model where you put a prioron your parameters (and a Bernoulli likelihood for our binarydata)

For those of you who “do both”:You’re not fooling anyone, I know you associate with one or theother.

For those of you who “do neither”:You are already “resting your eyes.”

Page 14: Using Bayesian Neural Networks for UQ of Hyperspectral ... · SandiaNationalLaboratoriesisamultimis-sionlaboratorymanagedandoperatedby NationalTechnologyandEngineeringSo-lutionsofSandia,LLC.,awhollyowned

An Example BNN

Yi ∼ Bernoulli(πi)

πi = f2(z21)

= f2

2∑j=1

f1

( p∑k=1

xkwjk + bj

)w2j + b3

wjk

iid∼ N (0, 1), ∀j, k

bliid∼ N (0, 1), l = 1, 2, 3

𝑥2

.

.

.

𝑥1

𝑥𝑝

𝑃 𝑌 = 1 𝒙𝑌

𝑖=1

𝑝

𝑥𝑖𝑤1𝑖 + 𝑏1

= 𝑧11

𝑖=1

𝑝

𝑥𝑖𝑤2𝑖 + 𝑏2

= 𝑧12

𝑖=1

2

𝑣1𝑖𝑤3𝑖 + 𝑏3

= 𝑧21

𝑓2 𝑧21 = 𝜋 𝑔(𝜋)

𝑓1 𝑧11 = 𝑣11

𝑓1 𝑧12 = 𝑣12= 𝜋

Page 15: Using Bayesian Neural Networks for UQ of Hyperspectral ... · SandiaNationalLaboratoriesisamultimis-sionlaboratorymanagedandoperatedby NationalTechnologyandEngineeringSo-lutionsofSandia,LLC.,awhollyowned

Estimating (or Training) BNNs

Bayesians love to sample their posterior distributions via MCMC

But this can be slowwww

Enter Variational Inference:• Instead approximate posterior with q(θ) by solving:

argminq∗

KL(q∗(θ)||p(w,b|y))

= argminq∗

KL(q∗(θ)||p(w,b, y))

• Put restrictions on set of q∗

• Result is a loss function we can do gradient descent as usualon

Page 16: Using Bayesian Neural Networks for UQ of Hyperspectral ... · SandiaNationalLaboratoriesisamultimis-sionlaboratorymanagedandoperatedby NationalTechnologyandEngineeringSo-lutionsofSandia,LLC.,awhollyowned

Estimating (or Training) BNNs

Bayesians love to sample their posterior distributions via MCMCBut this can be slowwww

Enter Variational Inference:• Instead approximate posterior with q(θ) by solving:

argminq∗

KL(q∗(θ)||p(w,b|y))

= argminq∗

KL(q∗(θ)||p(w,b, y))

• Put restrictions on set of q∗

• Result is a loss function we can do gradient descent as usualon

Page 17: Using Bayesian Neural Networks for UQ of Hyperspectral ... · SandiaNationalLaboratoriesisamultimis-sionlaboratorymanagedandoperatedby NationalTechnologyandEngineeringSo-lutionsofSandia,LLC.,awhollyowned

Estimating (or Training) BNNs

Bayesians love to sample their posterior distributions via MCMCBut this can be slowwww

Enter Variational Inference:

• Instead approximate posterior with q(θ) by solving:

argminq∗

KL(q∗(θ)||p(w,b|y))

= argminq∗

KL(q∗(θ)||p(w,b, y))

• Put restrictions on set of q∗

• Result is a loss function we can do gradient descent as usualon

Page 18: Using Bayesian Neural Networks for UQ of Hyperspectral ... · SandiaNationalLaboratoriesisamultimis-sionlaboratorymanagedandoperatedby NationalTechnologyandEngineeringSo-lutionsofSandia,LLC.,awhollyowned

Estimating (or Training) BNNs

Bayesians love to sample their posterior distributions via MCMCBut this can be slowwww

Enter Variational Inference:• Instead approximate posterior with q(θ) by solving:

argminq∗

KL(q∗(θ)||p(w,b|y))

= argminq∗

KL(q∗(θ)||p(w,b, y))

• Put restrictions on set of q∗

• Result is a loss function we can do gradient descent as usualon

Page 19: Using Bayesian Neural Networks for UQ of Hyperspectral ... · SandiaNationalLaboratoriesisamultimis-sionlaboratorymanagedandoperatedby NationalTechnologyandEngineeringSo-lutionsofSandia,LLC.,awhollyowned

Estimating (or Training) BNNs

Bayesians love to sample their posterior distributions via MCMCBut this can be slowwww

Enter Variational Inference:• Instead approximate posterior with q(θ) by solving:

argminq∗

KL(q∗(θ)||p(w,b|y))

= argminq∗

KL(q∗(θ)||p(w,b, y))

• Put restrictions on set of q∗

• Result is a loss function we can do gradient descent as usualon

Page 20: Using Bayesian Neural Networks for UQ of Hyperspectral ... · SandiaNationalLaboratoriesisamultimis-sionlaboratorymanagedandoperatedby NationalTechnologyandEngineeringSo-lutionsofSandia,LLC.,awhollyowned

Estimating (or Training) BNNs

Bayesians love to sample their posterior distributions via MCMCBut this can be slowwww

Enter Variational Inference:• Instead approximate posterior with q(θ) by solving:

argminq∗

KL(q∗(θ)||p(w,b|y))

= argminq∗

KL(q∗(θ)||p(w,b, y))

• Put restrictions on set of q∗

• Result is a loss function we can do gradient descent as usualon

Page 21: Using Bayesian Neural Networks for UQ of Hyperspectral ... · SandiaNationalLaboratoriesisamultimis-sionlaboratorymanagedandoperatedby NationalTechnologyandEngineeringSo-lutionsofSandia,LLC.,awhollyowned

Estimating (or Training) BNNs

Bayesians love to sample their posterior distributions via MCMCBut this can be slowwww

Enter Variational Inference:• Instead approximate posterior with q(θ) by solving:

argminq∗

KL(q∗(θ)||p(w,b|y))

= argminq∗

KL(q∗(θ)||p(w,b, y))

• Put restrictions on set of q∗

• Result is a loss function we can do gradient descent as usualon

Page 22: Using Bayesian Neural Networks for UQ of Hyperspectral ... · SandiaNationalLaboratoriesisamultimis-sionlaboratorymanagedandoperatedby NationalTechnologyandEngineeringSo-lutionsofSandia,LLC.,awhollyowned

Hyperspectral Images (HSI)

• Airborne spectrometers construct a (x, y, z) tensor• x and y describe the spatial dimension• z describes the spectra at a single pixel (x, y)

• HSI Target detection is where specific materials are identifiedby their reflectance

• The target object might be smaller than the projection of apixel on the ground

Page 23: Using Bayesian Neural Networks for UQ of Hyperspectral ... · SandiaNationalLaboratoriesisamultimis-sionlaboratorymanagedandoperatedby NationalTechnologyandEngineeringSo-lutionsofSandia,LLC.,awhollyowned

Megascene

• Simulated HSI scene using DIRSIG model• Targets inserted using paint reflectanceand bi-directional reflectancedistribution function data from NEFDS

• 3 environments, and 3 times of day for 9total scenes

• 125 targets placed randomly throughouteach scene

• Targets range from large to subpixel(radius of 0.1m to 4m), size distributednearly uniform

• Some targets only partially visible

Page 24: Using Bayesian Neural Networks for UQ of Hyperspectral ... · SandiaNationalLaboratoriesisamultimis-sionlaboratorymanagedandoperatedby NationalTechnologyandEngineeringSo-lutionsofSandia,LLC.,awhollyowned

BNN on Megascene

• Architecture: Input-10-5-2-Output• Activations: Hyperbolic tan between hidden layers, inverselogit for output

• Priors: Standard normal on all weights and biases• Optimizer: Stochastic gradient descent with momentum• Training Data: Left half of mid latitude summer environmentat 1200

• Test Data: Right half of all environments at all times• Software: Edward and Tensorflow

Page 25: Using Bayesian Neural Networks for UQ of Hyperspectral ... · SandiaNationalLaboratoriesisamultimis-sionlaboratorymanagedandoperatedby NationalTechnologyandEngineeringSo-lutionsofSandia,LLC.,awhollyowned

BNN on Megascene

• Architecture: Input-10-5-2-Output

• Activations: Hyperbolic tan between hidden layers, inverselogit for output

• Priors: Standard normal on all weights and biases• Optimizer: Stochastic gradient descent with momentum• Training Data: Left half of mid latitude summer environmentat 1200

• Test Data: Right half of all environments at all times• Software: Edward and Tensorflow

Page 26: Using Bayesian Neural Networks for UQ of Hyperspectral ... · SandiaNationalLaboratoriesisamultimis-sionlaboratorymanagedandoperatedby NationalTechnologyandEngineeringSo-lutionsofSandia,LLC.,awhollyowned

BNN on Megascene

• Architecture: Input-10-5-2-Output• Activations: Hyperbolic tan between hidden layers, inverselogit for output

• Priors: Standard normal on all weights and biases• Optimizer: Stochastic gradient descent with momentum• Training Data: Left half of mid latitude summer environmentat 1200

• Test Data: Right half of all environments at all times• Software: Edward and Tensorflow

Page 27: Using Bayesian Neural Networks for UQ of Hyperspectral ... · SandiaNationalLaboratoriesisamultimis-sionlaboratorymanagedandoperatedby NationalTechnologyandEngineeringSo-lutionsofSandia,LLC.,awhollyowned

BNN on Megascene

• Architecture: Input-10-5-2-Output• Activations: Hyperbolic tan between hidden layers, inverselogit for output

• Priors: Standard normal on all weights and biases

• Optimizer: Stochastic gradient descent with momentum• Training Data: Left half of mid latitude summer environmentat 1200

• Test Data: Right half of all environments at all times• Software: Edward and Tensorflow

Page 28: Using Bayesian Neural Networks for UQ of Hyperspectral ... · SandiaNationalLaboratoriesisamultimis-sionlaboratorymanagedandoperatedby NationalTechnologyandEngineeringSo-lutionsofSandia,LLC.,awhollyowned

BNN on Megascene

• Architecture: Input-10-5-2-Output• Activations: Hyperbolic tan between hidden layers, inverselogit for output

• Priors: Standard normal on all weights and biases• Optimizer: Stochastic gradient descent with momentum

• Training Data: Left half of mid latitude summer environmentat 1200

• Test Data: Right half of all environments at all times• Software: Edward and Tensorflow

Page 29: Using Bayesian Neural Networks for UQ of Hyperspectral ... · SandiaNationalLaboratoriesisamultimis-sionlaboratorymanagedandoperatedby NationalTechnologyandEngineeringSo-lutionsofSandia,LLC.,awhollyowned

BNN on Megascene

• Architecture: Input-10-5-2-Output• Activations: Hyperbolic tan between hidden layers, inverselogit for output

• Priors: Standard normal on all weights and biases• Optimizer: Stochastic gradient descent with momentum• Training Data: Left half of mid latitude summer environmentat 1200

• Test Data: Right half of all environments at all times• Software: Edward and Tensorflow

Page 30: Using Bayesian Neural Networks for UQ of Hyperspectral ... · SandiaNationalLaboratoriesisamultimis-sionlaboratorymanagedandoperatedby NationalTechnologyandEngineeringSo-lutionsofSandia,LLC.,awhollyowned

BNN on Megascene

• Architecture: Input-10-5-2-Output• Activations: Hyperbolic tan between hidden layers, inverselogit for output

• Priors: Standard normal on all weights and biases• Optimizer: Stochastic gradient descent with momentum• Training Data: Left half of mid latitude summer environmentat 1200

• Test Data: Right half of all environments at all times

• Software: Edward and Tensorflow

Page 31: Using Bayesian Neural Networks for UQ of Hyperspectral ... · SandiaNationalLaboratoriesisamultimis-sionlaboratorymanagedandoperatedby NationalTechnologyandEngineeringSo-lutionsofSandia,LLC.,awhollyowned

BNN on Megascene

• Architecture: Input-10-5-2-Output• Activations: Hyperbolic tan between hidden layers, inverselogit for output

• Priors: Standard normal on all weights and biases• Optimizer: Stochastic gradient descent with momentum• Training Data: Left half of mid latitude summer environmentat 1200

• Test Data: Right half of all environments at all times• Software: Edward and Tensorflow

Page 32: Using Bayesian Neural Networks for UQ of Hyperspectral ... · SandiaNationalLaboratoriesisamultimis-sionlaboratorymanagedandoperatedby NationalTechnologyandEngineeringSo-lutionsofSandia,LLC.,awhollyowned

Test Set Performance

Full test set. High confidence set.

High confidence set: 80% credible interval forπ ∈ (0, 0.2) ∪ (0.8, 1)

Page 33: Using Bayesian Neural Networks for UQ of Hyperspectral ... · SandiaNationalLaboratoriesisamultimis-sionlaboratorymanagedandoperatedby NationalTechnologyandEngineeringSo-lutionsofSandia,LLC.,awhollyowned

Test Set Performance

Page 34: Using Bayesian Neural Networks for UQ of Hyperspectral ... · SandiaNationalLaboratoriesisamultimis-sionlaboratorymanagedandoperatedby NationalTechnologyandEngineeringSo-lutionsofSandia,LLC.,awhollyowned

Moving Forward

There is still a lot of work to be done on BNNs

• Explore deeper and wider, more complicated (convolutional)architectures

• Full tuning of hyperparameters• Explore effects of prior• Assess coverage of credible intervals

Page 35: Using Bayesian Neural Networks for UQ of Hyperspectral ... · SandiaNationalLaboratoriesisamultimis-sionlaboratorymanagedandoperatedby NationalTechnologyandEngineeringSo-lutionsofSandia,LLC.,awhollyowned

Thank you for listening!

[email protected]

Page 36: Using Bayesian Neural Networks for UQ of Hyperspectral ... · SandiaNationalLaboratoriesisamultimis-sionlaboratorymanagedandoperatedby NationalTechnologyandEngineeringSo-lutionsofSandia,LLC.,awhollyowned

Test Set Performance

BNN

Stan-dardNN

Full test set High confidence set

Page 37: Using Bayesian Neural Networks for UQ of Hyperspectral ... · SandiaNationalLaboratoriesisamultimis-sionlaboratorymanagedandoperatedby NationalTechnologyandEngineeringSo-lutionsofSandia,LLC.,awhollyowned

Test Set Performance

Page 38: Using Bayesian Neural Networks for UQ of Hyperspectral ... · SandiaNationalLaboratoriesisamultimis-sionlaboratorymanagedandoperatedby NationalTechnologyandEngineeringSo-lutionsofSandia,LLC.,awhollyowned

Abundance vs Probability

Page 39: Using Bayesian Neural Networks for UQ of Hyperspectral ... · SandiaNationalLaboratoriesisamultimis-sionlaboratorymanagedandoperatedby NationalTechnologyandEngineeringSo-lutionsofSandia,LLC.,awhollyowned
Page 40: Using Bayesian Neural Networks for UQ of Hyperspectral ... · SandiaNationalLaboratoriesisamultimis-sionlaboratorymanagedandoperatedby NationalTechnologyandEngineeringSo-lutionsofSandia,LLC.,awhollyowned