Uncertainty in Bayesian Neural Nets
![Page 1: Uncertainty in Bayesian Neural Nets](https://reader036.fdocuments.net/reader036/viewer/2022071005/5fc2a276693fd84a34519fe5/html5/thumbnails/1.jpg)
Uncertainty in Bayesian Neural Nets
August 4, 2017
![Page 2: Uncertainty in Bayesian Neural Nets](https://reader036.fdocuments.net/reader036/viewer/2022071005/5fc2a276693fd84a34519fe5/html5/thumbnails/2.jpg)
Overview
• BNN review
• Visualization experiments
• BNN results
![Page 3: Uncertainty in Bayesian Neural Nets](https://reader036.fdocuments.net/reader036/viewer/2022071005/5fc2a276693fd84a34519fe5/html5/thumbnails/3.jpg)
BNN
Prior: $p(W)$
Likelihood: $p(Y \mid X, W)$
Approximate posterior: $q(W)$
Posterior predictive: $\mathbb{E}_{q(W)}[p(y \mid x, W)]$
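The posterior predictive expectation is typically approximated by averaging predictions over weight samples drawn from q(W). The following is a minimal NumPy sketch of that Monte Carlo estimate. The linear softmax classifier, the Gaussian form of q(W), and the names `posterior_predictive`, `mu`, and `sigma` are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def posterior_predictive(x, mu, sigma, n_samples=100, seed=0):
    """Monte Carlo estimate of E_{q(W)}[p(y | x, W)].

    Assumption (illustration only): a linear softmax classifier and a
    fully factorized Gaussian q(W) = N(mu, sigma^2).
    """
    rng = np.random.default_rng(seed)
    probs = np.zeros((x.shape[0], mu.shape[1]))
    for _ in range(n_samples):
        W = mu + sigma * rng.standard_normal(mu.shape)  # W ~ q(W)
        probs += softmax(x @ W)                         # p(y | x, W)
    return probs / n_samples

# Toy inputs: two points with two features, three classes.
x = np.array([[1.0, -0.5], [0.2, 0.3]])
mu = np.array([[1.0, -1.0, 0.0], [0.0, 0.5, -0.5]])
sigma = 0.1 * np.ones_like(mu)
p = posterior_predictive(x, mu, sigma)
```

Each row of `p` is an averaged class distribution, so it still sums to one.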
![Page 4: Uncertainty in Bayesian Neural Nets](https://reader036.fdocuments.net/reader036/viewer/2022071005/5fc2a276693fd84a34519fe5/html5/thumbnails/4.jpg)
BNN
• Variational inference
• Maximize a lower bound on the marginal log-likelihood

$$\log p(Y \mid X) \ge \mathbb{E}_{q(W)}\left[\log p(Y \mid X, W) + \log p(W) - \log q(W)\right]$$

Terms: likelihood $\log p(Y \mid X, W)$, prior $\log p(W)$, approximate posterior $\log q(W)$.
[Graphical model: nodes X, W, Y]
Dependent on the number of datapoints:

$$\frac{1}{M}\sum_{i=1}^{M} \log p(Y_i \mid X_i, W) + \frac{1}{N}\log\frac{p(W)}{q(W)}$$
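For reference, the lower bound follows from Jensen's inequality applied to the marginal likelihood. The derivation below is the standard one and uses only the symbols defined on the slide:

$$
\begin{aligned}
\log p(Y \mid X) &= \log \int p(Y \mid X, W)\, p(W)\, dW \\
&= \log \mathbb{E}_{q(W)}\!\left[\frac{p(Y \mid X, W)\, p(W)}{q(W)}\right] \\
&\ge \mathbb{E}_{q(W)}\!\left[\log p(Y \mid X, W) + \log p(W) - \log q(W)\right].
\end{aligned}
$$

The slide does not define M and N. Assuming M is the minibatch size and N the number of training datapoints, the per-datapoint form averages the likelihood over a minibatch and down-weights the prior/posterior term by 1/N, so that term matters less as the dataset grows.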
![Page 5: Uncertainty in Bayesian Neural Nets](https://reader036.fdocuments.net/reader036/viewer/2022071005/5fc2a276693fd84a34519fe5/html5/thumbnails/5.jpg)
Different priors and posterior approximations
• Priors $p(W)$:
  • $\mathcal{N}(0, \sigma^2)$
  • Scale mixtures of Normals
  • Sparsity inducing
• Posterior approximations $q(W)$:
  • Delta peak: $q(W) = \delta(W)$
  • Fully factorized Gaussians: $q(W) = \prod_i \mathcal{N}(w_i \mid \mu_i, \sigma_i^2)$
  • Bernoulli dropout
  • Gaussian dropout
  • MNF
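As an illustration of the fully factorized Gaussian case, the sketch below draws a reparameterized weight sample and evaluates the closed-form KL divergence to a standard normal prior. It assumes a prior of N(0, 1), which is the N(0, σ²) prior with σ = 1, and parameterizes the posterior by `mu` and `log_sigma`. The function names are hypothetical.

```python
import numpy as np

def sample_ffg(mu, log_sigma, rng):
    """Reparameterized draw W = mu + sigma * eps with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(log_sigma) * eps

def kl_ffg_to_standard_normal(mu, log_sigma):
    """KL(q || p) for q = prod_i N(mu_i, sigma_i^2) and p = prod_i N(0, 1).

    Per weight: 0.5 * (sigma^2 + mu^2 - 1 - log sigma^2).
    """
    sigma2 = np.exp(2.0 * log_sigma)
    return 0.5 * np.sum(sigma2 + mu**2 - 1.0 - 2.0 * log_sigma)

rng = np.random.default_rng(0)
mu = np.zeros((4, 3))
log_sigma = np.zeros((4, 3))          # sigma = 1 everywhere
W = sample_ffg(mu, log_sigma, rng)
kl = kl_ffg_to_standard_normal(mu, log_sigma)
```

With `mu = 0` and `sigma = 1` the posterior equals the prior, so the KL term vanishes.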
![Page 6: Uncertainty in Bayesian Neural Nets](https://reader036.fdocuments.net/reader036/viewer/2022071005/5fc2a276693fd84a34519fe5/html5/thumbnails/6.jpg)
Multiplicative Normalizing Flows (MNF)
Christos Louizos, Max Welling, ICML 2017
• Augment the model with an auxiliary variable
[Graphical models: a generative model and an inference model over X, Y, W, and the auxiliary variable Z]

$$z \sim q(z), \qquad W \sim q(W \mid z)$$

$$q(W) = \int q(W \mid z)\, q(z)\, dz$$

$$q(W \mid z) = \prod_{i=1}^{D_{in}} \prod_{j=1}^{D_{out}} \mathcal{N}(z_i \mu_{ij}, \sigma_{ij}^2)$$

New lower bound:

$$\log p(Y \mid X) \ge \mathbb{E}_{q(W)}\left[\log p(Y \mid X, W) + \log p(W) - \log q(W \mid z) + \log r(z \mid W) - \log q(z)\right]$$

Normalizing flows
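The two-stage sampling z ~ q(z), W ~ q(W | z) can be sketched as follows. This is a simplification under a stated assumption: q(z) is taken to be a plain factorized Gaussian, whereas MNF passes z through a normalizing flow, which is omitted here. The function name and arguments are hypothetical.

```python
import numpy as np

def sample_mnf_weights(mu, sigma, z_mean, z_std, rng):
    """Draw z ~ q(z), then W ~ q(W | z) = prod_ij N(z_i * mu_ij, sigma_ij^2).

    Assumption: q(z) is a factorized Gaussian; the normalizing flow that
    MNF applies to z is left out of this sketch.
    """
    d_in, d_out = mu.shape
    z = z_mean + z_std * rng.standard_normal(d_in)   # one z_i per input unit
    eps = rng.standard_normal((d_in, d_out))
    W = z[:, None] * mu + sigma * eps                # mean is scaled row-wise by z_i
    return z, W

rng = np.random.default_rng(0)
mu = rng.standard_normal((3, 2))
sigma = 0.05 * np.ones((3, 2))
z, W = sample_mnf_weights(mu, sigma, np.ones(3), 0.1 * np.ones(3), rng)
```

The multiplicative noise `z` is shared across each row of `W`, which is what makes the marginal q(W) richer than a fully factorized Gaussian.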
![Page 7: Uncertainty in Bayesian Neural Nets](https://reader036.fdocuments.net/reader036/viewer/2022071005/5fc2a276693fd84a34519fe5/html5/thumbnails/7.jpg)
Predictive Distributions
![Page 8: Uncertainty in Bayesian Neural Nets](https://reader036.fdocuments.net/reader036/viewer/2022071005/5fc2a276693fd84a34519fe5/html5/thumbnails/8.jpg)
Uncertainties
• Model uncertainty (epistemic uncertainty)
  • Captures ignorance about the model that is most suitable to explain the data
  • Reduces as the amount of observed data increases
  • Summarized by generating function realizations from our distribution
• Measurement noise (aleatoric uncertainty)
  • Noise inherent in the environment, captured in the likelihood function
• Predictive uncertainty
  • Entropy of prediction = $H[p(y \mid x)]$
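The entropy of a categorical prediction can be computed as below. This is a small sketch; the helper name `entropy` and the clipping constant are illustrative choices.

```python
import numpy as np

def entropy(p, axis=-1, eps=1e-12):
    """Shannon entropy H[p] = -sum_k p_k log p_k, in nats."""
    p = np.clip(p, eps, 1.0)   # avoid log(0)
    return -np.sum(p * np.log(p), axis=axis)

uniform = np.full(10, 0.1)     # maximally uncertain over 10 classes
one_hot = np.eye(10)[3]        # fully confident prediction
h_uniform = entropy(uniform)
h_one_hot = entropy(one_hot)
```

A uniform prediction over 10 classes attains the maximum entropy, log 10, while a one-hot prediction has entropy close to zero.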
![Page 9: Uncertainty in Bayesian Neural Nets](https://reader036.fdocuments.net/reader036/viewer/2022071005/5fc2a276693fd84a34519fe5/html5/thumbnails/9.jpg)
Visualization Experiments
• 1D regression
• Classification of MNIST (visualize in 2D)
• Questions:
  • Activations
  • Number of samples
  • Held-out classes
  • Type of uncertainties
![Page 10: Uncertainty in Bayesian Neural Nets](https://reader036.fdocuments.net/reader036/viewer/2022071005/5fc2a276693fd84a34519fe5/html5/thumbnails/10.jpg)
BNNs with Different Activation Functions
Sigmoid: $(1 + e^{-x})^{-1}$
Tanh
Softplus: $\ln(1 + e^{x})$
ReLU: $\max(0, x)$
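For concreteness, the four activation functions can be written in NumPy as follows. The softplus uses `np.logaddexp`, a numerically stable way to evaluate ln(1 + e^x); that is an implementation choice, not something the slides specify.

```python
import numpy as np

def sigmoid(x):
    """(1 + e^{-x})^{-1}"""
    return 1.0 / (1.0 + np.exp(-x))

def softplus(x):
    """ln(1 + e^x), computed stably as logaddexp(0, x)."""
    return np.logaddexp(0.0, x)

def relu(x):
    """max(0, x)"""
    return np.maximum(0.0, x)

tanh = np.tanh

xs = np.linspace(-3.0, 3.0, 7)
values = {f.__name__: f(xs) for f in (sigmoid, tanh, softplus, relu)}
```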
![Page 11: Uncertainty in Bayesian Neural Nets](https://reader036.fdocuments.net/reader036/viewer/2022071005/5fc2a276693fd84a34519fe5/html5/thumbnails/11.jpg)
Uncertainty of Decision Boundaries
• Setup:
  • Classification of MNIST
  • Train: 50000, Test: 10000
Architecture: 784-100-2-100-10
BNN: FFG (fully factorized Gaussian) posterior, $\mathcal{N}(0, 1)$ prior
Activations: Softplus
Panels: NN, BNN
![Page 12: Uncertainty in Bayesian Neural Nets](https://reader036.fdocuments.net/reader036/viewer/2022071005/5fc2a276693fd84a34519fe5/html5/thumbnails/12.jpg)
Decision Boundaries – 3 Samples
Plot of $\arg\max_y p(y \mid x)$ at each point
![Page 13: Uncertainty in Bayesian Neural Nets](https://reader036.fdocuments.net/reader036/viewer/2022071005/5fc2a276693fd84a34519fe5/html5/thumbnails/13.jpg)
Uncertainty of Decision Boundaries: Held-Out Classes
• Setup:
  • Classification of digits 0 to 4 (5 to 9 held out)
Architecture: 784-100-100-2-100-100-10
BNN: FFG (fully factorized Gaussian) posterior, $\mathcal{N}(0, 1)$ prior
Activations: Softplus
Panels: NN, BNN
![Page 14: Uncertainty in Bayesian Neural Nets](https://reader036.fdocuments.net/reader036/viewer/2022071005/5fc2a276693fd84a34519fe5/html5/thumbnails/14.jpg)
Where do you think the held-out classes will go?
Inside or outside the circle?
![Page 15: Uncertainty in Bayesian Neural Nets](https://reader036.fdocuments.net/reader036/viewer/2022071005/5fc2a276693fd84a34519fe5/html5/thumbnails/15.jpg)
Where do you think the held-out classes will go?
![Page 16: Uncertainty in Bayesian Neural Nets](https://reader036.fdocuments.net/reader036/viewer/2022071005/5fc2a276693fd84a34519fe5/html5/thumbnails/16.jpg)
Held-Out Classes
Unseen classes don’t get encoded as something far away; instead they are encoded near the mean
![Page 17: Uncertainty in Bayesian Neural Nets](https://reader036.fdocuments.net/reader036/viewer/2022071005/5fc2a276693fd84a34519fe5/html5/thumbnails/17.jpg)
Confidence of Predictions?
Maybe large areas have high entropy
Argmax vs Max
![Page 18: Uncertainty in Bayesian Neural Nets](https://reader036.fdocuments.net/reader036/viewer/2022071005/5fc2a276693fd84a34519fe5/html5/thumbnails/18.jpg)
Class Boundaries - Confidences
Sharp transitions
There isn’t much uncertain space: mostly uniform, high confidence
![Page 19: Uncertainty in Bayesian Neural Nets](https://reader036.fdocuments.net/reader036/viewer/2022071005/5fc2a276693fd84a34519fe5/html5/thumbnails/19.jpg)
Entropy
Panels: Argmax, Max, Entropy
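The argmax, max, and entropy maps can be produced by evaluating a classifier on a grid over the 2D space. The sketch below shows one way to do that. The random linear softmax `W` is only a stand-in for the trained network that maps 2D codes to class probabilities, and `grid_maps` is a hypothetical helper.

```python
import numpy as np

def softmax(logits):
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def grid_maps(predict_proba, lim=3.0, n=50):
    """Evaluate class probabilities on an n x n grid over [-lim, lim]^2."""
    xs = np.linspace(-lim, lim, n)
    g1, g2 = np.meshgrid(xs, xs)
    points = np.stack([g1.ravel(), g2.ravel()], axis=1)    # (n*n, 2)
    probs = predict_proba(points)                          # (n*n, n_classes)
    argmax_map = probs.argmax(axis=1).reshape(n, n)        # predicted class
    max_map = probs.max(axis=1).reshape(n, n)              # confidence
    safe = np.clip(probs, 1e-12, 1.0)
    entropy_map = -(probs * np.log(safe)).sum(axis=1).reshape(n, n)
    return argmax_map, max_map, entropy_map

rng = np.random.default_rng(0)
W = rng.standard_normal((2, 10))   # hypothetical stand-in for the 2D-to-10-class network
argmax_map, max_map, entropy_map = grid_maps(lambda pts: softmax(pts @ W))
```

The argmax map shows only which class wins at each point, while the max and entropy maps show how confident that prediction is, which is why the three views can look quite different.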
![Page 20: Uncertainty in Bayesian Neural Nets](https://reader036.fdocuments.net/reader036/viewer/2022071005/5fc2a276693fd84a34519fe5/html5/thumbnails/20.jpg)
Effect of Choice of Activation Function
• Softplus
• ReLU
• Tanh
![Page 21: Uncertainty in Bayesian Neural Nets](https://reader036.fdocuments.net/reader036/viewer/2022071005/5fc2a276693fd84a34519fe5/html5/thumbnails/21.jpg)
Softplus
Panels: Sample 1, Sample 2, Sample 3, Mean of $q(W)$, $\mathbb{E}_{q(W)}[p(y \mid x, w)]$
![Page 22: Uncertainty in Bayesian Neural Nets](https://reader036.fdocuments.net/reader036/viewer/2022071005/5fc2a276693fd84a34519fe5/html5/thumbnails/22.jpg)
ReLU
Panels: Sample 1, Sample 2, Sample 3, Mean of $q(W)$, $\mathbb{E}_{q(W)}[p(y \mid x, w)]$
![Page 23: Uncertainty in Bayesian Neural Nets](https://reader036.fdocuments.net/reader036/viewer/2022071005/5fc2a276693fd84a34519fe5/html5/thumbnails/23.jpg)
Tanh
Panels: Sample 1, Sample 2, Sample 3, Mean of $q(W)$, $\mathbb{E}_{q(W)}[p(y \mid x, w)]$
![Page 24: Uncertainty in Bayesian Neural Nets](https://reader036.fdocuments.net/reader036/viewer/2022071005/5fc2a276693fd84a34519fe5/html5/thumbnails/24.jpg)
Mix (Softplus, ReLU, Tanh)
Panels: Sample 1, Sample 2, Sample 3, Mean of $q(W)$, $\mathbb{E}_{q(W)}[p(y \mid x)]$
![Page 25: Uncertainty in Bayesian Neural Nets](https://reader036.fdocuments.net/reader036/viewer/2022071005/5fc2a276693fd84a34519fe5/html5/thumbnails/25.jpg)
Number of Datapoints
Columns: 25000, 10000, 1000, 100
Rows: Argmax, Max, Entropy
$\mathbb{E}_{q(W)}[p(y \mid x)]$
![Page 26: Uncertainty in Bayesian Neural Nets](https://reader036.fdocuments.net/reader036/viewer/2022071005/5fc2a276693fd84a34519fe5/html5/thumbnails/26.jpg)
Model vs Output Uncertainty
• Predictive uncertainty = $H[p(y \mid x)]$
• Output uncertainty: $H[p(y \mid x, \bar{w})]$, where $\bar{w}$ = mean of $q(w)$
  • Output has high entropy (on decision boundary)
• Model uncertainty: $H[\mathbb{E}_{q(W)}[p(y \mid x, w)]]$
  • High-variance predictions
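The two entropies on this slide can be estimated as in the sketch below: one at the mean of q(w), and one for the Monte Carlo average of the predictions over weight samples. The linear softmax model, the Gaussian q(w), and the function names are assumptions made for illustration.

```python
import numpy as np

def softmax(logits):
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def entropy(p):
    return -np.sum(p * np.log(np.clip(p, 1e-12, 1.0)), axis=-1)

def uncertainty_summaries(x, mu, sigma, n_samples=200, seed=0):
    """Return (H[p(y|x, w_bar)], H[E_q[p(y|x, w)]]) per input row.

    Assumption: linear softmax classifier, q(w) = N(mu, sigma^2),
    so w_bar = mu.
    """
    rng = np.random.default_rng(seed)
    p_at_mean = softmax(x @ mu)
    avg = np.zeros_like(p_at_mean)
    for _ in range(n_samples):
        W = mu + sigma * rng.standard_normal(mu.shape)   # W ~ q(w)
        avg += softmax(x @ W)
    return entropy(p_at_mean), entropy(avg / n_samples)

rng = np.random.default_rng(1)
x = rng.standard_normal((5, 4))
mu = rng.standard_normal((4, 3))
h_mean, h_avg = uncertainty_summaries(x, mu, 0.5 * np.ones_like(mu))
```

When the posterior collapses to a point (`sigma = 0`), every sample equals the mean and the two entropies coincide; any gap between them comes from disagreement among weight samples.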
![Page 27: Uncertainty in Bayesian Neural Nets](https://reader036.fdocuments.net/reader036/viewer/2022071005/5fc2a276693fd84a34519fe5/html5/thumbnails/27.jpg)
Model vs Output Uncertainty

100 training datapoints

| | Train | Test | Held Out |
|---|---|---|---|
| Model uncertainty | .07 | .26 | .43 |
| Output uncertainty | .03 | .15 | .25 |

25000 training datapoints

| | Train | Test | Held Out |
|---|---|---|---|
| Model uncertainty | .06 | .06 | .43 |
| Output uncertainty | .05 | .05 | .36 |

Small data: model uncertainty
Large data: output uncertainty
![Page 28: Uncertainty in Bayesian Neural Nets](https://reader036.fdocuments.net/reader036/viewer/2022071005/5fc2a276693fd84a34519fe5/html5/thumbnails/28.jpg)
Panels: NN, BNN, GP+NN
Adversarial Examples, Uncertainty, and Transfer Testing Robustness in Gaussian Process Hybrid Deep Networks (July 2017)
![Page 29: Uncertainty in Bayesian Neural Nets](https://reader036.fdocuments.net/reader036/viewer/2022071005/5fc2a276693fd84a34519fe5/html5/thumbnails/29.jpg)
Visualize landscape of likelihood
[Plot: $p(y_{train} \mid x_{train}, W)$ over axes $w_1$, $w_2$]
Dimension of W is large, so use a 2D auxiliary variable
![Page 30: Uncertainty in Bayesian Neural Nets](https://reader036.fdocuments.net/reader036/viewer/2022071005/5fc2a276693fd84a34519fe5/html5/thumbnails/30.jpg)
Visualize landscape of likelihood
• Auxiliary variable model
[Graphical models: a generative model and an inference model over X, Y, W, and the auxiliary variable Z]
Architecture: 784-100-100-2-10-10-10 (labels: NN, BNN)

$$z \sim q(z) \ \text{(2D)}, \qquad W \sim q(W \mid z), \qquad r(z \mid W)$$

$$q(W \mid z) = \delta(W \mid z)$$

$$q(W) = \int \delta(W \mid z)\, q(z)\, dz$$

$$\log p(Y \mid X) \ge \mathbb{E}_{q(W)}\left[\log p(Y \mid X, W) + \log p(W) - \log q(W \mid z) + \log r(z \mid W) - \log q(z)\right]$$

Labels: hyper-network, hypo-network
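With q(W | z) = δ(W | z), the weights are a deterministic function of the 2D variable z: a hyper-network maps z to the weights of the hypo-network that makes the predictions. The sketch below illustrates the idea. The linear hyper-network, the layer sizes (chosen to echo the 2-10-10-10 tail of the architecture), the omission of biases, and all function names are assumptions, not the presenter's implementation.

```python
import numpy as np

def init_hypernetwork(z_dim, layer_sizes, rng, scale=0.1):
    """Random linear hyper-network: one (A, b) pair per hypo-network layer."""
    params = []
    for d_in, d_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        n_weights = d_in * d_out
        A = scale * rng.standard_normal((z_dim, n_weights))
        b = scale * rng.standard_normal(n_weights)
        params.append((A, b, (d_in, d_out)))
    return params

def weights_from_z(z, params):
    """Deterministic map z -> W, i.e. q(W | z) = delta(W | z)."""
    return [(z @ A + b).reshape(shape) for A, b, shape in params]

def hypo_forward(x, weights):
    """Hypo-network: softplus hidden layers, softmax output (biases omitted)."""
    h = x
    for W in weights[:-1]:
        h = np.logaddexp(0.0, h @ W)
    logits = h @ weights[-1]
    logits = logits - logits.max(axis=-1, keepdims=True)
    p = np.exp(logits)
    return p / p.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
params = init_hypernetwork(z_dim=2, layer_sizes=[2, 10, 10, 10], rng=rng)
z = rng.standard_normal(2)             # z ~ q(z), taken as N(0, I) here
inputs = rng.standard_normal((5, 2))   # five hypothetical 2D codes
probs = hypo_forward(inputs, weights_from_z(z, params))
```

Because every weight setting corresponds to a single point z in the plane, quantities such as the likelihood can be plotted directly over z.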
![Page 31: Uncertainty in Bayesian Neural Nets](https://reader036.fdocuments.net/reader036/viewer/2022071005/5fc2a276693fd84a34519fe5/html5/thumbnails/31.jpg)
Decision Boundaries
Panels: $z_1$, $z_2$, $z_3$, $\mathbb{E}_{q(z)}[p(y \mid x, z)]$
![Page 32: Uncertainty in Bayesian Neural Nets](https://reader036.fdocuments.net/reader036/viewer/2022071005/5fc2a276693fd84a34519fe5/html5/thumbnails/32.jpg)
Likelihood Landscape
Panels: $\log p(y_{train} \mid x_{train}, W, z)$, $\log p(y_{test} \mid x_{test}, W, z)$
Axes: $z_1$, $z_2$
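A likelihood landscape of this kind can be computed by sweeping z over a 2D grid and evaluating the log-likelihood at the weights W(z). The sketch below does this for a toy model. The linear map from z to the weights of a linear softmax classifier, the random data, and the helper names are all illustrative assumptions.

```python
import numpy as np

def log_likelihood(z, x, y, A, b):
    """log p(y | x, W(z)) for a toy linear softmax model with W(z) = reshape(z A + b)."""
    W = (z @ A + b).reshape(x.shape[1], -1)
    logits = x @ W
    m = logits.max(axis=1, keepdims=True)
    log_norm = m + np.log(np.exp(logits - m).sum(axis=1, keepdims=True))
    log_probs = logits - log_norm                      # log-softmax
    return log_probs[np.arange(len(y)), y].sum()

def landscape(fn, lim=3.0, n=25):
    """Evaluate fn on an n x n grid; rows index z2, columns index z1."""
    zs = np.linspace(-lim, lim, n)
    grid = np.array([[fn(np.array([z1, z2])) for z1 in zs] for z2 in zs])
    return zs, grid

rng = np.random.default_rng(0)
x = rng.standard_normal((20, 3))       # toy inputs
y = rng.integers(0, 4, size=20)        # toy labels in {0, 1, 2, 3}
A = rng.standard_normal((2, 12))       # hypothetical z -> weights map
b = rng.standard_normal(12)
zs, grid = landscape(lambda z: log_likelihood(z, x, y, A, b))
```

Running the same sweep on held-out data gives the test panel, and adding further terms of the bound to `fn` gives the other panels.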
![Page 33: Uncertainty in Bayesian Neural Nets](https://reader036.fdocuments.net/reader036/viewer/2022071005/5fc2a276693fd84a34519fe5/html5/thumbnails/33.jpg)
Likelihood Landscape
Panels: $\log p(y_{train} \mid x_{train}, W, z)$, $\log p(y_{test} \mid x_{test}, W, z)$, $\log p(y_{train} \mid x_{train}, W, z) + \log r(z \mid W) - \log q(z)$
Axes: $z_1$, $z_2$
![Page 34: Uncertainty in Bayesian Neural Nets](https://reader036.fdocuments.net/reader036/viewer/2022071005/5fc2a276693fd84a34519fe5/html5/thumbnails/34.jpg)
Likelihood Landscape
Panels: $\log p(y_{train} \mid x_{train}, W, z)$, $\log p(y_{test} \mid x_{test}, W, z)$, $\log p(y_{train} \mid x_{train}, W, z) + \log r(z \mid W) - \log q(z)$
Axes: $z_1$, $z_2$
![Page 35: Uncertainty in Bayesian Neural Nets](https://reader036.fdocuments.net/reader036/viewer/2022071005/5fc2a276693fd84a34519fe5/html5/thumbnails/35.jpg)
Likelihood Landscape
Panels: $\log p(y_{train} \mid x_{train}, W, z)$, $\log p(y_{test} \mid x_{test}, W, z)$, $\log p(y_{train} \mid x_{train}, W, z) + \log r(z \mid W) - \log q(z)$
Axes: $z_1$, $z_2$
![Page 36: Uncertainty in Bayesian Neural Nets](https://reader036.fdocuments.net/reader036/viewer/2022071005/5fc2a276693fd84a34519fe5/html5/thumbnails/36.jpg)
Recent BNN Papers
• Multiplicative Normalizing Flows for Variational Bayesian Neural Networks (2017)
• Variational Dropout Sparsifies Deep Neural Networks (2017)
• Bayesian Compression for Deep Learning (2017)

• Adversarial perturbations
• Compression
![Page 37: Uncertainty in Bayesian Neural Nets](https://reader036.fdocuments.net/reader036/viewer/2022071005/5fc2a276693fd84a34519fe5/html5/thumbnails/37.jpg)
Adversarial perturbations
Panels: MNIST, CIFAR10
![Page 38: Uncertainty in Bayesian Neural Nets](https://reader036.fdocuments.net/reader036/viewer/2022071005/5fc2a276693fd84a34519fe5/html5/thumbnails/38.jpg)
Compression vs Uncertainty
$H[P]$
![Page 39: Uncertainty in Bayesian Neural Nets](https://reader036.fdocuments.net/reader036/viewer/2022071005/5fc2a276693fd84a34519fe5/html5/thumbnails/39.jpg)
Conclusion
• Used visualizations to help understand uncertainty in BNNs
• Goal: improve uncertainty estimates and generalization

Applications
• Active learning
• BayesOpt
• RL

• Safety
• Efficiency
![Page 40: Uncertainty in Bayesian Neural Nets](https://reader036.fdocuments.net/reader036/viewer/2022071005/5fc2a276693fd84a34519fe5/html5/thumbnails/40.jpg)
References
• Weight Uncertainty in Neural Networks (2015)
• Variational Dropout and the Local Reparameterization Trick (2015)
• Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning (2016)
• Variational Dropout Sparsifies Deep Neural Networks (2017)
• On Calibration of Modern Neural Networks (2017)
• Multiplicative Normalizing Flows for Variational Bayesian Neural Networks (2017)
![Page 41: Uncertainty in Bayesian Neural Nets](https://reader036.fdocuments.net/reader036/viewer/2022071005/5fc2a276693fd84a34519fe5/html5/thumbnails/41.jpg)
Thank You