Beyond the black box: Flexible programming of hierarchical modeling algorithms for BUGS-based models using NIMBLE

Imperial College seminar, July 2018

Christopher Paciorek, UC Berkeley Statistics

Joint work with:
•  Perry de Valpine (PI), UC Berkeley Environmental Science, Policy and Management
•  Daniel Turek, Williams College
•  Nick Michaud, UC Berkeley Statistics and ESPM
•  Duncan Temple Lang, UC Davis Statistics

Funded by NSF DBI-1147230 and NSF-ACI-1550488

http://r-nimble.org

What do we want to do with hierarchical models?

1. Core algorithms
•  MCMC
•  Sequential Monte Carlo
•  Laplace approximation
•  Importance sampling
•  Variational Bayes

2. Different flavors of algorithms
•  Many flavors of MCMC
•  Gaussian quadrature
•  Monte Carlo expectation maximization (MCEM)
•  Kalman Filter
•  Auxiliary particle filter
•  Posterior predictive simulation
•  Posterior re-weighting
•  Data cloning
•  Bridge sampling (normalizing constants)
•  YOUR FAVORITE HERE
•  YOUR NEW IDEA HERE

3. Idea combinations
•  Particle MCMC
•  Particle Filter with replenishment
•  MCMC/Laplace approximation
•  Dozens of ideas in recent JRSSB/JCGS issues

NIMBLE: extensible software for hierarchical models (r-nimble.org)

What can a practitioner do with hierarchical models?

Two basic software designs:

1. Typical R package = Model family + 1 or more algorithms
•  GLMMs: lme4, MCMCglmm
•  GAMMs: mgcv
•  spatial models: spBayes, INLA

2. Flexible model + black box algorithm
•  BUGS: WinBUGS, OpenBUGS, JAGS
•  PyMC
•  INLA
•  Stan

Existing software

[Diagram: a Model (graph with nodes X(1), X(2), X(3) and Y(1), Y(2), Y(3)) bundled with a single Algorithm]

e.g., BUGS (WinBUGS, OpenBUGS, JAGS), INLA, Stan, various R packages

NIMBLE: The Goal

[Diagram: Model (graph with nodes X(1), X(2), X(3) and Y(1), Y(2), Y(3)) + Algorithm language = the same model connected to many algorithms:]
•  MCMC Flavor 1
•  MCMC Flavor 2
•  Particle Filter
•  Importance Sampler
•  Maximum likelihood
•  Quadrature
•  MCEM
•  Data cloning
•  Your new method

Divorcing Model Specification from Algorithm

Goals
–  Retaining BUGS compatibility
–  Providing a variety of standard algorithms
–  Allowing users to easily modify those algorithms
–  Allowing developers to add new algorithms (including modular combination of algorithms)
–  Allowing users to operate within R
–  Providing speed via compilation to C++, with R wrappers

NIMBLE System Summary

statistical model (BUGS code) + algorithm (nimbleFunction)

R objects + R under the hood

R objects + C++ under the hood
•  We generate C++ code,
•  compile and load it,
•  provide interface object.

NIMBLE

1.  Model specification
    BUGS language → R/C++ model object

2.  Algorithm library
    MCMC, Particle Filter/Sequential MC, MCEM, etc.

3.  Algorithm specification
    NIMBLE programming language within R → R/C++ algorithm object

User Experience: Creating a Model from BUGS

littersCode <- nimbleCode({
    for (j in 1:G) {
        for (i in 1:N) {
            r[i,j] ~ dbin(p[i,j], n[i,j])
            p[i,j] ~ dbeta(a[j], b[j])
        }
        mu[j] <- a[j] / (a[j] + b[j])
        theta[j] <- 1.0 / (a[j] + b[j])
        a[j] ~ dgamma(1, 0.001)
        b[j] ~ dgamma(1, 0.001)
    }
})

> littersModel <- nimbleModel(littersCode,
      constants = list(N = 16, G = 2, n = input$n),   # n (litter sizes) assumed to be stored in input$n
      data = list(r = input$r))
> littersModel_cpp <- compileNimble(littersModel)

•  Parse and process BUGS code. Collect information in model object.
•  Use igraph plot method (we also use this to determine dependencies).
•  Provides variables and functions (calculate, simulate) for algorithms to use.

[Figure: igraph plot of an example model graph with nodes mu, muBlock[1] to muBlock[3], and y[1] to y[9]]

The Success of R

Programming with Models

You give NIMBLE:

    littersCode (the BUGS code above)

You get this:

> littersModel$a[1] <- 5                           # set values in model
> simulate(littersModel, 'p')                      # simulate from prior
> p_deps <- littersModel$getDependencies('p')      # model structure
> calculate(littersModel, p_deps)                  # calculate probability density
> getLogProb(littersModel, 'r')

NIMBLE also extends BUGS: multiple parameterizations, named parameters, and user-defined distributions and functions.
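As a rough illustration of what simulate and calculate do for a set of nodes, here is a minimal base-R sketch (plain R, not NIMBLE code) for one group of the litters model. The data values and the helper name calc_logprob are made up for this example; the point is only that "calculate" on p and its dependents amounts to summing the log densities of p and of the r nodes that depend on it.

```r
# Hypothetical data for one group of 4 litters (illustrative values only).
n <- c(10, 12, 9, 11)   # litter sizes
r <- c(8, 10, 9, 7)     # survivors per litter
a <- 5                  # hyperparameter value, as if set via littersModel$a[1] <- 5
b <- 2

set.seed(1)
# Analogue of simulate(model, 'p'): draw each p from its Beta(a, b) prior.
p <- rbeta(length(n), a, b)

# Analogue of calculate(model, deps): sum the log densities of p and of the
# nodes that depend on p (here the binomial counts r).
calc_logprob <- function(p, r, n, a, b) {
  sum(dbeta(p, a, b, log = TRUE)) + sum(dbinom(r, n, p, log = TRUE))
}

lp <- calc_logprob(p, r, n, a, b)
print(lp)
```

In NIMBLE itself the model object tracks which nodes depend on which, so an algorithm never has to hard-code this sum.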

User Experience: Specializing an Algorithm to a Model

littersModelCode <- nimbleCode({
    for (j in 1:G) {
        for (i in 1:N) {
            r[i,j] ~ dbin(p[i,j], n[i,j])
            p[i,j] ~ dbeta(a[j], b[j])
        }
        mu[j] <- a[j] / (a[j] + b[j])
        theta[j] <- 1.0 / (a[j] + b[j])
        a[j] ~ dgamma(1, 0.001)
        b[j] ~ dgamma(1, 0.001)
    }
})

sampler_slice <- nimbleFunction(
    setup = function(model, mvSaved, control) {
        calcNodes <- model$getDependencies(control$targetNode)
        discrete <- model$getNodeInfo()[[control$targetNode]]$isDiscrete()
        […snip…]
    run = function() {
        u <- getLogProb(model, calcNodes) - rexp(1, 1)
        x0 <- model[[targetNode]]
        L <- x0 - runif(1, 0, 1) * width
        […snip…]
    …

> littersMCMCconf <- configureMCMC(littersModel)
> littersMCMCconf$printSamplers()
[…snip…]
[3] RW sampler; targetNode: b[1], adaptive: TRUE, adaptInterval: 200, scale: 1
[4] RW sampler; targetNode: b[2], adaptive: TRUE, adaptInterval: 200, scale: 1
[5] conjugate_beta sampler; targetNode: p[1,1], dependents_dbin: r[1,1]
[6] conjugate_beta sampler; targetNode: p[1,2], dependents_dbin: r[1,2]
[...snip...]
> littersMCMCconf$addSampler('a[1]', 'slice', list(adaptInterval = 100))
> littersMCMCconf$addSampler('a[2]', 'slice', list(adaptInterval = 100))
> littersMCMCconf$addMonitors('theta')
> littersMCMC <- buildMCMC(littersMCMCconf)
> littersMCMC_Cpp <- compileNimble(littersMCMC, project = littersModel)
> runMCMC(littersMCMC_Cpp, niter = 20000)

> littersMCEM <- buildMCEM(littersModel, latentNodes = 'p',
      mcmcControl = list(adaptInterval = 50),
      boxConstraints = list(list(c('a', 'b'), limits = c(0, Inf))),
      buffer = 1e-6)
> set.seed(0)
> littersMCEM(maxit = 50, m1 = 500, m2 = 5000)

User Experience: Specializing an Algorithm to a Model (2)

littersModelCode <- quote({
    for (j in 1:G) {
        for (i in 1:N) {
            r[i,j] ~ dbin(p[i,j], n[i,j])
            p[i,j] ~ dbeta(a[j], b[j])
        }
        mu[j] <- a[j] / (a[j] + b[j])
        theta[j] <- 1.0 / (a[j] + b[j])
        a[j] ~ dgamma(1, 0.001)
        b[j] ~ dgamma(1, 0.001)
    }
})

buildMCEM <- nimbleFunction(
    while(runtime(converged == 0)) {
        ….
        calculate(model, paramDepDetermNodes)
        mcmcFun(mcmc.its, initialize = FALSE)
        currentParamVals[1:nParamNodes] <- getValues(model, paramNodes)
        op <- optim(currentParamVals, objFun, maximum = TRUE)
        newParamVals <- op$maximum
        …..

Modularity: One can plug any MCMC sampler into the MCEM, with user control of the sampling strategy, in place of the default MCMC.

NIMBLE

1.  Model specification
    BUGS language → R/C++ model object

2.  Algorithm library
    MCMC, Particle Filter/Sequential MC, MCEM, etc.

3.  Algorithm specification
    NIMBLE programming language within R → R/C++ algorithm object

NIMBLE's algorithm library

–  MCMC samplers:
   •  Conjugate, adaptive Metropolis, adaptive blocked Metropolis, slice, elliptical slice sampler, particle MCMC, specialized samplers for particular distributions (Dirichlet, CAR)
   •  Flexible choice of sampler for each parameter
   •  User-specified blocks of parameters
   •  Cross-validation, WAIC
–  Sequential Monte Carlo (particle filters)
   •  Various flavors
–  MCEM
–  Write your own

NIMBLE in Action: the Litters Example

Beta-binomial GLMM for clustered binary response data
Survival in two sets of 16 litters of pigs

littersModelCode <- nimbleCode({
    for (j in 1:2) {
        for (i in 1:16) {
            r[i,j] ~ dbin(p[i,j], n[i,j])
            p[i,j] ~ dbeta(a[j], b[j])
        }
        mu[j] <- a[j] / (a[j] + b[j])
        theta[j] <- 1.0 / (a[j] + b[j])
        a[j] ~ dgamma(1, 0.001)
        b[j] ~ dgamma(1, 0.001)
    }
})

[Diagram: graphical model with plates for Group j and Litter i; nodes a_j, b_j, p_{i,j}, r_{i,j}, n_{i,j}]

Challenges of the toy example:

•  BUGS manual: “The estimates, particularly a1, a2 suffer from extremely poor convergence, limited agreement with m.l.e.’s and considerable prior sensitivity. This appears to be due primarily to the parameterisation in terms of the highly related aj and bj, whereas direct sampling of muj and thetaj would be strongly preferable.”
•  But that’s not all that’s going on. Consider the dependence between the p’s and their aj, bj hyperparameters.
•  And perhaps we want to do something other than MCMC.

Default MCMC: Gibbs + Metropolis

> littersMCMCspec <- configureMCMC(littersModel, control = list(adaptInterval = 100))
> littersMCMC <- buildMCMC(littersMCMCspec)
> littersMCMC_cpp <- compileNimble(littersMCMC, project = littersModel)
> runMCMC(littersMCMC_cpp, niter = 10000)

[Figure: trace plots of a1, b1, a2, b2 (iterations 0 to 2500) for the default univariate adaptive sampler. Red line is MLE.]

Blocked MCMC: Gibbs + Blocked Metropolis

> littersMCMCspec2 <- configureMCMC(littersModel, control = list(adaptInterval = 100))
> littersMCMCspec2$addSampler(c('a[1]', 'b[1]'), 'RW_block', list(adaptInterval = 100))
> littersMCMCspec2$addSampler(c('a[2]', 'b[2]'), 'RW_block', list(adaptInterval = 100))
> littersMCMC2 <- buildMCMC(littersMCMCspec2)
> littersMCMC2_cpp <- compileNimble(littersMCMC2, project = littersModel)
> runMCMC(littersMCMC2_cpp, niter = 10000)

[Figure: trace plots of a1, b1, a2, b2 (iterations 0 to 2500); rows: univariate adaptive sampler, blocked sampler.]

Blocked MCMC: Gibbs + Cross-level Updaters

> littersMCMCspec3 <- configureMCMC(littersModel, control = list(adaptInterval = 100))
> topNodes1 <- c('a[1]', 'b[1]')
> littersMCMCspec3$addSampler(topNodes1, 'crossLevel', list(adaptInterval = 100))
> topNodes2 <- c('a[2]', 'b[2]')
> littersMCMCspec3$addSampler(topNodes2, 'crossLevel', list(adaptInterval = 100))
> littersMCMC3 <- buildMCMC(littersMCMCspec3)
> littersMCMC3_cpp <- compileNimble(littersMCMC3, project = littersModel)
> runMCMC(littersMCMC3_cpp, niter = 10000)

•  Cross-level dependence is a key barrier in this and many other models.
•  We wrote a new “cross-level” updater function using the NIMBLE DSL.
•  Blocked Metropolis random walk on a set of hyperparameters with conditional Gibbs updates on dependent nodes (provided they are in a conjugate relationship).
•  Equivalent to (analytically) integrating the dependent (latent) nodes out of the model.
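For the litters model, "integrating the latent nodes out" has a closed form: with p ~ Beta(a, b) and r ~ Binomial(n, p), the marginal of r given (a, b) is beta-binomial. The base-R sketch below (plain R, not the crossLevel sampler itself) writes down that marginal and checks it against numerical integration. The function names and the numeric values are illustrative assumptions.

```r
# Log marginal likelihood of r successes out of n after integrating
# p ~ Beta(a, b) out analytically (the beta-binomial distribution).
log_betabin <- function(r, n, a, b) {
  lchoose(n, r) + lbeta(a + r, b + n - r) - lbeta(a, b)
}

# Numerical check: integrate Binomial(r | n, p) * Beta(p | a, b) over p in (0, 1).
numeric_marginal <- function(r, n, a, b) {
  integrand <- function(p) dbinom(r, n, p) * dbeta(p, a, b)
  integrate(integrand, lower = 0, upper = 1)$value
}

# Illustrative values, not taken from the litters data.
r <- 8; n <- 10; a <- 5; b <- 2
analytic  <- exp(log_betabin(r, n, a, b))
numerical <- numeric_marginal(r, n, a, b)
print(c(analytic = analytic, numerical = numerical))

# The marginal is a proper distribution over r = 0, ..., n.
total <- sum(exp(log_betabin(0:n, n, a, b)))
print(total)
```

A Metropolis step on (a, b) that targets this marginal, followed by a conjugate draw of each p, is the idea the bullets above describe.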

[Figure: trace plots of a1, b1, a2, b2 (iterations 0 to 2500); rows: univariate adaptive sampler, blocked sampler, cross-level sampler.]

Litters MCMC: BUGS and JAGS

•  Customized sampling possible in NIMBLE greatly improves performance.
•  BUGS gives similar performance to the default NIMBLE MCMC
   •  Be careful: values of $sims.list and $sims.matrix in R2WinBUGS output are randomly permuted
   •  Mixing for a2 and b2 modestly better than default NIMBLE MCMC
•  JAGS slice sampler gives similar performance to BUGS, but fails for some starting values with this (troublesome) parameterization
•  NIMBLE provides user control and transparency.
•  NIMBLE is faster than JAGS on this example (if one ignores the compilation time), though not always.
•  Note: we’re not out to build the best MCMC but rather a flexible framework for algorithms; we’d love to have someone else build a better default MCMC and distribute it for use in our system.

NIMBLE

1.  Model specification
    BUGS language → R/C++ model object

2.  Algorithm library
    MCMC, Particle Filter/Sequential MC, MCEM, etc.

3.  Algorithm specification
    NIMBLE programming language within R → R/C++ algorithm object

NIMBLE: Programming With Models

We want:
•  High-level processing (model structure) in R
•  Low-level processing in C++

NIMBLE: Programming with Models

sampler_myRW <- nimbleFunction(
    setup = function(model, mvSaved, targetNode, scale) {
        calcNodes <- model$getDependencies(targetNode)
    },
    run = function() {
        model_lp_initial <- calculate(model, calcNodes)
        proposal <- rnorm(1, model[[targetNode]], scale)
        model[[targetNode]] <<- proposal
        model_lp_proposed <- calculate(model, calcNodes)
        log_MH_ratio <- model_lp_proposed - model_lp_initial
        if (decide(log_MH_ratio)) jump <- TRUE
        else jump <- FALSE
        # …. Various bookkeeping operations … #
    })

•  2 kinds of functions
•  setup function: query model structure ONCE
•  run function: the actual (generic) algorithm
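The setup/run split can be mimicked in plain R with a closure, which may help readers who have not seen a nimbleFunction before. The sketch below is base R only, not NIMBLE DSL code: make_rw_sampler (a hypothetical name) plays the role of setup and is evaluated once, and the function it returns plays the role of run. The Normal(3, 1) target is an arbitrary stand-in for a model's log density.

```r
# "setup" stage: evaluated once; captures the log density and proposal scale.
make_rw_sampler <- function(log_density, scale) {
  # "run" stage: one random-walk Metropolis update of the current value.
  function(current) {
    lp_initial   <- log_density(current)
    proposal     <- rnorm(1, current, scale)
    lp_proposed  <- log_density(proposal)
    log_MH_ratio <- lp_proposed - lp_initial
    # Accept with probability min(1, exp(log_MH_ratio)).
    if (log(runif(1)) < log_MH_ratio) proposal else current
  }
}

# Hypothetical target density for illustration: Normal(mean = 3, sd = 1).
log_target <- function(x) dnorm(x, mean = 3, sd = 1, log = TRUE)

set.seed(42)
step <- make_rw_sampler(log_target, scale = 1)

niter <- 5000
samples <- numeric(niter)
x <- 0
for (i in seq_len(niter)) {
  x <- step(x)
  samples[i] <- x
}

# Posterior mean estimate after discarding 500 burn-in iterations.
post_mean <- mean(samples[-(1:500)])
print(post_mean)
```

In NIMBLE the run code is additionally compiled to C++, which is why it is restricted to the DSL while setup code can be arbitrary R.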

The NIMBLE compiler (run code)

Feature summary:
•  R-like matrix algebra (using Eigen library)
•  R-like indexing (e.g. X[1:5,])
•  Use of model variables and nodes
•  Model calculate (logProb) and simulate functions
•  Sequential integer iteration
•  If-then-else, do-while
•  Access to much of Rmath.h (e.g. distributions)
•  Automatic R interface/wrapper
•  Call out to your own C/C++ or back to R
•  Many improvements/extensions planned

How an Algorithm is Processed in NIMBLE

DSL code (run code) within nimbleFunction()
    → Parse in R
Parse tree of code
    → Process to a Reference Class in R
Abstract syntax tree
    → Writing to files from R
.Cpp and .h files in R TMPDIR
    → g++/llvm/etc.
DLL in R TMPDIR
    → Generation of R wrapper functions that use .Call
Access via wrappers from R
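The first stage, "Parse in R", relies on the fact that R code is itself an R object. The short base-R sketch below shows what a parse tree looks like for a single line of run code; it does not use NIMBLE, and the variable names are chosen only for illustration.

```r
# A one-line 'run' body, captured unevaluated.
run_code <- quote(proposal <- rnorm(1, current, scale))

# The parse tree is a nested call object: `<-` applied to a symbol and a call.
top_call <- as.character(run_code[[1]])       # "<-"
target   <- as.character(run_code[[2]])       # "proposal"
rhs_fun  <- as.character(run_code[[3]][[1]])  # "rnorm"

# all.names() walks the whole tree and lists every function and symbol used.
symbols <- all.names(run_code)
print(symbols)
```

A code generator can walk such a tree, annotate it with types and sizes, and emit the corresponding C++.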

Modular algorithms: particle MCMC

•  Particle filter (SMC) approximates a posterior for latent states using a sample
•  Traditionally used in state space models where the sample particles are propagated in time to approximate: $p(x_t \mid y_{1:t}, \theta)$
•  Weights from ‘correction’ step can be used to estimate $p(y_{1:t} \mid \theta)$
•  Embed in MCMC to do approximate marginalization over $x_{1:t}$

Particle MCMC in NIMBLE

sampler_PMCMC <- nimbleFunction(
    setup = function(model, mvSaved, target, control) {
        ….
        my_particleFilter <- buildAuxiliaryFilter(model, control$latents,
            control = list(saveAll = TRUE, smoothing = TRUE, lookahead = lookahead))
        ….
    },
    run = function() {
        ….
        modelLP0 <- modelLL0 + calculate(model, target)
        propValue <- rnorm(1, mean = model[[target]], sd = scale)
        model[[target]] <<- propValue
        modelLL1 <- my_particleFilter$run(m)
        modelLP1 <- modelLL1 + calculate(model, target)
        jump <- my_decideAndJump$run(modelLP1, modelLP0, 0, 0)
        ….
    })
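To show where the likelihood estimate inside a particle MCMC step comes from, here is a minimal bootstrap particle filter in base R. It is a simpler filter than the auxiliary filter used above and is not NIMBLE code. The linear Gaussian state space model, the parameter values, and the function name bootstrap_filter are all assumptions made for this sketch.

```r
# Bootstrap particle filter for a hypothetical state space model:
#   x[t] ~ Normal(phi * x[t-1], sd_x),   y[t] ~ Normal(x[t], sd_y)
# Returns an estimate of log p(y[1:T] | theta), with theta = (phi, sd_x, sd_y).
bootstrap_filter <- function(y, phi, sd_x, sd_y, n_particles = 500) {
  x <- rnorm(n_particles, 0, sd_x)            # initial particles (assumed prior)
  loglik <- 0
  for (t in seq_along(y)) {
    if (t > 1) x <- rnorm(n_particles, phi * x, sd_x)    # propagate in time
    logw <- dnorm(y[t], x, sd_y, log = TRUE)             # 'correction' weights
    maxw <- max(logw)
    w <- exp(logw - maxw)                                # stabilized weights
    loglik <- loglik + maxw + log(mean(w))               # adds log p(y[t] | y[1:(t-1)], theta)
    x <- sample(x, n_particles, replace = TRUE, prob = w)  # resample
  }
  loglik
}

# Simulate data with phi = 0.8 (illustrative values only).
set.seed(7)
T_len <- 50
x_true <- numeric(T_len)
x_true[1] <- rnorm(1)
for (t in 2:T_len) x_true[t] <- 0.8 * x_true[t - 1] + rnorm(1)
y <- x_true + rnorm(T_len, sd = 0.5)

ll_good <- bootstrap_filter(y, phi = 0.8,  sd_x = 1, sd_y = 0.5)
ll_bad  <- bootstrap_filter(y, phi = -0.8, sd_x = 1, sd_y = 0.5)
print(c(ll_good = ll_good, ll_bad = ll_bad))
```

In a particle MCMC sampler, an estimate like ll_good takes the place of the exact log-likelihood in the Metropolis-Hastings ratio for the top-level parameters.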

Status of NIMBLE and Next Steps

•  First release was June 2014 with regular releases since. Lots to do:
   –  Improve the user interface and speed up compilation (in progress)
   –  Scalability for large models (in progress)
   –  Bayesian nonparametrics with Claudia Wehrhahn & Abel Rodriguez (UCSC) (first release a few weeks ago and more work in progress)
   –  Refinement/extension of the DSL for algorithms (in progress)
      •  e.g., automatic differentiation, parallelization
   –  Additional algorithms written in NIMBLE DSL
      •  e.g., normalizing constant calculation, Laplace approximations

•  Interested?
   –  We have funding for a postdoc/programmer!
   –  We have funding to bring users/developers to Berkeley.
   –  Announcements: nimble-announce Google site
   –  User support/discussion: nimble-users Google site
   –  Write an algorithm using NIMBLE!
   –  Help with development of NIMBLE: email nimble.stats@gmail.com or see github.com/nimble-dev

NIMBLE: What can I program?

•  Your own distribution for use in a model
•  Your own function for use in a model
•  Your own MCMC sampler for a variable in a model
•  A new MCMC sampling algorithm for general use
•  A new algorithm for hierarchical models
•  An algorithm that composes other existing algorithms (e.g., MCMC-SMC combinations)

PalEON Project
www3.nd.edu/~paleolab/paleonproject

Goal: Improve the predictive capacity of terrestrial ecosystem models

Friedlingstein et al. 2006, J. Climate

“This large variation among carbon-cycle models … has been called ‘uncertainty’. I prefer to call it ‘ignorance’.” - Prentice (2013), Grantham Institute

Critical issue: model parameterization and representation of decadal- to centennial-scale processes are poorly constrained by data

Approach: use historical and fossil data to estimate past vegetation and climate and use this information for model initialization, assessment, and improvement

PalEON Statistical Applications

•  Estimate spatially-varying composition and biomass of tree species from count and zero-inflated size data in year 1850
•  Estimate temporal variations in temperature and precipitation over 2000 years from tree rings and lake/bog records
•  Estimate tree composition spatially over 2000 years from fossil pollen in lake sediment cores
•  Estimate biomass over time at a site from fossil pollen in lake sediment cores

Inferring Biomass from Pollen

•  Calibration with multiple spatial locations:
   –  “Regress” multinomial counts on biomass
   –  For each taxon, have proportion of the taxon be a smooth function of biomass using splines and parameters of Beta distributions:
      •  $\alpha_k = \exp(Z(b)\,\beta_k)$
   –  Estimate spline coefficients for each taxon
•  Predict biomass over time at one location:
   –  State space model for biomass over time
   –  Fixed spline coefficients from calibration
   –  Inverse problem (just Bayesian inference)
      •  $\alpha_k = \exp(Z(b_t)\,\beta_k)$
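A small base-R sketch of the mapping from biomass to Beta shape parameters may make the formula concrete. It uses splines::bs for the basis Z(b). The biomass values, knot locations, boundary knots, and coefficient values below are all made up for illustration and are not those of the PalEON analysis.

```r
library(splines)   # ships with base R

# Hypothetical biomass values (Mg/ha) and interior knot locations.
b     <- c(5, 20, 60, 110, 140)
knots <- c(30, 75, 120)

# Z(b): cubic B-spline basis evaluated at each biomass value (one row per value).
Zb <- bs(b, knots = knots, degree = 3, intercept = TRUE,
         Boundary.knots = c(0, 150))

# Hypothetical spline coefficients beta_k for 3 taxa (one column per taxon).
set.seed(3)
beta <- matrix(rnorm(ncol(Zb) * 3, sd = 0.5), nrow = ncol(Zb), ncol = 3)

# alpha_k = exp(Z(b) beta_k): strictly positive shape parameters.
alpha <- exp(Zb %*% beta)
print(dim(alpha))
```

The exponential guarantees positive shape parameters whatever the spline coefficients are, so the coefficients can be sampled without constraints.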

Predicting biomass from compositional data

[Figure: pollen diagram for Berry Pond West (Berry Pond, W Massachusetts), years 1000 to 2000 AD; panels: Pine, Hemlock, Birch, Oak, Hickory, Beech, Chestnut, Grass & weeds, Charcoal:Pollen, % organic matter; annotations: Onset of Little Ice Age, European settlement]

Prediction: based on calibration model and pollen composition over time, predict biomass

Calibration: at settlement time we have biomass estimates (based on survey data and a spatial model) and pollen composition (from sediment cores)

[Figure: “Biomass Point Estimates at Settlement”, biomass estimates at ponds; axes: Biomass (Mg/ha), Frequency]

Calibration model

•  Pollen proportion for each taxon determined by transformation of a flexible (spline) function of biomass
   •  shape1 and shape2 parameters of beta distribution (stick-breaking prior for multinomial) are splines of biomass
   •  Primary calibration parameters are spline coefficients
•  Overdispersed multinomial likelihood for pollen counts given modeled proportions
•  Fit in NIMBLE (could be fit in various other packages)

Calibration model fit

[Figure: pollen proportion (y-axis) versus biomass (x-axis, 0 to 140) for four taxa: PINUSX, prairie, QUERCUS, other herbs]

Mean and variability of modeled pollen proportions across ponds vary with biomass

Prediction Model

[Diagram: graphical model with plates for time t and taxon k; nodes σ, b_t, Z(b_t), β_1k, β_2k, α_1tk, α_2tk, p_tk, Y_tk]

Prediction Model

# pollen likelihood
for (t in 1:nTimes) {
    Y[t,1] ~ dbetabin(shape1[t,1], shape2[t,1], n[t])
    for (k in 2:(nTaxa-1)) {
        Y[t,k] ~ dbetabin(shape1[t,k], shape2[t,k], n[t] - sum(Y[t,1:(k-1)]))
    }
}

# latent predictor
shape1[,] <- exp(Zb[,] %*% beta1[,])
shape2[,] <- exp(Zb[,] %*% beta2[,])
for (t in 1:nTimes)
    Zb[t,1:nKnots] <- bspline(b[t], knots[1:nKnots])

# (nonstationary) biomass evolution
for (t in 2:nTimes)
    b[t] ~ dGenPareto(3*b[t-1] - 3*b[t-2] + b[t-3], sigma)

# hyperpriors
sigma ~ dunif(0, 10)   # Gelman (2006)
b[1] ~ dunif(0, 400)
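The pollen likelihood allocates the n[t] grains counted at time t sequentially: each taxon receives a beta-binomial share of whatever earlier taxa left over. The base-R sketch below simulates one such allocation. The helper names, the shape values, and the choice to give the final taxon the remainder are assumptions made for illustration; dbetabin in the slide code is a user-defined distribution whose details are not shown.

```r
# One beta-binomial draw: p ~ Beta(shape1, shape2), then y ~ Binomial(size, p).
rbetabin <- function(size, shape1, shape2) {
  rbinom(1, size, rbeta(1, shape1, shape2))
}

# Sequential (stick-breaking) allocation of n grains among nTaxa taxa:
# taxon k draws from what is left after taxa 1..(k-1); the last taxon
# receives the remainder (an assumption of this sketch).
r_stickbreak_counts <- function(n, shape1, shape2) {
  nTaxa <- length(shape1) + 1
  Y <- numeric(nTaxa)
  remaining <- n
  for (k in seq_len(nTaxa - 1)) {
    Y[k] <- rbetabin(remaining, shape1[k], shape2[k])
    remaining <- remaining - Y[k]
  }
  Y[nTaxa] <- remaining
  Y
}

set.seed(11)
# Hypothetical shape parameters for nTaxa - 1 = 3 taxa (made-up values).
Y <- r_stickbreak_counts(n = 500, shape1 = c(2, 3, 1), shape2 = c(5, 4, 2))
print(Y)
```

Because each count is drawn from what remains, the counts always sum to n, which is what makes the construction an overdispersed alternative to the multinomial.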

MCMC performance

[Figure: trace plot of biomass at −5000 years over 1000 MCMC iterations]
Mixing with data augmentation using default NIMBLE MCMC

[Figure: trace plots (chain 1, chain 2) of biomass at −5000 years over 1000 MCMC iterations]
Mixing in marginalized model using HMC in Stan (recall non-differentiable spike at zero from generalized Pareto)

NIMBLE solution: customized MCMC sampler for {b[t-3], b[t-2], b[t-1], b[t], b[t+1], b[t+2], b[t+3]}, with a normal approximation to likelihood to generate good (quasi-conjugate) proposals.

PalEON Acknowledgements

•  Pollen-biomass Collaborators: Ann Raiho, Jason McLachlan (Notre Dame Biology)
•  PalEON investigators: Jason McLachlan (Notre Dame, PI), Mike Dietze (BU), Andrew Finley (Michigan State), Amy Hessl (West Virginia), Phil Higuera (Idaho), Mevin Hooten (USGS/Colorado State), Steve Jackson (USGS/Arizona), Dave Moore (Arizona), Neil Pederson (Harvard Forest), Jack Williams (Wisconsin), Jun Zhu (Wisconsin)
•  NSF Macrosystems Program