Practical Model Selection and Multi-model Inference using R


Modified from a presentation by:

Eric Stolen and Dan Hunt

Theory

• This is the link with science, which is about understanding how the world works

Indigo Snake Habitat Selection

David R. Breininger, M. Rebecca Bolt, Michael L. Legare, John H. Drese, and Eric D. Stolen

Source: Journal of Herpetology, 45(4):484-490. 2011.

– Animal perception
– Evolutionary biology
– Population demography

http://www.seaworld.org/animal-info/animal-bytes/spooky-safari/eastern-indigo-snake.htm

Hypotheses

• To use the information-theoretic toolbox, we must be able to state a hypothesis as a statistical model (or, more precisely, as an equation that lets us calculate the maximized likelihood of the hypothesis given the data)


Multiple Working Hypotheses

• We operate with a set of multiple alternative hypotheses (models)

• Advantages include safeguarding objectivity and allowing rigorous inference.

Chamberlain (1890)
Strong Inference - Platt (1964)
Karl Popper (ca. 1960) – Bold Conjectures

Deriving the model set

• This is the tough part (but also the creative part)
• Much thought is needed, so don’t rush
• Collaborate, seek outside advice, read the literature, go to meetings…
• How and When hypotheses are better than What hypotheses (strive to predict rather than describe)

Models – Indigo Snake example

David R. Breininger, M. Rebecca Bolt, Michael L. Legare, John H. Drese, and Eric D. Stolen

Source: Journal of Herpetology, 45(4):484-490. 2011.

• Study of indigo snake habitat use
• Response variable: home range size, ln(ha)
• SEX
• Land cover – 2-3 levels (lc2)
• weeks = effort/exposure
• Science question: “Is there a seasonal difference in habitat use between sexes?”

Models – Indigo Snake example

SEX
land cover type (lc2)
weeks
SEX + lc2
SEX + weeks
lc2 + weeks
SEX + lc2 + weeks
SEX + lc2 + SEX * lc2
SEX + lc2 + weeks + SEX * lc2

http://www.herpnation.com/hn-blog/indigo-snake-survival-demographics/?simple_nav_category=john-c-murphy
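The candidate set above can be written as R model formulas, one per hypothesis. This is a minimal sketch only: the data frame `snakes` is simulated placeholder data (not the study's data), the response name `ha.ln` is borrowed from the model labels used later in the deck, and the two-level coding of `lc2` is an assumption. In R, `SEX:lc2` is the interaction term alone, so `SEX + lc2 + SEX:lc2` is the same model as `SEX * lc2`.

```r
# Simulated placeholder data (NOT the study's data); names follow the slides.
set.seed(1)
n <- 40
snakes <- data.frame(
  SEX   = factor(rep(c("F", "M"), length.out = n)),
  lc2   = factor(rep(c("A", "A", "B", "B"), length.out = n)),
  weeks = runif(n, min = 10, max = 100)
)
snakes$ha.ln <- rnorm(n, mean = 4, sd = 1)

# One formula per hypothesis in the candidate set.
model_set <- c(
  "ha.ln ~ SEX",
  "ha.ln ~ lc2",
  "ha.ln ~ weeks",
  "ha.ln ~ SEX + lc2",
  "ha.ln ~ SEX + weeks",
  "ha.ln ~ lc2 + weeks",
  "ha.ln ~ SEX + lc2 + weeks",
  "ha.ln ~ SEX + lc2 + SEX:lc2",
  "ha.ln ~ SEX + lc2 + weeks + SEX:lc2"
)

# Fit every model by least squares.
fits <- lapply(model_set, function(f) lm(as.formula(f), data = snakes))
names(fits) <- model_set

# K (counting intercept and residual variance) and maximized log-likelihood.
K  <- sapply(fits, function(m) attr(logLik(m), "df"))
LL <- sapply(fits, function(m) as.numeric(logLik(m)))
print(data.frame(K = K, LL = round(LL, 2)))
```

Because the response here is random noise, the fitted values mean nothing; the point is only how each hypothesis becomes a model that yields a likelihood.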


Modeling

• Trade-off between precision and bias
• Trying to derive knowledge / advance learning; not “fit the data”
• Relationship between data (quantity and quality) and sophistication of the model

Precision-Bias Trade-off

[Figure: precision-bias trade-off. Horizontal axis: model complexity (increasing number of parameters)]
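A small simulation can make the trade-off concrete. The sketch below is an illustration under stated assumptions, not part of the original presentation: the data are simulated from a straight line, and polynomials of increasing degree stand in for "increasing number of parameters". The maximized log-likelihood can only go up as terms are added, while a penalized criterion such as AICc charges for each extra parameter.

```r
# Simulated data: the generating process is a straight line plus noise.
set.seed(42)
n <- 30
x <- seq(0, 1, length.out = n)
y <- 1 + 2 * x + rnorm(n, sd = 0.5)

# Nested polynomial models of increasing complexity.
degrees <- 1:6
fits <- lapply(degrees, function(d) lm(y ~ poly(x, d)))

LL   <- sapply(fits, function(m) as.numeric(logLik(m)))  # fit to the data
K    <- sapply(fits, function(m) attr(logLik(m), "df"))  # parameters estimated
AICc <- -2 * LL + 2 * K * n / (n - K - 1)                # penalized criterion

print(data.frame(degree = degrees, K = K,
                 LL = round(LL, 2), AICc = round(AICc, 2)))
```

Reading down the `LL` column shows fit improving with every added term; the `AICc` column shows whether that improvement is worth the extra parameters.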

Kullback-Leibler Information

• Basic concept from information theory
• The information lost when a model is used to represent full reality
• Can also think of it as the distance between a model and full reality

[Figure: K-L distances from truth / reality to model G1 (best model in set) and to model G3]

The relative difference between models is constant
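For discrete distributions, K-L information is I(f, g) = Σ f log(f / g). The sketch below uses made-up probabilities for "truth" and for two models, purely to show the calculation; in practice truth is unknown, which is why only relative differences between models can be estimated.

```r
# Discrete Kullback-Leibler information: I(f, g) = sum(f * log(f / g))
kl <- function(f, g) sum(f * log(f / g))

# Hypothetical distributions, for illustration only.
truth <- c(0.5, 0.3, 0.2)        # stands in for "full reality"
g1    <- c(0.4, 0.4, 0.2)        # a model fairly close to truth
g3    <- c(1/3, 1/3, 1/3)        # a model farther from truth

kl(truth, truth)                 # 0: no information lost
kl(truth, g1)                    # information lost using g1
kl(truth, g3)                    # information lost using g3
kl(truth, g3) - kl(truth, g1)    # relative difference between the models
```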

Akaike’s Contributions

• Figured out how to estimate the relative Kullback-Leibler distance between models in a set of models

• Figured out how to link maximum likelihood estimation theory with expected K-L information

• An (Akaike’s) Information Criterion

$$\mathrm{AIC} = -2 \log_e\big(\mathcal{L}(\text{model}_i \mid \text{data})\big) + 2K$$

$$\mathrm{AICc}_i = -2 \log_e\big(\mathcal{L}(\text{model}_i \mid \text{data})\big) + 2K\left(\frac{n}{n-K-1}\right)$$

or, equivalently,

$$\mathrm{AICc}_i = \mathrm{AIC}_i + \frac{2K(K+1)}{n-K-1}$$

(where K = the number of parameters estimated and n = the sample size)
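The two forms of AICc are algebraically identical, since 2K·n/(n−K−1) = 2K + 2K(K+1)/(n−K−1). A minimal sketch follows; the function names and the input values are hypothetical and chosen only so the arithmetic is easy to check by hand.

```r
# AICc written both ways; inputs are the maximized log-likelihood,
# the number of estimated parameters K, and the sample size n.
aicc_a <- function(logL, K, n) -2 * logL + 2 * K * (n / (n - K - 1))
aicc_b <- function(logL, K, n) (-2 * logL + 2 * K) + 2 * K * (K + 1) / (n - K - 1)

# Hypothetical values: logL = -50, K = 4, n = 45
aicc_a(logL = -50, K = 4, n = 45)   # 100 + 8 * 45/40 = 109
aicc_b(logL = -50, K = 4, n = 45)   # 108 + 40/40     = 109

# As n grows with K fixed, the correction term shrinks toward zero.
aicc_b(logL = -50, K = 4, n = 1e6) - (-2 * -50 + 2 * 4)
```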

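$\mathrm{AICc}_{\min}$ = AICc for the model with the lowest AICc value

$$\Delta_i = \mathrm{AICc}_i - \mathrm{AICc}_{\min}$$

$$w_i = \frac{\exp(-0.5\,\Delta_i)}{\sum_{r=1}^{R} \exp(-0.5\,\Delta_r)} = \mathrm{Prob}\{g_i \mid \text{data}\}$$

(model probability, where R = the number of models in the set)

evidence ratio of model i to model j = $w_i / w_j$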

Least Squares Regression

$$\mathrm{AICc} = n \log_e(\hat{\sigma}^2) + 2K\left(\frac{n}{n-K-1}\right)$$

where $\hat{\sigma}^2 = \mathrm{RSS} / n$
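The least-squares form drops an additive constant from −2 log-likelihood, so it does not match R's `AIC()` value for the same model, but the constant is identical for every model fitted to the same data and cancels in differences. The sketch below checks this using the built-in `mtcars` data as a stand-in; the dataset and the model are assumptions made only for illustration.

```r
# Stand-in example on built-in data (not from the presentation).
fit <- lm(mpg ~ wt, data = mtcars)

n          <- nobs(fit)
K          <- length(coef(fit)) + 1      # coefficients + residual variance
RSS        <- sum(residuals(fit)^2)
sigma2_hat <- RSS / n                    # ML estimate of residual variance

aic_ls   <- n * log(sigma2_hat) + 2 * K  # least-squares form, no correction
aic_full <- AIC(fit)                     # R's value from the full likelihood

# The gap is the constant n * (log(2 * pi) + 1), the same for any
# least-squares model fitted to these n observations.
aic_full - aic_ls
n * (log(2 * pi) + 1)

# Small-sample version, as on the slide.
aicc_ls <- n * log(sigma2_hat) + 2 * K * (n / (n - K - 1))
```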

Counting Parameters:

K = number of parameters estimated

Least Squares Regression: K = number of predictor coefficients + 2 (for the intercept and the residual variance σ²)

Logistic Regression: K = number of predictor coefficients + 1 (for the intercept)
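In R, the parameter count used by `AIC()` is stored as the `df` attribute of `logLik()`, which gives a quick check on a hand count. The sketch below uses the built-in `mtcars` data with one predictor in each model; the dataset and models are stand-ins, not from the presentation.

```r
# One predictor in each model, built-in data as a stand-in.
ls_fit    <- lm(mpg ~ wt, data = mtcars)
logit_fit <- glm(am ~ wt, data = mtcars, family = binomial)

# Least squares: 1 slope + intercept + residual variance
K_ls <- attr(logLik(ls_fit), "df")

# Logistic regression: 1 slope + intercept (no variance parameter)
K_logit <- attr(logLik(logit_fit), "df")

c(least_squares = K_ls, logistic = K_logit)
```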

Comparing Models

Model selection based on AICc:

       K    AICc  Delta_AICc  AICcWt  Cum.Wt      LL
mod4   4  112.98        0.00    0.71    0.71  -51.99
mod7   5  114.89        1.91    0.27    0.98  -51.67
mod1   3  121.52        8.54    0.01    0.99  -57.47
mod5   4  122.27        9.29    0.01    1.00  -56.64
mod2   3  125.93       12.95    0.00    1.00  -59.67
mod6   4  128.34       15.36    0.00    1.00  -59.67
mod3   3  141.26       28.28    0.00    1.00  -67.34

Model 1 = "ha.ln ~ SEX"
Model 2 = "ha.ln ~ lc2"
Model 3 = "ha.ln ~ weeks"
Model 4 = "ha.ln ~ SEX + lc2"
Model 5 = "ha.ln ~ SEX + weeks"
Model 6 = "ha.ln ~ lc2 + weeks"
Model 7 = "ha.ln ~ SEX + lc2 + weeks"
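The Delta_AICc, AICcWt and Cum.Wt columns can be recomputed from the AICc column alone. The sketch below copies the AICc values from the table and applies the definitions Δi = AICci − AICcmin and wi = exp(−0.5 Δi) / Σ exp(−0.5 Δr); the variable names are illustrative.

```r
# AICc values copied from the model selection table.
aicc <- c(mod4 = 112.98, mod7 = 114.89, mod1 = 121.52, mod5 = 122.27,
          mod2 = 125.93, mod6 = 128.34, mod3 = 141.26)

delta <- aicc - min(aicc)                          # AICc differences
w     <- exp(-0.5 * delta) / sum(exp(-0.5 * delta))  # Akaike weights

print(round(data.frame(AICc = aicc, Delta_AICc = delta,
                       AICcWt = w, Cum.Wt = cumsum(w)), 2))

# Evidence ratio of the top model to the second-ranked model (about 2.6).
w[["mod4"]] / w[["mod7"]]
```

An evidence ratio of roughly 2.6 means the data support mod4 over mod7, but not overwhelmingly, which is one reason to average over models rather than rely on the single best one.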

Model Averaging Predictions

$$\hat{\bar{Y}} = \sum_{i=1}^{R} w_i \hat{Y}_i$$

where $\hat{\bar{Y}}$ = model-averaged prediction, $\hat{Y}_i$ = prediction from model i, and $w_i$ = weight of model i
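A model-averaged prediction is a weighted sum of each model's prediction, with the Akaike weights as the weights. The sketch below is an illustration under assumptions: it uses three least-squares models on the built-in `mtcars` data and a made-up new observation, none of which come from the presentation.

```r
# Three stand-in candidate models on built-in data.
fits <- list(
  g1 = lm(mpg ~ wt, data = mtcars),
  g2 = lm(mpg ~ wt + hp, data = mtcars),
  g3 = lm(mpg ~ wt + hp + qsec, data = mtcars)
)

# AICc for each model from its log-likelihood, K and n.
aicc <- sapply(fits, function(m) {
  ll <- logLik(m)
  K  <- attr(ll, "df")
  n  <- nobs(m)
  -2 * as.numeric(ll) + 2 * K * n / (n - K - 1)
})

# Akaike weights.
delta <- aicc - min(aicc)
w     <- exp(-0.5 * delta) / sum(exp(-0.5 * delta))

# Prediction from each model for one hypothetical new observation.
new_car <- data.frame(wt = 3, hp = 120, qsec = 18)
y_hat   <- sapply(fits, function(m) as.numeric(predict(m, newdata = new_car)))

# Model-averaged prediction: sum over models of weight * prediction.
y_bar <- sum(w * y_hat)
round(rbind(weight = w, prediction = y_hat), 3)
y_bar
```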

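Model Averaging Parameters

Model-averaged parameter estimate:

$$\hat{\bar{\theta}} = \sum_{i=1}^{R} w_i \hat{\theta}_i$$

Unconditional Variance Estimator

$$\widehat{\mathrm{var}}(\hat{\bar{\theta}}) = \left[ \sum_{i=1}^{R} w_i \sqrt{\widehat{\mathrm{var}}(\hat{\theta}_i \mid g_i) + (\hat{\theta}_i - \hat{\bar{\theta}})^2} \right]^2$$

$$95\%\ \mathrm{CI} = \hat{\bar{\theta}} \pm 1.96 \times \mathrm{SE}, \qquad \mathrm{SE} = \sqrt{\widehat{\mathrm{var}}(\hat{\bar{\theta}})}$$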
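The unconditional standard error combines each model's own sampling variance with the spread of the estimates across models. The sketch below works through the arithmetic with made-up weights, estimates and standard errors for three models; every number is hypothetical.

```r
# Hypothetical inputs for three models (illustration only).
w         <- c(0.71, 0.27, 0.02)   # Akaike weights, sum to 1
theta_hat <- c(1.20, 1.05, 0.90)   # estimate of the same parameter per model
se        <- c(0.30, 0.28, 0.35)   # conditional SE of each estimate

# Model-averaged estimate.
theta_bar <- sum(w * theta_hat)

# Unconditional SE: weighted sum of sqrt(within-model variance +
# squared deviation from the averaged estimate).
se_uncond  <- sum(w * sqrt(se^2 + (theta_hat - theta_bar)^2))
var_uncond <- se_uncond^2

# Approximate 95% confidence interval.
ci <- theta_bar + c(-1, 1) * 1.96 * se_uncond

theta_bar
se_uncond
ci
```

The unconditional SE is never smaller than the weighted average of the conditional SEs, because it also carries model selection uncertainty.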
