Cliff Notes on Ecological Niche Modeling with RandomForest (ensembles)
Falk Huettmann
EWHALE lab, University of Alaska Fairbanks, AK 99775
Email fhuettmann@alaska.edu, Tel. 907 474 7882

Modeling Ecological Niches
[Figure, two panels: Geographic Space (axes: Longitude, Latitude), labeled Sampling Space; Ecological Space (axes: Environmental factor a, Environmental factor b), labeled Model Space => Predictions]

A Super Model
LM, GLM, GAM, CART, MARS, NN, GARP, TN, RF, GDM, Maxent, …
=> Ensembles

A starting point…
Linear regression
One formula capturing the data: y = a + bx
Response Variable ~ Predictor1 (Y ~ X)
[Figure: plot of Y against X, annotated with ‘Mean’ and SD]
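The formula y = a + bx can be fitted by ordinary least squares. The lines below are a minimal sketch in plain Python rather than anything taken from the slides; the function names and the five data points are made up for illustration.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx              # slope
    a = y_bar - b * x_bar      # intercept
    return a, b


def r_squared(x, y, a, b):
    """Share of the variance in y explained by the fitted line."""
    y_bar = mean(y)
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Hypothetical data: one predictor X, one response Y
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
a, b = fit_line(x, y)
r2 = r_squared(x, y, a, b)
print(f"y = {a:.2f} + {b:.2f}x, r2 = {r2:.3f}")
```

A single pair (a, b) describes the whole data set, which is what makes the linear model easy to report but rigid when the true response is not a straight line.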

Common Ground
A Multiple Regression framework
Response Variable ~ Predictor1 + Predictor2 + Predictor3…
Traditionally, we used 1-5 predictors
But: 1 to 1000s of predictors are possible
‘One single algorithm’ explains the relationship between the response and the predictors
The derived relationship can then be used to predict at other locations where the predictors are known

GLM vs CART etc.
GLM: Linear (~unrealistic); ‘Mean’ and SD => potentially low r2
CART, TreeNet & RandomForest (there are many other algorithms!): Non-Linear (driven by data); ‘Mean’? SD?

Our Free Algorithms …
RandomForest: R-Project; Fortran, C …
http://rweb.stat.umn.edu/R/library/randomForest/html/00Index.html
TreeNet: http://salford-systems.com/products.php (free 30 day trial)

Tree/CART - Family
Classification & Regression Tree (CART) => Binary recursive partitioning
Leo Breiman 1984, and others
[Figure: example tree with YES/NO branches and the splits Temp>15, Precip<100, Temp<5]

Tree/CART - Family
Leo Breiman 1984, and others
Binary splits: widely used concept
Multiple splits: rarely used, yet
Free of data assumptions! No significances.
Binary split recursive partitioning (the same predictor can re-occur elsewhere as a ‘splitter’)
Splits are chosen so that the resulting nodes are as homogeneous as possible (minimal within-node variance)
Stopping Rules for Number of Branches based on Optimization/Cross-validation
Terminal Nodes show Means (Regression Tree) or Categories (Classification Tree)
[Figure: example Classification Tree with terminal nodes labeled A, B, C, and example Regression Tree with terminal node means such as 0.3, 3, 0.1, 2, 2.3]
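The partitioning steps listed above can be sketched for a regression tree in a few lines of plain Python. This is an illustration under stated assumptions, not the Salford CART or rpart implementation: the data, the names `best_split`, `grow` and `predict`, and the minimum node size used as the stopping rule (in place of cross-validation) are all hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the node mean (0 for an empty node)."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def best_split(rows, target, predictors):
    """Try every predictor and threshold; keep the binary split with the
    lowest summed within-node squared error (most homogeneous nodes)."""
    best = None
    for p in predictors:
        levels = sorted({r[p] for r in rows})
        for lo, hi in zip(levels, levels[1:]):
            t = (lo + hi) / 2  # midpoint between consecutive observed values
            left = [r[target] for r in rows if r[p] <= t]
            right = [r[target] for r in rows if r[p] > t]
            cost = sse(left) + sse(right)
            if best is None or cost < best[0]:
                best = (cost, p, t)
    return best


def grow(rows, target, predictors, min_rows=2):
    """Recursive binary partitioning; a predictor may be reused further down."""
    node_mean = sum(r[target] for r in rows) / len(rows)
    split = best_split(rows, target, predictors) if len(rows) >= 2 * min_rows else None
    if split is None:
        return {"mean": node_mean}            # terminal node shows the mean
    _, p, t = split
    left = [r for r in rows if r[p] <= t]
    right = [r for r in rows if r[p] > t]
    if len(left) < min_rows or len(right) < min_rows:
        return {"mean": node_mean}            # simplified stopping rule
    return {"predictor": p, "threshold": t,
            "yes": grow(left, target, predictors, min_rows),
            "no": grow(right, target, predictors, min_rows)}


def predict(tree, row):
    """Follow the decision rules until a terminal node is reached."""
    while "mean" not in tree:
        tree = tree["yes"] if row[tree["predictor"]] <= tree["threshold"] else tree["no"]
    return tree["mean"]


# Hypothetical cases: two predictors and a continuous response
rows = [
    {"temp": 2, "precip": 80, "abundance": 0.1},
    {"temp": 4, "precip": 120, "abundance": 0.2},
    {"temp": 10, "precip": 90, "abundance": 0.3},
    {"temp": 12, "precip": 150, "abundance": 0.4},
    {"temp": 16, "precip": 60, "abundance": 2.0},
    {"temp": 18, "precip": 110, "abundance": 2.2},
    {"temp": 20, "precip": 70, "abundance": 2.4},
    {"temp": 22, "precip": 130, "abundance": 2.3},
]
tree = grow(rows, "abundance", ["temp", "precip"])
print(tree["predictor"], tree["threshold"])
print(predict(tree, {"temp": 3, "precip": 100}))
```

For a classification tree the same search would score splits by class purity instead of squared error, and terminal nodes would report the majority category.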

CART Salford (rpart in R)
Nice to interpret (e.g. for small trees, or when following through specific decision rules until the end)
[Figure: Relative Cost (0.70 to 0.90) against Number of Nodes (0 to 500), with the Optimum marked]
Variable    Importance Value
DEM         100.00
TAIR_AUG     77.58
PREC_AUG     69.46
HYDRO        54.59
POP          47.39
LDUSE        40.88
ROC curves for accuracy tests (from withheld Test Data)
e.g. correctly predicted absence approx. 77%
e.g. correctly predicted presence approx. 85%
=> Apply to a dataset for predictions
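The share of correctly predicted presences and absences, and the area under the ROC curve, can be computed from withheld test data roughly as follows. This is a sketch in plain Python; the observations, the predicted scores and the 0.5 cut-off are made up and are not the data behind the percentages quoted above.

```python
def presence_absence_accuracy(observed, predicted):
    """Return (share of presences predicted correctly,
    share of absences predicted correctly); 1 = presence, 0 = absence."""
    tp = sum(1 for o, p in zip(observed, predicted) if o == 1 and p == 1)
    tn = sum(1 for o, p in zip(observed, predicted) if o == 0 and p == 0)
    n_pres = sum(1 for o in observed if o == 1)
    n_abs = len(observed) - n_pres
    return tp / n_pres, tn / n_abs


def auc(observed, scores):
    """Area under the ROC curve in its rank form: the probability that a
    randomly chosen presence scores higher than a randomly chosen absence
    (ties count one half)."""
    pres = [s for o, s in zip(observed, scores) if o == 1]
    absn = [s for o, s in zip(observed, scores) if o == 0]
    wins = sum(1.0 if p > a else 0.5 if p == a else 0.0
               for p in pres for a in absn)
    return wins / (len(pres) * len(absn))


# Hypothetical withheld test data: observed presence/absence and model scores
observed = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
predicted = [1 if s >= 0.5 else 0 for s in scores]   # assumed 0.5 cut-off

presence_ok, absence_ok = presence_absence_accuracy(observed, predicted)
area = auc(observed, scores)
print(presence_ok, absence_ok, area)
```

The two accuracy shares depend on the chosen cut-off, while the AUC summarizes performance over all possible cut-offs.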

TreeNet (~A sequence of CARTs) ‘boosting’
[Figure: a sequence of small trees added together (+ + + +); each explains remaining variance]
The more nodes… the more detail… the slower
Many trees make for a ‘net of trees’, or ‘a forest’ => Leo Breiman + Data Mining
Difficult to interpret but good graphs
Variable    Score (Importance Value)
LDUSE       100.00
TAIR_AUG     97.62
HYDRO        94.35
DEM          94.01
PREC_AUG     90.17
POP          82.54
HMFPT        81.46
[Figure: Risk (0.0 to 0.4) against Number of Trees (0 to 110)]
[Figure: ROC; Pct. Class 1 (0 to 100) against Pct. Population (0 to 100)]
ROC curves for accuracy tests
e.g. correctly predicted absence approx. 97%
e.g. correctly predicted presence approx. 92%
=> Apply to a dataset for predictions
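The idea that each tree in the sequence explains the remaining variance can be sketched as boosting with one-split trees (stumps) fitted to the current residuals. This is a generic illustration in plain Python and not the TreeNet software; the data, the learning rate and the names `fit_stump`, `boost` and `boost_predict` are assumptions made for the example.

```python
def fit_stump(x, residuals):
    """Best one-split tree on a single predictor, by squared error."""
    best = None
    levels = sorted(set(x))
    for lo, hi in zip(levels, levels[1:]):
        t = (lo + hi) / 2
        left = [r for xi, r in zip(x, residuals) if xi <= t]
        right = [r for xi, r in zip(x, residuals) if xi > t]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        cost = sum((r - ml) ** 2 for r in left) + sum((r - mr) ** 2 for r in right)
        if best is None or cost < best[0]:
            best = (cost, t, ml, mr)
    return best[1], best[2], best[3]


def boost(x, y, n_trees=100, learn_rate=0.1):
    """Add stumps one after another; each is fitted to what the
    previous ones left unexplained."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps, risk = [], []
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        t, ml, mr = fit_stump(x, residuals)
        stumps.append((t, ml, mr))
        pred = [pi + learn_rate * (ml if xi <= t else mr) for xi, pi in zip(x, pred)]
        # training risk (mean squared error) after this many trees
        risk.append(sum((yi - pi) ** 2 for yi, pi in zip(y, pred)) / len(y))
    return {"base": base, "learn_rate": learn_rate, "stumps": stumps, "risk": risk}


def boost_predict(model, xi):
    """Sum the contributions of all stumps for one predictor value."""
    total = sum(ml if xi <= t else mr for t, ml, mr in model["stumps"])
    return model["base"] + model["learn_rate"] * total


# Hypothetical hump-shaped response that a straight line cannot follow
x = list(range(10))
y = [0.1, 0.1, 0.2, 0.2, 0.9, 1.0, 1.0, 0.3, 0.2, 0.1]
model = boost(x, y)
print(model["risk"][0], model["risk"][-1])
```

Plotting `model["risk"]` against the number of trees gives a training-data curve of the same kind as the Risk plot; in practice the risk would be tracked on withheld data to decide when to stop adding trees.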

TreeNet: Graphic Output example
[Figure: Response Curve; Bear Occurrence (Partial Dependence), from no to yes, against Distance to Lake (m); annotated with ‘?’ or ‘?’]
(the function above is virtually impossible to fit in linear algorithms => misleading coefficients, e.g. from LMs, GLMs)

RandomForest (Prasad et al. 2006, Furlanello et al. 2003, Breiman 2001)
‘Boosting & Bagging’ algorithms (~Ensemble)
[Figure: data table with rows 1 to 5 (Cases) and columns DEM, Slope, Aspect, Climate, Land-cover (Predictors); Random set 1, Random set 2, … each combine a random set of Rows (Cases) with a random set of Columns (Predictors)]
Average Final Tree from e.g. >2000 trees, done by VOTING
Bagging: Optimization based on In-Bag, Out-of-Bag samples
In RF no pruning => Difficult to overfit (robust)
Handles ‘noise’, interactions and categorical data fine!
Difficult to interpret but good graphs
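The scheme above (random sets of rows, random sets of columns, many trees, voting, out-of-bag checks) can be sketched in plain Python. To keep it short, every ‘tree’ here is a single split, and the column subset is drawn once per tree; Breiman's algorithm grows full unpruned trees and draws a fresh subset at every split. The data, names and settings are hypothetical, and 500 trees are used instead of >2000 only to keep the example fast.

```python
import random
from collections import Counter


def majority(labels):
    return Counter(labels).most_common(1)[0][0]


def fit_stump(rows, labels, columns):
    """Best single binary split over the given columns, by misclassifications."""
    best = None
    for c in columns:
        levels = sorted({r[c] for r in rows})
        for lo, hi in zip(levels, levels[1:]):
            t = (lo + hi) / 2
            left = [l for r, l in zip(rows, labels) if r[c] <= t]
            right = [l for r, l in zip(rows, labels) if r[c] > t]
            ll, rl = majority(left), majority(right)
            errors = sum(l != ll for l in left) + sum(l != rl for l in right)
            if best is None or errors < best[0]:
                best = (errors, c, t, ll, rl)
    if best is None:                      # no split possible: constant answer
        m = majority(labels)
        return (columns[0], float("inf"), m, m)
    return best[1:]


def stump_predict(stump, row):
    c, t, ll, rl = stump
    return ll if row[c] <= t else rl


def random_forest(rows, labels, n_trees=500, n_cols=2, seed=1):
    rng = random.Random(seed)
    n, all_cols = len(rows), list(rows[0])
    forest, oob_votes = [], [Counter() for _ in range(n)]
    for _ in range(n_trees):
        in_bag = [rng.randrange(n) for _ in range(n)]   # random rows, with replacement
        cols = rng.sample(all_cols, n_cols)             # random columns
        stump = fit_stump([rows[i] for i in in_bag], [labels[i] for i in in_bag], cols)
        forest.append(stump)
        for i in set(range(n)) - set(in_bag):           # out-of-bag cases vote
            oob_votes[i][stump_predict(stump, rows[i])] += 1
    voted = [(i, v.most_common(1)[0][0]) for i, v in enumerate(oob_votes) if v]
    oob_error = sum(labels[i] != p for i, p in voted) / len(voted)
    return forest, oob_error


def forest_predict(forest, row):
    """Final answer by VOTING over all trees."""
    return majority([stump_predict(s, row) for s in forest])


# Hypothetical cases: elevation drives occurrence, slope and aspect are noise
dem = [100, 150, 200, 250, 300, 350, 900, 950, 1000, 1050, 1100, 1150]
slope = [5, 25, 10, 30, 15, 20, 6, 24, 11, 29, 14, 21]
aspect = [0, 90, 180, 270, 45, 135, 10, 100, 190, 280, 55, 145]
rows = [{"DEM": d, "Slope": s, "Aspect": a} for d, s, a in zip(dem, slope, aspect)]
labels = ["presence"] * 6 + ["absence"] * 6

forest, oob_error = random_forest(rows, labels)
print(oob_error, forest_predict(forest, {"DEM": 120, "Slope": 12, "Aspect": 60}))
```

The out-of-bag error comes for free from the cases each tree did not see, which is why no separate pruning or test split is needed to get a first accuracy estimate.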

RandomForest and GIS: Spatial Modeling
[Workflow figure: GIS Overlays -> Table (Predictors, Response) -> RandomForest (quantification): Train & Develop Model -> Apply Model -> GIS Visualization of Predictions]
aaahhhhuuhhhh ?!
- Makes sense because of...
- No, wait a minute, that’s wrong…

RandomForest: Why so good and usable?
Allows for:
Works multivariate (100s of predictors)
Best Possible Predictions
Best Possible Clustering (without a response variable)
Tracking of Complex Interactions
Predictor Ranking
Handling Noisy Data
Fast & convenient applications
Allows for multiple (!) response variables!
Algorithms: RandomForest (R, Fortran, Salford), yaImpute (R), party (R) …
=> Change in World’s Science

What to read, for instance…
http://www.stat.berkeley.edu/~breiman/RandomForests/
Breiman, L. 2001. Statistical modeling: the two cultures. Statistical Science 16(3): 199-231.
Craig, E., and F. Huettmann. 2008. Using “blackbox” algorithms such as TreeNet and Random Forests for data-mining and for finding meaningful patterns, relationships and outliers in complex ecological data: an overview, an example using golden eagle satellite data and an outlook for a promising future. Chapter IV in Intelligent Data Analysis: Developing New Methodologies through Pattern Discovery and Recovery (Hsiao-fan Wang, Ed.). IGI Global, Hershey, PA, USA.
Magness, D.R., F. Huettmann, and J.M. Morton. 2008. Using Random Forests to provide predicted species distribution maps as a metric for ecological inventory & monitoring programs. Pages 209-229 in T.G. Smolinski, M.G. Milanova & A-E. Hassanien (eds.). Applications of Computational Intelligence in Biology: Current Trends and Open Problems. Studies in Computational Intelligence, Vol. 22, Springer-Verlag Berlin Heidelberg. 428 pp.
Prasad, A.M., L.R. Iverson, and A. Liaw. 2006. Newer Classification and Regression Tree Techniques: Bagging and Random Forests for Ecological Prediction. Ecosystems 9: 181-199.
(and Hastie & Tibshirani, Furlanello et al. 2003, Elith et al. 2006 etc.)

From now on, simply referred to as …
A Super Model
LM, GLM, CART, MARS, NN, GARP, TN, RF, GDM, Maxent, …
=> Ensembles

Some Super Models: Ensembles
LM, GLM, CART, MARS, NN, GARP, TN, RF, GDM, Maxent, …
Find the best model for a given section of your data => the best possible fit & prediction
[Figure: Ivory Gull example; Pres/Abs against Predictors, with different sections of the data fitted by different models (RF, LM, log, poly)]

On Greyboxes, Philosophy and Science
[Diagram: Data -> Algorithm with a Known Behavior -> (Data Mining) Prediction & Accuracy]
Such a statistical relationship will be found by either CART, TN, RF or LM, GLM
GLMs as a blackbox!? YES. Just think of software implementations, Max-Likelihood, Model Fitting, AIC and Research Design (sensu Keating & Cherry 1994)
[Figure: Model Performance (0% to 100%) over time, from GLM to ANN to Boosting, Bagging …; improvement increases]

Parsimony, Inference and Prediction?!
Sole focus on predictions and their accuracy, whereas…
…R2, p-values and traditional inference (variable rankings, AIC) are of lower relevance
Why Parsimony?
No real need for optimizing the fit and for parsimony when prediction is the goal
Global accuracy metrics: ROC, AUC, kappa, meta analysis … (instead of p-values and significance levels or AIC)
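Of the global accuracy metrics named above, kappa is the quickest to compute by hand: it rescales the observed agreement between predictions and observations by the agreement expected from chance alone. A minimal sketch in plain Python, with made-up presence/absence data and a hypothetical function name:

```python
def cohens_kappa(observed, predicted):
    """Cohen's kappa: (observed agreement - chance agreement) / (1 - chance agreement).
    Undefined (division by zero) when chance agreement equals 1."""
    n = len(observed)
    categories = set(observed) | set(predicted)
    p_observed = sum(o == p for o, p in zip(observed, predicted)) / n
    p_chance = sum((observed.count(c) / n) * (predicted.count(c) / n)
                   for c in categories)
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical withheld test data: 1 = presence, 0 = absence
observed = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0]
kappa = cohens_kappa(observed, predicted)
print(kappa)
```

A kappa of 0 means the predictions agree with the observations no better than chance, and 1 means perfect agreement.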
[Figure: Relative Cost (0.70 to 0.90) against Number of Nodes (0 to 500)]