LeBauer: Data Driven Model Development
Data Driven Model Development
David LeBauer, Mike Dietze, Deepak Jaiswal, Rob Kooper, Stephen P. Long, Shawn Serbin, Dan Wang
Objective: Useful Predictions
Clark et al. 2001 Ecological Forecasts, An Emerging Imperative. Science
Precision, Accuracy
Information
Sources of Uncertainty
Schlesinger et al. 1979 Terminology for model credibility. Simulation.
[Image: Windows error screen, "An error has occurred"]
Technical Uncertainty: A Cautionary Tale
[Figure labels: Yield; Priors; + Trait Data; + Flux Data; + Latest Version; Observed; Annual Merge]
Best Practices
Write programs for people, not computers
Automate repetitive tasks
Use the computer to record history
Make incremental changes
Use version control
Don't repeat yourself (or others)
Plan for mistakes
Optimize software only after it works correctly
Document the design and purpose of code
Conduct code reviews
Wilson et al. 2012. Best Practices for Scientific Computing. arXiv:1210.0530v3
Best Practices 1: Automation
Altintas et al. 2004. Kepler: an extensible system for design and execution of scientific workflows. Proc 16th ICSSDM
Parameter Uncertainty: Test Case
Single Analysis:
Contribution of parameter uncertainty to uncertainty in switchgrass yield prediction.
LeBauer, Wang, Richter, Davidson, and Dietze 2013. Facilitating Feedbacks between ecological models and data. Ecological Monographs
Parameter Uncertainty: Automated
Contribution of parameter uncertainty to model uncertainty.
* 17 Plant functional types
* 6 biomes
* 8 scientists
* 6 months
Dietze, Serbin, Davidson, Desai, Feng, Kelly, Kooper, LeBauer, Mantooth, McHenry, and Wang. Submitted. A quantitative assessment of a terrestrial biosphere model's data needs across North American biomes. JGR
[Figure axis: % SD Explained]
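The idea of attributing prediction uncertainty to individual parameters can be illustrated with a minimal sketch. This is not the authors' analysis: it assumes a simple one-at-a-time approach in which each parameter is varied across its own draws while the others are held at their medians, and the resulting output variances are compared. The model `toy_yield_model`, the parameter names (`sla`, `vcmax`, `leaf_turnover`), and all distributions are hypothetical and chosen only for illustration.

```python
import random
import statistics


def toy_yield_model(params):
    """Hypothetical stand-in for a crop model (not a real yield model)."""
    return 0.5 * params["sla"] + 0.1 * params["vcmax"] + 2.0 * params["leaf_turnover"]


def variance_contributions(model, draws):
    """Fraction of output variance attributed to each parameter.

    Each parameter is varied across its draws while all other parameters
    are held at their medians (one-at-a-time, ignores interactions).
    """
    medians = {name: statistics.median(values) for name, values in draws.items()}
    partial_variance = {}
    for name, values in draws.items():
        outputs = [model({**medians, name: value}) for value in values]
        partial_variance[name] = statistics.pvariance(outputs)
    total = sum(partial_variance.values())
    if total == 0:
        raise ValueError("model output does not vary with any parameter")
    return {name: var / total for name, var in partial_variance.items()}


# Invented parameter distributions, for illustration only.
rng = random.Random(42)
draws = {
    "sla": [rng.gauss(20.0, 5.0) for _ in range(1000)],
    "vcmax": [rng.gauss(40.0, 2.0) for _ in range(1000)],
    "leaf_turnover": [rng.gauss(1.0, 0.1) for _ in range(1000)],
}
fractions = variance_contributions(toy_yield_model, draws)
for name, fraction in sorted(fractions.items(), key=lambda item: -item[1]):
    print(f"{name}: {100 * fraction:.1f}% of variance")
```

In this toy setup the widest, most influential parameter dominates the output variance, which is the kind of ranking that points to where additional data would reduce uncertainty most.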
Best Practices 2: Iteration with Testing
Wilson et al. 2012. Best Practices for Scientific Computing. arXiv:1210.0530v3
Benchmark Data
Aboveground Biomass
23 Calibration Sites
72 Observations
[Figure axis: Observed (Mg/ha)]
Results:
Start (C4 Grass)
+ C3 Photosynthesis
+ Perennial Stem
+ Fixed Respiration
+ Leaf Senescence
[Figure: model versions compared by RMSE*, Correlation, and Standard Deviation*; values labeled in the figure: 0.74, 0.67, 0.20, 1.46, 0.30, 0.87, 0.84]
*Scaled to sd_data = 1
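The three statistics used to compare model versions against the benchmark data (RMSE, correlation, and standard deviation, with the starred quantities scaled so that the standard deviation of the data equals 1) can be computed as in the sketch below. It assumes plain, not bias-corrected, RMSE and population standard deviations; the slides do not say which variants were used. The function name and the example values are hypothetical.

```python
import math
import statistics


def scaled_benchmark_metrics(observed, modeled):
    """Return RMSE and model SD scaled by the SD of the observations,
    plus the Pearson correlation between modeled and observed values."""
    if len(observed) != len(modeled) or len(observed) < 2:
        raise ValueError("need two equal-length series with at least 2 points")

    n = len(observed)
    mean_obs = statistics.fmean(observed)
    mean_mod = statistics.fmean(modeled)
    sd_obs = statistics.pstdev(observed)
    sd_mod = statistics.pstdev(modeled)
    if sd_obs == 0 or sd_mod == 0:
        raise ValueError("both series must vary")

    pairs = list(zip(observed, modeled))
    rmse = math.sqrt(sum((m - o) ** 2 for o, m in pairs) / n)
    covariance = sum((o - mean_obs) * (m - mean_mod) for o, m in pairs) / n

    return {
        "rmse_scaled": rmse / sd_obs,      # RMSE in units of sd_data
        "sd_scaled": sd_mod / sd_obs,      # 1.0 means model variability matches data
        "correlation": covariance / (sd_obs * sd_mod),
    }


# Hypothetical biomass values in Mg/ha, invented for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
modeled = [12.0, 18.0, 33.0, 37.0]
metrics = scaled_benchmark_metrics(observed, modeled)
perfect = scaled_benchmark_metrics(observed, observed)
print(metrics)
```

A model that reproduced the observations exactly would score a scaled RMSE of 0, a correlation of 1, and a scaled standard deviation of 1, which is the reference point each model version is judged against.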
Conclusions
* Best practices lead to more effective and efficient modeling
* Applied integration tests to support model development
* Controlling technical error produces more robust and accurate inference
Future Directions
* Track benchmark metrics for specific model runs
* Maintain ability to reproduce published results
* Automated testing with each code commit or major release
* Use current metrics to define the limits of model credibility
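One way to track benchmark metrics and test automatically on each commit is to store the metrics of a reference run and flag any metric that gets worse by more than a tolerance. The sketch below is an assumption about how such a check might look, not a description of the authors' tooling; the metric names, the tolerance, and every number are invented for illustration.

```python
# Direction of improvement for each tracked metric (hypothetical names).
HIGHER_IS_BETTER = {"correlation": True, "rmse_scaled": False}


def find_regressions(baseline, current, tolerance=0.05):
    """Return metrics that worsened by more than `tolerance`.

    The result maps metric name to a (baseline value, current value) pair.
    An empty dict means the current run is acceptable.
    """
    regressions = {}
    for name, higher_is_better in HIGHER_IS_BETTER.items():
        change = current[name] - baseline[name]
        worsened_by = -change if higher_is_better else change
        if worsened_by > tolerance:
            regressions[name] = (baseline[name], current[name])
    return regressions


# Invented metric values, for illustration only.
baseline = {"correlation": 0.80, "rmse_scaled": 0.60}
improved_run = {"correlation": 0.85, "rmse_scaled": 0.55}
degraded_run = {"correlation": 0.70, "rmse_scaled": 0.90}

ok = find_regressions(baseline, improved_run)
bad = find_regressions(baseline, degraded_run)
print("improved run regressions:", ok)
print("degraded run regressions:", bad)
```

Run from a test suite, a non-empty result would fail the build, so a code change that degrades agreement with the benchmark data is caught before it reaches a release.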