2013 Workshop on Quantitative Evaluation of Downscaled Data
description
Transcript of 2013 Workshop on Quantitative Evaluation of Downscaled Data
![Page 1: 2013 Workshop on Quantitative Evaluation of Downscaled Data](https://reader036.fdocuments.net/reader036/viewer/2022062410/56816439550346895dd60115/html5/thumbnails/1.jpg)
Keith Dixon1, Katharine Hayhoe2, John Lanzante1, Anne Stoner2, Aparna Radhakrishnan3, V. Balaji4, Carlos Gaitán5
“Perfect Model” Experiments: Testing Stationarity in Statistical
Downscaling (NCPP Protocol 2)
IS PAST PERFORMANCE AN INDICATION OF FUTURE RESULTS?
1 2 3 DRC Inc. 4 Princeton Univ. 5 Univ. of Oklahoma
2013 Workshop on QuantitativeEvaluation of Downscaled Data
![Page 2: 2013 Workshop on Quantitative Evaluation of Downscaled Data](https://reader036.fdocuments.net/reader036/viewer/2022062410/56816439550346895dd60115/html5/thumbnails/2.jpg)
Goals of this presentation1. Define the ‘stationarity assumption’ inherent to
statistical downscaling of future climate projections.2. Present our ‘perfect model’ (aka ‘big brother’) approach
to quantitatively assess the extent to which the stationarity assumption holds.
3. Illustrate with a few examples the kind of results one can generate using this evaluation framework (Anne Stoner will show more detail next…)
4. Introduce options to extend and supplement the method (setting the hurdle at different heights).
5. Invite statistical downscalers to consider testing their methods within the perfect model framework.
6. Garner feedback from workshop participants.
![Page 3: 2013 Workshop on Quantitative Evaluation of Downscaled Data](https://reader036.fdocuments.net/reader036/viewer/2022062410/56816439550346895dd60115/html5/thumbnails/3.jpg)
This project does not produce any data files that one would use in real world applications.
DATA INFO KNOW-LEDGE
WISDOM
The aim at GFDL is to gain knowledge about aspects of commonly used SD methods (& GCMS), and to communicate that knowledge so that better informed decisions can be made.
![Page 4: 2013 Workshop on Quantitative Evaluation of Downscaled Data](https://reader036.fdocuments.net/reader036/viewer/2022062410/56816439550346895dd60115/html5/thumbnails/4.jpg)
REAL WORLD APPLICATION:
? ? ?
Start with 3 types of data sets … lacking observations of the futureCompute transform functions in training step (1 example below) Produce downscaled refinements of GCM (historical and/or future)Skill: Compare Obs to SD historical output (e.g., cross validation)Cannot evaluate future skill -- left assuming transform functionsapply equally well to past & future -- “The Stationarity Assumption”
TARGET
PREDICTORS
![Page 5: 2013 Workshop on Quantitative Evaluation of Downscaled Data](https://reader036.fdocuments.net/reader036/viewer/2022062410/56816439550346895dd60115/html5/thumbnails/5.jpg)
“PERFECT MODEL” EXPERIMENTAL DESIGN:Start with 4 types of data sets – Hi-res GCM output as proxy for obs& coarsened version of Hi-res GCM output as proxy for usual GCM
? ? ?
Proxy for observations in real-world applicationProxy for GCM output in real-world application
![Page 6: 2013 Workshop on Quantitative Evaluation of Downscaled Data](https://reader036.fdocuments.net/reader036/viewer/2022062410/56816439550346895dd60115/html5/thumbnails/6.jpg)
Daily time resolution~25km grid spacing
194 x 114 grid22k pts
![Page 7: 2013 Workshop on Quantitative Evaluation of Downscaled Data](https://reader036.fdocuments.net/reader036/viewer/2022062410/56816439550346895dd60115/html5/thumbnails/7.jpg)
~200km grid spacing64:1
![Page 8: 2013 Workshop on Quantitative Evaluation of Downscaled Data](https://reader036.fdocuments.net/reader036/viewer/2022062410/56816439550346895dd60115/html5/thumbnails/8.jpg)
~64 of the smaller25km resolution grid cells fit withinthe coarser cell
![Page 9: 2013 Workshop on Quantitative Evaluation of Downscaled Data](https://reader036.fdocuments.net/reader036/viewer/2022062410/56816439550346895dd60115/html5/thumbnails/9.jpg)
![Page 10: 2013 Workshop on Quantitative Evaluation of Downscaled Data](https://reader036.fdocuments.net/reader036/viewer/2022062410/56816439550346895dd60115/html5/thumbnails/10.jpg)
Daily time resolution~25km grid spacing
194 x 114 grid
![Page 11: 2013 Workshop on Quantitative Evaluation of Downscaled Data](https://reader036.fdocuments.net/reader036/viewer/2022062410/56816439550346895dd60115/html5/thumbnails/11.jpg)
![Page 12: 2013 Workshop on Quantitative Evaluation of Downscaled Data](https://reader036.fdocuments.net/reader036/viewer/2022062410/56816439550346895dd60115/html5/thumbnails/12.jpg)
Follow the same interpolation/regriddingsequence to produce coarsened data sets for the future climate projections
![Page 13: 2013 Workshop on Quantitative Evaluation of Downscaled Data](https://reader036.fdocuments.net/reader036/viewer/2022062410/56816439550346895dd60115/html5/thumbnails/13.jpg)
“PERFECT MODEL” EXPERIMENTAL DESIGN:Compute transform functions in training step (1 example below) Produce downscaled output for historical and future time periods Can directly evaluate skill both for the historical period and the future using the Hi Res GCM output as “truth” – Test Stationarity
QuantitativeTests of Stationarity: The extent to which
SKILL computed for the FUTURE CLIMATE PROJECTIONS is diminished relative to the
SKILL computed for the HISTORICAL PERIOD
![Page 14: 2013 Workshop on Quantitative Evaluation of Downscaled Data](https://reader036.fdocuments.net/reader036/viewer/2022062410/56816439550346895dd60115/html5/thumbnails/14.jpg)
GFDL-HiRAM-C360 experimentsAll data used in the perfect model testswere derived from GFDL-HiRAM-C360 (C360) model simulations conducted following CMIP5 high resolution time slice experiment protocols.
We consider a pair of 30-year long “historical”experiments (1979-2008), labeled H2 and H2here…
For the period 2086-2095, we make useof a pair of 3 member ensembles -- a total of six 10-year long experiments, identified as the “C” and “E” ensemblesto the right (C warms more than E)...
![Page 15: 2013 Workshop on Quantitative Evaluation of Downscaled Data](https://reader036.fdocuments.net/reader036/viewer/2022062410/56816439550346895dd60115/html5/thumbnails/15.jpg)
Same GCM used to generate 2 sets of future projections (3 members each)1979-2008 model climatology
2086-2095 projection “C” (3x10yr)
2086-2095 projection “E” (3x10yr)
"E" Regional Warming [K]
"C" Regional Warming [K]
0 1 2 3 4 5 6 7 8
4.1
6.2
5.0
7.2
Land All pts
![Page 16: 2013 Workshop on Quantitative Evaluation of Downscaled Data](https://reader036.fdocuments.net/reader036/viewer/2022062410/56816439550346895dd60115/html5/thumbnails/16.jpg)
Q: How High a Hurdle does this Perfect Model approach present?
A: Varies geographically,by variable of interest,time period of interest, etc.
![Page 17: 2013 Workshop on Quantitative Evaluation of Downscaled Data](https://reader036.fdocuments.net/reader036/viewer/2022062410/56816439550346895dd60115/html5/thumbnails/17.jpg)
Coarsened data isslightly cooler withSlightly smaller std dev.… not too challenging
August Daily Max Temperaturesfor a point in Oklahoma
(2.5C histogram bins)
![Page 18: 2013 Workshop on Quantitative Evaluation of Downscaled Data](https://reader036.fdocuments.net/reader036/viewer/2022062410/56816439550346895dd60115/html5/thumbnails/18.jpg)
approx. +7Cmean warming
August Daily Max Temperaturesfor a point in Oklahoma
(2.5C histogram bins)
![Page 19: 2013 Workshop on Quantitative Evaluation of Downscaled Data](https://reader036.fdocuments.net/reader036/viewer/2022062410/56816439550346895dd60115/html5/thumbnails/19.jpg)
August Daily Max Temperaturesfor a point just NE of San Fancisco
Coarsened data ismore ‘maritime’ &HiRes target is more ‘continental’.
Presents more of achallenge during thehistorical period than the Oklahoma example.
(2.5C histogram bins)
![Page 20: 2013 Workshop on Quantitative Evaluation of Downscaled Data](https://reader036.fdocuments.net/reader036/viewer/2022062410/56816439550346895dd60115/html5/thumbnails/20.jpg)
August Daily Max Temperaturesfor a point just NE of San Fancisco
(2.5C histogram bins)
![Page 21: 2013 Workshop on Quantitative Evaluation of Downscaled Data](https://reader036.fdocuments.net/reader036/viewer/2022062410/56816439550346895dd60115/html5/thumbnails/21.jpg)
Note that in several locations there is a tendency for ARRM cross-validated downscaled output to have too low standard deviations just offshore and too high std. dev. over coastal land
Map produced by NCPP Evaluation Environment ‘Machinery’
![Page 22: 2013 Workshop on Quantitative Evaluation of Downscaled Data](https://reader036.fdocuments.net/reader036/viewer/2022062410/56816439550346895dd60115/html5/thumbnails/22.jpg)
NEXT: A sampling of results…
Intended to be illustrative…not exhaustive, nor systematic.
Anne Stoner (TTU) will show more.
![Page 23: 2013 Workshop on Quantitative Evaluation of Downscaled Data](https://reader036.fdocuments.net/reader036/viewer/2022062410/56816439550346895dd60115/html5/thumbnails/23.jpg)
Area Mean Time Mean Absolute Downscaling ErrorsΣ | (Downscaled Estimate – HiRes GCM) |
(NumDays)
1979-2008 "C" 2086-2095 "E" 2086-20950.0
0.2
0.4
0.6
0.8
1.0
1.2
1.4
1.6
0.63
0.900.760.82
1.14
0.98
All Pts Land
+40% +20%
Start with a summary: ARRM downscaling errors are larger for daily max temp at end of 21stC than for 1979-2008
![Page 24: 2013 Workshop on Quantitative Evaluation of Downscaled Data](https://reader036.fdocuments.net/reader036/viewer/2022062410/56816439550346895dd60115/html5/thumbnails/24.jpg)
Mean Absolute downscaling Error (MAE) during 1979-2008Σ | (Downscaled Estimate – HiRes GCM) |
(60 * 365)
1979-2008
0.0
0.2
0.4
0.6
0.8
1.0
1.2
1.4
1.6All PtsLand
Geographic Variations: Downscaling MAE for 1979-2008
![Page 25: 2013 Workshop on Quantitative Evaluation of Downscaled Data](https://reader036.fdocuments.net/reader036/viewer/2022062410/56816439550346895dd60115/html5/thumbnails/25.jpg)
Mean absolute downscaling error during 2086-2095“E” Projections (3 member ensemble)
"E" 2086-2095
0.0
0.2
0.4
0.6
0.8
1.0
1.2
1.4
1.6All PtsLand
Geographic Variations: MAE pattern for “E” projections (+5,4C)
![Page 26: 2013 Workshop on Quantitative Evaluation of Downscaled Data](https://reader036.fdocuments.net/reader036/viewer/2022062410/56816439550346895dd60115/html5/thumbnails/26.jpg)
Mean absolute downscaling error during 2086-2095“C” Projections (3 member ensemble)
"C" 2086-2095
0.0
0.2
0.4
0.6
0.8
1.0
1.2
1.4
1.6All PtsLand
Geographic Variations: MAE pattern for “C” projections (+7,6C)
![Page 27: 2013 Workshop on Quantitative Evaluation of Downscaled Data](https://reader036.fdocuments.net/reader036/viewer/2022062410/56816439550346895dd60115/html5/thumbnails/27.jpg)
Mean Climate Change Signal Difference (ARRM minus Target) “C” Projection
Geographic Variations: Bias pattern for “C” projections (+7,6C)
![Page 28: 2013 Workshop on Quantitative Evaluation of Downscaled Data](https://reader036.fdocuments.net/reader036/viewer/2022062410/56816439550346895dd60115/html5/thumbnails/28.jpg)
J F M A M J J A S O N D
Red = “C” Blue=“E”
J F M A M J J A S O N DEntire US48 Domain
Looking at how well the stationarity assumption holds in different seasons
Where and when ratio=1.0, the stationarity assumption fully holds (i.e., no degradation in mean absolute downscaling error during 2085-2095 vs. the 1979-2008 period using in training.)
4
3
2
1
tasmax
![Page 29: 2013 Workshop on Quantitative Evaluation of Downscaled Data](https://reader036.fdocuments.net/reader036/viewer/2022062410/56816439550346895dd60115/html5/thumbnails/29.jpg)
…comparing near-coastal land vs. interior
Looking at how well the stationarity assumption holds in different seasons
![Page 30: 2013 Workshop on Quantitative Evaluation of Downscaled Data](https://reader036.fdocuments.net/reader036/viewer/2022062410/56816439550346895dd60115/html5/thumbnails/30.jpg)
J F M A M J J A S O N DBlue= coastal SC-CSCBlack = full US48+ domainGreen = interior SC-CSC
Looking at how well the stationarity assumption holds in different seasons
“C” ensemble
4
3
2
1
![Page 31: 2013 Workshop on Quantitative Evaluation of Downscaled Data](https://reader036.fdocuments.net/reader036/viewer/2022062410/56816439550346895dd60115/html5/thumbnails/31.jpg)
“C” ensembletasmax
A clear intra-month MAE trend in some but not all months
Looking at how well the stationarity assumption holds in different seasons
![Page 32: 2013 Workshop on Quantitative Evaluation of Downscaled Data](https://reader036.fdocuments.net/reader036/viewer/2022062410/56816439550346895dd60115/html5/thumbnails/32.jpg)
“sawtooth” has lower values in cooler part of month, higher in warmer part of month
“C” ensembletasmax
![Page 33: 2013 Workshop on Quantitative Evaluation of Downscaled Data](https://reader036.fdocuments.net/reader036/viewer/2022062410/56816439550346895dd60115/html5/thumbnails/33.jpg)
Options for extending this ‘perfect model-based’ exploration of statistical downscaling stationarity
More SDMethods
More ClimateVariables & Indices
Different HiResClimate Models
Different Emissions Scenarios & Times
Use of SyntheticTime Series
Alter GCM-baseddata in known waysto ‘Raise the Bar’
Mix & Match GCMs’Target & Predictors
?? MoreIdeas ??
![Page 34: 2013 Workshop on Quantitative Evaluation of Downscaled Data](https://reader036.fdocuments.net/reader036/viewer/2022062410/56816439550346895dd60115/html5/thumbnails/34.jpg)
GFDL-HiRAM-C360 experimentsData sets used in the perfect model tests are beingmade available to the community. This enables others to use our exp design to test the stationarity assumption for their own SD method.Also, we invite SD developers to consider collaborating with us, so that their techniques &perspectives may be more fully incorporated into this evolving perfect model-based research & assessment system.For more info, visit www.gfdl.noaa.gov/esd_eval &www.earthsystemcog.org/projects/gfdl-perfectmodel
![Page 35: 2013 Workshop on Quantitative Evaluation of Downscaled Data](https://reader036.fdocuments.net/reader036/viewer/2022062410/56816439550346895dd60115/html5/thumbnails/35.jpg)
In other words, by accessing these types of files, folks can generate their own SD results and test how well the stationarity assumption holds for their favorite SD method.
from us
from us
![Page 36: 2013 Workshop on Quantitative Evaluation of Downscaled Data](https://reader036.fdocuments.net/reader036/viewer/2022062410/56816439550346895dd60115/html5/thumbnails/36.jpg)
In other words, by accessing these types of files, folks can generate their own SD results and test how well the stationarity assumption holds for their favorite SD method.
from us
SD method
of choice
from us
![Page 37: 2013 Workshop on Quantitative Evaluation of Downscaled Data](https://reader036.fdocuments.net/reader036/viewer/2022062410/56816439550346895dd60115/html5/thumbnails/37.jpg)
Obviously, one does not need to perform tests using all 22,000+ grid points. Below we indicate the locations of
a set of 16 points we’ve used for some development work.(color areas are 3x3 with grid point of interest in the center)
![Page 38: 2013 Workshop on Quantitative Evaluation of Downscaled Data](https://reader036.fdocuments.net/reader036/viewer/2022062410/56816439550346895dd60115/html5/thumbnails/38.jpg)
Anne Stoner and Katharine Hayhoe with more ‘perfect model’ analyses…
![Page 39: 2013 Workshop on Quantitative Evaluation of Downscaled Data](https://reader036.fdocuments.net/reader036/viewer/2022062410/56816439550346895dd60115/html5/thumbnails/39.jpg)
Keith Dixon1, Katharine Hayhoe2, John Lanzante1, Anne Stoner2, Aparna Radhakrishnan3, V. Balaji4, Carlos Gaitán5
“Perfect Model” Experiments: Testing Stationarity in Statistical
Downscaling (NCPP Protocol 2)
IS PAST PERFORMANCE AN INDICATION OF FUTURE RESULTS?
1 2 3 DRC Inc. 4 Princeton Univ. 5 Univ. of Oklahoma
2013 Workshop on QuantitativeEvaluation of Downscaled Data
![Page 40: 2013 Workshop on Quantitative Evaluation of Downscaled Data](https://reader036.fdocuments.net/reader036/viewer/2022062410/56816439550346895dd60115/html5/thumbnails/40.jpg)
“PERFECT MODEL” EXPERIMENTAL DESIGN:Compute transform functions in training step (1 example below) Produce downscaled output for historical and future time periods Can directly evaluate skill both for the historical period and the future using the Hi Res GCM output as “truth” – Test Stationarity
![Page 41: 2013 Workshop on Quantitative Evaluation of Downscaled Data](https://reader036.fdocuments.net/reader036/viewer/2022062410/56816439550346895dd60115/html5/thumbnails/41.jpg)
Options for extending this ‘perfect model-based’ exploration of statistical downscaling stationarity
More SDMethods
More ClimateVariables & Indices
Different HiResClimate Models
Different Emissions Scenarios & Times
Use of SyntheticTime Series
Alter GCM-baseddata in known waysto ‘Raise the Bar’
Mix & Match GCMs’Target & Predictors
?? MoreIdeas ??
![Page 42: 2013 Workshop on Quantitative Evaluation of Downscaled Data](https://reader036.fdocuments.net/reader036/viewer/2022062410/56816439550346895dd60115/html5/thumbnails/42.jpg)
This project does not produce any data files that one would use in real world applications.
DATA INFO KNOW-LEDGE WISDOM
The aim at GFDL is to gain knowledge about aspects of commonly used SD methods (& GCMS), and to communicate that knowledge so that better informed decisions can be made.
Odds-n-ends from day 1* Not all SD codes run on a cell phone* Not All GCMs have 200km atmos grid (50km)* GCMs are developed primarily as research tools – not prediction tools
![Page 43: 2013 Workshop on Quantitative Evaluation of Downscaled Data](https://reader036.fdocuments.net/reader036/viewer/2022062410/56816439550346895dd60115/html5/thumbnails/43.jpg)
Goals of this presentation1. Define the ‘stationarity assumption’ inherent to
statistical downscaling of future climate projections.2. Present our ‘perfect model’ (aka ‘big brother’) approach
to quantitatively assess the extent to which the stationarity assumption holds.
3. Illustrate with a few examples the kind of results one can generate using this evaluation framework (Anne Stoner will show more detail next…)
4. Introduce options to extend and supplement the method (setting the hurdle at different heights).
5. Invite statistical downscalers to consider testing their methods within the perfect model framework.
6. Garner feedback from workshop participants.