Verification Summit AMB verification: rapid feedback to guide model development decisions Patrick...

Verification Summit

AMB verification: rapid feedback to guide model development decisions

Patrick Hofmann, Bill Moninger, Steve Weygandt, Curtis Alexander, Susan Sahm

Motivation

There is a critical need for both rapid and comprehensive statistical and graphical verification of model forecasts from various AMB experimental models: RUC, RR, and HRRR

- Real-time parallel cycles as well as retrospective runs
- Two primary types:
  - Station verification: upper-air, surface, and clouds
  - Gridded verification: precipitation, radar reflectivity, convective probabilities
- Illuminate model biases and patterns in errors
- Essential for evaluating model/assimilation configuration changes

Rapid verification feedback enables timely improvement in forecast skill

Design Goals

- Fast computation and display of verification results (real-time for real-time cycles, a day or two for retros)
- Simple procedures, but with sufficient options to elucidate key aspects (quantify visual impressions)
- Built-in capabilities to allow quick stratification by key parameters (metric, threshold, scale, valid time, initial time, forecast length, region)
- Easily accessible web-based presentation of verification results: ability to quickly examine aggregate statistics AND single-case plots in a complementary manner
- Verification design driven by needs of forecast system developers

Design Details

- Use modified NCEP IPOLATES routines for interpolation and upscaling of input fields to multiple common grids
- Calculate contingency table fields (YY, YN, NY, NN) for multiple scales, domains, and thresholds:
  - database storage for statistical aggregation
  - graphics for each event for detailed evaluation
- Web-based interface for aggregate statistics and event graphics
- Apply to multiple gridded fields (reflectivity, precipitation, probabilities) and multiple model runs (several versions each of RUC, RR, HRRR as well as RCPF, HCPF, etc.)
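The contingency-table step can be sketched in a few lines of Python. This is a minimal illustration, not the AMB code: the function names, the small sample grids, the reading of YN as "forecast yes, observed no" (and NY as the reverse), and the strict ">" event test are all assumptions. CSI and bias use the standard definitions, CSI = YY / (YY + YN + NY) and bias = (YY + YN) / (YY + NY).

```python
from typing import NamedTuple, Sequence


class Contingency(NamedTuple):
    yy: int  # forecast yes, observed yes (hit)
    yn: int  # forecast yes, observed no (false alarm); assumed meaning of YN
    ny: int  # forecast no, observed yes (miss); assumed meaning of NY
    nn: int  # forecast no, observed no (correct null)


def contingency(fcst: Sequence[Sequence[float]],
                obs: Sequence[Sequence[float]],
                threshold: float) -> Contingency:
    """Count YY, YN, NY, NN over two grids of identical shape."""
    counts = [0, 0, 0, 0]  # order: yy, yn, ny, nn
    for f_row, o_row in zip(fcst, obs):
        for f, o in zip(f_row, o_row):
            # An event is a value strictly above the threshold (assumption).
            index = (0 if f > threshold else 2) + (0 if o > threshold else 1)
            counts[index] += 1
    return Contingency(*counts)


def csi(t: Contingency) -> float:
    """Critical success index: hits / (hits + false alarms + misses)."""
    denom = t.yy + t.yn + t.ny
    return t.yy / denom if denom else float("nan")


def bias(t: Contingency) -> float:
    """Frequency bias: forecast event count / observed event count."""
    observed = t.yy + t.ny
    return (t.yy + t.yn) / observed if observed else float("nan")


# Made-up reflectivity values (dBZ) on a common 3 x 3 grid.
fcst = [[30, 10, 0], [28, 26, 5], [0, 0, 40]]
obs = [[35, 0, 0], [10, 30, 27], [0, 0, 0]]
table = contingency(fcst, obs, threshold=25)
print(table, csi(table), bias(table))
```

Keeping the four counts rather than the finished scores means counts from many cases can be summed first and the score computed once from the totals.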

Statistics Webpages

Composite Reflectivity
- Time Series: http://ruc.noaa.gov/stats/radar/beta/timeseries
- Valid Times: http://ruc.noaa.gov/stats/radar/beta/validtimes
- Lead Times: http://ruc.noaa.gov/stats/radar/beta/leadtimes

24 Hour Precipitation
- Time Series: http://ruc.noaa.gov/stats/precip/beta/timeseries
- Thresholds: http://ruc.noaa.gov/stats/precip/beta/thresholds

Convective Probabilities
- Time Series: http://ruc.noaa.gov/stats/prob/beta/timeseries
- CSI vs Bias: http://ruc.noaa.gov/stats/prob/beta/csibias
- Reliability Diagrams: http://ruc.noaa.gov/stats/radar/prob/reliabilitydiagrams
- ROC Curves: http://ruc.noaa.gov/stats/prob/beta/roc








Sample “time-series” stats interface

[Figure: web interface with selectors for Model, Region, Scale, Averaging period, Metric, Threshold, Forecast Length, Valid time, and Date Range. Many R/T runs and retros are available.]

Sample application of “time-series” stats

[Figure: time series of CSI for reflectivity (> 25 dBZ), Eastern US on 40 km grid (3-day avg), for HRRR-dev and HRRR, with a difference panel marking where HRRR-dev is better and where HRRR is better. Selector labels: Models, Thresh, Metric, Region, Scale.]

Annotations on the plot:
- RR-dev w/ pseudo-obs
- Implemented in RR-prim
- HRRR-dev: longer time-step
- RR-dev: added shorter vert. length-scales in RR-dev/GSI
- Implemented in HRRR
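One plausible way to produce a multi-day average such as the "3-day avg" curves is to pool contingency counts over a trailing window and compute CSI from the pooled counts. The slides do not say whether the AMB pages pool counts or average daily scores, so the pooling, the function name, and the sample counts below are assumptions.

```python
from collections import deque
from typing import Iterable, List, Tuple

Counts = Tuple[int, int, int]  # (YY, YN, NY) for one day; NN is not needed for CSI


def windowed_csi(daily: Iterable[Counts], window: int = 3) -> List[float]:
    """CSI from counts pooled over a trailing window of days.

    The first window-1 values use however many days are available so far.
    """
    recent = deque(maxlen=window)  # oldest day drops out automatically
    series = []
    for counts in daily:
        recent.append(counts)
        yy, yn, ny = (sum(day[i] for day in recent) for i in range(3))
        total = yy + yn + ny
        series.append(yy / total if total else float("nan"))
    return series


# Hypothetical daily counts for two model versions.
hrrr_dev = [(2, 2, 1), (0, 0, 0), (6, 1, 3)]
hrrr = [(1, 2, 2), (0, 0, 0), (4, 2, 5)]

dev_series = windowed_csi(hrrr_dev)
ops_series = windowed_csi(hrrr)
# Positive difference: the first model has the higher CSI on that day.
difference = [a - b for a, b in zip(dev_series, ops_series)]
print(difference)
```

Pooling counts keeps a day with no events (the middle day above) from producing an undefined score that would otherwise have to be skipped in an average of daily values.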

Sample “time-series” stats to examine scatter in forecast differences

[Figure: CSI, 25 dBZ, 40-km, EUS, +6h fcst, 8-22 Aug. Two panels, RUC vs. HRRR and RR vs. HRRR, each marking where HRRR is better.]

Sample application of “lead-time” stats illustrating CSI and bias “die-off” for different strengths of radar heating

[Figure: two panels, CSI (x100) and Bias (x100), each plotted against Forecast Length (hours) from 0 to 10.]

Sample application of “valid time” stats illustrating diurnal variation in scale-dependent skill

Upscaled verification (especially to 40 km and 80 km) reveals “neighborhood” skill in HRRR forecasts, especially around the time of convective initiation

[Figure: HRRR 25 dBZ, 6-h fcst. CSI (x 100) against Valid Time (GMT), 00z to 00z, with curves for the 3-km, 20-km, 40-km, and 80-km scales; the convective initiation time is marked.]
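A toy example shows why upscaling can reveal "neighborhood" skill. The sketch below assumes block-maximum upscaling (a coarse cell takes the largest value of the fine cells it covers). That choice is a stand-in for simplicity and not necessarily what the modified IPOLATES routines do; the function names and grids are hypothetical.

```python
from typing import List

Grid = List[List[float]]


def upscale_max(grid: Grid, factor: int) -> Grid:
    """Coarsen a grid by taking the maximum over factor x factor blocks."""
    rows, cols = len(grid), len(grid[0])
    return [
        [
            max(grid[r][c]
                for r in range(r0, min(r0 + factor, rows))
                for c in range(c0, min(c0 + factor, cols)))
            for c0 in range(0, cols, factor)
        ]
        for r0 in range(0, rows, factor)
    ]


def csi(fcst: Grid, obs: Grid, threshold: float) -> float:
    """CSI = hits / (hits + false alarms + misses); event is value > threshold."""
    yy = yn = ny = 0
    for f_row, o_row in zip(fcst, obs):
        for f, o in zip(f_row, o_row):
            f_yes, o_yes = f > threshold, o > threshold
            yy += f_yes and o_yes
            yn += f_yes and not o_yes
            ny += o_yes and not f_yes
    total = yy + yn + ny
    return yy / total if total else float("nan")


# One observed cell and one forecast cell, displaced by a single grid point.
obs = [[40, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
fcst = [[0, 0, 0, 0], [0, 40, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]

print(csi(fcst, obs, 25))                                  # native grid
print(csi(upscale_max(fcst, 2), upscale_max(obs, 2), 25))  # upscaled by 2
```

On the native grid the displaced cell counts as one miss plus one false alarm; after upscaling, both fall in the same coarse cell and count as a hit.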

Reflectivity Graphics Webpage

http://ruc.noaa.gov/crefVerif/Welcome.cgi

Single case plots showing “neighborhood” skill

[Figure: 12z + 6 hr case. Panels show Obs Refl. and HRRR fcst, with Miss / FA / Hit maps at 3-km and 40-km.]

RR vs. RUC Precipitation Verification

Sample application of “threshold” stats to show skill for range of precip amounts

- 13-km CONUS comparison
- 2 x 12 hr fcst vs. CPC 24-h analysis
- 1-31 Dec 2010, matched

[Figure: CSI (x 100) and bias (x 100) for RR and RUC at thresholds of 0.01, 0.10, 0.25, 0.50, 1.00, 1.50, 2.00, and 3.00 in.; the bias panel marks 100 (1.0).]
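A "threshold" plot of this kind amounts to repeating the contingency-table calculation at each precipitation amount. The sketch below is hypothetical: the function name, the ">=" event test, and the sample values are assumptions, and the scores are scaled by 100 only to match the plot axes.

```python
from typing import List, Sequence, Tuple


def scores_by_threshold(fcst: Sequence[float],
                        obs: Sequence[float],
                        thresholds: Sequence[float]) -> List[Tuple[float, float, float]]:
    """Return (threshold, CSI x 100, bias x 100) for matched forecast/observed values."""
    rows = []
    for thr in thresholds:
        # Event definition (assumption): amount at or above the threshold.
        yy = sum(1 for f, o in zip(fcst, obs) if f >= thr and o >= thr)
        yn = sum(1 for f, o in zip(fcst, obs) if f >= thr and o < thr)
        ny = sum(1 for f, o in zip(fcst, obs) if f < thr and o >= thr)
        csi = 100 * yy / (yy + yn + ny) if yy + yn + ny else float("nan")
        bias = 100 * (yy + yn) / (yy + ny) if yy + ny else float("nan")
        rows.append((thr, csi, bias))
    return rows


# Made-up 24-h totals (inches) at five matched grid points.
fcst = [0.0, 0.3, 1.2, 2.5, 0.05]
obs = [0.02, 0.6, 0.9, 2.1, 0.0]
for thr, csi, bias in scores_by_threshold(fcst, obs, [0.01, 0.50, 1.00, 2.00]):
    print(f"{thr:5.2f}  CSI {csi:6.1f}  bias {bias:6.1f}")
```

A bias of 100 on this scale means the forecast produces the event as often as it is observed; values above 100 indicate over-forecasting at that threshold.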

Precipitation Graphics Webpages

http://ruc.noaa.gov/precipVerif

http://ruc.noaa.gov/precipVerif/Welcome.cgi

RR vs. RUC 24-h precip. verif

[Figure: forecast skill for precip. Panels show CPC 24-h precip observed and the RR and RUC 2 x 12h fcst interpolated to 20-km grid, with Miss / FA / Hit maps at the 1” threshold.]

Scores shown on the two forecast panels:

Thrs  CSI  Bias
1.00  .45  1.22
2.00  .29  1.95

Thrs  CSI  Bias
1.00  .31  0.69
2.00  .21  0.58

Sample display of probability verification statistics

[Figure: ROC curve and CSI vs. bias panels for 2-h, 4-h, and 6-h fcst.]

- Work in progress; displays exist for CCFP and CoSPA probabilities
- Plan to add HCPF and RCPF, and to expand to probabilities of other hazards (fog, high echo-tops, etc.)

Sample Reliability Diagram

[Figure: reliability diagram for 2-h, 4-h, and 6-h fcst.]

All plots can zoom
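Both the reliability diagram and the ROC curve are built from forecast probabilities paired with yes/no observations. The following is a minimal sketch using the standard definitions (observed frequency per probability bin; POD against POFD per probability threshold). The function names, bin count, and sample data are assumptions, not the AMB implementation.

```python
from typing import List, Sequence, Tuple


def reliability_curve(probs: Sequence[float], outcomes: Sequence[int],
                      n_bins: int = 5) -> List[Tuple[float, float, int]]:
    """Per non-empty bin: (mean forecast probability, observed frequency, count)."""
    sums = [0.0] * n_bins
    hits = [0] * n_bins
    counts = [0] * n_bins
    for p, o in zip(probs, outcomes):
        b = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        sums[b] += p
        hits[b] += o
        counts[b] += 1
    return [(sums[b] / counts[b], hits[b] / counts[b], counts[b])
            for b in range(n_bins) if counts[b]]


def roc_points(probs: Sequence[float], outcomes: Sequence[int],
               thresholds: Sequence[float]) -> List[Tuple[float, float]]:
    """(POFD, POD) when 'forecast yes' means probability >= threshold."""
    points = []
    for t in thresholds:
        yy = sum(1 for p, o in zip(probs, outcomes) if p >= t and o)
        yn = sum(1 for p, o in zip(probs, outcomes) if p >= t and not o)
        ny = sum(1 for p, o in zip(probs, outcomes) if p < t and o)
        nn = sum(1 for p, o in zip(probs, outcomes) if p < t and not o)
        pod = yy / (yy + ny) if yy + ny else float("nan")
        pofd = yn / (yn + nn) if yn + nn else float("nan")
        points.append((pofd, pod))
    return points


# Made-up probabilities and outcomes (1 = event observed).
probs = [0.1, 0.1, 0.3, 0.5, 0.7, 0.7, 0.9, 0.9]
outcomes = [0, 0, 0, 1, 1, 0, 1, 1]
print(reliability_curve(probs, outcomes))
print(roc_points(probs, outcomes, [0.5, 0.8]))
```

A perfectly reliable forecast has observed frequency equal to mean forecast probability in every bin; the per-bin count is returned because sparsely populated bins give noisy frequencies.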

Conclusion

• The verification system, including both the statistical and graphical webpages, greatly aids evaluation of model performance within AMB and facilitates rapid assessment of experimental configurations and improvements in real time.

• We are also able to verify retrospective cases of scientific interest in very quick succession for use in presentations and publications for outreach endeavors.
