11IMSC, Edinburgh, UK, 12-16 July 2010



Page 1: 11IMSC, Edinburgh, UK, 12-16 July 2010

New techniques for detection and adjustment of shifts in daily precipitation series

Xiaolan L. Wang (1,2), H. Chen (3), Y. Wu (2), Y. Feng (1), and P. Qiang (2)

1. Climate Research Division, Science & Technology Branch, Environment Canada

2. Department of Mathematics & Statistics, York University, Toronto, Canada

3. Department of Mathematics & Statistics, Bowling Green State University, Ohio, USA

J. Appl. Meteor. Climatol. (accepted)

Page 2: 11IMSC, Edinburgh, UK, 12-16 July 2010

Background information:

Our recent studies (Wang et al. 2007, Wang 2008a,b):
1. Propose two penalized tests, PMT and PMF, to even out the distribution of false alarm rates
2. Extend these penalized tests to account for the first-order autocorrelation (red noise)
3. Propose a stepwise testing algorithm for detecting multiple changepoints in a single series

RHtestsV3 software package (R and FORTRAN; 220+ users from 55+ countries so far):

1. PMTred algorithm
- for detecting mean shifts in zero-trend series with independent or AR(1) Gaussian noise
- for use with reference series

2. PMFred algorithm
- for detecting mean shifts in constant trend series with independent or AR(1) Gaussian noise
- can be used without a reference series

This study:

3. transPMFred algorithm for detecting changepoints
- in non-zero daily precipitation series – typically non-Gaussian data
- for use without a reference series

* Quantile Matching (QM) algorithm for adjusting quantile-dependent artificial shifts
* RHtests_dlyPrcp software package for homogenization of daily precipitation series

Page 3: 11IMSC, Edinburgh, UK, 12-16 July 2010

The PMF (TPR3) model for constant trend series (Wang 2008a and 2003):

The relevant model with Gaussian noise: to test

$$H_0: X_i = \mu + \beta t_i + \epsilon_i, \quad i = 1, \ldots, N$$

against

$$H_a: X_i = \begin{cases} \mu_1 + \beta t_i + \epsilon_i, & 1 \le i \le c \\ \mu_2 + \beta t_i + \epsilon_i, & c + 1 \le i \le N \end{cases}$$

$t_c$ - an unknown changepoint time

$\epsilon_i$ - independent or AR(1) Gaussian noise

The test statistic for an unknown changepoint is a maximal F, not regular F statistic, because of the need to search for the most probable point of change in a time series

Also applicable to TPR3b model (Solow 1987) for a trend-change without an accompanying mean shift

and TPR4 model (Lund & Reeves 2002) for a mean shift that may be accompanied by a trend-change

[Schematic figures: example series for each model, with the changepoint time $t_c$ marked]
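As an illustration of why the statistic is a maximal F, here is a minimal Python sketch that fits the common-trend model at every candidate changepoint and keeps the largest F. It is not the PMFred code from RHtestsV3: it assumes independent noise, applies no penalty and no AR(1) correction, and the function name `max_f_common_trend` and the `min_seg` parameter are hypothetical.

```python
import numpy as np

def max_f_common_trend(x, min_seg=5):
    """Return (c_hat, f_max) for a mean shift in a series with a common trend."""
    x = np.asarray(x, dtype=float)
    n = x.size
    t = np.arange(1, n + 1, dtype=float)

    # Null model H0: one intercept plus a linear trend.
    design0 = np.column_stack([np.ones(n), t])
    resid0 = x - design0 @ np.linalg.lstsq(design0, x, rcond=None)[0]
    sse0 = float(resid0 @ resid0)

    best_c, best_f = None, -np.inf
    for c in range(min_seg, n - min_seg + 1):
        # Alternative Ha: intercept mu1 for i <= c, mu2 for i > c, same slope.
        before = (t <= c).astype(float)
        design1 = np.column_stack([before, 1.0 - before, t])
        resid1 = x - design1 @ np.linalg.lstsq(design1, x, rcond=None)[0]
        sse1 = float(resid1 @ resid1)
        f_c = (sse0 - sse1) / (sse1 / (n - 3))
        if f_c > best_f:
            best_c, best_f = c, f_c
    return best_c, best_f

# Synthetic example (assumed values): trend 0.01 per step, shift of 3 after i = 120.
rng = np.random.default_rng(0)
n, true_c = 200, 120
series = 0.01 * np.arange(1, n + 1) + rng.normal(0.0, 1.0, n)
series[true_c:] += 3.0
c_hat, f_max = max_f_common_trend(series)
print(c_hat, round(f_max, 1))
```

Because the maximum is taken over many candidate positions, `f_max` cannot be compared with ordinary F critical values; that is the reason the test needs its own critical values.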

Page 4: 11IMSC, Edinburgh, UK, 12-16 July 2010

Precipitation is typically not normally distributed; daily precipitation is not a continuous variable!

- Log transformation is often sufficient for monthly/annual total precipitation (Prcp) data series: we recommend using the RHtestsV3 functions to test a log-transformed monthly/annual Prcp series

- Homogenization of daily precipitation data is much more challenging, and yet much needed for characterizing extremes. Log-transformation is often not good enough; a data-adaptive transformation procedure is needed.

- Integrate a Box-Cox transformation in the PMFred algorithm, developing the transPMFred algorithm & RHtests_dlyPrcp package for homogenization of daily precipitation data series: this alleviates the limitation of the assumption of normal distribution in the RHtestsV3 package

Box-Cox transformation:

$$X_i = h(Y_i; \lambda) = \begin{cases} (Y_i^{\lambda} - 1)/\lambda, & \lambda \ne 0 \\ \log Y_i, & \lambda = 0 \end{cases}$$

where $\{Y_i > 0\}$ $(i = 1, 2, \ldots, N)$ is a series of non-zero daily precipitation amounts. $Y_i$ can be other positive values, e.g., non-zero wind speeds.

The gist of the transPMFred algorithm:

- For a set of trial λ values, use the PMFred algorithm to test each transformed series $X_i$
- Use a profile log-likelihood statistic to find the best λ for the series being tested

A data-adaptive transformation, because different λ values (transformations) may be chosen for different series
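The λ selection step can be sketched in Python as follows. This uses the standard Box-Cox profile log-likelihood with a plain intercept-plus-trend regression; in transPMFred the regression would also carry the detected mean shifts, which this sketch leaves out. The function names, the trial grid and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def box_cox(y, lam):
    """Box-Cox transform h(Y; lambda) for positive values."""
    y = np.asarray(y, dtype=float)
    if abs(lam) < 1e-12:
        return np.log(y)
    return (y ** lam - 1.0) / lam

def profile_loglik(y, lam, design):
    """Profile log-likelihood of lambda, up to an additive constant."""
    x = box_cox(y, lam)
    coef = np.linalg.lstsq(design, x, rcond=None)[0]
    resid = x - design @ coef
    n = x.size
    sigma2 = float(resid @ resid) / n
    # The second term is the log-Jacobian of the transformation.
    return -0.5 * n * np.log(sigma2) + (lam - 1.0) * np.sum(np.log(y))

def best_lambda(y, design, trial_lams):
    scores = [profile_loglik(y, lam, design) for lam in trial_lams]
    return trial_lams[int(np.argmax(scores))]

# Synthetic log-normal "precipitation" (assumed), so lambda should be near 0.
rng = np.random.default_rng(1)
n = 2000
y = np.exp(rng.normal(1.0, 0.8, n))
design = np.column_stack([np.ones(n), np.arange(1, n + 1)])
trial_lams = [round(float(v), 1) for v in np.linspace(-1.0, 1.0, 21)]
lam_hat = best_lambda(y, design, trial_lams)
print(lam_hat)
```

Running the same selection on another series may return a different λ, which is what makes the transformation data-adaptive.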

Page 5: 11IMSC, Edinburgh, UK, 12-16 July 2010

To assess detection power of the transPMFred (nominal significance level: 5%)

Consider daily precipitation of 5 different distribution types (i.e., of different λ values: -0.2, -0.1, 0.0, 0.1, 0.2; λ = 0.0 is the log-normal distribution)

For each distribution type (each λ): block bootstrap 1000 surrogate series of N=600 from a homogeneous real precip. series whose λ is one of the five values

► False Alarm Rates (FARs) – apply the transPMFred to each of the homogeneous surrogate series

Results: FARs are around the nominal level (5% here)

► Hit Rates (HRs) – insert, at a randomly chosen position K, one shift to each surrogate series of N=600, then apply the new and old methods to detect the inserted shift

Hit: $\hat{k} \in [K-10, K+10]$

Results: hit rates as a function of λ value

[Figure: hit rates of transPMFred and PMFred versus λ, for several shift sizes; the λ axis runs from shorter upper tail to longer upper tail; plot annotations: "below 70%", "above 95%"]

HRs are all above 95% except for very small shifts

Results: hit rates as a function of shift position K

[Figure: hit rates of transPMFred versus K, for several shift sizes]

HRs are basically independent of K
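The surrogate generation and the hit criterion can be sketched in Python as below. The slides do not give the block length, so `block_len=30` is an assumption, as are the function names and the gamma-distributed stand-in for the homogeneous base series; the detection step itself is not reproduced here.

```python
import numpy as np

def block_bootstrap(series, n_out, block_len, rng):
    """Moving-block bootstrap: join randomly chosen contiguous blocks."""
    series = np.asarray(series, dtype=float)
    n_blocks = -(-n_out // block_len)  # ceiling division
    starts = rng.integers(0, series.size - block_len + 1, size=n_blocks)
    blocks = [series[s:s + block_len] for s in starts]
    return np.concatenate(blocks)[:n_out]

def is_hit(k_hat, k_true, tol=10):
    """A hit means the detected position lies in [K - tol, K + tol]."""
    return k_hat is not None and abs(k_hat - k_true) <= tol

def hit_rate(detected, inserted, tol=10):
    hits = [is_hit(kh, kt, tol) for kh, kt in zip(detected, inserted)]
    return sum(hits) / len(hits)

rng = np.random.default_rng(2)
base = rng.gamma(shape=0.8, scale=5.0, size=3000)  # stand-in for a real series
surrogate = block_bootstrap(base, n_out=600, block_len=30, rng=rng)

# Made-up detections (None = nothing detected) against inserted positions.
rate = hit_rate([100, 250, None, 411], [105, 270, 300, 401])
print(surrogate.size, rate)  # 600 0.5
```

Resampling blocks rather than single days keeps the short-range dependence of the base series in each surrogate.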

Page 6: 11IMSC, Edinburgh, UK, 12-16 July 2010

Quantile Matching (QM) algorithm – for adjusting quantile-dependent shifts, i.e. shifts that affect not only the mean, but also the entire distribution of the data.

- regime dependent shifts
- seasonality of shifts, e.g., …

Gist of QM adjustments:

- to match the distributions of different segments of the de-trended base series, i.e., to diminish differences in the distribution caused by non-climatic factors.
- to preserve in the QM-adjusted series the linear trend estimated from a multi-phase regression fit – important not to remove the natural trend!

Site moves at an Australian station → quantile-dependent shifts:

Lord Howe Island daily Tmin; site moves in Jan 1955 and Dec 1988

- De-seasonalized daily Tmin: different variances; larger effects on low extremes; largest diff in the lowest 10% of daily Tmin
- Mean-adjusted daily Tmin: var. diff. remains
- QM-adjusted daily Tmin

Page 7: 11IMSC, Edinburgh, UK, 12-16 July 2010

For daily precipitation, the QM adjustments are estimated this way:

Precipitation trend component:

$$\hat{Y}_i^{tr} = h^{-1}(\hat{X}_i^{tr}; \hat{\lambda}), \qquad \hat{X}_i^{tr} = \hat{\beta} t_i$$

$$\hat{Y}_{\max}^{tr} = \max_{1 \le i \le N} \hat{Y}_i^{tr}, \qquad \hat{Y}_i^{tr} - \hat{Y}_{\max}^{tr} \le 0$$

De-trended precipitation series:

$$Y_i^{dtr} = Y_i - (\hat{Y}_i^{tr} - \hat{Y}_{\max}^{tr}) > 0$$

Probability distribution at Mq categories for each segment (Seg. 1, Seg. 2, Seg. 3) → between-segment differences for each category & interpolate them by fitting splines (Mq=8 here). Adjust to Seg. 3.

Do these for each value to be adjusted: empirical cumulative frequency of the value to be adjusted → adjustment if in Seg. 1, adjustment if in Seg. 2

Add these to the original series to make it homogeneous

Different quantiles in the same segment could be adjusted differently
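The QM steps can be sketched in Python as follows. This is an illustration, not the RHtests_dlyPrcp code: it works on an already de-trended series, it interpolates the category differences linearly with `np.interp` (held constant beyond the outermost category centres) where the slide fits splines, it does not enforce positive adjusted values, and all names plus the synthetic two-segment data are assumptions.

```python
import numpy as np

def category_means(values, mq):
    """Mean of each of mq equal-frequency categories of the sorted values."""
    ordered = np.sort(values)
    return np.array([chunk.mean() for chunk in np.array_split(ordered, mq)])

def qm_adjust(values, segment, base, mq=8):
    """Adjust every segment to the distribution of the base segment."""
    values = np.asarray(values, dtype=float)
    segment = np.asarray(segment)
    centres = (np.arange(mq) + 0.5) / mq  # cumulative frequency of each category
    base_means = category_means(values[segment == base], mq)
    adjusted = values.copy()
    for seg in np.unique(segment):
        if seg == base:
            continue
        idx = np.flatnonzero(segment == seg)
        seg_vals = values[idx]
        # Between-segment difference for each category.
        diffs = base_means - category_means(seg_vals, mq)
        # Empirical cumulative frequency of each value within its own segment.
        ranks = np.argsort(np.argsort(seg_vals))
        ecf = (ranks + 0.5) / seg_vals.size
        adjusted[idx] = seg_vals + np.interp(ecf, centres, diffs)
    return adjusted

# Synthetic de-trended data (assumed): Seg. 1 differs in mean and variance.
rng = np.random.default_rng(3)
values = np.concatenate([rng.normal(1.0, 2.0, 2000), rng.normal(0.0, 1.0, 2000)])
segment = np.repeat([1, 2], 2000)
adjusted = qm_adjust(values, segment, base=2, mq=8)
print(values[segment == 1].std(), adjusted[segment == 1].std())
```

Because the adjustment is looked up by cumulative frequency, two values in the same segment can receive different adjustments. The same lookup is why the method presumes that the frequencies themselves are homogeneous.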

Page 8: 11IMSC, Edinburgh, UK, 12-16 July 2010

Examples to show:

1. The proposed new algorithm works well in detecting changepoints in real daily P

2. Small P are harder to measure with accuracy than larger P (larger %error)
– discontinuities often exist in freq. series of measured small P (e.g., P < 1 mm)

3. In the presence of frequency discontinuity, any adjustment derived from the measured daily P (e.g., ratio-based, Quantile-Matching) is not good.

One must address the issue of freq. discontinuity first!

The RHtestsV3 functions can be used to detect frequency discontinuities

Page 9: 11IMSC, Edinburgh, UK, 12-16 July 2010

Examples of application

Daily precipitation recorded at The Pas (Manitoba, Canada) for Jun 1st, 1910 to Dec 31st, 2007

- snowfall water equivalent; rainfall adjusted for wetting losses and gauge undercatch (Mekis & Hogg 1999; and updates by Mekis)

- joining of two stns: 5052864 for up to 31 Dec. 1945, 5052880 for 1 Jan 1946 to 31 Dec. 2007

Before including trace precipitation amounts, we have two Prcp data series for this site:

1. not adjusted for joining (noT_naJ)

2. has been adjusted for joining (noT_aJ), using the ratio-based adjustments of Vincent & Mekis (2009) (one rainfall ratio & one snowfall ratio for all data)

Same three changepoints detected in both series

Both series have a very significant changepoint near the time of joining of stations

Page 10: 11IMSC, Edinburgh, UK, 12-16 July 2010

Results for the two series not including trace amounts (noT series):

transPMFred detected the same 3 changepoints:

Type  Date         Documented date of change(s)
1     4 Jul 1938   9 Oct 1937 to 8 Aug 1938: changes in gauge type, rim height, observing frequency; poor gauge condition reported on 9 Oct 1937
1     24 Oct 1946  31 Dec 1945: joining of two nearby stations (5052864 + 5052880)
1     4 Oct 1976   16 Oct 1975 to 18 Oct 1977: gauge type change (standard at 12” rim height to Type B at 16” rim height)

Changes in the min. measurable amount (precision, unit)

1. noT_naJ (closest to original measurements):

[Figure: noT_naJ series with changepoints marked at 1937-38, 1945-46 (joining) and 1976-77; mean shift of -0.76 mm]

2. noT_aJ (aJ changed the mean shift size from -0.76 mm to -0.73 mm)

The ratio-based adjustments for station joining failed to homogenize the series, because …

Page 11: 11IMSC, Edinburgh, UK, 12-16 July 2010

The discontinuities are mainly in the measurements of small precipitation (P ≤ 3 mm), especially in the frequency of measured small precipitation:

Series of daily P > 3 mm (noT_naJ > 3 mm) – homogeneous! Good news for studying extreme (high) precipitation

noT_naJ:

- No P < 0.3 mm or 0.4 < P ≤ 0.5 mm before 1976
- Much fewer 0.3 ~ 0.4 mm until 1945 - joining point (plot annotation: 0.21 mm from SWE)
- Much fewer 0.5 ~ 1 mm until 1937

Any ratio-based adjustments for joining are not good in this case, because larger P are adjusted more than smaller P when they should not be adjusted at all!

noT_aJ:

The above frequency discontinuities largely remain:

Page 12: 11IMSC, Edinburgh, UK, 12-16 July 2010

Homogenization of daily precipitation series – very challenging!!

a) Ratio-based adjustments – bad in the presence of frequency discontinuity

We also tried

b) IBC adjustments: the Inverse Box-Cox (IBC) transformation of the fitted multi-phase regression lines

[Figure: wT_naJ, Seg. 1 and Seg. 2; adjusted series homogeneous]

Happy? – No! Because large P are adjusted similarly, while they should not be adjusted at all

Page 13: 11IMSC, Edinburgh, UK, 12-16 July 2010

c) QM adjustments, e.g., Quartile-Matching (4 categories)

Adjust to last Seg. - Seg. 3

[Figure: wT_naJ, Seg. 1, Seg. 2 and Seg. 3; adjusted series inhomogeneous]

This is worse than the simple IBC adjustments!
- still inhomogeneous;
- larger absolute adjustments made to larger P

Quantile matching algorithms would work only if there is no discontinuity in the frequency, because they line up the adjustments by empirical frequency, implicitly assuming homogeneous frequencies. They should be used after all frequency discontinuities have been diminished!

Page 14: 11IMSC, Edinburgh, UK, 12-16 July 2010

How to address the issue of frequency discontinuity?

Apply a homogeneity test to the frequency series, and homogenize the series if necessary, e.g.:

The frequency of reported trace occurrence at station The Pas is not homogeneous!

[Figure: noT_naJ trace frequency series tested with the PMFred algorithm; changepoints at 1945-46 and 1955-56; no trend]

Adding a trace amount for T-flagged days is not good enough in this case

Need to address the issue of frequency discontinuity! But how?

Flag more days with T? – which dates to flag? Needs obs of other variables, such as cloud, humidity…

At least, monthly and annual total Prcp can be adjusted to account for the frequency discontinuities, e.g., adjust the total trace amount in each month to that month’s current trace amount when there is no trend in trace frequency

In spite of the uncertainty in the date of trace Prcp, adding days of a trace amount in the series would help obtain more accurate adjustments for other discontinuities using quantile-matching

Page 15: 11IMSC, Edinburgh, UK, 12-16 July 2010

Concluding remarks

- the new method, transPMFred, works well for both simulated and real daily precipitation data

- Homogenization of precipitation data, especially daily P, is very challenging; would recommend:

1) use transPMFred to test series of P > Pmin with different Pmin values (e.g. 0.0, 0.3 mm, 0.4 mm, 0.5 mm, 1.0 mm, …); the Pmin values should reflect changes in measurement precision/unit

2) also test the frequency series of zero P and small P (e.g. Trace, ≤0.3 mm, 0.3-0.5 mm, …) (e.g., using the PMFred algorithm)

Shall aim to get better insight into the cause (metadata) and characteristics of discontinuity (e.g., freq.) before any attempt to adjust daily precipitation data!

In the presence of frequency discontinuity, any adjustment derived from the measured daily P is not good, no matter how it was derived! One must address the issue of frequency discontinuity before doing any adjustment (incl. QM)!
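A minimal Python sketch of how the two kinds of test series could be built from a daily record: the series of amounts above a threshold Pmin, and the yearly frequency of days falling in a small-amount bin. The helper names and the toy data are assumptions; the homogeneity test itself (transPMFred or PMFred) is not shown.

```python
import numpy as np

def exceedance_series(precip, p_min):
    """Daily amounts with P > p_min, kept in time order."""
    precip = np.asarray(precip, dtype=float)
    return precip[precip > p_min]

def annual_frequency(precip, years, low, high):
    """Count, per year, the days with low < P <= high."""
    precip = np.asarray(precip, dtype=float)
    years = np.asarray(years)
    in_bin = (precip > low) & (precip <= high)
    labels = np.unique(years)
    counts = np.array([int(in_bin[years == y].sum()) for y in labels])
    return labels, counts

# Toy record (assumed values, in mm): four days in each of two years.
precip = [0.0, 0.2, 0.4, 1.0, 0.0, 0.3, 0.5, 5.0]
years = [2000, 2000, 2000, 2000, 2001, 2001, 2001, 2001]

above = exceedance_series(precip, p_min=0.3)
labels, counts = annual_frequency(precip, years, low=0.3, high=1.0)
print(labels.tolist(), counts.tolist())  # [2000, 2001] [2, 1]
```

Repeating the first call for each Pmin, and the second for each small-amount bin, gives the set of series to be tested.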

Page 16: 11IMSC, Edinburgh, UK, 12-16 July 2010

References:

Wang, X. L., H. Chen, Y. Wu, Y. Feng, and Q. Pu, 2010: New techniques for detection and adjustment of shifts in daily precipitation data series. J. Appl. Meteor. Climatol., accepted subject to revision.

Wang, X. L., 2008a: Penalized maximal F test for detecting undocumented mean-shift without trend change. J. Atmos. Oceanic Technol., 25 (No. 3), 368-384. DOI:10.1175/2007JTECHA982.1

Wang, X. L., 2008b: Accounting for autocorrelation in detecting mean-shifts in climate data series using the penalized maximal t or F test. J. Appl. Meteor. Climatol., 47, 2423-2444.

Wang, X. L., Q. H. Wen, and Y. Wu, 2007: Penalized maximal t-test for detecting undocumented mean change in climate data series. J. Appl. Meteor. Climatol., 46 (No. 6), 916-931. DOI:10.1175/JAM2504.1

Wan, H., X. L. Wang, and V. R. Swail, 2007: A quality assurance system for Canadian hourly pressure data. J. Appl. Meteor. Climatol., 46 (No. 11), 1804-1817.

Thank you very much for your attention! Questions and/or comments?

The RHtestsV3 and RHtests_dlyPrcp software packages are available free of charge at

http://cccma.seos.uvic.ca/ETCCDMI/software.shtml

- used by WMO ETCCDI (Expert Team on Climate Change Detection and Indices) in 12 training workshops so far