COST benchmark dataset homogenisation: issues and remarks of the “Slovenian team”
ENVIRONMENTAL AGENCY OF THE REPUBLIC OF SLOVENIA
www.arso.gov.si
Presentation for WG 2-4 meeting in Tarragona, March 9-11, 2009
Gregor Vertačnik, Boris Pavčič
Tarragona, March 2009
[Figure: raw and homogenised series, 1900 to 2000]
Overview
• MASH
• SNHT
• Craddock
• SNHT vs. Craddock
MASH
• In February, several MASH procedures were tested and their results inspected
• Homogenisation procedure used for the benchmark:
– Data formatting (COST-MASH and vice versa)
– Statistical significance for break-point detection: 0.05
– Monthly outliers found by mashlier.bat
– Break-point detection only on annual series (Gregor) or also on monthly and seasonal series (Boris), samauto.bat, 50 iterations
– All breaks and outliers found were accepted
• Issues:
– Fully-automated version is quick
– Time-consuming manual correction (mashcor.bat) and inspection (mashgame.bat)
– Inconsistent results when using SAM; hard to determine the “true date” of breaks from monthly series
– Homogenisation of annual series alone is insufficient, especially for precipitation series
– Some “obvious” outliers remain undetected
– Problems with clustered break-points
– Complete series of raw data needed for at least one station (surrogate temp. station network 13 problem)
– Too many small breaks and outliers are found, judging by the benchmark description
– Long-term trends can’t be detected; reconstruction is possible only from consecutive breaks
[Figure: Difference series for July (equally weighted ref. stations), tnm28070001. Axes: year (1900 to 2000), temperature difference (°C)]
• Some obvious outliers remain after the homogenisation procedure (mashlier.bat, samauto.bat)
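To illustrate what such a difference series looks like numerically, here is a minimal sketch that subtracts the equally weighted mean of the reference stations from a candidate series and flags values far from the median. It is not the MASH procedure: the function names, the median/MAD rule, the threshold of 3 and the sample values are all assumptions made for illustration.

```python
from statistics import mean, median

def difference_series(candidate, references):
    """Candidate minus the equally weighted mean of the reference stations."""
    diffs = []
    for i, value in enumerate(candidate):
        refs = [ref[i] for ref in references if ref[i] is not None]
        if value is None or not refs:
            diffs.append(None)  # no difference where data are missing
        else:
            diffs.append(value - mean(refs))
    return diffs

def flag_outliers(diffs, threshold=3.0):
    """Indices whose distance from the median exceeds threshold * scaled MAD.

    Hypothetical screening rule, not the one used by mashlier.bat.
    """
    valid = [d for d in diffs if d is not None]
    centre = median(valid)
    # 1.4826 scales the MAD to the standard deviation of a normal distribution
    spread = 1.4826 * median([abs(d - centre) for d in valid])
    return [i for i, d in enumerate(diffs)
            if d is not None and abs(d - centre) > threshold * spread]

# Made-up July means (°C) for one candidate and two reference stations
ref_a = [19.0] * 7
ref_b = [21.0] * 7
july = [20.1, 19.8, 20.3, 23.9, 19.9, 20.2, 19.7]

diffs = difference_series(july, [ref_a, ref_b])
suspect = flag_outliers(diffs)
print(suspect)
```

A robust spread estimate is used here so that a single large outlier does not inflate the threshold that is meant to catch it.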
• The fully-automated procedure “detects” numerous small breaks and outliers, possibly as a consequence of an interval break-point correction
• Occurrence of outlier years (all months within the year are outliers)
[Figure: An example of CDF for inhomogeneities (breaks and outliers), temperature, surrogate data, station network 000001. Axes: cumulative probability (0 to 1), magnitude of inhomogeneity (°C)]
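A CDF of this kind can be built directly from the list of detected magnitudes. The sketch below is only an illustration: the function names, the 0.5 °C limit and the sample magnitudes are hypothetical and do not come from the benchmark results.

```python
def empirical_cdf(magnitudes):
    """Return (magnitude, cumulative probability) pairs in ascending order."""
    ordered = sorted(magnitudes)
    n = len(ordered)
    return [(value, (rank + 1) / n) for rank, value in enumerate(ordered)]

def share_smaller_than(magnitudes, limit):
    """Fraction of inhomogeneities whose absolute size is below `limit`."""
    return sum(1 for m in magnitudes if abs(m) < limit) / len(magnitudes)

# Made-up break magnitudes in °C
breaks = [0.5, -1.0, 2.0, 0.0]
cdf = empirical_cdf(breaks)
small = share_smaller_than(breaks, 0.5)
```

The second function gives one simple way to quantify how many of the detected inhomogeneities are small.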
• Fully-automated application of the SAM procedure in MASH (monthly, seasonal, annual homogenisation) to the COST benchmark sometimes results in unrealistic inhomogeneity series:
– Inconsistent times of breaks
– Very different monthly correction factors
[Figure: An example of inhomogeneity series for autumn months (September, October, November), temperature, surrogate data, tnm2807001. Axes: year (1900 to 2000), magnitude of inhomogeneity (°C)]
SNHT
• For each candidate series, one weighted reference series was calculated from all other stations
• All series were tested with the corresponding reference series (monthly, seasonal and yearly values)
• All significant breaks were marked in a table
• Only reference series without uncorrected breaks in the period used to calculate corrections were chosen
• After the correction of marked breaks in the first round, all steps were repeated until no significant breaks were found
• Corrections were always applied from the first year to the date of the break
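The detection and correction steps listed above can be sketched as follows, assuming the standard single-shift form of the SNHT statistic applied to a candidate-minus-reference difference series. The function names and sample values are hypothetical, the construction of the weighted reference series is omitted, and the critical value for significance must be taken from published tables for the given series length.

```python
from statistics import mean, stdev

def snht_statistic(q):
    """Single-shift SNHT: return (T_max, k), k = number of values before the break."""
    n = len(q)
    q_mean, q_std = mean(q), stdev(q)
    if q_std == 0:
        raise ValueError("constant series: the statistic is undefined")
    z = [(v - q_mean) / q_std for v in q]  # standardised difference series
    best_t, best_k = 0.0, None
    for k in range(1, n):
        t = k * mean(z[:k]) ** 2 + (n - k) * mean(z[k:]) ** 2
        if t > best_t:
            best_t, best_k = t, k
    return best_t, best_k

def break_shift(q, k):
    """Mean of the difference series after the break minus the mean before it."""
    return mean(q[k:]) - mean(q[:k])

def correct_before_break(series, k, shift):
    """Apply the correction from the first value up to the break."""
    return [v + shift for v in series[:k]] + list(series[k:])

# Made-up difference series (°C) with a step of about 1 °C after five values
q = [0.0, 0.1, -0.1, 0.0, 0.1, 1.0, 1.1, 0.9, 1.0, 1.1]
t_max, k = snht_statistic(q)
shift = break_shift(q, k)
adjusted = correct_before_break(q, k, shift)
```

In an iterative scheme like the one described, these steps would be repeated on the corrected series until the statistic no longer exceeds the chosen critical value.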
Table of examined series with marked breaks. Numbers in yellow cells indicate the month in which the break is located.
Craddock
• The Craddock approach is the same as discussed in Brunetti et al. (2006)
• Corrections which were calculated for individual breaks were applied to the period between the corrected break and the next one
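As a rough illustration of the idea behind the Craddock test, the sketch below accumulates the mean-removed differences between a candidate and a reference series, assuming the additive form commonly used for temperature. It is not claimed to reproduce the implementation of Brunetti et al. (2006); the function names and data are hypothetical.

```python
from statistics import mean

def craddock_curve(candidate, reference):
    """Cumulative sum of candidate-minus-reference differences with their mean removed."""
    diffs = [a - b for a, b in zip(candidate, reference)]
    offset = mean(diffs)
    curve, total = [], 0.0
    for d in diffs:
        total += d - offset
        curve.append(total)
    return curve

def turning_point(curve):
    """Index of the largest absolute excursion, i.e. the last value before a single step."""
    return max(range(len(curve)), key=lambda i: abs(curve[i]))

# Made-up annual means (°C): the candidate jumps by 1 °C after five years
candidate = [10.0] * 5 + [11.0] * 5
reference = [10.0] * 10

curve = craddock_curve(candidate, reference)
last_before_break = turning_point(curve)
```

For a homogeneous pair the curve stays close to zero, while a break shows up as a change in its slope. With several breaks the curve has several slope changes, so picking a single extreme, as done here, is only adequate for the one-break case.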
SNHT vs. Craddock
• Comparison of detected breaks and correcting factors SNHT vs. Craddock
• More breaks were detected by the Craddock method than by SNHT
• Correcting factors for breaks detected with both methods are very similar
[Figure: ratxm28070001d. Horizontal axis: [dT] CRADDOCK; vertical axis: [dT] SNHT; legend: Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep, Oct, Nov, Dec]
[Figure: dt [°C] by year (1900 to 1990) for Spring, Summer, Autumn, Winter and YEAR]
Average seasonal differences, Craddock minus SNHT, of the homogenised series (over all series) show that the final results are similar. Differences are larger in the period where more series have missing data (beginning of the century) and in the short period with many breaks in all series (the 1980s).
Many thanks for your attention!