DEVELOPMENT OF INTENSITY-DURATION-FREQUENCY CURVES USING ...
Transcript of DEVELOPMENT OF INTENSITY-DURATION-FREQUENCY CURVES USING ...
2
DEVELOPMENT OF INTENSITY-DURATION-FREQUENCY CURVES USING LOCAL
AND REGIONAL SCALE METHODS
By: Marc D’Alessandro, B.Sc
A Thesis Submitted to the School of Graduate Studies in Partial Fulfillment of the Requirement
for the Degree of Master of Science
McMaster University © Copyright by Marc D’Alessandro, September 2016
ii
MASTER OF SCIENCE (2016) McMaster University
(Geography and Earth Science) Hamilton, Ontario
TITLE: Development of Intensity-Duration-Frequency Curves Using Local
and Regional Scale Methods
AUTHOR: Marc D’Alessandro, B.Sc (McMaster University)
SUPERVISOR: Dr. Paulin Coulibaly
NUMBER OF PAGES: xv, 84
iii
Abstract
Traditionally civil infrastructure designs were rendered using rainfall data from dated
historical records. However, recent studies have shown that the magnitude and intensity of
historical precipitation events do not exhibit the extreme nature of precipitation events that are
projected to occur in the future. Increasing extreme rainfall trends have already been documented
in Canada. Therefore there are growing concerns that the aging infrastructure in southern Ontario
will be unable to function effectively and as a result the frequency of floods is expected to
increase. Updating intensity-duration-frequency (IDF) curves to account for extreme
precipitation events is vital to ensure that the consequences of floods are mitigated. This study
first reviewed the most robust techniques for updating IDF curves, and applied a select set of
approaches to create IDF curves for stations within southern Ontario.
Three robust techniques – the at-site method, the regional frequency analysis method, and
a future IDF curve development technique – were compared with one another to determine
which technique was most suitable for updating IDF curves in southern Ontario. Results showed
that the difference between the at-site method and the regional frequency analysis method was
marginal for short return periods, however for larger return periods larger differences were
observed. Future IDF statistic results showed that for the 2050s there were minor differences in
the increases in rainfall intensities when comparing with the at-site and the regional frequency
analysis method. For the 2100s there were larger increases in rainfall intensities compared to the
at-site and the regional frequency analysis method, especially for larger return periods. These
results suggest that it is worthwhile for regions within southern Ontario to update their IDF
curves using the future IDF curve technique, however it is recommended that additional climate
iv
models, emission scenarios and downscaling techniques involved in future IDF curve
construction are explored.
v
Acknowledgements
I would like to thank my supervisor, Dr. Coulibaly, for all of his guidance and support
throughout my graduate studies. Dr. Coulibaly’s encouragement enabled me to reach all of my
goals and for that I am very appreciative. I would also like to thank my mom and dad, my sister,
my grandparents and all of my family and friends for their support. I would also like to thank
Serge Traore, Dr. Tara Razavi, Dr. Niko Yiannakoulias, Dr. Hussein Wazneh and everybody
from McMaster’s Hydrologic Modeling and Water Resources Lab for all of their help and
guidance.
vi
Table of Contents
Abstract ……………………………………………………………………………………… iii
Acknowledgements ………………………………………………………………………….. v
Table of Contents ……………………………………………………………………………. vi
List of Figures ……………………………………………………………………………….. ix
List of Tables ………………………………………………………………………………… xii
List of Abbreviations ………………………………………………………………………... xiv
List of Symbols ………………………………………………………………………………. xv
1. Introduction
1.1 Background and Rationale………………………………………………………..... 1
1.2 Study Objectives …………………………………………………………………... 2
1.3 Organization of Thesis ……………………………………………………………...3
2. Literature Review
2.1 The At-Site Intensity-Duration-Frequency (IDF) Curve Method …………...…….. 4
2.2 The Regional Frequency Analysis Method ………………………………………... 4
2.2.1 Regionalization Techniques ……………………………………………… 5
2.2.2 Principal Component Analysis and K-Means Clustering …………..……. 5
2.2.3 Ward’s Method …………………………………………………………… 6
2.2.4 Tabreg ……………………………………………………………………. 7
2.3 Future IDF Curve Development Method ……………..………………………..…... 8
3. Study Area and Data
3.1 Study Area ………………………………………………………………………….. 9
3.2 Observed Data ……………………………………………………………………… 10
3.3 Climate Model Data ………………………………………………………………... 11
4. Methodology
vii
4.1 Overview of Methods ……………………………………………………………… 12
4.2 Data Processing
4.2.1 Converting Daily Duration Data to 24-hour Duration Data ……………. 14
4.2.2 Replacing Flagged Data …………………………………………………. 14
4.2.3 Converting 24-hour Duration Data to Hourly and Sub-Hourly Duration
Data ...…………………………………………………………………….. 15
4.2.4 Annual Maximum Series Derivation ……………………………………... 16
4.3 At-Site Method IDF Curve Development ……………………………………….…. 17
4.4 Regional Frequency Analysis Method IDF Curve Development ………………….. 17
4.4.1 Regionalization Inputs …………………………………………………… 18
4.4.2 Principal Component Analysis with K-means Clustering ……………….. 18
4.4.3 Ward’s Method ………………………………………………………….... 20
4.4.4 Tabreg Method ………………………………………………………….…20
4.4.5 L-moments Approach …………………………………………………….. 23
4.4.6 L-moment Ratio Diagrams ……………………………………………….. 24
4.4.7 Goodness-of-Fit Measure …………………………………………………24
4.5 Future IDF Curve Methodology
4.5.1 Selection of Future Time Periods ………………………………………... 25
4.5.2 Climate Model Data Processing …………………………………………. 26
4.5.3 Downscaling Techniques ………………………………………………… 26
4.5.4. Future IDF Curve Derivation …………………………………………… 27
5. Results
5.1 Principal Component Analysis and K-means Clustering Regionalization Results
……………………………………………………………………………………… 27
5.2 Ward’s Method Regionalization Results …………………………………………... 31
viii
5.3 Tabreg Regionalization Results ……………………………………………………. 33
5.4 L-moment Ratio Diagrams……..………………………………………….……….. 34
5.5 Goodness-of-Fit Measure ………………...………………………………………… 37
5.6 IDF Curve Results: At-Site Method Compared to Regional Frequency Analysis
Method ………………………………………………………………………..……. 39
5.7 Future IDF Curve Results …..……………………………………………………… 50
5.8 Variability of the Future IDF Curve Results ………………..……………………… 56
6. Discussion ………………………………………………………………………………...… 60
7. Conclusion ……………………………………………………………………………..…… 62
8. Recommendations ………………………………………………………………………….. 63
9. References …………………………………………………………………………………... 65
Appendix A ……………………………………………………………………………………. 73
Appendix B ……………………………………………………………………………………. 75
Appendix C ……………………………………………………………………………………. 78
Appendix D ……………………………………………………………………………………. 79
Appendix E ……………………………………………………………………………………. 83
ix
List of Figures
Figure 1: Map of southern Ontario showing the 48 weather stations used in the study……...….10
Figure 2: An overview of the methodology used in this study…………………………………..13
Figure 3: Topology created for the selected weather stations………………………………..…..21
Figure 4: Graphical plot of the cost function after 500 iterations using Tabreg…..……………..23
Figure 5: Davies-Bouldin index applied to the input dataset…………………………….…..…. 28
Figure 6: Percentage of total variability explained by each principal component……………….29
Figure 7: PCA loading plots for the first and second principal components…………….………30
Figure 8: Map showing the regions that were delineated using Principal Component Analysis
with k-means……….…………………………………………………………………..31
Figure 9: Map showing the regions that were delineated using Ward’s method…………...........32
Figure 10: Map showing the regions that were delineated using Tabreg clustering method……34
Figure 11: L-moment ratio diagram for cluster 1………………………………………………..35
Figure 12: L-moment ratio diagram for cluster 2………………………………………………..36
Figure 13: Plots of IDF statistics from the regional frequency analysis method and the at-site
method for Glen Allan station: (a) 3 hour rainfall and (b) 2 hour rainfall ...................41
x
Figure 14: Plots of IDF statistics from the regional frequency analysis method and the at-site
method for Tillsonburg station: (a) 3 hour rainfall and (b)12 hour rainfall…....…….42
Figure 15: Difference (in %) of rainfall intensity between the at-site method and the regional
frequency analysis method for stations in cluster 1………………………..……...…43
Figure 16: Plots of IDF statistics from the regional frequency analysis method and the at-site
method for Vineland Rittenhouse station (cluster 3): (a) 3 hour rainfall and (b) 12
hour rainfall………………………………………………………………..................45
Figure 17: Plots of IDF statistics from the regional frequency analysis method and the at-site
method for Hagersville station (cluster 2): (a) 3 hour rainfall and (b) 12 hour
rainfall………………………………………………………………………………..46
Figure 18: Difference (in %) of rainfall intensity between the at-site method and the regional
frequency analysis method for Hagersville (cluster 2) and Vineland Rittenhouse
(cluster 3)…………………………………………………………………………….47
Figure 19: Difference (in %) of rainfall intensity between the at-site method and the regional
frequency analysis method for weather stations in all clusters for short return
periods………………………………………………………………………………..49
Figure 20: Difference (in %) of rainfall intensity between the at-site method and the regional
frequency analysis method for weather stations in all clusters for large return
periods………………………………………………………………………………...50
xi
Figure 21: IDF curves obtained from the future IDF method, the regional frequency analysis
method and the at-site method for Durham station (3 hour period)………………....52
Figure 22: IDF curves obtained from the future IDF method, the regional frequency analysis
method and the at-site method for Durham station (12 hour period)….………….…53
Figure 23: IDF curves obtained from the future IDF method, the regional frequency analysis
method and the at-site method for Hartington station (3 hour period)………………54
Figure 24: IDF curves obtained from the future IDF method, the regional frequency analysis
method and the at-site method for Hartington station (12 hour period)….……….…55
Figure 25: Comparison of IDF curves for 12 hour rainfall duration using the minimum,
maximum and mean of the future IDF method results versus the at-site method results
for Durham station..……………………………………………...…………………...58
Figure 26: Comparison of IDF curves for 12 hour rainfall duration using the minimum,
maximum and mean of the future IDF method results versus the at-site method results
for Hartington station. ……………………………………………………………….59
xii
List of Tables
Table 1: Coefficients of the Ratio formula for converting 24h rainfall to different hourly and sub-
hourly durations of rainfall for Ontario (Adapted after Hershfield (1961); Huff and
Angel (1989) by Coulibaly and Shi 2005)……………………………………………...16
Table 2: Goodness-of-fit measure results for each station located in cluster 1 …………………37
Table 3: Goodness-of-fit measure results for each station located in cluster 2………………….38
Table 4: L-moment ratio diagram results for each cluster…………………………………….…38
Table A1: Geographic characteristics of each weather station…………………………………..73
Table B1: List of flagged data found for each weather station. Note: The definition for each
flagged value is provided in the table below…………….…………………………...75
Table B2: Definition of each flag data that was found in the daily precipitation data. The method
that was used to replace each of the flag data is indicated in the table. Note: EC is an
acronym for Environment Canada……………………………………………………77
Table C1: Data and resolution of the CanRCM4-CanEMS2 and HadGEM2-ES climate models
used…………………………………………………………………………………...78
Table D1: IDF values for Glen Allan station using the at-site method……………………….....79
Table D2: IDF values for Glen Allan station using the regional frequency analysis method…...79
Table D3: IDF values for Tillsonburg station using the at-site method…………………………80
Table D4: IDF values for Tillsonburg station using the regional frequency analysis method…..80
xiii
Table D5: IDF values for Vineland Rittenhouse station using the at-site method…………........81
Table D6: IDF values for Vineland Rittenhouse station using the regional frequency analysis
method ………………………………………………………………………………..81
Table D7: IDF values for Hagersville station using the at-site method………………………….82
Table D8: IDF values for Hagersville station using the regional frequency analysis……….......82
Table E1: IDF values for Durham station (3 hour rainfall) using the at-site method, regional
frequency analysis method and future IDF curve development method……………..83
Table E2: IDF values for Durham station (12 hour rainfall) using the at-site method, regional
frequency analysis method and future IDF curve development method…………..….83
Table E3: IDF values for Hartington station (3 hour rainfall) using the at-site method, regional
frequency analysis method and future IDF curve development method………….….84
Table E4: IDF values for Hartington station (12 hour rainfall) using the at-site method, regional
frequency analysis method and future IDF curve development method……………..84
xiv
List of Abbreviations
AMS Annual Maximum Series
CanRCM4-CanESM2
The fourth generation of the Canadian regional climate model driven
by the second generation of the Canadian earth system model.
GCM Global Climate Model
GEV Generalized Extreme Value
GHG Greenhouse gas
GLO Generalized Logistic
GNO Generalized Normal
GPD Generalized Pareto
HadGEM2-ES
The Hadley Global Environment Model 2 coupled with the earth
system model.
IDF Intensity-duration-frequency
IPCC Intergovernmental Panel on Climate Change
PCs Principal Components
PCA Principal Component Analysis
PDS Partial duration series
PE3 Pearson Type III
RCP Representative Concentration Pathways
RCM Regional Climate Model
RFA Regional Frequency Analysis
xv
List of Symbols
𝑎 Coefficient which is equal to 0
𝑏 Linear regression coefficient
𝑐 The 24-hour precipitation value
𝐶 The total cost function for the entire regionalization plan
𝑑 The ratio value used to obtain the desired hourly or sub-hourly data value
𝑃𝑥1 The station with a missing precipitation value
𝑃𝑥2 The nearest neighbouring station’s respective precipitation value
𝑋 The converted precipitation value for a desired duration
𝑉𝑥1 The total within-region variation in mean annual precipitation for all sites
𝑉𝑥2 The total within-region variation in spring precipitation for all sites
𝑉𝑥3 The total within-region variation in summer precipitation for all sites
𝑉𝑥4 The total within-region variation in fall precipitation for all sites
𝑉𝑥5 The total within-region variation in winter precipitation for all sites
𝑉𝑥6 The total within-region variation in 100 maximum daily precipitation
values for all sites
𝑉𝑥7 The non-connectivity penalty
ZDIST Goodness-of-fit measure
𝜏3 L-moment coefficient of skewness
𝜏4 L-moment coefficient of kurtosis
1
1. Introduction
1.1 Background and Rationale
Intensity-duration-frequency (IDF) curves are utilized by water resources engineers to
design hydraulic structures such as sewer systems, dams and culverts. Previously IDF curves
were developed using historical rainfall data however recent observations and studies indicate
that historical data does not represent the intensity and magnitude of rainfall events that are
expected to occur in the future. Climate change impact studies from across North America have
predicted increasing trends in extreme rainfall events and such information needs to be
accounted for in design storm estimation. Infrastructure failure is a major concern because it can
lead to flooding which has proven to negatively impact society. There were recent cases in
southern Ontario where extreme rainfall events led to severe flooding. On July 8, 2013 an intense
storm in Toronto, Ontario caused approximately $1 billion in damages due to homes being
flooded, power outages and roads being flooded (Environment and Climate Change Canada,
2015). Furthermore, on August 4, 2014 a large storm in Burlington, Ontario caused $90 million
in damages due to homes and commercial buildings being flooded, damage to driveway culverts
and the flooding of car parking areas (Conservation Halton, 2015). Therefore updating IDF
curves to account for extreme rainfall events is required to ensure that the aforementioned
consequences of floods are prevented.
Climate change is caused by anthropogenic greenhouse gas emissions being produced at
an excessive rate. Approximately 50% of anthropogenic carbon dioxide emissions have occurred
within the last 40 years (IPCC, 2014). Greenhouse gas emissions are expected to increase in the
future and climate scientists believe that there will be unduly consequences on the Earth’s
2
climate, including increases in the magnitude and intensity of extreme rainfall events.
Accounting for these projected increases is extremely important for infrastructure design,
therefore the use of future climate models is necessary when designing water system
infrastructure (Lemmen, 2008). Different scenarios project greenhouse gas emission rates for the
entire 21st century based on global population, energy use and land use practices and resulting
precipitation intensity responses can be computed using climate models (IPCC, 2014). Regional
climate models and global climate models are used in this study to determine future precipitation
intensities triggered by climate change for the southern Ontario region.
1.2 Study Objectives
This research is part of the NSERC Canadian FloodNet Research Program, specifically
Project 1-4: Development of new methods for updating IDF curves in Canada.
The specific objective of this study was to update IDF curves in southern Ontario using
emerging methodologies. This study assessed several methodologies for updating IDF curves
and a comparative analysis was completed to determine which technique is most suitable for
updating IDF curves in southern Ontario. The objectives of this study are summarized as
follows:
1. Identify weather stations located within southern Ontario that meet the requirements of
this study and collect daily precipitation data for each station.
2. Identify the most effective IDF curve methodologies and update the IDF curves for each
station located within the study area using these methodologies;
3
a. Apply the at-site method to update IDF curves, specifically fitting the generalized
extreme value (GEV) probability distribution function to the annual maximum
rainfall data for each station.
b. Apply the regional frequency analysis method to update IDF curves. Utilize
clustering techniques such as Principal Component Analysis (PCA) with k-means,
Ward’s method and Tabreg to classify weather stations into homogeneous
precipitation regions. Apply the L-moments approach to determine if the
delineated regions were statistically homogeneous and for quantile estimation.
c. Apply the future IDF curve method to update IDF statistics. Use the Delta change
method for statistical downscaling and use a combination of two climate model
outputs (CanRCM4-CanESM2 and HadGEM2-ES) and two emission scenarios
(RCP4.5 and RCP8.5) to establish future rainfall intensities.
3. Perform a comparative analysis for the three aforementioned techniques and identify the
technique that is most suitable for updating IDF curves in southern Ontario.
1.3 Organization of Thesis
Chapter 2 contains a literature review which outlines the development of IDF curve
techniques over time. Chapter 3 presents the study area and a description of the data that was
used. Chapter 4 outlines the methodologies utilized throughout this study. Chapter 5 presents and
discusses the results of the study. Chapter 6 provides the conclusions of the study, and Chapter 7
outlines recommendations for future work.
2. Literature Review
4
2.1 The At-Site Intensity-Duration-Frequency (IDF) Curve Method
The at-site method which is used for IDF curve derivation can be dated back to 1932
(Bernard, 1932). The at-site method uses frequency analysis to determine the recurrence of
extreme rainfall events for a single site (or station). Various organizations have used the at-site
method to update IDF curves for different regions throughout Canada. Coulibaly and Shi (2005)
used the at-site method to develop updated IDF curves for the Grand River region and the
Kenora and Rainy River region. The City of Guelph (2007) developed IDF curves for three
weather stations using the at-site method and compared their results with the previously outdated
IDF curves used for storm design. Shephard (2011) updated IDF curves using additional weather
station data for New Brunswick, Nova Scotia, Newfoundland and Prince Edward Island with the
at-site method. Paixao et al. (2011) utilized the at-site method to update IDF curves for 92
climate stations and 29 tipping bucket rain gauge stations in southern Ontario. Although the at-
site method was once the conventional technique used for IDF curve derivation, there are
limitations associated with this method. Skewed data obtained from weather stations could
misrepresent the extreme nature of precipitation events due to short data records, a low density of
weather stations in a large region and missing rainfall data. To circumvent this issue the regional
frequency analysis method has become increasingly utilized for IDF curve development to
overcome the limitations associated with the at-site method.
2.2 The Regional Frequency Analysis Method
The regional frequency analysis method (Hosking and Wallis, 1997) incorporates
additional weather station data to enhance the characterization of extreme rainfall events for
different regions. The regional frequency analysis method reduces uncertainties associated with
5
the at-site method by creating homogeneous regions which allows for additional rainfall data to
be included in the derivation of IDF curves. The regional frequency analysis method has been
used in studies to determine extreme rainfall rates and flood estimates for various study areas
(Jingyi and Hall, 2004; Trefry et al., 2005; Schaefer et al., 2006; Parida and Moalafhi, 2008; Lim
and Voeller, 2009; Saf, 2009; Ngongondo et al., 2011; Malekinezhad and Zare-Garizi, 2014;
Bharath and Srinivas, 2015). A key component of the regional frequency analysis approach is the
regionalization technique needed to define homogeneous regions.
2.2.1 Regionalization Techniques
One particularly important step in regional frequency analysis is the delineation of
homogeneous regions using a clustering or classification technique. Hosking and Wallis (1997)
recommend using Ward’s method because it forms clusters containing an equal number of sites.
Different studies have used a wide variety of clustering techniques to create homogeneous
regions (Stathis and Myronidis, 2009; Abolverdi and Khalili, 2010; Modarres and Sarhadi, 2011;
Paixao et al., 2011). The accuracy of the delineated regions is tested using the L-moment
approach. Regions that are heterogeneous or contain discordant stations must be reevaluated to
ensure that the inclusion of all sites within a cluster accurately represent similar hydrologic
characteristics. Three clustering techniques, Principal Component Analysis with k-means
clustering, Ward’s method and Tabreg, were utilized to delineate regions exhibiting similar
hydrologic characteristics for this study.
2.2.2 Principal Component Analysis and K-means Clustering
Principal Component Analysis (PCA) and k-means clustering have been utilized to
delineate homogeneous regions when using the regional frequency analysis method. For
6
example, Brunnetti et al. (2004) applied PCA with VARIMAX rotation to monthly precipitation
data for 39 stations located in Italy and analyzed long term precipitation trends for the 5
delineated regions. Maraun et al. (2008) analyzed seasonal precipitation trends in the United
Kingdom by applying PCA to 689 rain gauge records that were categorized into 10 precipitation
intensity classes. Stathis and Myronidis (2009) delineated homogeneous precipitation regions in
central Greece by applying PCA to mean monthly precipitation data for 75 meteorological
stations. Abolverdi and Khalili (2010) applied k-means to annual maximum precipitation data
and at-site information, including latitude, longitude and elevation, and estimated quantiles for
different return periods for 4 homogeneous regions in Iran. Bernard et al. (2013) applied k-means
clustering to maximum hourly precipitation data in France and compared its results with the
partitioning around medoids algorithm. Fragoso and Tildes Gomes (2008) utilized PCA with k-
means clustering to determine spatial distribution patterns of heavy precipitation days and
identified specific atmospheric circulation patterns contributing to heavy rainfall patterns in
Portugal. Razavi and Coulibaly (2013) applied PCA with k-means clustering to delineate
homogeneous watershed clusters in Ontario.
2.2.3 Ward’s Method
Ward’s method is a widely used clustering technique to allocate sites into homogeneous
regions (Hosking and Wallis, 1997). Smithers and Schulze (2001) applied Ward’s method to
precipitation statistics and at-site characteristics of 172 rainfall stations in South Africa and
created design storms for 15 regions, where 10 of the regions were considered acceptably
homogeneous and 5 of the regions were considered possibly heterogeneous. Unal et al. (2003)
applied Ward’s method to temperature and precipitation data for 113 climate stations in Turkey
and delineated 7 clusters that exhibited distinct climate characteristics. Muñoz-Diaz and Rodrigo
7
(2004) applied Ward’s method to delineate homogeneous regions in Spain using seasonal rainfall
data and compared their results with the PCA method. Kansakar et al. (2004) applied Ward’s
method to precipitation data in Nepal and determined the effect physical characteristics have on
the seasonality of precipitation events and the magnitude of precipitation events. Kyselý et al.
(2006) determined that Ward’s method yielded superior results compared to the average-linkage
clustering method and as a result four homogeneous regions which exhibited similar extreme
precipitation characteristics in Czech Republic were identified. Noto and La Loggia (2009)
applied Ward’s method to 52 stream gauging sites in Sicily and developed growth curves to
estimate flooding events for each region. Modarres and Sarhadi (2011) utilized Ward’s method
to delineate 8 rainfall regions in Iran and then applied L-moment statistics to determine the
homogeneity, discordancy and regional frequency distribution function for each region. Paixao
et al. (2011) applied k-means, centroid and Ward’s method to 24 hour rainfall data in southern
Ontario and determined that Ward’s method was the most suitable technique to delineate
homogeneous regions that exhibit similar precipitation patterns.
2.2.4 Tabreg
Tabreg (Yiannakoulias et al. 2007; Yiannakoulias and Bland, 2016) is a general purpose
regionalization tool that is used to cluster stations into regions that exhibit similar hydrologic
characteristics. Tabreg uses a greedy search algorithm to minimize a user defined cost function.
The cost function can be comprised of multiple criteria and it is used to minimize within-region
variability (Yiannakoulias and Bland, 2016). Tabreg is used for regionalization of weather
stations, however this algorithm can be applied to other areas of study such as creating political
districts or creating health care regions (Yiannakoulias and Bland, 2016).
8
2.3 Future IDF Curve Development Method
The at-site method and the regional frequency analysis method use historical rainfall data
to determine recurrence intervals of extreme rainfall events. However the incorporation of
climate model data is now commonly utilized when updating IDF curves. Practitioners need to
develop infrastructure accounting for larger return periods and climate model data is used to
develop design storms to withstand future rainfall extremes with a 50 year and 100 year return
period. Climate change impact studies use regional climate models and global climate models to
assess the impact future precipitation rates will have on society. The future climate model based
technique has been used globally to determine rainfall rates that are expected to occur in the
future (Hennessy et al., 1997; Xuejie et al., 2001; Mailhot et al., 2007; Mladjic et al., 2011;
Dominguez et al., 2012; Mailhot et al., 2012; Willems et al., 2012; Clavet-Gaumont et al., 2013).
Coulibaly and Shi (2005) used the Canadian Global Circulation Model-2 to develop future IDF
curves for various regions in Ontario. Olsson et al. (2009) used the RCA3 regional climate model
coupled with A2 and B2 emission scenarios and determined that there will be significant
increases in extreme precipitation during the summer and autumn seasons in Kalmar, Sweden.
Urrutia and Vuille (2009) used a regional climate model following the A2 and B2 scenarios for
the 2070 – 2100 period and determined that different regions in the tropical Andes will
experience both increases and decreases in precipitation rates. Feng et al. (2011) used a global
climate model to project future precipitation changes in China and determined that extreme
precipitation events would significantly increase in southeast China. Coulibaly et al. (2015) used
an ensemble of global climate models and regional climate models to update IDF curves for
weather stations located in the Toronto and Essex regions.
3. Study Area and Data
9
3.1 Study Area
The study area that will be assessed in this study is southern Ontario, Canada. Southern
Ontario is the most populated region in Canada and it is a region that is projected to see its
population increase in the future. Infrastructure development and expansion is expected and there
is a need for IDF curves to accurately account for the extreme rainfall events that can impact this
region. Selected 48 weather stations located within different areas of southern Ontario were used
in this study (Figure 1). Between 1970 and 2000 the weather station with the highest mean
annual precipitation was Chatsworth which had 1149 mm, and the weather station with the
lowest mean annual precipitation was Toronto Island Airport which had a recording of 795 mm.
The weather stations with the lowest altitude are Belleville and Frenchman’s Bay with 76 meters
above sea level, meanwhile Proton weather station had the highest altitude with 480 meters
above sea level.
10
Figure 1: Map of southern Ontario showing the 48 weather stations used in the study.
3.2 Observed Data
31 years of daily precipitation data between 1970 and 2000 was collected from
Environment Canada’s Climate Data Archive for all 48 weather stations. A list of the selected
weather stations is presented in Appendix A, also included in this list are geographic
characteristics of each station such as latitude, longitude and altitude. There were specific criteria
that each weather station had to fulfill before being selected for this study. The first criterion
required each weather station to have at least 30 years of data. The second criterion was each
station’s most recent record of data must be recorded in the year 2000. The last criterion required
London
Kingston
Windsor
Wiarton
Toronto
Fort Erie
11
each station to have no more than 5% of its data missing. The weather station that had the most
missing data was Cressy weather station and it had approximately 4% of its daily precipitation
data missing.
Each weather station used in this study contained a large number of flagged data, i.e.
estimated values and trace values. A table displaying the flagged data for each weather station
can be found in Appendix B, Table B1. The weather station that contained the most trace values
was Toronto Pearson Airport and approximately 20% of its daily precipitation data contained
trace values. The weather station that contained the most estimated values was St. Catharines
Power Glen and approximately 2% of its daily precipitation data contained estimated values.
Although some weather stations contained a large number of flagged values this only affected
the data used for the regionalization portion of the study. Environment Canada’s weather station
data exhibited substantial sources of error which is a limitation for the regionalization part of this
study, however the annual maximum series data used to create the IDF curves was not estimated
and considered more reliable.
3.3 Climate Model Data
Regional climate models (RCM) and global climate models (GCM) are commonly used
for climate change impact studies in Ontario. The fourth generation of the Canadian regional
climate model driven by the second generation of the Canadian Earth System Model
(CanRCM4-CanESM2) was the RCM that was used in this study. The Hadley Global
Environment Model 2 coupled with the Earth System Model (HadGEM2-ES) was the GCM that
was used in this study. The CanRCM4-CanESM2 model data was downloaded from the
Canadian Centre for Climate Modeling. The HadGEM2-ES model data was downloaded from
12
the PCMDI – Earth System Grid Federation. Details outlining characteristics such as spatial
resolution, temporal resolution and simulation period for the CanRCM4-CanESM2 and
HadGEM2-ES climate models can be found in Appendix C.
Climate model data for the current period (1970 - 2000) was collected for both climate
models. Additionally, future climate model data was downloaded for the 2050s (2026 – 2045)
and the 2100s (2081 – 2099) time periods for two emissions scenarios (RCP4.5 and RCP8.5) for
both climate models. Determining future precipitation rates in response to climate change is
achieved by looking at different scenarios which assess future global population, energy use and
land use practices to project greenhouse gas emission rates for the 21st century (IPCC, 2014).
These scenarios are known as Representative Concentration Pathways (RCPs). RCP4.5 is an
intermediate scenario that limits the increase of future greenhouse gas emissions, meanwhile
RCP8.5 is an intensive scenario which predicts that future greenhouse gas emissions will
continually increase (IPCC, 2014).
4. Methodology
4.1 Overview of Methods
Initially weather stations were assessed to see if they would be suitable for this study.
Additionally, an extensive review was conducted to determine the most robust techniques
currently used for IDF curve development. After the review was conducted it was determined
that the at-site method, the regional frequency analysis method and a future IDF curve method
would be used to develop IDF curves for the selected weather stations. Figure 2 shows a
flowchart of the methodology used for this study. Detailed description of each step of the
methodology is provided hereafter.
13
Figure 2: An overview of the methodology used in this study.
Selection of Stations (48 stations were selected)
Regional Frequency Analysis Method
Classification of stations and
identification of homogeneous
regions
Fitting best fit probability
distribution to each homogeneous region
Development of regional IDF curves
At-site Method
Fitting of generalized extreme value (GEV)
probability distribution function
to each observed station data
Development of IDF curves for each
station using the at-site method
Future IDF Curve Method
Climate Model Selection
Downscaling climate model
data using Delta change approach
Development of future IDF curves
Comparison of IDF Curves
14
4.2 Data Processing
4.2.1 Converting Daily Duration Data to 24-hour Duration Data
The Environment Canada weather station data was provided in a daily duration format,
however to create IDF curves for drainage design purposes the precipitation data must be in an
hourly format. For all weather stations daily precipitation data was multiplied by 1.13, which is
the Hershfield factor and it is commonly used to convert daily precipitation data to 24-hour
duration data (Hershfield, 1961; Huff and Angel 1989; 1992; Coulibaly and Shi 2005; Coulibaly
et al. 2015).
4.2.2 Replacing Flagged Data
Flagged data was replaced for each weather station using the following procedure, which
is further described in Appendix B, Table B2:
1. All trace values were replaced with a value of 0.2 mm because this is the minimum
recorded rainfall value from the dataset.
2. The “C” flag, which is defined as precipitation occurred but amount is uncertain, was
replaced with a value of 0 mm.
3. The “A” flag, which is defined as accumulated precipitation where the previous value
was the “C” flag, was replaced with a value of 0 mm.
4. The “E” flag, which is defined as an estimated value, already had a value in the dataset
that was provided by Environment Canada, therefore the respective value remained
intact.
15
5. The “F” flag, which is defined as an accumulated and estimated value, already had a
value in the dataset that was provided by Environment Canada, therefore the respective
value remained intact.
6. All missing data was replaced using a linear regression model between two nearest
neighboring stations. The linear regression equation is formulated as,
𝑃𝑥1 = 𝑎 + 𝑏𝑃𝑥2 (1)
where 𝑃𝑥1 is the station with a missing precipitation value, 𝑎 is a coefficient which
equals 0, 𝑏 is the linear regression coefficient and 𝑃𝑥2 is the nearest neighbouring
station’s respective precipitation value.
4.2.3 Converting 24-hour Duration Data to Hourly and Sub-Hourly Duration Data
Converting 24-hour precipitation data to hourly and sub-hourly data was completed using
the ratio formula which was introduced by Hershfield (1961) and adapted by Huff and Angel
(1989). The ratios were obtained from the Hydrological Atlas of Canada (1978) and can be
viewed in Table 1. The ratio formula is defined as
𝑋 = 𝑐(𝑑) (2)
where 𝑋 is the converted precipitation value for a desired duration, 𝑐 is the 24-hour precipitation
value and 𝑑 is the ratio value used to obtain the desired hourly or sub-hourly data value.
16
Rainfall Duration Td (hours) Ratio Converting 24-hour Data
18 0.91
12 0.82
6 0.73
3 0.64
2 0.56
1 0.45
0.50 (30 min.) 0.38
0.25 (15 min.) 0.29
0.17 (10 min.) 0.22
0.08 (5 min.) 0.12
Table 1: Coefficients of the Ratio formula for converting 24h rainfall to different hourly and
sub-hourly durations of rainfall for Ontario (Adapted after Hershfield (1961); Huff and Angel
(1989); by Coulibaly and Shi, 2005).
4.2.4 Annual Maximum Series Derivation
For IDF curve development there are two methods commonly used to extract extreme
rainfall data from a time series, which are the annual maximum series (AMS) method and the
partial duration series (PDS) method. The AMS method extracts the largest recorded
precipitation event from each year. The PDS method sets a precipitation threshold value, i.e. for
24 hour rainfall data, the threshold value can be set to 40 mm (which is site or region dependent),
therefore any rainfall event greater than 40 mm will be extracted from the time series and is
included when creating IDF curves. There are positive and negative factors associated with both
methods. The problem associated with the AMS method is that it only extracts one extreme
17
rainfall event from a given year, however there may be two or more extreme precipitation events
within a year, therefore the extreme rainfall characteristics for a station could be misrepresented.
The issue associated with the PDS method is that there is large uncertainty regarding how to
properly choose the precipitation threshold value, i.e. is 40 mm more representative of extreme
rainfall events than 50 mm. Subjectivity is involved when answering the aforementioned
problem. Therefore the AMS method was employed when deriving IDF curves for this study.
The AMS data was extracted for all weather stations using MatLab (R2014a).
4.3 At-Site Method IDF Curve Development
The at-site IDF curve method was utilized to develop IDF curves for all weather stations
in the study area. Frequency analysis was utilized to determine the recurrence of extreme rainfall
events for a variety of return periods. Return periods are measured in years and the range of
return periods used in this study include 2, 5, 10, 20, 50 and 100 year return periods. To estimate
recurrence intervals of extreme precipitation events a probability distribution was fit to the AMS
data. The generalized extreme value (GEV) distribution was the probability distribution function
that was chosen based on previous recent study in Southern Ontario (Coulibaly et al., 2015).
Once the probability distribution function is fitted to the AMS, estimation of extreme rainfall
events for desired return periods can be completed. IDF curves for various storm durations
ranging from 15 minutes to 24 hour were created.
4.4 Regional Frequency Analysis Method IDF Curve Development
The regional frequency analysis method was used to create IDF curves for all
homogeneous regions identified by selected regionalization techniques. The regional frequency
analysis approach includes additional weather stations in the development of IDF curves. Three
18
clustering techniques were utilized to generate homogeneous regions, which include the
Principal Component Analysis (PCA) with k-means, the Ward’s method, and the Tabreg
approach. A regional probability distribution function was fit to the AMS of all weather stations
in each respective cluster. IDF curves for multiple rainfall durations were created for all stations
within that cluster.
4.4.1 Regionalization Method Inputs
When using the regional frequency analysis method, it is important to include at-site
characteristics as inputs into the clustering technique (Hosking and Wallis, 1997). After
completing sensitivity analyses it was determined that the most suitable parameters to
characterize precipitation for stations in southern Ontario included mean annual precipitation
(MAP), total spring precipitation (April – May), total summer precipitation (June – September),
total fall precipitation (October – November), total winter precipitation (December – March) and
total 100 maximum daily precipitation values. It is common practice to include input parameters
such as number of wet days and elevation, however this is problematic because these
aforementioned parameters do not have the same unit so it is difficult to ensure that all of the
parameters are evenly weighted. The units for all parameters used in this study are in
millimeters. All of the data has been normalized before being inputted into the regionalization
techniques.
4.4.2 Principal Component Analysis with K-means Clustering
Principal Component Analysis (PCA) is a multivariate method that discovers patterns and
trends within a dataset with the objective of classifying multiple variables using one value
(Mallants and Feyen, 1990). In most regionalization studies, multiple variables are assessed to
19
determine the behavior of a system. PCA extracts information from the original dataset and
simplifies it into orthogonal values called principal components (PCs) (Mallants and Feyen,
1990). The first PC explains the largest variance within the dataset, the second PC explains the
second largest variance within the dataset, and this trend continues until the last PC is measured
(Stathis and Myronidis, 2009). It is acceptable to include PCs where the variance explained is
greater than or equal to 80%. PCA was applied to observed data to describe hydrological
characteristics of sites located within southern Ontario.
K-means algorithm was used to classify the PCA scores of each weather station into
homogeneous regions. K-means algorithm places centroids into a dimension of space which
describes the PCA score data (Razavi and Coulibaly, 2013). The Euclidean distance between
each cluster centroid and the closest PCA score is measured (Razavi and Coulibaly, 2013). K-
means algorithm divides the data into k clusters by assigning each PCA score to the respective
cluster where the distance between the data point and the cluster centroid is smallest (Fragoso
and Tildes Gomes, 2008). The cluster centroids continue to relocate based on the inclusion of
data points within each respective cluster. The algorithm continually recalculates the
membership of each data point within each cluster in order to obtain new clusters, and this
process is carried out until the variance within each cluster is minimized. Therefore each cluster
is created based on the minimum Euclidean distance between the cluster centroids and each data
point.
The Davies–Bouldin (DB) index introduced by Davies and Bouldin (1979) is a metric for
evaluating clustering algorithms. Small values of this index correspond to clusters that are
compact (Aguado et al., 2008). Therefore the K-means algorithm with the Davies-Bouldin index
is applied to the first two principal components to determine the appropriate number of clusters
20
for the stations located in the study area (Razavi and Coulibaly, 2013). The number of clusters
that minimizes the Davis-Bouldin index is taken as the optimal number of clusters.
4.4.3 Ward’s Method
Ward’s method is an agglomerative hierarchal clustering technique that was used to
group each weather station into clusters that exhibit similar site characteristics. In Ward’s
method each site behaves as an individual cluster and after each iteration, sites that exhibit the
smallest Euclidean distance merge together to form a new cluster. For example, if Ga and Gb are
two sites that have the shortest distance between them compared to all other sites, then they
merge to form a new cluster Ga,b (Kahya et al., 2008). The regionalization process is complete
when all sites belong to a single cluster and desired regions are delineated by reducing the
variance within each cluster. Hosking and Wallis (1997) recommended using Ward’s method for
regionalization because it produces adequate clusters with an equal amount of sites in each
region.
4.4.4 Tabreg Method
Tabreg was used to cluster weather stations into regions which exbitied similar
hydrologic characteristics. The only hard constraint on the formation of regions is that the sites
within a region must form a contiguous system. The topology for determining contiguity is
provided in Figure 3.
21
Figure 3: Topology created for the selected weather stations.
In our application, we start with a randomly generated regionalization plan and use
Tabreg to search for an improved plan by moving stations between regions in a way that
minimizes this cost function subject to the contiguity constraint as follows:
𝐶 = (𝑉𝑥1 + 𝑉𝑥2 + 𝑉𝑥3 + 𝑉𝑥4 + 𝑉𝑥5 + 𝑉𝑥6 + 𝑉𝑥7) (3)
where 𝐶 represents the total cost function for the entire regionalization plan, 𝑉𝑥1 represents the
total within-region variation in mean annual precipitation for all sites, 𝑉𝑥2 represents the total
within-region variation in spring precipitation for all sites, 𝑉𝑥3 represents the total within-region
variation in summer precipitation for all sites, 𝑉𝑥4 represents the total within-region variation in
fall precipitation for all sites, 𝑉𝑥5 represents the total within-region variation in winter
22
precipitation for all sites, 𝑉𝑥6 represents the total within-region variation in 100 maximum daily
precipitation values for all sites and 𝑉𝑥7 represents the non-connectivity penalty (Yiannakoulias
et al., 2007). The non-connectivity penalty increases the cost associated with regions of irregular
shape.
Once an initial randomly generated regionalization plan is found, Tabreg uses a greedy
search to move sites into and out of regions in a way that minimizes the cost function.
Unfortunately regionalization plans based on greedy searches are not likely to generate good
regionalization results. This is because they can get trapped in local (or “poor”) optima from
which no improved moves can be found. In order to escape poor quality regionalization plans,
Tabreg employs a tabu list based on the work of Bozkaya et al. (2003). Local optimum solutions
are avoided by memorizing previous station assignments and new regions are created by adding
stations into different areas of the search space. This occurs because a tabu list limits certain
search options for a period of time. This forces the algorithm to identify new stations that can be
added into regions. Initially this can negatively impact the cost functions, however over a long
run, as more station assignments are made, the quality of solutions is greatly improved. As
displayed in Figure 4, the quality of solutions remains trapped in a local optima for the first 235
iterations. Once the tabu list is activated additional stations are utilized and optimal regions are
delineated which is observed when the cost function substantially decreases from 57 to less than
20.
23
Figure 4: Graphical plot of the cost function after 500 iterations using Tabreg.
A non-connectivity penalty was included in the objective function of the Tabreg method
to ensure that regions are relatively compact in shape. Additionally Tabreg ensures that the
delineated regions are contiguous, meanwhile PCA with k-means and Ward’s method does not
ensure this.
4.4.5 L-moments Approach
The L-moments statistics validate the homogeneous regions that are delineated using
regionalization techniques. When using this approach it is assumed that stations which form a
0
10
20
30
40
50
60
70
80
90
100
1
19
37
55
73
91
10
9
12
7
14
5
16
3
18
1
19
9
21
7
23
5
25
3
27
1
28
9
30
7
32
5
34
3
36
1
37
9
39
7
41
5
43
3
45
1
46
9
48
7
Co
st F
un
ctio
n
Number of Iterations
Tabreg Cost Function After 500 Iterations
24
homogeneous region represent an identical frequency distribution apart from a site-specific
scaling factor (Hosking and Wallis, 1997). The L-moments approach uses multiple datasets from
a respective region to compute the final quantile estimates which are used to create regional IDF
curves.
4.4.6 L-moment Ratio Diagrams
L-moment ratio diagrams identify the probability distribution function that is most
appropriate for each region. The L-moment ratio diagram is used to assess the adequacy of five
different three-parameter probability distribution functions for each region, including generalized
logistic (GLO), generalized extreme value (GEV), generalized normal (GN), Pearson Type III
(P3) and generalized Pareto (GPA) distributions. L-moment ratio diagrams are visual tools which
present the L-moment coefficient of kurtosis, 𝜏4, of each distribution as a function of its L-
moment coefficient of skewness, 𝜏3 (Hosking and Wallis, 1997). For the candidate probability
distribution functions, the representation of 𝜏4 as a function of 𝜏3 will result in a curve that can
be seen on the L-moment ratio diagram (Hosking and Wallis, 1997). In the same diagram the
point corresponding to 𝜏4 as a function of 𝜏3 for the observed data is considered the reference
point. The selection of the best distribution is based on the proximity of the distribution curve in
relation to the observed data. A suitable probability distribution function is selected if it is closest
to the point of the observed data.
4.4.7 Goodness-of-Fit Measure
When using the L-moment ratio diagram if there are multiple probability distribution
functions that are appropriate for a respective region then the goodness-of-fit measure is utilized
to accurately determine which probability distribution function is most suitable for each region.
25
The goodness-of-fit measure assessed the similarity between the sample kurtosis of the observed
data and the population kurtosis of the fitted distribution (Hosking and Wallis, 1997). A
candidate distribution was selected if the goodness-of-fit measure ZDIST
was less than the
threshold value 1.64 which corresponds to the 90% normal quantile (Hosking and Wallis, 1997).
ZDIST
was defined as
𝑍𝐷𝐼𝑆𝑇 =𝜏4
𝐷𝐼𝑆𝑇−𝜏4𝑅+𝛽4
𝜎4 (4)
where DIST is any particular candidate distribution that is being assessed, 𝜏4𝑅 is the regional
average L-kurtosis, 𝜏4𝐷𝐼𝑆𝑇 is the L-kurtosis of the fitted distribution, 𝛽4 is the bias of the regional
average sample L-kurtosis, and 𝜎4 is the standard deviation of the regional average sample L-
kurtosis (Hosking and Wallis, 1997). A candidate distribution was considered satisfactory if
𝑍𝐷𝐼𝑆𝑇 ≤ 1.64 and a satisfactory distribution was accepted at a confidence level of 90% (Hosking
and Wallis, 1997).
4.5 Future IDF Curve Methodology
4.5.1 Selection of Future Time Periods
Certain climate models do not have the same projected future time periods. In order to
allow for a consistent comparison of different climate models, the future time periods for
selected climate models must overlap. Two time periods were selected to create future IDF
curves for both the CanRCM4-CanESM2 and HadGEM2-ES climate models, the 2050s (2026 –
2045) and the 2100s (2081 – 2099). Both of these time periods follow the RCP4.5 and RCP8.5
scenarios. Therefore for each station, an ensemble of IDF projections were generated using the
aforementioned RCP scenarios and future time periods.
26
4.5.2 Climate Model Data Processing
Data for CanRCM4-CanESM2 was downloaded at the three closest grid points of each
station and the inverse distance weighted average was calculated to obtain the daily rainfall data.
Data for HadGEM2-ES was downloaded at the closest grid point of each station because GCMs
have coarse spatial resolution and only the grid point that is closest to each station can best
represent the data. The CanRCM4-CanESM2 data was downloaded in a 1-hour duration format
and the HadGEM2-ES data was provided in a 3-hour duration format. In order to obtain climate
model data for different storm durations, the aggregation method was applied to the precipitation
data to get lower resolution data (i.e. 6-hour, 12-hour and 24-hour for HadGEM2-ES and 3-hour,
6-hour, 12-hour and 24-hour for CanRCM4-CanESM2) and then higher resolution data (i.e. 15-
min, 30-min, 1-hour for HadGEM2-ES and 15-min and 30-min for CanRCM4-CanESM2) was
derived using the ratio formula presented in Section 4.1.3 of this study.
4.5.3 Downscaling Techniques
Each climate model scenario is downscaled using a variant of the Delta change method
first introduced by Olsson (2009). The Delta change method is a robust downscaling technique
that has been implemented by the European Union Sustainable Urban Development Planner for
Climate Change Adaptation Project (SUDPLAN, 2012). The Delta change method uses climate
projections to estimate the change in current design storms and future design storms and applies
this change to historical precipitation data. The projected future design storm 𝐼𝑝 is formulated as:
𝐼𝑝 = 𝐼𝑜𝐼𝑓
𝐼𝑐 (5)
27
where 𝐼𝑓 is the future design storm based on future climate model rainfall intensities for a
specific duration, 𝐼𝑐 is the current design storm based on current climate model rainfall
intensities for a specific duration and 𝐼𝑜 is the observed rainfall data for a specific duration. The
advantage of the Delta change method is that the temporal and spatial variability of the observed
data is preserved (Olsson et al., 2009).
4.5.4. Future IDF Curve Derivation
Based on previous recent study (Coulibaly et al. 2015), the generalized extreme value
(GEV) distribution was the probability distribution function that was used for all future design
storm estimation. IDF curves for each future time period were developed by taking the
downscaled output from each climate model and RCP scenario (Coulibaly et al., 2015).
5. Results and Discussion
5.1 Principal Component Analysis and K-means Clustering Regionalization Results
The first step when utilizing regional frequency analysis is determining the number of
clusters that are appropriate for the region based on the available data. The k-means algorithm
and the Davies-Bouldin index was applied to determine the optimal number of clusters based on
the available dataset. The optimal number of clusters equates to the cluster size that has the
lowest Davies-Bouldin value. The value of this index is averaged after 10 runs. The results based
on the Davies-Bouldin index indicate that nine clusters is appropriate to delineate regions for the
sites used in this study (Figure 5).
28
Figure 5: Davies-Bouldin index applied to the input dataset.
The percentage of variability which is explained for each principal component is
acceptable at a level of 80%. Figure 6 shows the percentage of variance explained by each
principal component. The first two principal components explained approximately 80% of the
total variability of the data and will be included in the principal component analysis portion of
the study.
The principal component coefficients for each attribute and the principal component
scores for each observation are presented in Figure 7. The six parameters are represented as a
vector and the position of each vector indicates which parameters influence each principal
component. For the first principal component all six parameters have positive coefficients which
29
indicates that regionalization of the stations will be influenced by all parameters. Mean annual
precipitation is the parameter that exhibits the most influence on the first principal component.
Figure 6: Percentage of total variability explained by each principal component.
Meanwhile total fall precipitation, total winter precipitation, total summer precipitation, total
spring precipitation and 100 maximum daily precipitation values have less of an influence on the
first principal component (in descending order). The 100 maximum daily precipitation values is
the parameter that is represented the most in the second principal component. Total spring
precipitation and total summer precipitation are parameters that also exhibited influence on the
second principal component. Mean annual precipitation, total fall precipitation and total winter
30
precipitation are the least influential parameters for the second principal component (in
descending order).
Figure 7: PCA loading plots for the first and second principal components.
The results for weather station regionalization using Principal Component Analysis with
k-means clustering are shown in Figure 8. The regionalization results using PCA with k-means
clustering show that contiguous homogeneous regions were not obtained and that most regions
are heterogeneous. Most clusters contain stations that are located at opposite areas of the study
31
region. Although some clusters such as cluster 4, cluster 7 and cluster 9 show signs of contiguity
it can be seen that the use of PCA with k-means clustering creates heterogeneous regions.
Figure 8: Map showing the regions that were delineated using Principal Component
Analysis with k-means.
5.2 Ward’s Method Regionalization Results
The results for weather station regionalization using Ward’s method are shown in Figure
9. The regions delineated using Ward’s method exhibit more contiguity compared to PCA with
32
k-means clustering, however cluster 1, cluster 2 and cluster 3 show signs of heterogeneity and
are not contiguous homogeneous regions (Figure 9). However, the method also delineated
contiguous homogeneous: cluster 4, cluster 6, cluster 9 and cluster 9. But, most regions created
using Ward’s method are heterogeneous.
Figure 9: Map showing the regions that were delineated using Ward’s method.
PCA with k-means clustering and Ward’s method produced clusters that are not
contiguously homogeneous. An important aspect of regional frequency analysis is to delineate
contiguous homogeneous regions. In previous studies it is common to see methods such as PCA
with k-means clustering and Ward’s method to include input parameters such as latitude and
33
longitude which allow for the delineation of contiguous regions. Initially latitude and longitude
were not included as input parameters for this study. Additional tests were completed where
latitude and longitude were included as input parameters however this did not obtain better
results as the new regions that were created still did not exhibit contiguity. Therefore the
inclusion of a regionalization technique that accounts for contiguity and is able to cluster the
stations into statistically homogeneous regions is required. Tabreg clustering method has proven
to be a robust method that accounts for contiguity and it was used as an additional
regionalization method.
5.3 Tabreg Regionalization Results
The Tabreg clustering method was used to delineate homogeneous regions and the results
show that the clusters that were formed are uniformly and contiguously regionalized (Figure 10).
34
Figure 10: Map showing the regions that were delineated using Tabreg clustering method.
Therefore the regions delineated using the Tabreg method will be used for regional frequency
analysis in this study. To ensure that the clusters are statistically homogeneous the L-moments
approach was applied to each cluster delineated using Tabreg. All of the clusters obtained using
Tabreg successfully passed the discordancy measure and the heterogeneity measure test.
5.4 L-moment ratio diagrams
L-moment ratio diagrams are visual tools that assess which probability distribution
functions are most appropriate for each clusters’ dataset. The probability function closest to the
reference point (the + symbol in Figure 11 and Figure 12) is considered the most appropriate
London
Kingston
Windsor
Wiarton
Toronto
Fort Erie
35
function. The most appropriate probability distribution function that will be used for cluster 1 is
the generalized logistic (GLO) probability distribution (Figure 11) and for cluster 2 it is the
generalized extreme value (GEV) probability distribution (Figure 12).
Figure 11: L-moment ratio diagram for cluster 1. Note: Each line on the graph represents
different three-parameter distributions that were tested for cluster 1. The three-parameter
distributions used are: GPA: Generalized Pareto, GEV: Generalized Extreme Value, GLO:
Generalized Logistic, LN3: Lognormal and PE3: Pearson Type 3; the + symbol is the reference
point; the ■ symbol represents the different two-parameter distributions that were tested for
cluster 1. The two-parameter distributions used are E: Exponential, G: Gumbel, L: Logistic, N:
Normal and U: Uniform.
36
Figure 12: L-moment ratio diagram for cluster 2. Note: Each line on the graph represents
different three-parameter distributions that were tested for cluster 2. The three-parameter
distributions used are: GPA: Generalized Pareto, GEV: Generalized Extreme Value, GLO:
Generalized Logistic, LN3: Lognormal and PE3: Pearson Type 3; the + symbol is the reference
point; the ■ symbol represents the different two-parameter distributions that were tested for
cluster 2. The two-parameter distributions used are E: Exponential, G: Gumbel, L: Logistic, N:
Normal and U: Uniform.
37
5.5 Goodness-of-fit measure
The goodness-of-fit measure was utilized to confirm the most appropriate regional
probability distribution function for each cluster. There are many cases where more than one
probability distribution appropriately fits the dataset. For cluster 1 there are three probability
distributions that can appropriately fit the cluster’s dataset, including the generalized logistic
distribution, generalized extreme value distribution and generalized normal distribution (Table
2). For cluster 2 there are four probability distributions that can appropriately fit the cluster’s
data (Table 3). Since multiple probability distributions passed the goodness-of-fit measure, the
ZDIST
score closest to 0 is used to determine the most suitable probability distribution for each
cluster. The most suitable probability distributions for all clusters can be found in Table 4.
Cluster 1
Probability Distribution
Function Goodness-of-fit measure
Generalized Logistic -0.39
Generalized Extreme Value -1.18
Generalized Normal -1.52
Table 2: Goodness-of-fit measure results for each station located in cluster 1.
38
Cluster 2
Probability Distribution
Function Goodness-of-fit measure
Generalized Logistic 0.50
Generalized Extreme Value -0.32
Generalized Normal -0.71
Pearson Type III -1.42
Table 3: Goodness-of-fit measure results for each station located in cluster 2.
Cluster Identification Best Fit Distribution
Cluster 1 GLO
Cluster 2 GEV
Cluster 3 GEV
Cluster 4 GNO
Cluster 5 GLO
Cluster 6 GPA
Cluster 7 PE3
Cluster 8 GPA
Cluster 9 GNO
Table 4: L-moment ratio diagram results for each cluster.
39
5.6 IDF Curve Results: At-Site Method Compared to Regional Frequency Analysis
Method
After determining the most appropriate regional probability distribution function for each
cluster, IDF curves were created for each station using multiple return periods. For illustrative
purposes, the at-site method (following the GEV distribution) and the regional frequency
analysis method IDF curve results were compared for selected stations in different clusters. The
at-site IDF curve and the regional frequency analysis method IDF curve for Glen Allen station
are shown in Figure 13. The at-site IDF curve and the regional frequency analysis method IDF
curve for Tillsonburg station are shown in Figure 14. Glen Allen and Tillsonburg are located in
cluster 1 and the probability distribution function that is used for regional frequency analysis for
cluster 1 is the generalized logistic distribution. For the 3 hour and 12 hour duration rainfall the
at-site method produces similar rainfall values compared to the regional frequency analysis
method for the 2, 5 and 10 year return period for both Glen Allen station (Figure 13) and
Tillsonburg station (Figure 14). The regional frequency analysis method produces slightly larger
rainfall values compared to the at-site method for the 20 year return period for Glen Allen and
Tillsonburg. For the 50 year and 100 year return period the regional frequency analysis method
produces significantly larger rainfall values compared to the at-site method for Glen Allen and
Tillsonburg.
Using two different probability distribution functions for the at-site IDF curve and the
regional frequency analysis IDF curve contributed to the differences that were observed between
the at-site method and the regional frequency analysis method. The Z-score probability values
for the generalized logistic distribution and the Z-score probability values for the generalized
extreme value distribution are similar for the 2, 5 and 10 year return periods. However for the
40
larger return periods – such as 20, 50 and 100 years – the generalized logistic and the generalized
extreme value distributions have large differences in Z-score probability values. Since the Z-
score probability values for short return periods are similar for both the GEV and GLO
distributions then the IDF curve values are also similar. However for the larger return periods,
the GLO Z-score probability values are larger compared to the GEV Z-score probability values
and as a result the final IDF curve values for the regional frequency analysis method are
subsequently larger than the rainfall values of the at-site method. There are other clusters in the
study area that have a different regional probability distribution function compared to the at-site
probability distribution. As a result significant differences of rainfall values were observed for
larger return periods in other clusters as well. The patterns observed using the at-site method and
regional frequency analysis method are displayed by showing the percentage difference of the
rainfall intensities produced for the at-site method and the regional frequency analysis method
(Figure 15).
41
(a) (b)
Figure 13: Plots of IDF statistics from the regional frequency analysis method and the at-site
method for Glen Allan station (cluster 1): (a) 3 hour rainfall and (b) 12 hour rainfall.
5
10
15
20
25
30
35
2 5 10 20 50 100
Rai
nfa
ll In
ten
sity
(m
m/h
r)
Return Period (years)
Glen Allan Station: 3 hour Rainfall
At-siteMethod
RegionalMethod
2
3
4
5
6
7
8
9
10
2 5 10 20 50 100R
ain
fall
Inte
nsi
ty (
mm
/hr)
Return Period (years)
Glen Allan Station: 12 hour Rainfall
At-siteMethod
RegionalMethod
42
(a) (b)
Figure 14: Plots of IDF statistics from the regional frequency analysis method and the at-site
method for Tillsonburg station (cluster 1): (a) 3 hour rainfall and (b) 12 hour rainfall.
5
10
15
20
25
30
35
2 5 10 20 50 100
Rai
nfa
ll In
ten
sity
(m
m/h
r)
Return Period (years)
Tillsonburg Station: 3 hour Rainfall
At-siteMethod
RegionalMethod
2
3
4
5
6
7
8
9
10
2 5 10 20 50 100R
ain
fall
Inte
nsi
ty (
mm
/hr)
Return Period (years)
Tillsonburg Station: 12 hour Rainfall
At-siteMethod
RegionalMethod
43
Figure 15: Difference (in %) of rainfall intensity between the at-site method and the
regional frequency analysis method for stations in cluster 1.
After comparing IDF curves using the at-site method and the regional frequency analysis
method with different probability distribution functions, the differences in rainfall values for
larger return periods were obvious. Given that certain clusters use the GEV distribution similarly
to the at-site method, the next set of stations compared using the at-site method and the regional
frequency analysis method will use the GEV distribution to see if there are still differences in
final IDF values. The at-site IDF curve and the regional frequency analysis method IDF curve for
Vineland Rittenhouse station are shown in Figure 16. The at-site IDF curve and the regional
frequency analysis method IDF curve for Hagersville station are shown in Figure 17. Vineland
-10
-5
0
5
10
15
20
25
30
2 5 10 20 50 100
Pe
rce
nta
ge D
iffe
ren
ce (
%)
Return Period (years)
Difference in rainfall intensity between the at-site method and the RFA method
Glen Allan
Tillsonburg
44
Rittenhouse is located in cluster 3 and Hagersville is located in cluster 2 and the probability
distribution function that is used for regional frequency analysis for cluster 2 and cluster 3 is the
generalized extreme value (GEV) distribution also used for all stations in the at site method. For
the 3 hour and 12 hour duration rainfall, the at-site method produces similar rainfall values
compared to the regional frequency analysis method for the 2, 5 and 10 year return period for
both Vineland Rittenhouse (Figure 16) and Hagersville (Figure 17). The regional frequency
analysis method produces slightly larger rainfall values compared to the at-site method for the 20
year return period for Vineland Rittenhouse (Figure 16), however the regional frequency analysis
method has similar rainfall values as the at-site method at the 20 year return period for
Hagersville (Figure 17). For the 50 year and 100 year return period the regional frequency
analysis method produces larger rainfall values compared to the at-site method for Vineland
Rittenhouse and Hagersville.
A similar trend that was observed in the cluster 1 results is also observed in the results for
cluster 2 and cluster 3 where the rainfall values for the at-site method and the regional frequency
analysis method are similar for the 2, 5 and 10 year return period. However for the larger return
periods – including 20, 50 and 100 years – the rainfall values are larger for the regional
frequency analysis method compared to the at-site method. Therefore the final IDF values for the
at-site method are similar to the regional frequency analysis method for short return periods for
both stations. The patterns observed using the at-site method and regional frequency analysis
method are displayed by showing the percentage difference of the rainfall intensities produced
for the at-site method and the regional frequency analysis method (Figure 18).
45
(a) (b)
Figure 16: Plots of IDF statistics from the regional frequency analysis method and the at-site
method for Vineland Rittenhouse station (cluster 3): (a) 3 hour rainfall and (b) 12 hour rainfall.
0
5
10
15
20
25
2 5 10 20 50 100
Rai
nfa
ll In
ten
sity
(m
m/h
r)
Return Period (years)
Vineland Station: 3 hour Rainfall
At-siteMethod
RegionalMethod
0
1
2
3
4
5
6
7
8
2 5 10 20 50 100R
ain
fall
Inte
nsi
ty (
mm
/hr)
Return Period (years)
Vineland Station: 12 hour Rainfall
At-siteMethod
RegionalMethod
46
(a) (b)
Figure 17: Plots of IDF statistics from the regional frequency analysis method and the at-site
method for Hagersville station (cluster 2): (a) 3 hour rainfall and (b) 12 hour rainfall.
0.0
5.0
10.0
15.0
20.0
25.0
30.0
2 5 10 20 50 100
Rai
nfa
ll In
ten
sity
(m
m/h
r)
Return Period (years)
Hagersville Station: 3 hour Rainfall
At-SiteMethod
RegionalMethod
0.0
1.0
2.0
3.0
4.0
5.0
6.0
7.0
8.0
9.0
10.0
2 5 10 20 50 100
Rai
nfa
ll In
ten
sity
(m
m/h
r)
Return Period (years)
Hagersville Station: 12 hour Rainfall
At-SiteMethod
RegionalMethod
47
Figure 18: Difference (in %) of rainfall intensity between the at-site method and
the regional frequency analysis method for Hagersville (cluster 2) and
Vineland Rittenhouse (cluster 3).
Furthermore, the at-site method and the regional frequency analysis method were
compared using weather stations from all clusters for short return periods (Figure 19) and for
large return periods (Figure 20). Similar trends that were observed in clusters 1, 2 and 3 are also
found within the other delineated clusters as differences in rainfall intensities for large return
periods are obvious, however for short return periods rainfall values are similar. Cluster 2 and
cluster 3 use the generalized extreme value as the regional probability distribution, and the
regional frequency analysis method produces larger rainfall values compared to the at-site
-20
-15
-10
-5
0
5
10
15
20
25
2 5 10 20 50 100
Pe
rce
nta
ge D
iffe
ren
ce (
%)
Return Period (years)
Difference in rainfall intensity between the at-site method and the RFA method
Hagersville
VinelandRittenhouse
48
method. Meanwhile clusters 1, 4, 5, 6, 7, 8 and 9 use a different regional probability distribution
compared to the at-site method, and differences in rainfall values are observed for larger return
periods as well. Overall using the regional frequency analysis method produces differences in
rainfall values compared to the at-site method, especially for large return periods. Similar
patterns are observed for clusters where the regional frequency analysis probability distribution
and the at-site probability distribution are different or identical. There were also cases where
stations using the at-site method produced larger rainfall values compared to the regional
frequency analysis method. Therefore regardless of which probability distribution function is
used for each IDF method, the IDF values are generally different for larger return periods (50 to
100 years).
49
Figure 19: Difference (in %) of rainfall intensity between the at-site method and
the regional frequency analysis method for weather stations in all clusters for short return
periods.
-6
-4
-2
0
2
4
6
8
10P
erc
en
tage
Dif
fere
nce
(%
)
Weather Stations
Difference in rainfall intensity between the at-site method and the RFA method for all clusters for short return periods
2 year returnperiod
5 year returnperiod
10 year returnperiod
50
Figure 20: Difference (in %) of rainfall intensity between the at-site method and
the regional frequency analysis method for weather stations in all clusters for large return
periods.
5.7 Future IDF Curve Results
To simplify results presentation, sample station results are presented for illustration.
Durham station is located in cluster 9 and Hartington station is located in cluster 6 and despite
being at large distance away from one another, similar future IDF curve results were observed
for both stations. Figure 19 shows the future IDF results for Durham station for the 3 hour
0
5
10
15
20
25P
erc
en
tage
Dif
fere
nce
(%
)
Weather Stations
Difference in rainfall intensity between the at-site method and the RFA method for all clusters for large return periods
20 year returnperiod
50 year returnperiod
100 year returnperiod
51
rainfall duration and Figure 20 shows the Durham station results for the 12 hour rainfall duration.
Figure 21 shows the future IDF curves for Hartington station for the 3 hour rainfall duration and
Figure 22 shows the Hartington station results for the 12 hour rainfall duration. Future IDF
curves derived for the 2050s period and 2100s period were compared with the IDF curve results
using the at-site method and the regional frequency analysis method. The 2050s period has larger
rainfall values compared to the at-site method and the regional frequency analysis method for all
return periods for both Durham (Figure 19 and Figure 20) and Hartington (Figure 21). However
for the 12 hour duration for Hartington station, the regional frequency analysis method has larger
rainfall values for the 10, 20, 50 and 100 year return period compared to the 2050s period
(Figure 22 a). The 2100s period has significantly larger rainfall values compared to the 2050s
period, the at-site method and the regional frequency analysis method for all return periods for
Durham station and Hartington station.
52
(a) (b)
Figure 21: IDF curves obtained from the future IDF method, the regional frequency analysis
method and the at-site method for Durham station (cluster 9) for the 3 hour rainfall: (a) 2050s
period and (b) 2100s time period.
5
10
15
20
25
2 5 10 20 50 100
Rai
nfa
ll In
ten
sity
(m
m/h
r)
Return Period (years)
Durham Station: Comparison of IDF methods for 3 hr rainfall for 2050s
period
At-siteMethod
RegionalMethod
2050s
5
10
15
20
25
30
2 5 10 20 50 100
Rai
nfa
ll In
ten
sity
(m
m/h
r)
Return Period (years)
Durham Station: Comparison of IDF methods for 3 hr rainfall for 2100s
period
At-siteMethod
RegionalMethod
2100s
53
(a) (b)
Figure 22: IDF curves obtained from the future IDF method, the regional frequency analysis
method and the at-site method for Durham station (cluster 9) for the 12 hour rainfall: (a) 2050s
period and (b) 2100s time period.
2
3
4
5
6
7
8
2 5 10 20 50 100
Rai
nfa
ll In
ten
sity
(m
m/h
r)
Return Period (years)
Durham Station: Comparison of IDF methods for 12 hr rainfall for 2050s
period
At-siteMethod
RegionalMethod
2050s
2
3
4
5
6
7
8
9
10
2 5 10 20 50 100
Rai
nfa
ll In
ten
sity
(m
m/h
r)
Return Period (years)
Durham Station: Comparison of IDF methods for 12 hr rainfall for 2100s
period
At-siteMethod
RegionalMethod
2100s
54
(a) (b)
Figure 23: IDF curves obtained from the future IDF method, the regional frequency analysis
method and the at-site method for Hartington station (cluster 6) for the 3 hour rainfall: (a)
2050s period and (b) 2100s time period.
5
7
9
11
13
15
17
19
21
2 5 10 20 50 100
Rai
nfa
ll In
ten
sity
(m
m/h
r)
Return Period (years)
Hartington Station: Comparison of IDF methods for 3 hr rainfall for
2050s period
At-siteMethod
RegionalMethod
2050s
5
10
15
20
25
30
35
2 5 10 20 50 100
Rai
nfa
ll In
ten
sity
(m
m/h
r)
Return Period (years)
Hartington Station: Comparison of IDF methods for 3 hr rainfall for
2100s period
At-siteMethod
RegionalMethod
2100s
55
(a) (b)
Figure 24: IDF curves obtained from the future IDF method, the regional frequency analysis
method and the at-site method for Hartington station (cluster 6) for the 12 hour rainfall: (a)
2050s period and (b) 2100s time period.
1
2
3
4
5
6
7
2 5 10 20 50 100
Rai
nfa
ll In
ten
sity
(m
m/h
r)
Return Period (years)
Hartington Station: Comparison of IDF methods for 12 hr rainfall for
2050s period
At-siteMethod
RegionalMethod
2050s
2
3
4
5
6
7
8
9
10
2 5 10 20 50 100
Rai
nfa
ll In
ten
sity
(m
m/h
r)
Return Period (years)
Hartington Station: Comparison of IDF methods for 12 hr rainfall for
2100s period
At-siteMethod
RegionalMethod
2100s
56
The rainfall values obtained using the future IDF methods for Durham station and
Hartington station showed that the 2050s time period had slightly larger rainfall values compared
to the at-site IDF curve and the regional frequency analysis method IDF curve; the exception was
for Hartington station 12 hour rainfall where the regional frequency analysis method had larger
rainfall values for the 10, 20, 50 and 100 year return period compared to the 2050s period. The
CanRCM4 and HadGEM2 climate models are based on the new IPCC emission scenarios which
suggest that extreme increases in rainfall intensities will not be observed over a short timescale.
As a result there were minor increases in rainfall values for the 2050s period. However, when
comparing the 2100s time period to the at-site method and the regional frequency analysis
method larger increases in rainfall values were observed for all return periods. Furthermore, the
2100s period had larger rainfall values for the 20, 50 and 100 year return periods when compared
to the 2050s time period. Following the RCP4.5 and RCP8.5 emission scenarios it is expected
that there will be large increases in rainfall values for the 2100s period (IPCC, 2014). (IPCC,
2014) infers that increases in extreme rainfall events will be realized over a large period of time
due to the intensive greenhouse gas emissions that are projected to accumulate by the 2100s
period. After developing IDF curves using the future IDF curve method it is expected that
rainfall intensities are expected to significantly increase in southern Ontario for the 2100s period
as similar results were observed for all stations in this region.
5.8 Variability of the Future IDF Curve Results
There are numerous climate models and emission scenarios that are available to create
future IDF curves. Since infrastructure designs are planned to account for extreme rainfall events
expected to occur in the future, it is important that climate models adequately represent projected
57
rainfall intensities for each respective region. The variability of results due to the diversity of the
climate models and the associated emission scenarios was assessed as well.
For 12 hour rainfall duration for the Durham station, the climate model and emission
scenario that produced the lowest intensity of rainfall for the 2100s period was the HadGEM2
model for the RCP4.5 emission scenario (Figure 24). The HadGEM2 model following the
RCP4.5 scenario is an intermediate scenario where emission levels are supposed to level off,
however, the rainfall intensity for this scenario are moderately larger compared to the at-site
method rainfall intensities, therefore under one of the best case lowest emission scenarios, an
increase in rainfall intensity is still expected for return periods greater than 10 years for the 2100s
period. For the Durham station the climate model and emission scenario that produced the largest
intensity of rainfall for the 2100s period was the CanRCM4 model for the RCP8.5 emission
scenario. The rainfall intensities for this scenario are moderately larger compared to the at-site
method rainfall intensities for the 2 – 10 year return periods however significant increases are
observed for the 25 – 100 year return periods, therefore under one of the worst case emission
scenarios a significant increase in rainfall intensity is expected for the 2100s period.
58
Figure 25: Comparison of IDF curves for 12 hour rainfall duration using the minimum,
maximum and mean of the future IDF method results versus the at-site method results for
Durham station. Note: Minimum = HadGEM2 model following RCP4.5; Maximum =
CanRCM4 model following RCP8.5; Mean = the average rainfall intensity of all
climate models and emission scenarios used in this study.
For 12 hour rainfall duration for the Hartington station, the climate model and emission
scenario that produced the lowest intensity of rainfall for the 2100s period was the HadGEM2
model for the RCP4.5 emission scenario (Figure 26). The rainfall intensities for this scenario are
smaller compared to the at-site method for short return periods (2 – 5 years), however the rainfall
intensities for the 10 – 100 year return periods are larger compared to the at-site method rainfall
intensities, therefore under one of the best case lowest emission scenarios an increase in rainfall
intensity is still expected for all return periods for the 2100s period. For the Durham station the
2
3
4
5
6
7
8
9
10
11
12
2 5 15 25 35 45 55 70 90
Rai
nfa
ll In
ten
sity
(m
m/h
r)
Return Period (years)
Durham station: 2100s storm scenarios for 12 hr rainfall
At-site Method
Minimum
Mean
Maximum
59
climate model and emission scenario that produced the largest intensity of rainfall for the 2100s
period was the CanRCM4 model for the RCP4.5 emission scenario. The rainfall intensities for
this scenario are larger compared to the at-site method rainfall intensities for the 2 – 15 year
return periods however significant increases are observed for the 20 – 100 year return periods,
therefore following the RCP4.5 emission scenario a significant increase in rainfall intensity is
expected for the 2100s period.
Figure 26: Comparison of IDF curves for 12 hour rainfall duration using the minimum,
maximum and mean of the future IDF method results versus the at-site method results for
Hartington station. Note: Minimum = HadGEM2 model following RCP4.5; Maximum =
CanRCM4 model following RCP4.5; Mean = the average rainfall intensity of all climate
models and emission scenarios used in this study.
2
4
6
8
10
12
14
16
2 5 15 25 35 45 55 70 90
Rai
nfa
ll In
ten
sity
(m
m/h
r)
Return Period (years)
Hartington station: 2100s storm scenarios for 12 hr rainfall
At-site Method
Minimum
Mean
Maximum
60
Assessing the rainfall outputs for each climate model is of utmost importance to
determine which climate models and emission scenarios should be utilized for research in the
future. The climate models and emission scenarios used in this study showed that different
scenarios can produce a wide range of rainfall intensities for both intermediate and extreme
emission scenarios. In most cases the RCP8.5 emission scenario triggers the largest increase in
rainfall intensities for the 2081 – 2099 period. The RCP4.5 emission scenario is associated with
minimal increases in rainfall intensities for the 2081 – 2099 period, the only exception was for
the 12 hour rainfall duration for Hartington station and in this case the RCP4.5 scenario was
associated with the maximum increase in rainfall intensity. Pinpointing appropriate climate
models and emission scenarios that will accurately represent future conditions will enable the
derivation of IDF curves to be more accurate. Until then, using an ensemble of climate models to
create IDF curves is the most effective method.
6. Discussion
Overall the three regionalization techniques used in this study produced different weather
station clusters. PCA with k-means clustering and Ward’s method created heterogeneous
clusters. However Tabreg clustering method delineated statistically homogenous clusters that
were contiguous. Tabreg clustering method incorporates a non-connectivity parameter in the cost
function to ensure that delineated regions are compact and contiguous. Meanwhile PCA with k-
means clustering and Ward’s method do not incorporate a similar parameter in their objective
function. Therefore using a regionalization technique that includes a factor which accounts for
contiguity, such as Tabreg, is required to cluster stations into statistically homogeneous regions.
61
Furthermore, the at-site method IDF curves and the regional frequency analysis method
IDF curves generated different rainfall values for each cluster. Using different probability
distribution functions for the aforementioned methods resulted in the differences observed for the
rainfall values. For different probability distribution functions the Z-score probability values are
similar for short return periods (2, 5 and 10 year return periods). However for larger return
periods (20, 50 and 100 year return periods) there are large differences in Z-score probability
values. As a result the final IDF curve values for the regional frequency analysis method are
larger than the rainfall values for the at-site method for all stations, especially for large return
periods. There are multiple cases where clusters within the study area have different regional
probability distribution functions compared to the at-site probability distribution function, as a
result differences in rainfall values were observed in other clusters as well.
The rainfall values obtained using the future IDF curve methods showed that the 2050s
time period had slightly larger rainfall values in comparison to the at-site method IDF curve and
the regional frequency analysis method IDF curve. The IPCC emission scenarios describe that
over a short period of time there will not be significant increases in rainfall intensities;
subsequently there were insignificant increases in rainfall values for the 2050s period compared
to the at-site method and regional frequency analysis method IDF curves. However, the 2100s
period produced significantly larger rainfall values for the 20, 50 and 100 year return periods
compared to the at-site method, regional frequency analysis method and the 2050s period. Under
the RCP4.5 and RCP8.5 emission scenarios increases in rainfall intensities are expected over a
large time scale due to the compilation of greenhouse gas emissions that will accumulate by the
2100s period. Consequently extreme rainfall events are expected for larger return periods during
this time period. Therefore practitioners and organizations that are interested in updating
62
intensity-duration-frequency curves in a particular region of southern Ontario can recognize that
accounting for future rainfall extremes for the 2100s period is essential for the design of
hydraulic structures.
7. Conclusion
The IDF curves derived using the at-site method and the regional frequency analysis
method for all stations produced different results. Comparing the at-site method using the GEV
distribution for all sites with the regional frequency analysis method had previously not been
completed in southern Ontario. Initial results showed that the regional frequency analysis method
produced larger rainfall values compared to the at-site method for the 50 and 100 year return
period. For most cases there were minor differences for rainfall intensities for 2 year, 5 year, 10
year and 20 year return periods.
Future IDF curves showed that for the 2050s time period there were minor increases in
rainfall intensities when compared with the at-site method and the regional frequency analysis
method whatever the region. For the 2100s time period there were larger increases in rainfall
intensities compared to the at-site method and the regional frequency analysis method, especially
for larger return periods (50 to 100 years). These results suggest that it is worthwhile for regions
within southern Ontario to update their IDF curves using the future IDF curve technique,
however further investigation to determine the most adequate climate models and emission
scenarios is required.
This study also evaluated the ability of regionalization techniques such as Principal
Component Analysis with k-means, Ward’s method and Tabreg. The regionalization results
indicate that only Tabreg was able to delineate contiguous homogeneous regions. The clusters
63
obtained using PCA with k-means and Ward’s method were heterogeneous and therefore were
not used for the regional IDF development in this study. Furthermore, most regionalization
studies include latitude and longitude as inputs for clustering, and initially these two parameters
were not included. However after including latitude and longitude as input parameters for PCA
with k-means and Ward’s method there was still little improvement in delineating contiguous
homogeneous regions. Future research will attempt to include even more weather stations to see
if homogeneous regions can be delineated using Tabreg with a larger number of stations.
Characterizing rainfall to determine future rainfall intensities in southern Ontario is of
utmost importance to mitigate flooding impacts on society. The study results are consistent with
previous studies and indicate that IDF curves should be updated in southern Ontario. Similar
analysis approach should be applied to other study areas as well to ensure that IDF curves are
created using adequate data and methods.
8. Recommendations
Based on the results found in this study the following recommendations can be made:
Utilizing both global climate models or regional climate models are necessary to capture
the future extreme rainfall trends which will enable accurate 100 year storm designs for
new infrastructure. Using the Representative Concentration Pathways (RCP) emissions
scenarios that represent intermediate to intensive greenhouse gas emissions is effective to
eliminate any redundancies that might be caused by using lower emission scenarios.
Therefore the strategies used in this study should be utilized when creating future IDF
curves for additional weather stations in the study area.
64
Further testing of probability distribution functions to ensure that we are characterizing
extreme rainfall events properly is required using additional tests such as Mann-Kendall,
Quantile-Quantile plots, Kolmogorov-Smirnov goodness of fit test and the Akaike
Information Criterion (Coulibaly et al., 2015).
Explore the possibility of using different parameter weights based on the importance of
each input parameter for Tabreg. Also try using different input parameters for Tabreg
such as extreme rainfall for 200 maximum values, for example.
Since global climate models and regional climate models are continually evolving and
being updated, using the latest version of the CRCM and HadGEM climate models for
the next study is recommended.
65
9. References
Abolverdi, J., and Khalili, D., (2010). Development of Regional Rainfall Annual Maxima for
Southwestern Iran by L-Moments. Water Resources Management, 24(11), 2501 – 2526.
Adamowski, J., Adamowski, K., Bougadis, J., (2010). Influence of Trend on Short Duration
Design Storms. Water Resources Management, 24(3), 401 – 413.
Aguado, D., Montoya, T., Borras, L., Seco, A., and Ferrer, J., (2008). Using SOM and PCA for
analysing and interpreting data from a P-removal SBR. Engineering Applications of
Artificial Intelligence, 21(6), 919–930.
Bharath, R., and Srinivas, V.V., (2015). Regionalization of extreme rainfall in India.
International Journal of Climatology, 35(6), 1142 – 1156.
Bernard, M. M., (1932). Formulas for rainfall intensities of long duration. Transactions of the
American Society of Civil Engineers, 96(1), 592 – 606.
Bernard, E., Naveau, P., Vrac, M., Mestre, O., (2013). Clustering of maxima: Spatial
dependencies among heavy rainfall in France. Journal of Climate, 26(20), 7929 – 7937.
Bozkaya, B., Erkut, E., and Laporte, G., (2003). A tabu search heuristic and adaptive memory
procedure for political districting. European Journal of Operational Research, 144(1),
12 – 26.
Brunetti, M., Maugeri, M., Monti, F., and Nanni, T., (2004). Changes in daily precipitation
frequency and distribution in Italy over the last 120 years. Journal of Geophysical
Research: Atmospheres, 109(D5), 1 – 16.
66
Coulibaly, P., Burn, D.H., Switzman, H., Henderson, J., and Fausto, E., (2015). A Comparison of
Future IDF Curves for Southern Ontario. Technical Report, Toronto Region
Conservation Authority and Essex Region Conservation Authority, 1 – 100.
Coulibaly, P., and Shi, X., 2005. Identification of the Effect of Climate Change on Future Design
Standards of Drainage Infrastructure in Ontario. Technical Report, Ministry of
Transportation of Ontario.
City of Guelph, (2007). Frequency analysis of maximum rainfall and IDF design curve update
(Technical Report). 1– 29.
Clavet-Gaumont, J., Sushama, L., Khaliq, M. N., Huziy, O., and Roy, R., (2013). Canadian RCM
projected changes to high flows for Quebec watersheds using regional frequency
analysis. International Journal of Climatology, 33(14), 2940 – 2955.
Conservation Halton, (2015). August 4th, 2014 Storm Event, Burlington. 1 – 35.
Davies, D. L., and Bouldin, D. W. (1979). A cluster separation measure. IEEE transactions on
pattern analysis and machine intelligence, (2), 224-227.
Dominguez, F., Rivera, E., Lettenmaier, D.,P., and Castro, C. L., (2012). Changes in winter
precipitation extremes for the western United States under a warmer climate as simulated
by regional climate models. Geophysical Research Letters, 39(5), 1 – 7.
Environment and Climate Change Canada, (2014). Canada’s Top 10 Weather Stories for 2013 –
2. Toronto’s Torrent. Climate and Historical Weather, 3.
67
Feng, L., Zhou, T., Wu, B., Li, T., and Luo, J.J., (2011). Projection of future precipitation change
over China with a high-resolution global atmospheric model. Advances in Atmospheric
Sciences, 28(2), 464 – 476.
Fragoso, M., and Tildes Gomes, P., (2008). Classification of daily abundant rainfall patterns an
associated large‐scale atmospheric circulation types in southern Portugal. International
Journal of Climatology, 28(4), 537 – 544.
Hennessy, K. J., Gregory, J. M., and Mitchell, J. F. B., (1997). Changes in daily precipitation
under enhanced greenhouse conditions. Climate Dynamics, 13, 667 – 680.
Hershfield, D.M., (1961). Rainfall Frequency Atlas of the United States for Durations from 30
minutes to 24 hours and Return Periods from 1 to 100 Years. U.S. Weather Bureau,
Technical Paper 40.
Hosking, J.R.M., and Wallis, J.R., (1997). Regional Frequency Analysis An Approach Based on
L-moments. Cambridge University Press.
Huff, F.A., and Angel, J.R., (1989). Frequency Distribution and Hydraulic Climatic
Characteristics of Heavy Rainstorms in Illinois, Bulletin 70, Illinois State Water Survey,
Champaign, Illinois.
Huff, F.A., and Angel, J.R., (1992). Rainfall Frequency Atlas of the Midwest, Bulletin 71,
Midwestern Climate Center and Illinois State Water Service, 1992, 5 – 6.
Hydrological Atlas of Canada, (1978). Fisheries and Environment Canada.
Intergovernmental Panel on Climate Change, (2014). Climate Change 2014: Synthesis Report.
Contribution of Working Groups I, II and III to the Fifth Assessment Report of the
68
Intergovernmental Panel on Climate Change [Core Writing Team, R.K. Pachauri and
L.A. Meyer (eds.)]. IPCC, Geneva, Switzerland, 151.
Jingyi, Z., and Hall, M.J., (2004). Regional flood frequency analysis for the Gan-Ming River
basin in China. Journal of Hydrology, 296, 98 – 117.
Kahya, E., Demirel, M.C., Bég, O.A., (2008). Hydrologic homogeneous regions using monthly
streamflow in Turkey. Earth Sciences Research Journal, 12(2), 181 – 193.
Kansakar, S. R., Hannah, D. M., Gerrard, J., Rees, G., (2004). Spatial pattern in the precipitation
regime of Nepal. International Journal of Climatology, 24(13), 1645 – 1659.
Kyselý, J., Picek, J., and Huth, R., (2007). Formation of homogeneous regions for regional
frequency analysis of extreme precipitation events in the Czech Republic. Studia
Geophysica et Geodaetica, 51(2), 327 – 344.
Lemmen, D.S., Warren, F.J., Lacroix, J., and Bush, E., editors (2008). From Impacts to
Adaptation: Canada in a Changing Climate 2007. Government of Canada, Ottawa,
Ontario.
Lim, Y.H., and Voeller, D.L., (2009). Regional flood estimations in Red River using L-moment
based index-flood and Bulletin 17B procedures. Journal of Hydrologic Engineering,
14(9), 1002 – 1016.
Lin, G.F., and Chen, L.H., (2006). Identification of homogeneous regions for regional frequency
analysis using the self-organizing map. Journal of Hydrology, 324(1), 1 – 9.
69
Mailhot, A., Duchesne, S., Caya, D., and Talbot, G., (2007). Assessment of future change in
intensity-duration-frequency (IDF) curves for Southern Quebec using the Canadian
Regional Climate Model (CRCM). Journal of Hydrology, 347(1), 197 – 210.
Mailhot, A., Beauregard, I., Talbot, G., Caya, D., and Biner, S., (2012). Future changes in
intense precipitation over Canada assessed from multi-model NARCCAP ensemble
simulations. International Journal of Climatology, 32(8), 1151 – 1163.
Malekinezhad, H., and Zare-Garizi, A., (2014). Regional frequency analysis of daily rainfall
extremes using L-moments approach. Atmósfera, 27(4), 411 – 427.
Mallants, D., and Feyen, J., (1990). Defining homogeneous precipitation regions by means of
principal componenet analysis. Journal of Applied Meterology, 29, 892 – 901.
Maraun, D., Osborn, T.J., and Gilleet, N.P., (2008). United Kingdom daily precipitation
intensity: improved early data, error estimates and an update from 2000 to 2006.
International Journal of Climatology, 28(6), 833 – 842.
Marengo, J.A., Ambrizzi, T., da Rocha, R.P., Alves, L.M., Cuadra, S.V., Valverde, M.C., Torres,
R.R., Santos, D.C., Ferraz, S.E.T, (2010). Future change of climate in South America in
the late twenty-first century: Intercomparison of scenarios from three regional climate
models. Climate Dynamics, 35(6), 1089 – 1113.
Mladjic, B., Sushama, L., Khaliq, M.N., Laprise, R., Caya, D., and Roy, R., (2011). Canadian
RCM projected changes to extreme precipitation characteristics over Canada. Journal of
Climate, 24(10), 2565 – 2584.
70
Modarres, R., and Sarhadi, A., (2011). Statistically-based regionalization of rainfall climates of
Iran. Global and Planetary Change, 75(1), 67 – 67.
Muñoz-Diaz, D., and Rodrigo, F.S., (2004). Spatio-temporal patterns of seasonal rainfall in
Spain (1912 – 2000) using cluster and principal component analysis : comparison.
Annales Geophysicae, 22, 1435 – 1448.
Nam, W., Shin, H., Jung, Y., Joo, K., and Heo, J.H., (2015). Delineation of the climatic rainfall
regions of South Korea based on a multivariate analysis and regional rainfall frequency
analyses. International Journal of Climatology, 35(5), 777 – 793.
Ngongondo, C.S., Xu, C.Y., Tallaksen, L.M., Alemaw, B., and Chirwa, T., (2011). Regional
frequency analysis of rainfall extremes in Southern Malawi using the index rainfall and
L-moments approaches. Stochastic Environmental Research and Risk Assessment, 25(7),
939 – 955.
Noto, L., and La Loggia, G., (2009). Use of L-moments approach for regional flood frequency
analysis in Sicily, Italy. Water Resources Management, 23(11), 2207 – 2229.
Olsson, J., Berggren, K., Olofsson, M., and Viklander, M., (2009). Applying climate model
precipitation scenarios for urban hydrological assessment: A case study in Kalmar City,
Sweden. Atmospheric Research, 92(3), 364 – 375.
Paixao, E., Auld, H., Mirza, M.M.Q., Klaassen, J., and Shephard, M.W., (2011). Regionalization
of heavy rainfall to improve climatic design values for infrastructure: case study in
Southern Ontario, Canada. Hydrological Sciences Journal, 56(7), 1067 – 1089.
71
Parida, B.P., and Moalafhi, D.B., (2008). Regional rainfall frequency analysis for Botswana
using L-Moments and radial basis function network. Physics and Chemistry of the Earth,
33, 614 – 620.
Razavi, T., and Coulibaly, P., (2013). Classification of Ontario watersheds based on physical
attributes and streamflow series. Journal of Hydrology, 493, 81 – 94.
Saf, B., (2009). Regional Flood Frequency Analysis Using L Moments for the Buyuk and Kucuk
Menderes River Basins of Turkey. Journal of Hydrologic Engineering, 14(8), 783 – 794.
Schaefer, M.G., Barker, B.L., Taylor, G.H., and Wallis, J.D., (2006). Regional precipitation
frequency analysis and spatial mapping of precipitation for 24-hour and 2-hour durations
in eastern Washington. Washington State Department of Transportation, 1 – 40.
Shephard, M.W., (2011). Updating IDF Climate Design Values for the Atlantic Provinces.
Ministry of Environment, Labour and Justice, Government of Prince Edward Island.
Smithers, J.C., and Schulze R.E., (2001). A methodology for the estimation of short duration
design storms in South Africa using a regional approach based on L-moments. Journal of
Hydrology, 241(1), 42 – 52.
Stathis, D., and Myronidis, D., (2009). Principal Component Analysis of Precipitation in
Thessaly Region (Central Greece). Global NEST Journal, 11(4), 467 – 476.
SUDPLAN 2012: Sustainable Urban Development Planner for Climate Change Adaptation.
Project 247708, Final Report Available online at:
http://sudplan.eu/polopoly_fs/1.30418!/SUDPLAN_final.pdf
72
Trefry, C.M., Watkins Jr., D.W., and Johnson, D., (2005). Regional Rainfall Frequency Analysis
for the State of Michigan. Journal of Hydrologic Engineering, 10(6), 437 – 449.
Unal, Y., Kindap, T., and Karaca, M., (2003). Redefining the climate zones of Turkey using
cluster analysis. International Journal of Climatology, 23(9), 1045 – 1055.
Urrutia, R., and Vuille, M., (2009). Climate change projections for the tropical Andes using a
regional climate model: Temperature and precipitation simulations for the end of the 21st
century. Journal of Geophysical Research Atmospheres, 114(2), 1 – 15.
Willems, P., Arnbjerg-Nielsen, K., Olsson, J., and Nguyen, V.T.V., (2012). Climate change
impact assessment on urban rainfall extremes and urban drainage: methods and
shortcomings. Atmospheric Research, 103, 106 – 118.
Xuejie, G., Zongci, Z., Yihui, D., Ronghui, H., and Giorgi, F., (2001). Climate change due to
greenhouse effects in China as simulated by a regional climate model. Advances in
Atmospheric Sciences, 18(6), 1224 – 1230.
Yiannakoulias, N., Rosychuk, R.J., and Hodgson, J., (2007). Adaptations for finding irregularly
shaped disease clusters. International Journal of Health Geographics, 6.
Yiannakoulias, N., and Bland, W., (2016). Tabreg: Software for Regionalization
Problems. Health Geomatics Lab, School of Geography and Earth Sciences, McMaster
University.
73
Appendix A
Station Name Latitude Longitude Elevation (meters)
Belleville 44.15 -77.39 76.2
Bloomfield 43.98 -77.22 91.4
Bowmanville MOSTERT 43.92 -78.67 99.1
Brantford MOE 43.13 -80.23 196.0
Burketon Mclaughlin 44.03 -78.80 312.4
Chatham 42.39 -82.22 180.0
Chatsworth 44.40 -80.91 305.0
Cressy 44.10 -76.85 83.8
Durham 44.18 -80.82 384.0
Fergus Shand Dam 43.73 -80.33 417.6
Foldens 43.02 -80.78 328.0
Fort Erie 42.88 -78.97 179.8
Frechmans Bay 43.82 -79.08 76.2
Georgetown WWTP 43.64 -79.88 221.0
Glen Allan 43.68 -80.71 400.0
Glen Haffy Mono Mills 43.93 -79.95 434.3
Hagersville 42.97 -80.07 221.0
Hamilton Airport 43.17 -79.93 237.7
Hartington IHD 44.43 -76.69 160.0
Kingston Pumping Station 44.24 -76.48 76.5
Kingsville MOE 42.04 -82.67 200.0
London International Airport 43.03 -81.15 278.0
Millgrove 43.32 -79.97 255.1
New Glasgow 42.51 -81.64 198.1
Oakville Southeast WPCP 43.48 -79.63 86.9
Orangville MOE 43.92 -80.09 411.5
Oshawa WPCP 43.87 -78.83 83.8
74
Owen Sound MOE 44.58 -80.93 178.9
Petrolia Town 42.88 -82.17 201.2
Port Colborne 42.88 -79.25 175.3
Proton Station 44.17 -80.52 480.1
Richmond Hill 43.88 -79.45 240.0
Ridgeville 43.04 -79.33 236.2
Sarnia Airport 42.99 -82.30 180.6
St. Catharines Power Glen 43.12 -79.25 121.9
Stratford WWTP 43.37 -81.00 345.0
Thornbury Slama 44.57 -80.49 213.4
Thornhill Grandview 43.80 -79.42 199.3
Tillsonburg WWTP 42.86 -80.72 213.4
Toronto Island Airport 43.63 -79.40 76.5
Toronto Pearson Airport 43.68 -79.63 173.4
Trenton Airport 44.12 -77.53 86.3
Vineland Rittenhouse 43.17 -79.42 94.5
Waterloo Wellington 43.45 -80.38 317.0
Welland 42.99 -79.26 175.3
Wiarton Airport 44.75 -81.11 222.2
Windsor Airport 42.28 -82.96 189.6
Woodstock 43.14 -80.77 281.9
Table A1: Geographic characteristics of each weather station.
75
Appendix B
Environment Canada Flagged Data
Station Name T C A E F "blank" cells M
Belleville 1750 2 2 92 0 0 0
Bloomfield 493 50 43 158 5 31 8
Bowmanville MOSTERT 668 2 2 28 0 62 1
Brantford MOE 579 17 8 98 6 92 24
Burketon Mclaughlin 494 7 2 86 6 59 18
Chatham 382 24 26 32 0 62 0
Chatsworth 901 11 10 38 1 0 0
Cressy 814 7 7 29 0 424 8
Durham 1455 99 86 153 1 152 43
Fergus Shand Dam 751 1 1 15 0 91 0
Foldens 1241 6 4 60 0 0 0
Fort Erie 1100 19 20 75 0 155 41
Frechmans Bay 1191 17 16 5 0 120 0
Georgetown WWTP 867 17 16 36 2 243 9
Glen Allan 957 1 0 13 1 62 0
Glen Haffy Mono Mills 458 31 19 137 9 242 12
Hagersville 1039 8 8 28 0 0 0
Hamilton Airport 1686 0 0 0 0 60 0
Hartington IHD 944 10 8 33 1 31 1
Kingston Pumping Station 1186 1 1 4 0 31 1
Kingsville MOE 1210 29 30 65 1 31 6
London International
Airport 1751 0 0 3 0 20 7
Millgrove 491 18 17 152 4 153 96
New Glasgow 123 13 12 37 0 61 1
Oakville Southeast WPCP 418 33 18 183 10 304 50
Orangville MOE 647 10 7 73 3 62 10
Oshawa WPCP 656 2 2 32 0 62 0
Owen Sound MOE 656 10 5 95 4 30 1
76
Petrolia Town 354 12 11 41 0 31 31
Port Colborne 534 8 7 35 2 181 1
Proton Station 644 1 1 16 0 0 0
Richmond Hill 1314 0 0 4 0 61 0
Ridgeville 762 41 29 154 7 124 23
Sarnia Airport 1397 0 0 4 0 0 6
St. Catharines Power Glen 735 65 45 221 11 241 86
Stratford WWTP 1372 1 1 14 0 31 0
Thornbury Slama 712 12 11 48 0 0 0
Thornhill Grandview 911 5 4 24 1 31 0
Tillsonburg WWTP 772 35 27 83 4 28 12
Toronto Island Airport 1439 0 0 12 0 22 163
Toronto Pearson Airport 2319 0 0 0 0 60 0
Trenton Airport 1973 0 0 0 0 55 0
Vineland Rittenhouse 666 6 5 16 0 0 0
Waterloo Wellington
Airport 1388 0 0 0 0 60 0
Welland 945 2 2 30 0 242 8
Wiarton Airport 1319 0 0 0 0 178 3
Windsor Airport 1872 0 0 0 0 67 0
Woodstock 1049 8 6 84 2 59 11
Table B1: List of flagged data found for each weather station. Note: The definition for each
flagged value is provided in the table below
77
Flag Definition Replaced Value
A Accumulated amount; previous value was C. 0 mm
C Precipitation occurred, amount uncertain. 0 mm
E Estimated. Flag was removed; daily precipitation
value was provided by EC.
F Accumulated and estimated. Flag was removed; daily precipitation
value was provided by EC.
M Missing. The linear regression value from nearest
neighbouring station was used.
T Trace. 0.2 mm
“blank” cells Missing. The linear regression value from nearest
neighbouring station was used.
Table B2: Definition of each flag data that was found in the daily precipitation data. The method
that was used to replace each of the flag data is indicated in the table. Note: EC is an acronym for
Environment Canada
78
Appendix C
Climate Model Spatial Resolution Scenario Temporal
Resolution Simulation Period
CanRCM4-
CanEMS2 40 km
Historical 1 hour 1970 – 2000
RCP 4.5 1 hour 2006 – 2100
RCP 8.5 1 hour 2006 – 2100
HadGEM2-ES
1.25 × 1.875
degrees
(approximately
120 km x 139 km)
Historical 3 hour 1970 – 2000
RCP 4.5 3 hour 2026 – 2045;
2081 – 2099
RCP 8.5 3 hour 2026 – 2045;
2081 – 2099
Table C1: Data and resolution of the CanRCM4-CanEMS2 and HadGEM2-ES climate models
used.
79
Appendix D
Site 15 Glen Allan: At-site Method IDF Curve Results
Return Period 1 hour 2 hours 3 hours 6 hours 12 hours 24 hours
2 years 24.5 15.3 11.6 6.6 3.7 2.3
5 years 31.9 19.9 15.1 8.6 4.8 3.0
10 years 37.0 23.0 17.6 10.0 5.6 3.4
20 years 42.1 26.2 20.0 11.4 6.4 3.9
50 years 49.0 30.5 23.2 13.3 7.4 4.5
100 years 54.4 33.9 25.8 14.7 8.3 5.0
Table D1: IDF values for Glen Allan station using the at-site method.
Site 15 Glen Allan: RFA Method IDF Curve Results
Return Period 1 hour 2 hours 3 hours 6 hours 12 hours 24 hours
2 years 24.3 15.1 11.5 6.6 3.7 2.2
5 years 31.6 19.7 15.0 8.5 4.8 2.9
10 years 37.2 23.2 17.7 10.1 5.7 3.4
20 years 43.5 27.1 20.6 11.8 6.6 4.0
50 years 53.3 33.1 25.2 14.4 8.1 4.9
100 years 62.1 38.6 29.4 16.8 9.4 5.8
Table D2: IDF values for Glen Allan station using the regional frequency analysis method.
80
Site 39 Tillsonburg: At-site Method IDF Curve Results
Return Period 1 hour 2 hours 3 hours 6 hours 12 hours 24 hours
2 years 24.8 15.5 11.8 6.7 3.8 2.3
5 years 31.0 19.3 14.7 8.4 4.7 2.9
10 years 35.2 21.9 16.7 9.5 5.3 3.3
20 years 39.4 24.5 18.7 10.7 6.0 3.6
50 years 45.1 28.0 21.4 12.2 6.8 4.2
100 years 49.4 30.8 23.4 13.4 7.5 4.6
Table D3: IDF values for Tillsonburg station using the at-site method.
Site 39 Tillsonburg: RFA Method IDF Curve Results
Return Period 1 hour 2 hours 3 hours 6 hours 12 hours 24 hours
2 years 24.2 15.1 11.5 6.6 3.7 2.2
5 years 31.6 19.6 15.0 8.5 4.8 2.9
10 years 37.2 23.1 17.6 10.1 5.6 3.4
20 years 43.4 27.0 20.6 11.7 6.6 4.0
50 years 53.2 33.1 25.2 14.4 8.1 4.9
100 years 62.0 38.6 29.4 16.8 9.4 5.7
Table D4: IDF values for Tillsonburg station using the regional frequency analysis method.
81
Site 43 Vineland Rittenhouse: At-site Method IDF Curve Results
Return Period 1 hour 2 hours 3 hours 6 hours 12 hours 24 hours
2 years 23.0 14.3 10.9 6.2 3.5 2.1
5 years 28.5 17.7 13.5 7.7 4.3 2.6
10 years 31.7 19.8 15.1 8.6 4.8 2.9
20 years 34.7 21.6 16.5 9.4 5.3 3.2
50 years 38.3 23.8 18.1 10.3 5.8 3.5
100 years 40.7 25.3 19.3 11.0 6.2 3.8
Table D5: IDF values for Vineland Rittenhouse station using the at-site method.
Site 43 Vineland Rittenhouse: RFA Method IDF Curve Results
Return Period 1 hour 2 hours 3 hours 6 hours 12 hours 24 hours
2 years 22.2 13.8 10.5 6.0 3.4 2.1
5 years 28.7 17.9 13.6 7.8 4.4 2.7
10 years 33.2 20.7 15.8 9.0 5.0 3.1
20 years 37.9 23.6 18.0 10.2 5.8 3.5
50 years 44.2 27.5 21.0 12.0 6.7 4.1
100 years 49.2 30.6 23.3 13.3 7.5 4.6
Table D6: IDF values for Vineland Rittenhouse station using the regional frequency analysis
method.
82
Site 17 Hagersville: At-site Method IDF Curve Results
Return Period 1 hour 2 hours 3 hours 6 hours 12 hours 24 hours
2 years 25.2 15.7 12.0 6.8 3.8 2.3
5 years 32.2 20.1 15.3 8.7 4.9 3.0
10 years 37.0 23.0 17.6 10.0 5.6 3.4
20 years 41.8 26.0 19.8 11.3 6.3 3.9
50 years 48.2 30.0 22.8 13.0 7.3 4.5
100 years 53.1 33.0 25.2 14.4 8.1 4.9
Table D7: IDF values for Hagersville station using the at-site method.
Site 17 Hagersville: RFA Method IDF Curve Results
Return Period 1 hour 2 hours 3 hours 6 hours 12 hours 24 hours
2 years 24.7 15.4 11.7 6.7 3.8 2.3
5 years 32.2 20.0 15.3 8.7 4.9 3.0
10 years 37.7 23.5 17.9 10.2 5.7 3.5
20 years 43.5 27.1 20.6 11.8 6.6 4.0
50 years 51.8 32.2 24.5 14.0 7.9 4.8
100 years 58.6 36.5 27.8 15.8 8.9 5.4
Table D8: IDF values for Hagersville station using the regional frequency analysis method.
83
Appendix E
Durham 3 hour rainfall IDF Curve Comparison
Return Period At-site Method RFA Method 2050s Period 2100s period
2 years 11.5 11.5 12.7 14.8
5 years 14.3 14.6 16.3 18.5
10 years 16.2 16.6 18.3 20.8
20 years 18.0 18.5 20.1 22.9
50 years 20.3 20.9 22.2 25.6
100 years 22.0 22.7 23.5 27.7
Table E1: IDF values for Durham station (3 hour rainfall) using the at-site method, regional
frequency analysis method and future IDF curve development method.
Durham 12 hour rainfall IDF Curve Comparison
Return Period At-site Method RFA Method 2050s Period 2100s period
2 years 3.7 3.7 4.0 4.5
5 years 4.6 4.7 5.1 5.8
10 years 5.2 5.3 5.6 6.7
20 years 5.8 5.9 6.1 7.5
50 years 6.5 6.7 6.8 8.5
100 years 7.1 7.3 7.3 9.3
Table E2: IDF values for Durham station (12 hour rainfall) using the at-site method, regional
frequency analysis method and future IDF curve development method.
84
Hartington 3 hour rainfall IDF Curve Comparison
Return Period At-site Method RFA Method 2050s Period 2100s period
2 years 10.3 9.9 11.6 12.2
5 years 12.5 13.2 14.5 16.4
10 years 13.9 15.1 16.0 19.4
20 years 15.2 16.6 17.4 22.5
50 years 16.7 18.0 19.0 26.9
100 years 17.8 18.9 20.1 30.5
Table E3: IDF values for Hartington station (3 hour rainfall) using the at-site method, regional
frequency analysis method and future IDF curve development method.
Hartington 12 hour rainfall IDF Curve Comparison
Return Period At-site Method RFA Method 2050s Period 2100s period
2 years 3.3 3.2 3.6 3.7
5 years 4.0 4.2 4.3 4.6
10 years 4.5 4.8 4.8 5.4
20 years 4.9 5.3 5.1 6.2
50 years 5.4 5.8 5.6 7.7
100 years 5.7 6.0 5.9 9.0
Table E4: IDF values for Hartington station (12 hour rainfall) using the at-site method, regional
frequency analysis method and future IDF curve development method.