DEVELOPMENT OF INTENSITY-DURATION-FREQUENCY CURVES USING ...

100
DEVELOPMENT OF INTENSITY-DURATION-FREQUENCY CURVES USING LOCAL AND REGIONAL SCALE METHODS

Transcript of DEVELOPMENT OF INTENSITY-DURATION-FREQUENCY CURVES USING ...

1

DEVELOPMENT OF INTENSITY-DURATION-FREQUENCY CURVES USING LOCAL

AND REGIONAL SCALE METHODS

2

DEVELOPMENT OF INTENSITY-DURATION-FREQUENCY CURVES USING LOCAL

AND REGIONAL SCALE METHODS

By: Marc D’Alessandro, B.Sc

A Thesis Submitted to the School of Graduate Studies in Partial Fulfillment of the Requirement

for the Degree of Master of Science

McMaster University © Copyright by Marc D’Alessandro, September 2016

ii

MASTER OF SCIENCE (2016) McMaster University

(Geography and Earth Science) Hamilton, Ontario

TITLE: Development of Intensity-Duration-Frequency Curves Using Local

and Regional Scale Methods

AUTHOR: Marc D’Alessandro, B.Sc (McMaster University)

SUPERVISOR: Dr. Paulin Coulibaly

NUMBER OF PAGES: xv, 84

iii

Abstract

Traditionally civil infrastructure designs were rendered using rainfall data from dated

historical records. However, recent studies have shown that the magnitude and intensity of

historical precipitation events do not exhibit the extreme nature of precipitation events that are

projected to occur in the future. Increasing extreme rainfall trends have already been documented

in Canada. Therefore there are growing concerns that the aging infrastructure in southern Ontario

will be unable to function effectively and as a result the frequency of floods is expected to

increase. Updating intensity-duration-frequency (IDF) curves to account for extreme

precipitation events is vital to ensure that the consequences of floods are mitigated. This study

first reviewed the most robust techniques for updating IDF curves, and applied a select set of

approaches to create IDF curves for stations within southern Ontario.

Three robust techniques – the at-site method, the regional frequency analysis method, and

a future IDF curve development technique – were compared with one another to determine

which technique was most suitable for updating IDF curves in southern Ontario. Results showed

that the difference between the at-site method and the regional frequency analysis method was

marginal for short return periods, however for larger return periods larger differences were

observed. Future IDF statistic results showed that for the 2050s there were minor differences in

the increases in rainfall intensities when comparing with the at-site and the regional frequency

analysis method. For the 2100s there were larger increases in rainfall intensities compared to the

at-site and the regional frequency analysis method, especially for larger return periods. These

results suggest that it is worthwhile for regions within southern Ontario to update their IDF

curves using the future IDF curve technique, however it is recommended that additional climate

iv

models, emission scenarios and downscaling techniques involved in future IDF curve

construction are explored.

v

Acknowledgements

I would like to thank my supervisor, Dr. Coulibaly, for all of his guidance and support

throughout my graduate studies. Dr. Coulibaly’s encouragement enabled me to reach all of my

goals and for that I am very appreciative. I would also like to thank my mom and dad, my sister,

my grandparents and all of my family and friends for their support. I would also like to thank

Serge Traore, Dr. Tara Razavi, Dr. Niko Yiannakoulias, Dr. Hussein Wazneh and everybody

from McMaster’s Hydrologic Modeling and Water Resources Lab for all of their help and

guidance.

vi

Table of Contents

Abstract ……………………………………………………………………………………… iii

Acknowledgements ………………………………………………………………………….. v

Table of Contents ……………………………………………………………………………. vi

List of Figures ……………………………………………………………………………….. ix

List of Tables ………………………………………………………………………………… xii

List of Abbreviations ………………………………………………………………………... xiv

List of Symbols ………………………………………………………………………………. xv

1. Introduction

1.1 Background and Rationale………………………………………………………..... 1

1.2 Study Objectives …………………………………………………………………... 2

1.3 Organization of Thesis ……………………………………………………………...3

2. Literature Review

2.1 The At-Site Intensity-Duration-Frequency (IDF) Curve Method …………...…….. 4

2.2 The Regional Frequency Analysis Method ………………………………………... 4

2.2.1 Regionalization Techniques ……………………………………………… 5

2.2.2 Principal Component Analysis and K-Means Clustering …………..……. 5

2.2.3 Ward’s Method …………………………………………………………… 6

2.2.4 Tabreg ……………………………………………………………………. 7

2.3 Future IDF Curve Development Method ……………..………………………..…... 8

3. Study Area and Data

3.1 Study Area ………………………………………………………………………….. 9

3.2 Observed Data ……………………………………………………………………… 10

3.3 Climate Model Data ………………………………………………………………... 11

4. Methodology

vii

4.1 Overview of Methods ……………………………………………………………… 12

4.2 Data Processing

4.2.1 Converting Daily Duration Data to 24-hour Duration Data ……………. 14

4.2.2 Replacing Flagged Data …………………………………………………. 14

4.2.3 Converting 24-hour Duration Data to Hourly and Sub-Hourly Duration

Data ...…………………………………………………………………….. 15

4.2.4 Annual Maximum Series Derivation ……………………………………... 16

4.3 At-Site Method IDF Curve Development ……………………………………….…. 17

4.4 Regional Frequency Analysis Method IDF Curve Development ………………….. 17

4.4.1 Regionalization Inputs …………………………………………………… 18

4.4.2 Principal Component Analysis with K-means Clustering ……………….. 18

4.4.3 Ward’s Method ………………………………………………………….... 20

4.4.4 Tabreg Method ………………………………………………………….…20

4.4.5 L-moments Approach …………………………………………………….. 23

4.4.6 L-moment Ratio Diagrams ……………………………………………….. 24

4.4.7 Goodness-of-Fit Measure …………………………………………………24

4.5 Future IDF Curve Methodology

4.5.1 Selection of Future Time Periods ………………………………………... 25

4.5.2 Climate Model Data Processing …………………………………………. 26

4.5.3 Downscaling Techniques ………………………………………………… 26

4.5.4. Future IDF Curve Derivation …………………………………………… 27

5. Results

5.1 Principal Component Analysis and K-means Clustering Regionalization Results

……………………………………………………………………………………… 27

5.2 Ward’s Method Regionalization Results …………………………………………... 31

viii

5.3 Tabreg Regionalization Results ……………………………………………………. 33

5.4 L-moment Ratio Diagrams……..………………………………………….……….. 34

5.5 Goodness-of-Fit Measure ………………...………………………………………… 37

5.6 IDF Curve Results: At-Site Method Compared to Regional Frequency Analysis

Method ………………………………………………………………………..……. 39

5.7 Future IDF Curve Results …..……………………………………………………… 50

5.8 Variability of the Future IDF Curve Results ………………..……………………… 56

6. Discussion ………………………………………………………………………………...… 60

7. Conclusion ……………………………………………………………………………..…… 62

8. Recommendations ………………………………………………………………………….. 63

9. References …………………………………………………………………………………... 65

Appendix A ……………………………………………………………………………………. 73

Appendix B ……………………………………………………………………………………. 75

Appendix C ……………………………………………………………………………………. 78

Appendix D ……………………………………………………………………………………. 79

Appendix E ……………………………………………………………………………………. 83

ix

List of Figures

Figure 1: Map of southern Ontario showing the 48 weather stations used in the study……...….10

Figure 2: An overview of the methodology used in this study…………………………………..13

Figure 3: Topology created for the selected weather stations………………………………..…..21

Figure 4: Graphical plot of the cost function after 500 iterations using Tabreg…..……………..23

Figure 5: Davies-Bouldin index applied to the input dataset…………………………….…..…. 28

Figure 6: Percentage of total variability explained by each principal component……………….29

Figure 7: PCA loading plots for the first and second principal components…………….………30

Figure 8: Map showing the regions that were delineated using Principal Component Analysis

with k-means……….…………………………………………………………………..31

Figure 9: Map showing the regions that were delineated using Ward’s method…………...........32

Figure 10: Map showing the regions that were delineated using Tabreg clustering method……34

Figure 11: L-moment ratio diagram for cluster 1………………………………………………..35

Figure 12: L-moment ratio diagram for cluster 2………………………………………………..36

Figure 13: Plots of IDF statistics from the regional frequency analysis method and the at-site

method for Glen Allan station: (a) 3 hour rainfall and (b) 2 hour rainfall ...................41

x

Figure 14: Plots of IDF statistics from the regional frequency analysis method and the at-site

method for Tillsonburg station: (a) 3 hour rainfall and (b)12 hour rainfall…....…….42

Figure 15: Difference (in %) of rainfall intensity between the at-site method and the regional

frequency analysis method for stations in cluster 1………………………..……...…43

Figure 16: Plots of IDF statistics from the regional frequency analysis method and the at-site

method for Vineland Rittenhouse station (cluster 3): (a) 3 hour rainfall and (b) 12

hour rainfall………………………………………………………………..................45

Figure 17: Plots of IDF statistics from the regional frequency analysis method and the at-site

method for Hagersville station (cluster 2): (a) 3 hour rainfall and (b) 12 hour

rainfall………………………………………………………………………………..46

Figure 18: Difference (in %) of rainfall intensity between the at-site method and the regional

frequency analysis method for Hagersville (cluster 2) and Vineland Rittenhouse

(cluster 3)…………………………………………………………………………….47

Figure 19: Difference (in %) of rainfall intensity between the at-site method and the regional

frequency analysis method for weather stations in all clusters for short return

periods………………………………………………………………………………..49

Figure 20: Difference (in %) of rainfall intensity between the at-site method and the regional

frequency analysis method for weather stations in all clusters for large return

periods………………………………………………………………………………...50

xi

Figure 21: IDF curves obtained from the future IDF method, the regional frequency analysis

method and the at-site method for Durham station (3 hour period)………………....52

Figure 22: IDF curves obtained from the future IDF method, the regional frequency analysis

method and the at-site method for Durham station (12 hour period)….………….…53

Figure 23: IDF curves obtained from the future IDF method, the regional frequency analysis

method and the at-site method for Hartington station (3 hour period)………………54

Figure 24: IDF curves obtained from the future IDF method, the regional frequency analysis

method and the at-site method for Hartington station (12 hour period)….……….…55

Figure 25: Comparison of IDF curves for 12 hour rainfall duration using the minimum,

maximum and mean of the future IDF method results versus the at-site method results

for Durham station..……………………………………………...…………………...58

Figure 26: Comparison of IDF curves for 12 hour rainfall duration using the minimum,

maximum and mean of the future IDF method results versus the at-site method results

for Hartington station. ……………………………………………………………….59

xii

List of Tables

Table 1: Coefficients of the Ratio formula for converting 24h rainfall to different hourly and sub-

hourly durations of rainfall for Ontario (Adapted after Hershfield (1961); Huff and

Angel (1989) by Coulibaly and Shi 2005)……………………………………………...16

Table 2: Goodness-of-fit measure results for each station located in cluster 1 …………………37

Table 3: Goodness-of-fit measure results for each station located in cluster 2………………….38

Table 4: L-moment ratio diagram results for each cluster…………………………………….…38

Table A1: Geographic characteristics of each weather station…………………………………..73

Table B1: List of flagged data found for each weather station. Note: The definition for each

flagged value is provided in the table below…………….…………………………...75

Table B2: Definition of each flag data that was found in the daily precipitation data. The method

that was used to replace each of the flag data is indicated in the table. Note: EC is an

acronym for Environment Canada……………………………………………………77

Table C1: Data and resolution of the CanRCM4-CanEMS2 and HadGEM2-ES climate models

used…………………………………………………………………………………...78

Table D1: IDF values for Glen Allan station using the at-site method……………………….....79

Table D2: IDF values for Glen Allan station using the regional frequency analysis method…...79

Table D3: IDF values for Tillsonburg station using the at-site method…………………………80

Table D4: IDF values for Tillsonburg station using the regional frequency analysis method…..80

xiii

Table D5: IDF values for Vineland Rittenhouse station using the at-site method…………........81

Table D6: IDF values for Vineland Rittenhouse station using the regional frequency analysis

method ………………………………………………………………………………..81

Table D7: IDF values for Hagersville station using the at-site method………………………….82

Table D8: IDF values for Hagersville station using the regional frequency analysis……….......82

Table E1: IDF values for Durham station (3 hour rainfall) using the at-site method, regional

frequency analysis method and future IDF curve development method……………..83

Table E2: IDF values for Durham station (12 hour rainfall) using the at-site method, regional

frequency analysis method and future IDF curve development method…………..….83

Table E3: IDF values for Hartington station (3 hour rainfall) using the at-site method, regional

frequency analysis method and future IDF curve development method………….….84

Table E4: IDF values for Hartington station (12 hour rainfall) using the at-site method, regional

frequency analysis method and future IDF curve development method……………..84

xiv

List of Abbreviations

AMS Annual Maximum Series

CanRCM4-CanESM2

The fourth generation of the Canadian regional climate model driven

by the second generation of the Canadian earth system model.

GCM Global Climate Model

GEV Generalized Extreme Value

GHG Greenhouse gas

GLO Generalized Logistic

GNO Generalized Normal

GPD Generalized Pareto

HadGEM2-ES

The Hadley Global Environment Model 2 coupled with the earth

system model.

IDF Intensity-duration-frequency

IPCC Intergovernmental Panel on Climate Change

PCs Principal Components

PCA Principal Component Analysis

PDS Partial duration series

PE3 Pearson Type III

RCP Representative Concentration Pathways

RCM Regional Climate Model

RFA Regional Frequency Analysis

xv

List of Symbols

𝑎 Coefficient which is equal to 0

𝑏 Linear regression coefficient

𝑐 The 24-hour precipitation value

𝐶 The total cost function for the entire regionalization plan

𝑑 The ratio value used to obtain the desired hourly or sub-hourly data value

𝑃𝑥1 The station with a missing precipitation value

𝑃𝑥2 The nearest neighbouring station’s respective precipitation value

𝑋 The converted precipitation value for a desired duration

𝑉𝑥1 The total within-region variation in mean annual precipitation for all sites

𝑉𝑥2 The total within-region variation in spring precipitation for all sites

𝑉𝑥3 The total within-region variation in summer precipitation for all sites

𝑉𝑥4 The total within-region variation in fall precipitation for all sites

𝑉𝑥5 The total within-region variation in winter precipitation for all sites

𝑉𝑥6 The total within-region variation in 100 maximum daily precipitation

values for all sites

𝑉𝑥7 The non-connectivity penalty

ZDIST Goodness-of-fit measure

𝜏3 L-moment coefficient of skewness

𝜏4 L-moment coefficient of kurtosis

1

1. Introduction

1.1 Background and Rationale

Intensity-duration-frequency (IDF) curves are utilized by water resources engineers to

design hydraulic structures such as sewer systems, dams and culverts. Previously IDF curves

were developed using historical rainfall data however recent observations and studies indicate

that historical data does not represent the intensity and magnitude of rainfall events that are

expected to occur in the future. Climate change impact studies from across North America have

predicted increasing trends in extreme rainfall events and such information needs to be

accounted for in design storm estimation. Infrastructure failure is a major concern because it can

lead to flooding which has proven to negatively impact society. There were recent cases in

southern Ontario where extreme rainfall events led to severe flooding. On July 8, 2013 an intense

storm in Toronto, Ontario caused approximately $1 billion in damages due to homes being

flooded, power outages and roads being flooded (Environment and Climate Change Canada,

2015). Furthermore, on August 4, 2014 a large storm in Burlington, Ontario caused $90 million

in damages due to homes and commercial buildings being flooded, damage to driveway culverts

and the flooding of car parking areas (Conservation Halton, 2015). Therefore updating IDF

curves to account for extreme rainfall events is required to ensure that the aforementioned

consequences of floods are prevented.

Climate change is caused by anthropogenic greenhouse gas emissions being produced at

an excessive rate. Approximately 50% of anthropogenic carbon dioxide emissions have occurred

within the last 40 years (IPCC, 2014). Greenhouse gas emissions are expected to increase in the

future and climate scientists believe that there will be unduly consequences on the Earth’s

2

climate, including increases in the magnitude and intensity of extreme rainfall events.

Accounting for these projected increases is extremely important for infrastructure design,

therefore the use of future climate models is necessary when designing water system

infrastructure (Lemmen, 2008). Different scenarios project greenhouse gas emission rates for the

entire 21st century based on global population, energy use and land use practices and resulting

precipitation intensity responses can be computed using climate models (IPCC, 2014). Regional

climate models and global climate models are used in this study to determine future precipitation

intensities triggered by climate change for the southern Ontario region.

1.2 Study Objectives

This research is part of the NSERC Canadian FloodNet Research Program, specifically

Project 1-4: Development of new methods for updating IDF curves in Canada.

The specific objective of this study was to update IDF curves in southern Ontario using

emerging methodologies. This study assessed several methodologies for updating IDF curves

and a comparative analysis was completed to determine which technique is most suitable for

updating IDF curves in southern Ontario. The objectives of this study are summarized as

follows:

1. Identify weather stations located within southern Ontario that meet the requirements of

this study and collect daily precipitation data for each station.

2. Identify the most effective IDF curve methodologies and update the IDF curves for each

station located within the study area using these methodologies;

3

a. Apply the at-site method to update IDF curves, specifically fitting the generalized

extreme value (GEV) probability distribution function to the annual maximum

rainfall data for each station.

b. Apply the regional frequency analysis method to update IDF curves. Utilize

clustering techniques such as Principal Component Analysis (PCA) with k-means,

Ward’s method and Tabreg to classify weather stations into homogeneous

precipitation regions. Apply the L-moments approach to determine if the

delineated regions were statistically homogeneous and for quantile estimation.

c. Apply the future IDF curve method to update IDF statistics. Use the Delta change

method for statistical downscaling and use a combination of two climate model

outputs (CanRCM4-CanESM2 and HadGEM2-ES) and two emission scenarios

(RCP4.5 and RCP8.5) to establish future rainfall intensities.

3. Perform a comparative analysis for the three aforementioned techniques and identify the

technique that is most suitable for updating IDF curves in southern Ontario.

1.3 Organization of Thesis

Chapter 2 contains a literature review which outlines the development of IDF curve

techniques over time. Chapter 3 presents the study area and a description of the data that was

used. Chapter 4 outlines the methodologies utilized throughout this study. Chapter 5 presents and

discusses the results of the study. Chapter 6 provides the conclusions of the study, and Chapter 7

outlines recommendations for future work.

2. Literature Review

4

2.1 The At-Site Intensity-Duration-Frequency (IDF) Curve Method

The at-site method which is used for IDF curve derivation can be dated back to 1932

(Bernard, 1932). The at-site method uses frequency analysis to determine the recurrence of

extreme rainfall events for a single site (or station). Various organizations have used the at-site

method to update IDF curves for different regions throughout Canada. Coulibaly and Shi (2005)

used the at-site method to develop updated IDF curves for the Grand River region and the

Kenora and Rainy River region. The City of Guelph (2007) developed IDF curves for three

weather stations using the at-site method and compared their results with the previously outdated

IDF curves used for storm design. Shephard (2011) updated IDF curves using additional weather

station data for New Brunswick, Nova Scotia, Newfoundland and Prince Edward Island with the

at-site method. Paixao et al. (2011) utilized the at-site method to update IDF curves for 92

climate stations and 29 tipping bucket rain gauge stations in southern Ontario. Although the at-

site method was once the conventional technique used for IDF curve derivation, there are

limitations associated with this method. Skewed data obtained from weather stations could

misrepresent the extreme nature of precipitation events due to short data records, a low density of

weather stations in a large region and missing rainfall data. To circumvent this issue the regional

frequency analysis method has become increasingly utilized for IDF curve development to

overcome the limitations associated with the at-site method.

2.2 The Regional Frequency Analysis Method

The regional frequency analysis method (Hosking and Wallis, 1997) incorporates

additional weather station data to enhance the characterization of extreme rainfall events for

different regions. The regional frequency analysis method reduces uncertainties associated with

5

the at-site method by creating homogeneous regions which allows for additional rainfall data to

be included in the derivation of IDF curves. The regional frequency analysis method has been

used in studies to determine extreme rainfall rates and flood estimates for various study areas

(Jingyi and Hall, 2004; Trefry et al., 2005; Schaefer et al., 2006; Parida and Moalafhi, 2008; Lim

and Voeller, 2009; Saf, 2009; Ngongondo et al., 2011; Malekinezhad and Zare-Garizi, 2014;

Bharath and Srinivas, 2015). A key component of the regional frequency analysis approach is the

regionalization technique needed to define homogeneous regions.

2.2.1 Regionalization Techniques

One particularly important step in regional frequency analysis is the delineation of

homogeneous regions using a clustering or classification technique. Hosking and Wallis (1997)

recommend using Ward’s method because it forms clusters containing an equal number of sites.

Different studies have used a wide variety of clustering techniques to create homogeneous

regions (Stathis and Myronidis, 2009; Abolverdi and Khalili, 2010; Modarres and Sarhadi, 2011;

Paixao et al., 2011). The accuracy of the delineated regions is tested using the L-moment

approach. Regions that are heterogeneous or contain discordant stations must be reevaluated to

ensure that the inclusion of all sites within a cluster accurately represent similar hydrologic

characteristics. Three clustering techniques, Principal Component Analysis with k-means

clustering, Ward’s method and Tabreg, were utilized to delineate regions exhibiting similar

hydrologic characteristics for this study.

2.2.2 Principal Component Analysis and K-means Clustering

Principal Component Analysis (PCA) and k-means clustering have been utilized to

delineate homogeneous regions when using the regional frequency analysis method. For

6

example, Brunnetti et al. (2004) applied PCA with VARIMAX rotation to monthly precipitation

data for 39 stations located in Italy and analyzed long term precipitation trends for the 5

delineated regions. Maraun et al. (2008) analyzed seasonal precipitation trends in the United

Kingdom by applying PCA to 689 rain gauge records that were categorized into 10 precipitation

intensity classes. Stathis and Myronidis (2009) delineated homogeneous precipitation regions in

central Greece by applying PCA to mean monthly precipitation data for 75 meteorological

stations. Abolverdi and Khalili (2010) applied k-means to annual maximum precipitation data

and at-site information, including latitude, longitude and elevation, and estimated quantiles for

different return periods for 4 homogeneous regions in Iran. Bernard et al. (2013) applied k-means

clustering to maximum hourly precipitation data in France and compared its results with the

partitioning around medoids algorithm. Fragoso and Tildes Gomes (2008) utilized PCA with k-

means clustering to determine spatial distribution patterns of heavy precipitation days and

identified specific atmospheric circulation patterns contributing to heavy rainfall patterns in

Portugal. Razavi and Coulibaly (2013) applied PCA with k-means clustering to delineate

homogeneous watershed clusters in Ontario.

2.2.3 Ward’s Method

Ward’s method is a widely used clustering technique to allocate sites into homogeneous

regions (Hosking and Wallis, 1997). Smithers and Schulze (2001) applied Ward’s method to

precipitation statistics and at-site characteristics of 172 rainfall stations in South Africa and

created design storms for 15 regions, where 10 of the regions were considered acceptably

homogeneous and 5 of the regions were considered possibly heterogeneous. Unal et al. (2003)

applied Ward’s method to temperature and precipitation data for 113 climate stations in Turkey

and delineated 7 clusters that exhibited distinct climate characteristics. Muñoz-Diaz and Rodrigo

7

(2004) applied Ward’s method to delineate homogeneous regions in Spain using seasonal rainfall

data and compared their results with the PCA method. Kansakar et al. (2004) applied Ward’s

method to precipitation data in Nepal and determined the effect physical characteristics have on

the seasonality of precipitation events and the magnitude of precipitation events. Kyselý et al.

(2006) determined that Ward’s method yielded superior results compared to the average-linkage

clustering method and as a result four homogeneous regions which exhibited similar extreme

precipitation characteristics in Czech Republic were identified. Noto and La Loggia (2009)

applied Ward’s method to 52 stream gauging sites in Sicily and developed growth curves to

estimate flooding events for each region. Modarres and Sarhadi (2011) utilized Ward’s method

to delineate 8 rainfall regions in Iran and then applied L-moment statistics to determine the

homogeneity, discordancy and regional frequency distribution function for each region. Paixao

et al. (2011) applied k-means, centroid and Ward’s method to 24 hour rainfall data in southern

Ontario and determined that Ward’s method was the most suitable technique to delineate

homogeneous regions that exhibit similar precipitation patterns.

2.2.4 Tabreg

Tabreg (Yiannakoulias et al. 2007; Yiannakoulias and Bland, 2016) is a general purpose

regionalization tool that is used to cluster stations into regions that exhibit similar hydrologic

characteristics. Tabreg uses a greedy search algorithm to minimize a user defined cost function.

The cost function can be comprised of multiple criteria and it is used to minimize within-region

variability (Yiannakoulias and Bland, 2016). Tabreg is used for regionalization of weather

stations, however this algorithm can be applied to other areas of study such as creating political

districts or creating health care regions (Yiannakoulias and Bland, 2016).

8

2.3 Future IDF Curve Development Method

The at-site method and the regional frequency analysis method use historical rainfall data

to determine recurrence intervals of extreme rainfall events. However the incorporation of

climate model data is now commonly utilized when updating IDF curves. Practitioners need to

develop infrastructure accounting for larger return periods and climate model data is used to

develop design storms to withstand future rainfall extremes with a 50 year and 100 year return

period. Climate change impact studies use regional climate models and global climate models to

assess the impact future precipitation rates will have on society. The future climate model based

technique has been used globally to determine rainfall rates that are expected to occur in the

future (Hennessy et al., 1997; Xuejie et al., 2001; Mailhot et al., 2007; Mladjic et al., 2011;

Dominguez et al., 2012; Mailhot et al., 2012; Willems et al., 2012; Clavet-Gaumont et al., 2013).

Coulibaly and Shi (2005) used the Canadian Global Circulation Model-2 to develop future IDF

curves for various regions in Ontario. Olsson et al. (2009) used the RCA3 regional climate model

coupled with A2 and B2 emission scenarios and determined that there will be significant

increases in extreme precipitation during the summer and autumn seasons in Kalmar, Sweden.

Urrutia and Vuille (2009) used a regional climate model following the A2 and B2 scenarios for

the 2070 – 2100 period and determined that different regions in the tropical Andes will

experience both increases and decreases in precipitation rates. Feng et al. (2011) used a global

climate model to project future precipitation changes in China and determined that extreme

precipitation events would significantly increase in southeast China. Coulibaly et al. (2015) used

an ensemble of global climate models and regional climate models to update IDF curves for

weather stations located in the Toronto and Essex regions.

3. Study Area and Data

9

3.1 Study Area

The study area that will be assessed in this study is southern Ontario, Canada. Southern

Ontario is the most populated region in Canada and it is a region that is projected to see its

population increase in the future. Infrastructure development and expansion is expected and there

is a need for IDF curves to accurately account for the extreme rainfall events that can impact this

region. Selected 48 weather stations located within different areas of southern Ontario were used

in this study (Figure 1). Between 1970 and 2000 the weather station with the highest mean

annual precipitation was Chatsworth which had 1149 mm, and the weather station with the

lowest mean annual precipitation was Toronto Island Airport which had a recording of 795 mm.

The weather stations with the lowest altitude are Belleville and Frenchman’s Bay with 76 meters

above sea level, meanwhile Proton weather station had the highest altitude with 480 meters

above sea level.

10

Figure 1: Map of southern Ontario showing the 48 weather stations used in the study.

3.2 Observed Data

31 years of daily precipitation data between 1970 and 2000 was collected from

Environment Canada’s Climate Data Archive for all 48 weather stations. A list of the selected

weather stations is presented in Appendix A, also included in this list are geographic

characteristics of each station such as latitude, longitude and altitude. There were specific criteria

that each weather station had to fulfill before being selected for this study. The first criterion

required each weather station to have at least 30 years of data. The second criterion was each

station’s most recent record of data must be recorded in the year 2000. The last criterion required

London

Kingston

Windsor

Wiarton

Toronto

Fort Erie

11

each station to have no more than 5% of its data missing. The weather station that had the most

missing data was Cressy weather station and it had approximately 4% of its daily precipitation

data missing.

Each weather station used in this study contained a large number of flagged data, i.e.

estimated values and trace values. A table displaying the flagged data for each weather station

can be found in Appendix B, Table B1. The weather station that contained the most trace values

was Toronto Pearson Airport and approximately 20% of its daily precipitation data contained

trace values. The weather station that contained the most estimated values was St. Catharines

Power Glen and approximately 2% of its daily precipitation data contained estimated values.

Although some weather stations contained a large number of flagged values this only affected

the data used for the regionalization portion of the study. Environment Canada’s weather station

data exhibited substantial sources of error which is a limitation for the regionalization part of this

study, however the annual maximum series data used to create the IDF curves was not estimated

and considered more reliable.

3.3 Climate Model Data

Regional climate models (RCM) and global climate models (GCM) are commonly used

for climate change impact studies in Ontario. The fourth generation of the Canadian regional

climate model driven by the second generation of the Canadian Earth System Model

(CanRCM4-CanESM2) was the RCM that was used in this study. The Hadley Global

Environment Model 2 coupled with the Earth System Model (HadGEM2-ES) was the GCM that

was used in this study. The CanRCM4-CanESM2 model data was downloaded from the

Canadian Centre for Climate Modeling. The HadGEM2-ES model data was downloaded from

12

the PCMDI – Earth System Grid Federation. Details outlining characteristics such as spatial

resolution, temporal resolution and simulation period for the CanRCM4-CanESM2 and

HadGEM2-ES climate models can be found in Appendix C.

Climate model data for the current period (1970 - 2000) was collected for both climate

models. Additionally, future climate model data was downloaded for the 2050s (2026 – 2045)

and the 2100s (2081 – 2099) time periods for two emissions scenarios (RCP4.5 and RCP8.5) for

both climate models. Determining future precipitation rates in response to climate change is

achieved by looking at different scenarios which assess future global population, energy use and

land use practices to project greenhouse gas emission rates for the 21st century (IPCC, 2014).

These scenarios are known as Representative Concentration Pathways (RCPs). RCP4.5 is an

intermediate scenario that limits the increase of future greenhouse gas emissions, meanwhile

RCP8.5 is an intensive scenario which predicts that future greenhouse gas emissions will

continually increase (IPCC, 2014).

4. Methodology

4.1 Overview of Methods

Initially weather stations were assessed to see if they would be suitable for this study.

Additionally, an extensive review was conducted to determine the most robust techniques

currently used for IDF curve development. After the review was conducted it was determined

that the at-site method, the regional frequency analysis method and a future IDF curve method

would be used to develop IDF curves for the selected weather stations. Figure 2 shows a

flowchart of the methodology used for this study. Detailed description of each step of the

methodology is provided hereafter.

13

Figure 2: An overview of the methodology used in this study.

Selection of Stations (48 stations were selected)

Regional Frequency Analysis Method

Classification of stations and

identification of homogeneous

regions

Fitting best fit probability

distribution to each homogeneous region

Development of regional IDF curves

At-site Method

Fitting of generalized extreme value (GEV)

probability distribution function

to each observed station data

Development of IDF curves for each

station using the at-site method

Future IDF Curve Method

Climate Model Selection

Downscaling climate model

data using Delta change approach

Development of future IDF curves

Comparison of IDF Curves

14

4.2 Data Processing

4.2.1 Converting Daily Duration Data to 24-hour Duration Data

The Environment Canada weather station data was provided in a daily duration format,

however to create IDF curves for drainage design purposes the precipitation data must be in an

hourly format. For all weather stations daily precipitation data was multiplied by 1.13, which is

the Hershfield factor and it is commonly used to convert daily precipitation data to 24-hour

duration data (Hershfield, 1961; Huff and Angel 1989; 1992; Coulibaly and Shi 2005; Coulibaly

et al. 2015).

4.2.2 Replacing Flagged Data

Flagged data was replaced for each weather station using the following procedure, which

is further described in Appendix B, Table B2:

1. All trace values were replaced with a value of 0.2 mm because this is the minimum

recorded rainfall value from the dataset.

2. The “C” flag, which is defined as precipitation occurred but amount is uncertain, was

replaced with a value of 0 mm.

3. The “A” flag, which is defined as accumulated precipitation where the previous value

was the “C” flag, was replaced with a value of 0 mm.

4. The “E” flag, which is defined as an estimated value, already had a value in the dataset

that was provided by Environment Canada, therefore the respective value remained

intact.

15

5. The “F” flag, which is defined as an accumulated and estimated value, already had a

value in the dataset that was provided by Environment Canada, therefore the respective

value remained intact.

6. All missing data was replaced using a linear regression model between two nearest

neighboring stations. The linear regression equation is formulated as,

𝑃𝑥1 = 𝑎 + 𝑏𝑃𝑥2 (1)

where 𝑃𝑥1 is the station with a missing precipitation value, 𝑎 is a coefficient which

equals 0, 𝑏 is the linear regression coefficient and 𝑃𝑥2 is the nearest neighbouring

station’s respective precipitation value.

4.2.3 Converting 24-hour Duration Data to Hourly and Sub-Hourly Duration Data

Converting 24-hour precipitation data to hourly and sub-hourly data was completed using

the ratio formula which was introduced by Hershfield (1961) and adapted by Huff and Angel

(1989). The ratios were obtained from the Hydrological Atlas of Canada (1978) and can be

viewed in Table 1. The ratio formula is defined as

𝑋 = 𝑐(𝑑) (2)

where 𝑋 is the converted precipitation value for a desired duration, 𝑐 is the 24-hour precipitation

value and 𝑑 is the ratio value used to obtain the desired hourly or sub-hourly data value.

16

Rainfall Duration Td (hours) Ratio Converting 24-hour Data

18 0.91

12 0.82

6 0.73

3 0.64

2 0.56

1 0.45

0.50 (30 min.) 0.38

0.25 (15 min.) 0.29

0.17 (10 min.) 0.22

0.08 (5 min.) 0.12

Table 1: Coefficients of the Ratio formula for converting 24h rainfall to different hourly and

sub-hourly durations of rainfall for Ontario (Adapted after Hershfield (1961); Huff and Angel

(1989); by Coulibaly and Shi, 2005).

4.2.4 Annual Maximum Series Derivation

For IDF curve development there are two methods commonly used to extract extreme

rainfall data from a time series, which are the annual maximum series (AMS) method and the

partial duration series (PDS) method. The AMS method extracts the largest recorded

precipitation event from each year. The PDS method sets a precipitation threshold value, i.e. for

24 hour rainfall data, the threshold value can be set to 40 mm (which is site or region dependent),

therefore any rainfall event greater than 40 mm will be extracted from the time series and is

included when creating IDF curves. There are positive and negative factors associated with both

methods. The problem associated with the AMS method is that it only extracts one extreme

17

rainfall event from a given year, however there may be two or more extreme precipitation events

within a year, therefore the extreme rainfall characteristics for a station could be misrepresented.

The issue associated with the PDS method is that there is large uncertainty regarding how to

properly choose the precipitation threshold value, i.e. is 40 mm more representative of extreme

rainfall events than 50 mm. Subjectivity is involved when answering the aforementioned

problem. Therefore the AMS method was employed when deriving IDF curves for this study.

The AMS data was extracted for all weather stations using MatLab (R2014a).

4.3 At-Site Method IDF Curve Development

The at-site IDF curve method was utilized to develop IDF curves for all weather stations

in the study area. Frequency analysis was utilized to determine the recurrence of extreme rainfall

events for a variety of return periods. Return periods are measured in years and the range of

return periods used in this study include 2, 5, 10, 20, 50 and 100 year return periods. To estimate

recurrence intervals of extreme precipitation events a probability distribution was fit to the AMS

data. The generalized extreme value (GEV) distribution was the probability distribution function

that was chosen based on previous recent study in Southern Ontario (Coulibaly et al., 2015).

Once the probability distribution function is fitted to the AMS, estimation of extreme rainfall

events for desired return periods can be completed. IDF curves for various storm durations

ranging from 15 minutes to 24 hour were created.

4.4 Regional Frequency Analysis Method IDF Curve Development

The regional frequency analysis method was used to create IDF curves for all

homogeneous regions identified by selected regionalization techniques. The regional frequency

analysis approach includes additional weather stations in the development of IDF curves. Three

18

clustering techniques were utilized to generate homogeneous regions, which include the

Principal Component Analysis (PCA) with k-means, the Ward’s method, and the Tabreg

approach. A regional probability distribution function was fit to the AMS of all weather stations

in each respective cluster. IDF curves for multiple rainfall durations were created for all stations

within that cluster.

4.4.1 Regionalization Method Inputs

When using the regional frequency analysis method, it is important to include at-site

characteristics as inputs into the clustering technique (Hosking and Wallis, 1997). After

completing sensitivity analyses it was determined that the most suitable parameters to

characterize precipitation for stations in southern Ontario included mean annual precipitation

(MAP), total spring precipitation (April – May), total summer precipitation (June – September),

total fall precipitation (October – November), total winter precipitation (December – March) and

total 100 maximum daily precipitation values. It is common practice to include input parameters

such as number of wet days and elevation, however this is problematic because these

aforementioned parameters do not have the same unit so it is difficult to ensure that all of the

parameters are evenly weighted. The units for all parameters used in this study are in

millimeters. All of the data has been normalized before being inputted into the regionalization

techniques.

4.4.2 Principal Component Analysis with K-means Clustering

Principal Component Analysis (PCA) is a multivariate method that discovers patterns and

trends within a dataset with the objective of classifying multiple variables using one value

(Mallants and Feyen, 1990). In most regionalization studies, multiple variables are assessed to

19

determine the behavior of a system. PCA extracts information from the original dataset and

simplifies it into orthogonal values called principal components (PCs) (Mallants and Feyen,

1990). The first PC explains the largest variance within the dataset, the second PC explains the

second largest variance within the dataset, and this trend continues until the last PC is measured

(Stathis and Myronidis, 2009). It is acceptable to include PCs where the variance explained is

greater than or equal to 80%. PCA was applied to observed data to describe hydrological

characteristics of sites located within southern Ontario.

K-means algorithm was used to classify the PCA scores of each weather station into

homogeneous regions. K-means algorithm places centroids into a dimension of space which

describes the PCA score data (Razavi and Coulibaly, 2013). The Euclidean distance between

each cluster centroid and the closest PCA score is measured (Razavi and Coulibaly, 2013). K-

means algorithm divides the data into k clusters by assigning each PCA score to the respective

cluster where the distance between the data point and the cluster centroid is smallest (Fragoso

and Tildes Gomes, 2008). The cluster centroids continue to relocate based on the inclusion of

data points within each respective cluster. The algorithm continually recalculates the

membership of each data point within each cluster in order to obtain new clusters, and this

process is carried out until the variance within each cluster is minimized. Therefore each cluster

is created based on the minimum Euclidean distance between the cluster centroids and each data

point.

The Davies–Bouldin (DB) index introduced by Davies and Bouldin (1979) is a metric for

evaluating clustering algorithms. Small values of this index correspond to clusters that are

compact (Aguado et al., 2008). Therefore the K-means algorithm with the Davies-Bouldin index

is applied to the first two principal components to determine the appropriate number of clusters

20

for the stations located in the study area (Razavi and Coulibaly, 2013). The number of clusters

that minimizes the Davis-Bouldin index is taken as the optimal number of clusters.

4.4.3 Ward’s Method

Ward’s method is an agglomerative hierarchal clustering technique that was used to

group each weather station into clusters that exhibit similar site characteristics. In Ward’s

method each site behaves as an individual cluster and after each iteration, sites that exhibit the

smallest Euclidean distance merge together to form a new cluster. For example, if Ga and Gb are

two sites that have the shortest distance between them compared to all other sites, then they

merge to form a new cluster Ga,b (Kahya et al., 2008). The regionalization process is complete

when all sites belong to a single cluster and desired regions are delineated by reducing the

variance within each cluster. Hosking and Wallis (1997) recommended using Ward’s method for

regionalization because it produces adequate clusters with an equal amount of sites in each

region.

4.4.4 Tabreg Method

Tabreg was used to cluster weather stations into regions which exbitied similar

hydrologic characteristics. The only hard constraint on the formation of regions is that the sites

within a region must form a contiguous system. The topology for determining contiguity is

provided in Figure 3.

21

Figure 3: Topology created for the selected weather stations.

In our application, we start with a randomly generated regionalization plan and use

Tabreg to search for an improved plan by moving stations between regions in a way that

minimizes this cost function subject to the contiguity constraint as follows:

𝐶 = (𝑉𝑥1 + 𝑉𝑥2 + 𝑉𝑥3 + 𝑉𝑥4 + 𝑉𝑥5 + 𝑉𝑥6 + 𝑉𝑥7) (3)

where 𝐶 represents the total cost function for the entire regionalization plan, 𝑉𝑥1 represents the

total within-region variation in mean annual precipitation for all sites, 𝑉𝑥2 represents the total

within-region variation in spring precipitation for all sites, 𝑉𝑥3 represents the total within-region

variation in summer precipitation for all sites, 𝑉𝑥4 represents the total within-region variation in

fall precipitation for all sites, 𝑉𝑥5 represents the total within-region variation in winter

22

precipitation for all sites, 𝑉𝑥6 represents the total within-region variation in 100 maximum daily

precipitation values for all sites and 𝑉𝑥7 represents the non-connectivity penalty (Yiannakoulias

et al., 2007). The non-connectivity penalty increases the cost associated with regions of irregular

shape.

Once an initial randomly generated regionalization plan is found, Tabreg uses a greedy

search to move sites into and out of regions in a way that minimizes the cost function.

Unfortunately regionalization plans based on greedy searches are not likely to generate good

regionalization results. This is because they can get trapped in local (or “poor”) optima from

which no improved moves can be found. In order to escape poor quality regionalization plans,

Tabreg employs a tabu list based on the work of Bozkaya et al. (2003). Local optimum solutions

are avoided by memorizing previous station assignments and new regions are created by adding

stations into different areas of the search space. This occurs because a tabu list limits certain

search options for a period of time. This forces the algorithm to identify new stations that can be

added into regions. Initially this can negatively impact the cost functions, however over a long

run, as more station assignments are made, the quality of solutions is greatly improved. As

displayed in Figure 4, the quality of solutions remains trapped in a local optima for the first 235

iterations. Once the tabu list is activated additional stations are utilized and optimal regions are

delineated which is observed when the cost function substantially decreases from 57 to less than

20.

23

Figure 4: Graphical plot of the cost function after 500 iterations using Tabreg.

A non-connectivity penalty was included in the objective function of the Tabreg method

to ensure that regions are relatively compact in shape. Additionally Tabreg ensures that the

delineated regions are contiguous, meanwhile PCA with k-means and Ward’s method does not

ensure this.

4.4.5 L-moments Approach

The L-moments statistics validate the homogeneous regions that are delineated using

regionalization techniques. When using this approach it is assumed that stations which form a

0

10

20

30

40

50

60

70

80

90

100

1

19

37

55

73

91

10

9

12

7

14

5

16

3

18

1

19

9

21

7

23

5

25

3

27

1

28

9

30

7

32

5

34

3

36

1

37

9

39

7

41

5

43

3

45

1

46

9

48

7

Co

st F

un

ctio

n

Number of Iterations

Tabreg Cost Function After 500 Iterations

24

homogeneous region represent an identical frequency distribution apart from a site-specific

scaling factor (Hosking and Wallis, 1997). The L-moments approach uses multiple datasets from

a respective region to compute the final quantile estimates which are used to create regional IDF

curves.

4.4.6 L-moment Ratio Diagrams

L-moment ratio diagrams identify the probability distribution function that is most

appropriate for each region. The L-moment ratio diagram is used to assess the adequacy of five

different three-parameter probability distribution functions for each region, including generalized

logistic (GLO), generalized extreme value (GEV), generalized normal (GN), Pearson Type III

(P3) and generalized Pareto (GPA) distributions. L-moment ratio diagrams are visual tools which

present the L-moment coefficient of kurtosis, 𝜏4, of each distribution as a function of its L-

moment coefficient of skewness, 𝜏3 (Hosking and Wallis, 1997). For the candidate probability

distribution functions, the representation of 𝜏4 as a function of 𝜏3 will result in a curve that can

be seen on the L-moment ratio diagram (Hosking and Wallis, 1997). In the same diagram the

point corresponding to 𝜏4 as a function of 𝜏3 for the observed data is considered the reference

point. The selection of the best distribution is based on the proximity of the distribution curve in

relation to the observed data. A suitable probability distribution function is selected if it is closest

to the point of the observed data.

4.4.7 Goodness-of-Fit Measure

When using the L-moment ratio diagram if there are multiple probability distribution

functions that are appropriate for a respective region then the goodness-of-fit measure is utilized

to accurately determine which probability distribution function is most suitable for each region.

25

The goodness-of-fit measure assessed the similarity between the sample kurtosis of the observed

data and the population kurtosis of the fitted distribution (Hosking and Wallis, 1997). A

candidate distribution was selected if the goodness-of-fit measure ZDIST

was less than the

threshold value 1.64 which corresponds to the 90% normal quantile (Hosking and Wallis, 1997).

ZDIST

was defined as

𝑍𝐷𝐼𝑆𝑇 =𝜏4

𝐷𝐼𝑆𝑇−𝜏4𝑅+𝛽4

𝜎4 (4)

where DIST is any particular candidate distribution that is being assessed, 𝜏4𝑅 is the regional

average L-kurtosis, 𝜏4𝐷𝐼𝑆𝑇 is the L-kurtosis of the fitted distribution, 𝛽4 is the bias of the regional

average sample L-kurtosis, and 𝜎4 is the standard deviation of the regional average sample L-

kurtosis (Hosking and Wallis, 1997). A candidate distribution was considered satisfactory if

𝑍𝐷𝐼𝑆𝑇 ≤ 1.64 and a satisfactory distribution was accepted at a confidence level of 90% (Hosking

and Wallis, 1997).

4.5 Future IDF Curve Methodology

4.5.1 Selection of Future Time Periods

Certain climate models do not have the same projected future time periods. In order to

allow for a consistent comparison of different climate models, the future time periods for

selected climate models must overlap. Two time periods were selected to create future IDF

curves for both the CanRCM4-CanESM2 and HadGEM2-ES climate models, the 2050s (2026 –

2045) and the 2100s (2081 – 2099). Both of these time periods follow the RCP4.5 and RCP8.5

scenarios. Therefore for each station, an ensemble of IDF projections were generated using the

aforementioned RCP scenarios and future time periods.

26

4.5.2 Climate Model Data Processing

Data for CanRCM4-CanESM2 was downloaded at the three closest grid points of each

station and the inverse distance weighted average was calculated to obtain the daily rainfall data.

Data for HadGEM2-ES was downloaded at the closest grid point of each station because GCMs

have coarse spatial resolution and only the grid point that is closest to each station can best

represent the data. The CanRCM4-CanESM2 data was downloaded in a 1-hour duration format

and the HadGEM2-ES data was provided in a 3-hour duration format. In order to obtain climate

model data for different storm durations, the aggregation method was applied to the precipitation

data to get lower resolution data (i.e. 6-hour, 12-hour and 24-hour for HadGEM2-ES and 3-hour,

6-hour, 12-hour and 24-hour for CanRCM4-CanESM2) and then higher resolution data (i.e. 15-

min, 30-min, 1-hour for HadGEM2-ES and 15-min and 30-min for CanRCM4-CanESM2) was

derived using the ratio formula presented in Section 4.1.3 of this study.

4.5.3 Downscaling Techniques

Each climate model scenario is downscaled using a variant of the Delta change method

first introduced by Olsson (2009). The Delta change method is a robust downscaling technique

that has been implemented by the European Union Sustainable Urban Development Planner for

Climate Change Adaptation Project (SUDPLAN, 2012). The Delta change method uses climate

projections to estimate the change in current design storms and future design storms and applies

this change to historical precipitation data. The projected future design storm 𝐼𝑝 is formulated as:

𝐼𝑝 = 𝐼𝑜𝐼𝑓

𝐼𝑐 (5)

27

where 𝐼𝑓 is the future design storm based on future climate model rainfall intensities for a

specific duration, 𝐼𝑐 is the current design storm based on current climate model rainfall

intensities for a specific duration and 𝐼𝑜 is the observed rainfall data for a specific duration. The

advantage of the Delta change method is that the temporal and spatial variability of the observed

data is preserved (Olsson et al., 2009).

4.5.4. Future IDF Curve Derivation

Based on previous recent study (Coulibaly et al. 2015), the generalized extreme value

(GEV) distribution was the probability distribution function that was used for all future design

storm estimation. IDF curves for each future time period were developed by taking the

downscaled output from each climate model and RCP scenario (Coulibaly et al., 2015).

5. Results and Discussion

5.1 Principal Component Analysis and K-means Clustering Regionalization Results

The first step when utilizing regional frequency analysis is determining the number of

clusters that are appropriate for the region based on the available data. The k-means algorithm

and the Davies-Bouldin index was applied to determine the optimal number of clusters based on

the available dataset. The optimal number of clusters equates to the cluster size that has the

lowest Davies-Bouldin value. The value of this index is averaged after 10 runs. The results based

on the Davies-Bouldin index indicate that nine clusters is appropriate to delineate regions for the

sites used in this study (Figure 5).

28

Figure 5: Davies-Bouldin index applied to the input dataset.

The percentage of variability which is explained for each principal component is

acceptable at a level of 80%. Figure 6 shows the percentage of variance explained by each

principal component. The first two principal components explained approximately 80% of the

total variability of the data and will be included in the principal component analysis portion of

the study.

The principal component coefficients for each attribute and the principal component

scores for each observation are presented in Figure 7. The six parameters are represented as a

vector and the position of each vector indicates which parameters influence each principal

component. For the first principal component all six parameters have positive coefficients which

29

indicates that regionalization of the stations will be influenced by all parameters. Mean annual

precipitation is the parameter that exhibits the most influence on the first principal component.

Figure 6: Percentage of total variability explained by each principal component.

Meanwhile total fall precipitation, total winter precipitation, total summer precipitation, total

spring precipitation and 100 maximum daily precipitation values have less of an influence on the

first principal component (in descending order). The 100 maximum daily precipitation values is

the parameter that is represented the most in the second principal component. Total spring

precipitation and total summer precipitation are parameters that also exhibited influence on the

second principal component. Mean annual precipitation, total fall precipitation and total winter

30

precipitation are the least influential parameters for the second principal component (in

descending order).

Figure 7: PCA loading plots for the first and second principal components.

The results for weather station regionalization using Principal Component Analysis with

k-means clustering are shown in Figure 8. The regionalization results using PCA with k-means

clustering show that contiguous homogeneous regions were not obtained and that most regions

are heterogeneous. Most clusters contain stations that are located at opposite areas of the study

31

region. Although some clusters such as cluster 4, cluster 7 and cluster 9 show signs of contiguity

it can be seen that the use of PCA with k-means clustering creates heterogeneous regions.

Figure 8: Map showing the regions that were delineated using Principal Component

Analysis with k-means.

5.2 Ward’s Method Regionalization Results

The results for weather station regionalization using Ward’s method are shown in Figure

9. The regions delineated using Ward’s method exhibit more contiguity compared to PCA with

32

k-means clustering, however cluster 1, cluster 2 and cluster 3 show signs of heterogeneity and

are not contiguous homogeneous regions (Figure 9). However, the method also delineated

contiguous homogeneous: cluster 4, cluster 6, cluster 9 and cluster 9. But, most regions created

using Ward’s method are heterogeneous.

Figure 9: Map showing the regions that were delineated using Ward’s method.

PCA with k-means clustering and Ward’s method produced clusters that are not

contiguously homogeneous. An important aspect of regional frequency analysis is to delineate

contiguous homogeneous regions. In previous studies it is common to see methods such as PCA

with k-means clustering and Ward’s method to include input parameters such as latitude and

33

longitude which allow for the delineation of contiguous regions. Initially latitude and longitude

were not included as input parameters for this study. Additional tests were completed where

latitude and longitude were included as input parameters however this did not obtain better

results as the new regions that were created still did not exhibit contiguity. Therefore the

inclusion of a regionalization technique that accounts for contiguity and is able to cluster the

stations into statistically homogeneous regions is required. Tabreg clustering method has proven

to be a robust method that accounts for contiguity and it was used as an additional

regionalization method.

5.3 Tabreg Regionalization Results

The Tabreg clustering method was used to delineate homogeneous regions and the results

show that the clusters that were formed are uniformly and contiguously regionalized (Figure 10).

34

Figure 10: Map showing the regions that were delineated using Tabreg clustering method.

Therefore the regions delineated using the Tabreg method will be used for regional frequency

analysis in this study. To ensure that the clusters are statistically homogeneous the L-moments

approach was applied to each cluster delineated using Tabreg. All of the clusters obtained using

Tabreg successfully passed the discordancy measure and the heterogeneity measure test.

5.4 L-moment ratio diagrams

L-moment ratio diagrams are visual tools that assess which probability distribution

functions are most appropriate for each clusters’ dataset. The probability function closest to the

reference point (the + symbol in Figure 11 and Figure 12) is considered the most appropriate

London

Kingston

Windsor

Wiarton

Toronto

Fort Erie

35

function. The most appropriate probability distribution function that will be used for cluster 1 is

the generalized logistic (GLO) probability distribution (Figure 11) and for cluster 2 it is the

generalized extreme value (GEV) probability distribution (Figure 12).

Figure 11: L-moment ratio diagram for cluster 1. Note: Each line on the graph represents

different three-parameter distributions that were tested for cluster 1. The three-parameter

distributions used are: GPA: Generalized Pareto, GEV: Generalized Extreme Value, GLO:

Generalized Logistic, LN3: Lognormal and PE3: Pearson Type 3; the + symbol is the reference

point; the ■ symbol represents the different two-parameter distributions that were tested for

cluster 1. The two-parameter distributions used are E: Exponential, G: Gumbel, L: Logistic, N:

Normal and U: Uniform.

36

Figure 12: L-moment ratio diagram for cluster 2. Note: Each line on the graph represents

different three-parameter distributions that were tested for cluster 2. The three-parameter

distributions used are: GPA: Generalized Pareto, GEV: Generalized Extreme Value, GLO:

Generalized Logistic, LN3: Lognormal and PE3: Pearson Type 3; the + symbol is the reference

point; the ■ symbol represents the different two-parameter distributions that were tested for

cluster 2. The two-parameter distributions used are E: Exponential, G: Gumbel, L: Logistic, N:

Normal and U: Uniform.

37

5.5 Goodness-of-fit measure

The goodness-of-fit measure was utilized to confirm the most appropriate regional

probability distribution function for each cluster. There are many cases where more than one

probability distribution appropriately fits the dataset. For cluster 1 there are three probability

distributions that can appropriately fit the cluster’s dataset, including the generalized logistic

distribution, generalized extreme value distribution and generalized normal distribution (Table

2). For cluster 2 there are four probability distributions that can appropriately fit the cluster’s

data (Table 3). Since multiple probability distributions passed the goodness-of-fit measure, the

ZDIST

score closest to 0 is used to determine the most suitable probability distribution for each

cluster. The most suitable probability distributions for all clusters can be found in Table 4.

Cluster 1

Probability Distribution

Function Goodness-of-fit measure

Generalized Logistic -0.39

Generalized Extreme Value -1.18

Generalized Normal -1.52

Table 2: Goodness-of-fit measure results for each station located in cluster 1.

38

Cluster 2

Probability Distribution

Function Goodness-of-fit measure

Generalized Logistic 0.50

Generalized Extreme Value -0.32

Generalized Normal -0.71

Pearson Type III -1.42

Table 3: Goodness-of-fit measure results for each station located in cluster 2.

Cluster Identification Best Fit Distribution

Cluster 1 GLO

Cluster 2 GEV

Cluster 3 GEV

Cluster 4 GNO

Cluster 5 GLO

Cluster 6 GPA

Cluster 7 PE3

Cluster 8 GPA

Cluster 9 GNO

Table 4: L-moment ratio diagram results for each cluster.

39

5.6 IDF Curve Results: At-Site Method Compared to Regional Frequency Analysis

Method

After determining the most appropriate regional probability distribution function for each

cluster, IDF curves were created for each station using multiple return periods. For illustrative

purposes, the at-site method (following the GEV distribution) and the regional frequency

analysis method IDF curve results were compared for selected stations in different clusters. The

at-site IDF curve and the regional frequency analysis method IDF curve for Glen Allen station

are shown in Figure 13. The at-site IDF curve and the regional frequency analysis method IDF

curve for Tillsonburg station are shown in Figure 14. Glen Allen and Tillsonburg are located in

cluster 1 and the probability distribution function that is used for regional frequency analysis for

cluster 1 is the generalized logistic distribution. For the 3 hour and 12 hour duration rainfall the

at-site method produces similar rainfall values compared to the regional frequency analysis

method for the 2, 5 and 10 year return period for both Glen Allen station (Figure 13) and

Tillsonburg station (Figure 14). The regional frequency analysis method produces slightly larger

rainfall values compared to the at-site method for the 20 year return period for Glen Allen and

Tillsonburg. For the 50 year and 100 year return period the regional frequency analysis method

produces significantly larger rainfall values compared to the at-site method for Glen Allen and

Tillsonburg.

Using two different probability distribution functions for the at-site IDF curve and the

regional frequency analysis IDF curve contributed to the differences that were observed between

the at-site method and the regional frequency analysis method. The Z-score probability values

for the generalized logistic distribution and the Z-score probability values for the generalized

extreme value distribution are similar for the 2, 5 and 10 year return periods. However for the

40

larger return periods – such as 20, 50 and 100 years – the generalized logistic and the generalized

extreme value distributions have large differences in Z-score probability values. Since the Z-

score probability values for short return periods are similar for both the GEV and GLO

distributions then the IDF curve values are also similar. However for the larger return periods,

the GLO Z-score probability values are larger compared to the GEV Z-score probability values

and as a result the final IDF curve values for the regional frequency analysis method are

subsequently larger than the rainfall values of the at-site method. There are other clusters in the

study area that have a different regional probability distribution function compared to the at-site

probability distribution. As a result significant differences of rainfall values were observed for

larger return periods in other clusters as well. The patterns observed using the at-site method and

regional frequency analysis method are displayed by showing the percentage difference of the

rainfall intensities produced for the at-site method and the regional frequency analysis method

(Figure 15).

41

(a) (b)

Figure 13: Plots of IDF statistics from the regional frequency analysis method and the at-site

method for Glen Allan station (cluster 1): (a) 3 hour rainfall and (b) 12 hour rainfall.

5

10

15

20

25

30

35

2 5 10 20 50 100

Rai

nfa

ll In

ten

sity

(m

m/h

r)

Return Period (years)

Glen Allan Station: 3 hour Rainfall

At-siteMethod

RegionalMethod

2

3

4

5

6

7

8

9

10

2 5 10 20 50 100R

ain

fall

Inte

nsi

ty (

mm

/hr)

Return Period (years)

Glen Allan Station: 12 hour Rainfall

At-siteMethod

RegionalMethod

42

(a) (b)

Figure 14: Plots of IDF statistics from the regional frequency analysis method and the at-site

method for Tillsonburg station (cluster 1): (a) 3 hour rainfall and (b) 12 hour rainfall.

5

10

15

20

25

30

35

2 5 10 20 50 100

Rai

nfa

ll In

ten

sity

(m

m/h

r)

Return Period (years)

Tillsonburg Station: 3 hour Rainfall

At-siteMethod

RegionalMethod

2

3

4

5

6

7

8

9

10

2 5 10 20 50 100R

ain

fall

Inte

nsi

ty (

mm

/hr)

Return Period (years)

Tillsonburg Station: 12 hour Rainfall

At-siteMethod

RegionalMethod

43

Figure 15: Difference (in %) of rainfall intensity between the at-site method and the

regional frequency analysis method for stations in cluster 1.

After comparing IDF curves using the at-site method and the regional frequency analysis

method with different probability distribution functions, the differences in rainfall values for

larger return periods were obvious. Given that certain clusters use the GEV distribution similarly

to the at-site method, the next set of stations compared using the at-site method and the regional

frequency analysis method will use the GEV distribution to see if there are still differences in

final IDF values. The at-site IDF curve and the regional frequency analysis method IDF curve for

Vineland Rittenhouse station are shown in Figure 16. The at-site IDF curve and the regional

frequency analysis method IDF curve for Hagersville station are shown in Figure 17. Vineland

-10

-5

0

5

10

15

20

25

30

2 5 10 20 50 100

Pe

rce

nta

ge D

iffe

ren

ce (

%)

Return Period (years)

Difference in rainfall intensity between the at-site method and the RFA method

Glen Allan

Tillsonburg

44

Rittenhouse is located in cluster 3 and Hagersville is located in cluster 2 and the probability

distribution function that is used for regional frequency analysis for cluster 2 and cluster 3 is the

generalized extreme value (GEV) distribution also used for all stations in the at site method. For

the 3 hour and 12 hour duration rainfall, the at-site method produces similar rainfall values

compared to the regional frequency analysis method for the 2, 5 and 10 year return period for

both Vineland Rittenhouse (Figure 16) and Hagersville (Figure 17). The regional frequency

analysis method produces slightly larger rainfall values compared to the at-site method for the 20

year return period for Vineland Rittenhouse (Figure 16), however the regional frequency analysis

method has similar rainfall values as the at-site method at the 20 year return period for

Hagersville (Figure 17). For the 50 year and 100 year return period the regional frequency

analysis method produces larger rainfall values compared to the at-site method for Vineland

Rittenhouse and Hagersville.

A similar trend that was observed in the cluster 1 results is also observed in the results for

cluster 2 and cluster 3 where the rainfall values for the at-site method and the regional frequency

analysis method are similar for the 2, 5 and 10 year return period. However for the larger return

periods – including 20, 50 and 100 years – the rainfall values are larger for the regional

frequency analysis method compared to the at-site method. Therefore the final IDF values for the

at-site method are similar to the regional frequency analysis method for short return periods for

both stations. The patterns observed using the at-site method and regional frequency analysis

method are displayed by showing the percentage difference of the rainfall intensities produced

for the at-site method and the regional frequency analysis method (Figure 18).

45

(a) (b)

Figure 16: Plots of IDF statistics from the regional frequency analysis method and the at-site

method for Vineland Rittenhouse station (cluster 3): (a) 3 hour rainfall and (b) 12 hour rainfall.

0

5

10

15

20

25

2 5 10 20 50 100

Rai

nfa

ll In

ten

sity

(m

m/h

r)

Return Period (years)

Vineland Station: 3 hour Rainfall

At-siteMethod

RegionalMethod

0

1

2

3

4

5

6

7

8

2 5 10 20 50 100R

ain

fall

Inte

nsi

ty (

mm

/hr)

Return Period (years)

Vineland Station: 12 hour Rainfall

At-siteMethod

RegionalMethod

46

(a) (b)

Figure 17: Plots of IDF statistics from the regional frequency analysis method and the at-site

method for Hagersville station (cluster 2): (a) 3 hour rainfall and (b) 12 hour rainfall.

0.0

5.0

10.0

15.0

20.0

25.0

30.0

2 5 10 20 50 100

Rai

nfa

ll In

ten

sity

(m

m/h

r)

Return Period (years)

Hagersville Station: 3 hour Rainfall

At-SiteMethod

RegionalMethod

0.0

1.0

2.0

3.0

4.0

5.0

6.0

7.0

8.0

9.0

10.0

2 5 10 20 50 100

Rai

nfa

ll In

ten

sity

(m

m/h

r)

Return Period (years)

Hagersville Station: 12 hour Rainfall

At-SiteMethod

RegionalMethod

47

Figure 18: Difference (in %) of rainfall intensity between the at-site method and

the regional frequency analysis method for Hagersville (cluster 2) and

Vineland Rittenhouse (cluster 3).

Furthermore, the at-site method and the regional frequency analysis method were

compared using weather stations from all clusters for short return periods (Figure 19) and for

large return periods (Figure 20). Similar trends that were observed in clusters 1, 2 and 3 are also

found within the other delineated clusters as differences in rainfall intensities for large return

periods are obvious, however for short return periods rainfall values are similar. Cluster 2 and

cluster 3 use the generalized extreme value as the regional probability distribution, and the

regional frequency analysis method produces larger rainfall values compared to the at-site

-20

-15

-10

-5

0

5

10

15

20

25

2 5 10 20 50 100

Pe

rce

nta

ge D

iffe

ren

ce (

%)

Return Period (years)

Difference in rainfall intensity between the at-site method and the RFA method

Hagersville

VinelandRittenhouse

48

method. Meanwhile clusters 1, 4, 5, 6, 7, 8 and 9 use a different regional probability distribution

compared to the at-site method, and differences in rainfall values are observed for larger return

periods as well. Overall using the regional frequency analysis method produces differences in

rainfall values compared to the at-site method, especially for large return periods. Similar

patterns are observed for clusters where the regional frequency analysis probability distribution

and the at-site probability distribution are different or identical. There were also cases where

stations using the at-site method produced larger rainfall values compared to the regional

frequency analysis method. Therefore regardless of which probability distribution function is

used for each IDF method, the IDF values are generally different for larger return periods (50 to

100 years).

49

Figure 19: Difference (in %) of rainfall intensity between the at-site method and

the regional frequency analysis method for weather stations in all clusters for short return

periods.

-6

-4

-2

0

2

4

6

8

10P

erc

en

tage

Dif

fere

nce

(%

)

Weather Stations

Difference in rainfall intensity between the at-site method and the RFA method for all clusters for short return periods

2 year returnperiod

5 year returnperiod

10 year returnperiod

50

Figure 20: Difference (in %) of rainfall intensity between the at-site method and

the regional frequency analysis method for weather stations in all clusters for large return

periods.

5.7 Future IDF Curve Results

To simplify results presentation, sample station results are presented for illustration.

Durham station is located in cluster 9 and Hartington station is located in cluster 6 and despite

being at large distance away from one another, similar future IDF curve results were observed

for both stations. Figure 19 shows the future IDF results for Durham station for the 3 hour

0

5

10

15

20

25P

erc

en

tage

Dif

fere

nce

(%

)

Weather Stations

Difference in rainfall intensity between the at-site method and the RFA method for all clusters for large return periods

20 year returnperiod

50 year returnperiod

100 year returnperiod

51

rainfall duration and Figure 20 shows the Durham station results for the 12 hour rainfall duration.

Figure 21 shows the future IDF curves for Hartington station for the 3 hour rainfall duration and

Figure 22 shows the Hartington station results for the 12 hour rainfall duration. Future IDF

curves derived for the 2050s period and 2100s period were compared with the IDF curve results

using the at-site method and the regional frequency analysis method. The 2050s period has larger

rainfall values compared to the at-site method and the regional frequency analysis method for all

return periods for both Durham (Figure 19 and Figure 20) and Hartington (Figure 21). However

for the 12 hour duration for Hartington station, the regional frequency analysis method has larger

rainfall values for the 10, 20, 50 and 100 year return period compared to the 2050s period

(Figure 22 a). The 2100s period has significantly larger rainfall values compared to the 2050s

period, the at-site method and the regional frequency analysis method for all return periods for

Durham station and Hartington station.

52

(a) (b)

Figure 21: IDF curves obtained from the future IDF method, the regional frequency analysis

method and the at-site method for Durham station (cluster 9) for the 3 hour rainfall: (a) 2050s

period and (b) 2100s time period.

5

10

15

20

25

2 5 10 20 50 100

Rai

nfa

ll In

ten

sity

(m

m/h

r)

Return Period (years)

Durham Station: Comparison of IDF methods for 3 hr rainfall for 2050s

period

At-siteMethod

RegionalMethod

2050s

5

10

15

20

25

30

2 5 10 20 50 100

Rai

nfa

ll In

ten

sity

(m

m/h

r)

Return Period (years)

Durham Station: Comparison of IDF methods for 3 hr rainfall for 2100s

period

At-siteMethod

RegionalMethod

2100s

53

(a) (b)

Figure 22: IDF curves obtained from the future IDF method, the regional frequency analysis

method and the at-site method for Durham station (cluster 9) for the 12 hour rainfall: (a) 2050s

period and (b) 2100s time period.

2

3

4

5

6

7

8

2 5 10 20 50 100

Rai

nfa

ll In

ten

sity

(m

m/h

r)

Return Period (years)

Durham Station: Comparison of IDF methods for 12 hr rainfall for 2050s

period

At-siteMethod

RegionalMethod

2050s

2

3

4

5

6

7

8

9

10

2 5 10 20 50 100

Rai

nfa

ll In

ten

sity

(m

m/h

r)

Return Period (years)

Durham Station: Comparison of IDF methods for 12 hr rainfall for 2100s

period

At-siteMethod

RegionalMethod

2100s

54

(a) (b)

Figure 23: IDF curves obtained from the future IDF method, the regional frequency analysis

method and the at-site method for Hartington station (cluster 6) for the 3 hour rainfall: (a)

2050s period and (b) 2100s time period.

5

7

9

11

13

15

17

19

21

2 5 10 20 50 100

Rai

nfa

ll In

ten

sity

(m

m/h

r)

Return Period (years)

Hartington Station: Comparison of IDF methods for 3 hr rainfall for

2050s period

At-siteMethod

RegionalMethod

2050s

5

10

15

20

25

30

35

2 5 10 20 50 100

Rai

nfa

ll In

ten

sity

(m

m/h

r)

Return Period (years)

Hartington Station: Comparison of IDF methods for 3 hr rainfall for

2100s period

At-siteMethod

RegionalMethod

2100s

55

(a) (b)

Figure 24: IDF curves obtained from the future IDF method, the regional frequency analysis

method and the at-site method for Hartington station (cluster 6) for the 12 hour rainfall: (a)

2050s period and (b) 2100s time period.

1

2

3

4

5

6

7

2 5 10 20 50 100

Rai

nfa

ll In

ten

sity

(m

m/h

r)

Return Period (years)

Hartington Station: Comparison of IDF methods for 12 hr rainfall for

2050s period

At-siteMethod

RegionalMethod

2050s

2

3

4

5

6

7

8

9

10

2 5 10 20 50 100

Rai

nfa

ll In

ten

sity

(m

m/h

r)

Return Period (years)

Hartington Station: Comparison of IDF methods for 12 hr rainfall for

2100s period

At-siteMethod

RegionalMethod

2100s

56

The rainfall values obtained using the future IDF methods for Durham station and

Hartington station showed that the 2050s time period had slightly larger rainfall values compared

to the at-site IDF curve and the regional frequency analysis method IDF curve; the exception was

for Hartington station 12 hour rainfall where the regional frequency analysis method had larger

rainfall values for the 10, 20, 50 and 100 year return period compared to the 2050s period. The

CanRCM4 and HadGEM2 climate models are based on the new IPCC emission scenarios which

suggest that extreme increases in rainfall intensities will not be observed over a short timescale.

As a result there were minor increases in rainfall values for the 2050s period. However, when

comparing the 2100s time period to the at-site method and the regional frequency analysis

method larger increases in rainfall values were observed for all return periods. Furthermore, the

2100s period had larger rainfall values for the 20, 50 and 100 year return periods when compared

to the 2050s time period. Following the RCP4.5 and RCP8.5 emission scenarios it is expected

that there will be large increases in rainfall values for the 2100s period (IPCC, 2014). (IPCC,

2014) infers that increases in extreme rainfall events will be realized over a large period of time

due to the intensive greenhouse gas emissions that are projected to accumulate by the 2100s

period. After developing IDF curves using the future IDF curve method it is expected that

rainfall intensities are expected to significantly increase in southern Ontario for the 2100s period

as similar results were observed for all stations in this region.

5.8 Variability of the Future IDF Curve Results

There are numerous climate models and emission scenarios that are available to create

future IDF curves. Since infrastructure designs are planned to account for extreme rainfall events

expected to occur in the future, it is important that climate models adequately represent projected

57

rainfall intensities for each respective region. The variability of results due to the diversity of the

climate models and the associated emission scenarios was assessed as well.

For 12 hour rainfall duration for the Durham station, the climate model and emission

scenario that produced the lowest intensity of rainfall for the 2100s period was the HadGEM2

model for the RCP4.5 emission scenario (Figure 24). The HadGEM2 model following the

RCP4.5 scenario is an intermediate scenario where emission levels are supposed to level off,

however, the rainfall intensity for this scenario are moderately larger compared to the at-site

method rainfall intensities, therefore under one of the best case lowest emission scenarios, an

increase in rainfall intensity is still expected for return periods greater than 10 years for the 2100s

period. For the Durham station the climate model and emission scenario that produced the largest

intensity of rainfall for the 2100s period was the CanRCM4 model for the RCP8.5 emission

scenario. The rainfall intensities for this scenario are moderately larger compared to the at-site

method rainfall intensities for the 2 – 10 year return periods however significant increases are

observed for the 25 – 100 year return periods, therefore under one of the worst case emission

scenarios a significant increase in rainfall intensity is expected for the 2100s period.

58

Figure 25: Comparison of IDF curves for 12 hour rainfall duration using the minimum,

maximum and mean of the future IDF method results versus the at-site method results for

Durham station. Note: Minimum = HadGEM2 model following RCP4.5; Maximum =

CanRCM4 model following RCP8.5; Mean = the average rainfall intensity of all

climate models and emission scenarios used in this study.

For 12 hour rainfall duration for the Hartington station, the climate model and emission

scenario that produced the lowest intensity of rainfall for the 2100s period was the HadGEM2

model for the RCP4.5 emission scenario (Figure 26). The rainfall intensities for this scenario are

smaller compared to the at-site method for short return periods (2 – 5 years), however the rainfall

intensities for the 10 – 100 year return periods are larger compared to the at-site method rainfall

intensities, therefore under one of the best case lowest emission scenarios an increase in rainfall

intensity is still expected for all return periods for the 2100s period. For the Durham station the

2

3

4

5

6

7

8

9

10

11

12

2 5 15 25 35 45 55 70 90

Rai

nfa

ll In

ten

sity

(m

m/h

r)

Return Period (years)

Durham station: 2100s storm scenarios for 12 hr rainfall

At-site Method

Minimum

Mean

Maximum

59

climate model and emission scenario that produced the largest intensity of rainfall for the 2100s

period was the CanRCM4 model for the RCP4.5 emission scenario. The rainfall intensities for

this scenario are larger compared to the at-site method rainfall intensities for the 2 – 15 year

return periods however significant increases are observed for the 20 – 100 year return periods,

therefore following the RCP4.5 emission scenario a significant increase in rainfall intensity is

expected for the 2100s period.

Figure 26: Comparison of IDF curves for 12 hour rainfall duration using the minimum,

maximum and mean of the future IDF method results versus the at-site method results for

Hartington station. Note: Minimum = HadGEM2 model following RCP4.5; Maximum =

CanRCM4 model following RCP4.5; Mean = the average rainfall intensity of all climate

models and emission scenarios used in this study.

2

4

6

8

10

12

14

16

2 5 15 25 35 45 55 70 90

Rai

nfa

ll In

ten

sity

(m

m/h

r)

Return Period (years)

Hartington station: 2100s storm scenarios for 12 hr rainfall

At-site Method

Minimum

Mean

Maximum

60

Assessing the rainfall outputs for each climate model is of utmost importance to

determine which climate models and emission scenarios should be utilized for research in the

future. The climate models and emission scenarios used in this study showed that different

scenarios can produce a wide range of rainfall intensities for both intermediate and extreme

emission scenarios. In most cases the RCP8.5 emission scenario triggers the largest increase in

rainfall intensities for the 2081 – 2099 period. The RCP4.5 emission scenario is associated with

minimal increases in rainfall intensities for the 2081 – 2099 period, the only exception was for

the 12 hour rainfall duration for Hartington station and in this case the RCP4.5 scenario was

associated with the maximum increase in rainfall intensity. Pinpointing appropriate climate

models and emission scenarios that will accurately represent future conditions will enable the

derivation of IDF curves to be more accurate. Until then, using an ensemble of climate models to

create IDF curves is the most effective method.

6. Discussion

Overall the three regionalization techniques used in this study produced different weather

station clusters. PCA with k-means clustering and Ward’s method created heterogeneous

clusters. However Tabreg clustering method delineated statistically homogenous clusters that

were contiguous. Tabreg clustering method incorporates a non-connectivity parameter in the cost

function to ensure that delineated regions are compact and contiguous. Meanwhile PCA with k-

means clustering and Ward’s method do not incorporate a similar parameter in their objective

function. Therefore using a regionalization technique that includes a factor which accounts for

contiguity, such as Tabreg, is required to cluster stations into statistically homogeneous regions.

61

Furthermore, the at-site method IDF curves and the regional frequency analysis method

IDF curves generated different rainfall values for each cluster. Using different probability

distribution functions for the aforementioned methods resulted in the differences observed for the

rainfall values. For different probability distribution functions the Z-score probability values are

similar for short return periods (2, 5 and 10 year return periods). However for larger return

periods (20, 50 and 100 year return periods) there are large differences in Z-score probability

values. As a result the final IDF curve values for the regional frequency analysis method are

larger than the rainfall values for the at-site method for all stations, especially for large return

periods. There are multiple cases where clusters within the study area have different regional

probability distribution functions compared to the at-site probability distribution function, as a

result differences in rainfall values were observed in other clusters as well.

The rainfall values obtained using the future IDF curve methods showed that the 2050s

time period had slightly larger rainfall values in comparison to the at-site method IDF curve and

the regional frequency analysis method IDF curve. The IPCC emission scenarios describe that

over a short period of time there will not be significant increases in rainfall intensities;

subsequently there were insignificant increases in rainfall values for the 2050s period compared

to the at-site method and regional frequency analysis method IDF curves. However, the 2100s

period produced significantly larger rainfall values for the 20, 50 and 100 year return periods

compared to the at-site method, regional frequency analysis method and the 2050s period. Under

the RCP4.5 and RCP8.5 emission scenarios increases in rainfall intensities are expected over a

large time scale due to the compilation of greenhouse gas emissions that will accumulate by the

2100s period. Consequently extreme rainfall events are expected for larger return periods during

this time period. Therefore practitioners and organizations that are interested in updating

62

intensity-duration-frequency curves in a particular region of southern Ontario can recognize that

accounting for future rainfall extremes for the 2100s period is essential for the design of

hydraulic structures.

7. Conclusion

The IDF curves derived using the at-site method and the regional frequency analysis

method for all stations produced different results. Comparing the at-site method using the GEV

distribution for all sites with the regional frequency analysis method had previously not been

completed in southern Ontario. Initial results showed that the regional frequency analysis method

produced larger rainfall values compared to the at-site method for the 50 and 100 year return

period. For most cases there were minor differences for rainfall intensities for 2 year, 5 year, 10

year and 20 year return periods.

Future IDF curves showed that for the 2050s time period there were minor increases in

rainfall intensities when compared with the at-site method and the regional frequency analysis

method whatever the region. For the 2100s time period there were larger increases in rainfall

intensities compared to the at-site method and the regional frequency analysis method, especially

for larger return periods (50 to 100 years). These results suggest that it is worthwhile for regions

within southern Ontario to update their IDF curves using the future IDF curve technique,

however further investigation to determine the most adequate climate models and emission

scenarios is required.

This study also evaluated the ability of regionalization techniques such as Principal

Component Analysis with k-means, Ward’s method and Tabreg. The regionalization results

indicate that only Tabreg was able to delineate contiguous homogeneous regions. The clusters

63

obtained using PCA with k-means and Ward’s method were heterogeneous and therefore were

not used for the regional IDF development in this study. Furthermore, most regionalization

studies include latitude and longitude as inputs for clustering, and initially these two parameters

were not included. However after including latitude and longitude as input parameters for PCA

with k-means and Ward’s method there was still little improvement in delineating contiguous

homogeneous regions. Future research will attempt to include even more weather stations to see

if homogeneous regions can be delineated using Tabreg with a larger number of stations.

Characterizing rainfall to determine future rainfall intensities in southern Ontario is of

utmost importance to mitigate flooding impacts on society. The study results are consistent with

previous studies and indicate that IDF curves should be updated in southern Ontario. Similar

analysis approach should be applied to other study areas as well to ensure that IDF curves are

created using adequate data and methods.

8. Recommendations

Based on the results found in this study the following recommendations can be made:

Utilizing both global climate models or regional climate models are necessary to capture

the future extreme rainfall trends which will enable accurate 100 year storm designs for

new infrastructure. Using the Representative Concentration Pathways (RCP) emissions

scenarios that represent intermediate to intensive greenhouse gas emissions is effective to

eliminate any redundancies that might be caused by using lower emission scenarios.

Therefore the strategies used in this study should be utilized when creating future IDF

curves for additional weather stations in the study area.

64

Further testing of probability distribution functions to ensure that we are characterizing

extreme rainfall events properly is required using additional tests such as Mann-Kendall,

Quantile-Quantile plots, Kolmogorov-Smirnov goodness of fit test and the Akaike

Information Criterion (Coulibaly et al., 2015).

Explore the possibility of using different parameter weights based on the importance of

each input parameter for Tabreg. Also try using different input parameters for Tabreg

such as extreme rainfall for 200 maximum values, for example.

Since global climate models and regional climate models are continually evolving and

being updated, using the latest version of the CRCM and HadGEM climate models for

the next study is recommended.

65

9. References

Abolverdi, J., and Khalili, D., (2010). Development of Regional Rainfall Annual Maxima for

Southwestern Iran by L-Moments. Water Resources Management, 24(11), 2501 – 2526.

Adamowski, J., Adamowski, K., Bougadis, J., (2010). Influence of Trend on Short Duration

Design Storms. Water Resources Management, 24(3), 401 – 413.

Aguado, D., Montoya, T., Borras, L., Seco, A., and Ferrer, J., (2008). Using SOM and PCA for

analysing and interpreting data from a P-removal SBR. Engineering Applications of

Artificial Intelligence, 21(6), 919–930.

Bharath, R., and Srinivas, V.V., (2015). Regionalization of extreme rainfall in India.

International Journal of Climatology, 35(6), 1142 – 1156.

Bernard, M. M., (1932). Formulas for rainfall intensities of long duration. Transactions of the

American Society of Civil Engineers, 96(1), 592 – 606.

Bernard, E., Naveau, P., Vrac, M., Mestre, O., (2013). Clustering of maxima: Spatial

dependencies among heavy rainfall in France. Journal of Climate, 26(20), 7929 – 7937.

Bozkaya, B., Erkut, E., and Laporte, G., (2003). A tabu search heuristic and adaptive memory

procedure for political districting. European Journal of Operational Research, 144(1),

12 – 26.

Brunetti, M., Maugeri, M., Monti, F., and Nanni, T., (2004). Changes in daily precipitation

frequency and distribution in Italy over the last 120 years. Journal of Geophysical

Research: Atmospheres, 109(D5), 1 – 16.

66

Coulibaly, P., Burn, D.H., Switzman, H., Henderson, J., and Fausto, E., (2015). A Comparison of

Future IDF Curves for Southern Ontario. Technical Report, Toronto Region

Conservation Authority and Essex Region Conservation Authority, 1 – 100.

Coulibaly, P., and Shi, X., 2005. Identification of the Effect of Climate Change on Future Design

Standards of Drainage Infrastructure in Ontario. Technical Report, Ministry of

Transportation of Ontario.

City of Guelph, (2007). Frequency analysis of maximum rainfall and IDF design curve update

(Technical Report). 1– 29.

Clavet-Gaumont, J., Sushama, L., Khaliq, M. N., Huziy, O., and Roy, R., (2013). Canadian RCM

projected changes to high flows for Quebec watersheds using regional frequency

analysis. International Journal of Climatology, 33(14), 2940 – 2955.

Conservation Halton, (2015). August 4th, 2014 Storm Event, Burlington. 1 – 35.

Davies, D. L., and Bouldin, D. W. (1979). A cluster separation measure. IEEE transactions on

pattern analysis and machine intelligence, (2), 224-227.

Dominguez, F., Rivera, E., Lettenmaier, D.,P., and Castro, C. L., (2012). Changes in winter

precipitation extremes for the western United States under a warmer climate as simulated

by regional climate models. Geophysical Research Letters, 39(5), 1 – 7.

Environment and Climate Change Canada, (2014). Canada’s Top 10 Weather Stories for 2013 –

2. Toronto’s Torrent. Climate and Historical Weather, 3.

67

Feng, L., Zhou, T., Wu, B., Li, T., and Luo, J.J., (2011). Projection of future precipitation change

over China with a high-resolution global atmospheric model. Advances in Atmospheric

Sciences, 28(2), 464 – 476.

Fragoso, M., and Tildes Gomes, P., (2008). Classification of daily abundant rainfall patterns an

associated large‐scale atmospheric circulation types in southern Portugal. International

Journal of Climatology, 28(4), 537 – 544.

Hennessy, K. J., Gregory, J. M., and Mitchell, J. F. B., (1997). Changes in daily precipitation

under enhanced greenhouse conditions. Climate Dynamics, 13, 667 – 680.

Hershfield, D.M., (1961). Rainfall Frequency Atlas of the United States for Durations from 30

minutes to 24 hours and Return Periods from 1 to 100 Years. U.S. Weather Bureau,

Technical Paper 40.

Hosking, J.R.M., and Wallis, J.R., (1997). Regional Frequency Analysis An Approach Based on

L-moments. Cambridge University Press.

Huff, F.A., and Angel, J.R., (1989). Frequency Distribution and Hydraulic Climatic

Characteristics of Heavy Rainstorms in Illinois, Bulletin 70, Illinois State Water Survey,

Champaign, Illinois.

Huff, F.A., and Angel, J.R., (1992). Rainfall Frequency Atlas of the Midwest, Bulletin 71,

Midwestern Climate Center and Illinois State Water Service, 1992, 5 – 6.

Hydrological Atlas of Canada, (1978). Fisheries and Environment Canada.

Intergovernmental Panel on Climate Change, (2014). Climate Change 2014: Synthesis Report.

Contribution of Working Groups I, II and III to the Fifth Assessment Report of the

68

Intergovernmental Panel on Climate Change [Core Writing Team, R.K. Pachauri and

L.A. Meyer (eds.)]. IPCC, Geneva, Switzerland, 151.

Jingyi, Z., and Hall, M.J., (2004). Regional flood frequency analysis for the Gan-Ming River

basin in China. Journal of Hydrology, 296, 98 – 117.

Kahya, E., Demirel, M.C., Bég, O.A., (2008). Hydrologic homogeneous regions using monthly

streamflow in Turkey. Earth Sciences Research Journal, 12(2), 181 – 193.

Kansakar, S. R., Hannah, D. M., Gerrard, J., Rees, G., (2004). Spatial pattern in the precipitation

regime of Nepal. International Journal of Climatology, 24(13), 1645 – 1659.

Kyselý, J., Picek, J., and Huth, R., (2007). Formation of homogeneous regions for regional

frequency analysis of extreme precipitation events in the Czech Republic. Studia

Geophysica et Geodaetica, 51(2), 327 – 344.

Lemmen, D.S., Warren, F.J., Lacroix, J., and Bush, E., editors (2008). From Impacts to

Adaptation: Canada in a Changing Climate 2007. Government of Canada, Ottawa,

Ontario.

Lim, Y.H., and Voeller, D.L., (2009). Regional flood estimations in Red River using L-moment

based index-flood and Bulletin 17B procedures. Journal of Hydrologic Engineering,

14(9), 1002 – 1016.

Lin, G.F., and Chen, L.H., (2006). Identification of homogeneous regions for regional frequency

analysis using the self-organizing map. Journal of Hydrology, 324(1), 1 – 9.

69

Mailhot, A., Duchesne, S., Caya, D., and Talbot, G., (2007). Assessment of future change in

intensity-duration-frequency (IDF) curves for Southern Quebec using the Canadian

Regional Climate Model (CRCM). Journal of Hydrology, 347(1), 197 – 210.

Mailhot, A., Beauregard, I., Talbot, G., Caya, D., and Biner, S., (2012). Future changes in

intense precipitation over Canada assessed from multi-model NARCCAP ensemble

simulations. International Journal of Climatology, 32(8), 1151 – 1163.

Malekinezhad, H., and Zare-Garizi, A., (2014). Regional frequency analysis of daily rainfall

extremes using L-moments approach. Atmósfera, 27(4), 411 – 427.

Mallants, D., and Feyen, J., (1990). Defining homogeneous precipitation regions by means of

principal componenet analysis. Journal of Applied Meterology, 29, 892 – 901.

Maraun, D., Osborn, T.J., and Gilleet, N.P., (2008). United Kingdom daily precipitation

intensity: improved early data, error estimates and an update from 2000 to 2006.

International Journal of Climatology, 28(6), 833 – 842.

Marengo, J.A., Ambrizzi, T., da Rocha, R.P., Alves, L.M., Cuadra, S.V., Valverde, M.C., Torres,

R.R., Santos, D.C., Ferraz, S.E.T, (2010). Future change of climate in South America in

the late twenty-first century: Intercomparison of scenarios from three regional climate

models. Climate Dynamics, 35(6), 1089 – 1113.

Mladjic, B., Sushama, L., Khaliq, M.N., Laprise, R., Caya, D., and Roy, R., (2011). Canadian

RCM projected changes to extreme precipitation characteristics over Canada. Journal of

Climate, 24(10), 2565 – 2584.

70

Modarres, R., and Sarhadi, A., (2011). Statistically-based regionalization of rainfall climates of

Iran. Global and Planetary Change, 75(1), 67 – 67.

Muñoz-Diaz, D., and Rodrigo, F.S., (2004). Spatio-temporal patterns of seasonal rainfall in

Spain (1912 – 2000) using cluster and principal component analysis : comparison.

Annales Geophysicae, 22, 1435 – 1448.

Nam, W., Shin, H., Jung, Y., Joo, K., and Heo, J.H., (2015). Delineation of the climatic rainfall

regions of South Korea based on a multivariate analysis and regional rainfall frequency

analyses. International Journal of Climatology, 35(5), 777 – 793.

Ngongondo, C.S., Xu, C.Y., Tallaksen, L.M., Alemaw, B., and Chirwa, T., (2011). Regional

frequency analysis of rainfall extremes in Southern Malawi using the index rainfall and

L-moments approaches. Stochastic Environmental Research and Risk Assessment, 25(7),

939 – 955.

Noto, L., and La Loggia, G., (2009). Use of L-moments approach for regional flood frequency

analysis in Sicily, Italy. Water Resources Management, 23(11), 2207 – 2229.

Olsson, J., Berggren, K., Olofsson, M., and Viklander, M., (2009). Applying climate model

precipitation scenarios for urban hydrological assessment: A case study in Kalmar City,

Sweden. Atmospheric Research, 92(3), 364 – 375.

Paixao, E., Auld, H., Mirza, M.M.Q., Klaassen, J., and Shephard, M.W., (2011). Regionalization

of heavy rainfall to improve climatic design values for infrastructure: case study in

Southern Ontario, Canada. Hydrological Sciences Journal, 56(7), 1067 – 1089.

71

Parida, B.P., and Moalafhi, D.B., (2008). Regional rainfall frequency analysis for Botswana

using L-Moments and radial basis function network. Physics and Chemistry of the Earth,

33, 614 – 620.

Razavi, T., and Coulibaly, P., (2013). Classification of Ontario watersheds based on physical

attributes and streamflow series. Journal of Hydrology, 493, 81 – 94.

Saf, B., (2009). Regional Flood Frequency Analysis Using L Moments for the Buyuk and Kucuk

Menderes River Basins of Turkey. Journal of Hydrologic Engineering, 14(8), 783 – 794.

Schaefer, M.G., Barker, B.L., Taylor, G.H., and Wallis, J.D., (2006). Regional precipitation

frequency analysis and spatial mapping of precipitation for 24-hour and 2-hour durations

in eastern Washington. Washington State Department of Transportation, 1 – 40.

Shephard, M.W., (2011). Updating IDF Climate Design Values for the Atlantic Provinces.

Ministry of Environment, Labour and Justice, Government of Prince Edward Island.

Smithers, J.C., and Schulze R.E., (2001). A methodology for the estimation of short duration

design storms in South Africa using a regional approach based on L-moments. Journal of

Hydrology, 241(1), 42 – 52.

Stathis, D., and Myronidis, D., (2009). Principal Component Analysis of Precipitation in

Thessaly Region (Central Greece). Global NEST Journal, 11(4), 467 – 476.

SUDPLAN 2012: Sustainable Urban Development Planner for Climate Change Adaptation.

Project 247708, Final Report Available online at:

http://sudplan.eu/polopoly_fs/1.30418!/SUDPLAN_final.pdf

72

Trefry, C.M., Watkins Jr., D.W., and Johnson, D., (2005). Regional Rainfall Frequency Analysis

for the State of Michigan. Journal of Hydrologic Engineering, 10(6), 437 – 449.

Unal, Y., Kindap, T., and Karaca, M., (2003). Redefining the climate zones of Turkey using

cluster analysis. International Journal of Climatology, 23(9), 1045 – 1055.

Urrutia, R., and Vuille, M., (2009). Climate change projections for the tropical Andes using a

regional climate model: Temperature and precipitation simulations for the end of the 21st

century. Journal of Geophysical Research Atmospheres, 114(2), 1 – 15.

Willems, P., Arnbjerg-Nielsen, K., Olsson, J., and Nguyen, V.T.V., (2012). Climate change

impact assessment on urban rainfall extremes and urban drainage: methods and

shortcomings. Atmospheric Research, 103, 106 – 118.

Xuejie, G., Zongci, Z., Yihui, D., Ronghui, H., and Giorgi, F., (2001). Climate change due to

greenhouse effects in China as simulated by a regional climate model. Advances in

Atmospheric Sciences, 18(6), 1224 – 1230.

Yiannakoulias, N., Rosychuk, R.J., and Hodgson, J., (2007). Adaptations for finding irregularly

shaped disease clusters. International Journal of Health Geographics, 6.

Yiannakoulias, N., and Bland, W., (2016). Tabreg: Software for Regionalization

Problems. Health Geomatics Lab, School of Geography and Earth Sciences, McMaster

University.

73

Appendix A

Station Name Latitude Longitude Elevation (meters)

Belleville 44.15 -77.39 76.2

Bloomfield 43.98 -77.22 91.4

Bowmanville MOSTERT 43.92 -78.67 99.1

Brantford MOE 43.13 -80.23 196.0

Burketon Mclaughlin 44.03 -78.80 312.4

Chatham 42.39 -82.22 180.0

Chatsworth 44.40 -80.91 305.0

Cressy 44.10 -76.85 83.8

Durham 44.18 -80.82 384.0

Fergus Shand Dam 43.73 -80.33 417.6

Foldens 43.02 -80.78 328.0

Fort Erie 42.88 -78.97 179.8

Frechmans Bay 43.82 -79.08 76.2

Georgetown WWTP 43.64 -79.88 221.0

Glen Allan 43.68 -80.71 400.0

Glen Haffy Mono Mills 43.93 -79.95 434.3

Hagersville 42.97 -80.07 221.0

Hamilton Airport 43.17 -79.93 237.7

Hartington IHD 44.43 -76.69 160.0

Kingston Pumping Station 44.24 -76.48 76.5

Kingsville MOE 42.04 -82.67 200.0

London International Airport 43.03 -81.15 278.0

Millgrove 43.32 -79.97 255.1

New Glasgow 42.51 -81.64 198.1

Oakville Southeast WPCP 43.48 -79.63 86.9

Orangville MOE 43.92 -80.09 411.5

Oshawa WPCP 43.87 -78.83 83.8

74

Owen Sound MOE 44.58 -80.93 178.9

Petrolia Town 42.88 -82.17 201.2

Port Colborne 42.88 -79.25 175.3

Proton Station 44.17 -80.52 480.1

Richmond Hill 43.88 -79.45 240.0

Ridgeville 43.04 -79.33 236.2

Sarnia Airport 42.99 -82.30 180.6

St. Catharines Power Glen 43.12 -79.25 121.9

Stratford WWTP 43.37 -81.00 345.0

Thornbury Slama 44.57 -80.49 213.4

Thornhill Grandview 43.80 -79.42 199.3

Tillsonburg WWTP 42.86 -80.72 213.4

Toronto Island Airport 43.63 -79.40 76.5

Toronto Pearson Airport 43.68 -79.63 173.4

Trenton Airport 44.12 -77.53 86.3

Vineland Rittenhouse 43.17 -79.42 94.5

Waterloo Wellington 43.45 -80.38 317.0

Welland 42.99 -79.26 175.3

Wiarton Airport 44.75 -81.11 222.2

Windsor Airport 42.28 -82.96 189.6

Woodstock 43.14 -80.77 281.9

Table A1: Geographic characteristics of each weather station.

75

Appendix B

Environment Canada Flagged Data

Station Name T C A E F "blank" cells M

Belleville 1750 2 2 92 0 0 0

Bloomfield 493 50 43 158 5 31 8

Bowmanville MOSTERT 668 2 2 28 0 62 1

Brantford MOE 579 17 8 98 6 92 24

Burketon Mclaughlin 494 7 2 86 6 59 18

Chatham 382 24 26 32 0 62 0

Chatsworth 901 11 10 38 1 0 0

Cressy 814 7 7 29 0 424 8

Durham 1455 99 86 153 1 152 43

Fergus Shand Dam 751 1 1 15 0 91 0

Foldens 1241 6 4 60 0 0 0

Fort Erie 1100 19 20 75 0 155 41

Frechmans Bay 1191 17 16 5 0 120 0

Georgetown WWTP 867 17 16 36 2 243 9

Glen Allan 957 1 0 13 1 62 0

Glen Haffy Mono Mills 458 31 19 137 9 242 12

Hagersville 1039 8 8 28 0 0 0

Hamilton Airport 1686 0 0 0 0 60 0

Hartington IHD 944 10 8 33 1 31 1

Kingston Pumping Station 1186 1 1 4 0 31 1

Kingsville MOE 1210 29 30 65 1 31 6

London International

Airport 1751 0 0 3 0 20 7

Millgrove 491 18 17 152 4 153 96

New Glasgow 123 13 12 37 0 61 1

Oakville Southeast WPCP 418 33 18 183 10 304 50

Orangville MOE 647 10 7 73 3 62 10

Oshawa WPCP 656 2 2 32 0 62 0

Owen Sound MOE 656 10 5 95 4 30 1

76

Petrolia Town 354 12 11 41 0 31 31

Port Colborne 534 8 7 35 2 181 1

Proton Station 644 1 1 16 0 0 0

Richmond Hill 1314 0 0 4 0 61 0

Ridgeville 762 41 29 154 7 124 23

Sarnia Airport 1397 0 0 4 0 0 6

St. Catharines Power Glen 735 65 45 221 11 241 86

Stratford WWTP 1372 1 1 14 0 31 0

Thornbury Slama 712 12 11 48 0 0 0

Thornhill Grandview 911 5 4 24 1 31 0

Tillsonburg WWTP 772 35 27 83 4 28 12

Toronto Island Airport 1439 0 0 12 0 22 163

Toronto Pearson Airport 2319 0 0 0 0 60 0

Trenton Airport 1973 0 0 0 0 55 0

Vineland Rittenhouse 666 6 5 16 0 0 0

Waterloo Wellington

Airport 1388 0 0 0 0 60 0

Welland 945 2 2 30 0 242 8

Wiarton Airport 1319 0 0 0 0 178 3

Windsor Airport 1872 0 0 0 0 67 0

Woodstock 1049 8 6 84 2 59 11

Table B1: List of flagged data found for each weather station. Note: The definition for each

flagged value is provided in the table below

77

Flag Definition Replaced Value

A Accumulated amount; previous value was C. 0 mm

C Precipitation occurred, amount uncertain. 0 mm

E Estimated. Flag was removed; daily precipitation

value was provided by EC.

F Accumulated and estimated. Flag was removed; daily precipitation

value was provided by EC.

M Missing. The linear regression value from nearest

neighbouring station was used.

T Trace. 0.2 mm

“blank” cells Missing. The linear regression value from nearest

neighbouring station was used.

Table B2: Definition of each flag data that was found in the daily precipitation data. The method

that was used to replace each of the flag data is indicated in the table. Note: EC is an acronym for

Environment Canada

78

Appendix C

Climate Model Spatial Resolution Scenario Temporal

Resolution Simulation Period

CanRCM4-

CanEMS2 40 km

Historical 1 hour 1970 – 2000

RCP 4.5 1 hour 2006 – 2100

RCP 8.5 1 hour 2006 – 2100

HadGEM2-ES

1.25 × 1.875

degrees

(approximately

120 km x 139 km)

Historical 3 hour 1970 – 2000

RCP 4.5 3 hour 2026 – 2045;

2081 – 2099

RCP 8.5 3 hour 2026 – 2045;

2081 – 2099

Table C1: Data and resolution of the CanRCM4-CanEMS2 and HadGEM2-ES climate models

used.

79

Appendix D

Site 15 Glen Allan: At-site Method IDF Curve Results

Return Period 1 hour 2 hours 3 hours 6 hours 12 hours 24 hours

2 years 24.5 15.3 11.6 6.6 3.7 2.3

5 years 31.9 19.9 15.1 8.6 4.8 3.0

10 years 37.0 23.0 17.6 10.0 5.6 3.4

20 years 42.1 26.2 20.0 11.4 6.4 3.9

50 years 49.0 30.5 23.2 13.3 7.4 4.5

100 years 54.4 33.9 25.8 14.7 8.3 5.0

Table D1: IDF values for Glen Allan station using the at-site method.

Site 15 Glen Allan: RFA Method IDF Curve Results

Return Period 1 hour 2 hours 3 hours 6 hours 12 hours 24 hours

2 years 24.3 15.1 11.5 6.6 3.7 2.2

5 years 31.6 19.7 15.0 8.5 4.8 2.9

10 years 37.2 23.2 17.7 10.1 5.7 3.4

20 years 43.5 27.1 20.6 11.8 6.6 4.0

50 years 53.3 33.1 25.2 14.4 8.1 4.9

100 years 62.1 38.6 29.4 16.8 9.4 5.8

Table D2: IDF values for Glen Allan station using the regional frequency analysis method.

80

Site 39 Tillsonburg: At-site Method IDF Curve Results

Return Period 1 hour 2 hours 3 hours 6 hours 12 hours 24 hours

2 years 24.8 15.5 11.8 6.7 3.8 2.3

5 years 31.0 19.3 14.7 8.4 4.7 2.9

10 years 35.2 21.9 16.7 9.5 5.3 3.3

20 years 39.4 24.5 18.7 10.7 6.0 3.6

50 years 45.1 28.0 21.4 12.2 6.8 4.2

100 years 49.4 30.8 23.4 13.4 7.5 4.6

Table D3: IDF values for Tillsonburg station using the at-site method.

Site 39 Tillsonburg: RFA Method IDF Curve Results

Return Period 1 hour 2 hours 3 hours 6 hours 12 hours 24 hours

2 years 24.2 15.1 11.5 6.6 3.7 2.2

5 years 31.6 19.6 15.0 8.5 4.8 2.9

10 years 37.2 23.1 17.6 10.1 5.6 3.4

20 years 43.4 27.0 20.6 11.7 6.6 4.0

50 years 53.2 33.1 25.2 14.4 8.1 4.9

100 years 62.0 38.6 29.4 16.8 9.4 5.7

Table D4: IDF values for Tillsonburg station using the regional frequency analysis method.

81

Site 43 Vineland Rittenhouse: At-site Method IDF Curve Results

Return Period 1 hour 2 hours 3 hours 6 hours 12 hours 24 hours

2 years 23.0 14.3 10.9 6.2 3.5 2.1

5 years 28.5 17.7 13.5 7.7 4.3 2.6

10 years 31.7 19.8 15.1 8.6 4.8 2.9

20 years 34.7 21.6 16.5 9.4 5.3 3.2

50 years 38.3 23.8 18.1 10.3 5.8 3.5

100 years 40.7 25.3 19.3 11.0 6.2 3.8

Table D5: IDF values for Vineland Rittenhouse station using the at-site method.

Site 43 Vineland Rittenhouse: RFA Method IDF Curve Results

Return Period 1 hour 2 hours 3 hours 6 hours 12 hours 24 hours

2 years 22.2 13.8 10.5 6.0 3.4 2.1

5 years 28.7 17.9 13.6 7.8 4.4 2.7

10 years 33.2 20.7 15.8 9.0 5.0 3.1

20 years 37.9 23.6 18.0 10.2 5.8 3.5

50 years 44.2 27.5 21.0 12.0 6.7 4.1

100 years 49.2 30.6 23.3 13.3 7.5 4.6

Table D6: IDF values for Vineland Rittenhouse station using the regional frequency analysis

method.

82

Site 17 Hagersville: At-site Method IDF Curve Results

Return Period 1 hour 2 hours 3 hours 6 hours 12 hours 24 hours

2 years 25.2 15.7 12.0 6.8 3.8 2.3

5 years 32.2 20.1 15.3 8.7 4.9 3.0

10 years 37.0 23.0 17.6 10.0 5.6 3.4

20 years 41.8 26.0 19.8 11.3 6.3 3.9

50 years 48.2 30.0 22.8 13.0 7.3 4.5

100 years 53.1 33.0 25.2 14.4 8.1 4.9

Table D7: IDF values for Hagersville station using the at-site method.

Site 17 Hagersville: RFA Method IDF Curve Results

Return Period 1 hour 2 hours 3 hours 6 hours 12 hours 24 hours

2 years 24.7 15.4 11.7 6.7 3.8 2.3

5 years 32.2 20.0 15.3 8.7 4.9 3.0

10 years 37.7 23.5 17.9 10.2 5.7 3.5

20 years 43.5 27.1 20.6 11.8 6.6 4.0

50 years 51.8 32.2 24.5 14.0 7.9 4.8

100 years 58.6 36.5 27.8 15.8 8.9 5.4

Table D8: IDF values for Hagersville station using the regional frequency analysis method.

83

Appendix E

Durham 3 hour rainfall IDF Curve Comparison

Return Period At-site Method RFA Method 2050s Period 2100s period

2 years 11.5 11.5 12.7 14.8

5 years 14.3 14.6 16.3 18.5

10 years 16.2 16.6 18.3 20.8

20 years 18.0 18.5 20.1 22.9

50 years 20.3 20.9 22.2 25.6

100 years 22.0 22.7 23.5 27.7

Table E1: IDF values for Durham station (3 hour rainfall) using the at-site method, regional

frequency analysis method and future IDF curve development method.

Durham 12 hour rainfall IDF Curve Comparison

Return Period At-site Method RFA Method 2050s Period 2100s period

2 years 3.7 3.7 4.0 4.5

5 years 4.6 4.7 5.1 5.8

10 years 5.2 5.3 5.6 6.7

20 years 5.8 5.9 6.1 7.5

50 years 6.5 6.7 6.8 8.5

100 years 7.1 7.3 7.3 9.3

Table E2: IDF values for Durham station (12 hour rainfall) using the at-site method, regional

frequency analysis method and future IDF curve development method.

84

Hartington 3 hour rainfall IDF Curve Comparison

Return Period At-site Method RFA Method 2050s Period 2100s period

2 years 10.3 9.9 11.6 12.2

5 years 12.5 13.2 14.5 16.4

10 years 13.9 15.1 16.0 19.4

20 years 15.2 16.6 17.4 22.5

50 years 16.7 18.0 19.0 26.9

100 years 17.8 18.9 20.1 30.5

Table E3: IDF values for Hartington station (3 hour rainfall) using the at-site method, regional

frequency analysis method and future IDF curve development method.

Hartington 12 hour rainfall IDF Curve Comparison

Return Period At-site Method RFA Method 2050s Period 2100s period

2 years 3.3 3.2 3.6 3.7

5 years 4.0 4.2 4.3 4.6

10 years 4.5 4.8 4.8 5.4

20 years 4.9 5.3 5.1 6.2

50 years 5.4 5.8 5.6 7.7

100 years 5.7 6.0 5.9 9.0

Table E4: IDF values for Hartington station (12 hour rainfall) using the at-site method, regional

frequency analysis method and future IDF curve development method.