Application of Principal Component Analysis in Grouping Geomorphic Parameters for Hydrologic...

15
Water Resour Manage (2009) 23:325–339 DOI 10.1007/s11269-008-9277-1 Application of Principal Component Analysis in Grouping Geomorphic Parameters for Hydrologic Modeling P. K. Singh · Virendra Kumar · R. C. Purohit · Mahesh Kothari · P. K. Dashora Received: 24 November 2006 / Accepted: 23 May 2008 / Published online: 13 June 2008 © Springer Science + Business Media B.V. 2008 Abstract Principal component analysis has been applied to thirteen dimension- less geomorphic parameters for sixteen watersheds of the Chambal catchment of Rajasthan, India, in order to group the parameters under different components based on significant correlations. Results of the principal component analysis clearly revealed that first two principal components are strongly correlated with some of the geomorphic parameters. However, the third principal component is not found to be strongly correlated with any of the parameters but is moderately correlated with stream length ratio and bifurcation ratio. Furthermore, on the basis of the results, it is evident that some parameters are highly correlated with components but the parameters of hypsometric integral and drainage factor could not be grouped with any of the component because of its poor correlation with them. The principal component loadings matrix obtained using correlation matrix of ten parameters reveals that first three components together account for 87.01% of the total explained variance. Therefore, principal component lading matrix is applied in order to get P. K. Singh (B ) Department of Soil and Water Engineering, College of Technology and Engineering, Udaipur 313 001, India e-mail: [email protected] V. Kumar College of Technology and Engineering, Udaipur, India R. C. Purohit · M. Kothari Department of SWE, CTAE Udaipur, India P. K. Dashora Department of Statistics, RCA, Udaipur, India

Transcript of Application of Principal Component Analysis in Grouping Geomorphic Parameters for Hydrologic...

Water Resour Manage (2009) 23:325–339DOI 10.1007/s11269-008-9277-1

Application of Principal Component Analysisin Grouping Geomorphic Parametersfor Hydrologic Modeling

P. K. Singh · Virendra Kumar · R. C. Purohit ·Mahesh Kothari · P. K. Dashora

Received: 24 November 2006 / Accepted: 23 May 2008 /Published online: 13 June 2008© Springer Science + Business Media B.V. 2008

Abstract Principal component analysis has been applied to thirteen dimension-less geomorphic parameters for sixteen watersheds of the Chambal catchment ofRajasthan, India, in order to group the parameters under different componentsbased on significant correlations. Results of the principal component analysis clearlyrevealed that first two principal components are strongly correlated with some ofthe geomorphic parameters. However, the third principal component is not foundto be strongly correlated with any of the parameters but is moderately correlatedwith stream length ratio and bifurcation ratio. Furthermore, on the basis of theresults, it is evident that some parameters are highly correlated with componentsbut the parameters of hypsometric integral and drainage factor could not be groupedwith any of the component because of its poor correlation with them. The principalcomponent loadings matrix obtained using correlation matrix of ten parametersreveals that first three components together account for 87.01% of the total explainedvariance. Therefore, principal component lading matrix is applied in order to get

P. K. Singh (B)Department of Soil and Water Engineering,College of Technology and Engineering,Udaipur 313 001, Indiae-mail: [email protected]

V. KumarCollege of Technology and Engineering,Udaipur, India

R. C. Purohit · M. KothariDepartment of SWE, CTAE Udaipur, India

P. K. DashoraDepartment of Statistics, RCA, Udaipur, India

326 P.K. Singh et al.

better correlations and clearly grouped the parameters in physically significantcomponents.

Keywords Principal component analysis · Geomorphic parameters · Watershed ·Rajasthan

1 Introduction

Hydrologic modeling is basically a tool for prediction of hydrologic behavior of abasin. Multivariate analysis is a collection of procedure for analyzing associationbetween two or more set of variables that have been made on each object in oneor more samples of object. Different approaches have been made in the past to inter-correlate the variables involving geomorphologic, geological, hydro-meteorologicalcharacteristics of the watersheds. Many of the geomorphic parameters are knownto be strongly correlated. There is considerable amount of redundancy in thearray of geomorphic parameters currently in use. The collection and analysis ofgeomorphological data are often time-consuming, there is clearly the need to reducethe number of parameters to a few that adequately simulate the drainage basinmorphology. Such a screening will no doubt result in considerable saving in timeand expenses involved in deriving geomorphological parameters from topographicalmaps. It will also provide a more rational basis for a multi-dimensional classificationof drainage basins, which can form the basis for regional analysis and an objectiveselection of representative basins.

The screening of such a large number of interrelated variables for their under-lying dimensions is best achieved by multivariate statistical techniques of principalcomponent analysis. The principal component analysis breaks down the battery ofintercorrelations among variables into a set of uncorrelated factors, these togethersummarize the data in the original matrix and explain the underlying relations andinfluences among the variables. The identification of these underlying dimensionswill not only simplify future morphometric works but also provide criteria for anobjective multi-dimensional morphological classification of drainage basins.

Principal component analysis is in fact an analysis of reduced space in which anattempt is made to find a smaller number of dimensions that retain most of theinformation in the original space. Therefore, in this study an attempt has been madeto study the intercorrelationship (multicollinearity) among the variables in orderto screen out the less significant variables out of the analysis and to arrange theremaining into physically significant groups by applying principal component analysisfor better interpretability.

Principal component analysis by reducing a large number to a small numberof principal components ensures economy in the use of large volume of data. Aprincipal component conveys all essential information about the variables, ensuringeconomy in analysis and description while obtaining relatively accurate results(Sharma 2002).

It is evident from literature that multicollinearity exists between different ge-omorphic parameters. In the recent past few attempts have been made to solve

Application of principal component analysis 327

the problem of interdependency among hydrologic and geomorphic parameters(Ravichandrana et al. 1996; Montes-Botella and Tenorio 2003; Mrklas et al. 2006;Magner and Brooks 2007).

In developing a model from mean annual flood in New England, Wong (1979)utilized a multivariate statistical technique and component analysis in analyzingthe effects of twelve basins and climotological parameters. He found that principalcomponents to main stream length and average slope are orthogonal to each otherand at the same time covaried with mean annual flood, and therefore, selected newvariables, i.e., principal components as surrogates of stream size and slope and usedthem as predictors of mean annual flood.

Bagley et al. (1964), Haan and Read (1970), Haan and Allen (1972), Decourseyand Deal (1974), and Pondzic and Trninic (1992) have also demonstrated the useof multiple regression analysis and principal component analysis for developmentof hydrological prediction equation involving geomorphic parameters. Mishra andSatyanatayana (1988) carried out principal component analysis with varimax rotationon the ten geomorphic parameters of Damodar Valley catchment of India andconcluded that nine parameters could be significantly grouped into three components(steepness, shape and geological). Bifurcation ratio was found to be less significantin explaining the component variance. Kumar and Satyanarayana (1993) carried outprincipal component analysis for eastern red soil region of the India and concludedthat circulatory ratio, ruggedness number and drainage factor have been found nonsignificant for explaining the component variance. Tamene et al. (2006) appliedprincipal component analysis to analyze the relationship between sediment yieldand catchment characteristics and to determine the major factors controlling thevariability of sediment yield for 11 catchments of northern Ethiopia. The results showthat terrain form, gully erosion, surface lithology, and land cover explain most of thevariability in sediment yield among the catchments.

1.1 Study Area and Data Source

Sixteen watersheds were chosen for the present analysis. These watersheds arelocated in Chambal catchment of Rajasthan, India (Fig. 1). The entire Chambalcatchment lies between a longitude of 74◦45′ to 75◦50′ E and latitude of 23◦30′to 25◦10′ N and spread over the states of Madhya Pradesh and Rajasthan. TheChambal river which originates from Vindhya hills flows into Madhya Pradesh andfinally drains through Rajasthan. It has immense potentiality of water harvest. Fourlarge dams, namely Gandhi Sagar in Madhya Pradesh, Ranapratap Sagar, JawaharSagar and Kota Barrage in Rajasthan, constructed on river Chambal, have a totalcatchment area of about 27000 sq. km. A large portion of catchment area, about22850 sq. km is however in Madhya Pradesh and the rest of which is only about4150 sq. km falls in the state of Rajasthan.

The climate of the region in general is of sub-humid with an annual averagerainfall varying from about 650 mm to 850 mm. In general the topography of theregion is undulating. The predominant slope in the hilly region ranges between 15 to40 percent and in the valley portion it varies from almost level to 10 percent. Erosionproblem in the catchments is quite intensive because of undulating topography,

328 P.K. Singh et al.

Fig. 1 Selected watershedsof the Chambal catchment

misuse of land and degradation of forest. Sheet and gully erosion are common inthe area especially near the river.

2 Methodology

2.1 Geomorphic Parameters

Watershed characteristics play a vital role on the hydrologic responses of watersheds,and therefore, a number of parameters which signify the watershed characteristicsare evaluated from the toposheets. The length parameters are measured with thehelp of a map measurer and area with the help of a planimeter. Ultimately these areused to obtain following 13 dimensionless parameters known as geomorphic para-meters for the 16 watersheds of the Chambal catchment of Rajasthan, India. Singh(1992) and Singh (2000) also specify the important geomorphological characteristicsof the watersheds. Thirteen salient parameters are selected in this study, which isbased on the work conducted at Domodar Valley catchment, India (Kumar 1991).

Application of principal component analysis 329

1. Average slope of the watershed (Sa) is determined using the following relation-ship (Kumar 1991).

Sa = H Lca/(10 A) (1)

Where, Sa = average slope of the watershed, (%), H = maximum watershedrelief (Elevation difference between most remote point to the outlet, m)

Lca = Average length of the contour in km

=n∑

i=1

Lci/

n (2)

Where, Lca = length of each contour, km, n = number of identifiable contours,and A = drainage area of the watershed, sq km.

2. Relief ratio (Rr) is determined as the maximum watershed relief divided by thelength of main stream.

Rr = HL

(3)

Where, H = maximum watershed relief (Elevation difference between mostremote point to the outlet, m) and L = length of main stream.

3. Relative relief (RR) is determined as the ratio of the maximum watershed reliefto the perimeter of the watershed.

RR = HP

× 100 (4)

Where, H = maximum watershed relief (Elevation difference between mostremote point to the outlet, m) and P = perimeter of the watershed.

4. Main stream channel slope (Sc), expressed in percent, is obtained by measuringthe catchment area under the actual longitudinal profile of the main streamchannel from gauge to divide using the following expression (Kumar 1991).

Sc = Area Under the curve5L2

(5)

Where, L= length of main stream, km.5. Elongation ratio (Re) is determined as the ratio between the diameter of a circle

with the same area as the watershed and the maximum length of the watershed.

Re = Dc

L(6)

Where, Dc = diameter of a circle having same area as the watershed, L =maximum length of watershed.

6. Basin shape factor (Sb ) is determined as the ratio between the square of themaximum length of the watershed and the area of the watershed.

Sb = square of the maximum length of the watershedarea of the watershed

(7)

7. Length–width ratio (Lb /Lw) is the ratio of the maximum length to the width ofthe watershed. Width is measured at the mid point of the longest stream length.

330 P.K. Singh et al.

8. Stream length ratio (Rl) is the ratio between the mean length of a streamof a particular order and the mean stream length of the next higher order(Singh 2000).

Rl = L̄u

L̄u − 1(8)

Where, L̄u = mean length of stream of order u and L̄u − 1 = mean length ofstream of next lower order.

9. Bifurcation ratio (Rb ) is the ratio of the number of streams of a particular orderto the number of streams of the next higher order (Singh 2000).

Rb = Nu

Nu + 1(9)

Where, Nu = number of stream of u order, Nu + 1 = number of stream of u + 1order.

10. Hypsometric analysis of drainage basin is carried out to develop the relationshipbetween horizontal cross-sectional drainage basin area and the elevation. Inanalysis, a curve is derived by plotting the relative heights (h/H) and relativeareas (a/A); the obtained curve is called as hypsometric curve (Suresh 1997).The shape of the hypsometric curve varies in early geologic stages of develop-ment of the drainage basin, but once a steady state is attained it tends to varylittle despite lowering relief (Kumar 1991; Suresh 1997).

11. Circulatory ratio (Rc) is the ratio of circumference of a circle of same area asthe watershed to the perimeter of the watershed.

Rc = Au

Ac(10)

Where, Au = area of the watershed and Ac = area of circle having equalperimeter as the perimeter of the watershed.

12. Ruggedness number (RN) is the drainage density times the maximum basinrelief. The drainage density (Dd) is an important indicator of the linear scale ofland-form elements in stream eroded topography and simply the ratio of totalchannel segment lengths cumulated for all orders within a basin to the basinarea.

RN = HDd/

1000 (11)

Where, H = watershed relief and Dd = drainage density.13. Drainage factor (Df ) is the ratio of stream frequency to the square of drainage

density. Stream frequency (Fs) is the ratio of total number of streams of allorder to the basin area.

Df = Fs

D2d

(12)

The evaluated values of these geomorphic parameters are shown in Table 1.

Application of principal component analysis 331

Tab

le1

Sele

cted

dim

ensi

onle

ssge

omor

phic

para

met

ers

Wat

ersh

edS a

Re

Rc

S bR

rR

RR

NS c

Hsi

Df

Rl

Rb

Lb/L

w

W1

1.85

00.

532

0.73

14.

493

0.00

390.

0017

0.08

20.

204

0.54

10.

608

2.06

23.

580

2.84

8W

212

.940

0.87

10.

890

2.13

20.

0481

0.01

760.

715

1.29

60.

330

0.50

21.

108

3.26

01.

650

W3

1.62

50.

782

0.75

62.

079

0.00

470.

0014

0.06

40.

314

0.58

00.

718

1.17

73.

651

1.39

8W

41.

182

0.77

20.

706

2.13

00.

0033

0.00

090.

039

0.25

90.

690

0.72

00.

347

6.23

71.

845

W5

6.17

80.

422

0.59

17.

125

0.00

950.

0042

0.48

60.

397

0.62

50.

653

1.91

73.

664

6.29

5W

61.

672

0.67

50.

853

2.78

90.

0041

0.00

160.

042

0.31

00.

654

0.81

32.

673

4.78

62.

515

W7

2.74

40.

791

0.74

22.

002

0.01

440.

0043

0.31

90.

944

0.49

40.

677

2.45

25.

042

1.65

5W

86.

832

0.79

60.

870

2.00

40.

0019

0.00

660.

449

0.93

70.

422

0.44

91.

607

4.54

61.

465

W9

0.62

20.

776

0.89

22.

110

0.00

510.

0019

0.04

90.

375

0.54

01.

175

0.87

33.

359

1.69

4W

101.

058

0.79

10.

834

2.03

00.

0052

0.00

180.

074

0.56

10.

474

0.73

21.

817

3.68

11.

460

W11

1.30

20.

784

0.89

22.

071

0.00

760.

0027

0.04

60.

946

0.63

81.

202

0.95

84.

123

1.56

7W

122.

771

0.76

00.

923

2.20

00.

0082

0.00

320.

051

0.47

80.

481

0.57

01.

463

2.64

51.

641

W13

2.83

00.

746

0.87

62.

286

0.00

940.

0035

0.11

50.

601

0.44

80.

748

0.66

33.

207

1.79

3W

140.

593

0.63

30.

753

3.17

40.

0056

0.00

210.

038

0.33

60.

237

0.69

91.

068

7.99

82.

181

W15

1.12

60.

994

0.72

81.

287

0.00

580.

0014

0.04

20.

560

0.47

20.

748

1.04

12.

801

1.40

8W

162.

088

0.94

90.

904

1.29

50.

0131

0.00

400.

054

0.59

20.

369

0.75

51.

207

2.65

40.

933

332 P.K. Singh et al.

2.2 Principal Component Analysis

The method of principal components or component analysis is based upon theearly work of Pearson with the specific adaptations to principal component analysissuggested by the work of Hotelling (1933). The geomorphic parameters are usuallymany times correlated. The correlation indicates that some of the informationcontained in one variable is also contained in some of the other remaining variables.More specifically, the first principal component is that linear combination of theoriginal variables which contributes a maximum to their total variance; the secondprincipal component, uncorrelated with the first, contributes a maximum to theresidual variance, and so on until the total variance is analyzed. Since the method isso dependent on the total variance of the original variables, it is most suitable whenall the variables are measured in the same units. Hence, it is customary to express thevariables in standard form, i.e., to select the unit of measurement for each variable sothat its sample variance is one. Then, the analysis is made on the correlation matrix,with the total variance equal to n. The objectives are achieved in two steps:

Step 1 Calculate the correlation matrix, RStep 2 Calculate the principal component loading matrix by principal component

analysis.Step 3 In the principal component (PC) loading matrix, Eigen value greater than

one indicates significant PC loading.

Eigen value indicated how well each of the identified factors fit the data from all thegeomorphic parameters on all the principal components.

2.3 Correlation Matrix

The inter-correlation matrix of the geomorphic parameters is obtained by using thefollowing procedure:

1. The parameters are standardized:

X = (xij − x j

)/S j (13)

where

x denotes the matrix of standardized parametersxij ith observation on jth parameter

i 1, ......, N (Number of Observations)j 1, ......, P (Number of Parameters)

x j mean of the jth parameterS j standard deviation of the jth parameter

2. The correlation matrix of parameters is the minor product moment of thestandardized predictor measures divided by N and is given by

R = (x′ × x

)/N (14)

where, x’ denotes the transpose of the standardized matrix of predictorparameters.

Application of principal component analysis 333

2.4 Principal Component Loading Matrix

The principal component loading matrix which reflects how much a particularparameter is correlated with different factors, is obtained by premultiplying the char-acteristic vector with the square root of the characteristic values of the correlationmatrix.

Thus, A = Q × D0.5 (15)

where

A principal component loading matrix,Q characteristic vector of the correlation matrix,D characteristic value of the correlation matrix

3 Results and Discussion

The correlation matrix (Table 2) of the thirteen selected geomorphic parametersreveals that strong correlations (correlation coefficient more than 0.9) exist betweenbasin shape factor (Sb) and length width ratio (Lb/Lw), between average slope ofthe watershed (Sa) and ruggedness number (RN), between average slope of thewatershed (Sa) and relative relief (RR), between relief ratio (Rr) and relative reliefand elongation ratio (Re) and basin shape factor (Sb). Also, good correlations(correlation coefficient more than 0.75) exist between RR and RN, Sa and Rr andbetween Rr and main stream channel slope (Sc). Some more moderately correlatedparameters (correlation coefficient more than 0.6) are Sa with Sc, Rr with RN, Rr

with Sc and RN with Sc. It is very difficult at this stage to group the parameters intocomponents and attach any physical significance because some parameters like Df

and Hsi do not show any significant correlation with any of the parameters. Hence,in the next step, the principal component analysis has been applied. The correlationmatrix is subjected to the principal component analysis.

The principal component loading matrix obtained from correlation matrix(Table 3) reveals that the first three components whose Eigen values are greaterthan one, together account for about 77.26% of the total explained variance. Thefirst component is strongly correlated (loadings of more than 0.8) with relative relief,relief ratio, average slope and main stream channel slope and moderately (loadingsof more than 0.6) with ruggedness number, which may be termed as a slope orsteepness component. The second component is strongly correlated with basin shapefactor, length width ratio and elongation ratio but moderately with circulatory ratioand can be termed as shape component. The third component does not stronglycorrelate with any geomorphic parameters but moderately correlates with streamlength ratio and bifurcation ratio (loadings of more than 0.6) and may be termed asdrainage component. It is evident from these results that some parameters are highlycorrelated with components but the parameters Hsi and Df could not be groupedwith any of the components because of its poor correlation with them.

In order to screen out parameters having less significance in explaining thecomponent variance, the parameter Df is first screened out from the analysis. Then,the correlation matrix and principal component loading matrix are obtained for

334 P.K. Singh et al.

Tab

le2

Inte

rcor

rela

tion

mat

rix

ofth

ese

lect

edge

omor

phic

para

met

ers

Par

amet

erS a

Re

Rc

S bR

rR

RR

NS c

Hsi

Df

Rl

Rb

Lb/L

w

S a1.

000

0.00

50.

094

0.17

90.

801

0.94

40.

945

0.67

6−0

.294

−0.5

470.

124

−0.2

120.

211

Re

0.00

51.

000

0.51

9−0

.915

0.25

10.

196

−0.1

070.

453

−0.2

930.

096

−0.2

63−0

.304

−0.8

12R

c0.

094

0.51

91.

000

−0.6

430.

226

0.27

8−0

.103

0.40

6−0

.299

0.21

4−0

.096

−0.3

38−0

.657

S b0.

179

−0.9

12−0

.643

1.00

0−0

.080

−0.0

410.

299

−0.3

400.

254

−0.1

690.

276

0.10

70.

952

Rr

0.80

10.

251

0.22

6−0

.080

1.00

00.

923

0.70

50.

691

−0.4

08−0

.275

−0.0

31−0

.223

−0.0

51R

R0.

944

0.19

60.

278

−0.0

410.

923

1.00

00.

859

0.77

8−0

.450

−0.4

220.

010

−0.1

89−0

.023

RN

0.94

5−0

.107

−0.1

030.

299

0.70

50.

859

1.00

00.

669

−0.2

40−0

.519

0.22

6−0

.098

0.34

5S c

0.67

60.

453

0.40

6−0

.340

0.69

10.

778

0.66

91.

000

−0.3

59−0

.141

0.05

6−0

.185

−0.2

76H

si−0

.294

−0.2

93−0

.299

0.25

4−0

.408

−0.4

50−0

.240

−0.3

591.

000

0.38

80.

315

−0.0

900.

323

Df

−0.5

470.

096

0.21

4−0

.169

−0.2

75−0

.422

−0.5

19−0

.141

0.38

81.

000

−0.2

21−0

.029

−0.1

14R

l0.

124

−0.2

63−0

.096

0.27

6−0

.031

0.01

00.

226

0.05

60.

315

−0.2

211.

000

−0.2

700.

271

Rb

−0.2

12−0

.304

−0.3

380.

107

−0.2

23−0

.189

−0.0

98−0

.185

−0.0

90−0

.029

−0.2

701.

000

0.08

0L

b/L

w0.

211

−0.8

12−0

.657

0.95

2−0

.051

−0.0

230.

345

−0.7

60.

323

−0.1

140.

271

0.08

01.

000

Fig

ures

inbo

ldfa

cein

dica

test

rong

corr

elat

ions

Application of principal component analysis 335

Tab

le3

Pri

ncip

alco

mpo

nent

load

ing

mat

rix

ofse

lect

edge

omor

phic

para

met

ers

Par

amet

ers

Pri

ncip

alco

mpo

nent

s

12

34

56

78

910

1112

13

S a0.

882

0.41

80.

032

0.05

30.

016

0.04

30.

173

−0.0

620.

059

−0.0

340.

045

−0.0

270.

007

Re

0.39

2−0

.817

0.07

7−0

.030

0.08

2−0

.345

−0.1

13−0

.015

0.15

10.

048

0.03

40.

020

0.00

1R

c0.

384

−0.6

750.

246

0.06

4−0

.164

0.49

00.

238

−0.0

560.

059

0.05

7−0

.013

0.01

30.

001

S b−0

.234

0.93

40.

002

0.12

4−0

.194

0.10

5−0

.051

−0.0

22−0

.048

0.00

50.

056

0.03

30.

003

Rr

0.88

30.

107

−0.0

070.

225

−0.0

56−0

.063

−0.0

470.

377

−0.0

470.

059

−0.0

300.

004

0.00

3R

R0.

965

0.18

1−0

.033

0.12

8−0

.007

0.03

20.

083

0.08

60.

018

−0.0

410.

039

0.00

10.

003

RN

0.79

40.

551

0.01

30.

052

0.13

9−0

.052

−0.0

10−0

.145

0.06

3−0

.110

−0.0

630.

019

−0.0

11S c

0.85

7−0

.137

0.11

10.

159

0.30

70.

060

−0.1

76−0

.226

−0.1

420.

092

0.00

7−0

.002

0.00

1H

si−0

.539

0.21

80.

566

0.24

00.

331

−0.2

520.

331

0.02

4−0

.034

0.03

0−0

.002

−0.0

07−0

.000

Df

−0.4

43−0

.397

0.26

80.

707

0.06

50.

139

−0.2

030.

032

0.05

1−0

.084

0.01

1−0

.006

0.00

0R

l0.

016

0.39

10.

653

−0.4

680.

299

0.25

3−0

.166

0.13

50.

047

−0.0

160.

012

−0.0

030.

000

Rb

−0.2

720.

131

−0.7

560.

073

0.52

30.

209

0.06

90.

075

0.06

20.

024

0.01

00.

004

0.00

1L

b/L

w−0

.204

0.92

00.

060

0.20

1−0

.122

−0.0

10−0

.089

−0.0

720.

145

0.13

6−0

.023

−0.0

11−0

.002

Eig

enva

lue

4.81

33.

757

1.47

40.

945

0.67

90.

574

0.33

30.

259

0.08

90.

060

0.01

40.

003

0.00

0

336 P.K. Singh et al.

Tab

le4

Pri

ncip

alco

mpo

nent

load

ing

mat

rix

ofte

nfin

ally

scre

ened

outg

eom

orph

icpa

ram

eter

s

Par

amet

ers

Pri

ncip

alco

mpo

nent

s

12

34

56

78

910

S a0.

938

0.25

50.

054

−0.0

480.

019

−0.2

02−0

.024

0.07

50.

008

0.03

7R

e0.

258

−0.9

17−0

.115

−0.0

380.

084

−0.0

290.

260

−0.0

000.

051

−0.0

00S b

0.06

90.

980

0.02

1−0

.141

0.01

80.

063

−0.0

25−0

.038

0.08

80.

006

Rr

0.90

4−0

.044

0.10

6−0

.136

−0.3

320.

187

0.07

2−0

.029

−0.0

160.

018

RR

0.97

90.

023

0.12

5−0

.033

−0.1

16−0

.054

−0.0

570.

038

0.02

3−0

.051

RN

0.88

30.

386

0.06

30.

102

0.15

2−0

.133

0.05

8−0

.110

−0.0

29−0

.004

S c0.

849

−0.2

760.

005

0.23

60.

281

0.24

0−0

.099

0.01

90.

006

0.00

5R

l0.

099

0.38

6−0

.734

0.53

1−0

.134

−0.0

060.

044

0.01

00.

006

−0.0

00R

b−0

.285

0.14

80.

792

0.50

9−0

.065

−0.0

060.

075

0.01

20.

012

0.00

2L

b/L

w−0

.021

0.95

00.

005

−0.1

510.

129

0.09

80.

205

0.06

0−0

.035

−0.0

11E

igen

valu

e4.

319

3.16

81.

214

0.67

40.

272

0.16

90.

140

0.02

60.

014

0.00

4

Application of principal component analysis 337

12 parameters. The less correlated parameters such as Rc from second componentand Hsi from third component are also screened out because of lower principalcomponent loading matrix and same analysis is repeated with only 10 variables.

The principal component loadings matrix obtained using the correlation matrixof 10 parameters (Table 4) reveals that the first three components now togetheraccounts for 87.01% of the total explained variance showing an increase of about9.71%. The principal component loadings here also improved considerably in almostall significant parameters. The relative relief, relief ratio and average slope are highlycorrelated (loadings of more than 0.9) with the first component. The ruggednessnumber and main stream channel slope are also having good correlations (loadings ofmore than 0.8) with the first component. The basin shape factor, length width ratioand elongation ratio are highly correlated (loadings of more than 0.9) with secondcomponent. The correlations of Rl and Rb with third component are observed toincrease significantly.

The results of this study reveal that steepness component is the dominant compo-nent followed by shape and drainage components. Therefore, it can be presumed thathydrologic response like runoff yield and soil loss for these watersheds will be high.The peak runoff rate of the watershed will not be achieved frequently only becauseof less dominancy of shape and drainage components. Some watersheds (i.e., W4, W7and W14) exhibited higher bifurcation ratios, (Rb greater than 4), which would resultlower but extended peak flow, whereas remaining watersheds with lower Rb willproduce sharp peak flow. Furthermore, most of the watersheds (except watershedsW4, W9, and W13) exhibited higher values of elongation ratio (greater than 0.8),which shows that the area is having steep ground slope.

It is observed that the first component is strongly correlated with relative relief,average slope of the watershed, ruggedness number, relief ratio and main streamchannel slope which are grouped under slope or steepness component. The secondcomponent is strongly correlated with basin shape factor, length width ratio andelongation ratio of the watershed and is termed as shape component. The thirdcomponent is strongly correlated with bifurcation ratio and stream length ratio andhence is called as drainage component.

It can be seen how useful the principal component analysis have been in screeningout the parameters or variables of least significance and in regrouping the remainingvariables into physically significant factors. Multiple regression techniques can thenapplied in modeling the hydrologic responses such as runoff and sediment yields fromthe watersheds. One parameter each from significant components may form a set ofindependent parameters at a time in modeling the said hydrologic responses.

4 Conclusions

In the present study, thirteen geomorphic parameters of sixteen watersheds lo-cated in Chambal catchment of Rajasthan, India were chosen for the analysis. Thecorrelation matrix of the thirteen selected geomorphic parameters revealed thatstrong correlation (correlation coefficient>0.9) exist between basin shape factor andlength width ratio, between average slope of the watershed and the ruggednessnumber, between average slope of the watershed and relative relief, between reliefratio and relative relief, and elongation ratio and basin shape factor. The principal

338 P.K. Singh et al.

component loading matrix obtained from correlation matrix reveals that the firstthree components, whose Eigen values are greater than one, together accounts forabout 77.26% of the total explained variance. Based on the results of the principalcomponent analysis, first component is strongly correlated with relative relief, reliefratio, average slope of the watershed and main stream channel slope. The secondcomponent is strongly correlated with basin shape factor, length width ratio, andelongation ratio. However, third component is not found strongly correlated withany of the geomorphologic parameters but moderately correlated with stream lengthratio, and bifurcation ratio. The hypsometric integral and drainage factor could notbe grouped with any of the components because of its poor correlations with them.After screening out the hypsometric integral, drainage factor and circulatory ratio,the principal component loadings matrix of ten parameters indicate that first threecomponents together account for 87.01% of the total explained variance. Basedon the properties of the geomorphic parameters, three principal components weredefined as steepness, shape, and drainage components. Moreover, it is concludedthat in modeling the hydrologic responses such as runoff and sediment yield fromsmall watersheds, the principal component analysis is good tool for screening out theinsignificant parameters from the analysis.

References

Bagley JM, Jeppson RW, Milligan CH (1964) Water yields in Utah agricultural experimental station.Utah State University, Logan, Utah Special Report No. 18

Decoursey DG, Deal RB (1974) General aspect of multivariate analysis with applications to someproblems in hydrology. In: Proceedings of symposium on statistical hydrology, USDA, miscella-neous publication No. 1275. Washington DC, pp 47–68

Haan CT, Allen DM (1972) Comparison of multiple regression and principal component regressionfor predicting water yields in Kentucky. Water Resour Res 8(6):1593–1596

Haan CT, Read HR (1970) Prediction of monthly seasonal and annual runoff volumes for smallagricultural watersheds in Kentucky, Bulletin 711, Kentucky Agricultural Experiment Station,University of Lexington, Kentucky

Hotelling H (1933) Analysis of a complex of statistical variables into principal components. J EducPsychol 24:417–441, 498–520

Kumar V (1991) Hydrologic response models for prediction of runoff and sediment yields from smallwatersheds. Unpublished Ph.D. Thesis. Indian Institute of Technology, Kharagpur, India, p 350

Kumar V, Satyanarayana T (1993) Application of principal component analysis in grouping geomor-phic parameters for hydrologic modeling. J Indian Water Resour Soc 13(1–2):50–59

Magner JA, Brooks KN (2007) Predicting stream channel erosion in the lacustrine core of theupper Nemadji River, Minnesota (USA) using stream geomorphology metrics. Environ Geol,published online, http://www.springerlink.com/content/e65ju8857175kh23/?p=2dcbc28f160c4bfc86a413d59b4b1e82&pi=0. Accessed 8 March 2008

Mishra N, Satyanatayana T (1988) Parameter grouping—a prelude to hydrologic modeling. Indian JPower River Val Dev 256–260, September

Montes-Botella C, Tenorio MD (2003) Water characterization and seasonal heavy metal distributionin the Odiel River (Huelva, Spain) by means of principal component analysis. Arch EnvironContam Toxicol 45(4):436–444

Mrklas O, Bentley LR, Lunn SRD, Chu A (2006) Principal component analyses of groundwaterchemistry data during enhanced bioremediation. Water Air Soil Pollut 169(1–4):395–411

Pondzic K, Trninic D (1992) Principal component analysis of a river basin discharge and precipitationanomaly fields associated with the global circulation. J Hydrol 132(1–4):343–360

Ravichandrana S, Ramanibai R, Pundarikanthan NV (1996) Ecoregions for describing water qualitypatterns in Tamiraparani basin, South India. J Hydrol 178(1–4):257–276

Sharma KR (2002) Research methodology. National Publishing House, New Delhi, p 514

Application of principal component analysis 339

Singh VP (1992) Elementary hydrology. Prentice-Hall of India Private Limited, New Delhi, India,p 973

Singh RV (2000) Watershed planning and management. Yash Publishing House, Bikaner, p 470Suresh R (1997) Soil and watert conservation engineering. Standard Publishers Distributors, New

Delhi, p 973Tamene L, Park SJ, Dikau R, Vlek PLG (2006) Analysis of factors determining sediment yield

variability in the highlands of northern Ethiopia. Geomorphology 76(1–2):76–91Wong ST (1979) A multivariate statistical model for predicting mean annual flood in New England.

Annals Assoc Am Geographers 53:293–311