Research Article Assessment of Groundwater Potential Based...

12
Research Article Assessment of Groundwater Potential Based on Multicriteria Decision Making Model and Decision Tree Algorithms Huajie Duan, Zhengdong Deng, Feifan Deng, and Daqing Wang PLA University of Science and Technology, Nanjing 210007, China Correspondence should be addressed to Zhengdong Deng; [email protected] Received 17 July 2016; Revised 1 November 2016; Accepted 7 November 2016 Academic Editor: Alessandro Lo Schiavo Copyright © 2016 Huajie Duan et al. is is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Groundwater plays an important role in global climate change and satisfying human needs. In the study, RS (remote sensing) and GIS (geographic information system) were utilized to generate five thematic layers, lithology, lineament density, topology, slope, and river density considered as factors influencing the groundwater potential. en, the multicriteria decision model (MCDM) was integrated with C5.0 and CART, respectively, to generate the decision tree with 80 surveyed tube wells divided into four classes on the basis of the yield. To test the precision of the decision tree algorithms, the 10-fold cross validation and kappa coefficient were adopted and the average kappa coefficient for C5.0 and CART was 90.45% and 85.09%, respectively. Aſter applying the decision tree to the whole study area, four classes of groundwater potential zones were demarcated. According to the classification result, the four grades of groundwater potential zones, “very good,” “good,” “moderate,” and “poor,” occupy 4.61%, 8.58%, 26.59%, and 60.23%, respectively, with C5.0 algorithm, while occupying the percentages of 4.68%, 10.09%, 26.10%, and 59.13%, respectively, with CART algorithm. erefore, we can draw the conclusion that C5.0 algorithm is more appropriate than CART for the groundwater potential zone prediction. 1. Introduction Increasing population and water scarcity have raised the importance of groundwater zones, as they are a major source of freshwater. Integrated remote sensing and GIS are widely used in groundwater mapping. Locating potential groundwater targets is becoming more convenient and cost- effective with the advent of a number of satellite imageries. Remotely sensed based groundwater exploration has made it feasible to explore the areas with limited human access, for the wide visual range, short time cycle, and increasing spatial resolution. A lot of work has been done on the delineation of ground- water potential zones, including in tropical humid regions, such as Tirnavos area, Greece [1], Timor Leste, Indonesia [2], SW, Nigeria [3], and New Delhi [4], and in mid-latitude semiarid areas, such as Boryeong and Pohang, Korea [5, 6], Sultan Mountains (Konya, Turkey) [7, 8], Udaipur, India [9], Beheshtabad watershed, and Chaharmahal-and-Bakhtiari Province, Iran [10]. rough the analysis of the characteristics and factors within the typical regions, we found out basic principles and methods for factor selection, which provided reference and basis for the study area. Various researchers have effectively implemented multi- criteria decision model (MCDM) for accurately identifying the groundwater potential zones [1–4]. e major factors that influence the groundwater potential are lithology, rainfall, slope, drainage density, lineament density, and so forth. e factors’ values are mostly continuous; however, the prede- cessors mostly divided the continuous factors into several discrete levels according to the relationship with groundwater potential, causing a great loss of the original information. In the previous research, we have established the fuzzy member- ship functions to analyze each factor’s impact on groundwater enrichment from the perspective of continuity [11]. e MCDM is based on either manual decision method like analytic hierarchy process (AHP) [1–4, 9] or machine learning method, such as artificial neural network fitting [5], frequency ratio, weights of evidence and logistic regression [7, 8], boosted regression tree, classification and regression Hindawi Publishing Corporation Mathematical Problems in Engineering Volume 2016, Article ID 2064575, 11 pages http://dx.doi.org/10.1155/2016/2064575

Transcript of Research Article Assessment of Groundwater Potential Based...

Page 1: Research Article Assessment of Groundwater Potential Based ...downloads.hindawi.com/journals/mpe/2016/2064575.pdf · Figure . In general, slopes control the inltration and ow ability

Research ArticleAssessment of Groundwater Potential Based on MulticriteriaDecision Making Model and Decision Tree Algorithms

Huajie Duan Zhengdong Deng Feifan Deng and Daqing Wang

PLA University of Science and Technology Nanjing 210007 China

Correspondence should be addressed to Zhengdong Deng dengzdongsinacom

Received 17 July 2016 Revised 1 November 2016 Accepted 7 November 2016

Academic Editor Alessandro Lo Schiavo

Copyright copy 2016 Huajie Duan et al This is an open access article distributed under the Creative Commons Attribution Licensewhich permits unrestricted use distribution and reproduction in any medium provided the original work is properly cited

Groundwater plays an important role in global climate change and satisfying human needs In the study RS (remote sensing) andGIS (geographic information system) were utilized to generate five thematic layers lithology lineament density topology slopeand river density considered as factors influencing the groundwater potentialThen themulticriteria decisionmodel (MCDM)wasintegrated with C50 and CART respectively to generate the decision tree with 80 surveyed tube wells divided into four classes onthe basis of the yield To test the precision of the decision tree algorithms the 10-fold cross validation and kappa coefficient wereadopted and the average kappa coefficient for C50 and CART was 9045 and 8509 respectively After applying the decisiontree to the whole study area four classes of groundwater potential zones were demarcated According to the classification resultthe four grades of groundwater potential zones ldquovery goodrdquo ldquogoodrdquo ldquomoderaterdquo and ldquopoorrdquo occupy 461 858 2659 and6023 respectively with C50 algorithm while occupying the percentages of 468 1009 2610 and 5913 respectively withCART algorithmTherefore we can draw the conclusion that C50 algorithm is more appropriate than CART for the groundwaterpotential zone prediction

1 Introduction

Increasing population and water scarcity have raised theimportance of groundwater zones as they are a majorsource of freshwater Integrated remote sensing and GIS arewidely used in groundwater mapping Locating potentialgroundwater targets is becoming more convenient and cost-effective with the advent of a number of satellite imageriesRemotely sensed based groundwater exploration has made itfeasible to explore the areas with limited human access forthe wide visual range short time cycle and increasing spatialresolution

A lot of work has been done on the delineation of ground-water potential zones including in tropical humid regionssuch as Tirnavos area Greece [1] Timor Leste Indonesia[2] SW Nigeria [3] and New Delhi [4] and in mid-latitudesemiarid areas such as Boryeong and Pohang Korea [5 6]Sultan Mountains (Konya Turkey) [7 8] Udaipur India [9]Beheshtabad watershed and Chaharmahal-and-BakhtiariProvince Iran [10]Through the analysis of the characteristics

and factors within the typical regions we found out basicprinciples and methods for factor selection which providedreference and basis for the study area

Various researchers have effectively implemented multi-criteria decision model (MCDM) for accurately identifyingthe groundwater potential zones [1ndash4]Themajor factors thatinfluence the groundwater potential are lithology rainfallslope drainage density lineament density and so forth Thefactorsrsquo values are mostly continuous however the prede-cessors mostly divided the continuous factors into severaldiscrete levels according to the relationshipwith groundwaterpotential causing a great loss of the original information Inthe previous research we have established the fuzzymember-ship functions to analyze each factorrsquos impact on groundwaterenrichment from the perspective of continuity [11]

The MCDM is based on either manual decision methodlike analytic hierarchy process (AHP) [1ndash4 9] or machinelearning method such as artificial neural network fitting [5]frequency ratio weights of evidence and logistic regression[7 8] boosted regression tree classification and regression

Hindawi Publishing CorporationMathematical Problems in EngineeringVolume 2016 Article ID 2064575 11 pageshttpdxdoiorg10115520162064575

2 Mathematical Problems in Engineering

China

Tibet

0 5 10 20

N

79∘50

0E79

∘40

0E79

∘30

0E 80

∘00E 80

∘10

0E 80

∘20

0E

33∘00

N33∘100

N

(km)

Surveyed tube wells

33∘200

N

Figure 1 Location map of the study area (Landsat 8 OLI 753)

tree (CART) random forest [10] chi-squared automaticinteraction detector or the quick unbiased and efficientstatistical tree algorithms [12 13] The flexibility of the AHPmethod allows the revision of the weights and rating ofparameters in order to be suitable for other regions accord-ing to their specific characteristics [1] However regardingassigning weights to the different thematic layers personaljudgment reduced the objectivity of the model [9] Decisiontree techniques provide a multivariate method which isknown as a successful automatic classification scheme [14 15]CART algorithm [16] showed more application than otherdecision tree algorithms C50 algorithm [16ndash19] as one ofthe decision tree techniques serves as an enhancement ofC45 and shows relatively better classification result for it cansimultaneously handle continuous and categorical variableswith the unbiased processing The 119896-fold cross validationmethod [20] is an effective way to improve the precisionof the decision tree model with the basic idea randomlydividing the samples into the training set and validation setand circling for 119896 times to ensure the robustness of themodel

In the current study considering the groundwater poten-tial relating factors lithology lineament density topologyslope and river density decision tree algorithms C50 andCART respectively with the 10-fold cross validation wereapplied to generate and test the decision tree models forpredicting the groundwater potential grade in the wholestudy area

2 Materials and Methods

21 Study Area and Materials The study area (shown inFigure 1) is located in the southwestern part of Ritu countyAli city with the extent of 79∘301015840ndash80∘201015840E and 33∘ndash33∘301015840Nand surrounded by Kailash mountain in the south andKarakoram mountain in the north The county is in thesoutheast of BanGong Lake Basin across the BanGong Lake-Nu Jiang fault zoneThe area belongs to the wide valley in theplateau and mountain lake basin and provides runoff to BanGong Lake in the southeast directionThe study area is domi-nated by subfrigid monsoon climateThe temperature rangesfrom minus221 to 136 Celsius degrees with the annual averagetemperature of 05 Celsius degrees The annual sunshineperiod is 33709 hours with the frost-free period of 95 daysThe annual rainfall is quite low being only 75mm and havinghigh evaporation of 24563mm The Ban Gong Lake has amaximumdepth of 413m and lies in east-west direction hav-ing salt water in the midwest and freshwater in the east Theoriginal plant Ban Gong willow growing along the lake valleyhelps in soil and water conservation The lowest depressionfor catchment travels along the BanGong Lake-Nu Jiang faultand the beaded lake basin depression exists between the sur-rounding mountains The elevation ranges between 4196 and6240m in the study zone having an area of about 2240 km2

The available data sources include geology map with thescale of 1 250000 purchased from the National Geological

Mathematical Problems in Engineering 3

Geology Map DEM

Lithology Slope

Lineament density

Groundwater potential zone map

Riverdensity

Categorical ContinuousContinuous

Accuracy comparison

Surveyed Tube Wells

Gradeclassification

Landsat 8 OLI

Topography

Decision tree model generation with decision treealgorithm (50 24) and 10-fold cross validation

Figure 2 Flow chart for mapping groundwater potential

Table 1 Groundwater potential grade criteria on the tube well yield

Grade Very good Good Moderate PoorYield (119905119889) gt800 300ndash800 100ndash300 lt100Number 21 25 18 16

Library DEM from ASTER satellite with the horizon distri-bution of 30m and Landsat 8 OLI image with the acquisitiondate 5222013 downloaded from httpswwwusgsgov andcloud cover 243 sun elevation 6806 sun azimuth 11995and tube wells yield data with field survey 80 investigatedtubewells were divided into four grades according to the yield(Table 1) [21]

22 Methodological Framework The study on the ground-water potential zone delineation is carried out from theperspective of hydrogeology considering the occurrencespace and supply condition Lithology and lineament densityare chosen as the occurrence space factor topology slopeand river density are related to the groundwater supplycondition Lithology is the categorical variable and the restof the variables are continuous After the analysis betweenthe factors and groundwater potential grade C50 and CARTalgorithmswere respectively applied to generate the decisiontree and the 10-fold cross validation was adopted to test theclassification accuracy The specific technical route is shownin Figure 2

23 Factors Related to Groundwater Potential Lithology [22]influences the water-holding capacity of aquifer and directlyaffects the occurrence and distribution of groundwater Thelithology thematic map was derived through digitizing the1 250000 scale geology map from the National GeologicalLibrary as shown in Figure 3 The Quaternary (Q) includingthree kinds of sedimentary type alluvial diluvia and lacus-trine with the distribution in low-lying area belongs to theloose sediment for themelting snowwater that flows from thehigh mountains The Jurassic (J) layers are distributed widelyalong the east-west direction with the modern (J3) middle(J2) and early age (J1) respectively lying in the middlesouth and mid-north The Modern Cretaceous (K2) spreadsalong the north-south direction The Early Cretaceous (K1)is mostly distributed in the north along the east-west ThePaleogene (E) lies mostly in the west with the scatterdistribution in the central and eastern part

The linear faults accompanied by the cranny providespace for the occurrence of groundwater [23] In the stratumwith the same lithology the intersection of the faults leadsto development of the cranny which tends to be the ground-water enrichment zone with the connectivity enhancementThe linear structures are extracted from the satellite imagebased on the discontinuity of the color from the surroundingareas Orthographical correction is applied to Landsat 8 OLIimage with DEM to eliminate the shadowrsquos influence on thevisual interpretation in the study area Combination of bands753 (SWIR 2near infraredgreen) proved to be the mostsuitable for the extraction with the geology map in ENVI 51

4 Mathematical Problems in Engineering

0 5 10 20

N

79∘50

0E 80

∘00E 80

∘10

0E 80

∘20

0E

33∘00

N33∘100

N

QJ3J2J1

K2K1E

(km)

79∘40

0E79

∘30

0E

33∘200

N

Figure 3 Lithology map of the study area

0 5 10 20

N

79∘50

0E 80

∘00E 80

∘10

0E 80

∘20

0E

33∘00

N33∘100

N

(km)

High 1Low 0

(km

km

2 )

79∘40

0E79

∘30

0E

33∘200

N

Figure 4 Lineament density map of the study area

The lineament density shown in Figure 4 was calculated inArcGIS 101 with ldquoline densityrdquo command

Topography controls the groundwater supply conditionsThe mountainous region provides better runoff conditionsand most of the precipitation is accounted for in the surfacerunoff withminimum infiltration to the groundwater On theother hand precipitation in plains provides slower runoff andfacilitates groundwater recharge Topography map is shownin Figure 5The topography map with 30m spatial resolutionwas extracted from DEM data in ArcGIS 101

Groundwater flow is usually driven by surface force andthe boundary of the terrain is mostly the boundary of theshallow aquifer Slope [24] is important in analyzing theterrain as it can affect the groundwater in terms of its storageflow and discharge especially in mountainous areas Slopewas extracted from the DEM in ArcGIS 101 and is shown in

0 5 10 20

N

79∘50

0E 80

∘00E 80

∘10

0E 80

∘20

0E

33∘00

N33∘100

N

(km)

High 6240Low 4196

79∘40

0E79

∘30

0E

33∘200

N

(m)

Figure 5 Topography map of the study area

0 5 10 20

N

79∘50

0E 80

∘00E 80

∘10

0E 80

∘20

0E

33∘00

N33∘100

N

(km)

High 75Low 0

79∘40

0E79

∘30

0E

33∘200

N

(∘)

Figure 6 Slope map of the study area

Figure 6 In general slopes control the infiltration and flowability of the surface water Usually the steep slope indicatesgreater water velocity Therefore it is observed that in theareas of steeper relief the runoff increases while minimizingthe groundwater recharge On the contrary on the relativelygentle sloping terrain the groundwater potentiality increasesdue to greater infiltrationThus lower slope results in greaterrecharge

Flow accumulation reflects the upstream flow quantity Inthe study area the supply source is mainly the melting snowFlow accumulation was derived from DEM for generating astream network It can be seen from Figure 7 that the studyarea has a dendritic pattern for drainage The dendritic net-work is usually found in region underlain with homogeneoussurface without abrupt changes in geological conditions

The river density [25] represents the recharge conditionsto quantify the influence caused by surface water wherehigher density provides better recharge conditions Based on

Mathematical Problems in Engineering 5

0 5 10 20

N

79∘50

0E 80

∘00E 80

∘10

0E 80

∘20

0E

33∘00

N33∘100

N

(km)

gt1000

79∘40

0E79

∘30

0E

33∘200

N

Figure 7 Flow accumulation map of the study area

0 5 10 20

N

79∘50

0E 80

∘00E 80

∘10

0E 80

∘20

0E

33∘00

N33∘100

N

(km)

Low 0

High 114

(km

km

2 )

79∘40

0E79

∘30

0E

33∘200

N

Figure 8 River density map of the study area

the flow accumulation the river density was calculated usingthe ldquoline densityrdquo command in ArcGIS 101 as shown inFigure 8

24 Research Method The decision tree algorithms [16]are suitable for the multifactor classification problem Forthe mixture of the continuous and discrete factors in thegroundwater potential evaluation model C50 and CARTdecision tree algorithms were adopted together with the 10-fold cross validation method to improve the classificationaccuracy and unbiasedness

241 C50 The C50 decision tree algorithm is rooted inID3 and C45 The ID3 algorithm [26] with the maximumgain as the division standard aims to achieve the maximumof the information gain in every node which often givespriority to the variables with more classes To make upfor the defect of ID3 C45 adopts the gain-ratio criterion[19 27 28] With the gain-ratio criterion the binary nodes

divide the continuous variable The processing method forcategorical variables is firstly to refer each of the categoriesas a branch and then merge each two branches iterativelyuntil the two branches remain However the heuristic searchmay not find the best point for the categorical variablesdivision For continuous variables C50 algorithm can easilyfind the division point [29] To get rid of the biasedness onthe continuous variables the algorithm improves the gainof the continuous variables Besides C50 algorithm cansimplify the originally complex decision tree to the equivalenttree for the easy understanding With more splitting layersthan other algorithms it can ensure the high purity of theresult nodes C50 algorithm achieves the self-correction afterseveral iterations with boosting technology [30]The artificialmethods lay emphasis on the sole impact of various factorshowever C50 algorithm can consider all the factors toanalyze the comprehensive influence based on data statisticsC50 algorithm has been widely applied to the multivariateclassification for its unbiasedness and precision towards thecontinuous and categorical variables compared to other deci-sion tree algorithms [17 31ndash37] To improve the classificationaccuracy boosting technology was applied in C50 decisiontree algorithm and could adjust the decision tree according tothe fault samples until reaching a high precision [38] BesidesC50 algorithm is more applicable to the large data samples

Assume the splitting node119879 is expected to separate the |119879|samples into119870 target categories [27]The symbol119901119879(119894) standsfor the percentage of category 119894 at node 119879 119894 = 1 2 119870 Theinclusive information at node 119879 can be expressed as

Info (119879) = minus 119870sum119894=1

119901119879 (119894) times log2 [119901119879 (119894)] (1)

For the decision tree branch the samples are divided into1198791 1198792 119879119899 by the node 119879 with the subsamples of |1198791| |1198792|and |119879119899| respectivelyThen the information can be expressedas

Info (119883 119879) = 119899sum119894=1

10038161003816100381610038161198791198941003816100381610038161003816|119879| times Info (119879119894) (2)

After combining the above information formulas theinformation gain can be expressed as

Gain (119883 119879) = Info (119879) minus Info (119883 119879) (3)To improve the application of the information gain

concept information gain ratio was put forward

GainRatio (119883 119879) = Gain (119883 119879)SplitInfo (119883 119879) (4)

where

SplitInfo (119883 119879) = minus 119899sum119894=1

10038161003816100381610038161198791198941003816100381610038161003816|119879| times log2 (10038161003816100381610038161198791198941003816100381610038161003816|119879| ) (5)

The practice showed that the information gain ratio wasmore preferred to the continuous variable therefore theinformation gain ratio for the continuous variable with 119899distinct values should be expressed as [16]

Gain (119883 119879) = Info (119879) minus Info (119883 119879) minus log2 (119899 minus 1)|119879| (6)

6 Mathematical Problems in Engineering

Table 2 The precision of each loop based on the 10-fold cross validation ()

1 2 3 4 5 6 7 8 9 10 Averageaccuracy

C50 8333 10000 8125 8986 9025 8729 9450 9421 8750 9628 9045CART 8057 8921 9223 8534 8123 100 8012 7824 8157 8234 8509

Table 3 The importance of each factor based on C50 and CART algorithm

Lithology Topology River density Slope Lineament densityC50 0363 0331 0159 0117 0031CART 0355 0308 0024 0010 0312

242 CART CART decision tree algorithm [39] can dividethe sample set into two subsample sets making the root andintermediate nodes with two branches based on the recur-sively binary segmentation technology CART can handleboth continuous and discrete variables with the impuritylevel-Gini coefficient as the discriminant basis consideringthe probability distribution under the division node

Assume a total of 119870 classes variable 119883 and node for 119879then the Gini index is defined as [16]

Gini (119879) = 119870sum119894=1

119901119879 (119894) (1 minus 119901119879 (119894)) = 1 minus119870sum119894=1

1199012119879 (119894) (7)

When the classes show the equal probability in the node119879 the Gini index achieves the maximum 1 minus 1119870 with onlyone kind in node 119879 the Gini index achieves the minimumThe Gini index increases with the impurity therefore thesubnodes should be added to lower the impurity Whentaking the misclassification cost matrix into considerationthe Gini index formula becomes

Gini (119879) = sum119895 =119894

119862 (119894 | 119895) 119901119879 (119894) 119901119879 (119895) (8)

where 119862(119894 | 119895) represents the cost for misclassifying the casecategory 119895 as 119894 Assuming that the subnode 119878 added to node119879 the Gini index can be expressed as

Gini (119878 119879) = Gini (119879) minus 119901119871Gini (119879119871) minus 119901119877Gini (119879119877) (9)

where 119901119871 119901119877 stand for the proportion of cases in node 119879classified into 119879119871 and 119879119877243 10-Fold Cross Validation The basic idea for the 119896-foldcross validation [40 41] is to equally divide the surveyedsamples into 119896 parts of which 119896 minus 1 parts served as thetraining samples with the remaining one part for validationwith 119896 circulations The method can guarantee every sampleacted as the training sample and the verification sample forone time in the circulations For the 119896-fold cross validationthe commonly used value for 119896 is 10 called 10-fold crossvalidation [42]

3 Results and Discussion

In this study five groundwater relating factors were used inthe analysis and the factors except lithology were continuous

80 tube wells were utilized for training with C50 andCART respectively in the statistical analysis software SPSSClementine 120 and MATLAB According to the confusionmatrix [43] constructed by the verification result the kappacoefficient [44 45] is used to evaluate the accuracy Based onthe 10-fold cross validation with C50 and CART algorithmrespectively the precision of each loop is shown in Table 2The importance of each factor was calculated firstly to deter-mine the selection order of the factors for classification asshown in Table 3 For the C50 algorithm the importance was0363 0331 0159 0117 and 0031 respectively for lithologytopology river density slope and lineament density For theCART algorithm the importance was 0355 0308 00240010 and 0312 respectively After the ten loops the decisiontree with the higher classification accuracy was chosen asoptimal Figure 9 shows the optimal decision tree generatedby C50 algorithm with 6 layers 21 nodes 10 internal nodesand 11 terminal nodes Table 4 shows the rules for the optimaldecision tree by C50 The optimal decision tree generated byCARTalgorithm is shown in Figure 10 with 8 layers 21 nodes11 internal nodes and 10 terminal nodes and the rules areshown in Table 5

Kappa = 119873sum119903119894=1 119909119894119894 minus sum119903119894=1 119909119894+119909+1198941198732 minus sum119903119894=1 119909119894+119909+119894 (10)

where 119903 is the number of the total categories 119873 is the totalnumber for verification 119909119894119894 is the number of the correctclassifications 119909119894+ is the number of samples mistaken fromthe category 119894 for others 119909+119894 is the number of samplesmistaken from other categories for 119894

According to the classification rules we can see that notevery rule takes all the variables into account therefore forthe area lack of the detailed information the typical factorscan help to predict the groundwater potential classificationgrade The decision trees show that both the two algorithmscan determine the dividing point of the variables especiallyfor the continuous variables based on the training data whichmakes the division of the interval more scientific and reducesthe segmentation error compared with the artificial division

According to the decision tree result generated by C50algorithm we did deep analysis Topology was dividedby six nodes 4196ndash4301ndash43165ndash4357ndash44005ndash6240 and thegroundwater was distributed in the low-lying areas Slopecontained two ranges 0ndash243ndash75 and the flat areas werebeneficial to the surface water infiltration River density was

Mathematical Problems in Engineering 7

Table 4 Rules for groundwater potential grade based on C50

Grade Rule

Very goodLithology Q 4196 lt topology le 4301 lineament density le 0558Lithology Q 4196 lt topology le 43165 lineament density gt 0558Lithology Q 43165 lt topology le 44005 river density gt 0784 slope le 243

Good

Lithology Q 4301 lt topology le 43165 lineament density le 0558Lithology Q 43165 lt topology le 4357 river density le 0784Lithology Q 4357 lt topology le 44005 river density gt 0784 slope gt 243Lithology Q 44005 lt topology river density gt 0650

ModerateLithology others except Q river density gt 0506Lithology Q 44005 lt topology river density le 0650Lithology Q 4357 lt topology le 44005 river density le 0784

Poor Lithology others except Q river density le 05

RD

L

T

Others

T

Q

P M

RD RD

M GT S

VG G

LD

T VG

VG G

G M

le0506 gt0506 le43165 gt43165

le0558 gt0558

le4301 gt4301

le4357 gt4357

gt0784

le44005 gt44005

le0650 gt0650

le243 gt243

le0784

Factor GradeL lithologyLD lineament densityT topologyS slopeRD river density

VG very goodG goodM moderateP poor

Figure 9 The optimal decision tree generated by C50 algorithm

classified into four intervals 0ndash0506ndash0650ndash0784ndash114 andreflected the flow capacity in the region and the higherthe better for groundwater enrichment Lineament densitywas divided into two intervals 0ndash0558ndash1 and the moreoccurrence space for groundwater existed with the higherlineament density However for the CART algorithm topol-ogy was divided into three intervals 4196ndash4304ndash4400ndash6240slope contained two ranges 0ndash2386ndash75 river density was

classified into three intervals 0ndash0685ndash0703ndash114 lineamentdensitywas divided into four intervals 0ndash0296ndash0335ndash0713ndash1 After applying the optimal decision trees respectively tothe whole study area [6 41] the groundwater potential zonemaps were derived and shown in Figures 11 and 12

According to the results generated by the decision treesthe ldquovery goodrdquo area is mostly located in the broad plainzone with a patchy distribution covering 10325 km2 about

8 Mathematical Problems in Engineering

Table 5 Rules for groundwater potential grade based on CART

Grade Rule

Very goodRiver density gt 0703 4304 lt topology le 4400 lineament density gt 0713Lineament density gt 0296 topology le 4304Lineament density gt 0335 4304 lt topology le 4400 river density gt 0703 slope le 2386

GoodRiver density lt 0703 4304 lt topology le 4400 lineament density gt 03350335 lt lineament density le 0713 4304 lt topology le 4400 river density gt 0703 slope gt 2386Lineament density gt 0296 topology gt 4400 river density gt 0685

ModerateLineament density le 0296 lithology Q0296 lt lineament density le 0335 4304 lt topology le 4400Lineament density gt 0296 topology gt 4400 river density le 0685

Poor Lineament density le 0296 lithology others except Q

LD

L T

RDP M

Others Q

M G

T

VG LD

M RD

G LD

S VG

VG G

le0296 gt0296

le4400 gt4400

le4304 gt4304 le0685 gt0685

le0335 gt0335

le0703 gt0703

le0713 gt0713

gt2386

Factor GradeL lithologyLD lineament densityT topologyS slopeRD river density

VG very goodG goodM moderateP poor

le2386

Figure 10 The optimal decision tree generated by CART algorithm

Mathematical Problems in Engineering 9

5

N

79∘50

0E 80

∘00E 80

∘10

0E 80

∘20

0E

33∘00

N33∘100

N

0 10 20(km)

Very goodGoodModerate

PoorSurveyed tube wells

79∘40

0E79

∘30

0E

33∘200

N

Figure 11 Groundwater potential map of the study area based onC50

0 5 10 20

N

79∘50

0E 80

∘00E 80

∘10

0E 80

∘20

0E

33∘00

N33∘100

N

(km)

Very goodGoodModerate

PoorSurveyed tube wells

79∘40

0E79

∘30

0E

33∘200

N

Figure 12 Groundwater potential map of the study area based onCART

461 of the study area for C50 algorithm and 10505 km2about 468 of the study area for CART algorithm for thewater infiltration into the underground with sufficient timeand space The study shows that low-lying areas with goodflow condition well-developed stratigraphic gap and strongconnectivity like the southwest beaches of Ban Gong Lakecan be the first target for groundwater resources The ldquogoodrdquoarea was distributed along the river with 19210 km2 covering858 for C50 algorithm and 22602 km2 about 1009 ofthe study area for CART algorithm just like a long stripThe zones mainly had a planar distribution on the bottomof the diluvia fan on the low mountain and hilly terrainand a banded distribution on the mountain watershed ofpluvial valleys situated upstream and on the peripheryof ldquovery goodrdquo area which can serve as the candidate

target for groundwater exploration The ldquomoderaterdquo zonewith 59551 km2 occupying 2659 for C50 algorithmand 58453 km2 about 2610 of the study area for CARTalgorithm spreads around the upstream tributaries Thezone is located on both sides of the river valleys and the topof the diluvia fan The ldquopoorrdquo area occupies 6023 withthe area of 134914 km2 for C50 algorithm and 132440 km2about 5913 of the study area for CART algorithm which ismostly a mountainous region with a high altitude

4 Conclusions

In this study C50 and CART algorithms were applied for thedecision tree generation to predict the groundwater potentialzone with the five relating factors and the 10-fold crossvalidation method was adopted to verify the classificationresult with the kappa coefficient From this paper we candraw some conclusions as follows(1) In the study area the five groundwater relating fac-tors lithology topology slope river density and lineamentdensity were appropriate for the groundwater potential gradeprediction and the importance based on C50 algorithm was0363 0331 0159 0117 and 003 respectively for CARTalgorithm the importance was 0355 0308 0024 0010 and0312 respectively(2) Based on the 10-fold cross validation both C50 andCART could be applied for MCDM with the categoricaland continuous variables simultaneously with the averageaccuracy of 9045 and 8509 respectively however C50algorithm showed higher classification accuracy than CARTalgorithm(3) After applying the optimal decision trees to the wholestudy area respectively the groundwater potential zone mapwas delineated and the four grades of groundwater poten-tial zones ldquovery goodrdquo ldquogoodrdquo ldquomoderaterdquo and ldquopoorrdquooccupied the area of 10325 km2 19210 km2 59551 km2 and134914 km2 with the percentages of 461 858 2659and 6023 respectively for C50 and for CART the areaof 10505 km2 22602 km2 58453 km2 and 132440 km2with the percentages of 468 1009 2610 and 5913respectively

The study result can provide reference for groundwaterexploration and in the future work we will consider morerelating factors and survey more wells to enrich the modelThe integration of decision tree algorithms and MCDM inour study applies only to the qualitative assessment for thelack of the prior knowledge in the large area therefore theextra analysis is needed for the specific point investigationThe accuracy demonstrates that the 10-fold cross validation issuitable for training and verifying the decision tree howeverthe tested dataset is limited and more tube wells should beinvestigated to validate the stability of the model

Competing Interests

The authors declare that there is no conflict of interests in thisresearch work

10 Mathematical Problems in Engineering

Acknowledgments

This work was supported by Development Program of ChinaGroundwater Exploration Technology in the Water ShortageRegion (863 Program 2012AA062601)

References

[1] D Oikonomidis S Dimogianni N Kazakis and K VoudourisldquoA GISRemote Sensing-based methodology for groundwaterpotentiality assessment in Tirnavos area Greecerdquo Journal ofHydrology vol 525 pp 197ndash208 2015

[2] D Pinto S Shrestha M S Babel and S Ninsawat ldquoDelineationof groundwater potential zones in the Comoro watershedTimor Leste using GIS remote sensing and analytic hierarchyprocess (AHP) techniquerdquo Applied Water Science 2015

[3] O A Fashae M N Tijani A O Talabi and O I AdedejildquoDelineation of groundwater potential zones in the crystallinebasement terrain of SW-Nigeria an integrated GIS and remotesensing approachrdquoAppliedWater Science vol 4 no 1 pp 19ndash382014

[4] JMallick C K SinghHAl-Wadi et al ldquoGeospatial and geosta-tistical approach for groundwater potential zone delineationrdquoHydrological Processes vol 29 no 3 pp 395ndash418 2015

[5] S Lee K-Y Song Y Kim and I Park ldquoRegional groundwaterproductivity potential mapping using a geographic informationsystem (GIS) based artificial neural network modelrdquo Hydroge-ology Journal vol 20 no 8 pp 1511ndash1527 2012

[6] S Lee and C-W Lee ldquoApplication of decision-tree model togroundwater productivity-potential mappingrdquo Sustainabilityvol 7 no 10 pp 13416ndash13432 2015

[7] AOzdemir ldquoUsing a binary logistic regressionmethod andGISfor evaluating andmapping the groundwater spring potential inthe Sultan Mountains (Aksehir Turkey)rdquo Journal of Hydrologyvol 405 no 1-2 pp 123ndash136 2011

[8] A Ozdemir ldquoGIS-based groundwater spring potential mappingin the SultanMountains (Konya Turkey) using frequency ratioweights of evidence and logistic regression methods and theircomparisonrdquo Journal of Hydrology vol 411 no 3-4 pp 290ndash308 2011

[9] D Machiwal M K Jha and B C Mal ldquoAssessment ofgroundwater potential in a semi-arid region of india usingremote sensing GIS and MCDM techniquesrdquo Water ResourcesManagement vol 25 no 5 pp 1359ndash1386 2011

[10] S A Naghibi and H R Pourghasemi ldquoA comparative assess-ment between three machine learning models and their per-formance comparison by bivariate and multivariate statisticalmethods in groundwater potential mappingrdquo Water ResourcesManagement vol 29 no 14 pp 5217ndash5236 2015

[11] D Zheng-Dong Y Xin L Fan et al ldquoConstruction andinvestigation of groundwater remote sensing fuzzy assessmentindexrdquo Chinese Journal of Geophysics vol 56 no 11 pp 3908ndash3916 2013

[12] O FAlthuwaynee B PradhanH-J Park and JH Lee ldquoAnovelensemble decision tree-based CHi-squared Automatic Interac-tion Detection (CHAID) and multivariate logistic regressionmodels in landslide susceptibility mappingrdquo Landslides vol 11no 6 pp 1063ndash1078 2014

[13] I Park and S Lee ldquoSpatial prediction of landslide susceptibilityusing a decision tree approach a case study of the Pyeongchangarea Koreardquo International Journal of Remote Sensing vol 35 no16 pp 6089ndash6112 2014

[14] C Baker R Lawrence C Montagne and D Patten ldquoMappingwetlands and riparian areas using landsat ETM+ imagery anddecision-tree-based modelsrdquo Wetlands vol 26 no 2 pp 465ndash474 2006

[15] R Kohavi and J R Quinlan ldquoData mining tasks and methodsclassification decision-tree discoveryrdquo in Handbook of DataMining and Knowledge Discovery pp 267ndash276 Oxford Univer-sity Press New York NY USA 2002

[16] N Lin D Noe and X He ldquoTree-based methods and theirapplicationsrdquo in Springer Handbook of Engineering Statistics pp551ndash570 Springer London UK 2006

[17] I Klein U Gessner and C Kuenzer ldquoRegional land covermapping and change detection in Central Asia using MODIStime-seriesrdquo Applied Geography vol 35 no 1-2 pp 219ndash2342012

[18] J R Quinlan ldquoImproved use of continuous attributes in C4 5rdquoJournal of Artificial Intelligence Research vol 4 pp 77ndash90 1996

[19] S B Kotsiantis ldquoSupervised machine learning a review ofclassification techniquesrdquo Informatica vol 31 no 3 pp 249ndash268 2007

[20] K Polat and S Gunes ldquoA novel hybrid intelligent method basedon C45 decision tree classifier and one-against-all approachfor multi-class classification problemsrdquo Expert Systems withApplications vol 36 no 2 pp 1587ndash1592 2009

[21] M K Jha V M Chowdary and A Chowdhury ldquoGroundwaterassessment in Salboni Block West Bengal (India) using remotesensing geographical information system and multi-criteriadecision analysis techniquesrdquoHydrogeology Journal vol 18 no7 pp 1713ndash1728 2010

[22] V B Rekha A P Thomas M Suma and H Vijith ldquoAnintegration of spatial information technology for groundwa-ter potential and quality investigations in Koduvan Ar Sub-Watershed of Meenachil River Basin Kerala Indiardquo Journal ofthe Indian Society of Remote Sensing vol 39 no 1 pp 63ndash712011

[23] O Rahmati ANazari SamaniMMahdavi H R Pourghasemiand H Zeinivand ldquoGroundwater potential mapping at Kurdis-tan region of Iran using analytic hierarchy process and GISrdquoArabian Journal of Geosciences vol 8 no 9 pp 7059ndash7071 2014

[24] K R Preeja S Joseph J Thomas and H Vijith ldquoIdentificationof groundwater potential zones of a tropical river basin (KeralaIndia) using remote sensing and GIS techniquesrdquo Journal of theIndian Society of Remote Sensing vol 39 no 1 pp 83ndash94 2011

[25] K Narendra K Nageswara Rao and P Swarna Latha ldquoIntegrat-ing remote sensing and GIS for identification of groundwaterprospective zones in the Narava basin Visakhapatnam regionAndhra Pradeshrdquo Journal of the Geological Society of India vol81 no 2 pp 248ndash260 2013

[26] M Ture F Tokatli and I Kurt ldquoUsing KaplanndashMeier analysistogether with decision tree methods (CampRT CHAID QUESTC45 and ID3) in determining recurrence-free survival of breastcancer patientsrdquo Expert Systems with Applications vol 36 no 2pp 2017ndash2026 2009

[27] N A AL-Fakhry ldquoSummarizing data by using data miningtechniques a comparative by using C45 and C50 algorithmsrdquoInternational Education and Research Journal vol 2 no 4 2016

[28] R Kohavi and J R Quinlan ldquoData mining tasks and methodsclassification decision-tree discoveryrdquo in Handbook of DataMining and Knowledge Discovery pp 267ndash276 Oxford Univer-sity Press 2002

Mathematical Problems in Engineering 11

[29] U M Fayyad and K B Irani ldquoOn the handling of continuous-valued attributes in decision tree generationrdquoMachine Learningvol 8 no 1 pp 87ndash102 1992

[30] J R Quinlan ldquoBagging boosting and C4 5rdquo AAAIIAAI vol1 pp 725ndash730 1996

[31] S Pang and J C Gong ldquoC5 0 classification algorithm andapplication on individual credit evaluation of banksrdquo SystemsEngineering-Theory amp Practice vol 29 no 12 pp 94ndash104 2009

[32] M A Friedl C E Brodley and A H Strahler ldquoMaximizingland cover classification accuracies produced by decision treesat continental to global scalesrdquo IEEE Transactions on Geoscienceand Remote Sensing vol 37 no 2 pp 969ndash977 1999

[33] C J Moran and E N Bui ldquoSpatial data mining for enhancedsoil map modellingrdquo International Journal of GeographicalInformation Science vol 16 no 6 pp 533ndash549 2002

[34] X Li andA G-O Yeh ldquoDatamining of cellular automatarsquos tran-sition rulesrdquo International Journal of Geographical InformationScience vol 18 no 8 pp 723ndash744 2004

[35] P H Williams R Eyles and G Weiller ldquoPlant microRNAprediction by supervised machine learning using C50 decisiontreesrdquo Journal of Nucleic Acids vol 2012 Article ID 652979 10pages 2012

[36] N Patil R Lathi and V Chitre ldquoCustomer card classificationbased on C5 0 amp CART algorithmsrdquo International Journal ofEngineering Research and Applications vol 2 no 4 pp 164ndash1672012

[37] M M Javidi and E F Roshan ldquoSpeech emotion recognition byusing combinations of C50 neural network (NN) and supportvector machines (SVM) classification methodsrdquo Journal ofMathematics and Computer Science vol 6 pp 191ndash200 2013

[38] E Bauer and R Kohavi ldquoAn empirical comparison of vot-ing classification algorithms bagging boosting and variantsrdquoMachine Learning vol 36 no 1-2 pp 105ndash139 1999

[39] Y Shao and R S Lunetta ldquoComparison of support vectormachine neural network and CART algorithms for the land-cover classification using limited training data pointsrdquo ISPRSJournal of Photogrammetry and Remote Sensing vol 70 pp 78ndash87 2012

[40] Y Bengio andYGrandvalet ldquoNounbiased estimator of the vari-ance of k-fold cross-validationrdquo Journal of Machine LearningResearch vol 5 pp 1089ndash1105 2004

[41] C Chen B He and Z Zeng ldquoAmethod formineral prospectiv-ity mapping integrating C45 decision tree weights-of-evidenceandm-branch smoothing techniques a case study in the easternKunlunMountains Chinardquo Earth Science Informatics vol 7 no1 pp 13ndash24 2014

[42] M van der Gaag T Hoffman M Remijsen et al ldquoThe five-factor model of the positive and negative syndrome scale IIa ten-fold cross-validation of a revised modelrdquo SchizophreniaResearch vol 85 no 1ndash3 pp 280ndash287 2006

[43] M Al Saud ldquoMapping potential areas for groundwater storagein Wadi Aurnah Basin western Arabian Peninsula usingremote sensing and geographic information system techniquesrdquoHydrogeology Journal vol 18 no 6 pp 1481ndash1495 2010

[44] R G Congalton ldquoA review of assessing the accuracy of classifi-cations of remotely sensed datardquo Remote Sensing of Environ-ment vol 37 no 1 pp 35ndash46 1991

[45] F K Hoehler ldquoBias and prevalence effects on kappa viewed interms of sensitivity and specificityrdquo Journal of Clinical Epidemi-ology vol 53 no 5 pp 499ndash503 2000

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 2: Research Article Assessment of Groundwater Potential Based ...downloads.hindawi.com/journals/mpe/2016/2064575.pdf · Figure . In general, slopes control the inltration and ow ability

2 Mathematical Problems in Engineering

China

Tibet

0 5 10 20

N

79∘50

0E79

∘40

0E79

∘30

0E 80

∘00E 80

∘10

0E 80

∘20

0E

33∘00

N33∘100

N

(km)

Surveyed tube wells

33∘200

N

Figure 1 Location map of the study area (Landsat 8 OLI 753)

tree (CART) random forest [10] chi-squared automaticinteraction detector or the quick unbiased and efficientstatistical tree algorithms [12 13] The flexibility of the AHPmethod allows the revision of the weights and rating ofparameters in order to be suitable for other regions accord-ing to their specific characteristics [1] However regardingassigning weights to the different thematic layers personaljudgment reduced the objectivity of the model [9] Decisiontree techniques provide a multivariate method which isknown as a successful automatic classification scheme [14 15]CART algorithm [16] showed more application than otherdecision tree algorithms C50 algorithm [16ndash19] as one ofthe decision tree techniques serves as an enhancement ofC45 and shows relatively better classification result for it cansimultaneously handle continuous and categorical variableswith the unbiased processing The 119896-fold cross validationmethod [20] is an effective way to improve the precisionof the decision tree model with the basic idea randomlydividing the samples into the training set and validation setand circling for 119896 times to ensure the robustness of themodel

In the current study considering the groundwater poten-tial relating factors lithology lineament density topologyslope and river density decision tree algorithms C50 andCART respectively with the 10-fold cross validation wereapplied to generate and test the decision tree models forpredicting the groundwater potential grade in the wholestudy area

2 Materials and Methods

21 Study Area and Materials The study area (shown inFigure 1) is located in the southwestern part of Ritu countyAli city with the extent of 79∘301015840ndash80∘201015840E and 33∘ndash33∘301015840Nand surrounded by Kailash mountain in the south andKarakoram mountain in the north The county is in thesoutheast of BanGong Lake Basin across the BanGong Lake-Nu Jiang fault zoneThe area belongs to the wide valley in theplateau and mountain lake basin and provides runoff to BanGong Lake in the southeast directionThe study area is domi-nated by subfrigid monsoon climateThe temperature rangesfrom minus221 to 136 Celsius degrees with the annual averagetemperature of 05 Celsius degrees The annual sunshineperiod is 33709 hours with the frost-free period of 95 daysThe annual rainfall is quite low being only 75mm and havinghigh evaporation of 24563mm The Ban Gong Lake has amaximumdepth of 413m and lies in east-west direction hav-ing salt water in the midwest and freshwater in the east Theoriginal plant Ban Gong willow growing along the lake valleyhelps in soil and water conservation The lowest depressionfor catchment travels along the BanGong Lake-Nu Jiang faultand the beaded lake basin depression exists between the sur-rounding mountains The elevation ranges between 4196 and6240m in the study zone having an area of about 2240 km2

The available data sources include geology map with thescale of 1 250000 purchased from the National Geological

Mathematical Problems in Engineering 3

Geology Map DEM

Lithology Slope

Lineament density

Groundwater potential zone map

Riverdensity

Categorical ContinuousContinuous

Accuracy comparison

Surveyed Tube Wells

Gradeclassification

Landsat 8 OLI

Topography

Decision tree model generation with decision treealgorithm (50 24) and 10-fold cross validation

Figure 2 Flow chart for mapping groundwater potential

Table 1 Groundwater potential grade criteria on the tube well yield

Grade Very good Good Moderate PoorYield (119905119889) gt800 300ndash800 100ndash300 lt100Number 21 25 18 16

Library DEM from ASTER satellite with the horizon distri-bution of 30m and Landsat 8 OLI image with the acquisitiondate 5222013 downloaded from httpswwwusgsgov andcloud cover 243 sun elevation 6806 sun azimuth 11995and tube wells yield data with field survey 80 investigatedtubewells were divided into four grades according to the yield(Table 1) [21]

22 Methodological Framework The study on the ground-water potential zone delineation is carried out from theperspective of hydrogeology considering the occurrencespace and supply condition Lithology and lineament densityare chosen as the occurrence space factor topology slopeand river density are related to the groundwater supplycondition Lithology is the categorical variable and the restof the variables are continuous After the analysis betweenthe factors and groundwater potential grade C50 and CARTalgorithmswere respectively applied to generate the decisiontree and the 10-fold cross validation was adopted to test theclassification accuracy The specific technical route is shownin Figure 2

23 Factors Related to Groundwater Potential Lithology [22]influences the water-holding capacity of aquifer and directlyaffects the occurrence and distribution of groundwater Thelithology thematic map was derived through digitizing the1 250000 scale geology map from the National GeologicalLibrary as shown in Figure 3 The Quaternary (Q) includingthree kinds of sedimentary type alluvial diluvia and lacus-trine with the distribution in low-lying area belongs to theloose sediment for themelting snowwater that flows from thehigh mountains The Jurassic (J) layers are distributed widelyalong the east-west direction with the modern (J3) middle(J2) and early age (J1) respectively lying in the middlesouth and mid-north The Modern Cretaceous (K2) spreadsalong the north-south direction The Early Cretaceous (K1)is mostly distributed in the north along the east-west ThePaleogene (E) lies mostly in the west with the scatterdistribution in the central and eastern part

The linear faults accompanied by the cranny providespace for the occurrence of groundwater [23] In the stratumwith the same lithology the intersection of the faults leadsto development of the cranny which tends to be the ground-water enrichment zone with the connectivity enhancementThe linear structures are extracted from the satellite imagebased on the discontinuity of the color from the surroundingareas Orthographical correction is applied to Landsat 8 OLIimage with DEM to eliminate the shadowrsquos influence on thevisual interpretation in the study area Combination of bands753 (SWIR 2near infraredgreen) proved to be the mostsuitable for the extraction with the geology map in ENVI 51

4 Mathematical Problems in Engineering

0 5 10 20

N

79∘50

0E 80

∘00E 80

∘10

0E 80

∘20

0E

33∘00

N33∘100

N

QJ3J2J1

K2K1E

(km)

79∘40

0E79

∘30

0E

33∘200

N

Figure 3 Lithology map of the study area

0 5 10 20

N

79∘50

0E 80

∘00E 80

∘10

0E 80

∘20

0E

33∘00

N33∘100

N

(km)

High 1Low 0

(km

km

2 )

79∘40

0E79

∘30

0E

33∘200

N

Figure 4 Lineament density map of the study area

The lineament density shown in Figure 4 was calculated inArcGIS 101 with ldquoline densityrdquo command

Topography controls the groundwater supply conditionsThe mountainous region provides better runoff conditionsand most of the precipitation is accounted for in the surfacerunoff withminimum infiltration to the groundwater On theother hand precipitation in plains provides slower runoff andfacilitates groundwater recharge Topography map is shownin Figure 5The topography map with 30m spatial resolutionwas extracted from DEM data in ArcGIS 101

Groundwater flow is usually driven by surface force andthe boundary of the terrain is mostly the boundary of theshallow aquifer Slope [24] is important in analyzing theterrain as it can affect the groundwater in terms of its storageflow and discharge especially in mountainous areas Slopewas extracted from the DEM in ArcGIS 101 and is shown in

0 5 10 20

N

79∘50

0E 80

∘00E 80

∘10

0E 80

∘20

0E

33∘00

N33∘100

N

(km)

High 6240Low 4196

79∘40

0E79

∘30

0E

33∘200

N

(m)

Figure 5 Topography map of the study area

0 5 10 20

N

79∘50

0E 80

∘00E 80

∘10

0E 80

∘20

0E

33∘00

N33∘100

N

(km)

High 75Low 0

79∘40

0E79

∘30

0E

33∘200

N

(∘)

Figure 6 Slope map of the study area

Figure 6 In general slopes control the infiltration and flowability of the surface water Usually the steep slope indicatesgreater water velocity Therefore it is observed that in theareas of steeper relief the runoff increases while minimizingthe groundwater recharge On the contrary on the relativelygentle sloping terrain the groundwater potentiality increasesdue to greater infiltrationThus lower slope results in greaterrecharge

Flow accumulation reflects the upstream flow quantity Inthe study area the supply source is mainly the melting snowFlow accumulation was derived from DEM for generating astream network It can be seen from Figure 7 that the studyarea has a dendritic pattern for drainage The dendritic net-work is usually found in region underlain with homogeneoussurface without abrupt changes in geological conditions

The river density [25] represents the recharge conditionsto quantify the influence caused by surface water wherehigher density provides better recharge conditions Based on

Mathematical Problems in Engineering 5

0 5 10 20

N

79∘50

0E 80

∘00E 80

∘10

0E 80

∘20

0E

33∘00

N33∘100

N

(km)

gt1000

79∘40

0E79

∘30

0E

33∘200

N

Figure 7 Flow accumulation map of the study area

0 5 10 20

N

79∘50

0E 80

∘00E 80

∘10

0E 80

∘20

0E

33∘00

N33∘100

N

(km)

Low 0

High 114

(km

km

2 )

79∘40

0E79

∘30

0E

33∘200

N

Figure 8 River density map of the study area

the flow accumulation the river density was calculated usingthe ldquoline densityrdquo command in ArcGIS 101 as shown inFigure 8

24 Research Method The decision tree algorithms [16]are suitable for the multifactor classification problem Forthe mixture of the continuous and discrete factors in thegroundwater potential evaluation model C50 and CARTdecision tree algorithms were adopted together with the 10-fold cross validation method to improve the classificationaccuracy and unbiasedness

241 C50 The C50 decision tree algorithm is rooted inID3 and C45 The ID3 algorithm [26] with the maximumgain as the division standard aims to achieve the maximumof the information gain in every node which often givespriority to the variables with more classes To make upfor the defect of ID3 C45 adopts the gain-ratio criterion[19 27 28] With the gain-ratio criterion the binary nodes

divide the continuous variable The processing method forcategorical variables is firstly to refer each of the categoriesas a branch and then merge each two branches iterativelyuntil the two branches remain However the heuristic searchmay not find the best point for the categorical variablesdivision For continuous variables C50 algorithm can easilyfind the division point [29] To get rid of the biasedness onthe continuous variables the algorithm improves the gainof the continuous variables Besides C50 algorithm cansimplify the originally complex decision tree to the equivalenttree for the easy understanding With more splitting layersthan other algorithms it can ensure the high purity of theresult nodes C50 algorithm achieves the self-correction afterseveral iterations with boosting technology [30]The artificialmethods lay emphasis on the sole impact of various factorshowever C50 algorithm can consider all the factors toanalyze the comprehensive influence based on data statisticsC50 algorithm has been widely applied to the multivariateclassification for its unbiasedness and precision towards thecontinuous and categorical variables compared to other deci-sion tree algorithms [17 31ndash37] To improve the classificationaccuracy boosting technology was applied in C50 decisiontree algorithm and could adjust the decision tree according tothe fault samples until reaching a high precision [38] BesidesC50 algorithm is more applicable to the large data samples

Assume the splitting node119879 is expected to separate the |119879|samples into119870 target categories [27]The symbol119901119879(119894) standsfor the percentage of category 119894 at node 119879 119894 = 1 2 119870 Theinclusive information at node 119879 can be expressed as

Info (119879) = minus 119870sum119894=1

119901119879 (119894) times log2 [119901119879 (119894)] (1)

For the decision tree branch the samples are divided into1198791 1198792 119879119899 by the node 119879 with the subsamples of |1198791| |1198792|and |119879119899| respectivelyThen the information can be expressedas

Info (119883 119879) = 119899sum119894=1

10038161003816100381610038161198791198941003816100381610038161003816|119879| times Info (119879119894) (2)

After combining the above information formulas theinformation gain can be expressed as

Gain (119883 119879) = Info (119879) minus Info (119883 119879) (3)To improve the application of the information gain

concept information gain ratio was put forward

GainRatio (119883 119879) = Gain (119883 119879)SplitInfo (119883 119879) (4)

where

SplitInfo (119883 119879) = minus 119899sum119894=1

10038161003816100381610038161198791198941003816100381610038161003816|119879| times log2 (10038161003816100381610038161198791198941003816100381610038161003816|119879| ) (5)

The practice showed that the information gain ratio wasmore preferred to the continuous variable therefore theinformation gain ratio for the continuous variable with 119899distinct values should be expressed as [16]

Gain (119883 119879) = Info (119879) minus Info (119883 119879) minus log2 (119899 minus 1)|119879| (6)

6 Mathematical Problems in Engineering

Table 2 The precision of each loop based on the 10-fold cross validation ()

1 2 3 4 5 6 7 8 9 10 Averageaccuracy

C50 8333 10000 8125 8986 9025 8729 9450 9421 8750 9628 9045CART 8057 8921 9223 8534 8123 100 8012 7824 8157 8234 8509

Table 3 The importance of each factor based on C50 and CART algorithm

Lithology Topology River density Slope Lineament densityC50 0363 0331 0159 0117 0031CART 0355 0308 0024 0010 0312

242 CART CART decision tree algorithm [39] can dividethe sample set into two subsample sets making the root andintermediate nodes with two branches based on the recur-sively binary segmentation technology CART can handleboth continuous and discrete variables with the impuritylevel-Gini coefficient as the discriminant basis consideringthe probability distribution under the division node

Assume a total of 119870 classes variable 119883 and node for 119879then the Gini index is defined as [16]

Gini (119879) = 119870sum119894=1

119901119879 (119894) (1 minus 119901119879 (119894)) = 1 minus119870sum119894=1

1199012119879 (119894) (7)

When the classes show the equal probability in the node119879 the Gini index achieves the maximum 1 minus 1119870 with onlyone kind in node 119879 the Gini index achieves the minimumThe Gini index increases with the impurity therefore thesubnodes should be added to lower the impurity Whentaking the misclassification cost matrix into considerationthe Gini index formula becomes

Gini (119879) = sum119895 =119894

119862 (119894 | 119895) 119901119879 (119894) 119901119879 (119895) (8)

where 119862(119894 | 119895) represents the cost for misclassifying the casecategory 119895 as 119894 Assuming that the subnode 119878 added to node119879 the Gini index can be expressed as

Gini (119878 119879) = Gini (119879) minus 119901119871Gini (119879119871) minus 119901119877Gini (119879119877) (9)

where 119901119871 119901119877 stand for the proportion of cases in node 119879classified into 119879119871 and 119879119877243 10-Fold Cross Validation The basic idea for the 119896-foldcross validation [40 41] is to equally divide the surveyedsamples into 119896 parts of which 119896 minus 1 parts served as thetraining samples with the remaining one part for validationwith 119896 circulations The method can guarantee every sampleacted as the training sample and the verification sample forone time in the circulations For the 119896-fold cross validationthe commonly used value for 119896 is 10 called 10-fold crossvalidation [42]

3 Results and Discussion

In this study five groundwater relating factors were used inthe analysis and the factors except lithology were continuous

80 tube wells were utilized for training with C50 andCART respectively in the statistical analysis software SPSSClementine 120 and MATLAB According to the confusionmatrix [43] constructed by the verification result the kappacoefficient [44 45] is used to evaluate the accuracy Based onthe 10-fold cross validation with C50 and CART algorithmrespectively the precision of each loop is shown in Table 2The importance of each factor was calculated firstly to deter-mine the selection order of the factors for classification asshown in Table 3 For the C50 algorithm the importance was0363 0331 0159 0117 and 0031 respectively for lithologytopology river density slope and lineament density For theCART algorithm the importance was 0355 0308 00240010 and 0312 respectively After the ten loops the decisiontree with the higher classification accuracy was chosen asoptimal Figure 9 shows the optimal decision tree generatedby C50 algorithm with 6 layers 21 nodes 10 internal nodesand 11 terminal nodes Table 4 shows the rules for the optimaldecision tree by C50 The optimal decision tree generated byCARTalgorithm is shown in Figure 10 with 8 layers 21 nodes11 internal nodes and 10 terminal nodes and the rules areshown in Table 5

Kappa = 119873sum119903119894=1 119909119894119894 minus sum119903119894=1 119909119894+119909+1198941198732 minus sum119903119894=1 119909119894+119909+119894 (10)

where 119903 is the number of the total categories 119873 is the totalnumber for verification 119909119894119894 is the number of the correctclassifications 119909119894+ is the number of samples mistaken fromthe category 119894 for others 119909+119894 is the number of samplesmistaken from other categories for 119894

According to the classification rules we can see that notevery rule takes all the variables into account therefore forthe area lack of the detailed information the typical factorscan help to predict the groundwater potential classificationgrade The decision trees show that both the two algorithmscan determine the dividing point of the variables especiallyfor the continuous variables based on the training data whichmakes the division of the interval more scientific and reducesthe segmentation error compared with the artificial division

According to the decision tree result generated by C50algorithm we did deep analysis Topology was dividedby six nodes 4196ndash4301ndash43165ndash4357ndash44005ndash6240 and thegroundwater was distributed in the low-lying areas Slopecontained two ranges 0ndash243ndash75 and the flat areas werebeneficial to the surface water infiltration River density was

Mathematical Problems in Engineering 7

Table 4 Rules for groundwater potential grade based on C50

Grade Rule

Very goodLithology Q 4196 lt topology le 4301 lineament density le 0558Lithology Q 4196 lt topology le 43165 lineament density gt 0558Lithology Q 43165 lt topology le 44005 river density gt 0784 slope le 243

Good

Lithology Q 4301 lt topology le 43165 lineament density le 0558Lithology Q 43165 lt topology le 4357 river density le 0784Lithology Q 4357 lt topology le 44005 river density gt 0784 slope gt 243Lithology Q 44005 lt topology river density gt 0650

ModerateLithology others except Q river density gt 0506Lithology Q 44005 lt topology river density le 0650Lithology Q 4357 lt topology le 44005 river density le 0784

Poor Lithology others except Q river density le 05

RD

L

T

Others

T

Q

P M

RD RD

M GT S

VG G

LD

T VG

VG G

G M

le0506 gt0506 le43165 gt43165

le0558 gt0558

le4301 gt4301

le4357 gt4357

gt0784

le44005 gt44005

le0650 gt0650

le243 gt243

le0784

Factor GradeL lithologyLD lineament densityT topologyS slopeRD river density

VG very goodG goodM moderateP poor

Figure 9 The optimal decision tree generated by C50 algorithm

classified into four intervals 0ndash0506ndash0650ndash0784ndash114 andreflected the flow capacity in the region and the higherthe better for groundwater enrichment Lineament densitywas divided into two intervals 0ndash0558ndash1 and the moreoccurrence space for groundwater existed with the higherlineament density However for the CART algorithm topol-ogy was divided into three intervals 4196ndash4304ndash4400ndash6240slope contained two ranges 0ndash2386ndash75 river density was

classified into three intervals 0ndash0685ndash0703ndash114 lineamentdensitywas divided into four intervals 0ndash0296ndash0335ndash0713ndash1 After applying the optimal decision trees respectively tothe whole study area [6 41] the groundwater potential zonemaps were derived and shown in Figures 11 and 12

According to the results generated by the decision treesthe ldquovery goodrdquo area is mostly located in the broad plainzone with a patchy distribution covering 10325 km2 about

8 Mathematical Problems in Engineering

Table 5 Rules for groundwater potential grade based on CART

Grade Rule

Very goodRiver density gt 0703 4304 lt topology le 4400 lineament density gt 0713Lineament density gt 0296 topology le 4304Lineament density gt 0335 4304 lt topology le 4400 river density gt 0703 slope le 2386

GoodRiver density lt 0703 4304 lt topology le 4400 lineament density gt 03350335 lt lineament density le 0713 4304 lt topology le 4400 river density gt 0703 slope gt 2386Lineament density gt 0296 topology gt 4400 river density gt 0685

ModerateLineament density le 0296 lithology Q0296 lt lineament density le 0335 4304 lt topology le 4400Lineament density gt 0296 topology gt 4400 river density le 0685

Poor Lineament density le 0296 lithology others except Q

LD

L T

RDP M

Others Q

M G

T

VG LD

M RD

G LD

S VG

VG G

le0296 gt0296

le4400 gt4400

le4304 gt4304 le0685 gt0685

le0335 gt0335

le0703 gt0703

le0713 gt0713

gt2386

Factor GradeL lithologyLD lineament densityT topologyS slopeRD river density

VG very goodG goodM moderateP poor

le2386

Figure 10 The optimal decision tree generated by CART algorithm

Mathematical Problems in Engineering 9

5

N

79∘50

0E 80

∘00E 80

∘10

0E 80

∘20

0E

33∘00

N33∘100

N

0 10 20(km)

Very goodGoodModerate

PoorSurveyed tube wells

79∘40

0E79

∘30

0E

33∘200

N

Figure 11 Groundwater potential map of the study area based onC50

0 5 10 20

N

79∘50

0E 80

∘00E 80

∘10

0E 80

∘20

0E

33∘00

N33∘100

N

(km)

Very goodGoodModerate

PoorSurveyed tube wells

79∘40

0E79

∘30

0E

33∘200

N

Figure 12 Groundwater potential map of the study area based onCART

461 of the study area for C50 algorithm and 10505 km2about 468 of the study area for CART algorithm for thewater infiltration into the underground with sufficient timeand space The study shows that low-lying areas with goodflow condition well-developed stratigraphic gap and strongconnectivity like the southwest beaches of Ban Gong Lakecan be the first target for groundwater resources The ldquogoodrdquoarea was distributed along the river with 19210 km2 covering858 for C50 algorithm and 22602 km2 about 1009 ofthe study area for CART algorithm just like a long stripThe zones mainly had a planar distribution on the bottomof the diluvia fan on the low mountain and hilly terrainand a banded distribution on the mountain watershed ofpluvial valleys situated upstream and on the peripheryof ldquovery goodrdquo area which can serve as the candidate

target for groundwater exploration The ldquomoderaterdquo zonewith 59551 km2 occupying 2659 for C50 algorithmand 58453 km2 about 2610 of the study area for CARTalgorithm spreads around the upstream tributaries Thezone is located on both sides of the river valleys and the topof the diluvia fan The ldquopoorrdquo area occupies 6023 withthe area of 134914 km2 for C50 algorithm and 132440 km2about 5913 of the study area for CART algorithm which ismostly a mountainous region with a high altitude

4 Conclusions

In this study C50 and CART algorithms were applied for thedecision tree generation to predict the groundwater potentialzone with the five relating factors and the 10-fold crossvalidation method was adopted to verify the classificationresult with the kappa coefficient From this paper we candraw some conclusions as follows(1) In the study area the five groundwater relating fac-tors lithology topology slope river density and lineamentdensity were appropriate for the groundwater potential gradeprediction and the importance based on C50 algorithm was0363 0331 0159 0117 and 003 respectively for CARTalgorithm the importance was 0355 0308 0024 0010 and0312 respectively(2) Based on the 10-fold cross validation both C50 andCART could be applied for MCDM with the categoricaland continuous variables simultaneously with the averageaccuracy of 9045 and 8509 respectively however C50algorithm showed higher classification accuracy than CARTalgorithm(3) After applying the optimal decision trees to the wholestudy area respectively the groundwater potential zone mapwas delineated and the four grades of groundwater poten-tial zones ldquovery goodrdquo ldquogoodrdquo ldquomoderaterdquo and ldquopoorrdquooccupied the area of 10325 km2 19210 km2 59551 km2 and134914 km2 with the percentages of 461 858 2659and 6023 respectively for C50 and for CART the areaof 10505 km2 22602 km2 58453 km2 and 132440 km2with the percentages of 468 1009 2610 and 5913respectively

The study result can provide reference for groundwaterexploration and in the future work we will consider morerelating factors and survey more wells to enrich the modelThe integration of decision tree algorithms and MCDM inour study applies only to the qualitative assessment for thelack of the prior knowledge in the large area therefore theextra analysis is needed for the specific point investigationThe accuracy demonstrates that the 10-fold cross validation issuitable for training and verifying the decision tree howeverthe tested dataset is limited and more tube wells should beinvestigated to validate the stability of the model

Competing Interests

The authors declare that there is no conflict of interests in thisresearch work

10 Mathematical Problems in Engineering

Acknowledgments

This work was supported by Development Program of ChinaGroundwater Exploration Technology in the Water ShortageRegion (863 Program 2012AA062601)

References

[1] D Oikonomidis S Dimogianni N Kazakis and K VoudourisldquoA GISRemote Sensing-based methodology for groundwaterpotentiality assessment in Tirnavos area Greecerdquo Journal ofHydrology vol 525 pp 197ndash208 2015

[2] D Pinto S Shrestha M S Babel and S Ninsawat ldquoDelineationof groundwater potential zones in the Comoro watershedTimor Leste using GIS remote sensing and analytic hierarchyprocess (AHP) techniquerdquo Applied Water Science 2015

[3] O A Fashae M N Tijani A O Talabi and O I AdedejildquoDelineation of groundwater potential zones in the crystallinebasement terrain of SW-Nigeria an integrated GIS and remotesensing approachrdquoAppliedWater Science vol 4 no 1 pp 19ndash382014

[4] JMallick C K SinghHAl-Wadi et al ldquoGeospatial and geosta-tistical approach for groundwater potential zone delineationrdquoHydrological Processes vol 29 no 3 pp 395ndash418 2015

[5] S Lee K-Y Song Y Kim and I Park ldquoRegional groundwaterproductivity potential mapping using a geographic informationsystem (GIS) based artificial neural network modelrdquo Hydroge-ology Journal vol 20 no 8 pp 1511ndash1527 2012

[6] S Lee and C-W Lee ldquoApplication of decision-tree model togroundwater productivity-potential mappingrdquo Sustainabilityvol 7 no 10 pp 13416ndash13432 2015

[7] AOzdemir ldquoUsing a binary logistic regressionmethod andGISfor evaluating andmapping the groundwater spring potential inthe Sultan Mountains (Aksehir Turkey)rdquo Journal of Hydrologyvol 405 no 1-2 pp 123ndash136 2011

[8] A Ozdemir ldquoGIS-based groundwater spring potential mappingin the SultanMountains (Konya Turkey) using frequency ratioweights of evidence and logistic regression methods and theircomparisonrdquo Journal of Hydrology vol 411 no 3-4 pp 290ndash308 2011

[9] D Machiwal M K Jha and B C Mal ldquoAssessment ofgroundwater potential in a semi-arid region of india usingremote sensing GIS and MCDM techniquesrdquo Water ResourcesManagement vol 25 no 5 pp 1359ndash1386 2011

[10] S A Naghibi and H R Pourghasemi ldquoA comparative assess-ment between three machine learning models and their per-formance comparison by bivariate and multivariate statisticalmethods in groundwater potential mappingrdquo Water ResourcesManagement vol 29 no 14 pp 5217ndash5236 2015

[11] D Zheng-Dong Y Xin L Fan et al ldquoConstruction andinvestigation of groundwater remote sensing fuzzy assessmentindexrdquo Chinese Journal of Geophysics vol 56 no 11 pp 3908ndash3916 2013

[12] O FAlthuwaynee B PradhanH-J Park and JH Lee ldquoAnovelensemble decision tree-based CHi-squared Automatic Interac-tion Detection (CHAID) and multivariate logistic regressionmodels in landslide susceptibility mappingrdquo Landslides vol 11no 6 pp 1063ndash1078 2014

[13] I Park and S Lee ldquoSpatial prediction of landslide susceptibilityusing a decision tree approach a case study of the Pyeongchangarea Koreardquo International Journal of Remote Sensing vol 35 no16 pp 6089ndash6112 2014

[14] C Baker R Lawrence C Montagne and D Patten ldquoMappingwetlands and riparian areas using landsat ETM+ imagery anddecision-tree-based modelsrdquo Wetlands vol 26 no 2 pp 465ndash474 2006

[15] R Kohavi and J R Quinlan ldquoData mining tasks and methodsclassification decision-tree discoveryrdquo in Handbook of DataMining and Knowledge Discovery pp 267ndash276 Oxford Univer-sity Press New York NY USA 2002

[16] N Lin D Noe and X He ldquoTree-based methods and theirapplicationsrdquo in Springer Handbook of Engineering Statistics pp551ndash570 Springer London UK 2006

[17] I Klein U Gessner and C Kuenzer ldquoRegional land covermapping and change detection in Central Asia using MODIStime-seriesrdquo Applied Geography vol 35 no 1-2 pp 219ndash2342012

[18] J R Quinlan ldquoImproved use of continuous attributes in C4 5rdquoJournal of Artificial Intelligence Research vol 4 pp 77ndash90 1996

[19] S B Kotsiantis ldquoSupervised machine learning a review ofclassification techniquesrdquo Informatica vol 31 no 3 pp 249ndash268 2007

[20] K Polat and S Gunes ldquoA novel hybrid intelligent method basedon C45 decision tree classifier and one-against-all approachfor multi-class classification problemsrdquo Expert Systems withApplications vol 36 no 2 pp 1587ndash1592 2009

[21] M K Jha V M Chowdary and A Chowdhury ldquoGroundwaterassessment in Salboni Block West Bengal (India) using remotesensing geographical information system and multi-criteriadecision analysis techniquesrdquoHydrogeology Journal vol 18 no7 pp 1713ndash1728 2010

[22] V B Rekha A P Thomas M Suma and H Vijith ldquoAnintegration of spatial information technology for groundwa-ter potential and quality investigations in Koduvan Ar Sub-Watershed of Meenachil River Basin Kerala Indiardquo Journal ofthe Indian Society of Remote Sensing vol 39 no 1 pp 63ndash712011

[23] O Rahmati ANazari SamaniMMahdavi H R Pourghasemiand H Zeinivand ldquoGroundwater potential mapping at Kurdis-tan region of Iran using analytic hierarchy process and GISrdquoArabian Journal of Geosciences vol 8 no 9 pp 7059ndash7071 2014

[24] K R Preeja S Joseph J Thomas and H Vijith ldquoIdentificationof groundwater potential zones of a tropical river basin (KeralaIndia) using remote sensing and GIS techniquesrdquo Journal of theIndian Society of Remote Sensing vol 39 no 1 pp 83ndash94 2011

[25] K Narendra K Nageswara Rao and P Swarna Latha ldquoIntegrat-ing remote sensing and GIS for identification of groundwaterprospective zones in the Narava basin Visakhapatnam regionAndhra Pradeshrdquo Journal of the Geological Society of India vol81 no 2 pp 248ndash260 2013

[26] M Ture F Tokatli and I Kurt ldquoUsing KaplanndashMeier analysistogether with decision tree methods (CampRT CHAID QUESTC45 and ID3) in determining recurrence-free survival of breastcancer patientsrdquo Expert Systems with Applications vol 36 no 2pp 2017ndash2026 2009

[27] N A AL-Fakhry ldquoSummarizing data by using data miningtechniques a comparative by using C45 and C50 algorithmsrdquoInternational Education and Research Journal vol 2 no 4 2016

[28] R Kohavi and J R Quinlan ldquoData mining tasks and methodsclassification decision-tree discoveryrdquo in Handbook of DataMining and Knowledge Discovery pp 267ndash276 Oxford Univer-sity Press 2002

Mathematical Problems in Engineering 11

[29] U M Fayyad and K B Irani ldquoOn the handling of continuous-valued attributes in decision tree generationrdquoMachine Learningvol 8 no 1 pp 87ndash102 1992

[30] J R Quinlan ldquoBagging boosting and C4 5rdquo AAAIIAAI vol1 pp 725ndash730 1996

[31] S Pang and J C Gong ldquoC5 0 classification algorithm andapplication on individual credit evaluation of banksrdquo SystemsEngineering-Theory amp Practice vol 29 no 12 pp 94ndash104 2009

[32] M A Friedl C E Brodley and A H Strahler ldquoMaximizingland cover classification accuracies produced by decision treesat continental to global scalesrdquo IEEE Transactions on Geoscienceand Remote Sensing vol 37 no 2 pp 969ndash977 1999

[33] C J Moran and E N Bui ldquoSpatial data mining for enhancedsoil map modellingrdquo International Journal of GeographicalInformation Science vol 16 no 6 pp 533ndash549 2002

[34] X Li andA G-O Yeh ldquoDatamining of cellular automatarsquos tran-sition rulesrdquo International Journal of Geographical InformationScience vol 18 no 8 pp 723ndash744 2004

[35] P H Williams R Eyles and G Weiller ldquoPlant microRNAprediction by supervised machine learning using C50 decisiontreesrdquo Journal of Nucleic Acids vol 2012 Article ID 652979 10pages 2012

[36] N Patil R Lathi and V Chitre ldquoCustomer card classificationbased on C5 0 amp CART algorithmsrdquo International Journal ofEngineering Research and Applications vol 2 no 4 pp 164ndash1672012

[37] M M Javidi and E F Roshan ldquoSpeech emotion recognition byusing combinations of C50 neural network (NN) and supportvector machines (SVM) classification methodsrdquo Journal ofMathematics and Computer Science vol 6 pp 191ndash200 2013

[38] E Bauer and R Kohavi ldquoAn empirical comparison of vot-ing classification algorithms bagging boosting and variantsrdquoMachine Learning vol 36 no 1-2 pp 105ndash139 1999

[39] Y Shao and R S Lunetta ldquoComparison of support vectormachine neural network and CART algorithms for the land-cover classification using limited training data pointsrdquo ISPRSJournal of Photogrammetry and Remote Sensing vol 70 pp 78ndash87 2012

[40] Y Bengio andYGrandvalet ldquoNounbiased estimator of the vari-ance of k-fold cross-validationrdquo Journal of Machine LearningResearch vol 5 pp 1089ndash1105 2004

[41] C Chen B He and Z Zeng ldquoAmethod formineral prospectiv-ity mapping integrating C45 decision tree weights-of-evidenceandm-branch smoothing techniques a case study in the easternKunlunMountains Chinardquo Earth Science Informatics vol 7 no1 pp 13ndash24 2014

[42] M van der Gaag T Hoffman M Remijsen et al ldquoThe five-factor model of the positive and negative syndrome scale IIa ten-fold cross-validation of a revised modelrdquo SchizophreniaResearch vol 85 no 1ndash3 pp 280ndash287 2006

[43] M Al Saud ldquoMapping potential areas for groundwater storagein Wadi Aurnah Basin western Arabian Peninsula usingremote sensing and geographic information system techniquesrdquoHydrogeology Journal vol 18 no 6 pp 1481ndash1495 2010

[44] R G Congalton ldquoA review of assessing the accuracy of classifi-cations of remotely sensed datardquo Remote Sensing of Environ-ment vol 37 no 1 pp 35ndash46 1991

[45] F K Hoehler ldquoBias and prevalence effects on kappa viewed interms of sensitivity and specificityrdquo Journal of Clinical Epidemi-ology vol 53 no 5 pp 499ndash503 2000

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 3: Research Article Assessment of Groundwater Potential Based ...downloads.hindawi.com/journals/mpe/2016/2064575.pdf · Figure . In general, slopes control the inltration and ow ability

Mathematical Problems in Engineering 3

Geology Map DEM

Lithology Slope

Lineament density

Groundwater potential zone map

Riverdensity

Categorical ContinuousContinuous

Accuracy comparison

Surveyed Tube Wells

Gradeclassification

Landsat 8 OLI

Topography

Decision tree model generation with decision treealgorithm (50 24) and 10-fold cross validation

Figure 2 Flow chart for mapping groundwater potential

Table 1 Groundwater potential grade criteria on the tube well yield

Grade Very good Good Moderate PoorYield (119905119889) gt800 300ndash800 100ndash300 lt100Number 21 25 18 16

Library DEM from ASTER satellite with the horizon distri-bution of 30m and Landsat 8 OLI image with the acquisitiondate 5222013 downloaded from httpswwwusgsgov andcloud cover 243 sun elevation 6806 sun azimuth 11995and tube wells yield data with field survey 80 investigatedtubewells were divided into four grades according to the yield(Table 1) [21]

22 Methodological Framework The study on the ground-water potential zone delineation is carried out from theperspective of hydrogeology considering the occurrencespace and supply condition Lithology and lineament densityare chosen as the occurrence space factor topology slopeand river density are related to the groundwater supplycondition Lithology is the categorical variable and the restof the variables are continuous After the analysis betweenthe factors and groundwater potential grade C50 and CARTalgorithmswere respectively applied to generate the decisiontree and the 10-fold cross validation was adopted to test theclassification accuracy The specific technical route is shownin Figure 2

23 Factors Related to Groundwater Potential Lithology [22]influences the water-holding capacity of aquifer and directlyaffects the occurrence and distribution of groundwater Thelithology thematic map was derived through digitizing the1 250000 scale geology map from the National GeologicalLibrary as shown in Figure 3 The Quaternary (Q) includingthree kinds of sedimentary type alluvial diluvia and lacus-trine with the distribution in low-lying area belongs to theloose sediment for themelting snowwater that flows from thehigh mountains The Jurassic (J) layers are distributed widelyalong the east-west direction with the modern (J3) middle(J2) and early age (J1) respectively lying in the middlesouth and mid-north The Modern Cretaceous (K2) spreadsalong the north-south direction The Early Cretaceous (K1)is mostly distributed in the north along the east-west ThePaleogene (E) lies mostly in the west with the scatterdistribution in the central and eastern part

The linear faults accompanied by the cranny providespace for the occurrence of groundwater [23] In the stratumwith the same lithology the intersection of the faults leadsto development of the cranny which tends to be the ground-water enrichment zone with the connectivity enhancementThe linear structures are extracted from the satellite imagebased on the discontinuity of the color from the surroundingareas Orthographical correction is applied to Landsat 8 OLIimage with DEM to eliminate the shadowrsquos influence on thevisual interpretation in the study area Combination of bands753 (SWIR 2near infraredgreen) proved to be the mostsuitable for the extraction with the geology map in ENVI 51

4 Mathematical Problems in Engineering

0 5 10 20

N

79∘50

0E 80

∘00E 80

∘10

0E 80

∘20

0E

33∘00

N33∘100

N

QJ3J2J1

K2K1E

(km)

79∘40

0E79

∘30

0E

33∘200

N

Figure 3 Lithology map of the study area

0 5 10 20

N

79∘50

0E 80

∘00E 80

∘10

0E 80

∘20

0E

33∘00

N33∘100

N

(km)

High 1Low 0

(km

km

2 )

79∘40

0E79

∘30

0E

33∘200

N

Figure 4 Lineament density map of the study area

The lineament density shown in Figure 4 was calculated inArcGIS 101 with ldquoline densityrdquo command

Topography controls the groundwater supply conditionsThe mountainous region provides better runoff conditionsand most of the precipitation is accounted for in the surfacerunoff withminimum infiltration to the groundwater On theother hand precipitation in plains provides slower runoff andfacilitates groundwater recharge Topography map is shownin Figure 5The topography map with 30m spatial resolutionwas extracted from DEM data in ArcGIS 101

Groundwater flow is usually driven by surface force andthe boundary of the terrain is mostly the boundary of theshallow aquifer Slope [24] is important in analyzing theterrain as it can affect the groundwater in terms of its storageflow and discharge especially in mountainous areas Slopewas extracted from the DEM in ArcGIS 101 and is shown in

0 5 10 20

N

79∘50

0E 80

∘00E 80

∘10

0E 80

∘20

0E

33∘00

N33∘100

N

(km)

High 6240Low 4196

79∘40

0E79

∘30

0E

33∘200

N

(m)

Figure 5 Topography map of the study area

0 5 10 20

N

79∘50

0E 80

∘00E 80

∘10

0E 80

∘20

0E

33∘00

N33∘100

N

(km)

High 75Low 0

79∘40

0E79

∘30

0E

33∘200

N

(∘)

Figure 6 Slope map of the study area

Figure 6 In general slopes control the infiltration and flowability of the surface water Usually the steep slope indicatesgreater water velocity Therefore it is observed that in theareas of steeper relief the runoff increases while minimizingthe groundwater recharge On the contrary on the relativelygentle sloping terrain the groundwater potentiality increasesdue to greater infiltrationThus lower slope results in greaterrecharge

Flow accumulation reflects the upstream flow quantity Inthe study area the supply source is mainly the melting snowFlow accumulation was derived from DEM for generating astream network It can be seen from Figure 7 that the studyarea has a dendritic pattern for drainage The dendritic net-work is usually found in region underlain with homogeneoussurface without abrupt changes in geological conditions

The river density [25] represents the recharge conditionsto quantify the influence caused by surface water wherehigher density provides better recharge conditions Based on

Mathematical Problems in Engineering 5

0 5 10 20

N

79∘50

0E 80

∘00E 80

∘10

0E 80

∘20

0E

33∘00

N33∘100

N

(km)

gt1000

79∘40

0E79

∘30

0E

33∘200

N

Figure 7 Flow accumulation map of the study area

0 5 10 20

N

79∘50

0E 80

∘00E 80

∘10

0E 80

∘20

0E

33∘00

N33∘100

N

(km)

Low 0

High 114

(km

km

2 )

79∘40

0E79

∘30

0E

33∘200

N

Figure 8 River density map of the study area

the flow accumulation the river density was calculated usingthe ldquoline densityrdquo command in ArcGIS 101 as shown inFigure 8

24 Research Method The decision tree algorithms [16]are suitable for the multifactor classification problem Forthe mixture of the continuous and discrete factors in thegroundwater potential evaluation model C50 and CARTdecision tree algorithms were adopted together with the 10-fold cross validation method to improve the classificationaccuracy and unbiasedness

241 C50 The C50 decision tree algorithm is rooted inID3 and C45 The ID3 algorithm [26] with the maximumgain as the division standard aims to achieve the maximumof the information gain in every node which often givespriority to the variables with more classes To make upfor the defect of ID3 C45 adopts the gain-ratio criterion[19 27 28] With the gain-ratio criterion the binary nodes

divide the continuous variable The processing method forcategorical variables is firstly to refer each of the categoriesas a branch and then merge each two branches iterativelyuntil the two branches remain However the heuristic searchmay not find the best point for the categorical variablesdivision For continuous variables C50 algorithm can easilyfind the division point [29] To get rid of the biasedness onthe continuous variables the algorithm improves the gainof the continuous variables Besides C50 algorithm cansimplify the originally complex decision tree to the equivalenttree for the easy understanding With more splitting layersthan other algorithms it can ensure the high purity of theresult nodes C50 algorithm achieves the self-correction afterseveral iterations with boosting technology [30]The artificialmethods lay emphasis on the sole impact of various factorshowever C50 algorithm can consider all the factors toanalyze the comprehensive influence based on data statisticsC50 algorithm has been widely applied to the multivariateclassification for its unbiasedness and precision towards thecontinuous and categorical variables compared to other deci-sion tree algorithms [17 31ndash37] To improve the classificationaccuracy boosting technology was applied in C50 decisiontree algorithm and could adjust the decision tree according tothe fault samples until reaching a high precision [38] BesidesC50 algorithm is more applicable to the large data samples

Assume the splitting node119879 is expected to separate the |119879|samples into119870 target categories [27]The symbol119901119879(119894) standsfor the percentage of category 119894 at node 119879 119894 = 1 2 119870 Theinclusive information at node 119879 can be expressed as

Info (119879) = minus 119870sum119894=1

119901119879 (119894) times log2 [119901119879 (119894)] (1)

For the decision tree branch the samples are divided into1198791 1198792 119879119899 by the node 119879 with the subsamples of |1198791| |1198792|and |119879119899| respectivelyThen the information can be expressedas

Info (119883 119879) = 119899sum119894=1

10038161003816100381610038161198791198941003816100381610038161003816|119879| times Info (119879119894) (2)

After combining the above information formulas theinformation gain can be expressed as

Gain (119883 119879) = Info (119879) minus Info (119883 119879) (3)To improve the application of the information gain

concept information gain ratio was put forward

GainRatio (119883 119879) = Gain (119883 119879)SplitInfo (119883 119879) (4)

where

SplitInfo (119883 119879) = minus 119899sum119894=1

10038161003816100381610038161198791198941003816100381610038161003816|119879| times log2 (10038161003816100381610038161198791198941003816100381610038161003816|119879| ) (5)

The practice showed that the information gain ratio wasmore preferred to the continuous variable therefore theinformation gain ratio for the continuous variable with 119899distinct values should be expressed as [16]

Gain (119883 119879) = Info (119879) minus Info (119883 119879) minus log2 (119899 minus 1)|119879| (6)

6 Mathematical Problems in Engineering

Table 2 The precision of each loop based on the 10-fold cross validation ()

1 2 3 4 5 6 7 8 9 10 Averageaccuracy

C50 8333 10000 8125 8986 9025 8729 9450 9421 8750 9628 9045CART 8057 8921 9223 8534 8123 100 8012 7824 8157 8234 8509

Table 3 The importance of each factor based on C50 and CART algorithm

Lithology Topology River density Slope Lineament densityC50 0363 0331 0159 0117 0031CART 0355 0308 0024 0010 0312

242 CART CART decision tree algorithm [39] can dividethe sample set into two subsample sets making the root andintermediate nodes with two branches based on the recur-sively binary segmentation technology CART can handleboth continuous and discrete variables with the impuritylevel-Gini coefficient as the discriminant basis consideringthe probability distribution under the division node

Assume a total of 119870 classes variable 119883 and node for 119879then the Gini index is defined as [16]

Gini (119879) = 119870sum119894=1

119901119879 (119894) (1 minus 119901119879 (119894)) = 1 minus119870sum119894=1

1199012119879 (119894) (7)

When the classes show the equal probability in the node119879 the Gini index achieves the maximum 1 minus 1119870 with onlyone kind in node 119879 the Gini index achieves the minimumThe Gini index increases with the impurity therefore thesubnodes should be added to lower the impurity Whentaking the misclassification cost matrix into considerationthe Gini index formula becomes

Gini (119879) = sum119895 =119894

119862 (119894 | 119895) 119901119879 (119894) 119901119879 (119895) (8)

where 119862(119894 | 119895) represents the cost for misclassifying the casecategory 119895 as 119894 Assuming that the subnode 119878 added to node119879 the Gini index can be expressed as

Gini (119878 119879) = Gini (119879) minus 119901119871Gini (119879119871) minus 119901119877Gini (119879119877) (9)

where 119901119871 119901119877 stand for the proportion of cases in node 119879classified into 119879119871 and 119879119877243 10-Fold Cross Validation The basic idea for the 119896-foldcross validation [40 41] is to equally divide the surveyedsamples into 119896 parts of which 119896 minus 1 parts served as thetraining samples with the remaining one part for validationwith 119896 circulations The method can guarantee every sampleacted as the training sample and the verification sample forone time in the circulations For the 119896-fold cross validationthe commonly used value for 119896 is 10 called 10-fold crossvalidation [42]

3 Results and Discussion

In this study five groundwater relating factors were used inthe analysis and the factors except lithology were continuous

80 tube wells were utilized for training with C50 andCART respectively in the statistical analysis software SPSSClementine 120 and MATLAB According to the confusionmatrix [43] constructed by the verification result the kappacoefficient [44 45] is used to evaluate the accuracy Based onthe 10-fold cross validation with C50 and CART algorithmrespectively the precision of each loop is shown in Table 2The importance of each factor was calculated firstly to deter-mine the selection order of the factors for classification asshown in Table 3 For the C50 algorithm the importance was0363 0331 0159 0117 and 0031 respectively for lithologytopology river density slope and lineament density For theCART algorithm the importance was 0355 0308 00240010 and 0312 respectively After the ten loops the decisiontree with the higher classification accuracy was chosen asoptimal Figure 9 shows the optimal decision tree generatedby C50 algorithm with 6 layers 21 nodes 10 internal nodesand 11 terminal nodes Table 4 shows the rules for the optimaldecision tree by C50 The optimal decision tree generated byCARTalgorithm is shown in Figure 10 with 8 layers 21 nodes11 internal nodes and 10 terminal nodes and the rules areshown in Table 5

Kappa = 119873sum119903119894=1 119909119894119894 minus sum119903119894=1 119909119894+119909+1198941198732 minus sum119903119894=1 119909119894+119909+119894 (10)

where 119903 is the number of the total categories 119873 is the totalnumber for verification 119909119894119894 is the number of the correctclassifications 119909119894+ is the number of samples mistaken fromthe category 119894 for others 119909+119894 is the number of samplesmistaken from other categories for 119894

According to the classification rules we can see that notevery rule takes all the variables into account therefore forthe area lack of the detailed information the typical factorscan help to predict the groundwater potential classificationgrade The decision trees show that both the two algorithmscan determine the dividing point of the variables especiallyfor the continuous variables based on the training data whichmakes the division of the interval more scientific and reducesthe segmentation error compared with the artificial division

According to the decision tree result generated by C50algorithm we did deep analysis Topology was dividedby six nodes 4196ndash4301ndash43165ndash4357ndash44005ndash6240 and thegroundwater was distributed in the low-lying areas Slopecontained two ranges 0ndash243ndash75 and the flat areas werebeneficial to the surface water infiltration River density was

Mathematical Problems in Engineering 7

Table 4 Rules for groundwater potential grade based on C50

Grade Rule

Very goodLithology Q 4196 lt topology le 4301 lineament density le 0558Lithology Q 4196 lt topology le 43165 lineament density gt 0558Lithology Q 43165 lt topology le 44005 river density gt 0784 slope le 243

Good

Lithology Q 4301 lt topology le 43165 lineament density le 0558Lithology Q 43165 lt topology le 4357 river density le 0784Lithology Q 4357 lt topology le 44005 river density gt 0784 slope gt 243Lithology Q 44005 lt topology river density gt 0650

ModerateLithology others except Q river density gt 0506Lithology Q 44005 lt topology river density le 0650Lithology Q 4357 lt topology le 44005 river density le 0784

Poor Lithology others except Q river density le 05

RD

L

T

Others

T

Q

P M

RD RD

M GT S

VG G

LD

T VG

VG G

G M

le0506 gt0506 le43165 gt43165

le0558 gt0558

le4301 gt4301

le4357 gt4357

gt0784

le44005 gt44005

le0650 gt0650

le243 gt243

le0784

Factor GradeL lithologyLD lineament densityT topologyS slopeRD river density

VG very goodG goodM moderateP poor

Figure 9 The optimal decision tree generated by C50 algorithm

classified into four intervals 0ndash0506ndash0650ndash0784ndash114 andreflected the flow capacity in the region and the higherthe better for groundwater enrichment Lineament densitywas divided into two intervals 0ndash0558ndash1 and the moreoccurrence space for groundwater existed with the higherlineament density However for the CART algorithm topol-ogy was divided into three intervals 4196ndash4304ndash4400ndash6240slope contained two ranges 0ndash2386ndash75 river density was

classified into three intervals 0ndash0685ndash0703ndash114 lineamentdensitywas divided into four intervals 0ndash0296ndash0335ndash0713ndash1 After applying the optimal decision trees respectively tothe whole study area [6 41] the groundwater potential zonemaps were derived and shown in Figures 11 and 12

According to the results generated by the decision treesthe ldquovery goodrdquo area is mostly located in the broad plainzone with a patchy distribution covering 10325 km2 about

8 Mathematical Problems in Engineering

Table 5 Rules for groundwater potential grade based on CART

Grade Rule

Very goodRiver density gt 0703 4304 lt topology le 4400 lineament density gt 0713Lineament density gt 0296 topology le 4304Lineament density gt 0335 4304 lt topology le 4400 river density gt 0703 slope le 2386

GoodRiver density lt 0703 4304 lt topology le 4400 lineament density gt 03350335 lt lineament density le 0713 4304 lt topology le 4400 river density gt 0703 slope gt 2386Lineament density gt 0296 topology gt 4400 river density gt 0685

ModerateLineament density le 0296 lithology Q0296 lt lineament density le 0335 4304 lt topology le 4400Lineament density gt 0296 topology gt 4400 river density le 0685

Poor Lineament density le 0296 lithology others except Q

LD

L T

RDP M

Others Q

M G

T

VG LD

M RD

G LD

S VG

VG G

le0296 gt0296

le4400 gt4400

le4304 gt4304 le0685 gt0685

le0335 gt0335

le0703 gt0703

le0713 gt0713

gt2386

Factor GradeL lithologyLD lineament densityT topologyS slopeRD river density

VG very goodG goodM moderateP poor

le2386

Figure 10 The optimal decision tree generated by CART algorithm

Mathematical Problems in Engineering 9

5

N

79∘50

0E 80

∘00E 80

∘10

0E 80

∘20

0E

33∘00

N33∘100

N

0 10 20(km)

Very goodGoodModerate

PoorSurveyed tube wells

79∘40

0E79

∘30

0E

33∘200

N

Figure 11 Groundwater potential map of the study area based onC50

0 5 10 20

N

79∘50

0E 80

∘00E 80

∘10

0E 80

∘20

0E

33∘00

N33∘100

N

(km)

Very goodGoodModerate

PoorSurveyed tube wells

79∘40

0E79

∘30

0E

33∘200

N

Figure 12 Groundwater potential map of the study area based onCART

461 of the study area for C50 algorithm and 10505 km2about 468 of the study area for CART algorithm for thewater infiltration into the underground with sufficient timeand space The study shows that low-lying areas with goodflow condition well-developed stratigraphic gap and strongconnectivity like the southwest beaches of Ban Gong Lakecan be the first target for groundwater resources The ldquogoodrdquoarea was distributed along the river with 19210 km2 covering858 for C50 algorithm and 22602 km2 about 1009 ofthe study area for CART algorithm just like a long stripThe zones mainly had a planar distribution on the bottomof the diluvia fan on the low mountain and hilly terrainand a banded distribution on the mountain watershed ofpluvial valleys situated upstream and on the peripheryof ldquovery goodrdquo area which can serve as the candidate

target for groundwater exploration The ldquomoderaterdquo zonewith 59551 km2 occupying 2659 for C50 algorithmand 58453 km2 about 2610 of the study area for CARTalgorithm spreads around the upstream tributaries Thezone is located on both sides of the river valleys and the topof the diluvia fan The ldquopoorrdquo area occupies 6023 withthe area of 134914 km2 for C50 algorithm and 132440 km2about 5913 of the study area for CART algorithm which ismostly a mountainous region with a high altitude

4 Conclusions

In this study C50 and CART algorithms were applied for thedecision tree generation to predict the groundwater potentialzone with the five relating factors and the 10-fold crossvalidation method was adopted to verify the classificationresult with the kappa coefficient From this paper we candraw some conclusions as follows(1) In the study area the five groundwater relating fac-tors lithology topology slope river density and lineamentdensity were appropriate for the groundwater potential gradeprediction and the importance based on C50 algorithm was0363 0331 0159 0117 and 003 respectively for CARTalgorithm the importance was 0355 0308 0024 0010 and0312 respectively(2) Based on the 10-fold cross validation both C50 andCART could be applied for MCDM with the categoricaland continuous variables simultaneously with the averageaccuracy of 9045 and 8509 respectively however C50algorithm showed higher classification accuracy than CARTalgorithm(3) After applying the optimal decision trees to the wholestudy area respectively the groundwater potential zone mapwas delineated and the four grades of groundwater poten-tial zones ldquovery goodrdquo ldquogoodrdquo ldquomoderaterdquo and ldquopoorrdquooccupied the area of 10325 km2 19210 km2 59551 km2 and134914 km2 with the percentages of 461 858 2659and 6023 respectively for C50 and for CART the areaof 10505 km2 22602 km2 58453 km2 and 132440 km2with the percentages of 468 1009 2610 and 5913respectively

The study result can provide reference for groundwaterexploration and in the future work we will consider morerelating factors and survey more wells to enrich the modelThe integration of decision tree algorithms and MCDM inour study applies only to the qualitative assessment for thelack of the prior knowledge in the large area therefore theextra analysis is needed for the specific point investigationThe accuracy demonstrates that the 10-fold cross validation issuitable for training and verifying the decision tree howeverthe tested dataset is limited and more tube wells should beinvestigated to validate the stability of the model

Competing Interests

The authors declare that there is no conflict of interests in thisresearch work

10 Mathematical Problems in Engineering

Acknowledgments

This work was supported by Development Program of ChinaGroundwater Exploration Technology in the Water ShortageRegion (863 Program 2012AA062601)

References

[1] D Oikonomidis S Dimogianni N Kazakis and K VoudourisldquoA GISRemote Sensing-based methodology for groundwaterpotentiality assessment in Tirnavos area Greecerdquo Journal ofHydrology vol 525 pp 197ndash208 2015

[2] D Pinto S Shrestha M S Babel and S Ninsawat ldquoDelineationof groundwater potential zones in the Comoro watershedTimor Leste using GIS remote sensing and analytic hierarchyprocess (AHP) techniquerdquo Applied Water Science 2015

[3] O A Fashae M N Tijani A O Talabi and O I AdedejildquoDelineation of groundwater potential zones in the crystallinebasement terrain of SW-Nigeria an integrated GIS and remotesensing approachrdquoAppliedWater Science vol 4 no 1 pp 19ndash382014

[4] JMallick C K SinghHAl-Wadi et al ldquoGeospatial and geosta-tistical approach for groundwater potential zone delineationrdquoHydrological Processes vol 29 no 3 pp 395ndash418 2015

[5] S Lee K-Y Song Y Kim and I Park ldquoRegional groundwaterproductivity potential mapping using a geographic informationsystem (GIS) based artificial neural network modelrdquo Hydroge-ology Journal vol 20 no 8 pp 1511ndash1527 2012

[6] S Lee and C-W Lee ldquoApplication of decision-tree model togroundwater productivity-potential mappingrdquo Sustainabilityvol 7 no 10 pp 13416ndash13432 2015

[7] AOzdemir ldquoUsing a binary logistic regressionmethod andGISfor evaluating andmapping the groundwater spring potential inthe Sultan Mountains (Aksehir Turkey)rdquo Journal of Hydrologyvol 405 no 1-2 pp 123ndash136 2011

[8] A Ozdemir ldquoGIS-based groundwater spring potential mappingin the SultanMountains (Konya Turkey) using frequency ratioweights of evidence and logistic regression methods and theircomparisonrdquo Journal of Hydrology vol 411 no 3-4 pp 290ndash308 2011

[9] D Machiwal M K Jha and B C Mal ldquoAssessment ofgroundwater potential in a semi-arid region of india usingremote sensing GIS and MCDM techniquesrdquo Water ResourcesManagement vol 25 no 5 pp 1359ndash1386 2011

[10] S A Naghibi and H R Pourghasemi ldquoA comparative assess-ment between three machine learning models and their per-formance comparison by bivariate and multivariate statisticalmethods in groundwater potential mappingrdquo Water ResourcesManagement vol 29 no 14 pp 5217ndash5236 2015

[11] D Zheng-Dong Y Xin L Fan et al ldquoConstruction andinvestigation of groundwater remote sensing fuzzy assessmentindexrdquo Chinese Journal of Geophysics vol 56 no 11 pp 3908ndash3916 2013

[12] O FAlthuwaynee B PradhanH-J Park and JH Lee ldquoAnovelensemble decision tree-based CHi-squared Automatic Interac-tion Detection (CHAID) and multivariate logistic regressionmodels in landslide susceptibility mappingrdquo Landslides vol 11no 6 pp 1063ndash1078 2014

[13] I Park and S Lee ldquoSpatial prediction of landslide susceptibilityusing a decision tree approach a case study of the Pyeongchangarea Koreardquo International Journal of Remote Sensing vol 35 no16 pp 6089ndash6112 2014

[14] C Baker R Lawrence C Montagne and D Patten ldquoMappingwetlands and riparian areas using landsat ETM+ imagery anddecision-tree-based modelsrdquo Wetlands vol 26 no 2 pp 465ndash474 2006

[15] R Kohavi and J R Quinlan ldquoData mining tasks and methodsclassification decision-tree discoveryrdquo in Handbook of DataMining and Knowledge Discovery pp 267ndash276 Oxford Univer-sity Press New York NY USA 2002

[16] N Lin D Noe and X He ldquoTree-based methods and theirapplicationsrdquo in Springer Handbook of Engineering Statistics pp551ndash570 Springer London UK 2006

[17] I Klein U Gessner and C Kuenzer ldquoRegional land covermapping and change detection in Central Asia using MODIStime-seriesrdquo Applied Geography vol 35 no 1-2 pp 219ndash2342012

[18] J R Quinlan ldquoImproved use of continuous attributes in C4 5rdquoJournal of Artificial Intelligence Research vol 4 pp 77ndash90 1996

[19] S B Kotsiantis ldquoSupervised machine learning a review ofclassification techniquesrdquo Informatica vol 31 no 3 pp 249ndash268 2007

[20] K Polat and S Gunes ldquoA novel hybrid intelligent method basedon C45 decision tree classifier and one-against-all approachfor multi-class classification problemsrdquo Expert Systems withApplications vol 36 no 2 pp 1587ndash1592 2009

[21] M K Jha V M Chowdary and A Chowdhury ldquoGroundwaterassessment in Salboni Block West Bengal (India) using remotesensing geographical information system and multi-criteriadecision analysis techniquesrdquoHydrogeology Journal vol 18 no7 pp 1713ndash1728 2010

[22] V B Rekha A P Thomas M Suma and H Vijith ldquoAnintegration of spatial information technology for groundwa-ter potential and quality investigations in Koduvan Ar Sub-Watershed of Meenachil River Basin Kerala Indiardquo Journal ofthe Indian Society of Remote Sensing vol 39 no 1 pp 63ndash712011

[23] O Rahmati ANazari SamaniMMahdavi H R Pourghasemiand H Zeinivand ldquoGroundwater potential mapping at Kurdis-tan region of Iran using analytic hierarchy process and GISrdquoArabian Journal of Geosciences vol 8 no 9 pp 7059ndash7071 2014

[24] K R Preeja S Joseph J Thomas and H Vijith ldquoIdentificationof groundwater potential zones of a tropical river basin (KeralaIndia) using remote sensing and GIS techniquesrdquo Journal of theIndian Society of Remote Sensing vol 39 no 1 pp 83ndash94 2011

[25] K Narendra K Nageswara Rao and P Swarna Latha ldquoIntegrat-ing remote sensing and GIS for identification of groundwaterprospective zones in the Narava basin Visakhapatnam regionAndhra Pradeshrdquo Journal of the Geological Society of India vol81 no 2 pp 248ndash260 2013

[26] M Ture F Tokatli and I Kurt ldquoUsing KaplanndashMeier analysistogether with decision tree methods (CampRT CHAID QUESTC45 and ID3) in determining recurrence-free survival of breastcancer patientsrdquo Expert Systems with Applications vol 36 no 2pp 2017ndash2026 2009

[27] N A AL-Fakhry ldquoSummarizing data by using data miningtechniques a comparative by using C45 and C50 algorithmsrdquoInternational Education and Research Journal vol 2 no 4 2016

[28] R Kohavi and J R Quinlan ldquoData mining tasks and methodsclassification decision-tree discoveryrdquo in Handbook of DataMining and Knowledge Discovery pp 267ndash276 Oxford Univer-sity Press 2002

Mathematical Problems in Engineering 11

[29] U M Fayyad and K B Irani ldquoOn the handling of continuous-valued attributes in decision tree generationrdquoMachine Learningvol 8 no 1 pp 87ndash102 1992

[30] J R Quinlan ldquoBagging boosting and C4 5rdquo AAAIIAAI vol1 pp 725ndash730 1996

[31] S Pang and J C Gong ldquoC5 0 classification algorithm andapplication on individual credit evaluation of banksrdquo SystemsEngineering-Theory amp Practice vol 29 no 12 pp 94ndash104 2009

[32] M A Friedl C E Brodley and A H Strahler ldquoMaximizingland cover classification accuracies produced by decision treesat continental to global scalesrdquo IEEE Transactions on Geoscienceand Remote Sensing vol 37 no 2 pp 969ndash977 1999

[33] C J Moran and E N Bui ldquoSpatial data mining for enhancedsoil map modellingrdquo International Journal of GeographicalInformation Science vol 16 no 6 pp 533ndash549 2002

[34] X Li andA G-O Yeh ldquoDatamining of cellular automatarsquos tran-sition rulesrdquo International Journal of Geographical InformationScience vol 18 no 8 pp 723ndash744 2004

[35] P H Williams R Eyles and G Weiller ldquoPlant microRNAprediction by supervised machine learning using C50 decisiontreesrdquo Journal of Nucleic Acids vol 2012 Article ID 652979 10pages 2012

[36] N Patil R Lathi and V Chitre ldquoCustomer card classificationbased on C5 0 amp CART algorithmsrdquo International Journal ofEngineering Research and Applications vol 2 no 4 pp 164ndash1672012

[37] M M Javidi and E F Roshan ldquoSpeech emotion recognition byusing combinations of C50 neural network (NN) and supportvector machines (SVM) classification methodsrdquo Journal ofMathematics and Computer Science vol 6 pp 191ndash200 2013

[38] E Bauer and R Kohavi ldquoAn empirical comparison of vot-ing classification algorithms bagging boosting and variantsrdquoMachine Learning vol 36 no 1-2 pp 105ndash139 1999

[39] Y Shao and R S Lunetta ldquoComparison of support vectormachine neural network and CART algorithms for the land-cover classification using limited training data pointsrdquo ISPRSJournal of Photogrammetry and Remote Sensing vol 70 pp 78ndash87 2012

[40] Y Bengio andYGrandvalet ldquoNounbiased estimator of the vari-ance of k-fold cross-validationrdquo Journal of Machine LearningResearch vol 5 pp 1089ndash1105 2004

[41] C Chen B He and Z Zeng ldquoAmethod formineral prospectiv-ity mapping integrating C45 decision tree weights-of-evidenceandm-branch smoothing techniques a case study in the easternKunlunMountains Chinardquo Earth Science Informatics vol 7 no1 pp 13ndash24 2014

[42] M van der Gaag T Hoffman M Remijsen et al ldquoThe five-factor model of the positive and negative syndrome scale IIa ten-fold cross-validation of a revised modelrdquo SchizophreniaResearch vol 85 no 1ndash3 pp 280ndash287 2006

[43] M Al Saud ldquoMapping potential areas for groundwater storagein Wadi Aurnah Basin western Arabian Peninsula usingremote sensing and geographic information system techniquesrdquoHydrogeology Journal vol 18 no 6 pp 1481ndash1495 2010

[44] R G Congalton ldquoA review of assessing the accuracy of classifi-cations of remotely sensed datardquo Remote Sensing of Environ-ment vol 37 no 1 pp 35ndash46 1991

[45] F K Hoehler ldquoBias and prevalence effects on kappa viewed interms of sensitivity and specificityrdquo Journal of Clinical Epidemi-ology vol 53 no 5 pp 499ndash503 2000

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 4: Research Article Assessment of Groundwater Potential Based ...downloads.hindawi.com/journals/mpe/2016/2064575.pdf · Figure . In general, slopes control the inltration and ow ability

4 Mathematical Problems in Engineering

0 5 10 20

N

79∘50

0E 80

∘00E 80

∘10

0E 80

∘20

0E

33∘00

N33∘100

N

QJ3J2J1

K2K1E

(km)

79∘40

0E79

∘30

0E

33∘200

N

Figure 3 Lithology map of the study area

0 5 10 20

N

79∘50

0E 80

∘00E 80

∘10

0E 80

∘20

0E

33∘00

N33∘100

N

(km)

High 1Low 0

(km

km

2 )

79∘40

0E79

∘30

0E

33∘200

N

Figure 4 Lineament density map of the study area

The lineament density shown in Figure 4 was calculated inArcGIS 101 with ldquoline densityrdquo command

Topography controls the groundwater supply conditionsThe mountainous region provides better runoff conditionsand most of the precipitation is accounted for in the surfacerunoff withminimum infiltration to the groundwater On theother hand precipitation in plains provides slower runoff andfacilitates groundwater recharge Topography map is shownin Figure 5The topography map with 30m spatial resolutionwas extracted from DEM data in ArcGIS 101

Groundwater flow is usually driven by surface force andthe boundary of the terrain is mostly the boundary of theshallow aquifer Slope [24] is important in analyzing theterrain as it can affect the groundwater in terms of its storageflow and discharge especially in mountainous areas Slopewas extracted from the DEM in ArcGIS 101 and is shown in

0 5 10 20

N

79∘50

0E 80

∘00E 80

∘10

0E 80

∘20

0E

33∘00

N33∘100

N

(km)

High 6240Low 4196

79∘40

0E79

∘30

0E

33∘200

N

(m)

Figure 5 Topography map of the study area

0 5 10 20

N

79∘50

0E 80

∘00E 80

∘10

0E 80

∘20

0E

33∘00

N33∘100

N

(km)

High 75Low 0

79∘40

0E79

∘30

0E

33∘200

N

(∘)

Figure 6 Slope map of the study area

Figure 6 In general slopes control the infiltration and flowability of the surface water Usually the steep slope indicatesgreater water velocity Therefore it is observed that in theareas of steeper relief the runoff increases while minimizingthe groundwater recharge On the contrary on the relativelygentle sloping terrain the groundwater potentiality increasesdue to greater infiltrationThus lower slope results in greaterrecharge

Flow accumulation reflects the upstream flow quantity Inthe study area the supply source is mainly the melting snowFlow accumulation was derived from DEM for generating astream network It can be seen from Figure 7 that the studyarea has a dendritic pattern for drainage The dendritic net-work is usually found in region underlain with homogeneoussurface without abrupt changes in geological conditions

The river density [25] represents the recharge conditionsto quantify the influence caused by surface water wherehigher density provides better recharge conditions Based on

Mathematical Problems in Engineering 5

0 5 10 20

N

79∘50

0E 80

∘00E 80

∘10

0E 80

∘20

0E

33∘00

N33∘100

N

(km)

gt1000

79∘40

0E79

∘30

0E

33∘200

N

Figure 7 Flow accumulation map of the study area

0 5 10 20

N

79∘50

0E 80

∘00E 80

∘10

0E 80

∘20

0E

33∘00

N33∘100

N

(km)

Low 0

High 114

(km

km

2 )

79∘40

0E79

∘30

0E

33∘200

N

Figure 8 River density map of the study area

the flow accumulation the river density was calculated usingthe ldquoline densityrdquo command in ArcGIS 101 as shown inFigure 8

24 Research Method The decision tree algorithms [16]are suitable for the multifactor classification problem Forthe mixture of the continuous and discrete factors in thegroundwater potential evaluation model C50 and CARTdecision tree algorithms were adopted together with the 10-fold cross validation method to improve the classificationaccuracy and unbiasedness

241 C50 The C50 decision tree algorithm is rooted inID3 and C45 The ID3 algorithm [26] with the maximumgain as the division standard aims to achieve the maximumof the information gain in every node which often givespriority to the variables with more classes To make upfor the defect of ID3 C45 adopts the gain-ratio criterion[19 27 28] With the gain-ratio criterion the binary nodes

divide the continuous variable The processing method forcategorical variables is firstly to refer each of the categoriesas a branch and then merge each two branches iterativelyuntil the two branches remain However the heuristic searchmay not find the best point for the categorical variablesdivision For continuous variables C50 algorithm can easilyfind the division point [29] To get rid of the biasedness onthe continuous variables the algorithm improves the gainof the continuous variables Besides C50 algorithm cansimplify the originally complex decision tree to the equivalenttree for the easy understanding With more splitting layersthan other algorithms it can ensure the high purity of theresult nodes C50 algorithm achieves the self-correction afterseveral iterations with boosting technology [30]The artificialmethods lay emphasis on the sole impact of various factorshowever C50 algorithm can consider all the factors toanalyze the comprehensive influence based on data statisticsC50 algorithm has been widely applied to the multivariateclassification for its unbiasedness and precision towards thecontinuous and categorical variables compared to other deci-sion tree algorithms [17 31ndash37] To improve the classificationaccuracy boosting technology was applied in C50 decisiontree algorithm and could adjust the decision tree according tothe fault samples until reaching a high precision [38] BesidesC50 algorithm is more applicable to the large data samples

Assume the splitting node119879 is expected to separate the |119879|samples into119870 target categories [27]The symbol119901119879(119894) standsfor the percentage of category 119894 at node 119879 119894 = 1 2 119870 Theinclusive information at node 119879 can be expressed as

Info (119879) = minus 119870sum119894=1

119901119879 (119894) times log2 [119901119879 (119894)] (1)

For the decision tree branch the samples are divided into1198791 1198792 119879119899 by the node 119879 with the subsamples of |1198791| |1198792|and |119879119899| respectivelyThen the information can be expressedas

Info (119883 119879) = 119899sum119894=1

10038161003816100381610038161198791198941003816100381610038161003816|119879| times Info (119879119894) (2)

After combining the above information formulas theinformation gain can be expressed as

Gain (119883 119879) = Info (119879) minus Info (119883 119879) (3)To improve the application of the information gain

concept information gain ratio was put forward

GainRatio (119883 119879) = Gain (119883 119879)SplitInfo (119883 119879) (4)

where

SplitInfo (119883 119879) = minus 119899sum119894=1

10038161003816100381610038161198791198941003816100381610038161003816|119879| times log2 (10038161003816100381610038161198791198941003816100381610038161003816|119879| ) (5)

The practice showed that the information gain ratio wasmore preferred to the continuous variable therefore theinformation gain ratio for the continuous variable with 119899distinct values should be expressed as [16]

Gain (119883 119879) = Info (119879) minus Info (119883 119879) minus log2 (119899 minus 1)|119879| (6)

6 Mathematical Problems in Engineering

Table 2 The precision of each loop based on the 10-fold cross validation ()

1 2 3 4 5 6 7 8 9 10 Averageaccuracy

C50 8333 10000 8125 8986 9025 8729 9450 9421 8750 9628 9045CART 8057 8921 9223 8534 8123 100 8012 7824 8157 8234 8509

Table 3 The importance of each factor based on C50 and CART algorithm

Lithology Topology River density Slope Lineament densityC50 0363 0331 0159 0117 0031CART 0355 0308 0024 0010 0312

242 CART CART decision tree algorithm [39] can dividethe sample set into two subsample sets making the root andintermediate nodes with two branches based on the recur-sively binary segmentation technology CART can handleboth continuous and discrete variables with the impuritylevel-Gini coefficient as the discriminant basis consideringthe probability distribution under the division node

Assume a total of 119870 classes variable 119883 and node for 119879then the Gini index is defined as [16]

Gini (119879) = 119870sum119894=1

119901119879 (119894) (1 minus 119901119879 (119894)) = 1 minus119870sum119894=1

1199012119879 (119894) (7)

When the classes show the equal probability in the node119879 the Gini index achieves the maximum 1 minus 1119870 with onlyone kind in node 119879 the Gini index achieves the minimumThe Gini index increases with the impurity therefore thesubnodes should be added to lower the impurity Whentaking the misclassification cost matrix into considerationthe Gini index formula becomes

Gini (119879) = sum119895 =119894

119862 (119894 | 119895) 119901119879 (119894) 119901119879 (119895) (8)

where 119862(119894 | 119895) represents the cost for misclassifying the casecategory 119895 as 119894 Assuming that the subnode 119878 added to node119879 the Gini index can be expressed as

Gini (119878 119879) = Gini (119879) minus 119901119871Gini (119879119871) minus 119901119877Gini (119879119877) (9)

where 119901119871 119901119877 stand for the proportion of cases in node 119879classified into 119879119871 and 119879119877243 10-Fold Cross Validation The basic idea for the 119896-foldcross validation [40 41] is to equally divide the surveyedsamples into 119896 parts of which 119896 minus 1 parts served as thetraining samples with the remaining one part for validationwith 119896 circulations The method can guarantee every sampleacted as the training sample and the verification sample forone time in the circulations For the 119896-fold cross validationthe commonly used value for 119896 is 10 called 10-fold crossvalidation [42]

3 Results and Discussion

In this study five groundwater relating factors were used inthe analysis and the factors except lithology were continuous

80 tube wells were utilized for training with C50 andCART respectively in the statistical analysis software SPSSClementine 120 and MATLAB According to the confusionmatrix [43] constructed by the verification result the kappacoefficient [44 45] is used to evaluate the accuracy Based onthe 10-fold cross validation with C50 and CART algorithmrespectively the precision of each loop is shown in Table 2The importance of each factor was calculated firstly to deter-mine the selection order of the factors for classification asshown in Table 3 For the C50 algorithm the importance was0363 0331 0159 0117 and 0031 respectively for lithologytopology river density slope and lineament density For theCART algorithm the importance was 0355 0308 00240010 and 0312 respectively After the ten loops the decisiontree with the higher classification accuracy was chosen asoptimal Figure 9 shows the optimal decision tree generatedby C50 algorithm with 6 layers 21 nodes 10 internal nodesand 11 terminal nodes Table 4 shows the rules for the optimaldecision tree by C50 The optimal decision tree generated byCARTalgorithm is shown in Figure 10 with 8 layers 21 nodes11 internal nodes and 10 terminal nodes and the rules areshown in Table 5

Kappa = 119873sum119903119894=1 119909119894119894 minus sum119903119894=1 119909119894+119909+1198941198732 minus sum119903119894=1 119909119894+119909+119894 (10)

where 119903 is the number of the total categories 119873 is the totalnumber for verification 119909119894119894 is the number of the correctclassifications 119909119894+ is the number of samples mistaken fromthe category 119894 for others 119909+119894 is the number of samplesmistaken from other categories for 119894

According to the classification rules we can see that notevery rule takes all the variables into account therefore forthe area lack of the detailed information the typical factorscan help to predict the groundwater potential classificationgrade The decision trees show that both the two algorithmscan determine the dividing point of the variables especiallyfor the continuous variables based on the training data whichmakes the division of the interval more scientific and reducesthe segmentation error compared with the artificial division

According to the decision tree result generated by C50algorithm we did deep analysis Topology was dividedby six nodes 4196ndash4301ndash43165ndash4357ndash44005ndash6240 and thegroundwater was distributed in the low-lying areas Slopecontained two ranges 0ndash243ndash75 and the flat areas werebeneficial to the surface water infiltration River density was

Mathematical Problems in Engineering 7

Table 4 Rules for groundwater potential grade based on C50

Grade Rule

Very goodLithology Q 4196 lt topology le 4301 lineament density le 0558Lithology Q 4196 lt topology le 43165 lineament density gt 0558Lithology Q 43165 lt topology le 44005 river density gt 0784 slope le 243

Good

Lithology Q 4301 lt topology le 43165 lineament density le 0558Lithology Q 43165 lt topology le 4357 river density le 0784Lithology Q 4357 lt topology le 44005 river density gt 0784 slope gt 243Lithology Q 44005 lt topology river density gt 0650

ModerateLithology others except Q river density gt 0506Lithology Q 44005 lt topology river density le 0650Lithology Q 4357 lt topology le 44005 river density le 0784

Poor Lithology others except Q river density le 05

RD

L

T

Others

T

Q

P M

RD RD

M GT S

VG G

LD

T VG

VG G

G M

le0506 gt0506 le43165 gt43165

le0558 gt0558

le4301 gt4301

le4357 gt4357

gt0784

le44005 gt44005

le0650 gt0650

le243 gt243

le0784

Factor GradeL lithologyLD lineament densityT topologyS slopeRD river density

VG very goodG goodM moderateP poor

Figure 9 The optimal decision tree generated by C50 algorithm

classified into four intervals 0ndash0506ndash0650ndash0784ndash114 andreflected the flow capacity in the region and the higherthe better for groundwater enrichment Lineament densitywas divided into two intervals 0ndash0558ndash1 and the moreoccurrence space for groundwater existed with the higherlineament density However for the CART algorithm topol-ogy was divided into three intervals 4196ndash4304ndash4400ndash6240slope contained two ranges 0ndash2386ndash75 river density was

classified into three intervals 0ndash0685ndash0703ndash114 lineamentdensitywas divided into four intervals 0ndash0296ndash0335ndash0713ndash1 After applying the optimal decision trees respectively tothe whole study area [6 41] the groundwater potential zonemaps were derived and shown in Figures 11 and 12

According to the results generated by the decision treesthe ldquovery goodrdquo area is mostly located in the broad plainzone with a patchy distribution covering 10325 km2 about

8 Mathematical Problems in Engineering

Table 5 Rules for groundwater potential grade based on CART

Grade Rule

Very goodRiver density gt 0703 4304 lt topology le 4400 lineament density gt 0713Lineament density gt 0296 topology le 4304Lineament density gt 0335 4304 lt topology le 4400 river density gt 0703 slope le 2386

GoodRiver density lt 0703 4304 lt topology le 4400 lineament density gt 03350335 lt lineament density le 0713 4304 lt topology le 4400 river density gt 0703 slope gt 2386Lineament density gt 0296 topology gt 4400 river density gt 0685

ModerateLineament density le 0296 lithology Q0296 lt lineament density le 0335 4304 lt topology le 4400Lineament density gt 0296 topology gt 4400 river density le 0685

Poor Lineament density le 0296 lithology others except Q

LD

L T

RDP M

Others Q

M G

T

VG LD

M RD

G LD

S VG

VG G

le0296 gt0296

le4400 gt4400

le4304 gt4304 le0685 gt0685

le0335 gt0335

le0703 gt0703

le0713 gt0713

gt2386

Factor GradeL lithologyLD lineament densityT topologyS slopeRD river density

VG very goodG goodM moderateP poor

le2386

Figure 10 The optimal decision tree generated by CART algorithm

Mathematical Problems in Engineering 9

5

N

79∘50

0E 80

∘00E 80

∘10

0E 80

∘20

0E

33∘00

N33∘100

N

0 10 20(km)

Very goodGoodModerate

PoorSurveyed tube wells

79∘40

0E79

∘30

0E

33∘200

N

Figure 11 Groundwater potential map of the study area based onC50

0 5 10 20

N

79∘50

0E 80

∘00E 80

∘10

0E 80

∘20

0E

33∘00

N33∘100

N

(km)

Very goodGoodModerate

PoorSurveyed tube wells

79∘40

0E79

∘30

0E

33∘200

N

Figure 12 Groundwater potential map of the study area based onCART

461 of the study area for C50 algorithm and 10505 km2about 468 of the study area for CART algorithm for thewater infiltration into the underground with sufficient timeand space The study shows that low-lying areas with goodflow condition well-developed stratigraphic gap and strongconnectivity like the southwest beaches of Ban Gong Lakecan be the first target for groundwater resources The ldquogoodrdquoarea was distributed along the river with 19210 km2 covering858 for C50 algorithm and 22602 km2 about 1009 ofthe study area for CART algorithm just like a long stripThe zones mainly had a planar distribution on the bottomof the diluvia fan on the low mountain and hilly terrainand a banded distribution on the mountain watershed ofpluvial valleys situated upstream and on the peripheryof ldquovery goodrdquo area which can serve as the candidate

target for groundwater exploration The ldquomoderaterdquo zonewith 59551 km2 occupying 2659 for C50 algorithmand 58453 km2 about 2610 of the study area for CARTalgorithm spreads around the upstream tributaries Thezone is located on both sides of the river valleys and the topof the diluvia fan The ldquopoorrdquo area occupies 6023 withthe area of 134914 km2 for C50 algorithm and 132440 km2about 5913 of the study area for CART algorithm which ismostly a mountainous region with a high altitude

4 Conclusions

In this study C50 and CART algorithms were applied for thedecision tree generation to predict the groundwater potentialzone with the five relating factors and the 10-fold crossvalidation method was adopted to verify the classificationresult with the kappa coefficient From this paper we candraw some conclusions as follows(1) In the study area the five groundwater relating fac-tors lithology topology slope river density and lineamentdensity were appropriate for the groundwater potential gradeprediction and the importance based on C50 algorithm was0363 0331 0159 0117 and 003 respectively for CARTalgorithm the importance was 0355 0308 0024 0010 and0312 respectively(2) Based on the 10-fold cross validation both C50 andCART could be applied for MCDM with the categoricaland continuous variables simultaneously with the averageaccuracy of 9045 and 8509 respectively however C50algorithm showed higher classification accuracy than CARTalgorithm(3) After applying the optimal decision trees to the wholestudy area respectively the groundwater potential zone mapwas delineated and the four grades of groundwater poten-tial zones ldquovery goodrdquo ldquogoodrdquo ldquomoderaterdquo and ldquopoorrdquooccupied the area of 10325 km2 19210 km2 59551 km2 and134914 km2 with the percentages of 461 858 2659and 6023 respectively for C50 and for CART the areaof 10505 km2 22602 km2 58453 km2 and 132440 km2with the percentages of 468 1009 2610 and 5913respectively

The study result can provide reference for groundwaterexploration and in the future work we will consider morerelating factors and survey more wells to enrich the modelThe integration of decision tree algorithms and MCDM inour study applies only to the qualitative assessment for thelack of the prior knowledge in the large area therefore theextra analysis is needed for the specific point investigationThe accuracy demonstrates that the 10-fold cross validation issuitable for training and verifying the decision tree howeverthe tested dataset is limited and more tube wells should beinvestigated to validate the stability of the model

Competing Interests

The authors declare that there is no conflict of interests in thisresearch work

10 Mathematical Problems in Engineering

Acknowledgments

This work was supported by Development Program of ChinaGroundwater Exploration Technology in the Water ShortageRegion (863 Program 2012AA062601)

References

[1] D Oikonomidis S Dimogianni N Kazakis and K VoudourisldquoA GISRemote Sensing-based methodology for groundwaterpotentiality assessment in Tirnavos area Greecerdquo Journal ofHydrology vol 525 pp 197ndash208 2015

[2] D Pinto S Shrestha M S Babel and S Ninsawat ldquoDelineationof groundwater potential zones in the Comoro watershedTimor Leste using GIS remote sensing and analytic hierarchyprocess (AHP) techniquerdquo Applied Water Science 2015

[3] O A Fashae M N Tijani A O Talabi and O I AdedejildquoDelineation of groundwater potential zones in the crystallinebasement terrain of SW-Nigeria an integrated GIS and remotesensing approachrdquoAppliedWater Science vol 4 no 1 pp 19ndash382014

[4] JMallick C K SinghHAl-Wadi et al ldquoGeospatial and geosta-tistical approach for groundwater potential zone delineationrdquoHydrological Processes vol 29 no 3 pp 395ndash418 2015

[5] S Lee K-Y Song Y Kim and I Park ldquoRegional groundwaterproductivity potential mapping using a geographic informationsystem (GIS) based artificial neural network modelrdquo Hydroge-ology Journal vol 20 no 8 pp 1511ndash1527 2012

[6] S Lee and C-W Lee ldquoApplication of decision-tree model togroundwater productivity-potential mappingrdquo Sustainabilityvol 7 no 10 pp 13416ndash13432 2015

[7] AOzdemir ldquoUsing a binary logistic regressionmethod andGISfor evaluating andmapping the groundwater spring potential inthe Sultan Mountains (Aksehir Turkey)rdquo Journal of Hydrologyvol 405 no 1-2 pp 123ndash136 2011

[8] A Ozdemir ldquoGIS-based groundwater spring potential mappingin the SultanMountains (Konya Turkey) using frequency ratioweights of evidence and logistic regression methods and theircomparisonrdquo Journal of Hydrology vol 411 no 3-4 pp 290ndash308 2011

[9] D Machiwal M K Jha and B C Mal ldquoAssessment ofgroundwater potential in a semi-arid region of india usingremote sensing GIS and MCDM techniquesrdquo Water ResourcesManagement vol 25 no 5 pp 1359ndash1386 2011

[10] S A Naghibi and H R Pourghasemi ldquoA comparative assess-ment between three machine learning models and their per-formance comparison by bivariate and multivariate statisticalmethods in groundwater potential mappingrdquo Water ResourcesManagement vol 29 no 14 pp 5217ndash5236 2015

[11] D Zheng-Dong Y Xin L Fan et al ldquoConstruction andinvestigation of groundwater remote sensing fuzzy assessmentindexrdquo Chinese Journal of Geophysics vol 56 no 11 pp 3908ndash3916 2013

[12] O FAlthuwaynee B PradhanH-J Park and JH Lee ldquoAnovelensemble decision tree-based CHi-squared Automatic Interac-tion Detection (CHAID) and multivariate logistic regressionmodels in landslide susceptibility mappingrdquo Landslides vol 11no 6 pp 1063ndash1078 2014

[13] I Park and S Lee ldquoSpatial prediction of landslide susceptibilityusing a decision tree approach a case study of the Pyeongchangarea Koreardquo International Journal of Remote Sensing vol 35 no16 pp 6089ndash6112 2014

[14] C Baker R Lawrence C Montagne and D Patten ldquoMappingwetlands and riparian areas using landsat ETM+ imagery anddecision-tree-based modelsrdquo Wetlands vol 26 no 2 pp 465ndash474 2006

[15] R Kohavi and J R Quinlan ldquoData mining tasks and methodsclassification decision-tree discoveryrdquo in Handbook of DataMining and Knowledge Discovery pp 267ndash276 Oxford Univer-sity Press New York NY USA 2002

[16] N Lin D Noe and X He ldquoTree-based methods and theirapplicationsrdquo in Springer Handbook of Engineering Statistics pp551ndash570 Springer London UK 2006

[17] I Klein U Gessner and C Kuenzer ldquoRegional land covermapping and change detection in Central Asia using MODIStime-seriesrdquo Applied Geography vol 35 no 1-2 pp 219ndash2342012

[18] J R Quinlan ldquoImproved use of continuous attributes in C4 5rdquoJournal of Artificial Intelligence Research vol 4 pp 77ndash90 1996

[19] S B Kotsiantis ldquoSupervised machine learning a review ofclassification techniquesrdquo Informatica vol 31 no 3 pp 249ndash268 2007

[20] K Polat and S Gunes ldquoA novel hybrid intelligent method basedon C45 decision tree classifier and one-against-all approachfor multi-class classification problemsrdquo Expert Systems withApplications vol 36 no 2 pp 1587ndash1592 2009

[21] M K Jha V M Chowdary and A Chowdhury ldquoGroundwaterassessment in Salboni Block West Bengal (India) using remotesensing geographical information system and multi-criteriadecision analysis techniquesrdquoHydrogeology Journal vol 18 no7 pp 1713ndash1728 2010

[22] V B Rekha A P Thomas M Suma and H Vijith ldquoAnintegration of spatial information technology for groundwa-ter potential and quality investigations in Koduvan Ar Sub-Watershed of Meenachil River Basin Kerala Indiardquo Journal ofthe Indian Society of Remote Sensing vol 39 no 1 pp 63ndash712011

[23] O Rahmati ANazari SamaniMMahdavi H R Pourghasemiand H Zeinivand ldquoGroundwater potential mapping at Kurdis-tan region of Iran using analytic hierarchy process and GISrdquoArabian Journal of Geosciences vol 8 no 9 pp 7059ndash7071 2014

[24] K R Preeja S Joseph J Thomas and H Vijith ldquoIdentificationof groundwater potential zones of a tropical river basin (KeralaIndia) using remote sensing and GIS techniquesrdquo Journal of theIndian Society of Remote Sensing vol 39 no 1 pp 83ndash94 2011

[25] K Narendra K Nageswara Rao and P Swarna Latha ldquoIntegrat-ing remote sensing and GIS for identification of groundwaterprospective zones in the Narava basin Visakhapatnam regionAndhra Pradeshrdquo Journal of the Geological Society of India vol81 no 2 pp 248ndash260 2013

[26] M Ture F Tokatli and I Kurt ldquoUsing KaplanndashMeier analysistogether with decision tree methods (CampRT CHAID QUESTC45 and ID3) in determining recurrence-free survival of breastcancer patientsrdquo Expert Systems with Applications vol 36 no 2pp 2017ndash2026 2009

[27] N A AL-Fakhry ldquoSummarizing data by using data miningtechniques a comparative by using C45 and C50 algorithmsrdquoInternational Education and Research Journal vol 2 no 4 2016

[28] R Kohavi and J R Quinlan ldquoData mining tasks and methodsclassification decision-tree discoveryrdquo in Handbook of DataMining and Knowledge Discovery pp 267ndash276 Oxford Univer-sity Press 2002

Mathematical Problems in Engineering 11

[29] U M Fayyad and K B Irani ldquoOn the handling of continuous-valued attributes in decision tree generationrdquoMachine Learningvol 8 no 1 pp 87ndash102 1992

[30] J R Quinlan ldquoBagging boosting and C4 5rdquo AAAIIAAI vol1 pp 725ndash730 1996

[31] S Pang and J C Gong ldquoC5 0 classification algorithm andapplication on individual credit evaluation of banksrdquo SystemsEngineering-Theory amp Practice vol 29 no 12 pp 94ndash104 2009

[32] M A Friedl C E Brodley and A H Strahler ldquoMaximizingland cover classification accuracies produced by decision treesat continental to global scalesrdquo IEEE Transactions on Geoscienceand Remote Sensing vol 37 no 2 pp 969ndash977 1999

[33] C J Moran and E N Bui ldquoSpatial data mining for enhancedsoil map modellingrdquo International Journal of GeographicalInformation Science vol 16 no 6 pp 533ndash549 2002

[34] X Li andA G-O Yeh ldquoDatamining of cellular automatarsquos tran-sition rulesrdquo International Journal of Geographical InformationScience vol 18 no 8 pp 723ndash744 2004

[35] P H Williams R Eyles and G Weiller ldquoPlant microRNAprediction by supervised machine learning using C50 decisiontreesrdquo Journal of Nucleic Acids vol 2012 Article ID 652979 10pages 2012

[36] N Patil R Lathi and V Chitre ldquoCustomer card classificationbased on C5 0 amp CART algorithmsrdquo International Journal ofEngineering Research and Applications vol 2 no 4 pp 164ndash1672012

[37] M M Javidi and E F Roshan ldquoSpeech emotion recognition byusing combinations of C50 neural network (NN) and supportvector machines (SVM) classification methodsrdquo Journal ofMathematics and Computer Science vol 6 pp 191ndash200 2013

[38] E Bauer and R Kohavi ldquoAn empirical comparison of vot-ing classification algorithms bagging boosting and variantsrdquoMachine Learning vol 36 no 1-2 pp 105ndash139 1999

[39] Y Shao and R S Lunetta ldquoComparison of support vectormachine neural network and CART algorithms for the land-cover classification using limited training data pointsrdquo ISPRSJournal of Photogrammetry and Remote Sensing vol 70 pp 78ndash87 2012

[40] Y Bengio andYGrandvalet ldquoNounbiased estimator of the vari-ance of k-fold cross-validationrdquo Journal of Machine LearningResearch vol 5 pp 1089ndash1105 2004

[41] C Chen B He and Z Zeng ldquoAmethod formineral prospectiv-ity mapping integrating C45 decision tree weights-of-evidenceandm-branch smoothing techniques a case study in the easternKunlunMountains Chinardquo Earth Science Informatics vol 7 no1 pp 13ndash24 2014

[42] M van der Gaag T Hoffman M Remijsen et al ldquoThe five-factor model of the positive and negative syndrome scale IIa ten-fold cross-validation of a revised modelrdquo SchizophreniaResearch vol 85 no 1ndash3 pp 280ndash287 2006

[43] M Al Saud ldquoMapping potential areas for groundwater storagein Wadi Aurnah Basin western Arabian Peninsula usingremote sensing and geographic information system techniquesrdquoHydrogeology Journal vol 18 no 6 pp 1481ndash1495 2010

[44] R G Congalton ldquoA review of assessing the accuracy of classifi-cations of remotely sensed datardquo Remote Sensing of Environ-ment vol 37 no 1 pp 35ndash46 1991

[45] F K Hoehler ldquoBias and prevalence effects on kappa viewed interms of sensitivity and specificityrdquo Journal of Clinical Epidemi-ology vol 53 no 5 pp 499ndash503 2000

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 5: Research Article Assessment of Groundwater Potential Based ...downloads.hindawi.com/journals/mpe/2016/2064575.pdf · Figure . In general, slopes control the inltration and ow ability

Mathematical Problems in Engineering 5

0 5 10 20

N

79∘50

0E 80

∘00E 80

∘10

0E 80

∘20

0E

33∘00

N33∘100

N

(km)

gt1000

79∘40

0E79

∘30

0E

33∘200

N

Figure 7 Flow accumulation map of the study area

0 5 10 20

N

79∘50

0E 80

∘00E 80

∘10

0E 80

∘20

0E

33∘00

N33∘100

N

(km)

Low 0

High 114

(km

km

2 )

79∘40

0E79

∘30

0E

33∘200

N

Figure 8 River density map of the study area

the flow accumulation the river density was calculated usingthe ldquoline densityrdquo command in ArcGIS 101 as shown inFigure 8

24 Research Method The decision tree algorithms [16]are suitable for the multifactor classification problem Forthe mixture of the continuous and discrete factors in thegroundwater potential evaluation model C50 and CARTdecision tree algorithms were adopted together with the 10-fold cross validation method to improve the classificationaccuracy and unbiasedness

241 C50 The C50 decision tree algorithm is rooted inID3 and C45 The ID3 algorithm [26] with the maximumgain as the division standard aims to achieve the maximumof the information gain in every node which often givespriority to the variables with more classes To make upfor the defect of ID3 C45 adopts the gain-ratio criterion[19 27 28] With the gain-ratio criterion the binary nodes

divide the continuous variable The processing method forcategorical variables is firstly to refer each of the categoriesas a branch and then merge each two branches iterativelyuntil the two branches remain However the heuristic searchmay not find the best point for the categorical variablesdivision For continuous variables C50 algorithm can easilyfind the division point [29] To get rid of the biasedness onthe continuous variables the algorithm improves the gainof the continuous variables Besides C50 algorithm cansimplify the originally complex decision tree to the equivalenttree for the easy understanding With more splitting layersthan other algorithms it can ensure the high purity of theresult nodes C50 algorithm achieves the self-correction afterseveral iterations with boosting technology [30]The artificialmethods lay emphasis on the sole impact of various factorshowever C50 algorithm can consider all the factors toanalyze the comprehensive influence based on data statisticsC50 algorithm has been widely applied to the multivariateclassification for its unbiasedness and precision towards thecontinuous and categorical variables compared to other deci-sion tree algorithms [17 31ndash37] To improve the classificationaccuracy boosting technology was applied in C50 decisiontree algorithm and could adjust the decision tree according tothe fault samples until reaching a high precision [38] BesidesC50 algorithm is more applicable to the large data samples

Assume the splitting node119879 is expected to separate the |119879|samples into119870 target categories [27]The symbol119901119879(119894) standsfor the percentage of category 119894 at node 119879 119894 = 1 2 119870 Theinclusive information at node 119879 can be expressed as

Info (119879) = minus 119870sum119894=1

119901119879 (119894) times log2 [119901119879 (119894)] (1)

For the decision tree branch the samples are divided into1198791 1198792 119879119899 by the node 119879 with the subsamples of |1198791| |1198792|and |119879119899| respectivelyThen the information can be expressedas

Info (119883 119879) = 119899sum119894=1

10038161003816100381610038161198791198941003816100381610038161003816|119879| times Info (119879119894) (2)

After combining the above information formulas theinformation gain can be expressed as

Gain (119883 119879) = Info (119879) minus Info (119883 119879) (3)To improve the application of the information gain

concept information gain ratio was put forward

GainRatio (119883 119879) = Gain (119883 119879)SplitInfo (119883 119879) (4)

where

SplitInfo (119883 119879) = minus 119899sum119894=1

10038161003816100381610038161198791198941003816100381610038161003816|119879| times log2 (10038161003816100381610038161198791198941003816100381610038161003816|119879| ) (5)

The practice showed that the information gain ratio wasmore preferred to the continuous variable therefore theinformation gain ratio for the continuous variable with 119899distinct values should be expressed as [16]

Gain (119883 119879) = Info (119879) minus Info (119883 119879) minus log2 (119899 minus 1)|119879| (6)

6 Mathematical Problems in Engineering

Table 2 The precision of each loop based on the 10-fold cross validation ()

1 2 3 4 5 6 7 8 9 10 Averageaccuracy

C50 8333 10000 8125 8986 9025 8729 9450 9421 8750 9628 9045CART 8057 8921 9223 8534 8123 100 8012 7824 8157 8234 8509

Table 3 The importance of each factor based on C50 and CART algorithm

Lithology Topology River density Slope Lineament densityC50 0363 0331 0159 0117 0031CART 0355 0308 0024 0010 0312

242 CART CART decision tree algorithm [39] can dividethe sample set into two subsample sets making the root andintermediate nodes with two branches based on the recur-sively binary segmentation technology CART can handleboth continuous and discrete variables with the impuritylevel-Gini coefficient as the discriminant basis consideringthe probability distribution under the division node

Assume a total of 119870 classes variable 119883 and node for 119879then the Gini index is defined as [16]

Gini (119879) = 119870sum119894=1

119901119879 (119894) (1 minus 119901119879 (119894)) = 1 minus119870sum119894=1

1199012119879 (119894) (7)

When the classes show the equal probability in the node119879 the Gini index achieves the maximum 1 minus 1119870 with onlyone kind in node 119879 the Gini index achieves the minimumThe Gini index increases with the impurity therefore thesubnodes should be added to lower the impurity Whentaking the misclassification cost matrix into considerationthe Gini index formula becomes

Gini (119879) = sum119895 =119894

119862 (119894 | 119895) 119901119879 (119894) 119901119879 (119895) (8)

where 119862(119894 | 119895) represents the cost for misclassifying the casecategory 119895 as 119894 Assuming that the subnode 119878 added to node119879 the Gini index can be expressed as

Gini (119878 119879) = Gini (119879) minus 119901119871Gini (119879119871) minus 119901119877Gini (119879119877) (9)

where 119901119871 119901119877 stand for the proportion of cases in node 119879classified into 119879119871 and 119879119877243 10-Fold Cross Validation The basic idea for the 119896-foldcross validation [40 41] is to equally divide the surveyedsamples into 119896 parts of which 119896 minus 1 parts served as thetraining samples with the remaining one part for validationwith 119896 circulations The method can guarantee every sampleacted as the training sample and the verification sample forone time in the circulations For the 119896-fold cross validationthe commonly used value for 119896 is 10 called 10-fold crossvalidation [42]

3 Results and Discussion

In this study five groundwater relating factors were used inthe analysis and the factors except lithology were continuous

80 tube wells were utilized for training with C50 andCART respectively in the statistical analysis software SPSSClementine 120 and MATLAB According to the confusionmatrix [43] constructed by the verification result the kappacoefficient [44 45] is used to evaluate the accuracy Based onthe 10-fold cross validation with C50 and CART algorithmrespectively the precision of each loop is shown in Table 2The importance of each factor was calculated firstly to deter-mine the selection order of the factors for classification asshown in Table 3 For the C50 algorithm the importance was0363 0331 0159 0117 and 0031 respectively for lithologytopology river density slope and lineament density For theCART algorithm the importance was 0355 0308 00240010 and 0312 respectively After the ten loops the decisiontree with the higher classification accuracy was chosen asoptimal Figure 9 shows the optimal decision tree generatedby C50 algorithm with 6 layers 21 nodes 10 internal nodesand 11 terminal nodes Table 4 shows the rules for the optimaldecision tree by C50 The optimal decision tree generated byCARTalgorithm is shown in Figure 10 with 8 layers 21 nodes11 internal nodes and 10 terminal nodes and the rules areshown in Table 5

Kappa = 119873sum119903119894=1 119909119894119894 minus sum119903119894=1 119909119894+119909+1198941198732 minus sum119903119894=1 119909119894+119909+119894 (10)

where 119903 is the number of the total categories 119873 is the totalnumber for verification 119909119894119894 is the number of the correctclassifications 119909119894+ is the number of samples mistaken fromthe category 119894 for others 119909+119894 is the number of samplesmistaken from other categories for 119894

According to the classification rules we can see that notevery rule takes all the variables into account therefore forthe area lack of the detailed information the typical factorscan help to predict the groundwater potential classificationgrade The decision trees show that both the two algorithmscan determine the dividing point of the variables especiallyfor the continuous variables based on the training data whichmakes the division of the interval more scientific and reducesthe segmentation error compared with the artificial division

According to the decision tree result generated by C50algorithm we did deep analysis Topology was dividedby six nodes 4196ndash4301ndash43165ndash4357ndash44005ndash6240 and thegroundwater was distributed in the low-lying areas Slopecontained two ranges 0ndash243ndash75 and the flat areas werebeneficial to the surface water infiltration River density was

Mathematical Problems in Engineering 7

Table 4 Rules for groundwater potential grade based on C50

Grade Rule

Very goodLithology Q 4196 lt topology le 4301 lineament density le 0558Lithology Q 4196 lt topology le 43165 lineament density gt 0558Lithology Q 43165 lt topology le 44005 river density gt 0784 slope le 243

Good

Lithology Q 4301 lt topology le 43165 lineament density le 0558Lithology Q 43165 lt topology le 4357 river density le 0784Lithology Q 4357 lt topology le 44005 river density gt 0784 slope gt 243Lithology Q 44005 lt topology river density gt 0650

ModerateLithology others except Q river density gt 0506Lithology Q 44005 lt topology river density le 0650Lithology Q 4357 lt topology le 44005 river density le 0784

Poor Lithology others except Q river density le 05

RD

L

T

Others

T

Q

P M

RD RD

M GT S

VG G

LD

T VG

VG G

G M

le0506 gt0506 le43165 gt43165

le0558 gt0558

le4301 gt4301

le4357 gt4357

gt0784

le44005 gt44005

le0650 gt0650

le243 gt243

le0784

Factor GradeL lithologyLD lineament densityT topologyS slopeRD river density

VG very goodG goodM moderateP poor

Figure 9 The optimal decision tree generated by C50 algorithm

classified into four intervals 0ndash0506ndash0650ndash0784ndash114 andreflected the flow capacity in the region and the higherthe better for groundwater enrichment Lineament densitywas divided into two intervals 0ndash0558ndash1 and the moreoccurrence space for groundwater existed with the higherlineament density However for the CART algorithm topol-ogy was divided into three intervals 4196ndash4304ndash4400ndash6240slope contained two ranges 0ndash2386ndash75 river density was

classified into three intervals 0ndash0685ndash0703ndash114 lineamentdensitywas divided into four intervals 0ndash0296ndash0335ndash0713ndash1 After applying the optimal decision trees respectively tothe whole study area [6 41] the groundwater potential zonemaps were derived and shown in Figures 11 and 12

According to the results generated by the decision treesthe ldquovery goodrdquo area is mostly located in the broad plainzone with a patchy distribution covering 10325 km2 about

8 Mathematical Problems in Engineering

Table 5 Rules for groundwater potential grade based on CART

Grade Rule

Very goodRiver density gt 0703 4304 lt topology le 4400 lineament density gt 0713Lineament density gt 0296 topology le 4304Lineament density gt 0335 4304 lt topology le 4400 river density gt 0703 slope le 2386

GoodRiver density lt 0703 4304 lt topology le 4400 lineament density gt 03350335 lt lineament density le 0713 4304 lt topology le 4400 river density gt 0703 slope gt 2386Lineament density gt 0296 topology gt 4400 river density gt 0685

ModerateLineament density le 0296 lithology Q0296 lt lineament density le 0335 4304 lt topology le 4400Lineament density gt 0296 topology gt 4400 river density le 0685

Poor Lineament density le 0296 lithology others except Q

LD

L T

RDP M

Others Q

M G

T

VG LD

M RD

G LD

S VG

VG G

le0296 gt0296

le4400 gt4400

le4304 gt4304 le0685 gt0685

le0335 gt0335

le0703 gt0703

le0713 gt0713

gt2386

Factor GradeL lithologyLD lineament densityT topologyS slopeRD river density

VG very goodG goodM moderateP poor

le2386

Figure 10 The optimal decision tree generated by CART algorithm

Mathematical Problems in Engineering 9

5

N

79∘50

0E 80

∘00E 80

∘10

0E 80

∘20

0E

33∘00

N33∘100

N

0 10 20(km)

Very goodGoodModerate

PoorSurveyed tube wells

79∘40

0E79

∘30

0E

33∘200

N

Figure 11 Groundwater potential map of the study area based onC50

0 5 10 20

N

79∘50

0E 80

∘00E 80

∘10

0E 80

∘20

0E

33∘00

N33∘100

N

(km)

Very goodGoodModerate

PoorSurveyed tube wells

79∘40

0E79

∘30

0E

33∘200

N

Figure 12 Groundwater potential map of the study area based onCART

461 of the study area for C50 algorithm and 10505 km2about 468 of the study area for CART algorithm for thewater infiltration into the underground with sufficient timeand space The study shows that low-lying areas with goodflow condition well-developed stratigraphic gap and strongconnectivity like the southwest beaches of Ban Gong Lakecan be the first target for groundwater resources The ldquogoodrdquoarea was distributed along the river with 19210 km2 covering858 for C50 algorithm and 22602 km2 about 1009 ofthe study area for CART algorithm just like a long stripThe zones mainly had a planar distribution on the bottomof the diluvia fan on the low mountain and hilly terrainand a banded distribution on the mountain watershed ofpluvial valleys situated upstream and on the peripheryof ldquovery goodrdquo area which can serve as the candidate

target for groundwater exploration The ldquomoderaterdquo zonewith 59551 km2 occupying 2659 for C50 algorithmand 58453 km2 about 2610 of the study area for CARTalgorithm spreads around the upstream tributaries Thezone is located on both sides of the river valleys and the topof the diluvia fan The ldquopoorrdquo area occupies 6023 withthe area of 134914 km2 for C50 algorithm and 132440 km2about 5913 of the study area for CART algorithm which ismostly a mountainous region with a high altitude

4 Conclusions

In this study C50 and CART algorithms were applied for thedecision tree generation to predict the groundwater potentialzone with the five relating factors and the 10-fold crossvalidation method was adopted to verify the classificationresult with the kappa coefficient From this paper we candraw some conclusions as follows(1) In the study area the five groundwater relating fac-tors lithology topology slope river density and lineamentdensity were appropriate for the groundwater potential gradeprediction and the importance based on C50 algorithm was0363 0331 0159 0117 and 003 respectively for CARTalgorithm the importance was 0355 0308 0024 0010 and0312 respectively(2) Based on the 10-fold cross validation both C50 andCART could be applied for MCDM with the categoricaland continuous variables simultaneously with the averageaccuracy of 9045 and 8509 respectively however C50algorithm showed higher classification accuracy than CARTalgorithm(3) After applying the optimal decision trees to the wholestudy area respectively the groundwater potential zone mapwas delineated and the four grades of groundwater poten-tial zones ldquovery goodrdquo ldquogoodrdquo ldquomoderaterdquo and ldquopoorrdquooccupied the area of 10325 km2 19210 km2 59551 km2 and134914 km2 with the percentages of 461 858 2659and 6023 respectively for C50 and for CART the areaof 10505 km2 22602 km2 58453 km2 and 132440 km2with the percentages of 468 1009 2610 and 5913respectively

The study result can provide reference for groundwaterexploration and in the future work we will consider morerelating factors and survey more wells to enrich the modelThe integration of decision tree algorithms and MCDM inour study applies only to the qualitative assessment for thelack of the prior knowledge in the large area therefore theextra analysis is needed for the specific point investigationThe accuracy demonstrates that the 10-fold cross validation issuitable for training and verifying the decision tree howeverthe tested dataset is limited and more tube wells should beinvestigated to validate the stability of the model

Competing Interests

The authors declare that there is no conflict of interests in thisresearch work

10 Mathematical Problems in Engineering

Acknowledgments

This work was supported by Development Program of ChinaGroundwater Exploration Technology in the Water ShortageRegion (863 Program 2012AA062601)

References

[1] D Oikonomidis S Dimogianni N Kazakis and K VoudourisldquoA GISRemote Sensing-based methodology for groundwaterpotentiality assessment in Tirnavos area Greecerdquo Journal ofHydrology vol 525 pp 197ndash208 2015

[2] D Pinto S Shrestha M S Babel and S Ninsawat ldquoDelineationof groundwater potential zones in the Comoro watershedTimor Leste using GIS remote sensing and analytic hierarchyprocess (AHP) techniquerdquo Applied Water Science 2015

[3] O A Fashae M N Tijani A O Talabi and O I AdedejildquoDelineation of groundwater potential zones in the crystallinebasement terrain of SW-Nigeria an integrated GIS and remotesensing approachrdquoAppliedWater Science vol 4 no 1 pp 19ndash382014

[4] JMallick C K SinghHAl-Wadi et al ldquoGeospatial and geosta-tistical approach for groundwater potential zone delineationrdquoHydrological Processes vol 29 no 3 pp 395ndash418 2015

[5] S Lee K-Y Song Y Kim and I Park ldquoRegional groundwaterproductivity potential mapping using a geographic informationsystem (GIS) based artificial neural network modelrdquo Hydroge-ology Journal vol 20 no 8 pp 1511ndash1527 2012

[6] S Lee and C-W Lee ldquoApplication of decision-tree model togroundwater productivity-potential mappingrdquo Sustainabilityvol 7 no 10 pp 13416ndash13432 2015

[7] AOzdemir ldquoUsing a binary logistic regressionmethod andGISfor evaluating andmapping the groundwater spring potential inthe Sultan Mountains (Aksehir Turkey)rdquo Journal of Hydrologyvol 405 no 1-2 pp 123ndash136 2011

[8] A Ozdemir ldquoGIS-based groundwater spring potential mappingin the SultanMountains (Konya Turkey) using frequency ratioweights of evidence and logistic regression methods and theircomparisonrdquo Journal of Hydrology vol 411 no 3-4 pp 290ndash308 2011

[9] D Machiwal M K Jha and B C Mal ldquoAssessment ofgroundwater potential in a semi-arid region of india usingremote sensing GIS and MCDM techniquesrdquo Water ResourcesManagement vol 25 no 5 pp 1359ndash1386 2011

[10] S A Naghibi and H R Pourghasemi ldquoA comparative assess-ment between three machine learning models and their per-formance comparison by bivariate and multivariate statisticalmethods in groundwater potential mappingrdquo Water ResourcesManagement vol 29 no 14 pp 5217ndash5236 2015

[11] D Zheng-Dong Y Xin L Fan et al ldquoConstruction andinvestigation of groundwater remote sensing fuzzy assessmentindexrdquo Chinese Journal of Geophysics vol 56 no 11 pp 3908ndash3916 2013

[12] O FAlthuwaynee B PradhanH-J Park and JH Lee ldquoAnovelensemble decision tree-based CHi-squared Automatic Interac-tion Detection (CHAID) and multivariate logistic regressionmodels in landslide susceptibility mappingrdquo Landslides vol 11no 6 pp 1063ndash1078 2014

[13] I Park and S Lee ldquoSpatial prediction of landslide susceptibilityusing a decision tree approach a case study of the Pyeongchangarea Koreardquo International Journal of Remote Sensing vol 35 no16 pp 6089ndash6112 2014

[14] C Baker R Lawrence C Montagne and D Patten ldquoMappingwetlands and riparian areas using landsat ETM+ imagery anddecision-tree-based modelsrdquo Wetlands vol 26 no 2 pp 465ndash474 2006

[15] R Kohavi and J R Quinlan ldquoData mining tasks and methodsclassification decision-tree discoveryrdquo in Handbook of DataMining and Knowledge Discovery pp 267ndash276 Oxford Univer-sity Press New York NY USA 2002

[16] N Lin D Noe and X He ldquoTree-based methods and theirapplicationsrdquo in Springer Handbook of Engineering Statistics pp551ndash570 Springer London UK 2006

[17] I Klein U Gessner and C Kuenzer ldquoRegional land covermapping and change detection in Central Asia using MODIStime-seriesrdquo Applied Geography vol 35 no 1-2 pp 219ndash2342012

[18] J R Quinlan ldquoImproved use of continuous attributes in C4 5rdquoJournal of Artificial Intelligence Research vol 4 pp 77ndash90 1996

[19] S B Kotsiantis ldquoSupervised machine learning a review ofclassification techniquesrdquo Informatica vol 31 no 3 pp 249ndash268 2007

[20] K Polat and S Gunes ldquoA novel hybrid intelligent method basedon C45 decision tree classifier and one-against-all approachfor multi-class classification problemsrdquo Expert Systems withApplications vol 36 no 2 pp 1587ndash1592 2009

[21] M K Jha V M Chowdary and A Chowdhury ldquoGroundwaterassessment in Salboni Block West Bengal (India) using remotesensing geographical information system and multi-criteriadecision analysis techniquesrdquoHydrogeology Journal vol 18 no7 pp 1713ndash1728 2010

[22] V B Rekha A P Thomas M Suma and H Vijith ldquoAnintegration of spatial information technology for groundwa-ter potential and quality investigations in Koduvan Ar Sub-Watershed of Meenachil River Basin Kerala Indiardquo Journal ofthe Indian Society of Remote Sensing vol 39 no 1 pp 63ndash712011

[23] O Rahmati ANazari SamaniMMahdavi H R Pourghasemiand H Zeinivand ldquoGroundwater potential mapping at Kurdis-tan region of Iran using analytic hierarchy process and GISrdquoArabian Journal of Geosciences vol 8 no 9 pp 7059ndash7071 2014

[24] K R Preeja S Joseph J Thomas and H Vijith ldquoIdentificationof groundwater potential zones of a tropical river basin (KeralaIndia) using remote sensing and GIS techniquesrdquo Journal of theIndian Society of Remote Sensing vol 39 no 1 pp 83ndash94 2011

[25] K Narendra K Nageswara Rao and P Swarna Latha ldquoIntegrat-ing remote sensing and GIS for identification of groundwaterprospective zones in the Narava basin Visakhapatnam regionAndhra Pradeshrdquo Journal of the Geological Society of India vol81 no 2 pp 248ndash260 2013

[26] M Ture F Tokatli and I Kurt ldquoUsing KaplanndashMeier analysistogether with decision tree methods (CampRT CHAID QUESTC45 and ID3) in determining recurrence-free survival of breastcancer patientsrdquo Expert Systems with Applications vol 36 no 2pp 2017ndash2026 2009

[27] N A AL-Fakhry ldquoSummarizing data by using data miningtechniques a comparative by using C45 and C50 algorithmsrdquoInternational Education and Research Journal vol 2 no 4 2016

[28] R Kohavi and J R Quinlan ldquoData mining tasks and methodsclassification decision-tree discoveryrdquo in Handbook of DataMining and Knowledge Discovery pp 267ndash276 Oxford Univer-sity Press 2002

Mathematical Problems in Engineering 11

[29] U M Fayyad and K B Irani ldquoOn the handling of continuous-valued attributes in decision tree generationrdquoMachine Learningvol 8 no 1 pp 87ndash102 1992

[30] J R Quinlan ldquoBagging boosting and C4 5rdquo AAAIIAAI vol1 pp 725ndash730 1996

[31] S Pang and J C Gong ldquoC5 0 classification algorithm andapplication on individual credit evaluation of banksrdquo SystemsEngineering-Theory amp Practice vol 29 no 12 pp 94ndash104 2009

[32] M A Friedl C E Brodley and A H Strahler ldquoMaximizingland cover classification accuracies produced by decision treesat continental to global scalesrdquo IEEE Transactions on Geoscienceand Remote Sensing vol 37 no 2 pp 969ndash977 1999

[33] C J Moran and E N Bui ldquoSpatial data mining for enhancedsoil map modellingrdquo International Journal of GeographicalInformation Science vol 16 no 6 pp 533ndash549 2002

[34] X Li andA G-O Yeh ldquoDatamining of cellular automatarsquos tran-sition rulesrdquo International Journal of Geographical InformationScience vol 18 no 8 pp 723ndash744 2004

[35] P H Williams R Eyles and G Weiller ldquoPlant microRNAprediction by supervised machine learning using C50 decisiontreesrdquo Journal of Nucleic Acids vol 2012 Article ID 652979 10pages 2012

[36] N Patil R Lathi and V Chitre ldquoCustomer card classificationbased on C5 0 amp CART algorithmsrdquo International Journal ofEngineering Research and Applications vol 2 no 4 pp 164ndash1672012

[37] M M Javidi and E F Roshan ldquoSpeech emotion recognition byusing combinations of C50 neural network (NN) and supportvector machines (SVM) classification methodsrdquo Journal ofMathematics and Computer Science vol 6 pp 191ndash200 2013

[38] E Bauer and R Kohavi ldquoAn empirical comparison of vot-ing classification algorithms bagging boosting and variantsrdquoMachine Learning vol 36 no 1-2 pp 105ndash139 1999

[39] Y Shao and R S Lunetta ldquoComparison of support vectormachine neural network and CART algorithms for the land-cover classification using limited training data pointsrdquo ISPRSJournal of Photogrammetry and Remote Sensing vol 70 pp 78ndash87 2012

[40] Y Bengio andYGrandvalet ldquoNounbiased estimator of the vari-ance of k-fold cross-validationrdquo Journal of Machine LearningResearch vol 5 pp 1089ndash1105 2004

[41] C Chen B He and Z Zeng ldquoAmethod formineral prospectiv-ity mapping integrating C45 decision tree weights-of-evidenceandm-branch smoothing techniques a case study in the easternKunlunMountains Chinardquo Earth Science Informatics vol 7 no1 pp 13ndash24 2014

[42] M van der Gaag T Hoffman M Remijsen et al ldquoThe five-factor model of the positive and negative syndrome scale IIa ten-fold cross-validation of a revised modelrdquo SchizophreniaResearch vol 85 no 1ndash3 pp 280ndash287 2006

[43] M Al Saud ldquoMapping potential areas for groundwater storagein Wadi Aurnah Basin western Arabian Peninsula usingremote sensing and geographic information system techniquesrdquoHydrogeology Journal vol 18 no 6 pp 1481ndash1495 2010

[44] R G Congalton ldquoA review of assessing the accuracy of classifi-cations of remotely sensed datardquo Remote Sensing of Environ-ment vol 37 no 1 pp 35ndash46 1991

[45] F K Hoehler ldquoBias and prevalence effects on kappa viewed interms of sensitivity and specificityrdquo Journal of Clinical Epidemi-ology vol 53 no 5 pp 499ndash503 2000

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 6: Research Article Assessment of Groundwater Potential Based ...downloads.hindawi.com/journals/mpe/2016/2064575.pdf · Figure . In general, slopes control the inltration and ow ability

6 Mathematical Problems in Engineering

Table 2 The precision of each loop based on the 10-fold cross validation ()

1 2 3 4 5 6 7 8 9 10 Averageaccuracy

C50 8333 10000 8125 8986 9025 8729 9450 9421 8750 9628 9045CART 8057 8921 9223 8534 8123 100 8012 7824 8157 8234 8509

Table 3 The importance of each factor based on C50 and CART algorithm

Lithology Topology River density Slope Lineament densityC50 0363 0331 0159 0117 0031CART 0355 0308 0024 0010 0312

242 CART CART decision tree algorithm [39] can dividethe sample set into two subsample sets making the root andintermediate nodes with two branches based on the recur-sively binary segmentation technology CART can handleboth continuous and discrete variables with the impuritylevel-Gini coefficient as the discriminant basis consideringthe probability distribution under the division node

Assume a total of 119870 classes variable 119883 and node for 119879then the Gini index is defined as [16]

Gini (119879) = 119870sum119894=1

119901119879 (119894) (1 minus 119901119879 (119894)) = 1 minus119870sum119894=1

1199012119879 (119894) (7)

When the classes show the equal probability in the node119879 the Gini index achieves the maximum 1 minus 1119870 with onlyone kind in node 119879 the Gini index achieves the minimumThe Gini index increases with the impurity therefore thesubnodes should be added to lower the impurity Whentaking the misclassification cost matrix into considerationthe Gini index formula becomes

Gini (119879) = sum119895 =119894

119862 (119894 | 119895) 119901119879 (119894) 119901119879 (119895) (8)

where 119862(119894 | 119895) represents the cost for misclassifying the casecategory 119895 as 119894 Assuming that the subnode 119878 added to node119879 the Gini index can be expressed as

Gini (119878 119879) = Gini (119879) minus 119901119871Gini (119879119871) minus 119901119877Gini (119879119877) (9)

where 119901119871 119901119877 stand for the proportion of cases in node 119879classified into 119879119871 and 119879119877243 10-Fold Cross Validation The basic idea for the 119896-foldcross validation [40 41] is to equally divide the surveyedsamples into 119896 parts of which 119896 minus 1 parts served as thetraining samples with the remaining one part for validationwith 119896 circulations The method can guarantee every sampleacted as the training sample and the verification sample forone time in the circulations For the 119896-fold cross validationthe commonly used value for 119896 is 10 called 10-fold crossvalidation [42]

3 Results and Discussion

In this study five groundwater relating factors were used inthe analysis and the factors except lithology were continuous

80 tube wells were utilized for training with C50 andCART respectively in the statistical analysis software SPSSClementine 120 and MATLAB According to the confusionmatrix [43] constructed by the verification result the kappacoefficient [44 45] is used to evaluate the accuracy Based onthe 10-fold cross validation with C50 and CART algorithmrespectively the precision of each loop is shown in Table 2The importance of each factor was calculated firstly to deter-mine the selection order of the factors for classification asshown in Table 3 For the C50 algorithm the importance was0363 0331 0159 0117 and 0031 respectively for lithologytopology river density slope and lineament density For theCART algorithm the importance was 0355 0308 00240010 and 0312 respectively After the ten loops the decisiontree with the higher classification accuracy was chosen asoptimal Figure 9 shows the optimal decision tree generatedby C50 algorithm with 6 layers 21 nodes 10 internal nodesand 11 terminal nodes Table 4 shows the rules for the optimaldecision tree by C50 The optimal decision tree generated byCARTalgorithm is shown in Figure 10 with 8 layers 21 nodes11 internal nodes and 10 terminal nodes and the rules areshown in Table 5

Kappa = 119873sum119903119894=1 119909119894119894 minus sum119903119894=1 119909119894+119909+1198941198732 minus sum119903119894=1 119909119894+119909+119894 (10)

where 119903 is the number of the total categories 119873 is the totalnumber for verification 119909119894119894 is the number of the correctclassifications 119909119894+ is the number of samples mistaken fromthe category 119894 for others 119909+119894 is the number of samplesmistaken from other categories for 119894

According to the classification rules we can see that notevery rule takes all the variables into account therefore forthe area lack of the detailed information the typical factorscan help to predict the groundwater potential classificationgrade The decision trees show that both the two algorithmscan determine the dividing point of the variables especiallyfor the continuous variables based on the training data whichmakes the division of the interval more scientific and reducesthe segmentation error compared with the artificial division

According to the decision tree result generated by C50algorithm we did deep analysis Topology was dividedby six nodes 4196ndash4301ndash43165ndash4357ndash44005ndash6240 and thegroundwater was distributed in the low-lying areas Slopecontained two ranges 0ndash243ndash75 and the flat areas werebeneficial to the surface water infiltration River density was

Mathematical Problems in Engineering 7

Table 4 Rules for groundwater potential grade based on C50

Grade Rule

Very goodLithology Q 4196 lt topology le 4301 lineament density le 0558Lithology Q 4196 lt topology le 43165 lineament density gt 0558Lithology Q 43165 lt topology le 44005 river density gt 0784 slope le 243

Good

Lithology Q 4301 lt topology le 43165 lineament density le 0558Lithology Q 43165 lt topology le 4357 river density le 0784Lithology Q 4357 lt topology le 44005 river density gt 0784 slope gt 243Lithology Q 44005 lt topology river density gt 0650

ModerateLithology others except Q river density gt 0506Lithology Q 44005 lt topology river density le 0650Lithology Q 4357 lt topology le 44005 river density le 0784

Poor Lithology others except Q river density le 05

RD

L

T

Others

T

Q

P M

RD RD

M GT S

VG G

LD

T VG

VG G

G M

le0506 gt0506 le43165 gt43165

le0558 gt0558

le4301 gt4301

le4357 gt4357

gt0784

le44005 gt44005

le0650 gt0650

le243 gt243

le0784

Factor GradeL lithologyLD lineament densityT topologyS slopeRD river density

VG very goodG goodM moderateP poor

Figure 9 The optimal decision tree generated by C50 algorithm

classified into four intervals 0ndash0506ndash0650ndash0784ndash114 andreflected the flow capacity in the region and the higherthe better for groundwater enrichment Lineament densitywas divided into two intervals 0ndash0558ndash1 and the moreoccurrence space for groundwater existed with the higherlineament density However for the CART algorithm topol-ogy was divided into three intervals 4196ndash4304ndash4400ndash6240slope contained two ranges 0ndash2386ndash75 river density was

classified into three intervals 0ndash0685ndash0703ndash114 lineamentdensitywas divided into four intervals 0ndash0296ndash0335ndash0713ndash1 After applying the optimal decision trees respectively tothe whole study area [6 41] the groundwater potential zonemaps were derived and shown in Figures 11 and 12

According to the results generated by the decision treesthe ldquovery goodrdquo area is mostly located in the broad plainzone with a patchy distribution covering 10325 km2 about

8 Mathematical Problems in Engineering

Table 5 Rules for groundwater potential grade based on CART

Grade Rule

Very goodRiver density gt 0703 4304 lt topology le 4400 lineament density gt 0713Lineament density gt 0296 topology le 4304Lineament density gt 0335 4304 lt topology le 4400 river density gt 0703 slope le 2386

GoodRiver density lt 0703 4304 lt topology le 4400 lineament density gt 03350335 lt lineament density le 0713 4304 lt topology le 4400 river density gt 0703 slope gt 2386Lineament density gt 0296 topology gt 4400 river density gt 0685

ModerateLineament density le 0296 lithology Q0296 lt lineament density le 0335 4304 lt topology le 4400Lineament density gt 0296 topology gt 4400 river density le 0685

Poor Lineament density le 0296 lithology others except Q

LD

L T

RDP M

Others Q

M G

T

VG LD

M RD

G LD

S VG

VG G

le0296 gt0296

le4400 gt4400

le4304 gt4304 le0685 gt0685

le0335 gt0335

le0703 gt0703

le0713 gt0713

gt2386

Factor GradeL lithologyLD lineament densityT topologyS slopeRD river density

VG very goodG goodM moderateP poor

le2386

Figure 10 The optimal decision tree generated by CART algorithm

Mathematical Problems in Engineering 9

5

N

79∘50

0E 80

∘00E 80

∘10

0E 80

∘20

0E

33∘00

N33∘100

N

0 10 20(km)

Very goodGoodModerate

PoorSurveyed tube wells

79∘40

0E79

∘30

0E

33∘200

N

Figure 11 Groundwater potential map of the study area based onC50

0 5 10 20

N

79∘50

0E 80

∘00E 80

∘10

0E 80

∘20

0E

33∘00

N33∘100

N

(km)

Very goodGoodModerate

PoorSurveyed tube wells

79∘40

0E79

∘30

0E

33∘200

N

Figure 12 Groundwater potential map of the study area based onCART

461 of the study area for C50 algorithm and 10505 km2about 468 of the study area for CART algorithm for thewater infiltration into the underground with sufficient timeand space The study shows that low-lying areas with goodflow condition well-developed stratigraphic gap and strongconnectivity like the southwest beaches of Ban Gong Lakecan be the first target for groundwater resources The ldquogoodrdquoarea was distributed along the river with 19210 km2 covering858 for C50 algorithm and 22602 km2 about 1009 ofthe study area for CART algorithm just like a long stripThe zones mainly had a planar distribution on the bottomof the diluvia fan on the low mountain and hilly terrainand a banded distribution on the mountain watershed ofpluvial valleys situated upstream and on the peripheryof ldquovery goodrdquo area which can serve as the candidate

target for groundwater exploration The ldquomoderaterdquo zonewith 59551 km2 occupying 2659 for C50 algorithmand 58453 km2 about 2610 of the study area for CARTalgorithm spreads around the upstream tributaries Thezone is located on both sides of the river valleys and the topof the diluvia fan The ldquopoorrdquo area occupies 6023 withthe area of 134914 km2 for C50 algorithm and 132440 km2about 5913 of the study area for CART algorithm which ismostly a mountainous region with a high altitude

4 Conclusions

In this study C50 and CART algorithms were applied for thedecision tree generation to predict the groundwater potentialzone with the five relating factors and the 10-fold crossvalidation method was adopted to verify the classificationresult with the kappa coefficient From this paper we candraw some conclusions as follows(1) In the study area the five groundwater relating fac-tors lithology topology slope river density and lineamentdensity were appropriate for the groundwater potential gradeprediction and the importance based on C50 algorithm was0363 0331 0159 0117 and 003 respectively for CARTalgorithm the importance was 0355 0308 0024 0010 and0312 respectively(2) Based on the 10-fold cross validation both C50 andCART could be applied for MCDM with the categoricaland continuous variables simultaneously with the averageaccuracy of 9045 and 8509 respectively however C50algorithm showed higher classification accuracy than CARTalgorithm(3) After applying the optimal decision trees to the wholestudy area respectively the groundwater potential zone mapwas delineated and the four grades of groundwater poten-tial zones ldquovery goodrdquo ldquogoodrdquo ldquomoderaterdquo and ldquopoorrdquooccupied the area of 10325 km2 19210 km2 59551 km2 and134914 km2 with the percentages of 461 858 2659and 6023 respectively for C50 and for CART the areaof 10505 km2 22602 km2 58453 km2 and 132440 km2with the percentages of 468 1009 2610 and 5913respectively

The study result can provide reference for groundwaterexploration and in the future work we will consider morerelating factors and survey more wells to enrich the modelThe integration of decision tree algorithms and MCDM inour study applies only to the qualitative assessment for thelack of the prior knowledge in the large area therefore theextra analysis is needed for the specific point investigationThe accuracy demonstrates that the 10-fold cross validation issuitable for training and verifying the decision tree howeverthe tested dataset is limited and more tube wells should beinvestigated to validate the stability of the model

Competing Interests

The authors declare that there is no conflict of interests in thisresearch work

10 Mathematical Problems in Engineering

Acknowledgments

This work was supported by Development Program of ChinaGroundwater Exploration Technology in the Water ShortageRegion (863 Program 2012AA062601)

References

[1] D Oikonomidis S Dimogianni N Kazakis and K VoudourisldquoA GISRemote Sensing-based methodology for groundwaterpotentiality assessment in Tirnavos area Greecerdquo Journal ofHydrology vol 525 pp 197ndash208 2015

[2] D Pinto S Shrestha M S Babel and S Ninsawat ldquoDelineationof groundwater potential zones in the Comoro watershedTimor Leste using GIS remote sensing and analytic hierarchyprocess (AHP) techniquerdquo Applied Water Science 2015

[3] O A Fashae M N Tijani A O Talabi and O I AdedejildquoDelineation of groundwater potential zones in the crystallinebasement terrain of SW-Nigeria an integrated GIS and remotesensing approachrdquoAppliedWater Science vol 4 no 1 pp 19ndash382014

[4] JMallick C K SinghHAl-Wadi et al ldquoGeospatial and geosta-tistical approach for groundwater potential zone delineationrdquoHydrological Processes vol 29 no 3 pp 395ndash418 2015

[5] S Lee K-Y Song Y Kim and I Park ldquoRegional groundwaterproductivity potential mapping using a geographic informationsystem (GIS) based artificial neural network modelrdquo Hydroge-ology Journal vol 20 no 8 pp 1511ndash1527 2012

[6] S Lee and C-W Lee ldquoApplication of decision-tree model togroundwater productivity-potential mappingrdquo Sustainabilityvol 7 no 10 pp 13416ndash13432 2015

[7] AOzdemir ldquoUsing a binary logistic regressionmethod andGISfor evaluating andmapping the groundwater spring potential inthe Sultan Mountains (Aksehir Turkey)rdquo Journal of Hydrologyvol 405 no 1-2 pp 123ndash136 2011

[8] A Ozdemir ldquoGIS-based groundwater spring potential mappingin the SultanMountains (Konya Turkey) using frequency ratioweights of evidence and logistic regression methods and theircomparisonrdquo Journal of Hydrology vol 411 no 3-4 pp 290ndash308 2011

[9] D Machiwal M K Jha and B C Mal ldquoAssessment ofgroundwater potential in a semi-arid region of india usingremote sensing GIS and MCDM techniquesrdquo Water ResourcesManagement vol 25 no 5 pp 1359ndash1386 2011

[10] S A Naghibi and H R Pourghasemi ldquoA comparative assess-ment between three machine learning models and their per-formance comparison by bivariate and multivariate statisticalmethods in groundwater potential mappingrdquo Water ResourcesManagement vol 29 no 14 pp 5217ndash5236 2015

[11] D Zheng-Dong Y Xin L Fan et al ldquoConstruction andinvestigation of groundwater remote sensing fuzzy assessmentindexrdquo Chinese Journal of Geophysics vol 56 no 11 pp 3908ndash3916 2013

[12] O FAlthuwaynee B PradhanH-J Park and JH Lee ldquoAnovelensemble decision tree-based CHi-squared Automatic Interac-tion Detection (CHAID) and multivariate logistic regressionmodels in landslide susceptibility mappingrdquo Landslides vol 11no 6 pp 1063ndash1078 2014

[13] I Park and S Lee ldquoSpatial prediction of landslide susceptibilityusing a decision tree approach a case study of the Pyeongchangarea Koreardquo International Journal of Remote Sensing vol 35 no16 pp 6089ndash6112 2014

[14] C Baker R Lawrence C Montagne and D Patten ldquoMappingwetlands and riparian areas using landsat ETM+ imagery anddecision-tree-based modelsrdquo Wetlands vol 26 no 2 pp 465ndash474 2006

[15] R Kohavi and J R Quinlan ldquoData mining tasks and methodsclassification decision-tree discoveryrdquo in Handbook of DataMining and Knowledge Discovery pp 267ndash276 Oxford Univer-sity Press New York NY USA 2002

[16] N Lin D Noe and X He ldquoTree-based methods and theirapplicationsrdquo in Springer Handbook of Engineering Statistics pp551ndash570 Springer London UK 2006

[17] I Klein U Gessner and C Kuenzer ldquoRegional land covermapping and change detection in Central Asia using MODIStime-seriesrdquo Applied Geography vol 35 no 1-2 pp 219ndash2342012

[18] J R Quinlan ldquoImproved use of continuous attributes in C4 5rdquoJournal of Artificial Intelligence Research vol 4 pp 77ndash90 1996

[19] S B Kotsiantis ldquoSupervised machine learning a review ofclassification techniquesrdquo Informatica vol 31 no 3 pp 249ndash268 2007

[20] K Polat and S Gunes ldquoA novel hybrid intelligent method basedon C45 decision tree classifier and one-against-all approachfor multi-class classification problemsrdquo Expert Systems withApplications vol 36 no 2 pp 1587ndash1592 2009

[21] M K Jha V M Chowdary and A Chowdhury ldquoGroundwaterassessment in Salboni Block West Bengal (India) using remotesensing geographical information system and multi-criteriadecision analysis techniquesrdquoHydrogeology Journal vol 18 no7 pp 1713ndash1728 2010

[22] V B Rekha A P Thomas M Suma and H Vijith ldquoAnintegration of spatial information technology for groundwa-ter potential and quality investigations in Koduvan Ar Sub-Watershed of Meenachil River Basin Kerala Indiardquo Journal ofthe Indian Society of Remote Sensing vol 39 no 1 pp 63ndash712011

[23] O Rahmati ANazari SamaniMMahdavi H R Pourghasemiand H Zeinivand ldquoGroundwater potential mapping at Kurdis-tan region of Iran using analytic hierarchy process and GISrdquoArabian Journal of Geosciences vol 8 no 9 pp 7059ndash7071 2014

[24] K R Preeja S Joseph J Thomas and H Vijith ldquoIdentificationof groundwater potential zones of a tropical river basin (KeralaIndia) using remote sensing and GIS techniquesrdquo Journal of theIndian Society of Remote Sensing vol 39 no 1 pp 83ndash94 2011

[25] K Narendra K Nageswara Rao and P Swarna Latha ldquoIntegrat-ing remote sensing and GIS for identification of groundwaterprospective zones in the Narava basin Visakhapatnam regionAndhra Pradeshrdquo Journal of the Geological Society of India vol81 no 2 pp 248ndash260 2013

[26] M Ture F Tokatli and I Kurt ldquoUsing KaplanndashMeier analysistogether with decision tree methods (CampRT CHAID QUESTC45 and ID3) in determining recurrence-free survival of breastcancer patientsrdquo Expert Systems with Applications vol 36 no 2pp 2017ndash2026 2009

[27] N A AL-Fakhry ldquoSummarizing data by using data miningtechniques a comparative by using C45 and C50 algorithmsrdquoInternational Education and Research Journal vol 2 no 4 2016

[28] R Kohavi and J R Quinlan ldquoData mining tasks and methodsclassification decision-tree discoveryrdquo in Handbook of DataMining and Knowledge Discovery pp 267ndash276 Oxford Univer-sity Press 2002

Mathematical Problems in Engineering 11

[29] U M Fayyad and K B Irani ldquoOn the handling of continuous-valued attributes in decision tree generationrdquoMachine Learningvol 8 no 1 pp 87ndash102 1992

[30] J R Quinlan ldquoBagging boosting and C4 5rdquo AAAIIAAI vol1 pp 725ndash730 1996

[31] S Pang and J C Gong ldquoC5 0 classification algorithm andapplication on individual credit evaluation of banksrdquo SystemsEngineering-Theory amp Practice vol 29 no 12 pp 94ndash104 2009

[32] M A Friedl C E Brodley and A H Strahler ldquoMaximizingland cover classification accuracies produced by decision treesat continental to global scalesrdquo IEEE Transactions on Geoscienceand Remote Sensing vol 37 no 2 pp 969ndash977 1999

[33] C J Moran and E N Bui ldquoSpatial data mining for enhancedsoil map modellingrdquo International Journal of GeographicalInformation Science vol 16 no 6 pp 533ndash549 2002

[34] X Li andA G-O Yeh ldquoDatamining of cellular automatarsquos tran-sition rulesrdquo International Journal of Geographical InformationScience vol 18 no 8 pp 723ndash744 2004

[35] P H Williams R Eyles and G Weiller ldquoPlant microRNAprediction by supervised machine learning using C50 decisiontreesrdquo Journal of Nucleic Acids vol 2012 Article ID 652979 10pages 2012

[36] N Patil R Lathi and V Chitre ldquoCustomer card classificationbased on C5 0 amp CART algorithmsrdquo International Journal ofEngineering Research and Applications vol 2 no 4 pp 164ndash1672012

[37] M M Javidi and E F Roshan ldquoSpeech emotion recognition byusing combinations of C50 neural network (NN) and supportvector machines (SVM) classification methodsrdquo Journal ofMathematics and Computer Science vol 6 pp 191ndash200 2013

[38] E Bauer and R Kohavi ldquoAn empirical comparison of vot-ing classification algorithms bagging boosting and variantsrdquoMachine Learning vol 36 no 1-2 pp 105ndash139 1999

[39] Y Shao and R S Lunetta ldquoComparison of support vectormachine neural network and CART algorithms for the land-cover classification using limited training data pointsrdquo ISPRSJournal of Photogrammetry and Remote Sensing vol 70 pp 78ndash87 2012

[40] Y Bengio andYGrandvalet ldquoNounbiased estimator of the vari-ance of k-fold cross-validationrdquo Journal of Machine LearningResearch vol 5 pp 1089ndash1105 2004

[41] C Chen B He and Z Zeng ldquoAmethod formineral prospectiv-ity mapping integrating C45 decision tree weights-of-evidenceandm-branch smoothing techniques a case study in the easternKunlunMountains Chinardquo Earth Science Informatics vol 7 no1 pp 13ndash24 2014

[42] M van der Gaag T Hoffman M Remijsen et al ldquoThe five-factor model of the positive and negative syndrome scale IIa ten-fold cross-validation of a revised modelrdquo SchizophreniaResearch vol 85 no 1ndash3 pp 280ndash287 2006

[43] M Al Saud ldquoMapping potential areas for groundwater storagein Wadi Aurnah Basin western Arabian Peninsula usingremote sensing and geographic information system techniquesrdquoHydrogeology Journal vol 18 no 6 pp 1481ndash1495 2010

[44] R G Congalton ldquoA review of assessing the accuracy of classifi-cations of remotely sensed datardquo Remote Sensing of Environ-ment vol 37 no 1 pp 35ndash46 1991

[45] F K Hoehler ldquoBias and prevalence effects on kappa viewed interms of sensitivity and specificityrdquo Journal of Clinical Epidemi-ology vol 53 no 5 pp 499ndash503 2000

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 7: Research Article Assessment of Groundwater Potential Based ...downloads.hindawi.com/journals/mpe/2016/2064575.pdf · Figure . In general, slopes control the inltration and ow ability

Mathematical Problems in Engineering 7

Table 4 Rules for groundwater potential grade based on C50

Grade Rule

Very goodLithology Q 4196 lt topology le 4301 lineament density le 0558Lithology Q 4196 lt topology le 43165 lineament density gt 0558Lithology Q 43165 lt topology le 44005 river density gt 0784 slope le 243

Good

Lithology Q 4301 lt topology le 43165 lineament density le 0558Lithology Q 43165 lt topology le 4357 river density le 0784Lithology Q 4357 lt topology le 44005 river density gt 0784 slope gt 243Lithology Q 44005 lt topology river density gt 0650

ModerateLithology others except Q river density gt 0506Lithology Q 44005 lt topology river density le 0650Lithology Q 4357 lt topology le 44005 river density le 0784

Poor Lithology others except Q river density le 05

RD

L

T

Others

T

Q

P M

RD RD

M GT S

VG G

LD

T VG

VG G

G M

le0506 gt0506 le43165 gt43165

le0558 gt0558

le4301 gt4301

le4357 gt4357

gt0784

le44005 gt44005

le0650 gt0650

le243 gt243

le0784

Factor GradeL lithologyLD lineament densityT topologyS slopeRD river density

VG very goodG goodM moderateP poor

Figure 9 The optimal decision tree generated by C50 algorithm

classified into four intervals 0ndash0506ndash0650ndash0784ndash114 andreflected the flow capacity in the region and the higherthe better for groundwater enrichment Lineament densitywas divided into two intervals 0ndash0558ndash1 and the moreoccurrence space for groundwater existed with the higherlineament density However for the CART algorithm topol-ogy was divided into three intervals 4196ndash4304ndash4400ndash6240slope contained two ranges 0ndash2386ndash75 river density was

classified into three intervals 0ndash0685ndash0703ndash114 lineamentdensitywas divided into four intervals 0ndash0296ndash0335ndash0713ndash1 After applying the optimal decision trees respectively tothe whole study area [6 41] the groundwater potential zonemaps were derived and shown in Figures 11 and 12

According to the results generated by the decision treesthe ldquovery goodrdquo area is mostly located in the broad plainzone with a patchy distribution covering 10325 km2 about

8 Mathematical Problems in Engineering

Table 5 Rules for groundwater potential grade based on CART

Grade Rule

Very goodRiver density gt 0703 4304 lt topology le 4400 lineament density gt 0713Lineament density gt 0296 topology le 4304Lineament density gt 0335 4304 lt topology le 4400 river density gt 0703 slope le 2386

GoodRiver density lt 0703 4304 lt topology le 4400 lineament density gt 03350335 lt lineament density le 0713 4304 lt topology le 4400 river density gt 0703 slope gt 2386Lineament density gt 0296 topology gt 4400 river density gt 0685

ModerateLineament density le 0296 lithology Q0296 lt lineament density le 0335 4304 lt topology le 4400Lineament density gt 0296 topology gt 4400 river density le 0685

Poor Lineament density le 0296 lithology others except Q

LD

L T

RDP M

Others Q

M G

T

VG LD

M RD

G LD

S VG

VG G

le0296 gt0296

le4400 gt4400

le4304 gt4304 le0685 gt0685

le0335 gt0335

le0703 gt0703

le0713 gt0713

gt2386

Factor GradeL lithologyLD lineament densityT topologyS slopeRD river density

VG very goodG goodM moderateP poor

le2386

Figure 10 The optimal decision tree generated by CART algorithm

Mathematical Problems in Engineering 9

5

N

79∘50

0E 80

∘00E 80

∘10

0E 80

∘20

0E

33∘00

N33∘100

N

0 10 20(km)

Very goodGoodModerate

PoorSurveyed tube wells

79∘40

0E79

∘30

0E

33∘200

N

Figure 11 Groundwater potential map of the study area based onC50

0 5 10 20

N

79∘50

0E 80

∘00E 80

∘10

0E 80

∘20

0E

33∘00

N33∘100

N

(km)

Very goodGoodModerate

PoorSurveyed tube wells

79∘40

0E79

∘30

0E

33∘200

N

Figure 12 Groundwater potential map of the study area based onCART

461 of the study area for C50 algorithm and 10505 km2about 468 of the study area for CART algorithm for thewater infiltration into the underground with sufficient timeand space The study shows that low-lying areas with goodflow condition well-developed stratigraphic gap and strongconnectivity like the southwest beaches of Ban Gong Lakecan be the first target for groundwater resources The ldquogoodrdquoarea was distributed along the river with 19210 km2 covering858 for C50 algorithm and 22602 km2 about 1009 ofthe study area for CART algorithm just like a long stripThe zones mainly had a planar distribution on the bottomof the diluvia fan on the low mountain and hilly terrainand a banded distribution on the mountain watershed ofpluvial valleys situated upstream and on the peripheryof ldquovery goodrdquo area which can serve as the candidate

target for groundwater exploration The ldquomoderaterdquo zonewith 59551 km2 occupying 2659 for C50 algorithmand 58453 km2 about 2610 of the study area for CARTalgorithm spreads around the upstream tributaries Thezone is located on both sides of the river valleys and the topof the diluvia fan The ldquopoorrdquo area occupies 6023 withthe area of 134914 km2 for C50 algorithm and 132440 km2about 5913 of the study area for CART algorithm which ismostly a mountainous region with a high altitude

4 Conclusions

In this study C50 and CART algorithms were applied for thedecision tree generation to predict the groundwater potentialzone with the five relating factors and the 10-fold crossvalidation method was adopted to verify the classificationresult with the kappa coefficient From this paper we candraw some conclusions as follows(1) In the study area the five groundwater relating fac-tors lithology topology slope river density and lineamentdensity were appropriate for the groundwater potential gradeprediction and the importance based on C50 algorithm was0363 0331 0159 0117 and 003 respectively for CARTalgorithm the importance was 0355 0308 0024 0010 and0312 respectively(2) Based on the 10-fold cross validation both C50 andCART could be applied for MCDM with the categoricaland continuous variables simultaneously with the averageaccuracy of 9045 and 8509 respectively however C50algorithm showed higher classification accuracy than CARTalgorithm(3) After applying the optimal decision trees to the wholestudy area respectively the groundwater potential zone mapwas delineated and the four grades of groundwater poten-tial zones ldquovery goodrdquo ldquogoodrdquo ldquomoderaterdquo and ldquopoorrdquooccupied the area of 10325 km2 19210 km2 59551 km2 and134914 km2 with the percentages of 461 858 2659and 6023 respectively for C50 and for CART the areaof 10505 km2 22602 km2 58453 km2 and 132440 km2with the percentages of 468 1009 2610 and 5913respectively

The study result can provide reference for groundwaterexploration and in the future work we will consider morerelating factors and survey more wells to enrich the modelThe integration of decision tree algorithms and MCDM inour study applies only to the qualitative assessment for thelack of the prior knowledge in the large area therefore theextra analysis is needed for the specific point investigationThe accuracy demonstrates that the 10-fold cross validation issuitable for training and verifying the decision tree howeverthe tested dataset is limited and more tube wells should beinvestigated to validate the stability of the model

Competing Interests

The authors declare that there is no conflict of interests in thisresearch work

10 Mathematical Problems in Engineering

Acknowledgments

This work was supported by Development Program of ChinaGroundwater Exploration Technology in the Water ShortageRegion (863 Program 2012AA062601)

References

[1] D Oikonomidis S Dimogianni N Kazakis and K VoudourisldquoA GISRemote Sensing-based methodology for groundwaterpotentiality assessment in Tirnavos area Greecerdquo Journal ofHydrology vol 525 pp 197ndash208 2015

[2] D Pinto S Shrestha M S Babel and S Ninsawat ldquoDelineationof groundwater potential zones in the Comoro watershedTimor Leste using GIS remote sensing and analytic hierarchyprocess (AHP) techniquerdquo Applied Water Science 2015

[3] O A Fashae M N Tijani A O Talabi and O I AdedejildquoDelineation of groundwater potential zones in the crystallinebasement terrain of SW-Nigeria an integrated GIS and remotesensing approachrdquoAppliedWater Science vol 4 no 1 pp 19ndash382014

[4] JMallick C K SinghHAl-Wadi et al ldquoGeospatial and geosta-tistical approach for groundwater potential zone delineationrdquoHydrological Processes vol 29 no 3 pp 395ndash418 2015

[5] S Lee K-Y Song Y Kim and I Park ldquoRegional groundwaterproductivity potential mapping using a geographic informationsystem (GIS) based artificial neural network modelrdquo Hydroge-ology Journal vol 20 no 8 pp 1511ndash1527 2012

[6] S Lee and C-W Lee ldquoApplication of decision-tree model togroundwater productivity-potential mappingrdquo Sustainabilityvol 7 no 10 pp 13416ndash13432 2015

[7] AOzdemir ldquoUsing a binary logistic regressionmethod andGISfor evaluating andmapping the groundwater spring potential inthe Sultan Mountains (Aksehir Turkey)rdquo Journal of Hydrologyvol 405 no 1-2 pp 123ndash136 2011

[8] A Ozdemir ldquoGIS-based groundwater spring potential mappingin the SultanMountains (Konya Turkey) using frequency ratioweights of evidence and logistic regression methods and theircomparisonrdquo Journal of Hydrology vol 411 no 3-4 pp 290ndash308 2011

[9] D Machiwal M K Jha and B C Mal ldquoAssessment ofgroundwater potential in a semi-arid region of india usingremote sensing GIS and MCDM techniquesrdquo Water ResourcesManagement vol 25 no 5 pp 1359ndash1386 2011

[10] S A Naghibi and H R Pourghasemi ldquoA comparative assess-ment between three machine learning models and their per-formance comparison by bivariate and multivariate statisticalmethods in groundwater potential mappingrdquo Water ResourcesManagement vol 29 no 14 pp 5217ndash5236 2015

[11] D Zheng-Dong Y Xin L Fan et al ldquoConstruction andinvestigation of groundwater remote sensing fuzzy assessmentindexrdquo Chinese Journal of Geophysics vol 56 no 11 pp 3908ndash3916 2013

[12] O FAlthuwaynee B PradhanH-J Park and JH Lee ldquoAnovelensemble decision tree-based CHi-squared Automatic Interac-tion Detection (CHAID) and multivariate logistic regressionmodels in landslide susceptibility mappingrdquo Landslides vol 11no 6 pp 1063ndash1078 2014

[13] I Park and S Lee ldquoSpatial prediction of landslide susceptibilityusing a decision tree approach a case study of the Pyeongchangarea Koreardquo International Journal of Remote Sensing vol 35 no16 pp 6089ndash6112 2014

[14] C Baker R Lawrence C Montagne and D Patten ldquoMappingwetlands and riparian areas using landsat ETM+ imagery anddecision-tree-based modelsrdquo Wetlands vol 26 no 2 pp 465ndash474 2006

[15] R Kohavi and J R Quinlan ldquoData mining tasks and methodsclassification decision-tree discoveryrdquo in Handbook of DataMining and Knowledge Discovery pp 267ndash276 Oxford Univer-sity Press New York NY USA 2002

[16] N Lin D Noe and X He ldquoTree-based methods and theirapplicationsrdquo in Springer Handbook of Engineering Statistics pp551ndash570 Springer London UK 2006

[17] I Klein U Gessner and C Kuenzer ldquoRegional land covermapping and change detection in Central Asia using MODIStime-seriesrdquo Applied Geography vol 35 no 1-2 pp 219ndash2342012

[18] J R Quinlan ldquoImproved use of continuous attributes in C4 5rdquoJournal of Artificial Intelligence Research vol 4 pp 77ndash90 1996

[19] S B Kotsiantis ldquoSupervised machine learning a review ofclassification techniquesrdquo Informatica vol 31 no 3 pp 249ndash268 2007

[20] K Polat and S Gunes ldquoA novel hybrid intelligent method basedon C45 decision tree classifier and one-against-all approachfor multi-class classification problemsrdquo Expert Systems withApplications vol 36 no 2 pp 1587ndash1592 2009

[21] M K Jha V M Chowdary and A Chowdhury ldquoGroundwaterassessment in Salboni Block West Bengal (India) using remotesensing geographical information system and multi-criteriadecision analysis techniquesrdquoHydrogeology Journal vol 18 no7 pp 1713ndash1728 2010

[22] V B Rekha A P Thomas M Suma and H Vijith ldquoAnintegration of spatial information technology for groundwa-ter potential and quality investigations in Koduvan Ar Sub-Watershed of Meenachil River Basin Kerala Indiardquo Journal ofthe Indian Society of Remote Sensing vol 39 no 1 pp 63ndash712011

[23] O Rahmati ANazari SamaniMMahdavi H R Pourghasemiand H Zeinivand ldquoGroundwater potential mapping at Kurdis-tan region of Iran using analytic hierarchy process and GISrdquoArabian Journal of Geosciences vol 8 no 9 pp 7059ndash7071 2014

[24] K R Preeja S Joseph J Thomas and H Vijith ldquoIdentificationof groundwater potential zones of a tropical river basin (KeralaIndia) using remote sensing and GIS techniquesrdquo Journal of theIndian Society of Remote Sensing vol 39 no 1 pp 83ndash94 2011

[25] K Narendra K Nageswara Rao and P Swarna Latha ldquoIntegrat-ing remote sensing and GIS for identification of groundwaterprospective zones in the Narava basin Visakhapatnam regionAndhra Pradeshrdquo Journal of the Geological Society of India vol81 no 2 pp 248ndash260 2013

[26] M Ture F Tokatli and I Kurt ldquoUsing KaplanndashMeier analysistogether with decision tree methods (CampRT CHAID QUESTC45 and ID3) in determining recurrence-free survival of breastcancer patientsrdquo Expert Systems with Applications vol 36 no 2pp 2017ndash2026 2009

[27] N A AL-Fakhry ldquoSummarizing data by using data miningtechniques a comparative by using C45 and C50 algorithmsrdquoInternational Education and Research Journal vol 2 no 4 2016

[28] R Kohavi and J R Quinlan ldquoData mining tasks and methodsclassification decision-tree discoveryrdquo in Handbook of DataMining and Knowledge Discovery pp 267ndash276 Oxford Univer-sity Press 2002

Mathematical Problems in Engineering 11

[29] U M Fayyad and K B Irani ldquoOn the handling of continuous-valued attributes in decision tree generationrdquoMachine Learningvol 8 no 1 pp 87ndash102 1992

[30] J R Quinlan ldquoBagging boosting and C4 5rdquo AAAIIAAI vol1 pp 725ndash730 1996

[31] S Pang and J C Gong ldquoC5 0 classification algorithm andapplication on individual credit evaluation of banksrdquo SystemsEngineering-Theory amp Practice vol 29 no 12 pp 94ndash104 2009

[32] M A Friedl C E Brodley and A H Strahler ldquoMaximizingland cover classification accuracies produced by decision treesat continental to global scalesrdquo IEEE Transactions on Geoscienceand Remote Sensing vol 37 no 2 pp 969ndash977 1999

[33] C J Moran and E N Bui ldquoSpatial data mining for enhancedsoil map modellingrdquo International Journal of GeographicalInformation Science vol 16 no 6 pp 533ndash549 2002

[34] X Li andA G-O Yeh ldquoDatamining of cellular automatarsquos tran-sition rulesrdquo International Journal of Geographical InformationScience vol 18 no 8 pp 723ndash744 2004

[35] P H Williams R Eyles and G Weiller ldquoPlant microRNAprediction by supervised machine learning using C50 decisiontreesrdquo Journal of Nucleic Acids vol 2012 Article ID 652979 10pages 2012

[36] N Patil R Lathi and V Chitre ldquoCustomer card classificationbased on C5 0 amp CART algorithmsrdquo International Journal ofEngineering Research and Applications vol 2 no 4 pp 164ndash1672012

[37] M M Javidi and E F Roshan ldquoSpeech emotion recognition byusing combinations of C50 neural network (NN) and supportvector machines (SVM) classification methodsrdquo Journal ofMathematics and Computer Science vol 6 pp 191ndash200 2013

[38] E Bauer and R Kohavi ldquoAn empirical comparison of vot-ing classification algorithms bagging boosting and variantsrdquoMachine Learning vol 36 no 1-2 pp 105ndash139 1999

[39] Y Shao and R S Lunetta ldquoComparison of support vectormachine neural network and CART algorithms for the land-cover classification using limited training data pointsrdquo ISPRSJournal of Photogrammetry and Remote Sensing vol 70 pp 78ndash87 2012

[40] Y Bengio andYGrandvalet ldquoNounbiased estimator of the vari-ance of k-fold cross-validationrdquo Journal of Machine LearningResearch vol 5 pp 1089ndash1105 2004

[41] C Chen B He and Z Zeng ldquoAmethod formineral prospectiv-ity mapping integrating C45 decision tree weights-of-evidenceandm-branch smoothing techniques a case study in the easternKunlunMountains Chinardquo Earth Science Informatics vol 7 no1 pp 13ndash24 2014

[42] M van der Gaag T Hoffman M Remijsen et al ldquoThe five-factor model of the positive and negative syndrome scale IIa ten-fold cross-validation of a revised modelrdquo SchizophreniaResearch vol 85 no 1ndash3 pp 280ndash287 2006

[43] M Al Saud ldquoMapping potential areas for groundwater storagein Wadi Aurnah Basin western Arabian Peninsula usingremote sensing and geographic information system techniquesrdquoHydrogeology Journal vol 18 no 6 pp 1481ndash1495 2010

[44] R G Congalton ldquoA review of assessing the accuracy of classifi-cations of remotely sensed datardquo Remote Sensing of Environ-ment vol 37 no 1 pp 35ndash46 1991

[45] F K Hoehler ldquoBias and prevalence effects on kappa viewed interms of sensitivity and specificityrdquo Journal of Clinical Epidemi-ology vol 53 no 5 pp 499ndash503 2000

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 8: Research Article Assessment of Groundwater Potential Based ...downloads.hindawi.com/journals/mpe/2016/2064575.pdf · Figure . In general, slopes control the inltration and ow ability

8 Mathematical Problems in Engineering

Table 5 Rules for groundwater potential grade based on CART

Grade Rule

Very goodRiver density gt 0703 4304 lt topology le 4400 lineament density gt 0713Lineament density gt 0296 topology le 4304Lineament density gt 0335 4304 lt topology le 4400 river density gt 0703 slope le 2386

GoodRiver density lt 0703 4304 lt topology le 4400 lineament density gt 03350335 lt lineament density le 0713 4304 lt topology le 4400 river density gt 0703 slope gt 2386Lineament density gt 0296 topology gt 4400 river density gt 0685

ModerateLineament density le 0296 lithology Q0296 lt lineament density le 0335 4304 lt topology le 4400Lineament density gt 0296 topology gt 4400 river density le 0685

Poor Lineament density le 0296 lithology others except Q

LD

L T

RDP M

Others Q

M G

T

VG LD

M RD

G LD

S VG

VG G

le0296 gt0296

le4400 gt4400

le4304 gt4304 le0685 gt0685

le0335 gt0335

le0703 gt0703

le0713 gt0713

gt2386

Factor GradeL lithologyLD lineament densityT topologyS slopeRD river density

VG very goodG goodM moderateP poor

le2386

Figure 10 The optimal decision tree generated by CART algorithm

Mathematical Problems in Engineering 9

5

N

79∘50

0E 80

∘00E 80

∘10

0E 80

∘20

0E

33∘00

N33∘100

N

0 10 20(km)

Very goodGoodModerate

PoorSurveyed tube wells

79∘40

0E79

∘30

0E

33∘200

N

Figure 11 Groundwater potential map of the study area based onC50

0 5 10 20

N

79∘50

0E 80

∘00E 80

∘10

0E 80

∘20

0E

33∘00

N33∘100

N

(km)

Very goodGoodModerate

PoorSurveyed tube wells

79∘40

0E79

∘30

0E

33∘200

N

Figure 12 Groundwater potential map of the study area based onCART

461 of the study area for C50 algorithm and 10505 km2about 468 of the study area for CART algorithm for thewater infiltration into the underground with sufficient timeand space The study shows that low-lying areas with goodflow condition well-developed stratigraphic gap and strongconnectivity like the southwest beaches of Ban Gong Lakecan be the first target for groundwater resources The ldquogoodrdquoarea was distributed along the river with 19210 km2 covering858 for C50 algorithm and 22602 km2 about 1009 ofthe study area for CART algorithm just like a long stripThe zones mainly had a planar distribution on the bottomof the diluvia fan on the low mountain and hilly terrainand a banded distribution on the mountain watershed ofpluvial valleys situated upstream and on the peripheryof ldquovery goodrdquo area which can serve as the candidate

target for groundwater exploration The ldquomoderaterdquo zonewith 59551 km2 occupying 2659 for C50 algorithmand 58453 km2 about 2610 of the study area for CARTalgorithm spreads around the upstream tributaries Thezone is located on both sides of the river valleys and the topof the diluvia fan The ldquopoorrdquo area occupies 6023 withthe area of 134914 km2 for C50 algorithm and 132440 km2about 5913 of the study area for CART algorithm which ismostly a mountainous region with a high altitude

4 Conclusions

In this study C50 and CART algorithms were applied for thedecision tree generation to predict the groundwater potentialzone with the five relating factors and the 10-fold crossvalidation method was adopted to verify the classificationresult with the kappa coefficient From this paper we candraw some conclusions as follows(1) In the study area the five groundwater relating fac-tors lithology topology slope river density and lineamentdensity were appropriate for the groundwater potential gradeprediction and the importance based on C50 algorithm was0363 0331 0159 0117 and 003 respectively for CARTalgorithm the importance was 0355 0308 0024 0010 and0312 respectively(2) Based on the 10-fold cross validation both C50 andCART could be applied for MCDM with the categoricaland continuous variables simultaneously with the averageaccuracy of 9045 and 8509 respectively however C50algorithm showed higher classification accuracy than CARTalgorithm(3) After applying the optimal decision trees to the wholestudy area respectively the groundwater potential zone mapwas delineated and the four grades of groundwater poten-tial zones ldquovery goodrdquo ldquogoodrdquo ldquomoderaterdquo and ldquopoorrdquooccupied the area of 10325 km2 19210 km2 59551 km2 and134914 km2 with the percentages of 461 858 2659and 6023 respectively for C50 and for CART the areaof 10505 km2 22602 km2 58453 km2 and 132440 km2with the percentages of 468 1009 2610 and 5913respectively

The study result can provide reference for groundwaterexploration and in the future work we will consider morerelating factors and survey more wells to enrich the modelThe integration of decision tree algorithms and MCDM inour study applies only to the qualitative assessment for thelack of the prior knowledge in the large area therefore theextra analysis is needed for the specific point investigationThe accuracy demonstrates that the 10-fold cross validation issuitable for training and verifying the decision tree howeverthe tested dataset is limited and more tube wells should beinvestigated to validate the stability of the model

Competing Interests

The authors declare that there is no conflict of interests in thisresearch work

10 Mathematical Problems in Engineering

Acknowledgments

This work was supported by Development Program of ChinaGroundwater Exploration Technology in the Water ShortageRegion (863 Program 2012AA062601)

References

[1] D Oikonomidis S Dimogianni N Kazakis and K VoudourisldquoA GISRemote Sensing-based methodology for groundwaterpotentiality assessment in Tirnavos area Greecerdquo Journal ofHydrology vol 525 pp 197ndash208 2015

[2] D Pinto S Shrestha M S Babel and S Ninsawat ldquoDelineationof groundwater potential zones in the Comoro watershedTimor Leste using GIS remote sensing and analytic hierarchyprocess (AHP) techniquerdquo Applied Water Science 2015

[3] O A Fashae M N Tijani A O Talabi and O I AdedejildquoDelineation of groundwater potential zones in the crystallinebasement terrain of SW-Nigeria an integrated GIS and remotesensing approachrdquoAppliedWater Science vol 4 no 1 pp 19ndash382014

[4] JMallick C K SinghHAl-Wadi et al ldquoGeospatial and geosta-tistical approach for groundwater potential zone delineationrdquoHydrological Processes vol 29 no 3 pp 395ndash418 2015

[5] S Lee K-Y Song Y Kim and I Park ldquoRegional groundwaterproductivity potential mapping using a geographic informationsystem (GIS) based artificial neural network modelrdquo Hydroge-ology Journal vol 20 no 8 pp 1511ndash1527 2012

[6] S Lee and C-W Lee ldquoApplication of decision-tree model togroundwater productivity-potential mappingrdquo Sustainabilityvol 7 no 10 pp 13416ndash13432 2015

[7] AOzdemir ldquoUsing a binary logistic regressionmethod andGISfor evaluating andmapping the groundwater spring potential inthe Sultan Mountains (Aksehir Turkey)rdquo Journal of Hydrologyvol 405 no 1-2 pp 123ndash136 2011

[8] A Ozdemir ldquoGIS-based groundwater spring potential mappingin the SultanMountains (Konya Turkey) using frequency ratioweights of evidence and logistic regression methods and theircomparisonrdquo Journal of Hydrology vol 411 no 3-4 pp 290ndash308 2011

[9] D Machiwal M K Jha and B C Mal ldquoAssessment ofgroundwater potential in a semi-arid region of india usingremote sensing GIS and MCDM techniquesrdquo Water ResourcesManagement vol 25 no 5 pp 1359ndash1386 2011

[10] S A Naghibi and H R Pourghasemi ldquoA comparative assess-ment between three machine learning models and their per-formance comparison by bivariate and multivariate statisticalmethods in groundwater potential mappingrdquo Water ResourcesManagement vol 29 no 14 pp 5217ndash5236 2015

[11] D Zheng-Dong Y Xin L Fan et al ldquoConstruction andinvestigation of groundwater remote sensing fuzzy assessmentindexrdquo Chinese Journal of Geophysics vol 56 no 11 pp 3908ndash3916 2013

[12] O FAlthuwaynee B PradhanH-J Park and JH Lee ldquoAnovelensemble decision tree-based CHi-squared Automatic Interac-tion Detection (CHAID) and multivariate logistic regressionmodels in landslide susceptibility mappingrdquo Landslides vol 11no 6 pp 1063ndash1078 2014

[13] I Park and S Lee ldquoSpatial prediction of landslide susceptibilityusing a decision tree approach a case study of the Pyeongchangarea Koreardquo International Journal of Remote Sensing vol 35 no16 pp 6089ndash6112 2014

[14] C Baker R Lawrence C Montagne and D Patten ldquoMappingwetlands and riparian areas using landsat ETM+ imagery anddecision-tree-based modelsrdquo Wetlands vol 26 no 2 pp 465ndash474 2006

[15] R Kohavi and J R Quinlan ldquoData mining tasks and methodsclassification decision-tree discoveryrdquo in Handbook of DataMining and Knowledge Discovery pp 267ndash276 Oxford Univer-sity Press New York NY USA 2002

[16] N Lin D Noe and X He ldquoTree-based methods and theirapplicationsrdquo in Springer Handbook of Engineering Statistics pp551ndash570 Springer London UK 2006

[17] I Klein U Gessner and C Kuenzer ldquoRegional land covermapping and change detection in Central Asia using MODIStime-seriesrdquo Applied Geography vol 35 no 1-2 pp 219ndash2342012

[18] J R Quinlan ldquoImproved use of continuous attributes in C4 5rdquoJournal of Artificial Intelligence Research vol 4 pp 77ndash90 1996

[19] S B Kotsiantis ldquoSupervised machine learning a review ofclassification techniquesrdquo Informatica vol 31 no 3 pp 249ndash268 2007

[20] K Polat and S Gunes ldquoA novel hybrid intelligent method basedon C45 decision tree classifier and one-against-all approachfor multi-class classification problemsrdquo Expert Systems withApplications vol 36 no 2 pp 1587ndash1592 2009

[21] M K Jha V M Chowdary and A Chowdhury ldquoGroundwaterassessment in Salboni Block West Bengal (India) using remotesensing geographical information system and multi-criteriadecision analysis techniquesrdquoHydrogeology Journal vol 18 no7 pp 1713ndash1728 2010

[22] V B Rekha A P Thomas M Suma and H Vijith ldquoAnintegration of spatial information technology for groundwa-ter potential and quality investigations in Koduvan Ar Sub-Watershed of Meenachil River Basin Kerala Indiardquo Journal ofthe Indian Society of Remote Sensing vol 39 no 1 pp 63ndash712011

[23] O Rahmati ANazari SamaniMMahdavi H R Pourghasemiand H Zeinivand ldquoGroundwater potential mapping at Kurdis-tan region of Iran using analytic hierarchy process and GISrdquoArabian Journal of Geosciences vol 8 no 9 pp 7059ndash7071 2014

[24] K R Preeja S Joseph J Thomas and H Vijith ldquoIdentificationof groundwater potential zones of a tropical river basin (KeralaIndia) using remote sensing and GIS techniquesrdquo Journal of theIndian Society of Remote Sensing vol 39 no 1 pp 83ndash94 2011

[25] K Narendra K Nageswara Rao and P Swarna Latha ldquoIntegrat-ing remote sensing and GIS for identification of groundwaterprospective zones in the Narava basin Visakhapatnam regionAndhra Pradeshrdquo Journal of the Geological Society of India vol81 no 2 pp 248ndash260 2013

[26] M Ture F Tokatli and I Kurt ldquoUsing KaplanndashMeier analysistogether with decision tree methods (CampRT CHAID QUESTC45 and ID3) in determining recurrence-free survival of breastcancer patientsrdquo Expert Systems with Applications vol 36 no 2pp 2017ndash2026 2009

[27] N A AL-Fakhry ldquoSummarizing data by using data miningtechniques a comparative by using C45 and C50 algorithmsrdquoInternational Education and Research Journal vol 2 no 4 2016

[28] R Kohavi and J R Quinlan ldquoData mining tasks and methodsclassification decision-tree discoveryrdquo in Handbook of DataMining and Knowledge Discovery pp 267ndash276 Oxford Univer-sity Press 2002

Mathematical Problems in Engineering 11

[29] U M Fayyad and K B Irani ldquoOn the handling of continuous-valued attributes in decision tree generationrdquoMachine Learningvol 8 no 1 pp 87ndash102 1992

[30] J R Quinlan ldquoBagging boosting and C4 5rdquo AAAIIAAI vol1 pp 725ndash730 1996

[31] S Pang and J C Gong ldquoC5 0 classification algorithm andapplication on individual credit evaluation of banksrdquo SystemsEngineering-Theory amp Practice vol 29 no 12 pp 94ndash104 2009

[32] M A Friedl C E Brodley and A H Strahler ldquoMaximizingland cover classification accuracies produced by decision treesat continental to global scalesrdquo IEEE Transactions on Geoscienceand Remote Sensing vol 37 no 2 pp 969ndash977 1999

[33] C J Moran and E N Bui ldquoSpatial data mining for enhancedsoil map modellingrdquo International Journal of GeographicalInformation Science vol 16 no 6 pp 533ndash549 2002

[34] X Li andA G-O Yeh ldquoDatamining of cellular automatarsquos tran-sition rulesrdquo International Journal of Geographical InformationScience vol 18 no 8 pp 723ndash744 2004

[35] P H Williams R Eyles and G Weiller ldquoPlant microRNAprediction by supervised machine learning using C50 decisiontreesrdquo Journal of Nucleic Acids vol 2012 Article ID 652979 10pages 2012

[36] N Patil R Lathi and V Chitre ldquoCustomer card classificationbased on C5 0 amp CART algorithmsrdquo International Journal ofEngineering Research and Applications vol 2 no 4 pp 164ndash1672012

[37] M M Javidi and E F Roshan ldquoSpeech emotion recognition byusing combinations of C50 neural network (NN) and supportvector machines (SVM) classification methodsrdquo Journal ofMathematics and Computer Science vol 6 pp 191ndash200 2013

[38] E Bauer and R Kohavi ldquoAn empirical comparison of vot-ing classification algorithms bagging boosting and variantsrdquoMachine Learning vol 36 no 1-2 pp 105ndash139 1999

[39] Y Shao and R S Lunetta ldquoComparison of support vectormachine neural network and CART algorithms for the land-cover classification using limited training data pointsrdquo ISPRSJournal of Photogrammetry and Remote Sensing vol 70 pp 78ndash87 2012

[40] Y Bengio andYGrandvalet ldquoNounbiased estimator of the vari-ance of k-fold cross-validationrdquo Journal of Machine LearningResearch vol 5 pp 1089ndash1105 2004

[41] C Chen B He and Z Zeng ldquoAmethod formineral prospectiv-ity mapping integrating C45 decision tree weights-of-evidenceandm-branch smoothing techniques a case study in the easternKunlunMountains Chinardquo Earth Science Informatics vol 7 no1 pp 13ndash24 2014

[42] M van der Gaag T Hoffman M Remijsen et al ldquoThe five-factor model of the positive and negative syndrome scale IIa ten-fold cross-validation of a revised modelrdquo SchizophreniaResearch vol 85 no 1ndash3 pp 280ndash287 2006

[43] M Al Saud ldquoMapping potential areas for groundwater storagein Wadi Aurnah Basin western Arabian Peninsula usingremote sensing and geographic information system techniquesrdquoHydrogeology Journal vol 18 no 6 pp 1481ndash1495 2010

[44] R G Congalton ldquoA review of assessing the accuracy of classifi-cations of remotely sensed datardquo Remote Sensing of Environ-ment vol 37 no 1 pp 35ndash46 1991

[45] F K Hoehler ldquoBias and prevalence effects on kappa viewed interms of sensitivity and specificityrdquo Journal of Clinical Epidemi-ology vol 53 no 5 pp 499ndash503 2000

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 9: Research Article Assessment of Groundwater Potential Based ...downloads.hindawi.com/journals/mpe/2016/2064575.pdf · Figure . In general, slopes control the inltration and ow ability

Mathematical Problems in Engineering 9

5

N

79∘50

0E 80

∘00E 80

∘10

0E 80

∘20

0E

33∘00

N33∘100

N

0 10 20(km)

Very goodGoodModerate

PoorSurveyed tube wells

79∘40

0E79

∘30

0E

33∘200

N

Figure 11 Groundwater potential map of the study area based onC50

0 5 10 20

N

79∘50

0E 80

∘00E 80

∘10

0E 80

∘20

0E

33∘00

N33∘100

N

(km)

Very goodGoodModerate

PoorSurveyed tube wells

79∘40

0E79

∘30

0E

33∘200

N

Figure 12 Groundwater potential map of the study area based onCART

461 of the study area for C50 algorithm and 10505 km2about 468 of the study area for CART algorithm for thewater infiltration into the underground with sufficient timeand space The study shows that low-lying areas with goodflow condition well-developed stratigraphic gap and strongconnectivity like the southwest beaches of Ban Gong Lakecan be the first target for groundwater resources The ldquogoodrdquoarea was distributed along the river with 19210 km2 covering858 for C50 algorithm and 22602 km2 about 1009 ofthe study area for CART algorithm just like a long stripThe zones mainly had a planar distribution on the bottomof the diluvia fan on the low mountain and hilly terrainand a banded distribution on the mountain watershed ofpluvial valleys situated upstream and on the peripheryof ldquovery goodrdquo area which can serve as the candidate

target for groundwater exploration The ldquomoderaterdquo zonewith 59551 km2 occupying 2659 for C50 algorithmand 58453 km2 about 2610 of the study area for CARTalgorithm spreads around the upstream tributaries Thezone is located on both sides of the river valleys and the topof the diluvia fan The ldquopoorrdquo area occupies 6023 withthe area of 134914 km2 for C50 algorithm and 132440 km2about 5913 of the study area for CART algorithm which ismostly a mountainous region with a high altitude

4 Conclusions

In this study C50 and CART algorithms were applied for thedecision tree generation to predict the groundwater potentialzone with the five relating factors and the 10-fold crossvalidation method was adopted to verify the classificationresult with the kappa coefficient From this paper we candraw some conclusions as follows(1) In the study area the five groundwater relating fac-tors lithology topology slope river density and lineamentdensity were appropriate for the groundwater potential gradeprediction and the importance based on C50 algorithm was0363 0331 0159 0117 and 003 respectively for CARTalgorithm the importance was 0355 0308 0024 0010 and0312 respectively(2) Based on the 10-fold cross validation both C50 andCART could be applied for MCDM with the categoricaland continuous variables simultaneously with the averageaccuracy of 9045 and 8509 respectively however C50algorithm showed higher classification accuracy than CARTalgorithm(3) After applying the optimal decision trees to the wholestudy area respectively the groundwater potential zone mapwas delineated and the four grades of groundwater poten-tial zones ldquovery goodrdquo ldquogoodrdquo ldquomoderaterdquo and ldquopoorrdquooccupied the area of 10325 km2 19210 km2 59551 km2 and134914 km2 with the percentages of 461 858 2659and 6023 respectively for C50 and for CART the areaof 10505 km2 22602 km2 58453 km2 and 132440 km2with the percentages of 468 1009 2610 and 5913respectively

The study result can provide reference for groundwaterexploration and in the future work we will consider morerelating factors and survey more wells to enrich the modelThe integration of decision tree algorithms and MCDM inour study applies only to the qualitative assessment for thelack of the prior knowledge in the large area therefore theextra analysis is needed for the specific point investigationThe accuracy demonstrates that the 10-fold cross validation issuitable for training and verifying the decision tree howeverthe tested dataset is limited and more tube wells should beinvestigated to validate the stability of the model

Competing Interests

The authors declare that there is no conflict of interests in thisresearch work

10 Mathematical Problems in Engineering

Acknowledgments

This work was supported by Development Program of ChinaGroundwater Exploration Technology in the Water ShortageRegion (863 Program 2012AA062601)

References

[1] D Oikonomidis S Dimogianni N Kazakis and K VoudourisldquoA GISRemote Sensing-based methodology for groundwaterpotentiality assessment in Tirnavos area Greecerdquo Journal ofHydrology vol 525 pp 197ndash208 2015

[2] D Pinto S Shrestha M S Babel and S Ninsawat ldquoDelineationof groundwater potential zones in the Comoro watershedTimor Leste using GIS remote sensing and analytic hierarchyprocess (AHP) techniquerdquo Applied Water Science 2015

[3] O A Fashae M N Tijani A O Talabi and O I AdedejildquoDelineation of groundwater potential zones in the crystallinebasement terrain of SW-Nigeria an integrated GIS and remotesensing approachrdquoAppliedWater Science vol 4 no 1 pp 19ndash382014

[4] JMallick C K SinghHAl-Wadi et al ldquoGeospatial and geosta-tistical approach for groundwater potential zone delineationrdquoHydrological Processes vol 29 no 3 pp 395ndash418 2015

[5] S Lee K-Y Song Y Kim and I Park ldquoRegional groundwaterproductivity potential mapping using a geographic informationsystem (GIS) based artificial neural network modelrdquo Hydroge-ology Journal vol 20 no 8 pp 1511ndash1527 2012

[6] S Lee and C-W Lee ldquoApplication of decision-tree model togroundwater productivity-potential mappingrdquo Sustainabilityvol 7 no 10 pp 13416ndash13432 2015

[7] AOzdemir ldquoUsing a binary logistic regressionmethod andGISfor evaluating andmapping the groundwater spring potential inthe Sultan Mountains (Aksehir Turkey)rdquo Journal of Hydrologyvol 405 no 1-2 pp 123ndash136 2011

[8] A Ozdemir ldquoGIS-based groundwater spring potential mappingin the SultanMountains (Konya Turkey) using frequency ratioweights of evidence and logistic regression methods and theircomparisonrdquo Journal of Hydrology vol 411 no 3-4 pp 290ndash308 2011

[9] D Machiwal M K Jha and B C Mal ldquoAssessment ofgroundwater potential in a semi-arid region of india usingremote sensing GIS and MCDM techniquesrdquo Water ResourcesManagement vol 25 no 5 pp 1359ndash1386 2011

[10] S A Naghibi and H R Pourghasemi ldquoA comparative assess-ment between three machine learning models and their per-formance comparison by bivariate and multivariate statisticalmethods in groundwater potential mappingrdquo Water ResourcesManagement vol 29 no 14 pp 5217ndash5236 2015

[11] D Zheng-Dong Y Xin L Fan et al ldquoConstruction andinvestigation of groundwater remote sensing fuzzy assessmentindexrdquo Chinese Journal of Geophysics vol 56 no 11 pp 3908ndash3916 2013

[12] O FAlthuwaynee B PradhanH-J Park and JH Lee ldquoAnovelensemble decision tree-based CHi-squared Automatic Interac-tion Detection (CHAID) and multivariate logistic regressionmodels in landslide susceptibility mappingrdquo Landslides vol 11no 6 pp 1063ndash1078 2014

[13] I Park and S Lee ldquoSpatial prediction of landslide susceptibilityusing a decision tree approach a case study of the Pyeongchangarea Koreardquo International Journal of Remote Sensing vol 35 no16 pp 6089ndash6112 2014

[14] C Baker R Lawrence C Montagne and D Patten ldquoMappingwetlands and riparian areas using landsat ETM+ imagery anddecision-tree-based modelsrdquo Wetlands vol 26 no 2 pp 465ndash474 2006

[15] R Kohavi and J R Quinlan ldquoData mining tasks and methodsclassification decision-tree discoveryrdquo in Handbook of DataMining and Knowledge Discovery pp 267ndash276 Oxford Univer-sity Press New York NY USA 2002

[16] N Lin D Noe and X He ldquoTree-based methods and theirapplicationsrdquo in Springer Handbook of Engineering Statistics pp551ndash570 Springer London UK 2006

[17] I Klein U Gessner and C Kuenzer ldquoRegional land covermapping and change detection in Central Asia using MODIStime-seriesrdquo Applied Geography vol 35 no 1-2 pp 219ndash2342012

[18] J R Quinlan ldquoImproved use of continuous attributes in C4 5rdquoJournal of Artificial Intelligence Research vol 4 pp 77ndash90 1996

[19] S B Kotsiantis ldquoSupervised machine learning a review ofclassification techniquesrdquo Informatica vol 31 no 3 pp 249ndash268 2007

[20] K Polat and S Gunes ldquoA novel hybrid intelligent method basedon C45 decision tree classifier and one-against-all approachfor multi-class classification problemsrdquo Expert Systems withApplications vol 36 no 2 pp 1587ndash1592 2009

[21] M K Jha V M Chowdary and A Chowdhury ldquoGroundwaterassessment in Salboni Block West Bengal (India) using remotesensing geographical information system and multi-criteriadecision analysis techniquesrdquoHydrogeology Journal vol 18 no7 pp 1713ndash1728 2010

[22] V B Rekha A P Thomas M Suma and H Vijith ldquoAnintegration of spatial information technology for groundwa-ter potential and quality investigations in Koduvan Ar Sub-Watershed of Meenachil River Basin Kerala Indiardquo Journal ofthe Indian Society of Remote Sensing vol 39 no 1 pp 63ndash712011

[23] O Rahmati ANazari SamaniMMahdavi H R Pourghasemiand H Zeinivand ldquoGroundwater potential mapping at Kurdis-tan region of Iran using analytic hierarchy process and GISrdquoArabian Journal of Geosciences vol 8 no 9 pp 7059ndash7071 2014

[24] K R Preeja S Joseph J Thomas and H Vijith ldquoIdentificationof groundwater potential zones of a tropical river basin (KeralaIndia) using remote sensing and GIS techniquesrdquo Journal of theIndian Society of Remote Sensing vol 39 no 1 pp 83ndash94 2011

[25] K Narendra K Nageswara Rao and P Swarna Latha ldquoIntegrat-ing remote sensing and GIS for identification of groundwaterprospective zones in the Narava basin Visakhapatnam regionAndhra Pradeshrdquo Journal of the Geological Society of India vol81 no 2 pp 248ndash260 2013

[26] M Ture F Tokatli and I Kurt ldquoUsing KaplanndashMeier analysistogether with decision tree methods (CampRT CHAID QUESTC45 and ID3) in determining recurrence-free survival of breastcancer patientsrdquo Expert Systems with Applications vol 36 no 2pp 2017ndash2026 2009

[27] N A AL-Fakhry ldquoSummarizing data by using data miningtechniques a comparative by using C45 and C50 algorithmsrdquoInternational Education and Research Journal vol 2 no 4 2016

[28] R Kohavi and J R Quinlan ldquoData mining tasks and methodsclassification decision-tree discoveryrdquo in Handbook of DataMining and Knowledge Discovery pp 267ndash276 Oxford Univer-sity Press 2002

Mathematical Problems in Engineering 11

[29] U M Fayyad and K B Irani ldquoOn the handling of continuous-valued attributes in decision tree generationrdquoMachine Learningvol 8 no 1 pp 87ndash102 1992

[30] J R Quinlan ldquoBagging boosting and C4 5rdquo AAAIIAAI vol1 pp 725ndash730 1996

[31] S Pang and J C Gong ldquoC5 0 classification algorithm andapplication on individual credit evaluation of banksrdquo SystemsEngineering-Theory amp Practice vol 29 no 12 pp 94ndash104 2009

[32] M A Friedl C E Brodley and A H Strahler ldquoMaximizingland cover classification accuracies produced by decision treesat continental to global scalesrdquo IEEE Transactions on Geoscienceand Remote Sensing vol 37 no 2 pp 969ndash977 1999

[33] C J Moran and E N Bui ldquoSpatial data mining for enhancedsoil map modellingrdquo International Journal of GeographicalInformation Science vol 16 no 6 pp 533ndash549 2002

[34] X Li andA G-O Yeh ldquoDatamining of cellular automatarsquos tran-sition rulesrdquo International Journal of Geographical InformationScience vol 18 no 8 pp 723ndash744 2004

[35] P H Williams R Eyles and G Weiller ldquoPlant microRNAprediction by supervised machine learning using C50 decisiontreesrdquo Journal of Nucleic Acids vol 2012 Article ID 652979 10pages 2012

[36] N Patil R Lathi and V Chitre ldquoCustomer card classificationbased on C5 0 amp CART algorithmsrdquo International Journal ofEngineering Research and Applications vol 2 no 4 pp 164ndash1672012

[37] M M Javidi and E F Roshan ldquoSpeech emotion recognition byusing combinations of C50 neural network (NN) and supportvector machines (SVM) classification methodsrdquo Journal ofMathematics and Computer Science vol 6 pp 191ndash200 2013

[38] E Bauer and R Kohavi ldquoAn empirical comparison of vot-ing classification algorithms bagging boosting and variantsrdquoMachine Learning vol 36 no 1-2 pp 105ndash139 1999

[39] Y Shao and R S Lunetta ldquoComparison of support vectormachine neural network and CART algorithms for the land-cover classification using limited training data pointsrdquo ISPRSJournal of Photogrammetry and Remote Sensing vol 70 pp 78ndash87 2012

[40] Y Bengio andYGrandvalet ldquoNounbiased estimator of the vari-ance of k-fold cross-validationrdquo Journal of Machine LearningResearch vol 5 pp 1089ndash1105 2004

[41] C Chen B He and Z Zeng ldquoAmethod formineral prospectiv-ity mapping integrating C45 decision tree weights-of-evidenceandm-branch smoothing techniques a case study in the easternKunlunMountains Chinardquo Earth Science Informatics vol 7 no1 pp 13ndash24 2014

[42] M van der Gaag T Hoffman M Remijsen et al ldquoThe five-factor model of the positive and negative syndrome scale IIa ten-fold cross-validation of a revised modelrdquo SchizophreniaResearch vol 85 no 1ndash3 pp 280ndash287 2006

[43] M Al Saud ldquoMapping potential areas for groundwater storagein Wadi Aurnah Basin western Arabian Peninsula usingremote sensing and geographic information system techniquesrdquoHydrogeology Journal vol 18 no 6 pp 1481ndash1495 2010

[44] R G Congalton ldquoA review of assessing the accuracy of classifi-cations of remotely sensed datardquo Remote Sensing of Environ-ment vol 37 no 1 pp 35ndash46 1991

[45] F K Hoehler ldquoBias and prevalence effects on kappa viewed interms of sensitivity and specificityrdquo Journal of Clinical Epidemi-ology vol 53 no 5 pp 499ndash503 2000

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 10: Research Article Assessment of Groundwater Potential Based ...downloads.hindawi.com/journals/mpe/2016/2064575.pdf · Figure . In general, slopes control the inltration and ow ability

10 Mathematical Problems in Engineering

Acknowledgments

This work was supported by Development Program of ChinaGroundwater Exploration Technology in the Water ShortageRegion (863 Program 2012AA062601)

References

[1] D Oikonomidis S Dimogianni N Kazakis and K VoudourisldquoA GISRemote Sensing-based methodology for groundwaterpotentiality assessment in Tirnavos area Greecerdquo Journal ofHydrology vol 525 pp 197ndash208 2015

[2] D Pinto S Shrestha M S Babel and S Ninsawat ldquoDelineationof groundwater potential zones in the Comoro watershedTimor Leste using GIS remote sensing and analytic hierarchyprocess (AHP) techniquerdquo Applied Water Science 2015

[3] O A Fashae M N Tijani A O Talabi and O I AdedejildquoDelineation of groundwater potential zones in the crystallinebasement terrain of SW-Nigeria an integrated GIS and remotesensing approachrdquoAppliedWater Science vol 4 no 1 pp 19ndash382014

[4] JMallick C K SinghHAl-Wadi et al ldquoGeospatial and geosta-tistical approach for groundwater potential zone delineationrdquoHydrological Processes vol 29 no 3 pp 395ndash418 2015

[5] S Lee K-Y Song Y Kim and I Park ldquoRegional groundwaterproductivity potential mapping using a geographic informationsystem (GIS) based artificial neural network modelrdquo Hydroge-ology Journal vol 20 no 8 pp 1511ndash1527 2012

[6] S Lee and C-W Lee ldquoApplication of decision-tree model togroundwater productivity-potential mappingrdquo Sustainabilityvol 7 no 10 pp 13416ndash13432 2015

[7] AOzdemir ldquoUsing a binary logistic regressionmethod andGISfor evaluating andmapping the groundwater spring potential inthe Sultan Mountains (Aksehir Turkey)rdquo Journal of Hydrologyvol 405 no 1-2 pp 123ndash136 2011

[8] A Ozdemir ldquoGIS-based groundwater spring potential mappingin the SultanMountains (Konya Turkey) using frequency ratioweights of evidence and logistic regression methods and theircomparisonrdquo Journal of Hydrology vol 411 no 3-4 pp 290ndash308 2011

[9] D Machiwal M K Jha and B C Mal ldquoAssessment ofgroundwater potential in a semi-arid region of india usingremote sensing GIS and MCDM techniquesrdquo Water ResourcesManagement vol 25 no 5 pp 1359ndash1386 2011

[10] S A Naghibi and H R Pourghasemi ldquoA comparative assess-ment between three machine learning models and their per-formance comparison by bivariate and multivariate statisticalmethods in groundwater potential mappingrdquo Water ResourcesManagement vol 29 no 14 pp 5217ndash5236 2015

[11] D Zheng-Dong Y Xin L Fan et al ldquoConstruction andinvestigation of groundwater remote sensing fuzzy assessmentindexrdquo Chinese Journal of Geophysics vol 56 no 11 pp 3908ndash3916 2013

[12] O FAlthuwaynee B PradhanH-J Park and JH Lee ldquoAnovelensemble decision tree-based CHi-squared Automatic Interac-tion Detection (CHAID) and multivariate logistic regressionmodels in landslide susceptibility mappingrdquo Landslides vol 11no 6 pp 1063ndash1078 2014

[13] I Park and S Lee ldquoSpatial prediction of landslide susceptibilityusing a decision tree approach a case study of the Pyeongchangarea Koreardquo International Journal of Remote Sensing vol 35 no16 pp 6089ndash6112 2014

[14] C Baker R Lawrence C Montagne and D Patten ldquoMappingwetlands and riparian areas using landsat ETM+ imagery anddecision-tree-based modelsrdquo Wetlands vol 26 no 2 pp 465ndash474 2006

[15] R Kohavi and J R Quinlan ldquoData mining tasks and methodsclassification decision-tree discoveryrdquo in Handbook of DataMining and Knowledge Discovery pp 267ndash276 Oxford Univer-sity Press New York NY USA 2002

[16] N Lin D Noe and X He ldquoTree-based methods and theirapplicationsrdquo in Springer Handbook of Engineering Statistics pp551ndash570 Springer London UK 2006

[17] I Klein U Gessner and C Kuenzer ldquoRegional land covermapping and change detection in Central Asia using MODIStime-seriesrdquo Applied Geography vol 35 no 1-2 pp 219ndash2342012

[18] J R Quinlan ldquoImproved use of continuous attributes in C4 5rdquoJournal of Artificial Intelligence Research vol 4 pp 77ndash90 1996

[19] S B Kotsiantis ldquoSupervised machine learning a review ofclassification techniquesrdquo Informatica vol 31 no 3 pp 249ndash268 2007

[20] K Polat and S Gunes ldquoA novel hybrid intelligent method basedon C45 decision tree classifier and one-against-all approachfor multi-class classification problemsrdquo Expert Systems withApplications vol 36 no 2 pp 1587ndash1592 2009

[21] M K Jha V M Chowdary and A Chowdhury ldquoGroundwaterassessment in Salboni Block West Bengal (India) using remotesensing geographical information system and multi-criteriadecision analysis techniquesrdquoHydrogeology Journal vol 18 no7 pp 1713ndash1728 2010

[22] V B Rekha A P Thomas M Suma and H Vijith ldquoAnintegration of spatial information technology for groundwa-ter potential and quality investigations in Koduvan Ar Sub-Watershed of Meenachil River Basin Kerala Indiardquo Journal ofthe Indian Society of Remote Sensing vol 39 no 1 pp 63ndash712011

[23] O Rahmati ANazari SamaniMMahdavi H R Pourghasemiand H Zeinivand ldquoGroundwater potential mapping at Kurdis-tan region of Iran using analytic hierarchy process and GISrdquoArabian Journal of Geosciences vol 8 no 9 pp 7059ndash7071 2014

[24] K R Preeja S Joseph J Thomas and H Vijith ldquoIdentificationof groundwater potential zones of a tropical river basin (KeralaIndia) using remote sensing and GIS techniquesrdquo Journal of theIndian Society of Remote Sensing vol 39 no 1 pp 83ndash94 2011

[25] K Narendra K Nageswara Rao and P Swarna Latha ldquoIntegrat-ing remote sensing and GIS for identification of groundwaterprospective zones in the Narava basin Visakhapatnam regionAndhra Pradeshrdquo Journal of the Geological Society of India vol81 no 2 pp 248ndash260 2013

[26] M Ture F Tokatli and I Kurt ldquoUsing KaplanndashMeier analysistogether with decision tree methods (CampRT CHAID QUESTC45 and ID3) in determining recurrence-free survival of breastcancer patientsrdquo Expert Systems with Applications vol 36 no 2pp 2017ndash2026 2009

[27] N A AL-Fakhry ldquoSummarizing data by using data miningtechniques a comparative by using C45 and C50 algorithmsrdquoInternational Education and Research Journal vol 2 no 4 2016

[28] R Kohavi and J R Quinlan ldquoData mining tasks and methodsclassification decision-tree discoveryrdquo in Handbook of DataMining and Knowledge Discovery pp 267ndash276 Oxford Univer-sity Press 2002

Mathematical Problems in Engineering 11

[29] U M Fayyad and K B Irani ldquoOn the handling of continuous-valued attributes in decision tree generationrdquoMachine Learningvol 8 no 1 pp 87ndash102 1992

[30] J R Quinlan ldquoBagging boosting and C4 5rdquo AAAIIAAI vol1 pp 725ndash730 1996

[31] S Pang and J C Gong ldquoC5 0 classification algorithm andapplication on individual credit evaluation of banksrdquo SystemsEngineering-Theory amp Practice vol 29 no 12 pp 94ndash104 2009

[32] M A Friedl C E Brodley and A H Strahler ldquoMaximizingland cover classification accuracies produced by decision treesat continental to global scalesrdquo IEEE Transactions on Geoscienceand Remote Sensing vol 37 no 2 pp 969ndash977 1999

[33] C J Moran and E N Bui ldquoSpatial data mining for enhancedsoil map modellingrdquo International Journal of GeographicalInformation Science vol 16 no 6 pp 533ndash549 2002

[34] X Li andA G-O Yeh ldquoDatamining of cellular automatarsquos tran-sition rulesrdquo International Journal of Geographical InformationScience vol 18 no 8 pp 723ndash744 2004

[35] P H Williams R Eyles and G Weiller ldquoPlant microRNAprediction by supervised machine learning using C50 decisiontreesrdquo Journal of Nucleic Acids vol 2012 Article ID 652979 10pages 2012

[36] N Patil R Lathi and V Chitre ldquoCustomer card classificationbased on C5 0 amp CART algorithmsrdquo International Journal ofEngineering Research and Applications vol 2 no 4 pp 164ndash1672012

[37] M M Javidi and E F Roshan ldquoSpeech emotion recognition byusing combinations of C50 neural network (NN) and supportvector machines (SVM) classification methodsrdquo Journal ofMathematics and Computer Science vol 6 pp 191ndash200 2013

[38] E Bauer and R Kohavi ldquoAn empirical comparison of vot-ing classification algorithms bagging boosting and variantsrdquoMachine Learning vol 36 no 1-2 pp 105ndash139 1999

[39] Y Shao and R S Lunetta ldquoComparison of support vectormachine neural network and CART algorithms for the land-cover classification using limited training data pointsrdquo ISPRSJournal of Photogrammetry and Remote Sensing vol 70 pp 78ndash87 2012

[40] Y Bengio andYGrandvalet ldquoNounbiased estimator of the vari-ance of k-fold cross-validationrdquo Journal of Machine LearningResearch vol 5 pp 1089ndash1105 2004

[41] C Chen B He and Z Zeng ldquoAmethod formineral prospectiv-ity mapping integrating C45 decision tree weights-of-evidenceandm-branch smoothing techniques a case study in the easternKunlunMountains Chinardquo Earth Science Informatics vol 7 no1 pp 13ndash24 2014

[42] M van der Gaag T Hoffman M Remijsen et al ldquoThe five-factor model of the positive and negative syndrome scale IIa ten-fold cross-validation of a revised modelrdquo SchizophreniaResearch vol 85 no 1ndash3 pp 280ndash287 2006

[43] M Al Saud ldquoMapping potential areas for groundwater storagein Wadi Aurnah Basin western Arabian Peninsula usingremote sensing and geographic information system techniquesrdquoHydrogeology Journal vol 18 no 6 pp 1481ndash1495 2010

[44] R G Congalton ldquoA review of assessing the accuracy of classifi-cations of remotely sensed datardquo Remote Sensing of Environ-ment vol 37 no 1 pp 35ndash46 1991

[45] F K Hoehler ldquoBias and prevalence effects on kappa viewed interms of sensitivity and specificityrdquo Journal of Clinical Epidemi-ology vol 53 no 5 pp 499ndash503 2000

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 11: Research Article Assessment of Groundwater Potential Based ...downloads.hindawi.com/journals/mpe/2016/2064575.pdf · Figure . In general, slopes control the inltration and ow ability

Mathematical Problems in Engineering 11

[29] U M Fayyad and K B Irani ldquoOn the handling of continuous-valued attributes in decision tree generationrdquoMachine Learningvol 8 no 1 pp 87ndash102 1992

[30] J R Quinlan ldquoBagging boosting and C4 5rdquo AAAIIAAI vol1 pp 725ndash730 1996

[31] S Pang and J C Gong ldquoC5 0 classification algorithm andapplication on individual credit evaluation of banksrdquo SystemsEngineering-Theory amp Practice vol 29 no 12 pp 94ndash104 2009

[32] M A Friedl C E Brodley and A H Strahler ldquoMaximizingland cover classification accuracies produced by decision treesat continental to global scalesrdquo IEEE Transactions on Geoscienceand Remote Sensing vol 37 no 2 pp 969ndash977 1999

[33] C J Moran and E N Bui ldquoSpatial data mining for enhancedsoil map modellingrdquo International Journal of GeographicalInformation Science vol 16 no 6 pp 533ndash549 2002

[34] X Li andA G-O Yeh ldquoDatamining of cellular automatarsquos tran-sition rulesrdquo International Journal of Geographical InformationScience vol 18 no 8 pp 723ndash744 2004

[35] P H Williams R Eyles and G Weiller ldquoPlant microRNAprediction by supervised machine learning using C50 decisiontreesrdquo Journal of Nucleic Acids vol 2012 Article ID 652979 10pages 2012

[36] N Patil R Lathi and V Chitre ldquoCustomer card classificationbased on C5 0 amp CART algorithmsrdquo International Journal ofEngineering Research and Applications vol 2 no 4 pp 164ndash1672012

[37] M M Javidi and E F Roshan ldquoSpeech emotion recognition byusing combinations of C50 neural network (NN) and supportvector machines (SVM) classification methodsrdquo Journal ofMathematics and Computer Science vol 6 pp 191ndash200 2013

[38] E Bauer and R Kohavi ldquoAn empirical comparison of vot-ing classification algorithms bagging boosting and variantsrdquoMachine Learning vol 36 no 1-2 pp 105ndash139 1999

[39] Y Shao and R S Lunetta ldquoComparison of support vectormachine neural network and CART algorithms for the land-cover classification using limited training data pointsrdquo ISPRSJournal of Photogrammetry and Remote Sensing vol 70 pp 78ndash87 2012

[40] Y Bengio andYGrandvalet ldquoNounbiased estimator of the vari-ance of k-fold cross-validationrdquo Journal of Machine LearningResearch vol 5 pp 1089ndash1105 2004

[41] C Chen B He and Z Zeng ldquoAmethod formineral prospectiv-ity mapping integrating C45 decision tree weights-of-evidenceandm-branch smoothing techniques a case study in the easternKunlunMountains Chinardquo Earth Science Informatics vol 7 no1 pp 13ndash24 2014

[42] M van der Gaag T Hoffman M Remijsen et al ldquoThe five-factor model of the positive and negative syndrome scale IIa ten-fold cross-validation of a revised modelrdquo SchizophreniaResearch vol 85 no 1ndash3 pp 280ndash287 2006

[43] M Al Saud ldquoMapping potential areas for groundwater storagein Wadi Aurnah Basin western Arabian Peninsula usingremote sensing and geographic information system techniquesrdquoHydrogeology Journal vol 18 no 6 pp 1481ndash1495 2010

[44] R G Congalton ldquoA review of assessing the accuracy of classifi-cations of remotely sensed datardquo Remote Sensing of Environ-ment vol 37 no 1 pp 35ndash46 1991

[45] F K Hoehler ldquoBias and prevalence effects on kappa viewed interms of sensitivity and specificityrdquo Journal of Clinical Epidemi-ology vol 53 no 5 pp 499ndash503 2000

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 12: Research Article Assessment of Groundwater Potential Based ...downloads.hindawi.com/journals/mpe/2016/2064575.pdf · Figure . In general, slopes control the inltration and ow ability

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of