Research Article

SVM-RFE Based Feature Selection and Taguchi Parameters Optimization for Multiclass SVM Classifier

Mei-Ling Huang,1 Yung-Hsiang Hung,1 W. M. Lee,2 R. K. Li,2 and Bo-Ru Jiang1

1 Department of Industrial Engineering and Management, National Chin-Yi University of Technology, No. 57, Sec. 2, Zhong-Shan Road, Taiping District, Taichung 41170, Taiwan
2 Department of Industrial Engineering & Management, National Chiao-Tung University, No. 1001, Ta-Hsueh Road, Hsinchu 300, Taiwan

Correspondence should be addressed to Mei-Ling Huang; [email protected]

Received 20 June 2014; Revised 5 August 2014; Accepted 5 August 2014; Published 10 September 2014

Academic Editor: Shifei Ding

Copyright © 2014 Mei-Ling Huang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Hindawi Publishing Corporation, The Scientific World Journal, Volume 2014, Article ID 795624, 10 pages. http://dx.doi.org/10.1155/2014/795624

Recently, the support vector machine (SVM) has shown excellent performance on classification and prediction and is widely used in disease diagnosis and medical assistance. However, SVM only functions well on two-group classification problems. This study combines feature selection and SVM recursive feature elimination (SVM-RFE) to investigate the classification accuracy of multiclass problems for the Dermatology and Zoo databases. The Dermatology dataset contains 33 feature variables, 1 class variable, and 366 testing instances; the Zoo dataset contains 16 feature variables, 1 class variable, and 101 testing instances. The feature variables in the two datasets were sorted in descending order by explanatory power, and different feature sets were selected by SVM-RFE to explore classification accuracy. Meanwhile, the Taguchi method was jointly combined with the SVM classifier in order to optimize the parameters C and γ and to increase the classification accuracy for multiclass classification. The experimental results show that the classification accuracy can be more than 95% after SVM-RFE feature selection and Taguchi parameter optimization for the Dermatology and Zoo databases.

1. Introduction

The support vector machine (SVM) is one of the important tools of machine learning. The principle of SVM operation is as follows: a given group of classified data is trained by the algorithm to obtain a group of classification models, which can help predict the category of new data [1, 2]. It is widely applied in various fields, such as disease or medical imaging diagnosis [3-5], financial crisis prediction [6], biomedical engineering, and bioinformatics classification [7, 8]. Although SVM is an efficient machine learning method, its classification accuracy requires further improvement in the case of multidimensional space classification and datasets with feature interaction variables [9]. For such problems, feature selection can generally be applied to reduce data structure complexity in order to identify important feature variables as a new set of testing instances [10]. Through feature selection, inappropriate, redundant, and noisy data can be filtered out to reduce the computational time of classification and improve classification accuracy. Common methods of feature selection include backward feature selection (BFS), forward feature selection (FFS), and ranker [11]. Another feature selection method, support vector machine recursive feature elimination (SVM-RFE), can filter relevant features and remove relatively insignificant feature variables in order to achieve higher classification performance [12]. The research findings of Harikrishna et al. have shown that computation is simpler and classification accuracy improves more effectively on datasets after SVM-RFE selection [13-15]. As SVM basically applies to two-class data [16], many scholars have explored the expansion of SVM to multiclass data [17-19]. However, classification accuracy is not ideal, and there are many studies on choosing kernel parameters for SVM [20-22]. Therefore, this study applies SVM-RFE to sort the 33 variables of the Dermatology dataset and the 16 variables of the Zoo dataset by explanatory power in descending order and selects different feature sets before using the Taguchi



Table 1: Feature information for Dermatology and Zoo databases.

                            Dermatology           Zoo
Dataset characteristics     Multivariate          Multivariate
Attribute characteristics   Categorical, integer  Categorical, integer
Associated tasks            Classification        Classification
Area                        Life                  Life
Number of instances         366                   101
Number of attributes        33                    16
Number of classes           6                     7

parameter design to optimize the multiclass SVM parameters C and γ to improve the classification accuracy of the SVM multiclass classifier.

This study is organized as follows. Section 2 describes the research data. Section 3 introduces the methods used throughout this paper. Section 4 discusses the experiment and results. Finally, Section 5 presents our conclusions.

2. Study Population

This study used the Dermatology dataset from the University of California at Irvine (UCI) and the Zoo database from its College of Information Technology and Computers to conduct experimental tests, parameter optimization, and classification accuracy performance evaluation using the SVM classifier.

In medicine, dermatological diseases are diseases of the skin that have a serious impact on health. As frequently occurring types of diseases, there are more than 1000 kinds of dermatological diseases, such as psoriasis, seborrheic dermatitis, lichen planus, pityriasis, chronic dermatitis, and pityriasis rubra pilaris. The Dermatology dataset was established by Nilsel in 1998 and contains 33 feature variables and 1 class variable (6 classes).

The dermatology feature variables and data summary are shown in Table 1. The Dermatology dataset has eight omissions; after removing them, we retained 358 instances for this study. The instances of the various categories are: psoriasis (Class 1), 111 instances; seborrheic dermatitis (Class 2), 71 instances; lichen planus (Class 3), 60 instances; pityriasis (Class 4), 48 instances; chronic dermatitis (Class 5), 48 instances; and pityriasis rubra pilaris (Class 6), 20 instances. The Zoo dataset contains 17 Boolean-valued attributes and 101 instances. The instances of the various categories are as follows: bear and so forth (Class 1), 41 instances; chicken and so forth (Class 2), 20 instances; seasnake and so forth (Class 3), 5 instances; bass and so forth (Class 4), 13 instances; (Class 5), 4 instances; frog and so forth (Class 6), 8 instances; and honeybee and so forth (Class 7), 10 instances.

Before feature selection, we conducted feature attribute coding. The feature attribute coding of the Dermatology and Zoo databases is shown in Tables 2 and 3.

Table 2: Attributes of Dermatology database.

ID   Attribute
V1   Erythema
V2   Scaling
V3   Definite borders
V4   Itching
V5   Koebner phenomenon
V6   Polygonal papules
V7   Follicular papules
V8   Oral mucosal involvement
V9   Knee and elbow involvement
V10  Scalp involvement
V11  Family history
V12  Melanin incontinence
V13  Eosinophils in the infiltrate
V14  PNL infiltrate
V15  Fibrosis of the papillary dermis
V16  Exocytosis
V17  Acanthosis
V18  Hyperkeratosis
V19  Parakeratosis
V20  Clubbing of the rete ridges
V21  Elongation of the rete ridges
V22  Thinning of the suprapapillary epidermis
V23  Spongiform pustule
V24  Munro microabscess
V25  Focal hypergranulosis
V26  Disappearance of the granular layer
V27  Vacuolisation and damage of basal layer
V28  Spongiosis
V29  Saw-tooth appearance of retes
V30  Follicular horn plug
V31  Perifollicular parakeratosis
V32  Inflammatory mononuclear infiltrate
V33  Band-like infiltrate
V34  Age

3. Methodology

3.1. Research Framework. The research framework of the study is shown in Figure 1. The steps are as follows.

(1) Database preprocessing: delete the omissions and code the feature variables for the Dermatology and Zoo datasets. This leaves 358 and 101 instances, respectively, for the Dermatology and Zoo databases in the further experiment.

(2) Feature selection: apply SVM-RFE ranking according to the order of importance of the features, and determine the feature set that contributes to the classification.

(3) Parameter optimization: apply Taguchi parameter design to the optimization of the parameters (C and γ) of a multiclass SVM classifier in order to enhance the classification accuracy for the multiclass dataset.


Table 3: Attributes of Zoo database.

ID   Attribute
V1   Hair
V2   Feathers
V3   Eggs
V4   Milk
V5   Airborne
V6   Aquatic
V7   Predator
V8   Toothed
V9   Backbone
V10  Breathes
V11  Venomous
V12  Fins
V13  Legs
V14  Tail
V15  Domestic
V16  Cat-size

[Figure 1: Research framework. The UCI Dermatology and Zoo datasets are preprocessed (358 and 101 instances remain, respectively) and passed to SVM-RFE feature selection; the LS-SVM parameters C and γ are then set either by Taguchi parameter design (Method 1) or by Bayesian initial parameters (Method 2), and LS-SVM classifier performance is evaluated.]

3.2. Feature Selection. Feature selection implies not only cardinality reduction, which means imposing an arbitrary or predefined cutoff on the number of attributes that can be considered when building a model, but also the choice of attributes, meaning that either the analyst or the modeling tool actively selects or discards attributes based on their usefulness for analysis. The feature selection method is a search strategy that selects or removes some features of the original feature set to generate various types of subsets and obtain the optimum feature subset. The subsets selected each time are compared and analyzed according to the formulated assessment function. If the subset selected in step m + 1 is better than the subset selected in step m, the subset selected in step m + 1 is taken as the optimum subset.

3.3. Linear Support Vector Machine (Linear SVM). SVM is developed from statistical learning theory based on structural risk minimization (SRM). It can be applied to classification and nonlinear regression [6]. Generally speaking, SVM can be divided into linear SVM and nonlinear SVM, described as follows.

(1) Linear SVM. The linear SVM encodes the training data of the two types by labeling Class 1 as "+1" and Class 2 as "−1"; the training set is written as {(x_i, y_i)}_{i=1}^{l}, x_i ∈ R^m, y_i ∈ {−1, +1}, and the hyperplane is represented as follows:

    w · x + b = 0,    (1)

where w denotes the weight vector, x denotes the input dataset, and b denotes a constant acting as a bias (displacement) of the hyperplane. The purpose of the bias is to ensure that the hyperplane is in the correct position after horizontal movement; therefore, the bias is determined after training w. The parameters of the hyperplane are w and b. When SVM is applied to classification, the hyperplane is regarded as a decision function:

    f(x) = sign(w · x + b).    (2)

Generally speaking, the purpose of SVM is to obtain the hyperplane with the maximum margin and improve the distinguishing function between the two categories of the dataset. The process of optimizing the distinguishing function of the hyperplane can be regarded as a quadratic programming problem:

    minimize  L_p = (1/2)‖w‖²
    subject to  y_i(x_i · w + b) − 1 ≥ 0,  i = 1, ..., l.    (3)

The original minimization problem is converted into a maximization problem by using the Lagrange theory:

    max  L_D(α) = Σ_{i=1}^{l} α_i − (1/2) Σ_{i=1}^{l} Σ_{j=1}^{l} α_i α_j y_i y_j (x_i · x_j)
    subject to  Σ_{i=1}^{l} α_i y_i = 0,  i = 1, ..., l,
                α_i ≥ 0,  i = 1, ..., l.    (4)

Finally, the linear divisive decision-making function is

    f(x) = sign( Σ_{i=1}^{n} y_i α*_i (x · x_i) + b* ).    (5)
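As an illustration, the decision rule of (5) can be sketched in a few lines of Python; the support vectors, multipliers α*, and bias b* are hypothetical inputs that a trained SVM would supply.

```python
# Sketch of the linear decision function (5): the class of x is the sign of
# sum_i y_i * alpha_i * (x . x_i) + b, computed over the support vectors.
def svm_decision(x, support_vectors, alphas, labels, b):
    dot = lambda u, v: sum(ui * vi for ui, vi in zip(u, v))
    s = sum(yi * ai * dot(x, xi)
            for xi, ai, yi in zip(support_vectors, alphas, labels)) + b
    return 1 if s > 0 else -1
```

With support vectors [1.0] and [−1.0], multipliers 1.0, labels +1/−1, and b = 0, the point [2.0] falls on the "+1" side and [−2.0] on the "−1" side.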


If f(x) > 0, the sample is in the same category as samples marked with "+1"; otherwise it is in the category of samples marked with "−1". When the training data include noise, the linear hyperplane cannot accurately distinguish the data points. By introducing slack variables ξ_i in the constraint, the original (3) can be modified into the following:

    minimize  (1/2)‖w‖² + C Σ_{i=1}^{l} ξ_i
    subject to  y_i(x_i · w + b) − 1 + ξ_i ≥ 0,  i = 1, ..., l,
                ξ_i ≥ 0,  i = 1, ..., l,    (6)

where ξ_i is the distance between the boundary and the classification point, and the penalty parameter C represents the cost of the classification error of training data during the learning process, as determined by the user. When C is greater, the margin will be smaller, indicating that the fault tolerance rate will be smaller when a fault occurs; conversely, when C is smaller, the fault tolerance rate will be greater. When C → ∞, the linear inseparable problem degenerates into a linear separable problem. In this case, the solution of the above optimization problem can be applied to obtain the parameters and optimum solution of the target function using the Lagrangian coefficients; thus the linear inseparable dual optimization problem is as follows:

    max  L_D(α) = Σ_{i=1}^{l} α_i − (1/2) Σ_{i=1}^{l} Σ_{j=1}^{l} α_i α_j y_i y_j (x_i · x_j)
    subject to  Σ_{i=1}^{l} α_i y_i = 0,  i = 1, ..., l,
                0 ≤ α_i ≤ C,  i = 1, ..., l.    (7)

Finally, the linear decision-making function is

    f(x) = sign( Σ_{i=1}^{n} y_i α*_i (x · x_i) + b* ).    (8)

(2) Nonlinear Support Vector Machine (Nonlinear SVM). When the input training samples cannot be separated using linear SVM, we can use a conversion function φ to convert the original 2-dimensional data into a new high-dimensional feature space in which the problem becomes linearly separable. SVM can efficiently perform nonlinear classification using what is called the kernel trick, implicitly mapping its inputs into high-dimensional feature spaces. Presently, many different core functions have been proposed, and using different core functions for different data features can effectively improve the computational efficiency of SVM. The relatively common core functions include the following four types:

(1) linear kernel function:

    K(x_i, y_j) = x_i^t · y_j;    (9)

(2) polynomial kernel function:

    K(x_i, y_j) = (γ x_i^t x_j + r)^m,  γ > 0;    (10)

(3) radial basis kernel function:

    K(x_i, y_j) = exp(−‖x_i − y_j‖² / 2σ²),  γ > 0;    (11)

(4) sigmoid kernel function:

    K(x_i, y_j) = tanh(γ x_i^t · y_j + r),    (12)

where the emissive (radial basis) core function is more frequently applied to high-dimensional and nonlinear problems, and the parameters to be set are only γ and C, which slightly reduces SVM complexity and improves calculation efficiency; therefore this study selects the emissive core function.
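As a small illustration, the radial basis kernel (11) can be computed directly; sigma is the kernel width parameter, set here to a hypothetical default of 1.0.

```python
import math

# Radial basis kernel of (11): K(x, z) = exp(-||x - z||^2 / (2 * sigma^2)).
# Identical points give K = 1; the value decays toward 0 as points move apart.
def rbf_kernel(x, z, sigma=1.0):
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

print(rbf_kernel([1.0, 0.0], [1.0, 0.0]))  # identical points -> 1.0
```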

3.4. Support Vector Machine Recursive Feature Elimination (SVM-RFE). A feature selection process can be used to remove terms in the training dataset that are statistically uncorrelated with the class labels, thus improving both efficiency and accuracy. Pal and Maiti (2010) provided a supervised dimensionality reduction method in which the feature selection problem is modeled as a mixed 0-1 integer program [23]. The multiclass Mahalanobis-Taguchi system (MMTS) was developed for simultaneous multiclass classification and feature selection; the important features are identified using orthogonal arrays and the signal-to-noise ratio and are then used to construct a reduced model measurement scale [24]. SVM-RFE is an SVM-based feature selection algorithm created by Guyon et al. [12]. Using SVM-RFE, Guyon et al. selected key and important feature sets; in addition to reducing classification computational time, it can improve the classification accuracy rate [12]. In recent years, many scholars have improved the classification effect in medical diagnosis by taking advantage of this method [22, 25].

3.5. Multiclass SVM Classifier. SVM's basic classification principle is mainly based on dual categories. Presently, there are three main methods to process multiclass problems: one-against-all, one-against-one, and the directed acyclic graph [26], described as follows.

(1) One-Against-All (OAA). Proposed by Bottou et al. (1994), the one-versus-rest approach converts the classification problem of k categories into k dual-category problems [27]. Scholars have also proposed subsequent effective classification methods [28]. The training process must train k dual-category SVMs: when training the i-th classifier, data in the i-th category are regarded as "+1" and the data of the remaining categories are regarded as "−1" to complete the training of the k dual-category SVMs. During the testing process, each testing instance is tested by the k trained dual-category SVMs, and the classification results are determined by comparing the outputs of the SVMs. For an instance x of unknown category, the decision function arg max_{i=1,...,k} ((w_i)^t φ(x) + b_i) can be applied to generate k decision values, and x is assigned to the category with the maximum decision value.
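The OAA decision rule can be sketched as follows; `score` is a hypothetical stand-in for the i-th classifier's decision value (w_i)^t φ(x) + b_i.

```python
# One-against-all prediction: evaluate all k binary decision values and
# return the class index with the maximum value (the arg max rule).
def oaa_predict(x, k, score):
    return max(range(k), key=lambda i: score(i, x))
```

For example, with a toy score function that peaks at class 2, `oaa_predict` returns 2.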

(2) One-Against-One (OAO). When there are k categories, each pair of categories produces an SVM; thus it produces k(k − 1)/2 classifiers and determines the category of a sample by a voting strategy [28]. For example, if there are three categories (1, 2, and 3) and a sample to be classified with an assumed category of 2, the sample will be input into three SVMs. Each SVM determines the category of the sample using the decision-making function sign((w^{ij})^t Φ(x) + b^{ij}) and adds 1 to the votes of that category. Finally, the category with the most votes is the category of the sample.
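The OAO voting strategy can be sketched as follows; `decide` is a hypothetical stand-in for the pairwise decision function sign((w^{ij})^t Φ(x) + b^{ij}).

```python
from itertools import combinations

# One-against-one voting: each pairwise classifier votes for one of its two
# classes; the class with the most votes is the prediction.
def oao_predict(x, classes, decide):
    votes = {c: 0 for c in classes}
    for ci, cj in combinations(classes, 2):
        winner = ci if decide(ci, cj, x) >= 0 else cj   # +1 votes ci, -1 votes cj
        votes[winner] += 1
    return max(votes, key=votes.get)
```

With three classes and pairwise deciders that always favor class 2, class 2 collects two votes and wins, matching the worked example above.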

(3) Directed Acyclic Graph (DAG). Similar to the OAO method, DAG decomposes the classification problem of k categories into k(k − 1)/2 dual-category classification problems [18]. During the training process it selects any two categories from the k categories as a group, which are combined into a dual-category classification SVM; during the testing process it establishes a dual-category acyclic graph, and the data of an unknown category are tested from the root node. In a problem with k classes, a rooted binary DAG has k leaves labeled by the classes, where each of the k(k − 1)/2 internal nodes is labeled with an element of a Boolean function [19].

4. Experiment and Results

4.1. Feature Selection Based on SVM-RFE. The main purpose of SVM-RFE is to compute the ranking weights for all features and sort the features according to the weight vectors as the classification basis. SVM-RFE is an iterative process of backward removal of features. Its steps for feature set selection are as follows:

(1) use the current dataset to train the classifier;
(2) compute the ranking weights for all features;
(3) delete the feature with the smallest weight.

The iteration process is implemented until there is only one feature remaining in the dataset; the implementation result provides a list of features in the order of weight. The algorithm removes the feature with the smallest ranking weight while retaining the feature variables of significant impact. Finally, the feature variables are listed in descending order of explanatory difference degree. SVM-RFE's selection of feature sets can be mainly divided into three steps, namely, (1) the input of the datasets to be classified, (2) calculation of the weight of each feature, and (3) the deletion of the feature of minimum weight to obtain the ranking of features. The computational steps are as follows [12].

(1) Input:
    Training samples X_0 = [x_1, x_2, ..., x_m]^T.
    Category labels y = [y_1, y_2, ..., y_m]^T.
    Current feature set s = [1, 2, ..., n].
    Feature sorted list r = [].

(2) Feature sorting. Repeat the following process until s = []:
    obtain the new training sample matrix according to the remaining features, X = X_0(:, s);
    train the classifier, α = SVM-train(X, y);
    calculate the weight, w = Σ_k α_k y_k x_k;
    calculate the sorting criterion, c_i = (w_i)²;
    find the feature with the minimum weight, f = arg min(c);
    update the feature sorted list, r = [s(f), r];
    remove the feature with the minimum weight, s = s(1 : f − 1, f + 1 : length(s)).

(3) Output: the feature sorted list r. In each loop, the feature with minimum (w_i)² is removed. The SVM then retrains the remaining features to obtain the new feature sorting. SVM-RFE repeatedly implements the process until a complete feature sorted list is obtained. By training SVM using the feature subsets of the sorted list and evaluating the subsets using the SVM prediction accuracy, we can obtain the optimum feature subsets.
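The recursion above can be sketched in pure Python. A genuine SVM trainer is beyond a short example, so a simple perceptron-style linear trainer stands in for SVM-train; the ranking logic itself (criterion (w_i)², remove the minimum, prepend to r) follows the listed steps.

```python
# Sketch of the SVM-RFE loop. `train_linear` is a perceptron stand-in for
# SVM-train; a real SVM would supply w = sum_k alpha_k * y_k * x_k instead.
def train_linear(X, y, epochs=50, lr=0.1):
    """Linear weights for labels y in {-1, +1} (perceptron updates)."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * sum(wj * xj for wj, xj in zip(w, xi)) <= 0:  # misclassified
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
    return w

def svm_rfe(X, y):
    """Return feature indices ranked from most to least important."""
    s = list(range(len(X[0])))                    # current feature set s
    r = []                                        # sorted list r (best first)
    while s:
        Xs = [[row[j] for j in s] for row in X]   # keep only remaining features
        w = train_linear(Xs, y)
        c = [wj * wj for wj in w]                 # ranking criterion c_i = (w_i)^2
        f = c.index(min(c))                       # feature with minimum weight
        r.insert(0, s.pop(f))                     # r = [s(f), r]
    return r
```

On a toy set where only the first feature predicts the label, the ranking places feature 0 first.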

4.2. SVM Parameters Optimization Based on Taguchi Method. The Taguchi method arises from the engineering technological perspective, and its major tools are the orthogonal array and the S/N ratio, where the S/N ratio and the loss function are closely related; a higher S/N ratio indicates fewer losses [29]. Parameter selection is an important step in the construction of a classification model using SVM, as differences in parameter settings affect classification model stability and accuracy. Hsu and Yu (2012) combined the Taguchi method and the Staelin method to optimize an SVM-based e-mail spam filtering model and promote spam filtering accuracy [30]. Taguchi parameter design has many advantages. For one, the effect of robustness on quality is great: robustness reduces variation in parts by reducing the effects of uncontrollable variation, and more consistent parts are equal to better quality. Also, the Taguchi method allows for the analysis of many different parameters without a prohibitively high amount of experimentation, providing the design engineer with a systematic and efficient method for determining near-optimum design parameters for performance and cost. Therefore, by using Taguchi quality parameter design, this study conducts the optimization design of parameters C and γ to enhance the accuracy of the SVM classifier on the diagnosis of multiclass diseases.

This study uses the multiclass classification accuracy as the quality attribute of the Taguchi parameter design [21]. In general, higher classification accuracy means the classification model is better; that is, the quality attribute is larger-the-better (LTB), and S/N_LTB is defined as

    S/N_LTB = −10 log10(MSD) = −10 log10 [ (1/n) Σ_{i=1}^{n} (1/y_i²) ].    (13)


Table 4: Classification accuracy comparison (%).

Dermatology database:
  C \ γ     1       3       10      12
  1       52.57   95.18   94.08   94.22
  10      52.57   96.04   97.94   97.93
  50      52.57   96.31   96.86   96.58
  100     52.57   96.31   96.32   96.03

Zoo database:
  C \ γ    0.1      5       10      12
  1       71.18   78.09   62.36   40.64
  10      71.18   96.00   91.00   85.09
  50      71.18   96.09   96.00   96.00
  100     71.18   96.09   96.09   96.00

Table 5: Factor level configuration of LS-SVM parameter design.

Dermatology database:
  Control factor   Level 1   Level 2   Level 3
  A (C)            10        50        100
  B (γ)            2.4       5         10

Zoo database:
  Control factor   Level 1   Level 2   Level 3
  A (C)            5         10        50
  B (γ)            0.08      4         11

4.3. Evaluation of Classification Accuracy. Cross-validation measurement divides all the samples into a training set and a testing set. The training set is the learning data for the algorithm to establish the classification rules; the samples of the testing set are used to measure the performance of the classification rules. All the samples are randomly divided into k folds by category, and the folds are mutually exclusive. Each fold of the data is used once as the testing set, and the remaining k − 1 folds are used as the training set. The step is repeated k times, and each testing set validates the classification rules learnt from the corresponding training set to obtain an accuracy rate. The average of the accuracy rates of all k testing sets is used as the final evaluation result. The method is known as k-fold cross-validation.
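The procedure can be sketched as follows; `train` and `accuracy` are hypothetical stand-ins for the classifier's fit and scoring steps, and a simple round-robin split replaces the stratified random split described above.

```python
# k-fold cross-validation: each fold is the testing set exactly once, the
# remaining k-1 folds train the model, and the k accuracy rates are averaged.
def k_fold_accuracy(samples, labels, k, train, accuracy):
    n = len(samples)
    folds = [set(range(i, n, k)) for i in range(k)]      # mutually exclusive folds
    scores = []
    for test_idx in folds:
        tr = [i for i in range(n) if i not in test_idx]
        model = train([samples[i] for i in tr], [labels[i] for i in tr])
        scores.append(accuracy(model,
                               [samples[i] for i in test_idx],
                               [labels[i] for i in test_idx]))
    return sum(scores) / k                               # final evaluation result
```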

4.4. Results and Discussion. The ranking order of all features for the Dermatology and Zoo databases using SVM-RFE is summarized as follows: Dermatology = {V1, V16, V32, V28, V19, V3, V17, V2, V15, V21, V26, V13, V14, V5, V18, V4, V23, V11, V8, V12, V27, V24, V6, V25, V30, V29, V10, V31, V22, V20, V33, V7, V9} and Zoo = {V13, V9, V14, V10, V16, V4, V8, V1, V11, V2, V12, V5, V6, V3, V15, V7}. According to the suggestions of scholars, the classification error rate of OAO is relatively lower when the number of testing instances is below 1000, and multiclass SVM parameter settings can affect the multiclass SVM's classification accuracy. Arenas-García and Pérez-Cruz applied SVM parameter settings to the multiclass Zoo dataset [31]. They carried out simulations using Gaussian kernels for all possible combinations of C and gamma from C = [1, 3, 10, 30, 100] and gamma = sqrt(0.25d), sqrt(0.5d), sqrt(d), sqrt(2d), and sqrt(4d), with d being the dimension of the input data. In this study, we executed wide ranges of parameter settings for the Dermatology and Zoo databases. Finally, the parameter settings are suggested as Dermatology (C, γ): C = 1, 10, 50, 100 and γ = 1, 3, 10, 12; Zoo (C, γ): C = 1, 10, 50, 100 and γ = 0.1, 5, 10, 12. The testing accuracies are shown in Table 4.

As shown in Table 4, regarding parameter C, when C = 10 and γ = 5, 10, 12, the accuracy of the experiment is higher than that of the experimental combination of C = 1 and γ = 5, 10, 12; moreover, regarding parameter γ, the experimental accuracy rate in the case of γ = 5 and C = 1, 10, 50, 100 is higher than that of the experimental combination of γ = 0.1 and C = 1, 10, 50, 100. The near-optimal value of C or γ may not be the same for different databases, and finding the appropriate parameter settings is important for the performance of classifiers. Practically, it is impossible to simulate every possible combination of parameter settings, and that is the reason why the Taguchi methodology is applied to reduce the experimental combinations for SVM. The experimental steps used in this study first referred to the related study (e.g., C = [1, 3, 10, 30, 100] [31]), then set a possible range for both databases (C = 1~100, γ = 1~12). After that, we slightly adjusted the ranges to see whether there would be better results in the Taguchi quality engineering parameter optimization for each database. According to our experimental results, the final parameter settings C and γ range over 10~100 and 2.4~10, respectively, for the Dermatology database; the parameter settings C and γ range over 5~50 and 0.08~11, respectively, for the Zoo database. Within these ranges of the Dermatology and Zoo database parameters C and γ, we select three parameter levels and two control factors, A and B, to represent parameters C and γ, respectively. The Taguchi orthogonal array experiment selects L9(3²), and the factor level configuration is as illustrated in Table 5.

After data preprocessing, the Dermatology and Zoo databases include 358 and 101 testing instances, respectively. Each experiment of the orthogonal array is repeated five times (n = 5); the experimental combinations and observations are summarized in Tables 6 and 7. According to (13), we can calculate the S/N ratio for Taguchi experimental combination 1 as

S/N_LTB = −10 log10 [(1/5) × (1/0.9631² + 1/0.9701² + 1/0.9697² + 1/0.9627² + 1/0.9614²)]
        = −0.3060.   (14)

The Scientific World Journal 7

Table 6: Summary of experiment data of Dermatology database.

No. | A | B | y1     | y2     | y3     | y4     | y5     | Average | S/N
1   | 1 | 1 | 0.9631 | 0.9701 | 0.9697 | 0.9627 | 0.9614 | 0.9654  | −0.3060
2   | 1 | 2 | 0.9686 | 0.9749 | 0.9653 | 0.9621 | 0.9732 | 0.9688  | −0.2755
3   | 1 | 3 | 0.9795 | 0.9847 | 0.9848 | 0.9838 | 0.9735 | 0.9813  | −0.1647
4   | 2 | 1 | 0.9630 | 0.9615 | 0.9581 | 0.9599 | 0.9668 | 0.9619  | −0.3379
5   | 2 | 2 | 0.9687 | 0.9721 | 0.9704 | 0.9707 | 0.9626 | 0.9689  | −0.2746
6   | 2 | 3 | 0.9685 | 0.9748 | 0.9744 | 0.9712 | 0.9707 | 0.9719  | −0.2475
7   | 3 | 1 | 0.9671 | 0.9689 | 0.9648 | 0.9668 | 0.9645 | 0.9664  | −0.2967
8   | 3 | 2 | 0.9741 | 0.9704 | 0.9797 | 0.9799 | 0.9767 | 0.9762  | −0.2098
9   | 3 | 3 | 0.9625 | 0.9633 | 0.9642 | 0.9678 | 0.9619 | 0.9639  | −0.3191

(A1 = 10, A2 = 50, A3 = 100; B1 = 2.4, B2 = 5, B3 = 10.)

Table 7: Summary of experiment data of Zoo database.

No. | A | B | y1     | y2     | y3     | y4     | y5     | Average | S/N
1   | 1 | 1 | 0.9513 | 0.9673 | 0.9435 | 0.9567 | 0.9546 | 0.9547  | −0.4037
2   | 1 | 2 | 0.9600 | 0.9616 | 0.9588 | 0.9611 | 0.9608 | 0.9605  | −0.3504
3   | 1 | 3 | 0.7809 | 0.7833 | 0.7820 | 0.7679 | 0.7811 | 0.7790  | −2.1694
4   | 2 | 1 | 0.7118 | 0.6766 | 0.7368 | 0.7256 | 0.7109 | 0.7123  | −2.9571
5   | 2 | 2 | 0.9600 | 0.9612 | 0.9604 | 0.9519 | 0.9440 | 0.9555  | −0.3960
6   | 2 | 3 | 0.8900 | 0.8947 | 0.9214 | 0.9050 | 0.9190 | 0.9060  | −0.8598
7   | 3 | 1 | 0.7118 | 0.7398 | 0.7421 | 0.7495 | 0.7203 | 0.7327  | −2.7064
8   | 3 | 2 | 0.9610 | 0.9735 | 0.9709 | 0.9752 | 0.9661 | 0.9693  | −0.2709
9   | 3 | 3 | 0.9600 | 0.9723 | 0.9707 | 0.9509 | 0.9763 | 0.9660  | −0.3013

(A1 = 5, A2 = 10, A3 = 50; B1 = 0.08, B2 = 4, B3 = 11.)

The calculation results of the S/N ratios of the remaining eight experimental combinations are summarized in Table 6; the Zoo experimental results and S/N ratio calculations are shown in Table 7. According to the above results, we then calculate the average S/N ratios of the various factor levels. With the experiment of Table 8 as an example, the average S/N ratio Ā1 of factor A at level 1 is

Ā1 = (1/3) × [−0.3060 + (−0.2755) + (−0.1647)] = −0.2487.   (15)
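Equation (15) averages the S/N ratios of the three runs in which factor A is at level 1 (runs 1-3 of Table 6). As a check:

```python
# S/N ratios of the nine Dermatology runs, in run order (Table 6).
sn = [-0.3060, -0.2755, -0.1647, -0.3379, -0.2746, -0.2475,
      -0.2967, -0.2098, -0.3191]

# Factor A is at level 1 in runs 1-3, level 2 in runs 4-6, level 3 in runs 7-9.
a1_avg = sum(sn[0:3]) / 3
print(round(a1_avg, 4))  # -0.2487, matching (15)
```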

Similarly, we can calculate the average effects of A2 and A3 from Table 6. The difference analysis results of the various factor levels of the Dermatology and Zoo databases are shown in Table 8, and the factor effect diagrams are shown in Figures 2 and 3. As a greater S/N ratio represents better quality, according to the factor level differences and the factor effect diagrams, the Dermatology parameter level combination is A1B3, in other words, parameters C = 10 and γ = 10; the Zoo parameter level combination is A1B2, and the parameter settings are C = 5 and γ = 4.

When constructing the Multiclass SVM model using SVM-RFE, three different feature sets are selected according

[Figure 2: Main effect plots for S/N ratio of the Dermatology database (average S/N ratio at levels 1-3 of factors A and B).]

to their significance. At the first stage, Taguchi quality engineering is applied to select the optimum values of parameters C and γ. At the second stage, the Multiclass SVM classifier is constructed and the classification performance is compared according to the above parameters. In the Dermatology experiment, Table 9 illustrates the two feature subsets containing 23 and 33 feature variables. The 33 feature


Table 8: Average of each factor at all levels.

Dermatology:
Control factor | Level 1 | Level 2 | Level 3 | Difference
A (C)          | −0.2487 | −0.2867 | −0.2752 | 0.0380
B (γ)          | −0.3135 | −0.2533 | −0.2438 | 0.0697

Zoo:
Control factor | Level 1 | Level 2 | Level 3 | Difference
A (C)          | −0.9745 | −1.4043 | −1.0929 | 0.4298
B (γ)          | −2.0224 | −0.3391 | −1.1102 | 1.6833
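The level averages and differences of Table 8 can be recomputed from the per-run S/N ratios of Table 6. A sketch, with the run data copied from Table 6:

```python
# (A level, B level, S/N ratio) for the nine Dermatology runs of Table 6.
runs = [(1, 1, -0.3060), (1, 2, -0.2755), (1, 3, -0.1647),
        (2, 1, -0.3379), (2, 2, -0.2746), (2, 3, -0.2475),
        (3, 1, -0.2967), (3, 2, -0.2098), (3, 3, -0.3191)]

def level_averages(runs, factor_index):
    """Average S/N ratio of each level (1-3) of one control factor."""
    return {lvl: sum(r[2] for r in runs if r[factor_index] == lvl) / 3
            for lvl in (1, 2, 3)}

a = level_averages(runs, 0)  # factor A (parameter C)
b = level_averages(runs, 1)  # factor B (parameter gamma)
print({k: round(v, 4) for k, v in a.items()})   # -0.2487, -0.2867, -0.2752
print(round(max(a.values()) - min(a.values()), 4))  # difference for A, about 0.038
# Best levels are those with the largest S/N ratio: A1 and B3,
# i.e., C = 10 and gamma = 10 for the Dermatology database.
print(max(a, key=a.get), max(b, key=b.get))
```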

Table 9: Classification performance comparison of Dermatology database.

Methods          | Dimensions | C   | γ   | Accuracy
SVM              | 33         | 100 | 5   | 95.10 ± 0.0096
SVM-RFE          | 23         | 50  | 2.4 | 89.28 ± 0.0139
SVM-RFE-Taguchi  | 23         | 10  | 10  | 95.38 ± 0.0098

Table 10: Classification performance comparison of Zoo database.

Methods          | Dimensions | C  | γ    | Accuracy
SVM              | 16         | 10 | 11   | 89 ± 0.0314
SVM-RFE          | 6          | 50 | 0.08 | 92 ± 0.0199
SVM-RFE-Taguchi  | 12         | 5  | 4    | 97 ± 0.0396

[Figure 3: Main effect plots for S/N ratio of the Zoo database (average S/N ratio at levels 1-3 of factors A and B).]

sets are tested by SVM and by SVM based on the Taguchi method. The parameter settings and testing accuracy results are shown in Table 9. The experimental results, as shown in Figure 4, indicate that the testing accuracy of SVM (C = 10, γ = 10) on the 17-feature dataset can be higher than 90%, which is better than the accuracy of SVM (C = 10, γ = 11) on the 20-feature dataset, up to 90%. Moreover, regardless of how many feature variables are selected, the accuracy of SVM (C = 50, γ = 2.4) cannot be higher than 90%.

[Figure 4: Classification performance comparison of the Dermatology database: accuracy versus number of features for SVM-RFE-Taguchi (C = 10, γ = 10), SVM-RFE (C = 50, γ = 2.4), and SVM-RFE (C = 100, γ = 5).]

Regarding the Zoo experiment, Table 10 summarizes the experimental test results of sets containing 6, 12, and 16 feature variables using SVM and SVM based on the Taguchi method. As shown in Table 10, the classification accuracy of the 12-feature set in the classification experiment using SVM-RFE-Taguchi (C = 5, γ = 4) is the highest, up to 97 ± 0.0396. As shown in Figure 5, the classification accuracy on the dataset containing 7 feature variables by SVM-RFE-Taguchi (C = 5, γ = 4) can be higher than 90%, which obtains relatively better prediction effects.

5. Conclusions

As the study of the impact of feature selection on multiclass classification accuracy becomes increasingly attractive and significant, this study applies SVM-RFE and SVM in the construction of a multiclass classification method in order to establish the classification model. As RFE is a


Table 11: Comparison of classification accuracy in related literature.

Dermatology database:
Author                       | Method          | Accuracy
Xie et al. (2005) [16]       | FOut SVM        | 91.74
Srinivasa et al. (2006) [32] | FCM SVM         | 83.30
Ren et al. (2006) [33]       | LDA SVM         | 72.09
Our method (2014)            | SVM-RFE-Taguchi | 95.38

Zoo database:
Author                       | Method          | Accuracy
Xie et al. (2005) [16]       | FOut SVM        | 88.24
He (2006) [34]               | NFPH k-modes    | 92.08
Golzari et al. (2009) [35]   | Fuzzy AIRS      | 94.96
Our method (2014)            | SVM-RFE-Taguchi | 97.00

[Figure 5: Classification performance comparison of the Zoo database: accuracy versus number of features for SVM-RFE-Taguchi (C = 5, γ = 4), SVM-RFE (C = 10, γ = 11), and SVM-RFE (C = 50, γ = 0.08).]

feature selection method of a wrapper model, it requires a previously defined classifier as the assessment rule of feature selection; therefore, SVM is used as the RFE assessment standard to help RFE in the selection of feature sets.

According to the experimental results of this study, with respect to parameter settings, the impact of parameter selection on the construction of the SVM classification model is huge. Therefore, this study applies the Taguchi parameter design in determining the parameter range and selecting the optimum parameter combination for the SVM classifier, as it is a key factor influencing the classification accuracy. This study also collected the experimental results of using different research methods on the Dermatology and Zoo databases [16, 32, 33], as shown in Table 11. By comparison, the proposed method can achieve higher classification accuracy.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References

[1] N. Cristianini and J. Shawe-Taylor, An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods, Cambridge University Press, Cambridge, UK, 2000.

[2] J. Luts, F. Ojeda, R. van de Plas, B. de Moor, S. van Huffel, and J. A. K. Suykens, "A tutorial on support vector machine-based methods for classification problems in chemometrics," Analytica Chimica Acta, vol. 665, no. 2, pp. 129-145, 2010.

[3] M. F. Akay, "Support vector machines combined with feature selection for breast cancer diagnosis," Expert Systems with Applications, vol. 36, no. 2, pp. 3240-3247, 2009.

[4] C.-Y. Chang, S.-J. Chen, and M.-F. Tsai, "Application of support-vector-machine-based method for feature selection and classification of thyroid nodules in ultrasound images," Pattern Recognition, vol. 43, no. 10, pp. 3494-3506, 2010.

[5] H.-L. Chen, B. Yang, J. Liu, and D.-Y. Liu, "A support vector machine classifier with rough set-based feature selection for breast cancer diagnosis," Expert Systems with Applications, vol. 38, no. 7, pp. 9014-9022, 2011.

[6] P. Danenas and G. Garsva, "Credit risk evaluation modeling using evolutionary linear SVM classifiers and sliding window approach," Procedia Computer Science, vol. 9, pp. 1324-1333, 2012.

[7] C. L. Huang, H. C. Liao, and M. C. Chen, "Prediction model building and feature selection with support vector machines in breast cancer diagnosis," Expert Systems with Applications, vol. 34, no. 1, pp. 578-587, 2008.

[8] H. F. Liau and D. Isa, "Feature selection for support vector machine-based face-iris multimodal biometric system," Expert Systems with Applications, vol. 38, no. 9, pp. 11105-11111, 2011.

[9] Y. Zhang, Z. Chi, and Y. Sun, "A novel multi-class support vector machine based on fuzzy theories," in Intelligent Computing: International Conference on Intelligent Computing, Part I (ICIC '06), D. S. Huang, K. Li, and G. W. Irwin, Eds., vol. 4113 of Lecture Notes in Computer Science, pp. 42-50, Springer, Berlin, Germany, 2006.

[10] Y. Aksu, D. J. Miller, G. Kesidis, and Q. X. Yang, "Margin-maximizing feature elimination methods for linear and nonlinear kernel-based discriminant functions," IEEE Transactions on Neural Networks, vol. 21, no. 5, pp. 701-717, 2010.

[11] P. Pudil, J. Novovicova, and J. Kittler, "Floating search methods in feature selection," Pattern Recognition Letters, vol. 15, no. 11, pp. 1119-1125, 1994.

[12] I. Guyon, J. Weston, S. Barnhill, and V. Vapnik, "Gene selection for cancer classification using support vector machines," Machine Learning, vol. 46, no. 1-3, pp. 389-422, 2002.

[13] S. Harikrishna, M. A. H. Farquad, and Shabana, "Credit scoring using support vector machine: a comparative analysis," in Advanced Materials Research, Trans Tech Publications, Zurich, Switzerland, 2012.

[14] X. Lin, F. Yang, L. Zhou et al., "A support vector machine-recursive feature elimination feature selection method based on artificial contrast variables and mutual information," Journal of Chromatography B: Analytical Technologies in the Biomedical and Life Sciences, vol. 910, pp. 149-155, 2012.

[15] R. Zhang and M. Jianwen, "Feature selection for hyperspectral data based on recursive support vector machines," International Journal of Remote Sensing, vol. 30, no. 14, pp. 3669-3677, 2009.

[16] Z. X. Xie, Q. H. Hu, and D. R. Yu, "Fuzzy output support vector machines for classification," in Advances in Natural Computation, L. Wang, K. Chen, and Y. S. Ong, Eds., vol. 3612, pp. 1190-1197, Springer, Berlin, Germany, 2005.

[17] Y. Liu, Z. You, and L. Cao, "A novel and quick SVM-based multi-class classifier," Pattern Recognition, vol. 39, no. 11, pp. 2258-2264, 2006.

[18] J. Platt, N. Cristianini, and J. Shawe-Taylor, "Large margin DAGs for multiclass classification," in Advances in Neural Information Processing Systems, S. A. Solla, T. K. Leen, and K. R. Muller, Eds., vol. 12, pp. 547-553, 2000.

[19] Y. Xu, S. Zomer, and R. G. Brereton, "Support vector machines: a recent method for classification in chemometrics," Critical Reviews in Analytical Chemistry, vol. 36, no. 3-4, pp. 177-188, 2006.

[20] M. L. Huang, Y. H. Hung, and E. J. Lin, "Effects of SVM parameter optimization based on the parameter design of Taguchi method," International Journal on Artificial Intelligence Tools, vol. 20, no. 3, pp. 563-575, 2011.

[21] H.-C. Lin, C.-T. Su, C.-C. Wang, B.-H. Chang, and R.-C. Juang, "Parameter optimization of continuous sputtering process based on Taguchi methods, neural networks, desirability function, and genetic algorithms," Expert Systems with Applications, vol. 39, no. 17, pp. 12918-12925, 2012.

[22] Y. Mao, D. Pi, Y. Liu, and Y. Sun, "Accelerated recursive feature elimination based on support vector machine for key variable identification," Chinese Journal of Chemical Engineering, vol. 14, no. 1, pp. 65-72, 2006.

[23] A. Pal and J. Maiti, "Development of a hybrid methodology for dimensionality reduction in Mahalanobis-Taguchi system using Mahalanobis distance and binary particle swarm optimization," Expert Systems with Applications, vol. 37, no. 2, pp. 1286-1293, 2010.

[24] C.-T. Su and Y.-H. Hsiao, "Multiclass MTS for simultaneous feature selection and classification," IEEE Transactions on Knowledge and Data Engineering, vol. 21, no. 2, pp. 192-205, 2009.

[25] X. Lin, F. Yang, L. Zhou et al., "A support vector machine-recursive feature elimination feature selection method based on artificial contrast variables and mutual information," Journal of Chromatography B, vol. 910, pp. 149-155, 2012.

[26] E. Hullermeier and S. Vanderlooy, "Combining predictions in pairwise classification: an optimal adaptive voting strategy and its relation to weighted voting," Pattern Recognition, vol. 43, no. 1, pp. 128-142, 2010.

[27] L. Bottou, C. Cortes, J. Denker et al., "Comparison of classifier methods: a case study in handwritten digit recognition," in Proceedings of the 12th IAPR International Conference on Pattern Recognition, vol. 2, pp. 77-82, IEEE Computer Society Press, Los Alamitos, Calif, USA, 1994.

[28] J. Furnkranz, "Round robin rule learning," in Proceedings of the 18th International Conference on Machine Learning (ICML '01), pp. 146-153, 2001.

[29] M. R. Sohrabi, S. Jamshidi, and A. Esmaeilifar, "Cloud point extraction for determination of Diazinon: optimization of the effective parameters using Taguchi method," Chemometrics and Intelligent Laboratory Systems, vol. 110, no. 1, pp. 49-54, 2012.

[30] W. C. Hsu and T. Y. Yu, "Support vector machines parameter selection based on combined Taguchi method and Staelin method for e-mail spam filtering," International Journal of Engineering and Technology Innovation, vol. 2, no. 2, pp. 113-125, 2012.

[31] J. Arenas-García and F. Pérez-Cruz, "Multi-class support vector machines: a new approach," in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '03), vol. 2, pp. 781-784, April 2003.

[32] K. G. Srinivasa, K. R. Venugopal, and L. M. Patnaik, "Feature extraction using fuzzy c-means clustering for data mining systems," International Journal of Computer Science and Network Security, vol. 6, no. 3A, pp. 230-236, 2006.

[33] Y. Ren, H. Liu, C. Xue, X. Yao, M. Liu, and B. Fan, "Classification study of skin sensitizers based on support vector machine and linear discriminant analysis," Analytica Chimica Acta, vol. 572, no. 2, pp. 272-282, 2006.

[34] Z. He, Farthest-point heuristic based initialization methods for K-modes clustering [thesis], Department of Computer Science and Engineering, Harbin Institute of Technology, Harbin, China, 2006.

[35] S. Golzari, S. Doraisamy, M. N. Sulaiman, and N. I. Udzir, "Effect of fuzzy resource allocation method on AIRS classifier accuracy," Journal of Theoretical and Applied Information Technology, vol. 5, no. 1, pp. 18-24, 2009.




Table 1: Feature information for Dermatology and Zoo databases.

                          | Dermatology          | Zoo
Dataset characteristics   | Multivariate         | Multivariate
Attribute characteristics | Categorical, integer | Categorical, integer
Associated tasks          | Classification       | Classification
Area                      | Life                 | Life
Number of instances       | 366                  | 101
Number of attributes      | 33                   | 16
Number of classes         | 6                    | 7

parameter design to optimize the Multiclass SVM parameters C and γ to improve the classification accuracy of the SVM multiclass classifier.

This study is organized as follows: Section 2 describes the research data; Section 3 introduces the methods used throughout this paper; Section 4 discusses the experiment and results; finally, Section 5 presents our conclusions.

2. Study Population

This study used the Dermatology dataset from the University of California at Irvine (UCI) and the Zoo database from its College of Information Technology and Computers to conduct experimental tests, parameter optimization, and classification accuracy performance evaluation using the SVM classifier.

In medicine, dermatological diseases are diseases of the skin that have a serious impact on health. As frequently occurring types of diseases, there are more than 1000 kinds of dermatological diseases, such as psoriasis, seborrheic dermatitis, lichen planus, pityriasis, chronic dermatitis, and pityriasis rubra pilaris. The Dermatology dataset was established by Nilsel in 1998 and contains 33 feature variables and 1 class variable (6 classes).

The dermatology feature variables and data summary are shown in Table 1. The Dermatology dataset has eight omissions; after removing them, we retained 358 instances for this study. The instances of the various categories are: psoriasis (Class 1), 111 instances; seborrheic dermatitis (Class 2), 71 instances; lichen planus (Class 3), 60 instances; pityriasis (Class 4), 48 instances; chronic dermatitis (Class 5), 48 instances; and pityriasis rubra pilaris (Class 6), 20 instances. The Zoo dataset contains 17 Boolean-valued attributes and 101 instances. The instances of the various categories are as follows: bear and so forth (Class 1), 41 instances; chicken and so forth (Class 2), 20 instances; seasnake and so forth (Class 3), 5 instances; bass and so forth (Class 4), 13 instances; (Class 5), 4 instances; frog and so forth (Class 6), 8 instances; and honeybee and so forth (Class 7), 10 instances.

Before feature selection, we conducted feature attribute coding. The feature attribute coding of the Dermatology and Zoo databases is shown in Tables 2 and 3.
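The preprocessing step, removing records with omissions and tallying the per-class counts reported above, can be sketched as follows. The records here are synthetic placeholders, not the actual UCI rows; only the class counts are taken from the text:

```python
from collections import Counter

# Per-class instance counts of the cleaned Dermatology dataset (from the text).
class_counts = {1: 111, 2: 71, 3: 60, 4: 48, 5: 48, 6: 20}

# Synthetic stand-in records: (feature values, class label); None marks an omission.
records = [([1, 2, None], 1), ([0, 1, 2], 1), ([2, 2, 1], 3)]

# Drop any record containing a missing value, as done for the 8 omissions.
clean = [(x, y) for x, y in records if None not in x]

print(Counter(y for _, y in clean))  # class labels remaining after cleaning
print(sum(class_counts.values()))    # 358 instances retained in total
```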

Table 2: Attributes of Dermatology database.

ID  | Attribute
V1  | Erythema
V2  | Scaling
V3  | Definite borders
V4  | Itching
V5  | Koebner phenomenon
V6  | Polygonal papules
V7  | Follicular papules
V8  | Oral mucosal involvement
V9  | Knee and elbow involvement
V10 | Scalp involvement
V11 | Family history
V12 | Melanin incontinence
V13 | Eosinophils in the infiltrate
V14 | PNL infiltrate
V15 | Fibrosis of the papillary dermis
V16 | Exocytosis
V17 | Acanthosis
V18 | Hyperkeratosis
V19 | Parakeratosis
V20 | Clubbing of the rete ridges
V21 | Elongation of the rete ridges
V22 | Thinning of the suprapapillary epidermis
V23 | Spongiform pustule
V24 | Munro microabscess
V25 | Focal hypergranulosis
V26 | Disappearance of the granular layer
V27 | Vacuolisation and damage of basal layer
V28 | Spongiosis
V29 | Saw-tooth appearance of retes
V30 | Follicular horn plug
V31 | Perifollicular parakeratosis
V32 | Inflammatory mononuclear infiltrate
V33 | Band-like infiltrate
V34 | Age

3. Methodology

3.1. Research Framework. The research framework of the study is shown in Figure 1. The steps are as follows.

(1) Database preprocessing: delete the omissions and code the feature variables for the Dermatology and Zoo datasets. There are 358 and 101 instances left in the Dermatology and Zoo databases, respectively, for further experiments.

(2) Feature selection: apply SVM-RFE to rank the features according to their order of importance and determine the feature set that contributes to the classification.

(3) Parameter optimization: apply the Taguchi parameter design to the optimization of the parameters (C and γ) of a Multiclass SVM classifier in order to enhance the classification accuracy for the multiclass dataset.


Table 3: Attributes of Zoo database.

ID  | Attribute
V1  | Hair
V2  | Feathers
V3  | Eggs
V4  | Milk
V5  | Airborne
V6  | Aquatic
V7  | Predator
V8  | Toothed
V9  | Backbone
V10 | Breathes
V11 | Venomous
V12 | Fins
V13 | Legs
V14 | Tail
V15 | Domestic
V16 | Cat-size

[Figure 1: Research framework: the UCI Dermatology and Zoo datasets are preprocessed (358 and 101 instances, respectively), features are selected with SVM-RFE, parameters C and γ are optimized with the Taguchi parameter design, and classifier performance is evaluated.]

3.2. Feature Selection. Feature selection implies not only cardinality reduction, which means imposing an arbitrary or predefined cutoff on the number of attributes that can be considered when building a model, but also the choice of attributes, meaning that either the analyst or the modeling tool actively selects or discards attributes based on their usefulness for analysis. The feature selection method is a search strategy to select or remove some features of the original feature set to generate various subsets in order to obtain the optimum feature subset. The subsets selected each time are compared and analyzed according to the formulated assessment function. If the subset selected in step m + 1 is better than the subset selected in step m, the subset selected in step m + 1 can be taken as the optimum subset.
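The stepwise comparison described above can be sketched as a greedy forward search. Here `score` is a hypothetical assessment function standing in for, e.g., cross-validated accuracy; the reward for features 0 and 2 is an arbitrary toy choice:

```python
def score(subset):
    """Hypothetical assessment function for a feature subset.
    Toy rule: reward subsets containing features 0 and 2, penalize size."""
    return len({0, 2} & set(subset)) - 0.01 * len(subset)

def greedy_forward_selection(n_features, max_steps):
    best_subset, best_score = [], float("-inf")
    for _ in range(max_steps):
        # Try adding each remaining feature; keep the best candidate subset.
        candidates = [best_subset + [f] for f in range(n_features)
                      if f not in best_subset]
        step_best = max(candidates, key=score)
        # Keep the step-(m+1) subset only if it beats the step-m subset.
        if score(step_best) <= best_score:
            break
        best_subset, best_score = step_best, score(step_best)
    return sorted(best_subset)

print(greedy_forward_selection(5, 5))  # [0, 2]
```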

3.3. Linear Support Vector Machine (Linear SVM). SVM is developed from statistical learning theory based on SRM (structural risk minimization). It can be applied to classification and nonlinear regression [6]. Generally speaking, SVM can be divided into linear SVM and nonlinear SVM, described as follows.

(1) Linear SVM. The linear SVM encodes the training data of the two types by classification, with Class 1 encoded as "+1" and Class 2 as "−1"; in mathematical notation, {(x_i, y_i)}_{i=1}^{l}, x_i ∈ R^m, y_i ∈ {−1, +1}. The hyperplane is represented as follows:

w · x + b = 0,   (1)

where w denotes the weight vector, x denotes the input data, and b denotes a constant serving as a bias (displacement) of the hyperplane. The purpose of the bias is to ensure that the hyperplane is in the correct position after horizontal movement; therefore, the bias is determined after training w. The parameters of the hyperplane are w and b. When SVM is applied to classification, the hyperplane is regarded as a decision function:

f(x) = sign(w · x + b).   (2)

Generally speaking, the purpose of SVM is to obtain the hyperplane of maximal margin and improve the distinguishing function between the two categories of the dataset. The process of optimizing the distinguishing function of the hyperplane can be regarded as a quadratic programming problem:

minimize   L_p = (1/2)‖w‖²
subject to y_i(x_i · w + b) − 1 ≥ 0, i = 1, ..., l.   (3)

The original minimization problem is converted into a maximization problem by using Lagrange theory:

max   L_D(α) = Σ_{i=1}^{l} α_i − (1/2) Σ_{i=1}^{l} Σ_{j=1}^{l} α_i α_j y_i y_j (x_i · x_j)
subject to   Σ_{i=1}^{l} α_i y_i = 0, i = 1, ..., l,
             α_i ≥ 0, i = 1, ..., l.   (4)

Finally, the linear divisive decision-making function is

f(x) = sign( Σ_{i=1}^{n} y_i α*_i (x · x_i) + b* ).   (5)
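Equation (5) can be evaluated directly once the multipliers are known. The support vectors, labels, α values, and bias below are made-up toy numbers for illustration, not trained values:

```python
def decision(x, support_vectors, labels, alphas, b):
    """Linear SVM decision function: sign(sum_i y_i * alpha_i * <x, x_i> + b)."""
    total = b
    for x_i, y_i, a_i in zip(support_vectors, labels, alphas):
        dot = sum(p * q for p, q in zip(x, x_i))
        total += y_i * a_i * dot
    return 1 if total > 0 else -1

# Toy support vectors on either side of the plane x1 + x2 = 0.
svs    = [(1.0, 1.0), (-1.0, -1.0)]
labels = [+1, -1]
alphas = [0.5, 0.5]
b      = 0.0

print(decision((2.0, 0.5), svs, labels, alphas, b))    # +1 side of the plane
print(decision((-1.5, -0.2), svs, labels, alphas, b))  # -1 side of the plane
```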


If f(x) > 0, the sample is in the same category as the samples marked with "+1"; otherwise, it is in the category of the samples marked with "−1". When the training data include noise, the linear hyperplane cannot accurately distinguish the data points. By introducing slack variables ξ_i into the constraints, the original (3) can be modified into the following:

minimize   (1/2)‖w‖² + C Σ_{i=1}^{l} ξ_i
subject to y_i(x_i · w + b) − 1 + ξ_i ≥ 0, i = 1, ..., l,
           ξ_i ≥ 0, i = 1, ..., l,   (6)

where ξ_i is the distance between the boundary and the classification point, and the penalty parameter C represents the cost of a classification error on the training data during the learning process, as determined by the user. When C is greater, the margin will be smaller, indicating that the fault tolerance will be smaller when a fault occurs; otherwise, when C is smaller, the fault tolerance will be greater. When C → ∞, the linearly inseparable problem degenerates into a linearly separable problem. In this case, the solution of the abovementioned optimization problem can be applied to obtain the parameters and the optimum solution of the target function using the Lagrangian coefficients; thus the linearly inseparable dual optimization problem is as follows:


max   L_D(α) = Σ_{i=1}^{l} α_i − (1/2) Σ_{i=1}^{l} Σ_{j=1}^{l} α_i α_j y_i y_j (x_i · x_j)
subject to   Σ_{i=1}^{l} α_i y_i = 0, i = 1, ..., l,
             0 ≤ α_i ≤ C, i = 1, ..., l.   (7)

Finally, the linear decision-making function is

f(x) = sign( Σ_{i=1}^{n} y_i α*_i (x · x_i) + b* ).   (8)

(2) Nonlinear Support Vector Machine (Nonlinear SVM). When the input training samples cannot be separated using a linear SVM, we can use a conversion function φ to map the original low-dimensional data into a new high-dimensional feature space in which the problem becomes linearly separable. SVM can efficiently perform a nonlinear classification using what is called the kernel trick, implicitly mapping its inputs into high-dimensional feature spaces. Presently, many different kernel functions have been proposed; using different kernel functions for different data features can effectively improve the computational efficiency of SVM. The relatively common kernel functions include the following four types:

(1) linear kernel function:

K(x_i, x_j) = x_i^T x_j;   (9)

(2) polynomial kernel function:

K(x_i, x_j) = (γ x_i^T x_j + r)^m, γ > 0;   (10)

(3) radial basis kernel function:

K(x_i, x_j) = exp(−‖x_i − x_j‖² / 2σ²), γ > 0;   (11)

(4) sigmoid kernel function:

K(x_i, x_j) = tanh(γ x_i^T x_j + r),   (12)

where the radial basis kernel function is more frequently applied to high-dimensional and nonlinear problems, and the only parameters to be set are γ and C, which slightly reduces SVM complexity and improves calculation efficiency; therefore, this study selects the radial basis kernel function.
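The four kernels above can be written down directly. This sketch follows the common RBF parameterization γ = 1/(2σ²), which reconciles the γ and σ notation in (11); the default parameter values are illustrative only:

```python
import math

def linear(x, z):
    """Linear kernel: <x, z>."""
    return sum(a * b for a, b in zip(x, z))

def polynomial(x, z, gamma=1.0, r=1.0, m=2):
    """Polynomial kernel: (gamma * <x, z> + r) ** m."""
    return (gamma * linear(x, z) + r) ** m

def rbf(x, z, gamma=0.5):
    """Radial basis kernel with gamma = 1 / (2 * sigma**2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

def sigmoid(x, z, gamma=1.0, r=0.0):
    """Sigmoid kernel: tanh(gamma * <x, z> + r)."""
    return math.tanh(gamma * linear(x, z) + r)

print(linear((1, 2), (3, 4)))  # 11
print(rbf((1, 2), (1, 2)))     # 1.0 at identical inputs
```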

3.4. Support Vector Machine Recursive Feature Elimination (SVM-RFE). A feature selection process can be used to remove terms in the training dataset that are statistically uncorrelated with the class labels, thus improving both efficiency and accuracy. Pal and Maiti (2010) provided a supervised dimensionality reduction method in which the feature selection problem is modeled as a mixed 0-1 integer program [23]. The multiclass Mahalanobis-Taguchi system (MMTS) was developed for simultaneous multiclass classification and feature selection; the important features are identified using orthogonal arrays and the signal-to-noise ratio and are then used to construct a reduced model measurement scale [24]. SVM-RFE is an SVM-based feature selection algorithm created by [12]. Using SVM-RFE, Guyon et al. selected key and important feature sets; in addition to reducing the classification computational time, it can improve the classification accuracy rate [12]. In recent years, many scholars have improved the classification effect in medical diagnosis by taking advantage of this method [22, 25].

3.5. Multiclass SVM Classifier. SVM's basic classification principle is mainly based on two categories. Presently, there are three main methods to process multiclass problems: one-against-all, one-against-one, and directed acyclic graph [26], described as follows.

(1) One-Against-All (OAA). Proposed by Bottou et al. (1994), the one-versus-rest approach converts the classification problem of k categories into k dual-category problems [27]. Scholars have also proposed subsequently effective classification methods [28]. In the training process, k dual-category SVMs must be trained; when training the i-th classifier, data in the i-th category are regarded as "+1" and the data of the remaining categories are regarded as "−1" to complete the training of the k dual-category SVMs. During the testing process, each testing instance is tested by the k trained dual-category SVMs, and the classification result can be determined by comparing the outputs of the SVMs. For an instance x of unknown category, the decision function arg max_{i=1,...,k} ((w^i)^T φ(x) + b_i) can be applied to generate k decision values, and the category of x is the category with the maximum decision value.

(2) One-Against-One (OAO). When there are k categories, each pair of categories produces an SVM; thus the method produces k(k − 1)/2 classifiers and determines the category of a sample by a voting strategy [28]. For example, if there are three categories (1, 2, and 3) and a sample to be classified with an assumed category of 2, the sample will be input into three SVMs. Each SVM determines the category of the sample using the decision function sign((w^{ij})^T Φ(x) + b^{ij}) and adds 1 to the votes of that category. Finally, the category with the most votes is the category of the sample.
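A minimal sketch of OAO voting follows. The pairwise decider here is a hard-coded toy rule rather than a trained SVM; it only illustrates the k(k − 1)/2 pairwise votes and the majority decision:

```python
from collections import Counter
from itertools import combinations

def oao_predict(x, classes, pairwise_decide):
    """Vote among k(k-1)/2 pairwise classifiers; return the majority class."""
    votes = Counter()
    for ci, cj in combinations(classes, 2):
        winner = pairwise_decide(ci, cj, x)  # returns either ci or cj
        votes[winner] += 1
    return votes.most_common(1)[0][0]

# Toy pairwise rule: class 2 beats everyone, class 1 beats class 3.
def toy_decide(ci, cj, x):
    order = {2: 0, 1: 1, 3: 2}
    return ci if order[ci] < order[cj] else cj

print(oao_predict(None, [1, 2, 3], toy_decide))  # class 2 wins the vote
```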

(3) Directed Acyclic Graph (DAG). Similar to the OAO method, DAG decomposes a classification problem of k categories into k(k − 1)/2 dual-category classification problems [18]. During the training process, it selects any two categories from the k categories as a group and combines them into a dual-category SVM; during the testing process, it establishes a dual-category acyclic graph, and data of an unknown category are tested from the root node. In a problem with k classes, a rooted binary DAG has k leaves labeled by the classes, and each of the k(k − 1)/2 internal nodes is labeled with an element of a Boolean function [19].

4. Experiment and Results

4.1. Feature Selection Based on SVM-RFE. The main purpose of SVM-RFE is to compute the ranking weights for all features and sort the features according to the weight vectors as the classification basis. SVM-RFE is an iterative process of backward feature removal. Its steps for feature set selection are as follows.

(1) Use the current dataset to train the classifier.
(2) Compute the ranking weights for all features.
(3) Delete the feature with the smallest weight.

The iteration process is implemented until only one feature remains in the dataset; the result is a list of features ordered by weight. The algorithm removes the feature with the smallest ranking weight while retaining the feature variables of significant impact, so that the feature variables are finally listed in descending order of explanatory power. SVM-RFE's selection of feature sets can be divided into three steps, namely, (1) input of the datasets to be classified, (2) calculation of the weight of each feature, and (3) deletion of the feature of minimum weight to obtain the feature ranking. The computational steps are as follows [12].

(1) Input:

Training samples X0 = [x1, x2, …, x_m]^T
Class labels y = [y1, y2, …, y_m]^T
Current feature set s = [1, 2, …, n]
Feature sorted list r = [ ]

(2) Feature Sorting:

Repeat the following process until s = [ ]:
Obtain the new training sample matrix restricted to the remaining features, X = X0(:, s)
Train the classifier, α = SVM-train(X, y)
Compute the weight vector, w = Σ_k α_k y_k x_k
Compute the ranking criterion, c_i = (w_i)²
Find the feature with the minimum weight, f = arg min(c)
Update the feature sorted list, r = [s(f), r]
Remove the feature with the minimum weight, s = s(1 : f − 1, f + 1 : length(s))

(3) Output: the feature sorted list r. In each loop, the feature with minimum (w_i)² is removed, and the SVM is then retrained on the remaining features to obtain a new feature ranking. SVM-RFE repeats this process until a complete feature sorted list is obtained. By training the SVM with the feature subsets of the sorted list and evaluating the subsets using the SVM prediction accuracy, we can obtain the optimum feature subsets.
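The loop above can be sketched in Python. For self-containedness, a least-squares linear fit stands in for SVM-train (a real implementation would obtain w = Σ_k α_k y_k x_k from a trained linear SVM); the ranking logic — criterion c_i = (w_i)², removal of the minimum, prepending to r — mirrors the steps above.

```python
import numpy as np

def stand_in_train(X, y):
    # Least-squares weight vector as a stand-in for a linear SVM trainer.
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

def svm_rfe(X0, y):
    s = list(range(X0.shape[1]))   # current feature set
    r = []                         # feature sorted list
    while s:
        X = X0[:, s]               # restrict to remaining features
        w = stand_in_train(X, y)
        c = w ** 2                 # ranking criterion c_i = (w_i)^2
        f = int(np.argmin(c))      # feature with the smallest weight
        r.insert(0, s[f])          # prepend it to the sorted list
        del s[f]                   # remove it from the working set
    return r                       # most important feature first

# Two informative features (0, 1) and one pure-noise feature (2):
rng = np.random.default_rng(0)
X0 = rng.normal(size=(200, 3))
y = np.sign(X0[:, 0] + X0[:, 1])
print(svm_rfe(X0, y))              # noise feature should rank last
```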

4.2. SVM Parameters Optimization Based on the Taguchi Method. The Taguchi method arises from the engineering technological perspective, and its major tools are the orthogonal array and the S/N ratio, where the S/N ratio and the loss function are closely related: a higher S/N ratio indicates fewer losses [29]. Parameter selection is an important step in the construction of a classification model using SVM, as differences in parameter settings can affect classification model stability and accuracy. Hsu and Yu (2012) combined the Taguchi method and the Staelin method to optimize an SVM-based e-mail spam filtering model and improve spam filtering accuracy [30]. Taguchi parameter design has many advantages. For one, the effect of robustness on quality is great: robustness reduces variation in parts by reducing the effects of uncontrollable variation, and more consistent parts mean better quality. Also, the Taguchi method allows for the analysis of many different parameters without a prohibitively high amount of experimentation; it provides the design engineer with a systematic and efficient method for determining near-optimum design parameters for performance and cost. Therefore, by using the Taguchi quality parameter design, this study conducts the optimization of parameters C and γ to enhance the accuracy of the SVM classifier on the diagnosis of multiclass diseases.

This study uses the multiclass classification accuracy as the quality attribute of the Taguchi parameter design [21]. In general, a higher classification accuracy means a better classification model; that is, the quality attribute is larger-the-better (LTB), and S/N_LTB is defined as

S/N_LTB = −10 log10(MSD) = −10 log10 [ (1/n) Σ_{i=1}^{n} 1/y_i² ].   (13)


Table 4: Classification accuracy (%) comparison.

Dermatology database:
C \ γ     1       3       10      12
1         52.57   95.18   94.08   94.22
10        52.57   96.04   97.94   97.93
50        52.57   96.31   96.86   96.58
100       52.57   96.31   96.32   96.03

Zoo database:
C \ γ     0.1     5       10      12
1         71.18   78.09   62.36   40.64
10        71.18   96.00   91.00   85.09
50        71.18   96.09   96.00   96.00
100       71.18   96.09   96.09   96.00

Table 5: Factor level configuration of LS-SVM parameter design.

Dermatology database:
Control factor   Level 1   Level 2   Level 3
A (C)            10        50        100
B (γ)            2.4       5         10

Zoo database:
Control factor   Level 1   Level 2   Level 3
A (C)            5         10        50
B (γ)            0.08      4         11

4.3. Evaluation of Classification Accuracy. Cross-validation divides all samples into a training set and a testing set. The training set is the learning data from which the algorithm establishes the classification rules; the samples of the testing set are used to measure the performance of those rules. All samples are randomly divided into k mutually exclusive folds by category. Each fold of the data is used once as the testing set, and the remaining k − 1 folds are used as the training set. The step is repeated k times, and each testing set validates the classification rules learned from the corresponding training set to obtain an accuracy rate. The average of the accuracy rates of all k testing sets is used as the final evaluation result. The method is known as k-fold cross-validation.
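The procedure can be sketched as a short helper; here a trivial majority-class model stands in for the classifier under evaluation, purely for illustration.

```python
import numpy as np

def k_fold_accuracy(X, y, train_fn, predict_fn, k=5, seed=0):
    # Split indices into k mutually exclusive folds.
    idx = np.random.default_rng(seed).permutation(len(y))
    folds = np.array_split(idx, k)
    accs = []
    for i in range(k):
        test = folds[i]                      # fold i is the testing set
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        model = train_fn(X[train], y[train]) # learn on the other k-1 folds
        accs.append(np.mean(predict_fn(model, X[test]) == y[test]))
    return float(np.mean(accs))              # average of the k accuracies

# Illustration: a majority-class "model" on an 80/20 label split.
X = np.zeros((100, 2))
y = np.array([1] * 80 + [0] * 20)
train = lambda X, y: int(np.bincount(y).argmax())
predict = lambda m, X: np.full(len(X), m)
print(k_fold_accuracy(X, y, train, predict, k=5))  # 0.8
```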

4.4. Results and Discussion. The ranking of all features for the Dermatology and Zoo databases using SVM-RFE is summarized as follows: Dermatology = {V1, V16, V32, V28, V19, V3, V17, V2, V15, V21, V26, V13, V14, V5, V18, V4, V23, V11, V8, V12, V27, V24, V6, V25, V30, V29, V10, V31, V22, V20, V33, V7, V9} and Zoo = {V13, V9, V14, V10, V16, V4, V8, V1, V11, V2, V12, V5, V6, V3, V15, V7}. According to the suggestions of scholars, the classification error rate of OAO is relatively lower when the number of testing instances is below 1000, and multiclass SVM parameter settings can affect the multiclass SVM's classification accuracy. Arenas-García and Pérez-Cruz applied SVM parameter settings to the multiclass Zoo dataset [31]. They carried out simulations using Gaussian kernels for all possible combinations of C and gamma, with C = {1, 3, 10, 30, 100} and gamma = {sqrt(0.25d), sqrt(0.5d), sqrt(d), sqrt(2d), sqrt(4d)}, where d is the dimension of the input data. In this study, we executed wide ranges of parameter settings for the Dermatology and Zoo databases. Finally, the parameter settings are suggested as Dermatology (C, γ): C = {1, 10, 50, 100} and γ = {1, 3, 10, 12}; Zoo (C, γ): C = {1, 10, 50, 100} and γ = {0.1, 5, 10, 12}. The testing accuracies are shown in Table 4.

As shown in Table 4, regarding parameter C, when C = 10 and γ = 5, 10, 12, the accuracy is higher than that of the combination C = 1 and γ = 5, 10, 12; moreover, regarding parameter γ, the accuracy for γ = 5 and C = 1, 10, 50, 100 is higher than that of the combination γ = 0.1 and C = 1, 10, 50, 100. The near-optimal value of C or γ may not be the same for different databases, and finding appropriate parameter settings is important for classifier performance. Practically, it is impossible to simulate every possible combination of parameter settings, and that is why the Taguchi methodology is applied to reduce the experimental combinations for SVM. The experimental procedure in this study first referred to a related study (e.g., C = {1, 3, 10, 30, 100} [31]) and then set a possible range for both databases (C = 1–100, γ = 1–12). After that, we slightly adjusted the ranges to determine whether better results could be obtained in the Taguchi quality engineering parameter optimization for each database. According to our experimental results, the final parameter settings C and γ range over 10–100 and 2.4–10, respectively, for the Dermatology database; the parameter settings C and γ range over 5–50 and 0.08–11, respectively, for the Zoo database. Within these ranges of parameters C and γ, we select three parameter levels and two control factors, A and B, representing parameters C and γ, respectively. The Taguchi orthogonal array experiment selects L9(3²), and the factor level configuration is illustrated in Table 5.

After data preprocessing, the Dermatology and Zoo databases include 358 and 101 testing instances, respectively. Each experiment of the orthogonal array is repeated five times (n = 5); the experimental combinations and observations are summarized in Tables 6 and 7. According to (13), the S/N ratio for Taguchi experimental combination 1 is calculated as

S/N_LTB = −10 log10 [ (1/5) × (1/0.9631² + 1/0.9701² + 1/0.9697² + 1/0.9627² + 1/0.9614²) ] = −0.3060.   (14)
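The calculation in (14) can be checked directly from the five observations of combination 1:

```python
import math

# Larger-the-better S/N ratio from (13): SN = -10 log10((1/n) sum 1/y_i^2).
def sn_ltb(obs):
    msd = sum(1.0 / v ** 2 for v in obs) / len(obs)
    return -10.0 * math.log10(msd)

y = [0.9631, 0.9701, 0.9697, 0.9627, 0.9614]  # observations of combination 1
print(round(sn_ltb(y), 3))  # -0.306, matching (14) and Table 6
```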


Table 6: Summary of experiment data of the Dermatology database.

Number  A  B   y1      y2      y3      y4      y5      Average  S/N
1       1  1   0.9631  0.9701  0.9697  0.9627  0.9614  0.9654   −0.3060
2       1  2   0.9686  0.9749  0.9653  0.9621  0.9732  0.9688   −0.2755
3       1  3   0.9795  0.9847  0.9848  0.9838  0.9735  0.9813   −0.1647
4       2  1   0.9630  0.9615  0.9581  0.9599  0.9668  0.9619   −0.3379
5       2  2   0.9687  0.9721  0.9704  0.9707  0.9626  0.9689   −0.2746
6       2  3   0.9685  0.9748  0.9744  0.9712  0.9707  0.9719   −0.2475
7       3  1   0.9671  0.9689  0.9648  0.9668  0.9645  0.9664   −0.2967
8       3  2   0.9741  0.9704  0.9797  0.9799  0.9767  0.9762   −0.2098
9       3  3   0.9625  0.9633  0.9642  0.9678  0.9619  0.9639   −0.3191
(A1 = 10, A2 = 50, A3 = 100; B1 = 2.4, B2 = 5, B3 = 10)

Table 7: Summary of experiment data of the Zoo database.

Number  A  B   y1      y2      y3      y4      y5      Average  S/N
1       1  1   0.9513  0.9673  0.9435  0.9567  0.9546  0.9547   −0.4037
2       1  2   0.9600  0.9616  0.9588  0.9611  0.9608  0.9605   −0.3504
3       1  3   0.7809  0.7833  0.7820  0.7679  0.7811  0.7790   −2.1694
4       2  1   0.7118  0.6766  0.7368  0.7256  0.7109  0.7123   −2.9571
5       2  2   0.9600  0.9612  0.9604  0.9519  0.9440  0.9555   −0.3960
6       2  3   0.8900  0.8947  0.9214  0.9050  0.9190  0.9060   −0.8598
7       3  1   0.7118  0.7398  0.7421  0.7495  0.7203  0.7327   −2.7064
8       3  2   0.9610  0.9735  0.9709  0.9752  0.9661  0.9693   −0.2709
9       3  3   0.9600  0.9723  0.9707  0.9509  0.9763  0.9660   −0.3013
(A1 = 5, A2 = 10, A3 = 50; B1 = 0.08, B2 = 4, B3 = 11)

The S/N ratios of the remaining eight experimental combinations are calculated similarly and summarized in Table 6; the Zoo experimental results and S/N ratio calculations are shown in Table 7. From these results, we then calculate the average S/N ratio of each factor level. Taking the experiment of Table 8 as an example, the average S/N ratio A1 of factor A at level 1 is

A1 = (1/3)[−0.3060 + (−0.2755) + (−0.1647)] = −0.2487.   (15)
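Likewise, (15) can be checked directly: factor A is at level 1 in combinations 1–3 of Table 6, so its average effect is the mean of those three S/N ratios.

```python
# S/N ratios of combinations 1-3 (factor A at level 1, Dermatology):
sn = [-0.3060, -0.2755, -0.1647]
A1 = sum(sn) / 3
print(round(A1, 4))  # -0.2487, matching (15) and Table 8
```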

Similarly, we can calculate the average effects of A2 and A3 from Table 6. The difference analysis results of the various factor levels of the Dermatology and Zoo databases are shown in Table 8, and the factor effect diagrams are shown in Figures 2 and 3. As a greater S/N ratio represents better quality, according to the factor level differences and the factor effect diagrams, the Dermatology parameter level combination is A1B3, in other words, parameters C = 10 and γ = 10; the Zoo parameter level combination is A1B2, with parameter settings C = 5 and γ = 4.

Figure 2: Main effect plots for S/N ratio of the Dermatology database.

When constructing the multiclass SVM model using SVM-RFE, three different feature sets are selected according to their significance. At the first stage, Taguchi quality engineering is applied to select the optimum values of parameters C and γ. At the second stage, the multiclass SVM classifier is constructed and the classification performance is compared under the above parameters. In the Dermatology experiment, Table 9 illustrates the two feature subsets containing 23 and 33 feature variables. The 33 feature


Table 8: Average S/N ratio of each factor at all levels.

Dermatology database:
Control factor   Level 1   Level 2   Level 3   Difference
A (C)            −0.2487   −0.2867   −0.2752   0.0380
B (γ)            −0.3135   −0.2533   −0.2438   0.0697

Zoo database:
Control factor   Level 1   Level 2   Level 3   Difference
A (C)            −0.9745   −1.4043   −1.0929   0.4298
B (γ)            −2.0224   −0.3391   −1.1102   1.6833

Table 9: Classification performance comparison of the Dermatology database.

Methods            Dimensions   C     γ     Accuracy (%)
SVM                33           100   5     95.10 ± 0.0096
SVM-RFE            23           50    2.4   89.28 ± 0.0139
SVM-RFE-Taguchi    23           10    10    95.38 ± 0.0098

Table 10: Classification performance comparison of the Zoo database.

Methods            Dimensions   C     γ      Accuracy (%)
SVM                16           10    11     89 ± 0.0314
SVM-RFE            6            50    0.08   92 ± 0.0199
SVM-RFE-Taguchi    12           5     4      97 ± 0.0396

Figure 3: Main effect plots for S/N ratio of the Zoo database.

sets are tested by SVM and by SVM based on the Taguchi method. The parameter settings and testing accuracy results are shown in Table 9. The experimental results, as shown in Figure 4, indicate that the SVM (C = 10, γ = 10) testing accuracy rate for feature sets of 17 or more features can be higher than 90%, which is better than the accuracy rate of the 20-feature dataset with SVM (C = 10, γ = 11), at up to 90%. Moreover, regardless of how many feature variables are selected, the accuracy of SVM (C = 50, γ = 2.4) cannot exceed 90%.

Regarding the Zoo experiment, Table 10 summarizes the experimental test results of sets containing 6, 12, and 16 feature variables using SVM and SVM based on the Taguchi method. As shown in Table 10, the classification accuracy rate of the set of 12 feature variables in the classification experiment using SVM-RFE-Taguchi (C = 5, γ = 4) is the highest, up to 97 ± 0.0396%. As shown in Figure 5, the experimental results show that the classification

Figure 4: Classification performance comparison of the Dermatology database (SVM-RFE-Taguchi, C = 10, γ = 10; SVM-RFE, C = 50, γ = 2.4; SVM-RFE, C = 100, γ = 5).

accuracy rate of the dataset containing 7 feature variables by SVM-RFE-Taguchi (C = 5, γ = 4) can be higher than 90%, which obtains relatively better prediction effects.

5. Conclusions

As the study of the impact of feature selection on the multiclass classification accuracy rate becomes increasingly attractive and significant, this study applies SVM-RFE and SVM to construct a multiclass classification method and establish the classification model. As RFE is a


Table 11: Comparison of classification accuracy in related literature.

Dermatology database:
Author                          Method            Accuracy (%)
Xie et al. (2005) [16]          FOut SVM          91.74
Srinivasa et al. (2006) [32]    FCM SVM           83.30
Ren et al. (2006) [33]          LDA SVM           72.09
Our method (2014)               SVM-RFE-Taguchi   95.38

Zoo database:
Author                          Method            Accuracy (%)
Xie et al. (2005) [16]          FOut SVM          88.24
He (2006) [34]                  NFPH k-modes      92.08
Golzari et al. (2009) [35]      Fuzzy AIRS        94.96
Our method (2014)               SVM-RFE-Taguchi   97.00

Figure 5: Classification performance comparison of the Zoo database (SVM-RFE-Taguchi, C = 5, γ = 4; SVM-RFE, C = 10, γ = 11; SVM-RFE, C = 50, γ = 0.08).

feature selection method of the wrapper model, it requires a previously defined classifier as the assessment rule of feature selection; therefore, SVM is used as the RFE assessment standard to help RFE in the selection of feature sets.

According to the experimental results of this study, with respect to parameter settings, the impact of parameter selection on the construction of the SVM classification model is huge. Therefore, this study applies the Taguchi parameter design to determine the parameter ranges and select the optimum parameter combination for the SVM classifier, as it is a key factor influencing classification accuracy. This study also collected the experimental results of different research methods on the Dermatology and Zoo databases [16, 32–35], as shown in Table 11. By comparison, the proposed method achieves higher classification accuracy.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References

[1] N. Cristianini and J. Shawe-Taylor, An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods, Cambridge University Press, Cambridge, UK, 2000.

[2] J. Luts, F. Ojeda, R. van de Plas, B. de Moor, S. van Huffel, and J. A. K. Suykens, "A tutorial on support vector machine-based methods for classification problems in chemometrics," Analytica Chimica Acta, vol. 665, no. 2, pp. 129–145, 2010.

[3] M. F. Akay, "Support vector machines combined with feature selection for breast cancer diagnosis," Expert Systems with Applications, vol. 36, no. 2, pp. 3240–3247, 2009.

[4] C.-Y. Chang, S.-J. Chen, and M.-F. Tsai, "Application of support-vector-machine-based method for feature selection and classification of thyroid nodules in ultrasound images," Pattern Recognition, vol. 43, no. 10, pp. 3494–3506, 2010.

[5] H.-L. Chen, B. Yang, J. Liu, and D.-Y. Liu, "A support vector machine classifier with rough set-based feature selection for breast cancer diagnosis," Expert Systems with Applications, vol. 38, no. 7, pp. 9014–9022, 2011.

[6] P. Danenas and G. Garsva, "Credit risk evaluation modeling using evolutionary linear SVM classifiers and sliding window approach," Procedia Computer Science, vol. 9, pp. 1324–1333, 2012.

[7] C. L. Huang, H. C. Liao, and M. C. Chen, "Prediction model building and feature selection with support vector machines in breast cancer diagnosis," Expert Systems with Applications, vol. 34, no. 1, pp. 578–587, 2008.

[8] H. F. Liau and D. Isa, "Feature selection for support vector machine-based face-iris multimodal biometric system," Expert Systems with Applications, vol. 38, no. 9, pp. 11105–11111, 2011.

[9] Y. Zhang, Z. Chi, and Y. Sun, "A novel multi-class support vector machine based on fuzzy theories," in Intelligent Computing: International Conference on Intelligent Computing, Part I (ICIC '06), D. S. Huang, K. Li, and G. W. Irwin, Eds., vol. 4113 of Lecture Notes in Computer Science, pp. 42–50, Springer, Berlin, Germany, 2006.

[10] Y. Aksu, D. J. Miller, G. Kesidis, and Q. X. Yang, "Margin-maximizing feature elimination methods for linear and nonlinear kernel-based discriminant functions," IEEE Transactions on Neural Networks, vol. 21, no. 5, pp. 701–717, 2010.

[11] P. Pudil, J. Novovicova, and J. Kittler, "Floating search methods in feature selection," Pattern Recognition Letters, vol. 15, no. 11, pp. 1119–1125, 1994.

[12] I. Guyon, J. Weston, S. Barnhill, and V. Vapnik, "Gene selection for cancer classification using support vector machines," Machine Learning, vol. 46, no. 1–3, pp. 389–422, 2002.

[13] S. Harikrishna, M. A. H. Farquad, and Shabana, "Credit scoring using support vector machine: a comparative analysis," in Advanced Materials Research, Trans Tech Publications, Zurich, Switzerland, 2012.

[14] X. Lin, F. Yang, L. Zhou et al., "A support vector machine-recursive feature elimination feature selection method based on artificial contrast variables and mutual information," Journal of Chromatography B: Analytical Technologies in the Biomedical and Life Sciences, vol. 910, pp. 149–155, 2012.

[15] R. Zhang and M. Jianwen, "Feature selection for hyperspectral data based on recursive support vector machines," International Journal of Remote Sensing, vol. 30, no. 14, pp. 3669–3677, 2009.

[16] Z. X. Xie, Q. H. Hu, and D. R. Yu, "Fuzzy output support vector machines for classification," in Advances in Natural Computation, L. Wang, K. Chen, and Y. S. Ong, Eds., vol. 3612, pp. 1190–1197, Springer, Berlin, Germany, 2005.

[17] Y. Liu, Z. You, and L. Cao, "A novel and quick SVM-based multi-class classifier," Pattern Recognition, vol. 39, no. 11, pp. 2258–2264, 2006.

[18] J. Platt, N. Cristianini, and J. Shawe-Taylor, "Large margin DAGs for multiclass classification," in Advances in Neural Information Processing Systems, S. A. Solla, T. K. Leen, and K. R. Muller, Eds., vol. 12, pp. 547–553, 2000.

[19] Y. Xu, S. Zomer, and R. G. Brereton, "Support vector machines: a recent method for classification in chemometrics," Critical Reviews in Analytical Chemistry, vol. 36, no. 3-4, pp. 177–188, 2006.

[20] M. L. Huang, Y. H. Hung, and E. J. Lin, "Effects of SVM parameter optimization based on the parameter design of Taguchi method," International Journal on Artificial Intelligence Tools, vol. 20, no. 3, pp. 563–575, 2011.

[21] H.-C. Lin, C.-T. Su, C.-C. Wang, B.-H. Chang, and R.-C. Juang, "Parameter optimization of continuous sputtering process based on Taguchi methods, neural networks, desirability function, and genetic algorithms," Expert Systems with Applications, vol. 39, no. 17, pp. 12918–12925, 2012.

[22] Y. Mao, D. Pi, Y. Liu, and Y. Sun, "Accelerated recursive feature elimination based on support vector machine for key variable identification," Chinese Journal of Chemical Engineering, vol. 14, no. 1, pp. 65–72, 2006.

[23] A. Pal and J. Maiti, "Development of a hybrid methodology for dimensionality reduction in Mahalanobis-Taguchi system using Mahalanobis distance and binary particle swarm optimization," Expert Systems with Applications, vol. 37, no. 2, pp. 1286–1293, 2010.

[24] C.-T. Su and Y.-H. Hsiao, "Multiclass MTS for simultaneous feature selection and classification," IEEE Transactions on Knowledge and Data Engineering, vol. 21, no. 2, pp. 192–205, 2009.

[25] X. Lin, F. Yang, L. Zhou et al., "A support vector machine-recursive feature elimination feature selection method based on artificial contrast variables and mutual information," Journal of Chromatography B, vol. 910, pp. 149–155, 2012.

[26] E. Hullermeier and S. Vanderlooy, "Combining predictions in pairwise classification: an optimal adaptive voting strategy and its relation to weighted voting," Pattern Recognition, vol. 43, no. 1, pp. 128–142, 2010.

[27] L. Bottou, C. Cortes, J. Denker et al., "Comparison of classifier methods: a case study in handwritten digit recognition," in Proceedings of the 12th IAPR International Conference on Pattern Recognition, vol. 2, pp. 77–82, IEEE Computer Society Press, Los Alamitos, Calif, USA, 1994.

[28] J. Furnkranz, "Round robin rule learning," in Proceedings of the 18th International Conference on Machine Learning (ICML '01), pp. 146–153, 2001.

[29] M. R. Sohrabi, S. Jamshidi, and A. Esmaeilifar, "Cloud point extraction for determination of Diazinon: optimization of the effective parameters using Taguchi method," Chemometrics and Intelligent Laboratory Systems, vol. 110, no. 1, pp. 49–54, 2012.

[30] W. C. Hsu and T. Y. Yu, "Support vector machines parameter selection based on combined Taguchi method and Staelin method for e-mail spam filtering," International Journal of Engineering and Technology Innovation, vol. 2, no. 2, pp. 113–125, 2012.

[31] J. Arenas-García and F. Pérez-Cruz, "Multi-class support vector machines: a new approach," in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '03), vol. 2, pp. 781–784, April 2003.

[32] K. G. Srinivasa, K. R. Venugopal, and L. M. Patnaik, "Feature extraction using fuzzy c-means clustering for data mining systems," International Journal of Computer Science and Network Security, vol. 6, no. 3A, pp. 230–236, 2006.

[33] Y. Ren, H. Liu, C. Xue, X. Yao, M. Liu, and B. Fan, "Classification study of skin sensitizers based on support vector machine and linear discriminant analysis," Analytica Chimica Acta, vol. 572, no. 2, pp. 272–282, 2006.

[34] Z. He, Farthest-point heuristic based initialization methods for K-modes clustering [thesis], Department of Computer Science and Engineering, Harbin Institute of Technology, Harbin, China, 2006.

[35] S. Golzari, S. Doraisamy, M. N. Sulaiman, and N. I. Udzir, "Effect of fuzzy resource allocation method on AIRS classifier accuracy," Journal of Theoretical and Applied Information Technology, vol. 5, no. 1, pp. 18–24, 2009.



Table 3: Attributes of the Zoo database.

ID    Attribute
V1    Hair
V2    Feathers
V3    Eggs
V4    Milk
V5    Airborne
V6    Aquatic
V7    Predator
V8    Toothed
V9    Backbone
V10   Breathes
V11   Venomous
V12   Fins
V13   Legs
V14   Tail
V15   Domestic
V16   Cat-size

Figure 1: Research framework (UCI Dermatology and Zoo datasets → preprocessing, yielding 358 Dermatology and 101 Zoo instances → SVM-RFE feature selection → Method 1: Taguchi parameter design of C and γ, or Method 2: Bayesian initial parameters C and γ → LS-SVM classifier performance evaluation).

3.2. Feature Selection. Feature selection implies not only cardinality reduction, which means imposing an arbitrary or predefined cutoff on the number of attributes that can be considered when building a model, but also the choice of attributes, meaning that either the analyst or the modeling tool actively selects or discards attributes based on their usefulness for analysis. The feature selection method is a search strategy that selects or removes some features of the original feature set to generate various subsets in order to obtain the optimum feature subset. The subsets selected each time are compared and analyzed according to the formulated assessment function. If the subset selected in step m + 1 is better than the subset selected in step m, the subset selected in step m + 1 is taken as the current optimum subset.

3.3. Linear Support Vector Machine (Linear SVM). SVM is developed from statistical learning theory based on SRM (structural risk minimization). It can be applied to classification and nonlinear regression [6]. Generally speaking, SVM can be divided into linear SVM and nonlinear SVM, described as follows.

(1) Linear SVM. The linear SVM encodes the training data of the two types by class, with Class 1 labeled "+1" and Class 2 labeled "−1"; the training set is {(x_i, y_i)}, i = 1, …, l, with x_i ∈ R^m and y_i ∈ {−1, +1}. The separating hyperplane is represented as follows:

w · x + b = 0,   (1)

where w denotes the weight vector, x denotes the input data, and b denotes a constant serving as the bias (displacement) of the hyperplane. The purpose of the bias is to ensure that the hyperplane is in the correct position after horizontal movement; therefore, the bias is determined after training w. The parameters of the hyperplane are w and b. When SVM is applied to classification, the hyperplane is regarded as a decision function:

f(x) = sign(w · x + b).   (2)

Generally speaking, the purpose of SVM is to obtain the hyperplane that maximizes the margin and improves the discrimination between the two categories of the dataset. The process of optimizing the discriminant function of the hyperplane can be regarded as a quadratic programming problem:

minimize   L_P = (1/2)‖w‖²
subject to   y_i (x_i · w + b) − 1 ≥ 0,   i = 1, …, l.   (3)

The original minimization problem is converted into a maximization problem by using Lagrange theory:

max   L_D(α) = Σ_{i=1}^{l} α_i − (1/2) Σ_{i=1}^{l} Σ_{j=1}^{l} α_i α_j y_i y_j (x_i · x_j)
subject to   Σ_{i=1}^{l} α_i y_i = 0,   α_i ≥ 0,   i = 1, …, l.   (4)

Finally, the linear divisive decision function is

f(x) = sign( Σ_{i=1}^{n} y_i α*_i (x · x_i) + b* ).   (5)


If f(x) > 0, the sample is in the same category as the samples labeled "+1"; otherwise, it is in the category of the samples labeled "−1". When the training data include noise, the linear hyperplane cannot accurately distinguish the data points. By introducing slack variables ξ_i into the constraints, the original (3) can be modified into the following:

minimize   (1/2)‖w‖² + C Σ_{i=1}^{l} ξ_i
subject to   y_i (x_i · w + b) − 1 + ξ_i ≥ 0,   ξ_i ≥ 0,   i = 1, …, l,   (6)

where ξ_i is the distance between the boundary and the misclassified point, and the penalty parameter C, set by the user, represents the cost of classification errors on the training data during the learning process. When C is greater, the margin is smaller, indicating a smaller fault tolerance when a fault occurs; conversely, when C is smaller, the fault tolerance is greater. When C → ∞, the linearly inseparable problem degenerates into a linearly separable problem. In this case, the solution of the above optimization problem can again be obtained through the Lagrangian coefficients to determine the parameters and the optimum of the target function; thus, the dual optimization problem for the linearly inseparable case is as follows:

$$\max \; L_D(\alpha) = \sum_{i=1}^{l} \alpha_i - \frac{1}{2} \sum_{i=1}^{l} \sum_{j=1}^{l} \alpha_i \alpha_j y_i y_j (x_i \cdot x_j)$$

$$\text{subject to} \quad \sum_{i=1}^{l} \alpha_i y_i = 0, \quad 0 \le \alpha_i \le C, \quad i = 1, \ldots, l \qquad (7)$$

Finally, the linear decision function is

$$f(x) = \operatorname{sign}\left(\sum_{i=1}^{n} y_i \alpha_i^{*} (x \cdot x_i) + b^{*}\right) \qquad (8)$$
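The decision function (8) can be sketched directly: once the multipliers $\alpha_i^*$ and bias $b^*$ are known, classifying a new point only needs dot products with the support vectors. The support vectors, labels, multipliers, and bias below are hypothetical toy values, not from any trained model.

```python
def svm_decision(x, support_vectors, labels, alphas, b):
    """Evaluate f(x) = sign(sum_i y_i * alpha_i * (x . x_i) + b), as in (8)."""
    dot = lambda u, v: sum(ui * vi for ui, vi in zip(u, v))
    s = sum(y * a * dot(x, sv)
            for sv, y, a in zip(support_vectors, labels, alphas)) + b
    return 1 if s > 0 else -1

# Toy 2-D example (all numbers hypothetical): one support vector per
# class, equal multipliers, bias zero.
svs    = [(1.0, 1.0), (-1.0, -1.0)]
labels = [+1, -1]
alphas = [0.5, 0.5]

print(svm_decision((2.0, 2.0), svs, labels, alphas, b=0.0))    # 1
print(svm_decision((-1.5, -0.5), svs, labels, alphas, b=0.0))  # -1
```

A point on the "+1" side of the separating hyperplane yields a positive sum and is assigned to the "+1" category, matching the rule stated after (5).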

(2) Nonlinear Support Vector Machine (Nonlinear SVM). When the input training samples cannot be separated by a linear SVM, a mapping function $\varphi$ can be used to transform the original input data into a new, high-dimensional feature space in which the problem becomes linearly separable. SVM can efficiently perform nonlinear classification using the so-called kernel trick, implicitly mapping its inputs into high-dimensional feature spaces. Many different kernel functions have been proposed; choosing a kernel suited to the characteristics of the data can effectively improve the computational efficiency of SVM. The four most common kernel functions are the following:

(1) linear kernel function:

$$K(x_i, x_j) = x_i^{T} x_j \qquad (9)$$

(2) polynomial kernel function:

$$K(x_i, x_j) = (\gamma x_i^{T} x_j + r)^{m}, \quad \gamma > 0 \qquad (10)$$

(3) radial basis kernel function:

$$K(x_i, x_j) = \exp\left(\frac{-\|x_i - x_j\|^{2}}{2\sigma^{2}}\right), \quad \gamma > 0 \qquad (11)$$

(4) sigmoid kernel function:

$$K(x_i, x_j) = \tanh(\gamma x_i^{T} x_j + r) \qquad (12)$$

The radial basis function (RBF) kernel is the most frequently applied in high-dimensional and nonlinear problems, and only the parameters $\gamma$ and $C$ need to be set, which slightly reduces SVM complexity and improves computational efficiency; therefore, this study selects the RBF kernel.
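The four kernels (9)-(12) can be written out directly; a minimal stdlib sketch, with the parameter defaults ($\gamma = 1$, $r$, $m = 2$, $\sigma = 1$) chosen only for illustration:

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def linear_kernel(x, z):
    return dot(x, z)                                   # (9)

def polynomial_kernel(x, z, gamma=1.0, r=1.0, m=2):
    return (gamma * dot(x, z) + r) ** m                # (10)

def rbf_kernel(x, z, sigma=1.0):
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-sq_dist / (2 * sigma ** 2))       # (11)

def sigmoid_kernel(x, z, gamma=1.0, r=0.0):
    return math.tanh(gamma * dot(x, z) + r)            # (12)

x, z = (1.0, 2.0), (2.0, 0.5)
print(linear_kernel(x, z))        # 3.0
print(polynomial_kernel(x, z))    # (3 + 1)^2 = 16.0
print(rbf_kernel(x, x))           # identical inputs give 1.0
```

Note that the RBF kernel depends only on the distance between the two points, which is why it suits problems without a natural inner-product structure in the original space.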

3.4. Support Vector Machine Recursive Feature Elimination (SVM-RFE). A feature selection process can be used to remove terms in the training dataset that are statistically uncorrelated with the class labels, thus improving both efficiency and accuracy. Pal and Maiti (2010) provided a supervised dimensionality reduction method in which the feature selection problem is modeled as a mixed 0-1 integer program [23]. The multiclass Mahalanobis-Taguchi system (MMTS) was developed for simultaneous multiclass classification and feature selection; the important features are identified using orthogonal arrays and the signal-to-noise ratio and are then used to construct a reduced model measurement scale [24]. SVM-RFE is an SVM-based feature selection algorithm proposed by Guyon et al. [12], who used it to select key feature sets; in addition to reducing classification computation time, it can improve the classification accuracy rate [12]. In recent years, many scholars have improved classification performance in medical diagnosis by taking advantage of this method [22, 25].

3.5. Multiclass SVM Classifier. SVM's basic classification principle is based on two categories. Presently, there are three main methods to process multiclass problems: one-against-all, one-against-one, and directed acyclic graph [26], described as follows.

(1) One-Against-All (OAA). Proposed by Bottou et al. (1994), the one-versus-rest approach converts a classification problem of $k$ categories into $k$ two-category problems [27]; scholars have since proposed further effective classification methods [28]. The training process must train $k$ two-category SVMs: when training the $i$th classifier, data in the $i$th category are labeled "+1" and data of the remaining categories are labeled "−1". During testing, each testing instance is evaluated by the $k$ trained two-category SVMs, and the classification result is determined by comparing their outputs. For a sample $x$ of unknown category, the decision function $\arg\max_{i=1,\ldots,k} \left[(w^{i})^{T}\phi(x) + b_i\right]$ generates $k$ decision values, and $x$ is assigned to the category with the maximum decision value.

(2) One-Against-One (OAO). When there are $k$ categories, each pair of categories produces one SVM; thus $k(k-1)/2$ classifiers are produced, and the category of a sample is determined by a voting strategy [28]. For example, suppose there are three categories (1, 2, and 3) and a sample to be classified whose true category is 2. The sample is input into the three SVMs; each SVM determines the category of the sample using the decision function $\operatorname{sign}((w^{ij})^{T}\Phi(x) + b^{ij})$ and adds 1 to the votes of that category. Finally, the category with the most votes is assigned to the sample.
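The OAO voting strategy above can be sketched as follows. The pairwise decision function here is a hypothetical stand-in for trained pairwise SVMs, constructed so that class 2 wins its pairwise contests, as in the paper's three-category example:

```python
from itertools import combinations
from collections import Counter

def oao_predict(x, pairwise_decision, classes):
    """One-against-one voting: each of the k(k-1)/2 pairwise classifiers
    votes for one of its two classes; the class with most votes wins."""
    votes = Counter()
    for ci, cj in combinations(classes, 2):
        winner = ci if pairwise_decision(ci, cj, x) > 0 else cj
        votes[winner] += 1
    return votes.most_common(1)[0][0]

# Hypothetical stand-in for trained pairwise SVMs: the sample's true
# class is 2, so class 2 beats every other class.
def toy_decision(ci, cj, x):
    if ci == 2:
        return +1.0   # vote for ci (= 2)
    if cj == 2:
        return -1.0   # vote for cj (= 2)
    return +1.0       # arbitrary outcome for pairs not involving class 2

print(oao_predict(None, toy_decision, [1, 2, 3]))  # 2
```

With three classes there are three classifiers; class 2 collects two votes and wins the election, which is exactly the worked example in the text.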

(3) Directed Acyclic Graph (DAG). Similar to the OAO method, DAG decomposes a $k$-category classification problem into $k(k-1)/2$ two-category classification problems [18]. During training, any two of the $k$ categories are selected as a group and combined into a two-category SVM; during testing, a two-category acyclic graph is established, and data of unknown category are tested from the root node. In a problem with $k$ classes, a rooted binary DAG has $k$ leaves labeled by the classes, and each of the $k(k-1)/2$ internal nodes is labeled with an element of a Boolean function [19].

4. Experiment and Results

4.1. Feature Selection Based on SVM-RFE. The main purpose of SVM-RFE is to compute the ranking weights for all features and sort the features according to the weight vectors as the classification basis. SVM-RFE is an iterative process of backward feature removal. Its steps for feature set selection are as follows:

(1) use the current dataset to train the classifier;
(2) compute the ranking weights for all features;
(3) delete the feature with the smallest weight.

The iteration process is implemented until only one feature remains in the dataset; the implementation result provides a list of features ordered by weight. The algorithm removes the feature with the smallest ranking weight while retaining the feature variables of significant impact, so that the feature variables are finally listed in descending order of explanatory power. SVM-RFE's selection of feature sets can be divided into three main steps: (1) input of the datasets to be classified, (2) calculation of the weight of each feature, and (3) deletion of the feature with minimum weight to obtain the feature ranking. The computational steps are as follows [12].

(1) Input:

Training samples $X_0 = [x_1, x_2, \ldots, x_m]^{T}$
Class labels $y = [y_1, y_2, \ldots, y_m]^{T}$
Current feature set $s = [1, 2, \ldots, n]$
Feature ranked list $r = [\,]$

(2) Feature sorting. Repeat the following process until $s = [\,]$:

Obtain the new training sample matrix restricted to the remaining features: $X = X_0(:, s)$
Train the classifier: $\alpha = \text{SVM-train}(X, y)$
Compute the weight vector: $w = \sum_k \alpha_k y_k x_k$
Compute the ranking criteria: $c_i = (w_i)^2$
Find the feature with the minimum weight: $f = \arg\min(c)$
Update the feature ranked list: $r = [s(f), r]$
Remove the feature with minimum weight: $s = s(1 : f-1,\; f+1 : \text{length}(s))$

(3) Output: the feature ranked list $r$. In each loop, the feature with minimum $(w_i)^2$ is removed, and the SVM is retrained on the remaining features to obtain a new feature ranking. SVM-RFE repeats this process until the complete ranked list is obtained. By training the SVM on the feature subsets of the ranked list and evaluating the subsets with the SVM prediction accuracy, the optimum feature subsets can be obtained.
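The three-step loop above can be sketched in a few lines. The SVM training step is abstracted behind a callback; the `mean_diff_weights` stand-in below (all samples weighted equally, i.e. $\alpha_k = 1$, so $w = \sum_k y_k x_k$) is an assumption used only to make the sketch self-contained, not the paper's actual classifier:

```python
def svm_rfe(X0, y, train_weights):
    """SVM-RFE sketch: repeatedly train, score each remaining feature
    by w_i^2, and eliminate the lowest-scoring feature.

    train_weights(X, y) must return one weight per remaining column;
    with a real SVM it would be w = sum_k alpha_k * y_k * x_k.
    """
    s = list(range(len(X0[0])))  # current feature set
    r = []                       # ranked list (strongest feature first)
    while s:
        X = [[row[j] for j in s] for row in X0]   # X = X0(:, s)
        w = train_weights(X, y)
        c = [wi ** 2 for wi in w]                 # ranking criterion
        f = c.index(min(c))                       # weakest feature
        r.insert(0, s[f])                         # r = [s(f), r]
        del s[f]                                  # remove it from s
    return r

# Crude stand-in for SVM-train (assumption): alpha_k = 1 for every sample.
def mean_diff_weights(X, y):
    n_feat = len(X[0])
    return [sum(yk * row[j] for row, yk in zip(X, y)) for j in range(n_feat)]

# Feature 0 tracks the label; feature 1 is a constant and carries no signal.
X0 = [[+1.0, 0.3], [+0.9, 0.3], [-1.1, 0.3], [-0.8, 0.3]]
y  = [+1, +1, -1, -1]
print(svm_rfe(X0, y, mean_diff_weights))  # [0, 1]: feature 0 ranked first
```

The uninformative constant feature is eliminated in the first pass and therefore ends up last in the ranked list, exactly the behavior the algorithm relies on.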

4.2. SVM Parameters Optimization Based on Taguchi Method. The Taguchi method arises from the engineering perspective, and its major tools are the orthogonal array and the $S/N$ ratio; the $S/N$ ratio and the loss function are closely related, and a higher $S/N$ ratio indicates fewer losses [29]. Parameter selection is an important step in constructing a classification model with SVM, as differences in parameter settings affect the stability and accuracy of the classification model. Hsu and Yu (2012) combined the Taguchi method and the Staelin method to optimize an SVM-based e-mail spam filtering model and improve spam filtering accuracy [30]. Taguchi parameter design has many advantages. First, its effect on quality through robustness is great: robustness reduces variation in parts by reducing the effects of uncontrollable variation, and more consistent parts mean better quality. The Taguchi method also allows the analysis of many different parameters without a prohibitively large amount of experimentation, providing the design engineer with a systematic and efficient method for determining near-optimum design parameters for performance and cost. Therefore, using the Taguchi quality parameter design, this study optimizes parameters $C$ and $\gamma$ to enhance the accuracy of the SVM classifier in the diagnosis of multiclass diseases.

This study uses the multiclass classification accuracy as the quality attribute of the Taguchi parameter design [21]. In general, higher classification accuracy means a better classification model; that is, the quality attribute is larger-the-better (LTB), and $S/N_{\text{LTB}}$ is defined as

$$S/N_{\text{LTB}} = -10 \log_{10}(MSD) = -10 \log_{10}\left[\frac{1}{n}\sum_{i=1}^{n}\frac{1}{y_i^{2}}\right] \qquad (13)$$


Table 4: Classification accuracy comparison (%).

Dermatology database:
C \ γ     1       3       10      12
1        52.57   95.18   94.08   94.22
10       52.57   96.04   97.94   97.93
50       52.57   96.31   96.86   96.58
100      52.57   96.31   96.32   96.03

Zoo database:
C \ γ     0.1     5       10      12
1        71.18   78.09   62.36   40.64
10       71.18   96.00   91.00   85.09
50       71.18   96.09   96.00   96.00
100      71.18   96.09   96.09   96.00

Table 5: Factor level configuration of LS-SVM parameter design.

Dermatology database:
Control factor   Level 1   Level 2   Level 3
A (C)            10        50        100
B (γ)            2.4       5         10

Zoo database:
Control factor   Level 1   Level 2   Level 3
A (C)            5         10        50
B (γ)            0.08      4         11

4.3. Evaluation of Classification Accuracy. Cross-validation divides all samples into a training set and a testing set: the training set is the learning data from which the algorithm establishes the classification rules, and the samples of the testing set are used to measure the performance of those rules. All samples are randomly divided by category into $k$ mutually exclusive folds. Each fold in turn is used as the testing data while the remaining $k-1$ folds form the training set; the step is repeated $k$ times, and each testing set validates the classification rules learned from the corresponding training set to obtain an accuracy rate. The average of the accuracy rates of all $k$ testing sets is used as the final evaluation result. This method is known as $k$-fold cross-validation.
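The procedure can be sketched as follows. This is a plain round-robin split; the paper additionally stratifies the folds by category, which is omitted here, and the constant-score evaluator is a dummy assumption standing in for an actual train-and-test step:

```python
def kfold_indices(n_samples, k):
    """Split sample indices into k mutually exclusive folds."""
    folds = [[] for _ in range(k)]
    for idx in range(n_samples):
        folds[idx % k].append(idx)
    return folds

def cross_validate(n_samples, k, evaluate):
    """evaluate(train_idx, test_idx) -> accuracy on one fold;
    the final score is the mean over the k folds."""
    folds = kfold_indices(n_samples, k)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        scores.append(evaluate(train_idx, test_idx))
    return sum(scores) / k

# Dummy evaluator (assumption): pretend every fold scores 0.95.
print(cross_validate(n_samples=10, k=5, evaluate=lambda tr, te: 0.95))
```

Every sample appears in exactly one testing fold and in $k-1$ training sets, so each classification rule is always validated on data it never saw.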

4.4. Results and Discussion. The ranking of all features for the Dermatology and Zoo databases using SVM-RFE is summarized as follows: Dermatology = {V1, V16, V32, V28, V19, V3, V17, V2, V15, V21, V26, V13, V14, V5, V18, V4, V23, V11, V8, V12, V27, V24, V6, V25, V30, V29, V10, V31, V22, V20, V33, V7, V9}, and Zoo = {V13, V9, V14, V10, V16, V4, V8, V1, V11, V2, V12, V5, V6, V3, V15, V7}. According to the suggestions of scholars, the classification error rate of OAO is relatively lower when the number of testing instances is below 1000. Multiclass SVM parameter settings affect the classification accuracy of the multiclass SVM. Arenas-García and Pérez-Cruz applied SVM parameter settings to the multiclass Zoo dataset [31]. They carried out simulations using Gaussian kernels for all possible combinations of $C$ and gamma from $C = [1, 3, 10, 30, 100]$ and gamma $= \sqrt{0.25d}, \sqrt{0.5d}, \sqrt{d}, \sqrt{2d}, \sqrt{4d}$, with $d$ being the dimension of the input data. In this study, we executed wide ranges of parameter settings for the Dermatology and Zoo databases. Finally, the parameter settings are suggested as Dermatology $(C, \gamma)$: $C = 1, 10, 50, 100$ and $\gamma = 1, 3, 10, 12$; Zoo $(C, \gamma)$: $C = 1, 10, 50, 100$ and $\gamma = 0.1, 5, 10, 12$. The testing accuracies are shown in Table 4.

As shown in Table 4, regarding parameter $C$, the accuracy for $C = 10$ and $\gamma = 5, 10, 12$ is higher than that of the combination $C = 1$ and $\gamma = 5, 10, 12$; moreover, regarding parameter $\gamma$, the accuracy for $\gamma = 5$ and $C = 1, 10, 50, 100$ is higher than that of the combination $\gamma = 0.1$ and $C = 1, 10, 50, 100$. The near-optimal value of $C$ or $\gamma$ may differ between databases, and finding appropriate parameter settings is important for classifier performance. In practice it is impossible to simulate every possible combination of parameter settings, which is why the Taguchi methodology is applied to reduce the number of experimental combinations for SVM. The experimental step used in this study first referred to the related study (e.g., $C = [1, 3, 10, 30, 100]$ [31]) and then set a possible range for both databases ($C = 1{-}100$, $\gamma = 1{-}12$). After that, we slightly adjusted the ranges to investigate whether better results could be obtained in the Taguchi quality engineering parameter optimization for each database. According to our experimental results, the final parameter settings $C$ and $\gamma$ range over 10–100 and 2.4–10, respectively, for the Dermatology database, and over 5–50 and 0.08–11, respectively, for the Zoo database. Within these ranges we select three parameter levels and two control factors, $A$ and $B$, to represent parameters $C$ and $\gamma$, respectively. The Taguchi orthogonal array experiment selects $L_9(3^2)$, and the factor level configuration is illustrated in Table 5.

After data preprocessing, the Dermatology and Zoo databases include 358 and 101 testing instances, respectively. Each experiment of the orthogonal array is repeated five times ($n = 5$); the experimental combinations and observations are summarized in Tables 6 and 7. According to (13), the $S/N$ ratio for Taguchi experimental combination 1 is calculated as

$$S/N_{\text{LTB}} = -10 \log_{10}\left[\frac{1}{5}\left(\frac{1}{0.9631^{2}} + \frac{1}{0.9701^{2}} + \frac{1}{0.9697^{2}} + \frac{1}{0.9627^{2}} + \frac{1}{0.9614^{2}}\right)\right] = -0.3060 \qquad (14)$$
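The calculation in (14) is easy to reproduce directly from definition (13), using the five observations of combination 1 from Table 6:

```python
import math

def sn_ltb(observations):
    """Larger-the-better S/N ratio from (13):
    -10 * log10( (1/n) * sum(1 / y_i^2) )."""
    n = len(observations)
    msd = sum(1.0 / (y * y) for y in observations) / n
    return -10.0 * math.log10(msd)

# Observations for Taguchi combination 1 of the Dermatology database
# (first row of Table 6).
y = [0.9631, 0.9701, 0.9697, 0.9627, 0.9614]
print(round(sn_ltb(y), 4))  # -0.306, matching the -0.3060 reported in (14)
```

Because all accuracies are close to 1, the $S/N$ ratio is a small negative number; it approaches 0 as the five repeated accuracies approach 100%.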


Table 6: Summary of experiment data of Dermatology database.

No.  A  B   y1      y2      y3      y4      y5      Average  S/N
1    1  1   0.9631  0.9701  0.9697  0.9627  0.9614  0.9654   −0.3060
2    1  2   0.9686  0.9749  0.9653  0.9621  0.9732  0.9688   −0.2755
3    1  3   0.9795  0.9847  0.9848  0.9838  0.9735  0.9813   −0.1647
4    2  1   0.9630  0.9615  0.9581  0.9599  0.9668  0.9619   −0.3379
5    2  2   0.9687  0.9721  0.9704  0.9707  0.9626  0.9689   −0.2746
6    2  3   0.9685  0.9748  0.9744  0.9712  0.9707  0.9719   −0.2475
7    3  1   0.9671  0.9689  0.9648  0.9668  0.9645  0.9664   −0.2967
8    3  2   0.9741  0.9704  0.9797  0.9799  0.9767  0.9762   −0.2098
9    3  3   0.9625  0.9633  0.9642  0.9678  0.9619  0.9639   −0.3191
(A1 = 10, A2 = 50, A3 = 100; B1 = 2.4, B2 = 5, B3 = 10)

Table 7: Summary of experiment data of Zoo database.

No.  A  B   y1      y2      y3      y4      y5      Average  S/N
1    1  1   0.9513  0.9673  0.9435  0.9567  0.9546  0.9547   −0.4037
2    1  2   0.9600  0.9616  0.9588  0.9611  0.9608  0.9605   −0.3504
3    1  3   0.7809  0.7833  0.7820  0.7679  0.7811  0.7790   −2.1694
4    2  1   0.7118  0.6766  0.7368  0.7256  0.7109  0.7123   −2.9571
5    2  2   0.9600  0.9612  0.9604  0.9519  0.9440  0.9555   −0.3960
6    2  3   0.8900  0.8947  0.9214  0.9050  0.9190  0.9060   −0.8598
7    3  1   0.7118  0.7398  0.7421  0.7495  0.7203  0.7327   −2.7064
8    3  2   0.9610  0.9735  0.9709  0.9752  0.9661  0.9693   −0.2709
9    3  3   0.9600  0.9723  0.9707  0.9509  0.9763  0.9660   −0.3013
(A1 = 5, A2 = 10, A3 = 50; B1 = 0.08, B2 = 4, B3 = 11)

The $S/N$ ratios of the remaining eight experimental combinations are calculated in the same way and summarized in Table 6; the Zoo experimental results and $S/N$ ratios are shown in Table 7. From these results we then calculate the average $S/N$ ratio of each factor level. Taking Table 8 as an example, the average $S/N$ ratio $\bar{A}_1$ of factor $A$ at level 1 is

$$\bar{A}_1 = \frac{1}{3}\left[-0.3060 + (-0.2755) + (-0.1647)\right] = -0.2487 \qquad (15)$$
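The full main-effect analysis behind Table 8 and Figure 2 follows the same averaging. Using the nine Dermatology $S/N$ ratios from Table 6 and the $L_9$ level pattern:

```python
# Dermatology S/N ratios from Table 6, rows 1-9 of the L9 array,
# with factor levels A = 1,1,1,2,2,2,3,3,3 and B = 1,2,3 repeating.
sn = [-0.3060, -0.2755, -0.1647,   # A1: B1, B2, B3
      -0.3379, -0.2746, -0.2475,   # A2
      -0.2967, -0.2098, -0.3191]   # A3

A_levels = [1, 1, 1, 2, 2, 2, 3, 3, 3]
B_levels = [1, 2, 3, 1, 2, 3, 1, 2, 3]

def level_average(sn_ratios, level_of, level):
    vals = [s for s, lv in zip(sn_ratios, level_of) if lv == level]
    return sum(vals) / len(vals)

A_avg = {lv: level_average(sn, A_levels, lv) for lv in (1, 2, 3)}
B_avg = {lv: level_average(sn, B_levels, lv) for lv in (1, 2, 3)}

print(round(A_avg[1], 4))  # -0.2487, matching (15)
# Larger S/N is better, so pick the maximizing level of each factor.
best = (max(A_avg, key=A_avg.get), max(B_avg, key=B_avg.get))
print(best)  # (1, 3): A1B3, i.e. C = 10, gamma = 10
```

The maximizing levels reproduce the A1B3 combination that the factor effect analysis selects for the Dermatology database.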

Similarly, we can calculate the average effects $\bar{A}_2$ and $\bar{A}_3$ from Table 6. The difference analysis results for the various factor levels of the Dermatology and Zoo databases are shown in Table 8, and the factor effect diagrams are shown in Figures 2 and 3. As a greater $S/N$ ratio represents better quality, according to the factor level differences and the factor effect diagrams, the Dermatology parameter level combination is A1B3, in other words, parameters $C = 10$, $\gamma = 10$; the Zoo parameter level combination is A1B2, with parameter settings $C = 5$, $\gamma = 4$.

When constructing the multiclass SVM model using SVM-RFE, three different feature sets are selected according

Figure 2: Main effect plots for S/N ratio of Dermatology database.

to their significance. At the first stage, Taguchi quality engineering is applied to select the optimum values of parameters $C$ and $\gamma$; at the second stage, the multiclass SVM classifier is constructed and its classification performance is compared under the above parameters. In the Dermatology experiment, Table 9 illustrates the two feature subsets containing 23 and 33 feature variables. The 33-feature


Table 8: Average of each factor at all levels.

Dermatology database:
Control factor   Level 1   Level 2   Level 3   Difference
A (C)            −0.2487   −0.2867   −0.2752   0.0380
B (γ)            −0.3135   −0.2533   −0.2438   0.0697

Zoo database:
Control factor   Level 1   Level 2   Level 3   Difference
A (C)            −0.9745   −1.4043   −1.0929   0.4298
B (γ)            −2.0224   −0.3391   −1.1102   1.6833

Table 9: Classification performance comparison of Dermatology database.

Method            Dimensions   C     γ     Accuracy (%)
SVM               33           100   5     95.10 ± 0.0096
SVM-RFE           23           50    2.4   89.28 ± 0.0139
SVM-RFE-Taguchi   23           10    10    95.38 ± 0.0098

Table 10: Classification performance comparison of Zoo database.

Method            Dimensions   C    γ      Accuracy (%)
SVM               16           10   11     89 ± 0.0314
SVM-RFE           6            50   0.08   92 ± 0.0199
SVM-RFE-Taguchi   12           5    4      97 ± 0.0396

Figure 3: Main effect plots for S/N ratio of Zoo database.

sets are tested by SVM and by SVM based on Taguchi; the parameter settings and testing accuracy results are shown in Table 9. The experimental results, shown in Figure 4, indicate that the testing accuracy of SVM ($C = 10$, $\gamma = 10$) on the 17-feature set can be higher than 90%, which is better than that of SVM ($C = 10$, $\gamma = 11$), which reaches 90% only on the 20-feature set. Moreover, regardless of how many feature variables are selected, the accuracy of SVM ($C = 50$, $\gamma = 2.4$) cannot exceed 90%.

Regarding the Zoo experiment, Table 10 summarizes the experimental test results of sets containing 6, 12, and 16 feature variables using SVM and SVM based on Taguchi. As shown in Table 10, the classification accuracy rate of the 12-feature-variable set in the classification experiment using SVM-RFE-Taguchi ($C = 5$, $\gamma = 4$) is the highest, reaching 97% ± 0.0396. As shown in Figure 5, the experimental results show that the classification

Figure 4: Classification performance comparison of Dermatology database (SVM-RFE-Taguchi, C = 10, γ = 10; SVM-RFE, C = 50, γ = 2.4; SVM-RFE, C = 100, γ = 5).

accuracy rate of the dataset containing 7 feature variables by SVM-RFE-Taguchi ($C = 50$, $\gamma = 2.4$) can be higher than 90%, which yields relatively better prediction effects.

5. Conclusions

As the study of the impact of feature selection on multiclass classification accuracy becomes increasingly attractive and significant, this study applies SVM-RFE and SVM in the construction of a multiclass classification method in order to establish the classification model. As RFE is a


Table 11: Comparison of classification accuracy in related literature.

Dermatology database:
Author                         Method            Accuracy (%)
Xie et al. (2005) [16]         FOut SVM          91.74
Srinivasa et al. (2006) [32]   FCM SVM           83.30
Ren et al. (2006) [33]         LDA SVM           72.09
Our method (2014)              SVM-RFE-Taguchi   95.38

Zoo database:
Author                         Method            Accuracy (%)
Xie et al. (2005) [16]         FOut SVM          88.24
He (2006) [34]                 NFPH k-modes      92.08
Golzari et al. (2009) [35]     Fuzzy AIRS        94.96
Our method (2014)              SVM-RFE-Taguchi   97.00

Figure 5: Classification performance comparison of Zoo database (SVM-RFE-Taguchi, C = 5, γ = 4; SVM-RFE, C = 10, γ = 11; SVM-RFE, C = 50, γ = 0.08).

feature selection method of the wrapper model, it requires a previously defined classifier as the assessment rule of feature selection; therefore, SVM is used as the RFE assessment standard to help RFE in the selection of feature sets.

According to the experimental results of this study, with respect to parameter settings, parameter selection has a huge impact on the construction of the SVM classification model. Therefore, this study applies the Taguchi parameter design to determine the parameter ranges and select the optimum parameter combination for the SVM classifier, as it is a key factor influencing classification accuracy. This study also collected the experimental results of different research methods on the Dermatology and Zoo databases [16, 32, 33], as shown in Table 11. By comparison, the proposed method achieves higher classification accuracy.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References

[1] N. Cristianini and J. Shawe-Taylor, An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods, Cambridge University Press, Cambridge, UK, 2000.

[2] J. Luts, F. Ojeda, R. Van de Plas, B. De Moor, S. Van Huffel, and J. A. K. Suykens, "A tutorial on support vector machine-based methods for classification problems in chemometrics," Analytica Chimica Acta, vol. 665, no. 2, pp. 129–145, 2010.

[3] M. F. Akay, "Support vector machines combined with feature selection for breast cancer diagnosis," Expert Systems with Applications, vol. 36, no. 2, pp. 3240–3247, 2009.

[4] C.-Y. Chang, S.-J. Chen, and M.-F. Tsai, "Application of support-vector-machine-based method for feature selection and classification of thyroid nodules in ultrasound images," Pattern Recognition, vol. 43, no. 10, pp. 3494–3506, 2010.

[5] H.-L. Chen, B. Yang, J. Liu, and D.-Y. Liu, "A support vector machine classifier with rough set-based feature selection for breast cancer diagnosis," Expert Systems with Applications, vol. 38, no. 7, pp. 9014–9022, 2011.

[6] P. Danenas and G. Garsva, "Credit risk evaluation modeling using evolutionary linear SVM classifiers and sliding window approach," Procedia Computer Science, vol. 9, pp. 1324–1333, 2012.

[7] C. L. Huang, H. C. Liao, and M. C. Chen, "Prediction model building and feature selection with support vector machines in breast cancer diagnosis," Expert Systems with Applications, vol. 34, no. 1, pp. 578–587, 2008.

[8] H. F. Liau and D. Isa, "Feature selection for support vector machine-based face-iris multimodal biometric system," Expert Systems with Applications, vol. 38, no. 9, pp. 11105–11111, 2011.

[9] Y. Zhang, Z. Chi, and Y. Sun, "A novel multi-class support vector machine based on fuzzy theories," in Intelligent Computing: International Conference on Intelligent Computing, Part I (ICIC '06), D. S. Huang, K. Li, and G. W. Irwin, Eds., vol. 4113 of Lecture Notes in Computer Science, pp. 42–50, Springer, Berlin, Germany.

[10] Y. Aksu, D. J. Miller, G. Kesidis, and Q. X. Yang, "Margin-maximizing feature elimination methods for linear and nonlinear kernel-based discriminant functions," IEEE Transactions on Neural Networks, vol. 21, no. 5, pp. 701–717, 2010.

[11] P. Pudil, J. Novovicova, and J. Kittler, "Floating search methods in feature selection," Pattern Recognition Letters, vol. 15, no. 11, pp. 1119–1125, 1994.

[12] I. Guyon, J. Weston, S. Barnhill, and V. Vapnik, "Gene selection for cancer classification using support vector machines," Machine Learning, vol. 46, no. 1–3, pp. 389–422, 2002.

[13] S. Harikrishna, M. A. H. Farquad, and Shabana, "Credit scoring using support vector machine: a comparative analysis," in Advanced Materials Research, Trans Tech Publications, Zurich, Switzerland, 2012.

[14] X. Lin, F. Yang, L. Zhou et al., "A support vector machine-recursive feature elimination feature selection method based on artificial contrast variables and mutual information," Journal of Chromatography B: Analytical Technologies in the Biomedical and Life Sciences, vol. 10, pp. 149–155, 2012.

[15] R. Zhang and M. Jianwen, "Feature selection for hyperspectral data based on recursive support vector machines," International Journal of Remote Sensing, vol. 30, no. 14, pp. 3669–3677, 2009.

[16] Z. X. Xie, Q. H. Hu, and D. R. Yu, "Fuzzy output support vector machines for classification," in Advances in Natural Computation, L. Wang, K. Chen, and Y. S. Ong, Eds., vol. 3612, pp. 1190–1197, Springer, Berlin, Germany.

[17] Y. Liu, Z. You, and L. Cao, "A novel and quick SVM-based multi-class classifier," Pattern Recognition, vol. 39, no. 11, pp. 2258–2264, 2006.

[18] J. Platt, N. Cristianini, and J. Shawe-Taylor, "Large margin DAGs for multiclass classification," in Advances in Neural Information Processing Systems, S. A. Solla, T. K. Leen, and K. R. Muller, Eds., vol. 12, pp. 547–553, 2000.

[19] Y. Xu, S. Zomer, and R. G. Brereton, "Support vector machines: a recent method for classification in chemometrics," Critical Reviews in Analytical Chemistry, vol. 36, no. 3-4, pp. 177–188, 2006.

[20] M. L. Huang, Y. H. Hung, and E. J. Lin, "Effects of SVM parameter optimization based on the parameter design of Taguchi method," International Journal on Artificial Intelligence Tools, vol. 20, no. 3, pp. 563–575, 2011.

[21] H.-C. Lin, C.-T. Su, C.-C. Wang, B.-H. Chang, and R.-C. Juang, "Parameter optimization of continuous sputtering process based on Taguchi methods, neural networks, desirability function, and genetic algorithms," Expert Systems with Applications, vol. 39, no. 17, pp. 12918–12925, 2012.

[22] Y. Mao, D. Pi, Y. Liu, and Y. Sun, "Accelerated recursive feature elimination based on support vector machine for key variable identification," Chinese Journal of Chemical Engineering, vol. 14, no. 1, pp. 65–72, 2006.

[23] A. Pal and J. Maiti, "Development of a hybrid methodology for dimensionality reduction in Mahalanobis-Taguchi system using Mahalanobis distance and binary particle swarm optimization," Expert Systems with Applications, vol. 37, no. 2, pp. 1286–1293, 2010.

[24] C.-T. Su and Y.-H. Hsiao, "Multiclass MTS for simultaneous feature selection and classification," IEEE Transactions on Knowledge and Data Engineering, vol. 21, no. 2, pp. 192–205, 2009.

[25] X. Lin, F. Yang, L. Zhou et al., "A support vector machine-recursive feature elimination feature selection method based on artificial contrast variables and mutual information," Journal of Chromatography B, vol. 910, pp. 149–155, 2012.

[26] E. Hullermeier and S. Vanderlooy, "Combining predictions in pairwise classification: an optimal adaptive voting strategy and its relation to weighted voting," Pattern Recognition, vol. 43, no. 1, pp. 128–142, 2010.

[27] L. Bottou, C. Cortes, J. Denker et al., "Comparison of classifier methods: a case study in handwritten digit recognition," in Proceedings of the 12th IAPR International Conference on Pattern Recognition, vol. 2, pp. 77–82, IEEE Computer Society Press, Los Alamitos, Calif, USA, 1994.

[28] J. Furnkranz, "Round robin rule learning," in Proceedings of the 18th International Conference on Machine Learning (ICML '01), pp. 146–153, 2001.

[29] M. R. Sohrabi, S. Jamshidi, and A. Esmaeilifar, "Cloud point extraction for determination of Diazinon: optimization of the effective parameters using Taguchi method," Chemometrics and Intelligent Laboratory Systems, vol. 110, no. 1, pp. 49–54, 2012.

[30] W. C. Hsu and T. Y. Yu, "Support vector machines parameter selection based on combined Taguchi method and Staelin method for e-mail spam filtering," International Journal of Engineering and Technology Innovation, vol. 2, no. 2, pp. 113–125, 2012.

[31] J. Arenas-Garcia and F. Perez-Cruz, "Multi-class support vector machines: a new approach," in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '03), vol. 2, pp. 781–784, April 2003.

[32] K. G. Srinivasa, K. R. Venugopal, and L. M. Patnaik, "Feature extraction using fuzzy c-means clustering for data mining systems," International Journal of Computer Science and Network Security, vol. 6, no. 3A, pp. 230–236, 2006.

[33] Y. Ren, H. Liu, C. Xue, X. Yao, M. Liu, and B. Fan, "Classification study of skin sensitizers based on support vector machine and linear discriminant analysis," Analytica Chimica Acta, vol. 572, no. 2, pp. 272–282, 2006.

[34] Z. He, Farthest-point heuristic based initialization methods for K-modes clustering [thesis], Department of Computer Science and Engineering, Harbin Institute of Technology, Harbin, China, 2006.

[35] S. Golzari, S. Doraisamy, M. N. Sulaiman, and N. I. Udzir, "Effect of fuzzy resource allocation method on AIRS classifier accuracy," Journal of Theoretical and Applied Information Technology, vol. 5, no. 1, pp. 18–24, 2009.


Page 4: Research Article SVM-RFE Based Feature Selection and ...downloads.hindawi.com/journals/tswj/2014/795624.pdf · SVM-RFE Based Feature Selection and Taguchi Parameters Optimization

4 The Scientific World Journal

If f(x) > 0, the sample is in the same category as the samples marked "+1"; otherwise, it is in the category of the samples marked "-1". When the training data include noise, the linear hyperplane cannot accurately distinguish the data points. By introducing slack variables ξ_i into the constraints, the original (3) can be modified into the following:

\[
\begin{aligned}
\text{minimize}\quad & \frac{1}{2}\|w\|^{2} + C\sum_{i=1}^{l}\xi_{i} \\
\text{subject to}\quad & y_{i}(x_{i}\cdot w + b) - 1 + \xi_{i} \ge 0,\quad i = 1,\dots,l \\
& \xi_{i} \ge 0,\quad i = 1,\dots,l
\end{aligned}
\tag{6}
\]

where ξ_i is the distance between the boundary and the classification point, and the penalty parameter C, determined by the user, represents the cost of classification errors on the training data during the learning process. When C is greater, the margin will be smaller, meaning the fault tolerance will be smaller when a fault occurs; conversely, when C is smaller, the fault tolerance will be greater. When C → ∞, the linearly inseparable problem degenerates into a linearly separable problem. The solution of the above optimization problem can be obtained, with its parameters and the optimum of the target function, via Lagrangian coefficients; thus the dual of the linearly inseparable optimization problem is as follows:

\[
\begin{aligned}
\max\quad & L_{D}(\alpha) = \sum_{i=1}^{l}\alpha_{i} - \frac{1}{2}\sum_{i=1}^{l}\sum_{j=1}^{l}\alpha_{i}\alpha_{j}y_{i}y_{j}(x_{i}\cdot x_{j}) \\
\text{subject to}\quad & \sum_{i=1}^{l}\alpha_{i}y_{i} = 0 \\
& 0 \le \alpha_{i} \le C,\quad i = 1,\dots,l
\end{aligned}
\tag{7}
\]

Finally, the linear decision function is

\[
f(x) = \operatorname{sign}\left(\sum_{i=1}^{n} y_{i}\alpha_{i}^{*}(x\cdot x_{i}) + b^{*}\right)
\tag{8}
\]
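The decision function (8) is straightforward to evaluate once the support vectors and multipliers are known. A minimal NumPy sketch; here `alpha` and `b` are assumed to come from an already-trained SVM, they are not computed in this snippet:

```python
import numpy as np

def svm_decision(x, support_x, support_y, alpha, b):
    """Linear SVM decision function: sign(sum_i y_i * a_i * (x . x_i) + b)."""
    score = np.sum(support_y * alpha * (support_x @ x)) + b
    return np.sign(score)
```

For example, with two 1-D support vectors at +1 and -1 (labels +1/-1, unit multipliers, b = 0), points right of the origin are classified +1 and points left of it -1.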

(2) Nonlinear Support Vector Machine (Nonlinear SVM). When the input training samples cannot be separated using a linear SVM, a mapping function φ can convert the original input data into a new high-dimensional feature space in which the problem becomes linearly separable. SVM can efficiently perform nonlinear classification using what is called the kernel trick, implicitly mapping the inputs into high-dimensional feature spaces. Many different kernel functions have been proposed; using a kernel suited to the characteristics of the data can effectively improve the computational efficiency of SVM. The relatively common kernel functions include the following four types:

(1) linear kernel function:

\[
K(x_{i}, x_{j}) = x_{i}^{T} x_{j}
\tag{9}
\]

(2) polynomial kernel function:

\[
K(x_{i}, x_{j}) = (\gamma x_{i}^{T} x_{j} + r)^{m},\quad \gamma > 0
\tag{10}
\]

(3) radial basis kernel function:

\[
K(x_{i}, x_{j}) = \exp\left(-\frac{\|x_{i} - x_{j}\|^{2}}{2\sigma^{2}}\right),\quad \gamma = \frac{1}{2\sigma^{2}} > 0
\tag{11}
\]

(4) sigmoid kernel function:

\[
K(x_{i}, x_{j}) = \tanh(\gamma x_{i}^{T} x_{j} + r)
\tag{12}
\]

The radial basis function (RBF) kernel is the most frequently applied in high-dimensional and nonlinear problems, and only the parameters γ and C need to be set, which reduces SVM complexity somewhat and improves computational efficiency; therefore this study selects the RBF kernel.
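The four kernels (9)-(12) can be written directly in NumPy. Parameter names (`gamma`, `r`, `m`, `sigma`) follow the equations above; the default values are illustrative only:

```python
import numpy as np

def linear_kernel(xi, xj):
    # Equation (9): inner product of the two vectors
    return xi @ xj

def polynomial_kernel(xi, xj, gamma=1.0, r=1.0, m=2):
    # Equation (10): (gamma * xi.xj + r)^m
    return (gamma * (xi @ xj) + r) ** m

def rbf_kernel(xi, xj, sigma=1.0):
    # Equation (11): exp(-||xi - xj||^2 / (2 sigma^2)); gamma = 1/(2 sigma^2)
    return np.exp(-np.sum((xi - xj) ** 2) / (2.0 * sigma ** 2))

def sigmoid_kernel(xi, xj, gamma=1.0, r=0.0):
    # Equation (12): tanh(gamma * xi.xj + r)
    return np.tanh(gamma * (xi @ xj) + r)
```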

3.4. Support Vector Machine Recursive Feature Elimination (SVM-RFE). A feature selection process can be used to remove terms in the training dataset that are statistically uncorrelated with the class labels, thus improving both efficiency and accuracy. Pal and Maiti (2010) provided a supervised dimensionality reduction method in which the feature selection problem is modeled as a mixed 0-1 integer program [23]. The multiclass Mahalanobis-Taguchi system (MMTS) was developed for simultaneous multiclass classification and feature selection; the important features are identified using orthogonal arrays and the signal-to-noise ratio and are then used to construct a reduced model measurement scale [24]. SVM-RFE is an SVM-based feature selection algorithm proposed by Guyon et al. [12], who used it to select small sets of key, important features; in addition to reducing classification computation time, it can improve the classification accuracy rate [12]. In recent years, many scholars have improved classification performance in medical diagnosis by taking advantage of this method [22, 25].

3.5. Multiclass SVM Classifier. SVM's basic classification principle is based on two categories. Presently there are three main methods for processing multiclass problems, one-against-all, one-against-one, and directed acyclic graph [26], described as follows.

(1) One-Against-All (OAA). Proposed by Bottou et al. (1994), the one-versus-rest approach converts a classification problem of k categories into k two-category problems [27]; scholars have also proposed subsequent effective classification methods [28]. The training process must train k two-category SVMs: when training the i-th classifier, data in the i-th category are labeled "+1" and the data of the remaining categories are labeled "-1". During the testing process, each testing instance is tested by the k trained two-category SVMs, and the classification result is determined by comparing their outputs. For an instance x of unknown category, the decision function $\arg\max_{i=1,\dots,k}\,((w^{i})^{T}\phi(x) + b_{i})$ generates k decision values, and x is assigned to the category with the maximum decision value.

(2) One-Against-One (OAO). When there are k categories, each pair of categories produces an SVM, so k(k-1)/2 classifiers are produced, and the category of a sample is determined by a voting strategy [28]. For example, if there are three categories (1, 2, and 3) and a sample to be classified with an assumed category of 2, the sample is input into the three SVMs. Each SVM determines the category of the sample using the decision function $\operatorname{sign}((w^{ij})^{T}\Phi(x) + b^{ij})$ and adds 1 to the votes of that category. Finally, the category with the most votes is assigned to the sample.
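The OAO voting scheme can be sketched independently of the underlying SVMs; in this sketch `pairwise_predict` is a placeholder standing in for the k(k-1)/2 trained two-category classifiers:

```python
from collections import Counter
from itertools import combinations

def one_against_one_predict(x, classes, pairwise_predict):
    """Predict a class by majority vote over all k(k-1)/2 pairwise classifiers.
    pairwise_predict(i, j, x) must return the winning class, either i or j."""
    votes = Counter()
    for i, j in combinations(classes, 2):
        votes[pairwise_predict(i, j, x)] += 1
    # the class with the most votes wins (ties broken by first occurrence)
    return votes.most_common(1)[0][0]
```

With three classes, a sample whose true category wins every pairwise contest it appears in collects two votes and is selected.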

(3) Directed Acyclic Graph (DAG). Similar to the OAO method, DAG decomposes a classification problem of k categories into k(k-1)/2 two-category classification problems [18]. During the training process, it selects any two categories from the k categories as a group and combines them into a two-category SVM; during the testing process, it establishes a two-category acyclic graph, and data of unknown category are tested from the root node. In a problem with k classes, a rooted binary DAG has k leaves labeled by the classes, where each of the k(k-1)/2 internal nodes is labeled with an element of a Boolean function [19].

4. Experiment and Results

4.1. Feature Selection Based on SVM-RFE. The main purpose of SVM-RFE is to compute ranking weights for all features and sort the features according to the weight vectors as the classification basis. SVM-RFE is an iterative process of backward feature removal. Its steps for feature set selection are as follows:

(1) use the current dataset to train the classifier;
(2) compute the ranking weights for all features;
(3) delete the feature with the smallest weight.

The iteration process is repeated until only one feature remains in the dataset; the result is a list of features ordered by weight. The algorithm removes the feature with the smallest ranking weight while retaining the feature variables of significant impact, so the final list gives the feature variables in descending order of explanatory power. SVM-RFE's selection of feature sets can be divided into three main steps: (1) input of the dataset to be classified, (2) calculation of the weight of each feature, and (3) deletion of the feature with minimum weight to obtain the ranking of features. The computational steps are as follows [12].

(1) Input:

    Training samples X₀ = [x₁, x₂, …, x_m]^T
    Class labels y = [y₁, y₂, …, y_m]^T
    Current feature set s = [1, 2, …, n]
    Feature sorted list r = []

(2) Feature Sorting. Repeat the following process until s = []:

    Obtain the new training sample matrix restricted to the remaining features: X = X₀(:, s)
    Train the classifier: α = SVM-train(X, y)
    Compute the weight vector: w = Σ_k α_k y_k x_k
    Compute the ranking criterion: c_i = (w_i)²
    Find the feature with the minimum weight: f = arg min(c)
    Update the feature sorted list: r = [s(f), r]
    Remove the feature with the minimum weight: s = s(1 : f-1, f+1 : length(s))

(3) Output: the feature sorted list r. In each loop, the feature with minimum (w_i)² is removed, and the SVM is then retrained on the remaining features to obtain a new feature sorting. SVM-RFE repeats this process until the complete feature sorted list is obtained. By training the SVM on the feature subsets of the sorted list and evaluating the subsets using the SVM prediction accuracy, we can obtain the optimum feature subsets.
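The ranking loop above can be sketched as follows. Here `train_weights` is a placeholder standing in for SVM-train; any linear classifier that returns a weight vector will do (least squares is used in the test below), so this illustrates the RFE loop itself rather than the exact SVM variant:

```python
import numpy as np

def rfe_ranking(X, y, train_weights):
    """SVM-RFE loop: repeatedly train on the remaining features, score each
    feature by c_i = w_i**2, and remove the feature with the smallest weight.
    Returns feature indices ranked from most to least important."""
    remaining = list(range(X.shape[1]))
    ranked = []
    while remaining:
        w = train_weights(X[:, remaining], y)  # weights of a linear classifier
        f = int(np.argmin(w ** 2))             # ranking criterion c_i = (w_i)^2
        # r = [s(f), r]: prepend, so the longest-surviving feature ends up first
        ranked.insert(0, remaining.pop(f))
    return ranked
```

On synthetic data where the target depends strongly on one feature and weakly on another, the strong feature ranks first and the weak one second.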

4.2. SVM Parameters Optimization Based on Taguchi Method. The Taguchi method arises from the engineering perspective, and its major tools are the orthogonal array and the S/N ratio; the S/N ratio and the loss function are closely related, and a higher S/N ratio indicates smaller losses [29]. Parameter selection is an important step in constructing a classification model using SVM, since differences in parameter settings affect the stability and accuracy of the classification model. Hsu and Yu (2012) combined the Taguchi method and the Staelin method to optimize an SVM-based e-mail spam filtering model and improve spam filtering accuracy [30]. Taguchi parameter design has many advantages. For one, its effect on quality through robustness is great: robustness reduces variation in parts by reducing the effects of uncontrollable variation, and more consistent parts mean better quality. The Taguchi method also allows the analysis of many different parameters without a prohibitively large amount of experimentation, providing the design engineer with a systematic and efficient method for determining near-optimum design parameters for performance and cost. Therefore, using Taguchi quality parameter design, this study conducts the optimization design of parameters C and γ to enhance the accuracy of the SVM classifier on the diagnosis of multiclass diseases.

This study uses the multiclass classification accuracy as the quality attribute of the Taguchi parameter design [21]. In general, higher classification accuracy means a better classification model; that is, the quality attribute is larger-the-better (LTB), and S/N_LTB is defined as

\[
S/N_{\mathrm{LTB}} = -10\log_{10}(\mathrm{MSD}) = -10\log_{10}\left[\frac{1}{n}\sum_{i=1}^{n}\frac{1}{y_{i}^{2}}\right]
\tag{13}
\]


Table 4: Classification accuracy (%) comparison.

Dermatology database:
  C \ γ      1       3      10      12
  1        52.57   95.18   94.08   94.22
  10       52.57   96.04   97.94   97.93
  50       52.57   96.31   96.86   96.58
  100      52.57   96.31   96.32   96.03

Zoo database:
  C \ γ     0.1      5      10      12
  1        71.18   78.09   62.36   40.64
  10       71.18   96.00   91.00   85.09
  50       71.18   96.09   96.00   96.00
  100      71.18   96.09   96.09   96.00

Table 5: Factor level configuration of LS-SVM parameter design.

Dermatology database:
  Control factor   Level 1   Level 2   Level 3
  A (C)              10        50       100
  B (γ)              2.4        5        10

Zoo database:
  Control factor   Level 1   Level 2   Level 3
  A (C)               5        10        50
  B (γ)              0.08       4        11

4.3. Evaluation of Classification Accuracy. Cross-validation divides all samples into a training set and a testing set: the training set is the data from which the algorithm learns the classification rules, and the samples of the testing set are used to measure the performance of those rules. All samples are randomly divided by category into k mutually exclusive folds. Each fold in turn is used as the testing set while the remaining k-1 folds are used as the training set; this step is repeated k times, and each testing set validates the classification rules learnt from the corresponding training set to obtain an accuracy rate. The average of the accuracy rates over all k testing sets is used as the final evaluation result. This method is known as k-fold cross-validation.
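A plain (non-stratified, non-shuffled) version of the k-fold procedure can be sketched as follows, with `train` and `predict` as placeholders for the classifier; the per-category stratification described above is omitted for brevity:

```python
import numpy as np

def k_fold_accuracy(X, y, k, train, predict):
    """k-fold cross-validation: average accuracy over k held-out folds."""
    idx = np.arange(len(y))
    folds = np.array_split(idx, k)          # k mutually exclusive folds
    accs = []
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        model = train(X[train_idx], y[train_idx])
        accs.append(np.mean(predict(model, X[test_idx]) == y[test_idx]))
    return float(np.mean(accs))             # final evaluation result
```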

4.4. Results and Discussion. The ranking of all features for the Dermatology and Zoo databases using SVM-RFE is summarized as follows: Dermatology = V1, V16, V32, V28, V19, V3, V17, V2, V15, V21, V26, V13, V14, V5, V18, V4, V23, V11, V8, V12, V27, V24, V6, V25, V30, V29, V10, V31, V22, V20, V33, V7, V9; Zoo = V13, V9, V14, V10, V16, V4, V8, V1, V11, V2, V12, V5, V6, V3, V15, V7. According to the suggestions of scholars, the classification error rate of OAO is relatively lower when the number of testing instances is below 1000. Multiclass SVM parameter settings affect the multiclass SVM's classification accuracy. Arenas-García and Pérez-Cruz studied SVM parameter settings on the multiclass Zoo dataset [31]: they carried out simulations using Gaussian kernels for all possible combinations of C = [1, 3, 10, 30, 100] and γ = sqrt(0.25d), sqrt(0.5d), sqrt(d), sqrt(2d), and sqrt(4d), with d being the dimension of the input data. In this study, we executed wide ranges of parameter settings for the Dermatology and Zoo databases. Finally, the parameter settings are suggested as Dermatology (C, γ): C = 1, 10, 50, 100 and γ = 1, 3, 10, 12; Zoo (C, γ): C = 1, 10, 50, 100 and γ = 0.1, 5, 10, 12. The testing accuracies are shown in Table 4.
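The exhaustive sweep behind Table 4 amounts to evaluating every (C, γ) pair and keeping the best; a generic sketch, where `evaluate` is a placeholder standing in for, e.g., the cross-validated accuracy of an SVM at those settings:

```python
from itertools import product

def grid_search(c_values, gamma_values, evaluate):
    """Exhaustive grid over (C, gamma); returns the best pair and its score."""
    best = max(product(c_values, gamma_values),
               key=lambda cg: evaluate(*cg))
    return best, evaluate(*best)
```

The combinatorial cost of such a full grid is exactly what motivates the Taguchi orthogonal-array design described in the following paragraphs.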

As shown in Table 4, regarding parameter C, the accuracy with C = 10 and γ = 5, 10, 12 is higher than that of the combination C = 1 and γ = 5, 10, 12; moreover, regarding parameter γ, the accuracy with γ = 5 and C = 1, 10, 50, 100 is higher than that of the combination γ = 0.1 and C = 1, 10, 50, 100. The near-optimal values of C and γ may not be the same for different databases, and finding appropriate parameter settings is important for classifier performance. In practice it is impossible to simulate every possible combination of parameter settings, which is why the Taguchi methodology is applied to reduce the number of experimental combinations for SVM. The experimental procedure in this study first referred to a related study (e.g., C = [1, 3, 10, 30, 100] [31]), then set a possible range for both databases (C = 1 to 100, γ = 1 to 12). After that, we slightly adjusted the ranges to see whether Taguchi quality engineering parameter optimization would give better results for each database. According to our experimental results, the final parameter settings C and γ range over 10 to 100 and 2.4 to 10, respectively, for the Dermatology database, and over 5 to 50 and 0.08 to 11, respectively, for the Zoo database. Within these ranges of parameters C and γ, we select three parameter levels and two control factors, A and B, representing parameters C and γ, respectively. The Taguchi orthogonal array experiment selects L9(3²), and the factor level configuration is illustrated in Table 5.

After data preprocessing, the Dermatology and Zoo databases include 358 and 101 testing instances, respectively. Each experiment of the orthogonal array is repeated five times (n = 5); the experimental combinations and observations are summarized in Tables 6 and 7. According to (13), we can calculate the S/N ratio for Taguchi experimental combination 1 as

\[
S/N_{\mathrm{LTB}} = -10\log_{10}\left[\frac{1}{5}\left(\frac{1}{0.9631^{2}} + \frac{1}{0.9701^{2}} + \frac{1}{0.9697^{2}} + \frac{1}{0.9627^{2}} + \frac{1}{0.9614^{2}}\right)\right] = -0.3060
\tag{14}
\]


Table 6: Summary of experiment data of the Dermatology database.

  No.  A  B    y1      y2      y3      y4      y5     Average    S/N
  1    1  1  0.9631  0.9701  0.9697  0.9627  0.9614   0.9654   -0.3060
  2    1  2  0.9686  0.9749  0.9653  0.9621  0.9732   0.9688   -0.2755
  3    1  3  0.9795  0.9847  0.9848  0.9838  0.9735   0.9813   -0.1647
  4    2  1  0.9630  0.9615  0.9581  0.9599  0.9668   0.9619   -0.3379
  5    2  2  0.9687  0.9721  0.9704  0.9707  0.9626   0.9689   -0.2746
  6    2  3  0.9685  0.9748  0.9744  0.9712  0.9707   0.9719   -0.2475
  7    3  1  0.9671  0.9689  0.9648  0.9668  0.9645   0.9664   -0.2967
  8    3  2  0.9741  0.9704  0.9797  0.9799  0.9767   0.9762   -0.2098
  9    3  3  0.9625  0.9633  0.9642  0.9678  0.9619   0.9639   -0.3191

(A1 = 10, A2 = 50, A3 = 100; B1 = 2.4, B2 = 5, B3 = 10)

Table 7: Summary of experiment data of the Zoo database.

  No.  A  B    y1      y2      y3      y4      y5     Average    S/N
  1    1  1  0.9513  0.9673  0.9435  0.9567  0.9546   0.9547   -0.4037
  2    1  2  0.9600  0.9616  0.9588  0.9611  0.9608   0.9605   -0.3504
  3    1  3  0.7809  0.7833  0.7820  0.7679  0.7811   0.7790   -2.1694
  4    2  1  0.7118  0.6766  0.7368  0.7256  0.7109   0.7123   -2.9571
  5    2  2  0.9600  0.9612  0.9604  0.9519  0.9440   0.9555   -0.3960
  6    2  3  0.8900  0.8947  0.9214  0.9050  0.9190   0.9060   -0.8598
  7    3  1  0.7118  0.7398  0.7421  0.7495  0.7203   0.7327   -2.7064
  8    3  2  0.9610  0.9735  0.9709  0.9752  0.9661   0.9693   -0.2709
  9    3  3  0.9600  0.9723  0.9707  0.9509  0.9763   0.9660   -0.3013

(A1 = 5, A2 = 10, A3 = 50; B1 = 0.08, B2 = 4, B3 = 11)

The S/N ratios of the remaining eight experimental combinations are summarized in Table 6, and the Zoo experimental results and S/N ratio calculations are shown in Table 7. From these results, we then calculate the average S/N ratio of each factor level. Taking the experiment of Table 8 as an example, the average S/N ratio $\bar{A}_{1}$ of factor A at level 1 is

\[
\bar{A}_{1} = \frac{1}{3}\left[-0.3060 + (-0.2755) + (-0.1647)\right] = -0.2487
\tag{15}
\]
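The response-table entries of Table 8 are simply per-level averages of the S/N column, as in (15); a small helper makes this mechanical:

```python
def factor_level_means(levels, sn):
    """Average S/N ratio of each factor level (one response-table column)."""
    grouped = {}
    for lv, s in zip(levels, sn):
        grouped.setdefault(lv, []).append(s)
    return {lv: sum(v) / len(v) for lv, v in grouped.items()}
```

Applied to the factor-A column and S/N values of Table 6, it reproduces the Dermatology A-row of Table 8: -0.2487, -0.2867, -0.2752.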

Similarly, we can calculate the average effects of $\bar{A}_{2}$ and $\bar{A}_{3}$ from Table 6. The difference analysis results for the factor levels of the Dermatology and Zoo databases are shown in Table 8, and the factor effect diagrams are shown in Figures 2 and 3. As a greater S/N ratio represents better quality, according to the factor level differences and factor effect diagrams, the Dermatology parameter level combination is A1B3, that is, parameters C = 10 and γ = 10; the Zoo parameter level combination is A1B2, that is, parameter settings C = 5 and γ = 4.

When constructing the multiclass SVM model using SVM-RFE, three different feature sets are selected according

SVM-RFE three different feature sets are selected according

Figure 2: Main effect plots for S/N ratio of the Dermatology database (S/N versus levels 1-3 of factors A and B).

to their significance. At the first stage, Taguchi quality engineering is applied to select the optimum values of parameters C and γ; at the second stage, the multiclass SVM classifier is constructed and the classification performance is compared under these parameters. In the Dermatology experiment, Table 9 illustrates the two feature subsets containing 23 and 33 feature variables. The 33 feature


Table 8: Average of each factor at all levels.

Dermatology database:
  Control factor   Level 1    Level 2    Level 3   Difference
  A (C)            -0.2487    -0.2867    -0.2752     0.0380
  B (γ)            -0.3135    -0.2533    -0.2438     0.0697

Zoo database:
  Control factor   Level 1    Level 2    Level 3   Difference
  A (C)            -0.9745    -1.4043    -1.0929     0.4298
  B (γ)            -2.0224    -0.3391    -1.1102     1.6833

Table 9: Classification performance comparison of the Dermatology database.

  Method             Dimensions    C      γ     Accuracy (%)
  SVM                    33       100     5     95.10 ± 0.0096
  SVM-RFE                23        50    2.4    89.28 ± 0.0139
  SVM-RFE-Taguchi        23        10    10     95.38 ± 0.0098

Table 10: Classification performance comparison of the Zoo database.

  Method             Dimensions    C      γ     Accuracy (%)
  SVM                    16        10    11     89 ± 0.0314
  SVM-RFE                 6        50    0.08   92 ± 0.0199
  SVM-RFE-Taguchi        12         5     4     97 ± 0.0396

Figure 3: Main effect plots for S/N ratio of the Zoo database (S/N versus levels 1-3 of factors A and B).

sets are tested by SVM and by SVM based on the Taguchi method; the parameter settings and testing accuracy results are shown in Table 9. The experimental results, shown in Figure 4, indicate that the testing accuracy of SVM (C = 10, γ = 10) on the 17-feature sets can be higher than 90%, which is better than that of SVM (C = 10, γ = 11), whose accuracy reaches 90% only on the 20-feature sets. Moreover, regardless of how many feature variables are selected, the accuracy of SVM (C = 50, γ = 2.4) never exceeds 90%.

Regarding the Zoo experiment, Table 10 summarizes the experimental test results of sets containing 6, 12, and 16 feature variables using SVM and SVM based on the Taguchi method. As shown in Table 10, the classification accuracy of the set of 12 feature variables in the classification experiment using SVM-RFE-Taguchi (C = 5, γ = 4) is the highest, up to 97 ± 0.0396. As shown in Figure 5, the experimental results show that the classification

Figure 4: Classification performance comparison of the Dermatology database (accuracy versus number of features; SVM-RFE-Taguchi C = 10, γ = 10; SVM-RFE C = 50, γ = 2.4; SVM-RFE C = 100, γ = 5).

accuracy of the dataset containing 7 feature variables using SVM-RFE-Taguchi (C = 50, γ = 2.4) can be higher than 90%, which yields relatively better prediction results.

5. Conclusions

As the study of the impact of feature selection on multiclass classification accuracy becomes increasingly attractive and significant, this study applies SVM-RFE and SVM in the construction of a multiclass classification method in order to establish the classification model. As RFE is a


Table 11: Comparison of classification accuracy in related literature.

Dermatology database:
  Author                         Method              Accuracy (%)
  Xie et al. (2005) [16]         FOut SVM               91.74
  Srinivasa et al. (2006) [32]   FCM SVM                83.30
  Ren et al. (2006) [33]         LDA SVM                72.09
  Our method (2014)              SVM-RFE-Taguchi        95.38

Zoo database:
  Author                         Method              Accuracy (%)
  Xie et al. (2005) [16]         FOut SVM               88.24
  He (2006) [34]                 NFPH k-modes           92.08
  Golzari et al. (2009) [35]     Fuzzy AIRS             94.96
  Our method (2014)              SVM-RFE-Taguchi        97.00

Figure 5: Classification performance comparison of the Zoo database (accuracy versus number of features; SVM-RFE-Taguchi C = 5, γ = 4; SVM-RFE C = 10, γ = 11; SVM-RFE C = 50, γ = 0.08).

feature selection method of the wrapper type, it requires a previously defined classifier as the assessment criterion for feature selection; therefore, SVM is used as the RFE assessment standard to guide the selection of feature sets.

According to the experimental results of this study, with respect to parameter settings, the impact of parameter selection on the construction of the SVM classification model is huge. Therefore, this study applies Taguchi parameter design to determine the parameter ranges and select the optimum parameter combination for the SVM classifier, as this is a key factor influencing classification accuracy. This study also collected the experimental results of different research methods on the Dermatology and Zoo databases [16, 32, 33], as shown in Table 11. By comparison, the proposed method achieves higher classification accuracy.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References

[1] N. Cristianini and J. Shawe-Taylor, An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods, Cambridge University Press, Cambridge, UK, 2000.

[2] J. Luts, F. Ojeda, R. Van de Plas, B. De Moor, S. Van Huffel, and J. A. K. Suykens, "A tutorial on support vector machine-based methods for classification problems in chemometrics," Analytica Chimica Acta, vol. 665, no. 2, pp. 129–145, 2010.

[3] M. F. Akay, "Support vector machines combined with feature selection for breast cancer diagnosis," Expert Systems with Applications, vol. 36, no. 2, pp. 3240–3247, 2009.

[4] C.-Y. Chang, S.-J. Chen, and M.-F. Tsai, "Application of support-vector-machine-based method for feature selection and classification of thyroid nodules in ultrasound images," Pattern Recognition, vol. 43, no. 10, pp. 3494–3506, 2010.

[5] H.-L. Chen, B. Yang, J. Liu, and D.-Y. Liu, "A support vector machine classifier with rough set-based feature selection for breast cancer diagnosis," Expert Systems with Applications, vol. 38, no. 7, pp. 9014–9022, 2011.

[6] P. Danenas and G. Garsva, "Credit risk evaluation modeling using evolutionary linear SVM classifiers and sliding window approach," Procedia Computer Science, vol. 9, pp. 1324–1333, 2012.

[7] C. L. Huang, H. C. Liao, and M. C. Chen, "Prediction model building and feature selection with support vector machines in breast cancer diagnosis," Expert Systems with Applications, vol. 34, no. 1, pp. 578–587, 2008.

[8] H. F. Liau and D. Isa, "Feature selection for support vector machine-based face-iris multimodal biometric system," Expert Systems with Applications, vol. 38, no. 9, pp. 11105–11111, 2011.

[9] Y. Zhang, Z. Chi, and Y. Sun, "A novel multi-class support vector machine based on fuzzy theories," in Intelligent Computing: International Conference on Intelligent Computing, Part I (ICIC '06), D. S. Huang, K. Li, and G. W. Irwin, Eds., vol. 4113 of Lecture Notes in Computer Science, pp. 42–50, Springer, Berlin, Germany.

[10] Y. Aksu, D. J. Miller, G. Kesidis, and Q. X. Yang, "Margin-maximizing feature elimination methods for linear and nonlinear kernel-based discriminant functions," IEEE Transactions on Neural Networks, vol. 21, no. 5, pp. 701–717, 2010.

[11] P. Pudil, J. Novovicova, and J. Kittler, "Floating search methods in feature selection," Pattern Recognition Letters, vol. 15, no. 11, pp. 1119–1125, 1994.


[12] I. Guyon, J. Weston, S. Barnhill, and V. Vapnik, "Gene selection for cancer classification using support vector machines," Machine Learning, vol. 46, no. 1–3, pp. 389–422, 2002.

[13] S. Harikrishna, M. A. H. Farquad, and Shabana, "Credit scoring using support vector machine: a comparative analysis," in Advanced Materials Research, Trans Tech Publications, Zurich, Switzerland, 2012.

[14] X. Lin, F. Yang, L. Zhou et al., "A support vector machine-recursive feature elimination feature selection method based on artificial contrast variables and mutual information," Journal of Chromatography B: Analytical Technologies in the Biomedical and Life Sciences, vol. 910, pp. 149–155, 2012.

[15] R. Zhang and M. Jianwen, "Feature selection for hyperspectral data based on recursive support vector machines," International Journal of Remote Sensing, vol. 30, no. 14, pp. 3669–3677, 2009.

[16] Z. X. Xie, Q. H. Hu, and D. R. Yu, "Fuzzy output support vector machines for classification," in Advances in Natural Computation, L. Wang, K. Chen, and Y. S. Ong, Eds., vol. 3612, pp. 1190–1197, Springer, Berlin, Germany.

[17] Y. Liu, Z. You, and L. Cao, "A novel and quick SVM-based multi-class classifier," Pattern Recognition, vol. 39, no. 11, pp. 2258–2264, 2006.

[18] J. Platt, N. Cristianini, and J. Shawe-Taylor, "Large margin DAGs for multiclass classification," in Advances in Neural Information Processing Systems, S. A. Solla, T. K. Leen, and K. R. Muller, Eds., vol. 12, pp. 547–553, 2000.

[19] Y. Xu, S. Zomer, and R. G. Brereton, "Support vector machines: a recent method for classification in chemometrics," Critical Reviews in Analytical Chemistry, vol. 36, no. 3-4, pp. 177–188, 2006.

[20] M. L. Huang, Y. H. Hung, and E. J. Lin, "Effects of SVM parameter optimization based on the parameter design of Taguchi method," International Journal on Artificial Intelligence Tools, vol. 20, no. 3, pp. 563–575, 2011.

[21] H.-C. Lin, C.-T. Su, C.-C. Wang, B.-H. Chang, and R.-C. Juang, "Parameter optimization of continuous sputtering process based on Taguchi methods, neural networks, desirability function, and genetic algorithms," Expert Systems with Applications, vol. 39, no. 17, pp. 12918–12925, 2012.

[22] Y. Mao, D. Pi, Y. Liu, and Y. Sun, "Accelerated recursive feature elimination based on support vector machine for key variable identification," Chinese Journal of Chemical Engineering, vol. 14, no. 1, pp. 65–72, 2006.

[23] A. Pal and J. Maiti, "Development of a hybrid methodology for dimensionality reduction in Mahalanobis-Taguchi system using Mahalanobis distance and binary particle swarm optimization," Expert Systems with Applications, vol. 37, no. 2, pp. 1286–1293, 2010.

[24] C.-T. Su and Y.-H. Hsiao, "Multiclass MTS for simultaneous feature selection and classification," IEEE Transactions on Knowledge and Data Engineering, vol. 21, no. 2, pp. 192–205, 2009.

[25] X. Lin, F. Yang, L. Zhou et al., "A support vector machine-recursive feature elimination feature selection method based on artificial contrast variables and mutual information," Journal of Chromatography B, vol. 910, pp. 149–155, 2012.

[26] E. Hullermeier and S. Vanderlooy, "Combining predictions in pairwise classification: an optimal adaptive voting strategy and its relation to weighted voting," Pattern Recognition, vol. 43, no. 1, pp. 128–142, 2010.

[27] L. Bottou, C. Cortes, J. Denker et al., "Comparison of classifier methods: a case study in handwritten digit recognition," in Proceedings of the 12th IAPR International Conference on Pattern Recognition, vol. 2, pp. 77–82, IEEE Computer Society Press, Los Alamitos, Calif, USA, 1994.

[28] J. Furnkranz, "Round robin rule learning," in Proceedings of the 18th International Conference on Machine Learning (ICML '01), pp. 146–153, 2001.

[29] M. R. Sohrabi, S. Jamshidi, and A. Esmaeilifar, "Cloud point extraction for determination of Diazinon: optimization of the effective parameters using Taguchi method," Chemometrics and Intelligent Laboratory Systems, vol. 110, no. 1, pp. 49–54, 2012.

[30] W. C. Hsu and T. Y. Yu, "Support vector machines parameter selection based on combined Taguchi method and Staelin method for e-mail spam filtering," International Journal of Engineering and Technology Innovation, vol. 2, no. 2, pp. 113–125, 2012.

[31] J. Arenas-García and F. Pérez-Cruz, "Multi-class support vector machines: a new approach," in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '03), vol. 2, pp. 781–784, April 2003.

[32] K. G. Srinivasa, K. R. Venugopal, and L. M. Patnaik, "Feature extraction using fuzzy c-means clustering for data mining systems," International Journal of Computer Science and Network Security, vol. 6, no. 3A, pp. 230–236, 2006.

[33] Y. Ren, H. Liu, C. Xue, X. Yao, M. Liu, and B. Fan, "Classification study of skin sensitizers based on support vector machine and linear discriminant analysis," Analytica Chimica Acta, vol. 572, no. 2, pp. 272–282, 2006.

[34] Z. He, Farthest-point heuristic based initialization methods for K-modes clustering [thesis], Department of Computer Science and Engineering, Harbin Institute of Technology, Harbin, China, 2006.

[35] S. Golzari, S. Doraisamy, M. N. Sulaiman, and N. I. Udzir, "Effect of fuzzy resource allocation method on AIRS classifier accuracy," Journal of Theoretical and Applied Information Technology, vol. 5, no. 1, pp. 18–24, 2009.



The Scientific World Journal 5

decision function arg max_{i=1,...,k} [(w^i)^T φ(x) + b^i] can be applied to generate k decision-making values, and the category of x is the category with the maximum decision-making value.

(2) One-Against-One (OAO). When there are k categories, each pair of categories produces an SVM, so k(k - 1)/2 classifiers are produced, and the category of a sample is determined by a voting strategy [28]. For example, suppose there are three categories (1, 2, and 3) and a sample to be classified with an assumed category of 2; the sample is then input into three SVMs. Each SVM determines the category of the sample using the decision function sign((w^{ij})^T Φ(x) + b^{ij}) and adds 1 to the votes of that category. Finally, the category with the most votes is the category of the sample.
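The OAO voting scheme described above can be sketched as follows. This is a toy illustration, not the paper's experimental setup; it assumes scikit-learn's linear SVC as the binary classifier.

```python
# One-against-one (OAO) voting: train k(k-1)/2 binary SVMs, one per class
# pair, and classify new samples by majority vote.
from itertools import combinations
import numpy as np
from sklearn.svm import SVC

def oao_fit_predict(X, y, X_new):
    classes = sorted(set(int(c) for c in y))
    votes = np.zeros((len(X_new), len(classes)), dtype=int)
    for i, j in combinations(classes, 2):          # k(k-1)/2 class pairs
        mask = (y == i) | (y == j)                 # samples of the two classes
        clf = SVC(kernel="linear").fit(X[mask], y[mask])
        for row, pred in enumerate(clf.predict(X_new)):
            votes[row, classes.index(int(pred))] += 1   # one vote for the winner
    return [classes[int(v.argmax())] for v in votes]
```

With k = 3 classes, three binary classifiers are trained and each test sample receives three votes in total, exactly as in the worked example in the text.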

(3) Directed Acyclic Graph (DAG). Similar to the OAO method, DAG decomposes the k-category classification problem into k(k - 1)/2 binary classification problems [18]. During the training process, it selects any two categories from the k categories as a group and combines them into a binary classification SVM; during the testing process, it establishes a binary acyclic graph, and data of an unknown category are tested from the root node. In a problem with k classes, a rooted binary DAG has k leaves labeled by the classes, and each of the k(k - 1)/2 internal nodes is labeled with an element of a Boolean function [19].

4. Experiment and Results

4.1. Feature Selection Based on SVM-RFE. The main purpose of SVM-RFE is to compute the ranking weights of all features and to sort the features according to the weight vectors as the classification basis. SVM-RFE is an iterative process of backward feature elimination. Its steps for feature set selection are as follows:

(1) Use the current dataset to train the classifier.
(2) Compute the ranking weights for all features.
(3) Delete the feature with the smallest weight.

The iteration proceeds until only one feature remains in the dataset; the result is a list of features ordered by weight. The algorithm removes the feature with the smallest ranking weight while retaining the feature variables of significant impact, so that the feature variables are finally listed in descending order of explanatory power. SVM-RFE's selection of feature sets can be divided into three steps, namely, (1) input of the datasets to be classified, (2) calculation of the weight of each feature, and (3) deletion of the feature with the minimum weight to obtain the feature ranking. The computational steps are as follows [12].

(1) Input.
    Training samples X_0 = [x_1, x_2, ..., x_m]^T.
    Class labels y = [y_1, y_2, ..., y_m]^T.
    Current feature set s = [1, 2, ..., n].
    Feature ranked list r = [].

(2) Feature Sorting. Repeat the following process until s = []:
    Obtain the new training sample matrix restricted to the remaining features: X = X_0(:, s).
    Train the classifier: α = SVM-train(X, y).
    Compute the weight vector: w = Σ_k α_k y_k x_k.
    Compute the ranking criterion: c_i = (w_i)^2.
    Find the feature with the minimum criterion: f = arg min(c).
    Update the ranked list: r = [s(f), r].
    Remove the feature with the minimum weight: s = s(1 : f - 1, f + 1 : length(s)).

(3) Output: the feature ranked list r. In each loop, the feature with minimum (w_i)^2 is removed, and the SVM is then retrained on the remaining features to obtain the new feature ranking. SVM-RFE repeats this process until a complete ranked list is obtained. By training the SVM on the feature subsets of the ranked list and evaluating the subsets using the SVM prediction accuracy, we can obtain the optimum feature subsets.
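The loop above can be sketched in code. This is a minimal sketch that assumes scikit-learn's linear SVC as the base classifier; it is an illustration, not the paper's own implementation.

```python
# SVM-RFE: repeatedly train a linear SVM, rank features by squared weight,
# and eliminate the weakest feature until none remain.
import numpy as np
from sklearn.svm import SVC

def svm_rfe(X, y, C=1.0):
    """Return feature indices ranked from most to least important."""
    s = list(range(X.shape[1]))            # current feature set s
    r = []                                 # ranked list r
    while s:
        clf = SVC(kernel="linear", C=C).fit(X[:, s], y)
        w = clf.coef_.ravel()              # weight vector of the trained SVM
        c = w ** 2                         # ranking criterion c_i = (w_i)^2
        f = int(np.argmin(c))              # feature with the smallest criterion
        r.insert(0, s[f])                  # update ranked list: r = [s(f), r]
        del s[f]                           # remove it from the feature set
    return r
```

On a toy dataset whose label depends only on the first feature, that feature survives all eliminations and therefore ends up first in the ranked list.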

4.2. SVM Parameter Optimization Based on the Taguchi Method. The Taguchi method arises from the engineering perspective, and its major tools are the orthogonal array and the S/N ratio, where the S/N ratio and the loss function are closely related: a higher S/N ratio indicates smaller losses [29]. Parameter selection is an important step in constructing a classification model with SVM, as differences in parameter settings affect the stability and accuracy of the classification model. Hsu and Yu (2012) combined the Taguchi method and the Staelin method to optimize an SVM-based e-mail spam filtering model and improve spam filtering accuracy [30]. Taguchi parameter design has many advantages. For one, its effect on quality through robustness is great: robustness reduces variation in parts by reducing the effects of uncontrollable variation, and more consistent parts mean better quality. The Taguchi method also allows many parameters to be analyzed without a prohibitively large number of experiments, providing the design engineer with a systematic and efficient method for determining near-optimum design parameters for performance and cost. Therefore, using the Taguchi quality parameter design, this study optimizes parameters C and γ to enhance the accuracy of the SVM classifier in the diagnosis of multiclass diseases.

This study uses the multiclass classification accuracy as the quality attribute of the Taguchi parameter design [21]. In general, a higher classification accuracy means a better classification model; that is, the quality attribute is larger-the-better (LTB), and S/N_LTB is defined as

S/N_LTB = -10 log10(MSD) = -10 log10[(1/n) Σ_{i=1}^{n} (1/y_i^2)].    (13)


Table 4: Classification accuracy comparison (%).

Dermatology database:
  C \ γ     1       3       10      12
  1         52.57   95.18   94.08   94.22
  10        52.57   96.04   97.94   97.93
  50        52.57   96.31   96.86   96.58
  100       52.57   96.31   96.32   96.03

Zoo database:
  C \ γ     0.1     5       10      12
  1         71.18   78.09   62.36   40.64
  10        71.18   96.00   91.00   85.09
  50        71.18   96.09   96.00   96.00
  100       71.18   96.09   96.09   96.00

Table 5: Factor level configuration of LS-SVM parameter design.

Dermatology database:
  Control factor   Level 1   Level 2   Level 3
  A (C)            10        50        100
  B (γ)            2.4       5         10

Zoo database:
  Control factor   Level 1   Level 2   Level 3
  A (C)            5         10        50
  B (γ)            0.08      4         11

4.3. Evaluation of Classification Accuracy. Cross-validation divides all samples into a training set and a testing set. The training set is the learning data from which the algorithm establishes the classification rules, and the samples of the testing set are used to measure the performance of those rules. All samples are randomly divided by category into k mutually exclusive folds. Each fold in turn is used as the testing data, and the remaining k - 1 folds are used as the training set. The step is repeated k times, and each testing set validates the classification rules learnt from the corresponding training set to obtain an accuracy rate. The average of the accuracy rates over all k testing sets is used as the final evaluation result. This method is known as k-fold cross-validation.
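The procedure can be sketched as follows. `train_and_score` is a hypothetical placeholder for any classifier's train-and-evaluate routine; it is not part of the paper.

```python
# k-fold cross-validation: shuffle, split into k mutually exclusive folds,
# use each fold once as testing data, and average the k accuracy rates.
import random

def k_fold_accuracy(n_samples, k, train_and_score, seed=0):
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)            # random partition of the samples
    folds = [idx[i::k] for i in range(k)]       # k mutually exclusive folds
    accs = []
    for i in range(k):
        test = folds[i]                         # one fold as testing data
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        accs.append(train_and_score(train, test))
    return sum(accs) / k                        # average over the k folds
```

(A stratified split by category, as the text describes, would additionally balance class proportions within each fold.)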

4.4. Results and Discussion. The ranking of all features for the Dermatology and Zoo databases using SVM-RFE is as follows: Dermatology = {V1, V16, V32, V28, V19, V3, V17, V2, V15, V21, V26, V13, V14, V5, V18, V4, V23, V11, V8, V12, V27, V24, V6, V25, V30, V29, V10, V31, V22, V20, V33, V7, V9} and Zoo = {V13, V9, V14, V10, V16, V4, V8, V1, V11, V2, V12, V5, V6, V3, V15, V7}. According to the suggestions of scholars, the classification error rate of OAO is relatively lower when the number of testing instances is below 1000. Multiclass SVM parameter settings can affect the multiclass SVM's classification accuracy. Arenas-García and Pérez-Cruz applied SVM parameter settings to the multiclass Zoo dataset [31]. They carried out simulations using Gaussian kernels for all possible combinations of C and γ, with C = [1, 3, 10, 30, 100] and γ = sqrt(0.25d), sqrt(0.5d), sqrt(d), sqrt(2d), and sqrt(4d), where d is the dimension of the input data. In this study, we executed wide ranges of parameter settings for the Dermatology and Zoo databases. Finally, the parameter settings are suggested as Dermatology (C, γ): C = 1, 10, 50, 100 and γ = 1, 3, 10, 12; Zoo (C, γ): C = 1, 10, 50, 100 and γ = 0.1, 5, 10, 12. The testing accuracies are shown in Table 4.

As shown in Table 4, regarding parameter C, the accuracy with C = 10 and γ = 5, 10, 12 is higher than that of the combinations with C = 1 and γ = 5, 10, 12; moreover, regarding parameter γ, the accuracy with γ = 5 and C = 1, 10, 50, 100 is higher than that with γ = 0.1 and C = 1, 10, 50, 100. The near-optimal value of C or γ may not be the same for different databases, and finding appropriate parameter settings is important for classifier performance. Practically, it is impossible to simulate every possible combination of parameter settings, which is why the Taguchi methodology is applied to reduce the number of experimental combinations for SVM. The experimental steps of this study first referred to the related study (e.g., C = [1, 3, 10, 30, 100]) [31] and then set a possible range for both databases (C = 1~100, γ = 1~12). After that, we slightly adjusted the ranges to see whether better results could be obtained in the Taguchi quality engineering parameter optimization for each database. According to our experimental results, the final parameter settings C and γ range over 10~100 and 2.4~10, respectively, for the Dermatology database; the parameter settings C and γ range over 5~50 and 0.08~11, respectively, for the Zoo database. Within these ranges of parameters C and γ for the Dermatology and Zoo databases, we select three parameter levels and two control factors, A and B, to represent parameters C and γ, respectively. The Taguchi orthogonal array experiment selects L9(3^2), and the factor level configuration is illustrated in Table 5.

After data preprocessing, the Dermatology and Zoo databases include 358 and 101 instances, respectively. Each experiment of the orthogonal array is repeated five times (n = 5); the experimental combinations and observations are summarized in Tables 6 and 7. According to (13), we can calculate the S/N ratio for Taguchi experimental combination 1 as

S/N_LTB = -10 log10[(1/5)(1/0.9631^2 + 1/0.9701^2 + 1/0.9697^2 + 1/0.9627^2 + 1/0.9614^2)] = -0.3060.    (14)
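Equation (14) can be checked numerically. The sketch below recomputes the larger-the-better S/N ratio of (13) from the five observations of combination 1 (values taken from Table 6):

```python
# Larger-the-better S/N ratio: S/N = -10 * log10((1/n) * sum(1/y_i^2)).
import math

def sn_ltb(ys):
    """S/N ratio for a larger-the-better quality attribute, per (13)."""
    msd = sum(1.0 / y ** 2 for y in ys) / len(ys)   # mean squared deviation
    return -10.0 * math.log10(msd)

row1 = [0.9631, 0.9701, 0.9697, 0.9627, 0.9614]     # five repeats, combination 1
sn = sn_ltb(row1)            # approximately -0.306, agreeing with (14) to rounding
```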


Table 6: Summary of experiment data of the Dermatology database.

  No.  A  B   y1      y2      y3      y4      y5      Average  S/N
  1    1  1   0.9631  0.9701  0.9697  0.9627  0.9614  0.9654   -0.3060
  2    1  2   0.9686  0.9749  0.9653  0.9621  0.9732  0.9688   -0.2755
  3    1  3   0.9795  0.9847  0.9848  0.9838  0.9735  0.9813   -0.1647
  4    2  1   0.9630  0.9615  0.9581  0.9599  0.9668  0.9619   -0.3379
  5    2  2   0.9687  0.9721  0.9704  0.9707  0.9626  0.9689   -0.2746
  6    2  3   0.9685  0.9748  0.9744  0.9712  0.9707  0.9719   -0.2475
  7    3  1   0.9671  0.9689  0.9648  0.9668  0.9645  0.9664   -0.2967
  8    3  2   0.9741  0.9704  0.9797  0.9799  0.9767  0.9762   -0.2098
  9    3  3   0.9625  0.9633  0.9642  0.9678  0.9619  0.9639   -0.3191
  (A1 = 10, A2 = 50, A3 = 100; B1 = 2.4, B2 = 5, B3 = 10)

Table 7: Summary of experiment data of the Zoo database.

  No.  A  B   y1      y2      y3      y4      y5      Average  S/N
  1    1  1   0.9513  0.9673  0.9435  0.9567  0.9546  0.9547   -0.4037
  2    1  2   0.9600  0.9616  0.9588  0.9611  0.9608  0.9605   -0.3504
  3    1  3   0.7809  0.7833  0.7820  0.7679  0.7811  0.7790   -2.1694
  4    2  1   0.7118  0.6766  0.7368  0.7256  0.7109  0.7123   -2.9571
  5    2  2   0.9600  0.9612  0.9604  0.9519  0.9440  0.9555   -0.3960
  6    2  3   0.8900  0.8947  0.9214  0.9050  0.9190  0.9060   -0.8598
  7    3  1   0.7118  0.7398  0.7421  0.7495  0.7203  0.7327   -2.7064
  8    3  2   0.9610  0.9735  0.9709  0.9752  0.9661  0.9693   -0.2709
  9    3  3   0.9600  0.9723  0.9707  0.9509  0.9763  0.9660   -0.3013
  (A1 = 5, A2 = 10, A3 = 50; B1 = 0.08, B2 = 4, B3 = 11)

The S/N ratios of the remaining eight experimental combinations are summarized in Table 6, and the Zoo experimental results and S/N ratios are shown in Table 7. From these results, we then calculate the average S/N ratio of each factor level. Taking the experiment of Table 8 as an example, the average S/N ratio A_1 of factor A at level 1 is

A_1 = (1/3)[-0.3060 + (-0.2755) + (-0.1647)] = -0.2487.    (15)
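The factor-level averages of Table 8 can be reproduced from the nine S/N ratios of the L9 array. A short sketch for the Dermatology case (S/N values from Table 6; levels 1-indexed as in the text):

```python
# Average S/N ratio per factor level; the best combination is the level with
# the largest (least negative) average for each factor.
SN = {(1, 1): -0.3060, (1, 2): -0.2755, (1, 3): -0.1647,
      (2, 1): -0.3379, (2, 2): -0.2746, (2, 3): -0.2475,
      (3, 1): -0.2967, (3, 2): -0.2098, (3, 3): -0.3191}

def level_means(sn, factor):
    """Average S/N ratio per level of one factor (0 = A, 1 = B)."""
    return {lvl: sum(v for k, v in sn.items() if k[factor] == lvl) / 3
            for lvl in (1, 2, 3)}

a = level_means(SN, 0)            # a[1] is approximately -0.2487, as in (15)
b = level_means(SN, 1)
best = (max(a, key=a.get), max(b, key=b.get))   # (1, 3), i.e., combination A1B3
```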

Similarly, we can calculate the average effects of A2 and A3 from Table 6. The difference analysis results for the various factor levels of the Dermatology and Zoo databases are shown in Table 8, and the factor effect diagrams are shown in Figures 2 and 3. As a greater S/N ratio represents better quality, according to the factor-level differences and the factor effect diagrams, the Dermatology parameter-level combination is A1B3, in other words, parameters C = 10 and γ = 10; the Zoo parameter-level combination is A1B2, with parameter settings C = 5 and γ = 4.

Figure 2: Main effect plots for the S/N ratio of the Dermatology database (factors A and B, levels 1-3).

When constructing the multiclass SVM model using SVM-RFE, three different feature sets are selected according to their significance. At the first stage, Taguchi quality engineering is applied to select the optimum values of parameters C and γ. At the second stage, the multiclass SVM classifier is constructed and its classification performance is compared under the above parameters. In the Dermatology experiment, Table 9 illustrates the two feature subsets containing 23 and 33 feature variables. The 33 feature


Table 8: Average of each factor at all levels.

Dermatology database:
  Control factor  Level 1   Level 2   Level 3   Difference
  A (C)           -0.2487   -0.2867   -0.2752   0.0380
  B (γ)           -0.3135   -0.2533   -0.2438   0.0697

Zoo database:
  Control factor  Level 1   Level 2   Level 3   Difference
  A (C)           -0.9745   -1.4043   -1.0929   0.4298
  B (γ)           -2.0224   -0.3391   -1.1102   1.6833

Table 9: Classification performance comparison of the Dermatology database.

  Method            Dimensions  C    γ     Accuracy (%)
  SVM               33          100  5     95.10 ± 0.0096
  SVM-RFE           23          50   2.4   89.28 ± 0.0139
  SVM-RFE-Taguchi   23          10   10    95.38 ± 0.0098

Table 10: Classification performance comparison of the Zoo database.

  Method            Dimensions  C    γ     Accuracy (%)
  SVM               16          10   11    89 ± 0.0314
  SVM-RFE           6           50   0.08  92 ± 0.0199
  SVM-RFE-Taguchi   12          5    4     97 ± 0.0396

Figure 3: Main effect plots for the S/N ratio of the Zoo database (factors A and B, levels 1-3).

sets are tested by SVM and by SVM based on the Taguchi method. The parameter settings and testing accuracy results are shown in Table 9. The experimental results in Figure 4 show that the testing accuracy of the SVM (C = 10, γ = 10) with the 17-feature sets can be higher than 90%, which is better than that of the 20-feature-set dataset with SVM (C = 10, γ = 11), which reaches up to 90%. Moreover, regardless of how many feature variables are selected, the accuracy of the SVM (C = 50, γ = 2.4) cannot exceed 90%.

Regarding the Zoo experiment, Table 10 summarizes the experimental results of the sets containing 6, 12, and 16 feature variables using SVM and SVM based on the Taguchi method. As shown in Table 10, the classification accuracy of the 12-feature-variable set in the classification experiment using SVM-RFE-Taguchi (C = 5, γ = 4) is the highest, up to 97 ± 0.0396. As shown in Figure 5, the experimental results show that the classification

Figure 4: Classification performance comparison of the Dermatology database (accuracy versus number of features; SVM-RFE-Taguchi C = 10, γ = 10; SVM-RFE C = 50, γ = 2.4; SVM-RFE C = 100, γ = 5).

accuracy rate of the dataset containing 7 feature variables using SVM-RFE-Taguchi (C = 5, γ = 4) can be higher than 90%, which obtains relatively better prediction effects.

5. Conclusions

As the study of the impact of feature selection on multiclass classification accuracy becomes increasingly attractive and significant, this study applies SVM-RFE and SVM to construct a multiclass classification method in order to establish the classification model. As RFE is a


Table 11: Comparison of classification accuracy in related literature.

Dermatology database:
  Author                        Method            Accuracy (%)
  Xie et al. (2005) [16]        FOut SVM          91.74
  Srinivasa et al. (2006) [32]  FCM SVM           83.30
  Ren et al. (2006) [33]        LDA SVM           72.09
  Our method (2014)             SVM-RFE-Taguchi   95.38

Zoo database:
  Author                        Method            Accuracy (%)
  Xie et al. (2005) [16]        FOut SVM          88.24
  He (2006) [34]                NFPH k-modes      92.08
  Golzari et al. (2009) [35]    Fuzzy AIRS        94.96
  Our method (2014)             SVM-RFE-Taguchi   97.00

Figure 5: Classification performance comparison of the Zoo database (accuracy versus number of features; SVM-RFE-Taguchi C = 5, γ = 4; SVM-RFE C = 10, γ = 11; SVM-RFE C = 50, γ = 0.08).

feature selection method of the wrapper model, it requires a previously defined classifier as the assessment rule for feature selection; therefore, SVM is used as the RFE assessment standard to help RFE select feature sets.

According to the experimental results of this study, with respect to parameter settings, the impact of parameter selection on the construction of the SVM classification model is huge. Therefore, this study applies the Taguchi parameter design to determine the parameter ranges and to select the optimum parameter combination for the SVM classifier, as this is a key factor influencing classification accuracy. This study also collected the experimental results of different research methods on the Dermatology and Zoo databases [16, 32-35], as shown in Table 11. By comparison, the proposed method achieves higher classification accuracy.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References

[1] N. Cristianini and J. Shawe-Taylor, An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods, Cambridge University Press, Cambridge, UK, 2000.

[2] J. Luts, F. Ojeda, R. Van de Plas, B. De Moor, S. Van Huffel, and J. A. K. Suykens, "A tutorial on support vector machine-based methods for classification problems in chemometrics," Analytica Chimica Acta, vol. 665, no. 2, pp. 129-145, 2010.

[3] M. F. Akay, "Support vector machines combined with feature selection for breast cancer diagnosis," Expert Systems with Applications, vol. 36, no. 2, pp. 3240-3247, 2009.

[4] C.-Y. Chang, S.-J. Chen, and M.-F. Tsai, "Application of support-vector-machine-based method for feature selection and classification of thyroid nodules in ultrasound images," Pattern Recognition, vol. 43, no. 10, pp. 3494-3506, 2010.

[5] H.-L. Chen, B. Yang, J. Liu, and D.-Y. Liu, "A support vector machine classifier with rough set-based feature selection for breast cancer diagnosis," Expert Systems with Applications, vol. 38, no. 7, pp. 9014-9022, 2011.

[6] P. Danenas and G. Garsva, "Credit risk evaluation modeling using evolutionary linear SVM classifiers and sliding window approach," Procedia Computer Science, vol. 9, pp. 1324-1333, 2012.

[7] C. L. Huang, H. C. Liao, and M. C. Chen, "Prediction model building and feature selection with support vector machines in breast cancer diagnosis," Expert Systems with Applications, vol. 34, no. 1, pp. 578-587, 2008.

[8] H. F. Liau and D. Isa, "Feature selection for support vector machine-based face-iris multimodal biometric system," Expert Systems with Applications, vol. 38, no. 9, pp. 11105-11111, 2011.

[9] Y. Zhang, Z. Chi, and Y. Sun, "A novel multi-class support vector machine based on fuzzy theories," in Intelligent Computing: International Conference on Intelligent Computing, Part I (ICIC '06), D. S. Huang, K. Li, and G. W. Irwin, Eds., vol. 4113 of Lecture Notes in Computer Science, pp. 42-50, Springer, Berlin, Germany, 2006.

[10] Y. Aksu, D. J. Miller, G. Kesidis, and Q. X. Yang, "Margin-maximizing feature elimination methods for linear and nonlinear kernel-based discriminant functions," IEEE Transactions on Neural Networks, vol. 21, no. 5, pp. 701-717, 2010.

[11] P. Pudil, J. Novovičová, and J. Kittler, "Floating search methods in feature selection," Pattern Recognition Letters, vol. 15, no. 11, pp. 1119-1125, 1994.

[12] I. Guyon, J. Weston, S. Barnhill, and V. Vapnik, "Gene selection for cancer classification using support vector machines," Machine Learning, vol. 46, no. 1-3, pp. 389-422, 2002.

[13] S. Harikrishna, M. A. H. Farquad, and Shabana, "Credit scoring using support vector machine: a comparative analysis," in Advanced Materials Research, Trans Tech Publications, Zurich, Switzerland, 2012.

[14] X. Lin, F. Yang, L. Zhou et al., "A support vector machine-recursive feature elimination feature selection method based on artificial contrast variables and mutual information," Journal of Chromatography B: Analytical Technologies in the Biomedical and Life Sciences, vol. 910, pp. 149-155, 2012.

[15] R. Zhang and J. Ma, "Feature selection for hyperspectral data based on recursive support vector machines," International Journal of Remote Sensing, vol. 30, no. 14, pp. 3669-3677, 2009.

[16] Z. X. Xie, Q. H. Hu, and D. R. Yu, "Fuzzy output support vector machines for classification," in Advances in Natural Computation, L. Wang, K. Chen, and Y. S. Ong, Eds., vol. 3612, pp. 1190-1197, Springer, Berlin, Germany, 2005.

[17] Y. Liu, Z. You, and L. Cao, "A novel and quick SVM-based multi-class classifier," Pattern Recognition, vol. 39, no. 11, pp. 2258-2264, 2006.

[18] J. Platt, N. Cristianini, and J. Shawe-Taylor, "Large margin DAGs for multiclass classification," in Advances in Neural Information Processing Systems, S. A. Solla, T. K. Leen, and K. R. Müller, Eds., vol. 12, pp. 547-553, 2000.

[19] Y. Xu, S. Zomer, and R. G. Brereton, "Support vector machines: a recent method for classification in chemometrics," Critical Reviews in Analytical Chemistry, vol. 36, no. 3-4, pp. 177-188, 2006.

[20] M. L. Huang, Y. H. Hung, and E. J. Lin, "Effects of SVM parameter optimization based on the parameter design of Taguchi method," International Journal on Artificial Intelligence Tools, vol. 20, no. 3, pp. 563-575, 2011.

[21] H.-C. Lin, C.-T. Su, C.-C. Wang, B.-H. Chang, and R.-C. Juang, "Parameter optimization of continuous sputtering process based on Taguchi methods, neural networks, desirability function, and genetic algorithms," Expert Systems with Applications, vol. 39, no. 17, pp. 12918-12925, 2012.

[22] Y. Mao, D. Pi, Y. Liu, and Y. Sun, "Accelerated recursive feature elimination based on support vector machine for key variable identification," Chinese Journal of Chemical Engineering, vol. 14, no. 1, pp. 65-72, 2006.

[23] A. Pal and J. Maiti, "Development of a hybrid methodology for dimensionality reduction in Mahalanobis-Taguchi system using Mahalanobis distance and binary particle swarm optimization," Expert Systems with Applications, vol. 37, no. 2, pp. 1286-1293, 2010.

[24] C.-T. Su and Y.-H. Hsiao, "Multiclass MTS for simultaneous feature selection and classification," IEEE Transactions on Knowledge and Data Engineering, vol. 21, no. 2, pp. 192-205, 2009.

[25] X. Lin, F. Yang, L. Zhou et al., "A support vector machine-recursive feature elimination feature selection method based on artificial contrast variables and mutual information," Journal of Chromatography B, vol. 910, pp. 149-155, 2012.

[26] E. Hüllermeier and S. Vanderlooy, "Combining predictions in pairwise classification: an optimal adaptive voting strategy and its relation to weighted voting," Pattern Recognition, vol. 43, no. 1, pp. 128-142, 2010.

[27] L. Bottou, C. Cortes, J. Denker et al., "Comparison of classifier methods: a case study in handwritten digit recognition," in Proceedings of the 12th IAPR International Conference on Pattern Recognition, vol. 2, pp. 77-82, IEEE Computer Society Press, Los Alamitos, Calif, USA, 1994.

[28] J. Fürnkranz, "Round robin rule learning," in Proceedings of the 18th International Conference on Machine Learning (ICML '01), pp. 146-153, 2001.

[29] M. R. Sohrabi, S. Jamshidi, and A. Esmaeilifar, "Cloud point extraction for determination of Diazinon: optimization of the effective parameters using Taguchi method," Chemometrics and Intelligent Laboratory Systems, vol. 110, no. 1, pp. 49-54, 2012.

[30] W. C. Hsu and T. Y. Yu, "Support vector machines parameter selection based on combined Taguchi method and Staelin method for e-mail spam filtering," International Journal of Engineering and Technology Innovation, vol. 2, no. 2, pp. 113-125, 2012.

[31] J. Arenas-García and F. Pérez-Cruz, "Multi-class support vector machines: a new approach," in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '03), vol. 2, pp. 781-784, April 2003.

[32] K. G. Srinivasa, K. R. Venugopal, and L. M. Patnaik, "Feature extraction using fuzzy c-means clustering for data mining systems," International Journal of Computer Science and Network Security, vol. 6, no. 3A, pp. 230-236, 2006.

[33] Y. Ren, H. Liu, C. Xue, X. Yao, M. Liu, and B. Fan, "Classification study of skin sensitizers based on support vector machine and linear discriminant analysis," Analytica Chimica Acta, vol. 572, no. 2, pp. 272-282, 2006.

[34] Z. He, Farthest-point heuristic based initialization methods for K-modes clustering [thesis], Department of Computer Science and Engineering, Harbin Institute of Technology, Harbin, China, 2006.

[35] S. Golzari, S. Doraisamy, M. N. Sulaiman, and N. I. Udzir, "Effect of fuzzy resource allocation method on AIRS classifier accuracy," Journal of Theoretical and Applied Information Technology, vol. 5, no. 1, pp. 18-24, 2009.


Page 6: Research Article SVM-RFE Based Feature Selection and ...downloads.hindawi.com/journals/tswj/2014/795624.pdf · SVM-RFE Based Feature Selection and Taguchi Parameters Optimization

6 The Scientific World Journal

Table 4 Classification accuracy comparison

Dermatology database Zoo database

119862120574

119862120574

1 3 10 12 01 5 10 121 5257 9518 9408 9422 1 7118 7809 6236 406410 5257 9604 9794 9793 10 7118 9600 9100 850950 5257 9631 9686 9658 50 7118 9609 9600 9600100 5257 9631 9632 9603 100 7118 9609 9609 9600

Table 5 Factor level configuration of LS-SVM parameter design

Dermatology database Zoo database

Control factor Level Control factor Level1 2 3 1 2 3

119860(119862) 10 50 100 119860(119862) 5 10 50119861(120574) 24 5 10 119861(120574) 008 4 11

43 Evaluation of Classification Accuracy Cross-validationmeasurement divides all the samples into a training set anda testing set The training set is the learning data of thealgorithm to establish the classification rules the samples ofthe testing data are used as the testing data to measure theperformance of the classification rules All the samples arerandomly divided into 119896-folds by category and the data aremutually repelled Each fold of the data is used as the testingdata and the remaining 119896minus1 folds are used as the training setThe step is repeated 119896 times and each testing set validates theclassification rules learnt from the corresponding training setto obtain an accuracy rate The average of the accuracy ratesof all 119896 testing sets can be used as the final evaluation resultsThe method is known as 119896-fold cross-validation

44 Results and Discussion The ranking order of all featuresfor Dermatology and Zoo databases using RFE-SVM issummarized as follows Dermatology = V1 V16 V32 V28V19 V3 V17 V2 V15 V21 V26 V13 V14 V5 V18 V4 V23V11 V8 V12 V27 V24 V6 V25 V30 V29 V10 V31 V22V20 V33 V7 V9 and Zoo = V13 V9 V14 V10 V16 V4V8 V1 V11 V2 V12 V5 V6 V3 V15 V7 According to thesuggestions of scholars the classification error rate of OAO isrelatively lowerwhen the number of testing instances is below1000Multiclass SVMparameter settings can affect theMulti-class SVMrsquos classification accuracy Arenas-Garcıa and Perez-Cruz applied SVMsrsquo parameters setting in the multiclass Zoodataset [31]They have carried out simulation usingGaussiankernels for all possible combinations of 119862 and Garmar from119862 = [119897 3 10 30 100] and Garmar = sqrt(025d) sqrt(05d)sqrt(d) sqrt(2d) and sqrt(4d) with d being the dimension ofthe input data In this study we have executed wide ranges ofthe parameter settings for Dermatology and Zoo databasesFinally the parameter settings are suggested as Dermatology(119862 120574) = 119862 = 1 10 50 100 and 120574 = 1 3 10 12 Zoo(119862 120574) = 119862 = 1 10 50 100 and 120574 = 01 5 10 12 and thetesting accuracies are shown in Table 4

As shown in Table 4, regarding parameter C, when C = 10 and γ = 5, 10, or 12, the experimental accuracy is higher than that of the combinations with C = 1 and γ = 5, 10, or 12. Moreover, regarding parameter γ, the accuracy for γ = 5 with C = 1, 10, 50, or 100 is higher than that for γ = 0.1 with C = 1, 10, 50, or 100. The near-optimal value of C or γ may not be the same for different databases, so finding appropriate parameter settings is important for classifier performance. In practice, it is impossible to simulate every possible combination of parameter settings, and that is why the Taguchi methodology is applied to reduce the number of experimental combinations for SVM. The experimental procedure first referred to the related study (e.g., C = [1, 3, 10, 30, 100] [31]) and then set a feasible range for both databases (C = 1 to 100, γ = 1 to 12). After that, we slightly adjusted the ranges to check whether better results could be obtained in the Taguchi parameter optimization for each database. According to our experimental results, the final parameter ranges are C in [10, 100] and γ in [2.4, 10] for the Dermatology database, and C in [5, 50] and γ in [0.08, 11] for the Zoo database. Within these ranges of C and γ, we select three parameter levels and use two control factors, A and B, to represent parameters C and γ, respectively. The Taguchi experiment uses the L9(3^2) orthogonal array, and the factor-level configuration is illustrated in Table 5.

After data preprocessing, the Dermatology and Zoo databases include 358 and 101 testing instances, respectively. Each run of the orthogonal array is repeated five times (n = 5); the experimental combinations and observations are summarized in Tables 6 and 7. According to (13), the S/N ratio for Taguchi experimental combination 1 is

\[
\mathrm{S/N}_{\mathrm{LTB}} = -10\log_{10}\left[\frac{1}{5}\left(\frac{1}{0.9631^{2}} + \frac{1}{0.9701^{2}} + \frac{1}{0.9697^{2}} + \frac{1}{0.9627^{2}} + \frac{1}{0.9614^{2}}\right)\right] = -0.3060. \tag{14}
\]
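The larger-the-better S/N computation in (14) is easy to verify numerically; the short sketch below reproduces run 1 of Table 6:

```python
import math

def sn_larger_the_better(ys):
    """Taguchi larger-the-better S/N ratio: -10 * log10((1/n) * sum(1/y^2))."""
    return -10 * math.log10(sum(1 / y**2 for y in ys) / len(ys))

run1 = [0.9631, 0.9701, 0.9697, 0.9627, 0.9614]  # Table 6, combination 1
snr_run1 = sn_larger_the_better(run1)  # approximately -0.3060, as in (14)
```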

The Scientific World Journal 7

Table 6: Summary of experiment data of the Dermatology database.

No. | A | B | y1 | y2 | y3 | y4 | y5 | Average | S/N
1 | 1 | 1 | 0.9631 | 0.9701 | 0.9697 | 0.9627 | 0.9614 | 0.9654 | −0.3060
2 | 1 | 2 | 0.9686 | 0.9749 | 0.9653 | 0.9621 | 0.9732 | 0.9688 | −0.2755
3 | 1 | 3 | 0.9795 | 0.9847 | 0.9848 | 0.9838 | 0.9735 | 0.9813 | −0.1647
4 | 2 | 1 | 0.9630 | 0.9615 | 0.9581 | 0.9599 | 0.9668 | 0.9619 | −0.3379
5 | 2 | 2 | 0.9687 | 0.9721 | 0.9704 | 0.9707 | 0.9626 | 0.9689 | −0.2746
6 | 2 | 3 | 0.9685 | 0.9748 | 0.9744 | 0.9712 | 0.9707 | 0.9719 | −0.2475
7 | 3 | 1 | 0.9671 | 0.9689 | 0.9648 | 0.9668 | 0.9645 | 0.9664 | −0.2967
8 | 3 | 2 | 0.9741 | 0.9704 | 0.9797 | 0.9799 | 0.9767 | 0.9762 | −0.2098
9 | 3 | 3 | 0.9625 | 0.9633 | 0.9642 | 0.9678 | 0.9619 | 0.9639 | −0.3191
(A1 = 10, A2 = 50, A3 = 100; B1 = 2.4, B2 = 5, B3 = 10)
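For two three-level factors, the nine runs above form the full factorial; a small sketch that expands the Dermatology factor levels from the footnote of Table 6 into concrete (C, γ) settings for each run:

```python
from itertools import product

# Dermatology factor levels from the footnote of Table 6: A -> C, B -> gamma.
C_levels = {1: 10, 2: 50, 3: 100}
gamma_levels = {1: 2.4, 2: 5, 3: 10}

# One tuple per run: (A level, B level, C value, gamma value),
# with factor A varying slowest, matching the run order of Table 6.
runs = [(a, b, C_levels[a], gamma_levels[b])
        for a, b in product((1, 2, 3), (1, 2, 3))]
```

For larger designs (more factors), the orthogonal array would select only a fraction of the full factorial; with just two factors the L9 array coincides with all nine combinations.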

Table 7: Summary of experiment data of the Zoo database.

No. | A | B | y1 | y2 | y3 | y4 | y5 | Average | S/N
1 | 1 | 1 | 0.9513 | 0.9673 | 0.9435 | 0.9567 | 0.9546 | 0.9547 | −0.4037
2 | 1 | 2 | 0.9600 | 0.9616 | 0.9588 | 0.9611 | 0.9608 | 0.9605 | −0.3504
3 | 1 | 3 | 0.7809 | 0.7833 | 0.7820 | 0.7679 | 0.7811 | 0.7790 | −2.1694
4 | 2 | 1 | 0.7118 | 0.6766 | 0.7368 | 0.7256 | 0.7109 | 0.7123 | −2.9571
5 | 2 | 2 | 0.9600 | 0.9612 | 0.9604 | 0.9519 | 0.9440 | 0.9555 | −0.3960
6 | 2 | 3 | 0.8900 | 0.8947 | 0.9214 | 0.9050 | 0.9190 | 0.9060 | −0.8598
7 | 3 | 1 | 0.7118 | 0.7398 | 0.7421 | 0.7495 | 0.7203 | 0.7327 | −2.7064
8 | 3 | 2 | 0.9610 | 0.9735 | 0.9709 | 0.9752 | 0.9661 | 0.9693 | −0.2709
9 | 3 | 3 | 0.9600 | 0.9723 | 0.9707 | 0.9509 | 0.9763 | 0.9660 | −0.3013
(A1 = 5, A2 = 10, A3 = 50; B1 = 0.08, B2 = 4, B3 = 11)

The S/N ratios of the remaining eight experimental combinations are calculated in the same way and summarized in Table 6; the Zoo experimental results and S/N ratios are shown in Table 7. From these results, we then calculate the average S/N ratio at each factor level. Taking Table 8 as an example, the average S/N ratio of factor A at level 1 is

\[
\bar{A}_{1} = \frac{1}{3}\left[(-0.3060) + (-0.2755) + (-0.1647)\right] = -0.2487. \tag{15}
\]

Similarly, we can calculate the average effects of the remaining levels of factors A and B from Table 6. The difference analysis results of the factor levels for the Dermatology and Zoo databases are shown in Table 8, and the factor effect diagrams are shown in Figures 2 and 3. As a greater S/N ratio represents better quality, according to the factor-level differences and the factor effect diagrams, the best Dermatology parameter-level combination is A1B3, that is, C = 10 and γ = 10; the best Zoo parameter-level combination is A1B2, that is, C = 5 and γ = 4.

When constructing the multiclass SVM model using SVM-RFE, three different feature sets are selected according

Figure 2: Main effect plots for the S/N ratio of the Dermatology database (factors A and B at levels 1 to 3).

to their significance. At the first stage, Taguchi quality engineering is applied to select the optimal values of parameters C and γ. At the second stage, the multiclass SVM classifier is constructed and the classification performance is compared using those parameters. In the Dermatology experiment, Table 9 presents two feature subsets containing 23 and 33 feature variables. The 33 feature


Table 8: Average of each factor at all levels.

Dermatology:
Control factor | Level 1 | Level 2 | Level 3 | Difference
A (C) | −0.2487 | −0.2867 | −0.2752 | 0.0380
B (γ) | −0.3135 | −0.2533 | −0.2438 | 0.0697

Zoo:
Control factor | Level 1 | Level 2 | Level 3 | Difference
A (C) | −0.9745 | −1.4043 | −1.0929 | 0.4298
B (γ) | −2.0224 | −0.3391 | −1.1102 | 1.6833
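The Table 8 entries follow directly from the S/N column of Table 6; a short check using the Dermatology values (assuming runs are ordered with factor A varying slowest, as in Table 6):

```python
sn = [-0.3060, -0.2755, -0.1647,   # runs 1-3 (A = level 1)
      -0.3379, -0.2746, -0.2475,   # runs 4-6 (A = level 2)
      -0.2967, -0.2098, -0.3191]   # runs 7-9 (A = level 3)

# Average S/N at each level: A levels are contiguous triples,
# B levels repeat with stride 3 (runs 1,4,7 are B = level 1, etc.).
avg_A = [round(sum(sn[3 * i:3 * i + 3]) / 3, 4) for i in range(3)]
avg_B = [round(sum(sn[j::3]) / 3, 4) for j in range(3)]

# Larger S/N is better, so pick the level with the maximum average.
best = (avg_A.index(max(avg_A)) + 1, avg_B.index(max(avg_B)) + 1)  # (1, 3) = A1B3
```

This reproduces the Dermatology row of Table 8 and the A1B3 (C = 10, γ = 10) selection.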

Table 9: Classification performance comparison of the Dermatology database.

Method | Dimensions | C | γ | Accuracy
SVM | 33 | 100 | 5 | 95.10 ± 0.0096
SVM-RFE | 23 | 50 | 2.4 | 89.28 ± 0.0139
SVM-RFE-Taguchi | 23 | 10 | 10 | 95.38 ± 0.0098

Table 10: Classification performance comparison of the Zoo database.

Method | Dimensions | C | γ | Accuracy
SVM | 16 | 10 | 11 | 89 ± 0.0314
SVM-RFE | 6 | 50 | 0.08 | 92 ± 0.0199
SVM-RFE-Taguchi | 12 | 5 | 4 | 97 ± 0.0396

Figure 3: Main effect plots for the S/N ratio of the Zoo database (factors A and B at levels 1 to 3).

sets are tested by SVM and by SVM with Taguchi-optimized parameters. The parameter settings and testing accuracy results are shown in Table 9. As the experimental results in Figure 4 show, the testing accuracy of SVM (C = 10, γ = 10) on the 17-feature set exceeds 90%, which is better than the accuracy of SVM (C = 100, γ = 5) on the 20-feature set, which only approaches 90%. Moreover, regardless of how many feature variables are selected, the accuracy of SVM (C = 50, γ = 2.4) never exceeds 90%.

Regarding the Zoo experiment, Table 10 summarizes the test results for sets containing 6, 12, and 16 feature variables using SVM and SVM based on Taguchi. As shown in Table 10, the classification accuracy of the 12-feature set in the experiment using SVM-RFE-Taguchi (C = 5, γ = 4) is the highest, up to 97 ± 0.0396. As shown in Figure 5, the experimental results show that the classification

Figure 4: Classification performance comparison of the Dermatology database: accuracy versus number of features (0 to 35) for SVM-RFE-Taguchi (C = 10, γ = 10), SVM-RFE (C = 50, γ = 2.4), and SVM-RFE (C = 100, γ = 5).

accuracy of the dataset containing 7 feature variables under SVM-RFE-Taguchi (C = 5, γ = 4) can exceed 90%, which yields relatively better prediction results.

5. Conclusions

As the study of the impact of feature selection on multiclass classification accuracy becomes increasingly attractive and significant, this study applies SVM-RFE and SVM to construct a multiclass classification method and establish the classification model. As RFE is a


Table 11: Comparison of classification accuracy in related literature.

Dermatology database:
Author | Method | Accuracy (%)
Xie et al. (2005) [16] | FOut SVM | 91.74
Srinivasa et al. (2006) [32] | FCM SVM | 83.30
Ren et al. (2006) [33] | LDA SVM | 72.09
Our method (2014) | SVM-RFE-Taguchi | 95.38

Zoo database:
Author | Method | Accuracy (%)
Xie et al. (2005) [16] | FOut SVM | 88.24
He (2006) [34] | NFPH k-modes | 92.08
Golzari et al. (2009) [35] | Fuzzy AIRS | 94.96
Our method (2014) | SVM-RFE-Taguchi | 97.00

Figure 5: Classification performance comparison of the Zoo database: accuracy versus number of features (0 to 16) for SVM-RFE-Taguchi (C = 5, γ = 4), SVM-RFE (C = 10, γ = 11), and SVM-RFE (C = 50, γ = 0.08).

feature selection method of the wrapper type, it requires a previously defined classifier as the assessment rule of feature selection; therefore, SVM is used as the RFE assessment standard to guide RFE in the selection of feature sets.

According to the experimental results of this study, the impact of parameter selection on the construction of the SVM classification model is substantial. Therefore, this study applies the Taguchi parameter design to determine the parameter ranges and to select the optimal parameter combination for the SVM classifier, as this is a key factor influencing classification accuracy. This study also collected the experimental results of different research methods on the Dermatology and Zoo databases [16, 32, 33], as shown in Table 11. By comparison, the proposed method achieves higher classification accuracy.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References

[1] N. Cristianini and J. Shawe-Taylor, An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods, Cambridge University Press, Cambridge, UK, 2000.

[2] J. Luts, F. Ojeda, R. van de Plas, B. de Moor, S. van Huffel, and J. A. K. Suykens, “A tutorial on support vector machine-based methods for classification problems in chemometrics,” Analytica Chimica Acta, vol. 665, no. 2, pp. 129–145, 2010.

[3] M. F. Akay, “Support vector machines combined with feature selection for breast cancer diagnosis,” Expert Systems with Applications, vol. 36, no. 2, pp. 3240–3247, 2009.

[4] C.-Y. Chang, S.-J. Chen, and M.-F. Tsai, “Application of support-vector-machine-based method for feature selection and classification of thyroid nodules in ultrasound images,” Pattern Recognition, vol. 43, no. 10, pp. 3494–3506, 2010.

[5] H.-L. Chen, B. Yang, J. Liu, and D.-Y. Liu, “A support vector machine classifier with rough set-based feature selection for breast cancer diagnosis,” Expert Systems with Applications, vol. 38, no. 7, pp. 9014–9022, 2011.

[6] P. Danenas and G. Garsva, “Credit risk evaluation modeling using evolutionary linear SVM classifiers and sliding window approach,” Procedia Computer Science, vol. 9, pp. 1324–1333, 2012.

[7] C. L. Huang, H. C. Liao, and M. C. Chen, “Prediction model building and feature selection with support vector machines in breast cancer diagnosis,” Expert Systems with Applications, vol. 34, no. 1, pp. 578–587, 2008.

[8] H. F. Liau and D. Isa, “Feature selection for support vector machine-based face-iris multimodal biometric system,” Expert Systems with Applications, vol. 38, no. 9, pp. 11105–11111, 2011.

[9] Y. Zhang, Z. Chi, and Y. Sun, “A novel multi-class support vector machine based on fuzzy theories,” in Intelligent Computing: International Conference on Intelligent Computing, Part I (ICIC '06), D. S. Huang, K. Li, and G. W. Irwin, Eds., vol. 4113 of Lecture Notes in Computer Science, pp. 42–50, Springer, Berlin, Germany, 2006.

[10] Y. Aksu, D. J. Miller, G. Kesidis, and Q. X. Yang, “Margin-maximizing feature elimination methods for linear and nonlinear kernel-based discriminant functions,” IEEE Transactions on Neural Networks, vol. 21, no. 5, pp. 701–717, 2010.

[11] P. Pudil, J. Novovicova, and J. Kittler, “Floating search methods in feature selection,” Pattern Recognition Letters, vol. 15, no. 11, pp. 1119–1125, 1994.


[12] I. Guyon, J. Weston, S. Barnhill, and V. Vapnik, “Gene selection for cancer classification using support vector machines,” Machine Learning, vol. 46, no. 1–3, pp. 389–422, 2002.

[13] S. Harikrishna, M. A. H. Farquad, and Shabana, “Credit scoring using support vector machine: a comparative analysis,” in Advanced Materials Research, Trans Tech Publications, Zurich, Switzerland, 2012.

[14] X. Lin, F. Yang, L. Zhou et al., “A support vector machine-recursive feature elimination feature selection method based on artificial contrast variables and mutual information,” Journal of Chromatography B: Analytical Technologies in the Biomedical and Life Sciences, vol. 910, pp. 149–155, 2012.

[15] R. Zhang and M. Jianwen, “Feature selection for hyperspectral data based on recursive support vector machines,” International Journal of Remote Sensing, vol. 30, no. 14, pp. 3669–3677, 2009.

[16] Z. X. Xie, Q. H. Hu, and D. R. Yu, “Fuzzy output support vector machines for classification,” in Advances in Natural Computation, L. Wang, K. Chen, and Y. S. Ong, Eds., vol. 3612, pp. 1190–1197, Springer, Berlin, Germany, 2005.

[17] Y. Liu, Z. You, and L. Cao, “A novel and quick SVM-based multi-class classifier,” Pattern Recognition, vol. 39, no. 11, pp. 2258–2264, 2006.

[18] J. Platt, N. Cristianini, and J. Shawe-Taylor, “Large margin DAGs for multiclass classification,” in Advances in Neural Information Processing Systems, S. A. Solla, T. K. Leen, and K. R. Müller, Eds., vol. 12, pp. 547–553, 2000.

[19] Y. Xu, S. Zomer, and R. G. Brereton, “Support vector machines: a recent method for classification in chemometrics,” Critical Reviews in Analytical Chemistry, vol. 36, no. 3-4, pp. 177–188, 2006.

[20] M. L. Huang, Y. H. Hung, and E. J. Lin, “Effects of SVM parameter optimization based on the parameter design of Taguchi method,” International Journal on Artificial Intelligence Tools, vol. 20, no. 3, pp. 563–575, 2011.

[21] H.-C. Lin, C.-T. Su, C.-C. Wang, B.-H. Chang, and R.-C. Juang, “Parameter optimization of continuous sputtering process based on Taguchi methods, neural networks, desirability function, and genetic algorithms,” Expert Systems with Applications, vol. 39, no. 17, pp. 12918–12925, 2012.

[22] Y. Mao, D. Pi, Y. Liu, and Y. Sun, “Accelerated recursive feature elimination based on support vector machine for key variable identification,” Chinese Journal of Chemical Engineering, vol. 14, no. 1, pp. 65–72, 2006.

[23] A. Pal and J. Maiti, “Development of a hybrid methodology for dimensionality reduction in Mahalanobis-Taguchi system using Mahalanobis distance and binary particle swarm optimization,” Expert Systems with Applications, vol. 37, no. 2, pp. 1286–1293, 2010.

[24] C.-T. Su and Y.-H. Hsiao, “Multiclass MTS for simultaneous feature selection and classification,” IEEE Transactions on Knowledge and Data Engineering, vol. 21, no. 2, pp. 192–205, 2009.

[25] X. Lin, F. Yang, L. Zhou et al., “A support vector machine-recursive feature elimination feature selection method based on artificial contrast variables and mutual information,” Journal of Chromatography B, vol. 910, pp. 149–155, 2012.

[26] E. Hüllermeier and S. Vanderlooy, “Combining predictions in pairwise classification: an optimal adaptive voting strategy and its relation to weighted voting,” Pattern Recognition, vol. 43, no. 1, pp. 128–142, 2010.

[27] L. Bottou, C. Cortes, J. Denker et al., “Comparison of classifier methods: a case study in handwritten digit recognition,” in Proceedings of the 12th IAPR International Conference on Pattern Recognition, vol. 2, pp. 77–82, IEEE Computer Society Press, Los Alamitos, Calif, USA, 1994.

[28] J. Fürnkranz, “Round robin rule learning,” in Proceedings of the 18th International Conference on Machine Learning (ICML '01), pp. 146–153, 2001.

[29] M. R. Sohrabi, S. Jamshidi, and A. Esmaeilifar, “Cloud point extraction for determination of Diazinon: optimization of the effective parameters using Taguchi method,” Chemometrics and Intelligent Laboratory Systems, vol. 110, no. 1, pp. 49–54, 2012.

[30] W. C. Hsu and T. Y. Yu, “Support vector machines parameter selection based on combined Taguchi method and Staelin method for e-mail spam filtering,” International Journal of Engineering and Technology Innovation, vol. 2, no. 2, pp. 113–125, 2012.

[31] J. Arenas-García and F. Pérez-Cruz, “Multi-class support vector machines: a new approach,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '03), vol. 2, pp. 781–784, April 2003.

[32] K. G. Srinivasa, K. R. Venugopal, and L. M. Patnaik, “Feature extraction using fuzzy c-means clustering for data mining systems,” International Journal of Computer Science and Network Security, vol. 6, no. 3A, pp. 230–236, 2006.

[33] Y. Ren, H. Liu, C. Xue, X. Yao, M. Liu, and B. Fan, “Classification study of skin sensitizers based on support vector machine and linear discriminant analysis,” Analytica Chimica Acta, vol. 572, no. 2, pp. 272–282, 2006.

[34] Z. He, Farthest-point heuristic based initialization methods for K-modes clustering [thesis], Department of Computer Science and Engineering, Harbin Institute of Technology, Harbin, China, 2006.

[35] S. Golzari, S. Doraisamy, M. N. Sulaiman, and N. I. Udzir, “Effect of fuzzy resource allocation method on AIRS classifier accuracy,” Journal of Theoretical and Applied Information Technology, vol. 5, no. 1, pp. 18–24, 2009.


[33] Y Ren H Liu C Xue X YaoM Liu and B Fan ldquoClassificationstudy of skin sensitizers based on support vector machine andlinear discriminant analysisrdquo Analytica Chimica Acta vol 572no 2 pp 272ndash282 2006

[34] ZHe Farthest-point heuristic based initializationmethods for K-modes clustering [thesis] Department of Computer Science andEngineering Harbin Institute of Technology Harbin China2006

[35] SGolzari SDoraisamyMN Sulaiman andN IUdzir ldquoEffectof fuzzy resource allocation method on AIRS classifier accu-racyrdquo Journal ofTheoretical andApplied Information Technologyvol 5 no 1 pp 18ndash24 2009


8 The Scientific World Journal

Table 8: Average of each factor at all levels.

Dermatology database:
Control factor   Level 1   Level 2   Level 3   Difference
A (C)            −0.2487   −0.2867   −0.2752   0.0380
B (γ)            −0.3135   −0.2533   −0.2438   0.0697

Zoo database:
Control factor   Level 1   Level 2   Level 3   Difference
A (C)            −0.9745   −1.4043   −1.0929   0.4298
B (γ)            −2.0224   −0.3391   −1.1102   1.6833

Table 9: Classification performance comparison of the Dermatology database.

Methods           Dimensions   C     γ    Accuracy
SVM               33           100   5    95.10 ± 0.0096
SVM-RFE           23           50    24   89.28 ± 0.0139
SVM-RFE-Taguchi   23           10    10   95.38 ± 0.0098

Table 10: Classification performance comparison of the Zoo database.

Methods           Dimensions   C    γ      Accuracy
SVM               16           10   11     89 ± 0.0314
SVM-RFE           6            50   0.08   92 ± 0.0199
SVM-RFE-Taguchi   12           5    4      97 ± 0.0396

Figure 3: Main effect plots for the S/N ratio of the Zoo database (average S/N at levels 1–3 of factors A and B).
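The factor-level averages in Table 8 (and the main effects plotted in Figure 3) follow from the Taguchi larger-the-better signal-to-noise ratio, S/N = −10·log₁₀((1/n) Σᵢ 1/yᵢ²), averaged over all runs at a given factor level; the "Difference" column is the range of those level averages. A minimal sketch, using illustrative accuracy values rather than the paper's raw experimental data:

```python
import math

def sn_larger_the_better(ys):
    """Taguchi larger-the-better S/N ratio: -10 * log10(mean(1 / y_i^2))."""
    return -10.0 * math.log10(sum(1.0 / y ** 2 for y in ys) / len(ys))

# Illustrative runs for a two-factor, three-level design (A = C, B = gamma):
# (level of A, level of B, accuracies observed for that parameter setting).
runs = [
    (1, 1, [0.93, 0.94]), (1, 2, [0.95, 0.95]), (1, 3, [0.94, 0.96]),
    (2, 1, [0.90, 0.92]), (2, 2, [0.93, 0.94]), (2, 3, [0.95, 0.94]),
    (3, 1, [0.91, 0.93]), (3, 2, [0.94, 0.95]), (3, 3, [0.95, 0.96]),
]
sn = [(a, b, sn_larger_the_better(ys)) for a, b, ys in runs]

def level_averages(factor):
    """One row of the response table: mean S/N at each level of one factor."""
    acc = {}
    for rec in sn:
        acc.setdefault(rec[factor], []).append(rec[2])
    return {level: sum(v) / len(v) for level, v in acc.items()}

avg_a = level_averages(0)                           # row "A (C)" of the response table
diff_a = max(avg_a.values()) - min(avg_a.values())  # the "Difference" column
```

The factor whose level averages span the widest range has the strongest effect on the S/N ratio, which is what the main effect plots visualize.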

sets are tested by SVM and by SVM based on the Taguchi method. The parameter settings and testing accuracy results are shown in Table 9. The experimental results shown in Figure 4 indicate that the testing accuracy of SVM (C = 10, γ = 10) on the 17-feature dataset can be higher than 90%, which is better than the accuracy of SVM (C = 10, γ = 11) on the 20-feature dataset, at up to 90%. Moreover, regardless of how many feature variables are selected, the accuracy of SVM (C = 50, γ = 24) cannot exceed 90%.
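The nested feature subsets compared above come from the SVM-RFE ranking of Guyon et al. [12]: train a linear classifier, discard the surviving feature with the smallest squared weight, and repeat. The sketch below is a dependency-free illustration in which a simple perceptron stands in for the linear SVM weight estimator; SVM-RFE proper ranks by the SVM's own weight vector:

```python
import random

def fit_linear_weights(X, y, epochs=50, lr=0.1):
    """Stand-in linear scorer (SVM-RFE proper would fit a linear SVM here)."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for row, label in zip(X, y):          # labels are +1 / -1
            if label * (sum(wi * xi for wi, xi in zip(w, row)) + b) <= 0:
                w = [wi + lr * label * xi for wi, xi in zip(w, row)]
                b += lr * label
    return w

def rfe_ranking(X, y):
    """Recursive feature elimination: drop the feature with the smallest
    squared weight at each round. Returns feature indices ordered from
    least important (eliminated first) to most important (kept longest)."""
    remaining = list(range(len(X[0])))
    order = []
    while len(remaining) > 1:
        w = fit_linear_weights([[row[j] for j in remaining] for row in X], y)
        worst = min(range(len(remaining)), key=lambda i: w[i] ** 2)
        order.append(remaining.pop(worst))
    order.append(remaining[0])
    return order

# Synthetic check: only feature 0 determines the label, so it should
# survive every elimination round and appear last in the ranking.
random.seed(0)
X = [[random.gauss(0.0, 1.0) for _ in range(5)] for _ in range(200)]
y = [1 if row[0] > 0 else -1 for row in X]
ranking = rfe_ranking(X, y)
```

The top-k features (the tail of `ranking`) define the candidate subsets whose accuracies Figures 4 and 5 compare.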

Regarding the Zoo experiment, Table 10 summarizes the experimental test results of sets containing 6, 12, and 16 feature variables using SVM and SVM based on the Taguchi method. As shown in Table 10, the classification accuracy of the set of 12 feature variables in the classification experiment using SVM-RFE-Taguchi (C = 10, γ = 10) is the highest, up to 97 ± 0.0396. As shown in Figure 5, the experimental results show that the classification

Figure 4: Classification performance comparison for the Dermatology database: accuracy versus number of features for SVM-RFE-Taguchi (C = 10, γ = 10), SVM-RFE (C = 50, γ = 24), and SVM-RFE (C = 100, γ = 5).

accuracy of the dataset containing 7 feature variables using SVM-RFE-Taguchi (C = 50, γ = 24) can be higher than 90%, which yields relatively good prediction results.

5. Conclusions

As the study of the impact of feature selection on the multiclass classification accuracy rate becomes increasingly attractive and significant, this study applies SVM-RFE and SVM in the construction of a multiclass classification method in order to establish the classification model. As RFE is a


Table 11: Comparison of classification accuracy in related literature.

Dermatology database:
Author                         Method            Accuracy
Xie et al. (2005) [16]         FOut SVM          91.74
Srinivasa et al. (2006) [32]   FCM SVM           83.30
Ren et al. (2006) [33]         LDA SVM           72.09
Our method (2014)              SVM-RFE-Taguchi   95.38

Zoo database:
Author                         Method            Accuracy
Xie et al. (2005) [16]         FOut SVM          88.24
He (2006) [34]                 NFPH k-modes      92.08
Golzari et al. (2009) [35]     Fuzzy AIRS        94.96
Our method (2014)              SVM-RFE-Taguchi   97.00

Figure 5: Classification performance comparison for the Zoo database: accuracy versus number of features for SVM-RFE-Taguchi (C = 5, γ = 4), SVM-RFE (C = 10, γ = 11), and SVM-RFE (C = 50, γ = 0.08).

feature selection method of the wrapper type, it requires a previously defined classifier as the assessment rule for feature selection; therefore, SVM is used as the RFE assessment standard to help RFE in the selection of feature sets.

According to the experimental results of this study, with respect to parameter settings, the impact of parameter selection on the construction of the SVM classification model is considerable. Therefore, this study applies the Taguchi parameter design in determining the parameter range and selecting the optimal parameter combination for the SVM classifier, as this is a key factor influencing classification accuracy. This study also collected the experimental results of different research methods applied to the Dermatology and Zoo databases [16, 32, 33], as shown in Table 11. By comparison, the proposed method achieves higher classification accuracy.
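Once the response table is computed, the parameter-selection step described above reduces to choosing, for each control factor, the level with the highest average S/N ratio; the larger a factor's range ("Difference"), the stronger its influence. A small sketch using the Zoo values from Table 8:

```python
# Average S/N ratio per factor level, taken from Table 8 (Zoo database).
response = {
    "A (C)":     {1: -0.9745, 2: -1.4043, 3: -1.0929},
    "B (gamma)": {1: -2.0224, 2: -0.3391, 3: -1.1102},
}

# Optimal level per factor: the level maximizing the average S/N ratio.
best_levels = {factor: max(levels, key=levels.get)
               for factor, levels in response.items()}

# Range ("Difference") per factor: a larger range means a stronger effect.
differences = {factor: max(levels.values()) - min(levels.values())
               for factor, levels in response.items()}
```

This reproduces Table 8's Difference column (0.4298 for A(C), 1.6833 for B(γ)) and selects level 1 for C and level 2 for γ; that is, γ dominates the classifier's S/N behaviour on the Zoo data.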

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References

[1] N. Cristianini and J. Shawe-Taylor, An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods, Cambridge University Press, Cambridge, UK, 2000.

[2] J. Luts, F. Ojeda, R. Van de Plas, B. de Moor, S. van Huffel, and J. A. K. Suykens, "A tutorial on support vector machine-based methods for classification problems in chemometrics," Analytica Chimica Acta, vol. 665, no. 2, pp. 129–145, 2010.

[3] M. F. Akay, "Support vector machines combined with feature selection for breast cancer diagnosis," Expert Systems with Applications, vol. 36, no. 2, pp. 3240–3247, 2009.

[4] C.-Y. Chang, S.-J. Chen, and M.-F. Tsai, "Application of support-vector-machine-based method for feature selection and classification of thyroid nodules in ultrasound images," Pattern Recognition, vol. 43, no. 10, pp. 3494–3506, 2010.

[5] H.-L. Chen, B. Yang, J. Liu, and D.-Y. Liu, "A support vector machine classifier with rough set-based feature selection for breast cancer diagnosis," Expert Systems with Applications, vol. 38, no. 7, pp. 9014–9022, 2011.

[6] P. Danenas and G. Garsva, "Credit risk evaluation modeling using evolutionary linear SVM classifiers and sliding window approach," Procedia Computer Science, vol. 9, pp. 1324–1333, 2012.

[7] C. L. Huang, H. C. Liao, and M. C. Chen, "Prediction model building and feature selection with support vector machines in breast cancer diagnosis," Expert Systems with Applications, vol. 34, no. 1, pp. 578–587, 2008.

[8] H. F. Liau and D. Isa, "Feature selection for support vector machine-based face-iris multimodal biometric system," Expert Systems with Applications, vol. 38, no. 9, pp. 11105–11111, 2011.

[9] Y. Zhang, Z. Chi, and Y. Sun, "A novel multi-class support vector machine based on fuzzy theories," in Intelligent Computing: International Conference on Intelligent Computing, Part I (ICIC '06), D. S. Huang, K. Li, and G. W. Irwin, Eds., vol. 4113 of Lecture Notes in Computer Science, pp. 42–50, Springer, Berlin, Germany, 2006.

[10] Y. Aksu, D. J. Miller, G. Kesidis, and Q. X. Yang, "Margin-maximizing feature elimination methods for linear and nonlinear kernel-based discriminant functions," IEEE Transactions on Neural Networks, vol. 21, no. 5, pp. 701–717, 2010.

[11] P. Pudil, J. Novovicova, and J. Kittler, "Floating search methods in feature selection," Pattern Recognition Letters, vol. 15, no. 11, pp. 1119–1125, 1994.

[12] I. Guyon, J. Weston, S. Barnhill, and V. Vapnik, "Gene selection for cancer classification using support vector machines," Machine Learning, vol. 46, no. 1–3, pp. 389–422, 2002.

[13] S. Harikrishna, M. A. H. Farquad, and Shabana, "Credit scoring using support vector machine: a comparative analysis," in Advanced Materials Research, Trans Tech Publications, Zurich, Switzerland, 2012.

[14] X. Lin, F. Yang, L. Zhou et al., "A support vector machine-recursive feature elimination feature selection method based on artificial contrast variables and mutual information," Journal of Chromatography B: Analytical Technologies in the Biomedical and Life Sciences, vol. 910, pp. 149–155, 2012.

[15] R. Zhang and M. Jianwen, "Feature selection for hyperspectral data based on recursive support vector machines," International Journal of Remote Sensing, vol. 30, no. 14, pp. 3669–3677, 2009.

[16] Z. X. Xie, Q. H. Hu, and D. R. Yu, "Fuzzy output support vector machines for classification," in Advances in Natural Computation, L. Wang, K. Chen, and Y. S. Ong, Eds., vol. 3612, pp. 1190–1197, Springer, Berlin, Germany, 2005.

[17] Y. Liu, Z. You, and L. Cao, "A novel and quick SVM-based multi-class classifier," Pattern Recognition, vol. 39, no. 11, pp. 2258–2264, 2006.

[18] J. Platt, N. Cristianini, and J. Shawe-Taylor, "Large margin DAGs for multiclass classification," in Advances in Neural Information Processing Systems, S. A. Solla, T. K. Leen, and K. R. Muller, Eds., vol. 12, pp. 547–553, 2000.

[19] Y. Xu, S. Zomer, and R. G. Brereton, "Support vector machines: a recent method for classification in chemometrics," Critical Reviews in Analytical Chemistry, vol. 36, no. 3-4, pp. 177–188, 2006.

[20] M. L. Huang, Y. H. Hung, and E. J. Lin, "Effects of SVM parameter optimization based on the parameter design of Taguchi method," International Journal on Artificial Intelligence Tools, vol. 20, no. 3, pp. 563–575, 2011.

[21] H.-C. Lin, C.-T. Su, C.-C. Wang, B.-H. Chang, and R.-C. Juang, "Parameter optimization of continuous sputtering process based on Taguchi methods, neural networks, desirability function, and genetic algorithms," Expert Systems with Applications, vol. 39, no. 17, pp. 12918–12925, 2012.

[22] Y. Mao, D. Pi, Y. Liu, and Y. Sun, "Accelerated recursive feature elimination based on support vector machine for key variable identification," Chinese Journal of Chemical Engineering, vol. 14, no. 1, pp. 65–72, 2006.

[23] A. Pal and J. Maiti, "Development of a hybrid methodology for dimensionality reduction in Mahalanobis-Taguchi system using Mahalanobis distance and binary particle swarm optimization," Expert Systems with Applications, vol. 37, no. 2, pp. 1286–1293, 2010.

[24] C.-T. Su and Y.-H. Hsiao, "Multiclass MTS for simultaneous feature selection and classification," IEEE Transactions on Knowledge and Data Engineering, vol. 21, no. 2, pp. 192–205, 2009.

[25] X. Lin, F. Yang, L. Zhou et al., "A support vector machine-recursive feature elimination feature selection method based on artificial contrast variables and mutual information," Journal of Chromatography B, vol. 910, pp. 149–155, 2012.

[26] E. Hullermeier and S. Vanderlooy, "Combining predictions in pairwise classification: an optimal adaptive voting strategy and its relation to weighted voting," Pattern Recognition, vol. 43, no. 1, pp. 128–142, 2010.

[27] L. Bottou, C. Cortes, J. Denker et al., "Comparison of classifier methods: a case study in handwritten digit recognition," in Proceedings of the 12th IAPR International Conference on Pattern Recognition, vol. 2, pp. 77–82, IEEE Computer Society Press, Los Alamitos, Calif, USA, 1994.

[28] J. Furnkranz, "Round robin rule learning," in Proceedings of the 18th International Conference on Machine Learning (ICML '01), pp. 146–153, 2001.

[29] M. R. Sohrabi, S. Jamshidi, and A. Esmaeilifar, "Cloud point extraction for determination of Diazinon: optimization of the effective parameters using Taguchi method," Chemometrics and Intelligent Laboratory Systems, vol. 110, no. 1, pp. 49–54, 2012.

[30] W. C. Hsu and T. Y. Yu, "Support vector machines parameter selection based on combined Taguchi method and Staelin method for e-mail spam filtering," International Journal of Engineering and Technology Innovation, vol. 2, no. 2, pp. 113–125, 2012.

[31] J. Arenas-Garcia and F. Perez-Cruz, "Multi-class support vector machines: a new approach," in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '03), vol. 2, pp. 781–784, April 2003.

[32] K. G. Srinivasa, K. R. Venugopal, and L. M. Patnaik, "Feature extraction using fuzzy c-means clustering for data mining systems," International Journal of Computer Science and Network Security, vol. 6, no. 3A, pp. 230–236, 2006.

[33] Y. Ren, H. Liu, C. Xue, X. Yao, M. Liu, and B. Fan, "Classification study of skin sensitizers based on support vector machine and linear discriminant analysis," Analytica Chimica Acta, vol. 572, no. 2, pp. 272–282, 2006.

[34] Z. He, Farthest-point heuristic based initialization methods for K-modes clustering [thesis], Department of Computer Science and Engineering, Harbin Institute of Technology, Harbin, China, 2006.

[35] S. Golzari, S. Doraisamy, M. N. Sulaiman, and N. I. Udzir, "Effect of fuzzy resource allocation method on AIRS classifier accuracy," Journal of Theoretical and Applied Information Technology, vol. 5, no. 1, pp. 18–24, 2009.

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 9: Research Article SVM-RFE Based Feature Selection and ...downloads.hindawi.com/journals/tswj/2014/795624.pdf · SVM-RFE Based Feature Selection and Taguchi Parameters Optimization

The Scientific World Journal 9

Table 11 Comparison of classification accuracy in related literature

Author Method AccuracyDermatology database

Xie et al (2005) [16] FOut SVM 9174Srinivasa et al (2006) [32] FCM SVM 8330Ren et al (2006) [33] LDA SVM 7209Our Method (2014) SVM-RFE-Taguchi 9538

Zoo databaseXie et al (2005) [16] FOut SVM 8824He (2006) [34] NFPH k-modes 9208Golzari et al (2009) [35] Fuzzy AIRS 9496Our Method (2014) SVM-RFE-Taguchi 9700

1

095

09

085

08

075

07

065

Accu

racy

0 2 4 6 8 10 12 14 16

Number of features

SVM-RFE-TaguchiC = 5 120574 = 4

SVM-RFE C = 10 120574 = 11

SVM-RFE C = 50 120574 = 008

Figure 5 Classification performance comparison of Zoo database

feature selection method of a wrapper model it requires apreviously defined classifier as the assessment rule of featureselection therefore SVM is used as the RFE assessmentstandard to help RFE in the selection of feature sets

According to the experimental results of this studywith respect to parameter settings the impact of parameterselection on the construction of SVM classification modelis huge Therefore this study applies the Taguchi parameterdesign in determining the parameter range and selection ofthe optimum parameter combination for SVM classifier asit is a key factor influencing the classification accuracy Thisstudy also collected the experimental results of using differentresearch methods in the case of Dermatology and Zoodatabases [16 32 33] as shown inTable 11 By comparison theproposed method can achieve higher classification accuracy

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

References

[1] N Cristianini and J Shawe-Taylor An Introduction to SupportVector Machines and Other Kernel-based Learning MethodsCambridge University Press Cambridge UK 2000

[2] J Luts F Ojeda R van de Plas Raf B de Moor S van Huffeland J A K Suykens ldquoA tutorial on support vector machine-based methods for classification problems in chemometricsrdquoAnalytica Chimica Acta vol 665 no 2 pp 129ndash145 2010

[3] M F Akay ldquoSupport vector machines combined with featureselection for breast cancer diagnosisrdquo Expert Systems withApplications vol 36 no 2 pp 3240ndash3247 2009

[4] C-Y Chang S-J Chen andM-F Tsai ldquoApplication of support-vector-machine-based method for feature selection and clas-sification of thyroid nodules in ultrasound imagesrdquo PatternRecognition vol 43 no 10 pp 3494ndash3506 2010

[5] H-L Chen B Yang J Liu and D-Y Liu ldquoA support vectormachine classifier with rough set-based feature selection forbreast cancer diagnosisrdquo Expert Systems with Applications vol38 no 7 pp 9014ndash9022 2011

[6] P Danenas and G Garsva ldquoCredit risk evaluation modelingusing evolutionary linear SVM classifiers and sliding windowapproachrdquo Procedia Computer Science vol 9 pp 1324ndash13332012

[7] C L Huang H C Liao and M C Chen ldquoPrediction modelbuilding and feature selection with support vector machines inbreast cancer diagnosisrdquo Expert Systems with Applications vol34 no 1 pp 578ndash587 2008

[8] H F Liau and D Isa ldquoFeature selection for support vectormachine-based face-iris multimodal biometric systemrdquo ExpertSystems with Applications vol 38 no 9 pp 11105ndash11111 2011

[9] Y Zhang Z Chi andY Sun ldquoA novelmulti-class support vectormachine based on fuzzy theoriesrdquo in Intelligent ComputingInternational Conference on Intelligent Computing Part I (ICICrsquo06) D S Huang K Li and G W Irwin Eds vol 4113 ofLecture Notes in Computer Science pp 42ndash50 Springer BerlinGermany

[10] Y Aksu D J Miller G Kesidis and Q X Yang ldquoMargin-maximizing feature elimination methods for linear and nonlin-ear kernel-based discriminant functionsrdquo IEEE Transactions onNeural Networks vol 21 no 5 pp 701ndash717 2010

[11] P Pudil J Novovicova and J Kittler ldquoFloating search methodsin feature selectionrdquo Pattern Recognition Letters vol 15 no 11pp 1119ndash1125 1994

10 The Scientific World Journal

[12] I Guyon J Weston S Barnhill and V Vapnik ldquoGene selec-tion for cancer classification using support vector machinesrdquoMachine Learning vol 46 no 1ndash3 pp 389ndash422 2002

[13] S Harikrishna M A H Farquad and Shabana ldquoCredit scoringusing support vector machine a comparative analysisrdquo inAdvanced Materials Research Trans Tech Publications ZurichSwitzerland 2012

[14] X Lin F Yang L Zhou et al ldquoA support vector machine-recursive feature elimination feature selection method basedon artificial contrast variables andmutual informationrdquo Journalof Chromatography B Analytical Technologies in the Biomedicaland Life Sciences vol 10 pp 149ndash155 2012

[15] R Zhang and M Jianwen ldquoFeature selection for hyperspectraldata based on recursive support vector machinesrdquo InternationalJournal of Remote Sensing vol 30 no 14 pp 3669ndash3677 2009

[16] Z X Xie Q H Hu and D R Yu ldquoFuzzy output supportvector machines for classificationrdquo in Advances in NaturalComputation L Wang K Chen and Y S Ong Eds vol 3612pp 1190ndash1197 Springer Berlin Germany

[17] Y Liu Z You and L Cao ldquoA novel and quick SVM-basedmulti-class classifierrdquo Pattern Recognition vol 39 no 11 pp 2258ndash2264 2006

[18] J Platt N C Cristianini and J Shawe-Taylor ldquoLarge marginDAGs for multiclass classificationrdquo in Advances in NeuralInformation Processing Systems S A Solla T K Leen and KR Muller Eds vol 12 pp 547ndash553 2000

[19] Y Xu S Zomer and R G Brereton ldquoSupport vector machinesa recent method for classification in chemometricsrdquo CriticalReviews in Analytical Chemistry vol 36 no 3-4 pp 177ndash1882006

[20] M L Huang Y H Hung and E J Lin ldquoEffects of SVMparameter optimization based on the parameter design ofTaguchi methodrdquo International Journal on Artificial IntelligenceTools vol 20 no 3 pp 563ndash575 2011

[21] H-C Lin C-T Su C-C Wang B-H Chang and R-CJuang ldquoParameter optimization of continuous sputtering pro-cess based on Taguchi methods neural networks desirabilityfunction and genetic algorithmsrdquo Expert Systems with Applica-tions vol 39 no 17 pp 12918ndash12925 2012

[22] Y Mao D Pi Y Liu and Y Sun ldquoAccelerated recursive featureelimination based on support vector machine for key variableidentificationrdquo Chinese Journal of Chemical Engineering vol 14no 1 pp 65ndash72 2006

[23] A Pal and J Maiti ldquoDevelopment of a hybrid methodology fordimensionality reduction inMahalanobis-Taguchi systemusingMahalanobis distance and binary particle swarm optimizationrdquoExpert Systems with Applications vol 37 no 2 pp 1286ndash12932010

[24] C-T Su and Y-H Hsiao ldquoMulticlass MTS for simultane-ous feature selection and classificationrdquo IEEE Transactions onKnowledge and Data Engineering vol 21 no 2 pp 192ndash2052009

[25] X Lin F Yang L Zhou et al ldquoA support vector machine-recursive feature elimination feature selectionmethod based onartificial contrast variables and mutual informationrdquo Journal ofChromatography B vol 910 pp 149ndash155 2012

[26] E Hullermeier and S Vanderlooy ldquoCombining predictions inpairwise classification an optimal adaptive voting strategy andits relation to weighted votingrdquo Pattern Recognition vol 43 no1 pp 128ndash142 2010

[27] L Bottou C Cortes J Denker et al ldquoComparison of classifiermethodsmdasha case study in handwritten digit recognitionrdquo in

Proceedings of the 12th Iapr International Conference on PatternRecognition vol 2 pp 77ndash82 IEEEComputer Society Press LosAlamitos Calif USA 1994

[28] J Furnkranz ldquoRound robin rule learningrdquo in Proceedings of the18th International Conference on Machine Learning (ICML 01)pp 146ndash153 2001

[29] M R Sohrabi S Jamshidi and A Esmaeilifar ldquoCloud pointextraction for determination of Diazinon optimization of theeffective parameters using Taguchi methodrdquoChemometrics andIntelligent Laboratory Systems vol 110 no 1 pp 49ndash54 2012

[30] W C Hsu and T Y Yu ldquoSupport vector machines parameterselection based on combined taguchi method and staelinmethod for e-mail spam filteringrdquo International Journal ofEngineering and Technology Innovation vol 2 no 2 pp 113ndash1252012

[31] J Arenas-Garcıa and F Perez-Cruz ldquoMulti-class support vectormachines A new approachrdquo in Proceeding of the IEEE Interna-tional Conference on Accoustics Speech and Signal Processing(ICASSP 03) vol 2 pp 781ndash784 April 2003

[32] K G Srinivasa K R Venugopal and L M Patnaik ldquoFeatureextraction using fuzzy c-means clustering for data mining sys-temsrdquo International Journal of Computer Science and NetworkSecurity vol 6 no 3A pp 230ndash236 2006

[33] Y Ren H Liu C Xue X YaoM Liu and B Fan ldquoClassificationstudy of skin sensitizers based on support vector machine andlinear discriminant analysisrdquo Analytica Chimica Acta vol 572no 2 pp 272ndash282 2006

[34] ZHe Farthest-point heuristic based initializationmethods for K-modes clustering [thesis] Department of Computer Science andEngineering Harbin Institute of Technology Harbin China2006

[35] SGolzari SDoraisamyMN Sulaiman andN IUdzir ldquoEffectof fuzzy resource allocation method on AIRS classifier accu-racyrdquo Journal ofTheoretical andApplied Information Technologyvol 5 no 1 pp 18ndash24 2009

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 10: Research Article SVM-RFE Based Feature Selection and ...downloads.hindawi.com/journals/tswj/2014/795624.pdf · SVM-RFE Based Feature Selection and Taguchi Parameters Optimization

10 The Scientific World Journal

[12] I Guyon J Weston S Barnhill and V Vapnik ldquoGene selec-tion for cancer classification using support vector machinesrdquoMachine Learning vol 46 no 1ndash3 pp 389ndash422 2002

[13] S Harikrishna M A H Farquad and Shabana ldquoCredit scoringusing support vector machine a comparative analysisrdquo inAdvanced Materials Research Trans Tech Publications ZurichSwitzerland 2012

[14] X Lin F Yang L Zhou et al ldquoA support vector machine-recursive feature elimination feature selection method basedon artificial contrast variables andmutual informationrdquo Journalof Chromatography B Analytical Technologies in the Biomedicaland Life Sciences vol 10 pp 149ndash155 2012

[15] R Zhang and M Jianwen ldquoFeature selection for hyperspectraldata based on recursive support vector machinesrdquo InternationalJournal of Remote Sensing vol 30 no 14 pp 3669ndash3677 2009

[16] Z X Xie Q H Hu and D R Yu ldquoFuzzy output supportvector machines for classificationrdquo in Advances in NaturalComputation L Wang K Chen and Y S Ong Eds vol 3612pp 1190ndash1197 Springer Berlin Germany

[17] Y. Liu, Z. You, and L. Cao, "A novel and quick SVM-based multi-class classifier," Pattern Recognition, vol. 39, no. 11, pp. 2258–2264, 2006.

[18] J. Platt, N. Cristianini, and J. Shawe-Taylor, "Large margin DAGs for multiclass classification," in Advances in Neural Information Processing Systems, S. A. Solla, T. K. Leen, and K. R. Müller, Eds., vol. 12, pp. 547–553, 2000.

[19] Y. Xu, S. Zomer, and R. G. Brereton, "Support vector machines: a recent method for classification in chemometrics," Critical Reviews in Analytical Chemistry, vol. 36, no. 3-4, pp. 177–188, 2006.

[20] M. L. Huang, Y. H. Hung, and E. J. Lin, "Effects of SVM parameter optimization based on the parameter design of Taguchi method," International Journal on Artificial Intelligence Tools, vol. 20, no. 3, pp. 563–575, 2011.

[21] H.-C. Lin, C.-T. Su, C.-C. Wang, B.-H. Chang, and R.-C. Juang, "Parameter optimization of continuous sputtering process based on Taguchi methods, neural networks, desirability function and genetic algorithms," Expert Systems with Applications, vol. 39, no. 17, pp. 12918–12925, 2012.

[22] Y. Mao, D. Pi, Y. Liu, and Y. Sun, "Accelerated recursive feature elimination based on support vector machine for key variable identification," Chinese Journal of Chemical Engineering, vol. 14, no. 1, pp. 65–72, 2006.

[23] A. Pal and J. Maiti, "Development of a hybrid methodology for dimensionality reduction in Mahalanobis-Taguchi system using Mahalanobis distance and binary particle swarm optimization," Expert Systems with Applications, vol. 37, no. 2, pp. 1286–1293, 2010.

[24] C.-T. Su and Y.-H. Hsiao, "Multiclass MTS for simultaneous feature selection and classification," IEEE Transactions on Knowledge and Data Engineering, vol. 21, no. 2, pp. 192–205, 2009.

[25] X. Lin, F. Yang, L. Zhou et al., "A support vector machine-recursive feature elimination feature selection method based on artificial contrast variables and mutual information," Journal of Chromatography B, vol. 910, pp. 149–155, 2012.

[26] E. Hüllermeier and S. Vanderlooy, "Combining predictions in pairwise classification: an optimal adaptive voting strategy and its relation to weighted voting," Pattern Recognition, vol. 43, no. 1, pp. 128–142, 2010.

[27] L. Bottou, C. Cortes, J. Denker et al., "Comparison of classifier methods: a case study in handwritten digit recognition," in Proceedings of the 12th IAPR International Conference on Pattern Recognition, vol. 2, pp. 77–82, IEEE Computer Society Press, Los Alamitos, Calif, USA, 1994.

[28] J. Fürnkranz, "Round robin rule learning," in Proceedings of the 18th International Conference on Machine Learning (ICML '01), pp. 146–153, 2001.

[29] M. R. Sohrabi, S. Jamshidi, and A. Esmaeilifar, "Cloud point extraction for determination of Diazinon: optimization of the effective parameters using Taguchi method," Chemometrics and Intelligent Laboratory Systems, vol. 110, no. 1, pp. 49–54, 2012.

[30] W. C. Hsu and T. Y. Yu, "Support vector machines parameter selection based on combined Taguchi method and Staelin method for e-mail spam filtering," International Journal of Engineering and Technology Innovation, vol. 2, no. 2, pp. 113–125, 2012.

[31] J. Arenas-García and F. Pérez-Cruz, "Multi-class support vector machines: a new approach," in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '03), vol. 2, pp. 781–784, April 2003.

[32] K. G. Srinivasa, K. R. Venugopal, and L. M. Patnaik, "Feature extraction using fuzzy c-means clustering for data mining systems," International Journal of Computer Science and Network Security, vol. 6, no. 3A, pp. 230–236, 2006.

[33] Y. Ren, H. Liu, C. Xue, X. Yao, M. Liu, and B. Fan, "Classification study of skin sensitizers based on support vector machine and linear discriminant analysis," Analytica Chimica Acta, vol. 572, no. 2, pp. 272–282, 2006.

[34] Z. He, Farthest-point heuristic based initialization methods for K-modes clustering [thesis], Department of Computer Science and Engineering, Harbin Institute of Technology, Harbin, China, 2006.

[35] S. Golzari, S. Doraisamy, M. N. Sulaiman, and N. I. Udzir, "Effect of fuzzy resource allocation method on AIRS classifier accuracy," Journal of Theoretical and Applied Information Technology, vol. 5, no. 1, pp. 18–24, 2009.
