Applied Soft Computing 14 (2014) 99–108

Boosted SVM for extracting rules from imbalanced data in application to prediction of the post-operative life expectancy in the lung cancer patients

Maciej Zieba, Jakub M. Tomczak, Marek Lubicz, Jerzy Swiatek

Faculty of Computer Science and Management, Wroclaw University of Technology, Wybrzeze Wyspianskiego 27, 50-370 Wroclaw, Poland

Corresponding author. Tel.: +48 71 320 44 53. E-mail addresses: [email protected] (M. Zieba), [email protected] (J.M. Tomczak), [email protected] (M. Lubicz), [email protected] (J. Swiatek).

1568-4946/$ - see front matter © 2013 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.asoc.2013.07.016

Article info

Article history:
Received 27 March 2013
Received in revised form 12 July 2013
Accepted 22 July 2013
Available online 6 September 2013

    Keywords:

Imbalanced data
Boosted SVM
Decision rules
Post-operative life expectancy prediction

Abstract

In this paper, we present a boosted SVM dedicated to solving imbalanced data problems. The proposed solution combines the benefits of using ensemble classifiers for uneven data with cost-sensitive support vector machines. Further, we present an oracle-based approach for extracting decision rules from the boosted SVM. In the next step we examine the quality of the proposed method by comparing its performance with other algorithms which deal with imbalanced data. Finally, the boosted SVM is used for the medical application of predicting post-operative life expectancy in lung cancer patients.

© 2013 Elsevier B.V. All rights reserved.

1. Introduction

The main difficulty in learning classification models is the character of data. Usually, raw data gathered from many sources cannot be used directly in the training process due to various circumstances, e.g., missing values of some attributes [1], the sequential nature of delivering data [2], or disproportions in the class distribution [3]. In the literature, the third issue is known as the imbalanced data problem. In general, each dataset with an unequal class distribution can be considered imbalanced. In practice, the problem of disproportion between classes occurs when a classifier trained with typical methods has a tendency to make decisions biased toward the majority class. An extreme imbalanced data problem is observed when a classifier trained using traditional methods assigns all objects to the majority class, independently of the vector of features. Applying techniques which deal with the problem of unequally distributed data is therefore essential for learning decision models with high predictive accuracy.

The problem of imbalanced data is widely observed in medical decision making, particularly in the post-operative risk evaluation domain. Considering the short period of planning (1 year), the


number of patients surviving the assumed interval is often significantly higher than the number of deaths. Moreover, the misclassification of treating deaths as survivals is much more troublesome than a decision mistake made in the opposite direction.

The second important issue connected to the problem of post-operative life expectancy prediction is the interpretability of the decision model. Using so-called black box models in the considered application is not recommended. This is mainly caused by the patients' fear of being treated by machines with a hidden and difficult-to-understand inference process, and by distrust among doctors of being supported by vaguely-working models. To dissipate these doubts it is necessary to propose a method to extract knowledge in the form of decision rules or trees from black box models.

The main goal of this paper is to propose a general decision model which obtains high predictive accuracy in the case of imbalanced data, and to use this model for the extraction of interpretable knowledge in the form of decision rules. The core of our proposition is a generalization of the learning task for SVM by introducing individual penalty parameters for each example, and then the application of an AdaBoost-like algorithm to learn an ensemble of SVMs. We call this approach boosted SVM for imbalanced data. At the end, we apply the boosted SVM as an oracle to re-label examples and use the new dataset for rule induction in order to obtain an interpretable model.
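This oracle-based pipeline can be prototyped with standard components. The sketch below is our illustration under stated assumptions (recent scikit-learn, and a shallow decision tree standing in for a rule inducer such as RIPPER [38]), not the authors' implementation:

    from sklearn.ensemble import AdaBoostClassifier
    from sklearn.svm import SVC
    from sklearn.tree import DecisionTreeClassifier, export_text

    def extract_rules_via_oracle(X, y):
        # 1. Train a black-box ensemble (boosting over SVMs).
        oracle = AdaBoostClassifier(
            estimator=SVC(kernel="rbf", probability=True), n_estimators=10
        ).fit(X, y)
        # 2. Re-label the training examples with the oracle's predictions.
        y_oracle = oracle.predict(X)
        # 3. Induce an interpretable model on the re-labelled data.
        rules = DecisionTreeClassifier(max_depth=3).fit(X, y_oracle)
        return export_text(rules)  # human-readable rule listing

The interpretable model is fitted to the oracle's labels rather than the ground truth, so its rules approximate the ensemble's decision boundary while remaining human-readable.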

Additionally, we aim at using thoracic surgery clinical diagnostics and treatment issues as an important application of the boosted SVM proposed in this paper. Our approach may contribute to current clinical treatment twofold: first, to support decisions on patient selection for surgery made by clinicians; second, to identify the cases of higher risk of a patient's death after surgery by extracting decision rules.

The paper is organized as follows. In Section 2 we review approaches that are applied to imbalanced data and solutions used to extract rules from uninterpretable models. In Section 3 we present the ensemble SVM classifier (boosted SVM) for imbalanced data and describe the method of extracting decision rules from the model. Section 4 contains the results of an experiment showing the quality of the proposed approach for imbalanced data and presents post-operative life expectancy rules extracted using boosted SVM. The paper is briefly summarized in Section 5.

2. Related work

Typical methods used to learn classifiers are biased toward the majority class if they are trained with imbalanced data. Various techniques are used to deal with the stated problem, and they can be divided into three categories [3,4]:

• Data level (external) approaches, which use oversampling and undersampling techniques to equalize the number of instances from the classes.
• Algorithm level (internal) approaches, which incorporate balancing techniques into the training process.
• Cost-sensitive solutions, which use both data level transformations (by adding costs to instances) together with algorithm level modifications (by modifying the learning process to accept costs).

The first group of methods operates independently of the training procedure by modifying the class distribution, either by generating artificial samples or by eliminating non-informative examples. The most popular and commonly used sampling method is SMOTE (Synthetic Minority Over-sampling TEchnique) [5], which generates artificial examples situated on the path connecting two neighbours from the minority class. Borderline-SMOTE is an extension of SMOTE which incorporates in the sampling procedure only the minority data points located in the neighbourhood of the separating hyperplane [6]. On the other hand, non-informative majority examples can be eliminated to obtain more balanced data in the undersampling procedure. This type of method is mainly used as a component of more sophisticated, internal approaches, and detection of non-informative examples is usually made by random selection, using the K-NN algorithm [7], or with evolutionary algorithms [8].
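To make the interpolation concrete, here is a minimal SMOTE-style sketch (our illustration, not the reference implementation) that creates one synthetic example per minority point:

    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    def smote_like(X_min, k=5, rng=np.random.default_rng(0)):
        """Interpolate each minority point toward a random one of its
        k minority-class neighbours."""
        nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
        _, idx = nn.kneighbors(X_min)       # idx[:, 0] is the point itself
        synthetic = []
        for i, neigh in enumerate(idx):
            j = rng.choice(neigh[1:])       # pick one of the k neighbours
            gap = rng.random()              # random position on the segment
            synthetic.append(X_min[i] + gap * (X_min[j] - X_min[i]))
        return np.asarray(synthetic)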

The second group of methods solves the problem of uneven data directly during the training phase, and it includes ensemble-based approaches which make use of under- and oversampling methods to construct diverse and balanced base classifiers. The most common approach in this group is SMOTEBoost [9], which combines the benefits of boosting with a multiple sampling procedure using SMOTE. Alternatively, synthetic data sampling can be included in the process of constructing a bagging-based ensemble [10]. Some ensemble solutions make use of undersampling techniques to construct class-unbiased base learners [11,12]. A schematic sketch of the SMOTEBoost idea is given below.
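This sketch is ours and deliberately simplified (the true SMOTEBoost folds the synthetic examples into the boosting distribution itself); it assumes the imbalanced-learn package for SMOTE:

    import numpy as np
    from imblearn.over_sampling import SMOTE          # assumed installed
    from sklearn.tree import DecisionTreeClassifier

    def smoteboost_sketch(X, y, rounds=5, seed=0):
        """Boosting with per-round SMOTE oversampling (simplified)."""
        w = np.full(len(y), 1.0 / len(y))             # example weights
        models, alphas = [], []
        for r in range(rounds):
            Xs, ys = SMOTE(random_state=seed + r).fit_resample(X, y)
            h = DecisionTreeClassifier(max_depth=1).fit(Xs, ys)
            miss = h.predict(X) != y
            err = w[miss].sum()
            if err >= 0.5:                            # weak-learner check
                break
            alpha = 0.5 * np.log((1.0 - err) / max(err, 1e-12))
            w = w * np.exp(alpha * miss)              # upweight mistakes
            w /= w.sum()
            models.append(h)
            alphas.append(alpha)
        return models, alphas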

Another important group of internal methods uses granular computing techniques to solve the problem of uneven data [13-16]. The main feature of such approaches is a knowledge-oriented decomposition of the main problem into parallel, balanced sub-problems, named information granules. Active learning solutions are also used to deal with the imbalanced data issue [17,18]. Originally, active learning was used to detect informative, unlabelled examples and query them about class values to create a representative training set. Application of active learning techniques for imbalanced data is based on the assumption that the distribution of detected examples is significantly more equalized than the distribution of examples in the initial dataset. According to [17], the data points accumulated near the borderline tend to be more equalized than the points in the entire dataset.

The last group of methods assigns a weight to each element of the training set. In general, the process of weighting the examples is made to increase the significance of the minority observations at the expense of the majority class. A large set of cost-sensitive methods makes use of ensemble classifiers which update the weights while constructing base learners. As a consequence, the weights related to minority examples are updated more strongly than those of majority examples if they are misclassified by the already constructed base classifiers. In this group we can distinguish mainly boosting-based approaches such as AdaCost [19], CSB1 and CSB2 [20], RareBoost [21], and AdaC1, AdaC2, AdaC3 [22].
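As a generic illustration of this family (a sketch of the shared pattern, not any one of the cited algorithms), a single boosting round can scale the usual exponential weight update by a per-example cost, so that misclassified minority examples gain weight faster:

    import numpy as np

    def cost_weight_update(w, y, y_pred, alpha, cost):
        """One AdaBoost-style round with per-example costs.
        w: current example weights; y, y_pred in {-1, +1};
        alpha: classifier weight; cost: > 1 for minority examples."""
        # Misclassified examples are scaled by exp(alpha * cost).
        w = w * np.exp(alpha * cost * (y != y_pred))
        return w / w.sum()  # renormalize to a distribution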

Cost-sensitive techniques are widely applied together with Support Vector Machines (SVM). Different misclassification costs for the classes are considered in the learning criterion to achieve a soft margin unbiased toward the majority class. The approaches differ in the manner of including the cost values in the penalization term of the learning criterion to construct a balanced SVM [23,24]. An interesting extension of typical cost-sensitive approaches is SVM with boosting, proposed in [25].
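Such class-dependent penalties are available off the shelf; for instance, scikit-learn's SVC exposes a class_weight argument that rescales the penalty C per class (a minimal sketch with illustrative parameters):

    from sklearn.datasets import make_classification
    from sklearn.svm import SVC

    # Synthetic 9:1 imbalanced problem, for illustration only.
    X, y = make_classification(n_samples=500, weights=[0.9], random_state=0)

    # class_weight='balanced' sets each class's penalty inversely
    # proportional to its frequency, i.e., a larger C for the minority.
    clf = SVC(kernel="rbf", C=1.0, class_weight="balanced").fit(X, y)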

In real-life applications, there are two major tasks to be solved: (i) prediction based on properties learned from data, and (ii) discovery of unknown properties from data. In the field of machine learning there are several successful approaches that achieve high predictive accuracy, e.g., neural networks and SVM. On the other hand, there are methods that allow one to understand the considered phenomenon and that are easily interpretable by a human, e.g., tree-based or rule-based models. Unfortunately, the methods with high predictive accuracy are hard to interpret, and the understandable models perform poorly in the prediction task. Therefore, there arises a need to combine high predictive performance with interpretability of the model.

In the literature, there are three main approaches to extracting understandable rules from uninterpretable models with high predictive accuracy [26]. The first one, called decompositional, focuses on extracting rules from the structure of the trained model, e.g., from neural networks [27] or the margin determined by SVM [28]. The second approach, called pedagogical, aims at generating examples from the trained model, or at re-labelling training examples according to the trained model, and then applying a method for rule extraction from data, e.g., using neural networks [29]. All other techniques which do not fall clearly into one of the above categories are called eclectic, e.g., rule extraction from SVM [30].

    3. Boosted SVM for imbalanced data

    3.1. SVM for imbalanced data

    3.1.1. Problem statement

Let x be an M-dimensional vector of inputs, x ∈ X, and y an output (or class label), y ∈ {-1, 1}. For given data S = {(x_n, y_n)}_{n=1}^{N}, we are interested in finding a discriminative hyperplane of the following form:

H : a^\top \phi(x) + b = 0    (1)


where a is a vector of weights, a ∈ R^M, φ(x) denotes a fixed feature-space transformation, and b is a bias, b ∈ R. The solution of the problem of determining the adaptive parameters of the discriminative hyperplane is known as Support Vector Machines (SVM) [31]. The typical learning task for SVM is stated as follows:

\min_{a,b} \; Q(a) = \frac{1}{2} a^\top a + C \sum_{n=1}^{N} \xi_n    (2)

s.t. \quad y_n (a^\top \phi(x_n) + b) \ge 1 - \xi_n \quad \text{for all } n = 1, \dots, N

where ξ_n are slack variables, such that ξ_n ≥ 0 for n = 1, ..., N, and C is the parameter that controls the trade-off between the slack variable penalty and the margin, C > 0.

In problems where the data is well-balanced, the SVM achieves high predictive accuracy. However, it may fail in the case of imbalanced datasets. One of the possible solutions for this problem is to include cost-sensitivity of the classes [32]. In other words, instead of one penalty parameter C there are two real-valued parameters C_{+} and C_{-}, for the minority and the majority class, respectively. (We use the following convention: the data points within the minority class are called positive examples, i.e., y = 1, while data with the majority class label, y = -1, are called negative examples.) It is implemented by transforming the learning task for SVM into:

\min_{a,b} \; Q(a) = \frac{1}{2} a^\top a + C_{+} \sum_{n \in N_{+}} \xi_n + C_{-} \sum_{n \in N_{-}} \xi_n    (3)

s.t. \quad y_n (a^\top \phi(x_n) + b) \ge 1 - \xi_n \quad \text{for all } n = 1, \dots, N

where N_{+} = {n ∈ {1, ..., N} : y_n = 1} and N_{-} = {n ∈ {1, ..., N} : y_n = -1}.

Further, we generalize the problem (3) by introducing an individual penalty parameter w_n for each nth observation, which yields:

\min_{a,b} \; Q(a) = \frac{1}{2} a^\top a + C \sum_{n=1}^{N} w_n \xi_n    (4)

s.t. \quad y_n (a^\top \phi(x_n) + b) \ge 1 - \xi_n \quad \text{for all } n = 1, \dots, N

The introduction of w_n allows us to diversify the costs of misclassification not only between classes but also within classes. Further, we will use the penalty parameters w_n in the boosting procedure during the construction of the classifiers. Because we have maintained the trade-off parameter C, the individual penalty parameters w_n have to fulfill the following condition:

\sum_{n=1}^{N} w_n = N    (5)

Fig. 1. Optimal hyperplane for SVM trained on an imbalanced dataset: (a) using the typical formulation (2); (b) using the cost-sensitive formulation (4).

For the primal optimization problem (4) we get the following dual optimization problem:

\min_{\lambda} \; Q(\lambda) = -\sum_{n=1}^{N} \lambda_n + \frac{1}{2} \sum_{i,j=1}^{N} \lambda_i \lambda_j y_i y_j k(x_i, x_j)    (6)

s.t. \quad 0 \le \lambda_n \le C w_n \text{ for all } n = 1, \dots, N, \qquad \sum_{n=1}^{N} \lambda_n y_n = 0

where λ is the vector of Lagrange multipliers and k(x_i, x_j) = φ(x_i)^T φ(x_j) is the kernel function. The solution of the quadratic optimization task in the form (6) is optimal iff the matrix with the elements y_i y_j k(x_i, x_j) is semipositive-definite and for all n = 1, ..., N the Kuhn-Tucker conditions are fulfilled:

\lambda_n = 0 \;\Rightarrow\; y_n (a^\top \phi(x_n) + b) \ge 1,
0 < \lambda_n < C w_n \;\Rightarrow\; y_n (a^\top \phi(x_n) + b) = 1,    (7)
\lambda_n = C w_n \;\Rightarrow\; y_n (a^\top \phi(x_n) + b) \le 1.

In practice, only some of the Lagrange multipliers λ_n will be nonzero, and all examples for which λ_n ≠ 0 are called support vectors. Finally, the classification function is expressed by:

\hat{y}(x) = \operatorname{sign}\left( \sum_{n \in SV} \lambda_n y_n k(x_n, x) + b \right)    (8)

where SV denotes the set of indices of the support vectors, and sign(a) is a function that returns 1 for a ≥ 0 and -1 otherwise.
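The per-example penalties C·w_n in (4) map directly onto the sample_weight argument of common SVM implementations, which rescales C for each training point. A minimal sketch with scikit-learn's SVC (the weighting scheme here is illustrative, not the boosting-derived weights used later in the paper):

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.svm import SVC

    # Imbalanced toy problem (roughly 9:1), for illustration only.
    X, y = make_classification(n_samples=400, weights=[0.9], random_state=1)

    # Individual penalties w_n: upweight minority examples, then rescale
    # so that sum(w_n) = N, mirroring condition (5).
    w = np.ones(len(y))
    w[y == 1] *= (y == 0).sum() / max((y == 1).sum(), 1)
    w *= len(y) / w.sum()

    # SVC multiplies C by sample_weight per example, i.e., C * w_n in (4).
    clf = SVC(kernel="rbf", C=1.0).fit(X, y, sample_weight=w)

    # Classification function (8): sklearn keeps lambda_n * y_n in
    # dual_coef_ and b in intercept_; decision_function evaluates it.
    print(clf.n_support_, clf.decision_function(X[:3]))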

The decision rules extracted with the oracle-based approach for the one-year survival target (Risk1Yr), with their coverage and accuracy, are the following:

Rule                                                                                Coverage  Accuracy
... => Risk1Yr = T                                                                    0.03      0.47
(PRE14 = OC14) => Risk1Yr = T                                                         0.04      0.41
(PRE17 = T) and (PRE30 = T) and (AGE >= 57) => Risk1Yr = T                            0.05      0.38
(PRE11 = T) and (PRE5 = 2.44) => Risk1Yr = T                                          0.05      0.35
(PRE9 = T) and (AGE >= 54) and (PRE5 <= 66.4) => Risk1Yr = T                          0.05      0.35
(PRE14 = OC13) => Risk1Yr = T                                                         0.04      0.32
(DGN = DGN2) and (PRE30 = T) and (PRE14 = OC12) and (PRE5 <= 3.72) => Risk1Yr = T     0.04      0.30
(PRE8 = T) and (PRE30 = T) and (PRE4 <= 3.52) => Risk1Yr = T                          0.08      0.26
OTHERWISE => Risk1Yr = F                                                              0.62      0.97

To assess the extracted rules, coverage and accuracy measures are typically used. The coverage measure determines the percentage of examples covered by a rule on average, while the accuracy measure expresses the relative frequency of examples satisfying the complete rule among those satisfying only the antecedent. The accuracy for rules associated with the minority class was from 0.26 to 0.47. Each of the rules covered from 3% to 8% of the training data. The remaining subspace of features covered 62% of the examples, with an accuracy of 97% for the majority class.
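Both measures are straightforward to compute from boolean masks; a minimal sketch (our illustration) for a single rule over N training examples:

    import numpy as np

    def rule_quality(antecedent, hits, N):
        """Coverage and accuracy of one rule.
        antecedent: boolean mask of examples satisfying the rule's conditions;
        hits: boolean mask of examples whose label matches the rule's class."""
        covered = antecedent.sum()
        coverage = covered / N                        # share of data covered
        accuracy = (antecedent & hits).sum() / max(covered, 1)
        return coverage, accuracy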

The results show that it is possible to identify cases of higher risk of a patient's death after surgery by applying the oracle-based approach combined with the boosted SVM method. The extracted rules, together with their coverage and accuracy values, give important information about patients requiring special treatment due to a high risk of death. It is also important to highlight that patients not covered by the minority rules will survive the considered period in 97% of cases.

    5. Conclusions

In this paper, we have proposed a novel boosted SVM method for the imbalanced data problem, which was further used for rule extraction. We have evaluated the quality of the proposed approach by comparing it with other solutions dedicated to the imbalanced data problem. Next, we have used the proposed method to solve the problem of predicting the post-operative life expectancy in lung cancer patients. We have shown that our approach can be successfully applied to this problem by making an additional experimental comparison on a real-life dataset. Finally, we extracted decision rules using the oracle-based approach.

    Acknowledgements

The research by Maciej Zieba was co-financed by the European Union as part of the European Social Fund. The research by Marek Lubicz was partly financed by the National Science Centre under the grant N N115 090939 "Models and Decisions in the Health Systems. Application of Operational Research and Information Technologies for Supporting Managerial Decisions in the Health Systems."

    References

[1] P.J. García-Laencina, J.L. Sancho-Gómez, A.R. Figueiras-Vidal, Pattern classification with missing data: a review, Neural Computing and Applications 19 (2009) 263–282.

[2] T. Dietterich, Machine learning for sequential data: a review, in: T. Caelli, A. Amin, R.P.W. Duin, M. Kamel, D. de Ridder (Eds.), Structural, Syntactic, and Statistical Pattern Recognition, Springer, Berlin Heidelberg, 2002, pp. 227–246.

[3] H. He, E.A. Garcia, Learning from imbalanced data, IEEE Transactions on Knowledge and Data Engineering 21 (2009) 1263–1284.

[4] M. Galar, A. Fernández, E. Barrenechea, H. Bustince, F. Herrera, A review on ensembles for the class imbalance problem: bagging-, boosting-, and hybrid-based approaches, IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews 42 (2012) 463–484.

[5] N.V. Chawla, K.W. Bowyer, L.O. Hall, W.P. Kegelmeyer, SMOTE: Synthetic Minority Over-sampling TEchnique, Journal of Artificial Intelligence Research 16 (2002) 321–357.

[6] H. Han, W.-Y. Wang, B.-H. Mao, Borderline-SMOTE: a new over-sampling method in imbalanced data sets learning, in: D.-S. Huang, X.-P. Zhang, G.-B. Huang (Eds.), Advances in Intelligent Computing, 2005, pp. 878–887.

[7] J. Zhang, I. Mani, KNN approach to unbalanced data distributions: a case study involving information extraction, in: Proceedings of the International Conference on Machine Learning (ICML 2003), Workshop on Learning from Imbalanced Data Sets, 2003.

[8] S. García, A. Fernández, F. Herrera, Enhancing the effectiveness and interpretability of decision tree and rule induction classifiers with evolutionary training set selection over imbalanced problems, Applied Soft Computing 9 (2009) 1304–1314.

[9] N. Chawla, A. Lazarevic, L. Hall, K. Bowyer, SMOTEBoost: improving prediction of the minority class in boosting, in: N. Lavrac, D. Gamberger, H. Blockeel, L. Todorovski (Eds.), Knowledge Discovery in Databases: PKDD 2003, Springer, Berlin Heidelberg, 2003, pp. 107–119.

[10] S. Wang, X. Yao, Diversity analysis on imbalanced data sets by using ensemble models, in: 2009 IEEE Symposium on Computational Intelligence and Data Mining Proceedings, 2009, pp. 324–331.

[11] E. Chang, B. Li, G. Wu, K. Goh, Statistical learning for effective visual information retrieval, in: IEEE Proceedings of the 2003 International Conference on Image Processing, vol. 3, 2003, pp. 609–613.

[12] D. Tao, X. Tang, X. Li, X. Wu, Asymmetric bagging and random subspace for support vector machines-based relevance feedback in image retrieval, IEEE Transactions on Pattern Analysis and Machine Intelligence 28 (2006) 1088–1099.

[13] Y. Tang, B. Jin, Y. Zhang, Granular support vector machines with association rules mining for protein homology prediction, Artificial Intelligence in Medicine 35 (2005) 121–134.

[14] Y. Tang, B. Jin, Y. Zhang, H. Fang, B. Wang, Granular support vector machines using linear decision hyperplanes for fast medical binary classification, in: Proceedings of the 14th IEEE International Conference on Fuzzy Systems, 2005, pp. 138–142.

[15] Y. Tang, Y. Zhang, Granular SVM with repetitive undersampling for highly imbalanced protein homology prediction, in: Y.-Q. Zhang, T.Y. Lin (Eds.), 2006 IEEE International Conference on Granular Computing, 2006, pp. 457–460.

[16] Y. Tang, Y. Zhang, Z. Huang, Development of two-stage SVM-RFE gene selection strategy for microarray expression data analysis, IEEE/ACM Transactions on Computational Biology and Bioinformatics 4 (2007) 365–381.

[17] S. Ertekin, J. Huang, L. Bottou, L. Giles, Learning on the border: active learning in imbalanced data classification, in: Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, ACM, New York, 2007, pp. 127–136.

[18] S. Ertekin, J. Huang, C. Giles, Active learning for class imbalance problem, in: Proceedings of the Sixteenth ACM Conference on Information and Knowledge Management, ACM, 2007, pp. 823–824.

[19] W. Fan, S. Stolfo, J. Zhang, P. Chan, AdaCost: misclassification cost-sensitive boosting, in: I. Bratko, S. Dzeroski (Eds.), Proceedings of the Sixteenth International Conference on Machine Learning (ICML 1999), Morgan Kaufmann, 1999, pp. 97–105.

[20] K. Ting, A comparative study of cost-sensitive boosting algorithms, in: P. Langley (Ed.), Proceedings of the Seventeenth International Conference on Machine Learning (ICML 2000), Morgan Kaufmann, 2000, pp. 983–990.

[21] M. Joshi, V. Kumar, R. Agarwal, Evaluating boosting algorithms to classify rare classes: comparison and improvements, in: N. Cercone, T.Y. Lin, X. Wu (Eds.), Proceedings of the 2001 IEEE International Conference on Data Mining, IEEE, Los Alamitos, 2001, pp. 257–264.

[22] Y. Sun, M. Kamel, A. Wong, Y. Wang, Cost-sensitive boosting for classification of imbalanced data, Pattern Recognition 40 (2007) 3358–3378.

[23] K. Morik, P. Brockhausen, T. Joachims, Combining statistical learning with a knowledge-based approach: a case study in intensive care monitoring, in: I. Bratko, S. Dzeroski (Eds.), Proceedings of the Sixteenth International Conference on Machine Learning (ICML 1999), Morgan Kaufmann, 1999, pp. 268–277.


[24] K. Veropoulos, C. Campbell, N. Cristianini, Controlling the sensitivity of support vector machines, in: Proceedings of the 16th International Joint Conference on Artificial Intelligence (IJCAI 1999), Workshop ML3, 1999, pp. 55–60.

[25] B. Wang, N. Japkowicz, Boosting support vector machines for imbalanced data sets, Knowledge and Information Systems 25 (2010) 1–20.

[26] A. Tickle, R. Andrews, M. Golea, J. Diederich, The truth will come to light: directions and challenges in extracting the knowledge embedded within trained artificial neural networks, IEEE Transactions on Neural Networks 9 (1998) 1057–1068.

[27] J. Chorowski, J.M. Zurada, Extracting rules from neural networks as decision diagrams, IEEE Transactions on Neural Networks 22 (2011) 2435–2446.

[28] H. Núñez, C. Angulo, A. Català, Rule extraction from support vector machines, in: Proceedings of the European Symposium on Artificial Neural Networks, 2002, pp. 107–112.

[29] M. Craven, J. Shavlik, Rule extraction: where do we go from here?, Technical Report, University of Wisconsin Machine Learning Research Group Working Paper, 1999.

[30] N. Barakat, J. Diederich, Eclectic rule-extraction from support vector machines, International Journal of Computational Intelligence 2 (2005) 59–62.

[31] I. Steinwart, A. Christmann, Support Vector Machines, Springer, New York, 2008.

[32] H. Masnadi-Shirazi, N. Vasconcelos, Risk minimization, probability elicitation, and cost-sensitive SVMs, in: J. Fürnkranz, T. Joachims (Eds.), Proceedings of the 27th International Conference on Machine Learning (ICML-10), Omnipress, 2010, pp. 204–213.

    [33] V. Vapnik, Statistical Learning Theory,John Wiley and Sons, Inc., New York, 1998.

[34] E. Osuna, R. Freund, F. Girosi, An improved training algorithm for support vector machines, in: J. Principe, L. Giles, N. Morgan, E. Wilson (Eds.), Neural Networks for Signal Processing VII, Proceedings of the 1997 IEEE Signal Processing Workshop, IEEE, 1997, pp. 276–285.

[35] J. Platt, Fast training of support vector machines using sequential minimal optimization, in: B. Schölkopf, C.J.C. Burges, A.J. Smola (Eds.), Advances in Kernel Methods: Support Vector Learning, MIT Press, Cambridge, 1999.

[36] S. Keerthi, S. Shevade, C. Bhattacharyya, K. Murthy, Improvements to Platt's SMO algorithm for SVM classifier design, Neural Computation 13 (2001) 637–649.

[37] Y. Freund, R.E. Schapire, Experiments with a new boosting algorithm, in: L. Saitta (Ed.), Machine Learning: Proceedings of the Thirteenth International Conference (ICML 1996), Morgan Kaufmann, 1996.

[38] W.W. Cohen, Fast effective rule induction, in: A. Prieditis, S. Russell (Eds.), Proceedings of the 12th International Conference on Machine Learning, Morgan Kaufmann, Tahoe City, CA, 1995, pp. 115–123.

[39] J. Alcalá-Fdez, A. Fernández, J. Luengo, J. Derrac, S. García, L. Sánchez, F. Herrera, KEEL data-mining software tool: data set repository, integration of algorithms and experimental analysis framework, Journal of Multiple-Valued Logic and Soft Computing 17 (2010) 255–287.

[40] A. Fernández, J. Luengo, J. Derrac, J. Alcalá-Fdez, F. Herrera, Implementation and integration of algorithms into the KEEL data-mining software tool, in: E. Corchado, H. Yin (Eds.), Intelligent Data Engineering and Automated Learning - IDEAL 2009, Springer, Berlin, 2009, pp. 562–569.

[41] Y. Tang, Y. Zhang, N. Chawla, S. Krasser, SVMs modeling for highly imbalanced classification, IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics 39 (2009) 281–288.

[42] C. Seiffert, T. Khoshgoftaar, J. Van Hulse, A. Napolitano, RUSBoost: a hybrid approach to alleviating class imbalance, IEEE Transactions on Systems, Man and Cybernetics, Part A: Systems and Humans 40 (2010) 185–197.

[43] M. Shapiro, S.J. Swanson, C.D. Wright, C. Chin, S. Sheng, J. Wisnivesky, T.S. Weiser, Predictors of major morbidity and mortality after pneumonectomy utilizing the Society of Thoracic Surgeons General Thoracic Surgery Database, Annals of Thoracic Surgery 90 (2010) 927–935.

[44] U. Aydogmus, L. Cansever, Y. Sonmezoglu, K. Karapinar, C.I. Kocaturk, M.A. Bedirhan, The impact of the type of resection on survival in patients with N1 non-small-cell lung cancers, European Journal of Cardio-Thoracic Surgery 37 (2010) 446–450.

[45] P. Icard, M. Heyndrickx, L. Guetti, F. Galateau-Salle, P. Rosat, J.P. Le Rochais, J.-L. Hanouz, Morbidity, mortality and survival after 110 consecutive bilobectomies over 12 years, Interactive Cardiovascular and Thoracic Surgery 16 (2013) 179–185.

[46] D. Shahian, F. Edwards, Statistical risk modeling and outcomes analysis, Annals of Thoracic Surgery 86 (2008) 1717–1720.

[47] R. Berrisford, A. Brunelli, G. Rocco, T. Treasure, M. Utley, The European Thoracic Surgery Database project: modelling the risk of in-hospital death following lung resection, European Journal of Cardio-Thoracic Surgery 28 (2005) 306–311.

[48] P.E. Falcoz, M. Conti, L. Brouchet, S. Chocron, M. Puyraveau, M. Mercier, J.P. Etievent, M. Dahan, The Thoracic Surgery Scoring System (Thoracoscore): risk model for in-hospital death in 15,183 patients requiring thoracic surgery, The Journal of Thoracic and Cardiovascular Surgery 133 (2007) 325–332.


[49] A. Barua, S.D. Handagala, L. Socci, B. Barua, M. Malik, N. Johnstone, A.E. Martin-Ucar, Accuracy of two scoring systems for risk stratification in thoracic surgery, Interactive Cardiovascular and Thoracic Surgery 14 (2012) 556–559.

[50] G. Rocco, eComment. Re: Accuracy of two scoring systems for risk stratification in thoracic surgery, Interactive Cardiovascular and Thoracic Surgery 14 (2012) 559.

[51] E. Rivo, J. de la Fuente, Á. Rivo, E. García-Fontán, M.-Á. Cañizares, P. Gil, Cross-industry standard process for data mining is applicable to the lung cancer surgery domain, improving decision making as well as knowledge and quality management, Clinical and Translational Oncology 14 (2012) 73–79.

[52] N. Voznuka, H. Granfeldt, A. Babic, M. Storm, U. Lönn, H.C. Ahn, Report generation and data mining in the domain of thoracic surgery, Journal of Medical Systems 28 (2004) 497–509.

[53] J. Dowie, M. Wildman, Choosing the surgical mortality threshold for high risk patients with stage Ia non-small cell lung cancer: insights from decision analysis, Thorax 57 (2002) 7–10.

[54] M.K. Ferguson, J. Siddique, T. Karrison, Modeling major lung resection outcomes using classification trees and multiple imputation techniques, European Journal of Cardio-Thoracic Surgery 34 (2008) 1085–1089.

[55] H. Esteva, T.G. Núñez, R.O. Rodríguez, Neural networks and artificial intelligence in thoracic surgery, Thoracic Surgery Clinics 17 (2007) 359–367.

[56] G. Santos-García, G. Varela, N. Novoa, M.F. Jiménez, Prediction of postoperative morbidity after lung resection using an artificial neural network ensemble, Artificial Intelligence in Medicine 30 (2004) 61–69.

[57] C.-Y. Lee, Z.-J. Lee, A novel algorithm applied to classify unbalanced data, Applied Soft Computing 12 (2012) 2481–2485.

[58] Y. Yang, J.O. Pedersen, A comparative study on feature selection in text categorization, in: D.H. Fisher (Ed.), Proceedings of the Fourteenth International Conference on Machine Learning (ICML 1997), Morgan Kaufmann, 1997, pp. 412–420.