Credit Rating Prediction Using Ant Colony Optimization


David Martens a,b Tony Van Gestel c,d Manu De Backer a

Raf Haesen a Jan Vanthienen a Bart Baesens e,a

a Department of Decision Sciences & Information Management, K.U.Leuven, Naamsestraat 69, B-3000 Leuven, Belgium

{David.Martens;Manu.DeBacker;Raf.Haesen;Jan.Vanthienen}@econ.kuleuven.be

b Department of Business Administration and Public Management, Hogeschool Gent, Voskenslaan 270, Ghent 9000, Belgium

[email protected]

c Risk Modelling, Group Risk Management, Dexia Group, Square Meeus 1, 1000 Brussel

[email protected]

d Department of Electrical Engineering, ESAT-SCD-SISTA, K.U.Leuven, Kasteelpark Arenberg 10, B-3001 Leuven (Heverlee), Belgium

e University of Southampton, School of Management, Highfield, Southampton, SO17 1BJ, United Kingdom

[email protected]

Abstract

The introduction of the Basel II Capital Accord has encouraged financial institutions to build internal rating systems assessing the credit risk of their various credit portfolios. One of the key outputs of an internal rating system is the probability of default (PD), which reflects the likelihood that a counterparty will default on his/her financial obligation. Since the PD modeling problem basically boils down to a discrimination problem (defaulter or not), one may rely on the myriad of classification techniques that have been suggested in the literature. However, since the credit risk models will be subject to supervisory review and evaluation, they must be easy to understand and transparent. Hence, techniques such as neural networks or support vector machines are less suitable due to their black box nature. Building upon previous research, we will use AntMiner+ to build internal rating systems for credit risk. AntMiner+ infers a propositional rule set from a given data set using the principles of Ant Colony Optimization. Experiments will be conducted using various types of credit data sets (retail, small and medium-sized enterprises (SMEs) and banks). It will be shown that the extracted rule sets are both powerful in terms of discriminatory power and comprehensible. Furthermore, a framework will be presented describing how AntMiner+ fits into a global Basel II credit risk management system.

Preprint submitted to Elsevier 29 October 2008

Key words: Ant Colony Optimization, Classification, Credit Scoring, Bankruptcy Prediction, Basel II

1 Introduction

Over the past decades, financial institutions have seen an ever growing need for quantitative analysis techniques to optimize and monitor decisions related to risk and investment management. The gradual adoption of data warehousing and knowledge discovery in data (KDD) technology is allowing these institutions to analyze ever larger amounts of data, using a range of powerful techniques from various disciplines such as conventional statistics, machine learning, neurocomputing, and operations research. This process is being further accelerated by the recent implementation of several international financial and accounting standards (such as Basel II, Solvency II, Sarbanes-Oxley and IFRS). For example, by allowing banks to use their internal credit risk assessment models as input for the minimum regulatory capital calculations, the Basel II framework provides financial institutions with additional incentives to refine existing credit scoring models, since more accurate predictions allow for less conservative capital requirements. Hence, there has been a growing interest throughout the financial world in research on novel data mining techniques and information technologies to support the implementation of such compliance frameworks.

As a result of a longstanding interest from the research community, a myriad of techniques have been proposed for many of the aforementioned problems, in particular for classification problems such as credit scoring and bankruptcy prediction. However, not all of these approaches have proven readily transferable from the academic domain to financial practice. Many of the representations applied by the suggested algorithms cannot be easily interpreted and validated by humans. For example, neural networks are considered a black box technique, since the reasoning behind how the non-linear prediction models reach their conclusions cannot easily be obtained from their structure. This has not only hindered their acceptance by practitioners, but also fails to address the increasing need for transparency under various regulatory frameworks. Credit risk analysts are unlikely to accept black box techniques such as neural networks to make credit decisions, since under the Basel II accord they are now required to demonstrate and periodically validate their models, and to present reports to the national regulator for approval. Therefore, recent research proposed the use of rule-based classification techniques to generate powerful, as well as intuitive and transparent, decision models.

One such rule-based classification technique that has recently been proposed is AntMiner+, which uses Ant Colony Optimization (ACO) to infer accurate rules from the data. This paper will describe how this technique can be used to generate comprehensible credit scoring models, which can then be fit into a Basel II-compliant decision support system.

The paper is structured as follows. The next section discusses the issues related to building credit scoring models within the Basel II regulatory framework. Section 3 provides an overview of the AntMiner+ classification technique, as well as an introduction to ACO, on which the technique is based. The experimental Section 4 provides AntMiner+ credit scoring models for retail banking, small and medium-sized enterprises (SMEs) and banks. Section 5 describes the further steps needed to obtain a Basel II compliant decision support system, and finally, Section 6 concludes the paper.

2 Credit Scoring and Bankruptcy Prediction within Basel II

The recent introduction of the Basel II Capital Accord encourages financial institutions to calculate their minimum regulatory safety capital to ensure that they are able to return depositor funds at all times [?]. The minimum safety capital is determined at 8% of risk weighted assets, which are in turn quantified taking into account three types of risk: credit risk, operational risk and market risk. In calculating credit risk, banks must use three key risk parameters: probability of default (PD), loss given default (LGD) and exposure at default (EAD). These three parameters are then used as input to a Merton/Vasicek model, which then calculates the regulatory safety capital [?].

The PD, LGD and EAD parameters can be obtained in three different ways. The standard approach for credit risk allows banks to buy risk ratings from external rating agencies, often called External Credit Assessment Institutions (ECAIs) in the spirit of the Accord. Examples of well-known ECAIs are Moody’s, Standard & Poor’s and Fitch. The risk ratings are then translated to risk weights provided in the Accord, from which the risk weighted assets (RWA), and hence the regulatory capital, can be calculated. The foundation internal ratings based (IRB) approach allows banks to build their own PD models and get LGD and EAD estimates from the supervisors, whereas the advanced internal ratings based approach allows financial institutions to estimate all three risk parameters themselves. Many financial institutions in Western Europe, Asia and the US are currently taking steps to implement the advanced IRB approach. More than ever, this has triggered the interest in, and the need for, credit scoring and bankruptcy prediction models for estimating the PD of a set of obligors.

For retail portfolios, application scoring models will be developed that try to quantify the credit risk of a set of recently acquired customers, given their application characteristics (e.g. age, marital status, credit history, savings amount, ...). Behavioural scoring models will be used to monitor the credit risk of the existing customer base, given their most recent behaviour (e.g. average checking account status during the previous month, number of credit cards, ...). For small and medium-sized enterprises (SMEs), financial institutions will develop bankruptcy prediction models that quantify the risk of financial failure given a set of accounting ratios and measurements. For both retail and SME obligors, one can usually assume that a sufficient number of defaults are present to make statistical discrimination and classification meaningful. However, for certain types of counterparties, such as banks, insurance companies and sovereign entities, the lack of default observations necessitates the use of alternative methods. In this context, financial institutions will often build rating models that mimic a set of externally provided ratings (e.g. by an ECAI), given a set of candidate explanatory variables collected by the institution.

Ideally, the credit scoring, bankruptcy prediction and rating models should be very powerful in terms of discriminatory power, so as to minimize the cost of granting credit to bad customers or the profit lost when good customers are rejected. Since these models now play a pivotal role in the risk management strategy of a bank, they are also subject to supervisory review and validation by financial regulators. Furthermore, in most countries, financial institutions are obliged to explain why credit has been denied to an applicant. Both these trends basically prohibit the use of black box, mathematically complex application scoring models, and instead stimulate the use of comprehensible, easy-to-understand models.

Numerous classification techniques have been adopted for credit risk measurement and for financial forecasting in general. These techniques include traditional statistical methods (e.g., discriminant analysis and logistic regression [? ?]), nonparametric statistical models (e.g., k-nearest neighbor [? ?], decision trees [? ?] and rule learners [?]) and neural networks [? ? ?]. Often, conflicts may be found when the conclusions of some of these studies are compared. In [?], a large-scale benchmarking study compares the classification performance of various state-of-the-art classification techniques on eight real-life credit scoring data sets. It concludes that neural networks perform very well in terms of classification accuracy. However, their opacity and black box nature prevent them from being used in a Basel II context. That is why, in this paper, we will use the rule-based classification technique AntMiner+, which provides comprehensible, accurate models that are in line with existing domain knowledge.


3 AntMiner+: Classification based on Ant Colony Optimization

3.1 Ant Colony Optimization

Ant Colony Optimization (ACO) is a metaheuristic inspired by the foraging behavior of real ant colonies [?]. A biological ant by itself is a simple insect with limited capabilities, and is guided by straightforward decision rules. However, these simple rules are sufficient for the overall ant colony to find short paths from the nest to the food source. By dropping a chemical substance called pheromone that attracts other ants, an ant indirectly communicates with its fellow ants from the colony. How this indirect communication leads to shortest path finding capabilities is shown in Fig. 1. Suppose two ants start from their nest (left) and look for the shortest path to a food source (right). Initially no pheromone is present on either trail, so there is a 50-50 chance of choosing either of the two possible paths (see Fig. 1(a)). Suppose one ant chooses the lower trail, and the other one the upper trail. The ant that has chosen the lower (shorter) trail will have returned faster to the nest, resulting in twice as much pheromone on the lower trail as on the upper one, as illustrated in Fig. 1(b). As a result, the probability that the next ant will choose the lower, shorter trail will be twice as high, resulting in more pheromone, and thus more ants will choose this trail, until eventually (almost) all ants follow the shorter path. Note that the pheromone on the longer trail will finally disappear through evaporation.

Ant Colony Optimization employs artificial ants that cooperate in a similar manner as their biological counterparts, in order to find good solutions for discrete optimization problems [?]. The first ACO algorithm is Ant System [? ?], where ants iteratively construct solutions and add pheromone to the paths corresponding to these solutions. Path selection is a stochastic procedure based on not only a history-dependent pheromone value, but also a problem-dependent heuristic value. The pheromone value gives an indication of the number of ants that chose the trail recently, while the heuristic value is a problem-dependent quality measure. When an ant reaches a decision point, it is more likely to choose the trail with the higher pheromone and heuristic values. Once the ant arrives at its destination, the solution corresponding to the ant’s followed path is evaluated and the pheromone value of the path is increased accordingly. Additionally, evaporation causes the pheromone level of all trails to diminish gradually. Hence, trails that are not reinforced gradually lose pheromone and will in turn have a lower probability of being chosen by subsequent ants.

The performance of traditional ACO algorithms, however, is rather poor on large problem instances [?]. To overcome this issue, other ACO algorithms have been proposed, such as Ant Colony System [?], rank-based Ant System [?], Elitist Ant System [?] and MAX-MIN Ant System [?]. As the latter is the one employed in the AntMiner+ classification technique, the main features of MAX-MIN Ant System are discussed next.

Stützle et al. [?] advocate that a better exploitation of the best solutions can be obtained by only adding pheromone to the path of the best ant. To avoid early search stagnation, which is the situation where all ants take the same path and thus describe the same solution, possible pheromone values are limited to the interval $[\tau_{min}, \tau_{max}]$. Finally, initializing the pheromone values to $\tau_{max}$ entails a higher exploration at the beginning of the algorithm.
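The three features above can be illustrated with a minimal Python sketch. It is not the AntMiner+ implementation: the function name `mmas_update`, the edge labels, the deposit amount and the bounds `tau_min` and `tau_max` are assumptions chosen for illustration, and the generic form "evaporate, then deposit on the best path, then clamp" is assumed for the update.

```python
def mmas_update(pheromone, best_path, deposit, rho=0.85, tau_min=0.01, tau_max=1.0):
    """Evaporate all edges, reinforce only the best ant's path, then clamp.

    pheromone: dict mapping an edge (pair of vertex labels) to its pheromone level.
    best_path: list of edges used by the best ant.
    deposit:   amount added to each edge of the best path (hypothetical value).
    """
    best_edges = set(best_path)
    updated = {}
    for edge, tau in pheromone.items():
        tau = rho * tau                      # evaporation on every edge
        if edge in best_edges:
            tau += deposit                   # only the best ant adds pheromone
        updated[edge] = min(tau_max, max(tau_min, tau))  # stay in [tau_min, tau_max]
    return updated


# Hypothetical three-edge graph; every trail starts at tau_max.
pheromone = {("start", "a"): 1.0, ("start", "b"): 1.0, ("a", "stop"): 1.0}
pheromone = mmas_update(pheromone, [("start", "a"), ("a", "stop")], deposit=0.1)
```

After one update the unreinforced edge has lost pheromone through evaporation, while the edges on the best path have lost less. The clamping step keeps any edge from dropping to zero, which is what preserves some exploration.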

ACO has been applied to a wide variety of problems [?], such as the vehicle routing problem [? ? ?], scheduling [? ?], timetabling [?], the traveling salesman problem [? ? ?] and routing in packet-switched networks [?]. Recently, ACO has also entered the data mining domain, addressing both the clustering [? ?] and classification task [? ? ?], the latter being the topic of interest in this paper. The first application of ACO to the classification task is reported by Parpinelli et al. in [?] and was named AntMiner. Extensions were put forward by Liu et al. in AntMiner2 [?] and AntMiner3 [?]. Our approach, AntMiner+, differs from these previous AntMiner versions in several ways, resulting in an improved performance, as described in [?]. Next follows a brief discussion of the principles and workings of AntMiner+.

3.2 AntMiner+ Algorithm

ACO can be used to induce comprehensible and accurate rule-based classification models from data, as done in the AntMiner+ classification technique [?].

First of all, an environment needs to be defined in which the ants operate. When an ant moves through the environment from the Start to the Stop vertex, it should incrementally construct a solution to the problem at hand, in this case the classification problem. In order to build a set of classification rules, we define the construction graph in such a way that each ant’s path implicitly describes a classification rule. For each variable $V_i$, a vertex $v_{i,j}$ is created for each of its values $Value_{i,j}$. The set of vertices for one variable is defined as a vertex group. To allow for rules in which not all variables are involved, hence shorter rules, an extra dummy vertex is added to each variable whose value is undetermined, meaning it can take any of the values available. Although only categorical variables are allowed, we make a distinction between nominal variables (no apparent ordering in the values, e.g. sex and purpose of loan) and ordinal variables (a clear ordering of the values, e.g. amount on savings or checking account and income). Each nominal variable has one vertex group (including the mentioned dummy vertex), but for the ordinal variables we build two vertex groups to allow for intervals to be chosen by the ants. The first vertex group corresponds to the lower bound of the interval and should thus be interpreted as $\langle V_{i+1} \geq Value_{i,k} \rangle$; the second vertex group determines the upper bound, giving $\langle V_{i+2} \leq Value_{i+1,l} \rangle$ (of course, the choice of the upper bound is constrained by the lower bound). This allows for fewer, shorter and actually better rules. To extract a rule set that is exhaustive, such that all future data points can be classified, the majority class is not included in the vertex group of the class variable, and will be the predicted class for the final else clause.

An example AntMiner+ construction graph for a credit scoring data set with only three variables (purpose of the loan, amount on savings account and credit history of the applicant) is shown in Fig. 2. The path denoted in bold describes the rule if Purpose = car and Savings Account ≥ 0 € and Savings Account ≤ 500 € and Credit History = any then class = bad. A formal illustration of the construction graph is provided in Fig. 3, for a data set with d classes and n variables, of which the first and last variable are nominal and $V_2$ is ordinal (hence the two vertex groups). The weight parameters α and β determine the relative importance of the pheromone and heuristic values; their role is described by (1).
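As a rough illustration of how vertex groups and paths relate to rules, the following Python sketch builds the vertex groups for the three-variable example and turns one path into a rule. The function names, the `ANY` marker and the concrete variable values are hypothetical, only the variable names come from the example. The sketch drops terms whose value is `any` from the printed rule and does not enforce the constraint that the upper bound must respect the lower bound.

```python
ANY = "any"  # dummy vertex: the variable is left unconstrained


def build_vertex_groups(variables):
    """variables: ordered list of (name, kind, values), kind is 'nominal' or 'ordinal'.

    A nominal variable yields one vertex group, an ordinal variable yields two
    (lower bound, then upper bound). Each group gets the dummy vertex ANY.
    """
    groups = []
    for name, kind, values in variables:
        if kind == "nominal":
            groups.append((name, "=", list(values) + [ANY]))
        else:
            groups.append((name, ">=", list(values) + [ANY]))
            groups.append((name, "<=", list(values) + [ANY]))
    return groups


def path_to_rule(groups, path, predicted_class):
    """Translate one chosen vertex per group into an if-then rule string."""
    terms = [f"{name} {op} {value}"
             for (name, op, _), value in zip(groups, path) if value != ANY]
    return "if " + " and ".join(terms) + f" then class = {predicted_class}"


# Hypothetical value sets for the three variables of the example.
variables = [("Purpose", "nominal", ["car", "furniture"]),
             ("Savings Account", "ordinal", [0, 500, 1000]),
             ("Credit History", "nominal", ["good", "poor"])]
groups = build_vertex_groups(variables)
rule = path_to_rule(groups, ["car", 0, 500, ANY], "bad")
print(rule)
```

The path picks one vertex from each of the four groups, which is why an ordinal variable contributes two terms to the rule and a variable set to `any` contributes none.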

Now that the environment is defined, we can explain the workings of the technique. All ants begin in the Start vertex and walk through their environment to the Stop vertex, gradually constructing a rule. Only the ant that describes the best rule will update the pheromone of its path, as imposed by the MAX-MIN Ant System approach. Evaporation decreases the pheromone of all edges, while the pheromone levels are constrained to lie within the given interval $[\tau_{min}, \tau_{max}]$. Then another iteration occurs with ants walking from Start to Stop. Convergence occurs when all the edges of one path have a pheromone level $\tau_{max}$ and all other edges have pheromone level $\tau_{min}$. Next, the rule corresponding to the path with $\tau_{max}$ is extracted and added to the rule set. Finally, training data covered by this rule is removed from the training set. This iterative process is repeated until the stop criterion is met, which is early stopping: the accuracy on a separate validation set is monitored, and rule induction stops when the validation accuracy starts to decrease. Next we will have a closer look at the algorithm specifics, such as the edge probabilities and rule quality measure.

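$$P_{ij}(t) = \frac{[\tau_{(v_{i-1,k},v_{i,j})}(t)]^{\alpha} \cdot [\eta_{v_{i,j}}(t)]^{\beta}}{\sum_{l=1}^{p_i} [\tau_{(v_{i-1,k},v_{i,l})}(t)]^{\alpha} \cdot [\eta_{v_{i,l}}(t)]^{\beta}} \tag{1}$$

$$\eta_{ij} = \frac{|T_{ij} \;\&\; CLASS = class_{ant}|}{|T_{ij}|} \tag{2}$$

$$\tau_{(v_{i-1,k},v_{i,j})}(0) = \tau_{max} \tag{3}$$

$$\tau_{(v_{i-1,k},v_{i,j})}(t+1) = \rho \cdot \tau_{(v_{i-1,k},v_{i,j})}(t) + \frac{Q^{+}_{best}}{10} \tag{4}$$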

The edge to choose when an ant arrives at a vertex $v_{i-1,k}$, and thus the term to add next, depends on the pheromone value of the edge between vertices $v_{i-1,k}$ and $v_{i,j}$ ($\tau_{(v_{i-1,k},v_{i,j})}$) and on the heuristic value of the vertex $v_{i,j}$ ($\eta_{i,j}$), normalized over all possible vertices, providing a probability $P_{ij}$ for each of the possible vertices, according to (1). As the heuristic function η is problem-dependent, we have defined the heuristic value $\eta_{ij}$ of vertex $v_{i,j}$, corresponding to the term $V_i = Value_{i,j}$, as the fraction of training cases that are correctly covered (described) by this term, as defined by (2). Let us illustrate this definition with a simplified credit scoring data set of five data instances $i_1, i_2, \ldots, i_5$ and three variables: Sex, Term of the loan, and the nominal variable Real Estate stating what kind of real estate the applicant owns. Consider the vertex corresponding to Sex = Male. As this is a binary classification problem, the only class in the construction graph is the bad class, giving a heuristic value for this vertex of:

$$\frac{|Sex = male \;\&\; CLASS = bad|}{|Sex = male|} = 3/4 \tag{5}$$
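Equations (1), (2) and (5) can be checked with a short Python sketch. The five data instances below are a hypothetical stand-in for the simplified data set, chosen only so that three of the four male applicants are bad. The function names and the values of `alpha`, `beta` and the pheromone levels are likewise assumptions for illustration.

```python
# Hypothetical stand-in for the five-instance data set (not the original table).
DATA = [
    {"Sex": "M", "Term": 1,  "RealEstate": "A", "Class": "bad"},
    {"Sex": "M", "Term": 5,  "RealEstate": "B", "Class": "bad"},
    {"Sex": "M", "Term": 10, "RealEstate": "B", "Class": "bad"},
    {"Sex": "M", "Term": 15, "RealEstate": "A", "Class": "good"},
    {"Sex": "F", "Term": 20, "RealEstate": "B", "Class": "good"},
]


def heuristic(data, variable, value, ant_class="bad"):
    """Equation (2): fraction of cases with variable == value that have ant_class."""
    covered = [row for row in data if row[variable] == value]
    if not covered:
        return 0.0
    return sum(row["Class"] == ant_class for row in covered) / len(covered)


def edge_probabilities(tau, eta, alpha=1.0, beta=1.0):
    """Equation (1): tau and eta are lists over the candidate vertices of the next group.

    Assumes at least one candidate has a non-zero weight.
    """
    weights = [(t ** alpha) * (e ** beta) for t, e in zip(tau, eta)]
    total = sum(weights)
    return [w / total for w in weights]


eta_male = heuristic(DATA, "Sex", "M")   # the value in equation (5)
# Two candidate vertices with equal heuristic value but different pheromone.
probs = edge_probabilities([1.0, 0.5], [0.75, 0.75])
print(eta_male, probs)
```

With equal heuristic values, the vertex carrying twice the pheromone is chosen with twice the probability, which mirrors the two-trail example of Section 3.1.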

The initial pheromone value is by definition $\tau_{max}$, as imposed by MAX-MIN Ant System. The pheromone to add to the path of the best ant should be proportional to the quality of the path, which we define as the sum of the confidence and the coverage of the corresponding rule. Confidence is the number of remaining data points (those not yet covered by any of the extracted rules) that the rule classifies correctly, divided by the total number of remaining data points covered by that rule. The coverage gives an indication of the overall importance of the specific rule by measuring the number of correctly classified remaining data points over the total number of remaining data points. More formally, the pheromone amount to add to the path of the iteration best ant is given by the benefit of that path, as indicated by (6), with $rule_{ant}$ the rule antecedent (if part), consisting of a conjunction of terms corresponding to the path chosen by the ant, $rule^{c}_{ant}$ the conjunction of $rule_{ant}$ with the class chosen by the ant, and Cov a binary variable expressing whether a data point is already covered by one of the extracted rules (Cov = 1) or not (Cov = 0). The number of remaining data points can therefore be expressed as |Cov = 0|. This means that, taking into account the evaporation factor as well, the update rule for the best ant’s path is described by (4), where the division by ten is a scaling factor that is needed such that both the pheromone and heuristic values lie within the range [0, 1].

$$Q^{+} = \underbrace{\frac{|rule^{c}_{ant}|}{|rule_{ant}|}}_{\text{confidence}} + \underbrace{\frac{|rule^{c}_{ant}|}{|Cov = 0|}}_{\text{coverage}} \tag{6}$$

For example, returning to our simple data set (see Table 1), suppose we have the following two rules:

R1: if Sex = M and Term ≥ 1 y and Term ≤ 15 y then customer = Bad

R2: if Sex = M and Term ≥ 1 y and Term ≤ 1 y and Real Estate = A then customer = Bad

As shown in Table 1, rule R1 correctly classifies 3 of the 4 data instances described by the rule antecedent, yielding a confidence of 0.75. The coverage of R1 is 0.6, as it correctly describes 3 of the 5 instances in the data set. Similarly, rule R2 obtains a confidence of 1 and a coverage of 0.2. This example shows that although rule R2 is completely accurate, as shown by its confidence of 1, it is not the best rule, as we also take into account the coverage of the rule. The coverage makes sure that we avoid overfitting and obtain fewer rules.
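The figures in this example can be reproduced with a small Python sketch of the quality measure (6). The five instances are again a hypothetical stand-in for Table 1, constructed only to match the counts quoted above, and the names `rule_quality`, `r1` and `r2` are ours.

```python
# Hypothetical stand-in for Table 1, built to match the counts quoted in the text.
DATA = [
    {"Sex": "M", "Term": 1,  "RealEstate": "A", "Class": "bad"},
    {"Sex": "M", "Term": 5,  "RealEstate": "B", "Class": "bad"},
    {"Sex": "M", "Term": 10, "RealEstate": "B", "Class": "bad"},
    {"Sex": "M", "Term": 15, "RealEstate": "A", "Class": "good"},
    {"Sex": "F", "Term": 20, "RealEstate": "B", "Class": "good"},
]


def rule_quality(data, antecedent, ant_class="bad"):
    """Return (confidence, coverage) of a rule over the not yet covered data points."""
    remaining = [row for row in data if not row.get("covered", False)]  # Cov = 0
    matched = [row for row in remaining if antecedent(row)]             # rule_ant
    correct = [row for row in matched if row["Class"] == ant_class]     # rule^c_ant
    confidence = len(correct) / len(matched) if matched else 0.0
    coverage = len(correct) / len(remaining) if remaining else 0.0
    return confidence, coverage


def r1(row):
    return row["Sex"] == "M" and 1 <= row["Term"] <= 15


def r2(row):
    return row["Sex"] == "M" and row["Term"] == 1 and row["RealEstate"] == "A"


conf1, cov1 = rule_quality(DATA, r1)
conf2, cov2 = rule_quality(DATA, r2)
q1 = conf1 + cov1   # Q+ of R1, equation (6)
q2 = conf2 + cov2   # Q+ of R2
print((conf1, cov1, q1), (conf2, cov2, q2))
```

On this data R1 obtains the larger quality value even though R2 has the higher confidence, which is the trade-off the coverage term is meant to introduce.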

In previous research, a benchmarking study of AntMiner+ against state-of-the-art classification techniques, such as C4.5, RIPPER and support vector machines, showed that AntMiner+ ranks at the absolute top when considering both accuracy and comprehensibility [?]. However, a reluctance to accept the classification models may still exist, as unexpected signs may arise in the hyper-plane part of the AntMiner+ rules. These may be due to spurious correlations in the data and do not represent the actual risk relationship (simply put, wrong inequality signs, e.g. rules such as: if Income ≥ 10.000 € and Savings Account ≥ 100.000 € then customer = bad). To counter such inconsistencies with existing domain knowledge, we have extended the AntMiner+ classification technique to incorporate domain knowledge [?]. The basic principle is as follows: considering our credit scoring example, we can make sure that increasing the amount on the applicant’s savings account cannot lead to a customer changing from good to bad by removing the vertex group corresponding to Savings Account ≥ (see Fig. 2). Since the ants look only for rules to classify bad customers (only the final else clause will classify a customer as good), the term with Savings Account can then only be of the form Savings Account ≤ X. This allows the domain expert to enforce hard constraints on the inequality signs. Furthermore, a bias may also exist towards certain values, in which case the constraint is preferred rather than mandatory. To deal with such soft constraints, the heuristic values can be adapted. For more details we refer to [?]. The ability to incorporate domain knowledge is of crucial importance within a credit scoring context, and dramatically reduces the verification and validation effort for the model (see Section 5.1).

AntMiner+ is implemented in the platform-independent, object-oriented Java programming environment, and uses the MySQL open source database server. Example screenshots of the Graphical User Interface (GUI) of AntMiner+ are included in the Appendix.

4 Building Credit Risk Models with AntMiner+

In this section, we will illustrate how AntMiner+ can be used to build credit risk systems in three different contexts: retail banking, small and medium-sized enterprises (SMEs), and bank ratings.

As AntMiner+ can only deal with categorical variables, a discretization preprocessing step takes place in which the continuous variables are turned into discrete variables. This is done automatically with the Weka workbench [?] according to the criterion of Fayyad [?]. All experiments were run with 1000 ants and ρ set at 0.85, as suggested in [?].

4.1 Retail Banking

In this section, we will illustrate how AntMiner+ can be used to develop application scoring models in a retail banking context. The purpose of application scoring is to provide a score or classification of a credit applicant given the application characteristics provided. The data set that we will use is the German credit data set, which is a publicly available application scoring data set (see www.ics.uci.edu/~mlearn/MLRepository.html) having 1000 observations and 20 application characteristics. Table 2 presents the rules that were extracted using AntMiner+.

The extracted rule set is concise and easy to understand. Only 5 of the original 20 application characteristics are used for making the discrimination. This clearly has a beneficial impact on interpretability, but also on operational cost and efficiency.

4.2 SME Bankruptcy Prediction

Under the IRB approach for corporate credits, the Basel II Capital Accord allows banks to separately distinguish exposures to SME borrowers (defined as corporate exposures where the reported sales for the consolidated group of which the firm is a part is less than 50 million €) from those to large firms. The SME data set consists of 422 observations: 74 bankrupt and 348 solvent companies. The default data were collected from 1989-1997, while the other data were extracted from the period 1996-1997 only. A total of 40 candidate input variables was selected from financial statement data, using, among others, liquidity, profitability and solvency measures (see [?] for an extensive description of this data set).

Table 3 presents the rules that were extracted by AntMiner+. Again, only 5 of the 40 original inputs are used in making the discrimination decision. Note that the numbers were rounded and one variable was scaled randomly for confidentiality reasons.

4.3 Rating Prediction

For retail and SME portfolios, one typically has a sufficient number of default observations to make statistical discrimination meaningful. However, when modeling credit risk for entities such as banks, sovereigns, or insurance companies, the lack of default observations necessitates the use of an alternative modeling approach. That is why many financial institutions opt for a mapping to external ratings in this context. In this section, we will study how AntMiner+ can be used to model credit risk for bank entities. The data was retrieved from the Bankscope database, which contains financial statements of more than 15.000 banks. For each of these banks the Moody’s rating will be used as the basis of the target variable (low/speculative-grade or good/investment-grade rating). These ratings were retrieved for the period 1998-2003. The rating at the end of May of the year T + 1 is predicted based on a 3-year history of inputs observed during years T, T − 1 and T − 2. A variety of different inputs was selected covering, amongst others, asset quality, capital, operational result and liquidity. The size variable Total Assets was also included, as well as a geographical indicator Region (Euro-zone, dollar-zone, EU accession countries, Japan and others). After data preprocessing, the data set consisted of a cleaned database of 2996 observations with 37 inputs (see [?] for a more extensive description).

4.4 Classification Model Performance

Table 5 shows the results of the classification models induced by AntMiner+, C4.5, support vector machine (SVM) and majority vote. The experimental setup is the same for all included data sets. The data set is split up into training, validation and test set according to the following fractions: 4/9, 2/9 and 3/9, as is common practice in data mining [? ?]. To eliminate any chance of having unusually good or bad training and test sets, 10 runs are conducted in which the order of observations is first randomized before the training, validation and test set are chosen. For each randomization, AntMiner+ is run with hard monotonicity constraints, as imposed by the financial expert.
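The splitting procedure can be sketched in Python as follows. This is only an illustration of the 4/9, 2/9, 3/9 split over 10 randomizations: the function name `split_indices`, the use of integer seeds and the rounding of the set sizes are assumptions, not details taken from the experiments.

```python
import random


def split_indices(n, seed):
    """Shuffle the observation indices, then cut them into 4/9, 2/9 and 3/9 parts."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)    # randomize the order of observations
    n_train = (4 * n) // 9
    n_val = (2 * n) // 9
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]        # the remainder, roughly 3/9
    return train, val, test


# Ten randomizations of a data set with 1000 observations (as in the German credit data).
splits = [split_indices(1000, seed) for seed in range(10)]
train, val, test = splits[0]
print(len(train), len(val), len(test))
```

The validation part is the one used for early stopping during rule induction, while the test part is held out for the reported performance.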

The best average test set performance over the 10 randomizations is underlined and denoted in bold face for each data set. We then use a paired t-test to test the performance differences. Performances that are not significantly different at the 5% level from the top performance with respect to a one-tailed paired t-test are tabulated in bold face. Statistically significant underperformances at the 1% level are emphasized in italics. Performances significantly different at the 5% level but not at the 1% level are reported in normal script. Since the observations of the randomizations are not independent, we remark that this standard t-test is used as a common heuristic to test the performance differences [?].
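A minimal Python sketch of such a one-tailed paired t-test over 10 randomizations is given below. The accuracies are invented purely for illustration and are not results from Table 5. The critical values are the standard one-tailed t-distribution values for 9 degrees of freedom, and the function names are ours.

```python
from math import sqrt
from statistics import mean, stdev

# One-tailed critical values of the t-distribution with 9 degrees of freedom.
T_CRIT_5PCT_DF9 = 1.833
T_CRIT_1PCT_DF9 = 2.821


def paired_t_statistic(best, other):
    """t statistic of the per-randomization differences best - other."""
    diffs = [b - o for b, o in zip(best, other)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


def significance_label(t):
    """Map the statistic to the three reporting categories used in the text."""
    if t < T_CRIT_5PCT_DF9:
        return "not significantly different at 5%"
    if t < T_CRIT_1PCT_DF9:
        return "different at 5% but not at 1%"
    return "different at 1%"


# Invented test set accuracies over 10 randomizations (illustration only).
best = [0.80, 0.82, 0.79, 0.81, 0.83, 0.80, 0.78, 0.82, 0.81, 0.80]
other = [0.78, 0.79, 0.78, 0.80, 0.80, 0.79, 0.77, 0.79, 0.80, 0.78]
t = paired_t_statistic(best, other)
print(round(t, 2), significance_label(t))
```

Because the test is paired, it is the consistency of the per-randomization differences that matters, not the overlap of the two accuracy ranges.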

As Table 5 shows, the non-linear SVM classifier performs best in terms of accuracy, as can be expected [?]. However, as mentioned before, the black-box nature of such non-linear classifiers makes them less suited for credit scoring, where validation is required. When comparing the rule- and tree-based classifiers AntMiner+ and C4.5, we can observe very competitive accuracies, but when considering the number of rules as well, AntMiner+ comes out as the best performing technique. On top of that, the AntMiner+ rule sets comply with stated domain constraints, which, as pointed out in [?], can result in a decrease in accuracy. Yet while a small decrease in accuracy can be acceptable, an inconsistency with domain knowledge is not.

5 Towards a Basel II Credit Risk Management System

Up to now, we have largely focused on extracting a comprehensible set of rules to do risk management in a Basel II context. These rules now need to be further analyzed and used in various activities so as to arrive at a full-fledged, integrated Basel II risk decision and management application. In what follows, we will discuss the most important activities, which are summarized in Fig. 4.

5.1 Verification and Validation

A first set of tools can be used to verify and validate (V&V) the extracted rule set. Verification will attempt to look for syntax-based anomalies in the rule set. Whether the rule set is exhaustive (all cases being covered) and exclusive (a case only covered by one rule) will be investigated in this step. Because of the if-then-else nature of the AntMiner+ rule sets, they are by definition exhaustive and exclusive, making the verification step obsolete. In the validation step, it will be investigated whether the rules adequately model the risk involved from a human interpretation viewpoint. The financial credit expert will also be consulted and asked to interpret the rule set in this step.

In order to facilitate the verification and validation step, decision tables may be adopted [? ]. Decision tables provide an alternative way of representing the AntMiner+ rule sets in a user-friendly way. A decision table (DT) consists of four quadrants, separated by double lines, both horizontally and vertically (cf. Fig. 5). The vertical line divides the table into a condition part (left), specifying the inputs to be checked, and an action part (right), specifying the classes assigned.

Each condition entry describes a relevant subset of values (called a state) for a given input, or contains a dash symbol ('–') if its value is irrelevant within the context of that column. Subsequently, every action entry holds a value assigned to the outcome class. True, false and unknown action values are typically abbreviated by '×', '–', and '·', respectively. Every row in the entry part of the DT thus comprises a classification rule, indicating what class results from a certain combination of inputs. If each row only contains simple states (no contracted or irrelevant entries), the table is called an expanded DT, whereas otherwise the table is called a contracted DT. Table contraction can be achieved by combining rows that lead to the same outcome class. The number of rows in the contracted table can then be further minimised by changing the order of the conditions. It is obvious that a DT with a minimal number of rows is to be preferred since it provides a more parsimonious and comprehensible representation of the extracted rule set than an expanded DT. This is illustrated in Fig. 6.

In the literature, several kinds of DTs have been proposed. We will require that the condition entry part of a DT satisfies the following two criteria:

• completeness: all possible combinations of input values are included;
• exclusivity: no combination is covered by more than one column.

As such, we deliberately restrict ourselves to single-hit tables, wherein columns have to be mutually exclusive, because of their advantages with respect to verification and validation [? ]. It is this type of DT that can be easily checked for potential anomalies, such as inconsistencies (a particular counterparty being assigned to more than one class) or incompleteness (no class assigned). The decision table formalism thus allows for easy verification of the extracted AntMiner+ rules. Additionally, for ease of legibility, the rows are arranged in lexicographical order, in which entries at lower rows alternate first. As a result, a tree structure emerges in the condition entry part of the DT, which lends itself very well to a top-down evaluation procedure: starting at the first column, and then working one's way to the right of the table by choosing from the relevant condition states, one safely arrives at the outcome class for a given case. This condition-oriented inspection approach often proves to be more intuitive, faster, and less prone to human error than evaluating a set of rules one by one.
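The completeness and exclusivity criteria can be checked mechanically by enumerating every combination of condition states. The sketch below does this for the contracted table of Fig. 6(b); the data layout and function names are assumptions made for illustration, and a dash is written as "-".

```python
from itertools import product

# Contracted decision table of Fig. 6(b):
# (Condition1, Condition2, Condition3) -> outcome class; "-" means irrelevant.
CONTRACTED = [
    (("yes", "yes", "yes"), "Class2"),
    (("yes", "yes", "no"), "Class1"),
    (("yes", "no", "-"), "Class1"),
    (("no", "-", "yes"), "Class2"),
    (("no", "-", "no"), "Class1"),
]


def matches(entries, case):
    """A table entry matches when it is a dash or equals the case value."""
    return all(e == "-" or e == v for e, v in zip(entries, case))


def verify(table, states=("yes", "no"), n_conditions=3):
    """Return the uncovered cases and the cases hit by more than one entry."""
    uncovered, multiply_covered = [], []
    for case in product(states, repeat=n_conditions):
        hits = [cls for entries, cls in table if matches(entries, case)]
        if not hits:
            uncovered.append(case)          # completeness violated
        elif len(hits) > 1:
            multiply_covered.append(case)   # exclusivity violated
    return uncovered, multiply_covered
```

For a single-hit table both returned lists are empty; any case reported in the first list has no class assigned, and any case in the second is covered more than once.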

Decision tables can also be usefully adopted for validation purposes, as they can easily be checked for potential anomalies, such as inconsistency with monotonicity constraints: by placing the assumedly monotone variable in the last column, adjacent rows are found with data entries that are equal in all variables except the last one. It can then be easily seen whether or not the class variable changes in the expected manner. As AntMiner+ has the supplementary benefit of incorporating such monotonicity constraints, as demonstrated in Section 3.2, the decision table will no longer reveal any counter-intuitive patterns. For example, Table 6 depicts the decision table corresponding to the rule set extracted for the German credit scoring data set (see Table 2). Based on this table, we can easily check that credit history can only have a positive effect on the applicant's assessment, if any.
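The monotonicity inspection described above amounts to comparing rows that agree on everything except the monotone variable. A minimal sketch follows, with two rows simplified from Table 6; the ordering of credit history states from worse to better and all names are assumptions for illustration.

```python
from itertools import groupby

# Assumed ordering of the monotone variable, from worse to better.
HISTORY_ORDER = ["critical account", "all credits paid back duly"]
CLASS_RANK = {"bad": 0, "good": 1}

# Rows simplified from Table 6: (other conditions, credit history, class).
ROWS = [
    (("radio/television", "<0"), "critical account", "bad"),
    (("radio/television", "<0"), "all credits paid back duly", "good"),
]


def monotonicity_violations(rows, order, class_rank):
    """List pairs where a better state leads to a worse class."""
    violations = []
    key = lambda row: row[0]
    for other_conditions, group in groupby(sorted(rows, key=key), key=key):
        ranked = sorted(group, key=lambda row: order.index(row[1]))
        for lower, higher in zip(ranked, ranked[1:]):
            if class_rank[higher[2]] < class_rank[lower[2]]:
                violations.append((other_conditions, lower[1], higher[1]))
    return violations
```

An empty result means the class never gets worse as credit history improves within otherwise identical rows, which is the pattern a domain expert would look for in the table.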

We can conclude that this first step of verifying and validating the model has been made significantly easier thanks to the nature of the induced rule sets (exhaustive and exclusive) and because of the incorporation of monotonicity constraints. This does not mean, however, that this phase is no longer needed, as the domain expert still needs to check whether the model is suitable. From that perspective, decision tables are still a very useful tool.

5.2 Traffic Light Decision Support System

Once the rule set has been verified and validated, it needs to be implemented as a decision support system (DSS) which can be used by the credit officers to make the actual credit decision: accept or reject. The DSS can be implemented using a traffic light indicator approach that gives three possible outcomes: a green light, an orange light or a red light [? ]. A green light indicates that the rule set is confident enough to classify a customer as a good payer and credit should be granted. An orange light indicates a doubtful case for which human intervention is needed. This can be due to, for example, low confidence of the rule set, external information obtained from a credit bureau (e.g. Equifax, Experian), a customer who is narrowly rejected by the rule set but is very profitable on other financial products, and/or a new marketing campaign in which the financial institution decides to grant credit to some of the more risky customers. The orange light can allow for model overrides by the credit expert. A low side override means that a customer rejected by the rule set is accepted, and a high side override vice versa. A red light indicates that the rule set is confident enough to classify a customer as a bad payer and credit should be rejected. Note that this traffic light indicator approach can also be implemented using four colors (green, yellow, orange, red) or gauges in a dashboard application. An implementation using four colors could be as follows: red when the rule set predicts a bad customer and this is confirmed by the credit bureau information; orange when the rule set predicts a bad customer, but the credit bureau says the customer is a good risk; yellow when the rule set predicts a bad customer, but confidence is very low and the credit bureau says the customer is a good risk; and green when the rule set says good customer and the credit bureau says the customer is a good risk. Note that financial institutions can decide for themselves on the number of colors and their meaning.
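The four-color scheme described above can be sketched as a small decision function. The function and parameter names are hypothetical. The text does not say which color applies when the rule set predicts a good customer but the credit bureau disagrees; routing that case to orange (manual review) is an assumption of this sketch.

```python
def traffic_light(rule_set_predicts_bad, bureau_says_good, very_low_confidence=False):
    """Map rule set output and credit bureau information to a color."""
    if rule_set_predicts_bad:
        if not bureau_says_good:
            return "red"       # bad prediction confirmed by the bureau
        if very_low_confidence:
            return "yellow"    # bad prediction, very low confidence, bureau says good
        return "orange"        # bad prediction, but bureau says good
    if bureau_says_good:
        return "green"         # both sources agree on a good customer
    # Not covered by the text: assumed to require human intervention.
    return "orange"
```

An institution adopting a different number of colors would only need to change this mapping, not the underlying rule set.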

5.3 Interface to Basel II Calculation Engine

The extracted rule set must also interface with a Basel II calculation engine which will use the rule outputs to calculate expected loss and the regulatory capital that a financial institution needs to set aside in order to cover unexpected credit losses. Therefore, in a calibration phase, each rule should be accompanied by a PD estimate which should be forward looking and based on five years of historical data.

Once the estimates for the LGD and EAD have been obtained, the expected loss and the regulatory capital can be calculated. The expected loss (EL) can be calculated as EL = PD × LGD × EAD. It represents the long-run average credit loss and will be used for debt provisioning. The regulatory safety capital can then also be calculated based on the formulas provided in the Basel II Accord. For example, for retail exposures the formulas are as follows:

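\[
K = \mathrm{LGD} \cdot \left( \Phi\!\left( \sqrt{\frac{1}{1-\rho}}\,\Phi^{-1}(\mathrm{PD}) + \sqrt{\frac{\rho}{1-\rho}}\,\Phi^{-1}(0.999) \right) - \mathrm{PD} \right),
\qquad
\text{regulatory capital} = K \cdot \mathrm{EAD}
\tag{7}
\]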

whereby Φ (Φ^{-1}) represents the (inverse) cumulative standard normal distribution, and ρ the asset correlation factor which is fixed in the Accord [? ] (e.g. 0.15 for residential mortgage exposures).
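The expected loss and the retail capital formula (7) can be evaluated with the standard library alone. The sketch below is an illustration under stated assumptions: the function names are hypothetical, PD must lie strictly between 0 and 1 for the inverse normal to be defined, and any input values passed in are the caller's own estimates.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def expected_loss(pd, lgd, ead):
    """EL = PD x LGD x EAD."""
    return pd * lgd * ead


def capital_requirement(pd, lgd, rho):
    """K of Eq. (7); pd must satisfy 0 < pd < 1 and rho 0 <= rho < 1."""
    # sqrt(1/(1-rho)) * inv(PD) + sqrt(rho/(1-rho)) * inv(0.999),
    # written over the common denominator sqrt(1 - rho).
    z = (_STD_NORMAL.inv_cdf(pd) + rho ** 0.5 * _STD_NORMAL.inv_cdf(0.999)) / (1 - rho) ** 0.5
    return lgd * (_STD_NORMAL.cdf(z) - pd)


def regulatory_capital(pd, lgd, ead, rho):
    """Regulatory capital = K x EAD."""
    return capital_requirement(pd, lgd, rho) * ead
```

For instance, `regulatory_capital(0.02, 0.25, 100000, 0.15)` evaluates the formula for a hypothetical exposure with the residential mortgage correlation of 0.15 mentioned above.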

5.4 Evaluating the Model over Time: Backtesting and Benchmarking

The Basel II Capital Accord requires credit risk systems to be validated at least annually. The accord distinguishes between backtesting, which is comparing the predicted outcome by the rule set with the realized outcome, and benchmarking, which is comparing the predicted outcome of the rule set with the outcomes of models of other parties in the industry (such as credit bureaus, other financial institutions, or financial regulators). From a backtesting perspective, the performance of the rule set needs to be monitored. Again, a traffic light indicator approach can be adopted with three outcomes: green light, orange light, red light [? ]. The decision which light to switch on can be determined based on the outcome of a test statistic which monitors the classification accuracy (e.g. McNemar's test [? ]). A green light indicates that the rule set performance is stable, e.g. no significant differences at the 5% level are reported. It means the rule set can continue to be used. An orange light may indicate e.g. a difference at the 5% level but not at the 1% level of significance. It indicates a performance difference which requires no immediate action but needs to be closely monitored in the future. A red light then indicates a significant performance difference at the 1% level. It indicates that the model is no longer appropriate for the current data, which could possibly be due to a change of the population (often referred to as population drift) or a new strategy of the financial institution. In other words, the model needs to be rebuilt, which in our context would mean extracting a new rule set using AntMiner+. From a benchmarking perspective, a similar process can be conducted, whereby the traffic lights now indicate how much the two parties agree or disagree on their credit decisions.
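One way to turn such a test into a traffic light is sketched below. It assumes McNemar's test with continuity correction is applied to paired classifications of the same cases, where b and c count the cases that only one of the two classifications gets right; how the pairs are formed (for example, the rule set against a reference model) is an assumption, as the text does not prescribe it. The thresholds are the chi-square critical values for one degree of freedom.

```python
# Chi-square critical values with 1 degree of freedom.
CHI2_CRIT_5PCT = 3.841
CHI2_CRIT_1PCT = 6.635


def mcnemar_statistic(b, c):
    """McNemar's statistic with continuity correction on discordant counts."""
    if b + c == 0:
        return 0.0  # no discordant cases: no evidence of a difference
    return (abs(b - c) - 1) ** 2 / (b + c)


def backtesting_light(b, c):
    """Green: stable; orange: monitor closely; red: rebuild the rule set."""
    statistic = mcnemar_statistic(b, c)
    if statistic < CHI2_CRIT_5PCT:
        return "green"
    if statistic < CHI2_CRIT_1PCT:
        return "orange"
    return "red"
```

A red outcome would, in the setting of this paper, trigger the extraction of a new rule set with AntMiner+.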

6 Conclusion

The introduction of the recently suggested Basel II Capital Accord has encouraged financial institutions to build efficient and high-performing credit risk models assessing the creditworthiness of their counterparties. Ideally, these models should be both powerful, in terms of discriminating defaulters from non-defaulters, and comprehensible, in terms of explanatory power. In this paper, we discussed how Ant Colony Optimization can be used to build credit risk models for Basel II. More specifically, we used the AntMiner+ algorithm, which is a rule induction technique based on the principles of MAX-MIN Ant System. AntMiner+ distinguishes itself by the comprehensibility of the induced models, which are in line with existing domain knowledge. We have also shown how decision tables can be useful to provide even more insight into the classification model.

Experiments were conducted using three real-life credit risk data sets: one in retail, one for SMEs, and one for bank ratings. It was illustrated that for each of these data sets AntMiner+ extracted a powerful and concise rule set. Furthermore, it was also discussed how the induced rule sets could fit into a global credit risk management strategy and architecture. An interesting topic for further research is to extend the algorithm to handle continuous targets and generate regression rules, which could be useful e.g. for modeling LGD and EAD.

Acknowledgment

We extend our gratitude to the (associate) editor and the anonymous reviewers, as their many constructive and detailed remarks certainly contributed much to the quality of this paper. Further, we would like to thank the Flemish Research Council (FWO, Grant G.0615.05), and the Microsoft and KBC-Vlekho-K.U.Leuven Research Chairs for financial support to the authors.

References

[] A. Abraham and V. Ramos. Web usage mining using artificial ant colony clustering. In the Congress on Evolutionary Computation, pages 1384–1391. IEEE Press, 2003.

[] B. Baesens, T. Van Gestel, S. Viaene, M. Stepanova, J.A.K. Suykens, and J. Vanthienen. Benchmarking state of the art classification algorithms for credit scoring. Journal of the Operational Research Society, 54(6):627–635, 2003.

[] Basel Committee on Banking Supervision. International convergence of capital measurement and capital standards: a revised framework. Technical report, BIS, June 2006.

[] C. Blum. Beam-ACO — hybridizing ant colony optimization with beam search: An application to open shop scheduling. Computers & Operations Research, 32(6):1565–1591, 2005.

[] B. Bullnheimer, R.F. Hartl, and C. Strauss. A new rank based version of the ant system: A computational study. Central European Journal for Operations Research and Economics, 7(1):25–38, 1999.

[] B. Bullnheimer, R.F. Hartl, and C. Strauss. Applying the ant system to the vehicle routing problem. In S. Voss, S. Martello, I.H. Osman, and C. Roucairol, editors, Meta-Heuristics: Advances and Trends in Local Search Paradigms for Optimization, 1999.

[] G. Di Caro and M. Dorigo. Antnet: Distributed stigmergetic control for communications networks. Journal of Artificial Intelligence Research, 9:317–365, 1998.

[] A. Colorni, M. Dorigo, V. Maniezzo, and M. Trubian. Ant system for jobshop scheduling. Journal of Operations Research, Statistics and Computer Science, 34(1):39–53, 1994.

[] V.S. Desai, J.N. Crook, and G.A. Overstreet Jr. A comparison of neural networks and linear scoring models in the credit union environment. European Journal of Operational Research, 95(1):24–37, 1996.

[] T.G. Dietterich. Approximate statistical test for comparing supervised classification learning algorithms. Neural Computation, 10(7):1895–1923, 1998.

[] M. Dorigo and L.M. Gambardella. Ant colony system: A cooperative learning approach to the traveling salesman problem. IEEE Transactions on Evolutionary Computation, 1(1):53–66, April 1997.

[] M. Dorigo, V. Maniezzo, and A. Colorni. Positive feedback as a search strategy. Technical Report 91016, Dipartimento di Elettronica e Informatica, Politecnico di Milano, IT, 1991.

[] M. Dorigo, V. Maniezzo, and A. Colorni. Ant System: Optimization by a colony of cooperating agents. IEEE Transactions on Systems, Man, and Cybernetics Part B: Cybernetics, 26(1):29–41, 1996.

[] M. Dorigo and T. Stutzle. Ant Colony Optimization. MIT Press, Cambridge, MA, 2004.

[] U.M. Fayyad and K.B. Irani. Multi-interval discretization of continuous-valued attributes for classification learning. In Proceedings of the Thirteenth International Joint Conference on Artificial Intelligence (IJCAI), pages 1022–1029, Chambery, France, 1993. Morgan Kaufmann.

[] L.M. Gambardella and M. Dorigo. Ant-Q: A reinforcement learning approach to the traveling salesman problem. In A. Prieditis and S. Russell, editors, Proceedings of the Twelfth International Conference on Machine Learning, pages 252–260, Palo Alto, CA, 1995. Morgan Kaufmann Publishers Inc.

[] D. Hand. Pattern detection and discovery. In D. Hand, N. Adams, and R. Bolton, editors, Pattern Detection and Discovery, volume 2447 of Lecture Notes in Computer Science, pages 1–12. Springer, 2002.

[] J. Handl, J. Knowles, and M. Dorigo. Ant-based clustering and topographic mapping. Artificial Life, 12(1):35–61, 2006.

[] W.E. Henley and D.J. Hand. Construction of a k-nearest neighbour credit-scoring system. IMA Journal of Mathematics Applied In Business and Industry, 8:305–321, 1997.

[] B. Liu, H.A. Abbass, and B. McKay. Density-based heuristic for rule discovery with ant-miner. In 6th Australasia-Japan Joint Workshop on Intelligent and Evolutionary Systems (AJWIS2002), Canberra, Australia, 2002.

[] B. Liu, H.A. Abbass, and B. McKay. Classification rule discovery with ant colony optimization. In IAT, pages 83–88. IEEE Computer Society, 2003.

[] D. Martens, M. De Backer, R. Haesen, B. Baesens, C. Mues, and J. Vanthienen. Ant-based approach to the knowledge fusion problem. In Proceedings of the Fifth International Workshop on Ant Colony Optimization and Swarm Intelligence, Lecture Notes in Computer Science, pages 85–96. Springer, 2006.

[] D. Martens, M. De Backer, R. Haesen, M. Snoeck, J. Vanthienen, and B. Baesens. Classification with ant colony optimization. IEEE Transactions on Evolutionary Computation, 11(5):651–665, 2007.

[] R. Montemanni, L.M. Gambardella, A.E. Rizzoli, and A. Donati. Ant colony system for a dynamic vehicle routing problem. Journal of Combinatorial Optimization, 10(4):327–343, 2005.

[] R.S. Parpinelli, H.S. Lopes, and A.A. Freitas. An ant colony based system for data mining: Applications to medical data. In Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2001), pages 791–797, San Francisco, California, USA, 2001. Morgan Kaufmann.

[] D. Quintana, C. Luque, and P. Isasi. Evolutionary rule-based system for IPO underpricing prediction. In GECCO '05: Proceedings of the 2005 conference on Genetic and evolutionary computation, pages 983–989, New York, NY, 2005. ACM Press.

[] D.J. Sheskin. Handbook of parametric and nonparametric statistical procedures. Chapman and Hall/CRC, 2000.

[] K. Socha, J. Knowles, and M. Sampels. A MAX-MIN ant system for the university timetabling problem. In M. Dorigo, G. Di Caro, and M. Sampels, editors, Proceedings of ANTS 2002 – Third International Workshop on Ant Algorithms, volume 2463 of Lecture Notes in Computer Science, pages 1–13, Berlin, Germany, September 2002. Springer-Verlag.

[] A. Steenackers and M.J. Goovaerts. A credit scoring model for personal loans. Insurance: Mathematics and Economics, 8:31–34, 1989.

[] T. Stutzle and H.H. Hoos. Improving the ant-system: A detailed report on the MAX-MIN ant system. Technical Report AIDA 96-12, FG Intellektik, TU Darmstadt, Germany, 1996.

[] T. Stutzle and H.H. Hoos. MAX-MIN ant system. Future Generation Computer Systems, 16(8):889–914, 2000.

[] D. Tasche. Traffic lights approach to PD validation. Technical report, 2003.

[] E. Tsang, P. Yung, and J. Li. EDDIE-automation, a decision support tool for financial forecasting. Decision Support Systems, 37(4):559–565, September 2004.

[] T. Van Gestel, B. Baesens, P. Van Dijcke, J. Garcia, J.A.K. Suykens, and J. Vanthienen. A process model to develop an internal rating system: sovereign credit ratings. Decision Support Systems, 42(2):1131–1151, 2006.

[] T. Van Gestel, B. Baesens, P. Van Dijcke, J.A.K. Suykens, J. Garcia, and T. Alderweireld. Linear and nonlinear credit scoring by combining logistic regression and support vector machines. Journal of Credit Risk, 1(4), 2005.

[] J. Vanthienen, C. Mues, and A. Aerts. An illustration of verification and validation in the modelling phase of KBS development. Data and Knowledge Engineering, 27(3):337–352, 1998.

[] J. Vanthienen and G. Wets. From decision tables to expert system shells. Data and Knowledge Engineering, 13(3):265–282, 1994.

[] A. Wade and S. Salhi. An ant system algorithm for the mixed vehicle routing problem with backhauls. In Metaheuristics: computer decision-making, pages 699–719, Norwell, MA, 2004. Kluwer Academic Publishers.

[] D. West. Neural network credit scoring models. Computers and Operations Research, 27:1131–1152, 2000.

[] I.H. Witten and E. Frank. Data mining: practical machine learning tools and techniques with Java implementations. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 2000.

[] M.B. Yobas, J.N. Crook, and P. Ross. Credit scoring using neural and evolutionary techniques. IMA Journal of Mathematics Applied in Business and Industry, 11:111–125, 2000.

Appendix: Screenshots of AntMiner+ GUI

Several screenshots of the AntMiner+ Graphical User Interface are provided in Figs. 7 and 8.

Fig. 7 shows the initial menu of AntMiner+, allowing the user to choose the number of ants and evaporation rate ρ. The 'minimal fraction uncovered data' input variable can be used as an alternative to the early stopping criterion: no more rules will be extracted when all but x% of the data has been covered by the extracted rule set. Note that all experiments were conducted with the early stopping criterion.

Fig. 8 shows the construction graph for the SME data set during different stages of execution: from initialization (top) to convergence (bottom), with the width of the edges being proportional to their pheromone level. In the bottom box of each screenshot, the extracted rules are displayed with their accuracy on the training, validation and test sets.

[Figure: two panels, (a) and (b), with path selection percentages 50%/50% and 33%/67%.]

Fig. 1. Path selection directed by pheromone: the more pheromone on a path, the more likely an ant will follow the path. This simple mechanism of indirect communication is sufficient for the overall ant colony to find short paths from the nest to the food source.
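The percentages in Fig. 1 are what results when a path is chosen with probability proportional to its pheromone level. The sketch below shows only this proportional rule; it deliberately leaves out the heuristic values and exponents that ACO algorithms typically add, and the function names and pheromone values are illustrative assumptions.

```python
import random


def selection_probabilities(pheromone):
    """Probability of each path, proportional to its pheromone level."""
    total = sum(pheromone)
    return [tau / total for tau in pheromone]


def choose_path(pheromone, rng=random):
    """Draw the index of one path by roulette wheel selection."""
    return rng.choices(range(len(pheromone)), weights=pheromone, k=1)[0]
```

With equal pheromone on two paths, `selection_probabilities([1.0, 1.0])` gives an even split; once the second path carries twice as much pheromone, as in `[1.0, 2.0]`, it attracts about two thirds of the ants.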

[Figure: construction graph from Start to Stop with vertex groups Class (bad), Purpose (car, education, business), two Savings Account groups (0€, 100€, 250€, 500€, 1000€, 3000€) and Credit History (all paid, none taken, critical), along with 'any' vertices.]

Fig. 2. Example of a path described by an ant for a credit scoring construction graph defined by AntMiner+. The rule corresponding to the chosen path is if Purpose = car and Savings Account ∈ [0€, 500€] then class = bad.

[Figure: construction graph from Start to Stop with weight parameter vertex groups α (a1 to a4) and β (b1 to b4), a class vertex group, and variable vertex groups V1 to Vm.]

Fig. 3. Multiclass construction graph of AntMiner+, with the inclusion of weight parameters.

[Figure: flow diagram with the components data, AntMiner+, an example rule set (R1 to R4), V&V, PD, LGD, EAD, Decision Support System, Capital Requirements, and Backtesting & Benchmarking.]

Fig. 4. Credit risk management system with the use of AntMiner+. The induced rule set is verified and validated, after which it can be used as a decision support system to make actual credit risk decisions (accept or deny credit), and to calculate capital requirements. Finally, backtesting and benchmarking validate the credit risk management system over time.

condition subjects action subjects

condition entries action entries

Fig. 5. DT quadrants.

1. Condition1 | 2. Condition2 | 3. Condition3 | 1. Class1 | 2. Class2
yes | yes | yes | – | ×
yes | yes | no  | × | –
yes | no  | yes | × | –
yes | no  | no  | × | –
no  | yes | yes | – | ×
no  | yes | no  | × | –
no  | no  | yes | – | ×
no  | no  | no  | × | –

(a) Expanded decision table

1. Condition1 | 2. Condition2 | 3. Condition3 | 1. Class1 | 2. Class2
yes | yes | yes | – | ×
yes | yes | no  | × | –
yes | no  | –   | × | –
no  | –   | yes | – | ×
no  | –   | no  | × | –

(b) Contracted decision table

Fig. 6. Minimizing the number of columns of a lexicographically ordered DT [? ].

Fig. 7. Screenshot of AntMiner+ initial menu.


Fig. 8. Screenshots of AntMiner+ run on the SME credit risk data set during different stages of execution: from initialization (top) to convergence (bottom).


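Table 1. Illustration of Quality Measure Q+

           | Sex | Term | Real Estate | Customer | R1  | R2
i1         | M   | 1    | A           | Bad      | √   | √
i2         | M   | 5    | N           | Bad      | √   |
i3         | F   | 15   | A           | Good     |     |
i4         | M   | 10   | H           | Bad      | √   |
i5         | M   | 15   | H           | Good     | ×   |
Confidence |     |      |             |          | 3/4 | 1/1
Coverage   |     |      |             |          | 3/5 | 1/5
Q+         |     |      |             |          | 1.35 | 1.2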
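The figures in Table 1 can be reproduced with a few lines of code. The sketch assumes, as the table's numbers suggest, that confidence is the share of covered instances classified correctly, coverage is the number of correctly covered instances divided by the five instances shown, and Q+ is their sum. The sets of covered instances are read off the check marks; the rule conditions themselves are not needed.

```python
from fractions import Fraction

# Customer class of the five instances in Table 1.
LABELS = {"i1": "Bad", "i2": "Bad", "i3": "Good", "i4": "Bad", "i5": "Good"}

# Instances covered by each rule, read from the marks in Table 1.
# Both rules predict class Bad (the covered Good instance i5 is marked wrong).
COVERED = {"R1": ["i1", "i2", "i4", "i5"], "R2": ["i1"]}


def quality(covered, predicted_class, labels):
    """Return (confidence, coverage, Q+) as exact fractions."""
    correct = sum(1 for i in covered if labels[i] == predicted_class)
    confidence = Fraction(correct, len(covered))
    coverage = Fraction(correct, len(labels))
    return confidence, coverage, confidence + coverage
```

Calling `quality(COVERED["R1"], "Bad", LABELS)` yields 3/4, 3/5 and 27/20, the last of which is the 1.35 reported for R1.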

Table 2. Example credit scoring rule set

R1: if (Checking Account < 100€ and Duration > 15 m and
        Credit History = no credits taken and Savings Account < 500€)
    then class = bad
R2: else if (Purpose = new car/repairs/education/others and
        Credit History = no credits taken/all credits paid back duly at this bank and
        Savings Account < 500€)
    then class = bad
R3: else if (Checking Account < 0€ and
        Purpose = furniture/domestic appliances/business and
        Credit History = no credits taken/all credits paid back duly at this bank and
        Savings Account < 250€)
    then class = bad
R4: else if (Checking Account < 0€ and Duration > 15 m and
        Credit History = critical account and Savings Account < 250€)
    then class = bad
R5: else class = good

Table 3. Example SME bankruptcy rule set

R1: if (Capital & Reserves (Tr) < -0.001 and Turnover (% TA) < 0.16 and
        Current profit/Current loss (R) < -25000)
    then class = default
R2: else if (Turnover (Tr) < -0.001 and Solvency Ratio (%) (Tr) < -20 and
        Total Assets (Tr) < 0)
    then class = default
R3: else class = non-default

Table 4. Example bank rating rule set

R1: if Region = not EU15 and Loan Loss Res/Gross Loans ≥ 3 and
        ln(Total Assets) ≤ 8.6
    then class = low rating
R2: else if Loan Loss Prov/Net Int Rev ≥ 10.5 and Return on Avg Equity ≤ -3.4
    then class = low rating
R3: else if Region = not EU15 and Total capital Ratio ≤ 10 and
        Net Interest Margin ≤ 2.1
    then class = low rating
R4: else if Region = EU Next or Others and Loan Loss Prov/Net Int Rev ≥ 42
    then class = low rating
R5: else if Region = JPY or EU Next or Others and Cost to Income Ratio ≥ 80 and
        Net Loans/Cust&ST Funding ≥ 46
    then class = low rating
R6: else if Region = JPY or EU Next or Others and Loan Loss Prov/Net Int Rev ≥ 42 and
        Net Interest Margin ≤ 2.1
    then class = low rating
R7: else class = good rating

Table 5. Average out-of-sample performances

                |               | german | SME  | banks | Average
Accuracy        | AntMiner+     | 71.9   | 86.2 | 84.3  | 80.8
                | C4.5          | 74.2   | 82.7 | 85.6  | 80.8
                | SVM           | 73.7   | 86.3 | 87.7  | 82.6
                | Majority Vote | 66.7   | 83.2 | 61.0  | 70.3
Number of Rules | AntMiner+     | 5.7    | 2.6  | 6.4   | 4.9
                | C4.5          | 14.8   | 7.4  | 17    | 13.1

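Table 6. Decision table predicting retail loan defaults

Duration | Purpose | Checking Account | Savings Account | Credit History | Bad | Good
≤ 15m | – | – | – | – | – | ×
> 15m | car(old)/others | – | – | – | – | ×
> 15m | furniture/business | <0€ | <250€ or unknown/no savings | – | × | –
> 15m | furniture/business | <0€ | ≥250€ | – | – | ×
> 15m | furniture/business | ≥0 and <100€ or no checking account | <500€ | all credits paid back duly or all credits at this bank paid back duly | – | ×
> 15m | furniture/business | ≥0 and <100€ or no checking account | <500€ | existing credits paid back duly till now or critical account | × | –
> 15m | furniture/business | ≥0 and <100€ or no checking account | ≥500€ or unknown/no savings | – | – | ×
> 15m | furniture/business | ≥100€ | – | – | – | ×
> 15m | radio/television | <0€ | – | all credits paid back duly or all credits at this bank paid back duly | – | ×
> 15m | radio/television | <0€ | – | existing credits paid back duly till now or critical account | × | –
> 15m | radio/television | ≥0€ or no checking account | – | – | – | ×
> 15m | car(new)/retraining | – | – | – | – | ×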

