Cairo University
Institute of Statistical Studies and Research
A DECISION SUPPORT SYSTEM FOR
CREDIT CARDS APPLICATION ASSESSMENT
Prepared by
Ahmed Mahmoud Saleim Eliwa
A thesis submitted to the Institute of Statistical Studies and Research, Cairo
University, in partial fulfillment of the requirements for the master's degree in
Operations Research, Department of Operations Research.
Under supervision of
Prof. Bahaa El-Din Helmy Ismail Undedicated professor
Computer sciences and information department
Institute of Statistical Studies and Research
Cairo University
& Dr. Assem Abd El-Fattah Tharwat
Head of decision support department
Faculty of Computers & Information
Cairo University
& Dr. Ramadan Abd El-Hamed Zen El-Den
Operations research department
Institute of Statistical Studies and Research
Cairo University
June 2007
Contents
Summary
Chapter one: Introduction
1.1 Introduction
1.2 Credit card definition and benefits
1.3 The steps of issuing credit cards
1.4 Properties of credit card application assessments process
1.5 Judgment process for credit card application assessment
1.6 The disadvantages of using judgment process for credit card application assessment
1.7 Risk associated with credit card lending
1.8 Cost of wrong decisions in credit card application assessment
Chapter two: Scoring system
2.1 Introduction
2.2 History of scoring system
2.3 Definition of scoring system
2.4 Types of scoring system
2.5 Scoring system applications
2.6 Potential Benefits of scoring system
2.7 Scoring system limitation
2.8 Scoring system issues
2.9 Scoring system and risk management
2.9.1 Basel II
2.9.2 Risk management
2.9.3 Scoring system as a risk management tool
Chapter three: Problem formulation and survey
3.1 Introduction
3.2 Data description
3.2.1 Data used to build the credit score model
3.2.2 The output data of the credit score model
3.3 Building the credit score model
3.4 Literature survey
3.4.1 Classification the credit score methods
3.4.2 Statistical techniques
3.4.2.1 Linear discriminate analysis (linear probability model)
3.4.2.2 Logistic regression
3.4.2.3 Probit and tobit analysis
3.4.2.4 Semiparametric regression
3.4.2.5 Bayesian classification
3.4.2.6 Nearest neighbor approach
3.4.3 Non statistical techniques
3.4.3.1 Multicriteria decision aid method (MCDA)
3.4.3.2 Linear programming
3.4.3.3 Integer programming
3.4.3.4 Goal programming
3.4.3.5 Neural network
3.4.3.6 Expert system
3.4.3.7 Genetic algorithm
3.4.3.8 Classification tree
3.4.3.9 Rough sets theory
3.4.3.10 Analytical hierarchy process
3.5 Comparisons of techniques used to build credit score
Chapter four: Decision support system
Part I: Introduction to decision support system
4. I.1 Definition of decision support system (DSS)
4. I.2 Characteristic and capabilities of DSS
4. I.3 Decision support system components
4. I.4 Decision support system application (type, classification, taxonomy of DSS)
4. I.5 Constructing a decision support system
4. I.6 DSS technologies levels and tools
4. I.6.1 Relationships among the technologies levels
4. I.6.2 Future trends of decision support system
4. I.7 Approaches to DSS construction
4. I.7.1 Quick hit
4. I.7.2 Staged development
4. I.7.3 Complete DSS
4. I.8 Alternate development methodologies
4. I.8.1 Parallel development (traditional methodologies)
4. I.8.2 Rapid application development (RAD) methodologies
4. I.8.2.1 Phased development
4. I.8.2.2 Prototyping (evolutionary, iterative)
4. I.8.2.3 Throwaway prototyping
4. I.9 Team developed vs. user developed DSS
4. I.10 DSS development platforms
4. I.11 Issues associated with DSS
Part II: The proposed decision support system
4. II.1 Introduction
4. II.2 The proposed decision support system
4. II.3 Building the proposed decision support system
4. II.3.1 Decision support system database
4. II.3.2 Model base for the proposed DSS
4. II.3.2.1 A composite rule induction system (CRIS)
4. II.3.2.2 Naïve Bayesian classification
4. II.3.2.3 Linear programming (MSD model)
4. II.3.3 User interface for the proposed DSS
4. II.4 Summary
Chapter five: An application: Building credit score model for credit card application assessment
5.1. Introduction
5.2. Description of the current system
5.3. Description of training and test sample
5.4. Building empirical credit score models
5.4.1. The subsystem: Composite Rule Induction System
5.4.2. The subsystem: Bayesian classification
5.4.3. The subsystem: linear programming based model (MSD model)
5.4.4. Building empirical credit score models conclusion
5.5. Improving the credit score models
5.5.1. Building a new Bayesian model
5.5.2. Building a new MSD model
5.5.3. Improving the accuracy of credit score models conclusion
5.6. Testing the models using new sample
5.7. Building new models using new sample
5.7.1. Bayesian model
5.7.2. MSD model
5.7.3. Building new models using the new sample conclusion
5.8. General conclusion
Chapter six: Conclusions and points for further research
References
SUMMARY
This thesis consists of six chapters, which can be described as
follows:
Chapter one: Introduction
Chapter one introduces the problem and presents the definition of
the credit card, its benefits, the steps of issuing a credit card, the
properties of credit card application assessment, the current method used
for credit card application assessment and its issues, and the cost of
wrong decisions in credit card application assessment.
Chapter two: Scoring system
In chapter two, we present the definition of credit score, its types,
history, applications, benefits, limitations, and issues, and the
importance of credit score as a risk management tool.
Chapter three: Problem formulation and survey
In chapter three, we describe the input data used for building
a credit score model, its output, and the problem formulation, and we
review the methods used for building credit score models.
Chapter four: Decision support system
Chapter four consists of two parts. The first part gives an overview
of decision support systems. The second part describes the proposed
DSS. The model base management system for the proposed decision
support system is based on the Composite Rule Induction System (CRIS),
Bayesian classification, and linear programming.
Chapter five: An application – credit score model for credit card
application assessment
In this chapter we apply the credit score models presented in
chapter four to data obtained from a financial organization that depends
on a deductive credit score model, and we present recommendations to
improve the accuracy of the model.
Chapter six: Conclusions and points for further research
In chapter six the conclusions and points for further research are
presented.
Chapter one
Introduction
1.1. Introduction
The last twenty years have seen rapid growth in retail credit
markets, which have come to play an important role in the Egyptian
economy. Retail credit is defined as "homogeneous portfolios comprising a
large number of small, low value loans with either a consumer or business
focus, and where the incremental risk of any single exposure is small".
Retail credit focuses on specific product types that are considered retail
in nature; these include credit cards, personal finance, education and auto
loans, overdrafts, and residential mortgages. These types of credit make up
an important part of bank revenues, and an error in the credit decision for
a single customer means that the bank loses the profit obtained from
other successful customers, so banks must pay more attention to credit
decisions for this type of credit. Banks cannot use the models used to
analyze corporate loans to analyze retail credit, because retail credit
has special features: the exposure is to an individual person or persons,
the exposure is one of a large pool of loans managed by the bank, and each
individual exposure has a low value, Edward I. Altman (2002), Basel
Committee on Banking Supervision (2001d), Gayle Delong, and Anthony
Saunders (2003) and Linda Allen, Gayle Delong, and Anthony Saunders
(2004).
Credit cards are a fast-growing business segment and have become the
most accepted, convenient, and profitable financial products. The credit
card is a popular non-cash instrument that is increasingly replacing cash.
The advent of credit cards in the 1960s meant that consumers could finance
all their purchases, from hair clips to computer chips to holiday trips, by
credit card, The Comptroller of the Currency (1998) and Sujit Chakravorti
(2003).
The number of credit card holders has increased rapidly; at the
same time, the number of customers who cannot fulfill their obligations
to the banks has also increased. This has forced banks to search for
methodologies that allow them to evaluate accurately the creditworthiness
of each credit card applicant and determine whether the applicant belongs
to the good group or the bad group, in order to minimize the risk of
insolvency. The objective of these methodologies is to increase the
accuracy of credit decisions so as to increase profits and decrease losses,
Nikolaos F. Matsatsinis (2002) and Jih-Jeng Huang, et al (2005).
Credit score has been used to support banks in making decisions
related to a variety of their products. The most obvious and common use
is to help banks estimate whether a new applicant will pay back his
liabilities and determine whether an existing customer will default. Credit
score is used to rank applicants by their expected performance. It gives
quick, objective, more accurate, and consistent credit decisions. Moreover,
the importance of credit score increases with the growth of the credit
industry and with Basel II, which pushes banks to develop internal credit
risk measurements, Liu, Y. (2001), Kasper Roszbach (2003), Nikolaos F.
Matsatsinis and C. Erik Larson (2004) and David B. Edelman (2005).
1.2. Credit card definition and benefits
Credit cards are an electronic payment method that involves a form
of borrowing, often with charges. The main idea of the credit card is that
the issuing bank guarantees payment to merchants of the value of cash
withdrawals or purchases in return for a signed receipt from the consumer.
A credit card enables the consumer to obtain goods or services up to a
specific credit limit and to pay either the full amount or a minimum amount
within a specific period; the consumer pays interest charges on the amount
left unpaid.
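The payment mechanism described above can be illustrated with a minimal sketch. It assumes a simple monthly interest rate applied once to the unpaid amount; the function name, the rate, and the amounts are hypothetical, and real issuers compute finance charges in more detailed ways (for example on daily balances).

```python
def statement_after_payment(balance, payment, monthly_rate, credit_limit):
    """Return (unpaid amount, interest charge, remaining available credit).

    Simplifying assumption: interest is charged once, at `monthly_rate`,
    on whatever part of the balance is left unpaid.
    """
    if payment < 0 or payment > balance:
        raise ValueError("payment must be between 0 and the balance")
    unpaid = balance - payment
    interest = round(unpaid * monthly_rate, 2)
    available = credit_limit - unpaid - interest
    return unpaid, interest, available


# A consumer who pays the full amount is charged no interest.
full = statement_after_payment(1000.0, 1000.0, 0.02, 5000.0)

# A consumer who pays only part of the balance is charged interest
# on the amount left unpaid.
partial = statement_after_payment(1000.0, 100.0, 0.02, 5000.0)

print(full)
print(partial)
```

The sketch only shows the distinction the text draws between paying the full amount and revolving part of the balance.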
Credit cards provide benefits to all participants in the credit card
network, The Comptroller of the Currency (1998), Financial Consumer
Agency of Canada (2001), Lin Wei Ping (2003), Sujit Chakravorti (2003)
and Business Payment System Wisconsin "BPS" (2004). They provide
benefits to consumers, merchants, issuers "issuing banks are banks that
directly issue the credit card", acquirers "acquiring banks are banks that
have entered into an agreement with a merchant to accept deposits generated
by credit card transactions", and network operators as follows:
1- Consumers
Credit cards provide many benefits to consumers as follows:
- Credit cards provide consumers with a secure, reliable, and convenient
method of payment.
- Credit cards give consumers the freedom to buy more merchandise
and pay from future income.
- Credit cards are more convenient to carry than cash and make
purchases through the internet easier.
2- Merchants
Credit cards enable merchants to make more sales since:
- Credit cards allow merchants to sell to illiquid consumers or to
those paying with future income and to receive the value of the goods
within 48 hours of submitting the transaction to their acquirers.
- Credit card consumers spend more than consumers who only carry
cash.
- Consumers are more likely to shop at businesses where credit cards
are accepted and tend to return to the same business again.
3- Issuers
Credit cards offer issuers more advantages than other retail
products, as follows:
- A credit card portfolio involves smaller loans that are spread across
a large number of consumers.
- Banks offer different product programs for entire consumer
segments.
- Credit cards enable banks to aggregate patterns of consumer loan
behavior and build banking expertise that is used in other consumer
lending products.
- Credit cards offer a better return on assets than commercial lending;
the credit card issuer earns revenue from consumers and acquirers.
Consumers may pay annual fees, finance charges if they revolve, and
other fees, such as cash advance and over-limit fees. Acquirers pay
interchange fees to issuers to compensate them for the costs of
attracting and maintaining credit card holders.
4- Acquirers
Acquirers earn revenue from merchants by bilaterally setting
merchant discount rates and pay interchange fees to issuers.
5- Network operators
Network operators, such as Visa and MasterCard, usually operate as
non-profit organizations. The main purpose of these organizations is to
meet the needs of their members by providing a set of rules, the underlying
infrastructure, and some level of research and development to improve
their networks. The network sets the interchange fees, which are paid by
acquirers to issuers.
1.3. The steps of issuing credit cards
The steps of issuing credit cards can be summarized as follows:
1) The applicant fills in an application form, which contains about 20 to
30 items, including age, gender, address, telephone numbers, etc.
2) The applicant provides the required documents; these documents
differ from one bank to another.
3) The application form and the documents are checked to ensure
the correctness of the data and to detect false information.
4) The credit analyst examines past credit history, investigates the
application and other documents, and uses his/her past experience to
decide on acceptance or rejection and on the credit card limit.
In steps three and four, there may be a need to make a field
investigation or to ask the applicant for more documents.
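These steps are summarized in Figure 1.1.
[Figure 1.1: flowchart with the boxes "Applicant fills the application and
offers required documents", "Quick check to application and documents",
"Office investigation", "Field investigation", "Return the document and the
application to the applicant", and "Determine the credit card limit and
issuing the card"; the arrows are labeled "Accept", "Need field
investigation", and "Need more documents or reject".]
Figure 1.1: The steps of issuing credit cards process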
1.4. Properties of credit card application assessments process
Credit card application assessments have the following properties,
Tetsuo Tamai and Masayuki Fujita (1987), Liu, Y. (2001), and Hussein
Almuallim, Shigeo Kaneda and Yasuhiro Akiba (2002):
1- The applicants usually have approximately homogeneous profiles, but
they vary in liquidity, required limit, and risk.
2- Human factors play an important role in the process of credit card
application assessment; thus, assessments do not necessarily follow
rigid rules like physical or chemical laws.
3- The appropriateness and uniformity of the decision are very
important factors, as are cost and labor savings.
1.5. Judgment process for credit card application assessment
Credit card application assessment has been based on human
judgment to assess the risk associated with an applicant. The decision in
credit card application assessment is made by experts and depends on
imprecise and imperfect knowledge. The credit card analyst investigates
the application, builds a profile of the applicant from the description
given in the application form, and matches this profile to certain patterns
in his past experience to determine the degree of credit risk associated
with the applicant and decide whether to accept or reject issuing the
credit card. Generally, the decision is based on the 4Cs, Tetsuo Tamai and
Masayuki Fujita (1987), Thomas L. C. (2000) and Liu, Y. (2001). The 4Cs are:
1- The Character of the applicant – whether the applicant or their family
is known to the bank team.
2- The Capital – the credit card limit the applicant asks for.
3- The Capacity – whether the applicant has free income for repayment
(financial ability to repay debt).
4- The Condition – the condition of the market (the general
economic environment).
1.6. The disadvantages of using judgment process for credit card
application assessment
Banks have had difficulties with their credit card management. The
number of delinquent customers is increasing, it is difficult to
distinguish between good and bad customers, and customers want instant
decisions. These developments pose new challenges to decision makers, who
must make consistent and intelligent real-time decisions, Liu, Y. (2001).
Judgmental methods depend on criteria that are not systematically
tested and that vary when applied by different individuals; thus the
decision is non-uniform, subjective, and opaque, and depends on the personal
and empirical knowledge of each credit analyst, Karel Komorad (2002)
and Federal trade commission for the consumer (2005).
Tetsuo Tamai and Masayuki Fujita (1987) and Thomas L. C.,
David B. Edelman, and Jonathan N. Crook (2004) summarize the
disadvantages of using the judgment method as follows:
1- Banks now try to identify risky customers and profitable
customers. It is usually difficult to distinguish the profitable
customers from the risky ones, because both share the common
characteristic of using their cards well. Judgment does not enable
banks to distinguish accurately between risky and profitable
customers.
2- Judgment needs more time to assess a credit card application, so
banks cannot give customers a quick answer, especially with the
growing number of applicants who apply for credit cards.
3- Judgment does not enable banks to get rid of individual preferences
completely, so it is difficult to preserve stable and uniform
judgment.
4- Judgment does not enable banks to set clear criteria for
acceptance.
5- Different experts may make different judgments for the same
applicant.
6- The same expert may not give the same opinion when confronted
with the same customer twice over a period of time.
1.7. Risk associated with credit card lending
The Comptroller of the Currency (1998) defines the primary risks
associated with credit card lending as follows:
1- Credit risk
Credit risk is the risk to earnings or capital arising from an obligor's
failure to meet the terms of any contract with the bank or otherwise to
perform as agreed.
2- Transaction risk
Transaction risk is the risk to earnings or capital arising from
problems with service or product delivery.
3- Liquidity risk
Liquidity risk is the risk to earnings or capital arising from a bank’s
inability to meet its obligations when they come due, without incurring
unacceptable losses.
4- Strategic risk
Strategic risk is the risk to earnings or capital arising from adverse
business decisions or improper implementation of those decisions.
5- Reputation risk
Reputation risk is the risk to earnings or capital arising from negative
public opinion.
6- Interest rate risk
Interest rate risk is the risk to earnings or capital arising from
movements in interest rates.
7- Compliance risk
Compliance risk is the risk to earnings or capital arising from
violations or non-conformance with laws, rules, regulations, prescribed
practices, or ethical standards.
1.8. Cost of wrong decisions in credit card application assessment
A wrong decision in a classification problem is the assignment of a new
sample to the wrong class. Thus in credit card application assessment
there are two types of error (misclassification), Michie, Spiegelhalter,
Taylor (1994), Thomas L. C., David B. Edelman, and Jonathan N. Crook
(2002) and Karel Komorad (2002):
1- First type of error: the bank classifies a good applicant as bad and
refuses to issue a credit card.
In this case the bank loses the profit from that applicant.
2- Second type of error: the bank classifies a bad applicant as good and
issues a credit card.
In that case the applicant may default and the bank will lose the used
credit card limit.
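The two types of error can be counted and priced as in the following minimal sketch. The labels "good" and "bad" follow the text; the function name, the sample decisions, and the cost figures (a lost profit per rejected good applicant and a lost limit per accepted bad applicant) are hypothetical values chosen only for illustration.

```python
def misclassification_cost(actual, predicted, lost_profit, lost_limit):
    """Count both error types and return (type1, type2, total cost).

    type1: good applicant classified as bad (profit is lost).
    type2: bad applicant classified as good (used limit is lost).
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    pairs = list(zip(actual, predicted))
    type1 = sum(1 for a, p in pairs if a == "good" and p == "bad")
    type2 = sum(1 for a, p in pairs if a == "bad" and p == "good")
    total = type1 * lost_profit + type2 * lost_limit
    return type1, type2, total


# Hypothetical outcomes for five applicants.
actual = ["good", "good", "bad", "bad", "good"]
predicted = ["good", "bad", "good", "bad", "good"]

result = misclassification_cost(actual, predicted,
                                lost_profit=100.0, lost_limit=1000.0)
print(result)
```

Because the two errors usually carry different costs, the total cost, rather than the raw number of errors, is the natural quantity to compare between assessment methods.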
Chapter two
Scoring system
2.1. Introduction
The decision to issue a credit card was based essentially on the
credit card analyst's judgment. The growth in the demand for credit cards
forced banks to search for more formal and objective methods to help the
credit card analyst; these methods are generally known as scoring systems.
Building a scoring system model is a complex and iterative process, and it
takes a long time to collect enough historical data. A scoring system uses
this historical data to study the effect of applicant characteristics on
behavior. Scoring system models rate applicants based on the data in the
application and the past performance of current customers. Scoring system
models often vary from one bank to another according to the type of credit
and what is expected from the scoring model. A good score model will not
perfectly predict the performance of customers, but it should give a
fairly accurate prediction, Loretta J. Mester (1997), Liu, Y. (2002),
Nikolaos F. Matsatsinis and C. Erik Larson (2004) and Federal trade
commission for the consumer (2005).
2.2. History of scoring system
Scoring systems have been in use for more than 60 years. A scoring
system was first used by Durand in 1941 to discriminate between good and
bad loans, based on Fisher's work in 1936, which discriminated between
groups in a plant population based on various measured characteristics.
During World War II a shortage of credit analysts occurred, so banks
wrote down the rules of thumb used by credit analysts to decide whether to
give loans. The first consultancy was formed in San Francisco by Bill Fair
and Earl Isaac in the early 1950s. Their system spread fast as financial
institutions found that using a scoring system was cheaper, faster, more
objective, and above all much more predictive than any judgmental scheme.
The arrival of credit cards in the late 1960s and the rising number of
people applying for credit cards increased the need for an automated system
and made banks realize the importance of scoring systems. In the 1980s, the
success of scoring systems in credit card application assessment was a
significant sign for banks to use scoring methods for other products, such
as personal loans, mortgage loans, and small business loans. During the
second half of the 1990s, mortgage underwriting increasingly incorporated
credit scores. Also in the 1990s, the growth of direct marketing led to the
use of scorecards to improve the response rate to advertising campaigns. In
1999 approximately 60% to 70% of all mortgages were underwritten using
credit scores. The success of scoring systems in banks motivated landlords,
employers, and insurance companies to use them, Thomas L. C. (2000),
Thomas L. C., David B. Edelman, and Jonathan N. Crook (2002), Karel
Komorad (2002), Consumer Federation of America (2002), Ferenc Kiss
(2003), Peng and Goh Chwee (2004) and Allen N. Berger and W. Scott
Frame (2005).
2.3. Definition of scoring system
A scoring system is a classification method concerned with classifying
a new customer into predefined groups according to the customer's
characteristics. The original meaning of a scoring system is to assign a
score to each customer, compare this score with given or calculated cut-off
points, which divide the predefined groups, and classify the customers into
the different classes. A scoring system tries to relate the characteristics
of a customer to the risk associated with that customer and uses this
relationship to build a model that classifies customers, according to their
risk, into predefined subgroups as accurately as possible, Liu, Y. (2001),
Doumpos M., Kosmido K., Bourakis G., and Zopounidis C. (2002) and Ki Mun
Jung & Thomas L. C. (2004).
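The comparison of a score with cut-off points can be sketched as follows. This is a minimal illustration, not a model from the literature cited above: the cut-off values, the group labels, and the function name are hypothetical, and a score equal to a cut-off point is assumed to fall in the higher group.

```python
from bisect import bisect_right


def classify_by_cutoffs(score, cutoffs, labels):
    """Map a score to one of len(cutoffs) + 1 predefined groups.

    `cutoffs` must be sorted in ascending order; labels[i] is the group
    for scores below cutoffs[i], and labels[-1] is the highest group.
    """
    if len(labels) != len(cutoffs) + 1:
        raise ValueError("need exactly one more label than cut-off points")
    if list(cutoffs) != sorted(cutoffs):
        raise ValueError("cut-off points must be in ascending order")
    return labels[bisect_right(cutoffs, score)]


# Hypothetical cut-off points dividing three predefined groups.
CUTOFFS = [500, 650]
LABELS = ["high risk", "medium risk", "low risk"]

for s in (480, 500, 649, 650, 700):
    print(s, classify_by_cutoffs(s, CUTOFFS, LABELS))
```

With a single cut-off point and the labels "bad" and "good", the same function reduces to the two-group case discussed in this thesis.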
A scoring system can be viewed as a classification method or as a
financial risk forecasting technique. It can be considered a
classification method since it classifies a new applicant into
predefined groups. A scoring model uses an enormous volume of data on
current customers and tries to find rules that separate good customers from
bad ones. These rules are then used to classify new applicants, Liu, Y.
(2001), Doumpos M., Kosmido K., Bourakis G., and Zopounidis C. (2002).
A scoring system can also be considered a financial risk forecasting
technique. Risk forecasting is the number one topic in modern finance,
since it helps to assess the risk corresponding to an applicant and to
distinguish between groups that have different credit risk
characteristics. It involves techniques that help banks assess the risk
associated with each customer, so they can manage and quantify the risk
and make decisions quickly and objectively, Thomas L. C., et al (2002) and
Karel Komorad (2002).
There are many definitions of a scoring system; these definitions can
be reviewed as follows:
- Loretta J. Mester (1997) defines a scoring system as a quantitative
method used to predict the probability that a loan applicant or
existing borrower will default or become delinquent.
- The Comptroller of the Currency (1998) defines scoring systems as
tools used to predict the behavior of new applicants based on the
performance of previous applicants.
- Lewis defines a scoring system as studying the creditworthiness of
any of the many forms of commerce under which an individual
obtains money, goods, or services on condition of repaying the
money or paying for the goods or services, along with a fee (the
interest), at some specific future date or dates, Karel Komorad
(2002).
- Thomas L. C., David B. Edelman, and Jonathan N. Crook (2002)
define a scoring system as the set of decision models and their
underlying techniques that aid lenders in the granting of consumer
credit.
- Thomas L. C. (2000) defines a scoring system as a decision process
whose input is the answers to the application form questions and
various information obtained from the credit reference bureau, and
whose output is the separation of applications into goods and bads,
Vladimir Bugera, Hiroshi Konno, and Stanislav Uryasev (2002).
- Mark Schreiner (2002) defines a scoring system as any technique that
forecasts future risk from current characteristics using knowledge of
past links between risk and characteristics.
2.4. Types of scoring system
There are many types of scoring system, especially as its objectives
have extended from classifying customers into predefined groups to covering
the three stages of the credit management process (pre-application stage,
credit application stage, and credit performance stage). Figure 2.1
presents the expansion of scoring into the three credit management stages,
Liu, Y. (2001):
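[Figure 2.1: pre-application stage: identification of potential applicants;
credit application stage: identification of acceptable applicants; credit
performance stage: identification of possible behavior of current
customers.]
Figure 2.1. Expanding of score model to different stages of credit
management process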
Many authors categorize the types of scoring system from different
points of view, as follows:
The Comptroller of the Currency (1998) divides the types of
scoring system into:
1- Application scoring
Application scores predict the probability that a consumer will repay
as contracted.
2- Credit bureau risk scoring
Credit bureau risk scores predict the customer's future credit
payment behavior to achieve superior predictive power.
3- Credit bureau bankruptcy scoring
Credit bureau bankruptcy scores predict the probability that a
customer will declare bankruptcy or become a collection problem at some
point.
4- Credit bureau revenue scoring
Revenue scores are used to rank prospective customers by the amount of
net revenue likely to be generated.
5- Behavioral or performance scoring
Behavior scores are used to segment current customers into groups based
on past behavior, to predict which ones will become delinquent, and to set
different strategies, e.g. collection strategies and renewal decisions.
6- Collection scoring
Collection scores are used to predict the probability that collection
efforts will succeed, the probability that a bank will receive a payment
from a delinquent customer, and the probability of recoveries after
charge-off.
Thomas L. C., David B. Edelman, and Jonathan N. Crook (2002)
and Vladimir Bugera, Hiroshi Konno, and Stanislav Uryasev (2002)
divide the scoring types into credit score and behavior score, since
banks must make two types of decision: the first concerns new customers,
to decide whether to grant credit or not, and the second concerns current
customers, for different purposes.
1- Credit score
Credit score deals with new applicants to decide which applicants will
be granted a credit card and which will be denied.
2- Behavior score
Behavior score deals with current customers to evaluate their credit
performance for different purposes, e.g. collection or extending
credit.
Liu, Y. (2001) and Kasper Roszbach (2003) divide the scoring
types according to their objectives into: marketing score (retention
score), application score, performance score (behavior score), bad debt
management score, and profit score.
1- Marketing score (retention score)
The objective of marketing score is to identify creditworthy
customers and measure their response to promotion activity. Marketing
score is also used to predict the probability of losing valuable
customers in order to build effective strategies for customer retention.
2- Application score
The objective of application score is to study the behavior of
current customers to decide whether or not to extend credit and to predict
whether a new customer will default.
3- Performance score (behavior score)
The objective of performance score is to study the credit behavior of
current customers in order to isolate problems before they occur, so that
more attention can be devoted to them.
4- Bad debt management score
The objective of bad debt management score is to build a collection
strategy to deal with delinquent accounts.
5- Profit score
The objective of profit score is to identify profitable and
non-profitable customers in order to maximize profit.
Mark Schreiner (2002) categorizes scoring systems according to the
type of risk the score forecasts into:
1- Pre disbursement scoring
Pre disbursement scoring predicts the probability that a
provisionally approved credit will default.
2- Post disbursement scoring
Post disbursement scoring predicts the probability that a current
customer will default.
3- Collection scoring
Collection scoring predicts the probability that a current customer
who is currently x days late will become x + z days late, where z is the
number of additional days the customer is expected to be late beyond the
x days.
4- Desertion scoring
Desertion scoring predicts the probability that a current customer
will apply to another bank once the current credit is paid off.
5- Visit scoring
Visit scoring is used before visiting the customer to predict the
probability of rejection before or after a visit.
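The probability that collection scoring forecasts in item 3 above can be approximated, as a minimal sketch, by a simple relative frequency over past accounts. This is an illustrative estimate only and is not the model used by Schreiner; the function name and the list of historical delays are hypothetical.

```python
def prob_further_delay(delays, x, z):
    """Estimate P(final delay >= x + z | final delay >= x).

    `delays` holds the final number of days late observed for past
    accounts. Returns None when no past account was at least x days late.
    """
    at_least_x = [d for d in delays if d >= x]
    if not at_least_x:
        return None
    reached = sum(1 for d in at_least_x if d >= x + z)
    return reached / len(at_least_x)


# Hypothetical final delays (in days) of ten past accounts.
history = [0, 5, 12, 30, 31, 45, 60, 90, 10, 35]

# Of the accounts that became 30 days late, which share reached 60 days?
p = prob_further_delay(history, x=30, z=30)
print(p)
```

A scoring model would condition this probability on customer characteristics as well; the sketch conditions only on the current delay x.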
All the above scoring system types are based on prior experience,
which can be acquired in a deductive (subjective) or an inductive
(empirical) way. Accordingly, any scoring system can be one of the
following, Liu, Y. (2001, 2002) and Mark Schreiner (2002):
1- Deductive (subjective) scoring system
In a deductive scoring system, a weight is given to each attribute,
the total score is obtained by adding these weights, and the customer is
classified into a predefined subgroup by comparing this total score with
a cut-off point. The attributes, their weights, and the cut-off point are
determined by the decision maker based on knowledge obtained from
experts.
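A deductive scoring system of this kind can be sketched in a few lines. The attributes, the weights, and the cut-off point below are hypothetical; in practice they would be set by the decision maker from expert knowledge, as described above.

```python
# Hypothetical expert-assigned weights for each attribute value.
WEIGHTS = {
    "age": {"under_25": 5, "25_to_40": 15, "over_40": 20},
    "residence": {"rent": 5, "own": 20},
    "years_in_job": {"under_2": 0, "2_or_more": 15},
}
CUTOFF = 40  # hypothetical cut-off point


def deductive_score(applicant, weights=WEIGHTS):
    """Add the weights of the applicant's attribute values."""
    return sum(weights[attr][applicant[attr]] for attr in weights)


def classify(applicant, weights=WEIGHTS, cutoff=CUTOFF):
    """Compare the total score with the cut-off point."""
    if deductive_score(applicant, weights) >= cutoff:
        return "good"
    return "bad"


applicant_a = {"age": "25_to_40", "residence": "own",
               "years_in_job": "2_or_more"}
applicant_b = {"age": "under_25", "residence": "rent",
               "years_in_job": "under_2"}

print(deductive_score(applicant_a), classify(applicant_a))
print(deductive_score(applicant_b), classify(applicant_b))
```

Keeping the weights in a table separate from the scoring function mirrors the fact that the decision maker can revise them without changing the procedure.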
2- Empirical (inductive) scoring system
Empirical scoring systems use past data about current customers and try
to find a relation between the customers' characteristics and the risk
associated with each one. These relations are expressed as a set of rules
or a mathematical formula using quantitative techniques such as linear
discriminant analysis, linear programming, neural networks, etc.
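The deductive scheme described above can be sketched as a small additive scorecard. The attribute names, weights, and cut off point below are hypothetical placeholders chosen only for illustration; they are not taken from any bank or from this study.

```python
# Minimal sketch of a deductive (subjective) scorecard.
# All attributes, weights and the cut off point are hypothetical.

WEIGHTS = {
    "owns_home": 25,
    "years_at_job_over_2": 20,
    "has_bank_account": 15,
    "previous_default": -40,
}
CUT_OFF = 30  # hypothetical cut off point set by the decision maker


def total_score(applicant):
    """Add the weights of the attributes the applicant possesses."""
    return sum(w for name, w in WEIGHTS.items() if applicant.get(name, False))


def classify(applicant):
    """Compare the total score with the cut off point."""
    return "good" if total_score(applicant) >= CUT_OFF else "bad"


applicant_a = {"owns_home": True, "years_at_job_over_2": True}
applicant_b = {"has_bank_account": True, "previous_default": True}
print(total_score(applicant_a), classify(applicant_a))
print(total_score(applicant_b), classify(applicant_b))
```

An empirical scoring system would keep the same structure but estimate the weights and the cut off point from past data instead of expert judgment.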
2.5. Scoring system applications
The applications and objectives of scoring system models are widespread.
The first successful application of scoring systems was in the area of
credit cards. The application area then spread to decisions related to
other credit products, e.g. personal loans, auto loans, small business
loans, housing, insurance, basic utility services, mail order firms,
telecommunications and employment. The objectives of scoring system
models were also extended to include identifying potential customers (pre
application stage), determining whether or not to grant credit (credit
application stage), and identifying the likely behavior of current
customers (credit performance stage). Scoring systems are also used to
help address some fundamental strategic issues such as Forecasting
Provisions and Collections Resource Requirements, Value of Underwriting
Process, Risk Based Pricing and Risk Based Processing, and Acceptance
Strategy/Strategic Planning, Liu, Y. (2001), Peng and Goh Chwee (2004)
and David B. Edelman (2005).
Figure 2.2 presents the expanding application areas of scoring systems,
Liu, Y. (2001).
Figure 2.2. Expanding of application areas of scoring system
[Figure: consumer credit (credit cards, personal loans, auto loans, home
loans, others); business credit (small business loans, other small
business loans); other similar decision problems (retailers, mail order
firms, telecommunications, others)]
Some examples of the application areas of scoring systems and their
objectives can be summarized as follows, Consumer Federation of America
(2002), Peng and Goh Chwee (2004) and Allen N. Berger and W. Scott
Frame (2005):
1- Banks use scoring systems to:
- Determine whether to accept or reject an applicant
- Measure credit risk
- Set credit limits
- Manage existing accounts
- Forecast the profitability of customers
- Identify target markets
- Underwrite small business credits
2- Insurance companies can use scoring systems to:
- Decide on applications for new insurance policies and renewals
of existing policies
- Adjust premiums
- Set medium term strategy
3- Landlords can use scoring systems to determine whether
potential tenants are likely to pay their rent on time.
4- Utility suppliers and home telephone and cell phone service
providers can use scoring systems to determine whether to
provide their services to a consumer.
5- Employers can use scoring systems to decide whether to hire a
potential employee, especially for posts where employees
handle large amounts of money.
2.6. Potential Benefits of scoring system
Using scoring systems provides many benefits, as follows:
1- Scoring systems reduce discrimination and the effect of personal
attitude, Steiner M. T. A. and Carnieri C. (1999), Liu, Y. (2001),
Consumer Federation of America (2002) and Peng and Goh Chwee
(2004), because they:
a- Use quantitative methods to analyze the customer's
creditworthiness
b- Encourage the credit analyst to concentrate on the difficult
individual cases and to focus only on the important information
needed to evaluate the credit risk.
2- Scoring systems allow the credit decision to be automated and
reduce human intervention, which increases the speed of the
assessment process, Steiner M. T. A. and Carnieri C. (1999),
Thomas L. C., David B. Edelman, and Jonathan N. Crook (2002) and
Jih-Jeng Huang, et al (2005).
3- Scoring systems enable banks to manage the credit portfolio
effectively and profitably because they help in determining the
credit card limit, interest, charges, and over limit rate, Peng and
Goh Chwee (2004).
4- Scoring systems help banks to build strong collection strategies,
Peng and Goh Chwee (2004) and Jih-Jeng Huang, et al (2005).
5- Scoring systems enable banks to distinguish creditworthy from non
creditworthy customers, so the default rate is expected to drop
after a scoring model is implemented, Steiner M. T. A. and
Carnieri C. (1999) and Thomas L. C. (2000).
6- The use of scoring systems allows lenders to underwrite and
monitor loans without actually meeting the borrower, David West
(2000).
7- Scoring systems reduce the cost of analysis, since fewer credit
analysts are needed when a scoring system is applied than when
banks depend on judgment, Steiner M. T. A. and Carnieri C. (1999),
David West (2000) and Jih-Jeng Huang, et al (2005).
8- Scoring systems can be used for risk pricing. Banks can use the
scoring system to identify higher risk customers and charge them
higher fees and higher interest rates, Consumer Federation of
America (2002).
2.7. Scoring system limitation
1- Some models are not transparent, and credit analysts may not
understand them explicitly, Liu, Y. (2001).
2- Some models are good at handling quantitative attributes but
cannot handle qualitative attributes, Liu, Y. (2001).
3- The attributes used in a scoring system reflect historical
information about a risk, but most credit defaults are caused by
factors that emerge after the credit is granted and may be due to
unobservable variables such as employment status and current
status, Liu, Y. (2001) and Peng and Goh Chwee (2004).
2.8. Scoring system issues
Many practical problems affect the accuracy of a scoring system model.
These problems can be categorized into three groups: problems related to
the sample, problems related to the attributes, and problems related to
the definition of classes, Liu, Y. (2002). They can be summarized as
follows:
a- Sample
1- One should consider which data are necessary to implement the
score. There is a trade off between the expense of collecting data
and the low accuracy caused by insufficient information, Karel
Komorad (2002) and Vladimir Bugera, Hiroshi Konno, and Stanislav
Uryasev (2002).
2- After the sample is taken, one should determine a suitable period
over which to gather information about the payment behavior of the
sampled customers, Vladimir Bugera, Hiroshi Konno, and Stanislav
Uryasev (2002) and Liu, Y. (2001).
3- Applicants who were rejected are not represented in the sample,
so the sample may be biased towards good applicants and there is
no information on the performance of rejected applicants, A. J.
Feelders (2000), Liu, Y. (2001), Karel Komorad (2002), and Peng
and Goh Chwee (2004).
b- Attributes
1- The attributes entering the scoring system should be chosen
carefully, and the preference for some attributes over others should
be justified, because irrelevant attributes distort the structure of
the data and decrease the accuracy of the scoring system model, Karel
Komorad (2002), Doumpos M., Kosmido K., Bourakis G., and Zopounidis
C. (2002), Peng and Goh Chwee (2004) and Jih-Jeng Huang, et al (2005).
2- The law in some countries does not allow information about race,
nationality, religion and gender to be used in building a score card.
3- The method used to aggregate the attributes in order to build the
scoring model and make the issuing decision should be accurate,
Zopounidis C. (2002) and Peng and Goh Chwee (2004).
4- A scoring system requires sufficient information about the credit
history before the score can be calculated, and this information may
not be available, Peng and Goh Chwee (2004).
5- Scoring systems depend on the assumption that the past can predict
the future: they use the characteristics of past applicants to
classify a new applicant. However, the distribution of the
characteristics sometimes changes over time, so the credit score must
be refreshed, Peng and Goh Chwee (2004).
c- Class definition
1- Defining the risk classes (good and bad) is very important for the
applicability of the scoring system model. Some banks use the number
of months of missed payments, the amount over the overdraft limit,
the current account turnover, or a function of these variables to
define the classes, Liu, Y. (2001).
2- Defining the proportion of good and bad customers in the sample is
a very important point in building a scoring system, Vladimir Bugera,
Hiroshi Konno, and Stanislav Uryasev (2002).
2.9. Scoring system and risk management
2.9.1. Basel II
The Basel committee was established at the end of 1974 by the central
bank governors of the Group of Ten countries. The committee meets every
three months at the Bank for International Settlements in Basel (the Bank
for International Settlements (BIS) is an international organization
which fosters international monetary and financial cooperation and serves
as a bank for central banks). The Basel committee formulates broad
supervisory standards and guidelines for banks. These supervisory
standards and guidelines do not have legal force, Secretariat of the
Basel committee on Banking Supervision (2001) and Linda Allen, Gayle
Delong, and Anthony Saunders (2004).
In 1988 the committee introduced a capital measurement system referred
to as the Capital Accord. The 1988 Capital Accord (the Accord) focused
on the total amount of bank capital in order to reduce the risk of bank
insolvency and the potential cost of a bank's failure for depositors.
This system provided for the implementation of a credit risk measurement
framework with a minimum capital standard of 8% by end-1992. In 1999 the
committee issued a proposal for a revised capital adequacy framework. The
new Accord intends to provide approaches which are both more
comprehensive and more sensitive to risks than the 1988 Accord,
Secretariat of the Basel committee on Banking Supervision (2001) and
Basel Committee on Banking Supervision (2001a).
The new capital accord consists of three mutually reinforcing pillars.
These pillars work together to provide banks with a higher level of
safety and soundness. The pillars are as follows, Secretariat of the
Basel committee on Banking Supervision (2001):
1- Minimum capital requirement
The first pillar seeks to refine the standardized rules set forth in the
1988 Accord. The pillar defines the minimum ratio of capital to risk
weighted assets as follows:
$$\frac{\text{Total capital}}{\text{Credit risk} + \text{Market risk} + \text{Operational risk}} = \text{the bank's capital ratio (minimum 8\%)}$$
2- Supervisory review process
The second pillar requires supervisors to undertake a qualitative
review of their bank's capital allocation techniques and compliance with
relevant standards.
3- Market discipline
The third pillar aims to bolster market discipline through enhanced
disclosure by banks.
The primary changes between Basel II and Basel I are in the approach to
credit risk and in the inclusion of explicit capital requirements for
operational risk. Addressing risks through a more comprehensive approach
is one of the objectives of the Accord outlined by the committee in
1999. For credit risk, the Committee believes that improvements in risk
measurement and management will help banks to use full credit risk
models as a basis for regulatory purposes, and it permits banks to choose
between two broad methodologies, the standardized approach and the
internal rating based approach (IRB), for calculating their capital
requirement for credit risk, Basel Committee on Banking Supervision
(2001a, c).
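The minimum capital ratio defined under the first pillar can be illustrated with a short calculation. The amounts are hypothetical, and the denominator follows the simplified form given in the text (the sum of the credit, market, and operational risk measures) rather than the full regulatory calculation.

```python
# Sketch of the Pillar 1 capital ratio check; all amounts are hypothetical.
MIN_CAPITAL_RATIO = 0.08  # 8% minimum stated in the Accord


def capital_ratio(total_capital, credit_risk, market_risk, operational_risk):
    """Total capital divided by the sum of the three risk measures."""
    return total_capital / (credit_risk + market_risk + operational_risk)


ratio = capital_ratio(total_capital=90.0, credit_risk=800.0,
                      market_risk=120.0, operational_risk=80.0)
meets_minimum = ratio >= MIN_CAPITAL_RATIO
print(f"capital ratio = {ratio:.2%}, meets minimum: {meets_minimum}")
```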
Under the IRB approach, banks will be allowed to use their internal
estimates of borrower creditworthiness to assess credit risk in their
portfolios. The Committee believes that the internal rating based
approach can secure two key objectives, Basel Committee on Banking
Supervision (2001d):
1- Additional risk sensitivity
A capital requirement based on internal rating approaches will be more
sensitive to the drivers of credit risk and economic loss in a bank's
portfolio.
2- Incentive compatibility
An appropriately structured internal rating approach can provide a
framework which encourages banks to continue to improve their internal
risk management practices.
2.9.2. Risk management
Banks face increasing risks which affect their profitability. The term
risk has a variety of meanings in business. Generally it refers to the
possibility that the outcomes of an action or event are uncertain or
could have adverse impacts. The risks that banks may face can be
categorized into credit risk (the risk of loss arising from default by a
creditor or counterparty), market risk (the risk of losses in trading
positions when prices move adversely) and operational risk (the risk of
direct or indirect loss resulting from inadequate or failed internal
processes, people and systems, or from external events), State Bank of
Pakistan, Scott E. Harrington and Gregory R. Niehaus (1999), Secretariat
of the Basel committee on Banking Supervision (2001) and Taher Musa
(2004).
Risk management is concerned with managing and minimizing risk and with
creating opportunities that carry minimal risk. It has become a core
activity of every bank and is integrated into planning and executing
operations. The importance of risk management has increased because
banks now work in a global market and the lines between individual risk
factors have become more blurred. Risk management plays an important role
both before and after a loss occurs. Before a loss occurs, it prepares
the organization to meet potential losses, minimizes the anxiety and fear
associated with loss exposures, and meets the obligations imposed on the
organization by outsiders. After a loss occurs, it aims to resume the
organization's operations, maintain its earnings and growth, and minimize
the impact of the loss on society, Rejda, George E (1995), Department of
the army (1998), The committee on regulation and supervision (1999) and
Scott E. Harrington, Gregory R. Niehaus (1999).
There are many definitions of risk management; some of these
definitions are given in the following:
- Risk management is a systematic process for the identification and
evaluation of pure loss exposures faced by an organization or
individual and the selection and implementation of the most
appropriate techniques for treating such exposures, Rejda, George E
(1995).
- Risk management is the process of identifying, assessing, and
controlling risks arising from operational factors and making
decisions that balance risk costs with mission benefits, Department
of the army (1998).
- Risk management is a discipline for dealing with the possibility that
some future event will cause harm. It provides strategies,
techniques, and an approach to recognizing and confronting any threat
faced by a company in fulfilling its mission, Taher Musa (2004).
The risk management process starts by identifying the potential risks
that may cause losses and evaluating the expected losses. An appropriate
technique, or combination of techniques, for treating the loss exposures
is then developed. After these techniques are implemented, their
effectiveness must be measured, Rejda, George E (1995), Department of the
army (1998) and Scott E. Harrington and Gregory R. Niehaus (1999).
A number of basic risk management tools can be used to manage risks,
Basel Committee on Banking Supervision (2001b). These include:
1- Developing appropriate corporate policies and procedures
2- Using quantitative methods to measure risk
3- Pricing products and services according to their risks
4- Managing risk through diversification and hedging
5- Building cushions to absorb losses
Risk management is treated in more detail in Basel II. The committee
encourages banks to improve their risk management tools, Basel
Committee on Banking Supervision (2001c).
2.9.3. Scoring system as a risk management tool
Internal risk analysis forms the basis for risk management. Banks use
internal rating systems to categorize their exposures into broad,
qualitatively differentiated layers of risk. Basel II moved towards
accepting the internal rating based approach (IRB) as a basis for
determining adequate reserves for credit risks, and it focuses on
techniques that allow banks and supervisors to evaluate properly the
various risks that a bank faces. The committee hopes that banks will move
from the standardized approach to the internal rating based approach, and
envisages that the IRB approach will evolve over time, Basel Committee on
Banking Supervision (2001d) and Winfried G. Hallerbach and Albert J.
Menkveld (2004).
In 1999 the Basel committee published a document setting out principles
for the management of credit risk in order to encourage banking
supervisors to promote sound practices for managing credit risk. The
internal measures of credit risk are based on an assessment of the risk
characteristics of both the borrower and the specific type of
transaction. Scoring systems are used to derive the internal rating
system and to build the credit risk model. The credit risk model aims to
manage the risk in the credit decision, provides a basis for deciding
whether or not the credit should be granted, and its output plays an
important role in the bank's risk management. Thus scoring systems gain
more importance as tools for risk management in the light of the New
Basel Capital Accord, Basel Committee on Banking Supervision (1999,
2000), Secretariat of the Basel committee on Banking Supervision (2001),
Liu, Y. (2001), Karel Komorad (2002), Edward I. Altman (2002), Brian
Coyle (2000), Thomas Mahlmann (2004) and Allen N. Berger and W. Scott
Frame (2005).
Chapter three
Problem formulation and survey
3.1. Introduction
The assignment of a discrete set of alternatives (investment projects,
firms, credit card applications, country risk, portfolio selection and
management, etc.) to predefined homogeneous groups is a major problem in
financial decision making. This type of problem is referred to as
classification. Classification constructs models based on the
characteristics of a previous sample in order to classify a new case into
predefined groups. A scoring system is an application of classification
methods which predicts the risk level of customers and classifies them
into predefined groups. Borrowers' characteristics and credit performance
are used to build a function, which is then used to forecast the
performance of new applicants with similar characteristics, D. Michie,
D.J. Spiegelhalter, and C.C. Taylor (1994), Liu, Y. (2001), Yi Peng, Yong
Shi and Welxuan Xu (2002), Liu, Y. (2002b), Doumpos M. and Zopounidis C.
(2002b, c) and Nikolaos F. Matsatsinis and C. Erik Larson (2004).
In this chapter and the following chapters the term credit score will
be used to refer to an empirical scoring system which studies current
customers in order to predict whether a new applicant with similar
characteristics will default or not, and thus to determine whether the
bank will issue a credit card to that applicant or deny the application.
3.2. Data description
Banks usually store a mass of information about customers and their
credit behavior as a main source of information for further analysis.
This information can be used for building a credit score model.
3.2.1. Data used to build credit score model
Generally the judgment process depends on the 4Cs (Character, Capital,
Capacity, and Condition). The credit score tries to utilize the
information relating to the traditional 4Cs to assess the risk associated
with each customer. The information required to assess the risk differs
according to the type of risk the bank tries to forecast, Liu, Y. (2001).
The data used to build a credit score model are historical data about m
customers with known classes. These data consist of two parts, Thomas L.
C., David B. Edelman, and Jonathan N. Crook (2002):
1- The independent variables
The independent variables, also called features, attributes, criteria,
or characteristics, may be qualitative (nominal) or quantitative (non
nominal). Any characteristic of the customers or of the customers'
environment that is expected to predict the risk of customers should be
used in building the credit score model. These characteristics are
obtained from:
a. Credit card application form
The application usually contains much information about the applicant.
This information gives an idea about the stability of the customer (e.g.
time at the current address, time in present employment, marital status),
the financial status of the customer (e.g. having bank accounts, having a
credit card, time with current bank), the customer's resources (e.g.
residential status, employment, other assets and expenses), and possible
outgoings (e.g. number of children), Liu, Y. (2001).
b. Information from credit bureau
A credit bureau, if available, provides information such as past
payment history, number of inquiries for information on the applicant, etc.
2- The dependent variables
The dependent variables are the classes or groups to which each
customer belongs. The dependent variables are usually nominal.
These data can be summarized in Table 3.1, Michael Doumpos and
Constantin Zopounidis (2002):

Table 3.1. Data summary

                         Variables
Customers   a_1    a_2    ...   a_j    ...   a_n    Class
X_1         a_11   a_12   ...   a_1j   ...   a_1n   Y
X_2         a_21   a_22   ...   a_2j   ...   a_2n   Y
...         ...    ...    ...   ...    ...   ...    Y
X_i         a_i1   a_i2   ...   a_ij   ...   a_in   Y
...         ...    ...    ...   ...    ...   ...    Y
X_m         a_m1   a_m2   ...   a_mj   ...   a_mn   Y

Where:
Y refers to the class (good or bad), $Y \in \{G, B\}$.
$A = (a_1, a_2, \ldots, a_j, \ldots, a_n)$ is the set of attributes about
credit card holders, where n is the number of attributes,
$j = 1, \ldots, n$.
Each attribute may take z values,
$a_j = (v_{j1}, v_{j2}, \ldots, v_{jk}, \ldots, v_{jz})$,
$k = 1, \ldots, z$, $j = 1, \ldots, n$, so that an attribute can be used
to partition the sample into z subsets.
$X_i = (a_{i1}, \ldots, a_{ij}, \ldots, a_{in})$ is the development
(training) sample of data for the variables, where $i = 1, \ldots, m$ and
m is the sample size (from the application forms of previous customers).
Thus $a_{ij}$ is the jth attribute of the ith customer.
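As an illustration of this layout, the training sample can be held as m rows of n attribute values plus one class label per row. The attribute names and records below are invented purely for illustration.

```python
# Sketch of the training sample: m customers, n attributes, one class label.
# Attribute names and values are hypothetical.
ATTRIBUTES = ["residential_status", "years_at_address", "has_bank_account"]

X = [  # row i holds X_i = (a_i1, ..., a_in)
    ["owner", 12, True],
    ["tenant", 1, False],
    ["owner", 5, True],
    ["tenant", 3, True],
]
Y = ["G", "B", "G", "B"]  # class of each customer (good or bad)

m, n = len(X), len(ATTRIBUTES)


def partition_by_attribute(j):
    """Split customer indices into subsets, one per observed value of attribute j."""
    subsets = {}
    for i, row in enumerate(X):
        subsets.setdefault(row[j], []).append(i)
    return subsets


print(m, n)
print(partition_by_attribute(0))
```

The helper shows how a nominal attribute with z distinct values partitions the sample into z subsets, here two subsets for the residential status.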
3.2.2. The output of the credit score model
The output data depend on the technique used to build the credit score
model. Many techniques produce a weight for each characteristic and a
cut off point. Other techniques find the probability that the customer
belongs to each predefined group, or build a decision tree.
3.3. Building the credit score model
Liu, Y. (2002a) presents a general framework for building a credit
score model. This framework includes three stages, as shown in Figure
3.1:
Figure 3.1. The process of credit score model building
[Figure: flowchart. Stage 1: problem, relevant data. Stage 2: past cases
(standard format), building credit score model. Stage 3: apply credit
score model to new cases, credit decision, actual credit behaviors,
validation. Feedback and new samples flow back to the earlier stages.]
Stage one: Problem definition and data preparation
In this stage we define the objectives of the scoring model, collect
the relevant data from the available sources and define the classes.
The definitions of the classes "good" and "bad" differ from one bank
to another. They may depend on:
o The number of consecutive missed payments, or
o The number and amount of over limit occurrences, or
o The total number of missed payments.
Stage two: Model building
In this stage we put the data in standard form and build the credit
score model:
- Select a sample (training sample) from previous customers,
$X_i = (a_{i1}, \ldots, a_{ij}, \ldots, a_{in})$.
- Evaluate their performance during a specific period.
- Classify them as good or bad, $Y \in \{G, B\}$.
- The quantitative techniques use this information, and other data
collected from other sources if available, to find a rule that splits
the $X_i$ into two groups (Good "G" and Bad "B") with the smallest
percentage of misclassifications, Thomas L. C. (2000).
Stage three: Models application and validation
In this stage the credit score model is applied and validated.
- To validate the credit score model, we apply it to the same sample
used in building it or to a new sample with known classes. We compare
the credit decision produced by the credit score model with the
actual behavior, and we can use the results of this comparison as
feedback to modify the relevant data or the past cases (training
sample) and rebuild the credit score model.
- To classify a new applicant as good or bad in order to determine
whether the bank will issue the card, the credit score model is
applied to the new customer to predict the class. The sum of the
attribute weights is compared with the cut off point, or the decision
tree is followed from the root until a leaf is reached; in general
this depends on the technique used in building the credit score model.
- For new applicants who are granted a credit card using the credit
score model, the actual payment behavior is recorded and used to
validate the credit score model and to update the training sample
and hence the model.
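The validation step of stage three can be sketched as a comparison of predicted and actual classes on a sample with known outcomes. The scoring rule, weights, cut off point, and records below are hypothetical stand-ins for whatever model stage two produced.

```python
# Sketch of model validation: compare predicted classes with actual behavior.
# The model (weights, cut off point) and the validation records are hypothetical.

WEIGHTS = [2.0, 1.0]  # one weight per attribute
CUT_OFF = 5.0


def predict(attributes):
    """Classify as good ('G') when the weighted sum reaches the cut off point."""
    score = sum(w * a for w, a in zip(WEIGHTS, attributes))
    return "G" if score >= CUT_OFF else "B"


def misclassification_rate(sample, actual):
    """Share of cases whose predicted class differs from the actual class."""
    errors = sum(predict(x) != y for x, y in zip(sample, actual))
    return errors / len(sample)


validation_sample = [[3, 1], [1, 1], [2, 2], [0, 4]]
actual_classes = ["G", "B", "G", "G"]
rate = misclassification_rate(validation_sample, actual_classes)
print(f"misclassification rate = {rate:.2f}")
```

A high rate on the validation sample would be the feedback signal for revisiting the relevant data or the training sample and rebuilding the model.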
3.4. Literature survey
3.4.1. Classification of credit score methods
There are many techniques for building credit score models in a variety
of research disciplines. Most of these techniques generate a model that
minimizes some function of the error between actual and predicted
values, or that maximizes likelihood, Steven Finlay (2005).
These techniques have been classified into different groups by many
authors, as follows:
Liu, Y. (2002) categorizes the methods used for credit scoring into
three main historical strands of research: statistical (e.g. linear
discriminant analysis, k nearest neighbors and regression models),
machine learning (e.g. decision trees, rule induction algorithms and
genetic algorithms), and neural networks (e.g. multi layer perceptrons
and radial basis function networks).
Thomas L. C., David B. Edelman, and Jonathan N. Crook (2002) categorize
these methods into statistical methods (e.g. discriminant analysis,
logistic regression, probit regression, tobit analysis, classification
trees, and the nearest neighbor approach) and non statistical methods
(e.g. linear programming, integer programming, neural networks, genetic
algorithms and expert systems).
Many authors treat credit scoring as a classification problem in data
mining, Liu, Y. (2002), Vladimir Bugera, Hiroshi Konno, and Stanislav
Uryasev (2002), Yong Shi, Yi Peng, Welxuan Xu and Xiaowo Tang (2002),
Yi Peng, Yong Shi and Welxuan Xu (2002) and Peng and Goh Chwee (2004).
Ferenc Kiss (2003) classifies the methods used for credit scoring, from
the knowledge management perspective, into knowledge generating modeling
processes, knowledge saving modeling processes, and knowledge selection
processes. Knowledge generating modeling processes include methods in
which decision making depends on experience data with the help of
statistical or analytical processes (e.g. the linear probability model,
probit and logit models, discriminant analysis, neural networks,
classification trees, and nearest neighbors). Knowledge saving modeling
processes include methods that formalize the theoretical knowledge and
experience of experts in some way (e.g. the analytical hierarchy process
and expert systems). Knowledge selection processes include methods that
are capable of selecting the optimum model from the set of models
available for finding a solution (e.g. decision trees, expert systems,
and genetic algorithms).
Peng and Goh Chwee (2004) categorize the techniques used to build
credit score models into statistical methods (e.g. discriminant analysis
and logistic regression) and data mining techniques (e.g. decision trees
and neural networks).
Liu, Y. (2002) and Jih-Jeng Huang, et al (2005) classify the methods
used to build credit score models into induction based algorithms and
function based models. Induction based algorithms (non parametric,
discovery based or data driven) create the model automatically based on
the patterns found in the data (e.g. rough sets, classification and
regression trees). Function based models (parametric, verification based
or theory driven) utilize the idea of parameter estimation in statistics
(e.g. discriminant analysis, logistic regression, and neural networks).
3.4.2. Statistical techniques
The first credit score model was presented by Durand in 1941. This
model was based on Fisher's 1936 work on discriminant analysis. Forms of
regression were then used, since the Fisher approach can be viewed as a
form of linear regression, Thomas L. C. (2000), Thomas L. C., David B.
Edelman, and Jonathan N. Crook (2002) and Karel Komorad (2002).
Statistical methods can be reviewed as follows:
3.4.2.1. Linear discriminant analysis (linear probability
model)
Linear discriminant analysis is basically a regression model. It tries
to find the best linear combination (linear discriminant function) of the
characteristics which explains the probability of default. The linear
discriminant equation is used to find the attribute weights; the score is
then obtained and compared to a cut off point, David West (2000), Ferenc
Kiss (2003) and Thomas L. C., David B. Edelman, and Jonathan N. Crook
(2002).
$$Y_i = w_1 a_{i1} + w_2 a_{i2} + \cdots + w_j a_{ij} + \cdots + w_n a_{in} + u$$
Where $u$ is the random error, $p(u) = 0$, and $w_j$ are the weights for
$a_j$.
3.4.2.2. Logistic regression
Logistic regression is a variation of linear regression and it is
useful when the outcome is restricted to two values. Logistic regression
finds the probability that a given customer belongs to a predefined
group, $P(Y/X)$. Logistic regression gives each attribute a weight which
measures the contribution of that characteristic to variations in
$P(Y/X)$. Some researchers find that logistic regression performs better
than linear discriminant analysis in credit scoring, Thomas L. C., David
B. Edelman, and Jonathan N. Crook (2002), Yang Liu.s (2002), Karel
Komorad (2002), and Y. Liu and M Schumann (2005).
Steiner M. T. A. and Carnieri C. (1999) proposed a methodology for
credit scoring which is divided into two stages. In the first stage
statistical techniques are used to analyze the data, and in the second
stage logistic regression is used to build the credit score model. They
find that logistic regression performs better than six other methods
(two involving linear programming, three statistical, and one neural
network).
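To make the idea of estimating $P(Y/X)$ with attribute weights concrete, the following is a minimal sketch of logistic regression fitted by plain gradient ascent on the log-likelihood. It is not the procedure used by any of the cited authors; the single attribute, the data, the learning rate, and the number of iterations are all assumptions made for illustration.

```python
import math


def sigmoid(t):
    """Logistic function mapping a score to a probability."""
    return 1.0 / (1.0 + math.exp(-t))


def prob_good(w, x):
    """P(good / x) for weights w, where w[0] is the intercept."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit the weights by batch gradient ascent on the log-likelihood."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = yi - prob_good(w, xi)
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj + lr * g / len(X) for wj, g in zip(w, grad)]
    return w


# Hypothetical sample: one attribute (years in present employment);
# class 1 = good payer, class 0 = bad payer.
X = [[0.5], [1.0], [1.5], [2.0], [4.0], [5.0], [6.0], [7.0]]
y = [0, 0, 0, 1, 0, 1, 1, 1]

w = fit_logistic(X, y)
print("weights:", w)
print("P(good / 1 year):", prob_good(w, [1.0]))
print("P(good / 6 years):", prob_good(w, [6.0]))
```

The fitted weight on the attribute measures how strongly it shifts the estimated probability; a bank would compare the probability with a cut off point to reach the decision.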
3.4.2.3. Probit and tobit analysis
Probit and tobit analysis are nonlinear regression methods which have
been used in credit scoring. The probit model is derived by using the
standard normal distribution to express the discriminant function. In
the tobit model, there is something unsatisfactory about the asymmetry of
the tobit transformation used to estimate $P(Y/X)$. In general, neither
probit nor tobit models have found much favor in building credit scores,
Thomas L. C., David B. Edelman, and Jonathan N. Crook (2002), Karel
Komorad (2002) and Kasper Roszbach (2003).
3.4.2.4. Semiparametric regression
Semiparametric regression has been used to give more attention to
nominal attributes. Hardle et al showed that semiparametric regression
performs better than logistic regression, Karel Komorad (2002).
3.4.2.5. Bayesian classification
The Bayesian classifier is based on Bayes' theorem. The Bayesian method
is used to predict the probability that a given applicant belongs to a
particular class. The classification rule can be stated as: $X \in G$ if
$P(G/X) > P(B/X)$ and $X \in B$ if $P(B/X) > P(G/X)$, Jiawei Han and
Micheline Kamber (2001), Liu, Y. (2002) and Gutierrez-Pena E.(2004).
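The rule can be illustrated by computing the two posterior probabilities with Bayes' theorem and assigning the applicant to the class with the larger one. The prior and the likelihood values below are hypothetical; in practice they would be estimated from the training sample.

```python
# Sketch of the Bayesian classification rule with hypothetical probabilities.

def posteriors(prior_good, likelihood_good, likelihood_bad):
    """Return (P(G/X), P(B/X)) from the prior P(G) and the likelihoods
    P(X/G) and P(X/B), using Bayes' theorem."""
    prior_bad = 1.0 - prior_good
    evidence = likelihood_good * prior_good + likelihood_bad * prior_bad
    p_good = likelihood_good * prior_good / evidence
    return p_good, 1.0 - p_good


def classify(prior_good, likelihood_good, likelihood_bad):
    """Assign X to G if P(G/X) > P(B/X), otherwise to B."""
    p_good, p_bad = posteriors(prior_good, likelihood_good, likelihood_bad)
    return "G" if p_good > p_bad else "B"


p_good, p_bad = posteriors(prior_good=0.9, likelihood_good=0.05,
                           likelihood_bad=0.30)
print(round(p_good, 3), round(p_bad, 3), classify(0.9, 0.05, 0.30))
```

Although the observed characteristics are six times more likely among bad customers in this example, the high prior share of good customers still places the applicant in class G.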
3.4.2.6. Nearest neighbor approach
The nearest neighbor method has also been applied in credit scoring.
The nearest neighbor estimate of $P(Y/X)$ is given by $K_G/K$ or
$K_B/K$, where $K_G$ and $K_B$ are the numbers of cases from class G and
class B among the K cases most similar to $X$. Using nearest neighbors
makes it possible to update the training sample by adding new cases and
dropping the oldest cases, Thomas L. C., David B. Edelman, and Jonathan
N. Crook (2002) and Liu, Y. (2002).
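The estimate $K_G/K$ can be sketched in a few lines. The text only requires the K "most similar" cases; the use of Euclidean distance as the similarity measure, and the data themselves, are assumptions made for this illustration.

```python
import math


def knn_prob_good(train, labels, x, k):
    """Estimate P(G / x) as K_G / K over the k nearest training cases.
    Similarity is measured by Euclidean distance (an assumption)."""
    order = sorted(range(len(train)), key=lambda i: math.dist(train[i], x))
    nearest = order[:k]
    k_good = sum(labels[i] == "G" for i in nearest)
    return k_good / k


# Hypothetical training sample with two numeric attributes per customer.
train = [[1.0, 1.0], [1.5, 2.0], [2.0, 1.0], [6.0, 5.0], [7.0, 6.0], [6.5, 7.0]]
labels = ["B", "B", "G", "G", "G", "B"]

p = knn_prob_good(train, labels, x=[6.0, 6.0], k=3)
print("P(G / X) =", p)
```

Updating the model amounts to appending new cases to `train` and `labels` and dropping the oldest ones, which is the property noted in the text.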
3.4.3. Non statistical techniques
3.4.3.1. Multicriteria decision aid method (MCDA)
MCDA is an advanced field of operations research providing several
advantages from research and practical points of view. It is a powerful
approach for analyzing complex decision problems that involve multiple
and conflicting goals, and it provides financial decision makers and
analysts with a wide range of methodologies for decision making, Doumpos
M. and Zopounidis C. (2002a, b) and Jaap Spronk, Ralph E. Steuer and
Constantin Zopounidis (2003).
All MCDA methodologies start by specifying a set of alternative
solutions and identifying all factors related to the decision, then
analyze the data using a suitable criteria aggregation model and provide
the decision maker with the necessary support to understand the
recommendations of the model. All MCDA methodologies focus on developing
an automatic procedure for analyzing data in order to construct a
classification model, and on developing efficient preference modeling
methodologies that make it possible to incorporate the decision maker's
preferences into the classification model, Doumpos M. and Zopounidis C.
(2002a, c).
Several decision making problems require the evaluation of a set of
alternatives. The evaluation process involves the aggregation of all the
pertinent decision attributes. Within the MCDA field one can distinguish
between the following forms of aggregation models, Doumpos M. and
Zopounidis C. (2002a) and Jaap Spronk, Ralph E. Steuer and Constantin
Zopoundis (2003):
1- Multi-objective mathematical programming (e.g. goal
programming).
2- Multi-attribute utility theory (e.g. UTA method “UTilités
Additives”, UTADIS method “UTilités Additives
DIScriminantes” and MHDIS method “Multi-group Hierarchical
DIScrimination”).
3- Outranking relations (e.g. ELECTRE family “ELimination Et
Choix Traduisant la REalité” and PROMETHEE family
“Preference Ranking Organization METHod for Enrichment
Evaluations”).
4- Preference disaggregation analysis (e.g. MSM “Multi-Surface
Method”).
Nikolaos F. Matsatsinis (2002) compares UTADIS,
UTADIS I, UTADIS II, ELECTRE Tri Pes, ELECTRE Tri Opt, rough
sets, and the composite rule induction system and finds that UTADIS and
UTADIS I give the best results.
Doumpos M. and Zopounidis C. (2002b) present a new method for
multi-group discrimination problems. The method leads to the
development of a set of additive utility functions, which are used to
classify each alternative into a specific group. The additive utility
functions are estimated through the solution of three mathematical
programming formulations (two linear and one mixed integer) in order to
achieve the optimal discrimination both in terms of the number of
misclassifications and in terms of the clarity of discrimination. The
first formulation minimizes the overall classification error, the
second minimizes the number of misclassifications, and the third
maximizes the distance between the global utilities of the classified
alternatives achieved according to the two utility functions.
3.4.3.2. Linear programming
Until the 1980s the only methods used for credit scoring were
statistical methods. Freed and Glover found that linear programming can be
used to discriminate between two groups, and thus more freedom is
achieved in the model because no statistical assumptions are
required, Thomas L. C., David B. Edelman, and Jonathan N. Crook
(2002), Yong Shi, Yi Peng, Welxuan Xu and Xiaowo Tang (2002) and
Ferenc Kiss (2003).
The first model proposed by Freed and Glover is based on maximizing
the minimum distance between the scores of the (correctly classified)
alternatives and the cut-off point. This model is known as MMD (maximize the
minimum distance) and is presented in figure 3.3, Doumpos M. and
Zopounidis C. (2002a) and Yong Shi, Yi Peng, Welxuan Xu and Xiaowo
Tang (2002).
Figure 3.3. The MMD model
Where $C$ is the cut-off point which discriminates between good and bad
alternatives, and the deviation indexed by $i$ is the distance of the
alternative score $X_i$ from the cut-off point $C$.
After the MMD model, Freed and Glover published a second model
for classification, which minimizes the sum of the deviations of the
(incorrectly classified) alternative scores from the cut-off point. This
model is known as MSD (minimize the sum of deviations). Many studies
found that the MSD model produces good test results.
The MSD model is presented in figure 3.4, Kim Fung Lam, Eng Ung Choo, and
Jane W. Moy (1996), Doumpos M. and Zopounidis C. (2002a) and Yong
Shi, Yi Peng, Welxuan Xu and Xiaowo Tang (2002).
Figure 3.4. The MSD model
Where $C$ is the cut-off point which discriminates between good and bad
alternatives, and the deviation indexed by $i$ is the overlap of the two
class boundaries for the alternative score $X_i$ relative to the cut-off point.
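The two Freed and Glover models described above can be written out as
linear programs. The following is a sketch in one common textbook form and
is not necessarily the exact form shown in figures 3.3 and 3.4. The symbols
are assumptions introduced here for illustration: $w$ is the vector of
attribute weights, $X_i$ the attribute vector of alternative $i$, $C$ the
cut-off point, $d$ the common margin in MMD, $a_i$ the deviation of
alternative $i$ in MSD, and good alternatives are assumed to score above
the cut-off point.

MMD:
\[
\max_{w,\,C,\,d}\; d
\quad \text{subject to} \quad
w^{T} X_i \ge C + d \;\; (i \in G), \qquad
w^{T} X_i \le C - d \;\; (i \in B).
\]

MSD:
\[
\min_{w,\,C,\,a}\; \sum_{i} a_i
\quad \text{subject to} \quad
w^{T} X_i \ge C - a_i \;\; (i \in G), \qquad
w^{T} X_i \le C + a_i \;\; (i \in B), \qquad
a_i \ge 0.
\]

In this bare form both programs are degenerate: MSD is solved trivially by
$w = 0$, $C = 0$, and MMD is unbounded whenever a positive margin is
feasible, because $w$, $C$ and $d$ can be scaled together. A normalization
constraint on $w$ or $C$ is therefore needed in practice.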
After Freed and Glover proposed their linear programming
approaches (MSD and MMD) for the classification problem, many authors
studied variants of linear programming formulations for
classification problems. Most of these formulations determine the weights
for each attribute and the cut-off point simultaneously, Kim Fung Lam,
Eng Ung Choo, and Jane W. Moy (1996) and Thomas L. C., David B.
Edelman, and Jonathan N. Crook (2002).
In 1986 Freed and Glover proposed a general linear programming
formulation which always gives a nontrivial solution, is invariant
under linear transformation of the data, and considers two types of
measures for the quality of classification. The objective function of this
formulation is a weighted combination of the maximum internal and
external deviations and the sum of the absolute values of the internal and
external deviations, Thomas L. C., David B. Edelman, and Jonathan
N. Crook (2002), Yong Shi, Yi Peng, Welxuan Xu and Xiaowo Tang
(2002) and Doumpos M. and Zopounidis C. (2002a).
Kim Fung Lam, Eng Ung Choo, and Jane W. Moy (1996) present a
new linear programming approach to solve the two-group classification
problem. This approach is based on an idea from cluster analysis:
objects within the same group should be more similar to each other than
to objects in other groups. Accordingly, the scores of the alternatives in
group G should be close to each other but far from the scores of the
alternatives in group B. They solve the classification problem in two
stages. In the first stage they find the attribute weights by solving a
linear program that minimizes the total deviation of the alternative scores
from their group mean scores. In the second stage they use the attribute
weights computed in stage one to find the classification score for each
customer and use these scores to find the cut-off point $c$, which
discriminates between good and bad alternatives, by solving
a linear programming or mixed integer programming problem.
Vladimir Bugera, Hiroshi Konno, and Stanislav Uryasev (2002)
present a general approach for classification and test it on credit scoring
for credit cards. It is based on finding an optimal classification utility
function belonging to a pre-specified class of functions. They considered
linear and quadratic utility functions with monotonicity constraints and
concluded that their approach leads to quite robust classification techniques.
3.4.3.3. Integer programming
Integer programming is also used to build scoring systems. If one
wants to take the number of cases where the discrimination is incorrect as
a measure of goodness of fit (the number of misclassifications or the total
cost of misclassification), one has to introduce integer variables into the
linear program, and this leads to integer programming models. Many
authors found that integer models perform better than linear
programming, Thomas L. C., David B. Edelman, and Jonathan N. Crook
(2002).
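As an illustration of how integer variables enter the model, the following
is a sketch of a standard mixed integer formulation that counts
misclassifications. The notation is an assumption introduced here: $w$ is
the weight vector, $X_i$ the attribute vector of alternative $i$, $C$ the
cut-off point, $M$ a sufficiently large positive constant, and $d_i$ a
binary variable equal to 1 when alternative $i$ is allowed to fall on the
wrong side of the cut-off point.

\[
\min_{w,\,C,\,d}\; \sum_{i} d_i
\quad \text{subject to} \quad
w^{T} X_i \ge C - M d_i \;\; (i \in G), \qquad
w^{T} X_i \le C + M d_i \;\; (i \in B), \qquad
d_i \in \{0, 1\}.
\]

Replacing the objective by $\sum_i c_i d_i$, with $c_i$ the cost of
misclassifying alternative $i$, gives the total cost of misclassification
instead of the number of misclassifications. As with the linear models, a
normalization of $w$ or $C$ is needed to exclude the trivial solution.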
3.4.3.4. Goal programming
Yong Shi, Yi Peng, Welxuan Xu and Xiaowo Tang (2002) and Yi
Peng, Yong Shi and Welxuan Xu (2002) present a data mining
approach to classifying credit cardholders’ behavior through multiple
criteria linear programming. They present a model for classifying credit
cardholder behavior into two groups (e.g. good or bad), and then into three
groups (e.g. bad, normal or good). This model extends the previous
LP approaches to classification problems presented by Freed and Glover
(1981).
They tested this model with the same sample as in Freed and Glover
(1981) and found that the result was consistent with the result of Freed
and Glover (1981).
3.4.3.5. Neural networks
A neural network is a flexible model which consists of three layers:
input, hidden, and output layers. The input layer first passes a number
of inputs (variables) to the hidden layer. The hidden layer applies a weight
to each variable, and the products are summed and transformed, then passed
to the output layer or used as input values for another layer, Jiawei Han and
Micheline Kamber (2001) and Jih-Jeng Huang, et al (2005).
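The weighted-sum-and-transform step can be sketched as a forward pass
through one hidden layer. The sigmoid transform, the weights, and the
function names are illustrative assumptions; in practice the weights are
learned from the training sample.

```python
import math


def sigmoid(z):
    """Transform a weighted sum into a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs, weights, biases):
    """For each unit: weight the inputs, sum the products, then transform."""
    return [sigmoid(sum(w * x for w, x in zip(unit_weights, inputs)) + b)
            for unit_weights, b in zip(weights, biases)]


def forward(x, hidden_w, hidden_b, out_w, out_b):
    """Input layer -> hidden layer -> single output unit (the score)."""
    hidden = layer(x, hidden_w, hidden_b)
    return layer(hidden, out_w, out_b)[0]


# Two input variables, two hidden units, one output unit.
x = [0.5, -1.0]
hidden_w = [[0.0, 0.0], [1.0, 1.0]]
hidden_b = [0.0, 0.5]
out_w = [[1.0, -1.0]]
out_b = [0.0]
score = forward(x, hidden_w, hidden_b, out_w, out_b)
```

The output can be read as a score between 0 and 1 that is compared with a
cut-off value to label the applicant good or bad.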
There are various architectures of neural networks; more than 50%
of applications use the multilayer perceptron, Karel Komorad
(2002). David West (2000) investigates the accuracy of credit scores built
using five neural network models (multilayer perceptron, mixture of
experts, radial basis function, learning vector quantization and fuzzy
adaptive resonance) and compares the performance of these five neural
network models with the most traditional methods, including linear
discriminant analysis, logistic regression, k-nearest neighbor, kernel
density estimation, and decision trees. The results show that:
- Neural network credit scoring models improve accuracy by
0.5% up to 3%.
- The multilayer perceptron is not the most accurate neural network
model.
- Mixture of experts and radial basis function neural network
models should be used for credit scoring.
- Logistic regression is a good alternative to neural networks and is
the most accurate of the traditional methods.
Many authors conclude that neural networks, compared to the
traditional statistical methods, produce more accurate results, but the use
of neural networks in credit scoring is limited by their intrinsic opacity
and their poor performance when irrelevant attributes are included or the
sample is small, Rashmi Malhotra and D.K. Malhotra (2001), Baesens B., Egmony
M, Castelo R., and Vanthienen J. (2002) and Jih-Jeng Huang, et al (2005).
3.4.3.6. Expert system
An expert system is a computer system that is capable, in its area of
application, of sorting and managing expert knowledge, and of handling this
knowledge in such a manner that it can use targeted information or perform
certain tasks alone, Ferenc Kiss (2003).
Efraim Turban defines an expert system as a system that employs
human knowledge captured in a computer to solve problems that
ordinarily require human expertise, Efraim Turban (1988).
Tetsuo Tamai and Masayuki Fujita (1987) introduce an expert
system for credit card application assessment. They try to simulate the
human process in which the credit card analyst uses the judgment process to
distinguish between good and bad customers. They adopt a decision tree
as the form of knowledge representation for the profile design, present a
decision process that simulates the human process, and name it the “profiling
system”. Their method depends on the algorithm developed by R. Quinlan
in 1983 as one of the inductive machine learning approaches.
In addition to the profiles obtained from the past data analysis, they
collect some profiles from human experts and call them specific profiles.
These profiles do not cover all types of applicants but indicate important
patterns to be used in the credit assessment process. In actual
operation, these specific profiles are used for screening clear patterns, and
then the profiles obtained from the data analysis are applied to give
systematic information.
Tamai and Fujita think that the profiling method has some
advantages over the scoring method, which applies statistical theories,
because:
1- To apply the scoring method, some kind of measure on linear scale
is required for each applicant's property. If a given property is of a
continuous nature like amount of income or deposit, there is little
problem. But if it has a combinatorial nature, like the status of home
or industry type of the company the applicant is working with, then
some way of quantification needs to be taken. It may be easy to give
certain values, but not always easy to sort it on linear scale.
2- As the discriminant function for the scoring method is usually
linear, the effect of combination of properties is treated in a limited
way. In reality, there are such judgments as: “for a young person, it
is common to live with parent's apartment, but for a middle aged
person with family, it may be considered as a minus point”. Such
case can be appropriately treated in the profiling method.
3- A human assessor does not make judgment according to some kind
of scoring process, thus it is difficult to verify the method by
comparing it with human decision making. The profiling method is
natural for human experts to assess and to give constructive
adjustment.
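The second point can be made concrete with a small sketch. The weights,
the age threshold, and the function names are hypothetical and chosen only
to contrast the two methods; they are not taken from Tamai and Fujita.

```python
def linear_score(age, lives_with_parents, w_age=0.02, w_parents=-0.5):
    """Linear scoring: each property contributes independently of the others."""
    return w_age * age + w_parents * lives_with_parents


def profile_points(age, lives_with_parents, has_family):
    """Profile-style rule: the same property is judged in its context."""
    if lives_with_parents and age >= 40 and has_family:
        return -1  # middle-aged person with family: a minus point
    return 0       # e.g. a young person living with parents: neutral


# In the linear score, living with parents costs the same at any age.
young_penalty = linear_score(24, 1) - linear_score(24, 0)
older_penalty = linear_score(45, 1) - linear_score(45, 0)

# The profile rule distinguishes the two situations.
young_points = profile_points(24, True, False)
older_points = profile_points(45, True, True)
```

A linear score could capture the same effect only by adding an explicit
interaction term, which is the limitation the second point describes.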
3.4.3.7. Genetic algorithm (GA)
Genetic algorithms are another general heuristic optimization
scheme based on biological analogies. GA is a data-driven, non-parametric
heuristic search process; it is used to extract intangible
relationships in a system and is used in many applications such as
classification, Karel Komorad (2002), Steven Finlay (2005) and Jih-Jeng
Huang, et al. (2005).
Jih-Jeng Huang, et al. (2005) present two-stage genetic
programming (2SGP) to deal with the credit scoring problem. The first stage
of GP is employed to derive the IF-THEN rules for the decision maker. In
the second stage of GP, the reduced data, that is, the data which do not
satisfy any rule or satisfy more than one rule, are employed to build the
discriminant function. They concluded that 2SGP can improve the accuracy of
the credit score model and is superior to the conventional methods.
Steven Finlay (2005) showed that some scoring models built using
GA perform as well as models built with a range of other approaches, but in
other cases perform worse.
3.4.3.8. Classification tree
The idea of a decision tree is to split the set of application answers
into different sets and then identify each of these sets as good or bad
depending on what the majority in the set is. Building a decision tree
involves making three decisions. The first is choosing the split rule. The
most common split rules are the Kolmogorov-Smirnov statistic, the basic
impurity index, the Gini index, the entropy index, and maximizing the
half-sum of squares.
The second decision is choosing the stopping rule. One makes a node a
terminal node for one of the following reasons: all samples for the given
node belong to the same class; the number of samples in the node is so
small that it makes no sense to divide it further; there are no samples for
the branch test attribute; there are no remaining attributes on which the
samples may be further partitioned; or the split measurement value, if one
makes the best split into two daughter nodes, is hardly any different from
the measurement value if one keeps the node as is. The third decision is
determining how to assign terminal nodes to the good and bad categories,
Thomas L. C., David B. Edelman, and Jonathan N. Crook (2002) and
Hussein Almuallim, Shigeo Kaneda and Yasuhiro Akiba (2002).
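The first and third decisions can be sketched for a single numeric
attribute. The sketch below uses the Gini and entropy indices as split
rules and the majority class for labeling a node; the data and function
names are illustrative assumptions, and the stopping rule is omitted.

```python
import math


def gini(labels):
    """Gini index for two classes: 2 * p * (1 - p), with p the share of G."""
    p = sum(1 for y in labels if y == "G") / len(labels)
    return 2 * p * (1 - p)


def entropy(labels):
    """Entropy index in bits; zero for a pure node."""
    p = sum(1 for y in labels if y == "G") / len(labels)
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def best_split(values, labels, impurity=gini):
    """Return (threshold, weighted impurity) of the best split value <= t."""
    n = len(labels)
    best = (None, impurity(labels))  # keeping the node as is
    for t in sorted(set(values))[:-1]:
        left = [y for v, y in zip(values, labels) if v <= t]
        right = [y for v, y in zip(values, labels) if v > t]
        score = (len(left) * impurity(left) + len(right) * impurity(right)) / n
        if score < best[1]:
            best = (t, score)
    return best


def majority(labels):
    """Label a terminal node by the majority class in it."""
    return max(set(labels), key=labels.count)


income = [10, 20, 30, 40, 50, 60]
labels = ["B", "B", "B", "G", "G", "G"]
threshold, score = best_split(income, labels)
```

With these data both indices select the split at an income of 30, which
separates the two classes completely.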
A decision tree can be used as a classification method for determining
the appropriate class for a given new applicant. Decision trees have been
used by many authors, who compared their performance with discriminant
analysis, probit regression, and the logistic model and found that the
decision tree provides much better classification accuracy when there are
interactions between variables, Eddy L. Ladue and Michael P. Novak
(1996), Vladimir Bugera, Hiroshi Konno, and Stanislav Uryasev (2002)
and Ferenc Kiss (2003).
3.4.3.9. Rough sets theory
Nikolaos F. Matsatsinis (2002) develops an intelligent decision
support system for credit card assessment (CCAS). The model base in
CCAS includes the composite rule induction system (CRIS) and rough sets.
He concludes that both CRIS and rough sets have advantages and
disadvantages.
CRIS can deal with nominal (qualitative) and non-nominal
(quantitative) attributes and can handle large data samples. But the decision
tree produced by CRIS may not cover all cases, and once a rule is used in
classification and is satisfied, the other rules are not
examined.
Rough sets create decision rules that are independent of the sample size,
simple and clear, but the decision rules may not cover all cases.
3.4.3.10. Analytical hierarchy process (AHP)
AHP is based on the observation that when decision makers start to make a
decision, they are faced with a complicated system of factors. When the
elements of the system and their relationships are reviewed together, they
naturally divide into groups based on certain characteristics. By repeating
this process several times, the characteristics that define the groups are
further examined as the elements of a further level of the knowledge system.
By classifying these elements according to another criterion we create a new,
higher level of the hierarchy, until we finally reach the uppermost element of
the system, which represents the general description of the decision
making problem, Ferenc Kiss (2003).
3.5. Comparisons of techniques used to build credit score
Many techniques have been developed for credit scoring, and
there is no agreement on which method should be used to build a credit
score model because, Thomas L. C., David B. Edelman, and Jonathan N.
Crook (2002):
- Commercial consultancies have a tendency to identify the
method they use as best.
- Comparisons by academics cannot reflect exactly what happens
in the industry, since some of the significant data, like credit
bureau data, are too sensitive to be passed on to them by the users.
Generally, there is no single technique that should be used to build a
credit score model. Each technique has its advantages and disadvantages.
The technique used to build a credit score model must suit the problem at
hand, Liu, Y. (2002) and Jih-Jeng Huang, et al (2005).
Generally, building a credit score is a complex and non-standardized
process, so a solution cannot be optimal for all cases, and
perfect separation is impossible because the sample data may not be
accurate or good applicants and bad applicants may have exactly
the same characteristics, Vladimir Bugera, Hiroshi Konno, and Stanislav
Uryasev (2002) and Liu, Y. (2002).
Chapter four
Decision support system
This chapter consists of two parts. The first gives an overview of
decision support systems and the second describes the proposed decision
support system.
Part I: Introduction to decision support system
4. I.1 Definition of decision support system (DSS)
In the late 1970s, a number of companies developed interactive
information systems that used data and models to help managers solve
semi-structured problems. These systems were called decision support systems
(DSS), D.J.Power (2000).
There are several definitions of decision support system; each one
defines it from a specific aspect of the decision making process and reflects
the author’s point of view, B.Ravindranath (2002) and Gachet, A. (2001).
The early definitions of decision support system were open and
admitted several interpretations; they identified it as a system intended to
support managerial decision makers in semistructured decision situations,
Turban (1988) and Efraim Turban and Jay.E. Aronson (2002).
One definition, which is widely accepted, is that a DSS is an
interactive system that provides the user with easy access to data and
decision making processes in order to support unstructured and partly
structured tasks, B.Ravindranath (2002).
Turban (1988) and Efraim Turban and Jay.E. Aronson (2002)
compared and contrasted the various definitions of decision support
system by examining the various concepts used to define DSS and
summarized the results in table 4.I.1:
Source                           DSS defined in terms of
Gorry and Scott-Morton (1971)    Problem type, system function (support)
Little (1970)                    System function, interface characteristics
Alter (1980)                     Usage pattern, system objectives
Moore and Chang (1980)           Usage pattern, system capabilities
Bonczek et al. (1980)            System components
Keen (1980)                      Development process
Table 4.I.1: Various definitions of DSS
Little (1970) defines DSS as a model based set of procedures for
processing data and judgment to assist a manager in his decision making.
(Little's definition was a refinement of Gorry and Scott-Morton's definition).
Alter (1980) defines DSS by contrasting them with traditional
electronic data processing (EDP) systems on five dimensions. The result of
this contrast is given in table 4.I.2.
Dimension       DSS                          EDP
Use             Active                       Passive
User            Line and staff management    Clerical
Goal            Effectiveness                Mechanical efficiency
Time horizon    Present and future           Past
Objective       Flexibility                  Consistency
Table 4.I.2: The result of contrasting
Moore and Chang (1980) define DSS as extendible systems, capable
of supporting ad hoc data analysis and decision modeling, oriented toward
future planning, and used at irregular, unplanned intervals.
Bonczek et al. (1980) define DSS as a computer based system consisting
of three interacting components: a language system (a mechanism to
provide communication between the user and other components of the
DSS), a knowledge system (a repository of problem domain knowledge
embodied in the DSS as either data or procedures), and a problem processing
system (a link between the other components, containing one or more of
the general problem manipulation capabilities required for decision
making).
Keen (1980) defines DSS as the product of a developmental process
in which the DSS user, the DSS builder, and the DSS itself are all capable
of influencing one another, resulting in an evolution of the system and
the pattern of its use.
Turban (1988) and Efraim Turban and Jay.E. Aronson (2002)
summarize the results of their comparison as follows:
1- The basis for defining DSS has been:
- Developed from the perceptions of what a DSS does (such
as supporting decision making in unstructured problems).
- Developed from ideas about how the DSS's objective can be
accomplished (such as the components required, the appropriate usage
pattern, and the necessary development processes).
2- These definitions do not provide a consistent focus because each
tries to narrow the population in a different way.
3- These definitions collectively ignore the central purpose of DSS, which
is to support and improve decision making.
Turban (1988) formulates his working definition of DSS as an
interactive computer based system that utilizes decision rules and models
coupled with a comprehensive database and the decision maker's own
insights, leading to specific, implementable decisions in solving problems
that would not be amenable to management science optimization models
per se.
Motaz Khorshid (2004) defines the DSS as an advanced computer-aided
information technology, used to support complex decision making,
problem solving, policy testing, scenario simulation and strategic
planning.
Since there is no agreement on the definition of DSS,
B.Ravindranath (2002) presents the expectations from a DSS as follows:
1- It should provide reliable information to support decision
making.
2- It should be able to handle unexpected problems by
performing the necessary analysis and using suitable models.
3- It should make the support available when it is needed.
4- It should evolve over time as user needs and
information change.
5- People should be able to use it easily.
4. I.2 Characteristic and capabilities of DSS
The major feature of DSS that distinguishes it from other computer
aided systems, such as management information systems (MIS) and expert
systems (ES), is that it incorporates quantitative tools, Motaz Khorshid (2004).
Turban (1988) and Efraim Turban and Jay.E. Aronson (2002)
summarize the major DSS characteristics and capabilities and conclude
that because there is no agreement on the definition of DSS, there is no
agreement on the characteristics and capabilities of DSS.
The major DSS characteristics and capabilities are the following:
1- DSS provides support for decision makers to solve semistructured
and unstructured problems by bringing together human judgment
and computerized information.
2- DSS provides support to all managerial levels.
3- DSS supports individuals or groups.
4- DSS provides support to several interdependent and/or sequential
decisions.
5- DSS supports all phases of the decision making process (intelligence,
design, choice, and implementation).
6- DSS supports a variety of decision making processes and styles.
7- DSS is adaptive over time. The users can add, delete, combine,
change, or rearrange basic elements.
8- User friendliness.
9- DSS attempts to improve the effectiveness of decision making
(accuracy, timeliness, quality) rather than its efficiency (the cost of
making decisions).
10- The decision maker has complete control over all steps of the
decision making process in solving a problem.
11- End users should be able to construct or modify simple systems by
themselves.
12- A DSS usually utilizes models for analyzing decision making
situations.
13- DSS provides access to a variety of data sources, formats, and types.
14- DSS can be integrated with other DSS and applications and distributed
using web technologies.
No DSS can exhibit all the above characteristics. One should try to
build a DSS that incorporates as many of these characteristics as possible.
The capabilities and characteristics are summarized in figure 4.I.1.
Figure 4.I.1: DSS characteristics
4. I.3 Decision support system components
A DSS does not have a monolithic structure. It consists of a few
subsystems, which have to interact with each other, B.Ravindranath
(2002).
There are different identifications of the components of DSS. Gachet
A., (2001), presents some of these identifications as follows:
1- Sage (1991) identifies three components of DSS
i. Data base management system (DBMS)
ii. Model base management system (MBMS)
iii. Dialog generation and management system (DGMS)
2- Hattenschwiler (1999) identifies five components of DSS
i. Users with different roles or functions in the decision
making process (decision maker, advisors, domain
expert, system expert, data collectors)
ii. A specific and definable decision context
iii. A target system describing the majority of the
preferences
iv. A knowledge base made of:
1. External data sources, knowledge database,
working database, data warehouses and meta-
database.
2. Mathematical models and methods
3. Procedures, inference and search engines
4. Administrative programs and reporting systems
v. A working environment for the preparation, analysis
and documentation of decision alternatives.
3- Power (2000) identifies four components of DSS
i. The user interface
ii. The database
iii. The model and analytical tools
iv. The DSS architecture and network
Efraim Turban & Jay.E. Aronson (2002) define the components of
DSS as follows: a decision support system is composed of the following four
components, as shown in figure 4.I.2:
Figure 4.I.2: Component of DSS
1- Data management subsystem
The data management subsystem is composed of the following
elements, as shown in figure 4.I.3:
Figure 4.I.3: The structure of data management subsystem
a) DSS database
The DSS database extracts data related to the problem under
consideration. These data are collected from:
- Internal sources
They come from the organization's daily transactions.
- External sources
They include different types of data, such as national economic data.
- Personal (private) data.
b) Database management system (DBMS)
The DBMS is the software which manages the data management subsystem.
The DBMS stores data in the database, retrieves data from the database,
and controls the database.
c) Data directory
The data directory contains the definitions of the data items and their
functions. The main function of the data directory is to answer questions
about the availability of data items, their sources, and their meaning.
d) Query facility
The query facility provides the basis for accessing, manipulating, and
querying the data.
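The roles of the data directory and the query facility can be sketched as
follows. The class names, field names, and sample records are hypothetical
and serve only to illustrate the two elements; a real DSS would rely on the
facilities of its DBMS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DirectoryEntry:
    source: str   # "internal", "external" or "private"
    meaning: str  # plain-language definition of the data item


class DataDirectory:
    """Answers questions about availability, source and meaning of data items."""

    def __init__(self):
        self._entries = {}

    def register(self, item, source, meaning):
        self._entries[item] = DirectoryEntry(source, meaning)

    def is_available(self, item):
        return item in self._entries

    def describe(self, item):
        return self._entries[item]


def query(rows, **conditions):
    """Minimal query facility: rows matching all field=value conditions."""
    return [row for row in rows
            if all(row.get(field) == value
                   for field, value in conditions.items())]


directory = DataDirectory()
directory.register("monthly_income", "internal", "Applicant income per month")
directory.register("inflation_rate", "external", "National price inflation")

applicants = [
    {"id": 1, "home_status": "owner", "monthly_income": 4000},
    {"id": 2, "home_status": "tenant", "monthly_income": 2500},
]
owners = query(applicants, home_status="owner")
```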
2- Model management subsystem
The model management subsystem is composed of the following
elements, as shown in figure 4.I.4:
Figure 4.I.4: The structure of model management subsystem
a) Model base
The model base contains quantitative models (statistical, financial, etc.)
which provide the DSS with its capabilities.
The models in the model base may be:
- Strategic models, which are used to support top management.
- Tactical models, which are used by middle management.
- Operational models, which support day-to-day working activities.
- Analytical models, which are used to perform some analysis on data.
b) Model base management system (MBMS)
MBMS is software used for:
- Model creation using
o Programming languages.
o DSS tools and/or subroutines
o Other building blocks.
- Generation of new routines and reports.
- Models updating and changing.
- Model data manipulation.
c) Model directory
The model directory contains the definitions of the models and describes
their functions and capabilities.
d) Model execution, integration, and command processor
- Model execution controls the running of the model.
- Model integration involves combining the operation of several
models or integrating the DSS with other applications.
- The command processor interprets modeling instructions from the user
interface component and routes them to the MBMS, model execution,
or integration functions.
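A minimal sketch of how these elements fit together is given below. The
instruction format (a dictionary with an "action" key), the class and
function names, and the two toy models are assumptions made for
illustration; they do not describe any particular DSS product.

```python
class ModelBase:
    """Holds models (callables) together with a directory describing them."""

    def __init__(self):
        self._models = {}
        self._directory = {}

    def add(self, name, func, description):
        self._models[name] = func
        self._directory[name] = description

    def describe(self, name):
        return self._directory[name]

    def execute(self, name, **data):
        return self._models[name](**data)


def command_processor(model_base, instruction):
    """Interpret an instruction from the user interface and route it."""
    action = instruction["action"]
    if action == "describe":   # answered from the model directory
        return model_base.describe(instruction["model"])
    if action == "run":        # model execution
        return model_base.execute(instruction["model"],
                                  **instruction.get("data", {}))
    if action == "integrate":  # chain several models, feeding results forward
        value = instruction["data"]
        for name in instruction["models"]:
            value = model_base.execute(name, value=value)
        return value
    raise ValueError(f"unknown action: {action}")


models = ModelBase()
models.add("double", lambda value: value * 2, "Doubles the input value")
models.add("add_fee", lambda value: value + 5, "Adds a fixed fee of 5")

single = command_processor(
    models, {"action": "run", "model": "double", "data": {"value": 10}})
chained = command_processor(
    models, {"action": "integrate", "models": ["double", "add_fee"], "data": 10})
```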
3- Knowledge-based subsystem
Many problems require expertise for their solution. A knowledge
based subsystem can be added to the DSS to provide the required expertise.
4- User interface subsystem
The user uses the DSS through this subsystem. It includes all aspects of
communication between users and the DSS. It includes:
- The hardware
- The software
- Factors that deal with ease of use, accessibility and human
machine interactions.
The user interface subsystem is managed by software called user
interface management system (UIMS).
Motaz Khorshid (2004) defines the components of DSS as follows:
the DSS model comprises four main components, figure 4.I.5:
1- Database management capabilities with access to internal and
external data, information and knowledge.
2- Modeling function accessed by a model management system.
3- A powerful, yet simple user interface design that enables
interactive queries, reporting, and graphing functions.
4- A decision-maker's own insights.
Figure 4.I.5: A conceptual model of DSS
4. I.4 Decision support system application (type, classification,
taxonomy of DSS)
There are several ways to classify DSS applications. Different
authors propose different classifications, Gachet A., (2001).
1- Alter's (1980) classification, Efraim Turban & Jay.E. Aronson,
(2002) and B.Ravindranath (2002).
Alter's (1980) classification is based on the purpose for which a DSS
is expected to be used:
- The degree of action implication of system outputs.
- The extent to which system outputs can directly support (or
determine) the decision.
According to this classification, there are seven categories of DSS:
a) File drawer systems.
A file drawer system provides the user with organized information
regarding specific demands.
b) Data analysis systems.
A data analysis system provides different paths or alternative methods
to meet a given situation.
c) Analysis information systems.
An analysis information system is used for building the data warehouse
in any large organization.
d) Accounting models.
Accounting models are used for accounting purposes.
e) Representation models.
Representation models are used in forecasting future trends.
f) Optimization models.
Optimization models are used to allocate resources when they are
limited.
g) Suggestion models.
Suggestion models are used for operational purposes.
The first two types are data oriented, performing data retrieval or
analysis. The third deals with both data and models. The remaining four are
model-oriented, providing simulation capabilities, optimization, or
computations that suggest an answer.
2- Holsapple and Whinston's classification (1996), Efraim Turban
& Jay.E. Aronson, (2002) and B.Ravindranath (2002).
Holsapple and Whinston (1996) classify DSS into six frameworks:
a) Text oriented DSS
In this type of DSS, information is drawn from reports, statements and
technical observations, stored in a textual format, and accessed by
decision makers.
b) Database oriented DSS
In this type of DSS the database organization plays a major role in
the DSS structure.
c) Spreadsheet oriented DSS
A spreadsheet is a modeling language that allows the user to write
models to execute DSS analysis. This DSS enables the user to get
organized information in a framed document.
d) Solver oriented DSS
A solver oriented DSS is built around a solver, an algorithm or
procedure written as a computer program for performing certain
computations and giving quantitative solutions for a particular problem
type.
e) Rule oriented DSS
In this type of DSS the knowledge component of DSS includes both
procedural and inferential (reasoning) rules.
These rules can be qualitative or quantitative, and such a
component can replace quantitative models or be integrated with them.
f) Compound (hybrid) DSS
This type of DSS is a hybrid system that includes two or more of
the five basic structures described earlier.
3- Power (1997) classification
At the technical level, Power differentiates between the following,
Gachet, A. (2001):
a. Enterprise-wide DSS
Enterprise-wide DSS are linked to large data warehouses and serve
many managers in a company.
b. Desktop DSS
Desktop single user DSS are small systems that reside on an
individual manager's PC.
4- Hattenschwiler (1999) classification, Gachet, A. (2001)
a. Passive DSS
A passive DSS is a system that cannot bring out decision suggestions
or solutions.
b. Active DSS
An active DSS can bring out such decision suggestions or solutions.
c. Cooperative DSS
Cooperative DSS allows the decision maker (or its advisor) to
modify, complete, or refine the decision suggestions provided by the
system, before sending them back to the system for validation.
5- Power (2000) classification
a. Communication Driven DSS
A communication-driven DSS supports more than one person
working on a shared task.
b. Data-Driven DSS
Data-Driven or Data-oriented DSS emphasize access to and
manipulation of a time series of internal company data and external data.
c. Document-Driven DSS
Document-driven DSS manage, retrieve and manipulate
unstructured information in a variety of electronic formats.
d. Knowledge-Driven DSS
Knowledge-driven DSS provide specialized problem-solving
expertise stored as facts, rules, procedures, or in similar structures.
e. Model-Driven DSS
Model-driven DSS use data and parameters provided by decision
makers to aid them in analyzing a situation, but they are not
necessarily data intensive.
6- Classification based on usage modes, B.Ravindranath (2002).
DSS can be classified according to how they are put to use as
follows:
a. Subscription mode.
Any DSS that produces its outputs in the form of reports is
considered to be working in subscription mode.
b. Clerk mode.
This system behaves like an inquiry clerk either in the booking
office or in a library.
c. Terminal mode.
In this type the system is loaded on the personal computer.
d. Intermediary mode.
When the DSS becomes complicated, an expert from the MIS department
helps the user to draw information; this expert is called an
intermediary.
e. Institutional DSS.
In this system the information required for routine administration of
an organization is stored.
7- Special and general purpose DSS, Motaz Khorshid (2004).
a. Special purpose DSS.
Special purpose DSS concentrate on either a specific problem or a
specific tool of decision analysis (e.g. DSS organized around computer
simulation techniques).
b. General purpose DSS.
General purpose DSS are based on a specific analytical tool or
computational models (e.g. DSS tools based on optimization models).
8- Other classifications, Efraim Turban and Jay.E. Aronson
(2002).
a) Institutional and ad hoc
i. Institutional DSS
Institutional DSS deal with decisions of a recurring nature, e.g. a
portfolio management system. They are used to solve identical or similar
problems, so an institutional DSS can be developed over many years.
ii. Ad hoc DSS
Ad hoc DSS deal with specific problems that are not of a recurring
type.
b) Personal, group, and organizational support
i. Personal support
Personal DSS can be used by an individual user or a group of users to
solve a specific problem, e.g. selecting stocks.
ii. Group support
Group DSS can be used by a group of users, each of whom works
individually, to solve interrelated problems. For example, a DSS may
serve many users in a finance department; the decisions are made
individually, but the users check the impact of their decisions on
others.
iii. Organizational support
Organizational DSS help many users who work in the same
organization but in different functional areas.
c) Individual DSS vs. a group support system
i. Individual DSS
ii. Group decision support system
In a group decision support system (GSS) the decisions are made by a
group.
d) Custom made vs. ready made systems
i. Custom made
A DSS may be built for individual users and organizations.
ii. Ready made
A ready made DSS is a generic DSS that can be used in several
organizations. It is useful when the problem occurs in similar
organizations or in the same functional area of different organizations.
4. I.5 Constructing a decision support system
There is no single way to construct a DSS because:
1- There are several types of DSS.
2- There are differences in organizations, decision makers, and DSS
problem areas.
The DSS architecture must ensure the following points,
B.Ravindranath (2002):
1- The transfer of information from one source to another.
2- There should be provision for future extension and
addition of new activities to the DSS being designed.
3- The architecture should be compatible with the existing
computerized systems like MIS.
4- The architecture should assist the management to discuss
with the vendors the suitability of the systems offered.
5- The architecture should help in estimating the cost of the
system.
Efraim Turban (1988) describes all the activities needed to build a
complex DSS; it is not necessary to follow every step for every DSS.
The phases for building a DSS are summarized in figure 4.I.6.
Figure 4.I.6: Phases in building a DSS
[Figure flow: Planning (need assessment, problem diagnosis, objectives
of DSS); Research (how to address user needs, what resources are
available); Analysis (best development approach, necessary resources,
normative models); Design (model base, DSS database, user interface);
Construction (putting together the DSS); Implementation (testing and
evaluation, demonstration, orientation, training, and deployment);
Maintenance; Adaptation (continually repeat the process to improve the
system)]
1- Planning
Planning phase involves:
- Need assessment
- Problem diagnosis
- Define the objective and goals of the decision support system
2- Research
Research phase involves:
- Identifying how we can satisfy the user needs.
- Identifying the available resources.
3- Analysis
Analysis phase involves:
- Identifying the best approach to achieve the user needs.
- Identifying the required resources to achieve the user needs.
4- Design
Design phase involves:
- Designing the database and its management
- Designing the model base and its management
- Designing the user interface.
5- Construction
In the construction phase, we put the components of the DSS together.
6- Implementation
i. Testing
In the testing phase we collect information about the system
performance and compare it with the design specification.
ii. Evaluation
In the evaluation phase we determine whether the implemented system
satisfies the user needs.
iii. Demonstration
In the demonstration phase we explain the capabilities of the fully
operational system to the user.
iv. Training
v. Deployment
7- Maintenance
The maintenance phase involves planning for the continuing maintenance
of the system.
8- Adaptation
The adaptation phase involves determining the changes in the user needs
and adapting the DSS accordingly.
4. I.6 DSS technologies levels and tools
There are three DSS technology levels, Efraim Turban and Jay.E.
Aronson (2002): specific DSS (DSS applications), DSS integrated tools
(generators or engines), and DSS primary tools.
The technology levels are important for:
- Understanding the development of DSS.
- Developing a framework for their use.
1- Specific DSS (DSS applications)
A specific DSS (SDSS) is the final DSS that will be used by the
users to achieve their needs.
2- DSS integrated tool (generator or engine)
A generator is a package of software used to build a specific DSS
quickly, inexpensively, and easily.
3- DSS primary tools
DSS tools are the lowest level of DSS technology, used to facilitate
the development of either a DSS generator or a specific DSS.
4. I.6.1 Relationships among the technologies levels
Efraim Turban and Jay.E. Aronson (2002) present the relationships
among the three levels of DSS technologies as follows.
- The tools are used to construct generators, which in turn are used
to construct specific DSS. Using a DSS generator is useful in
constructing a specific DSS and makes it possible to update the DSS
if any change occurs.
- Tools can be used to construct a specific DSS directly, but this may
be very lengthy and expensive.
4. I.6.2 Future trends of decision support system
Four tools have emerged for building DSS (in addition to
models and model base management systems), Motaz Khorshid (2004).
These tools are:
1- Data warehouse.
A data warehouse is a subject oriented, integrated, time variant,
nonvolatile collection of data.
2- On line analytical processing (OLAP).
OLAP is software that enables analysts, managers, executives or
decision makers to gain insight into data through fast, consistent,
interactive access to a wide variety of possible views of information
that has been transformed from raw data to reflect the real
dimensionality of the enterprise as understood by the user.
3- Data mining.
Data mining is a set of artificial intelligence and statistical tools
used for more sophisticated data analysis.
4- World Wide Web.
A web-based DSS is a computerized system that delivers decision
support information or decision support tools to a manager or analyst
using a Web browser.
The frequent use of various DSS tools in decision making and
problem solving has contributed to the development of new trends and
approaches as follows, Motaz Khorshid (2004):
- One trend is the increasing sophistication of model centered DSS.
- Another trend is the development of collaborative support
systems (GDSS). The group decision support system (GDSS) is a DSS
specially designed to facilitate and enhance the communication related
activities of team members engaged in cooperative work.
- A third and important trend is the active decision support system
(ADSS). An ADSS is a system wherein the computer and the user work as
partners in the problem solving process.
4. I.7 Approaches to DSS construction
There are several approaches to DSS construction. Efraim Turban
(1988) classified these approaches into three categories.
4. I.7.1 Quick hit
In the quick hit approach, a specific DSS is constructed relatively
quickly to address a difficult problem.
The advantages of the quick hit approach:
- The costs and risks are low.
- The latest technologies can be used.
- It can use commercially available generators.
The disadvantages of the quick hit approach:
- Quick hit DSS are usually constructed for one person or for one
purpose.
- Quick hit DSS do not relate to other DSS.
- In quick hit DSS, little experience carries over to the
next DSS.
4. I.7.2 Staged development
In the staged development approach, the construction of a specific DSS
depends on advance planning.
4. I.7.3 Complete DSS
A complete DSS requires:
- Development of a full service, large scale DSS generator.
- Large scale specific DSSs.
- An organizational unit to manage such a project.
The selection of an appropriate approach depends on:
- The organization
- Purpose of DSS
- Tasks
- Available tools
- Builders
4. I.8 Alternate development methodologies
DSS can be developed using several methodologies, all of which are
based on the traditional SDLC. The choice of the method used to build a
DSS depends on whether the DSS will be built by the end user
or by a DSS team, Efraim Turban (1988) and Efraim Turban and Jay.E.
Aronson (2002).
4. I.8.1 Parallel development (traditional methodologies)
Parallel development methodologies depend on the traditional
SDLC (planning, design, construction, and implementation). In parallel
development the design and implementation phases are split into multiple
copies, each of which deals with a separate subsystem.
The design strategy depends on the assumption that the required
information can be predetermined, but this assumption does not hold
because the user learns more about the problem and will need to identify
new information. So there is a need to depart from the traditional SDLC.
4. I.8.2 Rapid application development (RAD) methodologies
In rapid application development methodologies, the SDLC is
adjusted in order to provide the user with parts of the system quickly.
Rapid application development methodologies include three
methods:
4. I.8.2.1 Phased development
In phased development methodologies the system is broken into a
series of versions, developed sequentially, each of which has more
functionality than the previous one.
The advantage: users gain functionality quickly.
The disadvantage: the users start with an incomplete system.
4. I.8.2.2 Prototyping (evolutionary, iterative)
The majority of DSS are built using the evolutionary prototyping
approach. The prototyping approach builds a DSS in a series of steps,
with direct feedback from users at each step to modify the system if
there is a need. Therefore, DSS tools must be flexible to permit changes
quickly and easily. In prototyping, the analysis, design and
implementation phases are performed at the same time and repeatedly. It
starts with overall planning, and then the analysis, design and
prototype implementation phases are performed iteratively until a small
prototype is developed.
Prototyping involves the following steps:
1- Select an important sub problem (by the user and builder).
2- Develop a small but usable system to assist the decision
maker.
3- Evaluate the system constantly.
4- Refine, expand, and modify the system in cycles.
These steps are repeated until a stable system evolves. If the
prototype is suitable for the users, the formal implementation of the DSS
can be performed.
4. I.8.2.3 Throwaway prototyping
Throwaway prototyping is often used to understand the users' needs
and the system requirements. Throwaway prototyping is similar to both
the traditional and the prototyping approaches. It performs a complete
analysis, as in the SDLC, and designs prototypes to assist in
understanding more about the system being developed.
4. I.9 Team developed vs. user developed DSS
The distinction between team developed DSS and user developed DSS is
theoretical; in practice a mixture of these two methods is often used,
Efraim Turban (1988) and Efraim Turban and Jay.E. Aronson (2002).
4. I.10 DSS development platforms
There are several basic DSS development software platforms,
Efraim Turban and Jay.E. Aronson (2002). The most important are the
following:
1- Write a customized DSS in a general purpose programming
language such as Visual Basic.
2- Use a fourth generation language such as data-oriented
languages, spreadsheets, and financial oriented languages.
3- Use OLAP with a data warehouse or large database.
4- Use a DSS integrated development tool (generator or engine).
5- Use a domain specific DSS generator. Domain specific DSS
generators are designed to build a highly structured system.
6- Develop the DSS using CASE methodology.
7- Develop a complex DSS by integrating several of the above
approaches.
Links to the web can be integrated into any of the previous platforms.
4. I.11 Issues associated with DSS
Gachet, A. (2001) analyzes the reasons why DSS have not yet been
broadly used and proposes solutions for some of the factors. He
groups the reasons for the low interest in DSS in practice into
three main categories:
1- Human factors
Human factors cover the reasons why users and decision makers oppose
computerized decision-making systems.
2- Conceptual factors
Conceptual factors cover the problems encountered by DSS because
of wrong or incomplete choices carried out during the design of the
systems.
3- Technical factors
Technical factors cover the problems encountered by DSS related to
purely software or hardware considerations.
Part II: The proposed decision support system
4. II.1 Introduction
In this part we present the proposed decision support system, which
helps the decision maker in the credit card center to assess new
applicants in order to decide whether the bank will issue a credit card
to the applicant or not.
4. II.2 The proposed decision support system
The proposed DSS helps the decision maker decide whether to accept or
reject new applicants. First the decision maker checks whether the new
applicant is on the negative list in the credit bureau, then checks
whether there is any comment about this applicant in the private data.
The decision maker then uses the Bayesian model or the MSD model and
enters the required data to assess the new applicant.
The DSS was constructed using the quick hit approach. The key decision
of the proposed decision support system is to classify a new applicant
for a credit card into predefined classes (good or bad) and determine
whether the bank will issue a credit card to this customer or deny the
application. To achieve this key objective, we build the credit score
model using a composite rule induction system, Bayesian classification
and linear programming.
The structure of the proposed DSS is given in figure 4.II.1:
Figure 4.II.1: The proposed DSS
4. II.3 Building the proposed decision support system
To build the credit score model we need a sample of the current
customers. This sample should include the characteristics of the
customers: age, gender (female and male), marital status (single,
divorced, widow, and married), education level (diploma, graduated, and
post graduated), occupation (self employee, employee), experience, home
own type (own, rent), home phone (yes, no), bank account (yes, no),
credit card (yes, no), home years and income. We also need the
information from the credit bureau and the private data.
[Figure 4.II.1 components: DSS database (customer's data, private data,
information from credit bureau); model base (CRIS, linear programming,
Naïve Bayesian classification); user interface; credit analyst]
The sample was classified into two classes, "good" and "bad". A
customer was classified as bad if the number of consecutive missed
payments was greater than or equal to six months; otherwise the customer
was classified as good.
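The labelling rule above can be written as a small predicate. This is only an illustrative sketch: the function name `label_customer` and its argument are assumptions introduced here, not part of the proposed system.

```python
def label_customer(consecutive_missed_months: int) -> str:
    """Label a customer from the number of consecutive months without payment.

    Hypothetical helper: six or more consecutive missed months gives "bad",
    anything less gives "good", following the rule stated in the text.
    """
    if consecutive_missed_months < 0:
        raise ValueError("number of missed months cannot be negative")
    return "bad" if consecutive_missed_months >= 6 else "good"


if __name__ == "__main__":
    for months in (0, 5, 6, 9):
        print(months, label_customer(months))
```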
4. II.3.1 Decision support system database
The DSS database consists of three parts:
1- Customers data
The customer table contains the characteristics age, gender, marital
status, education level, occupation, experience, home own type, home
phone, bank account, credit card, home years and income, and the
classification of the customers.
2- Credit bureau data
The credit bureau data contain information about whether the customers
have a credit problem with other banks or not.
3- Private data
The private data contain information collected by the credit card
analyst, including any comment about a current customer or a new
applicant (i.e. a profile of customers, such as VIP customers or
bad customers).
The DSS database is shown in figure 4.II.2.
Figure 4.II.2: DSS database
[Figure components: credit bureau; the customer's data; the private data]
4. II.3.2 Model base for the proposed DSS
The model base for the proposed DSS contains a composite rule
induction system (CRIS), Naïve Bayesian classification and linear
programming (MSD model). The model base is managed by WinQSB and
SPSS.
The model base and model base management are given in figure
4.II.3:
Figure 4.II.3: The model management subsystem
[Figure components: WinQSB; SPSS; model base (CRIS, linear programming,
Naïve Bayesian classification)]
4. II.3.2.1 A composite rule induction system (CRIS)
The composite rule induction system is a knowledge acquisition system.
CRIS accepts a set of data as input and produces "if…then" rules to
interpret the set of data. CRIS consists of three steps, Nikolaos F.
Matsatsinis and C. Erik Larson (2004) and Ting-Peng Liang (1992):
1- Hypothesis generation
2- Probability assessment
3- Rule scheduler
The interaction between the hypothesis generator and the probability
calculator generates candidate rules, which form the rule space and
are organized by the rule scheduler.
The CRIS mechanism can be summarized as follows:
1- Hypothesis generation
Hypothesis generation is responsible for determining the causal
relationships between the dependent attribute (class: “good” or “bad”)
and the independent attributes (gender, education, etc.).
For the nominal attributes, the values are simply identifying
different properties and their mean and variance do not provide useful
information. CRIS adopts a cross tabular approach to determine the
relationship between nominal attributes (gender, education, etc.) and the
dependent attributes (good or bad).
Let:
$Y$ refer to the class (good or bad), $Y \in \{G, B\}$
$f_G$ be the number of good customers in the sample
$f_B$ be the number of bad customers in the sample
$f_{jkG}$ be the number of good customers that have the attribute value $v_{jk}$
$f_{jkB}$ be the number of bad customers that have the attribute value $v_{jk}$
$f_{jk}$ be the number of customers (good and bad) that have the
attribute value $v_{jk}$, $f_{jk} = f_{jkG} + f_{jkB}$
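The cross table is given in table 4.II.1:

Attribute j | Class G   | Class B   | Total
$v_{j1}$    | $f_{j1G}$ | $f_{j1B}$ | $f_{j1}$
$v_{j2}$    | $f_{j2G}$ | $f_{j2B}$ | $f_{j2}$
...         | ...       | ...       | ...
$v_{jk}$    | $f_{jkG}$ | $f_{jkB}$ | $f_{jk}$
...         | ...       | ...       | ...
$v_{jz}$    | $f_{jzG}$ | $f_{jzB}$ | $f_{jz}$
Total       | $f_G$     | $f_B$     |

Table 4.II.1: The frequency table
To generate the hypotheses we repeat the following step until all
hypotheses are generated for all nominal attributes.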
For each $a_j = v_{jk}$, $k = 1, 2, \ldots, z$:
o if $f_{jkG} \ge f_{jkB}$, formulate the hypothesis: If $a_j = v_{jk}$ then $Y = G$
o if $f_{jkG} \le f_{jkB}$, formulate the hypothesis: If $a_j = v_{jk}$ then $Y = B$
Note that:
- If there is a tie, all possible hypotheses are generated.
- The total number of hypotheses to be generated for the attribute $j$ is $z$
plus the number of ties.
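The hypothesis generation step for one nominal attribute can be sketched as follows. This is a minimal illustration, not the thesis implementation: the function name `generate_hypotheses` and the toy data are assumptions, and a value whose good and bad counts are equal is treated as a tie that yields both hypotheses.

```python
from typing import Dict, List, Sequence, Tuple


def generate_hypotheses(
    values: Sequence[str], classes: Sequence[str]
) -> Tuple[Dict[str, Dict[str, int]], List[Tuple[str, str]]]:
    """Build the frequency table of one attribute and derive its hypotheses.

    values  : attribute value of each customer in the sample
    classes : class of each customer, "G" (good) or "B" (bad)
    Returns the frequency table and a list of (value, predicted class) pairs.
    """
    freq: Dict[str, Dict[str, int]] = {}
    for value, cls in zip(values, classes):
        row = freq.setdefault(value, {"G": 0, "B": 0})
        row[cls] += 1

    hypotheses: List[Tuple[str, str]] = []
    for value, row in freq.items():
        if row["G"] >= row["B"]:
            hypotheses.append((value, "G"))  # If attribute = value then Y = G
        if row["B"] >= row["G"]:
            hypotheses.append((value, "B"))  # If attribute = value then Y = B
    return freq, hypotheses


if __name__ == "__main__":
    # Hypothetical sample for the attribute "home own type".
    home = ["own", "own", "rent", "rent", "own", "rent"]
    cls = ["G", "G", "B", "G", "B", "B"]
    table, rules = generate_hypotheses(home, cls)
    print(table)
    print(rules)
```

With z distinct values and no ties the function returns z hypotheses; every tie adds one more, in line with the count given above.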
2- Probability assessment
The purpose of the probability assessment is to calculate the
probability associated with each rule.
- The probability $P(G \mid a_j = v_{jk})$ of the hypothesis "If $a_j = v_{jk}$
then $Y = G$" and the probability $P(B \mid a_j = v_{jk})$ of the hypothesis
"If $a_j = v_{jk}$ then $Y = B$" are conditional probabilities; each
indicates the likelihood that the conclusion is true if the condition of
the hypothesis is met. They can be calculated from:
o The prior probabilities of the classes, $P(Y = G)$ and $P(Y = B)$
o The other conditional probabilities, $P(a_j = v_{jk} \mid G)$ and
$P(a_j = v_{jk} \mid B)$, i.e. the probability that the value of the
attribute $j$ is $v_{jk}$ given that the customer belongs to the specific
class
Let:
- $P_G$ be the probability that an arbitrary customer is good,
$P_G = f_G / m$, where $m$ is the total number of customers in the sample
- $P_B$ be the probability that an arbitrary customer is bad,
$P_B = f_B / m$
- $P(G \mid a_j = v_{jk})$ be the probability that a customer is good given
that the value of the attribute $j$ is $v_{jk}$.
- $P(B \mid a_j = v_{jk})$ be the probability that a customer is bad given
that the value of the attribute $j$ is $v_{jk}$.
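From Bayes' theorem these probabilities can be calculated as
follows:
$$P(G \mid a_j = v_{jk}) = \frac{P_G \, P(a_j = v_{jk} \mid G)}{P_G \, P(a_j = v_{jk} \mid G) + P_B \, P(a_j = v_{jk} \mid B)}$$
$$P(B \mid a_j = v_{jk}) = \frac{P_B \, P(a_j = v_{jk} \mid B)}{P_G \, P(a_j = v_{jk} \mid G) + P_B \, P(a_j = v_{jk} \mid B)}$$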
For the nominal attributes the information about the data
distribution is unavailable. Hence, the conditional probability is assessed
by its relative frequency of occurrence in the training data.
Because both the numerator and denominator are divided by the
same constant (total number of occurrences), the two previous equations
can be simplified as follows:
$$P(G \mid a_j = v_{jk}) = \frac{P_G \, f_{jkG}}{P_G \, f_{jkG} + P_B \, f_{jkB}}$$
$$P(B \mid a_j = v_{jk}) = \frac{P_B \, f_{jkB}}{P_G \, f_{jkG} + P_B \, f_{jkB}}$$
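As a numerical illustration, the simplified expressions can be evaluated directly from the frequency table. The sketch below follows the simplified form exactly as stated in the text (class priors multiplied by the value frequencies); the function name `rule_probabilities` and the sample counts are assumptions, and the sample is assumed to contain only the two classes, so that m equals the number of good customers plus the number of bad customers.

```python
from typing import Tuple


def rule_probabilities(
    f_jk_good: int, f_jk_bad: int, f_good: int, f_bad: int
) -> Tuple[float, float]:
    """Return (P(G | a_j = v_jk), P(B | a_j = v_jk)) from the simplified form.

    f_jk_good, f_jk_bad : good / bad customers having the attribute value
    f_good, f_bad       : good / bad customers in the whole sample
    """
    m = f_good + f_bad          # assumption: two classes only
    p_good = f_good / m         # prior probability of a good customer
    p_bad = f_bad / m           # prior probability of a bad customer
    denominator = p_good * f_jk_good + p_bad * f_jk_bad
    if denominator == 0:
        raise ValueError("the attribute value does not occur in the sample")
    return p_good * f_jk_good / denominator, p_bad * f_jk_bad / denominator


if __name__ == "__main__":
    # Hypothetical counts: 150 good and 50 bad customers; the value occurs
    # for 60 good and 10 bad customers.
    p_g, p_b = rule_probabilities(60, 10, 150, 50)
    print(round(p_g, 4), round(p_b, 4))
```

The two returned probabilities always sum to one, so either can serve as the probability attached to the corresponding candidate rule.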
3- Rule scheduler
A hypothesis with its associated probability is called a candidate
rule. Composite rules induction system selects attributes based on their
saliency. Rule saliency is defined as the difference between the number of
cases correctly covered (hit value) and those incorrectly interpreted (miss
value) by the rule.
The resulting structure is a decision tree with rules as its nodes.
Structure construction can be summarized as follows:
1- Determination of rule saliency.
2- Selection of rules. The guidelines for rule selection are as follows:
i. If there are rules whose miss values are zero and
whose hit values are positive, then select the one
with the highest hit value.
ii. If all rules have positive miss values, then select the
rule with the highest positive saliency value.
iii. If more than one rule has the same saliency value, then
choose the one with the highest probability.
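The selection guidelines above can be sketched as a small function. This is an illustrative reading of the guidelines rather than the CRIS implementation: the `Rule` type, its field names and the sample rules are assumptions, and breaking ties among zero-miss rules by probability is an additional assumption not stated in the text.

```python
from typing import NamedTuple, Optional, Sequence


class Rule(NamedTuple):
    """A candidate rule: hypothesis plus its hit/miss counts and probability."""
    condition: str
    conclusion: str
    hit: int            # cases correctly covered by the rule
    miss: int           # cases incorrectly interpreted by the rule
    probability: float

    @property
    def saliency(self) -> int:
        # Saliency = hit value minus miss value.
        return self.hit - self.miss


def select_rule(rules: Sequence[Rule]) -> Optional[Rule]:
    """Pick the next rule according to guidelines i to iii."""
    # Guideline i: prefer rules with zero misses and positive hits.
    perfect = [r for r in rules if r.miss == 0 and r.hit > 0]
    if perfect:
        # Assumption: equal hit values are broken by probability.
        return max(perfect, key=lambda r: (r.hit, r.probability))
    # Guideline ii: otherwise take the highest positive saliency;
    # guideline iii: equal saliency is broken by the highest probability.
    positive = [r for r in rules if r.saliency > 0]
    if not positive:
        return None
    return max(positive, key=lambda r: (r.saliency, r.probability))


if __name__ == "__main__":
    candidates = [
        Rule("gender = male", "G", 40, 10, 0.80),
        Rule("home own type = own", "G", 35, 5, 0.90),
    ]
    print(select_rule(candidates))
```

In the usage example both rules have saliency 30, so the rule with the higher probability is chosen.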
4. II.3.2.2 Naïve Bayesian classification
Bayesian classifiers are statistical classifiers based on Bayes'
theorem. They can predict the probability that a given applicant
(alternative or sample) belongs to a particular class, Jiawei Han and
Micheline Kamber (2001).
Naïve Bayesian classification is a simple Bayesian classifier based
on the assumption, called conditional independence, that the effect of an
attribute value on a given class is independent of the values of the other
attributes, i.e. the values of the attributes are conditionally
independent of one another. This assumption makes the computation simple,
and when it holds the accuracy of the naïve Bayesian classifier increases
in comparison with other classifiers, Jiawei Han and Micheline
Kamber (2001) and Gutierrez-Pena E.(2004).
The naïve Bayesian classifier tests whether $X \in G$ or $X \in B$, where
$X$ is an unknown sample with the set of attributes
$A = (a_1, a_2, \ldots, a_j, \ldots, a_n)$, $n$ is the number of
the attributes, and $j = 1, \ldots, n$.
The classifier will predict that:
$X \in G$ if $P(G \mid X) > P(B \mid X)$
$X \in B$ if $P(B \mid X) > P(G \mid X)$
i.e. $X$ will belong to the class having the highest posterior
probability, conditioned on $X$.
Where:
$P(G \mid X)$ is the probability that $X$ belongs to the class $G$ given that
$X$ has the set of attributes $A = (a_1, a_2, \ldots, a_j, \ldots, a_n)$
$P(B \mid X)$ is the probability that $X$ belongs to the class $B$ given that
$X$ has the set of attributes $A = (a_1, a_2, \ldots, a_j, \ldots, a_n)$
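The naïve Bayesian classifier works as follows:
From Bayes' theorem, $P(G \mid X)$ and $P(B \mid X)$ can be calculated as
follows:
$$P(G \mid X) = \frac{P(X \mid G)\, p(G)}{P(X \mid G)\, p(G) + P(X \mid B)\, p(B)}$$
and
$$P(B \mid X) = \frac{P(X \mid B)\, p(B)}{P(X \mid G)\, p(G) + P(X \mid B)\, p(B)}$$
where $p(G)$ and $p(B)$ are the prior probabilities of the two classes.
Based on the assumption of conditional independence,
$$P(X \mid G) = \prod_{j=1}^{n} P(X = v_{jk} \mid G)$$
and
$$P(X \mid B) = \prod_{j=1}^{n} P(X = v_{jk} \mid B)$$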
Where:
$P(X = v_{jk} \mid G)$ is the probability that $X$ has the attribute value
$v_{jk}$ given that it belongs to the class good (the posterior
probability of $X$ conditioned on the hypothesis that it belongs to the
class $G$), and
$P(X = v_{jk} \mid B)$ is the probability that $X$ has the attribute value
$v_{jk}$ given that it belongs to the class bad (the posterior
probability of $X$ conditioned on the hypothesis that it belongs to the
class $B$).
The two conditional probabilities $P(X = v_{jk} \mid G)$ and
$P(X = v_{jk} \mid B)$ are assessed by the relative frequency of
occurrence in the training data.
Thus:
$$P(X = v_{jk} \mid G) = \frac{f_{jkG}}{f_G}$$
$$P(X = v_{jk} \mid B) = \frac{f_{jkB}}{f_B}$$
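The naïve Bayesian procedure described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the SPSS-based implementation used in the proposed DSS: the function names, the record layout (a dictionary of attribute values) and the toy sample are hypothetical, all attributes are treated as nominal, and an exact tie between the two posteriors is resolved as "B", which is a choice made here and not in the text.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]
Counts = Dict[str, Dict[Tuple[str, str], int]]


def train(records: List[Record], labels: List[str]) -> Tuple[Dict[str, int], Counts]:
    """Count class frequencies (f_G, f_B) and value frequencies (f_jkG, f_jkB)."""
    class_counts = {"G": 0, "B": 0}
    value_counts: Counts = {"G": {}, "B": {}}
    for record, label in zip(records, labels):
        class_counts[label] += 1
        for attribute, value in record.items():
            key = (attribute, value)
            value_counts[label][key] = value_counts[label].get(key, 0) + 1
    return class_counts, value_counts


def posterior(record: Record, class_counts: Dict[str, int],
              value_counts: Counts) -> Dict[str, float]:
    """Return P(G | X) and P(B | X) using relative frequencies."""
    m = sum(class_counts.values())
    joint = {}
    for label in ("G", "B"):
        if class_counts[label] == 0:
            joint[label] = 0.0
            continue
        p = class_counts[label] / m                      # prior p(G) or p(B)
        for attribute, value in record.items():
            count = value_counts[label].get((attribute, value), 0)
            p *= count / class_counts[label]             # f_jk / f_class
        joint[label] = p
    total = joint["G"] + joint["B"]
    if total == 0:
        raise ValueError("attribute values unseen in both classes")
    return {label: joint[label] / total for label in joint}


def classify(record: Record, class_counts: Dict[str, int],
             value_counts: Counts) -> str:
    post = posterior(record, class_counts, value_counts)
    # Assumption: an exact tie is classified as "B".
    return "G" if post["G"] > post["B"] else "B"


if __name__ == "__main__":
    sample = [
        {"home": "own", "phone": "yes"},
        {"home": "own", "phone": "yes"},
        {"home": "rent", "phone": "yes"},
        {"home": "rent", "phone": "no"},
        {"home": "rent", "phone": "yes"},
    ]
    classes = ["G", "G", "G", "B", "B"]
    cc, vc = train(sample, classes)
    print(classify({"home": "own", "phone": "yes"}, cc, vc))
```

Because relative frequencies are used without smoothing, a value never observed in a class drives that class's product to zero; whether this is acceptable depends on the sample size.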
4. II.3.2.3 Linear programming (MSD model)
Linear programming seeks to develop a linear scorecard by finding a
weight $w_j$ for each attribute and a cut off point $c$ so that good
customers have scores above the cut off point and bad customers have
scores below it, Thomas, L.C. (2000).
The MSD model minimizes the sum of the deviations of the scores of the
incorrectly classified alternatives from the cut off point; this
model is known as MSD (minimize the sum of deviations), Doumpos M. and
Zopounidis C. (2002a) and Yong Shi, Yi Peng, Welxuan Xu and Xiaowo
Tang (2002).
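$$\min \sum_{i=1}^{m} \alpha_i$$
s.t.
$$\sum_{j=1}^{n} a_{ij} w_j + \alpha_i \ge c, \quad i \in G$$
$$\sum_{j=1}^{n} a_{ij} w_j - \alpha_i \le c, \quad i \in B$$
$$w_j,\ c \ \text{unrestricted in sign}, \quad \alpha_i \ge 0$$
Where $a_{ij}$ is the value of attribute $j$ for alternative $i$ and
$\alpha_i$ is the overlapping of the two-class boundary for the score of
alternative $i$ from the cut off point, i.e. the violation of the
classification rules by the alternative (external deviation): the amount
by which its constraint is not satisfied.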
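To make the role of the deviations concrete, the sketch below evaluates the MSD objective for one fixed candidate scorecard. It does not solve the linear program (a solver such as WinQSB searches over the weights and the cut off point); it only computes, for given weights and cut off, the smallest deviation each alternative needs. The function name `msd_deviations` and the sample data are assumptions.

```python
from typing import List, Sequence


def msd_deviations(
    attributes: Sequence[Sequence[float]],
    classes: Sequence[str],
    weights: Sequence[float],
    cutoff: float,
) -> List[float]:
    """Deviation of each alternative from the cut off point.

    A good customer ("G") scoring below the cut off deviates by
    cutoff - score; a bad customer ("B") scoring above it deviates by
    score - cutoff; correctly placed alternatives deviate by zero.
    """
    deviations = []
    for row, cls in zip(attributes, classes):
        score = sum(a * w for a, w in zip(row, weights))
        if cls == "G":
            deviations.append(max(0.0, cutoff - score))
        else:
            deviations.append(max(0.0, score - cutoff))
    return deviations


if __name__ == "__main__":
    # Hypothetical data: three alternatives, two attributes.
    data = [[1, 0], [0, 1], [1, 1]]
    labels = ["G", "B", "B"]
    dev = msd_deviations(data, labels, weights=[2.0, -1.0], cutoff=0.5)
    print(dev, sum(dev))  # the sum is the MSD objective for this scorecard
```

Only the third alternative, a bad customer whose score lies above the cut off, contributes to the objective in this example.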
4. II.3.3 User interface for the proposed DSS
The user interface consists of windows that facilitate the communication
between the credit card analyst and the database management and model
base management systems.
1- The main window
The main window is given in figure 4.II.4.
Figure 4.II.4: The main window of proposed DSS
The main window contains icons for the result of applying the
model, DSS data, MSD, CRIS and Bayesian.
2- The result of applying the models window
This window is given in figure 4.II.5.
Figure 4.II.5: The result of applying the model window
It contains icons to present the results of applying CRIS, the MSD
model and the Bayesian model, and an index that illustrates the codes
used in the models.
3- CRIS window
The CRIS window contains icons to present the frequency table, the
probability and saliency of the rules, and the result of applying CRIS.
The CRIS window is given in figures 4.II.6, 4.II.7 and 4.II.8.
Figure 4.II.6: CRIS window
Figure 4.II.7: CRIS window
Figure 4.II.8: CRIS window
4- MSD model window
This window contains icons to present the MSD weights and the
results of applying MSD1, MSD2, MSD3 and MSD4. The MSD model
windows are given in figures 4.II.9, 4.II.10 and 4.II.11.
Figure 4.II.9: MSD model window
Figure 4.II.10: MSD model window "weights"
Figure 4.II.11: MSD model window "result"
5- Bayesian model window
It contains icons to present the Bayesian probabilities and icons to
present the result of applying the Bayesian model. The Bayesian model
windows are given in figures 4.II.12, 4.II.13 and 4.II.14.
Figure 4.II.12: Bayesian model window
Figure 4.II.13: Bayesian model window "probability"
Figure 4.II.14: Bayesian model window "result"
6- DSS data window
It contains icons to access the customer table, credit bureau and
private data. The DSS data windows are given in figures 4.II.15 and
4.II.16.
Figure 4.II.15: DSS data windows
Figure 4.II.16: DSS data windows "customer table"
7- MSD window
It is used to assess the new applicant: the user is asked
to enter the applicant data, and the applicant is then categorized as
good "accept and issue the credit card" or bad "deny them and refuse to
issue the credit card". The MSD window is given in figure 4.II.17.
Figure 4.II.17: MSD window
8- Composite rule induction system window
The composite rule induction system window gives the user an overview
of the importance of the attributes used in building the system (most
and least preferred customers) and is given in figure 4.II.18.
Figure 4.II.18: CRIS window
9- Bayesian window
It is used to assess the new applicant: the user is asked
to enter the applicant data, and the applicant is then categorized as
good "accept and issue the credit card" or bad "deny them and refuse to
issue the credit card". The Bayesian window is given in figure 4.II.19.
Figure 4.II.19: Bayesian window
4. II.4 Summary
In part II we presented the proposed DSS. It depends on credit
scoring techniques to classify new applicants into good or bad classes
based on their characteristics. The credit card analyst will use it as
follows:
- Check the documents offered by the new applicant and make sure that
the data given by the applicant are true.
- Check with the credit bureau whether the applicant has any problem
with other banks.
- Check whether there is any comment about that applicant in the private
data.
- Use the MSD model or the Bayesian model to classify the applicant into
the good or bad class: the credit analyst enters the applicant data,
and the system returns the classification.
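The analyst's workflow listed above can be sketched as a single function. This is a hedged illustration only: the function and field names are hypothetical, the classifier is passed in as a parameter standing for the MSD or Bayesian model, and the early denial for applicants on the credit bureau negative list is an assumption that the text implies but does not state. Private comments are simply passed through for the analyst to read.

```python
from typing import Callable, Dict, Optional


def assess_applicant(
    applicant: Dict[str, object],
    on_negative_list: bool,
    private_comment: Optional[str],
    classify: Callable[[Dict[str, object]], str],
) -> Dict[str, object]:
    """Combine the bureau check, private data and model class into a report."""
    report: Dict[str, object] = {
        "private_comment": private_comment,  # shown to the analyst as is
        "model_class": None,
    }
    if on_negative_list:
        # Assumption: a negative-list hit ends the assessment with a denial.
        report["recommendation"] = "deny"
        return report
    model_class = classify(applicant)        # "good" or "bad"
    report["model_class"] = model_class
    report["recommendation"] = "accept" if model_class == "good" else "deny"
    return report


def toy_classifier(applicant: Dict[str, object]) -> str:
    # Hypothetical stand-in for the MSD or Bayesian model.
    return "good" if applicant.get("bank_account") == "yes" else "bad"


if __name__ == "__main__":
    print(assess_applicant({"bank_account": "yes"}, False, None, toy_classifier))
```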
Chapter five
An application: Building credit score model for credit
card application assessment
5.1. Introduction
The decision of issuing a credit card is very critical, since a
mistake in the credit decision for a single customer means that the bank
will lose the profit obtained from other successful customers. For this
reason, the method used to evaluate the creditworthiness of each credit
card applicant should be as accurate as possible in order to minimize
the risk of insolvency. Building a credit score model using CRIS,
Bayesian classification, and linear programming can reduce the risk of
insolvency, as we will see in the following sections.
5.2. Description of the current system
Recently, some financial organizations in Egypt have started to use a
deductive credit score to assess credit card applications in order to
decide whether or not to issue a credit card to the applicant.
The deductive credit score works as follows:
- Determine the important attributes.
- Assign a weight to each attribute.
- The score for each customer is obtained by adding the attribute
weights.
- To determine whether or not the financial organization grants a
credit card to the customer, the score of this customer is compared
with a cut-off point.
The attributes, their weights and the cut-off point are determined by the
decision makers based on their experience.
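The four steps above can be sketched as follows. This is a minimal illustration only: the attribute names, the weights, the cut-off value, and the rule that a score at or above the cut-off means acceptance are all hypothetical, since the actual values used by the organizations are not given here.

```python
# Hypothetical expert-assigned weights for a few attribute values.
# None of these numbers come from the thesis.
WEIGHTS = {
    ("credit_card", "yes"): 3,
    ("credit_card", "no"): 0,
    ("bank_account", "yes"): 2,
    ("bank_account", "no"): 0,
    ("home_phone", "yes"): 1,
    ("home_phone", "no"): 0,
}
CUT_OFF = 4  # hypothetical cut-off point


def deductive_score(applicant):
    """Add up the weights of the applicant's attribute values."""
    return sum(WEIGHTS[(attr, value)] for attr, value in applicant.items())


def decide(applicant, cut_off=CUT_OFF):
    """Assumption: a score at or above the cut-off means 'accept'."""
    return "accept" if deductive_score(applicant) >= cut_off else "deny"
```

For example, an applicant with a credit card and a bank account but no home phone scores 3 + 2 + 0 = 5 under these made-up weights and would be accepted.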
The model uses 11 attributes (three quantitative and eight
qualitative). These attributes are:
1- Age (less than or equal to 30; greater than 30 and less than or
equal to 60; greater than 60)
2- Gender (female, male)
3- Marital status (single, divorced, widow, married)
4- Education level (diploma, graduated, post graduated)
5- Occupation (self-employed, employee)
6- Experience (less than 3 years; greater than or equal to 3 years and
less than 10 years; greater than or equal to 10 years)
7- Home own type (own, rent)
8- Home phone (yes, no)
9- Bank account (yes, no)
10- Credit card (yes, no)
11- Home years (less than or equal to 8 years; greater than 8 years)
Under this model, the financial organization receives credit card
requests from customers and evaluates them. The accepted customers are
granted a credit card, and their performance is observed and recorded.
5.3. Description of the training and test sample
To build and test the credit score models using CRIS, Bayesian
classification, and linear programming, a sample of 200 customers was
selected randomly (100 bad and 100 good). Customers are classified as good
or bad according to the number of months of missed payments: a customer
who is more than 6 months late is classified as bad; otherwise the
customer is classified as good.
This sample is divided equally into two samples. The first, the
training sample, consists of 100 customers (50 good and 50 bad) and is
used to build the models. The second, the test sample, consists of 100
customers (50 good and 50 bad) and is used to test the models. The credit
score models will be built using the same attributes used in the current
system.
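The labelling rule and the equal split can be sketched as below. The record layout (a `months_missed` field), the fixed random seed and the function names are assumptions made for illustration; the thesis does not say how the random split was carried out.

```python
import random


def label(months_missed):
    """More than 6 months of missed payments means 'bad', otherwise 'good'."""
    return "bad" if months_missed > 6 else "good"


def stratified_halves(customers, seed=0):
    """Split customers into training and test halves, class by class,
    so that each half keeps the same good/bad balance."""
    rng = random.Random(seed)
    train, test = [], []
    for cls in ("good", "bad"):
        group = [c for c in customers if label(c["months_missed"]) == cls]
        rng.shuffle(group)
        half = len(group) // 2
        train += group[:half]
        test += group[half:]
    return train, test
```

With 100 good and 100 bad customers this yields two samples of 100 customers each, with 50 good and 50 bad in both.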
5.4. Building empirical credit score models
5.4.1. The subsystem: Composite Rule Induction System (CRIS)
The frequency tables for the 11 attributes are generated. They are
given in table 5.1:
Attribute       Category              Good  Bad
Age             age <= 30                4   12
                30 < age <= 60          45   38
                age > 60                 1    0
Marital status  Married                 41   34
                Divorced                 1    2
                Widow                    2    1
                Single                   6   13
Gender          Male                    43   46
                Female                   7    4
Education       Post graduated          11    8
                Graduated               37   33
                Diploma                  2    9
Experience      < 3 years                2    9
                >= 3 and < 10 years     11   21
                >= 10 years             37   20
Occupation      Employee                31   25
                Retired                  1    0
                Self employed           18   25
Home own type   Owned                   33   30
                Rent                    17   20
Phone           Yes                     49   48
                No                       1    2
Bank account    Yes                     47   43
                No                       3    7
Credit card     Yes                     41   17
                No                       9   33
Home years      <= 8 years              11   20
                > 8 years               39   30
Total (each attribute)                  50   50
Table 5.1: The frequency tables
Rules are formulated using the frequency tables, and the probability
associated with each rule is calculated. The final rules are given in
table 5.2:
Rule                                 Saliency  Probability  Class
do not have credit card                    24     0.785714  bad
have credit card                           24     0.706897  good
experience >= 10 years                     17     0.649123  good
experience >= 3 and < 10 years             10     0.65625   bad
home years <= 8                             9     0.645161  bad
home years > 8                              9     0.565217  good
age <= 30                                   8     0.75      bad
education = diploma                         7     0.818182  bad
experience < 3 years                        7     0.818182  bad
marital status = single                     7     0.684211  bad
self employed                               7     0.581395  bad
marital status = married                    7     0.546667  good
30 < age <= 60                              7     0.542169  good
employee                                    6     0.553571  good
do not have bank account                    4     0.7       bad
education = graduated                       4     0.528571  good
have bank account                           4     0.522222  good
gender = female                             3     0.636364  good
education = post graduated                  3     0.578947  good
home type = rent                            3     0.540541  bad
home type = owned                           3     0.52381   good
gender = male                               3     0.516854  bad
age > 60                                    1     1         good
retired                                     1     1         good
marital status = divorced                   1     0.666667  bad
marital status = widow                      1     0.666667  good
do not have phone                           1     0.666667  bad
have phone                                  1     0.505155  good
Table 5.2: The final CRIS rules
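The values in table 5.2 are consistent with a simple reading of the frequency tables: the saliency equals the absolute difference between the good and bad counts for an attribute value, and the probability equals the share of the majority class among customers with that value. The sketch below reproduces the table entries under that reading, which is inferred from the numbers rather than taken from a stated definition; the function name and the handling of ties are assumptions.

```python
def cris_rule(good, bad):
    """Derive (saliency, probability, class) from the good/bad counts
    of one attribute value, as inferred from tables 5.1 and 5.2."""
    saliency = abs(good - bad)
    # Assumption: a tie (good == bad) is labelled "bad"; no tie occurs in table 5.1.
    cls = "good" if good > bad else "bad"
    probability = max(good, bad) / (good + bad)
    return saliency, round(probability, 6), cls


# Counts taken from table 5.1.
no_credit_card = cris_rule(good=9, bad=33)
experience_10_plus = cris_rule(good=37, bad=20)
```

Sorting the rules by saliency and then by probability, both in descending order, gives the ordering used in table 5.2.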
Since the classification decision in CRIS may depend on a single rule
(if the first rule is satisfied, the remaining rules are not checked),
CRIS will be used to give the credit card analyst an overview of the
importance of the attributes. The results of applying CRIS indicate that
the first rule, according to saliency and probability, is "if the
applicant does not have a credit card, then the applicant is classified as
bad". The second rule is "if the applicant has a credit card, then the
applicant is classified as good", and so on.
CRIS can also be used to find the most preferred and the least
preferred characteristics, as follows:
Most preferred customers have the following characteristics:
(have a credit card, have experience of 10 years or more, home years
more than 8, married or widow, age more than 30 years, employee or
retired, graduated or post graduated, have a bank account, home type =
own, have a phone).
Least preferred customers have the following characteristics:
(do not have a credit card, experience less than 10 years, home years
less than or equal to 8, age less than or equal to 30 years, education is
diploma, single or divorced, self-employed, do not have a bank account,
home own type = rent, do not have a phone).
5.4.2. The subsystem: Bayesian classification
To compute $P(G \mid X)$ (the probability that $X$ belongs to class
$G$ given that $X$ has the attribute set $A = (a_1, a_2, \ldots, a_j,
\ldots, a_n)$) and $P(B \mid X)$ (the probability that $X$ belongs to
class $B$ given that $X$ has the attribute set $A = (a_1, a_2, \ldots,
a_j, \ldots, a_n)$), the probabilities $P(X = v_{jk} \mid G)$ (the
probability that $X$ has attribute value $v_{jk}$ given that it belongs
to the good class) and $P(X = v_{jk} \mid B)$ (the probability that $X$
has attribute value $v_{jk}$ given that it belongs to the bad class) are
computed and given in table 5.3:
Attribute       Category              Good  Bad
Age             age <= 30             0.08  0.24
                30 < age <= 60        0.9   0.76
                age > 60              0.02  0
Marital status  Married               0.82  0.68
                Divorced              0.02  0.04
                Widow                 0.04  0.02
                Single                0.12  0.26
Gender          Male                  0.86  0.92
                Female                0.14  0.08
Education       Post graduated        0.22  0.16
                Graduated             0.74  0.66
                Diploma               0.04  0.18
Experience      < 3 years             0.04  0.18
                >= 3 and < 10 years   0.22  0.42
                >= 10 years           0.74  0.4
Occupation      Employee              0.62  0.5
                Retired               0.02  0
                Self employed         0.36  0.5
Home own type   Owned                 0.66  0.6
                Rent                  0.34  0.4
Phone           Yes                   0.98  0.96
                No                    0.02  0.04
Bank account    Yes                   0.94  0.86
                No                    0.06  0.14
Credit card     Yes                   0.82  0.34
                No                    0.18  0.66
Home years      <= 8 years            0.22  0.4
                > 8 years             0.78  0.6
Total (each attribute)                1     1
Table 5.3: $P(X = v_{jk} \mid G)$ and $P(X = v_{jk} \mid B)$
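A minimal sketch of how class-conditional probabilities like those in table 5.3 can be combined into a classification. It assumes that the attributes are treated as conditionally independent given the class (the usual naive Bayes assumption) and that both classes have a prior of 0.5, matching the 50 good and 50 bad customers in the training sample. Only three of the eleven attributes are included to keep the example short, and the dictionary keys and function names are illustrative.

```python
# Class-conditional probabilities copied from table 5.3 (three attributes only).
LIKELIHOOD = {
    "good": {
        "credit_card": {"yes": 0.82, "no": 0.18},
        "experience": {"<3": 0.04, "3-10": 0.22, ">=10": 0.74},
        "age": {"<=30": 0.08, "30-60": 0.90, ">60": 0.02},
    },
    "bad": {
        "credit_card": {"yes": 0.34, "no": 0.66},
        "experience": {"<3": 0.18, "3-10": 0.42, ">=10": 0.40},
        "age": {"<=30": 0.24, "30-60": 0.76, ">60": 0.00},
    },
}
# Assumption: equal priors, as in a 50 good / 50 bad training sample.
PRIOR = {"good": 0.5, "bad": 0.5}


def posterior(applicant):
    """Return P(class | X) for each class under the independence assumption."""
    joint = {}
    for cls, prior in PRIOR.items():
        p = prior
        for attr, value in applicant.items():
            p *= LIKELIHOOD[cls][attr][value]
        joint[cls] = p
    total = sum(joint.values())
    return {cls: p / total for cls, p in joint.items()}


def classify(applicant):
    """Assign the class with the larger posterior probability."""
    post = posterior(applicant)
    return max(post, key=post.get)
```

Note that a zero entry, such as the bad-class probability for age > 60, forces the posterior of that class to zero whatever the other attributes say; with small samples this is a known weakness of using raw frequencies.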
The results of using the Bayesian1 model to evaluate the customers
in the training and test samples are given in table 5.4:
Bayesian model (Bayesian1)
                    Estimated classes
                    Training sample           Test sample
                    Good        Bad           Good        Bad
Original classes    No.   %     No.   %       No.   %     No.   %
Good                37    74%   13    26%     34    68%   16    32%
Bad                 24    48%   26    52%     20    40%   30    60%
Table 5.4: The results of applying Bayesian1 model
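The percentages in a table of this kind follow directly from the counts. The sketch below shows the arithmetic for the Bayesian1 training sample; the overall accuracy it also computes is not reported in the thesis and is derived here only from the counts in table 5.4. The function and variable names are illustrative.

```python
def rates(confusion):
    """Convert counts confusion[actual][predicted] into row percentages."""
    out = {}
    for actual, row in confusion.items():
        total = sum(row.values())
        out[actual] = {pred: 100 * n / total for pred, n in row.items()}
    return out


def overall_accuracy(confusion):
    """Percentage of all customers whose estimated class equals the original class."""
    correct = sum(confusion[cls][cls] for cls in confusion)
    total = sum(sum(row.values()) for row in confusion.values())
    return 100 * correct / total


# Counts from table 5.4, training sample.
bayesian1_train = {
    "good": {"good": 37, "bad": 13},
    "bad": {"good": 24, "bad": 26},
}
bayesian1_train_rates = rates(bayesian1_train)
```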
Table 5.4 is represented in figure 5.1:
[Bar charts: "Hit classification" and "Erroneous classification", in %, for good and bad customers in the training and test samples]
Figure 5.1: The results of applying Bayesian1 model
In the training sample: Bayesian1 classifies 74% of the good
customers correctly and succeeds in detecting 52% of the bad
customers, whom the deductive credit score had classified as good.
At the same time, Bayesian1 incorrectly classifies 26% of the good
customers as bad and fails to detect 48% of the bad customers,
classifying them incorrectly as good.
In the test sample: Bayesian1 classifies 68% of the good customers
correctly and succeeds in detecting 60% of the bad customers, whom
the deductive credit score had incorrectly classified as good. At
the same time, Bayesian1 incorrectly classifies 32% of the good
customers as bad and fails to detect 40% of the bad customers,
classifying them as good.
5.4.3. The subsystem: linear programming based model (MSD
model)
Using the training sample, the weights for the attributes and the cut
point are computed by WinQSB and given in table 5.5:
Attribute          Weight
Age                0.0007
Gender             0.0062
Marital status     0.0013
Education level    0.0140
Occupation         0.0067
Experience         0.0011
Home own type      0.0033
Home phone         0.0273
Bank accounts      0.0078
Credit cards       0.0138
Home years         0.0005
Cut point          0.2004
Table 5.5: The weights and cut point for MSD1 model
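A minimal sketch of how a weight vector and a cut point such as those in table 5.5 could be applied, and how the quantity minimized by the linear program could be evaluated for a candidate solution. It assumes that MSD denotes the minimize-sum-of-deviations formulation and follows the common convention that good customers should score at or above the cut point and bad customers below it. The numeric coding of the attributes is not given in this section, so the weights and data in the example are hypothetical and the function names are illustrative. The sketch does not solve the linear program; in the thesis that is done with WinQSB.

```python
def score(weights, x):
    """Linear score: weighted sum of the coded attribute values."""
    return sum(w * v for w, v in zip(weights, x))


def classify(weights, cut_point, x):
    """Assumption: scores at or above the cut point are classified as good."""
    return "good" if score(weights, x) >= cut_point else "bad"


def sum_of_deviations(weights, cut_point, goods, bads):
    """Total amount by which customers fall on the wrong side of the cut point.
    A good customer deviates when scoring below it, a bad one when scoring above it."""
    dev = sum(max(0.0, cut_point - score(weights, x)) for x in goods)
    dev += sum(max(0.0, score(weights, x) - cut_point) for x in bads)
    return dev


# Hypothetical two-attribute example (not thesis data).
example_weights = [1, 2]
example_cut = 5
example_goods = [(3, 2), (1, 1)]   # scores 7 and 3
example_bads = [(2, 2), (0, 1)]    # scores 6 and 2
example_total = sum_of_deviations(example_weights, example_cut,
                                  example_goods, example_bads)
```

In this made-up example one good customer falls 2 below the cut point and one bad customer lies 1 above it, so the total deviation is 3; the linear program searches for the weights and cut point that make this total as small as possible on the training sample.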
The results of using the MSD1 model to evaluate the customers in
the training and test samples are given in table 5.6:
MSD model (MSD1)
                    Estimated classes
                    Training sample           Test sample
                    Good        Bad           Good        Bad
Original classes    No.   %     No.   %       No.   %     No.   %
Good                35    70%   15    30%     28    56%   22    44%
Bad                 17    34%   33    66%     16    32%   34    68%
Table 5.6: The results of applying MSD model
Table 5.6 is represented in figure 5.2:
[Bar charts: "Hit classification" and "Erroneous classification", in %, for good and bad customers in the training and test samples]
Figure 5.2: The results of applying MSD1 model
In the training sample: MSD1 classifies 70% of the good customers
correctly and succeeds in detecting 66% of the bad customers, whom
the deductive credit score model had classified as good. At the same
time, MSD1 incorrectly classifies 30% of the good customers as bad
and fails to detect 34% of the bad customers, classifying them as
good.
In the test sample: MSD1 classifies 56% of the good customers
correctly and succeeds in detecting 68% of the bad customers, whom
the deductive credit score model had classified as good. At the same
time, MSD1 incorrectly classifies 44% of the good customers as bad
and fails to detect 32% of the bad customers, classifying them
incorrectly as good.
5.4.4. Building empirical credit score models conclusion
- Scoring based on the deductive credit score is not a suitable
technique for classifying new applicants, since it depends
essentially on experience. The deductive credit score gives a
consistent classification, since it is based on the customers'
scores, but it is still subjective, since it depends on experience,
Liu, Y. (2001). It is therefore important to build credit score
models using quantitative techniques.
- CRIS is good for giving an overview of the sample, since the
procedure used to arrive at the rules can be understood by the user.
If CRIS is used, however, it is important to perform further
analysis, because the decision may depend on a single rule: if this
rule is satisfied, the applicant is assigned to the class that the
rule defines without the other rules being examined.
- Using the Bayesian1 and MSD1 models will decrease the insolvency
rate, since both models succeed in detecting bad customers whom the
deductive credit score classified as good; at the same time, some
good customers may be lost.
- The comparison between Bayesian1 and MSD1, for the training and
test samples, is summarized in table 5.7:
                    Estimated classes
                    Training sample                     Test sample
                    Good              Bad               Good              Bad
Original classes    MSD1  Bayesian1   MSD1  Bayesian1   MSD1  Bayesian1   MSD1  Bayesian1
Good                70%   74%         30%   26%         56%   68%         44%   32%
Bad                 34%   48%         66%   52%         32%   40%         68%   60%
Table 5.7: The comparison between Bayesian and MSD models
Table 5.7 is represented in figures 5.3 and 5.4:
[Bar chart: "Hit classification", in %, for MSD1 and Bayesian1; bars G > G and B > B for the training and test samples]
Figure 5.3: The comparison between Bayesian1 and MSD1 models for hit classification
[Bar chart: "Erroneous classification", in %, for MSD1 and Bayesian1; bars G > B and B > G for the training and test samples]
Figure 5.4: The comparison between Bayesian1 and MSD1 models for erroneous classification
- The Bayesian1 model performs better than MSD1 in detecting good
customers. Bayesian1 classifies 74% and 68% of the good customers
correctly in the training and test samples respectively, while MSD1
classifies 70% and 56% of the good customers correctly in the
training and test samples respectively.
- MSD1 performs better in classifying bad customers: it detects 66%
and 68% of the bad customers in the training and test samples
respectively, while Bayesian1 detects 52% and 60% of the bad
customers in the training and test samples respectively.
- Thus using the Bayesian1 or MSD1 model will reduce the insolvency
rate, but some good customers will be denied. In general, the
Bayesian1 and MSD1 models give insufficient results, since they
classify some good customers as bad and cannot detect all bad
customers.
- These insufficient results indicate a need to review the attributes
used to build the models, since these attributes do not reflect all
the data about the customers and some of them are vague. The set of
attributes should comprise more relevant data and more detail. Some
of the attributes are obtained from the application form; others are
obtained from the credit bureau, Thomas L. C., David B. Edelman, and
Jonathan N. Crook (2002) and Yang Liu (2002).
1- Data from the application form
The data on the credit card application form are summarized in table 5.8:
Information resources: application forms
Information categories and samples
Basic personal information   Age, gender (1), education (2)
Family information           Marital status (3), number of children, date of marriage
Residential information      Status (4), number of years at current address (5),
                             the value of the house
Employment status            Occupation sector, number of years in current
                             occupation, position
Financial status             Salary, other income, rent payment, monthly
                             installments, credit report (6)
Contact information          Home phone, work phone, distance from home or work
                             to nearest branch
Table 5.8: Some attributes in application form
(1) Female, male
(2) Diploma (high school), graduated, post graduated
(3) Single, married, widow, divorced
(4) Own, rented, functional, with parents
(5) Capital, countryside
(6) Have you paid your bills on time? What is your outstanding debt?
How long is your credit history? Have you applied for new credit
recently? And how many and what types of credit accounts do you
have?
2- Data from credit bureau
A credit bureau usually holds bankruptcy information obtained
from banks. It contains information such as the identity of current and
past creditors, disbursement dates for current and past loans, monthly
installments for current and past loans, the maximum line of credit with
current and past creditors, arrears in current and past loans, and the
number of inquiries.
5.5. Improving the accuracy of credit score models
One of the important points in building credit score models is
selecting the relevant attributes. Irrelevant, redundant or vague
attributes may reduce the accuracy of the models, Liu Y. and M Schumann
(2005). As mentioned in section 5.4.4, the accuracy of the credit score
model can be increased by adding new related attributes and removing
vague attributes. We will try to improve the accuracy of the credit score
model by reviewing the set of attributes used to build it: we remove
vague attributes and add useful ones. Due to lack of data, only income
will be added to the set of attributes used to build the credit score
models. Gender, home own type, bank account, and credit card will be
removed, since they need more detail and do not provide useful
information.
The attributes which will be used to build the new credit score
models are:
1- Age
2- Marital status (single, divorced, widow, and married)
3- Education level (diploma, graduated, and post graduated)
4- Occupation (self-employed, employee)
5- Experience
6- Home years
7- Income
5.5.1. Building a new Bayesian model
$P(X = v_{jk} \mid G)$ and $P(X = v_{jk} \mid B)$ for income are computed and given in table 5.9:
Income    Good   Bad
<= 600    0.24   0.54
> 600     0.76   0.46
Table 5.9: $P(X = v_{jk} \mid G)$ and $P(X = v_{jk} \mid B)$ for income
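As a worked illustration, assuming the Bayesian model treats the attributes as conditionally independent given the class (the usual naive Bayes assumption, which is not restated in this section), each attribute value multiplies the good-to-bad odds by its likelihood ratio. For income, using the values in table 5.9:

```latex
\[
\frac{P(G \mid X)}{P(B \mid X)}
  = \frac{P(G)}{P(B)} \prod_{j} \frac{P(X = v_{jk} \mid G)}{P(X = v_{jk} \mid B)},
\qquad
\frac{0.24}{0.54} \approx 0.44 \quad (\text{income} \le 600),
\qquad
\frac{0.76}{0.46} \approx 1.65 \quad (\text{income} > 600).
\]
```

Under that assumption, an income of 600 or less cuts the odds of being good to a little under half, while an income above 600 raises them by about two thirds.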
The results of using the new Bayesian model in classifying the
customers for training and test samples are given in table 5.10:
New Bayesian model (Bayesian2)
                    Estimated classes
                    Training sample           Test sample
                    Good        Bad           Good        Bad
Original classes    No.   %     No.   %       No.   %     No.   %
Good                39    78%   11    22%     33    66%   17    34%
Bad                 16    32%   34    68%     22    44%   28    56%
Table 5.10: The result of applying Bayesian2 model
Table 5.10 is represented in figure 5.5:
[Bar charts: "Hit classification" and "Erroneous classification", in %, for good and bad customers in the training and test samples]
Figure 5.5: The result of applying Bayesian2 model
In the training sample, Bayesian2 classifies 78% of the good
customers correctly and succeeds in detecting 68% of the bad
customers, whom the deductive credit score had classified as good.
At the same time, Bayesian2 incorrectly classifies 22% of the good
customers as bad and fails to detect 32% of the bad customers,
classifying them incorrectly as good.
In the test sample, Bayesian2 classifies 66% of the good customers
correctly and succeeds in detecting 56% of the bad customers, whom
the deductive credit score had incorrectly classified as good. At
the same time, Bayesian2 incorrectly classifies 34% of the good
customers as bad and fails to detect 44% of the bad customers,
classifying them as good.
5.5.2. Building a new MSD model
Using the training sample, the weights for the attributes and the cut
point are computed by WinQSB and given in the table 5.11:
Attribute          Weight
Age                0.00030
Marital status     0.00510
Education level    0.00580
Occupation         0.00260
Experience         0.00020
Home years         0.00020
Net income         0.006500
Cut point          0.09540
Table 5.11: Weights and cut point for MSD2 model
The results of applying the MSD2 model are given in table 5.12:
New MSD model (MSD2)
                    Estimated classes
                    Training sample           Test sample
                    Good        Bad           Good        Bad
Original classes    No.   %     No.   %       No.   %     No.   %
Good                43    86%   7     14%     43    86%   7     14%
Bad                 13    26%   37    74%     14    28%   36    72%
Table 5.12: The results of applying the MSD2 model
Table 5.12 is represented in figure 5.6:
[Bar charts: "Hit classification" and "Erroneous classification", in %, for good and bad customers in the training and test samples]
Figure 5.6: The results of applying the MSD2 model
In the training sample, MSD2 classifies 86% of the good customers
correctly and succeeds in detecting 74% of the bad customers, whom
the deductive credit score model had classified as good. At the same
time, MSD2 incorrectly classifies 14% of the good customers as bad
and fails to detect 26% of the bad customers, classifying them
incorrectly as good.
In the test sample, MSD2 classifies 86% of the good customers
correctly and succeeds in detecting 72% of the bad customers, whom
the deductive credit score model had incorrectly classified as good.
At the same time, MSD2 incorrectly classifies 14% of the good
customers as bad and fails to detect 28% of the bad customers,
classifying them as good.
5.5.3. Improving the accuracy of credit score models conclusion
Adding income and removing the useless attributes improves the
accuracy of the Bayesian and MSD models, as indicated in tables 5.13,
5.14, 5.15 and 5.16:
1- Bayesian model
o Training sample
Training sample (Bayesian models)
Estimated classes
Bayesian1 Bayesian2
Original classes Good Bad Good Bad
Good 74% 26% 78% 22%
Bad 48% 52% 32% 68%
Table 5.13: Comparison between Bayesian1 and Bayesian2 for training sample
Table 5.13 is represented in figure 5.7:
[Bar charts: "Correct classification" and "Classification error", in %, for good and bad customers, Bayesian1 versus Bayesian2]
Figure 5.7: Comparison between Bayesian1 and Bayesian2 for training sample
For the good customers: the performance of the Bayesian2 model is
better than that of the Bayesian1 model. Bayesian2 classifies 78%
of the good customers correctly while Bayesian1 classifies only 74%.
For the bad customers: the performance of the Bayesian2 model is
better, since it detects 68% of the bad customers while Bayesian1
detects only 52%.
o Test sample
Test sample (Bayesian models)
Estimated classes
Bayesian1 Bayesian2
Original classes Good Bad Good Bad
Good 68% 32% 66% 34%
Bad 40% 60% 44% 56%
Table 5.14: Comparison between Bayesian1 and Bayesian2 for test sample
Table 5.14 is represented in figure 5.8:
[Bar charts: "Hit classification" and "Erroneous classification", in %, for good and bad customers, Bayesian1 versus Bayesian2]
Figure 5.8: Comparison between Bayesian1 and Bayesian2 for test sample
For the good customers: the performance of the Bayesian1 model is
better than that of the Bayesian2 model. Bayesian1 classifies 68%
of the good customers correctly while Bayesian2 classifies 66%.
For the bad customers: the performance of the Bayesian1 model is
better than that of the Bayesian2 model. Bayesian1 classifies 60%
of the bad customers correctly while Bayesian2 classifies 56%.
2- MSD model
o Training sample
Training sample (MSD models)
Estimated classes
MSD1 MSD2
Original classes Good Bad Good Bad
Good 70% 30% 86% 14%
Bad 34% 66% 26% 74%
Table 5.15: The result of applying MSD2 for training sample
Table 5.15 is represented in figure 5.9:
[Bar charts: "Hit classification" and "Erroneous classification", in %, for good and bad customers, MSD1 versus MSD2]
Figure 5.9: The result of applying MSD2 for training sample
For the good customers: the performance of MSD2 improved; it
classifies 86% of the good customers correctly while the MSD1 model
classifies only 70%.
For the bad customers: the performance of MSD2 improved; it
detects 74% of the bad customers while the MSD1 model detects only
66%.
o Test sample
Test sample (MSD models)
Estimated classes
MSD1 MSD2
Original classes Good Bad Good Bad
Good 56% 44% 86% 14%
Bad 32% 68% 28% 72%
Table 5.16: The result of applying MSD2 for test sample
Table 5.16 is represented in figure 5.10:
[Bar charts: "Hit classification" and "Erroneous classification", in %, for good and bad customers, MSD1 versus MSD2]
Figure 5.10: Comparison between MSD1 and MSD2 for test sample
For the good customers: the performance of MSD2 improved; it
classifies 86% of the good customers correctly while the MSD1 model
classifies only 56%.
For the bad customers: the performance of MSD2 improved; it
detects 72% of the bad customers while the MSD1 model (without
income) detects only 68%.
The performances of Bayesian1, MSD1, Bayesian2 and MSD2 are
compared in tables 5.17 and 5.20 for the training and test samples:
a) Training sample
Training sample
                    Estimated classes
                    Models without income               New models
                    Good              Bad               Good              Bad
Original classes    MSD1  Bayesian1   MSD1  Bayesian1   MSD2  Bayesian2   MSD2  Bayesian2
Good                70%   74%         30%   26%         86%   78%         14%   22%
Bad                 34%   48%         66%   52%         26%   32%         74%   68%
Table 5.17: The comparison between MSD1, MSD2, Bayesian1, and Bayesian2 for
training sample
Table 5.17 is represented in figures 5.11 and 5.12:
[Bar chart: "Hit classification", in %, for MSD1, Bayesian1, MSD2 and Bayesian2; bars G > G and B > B]
Figure 5.11: The comparison between MSD1, MSD2, Bayesian1, and Bayesian2 for
training sample, hit classification
[Bar chart: "Erroneous classification", in %, for MSD1, Bayesian1, MSD2 and Bayesian2; bars G > B and B > G]
Figure 5.12: The comparison between MSD1, MSD2, Bayesian1, and Bayesian2 for
training sample, erroneous classification
According to the rate of good customers classified correctly, the
methods are arranged in table 5.18:
Model       Good
MSD2        86%
Bayesian2   78%
Bayesian1   74%
MSD1        70%
Table 5.18: Methods arranged according to hit classification rate of good customers
According to the rate of bad customers classified correctly, the
methods can be arranged as in table 5.19:
Model       Bad
MSD2        74%
Bayesian2   68%
MSD1        66%
Bayesian1   52%
Table 5.19: Methods arranged according to hit classification rate of bad customers
It is clear that MSD2 performs better than the other models.
b) Test sample
Test sample
                    Estimated classes
                    Models without income               New models with income
                    Good              Bad               Good              Bad
Original classes    MSD1  Bayesian1   MSD1  Bayesian1   MSD2  Bayesian2   MSD2  Bayesian2
Good                56%   68%         44%   32%         86%   66%         14%   34%
Bad                 32%   40%         68%   60%         28%   44%         72%   56%
Table 5.20: The comparison between MSD1, MSD2, Bayesian1, and Bayesian2 for
test sample
Table 5.20 is represented in figures 5.13 and 5.14:
[Bar chart: "Hit classification", in %, for MSD1, Bayesian1, MSD2 and Bayesian2; bars G > G and B > B]
Figure 5.13: The comparison between MSD1, MSD2, Bayesian1, and Bayesian2
for test sample, hit classification
Note:
G > G means that good customers are classified correctly as good.
B > B means that bad customers are classified correctly as bad.
[Bar chart: "Erroneous classification", in %, for MSD1, Bayesian1, MSD2 and Bayesian2; bars G > B and B > G]
Figure 5.14: The comparison between MSD1, MSD2, Bayesian1, and Bayesian2
for test sample, erroneous classification
According to the rate of good customers classified correctly, the
methods can be arranged as in table 5.21:
Model Good
MSD2 86%
Bayesian1 68%
Bayesian2 66%
MSD1 56%
Table 5.21: Methods arranged according to good customers classified correctly
The methods can be arranged according to the numbers of bad
customers classified correctly as shown in table 5.22:
Model Bad
MSD2 72%
MSD1 68%
Bayesian1 60%
Bayesian2 56%
Table 5.22: Methods arranged according to bad customers classified correctly
It is clear that MSD2 performs better than the other models.
5.6. Testing the models using new sample
To test the above conclusion, that MSD2 performs better than the
other models, we used another sample of 200 customers, 100 good and
100 bad, to test the Bayesian1, Bayesian2, MSD1 and MSD2 models.
The results are given in table 5.23:
                    Estimated classes
                    Models without income               New models
                    Good              Bad               Good              Bad
Original classes    MSD1  Bayesian1   MSD1  Bayesian1   MSD2  Bayesian2   MSD2  Bayesian2
Good                64%   70%         36%   30%         74%   68%         26%   32%
Bad                 37%   33%         63%   67%         31%   39%         69%   61%
Table 5.23: Test Bayesian1, Bayesian2, MSD1, and MSD2 using another sample
Table 5.23 is represented in figures 5.15 and 5.16:
[Bar chart: "Hit classification", in %, for MSD1, Bayesian1, MSD2 and Bayesian2; bars G > G and B > B]
Figure 5.15: Test Bayesian1, Bayesian2, MSD1, and MSD2 using another sample, hit
classification
[Bar chart: "Erroneous classification", in %, for MSD1, Bayesian1, MSD2 and Bayesian2; bars G > B and B > G]
Figure 5.16: Test Bayesian1, Bayesian2, MSD1, and MSD2 using another sample, erroneous
classification
According to the rate of good customers classified correctly, the
methods can be arranged as in table 5.24:
Model Good
MSD2 74%
Bayesian1 70%
Bayesian2 68%
MSD1 64%
Table 5.24: Methods arranged according to rate of good customer classified correctly
The methods are arranged according to the rate of bad customers
classified correctly and given in table 5.25:
Model Bad
MSD2 69%
Bayesian1 67%
MSD1 63%
Bayesian2 61%
Table 5.25: Methods arranged according to rate of bad customer classified correctly
Table 5.25 shows that MSD2 performs better than the other models.
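The rankings in tables 5.24 and 5.25 can be reproduced from the hit rates in table 5.23 with a few lines of code. The mean of the two hit rates is an extra summary that is not reported in the thesis; because the new sample contains 100 good and 100 bad customers, it equals the overall percentage classified correctly. The variable and function names are illustrative.

```python
# (good hit rate %, bad hit rate %) on the new sample, from table 5.23.
HIT_RATES = {
    "MSD1": (64, 63),
    "Bayesian1": (70, 67),
    "MSD2": (74, 69),
    "Bayesian2": (68, 61),
}


def rank_by(index):
    """Order the models from best to worst on one hit rate (0 = good, 1 = bad)."""
    return sorted(HIT_RATES, key=lambda m: HIT_RATES[m][index], reverse=True)


ranking_good = rank_by(0)
ranking_bad = rank_by(1)
# Derived summary, not a figure from the thesis.
mean_hit_rate = {m: (g + b) / 2 for m, (g, b) in HIT_RATES.items()}
```

On this summary MSD2 is also first (71.5%), followed by Bayesian1 (68.5%), Bayesian2 (64.5%) and MSD1 (63.5%).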
5.7. Building new models using new sample
To confirm the above conclusion, that MSD2 performs better than
the other models, the new sample will be used to build further MSD and
Bayesian models. Bayesian3 and MSD3 will be built using the 11
attributes, and Bayesian4 and MSD4 will be built using the 7 attributes.
5.7.1. Bayesian model
$P(X = v_{jk} \mid G)$ and $P(X = v_{jk} \mid B)$ are computed and given in table 5.26:
Age good bad Marital status good bad
age <=30 0.1 0.28 Married 0.88 0.66
30<age<=60 0.84 0.72 Divorced 0 0.04
Age>60 0.06 0 Widow 0.02 0.04
1 1 Single 0.12 0.26
Gender good bad Education good bad
Male 0.84 0.92 Post 0.36 0.2
Female 0.16 0.08 Graduated 0.56 0.68
1 1 Diploma 0.08 0.12
1 1
Experience good bad Occupation good bad
<3 years 0.1 0.24 Employee 0.66 0.6
more or =3 and <10 0.26 0.42 Retired 0.06 0
more than or = 10 0.64 0.34 self employed 0.28 0.4
1 1 1 1
Home type own good bad Phone good bad
Owned 0.62 0.62 Yes 1 1
Rent 0.38 0.38 No 0 0
1 1 1 1
Bank account good bad Credit card good bad
Yes 0.8 0.66 Yes 0.64 0.46
No 0.2 0.34 No 0.36 0.54
1 1
Home years good Bad net income good bad
les than or = 8 0.2 0.56 0.26 0.42
more 8 0.8 0.44 0.74 0.58
1 1 1 1
Table 5.26: $P(X = v_{jk} \mid G)$ and $P(X = v_{jk} \mid B)$ for new sample
The comparison between Bayesian3 and Bayesian4 is given in
table 5.27:
Test sample (Bayesian models)
Estimated classes
Bayesian3 Bayesian4
Original classes Good Bad Good Bad
Good 66% 34% 62% 38%
Bad 48% 52% 42% 58%
Table 5.27: Comparison between Bayesian3 and Bayesian4
Table 5.27 is represented in figure 5.17:
[Bar charts: "Hit classification" and "Erroneous classification", in %, for good and bad customers, Bayesian3 versus Bayesian4]
Figure 5.17: Comparison between Bayesian3 and Bayesian4
For the good customers: Bayesian3 performs better than Bayesian4.
Bayesian3 classifies 66% of the good customers correctly while
Bayesian4 classifies only 62%.
For the bad customers: Bayesian4 performs better than Bayesian3.
Bayesian4 classifies 58% of the bad customers correctly while
Bayesian3 classifies only 52%.
5.7.2. MSD model
The weights and cut point for MSD models are computed using
WinQSB and given in table 5.28 and 5.29:
Weights and cut point for MSD3:
Attribute          Weight
Age                0.0005
Gender             0.0071
Marital status     0.0041
Education level    0.0065
Occupation         0.0107
Experience         0.0007
Home own type      0.0000
Home phone         0.0000
Bank account       0.0307
Credit cards       0.0069
Home years         0.0004
Cut point          0.1628
Table 5.28: Weights and cut point for MSD3
Weights and cut point for MSD4:
Attribute          Weight
Age                0.00020
Marital status     0.00580
Education level    0.00040
Occupation         0.01130
Experience         0.00040
Home years         0.00070
Net income         0.004900
Cut point          0.1059
Table 5.29: Weights and cut point for MSD4
The comparison between MSD3 and MSD4 is given in table 5.30:
Test sample (MSD model)
Estimated classes
MSD3 MSD4
Original classes Good Bad Good Bad
Good 66% 34% 68% 32%
Bad 34% 66% 26% 74%
Table 5.30: The comparison between MSD3 and MSD4
Table 5.30 is represented in figure 5.18:
[Bar charts: "Hit classification" and "Erroneous classification", in %, for good and bad customers, MSD3 versus MSD4]
Figure 5.18: The comparison between MSD3 and MSD4
For the good customers: MSD4 performs better than MSD3. MSD4
classifies 68% of the good customers correctly while MSD3
classifies only 66%.
For the bad customers: MSD4 performs better than MSD3. MSD4
classifies 74% of the bad customers correctly while MSD3
classifies only 66%.
5.7.3. Building new models using the new sample conclusion
The performances of Bayesian3, MSD3, Bayesian4 and MSD4
models are compared and given in table 5.31:
Test sample
                    Estimated classes
                    Model without income                Model with income
                    Good              Bad               Good              Bad
Original classes    MSD3  Bayesian3   MSD3  Bayesian3   MSD4  Bayesian4   MSD4  Bayesian4
Good                66%   66%         34%   34%         68%   62%         32%   38%
Bad                 34%   48%         66%   52%         26%   42%         74%   58%
Table 5.31: The comparison between Bayesian3, MSD3, Bayesian4 and MSD4
Table 5.31 is represented in figures 5.19 and 5.20:
MS
D3
Ba
ye
sia
n3
MS
D4
Bayesia
n4
MS
D3
Ba
ye
sia
n3
MS
D4
Bayesia
n4
0
10
20
30
40
50
60
70
80
90
100
%
G > G B > B
Hit classification
MSD3
Bayesian3
MSD4
Bayesian4
Figure 5.19: The comparison between Bayesian3, MSD3, Bayesian4 and MSD4 for hit classification
(Bar chart, vertical axis in % from 0 to 100, series MSD3, Bayesian3,
MSD4 and Bayesian4; groups: G > B and B > G.)
Figure 5.20: The comparison between Bayesian3, MSD3, Bayesian4 and MSD4 for erroneous
classification
The methods are ranked by the percentage of good customers
classified correctly in table 5.32:
Model Good
MSD4 68%
MSD3 66%
Bayesian3 66%
Bayesian4 62%
Table 5.32: The methods arranged according to the percentage of good
customers classified correctly
The methods are ranked by the percentage of bad customers
classified correctly in table 5.33:
Model Bad
MSD4 74%
MSD3 66%
Bayesian4 58%
Bayesian3 52%
Table 5.33: The methods arranged according to the percentage of bad customers
classified correctly
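The two rankings in tables 5.32 and 5.33 can be reproduced by sorting the hit rates from table 5.31. The following is a minimal sketch; the function and variable names are illustrative and not part of the thesis.

```python
# Hit rates (%) copied from table 5.31: the share of good and bad test
# customers that each model classified correctly.
HIT_RATES = {
    "MSD3": {"Good": 66, "Bad": 66},
    "Bayesian3": {"Good": 66, "Bad": 52},
    "MSD4": {"Good": 68, "Bad": 74},
    "Bayesian4": {"Good": 62, "Bad": 58},
}


def rank_models(hit_rates, customer_class):
    """Return model names sorted by hit rate for one class, best first.

    sorted() is stable, so models with equal hit rates keep the order
    in which they appear in the input dictionary.
    """
    return sorted(
        hit_rates,
        key=lambda model: hit_rates[model][customer_class],
        reverse=True,
    )


print("Good:", rank_models(HIT_RATES, "Good"))
print("Bad: ", rank_models(HIT_RATES, "Bad"))
```

MSD3 and Bayesian3 tie at 66% for good customers, so their relative order in the ranking depends only on the input order.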
From the above comparison, MSD4 (the model with income) performs
better than the other models. This result is consistent with the
conclusions of the previous sections.
5.8. General conclusion
The attributes used to build a credit score model have an important
effect on the model's performance. Irrelevant or vague attributes
reduce the accuracy of the scoring model, so it is important to select
carefully the attributes that will be used to build it. This requires
obtaining more information about the applicant's credit history and
reviewing the questions in the credit card application form.
MSD3 and MSD4 give more accurate results than the other models, and
it is recommended to use one of them.
Chapter six
Conclusions and points for further research
Conclusions
The purpose of this research is to build a credit score for credit card
applicants using Bayesian, composite rule induction system and linear
programming techniques, to help banks decide whether to issue a credit
card to an applicant or to deny it. All the customers in the samples used
in building and testing these models were granted a credit card by a
system that depends on the deductive credit score.
The models were built using the same attributes used by the deductive
credit score, and we conclude that credit score models which depend on
Bayesian or linear programming techniques give more accurate results
than models which depend on the deductive credit score.
We then improved the accuracy of the Bayesian and linear
programming credit score models by reviewing the attributes used in
building them. We rebuilt the credit score models after omitting
unimportant attributes and adding the income attribute.
We concluded that the MSD (MSD2 and MSD4) credit score models,
which depend on the new set of attributes after adding income, give the
most accurate results. We also concluded that the set of attributes used
in building the credit score model should be reviewed and the questions
in the credit card application form modified accordingly.
Generally, credit scoring is a very important technique for analyzing
data in many sectors, especially banking. As an automated and
centralized system, credit scoring enables a bank to measure the
creditworthiness of a large number of customers objectively, accurately
and in a short time, especially if there is a precise and instant method
to ensure that the data given by the applicants are correct. Its
importance as a risk management tool has increased with Basel II.
Points for further research
There are many points for further research; they can be summarized as
follows:
- Build a credit risk model for credit cards using credit scores.
- Study the effect of Basel II on Egyptian bank credit card lending.
- Use a hybrid approach to try to improve the classification
accuracy.
Cairo University
Institute of Statistical Studies and Research
A DECISION SUPPORT SYSTEM FOR
CREDIT CARDS APPLICATION ASSESSMENT
Prepared by
Ahmed Mahmoud Saleim Eliwa
Under supervision of
Prof. Bahaa El-Din Helmy Ismail, non-full-time professor, Computer Sciences and Information Department
Institute of Statistical Studies and Research, Cairo University
&
Dr. Assem Abd El-Fattah Tharwat, Head of Operations Research Department
Faculty of Computers & Information, Cairo University
&
Dr. Ramadan Abd El-Hamed Zen El-Den, Operations Research Department
Institute of Statistical Studies and Research, Cairo University
This thesis was submitted in fulfillment of the requirements for the master's degree in Operations Research
Department of Operations Research, Institute of Statistical Studies and Research
June 2007
Introduction
Recent years have seen growing interest in credit cards, as the number
of customers applying for this service has increased. Some of these
customers offer collateral to obtain the card, in which case the issuing
bank faces no problem. Others apply for a credit card without
collateral. This case requires some method for studying these customers'
applications and deciding whether or not the bank should grant them a
credit card without collateral.
The credit officer uses personal judgment to decide whether or not a
credit card is issued. The customer fills in a credit card application
form, which usually contains data about the customer such as residence,
age, occupation, number of years of work, etc. The credit officer
studies these data and uses his experience and the bank's instructions
to decide whether to issue a card to this customer or to reject the
application. As the number of customers applying for a credit card has
grown, it has become difficult to rely on experience and personal
judgment alone in the assessment process.
Credit scoring is a technique that helps the bank decide whether or not
to approve issuing a credit card to the customer. The growing number of
customers applying for this service has increased the importance of this
method.
This thesis consists of six chapters:
Chapter one:
This chapter presents the definition of the problem, the definition of
credit cards and their benefits to the different parties, the steps of
issuing credit cards, and the distinguishing properties of the credit
card application assessment process. It also presents the current
assessment method and the cost the bank may bear when a wrong decision
is made.
Chapter two:
This chapter presents the definition of the credit scoring system, its
types, the history of its use, its benefits, the problems facing its
application, and its importance as a risk management tool under
Basel II.
Chapter three:
This chapter presents the data used as inputs to the credit scoring
system, the outputs of the system, how it is built, and the quantitative
methods used to build it.
Chapter four:
This chapter is divided into two parts. The first part presents an
introduction to decision support systems, and the second part presents
the decision support system proposed for assessing credit card
applications.
Chapter five:
In this chapter the credit scoring system for issuing credit cards is
built using real data.
Chapter six:
This chapter presents the conclusions and some points for further
research in this field.