A DSS for Credit Card Application Assessment

Cairo University

Institute of Statistical Studies and Research

A DECISION SUPPORT SYSTEM FOR

CREDIT CARDS APPLICATION ASSESSMENT

Prepared by

Ahmed Mahmoud Saleim Eliwa

A thesis submitted to the Institute of Statistical Studies and Research, Cairo University, in partial fulfillment of the requirements for the master's degree in Operations Research, Department of Operations Research.

Under supervision of

Prof. Bahaa El-Din Helmy Ismail

Undedicated professor

Computer sciences and information department


Cairo University

& Dr. Assem Abd El-Fattah Tharwat

Head of decision support department

Faculty of Computers & Information

Cairo University

& Dr. Ramadan Abd El-Hamed Zen El-Den

Operations research department


Cairo University

June 2007

Contents

Page

Summary

Chapter one: Introduction

1.1 Introduction 1

1.2 Credit card definition and benefits 3

1.3 The steps of issuing credit cards 5

1.4 Properties of credit card application assessments process 6

1.5 Judgment process for credit card application assessment 7

1.6 The disadvantages of using judgment process for credit card application assessment 8

1.7 Risk associated with credit card lending 9

1.8 Cost of wrong decisions in credit card application assessment 10

Chapter two: Scoring system

2.1 Introduction 11

2.2 History of scoring system 12

2.3 Definition of scoring system 13

2.4 Types of scoring system 15

2.5 Scoring system applications 19

2.6 Potential Benefits of scoring system 21

2.7 Scoring system limitation 23

2.8 Scoring system issues 23

2.9 Scoring system and risk management 25

2.9.1 Basel II 25

2.9.2 Risk management 27

2.9.3 Scoring system as a risk management tool 29

Chapter three: Problem formulation and survey

3.1 Introduction 31

3.2 Data description 32

3.2.1 Data used to build the credit score model 32

3.2.2 The output data of the credit score model 34

3.3 Building the credit score model 35

3.4 Literature survey 37

3.4.1 Classification of the credit score methods 37

3.4.2 Statistical techniques 39

3.4.2.1 Linear discriminant analysis (linear probability model) 39

3.4.2.2 Logistic regression 40

3.4.2.3 Probit and tobit analysis 40

3.4.2.4 Semiparametric regression 41

3.4.2.5 Bayesian classification 41

3.4.2.6 Nearest neighbor approach 41

3.4.3 Non statistical techniques 44

3.4.3.1 Multicriteria decision aid method (MCDA) 44

3.4.3.2 Linear programming 44

3.4.3.3 Integer programming 47

3.4.3.4 Goal programming 47

3.4.3.5 Neural network 48

3.4.3.6 Expert system 49

3.4.3.7 Genetic algorithm 51

3.4.3.8 Classification tree 51

3.4.3.9 Rough sets theory 52

3.4.3.10 Analytical hierarchy process 53

3.5 Comparisons of techniques used to build credit score 53

Chapter four: Decision support system 55

Part I: Introduction to decision support system

4. I.1 Definition of decision support system (DSS) 55

4. I.2 Characteristic and capabilities of DSS 58

4. I.3 Decision support system components 60

4. I.4 Decision support system application (type, classification, taxonomy of DSS) 67

4. I.5 Constructing a decision support system 72

4. I.6 DSS technologies levels and tools 76

4. I.6.1 Relationships among the technologies levels 77

4. I.6.2 Future trends of decision support system 77

4. I.7 Approaches to DSS construction 78

4. I.7.1 Quick hit 78

4. I.7.2 Staged development 79

4. I.7.3 Complete DSS 79

4. I.8 Alternate development methodologies 79

4. I.8.1 Parallel development (traditional methodologies) 80

4. I.8.2 Rapid application development (RAD) methodologies 80

4. I.8.2.1 Phased development 80

4. I.8.2.2 Prototyping (evolutionary, iterative) 80

4. I.8.2.3 Throwaway prototyping 81

4. I.9 Team developed vs. user developed DSS 81

4. I.10 DSS development platforms 81

4. I.11 Issues associated with DSS 82

Part II: The proposed decision support system 83

4. II.1 Introduction 83

4. II.2 The proposed decision support system 83

4. II.3 Building the proposed decision support system 84

4. II.3.1 Decision support system database 85

4. II.3.2 Model base for the proposed DSS 86

4. II.3.2.1 A composite rule induction system (CRIS) 87

4. II.3.2.2 Naïve Bayesian classification 91

4. II.3.2.3 Linear programming (MSD model) 93

4. II.3.3 User interface for the proposed DSS 94

4. II.4 Summary 103

Chapter five: An application: Building credit score model for credit card application assessment 104

5.1. Introduction 104

5.2. Description of the current system 104

5.3. Description of training and test sample 106

5.4. Building empirical credit score models 106

5.4.1. The subsystem: Composite Rule Induction System 106

5.4.2. The subsystem: Bayesian classification 109

5.4.3. The subsystem: linear programming based model (MSD model) 112

5.4.4. Building empirical credit score models conclusion 114

5.5. Improving the credit score models 118

5.5.1. Building a new Bayesian model 119

5.5.2. Building a new MSD model 120

5.5.3. Improving the accuracy of credit score models conclusion 122

5.6. Testing the models using new sample 130

5.7. Building new models using new sample 133

5.7.1. Bayesian model 133

5.7.2. MSD model 135

5.7.3. Building new models using the new sample conclusion 136

5.8. General conclusion 138

Chapter six: Conclusions and points for further research 140

References 142

SUMMARY

This thesis consists of six chapters, which are described as follows:

Chapter one: Introduction

Chapter one introduces the problem. It presents the definition of a credit card and its benefits, the steps of issuing a credit card, the properties of the credit card application assessment process, the current method used for credit card application assessment and its issues, and the cost of wrong decisions in credit card application assessment.

Chapter two: Scoring system

In chapter two, we present the definition of credit score, its types, history, applications, benefits, limitations, and issues, and the importance of credit score as a risk management tool.

Chapter three: Problem formulation and survey

In chapter three, we describe the input data used for building a credit score model, its output, and the problem formulation, and we review the methods used for building credit score models.

Chapter four: Decision support system

Chapter four consists of two parts. The first part gives an overview of decision support systems. The second part describes the proposed DSS. The model base management system of the proposed decision support system is based on the Composite Rule Induction System (CRIS), Bayesian classification, and linear programming.

Chapter five: An application – credit score model for credit card application assessment

In this chapter we apply the credit score models presented in chapter four to data obtained from a financial organization that depends on a deductive credit score model, and we present recommendations for improving the accuracy of the model.

Chapter six: Conclusions and points for further research

In chapter six, the conclusions and points for further research are presented.

Chapter one

Introduction

1.1. Introduction

The last twenty years have seen rapid growth in retail credit markets, which have come to play an important role in the Egyptian economy. Retail credit is defined as "homogeneous portfolios comprising a large number of small, low value loans with either a consumer or business focus, and where the incremental risk of any single exposure is small". Retail credit focuses on specific product types that are considered retail in nature; these include credit cards, personal finance, education loans, auto loans, overdrafts, and residential mortgages. These types of credit make up an important part of bank revenues, and an error in the credit decision for a single customer means that the bank loses the profit obtained from other successful customers, so banks must pay more attention to credit decisions for this type of credit. Banks cannot use the models used to analyze corporate loans to analyze retail credit, because retail credit has special features: the exposure is to an individual person or persons, the exposure is one of a large pool of loans managed by the bank, and each individual exposure has a low value, Edward I. Altman (2002), Basel Committee on Banking Supervision (2001d), Gayle Delong and Anthony Saunders (2003) and Linda Allen, Gayle Delong, and Anthony Saunders (2004).

Credit cards are a fast-growing business segment and have become the most accepted, convenient, and profitable financial products. They are a popular non-cash instrument that is increasingly replacing cash. The advent of credit cards in the 1960s meant that consumers could finance all their purchases, from hair clips to computer chips to holiday trips, by credit card, The Comptroller of the Currency (1998) and Sujit Chakravorti (2003).

The number of credit card holders has increased rapidly; at the same time, the number of customers who cannot fulfill their obligations to the banks has also increased. This has forced banks to search for methodologies that allow them to evaluate accurately the creditworthiness of each credit card applicant and to determine whether the applicant belongs to the good group or the bad group, in order to minimize the risk of insolvency. The objective of these methodologies is to increase the accuracy of credit decisions so as to increase profits and decrease losses, Nikolaos F. Matsatsinis (2002) and Jih-Jeng Huang, et al (2005).

Credit score has been used to support banks in making decisions related to a variety of their products. Its most obvious and common use is to help banks estimate whether a new applicant will pay back his liabilities and determine whether an existing customer will default. Credit score is used to rank applicants by their expected performance. It gives quick, objective, more accurate, and consistent credit decisions. Moreover, the importance of credit score increases with the growth of the credit industry and with Basel II, which pushes banks to develop internal credit risk measurements, Liu, Y. (2001), Kasper Roszbach (2003), Nikolaos F. Matsatsinis and C. Erik Larson (2004) and David B. Edelman (2005).

1.2. Credit card definition and benefits

Credit cards are one of the electronic payment methods that involve a form of borrowing, often with charges. The main idea of a credit card is that the issuing bank guarantees payment to merchants of the value of cash withdrawals or purchases in return for a signed receipt from the consumer. A credit card enables the consumer to obtain goods or services up to a specific credit limit and to pay either the full amount or a minimum amount within a specific period; the consumer pays interest charges on the amount left unpaid.
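The repayment rule described above can be illustrated with a small arithmetic sketch. It assumes, purely for illustration, a flat monthly rate applied once to the unpaid part of the balance; the function name `interest_on_unpaid` and all figures are hypothetical, and real card agreements use more detailed rules.

```python
from decimal import Decimal


def interest_on_unpaid(statement_balance, payment, monthly_rate):
    """Interest charged on the part of the balance left unpaid.

    Simplifying assumption: a flat monthly rate is applied once to the
    unpaid amount. This is an illustration, not a bank's actual formula.
    """
    if payment < 0 or payment > statement_balance:
        raise ValueError("payment must be between 0 and the statement balance")
    unpaid = statement_balance - payment
    # Round to two decimal places (currency units and hundredths).
    return (unpaid * monthly_rate).quantize(Decimal("0.01"))


# Hypothetical figures: balance 1000.00, payment 100.00, rate 2% per month.
balance = Decimal("1000.00")
rate = Decimal("0.02")
print(interest_on_unpaid(balance, Decimal("100.00"), rate))  # partial payment
print(interest_on_unpaid(balance, balance, rate))            # full payment
```

Under these assumptions, a consumer who pays the full amount is charged no interest, while one who pays only part of the balance is charged on the remainder.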

Credit cards provide benefits to all participants in the credit card network, The Comptroller of the Currency (1998), Financial Consumer Agency of Canada (2001), Lin Wei Ping (2003), Sujit Chakravorti (2003) and Business Payment System Wisconsin "BPS" (2004). They provide benefits to consumers, merchants, issuers (issuing banks are banks that directly issue the credit card), acquirers (acquiring banks are banks that have entered into an agreement with a merchant to accept deposits generated by credit card transactions), and network operators, as follows:

1- Consumers

Credit cards provide many benefits to consumers as follows:

Credit cards provide consumers with a secure, reliable, and convenient method of payment.

Credit cards give consumers the freedom to buy more merchandise and pay from future income.

Credit cards are more convenient to carry than cash and make purchases over the internet easier.

2- Merchants

Credit cards enable merchants to make more sales since:

Credit cards allow merchants to sell to illiquid consumers or to those paying from future income, and to receive the value of the goods within 48 hours of submitting the transaction to their acquirers.

Credit card consumers spend more than consumers who carry only cash.

Consumers are more likely to shop at businesses where credit cards are accepted and tend to return to the same business.

3- Issuers

Credit cards offer issuers more advantages than other retail products, as follows:

A credit card portfolio involves smaller loans that are spread across a large number of consumers.

Banks can offer different product programs for entire consumer segments.

Credit cards enable banks to aggregate patterns of consumer loan behavior and build banking expertise that is used in other consumer lending products.

Credit cards offer a better return on assets than commercial lending. The credit card issuer earns revenue from consumers and acquirers. Consumers may pay annual fees, finance charges if they revolve, and other fees, such as cash advance and over-limit fees. Acquirers pay interchange fees to issuers to compensate them for the costs of attracting and maintaining credit card holders.

4- Acquirers

Acquirers earn revenue from merchants by bilaterally setting merchant discount rates, and they pay interchange fees to issuers.

5- Network operators

Network operators such as Visa and MasterCard usually operate as non-profit organizations. The main purpose of these organizations is to meet the needs of their members by providing a set of rules, the underlying infrastructure, and some level of research and development to improve their networks. The network sets the interchange fees, which are paid by acquirers to issuers.

1.3. The steps of issuing credit cards

The steps of issuing credit cards can be summarized as follows:

1) The applicant fills in an application form that contains about 20 to 30 items, including age, gender, address, telephone numbers, etc.

2) The applicant provides the required documents; these documents differ from one bank to another.

3) The application form and the documents are checked to verify the correctness of the data and to detect false information.

4) The credit analyst examines past credit history, investigates the application and other documents, and uses his/her past experience to decide on acceptance or rejection and on the credit card limit.

In steps three and four, there may be a need to carry out a field investigation or to ask the applicant for more documents.

These steps are summarized in figure 1.1.

[Figure 1.1, flowchart. Boxes: Applicant fills in the application and provides the required documents; Quick check of application and documents; Office investigation; Field investigation; Return the documents and the application to the applicant; Determine the credit card limit and issue the card. Arrow labels: Need more documents or reject; Accept; Need field investigation; Reject or need more documents.]

Figure 1.1: The steps of issuing credit cards process

1.4. Properties of credit card application assessments process

The credit card application assessment process has the following properties, Tetsuo Tamai and Masayuki Fujita (1987), Liu, Y. (2001), and Hussein Almuallim, Shigeo Kaneda and Yasuhiro Akiba (2002):

1- Applicants usually have approximately homogeneous profiles, but they vary in liquidity, required limit, and risk.

2- Human factors play an important role in the process of credit card application assessment; thus, assessments do not necessarily follow rigid rules like physical or chemical laws.

3- The appropriateness and uniformity of the decisions are very important factors, as are cost and labor savings.

1.5. Judgment process for credit card application assessment

Credit card application assessment was based on human judgment to assess the risk associated with an applicant. The decision is made by experts and depends on imprecise and imperfect knowledge. The credit card analyst investigates the application, builds a profile of the applicant from the description given in the application form, and matches this profile against certain patterns in his past experience to determine the degree of credit risk associated with the applicant and to decide whether to accept or reject issuing the credit card. Generally the decision is based on the 4Cs, Tetsuo Tamai and Masayuki Fujita (1987), Thomas L. C. (2000) and Liu, Y. (2001). The 4Cs are:

1- The Character of the applicant – whether the applicant or the applicant's family is known to the bank's team.

2- The Capital – whether the applicant is asking for a credit card with a specific limit.

3- The Capacity – whether the applicant has free income for repayment (financial ability to repay debt).

4- The Condition – the condition of the market (the general economic environment).

1.6. The disadvantages of using judgment process for credit card application assessment

Banks were having difficulties with their credit card management. The number of delinquent customers is increasing, it is difficult to distinguish between good and bad customers, and customers want instant decisions. These create new challenges for decision makers, who must make consistent and intelligent real-time decisions, Liu, Y. (2001).

Judgmental methods depend on criteria that are not systematically tested and that vary when applied by different individuals; thus the decisions are non-uniform, subjective, and opaque, and depend on the personal and empirical knowledge of each individual credit analyst, Karel Komorad (2002) and Federal Trade Commission for the Consumer (2005).

Tetsuo Tamai and Masayuki Fujita (1987) and Thomas L. C., David B. Edelman, and Jonathan N. Crook (2004) summarize the disadvantages of using the judgment method as follows:

1- Banks now seek to identify risky customers and profitable customers. It is usually difficult to distinguish profitable customers from risky ones, because both share the common characteristic of using their cards well. Judgment does not enable banks to distinguish accurately between risky and profitable customers.

2- Judgment needs more time to assess a credit card application, so banks cannot give customers a quick answer, especially as the number of applicants for credit cards grows.

3- Judgment does not enable banks to eliminate individual preferences completely, and thus it is difficult to maintain stable and uniform judgments.

4- Judgment does not enable banks to set clear criteria for acceptance.

5- Different experts may make different judgments for the same

applicant.

6- The same expert may not give the same opinion when confronted

with the same customer twice over a period of time.

1.7. Risk associated with credit card lending

The Comptroller of the Currency (1998) defines the primary risks associated with credit card lending as follows:

1- Credit risk

Credit risk is the risk to earnings or capital arising from an obligor's failure to meet the terms of any contract with the bank or otherwise to perform as agreed.

2- Transaction risk

Transaction risk is the risk to earnings or capital arising from

problems with service or product delivery.

3- Liquidity risk

Liquidity risk is the risk to earnings or capital arising from a bank’s

inability to meet its obligations when they come due, without incurring

unacceptable losses.

4- Strategic risk

Strategic risk is the risk to earnings or capital arising from adverse

business decisions or improper implementation of those decisions.

5- Reputation risk

Reputation risk is the risk to earnings or capital arising from negative

public opinion.

6- Interest rate risk

Interest rate risk is the risk to earnings or capital arising from

movements in interest rates.

7- Compliance risk

Compliance risk is the risk to earnings or capital arising from

violations or non-conformance with laws, rules, regulations, prescribed

practices, or ethical standards.

1.8. Cost of wrong decisions in credit card application assessment

A wrong decision in a classification problem is the assignment of a new sample to the wrong class. Thus in credit card application assessment there are two types of error (misclassification), Michie, Spiegelhalter, and Taylor (1994), Thomas L. C., David B. Edelman, and Jonathan N. Crook (2002) and Karel Komorad (2002):

1- First type of error: the bank classifies a good applicant as bad and refuses to issue a credit card.

In this case the bank loses the profit from that applicant.

2- Second type of error: the bank classifies a bad applicant as good and issues a credit card.

In this case the applicant may default and the bank loses the used credit card limit.
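The two error types can be made concrete with a short sketch that counts each kind of misclassification and attaches a cost to it. The cost figures, the sample labels, and the names `ErrorCosts`, `count_errors`, and `total_cost` are illustrative assumptions, not values or methods taken from this thesis.

```python
from dataclasses import dataclass


@dataclass
class ErrorCosts:
    lost_profit_per_good: float  # assumed cost of rejecting a good applicant
    lost_limit_per_bad: float    # assumed cost of accepting a bad applicant


def count_errors(actual, predicted):
    """Return (first_type, second_type) error counts.

    First type: a good applicant classified as bad.
    Second type: a bad applicant classified as good.
    """
    pairs = list(zip(actual, predicted))
    first = sum(1 for a, p in pairs if a == "good" and p == "bad")
    second = sum(1 for a, p in pairs if a == "bad" and p == "good")
    return first, second


def total_cost(actual, predicted, costs):
    """Total cost of wrong decisions under the assumed per-error costs."""
    first, second = count_errors(actual, predicted)
    return first * costs.lost_profit_per_good + second * costs.lost_limit_per_bad


# Hypothetical sample of five applicants.
actual = ["good", "good", "bad", "bad", "good"]
predicted = ["good", "bad", "good", "bad", "good"]
costs = ErrorCosts(lost_profit_per_good=100.0, lost_limit_per_bad=1000.0)

print(count_errors(actual, predicted))
print(total_cost(actual, predicted, costs))
```

Keeping the two counts separate matters because the two errors usually carry very different costs: a lost profit in the first case and a lost credit limit in the second.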

Chapter two

Scoring system

2.1. Introduction

The decision to issue a credit card was based essentially on the judgment of the credit card analyst. The growth in demand for credit cards forced banks to search for more formal and objective methods to help the credit card analyst; these methods are generally known as scoring systems. Building a scoring system model is a complex and iterative process, and it takes a long time to collect enough historical data. A scoring system uses this historical data to study the effect of an applicant's characteristics on his/her behavior. Scoring system models rate applicants based on the data in the application and the past performance of current customers. Scoring system models often vary from one bank to another according to the types of credit and what is expected from the scoring model. A good scoring model will not predict the performance of customers perfectly, but it should give a fairly accurate prediction, Loretta J. Mester (1997), Liu, Y. (2002), Nikolaos F. Matsatsinis and C. Erik Larson (2004) and Federal Trade Commission for the Consumer (2005).

2.2. History of scoring system

The use of scoring systems began more than 60 years ago. A scoring system was first used by Durand in 1941 to discriminate between good and bad loans, based on Fisher's work in 1936, which discriminated between groups in a plant population based on various measured characteristics. During World War II a shortage of credit analysts occurred, so banks wrote down the rules of thumb used by credit analysts to decide whether to give a loan. The first consultancy was formed in San Francisco by Bill Fair and Earl Isaac in the early 1950s. Their system spread fast as financial institutions found that using a scoring system was cheaper, faster, more objective, and above all a much better predictor than any judgmental scheme. The arrival of credit cards in the late 1960s, and the rising number of people applying for credit cards, increased the need for an automated system and made banks realize the importance of scoring systems. In the 1980s, the success of scoring systems in credit card application assessment decisions was a significant sign for banks to apply scoring methods to other products such as personal loans, mortgage loans, and small business loans. During the second half of the 1990s, mortgage underwriting increasingly incorporated credit score. Also, in the 1990s, the growth of direct marketing led to the use of scorecards to improve the response rate to advertising campaigns. In 1999 approximately 60% to 70% of all mortgages were underwritten using credit score. The success of scoring systems in banks motivated landlords, employers, and insurance companies to use them, Thomas L. C. (2000), Thomas L. C., David B. Edelman, and Jonathan N. Crook (2002), Karel Komorad (2002), Consumer Federation of America (2002), Ferenc Kiss (2003), Peng and Goh Chwee (2004) and Allen N. Berger and W. Scott Frame (2005).

2.3. Definition of scoring system

A scoring system is a classification method concerned with classifying a new customer into predefined groups according to the customer's characteristics. The original meaning of a scoring system is to assign a score to each customer, compare this score with given or calculated cut-off points, which are the divisions between the predefined groups, and classify the customer into one of the classes. A scoring system tries to relate the characteristics of a customer to the risk associated with that customer and uses this relationship to build a model that classifies customers, according to their risk, into predefined subgroups as accurately as possible, Liu, Y. (2001), Doumpos M., Kosmido K., Bourakis G., and Zopounidis C. (2002) and Ki Mun Jung & Thomas L. C. (2004).
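As a minimal sketch of how a cut-off point might be calculated rather than given, the following example tries each observed score as a candidate cut-off and keeps the one that misclassifies the fewest historical customers. The scores, labels, and function names are hypothetical, and minimizing the raw error count is only one possible criterion, chosen here for simplicity.

```python
def classify(score, cutoff):
    """Assign a customer to the good group if the score reaches the cut-off."""
    return "good" if score >= cutoff else "bad"


def misclassified(scores, labels, cutoff):
    """Number of historical customers placed in the wrong group."""
    return sum(1 for s, y in zip(scores, labels) if classify(s, cutoff) != y)


def best_cutoff(scores, labels):
    """Try each observed score as a cut-off; keep the one with fewest errors.

    Ties are resolved in favor of the lowest candidate cut-off.
    """
    candidates = sorted(set(scores))
    return min(candidates, key=lambda c: misclassified(scores, labels, c))


# Hypothetical historical scores and known outcomes.
scores = [35, 42, 50, 58, 61, 70]
labels = ["bad", "bad", "good", "bad", "good", "good"]

cutoff = best_cutoff(scores, labels)
print(cutoff, misclassified(scores, labels, cutoff))
print(classify(55, cutoff))  # a new applicant with score 55
```

With more than two predefined groups, the same idea extends to several cut-off points, one at each division between adjacent groups.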

A scoring system can be viewed as a classification method or as a financial risk forecasting technique. It can be considered a classification method since it classifies a new applicant into predefined groups. A scoring model uses an enormous volume of data on current customers and tries to find rules that separate good customers from bad ones. It then uses these rules to classify new applicants, Liu, Y. (2001), Doumpos M., Kosmido K., Bourakis G., and Zopounidis C. (2002).

A scoring system can also be considered a financial risk forecasting technique. Risk forecasting is the number one topic in modern finance, since it helps to assess the risk corresponding to an applicant and to distinguish between groups that have different credit risk characteristics. It involves techniques that help banks assess the risk associated with each customer, so that they can manage and quantify the risk and make decisions quickly and objectively, Thomas L. C., et al (2002) and Karel Komorad (2002).

There are many definitions of scoring system; these definitions can be reviewed as follows:

Loretta J. Mester (1997) defines a scoring system as a quantitative method used to predict the probability that a loan applicant or existing borrower will default or become delinquent.

The Comptroller of the Currency (1998) defines scoring systems as tools used to predict the behavior of new applicants based on the performance of previous applicants.

Lewis defines a scoring system as the study of the creditworthiness of any of the many forms of commerce under which an individual obtains money, goods, or services on condition of a promise to repay the money or to pay for the goods or services, along with a fee (the interest), at some specific future date or dates, Karel Komorad (2002).

Thomas L. C., David B. Edelman, and Jonathan N. Crook (2002) define a scoring system as the set of decision models and their underlying techniques that aid lenders in the granting of consumer credit.

Thomas L. C. (2000) defines a scoring system as a decision process whose input is the answers to the application form questions and various information obtained from the credit reference bureau, and whose output is the separation of applications into goods and bads, Vladimir Bugera, Hiroshi Konno, and Stanislav Uryasev (2002).

Mark Schreiner (2002) defines scoring system as any technique that

forecasts future risk from current characteristics using knowledge of

past links between risk and characteristics.

2.4. Types of scoring system

There are many types of scoring system, especially as its objectives have extended from classifying customers into predefined groups to covering the three stages of the credit management process (pre-application stage, credit application stage, and credit performance stage). Figure 2.1 presents the expansion of scoring into the three credit management stages, Liu, Y. (2001):

[Figure 2.1, diagram. Pre-application stage: identification of potential applicants. Credit application stage: identification of acceptable applicants. Credit performance stage: identification of possible behavior of current customers.]

Figure 2.1. Expanding of score model to different stages of credit management process

Many authors categorize the types of scoring system from different points of view, as follows:

The Comptroller of the Currency (1998) divides the types of scoring system into:

1- Application scoring

Application scores predict the probability that a consumer will repay as contracted.

2- Credit bureau risk scoring

Credit bureau risk scores predict the customer's future credit payment behavior to achieve superior predictive power.

3- Credit bureau bankruptcy scoring

Credit bureau bankruptcy scores predict the probability that a customer will declare bankruptcy or become a collection problem at some point.

4- Credit bureau revenue scoring

Revenue scores are used to rank prospective customers by the amount of net revenue likely to be generated.

5- Behavioral or performance scoring

Behavior scores are used to segment current customers into groups based on past behavior, to predict which ones will become delinquent, and to set different strategies, e.g. collection strategies, renewal decisions.

6- Collection scoring

Collection scores are used to predict the probability that collection efforts will succeed, the probability that a bank will receive a payment from a delinquent customer, and the probability of recoveries after charge-off.

Thomas L. C., David B. Edelman, and Jonathan N. Crook (2002) and Vladimir Bugera, Hiroshi Konno, and Stanislav Uryasev (2002) divide scoring types into credit score and behavior score, since banks must make two types of decision: the first concerns new customers, to decide whether or not to grant credit, and the second concerns current customers, for different purposes.

1- Credit score

Credit score deals with new applicants, to decide which applicants will be granted a credit card and which will be denied.

2- Behavior score

Behavior score deals with current customers, to evaluate their credit performance for different purposes, e.g. collection or extending credit.

Liu, Y. (2001) and Kasper Roszbach (2003) divide scoring types according to objective into: marketing score (retention score), application score, performance score (behavior score), bad debt management score, and profit score.

1- Marketing score (retention score)

The objective of marketing score is to identify creditworthy customers and measure their response to promotional activity. Marketing score is also used to predict the probability of losing valuable customers, in order to build effective customer retention strategies.

2- Application score

The objective of application score is to study the behavior of current customers in order to decide whether or not to extend credit and to predict whether a new customer will default.

3- Performance score (behavior score)

The objective of performance score is to study the credit behavior of current customers in order to isolate problems before they occur, so that more attention can be devoted to them.

4- Bad debt management score

The objective of bad debt management score is to build collection strategies to deal with delinquent accounts.

5- Profit score

The objective of profit score is to identify profitable and non-profitable customers in order to maximize profit.

Mark Schreiner (2002) categorizes scoring systems according to the type of risk the score forecasts into:

1- Pre disbursement scoring

Pre disbursement scoring predicts the probability that a provisionally approved credit will default.

2- Post disbursement scoring

Post disbursement scoring predicts the probability that a current customer will default.

3- Collection scoring

Collection scoring predicts the probability that a customer who is currently x days late will be x + z days late, where z is the number of additional days by which the customer is expected to be late beyond x.

4- Desertion scoring

Desertion scoring predicts the probability that a current customer will apply to another bank once the current loan is paid off.

5- Visit scoring

Visit scoring is used before visiting the customer to predict the probability of rejection before or after a visit.

All the above scoring system types are based on prior experiences

which can be acquired through deductive (subjective) or inductive

(empirical) way. According to these, any scoring system can be, Liu,

Y. (2001, 2002) and Mark Schreiner (2002):

1- Deductive (subjective) scoring system

In a deductive scoring system, a weight is given to each attribute, the total score is obtained by adding these weights, and the customer is classified into a predefined subgroup by comparing the total score with a cut-off point. The attributes, their weights, and the cut-off point are determined by the decision maker based on knowledge obtained from experts.
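The additive rule described for the deductive scoring system can be sketched in a few lines of Python. This is only an illustration: the attribute names, the weights and the cut-off point below are hypothetical values invented for the example, not values taken from any bank or from the cited authors.

```python
# Hypothetical expert-assigned weights (points) for each attribute value.
WEIGHTS = {
    "residential_status": {"owner": 30, "tenant": 15, "other": 5},
    "years_at_employer": {"under_2": 5, "2_to_5": 15, "over_5": 25},
    "has_bank_account": {"yes": 20, "no": 0},
}
CUT_OFF = 50  # hypothetical cut-off point chosen by the decision maker


def total_score(applicant):
    """Add the weights of the applicant's attribute values."""
    return sum(WEIGHTS[attr][value] for attr, value in applicant.items())


def classify(applicant, cut_off=CUT_OFF):
    """Return 'good' if the total score reaches the cut-off, else 'bad'."""
    return "good" if total_score(applicant) >= cut_off else "bad"


applicant = {
    "residential_status": "tenant",
    "years_at_employer": "over_5",
    "has_bank_account": "yes",
}
print(total_score(applicant), classify(applicant))
```

Treating a score equal to the cut-off point as "good" is a choice made for this sketch; the text does not say how ties are handled.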

2- Empirical (inductive) scoring system

Empirical scoring systems use past data about current customers and try to find a relation between the customers' characteristics and the risk associated with each one. These relations are expressed as a set of rules or a mathematical formula using quantitative techniques such as linear discriminant analysis, linear programming, neural networks, etc.

2.5. Scoring system applications

The applications and objectives of scoring system models are wide-ranging. The first successful application of scoring systems was in the area of credit cards. The application area then spread to include decisions related to other credit products, e.g. personal loans, auto loans, small business loans, housing, insurance, basic utility services, mail order firms, telecommunications and employment. The objectives of scoring system models have also been extended to include identifying potential customers (pre-application stage), determining whether to grant credit or not (credit application stage), and identifying the possible behavior of current customers (credit performance stage). Scoring systems are also used to help address some fundamental strategic issues such as forecasting provisions and collections resource requirements, the value of the underwriting process, risk-based pricing and risk-based processing, and acceptance strategy/strategic planning, Liu, Y. (2001), Peng and Goh Chwee (2004) and David B. Edelman (2005).

Figure 2.2 presents the expansion of the application areas of scoring systems, Liu, Y. (2001).

Figure 2.2. Expansion of the application areas of scoring systems

[Figure content: consumer credit (credit card, personal loans, auto loans, home loans, others); business credit (small business loans, other small business loans); other similar decision problems in retailers, mail order firms, telecommunications, others.]

Some examples of the application areas of scoring systems and their objectives can be summarized as follows, Consumer Federation of America (2002), Peng and Goh Chwee (2004) and Allen N. Berger and W. Scott Frame (2005):

1- Banks use scoring systems to:

- Decide whether to accept or reject an applicant

- Measure credit risk

- Set credit limits

- Manage existing accounts

- Forecast the profitability of customers

- Identify target markets

- Underwrite small business credits

2- Insurance companies can use scoring systems to:

- Decide on applications for new insurance policies and renewals of existing policies

- Adjust premiums

- Set medium-term strategy

3- Landlords can use scoring systems to determine whether potential tenants are likely to pay their rent on time.

4- Utility suppliers and home telephone and cell phone service providers can use scoring systems to determine whether to provide their services to a consumer.

5- Employers can use scoring systems to decide whether to hire a potential employee, especially for posts where employees handle large amounts of money.

2.6. Potential Benefits of scoring system

Using a scoring system provides many benefits, as follows:

1- Scoring systems reduce discrimination and the effect of personal attitude, Steiner M. T. A. and Carnieri C. (1999), Liu, Y. (2001), Consumer Federation of America (2002) and Peng and Goh Chwee (2004), because they:

a- Use quantitative methods to analyze the customer's creditworthiness

b- Encourage the credit analyst to concentrate on the difficult individual cases and to focus only on the important information needed to evaluate the credit risk.

2- Scoring systems allow automation of the credit decision and reduce human intervention, which increases the speed of the assessment process, Steiner M. T. A. and Carnieri C. (1999), Thomas L. C., David B. Edelman, and Jonathan N. Crook (2002) and Jih-Jeng Huang, et al (2005).

3- Scoring systems enable banks to manage the credit portfolio effectively and profitably because they help in determining the credit card limit, interest, charges, and over-limit rate, Peng and Goh Chwee (2004).

4- Scoring systems help banks to build strong collection strategies, Peng and Goh Chwee (2004) and Jih-Jeng Huang, et al (2005).

5- Scoring systems enable banks to distinguish creditworthy from non-creditworthy customers, so the default rate is expected to drop after a scoring model is implemented, Steiner M. T. A. and Carnieri C. (1999) and Thomas L. C. (2000).

6- The use of scoring systems allows lenders to underwrite and monitor loans without actually meeting the borrower, David West (2000).

7- Scoring systems reduce the cost of analysis, since fewer credit analysts are needed when a scoring system is applied than when banks depend on judgment, Steiner M. T. A. and Carnieri C. (1999), David West (2000) and Jih-Jeng Huang, et al (2005).

8- Scoring systems can be used for risk-based pricing. Banks can use the scoring system to identify higher-risk customers and charge them higher fees and higher interest rates, Consumer Federation of America (2002).

2.7. Scoring system limitation

1- Some models are not transparent and the credit analyst may not understand them explicitly, Liu, Y. (2001).

2- Some models are good at handling quantitative attributes but cannot handle qualitative attributes, Liu, Y. (2001).

3- The attributes used in the scoring system reflect historical information about a risk, but most credit defaults are caused by factors that arise after the credit is granted and may be due to unobservable variables such as employment status and current status, Liu, Y. (2001) and Peng and Goh Chwee (2004).

2.8. Scoring system issues

Many practical problems affect the accuracy of a scoring system model. These problems can be categorized into three groups: problems related to the sample, problems related to the attributes and problems related to the class definition, Liu, Y. (2002). They can be summarized as follows:

a- Sample

1- One should think about the data necessary to implement the score. There is a trade-off between expensive data and low accuracy due to insufficient information, Karel Komorad (2002) and Vladimir Bugera, Hiroshi Konno, and Stanislav Uryasev (2002).

2- After the sample is taken, one should determine a suitable period over which to gather information about the payment behavior of the sample, Vladimir Bugera, Hiroshi Konno, and Stanislav Uryasev (2002) and Liu, Y. (2001).

3- Applicants who were rejected are not represented in the sample, so it may be biased toward good applicants, and there is no information on the performance of rejected applicants, A. J. Feelders (2000), Liu, Y. (2001), Karel Komorad (2002), and Peng and Goh Chwee (2004).

b- Attributes

1- The attributes entering the scoring system should be chosen carefully, with an explanation of why some attributes are preferred over others, because irrelevant attributes destroy the structure of the data and decrease the accuracy of the scoring system model, Karel Komorad (2002), Doumpos M., Kosmido K., Bourakis G., and Zopounidis C. (2002), Peng and Goh Chwee (2004) and Jih-Jeng Huang, et al (2005).

2- The law in some countries does not allow information about race, nationality, religion and gender to be used to build a scorecard.

3- The method used to aggregate the attributes in order to build the scoring model and make the issuing decision should be accurate, Zopounidis C. (2002) and Peng and Goh Chwee (2004).

4- A scoring system requires sufficient information about the credit history before the score is calculated, and this information may not be available, Peng and Goh Chwee (2004).

5- Scoring systems depend on the assumption that the past can predict the future: they use the characteristics of past applicants to classify a new applicant. However, the distribution of the characteristics sometimes changes over time, so the credit score must be refreshed, Peng and Goh Chwee (2004).

c- Class definition

1- Defining the risk classes (good and bad) is very important for the applicability of the scoring system model. Some banks use the number of months of missed payments, the amount over the overdraft limit, current account turnover, or a function of these variables to define the classes, Liu, Y. (2001).

2- Defining the proportion of good and bad cases in the sample is a very important point in building a scoring system, Vladimir Bugera, Hiroshi Konno, and Stanislav Uryasev (2002).

2.9. Scoring system and risk management

2.9.1. Basel II

The Basel Committee was established at the end of 1974 by the central bank governors of the Group of Ten countries. The committee meets every three months at the Bank for International Settlements (BIS) in Basel. The BIS is an international organization which fosters international monetary and financial cooperation and serves as a bank for central banks. The Basel Committee formulates broad supervisory standards and guidelines for banks. These supervisory standards and guidelines do not have legal force, Secretariat of the Basel committee on Banking Supervision (2001) and Linda Allen, Gayle Delong, and Anthony Saunders (2004).

In 1988 the committee introduced a capital measurement system referred to as the Capital Accord. The 1988 Capital Accord (the Accord) focused on the total amount of bank capital in order to reduce the risk of bank insolvency and the potential cost of a bank's failure for depositors. This system provided for the implementation of a credit risk measurement framework with a minimum capital standard of 8% by end-1992. In 1999 the committee issued a proposal for a revised capital adequacy framework. The new Accord intends to provide approaches which are both more comprehensive and more sensitive to risks than the 1988 Accord, Secretariat of the Basel committee on Banking Supervision (2001) and Basel Committee on Banking Supervision (2001a).

The new capital accord consists of three mutually reinforcing pillars. These pillars work together to provide banks with a higher level of safety and soundness. The pillars are as follows, Secretariat of the Basel committee on Banking Supervision (2001):

1- Minimum capital requirement

The first pillar seeks to refine the standardized rules set forth in the

1988 Accord. The pillar defines the minimum ratio of capital to risk

weighted assets as follows:

2- Supervisory review process

The second pillar requires supervisors to undertake a qualitative

review of their bank’s capital allocation techniques and compliance with

relevant standards.

3- Market discipline

The third pillar aims to bolster market discipline through enhanced

disclosure by banks.

The primary changes between Basel II and Basel I are in the approach to credit risk and in the inclusion of explicit capital requirements for operational risk. Addressing risks through a more comprehensive approach is one of the Accord objectives outlined by the committee in 1999. For credit risk, the Committee believes that improvements in risk measurement and management help banks to use full credit risk models as a basis for regulatory purposes, and it permits banks to choose between two broad methodologies, the standardized approach and the internal ratings-based (IRB) approach, for calculating their capital requirement for credit risk, Basel Committee on Banking Supervision (2001a, c).

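The minimum capital ratio defined under the first pillar is:

\[ \frac{\text{Total capital}}{\text{Credit risk} + \text{Market risk} + \text{Operational risk}} = \text{the bank's capital ratio (minimum 8\%)} \]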

Under the IRB approach, banks will be allowed to use their internal estimates of borrower creditworthiness to assess credit risk in their portfolios. The Committee believes that the internal ratings-based approach can secure two key objectives, Basel Committee on Banking Supervision (2001d):

1- Additional risk sensitivity

Capital requirements based on internal ratings will be more sensitive to the drivers of credit risk and economic loss in a bank's portfolio.

2- Incentive compatibility

An appropriately structured internal ratings-based approach can provide a framework which encourages banks to continue to improve their internal risk management practices.

2.9.2. Risk management

Banks face increasing risks which affect their profitability. The term risk has a variety of meanings in business. Generally it refers to the possibility that the outcomes of an action or event are uncertain or could have adverse impacts. The risks which banks may face can be categorized into credit risk (the risk of loss arising from default by a creditor or counterparty), market risk (the risk of losses in trading positions when prices move adversely) and operational risk (the risk of direct or indirect loss resulting from inadequate or failed internal processes, people and systems, or from external events), State Bank of Pakistan, Scott E. Harrington and Gregory R. Niehaus (1999), Secretariat of the Basel committee on Banking Supervision (2001) and Taher Musa (2004).

Risk management is concerned with managing and minimizing risk and with creating opportunities that carry minimal risk. It has become the core of every bank and is integrated into the planning and execution of operations. The importance of risk management is increasing, since banks now work in a global market and the line between the individual risk factors is becoming more blurred. Risk management plays an important role both before and after a loss occurs. Before a loss occurs, it prepares the organization to meet potential losses, minimizes the anxiety and fear associated with all loss exposures and meets the obligations imposed on it by outsiders. After a loss occurs, it aims to resume the organization's operations, maintain its earnings and growth, and minimize the impact of the loss on society, Rejda, George E (1995), Department of the army (1998), The committee on regulation and supervision (1999) and Scott E. Harrington, Gregory R. Niehaus (1999).

There are many definitions of risk management; some of them are given below:

- Risk management is a systematic process for the identification and evaluation of pure loss exposures faced by an organization or individual, and for the selection and implementation of the most appropriate techniques for treating such exposures, Rejda, George E (1995).

- Risk management is the process of identifying, assessing, and controlling risks arising from operational factors and making decisions that balance risk costs with mission benefits, Department of the army (1998).

- Risk management is a discipline for dealing with the possibility that some future event will cause harm. It provides strategies, techniques, and approaches for recognizing and confronting any threat faced by a company in fulfilling its mission, Taher Musa (2004).

The risk management process starts by identifying the potential risks that may cause losses and evaluating the expected losses. An appropriate technique or combination of techniques for treating loss exposures is then developed. After these techniques are implemented, their effectiveness must be measured, Rejda, George E (1995), Department of the army (1998) and Scott E. Harrington and Gregory R. Niehaus (1999).

There are a number of basic risk management tools that can be used to manage risks, Basel Committee on Banking Supervision (2001b). These include:

1- Development of appropriate corporate policies and procedures

2- Use of quantitative methods to measure risk

3- Pricing of products and services according to their risks

4- Establishment of risk management through diversification and hedging

5- Building of cushions to absorb losses

Risk management is treated in more detail in Basel II. The committee encourages banks to improve their risk management tools, Basel Committee on Banking Supervision (2001c).

2.9.3. Scoring system as a risk management tool

Internal risk analysis forms the basis for risk management. Banks use internal rating systems to categorize their exposures into broad, qualitatively differentiated layers of risk. Basel II moved towards accepting the internal ratings-based (IRB) approach as a basis for determining adequate reserves for credit risk, and it focuses on techniques that allow banks and supervisors to evaluate properly the various risks that a bank faces. The committee hopes that banks will move from the standardized approach to the IRB approach, and envisages that the IRB approach will evolve over time, Basel Committee on Banking Supervision (2001d) and Winfried G. Hallerbach and Albert J. Menkveld (2004).

In 1999 the Basel committee published a document setting out principles for the management of credit risk, in order to encourage banking supervisors to promote sound practices for managing credit risk. The internal measures of credit risk are based on an assessment of the risk characteristics of the borrower. Scoring systems are used to derive internal rating systems and to build credit risk models. The credit risk model aims to manage the risk in credit decisions and provides a basis for deciding whether credit should be granted or not, and its output plays an important role in the bank's risk management. Thus scoring systems gain importance as risk management tools in the light of the New Basel Capital Accord, Basel Committee on Banking Supervision (1999, 2000), Secretariat of the Basel committee on Banking Supervision (2001), Liu, Y. (2001), Karel Komorad (2002), Edward I. Altman (2002), Brian Coyle (2000), Thomas Mahlmann (2004) and Allen N. Berger and W. Scott Frame (2005).

Chapter three

Problem formulation and survey

3.1. Introduction

The assignment of a discrete set of alternatives (investment projects, firms, credit card applications, country risk, portfolio selection and management, etc.) to predefined homogeneous groups is a major problem in financial decision making. This type of problem is referred to as classification. Classification constructs models based on the characteristics of a previous sample in order to classify a new case into predefined groups. A scoring system is an application of classification methods which predicts the risk level of customers and classifies them into predefined groups. Borrowers' characteristics and credit performance are used to build a function, which is then used to forecast the performance of new applicants with similar characteristics, D. Michie, D.J. Spiegelhalter, and C.C. Taylor (1994), Liu, Y. (2001), Yi Peng, Yong Shi and Welxuan Xu (2002), Liu, Y. (2002b), Doumpos M. and Zopounidis C. (2002b, c) and Nikolaos F. Matsatsinis and C. Erik Larson (2004).

In this chapter and the following chapters, the term credit score refers to an empirical scoring system which studies current customers in order to predict whether a new applicant with similar characteristics will default, and hence to determine whether the bank will issue a credit card to that applicant or deny the application.

3.2. Data description

Banks usually keep a large amount of information about customers and their credit behavior as a main source of information for further analysis. This information can be used for building a credit score model.

3.2.1. Data used to build credit score model

Generally, the judgment process depends on the 4Cs (Character, Capital, Capacity, Condition). The credit score tries to utilize the information relating to the traditional 4Cs to assess the risk associated with each customer. The information required to assess the risk differs according to the type of risk the bank tries to forecast, Liu, Y. (2001).

The data used to build a credit score model are historical data about m customers with known classes. These data consist of two parts, Thomas L. C., David B. Edelman, and Jonathan N. Crook (2002):

1- The independent variables

The independent variables, also called features, attributes, criteria, or characteristics, may be qualitative (nominal) or quantitative (non-nominal). Any characteristic of the customer or the customer's environment that is expected to predict the customer's risk should be used in building the credit score model. These characteristics are obtained from:

a. Credit card application form

The application form usually contains much information about the applicant. This information gives an idea about the stability of the customer (e.g. time at the current address, time at present employment, marital status), the financial status of the customer (e.g. having bank accounts, having a credit card, time with current bank), the customer's resources (e.g. residential status, employment, other assets and expenses), and possible outgoings (e.g. number of children), Liu, Y. (2001).

b. Information from credit bureau

The credit bureau, if one is available, holds information such as past payment history, the number of inquiries for information on the applicant, etc.

2- The dependent variables

The dependent variable is the class or group to which each customer belongs. The dependent variable is usually nominal.

These data can be summarized in Table 3.1, Michael Doumpos and Constantin Zopounidis (2002):

Table 3.1. Data summary

                         Variables
Customers   a_1      a_2      ...   a_j      ...   a_n      Class
X_1         a_{11}   a_{12}   ...   a_{1j}   ...   a_{1n}   Y
X_2         a_{21}   a_{22}   ...   a_{2j}   ...   a_{2n}   Y
...         ...      ...      ...   ...      ...   ...      Y
X_i         a_{i1}   a_{i2}   ...   a_{ij}   ...   a_{in}   Y
...         ...      ...      ...   ...      ...   ...      Y
X_m         a_{m1}   a_{m2}   ...   a_{mj}   ...   a_{mn}   Y

Where:

$Y$ refers to the class (good or bad), $Y \in \{G, B\}$.

$A = (a_1, a_2, \ldots, a_j, \ldots, a_n)$ is the set of attributes about credit card holders, where $n$ is the number of attributes and $j = 1, \ldots, n$.

Each attribute may take $z$ values, $a_j = (v_{j1}, v_{j2}, \ldots, v_{jk}, \ldots, v_{jz})$, $k = 1, \ldots, z$, $j = 1, \ldots, n$, so that an attribute can be used to partition the sample into $z$ subsets.

$X_i = (a_{i1}, \ldots, a_{ij}, \ldots, a_{in})$ is the development (training) sample of data for the variables, where $i = 1, \ldots, m$ and $m$ is the sample size (from the application forms of previous customers).

Thus $a_{ij}$ is the value of the $j$th attribute for the $i$th customer.

3.2.2. The output of the credit score model

The output depends on the technique used to build the credit score model. Many techniques produce a weight for each characteristic and a cut-off point. Other techniques find the probability that the customer belongs to each predefined group, or build a decision tree.

3.3. Building the credit score model

Liu, Y. (2002a) presents a general framework for building a credit score model. This framework includes three stages, as shown in Figure 3.1:

Figure 3.1. The process of credit score model building

[Flowchart labels: Problem; Relevant data; Past cases (standard format); Building credit score model; New cases; Apply credit score model; Credit decision; Credit actual behaviors; Validation; Feedback; New samples; Stage 1; Stage 2; Stage 3.]

Stage one: Problem definition and data preparation

In this stage we define the objectives of the scoring model, collect the relevant data from the available sources and define the classes.

The definitions of the classes "good" and "bad" differ from one bank to another. They may depend on:

o The number of consecutive missed payments, or

o The number and amount of over-limit occurrences, or

o The total number of missed payments.

Stage two: Model building

In this stage we put the data in a standard form and build the credit score model:

- Select a sample (training sample) from previous customers, $X_i = (a_{i1}, \ldots, a_{ij}, \ldots, a_{in})$.

- Evaluate their performance during a specific period.

- Classify them as good or bad, $Y \in \{G, B\}$.

- The quantitative techniques use this information, and other data collected from other sources if available, to find a rule that splits the $X_i$ into two groups (Good "G" and Bad "B") with the smallest percentage of misclassifications, Thomas L. C. (2000).

Stage three: Models application and validation

In this stage the credit score model is applied and validated.

- To validate the credit score model, we apply it to the same sample used in building it or to a new sample with known classes. We compare the credit decisions produced by the model with the actual behavior, and we can use the results of this comparison as feedback to modify the relevant data or the past cases (training sample) and rebuild the credit score model.

- To classify a new applicant as good or bad in order to determine whether the bank will issue the card or not, the credit score model is applied to the new customers to predict their classes. The sum of the attribute weights is compared with the cut-off point, or the decision tree is followed from the root until a leaf is reached. In general this depends on the technique used to build the credit score model.

- For new applicants who are granted a credit card using the credit score model, the actual payment behavior is recorded so that it can be used to validate the credit score model and to update the training sample and hence the model.
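The validation step in stage three can be illustrated with a short Python sketch that compares the model's decisions with the actual behavior on a sample with known classes. The scoring rule, its weights, its cut-off point and the four validation cases below are hypothetical values chosen only for the example.

```python
def misclassification_rate(model, sample):
    """Share of cases whose predicted class differs from the actual class."""
    errors = sum(1 for attributes, actual in sample if model(attributes) != actual)
    return errors / len(sample)


def score_model(attributes, weights=(2.0, 1.0), cut_off=5.0):
    """Hypothetical weighted-sum model: 'G' if the score reaches the cut-off."""
    score = sum(w * a for w, a in zip(weights, attributes))
    return "G" if score >= cut_off else "B"


# Hypothetical validation sample: (attribute values, actual class).
validation_sample = [((3, 1), "G"), ((1, 1), "B"), ((2, 2), "G"), ((0, 4), "G")]

rate = misclassification_rate(score_model, validation_sample)
print(f"misclassification rate: {rate:.2f}")
```

A high rate on such a sample would be the feedback signal for revising the relevant data or the training sample and rebuilding the model.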

3.4. Literature survey

3.4.1. Classification of the credit score methods

There are many techniques for building credit score models in a variety of research disciplines. Most of these techniques generate a model that minimizes some function of the error between actual and predicted values, or that maximizes likelihood, Steven Finlay (2005).

These techniques have been classified into different groups by many authors, as follows:

Liu, Y. (2002) categorizes the methods used for credit scoring into three main historical lines of research: statistical (e.g. linear discriminant analysis, k-nearest neighbors and regression models), machine learning (e.g. decision trees, rule induction algorithms and genetic algorithms), and neural networks (e.g. multilayer perceptrons and radial basis function networks).

Thomas L. C., David B. Edelman, and Jonathan N. Crook (2002) categorize these methods into statistical methods (e.g. discriminant analysis, logistic regression, probit regression, tobit analysis, classification trees, and the nearest neighbor approach) and non-statistical methods (e.g. linear programming, integer programming, neural networks, genetic algorithms and expert systems).

Many authors treat credit scoring as a classification problem in data mining, Liu, Y. (2002), Vladimir Bugera, Hiroshi Konno, and Stanislav Uryasev (2002), Yong Shi, Yi Peng, Welxuan Xu and Xiaowo Tang (2002), Yi Peng, Yong Shi and Welxuan Xu (2002) and Peng and Goh Chwee (2004).

Ferenc Kiss (2003) classifies the methods used for credit scoring, from the knowledge management perspective, into knowledge generating modeling processes, knowledge saving modeling processes, and knowledge selection processes. Knowledge generating modeling processes include methods in which decision making depends on experience data with the help of statistical or analytical processes (e.g. the linear probability model, probit and logit models, discriminant analysis, neural networks, classification trees, and nearest neighbors). Knowledge saving modeling processes include methods that formalize the theoretical knowledge and experience of experts in some way (e.g. the analytic hierarchy process and expert systems). Knowledge selection processes include methods that are capable of selecting the optimum model from the set of models available for finding a solution (e.g. decision trees, expert systems, and genetic algorithms).

Peng and Goh Chwee (2004) categorize the techniques used to build credit score models into statistical methods (e.g. discriminant analysis and logistic regression) and data mining techniques (e.g. decision trees and neural networks).

Liu, Y. (2002) and Jih-Jeng Huang, et al (2005) classify the methods used to build credit score models into induction-based algorithms and function-based models. Induction-based algorithms (non-parametric, discovery-based or data-driven) create the model automatically based on the patterns found in the data (e.g. rough sets, classification and regression trees). Function-based models (parametric, verification-based or theory-driven) utilize the idea of parameter estimation in statistics (e.g. discriminant analysis, logistic regression, and neural networks).

3.4.2. Statistical techniques

The first credit score model was presented by Durand in 1941. This model was based on Fisher's 1936 work on discriminant analysis. Forms of regression were then used, since the Fisher approach can be viewed as a form of linear regression, Thomas L. C. (2000), Thomas L. C., David B. Edelman, and Jonathan N. Crook (2002) and Karel Komorad (2002).

Statistical methods can be reviewed as follows:

3.4.2.1. Linear discriminant analysis (linear probability model)

Linear discriminant analysis is basically a regression model. It tries to find the best linear combination (linear discriminant function) of the characteristics which explains the probability of default. The linear discriminant equation is used to find the attribute weights. The score is then obtained and compared with a cut-off point, David West (2000), Ferenc Kiss (2003) and Thomas L. C., David B. Edelman, and Jonathan N. Crook (2002).

\[ Y_i = w_1 a_{i1} + w_2 a_{i2} + \cdots + w_j a_{ij} + \cdots + w_n a_{in} + u \]

where $u$ is the random error with $E(u) = 0$, and $w_j$ are the weights for $a_j$.
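One way to obtain such weights is Fisher's linear discriminant, which the text above names as the basis of the first credit score model. The Python sketch below is a minimal illustration for two attributes only. The sample points are hypothetical, and placing the cut-off point midway between the projected class means is an assumption of this sketch, not a rule taken from the cited authors.

```python
def mean(rows):
    """Column means of a list of 2-attribute cases."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def scatter(rows, m):
    """2x2 scatter matrix of one class around its mean m."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - m[0], r[1] - m[1]]
        for p in range(2):
            for q in range(2):
                s[p][q] += d[p] * d[q]
    return s


def fisher_weights(good, bad):
    """Return weights w = S^-1 (mean_G - mean_B) and a midpoint cut-off."""
    mg, mb = mean(good), mean(bad)
    sg, sb = scatter(good, mg), scatter(bad, mb)
    s = [[sg[p][q] + sb[p][q] for q in range(2)] for p in range(2)]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det], [-s[1][0] / det, s[0][0] / det]]
    diff = [mg[0] - mb[0], mg[1] - mb[1]]
    w = [inv[p][0] * diff[0] + inv[p][1] * diff[1] for p in range(2)]
    cut_off = sum(w[j] * (mg[j] + mb[j]) / 2 for j in range(2))
    return w, cut_off


def classify(x, w, cut_off):
    """Compare the weighted score of case x with the cut-off point."""
    score = w[0] * x[0] + w[1] * x[1]
    return "G" if score > cut_off else "B"


# Hypothetical training cases with two quantitative attributes each.
good = [(5, 4), (6, 5), (7, 4), (6, 6)]
bad = [(1, 2), (2, 1), (3, 2), (2, 3)]
w, cut_off = fisher_weights(good, bad)
print(w, cut_off, classify((6, 5), w, cut_off))
```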


3.4.2.2. Logistic regression

Logistic regression is a variation of linear regression that is useful when the outcome is restricted to two values. It finds the probability that a given customer belongs to a predefined group, $P(Y \mid X)$. Logistic regression gives each attribute a weight which measures the contribution of that characteristic to variations in $P(Y \mid X)$. Some research finds that logistic regression performs better than linear discriminant analysis in credit scoring, Thomas L. C., David B. Edelman, and Jonathan N. Crook (2002), Yang Liu (2002), Karel Komorad (2002), and Y. Liu and M Schumann (2005).

Steiner M. T. A. and Carnieri C. (1999) proposed a methodology for credit scoring that is divided into two stages. In the first stage statistical techniques were used to analyze the data, and in the second stage logistic regression was used to build the credit score model. They found that logistic regression performed better than six other methods (two involving linear programming, three statistical methods, and one neural network).
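How logistic regression turns attribute weights into an estimate of $P(Y \mid X)$ can be sketched in plain Python. The fitting procedure shown here (batch gradient ascent on the log-likelihood), the learning rate, the number of epochs and the toy data are all assumptions made for illustration. They are not the procedure used in the studies cited above.

```python
import math


def sigmoid(z):
    """Logistic function mapping a score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(w, x):
    """Estimated probability of class 1 (here: default); w[0] is the intercept."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def fit_logistic(rows, labels, rate=0.1, epochs=2000):
    """Fit the weights by batch gradient ascent on the log-likelihood."""
    w = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in zip(rows, labels):
            err = y - predict_proba(w, x)
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        w = [wj + rate * g / len(rows) for wj, g in zip(w, grad)]
    return w


# Hypothetical data: one attribute (number of missed payments), label 1 = default.
rows = [[0], [1], [2], [3], [4], [5]]
labels = [0, 0, 1, 0, 1, 1]
w = fit_logistic(rows, labels)
print(predict_proba(w, [0]), predict_proba(w, [5]))
```

The fitted weight `w[1]` plays the role described in the text: it measures how much the attribute contributes to changes in the estimated probability.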

3.4.2.3. Probit and tobit analysis

Probit and tobit analysis are nonlinear regression methods which have been used in credit scoring. The probit model is derived by letting the standard normal distribution express the discriminant function. In the tobit model, there is something unsatisfactory about the asymmetry of the tobit transformation used to estimate $P(Y \mid X)$. Generally, neither probit nor tobit models have found much favor in building credit scores, Thomas L. C., David B. Edelman, and Jonathan N. Crook (2002), Karel Komorad (2002) and Kasper Roszbach (2003).

3.4.2.4. Semiparametric regression

Semiparametric regression has been used to give more attention to nominal attributes. Hardle et al. showed that semiparametric regression performs better than logistic regression, Karel Komorad (2002).

3.4.2.5. Bayesian classification

The Bayesian classifier is based on Bayes' theorem. The Bayesian method is used to predict the probability that a given applicant belongs to a particular class. The classification rule can be stated as: $X \in G$ if $P(G \mid X) > P(B \mid X)$, and $X \in B$ if $P(B \mid X) > P(G \mid X)$, Jiawei Han and Micheline Kamber (2001), Liu, Y. (2002) and Gutierrez-Pena E. (2004).
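The classification rule above can be illustrated for a single nominal attribute with a short Python sketch. The prior probabilities and the class-conditional probabilities below are hypothetical numbers chosen for the example. In practice they would be estimated from the training sample.

```python
def posteriors(x, prior, likelihood):
    """Return P(class | x) for each class using Bayes' theorem."""
    joint = {c: prior[c] * likelihood[c][x] for c in prior}
    evidence = sum(joint.values())
    return {c: joint[c] / evidence for c in joint}


def classify(x, prior, likelihood):
    """Assign G if P(G | x) > P(B | x), otherwise B (ties go to B in this sketch)."""
    post = posteriors(x, prior, likelihood)
    return "G" if post["G"] > post["B"] else "B"


# Hypothetical P(class) and P(residential status | class).
prior = {"G": 0.6, "B": 0.4}
likelihood = {
    "G": {"owner": 0.6, "tenant": 0.4},
    "B": {"owner": 0.25, "tenant": 0.75},
}

for status in ("owner", "tenant"):
    print(status, posteriors(status, prior, likelihood), classify(status, prior, likelihood))
```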

3.4.2.6. Nearest neighbor approach

The nearest neighbor method has also been applied in credit scoring. The nearest neighbor estimate of $P(Y \mid X)$ is given by $K_G/K$ or $K_B/K$, where $K_G$ and $K_B$ are the numbers of cases from class G and class B among the $K$ cases most similar to $X$. Using the nearest neighbor method makes it possible to update the training sample by adding new cases and dropping the oldest cases, Thomas L. C., David B. Edelman, and Jonathan N. Crook (2002) and Liu, Y. (2002).
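A minimal Python sketch of this estimate is given below. It assumes squared Euclidean distance as the similarity measure and assumes that K does not exceed the sample size; neither choice is prescribed by the cited works. The sample data are invented for illustration, and a bounded `deque` is used to show how appending a new case drops the oldest one.

```python
from collections import deque


def squared_distance(a, b):
    """Squared Euclidean distance between two attribute vectors (assumed metric)."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def knn_class_fractions(training_sample, applicant, k):
    """Return (K_G / K, K_B / K) among the k cases most similar to the applicant.

    training_sample -- iterable of (attribute_vector, label) with label "G" or "B"
    """
    nearest = sorted(
        training_sample,
        key=lambda case: squared_distance(case[0], applicant),
    )[:k]
    k_good = sum(1 for _, label in nearest if label == "G")
    k_bad = len(nearest) - k_good
    return k_good / k, k_bad / k


# Invented sample; maxlen makes the deque drop its oldest case on append.
training_sample = deque(
    [((0.0,), "G"), ((1.0,), "G"), ((2.0,), "B"), ((10.0,), "B")],
    maxlen=4,
)
```

Calling `training_sample.append(new_case)` adds the new case and silently discards the oldest one, which mirrors the updating scheme described above.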

3.4.3. Non statistical techniques

3.4.3.1. Multicriteria decision aid method (MCDA)

MCDA is an advanced field of operations research that offers several advantages from both research and practical points of view. It is a powerful approach for analyzing complex decision problems that involve multiple and conflicting goals, and it provides financial decision makers and analysts with a wide range of methodologies for decision making, Doumpos M. and Zopounidis C. (2002a, b) and Jaap Spronk, Ralph E. Steuer and Constantin Zopounidis (2003).

All MCDA methodologies start by specifying a set of alternative solutions and identifying all factors related to the decision. They then analyze the data using a suitable criteria aggregation model and provide the decision maker with the necessary support to understand the recommendations of the model. All MCDA methodologies focus on developing an automatic procedure for analyzing data in order to construct a classification model, and on developing efficient preference modeling methodologies that make it possible to incorporate the decision maker's preferences in the classification model, Doumpos M. and Zopounidis C. (2002a, c).

Several decision making problems require the evaluation of a set of alternatives. The evaluation process involves the aggregation of all the pertinent decision attributes. Within the MCDA field one can distinguish between the following forms of aggregation models, Doumpos M. and Zopounidis C. (2002a) and Jaap Spronk, Ralph E. Steuer and Constantin Zopounidis (2003):

1- Multi-objective mathematical programming (e.g. goal programming).

2- Multi-attribute utility theory (e.g. the UTA method “UTilites Additives”, the UTADIS method “UTilites Additives DIScriminantes” and the MHDIS method “Multi group Hierarchical DIScrimination”).

3- Outranking relations (e.g. the ELECTRE family “Elimination Et Choix Traduisant la REalite” and the PROMETHEE family “Preference Ranking Organization METHod of Enrichment Evaluations”).

4- Preference disaggregation analysis (e.g. MSM “Multi-Surface Method”).

Nikolaos F. Matsatsinis (2002) compared UTADIS, UTADIS I, UTADIS II, ELECTRE Tri Pes, ELECTRE Tri Opt, rough sets, and a composite rule induction system, and found that UTADIS and UTADIS I gave the best results.

Doumpos M. and Zopounidis C. (2002b) present a new method for multi-group discrimination problems. The method leads to the development of a set of additive utility functions, which are used to classify each alternative into a specific group. The additive utility functions are estimated through the solution of three mathematical programming formulations (two linear and one mixed integer) in order to achieve the optimal discrimination, both in terms of the number of misclassifications and in terms of the clarity of discrimination. The first program minimizes the overall classification error, the second minimizes the number of misclassifications, and the third maximizes the distance between the global utilities of the classified alternatives achieved according to the two utility functions.

3.4.3.2. Linear programming

Until the 1980s, the only methods used for credit scoring were statistical methods. Freed and Glover found that linear programming can be used to discriminate between two groups, which gives the model more freedom because no statistical assumptions are required, Thomas L. C., David B. Edelman, and Jonathan N. Crook (2002), Yong Shi, Yi Peng, Welxuan Xu and Xiaowo Tang (2002) and Ferenc Kiss (2003).

The first model proposed by Freed and Glover maximizes the minimum distance between the scores of the (correctly classified) alternatives and the cut-off point. This model is known as MMD (maximize the minimum distance) and is presented in figure 3.3, Doumpos M. and Zopounidis C. (2002a) and Yong Shi, Yi Peng, Welxuan Xu and Xiaowo Tang (2002).

Figure 3.3. The MMD model

Where $C$ is the cut point which discriminates between good and bad alternatives and $\beta_i$ is the distance of the alternative score $X_i$ from the cut-off point $C$.

After the MMD model, Freed and Glover published a second classification model, which minimizes the sum of the deviations of the scores of the (incorrectly classified) alternatives from the cut-off point. This model is known as MSD (minimize the sum of deviations). Many studies have found that the MSD model produces good test results. The MSD model is presented in figure 3.4, Kim Fung Lam, Eng Ung Choo, and Jane W. Moy (1996), Doumpos M. and Zopounidis C. (2002a) and Yong Shi, Yi Peng, Welxuan Xu and Xiaowo Tang (2002).

Figure 3.4. The MSD model

Where $C$ is the cut point which discriminates between good and bad alternatives and $\alpha_i$ is the overlapping of the two class boundaries for the alternative score $X_i$ from the cut-off point.
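For reference, the two Freed and Glover models can be sketched in the form in which they are commonly stated in the literature on linear programming discriminant analysis. The notation is an assumption made for this sketch and may differ from the figures: $x_i$ is the attribute vector of alternative $i$, $w$ is the vector of attribute weights, $C$ is the cut-off point, $d$ is the minimum distance of the scores from the cut-off point, $\alpha_i \ge 0$ is the deviation of a misclassified alternative, and good alternatives are taken to score above $C$.

```latex
% MMD: maximize the minimum distance d of the scores from the cut-off point C
\begin{align*}
\max_{w,\,C,\,d}\quad & d \\
\text{s.t.}\quad & w^{\top} x_i \ge C + d, && i \in G,\\
                 & w^{\top} x_i \le C - d, && i \in B.
\end{align*}

% MSD: minimize the sum of the deviations alpha_i of misclassified alternatives
\begin{align*}
\min_{w,\,C,\,\alpha}\quad & \sum_{i} \alpha_i \\
\text{s.t.}\quad & w^{\top} x_i + \alpha_i \ge C, && i \in G,\\
                 & w^{\top} x_i - \alpha_i \le C, && i \in B,\\
                 & \alpha_i \ge 0 && \text{for all } i.
\end{align*}
```

In this form both programs need an additional normalization constraint to rule out trivial or unbounded solutions (for example $w = 0$, $C = 0$); the choice of normalization is one of the points on which later variants differ.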

After Freed and Glover proposed their linear programming approaches (MSD and MMD) to the classification problem, many authors studied variants of linear programming formulations for classification problems. Most of these formulations determine the weights for each attribute and the cut-off point simultaneously, Kim Fung Lam, Eng Ung Choo, and Jane W. Moy (1996) and Thomas L. C., David B. Edelman, and Jonathan N. Crook (2002).

In 1986 Freed and Glover proposed a general linear programming formulation that always gives a nontrivial solution, is invariant under linear transformation of the data, and considers two types of measures for the quality of classification. The objective function of this formulation is a weighted combination of the maximum internal and external deviations and the sum of the absolute values of the internal and external deviations, Thomas L. C., David B. Edelman, and Jonathan N. Crook (2002), Yong Shi, Yi Peng, Welxuan Xu and Xiaowo Tang (2002) and Doumpos M. and Zopounidis C. (2002a).

Kim Fung Lam, Eng Ung Choo, and Jane W. Moy (1996) present a new linear programming approach to the two-group classification problem. The approach is based on an idea from cluster analysis: objects within the same group should be more similar to each other than to objects in other groups. Accordingly, the scores of the alternatives in group G should be close to each other and far from the scores of the alternatives in group B. They solve the classification problem in two stages. In the first stage, they find the attribute weights by solving a linear program that minimizes the total deviation of the alternatives' scores from their group mean scores. In the second stage, they use the attribute weights computed in stage one to find the classification score of each customer and use these scores to find the cut point $c$, which discriminates between good and bad alternatives, by solving a linear programming or mixed integer programming problem.

Vladimir Bugera, Hiroshi Konno, and Stanislav Uryasev (2002) present a general approach to classification and test it on credit scoring for credit cards. It is based on finding an optimal classification utility function belonging to a pre-specified class of functions. They considered linear and quadratic utility functions with monotonicity constraints, and concluded that their approach leads to quite robust classification techniques.

3.4.3.3. Integer programming

Integer programming has also been used to build scoring systems. If one wants to take the number of cases where the discrimination is incorrect (the number of misclassifications, or the total cost of misclassification) as the measure of goodness of fit, one has to introduce integer variables into the linear program, and this leads to integer programming models. Many authors found that the integer model performs better than linear programming, Thomas L. C., David B. Edelman, and Jonathan N. Crook (2002).

3.4.3.4. Goal programming

Yong Shi, Yi Peng, Welxuan Xu and Xiaowo Tang (2002) and Yi Peng, Yong Shi and Welxuan Xu (2002) present a data mining approach that classifies credit cardholders' behavior through multiple criteria linear programming. They present a model for classifying credit cardholder behavior into two groups (e.g. good or bad), and then into three groups (e.g. bad, normal or good). This model extends the previous LP approaches to classification problems presented by Freed and Glover (1981).

They tested this model on the same sample used in Freed and Glover (1981) and found that the results were consistent with those of Freed and Glover (1981).

3.4.3.5. Neural networks

A neural network is a flexible model that consists of three kinds of layers: input, hidden, and output layers. The input layer first passes a number of inputs (variables) to the hidden layer. The hidden layer applies a weight to each variable; the products are summed and transformed, and the result is passed to the output layer or becomes an input value for another layer, Jiawei Han and Micheline Kamber (2001) and Jih-Jeng Huang, et al (2005).
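The forward computation described above can be sketched in Python as follows. This is a minimal illustration, not the architecture used in any of the cited studies: it assumes one hidden layer, a logistic (sigmoid) transformation in every unit, and bias terms, and all names are hypothetical. Training the weights is not shown.

```python
import math


def sigmoid(z):
    """Logistic transformation (an assumed choice of activation function)."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_output(inputs, weights, biases):
    """Weighted sum of the inputs for each unit, followed by the transformation.

    weights -- one row of weights per unit in the layer
    biases  -- one bias per unit in the layer
    """
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_biases):
    """Pass the inputs through the hidden layer and then the output layer."""
    hidden = layer_output(inputs, hidden_weights, hidden_biases)
    return layer_output(hidden, output_weights, output_biases)
```

With a single output unit, the returned value can be read as a score between 0 and 1 for the applicant.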

There are various architectures of neural networks; more than 50% of applications use the multilayer perceptron, Karel Komorad (2002). David West (2000) investigates the accuracy of credit scores built using five neural network models (multilayer perceptron, mixture of experts, radial basis function, learning vector quantization and fuzzy adaptive resonance) and compares the performance of these five models with that of the most traditional methods, including linear discriminant analysis, logistic regression, k-nearest neighbor, kernel density estimation, and decision trees. The results show that:

- Neural network credit score models improve accuracy by 0.5% up to 3%.

- The multilayer perceptron is not the most accurate neural network model.

- Mixture of experts and radial basis function neural network models should be used for credit scoring.

- Logistic regression is a good alternative to neural networks and is the most accurate of the traditional methods.

Many authors conclude that neural networks produce more accurate results than traditional statistical methods, but that their use in credit scoring is limited by their intrinsic opacity and by their poor performance when irrelevant attributes are incorporated or the sample is small, Rashmi Malhotra and D.K. Malhotra (2001), Baesens B., Egmony M, Castelo R., and Vanthienen J. (2002) and Jih-Jeng Huang, et al (2005).

3.4.3.6. Expert system

An expert system is a computer system that is capable, in its area of application, of sorting and managing expert knowledge, and of handling this knowledge in such a way that it can provide targeted information or perform certain tasks alone, Ferenc Kiss (2003).

Efraim Turban defines an expert system as a system that employs human knowledge captured in a computer to solve problems that ordinarily require human expertise, Efraim Turban (1988).

Tetsuo Tamai and Masayuki Fujita (1987) introduce an expert system for credit card application assessment. They try to simulate the human process in which the credit card analyst uses judgment to distinguish between good and bad customers. They adopt a decision tree as the form of knowledge representation for the profile design and present a decision process that simulates the human process, which they name the “profiling system”. Their method depends on the algorithm developed by R. Quinlan in 1983 as one of the inductive machine learning approaches.

In addition to the profiles obtained from the analysis of past data, they collect some profiles from human experts and call them specific profiles. These profiles do not cover all types of applicants but indicate important patterns to be used in the credit assessment process. In actual operation, the specific profiles are used for screening clear patterns, and then the profiles obtained from the data analysis are applied to give systematic information.

Tamai and Fujita think that the profiling method has some advantages over the scoring method, which applies statistical theories, because:

1- To apply the scoring method, some kind of measure on a linear scale is required for each property of the applicant. If a given property is of a continuous nature, like the amount of income or deposit, there is little problem. But if it has a combinatorial nature, like the status of the home or the industry type of the company the applicant is working with, then some way of quantification needs to be found. It may be easy to assign certain values, but it is not always easy to sort them on a linear scale.

2- As the discriminant function for the scoring method is usually linear, the effect of combinations of properties is treated in a limited way. In reality, there are such judgments as: “for a young person, it is common to live in the parents' apartment, but for a middle aged person with family, it may be considered as a minus point”. Such cases can be treated appropriately in the profiling method.

3- A human assessor does not make judgments according to some kind of scoring process, so it is difficult to verify the scoring method by comparing it with human decision making. The profiling method is natural for human experts to assess and to adjust constructively.

3.4.3.7. Genetic algorithm (GA)

Genetic algorithms are another general heuristic optimization scheme based on biological analogies. GA is a data-driven, nonparametric heuristic search process. It is used to extract intangible relationships in a system and has many applications, such as classification, Karel Komorad (2002), Steven Finlay (2005) and Jih-Jeng Huang, et al. (2005).

Jih-Jeng Huang, et al. (2005) present two-stage genetic programming (2SGP) to deal with the credit score problem. The first stage of GP is employed to derive IF-THEN rules for the decision maker. In the second stage of GP, the reduced data, that is, the data which do not satisfy any rule or satisfy more than one rule, are employed to build the discriminant function. They concluded that 2SGP can improve the accuracy of the credit score model and is superior to the conventional methods.

Steven Finlay (2005) showed that some scoring models built using GA perform as well as models built with a range of other approaches, but that in other cases they perform worse.

3.4.3.8. Classification tree

The idea of a decision tree is to split the set of application answers into different sets and then identify each of these sets as good or bad depending on what the majority in the set is. Building a decision tree involves making three decisions. The first is choosing the split rule. The most common split rules are the Kolmogorov-Smirnov statistic, the basic impurity index, the Gini index, the entropy index, and maximizing the half-sum of squares. The second decision is choosing the stopping rule. One makes a node a terminal node for one of the following reasons: all samples for the given node belong to the same class; the number of samples in the node is so small that it makes no sense to divide it further; there are no samples for the branch test attribute; there are no remaining attributes on which the samples may be further partitioned; or the value of the split measure for the best split into two daughter nodes is hardly any different from its value if the node is kept as is. The third decision is determining how to assign terminal nodes to the good and bad categories, Thomas L. C., David B. Edelman, and Jonathan N. Crook (2002) and Hussein Almuallim, Shigeo Kaneda and Yasuhiro Akiba (2002).
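To make the first and third decisions concrete, the following Python sketch evaluates splits on a single numeric attribute with the Gini index and labels a node by majority. It is an illustration under assumptions: only binary splits of the form "value <= threshold" are considered, labels are the strings "G" and "B", ties in the majority vote go to "B", and all function names are hypothetical. Stopping rules are not implemented.

```python
def gini(labels):
    """Gini impurity of a node with classes "G" and "B": 1 - p_G^2 - p_B^2."""
    n = len(labels)
    if n == 0:
        return 0.0
    p_good = sum(1 for label in labels if label == "G") / n
    return 2.0 * p_good * (1.0 - p_good)


def split_gini(left, right):
    """Size-weighted Gini impurity of the two daughter nodes."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


def best_threshold(values, labels):
    """Return (threshold, impurity) of the best split, or None if no split exists."""
    best = None
    for threshold in sorted(set(values))[:-1]:
        left = [l for v, l in zip(values, labels) if v <= threshold]
        right = [l for v, l in zip(values, labels) if v > threshold]
        impurity = split_gini(left, right)
        if best is None or impurity < best[1]:
            best = (threshold, impurity)
    return best


def majority_label(labels):
    """Assign a terminal node to the class of the majority (ties go to "B")."""
    goods = sum(1 for label in labels if label == "G")
    return "G" if goods > len(labels) - goods else "B"
```

A lower weighted impurity means a purer pair of daughter nodes, so the split with the smallest value is preferred.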

A decision tree can be used as a classification method for determining the appropriate class for a given new applicant. Decision trees have been used by many authors, who compared their performance with discriminant analysis, probit regression, and the logistic model and found that a decision tree provides much better classification accuracy when there are interactions between variables, Eddy L. Ladue and Michael P. Novak (1996), Vladimir Bugera, Hiroshi Konno, and Stanislav Uryasev (2002) and Ferenc Kiss (2003).

3.4.3.9. Rough sets theory

Nikolaos F. Matsatsinis (2002) developed an intelligent decision support system for credit card assessment (CCAS). The model base in CCAS includes a composite rule induction system (CRIS) and rough sets. He concludes that both CRIS and rough sets have advantages and disadvantages.

CRIS can deal with nominal "qualitative" and non-nominal "quantitative" attributes and can handle large data samples. However, the decision tree produced by CRIS may not cover all cases, and once a rule used in classification is satisfied, the other rules are not examined.

Rough sets create decision rules that are simple, clear and independent of the sample size, but the decision rules may not cover all cases.

3.4.3.10. Analytical hierarchy process (AHP)

AHP is based on the observation that when decision makers start to make a decision, they are faced with a complicated system of factors. When the elements of the system and their relationships are reviewed together, they naturally divide into groups based on certain characteristics. By repeating this process several times, the characteristics that define the groups are further examined as the elements of a further level of the knowledge system. By classifying these elements according to another criterion we create a new, higher level of the hierarchy, until we finally reach the uppermost element of the system, which represents the general description of the decision making problem, Ferenc Kiss (2003).

3.5. Comparisons of techniques used to build credit score

Many techniques have been developed for credit scoring, and there is no agreement on which method should be used to build a credit score model because, Thomas L. C., David B. Edelman, and Jonathan N. Crook (2002):

- Commercial consultancies have a tendency to identify the method they use as the best.

- Comparisons by academics cannot reflect exactly what happens in the industry, since some of the significant data, like credit bureau data, are too sensitive to be passed on to them by the users.

In general, no single technique should always be used to build a credit score model. Each technique has its advantages and disadvantages. The technique used must suit the problem at hand, Liu, Y. (2002) and Jih-Jeng Huang, et al (2005).

Building a credit score model is a complex and non-standardized process, so no solution can be optimal for all cases. Perfect separation is impossible because the sample data may be inaccurate, or good and bad applicants may have exactly the same characteristics, Vladimir Bugera, Hiroshi Konno, and Stanislav Uryasev (2002) and Liu, Y. (2002).

Chapter four

Decision support system

This chapter consists of two parts. The first gives an overview of decision support systems and the second describes the proposed decision support system.

Part I: Introduction to decision support system

4. I.1 Definition of decision support system (DSS)

In the late 1970s, a number of companies developed interactive information systems that used data and models to help managers solve semi-structured problems. These systems were called decision support systems (DSS), D.J.Power (2000).

There are several definitions of a decision support system. Each one defines it from a specific aspect of the decision making process and reflects its author's point of view, B.Ravindranath (2002) and Gachet, A. (2001).

The early definitions of a decision support system, which were open to several interpretations, identified it as a system intended to support managerial decision makers in semistructured decision situations, Turban (1988) and Efraim Turban and Jay.E. Aronson (2002).

One widely accepted definition is that a DSS is an interactive system that provides the user with easy access to data and decision making processes in order to support unstructured and partly structured tasks, B.Ravindranath (2002).

Turban (1988) and Efraim Turban and Jay.E. Aronson (2002) compared and contrasted the various definitions of a decision support system by examining the concepts used to define DSS, and summarized the results in table 4.I.1:

Source                           DSS defined in terms of

Gorry and Scott-Morton (1971)    Problem type, system function (support)

Little (1970)                    System function, interface characteristics

Alter (1980)                     Usage pattern, system objectives

Moore and Chang (1980)           Usage pattern, system capabilities

Bonczek et al. (1980)            System components

Keen (1980)                      Development process

Table 4.I.1: Various definitions of DSS

Little (1970) defines a DSS as a model-based set of procedures for processing data and judgments to assist a manager in his decision making. (Little's definition was a refinement of Gorry and Scott-Morton's definition.)

Alter (1980) defines DSS by contrasting them with traditional electronic data processing (EDP) systems on five dimensions. The result of this contrast is given in table 4.I.2.

Dimension       DSS                          EDP

Use             Active                       Passive

User            Line and staff management    Clerical

Goal            Effectiveness                Mechanical efficiency

Time horizon    Present and future           Past

Objective       Flexibility                  Consistency

Table 4.I.2: The result of contrasting

Moore and Chang (1980) define DSS as extendible systems, capable of supporting ad hoc data analysis and decision modeling, oriented toward future planning, and used at irregular, unplanned intervals.

Bonczek et al. (1980) define a DSS as a computer-based system consisting of three interacting components: a language system (a mechanism to provide communication between the user and other components of the DSS), a knowledge system (a repository of problem domain knowledge embodied in the DSS as either data or procedures), and a problem processing system (a link between the other components, containing one or more of the general problem manipulation capabilities required for decision making).

Keen (1980) defines a DSS as the product of a developmental process in which the DSS user, the DSS builder, and the DSS itself are all capable of influencing one another, resulting in an evolution of the system and of the pattern of its use.

Turban (1988) and Efraim Turban and Jay.E. Aronson (2002) summarize the results of their comparison as follows:

1- The basis for defining DSS has been:

Developed from perceptions of what a DSS does (such as supporting decision making in unstructured problems).

Developed from ideas about how the DSS's objective can be accomplished (such as the components required, the appropriate usage pattern, and the necessary development processes).

2- These definitions do not provide a consistent focus because each tries to narrow the population in a different way.

3- These definitions collectively ignore the central purpose of DSS, which is to support and improve decision making.

Turban (1988) formulates his working definition of a DSS as an interactive computer-based system that utilizes decision rules and models coupled with a comprehensive database and the decision maker's own insights, leading to specific, implementable decisions in solving problems that would not be amenable to management science optimization models per se.

Motaz Khorshid (2004) defines the DSS as an advanced computer-aided information technology, used to support complex decision making, problem solving, policy testing, scenario simulation and strategic planning.

Since there is no agreement on the definition of DSS, B.Ravindranath (2002) presents the expectations from a DSS as follows:

1- It should provide reliable information to support decision making.

2- It should be able to handle unexpected problems by performing the necessary analysis and using suitable models.

3- It should make the support available when it is needed.

4- It should evolve over time as user needs and information change.

5- People should be able to use it easily.

4. I.2 Characteristic and capabilities of DSS

The major feature of a DSS that distinguishes it from other computer-aided systems, such as management information systems (MIS) and expert systems (ES), is that it incorporates quantitative tools, Motaz Khorshid (2004).

Turban (1988) and Efraim Turban and Jay.E. Aronson (2002) summarize the major DSS characteristics and capabilities and conclude that, because there is no agreement on the definition of DSS, there is no agreement on the characteristics and capabilities of DSS.

The major DSS characteristics and capabilities are the following:

1- DSS provide support for decision makers in solving semistructured and unstructured problems by bringing together human judgment and computerized information.

2- DSS provide support to all managerial levels.

3- DSS support individuals or groups.

4- DSS provide support for several interdependent and/or sequential decisions.

5- DSS support all phases of the decision making process (intelligence, design, choice, and implementation).

6- DSS support a variety of decision making processes and styles.

7- DSS are adaptive over time. The users can add, delete, combine, change, or rearrange basic elements.

8- User friendliness.

9- DSS attempt to improve the effectiveness of decision making (accuracy, timeliness, quality) rather than its efficiency (the cost of making decisions).

10- The decision maker has complete control over all steps of the decision making process in solving a problem.

11- End users should be able to construct or modify simple systems by themselves.

12- A DSS usually utilizes models for analyzing decision making situations.

13- DSS provide access to a variety of data sources, formats, and types.

14- DSS can be integrated with other DSS and applications, and distributed using web technologies.

No DSS can exhibit all the above characteristics. One should try to build a DSS that incorporates as many of these characteristics as possible.

The capabilities and characteristics are summarized in figure 4.I.1.

Figure 4.I.1: DSS characteristics

4. I.3 Decision support system components

A DSS does not have a monolithic structure. It consists of a few

subsystems, which have to interact with each other, B.Ravindranath

(2002).

There are different identifications of the components of a DSS. Gachet A. (2001) presents some of them as follows:

1- Sage (1991) identifies three components of DSS

i. Data base management system (DBMS)

ii. Model base management system (MBMS)

iii. Dialog generation and management system (DGMS)

2- Hattenschwiler (1999) identifies five components of DSS

i. Users with different roles or functions in the decision making process (decision maker, advisors, domain experts, system experts, data collectors)

ii. A specific and definable decision context

iii. A target system describing the majority of the

preferences

iv. A knowledge base made of:

1. External data sources, knowledge database,

working database, data warehouses and meta-

database.

2. Mathematical models and methods

3. Procedures, inference and search engines

4. Administrative programs and reporting systems

v. A working environment for the preparation, analysis

and documentation of decision alternatives.

3- Power (2000) identifies four components of DSS

i. The user interface

ii. The database

iii. The model and analytical tools

iv. The DSS architecture and network

Efraim Turban & Jay.E. Aronson (2002) define the components of DSS as follows: a decision support system is composed of the following four components, as shown in figure 4.I.2:

Figure 4.I.2: Component of DSS

1- Data management subsystem

The data management subsystem is composed of the following elements, as shown in figure 4.I.3:

Figure 4.I.3: The structure of data management subsystem

a) DSS database

The DSS database holds data extracted for the problem under consideration. These data are collected from:

- Internal sources

They come from the organization's daily transactions.

- External sources

They include different types of data, such as national economic data.

- Personal (private) data.

b) Database management system (DBMS)

The DBMS is the software that manages the data management subsystem. It stores data in the database, retrieves data from the database, and controls the database.

c) Data directory

The data directory contains the definitions of the data and their functions. Its main function is to answer questions about the availability of data items, their sources, and their meaning.

d) Query facility

The query facility provides the basis for accessing, manipulating, and querying the data.
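The role of the data directory and the query facility can be illustrated with a small Python sketch. Everything in it is hypothetical: the item names, sources and meanings are invented for the example, and a real DSS would keep this information in the DBMS rather than in an in-memory dictionary.

```python
# Hypothetical data directory: every entry below is invented for illustration.
DATA_DIRECTORY = {
    "monthly_income": {
        "source": "internal",
        "meaning": "Applicant's declared monthly income",
    },
    "inflation_rate": {
        "source": "external",
        "meaning": "National annual inflation rate",
    },
}


def is_available(item):
    """Answer whether a data item is known to the DSS."""
    return item in DATA_DIRECTORY


def describe(item):
    """Return the meaning and source of a data item, or None if it is unknown."""
    entry = DATA_DIRECTORY.get(item)
    if entry is None:
        return None
    return f"{item}: {entry['meaning']} (source: {entry['source']})"


def items_from(source):
    """Simple query: list the data items that come from a given kind of source."""
    return sorted(
        name for name, entry in DATA_DIRECTORY.items() if entry["source"] == source
    )
```

The three functions correspond to the three questions the data directory is meant to answer: whether an item is available, where it comes from, and what it means.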

2- Model management subsystem

The model management subsystem is composed of the following elements, as shown in figure 4.I.4:

Figure 4.I.4: The structure of model management subsystem

a) Model base

The model base contains quantitative models (statistical, financial, etc.) which provide the DSS with its capabilities.

The models in the model base may be:

- Strategic models, which are used to support top management.

- Tactical models, which are used by middle management.

- Operational models, which support day-to-day working activities.

- Analytical models, which are used to perform analysis on data.

b) Model base management system (MBMS)

MBMS is software used for:

- Model creation using

o Programming languages.

o DSS tools and/or subroutines

o Other building blocks.

- Generation of new routines and reports.

- Models updating and changing.

- Model data manipulation.

c) Model directory

The model directory contains the definitions of the models and describes their functions and capabilities.

d) Model execution, integration, and command processor

- Model execution controls the running of the model.

- Model integration involves combining the operation of several models or integrating the DSS with other applications.

- The command processor interprets modeling instructions from the user interface component and routes them to the MBMS, model execution, or integration functions.

3- Knowledge-based subsystem

Many problems require expertise for their solution. A knowledge-based subsystem can be added to the DSS to provide the required expertise.

4- User interface subsystem

The user works with the DSS through this subsystem. It covers all aspects of communication between the users and the DSS. It includes:

- The hardware

- The software

- Factors that deal with ease of use, accessibility and human-machine interaction.

The user interface subsystem is managed by software called the user interface management system (UIMS).

Motaz Khorshid (2004) defines the components of DSS as follows: a DSS model comprises four main components (figure 4.I.5):

1- Database management capabilities with access to internal and external data, information and knowledge.

2- Modeling functions accessed through a model management system.

3- A powerful, yet simple user interface design that enables interactive queries, reporting, and graphing functions.

4- The decision maker's own insights.

Figure 4.I.5: A conceptual model of DSS

[Figure components: database and database management system (DBMS); model base and model base management system (MBMS); knowledge management; interactive user interface system; decision user and policy analyst; other computer-based software systems]

4. I.4 Decision support system application (type, classification,

taxonomy of DSS)

There are several ways to classify DSS applications, and different authors propose different classifications, Gachet A., (2001).

1- Alter's (1980) classification, Efraim Turban & Jay.E. Aronson, (2002) and B.Ravindranath (2002).

Alter's (1980) classification is based on the purpose for which a DSS is expected to be used, measured by:

- The degree of action implication of system outputs.

- The extent to which system outputs can directly support (or determine) the decision.

According to this classification, there are seven categories of DSS:

a) File drawer systems.

File drawer systems provide the user with organized information regarding specific demands.

b) Data analysis systems.

Data analysis systems provide different paths or alternative methods to meet a given situation.

c) Analysis information systems.

Analysis information systems are used for building the data warehouse in any large organization.

d) Accounting models.

Accounting models are used for accounting purposes.

e) Representation models.

Representation models are used in forecasting future trends.

f) Optimization models.

Optimization models are used to allocate resources when they are restricted.

g) Suggestion models.

Suggestion models are used for operational purposes.

The first two types are data oriented, performing data retrieval or

analysis. The third deals with both data and models. The remaining four are

model-oriented, providing simulation capabilities, optimization, or

computations that suggest an answer.

2- Holsapple and Whinston's classification (1996), Efraim Turban

& Jay.E. Aronson, (2002) and B.Ravindranath (2002).

Holsapple and Whinston (1996) classify DSS into six frameworks:

a) Text oriented DSS

In this type of DSS, information is drawn from reports, statements and technical observations, stored in a textual format, and accessed by decision makers.

b) Database oriented DSS

In this type of DSS, the database organization plays a major role in the DSS structure.

c) Spreadsheet oriented DSS

A spreadsheet is a modeling language that allows the user to write models to execute DSS analysis. This type of DSS enables the user to get organized information in a framed document.

d) Solver oriented DSS

A solver is an algorithm or procedure written as a computer program for performing certain computations and giving quantitative solutions to a particular problem type; a solver oriented DSS is built around such solvers.

e) Rule oriented DSS

In this type of DSS, the knowledge component of the DSS includes both procedural and inferential (reasoning) rules.

These rules can be qualitative or quantitative, and such a component can replace quantitative models or be integrated with them.

f) Compound (hybrid) DSS

This type of DSS is a hybrid system that includes two or more of the five basic structures described above.

3- Power (1997) classification

At the technical level, Power differentiates between the following, Gachet, A. (2001):

a. Enterprise-wide DSS

Enterprise-wide DSS are linked to large data warehouses and serve many managers in a company.

b. Desktop DSS

Desktop, single-user DSS are small systems that reside on an individual manager's PC.

4- Hattenschwiler (1999) classification, Gachet, A. (2001)

a. Passive DSS

A passive DSS is a system that cannot bring out decision suggestions or solutions.

b. Active DSS

An active DSS can bring out such decision suggestions or solutions.

c. Cooperative DSS

A cooperative DSS allows the decision maker (or their advisor) to modify, complete, or refine the decision suggestions provided by the system, before sending them back to the system for validation.

5- Power (2000) classification

a. Communication Driven DSS

A communication-driven DSS supports more than one person working on a shared task.

b. Data-Driven DSS

Data-driven, or data-oriented, DSS emphasize access to and manipulation of a time series of internal company data and external data.

c. Document-Driven DSS

Document-driven DSS manage, retrieve and manipulate unstructured information in a variety of electronic formats.

d. Knowledge-Driven DSS

Knowledge-driven DSS provide specialized problem-solving expertise stored as facts, rules, procedures, or in similar structures.

e. Model-Driven DSS

Model-driven DSS use data and parameters provided by decision makers to aid them in analyzing a situation, but they are not necessarily data intensive.

6- Classification based on usage modes, B.Ravindranath (2002).

DSS can be classified according to how they are put to use, as follows:

a. Subscription mode.

Any DSS that produces its outputs in the form of reports is considered to be working in subscription mode.

b. Clerk mode.

The system behaves like an inquiry clerk in a booking office or a library.

c. Terminal mode.

In this mode the system is loaded on the user's personal computer.

d. Intermediary mode.

When the DSS becomes complicated, an expert from the MIS department helps the user to draw information from it; this expert is called an intermediary.

e. Institutional DSS.

In this system the information required for the routine administration of an organization is stored.

7- Special and general purpose DSS, Motaz Khorshid (2004).

a. Special purpose DSS.

Special purpose DSS concentrate on either a specific problem or a specific tool of decision analysis (for example, DSS organized around computer simulation techniques).

b. General purpose DSS.

General purpose DSS are based on a specific analytical tool or computational model (for example, DSS tools based on optimization models).

8- Other classifications, Efraim Turban and Jay.E. Aronson (2002).

a) Institutional and ad hoc

i. Institutional DSS

Institutional DSS deal with decisions of a recurring nature, e.g. a portfolio management system. Because they are used to solve identical or similar problems, institutional DSS can be developed over many years.

ii. Ad hoc DSS

Ad hoc DSS deal with specific problems that are not repeated.

b) Personal, group, and organizational support

i. Personal support

A personal DSS can be used by an individual user or a group of users to solve a specific problem, e.g. selecting stocks.

ii. Group support

A group DSS can be used by a group of users, each of whom works individually, to solve interrelated problems. For example, a DSS may serve many users in a finance department; the decisions are made individually, but the users check the impact of their decisions on others.

iii. Organizational support

An organizational DSS helps many users who work in the same organization but in different functional areas.

c) Individual DSS vs. a group support system

i. Individual DSS

ii. Group decision support system

In a group decision support system (GSS) the decisions are made by a group.

d) Custom made vs. ready made systems

i. Custom made

A DSS may be built for individual users and organizations.

ii. Ready made

A ready made DSS is a generic DSS that can be used in several organizations. It is useful when the same problem occurs in similar organizations or in the same functional area of different organizations.

4. I.5 Constructing a decision support system

There is no single way to construct a DSS because:

1- There are several types of DSS.

2- There are differences in organizations, decision makers, and DSS problem areas.

The DSS architecture must ensure the following points, B.Ravindranath (2002):

1- Information can be transferred from one source to another.

2- There is provision for future extension and for the addition of new activities to the DSS being designed.

3- The architecture is compatible with the existing computerized systems, such as the MIS.

4- The architecture assists management in discussing with vendors the suitability of the systems offered.

5- The architecture helps in estimating the cost of the system.

Efraim Turban (1988) describes all the activities needed to build a complex DSS; it is not necessary to follow every step for every DSS. The phases for building a DSS are summarized in figure 4.I.6.

Figure 4.I.6: Phases in building a DSS

[Figure phases, in order: Planning (need assessment, problem diagnosis, objectives of DSS); Research (how to address user needs? what resources are available?); Analysis (what is the best development approach? what are the necessary resources? define normative models); Design (model base, DSS database, user interface); Construction (putting together the DSS); Implementation (testing and evaluation, demonstration, orientation, training, and deployment); Maintenance; Adaptation (continually repeat the process to improve the system)]

1- Planning

The planning phase involves:

- Need assessment.

- Problem diagnosis.

- Defining the objectives and goals of the decision support system.

2- Research

The research phase involves:

- Identifying how the user needs can be satisfied.

- Identifying the available resources.

3- Analysis

The analysis phase involves:

- Identifying the best approach to meet the user needs.

- Identifying the resources required to meet the user needs.

4- Design

The design phase involves:

- Designing the database and its management.

- Designing the model base and its management.

- Designing the user interface.

5- Construction

In the construction phase, the components of the DSS are put together.

6- Implementation

i. Testing

In the testing phase, information about the system performance is collected and compared with the design specification.

ii. Evaluation

In the evaluation phase, we determine whether the implemented system satisfies the user needs.

iii. Demonstration

In the demonstration phase, the capabilities of the fully operational system are explained to the user.

iv. Training

v. Deployment

7- Maintenance

The maintenance phase involves planning for the continuing maintenance of the system.

8- Adaptation

The adaptation phase involves identifying changes in the user needs and adapting the DSS accordingly.

4. I.6 DSS technologies levels and tools

There are three levels of DSS technology, Efraim Turban and Jay.E. Aronson (2002): specific DSS (DSS applications), DSS integrated tools (generators or engines), and DSS primary tools.

Distinguishing these levels is important for:

- Understanding the development of DSS.

- Developing a framework for their use.

1- Specific DSS (DSS applications)

A specific DSS (SDSS) is the final DSS that is used by the users to meet their needs.

2- DSS integrated tool (generator or engine)

A generator is a software package used to build a specific DSS quickly, inexpensively, and easily.

3- DSS primary tools

DSS tools are the lowest level of DSS technology; they are used to facilitate the development of either a DSS generator or a specific DSS.

4. I.6.1 Relationships among the technologies levels

Efraim Turban and Jay.E. Aronson (2002) present the relationship among the three levels of DSS technology as follows:

- The tools are used to construct generators, which in turn are used to construct specific DSS. Using a DSS generator helps in constructing a specific DSS and makes it possible to update the DSS when changes occur.

- Tools can be used to construct a specific DSS directly, but this may be very lengthy and expensive.

4. I.6.2 Future trends of decision support system

Four tools have emerged for building DSS (in addition to models and the model base management system), Motaz Khorshid (2004). These tools are:

1- Data warehouse.

A data warehouse is a subject oriented, integrated, time variant, nonvolatile collection of data.

2- On line analytical processing (OLAP).

OLAP is software that enables analysts, managers, executives or decision makers to gain insight into data through fast, consistent, interactive access to a wide variety of possible views of information that has been transformed from raw data to reflect the real dimensionality of the enterprise as understood by the user.

3- Data mining.

Data mining is a set of artificial intelligence and statistical tools used for more sophisticated data analysis.

4- World Wide Web.

A web-based DSS is a computerized system that delivers decision support information or decision support tools to a manager or analyst using a Web browser.

The frequent use of various DSS tools in decision making and problem solving has contributed to the development of new trends and approaches, as follows, Motaz Khorshid (2004):

- One trend is the increasing sophistication of model centered DSS.

- Another trend is the development of collaborative support systems (GDSS). A group decision support system (GDSS) is a DSS specially designed to facilitate and enhance the communication related activities of team members engaged in cooperative work.

- A third and important trend is the active decision support system (ADSS). An ADSS is a system in which the computer and the user work as partners in the problem solving process.

4. I.7 Approaches to DSS construction

There are several approaches to DSS construction. Efraim Turban (1988) classified these approaches into three categories.

4. I.7.1 Quick hit

In the quick hit approach, a specific DSS is constructed relatively quickly to address a difficult problem.

The advantages of the quick hit approach:

- The costs and risks are low.

- The latest technologies can be used.

- Commercially available generators can be used.

The disadvantages of the quick hit approach:

- Quick hit DSS are usually constructed for one person or for one purpose.

- Quick hit DSS do not relate to other DSS.

- Little of the experience gained carries over to the next DSS.

4. I.7.2 Staged development

In the staged development approach, the construction of a specific DSS depends on advance planning.

4. I.7.3 Complete DSS

A complete DSS requires:

- Development of a full service, large scale DSS generator.

- Large scale specific DSSs.

- An organizational unit to manage such a project.

Selecting an appropriate approach depends on:

- The organization

- Purpose of DSS

- Tasks

- Available tools

- Builders

4. I.8 Alternate development methodologies

DSS can be developed using several methodologies, all of which are based on the traditional system development life cycle (SDLC). The choice of the method used to build a DSS depends on whether the DSS will be built by the end user or by a DSS team, Efraim Turban (1988) and Efraim Turban and Jay.E. Aronson (2002).

4. I.8.1 Parallel development (traditional methodologies)

Parallel development methodologies follow the traditional SDLC (planning, design, construction, and implementation). In parallel development the design and implementation phases are split into multiple copies, each of which deals with a separate subsystem.

This design strategy depends on the assumption that the required information can be predetermined. The assumption does not hold, because the users learn more about the problem as they go and need to identify new information. So there is a need to depart from the traditional SDLC.

4. I.8.2 Rapid application development (RAD) methodologies

In rapid application development methodologies, the SDLC is adjusted in order to provide the user with parts of the system quickly.

Rapid application development methodologies include three methods:

4. I.8.2.1 Phased development

In phased development, the system is broken into a series of versions, developed sequentially, each of which has more functionality than the previous one.

The advantage: users gain functionality quickly.

The disadvantage: the users start with an incomplete system.

4. I.8.2.2 Prototyping (evolutionary, iterative)

The majority of DSS are built using the evolutionary prototyping approach. The prototyping approach builds the DSS in a series of steps, with direct feedback from users at each step to modify the system if needed. Therefore, DSS tools must be flexible enough to permit changes quickly and easily. In prototyping, the analysis, design and implementation phases are performed at the same time and repeatedly. It starts with overall planning, and then the analysis, design and prototype implementation phases are performed iteratively until a small prototype is developed.

Prototyping involves the following steps:

1- Select an important sub problem (by the user and builder).

2- Develop a small but usable system to assist the decision maker.

3- Evaluate the system constantly.

4- Refine, expand, and modify the system in cycles.

These steps are repeated until a stable system evolves. If the prototype is suitable for the users, the formal implementation of the DSS can be performed.

4. I.8.2.3 Throwaway prototyping

Throwaway prototyping is often used to understand the users' needs and the system requirements. It is similar to both the traditional and the prototyping approaches: it performs a complete analysis, as in the SDLC, and designs prototypes to assist in understanding more about the system being developed.

4. I.9 Team developed vs. user developed DSS

The distinction between team developed and user developed DSS is theoretical; in practice a mixture of the two methods is often used, Efraim Turban (1988) and Efraim Turban and Jay.E. Aronson (2002).

4. I.10 DSS development platforms

There are several basic DSS development software platforms, Efraim Turban and Jay.E. Aronson (2002). The most important are the following:

1- Write a customized DSS in a general purpose programming language such as Visual Basic.

2- Use a fourth generation language, such as a data-oriented language, spreadsheets, or a financial oriented language.

3- Use OLAP with a data warehouse or a large database.

4- Use a DSS integrated development tool (generator or engine).

5- Use a domain specific DSS generator. Domain specific DSS generators are designed to build a highly structured system.

6- Develop the DSS using a CASE methodology.

7- Develop a complex DSS by integrating several of the above approaches.

Links to the web can be integrated into any of the previous platforms.

4. I.11 Issues associated with DSS

Gachet, A. (2001) analyzes the reasons why DSS are not yet widely used and proposes solutions for some of them. He groups the reasons for the low interest in DSS in practice into three main categories:

1- Human factors

Human factors cover the reasons why users and decision makers oppose computerized decision-making systems.

2- Conceptual factors

Conceptual factors cover the problems encountered by DSS because of wrong or incomplete choices made during the design of the systems.

3- Technical factors

Technical factors cover the problems encountered by DSS that relate purely to software or hardware considerations.

Part II: The proposed decision support system

4. II.1 Introduction

In this part we present the proposed decision support system, which helps the decision maker in the credit card center to assess new applicants in order to decide whether or not the bank will issue a credit card to the applicant.

4. II.2 The proposed decision support system

The proposed DSS helps the decision maker decide whether to accept or reject new applicants. First, the decision maker checks whether the new applicant is on the negative list in the credit bureau, and then checks whether there is any comment about the applicant in the private data. The decision maker then uses the Bayesian model or the MSD model and enters the required data to assess the new applicant.
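The order of checks described in this paragraph can be sketched as a small Python function. It is a minimal illustration, not the thesis implementation: the function name `assess_applicant`, its parameters, and the choice to reject automatically when the applicant is on the negative list are assumptions made here for the example. The scoring model is passed in as a callable, so either the Bayesian or the MSD model could be plugged in.

```python
def assess_applicant(applicant_id, negative_list, private_comments, score_model, features):
    """Return a (decision, reason) pair for one applicant (hypothetical sketch)."""
    # Step 1: credit bureau check. Automatic rejection is an assumption of this sketch.
    if applicant_id in negative_list:
        return "reject", "applicant is on the credit bureau negative list"

    # Step 2: look up any comment stored in the private data.
    comment = private_comments.get(applicant_id)

    # Step 3: classify with the chosen credit score model ("good" or "bad").
    label = score_model(features)
    decision = "accept" if label == "good" else "reject"
    reason = f"model classified applicant as {label}"
    if comment:
        # The comment is surfaced to the analyst; it does not change the decision here.
        reason += f"; private note: {comment}"
    return decision, reason
```

A call such as `assess_applicant("A2", set(), {"A2": "VIP"}, model, data)` returns the model's decision together with the private note, leaving the final judgment to the credit analyst.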

The DSS was constructed using the quick hit approach. The key decision of the proposed decision support system is to classify a new applicant for a credit card into one of two predefined classes (good or bad) and so determine whether the bank will issue a credit card to this customer or deny the application. To achieve this key objective, we build the credit score model using a composite rule induction system, Bayesian classification and linear programming.

The structure of the proposed DSS is given in figure 4.II.1:

Figure 4.II.1: The proposed DSS

[Figure components: the DSS database (customer's data, private data, information from the credit bureau), the model base (CRIS, linear programming, naïve Bayesian classification), the user interface, and the credit analyst]

4. II.3 Building the proposed decision support system

To build the credit score model we need a sample of the current customers. This sample should include the characteristics of the customers: age, gender (female, male), marital status (single, divorced, widowed, married), education level (diploma, graduate, postgraduate), occupation (self-employed, employee), experience, home ownership type (own, rent), home phone (yes, no), bank account (yes, no), credit card (yes, no), home years and income. We also need the information from the credit bureau and the private data.

The sample was classified into two classes, "good" and "bad". A customer was classified as bad if the number of consecutive missed payments was greater than or equal to six months; otherwise the customer was classified as good.
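The labeling rule for the sample (bad when six or more consecutive monthly payments were missed, good otherwise) can be written as a short Python function. The function name `label_customer` and the input format (one boolean per month, `True` meaning the payment was missed) are assumptions made for this sketch.

```python
def label_customer(missed_flags, threshold=6):
    """Label a customer from a monthly history of missed payments.

    missed_flags: iterable of booleans, one per month, True if the payment was missed.
    threshold: number of consecutive missed months that makes a customer "bad".
    """
    longest_run = 0
    current_run = 0
    for missed in missed_flags:
        # Extend the current run of missed payments, or reset it on a paid month.
        current_run = current_run + 1 if missed else 0
        longest_run = max(longest_run, current_run)
    return "bad" if longest_run >= threshold else "good"
```

Note that ten missed payments split into two runs of five still give the label "good", because the rule counts consecutive months only.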

4. II.3.1 Decision support system database

The DSS database consists of three parts:

1- Customers data

The customer table contains the characteristics age, gender, marital status, education level, occupation, experience, home ownership type, home phone, bank account, credit card, home years and income, together with the classification of the customers.

2- Credit bureau data

The credit bureau data indicate whether or not the customers have a credit problem with other banks.

3- Private data

The private data contain information collected from the credit card analysts, including any comment about a current customer or a new applicant (i.e. a profile of the customer, such as VIP customers or bad customers).

The DSS database is shown in figure 4.II.2.

Figure 4.II.2: DSS database

[Figure components: credit bureau, the customer's data, the private data]

4. II.3.2 Model base for the proposed DSS

The model base for the proposed DSS contains a composite rule induction system (CRIS), naïve Bayesian classification and linear programming (MSD model). The model base is managed by WinQSB and SPSS.

The model base and model base management are shown in figure 4.II.3:

Figure 4.II.3: The model management subsystem

[Figure components: WinQSB and SPSS managing the model base, which holds CRIS, linear programming and naïve Bayesian classification]

4. II.3.2.1 A composite rule induction system (CRIS)

A composite rule induction system is a knowledge acquisition system. CRIS accepts a set of data as input and produces "if…then" rules to interpret the data. CRIS consists of three steps, Nikolaos F. Matsatsinis and C. Erik Larson (2004) and Ting-Peng Liang (1992):

1- Hypothesis generation

2- Probability assessment

3- Rule scheduler

The interaction between the hypothesis generator and the probability calculator generates candidate rules, which form the rule space and are organized by the rule scheduler.

The CRIS mechanism can be summarized as follows:

1- Hypothesis generation

Hypothesis generation is responsible for determining the causal relationships between the dependent attribute (the class, "good" or "bad") and the independent attributes (gender, education, etc.).

For nominal attributes, the values simply identify different properties, and their mean and variance do not provide useful information. CRIS adopts a cross-tabulation approach to determine the relationship between the nominal attributes (gender, education, etc.) and the dependent attribute (good or bad).

Let:

$Y$ refer to the class (good or bad), $Y \in \{G, B\}$

$f_G$ be the number of good customers in the sample

$f_B$ be the number of bad customers in the sample

$f_{jkG}$ be the number of good customers that have the attribute value $v_{jk}$

$f_{jkB}$ be the number of bad customers that have the attribute value $v_{jk}$

$f_{jk}$ be the number of customers (good and bad) that have the attribute value $v_{jk}$, $f_{jk} = f_{jkG} + f_{jkB}$

The cross table is given in table 4.II.1:

Attribute j    Class G      Class B      Total
$v_{j1}$       $f_{j1G}$    $f_{j1B}$    $f_{j1}$
$v_{j2}$       $f_{j2G}$    $f_{j2B}$    $f_{j2}$
...            ...          ...          ...
$v_{jk}$       $f_{jkG}$    $f_{jkB}$    $f_{jk}$
...            ...          ...          ...
$v_{jz}$       $f_{jzG}$    $f_{jzB}$    $f_{jz}$
Total          $f_G$        $f_B$

Table 4.II.1: The frequency table

To generate the hypotheses we repeat the following step until all hypotheses are generated for all nominal attributes.

For each $a_j = v_{jk}$, $k = 1, 2, \ldots, z$:

o if $f_{jkG} > f_{jkB}$, formulate the hypothesis: If $a_j = v_{jk}$ then $Y = G$

o if $f_{jkG} < f_{jkB}$, formulate the hypothesis: If $a_j = v_{jk}$ then $Y = B$

Note that:

If there is a tie, all possible hypotheses are generated.

The total number of hypotheses to be generated for attribute $j$ is $z$ plus the number of ties.
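The hypothesis generation step can be illustrated with a short Python sketch. It assumes the frequency table of one nominal attribute is held in a dictionary that maps each attribute value to its pair of counts (good, bad); the function name `generate_hypotheses` and this data layout are choices made for the example only. A tie produces both hypotheses, as stated in the note above.

```python
def generate_hypotheses(attribute, freq):
    """Generate CRIS-style hypotheses for one nominal attribute.

    freq maps each attribute value to a pair (good_count, bad_count).
    Each hypothesis is a tuple (attribute, value, predicted_class).
    """
    hypotheses = []
    for value, (f_good, f_bad) in freq.items():
        # More good than bad customers (or a tie): "if attribute = value then G".
        if f_good >= f_bad:
            hypotheses.append((attribute, value, "G"))
        # More bad than good customers (or a tie): "if attribute = value then B".
        if f_bad >= f_good:
            hypotheses.append((attribute, value, "B"))
    return hypotheses
```

For an attribute with three values of which one is tied, the function returns four hypotheses, which matches the count of z plus the number of ties.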

2- Probability assessment

The purpose of the probability assessment is to calculate the probability associated with each rule.

- The probability $P(G \mid a_j = v_{jk})$ of the hypothesis "If $a_j = v_{jk}$ then $Y = G$" and the probability $P(B \mid a_j = v_{jk})$ of the hypothesis "If $a_j = v_{jk}$ then $Y = B$" are conditional probabilities; each indicates the likelihood that the conclusion is true if the condition of the hypothesis is met. They can be calculated from:

o The prior probabilities of the classes, $P(Y = G)$ and $P(Y = B)$

o The other conditional probabilities, $P(a_j = v_{jk} \mid G)$ and $P(a_j = v_{jk} \mid B)$, i.e. the probability that the value of attribute $j$ is $v_{jk}$ given that the customer belongs to the specific class

Let:

- $p_G$ be the probability that an arbitrary customer is good, $p_G = f_G / m$

- $p_B$ be the probability that an arbitrary customer is bad, $p_B = f_B / m$

where $m$ is the number of customers in the sample.

- $P(G \mid a_j = v_{jk})$ be the probability that a customer is good given that the value of attribute $j$ is $v_{jk}$.

- $P(B \mid a_j = v_{jk})$ be the probability that a customer is bad given that the value of attribute $j$ is $v_{jk}$.

From Bayes' theorem these probabilities can be calculated as follows:

$$P(G \mid a_j = v_{jk}) = \frac{p_G \, P(a_j = v_{jk} \mid G)}{p_G \, P(a_j = v_{jk} \mid G) + p_B \, P(a_j = v_{jk} \mid B)}$$

$$P(B \mid a_j = v_{jk}) = \frac{p_B \, P(a_j = v_{jk} \mid B)}{p_G \, P(a_j = v_{jk} \mid G) + p_B \, P(a_j = v_{jk} \mid B)}$$

For the nominal attributes, information about the data distribution is unavailable. Hence, the conditional probability is assessed by its relative frequency of occurrence in the training data.

Because both the numerator and the denominator are divided by the same constant (the total number of occurrences), the two previous equations can be simplified as follows:

$$P(G \mid a_j = v_{jk}) = \frac{p_G \, f_{jkG}}{p_G \, f_{jkG} + p_B \, f_{jkB}}$$

$$P(B \mid a_j = v_{jk}) = \frac{p_B \, f_{jkB}}{p_G \, f_{jkG} + p_B \, f_{jkB}}$$
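As a small illustration, the simplified expressions, as written in this section, weight the good and bad counts of an attribute value by the class priors and normalize. The Python sketch below evaluates them for one attribute value. The function name `rule_probabilities` is hypothetical, and taking the sample size as the sum of the good and bad customer counts is an assumption of the sketch.

```python
def rule_probabilities(f_jk_good, f_jk_bad, f_good, f_bad):
    """Evaluate the simplified rule probabilities for one attribute value.

    f_jk_good, f_jk_bad: good and bad customers having the attribute value.
    f_good, f_bad: good and bad customers in the whole sample.
    Returns (probability of good, probability of bad) given the attribute value.
    """
    m = f_good + f_bad                      # sample size (assumed: good + bad)
    p_good, p_bad = f_good / m, f_bad / m   # class priors
    denominator = p_good * f_jk_good + p_bad * f_jk_bad
    if denominator == 0:
        raise ValueError("the attribute value does not occur in the sample")
    return p_good * f_jk_good / denominator, p_bad * f_jk_bad / denominator
```

The two returned values always sum to one, so only the larger of them needs to be attached to the corresponding hypothesis.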

3- Rule scheduler

A hypothesis with its associated probability is called a candidate rule. The composite rule induction system selects attributes based on their saliency. Rule saliency is defined as the difference between the number of cases correctly covered by the rule (hit value) and the number of cases incorrectly interpreted by it (miss value).

The resulting structure is a decision tree with rules as its nodes.

Structure construction can be summarized as follows:

1- Determination of rule saliency.

2- Selection of a rule. The guidelines for rule selection are as follows:

i. If there are rules whose miss values are zero and whose hit values are positive, then select the one with the highest hit value.

ii. If all rules have positive miss values, then select the rule with the highest positive saliency value.

iii. If more than one rule has the same saliency value, then choose the one with the highest probability.
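The three selection guidelines can be sketched in Python as follows. The `Rule` record and the function `select_rule` are hypothetical names for this example. Two details are assumptions rather than statements from the text: ties among zero-miss rules are also broken by probability, and `None` is returned when no rule has a positive saliency.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    name: str
    hit: int            # cases correctly covered by the rule
    miss: int           # cases incorrectly interpreted by the rule
    probability: float  # probability attached to the hypothesis

    @property
    def saliency(self):
        # Saliency is the hit value minus the miss value.
        return self.hit - self.miss


def select_rule(rules):
    """Pick one candidate rule following guidelines i to iii."""
    # Guideline i: prefer rules with no misses and at least one hit.
    perfect = [r for r in rules if r.miss == 0 and r.hit > 0]
    if perfect:
        return max(perfect, key=lambda r: (r.hit, r.probability))
    # Guideline ii: otherwise take the highest positive saliency;
    # guideline iii: break saliency ties by the higher probability.
    positive = [r for r in rules if r.saliency > 0]
    if not positive:
        return None
    return max(positive, key=lambda r: (r.saliency, r.probability))
```

Sorting on the pair (saliency, probability) applies guidelines ii and iii in a single comparison.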

4. II.3.2.2 Naïve Bayesian classification

Bayesian classifiers are statistical classifiers based on Bayes' theorem. They can predict the probability that a given applicant (alternative or sample) belongs to a particular class, Jiawei Han and Micheline Kamber (2001).

Naïve Bayesian classification is a simple Bayesian classifier based on an assumption, called conditional independence, that the effect of an attribute value on a given class is independent of the values of the other attributes, i.e. the values of the attributes are conditionally independent of one another. This assumption makes the computation simple, and when it holds, the accuracy of the naïve Bayesian classifier increases in comparison with other classifiers, Jiawei Han and Micheline Kamber (2001) and Gutierrez-Pena E.(2004).

The naïve Bayesian classifier tests whether $X \in G$ or $X \in B$, where $X$ is an unknown sample with the set of attributes $A = (a_1, a_2, \ldots, a_j, \ldots, a_n)$, $n$ is the number of attributes, and $j = 1, \ldots, n$.

The classifier will predict that:

$X \in G$ if $P(G \mid X) > P(B \mid X)$

$X \in B$ if $P(B \mid X) > P(G \mid X)$

i.e. $X$ is assigned to the class having the highest posterior probability, conditioned on $X$.

Where:

$P(G \mid X)$ is the probability that $X$ belongs to the class $G$ given that $X$ has the set of attributes $A = (a_1, a_2, \ldots, a_j, \ldots, a_n)$

$P(B \mid X)$ is the probability that $X$ belongs to the class $B$ given that $X$ has the set of attributes $A = (a_1, a_2, \ldots, a_j, \ldots, a_n)$

The naïve Bayesian classifier works as follows:

From Bayes' theorem, $P(G \mid X)$ and $P(B \mid X)$ can be calculated as follows:

$$P(G \mid X) = \frac{P(X \mid G)\, p(G)}{P(X \mid G)\, p(G) + P(X \mid B)\, p(B)}$$

and

$$P(B \mid X) = \frac{P(X \mid B)\, p(B)}{P(X \mid G)\, p(G) + P(X \mid B)\, p(B)}$$

Based on the assumption of conditional independence:

$$P(X \mid G) = \prod_{j=1}^{n} P(X = v_{jk} \mid G) \quad \text{and} \quad P(X \mid B) = \prod_{j=1}^{n} P(X = v_{jk} \mid B)$$

Where:

$P(X = v_{jk} \mid G)$ is the probability that $X$ has the attribute value $v_{jk}$ given that it belongs to the class good (the posterior probability of $X$ conditioned on the hypothesis that it belongs to the class $G$), and

$P(X = v_{jk} \mid B)$ is the probability that $X$ has the attribute value $v_{jk}$ given that it belongs to the class bad (the posterior probability of $X$ conditioned on the hypothesis that it belongs to the class $B$).

The two conditional probabilities $P(X = v_{jk} \mid G)$ and $P(X = v_{jk} \mid B)$ are assessed by their relative frequency of occurrence in the training data. Thus:

$$P(X = v_{jk} \mid G) = \frac{f_{jkG}}{f_G}$$

$$P(X = v_{jk} \mid B) = \frac{f_{jkB}}{f_B}$$
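A minimal Python sketch of the naïve Bayesian classifier for nominal attributes follows. It estimates the class priors and the conditional probabilities by relative frequencies in the training data, multiplies them under the conditional independence assumption, and assigns the class with the highest posterior probability. The function names and the data layout (one dictionary of attribute values per customer, with labels "G" and "B") are assumptions made for the example; the thesis itself performs these computations with SPSS.

```python
from collections import Counter, defaultdict


def train_naive_bayes(samples, labels):
    """Count class frequencies and per-class attribute value frequencies."""
    class_counts = Counter(labels)        # number of good and bad customers
    value_counts = defaultdict(Counter)   # (class, attribute) -> value counts
    for x, y in zip(samples, labels):
        for attribute, value in x.items():
            value_counts[(y, attribute)][value] += 1
    return class_counts, value_counts


def posterior(x, class_counts, value_counts):
    """Posterior probability of each class for the sample x."""
    m = sum(class_counts.values())
    joint = {}
    for y, f_y in class_counts.items():
        p = f_y / m                       # prior probability of class y
        for attribute, value in x.items():
            # Relative frequency of the attribute value within class y.
            p *= value_counts[(y, attribute)][value] / f_y
        joint[y] = p
    total = sum(joint.values())
    if total == 0:
        raise ValueError("the sample has zero probability under every class")
    return {y: p / total for y, p in joint.items()}


def classify(x, class_counts, value_counts):
    """Assign x to the class with the highest posterior probability."""
    post = posterior(x, class_counts, value_counts)
    return max(post, key=post.get)
```

An attribute value never seen in a class gives that class a zero probability in this sketch; smoothing the counts is a common remedy but is not part of the procedure described in the text.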

4. II.3.2.3 Linear programming (MSD model)

Linear programming seeks to develop a linear scorecard: it finds a weight $w_j$ for each attribute and a cut off point $c$ so that good customers have scores above the cut off point and bad customers have scores below it, Thomas, L.C., A (2000).

The model minimizes the sum of the deviations of the scores of the incorrectly classified alternatives from the cut off point; it is therefore known as the MSD (minimize the sum of deviations) model, Doumpos M. and Zopounidis C. (2002a) and Yong Shi, Yi Peng, Welxuan Xu and Xiaowo Tang (2002).

$$\min \sum_{i=1}^{m} \alpha_i$$

s.t.

$$\sum_{j=1}^{n} a_{ij} w_j + \alpha_i \ge c, \quad i \in G$$

$$\sum_{j=1}^{n} a_{ij} w_j - \alpha_i \le c, \quad i \in B$$

$$w_j, \ c \ \text{unrestricted in sign}, \quad \alpha_i \ge 0$$

Where $\alpha_i$ is the overlapping of the two-class boundary for the score of alternative $A_i$, measured from the cut off point, i.e. the violation of the classification rule by alternative $X_i$ (the external deviation by which a constraint is not satisfied).
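To make the objective concrete, the Python sketch below evaluates the deviations for a given set of weights and a given cut off point: a good customer contributes the amount by which the score falls below the cut off, and a bad customer the amount by which the score exceeds it. It only evaluates the MSD objective for candidate values and does not solve the linear program, which the thesis does with WinQSB. The function name `msd_deviations` and the data layout are assumptions made for the example.

```python
def msd_deviations(attributes, labels, weights, cutoff):
    """Deviation of each customer from the cut off point under given weights.

    attributes: list of attribute vectors, one per customer.
    labels: "G" for good customers, "B" for bad customers.
    weights: one weight per attribute; cutoff: the cut off point.
    """
    deviations = []
    for a_i, label in zip(attributes, labels):
        score = sum(a * w for a, w in zip(a_i, weights))
        if label == "G":
            # A good customer violates the rule when the score is below the cut off.
            deviations.append(max(0.0, cutoff - score))
        else:
            # A bad customer violates the rule when the score is above the cut off.
            deviations.append(max(0.0, score - cutoff))
    return deviations
```

The MSD objective value for the candidate weights is `sum(msd_deviations(...))`; correctly classified customers contribute zero.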

4. II.3.3 User interface for the proposed DSS

The user interface consists of windows that facilitate communication between the credit card analyst and the database management and model base management.

1- The main window

The main window is given in figure 4.II.4.

Figure 4.II.4: The main window of proposed DSS

The main window contains icons for the result of applying the models, DSS data, MSD, CRIS and Bayesian.

2- The result of applying the models window

This window is given in figure 4.II.5.

Figure 4.II.5: The result of applying the model window

It contains icons to present the results of applying CRIS, the MSD model and the Bayesian model, and an index that explains the codes used in the models.

3- CRIS window

The CRIS window contains an icon to present the frequency table, probability and saliency of the rules, and an icon to present the result of applying CRIS. The CRIS window is given in figures 4.II.6, 4.II.7 and 4.II.8.


Figure 4.II.6: CRIS window




4- MSD model window

This window contains icons to present the MSD weights and the results of applying MSD1, MSD2, MSD3 and MSD4. The MSD model windows are given in figures 4.II.9, 4.II.10 and 4.II.11.

Figure 4.II.9: MSD model window


Figure 4.II.10: MSD model window "weights"

Figure 4.II.11: MSD model window "result"

015

5- Bayesian model window

It contains icons to present the Bayesian probabilities and icons to present the results of applying the Bayesian model. The Bayesian model windows are given in figures 4.II.12, 4.II.13 and 4.II.14.

Figure 4.II.12: Bayesian model window

Figure 4.II.13: Bayesian model window "probability"

Figure 4.II.14: Bayesian model window "result"

6- DSS data window

It contains icons to access the customer table, the credit bureau data and the private data. The DSS data windows are given in figures 4.II.15 and 4.II.16.

Figure 4.II.15: DSS data windows

Figure 4.II.16: DSS data windows "customer table"

7- MSD window

It is used to assess a new applicant: the user is asked to enter the applicant data, and the applicant is then categorized as good ("accept and issue the credit card") or bad ("deny and refuse to issue the credit card"). The MSD window is given in figure 4.II.17.

Figure 4.II.17: MSD window

8- Composite rule induction system window

The composite rule induction system window gives the user an overview of the importance of the attributes used in building the system (most and lowest preference customers); it is given in figure 4.II.18.


9- Bayesian window

It is used to assess a new applicant: the user is asked to enter the applicant data, and the applicant is then categorized as good ("accept and issue the credit card") or bad ("deny and refuse to issue the credit card"). The Bayesian window is given in figure 4.II.19.

Figure 4.II.19: Bayesian window

4.II.4 Summary

In part II we presented the proposed DSS. It relies on credit scoring techniques to classify a new applicant as good or bad based on the applicant's characteristics. The credit card analyst will use it as follows:

- Check the documents offered by the new applicant and make sure that the data given by the applicant is true.

- Check with the credit bureau whether the applicant has any problem with other banks.

- Check whether there is any comment about that applicant in the private data.

- Use the MSD model or the Bayesian model to classify the applicant as good or bad: the credit analyst enters the applicant data and the system returns the classification.

Chapter five

An application: Building credit score model for credit

card application assessment

5.1. Introduction

The decision to issue a credit card is very critical, since a mistake in the credit decision for a single customer means that the bank will lose the profit obtained from other successful customers. For this reason, the method used to evaluate the creditworthiness of each credit card applicant should be as accurate as possible in order to minimize the risk of insolvency. Building a credit score model using CRIS, Bayesian classification and linear programming can reduce the risk of insolvency, as we will see in the following sections.

5.2. Description of the current system

Recently, some financial organizations in Egypt have started to use a deductive credit score to assess credit card applications in order to decide whether or not to issue a credit card to the applicant.

The deductive credit score works as follows:

- Determine the important attributes.

- Assign a weight to each attribute.

- Obtain the score of each customer by adding the attribute weights.

- Compare the score of the customer with a cut-off point to determine whether or not the financial organization grants a credit card to the customer.

The attributes, their weights and the cut-off point are determined by the decision makers based on their experience.
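The four steps of the deductive credit score can be sketched in Python as follows. This is only an illustration: the attribute names, the points assigned to each attribute value and the cut-off point are hypothetical, since the values actually chosen by the decision makers are not given here.

```python
# Hypothetical points per attribute value, chosen for illustration only.
WEIGHTS = {
    "age": {"<=30": 1, "30-60": 3, ">60": 2},
    "credit_card": {"yes": 3, "no": 0},
    "bank_account": {"yes": 2, "no": 0},
}
CUT_OFF = 5  # hypothetical cut-off point


def deductive_score(applicant: dict) -> int:
    """Add the weights of the applicant's attribute values."""
    return sum(WEIGHTS[attr][applicant[attr]] for attr in WEIGHTS)


def decide(applicant: dict) -> str:
    """Grant the card when the score reaches the cut-off point."""
    return "grant" if deductive_score(applicant) >= CUT_OFF else "refuse"


applicant_a = {"age": "30-60", "credit_card": "yes", "bank_account": "no"}
applicant_b = {"age": "<=30", "credit_card": "no", "bank_account": "yes"}
print(decide(applicant_a), decide(applicant_b))
```

The sketch treats a score equal to the cut-off point as an acceptance; this boundary convention is an assumption.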

The set of attributes used in this model consists of 11 attributes (three quantitative and eight qualitative). These attributes are:

1- Age (less than or equal to 30, greater than 30 and less than or equal to 60, greater than 60)

2- Gender (female and male)

3- Marital status (single, divorced, widow, and married)

4- Education level (diploma, graduated, and post graduated)

5- Occupation (self employed, employee)

6- Experience (less than 3 years, greater than or equal to 3 years and less than 10 years, greater than or equal to 10 years)

7- Home own type (own, rent)

8- Home phone (yes, no)

9- Bank account (yes, no)

10- Credit card (yes, no)

11- Home years (less than or equal to 8 years, greater than 8 years)

According to this model, the financial organization receives the credit card requests from customers and evaluates them. The accepted customers are granted a credit card, and their performance is observed and recorded.

5.3. Description of the training and test sample

To build and test the credit score models using CRIS, Bayesian classification and linear programming, a sample of 200 customers was selected randomly (100 bad and 100 good). The classification of customers as good or bad depends on the number of months of missed payments: if the customer delays payment for more than 6 months, the customer is classified as bad; otherwise the customer is classified as good.

This sample is divided equally into two samples. The first sample, called the training sample, consists of 100 customers (50 good and 50 bad) and is used to build the models. The second sample, called the test sample, consists of 100 customers (50 good and 50 bad) and is used to test the models. The credit score models will be built using the same attributes used in the current system.
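The labelling rule and the equal split into training and test samples can be sketched as follows. The customer records, the field name `months_missed` and the random seed are hypothetical; the sketch only assumes that the split keeps 50 good and 50 bad customers in each half.

```python
import random


def label(months_missed: int) -> str:
    """Bad if payment is delayed more than 6 months, otherwise good."""
    return "bad" if months_missed > 6 else "good"


def split_balanced(customers, seed=0):
    """Split the customers into training and test halves, class by class."""
    rng = random.Random(seed)
    train, test = [], []
    for cls in ("good", "bad"):
        group = [c for c in customers if label(c["months_missed"]) == cls]
        rng.shuffle(group)
        half = len(group) // 2
        train += group[:half]
        test += group[half:]
    return train, test


# Hypothetical sample: 100 good customers and 100 bad customers.
customers = [{"id": i, "months_missed": 0 if i < 100 else 7} for i in range(200)]
train, test = split_balanced(customers)
print(len(train), len(test))
```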

5.4. Building empirical credit score models

5.4.1. The subsystem: Composite Rule Induction System (CRIS)

The frequency tables for the 11 attributes are generated and given in table 5.1:

Age                  Good  Bad     Marital status     Good  Bad
Age <=30                4   12     Married              41   34
30<age<=60             45   38     Divorced              1    2
Age >60                 1    0     Widow                 2    1
Total                  50   50     Single                6   13
                                   Total                50   50

Gender               Good  Bad     Education          Good  Bad
Male                   43   46     Post graduated       11    8
Female                  7    4     Graduated            37   33
Total                  50   50     Diploma               2    9
                                   Total                50   50

Experience           Good  Bad     Occupation         Good  Bad
<3 years                2    9     Employee             31   25
>=3 and <10 years      11   21     Retired               1    0
>=10 years             37   20     Self employed        18   25
Total                  50   50     Total                50   50

Home own type        Good  Bad     Phone              Good  Bad
Owned                  33   30     Yes                  49   48
Rent                   17   20     No                    1    2
Total                  50   50     Total                50   50

Bank account         Good  Bad     Credit card        Good  Bad
Yes                    47   43     Yes                  41   17
No                      3    7     No                    9   33
Total                  50   50     Total                50   50

Home years           Good  Bad
<=8 years              11   20
>8 years               39   30
Total                  50   50

Table 5.1: The frequency tables

Rules are formulated using the frequency tables, and the probability associated with each rule is calculated. The final rules are given in table 5.2:

Rules                               Saliency  Probability  Class
do not have credit card                   24     0.785714  bad
have credit card                          24     0.706897  good
experience >=10 years                     17     0.649123  good
experience >=3 and <10 years              10     0.65625   bad
home years <=8                             9     0.645161  bad
home years >8                              9     0.565217  good
age <=30                                   8     0.75      bad
education = diploma                        7     0.818182  bad
experience <3 years                        7     0.818182  bad
marital status = single                    7     0.684211  bad
self employed                              7     0.581395  bad
marital status = married                   7     0.546667  good
30<age<=60                                 7     0.542169  good
employee                                   6     0.553571  good
do not have bank account                   4     0.7       bad
education = graduated                      4     0.528571  good
have bank account                          4     0.522222  good
gender = female                            3     0.636364  good
education = post graduated                 3     0.578947  good
home type = rent                           3     0.540541  bad
home type = owned                          3     0.52381   good
gender = male                              3     0.516854  bad
age >60                                    1     1         good
retired                                    1     1         good
marital status = divorced                  1     0.666667  bad
marital status = widow                     1     0.666667  good
do not have phone                          1     0.666667  bad
have phone                                 1     0.505155  good

Table 5.2: The final CRIS rules
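The values in table 5.2 are consistent with a saliency equal to the absolute difference between the good and bad frequencies of an attribute value, and a probability equal to the share of the majority class among the customers having that value. The following Python sketch reproduces a few rows under that reading; the function names are illustrative, and the tie-breaking choice (a tie is labelled bad) is an assumption.

```python
def make_rule(condition: str, good: int, bad: int) -> dict:
    """Build one rule from the good and bad frequencies of an attribute value."""
    return {
        "condition": condition,
        "saliency": abs(good - bad),
        "probability": max(good, bad) / (good + bad),
        "class": "good" if good > bad else "bad",  # a tie is labelled bad (assumption)
    }


# Frequencies copied from table 5.1 (a subset of the attribute values).
frequencies = [
    ("have credit card", 41, 17),
    ("do not have credit card", 9, 33),
    ("experience >=10 years", 37, 20),
    ("age <=30", 4, 12),
]

# Order the rules by saliency, then by probability, both descending.
rules = sorted(
    (make_rule(*row) for row in frequencies),
    key=lambda r: (-r["saliency"], -r["probability"]),
)
for r in rules:
    print(r["condition"], r["saliency"], round(r["probability"], 6), r["class"])
```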

Since the classification decision in CRIS may depend on a single rule (if the first rule is satisfied, the following rules are not checked), CRIS will be used to give the credit card analyst an overview of the importance of the attributes. The results of applying CRIS indicate that the first rule, according to saliency and probability, is "if the applicant does not have a credit card, then the applicant is classified as bad". The second rule is "if the applicant has a credit card, then the applicant is classified as good", and so on.
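The "first satisfied rule decides" behaviour can be sketched as follows. The rule order follows the first rows of table 5.2; the attribute names and the coded values are hypothetical.

```python
# (attribute, value, class) in the order of table 5.2 (first rows only).
RULES = [
    ("credit_card", "no", "bad"),
    ("credit_card", "yes", "good"),
    ("experience", ">=10", "good"),
]


def classify_first_match(applicant: dict, rules=RULES):
    """Return the class of the first rule satisfied by the applicant."""
    for attribute, value, cls in rules:
        if applicant.get(attribute) == value:
            return cls  # the remaining rules are never examined
    return None  # no rule applies


applicant = {"credit_card": "no", "experience": ">=10"}
print(classify_first_match(applicant))
```

The example applicant is classified as bad by the first rule even though the experience rule would point to the good class, which is why CRIS is used here only as an overview of the attributes.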

CRIS can also be used to find the most preferred and the least preferred characteristics, as follows:

Most preference, customers have the following characteristics:

(have a credit card, and experience of 10 years or more, and more than 8 home years, and married or widow, and age more than 30 years, and employee or retired, and graduated or post graduated, and have a bank account, and home type = own, and have a phone).

Lowest preference, customers have the following characteristics:

(do not have a credit card, and experience of less than 10 years, and 8 home years or less, and age less than or equal to 30 years, and education is diploma, and single or divorced, and self employed, and do not have a bank account, and home own type = rent, and do not have a phone).

5.4.2. The subsystem: Bayesian classification

To compute $P(G \mid X)$ (the probability that X belongs to the class G given that X has the set of attributes $A = (a_1, a_2, \ldots, a_j, \ldots, a_n)$) and $P(B \mid X)$ (the probability that X belongs to the class B given that X has the set of attributes $A = (a_1, a_2, \ldots, a_j, \ldots, a_n)$), $P(X = v_{jk} \mid G)$ (the probability that X has attribute value $v_{jk}$ given that it belongs to the good class) and $P(X = v_{jk} \mid B)$ (the probability that X has attribute value $v_{jk}$ given that it belongs to the bad class) are computed and given in table 5.3:


Age                  Good  Bad     Marital status     Good  Bad
Age <=30             0.08  0.24    Married            0.82  0.68
30<age<=60           0.9   0.76    Divorced           0.02  0.04
Age >60              0.02  0       Widow              0.04  0.02
Total                1     1       Single             0.12  0.26
                                   Total              1     1

Gender               Good  Bad     Education          Good  Bad
Male                 0.86  0.92    Post graduated     0.22  0.16
Female               0.14  0.08    Graduated          0.74  0.66
Total                1     1       Diploma            0.04  0.18
                                   Total              1     1

Experience           Good  Bad     Occupation         Good  Bad
<3 years             0.04  0.18    Employee           0.62  0.5
>=3 and <10 years    0.22  0.42    Retired            0.02  0
>=10 years           0.74  0.4     Self employed      0.36  0.5
Total                1     1       Total              1     1

Home own type        Good  Bad     Phone              Good  Bad
Owned                0.66  0.6     Yes                0.98  0.96
Rent                 0.34  0.4     No                 0.02  0.04
Total                1     1       Total              1     1

Bank account         Good  Bad     Credit card        Good  Bad
Yes                  0.94  0.86    Yes                0.82  0.34
No                   0.06  0.14    No                 0.18  0.66
Total                1     1       Total              1     1

Home years           Good  Bad
<=8 years            0.22  0.4
>8 years             0.78  0.6
Total                1     1

Table 5.3: $P(X = v_{jk} \mid G)$ and $P(X = v_{jk} \mid B)$
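As a minimal sketch of how the probabilities of table 5.3 can be combined, the following Python code computes the posterior probability of the good class for one applicant. It assumes that the attributes are treated as conditionally independent given the class (a naive Bayes assumption) and that the prior probabilities of the two classes are equal, as in the balanced training sample; only three of the eleven attributes are used, and the attribute names and value codes are illustrative.

```python
from math import prod

# Class-conditional probabilities copied from table 5.3 (three attributes only).
P_GIVEN_GOOD = {
    "credit_card": {"yes": 0.82, "no": 0.18},
    "experience": {"<3": 0.04, "3-10": 0.22, ">=10": 0.74},
    "home_years": {"<=8": 0.22, ">8": 0.78},
}
P_GIVEN_BAD = {
    "credit_card": {"yes": 0.34, "no": 0.66},
    "experience": {"<3": 0.18, "3-10": 0.42, ">=10": 0.40},
    "home_years": {"<=8": 0.40, ">8": 0.60},
}


def posterior_good(applicant: dict, prior_good: float = 0.5) -> float:
    """P(G | X) under the attribute-independence assumption."""
    like_good = prod(P_GIVEN_GOOD[a][v] for a, v in applicant.items())
    like_bad = prod(P_GIVEN_BAD[a][v] for a, v in applicant.items())
    joint_good = prior_good * like_good
    joint_bad = (1.0 - prior_good) * like_bad
    return joint_good / (joint_good + joint_bad)


def classify(applicant: dict) -> str:
    """Assign the class with the larger posterior probability."""
    return "good" if posterior_good(applicant) >= 0.5 else "bad"


strong = {"credit_card": "yes", "experience": ">=10", "home_years": ">8"}
weak = {"credit_card": "no", "experience": "<3", "home_years": "<=8"}
print(classify(strong), classify(weak))
```

A value with zero probability in one class, such as age above 60 in the bad class, forces the product for that class to zero; some form of smoothing is a common remedy, although it is not part of the model described here.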

The results of using the Bayesian1 model to evaluate the customers in the training and test samples are given in table 5.4:

Bayesian model (Bayesian1)
                                  Estimated classes
                     Training sample              Test sample
Original classes     Good         Bad             Good         Bad
                     No.   %      No.   %         No.   %      No.   %
Good                 37    74%    13    26%       34    68%    16    32%
Bad                  24    48%    26    52%       20    40%    30    60%

Table 5.4: The results of applying Bayesian1 model
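The percentages in table 5.4 are row percentages of the classification counts. A short Python sketch, using the training sample counts of the table, shows the computation; the overall accuracy is an additional summary that is not reported in the table.

```python
def row_percentages(matrix: dict) -> dict:
    """Convert counts[original][estimated] into percentages of each original class."""
    result = {}
    for original, row in matrix.items():
        total = sum(row.values())
        result[original] = {est: 100 * n / total for est, n in row.items()}
    return result


def overall_accuracy(matrix: dict) -> float:
    """Share of all customers whose estimated class equals the original class."""
    total = sum(sum(row.values()) for row in matrix.values())
    hits = sum(matrix[cls][cls] for cls in matrix)
    return hits / total


# Training sample counts of Bayesian1 (table 5.4).
training = {"good": {"good": 37, "bad": 13}, "bad": {"good": 24, "bad": 26}}
rates = row_percentages(training)
accuracy = overall_accuracy(training)
print(rates, accuracy)
```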

Table 5.4 can be represented as in figure 5.1:

Figure 5.1: The results of applying Bayesian1 model (two bar charts, hit classification and erroneous classification, showing the percentage of good and bad customers for the training and test samples)

In the training sample: Bayesian1 classifies 74% of good customers correctly and succeeds in detecting 52% of the bad customers that the deductive credit score classified as good. At the same time, Bayesian1 incorrectly classifies 26% of good customers as bad and fails to detect 48% of bad customers, classifying them incorrectly as good.

In the test sample: Bayesian1 classifies 68% of good customers correctly and succeeds in detecting 60% of the bad customers that the deductive credit score incorrectly classified as good. At the same time, Bayesian1 classifies 32% of good customers as bad and fails to detect 40% of bad customers, classifying them as good.

5.4.3. The subsystem: linear programming based model (MSD

model)

Using the training sample, the weights of the attributes and the cut point are computed by WinQSB and given in table 5.5:

Attribute          Weight
Age                0.0007
Gender             0.0062
Marital status     0.0013
Education level    0.0140
Occupation         0.0067
Experience         0.0011
Home own type      0.0033
Home phone         0.0273
Bank accounts      0.0078
Credit cards       0.0138
Home years         0.0005
Cut point          0.2004

Table 5.5: The weights and cut point for MSD1 model

The results of using the MSD1 model to evaluate the customers in the training and test samples are given in table 5.6:

MSD model (MSD1)
                                  Estimated classes
                     Training sample              Test sample
Original classes     Good         Bad             Good         Bad
                     No.   %      No.   %         No.   %      No.   %
Good                 35    70%    15    30%       28    56%    22    44%
Bad                  17    34%    33    66%       16    32%    34    68%

Table 5.6: The results of applying MSD1 model


Figure 5.2: The results of applying MSD1 model (two bar charts, hit classification and erroneous classification, showing the percentage of good and bad customers for the training and test samples)

In the training sample: MSD1 classifies 70% of good customers correctly and succeeds in detecting 66% of the bad customers that the deductive credit score model classified as good. At the same time, MSD1 incorrectly classifies 30% of good customers as bad and fails to detect 34% of bad customers, classifying them as good.

In the test sample: MSD1 classifies 56% of good customers correctly and succeeds in detecting 68% of the bad customers that the deductive credit score model classified as good. At the same time, MSD1 incorrectly classifies 44% of good customers as bad and fails to detect 32% of bad customers, classifying them incorrectly as good.

5.4.4. Building empirical credit score models conclusion

- Scores based on the deductive credit score are not a suitable technique for classifying new applicants, since they depend essentially on experience. The deductive credit score gives a consistent classification, since it is based on the customers' scores, but it is still subjective, since it depends on experience, Liu, Y. (2001). So it is important to build the credit score model using quantitative techniques.

- CRIS is good at giving an overview of the sample, since the procedure used to arrive at the rules can be understood by the user; but if CRIS is used, it is important to perform further analysis because the decision may depend on one rule. If this rule is satisfied, the applicant is assigned to the class that the rule defines without examining the other rules.

- Using the Bayesian1 and MSD1 models will decrease the insolvency rate, since both models succeed in detecting bad customers that the deductive credit score classified as good; at the same time, some good customers may be lost.

- The comparison between Bayesian1 and MSD1, for the training and test samples, is summarized in table 5.7:

                                               Estimated classes
                              Training sample                            Test sample
                     Good               Bad                Good               Bad
Original classes     MSD1  Bayesian1    MSD1  Bayesian1    MSD1  Bayesian1    MSD1  Bayesian1
Good                 70%   74%          30%   26%          56%   68%          44%   32%
Bad                  34%   48%          66%   52%          32%   40%          68%   60%

Table 5.7: The comparison between Bayesian and MSD models

Table 5.7 can be represented as in figures 5.3 and 5.4:

Figure 5.3: The comparison between Bayesian1 and MSD1 models for hit classification (bar chart of the percentage of good customers classified as good, G > G, and of bad customers classified as bad, B > B, for each model)

Figure 5.4: The comparison between Bayesian1 and MSD1 models for erroneous classification (bar chart of the percentage of good customers classified as bad, G > B, and of bad customers classified as good, B > G, for each model)

- The Bayesian1 model performs better than MSD1 in detecting good customers. Bayesian1 classifies 74% and 68% of good customers correctly in the training and test samples respectively, while MSD1 classifies 70% and 56% of good customers correctly in the training and test samples respectively.

- MSD1 performs better in classifying bad customers: it detects 66% and 68% of bad customers in the training and test samples respectively, while Bayesian1 detects 52% and 60% of bad customers in the training and test samples respectively.

- Thus, using the Bayesian1 or MSD1 model will reduce the insolvency rate, but some good customers will be denied. In general, the Bayesian1 and MSD1 models give insufficient results, since they classify some good customers as bad and cannot detect all bad customers.

- The insufficient results indicate a need to review the attributes used to build the models, since these attributes do not reflect all the data about the customers and some of them are vague. The set of attributes should comprise more relevant data and more details. Some of the attributes are obtained from the application form; others are obtained from the credit bureau, Thomas L. C., David B. Edelman, and Jonathan N. Crook (2002) and Yang Liu (2002).

1- Data from the application form

The data on the credit card application form are summarized in table 5.8:

Information resource: application forms

Information categories         Samples
Basic personal information     Age, gender (1), education (2)
Family information             Marital status (3), number of children, date of marriage
Residential information        Status (4), number of years in current address (5), the value of the house
Employment status              Occupation sector, number of years in current occupation, position
Financial status               Salary, other income, rent payment, monthly installments, credit report (6)
Contact information            Home phone, work phone, distance from home or work to nearest branch

Table 5.8: Some attributes in application form

(1) Female, male

(2) Diploma (high school), graduated, post graduated

(3) Single, married, widow, divorced

(4) Own, rented, functional, with parents

(5) Capital, countryside

(6) Have you paid your bills on time? What is your outstanding debt? How long is your credit history? Have you applied for new credit recently? And how many and what types of credit accounts do you have?

2- Data from credit bureau

The credit bureau usually contains bankruptcy information obtained from banks. It contains information such as the identity of current and past creditors, the dates disbursed for current and past loans, the monthly installments for current and past loans, the maximum line of credit with current and past creditors, the arrears in current and past loans, and the number of inquiries.

5.5. Improving the accuracy of credit score models

One of the important points in building credit score models is the selection of the relevant attributes. Irrelevant, redundant or vague attributes may reduce the accuracy of the models, Liu Y. and M Schumann (2005). As mentioned in section 5.4.4, the accuracy of the credit score model can be increased by adding new related attributes and removing vague attributes. We will try to improve the accuracy of the credit score model by reviewing the set of attributes used to build it, removing vague attributes and adding useful ones. Due to lack of data, only income will be added to the set of attributes used to build the credit score models, and gender, home own type, home phone, bank account and credit card will be removed, since they need more details and do not provide useful information.

The attributes which will be used to build the new credit score models are:

1- Age

2- Marital status (single, divorced, widow, and married)

3- Education level (diploma, graduated, and post graduated)

4- Occupation (self employed, employee)

5- Experience

6- Home years

7- Income

5.5.1. Building a new Bayesian model

$P(X = v_{jk} \mid G)$ and $P(X = v_{jk} \mid B)$ for income are computed and given in table 5.9:

Income     Good   Bad
<=600      0.24   0.54
>600       0.76   0.46

Table 5.9: $P(X = v_{jk} \mid G)$ and $P(X = v_{jk} \mid B)$ for income
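The conditional probabilities of an attribute are relative frequencies within each class. The following sketch estimates them for income; the counts (12 and 38 good customers, 27 and 23 bad customers) are not listed in the text but are the ones implied by table 5.9 for 50 customers per class.

```python
from collections import Counter


def conditional_probabilities(values) -> dict:
    """Relative frequency of each attribute value within one class."""
    counts = Counter(values)
    n = len(values)
    return {value: count / n for value, count in counts.items()}


# Income categories of the 50 good and 50 bad training customers
# (counts implied by table 5.9, not listed in the source).
good_income = ["<=600"] * 12 + [">600"] * 38
bad_income = ["<=600"] * 27 + [">600"] * 23

p_given_good = conditional_probabilities(good_income)
p_given_bad = conditional_probabilities(bad_income)
print(p_given_good, p_given_bad)
```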

The results of using the new Bayesian model to classify the customers in the training and test samples are given in table 5.10:

New Bayesian model (Bayesian2)
                                  Estimated classes
                     Training sample              Test sample
Original classes     Good         Bad             Good         Bad
                     No.   %      No.   %         No.   %      No.   %
Good                 39    78%    11    22%       33    66%    17    34%
Bad                  16    32%    34    68%       22    44%    28    56%

Table 5.10: The result of applying Bayesian2 model


Figure 5.5: The result of applying Bayesian2 model (two bar charts, hit classification and erroneous classification, showing the percentage of good and bad customers for the training and test samples)

In the training sample, Bayesian2 classifies 78% of good customers correctly and succeeds in detecting 68% of the bad customers that the deductive credit score classified as good. At the same time, Bayesian2 incorrectly classifies 22% of good customers as bad and fails to detect 32% of bad customers, classifying them incorrectly as good.

In the test sample, Bayesian2 classifies 66% of good customers correctly and succeeds in detecting 56% of the bad customers that the deductive credit score incorrectly classified as good. At the same time, Bayesian2 incorrectly classifies 34% of good customers as bad and fails to detect 44% of bad customers, classifying them as good.

5.5.2. Building a new MSD model

Using the training sample, the weights of the attributes and the cut point are computed by WinQSB and given in table 5.11:

Attribute          Weight
Age                0.00030
Marital status     0.00510
Education level    0.00580
Occupation         0.00260
Experience         0.00020
Home years         0.00020
Net income         0.006500
Cut point          0.09540

Table 5.11: Weights and cut point for MSD2 model

The results of applying the MSD2 model are given in table 5.12:

New MSD model (MSD2)
                                  Estimated classes
                     Training sample              Test sample
Original classes     Good         Bad             Good         Bad
                     No.   %      No.   %         No.   %      No.   %
Good                 43    86%    7     14%       43    86%    7     14%
Bad                  13    26%    37    74%       14    28%    36    72%

Table 5.12: The results of applying the MSD2 model


Figure 5.6: The results of applying the MSD2 model (two bar charts, hit classification and erroneous classification, showing the percentage of good and bad customers for the training and test samples)

In the training sample, MSD2 classifies 86% of good customers correctly and succeeds in detecting 74% of the bad customers that the deductive credit score model classified as good. At the same time, MSD2 incorrectly classifies 14% of good customers as bad and fails to detect 26% of bad customers, classifying them incorrectly as good.

In the test sample, MSD2 classifies 86% of good customers correctly and succeeds in detecting 72% of the bad customers that the deductive credit score model incorrectly classified as good. At the same time, MSD2 classifies 14% of good customers as bad and fails to detect 28% of bad customers, classifying them as good.

5.5.3. Improving the accuracy of credit score models conclusion

Adding income and removing the less useful attributes improves the accuracy of the MSD model on both samples and of the Bayesian model on the training sample, as indicated in tables 5.13, 5.14, 5.15 and 5.16:

1- Bayesian model

o Training sample

Training sample (Bayesian models)
                          Estimated classes
                     Bayesian1         Bayesian2
Original classes     Good    Bad       Good    Bad
Good                 74%     26%       78%     22%
Bad                  48%     52%       32%     68%

Table 5.13: Comparison between Bayesian1 and Bayesian2 for training sample


Figure 5.7: Comparison between Bayesian1 and Bayesian2 for training sample (two bar charts, correct classification and classification error, showing the percentage of good and bad customers for each model)

For the good customers: the performance of the Bayesian2 model is better than that of the Bayesian1 model. Bayesian2 classifies 78% of good customers correctly, while Bayesian1 classifies only 74%.

For the bad customers: the performance of the Bayesian2 model is better, since it detects 68% of bad customers, while Bayesian1 detects only 52%.

o Test sample

Test sample (Bayesian models)
                          Estimated classes
                     Bayesian1         Bayesian2
Original classes     Good    Bad       Good    Bad
Good                 68%     32%       66%     34%
Bad                  40%     60%       44%     56%

Table 5.14: Comparison between Bayesian1 and Bayesian2 for test sample


Figure 5.8: Comparison between Bayesian1 and Bayesian2 for test sample (two bar charts, hit classification and erroneous classification, showing the percentage of good and bad customers for each model)

For the good customers: the performance of the Bayesian1 model is better than that of Bayesian2. Bayesian1 classifies 68% of good customers correctly, while Bayesian2 classifies 66%.

For the bad customers: the performance of the Bayesian1 model is better than that of Bayesian2. Bayesian1 classifies 60% of bad customers correctly, while Bayesian2 classifies 56%.

2- MSD model

o Training sample

Training sample (MSD models)
                          Estimated classes
                     MSD1              MSD2
Original classes     Good    Bad       Good    Bad
Good                 70%     30%       86%     14%
Bad                  34%     66%       26%     74%

Table 5.15: Comparison between MSD1 and MSD2 for training sample


Figure 5.9: Comparison between MSD1 and MSD2 for training sample (two bar charts, hit classification and erroneous classification, showing the percentage of good and bad customers for each model)

For the good customers: the performance of MSD2 is improved; it classifies 86% of good customers correctly, while the MSD1 model classifies only 70%.

For the bad customers: the performance of MSD2 is improved; it detects 74% of bad customers, while the MSD1 model detects only 66%.

o Test sample

Test sample (MSD models)
                          Estimated classes
                     MSD1              MSD2
Original classes     Good    Bad       Good    Bad
Good                 56%     44%       86%     14%
Bad                  32%     68%       28%     72%

Table 5.16: Comparison between MSD1 and MSD2 for test sample


Figure 5.10: Comparison between MSD1 and MSD2 for test sample (two bar charts, hit classification and erroneous classification, showing the percentage of good and bad customers for each model)

For the good customers: the performance of MSD2 is improved; it classifies 86% of good customers correctly, while the MSD1 model classifies only 56%.

For the bad customers: the performance of MSD2 is improved; it detects 72% of bad customers, while the MSD1 model, built without income, detects only 68%.

The performances of Bayesian1, MSD1, Bayesian2 and MSD2 are compared in tables 5.17 and 5.20 for the training and test samples:

a) Training sample

Training sample
                                               Estimated classes
                          Models without income                      New models
                     Good               Bad                Good               Bad
Original classes     MSD1  Bayesian1    MSD1  Bayesian1    MSD2  Bayesian2    MSD2  Bayesian2
Good                 70%   74%          30%   26%          86%   78%          14%   22%
Bad                  34%   48%          66%   52%          26%   32%          74%   68%

Table 5.17: The comparison between MSD1, MSD2, Bayesian1, and Bayesian2 for training sample


Figure 5.11: The comparison between MSD1, MSD2, Bayesian1, and Bayesian2 for training sample, hit classification (bar chart of the percentages G > G and B > B for each model)

Figure 5.12: The comparison between MSD1, MSD2, Bayesian1, and Bayesian2 for training sample, erroneous classification (bar chart of the percentages G > B and B > G for each model)

According to the rate of good customers classified correctly, the methods are arranged in table 5.18:

Model        Good
MSD2         86%
Bayesian2    78%
Bayesian1    74%
MSD1         70%

Table 5.18: Methods arranged according to hit classification rate of good customers

According to the rate of bad customers classified correctly, the methods are arranged in table 5.19:

Model        Bad
MSD2         74%
Bayesian2    68%
MSD1         66%
Bayesian1    52%

Table 5.19: Methods arranged according to hit classification rate of bad customers

It is clear that MSD2 performs better than the other models.

b) Test sample

Test sample
                                               Estimated classes
                          Models without income                      New models with income
                     Good               Bad                Good               Bad
Original classes     MSD1  Bayesian1    MSD1  Bayesian1    MSD2  Bayesian2    MSD2  Bayesian2
Good                 56%   68%          44%   32%          86%   66%          14%   34%
Bad                  32%   40%          68%   60%          28%   44%          72%   56%

Table 5.20: The comparison between MSD1, MSD2, Bayesian1, and Bayesian2 for test sample

Table 5.20 can be represented as in figures 5.13 and 5.14:

Figure 5.13: The comparison between MSD1, MSD2, Bayesian1, and Bayesian2 for test sample, hit classification (bar chart of the percentages G > G and B > B for each model)

Note:

G > G means that good customers are classified correctly as good.

B > B means that bad customers are classified correctly as bad.

Figure 5.14: The comparison between MSD1, MSD2, Bayesian1, and Bayesian2 for test sample, erroneous classification (bar chart of the percentages G > B and B > G for each model)



According to the rate of good customers classified correctly, the methods are arranged in table 5.21:

Model        Good
MSD2         86%
Bayesian1    68%
Bayesian2    66%
MSD1         56%

Table 5.21: Methods arranged according to good customers classified correctly

The methods can be arranged according to the rate of bad customers classified correctly, as shown in table 5.22:

Model        Bad
MSD2         72%
MSD1         68%
Bayesian1    60%
Bayesian2    56%

Table 5.22: Methods arranged according to bad customers classified correctly

It is clear that MSD2 performs better than the other models.
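The two rankings for the test sample can be reproduced by sorting the models on their hit rates. The sketch below uses the percentages of table 5.20; the dictionary layout and the function name are illustrative.

```python
# Hit rates (%) on the test sample, copied from table 5.20.
TEST_HIT_RATES = {
    "MSD1": {"good": 56, "bad": 68},
    "Bayesian1": {"good": 68, "bad": 60},
    "MSD2": {"good": 86, "bad": 72},
    "Bayesian2": {"good": 66, "bad": 56},
}


def rank(rates: dict, cls: str) -> list:
    """Order the models by their hit rate for one class, best first."""
    return sorted(rates, key=lambda model: rates[model][cls], reverse=True)


ranking_good = rank(TEST_HIT_RATES, "good")
ranking_bad = rank(TEST_HIT_RATES, "bad")
print(ranking_good)
print(ranking_bad)
```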

5.6. Testing the models using new sample

To test the above conclusion, that MSD2 performs better than the other models, we used another sample of 200 customers, 100 good and 100 bad, to test the Bayesian1, Bayesian2, MSD1 and MSD2 models. The results are given in table 5.23:

                                               Estimated classes
                          Models without income                      New models
                     Good               Bad                Good               Bad
Original classes     MSD1  Bayesian1    MSD1  Bayesian1    MSD2  Bayesian2    MSD2  Bayesian2
Good                 64%   70%          36%   30%          74%   68%          26%   32%
Bad                  37%   33%          63%   67%          31%   39%          69%   61%

Table 5.23: Test Bayesian1, Bayesian2, MSD1, and MSD2 using another sample
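Because the new sample contains the same number of good and bad customers, the overall share of correctly classified customers is the mean of the two hit rates. The following sketch derives this summary from the percentages of table 5.23; the overall figures are computed here for illustration and are not reported in the table.

```python
# (hit rate of good customers, hit rate of bad customers) in %, from table 5.23.
NEW_SAMPLE_HIT_RATES = {
    "MSD1": (64, 63),
    "Bayesian1": (70, 67),
    "MSD2": (74, 69),
    "Bayesian2": (68, 61),
}

# With 100 good and 100 bad customers, overall accuracy is the mean of the two rates.
overall = {model: (good + bad) / 2 for model, (good, bad) in NEW_SAMPLE_HIT_RATES.items()}
best_model = max(overall, key=overall.get)
print(overall, best_model)
```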

Figure 5.15: Test Bayesian1, Bayesian2, MSD1, and MSD2 using another sample, hit classification (bar chart of the percentages G > G and B > B for each model)

Figure 5.16: Test Bayesian1, Bayesian2, MSD1, and MSD2 using another sample, erroneous classification (bar chart of the percentages G > B and B > G for each model)



The methods are arranged according to the rate of good customers classified correctly and given in table 5.24:

Model        Good
MSD2         74%
Bayesian1    70%
Bayesian2    68%
MSD1         64%

Table 5.24: Methods arranged according to rate of good customers classified correctly

The methods are arranged according to the rate of bad customers classified correctly and given in table 5.25:

Model        Bad
MSD2         69%
Bayesian1    67%
MSD1         63%
Bayesian2    61%

Table 5.25: Methods arranged according to rate of bad customers classified correctly

Tables 5.24 and 5.25 show that MSD2 performs better than the other models.

5.7. Building new models using new sample

To confirm the above conclusion, that MSD2 performs better than the other models, the new sample will be used to build further MSD and Bayesian models. Bayesian3 and MSD3 will be built using the 11 attributes, and Bayesian4 and MSD4 will be built using the 7 attributes.

5.7.1. Bayesian model

$P(X = v_{jk} \mid G)$ and $P(X = v_{jk} \mid B)$ are computed and given in table 5.26:

Age                  Good  Bad     Marital status     Good  Bad
Age <=30             0.1   0.28    Married            0.88  0.66
30<age<=60           0.84  0.72    Divorced           0     0.04
Age >60              0.06  0       Widow              0.02  0.04
Total                1     1       Single             0.12  0.26

Gender               Good  Bad     Education          Good  Bad
Male                 0.84  0.92    Post graduated     0.36  0.2
Female               0.16  0.08    Graduated          0.56  0.68
Total                1     1       Diploma            0.08  0.12
                                   Total              1     1

Experience           Good  Bad     Occupation         Good  Bad
<3 years             0.1   0.24    Employee           0.66  0.6
>=3 and <10 years    0.26  0.42    Retired            0.06  0
>=10 years           0.64  0.34    Self employed      0.28  0.4
Total                1     1       Total              1     1

Home own type        Good  Bad     Phone              Good  Bad
Owned                0.62  0.62    Yes                1     1
Rent                 0.38  0.38    No                 0     0
Total                1     1       Total              1     1

Bank account         Good  Bad     Credit card        Good  Bad
Yes                  0.8   0.66    Yes                0.64  0.46
No                   0.2   0.34    No                 0.36  0.54
Total                1     1

Home years           Good  Bad     Net income         Good  Bad
<=8 years            0.2   0.56    <=600              0.26  0.42
>8 years             0.8   0.44    >600               0.74  0.58
Total                1     1       Total              1     1

Table 5.26: $P(X = v_{jk} \mid G)$ and $P(X = v_{jk} \mid B)$ for new sample

The comparison between Bayesian3 and Bayesian4 is given in table 5.27:

Test sample (Bayesian models)
                          Estimated classes
                     Bayesian3         Bayesian4
Original classes     Good    Bad       Good    Bad
Good                 66%     34%       62%     38%
Bad                  48%     52%       42%     58%

Table 5.27: Comparison between Bayesian3 and Bayesian4


Figure 5.17: Comparison between Bayesian3 and Bayesian4 (two bar charts, hit classification and erroneous classification, showing the percentage of good and bad customers for each model)

For the good customers: Bayesian3 performs better than Bayesian4. Bayesian3 classifies 66% of good customers correctly, while Bayesian4 classifies only 62%.

For the bad customers: Bayesian4 performs better than Bayesian3. Bayesian4 classifies 58% of bad customers correctly, while Bayesian3 classifies only 52%.

5.7.2. MSD model

The weights and cut point for MSD models are computed using

WinQSB and given in table 5.28 and 5.29:

Weights and cut point for MSD3:

Age gender Material status

Education Level

Occupation Experie

nce

Home Own Type

Home Phone

Bank Account

Credit Cards

Home Years

Cut point

0.0005 0.0071 0.0041 0.0065 0.0107 0.0007 0.0000 0.0000 0.0307 0.0069 0.0004 0.1628

Table 5.28: Weights and cut point for MSD3

Weights and cut point for MSD4:

Attribute          Weight
Age                0.00020
Marital status     0.00580
Education level    0.00040
Occupation         0.01130
Experience         0.00040
Home years         0.00070
Net income         0.00490
Cut point          0.1059

Table 5.29: Weights and cut point for MSD4
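To make the role of the weights and the cut point concrete, the sketch below computes a weighted score for one applicant with the MSD4 weights of Table 5.29 and compares it with the cut point. Two things are assumptions rather than facts from this section: the numeric coding of the applicant's attributes (the values are invented for illustration) and the direction of the rule (a score at or above the cut point is taken to mean "good"). The names `msd_score` and `classify` are hypothetical.

```python
# Weights and cut point copied from Table 5.29 (MSD4).
MSD4_WEIGHTS = {
    "age": 0.00020,
    "marital_status": 0.00580,
    "education_level": 0.00040,
    "occupation": 0.01130,
    "experience": 0.00040,
    "home_years": 0.00070,
    "net_income": 0.00490,
}
MSD4_CUT_POINT = 0.1059


def msd_score(applicant, weights):
    """Weighted sum of the applicant's coded attribute values."""
    return sum(weights[name] * applicant[name] for name in weights)


def classify(applicant, weights, cut_point):
    """Compare the score with the cut point.

    Assumption: scores at or above the cut point are labelled "good".
    """
    return "good" if msd_score(applicant, weights) >= cut_point else "bad"


# Hypothetical applicant; the numeric coding of each attribute is invented.
applicant = {
    "age": 35,
    "marital_status": 2,
    "education_level": 2,
    "occupation": 1,
    "experience": 10,
    "home_years": 12,
    "net_income": 15,
}

score = msd_score(applicant, MSD4_WEIGHTS)
decision = classify(applicant, MSD4_WEIGHTS, MSD4_CUT_POINT)
print(f"score = {score:.4f}, cut point = {MSD4_CUT_POINT}, class = {decision}")
```

The same two functions would work with the MSD3 weights of Table 5.28, provided the applicant record carries the eleven attributes listed there.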

The comparison between MSD3 and MSD4 is given in Table 5.30:

Test sample (MSD models): estimated classes

Original class    MSD3             MSD4
                  Good    Bad      Good    Bad
Good              66%     34%      68%     32%
Bad               34%     66%      26%     74%

Table 5.30: The comparison between MSD3 and MSD4

Figure 5.18: The comparison between MSD3 and MSD4 (bar charts, in %, for good and bad customers; the first panel shows hit classification)

For the good customers, MSD4 performs better than MSD3: MSD4 classifies 68% of good customers correctly, while MSD3 classifies only 66%.

For the bad customers, MSD4 also performs better than MSD3: MSD4 classifies 74% of bad customers correctly, while MSD3 classifies only 66%.

5.7.3. Conclusion on building new models using the new sample

The performances of the Bayesian3, MSD3, Bayesian4 and MSD4 models are compared in Table 5.31:

Test sample: estimated classes

                  Model without income            Model with income
Original class    MSD3          Bayesian3         MSD4          Bayesian4
                  Good   Bad    Good   Bad        Good   Bad    Good   Bad
Good              66%    34%    66%    34%        68%    32%    62%    38%
Bad               34%    66%    48%    52%        26%    74%    42%    58%

Table 5.31: The comparison between Bayesian3, MSD3, Bayesian4 and MSD4

Figure 5.19: The comparison between Bayesian3, MSD3, Bayesian4 and MSD4 for hit classification (bar chart, in %, of good classified as good, G > G, and bad classified as bad, B > B)

Figure 5.20: The comparison between Bayesian3, MSD3, Bayesian4 and MSD4 for erroneous classification (bar chart, in %, of good classified as bad, G > B, and bad classified as good, B > G)

The methods are ranked by the percentage of good customers classified correctly in Table 5.32:

Model        Good
MSD4         68%
MSD3         66%
Bayesian3    66%
Bayesian4    62%

Table 5.32: The methods ranked by the percentage of good customers classified correctly

The methods are ranked by the percentage of bad customers classified correctly in Table 5.33:

Model        Bad
MSD4         74%
MSD3         66%
Bayesian4    58%
Bayesian3    52%

Table 5.33: The methods ranked by the percentage of bad customers classified correctly

From the above comparison, MSD4 (the model with income) performs better than the other models. This result is consistent with the conclusions of the preceding sections.
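The rankings in Tables 5.32 and 5.33 can be reproduced from the hit rates of Table 5.31 with a short sketch such as the one below. It also ranks the models by the mean of the two hit rates, which is an added summary measure and not one used in the thesis; that mean equals overall accuracy only if the test sample contains equal numbers of good and bad customers, which this section does not state. The names `HIT_RATES` and `mean_hit_rate` are hypothetical.

```python
# Hit rates (%) copied from Table 5.31:
# (good classified as good, bad classified as bad).
HIT_RATES = {
    "MSD3": (66, 66),
    "Bayesian3": (66, 52),
    "MSD4": (68, 74),
    "Bayesian4": (62, 58),
}


def mean_hit_rate(rates):
    """Unweighted mean of the good and bad hit rates (an added measure)."""
    good_hit, bad_hit = rates
    return (good_hit + bad_hit) / 2


# Rank by each class separately, then by the mean of both classes.
by_good = sorted(HIT_RATES, key=lambda m: HIT_RATES[m][0], reverse=True)
by_bad = sorted(HIT_RATES, key=lambda m: HIT_RATES[m][1], reverse=True)
ranking = sorted(HIT_RATES, key=lambda m: mean_hit_rate(HIT_RATES[m]), reverse=True)

for model in ranking:
    good_hit, bad_hit = HIT_RATES[model]
    mean = mean_hit_rate(HIT_RATES[model])
    print(f"{model:<10} good {good_hit}%  bad {bad_hit}%  mean {mean:.1f}%")
```

Under every one of the three orderings MSD4 comes first, which matches the statement that the model with income performs best.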

5.8. General conclusion

The attributes used to build a credit score model have an important effect on its performance. Irrelevant or vague attributes reduce the accuracy of the scoring model, so the attributes used to build it should be selected carefully. This requires obtaining more information about the applicant's credit history and reviewing the questions in the credit card application form.
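One simple way to see which attributes carry little information is to compare their distributions among good and bad customers. The sketch below does this for a few rows of Table 5.26 using the total variation distance, which is a measure chosen here for illustration and not the selection method used in the thesis. The label `owned_or_rent` is an assumption, since the Owned/Rent rows carry no attribute name in the table; `separation` is a hypothetical helper.

```python
# (P(value | good), P(value | bad)) pairs copied from Table 5.26.
TABLE_5_26 = {
    "gender": {"male": (0.84, 0.92), "female": (0.16, 0.08)},
    "education": {
        "post": (0.36, 0.20),
        "graduated": (0.56, 0.68),
        "diploma": (0.08, 0.12),
    },
    "home_years": {"<=8": (0.20, 0.56), ">8": (0.80, 0.44)},
    # Label assumed: these rows have no attribute name in the table.
    "owned_or_rent": {"owned": (0.62, 0.62), "rent": (0.38, 0.38)},
}


def separation(values):
    """Total variation distance between the good and bad distributions.

    0 means the attribute looks identical in both classes; 1 means the
    two classes never share a value.
    """
    return 0.5 * sum(abs(p_good - p_bad) for p_good, p_bad in values.values())


scores = {name: separation(values) for name, values in TABLE_5_26.items()}
for name, value in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<14} {value:.2f}")
```

An attribute with a distance of zero, such as the Owned/Rent rows here, cannot help separate good from bad customers in this sample, whereas home years separates them most clearly among the four attributes shown.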

MSD3 and MSD4 give more accurate results than the other models, so it is recommended to use one of them.

Chapter six

Conclusions and points for further research

Conclusions

The purpose of this research is to build a credit score model for credit card applicants using Bayesian, composite rule induction system, and linear programming techniques, in order to help banks decide whether to issue a credit card to an applicant or to deny the application. All the customers in the samples used to build and test these models had been granted a credit card by a system based on the deductive credit score.

The models were built using the same attributes as the deductive credit score, and we conclude that credit score models based on Bayesian or linear programming techniques give more accurate results than models based on the deductive credit score.

We then improved the accuracy of the Bayesian and linear programming credit score models by reviewing the attributes used to build them. We rebuilt the credit score models after omitting unimportant attributes and adding the income attribute.

We concluded that the MSD credit score models (MSD2 and MSD4), which depend on the new set of attributes including income, give the most accurate results. The set of attributes used to build the credit score model should therefore be reviewed, and the questions in the credit card application form modified accordingly.

In general, credit scoring is a very important technique for analysing data in many sectors, especially banking. As an automated and centralized system, credit scoring enables a bank to measure the creditworthiness of a large number of customers objectively, accurately, and quickly, especially if there is a precise and immediate method to verify that the data supplied by applicants are correct. Its importance as a risk management tool has increased with Basel II.

Points for further research

There are many points for further research; they can be summarized as follows:

- Building a credit risk model for credit cards using the credit score.

- Studying the effect of Basel II on Egyptian banks' credit card lending.

- Using a hybrid approach to try to improve the classification accuracy.


Cairo University

Institute of Statistical Studies and Research

A decision support system for assessing credit card issuance applications

Prepared by

Ahmed Mahmoud Saleim Eliwa

Under supervision of

Prof. Dr. Bahaa El-Din Helmy Ismail, non-full-time professor, Computer and Information Sciences department

Institute of Statistical Studies and Research, Cairo University

and

Dr. Assem Abd El-Fattah Tharwat, head of the Operations Research department

Faculty of Computers and Information, Cairo University

and

Dr. Ramadan Abd El-Hamed Zen El-Den, Operations Research department

Institute of Statistical Studies and Research, Cairo University

This thesis was submitted in fulfillment of the requirements for the master's degree in Operations Research

Operations Research department, Institute of Statistical Studies and Research

June 2007

Introduction

The recent period has witnessed growing interest in credit cards, and the number of customers applying for this service has increased. Some of these customers provide guarantees to obtain the card, in which case there is no problem for the issuing bank. Others apply for a credit card without a guarantee. In that case a method is needed to study these customers' applications and determine whether the bank should grant them a credit card without a guarantee.

The credit analyst uses personal judgment to determine whether a credit card should be issued. The customer fills in a credit card application form, which usually contains data about the customer such as residence, age, occupation, number of years of work, and so on. The credit analyst studies these data and uses his experience and the bank's instructions to decide whether to issue a card to the customer or to reject the application. As the number of customers applying for a credit card has grown, it has become difficult to rely on experience and personal judgment alone in the assessment process.

Credit scoring is a technique that helps the bank determine whether to approve issuing a credit card to a customer. The growing number of customers applying for this service has increased the importance of this method.

This thesis consists of six chapters:

Chapter one:

This chapter presents a definition of the problem, a definition of credit cards and their benefits to the various parties, the steps of issuing credit cards, and the distinguishing properties of the credit card application assessment process. It also presents the method currently used for assessment and the cost the bank may bear if a wrong decision is made.

Chapter two:

This chapter presents a definition of the credit scoring system, its types, the history of its use, its benefits, the problems facing its application, and its importance as a risk management technique under Basel II.

Chapter three:

This chapter presents the data used as inputs to the credit scoring system, the outputs of the system, how it is built, and the quantitative techniques used to build it.

Chapter four:

This chapter is divided into two parts. The first part presents an introduction to decision support systems, and the second part presents the decision support system proposed for assessing credit card applications.

Chapter five:

In this chapter, the credit scoring system for issuing credit cards is built using real data.

Chapter six:

This chapter presents the conclusions and some points for further research in this field.
