Summer 07, MFIN7011, Tang
Consumer Credit Risk
1
MFIN 7011: Credit Risk Management
Summer 2007, Dragon Tang
Lecture 18: Consumer Credit Risk
Thursday, August 2, 2007
Readings: Niu (2004); Agarwal, Chomsisengphet,
Liu, and Souleles (2006)
Consumer Credit Risk
Objectives:
1. Credit scoring approach for consumer credit risk
2. Practice, challenge, and opportunity
Consumer Credit
• Credit products: fixed term and revolving
• Products, ordered from low to high default risk (default risk is low in general):
– Residential mortgage
– Retail finance
– Personal loans
– Overdrafts
– Credit cards
Consumer Lending
• Examples:
– Automobile loans
– Home equity loans
– Revolving credit
• Consumer credit outstanding in the US has grown exponentially, from USD 9.8 billion in 1946 to USD 2,411 billion in January 2007
– $878 billion revolving; $1,526 billion non-revolving
– Current interest rate: 13%; rate on accounts assessed interest: 15%
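To put the growth figures in perspective, the short sketch below computes the average annual growth rate implied by the two endpoints quoted above. It assumes a simple 61-year span (1946 to 2007) and ignores the exact months, so the result is only a rough illustration.

```python
# Compound annual growth rate implied by the two figures quoted in the slide.
start_value = 9.8       # USD billion, 1946
end_value = 2411.0      # USD billion, January 2007
years = 2007 - 1946     # 61 years; the exact months are ignored (assumption)

cagr = (end_value / start_value) ** (1.0 / years) - 1.0
print(f"Implied average growth rate: {cagr:.1%} per year")
```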
Consumer vs. Corporate Lending
• Consumer lending is not as glamorous as corporate lending
• Consumer lending is a volume business, where low cost producers who can manage the credit losses are able to enjoy profitable margins
• Corporate lending is often unprofitable as every bank is chasing the same corporate customers, depressing margins
Consumer Credit Risk: Art or Science?
• Art: consumers care about reputation
– The value of reputation is hard to model
– A reduced-form model may be useful
• Science: creditworthiness can be predicted from financial health
– Using structural models of the Merton type
• The answer is probably both!
– A hybrid structural/reduced-form model should be most promising
“Never make predictions, especially about the future.”
– Casey Stengel
The Credit Decision: Scoring vs. Judgmental
• Both methods
– Assume that the future will resemble the past
– Compare applicants to past experience
– Aim to grant credit only to acceptable risks
• Added value of scoring
– Defines degree of credit risk for each applicant
– Ranks risk relative to other applicants
– Allows decisions based on degree of risk
– Enables tracking of performance over time
– Permits known and measurable adjustments
– Permits decision automation
Evaluating the Credit Applicant
[Figure: one applicant evaluated by judgment and by credit scoring]
• Characteristics: time at present address, time at present job, residential status, debt ratio, bank reference, age, income, # of recent inquiries, % of balance to available lines, # of major derogs.
• Judgment: each characteristic is marked +, -, or N/A; overall decision: Accept; odds of repayment: ?
• Credit scoring: each characteristic receives points (such as 12, 20, 5, 21, 28, 15, 5, -7, 10, and 35); overall score: 212; decision: Accept; odds of repayment: 11:1
Credit Scoring
• Project
– Input x feature vector
– Label y, default or not
– Data (xi , yi)
– Target y=f(x)
• Objective
– Given new x, predict y so that probability of error is minimal
Typical Input Data
Characteristic: Values
Time at present address: 0-1, 1-2, 3-4, 5+ years
Home status: Owner, tenant, other
Telephone: Yes, no
Applicant's annual income: $(0-10000), $(11000-20000), $(21000+)
Credit card: Yes, no
Type of bank account: Cheque and/or savings, none
Age: 18-25, 26-40, 41-55, 55+ years
Type of occupation: Coded
Purpose of loan: Coded
Marital status: Married, divorced, single, widow
Time with bank: Years
Time with employer: Years
Input Data: FICO Score
Not in the score: demographic data
Characteristics of Data
• X:
– Continuous
– Discrete
– Normal distribution?
• Y:
– Binary data: 0 or 1 (=default)
Scoring Models
• Statistical Methods
– DA (Discriminant Analysis)
– Linear regression
– Logistic regression
– Probit analysis
– Non-parametric models
» Nearest-neighbor approach
Statistical Methods: Discriminant Analysis
• Multivariate statistical analysis: several predictors (independent variables) and several groups (a categorical dependent variable, e.g. 0 and 1)
• Predictive DA: for a new observation, calculate the discriminant score, then classify the observation according to the score
• The objective is to maximize the ratio of the between-group to the within-group sum of squares, which gives the best discrimination between the groups (within-group variance is due solely to randomness; between-group variability is due to the difference of the means)
• The predictors are assumed to be normally distributed within each group (but normality only becomes important if significance tests are to be carried out on small samples)
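As an illustration of the discriminant-score idea, here is a minimal sketch of a two-group Fisher discriminant with two predictors. The applicant data, the feature choice (years at address, debt ratio), and the midpoint cut-off rule are all assumptions made for the example, not part of the lecture.

```python
# Hypothetical applicants: (years at present address, debt ratio).
good = [(6, 0.20), (8, 0.30), (5, 0.25), (9, 0.15)]
bad = [(1, 0.60), (2, 0.50), (3, 0.70), (1, 0.55)]


def mean_vector(rows):
    """Column means of a list of 2-feature rows."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def scatter_matrix(rows, mean):
    """2x2 within-group sum of squares and cross-products."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mean[0], r[1] - mean[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def discriminant_score(w, x):
    return w[0] * x[0] + w[1] * x[1]


def fit_fisher(good_rows, bad_rows):
    """Return discriminant weights and a midpoint cut-off score (assumed rule)."""
    m_g, m_b = mean_vector(good_rows), mean_vector(bad_rows)
    s_g, s_b = scatter_matrix(good_rows, m_g), scatter_matrix(bad_rows, m_b)
    sw = [[s_g[i][j] + s_b[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    diff = [m_g[0] - m_b[0], m_g[1] - m_b[1]]
    # w = Sw^{-1} (mean_good - mean_bad), using the closed-form 2x2 inverse.
    w = [(sw[1][1] * diff[0] - sw[0][1] * diff[1]) / det,
         (-sw[1][0] * diff[0] + sw[0][0] * diff[1]) / det]
    cutoff = 0.5 * (discriminant_score(w, m_g) + discriminant_score(w, m_b))
    return w, cutoff


w, cutoff = fit_fisher(good, bad)


def classify(x):
    """Classify by comparing the discriminant score with the cut-off."""
    return "good" if discriminant_score(w, x) > cutoff else "bad"


print(w, cutoff, classify((7, 0.22)))
```

The weights maximize the between-group to within-group ratio along a single direction; a production scorecard would also handle categorical inputs and unequal misclassification costs.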
Statistical Credit Scoring
[Figure: number of customers by credit score for the good-credit and bad-credit groups, with a cut-off score marked]
Statistical Credit Scoring
Credit scoring systems:
• Altman Z-score model:
• Z = 0.012 X1 + 0.014 X2 + 0.033 X3 + 0.006 X4 + 1.0 X5
– X1 = working capital/total assets ratio
– X2 = retained earnings/total assets ratio
– X3 = earnings before interest and taxes/total assets ratio
– X4 = market value of equity/book value of total liabilities ratio
– X5 = sales/total assets ratio
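The Z-score can be computed directly from the five ratios. The sketch below assumes the convention that goes with these coefficients in Altman's original formulation: X1 to X4 are entered as percentages (e.g. 20 for 20%) and X5 as a plain ratio. The example firm is hypothetical.

```python
def altman_z(x1_pct, x2_pct, x3_pct, x4_pct, x5_ratio):
    """Altman Z-score with X1..X4 in percent and X5 as a ratio (assumed units)."""
    return (0.012 * x1_pct + 0.014 * x2_pct + 0.033 * x3_pct
            + 0.006 * x4_pct + 1.0 * x5_ratio)


# Hypothetical firm: working capital/TA 20%, retained earnings/TA 30%,
# EBIT/TA 10%, market equity/book liabilities 150%, sales/TA 1.5.
z = altman_z(20.0, 30.0, 10.0, 150.0, 1.5)
print(f"Z = {z:.2f}")
```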
Statistical Methods: Linear Regression
• The regression model is:
$$Y_i = \beta_0 + \beta' X_i + u_i$$
• Because Y takes only two values, u can also take only two values for a given X; thus u cannot be normally distributed.
• u is heteroskedastic, which makes OLS inefficient.
• The estimated probability may well lie outside [0,1].
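The last drawback is easy to see numerically. The sketch below fits the linear model by OLS to a small made-up sample with one predictor and a 0/1 outcome; the data are purely illustrative.

```python
# Hypothetical sample: one predictor x and a binary default indicator y.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Closed-form OLS for a single regressor with an intercept.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

fitted = [intercept + slope * x for x in xs]
print(f"lowest fitted 'probability':  {min(fitted):.3f}")
print(f"highest fitted 'probability': {max(fitted):.3f}")
```

The fitted values at the two ends of the sample fall below 0 and above 1, which is why the logistic and probit forms are preferred for default probabilities.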
Statistical Methods: Nearest-Neighbor Approach
• A historical database is divided into two groups (good and bad)
• When a new consumer arrives, calculate the distance between the consumer and everyone in the database
• The consumer is assigned to the same category as the nearest one(s)
• Problems:
– The definition of distance and the number of nearest neighbors to use
– Scoring speed: when a new x arrives, we need to calculate the distance between the new x and all of the historical data; too much calculation!
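A minimal sketch of the nearest-neighbor rule is shown below. It assumes Euclidean distance and a majority vote among k = 3 neighbors, which are exactly the two choices the slide flags as open questions; the historical records are hypothetical.

```python
import math
from collections import Counter

# Hypothetical history: ((years at address, debt ratio), label).
history = [
    ((6, 0.20), "good"), ((8, 0.30), "good"), ((5, 0.25), "good"), ((9, 0.15), "good"),
    ((1, 0.60), "bad"), ((2, 0.50), "bad"), ((3, 0.70), "bad"), ((1, 0.55), "bad"),
]


def knn_classify(records, x_new, k=3):
    """Majority vote among the k historical records closest to x_new."""
    # Every score requires a pass over the full database (the speed problem).
    distances = sorted((math.dist(features, x_new), label) for features, label in records)
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


print(knn_classify(history, (7, 0.20)))
print(knn_classify(history, (1.5, 0.60)))
```

Because the features are not rescaled here, the distance is dominated by the years-at-address variable, which illustrates why the definition of distance matters.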
Scoring Models
• Non-statistical Methods
– Mathematical programming
– Recursive partitioning
– Expert systems
– Machine Learning
» Neural Networks
» Support Vector Machine (SVM)
Which Method is Best?
• In general there is no overall best method. What is best will depend on the details of the problem:
– The data structure
– The characteristics used
– The extent to which it is possible to separate the classes by using those characteristics
– The objective of the classification (overall misclassification rate, cost-weighted misclassification rate, bad risk rate among those accepted, some measure of profitability, etc.)
• In the following slides, we introduce three widely used models in detail: logistic regression, neural networks, and SVM
Logistic Regression
• Empirical studies show that logistic regression may perform better than linear models (and hence better than discriminant analysis) when the data are non-normal (particularly for binary data) or when the covariance matrices of the two groups are not identical.
• Therefore, logistic regression is the preferred method among the statistical methods
• Probit regression is similar to logistic regression
Performing Logistic Regression
• Logistic Regression can be performed using the Maximum Likelihood method
• In the maximum likelihood method, we are seeking parameter values that maximize the likelihood of the observations occurring
Logistic Regression: Setup
• Directly models the default probability as a function of the input variables X (a vector)
• Define
$$Y_l = \begin{cases} 1 & \text{if obligor } l \text{ defaults} \\ 0 & \text{if obligor } l \text{ does not} \end{cases}$$
$$P(X_l) = \Pr(\text{obligor } l \text{ defaults} \mid X_l)$$
• Assume
$$P(X_l) = h(a' X_l), \qquad h(x) = \frac{1}{1 + \exp(-x)}$$
Logistic Regression: Setup
• Assuming the observations are independent, the probability (likelihood) of the observed sample is given by
$$L = \prod_{l=1}^{n} P(X_l)^{Y_l} \left[1 - P(X_l)\right]^{1 - Y_l} = \prod_{l=1}^{m} P(X_l) \prod_{l=m+1}^{n} \left[1 - P(X_l)\right]$$
where the observations are ordered so that the first m obligors are the defaulters, and
$$P(X_l) = \frac{1}{1 + \exp\left(-a_0 - \sum_i a_i X_{li}\right)}$$
Logistic Regression and ML
• The ML estimator (of the coefficients a) for logistic regression can be found by applying non-linear optimization to the above likelihood function.
• The simplified version is given by
$$L = \frac{\prod_{l=1}^{m} \exp\left(a_0 + \sum_i a_i X_{li}\right)}{\prod_{l=1}^{n} \left[1 + \exp\left(a_0 + \sum_i a_i X_{li}\right)\right]}$$
or
$$\log L = \sum_{l=1}^{m} \left(a_0 + \sum_i a_i X_{li}\right) - \sum_{l=1}^{n} \log\left[1 + \exp\left(a_0 + \sum_i a_i X_{li}\right)\right]$$
Consumer Credit Risk
27
Logistic Regression and MLLogistic Regression and ML
• It is easy to show that the log of the odds (= logit) are a linear function:
• Therefore, the odds per se are a multiplicative function.
• Since probability takes on values between (0,1), the odds take on values between (0,∞), logits take on values between (-∞,∞). So, it looks very much like linear regression, and it does not need to restrict the dependent variable to values of {0, 1}.
• It is not solvable using OLS.
i
liil
l XaaXP
XP0)(1
)(ln
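The relationship between probability, odds, and logit can be checked with a few helper functions. The function names are made up for this sketch; the 11:1 figure is the odds of repayment quoted in the scoring example earlier in the lecture.

```python
import math


def odds_from_prob(p):
    """Odds p / (1 - p) for a probability p in (0, 1)."""
    return p / (1.0 - p)


def logit(p):
    """Log of the odds."""
    return math.log(odds_from_prob(p))


def prob_from_odds(odds):
    """Invert the odds back to a probability."""
    return odds / (1.0 + odds)


p_repay = prob_from_odds(11.0)  # odds of repayment of 11:1
print(f"probability of repayment: {p_repay:.4f}")
print(f"logit: {logit(p_repay):.4f}")
```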
Logistic Function and Distribution
$$\frac{1}{1 + \exp\left(-\left(a_0 + \sum_i a_i x_i\right)\right)}$$
Normal Distribution
The tails are much thinner than those of the logistic distribution
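A quick numerical comparison makes the tail difference concrete. The sketch below rescales the logistic distribution to unit variance (the standard logistic has standard deviation π/√3) so that the two are compared on the same scale; that rescaling is a choice made for this example.

```python
import math


def normal_tail(x):
    """P(Z > x) for a standard normal Z."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def unit_logistic_tail(x):
    """P(L > x) for a logistic variable rescaled to unit variance."""
    scale = math.pi / math.sqrt(3.0)
    return 1.0 / (1.0 + math.exp(x * scale))


for x in (2.0, 3.0, 4.0):
    print(f"x = {x}: normal {normal_tail(x):.2e}, logistic {unit_logistic_tail(x):.2e}")
```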
RiskCalc: Moody’s Default Model
• Probit regression
$$\text{Prob}(y = \text{default} \mid x; \beta) = \Phi(\beta' x) = \int_{-\infty}^{\beta' x} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{t^2}{2}\right) dt$$
– where x is the vector of the ratios
– equivalently, $\Phi^{-1}(y) = \beta' x$
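The probit mapping from a linear index to a default probability can be sketched with the standard library alone. The coefficient vector and the ratios below are hypothetical and are not RiskCalc's actual inputs or estimates.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, written with the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def probit_pd(beta, x):
    """Default probability Phi(beta'x) for coefficient vector beta and ratios x."""
    index = sum(b * xi for b, xi in zip(beta, x))
    return normal_cdf(index)


# Hypothetical coefficients and ratios; the first entry of x is a constant 1.
beta = [-2.0, 1.5, -0.8]
x = [1.0, 0.6, 0.3]
print(f"PD = {probit_pd(beta, x):.4f}")
```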
Neural Networks
• Non-parametric method
• Non-linear model estimation technique, e.g.
– Saturation effect: the marginal effect of a financial ratio may decline quickly
– Multiplicative factors: highly leveraged firms have a harder time borrowing money
• Neural networks decide how to combine and transform the raw characteristics in the data, as well as yielding estimates of the parameters of the decision surface
• Well suited to situations where we have a poor understanding of the data structure
Neural Networks
• Use the logistic function as the activation function in all the nodes
• Work well with classification problems
• Drawbacks
– May take much longer to train
– In credit scoring, there is already a solid understanding of the data
Multilayer Perceptron (MLP)
• The input values X are sent, along with a constant 1, to the hidden layer neurons
• Each hidden layer neuron applies weights to its inputs and generates a nonlinear output that is sent to the next layer
• The output neuron takes a constant 1 together with the input from the hidden layer and generates the output signal
• When learning occurs, the weights are adjusted so that the final OUTs produce the least error (the output of a single neuron is called OUT)
[Figure: MLP with an input layer (X1, X2, and a constant 1), a hidden layer (H1, H2, and a constant 1), and an output layer (O); input-to-hidden weights w01, w02, w11, w12, w21, w22; hidden-to-output weights w0, w1, w2]
Multilayer Perceptron (MLP)
• Input nodes do not perform processing
• Each hidden and output node processes the signals by an activation function. The most frequently used is:
$$f(x) = \frac{1}{1 + \exp(-g(x))}, \qquad g(x) = w_0 + \sum_{i=1}^{n} w_i x_i$$
– x: vector of input signals
– w: vector of parameters
• The parameters, w, are obtained by “training” the neural net on historical data.
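A forward pass through a small network of this kind (two inputs, two hidden neurons, one output, logistic activation everywhere) can be sketched as follows. The weight values are arbitrary placeholders, not trained parameters, and the helper names are invented for the example.

```python
import math


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def neuron_out(bias_weight, weights, inputs):
    """OUT of one neuron: f(g(x)) with g(x) = w0 + sum_i w_i x_i."""
    g = bias_weight + sum(w * x for w, x in zip(weights, inputs))
    return logistic(g)


def mlp_forward(x, hidden_params, output_params):
    """Input nodes pass x through unchanged; hidden and output nodes apply f."""
    hidden = [neuron_out(b, w, x) for b, w in hidden_params]
    b_out, w_out = output_params
    return neuron_out(b_out, w_out, hidden)


# Placeholder weights: (bias weight, [weights on the incoming signals]).
hidden_params = [(0.1, [0.5, -0.3]), (-0.2, [0.8, 0.4])]
output_params = (0.05, [1.2, -0.7])

print(mlp_forward([1.0, 2.0], hidden_params, output_params))
```

Training would adjust these weights to reduce the error between the network output and the observed default labels.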
Support Vector Machine (SVM)
• A relatively new and promising supervised learning method for
– Pattern recognition (classification)
– Regression estimation
• It originates from the statistical learning theory developed by Vapnik and Chervonenkis
– 1960s: Vapnik, V. N., support vectors
– 1995: statistical learning theory
» Vapnik, V. N., “The Nature of Statistical Learning Theory”. New York: Springer-Verlag, 1995
» Cortes, C. and Vapnik, V. N., “Support Vector Networks”, Machine Learning, 20:1-25, 1995
– Development, from 1995 to now
SVM Extension
• Proximal Support Vector Machine (PSVM)
– Glenn Fung and Olvi L. Mangasarian, 2001
• Incremental and Decremental Support Vector Machine Learning
• Least Squares Support Vector Machine (LS-SVM)
• Also, SVMs can be seen as a new training method for learning machines (such as NNs)
Linear Classifier
• There are infinitely many lines that have zero training error.
• Which line should we choose?
Linear Classifier
• Choose the line with the largest margin.
– The optimal separating hyperplane (OSH)
• The “large margin classifier”
[Figure: separating line with the margin and the “support vectors” marked]
Performance of SVM
• S&P CreditModel White Paper
• Fan and Palaniswami (2000):
– SVM 70.35%–70.90%
– NN 66.11%–68.33%
– MDA 59.79%–63.68%
Credit Scoring and Beyond
• Data collected at application become outdated quickly
• The way customers use their credit accounts is an indicator of future performance (behavior scoring)
• This allows PD estimates and credit control tools to be updated over time
• The future is moving toward profitability scoring.
– Banks should not only care about getting their money back
– Banks want to extend credit to customers on whom they can earn a positive risk-adjusted NPV
Best Practice in Consumer Credit Risk Management
• Credit decision-making
– Adapt to changes in the economy or within a customer segment
• Credit scoring
– Adaptive algorithms using credit bureau data and the firm’s own experience
• Loss forecasting
– Historical delinquency rates and charge-off trend analysis
– Delinquency flow and segmented vintage analysis
• Portfolio management
– Risk-adjusted return on capital (RAROC)
Analytical Techniques
• Response analysis: avoid adverse selection consequences that result in increased concentrations of high-risk borrowers
• Pricing strategies: avoid “follow the competition”; focus on segment profitability and cash flow
• Loan amount determination: avoid being judgmental; quantify probabilities of losses
• Credit loss forecasting: decompositional roll rate modeling, trend and seasonal indexing, and vintage curves
• Portfolio management strategies: important for repricing and retention; don’t be judgmental; integrate the behavioral element and cash flow profitability analysis (underwriting)
• Collection strategies: behavioral models are useful
Credit Scoring and Loss Forecasting
• Two critical components of consumer credit risk analysis
– They correspond to default probabilities and loss given default
• These two are linked
– Loss given default is higher when default probability is greater
• Market and economic variables matter
– In bad economic states, there will be more defaults and lower recovery
• Good modeling should achieve stability
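The link between default probability and loss given default matters for forecasting. The sketch below uses two hypothetical economic states and assumes the expected loss per unit of exposure in each state is PD × LGD; all numbers are invented for illustration.

```python
# Hypothetical states: (state probability, default probability, loss given default).
states = {
    "good": (0.8, 0.02, 0.40),
    "bad": (0.2, 0.06, 0.60),
}

expected_pd = sum(p * pd for p, pd, _ in states.values())
expected_lgd = sum(p * lgd for p, _, lgd in states.values())

# Expected loss rate when PD and LGD move together across states.
expected_loss_rate = sum(p * pd * lgd for p, pd, lgd in states.values())

# Naive estimate that multiplies the averages and ignores the link.
naive_loss_rate = expected_pd * expected_lgd

print(f"state-by-state expected loss rate: {expected_loss_rate:.4%}")
print(f"naive average PD x average LGD:     {naive_loss_rate:.4%}")
```

Because defaults and low recoveries coincide in the bad state, multiplying the average PD by the average LGD understates the expected loss.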
Do Consumers Choose the Right Credit Contracts?
• Agarwal, Chomsisengphet, Liu, and Souleles (2006):
– Some don’t, especially when the stakes are small
– But consumers with high balances do!
• Other issues:
– Personal bankruptcy in the U.S. soared!
– Avoid/fight predatory lending! (e.g., subprime lending)
– China is starting to have a consumer credit market
China’s Consumer Spending
Category                   1997  1998  1999  2000  2001  2002  2003  %Chg 97-03
Food                       2684  2756  2845  3029  3326  3487  3789  41%
Medicine & Healthcare       213   255   300   356   401   455   506  138%
Clothing                    785   750   728   791   866   885   958  22%
Household Durables          414   485   569   595   657   727   790  91%
Transport & Communication   290   337   385   437   498   554   614  112%
Education & Entertainment   550   643   739   837   945  1057  1170  113%
Housing                     424   507   599   663   752   842   931  120%
Services                    244   268   296   330   367   400   441  80%
TOTAL                      5603  6001  6462  7037  7811  8407  9198  64%
China’s Consumer Credit Market
• 1999-2004: growth rate 52%
– Automobile loans: 110%
» Only 15% of auto sales, compared to 80% in the U.S.
– Bankcard: 36%
» Mostly debit cards
– Mortgage: 1000%
» Still a long way to go! Only 8% of GDP, compared to 45% in developed economies
• Other markets
– Student loans
– Credit cards!
• More opportunities are waiting!
Consumer Loans vs. GDP per Capita
[Figure: GDP per capita (constant 2000 US$, axis from 0 to 35,000) and consumer loans to GDP (% of GDP, axis from 0 to 35)]
Sources: World Bank, Fitch, Central Banks and Banks Superintendencies.
Summary
• Introduction to consumer credit risk:
– Credit scoring methods
– Practical issues
• Exam: Saturday, August 4, 2PM
Review for Exam
• Topics:
– Credit risk modeling: structural/reduced-form/incomplete information
– Recovery rate & default correlation
– Credit derivatives
– Credit VaR/Basel II/consumer credit risk
• Question types (tentative!):
– True or False (20%)
– Multiple Choice (20%)
– Short Answers (20%)
– Problems (40%)
– 60% conceptual; 40% analytical
• Formulas will be provided if needed.
SVM Approach Details
Computing the Margin
• The plane separating the two classes is defined by
$$w^T x = a$$
• The dashed planes are given by
$$w^T x = a + b, \qquad w^T x = a - b$$
Computing the Margin
• Divide by b
$$w^T x / b = a/b + 1, \qquad w^T x / b = a/b - 1$$
• Define new w = w/b and α = a/b
$$w^T x - \alpha = 1, \qquad w^T x - \alpha = -1$$
• We have defined a scale for w and a
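Computing the Margin
• We have, for a point x on the lower dashed plane and the point x + θw directly across from it on the upper dashed plane,
$$w^T x - \alpha = -1, \qquad w^T (x + \theta w) - \alpha = 1$$
• which gives
$$\theta = \frac{2}{w^T w}, \qquad \text{margin} = \|\theta w\| = \frac{2}{\|w\|}$$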
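The margin formula can be verified numerically. In the sketch below the vector w and the offset alpha are hypothetical values already on the rescaled form, and theta is simply a name chosen here for the step length along w that moves a point from one dashed plane to the other.

```python
import math


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


w = [0.5, 0.5]   # hypothetical normal vector, already rescaled
alpha = 0.0      # hypothetical offset

x_minus = [-1.0, -1.0]            # lies on the plane w'x - alpha = -1
theta = 2.0 / dot(w, w)           # step along w needed to reach the other plane
x_plus = [xi + theta * wi for xi, wi in zip(x_minus, w)]

margin = 2.0 / math.sqrt(dot(w, w))
print(f"x_plus = {x_plus}, margin = {margin:.4f}")
```

The distance between the two points equals 2/||w||, which is why a smaller ||w|| means a wider margin.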
Quadratic Programming Problem
• Maximizing the margin is equivalent to minimizing ||w||^2. Minimize ||w||^2 subject to the constraints:
$$w^T x(n) - \alpha \ge +1 \text{ for all } n \text{ with } y(n) = +1, \qquad w^T x(n) - \alpha \le -1 \text{ for all } n \text{ with } y(n) = -1$$
• where we have defined y(n) = +1 for all points in one class and y(n) = –1 for all points in the other
• This enables us to write the constraints as
$$y(n)\left[w^T x(n) - \alpha\right] - 1 \ge 0$$
Quadratic Programming Problem
• Minimize the cost function (Lagrangian)
$$L_p = \frac{1}{2}\|w\|^2 - \sum_{n=1}^{N} \lambda_n \left( y(n)\left[w^T x(n) - \alpha\right] - 1 \right), \qquad \text{with } L_p = L_p(w, \alpha, \lambda)$$
• Here we have introduced non-negative Lagrange multipliers λ_n ≥ 0 that express the constraints
$$-\frac{\partial L_p}{\partial \lambda_n} = y(n)\left[w^T x(n) - \alpha\right] - 1 \ge 0$$
Quadratic Programming Problem
• The first order conditions evaluated at the optimal solution are
$$\frac{\partial L_p}{\partial \alpha} = \sum_{n=1}^{N} \lambda_n y(n) = 0$$
$$\nabla_w L_p = w - \sum_{n=1}^{N} \lambda_n y(n) x(n) = 0$$
• The solution can be derived (together with the constraint)
$$\lambda_n \left( y(n)\left[w^T x(n) - \alpha\right] - 1 \right) = 0$$
Quadratic Programming Problem
• The original minimization problem is equivalent to the following maximization problem (the dual)
$$L_D = \sum_{n=1}^{N} \lambda_n - \frac{1}{2} \sum_{n=1}^{N} \sum_{m=1}^{N} \lambda_n \lambda_m \, y(m) \, y(n) \, x(m)^T x(n)$$
$$\text{s.t. } \sum_{n=1}^{N} \lambda_n y(n) = 0 \text{ and } \lambda_n \ge 0$$
• For non-support vectors, λ will be zero, as the original constraint is not binding; only a few λ’s will be nonzero.
Quadratic Programming Problem
• Having solved for the optimal λ’s (denoted as $\hat\lambda_n$), we can derive the others
$$\hat{w} = \sum_{n=1}^{N} \hat\lambda_n y(n) x(n)$$
$$\hat\lambda_n \left( y(n)\left[\hat{w}^T x(n) - \hat\alpha\right] - 1 \right) = 0$$
• To classify a new data point x, simply compute
$$\text{sgn}\left(\hat{w}^T x - \hat\alpha\right)$$
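The recovery of w and the offset from the multipliers, and the final sign rule, can be sketched on a toy problem. The three training points are hypothetical, and the multipliers were worked out by hand for this tiny example rather than produced by a quadratic programming solver; the offset is called alpha in the code.

```python
# Toy training set: two support vectors and one point away from the margin.
points = [(1.0, 1.0), (-1.0, -1.0), (3.0, 2.0)]
labels = [1, -1, 1]
# Multipliers derived by hand for this toy problem (assumption, not solver output).
lambdas = [0.25, 0.25, 0.0]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


# Dual feasibility: sum_n lambda_n y(n) = 0 and lambda_n >= 0.
assert abs(sum(l * y for l, y in zip(lambdas, labels))) < 1e-12
assert all(l >= 0 for l in lambdas)

# w = sum_n lambda_n y(n) x(n)
w = [sum(l * y * x[j] for l, y, x in zip(lambdas, labels, points)) for j in range(2)]

# For a support vector (lambda_n > 0): y(n)[w'x(n) - alpha] = 1,
# and since y(n) is +1 or -1 this gives alpha = w'x(n) - y(n).
sv = next(i for i, l in enumerate(lambdas) if l > 0)
alpha = dot(w, points[sv]) - labels[sv]


def classify(x):
    """sgn(w'x - alpha), returning +1 or -1."""
    return 1 if dot(w, x) - alpha >= 0 else -1


print(w, alpha, classify((2.0, 3.0)), classify((-2.0, 0.5)))
```

Only the two points with nonzero multipliers determine w; the third point satisfies its constraint with slack and drops out, as the slide on the dual problem notes.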