Econometric paper

    Journal of Econometrics 40 (1989) 3-14. North-Holland

    AN ECONOMETRIC ANALYSIS OF THE BANK CREDIT SCORING PROBLEM*

    William J. BOYES, Dennis L. HOFFMAN and Stuart A. LOW

    Arizona State University, Tempe, AZ 85287, USA

    Most credit assessment models used in practice are based on simple credit scoring functions estimated by discriminant analysis. These functions are designed to distinguish whether or not applicants belong to the population of would-be defaulters. We suggest that the traditional view that emphasizes default probability is too narrow. Our model of credit assessment focuses on expected earnings. We demonstrate how maximum likelihood estimates of default probabilities can be obtained from a bivariate censored probit framework using a choice-based sample originally intended for discriminant analysis. The paper concludes with recommendations for combining these default probability estimates with other parameters of the loan earnings process to obtain a more meaningful model of credit assessment.

    1. Introduction

    Most credit assessment models used in practice are based on simple credit scoring functions designed to distinguish applicants who would repay from those who would default [see Altman et al. (1981) for a summary of recent examples]. These functions are typically based on discriminant analysis. Models are then judged on their ability to generate indices of applicant attributes that take on values above or below a critical cut-off level depending on whether or not the applicant belongs to the population of would-be defaulters.

    We suggest that this view of the credit assessment problem is too narrow. Ultimately the bank is interested in profit maximization - not simply a ranking essentially based on a measure of default probability. We develop a simple model of credit card lending that demonstrates how expected earnings on revolving credit loans depend both on maintained balances and probability of default. The choice-based estimator of Manski and Lerman (1977) may then be combined with the notion of partial observability to estimate default probabilities from non-random samples of consumer lending behavior originally intended for use in discriminant analysis. The paper concludes with suggestions for combining these default probability estimates with the other parameters of the loan earnings process to obtain a more meaningful credit assessment model.

    *This paper was presented at the Issues in Econometric Forecasting Conference at Arizona State University in March 1987. We benefited from discussions with Marie Connolly, Mike Ormiston and Peter Schmidt. Support from the Center for Financial System Research at Arizona State University is gratefully acknowledged.

    0304-4076/89/$3.50 © 1989, Elsevier Science Publishers B.V. (North-Holland)

    2. Credit card lending

    Assume that a credit card loan is granted (or denied) and is repaid or defaulted within a single period. Each loan then yields two possible outcomes. The probability distribution of these outcomes can be described by a Bernoulli trial:

    $$x = \begin{cases} r & \text{with repayment probability } 1 - p, \\ -w & \text{with default probability } p, \end{cases}$$

    where $r$ is the earnings on a repaid loan and $w$ denotes losses that must be written off when a loan is defaulted. Specifically, $r$ is the product of the nominal yield on credit card loans, $i_c$, and balances maintained on repaid accounts, $bal|r$, while $w$ is the product of the write-off rate, $q$, and balances that accrue on defaulted accounts, $bal|w$.

    If the bank knows all the parameters of the trial, it establishes a credit approval requirement for each applicant. Approved loans must have expected returns that exceed the opportunity cost of bank funds - for example, $E(x) \geq E(t)$, where $E(t)$ could be the earnings that would be obtained by investing the funds in government securities at interest rate $i_t$. In this case the credit granting requirement is $(1 - p)(bal|r)(i_c - i_t) - p(bal|w)(q + i_t) > 0$. Hence, credit is granted only to those applicants with default probabilities less than $(bal|r)(i_c - i_t)/[(bal|r)(i_c - i_t) + (bal|w)(q + i_t)]$.
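The step from the granting requirement to the cut-off probability can be written out explicitly. The shorthand $G$ and $L$, and the subscript labels on the card yield $i_c$ and the opportunity rate $i_t$, are assumed here for readability and are not notation confirmed by the text.

$$\begin{aligned}
G &= (bal|r)(i_c - i_t), \qquad L = (bal|w)(q + i_t), \\
(1 - p)\,G - p\,L > 0 \;&\Longleftrightarrow\; G > p\,(G + L) \\
&\Longleftrightarrow\; p < \frac{G}{G + L}.
\end{aligned}$$

The last step divides by $G + L$, which is positive whenever the card yield exceeds the opportunity rate, so the direction of the inequality is preserved. The cut-off rises with the balance maintained on repaid accounts and falls with the balance at risk on defaulted accounts.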

    Two aspects of the above scenario are apparent. First, precise estimates of default probabilities are necessary to ensure accurate credit assessment. Historically banks have devoted significant resources to developing credit scores that are essentially proxies for default probabilities. Second, if lenders apply a uniform critical credit score to all applicants - as is customary with some banks - they are implicitly assuming that all applicants maintain the same revolving credit balances. Recent efforts in the area of behavioral scoring have attempted to identify those applicants that might maintain higher balances and yield greater returns, but these are in the earliest stages of development. Our paper illustrates how estimates of default probabilities might be obtained from data typically compiled by banks. We then briefly discuss how one might build a scoring model that accounts for variable balance behavior.

    Processing costs are presumed to be paid from the merchant's contribution on each revolving credit transaction or annual fees assessed to cardholders.


    3. Econometric considerations

    3.1. Sample stratification and sample selection

    Most samples used to estimate credit assessment functions are not randomly drawn from the applicant populations. In preparation for discriminant analysis, banks often segment samples into groups that repaid in a timely fashion, defaulted (or were chronic late payers) or were denied credit. The relative sizes of these groups may bear little relation to the proportions that are observed in a random sample from the applicant population. Also, the parameters of the assessment process must be estimated from a truncated or censored sample since not all applicants receive credit and, thus, there is no way to observe the subsequent behavior of the excluded group. Whether this warrants serious consideration depends on the nature of the sample censoring.

    We deal with the non-random choice-based stratification issue by applying the weighted exogenous sample maximum likelihood estimator (WESML) designed by Manski and Lerman (1977). The WESML is obtained by maximizing a weighted log likelihood function with weights determined by comparing sample proportions with corresponding population frequencies. For our sample these weights were determined after discussions with bank officials.

    Samples used to estimate repayment probabilities are censored since only applicants that receive credit are observed to default or repay. Heckman (1979) has shown that censored samples can lead to biased estimates if, in our example, the sample selection rule is correlated with the errors in the repayment probability equation. Thus the impact of sample censoring depends on the nature of the sample selection rule. If lenders rely strictly on quantitative credit scores, sample selection is deterministically governed by applicant attributes and the sample selection rule does not lead to biased estimates. However, most lenders maintain that credit scoring is only one aspect of the credit assessment process and that loan officers also allow subjective assessments to enter the loan granting decision. Presuming that these assessments are not simply a different deterministic function of observed attributes, they add an element of randomness to the loan granting process and ultimately the sample selection rule. If these subjective assessments are correlated with default equation disturbances, censoring may lead to biased estimates of credit assessment probabilities.

    In our case the structure exposed to potential sample selection bias has a qualitative dependent variable so that the standard Heckman procedure is not applicable. Technically, the loan granting model and the default model together constitute a bivariate qualitative dependent variable model that exhibits a form of partial observability first discussed by Poirier (1980) and applied by Farber (1983). Meng and Schmidt (1985) summarize this model along with several related structures.


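    To illustrate, assume that we have empirical credit granting and default equations with binary dependent variables

    $$y_{t1} = Z_t'\alpha_1 + \varepsilon_{t1}, \qquad y_{t1} = 1 \text{ if loan granted}, \quad y_{t1} = 0 \text{ if not granted},$$

    $$y_{t2} = Z_t'\alpha_2 + \varepsilon_{t2}, \qquad y_{t2} = 1 \text{ if loan defaulted}, \quad y_{t2} = 0 \text{ if not defaulted}.$$

    Recognizing that $y_{t2}$ is observed in this censored probit model only if $y_{t1} = 1$, the log likelihood function for a sample of $T$ applicants, as specified in Meng and Schmidt (1985, eq. 6), is

    $$\ln L(\alpha_1, \alpha_2; \rho) = \sum_{t=1}^{T} \Big\{ y_{t1}\, y_{t2} \ln F(Z_t'\alpha_1, Z_t'\alpha_2; \rho) + y_{t1}(1 - y_{t2}) \ln\big[\Phi(Z_t'\alpha_1) - F(Z_t'\alpha_1, Z_t'\alpha_2; \rho)\big] + (1 - y_{t1}) \ln\big[1 - \Phi(Z_t'\alpha_1)\big] \Big\},$$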

    where $F(\cdot)$ and $\Phi(\cdot)$ denote the bivariate standard normal c.d.f. and univariate standard normal c.d.f., respectively. Estimates of the parameters are obtained by maximizing $\ln L$. These estimates offer efficiency gains over those obtained in the separate estimation of the two equations. More importantly, the joint approach accounts for potential correlation between the two equations, $\rho$, and thereby corrects for potential sample selection bias that could be incurred in the separate estimation of the default equation.
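The censored probit log likelihood described above can be evaluated with a short routine. What follows is a minimal sketch rather than the authors' implementation: the function names, the `(z, y1, y2)` row layout, the use of one regressor vector for both equations, and the Simpson's-rule approximation to the bivariate normal c.d.f. are all assumptions made for illustration.

```python
import math


def norm_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    """Standard normal c.d.f. (Phi in the text)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bvn_cdf(a, b, rho, n=2000, lower=-8.0):
    """Approximate F(a, b; rho) = P(X <= a, Y <= b) for a standard bivariate
    normal with |rho| < 1, integrating pdf(x) * Phi((b - rho x) / s) over
    [lower, a] with Simpson's rule (n must be even)."""
    if a <= lower:
        return 0.0
    s = math.sqrt(1.0 - rho * rho)

    def g(x):
        return norm_pdf(x) * norm_cdf((b - rho * x) / s)

    h = (a - lower) / n
    total = g(lower) + g(a)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * g(lower + k * h)
    return total * h / 3.0


def censored_probit_loglik(rows, alpha1, alpha2, rho):
    """Sum the three likelihood contributions over applicants.

    rows: iterable of (z, y1, y2); y1 = 1 if credit was granted,
    y2 = 1 if the loan defaulted (y2 is ignored when y1 == 0).
    """
    ll = 0.0
    for z, y1, y2 in rows:
        idx1 = sum(c * v for c, v in zip(alpha1, z))  # granting index
        idx2 = sum(c * v for c, v in zip(alpha2, z))  # default index
        if y1 == 0:
            ll += math.log(1.0 - norm_cdf(idx1))       # denied
        elif y2 == 1:
            ll += math.log(bvn_cdf(idx1, idx2, rho))   # granted, defaulted
        else:
            ll += math.log(norm_cdf(idx1) - bvn_cdf(idx1, idx2, rho))  # granted, repaid
    return ll
```

In practice the function would be handed to a numerical optimizer over the two coefficient vectors and the correlation; a library routine for the bivariate normal c.d.f. would normally replace the hand-rolled integration.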

    Our sample requires that we estimate this censored probit model from a choice-based sample. We found this to be a straightforward application of Manski and Lerman's WESML estimator. The weighted likelihood function and asymptotic variance-covariance matrix associated with these censored probit WESML estimates are described in the appendix.

    3.2. Prediction in bank credit scoring models

    In practical applications, lenders gauge expected profits using estimates of the parameters of the earnings distribution. To illustrate, suppose that all yields, write-off rates, and balances are known a priori (as in an installment loan) and default probabilities are estimated as outlined above to obtain $\hat{p}_t$ for each applicant. In this case, the expected return on the $t$th account is

    $$E(x_t) = E_{\hat{p}}\{E(x_t \mid \hat{p}_t, r_t, w_t)\} = E_{\hat{p}}\{(1 - \hat{p}_t)\, r_t - \hat{p}_t\, w_t\}.$$


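    Noting that $\hat{p}_t$ is a non-linear function of random variables, we apply a result from McFadden and Reid (1975):

    $$E_{\hat{p}}(\hat{p}_t) = \Phi\left( \frac{Z_t'\hat{\alpha}_2}{\sqrt{1 + \sigma^2_{Z_t'\hat{\alpha}_2}}} \right),$$

    where $\sigma^2_{Z_t'\hat{\alpha}_2} = Z_t' V(\hat{\alpha}_2) Z_t$ and $V(\hat{\alpha}_2)$ is the asymptotic covariance matrix of the censored probit default probability estimates. Since $Z_t'\hat{\alpha}_2 < 0$ for all applicants in our sample, $E_{\hat{p}}(\hat{p}_t) > \hat{p}_t$, and naive measures of expected returns - based on $\hat{p}_t$ rather than $E_{\hat{p}}(\hat{p}_t)$ - are biased upward. We estimate the potential significance of this bias in the empirical section.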
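The direction of this bias can be illustrated numerically. The sketch below assumes that the estimated default index is treated as normally distributed around its point estimate, in which case the identity E[Φ(μ + σu)] = Φ(μ/√(1 + σ²)) for standard normal u gives the expected default probability. The function names and all numerical inputs are hypothetical and chosen only for illustration.

```python
import math


def norm_cdf(x):
    """Standard normal c.d.f."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def naive_default_prob(index):
    """Plug-in default probability, Phi(Z'alpha_hat)."""
    return norm_cdf(index)


def expected_default_prob(index, index_var):
    """Default probability averaged over estimation error in the index,
    assuming the index is normal with variance index_var = Z'V(alpha_hat)Z."""
    return norm_cdf(index / math.sqrt(1.0 + index_var))


def expected_return(p, r, w):
    """Expected earnings (1 - p) r - p w on one account."""
    return (1.0 - p) * r - p * w


# Hypothetical applicant: negative default index, modest estimation variance.
index, index_var = -1.5, 0.25
r, w = 135.0, 1365.0  # hypothetical earnings if repaid and loss if defaulted

p_naive = naive_default_prob(index)
p_adjusted = expected_default_prob(index, index_var)
naive_earnings = expected_return(p_naive, r, w)
adjusted_earnings = expected_return(p_adjusted, r, w)
```

With a negative index the adjusted probability exceeds the plug-in value, so the adjusted expected earnings are lower. For a positive index the adjustment would run the other way.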

    4. Empirical results

    4.1. Data

    The data employed in this paper were obtained from a single large financial institution that monitored a non-random sample of its credit card applicants between 1977 and 1980 as well as the performance through 1984 of those granted credit. The sample contains 4,632 credit card applicants with complete information. Of these, 3,711 (80.1%) were granted credit and 921 (19.9%) were denied credit. Of those granted credit, 1,938 (41.8% of the sample and 52.2% of those granted credit) were classified as "good" by the institution and 1,773 (38.3% of the sample and 47.8% of those granted credit) were classified as "bad".

    The WESML technique was applied to a bivariate probit model to adjust the sample to the true (as evaluated by bank officials) proportion of 51% granted credit and a 5% probability of default for those granted credit. This implies population proportions of 48.4% "good", 2.6% "bad" and 49.0% "deny". All estimates reported below incorporate adjustments for sample non-randomness induced by choice-based samples and partial observability with covariance matrix as described in the appendix.

    Each record contains information on personal characteristics, economic and financial variables, credit recipient status and repayment performance. Full variable descriptions are contained in table 1.
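The sample counts and population shares quoted above are enough to sketch the WESML weights, each being a population proportion divided by the corresponding sample proportion. The snippet is illustrative only: the dictionary keys and variable names are assumptions, and the population shares are recomputed from the 51% and 5% figures rather than taken from the rounded percentages in the text.

```python
# Sample counts by outcome group, as reported in the data description.
sample_counts = {"bad": 1773, "good": 1938, "deny": 921}
n_sample = sum(sample_counts.values())  # 4,632 applicants

# Population shares implied by the bank officials' figures.
grant_rate = 0.51     # share of applicants granted credit
default_rate = 0.05   # default probability among those granted credit
population_share = {
    "bad": grant_rate * default_rate,           # granted and defaulted
    "good": grant_rate * (1.0 - default_rate),  # granted and repaid
    "deny": 1.0 - grant_rate,                   # denied credit
}

# WESML weight for each group: population proportion / sample proportion.
weights = {
    group: population_share[group] / (sample_counts[group] / n_sample)
    for group in sample_counts
}
```

Because defaulted accounts are heavily over-represented in the sample, their weight is far below one, while denied applicants are under-represented and receive a weight well above one. The weighted group counts still sum to the sample size.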

    4.2. Estimates

    The bivariate censored probit estimates and asymptotic t-statistics for the loan granting decision (column 1) and default decision (column 2) are pre-

    In practice balances would also be replaced with estimated counterparts. However, these estimates would presumably be linear and conceivably uncorrelated with the default probability estimates.


    expected earnings due to high expected balances. An alternative explanation for the positive rho estimate is summarized in Boyes, Hoffman and Low (1986). This bank may have aggressively pursued minority accounts during this post-ECOA sample period in an effort to reduce the probability of class action discrimination suits. This might have led to a riskier loan portfolio and is consistent with the observed positive correlation between unexplained tendencies to grant credit and observed defaults.

    4.3. A simple simulation

    In an effort to illustrate the potential of the model described above, we can quantify some of the parameters in the earnings process to establish a hypothetical, yet realistic, credit granting criterion. Over our sample period, credit card loan rates averaged 21% per annum, our average "good" credit recipient maintained an account for about three years, and the opportunity cost of funds (one-year t-bill) averaged 12%. Although we do not have precise data on balances maintained by individual applicants, bank officials suggest that the average "good" account maintained about $500 in outstanding debt, while the average default resulted in a one-time loss of $1,500. We were also told that a write-off rate of 55% is in line with industry norms. Substituting these figures into the credit granting criterion established in section 2, we find that only those applicants with default probabilities below 9.0% have positive expected earnings.3 Using this level as a classification rule we find that 94% of the "good" accounts, 61.4% of the "bad" accounts and 69.2% of the "deny" accounts would be granted credit based on our estimated probability of default equation and simulated granting criterion.
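The 9.0% cut-off can be reproduced from the figures just quoted under one particular reading of the calculation, sketched below. The multi-year treatment is an assumption, not something the text states in detail: the interest spread is earned on the "good" balance in each of three years, and the loss on a default is the 55% write-off plus three years of foregone t-bill interest on the default balance. The function and parameter names are hypothetical.

```python
def cutoff_default_probability(bal_r, bal_w, card_rate, tbill_rate,
                               writeoff_rate, years=1):
    """Largest default probability with positive expected earnings.

    gain: spread earned on the repaid balance over the life of the account.
    loss: write-off plus foregone opportunity return on the defaulted balance.
    Both multi-year terms are assumptions made for this illustration.
    """
    gain = bal_r * (card_rate - tbill_rate) * years
    loss = bal_w * (writeoff_rate + tbill_rate * years)
    return gain / (gain + loss)


# Figures quoted in the text: 21% card rate, 12% t-bill rate, 55% write-off,
# $500 average good balance, $1,500 average default balance, three years.
cutoff = cutoff_default_probability(
    bal_r=500.0, bal_w=1500.0,
    card_rate=0.21, tbill_rate=0.12,
    writeoff_rate=0.55, years=3,
)
```

Under these assumptions the gain is 135 and the loss is 1,365, so the cut-off is 135/1,500, or 9%. Other readings of the three-year horizon would give a different threshold, which is one reason the result is sensitive to the balance assumptions.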

    The assumptions about balance behavior and write-off amounts used in our study generate classifications that are quite different from those achieved by the bank in our sample. To explore the sensitivity of our results to alternative assumptions, we altered maintained balances so that a default probability of 5% or less was required to generate positive expected returns. In this case the ratio of correctly classified "good" accounts falls from 94% to 83%, the percentage of "bad" accounts awarded credit falls from 61.4% to 42.2% and the percentage of creditworthy "denys" falls from 69.2% to 51.1%. Interestingly, we continue to find that denied applicants appear to be as creditworthy as those who subsequently defaulted - suggesting that there may have been a number of profitable accounts that the bank failed to acquire over this period. This might occur if lenders refused to rely exclusively on quantitative credit scores from a discriminant function. Upon further investigation we learned that, for an unspecified length of time, the bank in our sample simply awarded credit to all individuals with income levels that satisfied a particular critical level. Also, after calculating credit scores, branch managers based ultimate decisions on subjective assessments. As a result we find that credit was awarded to numerous applicants with characteristics very similar to many of the "denys". Though our classification results differ substantially from those of the bank in our sample, it is impossible to verify that the bank could actually have profited from more liberal credit granting policies due to the speculative nature of the balance assumptions used in this paper.

    3 This calculation is based on a three-year loan with annual interest rates and balances given in the immediately preceding discussion. The write-off rate represents a one-time loss of 55% of the average default balance of $1,500.

    The effect of accounting for estimation error in the estimates of default probabilities can be measured by replacing $\hat{p}$ with $E_{\hat{p}}(\hat{p})$ and again applying the classification rule. Using the initial 9.0% cutoff, we now find that 92%, 57.6% and 65.4% of "goods", "bads" and "denys", respectively, would be granted credit using $E_{\hat{p}}(\hat{p})$ and our simulated granting criterion. In each case, as expected, less credit is granted when accounting for estimation error. Specifically, our simulation predicts that 134 more individuals (2.9% of the sample) would be denied credit after accounting for estimation error in estimates of the default probability. While these estimates are only suggestive, they do indicate that estimation error may play an important role in credit granting decisions.

    5. Summary and conclusions

    Most applied credit scoring models are designed to minimize misclassifications between "good" and "bad" loan accounts. Since the overall motive of a lender is profit maximization, this narrow view of the problem may be misleading. The goal of credit assessment should be to provide accurate estimates of each applicant's probability of default and the pay-offs that will be realized in the event of default or repayment. From estimates of these parameters, loan officers can define a loan granting criterion that maximizes expected earnings.

    A crucial parameter in the probability distribution of earnings is an individual applicant's probability of default. We apply the Manski-Lerman WESML technique to estimate default probabilities from a bivariate censored probit model. Also we recognize, following McFadden and Reid (1975), that failure to account for the non-linearity of default probability estimates leads to biased estimates of expected earnings. Results are based on a sample compiled by a large commercial bank. Classification rules are illustrated by combining the default probability estimates with hypothetical balance behavior.

    Data limitations placed binding constraints on the analysis conducted in this paper. With improved data on individual applicant balances we could build a complete behavioral credit scoring model. This model would contain estimates of maintained balances along with estimates of default probability. Second, with precise balance data we could begin to concentrate on the second moment of the earnings distribution. Certainly, variance in the earnings of a prospective loan is an important consideration in deciding how much credit a bank will allocate to a specific portion of its portfolio. An accurate measure of the variance of a prospective credit card loan will require calculation of the variance of the earnings process with known parameters plus the additional uncertainty attributable to error in estimation. Finally, the complete model of credit assessment would account for time-to-default by applying split population survival time models of borrower behavior. Improved estimates of time-to-default will undoubtedly increase our ability to measure expected earnings.

    Appendix

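    Asymptotic variance-covariance matrix for WESML estimates in a censored probit model obtained from a choice-based sample

    The weighted log likelihood function for the censored probit model is

    $$\ln L_w = \sum_{t=1}^{T} \Big\{ w_1\, y_{t1}\, y_{t2} \ln F(Z_t'\alpha_1, Z_t'\alpha_2; \rho) + w_2\, y_{t1}(1 - y_{t2}) \ln\big[\Phi(Z_t'\alpha_1) - F(Z_t'\alpha_1, Z_t'\alpha_2; \rho)\big] + w_3 (1 - y_{t1}) \ln\big[1 - \Phi(Z_t'\alpha_1)\big] \Big\},$$

    where

    $$w_1 = Q_1/H_1,$$

    with $Q_1$ = population proportion of $y_1 = 1$ and $y_2 = 1$, $H_1$ = sample proportion of $y_1 = 1$ and $y_2 = 1$,

    $$w_2 = Q_2/H_2,$$

    with $Q_2$ = population proportion of $y_1 = 1$ and $y_2 = 0$, $H_2$ = sample proportion of $y_1 = 1$ and $y_2 = 0$,

    $$w_3 = Q_3/H_3,$$

    with $Q_3$ = population proportion of $y_1 = 0$, $H_3$ = sample proportion of $y_1 = 0$.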

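    The numerical values for these weights are described in the empirical section. Estimates are obtained by maximizing this likelihood function with respect to $\theta = (\alpha_1', \alpha_2'; \rho)$. Following Manski and Lerman, the asymptotic variance-covariance matrix for the WESML estimates $\hat{\theta}$ is

    $$V(\hat{\theta}) = D^{-1} A D^{-1},$$

    where, with $\ln L_w$ denoting the weighted log likelihood,

    $$D = -E\left\{ \frac{\partial^2 \ln L_w}{\partial\theta\, \partial\theta'} \right\}$$

    and

    $$A = E\left\{ \left( \frac{\partial \ln L_w}{\partial\theta} \right) \left( \frac{\partial \ln L_w}{\partial\theta} \right)' \right\}.$$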

    References

    Altman, E.I., R.B. Avery, R.A. Eisenbeis and J.F. Sinkey, 1981, Application of classification techniques in business, banking and finance (JAI Press, Greenwich, CT).

    Boyes, W.J., D.L. Hoffman and S.A. Low, 1986, Lender reactions to information restrictions: The case of banks and the ECOA, Journal of Money, Credit and Banking 18, 211-219.

    Farber, H.S., 1983, Worker preference for union representation, Research in Labor Economics 2, 171-205.

    Heckman, J.J., 1979, Sample selection bias as a specification error, Econometrica 47, 153-162.

    Manski, C.F. and S.R. Lerman, 1977, The estimation of choice probabilities from choice-based samples, Econometrica 45, 1977-1988.

    McFadden, D. and F. Reid, 1975, Aggregate travel demand forecasting from disaggregated behavior models, Record no. 534 (Transportation Research Board, Washington, DC).

    Meng, C.L. and P. Schmidt, 1985, On the cost of partial observability in the bivariate probit model, International Economic Review 26, 71-85.

    Poirier, D.J., 1980, Partial observability in bivariate probit models, Journal of Econometrics 12, 210-217.