
Bankruptcy Prediction with Soft Computing

Prof. V. Ravi

Head, Center of Excellence in CRM & Analytics

IDRBT, Hyderabad

[email protected]


Outline

• About IDRBT & Center of Excellence

• Introduction to Soft Computing

• Introduction to Bankruptcy Prediction

• Introduction to Analytics

• Differential evolution trained wavelet neural networks

• Differential evolution trained kernel principal component WNN and kernel binary quantile regression

• Conclusions


About IDRBT

1/30/2015 3

• Set up by RBI in 1996

• Autonomous R&D Institute

• Research, Teaching (M.Tech (IT), Ph.D.), Training and Consultancy


Broad Research Areas

• Financial Networks and Applications: Networks, Grid, Cloud, Virtualization, Social Media, Wireless Networks, Internet, Web Technologies, INFINET etc.

• Electronic Payment and Settlement Systems: E-Payments, Internet Banking, Mobile Payments, ATM, PoS, Cash Dispensers, Smart Cards, CTS, MICR, RTGS, NEFT, IMPS etc.

• Security Technologies for the Financial Sector: Security Models, Biometrics, Access Control, Information Security, Digital Forensics, Cryptology, Steganography, Image Watermarking, Cyber Frauds and Crimes, Ethical Hacking, Controls and Standards etc.

• Financial Information Systems and Business Intelligence: Data Warehousing, Data Mining, CRM, Big Data, Soft Computing, Financial Engineering, Risk Management, Software Engineering, Optimization, IT Management, e-Governance etc.


Research Contributions of CoE to Computer Science

• Soft Computing Hybrids Developed

– Support Vector-Wavelet Neural Network (SV-WNN)

– Nonlinear PCA-Threshold Accepting based Logit (NLPCA-TALR)

– Differential Evolution trained RBF network (DERBF)

– *Differential Evolution Threshold Accepting hybrid optimization algorithm (DETA)

– *Differential Evolution trained WNN (DEWNN)

– WNN-Fuzzy Rule based Classifier (WNN-FRBC)

– *Data Envelopment Analysis-Fuzzy Multi Attribute Decision Making (DEA-FMADM)

– Boosting involving CART, SVM and MLP

– Threshold Accepting trained WNN (TAWNN)


Research Contributions of CoE to Computer Science

– *Threshold Accepting trained Principal Component NN (TAPCNN)

– Ensembling MLP, RBF, MARS, SVM and CART

– *PCA-PNN and ensembling several techniques

– Modified Great Deluge Algorithm trained Auto Associative Network (MGDAAANN)

– Threshold Accepting based Fuzzy Clustering (TAFC)

– Improved Differential Evolution (DE-NM-Simplex)

– SVM-Naïve Bayes Tree (SVM-NBTree)

– Recurrent Genetic Programming (RGP)

– TA trained Kernel Principal Component NN (TA-KPCNN)

– DERBF-Genetic Algorithm Tree (DERBF-GATree)

– Ant Colony Optimization-Nelder-Mead Simplex (ACONM)

– SVM-FRBC


Research Contributions of CoE to Computer Science

• Forecasting Software Development Cost

• Forecasting Software Reliability

• Fast digital watermark retrieval using Hopfield Neural Network

• Watermark retrieval using Evolutionary Algorithms

• DE-KPCWNN & DE-KBQR

• Recurrent GP and recurrent GMDH

• PSOAANN for a variety of tasks

• Firefly Miner

• Novel Time Series Mining algorithms


Research Contributions of CoE to Banking

• Bankruptcy Prediction in Banks

• Analytical CRM in Banking

• Customer Churn Prediction in Credit Cards/CASA

• Credit Scoring

• Default Prediction

• Fraud Detection in Insurance

• Customer Lifetime Value Modeling

• Data Quality/Imputation in Customer datasets

• Privacy Preserving Data Mining

• Data Mining Unbalanced data sets

• Fraud Detection in Accounting Statements

• Forex Rate prediction

• Cash Demand Forecasting in ATMs

• Association Rule Mining in Banks using PSO

• Profiling of Internet/Mobile banking users in India

• Ranking Indian PSU banks’ Productivity

• Predicting Operational Risk from Software perspective

• Fuzzy Optimization of Asset Liability Management (ALM), 2007


International Collaborations

• Prof. Kalyanmoy Deb, Dept of Elec & Comp Engg, MSU, USA

• Prof. Dirk Van den Poel, Dept of Marketing, University of Ghent, Ghent, Belgium

• Prof. Anita Prinzie, Dept of Marketing, Univ. of Ghent, Belgium

• Prof. Venu Govindaraju, SUNY Buffalo, USA

• Prof. Ajith Abraham, Director, Mir Labs, USA

• Prof. Indranil Bose, Business School, University of Hong Kong (now with IIM Calcutta)

• Prof. D. Nagesh Kumar, IISc, Bangalore

• Prof. Nik Kasabov, Director KEDRI, Auckland Univ. of Technology, New Zealand.


Research Output

• Published – 125 papers

• Better content for CRM EDP programmes

– Customized Training to 14 banks

– POC conducted on Analytical CRM for 14 banks

– Framework on “Holistic CRM and Analytics” released

• Proof-of-concept software

– Data imputation

• Papers under review

– 6 in International Journals


Research Supervision

• Ph.D.

– Graduated

• Mr. Mohammad Abdul Haque Farquad (UoH)

– Rule extraction from Support Vector Machine: Applications to Banking and Finance

• Mr. R. Mohanty (Berhampur Univ, Orissa)

– Application of Machine Learning and Soft Computing to Software Engineering

• Mr. N. Naveen (UoH)

– Rule extraction from Neural Nets & Optimization Techniques

– Ongoing

• Mr. D. Pradip Kumar (Time Series Data Mining)

• Mr. B. Shravan Kumar (Unstructured Data Mining)

• Mr. K. Ravi (Social Media Analytics and Big Data with Ontologies)

• Mr. G. Jayakrishna (Evolutionary Computing and Data Mining)

• Mr. S. K. Kamruddin (Big Data and Applications)


Research Supervision

• M. Tech (IT) Projects in Soft Computing/Data Mining

– More than 35 Projects

– One student won Best Project Award at UoH in 2007.

– One student won IDRBT Award in 2011; 2 in 2012

– One student did a Ph.D. at NJIT and one at SUNY Stony Brook, USA.

– One won Best Paper Award at ICCIC 2013

• Integrated B. Tech, M. Tech/M. Sc from IITs

– 20 students for summer projects related to Soft Computing

– One won Best Paper Award at MIWAI 2014


Soft Computing/CI Constituents

It comprises intelligent technologies

– Fuzzy Computing

– Neuro Computing

– Evolutionary Computing

– Rough Set theory

– Chaos theory

– Machine Learning

– Probabilistic reasoning (Bayesian Belief Nets)

• SC solutions hybridize two or more of these technologies in various permutations and combinations


Definition of Soft Computing

• Soft computing differs from conventional (hard) computing in that it is tolerant of imprecision, uncertainty, partial truth and approximation. In effect, the role model for soft computing is the human mind.

- Lotfi Zadeh (1992)


Benefits of Soft Computing

An ultimate goal shared by AI and SC

– the creation and understanding of machine intelligence

Soft Computing (or Computational intelligence)

– For learning and adaptation, SC requires extensive computation but does not perform much symbolic manipulation. So it is also called Computational Intelligence, a discipline that complements classical AI approaches.


Benefits of Soft Computing

In SC paradigm, one can simultaneously

– incorporate and process human knowledge effectively

– deal with imprecision and uncertainty

– learn to adapt to unknown or changing environments for better performance

– amplify the advantages of the component technologies while nullifying their disadvantages


Guiding principle of Soft Computing

• Exploit

– the tolerance for imprecision, uncertainty, partial truth, and approximation

• to achieve

– tractable, robust and low cost solution.


Data Mining

• The non-trivial process of extracting USEFUL, NON-OBVIOUS AND ACTIONABLE knowledge from huge masses of data.

• A consortium of techniques

– Computer Science

– Statistics

– Operations Research

– Data bases and

– Artificial Intelligence

Introduction to DM- Dr. V. Ravi 20/49


Types of Analytics

• Descriptive Analytics (Statistics, OLAP)

– Knowing what happened

• What is the spend pattern of our Credit cards?

• Predictive Analytics (Data, Text & Web Mining)

– Knowing what is going to happen

• Who is going to churn/attrite in 2 months?

• Prescriptive Analytics (Optimization Techniques)

– Prescribing some insights using Predictive Analytics

• Finding Optimal amount of cash replenishment in ATMs


When everyone runs behind the ball, I go to the place where the ball is going to be: Pele


Tasks in Data Mining

• Association Rules

– Finding which items frequently go together in a data set

• E.g.: Diapers, beer purchased together.

• Classification

– Categorizing observations into predetermined classes

• Classifying a customer into good or bad
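As a rough illustration of the association-rule task, the sketch below computes the two usual rule statistics, support and confidence, for a "diapers → beer" rule. The baskets are invented toy data and the helper names `support` and `confidence` are hypothetical, not taken from the talk.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    items = set(itemset)
    return sum(items <= set(t) for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Toy market baskets, invented purely for illustration.
baskets = [
    {"diapers", "beer", "milk"},
    {"diapers", "beer"},
    {"diapers", "bread"},
    {"milk", "bread"},
]

rule_support = support(baskets, {"diapers", "beer"})
rule_confidence = confidence(baskets, {"diapers"}, {"beer"})
```

Rules are typically kept only when both statistics exceed user-chosen thresholds.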


Tasks in Data Mining

• Clustering

– Grouping similar observations into same cluster

• Segmenting customers into different groups

• Forecasting/Regression

– Predicting the numerical dependent variable with the help of several explanatory variables

• Forecasting stock price of a company

• Outlier Detection

– Fraud detection


Bankruptcy Predn-V. Ravi

Overview of Bankruptcy

Bankruptcy of banks and financial firms has been a well-researched area since the 1960s.

This is considered a form of Operational Risk; some consider it a fallout of Credit Risk.

Creditors, auditors, stockholders and senior management are all interested in knowing about bankruptcy, as it affects all of them.


Terminology

• The terms failure, insolvency and bankruptcy are used interchangeably. Altman (1983) distinguishes the terms as follows.

• In an economic sense, failure means that the realized rate of return on invested capital, with allowances for risk considerations, is significantly and continually lower than prevailing rates on similar investments. Thus, a company may be an economic failure for many years.

• Insolvency exists when a firm cannot meet its current obligations.

• Bankruptcy occurs when a company files a formal legal document in a federal district court for the purpose of either liquidation or reorganization.


Overview of Bankruptcy

The numerous factors that could lead to companies and banks going bankrupt can be broadly classified into three main categories:

Economic factors: GDP slowing, inflation, interest rate, unemployment rate, recession, depression etc.

Industry factors: Competition, growth/decline rate, profit margin trends, government regulations, trade barriers, import tariffs and quotas, taxes etc.


Overview of Bankruptcy

Company factors: Management quality, capital allocation, competitive advantage, operational efficiency, working capital management, inventory management etc.

During the period 1974-1994, one third of all commercial banks in the U.S. collapsed due to failures and mergers.


Overview of Bankruptcy

According to the US Federal Deposit Insurance Corporation Improvement Act of 1991, on-site health examination of banks by regulators every 12-18 months was made mandatory.

Each bank was given a CAMELS rating to quantify its financial health

– Capital adequacy

– Asset quality

– Management expertise

– Earning strength

– Liquidity

– Sensitivity to Market Risk


Problems with CAMEL ratings

Cole and Gunther (1995) found that the effectiveness of the CAMEL rating of troubled banks began to decay in as little as six months.

The on-site financial health monitoring period (12-18 months) is too long to anticipate impending financial problems of banks.

Financial experts are scarce and expensive resources.


Overview of Bankruptcy

Hence, off-line methods are preferred. More effective methods are required to enable regular monitoring of a bank’s financial health and advance detection of impending financial troubles.

More efficient and cost-effective methods using computerized systems are required.


Differential evolution trained wavelet neural networks: Application to bankruptcy prediction in banks

Nikunj Chauhan, V. Ravi *, D. Karthik Chandra

Expert Systems with Applications (2009)

Doi:10.1016/j.eswa.2008.09.019


Introduction

• Differential evolution algorithm (DE) is proposed to train a wavelet neural network (WNN).

• The resulting network is named differential evolution trained wavelet neural network (DEWNN).

• The efficacy of DEWNN is tested on bankruptcy prediction datasets.

• The whole experimentation is conducted using the 10-fold cross validation method.

• Results show that the soft computing hybrids, viz., DEWNN and TAWNN, outperformed the original WNN in terms of accuracy and sensitivity across all problems. Furthermore, DEWNN outscored TAWNN in terms of accuracy and sensitivity across all problems except the Turkish banks dataset.
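The evaluation protocol mentioned above (10-fold cross validation, scored by accuracy and sensitivity) can be sketched as follows. This is a minimal illustration, not the authors' code: the helper names are hypothetical, and it is an assumption here that class 1 marks the bankrupt (positive) class.

```python
import numpy as np


def kfold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross validation."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), k)
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, test_idx


def accuracy_sensitivity(y_true, y_pred):
    """Accuracy and sensitivity (true-positive rate), with class 1 as positive.

    Assumes y_true contains at least one positive sample.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    accuracy = np.mean(y_true == y_pred)
    sensitivity = tp / (tp + fn)
    return accuracy, sensitivity


splits = list(kfold_indices(40, k=10))
acc, sens = accuracy_sensitivity([1, 1, 0, 0], [1, 0, 0, 0])
```

In a full experiment each classifier would be trained on `train_idx`, scored on `test_idx`, and the two metrics averaged over the ten folds.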


Wavelet neural networks

• Wavelet neural networks (WNNs), a class of neural networks based on locally supported basis functions (as in Radial Basis Function Networks, RBFNs), originate from wavelet decomposition in signal processing and have recently become more popular.

• A family of wavelets can be constructed from a function w(x), sometimes known as a “mother wavelet”, which is confined to a finite interval. “Daughter wavelets” u(a,b)(x) are then formed by using translation (b) and dilation (a) parameters. An individual wavelet is:
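The formula for an individual wavelet did not survive in this transcript. In the standard textbook form, writing the mother wavelet as ψ (the slide's w) and the daughter wavelet as ψ_{a,b} (the slide's u(a,b)), it would read as below; the normalization factor is the conventional one and is an assumption about what the slide showed.

```latex
\psi_{a,b}(x) = \frac{1}{\sqrt{|a|}}\,\psi\!\left(\frac{x-b}{a}\right), \qquad a \neq 0
```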


Wavelet Neural Network (WNN)

• Has one input layer, one hidden layer and one output layer.

• All nodes in each layer are fully connected to the nodes in the next layer.

• The output layer contains a single node.

• Based on the activation functions (either Gaussian or Morlet) used in hidden nodes, two variants of WNN are implemented.


WNN Algorithm

1. Select the number of hidden nodes required.

2. Initialize randomly from Uniform(0,1)

i. the dilation and translation parameters for these nodes

ii. the weights for the connections between the input and hidden layers, and for the connections between the hidden and output layers.

3. The output of the sample Vk, k = 1, . . ., np, where np is the number of samples, is calculated with the following formula:
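The output formula itself is missing from the transcript. A common form in the WNN literature, assumed here, is V_k = Σ_j W_j f((Σ_i w_ij x_ki − b_j) / a_j), with f either a Gaussian or a Morlet wavelet. The sketch below implements that assumed form; the function names and the exact wavelet expressions are illustrative choices, not confirmed by the slides.

```python
import numpy as np


def gaussian_wavelet(t):
    # Assumed Gaussian activation: f(t) = exp(-t^2)
    return np.exp(-t ** 2)


def morlet_wavelet(t):
    # Assumed Morlet activation: f(t) = cos(1.75 t) * exp(-t^2 / 2)
    return np.cos(1.75 * t) * np.exp(-t ** 2 / 2.0)


def wnn_forward(X, w_in_hid, a, b, w_hid_out, wavelet=gaussian_wavelet):
    """Forward pass of a single-output WNN.

    X: (np, nin) inputs, w_in_hid: (nin, nhn) input-to-hidden weights,
    a, b: (nhn,) dilation and translation, w_hid_out: (nhn,) output weights.
    Returns the (np,) vector of network outputs.
    """
    z = (X @ w_in_hid - b) / a
    return wavelet(z) @ w_hid_out


rng = np.random.default_rng(0)
X = rng.uniform(size=(5, 3))
w_in_hid = rng.uniform(size=(3, 4))
a = rng.uniform(0.1, 1.0, size=4)  # kept away from 0 to avoid division blow-up
b = rng.uniform(size=4)
w_hid_out = rng.uniform(size=4)
V = wnn_forward(X, w_in_hid, a, b, w_hid_out)
```

Passing `wavelet=morlet_wavelet` gives the second variant of the network.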


4. Reduce the error of prediction by adjusting Wj, wij, aj, bj using ΔWj, Δwij, Δaj, Δbj (see below). In the WNN, the gradient descent algorithm is employed.

where the error function E is taken as the normalized root mean squared error (NRMSE) as follows:

5. Return to step 3; the process is continued until convergence, at which point the training of the WNN is complete.
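The NRMSE expression is not recoverable from the transcript. One common definition normalizes the summed squared error by the summed squared targets; the sketch below uses that definition as an assumption, and the original slide may have used a different normalizer.

```python
import numpy as np


def nrmse(target, predicted):
    """Assumed form: sqrt( sum (V_k - Vhat_k)^2 / sum V_k^2 )."""
    target = np.asarray(target, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return np.sqrt(np.sum((target - predicted) ** 2) / np.sum(target ** 2))


perfect = nrmse([1, 0, 1], [1, 0, 1])
worst = nrmse([1, 1], [0, 0])
```

Any trainer for the network, gradient descent or a metaheuristic, only needs this scalar as its objective.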


Metaheuristics used to train WNN

• Threshold accepting trained WNN (TAWNN):

– The Threshold Accepting algorithm, originally proposed by Dueck and Scheuer (1990), is a faster variant of the original simulated annealing algorithm wherein the acceptance of a new move or solution is determined by a deterministic criterion rather than a probabilistic one.

– We used TA to determine the weights of WNN
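A minimal sketch of Threshold Accepting on a toy objective shows the deterministic acceptance rule: a candidate is accepted whenever it is not worse than the current solution by more than the current threshold. The step size, threshold schedule and iteration counts are illustrative assumptions, and in TAWNN the objective would be the network's training error rather than the sphere function used here.

```python
import random


def threshold_accepting(objective, x0, step=0.1, threshold=0.1, decay=0.95,
                        iters_per_level=50, levels=60, seed=0):
    rng = random.Random(seed)
    current = list(x0)
    f_cur = objective(current)
    best, f_best = list(current), f_cur
    for _ in range(levels):
        for _ in range(iters_per_level):
            # Propose a neighbour by perturbing every coordinate slightly.
            cand = [xi + rng.uniform(-step, step) for xi in current]
            f_cand = objective(cand)
            # Deterministic rule: accept unless worse by more than the threshold.
            if f_cand - f_cur < threshold:
                current, f_cur = cand, f_cand
                if f_cur < f_best:
                    best, f_best = list(current), f_cur
        threshold *= decay  # tighten the threshold level by level
    return best, f_best


def sphere(v):
    return sum(x * x for x in v)


best, f_best = threshold_accepting(sphere, [2.0, -1.5])
```

Unlike simulated annealing, no random number is drawn to decide acceptance, which is where the speed advantage comes from.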


Differential evolution

• Differential Evolution (DE) is a novel approach in evolutionary algorithms. The DE algorithm consists mainly of four steps: initialization, mutation, recombination and selection.

• In a population of solutions within an n-dimensional search space, a fixed number of vectors are randomly initialized, then evolved over time to explore the search space and to locate the minima of the objective function. Here the objective is to minimize the error value.
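The four steps can be sketched as follows, using the common DE/rand/1/bin scheme. The slides do not say which DE variant or which settings were used, so the scheme, the values of `F` and `CR`, and the toy sphere objective are all assumptions.

```python
import numpy as np


def differential_evolution(objective, bounds, pop_size=20, F=0.8, CR=0.9,
                           generations=100, seed=0):
    rng = np.random.default_rng(seed)
    low, high = np.asarray(bounds, dtype=float).T
    dim = low.size
    # 1. Initialization: random vectors inside the bounds.
    pop = rng.uniform(low, high, size=(pop_size, dim))
    cost = np.array([objective(p) for p in pop])
    for _ in range(generations):
        for i in range(pop_size):
            others = [j for j in range(pop_size) if j != i]
            r1, r2, r3 = rng.choice(others, size=3, replace=False)
            # 2. Mutation: base vector plus scaled difference of two others.
            mutant = np.clip(pop[r1] + F * (pop[r2] - pop[r3]), low, high)
            # 3. Recombination: binomial crossover, at least one mutant gene.
            cross = rng.random(dim) < CR
            cross[rng.integers(dim)] = True
            trial = np.where(cross, mutant, pop[i])
            # 4. Selection: keep the trial only if it is no worse.
            trial_cost = objective(trial)
            if trial_cost <= cost[i]:
                pop[i], cost[i] = trial, trial_cost
    best = int(np.argmin(cost))
    return pop[best], cost[best]


def sphere(v):
    return float(np.sum(v ** 2))


x_best, f_best = differential_evolution(sphere, [(-5, 5)] * 3)
```

For DEWNN the objective would instead evaluate the network error for the parameter vector encoded in each population member.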


Differential evolution based WNN (DEWNN)

• Application of DE in training WNN basically modifies steps 3 and 4 of the WNN training algorithm described earlier.

• The output depends on the weights W and w, the dilation parameters D, the translation parameters T and the input values X, i.e., Y = f(X, R), where Y is the vector of output values and R = (D, T, W, w).

• Vector R consists of

(i) Weight values from input nodes to hidden nodes W = {Wij, i = 1, 2, . . ., nin, j = 1, 2, . . ., nhn}, where nin = number of input nodes and nhn = number of hidden nodes

(ii) Weight values from hidden nodes to output nodes w = {wjk, j = 1, 2, . . ., nhn, k = 1, 2, . . ., non}, where non = number of output nodes

(iii) Dilation parameters D = (d1, d2, . . ., dnhn)

(iv) Translation parameters T = (t1, t2, . . ., tnhn)
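Since DE evolves flat vectors, each candidate R has to be split back into its four parts before the network can be evaluated. The sketch below assumes the ordering R = (D, T, W, w) stated above and a row-major layout for the weight matrices; the layout and the helper name `unpack_parameters` are hypothetical.

```python
import numpy as np


def unpack_parameters(R, nin, nhn, non=1):
    """Split a flat vector R = (D, T, W, w) into its named parts."""
    R = np.asarray(R, dtype=float)
    sizes = [nhn, nhn, nin * nhn, nhn * non]
    if R.size != sum(sizes):
        raise ValueError("R has the wrong length for this network shape")
    D, T, W_flat, w_flat = np.split(R, np.cumsum(sizes)[:-1])
    return D, T, W_flat.reshape(nin, nhn), w_flat.reshape(nhn, non)


# A network with 3 inputs, 2 hidden nodes and 1 output has 2 + 2 + 6 + 2 parameters.
D, T, W, w = unpack_parameters(np.arange(12), nin=3, nhn=2)
```

The length of R, and hence the dimension of the DE search space, is 2·nhn + nin·nhn + nhn·non.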


Flow Chart of Differential evolution


Feature selection

Feature selection is a process by which samples in the measurement space are described by a finite and usually smaller set of features.


• The absolute value of each hidden-to-output weight wjo is incorporated into the input-to-hidden weights Wij using the following expression:

• For each hidden node j, the sum of the adjusted weights over all input nodes is equal to the hidden-to-output node weight wjo.

• For each input node, the adjusted weights Wij are summed over all hidden nodes and converted to a percentage of the total for all input nodes.
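The expression is missing from the transcript, but the three bullets are consistent with a Garson-style weight partitioning in which each adjusted weight is |Wij| / Σi |Wij| × |wjo|. The sketch below implements that reading as an assumption; the function name and toy weights are illustrative.

```python
import numpy as np


def garson_importance(w_in_hid, w_hid_out):
    """Relative importance (%) of each input from a single-output network.

    w_in_hid: (nin, nhn) input-to-hidden weights.
    w_hid_out: (nhn,) hidden-to-output weights.
    """
    W = np.abs(np.asarray(w_in_hid, dtype=float))
    w = np.abs(np.asarray(w_hid_out, dtype=float))
    # Adjusted weights: each hidden node's column sums to |w_jo|.
    adjusted = W / W.sum(axis=0, keepdims=True) * w
    # Sum over hidden nodes, then express as a percentage of the grand total.
    totals = adjusted.sum(axis=1)
    return 100.0 * totals / totals.sum()


importance = garson_importance([[0.5, -1.0], [1.5, 1.0]], [2.0, -1.0])
```

Features would then be ranked by this percentage and the top-ranked ones retained.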


Literature review

• The prediction of bankruptcy for financial firms, especially banks, has been extensively researched since the late 1960s, beginning with Altman (1968).

• Bankruptcy prediction research has attracted both statisticians and computer scientists, with the result that a number of statistical techniques and more sophisticated non-parametric methods like neural networks have been applied to this problem.

• The bankruptcy prediction problem can also be solved using various other types of classifiers, such as case-based reasoning, rough sets, support vector machines, neural networks, discriminant analysis and data envelopment analysis [23], to mention a few.


Bankruptcy prediction

• To solve bankruptcy prediction problems, Ravi and Pramodh (2008) proposed a threshold accepting based training algorithm for a novel principal component neural network (PCNN), without a formal hidden layer.

• They employed PCNN for bankruptcy prediction problems and reported that PCNN outperformed BPNN, TANN, PCA-BPNN and PCA-TANN in terms of the area under the receiver operating characteristic curve (AUC) criterion.
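For reference, the AUC criterion can be computed directly from its rank interpretation: the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counted as one half. The sketch below uses invented scores and a hypothetical helper name.

```python
import numpy as np


def auc_score(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    diff = pos[:, None] - neg[None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / (pos.size * neg.size)


# Toy labels and classifier scores, invented for illustration.
auc = auc_score([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes.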


Results and discussion

The datasets analyzed by us comprise three bank datasets (Turkish banks, Spanish banks and US banks) and three other benchmark datasets, viz., Iris, Wine and Wisconsin breast cancer.


Financial ratios of the datasets and the selected features


Average results for 10-fold cross validation with all features

Table-2


Average results of 10FCV for Benchmark datasets with all features

Table-3


Average results for 10FCV with reduced features

Table- 4

The results indicate the clear superiority of DEWNN in accuracy and sensitivity as compared to TAWNN and the original WNN. The results for the other benchmark datasets, i.e., Wine and Wisconsin breast cancer, with reduced features are given in Table-5.


DEWNN once again outperformed the other algorithms. In this case also the robustness of the algorithm is demonstrated, and the high accuracies reflect the effective feature selection achieved by incorporating Garson’s algorithm into DEWNN.

Average results for 10FCV for benchmark datasets with reduced features

Table- 5


• It is concluded that besides being robust, DEWNN is an effective algorithm for solving classification problems occurring in finance.

Comparison of features selected by different techniques

Table- 6


Conclusions

• DEWNN and TAWNN are developed and compared with the original WNN on benchmark datasets, and the results indicate that DEWNN can be a very effective soft computing tool for classification problems.

• In addition, we adapted Garson’s feature selection algorithm to WNN, DEWNN and TAWNN, and again observed the superior performance of DEWNN as compared to TAWNN and the original WNN.

• It is concluded that training WNN with DE solves classification problems with higher accuracy.


References

• Altman, E. (1968). Financial ratios, discriminant analysis and the prediction of corporate bankruptcy. Journal of Finance, 23, 589–609.

• Avci, E. (2007). An expert system based on wavelet neural network-adaptive norm entropy for scale invariant texture classification. Expert Systems with Applications, 32, 919–926.

• Becerra, V. M., Galvao, R. K. H., & Abou-Seada, M. (2005). Neural and wavelet network models for financial distress classification. Data Mining and Knowledge Discovery, 11, 35–55.

• Bhat, T. R., Venkataramani, D., Ravi, V., & Murty, C. V. S. (2006). Improved differential evolution method for efficient parameter estimation in biofilter modeling. Biochemical Engineering Journal, 28, 167–176.

• Canbas, S., Cabuk, A., & Kilic, S. B. (2005). Prediction of commercial bank failure via multivariate statistical analysis of financial structures: The Turkish case. European Journal of Operational Research, 166, 528–546.

• Cheng, C. B., Chen, C. L., & Fu, C. J. (2006). Financial distress prediction by a radial basis function network with logit analysis learning. Computers and Mathematics with Applications, 51, 579–588.

• Cielen, A., Peeters, L., & Vanhoof, K. (2004). Bankruptcy prediction using a data envelopment analysis. European Journal of Operational Research, 154, 526–532.

• Cole, R., & Gunther, J. (1995). A CAMEL rating’s shelf life. Federal Reserve Bank of Dallas Review, December, 13–20.

• Deb, K. (2000). An efficient constraint handling method for genetic algorithms. Computer Methods in Applied Mechanics and Engineering, 186, 311–338.

• Dimoulas, C., Kalliris, G., Papanikolaou, G., Petridis, V., & Kalampakas, A. (2008). Bowel-sound pattern analysis using wavelets and neural networks with application to long-term, unsupervised, gastrointestinal motility monitoring. Expert Systems with Applications, 34, 26–41.

• Dong, L., Xiao, D., Liang, Y., & Liu, Y. (2008). Rough set and fuzzy wavelet neural network integrated with least square weighted fusion algorithm based fault diagnosis research for power transformers. Electric Power Systems Research, 78, 129–136.

• Dueck, G., & Scheuer, T. (1990). Threshold accepting: A general purpose optimization algorithm appearing superior to simulated annealing. Journal of Computational Physics, 90, 161–175.

• Garson, D. G. (1991). Interpreting neural-network connection weights. AI Expert, April, 47–51.

• Grossmann, A., & Morlet, J. (1984). Decomposition of Hardy functions into square integrable wavelets of constant shape. SIAM Journal on Mathematical Analysis, 15, 725–736.

• Guyon, I., & Elisseeff, A. (2003). An introduction to variable and feature selection. Journal of Machine Learning Research, 3, 1157–1182.

• Ilonen, J., Kamarainen, J.-K., & Lampinen, J. (2003). Differential evolution training algorithm for feed-forward neural networks. Neural Processing Letters, 17(1), 93–105.


Differential evolution trained kernel principal component WNN and kernel binary quantile regression: Application to banking

Kalam Narendar Reddy, Vadlamani Ravi

Knowledge-Based Systems, 2012, Elsevier

DOI: 10.1016/j.knosys.2012.10.003


Introduction

We discuss two new classification techniques, viz.:

(i) DE-trained Kernel Principal Component Wavelet Neural Network (KPCWNN = KPCA + DEWNN).

(ii) DE-trained Kernel Binary Quantile Regression (KBQR).

KPCWNN employs KPCA and a Differential Evolution trained Wavelet Neural Network (DEWNN) in tandem: KPCA is used to find the kernel principal components, which are fed as input to a WNN trained using Differential Evolution (DE).


Binary quantile regression

• Kordas (2008) introduced Binary Quantile Regression (BQR) for a binary response variable. It extends the maximum score estimation method introduced by Manski (1975, 1985). The literature abounds with applications of quantile regression (QR) in both economics and finance.

Here Quantθ(yi | xi) denotes the θth conditional quantile of yi given the regressor vector xi; βθ is the unknown vector of parameters to be estimated for different values of θ in (0, 1); uθi is the error term, which follows a continuously differentiable cumulative distribution function Fuθ(· | x) with density function fuθ(· | x). Fi(· | x) denotes the conditional distribution of y given x.
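The slide's own equation is not preserved in this transcript, so the following is only a minimal sketch of the general quantile-regression idea, not the authors' estimator. It implements the standard check (pinball) loss, whose minimizer over a constant prediction is an empirical θ-quantile. The function names `check_loss`, `total_check_loss` and `best_constant_quantile` are hypothetical and chosen for illustration.

```python
def check_loss(residual, theta):
    """Standard check loss: rho_theta(u) = u * (theta - 1[u < 0])."""
    return residual * (theta - (1.0 if residual < 0 else 0.0))


def total_check_loss(y, prediction, theta):
    """Sum of check losses when every sample gets the same constant prediction."""
    return sum(check_loss(yi - prediction, theta) for yi in y)


def best_constant_quantile(y, theta):
    """Return the sample value that minimizes the total check loss."""
    return min(sorted(y), key=lambda c: total_check_loss(y, c, theta))
```

With θ = 0.5 the loss is symmetric and the minimizer is the sample median; other values of θ penalize over- and under-prediction asymmetrically, which is what lets different quantiles be targeted.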


Proposed techniques

Architecture and Algorithm of DE-KPCWNN.


Algorithm for DE-KPCWNN Architecture

• A dataset with n samples and k features is represented by a data matrix X, where element Xij is the value of the jth feature for the ith sample. Let xi, i = 1, 2, ..., np be the training data, n the total number of samples, and y the vector of the corresponding output class labels.

The steps involved in the DE-KPCWNN algorithm are given below:

Step 1: Applying kernel PCA
(i) For the training data, compute the kernel matrix K.
(ii) For the testing data, compute the kernel matrix Kte.
(iii) Centralize K and Kte.
(iv) Combine the K and Kte matrices to form the total centralized kernel matrix Ktot of order n × np, where n = np + nt.
(v) Perform principal component analysis on Ktot. The principal components are computed using the matrix equation P = Ktot E.
(vi) The ratio of each eigenvalue to the sum of all eigenvalues indicates the proportion of variation explained by the corresponding principal component.
(vii) Select the first nin principal components according to the variance we want to explain, and form a new matrix of chosen PCs, [Y] of order n × nin, by ignoring the last (np − nin) columns of the P matrix.
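Parts of Step 1 can be sketched in a few lines. The snippet below is an illustration under stated assumptions, not the authors' Java implementation: it builds a Gaussian kernel matrix (one of several possible kernels), centres it in feature space, and picks the number of components from a list of eigenvalues by cumulative explained variance. The eigen-decomposition itself is left to a linear-algebra library and is not shown. The names `gaussian_kernel_matrix`, `center_kernel` and `components_for_variance`, and the parameter `sigma`, are hypothetical.

```python
import math


def gaussian_kernel_matrix(rows, sigma=1.0):
    """K[i][j] = exp(-||x_i - x_j||^2 / (2 * sigma^2)) for the given sample rows."""
    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return [[math.exp(-sq_dist(a, b) / (2.0 * sigma ** 2)) for b in rows] for a in rows]


def center_kernel(K):
    """Centre a square kernel matrix: subtract row and column means, add the grand mean."""
    n = len(K)
    row_mean = [sum(row) / n for row in K]
    col_mean = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    grand_mean = sum(row_mean) / n
    return [[K[i][j] - row_mean[i] - col_mean[j] + grand_mean for j in range(n)]
            for i in range(n)]


def components_for_variance(eigenvalues, required=0.95):
    """Smallest number of leading components whose eigenvalue share reaches `required`."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    cumulative = 0.0
    for count, value in enumerate(ordered, start=1):
        cumulative += value / total
        if cumulative >= required:
            return count
    return len(ordered)
```

After centring, every row and column of the kernel matrix sums to zero, which is a quick sanity check before the principal components are extracted.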

Step 2: Training WNN using differential evolution
(1) Select the number of hidden nodes (nhn) required. Initialize the dilation and translation parameters for these nodes, the weights for the connections between the input and hidden layers, and the weights for the connections between the hidden and output layers, using random numbers generated from the U(0,1) distribution.

(2) The output Vk for the kth sample, where k = 1, 2, 3, ..., n and n is the number of samples, is computed with the following formula:
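The output formula is not preserved in this transcript. As a hedged sketch only, the code below assumes the commonly used wavelet-network form V = Σj Wj · f((Σi wij · xi − bj) / aj) with a Morlet mother wavelet f; both the form and the wavelet are assumptions rather than details confirmed by the slide. The names `morlet`, `init_wnn` and `wnn_output` are hypothetical.

```python
import math
import random


def morlet(t):
    """Morlet mother wavelet (an assumed choice of hidden-node activation)."""
    return math.cos(1.75 * t) * math.exp(-t * t / 2.0)


def init_wnn(n_inputs, n_hidden, rng):
    """Draw all parameters from U(0, 1), as described in step (1)."""
    return {
        "w_in": [[rng.random() for _ in range(n_inputs)] for _ in range(n_hidden)],
        "dilation": [rng.random() for _ in range(n_hidden)],
        "translation": [rng.random() for _ in range(n_hidden)],
        "w_out": [rng.random() for _ in range(n_hidden)],
    }


def wnn_output(params, x, wavelet=morlet):
    """Assumed form: V = sum_j W_j * f((sum_i w_ij * x_i - b_j) / a_j)."""
    total = 0.0
    for w_row, a, b, w_out in zip(params["w_in"], params["dilation"],
                                  params["translation"], params["w_out"]):
        z = sum(w * xi for w, xi in zip(w_row, x))
        # Guard against a (near-)zero dilation, which would divide by zero.
        safe_a = a if abs(a) > 1e-12 else 1e-12
        total += w_out * wavelet((z - b) / safe_a)
    return total
```

In a DE-trained network, all four parameter groups would be flattened into one vector per population member, and an error measure over the training outputs would serve as the objective.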


Proposed techniques…

• 4.2. Architecture of DE-KBQR
In DE-KBQR, we apply kernels such as polynomial, sigmoid and Gaussian to compute the kernel matrix from the input data, which transforms the input data into higher dimensions.

4.2.1. Algorithm for DE-KBQR architecture
The steps involved in DE-KBQR are given below:
Step 1: Applying kernel technique
(I) Compute the kernel matrix K for the training data.
(II) Compute the kernel matrix Kte for the testing data.
(III) Centralize K and Kte.
(IV) Combine the K and Kte matrices to form the total centralized kernel matrix Ktot of order n × np.
Step 2: Training KBQR using differential evolution


Architecture of DE-KBQR

(I) Initialize the weights b = (b1, b2, b3, ..., bnp), which correspond to the input variables, i.e. the columns of Z, to random values generated from the U(0,1) distribution.

(II) The output yk for the kth sample, where k = 1, 2, 3, ..., n and n is the number of samples, is computed with the following formula:


Steps of DE common to both architectures

(i) Initialization: The initial population is randomly initialized following a uniform distribution.

(ii) Mutation: Mutation is a search mechanism which, together with recombination and selection, directs the search towards promising regions of the solution space.

(iii) Recombination (crossover): In the recombination (crossover) operation, each target vector of the parent population is allowed to mate with a mutated vector to produce a trial vector.

(iv) Selection: In the selection stage, we keep either the target vector or the trial vector, whichever gives the better objective function value.
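The four steps above can be sketched as a compact minimizer. This is an illustrative sketch assuming the classic DE/rand/1/bin variant; the slides do not state which DE variant or control parameters were used, so `f_scale`, `cr`, `pop_size` and the function name `differential_evolution` are assumptions.

```python
import random


def differential_evolution(objective, dim, bounds=(0.0, 1.0), pop_size=20,
                           f_scale=0.5, cr=0.9, generations=100, seed=0):
    """Minimize `objective` with a DE/rand/1/bin scheme; returns (best_vector, best_cost)."""
    rng = random.Random(seed)
    lo, hi = bounds
    # (i) Initialization: uniform random population.
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    cost = [objective(p) for p in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # (ii) Mutation: a base vector plus a scaled difference of two others.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            mutant = [pop[a][d] + f_scale * (pop[b][d] - pop[c][d]) for d in range(dim)]
            # (iii) Recombination: mix target and mutant component-wise;
            # one forced index guarantees the trial differs from the target.
            forced = rng.randrange(dim)
            trial = [mutant[d] if (rng.random() < cr or d == forced) else pop[i][d]
                     for d in range(dim)]
            # (iv) Selection: keep the trial only if it is at least as good.
            trial_cost = objective(trial)
            if trial_cost <= cost[i]:
                pop[i], cost[i] = trial, trial_cost
    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best], cost[best]
```

For either architecture, `objective` would map a parameter vector (network weights or regression coefficients) to the training error, so the same loop serves both.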


Flowchart for differential Evolution


Datasets description and experimental setup

• The datasets analyzed in this work are three bankruptcy prediction datasets, viz. Turkish Banks, Spanish Banks and US Banks, and two credit datasets, viz. German Credit and UK Credit.

• Throughout this study, we used 10-fold cross-validation for testing. The results presented in the tables are averages over the 10 folds.
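The 10-fold protocol can be sketched as follows. This is a generic illustration, not the authors' code: it assumes a plain random split (the slides do not say whether the folds were stratified), and `k_fold_indices`, `cross_validate` and the `evaluate` callback are hypothetical names.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(n_samples, evaluate, k=10, seed=0):
    """Average evaluate(train_idx, test_idx) over the k folds."""
    folds = k_fold_indices(n_samples, k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        # Every fold except the i-th forms the training set.
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        scores.append(evaluate(train_idx, test_idx))
    return sum(scores) / k
```

Here `evaluate` would train a model on `train_idx`, score it on `test_idx`, and return a metric such as accuracy or sensitivity; the returned mean corresponds to the averaged figures reported in the tables.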


For KPCWNN, the error value considered is the normalized root mean square error (NRMSE). For KBQR, the error value is given by:


Results and discussion

• We implemented both DE-KPCWNN and DE-KBQR in Java (JDK 1.5) on a Windows 7 desktop with 2 GB of RAM.

• For the Spanish dataset, sensitivity improved from 94.17% to 100%, while accuracy increased from 95% to 100% with DE-KBQR. With DE-KPCWNN, accuracy increased from 92.17% to 100%.

• For the Turkish dataset, a sensitivity of 100% is achieved, while accuracy increased from 94.17% to 100% by employing DE-KBQR.

• Thus, in both cases, DE-KBQR is superior to the other techniques.

• The t-test values presented in Table 9 are compared to 2.83, which is the t-test table value at 18 degrees of freedom (10 + 10 − 2 = 18) and the 1% level of significance.
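The degrees-of-freedom count (10 + 10 − 2 = 18) is consistent with a two-sample t-test on two sets of 10 fold-wise results. The sketch below assumes the pooled-variance form of that test, which the slides do not state explicitly; `pooled_t_statistic` is a hypothetical name.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(sample_a, sample_b):
    """Two-sample t statistic with pooled variance; returns (t, degrees_of_freedom)."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Pooled variance weights each sample variance by its own degrees of freedom.
    pooled_var = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / df
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return t, df
```

The absolute value of `t` would then be compared with the tabulated critical value for the returned degrees of freedom; a larger value indicates a statistically significant difference between the two techniques.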


Conclusions

• DE-KPCWNN and DE-KBQR can be very effective soft computing tools for solving classification problems such as bankruptcy prediction and credit scoring.

• The reason could be the use of the kernel trick in conjunction with proven intelligent techniques.

• Future directions include constructing more kernel techniques along these lines and developing online training algorithms for some of the kernel techniques.


References

• [1] T. Abdelwahed, E.M. Amir, New evolutionary bankruptcy forecasting model based on genetic algorithms and neural networks, in: Proceedings of the 17th IEEE International Conference on Tools with Artificial Intelligence (ICTAI05), IEEE Computer Society, 2005, pp. 1–5.

• [2] H. Ahn, K. Lee, K.J. Kim, Global optimization of support vector machines using genetic algorithms for bankruptcy prediction, Lecture Notes in Computer Science, vol. 4234, Springer, Heidelberg, 2006, pp. 420–429.

• [3] M.A. Aizerman, E.M. Braverman, L.I. Rozonoer, Theoretical foundations of the potential function method in pattern recognition learning, Automation and Remote Control 25 (6) (1964) 821–837.

• [4] E.I. Altman, Financial ratios, discriminant analysis and the prediction of corporate bankruptcy, Journal of Finance 23 (1968) 589–609.

• [5] V. Atella, N. Pace, D. Vuri, Are employers discriminating with respect to weight? European evidence using quantile regression, Economics and Human Biology 6 (3) (2008) 305–329.

• [6] E. Avci, An expert system based on wavelet neural network-adaptive norm entropy for scale invariant texture classification, Expert Systems with Applications 32 (3) (2007) 919–926.

• [7] G. Bassett, H.L. Chen, Quantile style: return-based attribution using regression quantiles, Empirical Economics 26 (2001) 293–305.

• [8] W. Beaver, Financial ratios as predictors of failure, Journal of Accounting Research 5 (1966) 71–111.

• [9] D.F. Benoit, D. Van den Poel, Binary quantile regression: a Bayesian approach based on the asymmetric Laplace distribution, Journal of Applied Econometrics (2010), doi:10.1002/jae.1216.

• [10] M.J. Beynon, M.J. Peel, Variable precision rough set theory and data discretization: an application to corporate failure prediction, Omega 29 (6) (2001) 561–576.

• [11] T.R. Bhat, D. Venkataramani, V. Ravi, C.V.S. Murty, Improved differential evolution method for efficient parameter estimation in biofilter modeling, Biochemical Engineering Journal 28 (2) (2006) 167–176.

• [12] B.E. Boser, I.M. Guyon, V.N. Vapnik, A training algorithm for optimal margin classifiers, in: COLT 92: Proceedings of the Fifth Annual Workshop on Computational Learning Theory, ACM Press, New York, 1992, pp. 144–152.

• [13] M. Buchinsky, Changes in U.S. wage structure 1963–1987: an application of quantile regression, Econometrica 62 (2) (1994) 405–458.

• [14] M. Buchinsky, The dynamics of changes in the female wage distribution in the USA: a quantile regression approach, Journal of Applied Econometrics 13 (1) (1998) 1–30.

• [15] S. Canbas, B. Caubak, S.B. Kilic, Prediction of commercial bank failure via multivariate statistical analysis of financial structures: the Turkish case, European Journal of Operational Research 166 (2005) 528–546.

• [16] A. Chaudhuri, K. De, Fuzzy support vector machine for bankruptcy prediction, Applied Soft Computing 11 (2) (2011) 2472–2486.

• [17] N.J. Chauhan, V. Ravi, D.K. Chandra, Differential evolution trained wavelet neural network: application to bankruptcy prediction in banks, Expert Systems with Applications 36 (4) (2009) 7659–7665.

• [18] F.L. Chen, F.C. Li, Combination of feature selection approaches with SVM in credit scoring, Expert Systems with Applications 37 (7) (2010) 4902–4909.

• [19] H.L. Chen, B. Yang, G. Wang, J. Liu, X. Xu, S.J. Wang, D.Y. Liu, A novel bankruptcy prediction model based on an adaptive fuzzy k-nearest neighbor method, Knowledge-Based Systems 24 (8) (2011) 1348–135

• [20] N. Chen, B. Ribeiro, A.S. Vieira, J. Duarte, J.C. Neves, A genetic algorithm-based approach to cost-sensitive bankruptcy prediction, Expert Systems with Applications 38 (10) (2011) 12939–12945.

• [21] C.B. Cheng, C.L. Chen, C.J. Fu, Financial distress prediction by a radial basis function network with logit analysis learning, Computers and Mathematics with Applications 51 (3–4) (2006) 579–588.

• [22] V. Chernozhukov, L. Umantsev, Conditional value-at-risk: aspects of modeling and estimation, Empirical Economics 26 (1) (2001) 271–292.

• [23] A. Cielen, L. Peeters, K. Vanhoof, Bankruptcy prediction using a data envelopment analysis, European Journal of Operational Research 154 (2) (2004) 526–532.

• [24] R. Cole, J. Gunther, A CAMEL rating’s shelf life, Federal Reserve Bank of Dallas Review (December) (1995) 13–20.

• [25] T. Conley, D. Galenson, Nativity and wealth in mid-nineteenth-century cities, Journal of Economic History 58 (2) (1998) 468–493.

• [26] C. Dimoulas, G. Kalliris, G. Papanikolaou, V. Petridis, A. Kalampakas, Bowel sound pattern analysis using wavelets and neural networks with application to long-term, unsupervised, gastrointestinal motility monitoring, Expert Systems with Applications 34 (1) (2008) 26–41.

• [27] L. Dong, D. Xiao, Y. Liang, Y. Liu, Rough set and fuzzy wavelet neural network integrated with least square weighted fusion algorithm based fault diagnosis research for power transformers, Electric Power Systems Research 78 (1) (2008) 129–136.

• [28] M.A.H. Farquad, V. Ravi, Sreeramji, G. Praveen, Credit Scoring using PCA-SVM hybrid model, in: Second International Conference on Recent Trends in Information, Telecommunication and Computing – ITC 2011, March 10–11, Bangalore, India.

• [29] B. Fattouh, P. Scaramozzino, L. Harris, Capital structure in South Korea: a quantile regression approach, Journal of Development Economics 76 (1) (2005) 231–250.

• [30] A. Gosling, S. Machin, C. Meghir, The changing distribution of male wages in the UK, The Review of Economic Studies 67 (4) (2000) 635–66
