IPR Oracle Innovation Days 2015

Oracle Advanced Analytics: insurance claim fraud detection. Oracle Innovation Days 2015, Riga

Transcript of IPR Oracle Innovation Days 2015

Page 1: IPR Oracle Innovation Days 2015

Oracle Advanced Analytics: insurance claim fraud detection

Oracle Innovation Days 2015, Riga

Page 2: IPR Oracle Innovation Days 2015

• Established in November 2007

• 100+ employees

• Customers in the Nordics, Latvia, Russia, and the USA

• Provides systems integration services (CRM, Decision Support Systems)

• Develops original products (Micromiles, Debessmana)

Who we are

Page 3: IPR Oracle Innovation Days 2015

• Defining needs

• Collecting data

• Generating and evaluating options

• Selecting the best possible option

• Applying and using

• Getting feedback and following up

The Decision-Making Process Is …

Page 4: IPR Oracle Innovation Days 2015

Data Mining is

• the computational process of discovering patterns in large data sets

• often used as a synonym for Knowledge Discovery in Databases (KDD)

What is Data Mining?

Page 5: IPR Oracle Innovation Days 2015

Financial Services

- Credit risk analysis

- Cross-LOB up-selling

- Fraud detection

- Retail banking personalization

- “Best customer” prediction & profiling

Retail

- Product recommendations

- Customer segmentation

- Customer profiling

- Market Basket Analysis

Telecommunications

- Churn prevention

- Social network analysis

- Network monitoring

- Customer handling time reduction

Transportation and logistics

- Anticipate bottlenecks

- Proactive resource planning

- Improved preventative maintenance strategies

Data Mining use cases

Page 6: IPR Oracle Innovation Days 2015

Cross Industry Standard Process for Data Mining (CRISP-DM)

Business Understanding
- Business objectives
- Success criteria
- Project plan
- Deliverables

Data Understanding
- Initial data collection
- Data description
- Data exploration

Data Preparation
- Data cleaning
- Sampling
- Normalization
- Feature selection

Modeling
- Select modeling techniques
- Build/train model
- Prediction

Evaluation
- Model validation
- Review results
- Success criteria evaluation

Deployment
- Results visualization
- Report creation

Page 7: IPR Oracle Innovation Days 2015

Business Understanding

Fraud detection analysis for insurance claims (car insurance)

Business Objectives

The goal of this analysis is to create a tool that helps identify fraudulent claims in auto insurance (KASKO).

Deliverables

• Possible fraud prediction

• Descriptive analysis

Page 8: IPR Oracle Innovation Days 2015

Data Understanding

Initial Data Collection

• 250 attributes

• 404k claims

• 4% fraud

[Chart legend: Fraud, Normal]

Source: Oracle Siebel CRM

Page 9: IPR Oracle Innovation Days 2015

Data preprocessing

[Chart legend: Fraud, Normal]

Activities:

• normalization

• imputing missing data

• attribute selection

• stratified sampling
  - 70% training dataset
  - 30% test dataset

Final data set: 150 of 250 attributes selected
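The 70/30 stratified split listed above can be illustrated with a short Python sketch. This is not how the split was produced in the project (the slides name Oracle Data Miner as the tool); the function name `stratified_split`, the `is_fraud` field, and the toy data are assumptions made purely for illustration.

```python
import random
from collections import defaultdict


def stratified_split(rows, label_key, train_fraction=0.7, seed=42):
    """Split rows into train/test so each class keeps the same share in both."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_key]].append(row)

    train, test = [], []
    for group in by_label.values():
        rng.shuffle(group)  # shuffle within the class, then cut at 70%
        cut = round(len(group) * train_fraction)
        train.extend(group[:cut])
        test.extend(group[cut:])
    return train, test


# Hypothetical toy data: 1,000 claims with a 4% fraud share.
claims = [{"claim_id": i, "is_fraud": i < 40} for i in range(1000)]
train, test = stratified_split(claims, "is_fraud")
```

Because each class is split separately, both subsets keep the original fraud share, which a plain random split does not guarantee when one class is as rare as 4%.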

Page 10: IPR Oracle Innovation Days 2015

Data Mining techniques

• Classification

• Clustering

Data mining tools: Oracle Data Miner

Modeling

Page 11: IPR Oracle Innovation Days 2015

Copyright © 2014 Oracle and/or its affiliates. All rights reserved.

– In-database data mining algorithms and open source R algorithms

– SQL, PL/SQL, R languages

– Scalable, parallel in-database execution

– Workflow GUI and IDEs

– Integrated component of Database

– Enables enterprise analytical applications

Key Features

Oracle Advanced Analytics: Fastest Way to Deliver Scalable Enterprise-wide Predictive Analytics

Page 12: IPR Oracle Innovation Days 2015

OBIEE

Oracle Database Enterprise Edition

Oracle Advanced Analytics Architecture

Oracle Advanced Analytics: Native SQL Data Mining/Analytic Functions + High-performance R Integration for Scalable, Distributed, Parallel Execution

SQL Developer

Applications

R Client

Page 13: IPR Oracle Innovation Days 2015

Function, algorithms, and applicability:

Classification
- Logistic Regression (GLM): classical statistical technique
- Decision Trees: popular / rules / transparency
- Naïve Bayes: embedded app
- Support Vector Machines (SVM): wide / narrow data / text

Regression
- Linear Regression (GLM): classical statistical technique
- Support Vector Machine (SVM): wide / narrow data / text

Anomaly Detection
- One Class SVM: unknown fraud cases or anomalies

Attribute Importance
- Minimum Description Length (MDL), Principal Components Analysis (PCA): attribute reduction, reduce data noise

Association Rules
- Apriori: market basket analysis / Next Best Offer

Clustering
- Hierarchical k-Means, Hierarchical O-Cluster, Expectation-Maximization Clustering (EM): product grouping / text mining, gene and protein analysis

Feature Extraction
- Nonnegative Matrix Factorization (NMF), Singular Value Decomposition (SVD): text analysis / feature reduction

Oracle Advanced Analytics In-Database Data Mining Algorithms—SQL & R & GUI Access

Page 14: IPR Oracle Innovation Days 2015

• Automated data preprocessing (normalizing, cleaning)

• Workflow-type modeling

• Build several models in parallel

Modeling

Classification modeling using Oracle Data Miner

Page 15: IPR Oracle Innovation Days 2015

Model comparison and validation (confusion matrix)

Classification modeling evaluation

Model   Actual value   Predicted Y   Predicted N   Accuracy
SVM     Y              66%           34%           69%
SVM     N              29%           71%
DT      Y              66%           34%           66%
DT      N              33%           67%
GLM     Y              70%           30%           70%
GLM     N              30%           70%

Where Y = fraud cases, N = normal cases
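The slide does not say how the accuracy column was calculated. One plausible reading is that accuracy is the class-weighted average of the two diagonal rates. The sketch below checks that reading under the assumption of a balanced test set (fraud share of 0.5); that share, like the function name `accuracy`, is an assumption and not something the source states.

```python
def accuracy(fraud_hit_rate, normal_hit_rate, fraud_share):
    """Overall accuracy as the class-weighted average of per-class hit rates."""
    return fraud_share * fraud_hit_rate + (1 - fraud_share) * normal_hit_rate


# Diagonal rates read off the slide:
# (actual Y predicted Y, actual N predicted N).
models = {"SVM": (0.66, 0.71), "DT": (0.66, 0.67), "GLM": (0.70, 0.70)}

for name, (fraud_rate, normal_rate) in models.items():
    # fraud_share=0.5 is an assumption (balanced test set), not from the source.
    print(f"{name}: {accuracy(fraud_rate, normal_rate, fraud_share=0.5):.1%}")
```

With equal class weights the results come out at roughly 68.5%, 66.5%, and 70%, which is within rounding of the reported 69%, 66%, and 70%. Weighting by the raw 4% fraud share would instead pull each figure toward the normal-class rate.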

Page 16: IPR Oracle Innovation Days 2015

Cluster evaluation

% of fraud vs normal cases

The top left quadrant is our goal
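The measure named on this slide, the percentage of fraud versus normal cases per cluster, can be computed as in the minimal sketch below. The chart axes are not recoverable from the transcript, so this only shows the per-cluster fraud share; the function name and the sample pairs are hypothetical.

```python
from collections import Counter


def fraud_share_by_cluster(assignments):
    """Return {cluster_id: fraction of that cluster's claims that are fraud}."""
    totals, frauds = Counter(), Counter()
    for cluster_id, is_fraud in assignments:
        totals[cluster_id] += 1
        if is_fraud:
            frauds[cluster_id] += 1
    return {cluster: frauds[cluster] / totals[cluster] for cluster in totals}


# Hypothetical (cluster_id, is_fraud) pairs, for illustration only.
sample = [
    (1, True), (1, True), (1, False),
    (2, False), (2, False), (2, False), (2, True),
    (3, False),
]
shares = fraud_share_by_cluster(sample)
# Clusters ordered from highest to lowest fraud share.
ranked = sorted(shares, key=shares.get, reverse=True)
```

Clusters with a fraud share well above the overall 4% rate are the ones worth describing and reviewing first.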

Page 17: IPR Oracle Innovation Days 2015

Cluster analysis OBIEE dashboard

Page 18: IPR Oracle Innovation Days 2015

Fraudulent claims prediction

Output:
- List of possible fraudulent cases
- Probabilities
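A list of this kind can be produced from per-claim model scores as in the sketch below. The claim identifiers, the scores, and the 0.5 cut-off are all hypothetical; the slides do not state which threshold, if any, was applied.

```python
def flag_possible_fraud(scored_claims, threshold=0.5):
    """Return (claim_id, probability) pairs at or above threshold, highest first."""
    flagged = [(cid, prob) for cid, prob in scored_claims if prob >= threshold]
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)


# Hypothetical model output: (claim_id, predicted fraud probability).
scores = [("C-001", 0.12), ("C-002", 0.91), ("C-003", 0.55), ("C-004", 0.49)]

for claim_id, probability in flag_possible_fraud(scores):
    print(claim_id, f"{probability:.2f}")
```

Sorting by probability lets investigators work from the most suspicious claim downward, and the threshold can be raised or lowered to match how many claims they are able to review.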

Page 19: IPR Oracle Innovation Days 2015

Contacts

• Web: www.ideaportriga.lv

• Blog: blog.ideaportriga.lv

• Email: [email protected]

• LinkedIn: lv.linkedin.com/in/jurijsj

Find out more

Page 20: IPR Oracle Innovation Days 2015

Q&A