Machine Intelligence for Fraud Prediction
#paymentsecurity
Dmitry Petukhov, ML/DS Preacher,
Machine Intelligence Researcher @ OpenWay &&
Coffee Addict
Terminology
A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.
T.M. Mitchell. Machine Learning, 1997.
Machine learning is the process by which a machine (computer) becomes able to exhibit behavior that was not explicitly programmed into it.
A.L. Samuel. Some Studies in Machine Learning Using the Game of Checkers, 1959.
Thesis #1: Machine Learning is the Future
Machine Intelligence Cases for Retail Banking
[Figure: use cases placed by Processing Speed (Batch Processing to Real-time), Log(Volume) (Gbytes, Tbytes, Pbytes), and data type (Structured data, Semi-structured, Unstructured)]
Use cases: Personalized Product Offering; Customer Loyalty; Operational Efficiencies; Fraud Detection; Compliance and Regulatory Reporting; Voice Identity, Chat-bots; Customer Segmentation; Credit Scoring
Fraud and threat types: Credit Card Fraud; Web-/Mobile Bank Fraud; Insider Threats; Information Attacks
Thesis #2: Data are everywhere
Card-not-present Fraud Volume == Big Data case
Volume, Variety, Velocity
Machine Intelligence + Big Data New Paradigm
Old School vs Big Data Paradigm
Dynamic threshold
Static* threshold
Old School vs AI Paradigm
* ∆t attack ≪ ∆t reaction
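The slides do not say how the dynamic threshold is computed. As a minimal sketch of the contrast, assume the static rule is a fixed amount limit and the dynamic rule is a high percentile of recent transaction amounts, so the limit follows the data instead of waiting for a manual update. `STATIC_LIMIT`, `DynamicThreshold`, the window size, and the percentile are all hypothetical choices for illustration.

```python
from collections import deque
from statistics import quantiles

STATIC_LIMIT = 1000.0  # hypothetical hand-tuned limit, changed only by a human


def static_flag(amount: float) -> bool:
    """Static rule: one fixed limit until someone edits it."""
    return amount > STATIC_LIMIT


class DynamicThreshold:
    """Flags amounts above a percentile of the most recent amounts.

    The window and percentile defaults are illustrative assumptions,
    not values taken from the talk.
    """

    def __init__(self, window: int = 100, percentile: int = 99) -> None:
        self.history = deque(maxlen=window)  # keeps only the last `window` amounts
        self.percentile = percentile

    def current_limit(self) -> float:
        if len(self.history) < 2:
            return float("inf")  # too little data yet: flag nothing
        # n=100 yields the 1st..99th percentile cut points of the window
        cuts = quantiles(self.history, n=100, method="inclusive")
        return cuts[self.percentile - 1]

    def flag(self, amount: float) -> bool:
        """Score the amount against the current limit, then record it."""
        limit = self.current_limit()
        self.history.append(amount)
        return amount > limit
```

With a history of small amounts, a sudden 500.0 payment is flagged by the dynamic rule while the static rule, tuned for a different regime, lets it through. A production system would more likely use a model score than a raw amount, but the adaptation idea is the same.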
Evolution and Revolution
Machine Intelligence Stack
Intelligence: Machine, Human
Infrastructure: Private cloud, Hybrid cloud, Public cloud
Data: Forget or Secure, Store and share
Machine Intelligence Stack
Cost
Law? Ethics?
Black box?
Architecture: Data Flow
Online: real-time processing
Transactions stream
Risk score
Internal data: Transactions Log (WAY4), customers/merchants CRMs, black/white lists
External data: NBKI (National Bureau of Credit Histories), FNS (Federal Tax Service), PFR (Pension Fund of Russia), FSSP (Federal Bailiff Service), location & device detection, social graph, mobile provider score
Steps:
0. Retrieve data
1. Preprocess data
2. Calculate statistics
3. Train model
4. Evaluate model
Details: Raw, Aggregates, Model
Private data (152-FZ)
Payment data (PCI DSS)
Step 1: Preprocessing Data (Transaction Amount Challenge)
Step 1: Preprocessing Data (Customer Clustering Challenge)
Step 2: Calculate Statistics
1% of women aged 40 who take part in routine screening have breast cancer. 80% of women with breast cancer get a positive mammogram. 9.6% of healthy women also get a positive result (mammography, like any measurement, is not 100% accurate).
A woman in this age group gets a positive result at a routine screening.
What is the probability that she actually has breast cancer?
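The screening question above can be answered with Bayes' theorem. The short sketch below uses only the three numbers from the slide; the function and parameter names are illustrative. The same base-rate effect matters for fraud alerts, where the positive class is also rare.

```python
def posterior(prior: float, sensitivity: float, false_positive_rate: float) -> float:
    """P(condition | positive test) by Bayes' theorem."""
    true_positive_mass = prior * sensitivity                    # sick and tested positive
    false_positive_mass = (1.0 - prior) * false_positive_rate   # healthy but tested positive
    return true_positive_mass / (true_positive_mass + false_positive_mass)


# Numbers from the slide: 1% prevalence, 80% sensitivity, 9.6% false positives.
p = posterior(prior=0.01, sensitivity=0.80, false_positive_rate=0.096)
print(f"{p:.1%}")  # prints 7.8%
```

Even with a fairly accurate test, fewer than one in ten positive results is a true case, because healthy patients vastly outnumber sick ones.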
Step 3: Train Model (Algorithm Selection Challenge)
Algorithm                   Accuracy      Speed      Specifics
1. Logistic regression      low           fast       linearly separable
2. Decision Tree            low           medium     human-readable
3. Boosted Decision Tree    high          medium     generalization ability
4. Neural Networks          medium-high   low        pattern recognition
5. Deep Learning            high          very low   magic AI
Step 4: Evaluate Model
$\text{Accuracy} = \frac{TP + TN}{P + N}$
$\text{Recall}^{*} = \frac{TP}{TP + FN}$
$\text{Precision} = \frac{TP}{TP + FP}$
$F = 2 \cdot \frac{\text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$
Source: Wikipedia
Challenges:
Imbalanced classes;
False Positive Penalty != False Negative Penalty;
Calculate business-metrics:
Direct and indirect losses;
Bonus:
if you change Threshold, you will change everything…
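To make the threshold remark concrete, here is a small sketch that computes accuracy, precision, recall, and the F-measure from confusion counts at several thresholds. The labels, scores, and per-error costs are made-up toy values, and `confusion`, `metrics`, and `expected_cost` are hypothetical helper names, not part of the talk.

```python
from typing import Dict, List, Tuple

# Toy data (assumed for illustration): 8 legitimate (0) and 2 fraudulent (1) transactions.
Y_TRUE: List[int] = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
SCORES: List[float] = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.55, 0.70, 0.60, 0.90]


def confusion(y_true: List[int], scores: List[float], threshold: float) -> Tuple[int, int, int, int]:
    """Return (TP, FP, TN, FN), treating score >= threshold as a fraud prediction."""
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, scores):
        predicted_fraud = score >= threshold
        if predicted_fraud and label == 1:
            tp += 1
        elif predicted_fraud:
            fp += 1
        elif label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def metrics(tp: int, fp: int, tn: int, fn: int, beta: float = 1.0) -> Dict[str, float]:
    """Accuracy, precision, recall and F-beta; beta=1 gives the harmonic mean."""
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    b2 = beta * beta
    denom = b2 * precision + recall
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f_beta": (1 + b2) * precision * recall / denom if denom else 0.0,
    }


def expected_cost(fp: int, fn: int, cost_fp: float, cost_fn: float) -> float:
    """Business metric: unequal penalties for false alarms and missed fraud."""
    return fp * cost_fp + fn * cost_fn


if __name__ == "__main__":
    for threshold in (0.5, 0.8, 1.0):
        tp, fp, tn, fn = confusion(Y_TRUE, SCORES, threshold)
        m = metrics(tp, fp, tn, fn)
        # Hypothetical costs: a missed fraud is 20x as expensive as a false alarm.
        cost = expected_cost(fp, fn, cost_fp=1.0, cost_fn=20.0)
        print(threshold, (tp, fp, tn, fn), m, cost)
```

On this toy set, raising the threshold from 0.5 to 0.8 increases accuracy while halving recall, and a threshold that flags nothing still scores 80% accuracy. That is why accuracy alone is a poor guide on imbalanced classes and why a cost-weighted metric is worth computing.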
Rule-based or AI-based?
© 2017, Dmitry Petukhov. CC BY-SA 4.0 license. OpenWay and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.
Thank you!
Q&A
Now or later (see contacts below)
Stay connected
Facebook: @code.zombi
Habr: @codezombie
All contacts: http://0xCode.in/author
Download presentation from
http://0xCode.in/2017/paymentsecurity or