© 2014 The MathWorks, Inc.
Machine Learning with MATLAB
Gerardo Hernández Correa
Application Engineer
What You Will Learn
Overview of Machine Learning
Algorithms available in MATLAB
Overcoming machine learning challenges with MATLAB
What is Machine Learning?
Most common tool for Data Analytics modeling
Use features in the data to create a predictive model
Machine Learning for Data Analytics
“Data analytics solutions allow firms
to discover, optimize, and deploy predictive models
by analyzing data sources
to improve business outcomes.”
Forrester Research
Challenges
• Overloaded by data
• Competition for better decision-making
• Big Data buzz, but no solution for making sense of it (analytics) and integrating it with enterprise-wide applications (deployment)
Used Across Many Application Areas
Biology: Tumor Detection, Drug Discovery
Financial Services: Credit Scoring, Algorithm Trading, Bond Classification
Image & Video Processing: Pattern Recognition
Energy: Load, Price Forecasting, Trading
Audio Processing: Speech Recognition
Challenges – Machine Learning
Lots of data, with many variables (predictors)
Data is too complex to know the governing equations
Significant technical expertise required
No “one size fits all” solution; requires an iterative approach
Try multiple algorithms, see what works best
Time consuming
MATLAB Solution
Strong environment for interactive exploration
Algorithms and Apps to get started
- Clustering, Classification, Regression
- Neural network app, Curve fitting app
Easy to evaluate, iterate and choose the best algorithm
Parallel computing
Integrated with data and deployment for Data Analytics workflows
Overview – Machine Learning
Type of Learning and Categories of Algorithms:
Supervised Learning: develop predictive model based on both input and output data
– Classification
– Regression
Unsupervised Learning: group and interpret data based only on input data
– Clustering
Unsupervised Learning
Clustering
– k-Means, Fuzzy C-Means
– Hierarchical
– Neural Networks
– Gaussian Mixture
– Hidden Markov Model
Supervised Learning
Regression
– Non-linear Reg. (GLM, Logistic)
– Linear Regression
– Decision Trees
– Ensemble Methods
– Neural Networks
Classification
– Nearest Neighbor
– Discriminant Analysis
– Naive Bayes
– Support Vector Machines
Supervised Learning - Workflow
Train the Model: Known data + Known responses → Model
Use for Prediction: New Data → Model → Predicted Responses
Other steps in the diagram: Import Data, Explore Data, Prepare Data, Select Model, Measure Accuracy, Speed up Computations
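The train, predict, and measure steps of this workflow can be sketched in a few lines. This is a minimal illustration in plain Python (standard library only), not MATLAB code and not the method used in the talk. The nearest-centroid model, the synthetic two-class data, and the 70/30 holdout split are all assumptions chosen to keep the example short.

```python
import random

def train_centroids(X, y):
    """Train step: average the feature vectors of each known response."""
    sums, counts = {}, {}
    for features, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict(model, X):
    """Prediction step: assign each new point to the closest class centroid."""
    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return [min(model, key=lambda label: sq_dist(model[label], features))
            for features in X]

def accuracy(y_true, y_pred):
    """Measure step: fraction of responses predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Synthetic, hypothetical data: two well-separated classes in two dimensions.
rng = random.Random(0)
X = ([[rng.gauss(0, 0.5), rng.gauss(0, 0.5)] for _ in range(50)]
     + [[rng.gauss(3, 0.5), rng.gauss(3, 0.5)] for _ in range(50)])
y = ["no"] * 50 + ["yes"] * 50

# Hold out 30% of the known data so accuracy is measured on unseen points.
indices = list(range(len(X)))
rng.shuffle(indices)
split = int(0.7 * len(indices))
train_idx, test_idx = indices[:split], indices[split:]

model = train_centroids([X[i] for i in train_idx], [y[i] for i in train_idx])
y_pred = predict(model, [X[i] for i in test_idx])
holdout_accuracy = accuracy([y[i] for i in test_idx], y_pred)
print("holdout accuracy:", holdout_accuracy)
```

Measuring accuracy on held-out data rather than on the training data is what makes the comparison between candidate models meaningful.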
Classification Overview
What is classification?
– Predicting the best group for each point
– “Learns” from labeled observations
– Uses input features
Why use classification?
– Accurately group data never seen before
How is classification done?
– Can use several algorithms to build a predictive model
– Good training data is critical
[Figure: scatter plot of data points labeled Group1 through Group8; x-axis from -0.1 to 0.6, y-axis from 0 to 1]
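As one concrete instance of "learning" from labeled observations, a nearest-neighbor classifier can be sketched as follows. This is plain Python (standard library only), not MATLAB code. The function name `knn_predict`, the value of `k`, and the six labeled points are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, point, k=3):
    """Label `point` by majority vote among its k nearest labeled observations."""
    by_distance = sorted(zip(train_X, train_y),
                         key=lambda pair: math.dist(pair[0], point))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical labeled observations: two input features per point.
train_X = [(0.0, 0.1), (0.1, 0.2), (0.05, 0.15),
           (0.5, 0.8), (0.55, 0.9), (0.45, 0.85)]
train_y = ["Group1"] * 3 + ["Group2"] * 3

# Points never seen before are grouped using only their input features.
new_points = [(0.08, 0.12), (0.5, 0.82)]
predicted = [knn_predict(train_X, train_y, p) for p in new_points]
print(predicted)
```

The prediction is only as good as the labeled points it votes over, which is one way to read "good training data is critical".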
Example – Bank Marketing Campaign
Goal:
– Predict if customer would subscribe to bank term deposit based on different attributes
Approach:
– Train a classifier using different models
– Measure accuracy and compare models
– Reduce model complexity
– Use classifier for prediction
Data set downloaded from UCI Machine Learning repository
http://archive.ics.uci.edu/ml/datasets/Bank+Marketing
[Figure: Bank Marketing Campaign, Misclassification Rate (Percentage, 0 to 100). Models on the x-axis: Neural Net, Logistic Regression, Discriminant Analysis, k-nearest Neighbors, Naive Bayes, Support VM, Decision Trees, TreeBagger, Reduced TB. Series: No Misclassified, Yes Misclassified]
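The chart reports misclassification separately for the "No" and "Yes" responses. A per-class rate of that kind can be computed as sketched below. This is plain Python, not MATLAB code. It assumes the chart shows, for each true class, the percentage of its observations predicted as the other class. The labels and the two sets of predictions are made up and are not results from the bank data set.

```python
def misclassification_by_class(y_true, y_pred):
    """For each true class, the percentage of its observations predicted as another class."""
    totals, wrong = {}, {}
    for truth, guess in zip(y_true, y_pred):
        totals[truth] = totals.get(truth, 0) + 1
        if truth != guess:
            wrong[truth] = wrong.get(truth, 0) + 1
    return {label: 100.0 * wrong.get(label, 0) / totals[label] for label in totals}

# Hypothetical imbalanced responses: most customers do not subscribe.
y_true = ["no"] * 8 + ["yes"] * 2

# Hypothetical predictions from two candidate models.
predictions = {
    "always_no": ["no"] * 10,
    "mixed": ["no"] * 7 + ["yes"] + ["yes", "no"],
}

rates = {name: misclassification_by_class(y_true, y_pred)
         for name, y_pred in predictions.items()}
for name, rate in rates.items():
    print(name, rate)
```

In this made-up case the "always_no" model is right on 8 of 10 customers overall yet misses every "yes", which is why splitting the rate by class is useful when comparing models.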
Example – Bank Marketing Campaign
Numerous predictive models with rich documentation
Interactive visualizations and apps to aid discovery
Built-in parallel computing support
Quick prototyping; focus on modeling, not programming
[Figure: Bank Marketing Campaign, Misclassification Rate by model]
Clustering Overview
What is clustering?
– Segment data into groups, based on data similarity
Why use clustering?
– Identify outliers
– Resulting groups may be the matter of interest
How is clustering done?
– Can be achieved by various algorithms
– It is an iterative process (involving trial and error)
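One of the algorithms named earlier in the deck, k-means, shows the iterative nature of clustering directly. The sketch below is plain Python (standard library only), not MATLAB code. The function `kmeans`, its parameters, and the six sample points are hypothetical, and the initial centroids are simply drawn at random from the data.

```python
import math
import random

def kmeans(points, k, iterations=100, seed=0):
    """Alternate between assigning points to the nearest centroid and
    recomputing centroids, until the assignments stop changing."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [None] * len(points)
    for _ in range(iterations):
        new_labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break  # converged: no point changed cluster
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[j] = [sum(coord) / len(members) for coord in zip(*members)]
    return labels, centroids

# Hypothetical unlabeled data: two visibly separate groups.
points = [(0.0, 0.1), (0.1, 0.0), (0.05, 0.05),
          (0.5, 0.9), (0.55, 1.0), (0.6, 0.95)]
labels, centroids = kmeans(points, k=2)
print(labels)
print(centroids)
```

The result depends on the choice of `k` and on the random starting centroids, so in practice the procedure is rerun with different settings, which is the trial and error the slide refers to.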
Example – Clustering Corporate Bonds
Goal:
– Cluster similar corporate bonds together
Approach:
– Cluster the bonds data using distance-based and probability-based techniques
– Evaluate clusters for validity
[Figure: two panels, each plotted with Data Point # (about 1 to 4000) on both axes. Left: Hierarchical Clustering, Dist Metric: spearman, color scale 0.2 to 1.6. Right: k-Means Clustering, Dist Metric: cosine, color scale 0 to 0.8]
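The two distance metrics named in the figure, spearman and cosine, can be written out explicitly. The sketch below is plain Python (standard library only), not MATLAB code. It assumes the common definitions of one minus the Spearman rank correlation and one minus the cosine of the angle between two vectors. The three "bond" feature vectors are invented for illustration.

```python
import math

def cosine_distance(a, b):
    """One minus the cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for t in range(i, j + 1):
            ranks[order[t]] = shared_rank
        i = j + 1
    return ranks

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def spearman_distance(a, b):
    """One minus the Spearman rank correlation (Pearson correlation of the ranks)."""
    return 1.0 - pearson(average_ranks(a), average_ranks(b))

# Hypothetical feature vectors for three bonds.
bond_a = [1.0, 2.0, 3.0, 4.0]
bond_b = [2.0, 4.0, 6.0, 8.0]   # same direction and ordering as bond_a
bond_c = [4.0, 3.0, 2.0, 1.0]   # reversed ordering

print(cosine_distance(bond_a, bond_b), spearman_distance(bond_a, bond_b))
print(cosine_distance(bond_a, bond_c), spearman_distance(bond_a, bond_c))
```

Under these definitions the Spearman distance ranges from 0 to 2 and, for non-negative features, the cosine distance ranges from 0 to 1.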
Example – Clustering Corporate Bonds
Numerous clustering functions with rich documentation
Interactive visualizations to aid discovery
Viewable source; not a black box
Rapid exploration & development
[Figure: Hierarchical Clustering (Dist Metric: spearman) and k-Means Clustering (Dist Metric: cosine), each plotted over Data Point # on both axes]
MATLAB for Machine Learning
Challenges and MATLAB Solution
Time (loss of productivity): Rapid analysis and application development
– High productivity from data preparation, interactive exploration, visualizations
Extract value from data: Machine learning, Video, Image, and Financial
– Depth and breadth of algorithms in classification, clustering, and regression
Computation speed: Fast training and computation
– Parallel computation, Optimized libraries
Time to deploy & integrate: Ease of deployment and leveraging enterprise
– Push-button deployment into production
Technology risk: High-quality libraries and support
– Industry-standard algorithms in use in production
– Access to support, training and advisory services when needed
Learn More: Machine Learning with MATLAB
mathworks.com/machine-learning