Machine Learning - What, Where and How
![Page 1: Machine Learning - What, Where and How](https://reader034.fdocuments.net/reader034/viewer/2022051817/547ea01cb4af9fa0158b56a7/html5/thumbnails/1.jpg)
Machine Learning - What, Where and How
Narinder Kumar ([email protected])
Mercris Technologies (www.mercris.com)
![Page 2: Machine Learning - What, Where and How](https://reader034.fdocuments.net/reader034/viewer/2022051817/547ea01cb4af9fa0158b56a7/html5/thumbnails/2.jpg)
2
Agenda
Definition
Types of Machine Learning
Under the Hood
Languages & Libraries
![Page 3: Machine Learning - What, Where and How](https://reader034.fdocuments.net/reader034/viewer/2022051817/547ea01cb4af9fa0158b56a7/html5/thumbnails/3.jpg)
3
What is Machine Learning?
![Page 4: Machine Learning - What, Where and How](https://reader034.fdocuments.net/reader034/viewer/2022051817/547ea01cb4af9fa0158b56a7/html5/thumbnails/4.jpg)
4
Definition
Field of study that gives computers the ability to learn without being explicitly programmed -- Arthur Samuel
A more mathematical one
A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E -- Tom M. Mitchell
![Page 5: Machine Learning - What, Where and How](https://reader034.fdocuments.net/reader034/viewer/2022051817/547ea01cb4af9fa0158b56a7/html5/thumbnails/5.jpg)
5
Related Disciplines
Sub-Field of Artificial Intelligence
Deals with Design and Development of Algorithms
Closely related to Data Mining
Uses techniques from Statistics, Probability Theory and Pattern Recognition
Not new but growing fast because of Big Data
![Page 6: Machine Learning - What, Where and How](https://reader034.fdocuments.net/reader034/viewer/2022051817/547ea01cb4af9fa0158b56a7/html5/thumbnails/6.jpg)
6
Types of Machine Learning
Supervised Machine Learning
Provide the right answers (labels) for a set of questions (examples)
The underlying algorithm learns/infers a mapping from these examples over time
It then tries to return correct answers for similar, unseen questions
Unsupervised Machine Learning
Provide unlabeled data and let the underlying algorithm find some structure in it
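To make the two types concrete, here is a minimal Python sketch. It is only an illustration: the toy data, the nearest-neighbor rule and the two-cluster k-means loop are assumptions chosen for brevity, not algorithms named in the slides.

```python
# Supervised: we supply questions (inputs) together with the right answers (labels).
def nearest_neighbor_predict(training_set, x):
    """Return the label of the training example whose input is closest to x."""
    closest_input, closest_label = min(training_set, key=lambda pair: abs(pair[0] - x))
    return closest_label


# Unsupervised: we supply only inputs and let the algorithm find structure.
def two_means(points, iterations=10):
    """Split 1-D points into two clusters with a basic k-means loop (k = 2)."""
    centers = [min(points), max(points)]  # simple deterministic starting guess
    clusters = [[], []]
    for _ in range(iterations):
        clusters = [[], []]
        for p in points:
            # Assign each point to the nearer of the two centers.
            index = 0 if abs(p - centers[0]) <= abs(p - centers[1]) else 1
            clusters[index].append(p)
        # Move each center to the mean of the points assigned to it.
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return clusters


# Hypothetical toy data: house sizes labeled "small" or "large".
labeled_houses = [(80, "small"), (95, "small"), (300, "large"), (340, "large")]
predicted = nearest_neighbor_predict(labeled_houses, 310)

# The same kind of data without labels; the algorithm has to find the groups itself.
unlabeled_sizes = [80, 95, 90, 300, 340, 320]
groups = two_means(unlabeled_sizes)
```

In the supervised case the labels steer the answer; in the unsupervised case the grouping emerges from the data alone.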
![Page 7: Machine Learning - What, Where and How](https://reader034.fdocuments.net/reader034/viewer/2022051817/547ea01cb4af9fa0158b56a7/html5/thumbnails/7.jpg)
7
Popular Use Cases
Recommendation Systems
Amazon, Netflix, iTunes Genius, IMDb...
Up-Selling & Churn Analysis
Customer Sentiment Analysis
Market Segmentation
...
![Page 8: Machine Learning - What, Where and How](https://reader034.fdocuments.net/reader034/viewer/2022051817/547ea01cb4af9fa0158b56a7/html5/thumbnails/8.jpg)
8
Understanding Regression
![Page 9: Machine Learning - What, Where and How](https://reader034.fdocuments.net/reader034/viewer/2022051817/547ea01cb4af9fa0158b56a7/html5/thumbnails/9.jpg)
9
Problem Context
![Page 10: Machine Learning - What, Where and How](https://reader034.fdocuments.net/reader034/viewer/2022051817/547ea01cb4af9fa0158b56a7/html5/thumbnails/10.jpg)
10
Typical Machine Learning Algorithm
Training Set -> Learning Algorithm -> Hypothesis
Input Features -> Hypothesis -> Expected Output
![Page 11: Machine Learning - What, Where and How](https://reader034.fdocuments.net/reader034/viewer/2022051817/547ea01cb4af9fa0158b56a7/html5/thumbnails/11.jpg)
11
Let's Simplify a bit
[Figure: House Sizes vs Prices. X-axis: House Sizes (Sq Yards); Y-axis: Prices (1000 USD)]
➢ Goal is to draw a straight line that fits our data set reasonably well
➢ Our hypothesis can be $h_\theta(x) = \theta_0 + \theta_1 x$
➢ Such that $h_\theta(x) \approx y$
![Page 12: Machine Learning - What, Where and How](https://reader034.fdocuments.net/reader034/viewer/2022051817/547ea01cb4af9fa0158b56a7/html5/thumbnails/12.jpg)
12
In Mathematical Terms
➢ Hypothesis: $h_\theta(x) = \theta_0 + \theta_1 x$
➢ Parameters: $\theta_0, \theta_1$
➢ Cost Function: $J(\theta_0, \theta_1)$
➢ We would like to minimize $J(\theta_0, \theta_1)$
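The transcript does not preserve a definition of the cost function. A common choice for linear regression, assumed here, is half the mean squared error over the $m$ training examples: $J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)})^2$. The sketch below computes it in Python; the function names and the data are hypothetical.

```python
def hypothesis(theta0, theta1, x):
    """Straight-line hypothesis h_theta(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x


def cost(theta0, theta1, xs, ys):
    """Half mean squared error between predictions and targets (assumed form of J)."""
    m = len(xs)
    squared_errors = [(hypothesis(theta0, theta1, x) - y) ** 2 for x, y in zip(xs, ys)]
    return sum(squared_errors) / (2 * m)


# Hypothetical data: house sizes (sq yards) and prices (1000 USD).
sizes = [100, 200, 300]
prices = [1000, 2000, 3000]

perfect_fit = cost(0, 10, sizes, prices)  # this line passes through every point
poor_fit = cost(0, 5, sizes, prices)      # this line is too shallow
```

A line that fits the data well yields a small cost, which is why minimizing J is a reasonable way to pick the parameters.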
![Page 13: Machine Learning - What, Where and How](https://reader034.fdocuments.net/reader034/viewer/2022051817/547ea01cb4af9fa0158b56a7/html5/thumbnails/13.jpg)
13
Solution: Gradient Descent
➢ Start with initial values of $\theta_0, \theta_1$
➢ Keep changing $\theta_0, \theta_1$ until we end up at a minimum
![Page 14: Machine Learning - What, Where and How](https://reader034.fdocuments.net/reader034/viewer/2022051817/547ea01cb4af9fa0158b56a7/html5/thumbnails/14.jpg)
14
Mathematically
Repeat Until Convergence
For Our Scenario
Generic Formula
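The formulas on this slide did not survive in the transcript. The textbook form of batch gradient descent, assumed here, repeats $\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1)$ for all $j$ simultaneously until convergence. With a half mean squared error cost (also an assumption) this becomes $\theta_0 := \theta_0 - \alpha \frac{1}{m} \sum_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)})$ and $\theta_1 := \theta_1 - \alpha \frac{1}{m} \sum_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)}) x^{(i)}$. A minimal Python sketch follows; the learning rate, iteration cap, tolerance and data are hypothetical values picked for the example.

```python
def gradient_descent(xs, ys, alpha=0.1, max_iterations=10000, tolerance=1e-12):
    """Fit h(x) = theta0 + theta1 * x by batch gradient descent."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0  # initial values
    for _ in range(max_iterations):
        # Prediction error for every training example under the current parameters.
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Simultaneous update: both new values are computed from the old parameters.
        new_theta0 = theta0 - alpha * grad0
        new_theta1 = theta1 - alpha * grad1
        converged = (abs(new_theta0 - theta0) < tolerance
                     and abs(new_theta1 - theta1) < tolerance)
        theta0, theta1 = new_theta0, new_theta1
        if converged:
            break
    return theta0, theta1


# Hypothetical data generated from y = 1 + 2x, so the fit should recover those values.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
theta0, theta1 = gradient_descent(xs, ys)
```

If alpha is too large the updates overshoot and diverge; if it is too small convergence is slow, so the learning rate usually needs some tuning.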
![Page 15: Machine Learning - What, Where and How](https://reader034.fdocuments.net/reader034/viewer/2022051817/547ea01cb4af9fa0158b56a7/html5/thumbnails/15.jpg)
15
Let's see all this in Action
![Page 16: Machine Learning - What, Where and How](https://reader034.fdocuments.net/reader034/viewer/2022051817/547ea01cb4af9fa0158b56a7/html5/thumbnails/16.jpg)
16
Extending Regression
➢ Quadratic Model
➢ Cubic Model
➢ Square Root Model
➢ We can create multiple new features like $X_2 = X^2$, $X_3 = X^3$, $X_4 = \sqrt{X}$
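One way to read this slide: the model stays linear in its parameters, and only the inputs are transformed. A minimal Python sketch under that assumption follows; the function names and parameter values are hypothetical.

```python
import math


def expand_features(x):
    """Return [x, x^2, x^3, sqrt(x)] for a single non-negative input x."""
    return [x, x ** 2, x ** 3, math.sqrt(x)]


def predict(thetas, x):
    """Linear model over the expanded features; thetas[0] is the intercept."""
    features = [1.0] + expand_features(x)
    return sum(t * f for t, f in zip(thetas, features))


row = expand_features(4.0)

# Hypothetical parameters: intercept 1 and weight 1 on x^2 only (a quadratic model).
quadratic_prediction = predict([1.0, 0.0, 1.0, 0.0, 0.0], 3.0)
```

Because the new features span very different ranges (x versus x^3), scaling them before running gradient descent tends to matter even more than in the single-feature case.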
![Page 17: Machine Learning - What, Where and How](https://reader034.fdocuments.net/reader034/viewer/2022051817/547ea01cb4af9fa0158b56a7/html5/thumbnails/17.jpg)
17
Additional Pointers
➢ Mean Normalization
➢ Feature Scaling
➢ Learning Rate
➢ Gradient Descent vs Others
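The slide lists these pointers by name only. As one possible reading, mean normalization combined with feature scaling replaces each value with $(x - \text{mean}) / (\max - \min)$, so that every feature is centered on zero and spans a range of 1. The Python sketch below assumes that definition; dividing by the standard deviation instead is an equally common variant.

```python
def mean_normalize(values):
    """Rescale values to (x - mean) / (max - min): zero mean, range of 1."""
    mean = sum(values) / len(values)
    spread = max(values) - min(values)
    if spread == 0:
        return [0.0 for _ in values]  # a constant feature carries no information
    return [(v - mean) / spread for v in values]


# Three house sizes from the sample data set shown later in the deck.
house_sizes = [2200.0, 2983.0, 3536.0]
scaled = mean_normalize(house_sizes)
```

Features on a comparable scale generally let gradient descent converge in fewer iterations with a single learning rate.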
![Page 18: Machine Learning - What, Where and How](https://reader034.fdocuments.net/reader034/viewer/2022051817/547ea01cb4af9fa0158b56a7/html5/thumbnails/18.jpg)
18
HOW-TO
Languages & Libraries
![Page 19: Machine Learning - What, Where and How](https://reader034.fdocuments.net/reader034/viewer/2022051817/547ea01cb4af9fa0158b56a7/html5/thumbnails/19.jpg)
19
Languages
![Page 20: Machine Learning - What, Where and How](https://reader034.fdocuments.net/reader034/viewer/2022051817/547ea01cb4af9fa0158b56a7/html5/thumbnails/20.jpg)
20
Libraries, Tools and Products
![Page 21: Machine Learning - What, Where and How](https://reader034.fdocuments.net/reader034/viewer/2022051817/547ea01cb4af9fa0158b56a7/html5/thumbnails/21.jpg)
21
A Short Introduction
![Page 22: Machine Learning - What, Where and How](https://reader034.fdocuments.net/reader034/viewer/2022051817/547ea01cb4af9fa0158b56a7/html5/thumbnails/22.jpg)
22
What is WEKA?
Developed by the Machine Learning Group, University of Waikato, New Zealand
Collection of Machine Learning Algorithms
Contains tools for Data Pre-Processing, Classification & Regression, Clustering and Visualization
Can be embedded inside your application
Implemented in Java
![Page 23: Machine Learning - What, Where and How](https://reader034.fdocuments.net/reader034/viewer/2022051817/547ea01cb4af9fa0158b56a7/html5/thumbnails/23.jpg)
23
Main Components
Explorer
Experimenter
Knowledge Flow
CLI
![Page 24: Machine Learning - What, Where and How](https://reader034.fdocuments.net/reader034/viewer/2022051817/547ea01cb4af9fa0158b56a7/html5/thumbnails/24.jpg)
24
Terminology
Training DataSet == Instances
Each Row in DataSet == Instance
Instance is a Collection of Attributes (Features)
Types of Attributes
Nominal (True, False, Malignant, Benign, Cloudy...)
Real values (6, 2.34, 0...)
String (“Interesting”, “Really like it”, “Hate It”...)
...
![Page 25: Machine Learning - What, Where and How](https://reader034.fdocuments.net/reader034/viewer/2022051817/547ea01cb4af9fa0158b56a7/html5/thumbnails/25.jpg)
25
Sample DataSets

@RELATION house
@ATTRIBUTE houseSize real
@ATTRIBUTE lotSize real
@ATTRIBUTE bedrooms real
@ATTRIBUTE granite real
@ATTRIBUTE bathroom real
@ATTRIBUTE sellingPrice real
@DATA
3529,9191,6,0,0,205000
3247,10061,5,1,1,224900
4032,10150,5,0,1,197900
2397,14156,4,1,0,189900
2200,9600,4,0,1,195000
3536,19994,6,1,1,325000
2983,9365,5,0,1,230000

@RELATION CPU
@attribute outlook {sunny, overcast, rainy}
@attribute temperature real
@attribute humidity real
@attribute windy {TRUE, FALSE}
@attribute play {yes, no}
@data
sunny,85,85,FALSE,no
sunny,80,90,TRUE,no
overcast,83,86,FALSE,yes
rainy,70,96,FALSE,yes
rainy,68,80,FALSE,yes
rainy,65,70,TRUE,no
overcast,64,65,TRUE,yes
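Both samples use WEKA's ARFF layout: a header with @RELATION and @ATTRIBUTE declarations, then a @DATA section of comma-separated rows. To show how the header maps onto the rows, here is a minimal Python sketch of a reader. It handles only the subset seen in these samples (real and nominal attributes, unquoted values); `parse_arff` is a hypothetical helper, not part of WEKA.

```python
def parse_arff(text):
    """Parse a small subset of ARFF: real and nominal attributes, plain CSV rows."""
    relation, attributes, rows = None, [], []
    in_data = False
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("%"):  # skip blank lines and comments
            continue
        lowered = line.lower()  # ARFF keywords are case-insensitive
        if in_data:
            values = [v.strip() for v in line.split(",")]
            row = {}
            for (name, kind), value in zip(attributes, values):
                row[name] = float(value) if kind == "real" else value
            rows.append(row)
        elif lowered.startswith("@relation"):
            relation = line.split(None, 1)[1]
        elif lowered.startswith("@attribute"):
            _, name, kind = line.split(None, 2)
            # Nominal attributes are declared as {a, b, c}; anything else is assumed
            # numeric here (string and date types are not handled in this sketch).
            attributes.append((name, "nominal" if kind.startswith("{") else "real"))
        elif lowered.startswith("@data"):
            in_data = True
    return relation, attributes, rows


# A trimmed version of the house data set: two attributes, two instances.
sample = """@RELATION house
@ATTRIBUTE houseSize real
@ATTRIBUTE sellingPrice real
@DATA
3529,205000
3247,224900
"""
relation, attributes, rows = parse_arff(sample)
```

Each parsed row corresponds to one instance, and each key to one attribute, matching the terminology used in the deck.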
![Page 26: Machine Learning - What, Where and How](https://reader034.fdocuments.net/reader034/viewer/2022051817/547ea01cb4af9fa0158b56a7/html5/thumbnails/26.jpg)
26
WEKA Demo
![Page 27: Machine Learning - What, Where and How](https://reader034.fdocuments.net/reader034/viewer/2022051817/547ea01cb4af9fa0158b56a7/html5/thumbnails/27.jpg)
27
![Page 28: Machine Learning - What, Where and How](https://reader034.fdocuments.net/reader034/viewer/2022051817/547ea01cb4af9fa0158b56a7/html5/thumbnails/28.jpg)
28
Apache Mahout
➢ Collection of Machine Learning Algorithms
➢ Map-Reduce Enabled (most cases)
➢ DataSources
  ➢ Database
  ➢ File-System
  ➢ Lucene Integration
➢ Very Active Community
➢ Apache License
![Page 29: Machine Learning - What, Where and How](https://reader034.fdocuments.net/reader034/viewer/2022051817/547ea01cb4af9fa0158b56a7/html5/thumbnails/29.jpg)
29
WEKA vs Apache Mahout
WEKA
➢ Lots of Algorithms
➢ Tools for
  ➢ Modeling
  ➢ Comparison
  ➢ Data-Flow
➢ May need work for running on large data-sets
➢ License Issues (GPL)
Apache Mahout
➢ Fewer Algorithms, but growing
➢ Lack of tools for Modeling
➢ Ready by Design for Large Scale
➢ Vibrant Community
➢ Apache License
![Page 30: Machine Learning - What, Where and How](https://reader034.fdocuments.net/reader034/viewer/2022051817/547ea01cb4af9fa0158b56a7/html5/thumbnails/30.jpg)
30
&
An Overview
![Page 31: Machine Learning - What, Where and How](https://reader034.fdocuments.net/reader034/viewer/2022051817/547ea01cb4af9fa0158b56a7/html5/thumbnails/31.jpg)
31
Google Prediction API 101
➢ Cloud Based Web Service for Machine Learning
➢ Exposed as REST API
➢ Does not require any Machine Learning knowledge
➢ Capabilities
  ➢ Categorical &
  ➢ Regression
![Page 32: Machine Learning - What, Where and How](https://reader034.fdocuments.net/reader034/viewer/2022051817/547ea01cb4af9fa0158b56a7/html5/thumbnails/32.jpg)
32
Working with Google Prediction API
![Page 33: Machine Learning - What, Where and How](https://reader034.fdocuments.net/reader034/viewer/2022051817/547ea01cb4af9fa0158b56a7/html5/thumbnails/33.jpg)
33
Let's see in Action
![Page 34: Machine Learning - What, Where and How](https://reader034.fdocuments.net/reader034/viewer/2022051817/547ea01cb4af9fa0158b56a7/html5/thumbnails/34.jpg)
34
Analysis
Very Promising Concept
Can be a powerful tool for SMEs
Not configurable
Data Security
Not Yet Production Ready (IMHO)
![Page 35: Machine Learning - What, Where and How](https://reader034.fdocuments.net/reader034/viewer/2022051817/547ea01cb4af9fa0158b56a7/html5/thumbnails/35.jpg)
35
Recap
➢ Very vast field
➢ Huge demand
➢ Has a steep initial learning curve
➢ Several libraries available
➢ Lots of innovative work going on currently
![Page 37: Machine Learning - What, Where and How](https://reader034.fdocuments.net/reader034/viewer/2022051817/547ea01cb4af9fa0158b56a7/html5/thumbnails/37.jpg)
37
Resources
➢ Online Machine Learning Course - Prof. Andrew Ng, Stanford University
➢ WEKA Wiki and API docs
➢ Apache Mahout Wiki
➢ IBM developerWorks Articles
➢ Google Prediction API Web Site
➢ Data Mining: Practical Machine Learning Tools & Techniques - Ian H. Witten, Eibe Frank, Mark Hall
➢ Machine Learning Forums