Machine Learning - Challenges, Learnings & Opportunities
challenges, learnings and opportunities
presented by imron zuhri, adit, and samudra
KUDO codefest, 14 May 2016
machine learning
can a machine think?
in 1996, Garry Kasparov was not afraid of a computer, and he won
the next year, he played against a new and improved Deep Blue
and lost
this is the move that was so surprising, so un-machine-like,
that he was sure the IBM team had cheated
Rd5
Rd1
a random move, a computer bug
to kasparov, a sign of superior intelligence
big data analytics is the culmination of the machine way of thinking
we can now immensely extend our memory and computational power to help us do that
what is machine learning
some definitions
a (hypnotized) user’s perspective: a scientific (witchcraft) field that researches fundamental principles from data (potions) and develops magical algorithms (spells to cast) (pascal vincent, 2015)
field of study that gives computers the ability to learn without being explicitly programmed (arthur samuel, 1959)
formal definition (tom mitchell, 1998): “a machine is said to be learning if its performance P on a specific task T improves with each experience E”
CURRENT VIEW OF ML FOUNDING DISCIPLINES
three niches for machine learning
data mining: using historical data to improve decisions
• medical records → medical knowledge
software applications that are difficult to program by hand
• autonomous driving
• image classification
user modeling
• automatic recommender systems
source: rong jin, 2013
(some) open problems in machine learning
• one-shot learning
• unsupervised learning
• reinforcement learning
• artificial general intelligence
“most of human and animal learning is unsupervised learning. If intelligence was a cake, unsupervised learning would be the cake, supervised learning would be the icing on the cake, and reinforcement learning would be the cherry on the cake. We know how to make the icing and the cherry, but we don't know how to make the cake.” yann lecun
challenges in machine learning
data-related:
• abundant yet scattered data
• unstructured, noisy data
• offline-stored data (duh!)
resource-related:
• data storage space constraints
• computing power
• training time
• inve$$$tments: initial investments and running costs
challenges in machine learning
methodical issues:
• result consistency (i.e. accuracy)
• overfitting
• algorithm computational efficiency
miscellaneous:
• architectural differences / portability issues
• popularity of non-open standard, vendor-locked compute libraries/apis (rawr!)
recent breakthroughs in machine learning
deepmind atari q learner (2014)
plays 5 kinds of atari 2600 games
states: pixels in atari
actions: left/right move
reward: score
algorithm used:
• feedforward “q-learning”
• conv-net for unsupervised map of reward
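the state / action / reward loop above can be illustrated with a much smaller example. this is only a sketch under stated assumptions: it swaps the conv-net over pixels for a plain q-table, and the 5-cell corridor environment, the function name `train_q_table` and all parameter values are hypothetical, not taken from the deepmind system.

```python
import random

def train_q_table(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
                  epsilon=0.1, seed=0):
    """Tabular Q-learning on a toy corridor (hypothetical environment).

    States are cells 0..n_states-1, actions are left/right moves,
    and the only reward is 1.0 for stepping onto the last cell.
    """
    rng = random.Random(seed)
    moves = (-1, +1)                      # action 0 = left, action 1 = right
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        state = 0
        while state != goal:
            if rng.random() < epsilon:    # explore
                action = rng.randrange(2)
            else:                         # exploit, breaking ties at random
                best = max(q[state])
                action = rng.choice(
                    [a for a, value in enumerate(q[state]) if value == best])
            next_state = min(max(state + moves[action], 0), goal)
            reward = 1.0 if next_state == goal else 0.0
            # Q-learning update: move the estimate toward
            # reward + discounted best value of the next state.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

q_table = train_q_table()
greedy_policy = ["left" if row[0] > row[1] else "right" for row in q_table[:-1]]
```

in this toy setting the greedy policy read off the table moves right in every cell, since that is the only way to reach the reward.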
recent breakthroughs in machine learning
the translator (2015)
real-time translations of speech from/into 7 different languages
able to run even on resource-constrained embedded hardware (e.g. smartphones)
uses same engine that was used in microsoft cortana (creepy!)
Reinforcement Learning: DeepMind AlphaGo
google deepmind alphago (2016)
99.8% winning rate vs other go programs
first program to defeat a human go champion
algorithm used: deep neural networks + monte carlo tree search
• supervised learning from expert games
• reinforcement learning vs other alphago instances
supervised learning: random forest
fernández-delgado et al. (2014) evaluated 179 classifiers on 121 data sets from the uci repository. result:
• the top 5 are random forest classifiers
• for kaggle competitions, try gbm: xgboost
supervised: deep learning
don’t be fooled: dl research improves part by part, through a new kind of layer, a new activation function, a new non-convex optimization solver, or a deeper neural net.
from rodrigo benenson's deep learning accuracies ranking
supervised: deep learning
summary:
• relu works better than sigmoid as an activation function.
• maxout works better when combined with dropconnect as the activation function.
• dropout layers help fight overfitting.
• adagrad and adadelta work better if you don’t want to tune optimization hyperparameters.
• deeper networks work: highway layers and residual layers.
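two of the points in this summary, relu vs sigmoid and dropout, can be sketched in a few lines of plain python. this is a minimal illustration, not a training recipe: the function names and the sample inputs are made up for the example, and the dropout shown is the common "inverted" variant that rescales surviving units.

```python
import math
import random

def relu(x):
    return max(0.0, x)

def relu_grad(x):
    # Gradient is 1 for positive inputs, so it does not shrink with depth.
    return 1.0 if x > 0 else 0.0

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    # Never larger than 0.25, which is one reason deep sigmoid nets train slowly.
    s = sigmoid(x)
    return s * (1.0 - s)

def dropout(activations, drop_prob, rng, training=True):
    """Inverted dropout: zero each unit with probability drop_prob
    and rescale the survivors so the expected activation is unchanged."""
    if not training or drop_prob == 0.0:
        return list(activations)
    keep = 1.0 - drop_prob
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

rng = random.Random(0)
hidden = [relu(x) for x in (-2.0, -0.5, 0.5, 2.0)]
dropped = dropout(hidden, drop_prob=0.5, rng=rng)
```

at prediction time, call `dropout(..., training=False)` so that no units are dropped.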
unsupervised: t-sne
t-distributed stochastic neighbor embedding
van der maaten and hinton (2008): mnist data set visualization
• works best for data-viz
• can be used for clustering too (if you’d bother to tweak the algo)
Given only 100 or 1,000 labeled examples, with the rest (~50,000) unlabeled, try to predict 10,000 unseen examples.
● It works, even with few labels!
● Now we don’t have to ask interns or PhD students to label data. :)
A Rasmus, H Valpola, M Honkala, M Berglund, and T Raiko. (2015)
semi-supervised learning: ladder neural networks
collaborative filtering: restricted boltzmann machine
rbm for collaborative filtering (hinton, 2008):
• it has been used in netflix and spotify algorithms.
• it works better than svd!
• correlation(svd, rbm): -1 < c < 1
• can be ensembled with svd to improve the prediction.
some advice for applied machine learning research (this competition)
• preprocessing: scaling & imputation
• cross-validation: choose best algos
• hyperparameter optimization
• ensembling n-models: dark knowledge
raschka (2014): scaling improves prediction!
gelman (2006): do prediction for n/a data, then predict the data with noiseless biased!
data preprocessing: scaling & imputation
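a minimal sketch of both preprocessing steps in plain python. assumptions to note: mean imputation is swapped in here as a simpler stand-in for the prediction-based imputation mentioned on the slide, missing values are represented as `None`, and the helper names and sample column are hypothetical.

```python
import math

def impute_mean(column):
    """Replace missing entries (None) with the mean of the observed ones."""
    observed = [v for v in column if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in column]

def standardize(column):
    """Scale a column to zero mean and unit variance."""
    n = len(column)
    mean = sum(column) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in column) / n)
    if std == 0.0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(v - mean) / std for v in column]

raw = [1.0, None, 3.0, 5.0]
filled = impute_mean(raw)
scaled = standardize(filled)
```

in practice the mean and standard deviation would be computed on the training split only and then reused on validation and test data.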
cross-validation: how to choose best algo?
cross-validation is a must! (tibshirani et al., 2014)
don’t overlap your cross-validation data partition!
(zhang, data robot)
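one way to guarantee non-overlapping partitions is to shuffle the sample indices once and deal them into k folds. the sketch below assumes plain k-fold cross-validation; `k_fold_indices` is a hypothetical helper written for this example, not a library function.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Return k (train_indices, test_indices) pairs.

    Every sample lands in exactly one test fold, so the test
    partitions never overlap each other or their own training set.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i, test_fold in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test_fold))
    return splits

splits = k_fold_indices(n_samples=10, k=5)
```

each candidate algorithm is then trained on `train` and scored on `test` for every split, and the algorithm with the best average score is chosen.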
hyperparameter optimization
if you want to search for the best hyperparameters: do random search.
random search is better than grid search (bengio, 2012)
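a minimal random search sketch. everything specific here is an assumption made for illustration: `validation_loss` is a made-up stand-in for training and validating a real model, and the two hyperparameters and their log-uniform ranges are hypothetical.

```python
import math
import random

def validation_loss(learning_rate, regularization):
    # Hypothetical objective: the loss depends strongly on the learning
    # rate and only weakly on the regularization strength.
    return ((math.log10(learning_rate) + 2.0) ** 2
            + 0.01 * (math.log10(regularization) + 4.0) ** 2)

def random_search(objective, n_trials, seed=0):
    """Sample hyperparameters at random and keep the best trial."""
    rng = random.Random(seed)
    best_params, best_loss = None, float("inf")
    for _ in range(n_trials):
        params = {
            # Log-uniform sampling covers several orders of magnitude.
            "learning_rate": 10 ** rng.uniform(-5, 0),
            "regularization": 10 ** rng.uniform(-6, -1),
        }
        loss = objective(**params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss

best_params, best_loss = random_search(validation_loss, n_trials=50)
```

because every trial draws a fresh value for each hyperparameter, 50 trials try 50 distinct learning rates, whereas a grid with the same budget would try only a handful of values per axis.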
ensembling n-models: dark knowledge
If two models give the same accuracy but their prediction outputs have low correlation, then we can improve prediction accuracy by averaging the models' predictions. (Hinton, 2015)
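a toy illustration of the averaging idea. the labels and predicted probabilities below are hand-picked for the example, not real model outputs: the two hypothetical models have the same accuracy but make their mistakes on different examples.

```python
def accuracy(probabilities, labels, threshold=0.5):
    """Fraction of examples whose thresholded prediction matches the label."""
    predictions = [1 if p > threshold else 0 for p in probabilities]
    correct = sum(1 for pred, y in zip(predictions, labels) if pred == y)
    return correct / len(labels)

def average_predictions(*model_outputs):
    """Average the predicted probabilities of several models per example."""
    return [sum(values) / len(values) for values in zip(*model_outputs)]

labels  = [1,   0,   1,   1,   0,   0,   1,   0]
model_a = [0.9, 0.2, 0.4, 0.8, 0.1, 0.6, 0.7, 0.3]  # wrong on examples 2 and 5
model_b = [0.3, 0.1, 0.9, 0.6, 0.7, 0.2, 0.8, 0.2]  # wrong on examples 0 and 4
ensemble = average_predictions(model_a, model_b)

acc_a = accuracy(model_a, labels)
acc_b = accuracy(model_b, labels)
acc_ensemble = accuracy(ensemble, labels)
```

here each model alone scores 0.75 and the average scores 1.0, because a confident correct prediction from one model outweighs the other's less confident mistake. with highly correlated errors the average would not help.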
the landscape of opportunities
Popular Big Data Industry
Financial Services | Telco | Web/Media | Retail | Healthcare | Government
• Fraud detection
• Compliance reporting
• Portfolio analysis
• Customer statements
• Wire transfer alerts
• Customer acquisition, retention, and profitability
• Subscriber data management
• Fraud analysis
• Social analysis
• Response times
• Traffic analysis
• Product affinity/bundling
• Sentiment Analysis
• Content monetization
• Advertising optimization
• Optimization of user experience/ click stream analysis
• Network optimization to support service levels
• Store operation analysis
• Customer loyalty programs
• Collaborative planning and forecasting
• Loss prevention
• Supply chain optimization
• Drug development and launch cost reduction
• Regulatory compliance
• Product quality
• Return on promotional investment
• Lowered risk of new product success
• Security/anti-terror
• Recovery Act public disclosure
• Budgetary control and management
• Educational reporting
• Asset control and assessment
• Environment monitoring
*cisco 2013-2014
currently the biggest prescriptive analytics engine: contextual advertising
http://www.flashtalking.com/us/targeted-ads/
another one: marketplace and services recommendation engine
challenges of implementation and what we do with machine learning
did you follow waze's instructions during your first week?
would you buy a self-driving car that couldn’t drive itself in 99 percent of the country?
or that knew nearly nothing about parking, couldn’t be taken out in snow or heavy rain, and would drive straight over a gaping pothole?
if your answer is yes, then check out the google self-driving car, model year 2014
but
can we trust them enough?
the BIGGEST CHALLENGES in indonesia
DATA SETS
the current analytics technology
humans still do most of the process
the current challenges of big data analytics?
• heterogeneous data sources, systems and formats
• time consuming and complex data preparation process
• almost impossible task of integrating various kinds of data
• it requires experts to analyze big and complex data
• most of the user interactions are not intuitive
“Before performing analytics, data scientists must first format and prepare the raw data for analytics, often with more than 80% of the effort.”, said Intel Corp. Research
what would it be like if we could simplify the whole process?
hence our vision
we believe humans should not be bogged down by tedious matters.
by reimagining analytics, we envision the creation of intelligent machines that will free humans to focus on solving the world’s toughest problems.
intelligent machines that can help us collect the massive amount of data
automatically read and connect to any kind of data, including automatic machine-to-machine connections
• structured data
• printed invoices
• social media conversation
then help us separate the signals from the noise
automatic data quality assessments, data cleansing and data filtering
regi
mita
gundam
x-men
complete the information and connect them all in a meaningful way
automatic data transformation, entity extraction, contextual profiling
regi
mita
gundam
batman
tom
mediatrac
and finally help us make sense of the massively connected data
contextual search and recommendation
intelligent data discovery
gundam
batman
sith
through a highly intuitive and natural user interface
natural language interface
voice and gesture recognition
how many restaurants sell soto along jalan senopati?
Platform As A Service: intelligent machines
• knowledge based artificial intelligence
• contextual search and recommendation
• contextual profiling and enrichment
• automatic data integration
• scalable big data infrastructure
industries: digital, telco, legal, retail, healthcare, agriculture
knowledge based artificial intelligence
artificial general intelligence
the brain: knowledge graph
the intelligence: reasoning and learning
• machine learning
• heuristics
• unsupervised
• deep learning
• NLP & image recognition
highly secured distributed graph database
distributed computing with GPU acceleration
personal brain: knowledge graph
highly secured personal graph database
automatic data integration
• multi format, structured, unstructured, unclean, missing data, unstandardized, unconnected, difficult to analyze
• cleaned and standardized, enriched and validated, connected at granular level, analytics ready data
• automatic data collection
• automatic data preparation
• automatic data integration
territory management
all of our siloed data will have a totally elevated value, once you connect them all in a meaningful way
are all of our current data connected yet?
Almost…
google is a humongous library index, with a smart library card search that redirects you to the original documents
facebook is a giant personal scrapbook of all your acquaintances that are currently linked by manual tagging and friends list
source: techglimpse
youtube and instagram are huge repositories of current knowledge, lifestyle and trends that are still largely unconnected
now imagine this!
when we can have intelligent machines that can connect everything, in a meaningful way…
we can start asking questions, on things we never thought possible to be asked before
Spotify can map songs across social graphs.
Shazam can give us situational data: where someone is listening to a song, when, how and even (to an extent) why.
YouTube can help us track the growth of a song using search and streams.
Instagram & Vine are becoming hotbeds for music discovery.
what if we can connect all their data together?
or, if you have a radio station, what sort of playlist will appeal to your target audience, if we know that a sizeable percentage of them own a hummer?
we can even predict the specific combination of words, notes and beats that will increase a song's chance of reaching the billboard top 40 this upcoming season.
here are some samples of hidden insights that we can discover from our own large repository of data, using our intelligent data integration and data discovery tools
when we integrate historical media articles with geodemographic and point-of-interest databases, we can create a model that predicts a high probability of fire incidents down to street level
productivity optimization
automatic data integration
contextual profiling and enrichment: behavioral profiling, community detection, influence and networks
contextual search and recommendation:
• instant search: natural language, auto-correction, auto-complete, contextual rank, entity recognition, synonyms
• contextual: personal, geo-demographic, historical, time/activity/mood
• recommendation: content, collaborative, influence, trending, similarity, popular, preference
predictive optimization: potential area, distribution routing, marketing channel, segmentation, prediction
contextual search and recommendation: contextual auto complete, contextual auto correct, contextual entity extraction and recommendation
analytic dashboard: contextual personalized pages, current and predicted trends
fraud detection
lessons learned including how to scale your ML
scalability problems - outline
large scale machine learning
• mahout - scalable ml on hadoop
• jubatus - distributed online real-time ml
• vowpal wabbit - fast learning at yahoo/ms
• trident ml and storm pattern: ml on storm, yarn
• upcoming - samoa: ml on s4, storm
issues in scalable distributed ml
• load balancing
• auto scaling
• job scheduling
• workflow management
data and model parallelism
• parameter server framework
• peer-to-peer framework
scalability problems - outline
distributed deep learning
• yahoolda: scalable parallel framework in latent variable models
• distbelief - distributed deep learning on cluster
• h2o - distributed deep learning on spark
• adam at msr - distributed deep learning
• dl4j - open source for deep learning on hadoop and spark
• petuum - distributed machine learning
• singa - distributed deep learning
• tensorflow: google large scale distributed dl
• mxnet: heterogeneous distributed deep learning
• caffe on spark: yahoo
distributed learning and optimization
• proximal splitting/auxiliary coordinates
• bundle (sub-gradient)
• shotgun: parallelized cdm (coordinate descent method)
• asynchronous sgd
• hogwild/dogwild
what’s next?
emerging analytics technology for automatic analytics on large dimensional data
• online deep learning
• topological data analysis
• fuzzy-rough set based data exploration system
• granular computing
• kernel set and spatiotemporal analysis
• applied differential geometry
• non axiomatic reasoning system
• intelligent rule and knowledge extraction/discovery
• multi agent based modeling
• weak signal detection and analysis
• bayesian networks analysis
• genetic programming
• self organizing neural networks
and also more humanlike user interaction and data visualization technology
• eye tracking
• glass-free auto stereoscopy
• touch sensitive hologram
• natural language user interface
• tangible user interface
• wearable gestural interface
• brain-computer interface
• sensor network user interface
In the meantime
principles for the development of a complete mind:
study the science of art.
study the art of science.
develop your senses, especially learn how to see.
realize that everything connects to everything else.
Leonardo da Vinci