Machine Learning Group University College Dublin 4.30 Machine Learning Pádraig Cunningham.
Date posted: 23-Jan-2016
Intro to ML
Outline
Week 1: Introduction & General Overview of Matrix Decomposition; Nearest Neighbour Classifiers; Tutorial
Week 2: Neural Networks: Simple Perceptron, Backpropagation; Other Architectures: Hopfield, Self-Organising Maps; Tutorial
Week 3: Support Vector Machines; Kernel Methods & Evaluation; Tutorial
Week 4: Decision Trees; Naïve Bayes; Tutorial
Week 5: Ensemble Techniques: Bagging, Boosting; Tutorial
Week 6: Unsupervised Learning: Hierarchical Clustering; Other Clustering Algorithms: k-Means, Spectral Clustering; Tutorial
Week 7: Dimension Reduction: Principal Components Analysis, LSI, SVD; Feature Selection; Tutorial
Later: 2 revision tutorials
Coursework: 3-4 pieces, 15 hours, Weka & Java
Why Machine Learning
Recent progress in algorithms and theory
Loads of processing power: computational power is available
Growing flood of online data
  Amazon
  Google
3 niches for ML
Data mining: using historical data to improve decisions
  medical records → medical knowledge
Software applications that cannot be programmed by hand
  autonomous driving
  speech recognition
  i.e. weak theory domains
Self-customising programs
  Personalised newspaper
  E-mail filtering
Data-mining in medical records
Quality Assurance in Maternity Care.
http://svr-www.eng.cam.ac.uk/projects/qamc/qamc.html
Rule Learning
The QAMC system uses decision trees (I think!). It is also possible to extract rules from data:
If no previous normal delivery, and
   abnormal 2nd trimester ultrasound, and
   malpresentation at admission
Then probability of emergency C-section is 0.6
Over training data: 26/41 = 0.63
Over test data: 12/20 = 0.6
<Rule taken from Machine Learning by Tom Mitchell>
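As a rough illustration, the rule above can be written as a predicate over a patient record, and its quoted probabilities recomputed from the counts on the slide. The `Admission` class and its field names are hypothetical, invented for this sketch; they are not taken from the QAMC system or from Mitchell's book.

```python
from dataclasses import dataclass


@dataclass
class Admission:
    # Hypothetical field names, chosen for this illustration only.
    previous_normal_delivery: bool
    abnormal_2nd_trimester_ultrasound: bool
    malpresentation_at_admission: bool


def rule_fires(a: Admission) -> bool:
    """True when all three conditions of the rule hold."""
    return (
        not a.previous_normal_delivery
        and a.abnormal_2nd_trimester_ultrasound
        and a.malpresentation_at_admission
    )


def rule_accuracy(positives: int, covered: int) -> float:
    """Fraction of the cases covered by the rule that were emergency C-sections."""
    return positives / covered


# Counts quoted on the slide: 26 of 41 (training), 12 of 20 (test).
train_estimate = rule_accuracy(26, 41)
test_estimate = rule_accuracy(12, 20)
print(f"training: {train_estimate:.2f}, test: {test_estimate:.2f}")
```

The gap between the two estimates is small here, which is what one hopes for when a rule learned on training data is checked on held-out cases.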
Spam Filtering
For Machine Learning…
  Lots of training data
  High-dimensionality data (lots of features)
  Email is a diverse concept
    Porn, mortgage, religion, cheap drugs…
    Work, family, play…
Spam Filtering is a challenge because…
  Arms race: spammers vs filters
  False positives are unacceptable
  Spam is a changing concept
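Two of the points above can be sketched in a few lines: a bag-of-words representation gives one feature per distinct token (hence the high dimensionality), and a filter can be tuned so that no legitimate message in a validation sample is flagged. The scores and labels below are made up, and the helper names are assumptions for this sketch rather than part of any particular filter.

```python
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """One feature per distinct lower-cased token; real mail yields thousands."""
    return Counter(text.lower().split())


def lowest_safe_threshold(scores, labels):
    """Lowest threshold that flags no legitimate message in the sample.

    scores: spam scores in [0, 1]; labels: True for spam, False for legitimate.
    A message is flagged when its score is strictly greater than the threshold.
    """
    legit_scores = [s for s, is_spam in zip(scores, labels) if not is_spam]
    return max(legit_scores, default=0.0)


# Made-up validation scores from some hypothetical classifier.
scores = [0.95, 0.80, 0.40, 0.10, 0.70]
labels = [True, True, False, False, True]

threshold = lowest_safe_threshold(scores, labels)
flagged = [s > threshold for s in scores]
false_positives = sum(f and not y for f, y in zip(flagged, labels))
caught = sum(f and y for f, y in zip(flagged, labels))
print(threshold, false_positives, caught)
```

A threshold chosen this way only guarantees zero false positives on the sample it was tuned on; because spam is a changing concept, it would need to be re-estimated as new mail arrives.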
ALVINN
Problems too difficult to program by hand
ALVINN drives at 70mph on motorways
Autonomous Vehicles
DARPA Grand Challenge 2005
Winner: Stanley from Stanford
Various modules use ML
SmartRadio
Internet-based music radio
Personalised
Collaborative Recommendation
  supported by knowledge discovery from log data
Content-Based Recommendation
  supported by feature extraction from sound files
  feature selection refinement
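To make collaborative recommendation concrete, here is a minimal user-based sketch: listeners with similar play histories are found with cosine similarity, and unheard tracks are ranked by similarity-weighted play counts. The `plays` table and the track names are invented for illustration; this is not a description of how SmartRadio itself computes recommendations.

```python
import math

# Hypothetical play counts mined from listening logs: user -> {track: plays}.
plays = {
    "ann": {"track_a": 5, "track_b": 3, "track_c": 1},
    "bob": {"track_a": 4, "track_b": 2, "track_d": 5},
    "cat": {"track_c": 4, "track_e": 5},
}


def cosine(u: dict, v: dict) -> float:
    """Cosine similarity between two sparse play-count vectors."""
    shared = set(u) & set(v)
    dot = sum(u[t] * v[t] for t in shared)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(user: str, k: int = 1) -> list:
    """Rank tracks the user has not played by similarity-weighted plays of others."""
    scores = {}
    for other, history in plays.items():
        if other == user:
            continue
        sim = cosine(plays[user], history)
        for track, count in history.items():
            if track not in plays[user]:
                scores[track] = scores.get(track, 0.0) + sim * count
    return sorted(scores, key=scores.get, reverse=True)[:k]


print(recommend("ann", k=2))  # → ['track_d', 'track_e']
```

A content-based recommender would instead compare feature vectors extracted from the sound files, so it can suggest tracks that no one has played yet.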
Smart Radio
Smart Radio is a web-based client-server music application which allows listeners to build, manage and share music programmes.
The project was set up to look at a possible model for:
  The regulated distribution of music on the web
  A personalised stream of music service
The project was also set up to provide an architecture and data to test our data mining and collaborative filtering algorithms.
ML Dimensions
Lazy vs Eager
  k-NN vs rule learning
Supervised vs Unsupervised
Symbolic vs Sub-symbolic
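The lazy vs eager distinction can be illustrated with two toy supervised learners on a single numeric feature. The k-NN class does no work at training time beyond storing the examples, while the rule learner compiles the data into one threshold rule and can then discard it. Both classes are simplified sketches written for this illustration, not the algorithms as implemented in Weka.

```python
from collections import Counter


class LazyKNN:
    """Lazy: fit() only stores the examples; the work happens in predict()."""

    def __init__(self, k: int = 3):
        self.k = k

    def fit(self, xs, ys):
        self.examples = list(zip(xs, ys))
        return self

    def predict(self, x):
        # Find the k stored examples closest to the query, then take a vote.
        nearest = sorted(self.examples, key=lambda e: abs(e[0] - x))[: self.k]
        return Counter(label for _, label in nearest).most_common(1)[0][0]


class EagerThresholdRule:
    """Eager: fit() compiles the data into one rule, 'x > t implies positive'."""

    def fit(self, xs, ys):
        def errors(t):
            return sum((x > t) != y for x, y in zip(xs, ys))

        # Try each observed value as the threshold; keep the one with fewest errors.
        self.threshold = min(sorted(set(xs)), key=errors)
        return self

    def predict(self, x):
        return x > self.threshold


xs = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
ys = [False, False, False, True, True, True]

knn = LazyKNN(k=3).fit(xs, ys)
rule = EagerThresholdRule().fit(xs, ys)
print(knn.predict(5.0), rule.predict(5.0), rule.threshold)
```

The eager learner ends up with a symbolic, human-readable rule, whereas the lazy learner's "model" is simply the stored training set.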