Extreme Learning Machines
Tony Oakden
ANU AI Masters Project (early Presentation)
4/8/2014
This presentation covers:
• Revision of Neural Network theory
• Introduction to Extreme Learning Machines (ELM)
• Early Results
• Brief description of code
• Discuss possible future work
Neural Network Revision
• In a single-layer perceptron, inputs are connected to output nodes via weights
• Training is carried out using least squares or a similar method
• Pros: simple and quick to train
• Cons: can only learn to classify linearly separable problems
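The least-squares training mentioned above can be sketched in a few lines of Python/NumPy (the data and variable names here are illustrative, not from these experiments):

```python
import numpy as np

# Toy linearly separable data: two features, targets -1 / +1 (illustrative)
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([-1.0, -1.0, 1.0, 1.0])  # separable on the first feature alone

# Append a bias column, then solve the least-squares problem Xb @ w ~= y
Xb = np.hstack([X, np.ones((X.shape[0], 1))])
w, *_ = np.linalg.lstsq(Xb, y, rcond=None)

# Classify by taking the sign of the linear output
pred = np.sign(Xb @ w)
```

Because the problem is linearly separable, the single solve recovers a perfect classifier; no iterative training loop is needed.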
Hidden Layer
• To classify non-linear data we must add an additional layer of weights between input and output (a hidden layer)
• When combined with a suitable activation function (sigmoid, for example), the network can classify non-linear functions
• To train the hidden layer we propagate the errors at the output back through the network. This is the back-propagation algorithm
• Pros: can theoretically classify any data set
• Cons: training the network can be very slow
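A minimal sketch of back-propagation for one sigmoid hidden layer, in Python/NumPy. The architecture, learning rate, and XOR data are illustrative assumptions, not the configuration used in these experiments:

```python
import numpy as np

rng = np.random.default_rng(0)

# XOR: the classic non-linearly-separable problem (illustrative data)
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One hidden layer of 4 sigmoid units, one sigmoid output
W1 = rng.normal(size=(2, 4)); b1 = np.zeros(4)
W2 = rng.normal(size=(4, 1)); b2 = np.zeros(1)

lr = 0.5
losses = []
for _ in range(2000):
    # Forward pass
    H = sigmoid(X @ W1 + b1)
    out = sigmoid(H @ W2 + b2)
    losses.append(np.mean((out - y) ** 2))
    # Backward pass: propagate the output error back through the network
    d_out = (out - y) * out * (1 - out)
    d_H = (d_out @ W2.T) * H * (1 - H)
    W2 -= lr * H.T @ d_out;  b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_H;    b1 -= lr * d_H.sum(axis=0)
```

Note that every weight in both layers is updated on every iteration, which is why this is slow compared with the one-shot least-squares solve that ELM uses.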
Extreme Learning Machines
• Provide a way to train networks to classify non-linear problems without back-propagation
• These networks still use a hidden layer, but the weights and biases in the hidden layer are set to random values
• Only the output nodes are trained
• Training is achieved using the least-squares algorithm
• Pros: very fast training time
• Cons: less accurate
http://www.ntu.edu.sg/home/egbhuang/
Wait, we use random weights? Huh?
• Sounds too good to be true, so let's look at some results:
• http://fastml.com/extreme-learning-machines/
Two Spirals Data Set
• The first set of experiments was carried out with the two-spirals data set
• This was used because:
• It is a difficult set to classify
• It allows easy visualization of results
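For reference, the two-spirals benchmark can be generated as follows in Python/NumPy. This is a common construction of the data set (point counts and spiral parameters are assumptions; they are not stated in the slides and may differ from the set actually used):

```python
import numpy as np

def two_spirals(n_per_class=97, noise=0.0, seed=None):
    """Generate a two-spirals data set: two interleaved spirals,
    one per class, labelled +1 and -1."""
    rng = np.random.default_rng(seed)
    i = np.arange(n_per_class)
    r = 6.5 * (104 - i) / 104            # radius shrinks along the spiral
    phi = i * np.pi / 16                 # angle grows along the spiral
    x1 = np.c_[r * np.cos(phi), r * np.sin(phi)]
    x2 = -x1                             # second spiral: point reflection of the first
    X = np.vstack([x1, x2])
    X += rng.normal(scale=noise, size=X.shape)
    y = np.hstack([np.ones(n_per_class), -np.ones(n_per_class)])
    return X, y

X, y = two_spirals()
```

Each spiral winds around the origin several times, so no linear boundary (and no small smooth boundary) separates the classes, which is what makes the set hard.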
Neural Network trained with back propagation
• 20 nodes in hidden layer
• Training time is 6.4 seconds
• Training accuracy is 100%
(Testing was performed with training data)
Extreme Learning Machine
• 20 nodes in hidden layer
• Training time is 0.02 seconds
• Training accuracy is 69%
Not great…
But…
Extreme Learning continued
• 200 nodes in hidden layer
• Training time is 0.066 seconds
• Training accuracy is 97%
If the number of nodes in the hidden layer is significantly increased, accuracy improves dramatically, yet training still remains much faster than for a traditional network.
Chart: training accuracy (y-axis, 0 to 1) plotted against hidden-layer size / 20 (x-axis, 1 to 20, i.e. 20 to 400 hidden nodes).
Matlab Code
• http://www.ntu.edu.sg/home/egbhuang/reference.html

% Create random weights for the hidden layer
InputWeight = rand(NumberofHiddenNeurons, NumberofInputNeurons)*2 - 1;
BiasofHiddenNeurons = rand(NumberofHiddenNeurons, 1);
….
tempH = InputWeight * trainData.P;
ind = ones(1, NumberofTrainingData);
BiasMatrix = BiasofHiddenNeurons(:, ind); % Extend the bias vector to match the dimensions of tempH
tempH = tempH + BiasMatrix;
% Calculate the hidden neuron output matrix H
% (a variety of activation functions can be used here, but we’ll stick to sigmoid for now)
H = 1 ./ (1 + exp(-tempH));
OutputWeight = pinv(H') * trainData.T'; % pinv gives the Moore-Penrose pseudoinverse
http://www.mathworks.com.au/help/matlab/ref/pinv.html
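The MATLAB snippet above translates almost line for line into Python/NumPy. The following sketch mirrors it end to end and adds a prediction step; the XOR demo data and the helper names (`elm_train`, `elm_predict`) are illustrative, not part of the reference code:

```python
import numpy as np

rng = np.random.default_rng(42)

def elm_train(X, T, n_hidden):
    """ELM training: random hidden layer, least-squares output layer.
    X: (n_samples, n_features), T: (n_samples, n_outputs)."""
    # Random hidden-layer weights in [-1, 1] and biases in [0, 1]; never trained
    W = rng.uniform(-1, 1, size=(n_hidden, X.shape[1]))
    b = rng.uniform(0, 1, size=(n_hidden, 1))
    tempH = W @ X.T + b                    # InputWeight*trainData.P, plus broadcast bias
    H = 1.0 / (1.0 + np.exp(-tempH))       # sigmoid hidden activations
    beta = np.linalg.pinv(H.T) @ T         # Moore-Penrose pseudoinverse, as pinv(H')*T'
    return W, b, beta

def elm_predict(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(W @ X.T + b)))
    return H.T @ beta

# Illustrative use on XOR (not the two-spirals set from the earlier slides)
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
T = np.array([[-1.], [1.], [1.], [-1.]])
W, b, beta = elm_train(X, T, n_hidden=10)
pred = np.sign(elm_predict(X, W, b, beta))
```

The entire "training" is the single pseudoinverse solve for `beta`, which is why ELM is so much faster than back-propagation.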
Conclusion
• As can be seen, training times for ELM are very fast
• In these early experiments, ELM was roughly 100 times faster than traditional back-propagation for similar accuracy
• Accuracy is slightly lower: on other data sets back-propagation achieved 85% and ELM 80%, but for many applications this is still good enough
• Increasing the number of nodes in the hidden layer improves accuracy at the expense of a small increase in training time
Further research
• Use of ELM with a GA for feature selection (this week's work)
• Experiment with different data sets
• Perform a more rigorous analysis of results
• So far we have only looked at binary classifiers. How does the ELM algorithm cope with multi-class classification?
• Can we improve the accuracy of ELM in some way, maybe by combining results with cascade networks?
• What about continuous data sources?
• The second part of the project is cascade networks; can these be combined with ELM in some way?
References
Guang-Bin Huang, "An Insight into Extreme Learning Machines: Random Neurons, Random Features and Kernels," Springer Science+Business Media New York, 2014.