Hierarchical Temporal Memory
“The Classification of Un-preprocessed Waveforms through the Application of the Hierarchical Temporal Memory Model”
April 18, 2009 (Version 1.0)
John M. Casarella
Ivan G. Seidenberg School of CSIS, Pace University
Topics to be Discussed
Introduction
Intelligence
Artificial Intelligence
Neuroscience
Connectionism and Classical Neural Nets
Pattern Recognition, Feature Extraction & Signal Processing
Hierarchical Temporal Memory (Memory – Prediction)
Hypothesis
Research
Results
Introduction
"I was proceeding down the road. The trees on the right were passing me in orderly fashion at 60 miles per hour. Suddenly one of them stepped in my path."
John von Neumann providing an explanation for his automobile accident.
Intelligence
What is Intelligence?
A uniquely human quality? “means the ability to solve hard problems”
The ability to create memories: learning, language development, memory formation (synaptic pattern creation)
The human ability to adapt to a changing environment, or to change our environment for survival
Alan Turing: “Can machines think?”
Connectionism: model the digital computer like a child’s mind, then “educate” it to obtain the “adult”
“Unorganized Machines”: a network of neuron-like Boolean elements randomly connected together
Proposed that machines should be able to ‘learn by experience’
The Turing Test - constrained and focused research
Imitate human behavior; evaluate AI only on the basis of behavioral response
Turing’s unorganized machine
Machine Intelligence
What is Artificial Intelligence?
The science and engineering of making intelligent machines, especially intelligent computer programs
The Objectives of AI: create machines to do something that would require intelligence if done by a human; solve the problem of how to solve the problem
von Neumann and Shannon: sequential processing vs. parallel
McCarthy (helped define AI), Minsky (first dedicated AI lab at MIT) and Zadeh (fuzzy logic)
Various varieties: expert systems (rule based, fuzzy, frames), genetic algorithms, perceptrons (classical neural networks)
Neural Nets
McCulloch and Pitts
Modeled the neurons of the brain; proposed a model of artificial neurons
Cornerstone of neural computing and neural networks
Boolean nets of simple two-state ‘neurons’
Concept of a ‘threshold’, but no mechanism for learning
Hebb: pattern recognition learned by changing the strength of the connection between neurons (see the sketch below)
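As a concrete illustration of the two ideas on this slide, here is a minimal Python sketch of a McCulloch-Pitts threshold unit plus a Hebbian weight update; the 0/1 coding, the learning rate and the function names are illustrative choices, not the original 1943/1949 formulations.

```python
# Minimal sketch: a McCulloch-Pitts two-state threshold 'neuron' and a
# Hebbian update. Learning rate and 0/1 coding are illustrative assumptions.

def mp_neuron(inputs, weights, threshold):
    """Fire (1) when the weighted sum of the binary inputs reaches the threshold."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

def hebb_update(inputs, weights, output, rate=0.1):
    """Hebb's rule: strengthen a connection whenever input and output co-fire."""
    return [w + rate * x * output for x, w in zip(inputs, weights)]

weights = [0.5, 0.5]
x = [1, 1]
y = mp_neuron(x, weights, threshold=1.0)  # fires only when both inputs are active
weights = hebb_update(x, weights, y)      # co-activity strengthens both weights
```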
Classical Neural Networks
Rosenblatt’s Perceptron Model permitted mathematical analysis of neural networks
Based on McCulloch and Pitts: a linear combiner followed by a hard limiter
Activation and weight training
Linear separation only - no XOR (see the sketch below)
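A small Python sketch of the model as described: a linear combiner feeding a hard limiter, trained with the perceptron rule. Running it shows the linear-separation limit; the AND task converges while XOR never does. The function name and epoch budget are illustrative.

```python
# Illustrative Rosenblatt-style perceptron: linear combiner + hard limiter,
# trained with the perceptron weight-update rule on 2-input binary tasks.

def train_perceptron(samples, epochs=100, rate=0.1):
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, target in samples:
            y = 1 if w[0] * x[0] + w[1] * x[1] + b >= 0 else 0  # hard limiter
            if y != target:
                mistakes += 1
                w = [wi + rate * (target - y) * xi for wi, xi in zip(w, x)]
                b += rate * (target - y)
        if mistakes == 0:
            return True      # converged: the classes are linearly separable
    return False             # no separating line found within the epoch budget

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
print(train_perceptron(AND))  # True  -- AND is linearly separable
print(train_perceptron(XOR))  # False -- XOR is not
```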
Classical Neural Networks
Minsky and Papert - what were they thinking?
Mathematical proof that the perceptron model was of limited usefulness; identified classes of problems which perceptrons could not handle
Negative impact on funding
A narrow analysis of the model: correct that a single-layer perceptron is incapable of learning XOR, but they incorrectly postulated that multi-layer perceptrons would also be incapable of it
Classical Neural Networks
The Quiet Years: Grossberg, Kohonen and Anderson
Hopfield introduced non-linearities; ANNs could solve constrained optimization problems
Rumelhart and McClelland: Parallel Distributed Processing and backpropagation (see the sketch below)
Interdisciplinary nature of neural net research
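To contrast with the perceptron sketch above, a minimal backpropagation example: a 2-2-1 sigmoid network that can learn XOR. The layer sizes, learning rate, epoch count and seed are illustrative assumptions, and a different seed may be needed if training stalls in a local minimum.

```python
import math, random

# Minimal backpropagation sketch: a 2-2-1 sigmoid network trained on XOR,
# the task a single-layer perceptron cannot learn.
random.seed(0)
sig = lambda z: 1.0 / (1.0 + math.exp(-z))

W1 = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]  # hidden: 2 inputs + bias
W2 = [random.uniform(-1, 1) for _ in range(3)]                      # output: 2 hidden + bias
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

for _ in range(20000):
    for (x1, x2), t in XOR:
        h = [sig(w[0] * x1 + w[1] * x2 + w[2]) for w in W1]      # forward: hidden
        y = sig(W2[0] * h[0] + W2[1] * h[1] + W2[2])             # forward: output
        dy = (y - t) * y * (1 - y)                               # output delta
        dh = [dy * W2[i] * h[i] * (1 - h[i]) for i in range(2)]  # hidden deltas
        for i in range(2):                                       # gradient-descent updates
            W2[i] -= 0.5 * dy * h[i]
            W1[i] = [W1[i][0] - 0.5 * dh[i] * x1,
                     W1[i][1] - 0.5 * dh[i] * x2,
                     W1[i][2] - 0.5 * dh[i]]
        W2[2] -= 0.5 * dy

for (x1, x2), t in XOR:
    h = [sig(w[0] * x1 + w[1] * x2 + w[2]) for w in W1]
    y = sig(W2[0] * h[0] + W2[1] * h[1] + W2[2])
    print((x1, x2), round(y, 2), "target", t)  # typically converges near the targets
```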
Neuroscience
Structure of the neocortex: learning, pattern recognition and synapses
Mountcastle’s columnar model of the neocortex
Learning associated with the construction of cell assemblies, related to the formation of pattern associations
Neuroplasticity
Neuroscience - The Biology
Neocortex: >50% of the human brain
Locus of perception, language, planned behavior, declarative memory, imagination, planning
Extremely flexible/generic, with a repetitive structure
The Neocortex
Hierarchy of cortical regions
Region-to-region connectivity
Cortical-thalamic connectivity
Cortical layers: cell types and connectivity
Neuroscience
[Figure: the six cortical layers, with layers 3, 5 and 6 subdivided into sublayers A and B]
Electrocardiogram
The electrocardiogram (ECG) records the electrical activity of the heart over time
Breakthrough by Willem Einthoven in 1901
Electrodes placed according to a set pattern
The ECG displays the voltage between pairs of these placed electrodes
Immediate results; the letters P, Q, R, S and T were assigned to the various deflections
Electrocardiogram
Measurement of the flow of electrical current as it moves across the conduction pathway of the heart.
Recorded over time
Represents different phases of the cardiac cycle
Electrocardiogram
Hypothesis
Application of the HTM model, once correctly designed and configured, will provide a greater success rate in the classification of complex waveforms
In the absence of pre-processing and feature extraction, using a visual process operating on the actual images
Research Task Description
Create an image dataset of each waveform group for classification
Determine, through organized experiments, an optimized HTM
Apply optimized HTM to the classification of waveforms using images, devoid of any pre-processing or feature extraction
Hierarchical Temporal Memory - Overview
A hierarchy of memory nodes, each performing a similar algorithm
Each node learns:
1) Common spatial patterns (see the spatial pooler sketch after this list)
2) Common sequences of spatial patterns (using time to form groups of patterns with a common cause)
“Names” of groups are passed up the hierarchy
- Many-to-one mapping, bottom to top
- Stable patterns at the top of the hierarchy
Modeled as an extension of a Bayesian network with belief propagation
Creates a hierarchical model (in time and space) of the world
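To make the per-node algorithm concrete, a minimal sketch of the spatial half: common spatial patterns are memorized as quantization centers, and inference performs the many-to-one mapping to the closest one. The class name, the Hamming-distance match and the max_centers budget (borrowed from the 12-center node shown later) are illustrative assumptions rather than Numenta's implementation.

```python
# Sketch of an HTM node's spatial pooler for the binary-image world: novel
# input patterns are memorized as quantization centers up to a fixed budget;
# at inference time each input maps to its closest memorized center.

class SpatialPooler:
    def __init__(self, max_centers=12):
        self.centers = []              # memorized spatial patterns
        self.max_centers = max_centers

    def learn(self, pattern):
        """Memorize a pattern if it is new and the budget allows."""
        if pattern not in self.centers and len(self.centers) < self.max_centers:
            self.centers.append(pattern)

    def infer(self, pattern):
        """Many-to-one mapping: index of the closest quantization center."""
        dist = lambda c: sum(a != b for a, b in zip(c, pattern))
        return min(range(len(self.centers)), key=lambda i: dist(self.centers[i]))

sp = SpatialPooler()
for p in [(0, 0, 1, 1), (1, 1, 0, 0), (0, 0, 1, 1)]:
    sp.learn(p)
print(len(sp.centers))         # 2 -- the repeated pattern is not stored again
print(sp.infer((0, 1, 1, 1)))  # 0 -- closest to the first memorized center
```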
Hierarchical Temporal Memory
Structure of an HTM network for learning invariant representations for the binary images world. This network is organized in 3 levels. Input is fed in at the bottom level. Nodes are shown as squares. The top level of the network has one node, the middle level has 16 nodes and the bottom level has 64 nodes. The input image is of size 32 pixels by 32 pixels. This image is divided into adjoining patches of 4 pixels by 4 pixels as shown. Each bottom-level node’s input corresponds to one such 4x4 patch.
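The figure's geometry can be restated as a short sketch, with image_to_patches as a hypothetical helper: tiling a 32x32 image into non-overlapping 4x4 patches yields exactly the 64 bottom-level inputs described above.

```python
# The figure's geometry in code: a 32x32 input image is tiled into
# non-overlapping 4x4 patches, one per bottom-level node (8 x 8 = 64 nodes).

def image_to_patches(image, patch=4):
    """Split a square image (list of equal-length rows) into patch x patch tiles."""
    n = len(image)
    return [[row[c:c + patch] for row in image[r:r + patch]]
            for r in range(0, n, patch)
            for c in range(0, n, patch)]

image = [[0] * 32 for _ in range(32)]  # dummy 32x32 binary image
patches = image_to_patches(image)
print(len(patches))                    # 64 -- one patch per bottom-level node
```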
Hierarchical Temporal Memory
The fully learned node has 12 quantization centers within its spatial pooler and 4 temporal groups within its temporal pooler. The quantization centers are shown as 4x4 pixel patches. The temporal groups are shown in terms of their quantization centers. The input to the node is a 4x4 pixel patch.
Hierarchical Temporal Memory
This figure illustrates how nodes operate in a hierarchy; we show a two-level network and its associated inputs for three time steps. This network is constructed for illustrative purposes and is not the result of a real learning process. The outputs of the nodes are represented using an array of rectangles. The number of rectangles in the array corresponds to the length of the output vector. Filled rectangles represent ‘1’s and empty rectangles represent ‘0’s.
Hierarchical Temporal Memory
This input sequence is for an “L” moving to the right. The level-2 node has already learned one pattern before the beginning of this input sequence. The new input sequence introduced one additional pattern to the level-2 node.
Hierarchical Temporal Memory
(A) An initial node that has not started its learning process.
(B) The spatial pooler of the node is in its learning phase and has formed 2 quantization centers.
(C) The spatial pooler has finished its learning process and is in the inference stage; the temporal pooler is receiving inputs and learning the time-adjacency matrix.
(D) A fully learned node, where both the spatial pooler and the temporal pooler have finished their learning processes (see the sketch below).
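A minimal sketch of the temporal-pooler learning described in (C) and (D), assuming the spatial pooler emits a stream of quantization-center indices: transitions are tallied into a time-adjacency matrix, and centers linked by frequent transitions are merged into temporal groups. The greedy threshold-based grouping here is a simplification of the published HTM algorithm, and singleton groups are omitted for brevity.

```python
from collections import defaultdict

# Sketch of a temporal pooler: count which quantization centers follow which
# in time (the time-adjacency matrix), then merge strongly connected centers
# into temporal groups with a simple union-find.

class TemporalPooler:
    def __init__(self):
        self.adj = defaultdict(int)  # (prev_center, center) -> transition count
        self.prev = None

    def learn(self, center):
        """Accumulate time-adjacency counts from a stream of center indices."""
        if self.prev is not None:
            self.adj[(self.prev, center)] += 1
        self.prev = center

    def groups(self, threshold=2):
        """Merge centers linked by frequent transitions into temporal groups."""
        parent = {}
        def find(c):
            parent.setdefault(c, c)
            while parent[c] != c:
                c = parent[c]
            return c
        for (a, b), n in self.adj.items():
            if n >= threshold:
                parent[find(a)] = find(b)  # union: a and b share a group
        members = defaultdict(list)
        for c in parent:
            members[find(c)].append(c)
        return list(members.values())

tp = TemporalPooler()
for c in [0, 1, 0, 1, 2, 3, 2, 3, 0, 1]:
    tp.learn(c)
print(tp.groups())  # [[0, 1], [2, 3]] -- centers that repeatedly follow each other
```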
Hierarchical Temporal Memory
Temporal Grouping
Hidden Markov Model - A
Temporal Grouping
HMM - 2
Temporal Grouping
HMM - 3
Experimental Design
HTM Design, Parameters and Structure
Determine the number of hierarchy levels to be used
Image size in pixels influences the sensor layer
The pixel image size is broken down into its prime factors to determine the layer 1 and layer 2 array configurations (see the sketch below)
Determine the number of “iterations” of viewed images at each layer
Small learning and unknown datasets used
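One illustrative reading of the prime breakdown, applied to the 96 x 120 pixel beat images described later; the actual layer shapes chosen in the study are not given here, so the groupings in the comments are hypothetical.

```python
# Factor each image dimension into primes; any grouping of the factors gives
# a node-array size that divides the image evenly.

def prime_factors(n):
    factors, d = [], 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

print(prime_factors(96))   # [2, 2, 2, 2, 2, 3]
print(prime_factors(120))  # [2, 2, 2, 3, 5]
# e.g. the 96-pixel axis could be covered by 24 nodes of 4 pixels or 12 nodes
# of 8 pixels -- any regrouping of these prime factors tiles the image exactly.
```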
Hierarchical Temporal Memory
Memorization of the input patterns
Learning transition probabilities
Temporal grouping: degree of membership of an input pattern in a temporal group
Belief propagation (see the sketch below)
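A hedged sketch of how these steps might combine into the bottom-up message a node passes to its parent: closeness to each memorized pattern becomes a belief over quantization centers, and each temporal group reports its best-matching center. The Gaussian kernel, the sigma parameter and the max-over-centers pooling are illustrative choices in the spirit of Numenta's description, not the exact algorithm.

```python
import math

# Bottom-up message sketch: belief over quantization centers from input
# distance, pooled into a degree of membership per temporal group.

def bottom_up_message(pattern, centers, groups, sigma=1.0):
    dist = [sum((a - b) ** 2 for a, b in zip(c, pattern)) for c in centers]
    belief = [math.exp(-d / (2 * sigma ** 2)) for d in dist]  # over centers
    out = [max(belief[i] for i in g) for g in groups]         # over temporal groups
    total = sum(out)
    return [v / total for v in out]  # normalized group "names" passed up

centers = [(0, 0, 1, 1), (1, 1, 0, 0), (1, 0, 1, 0)]
groups = [[0, 1], [2]]  # center indices as grouped by the temporal pooler
print(bottom_up_message((1, 1, 0, 0), centers, groups))  # ~[0.73, 0.27]
```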
Experimental Design
Waveform Datasets
Individual beats broken down by classification and grouped (SN, LBBB, RBBB)
Teaching and unknown datasets randomly created
Teaching sets of 50, 90 and 100 images used; multiple sets created
Teaching vs. Unknown Datasets: traditional ratios are 1:1 or 2:1, teaching to unknown
With the HTM model, the ratio was 1:3, teaching to unknown (see the sketch below)
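A small sketch of producing the 1:3 teaching-to-unknown split within a beat class; the file names, ratio parameter and seed are placeholders, not the study's actual procedure.

```python
import random

# Shuffle the beat images within a class, keep one quarter for teaching and
# leave the remaining three quarters as the unknown set (a 1:3 ratio).

def split_teaching_unknown(images, teach_ratio=0.25, seed=42):
    pool = list(images)
    random.Random(seed).shuffle(pool)
    cut = int(len(pool) * teach_ratio)
    return pool[:cut], pool[cut:]  # (teaching, unknown)

beats = [f"sinus_{i:03d}.png" for i in range(200)]  # hypothetical image names
teaching, unknown = split_teaching_unknown(beats)
print(len(teaching), len(unknown))  # 50 150 -- the 1:3 teaching-to-unknown ratio
```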
ECG Waveform Images - Sinus
ECG Waveform Images
Individual ECG Beat Images
Left Bundle Branch Block | Normal Sinus | Right Bundle Branch Block
All images were sized to 96 x 120 pixels
ECG Series Waveform Images
Left Bundle Branch Block | Right Bundle Branch Block | Sinus
Results – Individual Beat Classification
Learning: smaller numbers of teaching images and more diverse images produce a greater classification percentage
Overtraining not evident, though saturation may exist; RAM influences performance
Waveform dataset diversity - noise: approximately 87 percent classification without its inclusion in the teaching set
Average > 99 percent classification; average differentiation of images by class approximately 99 percent
NuPIC Model Results
48 object categories producing 453 training images (32 x 32 pixels); 99.3 percent of the training images were correctly classified
Of the “distorted set”, only 65.7 percent were correctly classified within their categories
Individual Beat Results
HTM Model Series   Percent Classified
htm_100_17         98.5
htm_100_19         99.2
htm_100_21         98.8
htm_100_23         99.2
htm_100_25         99.2
htm_100_29         99.0
htm_100_33         98.8
htm_100_37         98.8
[Chart: Results by Dataset - percent classified (98 to 100) for the p, q, r, s, t, v and x datasets]
[Chart: Results by HTM Model - percent classified (98 to 100) for htm_100_17, htm_100_19, htm_100_21, htm_100_23, htm_100_25, htm_100_29, htm_100_33 and htm_100_37]
IR Spectra
Sample IR Spectra
Results – IR Spectra
Classification percentage > 99
Figure X: Left: IR Spectrum of an Alcohol; Right: IR Spectrum of a Hydrocarbon
Figure X: IR Spectrum of a Nitrile
Gait Waveforms
ALS
Control
Huntington's
With a limited teaching set and unknown set, classification > 98 percent