Hierarchical Temporal Memory
“The Classification of Un-preprocessed Waveforms through the Application of the Hierarchical Temporal Memory Model”
April 18, 2009 (Version 1.0)
John M. Casarella
Ivan G. Seidenberg School of CSIS, Pace University
Topics to be Discussed
Introduction
Intelligence
Artificial Intelligence
Neuroscience
Connectionism and Classical Neural Nets
Pattern Recognition, Feature Extraction & Signal Processing
Hierarchical Temporal Memory (Memory – Prediction)
Hypothesis
Research
Results
Introduction
"I was proceeding down the road. The trees on the right were passing me in orderly fashion at 60 miles per hour. Suddenly one of them stepped in my path."
John von Neumann providing an explanation for his automobile accident.
Intelligence
What is Intelligence?
A uniquely human quality? “means the ability to solve hard problems”
The ability to create memories: learning, language development, memory formation (synaptic pattern creation)
The human ability to adapt to a changing environment, or to change our environment for survival
Alan Turing: “Can machines think?”
Connectionism: model the digital computer like a child’s mind, then “educate” it to obtain the “adult”
“Unorganized Machines”: a network of neuron-like Boolean elements randomly connected together
Proposed that machines should be able to ‘learn by experience’
The Turing Test - constrained and focused research
Imitate human behavior; evaluate AI only on the basis of behavioral response
Turing’s unorganized machine
Machine Intelligence
What is Artificial Intelligence?
The science and engineering of making intelligent machines, especially intelligent computer programs
The Objectives of AI: create machines to do something that would require intelligence if done by a human; solve the problem of how to solve the problem
von Neumann and Shannon: sequential processing vs. parallel
McCarthy (helped define AI), Minsky (first dedicated AI lab at MIT) and Zadeh (fuzzy logic)
Various varieties: expert systems (rule based, fuzzy, frames), genetic algorithms, perceptrons (classical neural networks)
Neural Nets
McCulloch and Pitts
Modeled the neurons of the brain; proposed a model of artificial neurons
Cornerstone of neural computing and neural networks
Boolean nets of simple two-state ‘neurons’
Concept of a ‘threshold’, but no mechanism for learning
Hebb: pattern recognition learned by changing the strength of the connection between neurons (see the sketch below)
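As a concrete illustration of the two ideas on this slide, here is a minimal Python sketch of a McCulloch-Pitts threshold unit plus a Hebbian weight update; the 0/1 coding, the learning rate and the function names are illustrative choices, not the original 1943/1949 formulations.

```python
# Minimal sketch: a McCulloch-Pitts two-state threshold 'neuron' and a
# Hebbian update. Learning rate and 0/1 coding are illustrative assumptions.

def mp_neuron(inputs, weights, threshold):
    """Fire (1) when the weighted sum of the binary inputs reaches the threshold."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

def hebb_update(inputs, weights, output, rate=0.1):
    """Hebb's rule: strengthen a connection whenever input and output co-fire."""
    return [w + rate * x * output for x, w in zip(inputs, weights)]

weights = [0.5, 0.5]
x = [1, 1]
y = mp_neuron(x, weights, threshold=1.0)  # fires only when both inputs are active
weights = hebb_update(x, weights, y)      # co-activity strengthens both weights
```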
Classical Neural Networks
Rosenblatt’s Perceptron Model permitted mathematical analysis of neural networks
Based on McCulloch and Pitts: a linear combiner followed by a hard limiter
Activation and weight training
Linear separation only - no XOR (see the sketch below)
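A small Python sketch of the model as described: a linear combiner feeding a hard limiter, trained with the perceptron rule. Running it shows the linear-separation limit; the AND task converges while XOR never does. The function name and epoch budget are illustrative.

```python
# Illustrative Rosenblatt-style perceptron: linear combiner + hard limiter,
# trained with the perceptron weight-update rule on 2-input binary tasks.

def train_perceptron(samples, epochs=100, rate=0.1):
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, target in samples:
            y = 1 if w[0] * x[0] + w[1] * x[1] + b >= 0 else 0  # hard limiter
            if y != target:
                mistakes += 1
                w = [wi + rate * (target - y) * xi for wi, xi in zip(w, x)]
                b += rate * (target - y)
        if mistakes == 0:
            return True      # converged: the classes are linearly separable
    return False             # no separating line found within the epoch budget

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
print(train_perceptron(AND))  # True  -- AND is linearly separable
print(train_perceptron(XOR))  # False -- XOR is not
```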
Classical Neural Networks
Minsky and Papert - what were they thinking?
Mathematical proof that the perceptron model was of limited usefulness; identified classes of problems which perceptrons could not handle
Negative impact on funding
A narrow analysis of the model: correct that a single-layer perceptron is incapable of learning XOR, but they incorrectly postulated that multi-layer perceptrons would also be incapable of it
Classical Neural Networks
The Quiet Years: Grossberg, Kohonen and Anderson
Hopfield introduced non-linearities; ANNs could solve constrained optimization problems
Rumelhart and McClelland: Parallel Distributed Processing and backpropagation (see the sketch below)
Interdisciplinary nature of neural net research
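To contrast with the perceptron sketch above, a minimal backpropagation example: a 2-2-1 sigmoid network that can learn XOR. The layer sizes, learning rate, epoch count and seed are illustrative assumptions, and a different seed may be needed if training stalls in a local minimum.

```python
import math, random

# Minimal backpropagation sketch: a 2-2-1 sigmoid network trained on XOR,
# the task a single-layer perceptron cannot learn.
random.seed(0)
sig = lambda z: 1.0 / (1.0 + math.exp(-z))

W1 = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]  # hidden: 2 inputs + bias
W2 = [random.uniform(-1, 1) for _ in range(3)]                      # output: 2 hidden + bias
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

for _ in range(20000):
    for (x1, x2), t in XOR:
        h = [sig(w[0] * x1 + w[1] * x2 + w[2]) for w in W1]      # forward: hidden
        y = sig(W2[0] * h[0] + W2[1] * h[1] + W2[2])             # forward: output
        dy = (y - t) * y * (1 - y)                               # output delta
        dh = [dy * W2[i] * h[i] * (1 - h[i]) for i in range(2)]  # hidden deltas
        for i in range(2):                                       # gradient-descent updates
            W2[i] -= 0.5 * dy * h[i]
            W1[i] = [W1[i][0] - 0.5 * dh[i] * x1,
                     W1[i][1] - 0.5 * dh[i] * x2,
                     W1[i][2] - 0.5 * dh[i]]
        W2[2] -= 0.5 * dy

for (x1, x2), t in XOR:
    h = [sig(w[0] * x1 + w[1] * x2 + w[2]) for w in W1]
    y = sig(W2[0] * h[0] + W2[1] * h[1] + W2[2])
    print((x1, x2), round(y, 2), "target", t)  # typically converges near the targets
```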
Neuroscience
Structure of the neocortex: learning, pattern recognition and synapses
Mountcastle’s columnar model of the neocortex
Learning associated with the construction of cell assemblies, related to the formation of pattern associations
Neuroplasticity
Neuroscience - The Biology
Neocortex: >50% of the human brain
Locus of perception, language, planned behavior, declarative memory, imagination, planning
Extremely flexible/generic, with a repetitive structure
The Neocortex
Hierarchy of cortical regions
Region-to-region connectivity
Cortical-thalamic connectivity
Cortical layers: cell types and connectivity
Neuroscience
[Figure: the six cortical layers, with layers 3, 5 and 6 subdivided into sublayers A and B]
Electrocardiogram
The electrocardiogram (ECG) records the electrical activity of the heart over time
Breakthrough by Willem Einthoven in 1901
Electrodes placed according to a set pattern
The ECG displays the voltage between pairs of these placed electrodes
Immediate results; the letters P, Q, R, S and T were assigned to the various deflections
Electrocardiogram
Measurement of the flow of electrical current as it moves across the conduction pathway of the heart.
Recorded over time
Represents different phases of the cardiac cycle
Electrocardiogram
Hypothesis
Application of the HTM model, once correctly designed and configured, will provide a greater success rate in the classification of complex waveforms
In the absence of pre-processing and feature extraction, using a visual process operating on the actual images
Research Task Description
Create an image dataset of each waveform group for classification
Determine, through organized experiments, an optimized HTM
Apply optimized HTM to the classification of waveforms using images, devoid of any pre-processing or feature extraction
Hierarchical Temporal Memory - Overview
A hierarchy of memory nodes, each performing a similar algorithm
Each node learns:
1) Common spatial patterns (see the spatial pooler sketch after this list)
2) Common sequences of spatial patterns (using time to form groups of patterns with a common cause)
“Names” of groups are passed up the hierarchy
- Many-to-one mapping, bottom to top
- Stable patterns at the top of the hierarchy
Modeled as an extension of a Bayesian network with belief propagation
Creates a hierarchical model (in time and space) of the world
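To make the per-node algorithm concrete, a minimal sketch of the spatial half: common spatial patterns are memorized as quantization centers, and inference performs the many-to-one mapping to the closest one. The class name, the Hamming-distance match and the max_centers budget (borrowed from the 12-center node shown later) are illustrative assumptions rather than Numenta's implementation.

```python
# Sketch of an HTM node's spatial pooler for the binary-image world: novel
# input patterns are memorized as quantization centers up to a fixed budget;
# at inference time each input maps to its closest memorized center.

class SpatialPooler:
    def __init__(self, max_centers=12):
        self.centers = []              # memorized spatial patterns
        self.max_centers = max_centers

    def learn(self, pattern):
        """Memorize a pattern if it is new and the budget allows."""
        if pattern not in self.centers and len(self.centers) < self.max_centers:
            self.centers.append(pattern)

    def infer(self, pattern):
        """Many-to-one mapping: index of the closest quantization center."""
        dist = lambda c: sum(a != b for a, b in zip(c, pattern))
        return min(range(len(self.centers)), key=lambda i: dist(self.centers[i]))

sp = SpatialPooler()
for p in [(0, 0, 1, 1), (1, 1, 0, 0), (0, 0, 1, 1)]:
    sp.learn(p)
print(len(sp.centers))         # 2 -- the repeated pattern is not stored again
print(sp.infer((0, 1, 1, 1)))  # 0 -- closest to the first memorized center
```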
Hierarchical Temporal Memory
Structure of an HTM network for learning invariant representations for the binary images world. This network is organized in 3 levels. Input is fed in at the bottom level. Nodes are shown as squares. The top level of the network has one node, the middle level has 16 nodes and the bottom level has 64 nodes. The input image is of size 32 pixels by 32 pixels. This image is divided into adjoining patches of 4 pixels by 4 pixels as shown. Each bottom-level node’s input corresponds to one such 4x4 patch.
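The figure's geometry can be restated as a short sketch, with image_to_patches as a hypothetical helper: tiling a 32x32 image into non-overlapping 4x4 patches yields exactly the 64 bottom-level inputs described above.

```python
# The figure's geometry in code: a 32x32 input image is tiled into
# non-overlapping 4x4 patches, one per bottom-level node (8 x 8 = 64 nodes).

def image_to_patches(image, patch=4):
    """Split a square image (list of equal-length rows) into patch x patch tiles."""
    n = len(image)
    return [[row[c:c + patch] for row in image[r:r + patch]]
            for r in range(0, n, patch)
            for c in range(0, n, patch)]

image = [[0] * 32 for _ in range(32)]  # dummy 32x32 binary image
patches = image_to_patches(image)
print(len(patches))                    # 64 -- one patch per bottom-level node
```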
Hierarchical Temporal Memory
The fully learned node has 12 quantization centers within its spatial pooler and 4 temporal groups within its temporal pooler. The quantization centers are shown as 4x4 pixel patches. The temporal groups are shown in terms of their quantization centers. The input to the node is a 4x4 pixel patch.
Hierarchical Temporal Memory
This figure illustrates how nodes operate in a hierarchy; we show a two-level network and its associated inputs for three time steps. This network is constructed for illustrative purposes and is not the result of a real learning process. The outputs of the nodes are represented using an array of rectangles. The number of rectangles in the array corresponds to the length of the output vector. Filled rectangles represent ‘1’s and empty rectangles represent ‘0’s.
Hierarchical Temporal Memory
This input sequence is for an “L” moving to the right. The level-2 node has already learned one pattern before the beginning of this input sequence. The new input sequence introduced one additional pattern to the level-2 node.
Hierarchical Temporal Memory
(A) An initial node that has not started its learning process.
(B) The spatial pooler of the node is in its learning phase and has formed 2 quantization centers.
(C) The spatial pooler has finished its learning process and is in the inference stage; the temporal pooler is receiving inputs and learning the time-adjacency matrix.
(D) A fully learned node, where both the spatial pooler and the temporal pooler have finished their learning processes (see the sketch below).
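A minimal sketch of the temporal-pooler learning described in (C) and (D), assuming the spatial pooler emits a stream of quantization-center indices: transitions are tallied into a time-adjacency matrix, and centers linked by frequent transitions are merged into temporal groups. The greedy threshold-based grouping here is a simplification of the published HTM algorithm, and singleton groups are omitted for brevity.

```python
from collections import defaultdict

# Sketch of a temporal pooler: count which quantization centers follow which
# in time (the time-adjacency matrix), then merge strongly connected centers
# into temporal groups with a simple union-find.

class TemporalPooler:
    def __init__(self):
        self.adj = defaultdict(int)  # (prev_center, center) -> transition count
        self.prev = None

    def learn(self, center):
        """Accumulate time-adjacency counts from a stream of center indices."""
        if self.prev is not None:
            self.adj[(self.prev, center)] += 1
        self.prev = center

    def groups(self, threshold=2):
        """Merge centers linked by frequent transitions into temporal groups."""
        parent = {}
        def find(c):
            parent.setdefault(c, c)
            while parent[c] != c:
                c = parent[c]
            return c
        for (a, b), n in self.adj.items():
            if n >= threshold:
                parent[find(a)] = find(b)  # union: a and b share a group
        members = defaultdict(list)
        for c in parent:
            members[find(c)].append(c)
        return list(members.values())

tp = TemporalPooler()
for c in [0, 1, 0, 1, 2, 3, 2, 3, 0, 1]:
    tp.learn(c)
print(tp.groups())  # [[0, 1], [2, 3]] -- centers that repeatedly follow each other
```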
Hierarchical Temporal Memory
Temporal Grouping
Hidden Markov Model - A
Temporal Grouping
HMM - 2
Temporal Grouping
HMM - 3
Experimental Design
HTM Design, Parameters and Structure
Determine the number of hierarchy levels to be used
Image size in pixels influences the sensor layer
The pixel image size is broken down into its prime factors to determine the layer 1 and layer 2 array configurations (see the sketch below)
Determine the number of “iterations” of viewed images at each layer
Small learning and unknown datasets used
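One illustrative reading of the prime breakdown, applied to the 96 x 120 pixel beat images described later; the actual layer shapes chosen in the study are not given here, so the groupings in the comments are hypothetical.

```python
# Factor each image dimension into primes; any grouping of the factors gives
# a node-array size that divides the image evenly.

def prime_factors(n):
    factors, d = [], 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

print(prime_factors(96))   # [2, 2, 2, 2, 2, 3]
print(prime_factors(120))  # [2, 2, 2, 3, 5]
# e.g. the 96-pixel axis could be covered by 24 nodes of 4 pixels or 12 nodes
# of 8 pixels -- any regrouping of these prime factors tiles the image exactly.
```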
Hierarchical Temporal Memory
Memorization of the input patterns
Learning transition probabilities
Temporal grouping: degree of membership of an input pattern in a temporal group
Belief propagation (see the sketch below)
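A hedged sketch of how these steps might combine into the bottom-up message a node passes to its parent: closeness to each memorized pattern becomes a belief over quantization centers, and each temporal group reports its best-matching center. The Gaussian kernel, the sigma parameter and the max-over-centers pooling are illustrative choices in the spirit of Numenta's description, not the exact algorithm.

```python
import math

# Bottom-up message sketch: belief over quantization centers from input
# distance, pooled into a degree of membership per temporal group.

def bottom_up_message(pattern, centers, groups, sigma=1.0):
    dist = [sum((a - b) ** 2 for a, b in zip(c, pattern)) for c in centers]
    belief = [math.exp(-d / (2 * sigma ** 2)) for d in dist]  # over centers
    out = [max(belief[i] for i in g) for g in groups]         # over temporal groups
    total = sum(out)
    return [v / total for v in out]  # normalized group "names" passed up

centers = [(0, 0, 1, 1), (1, 1, 0, 0), (1, 0, 1, 0)]
groups = [[0, 1], [2]]  # center indices as grouped by the temporal pooler
print(bottom_up_message((1, 1, 0, 0), centers, groups))  # ~[0.73, 0.27]
```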
Experimental Design
Waveform Datasets
Individual beats broken down by classification and grouped (SN, LBBB, RBBB)
Teaching and unknown datasets randomly created
Teaching sets of 50, 90 and 100 images used; multiple sets created
Teaching vs. Unknown Datasets: traditional ratios are 1:1 or 2:1, teaching to unknown
With the HTM model, the ratio was 1:3, teaching to unknown (see the sketch below)
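A small sketch of producing the 1:3 teaching-to-unknown split within a beat class; the file names, ratio parameter and seed are placeholders, not the study's actual procedure.

```python
import random

# Shuffle the beat images within a class, keep one quarter for teaching and
# leave the remaining three quarters as the unknown set (a 1:3 ratio).

def split_teaching_unknown(images, teach_ratio=0.25, seed=42):
    pool = list(images)
    random.Random(seed).shuffle(pool)
    cut = int(len(pool) * teach_ratio)
    return pool[:cut], pool[cut:]  # (teaching, unknown)

beats = [f"sinus_{i:03d}.png" for i in range(200)]  # hypothetical image names
teaching, unknown = split_teaching_unknown(beats)
print(len(teaching), len(unknown))  # 50 150 -- the 1:3 teaching-to-unknown ratio
```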
ECG Waveform Images - Sinus
ECG Waveform Images
Individual ECG Beat Images
Left Bundle Branch Block | Normal Sinus | Right Bundle Branch Block
All images were sized to 96 x 120 pixels
ECG Series Waveform Images
Left Bundle Branch Block | Right Bundle Branch Block | Sinus
Results – Individual Beat Classification
Learning: smaller numbers of teaching images and more diverse images produce a greater classification percentage
Overtraining not evident, though saturation may exist; RAM influences performance
Waveform dataset diversity - noise: approximately 87 percent classification without its inclusion in the teaching set
Average > 99 percent classification; average differentiation of images by class approximately 99 percent
NuPIC Model Results
48 object categories producing 453 training images (32 x 32 pixels); 99.3 percent of the training images were correctly classified
Of the “distorted set”, only 65.7 percent were correctly classified within their categories
Individual Beat Results
HTM Model Series   Percent Classified
htm_100_17         98.5
htm_100_19         99.2
htm_100_21         98.8
htm_100_23         99.2
htm_100_25         99.2
htm_100_29         99.0
htm_100_33         98.8
htm_100_37         98.8
[Chart: Results by Dataset - percent classified (98 to 100) for the p, q, r, s, t, v and x datasets]
[Chart: Results by HTM Model - percent classified (98 to 100) for htm_100_17, htm_100_19, htm_100_21, htm_100_23, htm_100_25, htm_100_29, htm_100_33 and htm_100_37]
IR Spectra
Sample IR Spectra
Results – IR Spectra
Classification percentage > 99
Figure X: Left: IR Spectrum of an Alcohol; Right: IR Spectrum of a Hydrocarbon
Figure X: IR Spectrum of a Nitrile
Gait Waveforms
ALS
Control
Huntington's
With a limited teaching set and unknown set, classification > 98 percent