ACM CIKM October 31, 2013 Jeff Hawkins jhawkins@GrokSolutions.com On-line Learning From Streaming Data

Page 1:

ACM CIKM, October 31, 2013

Jeff Hawkins, jhawkins@GrokSolutions.com

On-line Learning From Streaming Data

Page 2:

1) Discover operating principles of neocortex

2) Build systems based on these principles

Anatomy, Physiology

Theoretical principles

Software

Industrial Research Track

Anomaly detection in high velocity data

Cortical algorithms

Page 3:

The neocortex is a memory system.

Diagram labels: data stream; retina, cochlea, somatic

The neocortex learns a model from sensory data

- predictions
- anomalies
- actions

The neocortex learns a sensory-motor model of the world

Page 4:

Principles of Neocortical Function

1) On-line learning from streaming data

Diagram labels: data stream; retina, cochlea, somatic

Page 5:

Principles of Neocortical Function

1) On-line learning from streaming data

2) Hierarchy of memory regions

Diagram labels: data stream; retina, cochlea, somatic

Page 6:

Principles of Neocortical Function

1) On-line learning from streaming data

2) Hierarchy of memory regions

3) Sequence memory
- inference
- motor

Diagram labels: data stream; retina, cochlea, somatic

Page 7:

Principles of Neocortical Function

1) On-line learning from streaming data

2) Hierarchy of memory regions

3) Sequence memory

4) Sparse Distributed Representations

Diagram labels: data stream; retina, cochlea, somatic

Page 8:

Principles of Neocortical Function

1) On-line learning from streaming data

2) Hierarchy of memory regions

3) Sequence memory

4) Sparse Distributed Representations

5) All regions are sensory and motor

Diagram labels: data stream; retina, cochlea, somatic; Motor

Page 9:

Principles of Neocortical Function

1) On-line learning from streaming data

2) Hierarchy of memory regions

3) Sequence memory

4) Sparse Distributed Representations

5) All regions are sensory and motor

6) Attention

Diagram labels: data stream; retina, cochlea, somatic

Page 10:

Principles of Neocortical Function

1) On-line learning from streaming data

2) Hierarchy of memory regions

3) Sequence memory

4) Sparse Distributed Representations

5) All regions are sensory and motor

6) Attention

Diagram labels: data stream; retina, cochlea, somatic

These six principles are necessary and sufficient for biological and machine intelligence.

- All mammals from mouse to human have them

Page 11:

Sparse Distributed Representations (SDRs)

• Many bits (thousands)
• Few 1’s, mostly 0’s
• Example: 2,000 bits, 2% active
• Each bit has semantic meaning
• Meaning of each bit is learned, not assigned

01000000000000000001000000000000000000000000000000000010000…………01000

Dense Representations

• Few bits (8 to 128)
• All combinations of 1’s and 0’s
• Example: 8 bit ASCII
• Individual bits have no inherent meaning
• Representation is assigned by programmer

01101101 = m
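The contrast on this slide can be made concrete with a small sketch. It builds one SDR with the slide's example sizes (2,000 bits, 2% active) and one dense 8-bit ASCII code. The function names `random_sdr` and `dense_ascii` are illustrative, not from the talk, and the active bits here are chosen at random only to show the shape of the representation; in the talk's account the meaning of each bit is learned.

```python
import random

N_BITS = 2000     # total bits, from the slide's example
SPARSITY = 0.02   # 2% active, from the slide's example


def random_sdr(n_bits=N_BITS, sparsity=SPARSITY, seed=None):
    """Return a list of 0/1 ints with exactly round(n_bits * sparsity) ones.

    Hypothetical helper: bits are picked at random purely for illustration.
    """
    rng = random.Random(seed)
    n_active = int(round(n_bits * sparsity))
    active = set(rng.sample(range(n_bits), n_active))
    return [1 if i in active else 0 for i in range(n_bits)]


def dense_ascii(ch):
    """Return the 8-bit ASCII code of a character as a bit string."""
    return format(ord(ch), "08b")


sdr = random_sdr(seed=0)
print(len(sdr), sum(sdr))   # 2000 40
print(dense_ascii("m"))     # 01101101
```

In the dense code every one of the 256 bit patterns is a valid symbol, so flipping a single bit yields a different, unrelated character. In the sparse code only 40 of 2,000 positions are on.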

Page 12:

A Few SDR Properties

1) Similarity: shared bits = semantic similarity

2) Store and Compare: store indices of active bits

Indices: 1 2 3 4 5 | 40

Subsampling is OK

Indices: 1 2 | 10
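A minimal sketch of these properties, assuming an SDR is stored as the set of its active-bit indices. The sizes (40 active bits out of 2,000, subsampled down to 10) follow the slides; the helper names and the match threshold are assumptions made for illustration.

```python
import random


def overlap(a, b):
    """Number of active-bit indices shared by two SDRs stored as sets."""
    return len(a & b)


def subsample(sdr, k, seed=0):
    """Keep only k of the stored active indices (hypothetical helper)."""
    rng = random.Random(seed)
    return set(rng.sample(sorted(sdr), k))


def matches(stored, candidate, threshold):
    """Treat candidate as a match if it shares at least threshold bits."""
    return overlap(stored, candidate) >= threshold


rng = random.Random(42)
N_BITS, N_ACTIVE = 2000, 40
pattern = set(rng.sample(range(N_BITS), N_ACTIVE))
unrelated = set(rng.sample(range(N_BITS), N_ACTIVE))

stored = subsample(pattern, 10)   # remember only 10 of the 40 indices

print(overlap(pattern, pattern))               # 40
print(matches(stored, pattern, threshold=10))  # True
print(overlap(stored, unrelated))              # expected to be near 0
```

Because the patterns are sparse, an unrelated SDR is very unlikely to contain all 10 stored indices by chance, which is why comparing against a subsample is usually enough.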

Page 13:

Sequence Memory (for inference and motor)

How does a layer of neurons learn sequences?

Coincidence detectors

Page 14:

Each cell is one bit in our Sparse Distributed Representation

SDRs are formed via a local competition between cells.
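One simple way to picture this competition is k-winners-take-all: each cell has some amount of input, and only the k cells with the most input become active. The sketch below is an assumption for illustration, not the talk's algorithm. It treats the whole layer as a single neighborhood, and the names `k_winners` and `to_sdr` are hypothetical.

```python
def k_winners(input_overlaps, k):
    """Return the indices of the k cells with the highest input overlap.

    Ties are broken in favor of the lower index so the result is deterministic.
    """
    ranked = sorted(range(len(input_overlaps)),
                    key=lambda i: (-input_overlaps[i], i))
    return set(ranked[:k])


def to_sdr(input_overlaps, k):
    """Turn per-cell input strengths into a binary vector with k active bits."""
    winners = k_winners(input_overlaps, k)
    return [1 if i in winners else 0 for i in range(len(input_overlaps))]


overlaps = [3, 9, 1, 7, 7, 0, 2, 8]   # made-up input strength per cell
print(to_sdr(overlaps, k=3))          # [0, 1, 0, 1, 0, 0, 0, 1]
```

Fixing k keeps the number of active cells constant whatever the input, which is what makes the output sparse.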

Page 15:

SDR (time = 1)

Page 16:

SDR (time = 2)

Page 17:

Cell forms connections to subsample of previously active cells.

Predicts its own future activity.
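The idea on this slide can be sketched as a toy `Cell` class. When the cell becomes active it remembers a random subsample of the cells that were active one step earlier; later, if enough of those remembered cells are active again, the cell predicts that it is about to become active. The class, its parameters (`sample_size`, `threshold`), and the single set of connections are simplifying assumptions; the talk does not give these details.

```python
import random


class Cell:
    """Toy cell that predicts its own activity from previously active cells."""

    def __init__(self, sample_size, threshold, seed=0):
        self.sample_size = sample_size   # how many prior cells to connect to
        self.threshold = threshold       # overlap needed to enter predictive state
        self.connections = set()
        self._rng = random.Random(seed)

    def learn(self, previously_active):
        """Connect to a random subsample of the cells active at the previous step."""
        pool = sorted(previously_active)
        k = min(self.sample_size, len(pool))
        self.connections = set(self._rng.sample(pool, k))

    def is_predicted(self, currently_active):
        """True if enough connected cells are active right now."""
        return len(self.connections & set(currently_active)) >= self.threshold


prev_pattern = set(range(0, 40))       # cells active just before this cell fired
other_pattern = set(range(100, 140))   # an unrelated pattern

cell = Cell(sample_size=10, threshold=8)
cell.learn(prev_pattern)

print(cell.is_predicted(prev_pattern))    # True
print(cell.is_predicted(other_pattern))   # False
```

A threshold below the sample size lets the cell still recognize the earlier pattern when a few of its connected cells are missing.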

Page 18:

Multiple Predictions Can Occur at Once

With one cell per column, 1st order memory

We need a high order memory

Page 19:

High Order Sequence Memory Enabled by Columns of Cells

Cortical Learning Algorithm (CLA)
- Distributed sequence memory
- High order
- High capacity
- Multiple simultaneous predictions
- Semantic generalization
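The difference between first-order and high-order memory can be illustrated without any neurons. The sketch below is a plain lookup-table stand-in, not the CLA: a first-order memory keys each prediction on the previous element only, while a high-order memory keys it on a longer context. The function names and the fixed `order` parameter are assumptions made for the example.

```python
from collections import defaultdict


def train_first_order(sequences):
    """Map each element to the set of elements seen immediately after it."""
    memory = defaultdict(set)
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            memory[prev].add(nxt)
    return memory


def train_high_order(sequences, order):
    """Map each context of `order` elements to the elements seen after it."""
    memory = defaultdict(set)
    for seq in sequences:
        for i in range(order, len(seq)):
            memory[tuple(seq[i - order:i])].add(seq[i])
    return memory


sequences = ["ABCD", "XBCY"]   # the two sequences share the middle "BC"
first = train_first_order(sequences)
high = train_high_order(sequences, order=3)

print(sorted(first["C"]))              # ['D', 'Y']
print(sorted(high[("A", "B", "C")]))   # ['D']
print(sorted(high[("X", "B", "C")]))   # ['Y']
```

After "C" the first-order memory makes two simultaneous predictions because it cannot tell the sequences apart. With the earlier context included, each sequence gets a single prediction.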

Page 20:

Three Current Directions

1) NuPIC Open Source Project

Page 21:

NuPIC Open Source Project

www.Numenta.org

Single source tree (used by GROK)

GPLv3

Steady community growth
- 67 contributors (+26 since July)
- 245 mailing list subscribers
- 1621 total messages

eBook from community member

OS community joining Kaggle Competitions

Fall Hackathon: 70 attendees

Page 22:

Three Current Directions

1) NuPIC Open Source Project

2) Custom CLA Hardware
- Needed for scaling research and commercial applications
- DARPA “Cortical Processor”
- IBM, Seagate, Sandia Labs

3) Commercialization

Page 23:

Data: Past and Future

Past

1. Store data
2. Look at data
3. Build models

Problem:
- Doesn’t scale with velocity and #models

Solution:
- Automated model creation
- Continuous learning
- Temporal inference

Future

Stream data
- Automated model creation
- Continuous learning
- Temporal inference

Outputs
- Predictions
- Anomalies
- Actions

Page 24:

Anomaly Detection Using Predictive Cortical Models

Diagram, one pipeline per metric (Metric 1 ... Metric N):

Metric, Encoder, SDR, Cortical Memory, Prediction, Point anomaly score, Time average, Distribution of averages, Metric anomaly score

The metric anomaly scores are combined into a System Anomaly Score.
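The stages named in this diagram can be sketched end to end. The slide gives only the stage names, so every formula here is an assumption: the point score is taken as the fraction of active bits that were not predicted, the time average is a short moving average, the distribution of averages is summarized by a Gaussian fit, and the system score is the maximum over metrics. The class and function names are hypothetical.

```python
import math
from collections import deque


def point_anomaly_score(predicted, active):
    """Fraction of active bits that were not predicted (0 = expected, 1 = surprise)."""
    if not active:
        return 0.0
    return 1.0 - len(predicted & active) / len(active)


class MetricAnomaly:
    """Point score -> time average -> distribution of averages -> metric score."""

    def __init__(self, window=5, history=100):
        self.scores = deque(maxlen=window)      # recent point scores
        self.averages = deque(maxlen=history)   # past time averages

    def update(self, predicted, active):
        self.scores.append(point_anomaly_score(predicted, active))
        avg = sum(self.scores) / len(self.scores)
        score = self._unusualness(avg)
        self.averages.append(avg)
        return score

    def _unusualness(self, avg):
        """1 minus the Gaussian tail probability of seeing an average this high."""
        if len(self.averages) < 2:
            return 0.0   # not enough history to judge
        mean = sum(self.averages) / len(self.averages)
        var = sum((a - mean) ** 2 for a in self.averages) / len(self.averages)
        std = max(math.sqrt(var), 1e-6)   # floor avoids division by zero
        tail = 0.5 * math.erfc((avg - mean) / (std * math.sqrt(2)))
        return 1.0 - tail


def system_anomaly_score(metric_scores):
    """Assumed combination rule: the system is as anomalous as its worst metric."""
    return max(metric_scores)


model = MetricAnomaly(window=3)
for _ in range(20):   # a well-predicted stretch
    quiet = model.update(predicted={1, 2, 3, 4}, active={1, 2, 3, 4})
spike = model.update(predicted={1, 2, 3, 4}, active={7, 8, 9, 10})

print(quiet < spike)                          # True
print(system_anomaly_score([quiet, spike]))   # 1.0
```

Scoring the average against its own history, rather than against a fixed threshold, means a metric that is always noisy does not raise constant alarms, while a normally predictable metric does when it suddenly stops being predictable.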

Page 25:

Largely predictable (plots: metric value, anomaly score)

Largely unpredictable (plots: metric value, anomaly score)

Page 26:

Grok for IT Monitoring

Breakthrough Science for Anomaly Detection
- Detects problems thresholds miss
- Continuous learning
- Automated model building
- State-of-the-art neocortical model

Reinventing UX for IT Monitoring
- Smartphone-centric
- Ranks anomalous instances
- Rapid drill down
- Continuously updated
- User-controlled notifications

In private beta for Amazon AWS cloud [email protected]

Page 27:

Extensible Architecture

Custom metrics for any application/server

Web interface and mobile client source code available under no-cost license

Engine API to be published

NuPIC open source community
