
Neuro-Inspired Computing Elements
Sandia National Labs, 2014

Jeff Hawkins, jhawkins@Numenta.com

From Cortical Microcircuits to Machine Intelligence

Numenta: catalyst for machine intelligence

Research: theory, algorithms

NuPIC: open source community

Products for streaming analytics, based on cortical algorithms

“If you invent a breakthrough so computers can learn, that is worth 10 Microsofts”

"If you invent a breakthrough so computers that learn that is worth 10 Micros

1) What principles will we use to build intelligent machines?

2) What applications will drive adoption in the near and long term?

Machine Intelligence

1) Flexible (universal learning machine)

2) Robust

3) If we knew how the neocortex worked, we would be in a race to build intelligent machines.

Machine intelligence will be built on the principles of the neocortex

What the Cortex Does

[Diagram: data streams from the retina, cochlea, and somatic senses]

The neocortex learns a model from sensory data:
- predictions
- anomalies
- actions

The neocortex learns a sensory-motor model of the world

“Hierarchical Temporal Memory” (HTM)

1) Hierarchy of nearly identical regions

- across species
- across modalities

2) Regions are mostly sequence memory

- inference
- motor

3) Feedforward: temporal stability - inference

Feedback: unfold sequences - motor/prediction

[Diagram: a cortical region, 2.5 mm thick, showing its layers (2/3, 4, 5, 6) and mini-columns]

1) The cellular layer is the unit of cortical computation.
2) Each layer implements a variant of a common learning algorithm.
3) Mini-columns play a critical role in how layers learn sequences.

[Diagram: each cellular layer labeled "Sequence memory"]


L4 and L2/3: Feedforward Inference

[Diagram: L4 receives sensor/afferent data plus a copy of motor commands, and learns sensory-motor transitions. L2/3 learns high-order sequences of what L4 leaves unexplained. Predicted input produces a stable representation; un-predicted changes pass through to the next higher region.]

Layer 4 learns changes in input due to behavior. Layer 3 learns sequences of L4 remainders.

These are universal inference steps. They apply to all sensory modalities.

Example: after learning the sequences A-B-C-D and X-B-C-Y, the region predicts D after A-B-C and Y after X-B-C.


L5 and L6: Feedback, Behavior and Attention

[Diagram: the four layers with feedforward and feedback paths, annotated with research status]
- L2/3, learns high-order sequences: well understood, tested / commercial
- L4, learns sensory-motor transitions: 90% understood, testing in progress
- L5, recalls motor sequences: 50% understood
- L6, attention: 10% understood

How does a layer of neurons learn sequences?

[Diagram: the layers again, with L2/3 ("learns high-order sequences") highlighted]

First:
- Sparse Distributed Representations
- Neurons

Sparse Distributed Representations (SDRs)
• Many bits (thousands)
• Few 1’s, mostly 0’s
• Example: 2,000 bits, 2% active
• Each bit has semantic meaning (learned)

01000000000000000001000000000000000000000000000000000010000…………01000

Dense Representations
• Few bits (8 to 128)
• All combinations of 1’s and 0’s
• Example: 8-bit ASCII
• Bits have no inherent meaning (arbitrarily assigned by programmer)

01101101 = m
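
As a concrete sketch, an SDR can be stored compactly as the set of its active bit indices (illustrative Python, not NuPIC's actual API):

    import random

    N_BITS = 2048    # total bits in the representation
    N_ACTIVE = 40    # ~2% of bits are 1, the rest are 0

    def random_sdr():
        """An SDR stored as the set of its active (1) bit indices."""
        return set(random.sample(range(N_BITS), N_ACTIVE))

    sdr = random_sdr()
    print(len(sdr), "active bits out of", N_BITS)   # 40 of 2048, ~2% sparse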

SDR Properties

1) Similarity: shared bits = semantic similarity.

2) Store and compare: store the indices of the active bits (e.g. indices 1, 2, 3, 4, 5, … | 40). Subsampling is OK: a small subsample of the indices (e.g. 1, 2, … | 10) still identifies the pattern.

3) Union membership: OR many SDRs together (e.g. ten SDRs at 2% sparsity give a ~20% dense union), then ask of a new SDR: is it a member of the union?
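
A minimal sketch of the three properties, reusing the toy SDRs above (illustrative, not NuPIC's API):

    import random

    N_BITS, N_ACTIVE = 2048, 40

    def random_sdr():
        return set(random.sample(range(N_BITS), N_ACTIVE))

    # 1) Similarity: overlap (count of shared active bits) measures
    #    semantic similarity between two SDRs.
    def overlap(a, b):
        return len(a & b)

    # 2) Store and compare: keep only the active indices. A small random
    #    subsample of them still identifies the pattern almost uniquely.
    def subsample(sdr, k=10):
        return set(random.sample(sorted(sdr), k))

    # 3) Union membership: OR many SDRs together, then test membership.
    stored = [random_sdr() for _ in range(10)]
    union = set().union(*stored)          # ten 2% SDRs -> ~20% dense union
    member = stored[3]
    print(overlap(member, union) == len(member))   # True: all bits present
    print(overlap(random_sdr(), union))            # only ~7-8 bits by chance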

Neurons

Model neuron:
- Feedforward: 100’s of synapses, the “classic” receptive field
- Context: 1,000’s of synapses on active dendrites; coincident input depolarizes the neuron (“predicted state”)
- Active dendrites are pattern detectors: each cell can recognize 100’s of unique patterns
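
A minimal sketch of this model neuron (illustrative class and threshold, not NuPIC's actual code):

    ACTIVATION_THRESHOLD = 15   # active synapses needed on one distal segment

    class ModelNeuron:
        def __init__(self, feedforward, distal_segments):
            self.feedforward = feedforward          # set of input bit indices
            self.distal_segments = distal_segments  # list of sets of cell ids

        def feedforward_overlap(self, active_inputs):
            """The 'classic' receptive field: hundreds of synapses driven
            directly by the feedforward input."""
            return len(self.feedforward & active_inputs)

        def is_predicted(self, active_cells):
            """Each distal segment is a pattern detector. If any segment sees
            enough active cells, the cell is depolarized (predicted state).
            With 100+ segments, one cell recognizes 100's of patterns."""
            return any(len(seg & active_cells) >= ACTIVATION_THRESHOLD
                       for seg in self.distal_segments)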

Learning Transitions (animation)
1. Feedforward activation
2. Inhibition
3. Sparse cell activation
4. Time = 1 → Time = 2: cells form connections to previously active cells, and those connections predict future activity.

This is a first-order sequence memory. It cannot learn A-B-C-D vs. X-B-C-Y. Multiple predictions can occur at once (A-B, A-C, A-D).
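
A sketch of this first-order transition memory, with symbols standing in for cell activations (illustrative, not the CLA itself):

    from collections import defaultdict

    class FirstOrderMemory:
        def __init__(self):
            self.transitions = defaultdict(set)  # pattern -> predicted next
            self.prev = None

        def step(self, pattern):
            if self.prev is not None:
                # Form a connection to the previously active pattern.
                self.transitions[self.prev].add(pattern)
            self.prev = pattern
            return self.transitions[pattern]     # may hold many predictions

    m = FirstOrderMemory()
    for seq in ["ABCD", "XBCY"]:
        m.prev = None                    # reset between sequences
        for symbol in seq:
            m.step(symbol)
    print(m.transitions["C"])   # {'D', 'Y'}: context is lost after B-C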

High-order Sequences Require Mini-columns: A-B-C-D vs. X-B-C-Y

[Diagram: before training, the sequences A-B-C-D and X-B-C-Y share the same cells for their common elements B and C. After training they become A-B'-C'-D' and X-B''-C''-Y'': the shared elements are represented by context-specific cells.]

Same columns, but only one cell active per column.

IF 40 active columns and 10 cells per column, THEN there are 10^40 ways to represent the same input in different contexts.
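
A sketch of the idea (illustrative; the real CLA learns the per-column cell choice through distal connections rather than deriving it from a context id):

    import random

    CELLS_PER_COLUMN = 10

    def contextual_sdr(columns, context_id):
        """One cell per active column; which cell fires encodes the context."""
        rng = random.Random(context_id)
        return {(col, rng.randrange(CELLS_PER_COLUMN))
                for col in sorted(columns)}

    b_columns = {3, 17, 42, 99}                  # columns for input "B"
    b_after_a = contextual_sdr(b_columns, "A")   # B'
    b_after_x = contextual_sdr(b_columns, "X")   # B''
    print({c for c, _ in b_after_a} == {c for c, _ in b_after_x})  # True
    print(b_after_a == b_after_x)   # almost certainly False: cells differ

    # 40 active columns x 10 cells each => 10**40 context-specific codes
    # for the same column-level input.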

Cortical Learning Algorithm (CLA), aka Cellular Layer

- Converts input to sparse representations in columns
- Learns transitions
- Makes predictions and detects anomalies

Applications:
1) High-order sequence inference (L2/3)
2) Sensory-motor inference (L4)
3) Motor sequence recall (L5)

Capabilities:
- On-line learning
- High capacity
- Simple local learning rules
- Fault tolerant
- No sensitive parameters

Basic building block of the neocortex and of machine intelligence

Anomaly Detection Using CLA

[Diagram: each metric feeds an encoder, which produces an SDR, which feeds a CLA. The CLA's prediction gives a point anomaly, which is time-averaged and compared against its own history to yield an anomaly score. Pipelines for Metric 1 … Metric N run in parallel and are combined into a system anomaly score.]
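
A sketch of the scoring chain (illustrative names and windowing, not Grok's actual implementation):

    from collections import deque

    class AnomalyScorer:
        def __init__(self, window=100):
            self.recent = deque(maxlen=window)   # for the time average
            self.history = []                    # for historical comparison

        def score(self, predicted_sdr, actual_sdr):
            # Point anomaly: fraction of active bits the CLA did not predict.
            point = 1.0 - len(predicted_sdr & actual_sdr) / len(actual_sdr)
            self.recent.append(point)
            avg = sum(self.recent) / len(self.recent)   # time average
            # Historical comparison: how unusual is this average vs. the past?
            rank = sum(h < avg for h in self.history)
            likelihood = rank / len(self.history) if self.history else 0.0
            self.history.append(avg)
            return likelihood   # near 1.0 => anomalous for this metric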

Grok for Amazon AWS: “Breakthrough Science for Anomaly Detection”

- Automated model creation
- Ranks anomalous instances
- Continuously updated
- Continuously learning

The technology can be applied to any kind of data: financial, manufacturing, web sales, etc.

Unique Value of Cortical Algorithms

Application: CEPT Systems

A document corpus (e.g. Wikipedia) is converted into 100K “Word SDRs” (128 x 128 bitmaps).

Word-SDR arithmetic: Apple - Fruit = Computer
Closest matches: Macintosh, Microsoft, Mac, Linux, Operating system, …
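
A toy sketch of that arithmetic (hand-made miniature SDRs, not CEPT's 128 x 128 bitmaps): subtracting the “fruit” bits from “apple” leaves its computer-related bits, and the nearest stored SDR by overlap is the answer.

    def nearest(query, vocabulary):
        """The stored word whose SDR overlaps the query the most."""
        return max(vocabulary, key=lambda w: len(vocabulary[w] & query))

    vocabulary = {
        "fruit":     {1, 2, 3, 4},
        "computer":  {10, 11, 12, 13},
        "apple":     {1, 2, 3, 10, 11, 12},   # overlaps both senses
        "macintosh": {10, 11, 13, 14},
    }
    residue = vocabulary["apple"] - vocabulary["fruit"]   # {10, 11, 12}
    candidates = {w: s for w, s in vocabulary.items() if w != "apple"}
    print(nearest(residue, candidates))                   # computer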

Sequences of Word SDRs

Training set:
frog eats flies
cow eats grain
elephant eats leaves
goat eats grass
wolf eats rabbit
cat likes ball
elephant likes water
sheep eats grass
cat eats salmon
wolf eats mice
lion eats cow
dog likes sleep
elephant likes water
cat likes ball
coyote eats rodent
coyote eats rabbit
wolf eats squirrel
dog likes sleep
cat likes ball
…

The CLA is trained on these Word 1 → Word 2 → Word 3 sequences. Presented with “fox eats ?”, a subject it has never seen, it predicts: rodent.

1) Word SDRs are created unsupervised.
2) Semantic generalization: the SDRs generalize lexically, the CLA generalizes grammatically.
3) Commercial applications: sentiment analysis, abstraction, improved text-to-speech, dialog, reporting, etc. (www.Cept.at)
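
A toy sketch of why the unseen word “fox” can still yield “rodent” (hand-made SDR bits standing in for CEPT's): the bits “fox” shares with “wolf” and “coyote” already carry learned transitions.

    # Toy word SDRs: "fox" never occurred in training, but shares most of
    # its bits with "wolf" and "coyote".
    word_sdrs = {
        "wolf":   {1, 2, 3, 4},
        "coyote": {2, 3, 4, 5},
        "fox":    {2, 3, 4, 9},
    }
    # Transitions learned per SDR bit from the "wolf eats ..." and
    # "coyote eats ..." training sentences.
    learned = {2: {"rodent", "rabbit"}, 3: {"rodent", "mice"}, 4: {"rodent"}}

    votes = {}
    for bit in word_sdrs["fox"]:
        for obj in learned.get(bit, ()):
            votes[obj] = votes.get(obj, 0) + 1
    print(max(votes, key=votes.get))   # rodent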

Cept and Grok use the exact same code base.


NuPIC Open Source Project (created July 2013)

Source code for:
- Cortical Learning Algorithm
- Encoders
- Support libraries

Single source tree (used by Grok), GPLv3

Active and growing community:
- 80 contributors as of 2/2014
- 533 mailing list subscribers

Education resources / hackathons

www.Numenta.org

Goals For 2014

Research:
- Implement and test L4 sensory-motor inference
- Introduce hierarchy
- Publish

NuPIC:
- Grow the open source community
- Support partners, e.g. IBM Almaden, CEPT

Grok:
- Demonstrate commercial value of the CLA
- Attract resources (people and money)
- Provide a “target” and market for HW
- Explore new application areas

The Limits of Software

• One layer, no hierarchy
• 2,000 mini-columns
• 30 cells per mini-column
• Up to 128 dendrite segments per cell
• Up to 40 active synapses per segment
• Thousands of potential synapses per segment

One millionth of human cortex
50 msec per inference/training step (highly optimized)
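
A back-of-envelope calculation of that scale, using the numbers above:

    columns = 2000
    cells_per_column = 30
    segments_per_cell = 128        # up to
    synapses_per_segment = 40      # active synapses, up to

    cells = columns * cells_per_column                  # 60,000 cells
    segments = cells * segments_per_cell                # ~7.7 M segments
    synapses = segments * synapses_per_segment          # ~307 M synapses
    print(f"{cells:,} cells, {segments:,} segments, {synapses:,} synapses")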


We need HW for the usual reasons:
- Speed
- Power
- Volume

Start of supplemental material