Page 1

Exploiting Cognitive Constraints To Improve Machine-Learning Memory Models

Michael C. Mozer
Department of Computer Science
University of Colorado, Boulder

Page 2

Why Care About Human Memory?

The neural architecture of human vision has inspired computer vision. Perhaps the cognitive architecture of memory can inspire the design of RAM systems.

Understanding human memory is essential for ML systems that predict what information will be accessible or interesting to people at any moment.

E.g., selecting material for students to review to maximize long-term retention (Lindsey et al., 2014)

Page 3

The World’s Most Boring Task

Stimulus X -> Response a
Stimulus Y -> Response b

[Figure: frequency distributions of response latency]

Page 4

Sequential Dependencies

Dual Priming Model (Wilder, Jones, & Mozer, 2009; Jones, Curran, Mozer, & Wilder, 2013)

Recent trial history leads to expectation of next stimulus

Response latencies are fast when reality matches the expectation

Expectation is based on exponentially decaying traces of two different stimulus properties
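
A minimal Python sketch of this kind of mechanism (the choice of the two tracked properties, the decay rates, the equal mixing weights, and the latency scaling are illustrative assumptions, not the fitted components of the published model):

def update_trace(trace, observation, decay):
    # Exponentially decaying trace: recent observations dominate older ones.
    return decay * trace + (1.0 - decay) * observation

identity_trace = 0.5    # decaying estimate that the next stimulus is "1" (base rate)
repetition_trace = 0.5  # decaying estimate that the next stimulus repeats the previous one
previous = None
for stimulus in [1, 1, 0, 1, 0, 0, 1]:  # toy binary stimulus sequence
    if previous is not None:
        # Combine the two property traces into an expectation for this trial.
        expected = 0.5 * identity_trace + 0.5 * (repetition_trace if previous == 1 else 1.0 - repetition_trace)
        surprise = abs(stimulus - expected)   # mismatch between reality and expectation
        latency_ms = 300 + 200 * surprise     # faster responses when expectation is met
        print(f"stimulus={stimulus}  expected={expected:.2f}  latency={latency_ms:.0f} ms")
    identity_trace = update_trace(identity_trace, stimulus, decay=0.7)
    if previous is not None:
        repetition_trace = update_trace(repetition_trace, int(stimulus == previous), decay=0.7)
    previous = stimulus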

Page 5

Examining Longer-Term Dependencies (Wilder, Jones, Ahmed, Curran, & Mozer, 2013)

Page 6

Declarative Memory

Cepeda, Vul, Rohrer, Wixted, & Pashler (2008)

[Figure: study–test paradigm timeline]

Page 7

Forgetting Is Influenced By The Temporal Distribution Of Study

Spaced study produces more robust & durable learning than massed study

Page 8

Experimental Paradigm To Study Spacing Effect

Page 9

Cepeda, Vul, Rohrer, Wixted, & Pashler (2008)

[Figure: % recall as a function of intersession interval (days)]

Page 10

Optimal Spacing Between Study Sessions as a Function of Retention Interval

Page 11

Predicting The Spacing Curve

[Diagram: a characterization of the student and domain, the intersession interval, and forgetting after one session feed into the Multiscale Context Model, which outputs predicted recall]

[Figure: predicted % recall as a function of intersession interval (days)]

Page 12

Multiscale Context Model (Mozer et al., 2009)

Neural network

Explains spacing effects

Multiple Time Scale Model (Staddon, Chelaru, & Higa, 2002)

Cascade of leaky integrators

Explains rate-sensitive habituation

Kording, Tenenbaum, Shadmehr (2007)

Kalman filter

Explains motor adaptation

Page 13

Key Features Of Models

Each time an event occurs in the environment…

A memory of this event is stored via multiple traces

Traces decay exponentially at different rates

Memory strength is a weighted sum of the traces

Slower scales are downweighted relative to faster scales

Slower scales store memory (learn) only when faster scales fail to predict the event (a compact formalization is sketched below)

[Figure: trace strength over time for fast, medium, and slow traces, which are summed to give memory strength]
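
A compact formalization of these features (the notation and the specific form of the error-gated update are my own illustration, not equations from the slides): with per-scale traces $x_i$, fixed time constants $\tau_1 < \tau_2 < \dots$, and readout weights $\gamma_1 > \gamma_2 > \dots$,

$$x_i(t) = x_i(t_k)\, e^{-(t - t_k)/\tau_i}, \qquad m(t) = \sum_i \gamma_i\, x_i(t),$$

and one simple way to make slower scales learn only when faster scales fail to predict the event is to gate every trace's increment at an event by the shared prediction error,

$$\Delta x_i \propto \max\bigl(0,\; 1 - m(t^{-})\bigr).$$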

Page 14

[Figure: two timelines marking event occurrences over time]

Page 15

Exponential Mixtures ➜ Scale Invariance

An infinite mixture of exponentials gives exactly a power function

A finite mixture of exponentials gives a good approximation to a power function

With appropriately chosen mixture weights and decay rates, the mixture can fit arbitrary power functions

[Figure: several exponential decay curves summing to approximate a power function]
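
One standard way to see the infinite-mixture claim: if the decay rates $\lambda$ are drawn from a Gamma($\alpha$, $\beta$) mixing distribution (my choice of parameterization for illustration; the slide does not specify one), the mixture is exactly a power function of the retention interval $t$:

$$\int_0^\infty e^{-\lambda t}\, \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, \lambda^{\alpha-1} e^{-\beta\lambda}\, d\lambda \;=\; \Bigl(\frac{\beta}{\beta + t}\Bigr)^{\alpha} \;=\; (1 + t/\beta)^{-\alpha},$$

so the shape and scale of the mixing distribution pick out which power function is matched, and a finite set of well-chosen rates approximates the same curve.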

Page 16

Relationship To Memory Models In Ancient NN Literature

Focused back prop (Mozer, 1989), LSTM (Hochreiter & Schmidhuber, 1997)

Little/no decay

Multiscale backprop (Mozer, 1992), Tau net (Nguyen & Cottrell, 1997)

Learned decay constants

No enforced dominance of fast scales over slow scales

Hierarchical recurrent net (El Hihi & Bengio, 1995)

Fixed decay constants

History compression (Schmidhuber, 1992; Schmidhuber, Mozer, & Prelinger, 1993)

Event based, not time based

Page 17

Sketch of Multiscale Memory Module

x_t: activation of the ‘event’ in the input to be remembered, in [0,1]

m_t: memory trace strength at time t

Activation rule (memory update) is based on the error between the input x_t and the current memory state

Activation rule is consistent with the 3 models (for the Kording model, ignore the KF uncertainty)

This update is differentiable ➜ can back-prop through the memory module

Redistributes activation across time scales in a manner that depends on the temporal distribution of input events

Could add an output gate as well to make it even more LSTM-like

[Diagram: memory module with input x_t and output m_t; a fixed decay (∆) on the self-recurrent connection, learned weights, and +1/−1 error connections]

Page 18

Sketch of Multiscale Memory Module

Pool of self-recurrent neurons with fixed time constants

Input is the response of a feature-detection neuron

This memory module stores the particular feature that is detected

When the feature is present, the memory updates; the update depends on the error between the memory state and the input (x_t indicates whether the feature is detected at time t)

When the feature is detected, the memory state is compared to the input, and a correction is made so that the memory strongly represents the input (a code sketch of this update follows below)

[Diagram: the memory module circuit as on the previous slide, driven by the output of the feature-detection neuron]
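
A minimal NumPy sketch of a memory module along these lines (the number of scales, the decay constants, the error weights, and the readout weights are illustrative; the error weights would be learned in a full model, and this is a sketch of the idea rather than the exact module from the talk):

import numpy as np

class MultiscaleMemory:
    """Pool of self-recurrent traces with fixed decay constants and an
    input-gated, error-driven update; every operation is differentiable."""

    def __init__(self, decays=(0.5, 0.9, 0.99)):
        self.decays = np.array(decays)               # fixed time constants, fast -> slow
        self.error_gain = np.array([0.6, 0.3, 0.1])  # learned in a full model; fixed here
        self.readout = np.array([0.5, 0.3, 0.2])     # mixing weights for reading out m_t
        self.traces = np.zeros(len(decays))          # per-scale trace strengths

    def step(self, x_t):
        """x_t in [0, 1]: activation of the feature to be remembered."""
        m_t = float(self.readout @ self.traces)      # current memory strength for the feature
        error = x_t - m_t                            # mismatch between input and memory
        # Decay every trace, then apply a correction gated by the input: when the
        # feature is present and poorly predicted, the traces grow, with slower
        # scales receiving a smaller share of the correction than faster ones.
        self.traces = self.decays * self.traces + x_t * self.error_gain * error
        return m_t

mem = MultiscaleMemory()
for t, x in enumerate([1, 0, 0, 0, 1, 0, 0, 0, 0, 0]):
    print(t, x, round(mem.step(x), 3))

Because every operation above is smooth, gradients flow through step, which is what lets such a module be trained end to end by back-propagation inside a larger network.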

Page 19

Why Care About Human Memory?

Understanding human memory is essential for ML systems that predict what information will be accessible or interesting to people at any moment.

E.g., shopping patterns

E.g., pronominal reference

E.g., music preferences