
Transcript
Page 1: FearNet: Brain-Inspired Framework for Incremental Learning

FearNet: Brain-Inspired Framework for Incremental Learning

Ronald Kemker and Christopher Kanan

Rochester Institute of Technology, Rochester, NY

{rmk6217, kanan}@rit.edu

Catastrophic Forgetting

Experimental Results

Discussion

[Chart: Effectiveness of BLA Sub-System. Y-axis: $\Omega_{all}$ (0.8 to 1.0); bars: Oracle vs. With BLA on CIFAR-100, CUB-200, and AudioSet]

Rehearsal for Incremental Learning

• Incremental Class Learning – The model learns object classes one at a time, without revisiting previous classes.

• Rehearsal – The model stores previous training exemplars and replays them later in training to prevent forgetting of older memories. It is a way to incrementally update a model, but it comes with a high storage cost [3] (see the sketch after Fig 1 below).

Fig 1. Performance improves as more exemplars per class (EPC) are stored
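As a rough illustration of the rehearsal scheme described above (not the authors' code; the buffer class, the EPC value, and the `model.fit` API are all assumptions), a minimal Python sketch:

```python
import random

# Hypothetical rehearsal buffer: keep up to EPC stored exemplars per class
# and replay them alongside each new class's data. All names are illustrative.
EPC = 20  # exemplars per class; Fig 1 varies this value

class RehearsalBuffer:
    def __init__(self, epc=EPC):
        self.epc = epc
        self.store = {}  # class label -> list of stored exemplars

    def add_class(self, label, examples):
        # Keep only a random subset of EPC exemplars; storage still grows
        # linearly with the number of classes seen (the high storage cost).
        self.store[label] = random.sample(examples, min(self.epc, len(examples)))

    def replay_set(self):
        # Flatten stored exemplars from all previously learned classes.
        return [ex for exs in self.store.values() for ex in exs]

def incremental_step(model, buffer, label, new_examples):
    # Train on the new class mixed with replayed old exemplars; the replay
    # is what prevents forgetting in rehearsal schemes.
    train_data = new_examples + buffer.replay_set()
    model.fit(train_data)  # assumed model API
    buffer.add_class(label, new_examples)
```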

Proposed FearNet Framework

Pseudorehearsal – FearNet uses a generative model to generate pseudo-examples. These are replayed during pre-defined sleep stages (see the sketch after Fig 3 below).

Fig 3. Proposed FearNet Framework
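A minimal pseudorehearsal sketch, assuming (as the storage discussion below suggests) that each consolidated class keeps Gaussian statistics (mean, covariance) and that the mPFC autoencoder's decoder maps samples back to feature space; `decoder`, `class_stats`, and the `mpfc` methods are illustrative assumptions, not the released implementation:

```python
import numpy as np

def generate_pseudoexamples(class_stats, decoder, n_per_class=10):
    # Sample from each class's stored Gaussian and decode the samples
    # into pseudo-examples in feature space.
    xs, ys = [], []
    for label, (mu, sigma) in class_stats.items():
        z = np.random.multivariate_normal(mu, sigma, size=n_per_class)
        xs.append(decoder(z))
        ys.extend([label] * n_per_class)
    return np.concatenate(xs), np.array(ys)

def sleep_phase(mpfc, hc_x, hc_y, class_stats):
    # Consolidation: train mPFC on recent HC memories mixed with
    # pseudo-examples of older classes so old memories are not overwritten.
    px, py = generate_pseudoexamples(class_stats, mpfc.decode)
    mpfc.train(np.concatenate([hc_x, px]), np.concatenate([hc_y, py]))
```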

• mPFC – Responsible for long-term memory storage and recall

• HC – Responsible for short-term memory storage and recall

• BLA – Recalls associated memory from mPFC or HC

Experimental Evaluation

$$\Omega_{all} = \frac{1}{T-1} \sum_{t=2}^{T} \frac{\alpha_{all,t}}{\alpha_{offline}}$$

$\alpha_{all,t}$ → MCA for all tasks seen up to point $t$

$\alpha_{offline}$ → MCA for mPFC trained offline
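A short sketch of computing this metric from a sequence of mean-class accuracies (the accuracy values below are made up for illustration):

```python
import numpy as np

# alpha_all[i] is the MCA over all classes seen after task i+1, and
# alpha_offline is the MCA of an mPFC model trained offline on all data.
def omega_all(alpha_all, alpha_offline):
    alpha_all = np.asarray(alpha_all, dtype=float)
    T = len(alpha_all)
    # The sum runs from the second task (t = 2) through task T.
    return float((alpha_all[1:] / alpha_offline).sum() / (T - 1))

# Illustrative values only: accuracy decays slightly as tasks are added.
print(omega_all([0.90, 0.85, 0.82, 0.80], alpha_offline=0.92))
```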

[Chart: Comparison of Incremental Class Learning Frameworks. Y-axis: $\Omega_{all}$ (0.4 to 1.0); bars: 1-Nearest Neighbor, GeppNet+STM, GeppNet, FEL, iCaRL, and FearNet on CIFAR-100, CUB-200, AudioSet, and their mean]

[Chart: Storage Cost for CIFAR-100. Y-axis: megabytes (log scale, 1 to 10,000); bars: 1-NN, GeppNet+STM, GeppNet, FEL, iCaRL, and FearNet]

Storing per-class statistics is more cost-effective; however, the largest cost is the per-class covariance matrix (see the sketch below).
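A back-of-the-envelope comparison of the two storage strategies; the 32-bit floats, the 256-dimensional embedding, and the EPC value are illustrative assumptions, not numbers from the poster:

```python
# K classes, d-dimensional feature vectors, 4 bytes per float32.
def rehearsal_bytes(K, epc, d):
    return K * epc * d * 4  # raw stored exemplars

def statistics_bytes(K, d, diagonal_cov=False):
    mean = K * d * 4
    # The d x d covariance matrix dominates unless a diagonal
    # approximation is used.
    cov = K * (d if diagonal_cov else d * d) * 4
    return mean + cov

d, K = 256, 100  # e.g., 100 CIFAR-100 classes with 256-d embeddings (assumed)
print(rehearsal_bytes(K, epc=20, d=d) / 1e6, "MB for rehearsal")
print(statistics_bytes(K, d) / 1e6, "MB with full per-class covariances")
print(statistics_bytes(K, d, diagonal_cov=True) / 1e6, "MB with diagonal covariances")
```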

Conclusion

• State-of-the-art performance for incremental class learning on three many-class datasets.

• FearNet is more memory efficient than existing frameworks.

• Future work will focus on:

  • Learning feature representations from raw inputs

  • Using semi-parametric HC models for better discrimination

  • Using generative models that don't require storage of class statistics

  • Making the entire framework end-to-end trainable

References

1. R. Kemker, M. McClure, A. Abitino, T. L. Hayes, and C. Kanan. Measuring catastrophic forgetting in neural networks. In AAAI, 2018.

2. S. Rebuffi, A. Kolesnikov, and C. H. Lampert. iCaRL: Incremental classifier and representation learning. In CVPR, 2017.

3. T. Kitamura et al. Engrams and circuits crucial for systems consolidation of a memory. Science, 356(6333):73–78, 2017.

Recent vs. Remote Recall

$$y = \begin{cases} \operatorname{argmax}_{k'} P_{HC}(C = k' \mid \mathbf{x}) & \text{if } \psi > \max_{k} P_{mPFC}(C = k \mid \mathbf{x}) \\ \operatorname{argmax}_{k'} P_{mPFC}(C = k' \mid \mathbf{x}) & \text{otherwise} \end{cases}$$

$$\psi = \big(1 - A(\mathbf{x})\big)^{-1} \max_{k} P_{HC}(C = k \mid \mathbf{x}) \, A(\mathbf{x}), \qquad A(\mathbf{x}) \in [0, 1]$$

Fig 4. BLA trained with data stored in HC and pseudoexamples generated by mPFC

Prediction is made from the probabilities output by HC ($P_{HC}$), mPFC ($P_{mPFC}$), and BLA ($A(\mathbf{x})$); a sketch follows below.
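A minimal sketch of the recall rule above (illustrative, not the authors' code), assuming `p_hc` and `p_mpfc` are class-probability vectors for input x and `a` = A(x) is BLA's estimate that x's memory resides in HC:

```python
import numpy as np

def predict(p_hc, p_mpfc, a, eps=1e-8):
    # psi = (1 - A(x))^(-1) * max_k P_HC(C = k | x) * A(x)
    psi = np.max(p_hc) * a / (1.0 - a + eps)
    if psi > np.max(p_mpfc):
        return int(np.argmax(p_hc))    # recall from the short-term store (HC)
    return int(np.argmax(p_mpfc))      # otherwise recall from mPFC

# Example: HC is confident and BLA points to HC, so HC's class wins.
print(predict(np.array([0.1, 0.9]), np.array([0.6, 0.4]), a=0.8))
```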

• Neural networks are incapable of incrementally learning new information without catastrophically forgetting existing memories.

• The best mitigation technique is combining the old and new data and retraining the model from scratch, which is inefficient for large models and datasets! (See the sketch below.)
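The naive mitigation just described, as a short sketch (all names are illustrative):

```python
def naive_update(make_model, old_data, new_data):
    data = old_data + new_data   # storage grows without bound
    model = make_model()         # previous weights are discarded
    model.fit(data)              # full retraining cost on every update
    return model, data
```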

CIFAR-100 = Popular image classification dataset (100 classes)

CUB-200 = Fine-grained bird classification dataset (200 classes)

AudioSet = Audio classification dataset (100 classes)

Complementary Learning Systems

FearNet is heavily inspired by the dual-memory model of mammalian memory.

• Hippocampus (HC) – Thought to play a role in the recall of recent (i.e., short-term) memories

• medial Pre-Frontal Cortex (mPFC) – Responsible for the recall of remote (i.e., long-term) memories

Fig 2. Formation of recent memories in HC and their gradual consolidation into mPFC as remote memories. Image courtesy of Kitamura et al. (2017).