
FearNet: Brain-Inspired Framework for Incremental Learning
Ronald Kemker and Christopher Kanan
Rochester Institute of Technology, Rochester, NY
{rmk6217, kanan}@rit.edu

Effectiveness of BLA Sub-System

[Chart: Ω_all (y-axis 0.80–1.00) on CIFAR-100, CUB-200, and AudioSet, comparing the Oracle against FearNet with the BLA.]

Rehearsal for Incremental Learning

• Incremental Class Learning – The model learns object classes one at a time, without revisiting previous classes.
• Rehearsal – The model stores previous training exemplars and replays them later in training to prevent forgetting of older memories. It is a way to incrementally update a model, but it comes with a high storage cost [3] (a minimal sketch follows Fig 1).

Fig 1. Performance improves as more exemplars per class (EPC) are stored.
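Below is a minimal sketch of exemplar rehearsal, assuming a fixed exemplar budget per class; the `RehearsalBuffer` class, feature shapes, and training loop are illustrative placeholders rather than the exact setup behind Fig 1.

```python
import numpy as np

# Minimal sketch of rehearsal: keep a fixed number of exemplars per class (EPC)
# and mix them with the data for each new class. Names and shapes are illustrative.
class RehearsalBuffer:
    def __init__(self, epc):
        self.epc = epc                # exemplars stored per class
        self.store = {}               # class id -> (epc, d) array of features

    def add_class(self, label, features):
        # Keep only the first `epc` exemplars; a smarter selection (e.g. herding) could be used.
        self.store[label] = features[: self.epc]

    def replay_set(self):
        # All stored exemplars plus their labels, for mixing into the next update.
        xs = np.concatenate(list(self.store.values()), axis=0)
        ys = np.concatenate([np.full(len(v), k) for k, v in self.store.items()])
        return xs, ys

# Usage: learn classes one at a time, replaying stored exemplars each time.
buffer = RehearsalBuffer(epc=20)
rng = np.random.default_rng(0)
for label in range(3):                       # three toy classes
    new_x = rng.normal(loc=label, size=(100, 8))
    if buffer.store:
        old_x, old_y = buffer.replay_set()
        train_x = np.concatenate([new_x, old_x])
        train_y = np.concatenate([np.full(len(new_x), label), old_y])
    else:
        train_x, train_y = new_x, np.full(len(new_x), label)
    # model.fit(train_x, train_y)  # placeholder for the actual incremental update
    buffer.add_class(label, new_x)
```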

Proposed FearNet Framework

Pseudorehearsal – FearNet uses a generative model to generate pseudo-examples. These are replayed during pre-defined sleep stages (see the sketch at the end of this section).

Fig 3. Proposed FearNet Framework

mPFC – Responsible for long-term memory storage and recall

HC – Responsible for short-term memory storage and recall

BLA – Recalls associated memory from mPFC or HC
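A simplified sketch of pseudorehearsal, assuming each consolidated class is summarized by the mean and covariance of its feature vectors, from which pseudo-examples are drawn during a sleep phase. The `ClassStatistics` class and all shapes are illustrative, and FearNet's mPFC autoencoder, which these samples would normally pass through, is omitted.

```python
import numpy as np

# Simplified pseudorehearsal sketch: each consolidated class is summarized by a
# mean and covariance; during a "sleep" phase, pseudo-examples are drawn from
# these statistics and replayed alongside the new data held in short-term (HC) storage.
class ClassStatistics:
    def __init__(self):
        self.stats = {}  # class id -> (mean vector, covariance matrix)

    def consolidate(self, label, features):
        self.stats[label] = (features.mean(axis=0), np.cov(features, rowvar=False))

    def pseudo_examples(self, n_per_class, rng):
        xs, ys = [], []
        for label, (mu, cov) in self.stats.items():
            xs.append(rng.multivariate_normal(mu, cov, size=n_per_class))
            ys.append(np.full(n_per_class, label))
        return np.concatenate(xs), np.concatenate(ys)

# Usage during a sleep phase: mix pseudo-examples with recent HC data and
# retrain the long-term (mPFC) network on the combined set.
rng = np.random.default_rng(0)
memory = ClassStatistics()
memory.consolidate(0, rng.normal(0.0, 1.0, size=(200, 16)))
memory.consolidate(1, rng.normal(3.0, 1.0, size=(200, 16)))
pseudo_x, pseudo_y = memory.pseudo_examples(n_per_class=50, rng=rng)
# mpfc.fit(np.concatenate([pseudo_x, hc_x]), ...)  # placeholder consolidation step
```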

Experimental Evaluation

\Omega_{all} = \frac{1}{T - 1} \sum_{t=2}^{T} \frac{\alpha_{all,t}}{\alpha_{offline}}

α_{all,t} → MCA over all tasks seen up to point t
α_offline → MCA for the mPFC trained offline
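A small helper, sketched from the formula above, that computes Ω_all from a list of per-session mean-class accuracies; the function name and the example numbers are hypothetical.

```python
# Hypothetical helper computing the Omega_all metric from per-session accuracies.
# `mca_all[t]` is the mean-class accuracy over all classes seen after session t
# (t = 1..T, with t = 1 the base session), and `mca_offline` is the accuracy of
# the mPFC model trained offline on the full dataset.
def omega_all(mca_all, mca_offline):
    T = len(mca_all)
    return sum(mca_all[t] / mca_offline for t in range(1, T)) / (T - 1)

# Example: accuracies for T = 5 sessions against an offline accuracy of 0.70.
print(omega_all([0.70, 0.66, 0.64, 0.62, 0.60], mca_offline=0.70))  # -> 0.90
```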

Comparison of Incremental Class Learning Frameworks

[Chart: Ω_all (y-axis 0.4–1.0) on CIFAR-100, CUB-200, AudioSet, and their mean, for 1-Nearest Neighbor, GeppNet+STM, GeppNet, FEL, iCaRL, and FearNet.]

Storage Cost for CIFAR-100

[Chart: storage cost in megabytes (log scale, 1–10,000 MB) for 1-NN, GeppNet+STM, GeppNet, FEL, iCaRL, and FearNet.]

Storing per-class statistics is more cost-effective; however, the largest cost is the per-class covariance matrix.
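A back-of-the-envelope sketch of why the covariance term dominates: per-class means grow linearly with the feature dimensionality d, while per-class covariance matrices grow quadratically. All dimensionalities below are hypothetical placeholders, not the values behind the chart above.

```python
# Illustrative storage accounting for per-class statistics (all numbers hypothetical;
# the feature dimensionality is a placeholder, not the one used in the paper).
def class_statistics_mb(num_classes, feature_dim, bytes_per_value=4):
    mean_bytes = num_classes * feature_dim * bytes_per_value                # O(d) per class
    cov_bytes = num_classes * feature_dim * feature_dim * bytes_per_value   # O(d^2) per class
    return mean_bytes / 1e6, cov_bytes / 1e6

for d in (64, 256, 1024):
    means_mb, covs_mb = class_statistics_mb(num_classes=100, feature_dim=d)
    print(f"d={d:5d}  means: {means_mb:7.2f} MB   covariances: {covs_mb:9.2f} MB")
```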

Conclusion

• State-of-the-art performance for incremental class learning on three many-class datasets.
• FearNet is more memory efficient than existing frameworks.
• Future work will focus on:
  • Learning feature representations from raw inputs
  • Using semi-parametric HC models for better discrimination
  • Using generative models that do not require storage of class statistics
  • Making the entire framework end-to-end trainable

References

1. R. Kemker, M. McClure, A. Abitino, T. L. Hayes, and C. Kanan. Measuring Catastrophic Forgetting in Neural Networks. In AAAI, 2018.
2. S. Rebuffi, A. Kolesnikov, and C. H. Lampert. iCaRL: Incremental Classifier and Representation Learning. In CVPR, 2017.
3. T. Kitamura, et al. Engrams and circuits crucial for systems consolidation of a memory. Science, 356(6333):73–78, 2017.

Recent vs. Remote Recall

y = \begin{cases} \arg\max_{k'} P_{HC}(C = k' \mid \mathbf{x}) & \text{if } \psi > \max_{k} P_{mPFC}(C = k \mid \mathbf{x}) \\ \arg\max_{k'} P_{mPFC}(C = k' \mid \mathbf{x}) & \text{otherwise} \end{cases}

\psi = \frac{A(\mathbf{x}) \max_{k} P_{HC}(C = k \mid \mathbf{x})}{1 - A(\mathbf{x})}, \qquad A(\mathbf{x}) \in [0, 1]

Fig 4. BLA trained with data stored in HC and pseudo-examples generated by mPFC.

Prediction is made from the probabilities given by the HC (P_HC), the mPFC (P_mPFC), and the BLA (A(x)).
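A sketch of the recall rule above, assuming the HC and mPFC each output a class-posterior vector and the BLA outputs A(x) ∈ [0, 1]; the toy probability vectors and the `predict` helper are illustrative only (in FearNet the HC and mPFC actually cover different class sets).

```python
import numpy as np

# Sketch of the recall rule: the BLA output A(x) arbitrates between the
# HC and mPFC class posteriors via the threshold psi.
def predict(p_hc, p_mpfc, a_x, eps=1e-8):
    # psi = A(x) * max_k P_HC(C=k|x) / (1 - A(x))
    psi = a_x * np.max(p_hc) / (1.0 - a_x + eps)
    if psi > np.max(p_mpfc):
        return int(np.argmax(p_hc))    # recent memory: trust the HC
    return int(np.argmax(p_mpfc))      # remote memory: trust the mPFC

# Example: the BLA is fairly confident the memory is recent (A(x) = 0.7).
p_hc = np.array([0.05, 0.90, 0.05])    # HC posterior over its stored classes
p_mpfc = np.array([0.40, 0.35, 0.25])  # mPFC posterior over consolidated classes
print(predict(p_hc, p_mpfc, a_x=0.7))  # -> 1 (HC prediction wins)
```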

Catastrophic Forgetting

• Neural networks are incapable of incrementally learning new information without forgetting existing memories.
• The best mitigation technique is combining the old and new data and retraining the model from scratch – inefficient for large models and datasets!

CIFAR-100 = Popular image classification dataset (100 classes)

CUB-200 = Fine-grained bird classification dataset (200 classes)

AudioSet = Audio classification dataset (100 classes)

Complementary Learning Systems

FearNet is heavily inspired by the dual-memory model of mammalian memory.

• Hippocampus (HC) – Thought to play a role in the recall of recent (i.e., short-term) memories.
• Medial prefrontal cortex (mPFC) – Responsible for the recall of remote (i.e., long-term) memories.

Fig 2. Formation of recent memories in HC and their gradual consolidation into mPFC as remote memories. Image courtesy of Kitamura et al. (2017).