FearNet: Brain-Inspired Framework for Incremental Learning
Ronald Kemker and Christopher Kanan
Rochester Institute of Technology, Rochester, NY
{rmk6217, kanan}@rit.edu
Catastrophic Forgetting
Experimental Results
Discussion
[Bar chart: Ω_all from 0.8 to 1.0 on CIFAR-100, CUB-200, and AudioSet, comparing Oracle vs. With BLA]
Rehearsal for Incremental Learning
• Incremental Class Learning – The model learns object classes one at a
time, without revisiting previous classes.
• Rehearsal - The model stores previous training exemplars and replays
them later in training to prevent forgetting of older memories. It’s a way
to incrementally update a model, but it comes with a high storage cost [3].
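The rehearsal strategy described above can be sketched as a simple exemplar buffer. This is a minimal illustrative sketch; `RehearsalBuffer` and its method names are ours, not from the poster:

```python
import random

class RehearsalBuffer:
    """Stores a fixed number of exemplars per class (EPC) and mixes
    them back into training batches to mitigate forgetting."""

    def __init__(self, exemplars_per_class):
        self.epc = exemplars_per_class
        self.buffer = {}  # class label -> list of stored exemplars

    def store(self, x, y):
        """Keep at most `epc` exemplars for class y."""
        slot = self.buffer.setdefault(y, [])
        if len(slot) < self.epc:
            slot.append(x)

    def replay_batch(self, k):
        """Sample k (x, y) pairs of old exemplars to mix with new data."""
        pairs = [(x, y) for y, xs in self.buffer.items() for x in xs]
        return random.sample(pairs, min(k, len(pairs)))

    def storage_cost(self):
        """Number of stored exemplars -- the price rehearsal pays."""
        return sum(len(xs) for xs in self.buffer.values())
```

The `storage_cost` method makes the trade-off in Fig 1 explicit: accuracy rises with more exemplars per class, but so does the memory footprint.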
Fig 1. Performance improves as more exemplars per class (EPC) are stored
Proposed FearNet Framework
Pseudorehearsal – FearNet uses a generative model to generate pseudo-examples. These are replayed during pre-defined sleep stages.
Fig 3. Proposed FearNet Framework
mPFC – Responsible for long-term memory storage and recall
HC – Responsible for short-term memory storage and recall
BLA – Recalls associated memory from mPFC or HC
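One way to realize pseudorehearsal with per-class statistics, consistent with the storage-cost discussion elsewhere on the poster (the long-term store keeps a mean and covariance per class and samples pseudo-examples from them during "sleep"). This is an illustrative sketch, not the authors' implementation:

```python
import numpy as np

class LongTermStore:
    """Keeps only per-class Gaussian statistics instead of raw exemplars,
    then samples pseudo-examples from them for replay."""

    def __init__(self):
        self.stats = {}  # class label -> (mean, covariance)

    def consolidate(self, features, label):
        """Replace raw exemplars for a class with its Gaussian statistics."""
        mu = features.mean(axis=0)
        cov = np.cov(features, rowvar=False)
        self.stats[label] = (mu, cov)

    def generate_pseudo_examples(self, label, n, rng=None):
        """Sample n pseudo-examples for replay during a sleep stage."""
        rng = rng or np.random.default_rng(0)
        mu, cov = self.stats[label]
        return rng.multivariate_normal(mu, cov, size=n)
```

Because only the statistics survive consolidation, replay costs no exemplar storage, at the price of keeping a covariance matrix per class.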
Experimental Evaluation
$$\Omega_{all} = \frac{1}{T-1} \sum_{t=2}^{T} \frac{\alpha_{all,t}}{\alpha_{offline}}$$

$\alpha_{all,t}$ → MCA for all tasks seen up to point $t$
$\alpha_{offline}$ → MCA for mPFC trained offline
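The Ω_all metric can be computed directly from per-session accuracies. A small helper (naming is ours, for illustration):

```python
def omega_all(acc_per_task, acc_offline):
    """Normalized incremental-learning score:
    Omega_all = (1/(T-1)) * sum_{t=2}^{T} acc[t] / acc_offline.

    acc_per_task[i] is the mean-class accuracy over all classes seen
    after study session t = i + 1, so acc_per_task[1:] covers t = 2..T.
    acc_offline is the MCA of the same model trained offline on all data.
    """
    T = len(acc_per_task)
    return sum(a / acc_offline for a in acc_per_task[1:]) / (T - 1)
```

Values near 1 mean the incremental learner tracks its own offline upper bound; Ω_all can exceed 1 if an incremental session beats offline training.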
[Bar chart: Ω_all from 0.4 to 1.0 on CIFAR-100, CUB-200, AudioSet, and their mean, for 1-Nearest Neighbor, GeppNet+STM, GeppNet, FEL, iCaRL, and FearNet]
Comparison of Incremental Class Learning Frameworks
Effectiveness of BLA Sub-System
Storage Cost for CIFAR-100
[Bar chart, log scale from 1 to 10,000 megabytes: per-method storage cost for 1-NN, GeppNet+STM, GeppNet, FEL, iCaRL, and FearNet]
Storing per-class statistics is more cost-effective; however, the largest cost is the per-class covariance matrix.
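A back-of-the-envelope comparison of the two storage strategies makes this concrete. All parameter values here are hypothetical, chosen only to show how the d² covariance term dominates:

```python
def exemplar_cost(num_classes, epc, d, bytes_per_float=4):
    """Rehearsal: epc exemplars per class, each a d-dim feature vector."""
    return num_classes * epc * d * bytes_per_float

def statistics_cost(num_classes, d, bytes_per_float=4):
    """Pseudorehearsal: one mean (d floats) plus one covariance matrix
    (d*d floats) per class -- the covariance dominates for large d."""
    return num_classes * (d + d * d) * bytes_per_float
```

For a 100-class problem with 256-dimensional features, the per-class covariance accounts for nearly all of the statistics cost, which is why the poster flags it as the largest remaining expense.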
Conclusion
• State-of-the-art performance for incremental class learning on three many-class datasets.
• FearNet is more memory efficient than existing frameworks.
• Future work will focus on
• Learning feature representation from raw inputs
• Using semi-parametric HC models for better discrimination
• Using generative models that don’t require storage of class statistics
• Making the entire framework end-to-end trainable.
References
1. R. Kemker, M. McClure, A. Abitino, T. L. Hayes, and C. Kanan. Measuring catastrophic forgetting in neural networks. In AAAI, 2018.
2. S. Rebuffi, A. Kolesnikov, and C. H. Lampert. iCaRL: Incremental classifier and representation learning. In CVPR, 2017.
3. T. Kitamura, et al. Engrams and circuits crucial for systems consolidation of a memory. Science, 356(6333):73–78, 2017.
Recent vs. Remote Recall
$$y = \begin{cases} \operatorname{argmax}_{k'} P_{HC}(C = k' \mid \mathbf{x}) & \text{if } \psi > \max_{k} P_{mPFC}(C = k \mid \mathbf{x}) \\ \operatorname{argmax}_{k'} P_{mPFC}(C = k' \mid \mathbf{x}) & \text{otherwise} \end{cases}$$

$$\psi = \left(1 - A(\mathbf{x})\right)^{-1} A(\mathbf{x}) \max_{k} P_{HC}(C = k \mid \mathbf{x}), \qquad A(\mathbf{x}) \in [0, 1]$$
Fig 4. BLA trained with data stored in HC and pseudoexamples generated by mPFC
Prediction is made using the probabilities from HC ($P_{HC}$), mPFC ($P_{mPFC}$), and BLA ($A(\mathbf{x})$)
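The recall rule above can be sketched in a few lines. This is an illustrative implementation of the displayed decision rule; the function and argument names are ours:

```python
import numpy as np

def bla_predict(p_hc, p_mpfc, a):
    """Choose between HC and mPFC predictions using the BLA gate A(x).

    p_hc, p_mpfc: class-probability vectors from the two memory systems.
    a: BLA output A(x) in [0, 1), its confidence the memory lives in HC.
    Returns argmax over P_HC if psi exceeds the best mPFC probability,
    otherwise argmax over P_mPFC, with
    psi = A(x) * max_k P_HC(C=k|x) / (1 - A(x)).
    """
    psi = a * np.max(p_hc) / (1.0 - a)
    if psi > np.max(p_mpfc):
        return int(np.argmax(p_hc))
    return int(np.argmax(p_mpfc))
```

Intuitively, a high A(x) inflates ψ, so a confident HC wins for recently learned classes, while consolidated (remote) classes fall through to mPFC.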
• Neural networks are incapable of incrementally learning new information without catastrophically forgetting existing memories
• The best mitigation technique is to combine the old and new data and retrain the model from scratch – inefficient for large models and datasets!
CIFAR-100 = Popular image classification dataset (100 classes)
CUB-200 = Fine-grained bird classification dataset (200 classes)
AudioSet = Audio classification dataset (100 classes)
Complementary Learning Systems
FearNet is heavily inspired by the dual-memory model of mammalian memory.
• Hippocampus (HC) - Thought to play a role in the recall of recent
(i.e., short-term) memories
• medial Pre-Frontal Cortex (mPFC) - Responsible for the recall of
remote (i.e., long-term) memories.
Fig 2. Formation of recent memories in HC and their gradual consolidation into mPFC as remote memories. Image courtesy of Kitamura et al. (2017)