Embedded High Performance Computing (EHPC) and Neuromorphic...
1 DISTRIBUTION A. Approved for public release; distribution unlimited (88ABW-2013-3982, 09 Sep 2013)
Integrity Service Excellence
Embedded High Performance Computing (EHPC) and Neuromorphic Computing
November 18, 2014
Mr. Mark Barnell
Senior Computer Scientist
Information Directorate
Air Force Research Laboratory
Outline
• Challenges
• Approaches
– Scalable computing & applications
– Exploit 3D integration
– New devices and models of computation
– Move “C4ISR to the edge” with EHPC
Challenges in Big Data Analytics
• Big data limits human performance in analysis and decision making
– High-performance information technologies for massive analytics enable the journey from big data to information, knowledge and wisdom.
– R&D needed to achieve trusted autonomous systems that are capable of learning, reasoning, inferencing and interacting with humans.
• Advantage in labor power could convert into global competitive advantages
– Ability to “do more, do better” with more intelligent power, less labor power, would be a “force multiplier”
• Computing hardware technologies reach physical limits in area, power and performance
– Three-dimensional integrated circuits and systems with optimized performance under size, weight and power (SWaP) constraints.
– RDT&E for nano/quantum/neuro device and system technologies.
Multi-Tiered Approach to EHPC or HPEC Challenges
FUNDAMENTAL SCIENCE
TRUSTED ARCHITECTURES
TECH DEVELOPMENT
5 DISTRIBUTION A. Approved for public release; distribution unlimited (88ABW-2011-3208, 06 Jun 2011)
The Condor Cluster
1716 Sony PlayStation 3s
• STI Cell Broadband Engine
• PowerPC PPE
• 6 SPEs
• 256 MB RAM
84 head nodes
• 6 gateway access points
• 78 compute nodes
• Intel Xeon X5650 dual-socket hexa-core
• (2) NVIDIA Tesla GPGPUs
• 39 nodes – (78) C2050
• 39 nodes – (78) C2070/5
• 48 GB RAM
FY10 DHPI Key design considerations: Price/performance & Performance/Watt
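The component counts on this slide can be cross-checked with a little arithmetic. The sketch below is only a tally of the listed figures; it assumes that every PlayStation 3 exposes 6 SPEs, that each of the 78 compute nodes carries 2 GPGPUs, and that "dual-socket hexa-core" means 12 Xeon cores per node. The variable names are illustrative and not from the original.

```python
# Figures copied from the slide; the per-node interpretation is an assumption.
PS3_COUNT = 1716
SPES_PER_PS3 = 6
GATEWAY_NODES = 6
COMPUTE_NODES = 78
GPUS_PER_NODE = 2
CORES_PER_NODE = 2 * 6  # dual-socket, hexa-core

total_spes = PS3_COUNT * SPES_PER_PS3
head_nodes = GATEWAY_NODES + COMPUTE_NODES
total_gpus = COMPUTE_NODES * GPUS_PER_NODE
total_xeon_cores = COMPUTE_NODES * CORES_PER_NODE

print(f"SPEs: {total_spes}, head nodes: {head_nodes}, "
      f"GPGPUs: {total_gpus}, Xeon cores: {total_xeon_cores}")
```

The head-node and GPGPU totals agree with the slide: 6 + 78 = 84 head nodes, and 78 + 78 = 156 GPGPUs across the two groups of 39 nodes.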
6 DISTRIBUTION A. Approved for Public Release [88ABW-2010-6225] Distribution Unlimited
Hybrid Neuromorphic Model
Input: character images
Character level: auto-associative neural networks
Word level: confabulation algorithms based on knowledge base of weighted links among letters
Sentence level: confabulation algorithms based on knowledge base of weighted links among words and phrases
Output: “Text image”
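The word-level step can be illustrated with a toy example. This is a minimal sketch, not the implementation behind the slide: it assumes the character level hands over a few candidate letters per position, and that the knowledge base is a table of co-occurrence counts between (position, letter) pairs learned from a small word list. The functions `build_knowledge_base` and `confabulate`, the log-sum scoring rule, and the training words are all hypothetical.

```python
from collections import defaultdict
from itertools import product
import math

def build_knowledge_base(words):
    """Count how often letter a at position i co-occurs with letter b at position j."""
    kb = defaultdict(int)
    for word in words:
        for i, a in enumerate(word):
            for j, b in enumerate(word):
                if i != j:
                    kb[(i, a, j, b)] += 1
    return kb

def confabulate(candidates, kb):
    """Pick one letter per position so that the summed log link weights are maximal."""
    best, best_score = None, -math.inf
    for combo in product(*candidates):
        score = 0.0
        for i, a in enumerate(combo):
            for j, b in enumerate(combo):
                if i != j:
                    # A small constant keeps unseen links from producing log(0).
                    score += math.log(kb.get((i, a, j, b), 0) + 1e-6)
        if score > best_score:
            best, best_score = "".join(combo), score
    return best

# Hypothetical training words and ambiguous character-level output.
kb = build_knowledge_base(["cat", "car", "can", "cot", "bat"])
candidates = [["c", "e"], ["a", "o"], ["t", "l"]]
result = confabulate(candidates, kb)
print(result)
```

With this toy knowledge base the best-supported combination is "cat". The exhaustive search over combinations is only practical for short words; it is used here to keep the sketch small.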
RADAR Data Processing for High Resolution Images
R. U. D. I. Cluster
176 Jetson boards (60 TFLOPS @ 2.1 kW)
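The performance-per-watt figure implied by this slide can be worked out directly. The sketch below assumes the slide's throughput figure means 60 TFLOPS of aggregate peak performance and that the 2.1 kW covers all 176 boards; both readings are assumptions, and the variable names are illustrative.

```python
# Values read from the slide; the units interpretation is an assumption.
BOARDS = 176
PEAK_TFLOPS = 60.0
POWER_KW = 2.1

gflops_per_watt = (PEAK_TFLOPS * 1e3) / (POWER_KW * 1e3)  # GFLOPS per watt
tflops_per_board = PEAK_TFLOPS / BOARDS
watts_per_board = (POWER_KW * 1e3) / BOARDS

print(f"{gflops_per_watt:.1f} GFLOPS/W, "
      f"{tflops_per_board:.2f} TFLOPS and {watts_per_board:.1f} W per board")
```

Under these assumptions the cluster delivers roughly 28.6 GFLOPS per watt, or about 0.34 TFLOPS at about 12 W per board.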
3D Integration
• Hybrid Memory Cube (HMC Consortium)
• 3D NAND Flash Memory (Toshiba)
Bio-Inspired Computing Architecture
[Figure: conventional processing (general-purpose processor, SRAM, I/O) connected through bridges and ADCs to neuromorphic computing accelerators; each accelerator block is labeled Arbiter, NCA, I/O, Cfg and Buffers.]
[Figure: Neuromorphic Computing Accelerator (NCA) with a crossbar-based computation core. Labels: input vectors X(k), k = 1...m; Vin; crossbar arrays M+ and M-; summing amplifier & comparator; Vout; V(t+1), V+(t), V-(t); config.; output buffer; training signal generation; training completed; error detection; R/W control; arbiter.]
Crossbar array of memristors
Neuromorphic models and algorithms (ANN, inference, etc.)
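The labels on this slide (two crossbar arrays M+ and M-, a summing amplifier and a comparator) suggest a differential weighted-sum computation. The following is a minimal sketch of that idea, not the accelerator's actual design: it assumes each crossbar stores non-negative conductances, that the effective weight is the difference between the two arrays, and that the comparator outputs +1 or -1 against a threshold. The function `crossbar_output` and all values are hypothetical.

```python
def crossbar_output(v_in, g_plus, g_minus, threshold=0.0):
    """Differential crossbar: each column sums v_in * (G+ - G-), then thresholds."""
    outputs = []
    for col in range(len(g_plus[0])):
        # Column current is the dot product of input voltages and conductances.
        i_plus = sum(v * g_plus[row][col] for row, v in enumerate(v_in))
        i_minus = sum(v * g_minus[row][col] for row, v in enumerate(v_in))
        diff = i_plus - i_minus          # summing amplifier stage
        outputs.append(1 if diff > threshold else -1)  # comparator stage
    return outputs

# Hypothetical conductances: 3 inputs (rows) by 2 outputs (columns).
g_plus = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
g_minus = [[0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
v_in = [1.0, -1.0, 1.0]

v_out = crossbar_output(v_in, g_plus, g_minus)
print(v_out)
```

Using a pair of arrays lets non-negative conductances represent signed weights, which is one common reason for a differential arrangement; whether that is the motivation here is not stated on the slide.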
Moving ISR to the Tactical Edge
Versatile Intelligent Sensor
Summary
• Future embedded systems are challenged to continue delivering extreme performance in a small space
• But security poses additional challenges for trust, agility and resilience
• Powerful technology drivers are still at hand to meet the challenges
– Computer architecture innovations
– Nano and quantum advances
– 3D stacking
– Algorithm development and mapping to architectures