© 2019 Toshiba Memory Corporation
Nonvolatile Memory Technology for Future Computing
Latest Innovations : Device, process and system technologies
Jun. 27, 2019
Kazunari Ishimaru
Senior Fellow, IEEE Fellow
Institute of Memory Technology Research & Development
Toshiba Memory Corporation
© 2019 Toshiba Memory Corporation, LID 2019, K. Ishimaru
http://www.news.com.au/opinion/so-busy-snapping-we-miss-the-moment/story-fnh4jt54-1226597915916
2005
The faithful gathered near St. Peter's
Square at the Vatican,
to witness Pope John Paul II's.
2013
St. Peter's Square at the Vatican,
during Pope Francis's appearance
on March 13, 2013.
Past : Digital Age
Now : Smart Fab. (Yokkaichi Operation)
Promoting productivity improvement by using Big-data
Automated transport system
M/C requires precise control
AI based analytical tools already introduced
Source : FMS Keynote 2018, Toshiba Memory
[Figure: Smart fab data flow. Labels: Production M/C, Inspection M/C, Database (2 billion/day), Data analysis, Control, Real-time Analysis/Control]
Info-plosion
[Figure: Data generated per year, split into Stored and Not Stored (Real-Time Data), peak label 175; IoT Connected Devices >50B @ 2020]
Source : IDC’s Data Age 2025, April 2017
[Figure: Where Data is Stored [EB], 2010 to 2025, vertical axis 0 to 8,000, by category: Enterprise, PCs, Entertainment, Mobile]
Source : IDC Global Datasphere, April 2017
In 2025, data generation exceeds 175 ZB, but less than 10 ZB of it is stored.
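The gap between generated and stored data is easier to judge as a ratio. The following is a minimal sketch that only restates the two figures quoted on this slide (175 ZB generated, under 10 ZB stored); the function and constant names are illustrative and not part of the original talk.

```python
# Figures quoted on the slide (IDC Data Age 2025).
GENERATED_ZB = 175.0          # data generated in 2025
STORED_ZB_UPPER_BOUND = 10.0  # slide says "<10ZB", so this is an upper bound


def stored_fraction(generated_zb: float, stored_zb: float) -> float:
    """Return the share of generated data that ends up stored."""
    if generated_zb <= 0:
        raise ValueError("generated_zb must be positive")
    return stored_zb / generated_zb


if __name__ == "__main__":
    frac = stored_fraction(GENERATED_ZB, STORED_ZB_UPPER_BOUND)
    print(f"Stored share in 2025: under {frac:.1%}")
```

With the slide's figures this prints "Stored share in 2025: under 5.7%", i.e. well over 90% of generated data is never stored.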
Issue : Energy Efficiency
[Figure: Electricity usage (TWh) of Data Centers 2015-2025, with Traffic (ZB/y). Typical Case: 2021→2025 x6.1. Best Case: 2021→2025 x4.2. Reference: Consumption in France, 475 TWh (2017)]
Source : 2017 paper "Total consumer power consumption forecast" by Anders Andrae
[Figure: Power consumption by components (server), 2012 and 2017]
Source : 2013, 2018 book “The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines”
Electricity usage of data centers increases exponentially.
Energy efficiency improvement of the system is crucial.
System Bottleneck
[Figure: Processor and DRAM separated by the "memory wall" (von Neumann bottleneck)]
[Figure: Energy [pJ], log scale from 0.01 to 10,000, for ADD Integer (32bit), MULT Integer (32bit), SRAM 8KB (32bit), SRAM 32KB (32bit), SRAM 1MB (32bit), and DRAM; annotated gap ~1000x]
“Computing’s Energy Problem (and what we can do about it)”, M. Horowitz, ISSCC 2014
The current system is not energy efficient, because…
What are processors doing?
Share of CPU time spent talking to memories:
Genomics : ~95%
Language Processing : ~80%
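The ~1000x energy gap on this slide can be turned into a rough energy budget. This is a minimal sketch under stated assumptions: the slide does not spell out the baseline for "~1000x", so it is treated here as the ratio between one DRAM access and one on-chip arithmetic operation; the function name and the sample workload mix (100 operations per DRAM access) are hypothetical.

```python
# "~1000x" from the slide, read as DRAM access energy / arithmetic op energy.
# This reading of the baseline is an assumption.
DRAM_TO_OP_ENERGY_RATIO = 1000.0


def memory_energy_share(num_ops: int, num_dram_accesses: int,
                        ratio: float = DRAM_TO_OP_ENERGY_RATIO) -> float:
    """Fraction of total energy spent on DRAM accesses.

    Energy is counted in units of one arithmetic operation, so an
    operation costs 1 and a DRAM access costs `ratio`.
    """
    compute = float(num_ops)
    memory = ratio * num_dram_accesses
    total = compute + memory
    if total == 0:
        raise ValueError("workload is empty")
    return memory / total


if __name__ == "__main__":
    # Hypothetical workload: 100 arithmetic operations per DRAM access.
    share = memory_energy_share(num_ops=100, num_dram_accesses=1)
    print(f"Energy spent on DRAM: {share:.1%}")
```

Even with 100 operations per DRAM access, this simple model attributes about 90.9% of the energy to memory traffic, which is the motivation for moving computation closer to memory.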
Requirement for Deep Learning (Inference)
[Figure: Cloud, Fog, Edge hierarchy. Issues labeled: large latency, low throughput, security/privacy]
Cloud, Fog, and Edge all require energy-efficient systems
Computing Systems
[Figure: Three computing systems, compared on Efficiency (Energy/Speed) and Flexibility (Memory size/Application)]
von Neumann : PE and Memory in separate packages (between PKG)
Near Memory : PE and Memory in one package (in PKG)
In Memory : PE + Memory merged
Each system architecture has a trade-off between efficiency and flexibility
System requirements
Memory space expansion and storage latency reduction are required
[Figure: Memory/storage hierarchy, Current vs. Desired. Current: CPU, DRAM (Memory Space), SSD and HDD (Storage Space). Desired: CPU/PE, DRAM with Memory Space Expansion, Storage with Low Latency + High Density. Axis labels: Latency : Large, Frequency : High]
Storage expansion : BiCS FLASH™ with QLC
The density trend continues with BiCS FLASH™ with QLC technology
[Figure: Density (Gb/mm2) trend, log scale from 0.1 to 10]
Source : 13.1 ISSCC 2019, Toshiba Memory
Latency reduction : XL-FLASH™
Source : FMS Keynote 2018, Toshiba Memory
BiCS FLASH™ based Low Latency SLC device (scalable)
Good for random IOPS and better QoS at shallow queue depth (QD) in SSDs
[Figure: Conventional (2 Planes) vs. XL-FLASH™ (Many Planes); labels: WL, BL]
BiCS FLASH™ with TSV Technology
Higher Data Rate : >1 Gbps with 16 die/ch
Higher Density : 1 TB / package
Lower Power Consumption : ~45% power reduction
[Figure: TSV Technology (Memory Chip, Bump, TSV, Substrate, I/O Signal) vs. Conventional Wire Bonding (Bonding Pad, Memory Chip, Bonding Wire, Substrate)]
1 TB, Toggle DDR, 1066 Mbps
Press release July 11, 2017
TSV : Through Silicon Via
Storage Class Memory
SCM will fill the “gap” between DRAM and NAND/SSD in the storage hierarchy
There are multiple ways to utilize SCM in a system
[Figure: Storage hierarchy (SRAM, DRAM, SCM, SSD, NAND, HDD) with SCM and XL-FLASH™ between DRAM and SSD; trade-off labels: Latency, Power, Cost]
[Figure: SCM DIMM (Cost Optimized DRAM): 16-32 GB/DIMM vs. 256 GB/DIMM, x10 larger memory area]
[Figure: SCM-SSD (Faster SSD): x7 faster access]
[Figure label: Big Data Analytics]
SCM devices
So far PCM is the only device on the market. Others may follow.
128 Gbit 3D XPoint™ (PCM) (Source : Micron press release July 28, 2015)
32 Gbit ReRAM (ISSCC 2013)
4 Gbit STT-MRAM (ISSCC 2017)
System improvement : SCM/XL-FLASH™
An SCM and XL-FLASH™ + QLC SSD system expands memory/storage space and improves latency
[Figure: Memory/storage hierarchy, Current vs. Coming. Current: CPU, DRAM (Memory Space), SSD (TLC) and HDD (Storage Space). Coming: CPU and Accelerator (GPU/FPGA), DRAM, SCM, XL-FLASH, BiCS SSD (TLC), SSD (QLC). Axis labels: Latency : Large, Frequency : High]
Proposed architecture
[Figure: Three computing systems, compared on Efficiency (Energy/Speed) and Flexibility (Memory size/Application)]
von Neumann : Processor and Memory in separate packages (between PKG)
Near Memory : Processor and Memory in one package (in PKG)
In Memory : Processor + Memory merged
Each system architecture has a trade-off between efficiency and flexibility
Issue : In-memory type
[Figure: Example network (CNN1, CNN2, FC) mapped onto a crossbar array of emerging memory devices, e.g. PCM, RRAM, etc., assuming BNN. Labels: CNN1 kernel 3 x 3, stride 2; CNN2 kernel 3 x 3, stride 2; 4 x 4]
- Utilization of the crossbar array is only 7.4%
- Energy efficiency degrades by ~x13
- SW/HW that is applicable to all neural networks is desired
Source : 5.1 ASSCC 2018, Toshiba Memory
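The two figures on this slide (7.4% utilization, ~x13 energy efficiency loss) are consistent with a simple inverse relationship. The sketch below is a sanity check under an assumption that is ours, not the source's: the crossbar array consumes the same energy regardless of how many cells do useful work, so energy per useful operation scales as 1/utilization. The function name is illustrative.

```python
def efficiency_penalty(utilization: float) -> float:
    """Energy-per-useful-operation multiplier relative to a fully used array.

    Assumes array energy is independent of utilization (a simplifying
    assumption, not taken from the slide or the cited paper).
    """
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    return 1.0 / utilization


if __name__ == "__main__":
    # 7.4% is the crossbar utilization quoted on the slide.
    print(f"{efficiency_penalty(0.074):.1f}x")
```

This prints "13.5x", in line with the "~x13" degradation quoted on the slide.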
Experimental result : execution cycles (ResNet-50, ImageNet)
Source : 5.1 ASSCC 2018, Toshiba Memory
Work together
[Figure: DNN accelerator and Memory (GB/s), with Algorithm, Apps, and SW to map DNNs onto HW]
[Figure: Layers to co-optimize: Material/Process, Device, Circuit, Architecture, System/SW, Algorithm, Application]
Co-work/co-optimization is important
Conclusion
In the Info-plosion era, there are strong demands for storage and
energy-efficient systems.
BiCS FLASH™ is a key component and continues to grow the GB/area
benefit.
XL-FLASH™ + QLC SSD improves system performance over
DRAM + HDD.
PAM4 multiplexing improves I/F bandwidth with low power.
Storage Class Memory is required to improve system performance.
AI systems need SW/HW cooperation. Memory is a key
component, and a BiCS FLASH™ based system can support it.
All other company names, product names, and service names mentioned herein may be trademarks of their respective companies.