Lec17 Intro to Computer Engineering by Hsien-Hsin Sean Lee Georgia Tech -- Memory
ECE2030 Introduction to Computer Engineering
Lecture 17: Memory and Programmable Logic
Prof. Hsien-Hsin Sean Lee
School of Electrical and Computer Engineering
Georgia Tech
2
Memory
• Random Access Memory (RAM)
  – In contrast to serial access memory (e.g. tape)
  – Static Random Access Memory (SRAM)
    • Data is retained as long as Vdd is applied
    • 6 transistors per cell
    • Faster
    • Differential
  – Dynamic Random Access Memory (DRAM)
    • Requires periodic refresh
    • Smaller (can be implemented with 1 or 3 transistors)
    • Slower
    • Single-ended
  – Can be read and written
  – Typically addressable at byte granularity
• Read-Only Memory (ROM)
3
Block Diagram of Memory
• Example: 2MB memory, byte-addressable
  – N = 8 (because of byte-addressability)
  – K = 21 (1 word = 8 bits, and 2^21 words = 2M words)
[Figure: memory unit holding 2^K words of N bits each. Inputs: K-bit address lines, N-bit data input (for write), Read/Write, Chip Enable. Output: N-bit data output (for read).]
4
Static Random Access Memory (SRAM)
• Typically each bit is implemented with 6 transistors (6T SRAM Cell)
• During a read, the bitline and its inverse are precharged to Vdd (1) before setting WL = 1
• During a write, put the value on BitLine and its inverse on BitLine_bar before setting WL = 1
[Figure: 6T SRAM cell connected to BitLine, BitLine_bar and Wordline (WL)]
5
Dynamic Random Access Memory (DRAM)
• 1-transistor DRAM cell
• During a write, put the value on the bitline and then set WL = 1
• During a read, precharge the bitline to Vdd (1) before asserting WL to 1
• Storage decays, thus requires periodic refreshing (read-sense-write)
[Figure: 1T DRAM cell connected to Bitline and Wordline (WL)]
6
Memory Description
• Capacity of a memory is described as
  – # addresses x word size
  – Examples:

Memory      # of addr    # of data lines   # of addr lines   # of total bytes
1M x 8      1,048,576    8                 20                1 MB
2M x 4      2,097,152    4                 21                1 MB
1K x 4      1,024        4                 10                512 B
4M x 32     4,194,304    32                22                16 MB
16K x 64    16,384       64                14                128 KB
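The columns of the capacity table can be recomputed from the "# addresses x word size" description. The following is a minimal sketch; the helper name `describe_memory` and its dictionary keys are made up for illustration and do not come from the lecture.

```python
def describe_memory(num_addresses, word_bits):
    """Derive the table columns for a (num_addresses x word_bits) memory."""
    # Smallest K such that 2**K >= num_addresses.
    addr_lines = (num_addresses - 1).bit_length()
    total_bits = num_addresses * word_bits
    return {
        "addr_lines": addr_lines,
        "data_lines": word_bits,
        "total_bytes": total_bits // 8,
    }

K = 1 << 10  # 1K = 2**10
M = 1 << 20  # 1M = 2**20
examples = [("1M x 8", M, 8), ("2M x 4", 2 * M, 4), ("1K x 4", K, 4),
            ("4M x 32", 4 * M, 32), ("16K x 64", 16 * K, 64)]
for name, addresses, bits in examples:
    print(name, describe_memory(addresses, bits))
```

Note that the number of address lines depends only on the number of addresses, while the number of data lines equals the word size.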
7
How to Address Memory
[Figure: 4x8 memory. A 2-to-4 decoder driven by A1 and A0, with Chip Select (CS), selects one of four rows (0 to 3); each row holds eight 1-bit cells connected to data lines D7 to D0.]
8
How to Address Memory
[Figure: the same 4x8 memory with A1 = 0, A0 = 1 and ChipSelect = 1. The 2-to-4 decoder selects row 1, so the access address is 0x1 and that row's cells drive D7 to D0.]
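The row selection in this example can be sketched in a few lines. This is an illustrative model only; the function name `decoder_2to4` is an assumption, and chip select is modeled as simply forcing all outputs to 0 when it is deasserted.

```python
def decoder_2to4(a1, a0, cs):
    """Return one-hot row-select lines [row0, row1, row2, row3].

    All outputs are 0 when chip select (cs) is 0.
    """
    index = (a1 << 1) | a0
    return [int(cs == 1 and row == index) for row in range(4)]

# A1 = 0, A0 = 1, ChipSelect = 1 selects row 1 (address 0x1).
print(decoder_2to4(a1=0, a0=1, cs=1))
```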
9
Use 2 Decoders
[Figure: 8x4 memory built from a 4-row by 8-cell array of 1-bit cells. A 2-to-4 row decoder on A2, A1 selects a row (0 to 3). A 1-to-2 column decoder on A0 selects which 4-bit half of the row (column 0 or 1) is connected to D3 to D0 through tristate buffers (read). Chip Select (CS) enables the decoders.]
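The two-decoder organization splits the 3-bit address into a row part (A2, A1) and a column part (A0). The sketch below models a read; the names `split_address` and `read_word`, the sample cell contents, and the choice that column 0 maps to the first four cells of a row are all assumptions made for illustration.

```python
def split_address(addr):
    """Split a 3-bit address A2 A1 A0 into (row, column)."""
    row = (addr >> 1) & 0b11  # A2 A1 feed the 2-to-4 row decoder
    col = addr & 0b1          # A0 feeds the 1-to-2 column decoder
    return row, col

def read_word(cells, addr, cs=1):
    """Read one 4-bit word as [D3, D2, D1, D0].

    None models high-impedance outputs when the chip is not selected.
    """
    if not cs:
        return None
    row, col = split_address(addr)
    start = col * 4  # assumed mapping: column 0 is the first half of the row
    return cells[row][start:start + 4]

# Hypothetical contents: 4 rows of 8 one-bit cells (two 4-bit words per row).
cells = [
    [0, 0, 0, 1, 0, 0, 1, 0],
    [0, 0, 1, 1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0, 1, 1, 0],
    [0, 1, 1, 1, 1, 0, 0, 0],
]
print(split_address(0b011), read_word(cells, 0b011))
```

The physical array stays 4 x 8 as in the 4x8 design; only the column decoder changes which half of the selected row reaches the data pins.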
10
Tristate Buffer
• Similar to a transmission gate (TG)
• Can amplify the signal (in contrast to a TG)
• Typically used where signals travel over shared wires, e.g. a bus
[Figure: tristate buffer symbols with Input, Output and enable (En), and the corresponding CMOS circuit]
11
Bi-directional Bus using Tri-state Buffer
[Figure: two tristate buffers placed back to back between A and B (Input/Output). The Direction signal controls data flow for read/write.]
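A tristate buffer either passes its input or leaves the wire undriven. The sketch below models the undriven (high-impedance) state as `None`; the function names and the polarity of `direction` (1 means A drives, 0 means B drives) are assumptions for illustration, not taken from the slide.

```python
def tristate(value, enable):
    """Pass value through when enabled, otherwise output high impedance (None)."""
    return value if enable else None

def resolve_bus(*drivers):
    """Return the value on a shared wire, or None if nothing drives it."""
    active = [d for d in drivers if d is not None]
    if len(active) > 1:
        raise ValueError("bus contention: more than one driver enabled")
    return active[0] if active else None

def bidirectional_bus(a, b, direction):
    """Value on the shared line: A drives when direction is 1, B when it is 0."""
    return resolve_bus(tristate(a, direction), tristate(b, 1 - direction))

print(bidirectional_bus(a=1, b=0, direction=1))
print(bidirectional_bus(a=1, b=0, direction=0))
```

Because the two enables are complementary, exactly one buffer drives the line at any time, which is what makes sharing the wire safe.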
12
Read/Write Memory
[Figure: 8x4 memory with read/write support. A 2-to-4 row decoder (A2, A1) and a 1-to-2 column decoder (A0) select a 4-bit word on D3 to D0. Shown with ChipSelect = 0 and Rd/Wr = 0.]
13
Read/Write Memory
[Figure: the same 8x4 read/write memory, shown with ChipSelect = 1 and Rd/Wr = 1.]
14
Building Memory in Hierarchy
• Design a 1Mx8 using 1Mx4 memory chips
[Figure: two 1Mx4 chips share address lines A19 to A0, R/W and CS. One chip supplies D3 to D0 and the other supplies D7 to D4.]
15
Building Memory in Hierarchy
• Design a 2Mx4 using 1Mx4 memory chips
[Figure: two 1Mx4 chips share address lines A19 to A0, R/W and data lines D3 to D0. A20 drives a 1-to-2 decoder (enabled by CS) whose outputs 0 and 1 select one chip each.]
Note that the 1-to-2 decoder is the wire itself (or use an inverter)
16
Building Memory in Hierarchy
• Design a 2Mx8 using 1Mx4 memory chips
[Figure: four 1Mx4 chips, all sharing address lines A19 to A0 and R/W. They form two pairs; within a pair one chip supplies D3 to D0 and the other supplies D7 to D4. A20 drives a 1-to-2 decoder (enabled by CS) whose outputs 0 and 1 select one pair each.]
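The 2Mx8 construction combines both ideas: widening the word by placing chips side by side, and deepening the address space by decoding A20 into chip selects. The following is a behavioral sketch only; the class names `Chip1Mx4` and `Memory2Mx8`, the sparse dictionary storage, and the default content of 0 for unwritten cells are all assumptions made for illustration.

```python
class Chip1Mx4:
    """Hypothetical model of one 1M x 4 chip (sparse storage for brevity)."""
    ADDR_BITS = 20

    def __init__(self):
        self.cells = {}

    def access(self, addr, cs, write=False, data=0):
        if not cs:
            return None  # not selected: outputs are high impedance
        addr &= (1 << self.ADDR_BITS) - 1  # only A19..A0 reach the chip
        if write:
            self.cells[addr] = data & 0xF
            return None
        return self.cells.get(addr, 0)


class Memory2Mx8:
    """Two banks selected by A20; each bank is a pair of 1M x 4 chips."""

    def __init__(self):
        # banks[b] = (chip for D3..D0, chip for D7..D4)
        self.banks = [(Chip1Mx4(), Chip1Mx4()) for _ in range(2)]

    def _chip_selects(self, addr):
        bank = (addr >> 20) & 1  # A20 feeds the 1-to-2 decoder
        return [int(b == bank) for b in range(2)]

    def write(self, addr, byte):
        for cs, (low, high) in zip(self._chip_selects(addr), self.banks):
            low.access(addr, cs, write=True, data=byte & 0xF)
            high.access(addr, cs, write=True, data=(byte >> 4) & 0xF)

    def read(self, addr):
        result = None
        for cs, (low, high) in zip(self._chip_selects(addr), self.banks):
            lo = low.access(addr, cs)
            hi = high.access(addr, cs)
            if lo is not None:  # only the selected bank drives the data bus
                result = (hi << 4) | lo
        return result


mem = Memory2Mx8()
mem.write(0x000005, 0xA7)  # A20 = 0: first pair of chips
mem.write(0x100005, 0x3C)  # A20 = 1: second pair, same A19..A0
print(hex(mem.read(0x000005)), hex(mem.read(0x100005)))
```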
17
Memory Model
• A 32-bit address space can address up to 2^32 different memory locations (4GB when byte-addressable)

Flat Memory Model
Address       Content
0x00000000    0x0A    (lower memory address)
0x00000001    0xB6
0x00000002    0x41
0x00000003    0xFC
...
0xFFFFFFFF    0x0D    (higher memory address)
18
Endianness [Danny Cohen 91]
• Byte ordering: how a multi-byte data word is stored in memory
• Endianness (from Gulliver’s Travels)
  – Big Endian
    • Most significant byte of a multi-byte word is stored at the lowest memory address
    • e.g. Sun Sparc, PowerPC
  – Little Endian
    • Least significant byte of a multi-byte word is stored at the lowest memory address
    • e.g. Intel x86
• Some embedded & DSP processors support both for interoperability
19
Endianness Examples
• Store 0x87654321 at address 0x0000, byte-addressable

Address   BIG ENDIAN   LITTLE ENDIAN
0x0000    0x87         0x21            (lower memory address)
0x0001    0x65         0x43
0x0002    0x43         0x65
0x0003    0x21         0x87            (higher memory address)
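The two byte orders for 0x87654321 can be reproduced with Python's built-in `int.to_bytes`. This is only a quick check of the example, with offsets 0 to 3 standing in for addresses 0x0000 to 0x0003.

```python
value = 0x87654321

# Index 0 of each bytes object corresponds to the lowest memory address.
big = value.to_bytes(4, byteorder="big")
little = value.to_bytes(4, byteorder="little")

for offset in range(4):
    print(f"0x{offset:04X}  big=0x{big[offset]:02X}  little=0x{little[offset]:02X}")

# Reading the bytes back with the matching byte order recovers the word.
assert int.from_bytes(big, byteorder="big") == value
assert int.from_bytes(little, byteorder="little") == value
```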
20
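Memory Allocation (Little Endian)
        .data
        .globl declare
declare:
        .align 0            # turn off automatic alignment
        .word  511          # 0x000001FF
        .byte  14           # 0x0E
        .align 2            # next datum on a 4-byte boundary
        .byte  14
        .word  0x0B1E8143
        .align 2
        .ascii "GAece"      # 0x47 0x41 0x65 0x63 0x65
        .half  10           # 0x000A
        .word  0x2B1E8145
        .space 1
        .byte  52           # 0x34
        .align 1            # next datum on a 2-byte boundary
        .byte  16           # 0x10
        .space 2
        .byte  67           # 0x43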
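Resulting memory map (address: content; --- marks padding or reserved space):
0x00: 0xFF   0x01: 0x01   0x02: 0x00   0x03: 0x00
0x04: 0x0E   0x05: ---    0x06: ---    0x07: ---
0x08: 0x0E   0x09: 0x43   0x0a: 0x81   0x0b: 0x1E
0x0c: 0x0B   0x0d: ---    0x0e: ---    0x0f: ---
0x10: 0x47   0x11: 0x41   0x12: 0x65   0x13: 0x63
0x14: 0x65   0x15: 0x0A   0x16: 0x00   0x17: 0x45
0x18: 0x81   0x19: 0x1E   0x1a: 0x2B   0x1b: ---
0x1c: 0x34   0x1d: ---    0x1e: 0x10   0x1f: ---
0x20: ---    0x21: 0x43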
.align N: align the next datum on a 2^N-byte boundary
.align 0: turn off automatic alignment for .half, .word, .float, and .double until the next .data directive
.word: 4 bytes
.half: 2 bytes
.byte: 1 byte
.space N: reserve N bytes of space
.ascii: ASCII code (American Standard Code for Information Interchange)
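The layout rules can be checked by replaying the directive sequence in a small simulator. This is a sketch under stated assumptions: the function `lay_out` and the tuple encoding of directives are invented for illustration, automatic alignment is treated as off for the whole sequence (as after `.align 0`), and the segment is assumed to start at address 0.

```python
def lay_out(directives):
    """Return {address: byte} for (directive, argument) pairs, little endian.

    Only explicit `.align N` directives (N > 0) insert padding here.
    """
    sizes = {".byte": 1, ".half": 2, ".word": 4}
    memory = {}
    addr = 0
    for name, arg in directives:
        if name == ".align":
            if arg > 0:
                boundary = 1 << arg  # 2**N-byte boundary
                addr = (addr + boundary - 1) // boundary * boundary
        elif name == ".space":
            addr += arg  # reserve bytes without storing data
        else:
            if name == ".ascii":
                data = arg.encode("ascii")
            else:
                data = arg.to_bytes(sizes[name], "little")
            for byte in data:
                memory[addr] = byte
                addr += 1
    return memory

program = [
    (".align", 0), (".word", 511), (".byte", 14),
    (".align", 2), (".byte", 14), (".word", 0x0B1E8143),
    (".align", 2), (".ascii", "GAece"), (".half", 10),
    (".word", 0x2B1E8145), (".space", 1), (".byte", 52),
    (".align", 1), (".byte", 16), (".space", 2), (".byte", 67),
]
memory = lay_out(program)
for address in sorted(memory):
    print(f"0x{address:02x}: 0x{memory[address]:02X}")
```

Addresses missing from the printed map are the padding and reserved bytes.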
21
Read Only Memory (ROM)
• “Permanent” binary information is stored
• Non-volatile memory
  – Powering off does not erase the stored information
[Figure: ROM holding 2^K words of N bits each, with K-bit address lines as input and an N-bit data output]
22
32x8 ROM
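[Figure: 32x8 ROM with 5 address inputs (A4 to A0) and 8 data outputs (D7 to D0). A 5-to-32 decoder drives word lines 0 to 31. Each line drawn into an output gate represents 32 wires, one per word line, each connected through a fuse.]
• Fuse can be implemented as a diode or a pass transistor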
23
Programming the 32x8 ROM

A4 A3 A2 A1 A0 | D7 D6 D5 D4 D3 D2 D1 D0
 0  0  0  0  0 |  1  1  0  0  0  1  0  1
 0  0  0  0  1 |  1  0  0  0  1  0  1  1
 0  0  0  1  0 |  1  0  1  1  0  0  0  0
 …  …  …  …  … |  …  …  …  …  …  …  …  …
 1  1  1  0  1 |  0  0  0  1  0  0  0  0
 1  1  1  1  0 |  0  1  0  1  0  1  1  0
 1  1  1  1  1 |  1  1  1  0  0  0  0  1

[Figure: the 5-to-32 decoder (A4 to A0) drives word lines 0 to 31, with the connections to D7 to D0 programmed according to the table]
24
Example: Lookup Table
• Design a square lookup table for F(X) = X^2 using ROM

X (decimal)   F(X) = X^2 (decimal)   X (binary)   F(X) = X^2 (binary)
0             0                      000          000000
1             1                      001          000001
2             4                      010          000100
3             9                      011          001001
4             16                     100          010000
5             25                     101          011001
6             36                     110          100100
7             49                     111          110001
25
Square Lookup Table using ROM
[Figure: 8x6 ROM. A 3-to-8 decoder on X2, X1, X0 drives word lines 0 to 7; outputs F5 to F0 are programmed with the binary values of X^2.]
26
Square Lookup Table using ROM
[Figure: the same ROM, annotated: F0 = X0 in every row, and F1 is always 0 (not used).]
27
Square Lookup Table using ROM
[Figure: reduced ROM. The 3-to-8 decoder on X2, X1, X0 drives word lines 0 to 7, but only F5, F4, F3 and F2 are stored in the array; F0 is wired to X0 and F1 is a constant 0.]
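The ROM acts as a table indexed by X, and the two simplifications (F0 equals X0, F1 is always 0) can be confirmed by scanning its contents. The sketch below uses a plain Python list as the ROM; the names `SQUARE_ROM` and `rom_read` are illustrative assumptions.

```python
# ROM contents: word X holds X**2 as a 6-bit value F5..F0.
SQUARE_ROM = [x * x for x in range(8)]

def rom_read(rom, address):
    """Return the word stored at the given address."""
    return rom[address]

for x in range(8):
    f = rom_read(SQUARE_ROM, x)
    print(f"{x:03b} -> {f:06b}")
    assert f & 1 == x & 1       # F0 always equals X0
    assert (f >> 1) & 1 == 0    # F1 is always 0
```

Since both checks hold for all eight inputs, only columns F5 to F2 need to be stored.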
28
Classifying Three Basic PLDs
• (Programmable) Read-Only Memory (ROM)
  – INPUT -> fixed AND plane (decoder) -> programmable OR plane (programmable connections) -> OUTPUT
• Programmable Logic Array (PLA)
  – INPUT -> programmable AND plane -> programmable OR plane (programmable connections) -> OUTPUT
• Programmable Array Logic (PAL) devices
  – INPUT -> programmable AND plane -> fixed OR plane -> OUTPUT (the diagram also shows an F/F block)
  – PAL: trademark of AMD; use PAL as an adjective or expect to receive a letter from AMD’s lawyers
29
Programmable Logic Array (PLA)
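[Figure: PLA with inputs A, B, C. Each input and its complement feed a programmable AND plane; the product terms feed a programmable OR plane whose outputs include F2.]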
30
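Example using PLA
$F_2(A,B,C) = \sum m(0,5,6,7)$
$F_1(A,B,C) = \sum m(0,1,2,4)$
$F_2 = AB + AC + \overline{A}\,\overline{B}\,\overline{C}$
$F_1 = \overline{AB + AC + BC}$
$F_1 = \overline{A}\,\overline{B} + \overline{A}\,\overline{C} + \overline{B}\,\overline{C}$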
31
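Example using PLA
[Figure: PLA implementation with inputs A, B, C and their complements. The AND plane generates the product terms AB, AC, BC and $\overline{A}\,\overline{B}\,\overline{C}$. The OR plane forms $F_2 = AB + AC + \overline{A}\,\overline{B}\,\overline{C}$ and $F_1 = \overline{AB + AC + BC}$, so AB and AC are shared by both outputs.]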
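The shared-term implementation can be checked against the minterm lists by exhaustive evaluation. This sketch assumes the equations read F2 = AB + AC + A'B'C' and F1 = (AB + AC + BC)', with A as the most significant bit of the minterm index; the function name `pla` is made up for illustration.

```python
from itertools import product

def pla(a, b, c):
    """Evaluate (F1, F2) from four shared product terms."""
    # AND plane: product terms
    ab, ac, bc = a & b, a & c, b & c
    not_abc = (1 - a) & (1 - b) & (1 - c)
    # OR plane: F2 is a plain sum, F1 is the complement of a sum
    f2 = ab | ac | not_abc
    f1 = 1 - (ab | ac | bc)
    return f1, f2

# product((0, 1), repeat=3) enumerates (A, B, C) in minterm order 0..7.
inputs = list(product((0, 1), repeat=3))
f1_minterms = [i for i, (a, b, c) in enumerate(inputs) if pla(a, b, c)[0]]
f2_minterms = [i for i, (a, b, c) in enumerate(inputs) if pla(a, b, c)[1]]
print("F1 minterms:", f1_minterms)
print("F2 minterms:", f2_minterms)
```

Writing F1 as a complemented sum lets it reuse AB and AC, so four product terms serve both outputs instead of six.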
32
PAL Device
[Figure: PAL device with inputs A, B and I/O pins IO1, IO2. The columns A, B, IO1, IO2 and their complements feed a programmable AND plane, followed by a fixed OR plane.]
33
PAL Device Design Example
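[Figure: PAL device with input columns A, B, C, D and IO1 (true and complemented) and outputs IO1, IO2; AND-plane rows that are not needed are marked "Not programmed"]
DCBADCADCBACABIO2
DCBACABIO1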
34
CPLD and FPGA [Brown&Rose 96]
• Complex Programmable Logic Device (CPLD)
  – Multiple PLDs (e.g. PALs, PLAs) with programmable interconnection structure
  – Pioneered by Altera
• Field-Programmable Gate Array (FPGA)
  – High logic capacity with large distributed interconnection structure
    • Logic capacity is measured as an equivalent number of 2-input NAND gates
  – Offers narrower logic resources
    • CPLD offers logic resources with a wide number of inputs (AND planes)
  – Offers a higher ratio of flip-flops to logic resources than CPLD
• HCPLD (High Capacity PLD) is often used to refer to both CPLD and FPGA
35
CPLD structure
[Figure: eight PLD logic blocks in two rows of four, joined by interconnects, with I/O blocks]
36
FPGA Structure
[Figure: array of logic blocks joined by interconnects, with I/O blocks]
37
FPGA Programmability
• Floating gate transistor
  – Used in EPROM and EEPROM
• SRAM-controlled switch, which controls
  – Pass transistors
  – Multiplexers (to determine how to route inputs)
• Antifuse
  – Similar to fuse
  – Originally an open circuit
  – One-Time Programmable (OTP)