
SRAM Design and Layout

EE 7325

Project Description

• Design and layout of a 128 word SRAM using the IBM 130nm process. The key

design tools used are Cadence’s Virtuoso for layout editing, DRC (for design rule

checking), LVS (layout versus netlist, for verifying that the layout matches the

schematic netlist) and circuit simulation (for measuring the read/write times).

• Word size is 10 bits.

• An output capacitance of 30fF is used for all outputs when simulating for delays.

• All input signals and clocks are driven by inverters sized PMOS = 0.75µm and
NMOS = 0.25µm.

Introduction

Static random access memory (SRAM) is a type of volatile semiconductor memory, meaning
it retains data only as long as it is powered. SRAM uses bistable latching circuitry made of
transistors to store each bit. Unlike dynamic RAM (DRAM), SRAM does not store data on a
capacitor; hence, it works without refreshing. SRAM is often used as cache memory.

The most commonly used SRAM cell consists of six transistors; this configuration is called
the 6T memory cell. It consists of two cross-coupled inverters and two access transistors.

Figure 1: 6T SRAM Cell



The access transistors are connected to the word line (WL) at their respective gate terminals, and the bit lines (BL and BLbar) at their source/drain terminals. The word line is used to select the cell while the bit lines are used to perform read or write operations on the cell.

Read Operation

Figure 2: Read Operation

The read operation of the memory cell is illustrated in Figure 2. Assume that a “0” is stored
on the left side of the cell and a “1” on the right side, so M1 is on and M2 is off. Initially, BL
and BLbar are precharged to VDD. When a row is selected by asserting the word line,
access transistors M3 and M4 turn on. Current begins to flow through M3 and M1
to ground, and the cell discharges the bit line capacitance Cbit. On the other side of the cell, the
bit line connected to M4 stays high since there is no path to ground through M2. The difference
between BL and BLbar is fed to a sense amplifier, which generates a valid low output.

Write Operation

Figure 3: Write Operation



To write to the cell, it is driven from both sides: a “1” is placed on one of the
bit lines and a “0” on the other. This flips the value stored in the cell when it differs
from the new value. The access transistors controlled by WL must be on during both read and
write operations.

SRAM Implementation

The top level block diagram of the SRAM is shown in Figure 4.

Figure 4: Top Level Block Diagram

The signals are described below.

Port       I/O Type        Description
WR         Input           1 bit Write/Read signal (1 = Write, 0 = Read)
clk        Input           Clock signal (0 = Precharge, 1 = Evaluate)
addr0-6    Input           7 bit input address
addr_en    Input           1 bit address enable; a word line is selected only when addr_en = 1
data0-9    Bidirectional   10 bit SRAM data; when WR is 0 the data stored in the SRAM is read, when WR is 1 the data is written to the SRAM
vdd, vss   Inputs          Supply (1.2 V) and ground

The 128 word SRAM has 128*10 = 1280 memory cells, given the word size of 10 bits. The array is
designed to have 40 columns and 32 rows, so each row holds four words. Hence, we need a 5 bit
address to select one of the rows (word lines) and a 2 bit address to select one of the four
words in that row. The overall architecture of the memory design is shown in Figure 5.
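The 5 bit / 2 bit address split can be sketched in a few lines of Python. The report does not state which address bits feed the row decoder and which feed the column decoder, so the bit assignment below (upper five bits to the row, lower two bits to the word) is only an assumption for illustration, as is the helper name split_address.

```python
ROWS = 32            # word lines, from the text
WORDS_PER_ROW = 4    # four 10-bit words share each row, from the text

def split_address(addr: int) -> tuple[int, int]:
    """Split a 7-bit word address into (row, word-in-row)."""
    if not 0 <= addr < ROWS * WORDS_PER_ROW:
        raise ValueError(f"address {addr} is outside 0..127")
    row = addr >> 2       # assumption: addr6..addr2 drive the row decoder
    word = addr & 0b11    # assumption: addr1..addr0 drive the column decoder
    return row, word

print(split_address(0b1111111))
```

Swapping the two assignments changes only which physical row a given address lands in; the 32 x 4 organisation stays the same.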



Figure 5: Memory Architecture

Our goal now is to design each individual unit of this architecture, integrate the units, and
verify that the read and write operations work correctly.



Component Design

• SRAM Cell

The layout and schematic of the designed SRAM cell are shown in Figures 6 and 7, respectively.

Figure 6: Memory Cell Layout



Figure 7: Memory Cell Schematic

Since these memories usually store millions of bits, all the transistors are kept at
minimum size (0.28µm here) in order to achieve the minimum area.

Width = 2.49 µm, Length = 2.38 µm => Aspect ratio = W/L = 2.49/2.38 = 1.04

Hence, the area per memory cell is 2.49 * 2.38 = 5.92 µm²

• Precharge Circuit

In both read and write operations, the bitlines are initially pulled up to high voltage.

This is done using a precharge circuit. The schematic of the circuit is as shown in

Figure 9 below. A clock input is applied to the two pull-up transistors, called the

balance transistors, connected between the two bitlines. When the wordline (WL)

signal goes high, one bitline remains high and the other falls until WL goes low. The

layout of the precharge circuit is as shown in Figure 8.



Figure 8: Layout of the Precharge Circuit

Figure 9: Schematic of the Precharge Circuit



• Clock Driver Circuit

Since a clocked precharge circuit is used to charge the bitlines, the clock buffer circuit must be sized as well. All calculations assume that the clock drives 2 PFETs between every BL and BLbar pair, i.e. a total of 2*40 PFETs.

Cpoly = 2 fF/µm * 2µm * 2 * 40 = 320fF
Cwire = 0.2 fF/µm * Width of the memory cell * Number of columns = 0.2 fF/µm * 2.38 * 40 = 19.04fF
CLoad = (Cpoly + Cwire) / 2 = 339.04 / 2 = 169.52fF
F = GBH = 169.52
Number of stages, N = log(169.52) / log(3.6) = 4 stages
f = F^(1/N) = 3.6

Hence the circuit is as below

Figure 10: Clock Driver Circuit

The sizing equation is Cin = (g * Cload) / f

4: (1 * 169.52) / 3.6 = 47.08 => Wp = 31.38µm, Wn = 15.69µm
3: (1 * 47.08) / 3.6 = 13.08 => Wp = 8.72µm, Wn = 4.36µm
2: (1 * 13.08) / 3.6 = 3.633 => Wp = 2.42µm, Wn = 1.21µm
1: (1 * 3.633) / 3.6 = 1.00 => Wp = 0.66µm, Wn = 0.33µm
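The stage-by-stage arithmetic for the clock driver can be reproduced with a short Python sketch. It assumes a plain inverter chain (g = 1, no branching), reads the division by 2 in the load calculation as a normalisation to a 2 fF reference capacitance, and splits each stage's total width 2:1 between PMOS and NMOS; these readings and the function names are illustrative assumptions, not taken from the design files.

```python
import math

def size_inverter_chain(c_load_ff, c_ref_ff=2.0, target_effort=3.6):
    """Return (stage count, stage effort, relative stage sizes, first stage first)."""
    path_effort = c_load_ff / c_ref_ff   # G = B = 1 for a plain inverter chain
    n = max(1, round(math.log(path_effort) / math.log(target_effort)))
    stage_effort = path_effort ** (1.0 / n)
    sizes = []
    c_out = path_effort
    for _ in range(n):                   # walk from the load back towards the input
        c_in = 1.0 * c_out / stage_effort    # Cin = g * Cout / f with g = 1
        sizes.append(c_in)
        c_out = c_in
    sizes.reverse()
    return n, stage_effort, sizes

def widths_um(size):
    """Assumed 2:1 PMOS:NMOS split of the total width (unit inverter ~0.66/0.33 um)."""
    return size * 2.0 / 3.0, size / 3.0

n, f, sizes = size_inverter_chain(320.0 + 19.04)
for stage, s in enumerate(sizes, start=1):
    wp, wn = widths_um(s)
    print(f"stage {stage}: size {s:.2f}, Wp = {wp:.2f} um, Wn = {wn:.2f} um")
```

Because the sketch keeps the unrounded stage effort instead of 3.6, its sizes differ slightly from the hand-calculated values in the last digits.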

The schematic and layout of the Clock Driver circuit is shown below.




Figure 11: Layout and Schematic of Clock Driver



• Sense Amplifier

The designed SRAM uses ten identical sense amplifiers to provide simultaneous output of the ten data bits. We use a current-mode, differential-input, single-ended sense amplifier, which attenuates common-mode noise and amplifies differential-mode signals. The main reason for choosing this type of sense amplifier is to improve the noise immunity and speed of the read circuit. The differential signal that develops between the two bit lines during a read operation is amplified by the differential pair. The transistors are sized so that the differential voltage is amplified sufficiently for the read operation. The output of the sense amplifier is then passed through a pair of inverters to produce a digital output. The inverted write (WR) signal drives the gate of the current-source transistor so that the sense amplifier is enabled only during read operations. The schematic and layout of the sense amplifier are shown in the figures below.

Figure 10: Sense Amplifier Layout



Figure 11: Sense Amplifier Schematic

• Row Decoder

Access time and power consumption of memories may be largely determined by
decoder design. Row decoders take an n-bit address and produce 2^n outputs. They
are used to select the required row in the memory array: the required
wordline is activated based on the address given to the decoder. In our design we have
32 rows, hence n = 5 address bits are used to select a row. Since the row decoder
activates one of the 2^5 = 32 wordlines, it has to be sized using logical effort
based on the capacitance of the wordline. The gate level schematic of one stage of the
row decoder is shown in Figure 12.

Figure 12: Row Decoder Circuit



The gates are sized as below.

Cpoly = 2 fF/µm * 2µm * 0.28 * 40 = 44.8fF
Cwire = 0.2 fF/µm * 2.38 * 40 = 19.04fF
CLoad = (Cpoly + Cwire) / 2 = 63.84 / 2 = 31.92fF = H
B = 16, G = 5/3 * 5/3 = 25/9
F = GBH = 1418.67
Number of stages, N = log(1418.67) / log(3.6) = 5.66 = 6 stages
f = F^(1/N) = 2.82 (the value for the seven stages, 1 to 7, sized below)

1: (1 * 31.92) / 2.82 = 11.32 => Wp = 7.55µm, Wn = 3.78µm
2: (1 * 11.32) / 2.82 = 4.02 => Wp = 2.68µm, Wn = 1.34µm
3: (5 * 4.02) / (3 * 2.82) = 2.38 => Wp = 0.48µm, Wn = 1.9µm
4: (5 * 2.38) / (3 * 2.82) = 1.4 => Wp = 0.56µm, Wn = 0.84µm
5: (16 * 1.4) / 2.82 = 7.95 => Wp = 5.3µm, Wn = 2.65µm
6: (1 * 7.95) / 2.82 = 2.82 => Wp = 1.88µm, Wn = 0.94µm
7: (1 * 2.82) / 2.82 = 1.00 => Wp = 0.66µm, Wn = 0.33µm

Two more inverters are used to get the non-inverted input.

8: 2.82 / √2.82 = 1.68 => Wp = 1.12µm, Wn = 0.56µm
9: 1.68 / √2.82 = 1.00 => Wp = 0.66µm, Wn = 0.33µm
10: (32 * 1.4) / 2.82 = 15.89 => Wp = 10.6µm, Wn = 5.3µm
11: 15.89 / 2.82 = 5.64 => Wp = 3.76µm, Wn = 1.88µm
12: 5.64 / 2.82 = 2 => Wp = 1.34µm, Wn = 0.67µm
13: 2 / 2.82 = 0.71 => The transistor widths are below the minimum. Hence, Wp = 0.66µm,
Wn = 0.33µm are chosen.
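For a path that contains logic gates and branching, as the row decoder does, the path effort is the product F = G * B * H. The Python sketch below evaluates it for the row decoder figures quoted above (two gates with logical effort 5/3, branching of 16, H = 31.92) and prints the stage effort for a chosen stage count. The function names are illustrative, and the list of stage counts to try is an assumption; with seven stages the stage effort comes out close to the 2.82 used in the sizing list.

```python
import math

def path_effort(logical_efforts, branching, electrical_effort):
    """F = G * B * H, with G and B taken as products along the path."""
    return math.prod(logical_efforts) * math.prod(branching) * electrical_effort

def stage_effort(f_path, n_stages):
    """Equal effort per stage: f = F ** (1 / N)."""
    return f_path ** (1.0 / n_stages)

F = path_effort([5 / 3, 5 / 3], [16], 31.92)
n_est = math.log(F) / math.log(3.6)   # stage count estimate for a target effort of 3.6
print(f"F = {F:.2f}, N estimate = {n_est:.2f}")
for n in (6, 7):
    print(f"N = {n}: f = {stage_effort(F, n):.2f}")
```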



Figure 13: Layout and Schematic of Row Decoder



• Column Decoder

After precharging all the bitlines to a high voltage, the next step is to select the columns
of the memory cell array that will be involved in the read or write operation. This
column selection is performed using a decoder/multiplexer combination. The m-bit
column address is used to select one or more of the 2^m columns. In our case, the array
is arranged so that four words are placed in a row, with all the first bits of the words
together, then all the second bits, and so on. The column decoder therefore selects one bit
from each group of 4 bits, so 2 address bits are needed. The transistors are sized based on the
bitline capacitances. The column decoder is sized as below.

Figure 15: Column Decoder

Cpoly = 2 fF/µm * Wn of the Transmission Gate * Number of gates = 2 fF/µm * 1µm * 20 = 40fF
Cwire = 0.2 fF/µm * Width of the memory cell * Number of columns = 0.2 fF/µm * 2.38 * 40 = 19.04fF
CLoad = (Cpoly + Cwire) / 2 = 59.04 / 2 = 29.52fF = H
F = GBH = 4/3 * 29.52 * 4 = 157.44
Number of stages, N = log(157.44) / log(3.6) = 4 stages
f = F^(1/N) = 3.54

The circuit is modified as shown below to get the required number of stages.



Figure 16: One input-output block of the column decoder with the required number of

stages

The gates were sized as follows:

7&9: (1 * 29.52) / 3.54 = 8.33 => Wp = 5.55µm, Wn = 2.77µm
6: (1 * 8.33) / √3.54 = 4.42 => Wp = 2.94µm, Wn = 1.47µm
5&8: (1 * 4.42) / √3.54 = 2.35 => Wp = 1.56µm, Wn = 0.78µm
4: (4/3 * 2 * 2.35) / 3.54 = 1.77 => Wp = 0.88µm, Wn = 0.88µm
1&2: (1 * 2 * 1.77) / 3.54 = 1 => Wp = 0.66µm, Wn = 0.33µm
3: (1 * 2 * 1.77) / √3.54 = 1.88 => Wp = 1.25µm, Wn = 0.62µm

The layout and schematic of the column decoder are shown below.



Figure 17: Layout and Schematic of the Column Decoder
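The bit interleaving described for the column decoder (four words per row, with same-numbered bits of the four words placed next to each other) amounts to a simple index mapping. The Python sketch below shows one possible numbering; the ordering of words within a group and of groups across the row, as well as the function names, are assumptions for illustration and are not taken from the layout.

```python
WORDS_PER_ROW = 4    # from the text
BITS_PER_WORD = 10   # from the text

def physical_column(bit: int, word: int) -> int:
    """Column index of `bit` of `word`, assuming groups are ordered by bit number."""
    if not (0 <= bit < BITS_PER_WORD and 0 <= word < WORDS_PER_ROW):
        raise ValueError("bit or word index out of range")
    return bit * WORDS_PER_ROW + word

def selected_columns(word: int) -> list[int]:
    """The ten columns the 4:1 multiplexers pass through for one word."""
    return [physical_column(b, word) for b in range(BITS_PER_WORD)]

print(selected_columns(2))
```

With this arrangement each sense amplifier and write driver serves one group of four adjacent columns, which is why a 2 bit column address is sufficient.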



• Write Driver

During precharge, both the BL and BLbar lines are charged to VDD. Before a write
operation, one of the bitlines must be driven high and the other low, based on the data bit
that is being written. The schematic of the write circuitry used in our design
is shown in Figure 19. During a write operation, the WR signal goes high and the 10 bit data
word is written by applying the required bit values to the corresponding data inputs. These
values are then passed through a set of pass transistors attached to the BL and BLbar lines so
that each data bit is written into the corresponding memory cell.

Figure 18: Write Driver Layout



Figure 19: Write Driver Schematic

• Write Enable (WR) Driver

Since the WR signal drives two NFETs in each column, a total of 20 NFETs are
driven by WR. In addition, its complement is applied to the 20 PFETs of the write driver.
Hence, the buffer circuit for WR must be sized so that it can drive the required load.
The transistor sizing is given below. The layout and the schematic of the buffer circuit
are shown in Figures 22 and 23, respectively.

Figure 20: WR Driver



Load to WRbar is

Cpoly = (2 fF/µm * 4.22µm * 20) + (2 fF/µm * 2 * 10) = 208.8fF
Cwire = 0.2 fF/µm * 2.38 * 40 = 19.04fF
CLoad = (Cpoly + Cwire) / 2 = (208.8 + 19.04) / 2 = 113.92fF = H
F = GBH = 113.92
Number of stages, N = log(113.92) / log(3.6) = 3.69 stages
In order to get the inversion, 5 stages were chosen.
f = F^(1/N) = 2.57

Load to WR is

Cpoly = (2 fF/µm * 0.48µm * 20) + (2 fF/µm * 2 * 10) = 59.2fF
Cwire = 0.2 fF/µm * 2.38 * 40 = 19.04fF
CLoad = (Cpoly + Cwire) / 2 = (59.2 + 19.04) / 2 = 39.12fF = H
F = GBH = 39.12
Number of stages, N = log(39.12) / log(3.6) = 2.86 = 4 stages
f = F^(1/N) = 2.77
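The two load calculations above follow the same recipe: add the gate capacitance of every transistor on the net to the wire capacitance, then divide by 2. A minimal Python sketch of that recipe is given here; the helper name normalized_load and the (width, count) representation of the loads are hypothetical, and the divisor of 2 is simply carried over from the calculations in the text.

```python
C_GATE_FF_PER_UM = 2.0   # gate (poly) capacitance per um of width, from the text
C_WIRE_FF_PER_UM = 0.2   # wire capacitance per um of length, from the text

def normalized_load(gate_loads, wire_length_um, divisor=2.0):
    """gate_loads: iterable of (width_um, count) pairs connected to the net."""
    c_poly = sum(C_GATE_FF_PER_UM * width * count for width, count in gate_loads)
    c_wire = C_WIRE_FF_PER_UM * wire_length_um
    return (c_poly + c_wire) / divisor

wire = 2.38 * 40   # cell dimension times number of columns, as used in the text
h_wrbar = normalized_load([(4.22, 20), (2.0, 10)], wire)
h_wr = normalized_load([(0.48, 20), (2.0, 10)], wire)
print(f"H(WRbar) = {h_wrbar:.2f}, H(WR) = {h_wr:.2f}")
```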

The circuit is modified as shown below to get the required number of stages

Figure 21: Modified WR Driver

The gates were sized as follows:

1: (1 * 113.92) / 2.57 = 44.32 => Wp = 29.6µm, Wn = 14.8µm
2: (1 * 44.32) / 2.57 = 17.24 => Wp = 11.5µm, Wn = 5.74µm
3: (1 * 17.24) / 2.57 = 6.7 => Wp = 4.45µm, Wn = 2.23µm
4: (1 * 6.7) / 2.57 = 2.607 => Wp = 1.73µm, Wn = 0.86µm
5: (1 * 2.6) / 2.57 = 1.01 => Wp = 0.66µm, Wn = 0.33µm
6: (1 * 59.12) / 2.77 = 21.34 => Wp = 14.2µm, Wn = 7.11µm
7: (1 * 21.34) / 2.77 = 7.703 => Wp = 5.13µm, Wn = 2.56µm
8: (1 * 7.703) / 2.77 = 2.78 => Wp = 1.85µm, Wn = 0.92µm
9: (1 * 2.78) / 2.77 = 1.003 => Wp = 0.66µm, Wn = 0.33µm

Figure 22: Layout of WR Driver

Figure 23: Schematic of WR Driver



• Data Buffer

The data is given through two tristate inverters to the bitlines. The data buffer has to

be appropriately sized to run these tristate inverters. The correct sizing is shown

below. The layout and schematic are shown in figures 25 and 26.

Figure 24: Data Buffer Circuit

Cpoly = (2 fF/µm * 4.22µm) + (2 fF/µm * 0.48µm) = 9.4fF
Cwire = 0.2 fF/µm * 2.38 * 40 = 19.04fF
CLoad = (Cpoly + Cwire) / 2 = (9.4 + 19.04) / 2 = 14.22fF = H
F = GBH = 14.22
Number of stages, N = log(14.22) / log(3.6) = 2.07 = 3 stages in order to invert.
f = F^(1/N) = 2.42

3: (1 * 14.22) / 2.42 = 5.87 => Wp = 3.2µm, Wn = 1.96µm
2: (1 * 5.87) / 2.42 = 2.42 => Wp = 1.61µm, Wn = 0.8µm
1: (1 * 2.42) / 2.42 = 1.00 => Wp = 0.66µm, Wn = 0.33µm



Figure 25: Data Driver Layout

Figure 26: Data Driver Schematic



• Transmission Gate

In our design, transmission gates are used to select the columns. The transmission gates are also sized for optimal speed. The layout and schematic of the transmission gate are shown in Figures 27 and 28, respectively.

Figure 27: Transmission Gate Layout

Figure 28: Transmission Gate Schematic



• Complete Schematic and Layout

Once all the peripheral circuits are designed, all of the units are then integrated to the memory cell array. The complete SRAM schematic including precharge, clock buffer, row decoders, column decoders, sense amplifier and the write circuit is as shown in Figure 29. The corresponding layout of the design along with the rulers is given in Figure 30.

Figure 29: Complete Schematic



Figure 30: Complete Layout

The total area of the design is 107.84 µm * 114.85 µm = 12385.42 µm².

With 128 * 10 = 1280 bits in the array, the total area per bit is

Area/bit = 12385.42 / 1280 = 9.68 µm²



DRC and LVS Reports

The designed layout is checked in Cadence for design rule errors. There were no DRC errors
in the layout. A snapshot of the DRC report is shown in Figure 31. The layout is then
verified against the schematic using Layout versus Schematic (LVS). The LVS run matched,
and the report is shown in Figure 32.

Figure 31: DRC Report for complete SRAM layout



Figure 32: LVS Report of the complete SRAM layout



Simulation and Results

The functionality of the SRAM is tested by writing the 10 bit data word 0110001010 into the first row and first column of each super column of the design, and then reading the written value back in the next clock cycle. When clk is low, all the bitlines are precharged to VDD. During evaluation, the write enable (WR) signal is activated. In this phase the write operation takes place, and the bits of the word are written into the memory cells selected by the row and column address.

Figure 33: SRAM Simulation Result
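The write-then-read test described above can be mirrored at the behavioural level. The Python sketch below is only a functional model of the port list (WR, addr, addr_en, data) with no timing, precharge, or analog behaviour; the class name SramModel, its access method, and the choice of address 0 for the first row and first word are assumptions made for illustration.

```python
class SramModel:
    """Behavioural model only: 128 words of 10 bits, no timing or analog effects."""
    WORDS = 128
    WIDTH = 10

    def __init__(self):
        self._mem = [0] * self.WORDS

    def access(self, addr, wr, addr_en=1, data=None):
        """One evaluate phase: wr=1 writes `data`, wr=0 returns the stored word."""
        if not 0 <= addr < self.WORDS:
            raise ValueError("address out of range")
        if not addr_en:
            return None                      # no word line is selected
        if wr:
            if data is None or not 0 <= data < (1 << self.WIDTH):
                raise ValueError("data must be a 10-bit value")
            self._mem[addr] = data
            return None
        return self._mem[addr]

sram = SramModel()
sram.access(0, wr=1, data=0b0110001010)      # write cycle (assumed address 0)
readback = sram.access(0, wr=0)              # read in the following cycle
print(format(readback, "010b"))
```

A model like this is useful mainly as a reference for comparing against the simulated data outputs; it says nothing about read or write timing.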



The worst case write time is found by reducing the width of the WR pulse until the write no longer works properly. The smallest WR pulse width at which the data is still written correctly is the worst case write time. The read time is measured as the 50%-to-50% delay between the addr_en signal and the data bits being read. The simulated waveforms are shown below.

Figure 34: Worst case time simulation

The operating frequency is calculated as shown below.

Operating frequency = 1 / (2 * Worst case time) = 1 / (2 * 815ps) = 613.5MHz
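The frequency calculation can be checked with a few lines of Python. The sketch takes the larger of the worst case write and read times reported in the results table (815 ps and 714 ps); the factor of 2 is taken from the formula above, and reading it as equal precharge and evaluate phases is an assumption.

```python
def operating_frequency_hz(worst_case_time_s: float) -> float:
    """f = 1 / (2 * t), i.e. the worst case time fills half of the clock period."""
    return 1.0 / (2.0 * worst_case_time_s)

write_time_s = 815e-12   # worst case write time from the results table
read_time_s = 714e-12    # worst case read time from the results table
f_op = operating_frequency_hz(max(write_time_s, read_time_s))
print(f"{f_op / 1e6:.1f} MHz")
```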

The noise margin is calculated by drawing the overlapped VTC (Butterfly diagram) for the

cross-coupled inverters that form the memory cell. The largest square that can fit in the eyes

of the butterfly diagram determines the noise margin. The butterfly diagram obtained is

shown below.



Figure 35: Noise Margin

Conclusion

The SRAM has comparatively low operating frequency but that is a trade-off for the low

area/bit that we have tried to achieve. The memory cell has good noise margin and good

control over read and write operations.

Parameter Value

Aspect Ratio 1.065

Worst case write time 815ps

Worst case read time 714ps

Operating frequency 613.5MHz

Noise Margin 0.35V