SRAM Design and Layout
EE 7325 Page 1
Project Description
• Design and layout of a 128-word SRAM using the IBM 130 nm process. The key
design tools used are Cadence's Virtuoso for layout editing, DRC (design rule
checking), LVS (layout versus schematic, for verifying that the layout matches the
schematic netlist) and circuit simulation (for measuring the read/write times).
• Word size is 10 bits.
• An output capacitance of 30 fF is used for all outputs when simulating for delays.
• All input signals and clocks are provided by inverters sized PMOS = 0.75 µm and
NMOS = 0.25 µm.
Introduction
Static random access memory (SRAM) is a type of volatile semiconductor memory, meaning
it retains data only as long as it is powered. SRAM uses bistable latching circuitry made of
transistors to store each bit. Unlike dynamic RAM (DRAM), SRAM does not store data on a
capacitor, so it works without refreshing. SRAM is often used as cache memory.
The most commonly used SRAM cell consists of six transistors; this configuration is called
the 6T memory cell. It consists of two cross-coupled inverters and two access transistors.
Figure 1: 6T SRAM Cell
The access transistors are connected to the word line (WL) at their respective gate terminals, and the bit lines (BL and BLbar) at their source/drain terminals. The word line is used to select the cell while the bit lines are used to perform read or write operations on the cell.
Read Operation
Figure 2: Read Operation
The read operation of the memory cell is illustrated in Figure 2. Assume that a "0" is stored
on the left side of the cell and a "1" on the right side, so M1 is on and M2 is off. Initially, BL
and BLbar are precharged to VDD. When a row is selected by making the word line
active, access transistors M3 and M4 are turned on. Current begins to flow through M3 and M1
to ground, so the cell discharges the bit line capacitance Cbit. On the other side of the cell, the
voltage on BLbar remains high since there is no path to ground through M2. The difference
between BL and BLbar is fed to a sense amplifier to generate a valid low output.
Write Operation
Figure 3: Write Operation
To write to the cell, it has to be driven from both sides: a "1" is placed on one of the
bit lines and a "0" on the other. This flips the value that was stored in the cell
and writes the new value. The access transistors controlled by WL must be ON during both read and write operations.
SRAM Implementation
The top-level block diagram of the SRAM is shown in Figure 4.
Figure 4: Top Level Block Diagram
The signal description is as follows:
Port | I/O | Type | Description
WR | Input | 1 bit | Write/Read signal: 1 = Write, 0 = Read
clk | Input | | Clock signal: 0 = Precharge, 1 = Evaluate
addr0-6 | Input | 7 bit | Input address
addr_en | Input | 1 bit | Address enable; a word line is selected only when addr_en = 1
data0-9 | Bidirectional | 10 bit | SRAM data; when WR is 0, the data stored in the SRAM is read; when WR is 1, the data is written to the SRAM
vdd, vss | Inputs | | Supply (1.2 V) and gnd
A 128-word SRAM with a word size of 10 bits has 128*10 = 1280 memory cells. The array is
designed to have 40 columns and 32 rows. Hence, a 5-bit address is needed to select one
of the 32 rows (word lines) and a 2-bit address to select one of the four words in that row. The overall
architecture of the memory design is shown in Figure 5.
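The organization described above can be sketched as a small address-mapping helper. This is only an illustration: the report does not say which address bits select the row and which select the word, so the split used here (upper five bits for the row, lower two bits for the word) is an assumption, as are the function names and the interleaved column numbering (bit k of word w at column 4*k + w).

```python
ROWS = 32            # word lines, selected by 5 address bits
WORDS_PER_ROW = 4    # words sharing a row, selected by 2 address bits
WORD_BITS = 10       # data0-9
COLUMNS = WORDS_PER_ROW * WORD_BITS  # 40 bit-line pairs


def split_address(addr):
    """Split a 7-bit word address into (row, word_select).

    Assumption: the low 2 bits select the word, the high 5 bits the row.
    """
    if not 0 <= addr < ROWS * WORDS_PER_ROW:
        raise ValueError("address must fit in 7 bits")
    return addr >> 2, addr & 0b11


def columns_for_word(word_select):
    """Columns holding bits 0..9 of the selected word (assumed interleaving)."""
    return [bit * WORDS_PER_ROW + word_select for bit in range(WORD_BITS)]


row, word = split_address(0b0000101)
print(row, word, columns_for_word(word))
```

With this numbering, each group of four adjacent columns holds the same bit position of the four words in a row.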
Figure 5: Memory Architecture
Now our goal is to design each individual unit of this architecture, integrate and ensure that
the read and write operations are working correctly for the design.
Component Design
• SRAM Cell
The layout and schematic of the designed SRAM cell are illustrated in Figure 6 and 7.
Figure 6: Memory Cell Layout
Figure 7: Memory Cell Schematic
Since there are usually millions of bits to be stored in these memories, in order to achieve the
minimum area, all the transistors are minimum size (0.28µm here).
Width = 2.49 µm, Length = 2.38 µm => Aspect ratio = $W/L$ = 1.04
Hence, the area per memory cell is 2.49 * 2.38 = 5.92 µm²
• Precharge Circuit
In both read and write operations, the bitlines are initially pulled up to a high voltage
by a precharge circuit, whose schematic is shown in Figure 9.
A clock input is applied to the two pull-up transistors, called the
balance transistors, connected between the two bitlines. When the wordline (WL)
signal goes high during a read, one bitline remains high and the other falls until WL goes low. The
layout of the precharge circuit is shown in Figure 8.
Figure 8: Layout of the Precharge Circuit
Figure 9: Schematic of the Precharge Circuit
• Clock Driver Circuit
Since a clocked precharge circuit is used to charge the bitlines, the clock buffer circuit must be sized as well. All calculations assume that the clock drives 2 PFETs between every BL and BLbar pair, that is, a total of 2*40 PFETs.
$C_{poly} = 2\ \text{fF}/\mu\text{m} \times 2\ \mu\text{m} \times 2 \times 40 = 320\ \text{fF}$
$C_{wire} = 0.2\ \text{fF}/\mu\text{m} \times \text{(width of the memory cell)} \times \text{(number of columns)} = 0.2\ \text{fF}/\mu\text{m} \times 2.38 \times 40 = 19.04\ \text{fF}$
$C_{Load} = \frac{C_{poly} + C_{wire}}{2} = \frac{339.04}{2} = 169.52\ \text{fF}$
$F = GBH = 169.52$
Number of stages, $N = \frac{\log 169.52}{\log 3.6} = 4$ stages
$f = F^{1/N} = 3.6$
Hence the circuit is as below
Figure 10: Clock Driver Circuit
The sizing equation is $C_{in} = \frac{g \times C_{out}}{f}$
4: $\frac{1 \times 169.52}{3.6} = 47.08$ => Wp = 31.38 µm, Wn = 15.69 µm
3: $\frac{1 \times 47.08}{3.6} = 13.08$ => Wp = 8.72 µm, Wn = 4.36 µm
2: $\frac{1 \times 13.08}{3.6} = 3.633$ => Wp = 2.42 µm, Wn = 1.21 µm
1: $\frac{1 \times 3.633}{3.6} = 1.00$ => Wp = 0.66 µm, Wn = 0.33 µm
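The clock driver sizing can be reproduced with a short logical-effort sketch. The function names are illustrative, and the split of each normalized input capacitance into widths (Wp:Wn = 2:1 with Wp + Wn equal to the capacitance value) is an assumption inferred from the listed widths rather than something the report states.

```python
import math


def stage_count(path_effort, target_stage_effort=3.6):
    """Real-valued estimate N = log(F) / log(f_target)."""
    return math.log(path_effort) / math.log(target_stage_effort)


def size_inverter_chain(c_load, n_stages, stage_effort):
    """Input capacitance of each inverter, from the load back to the input.

    Applies Cin = g * Cout / f with g = 1 for an inverter.
    """
    sizes = []
    c_out = c_load
    for _ in range(n_stages):
        c_in = c_out / stage_effort
        sizes.append(c_in)
        c_out = c_in
    return sizes


def widths(c_in):
    """Split a normalized input capacitance into (Wp, Wn) in um.

    Assumption: Wp:Wn = 2:1 and Wp + Wn equals c_in.
    """
    return 2 * c_in / 3, c_in / 3


F = 169.52                      # path effort of the clock driver
n = round(stage_count(F))       # rounds to 4 stages
f = F ** (1 / n)                # close to 3.6
chain = size_inverter_chain(F, n, 3.6)  # sized with f rounded to 3.6
for stage, c in zip(range(n, 0, -1), chain):
    wp, wn = widths(c)
    print(f"stage {stage}: Cin = {c:.2f}, Wp = {wp:.2f} um, Wn = {wn:.2f} um")
```

Small differences in the last digit relative to the listed values come from rounding intermediate results.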
The layout and schematic of the clock driver circuit are shown in Figure 11.
Figure 11: Layout and Schematic of Clock Driver
• Sense Amplifier
The designed SRAM uses ten identical sense amplifiers to output all ten data bits simultaneously. The design uses a current-mode, differential-input, single-ended sense amplifier, which attenuates common-mode noise and amplifies differential-mode signals. The main reason for using this type of sense amplifier is to improve the noise immunity and speed of the read circuit. The differential signal that develops between the two bit lines during a read operation is amplified by the differential pair. The transistors are sized such that the differential voltage is amplified suitably for the read operation. The output of the sense amplifier is then passed through a pair of inverters to produce a digital output. The inverted write (WR) signal drives the gate of the current source transistor so that the sense amplifier is enabled only during read operations. The schematic and layout of the sense amplifier are shown in the figures below.
Figure 10: Sense Amplifier Layout
Figure 11: Sense Amplifier Schematic
• Row Decoder
Access time and power consumption of memories may be largely determined by
decoder design. A row decoder takes an n-bit address and produces 2^n outputs, and
is used to select the required row in the memory array. The required
wordline is activated based on the address given to the decoder. In our design we have
32 rows, hence n = 5 address bits are used to select a row. Since the row decoder is
used to activate one of the 2^5 = 32 wordlines, it has to be sized suitably using logical effort
based on the capacitance of the wordline. The gate-level schematic of one stage of the
row decoder is shown in Figure 12.
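The decoding behavior described above can be modeled in a few lines. This is a behavioral sketch only, not the gate-level circuit; the function name is illustrative, and the gating by addr_en follows the signal description given earlier.

```python
def row_decode(row_addr, n_bits=5, addr_en=1):
    """Return the 2**n_bits word-line values for a row address.

    Exactly one output is 1 when addr_en is 1; all outputs are 0
    when addr_en is 0.
    """
    n_lines = 2 ** n_bits
    if not 0 <= row_addr < n_lines:
        raise ValueError("row address out of range")
    return [1 if (addr_en and i == row_addr) else 0 for i in range(n_lines)]


wordlines = row_decode(0b00101)
print(wordlines.index(1), len(wordlines))
```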
Figure 12: Row Decoder Circuit
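The gates are sized as below.
$C_{poly} = 2\ \text{fF}/\mu\text{m} \times 2\ \mu\text{m} \times 0.28 \times 40 = 44.8\ \text{fF}$
$C_{wire} = 0.2\ \text{fF}/\mu\text{m} \times 2.38 \times 40 = 19.04\ \text{fF}$
$C_{Load} = \frac{C_{poly} + C_{wire}}{2} = \frac{63.84}{2} = 31.92\ \text{fF} = H$
$B = 16$, $G = \frac{5}{3} \times \frac{5}{3} = \frac{25}{9}$
$F = GBH = 1418.67$
Number of stages, $N = \frac{\log 1418.67}{\log 3.6} = 5.66 \approx 6$ stages
$f = F^{1/N} = 2.82$ (this value corresponds to the seven gates sized below, since $1418.67^{1/7} \approx 2.82$)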
1: $\frac{1 \times 31.92}{2.82} = 11.32$ => Wp = 7.55 µm, Wn = 3.78 µm
2: $\frac{1 \times 11.32}{2.82} = 4.02$ => Wp = 2.68 µm, Wn = 1.34 µm
3: $\frac{5 \times 4.02}{3 \times 2.82} = 2.38$ => Wp = 0.48 µm, Wn = 1.9 µm
4: $\frac{5 \times 2.38}{3 \times 2.82} = 1.4$ => Wp = 0.56 µm, Wn = 0.84 µm
5: $\frac{16 \times 1.4}{2.82} = 7.95$ => Wp = 5.3 µm, Wn = 2.65 µm
6: $\frac{1 \times 7.95}{2.82} = 2.82$ => Wp = 1.88 µm, Wn = 0.94 µm
7: $\frac{1 \times 2.82}{2.82} = 1.00$ => Wp = 0.66 µm, Wn = 0.33 µm
Two more inverters are used to get the non-inverted input:
8: $\frac{2.82}{\sqrt{2.82}} = 1.68$ => Wp = 1.12 µm, Wn = 0.56 µm
9: $\frac{1.68}{\sqrt{2.82}} = 1.00$ => Wp = 0.66 µm, Wn = 0.33 µm
10: $\frac{16 \times 2.8}{2.82} = 15.89$ => Wp = 10.6 µm, Wn = 5.3 µm
11: $\frac{15.89}{2.82} = 5.64$ => Wp = 3.76 µm, Wn = 1.88 µm
12: $\frac{5.64}{2.82} = 2$ => Wp = 1.34 µm, Wn = 0.67 µm
13: $\frac{2}{2.82} = 0.71$ => The transistor widths are below the minimum. Hence, Wp = 0.66 µm and
Wn = 0.33 µm are chosen.
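The backward sizing of gates 1 to 7 can be checked with a generic path-sizing sketch. The per-gate logical efforts (5/3 for gates 3 and 4, 1 elsewhere) and the branching factor of 16 at gate 5 are inferred from the listed results; the function and variable names are illustrative.

```python
def size_path(c_load, stages, stage_effort):
    """Size a path from the load back to the input.

    stages is a list of (logical_effort, branching) pairs ordered from the
    gate nearest the load to the gate nearest the input. Each gate gets
    Cin = g * b * Cout / f, where Cout is the Cin of the gate it drives.
    """
    sizes = []
    c_out = c_load
    for g, b in stages:
        c_in = g * b * c_out / stage_effort
        sizes.append(c_in)
        c_out = c_in
    return sizes


# Gates 1-7 of the row decoder: (g, b) per gate, assumed as described above.
ROW_DECODER = [(1, 1), (1, 1), (5 / 3, 1), (5 / 3, 1), (1, 16), (1, 1), (1, 1)]
row_sizes = size_path(31.92, ROW_DECODER, 2.82)
for gate, c in enumerate(row_sizes, start=1):
    print(f"gate {gate}: Cin = {c:.2f}")
```

The printed values agree with the listed ones to within rounding of the intermediate results.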
Figure 13: Layout and Schematic of Row Decoder
• Column Decoder
After precharging all the bitlines to a high voltage, the next step is to select the columns
of the memory cell array that will be involved in the read or write operation. This
column selection is performed using a decoder/multiplexer combination. The m-bit
column address is used to select one or more of the 2^m columns. In our case, the array
is designed such that four words are placed in a row, with all the first bits of the words
together and so on. The column decoder therefore selects one bit from among four,
so 2 address bits are used to select a column. The transistors are sized based on the
bitline capacitances. The column decoder is sized as below.
Figure 15: Column Decoder
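$C_{poly} = 2\ \text{fF}/\mu\text{m} \times \text{(Wn of the transmission gate)} \times \text{(number of gates)} = 2\ \text{fF}/\mu\text{m} \times 1\ \mu\text{m} \times 20 = 40\ \text{fF}$
$C_{wire} = 0.2\ \text{fF}/\mu\text{m} \times \text{(width of the memory cell)} \times \text{(number of columns)} = 0.2\ \text{fF}/\mu\text{m} \times 2.38 \times 40 = 19.04\ \text{fF}$
$C_{Load} = \frac{C_{poly} + C_{wire}}{2} = \frac{59.04}{2} = 29.52\ \text{fF} = H$
$F = GBH = \frac{4}{3} \times 29.52 \times 4 = 157.44$
Number of stages, $N = \frac{\log 157.44}{\log 3.6} = 4$ stages
$f = F^{1/N} = 3.54$
The circuit is modified as shown below to get the required number of stages.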
Figure 16: One input-output block of the column decoder with the required number of stages
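The gates were sized as follows:
7&9: $\frac{1 \times 29.52}{3.54} = 8.33$ => Wp = 5.55 µm, Wn = 2.77 µm
6: $\frac{1 \times 8.33}{\sqrt{3.54}} = 4.42$ => Wp = 2.94 µm, Wn = 1.47 µm
5&8: $\frac{1 \times 4.42}{\sqrt{3.54}} = 2.35$ => Wp = 1.56 µm, Wn = 0.78 µm
4: $\frac{\frac{4}{3} \times 2 \times 2.35}{3.54} = 1.77$ => Wp = 0.88 µm, Wn = 0.88 µm
1&2: $\frac{1 \times 2 \times 1.77}{3.54} = 1$ => Wp = 0.66 µm, Wn = 0.33 µm
3: $\frac{1 \times 2 \times 1.77}{\sqrt{3.54}} = 1.88$ => Wp = 1.25 µm, Wn = 0.62 µm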
The layout and schematic of the column decoder are shown below.
Figure 17: Layout and Schematic of the Column Decoder
• Write Driver
During precharge, both the BL and BLbar lines are charged to VDD. For a write
operation, one of the bitlines must be driven high and the other low, based on the data bit
that is being written. The schematic of the write circuitry used in our design
is shown in Figure 19. During a write operation, the WR signal goes high and the 10-bit data word can
be written by applying the required values to the corresponding input bits. These values are
then passed through a set of pass transistors attached to the BL and BLbar lines so
that each data bit is written into the corresponding memory cell.
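A behavioral sketch of the write path may help here. It models only the logic levels, not the circuit in Figure 19; the polarity (BL carries the data bit and BLbar its complement) and the use of None for a disconnected driver are assumptions made for illustration.

```python
def write_driver(data_bit, wr):
    """Return the (BL, BLbar) levels driven for one column.

    When wr is 1 the bit lines are driven to complementary values.
    When wr is 0 the driver is disconnected; None models high impedance,
    leaving the precharged bit lines to the cell and sense amplifier.
    """
    if wr:
        return (1, 0) if data_bit else (0, 1)
    return (None, None)


def write_word(word_bits, wr=1):
    """Apply write_driver to every bit of a data word."""
    return [write_driver(bit, wr) for bit in word_bits]


print(write_word([0, 1, 1, 0, 0, 0, 1, 0, 1, 0]))
```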
Figure 18: Write Driver Layout
Figure 19: Write Driver Schematic
• Write Enable (WR) Driver
Since the WR signal drives two NFETs in each column, a total of 20 NFETs are
driven by WR. In addition, its complement is applied to the 20 PFETs of the write driver.
Hence, the buffer circuit for WR must be sized so that it can drive the required load.
The transistor sizing is given below. The layout and schematic of the buffer circuit
are shown in Figures 22 and 23, respectively.
Figure 20: WR Driver
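Load to WRbar is
$C_{poly} = (2\ \text{fF}/\mu\text{m} \times 4.22\ \mu\text{m} \times 20) + (2\ \text{fF}/\mu\text{m} \times 2 \times 10) = 208.8\ \text{fF}$
$C_{wire} = 0.2\ \text{fF}/\mu\text{m} \times 2.38 \times 40 = 19.04\ \text{fF}$
$C_{Load} = \frac{C_{poly} + C_{wire}}{2} = \frac{208.8 + 19.04}{2} = 113.92\ \text{fF} = H$
$F = GBH = 113.92$
Number of stages, $N = \frac{\log 113.92}{\log 3.6} = 3.69$ stages
In order to get the inversion, 5 stages were chosen.
$f = F^{1/N} = 2.57$
Load to WR is
$C_{poly} = (2\ \text{fF}/\mu\text{m} \times 0.48\ \mu\text{m} \times 20) + (2\ \text{fF}/\mu\text{m} \times 2 \times 10) = 59.2\ \text{fF}$
$C_{wire} = 0.2\ \text{fF}/\mu\text{m} \times 2.38 \times 40 = 19.04\ \text{fF}$
$C_{Load} = \frac{C_{poly} + C_{wire}}{2} = \frac{59.2 + 19.04}{2} = 39.12\ \text{fF} = H$
$F = GBH = 39.12$
Number of stages, $N = \frac{\log 39.12}{\log 3.6} = 2.86 \approx 4$ stages
$f = F^{1/N} = 2.77$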
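The stage counts chosen in this report follow a pattern that can be sketched as: estimate N from the path effort, round it, then add a stage if the polarity is wrong. This rule is inferred from the numbers in the report rather than stated by it, and the function name is illustrative.

```python
import math


def choose_stages(path_effort, inverting, target_stage_effort=3.6):
    """Pick a stage count with the required polarity.

    An odd number of inverting stages gives an inverted output and an
    even number a non-inverted one. Assumed rule: round the estimate
    log(F) / log(f_target), then add one stage if the parity is wrong.
    """
    n = max(1, round(math.log(path_effort) / math.log(target_stage_effort)))
    if (n % 2 == 1) != inverting:
        n += 1
    return n


print(choose_stages(113.92, inverting=True))   # WRbar path
print(choose_stages(39.12, inverting=False))   # WR path
```

The same rule also matches the clock driver (four stages, non-inverting) and the data buffer (three stages, inverting).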
The circuit is modified as shown below to get the required number of stages
Figure 21: Modified WR Driver
The gates were sized as follows:
1: $\frac{1 \times 113.92}{2.57} = 44.32$ => Wp = 29.6 µm, Wn = 14.8 µm
2: $\frac{1 \times 44.32}{2.57} = 17.24$ => Wp = 11.5 µm, Wn = 5.74 µm
3: $\frac{1 \times 17.24}{2.57} = 6.7$ => Wp = 4.45 µm, Wn = 2.23 µm
4: $\frac{1 \times 6.7}{2.57} = 2.607$ => Wp = 1.73 µm, Wn = 0.86 µm
5: $\frac{1 \times 2.6}{2.57} = 1.01$ => Wp = 0.66 µm, Wn = 0.33 µm
6: $\frac{1 \times 59.12}{2.77} = 21.34$ => Wp = 14.2 µm, Wn = 7.11 µm
7: $\frac{1 \times 21.34}{2.77} = 7.703$ => Wp = 5.13 µm, Wn = 2.56 µm
8: $\frac{1 \times 7.703}{2.77} = 2.78$ => Wp = 1.85 µm, Wn = 0.92 µm
9: $\frac{1 \times 2.78}{2.77} = 1.003$ => Wp = 0.66 µm, Wn = 0.33 µm
Figure 22: Layout of WR Driver
Figure 23: Schematic of WR Driver
• Data Buffer
The data is given through two tristate inverters to the bitlines. The data buffer has to
be appropriately sized to run these tristate inverters. The correct sizing is shown
below. The layout and schematic are shown in figures 25 and 26.
Figure 24: Data Buffer Circuit
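$C_{poly} = (2\ \text{fF}/\mu\text{m} \times 4.22\ \mu\text{m}) + (2\ \text{fF}/\mu\text{m} \times 0.48\ \mu\text{m}) = 9.4\ \text{fF}$
$C_{wire} = 0.2\ \text{fF}/\mu\text{m} \times 2.38 \times 40 = 19.04\ \text{fF}$
$C_{Load} = \frac{C_{poly} + C_{wire}}{2} = \frac{9.4 + 19.04}{2} = 14.22\ \text{fF} = H$
$F = GBH = 14.22$
Number of stages, $N = \frac{\log 14.22}{\log 3.6} = 2.07 \approx 3$ stages in order to invert.
$f = F^{1/N} = 2.42$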
3: $\frac{1 \times 14.22}{2.42} = 5.87$ => Wp = 3.2 µm, Wn = 1.96 µm
2: $\frac{1 \times 5.87}{2.42} = 2.42$ => Wp = 1.61 µm, Wn = 0.8 µm
1: $\frac{1 \times 2.42}{2.42} = 1.00$ => Wp = 0.66 µm, Wn = 0.33 µm
Figure 25: Data Driver Layout
Figure 26: Data Driver Schematic
• Transmission Gate
In our design, transmission gates are used to select the columns. The transmission gates are also sized for optimal speed. The layout and schematic of the transmission gate are shown in Figures 27 and 28, respectively.
Figure 27: Transmission Gate Layout
Figure 28: Transmission Gate Schematic
• Complete Schematic and Layout
Once all the peripheral circuits are designed, all of the units are then integrated to the memory cell array. The complete SRAM schematic including precharge, clock buffer, row decoders, column decoders, sense amplifier and the write circuit is as shown in Figure 29. The corresponding layout of the design along with the rulers is given in Figure 30.
Figure 29: Complete Schematic
Figure 30: Complete Layout
The total area of the design is 107.84 µm * 114.85 µm = 12385.42 µm². With 128 * 10 = 1280 memory cells, the total area that accounts for one bit is Area/bit = 12385.42 / 1280 = 9.68 µm².
DRC and LVS Reports
The designed layout is checked in Cadence for design rule errors; there were no DRC errors
in the layout. A snapshot of the DRC report is shown in Figure 31. The layout is then
compared against the schematic netlist using LVS (layout versus schematic). The LVS check matched, and the report is
shown in Figure 32.
Figure 31: DRC Report for complete SRAM layout
Figure 32: LVS Report of the complete SRAM layout
Simulation and Results
The functionality of the SRAM is tested by writing the 10-bit data word 0110001010 into the first row and first column of each super column of the design and then reading the written value in the next clock cycle. When clk is low, all the bitlines are precharged to VDD. During evaluation, the write enable (WR) signal is activated. During this phase, the write operation takes place and the word bits are written into the corresponding memory cells, depending on the row and column address.
Figure 33: SRAM Simulation Result
The worst-case write time is found by reducing the width of the WR pulse until the write no longer works properly. The smallest WR pulse width at which the data is still written correctly is the worst-case write time. The read time is measured as the 50%-50% delay between the addr_en signal and the data bits being read. The simulated waveforms are shown in Figure 34.
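The 50%-50% delay measurement can be sketched as follows. The waveform samples below are synthetic and chosen only to exercise the code; they are not taken from the simulation, and the function names are illustrative.

```python
def crossing_time(times, values, level):
    """First time the sampled waveform crosses `level` (linear interpolation)."""
    samples = list(zip(times, values))
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if (v0 - level) * (v1 - level) <= 0 and v0 != v1:
            return t0 + (level - v0) * (t1 - t0) / (v1 - v0)
    return None


def delay_50(times, v_in, v_out, vdd=1.2):
    """Delay between the 50% crossings of an input and an output waveform."""
    half = vdd / 2
    t_in = crossing_time(times, v_in, half)
    t_out = crossing_time(times, v_out, half)
    if t_in is None or t_out is None:
        return None
    return t_out - t_in


# Synthetic example (times in ps, voltages in V).
t = [0, 100, 200, 300, 400]
addr_en = [0.0, 1.2, 1.2, 1.2, 1.2]
data_out = [0.0, 0.0, 0.0, 1.2, 1.2]
print(delay_50(t, addr_en, data_out))
```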
Figure 34: Worst case time simulation
The operating frequency is calculated as shown below.
Operating frequency = $\frac{1}{2 \times \text{worst case time}} = \frac{1}{2 \times 815\ \text{ps}} = 613.5\ \text{MHz}$
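As a quick check of this arithmetic, the calculation can be written out directly. The function name is illustrative; the interpretation that half a clock period (the evaluate phase) must cover the worst-case access time is an assumption consistent with the precharge/evaluate clocking described earlier.

```python
def operating_frequency(worst_case_time_s):
    """Clock frequency when half a period must cover the worst-case time."""
    return 1.0 / (2.0 * worst_case_time_s)


f_max = operating_frequency(815e-12)  # worst-case write time of 815 ps
print(f"{f_max / 1e6:.1f} MHz")       # prints 613.5 MHz
```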
The noise margin is calculated by drawing the overlapped voltage transfer curves (butterfly diagram) of the
cross-coupled inverters that form the memory cell. The side of the largest square that fits in the eyes
of the butterfly diagram determines the noise margin. The butterfly diagram obtained is
shown in Figure 35.
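The largest-square construction can be sketched numerically. The inverter transfer curve below is a synthetic tanh-shaped stand-in, not the simulated curve from this design, so the resulting number is not expected to match the reported noise margin. For each point on the mirrored curve, the sketch finds the square whose opposite corner lies on the other curve along a 45-degree diagonal and keeps the largest one. All names are illustrative.

```python
import math

VDD = 1.2


def vtc(v_in, gain=10.0):
    """Synthetic inverter transfer curve (assumption: smooth tanh shape)."""
    return 0.5 * VDD * (1.0 - math.tanh(gain * (v_in - 0.5 * VDD)))


def square_side(y0, f):
    """Side of the square with lower-left corner (f(y0), y0) on the mirrored
    curve and upper-right corner on the curve y = f(x)."""
    x0 = f(y0)

    def gap(s):
        return f(x0 + s) - (y0 + s)

    if gap(0.0) <= 0.0:
        return 0.0
    lo, hi = 0.0, VDD
    for _ in range(60):  # bisection on the monotonically decreasing gap
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return lo


def static_noise_margin(f, samples=2000):
    """Largest square in the upper-left eye of the butterfly diagram."""
    best = 0.0
    for i in range(samples + 1):
        y0 = 0.5 * VDD + 0.5 * VDD * i / samples
        best = max(best, square_side(y0, f))
    return best


print(f"{static_noise_margin(vtc):.3f} V")
```

Because the synthetic curve is symmetric, both eyes give the same result; for measured curves of two different inverters, the smaller of the two eyes would be taken.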
Figure 35: Noise Margin
Conclusion
The SRAM has comparatively low operating frequency but that is a trade-off for the low
area/bit that we have tried to achieve. The memory cell has good noise margin and good
control over read and write operations.
Parameter | Value
Aspect ratio | 1.065
Worst case write time | 815 ps
Worst case read time | 714 ps
Operating frequency | 613.5 MHz
Noise margin | 0.35 V