SRAM Adi Teman
(83-313)
Lecture 8:
SRAM
Prof. Adam Teman
2 May 2021
Disclaimer: This course was prepared, in its entirety, by Adam Teman. Many materials were copied from sources freely available on the internet. When possible, these sources have been cited;
however, some references may have been cited incorrectly or overlooked. If you feel that a picture, graph, or code example has been copied from you and either needs to be cited or removed,
please feel free to email [email protected] and I will address this as soon as possible.
Lecture Content
© Adam Teman,
May 2, 2021
First Look at Memory
Why Memory?
Source: Intel
Intel Pentium-M (2001) – 2MB L3 Cache
Source: wccftech.com
Special Considerations
• The “core” of the memory array is huge.
It can sometimes take up most of the chip area.
• For this reason, we will try to make the “bitcell” as small as possible.
• A standard Flip Flop uses at least 10 transistors per bit
(usually more than 20). This is very area consuming.
• We will trade-off area for other circuit properties:
• Noise Margins
• Logic Swing
• Speed
• Design Rules
• This requires special peripheral circuitry.
Memory Architecture
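[Figure: memory array organization. A row decoder drives the word lines; each storage cell sits at the crossing of a word line and a bit line.]
• Memory size: W words of C bits = W × C bits
• Address bus: A bits → $W = 2^A$
• Address bits ADD_{A-1}:ADD_M feed the row decoder.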
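The sizing relations on this slide can be checked with a few lines of Python. This is a minimal sketch: the function name and the `column_bits` parameter (M, the number of low address bits assumed to go to column selection rather than the row decoder) are illustrative, not taken from the lecture.

```python
def memory_geometry(address_bits, bits_per_word, column_bits=0):
    """Return (words, total_bits, rows, columns) for a memory array.

    address_bits  -- A, width of the address bus
    bits_per_word -- C, width of one word
    column_bits   -- M, low address bits used for column selection (assumption)
    """
    if not 0 <= column_bits <= address_bits:
        raise ValueError("column_bits must be between 0 and address_bits")
    words = 2 ** address_bits                    # W = 2^A
    total_bits = words * bits_per_word           # W x C
    rows = 2 ** (address_bits - column_bits)     # number of word lines
    columns = bits_per_word * 2 ** column_bits   # C x 2^M cells per row
    return words, total_bits, rows, columns


geometry = memory_geometry(address_bits=10, bits_per_word=8, column_bits=2)
print(geometry)
```

Folding M address bits into the columns keeps the total capacity unchanged while making the array shorter and wider.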
Basic Static Memory Element
[Figure: cross-coupled inverter pair with storage nodes Q and QB.]
Positive Feedback: Bi-Stability
Writing into a Cross-Coupled Pair
• The write operation is ratioed
• The access transistor must overcome the feedback.
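[Figure: cross-coupled pair (nodes Q, QB) written from input D through an access transistor controlled by En.]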
How should we write a ‘1’?
• Option 1: nMOS access transistor. Passes a “weak ‘1’”, bad at pulling up against the feedback.
• Option 2: pMOS access transistor. Passes a “weak ‘0’”, bad at pulling down against the feedback.
6-transistor CMOS SRAM Cell
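[Figure: 6T cell schematic. Q side: pull-down M1, access M2 (to BL), pull-up M3. QB side: pull-down M4, access M5 (to BLB), pull-up M6. WL drives both access transistors.]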
The Computer Hall of Fame
• The machine that introduced the GUI, the mouse,
and Steve Jobs to the mainstream.
SRAM Operation: HOLD
[Figure: 6T cell schematic (BL, BLB, WL, M1–M6, Q, QB) in the hold state.]
SRAM Operation: READ
[Figure: 6T cell during read, with waveforms. WL rises from 0 to VDD; BL and BLB start at VDD; the cell stores Q=VDD, QB=0.]
• Left side (BL): nothing changes…
• Right side (M4, M5): “nMOS” inverter – the QB voltage rises to QB=ΔV.
SRAM Operation - Read
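[Figure: 6T cell during read with Q=‘1’, QB=‘0’; WL, BL and BLB at VDD; bitline capacitances C_BL and C_BLB. On the QB side, access transistor M5 and pull-down M4 set QB=ΔV.]
Cell Ratio:
$$CR = \frac{W_4/L_4}{W_5/L_5}$$
$$k_{M5}\left[(V_{DD}-\Delta V-V_{T,n})V_{DSat,n}-\frac{V_{DSat,n}^2}{2}\right] = k_{M4}\left[(V_{DD}-V_{T,n})\Delta V-\frac{\Delta V^2}{2}\right]$$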
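To see how the cell ratio controls the read disturb, the current balance on this slide can be solved for ΔV in closed form. The sketch below assumes k_M4/k_M5 = CR (both devices share the same process transconductance) and uses illustrative device values (VDD = 2.5 V, VT,n = 0.4 V, VDSat,n = 0.6 V) that are not taken from the lecture.

```python
import math


def read_disturb_voltage(cr, vdd=2.5, vt_n=0.4, vdsat_n=0.6):
    """Solve the read current balance for the QB bump (dV).

    Rearranging the balance gives a quadratic in dV; the smaller root is
    the physical one. All default device values are assumptions.
    """
    a = cr * (vdd - vt_n)
    disc = vdsat_n ** 2 * (1 + cr) + a ** 2
    return (vdsat_n + a - math.sqrt(disc)) / cr


def balance_residual(dv, cr, vdd=2.5, vt_n=0.4, vdsat_n=0.6):
    """Access current minus pull-down current, both normalized to k_M5."""
    access = (vdd - dv - vt_n) * vdsat_n - vdsat_n ** 2 / 2
    pull_down = cr * ((vdd - vt_n) * dv - dv ** 2 / 2)
    return access - pull_down


for cr in (1.0, 1.5, 2.0, 3.0):
    print(f"CR={cr:.1f}  dV={read_disturb_voltage(cr):.3f} V")
```

A larger CR (stronger pull-down relative to the access transistor) gives a smaller ΔV, which is what keeps the cell from flipping during a read.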
SRAM Operation: WRITE
[Figure: 6T cell during write, with waveforms. WL rises from 0 to VDD; BL=VDD, BLB=0; the cell initially stores Q=0, QB=VDD.]
• Left side (M1, M2): same as during read – designed so ΔV<VM; Q=ΔV.
• Right side (M5, M6, BLB): pseudo-nMOS inverter! QB=VOLmin.
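SRAM Operation - Write
[Figure: 6T cell during write with Q=‘0’, QB=‘1’, BL=VDD, BLB=‘0’. Pull-up M6 (gate Q) and access transistor M5 (gate WL, to BLB) set QB=VOLmin.]
Pull-Up Ratio:
$$PR = \frac{W_6/L_6}{W_5/L_5}$$
$$k_{M6}\left[(V_{DD}-|V_{T,p}|)V_{DSat,p}-\frac{V_{DSat,p}^2}{2}\right] = k_{M5}\left[(V_{DD}-V_{T,n})V_{QB}-\frac{V_{QB}^2}{2}\right]$$
$$V_{QB} = V_{DD}-V_{T,n}-\sqrt{(V_{DD}-V_{T,n})^2-2\frac{\mu_p}{\mu_n}PR\left[(V_{DD}-|V_{T,p}|)V_{DSat,p}-\frac{V_{DSat,p}^2}{2}\right]}$$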
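The closed-form expression for V_QB on this slide is easy to evaluate numerically. The following is a sketch under stated assumptions: the device values (VDD = 2.5 V, thresholds of 0.4 V, VDSat,p = 1.0 V, μp/μn = 0.3) are illustrative only, and `vt_p` and `vdsat_p` are treated as magnitudes.

```python
import math


def write_node_voltage(pr, vdd=2.5, vt_n=0.4, vt_p=0.4, vdsat_p=1.0,
                       mobility_ratio=0.3):
    """Evaluate V_QB during a write for a given pull-up ratio PR.

    Returns None when the square root has no real solution, i.e. when the
    access transistor cannot sink the pull-up current in this simple model.
    """
    pull_up = (vdd - vt_p) * vdsat_p - vdsat_p ** 2 / 2
    disc = (vdd - vt_n) ** 2 - 2 * mobility_ratio * pr * pull_up
    if disc < 0:
        return None
    return vdd - vt_n - math.sqrt(disc)


for pr in (0.5, 1.0, 1.5, 2.0):
    print(f"PR={pr:.1f}  V_QB={write_node_voltage(pr):.3f} V")
```

A smaller PR (weaker pull-up relative to the access transistor) pulls QB lower, which makes the write easier; this is the opposite sizing pressure to the read constraint.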
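Pull Up Ratio – Write Constraint
[Figure: pseudo-nMOS inverter formed by pull-up M6 (gate Q) and access transistor M5 (gate WL, connected to BLB), with output QB=VOLmin.]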
Summary – SRAM Sizing Constraints
Read Constraint:
$$CR = \frac{W_1/L_1}{W_2/L_2} = \frac{W_4/L_4}{W_5/L_5} = \frac{K_{PDN}}{K_{access}}$$
Multi-Port SRAM
6T SRAM Layout
SRAM Layout - Traditional
• Share Horizontal Routing (WL).
• Share Vertical Routing (BL, BLB).
• Share Power and Ground.
[Figure: 6T cell schematic (BL, BLB, WL, M1–M6, Q, QB) next to the traditional layout.]
SRAM Layout – Thin Cell
65nm SRAM
• Industrial example from ST/Philips
Commercial SRAMs
And very recent SRAMs
• TSMC 7nm SRAM Test Chip (Source: TSMC)
• Samsung 3nm GAA SRAM
Static Noise Margin - Hold
[Figure: cross-coupled inverters with a DC noise source VN inserted at each storage node.]
Static Noise Margin - Hold
[Figure: 6T cell in hold reduced to two cross-coupled inverters (M1, M3 and M4, M6) with nodes Q and QB.]
Static Noise Margin - Read
• What happens during Read?
• We can’t ignore the access transistors anymore…
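[Figure: 6T cell during read (WL=VDD, BL and BLB at VDD, bitline capacitances C_BL and C_BLB). Each half-cell now includes its access transistor: M1, M2, M3 on the Q side and M4, M5, M6 on the QB side. VTC axes: Vin, Vout.]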
Static Noise Margin - Read
[Figure: read butterfly curves (Q vs. QB) of the two half-cells (M1, M2, M3 and M4, M5, M6), with the SNM marked in the lobes.]
Static Noise Margin - Write
[Figure: 6T cell during write with Q=‘0’, QB=‘1’, BL=VDD, BLB=‘0’; the QB half-cell is formed by M4, M5 and M6. VTC axes: Q, QB.]
Static Noise Margin - Write
[Figure: write butterfly curves (Q vs. QB) of the two half-cells (M1, M2, M3 and M4, M5, M6), with the write SNM (WSNM) marked.]
• If there is a stable point here, the wrong data is written!
Alternative Write SNM Definition
• Write SNM depends on the cell’s separatrix,
therefore alternative definitions have been proposed.
• For example, add a DC voltage (VBL) to the ‘0’ bitline
and see how high it can be and still flip the cell.
[Figure: half-cells M1, M2, M3 and M4, M5, M6 with VBL applied to the low bitline; butterfly curves (Q vs. QB) shown for VBL=0 and VBL=VDD.]
Dynamic Stability
SNM Calculation
Simulating SNM
• Problem:
• How can we calculate SNM with SPICE?
• Some options:
• Insert DC sources at Q and QB
• But where exactly do we connect them?
• Draw Butterfly Curves
• But how do we find the largest squares?
Simulating SNM
• First let’s define the graphical solution:
• The diagonals of all the squares are on lines parallel to Q=QB.
• We need to find the distance
between the points where these
intersect the butterfly plot.
• The largest of these distances
is the diagonal of the maximum
square in each lobe.
• Multiply this by cos45°
and we get the SNM.
• Easy, right?
Changing Coordinates
• What if we were to turn the graph?
Changing Coordinates
• If we were to use new axes, we could just subtract the graphs.
• This gives us the distances between the intersections with the Q=QB parallels.
• Now all we have to do is
find the maximum of the
subtraction.
• (Don’t forget to multiply by cos45°.)
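The rotate-and-subtract procedure can be sketched numerically with NumPy. This is an illustration rather than the lecture's SPICE setup: the tanh-shaped inverter curve, the function names, and the grid size are all assumptions, and both VTCs are assumed to be monotonically falling so that each rotated curve is single-valued in u.

```python
import numpy as np


def rotate_45(x, y):
    """Map (x, y) points onto the rotated (u, v) axes; v runs along Q=QB."""
    return (x - y) / np.sqrt(2), (x + y) / np.sqrt(2)


def snm_from_vtcs(vin, f1_out, f2_out, n_grid=4001):
    """Return (snm, lobe_a, lobe_b) from two sampled, falling VTCs."""
    u1, v1 = rotate_45(vin, f1_out)   # curve 1: x = Q = vin, y = QB = F1(vin)
    u2, v2 = rotate_45(f2_out, vin)   # mirrored curve: y = QB = vin, x = Q = F2(vin)
    o1, o2 = np.argsort(u1), np.argsort(u2)   # np.interp needs increasing u
    u = np.linspace(max(u1.min(), u2.min()), min(u1.max(), u2.max()), n_grid)
    # Subtracting the rotated curves gives the distance along Q=QB parallels.
    diff = np.interp(u, u1[o1], v1[o1]) - np.interp(u, u2[o2], v2[o2])
    # The largest distance in each lobe is a diagonal; cos45 gives the side.
    lobe_a = max(diff.max(), 0.0) / np.sqrt(2)
    lobe_b = max((-diff).max(), 0.0) / np.sqrt(2)
    return min(lobe_a, lobe_b), lobe_a, lobe_b


VDD = 1.0
vin = np.linspace(0.0, VDD, 4001)
vtc = 0.5 * VDD * (1.0 - np.tanh(20.0 * (vin - 0.5 * VDD) / VDD))  # hypothetical VTC
snm, lobe_a, lobe_b = snm_from_vtcs(vin, vtc, vtc)
print(f"SNM = {snm:.3f} V (lobes: {lobe_a:.3f} V, {lobe_b:.3f} V)")
```

The cell's SNM is taken as the smaller of the two lobes, since the weaker lobe is the one that collapses first under noise.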
Changing Coordinates
• The required transformation is:
$$x = \frac{1}{\sqrt{2}}u + \frac{1}{\sqrt{2}}v$$
$$y = -\frac{1}{\sqrt{2}}u + \frac{1}{\sqrt{2}}v$$
• Now let’s define some function as F1
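As a sketch of the step that follows, assume F1 is written in the original axes as y = F1(x) (this form is an assumption, chosen to match the x = F2(y) convention used for the second VTC). Substituting the 45° rotation then gives an implicit expression for v in terms of u:

```latex
\begin{align*}
x &= \tfrac{1}{\sqrt{2}}u + \tfrac{1}{\sqrt{2}}v, \qquad
y = -\tfrac{1}{\sqrt{2}}u + \tfrac{1}{\sqrt{2}}v \\
y &= F_1(x) \;\Rightarrow\; -\tfrac{1}{\sqrt{2}}u + \tfrac{1}{\sqrt{2}}v
   = F_1\!\left(\tfrac{1}{\sqrt{2}}u + \tfrac{1}{\sqrt{2}}v\right) \\
v &= u + \sqrt{2}\,y
   = u + \sqrt{2}\,F_1\!\left(\tfrac{1}{\sqrt{2}}u + \tfrac{1}{\sqrt{2}}v\right)
\end{align*}
```

Because v appears on both sides, the relation is solved by a feedback loop around the cell rather than in closed form.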
Changing Coordinates
• What we did is turn some function (F1) 45 degrees
counter clockwise.
• This can easily be implemented with the following circuit:
• What is F1?
• It could be the VTC of
Vin=Q, Vout=QB…
Changing Coordinates
• But what about the “mirrored” VTC?
• This needs to first be mirrored with respect to the v axis and then transformed
to the (u,v) system.
• If we call the second VTC F2 with x=F2(y) then the operation we need is:
$$v = -u + \sqrt{2}\,x = -u + \sqrt{2}\,F_2\!\left(-\frac{1}{\sqrt{2}}u + \frac{1}{\sqrt{2}}v\right)$$
Final SNM Calculation
• Now we need to:
• Make a schematic of our SRAM cell with two pins: Q and QB.
[Figure: 6T Cell symbol (pins BL, WL, BLB, Q, QB; bias BL=VDD, WL=GND, BLB=VDD) connected to a “Transformation 2” block: DC sweep of u, output v2, F2(in)=QB2, F2(out)=Q2.]
Final SNM Calculation
• Now, connect F1 to Q→QB, and F2 to QB→Q.
[Figure: hold testbench with two 6T Cell instances (bias shown: BL=VDD, WL=GND, BLB=VDD). Transformation 1: DC sweep of u, output v1, F1(in)=Q1, F1(out)=QB1. Transformation 2: DC sweep of u, output v2, F2(in)=QB2, F2(out)=Q2.]
Read/Write SNM
• How about Read SNM:
• Use the exact same setup.
• Connect BL and BLB to VDD.
• Connect WL to VDD.
• Run the same calculation.
Testbench Setup – Read/Write
Read Testbench:
• Both 6T Cell instances: BL=VDD, WL=VDD, BLB=VDD.
• Transformation 1: DC sweep of u, output v1; F1(in)=Q1, F1(out)=QB1.
• Transformation 2: DC sweep of u, output v2; F2(in)=QB2, F2(out)=Q2.
Write Testbench:
• Both 6T Cell instances: BL=GND, WL=VDD, BLB=VDD.
• Transformation 1: DC sweep of u, output v1; F1(in)=Q1, F1(out)=QB1.
• Transformation 2: DC sweep of u, output v2; F2(in)=QB2, F2(out)=Q2.
SRAM Stability under process variations
Metastability Convergence in Spectre
• Node Sets
• What solution does Virtuoso find with a standard OP?
• To fix this, make sure you use the “Node Set” option.
Node Sets vs. Initial Conditions
• SPICE supports two types of convergence aids:
• Node Sets:
• Help SPICE converge by providing it
with an initial guess.
• Used only for DC convergence!
Disregarded for Transient Analysis.
• Initial Conditions:
• Enforce a node voltage at time t=0.
• Used only for Transient analysis!
Disregarded for DC convergence.
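Why the initial guess matters for a bistable cell can be illustrated with a toy DC solve. The sketch below is not how a circuit simulator works internally (simulators use Newton-Raphson on the full circuit equations); it simply iterates a hypothetical tanh-shaped inverter around the cross-coupled loop, with `q_guess` standing in for a node-set value.

```python
import math

VDD = 1.0


def inverter(vin, gain=20.0):
    """Hypothetical smooth inverter VTC with its switching point at VDD/2."""
    return 0.5 * VDD * (1.0 - math.tanh(gain * (vin - 0.5 * VDD) / VDD))


def dc_solve(q_guess, iterations=50):
    """Iterate Q -> QB -> Q around the cross-coupled loop from a starting guess."""
    q = q_guess
    for _ in range(iterations):
        qb = inverter(q)
        q = inverter(qb)
    return q, inverter(q)


for guess in (0.5, 0.4, 0.6):
    q, qb = dc_solve(guess)
    print(f"guess Q={guess:.1f} -> Q={q:.3f}, QB={qb:.3f}")
```

Starting exactly at the midpoint leaves the solution at the metastable point, while a guess on either side settles to the corresponding stable state.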
Additional simulation tips
• Work with Design Hierarchy
• Create transformation functions
and DUTs as symbols.
• Create multiple tests in
single ADE-XL view.
• Use variables/parameters to
define initial conditions/node sets.
• Create supply voltages in
separate symbol.
• Use buffers to smooth transitions
and reduce cross cap.
Further Reading
• Rabaey, et al. “Digital Integrated Circuits” (2nd Edition)
• Elad Alon, Berkeley ee141 (online)
• Weste, Harris, “CMOS VLSI Design (4th Edition)”
• Seevinck, List, Lostroh, “Static Noise Margin Analysis of SRAM Cells”
IEEE Journal of Solid State Circuits, 1987
• Teman and Visotsky. "A fast modular method for true variation-aware separatrix
tracing in nanoscaled SRAMs." IEEE TVLSI, 2014.
Digital Integrated Circuits
(83-313)
Lecture 9:
Memory Peripherals
Prof. Adam Teman
25 May 2021
Lecture Content
© Adam Teman,
May 25, 2021
Memory Peripherals Overview
Memory Architecture
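[Figure: memory array organization. A row decoder drives the word lines; each storage cell sits at the crossing of a word line and a bit line.]
• Memory size: W words of C bits = W × C bits
• Address bus: A bits → $W = 2^A$
• Address bits ADD_{A-1}:ADD_M feed the row decoder.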
Real Datasheet Example
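Simple Definitions
• Row Decoder
• Column Multiplexer
• Sense Amplifier
• Write Driver
• Precharge Circuit
[Figure: array organization. The row decoder (address bits A_{W-1}:A_M) selects a word line; each row holds C×2^M bits; the sense amplifiers/drivers connect to the C-bit input/output.]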
Row Decoder Design
Row Decoders
• A decoder reduces the number of select signals from W to $\log_2 W$.
• Number of Rows: W
• Number of Row Address Bits: $A = \log_2 W$
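[Figure: row decoder with address inputs ADD_{A-1}:ADD_0 and outputs Word 0, Word 1, Word 2, …, Word W-2, Word W-1.]
• Example with 8 address bits (256 word lines):
$$WL_0 = \bar{A}_7\bar{A}_6\bar{A}_5\bar{A}_4\bar{A}_3\bar{A}_2\bar{A}_1\bar{A}_0$$
$$WL_{255} = A_7 A_6 A_5 A_4 A_3 A_2 A_1 A_0$$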
• NOR Decoder:
• DeMorgan will provide us with a NOR Decoder.
• In the previous example, we’ll get 256 8-input NOR gates:
$$WL_0 = \overline{A_7 + A_6 + A_5 + A_4 + A_3 + A_2 + A_1 + A_0}$$
$$WL_{255} = \overline{\bar{A}_7 + \bar{A}_6 + \bar{A}_5 + \bar{A}_4 + \bar{A}_3 + \bar{A}_2 + \bar{A}_1 + \bar{A}_0}$$
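The equivalence of the two decoder forms can be checked exhaustively in a few lines. This is a behavioral sketch only (function names are illustrative); it models each word line as an 8-input AND of literals and as the De Morgan equivalent NOR.

```python
def and_decoder(address, n_bits=8):
    """Word lines as the AND of true or complemented address bits."""
    bits = [(address >> i) & 1 for i in range(n_bits)]
    word_lines = []
    for row in range(2 ** n_bits):
        wl = 1
        for i, a in enumerate(bits):
            # Use A_i where the row index has a 1, its complement otherwise.
            wl &= a if (row >> i) & 1 else 1 - a
        word_lines.append(wl)
    return word_lines


def nor_decoder(address, n_bits=8):
    """Same function via De Morgan: NOR of the opposite literals."""
    bits = [(address >> i) & 1 for i in range(n_bits)]
    word_lines = []
    for row in range(2 ** n_bits):
        any_high = 0
        for i, a in enumerate(bits):
            any_high |= (1 - a) if (row >> i) & 1 else a
        word_lines.append(1 - any_high)
    return word_lines


print(and_decoder(255).index(1), nor_decoder(0).index(1))
```

Both forms assert exactly one word line per address; the choice between them is a circuit-level one (series nMOS stacks versus parallel nMOS devices).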
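How should we build it?
[Figure: candidate gate implementations of the 8-input decoder driving WL0 … WL255.]
$$t_{pd} = t_{pINV}\sum_i\left(p_i + EF_i\right) = t_{pINV}\left(\sum_i p_i + N\sqrt[N]{PE}\right)$$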
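Problem Setup
• For LE calculation we need to start with:
• Output Load (CL)
• Input Capacitance (Cin)
• Branching (B)
• What is the Load Capacitance?
• 256 bitcells on each Word Line
• Logical effort and parasitic delay of the candidate implementations:
• $\prod LE = 10/3$; $p = 8 + 1 = 9$
• $\prod LE = 10/3$; $p = 4 + 2 = 6$
• $\prod LE = 80/27$; $p = 2 + 2 + 2 + 1 = 7$
• $\prod LE = 2.37$; $p = 2 \cdot 3 + 1 \cdot 3 = 9$
• Path effort for the last option:
$$PE = F \prod b_i \prod LE_i = 2.37 \cdot 2^{13} = 19.418\text{k}$$
$$N_{opt} = \log_{3.6} PE = 7.7$$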
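The numbers on this slide can be reproduced with the path-delay formula $t_{pd}/t_{pINV} = \sum p_i + N\sqrt[N]{PE}$. The sketch below makes a few assumptions: the product of electrical and branching effort is taken as 2^13 for every option, 2.37 is read as 64/27, the stage count of each option is inferred from the number of terms in its parasitic-delay sum, and no extra buffer stages are added.

```python
import math


def path_delay(sum_p, n_stages, path_effort):
    """Normalized path delay: sum(p_i) + N * PE**(1/N)."""
    return sum_p + n_stages * path_effort ** (1.0 / n_stages)


FB = 2 ** 13  # electrical effort times branching effort (from the slide)

# label -> (product of LE, sum of p, number of stages)
options = {
    "LE=10/3, p=8+1": (10 / 3, 9, 2),
    "LE=10/3, p=4+2": (10 / 3, 6, 2),
    "LE=80/27, p=2+2+2+1": (80 / 27, 7, 4),
    "LE=64/27, p=2*3+1*3": (64 / 27, 9, 6),
}

delays = {}
for label, (le, sum_p, n) in options.items():
    delays[label] = path_delay(sum_p, n, le * FB)
    print(f"{label:22s} PE={le * FB:9.0f}  delay={delays[label]:6.1f} t_pINV")

pe_last = (64 / 27) * FB
n_opt = math.log(pe_last) / math.log(3.6)  # log base 3.6 of PE
print(f"N_opt = {n_opt:.1f}")
```

Under these assumptions the two-stage options are far slower than the multi-stage ones, because their stage effort is well above the optimum of roughly 3.6 per stage.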
• Bit-cell Pitch:
• Each signal drives one row of bitcells.
• How will we fit 8 address signals into this pitch?
• How do we do this?
• If we look at the final Boolean expression, it has combinations of groups of inputs.
• By grouping together a few inputs, we actually create a small decoder.
• Then we just AND the outputs of all the “pre” decoders.
• For example: Two 4:16 predecoders
$$D = dec(A_0, A_1, A_2, A_3); \quad E = dec(A_4, A_5, A_6, A_7)$$
$$WL_0 = D_0 E_0; \quad WL_{255} = D_{15} E_{15}; \quad WL_{254} = D_{14} E_{15}$$
[Figure: two 4→16 predecoders, one taking A0–A3 (outputs D) and one taking A4–A7 (outputs E).]
Predecoding - Example
• Let’s look at our example:
$$D = dec(A_0, A_1, A_2, A_3); \quad E = dec(A_4, A_5, A_6, A_7)$$
$$WL_0 = D_0 E_0; \quad WL_{255} = D_{15} E_{15}; \quad WL_{254} = D_{14} E_{15}$$
• What is our new branching effort?
• As before, each address drives half the lines of the small decoder.
• Each predecoder output drives 256/16 post-decoder gates.
• Altogether, the branching effort is:
$$B = b_{addr\_driver} \cdot b_{predecoder} = \frac{16}{2} \cdot \frac{256}{16} = 128$$
• Same as before!
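A behavioral model makes the predecoding scheme concrete. This sketch assumes A0 is the least significant address bit, so D decodes the low nibble and E the high nibble; the function names are illustrative.

```python
def one_hot(value, width):
    """One-hot list of the given width with a 1 at position `value`."""
    return [1 if i == value else 0 for i in range(width)]


def predecoded_word_lines(address):
    """256 word lines built from two 4:16 predecoders and 2-input ANDs."""
    d = one_hot(address & 0xF, 16)          # D = dec(A0, A1, A2, A3)
    e = one_hot((address >> 4) & 0xF, 16)   # E = dec(A4, A5, A6, A7)
    return [d[row & 0xF] & e[row >> 4] for row in range(256)]


def select_indices(row):
    """Indices (i, j) such that WL_row = D_i AND E_j."""
    return row & 0xF, row >> 4


# Each address bit drives half of its 16 predecoder gates, and each
# predecoder output drives 256/16 final AND gates.
branching = (16 // 2) * (256 // 16)

print(select_indices(254), branching)
```

The final stage needs only 2-input gates per word line, which is what allows it to be pitch-matched to the bitcell rows.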
Predecoding - Solution
• Why is this a better solution?
• Each Address driver is only driving eight gates
• less capacitance.
• We saved a ton of area by “sharing” gates.
• We can “Pitch Fit” 2-input NAND gates.
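[Figure: predecoded row decoder. Groups of 4 address bits feed predecoders with 16 outputs each, which drive the 2-input gates on the word lines.]
[Figure: dynamic decoders with precharge signal PC, inputs A0, A1 and their complements, and word lines WL0–WL3: 2-input NOR decoder and 2-input NAND decoder.]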
Column Multiplexer
Column Multiplexer
• First option – PTL Mux with decoder
• Fast – only 1 transistor in signal path.
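• Large transistor count.
[Figure: 4:1 pass-transistor mux. Address bits A1, A0 are decoded to connect one of the bitlines B0–B3 to the output Y.]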
4 to 1 tree decoder
• Second option – Tree Decoder
• For a $2^k$:1 mux, it uses k series transistors.
• Delay increases quadratically with the number of series transistors.
• No external decode logic → big area reduction.
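The quadratic delay growth follows from a simple Elmore estimate of the series path. The sketch below assumes each pass transistor is an identical resistance `r` with an identical capacitance `c` on its output node; the unit values and function names are illustrative.

```python
def elmore_delay_series(k, r=1.0, c=1.0):
    """Elmore delay through k identical series pass transistors."""
    # Node i (counting from the driven end) sees resistance i*r upstream.
    return sum(i * r * c for i in range(1, k + 1))


def tree_mux_pass_transistors(k):
    """Pass transistors in a 2^k:1 tree mux: 2^k + 2^(k-1) + ... + 2."""
    return sum(2 ** level for level in range(1, k + 1))


for k in (1, 2, 3, 4):
    print(f"k={k}: delay={elmore_delay_series(k):.0f} RC, "
          f"pass transistors={tree_mux_pass_transistors(k)}")
```

The sum equals k(k+1)/2 in units of RC, so doubling the tree depth roughly quadruples the delay through the tree.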
Precharge Circuitry
• Precharge bitlines high before reads.
[Figure: precharge circuit on bit and bit_b.]
• Equalize bitlines to minimize voltage difference when using sense amplifiers.
[Figure: equalizer on bit and bit_b; sense amplifier (s.a.) with its input and output transitions.]
Source: pcworld.com
• 32-bit, CISC architecture, introduced in 1977
• The VAX-11/780 was TTL-based, 5MHz, 2kB cache, reaching 1 MIPS
• Known as a “minicomputer”, even though it took up a whole room.
• VAX means “Virtual Address Extension”,
since the VAX was one of the first minicomputers to use virtual memory.
• Ran the VMS operating system.
• Many systems that were developed during the cold war
(e.g., F-15, F-18, Hawk missiles, nuclear programs) still use VAX today!
Further Reading
• Rabaey, et al. “Digital Integrated Circuits” (2nd Edition)
• Elad Alon, Berkeley ee141 (online)
• Weste, Harris, “CMOS VLSI Design (4th Edition)”