8.1
8 Memory Subsystem
Contents
1. Classification
2. Architectures
3. Circuits
   1) SRAM
   2) DRAM
   3) Address decoders
   4) Sense Amplifier
4. PLA
5. Gate Matrix
6. ROM
8.2
1. Classification
RWM (Read-Write Memory)
  Random access: SRAM, DRAM
  Sequential access: FIFO, Stack (LIFO)
  Content access: CAM (associative memory)
NVRWM (Nonvolatile RWM)
  EPROM
  E2PROM
  FLASH
ROM
  Mask programmed
  OTP (One-Time Programmable): PROM
8.3
2. Architectures
1-dimensional memory: N (words) × M (bits/word). A decoder reduces the number of select wires from N word-select lines to log2 N address lines.
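The wire-count saving can be made concrete with a short calculation. This is only an illustrative sketch; the function name `select_wires` and the example sizes are assumptions and do not come from the slides.

```python
def select_wires(n_words: int) -> tuple[int, int]:
    """Return (select wires without a decoder, address wires with a decoder)."""
    # Without a decoder, every word needs its own select line.
    without_decoder = n_words
    # With a decoder, ceil(log2(N)) address bits are enough to name N words.
    # (n - 1).bit_length() computes that value exactly for integers.
    with_decoder = max(1, (n_words - 1).bit_length())
    return without_decoder, with_decoder


for n in (8, 1024, 1 << 20):
    print(n, select_wires(n))
```

For a 1024-word memory this gives 1024 select lines versus 10 address lines.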
8.4
2-dimensional array structure uses column decoder to make the chip square.
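As a rough sketch of how the row/column split could be chosen, the code below picks the number of column-address bits that brings the cell array closest to a square. It assumes square bit cells and a simple bit ordering (low address bits select the column); `square_split` and `split_address` are hypothetical helper names.

```python
def square_split(address_bits: int, word_bits: int) -> tuple[int, int]:
    """Return (row_bits, column_bits) giving the most nearly square array.

    The array then has 2**row_bits rows and word_bits * 2**column_bits columns.
    """
    best = None
    for column_bits in range(address_bits + 1):
        row_bits = address_bits - column_bits
        rows = 2 ** row_bits
        cols = word_bits * 2 ** column_bits
        # Aspect ratio >= 1; 1.0 means a perfectly square array of cells.
        aspect = max(rows, cols) / min(rows, cols)
        if best is None or aspect < best[0]:
            best = (aspect, row_bits, column_bits)
    return best[1], best[2]


def split_address(address: int, column_bits: int) -> tuple[int, int]:
    """Assumed ordering: high bits select the row, low bits the column."""
    return address >> column_bits, address & ((1 << column_bits) - 1)


print(square_split(16, 1))   # 64K x 1 organisation
print(square_split(10, 8))   # 1K x 8 organisation
```

Without a column decoder the 64K x 1 example would be a 65536-row by 1-column strip; with the split it becomes a 256 x 256 array.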
8.5
Hierarchical memory architecture using a block address
The block address activates only one block; the other (inactive) blocks are put in power-saving mode.
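A minimal sketch of the block-address idea follows. The field order (block bits in the most significant positions, then row, then column) and the names `decode` and `block_enables` are assumptions made for illustration.

```python
from typing import NamedTuple


class DecodedAddress(NamedTuple):
    block: int
    row: int
    column: int


def decode(address: int, row_bits: int, column_bits: int) -> DecodedAddress:
    """Split an address into block, row and column fields (assumed layout)."""
    column = address & ((1 << column_bits) - 1)
    row = (address >> column_bits) & ((1 << row_bits) - 1)
    block = address >> (column_bits + row_bits)
    return DecodedAddress(block, row, column)


def block_enables(address: int, block_bits: int,
                  row_bits: int, column_bits: int) -> list[bool]:
    """One enable flag per block; only the addressed block is active."""
    active = decode(address, row_bits, column_bits).block
    return [b == active for b in range(1 << block_bits)]


addr = 0b10_0110_101  # block 2, row 6, column 5
print(decode(addr, row_bits=4, column_bits=3))
print(block_enables(addr, block_bits=2, row_bits=4, column_bits=3))
```

Every block whose flag is False can stay in its power-saving state for that access.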
8.6
Architecture of large memory
8.7
Basic organization for a 4K SRAM(1989 Philips research).
8.8
Schematic circuit diagram of 64K SRAM(Hitachi 1982).
8.9
Another schematic of SRAM(column grouping).
SRAM chip block diagram
8.10
Design Considerations
All internal clocks (bit-line precharge, sense-amp enable, etc.) are generated by an internal clock generator, which is triggered by a circuit that detects transitions on the address, CS and WE signals (this limits power consumption).
2-stage row address decoding: the WL driver decodes A1.
The sense amp (SA) can be placed either before or after the column switch.
  Before: a very simple SA must be used so that it fits the column cell pitch.
  After: a relatively complex SA can be used (but the SA input capacitance increases).
The figure above is a compromise: the columns (1024 columns, by-4 organization) are first divided into 4 large groups, each of these is divided into 16 subgroups, and one SA serves the 16 columns of each subgroup.
8.11
3. Circuits
Address decoders
Single-stage (10-to-1024) decoder
i) # of transistors = 20/NAND (10-input) × 1024 = 20,480
ii) Large fanout requirement on the buffers generating the Xi's.
iii) Series-connected transistors limit the discharge time.
8.12
Predecoded scheme
i) Group the address bits in pairs and predecode each 2-bit segment:
(X9, X8), (X7, X6), ..., (X1, X0)
ii) 2nd-stage decoder logic
# of transistors: 10/NAND (5-input) × 1024, plus the predecoders; about 12,000 in total
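The transistor counts for the single-stage and predecoded decoders can be reproduced with a few lines. This sketch assumes static CMOS NAND gates (two transistors per input) and counts only the NAND gates themselves; inverters, buffers and word-line drivers are deliberately left out, so a complete design will come out larger. All function names are hypothetical.

```python
def nand_transistors(fan_in: int) -> int:
    """A static CMOS NAND uses one NMOS and one PMOS transistor per input."""
    return 2 * fan_in


def single_stage(address_bits: int) -> int:
    """One address_bits-input NAND per word line."""
    return nand_transistors(address_bits) * 2 ** address_bits


def second_stage(address_bits: int, group_bits: int = 2) -> int:
    """With predecoding, each word-line NAND has one input per bit group."""
    if address_bits % group_bits:
        raise ValueError("address_bits must be a multiple of group_bits")
    groups = address_bits // group_bits
    return nand_transistors(groups) * 2 ** address_bits


def predecoders(address_bits: int, group_bits: int = 2) -> int:
    """Each group needs 2**group_bits NAND gates with group_bits inputs."""
    groups = address_bits // group_bits
    return groups * 2 ** group_bits * nand_transistors(group_bits)


print("single stage :", single_stage(10))
print("second stage :", second_stage(10))
print("predecoders  :", predecoders(10))
```

Under these assumptions the predecoded word-line gates need half the transistors of the single-stage version, and each gate has a series stack of 5 instead of 10 devices.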
8.13
Divided Word Line architecture
Global word line selects a block, while the local line is used to activate a word
line within the selected block.
8.14
Hierarchical word decoding logic
8.15
Row decoder circuits
(Complementary AND, pseudo NMOS, cascade NAND)
8.16
Typical Symbolic Layout Style of row decoders
8.17
Various other decoder circuits(Power saving, Decoder-powered)
8.18
Tree style column decoder
8.19
Sense Amplifier for SRAM
Voltage gain of a single differential stage: Av = gm·ro
gm: transconductance (current/voltage transducer gain) of M1, M2; gm = ∂ID/∂VGS
ro: output impedance (= ro,M1 ∥ ro,M2)
For Av to be large, M1 and P1 (and likewise M2 and P2) must all be in the saturation region, because both gm and ro are large there.
Precharging point X to VDD/2 therefore helps shorten the response time and enlarge the signal swing.
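The relation Av = gm·ro can be tried numerically. The device values below are made-up illustrative numbers, not taken from the slides, and `stage_gain` is a hypothetical helper; the output resistance is modelled as two resistances in parallel.

```python
def parallel(r1: float, r2: float) -> float:
    """Equivalent resistance of two resistors in parallel."""
    return r1 * r2 / (r1 + r2)


def stage_gain(gm: float, ro_a: float, ro_b: float) -> float:
    """Small-signal voltage gain Av = gm * (ro_a || ro_b)."""
    return gm * parallel(ro_a, ro_b)


# Assumed values: gm = 0.5 mS, each output resistance 100 kOhm in saturation.
print(stage_gain(0.5e-3, 100e3, 100e3))

# If one device leaves saturation its output resistance collapses
# (assumed 5 kOhm here), and the gain drops with it.
print(stage_gain(0.5e-3, 100e3, 5e3))
```

The second call illustrates why the devices should be kept in saturation: a low output resistance on either side dominates the parallel combination.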
8.20
Connecting two single-ended amps symmetrically increases the voltage gain. (The following stage can be a latch, another double-ended amp stage, or an output buffer with differential inputs.)
8.21
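SRAM sense amp precharged to VDD/2
This circuit charges the SA output node to VDD/2 so that the SA operates in its high-gain region.
Phase 1: V1 is precharged to VDD (power-down state).
Phase 2: when the WL is accessed, V1 is precharged to VDD/2.
Phase 3: once a voltage difference develops between BL and BL-bar, the high-gain SA operates and the pass gates acting as column decoder/switch turn on, passing the signal to the data output bus.
Phase 4: power-down state.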
8.22
Static power is dissipated during phase 2.
8.23
SRAM circuit before sense Amp.
8.24
Evolution of SRAM cells
i) 6- and 4-transistor SRAM cells
8.25
ii) Dual-port/double-ended access and dual-port/single access
8.26
iii) Content-addressable memory cell
8.27
Evolution of DRAM cells
(a) basic bi-stable f/f w/o load (b) 2C-2D(C:control lines, D:data lines)
8.28
(c) 1C-2D (d) 2C-1D scheme
8.29
(e) 1C-1D (f) 1C-1D(industry standard DRAM)
8.30
DRAM read cycle
8.31
8.32
8.33
Dummy word line scheme
8.34
8.35
DRAM differential sense amp with dummy cell structure
8.36
Cross-coupled Latch
Assume nodes 1 and 2 are precharged and node 2 begins to drop.
When clk turns on, node 3 is pulled down. n2 turns on strongly, leaving n1 off.
Note: the layout of the cross-coupled transistor pair must be symmetric, because any threshold-voltage difference between the two devices affects the sensing.
8.37
Charge transfer-based Circuit
8.38
Charge-transfer Circuit(cont’d)
Operation Sequence
As clk goes high, nodes 1 and 2 are precharged:
$V_1 = V_{ref} - V_{th,n2}$, $V_2 = \min(V_{DD},\, V_{clk} - V_{th,n3}) > V_{ref}$
n3 then turns off.
The cell (n1, Cc) is selected (assume Vc was '0').
Due to charge sharing between Cc and Clarge, V1 becomes
$V_1' = \dfrac{(V_{ref} - V_{th})\,C_{large} + V_c\,C_c}{C_c + C_{large}}$
so that
$\Delta V_1 = (V_{ref} - V_{th}) - V_1' = (V_{ref} - V_{th} - V_c)\,\dfrac{C_c}{C_c + C_{large}}$
n2 is turned on until $\Delta Q = \Delta V_1\,(C_c + C_{large})$ is transferred from Cout, i.e., until V1 reaches Vref − Vth.
The voltage drop at node 2 due to the charge transfer is
$\Delta V_2 = \dfrac{\Delta Q}{C_{out}} = (V_{ref} - V_{th} - V_c)\,\dfrac{C_c}{C_{out}} = \Delta V_1\,\dfrac{C_c + C_{large}}{C_{out}}$
$\dfrac{C_c + C_{large}}{C_{out}}$: amplification factor
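The charge-transfer sequence can be followed numerically: charge sharing first lowers node 1, n2 then conducts until node 1 is back at Vref − Vth, and the charge this takes from Cout lowers node 2. The component values below are assumptions chosen for round numbers, and the sketch takes for granted that node 2 stays high enough for n2 to keep conducting.

```python
def charge_transfer(v_ref, v_th, v_c, c_c, c_large, c_out):
    """Return (dV1, dQ, dV2, amplification) for one read of the cell."""
    v1_pre = v_ref - v_th                       # node 1 after precharge
    # Charge sharing between the cell capacitor and the large bit-line cap.
    v1_shared = (v1_pre * c_large + v_c * c_c) / (c_c + c_large)
    dv1 = v1_pre - v1_shared                    # drop on node 1
    # n2 restores node 1 to v1_pre; that charge is taken from Cout.
    dq = dv1 * (c_c + c_large)
    dv2 = dq / c_out                            # drop on node 2
    amplification = (c_c + c_large) / c_out
    return dv1, dq, dv2, amplification


# Assumed values: Vref = 3 V, Vth = 1 V, stored '0', Cc = 50 fF,
# Clarge = 950 fF, Cout = 100 fF.
dv1, dq, dv2, amp = charge_transfer(3.0, 1.0, 0.0, 50e-15, 950e-15, 100e-15)
print(f"dV1 = {dv1:.3f} V, dQ = {dq:.3e} C, dV2 = {dv2:.3f} V, gain = {amp:.1f}")
```

With these numbers the 0.1 V disturbance on the heavily loaded node 1 becomes a 1 V swing on the lightly loaded node 2, which is the point of the scheme.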
8.39
Sense amplifier for single-transistor DRAM cells.
Dummy cell (Cd = Cc) and dummy bit line: complete symmetry.
8.40
Operation
1. Before precharge, BL and DBL are both at VDD/2.*
   Precharging (n1, n2 on) pulls up nodes 1 and 2.
   n1 and n2 are then turned off.
2. Cc and Cd are selected. By charge transfer (assume V_Cc = 0), the voltage of node 1 drops more than that of node 2
   (because Cd had been charged to VDD/2).*
3. Clk1 goes high, so n4 turns on and n5 turns off (V1 goes to Vss).
   n7 conducts again, BL is discharged to Vss, and Cc is restored to '0'.
4. Sel is set to '0' to isolate Cc, then clk2 is turned on to bring BL and DBL to VDD/2. After that, seld is set to '0' to leave VDD/2 on Cd, and n3 is turned off.
(The circuit operates in a similar way when Cc stores a '1'.)
8.41
SRAM SA circuit using column SAs and a main SA: one column SA per column, with the main SA multiplexed among n columns.
8.42
(input signal)
(with column SA)
8.43
(without column SA)
8.44
Resistive-load SRAM cells
Undoped polysilicon as resistors with R 1 /
The load current is just enough (10^-12 A) to compensate for the leakage current of 10^-15 A.
BL and BL-bar are precharged to VDD, which avoids slow charging of the bit lines through the resistive loads.
8.45
TFT SRAM cell
Instead of traditional PMOS devices, pull-up transistors realized by PMOS TFT(thin-film transistor) on top of the cell structure.
ON current: 10^-8 A, OFF current: 10^-13 A

                            Complementary CMOS       Resistive load           TFT cell
Number of transistors       6                        4                        4 (+2 TFT)
Cell size                   58.2 µm² (0.7 µm rule)   40.8 µm² (0.7 µm rule)   41.1 µm² (0.8 µm rule)
Standby current (per cell)  10^-15 A                 10^-12 A                 10^-13 A
8.46
Bipolar SRAM cells :
Very fast SRAMs are necessary for cache and microcode memory in high-speed computers.
SBD(Schottky Barrier Diode) bipolar SRAM
8.47
3-T DRAM cell :
Derived by removing the loads to obtain the 4-T DRAM cell, and then removing the redundant complementary pull-down device.
Separate Read Word line(RWL) & Write word line(WWL)
Refreshing by writing the inverted BL2 signal onto BL1.
8.48
1-T DRAM cell :
$\Delta V = V_{BL} - V_{PRECH} = (V_{BIT} - V_{PRECH})\,\dfrac{C_C}{C_C + C_{BL}}$
$\dfrac{C_C}{C_C + C_{BL}}$: charge transfer ratio
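The charge-sharing relation for the 1-T cell, bit-line change = (VBIT − VPRECH) · CC / (CC + CBL), can be evaluated for some plausible numbers. The capacitances and voltages below are assumptions for illustration only, and `bitline_swing` is a hypothetical helper name.

```python
def bitline_swing(v_bit: float, v_prech: float,
                  c_cell: float, c_bl: float) -> float:
    """Bit-line voltage change when the cell capacitor is connected."""
    transfer_ratio = c_cell / (c_cell + c_bl)
    return (v_bit - v_prech) * transfer_ratio


# Assumed values: 50 fF cell, 950 fF bit line, bit line precharged to 2.5 V.
C_CELL, C_BL, V_PRECH = 50e-15, 950e-15, 2.5

print("read '0':", bitline_swing(0.0, V_PRECH, C_CELL, C_BL), "V")
print("read '1':", bitline_swing(5.0, V_PRECH, C_CELL, C_BL), "V")
```

A transfer ratio of 5% turns a 2.5 V difference in the cell into a swing of only about 125 mV on the bit line, which is why a sense amplifier is needed and why the read is destructive.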
8.49
1-T DRAM cell structure :
8.50
Trench capacitor type & Stacked-capacitor type
8.51
NOR-type address decoder
8.52
NAND-type address decoder
8.53
Reducing coupling noise between WL and BL: folded bit line.
8.54
Reducing coupling noise between a BL and its neighboring bit lines:
Transposed bit line
$\Delta V_{cross} = \dfrac{2\,C_{cross}}{C_{cross} + C_{BL}}\,V_{swing}$
ΔVcross: worst-case variation on each bit line.
Vswing: signal swing on the bit line.
8.55
4. PLA(Programmable Logic Array)
Generally, two classes of implementation exist for control logic functions:
Multi-level logic: random logic obtained through logic optimization
Regular structures, i.e.,
  ROM: firmware, mask-programmable
  PLA: customized logic that removes unnecessary product (AND) terms and sum (OR) terms
8.56
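Sum-of-products form, F = ab + cd
i) NAND-NAND PLA
$F = ab + cd = \overline{\overline{ab}\cdot\overline{cd}}$
Such a 2-level Boolean expression can be viewed as two decoder stages connected in series.
[Figure: inputs a, b, c, d feed the AND plane; the OR plane produces F]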
8.57
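ii) NOR-NOR PLA
NOR-NOR is faster, but requires more area (about 30% additional) than NAND-NAND.
$F = ab + cd = \overline{(\bar a + \bar b)(\bar c + \bar d)}$
$\bar F = \overline{\,\overline{\bar a + \bar b} + \overline{\bar c + \bar d}\,}$
[Figure: NOR-NOR realization with inputs a, b, c, d and output F]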
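Both two-level realizations can be checked against the sum-of-products form by exhaustive enumeration. The sketch assumes F = ab + cd with uncomplemented literals; the NOR-NOR version is fed complemented inputs and followed by an output inverter. Function names are illustrative.

```python
from itertools import product


def nand(*xs: int) -> int:
    return int(not all(xs))


def nor(*xs: int) -> int:
    return int(not any(xs))


def f_sop(a: int, b: int, c: int, d: int) -> int:
    """Reference: F = ab + cd."""
    return (a & b) | (c & d)


def f_nand_nand(a: int, b: int, c: int, d: int) -> int:
    """First NAND level forms the product terms, second NAND ORs them."""
    return nand(nand(a, b), nand(c, d))


def f_nor_nor(a: int, b: int, c: int, d: int) -> int:
    """NOR-NOR on complemented inputs yields the complement of F."""
    f_bar = nor(nor(1 - a, 1 - b), nor(1 - c, 1 - d))
    return 1 - f_bar  # output inverter


for bits in product((0, 1), repeat=4):
    assert f_sop(*bits) == f_nand_nand(*bits) == f_nor_nor(*bits)
print("NAND-NAND and NOR-NOR both match F = ab + cd on all 16 inputs")
```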
8.58
Various ways for decoding
(NOR-type decoder) (NAND-type decoder)
NOR: fast; NAND: compact
Layer legend: diffusion, polysilicon, metal
8.59
Complementary-type decoder (CMOS-like)
Low power consumption, large area
8.60
MOS ROM vs. MOS PLA
8.61
Minterms: $P_4 = x\bar y\bar z$, $P_5 = x\bar y z$, $P_7 = xyz$
ROM: $f_2 = P_5 + P_7$, $f_3 = P_4 + P_5 + P_7$
PLA: $f_2 = P_2$, $f_3 = P_4 + P_2$, with the merged product term $P_2 = P_5 + P_7 = xz$
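The difference between the ROM and PLA realizations can be checked by enumeration: the ROM stores full minterms, while the PLA may merge P5 and P7 into the single product term xz. The sketch assumes x is the most significant bit of the minterm index, which is what the indices 4, 5 and 7 imply; `minterm` is a hypothetical helper.

```python
from itertools import product


def minterm(index: int, x: int, y: int, z: int) -> int:
    """1 when (x, y, z), read as a 3-bit number with x as MSB, equals index."""
    return int(((x << 2) | (y << 1) | z) == index)


for x, y, z in product((0, 1), repeat=3):
    p4, p5, p7 = (minterm(i, x, y, z) for i in (4, 5, 7))
    merged = x & z                      # PLA product term replacing P5 + P7

    f2_rom, f3_rom = p5 | p7, p4 | p5 | p7
    f2_pla, f3_pla = merged, p4 | merged

    assert f2_rom == f2_pla
    assert f3_rom == f3_pla

print("ROM (3 minterm rows) and PLA (2 product rows) realize the same f2, f3")
```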
8.62
Various Programmable Logic Devices(PLD’s)
FSM (Finite State Machine): PLA with latched feedback
FPLA (Field-Programmable PLA)
8.63
PAL (Programmable Array Logic)
= FPLA in which the OR array is not programmable and the AND array is field programmable.
ROM: (single) mask programmable
PLA: (multiple) mask programmable
FPLA: field programmable, bulky
PAL: field programmable, less bulky
8.64
MGA(Multilevel Gate Array)
8.65
Associative Logic Matrix
8.66
Pseudo-NMOS PLA
8.67
Dynamic NMOS PLA
NOR type / NAND type
T1 : product line precharge, input latch in
T2 : sum line precharge
T3 : product line evaluate
T4 : sum line evaluate, output latch out
8.68
Dynamic CMOS PLA (2-phase) - I
8.69
T1 : product line precharge, latch input
T2 : product line evaluate, T2’:sum line precharge
T3 : sum line evaluate
T4 : latch output
Dummy row: one transistor of every TR pair is always ON, so the row behaves as if it were loaded by a large capacitance C, and the Vx waveform is a delayed version of the clock waveform.
8.70
Dynamic CMOS PLA - II
T1 : product line precharge, latch input/output (master-slave scheme)
T2 : product line evaluate
T3 : AND-OR plane connect, sum line evaluate
T4 : sum line evaluate
8.71
Dynamic CMOS PLA - V (NORA type)
AND plane : NMOS
OR plane : PMOS
T1 T2
T2 (= low): p-line precharge, s-line predischarge, latch input
T1 (= high): p-line and s-line evaluate, latch output
8.72
Decoded PLA: partition the input variables into multiple groups.
8.73
PLA folding (row & column folding)
Row folding: partition the inputs into two groups such that an ordering of the rows (product lines) can be found in which one input group is fed from below and the other from the top.
8.74
PPL(Programmable Path Logic)
merging of AND and OR plane.
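Do = 1 if a condition on Ao, Co and K holds, i.e., Do is given by a two-level Boolean equation in Ao, Co and K.
[Equation residue: the exact expression for Do did not survive extraction.]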
8.75
Associative Logic Array(subset of Storage Logic Array)
Ex. (overbars on the literals did not survive extraction)
y1 = D C
y2 = D x1 x2 C
y3 = x1 C
y4 = x1 x2
x1 = y1 + y2 + y3 = D C + x1 C
x2 = y2 + y3 + y4 = x1 x2 + x1 C
8.76
MGA(Multiple Gate Array, or Multi-level PLA)
8.77
MGA with three associative logic matrices
8.78
5. Gate Matrix
Use regularly spaced polysilicon lines for both gate electrodes and interconnect.
(a): the NMOS transistors and the channels are separated
(b): each transistor is placed on the polysilicon grid
(c): groups of transistors connected in series (or in parallel) are placed in one row and connected
8.79
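Rules
1. Polysilicon lines run vertically at a regular spacing.
2. Series connections between transistors in adjacent columns of the same row are made by diffusion butting.
3. Metal is used for parallel connections, for series connections between non-adjacent transistors, and for connections between gates; it runs both horizontally and vertically.
4. Transistors exist only on polysilicon columns.
5. Diffusion wires may run (for short distances) vertically between the polysilicon grid lines.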
8.80
Static CMOS layout in Gate Matrix
L(f,h) is realizable if h is realizable. h is realizable if every (vertical) diffusion run it generates is legal.
8.81
Automation of Gate Matrix Layout: Ref. O. Wing et al., "Gate Matrix Layout", IEEE Trans. on CAD, Vol. 4, July 1985
Find a function f (gate assignment): assign each transistor gate and output terminal to a column (transistor gates connected to the same node must be assigned to the same column).
Find a function h (net assignment): assign each net (segment of horizontal metal line) to a row.
Find a layout L(f,h) that is realizable* and has the minimum number of rows.
8.82
Problem Formulation for Gate Matrix Optimization
8.83
Example : CMOS Half-Adder Circuit
8.84
Gate  Nets
1     N1, N2
2     N1, N3
3     N4
4     N2, N4
5     N1, N2, N3, N5
6     N3, N4
7     N5
Net Representation(Case I)
Net Representation(Case II)
Problem Statement:
Given a set of nets that connect to gates, find a permutation of the gates and an assignment of nets to tracks such that the number of tracks is minimized.
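For a netlist as small as the half-adder example, the problem statement can be solved by brute force. The sketch below is not the algorithm of the cited paper; it simply tries every gate permutation and, for each one, packs the nets into tracks with a left-edge (greedy interval) assignment. It assumes each net occupies one horizontal segment spanning the leftmost to the rightmost gate column it touches, and uses the gate/net table from the example.

```python
from itertools import permutations

# Gate -> nets, copied from the half-adder table above.
GATE_NETS = {
    1: {"N1", "N2"}, 2: {"N1", "N3"}, 3: {"N4"}, 4: {"N2", "N4"},
    5: {"N1", "N2", "N3", "N5"}, 6: {"N3", "N4"}, 7: {"N5"},
}


def net_spans(order):
    """Map each net to its (leftmost, rightmost) column for a gate order."""
    column = {gate: i for i, gate in enumerate(order)}
    spans = {}
    for gate, nets in GATE_NETS.items():
        for net in nets:
            lo, hi = spans.get(net, (column[gate], column[gate]))
            spans[net] = (min(lo, column[gate]), max(hi, column[gate]))
    return spans


def assign_tracks(spans):
    """Left-edge algorithm: first free track, nets taken by left endpoint."""
    track_end = []      # rightmost occupied column of each track
    assignment = {}
    for net, (lo, hi) in sorted(spans.items(), key=lambda kv: kv[1]):
        for track, end in enumerate(track_end):
            if end < lo:                # track is free again before this net
                track_end[track] = hi
                assignment[net] = track
                break
        else:
            track_end.append(hi)
            assignment[net] = len(track_end) - 1
    return assignment


def best_layout():
    """Exhaustive search over gate permutations (7! = 5040 here)."""
    best = None
    for order in permutations(GATE_NETS):
        assignment = assign_tracks(net_spans(order))
        tracks = max(assignment.values()) + 1
        if best is None or tracks < best[0]:
            best = (tracks, order, assignment)
    return best


tracks, order, assignment = best_layout()
print("tracks:", tracks)
print("gate order:", order)
print("net -> track:", assignment)
```

Exhaustive search only works for toy cases like this one; gate 5 alone connects four nets, so no permutation can need fewer than four tracks.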
8.85
6. ROM (Read Only Memory)
ROM cells
Diode cell : consumes large power from WL
Transistor(BJT) cell : consumes less current(IB vs. IC)
MOSFET cell :
8.86
Sharing supply voltage lines and mirroring cells
8.87
NOR ROM with contact programming
8.88
NOR ROM with Vth-raising implant or thick-oxide implants.
8.89
NAND ROM
8.90
Before writing a paper
To write a paper, choose one of two paths: pick a technical field that is highly practical today or that will have a large impact in your time, or else pursue outstanding academic and theoretical excellence.
8.91
The secret of success
Start something hard.
Then you will become serious.
As long as you do not back down, you will succeed.