FSA, Hierarchical Memory Systems
How can we create a flip-flop using another flip-flop?
Say we have a flip-flop BG with the following properties:
BG  Q+
00  Q’
01  Q
10  1
11  0
Let’s try to implement this flip-flop using a T flip-flop.
Step 1: Create Table
The first step is to draw a table listing the flip-flop being created (in this case BG), Q, Q+, and the flip-flop used to create it (in this case T).
Look at Q and Q+ to determine the value of T: a T flip-flop toggles when T = 1, so T is 1 exactly when Q+ differs from Q.
BG  Q  Q+  T
00  0  1   1
00  1  0   1
01  0  0   0
01  1  1   0
10  0  1   1
10  1  1   0
11  0  0   0
11  1  0   1
Step 2: Karnaugh Map
Draw a Karnaugh map with a 1 in every cell where T is 1.
BG \ Q   0  1
00       1  1
01       0  0
11       0  1
10       1  0
T = B’G’ + BGQ + G’Q’
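The derived excitation equation can be checked by brute force. The sketch below is illustrative only: the function names are made up for this example, and it simply compares the BG characteristic table with a T flip-flop driven by T = B’G’ + BGQ + G’Q’ over all eight input combinations.

```python
from itertools import product


def bg_next(b, g, q):
    """Next state Q+ of the BG flip-flop, read from its characteristic table."""
    if (b, g) == (0, 0):
        return 1 - q  # 00: complement
    if (b, g) == (0, 1):
        return q      # 01: hold
    if (b, g) == (1, 0):
        return 1      # 10: set
    return 0          # 11: reset


def t_input(b, g, q):
    """T = B'G' + BGQ + G'Q', the expression read off the Karnaugh map."""
    nb, ng, nq = 1 - b, 1 - g, 1 - q
    return (nb & ng) | (b & g & q) | (ng & nq)


def t_next(t, q):
    """A T flip-flop toggles when T = 1 and holds when T = 0."""
    return q ^ t


# Collect every input combination where the T-based circuit disagrees with BG.
mismatches = [
    (b, g, q)
    for b, g, q in product((0, 1), repeat=3)
    if t_next(t_input(b, g, q), q) != bg_next(b, g, q)
]
print(mismatches)  # prints [] when the equation is correct
```

An empty list means the T flip-flop with this input logic behaves like the BG flip-flop for every state and input.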
The Root of the Problem: Economics
Fast memory is possible, but to run at full speed it needs to be located on the same chip as the CPU. This is:
• Very expensive
• A limit on the size of the memory
Do we choose:
• A small amount of fast memory?
• A large amount of slow memory?
Memory Hierarchy Design (2)
The memory hierarchy is a tradeoff between size, speed and cost, and it exploits the principle of locality.
• Register: the fastest memory element, but small storage and very expensive.
• Cache: fast and small compared to main memory. It acts as a buffer between the CPU and main memory and contains the most recently used memory locations (both address and contents are recorded here).
• Main memory: the RAM of the system.
• Disk storage: HDD.
[Figure: Registers (CPU), Cache (one or more levels), Main Memory and Disk Storage, connected by a specialized bus (internal or external to the CPU), the memory bus and the I/O bus.]
Memory Hierarchy Design (3)
Comparison between different types of memory (going down the table: larger, slower, cheaper)
Type      Size            Speed   $/Mbyte
Register  32 - 256 B      2 ns    -
Cache     32 KB - 4 MB    4 ns    $100/MB
Memory    128 MB          60 ns   $1.50/MB
HDD       20 GB           8 ms    $0.05/MB
Memory Hierarchy
• The CPU can only do useful work at the top of the hierarchy.
• 90-10 rule: 90% of the time is spent in 10% of the program.
• Take advantage of locality:
  - Temporal locality: keep recently accessed memory locations in the cache.
  - Spatial locality: keep memory locations near the accessed memory locations in the cache.
The connection between the CPU and cache is very fast; the connection between the CPU and memory is slower.
The Cache Hit Ratio
How often is a word found in the cache? Suppose a word is accessed k times in a short interval:
• 1 reference goes to main memory
• (k - 1) references go to the cache
The cache hit ratio h is then
h = (k - 1) / k
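As a small worked example of the ratio above, the following sketch evaluates h for a few values of k. The helper name `hit_ratio` is hypothetical, and it assumes exactly one miss followed by k - 1 hits, as in the scenario just described.

```python
from fractions import Fraction


def hit_ratio(k):
    """Hit ratio when a word is accessed k times: 1 miss, then k - 1 hits."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return Fraction(k - 1, k)


print(hit_ratio(1))          # 0   (the only access is a miss)
print(hit_ratio(4))          # 3/4
print(float(hit_ratio(10)))  # 0.9
```

The ratio approaches 1 as k grows, which is why reuse of the same word within a short interval pays off.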
Reasons why we use cache
• Cache memory is made of STATIC RAM, a transistor-based RAM that has very low access times (fast).
• STATIC RAM is, however, very bulky and very expensive.
• Main memory is made of DYNAMIC RAM, a capacitor-based RAM that has much higher access times (slow) and has to be constantly refreshed.
• DYNAMIC RAM is much smaller and cheaper.
Performance (Speed)
• Access time: the time between presenting the address and getting the valid data (memory or other storage).
• Memory cycle time: some time may be required for the memory to “recover” before the next access; cycle time = access + recovery.
• Transfer rate: the rate at which data can be moved; for random access memory = 1 / cycle time = (cycle time)^-1.
Memory Hierarchy (size? speed? cost?)
• Registers (in CPU): smallest, fastest, most expensive, most frequently accessed
• Internal memory (may include one or more levels of cache): medium, quick, price varies
• External memory (backing store): largest, slowest, cheapest, least frequently accessed
Replacing Data
• Initially all valid bits are set to 0.
• As instructions and data are fetched from memory, the cache fills up and some data need to be replaced.
• Which ones? With direct mapping the choice is obvious: each memory location maps to exactly one cache line.
Replacement Policies for Associative Cache
1. FIFO - fills from top to bottom and goes back to top. (May store data in physical memory before replacing it)
2. LRU – replaces the least recently used data. Requires a counter.
3. Random
Replacement in Set-Associative Cache
Which of the n ways within the location should be replaced?
• FIFO
• Random
• LRU
Example: accessed locations are D, E, A.
Writing Data
If the location is in the cache, the cached value and possibly the value in physical memory must be updated.
If the location is not in the cache, it may or may not be loaded into the cache (write-allocate and write-noallocate).
Two methodologies:
1. Write-through: physical memory always contains the correct value.
2. Write-back: the value is written to physical memory only when it is removed from the cache.
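The difference between the two write methodologies can be illustrated by counting how many writes reach physical memory. This is a toy sketch, not a model of real hardware: the class `TinyCache`, its dictionary-based storage and the address `0x10` are all assumptions made for the example, and a write miss always allocates a line (write-allocate).

```python
class TinyCache:
    """Toy cache that counts writes reaching physical memory."""

    def __init__(self, write_back):
        self.write_back = write_back
        self.lines = {}         # address -> (value, dirty flag)
        self.memory = {}        # stands in for physical memory
        self.memory_writes = 0

    def _store(self, addr, value):
        self.memory[addr] = value
        self.memory_writes += 1

    def write(self, addr, value):
        if self.write_back:
            # Write-back: update only the cache and mark the line dirty.
            self.lines[addr] = (value, True)
        else:
            # Write-through: update the cache and physical memory together.
            self.lines[addr] = (value, False)
            self._store(addr, value)

    def evict(self, addr):
        value, dirty = self.lines.pop(addr)
        if dirty:
            # A dirty line is written to memory only when it is removed.
            self._store(addr, value)


def memory_writes(write_back):
    cache = TinyCache(write_back)
    for value in range(5):      # five writes to the same location
        cache.write(0x10, value)
    cache.evict(0x10)
    return cache.memory_writes, cache.memory[0x10]


print(memory_writes(write_back=False))  # (5, 4)
print(memory_writes(write_back=True))   # (1, 4)
```

Both policies leave the same final value in memory; write-back simply defers the update until eviction, at the cost of memory being stale in the meantime.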
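Cache Performance
• Cache hits and cache misses.
• Hit ratio is the percentage of memory accesses that are served from the cache.
• Average memory access time:
TM = h TC + (1 - h) TP
where TC is the cache access time and TP is the physical memory access time. In the examples that follow:
Tc = 10 ns
Tp = 60 ns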
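The average access time formula is easy to evaluate directly. The sketch below uses a hypothetical helper, `average_access_time`, with the 10 ns and 60 ns figures as defaults, and evaluates it for hit ratios of 7/18 and 3/18.

```python
def average_access_time(h, t_cache=10.0, t_memory=60.0):
    """TM = h * TC + (1 - h) * TP, all times in nanoseconds."""
    if not 0.0 <= h <= 1.0:
        raise ValueError("hit ratio must be between 0 and 1")
    return h * t_cache + (1.0 - h) * t_memory


print(round(average_access_time(7 / 18), 2))  # 40.56
print(round(average_access_time(3 / 18), 2))  # 51.67
```

With these timings every additional hit in 18 accesses saves (60 - 10) / 18, or about 2.78 ns, of average access time.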
Associative Cache
Access order: A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0
Tc = 10 ns
Tp = 60 ns
FIFO
h = 0.389
TM = 40.56 ns
Direct-Mapped Cache
Access order: A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0
Tc = 10 ns
Tp = 60 ns
h = 0.167
TM = 51.67 ns
2-Way Set Associative Cache
Access order: A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0
Tc = 10 ns
Tp = 60 ns
LRU
h = 0.389
TM = 40.56 ns
Associative Cache (FIFO Replacement Policy)
Access order: A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0
Hit ratio = 7/18
(. = empty, * = hit)
Data   A B C A D B E F A C D B G C H I A B
CACHE  A A A A A A A A A A A A A A A I I I
       . B B B B B B B B B B B B B B B A A
       . . C C C C C C C C C C C C C C C B
       . . . . D D D D D D D D D D D D D D
       . . . . . . E E E E E E E E E E E E
       . . . . . . . F F F F F F F F F F F
       . . . . . . . . . . . . G G G G G G
       . . . . . . . . . . . . . . H H H H
Hit?   . . . * . * . . * * * * . * . . . .
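Two-way set associative cache (LRU Replacement Policy)
Access order: A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0
Hit ratio = 7/18
(The first column is the set number; within a set, the way marked 1 is the least recently used and is replaced next. ... = empty, * = hit)
Data  A   B   C   A   D   B   E   F   A   C   D   B   G   C   H   I   A   B
CACHE
0     A-0 A-1 A-1 A-0 A-0 A-1 E-0 E-0 E-1 E-1 E-1 B-0 B-0 B-0 B-0 B-0 B-1 B-0
0     ... B-0 B-0 B-1 B-1 B-0 B-1 B-1 A-0 A-0 A-0 A-1 A-1 A-1 A-1 A-1 A-0 A-1
1     ... ... ... ... D-0 D-0 D-0 D-1 D-1 D-1 D-0 D-0 D-0 D-0 D-0 D-0 D-0 D-0
1     ... ... ... ... ... ... ... F-0 F-0 F-0 F-1 F-1 F-1 F-1 F-1 F-1 F-1 F-1
2     ... ... C-0 C-0 C-0 C-0 C-0 C-0 C-0 C-0 C-0 C-0 C-0 C-0 C-0 C-1 C-1 C-1
2     ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... I-0 I-0 I-0
3     ... ... ... ... ... ... ... ... ... ... ... ... G-0 G-0 G-1 G-1 G-1 G-1
3     ... ... ... ... ... ... ... ... ... ... ... ... ... ... H-0 H-0 H-0 H-0
Hit?  .   .   .   *   .   *   .   .   .   *   *   .   .   *   .   .   *   *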
Associative Cache with 2 byte line size (FIFO Replacement Policy)
Access order: A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0
Bytes sharing a line: A and J; B and D; C and G; E and F; and I and H
Hit ratio = 11/18
(. = empty, * = hit)
Data   A B C A D B E F A C D B G C H I A B
CACHE  A A A A A A A A A A A A A A I I I I
       J J J J J J J J J J J J J J H H H H
       . B B B B B B B B B B B B B B B A A
       . D D D D D D D D D D D D D D D J J
       . . C C C C C C C C C C C C C C C B
       . . G G G G G G G G G G G G G G G D
       . . . . . . E E E E E E E E E E E E
       . . . . . . F F F F F F F F F F F F
Hit?   . . . * * * . * * * * * * * . * . .
Direct-mapped Cache with line size of 2 bytes
Access order: A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0
Bytes sharing a line: A and J; B and D; C and G; E and F; and I and H
Hit ratio = 7/18
(. = empty, * = hit)
Data   A B C A D B E F A C D B G C H I A B
CACHE
0      A B B A B B B B A A B B B B B B A B
1      J D D J D D D D J J D D D D D D J D
2      . . C C C C C C C C C C C C C C C C
3      . . G G G G G G G G G G G G G G G G
4      . . . . . . E E E E E E E E E E E E
5      . . . . . . F F F F F F F F F F F F
6      . . . . . . . . . . . . . . I I I I
7      . . . . . . . . . . . . . . H H H H
Hit?   . . . . . * . * . * . * * * . * . .
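Two-way set Associative Cache with line size of 2 bytes
Access order: A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0
Bytes sharing a line: A and J; B and D; C and G; E and F; and I and H
Hit ratio = 11/18
(The first column is the byte position; each pair of rows is one line. The line marked 1 is the least recently used in its set and is replaced next. ... = empty, * = hit)
Data  A   B   C   A   D   B   E   F   A   C   D   B   G   C   H   I   A   B
CACHE
0     A-0 A-1 A-1 A-0 A-1 A-1 E-0 E-0 E-1 E-1 B-0 B-0 B-0 B-0 B-0 B-0 B-1 B-0
1     J-0 J-1 J-1 J-0 J-1 J-1 F-0 F-0 F-1 F-1 D-0 D-0 D-0 D-0 D-0 D-0 D-1 D-0
0     ... B-0 B-0 B-1 B-0 B-0 B-1 B-1 A-0 A-0 A-1 A-1 A-1 A-1 A-1 A-1 A-0 A-1
1     ... D-0 D-0 D-1 D-0 D-0 D-1 D-1 J-0 J-0 J-1 J-1 J-1 J-1 J-1 J-1 J-0 J-1
2     ... ... C-0 C-0 C-0 C-0 C-0 C-0 C-0 C-0 C-0 C-0 C-0 C-0 C-1 C-1 C-1 C-1
3     ... ... G-0 G-0 G-0 G-0 G-0 G-0 G-0 G-0 G-0 G-0 G-0 G-0 G-1 G-1 G-1 G-1
2     ... ... ... ... ... ... ... ... ... ... ... ... ... ... I-0 I-0 I-0 I-0
3     ... ... ... ... ... ... ... ... ... ... ... ... ... ... H-0 H-0 H-0 H-0
Hit?  .   .   .   *   *   *   .   *   .   *   .   *   *   *   .   *   *   *
The access to D in column 11 is a miss: the line holding B and D was evicted in column 9 to make room for A and J, so it must be reloaded in place of E and F.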
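The traces above can be replayed with a small simulator. This is a minimal sketch under stated assumptions: the cache holds 8 bytes in total (inferred from the eight rows of each trace), the digit in each access is the byte address, the letter identifies the data item, and the names `ACCESSES`, `PAIRS`, `block_of` and `count_hits` are invented for this example. Replacement is strict FIFO or strict LRU within a set.

```python
from collections import OrderedDict

# Access order from the traces: letter = data item, digit = byte address.
ACCESSES = "A0 B0 C2 A0 D1 B0 E4 F5 A0 C2 D1 B0 G3 C2 H7 I6 A0 B0".split()

# Data items that share a line when the line size is 2 bytes.
PAIRS = ["AJ", "BD", "CG", "EF", "IH"]


def block_of(name, line_size):
    """Identify the block (line contents) that holds a data item."""
    if line_size == 1:
        return name
    for pair in PAIRS:
        if name in pair:
            return pair
    raise ValueError(f"no pair defined for {name!r}")


def count_hits(ways, policy, line_size=1, cache_bytes=8):
    """Replay ACCESSES and count hits.

    ways = 1 is direct-mapped; ways = number of lines is fully associative.
    policy is "FIFO" or "LRU" and only matters when ways > 1.
    """
    n_lines = cache_bytes // line_size
    n_sets = n_lines // ways
    sets = [OrderedDict() for _ in range(n_sets)]  # oldest entry first
    hits = 0
    for ref in ACCESSES:
        name, addr = ref[0], int(ref[1:])
        block = block_of(name, line_size)
        lines = sets[(addr // line_size) % n_sets]
        if block in lines:
            hits += 1
            if policy == "LRU":
                lines.move_to_end(block)  # now the most recently used
            continue
        if len(lines) == ways:
            # Front of the dict is the oldest (FIFO) or least recent (LRU).
            lines.popitem(last=False)
        lines[block] = True
    return hits


print(count_hits(ways=8, policy="FIFO"))               # 7
print(count_hits(ways=1, policy="FIFO"))               # 3
print(count_hits(ways=2, policy="LRU"))                # 7
print(count_hits(ways=4, policy="FIFO", line_size=2))  # 11
print(count_hits(ways=1, policy="FIFO", line_size=2))  # 7
print(count_hits(ways=2, policy="LRU", line_size=2))   # 11
```

Each count is out of 18 accesses. The larger line size helps the associative and set-associative caches because accesses such as D after B, or G after C, find their byte already loaded alongside its neighbour.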
Page Replacement - FIFO
• FIFO is simple to implement:
  - When a page is brought in, place its page id at the end of the list.
  - Evict the page at the head of the list.
• Might be good? The page to be evicted has been in memory the longest time.
• But? Maybe it is still being used. We just don’t know.
• FIFO suffers from Belady’s Anomaly: the fault rate may increase when there is more physical memory!
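Belady’s Anomaly can be reproduced with a few lines of code. The reference string used below (1 2 3 4 1 2 5 1 2 3 4 5) is the usual textbook illustration and does not come from these slides; the function name `fifo_page_faults` is likewise just for this sketch.

```python
from collections import deque


def fifo_page_faults(refs, frames):
    """Count page faults under FIFO replacement with the given frame count."""
    queue = deque()      # head = page that has been resident the longest
    resident = set()
    faults = 0
    for page in refs:
        if page in resident:
            continue     # a hit does not change the FIFO order
        faults += 1
        if len(queue) == frames:
            resident.remove(queue.popleft())  # evict the head of the list
        queue.append(page)
        resident.add(page)
    return faults


refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_page_faults(refs, 3))  # 9
print(fifo_page_faults(refs, 4))  # 10
```

Going from 3 to 4 frames raises the fault count from 9 to 10 for this string, which is exactly the anomaly.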
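FIFO vs. Optimal
• Reference string: the ordered list of pages accessed as the process executes.
• Example: the reference string is A B C A B D A D B C B and the system has 3 page frames.
OPTIMAL: A B C A B D A D B C B
• When D is brought in, toss C; when C is brought back in, toss A or D.
• 5 faults
FIFO: A B C A B D A D B C B
• When D is brought in, toss A. At the next fault, toss ?
• Pages brought in on faults: A B C D A B C
• 7 faults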
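The two fault counts can be reproduced with a short simulation. This is a sketch with hypothetical function names; the optimal policy is implemented by evicting the resident page whose next use lies furthest in the future (or never occurs), which requires knowing the whole reference string in advance.

```python
def fifo_faults(refs, frames):
    """Page faults under FIFO: evict the page that was loaded first."""
    memory, faults = [], 0
    for page in refs:
        if page in memory:
            continue
        faults += 1
        if len(memory) == frames:
            memory.pop(0)            # head of the list = oldest page
        memory.append(page)
    return faults


def optimal_faults(refs, frames):
    """Page faults under the optimal policy (needs future knowledge)."""
    memory, faults = [], 0
    for i, page in enumerate(refs):
        if page in memory:
            continue
        faults += 1
        if len(memory) == frames:
            future = refs[i + 1:]

            def next_use(p):
                # Pages never used again sort after every page that is.
                return future.index(p) if p in future else len(future)

            memory.remove(max(memory, key=next_use))
        memory.append(page)
    return faults


refs = "A B C A B D A D B C B".split()
print(optimal_faults(refs, 3))  # 5
print(fifo_faults(refs, 3))     # 7
```

The optimal policy cannot be used in a real system, since future references are unknown, but it gives the lower bound against which FIFO and other policies are compared.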
Least Recently Used (LRU)
• Replace the page that has not been used for the longest time.
• 3 page frames; reference string: A B C A B D A D B C
• LRU: 5 faults
LRU
• Past experience may indicate future behavior.
• Perfect LRU requires some form of timestamp to be associated with a page table entry (PTE) on every memory reference!
• Counter implementation:
  - Every page entry has a counter; every time the page is referenced through this entry, copy the clock into the counter.
  - When a page needs to be replaced, look at the counters to determine which page to replace.
• Stack implementation: keep a stack of page numbers in doubly linked form.
  - Page referenced: move it to the top.
  - No search for replacement.
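Both LRU implementations can be sketched in a few lines. The function names are hypothetical; the counter version uses the index of the reference as the clock, and the stack version uses Python's `OrderedDict` as a stand-in for the doubly linked stack, with the most recently used page at the end.

```python
from collections import OrderedDict


def lru_counter_faults(refs, frames):
    """Counter implementation: each resident page stores the clock value
    of its last reference; the victim is found by searching the counters."""
    last_used = {}
    faults = 0
    for clock, page in enumerate(refs):
        if page not in last_used:
            faults += 1
            if len(last_used) == frames:
                victim = min(last_used, key=last_used.get)  # oldest counter
                del last_used[victim]
        last_used[page] = clock  # copy the clock into the counter
    return faults


def lru_stack_faults(refs, frames):
    """Stack implementation: a referenced page moves to the top, so the
    victim is always at the bottom and no search is needed."""
    stack = OrderedDict()  # last item = top of the stack
    faults = 0
    for page in refs:
        if page in stack:
            stack.move_to_end(page)
            continue
        faults += 1
        if len(stack) == frames:
            stack.popitem(last=False)  # bottom of the stack
        stack[page] = None
    return faults


refs = "A B C A B D A D B C".split()
print(lru_counter_faults(refs, 3))  # 5
print(lru_stack_faults(refs, 3))    # 5
```

The two versions always agree on the victim; they differ in where the cost falls. The counter version is cheap on every reference but searches on replacement, while the stack version pays a reordering on every reference and replaces in constant time.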