8/8/2019 Lec6 Memory Cache0910
1/42
CC2202 1
Lecture 6
Memory System and Cache Memory
Last Week
We learnt
Combinational Circuit
Truth Table and K-Map
Today, we will discuss
Hierarchical Memory System
Cache System
Mapping of Content between Cache and Memory
Characteristics of Memory System
Location
Capacity
Unit of transfer
Access method
Performance
Physical type
Physical characteristics
Organisation
These are the areas that distinguish memory types.
A memory system includes main memory (or simply memory) and storage such as hard disk.
Characteristics
Location
  CPU: Registers
  Internal: RAM
  External: Storage (Disk / Tape)
Capacity
  Word size: the natural unit of organization
  Number of words or bytes
Characteristics I
Unit of Transfer
  Internal: usually governed by data bus width
  External: usually a block, which is much larger than a word
Addressable unit
  Smallest location which can be uniquely addressed
  Word internally
Characteristics II
Access Methods
Sequential
  Start at the beginning and read through in order
  Access time depends on location of data and previous location
  e.g. tape
Direct
  Individual blocks have unique address
  Access is by jumping to vicinity plus sequential search
  Access time depends on location and previous location
  e.g. disk
Random
  Individual addresses identify locations exactly
  Access time is independent of location or previous access
  e.g. RAM
Associative
  Data is located by a comparison with contents of a portion of the store
  Access time is constant, independent of location or previous access
  e.g. cache
Memory Hierarchy
Registers
  In CPU
Internal or Main memory
  May include one or more levels of cache
  RAM
External memory
  Backing store
Hierarchy List
  Registers
  L1 Cache
  L2 Cache
  Main memory
  Disk cache
  Disk
  Optical
  Tape
Memory Hierarchy - Diagram
Optical disc
  Compact Disk Read-Only Memory (CD-ROM): uses laser technology to store data, instructions and information
  Compact Disk Rewriteable (CD-RW): erasable CD onto which users can write and rewrite data, instructions and information multiple times
  Digital Video Disc Rewriteable (DVD+RW): can be erased and written to, or recorded on, more than 1000 times
  DVD Random Access Memory (DVD-RAM): can be erased and written to, or recorded on, more than 100,000 times. DVD-RAM discs can be read by DVD-RAM drives and some DVD-ROM players
Magneto Optical disk
  About the size of a floppy disk; data can be written to it and rewritten
WORM (Write Once Read Many, or Write Once Read Multiple times)
  Can be written to only once, but read from multiple times
  Mostly used for software, manuals or documents given out by manufacturers
Performance
Access time (TA)
  Time between presenting the address and getting the valid data
H (Hit Ratio)
  Fraction of all memory accesses that are found in the faster memory (e.g. the cache)
T1 = Access Time to Level 1
T2 = Access Time to Level 2
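The slide defines H, T1 and T2 but does not state how they combine. A minimal sketch follows, assuming the common two-level model in which a miss costs a Level 1 check plus a Level 2 access. The function name `average_access_time` and this model are assumptions for illustration, not taken from the slides.

```python
def average_access_time(hit_ratio, t1, t2):
    """Average access time of a two-level memory (assumed model).

    hit_ratio: H, fraction of accesses found in Level 1
    t1, t2:    access times of Level 1 and Level 2 (same unit)
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    # Hit: served from Level 1.
    # Miss: Level 1 is checked first, then Level 2 is accessed.
    return hit_ratio * t1 + (1.0 - hit_ratio) * (t1 + t2)


if __name__ == "__main__":
    # Example values chosen for illustration only.
    print(average_access_time(0.95, 0.01, 0.1))
```

Under this model, a higher hit ratio pulls the average towards T1, which is the motivation for placing a small fast memory in front of a large slow one.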
Performance
Memory Cycle time (CT)
  Time may be required for the memory to recover before next access (TR)
  Cycle time is access + recovery
  CT = TA + TR
Transfer Rate (R)
  Rate at which data can be moved
  R = 1/CT = 1/(TA + TR)
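The cycle time and transfer rate relations can be checked numerically. This is a small sketch; the function names and the 60 ns / 40 ns figures are hypothetical values chosen for illustration.

```python
def cycle_time(access_time, recovery_time):
    """CT = TA + TR."""
    return access_time + recovery_time


def transfer_rate(access_time, recovery_time):
    """R = 1 / CT, in transfers per unit of time."""
    return 1.0 / cycle_time(access_time, recovery_time)


if __name__ == "__main__":
    ta, tr = 60e-9, 40e-9  # hypothetical: 60 ns access, 40 ns recovery
    print(cycle_time(ta, tr))     # cycle time in seconds
    print(transfer_rate(ta, tr))  # transfers per second
```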
Characteristics III
Physical Types
  Semiconductor: RAM
  Magnetic: Disk & Tape
  Optical: CD & DVD
Physical Characteristics
  Decay
  Volatility
  Erasable
  Power consumption
Design Constraints
Trade-off among key characteristics of Memory:
How much?
  Capacity
How fast?
  Access Time (Time is money)
How expensive?
  Cost
Relationships
  Faster access time, greater cost per bit
  Greater capacity, lower cost per bit
  Greater capacity, longer access time
Trend in the Memory Hierarchy
As one goes down the hierarchy, the following occur:
1. Decreasing cost per bit ($/b)
2. Increasing capacity (C)
3. Increasing access time (TA)
4. Decreasing frequency of access of the memory by the processor
Cache Memory Principles
Small amount of fast memory
Sits between normal main memory and CPU
May be located on CPU chip or module
Cache operation - overview
1. CPU requests contents of memory location
2. Check cache for this data
3. If present, get from cache (fast)
4. If not present, read required block from main memory to cache
5. Then deliver from cache to CPU
6. Cache includes tags to identify which block of
main memory is in each cache slot
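The read steps above can be sketched as follows. This is a simplified illustration, not a hardware model: the cache is a dictionary keyed by block number (which plays the role of the tag), it never fills up, and `BLOCK_SIZE`, `read_word` and the memory contents are assumptions made for the example.

```python
BLOCK_SIZE = 4  # words per block (assumed for illustration)


def read_word(address, cache, main_memory):
    """Return (word, hit) for a CPU read of `address`.

    cache:       dict mapping block number -> list of BLOCK_SIZE words
    main_memory: list of words indexed by address
    """
    block_number, offset = divmod(address, BLOCK_SIZE)
    hit = block_number in cache  # step 2: check cache, block number acts as tag
    if not hit:
        # Step 4: read the whole block from main memory into the cache.
        start = block_number * BLOCK_SIZE
        cache[block_number] = main_memory[start:start + BLOCK_SIZE]
    # Steps 3 and 5: the word is always delivered from the cache.
    return cache[block_number][offset], hit


if __name__ == "__main__":
    memory = list(range(100, 116))  # 16 words of made-up data
    cache = {}
    print(read_word(5, cache, memory))  # first touch of block 1: miss
    print(read_word(6, cache, memory))  # same block: hit
```

The second read hits because the first miss brought in the whole block, not just the requested word.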
Cache/main-memory Structure
MM consists of up to 2^n addressable words, with each word having a unique n-bit address.
MM is considered to consist of a number of fixed-length blocks.
Each block contains K words. Therefore MM has M (= 2^n / K) blocks.
Cache consists of C lines. Each line contains K words.
The number of lines is considerably less than the number of memory blocks (C < M).
At any time, some subset of the blocks of MM resides in lines in the cache.
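As a quick check of M = 2^n / K, the sketch below computes the number of blocks for a given address width and block size. The function name is hypothetical; the sample values (n = 24, K = 4) match the illustration case used later in the lecture.

```python
def number_of_blocks(address_bits, words_per_block):
    """M = 2^n / K for an n-bit address and K words per block."""
    total_words = 2 ** address_bits
    if total_words % words_per_block != 0:
        raise ValueError("block size must divide the memory size")
    return total_words // words_per_block


if __name__ == "__main__":
    m = number_of_blocks(24, 4)
    print(m, m == 2 ** 22)
```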
Cache Design
Size
  Size does matter
  Cost: more cache is expensive
  Speed: more cache is faster (up to a point), but checking cache for data takes time
Mapping Function
  An algorithm is needed to map main memory blocks to cache lines.
  Three techniques: direct, associative and set associative.
Replacement Algorithm
  Once the cache is filled, when a new main memory block is brought into the cache, one of the existing blocks in cache must be replaced.
  Least Recently Used (LRU), First-in First-out (FIFO), Least Frequently Used (LFU) or Random algorithm is used.
Cache Design
Write Policy
  To synchronize cache and memory, either the write-through or the write-back technique is used.
  In write through, all write operations are made to main memory as well as to the cache, ensuring that memory is always valid but creating heavy memory traffic.
  In write back, updates are made only in the cache. An UPDATE bit in a line is set. When a block is replaced, it is written back to memory if its UPDATE bit is set. This approach is fast, but main memory is not guaranteed to be valid.
Block Size
Number of Caches
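The difference between the two write policies described above can be sketched in a few lines. This is a simplified illustration: each block holds a single value, and the names `Line`, `write_through`, `write_back` and `evict` are hypothetical.

```python
class Line:
    """One cache line holding a single block (simplified to one value)."""

    def __init__(self, block_number, data):
        self.block_number = block_number
        self.data = data
        self.update = False  # the UPDATE bit


def write_through(line, value, main_memory):
    """Write to the cache and to main memory, so memory is always valid."""
    line.data = value
    main_memory[line.block_number] = value


def write_back(line, value):
    """Write to the cache only and set the UPDATE bit; memory is now stale."""
    line.data = value
    line.update = True


def evict(line, main_memory):
    """On replacement, copy the block to memory only if UPDATE is set."""
    if line.update:
        main_memory[line.block_number] = line.data
        line.update = False


if __name__ == "__main__":
    memory = {7: "old"}
    line = Line(7, "old")
    write_back(line, "new")
    print(memory[7])  # memory still holds the stale value
    evict(line, memory)
    print(memory[7])  # memory is updated when the block is replaced
```

Write back saves memory traffic when the same line is written many times before it is replaced, at the cost of memory being temporarily out of date.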
Mapping Function
Fewer cache lines than main memory blocks => an algorithm is needed for mapping main memory blocks into cache lines.
Three Techniques:
  Direct
  Associative
  Set Associative
Tag
  Identifies which particular block of memory is currently being stored in a cache line
  It is usually a portion of the main memory address
Direct Mapping
Each block of main memory maps to only one cache line
  i.e. if a block is in cache, it must be in one specific place
Address is in two parts
  Most significant s bits specify one of 2^s memory blocks
    The MSBs are split into a cache line field of r bits and a tag of s-r bits (most significant)
  Least significant w bits identify a unique word within a block of MM
Case for illustration (note: words = bytes)
  Cache of 64 kByte
  Cache block of 4 bytes
    i.e. cache is 16k (2^14) lines of 4 bytes
  16 MBytes main memory
  No. of bits required for the MM address
    24 bit address: 2^24 = 2^4 x 2^10 x 2^10 = 16M
Direct Mapping Address Structure
Tag (s-r)    Line or Slot (r)    Word (w)
8 bits       14 bits             2 bits
24 bit address (s + w bits = 24 bits)
2 bit word identifier (4 byte block)
22 bit block identifier (no. of memory blocks = 2^s = 2^22)
  8 bit tag (= 22 - 14)
  14 bit slot or line (no. of lines in cache = m = 2^r)
No two blocks that map to the same line have the same Tag field
Check contents of cache by finding line and checking Tag
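The 8/14/2 split of the 24-bit address can be sketched with shifts and masks. The function name `split_direct` is hypothetical; the field widths come from the slide, and the sample address 16339C is borrowed from the associative mapping example later in the lecture.

```python
TAG_BITS, LINE_BITS, WORD_BITS = 8, 14, 2  # field widths from the slide


def split_direct(address):
    """Split a 24-bit address into (tag, line, word) for direct mapping."""
    if not 0 <= address < 1 << (TAG_BITS + LINE_BITS + WORD_BITS):
        raise ValueError("address does not fit in 24 bits")
    word = address & ((1 << WORD_BITS) - 1)                  # lowest 2 bits
    line = (address >> WORD_BITS) & ((1 << LINE_BITS) - 1)   # next 14 bits
    tag = address >> (WORD_BITS + LINE_BITS)                 # top 8 bits
    return tag, line, word


if __name__ == "__main__":
    tag, line, word = split_direct(0x16339C)
    print(f"tag={tag:02X} line={line:04X} word={word}")
```

To check for a hit, the line field selects one cache line and the stored tag of that line is compared with the tag field.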
Cache Line Table
Cache line    Main memory blocks held
0             0, m, 2m, ..., 2^s - m
1             1, m+1, 2m+1, ..., 2^s - m + 1
...           ...
m-1           m-1, 2m-1, 3m-1, ..., 2^s - 1
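The table corresponds to the rule "cache line = block number modulo m". A minimal sketch follows, with hypothetical function names and a deliberately tiny configuration (m = 4 lines, s = 4) so that the rows can be listed in full.

```python
def line_for_block(block_number, m):
    """Direct mapping: block j goes to line j mod m."""
    return block_number % m


def blocks_for_line(line, m, s):
    """All main memory blocks (out of 2^s) that map to the given line."""
    return range(line, 2 ** s, m)


if __name__ == "__main__":
    m, s = 4, 4  # tiny example: 4 cache lines, 16 memory blocks
    for line in range(m):
        print(line, list(blocks_for_line(line, m, s)))
```

With m = 4 and s = 4, line 0 holds blocks 0, 4, 8, 12 (the last being 2^s - m) and line 3 holds blocks 3, 7, 11, 15 (the last being 2^s - 1), matching the pattern in the table.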
Direct Mapping Cache Organization
Direct Mapping Example
16 Kline Cache
16 MByte Main Memory
Direct Mapping Summary
Address length = (s + w) bits
Number of addressable units = 2^(s+w) words or bytes
Block size = line size = 2^w words or bytes
Number of blocks in main memory = 2^(s+w) / 2^w = 2^s
Number of lines in cache = m = 2^r
Size of tag = (s - r) bits
Direct Mapping pros & cons
Pros
  Simple
  Inexpensive
Cons
  Fixed location for given block
  If a program repeatedly accesses 2 blocks that map to the same line, the cache miss rate is very high
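The drawback named above can be seen in a small simulation. This is an illustrative sketch with a hypothetical function `count_misses` and a tiny 4-line cache; it tracks only which block occupies each line, not the data.

```python
def count_misses(block_sequence, m):
    """Count misses for a direct-mapped cache with m lines."""
    lines = [None] * m  # block number currently held by each line
    misses = 0
    for block in block_sequence:
        index = block % m  # direct mapping: fixed line for each block
        if lines[index] != block:
            misses += 1
            lines[index] = block  # the new block replaces the old one
    return misses


if __name__ == "__main__":
    # Blocks 0 and 4 both map to line 0 of a 4-line cache.
    print(count_misses([0, 4, 0, 4, 0, 4], 4))
    # Blocks 0 and 1 map to different lines.
    print(count_misses([0, 1, 0, 1, 0, 1], 4))
```

In the first sequence every access is a miss because the two blocks keep evicting each other, even though the other three lines stay empty. In the second, only the first access to each block misses.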
Associative Mapping
A main memory block can load into any line of cache
Memory address is interpreted as tag and word
Tag uniquely identifies block of memory
Every line's tag is examined for a match
Cache searching gets expensive
Fully Associative Cache Organization
Associative Mapping Example
16 Kline Cache
16 MByte Main Memory
Associative Mapping Address Structure
Tag        Word
22 bits    2 bits
22 bit tag stored with each 32 bit block of data
Compare tag field with tag entry in cache to check for hit
Least significant 2 bits of address identify which byte is required from the 32 bit data block
e.g.
Address    Tag       Data        Cache line
FFFFFC     3FFFFF    24682468    3FFF
Associative Mapping Summary
Remember, it is the leftmost (most significant) 22 bits of the address that form the tag
Further Example:
  For the 24-bit hexadecimal address 16339C, the 22-bit tag is 058CE7
  Do you know why?
  Memory address : 0001 0110 0011 0011 1001 1100
  Tag            : 00 0101 1000 1100 1110 0111
Summary
  Address length = (s + w) bits
  Number of addressable units = 2^(s+w) words or bytes
  Block size = line size = 2^w words or bytes
  Number of blocks in main memory = 2^(s+w) / 2^w = 2^s
  Number of lines in cache = undetermined
  Size of tag = s bits
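The 16339C example can be checked by shifting out the 2 word bits. A minimal sketch follows; the function name `split_associative` is hypothetical.

```python
WORD_BITS = 2  # word field width from the address structure


def split_associative(address):
    """Split a 24-bit address into (tag, word) for associative mapping."""
    tag = address >> WORD_BITS                 # leftmost 22 bits
    word = address & ((1 << WORD_BITS) - 1)    # rightmost 2 bits
    return tag, word


if __name__ == "__main__":
    for address in (0x16339C, 0xFFFFFC):
        tag, word = split_associative(address)
        print(f"{address:06X} -> tag {tag:06X}, word {word}")
```

The tag is not simply the leading hex digits of the address, because dropping 2 bits shifts every hex digit boundary.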
Associative Mapping pros & cons
Pros
  Flexible: when a new block is read into the cache, any block in cache can be replaced.
  Maximize hit ratio when appropriate replacement algorithm is used.
Cons
  Requires complex circuitry to examine the tags of all cache lines in parallel.
Set Associative Mapping
Cache is divided into a number of sets
Each set contains a number of lines
A given block maps to any line in a given set
  e.g. Block B can be in any line of set i
e.g. 2 lines per set
  2 way associative mapping
  A given block can be in one of 2 lines in only one set
Set Associative Mapping Example
13 bit set number
Set number = block number in main memory modulo 2^13
000000, 008000, 010000, 018000 map to the same set
Two Way Set Associative Cache Organization
Set Associative Mapping Address Structure
Tag       Set        Word
9 bits    13 bits    2 bits
Use set field to determine cache set to look in
Compare tag field to see if we have a hit
e.g.
Address    Set+Word    Tag    Data        Set
FFFFFC     7FFC        1FF    24682468    1FFF
167FFC     7FFC        02C    12345678    1FFF
Two Way Set Associative Mapping Example
16K line Cache
16 MByte Main Memory
Set Associative Mapping Summary
Address length = (s + w) bits
Number of addressable units = 2^(s+w) words or bytes
Block size = line size = 2^w words or bytes
Number of blocks in main memory = 2^s
Number of lines in set = k
Number of sets = v = 2^d
Number of lines in cache = kv = k * 2^d
Size of tag = (s - d) bits
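For the two-way example in this lecture (9 bit tag, 13 bit set, 2 bit word), the address fields can be extracted as in the sketch below. The function name `split_set_associative` is hypothetical; the two sample addresses are the ones shown in the example table.

```python
TAG_BITS, SET_BITS, WORD_BITS = 9, 13, 2  # field widths from the example


def split_set_associative(address):
    """Split a 24-bit address into (tag, set_number, word)."""
    if not 0 <= address < 1 << (TAG_BITS + SET_BITS + WORD_BITS):
        raise ValueError("address does not fit in 24 bits")
    word = address & ((1 << WORD_BITS) - 1)                       # lowest 2 bits
    set_number = (address >> WORD_BITS) & ((1 << SET_BITS) - 1)   # next 13 bits
    tag = address >> (WORD_BITS + SET_BITS)                       # top 9 bits
    return tag, set_number, word


if __name__ == "__main__":
    for address in (0xFFFFFC, 0x167FFC):
        tag, set_number, word = split_set_associative(address)
        print(f"{address:06X} -> tag {tag:03X}, set {set_number:04X}, word {word}")
```

Both addresses fall in set 1FFF with different tags, so a 2 way set can hold them at the same time, which a direct-mapped cache could not do.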
Replacement Algorithms
Direct mapping
  No choice
  Each block only maps to one line
  Replace that line
Associative & Set Associative
  Hardware implemented algorithm (speed)
  Least Recently Used (LRU)
    e.g. in 2 way set associative
  First in first out (FIFO)
    replace block that has been in cache longest
  Least frequently used
    replace block which has had fewest hits
  Random
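LRU replacement within one set can be sketched in software as follows. Real caches implement this in hardware, as noted above; the class `LRUSet` is a hypothetical illustration that tracks only tags, and the tag values in the usage example are arbitrary.

```python
from collections import OrderedDict


class LRUSet:
    """One cache set with LRU replacement (tags only, no data)."""

    def __init__(self, ways=2):
        self.ways = ways
        self.lines = OrderedDict()  # tag -> None, least recently used first

    def access(self, tag):
        """Return True on a hit; on a miss, load the tag, evicting the LRU line if full."""
        if tag in self.lines:
            self.lines.move_to_end(tag)  # mark as most recently used
            return True
        if len(self.lines) >= self.ways:
            self.lines.popitem(last=False)  # evict the least recently used tag
        self.lines[tag] = None
        return False


if __name__ == "__main__":
    cache_set = LRUSet(ways=2)
    for tag in (0x1FF, 0x02C, 0x1FF, 0x001, 0x02C):
        print(f"{tag:03X}", "hit" if cache_set.access(tag) else "miss")
```

In the usage example, the third access hits, so tag 02C becomes the least recently used and is the one evicted when tag 001 arrives. In a 2 way set, hardware can track this with a single bit per set.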