I/O and Storage · CS 571: Operating Systems (Spring 2020) Lecture 10a Yue Cheng Some material...
Transcript of I/O and Storage · CS 571: Operating Systems (Spring 2020) Lecture 10a Yue Cheng Some material...
I/O and Storage: Flash SSDs
CS 571: Operating Systems (Spring 2020)Lecture 10a
Yue Cheng
Some material taken/derived from: • Wisconsin CS-537 materials created by Remzi Arpaci-Dusseau.Licensed for use under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.
Disk vs. Flash
2Y. Cheng GMU CS571 Spring 2020
Disk Overview
• I/O requires: seek, rotate, transfer
• Inherently:• Not parallel (only one head)• Slow (mechanical)• Poor random I/O (locality around disk head)
• Random requests each taking ~10+ ms
3Y. Cheng GMU CS571 Spring 2020
Flash Overview
• Hold charge in cells. No moving (mechanical) parts!
• Inherently parallel!
• No seeks!
4Y. Cheng GMU CS571 Spring 2020
Storage Hierarchy Overview
5
Flash:Smaller capacityFaster accesses
HDD:Larger capacity
Way slower accesses
Y. Cheng GMU CS571 Spring 2020
Disks vs. Flash: Performance
• Throughput• Disk: ~130MB/s (sequential)
• Flash: ~400MB/s
• Latency• Disk: ~10ms (one op)
• Flash:• Read: 10-50us
• Program: 200-500us
• Erase: 2ms
6Y. Cheng GMU CS571 Spring 2020
Disks vs. Flash: Performance
• Throughput• Disk: ~130MB/s (sequential)
• Flash: ~400MB/s
• Latency• Disk: ~10ms (one op)
• Flash:• Read: 10-50us
• Program: 200-500us
• Erase: 2ms
7
Types of write, more later…
Y. Cheng GMU CS571 Spring 2020
Disks vs. Flash: Internal
8* https://www.enterprisestorageforum.com/storage-hardware/ssd-vs-hdd.html
Disks vs. Flash: Summary
10* https://www.enterprisestorageforum.com/storage-hardware/ssd-vs-hdd.html
Flash Architecture
11Y. Cheng GMU CS571 Spring 2020
SLC: Single-Level Cell
12
NAND Cell
char
ge
Y. Cheng GMU CS571 Spring 2020
SLC: Single-Level Cell
13
NAND Cell
char
ge 1
Y. Cheng GMU CS571 Spring 2020
SLC: Single-Level Cell
14
NAND Cell
char
ge 0
Y. Cheng GMU CS571 Spring 2020
MLC: Multi-Level Cell
15
NAND Cell
char
ge 00
Y. Cheng GMU CS571 Spring 2020
MLC: Multi-Level Cell
16
NAND Cell
char
ge 01
Y. Cheng GMU CS571 Spring 2020
MLC: Multi-Level Cell
17
NAND Cell
char
ge 10
Y. Cheng GMU CS571 Spring 2020
MLC: Multi-Level Cell
18
NAND Cell
char
ge 11
Y. Cheng GMU CS571 Spring 2020
Single- vs. Multi-Level Cell
19
MLC
charge
SLC
charge
Y. Cheng GMU CS571 Spring 2020
Single- vs. Multi-Level Cell
20
MLC
char
ge
SLC
char
ge
expensiverobust
1 cell: 1 bit
cheapsensitive
1 cell: multi-bit
Y. Cheng GMU CS571 Spring 2020
Wearout
• Problem: flash cells wear out after being erased too many times
• MLC: ~10K times
• SLC: ~100K times
• Usage strategy: ???
21Y. Cheng GMU CS571 Spring 2020
Wearout
• Problem: flash cells wear out after being erased too many times
• MLC: ~10K times
• SLC: ~100K times
• Usage strategy: wear leveling• Prevents some cells from being wornout while others
still fresh
22Y. Cheng GMU CS571 Spring 2020
Banks
• Flash devices are divided into banks (aka. planes)
• Banks can be accessed in parallel
23
Bank 0 Bank 1 Bank 2 Bank 3
Y. Cheng GMU CS571 Spring 2020
Banks
• Flash devices are divided into banks (aka. planes)
• Banks can be accessed in parallel
24
Bank 0 Bank 1 Bank 2 Bank 3
read read
Y. Cheng GMU CS571 Spring 2020
Banks
• Flash devices are divided into banks (aka. planes)
• Banks can be accessed in parallel
25
Bank 0 Bank 1 Bank 2 Bank 3
data data
Y. Cheng GMU CS571 Spring 2020
Flash Writes
• Writing 0’s• Fast, fine-grained
• Writing 1’s• Slow, coarse-grained
26Y. Cheng GMU CS571 Spring 2020
Flash Writes
• Writing 0’s• Fast, fine-grained
• called “program”
• Writing 1’s• Slow, coarse-grained
• called “erase”
27Y. Cheng GMU CS571 Spring 2020
Flash Writes
• Writing 0’s• Fast, fine-grained [page-level]
• called “program”
• Writing 1’s• Slow, coarse-grained [block-level]
• called “erase”
28Y. Cheng GMU CS571 Spring 2020
Flash Writes
• Writing 0’s• Fast, fine-grained [page-level]
• called “program”
• Writing 1’s• Slow, coarse-grained [block-level]
• called “erase”
• Flash can only “write” (program) into clean pages• “clean”: pages containing all 1’s (pages that have
been erased)
• Flash does not support in-place overwrite!
29Y. Cheng GMU CS571 Spring 2020
Banks and Blocks
30
Bank 0 Bank 1 Bank 2 Bank 3
Y. Cheng GMU CS571 Spring 2020
Banks and Blocks
31
Bank 0 Bank 2 Bank 3
Each bank contains many “blocks”
Y. Cheng GMU CS571 Spring 2020
Block and Pages
32
11111111
11111111
11111111
11111111
11111111
11111111
11111111
11111111
One block
Y. Cheng GMU CS571 Spring 2020
Block and Pages
33
11111111
11111111
11111111
11111111
11111111
11111111
11111111
11111111
One page
Y. Cheng GMU CS571 Spring 2020
Block and Pages
34
11111111
11111111
11111111
11111111
11111111
11111111
11111111
11111111
All pages are clean (“programmable”)
Y. Cheng GMU CS571 Spring 2020
Block
35
10111000
11111111
11111111
11111111
11111111
11111111
11111111
11111111
program
Y. Cheng GMU CS571 Spring 2020
Block
36
10111000
11111111
11111111
11111111
11111111
01101010
11111111
11111111
program
Y. Cheng GMU CS571 Spring 2020
Block
37
10111000
11111111
11111111
11111111
11111111
01101010
11111111
11111111
Two pages hold data (cannot be overwritten)
Y. Cheng GMU CS571 Spring 2020
Block
38
10111000
11111111
11111111
11111111
11111111
01101010
11111111
11111111
still want to write data into this page???
Two pages hold data (cannot be overwritten)
Y. Cheng GMU CS571 Spring 2020
Block
39
10111000
11111111
11111111
11111111
11111111
01101010
11111111
11111111
erase
Y. Cheng GMU CS571 Spring 2020
Block
40
11111111
11111111
11111111
11111111
11111111
11111111
11111111
11111111
erase(the whole block)
Y. Cheng GMU CS571 Spring 2020
Block
41
11111111
11111111
11111111
11111111
11111111
11111111
11111111
11111111
After erase, again, free state (can write new data in any page)
Y. Cheng GMU CS571 Spring 2020
Block
42
10110001
11111111
11111111
11111111
11111111
11111111
11111111
11111111
This blue page holds data
Y. Cheng GMU CS571 Spring 2020
Flash vs. Disks: APIs
43
disk flashread
write
Y. Cheng GMU CS571 Spring 2020
Flash vs. Disks: APIs
44
disk flashre
adw
rite
read sector read page
Y. Cheng GMU CS571 Spring 2020
Flash vs. Disks: APIs
45
disk flashre
adw
rite
read sector read page
write sectorprogram page
(0’s)erase block
(1’s)
Y. Cheng GMU CS571 Spring 2020
Flash Architecture
• Bank/plane: 1024 to 4096 blocks• Banks accessed in parallel
• Block: 64 to 256 pages• Unit of erase
• Page: 2 to 8 KB• Unit of read and program
46Y. Cheng GMU CS571 Spring 2020
Disks vs. Flash: Performance
• Throughput• Disk: ~130MB/s (sequential)
• Flash: ~400MB/s
• Latency• Disk: ~10ms (one op)
• Flash:• Read: 10-50us
• Program: 200-500us
• Erase: 2ms
47Y. Cheng GMU CS571 Spring 2020
Working with File System
48Y. Cheng GMU CS571 Spring 2020
Traditional File Systems
49
File System
Storage Device
Traditional API:• read sector• write sector
Y. Cheng GMU CS571 Spring 2020
Traditional File Systems
50
File System
Storage Device
Traditional API:• read sector• write sector
Mismatch with flash!
Y. Cheng GMU CS571 Spring 2020
Traditional APIs wrapping around Flash APIsread(addr):
return flash_read(addr)
write(addr, data):
block_copy = flash_read(all pages of block)
modify block_copy with data
flash_erase(block of addr)
flash_program(all pages of block_copy)
51Y. Cheng GMU CS571 Spring 2020
Awkward Flash Write
52
Memory
Flash
0000
0011
0110
1111
block 0
1101
1111
1110
1111
block 1
0001
1111
1111
1011
block 2Y. Cheng GMU CS571 Spring 2020
Awkward Flash Write
53
Memory
Flash
0000
0011
0110
1111
block 0
1101
1111
1110
1111
block 1
0001
1111
1111
1011
block 2
File system wantsto write 0001
Y. Cheng GMU CS571 Spring 2020
Awkward Flash Write
54
Memory
Flash
0000
0011
0110
1111
block 0
1101
1111
1110
1111
block 1
0001
1111
1111
1011
block 2Y. Cheng GMU CS571 Spring 2020
Awkward Flash Write
55
Memory
Flash
0000
0011
0110
1111
block 0
1101
1111
1110
1111
block 1
0001
1111
1111
1011
block 2
Read all pages in block
0000
0011
0110
1111
Y. Cheng GMU CS571 Spring 2020
Awkward Flash Write
56
Memory
Flash
0000
0011
0110
1111
block 0
1101
1111
1110
1111
block 1
0001
1111
1111
1011
block 2
0000
0011
0110
1111
Y. Cheng GMU CS571 Spring 2020
Awkward Flash Write
57
Memory
Flash
0000
0011
0110
1111
block 0
1101
1111
1110
1111
block 1
0001
1111
1111
1011
block 2
0001
0011
0110
1111
Modify target pagein memory
Y. Cheng GMU CS571 Spring 2020
Awkward Flash Write
58
Memory
Flash
0000
0011
0110
1111
block 0
1101
1111
1110
1111
block 1
0001
1111
1111
1011
block 2
0001
0011
0110
1111
Y. Cheng GMU CS571 Spring 2020
Awkward Flash Write
59
Memory
Flash
1111
1111
1111
1111
block 0
1101
1111
1110
1111
block 1
0001
1111
1111
1011
block 2
0001
0011
0110
1111
Erase whole block
Y. Cheng GMU CS571 Spring 2020
Awkward Flash Write
60
Memory
Flash
1111
1111
1111
1111
block 0
1101
1111
1110
1111
block 1
0001
1111
1111
1011
block 2
0001
0011
0110
1111
Y. Cheng GMU CS571 Spring 2020
Awkward Flash Write
61
Memory
Flash
block 0
1101
1111
1110
1111
block 1
0001
1111
1111
1011
block 2
0001
0011
0110
1111
Program all pages in block 00
010011
0110
1111
Y. Cheng GMU CS571 Spring 2020
Awkward Flash Write
62
Memory
Flash
block 0
1101
1111
1110
1111
block 1
0001
1111
1111
1011
block 2
0001
0011
0110
1111
Y. Cheng GMU CS571 Spring 2020
Issue: Write Amplification
• Random writes are expensive for flash!
• Writing one 4KB page may cause:• read, erase, and program of the whole 256KB
block
63Y. Cheng GMU CS571 Spring 2020
Flash Translation Layer
64Y. Cheng GMU CS571 Spring 2020
Flash Translation Layer (FTL)
• Add an address translation layer between upper-level file system and lower-level flash• Translate logical device addresses to physical
addresses
• Convert in-place write into append-write
• Essentially, a virtualization & optimization layer
65Y. Cheng GMU CS571 Spring 2020
Flash Translation Layer (FTL)
66
Logical
Physical 0011
0110
1111
0010
1101
1111
1111
1111
0 1 2 3 4 5 6 7
block 0 block 1
Y. Cheng GMU CS571 Spring 2020
Flash Translation Layer (FTL)
67
Logical
Physical 0011
0110
1111
0010
1101
1111
1111
1111
0 1 2 3 4 5 6 7
block 0 block 1
Write 0011
Y. Cheng GMU CS571 Spring 2020
Flash Translation Layer (FTL)
68
Logical
Physical 0011
0110
1111
0010
1101
0011
1111
1111
0 1 2 3 4 5 6 7
block 0 block 1
Write 0011
Y. Cheng GMU CS571 Spring 2020
Flash Translation Layer (FTL)
69
Logical
Physical 0011
0110
1111
0010
1101
0011
1111
1111
0 1 2 3 4 5 6 7
block 0 block 1
Write 0011
Change address mapping
Y. Cheng GMU CS571 Spring 2020
Flash Translation Layer (FTL)
70
Logical
Physical 0011
0110
1111
0010
1101
0011
1111
1111
0 1 2 3 4 5 6 7
block 0 block 1
Write 0011
Change address mapping
Marked as invalidand will be garbage
collected
Y. Cheng GMU CS571 Spring 2020
SSD Architecture with FTL
71
FTL SRAM:Mapping table
SSD provides disk-like interface
Y. Cheng GMU CS571 Spring 2020
Flash Translation Layer (FTL)
• Usually implemented in flash device’s firmware
• Where to store mapping? • SRAM
• Physical pages can be in three states• valid, invalid, free
72Y. Cheng GMU CS571 Spring 2020
State Transition of Physical Pages
73
Free
Invalid
Valid
Y. Cheng GMU CS571 Spring 2020
State Transition of Physical Pages
74
Free
Invalid
ProgramValid
Y. Cheng GMU CS571 Spring 2020
State Transition of Physical Pages
75
Free
Invalid
Program
Relocate(overwrite)
Valid
Y. Cheng GMU CS571 Spring 2020
State Transition of Physical Pages
76
Free Valid
Invalid
Program
Erase Relocate(overwrite)
Y. Cheng GMU CS571 Spring 2020