NANDFS: A Flexible Flash File System for RAM-Constrained Systems
Aviad Zuck, Ohad Barzilay and Sivan Toledo
Overview
- Introduction + motivation
- Flash properties
- Big ideas
- Going into details
- Software engineering, tests and experiments
- General flash issues
Flash is Everywhere
- Resilient to vibrations and extreme conditions
- Up to 100 times faster (random access) than rotating disks
What’s missing?
Sequential access speed, and price:
“Today, consumer-grade SSD costs from $2 to $3.45 per gigabyte, hard drives about $0.38 per gigabyte…”
Computerworld.com, 27.8.2008*
*http://www.computerworld.com/s/article/print/9112065/Solid_state_disk_lackluster_for_laptops_PCs
| NOR Flash | NAND Flash |
| --- | --- |
| Looser | Constrained |
| Mostly reads | Storage |
| Few MB | Many MB/GB |
Two Ways of Flash Management
Conventional file systems: NTFS, FAT, ext3, …
Flash file systems: JFFS, YAFFS, NANDFS, …
So Why NANDFS?
NANDFS Also Has:
- File locking
- Transactions
- Competitive performance and graceful degradation
How is it Done, in a Nutshell?
- Explanation does not fit in a nutshell
- Complex data structures
- New garbage collection mechanism
- And much more…
Let’s elaborate
Flash Properties
- Flash memory is divided into pages: 0.5KB, 2KB, 4KB
- A page consists of data and metadata areas: 16B of metadata for every 512B of data
- Pages are arranged in units: 32/64/128 pages per unit
- Metadata contains a unit validity indicator, ECC and file system metadata
Erasures & Programming
- Page bits are initialized to 1s
- Writing clears bits (1 to 0)
- Bits are set by erasing an entire unit (“erase unit”)
- An erase unit has limited endurance
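The program/erase rules above can be illustrated with a small in-memory model. This is only a sketch: the type `erase_unit_t`, the functions `eu_erase` and `page_program`, and the geometry constants are hypothetical names chosen for illustration, not part of NANDFS or of any flash driver.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FLASH_PAGE_SIZE 512  /* assumed page size in bytes */
#define PAGES_PER_EU    32   /* assumed number of pages per erase unit */

/* Hypothetical in-memory model of one erase unit. */
typedef struct {
    uint8_t  pages[PAGES_PER_EU][FLASH_PAGE_SIZE];
    unsigned erase_count;    /* wear indicator: endurance is limited */
} erase_unit_t;

/* Erasing is the only way to set bits back to 1, and it always
 * affects the whole unit. */
void eu_erase(erase_unit_t *eu) {
    memset(eu->pages, 0xFF, sizeof eu->pages);
    eu->erase_count++;
}

/* Programming can only clear bits (1 to 0), so the stored value
 * becomes the bitwise AND of the old and the new contents. */
void page_program(erase_unit_t *eu, unsigned page, const uint8_t *data) {
    assert(page < PAGES_PER_EU);
    for (unsigned i = 0; i < FLASH_PAGE_SIZE; i++)
        eu->pages[page][i] &= data[i];
}
```

Programming the same page twice without an erase in between leaves the AND of both writes, which is why overwrite-in-place is not an option on flash.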
The Design of NANDFS - The “Big” Ideas
Log-structured design
Overwrite-in-place is not permitted in flash
Caching avoids rippling effect
Modular Flash File System
| Traditional Block Device | NANDFS “Block Device” |
| --- | --- |
| READ | READ |
| WRITE | ALLOCATE-AND-WRITE |
| (TRIM) | TRIM |
Modularity is good. But… we need a block device API designed for flash.
We call our “block device” the sequencing layer.
High-level Design
A 2-layer structure:
- File System Layer: transactional file system with Unix-like file structure
- Sequencing Layer: manages the allocation of immutable page-sized chunks of data. Assists in crash recovery and atomicity
The Sequencing Layer
- Divides flash into fixed-size physical units called slots
- Slots are assigned to segments: logical units of the same size
- Each segment maps to one matching physical slot, except one “active segment”, which is mapped to two slots
Block access
- Segment ~> slot mapping table in RAM
- A block is referenced by a logical handle <segment_id, offset_in_segment>
- Address translation
  - Example: logical address <0,2> ~> physical address 8
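The translation step can be sketched as a table lookup followed by an offset addition. The names `logical_addr_t`, `seg_map_t` and `logical_to_physical` are hypothetical. The slide's figure is not part of this transcript, so the geometry below (3 pages per slot, segment 0 stored in slot 2) is an assumption, chosen only because it reproduces the example <0,2> ~> 8.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_SEGMENTS   4  /* assumed number of segments */
#define PAGES_PER_SLOT 3  /* assumed; picked to match the <0,2> ~> 8 example */

/* Logical handle <segment_id, offset_in_segment>. */
typedef struct {
    uint16_t segment_id;
    uint16_t offset;
} logical_addr_t;

/* RAM-resident segment ~> slot table: one small entry per segment,
 * not one entry per page. */
typedef struct {
    uint16_t slot_of_segment[NUM_SEGMENTS];
} seg_map_t;

/* Physical page number = first page of the segment's slot + offset. */
uint32_t logical_to_physical(const seg_map_t *map, logical_addr_t la) {
    assert(la.segment_id < NUM_SEGMENTS && la.offset < PAGES_PER_SLOT);
    return (uint32_t)map->slot_of_segment[la.segment_id] * PAGES_PER_SLOT
           + la.offset;
}
```

With the table `{2, 0, 1, 3}`, the handle <0,2> resolves to page 2 * 3 + 2 = 8. Moving segment 0 to another slot only changes one table entry; every handle pointing into the segment stays valid.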
Where’s the innovation?
- Logical address mapping is not a new idea: Logical Disk (1993), YAFFS, JFFS, and more
- Many FTLs use some logical address mapping
  - Full mapping ~> expensive
  - Coarse-grained mapping
    - Fragmentation, performance degradation
    - Costly merges
* DFTL: A Flash Translation Layer Employing Demand-based Selective Caching of Page-level Address Mappings (2009)
The difference in NANDFS
- NANDFS uses coarse-grained mapping, not full mapping
- Less RAM for page mapping (more RAM flexibility)
- Collects garbage while preserving the validity of pointers to non-obsolete blocks
- Appropriate for flash, not for magnetic disks
Block allocation
- NANDFS is log-structured
- New blocks are allocated sequentially from the active segment
- In a log-structured system blocks are never re-written
- File pointer structures need to be updated to reflect the new location of the data
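A minimal sketch of sequential allocation, assuming a tiny in-RAM "flash" and hypothetical names (`seq_layer_t`, `allocate_and_write`, `rewrite_block`). It shows the point of the list above: the layer, not the caller, chooses where a block lands, and a rewrite produces a new handle that the caller must store in place of the old one.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_BYTES    16  /* assumed, tiny for illustration */
#define PAGES_PER_SEG 4   /* assumed */
#define NUM_SEGS      2   /* assumed */

typedef struct {
    uint16_t segment_id;
    uint16_t offset;
} handle_t;

/* Hypothetical sequencing-layer state. */
typedef struct {
    uint8_t  data[NUM_SEGS][PAGES_PER_SEG][PAGE_BYTES];
    uint16_t active_segment;  /* segment currently receiving writes */
    uint16_t next_offset;     /* next free page in the active segment */
} seq_layer_t;

/* ALLOCATE-AND-WRITE: writes buf to the next free page and reports where
 * it went. Returns -1 when the active segment is full (a real layer would
 * then select a new active segment). */
int allocate_and_write(seq_layer_t *s, const uint8_t *buf, handle_t *out) {
    assert(s->active_segment < NUM_SEGS);
    if (s->next_offset >= PAGES_PER_SEG)
        return -1;
    memcpy(s->data[s->active_segment][s->next_offset], buf, PAGE_BYTES);
    out->segment_id = s->active_segment;
    out->offset = s->next_offset++;
    return 0;
}

/* A rewrite never touches the old page: the new copy gets a new handle
 * and the caller's pointer is updated. The old page would be TRIMmed. */
int rewrite_block(seq_layer_t *s, handle_t *file_ptr, const uint8_t *buf) {
    handle_t fresh;
    if (allocate_and_write(s, buf, &fresh) != 0)
        return -1;
    *file_ptr = fresh;
    return 0;
}
```

Because the pointer to a rewritten block changes, the structure holding that pointer must itself be rewritten, which is the rippling effect that caching is meant to limit.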
Garbage collection
- TRIM: pages with obsolete data are marked with a special “obsolete flag”
- The sequencing layer maintains a counter of obsolete pages for every segment
- Problem: erase units (EUs) contain a mixture of valid and obsolete pages, so we can’t simply collect entire EUs
- Solution: garbage collection is performed together with allocation
- Reclamation unit = segment
- The sequencing layer chooses a segment to reclaim, and allocates it another (fresh) second slot
- Obsolete pages are reclaimed while non-obsolete pages are copied
- NOTICE: logical addresses are preserved, although the physical translation has changed
- Finally, when the new slot is full, the old slot is erased; it can now be used to reclaim another segment
- We choose the segment with the highest obsolete counter as the new “active segment”
- This would not work well on rotating disks: too many seek operations
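The reclamation pass can be sketched as follows. This is a simplified model under stated assumptions, not the NANDFS implementation: the names `slot_t`, `flash_t`, `pick_victim` and `reclaim_segment` are hypothetical, the whole pass runs in one call, and the old slot is erased immediately rather than when the new slot fills up. In the real design new writes are interleaved into the freed offsets.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define PAGES_PER_SLOT 4  /* assumed */
#define PAGE_BYTES     8  /* assumed */
#define NUM_SLOTS      4  /* assumed */
#define NUM_SEGS       3  /* assumed */

typedef struct {
    uint8_t data[PAGES_PER_SLOT][PAGE_BYTES];
    bool    obsolete[PAGES_PER_SLOT];
} slot_t;

typedef struct {
    slot_t   slots[NUM_SLOTS];
    uint16_t slot_of_segment[NUM_SEGS];
    uint16_t obsolete_count[NUM_SEGS];
} flash_t;

/* Choose the segment with the highest obsolete-page counter. */
uint16_t pick_victim(const flash_t *f) {
    uint16_t best = 0;
    for (uint16_t s = 1; s < NUM_SEGS; s++)
        if (f->obsolete_count[s] > f->obsolete_count[best])
            best = s;
    return best;
}

/* Move segment `seg` into the erased slot `fresh_slot`. Live pages keep
 * their offsets, so every logical handle <seg, offset> remains valid.
 * Offsets of obsolete pages stay erased and become free space.
 * Returns the number of pages freed. */
unsigned reclaim_segment(flash_t *f, uint16_t seg, uint16_t fresh_slot) {
    assert(seg < NUM_SEGS && fresh_slot < NUM_SLOTS);
    slot_t *old_slot = &f->slots[f->slot_of_segment[seg]];
    slot_t *new_slot = &f->slots[fresh_slot];
    unsigned freed = 0;
    for (unsigned off = 0; off < PAGES_PER_SLOT; off++) {
        if (old_slot->obsolete[off])
            freed++;
        else
            memcpy(new_slot->data[off], old_slot->data[off], PAGE_BYTES);
    }
    /* Erase the old slot and retarget the mapping. */
    memset(old_slot->data, 0xFF, sizeof old_slot->data);
    memset(old_slot->obsolete, 0, sizeof old_slot->obsolete);
    f->slot_of_segment[seg] = fresh_slot;
    f->obsolete_count[seg] = 0;
    return freed;
}
```

Only the one-entry-per-segment table changes during reclamation, which is what lets a coarse-grained mapping coexist with page-granularity garbage collection.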
Sequencing Layer Recovery
When a new slot is allocated to a segment, a segment header is written in the slot’s first page.
The header contains:
- Incremented segment sequencing number
- Segment number
- Segment type
- Checkpoint (further details later)
- On mounting, the header of every slot is read
- The segment-to-slot map can be reconstructed using only the data from the headers
- Other systems (with complete mapping) need to scan the entire flash
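Rebuilding the map at mount time might look like the sketch below. The header layout `seg_header_t`, the sentinel values and the function `rebuild_map` are assumptions made for illustration. The sketch also assumes that when two slots claim the same segment the one with the higher sequence number wins, and it ignores the case of a segment that was in the middle of reclamation when the system stopped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_SLOTS  6            /* assumed */
#define NUM_SEGS   4            /* assumed */
#define NO_SLOT    0xFFFFu      /* marks a segment with no slot */
#define ERASED_SEQ 0xFFFFFFFFu  /* an erased header reads as all 1s */

/* Assumed subset of the segment header kept in a slot's first page. */
typedef struct {
    uint32_t sequence;  /* incremented on every slot allocation */
    uint16_t segment;   /* segment number stored in this slot */
} seg_header_t;

/* One header read per slot is enough to rebuild the segment ~> slot map. */
void rebuild_map(const seg_header_t hdr[NUM_SLOTS],
                 uint16_t slot_of_segment[NUM_SEGS]) {
    uint32_t best_seq[NUM_SEGS];
    assert(hdr != NULL && slot_of_segment != NULL);
    for (unsigned s = 0; s < NUM_SEGS; s++) {
        slot_of_segment[s] = NO_SLOT;
        best_seq[s] = 0;
    }
    for (uint16_t slot = 0; slot < NUM_SLOTS; slot++) {
        const seg_header_t *h = &hdr[slot];
        if (h->sequence == ERASED_SEQ || h->segment >= NUM_SEGS)
            continue;  /* erased or unused slot */
        if (slot_of_segment[h->segment] == NO_SLOT ||
            h->sequence > best_seq[h->segment]) {
            slot_of_segment[h->segment] = slot;
            best_seq[h->segment] = h->sequence;
        }
    }
}
```

The work is proportional to the number of slots rather than the number of pages, which is the contrast the last bullet draws with systems that keep a complete page-level mapping.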
Bad EU Management
- Each flash memory chip contains some bad EUs
- Some slots contain more valid EUs than others
- Solution: some slots are set aside as a bank of reserve EUs
Brief Summary
The Design of NANDFS - More Ideas
Wear Leveling
- Writes and erases should be spread evenly over all EUs
- Problem: some slots may be reclaimed rarely
- Solution: perform a periodic random wear leveling process
  - Choose a random slot and copy it to a fresh slot
  - Incurs only a low overhead
  - Guarantees near-optimal expected endurance (Ben-Aroya and Toledo, 2006)
- Technique widely used (YAFFS, JFFS)
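One way to picture the random trigger is the sketch below. The slides do not say how often the process runs or how the slot is drawn, so the `period` parameter, the xorshift32 generator and the function names are all assumptions: on each slot allocation the check fires with probability of roughly 1/period and, if it does, names a uniformly random slot to be copied to a fresh one.

```c
#include <assert.h>
#include <stdint.h>

/* Small PRNG (xorshift32) so the sketch needs no library support.
 * The state must be non-zero. */
uint32_t xorshift32(uint32_t *state) {
    assert(*state != 0);
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Returns the slot to relocate, or -1 when no wear-leveling step is due.
 * `period` is a hypothetical tuning knob: larger means lower overhead. */
int wear_level_candidate(uint32_t *rng, uint32_t period, uint32_t num_slots) {
    if (period == 0 || num_slots == 0)
        return -1;
    if (xorshift32(rng) % period != 0)
        return -1;
    return (int)(xorshift32(rng) % num_slots);
}
```

A caller would invoke `wear_level_candidate` whenever it allocates a slot and, on a non-negative result, copy that slot to a fresh one, so that even rarely reclaimed (static) data eventually moves.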
Transactions
- File system operations are atomic and transactional
- Marking pages as obsolete is not straightforward
- Simple transaction: block re-write
  - After rewriting, the old data block should be marked obsolete
  - If we mark it and the transaction then aborts before completing, the old data should remain valid
  - But a page already marked as obsolete cannot be unmarked
Solution: perform the valid-to-obsolete transition (VOT) AFTER the transaction commits.
- Write VOT records to flash in dedicated pages
- On commit, use the VOT records to mark pages as obsolete
- Maintain on flash a linked list of all pages written in a specific transaction
- Keep in RAM a pointer to the last page written in a transaction
- On abort, mark all pages written by the transaction as obsolete
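The deferred marking can be sketched with two per-transaction lists. This is a deliberate simplification: NANDFS keeps VOT records in dedicated flash pages and chains written pages in an on-flash linked list, while the hypothetical `tx_t` below keeps both in small RAM arrays so the commit/abort asymmetry is easy to see.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_PAGES    16  /* assumed */
#define MAX_TX_PAGES 8   /* assumed */

typedef struct {
    bool obsolete[NUM_PAGES];  /* per-page obsolete flag */
} flash_state_t;

/* Hypothetical in-RAM stand-in for the on-flash transaction records. */
typedef struct {
    uint16_t vot[MAX_TX_PAGES];      /* old pages to obsolete on commit */
    unsigned n_vot;
    uint16_t written[MAX_TX_PAGES];  /* pages written by this transaction */
    unsigned n_written;
} tx_t;

/* Record a block rewrite: nothing is marked obsolete yet. */
int tx_rewrite(tx_t *tx, uint16_t old_page, uint16_t new_page) {
    assert(old_page < NUM_PAGES && new_page < NUM_PAGES);
    if (tx->n_vot >= MAX_TX_PAGES || tx->n_written >= MAX_TX_PAGES)
        return -1;
    tx->vot[tx->n_vot++] = old_page;
    tx->written[tx->n_written++] = new_page;
    return 0;
}

/* Commit: the replaced (old) pages become obsolete. */
void tx_commit(tx_t *tx, flash_state_t *f) {
    for (unsigned i = 0; i < tx->n_vot; i++)
        f->obsolete[tx->vot[i]] = true;
    tx->n_vot = 0;
    tx->n_written = 0;
}

/* Abort: the pages the transaction wrote become obsolete instead,
 * and the old data stays valid. */
void tx_abort(tx_t *tx, flash_state_t *f) {
    for (unsigned i = 0; i < tx->n_written; i++)
        f->obsolete[tx->written[i]] = true;
    tx->n_vot = 0;
    tx->n_written = 0;
}
```

Either outcome only ever sets obsolete flags, never clears them, which matches the constraint that a flag once written cannot be undone.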
Checkpoints
- Snapshot of system state
- Ensures returning to a stable state following a crash
- A checkpoint is written:
  - As part of a segment header
  - Whenever a transaction commits
- Structure:
  - Obsolete counters array
  - Pointer to the last-written block address of the committed transaction
  - Pointers to the last-written blocks of all on-going transactions
  - Pointer to the root inode
Simple Example
Finding the Last Checkpoint
- At any given time there is only one valid checkpoint in flash
- On mounting:
  - Locate the last allocated slot (using its sequence #)
  - Perform a binary search to see if another, later checkpoint exists in the slot
  - Abort all other transactions
  - Truncate all pages written after the checkpoint
  - Finish the transaction that was committed
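The slides do not spell out what the binary search compares. One plausible reading, sketched here as an assumption, is that pages in the active slot are written strictly in order, so programmed pages form a prefix and the boundary between written and erased pages can be found in a logarithmic number of page reads. The function `count_written_pages` and its array-based interface are hypothetical; a real implementation would read page metadata from flash instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* written[i] is true when page i of the slot has been programmed.
 * Precondition (assumed): all written pages precede all erased pages.
 * Returns the number of written pages, i.e. the index of the first
 * erased page. */
unsigned count_written_pages(const bool written[], unsigned n) {
    assert(written != NULL || n == 0);
    unsigned lo = 0;  /* every page below lo is written */
    unsigned hi = n;  /* every page at or above hi is erased */
    while (lo < hi) {
        unsigned mid = lo + (hi - lo) / 2;
        if (written[mid])
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}
```

From that boundary the mount code could walk back to the most recent checkpoint and treat everything after it as belonging to transactions that must be aborted or finished.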
File System Layer
- Files are represented by inode trees
  - File metadata
  - Direct pointers to data pages
  - Indirect pointers etc.
- All pointers are logical pointers
- Regular files are not permitted to be sparse
- Root file and directory inodes may be sparse
- A hole is indicated by a special flag
The Root File
- Array of inodes
When a file is deleted a page-size hole is created
When creating a file a hole can easily be located
If no hole exists, allocate a new inode by extending the root file
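The hole-reuse policy for inodes can be sketched as below. The `root_file_t` model, with one boolean per inode page, and the functions `inode_alloc` and `inode_free` are hypothetical; the real root file is itself a file on flash, whereas this sketch only tracks which positions are holes.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_INODES 8  /* assumed upper bound for the sketch */

/* Hypothetical model of the root file: one page-size entry per inode. */
typedef struct {
    bool     in_use[MAX_INODES];  /* false = hole left by a deleted file */
    unsigned length;              /* number of entries in the root file */
} root_file_t;

/* Reuse the first hole; otherwise extend the root file by one entry.
 * Returns the inode number, or -1 when the sketch's limit is reached. */
int inode_alloc(root_file_t *rf) {
    for (unsigned i = 0; i < rf->length; i++) {
        if (!rf->in_use[i]) {
            rf->in_use[i] = true;
            return (int)i;
        }
    }
    if (rf->length >= MAX_INODES)
        return -1;
    rf->in_use[rf->length] = true;
    return (int)rf->length++;
}

/* Deleting a file leaves a page-size hole at its position. */
void inode_free(root_file_t *rf, unsigned ino) {
    assert(ino < rf->length);
    rf->in_use[ino] = false;
}
```

Since an inode's position in the root file is its number, reusing holes keeps the root file from growing as files are created and deleted.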
Directory Structure
- Directory = array of directory entries (direntries), each containing:
  - inode number
  - Length
  - UTF-8 file name
- Direntry length <= 256 bytes
- Direntries are packed into chunks without gaps
- chunk size < (page - direntry size) ~> directory contains a “hole”
- Allocating a new direntry requires finding a hole
- Direntry lookup is sequential
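A packed chunk of direntries and its sequential lookup might look like this. The byte layout is an assumption (a 4-byte little-endian inode number, a 2-byte total entry length, then the name bytes without a terminator); the slides only list the three fields and the 256-byte limit. `dirent_append` and `dirent_lookup` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DIRENT_HDR 6    /* assumed: 4-byte inode number + 2-byte length */
#define DIRENT_MAX 256  /* direntry length limit from the slides */

/* Append an entry at offset `used` of a chunk with capacity `cap`.
 * Returns the new used size, or 0 if the entry does not fit. */
size_t dirent_append(uint8_t *chunk, size_t used, size_t cap,
                     uint32_t ino, const char *name) {
    size_t nlen = strlen(name);
    size_t elen = DIRENT_HDR + nlen;
    assert(used <= cap);
    if (elen > DIRENT_MAX || elen > cap - used)
        return 0;
    uint8_t *p = chunk + used;
    p[0] = (uint8_t)(ino & 0xFF);
    p[1] = (uint8_t)((ino >> 8) & 0xFF);
    p[2] = (uint8_t)((ino >> 16) & 0xFF);
    p[3] = (uint8_t)((ino >> 24) & 0xFF);
    p[4] = (uint8_t)(elen & 0xFF);
    p[5] = (uint8_t)((elen >> 8) & 0xFF);
    memcpy(p + DIRENT_HDR, name, nlen);
    return used + elen;
}

/* Sequential scan: entries have variable length, so each one must be
 * visited to find where the next begins. Returns the inode number,
 * or -1 if the name is not present. */
int64_t dirent_lookup(const uint8_t *chunk, size_t used, const char *name) {
    size_t nlen = strlen(name);
    size_t pos = 0;
    while (pos + DIRENT_HDR <= used) {
        const uint8_t *p = chunk + pos;
        size_t elen = (size_t)p[4] | ((size_t)p[5] << 8);
        if (elen < DIRENT_HDR || elen > used - pos)
            break;  /* malformed entry: stop scanning */
        if (elen - DIRENT_HDR == nlen &&
            memcmp(p + DIRENT_HDR, name, nlen) == 0)
            return (int64_t)((uint32_t)p[0] | (uint32_t)p[1] << 8 |
                             (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24);
        pos += elen;
    }
    return -1;
}
```

Packing without gaps saves flash space, and the variable-length entries are the reason lookup has to walk the chunk from the start.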
System Calls
- Most system calls (creat, unlink, mkdir…) are atomic transactions
- A transaction that handles a write() commits only on close()
- System calls that modify a single file can be bundled into a single transaction
  - 5 consecutive calls to write() + close() on a single file are treated as a single transaction
- Overhead of transaction commit: (actual physical page writes) / (minimum possible page writes) ~ 1
Running Out of Space
- A log-structured file system writes even when the user deletes files
- When flash is full, the system may have too few free pages to delete a file
- Solution: maintain the number of free + obsolete pages
- If the next write lowers this number below a threshold, abort transactions until we have enough free pages
- Threshold: [formula in c, b and l], where c = # of blocks written on direntry delete, b = max file pages, l = re-do records per page
Software Engineering
Coding
- Code written with the intention of being “humanly readable”:
  (&(transactions[tid]))->f_type = 0x02 vs. TRANSACTION_SET_FTYPE(tid, FTYPE_FILE)
- Embedded development
  - External libraries are not an option (math, string)
  - More macros, fewer functions (stack)
  - No debugging: need a good simulator!
  - Compliance with various gcc environments: cygwin, debian, arm-gcc
Incremental development
- High-level and low-level design preceded development
  - 3 weeks
- Code written bottom-up
  - Flash driver -> sequencing layer -> file system layer
  - Caching layer added later. Challenging…
  - 1 year (~commercial code)
- Test-driven development
  - “By hand” (no libraries)
My own boss - lessons
- Time frames
- Outsider notes
- Feedback, “pairing”
Experiments & Tests
Testing
- Extensive test suite:
  - Integration and performance tests
  - Extensive crash tests
  - Large set of unit tests for every function
- Integrated into eCos
- Tests and integration verified on an actual 32 MB flash
Experiments
- Simulated 1GB flash
- Configuration: 512 slots, 8 reserved for bad-block replacement
- 6 open files and 8 file descriptors
- 3 concurrent transactions
Workload
Slot Partitioning
Mounting
- YAFFS mounting time: 2.7s
- 80% utilization
Endurance
Repeatedly re-write a small file when the file system contains a static 205MB file.
(Some) Challenges in flash
Single vs. Multi level cell
Flash is classified by the number of bits stored in a single cell

| SLC (1 bit) | MLC (2-4 bits) |
| --- | --- |
| Smaller capacity | Cheaper |
| Errors from partial writes | Write-constrained |
| Faster | More error-prone |
| | Less endurance |
Parallelism
*Picture from N Agrawal, V Prabhakaran, T Wobber (2008)
Simple example for utilizing parallelism
* J Seol, H Shim, J Kim, and S Maeng (2009)
Enterprise storage
* SW Lee, B Moon, C Park, JM Kim, SW Kim (2008)
Disk bandwidth (sequential) still 2-3 times higher than flash
Flash read/write latency is smaller than disk latency by more than an order of magnitude
This improves throughput of transaction processing – useful for database servers
The End
Thank you!