Flash File Systems | Christian Egger | June 2010 | Distributed Systems


Flash File Systems | Rainbow-OS Architecture Seminar | June 2010

Intro - Flash

- Developed by Toshiba in 1985
- Replacement for EEPROMs
  - Read-only: BIOS, firmware...
  - Read-write: embedded devices (routers, controllers...)
- Moore's Law: faster, cheaper, bigger memories
- New application areas
  - Integration into microcontrollers
  - USB sticks (8MB... 2GB... 128GB...)
  - Memory cards: SD, MMC, xD, CF, MemoryStick
  - Embedded storage: MP3 players, mobile phones
  - As hard disk replacement: Solid State Disks (SSDs)
- 2 types: NOR and NAND flash


Flash Technology - NOR

- expensive
- low capacity
- good reliability
- byte/word-wise access (via address/data lines)
- direct CPU connection
- fast random access
- bitwise programmable
- use: mostly program memory / embedded
- very low erase and write performance


Flash Technology - NAND

- Command-driven interface
- SLC NAND flash (Single-Level Cell)
  - 2 states, 1 bit per cell
  - robust: 100K-1M erase cycles
  - more reliable than MLC
  - lower energy consumption, faster than MLC
  - more expensive than MLC
- MLC NAND flash (Multi-Level Cell)
  - 4 states / 2 bits per cell
  - 10K-100K erase cycles
  - bad blocks already present on delivery (like dead pixels on LCDs)
  - more storage per silicon area
  - stricter constraints compared to SLC and NOR


Flash storage: Differences from hard disks

- Basic commands
  - read
  - erase
  - write
- Units
  - Page - 2KiB
  - Block - 128KiB
- Granularities
  - read/write - byte/word (NOR)
  - read/write - page (NAND)
  - erase - block
- Limited number of erase cycles
  - NOR: 100k-1M
  - SLC NAND: 100k+
  - MLC NAND: 10k-100k
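
The units and granularities above can be made concrete with a toy model. This is only a sketch, assuming the 2KiB page and 128KiB block sizes from the slide; the class `NandBlock` and its methods are hypothetical names for illustration, not a real driver API.

```python
PAGE_SIZE = 2 * 1024          # 2KiB page, as on the slide
BLOCK_SIZE = 128 * 1024       # 128KiB erase block, as on the slide
PAGES_PER_BLOCK = BLOCK_SIZE // PAGE_SIZE


class NandBlock:
    """Toy erase block: program per page, erase per block (hypothetical model)."""

    def __init__(self):
        self.pages = [None] * PAGES_PER_BLOCK   # None means "erased"
        self.erase_count = 0

    def read(self, page_no):
        data = self.pages[page_no]
        # Erased flash reads back as all ones.
        return b"\xff" * PAGE_SIZE if data is None else data

    def program(self, page_no, data):
        if len(data) != PAGE_SIZE:
            raise ValueError("NAND is programmed in whole pages")
        if self.pages[page_no] is not None:
            raise RuntimeError("page must be erased before it is written again")
        self.pages[page_no] = bytes(data)

    def erase(self):
        # Erase works only on the whole block and costs one erase cycle.
        self.pages = [None] * PAGES_PER_BLOCK
        self.erase_count += 1
```

With these sizes one block holds 64 pages, so changing a single page in place means erasing 63 unrelated pages along with it.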


Other quirks

- Writes
  - in-place: handicapped
    - needs "read-modify-erase-write"
    - expensive (time, complexity)
    - not atomic, unsafe
    - high wear
  - out-of-place
    - needs only "write" (assumption: pre-erased pages)
    - atomicity
- Overwrites
  - multiple writes to the same "page"
  - only on some flashes (NOR, SLC)
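
The cost difference between the two write styles can be estimated with simple counting. The sketch below assumes 64 pages per block (128KiB / 2KiB) and that an in-place update has to read back and rewrite the whole block; both function names are hypothetical.

```python
PAGES_PER_BLOCK = 64  # assumed: 128KiB block / 2KiB pages


def in_place_update_cost(pages_in_block=PAGES_PER_BLOCK):
    """Operations to change one page via read-modify-erase-write."""
    return {
        "page_reads": pages_in_block,    # read the whole block into RAM
        "block_erases": 1,               # erase the block (wear!)
        "page_writes": pages_in_block,   # write everything back
    }


def out_of_place_update_cost():
    """Operations to change one page by writing a new copy to a pre-erased page."""
    return {"page_reads": 0, "block_erases": 0, "page_writes": 1}
```

The erase in the in-place variant is also the point where a power failure loses the old data, which is why the slide calls it "not atomic, unsafe".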


More quirks

- NAND: spare areas
  - extra storage
  - with/without overwriting
  - Application
    - ECC
    - bad block flags
    - deletion marker
- NAND: writes strictly linear
  - only within the same block
  - consequence of other optimizations (price)


Methods of using flash

- Flash Translation Layer (FTL)
  - flash as a block device
  - Handling
    - wear-leveling
    - bad-block handling
    - error correction
    - mapping of different page sizes
- Flash File Systems
  - since 1990+, FFS2 by Microsoft
  - Advantages over FTL
    - directly usable, no extra logic
    - more efficient
    - special applications possible (XIP)
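
What an FTL has to do can be sketched in a few lines. This is a toy page-mapping layer under simplifying assumptions (no garbage collection, no ECC); `TinyFTL` and all of its fields are hypothetical names, not taken from any real FTL.

```python
class TinyFTL:
    """Toy page-mapping FTL (hypothetical): logical page -> (block, page)."""

    def __init__(self, num_blocks=4, pages_per_block=4, bad_blocks=()):
        self.pages_per_block = pages_per_block
        self.flash = {}                       # (block, page) -> data
        self.mapping = {}                     # logical page -> (block, page)
        # Bad-block handling: bad blocks never enter the allocation tables.
        self.erase_counts = {b: 0 for b in range(num_blocks) if b not in bad_blocks}
        self.next_free = {b: 0 for b in self.erase_counts}   # next unwritten page

    def _pick_block(self):
        # Wear-leveling heuristic: least-erased block that still has a free page.
        candidates = [b for b, n in self.next_free.items() if n < self.pages_per_block]
        if not candidates:
            raise RuntimeError("no free pages; a real FTL would garbage-collect here")
        return min(candidates, key=lambda b: (self.erase_counts[b], b))

    def write(self, logical_page, data):
        block = self._pick_block()
        page = self.next_free[block]
        self.next_free[block] += 1            # pages are written linearly within a block
        self.flash[(block, page)] = data
        self.mapping[logical_page] = (block, page)   # the old copy becomes obsolete

    def read(self, logical_page):
        return self.flash[self.mapping[logical_page]]
```

A flash file system folds this mapping into its own data structures instead of hiding it below a block-device interface, which is where the "no extra logic" advantage comes from.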


Concepts

- Nodes
- Log Structure
- Garbage Collection
- Wandering Trees
- Write Back
- Mount Scanning
- Checkpointing / Snapshotting
- Compression
- Error Correction
- Execute-in-Place


Nodes

- Contiguous structure
  - Metadata
  - (not necessarily also) Data
- fewer write operations


Log Structure

- Out-of-place
- Forms
  - Ring buffer
  - Log structure within single blocks
  - Partitioning into areas
- best performance: blocks pre-erased


Log Structure

Figure: Log with some operations (CREATE, APPEND, WRITE to OFFSET, TRUNCATE, DELETE, CREATE).
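
The sequence of operations in the figure can be replayed to recover the current file state. The following is a minimal sketch; the tuple format of the log entries and the file names are assumptions made for illustration.

```python
def replay(log):
    """Rebuild file contents by replaying log entries in order (hypothetical format)."""
    files = {}
    for op, name, *args in log:
        if op == "CREATE":
            files[name] = bytearray()
        elif op == "APPEND":
            files[name] += args[0]
        elif op == "WRITE":                 # WRITE to OFFSET
            offset, data = args
            buf = files[name]
            if len(buf) < offset + len(data):
                buf.extend(b"\0" * (offset + len(data) - len(buf)))
            buf[offset:offset + len(data)] = data
        elif op == "TRUNCATE":
            del files[name][args[0]:]
        elif op == "DELETE":
            del files[name]
        else:
            raise ValueError(f"unknown log entry: {op}")
    return {name: bytes(buf) for name, buf in files.items()}


log = [
    ("CREATE", "a"),
    ("APPEND", "a", b"hello world"),
    ("WRITE", "a", 0, b"J"),
    ("TRUNCATE", "a", 5),
    ("DELETE", "a"),
    ("CREATE", "b"),
]
```

Nothing in the log is ever modified. Later entries simply supersede earlier ones, so the first five entries are garbage once the DELETE has been written.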


Garbage Collection

- Log structure: trashing, fragmentation of single blocks
- Block status
  - empty / erased
  - full / all data valid (obsolete)
  - partially full / some invalid data
  - erasable / all data invalid
- Solution: GC!
  - redundant copy
  - obsolete full blocks
  - reclaim free space
- Strategies
  - strict (like a ring buffer)
  - heuristics


Garbage Collection

Figure: GC with different block statuses (labels: copy, erase, fragmented).
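
One garbage-collection step can be sketched as follows. The example uses a greedy heuristic (pick the block with the most invalid pages), which is only one possible strategy. Pages carry just a state string and no payload, and the sketch assumes an erased block is always kept in reserve as a copy target. All names are hypothetical.

```python
def classify(block):
    """Map a block (list of 'free' / 'valid' / 'invalid' pages) to a status."""
    if all(p == "free" for p in block):
        return "empty"
    if all(p == "invalid" for p in block):
        return "erasable"
    if all(p == "valid" for p in block):
        return "full"
    return "partially full"


def collect(blocks):
    """One greedy GC step: copy the victim's valid pages away, then erase it.
    Returns the index of the reclaimed block, or None if nothing is invalid."""
    victim = max(range(len(blocks)), key=lambda i: blocks[i].count("invalid"))
    if blocks[victim].count("invalid") == 0:
        return None
    survivors = blocks[victim].count("valid")
    size = len(blocks[victim])
    if survivors:
        # Copy: needs an erased block (raises StopIteration if none is reserved).
        target = next(i for i, b in enumerate(blocks) if set(b) == {"free"})
        blocks[target] = ["valid"] * survivors + ["free"] * (size - survivors)
    blocks[victim] = ["free"] * size          # erase
    return victim
```

A strict ring-buffer strategy would instead always pick the oldest block, regardless of how much of it is still valid.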


Wandering Trees

- Directory index
  - like the ext2 tree
  - but: out-of-place updates, floating structures
- Differences
  - Index still points to obsolete data (COW)
  - Update index recursively
  - Order: leaf .. root node (atomicity!)
  - Root node has a new place
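
The recursive index update can be illustrated with an append-only list standing in for the flash. This is a sketch under simplifying assumptions: leaves are plain strings, index nodes are tuples of child addresses, and `write_node` and `update` are hypothetical helpers.

```python
flash = []                       # append-only log; the index of an entry is its address


def write_node(node):
    """Out-of-place write: append the node, return its flash address."""
    flash.append(node)
    return len(flash) - 1


def update(root_addr, path, new_data):
    """Replace the leaf reached via `path` (list of child slots) and
    rewrite every index node from the leaf up to the root."""
    # Walk down, remembering the address of each index node on the way.
    trail, addr = [], root_addr
    for slot in path:
        trail.append((addr, slot))
        addr = flash[addr][slot]
    # Write the new leaf first, then each parent, ending with the root.
    new_addr = write_node(new_data)
    for node_addr, slot in reversed(trail):
        children = list(flash[node_addr])
        children[slot] = new_addr
        new_addr = write_node(tuple(children))
    return new_addr              # the root node has a new place
```

Until the new root is written, the old root still describes a complete and consistent tree. That is why the leaf-to-root order gives atomicity.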


Write-Back Strategy

- Caching of dirty pages
- Write bulks of data
- Pros/Cons
  - fewer writes
  - agglomeration of data
  - not safe (cached data is lost on power failure)
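
A write-back cache can be sketched with a dictionary of dirty pages that is flushed in bulk. The class below is a hypothetical illustration; a plain dict stands in for the flash device, and the flush threshold `max_dirty` is an assumed parameter.

```python
class WriteBackCache:
    """Toy write-back cache (hypothetical): collect dirty pages, flush in bulk."""

    def __init__(self, backend, max_dirty=4):
        self.backend = backend        # dict standing in for the flash device
        self.max_dirty = max_dirty
        self.dirty = {}               # page number -> newest data, RAM only
        self.flash_writes = 0

    def write(self, page_no, data):
        self.dirty[page_no] = data    # repeated writes to a page merge in RAM
        if len(self.dirty) >= self.max_dirty:
            self.flush()

    def read(self, page_no):
        return self.dirty.get(page_no, self.backend.get(page_no))

    def flush(self):
        for page_no, data in sorted(self.dirty.items()):
            self.backend[page_no] = data
            self.flash_writes += 1
        self.dirty.clear()            # anything still dirty at power loss would be gone
```

Ten writes to the same page cost a single flash write here, at the price of losing whatever has not been flushed yet.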


Mount Scanning

- Index not on flash
  - construct on startup
  - full device scan needed
  - complexity: O(n) (start time + RAM vs. device size)
- Index on flash
  - locatable in O(1): root node
  - complexity: O(1) possible (RAM + startup time)
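
The O(n) case can be sketched as a scan over every node on the device. The node layout `(file_id, version, data)` is an assumption for illustration; real file systems store more metadata per node.

```python
def mount_scan(flash_nodes):
    """Build the in-RAM index by reading every node once: O(n) in device size.
    Each node is (file_id, version, data); the highest version wins."""
    index = {}
    for addr, node in enumerate(flash_nodes):
        if node is None:                      # erased space
            continue
        file_id, version, _data = node
        if file_id not in index or version > flash_nodes[index[file_id]][1]:
            index[file_id] = addr
    return index
```

With the index stored on flash, mounting only needs to find the root node at a known location and can load the rest on demand.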


Checkpointing / Snapshots

- Checkpointing
  - for file systems without an index on flash
  - memory dump of the index saved to flash
  - low mount-scan complexity
  - fast startup
  - validity: as long as the state does not change


Compression

- Slow writes
- Compression: write less data, so faster?
- Algorithms
  - deflate/zlib (default)
  - LZO
  - LZMA
  - bzip2
- Application
  - compress data
  - compress metadata
- most often only data
- Problem: calculation of free space?
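
Why free space becomes hard to calculate can be seen by compressing two payloads of the same size. This sketch uses Python's standard `zlib` module and assumes a 2KiB page; `pages_needed` and the sample payloads are hypothetical.

```python
import os
import zlib

PAGE_SIZE = 2048  # assumed page size


def pages_needed(data, compress=True):
    """Number of flash pages a payload occupies, with or without zlib."""
    stored = zlib.compress(data) if compress else data
    return -(-len(stored) // PAGE_SIZE)       # ceiling division


text_like = b"flash file system " * 1000     # highly repetitive, compresses well
random_like = os.urandom(len(text_like))      # practically incompressible
```

Both payloads are 18000 bytes, yet the repetitive one fits into a single page while the random one needs as many pages as it would uncompressed. The file system therefore cannot tell in advance how much user data still fits.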


Error Detection & Correction

- NAND: focus on cheap price
- Defective blocks/pages
  - delivery with bad blocks allowed
  - emerge during use
  - mark: flag in spare area
- Bit flips in neighbouring cells
- Software has to deal with that
  - CRCs
  - ECC (detect 2-bit, correct 1-bit errors)
  - FS data structure has to allow bad blocks everywhere
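
The "detect 2-bit, correct 1-bit" property can be demonstrated with an extended Hamming code. The sketch below protects only 4 data bits per codeword to stay readable. It is a toy and not the layout used by real NAND controllers or drivers, which typically compute the code over a whole page fragment and store it in the spare area.

```python
DATA_POSITIONS = (3, 5, 6, 7)            # positions 1, 2, 4 hold parity bits


def encode(nibble):
    """Encode 4 data bits into an 8-bit extended Hamming codeword (list of bits)."""
    code = [0] * 8                       # index 0: overall parity, 1..7: Hamming(7,4)
    for shift, pos in enumerate(DATA_POSITIONS):
        code[pos] = (nibble >> shift) & 1
    syndrome = 0
    for pos in DATA_POSITIONS:
        if code[pos]:
            syndrome ^= pos
    for p in (1, 2, 4):                  # set parity bits so the syndrome becomes 0
        code[p] = 1 if syndrome & p else 0
    code[0] = sum(code[1:]) % 2          # makes the total number of ones even
    return code


def decode(code):
    """Return (nibble, status); status is 'ok', 'corrected' or 'uncorrectable'."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos
    parity_ok = sum(code) % 2 == 0
    if syndrome == 0 and parity_ok:
        status = "ok"
    elif not parity_ok:
        # Odd number of flips: assume one; the syndrome is its position
        # (syndrome 0 means the overall parity bit itself flipped).
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with even overall parity: two bits flipped.
        return None, "uncorrectable"
    nibble = sum(code[pos] << shift for shift, pos in enumerate(DATA_POSITIONS))
    return nibble, status
```

Three or more flipped bits can fool this code. That is one reason bit errors are corrected early, before they accumulate.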


Execute-in-Place

- No fetch into RAM
- Executable text area mapped directly into address space
- only NOR
- very invasive (FS code - paging code)
- used for "embedded" areas
- e.g. Linux phones (Maemo, FIC)
- 2 implementations in Linux
  - AXFS
  - CramFS + XIP patch


Flash File Systems in Linux

Year  Name    In kernel?
1999  JFFS    discontinued
2001  JFFS2   Linux-2.4.10+
2002  YAFFS   patch only
2005  YAFFS2  patch only
2007  LogFS   Linux-2.6.34+
2008  UBIFS   Linux-2.6.27+


JFFS

- Axis Communications AB
- first implementation
- Structure: nodes + strict log
- no compression
- Kernel 2.0 / 2.2
- no hardlinks
- Mount scan
- Index in RAM


JFFS2

- Redesign of JFFS by Red Hat
- designed for NOR
- Improvements
  - Compression (zlib, rubin, rtime)
  - relaxed log-structure approach
  - hardlink support
- most often used flash FS
- Problem: scalability
  - RAM: O(n) in the number of objects in JFFS2
  - Startup time: O(n) in the device size


YAFFS

- designed for NAND
- very portable; Linux: patch
- no index on flash: RAM and start in O(n)
- but: checkpointing, fast start
- no compression
- YAFFS1
  - 512B page size NAND
  - Spare areas: deletion marker
  - simple mount scan
- YAFFS2
  - 2KiB page size NAND
  - Spare areas only for marking bad blocks
  - no overwriting


LogFS

- Block and MTD mode
- requirement: scalability
- RAM usage and start in O(1)
  - Index on flash: wandering tree
  - 2 anchor areas: pointers to floating structures
  - Block levels: blocks only used for nodes of the same level
    - root node blocks
    - ...
    - level n blocks
    - data blocks


UBIFS & UBI

- UBI layer
  - "unsorted block images"
  - LEBs / PEBs (logical / physical erase blocks)
  - wear-leveling
  - error correction
  - scrubbing
  - start in O(n)
- UBIFS
  - very much like LogFS (except for UBI)


Read-Only File Systems in Linux

Year  Name      In kernel?
1997  RomFS     2.2+
1999  CramFS    2.4+
2002  SquashFS  2.6.29+
2006  AXFS      patch


CRAMFS

- Compression support
- terse metadata
- very mature
- Disadvantages
  - 8-bit gids/uids
  - no timestamps
  - 16MiB file size limit
  - device size limit: < 256MB (+16MB)
- XIP support (MontaVista patch)


SquashFS

- variable (compression) block size up to 1MiB
- result: good compression
  - zlib (default)
  - LZMA
  - Bzip2
  - LZO
- Applications
  - Embedded
  - LiveCDs (+UnionFS)


AXFS

- "Advanced eXecute-in-place File System"
- only for NOR
- no MTD layer
- not mainline, very invasive (messes with non-VFS code)
- pages either XIP or compressed
  - runtime profiling support (XIP xor compression)
  - profile fed to mkfs.axfs


Linux: SSDs and ATA Trim()

- currently supported by
  - Btrfs, VFAT, EXT4, GFS2, NILFS
- Btrfs
  - special block allocator modes
  - mode "ssd"
  - mode "ssd_spread"
- SSD mode off by default
  - buggy SSD FTLs


Windows

- ATA Trim()
  - Windows 7 only
  - FAT
  - NTFS
- exFAT
  - chosen future standard file system for SDXC
  - no 4GiB file limit
  - no 32GiB/2TiB device limit
  - patent-encumbered
  - proprietary
  - not (really) supported on Linux yet