More on File Management Chapter 12. File Management provide file abstraction for data storage...

More on File Management

Chapter 12

File Management

• provide a file abstraction for data storage
• guarantee, to the extent possible, that the data in the file is valid
• performance: throughput and response time
• reliability: minimize the potential for lost or destroyed data
• provide protection
• API: create, delete, read, write files

File Naming

• files must be referable by unique names
• external names: symbolic
• in a hierarchical file system (UNIX) external names are given as pathnames (the path from the root to the file)
• internal names: i-node in UNIX (an index into an array of file descriptors/headers for a volume)
• directory: translation from external to internal names (more than one external name for an internal name is allowed)
• information about a file is split between the directory and the file descriptor (in UNIX all of it is stored in the file descriptor): size, location on disk, owner, permissions, date created, date last modified, date last accessed, link count (in UNIX)

Protection Mechanisms

• files are OS objects: they have unique names and a finite set of operations that processes can perform on them
• a protection domain is a set of {object, rights} pairs, where a right is the permission to perform one of the operations
• at every instant in time, each process runs in some protection domain
• in Unix, a protection domain is {uid, gid}
• the protection domain in Unix is switched when running a program with SETUID/SETGID set, or when the process enters kernel mode by issuing a system call
• how to store all the protection domains?

Protection Mechanisms (cont'd)

• Access Control List (ACL): associate with each object a list of all the protection domains that may access the object, and how
• in Unix the ACL is reduced to three protection domains: owner, group and others
• Capability List (C-list): associate with each process a list of objects that may be accessed, along with the permitted operations
• C-list implementation issues: where/how to store them (hardware, kernel, encrypted in user space) and how to revoke them

Secondary Storage Management

• Space must be allocated to files

• Must keep track of the space available for allocation

Preallocation

• Need the maximum size for the file at the time of creation

• Difficult to reliably estimate the maximum potential size of the file

• Tendency to overestimate the file size so as not to run out of space

Methods of File Allocation

• Contiguous allocation
– A single set of blocks is allocated to a file at the time of creation
– Only a single entry in the file allocation table
  • Starting block and length of the file
• External fragmentation will occur

Methods of File Allocation

• Chained allocation
– Allocation on the basis of individual blocks
– Each block contains a pointer to the next block in the chain
– Only a single entry in the file allocation table
  • Starting block and length of the file
• No external fragmentation
• Best for sequential files
• No accommodation of the principle of locality
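
The pointer-following cost of chained allocation can be made concrete with a small sketch. It assumes a toy disk modelled as an array `next_block`, where each entry plays the role of the pointer stored inside that block; the names `chained_lookup`, `build_example_chain` and the block numbers 9, 3, 12 are illustrative only and do not come from the slides.

```c
#include <assert.h>

#define NBLOCKS 16
#define END_OF_CHAIN (-1)

/* Hypothetical toy disk: next_block[b] is the pointer stored in block b. */
int next_block[NBLOCKS];

/* Return the disk block holding logical block n of a file whose chain
 * starts at `start`, or END_OF_CHAIN if the file is shorter than that.
 * Reaching block n takes n pointer hops (n block reads on a real disk),
 * which is why chaining suits sequential rather than direct access. */
int chained_lookup(int start, int n)
{
    assert(n >= 0);
    int b = start;
    while (n-- > 0 && b != END_OF_CHAIN) {
        assert(b >= 0 && b < NBLOCKS);
        b = next_block[b];
    }
    return b;
}

/* Build the example chain 9 -> 3 -> 12 and return its starting block. */
int build_example_chain(void)
{
    next_block[9] = 3;
    next_block[3] = 12;
    next_block[12] = END_OF_CHAIN;
    return 9;
}
```

The blocks of the example file are scattered (9, 3, 12), which illustrates both points on the slide: any free block can be used, so there is no external fragmentation, but nothing keeps neighbouring file blocks close together on disk.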

Methods of File Allocation

• Indexed allocation
– The file allocation table contains a separate one-level index for each file
– The index has one entry for each portion allocated to the file
– The file allocation table contains the block number of the index

File Allocation

• contiguous: a contiguous set of blocks is allocated to a file at the time of file creation
– good for sequential files
– file size must be known at the time of file creation
– external fragmentation
• chained allocation: each block contains a pointer to the next one in the chain
– consolidation to improve locality
• indexed allocation: good both for sequential and direct access (UNIX)

Free Space Management

• bitmap: one bit for each block on the disk
– good to find a contiguous group of free blocks
– small enough to be kept in memory
• chained free portions: {pointer to the next one, length}
• index: treats free space as a file
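
A minimal sketch of the bitmap approach, showing why it makes finding a contiguous group of free blocks easy. The disk size `NBLOCKS`, the polarity (1 = in use) and the function names are assumptions made for illustration; the search is a plain first-fit scan, not a method prescribed by the slides.

```c
#include <assert.h>
#include <stdint.h>

#define NBLOCKS 64

/* One bit per disk block: 1 = in use, 0 = free (polarity is an assumption). */
uint8_t bitmap[NBLOCKS / 8];

int block_in_use(int b)
{
    assert(b >= 0 && b < NBLOCKS);
    return (bitmap[b / 8] >> (b % 8)) & 1;
}

void set_in_use(int b, int used)
{
    assert(b >= 0 && b < NBLOCKS);
    if (used)
        bitmap[b / 8] |= (uint8_t)(1u << (b % 8));
    else
        bitmap[b / 8] &= (uint8_t)~(1u << (b % 8));
}

/* First-fit scan: return the first block of a run of `len` free blocks,
 * or -1 if no such run exists. */
int find_free_run(int len)
{
    assert(len > 0);
    int run = 0;
    for (int b = 0; b < NBLOCKS; b++) {
        run = block_in_use(b) ? 0 : run + 1;
        if (run == len)
            return b - len + 1;
    }
    return -1;
}
```

The bitmap for N blocks occupies N/8 bytes (8 bytes here), which is the sense in which it is small enough to be kept in memory.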

UNIX File System

• Naming
– External/internal names, directories

• Lookup
– File blocks → disk blocks

• Protection

• Free Space Management

File Naming

• External names (used by the application)
– Pathname: /usr/users/file1

• Internal names (used by the OS kernel)
– I-node: file number/index on disk

[Figure: file system on disk (blocks numbered 0, 1, ...): superblock, I-node area (one I-node per file), file-block area]

Directories

• Files which store translation tables (external names to internal names)

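[Figure: directory lookup for /usr/users/file1]

Root directory (always I-node 2): usr → 23
Directory /usr (I-node 23): users → 41
Directory /usr/users (I-node 41): file1 → 87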

/usr/users/file1 corresponds to I-node 87
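
The lookup above can be sketched as a walk over the pathname, one component at a time, starting from the root I-node. This is a toy model: the three directories are flattened into a single in-memory table, whereas real directories are files read from disk, and the names `dirent_toy`, `dir_lookup` and `resolve` are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define ROOT_INO 2          /* root directory is always I-node 2 */

/* One directory entry: in directory `dir`, `name` maps to I-node `ino`. */
struct dirent_toy { int dir; const char *name; int ino; };

/* The three directories of the example, flattened into one table. */
const struct dirent_toy entries[] = {
    { 2,  "usr",   23 },
    { 23, "users", 41 },
    { 41, "file1", 87 },
};

/* Look up one name component (length `len`) in one directory; -1 if absent. */
int dir_lookup(int dir, const char *name, size_t len)
{
    for (size_t i = 0; i < sizeof entries / sizeof entries[0]; i++) {
        if (entries[i].dir == dir &&
            strlen(entries[i].name) == len &&
            memcmp(entries[i].name, name, len) == 0)
            return entries[i].ino;
    }
    return -1;
}

/* Resolve an absolute pathname to an I-node number, or -1 on failure. */
int resolve(const char *path)
{
    assert(path != NULL);
    if (path[0] != '/')
        return -1;                         /* only absolute paths in this sketch */
    int ino = ROOT_INO;
    const char *p = path;
    while (*p != '\0' && ino != -1) {
        while (*p == '/')
            p++;                           /* skip separators */
        size_t len = strcspn(p, "/");      /* length of the next component */
        if (len == 0)
            break;                         /* trailing slash or end of path */
        ino = dir_lookup(ino, p, len);
        p += len;
    }
    return ino;
}
```

Each component costs one directory search, so resolving /usr/users/file1 touches three directories before the file's own I-node is known.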

File Content Lookup

• address table used to translate logical file blocks into disk blocks

• address table stored in the I-node

[Figure: file with I-node 87; its address table maps logical file blocks 0, 1, 2 to disk blocks 45, 65, 85]
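
A sketch of the translation from a byte offset in the file to a disk block, using the address table of the example (logical blocks 0, 1, 2 in disk blocks 45, 65, 85). The block size of 512 bytes and the names `inode_toy` and `offset_to_disk_block` are assumptions; the slides do not give a block size.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK_SIZE 512      /* assumed block size, not given in the slides */
#define NADDR 3

/* Toy I-node holding only what the lookup needs. */
struct inode_toy {
    int addr[NADDR];        /* address table: logical block -> disk block */
    long size;              /* file size in bytes */
};

/* Map a byte offset in the file to the disk block that holds it, or -1
 * if the offset lies outside the file. */
int offset_to_disk_block(const struct inode_toy *ip, long offset)
{
    assert(ip != NULL);
    if (offset < 0 || offset >= ip->size)
        return -1;
    long logical = offset / BLOCK_SIZE;     /* logical file block number */
    if (logical >= NADDR)
        return -1;
    return ip->addr[logical];
}
```

Because the table is indexed directly, any logical block is found in one step, which is what makes indexed allocation suitable for direct access as well as sequential access.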

File Protection

• ACL with three protection domains (file owner, file owner group, others)

• Access rights: read/write/execute

• Stored in the I-node
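
A minimal sketch of the access check implied by the three protection domains. It assumes the familiar packing of the rights into three rwx triplets (owner, group, others) and deliberately ignores the superuser and supplementary groups; the struct and function names are hypothetical.

```c
#include <assert.h>

/* Access rights within one rwx triplet. */
#define PERM_R 4
#define PERM_W 2
#define PERM_X 1

/* Protection fields of a toy I-node. `mode` packs three rwx triplets:
 * owner, group, others (for example 0640 = rw- r-- ---). */
struct inode_prot {
    int uid;
    int gid;
    int mode;
};

/* Return 1 if a process running as (uid, gid) may perform `want`
 * (a combination of PERM_R, PERM_W, PERM_X) on the file, 0 otherwise.
 * Exactly one triplet applies: owner first, then group, then others. */
int may_access(const struct inode_prot *ip, int uid, int gid, int want)
{
    assert((want & ~7) == 0);
    int rights;
    if (uid == ip->uid)
        rights = (ip->mode >> 6) & 7;
    else if (gid == ip->gid)
        rights = (ip->mode >> 3) & 7;
    else
        rights = ip->mode & 7;
    return (rights & want) == want;
}
```

For a file owned by uid 100, gid 20 with mode 0640, the owner may read and write, a member of group 20 may only read, and everyone else is refused.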

Free Space Management

• Free I-nodes
– Marked as free on disk
– An array of 50 free I-nodes stored in the superblock

• Free file blocks
– Stored as a list of arrays of 50 free blocks each
– First array stored in the superblock

In-Kernel File System Data Structures

[Figure: the file system on disk and, inside the OS kernel, the buffer cache, the I-node cache, the per-OS Open File Table (offset in file, ptr to I-node), and the per-process Open File Tables reached from the PCBs]

Application:

fd = open(pathname, mode);   /* fd = index in Per-Proc OFT */
for (..) read(fd, buf, size);
close(fd);
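
The two levels of open file tables can be sketched as a toy model. Everything here is illustrative: the table sizes, the names `toy_open`, `toy_read` and `toy_close`, and the simplification that no data is copied and that system-wide entries are never shared (no reference counts). It only shows where the file offset lives and what a descriptor indexes.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SYS_OPEN 8
#define MAX_PROC_OPEN 4

struct inode_mem { int ino; long size; };   /* entry in the I-node cache */

struct sys_file {                           /* per-OS open file table entry */
    long offset;                            /* current offset in the file */
    struct inode_mem *ip;                   /* NULL = slot unused */
};

struct proc {                               /* per-process table, reached from the PCB */
    struct sys_file *ofile[MAX_PROC_OPEN];  /* NULL = descriptor unused */
};

struct sys_file sys_oft[MAX_SYS_OPEN];

/* Toy open(): claim a free system-wide entry and the lowest free fd. */
int toy_open(struct proc *p, struct inode_mem *ip)
{
    assert(p != NULL && ip != NULL);
    for (int fd = 0; fd < MAX_PROC_OPEN; fd++) {
        if (p->ofile[fd] != NULL)
            continue;
        for (int i = 0; i < MAX_SYS_OPEN; i++) {
            if (sys_oft[i].ip == NULL) {
                sys_oft[i].ip = ip;
                sys_oft[i].offset = 0;
                p->ofile[fd] = &sys_oft[i];
                return fd;
            }
        }
        return -1;      /* system-wide table full */
    }
    return -1;          /* per-process table full */
}

/* Toy read(): only the offset bookkeeping is modelled. */
long toy_read(struct proc *p, int fd, long size)
{
    if (fd < 0 || fd >= MAX_PROC_OPEN || p->ofile[fd] == NULL || size < 0)
        return -1;
    struct sys_file *f = p->ofile[fd];
    long left = f->ip->size - f->offset;
    long n = size < left ? size : left;
    f->offset += n;
    return n;
}

int toy_close(struct proc *p, int fd)
{
    if (fd < 0 || fd >= MAX_PROC_OPEN || p->ofile[fd] == NULL)
        return -1;
    p->ofile[fd]->ip = NULL;    /* free the system-wide entry */
    p->ofile[fd] = NULL;        /* free the descriptor */
    return 0;
}
```

Opening the same file twice yields two descriptors with independent offsets, because each open gets its own system-wide entry while both entries point at the same cached I-node.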

File System Consistency

• a file system uses the buffer cache for performance reasons
• two copies of a disk block (buffer cache, disk) -> consistency problem if the system crashes before all the modified blocks are written back to disk
• the problem is especially critical for blocks that contain control information (meta-data): directory blocks, i-nodes, free list
• Solution:
– write meta-data blocks through to disk (expensive), or control the order in which they are written back
– ordinary file data blocks are written back periodically (sync)
– utility programs check block and directory consistency after a crash

More on File System Consistency

• Example 1: create a new file
– Two updates: (1) allocate a free I-node; (2) create an entry in the directory
– (1) and (2) must be write-through (expensive), or (1) must be written back before (2)
– If (2) is written back first and a crash occurs before (1) is written back, the directory structure is inconsistent and cannot be recovered

• Example 2: write a new block to a file
– Two updates: (1) allocate a free block; (2) update the address table of the I-node
– (1) and (2) must be write-through, or (1) must be written back before (2)
– If (2) is written back first and a crash occurs before (1) is written back, the I-node structure is inconsistent and cannot be recovered
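
The ordering argument of Example 1 can be checked mechanically with a tiny model. The on-disk state is reduced to two flags, and a crash is assumed to be possible after each individual write-back; the names `disk_state`, `recoverable` and `order_is_safe` are hypothetical.

```c
#include <assert.h>

/* On-disk state for "create a new file", reduced to two facts. */
struct disk_state {
    int inode_allocated;    /* update (1) has reached the disk */
    int dirent_present;     /* update (2) has reached the disk */
};

/* The structure is recoverable unless a directory entry names an
 * I-node that was never allocated on disk. */
int recoverable(struct disk_state s)
{
    return !(s.dirent_present && !s.inode_allocated);
}

/* Apply the two write-backs in the given order, "crashing" after each
 * one; return 1 if every state a crash could expose is recoverable. */
int order_is_safe(int inode_first)
{
    assert(inode_first == 0 || inode_first == 1);
    struct disk_state s = { 0, 0 };
    for (int step = 0; step < 2; step++) {
        int write_inode = (step == 0) ? inode_first : !inode_first;
        if (write_inode)
            s.inode_allocated = 1;
        else
            s.dirent_present = 1;
        if (!recoverable(s))
            return 0;
    }
    return 1;
}
```

Writing the I-node first can leave an allocated I-node with no directory entry after a crash, which wastes an I-node but is the kind of damage a consistency-checking utility can detect and reclaim; writing the directory entry first exposes an entry that points at nothing valid.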

Log-Structured File System (LFS)

• as memory gets larger, the buffer cache grows -> a larger fraction of read requests can be satisfied from the buffer cache with no disk access
• conclusion: in the future most disk accesses will be writes
• but in most file systems writes are done in small chunks (meta-data, for instance), which makes the file system highly inefficient
• LFS idea (Berkeley): structure the entire disk as a log
• periodically, or when required, all the pending writes buffered in memory (data and meta-data together) are collected and written as a single contiguous segment at the end of the log

LFS segment

• contains i-nodes, directory blocks and data blocks, all mixed together
• each segment starts with a segment summary
• segment size: 512 KB - 1 MB
• two key issues:
– how to retrieve information from the log
– how to manage the free space on disk

File location in LFS

• the i-node contains the disk addresses of the file blocks, as in standard UNIX

• but there is no fixed location for the i-node

• an i-node map is used to maintain the current location of each i-node

• i-node map blocks can also be scattered but a fixed checkpoint region on the disk identifies the location of all the i-node map blocks

• i-node map blocks are cached in main memory most of the time, so disk accesses for them are rare
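
A minimal sketch of the i-node map idea, assuming a toy log in which every write simply takes the next address. The array `imap`, the counter `log_end` and the function names are hypothetical, and the checkpoint region is left out.

```c
#include <assert.h>

#define MAX_INODES 128
#define NOWHERE (-1L)

/* I-node map: imap[ino] = log address of the newest copy of that I-node. */
long imap[MAX_INODES];
long log_end = 0;           /* address where the next write will land */

void imap_init(void)
{
    for (int i = 0; i < MAX_INODES; i++)
        imap[i] = NOWHERE;
    log_end = 0;
}

/* Append a new copy of I-node `ino` to the log and repoint the map at it.
 * The old copy, if any, simply becomes dead space in its segment. */
long write_inode(int ino)
{
    assert(ino >= 0 && ino < MAX_INODES);
    long addr = log_end++;
    imap[ino] = addr;
    return addr;
}

/* Locating an I-node is a map lookup rather than a fixed-position computation. */
long find_inode(int ino)
{
    assert(ino >= 0 && ino < MAX_INODES);
    return imap[ino];
}
```

Rewriting I-node 87 moves it to a new log address, and only the map entry changes; readers always go through the map, so they never see the stale copy.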

Segment cleaning in LFS

• an LFS disk is divided into segments which are written sequentially
• live data must be copied out of a segment before the segment can be re-written
• cleaning: the process of copying live data out of a segment
• a separate cleaner thread moves along the log, removes old segments from the end and puts live data into memory for rewriting in the next segment
• as a result an LFS disk appears like a big circular buffer, with the writer thread adding new segments to the front and the cleaner thread removing old segments from the end
• book-keeping is not trivial: the i-node must be updated when blocks are moved to the current segment
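
The cleaning step can be sketched on a toy segment of four blocks. The `live` flag stands in for the check a real cleaner would make against the segment summary and the i-nodes, and the I-node update is only noted in a comment; the struct layout, `SEG_BLOCKS` and `clean_segment` are all assumptions made for illustration.

```c
#include <assert.h>

#define SEG_BLOCKS 4

struct block { int ino; int live; };    /* owning I-node, and whether still current */

struct segment {
    struct block blk[SEG_BLOCKS];
    int used;                           /* blocks written so far */
};

/* Copy the live blocks of `old` to the end of `cur`, then mark `old`
 * empty so it can be rewritten. Returns the number of blocks moved,
 * or -1 if `cur` lacks room (in which case nothing is changed). */
int clean_segment(struct segment *old, struct segment *cur)
{
    assert(old != cur);
    int live = 0;
    for (int i = 0; i < old->used; i++)
        live += (old->blk[i].live != 0);
    if (cur->used + live > SEG_BLOCKS)
        return -1;
    for (int i = 0; i < old->used; i++) {
        if (old->blk[i].live)
            cur->blk[cur->used++] = old->blk[i];
        /* A real cleaner would also update the owning I-node here,
         * since the block now lives at a new disk address. */
    }
    old->used = 0;
    return live;
}
```

Dead blocks are simply dropped, so the space they occupied is recovered when the old segment is reused.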