Chapter 20 Distributed File Systems Copyright © 2008.

Post on 01-Jan-2016

Chapter 20

Distributed File Systems

Copyright © 2008

Operating Systems, by Dhananjay Dhamdhere. Copyright © 2008

Introduction

• Design Issues in Distributed File Systems

• Transparency

• Semantics of File Sharing

• Fault Tolerance

• DFS Performance

• Case Studies

Design Issues in Distributed File Systems

Overview of DFS Operation

• Remote file processing model

• File server agent and client agent are analogous to RPC’s stub processes

• For efficiency, the client agent and the cache manager are typically rolled into a single unit

Transparency

• In a conventional file system, a user identifies a file through a path name

– User is aware that file belongs in a specific directory, but is not aware of its location in the system

• Location info field of the file’s directory entry indicates the file’s location on disk

• Location transparency can be provided in a DFS through a similar mechanism

– Location info: (node id, location)

• Location independence requires the information in the location info field to vary dynamically, so that a file can be moved without changing its path name
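
The location info mechanism above can be sketched in a few lines. This is only an illustration: the `directory` table, the `resolve` and `migrate` helpers, and the node names are hypothetical and do not come from the slides.

```python
# Hypothetical directory: path name -> (node id, location on that node).
directory = {
    "/home/alice/report.txt": ("node-3", 4096),
}

def resolve(path):
    """Return the (node id, location) pair held in the location info field."""
    return directory[path]

def migrate(path, new_node, new_location):
    """Move a file: only the location info changes, the path name does not."""
    directory[path] = (new_node, new_location)

before = resolve("/home/alice/report.txt")
migrate("/home/alice/report.txt", "node-7", 512)
after = resolve("/home/alice/report.txt")
```

The user keeps using the same path name before and after the move, which is the property location independence asks for.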

Semantics of File Sharing

• Semantics determine the manner in which the effects of file manipulations performed by concurrent users of a file become visible to one another

Semantics of File Sharing (continued)

• A session consists of some clients of a file that are located in the same node of a system

• Problem with session semantics: poor portability

• Session semantics are easy to implement in a DFS employing file caching

– File changes are not visible to clients in other nodes
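
A toy model of session semantics, as a sketch only. The `SessionFile` class is hypothetical, and writing the image back to the server when a session closes is an assumption made for the example, not something the slides specify.

```python
class SessionFile:
    """Toy model: each node works on its own cached image of the file."""

    def __init__(self, data):
        self.server_copy = data
        self.sessions = {}  # node id -> image shared by that node's clients

    def open(self, node):
        # Clients on the same node join one session and share one image.
        self.sessions.setdefault(node, self.server_copy)

    def write(self, node, data):
        # The change is visible only within this node's session.
        self.sessions[node] = data

    def read(self, node):
        return self.sessions[node]

    def close(self, node):
        # Assumption: the image is written back when the session ends.
        self.server_copy = self.sessions.pop(node)
```

With sessions open on nodes A and B, a write on A is seen by readers on A but not by readers on B, even after A closes its session.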

Fault Tolerance

• File system reliability has several facets:

– A file must be robust, recoverable, and available

• Robustness is achieved using techniques for reliable storage of data

• Robustness and recoverability depend on how files are stored and backed up, respectively

• Availability depends on how files are opened and accessed

• Only defense against client node crashes is use of transaction semantics in file server

Fault Tolerance (continued)

Availability

• File is available if a copy can be opened and accessed by client

– Ability to open file depends on path name resolution

– Access requires functional client and server nodes

• An anomalous situation may arise when path names span many nodes

– If a node in path crashes, file operation will fail even if the node that contains the file has not crashed

• Solution: cached directories

• File replication is transparent to clients

– Updating techniques: 2PC, use of primary copies

Client and Server Node Failures

• File server can maintain file control blocks (FCBs) and the open files table (OFT) in memory

– Stateful design

– Good performance

– Problems in event of client and server crashes

• Solution: client and file server share a virtual circuit

– Virtual circuit “owns” the file processing actions and resources like file server metadata

– Actions and resources become orphans after crash

  • Actions are rolled back and metadata destroyed

– Client–server protocol implementing transaction semantics may be used to ensure this

Stateless File Servers

• File server does not maintain state information about file processing activity

• Client must:

– Keep state information about file processing activity

– Provide all relevant information in a file system call

  • read (“alpha”, <record/byte id>, <io_area address>);

• Many actions traditionally performed only at file open time are repeated at every file operation

• If file server crashes, time-outs and retransmissions occur in client

• Cannot employ file caching
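
A minimal sketch of the stateless request style described above. The `FILES` store, `server_read`, and the `Client` class are hypothetical names, and a byte offset stands in for the record/byte id.

```python
FILES = {"alpha": b"0123456789abcdef"}  # hypothetical server-side store

def server_read(name, offset, length):
    """Stateless read: every request carries the name, position, and size.

    The server keeps no open-file table, so the name lookup is repeated
    on every call.
    """
    data = FILES[name]
    return data[offset:offset + length]

class Client:
    """The client, not the server, remembers the current file position."""

    def __init__(self, name):
        self.name = name
        self.offset = 0

    def read(self, length):
        chunk = server_read(self.name, self.offset, length)
        self.offset += len(chunk)
        return chunk
```

Because a request is self-contained, a client that times out can retransmit the same `server_read` call and receive the same answer, whether or not the server restarted in between.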

DFS Performance

• DFS design is scalable if DFS performance doesn’t degrade with increase in size of distributed system

Efficient File Access

• Inherent efficiency of file access depends on how the operation of a file server is structured

• Two server structures that provide efficient file access:

– Multithreaded file server

– Hint-based file server

  • State information is used as a hint

  • Server operation is stateless if hint is not available
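
One way to picture a hint-based server, as a sketch under assumptions: the `hints` table, the `locate` helper, and the `crash` function are hypothetical, and a hint here is simply the result of an earlier lookup.

```python
FILES = {"alpha": b"hello world"}  # hypothetical server-side store
hints = {}      # (client id, file name) -> data located on an earlier request
lookups = []    # records every full lookup, for illustration only

def locate(name):
    """Full lookup, as a stateless server would perform on every request."""
    lookups.append(name)
    return FILES[name]

def read(client, name, offset, length):
    data = hints.get((client, name))
    if data is None:
        # No hint available: fall back to stateless operation.
        data = locate(name)
        hints[(client, name)] = data
    return data[offset:offset + length]

def crash():
    """Losing the hints affects performance only, never correctness."""
    hints.clear()
```

Requests still carry everything needed to serve them, so a server restart merely costs one extra lookup per file.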

File Caching

• File cache and copy of file on disk in server node form a memory hierarchy

– Operation of the file cache and its benefits are analogous to those of a CPU cache

• Chunks of file data are loaded from the file server into the file cache

• Studies of file size distributions indicate small average file size

– Whole-file caching is feasible

• File server may use a separate attributes cache

File Caching (continued)

• Key issues:

– Location of the file cache: memory or disk

– File updating policy: write-through or delayed write

– Cache validation policy: client- or server- initiated

– Chunk size: large or small? Fixed or variable?
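
The two file updating policies can be contrasted with a small sketch. The `CachedFile` class is hypothetical, and a plain dict stands in for the file server.

```python
class CachedFile:
    """Client-side cache over a dict that stands in for the file server."""

    def __init__(self, server, write_through):
        self.server = server        # chunk number -> bytes
        self.cache = {}
        self.dirty = set()          # chunks modified but not yet sent
        self.write_through = write_through

    def read(self, chunk):
        if chunk not in self.cache:
            self.cache[chunk] = self.server[chunk]
        return self.cache[chunk]

    def write(self, chunk, data):
        self.cache[chunk] = data
        if self.write_through:
            self.server[chunk] = data   # server copy updated at once
        else:
            self.dirty.add(chunk)       # delayed write: update later

    def flush(self):
        for chunk in sorted(self.dirty):
            self.server[chunk] = self.cache[chunk]
        self.dirty.clear()
```

Write-through keeps the server copy current at the cost of one server update per write; delayed write batches updates but leaves a window in which a client crash loses them.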

Scalability

• DFS scalability achieved through techniques that localize most data traffic generated by file processing activities within clusters

– Clusters typically represent subnets like high-speed LANs

– An increase in the number of clusters does not lead to degradation of performance

  • It does not add much network traffic

Case Studies

• Sun Network File System

• Andrew and Coda File Systems

• GPFS

• Windows

Sun Network File System

• VFS implements mount protocol and creates a system-wide unique vnode for each file

• NFS layer interacts with remote node containing file through NFS protocol

Sun Network File System (continued)

• Several techniques to improve performance

– A directory names cache is used in each client node

– A file attributes cache caches inode information

  • Cached attributes are discarded after 3 seconds for files and after 30 seconds for directories

– File blocks cache is the conventional file cache

  • Server uses large (8 Kbytes) data blocks

  • Cache validation performed through timestamps associated with each file and each cache block

• File server is stateless

• Neither Unix semantics nor session semantics
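
The expiry rule for the file attributes cache can be sketched as follows. Only the 3-second and 30-second lifetimes come from the slide; the `AttributeCache` class and its `fetch` callable are hypothetical and are not NFS client code. The current time is passed in explicitly to keep the example deterministic.

```python
FILE_TTL = 3.0    # seconds: lifetime of cached file attributes
DIR_TTL = 30.0    # seconds: lifetime of cached directory attributes

class AttributeCache:
    def __init__(self, fetch):
        self.fetch = fetch      # hypothetical callable: path -> attributes
        self.entries = {}       # path -> (attributes, time cached)

    def get(self, path, is_dir, now):
        ttl = DIR_TTL if is_dir else FILE_TTL
        entry = self.entries.get(path)
        if entry is not None and now - entry[1] < ttl:
            return entry[0]     # still fresh: no server traffic
        attrs = self.fetch(path)  # expired or missing: ask the server
        self.entries[path] = (attrs, now)
        return attrs
```

Time-based expiry means a client can observe stale attributes for up to the lifetime of an entry, which is one reason the design provides neither Unix semantics nor session semantics.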

Andrew and Coda File Systems

• Targeted at gigantic distributed systems

• All clients have an identical shared name space

– Is location transparent in nature

– Implemented by dedicated servers (Vice)

• Clusters localize file processing activities

– Traffic within cluster reduced by caching entire file on local disk

• A volume typically contains files of a single user

• 64 KB chunks (size adapted on a per-client basis)

• User process called Venus performs open/close

Andrew and Coda File Systems (continued)

• Server-initiated cache validation using callbacks

• Path name resolution performed on a component-by-component basis

– Venus maintains a mapping cache

• File servers are multithreaded

• Client–server communication uses RPCs

• Two features to achieve high availability:

– Replication and disconnected operation

  • Read one, write all policy

  • Supports hoarding of files
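
Server-initiated cache validation with callbacks can be pictured with the toy model below. The `Server` and `Client` classes are hypothetical and greatly simplified; they are not the Vice or Venus interfaces, and whole-file strings stand in for cached files.

```python
class Server:
    def __init__(self):
        self.files = {}
        self.callbacks = {}   # file name -> clients holding a cached copy

    def fetch(self, client, name):
        # Remember that this client caches the file (a callback promise).
        self.callbacks.setdefault(name, set()).add(client)
        return self.files[name]

    def store(self, writer, name, data):
        self.files[name] = data
        # Notify every other caching client that its copy is now invalid.
        for client in self.callbacks.pop(name, set()):
            if client is not writer:
                client.invalidate(name)
        self.callbacks[name] = {writer}

class Client:
    def __init__(self, server):
        self.server = server
        self.cache = {}

    def open(self, name):
        if name not in self.cache:
            self.cache[name] = self.server.fetch(self, name)
        return self.cache[name]

    def close(self, name, data):
        self.cache[name] = data
        self.server.store(self, name, data)

    def invalidate(self, name):
        self.cache.pop(name, None)
```

A client that holds a valid callback can open its cached copy without contacting the server; only an update by another client triggers network traffic.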

GPFS

• General Parallel File System (GPFS): high-performance shared-disk file system

– For large computing clusters operating under Linux

• Uses data striping across all disks in cluster

– A large-size block (strip) used to minimize seek overhead during a file read/write

  • A smaller subblock is used for small files

– Locking used to maintain consistency of file data

  • Lock granularity is as coarse as possible, but as fine as necessary

  • Centralized lock manager and a few distributed lock managers
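
Data striping can be illustrated by mapping a byte offset in a file to a disk. The round-robin placement, the `stripe_location` helper, and the 256 KB block size below are assumptions chosen for the example; the slides do not state how GPFS assigns blocks to disks.

```python
def stripe_location(offset, block_size, num_disks):
    """Map a byte offset to (disk index, block number on that disk,
    offset within the block), assuming round-robin block placement."""
    block = offset // block_size
    return block % num_disks, block // num_disks, offset % block_size

BLOCK = 256 * 1024  # hypothetical block size in bytes
first = stripe_location(0, BLOCK, 4)
second = stripe_location(BLOCK, BLOCK, 4)
wrapped = stripe_location(4 * BLOCK + 10, BLOCK, 4)
```

Consecutive blocks land on different disks, so a large sequential read or write can keep several disks busy at once.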

GPFS (continued)

• Notion of lock tokens to reduce latency and overhead of locking

• Race conditions may arise over metadata of a file

– Solution: one of the nodes is designated as the metanode for the file; it performs file updates

• Central allocation manager partitions free space map and gives one partition to each node

• Each node writes a separate journal for recovery

• If network is partitioned, only nodes in the majority partition can perform file processing at any time
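
The majority-partition rule can be expressed as a simple check. This is a sketch that counts every node equally; the `may_process_files` helper is hypothetical, and the slides do not say how GPFS itself counts membership.

```python
def may_process_files(partition, all_nodes):
    """A partition may operate only if it holds a strict majority of nodes."""
    return len(partition) > len(all_nodes) / 2

nodes = {1, 2, 3, 4, 5}
majority_ok = may_process_files({1, 2, 3}, nodes)
minority_ok = may_process_files({4, 5}, nodes)
```

Since two disjoint partitions cannot both hold a strict majority, at most one side of a network partition continues file processing; in an even split, neither side does.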

Windows

• Windows Server 2003 provides two features for data replication and data distribution:

– Remote differential compression (RDC)

– DFS namespaces

• Replication organized using notion of a replication group

• DFS namespace is created by a system administrator

• Other key concepts: referrals and hot standbys

Summary

• Transparency concerns association between path name of a file and location of the file

• File sharing semantics may differ between DFSs:

– Unix semantics

– Session semantics

– Transaction semantics (atomic transactions)

• Stateless server design provides high availability

– Notion of a hint used to improve performance

• DFS uses file caching to improve performance

– Cache coherence techniques are needed