Distributed File Systems
Thomas Hollstegge
Agenda
Motivation
Distributed file system basics
Case studies
Summary and outlook

Motivation
ICT allows for distributed work
  Users work separated in time and space
  They need access to common data collections
  Provided by distributed file systems (DFS)
Distributed work leads to new business models
  24/7 customer service
  Analysis of worldwide financial information (stock prices etc.)
  Economic relevance!
Different DFSs were developed in the past
  Structured discussion necessary

Basics – Storage fundamentals
"Storage": Fundamental abstraction in computing
  Data encapsulated in objects
  Explicit creation and deletion
  Unaffected by system failures
"File system": Refinement of abstraction
Three different usage dimensions
  Single user vs. multiple users
  Single-thread vs. multi-thread OS
  Single site vs. multiple sites

Basics – Requirements for DFS (1/2)
Transparency
  User must be unaware of internal separation of components
  Access, performance, location, scaling transparency
Availability
  System should be fault tolerant
Concurrent updates
  Simultaneous access to a single resource
Replication
  File may be present at different locations
  Shares load between servers, enhances fault tolerance

Basics – Requirements for DFS (2/2)
Hardware and software heterogeneity
  Support for various platforms
Consistency
  Data integrity has to be maintained
Security
  Access control, user authentication, confidentiality
Efficiency
  Performance should be comparable to local file systems

Basics – Abstract file service model (1/3)
[Figure] Source: [CDK01], p. 318

Basics – Abstract file service model (2/3)
Service            Operations
Directory service  Lookup(Dir, Name) → FileId – throws NotFound
                   AddName(Dir, Name, File) – throws NameDuplicate
                   UnName(Dir, Name) – throws NotFound
                   GetNames(Dir, Pattern) → NameSeq
Flat file service  Read(FileId, i, n) → Data – throws BadPosition
                   Write(FileId, i, Data) – throws BadPosition
                   Create() → FileId
                   Delete(FileId)
                   GetAttributes(FileId) → Attr
                   SetAttributes(FileId, Attr)
Source: [CDK01], p. 319-322
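
The two services listed above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not a reference implementation of the model: the class names, the snake_case method names, the counted integer file ids, the string directory ids and the choice of regular expressions for Pattern are all assumptions made for the example.

```python
import re


class NotFound(Exception):
    """Raised when a name is not present in a directory."""


class NameDuplicate(Exception):
    """Raised when a name already exists in a directory."""


class BadPosition(Exception):
    """Raised when a read or write starts outside the file."""


class FlatFileService:
    """Works on file contents and attributes, addressed by file id only."""

    def __init__(self):
        self._data = {}    # file id -> bytearray with the file contents
        self._attrs = {}   # file id -> attribute dict
        self._next_id = 1  # assumption: ids are simply counted upwards

    def create(self):
        file_id = self._next_id
        self._next_id += 1
        self._data[file_id] = bytearray()
        self._attrs[file_id] = {}
        return file_id

    def delete(self, file_id):
        del self._data[file_id]
        del self._attrs[file_id]

    def read(self, file_id, i, n):
        data = self._data[file_id]
        if not 0 <= i <= len(data):
            raise BadPosition(i)
        return bytes(data[i:i + n])

    def write(self, file_id, i, data):
        buf = self._data[file_id]
        if not 0 <= i <= len(buf):
            raise BadPosition(i)
        buf[i:i + len(data)] = data  # overwrites, extends the file if needed

    def get_attributes(self, file_id):
        return dict(self._attrs[file_id])

    def set_attributes(self, file_id, attrs):
        self._attrs[file_id] = dict(attrs)


class DirectoryService:
    """Maps text names to file ids; knows nothing about file contents."""

    def __init__(self):
        self._dirs = {}  # directory id -> {name: file id}

    def lookup(self, dir_id, name):
        try:
            return self._dirs[dir_id][name]
        except KeyError:
            raise NotFound(name) from None

    def add_name(self, dir_id, name, file_id):
        entries = self._dirs.setdefault(dir_id, {})
        if name in entries:
            raise NameDuplicate(name)
        entries[name] = file_id

    def un_name(self, dir_id, name):
        try:
            del self._dirs[dir_id][name]
        except KeyError:
            raise NotFound(name) from None

    def get_names(self, dir_id, pattern):
        names = self._dirs.get(dir_id, {})
        return sorted(n for n in names if re.fullmatch(pattern, n))


# Usage: a client combines both services to read a file by name.
files = FlatFileService()
directory = DirectoryService()
fid = files.create()
files.write(fid, 0, b"hello")
directory.add_name("root", "greeting.txt", fid)
print(files.read(directory.lookup("root", "greeting.txt"), 0, 5))
```

The point of the split is visible in the usage lines: the directory service only translates a name into a file id, and every content operation goes to the flat file service with that id.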

Basics – Abstract file service model (3/3)
Access control
  Server-side user authorisation
  Access rights checked upon directory lookup or on every request
Hierarchical file structure
  Realised within the client module
  Directories may store references to other directories
File groups
  Set of files that can be moved between servers
  Similar to a file system
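
How a hierarchy can be realised within the client module is easiest to see in code: the client walks a path one component at a time and issues one directory lookup per component. The sketch below is illustrative only. The `DIRECTORIES` table stands in for a remote directory service, and the ids, the `ROOT_DIR` constant and the function names are hypothetical.

```python
class NotFound(Exception):
    """Raised when a path component does not exist."""


# Hypothetical stand-in for the directory service: directory id -> entries.
# An entry may point to a file or to another directory.
DIRECTORIES = {
    1: {"home": 2},
    2: {"alice": 3},
    3: {"notes.txt": 42},
}
ROOT_DIR = 1  # assumption: the client knows the id of the root directory


def lookup(dir_id, name):
    """One Lookup(Dir, Name) call against the directory service."""
    try:
        return DIRECTORIES[dir_id][name]
    except KeyError:
        raise NotFound(name) from None


def resolve(path):
    """Client-side path resolution: one lookup per path component."""
    current = ROOT_DIR
    for component in path.strip("/").split("/"):
        if component:  # skip empty components such as in "/" or "a//b"
            current = lookup(current, component)
    return current


print(resolve("/home/alice/notes.txt"))
```

The service itself only ever sees flat (directory, name) pairs; the notion of a path exists in the client alone.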

Agenda
Motivation
Distributed file system basics
Case studies
  Network File System (NFS)
  Andrew File System (AFS)
  Lustre
Summary and outlook

NFS – History
198?: NFSv1
  Developed at Sun Microsystems, unreleased
1984: NFSv2
  Developed at Sun Microsystems
  First released version, widely accepted
  Supports files < 4 GB, synchronous writes only
1992: NFSv3
  Developed by a group of researchers
  Overcomes these drawbacks (larger files, asynchronous writes)
2002: NFSv4
  Enhanced security, user authentication
  Better Windows support

NFS – General description (1/2)
[Figure] Source: [CDK01], p. 324

NFS – General description (2/2)
Stateless protocol
  Server does not maintain client states
  Client requests are blocking (exception: asynchronous write)
User authentication
  Default: UNIX user ID (insecure!)
  Optional: Kerberos, DES
Caching
  Read cache: Yes
  Write cache: No!
Server file system
  Not restricted, should support unique file IDs
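
What "stateless" means in practice can be shown with a small sketch: every request carries everything the server needs (file handle, offset, length), and the read position lives in the client. The code is a simplified illustration and not the NFS wire protocol. The class names, the integer handle and the dictionary used as storage are assumptions.

```python
class StatelessServer:
    """Keeps no per-client state: no open-file table, no read positions."""

    def __init__(self, storage):
        # storage: file handle -> file contents (stands in for the disk)
        self._storage = storage

    def read(self, handle, offset, count):
        # Everything needed to serve the request is part of the request.
        return self._storage[handle][offset:offset + count]


class Client:
    """The client, not the server, remembers how far it has read."""

    def __init__(self, server, handle):
        self.server = server
        self.handle = handle
        self.offset = 0

    def read(self, count):
        data = self.server.read(self.handle, self.offset, count)
        self.offset += len(data)
        return data


disk = {7: b"abcdefgh"}
client = Client(StatelessServer(disk), handle=7)
first = client.read(3)

# Simulate a server crash and restart: a fresh server object over the
# same disk. The client simply continues, nothing has to be recovered.
client.server = StatelessServer(disk)
second = client.read(3)
print(first, second)
```

Because the server holds nothing between requests, a restart is invisible to the client apart from the delay.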

NFS – Abstract model (1/2)
[Figure: NFS vs. the abstract file service model]

NFS – Abstract model (2/2)
Operations
  Similar to UNIX file system calls
  All abstract operations can be represented
Access control
  Checked upon every request
Hierarchical file system
  Realised within the client module
File groups
  Not supported, only manual movement of files

NFS – Requirements
Transparency
Availability
Concurrent updates
Replication
Heterogeneity
Consistency
Security
Efficiency

AFS – History
1982: Initial version
  Developed at Carnegie Mellon University (CMU), Pittsburgh
  Part of the Andrew distributed computing environment
  Provides support for teaching and research
1989: Spin-off
  Development outsourced to Transarc Inc.
1994: Transarc acquired by IBM
  All rights owned by IBM
2000: Open source
  Code was released under an open source license
  Since then: continuous development

AFS – General description (1/3)

AFS – Name spaces

AFS – General description (2/3)
Cached? No!

AFS – General description (3/3)
Caching
  "Callback promises"
  Workstations are notified when cached files change
Stateful protocol
  Server maintains client states
  Problematic when client fails
User authentication
  Kerberos
Server file system
  Not restricted, should support unique file IDs
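
The callback mechanism can be sketched as follows. The server remembers which clients hold a promise for a file and notifies them when another client stores a new version; a client trusts its cached copy as long as the promise has not been broken. This is a simplified in-process illustration. The class and method names, the `fetches` counter and whole-file transfer on open and close are assumptions of the example.

```python
class Server:
    """Stateful: remembers which clients hold a callback promise per file."""

    def __init__(self):
        self.files = {}      # file name -> contents
        self.callbacks = {}  # file name -> set of clients holding a promise
        self.fetches = 0     # counts transfers, only for illustration

    def fetch(self, client, name):
        self.fetches += 1
        self.callbacks.setdefault(name, set()).add(client)
        return self.files[name]

    def store(self, client, name, data):
        self.files[name] = data
        # Break the promise of every other client that caches this file.
        for other in self.callbacks.get(name, set()) - {client}:
            other.break_callback(name)
        self.callbacks[name] = {client}


class Client:
    def __init__(self, server):
        self.server = server
        self.cache = {}     # file name -> cached contents
        self.valid = set()  # names whose callback promise is still valid

    def open(self, name):
        if name not in self.valid:  # no copy yet, or promise was broken
            self.cache[name] = self.server.fetch(self, name)
            self.valid.add(name)
        return self.cache[name]

    def close(self, name, data):
        self.cache[name] = data
        self.valid.add(name)
        self.server.store(self, name, data)

    def break_callback(self, name):
        self.valid.discard(name)


server = Server()
server.files["report"] = b"v1"
a, b = Client(server), Client(server)
a.open("report")
a.open("report")            # served from a's cache, no second transfer
b.open("report")
b.close("report", b"v2")    # server breaks a's promise
print(a.open("report"))     # a fetches the new version
```

The `callbacks` table is exactly the client state a stateless server would not have, which is why a failing client or server is harder to handle here.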

AFS – Abstract model (1/2)
[Figure: AFS vs. the abstract file service model]

AFS – Abstract model (2/2)
Operations
  Differ from abstract model
  Some operations combined, callback promises added
Access control
  Rights checked upon every request
  Extended access lists per directory
Hierarchical file system
  Realised within the client module
File groups
  File identifier contains link to file group
  Location database maps file groups to servers
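
The file group mechanism boils down to one level of indirection, which a short sketch makes concrete. The field names, server names and helper functions below are hypothetical; only the idea (the identifier names the group, a database maps groups to servers) comes from the slide.

```python
from typing import NamedTuple


class FileId(NamedTuple):
    group: int   # the file group this file belongs to
    number: int  # the file within that group


# Hypothetical location database: file group -> server currently holding it.
location_db = {1: "server-a", 2: "server-b"}


def server_for(file_id):
    """Find the server for a file via its group, never via the file itself."""
    return location_db[file_id.group]


def move_group(group, new_server):
    """Moving a whole group only changes one database entry."""
    location_db[group] = new_server


thesis = FileId(group=1, number=17)
print(server_for(thesis))
move_group(1, "server-c")
print(server_for(thesis))  # same file id, new location
```

File identifiers stay valid when a group moves, because they never mention a server.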

AFS – Requirements
Transparency
Availability
Concurrent updates
Replication
Heterogeneity
Consistency
Security
Efficiency

Lustre (1/3)
"Lustre": Linux Cluster
  File system especially suited for clusters
  Easily handles thousands of clients and servers
Uses object-based storage
  Objects offer methods for data access, attributes, policies
  High-level abstraction
  Lower performance than block-based storage
Three system roles
  Object Storage Targets (OST)
  Metadata Servers (MDS)
  Clients

Lustre (2/3)
[Figure: Lustre architecture with Object Storage Targets (OST), Metadata Servers (MDS) and Clients; connections labelled "File operations, locking", "Recovery, file status" and "Directory metadata"]
Source: [BS02], p. 51

Lustre (3/3)
Lustre partly follows abstract model
  Separation of directory and flat file service
  File attributes managed by OSTs
Hierarchical file systems
  Realised within the client module
High availability
  Heavy use of redundancy
  Caching of metadata
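
The separation of directory and flat file service in Lustre can be sketched with the three roles as plain classes. This is a strongly simplified illustration and not Lustre's actual interfaces: one object per file, round-robin placement, no locking, recovery or redundancy. All class and method names are assumptions.

```python
class ObjectStorageTarget:
    """Flat file role: stores file contents as objects, by object id."""

    def __init__(self):
        self._objects = {}

    def write(self, object_id, data):
        self._objects[object_id] = bytes(data)

    def read(self, object_id):
        return self._objects[object_id]


class MetadataServer:
    """Directory role: knows which object on which OST backs a path."""

    def __init__(self, ost_count):
        self._ost_count = ost_count
        self._namespace = {}  # path -> (OST index, object id)
        self._next_object_id = 0

    def create(self, path):
        object_id = self._next_object_id
        self._next_object_id += 1
        # Assumption: simple round-robin placement across the OSTs.
        location = (object_id % self._ost_count, object_id)
        self._namespace[path] = location
        return location

    def lookup(self, path):
        return self._namespace[path]


class Client:
    """Asks the MDS where a file lives, then talks to the OST directly."""

    def __init__(self, mds, osts):
        self.mds = mds
        self.osts = osts

    def write_file(self, path, data):
        ost_index, object_id = self.mds.create(path)
        self.osts[ost_index].write(object_id, data)

    def read_file(self, path):
        ost_index, object_id = self.mds.lookup(path)
        return self.osts[ost_index].read(object_id)


osts = [ObjectStorageTarget(), ObjectStorageTarget()]
client = Client(MetadataServer(ost_count=len(osts)), osts)
client.write_file("/data/run1", b"results")
print(client.read_file("/data/run1"))
```

File data never passes through the metadata server in this sketch, which is the property that lets data throughput grow with the number of storage targets.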

Summary and outlook
Abstract file service model
  Developed to meet many requirements for DFSs
Different implementations
  NFS: Stateless, concurrency control
  AFS: Stateful, heavy use of caching, better performance
Other approach: Lustre
  Modularised approach, especially suited for clusters
Future developments
  Large-scale environments
  Cloud computing
  Issues: Data security, privacy

Any questions?
Thank you for your attention!

Literature
[BS02] Peter J. Braam, Philip Schwan: Lustre: The intergalactic file system, Proceedings of the 2002 Ottawa Linux Symposium, pp. 50–54, 2002.
[CDK01] George Coulouris, Jean Dollimore, Tim Kindberg: Distributed Systems, Concepts and Design, 3rd ed., Addison-Wesley, 2001.
[Kir06] Olaf Kirch: Why NFS Sucks, Proceedings of the Linux Symposium, 2nd. ed., pp. 51–63, 2006.
[MSC+86] James H. Morris, Mahadev Satyanarayanan, Michael H. Conner, John H. Howard, David S. H. Rosenthal, F. Donelson Smith: Andrew: A distributed personal computing environment, Communications of the ACM, 29(3), pp. 184–201, Association for Computing Machinery, 1986.
[PJS+94] Brian Pawlowski, Chet Juszczak, Peter Staubach, Carl Smith, Diane Lebel, David Hitz: NFS Version 3: Design and Implementation, Proceedings of the Summer 1994 USENIX Technical Conference, pp. 137–151, 1994.
[Sat89] Mahadev Satyanarayanan: Distributed file systems, Distributed systems, S. Mullender (ed.), pp. 149–188, ACM Press, 1989.
[Sch03] Philip Schwan: Lustre: Building a file system for 1000-node clusters, Proceedings of the 2003 Ottawa Linux Symposium, pp. 380–386, 2003.
[Tan03] Andrew S. Tanenbaum: Moderne Betriebssysteme, 2nd ed., Prentice Hall, 2003.
[Tv07] Andrew S. Tanenbaum, Maarten van Steen: Distributed Systems: Principles and Paradigms, 2nd ed., Prentice Hall, 2007.