Don’t Give Up on Distributed File Systems
Jeremy Stribling, Emil Sit, Frans Kaashoek, Jinyang Li, and Robert Morris
MIT CSAIL and NYU
Reinventing the Storage Wheel
• New apps tend to use new storage layers
• Examples: BLAST
• Can we invent this layer once?
What About a File System?
• A FS enables quick prototyping for apps
  – A familiar interface
  – Language-independent usage model
  – Hierarchical namespace useful for apps
  – Write distributed apps in shell scripts
if [ -f "/fs/cwc/$URL" ]; then
    if notexpired "/fs/cwc/$URL"; then
        cat "/fs/cwc/$URL"
        exit
    fi
fi
wget "$URL" -O - | tee "/fs/cwc/$URL"
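The script calls notexpired, which the slides leave undefined. A minimal sketch of one possible version, assuming a cached object counts as fresh for a fixed number of minutes after its last modification. The CWC_TTL_MIN variable and its 60-minute default are hypothetical, not from the talk, and a real web cache would more likely honor HTTP expiry headers:

```shell
# Hypothetical helper, not part of WheelFS: succeed if the file named by $1
# was modified within the last CWC_TTL_MIN minutes (assumed default: 60).
# Uses find's -mmin test, which GNU and BSD find provide (not strict POSIX).
notexpired() {
    # -prune keeps find from descending if $1 is a directory.
    # find prints the path only when its mtime is newer than the cutoff,
    # so a non-empty result means "still fresh".
    [ -n "$(find "$1" -prune -mmin "-${CWC_TTL_MIN:-60}" 2>/dev/null)" ]
}
```

A missing file produces no output from find, so it is reported as expired and the script falls through to the fetch.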
Why Won’t That Work Today?
• Needs of distributed apps:
  – Control over consistency and delays
  – Efficient data sharing between peers
• Current systems focus on FS transparency
  – Hide faults with long timeouts
  – Centralized file servers
Example: Cooperative Web Cache
• Would rather fail and refetch than wait
• Perfect consistency isn’t crucial
• Avoid hotspots
if [ -f "/fs/cwc/$URL" ]; then
    if notexpired "/fs/cwc/$URL"; then
        cat "/fs/cwc/$URL"
        exit
    fi
fi
wget "$URL" -O - | tee "/fs/cwc/$URL"
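On a conventional file system the "fail and refetch rather than wait" preference has to be bolted on by the application. A rough sketch of that workaround, assuming timeout(1) from GNU coreutils is available; the read_or_refetch name and its argument order are made up for illustration:

```shell
# Hypothetical helper: read a file, but give up after a time limit and run
# a fallback command instead.
#   $1 = path to read, $2 = limit in seconds, remaining args = fallback command
read_or_refetch() (
    path=$1
    limit=$2
    shift 2
    tmp=$(mktemp) || exit 1
    # Buffer the read so a partial, timed-out read is never emitted.
    # timeout exits non-zero (124) when the limit is reached.
    if timeout "$limit" cat "$path" > "$tmp" 2>/dev/null; then
        cat "$tmp"
        status=$?
    else
        "$@"
        status=$?
    fi
    rm -f "$tmp"
    exit "$status"
)
```

For example, `read_or_refetch "/fs/cwc/$URL" 1 wget "$URL" -O -` serves the cached copy if it arrives within a second and otherwise goes back to the origin server. Every app that wants this behavior has to carry its own wrapper, which is the gap the proposal below targets.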
Our Proposal: WheelFS
• A distributed wide-area FS to simplify apps
• Main contributions:
1) Give apps control with semantic cues
2) Provide good performance according to Read Globally, Write Locally
Basic Design: Reading and Writing
[Figure: six nodes with IDs 076, 150, 257, 402, 554, and 653. A lookup for File 135 is shown, followed by updates that produce File 135 v2 and v3, and a cached copy of File 135. Creating foo/bar allocates File 550 (bar) and adds the entry bar = 550 to Dir 209 (foo).]
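The slides do not state how a file ID is mapped to a node. One common rule in systems with numeric node IDs is successor placement, as in consistent hashing: an object belongs to the first node whose ID is greater than or equal to the object's ID, wrapping around at the end. The sketch below assumes that rule purely for illustration; the function names are hypothetical and only the IDs come from the figure:

```shell
# Node IDs from the figure, sorted in ascending order.
NODES="076 150 257 402 554 653"

# Strip leading zeros so IDs compare as plain decimal numbers.
to_num() {
    n=${1#"${1%%[!0]*}"}
    echo "${n:-0}"
}

# Assumed placement rule (not confirmed by the slides): the first node whose
# ID is >= the object ID is responsible for it.
responsible_node() (
    id=$(to_num "$1")
    for node in $NODES; do
        if [ "$(to_num "$node")" -ge "$id" ]; then
            echo "$node"
            exit 0
        fi
    done
    # Wrap around the ring: IDs past the last node map to the first node.
    echo "${NODES%% *}"
)
```

Under this assumption, `responsible_node 135` prints 150 and `responsible_node 550` prints 554.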
Explicit Semantic Cues
• Allow direct control over system behavior
• Meta-data that attach to files, dirs, or refs
• Apply recursively down dir tree
• Possible impl: intra-path component
  – /wfs/cwc/.cue/foo/bar
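To make the intra-path idea concrete, here is a small sketch of how a path could be separated into its cues and the underlying file name. It assumes that any component starting with "." (other than "." and "..") is a comma-separated cue list, as in the examples later in the talk. The split_cues function is hypothetical and says nothing about how WheelFS itself parses paths:

```shell
# Hypothetical parser: print the path with cue components removed on the
# first line, and the collected cues (comma-separated) on the second line.
split_cues() (
    cues=
    plain=
    IFS=/        # split the argument on slashes
    set -f       # no filename globbing while splitting
    for part in $1; do
        case $part in
            "")    ;;                                       # leading or doubled slash
            .|..)  plain="$plain/$part" ;;                  # ordinary dot entries
            .?*)   cues="${cues:+$cues,}${part#.}" ;;       # cue component
            *)     plain="$plain/$part" ;;                  # normal name
        esac
    done
    printf '%s\n%s\n' "$plain" "$cues"
)
```

Running `split_cues /wfs/cwc/.lax,writemany/foo` prints /wfs/cwc/foo followed by lax,writemany. One consequence of this naming scheme is that ordinary dot-files would collide with cue names, so a real design would need a rule to tell them apart.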
Semantic Cues: Writability
• Applies to files
• WriteMany (default)
• WriteOnce
[Figure: nodes 653, 076, 150, 554, 402, and 257. A lookup for File 135 is shown, along with File 135, File 135 v2, File 135 v3, and cached copies of 135 and 135 v3.]
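The slides name the WriteOnce cue without defining it. Reading the name literally, a file is written a single time and never overwritten afterwards. A local approximation of that behavior on an ordinary file system, using the shell's noclobber option; this is an analogy only, and write_once is a hypothetical helper rather than anything WheelFS provides:

```shell
# Hypothetical helper: copy stdin to the file named by $1, but only if the
# file does not exist yet. Later attempts fail and leave the file untouched.
write_once() (
    set -C          # noclobber: ">" refuses to overwrite an existing file
    cat > "$1"
)
```

With this helper, `echo v1 | write_once f` succeeds, while a later `echo v2 | write_once f` fails and leaves v1 in place.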
Semantic Cues: Freshness
• Applies to file references
• LatestVersion (default)
• AnyVersion
• BestVersion
[Figure: nodes 653, 076, 150, 554, 402, and 257. A lookup for File 135 is shown, along with File 135 and a cached copy of 135.]
Semantic Cues: Write Consistency
• Applies to files or directories
• Strict (default)
• Lax
[Figure: nodes 653, 076, 150, 554, 402, and 257. Two writes to File 135 are shown, along with File 135 and File 135 v2.]
Example: Cooperative Web Cache
• Reading an older version is ok:
  – cat /wfs/cwc/.bestversion,maxtime=250/foo
• Writing conflicting versions is ok:
  – wget http://foo -O - > /wfs/cwc/.lax,writemany/foo
if [ -f "/wfs/cwc/.maxtime=250,bestversion/$URL" ]; then
    if notexpired "/wfs/cwc/.maxtime=250,bestversion/$URL"; then
        cat "/wfs/cwc/.maxtime=250,bestversion/$URL"
        exit
    fi
fi
wget "$URL" -O - | tee "/wfs/cwc/.lax,writemany/$URL"
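Because the cue list is repeated in every path, a script like this one may be easier to read with a small path-building helper. A sketch, where cue_path is a hypothetical convenience function and the path layout follows the /wfs/cwc/.cue/foo/bar form proposed earlier:

```shell
# Hypothetical helper: build a path with a cue component inserted after the
# base directory.
#   $1 = base directory, $2 = comma-separated cues, $3 = name below the base
cue_path() {
    # Trim a trailing slash from the base and a leading slash from the name
    # so the pieces join with exactly one slash each.
    printf '%s/.%s/%s\n' "${1%/}" "$2" "${3#/}"
}
```

For instance, `cue_path /wfs/cwc maxtime=250,bestversion foo` prints /wfs/cwc/.maxtime=250,bestversion/foo, so the read path can be computed once and reused in all three places where the script needs it.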
Example: Cooperative Web Cache
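[Figure: nodes 653, 076, 150, 554, 402, and 257, with a client requesting $URL. Labels shown: a name lookup for "$URL" in Dir 070 (/wfs/cwc); a version query for File 135 answered with v1; File 135 and cached copies of 135 transferred as chunks; a lookup answered "No!"; and a new File 550 with the entry "$URL" == 550.]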
Discussion
• Current set of cues enough for many apps
  – All-sites-pings
  – Grid computations
  – OverCite
• Stuff we swept under the rug:
  – Security
  – Atomic renames across dirs
  – Storage load-balancing
  – Unreferenced files
Related Work
• Every FS paper ever written
• Specifically:
  – Cluster FS: Farsite, GFS, xFS, Ceph
  – Wide-area FS: JetFile, CFS, Shark
  – Grid: LegionFS, GridFTP, IBP
  – POSIX I/O High Performance Computing Extensions