OVERVIEW - Cornell University
OVERVIEW
LBFS
MOTIVATION
INTRODUCTION
CHALLENGES
ADVANTAGES OF LBFS
HOW LBFS WORKS?
RELATED WORK
DESIGN
SECURITY ISSUES
IMPLEMENTATION
SERVER IMPLEMENTATION
CLIENT IMPLEMENTATION
EVALUATION
SHARK
MOTIVATION
Users rarely consider running NFS over slow or wide-area networks.
If bandwidth is low, performance is unacceptable.
Data transfers saturate bottleneck links and cause unacceptable delays.
Interactive applications are slow in responding to user input.
Remote login is frustrating.
Solution? Run interactive programs locally and manipulate remote files through the file system.
The network file system should consume less bandwidth: LBFS.
INTRODUCTION
LBFS (the Low-Bandwidth File System) is designed for slow or wide-area networks.
It exploits similarities between files or versions of the same file.
It also uses conventional compression and caching.
In LBFS, interactive programs run locally and access remote data through the file system.
On common workloads, LBFS requires about 90% less bandwidth than traditional network file systems.
CHALLENGES? ADVANTAGES. HOW LBFS WORKS?
Provides traditional file system semantics
Local cache
Exploits cross-file similarities
Variable-size chunks
Indexes chunks by hash values
Provides close-to-open consistency
RELATED WORK
AFS uses server callbacks to reduce network traffic.
Leases are callbacks with an expiration date.
CODA supports slow networks and even disconnected operation through optimistic replication.
CODA saves bandwidth as it avoids transferring files to the server.
Bayou and OceanStore investigate conflict resolution for optimistic updates.
Spring and Wetherall: use large client and server caches.
Rsync exploits similarities between directory trees.
DESIGN
LBFS uses a large persistent file cache at the client. It assumes the client has enough cache.
It exploits similarities between files and file versions. It divides files into chunks and only transmits data chunks containing new data.
To save chunk transfers, LBFS relies on the SHA-1 hash.
LBFS uses gzip compression.
The central challenges in the design are keeping the index a reasonable size and dealing with shifting offsets.
PROBLEMS WITH FIXED-SIZED BLOCKS
A single-byte insertion shifts all the block boundaries.
[Figure: the original file and the same file after inserting a byte x]
Possible solutions:
Index files by the hashes of all overlapping 8 KB blocks at all offsets.
Rsync: considers only two files at a time; the existing file is identified by file name.
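The shifting problem can be reproduced in a few lines. The sketch below is only an illustration: the 8 KB block size comes from the slide, while the test data and helper names (`fixed_blocks`, `block_hashes`) are made up for the example.

```python
import hashlib

def fixed_blocks(data: bytes, size: int = 8192) -> list:
    """Split data into consecutive fixed-size blocks (the last may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def block_hashes(data: bytes, size: int = 8192) -> list:
    """SHA-1 hex digest of every fixed-size block."""
    return [hashlib.sha1(b).hexdigest() for b in fixed_blocks(data, size)]

# Deterministic pseudo-random test data: exactly ten 8 KB blocks.
original = b"".join(hashlib.sha256(i.to_bytes(4, "big")).digest() for i in range(2560))
edited = b"x" + original  # insert a single byte at the front

old = set(block_hashes(original))
new = block_hashes(edited)
shared = sum(1 for h in new if h in old)
print(f"{shared} of {len(new)} blocks unchanged after a 1-byte insertion")
```

Every block boundary moves by one byte, so none of the block hashes of the edited file match the original ones, even though almost all of the data is the same.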
LBFS SOLUTION FOR BLOCK SIZE
LBFS only looks for non-overlapping chunks in files.
It avoids sensitivity to shifting file offsets by setting chunk boundaries based on file contents.
To divide a file into chunks, LBFS examines every (overlapping) 48-byte region of the file.
LBFS uses Rabin fingerprints to select boundary regions called breakpoints.
Fingerprints are efficient to compute on a sliding window in a file.
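To see why a sliding-window hash is cheap, the sketch below updates the hash of each 48-byte region from the previous one in constant time. It uses a simple polynomial rolling hash modulo a prime as a stand-in, not a true Rabin fingerprint; `BASE`, `MOD` and the function names are assumptions made for the example.

```python
WINDOW = 48                        # window size in bytes, as on the slide
BASE = 257                         # assumed multiplier for the stand-in hash
MOD = (1 << 61) - 1                # assumed modulus (the Mersenne prime 2^61 - 1)
TOP = pow(BASE, WINDOW - 1, MOD)   # weight of the byte that leaves the window

def direct_hash(window: bytes) -> int:
    """Hash one window from scratch: O(WINDOW) work."""
    h = 0
    for b in window:
        h = (h * BASE + b) % MOD
    return h

def rolling_hashes(data: bytes):
    """Yield (offset, hash) for every overlapping WINDOW-byte region."""
    if len(data) < WINDOW:
        return
    h = direct_hash(data[:WINDOW])
    yield 0, h
    for i in range(WINDOW, len(data)):
        h = (h - data[i - WINDOW] * TOP) % MOD   # drop the oldest byte
        h = (h * BASE + data[i]) % MOD           # shift and add the newest byte
        yield i - WINDOW + 1, h
```

Each step costs two multiplications regardless of the window size, yet produces the same value as hashing the window from scratch.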
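RABIN FINGERPRINTS
A Rabin fingerprint is the polynomial representation of the data in a 48-byte region, modulo an irreducible polynomial: fingerprint = f(x) mod p(x).
Probability of collision between two strings r and s: at most max(|r|, |s|) / 2^(w-1), where w is the fingerprint length in bits.
Boundary regions have the 13 least significant bits of their fingerprint equal to an arbitrary predefined value, so breakpoints occur about every 2^13 bytes (8 KB) on average.
The method is reasonably fast.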
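The formula fingerprint = f(x) mod p(x) can be made concrete with polynomial division over GF(2), where subtraction is XOR. This is a minimal sketch: the default polynomial 0x11B (x^8 + x^4 + x^3 + x + 1, known to be irreducible) is chosen only to keep the numbers small, and the slides do not say which polynomial or degree LBFS uses.

```python
def gf2_mod(value: int, poly: int) -> int:
    """Remainder of value divided by poly, both read as polynomials over GF(2)."""
    deg = poly.bit_length() - 1
    while value.bit_length() > deg:
        shift = value.bit_length() - 1 - deg
        value ^= poly << shift   # subtract (XOR) an aligned multiple of poly
    return value

def fingerprint(region: bytes, poly: int = 0x11B) -> int:
    """f(x) mod p(x), where the bits of `region` are the coefficients of f."""
    return gf2_mod(int.from_bytes(region, "big"), poly)
```

With a fingerprint of at least 13 bits, a breakpoint test would compare `fingerprint(region) & 0x1FFF` against the predefined value; the 8-bit example polynomial here is too short for that and only demonstrates the arithmetic.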
CHUNK BOUNDARIES AFTER A SERIES OF EDITS?
The figure shows a file divided into variable-length chunks, with breakpoints determined by a hash of each 48-byte region.
Effect of inserting some text into the file at chunk C4: the edited chunk becomes C8. Transfer only C8.
Effect of inserting data in C5 that contains a breakpoint: the chunk splits into two new chunks (C9 and C10). Transfer only the two new chunks C9 and C10.
One of the breakpoints is eliminated: C2 + C3 -> C11. Transfer only C11.
PATHOLOGICAL CASES
Variable-size chunks can lead to pathological behavior.
If every 48 bytes of a file happened to be a breakpoint, the index would be as large as the file.
Very large chunks would be too difficult to send in a single RPC.
Arbitrary-size RPC messages would be somewhat inconvenient.
Chunk sizes must therefore be between 2 KB and 64 KB.
Artificially insert chunk boundaries if the file is full of repeated sequences.
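A content-defined chunker with these size limits might look like the sketch below. The 48-byte window, the 13-bit test and the 2 KB / 64 KB limits come from the slides; the rolling hash is a simple polynomial hash standing in for Rabin fingerprints, and `MAGIC`, `BASE` and `MOD` are assumed values.

```python
import hashlib

WINDOW = 48
MIN_CHUNK, MAX_CHUNK = 2 * 1024, 64 * 1024
MASK = (1 << 13) - 1               # look at the 13 low-order bits
MAGIC = 0                          # assumed "arbitrary predefined value"
BASE, MOD = 257, (1 << 61) - 1     # assumed stand-in hash parameters
TOP = pow(BASE, WINDOW - 1, MOD)

def chunk(data: bytes) -> list:
    """Split data at content-defined breakpoints, respecting min/max sizes."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        if i >= WINDOW:
            h = (h - data[i - WINDOW] * TOP) % MOD   # oldest byte leaves the window
        h = (h * BASE + byte) % MOD                  # newest byte enters the window
        size = i + 1 - start
        at_breakpoint = (h & MASK) == MAGIC
        # Ignore breakpoints that would make a chunk too small; force a
        # boundary when a chunk would otherwise grow past the maximum.
        if (at_breakpoint and size >= MIN_CHUNK) or size >= MAX_CHUNK:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])
    return chunks

# Deterministic pseudo-random 256 KB test file, then an edit in the middle.
data = b"".join(hashlib.sha256(i.to_bytes(4, "big")).digest() for i in range(8192))
edited = data[:131072] + b"inserted text" + data[131072:]
known = {hashlib.sha1(c).digest() for c in chunk(data)}
after = chunk(edited)
reused = sum(hashlib.sha1(c).digest() in known for c in after)
print(f"{reused} of {len(after)} chunks already known after the edit")
```

Because the hash depends only on the last 48 bytes, boundaries after the edit normally line up with the old ones again, so only the chunks around the insertion need to be transferred.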
CHUNK DATABASE
The chunk database indexes chunks by the first 64 bits of their SHA-1 hash.
The database maps keys to (file, offset, count) triples.
LBFS never relies on the correctness of the chunk database.
How to keep this database up to date? It must be updated whenever a file is updated.
There can still be problems with local updates at the server site, and crashes can corrupt database contents.
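One way to picture a database that is never trusted is to recompute the hash of whatever the entry points at before using it. The sketch below assumes an in-memory dictionary of files; the class and method names are hypothetical and do not come from LBFS.

```python
import hashlib
from typing import Dict, Optional, Tuple

def chunk_key(chunk: bytes) -> bytes:
    """First 64 bits (8 bytes) of the chunk's SHA-1 hash."""
    return hashlib.sha1(chunk).digest()[:8]

class ChunkDB:
    def __init__(self, files: Dict[str, bytes]):
        self.files = files   # hypothetical in-memory stand-in for the file system
        self.index: Dict[bytes, Tuple[str, int, int]] = {}

    def add(self, name: str, offset: int, count: int) -> None:
        """Record where a chunk lives as a (file, offset, count) triple."""
        data = self.files[name][offset:offset + count]
        self.index[chunk_key(data)] = (name, offset, count)

    def lookup(self, sha1: bytes) -> Optional[bytes]:
        """Return chunk data for a full SHA-1 digest, re-verifying it first."""
        entry = self.index.get(sha1[:8])
        if entry is None:
            return None
        name, offset, count = entry
        data = self.files.get(name, b"")[offset:offset + count]
        if hashlib.sha1(data).digest() != sha1:   # stale or corrupt entry
            del self.index[sha1[:8]]
            return None
        return data
```

A stale entry then costs only a missed optimization: the lookup fails and the chunk is transferred normally.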
FILE CONSISTENCY
The LBFS client currently performs whole-file caching.
LBFS uses a three-tiered scheme to determine if a file is up to date.
[Figure: opening a file, with one path if the lease has not expired and one if the lease has expired]
The client gets a lease the first time a file is opened for read.
The client renews an expired lease by requesting the file attributes.
It is the job of the client to check if the cached copy is still current.
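The three tiers can be read as: use the cache while the lease is valid, otherwise renew the lease by fetching attributes and compare them, and only then fetch the file. The sketch below is one possible reading under stated assumptions: `FakeServer`, `LEASE_SECONDS` and the use of a modification time as the compared attribute are all invented for illustration, and server-side invalidation during a lease is left out.

```python
import time
from dataclasses import dataclass

LEASE_SECONDS = 60.0   # hypothetical; the slides give no lease duration for LBFS

@dataclass
class CacheEntry:
    data: bytes
    mtime: float
    lease_expires: float

class FakeServer:
    """Hypothetical in-memory stand-in for the file server."""
    def __init__(self):
        self.files = {}                    # name -> (data, mtime)
    def write(self, name, data, mtime):
        self.files[name] = (data, mtime)
    def getattr(self, name):
        return self.files[name][1]
    def read(self, name):
        return self.files[name][0]

def open_file(name, cache, server, now=None):
    """Return (data, reason) following the three tiers described above."""
    now = time.time() if now is None else now
    entry = cache.get(name)
    if entry is not None and now < entry.lease_expires:
        return entry.data, "lease valid"            # tier 1: no server contact
    mtime = server.getattr(name)                    # tier 2: renew via attributes
    if entry is not None and entry.mtime == mtime:
        entry.lease_expires = now + LEASE_SECONDS
        return entry.data, "attributes unchanged"
    data = server.read(name)                        # tier 3: fetch new contents
    cache[name] = CacheEntry(data, mtime, now + LEASE_SECONDS)
    return data, "fetched"
```

Passing `now` explicitly keeps the example deterministic; a real client would use its clock.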
FILE READS
LBFS uses additional calls not in NFS -> GETHASH for reads.
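A read built on GETHASH can be sketched as follows: the client asks for the chunk hashes of a file, reuses any chunk it already holds, and reads only the rest. `FakeServer` is a hypothetical in-memory stand-in, and reading by chunk index is a simplification made for the example; only the GETHASH name comes from the slide.

```python
import hashlib

class FakeServer:
    """Hypothetical server holding one file as a list of chunks."""
    def __init__(self, chunks):
        self.chunks = chunks
        self.bytes_sent = 0

    def gethash(self):
        """Return (SHA-1 digest, size) for every chunk of the file."""
        return [(hashlib.sha1(c).digest(), len(c)) for c in self.chunks]

    def read(self, index):
        """Send one chunk's data, counting the bytes transferred."""
        self.bytes_sent += len(self.chunks[index])
        return self.chunks[index]

def fetch_file(server, local_chunks: dict) -> bytes:
    """Rebuild the file, fetching only chunks missing from the local cache."""
    parts = []
    for i, (digest, _size) in enumerate(server.gethash()):
        data = local_chunks.get(digest)
        if data is None:
            data = server.read(i)          # chunk not known locally
            local_chunks[digest] = data
        parts.append(data)
    return b"".join(parts)
```

If the client already holds most chunks, for example from an older version of the file, only the new chunks cross the network.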
FILE WRITES
The LBFS server updates files atomically at close time.
It uses temporary files. 4 RPCs are used in the update protocol:
1. MKTMPFILE
2. CONDWRITE
3. TMPWRITE
4. COMMITTMP
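The four RPCs can be sketched as a client loop that offers each chunk by hash first and sends data only when the server does not already have it. Everything except the four RPC names is an assumption: the in-memory `FakeServer`, addressing chunks by index instead of offset, the `nchunks` argument, and the `"HASHNOTFOUND"` reply string are illustrative only.

```python
import hashlib

class FakeServer:
    """Hypothetical in-memory server for the update protocol."""
    def __init__(self):
        self.files = {}    # file name -> contents
        self.known = {}    # SHA-1 digest -> chunk data already on the server
        self.tmp = {}      # temporary file id -> list of chunks

    def mktmpfile(self, fd, nchunks):
        self.tmp[fd] = [None] * nchunks

    def condwrite(self, fd, index, digest):
        data = self.known.get(digest)
        if data is None:
            return "HASHNOTFOUND"          # client must send the data itself
        self.tmp[fd][index] = data
        return "OK"

    def tmpwrite(self, fd, index, data):
        self.tmp[fd][index] = data
        self.known[hashlib.sha1(data).digest()] = data

    def committmp(self, fd, name):
        # Replace the target's contents in one step.
        self.files[name] = b"".join(self.tmp.pop(fd))

def write_file(server, name, chunks):
    """Write a file chunk by chunk; return the number of data bytes sent."""
    sent = 0
    server.mktmpfile(fd=1, nchunks=len(chunks))
    for i, c in enumerate(chunks):
        if server.condwrite(1, i, hashlib.sha1(c).digest()) == "HASHNOTFOUND":
            server.tmpwrite(1, i, c)
            sent += len(c)
    server.committmp(1, name)
    return sent
```

Writing a file a second time with one changed chunk sends only that chunk's data; the unchanged chunks are satisfied by CONDWRITE alone.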
SECURITY
LBFS uses the security infrastructure from SFS.
All servers have public keys.
Access control.
IMPLEMENTATION
mkdb utility
If file size < 8 KB
Trash directory
EVALUATION - REPEATED DATA IN FILES
Bandwidth consumption and network utilization are measured under several common workloads.
LBFS is compared with CIFS, NFS version 3 and AFS.
EVALUATION (CONT) - BANDWIDTH UTILIZATION
• Used 3 workloads.
• (MS Word: 1.4 MB file; gcc: compiled emacs 20.7; ed: perl)
EVALUATION (3) - APPLICATION PERFORMANCE
OVERVIEW
SHARK
MOTIVATION
INTRODUCTION
CHALLENGES
ADVANTAGES OF SHARK
HOW SHARK WORKS?
DESIGN
IMPLEMENTATION
EVALUATION
CONCLUSION
DISCUSSION (QUESTIONS)
MOTIVATION
Current systems
1. Replicating the execution environment
2. Multiple clients copying the same files
[Figure: a server holding programs, libraries and data; a client launches P1, and copies of the programs run on clients C2 to C6]
Current systems replicate their execution environment on each machine before launching a distributed application -> wastes resources and complicates debugging.
[Figure: servers 1 to 3 each hold Data1, Data2 and Data3; Client 1 and clients C2 to C4 request data while the servers exchange database updates]
Replicating data and serving the same copy of data to multiple clients -> inconsistencies.
INTRODUCTION
Shark is a distributed file system.
Designed for large-scale, wide-area deployment.
Scalable.
Provides a novel cooperative-caching mechanism.
Reduces the load on an origin file server.
CHALLENGES
Scalability: what if a large program (say, several MB) is being executed from a file server by many clients?
Because of limited bandwidth, the server might deliver unacceptable performance.
As the model is similar to P2P file systems, administration, accountability, and consistency need to be considered.
HOW SHARK WORKS?
Shark clients find nearby copies of data by using a distributed index.
A client avoids transferring the file, or chunks of the file, from the server if the same data can be fetched from a nearby client.
Shark is compatible with existing backup and restore procedures.
By sharing reads, Shark greatly reduces server load and improves client latency.
DESIGN - PROTOCOL (1/2)
1. The Shark server divides the file into chunks by using the Rabin fingerprint algorithm.
2. Shark interacts with the local host using the existing NFSv3 protocol and runs in user space.
3. When the first client reads a particular file, it gets the file and registers as a replica proxy for the chunks of the file in the distributed index.
4. When a second client wants to access the same file, it discovers the proxies of the file chunks by querying the distributed index, establishes a secure channel to (multiple such) proxies, and downloads the file chunks in parallel.
DESIGN - PROTOCOL (2/2)
5. After fetching, the client then registers itself as a replica proxy for these chunks.
6. The distributed index exposes two APIs.
   Put: a client executes put to declare that it has something.
   Get: a client executes get to get the list of clients who have something.
SECURE DATA SHARING
7. Data is encrypted by the sender and can be decrypted only by a client with appropriate read permissions.
8. Clients cannot download large amounts of data without proper read authorization.
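The put/get interface from item 6 can be pictured with a small in-memory index. This is a sketch only: `ChunkIndex` is a hypothetical single-process stand-in for what is in reality a distributed service, and the key and client names are made up.

```python
from collections import defaultdict

class ChunkIndex:
    """Hypothetical in-memory stand-in for the put/get index."""
    def __init__(self):
        self._proxies = defaultdict(list)

    def put(self, key, client):
        """Declare that `client` holds the data identified by `key`."""
        if client not in self._proxies[key]:
            self._proxies[key].append(client)

    def get(self, key):
        """Return the clients that have announced `key` (possibly none)."""
        return list(self._proxies.get(key, []))

index = ChunkIndex()
index.put("chunk-1", "clientA")       # first reader registers as a proxy
print(index.get("chunk-1"))           # a later reader discovers the proxy
index.put("chunk-1", "clientB")       # and registers itself after fetching
```

Each client that fetches a chunk adds itself, so the list of possible sources grows as more clients read the same file.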
SECURE DATA SHARING
How? Cross-file-system sharing.
Shark uses a token generated by the file server as a shared secret between client and proxy.
The client can verify the integrity of the received data.
For a sender client (proxy client) to authorize a requester client, the requester client must demonstrate knowledge of the token.
Once authorized, the receiver client will establish the read permission of the file.
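The token exchange can be illustrated with Python's `hmac` module. The token as a keyed hash of the file contents under a server secret r follows the formula T_F = HMAC_r(F) given on the cooperative caching slide; the challenge-response step, the choice of SHA-1 and all function names are assumptions made for this sketch, not Shark's actual wire protocol.

```python
import hashlib
import hmac

def file_token(server_secret: bytes, contents: bytes) -> bytes:
    """T_F = HMAC_r(F): keyed hash of the file contents under the server secret r."""
    return hmac.new(server_secret, contents, hashlib.sha1).digest()

def prove(token: bytes, nonce: bytes) -> bytes:
    """Requester side: show knowledge of the token without sending it."""
    return hmac.new(token, nonce, hashlib.sha1).digest()

def verify(token: bytes, nonce: bytes, proof: bytes) -> bool:
    """Proxy side: check the proof against its own copy of the token."""
    return hmac.compare_digest(prove(token, nonce), proof)
```

Only a client that obtained the token from the file server, and so had read access, can produce a proof the proxy accepts. The same token also lets the receiver check that the data it gets matches the file.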
FILE CONSISTENCY
Shark uses two network file system techniques:
Leases
AFS-style whole-file caching
When a client makes a read RPC to the file server, it gets a read lease on the particular file.
In Shark, the default lease duration is 5 minutes.
For file commonalities, it uses the LBFS approach.
COOPERATIVE CACHING
[Figure: message sequence between client C1, the server, and clients C2, C3 and C4]
1. File not cached or lease expired: client C1 sends GETTOK(fh, offset, count) to the server.
2. The server computes the file token T_F = tok(F) = HMAC_r(F), splits the data into chunks, and computes tokens for the chunks. To do so it issues a series of read calls to the kernel NFS server, and it caches the tokens for future reference.
3. The server replies with the file attributes, the file token, and the chunk list: (chunktok1, offset1, size1), (chunktok2, offset2, size2), (chunktok3, offset3, size3).
4. The client determines whether its local cache holds the latest version. If not, it creates K threads (t1, t2, t3 in the figure) and fetches K chunks in parallel.
5. The threads multicast requests for chunks Fi to clients C2, C3 and C4, which return chunks F1, F2 and F3.
6. PUT(): the client announces itself as a proxy for Fi.
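Fetching K chunks in parallel can be sketched with a thread pool. The data structures here are assumptions for illustration: each proxy is modeled as a plain dictionary from chunk token to data, `origin` stands in for the file server, and falling back to the origin when no proxy has a chunk is an assumed policy.

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_chunk(chunk_token, proxies, origin):
    """Try each proxy that announced the chunk; fall back to the origin server."""
    for proxy in proxies.get(chunk_token, []):
        data = proxy.get(chunk_token)
        if data is not None:
            return data
    return origin[chunk_token]

def fetch_file(chunk_tokens, proxies, origin, k=3):
    """Fetch all chunks using up to k worker threads, preserving chunk order."""
    with ThreadPoolExecutor(max_workers=k) as pool:
        parts = pool.map(lambda t: fetch_chunk(t, proxies, origin), chunk_tokens)
        return b"".join(parts)
```

`pool.map` returns results in the order of the input tokens, so the chunks can be joined directly even if they arrive out of order.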
DISTRIBUTED INDEXING
Shark uses a global distributed index for all Shark clients.
The system maps opaque keys onto nodes by hashing the value onto a key ID.
Assigning IDs to nodes allows lookups in O(log(n)).
Shark stores only a small amount of information about which clients store what data.
Shark uses Coral as its distributed index.
Coral provides a distributed sloppy hash table (DSHT).
Coral caches key/value pairs at nodes whose IDs are close to the key.
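Mapping keys onto nodes can be illustrated by hashing a key into the same ID space as the nodes and picking the nearest node IDs. This is a sketch under stated assumptions: the 32-bit ID width, truncated SHA-1, and XOR distance as the closeness metric are choices made for the example, and sorting a global node list replaces the O(log(n)) routing a real system performs.

```python
import hashlib

def key_id(value: bytes, bits: int = 32) -> int:
    """Hash an opaque key onto the ID space (truncated SHA-1; width is assumed)."""
    return int.from_bytes(hashlib.sha1(value).digest(), "big") >> (160 - bits)

def closest_nodes(node_ids, key: int, n: int = 2):
    """Node IDs closest to key, using XOR distance as an assumed metric."""
    return sorted(node_ids, key=lambda node: node ^ key)[:n]

nodes = [0b0001, 0b0100, 0b1100]
print(closest_nodes(nodes, key=0b0101))   # -> [4, 1]
```

Storing a key/value pair on several nearby nodes, rather than exactly one, is the "sloppy" part: a lookup can stop as soon as it reaches any node that holds the pair.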
IMPLEMENTATION
Shark consists of 3 main components:
Server-side daemon
Client-side daemon
Coral daemon
All are implemented in C++ and built using the SFS toolkit.
Client-side daemon
Biggest component of Shark.
Handles user requests transparently.
Incorporates whole-file caching.
Code is ~12,000 lines.
EVALUATION (1/2)
Shark is evaluated against NFSv3 and SFS.
Read tests are performed both
within the controlled Emulab LAN environment and
in the wide area on the PlanetLab v3.0 test-bed.
The server required 0.9 seconds to compute chunks for a 10 MB random file, and 3.6 seconds for a 40 MB random file.
EVALUATION - MICROBENCHMARKS
For the local-area micro-benchmarks, local machines at NYU are used as Shark clients.
In this micro-benchmark, Shark's chunking mechanism reduces redundant data transfers by exploiting data commonalities.
QUESTIONS
THANK YOU