Distributed Systems aka Special Topics in Networking CS 7780.
![Page 1: Distributed Systems aka Special Topics in Networking CS 7780.](https://reader036.fdocuments.net/reader036/viewer/2022062314/56649f0d5503460f94c2118e/html5/thumbnails/1.jpg)
Distributed Systems
aka Special Topics in Networking CS 7780
![Page 2: Distributed Systems aka Special Topics in Networking CS 7780.](https://reader036.fdocuments.net/reader036/viewer/2022062314/56649f0d5503460f94c2118e/html5/thumbnails/2.jpg)
Welcome
• This is CS7780
  – Everyone in the right room?
  – Ok, good.
• Who am I?
  – Professor David Choffnes
  – [email protected]
  – West Village H 256
  – No office hours: just e-mail to make an appointment.
• No TAs, either
![Page 3: Distributed Systems aka Special Topics in Networking CS 7780.](https://reader036.fdocuments.net/reader036/viewer/2022062314/56649f0d5503460f94c2118e/html5/thumbnails/3.jpg)
Why take this course?
• The reason you’re here is because of a DS
  – Registering for a course
  – Checking class times and location
  – Visiting my website
  – Getting directions to class
  – Writing notes in a Gdoc
  – Checking your e-mail
![Page 4: Distributed Systems aka Special Topics in Networking CS 7780.](https://reader036.fdocuments.net/reader036/viewer/2022062314/56649f0d5503460f94c2118e/html5/thumbnails/4.jpg)
Why take this course?
• When you registered, how were you guaranteed a slot and that this slot wasn’t overwritten by someone else?
• When you got directions, how did you get results in milliseconds when looking up one of billions of locations/tiles?
• How do your notes stay properly synced even though you never hit save and sometimes use the doc while offline?
![Page 5: Distributed Systems aka Special Topics in Networking CS 7780.](https://reader036.fdocuments.net/reader036/viewer/2022062314/56649f0d5503460f94c2118e/html5/thumbnails/5.jpg)
Why take this class?
• Without DS, life would be pretty boring
  – I’d have to assign you a textbook
  – And photocopy manuscripts to read
  – And I couldn’t paste in pictures like this
![Page 6: Distributed Systems aka Special Topics in Networking CS 7780.](https://reader036.fdocuments.net/reader036/viewer/2022062314/56649f0d5503460f94c2118e/html5/thumbnails/6.jpg)
Goals
• Fundamental understanding of DS
  – All the way from core concepts and principles
  – … to the applications that use them in various ways
• Focus on software systems and protocols
  – Not hardware (treat as black box)
  – Minimal theory (but some for proofs)
• Paper-centric
  – Learn DS from source material
  – Build your vocabulary and awareness of foundational DS concepts
• Research projects
  – Apply these concepts in your own original research
![Page 7: Distributed Systems aka Special Topics in Networking CS 7780.](https://reader036.fdocuments.net/reader036/viewer/2022062314/56649f0d5503460f94c2118e/html5/thumbnails/7.jpg)
Online Resources
• http://david.choffnes.com/classes/cs7780fa14/
• Class forum is on Piazza
  – Sign up today!
  – Install their iPhone/Android app
• When in doubt, post to Piazza
  – Piazza is preferable to e-mail
  – If you e-mail me a question, I will tell you to post it on Piazza
• HotCRP for paper reviews
  – Mandatory for all papers assigned (except this week)
![Page 8: Distributed Systems aka Special Topics in Networking CS 7780.](https://reader036.fdocuments.net/reader036/viewer/2022062314/56649f0d5503460f94c2118e/html5/thumbnails/8.jpg)
Sept 10 Intro
Sept 14 No Class Monday, CAP/Clocks
Sept 21 Consistency/Consensus, No class Thursday (NSDI), proposals due
Sept 28 Fault Tolerance/Availability
Oct 5 Distributed/Remote Processing, Distributed Cache
Oct 12 No class Monday: Columbus Day, DHTs
Oct 19 File systems (early, modern)
Oct 22 Overlays (maybe), No class: IMC, Midterm reports due
Nov 2 Wild card (Christo and friends)
Nov 9 The Internet, CDNs
Nov 16 Privacy, Anonymity, BitCoin (Field trip to DTL workshop)
Nov 23 DCNs, Thanksgiving Thursday, no class
Nov 30 SDNs, Management, security
Dec 7 Project presentations
Dec 14 Reports due
![Page 9: Distributed Systems aka Special Topics in Networking CS 7780.](https://reader036.fdocuments.net/reader036/viewer/2022062314/56649f0d5503460f94c2118e/html5/thumbnails/9.jpg)
Schedule
• The schedule will probably slip
• If there’s a paper/topic you really want to discuss that is not on the list, let me know
![Page 10: Distributed Systems aka Special Topics in Networking CS 7780.](https://reader036.fdocuments.net/reader036/viewer/2022062314/56649f0d5503460f94c2118e/html5/thumbnails/10.jpg)
Teaching Style
• This is not a lecture course
  – There is no textbook
  – There are no homework assignments
  – There is no hand holding
• Class will be very interactive
  – I will ask you questions
  – You should ask questions
  – Discussion is paramount
• That said, I will lead the first few lectures
![Page 11: Distributed Systems aka Special Topics in Networking CS 7780.](https://reader036.fdocuments.net/reader036/viewer/2022062314/56649f0d5503460f94c2118e/html5/thumbnails/11.jpg)
How you are evaluated
• Attend class
• Read and summarize papers
  – Present a subset of papers
• Present your project
• Write it up
![Page 12: Distributed Systems aka Special Topics in Networking CS 7780.](https://reader036.fdocuments.net/reader036/viewer/2022062314/56649f0d5503460f94c2118e/html5/thumbnails/12.jpg)
Grading
• Pretty much all on final project
• Research report + presentation
  – Some weight on attendance/participation
![Page 13: Distributed Systems aka Special Topics in Networking CS 7780.](https://reader036.fdocuments.net/reader036/viewer/2022062314/56649f0d5503460f94c2118e/html5/thumbnails/13.jpg)
Projects
• Anything that is related to distributed systems
  – Needs to be approved by me (9/24)
  – Should be your ongoing research
  – If you need a project, come see me
• Midterm progress report due 10/26
• 6-page (minimum) writeup due at end of semester
  – You will also need to give a 15-20 minute talk
![Page 14: Distributed Systems aka Special Topics in Networking CS 7780.](https://reader036.fdocuments.net/reader036/viewer/2022062314/56649f0d5503460f94c2118e/html5/thumbnails/14.jpg)
Papers and reviews
• Most content will come from original sources
  – Students must pick papers to present
    • Ok to reuse some slides from authors’ presentations
  – Everyone needs to enter a review in HotCRP
• Student presents paper, then we discuss
![Page 15: Distributed Systems aka Special Topics in Networking CS 7780.](https://reader036.fdocuments.net/reader036/viewer/2022062314/56649f0d5503460f94c2118e/html5/thumbnails/15.jpg)
Cheating
• Do not plagiarize
• Do not do it
  – Seriously, don’t make me say it again
• Cheating is an automatic zero
  – Will be referred to the university for discipline and possible expulsion
  – I’m not kidding: I will send any suspects to OSCCR without exception
• Research code and text must be original
  – If you have any questions about whether there might be an issue, ask me
![Page 16: Distributed Systems aka Special Topics in Networking CS 7780.](https://reader036.fdocuments.net/reader036/viewer/2022062314/56649f0d5503460f94c2118e/html5/thumbnails/16.jpg)
Questions?
![Page 17: Distributed Systems aka Special Topics in Networking CS 7780.](https://reader036.fdocuments.net/reader036/viewer/2022062314/56649f0d5503460f94c2118e/html5/thumbnails/17.jpg)
What is a distributed system?
![Page 18: Distributed Systems aka Special Topics in Networking CS 7780.](https://reader036.fdocuments.net/reader036/viewer/2022062314/56649f0d5503460f94c2118e/html5/thumbnails/18.jpg)
Google’s definition
• an application that executes a collection of protocols to coordinate the actions of multiple processes on a network, such that all components cooperate together to perform a single or small set of related tasks
![Page 19: Distributed Systems aka Special Topics in Networking CS 7780.](https://reader036.fdocuments.net/reader036/viewer/2022062314/56649f0d5503460f94c2118e/html5/thumbnails/19.jpg)
Why build a DS?
• Scale
• Availability
• … no other way to connect lots of components
• …
![Page 20: Distributed Systems aka Special Topics in Networking CS 7780.](https://reader036.fdocuments.net/reader036/viewer/2022062314/56649f0d5503460f94c2118e/html5/thumbnails/20.jpg)
Key challenge
• Everything fails
  – No really, on a long enough timeline, everything fails
• Node goes down
• Network unavailable
• Storage is corrupted
• Attack on system
• Packet loss
• Bugs
• …
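A rough way to see why "everything fails" at scale: if each component fails independently with some small probability over a given period, the chance that at least one of n components fails is 1 - (1 - p)^n. The sketch below is only illustrative. The independence assumption, the function name, and the value p = 0.001 are all assumptions for the example, and real failures are often correlated.

```python
def prob_any_failure(n_components: int, p_fail: float) -> float:
    """Probability that at least one of n components fails.

    Assumes (hypothetically) that each component fails independently
    with probability p_fail over the period of interest.
    """
    return 1.0 - (1.0 - p_fail) ** n_components


# Illustrative numbers only: the same per-component failure rate,
# applied to systems of increasing size.
for n in (1, 100, 10_000):
    print(n, prob_any_failure(n, 0.001))
```

Even a per-component failure rate that looks negligible on one machine makes some failure close to certain once the system has thousands of components.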
![Page 21: Distributed Systems aka Special Topics in Networking CS 7780.](https://reader036.fdocuments.net/reader036/viewer/2022062314/56649f0d5503460f94c2118e/html5/thumbnails/21.jpg)
Example: Google Docs
![Page 22: Distributed Systems aka Special Topics in Networking CS 7780.](https://reader036.fdocuments.net/reader036/viewer/2022062314/56649f0d5503460f94c2118e/html5/thumbnails/22.jpg)
Handling failures can be hard
• Multiple layers of components can mask problems or interfere with recovery
• Achieving agreement/consistency after failures can be difficult– Even without failures this is hard
• Reliability during failures is challenging
![Page 23: Distributed Systems aka Special Topics in Networking CS 7780.](https://reader036.fdocuments.net/reader036/viewer/2022062314/56649f0d5503460f94c2118e/html5/thumbnails/23.jpg)
Related Challenges
• Fault tolerance
• Availability
• Recoverability
• Consistency
• Scalability
• Security
• Predictability, Simplicity (?)
![Page 24: Distributed Systems aka Special Topics in Networking CS 7780.](https://reader036.fdocuments.net/reader036/viewer/2022062314/56649f0d5503460f94c2118e/html5/thumbnails/24.jpg)
The Network
• Communication, coordination require a network
  – What are examples of networks that DSes use?
  – Why are they used?
![Page 25: Distributed Systems aka Special Topics in Networking CS 7780.](https://reader036.fdocuments.net/reader036/viewer/2022062314/56649f0d5503460f94c2118e/html5/thumbnails/25.jpg)
How not to design a DS
Assume:
• Nodes are always online.
• The network is reliable.
• Latency is zero.
• Bandwidth is infinite.
• The network is secure.
• Topology doesn't change.
• There is one administrator.
• Transport cost is zero.
• The network is homogeneous.
![Page 26: Distributed Systems aka Special Topics in Networking CS 7780.](https://reader036.fdocuments.net/reader036/viewer/2022062314/56649f0d5503460f94c2118e/html5/thumbnails/26.jpg)
Communication/Coordination
• How do you compartmentalize tasks in a computer program?
• Analogy in DS: Remote Procedure Calls
  – Anyone have examples?
![Page 27: Distributed Systems aka Special Topics in Networking CS 7780.](https://reader036.fdocuments.net/reader036/viewer/2022062314/56649f0d5503460f94c2118e/html5/thumbnails/27.jpg)
Key RPC components
• Protocol
• Client/Server implementation
• Error handling
  – What can go wrong?
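The three components above can be sketched in a few lines. This is a minimal in-process illustration, not a real RPC library: the JSON message format, the `LossyTransport` class standing in for the network, and every class and method name here are assumptions made for the example.

```python
import json
import random


class RpcTimeout(Exception):
    """Raised when no reply arrives (the request or the reply was lost)."""


class Server:
    """Server side: unmarshal the request, dispatch it, marshal the reply."""

    def __init__(self):
        self.procedures = {"add": lambda a, b: a + b}

    def handle(self, raw_request: bytes) -> bytes:
        request = json.loads(raw_request)
        proc = self.procedures.get(request["method"])
        if proc is None:
            reply = {"id": request["id"], "error": "unknown method"}
        else:
            reply = {"id": request["id"], "result": proc(*request["params"])}
        return json.dumps(reply).encode()


class LossyTransport:
    """Hypothetical stand-in for a network that can drop messages."""

    def __init__(self, server: Server, loss_rate: float, seed: int = 0):
        self.server = server
        self.loss_rate = loss_rate
        self.rng = random.Random(seed)

    def send(self, raw_request: bytes) -> bytes:
        if self.rng.random() < self.loss_rate:
            raise RpcTimeout("request lost")
        raw_reply = self.server.handle(raw_request)
        if self.rng.random() < self.loss_rate:
            # The server already executed the call; only the reply is lost.
            raise RpcTimeout("reply lost")
        return raw_reply


class Client:
    """Client stub: makes a remote call look like a local function call."""

    def __init__(self, transport: LossyTransport, max_attempts: int = 5):
        self.transport = transport
        self.max_attempts = max_attempts
        self.next_id = 0

    def call(self, method: str, *params):
        self.next_id += 1
        raw = json.dumps(
            {"id": self.next_id, "method": method, "params": list(params)}
        ).encode()
        for _ in range(self.max_attempts):
            try:
                reply = json.loads(self.transport.send(raw))
            except RpcTimeout:
                continue  # retry: the call may run more than once on the server
            if "error" in reply:
                raise RuntimeError(reply["error"])
            return reply["result"]
        raise RpcTimeout(f"no reply after {self.max_attempts} attempts")
```

With this sketch, `Client(LossyTransport(Server(), loss_rate=0.0)).call("add", 2, 3)` returns 5. Note one answer to "what can go wrong?": the client cannot tell a lost request from a lost reply, so blind retries give at-least-once execution, which is only safe for idempotent procedures such as `add`.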
![Page 28: Distributed Systems aka Special Topics in Networking CS 7780.](https://reader036.fdocuments.net/reader036/viewer/2022062314/56649f0d5503460f94c2118e/html5/thumbnails/28.jpg)
End-to-End Argument
![Page 29: Distributed Systems aka Special Topics in Networking CS 7780.](https://reader036.fdocuments.net/reader036/viewer/2022062314/56649f0d5503460f94c2118e/html5/thumbnails/29.jpg)
Where to Place Functionality
• How do we distribute functionality across a DS?
  – Example: who is responsible for security?
[Figure: network path with two switches and a router; question marks mark candidate places to put the functionality]
• “End-to-End Arguments in System Design”
• Saltzer, Reed, and Clark
• Endlessly debated by researchers and engineers
![Page 30: Distributed Systems aka Special Topics in Networking CS 7780.](https://reader036.fdocuments.net/reader036/viewer/2022062314/56649f0d5503460f94c2118e/html5/thumbnails/30.jpg)
Basic Observation
• Some applications have end-to-end requirements
  – Security, reliability, etc.
• Implementing this stuff inside the network is hard
  – Every step along the way must be fail-proof
  – Different applications have different needs
• End hosts…
  – Can’t depend on the network
  – Can satisfy these requirements without network level support
![Page 31: Distributed Systems aka Special Topics in Networking CS 7780.](https://reader036.fdocuments.net/reader036/viewer/2022062314/56649f0d5503460f94c2118e/html5/thumbnails/31.jpg)
Example: Reliable File Transfer
Solution 1: Make the network reliable
Solution 2: App level, end-to-end check, retry on failure
[Figure: file transfer path with three “Integrity Check” labels]
App has to do a check anyway!
![Page 32: Distributed Systems aka Special Topics in Networking CS 7780.](https://reader036.fdocuments.net/reader036/viewer/2022062314/56649f0d5503460f94c2118e/html5/thumbnails/32.jpg)
Example: Reliable File Transfer
Solution 1: Make the network reliable
Solution 2: App level, end-to-end check, retry on failure
[Figure: file transfer path with a “Please Retry” message]
Full functionality can be built at App level
• In-network implementation…
  – Doesn’t reduce host complexity
  – Does increase network complexity
  – Increased overhead for apps that don’t need functionality
• But, in-network performance may be better
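Solution 2 can be sketched as follows: the sender computes a digest of the file, the receiver recomputes it over whatever arrived, and a mismatch triggers a retry. This is only an illustration under stated assumptions. `unreliable_send` is a hypothetical stand-in for the network path, the function names and rates are invented for the example, and the digest is assumed to reach the receiver intact.

```python
import hashlib
import random


def unreliable_send(data: bytes, rng: random.Random, corrupt_rate: float) -> bytes:
    """Hypothetical network path that sometimes flips one byte in transit."""
    if data and rng.random() < corrupt_rate:
        i = rng.randrange(len(data))
        data = data[:i] + bytes([data[i] ^ 0xFF]) + data[i + 1:]
    return data


def transfer(data: bytes, rng: random.Random,
             corrupt_rate: float = 0.3, max_attempts: int = 10):
    """Send data until the end-to-end check passes.

    Returns (received_bytes, attempts_used), or raises OSError if every
    attempt arrives corrupted.
    """
    # Computed at the sending endpoint; assumed to arrive intact.
    expected = hashlib.sha256(data).hexdigest()
    for attempt in range(1, max_attempts + 1):
        received = unreliable_send(data, rng, corrupt_rate)
        # Check performed at the receiving endpoint, not inside the network.
        if hashlib.sha256(received).hexdigest() == expected:
            return received, attempt
        # Mismatch: the receiver asks the sender to retry.
    raise OSError(f"transfer failed after {max_attempts} attempts")
```

The check lives entirely at the endpoints, so it catches corruption wherever it happens along the path. A more reliable network would reduce the number of retries, but it would not let the application drop the check.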
![Page 33: Distributed Systems aka Special Topics in Networking CS 7780.](https://reader036.fdocuments.net/reader036/viewer/2022062314/56649f0d5503460f94c2118e/html5/thumbnails/33.jpg)
Conservative Interpretation
“Don’t implement a function at the lower levels of the system unless it can be completely implemented at this level” (Peterson and Davie)
Basically, unless you can completely remove the burden from endpoints, don’t bother
![Page 34: Distributed Systems aka Special Topics in Networking CS 7780.](https://reader036.fdocuments.net/reader036/viewer/2022062314/56649f0d5503460f94c2118e/html5/thumbnails/34.jpg)
Radical Interpretation
• Don’t implement anything in an intermediate DS component that can be implemented correctly by the endpoints
• Make each DS component absolutely minimal
• Ignore performance issues
![Page 35: Distributed Systems aka Special Topics in Networking CS 7780.](https://reader036.fdocuments.net/reader036/viewer/2022062314/56649f0d5503460f94c2118e/html5/thumbnails/35.jpg)
Moderate Interpretation
• Think twice before implementing functionality in a given component of the system
• If an endpoint can implement functionality correctly, implement it at a lower layer only as a performance enhancement
• But do so only if it does not impose a burden on applications that do not require that functionality…
  – …and if it doesn’t cost too much to implement
  – Cost = $ or complexity
![Page 36: Distributed Systems aka Special Topics in Networking CS 7780.](https://reader036.fdocuments.net/reader036/viewer/2022062314/56649f0d5503460f94c2118e/html5/thumbnails/36.jpg)
For next week
• Read the papers!
  – I will set up HotCRP shortly
• No class Monday
• Start looking at papers to present
  – When I open bidding, it will be FCFS
  – Let me know if you want to add anything
• Start figuring out your project proposal