Distributed Systems
Sukumar Ghosh
Department of Computer Science
University of Iowa
Definition?
A distributed system is one in which I can’t do my work, because some computer that I’ve never even heard of, has failed
(Leslie Lamport)
Distributed Systems
Network of processes communicating with one another
to meet some objective.
Growth and innovations fueled by
Declining hardware cost and improved device functionality
Better networking facility
Our dreams
Distributed Systems
Traditional Client server systems
Peer to peer networks
Communicating micro-robots
Sensor networks
Vehicular networks
A client-server system
[Figure: several clients connected to a single server S]
(boring …)
Communicating micro-robots
Courtesy: the iSwarm project at the University of Karlsruhe
Numerous Challenges
Processes have local views, but the goals are global.
Failures and perturbations are expected events and not
catastrophic exceptions!
Clocks are not perfectly synchronized
The topology may change from time to time
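One common response to imperfectly synchronized clocks is to order events with logical clocks instead of physical time. The following is a minimal sketch of a Lamport-style logical clock, not something taken from these slides; the class name LamportClock and the two processes p and q are illustrative assumptions.

```python
class LamportClock:
    """Logical clock: orders events without synchronized physical clocks."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # Local event: advance the counter by one.
        self.time += 1
        return self.time

    def send(self):
        # Sending is itself an event; the returned timestamp travels
        # with the message.
        return self.tick()

    def receive(self, msg_time):
        # Move past the sender's timestamp, then count the receive event.
        self.time = max(self.time, msg_time) + 1
        return self.time


# Two hypothetical processes, p and q.
p, q = LamportClock(), LamportClock()
p.tick()
p.tick()
stamp = p.send()          # p has now counted three events
q.tick()                  # q has counted one event
recv = q.receive(stamp)   # q jumps ahead of the message's timestamp
```

The receive event always gets a larger timestamp than the matching send, so causally related events are ordered consistently even though p and q never compare wall-clock time.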
Replicated servers
[Figure: a single server S with its clients (client-server), beside replicated servers S0, S1, S2, S3 with their clients (replicated client-server)]
not so easy
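One reason replication is harder than it looks, offered here as an illustration and not taken from the slides: if two replicas receive the same updates in different orders, their states can diverge. The sketch below assumes a hypothetical bank-balance state and two made-up operations, "deposit" and "add_interest".

```python
def apply_updates(initial, updates):
    """Apply a sequence of (operation, amount) updates to a balance."""
    balance = initial
    for op, amount in updates:
        if op == "deposit":
            balance += amount
        elif op == "add_interest":
            # amount is a whole-number percentage; integer arithmetic
            # keeps the example exact.
            balance += balance * amount // 100
    return balance


updates = [("deposit", 100), ("add_interest", 10)]

# Replica S0 sees the updates in the order they were issued.
s0 = apply_updates(1000, updates)

# Replica S1 sees the same updates, but in the opposite order.
s1 = apply_updates(1000, list(reversed(updates)))

replicas_agree = (s0 == s1)
```

Both replicas applied the same set of updates, yet they end with different balances. Keeping replicas consistent therefore requires agreement on the order of updates, or updates that commute.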
Vehicular Networks
Applications
Accident alerts/prevention
Dynamic route planning
Entertainment
[Figure: vehicles communicating vehicle-to-vehicle, with roadside infrastructure, and over cellular links to the Internet]
Communications
Cellular network
Vehicle to roadside
Vehicle to vehicle
Topics to explore
Designing fault-tolerant distributed systems
(The term “fault” has a wide scope. It does not necessarily mean a crash, but includes selfishness, malicious behavior, node mobility, environmental changes, etc.)
Topics to explore
To prevent disruptions caused by failures and perturbations, distributed systems must learn to manage themselves without external intervention (which is often costly, and sometimes not practical). This means that most non-trivial distributed systems must satisfy one or more of the following properties:
self-organization, self-healing, self-stabilization, self-optimization, etc.
(These are yardsticks of “smartness”)
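As one concrete instance of self-stabilization, here is a minimal sketch of Dijkstra's K-state token ring, simulated with one machine moving at a time. It is not taken from the slides; the function names, the ring size n = 5, and K = 6 are illustrative assumptions. A machine holds a "privilege" (token) when its guard is true, and a legitimate state has exactly one privilege.

```python
import random


def privileged(x):
    """Return the indices of machines that currently hold a privilege."""
    n = len(x)
    out = [0] if x[0] == x[-1] else []
    out += [i for i in range(1, n) if x[i] != x[i - 1]]
    return out


def step(x, k, rng):
    """Let one randomly chosen privileged machine make its move."""
    i = rng.choice(privileged(x))   # at least one machine is always privileged
    if i == 0:
        x[0] = (x[0] + 1) % k       # machine 0 increments modulo K
    else:
        x[i] = x[i - 1]             # other machines copy their predecessor


def stabilize(x, k, rng, max_steps=10_000):
    """Run until exactly one privilege remains; return the steps taken."""
    steps = 0
    while len(privileged(x)) != 1 and steps < max_steps:
        step(x, k, rng)
        steps += 1
    return steps


rng = random.Random(0)
n, k = 5, 6                                # assumes K >= n
ring = [rng.randrange(k) for _ in range(n)]  # arbitrary (possibly faulty) start
steps_needed = stabilize(ring, k, rng)
```

Starting from an arbitrary state, the ring reaches a configuration with a single circulating token and stays there, with no external reset. That is the sense in which the system repairs itself.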
Topics to explore
“Scalable algorithms” for distributed systems. Some large-scale systems have millions of nodes. Will your solution be practical at that scale?
Dealing with “big data” in distributed systems (cloud computing, MapReduce, Hadoop, etc.)
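The MapReduce idea mentioned above can be sketched in a few lines of plain Python. This is a single-machine illustration of the map, shuffle, and reduce phases, not the Hadoop API; all function names are illustrative assumptions.

```python
from collections import defaultdict


def map_phase(document):
    # Emit a (word, 1) pair for every word in one document.
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    # Group the emitted values by key, as the framework would do
    # between the map and reduce phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    # Combine all values for each key into a single count.
    return {key: sum(values) for key, values in groups.items()}


def word_count(documents):
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return reduce_phase(shuffle(pairs))


counts = word_count(["clocks drift", "Clocks are not synchronized"])
```

In a real deployment each call to map_phase could run on a different node, and the shuffle would move data across the network. The appeal of the model is that the programmer writes only the map and reduce functions.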
Topics to explore
The goal is to guarantee that the system will work in real life. If it does not, then you have to question and revisit the model assumptions, algorithm correctness etc.
theory ↔ practice
Graduate courses
If you are interested in such topics, then consider taking:
(Fall 2012)
22C:166 Distributed Systems and Algorithms (Sukumar Ghosh)
22C:196 Sensing the world (Octav Chipara)
(Other semesters)
22C:196 Parallel and Distributed Programming: Forms and Limits
Cloud Computing (Ted Herman)
Sensor Networks (Ted Herman)
Advanced Distributed Algorithms (Sriram Pemmaraju)