15-214
School of Computer Science
Principles of Software Construction: Objects, Design, and Concurrency
Part 6: Concurrency and distributed systems
Distributed Systems
Jonathan Aldrich, Charlie Garrod
Administrivia
• Homework 5b due Thursday, 11:59 p.m.
  – Finish by Friday 10 a.m. if you want to be considered as a "Best Framework" for Homework 5c
• Our evaluation considers:
  – Novelty
  – Functional correctness
  – Documentation
  – …
Key concepts from last Thursday
Concurrency at the language level
• Consider:
    int sum = 0;
    Iterator<Integer> i = coll.iterator();
    while (i.hasNext()) { sum += i.next(); }
• In Python:
    sum = 0
    for item in coll: sum += item
Parallel prefix sums algorithm, winding
• Computes the partial sums in a more useful manner
[13, 9, -4, 19, -6, 2, 6, 3]
[13, 22, -4, 15, -6, -4, 6, 9]
[13, 22, -4, 37, -6, -4, 6, 5]
[13, 22, -4, 37, -6, -4, 6, 42]
…
Parallel prefix sums algorithm, unwinding
• Now unwinds to calculate the other sums
[13, 22, -4, 37, -6, -4, 6, 42]
[13, 22, -4, 37, -6, 33, 6, 42]
[13, 22, 18, 37, 31, 33, 39, 42]
• Recall, we started with: [13, 9, -4, 19, -6, 2, 6, 3]
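The winding and unwinding passes traced above can be sketched as two nested loops. This is a minimal sequential sketch, not the course's PrefixSums implementation: the class and method names are hypothetical, and it assumes the array length is a power of two. Each inner loop touches disjoint array slots, which is what would let a parallel version run those additions concurrently.

```java
class PrefixSumsSketch {
    /** In-place inclusive prefix sums; assumes a.length is a power of two. */
    static void prefixSums(int[] a) {
        int n = a.length;
        // Winding: combine neighbors at growing distances (1, 2, 4, ...),
        // so the last slot ends up holding the total.
        for (int gap = 1; gap < n; gap *= 2) {
            for (int i = 2 * gap - 1; i < n; i += 2 * gap) {
                a[i] += a[i - gap];
            }
        }
        // Unwinding: push the finished partial sums back down to the
        // slots the winding pass skipped, at shrinking distances.
        for (int gap = n / 4; gap >= 1; gap /= 2) {
            for (int i = 3 * gap - 1; i < n; i += 2 * gap) {
                a[i] += a[i - gap];
            }
        }
    }
}
```

Calling `PrefixSumsSketch.prefixSums` on the starting array from the slide reproduces the final row shown above.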
A framework for asynchronous computation
• The java.util.concurrent.Future<V> interface
    V get();
    V get(long timeout, TimeUnit unit);
    boolean isDone();
    boolean cancel(boolean mayInterruptIfRunning);
    boolean isCancelled();
• The java.util.concurrent.ExecutorService interface
    Future<?> submit(Runnable task);
    <V> Future<V> submit(Callable<V> task);
    <V> List<Future<V>> invokeAll(Collection<? extends Callable<V>> tasks);
    <V> V invokeAny(Collection<? extends Callable<V>> tasks);
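As a small illustration of how these two interfaces fit together, the sketch below submits a Callable to an ExecutorService and blocks on the returned Future. The class name, method name, and pool size of 2 are assumptions made for the example.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class FutureSketch {
    /** Sums the array on a pool thread and waits for the result. */
    static int sumInBackground(int[] values)
            throws InterruptedException, ExecutionException {
        ExecutorService executor = Executors.newFixedThreadPool(2);
        try {
            Callable<Integer> task = () -> {
                int sum = 0;
                for (int v : values) {
                    sum += v;
                }
                return sum;
            };
            // submit() returns immediately; the caller could do other work here.
            Future<Integer> future = executor.submit(task);
            // get() blocks until the task has finished (or failed).
            return future.get();
        } finally {
            executor.shutdown(); // let the pool threads exit
        }
    }
}
```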
Fork/Join: another common computational pattern
• In a long computation:
  – Fork a thread (or more) to do some work
  – Join the thread(s) to obtain the result of the work
• The java.util.concurrent.ForkJoinPool class
  – Implements ExecutorService
  – Executes java.util.concurrent.ForkJoinTask<V> or java.util.concurrent.RecursiveTask<V> or java.util.concurrent.RecursiveAction
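The fork/join pattern might look like the following divide-and-conquer sum built on RecursiveTask. This is a sketch under assumptions: the class name and the cutoff of 1,000 elements are hypothetical choices, not taken from the course code.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

class ForkJoinSumSketch extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1_000; // hypothetical sequential cutoff
    private final int[] values;
    private final int lo;
    private final int hi;

    ForkJoinSumSketch(int[] values, int lo, int hi) {
        this.values = values;
        this.lo = lo;
        this.hi = hi;
    }

    @Override
    protected Long compute() {
        if (hi - lo <= THRESHOLD) {
            long sum = 0;
            for (int i = lo; i < hi; i++) {
                sum += values[i];
            }
            return sum;
        }
        int mid = lo + (hi - lo) / 2;
        ForkJoinSumSketch left = new ForkJoinSumSketch(values, lo, mid);
        ForkJoinSumSketch right = new ForkJoinSumSketch(values, mid, hi);
        left.fork();                     // fork: run the left half asynchronously
        long rightSum = right.compute(); // do the right half in this thread
        return left.join() + rightSum;   // join: wait for the forked half
    }

    static long sum(int[] values) {
        return ForkJoinPool.commonPool()
                .invoke(new ForkJoinSumSketch(values, 0, values.length));
    }
}
```

Computing one half directly instead of forking both keeps the current worker thread busy rather than idle while it waits.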
Parallel prefix sums algorithm
• How good is this?
  – Work: O(n)
  – Depth: O(lg n)
• See PrefixSumsSequentialImpl.java
  – n-1 additions
  – Memory access is sequential
• For PrefixSumsNonsequentialImpl.java
  – About 2n useful additions, plus extra additions for the loop indexes
  – Memory access is non-sequential
• The punchline: Constants matter.
Today: Distributed system design
• Java networking fundamentals
• Introduction to distributed systems
  – Motivation: reliability and scalability
  – Failure models
  – Techniques for:
    • Reliability (availability)
    • Scalability
    • Consistency
Our destination: Distributed systems
• Multiple system components (computers) communicating via some medium (the network)
• Challenges:
  – Heterogeneity
  – Scale
  – Geography
  – Security
  – Concurrency
  – Failures
(courtesy of http://www.cs.cmu.edu/~dga/15-440/F12/lectures/02-internet1.pdf)
Communication protocols
• Agreement between parties for how communication should take place
Friendly greeting.
Muttered reply.
Destination?
Pittsburgh.
Thank you.
(courtesy of http://www.cs.cmu.edu/~dga/15-440/F12/lectures/02-internet1.pdf)
Abstractions of a network connection
HTML | Text | JPG | GIF | PDF | …
HTTP | FTP | …
TCP | UDP | …
IP
data link layer
physical layer
Internet addresses and sockets
• For IP version 4 (IPv4) host address is a 4-byte number
  – e.g. 127.0.0.1
  – Hostnames mapped to host IP addresses via DNS
  – ~4 billion distinct addresses
• Port is a 16-bit number (0-65535)
  – Assigned conventionally
    • e.g., port 80 is the standard port for web servers
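To make the sizes concrete, here is a small sketch (class and method names are hypothetical) that turns four raw bytes into a dotted-quad address and counts how many values fit in 32 and 16 bits. `InetAddress.getByAddress` does not perform a DNS lookup.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class AddressSketch {
    /** Formats four raw bytes as a dotted-quad IPv4 address, with no DNS lookup. */
    static String dottedQuad(byte[] fourBytes) throws UnknownHostException {
        return InetAddress.getByAddress(fourBytes).getHostAddress();
    }

    /** Number of distinct values representable in the given number of bits (bits < 63). */
    static long distinctValues(int bits) {
        return 1L << bits;
    }
}
```

With these helpers, `distinctValues(32)` is 4,294,967,296 (the "~4 billion" addresses) and `distinctValues(16)` is 65,536 (ports 0 through 65535).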
Packet-oriented and stream-oriented connections
• UDP: User Datagram Protocol
  – Unreliable, discrete packets of data
• TCP: Transmission Control Protocol
  – Reliable data stream
Networking in Java
• The java.net.InetAddress:
    static InetAddress getByName(String host);
    static InetAddress getByAddress(byte[] b);
    static InetAddress getLocalHost();
• The java.net.Socket:
    Socket(InetAddress addr, int port);
    boolean isConnected();
    boolean isClosed();
    void close();
    InputStream getInputStream();
    OutputStream getOutputStream();
• The java.net.ServerSocket:
    ServerSocket(int port);
    Socket accept();
    void close();
    …
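A minimal sketch of these classes working together: a one-shot echo server and a client talking over the loopback interface. It is not the course's NetworkServer demo; the class name, the "echo: " reply format, and the use of port 0 (let the OS choose a free port) are assumptions for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class EchoSketch {
    /** Starts a one-shot echo server, sends it one line, and returns the reply. */
    static String roundTrip(String message) throws IOException, InterruptedException {
        try (ServerSocket server = new ServerSocket(0)) { // port 0: OS picks a free port
            Thread serverThread = new Thread(() -> {
                // accept() blocks until the client below connects.
                try (Socket conn = server.accept();
                     BufferedReader in = new BufferedReader(
                             new InputStreamReader(conn.getInputStream()));
                     PrintWriter out = new PrintWriter(conn.getOutputStream(), true)) {
                    out.println("echo: " + in.readLine());
                } catch (IOException e) {
                    e.printStackTrace();
                }
            });
            serverThread.start();

            String reply;
            try (Socket client = new Socket(InetAddress.getLoopbackAddress(),
                                            server.getLocalPort());
                 PrintWriter out = new PrintWriter(client.getOutputStream(), true);
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(client.getInputStream()))) {
                out.println(message);  // auto-flushed because of the 'true' flag
                reply = in.readLine(); // blocks until the server answers
            }
            serverThread.join();
            return reply;
        }
    }
}
```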
Simple sockets demos
• NetworkServer.java
• A basic chat system:
  – TransferThread.java
  – TextSocketClient.java
  – TextSocketServer.java
Higher levels of abstraction
• Application-level communication protocols
• Frameworks for simple distributed computation
  – Remote Procedure Call (RPC)
  – Java Remote Method Invocation (RMI)
• Common patterns of distributed system design
• Complex computational frameworks
  – e.g., distributed map-reduce
Today
• Java networking fundamentals
• Introduction to distributed systems
  – Motivation: reliability and scalability
  – Failure models
  – Techniques for:
    • Reliability (availability)
    • Scalability
    • Consistency
Aside: The robustness vs. redundancy curve
[Figure: curve of robustness vs. redundancy, with the shape marked "?"]
Metrics of success
• Reliability
  – Often in terms of availability: fraction of time system is working
    • 99.999% available is "5 nines of availability"
• Scalability
  – Ability to handle workload growth
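To get a feel for what "5 nines" allows, the arithmetic can be sketched as below. The class and method names are hypothetical, and the calculation assumes a 365-day year.

```java
class AvailabilitySketch {
    // 365 days * 24 hours * 60 minutes = 525,600 (leap years ignored)
    static final double MINUTES_PER_YEAR = 365 * 24 * 60;

    /** Allowed downtime per year for an availability given as a fraction, e.g. 0.99999. */
    static double downtimeMinutesPerYear(double availability) {
        return (1.0 - availability) * MINUTES_PER_YEAR;
    }
}
```

Under that assumption, 99.999% availability leaves roughly 5.3 minutes of downtime per year, while 99.9% leaves roughly 526 minutes (under 9 hours).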
A case study: Passive primary-backup replication
• Architecture before replication:
  [Diagram: clients connect to front-ends; the front-ends connect to a single database server holding {alice:90, bob:42, …}]
  – Problem: Database server might fail
• Solution: Replicate data onto multiple servers
  [Diagram: clients connect to front-ends; the data {alice:90, bob:42, …} is held by a primary and by two backups]
Passive primary-backup replication protocol
1. Front-end issues request with unique ID to primary DB
2. Primary checks request ID
   – If already executed request, re-send response and exit protocol
3. Primary executes request and stores response
4. If request is an update, primary DB sends updated state, ID, and response to all backups
   – Each backup sends an acknowledgement
5. After receiving all acknowledgements, primary DB sends response to front-end
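The five steps above can be sketched in a single process, with method calls standing in for network messages. Everything here is hypothetical (class names, the response format, a boolean return value as the acknowledgement), and the sketch ignores failures entirely.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PrimaryBackupSketch {
    /** A backup stores whatever state and responses the primary pushes to it. */
    static class Backup {
        final Map<String, Integer> state = new HashMap<>();
        final Map<String, String> responses = new HashMap<>();

        boolean applyUpdate(Map<String, Integer> newState, String requestId, String response) {
            state.clear();
            state.putAll(newState);
            responses.put(requestId, response);
            return true; // stands in for the acknowledgement message
        }
    }

    static class Primary {
        final Map<String, Integer> state = new HashMap<>();
        final Map<String, String> responses = new HashMap<>();
        final List<Backup> backups = new ArrayList<>();

        /** Step 1 is the call itself: a front-end sends an update with a unique ID. */
        String handleUpdate(String requestId, String key, int value) {
            // Step 2: a request seen before gets the stored response, with no re-execution.
            String previous = responses.get(requestId);
            if (previous != null) {
                return previous;
            }
            // Step 3: execute the request and store the response.
            state.put(key, value);
            String response = "OK " + key + "=" + value;
            responses.put(requestId, response);
            // Step 4: send updated state, ID, and response to every backup.
            for (Backup backup : backups) {
                if (!backup.applyUpdate(state, requestId, response)) {
                    throw new IllegalStateException("backup did not acknowledge");
                }
            }
            // Step 5: all acknowledgements received, so answer the front-end.
            return response;
        }
    }
}
```

Storing responses by request ID is what makes a retried request safe: sending the same ID twice returns the first response instead of applying the update again.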
Issues with passive primary-backup replication
• If primary DB crashes, front-ends need to agree upon which unique backup is new primary DB
  – Primary failure vs. network failure?
• If backup DB becomes new primary, surviving replicas must agree on current DB state
• If backup DB crashes, primary must detect failure to remove the backup from the cluster
  – Backup failure vs. network failure?
• If replica fails* and recovers, it must detect that it previously failed
• Many subtle issues with partial failures
• …
More issues…
• Concurrency problems?
  – Out of order message delivery?
    • Time…
• Performance problems?
  – 2n messages for n replicas
  – Failure of any replica can delay response
  – Routine network problems can delay response
• Scalability problems?
  – All replicas are written for each update
  – Primary DB responds to every request
Today
• Java networking fundamentals
• Introduction to distributed systems
  – Motivation: reliability and scalability
  – Failure models
  – Techniques for:
    • Reliability (availability)
    • Scalability
    • Consistency
Types of failure behaviors
• Fail-stop
• Other halting failures
• Communication failures
  – Send/receive omissions
  – Network partitions
  – Message corruption
• Data corruption
• Performance failures
  – High packet loss rate
  – Low throughput
  – High latency
• Byzantine failures
Common assumptions about failures
• Behavior of others is fail-stop (ugh)
• Network is reliable (ugh)
• Network is semi-reliable but asynchronous
• Network is lossy but messages are not corrupt
• Network failures are transitive
• Failures are independent
• Local data is not corrupt
• Failures are reliably detectable
• Failures are unreliably detectable
Some distributed system design goals
• The end-to-end principle
  – When possible, implement functionality at the end nodes (rather than the middle nodes) of a distributed system
• The robustness principle
  – Be strict in what you send, but be liberal in what you accept from others
    • Protocols
    • Failure behaviors
• Benefit from incremental changes
• Be redundant
  – Data replication
  – Checks for correctness
Replication for scalability: Client-side caching
• Architecture before replication:
  [Diagram: clients connect to front-ends; the front-ends connect to a single database server holding {alice:90, bob:42, …}]
  – Problem: Server throughput is too low
• Solution: Cache responses at (or near) the client
  – Cache can respond to repeated read requests
  [Diagram: the same architecture with two caches added on the client side]
Replication for scalability: Client-side caching
• Hierarchical client-side caches:
  [Diagram: clients reach the front-ends and the database server {alice:90, bob:42, …} through several levels of caches]
Replication for scalability: Server-side caching
• Architecture before replication:
  [Diagram: clients connect to front-ends; the front-ends connect to a single database server holding {alice:90, bob:42, …}]
  – Problem: Database server throughput is too low
• Solution: Cache responses on multiple servers
  – Cache can respond to repeated read requests
  [Diagram: the same architecture with three caches added on the server side]
Cache invalidation
• Time-based invalidation (a.k.a. expiration)
  – Read-any, write-one
  – Old cache entries automatically discarded
  – No expiration date needed for read-only data
• Update-based invalidation
  – Read-any, write-all
  – DB server broadcasts invalidation message to all caches when the DB is updated
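Both invalidation styles can be sketched in one small read-through cache. The class name, the `Function` standing in for a database read, and the injectable clock (used so behavior can be checked without sleeping) are all assumptions for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

class ExpiringCacheSketch<K, V> {
    private static final class Entry<V> {
        final V value;
        final long expiresAtMillis;

        Entry(V value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<K, Entry<V>> entries = new HashMap<>();
    private final Function<K, V> backingStore; // stands in for a read from the DB server
    private final long ttlMillis;
    private final LongSupplier clock;          // e.g. System::currentTimeMillis

    ExpiringCacheSketch(Function<K, V> backingStore, long ttlMillis, LongSupplier clock) {
        this.backingStore = backingStore;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /** Time-based invalidation: entries older than the TTL are re-read from the store. */
    V get(K key) {
        long now = clock.getAsLong();
        Entry<V> entry = entries.get(key);
        if (entry == null || now >= entry.expiresAtMillis) {
            entry = new Entry<>(backingStore.apply(key), now + ttlMillis);
            entries.put(key, entry);
        }
        return entry.value;
    }

    /** Update-based invalidation: the DB server would trigger this on every cache. */
    void invalidate(K key) {
        entries.remove(key);
    }
}
```

With time-based expiration alone, a reader may see a stale value until the TTL runs out; calling `invalidate` on every cache at update time removes that window at the cost of a broadcast per write.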
Cache replacement policies
• Problem: caches have finite size
• Common* replacement policies
  – Optimal (Belady's) policy
    • Discard item not needed for longest time in future
  – Least Recently Used (LRU)
    • Track time of previous access, discard item accessed least recently
  – Least Frequently Used (LFU)
    • Count # times item is accessed, discard item accessed least frequently
  – Random
    • Discard a random item from the cache
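As one concrete instance, an LRU policy can be sketched on top of java.util.LinkedHashMap, which can keep its entries in access order and offers a removeEldestEntry hook. The class name and fixed capacity are assumptions for the example, and the sketch is not thread-safe.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class LruCacheSketch<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCacheSketch(int capacity) {
        // accessOrder = true: iteration order runs from least to most recently accessed.
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    /** Called by put(); returning true discards the least recently used entry. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }
}
```

For example, with capacity 2, putting alice and bob, reading alice, and then putting deb evicts bob, because bob is the entry accessed least recently.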
Partitioning for scalability
• Partition data based on some property, put each partition on a different server
  [Diagram: clients connect to front-ends, which contact one of three servers:
    CMU server: {cohen:9, bob:42, …}
    Yale server: {alice:90, pete:12, …}
    MIT server: {deb:16, reif:40, …}]
Horizontal partitioning
• a.k.a. "sharding"
• A table of data:

  username  school  value
  cohen     CMU     9
  bob       CMU     42
  alice     Yale    90
  pete      Yale    12
  deb       MIT     16
  reif      MIT     40
Recall: Basic hash tables
• For n-size hash table, put each item X in the bucket: X.hashCode() % n
  [Figure: a 13-bucket table (indexes 0 to 12) holding {reif:40}, {bob:42}, {pete:12}, {deb:16}, {alice:90}, and {cohen:9}]
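One Java-specific detail when turning the bucket formula into code: hashCode() may return a negative int, and Java's % operator keeps the sign of the dividend, so a literal `x.hashCode() % n` can produce a negative index. A minimal sketch (hypothetical class and method names) that always lands in the range 0 to n-1:

```java
class BucketSketch {
    /** Bucket index in [0, n) for any item, including items with negative hash codes. */
    static int bucketFor(Object item, int n) {
        return Math.floorMod(item.hashCode(), n);
    }
}
```

For non-negative hash codes, `Math.floorMod` and `%` agree, so this matches the formula on the slide in that case.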
Partitioning with a distributed hash table
• Each server stores data for one bucket
• To store or retrieve an item, front-end server hashes the key, contacts the server storing that bucket
  [Diagram: clients connect to front-ends, which contact:
    Server 0: {reif:40}
    Server 1: { }
    Server 3: {bob:42}
    Server 5: {pete:12, alice:90}
    …]
Consistent hashing
• Goal: Benefit from incremental changes
  – Resizing the hash table (i.e., adding or removing a server) should not require moving many objects
• E.g., Interpret the range of hash codes as a ring
  – Each bucket stores data for a range of the ring
    • Assign each bucket an ID in the range of hash codes
    • To store item X don't compute X.hashCode() % n. Instead, place X in bucket with the same ID as or next higher ID than X.hashCode()
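The ring lookup described above maps naturally onto a sorted map. The sketch below is one possible reading, with hypothetical names; it takes the hash code as a plain int, and wrapping around to the lowest bucket ID when no higher ID exists is what makes the range a ring.

```java
import java.util.Map;
import java.util.TreeMap;

class ConsistentHashSketch {
    // Bucket ID (a point on the ring of hash codes) -> server name
    private final TreeMap<Integer, String> ring = new TreeMap<>();

    void addBucket(int id, String server) {
        ring.put(id, server);
    }

    void removeBucket(int id) {
        ring.remove(id);
    }

    /** Server whose bucket ID equals the hash code or is the next higher ID. */
    String serverFor(int hashCode) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("no buckets on the ring");
        }
        Map.Entry<Integer, String> entry = ring.ceilingEntry(hashCode);
        // Past the highest bucket ID: wrap around to the lowest one.
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }
}
```

With buckets at 100, 200, and 300, adding a bucket at 250 only reassigns hash codes in the range 201 to 250; every other item stays on its current server, which is the incremental-change property the slide asks for.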
Problems with hash-based partitioning
• Front-ends need to determine server for each bucket
  – Each front-end stores look-up table?
  – Master server storing look-up table?
  – Routing-based approaches?
• Places related content on different servers
  – Consider range queries: SELECT * FROM users WHERE lastname LIKE 'G%'
Master/tablet-based systems
• Dynamically allocate range-based partitions
  – Master server maintains tablet-to-server assignments
  – Tablet servers store actual data
  – Front-ends cache tablet-to-server assignments
  [Diagram: clients connect to front-ends, which contact the master and the tablet servers:
    Master: {a-c:[2], d-g:[3,4], h-j:[3], k-z:[1]}
    Tablet server 1: k-z: {pete:12, reif:42}
    Tablet server 2: a-c: {alice:90, bob:42, cohen:9}
    Tablet server 3: d-g: {deb:16}, h-j: { }
    Tablet server 4: d-g: {deb:16}]
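The master's tablet-to-server table in the diagram can be sketched as a sorted map keyed by the first letter of each range. This is an illustration under assumptions: the class and method names are hypothetical, usernames are taken to be lowercase, and ranges are assumed contiguous so that each range is identified by its starting letter alone.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class TabletMasterSketch {
    // First letter of a tablet's range -> IDs of the tablet servers holding it
    private final TreeMap<Character, List<Integer>> assignments = new TreeMap<>();

    void assign(char rangeStart, Integer... servers) {
        assignments.put(rangeStart, Arrays.asList(servers));
    }

    /** Servers for the tablet whose range contains the username's first letter. */
    List<Integer> serversFor(String username) {
        Map.Entry<Character, List<Integer>> entry =
                assignments.floorEntry(username.charAt(0));
        if (entry == null) {
            throw new IllegalArgumentException("no tablet covers " + username);
        }
        return entry.getValue();
    }
}
```

Loading the diagram's assignments (a to server 2, d to servers 3 and 4, h to server 3, k to server 1) sends a lookup for deb to servers 3 and 4. A front-end could cache these answers, as the slide suggests, and contact the master only on a miss.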
Coming next…
• More distributed systems – MapReduce