On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I...
Transcript of On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I...
![Page 1: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/1.jpg)
On Data Placement in Distributed SystemsLADIS’14
Joao Paiva, Luıs Rodrigues{joao.paiva, ler}@tecnico.ulisboa.pt
Instituto Superior Tecnico / INESC-ID, Lisboa, Portugal
October 23, 2014
![Page 2: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/2.jpg)
What is Data Placement?
I Deciding how to assign data items to nodes in a distributedsystem in such way that they can be later retrieved.
![Page 3: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/3.jpg)
Data Placement Affects
Data Access Locality
Placing correlated data together can reduce latency of operations
Load Balancing
By knowing the workload, data can be placed in a way to even outthe load across all nodes
Availability
Data can be replicated depending on probability of node failure
![Page 4: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/4.jpg)
Constraints to data placement practicality
I Lack of flexibility limits data placement improvements
I Scalability imposes limits on the flexibility of placement
Example
I Using a centralized directory is flexible, but not scalable
I Using consistent hashing is scalable, but not flexible
![Page 5: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/5.jpg)
Constraints to data placement practicality
I Lack of flexibility limits data placement improvements
I Scalability imposes limits on the flexibility of placement
Example
I Using a centralized directory is flexible, but not scalable
I Using consistent hashing is scalable, but not flexible
![Page 6: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/6.jpg)
Main Goal
Provide better options between
I Strong flexibility, limited scalability
I Limited flexibility, good scalability
![Page 7: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/7.jpg)
Two Scenarios
Internet Scale
I Millions of nodes
I Short term connections
I Asymmetric, inconstant network
Datacenter Scale
I Thousands of nodes
I Stable connections
I Controlled network infrastructure
![Page 8: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/8.jpg)
Two Scenarios: Previous state of the art
Internet Scale
I Scalable solutions with little flexibility, concerned with churn
Datacenter Scale
I Very flexible solutions, concerned with workload changes
![Page 9: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/9.jpg)
Summary of Findings
Improvements for both scenarios:
I More flexible solution for Internet-Scale
I More scalable solution for Datacenter-Scale
![Page 10: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/10.jpg)
Outline
Introduction
Internet Scale
Datacenter Scale
Conclusion
![Page 11: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/11.jpg)
Internet Scale: Rollerchain
I Data assigned to node groups
I Variable replication degree
I Nodes have no fixed position
![Page 12: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/12.jpg)
Variable Replication Degree
![Page 13: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/13.jpg)
Variable Replication Degree
![Page 14: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/14.jpg)
Variable Replication Degree
![Page 15: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/15.jpg)
Variable Replication Degree
![Page 16: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/16.jpg)
Variable Replication Degree
![Page 17: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/17.jpg)
Variable Replication Degree
![Page 18: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/18.jpg)
Variable Replication Degree
![Page 19: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/19.jpg)
Nodes have no fixed position
![Page 20: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/20.jpg)
Nodes have no fixed position
![Page 21: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/21.jpg)
Nodes have no fixed position
![Page 22: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/22.jpg)
Internet Scale: Implementation
Rollerchain
I Gossip-based and structured overlay
I Better churn resilience than state of the art
I Decreased replication costs
”Rollerchain: a DHT for Efficient Replication”, Joao Paiva, Joao Leitao and Luıs
Rodrigues, Symposium on Network Computing and Applications (IEEE NCA), August
2013. (Best student paper award)
![Page 23: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/23.jpg)
Internet Scale: Implementation
Data Placement Policies
I Avoid-Surplus: Reducing monitoring costs
I Resilient Load-Balancing : Improving load balancing
I Supersize-me: Reducing replication costs
Read the paper to know the best policies:
”Policies for Efficient Data Replication in P2P Systems”, Joao Paiva, and Luıs
Rodrigues, International Conference on Parallel and Distributed Systems (IEEE
ICPADS), December 2013.
![Page 24: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/24.jpg)
Internet Scale: Implementation
Data Placement Policies
I Avoid-Surplus: Reducing monitoring costs
I Resilient Load-Balancing : Improving load balancing
I Supersize-me: Reducing replication costs
Read the paper to know the best policies:
”Policies for Efficient Data Replication in P2P Systems”, Joao Paiva, and Luıs
Rodrigues, International Conference on Parallel and Distributed Systems (IEEE
ICPADS), December 2013.
![Page 25: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/25.jpg)
Outline
Introduction
Internet Scale
Datacenter Scale
Conclusion
![Page 26: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/26.jpg)
Datacenter scale: AutoPlacer
System where data placement is defined by combining:
I Consistent hashing for most items
I Precise placement for selected items
Locality-improving round-based algorithm for in-memory data grids
”AutoPlacer: scalable self-tuning data placement in distributed key-value stores”,
J. Paiva, P. Ruivo, P. Romano and L. Rodrigues, International Conference on
Autonomic Computing (USENIX ICAC), June 2013. (Best paper finalist)
”AutoPlacer: scalable self-tuning data placement in distributed key-value stores”,
J. Paiva, P. Ruivo, P. Romano and L. Rodrigues, ACM Transactions on Autonomous
and Adaptive Systems (ACM TAAS)
![Page 27: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/27.jpg)
Datacenter scale: AutoPlacer
System where data placement is defined by combining:
I Consistent hashing for most items
I Precise placement for selected items
Locality-improving round-based algorithm for in-memory data grids
”AutoPlacer: scalable self-tuning data placement in distributed key-value stores”,
J. Paiva, P. Ruivo, P. Romano and L. Rodrigues, International Conference on
Autonomic Computing (USENIX ICAC), June 2013. (Best paper finalist)
”AutoPlacer: scalable self-tuning data placement in distributed key-value stores”,
J. Paiva, P. Ruivo, P. Romano and L. Rodrigues, ACM Transactions on Autonomous
and Adaptive Systems (ACM TAAS)
![Page 28: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/28.jpg)
Algorithm overview
Online, round-based approach:
1. Statistics: Monitor data access to collect hotspots
2. Optimization: Decide placement for hotspots
3. Lookup: Encode / broadcast data placement
4. Move data
![Page 29: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/29.jpg)
Algorithm overview
Online, round-based approach:
1. Statistics: Monitor data access to collect hotspots
2. Optimization: Decide placement for hotspots
3. Lookup: Encode / broadcast data placement
4. Move data
![Page 30: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/30.jpg)
Statistics: Data access monitoring
Key concept: Top-K stream analysis algorithm
I Lightweight
I Sub-linear space usage
I Inaccurate result... But with bounded error
![Page 31: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/31.jpg)
Statistics: Data access monitoring
Key concept: Top-K stream analysis algorithm
I Lightweight
I Sub-linear space usage
I Inaccurate result... But with bounded error
![Page 32: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/32.jpg)
Statistics: Data access monitoring
Key concept: Top-K stream analysis algorithm
I Lightweight
I Sub-linear space usage
I Inaccurate result... But with bounded error
![Page 33: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/33.jpg)
Algorithm overview
Online, round-based approach:
1. Statistics: Monitor data access to collect hotspots
2. Optimization: Decide placement for hotspots
3. Lookup: Encode / broadcast data placement
4. Move data
![Page 34: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/34.jpg)
Optimization
Integer Linear Programming problem formulation:
min∑j∈N
∑i∈O
X ij(crr rij + crwwij) + Xij(cl
r rij + clwwij) (1)
subject to:
∀i ∈ O :∑j∈N
Xij = d ∧ ∀j ∈ N :∑i∈O
Xij ≤ Sj
Inaccurate input:
I Does not provide optimal placement
I Upper-bound on error
![Page 35: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/35.jpg)
Accelerating optimization
1. ILP Relaxed to Linear Programming problem
2. Distributed optimization
LP relaxation
I Allow data item ownership to be in [0− 1] interval
Distributed Optimization
I Partition by the N nodes
I Each node optimizes hotspots mapped to it by CH
I Strengthen capacity constraint
![Page 36: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/36.jpg)
Algorithm overview
Online, round-based approach:
1. Statistics: Monitor data access to collect hotspots
2. Optimization: Decide placement for hotspots
3. Lookup: Encode / broadcast data placement
4. Move data
![Page 37: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/37.jpg)
Lookup: Encoding placement
Probabilistic Associative Array (PAA)
I Associative array interface (keys→values)
I Probabilistic and space-efficient
I Trade-off space usage for accuracy
![Page 38: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/38.jpg)
Probabilistic Associative Array: Usage
Building
1. Build PAA from hotspot mappings
2. Broadcast PAA
Looking up objects
I If item is hotspot, return PAA mapping
I Otherwise, default to Consistent Hashing
![Page 39: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/39.jpg)
Probabilistic Associative Array: Usage
Building
1. Build PAA from hotspot mappings
2. Broadcast PAA
Looking up objects
I If item is hotspot, return PAA mapping
I Otherwise, default to Consistent Hashing
![Page 40: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/40.jpg)
PAA: Building blocks
I Bloom FilterSpace-efficient membership test (is item in PAA?)
I Decision tree classifierSpace-efficient mapping (where is hotspot mapped to?)
![Page 41: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/41.jpg)
PAA: Building blocks
I Bloom FilterSpace-efficient membership test (is item in PAA?)
I Decision tree classifierSpace-efficient mapping (where is hotspot mapped to?)
![Page 42: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/42.jpg)
PAA: Properties
Bloom Filter:
I No False Negatives: never return ⊥ for items in PAA.
I False Positives: match items that it was not supposed to.
Decision tree classifier:
I Inaccurate values (bounded error).
I Deterministic response: deterministic (item→node)mapping.
![Page 43: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/43.jpg)
PAA: Properties
Bloom Filter:
I No False Negatives: never return ⊥ for items in PAA.
I False Positives: match items that it was not supposed to.
Decision tree classifier:
I Inaccurate values (bounded error).
I Deterministic response: deterministic (item→node)mapping.
![Page 44: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/44.jpg)
PAA: Properties
Bloom Filter:
I No False Negatives: never return ⊥ for items in PAA.
I False Positives: match items that it was not supposed to.
Decision tree classifier:
I Inaccurate values (bounded error).
I Deterministic response: deterministic (item→node)mapping.
![Page 45: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/45.jpg)
Outline
Introduction
Internet Scale
Datacenter ScaleAutoplacerEvaluation
ConclusionConclusion
![Page 46: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/46.jpg)
Evaluation: Throughput
10
100
1000
0 5 10 15 20 25 30
Tran
sact
ions
per
sec
ond
(TX/
s)
Time (minutes)
100% locality90% locality50% locality
0% localitybaseline
![Page 47: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/47.jpg)
Evaluation: Optimization
![Page 48: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/48.jpg)
Outline
Introduction
Internet Scale
Datacenter Scale
Conclusion
![Page 49: On Data Placement in Distributed Systems - LADIS'14jgpaiva/pubs/ladis14-presentation.pdf · I Supersize-me: Reducing replication costs Read the paper to know the best policies: "Policies](https://reader036.fdocuments.net/reader036/viewer/2022071023/5fd81f512bc9805c445066ad/html5/thumbnails/49.jpg)
Conclusions
Internet Scale
I More flexible overlay for data placement
I Policies to improve metrics using added flexibility
Datacenter Scale
I Scalable mechanism for data placement
I Algorithm to improve locality through hotspot placement