ClouDiA: A Deployment Advisor for Public Clouds Tao Zou, Ronan Le Bras, Marcos Vaz Salles*, Alan...
-
Upload
philip-ferguson -
Category
Documents
-
view
216 -
download
0
Transcript of ClouDiA: A Deployment Advisor for Public Clouds Tao Zou, Ronan Le Bras, Marcos Vaz Salles*, Alan...
![Page 1: ClouDiA: A Deployment Advisor for Public Clouds Tao Zou, Ronan Le Bras, Marcos Vaz Salles*, Alan Demers, Johannes Gehrke Cornell University *University.](https://reader035.fdocuments.net/reader035/viewer/2022081519/56649e5c5503460f94b5498a/html5/thumbnails/1.jpg)
ClouDiA: A Deployment Advisor for Public Clouds
Tao Zou, Ronan Le Bras, Marcos Vaz Salles*, Alan Demers, Johannes Gehrke
Cornell University*University of Copenhagen (DIKU)
![Page 2: ClouDiA: A Deployment Advisor for Public Clouds Tao Zou, Ronan Le Bras, Marcos Vaz Salles*, Alan Demers, Johannes Gehrke Cornell University *University.](https://reader035.fdocuments.net/reader035/viewer/2022081519/56649e5c5503460f94b5498a/html5/thumbnails/2.jpg)
Instance Allocation in Public CloudsCloud Provider’s View Cloud Tenant’s View
2
![Page 3: ClouDiA: A Deployment Advisor for Public Clouds Tao Zou, Ronan Le Bras, Marcos Vaz Salles*, Alan Demers, Johannes Gehrke Cornell University *University.](https://reader035.fdocuments.net/reader035/viewer/2022081519/56649e5c5503460f94b5498a/html5/thumbnails/3.jpg)
Instance Allocation in Public Clouds
……
TOR TOR TOR
3
Sub-Aggregation ……
Aggregation
CoreCloud Provider’s View
![Page 4: ClouDiA: A Deployment Advisor for Public Clouds Tao Zou, Ronan Le Bras, Marcos Vaz Salles*, Alan Demers, Johannes Gehrke Cornell University *University.](https://reader035.fdocuments.net/reader035/viewer/2022081519/56649e5c5503460f94b5498a/html5/thumbnails/4.jpg)
4
• Mean Latency Heterogeneity in EC2
• Mean Latency Stability in EC2
Network Latencies in Public Clouds
Challenge: Can this be done out-of-the-box?(i.e. no changes to the cloud or the application)
Challenge: How to avoid long links?
![Page 5: ClouDiA: A Deployment Advisor for Public Clouds Tao Zou, Ronan Le Bras, Marcos Vaz Salles*, Alan Demers, Johannes Gehrke Cornell University *University.](https://reader035.fdocuments.net/reader035/viewer/2022081519/56649e5c5503460f94b5498a/html5/thumbnails/5.jpg)
Examples of Latency Sensitive ApplicationsScientific Simulation Key-value Store
Search Aggregation Service Pipelines
(time-to-solution) (response time)
(response time) (response time)
Communication Graphs
Longest Link
Longest Path
Opportunity: Communication graphs are not complete.
A careful logical to physical mapping can help!
Opportunity: Gamble with over-allocation.
![Page 6: ClouDiA: A Deployment Advisor for Public Clouds Tao Zou, Ronan Le Bras, Marcos Vaz Salles*, Alan Demers, Johannes Gehrke Cornell University *University.](https://reader035.fdocuments.net/reader035/viewer/2022081519/56649e5c5503460f94b5498a/html5/thumbnails/6.jpg)
6
Architecture of ClouDiA
Allocate Instances (+ Extra Instances)
Get Measurements
Search Mapping
Deployment Plan
Terminate Extra Instances
Communication Graph
Objectives
Start Application
ClouDiA Public Cloud Tenant
![Page 7: ClouDiA: A Deployment Advisor for Public Clouds Tao Zou, Ronan Le Bras, Marcos Vaz Salles*, Alan Demers, Johannes Gehrke Cornell University *University.](https://reader035.fdocuments.net/reader035/viewer/2022081519/56649e5c5503460f94b5498a/html5/thumbnails/7.jpg)
Measuring Network Distance
• Goal: obtain a reliable estimate of link costs
• Approximations that are easy to obtain – Hop counts– Length of Common IP Prefix
• Accurate distance: pair-wise network latencies– Large number of measurements for each pair
• To observe enough latency jitters
– Interferences at end points• Heavily application and network dependent • Can’t model exactly measure without interference
do not work
![Page 8: ClouDiA: A Deployment Advisor for Public Clouds Tao Zou, Ronan Le Bras, Marcos Vaz Salles*, Alan Demers, Johannes Gehrke Cornell University *University.](https://reader035.fdocuments.net/reader035/viewer/2022081519/56649e5c5503460f94b5498a/html5/thumbnails/8.jpg)
Measuring Network Latencies • Without interference• Most efficient method: “staged”
Stage i
Stage i+1
No concurrent send/recv at end points Parallelism with minimal coordination
![Page 9: ClouDiA: A Deployment Advisor for Public Clouds Tao Zou, Ronan Le Bras, Marcos Vaz Salles*, Alan Demers, Johannes Gehrke Cornell University *University.](https://reader035.fdocuments.net/reader035/viewer/2022081519/56649e5c5503460f94b5498a/html5/thumbnails/9.jpg)
9
Architecture of ClouDiA
Allocate Instances (+ Extra Instances)
Get Measurements
Search Mapping
Deployment Plan
Terminate Extra Instances
Communication Graph
Objectives
Start Application
ClouDiA Public Cloud Tenant
![Page 10: ClouDiA: A Deployment Advisor for Public Clouds Tao Zou, Ronan Le Bras, Marcos Vaz Salles*, Alan Demers, Johannes Gehrke Cornell University *University.](https://reader035.fdocuments.net/reader035/viewer/2022081519/56649e5c5503460f94b5498a/html5/thumbnails/10.jpg)
Search Mapping (Longest Link)• NP-Hard
– To find a solution of cost – To find a solution of cost – To find a solution of cost
• Goal: find a “good” solution within timeout
• Two formulations:– Mixed-Integer Programming
• O() boolean variables
– Constraint Programming (*)• O() integer variables• an objective cost has to be given a priori
Hard to approximate
![Page 11: ClouDiA: A Deployment Advisor for Public Clouds Tao Zou, Ronan Le Bras, Marcos Vaz Salles*, Alan Demers, Johannes Gehrke Cornell University *University.](https://reader035.fdocuments.net/reader035/viewer/2022081519/56649e5c5503460f94b5498a/html5/thumbnails/11.jpg)
Searching using Constraint Programming
0
Give an objective c:1. Remove all links with cost > c2. Find a subgraph isomorphism
k=4
• A lot of distinct latency values– Bi-section search? finding “no solution” takes time– k-means clustering? works well with proper k
Give an objective c:1. Remove all links with cost > c2. Find a subgraph isomorphism
TimeoutTimeoutcost
![Page 12: ClouDiA: A Deployment Advisor for Public Clouds Tao Zou, Ronan Le Bras, Marcos Vaz Salles*, Alan Demers, Johannes Gehrke Cornell University *University.](https://reader035.fdocuments.net/reader035/viewer/2022081519/56649e5c5503460f94b5498a/html5/thumbnails/12.jpg)
Experimental Settings
• 100 to 150 m1.large instances in EC2
• IBM ILOG CPLEX Optimizer/CP Optimizer– Multi-core, a single machine
• Three workloads:– Behavioral simulation – Synthetic aggregation query– Key-value store
![Page 13: ClouDiA: A Deployment Advisor for Public Clouds Tao Zou, Ronan Le Bras, Marcos Vaz Salles*, Alan Demers, Johannes Gehrke Cornell University *University.](https://reader035.fdocuments.net/reader035/viewer/2022081519/56649e5c5503460f94b5498a/html5/thumbnails/13.jpg)
13
Effect of Over-Allocation
• 100 instances + 10%-50% over-allocation• Get Measurements + Search Deployment < 10 minutes
![Page 14: ClouDiA: A Deployment Advisor for Public Clouds Tao Zou, Ronan Le Bras, Marcos Vaz Salles*, Alan Demers, Johannes Gehrke Cornell University *University.](https://reader035.fdocuments.net/reader035/viewer/2022081519/56649e5c5503460f94b5498a/html5/thumbnails/14.jpg)
14
• 10% over-allocation
Overall Improvement
![Page 15: ClouDiA: A Deployment Advisor for Public Clouds Tao Zou, Ronan Le Bras, Marcos Vaz Salles*, Alan Demers, Johannes Gehrke Cornell University *University.](https://reader035.fdocuments.net/reader035/viewer/2022081519/56649e5c5503460f94b5498a/html5/thumbnails/15.jpg)
Thank you!
• ClouDiA is a deployment advisor for public clouds– Out-of-the-box– 15%-55% time reduction for latency sensitive
applications
• Could be adopted by cloud services providers– With API changes
Conclusion