Datacenter application interference - Columbia...
![Page 1: Datacenter application interference - Columbia Universityarcade.cs.columbia.edu/interference-sc12-slides.pdf · 1. New datacenter application interference studies can use our identified](https://reader033.fdocuments.net/reader033/viewer/2022042001/5e6dab90b7c3bd27a659d316/html5/thumbnails/1.jpg)
Datacenter application interference
CMPs (popular in datacenters) offer increased throughput and reduced power consumption
They also increase resource sharing between applications, which can result in negative interference.
Resource contention is well studied
… at least on single machines.
3 main methods:
(1) Gladiator style match-ups
(2) Static analysis to predict application resource usage
(3) Measure benchmark resource usage; apply to live applications
New methodology for understanding datacenter interference is needed.
One that can handle complexities of a datacenter:
(10s of) thousands of applications
real user inputs
production hardware
financially feasible
low overhead
Hardware counter measurements of live applications.
Our contributions
1. Identify complexities in datacenters
2. New measurement methodology
3. First large-scale study of measured interference on live datacenter applications.
Complexities of understanding application interference in a datacenter
Large chips and high core utilizations
Profiling 1000 Google servers (12 cores, 24 hyperthreads each) running production workloads revealed that the average machine had more than 14 of its 24 hardware threads in use.
Heterogeneous application mixes
Often applications have more than one co-runner on a machine.
Observed max of 19 unique co-runner threads (out of 24 HW threads).
[Chart legend: 0-1 co-runners, 2-3 co-runners, 4+ co-runners]
Application complexities
Fuzzy definitions
Varying and sometimes unpredictable inputs
Unknown optimal performance
Hardware & Economic Complexities
Varying micro-arch platforms
The need for low overhead limits measurement capabilities
Corporate policies
Measurement methodology
The goal:
A generic methodology to collect application interference data on live production datacenter servers
[Figure: App. A and App. B running side by side over time.]
1. Use sample-based monitoring to collect per-machine, per-core event (hardware counter) samples.
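A minimal sketch of what one collected sample might look like, assuming each sample records where it ran, when, and the counter values needed for IPC. The `Sample` class and all of its field names are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One hardware-counter sample (all field names are hypothetical)."""
    machine: str       # machine identifier
    cpu: int           # hardware thread the sample was taken on
    app: str           # application running on that hardware thread
    start: float       # timestamp at which the sample began
    end: float         # timestamp at which the sample ended
    instructions: int  # instructions retired during the sample
    cycles: int        # cycles elapsed during the sample

    @property
    def ipc(self) -> float:
        """Instructions per cycle for this sample."""
        return self.instructions / self.cycles


s = Sample("m1", 0, "A", 0.0, 1.0, 2_000_000, 1_000_000)
print(s.ipc)  # 2.0
```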
[Figure: timelines of App. A and App. B split into samples of 2 M instrs each; App. A samples are numbered 1-6 and App. B samples 1-4.]
2. Identify sample-sized co-runner relationships…
Samples A:1-A:6 are co-runners with App. B.
Samples B:1-B:4 are co-runners with App. A.
Say that a new App. C starts running on CPU 1…
… B:4 no longer has a co-runner.
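One way to picture sample-sized co-runner relationships is as time overlap between samples of different applications on the same machine. The sketch below assumes each sample carries start and end timestamps; the `Sample` tuple, `overlaps`, and `co_runners` are hypothetical names, and the slides do not say how overlap is actually determined.

```python
from collections import namedtuple

# Hypothetical sample record: application, hardware thread, start and end time.
Sample = namedtuple("Sample", "app cpu start end")


def overlaps(a, b):
    """True if the two samples were running at the same time."""
    return a.start < b.end and b.start < a.end


def co_runners(sample, others):
    """Samples of other applications that overlap `sample` in time."""
    return [o for o in others if o.app != sample.app and overlaps(sample, o)]


a1 = Sample("A", 0, 0.0, 1.0)
b1 = Sample("B", 1, 0.5, 1.5)   # overlaps a1
b4 = Sample("B", 1, 3.0, 4.0)   # runs after App. A has stopped

print([o.app for o in co_runners(a1, [b1, b4])])  # ['B']
print(co_runners(b4, [a1]))                       # []
```

A sample with no overlapping sample from another application, like `b4` here, simply has no co-runner.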
3. Filter relationships by architecture-independent interference classes…
Co-runners can be on opposite sockets.
Co-runners can share only I/O.
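The filtering step can be sketched as a function that classifies a pair of hardware threads by where they sit on the chip. The class names follow the talk (shared core, shared socket, opposite socket); the `topology` mapping from hardware-thread id to socket and core is an assumption made purely for illustration, and real machines often number their threads differently.

```python
def topology(cpu, threads_per_core=2, cores_per_socket=6):
    """Map a hardware-thread id to (socket, core) under an assumed layout.

    Assumption: sibling hyperthreads have adjacent ids and the cores of one
    socket are numbered contiguously. This is hypothetical, not from the talk.
    """
    core = cpu // threads_per_core
    socket = core // cores_per_socket
    return socket, core


def interference_class(cpu_a, cpu_b):
    """Classify where two hardware threads sit relative to each other."""
    socket_a, core_a = topology(cpu_a)
    socket_b, core_b = topology(cpu_b)
    if core_a == core_b:
        return "shared core"
    if socket_a == socket_b:
        return "shared socket"
    return "opposite socket"


print(interference_class(0, 1))   # shared core
print(interference_class(0, 2))   # shared socket
print(interference_class(0, 12))  # opposite socket
```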
4. Aggregate equivalent co-schedules.
For example:
• Aggregate all the samples of App. A that have App. B as a shared core co-runner.
• Aggregate all samples of App. A that have App. B as a shared core co-runner and App. C as a shared socket co-runner.
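The two examples above can be sketched as a grouping step keyed on the base application and its set of co-runners. The input layout (base app, list of co-runner and class pairs, IPC) and the IPC values are hypothetical; a `frozenset` is used so that the order in which co-runners are listed does not matter.

```python
from collections import defaultdict


def aggregate(samples):
    """Group IPC values by (base app, set of (co-runner app, class))."""
    groups = defaultdict(list)
    for base_app, co_runners, ipc in samples:
        key = (base_app, frozenset(co_runners))
        groups[key].append(ipc)
    return dict(groups)


# Made-up samples of App. A under two different co-schedules.
samples = [
    ("A", [("B", "shared core")], 1.4),
    ("A", [("B", "shared core")], 1.6),
    ("A", [("B", "shared core"), ("C", "shared socket")], 1.1),
    ("A", [("C", "shared socket"), ("B", "shared core")], 1.3),
]
groups = aggregate(samples)
print(len(groups))  # 2
```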
5. Finally, calculate statistical indicators (means, medians) to get a midpoint performance for application interference comparisons.
[Figure: App. A and App. B timelines, labeled Avg. IPC = 2.0 and Avg. IPC = 1.5.]
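As a rough illustration of the statistical indicators, the sketch below computes the mean and median IPC for one aggregated group using Python's standard `statistics` module. The `summarize` helper and the IPC values are made up for the example.

```python
from statistics import mean, median


def summarize(ipcs):
    """Midpoint indicators for one aggregated group of IPC samples."""
    return {"mean": mean(ipcs), "median": median(ipcs), "n": len(ipcs)}


# Made-up IPC samples with one outlier.
print(summarize([1.0, 2.0, 3.0, 10.0]))
# {'mean': 4.0, 'median': 2.5, 'n': 4}
```

The single outlier pulls the mean well above the median, which is one reason to look at both indicators.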
Applying the measurement methodology at Google.
Experiment details:
Event: Instrs IPC
Sampling period: 2.5 Million
Number of machines*: 1000
* All had Intel Westmere chips (24 hyperthreads, 12 cores), matching clock speed, RAM, O/S
Method:
1. Collect samples
Collection results:
Unique binary apps: 1102
Co-runner relationships (top 8 apps):
Avg. shared core relationships: 1M (min 2K)
Avg. shared socket: 9.5M (min 12K)
Avg. opposite socket: 11M (min 14K)
Method:
2. ID sample size relationships
3. Filter by interference classes
4. Aggregate equivalent schedules
5. Calculate statistical indicators
Analyze Interference
streetview’s IPC changes with top co-runners
Overall median IPC across 1102 applications
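One plausible way to express an IPC change is as a percentage relative to a baseline such as the overall median IPC. The slides do not give the exact formula, so `percent_change` and the numbers below are assumptions for illustration only.

```python
def percent_change(ipc_with_co_runner, baseline_ipc):
    """Relative IPC change in percent; negative values mean a slowdown."""
    return 100.0 * (ipc_with_co_runner - baseline_ipc) / baseline_ipc


# Made-up values: median IPC of 1.5 with a co-runner against a baseline of 2.0.
print(percent_change(1.5, 2.0))  # -25.0
```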
Beyond noisy interferers (shared core)
[Figure: grid with the base application on one axis and co-running applications on the other; legend: less or positive interference, negative interference, noisy data.]
* Recall minimum pair has 2K samples; medians across full grid of 1102 apps
[Figure: grid with base applications on one axis and co-running applications on the other; legend: less or positive interference, negative interference, noisy data.]
Performance Strategies
Restrict co-runners whose negative interference goes beyond noise (or encourage positive interferers as co-runners)
Isolate sensitive or antagonistic applications
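A sketch of how such a strategy might pick co-runners to restrict, assuming each co-runner already has a measured IPC change in percent for the base application. The `noise_pct` threshold, the function names, and the sample values are hypothetical; the talk does not state how the noise band is chosen.

```python
def classify_pair(change_pct, noise_pct=1.0):
    """Label an IPC change as noisy, negative, or positive interference."""
    if abs(change_pct) <= noise_pct:
        return "noisy"
    return "negative" if change_pct < 0 else "positive"


def co_runners_to_restrict(changes, noise_pct=1.0):
    """Co-runners whose negative interference goes beyond the noise band."""
    return sorted(
        app for app, change in changes.items()
        if classify_pair(change, noise_pct) == "negative"
    )


# Made-up IPC changes (in percent) for one base application.
changes = {"B": -12.0, "C": 0.4, "D": 5.0}
print(co_runners_to_restrict(changes))  # ['B']
```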
Takeaways
1. New datacenter application interference studies can use our identified complexities as a check list.
2. Our measurement methodology (verified at Google in the first large-scale measurements of live datacenter interference) is generally applicable and shows promising initial performance opportunities.