Measuring CPU Cache Resources in Clouds Weijia Song 2018-02-28

Measuring CPU Cache Resources in Clouds

Weijia Song, 2018-02-28

A Reminder of CPU Cache

• The Memory Wall problem. (Sally A. McKee, 1995)

A Reminder of CPU Cache

• The “Memory Wall”

John McCalpin, STREAM

What makes it WORSE?

• Multi-core CPUs

• Parallelism inside the core: MMX, SSE, AVX, FMA…

CPU: ~900 GFlops (single-precision floating point)

MEM: 76.8 GB/s
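
To make the gap between these two figures concrete, here is a back-of-the-envelope calculation. It is only a sketch: the 900 GFlops and 76.8 GB/s values come from the slide, and the 4-byte operand size is simply the width of a single-precision float.

```python
# Figures taken from the slide; everything else is derived from them.
PEAK_FLOPS = 900e9        # ~900 GFlops, single precision
MEM_BANDWIDTH = 76.8e9    # 76.8 GB/s
FLOAT_BYTES = 4           # width of one single-precision value


def flops_per_byte(peak_flops: float, bandwidth: float) -> float:
    """Floating-point operations the CPU can issue per byte fetched from memory."""
    return peak_flops / bandwidth


ratio = flops_per_byte(PEAK_FLOPS, MEM_BANDWIDTH)
print(f"{ratio:.1f} flops per byte")                        # 11.7 flops per byte
print(f"{ratio * FLOAT_BYTES:.1f} flops per float loaded")  # 46.9 flops per float loaded
```

Under these numbers, a computation has to perform roughly 47 operations on every value it fetches from memory to keep the arithmetic units busy, which is why data that stays in cache matters.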

A Reminder of CPU Cache

[Figure: the cache hierarchy, from L1 Cache through L2 Cache to L3 Cache. One end of the hierarchy is faster, smaller in size, and more expensive; the other end is slower, larger in size, and cheaper.]

A Reminder of CPU Cache

[Figure: two processors (Processor 0 and Processor 1), each with four cores. Every core has its own L1 cache and L2 cache; the cores of each processor share one L3 cache.]

The most popular architecture in the Cloud

The Infrastructure-as-a-Service (IaaS) Clouds

The Problem

No, Why?

• Different hardware

• Different hypervisors

• Different resource sharing levels

• Isolation: Cache Allocation Technology, Memory Bandwidth Allocation

• Documentation errors

• However, the virtual machine specifications that cloud providers present to users are not sufficiently descriptive, which makes it hard for an application to find the best deployment.

CloudModel: A tool measuring the resources in the Cloud.

This work focuses on CPU cache and memory resources. It measures

• The true allocated size of each cache level;

• The sequential throughput and latency of each cache level; and

• The sequential throughput and latency of memory.

Cache Allocation vs Application Performance

• Application performance with Cache allocation

Cache Contention vs Application Performance

• Application Performance with Cache contention

Performance Tuning with Cache Size

• Intel SSE temporal/non-temporal data

• Protean Project

Measuring Throughput

• Load the memory buffer sequentially, multiple times

• Calculate the average throughput

[Figure: a memory buffer and the CPU registers, with the L1 cache, L2 cache, L3 cache, and memory between them.]
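
The two steps above can be sketched as follows. This is an illustration of the measurement loop only, not the tool's implementation: the function name and parameters are hypothetical, and in Python the interpreter overhead dominates, so the absolute numbers say little about the cache. A real benchmark would use compiled code with wide loads.

```python
import time


def sequential_read_throughput(buffer_size: int, passes: int = 3) -> float:
    """Return the average read throughput in bytes per second.

    Hypothetical helper: reads a zero-filled buffer of `buffer_size` bytes
    front to back, `passes` times, and averages over all passes.
    """
    view = memoryview(bytearray(buffer_size))
    checksum = 0
    start = time.perf_counter()
    for _ in range(passes):
        checksum += sum(view)  # touches every byte in address order
    elapsed = time.perf_counter() - start
    assert checksum == 0       # the buffer is zero-filled; keeps the reads live
    return buffer_size * passes / elapsed


if __name__ == "__main__":
    # Buffer sizes are arbitrary examples, not values from the slides.
    for size_kib in (16, 256, 4096):
        rate = sequential_read_throughput(size_kib * 1024)
        print(f"{size_kib:5d} KiB: {rate / 1e6:.1f} MB/s")
```

Sweeping the buffer size is what exposes the hierarchy: once the buffer no longer fits in a cache level, every pass has to fetch from the next level down.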

Measuring Throughput

Cache Sizes

• We give a range instead of a specific value.

• As an estimate of the size of the upper cache level, we use the range of buffer sizes in which the throughput is

  • significantly lower than that of the upper cache level, and

  • significantly higher than that of the lower cache level.
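
A minimal sketch of this rule, under stated assumptions: the slides do not quantify "significantly", so the 10% `margin` below is a placeholder, and the function name and the sample measurements are made up for illustration.

```python
from typing import List, Optional, Tuple


def transition_range(
    samples: List[Tuple[int, float]],
    upper_throughput: float,
    lower_throughput: float,
    margin: float = 0.1,  # assumed meaning of "significantly"; not from the slides
) -> Optional[Tuple[int, int]]:
    """Return (smallest, largest) buffer size whose throughput lies clearly
    between two adjacent levels, or None if no sample qualifies."""
    high = upper_throughput * (1.0 - margin)  # below this: significantly lower than upper
    low = lower_throughput * (1.0 + margin)   # above this: significantly higher than lower
    sizes = [size for size, tput in samples if low < tput < high]
    if not sizes:
        return None
    return min(sizes), max(sizes)


# Made-up measurements: (buffer size in KiB, throughput in GB/s).
samples = [(16, 100.0), (24, 99.0), (32, 80.0), (40, 60.0), (48, 51.0), (64, 50.0)]
print(transition_range(samples, upper_throughput=100.0, lower_throughput=50.0))  # (32, 40)
```

With these sample values, the upper level's size would be reported as lying between 32 KiB and 40 KiB rather than as a single number.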

Methodology-skipped

• Throughput:

• Latency: random access

• Cache Size:

[Figure: cluster centroids labeled L1, L2, L3, and MEM; sizes labeled L1C Size, L2C Size, and L3C Size.]
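
The figure labels suggest that measurements are grouped into one cluster per level, with the centroid standing for that level. The slide does not name the clustering algorithm, so the following is only a sketch that assumes plain one-dimensional k-means over throughput samples; the function name and the data are hypothetical.

```python
from typing import List


def kmeans_1d(values: List[float], k: int, iterations: int = 50) -> List[float]:
    """Lloyd's k-means on scalars; returns the centroids sorted in descending order."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centroids evenly over the sorted data.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters: List[List[float]] = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centroids[c]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous centroid.
        updated = [
            sum(members) / len(members) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids, reverse=True)


# Made-up throughput samples (GB/s) taken at many buffer sizes.
throughputs = [100.0, 101.0, 99.0, 60.0, 61.0, 59.0, 30.0, 31.0, 29.0, 10.0, 11.0, 9.0]
l1, l2, l3, mem = kmeans_1d(throughputs, k=4)
print(l1, l2, l3, mem)  # 100.0 60.0 30.0 10.0
```

Centroids obtained this way could serve as the per-level reference throughputs, with buffer sizes that fall between two adjacent centroids bounding the cache size.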

Measuring Latency

• We use a randomized cyclic linked list.

[Figure: a memory buffer with nodes numbered 1 to 7, alongside the CPU registers, L1 cache, L2 cache, L3 cache, and memory.]
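
One way to build such a list is sketched below. The slides do not say how the list is randomized; Sattolo's algorithm is an assumption here, chosen because it yields a single cycle through every node, so each access depends on the previous one and the visiting order is hard to prefetch. The function names are hypothetical, and in Python the timing mostly reflects interpreter overhead.

```python
import random
import time
from typing import List, Tuple


def sattolo_cycle(n: int, rng: random.Random) -> List[int]:
    """Return next_index such that following it from any node visits all n nodes."""
    next_index = list(range(n))
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i)  # 0 <= j < i, never i itself: guarantees one cycle
        next_index[i], next_index[j] = next_index[j], next_index[i]
    return next_index


def chase(next_index: List[int], steps: int) -> Tuple[float, int]:
    """Follow the list for `steps` hops; return (seconds per hop, final node)."""
    node = 0
    start = time.perf_counter()
    for _ in range(steps):
        node = next_index[node]  # each load depends on the previous one
    elapsed = time.perf_counter() - start
    return elapsed / steps, node


ring = sattolo_cycle(7, random.Random(0))  # seven nodes, as in the figure
per_hop, final_node = chase(ring, steps=7)
print(final_node)  # 0: after exactly n hops the walk is back at its start
```

Growing the number of nodes grows the working set, so the measured time per hop steps up each time the list stops fitting in a cache level.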

Measuring Latency

Measuring Cache and Memory Latencies

• Observed latency = L1 latency × L1 hit rate + L2 latency × L2 hit rate + L3 latency × L3 hit rate + Mem latency × Mem hit rate

• Apply multiple linear regression to determine the latency of each cache level and of memory.

• In some clouds, the hit rates may not be available. As a quick estimate, we use the latency measured with a buffer size smaller than, but close to, the corresponding cache size.
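
The regression step can be sketched as an ordinary least-squares fit without an intercept, solved through the normal equations. This is an illustration under assumptions, not the tool's code: the function names, the hit rates, and the latencies used to generate the observations are all made up.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("hit-rate columns are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_latencies(hit_rates: Sequence[Sequence[float]],
                  observed: Sequence[float]) -> List[float]:
    """Least-squares latencies for observed = sum(latency[i] * hit_rate[i])."""
    n = len(hit_rates[0])
    xtx = [[sum(row[i] * row[j] for row in hit_rates) for j in range(n)]
           for i in range(n)]
    xty = [sum(row[i] * y for row, y in zip(hit_rates, observed)) for i in range(n)]
    return solve_linear_system(xtx, xty)


# Made-up runs: each row is (L1, L2, L3, Mem) hit rates for one buffer size.
hit_rates = [
    [1.0, 0.0, 0.0, 0.0],
    [0.5, 0.5, 0.0, 0.0],
    [0.2, 0.3, 0.5, 0.0],
    [0.1, 0.1, 0.3, 0.5],
    [0.0, 0.0, 0.2, 0.8],
]
true_latencies = [1.0, 4.0, 12.0, 80.0]  # hypothetical values, in ns
observed = [sum(l * h for l, h in zip(true_latencies, row)) for row in hit_rates]
estimated = fit_latencies(hit_rates, observed)
print([round(value, 3) for value in estimated])
```

Because the synthetic observations are noise-free, the fit recovers the generating latencies up to rounding error; with real measurements, more runs than unknowns are needed so that the noise averages out.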

Throughput Results in Public Cloud

Latency Results in Public Cloud

Google Compute Platform:

- Failed to report its cache size

- Inconsistent memory latency

Jetstream:

- Failed to report its cache levels

[Figure panels: n1-standard-2 @ us-west1-a; n1-standard-2 @ us-central1-a]

Hyper-Threading

[Figure panels: Amazon c4.large; Google n1-standard-2; Azure Standard DS2 v2]

Conclusion and Future Work

We design and implement a benchmark tool to evaluate the cache and memory resources truly available to a cloud user.

Future work

• Minimize the benchmark overhead with program counters.

• Model application performance with cache size, cache throughput and latency, and memory throughput and latency.

• Benchmark other resources including network and storage.