Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge...

31
Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley Presenter: Olusanya Soyannwo

Transcript of Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge...

Page 1: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley.

Capriccio: Scalable Threads for Internet Services

Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer

University of California at Berkeley

Presenter: Olusanya Soyannwo

Page 2: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley.

Outline

Motivation Background Goals Approach Experiments Results Related work Conclusion & Future work

EECS Advanced Operating Systems Northwestern University

Page 3: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley.

Motivation

Increasing scalability demands for Internet services

Hardware improvements are limited by existing software

Current implementations are event based

EECS Advanced Operating Systems Northwestern University

Page 4: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley.

Background : Event Based Systems - Drawbacks

Events systems hide the control flowDifficult to understand and debug

Programmers need to match related events

Burdens programmers

EECS Advanced Operating Systems Northwestern University

Page 5: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley.

Goals: Capriccio

Support for existing thread API

Scalability to hundreds of thousands of threads

Automate application-specific customization

EECS Advanced Operating Systems Northwestern University

Page 6: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley.

Approach: Capriccio

Thread packageCooperative scheduling

Linked stacksAddress the problem of stack

allocation for large numbers of threadsCombination of compile-time and run-

time analysis Resource-aware scheduler

EECS Advanced Operating Systems Northwestern University

Page 7: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley.

Approach: User Level Thread – The Choice

POSIX API(-)Complex preemption

(-)Bad interaction with Kernel scheduler

Performance Ease thread synchronization overhead No kernel crossing for preemptive threading More efficient memory management at user level

Flexibility Decoupling user and kernel threads allows faster innovation Can use new kernel thread features without changing application

code Scheduler tailored for applications

EECS Advanced Operating Systems Northwestern University

Page 8: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley.

Approach: User Level Thread – Disadvantages

Additional OverheadReplacing blocking calls with non-

blocking calls

Multiple CPU synchronization

EECS Advanced Operating Systems Northwestern University

Page 9: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley.

Approach: User Level Thread – Implementation Context Switches

Built on top of Edgar Toernig’s coroutine library Fast context switches when threads voluntarily yield

I/O Capriccio intercepts blocking I/O calls Uses epoll for asynchronous I/O

Scheduling Very much like an event-driven application Events are hidden from programmers

Synchronization Supports cooperative threading on single-CPU machines

• Requires only Boolean checks

EECS Advanced Operating Systems Northwestern University

Page 10: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley.

Approach: Linked Stack

The problem: fixed stacksOverflow vs. wasted spaceLimits thread numbers

The solution: linked stacksAllocate space as neededCompiler analysis

• Add runtime checkpoints • Guarantee enough space until next check

Fixed Stacks

Linked Stack

EECS Advanced Operating Systems Northwestern University

Page 11: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley.

Approach: Linked Stack Parameters

MaxPathMinChunk

Steps Break cycles Trace back

Special Cases Function pointers External calls

5

4

2

6

3

3

2

3

MaxPath = 8

EECS Advanced Operating Systems Northwestern University

Page 12: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley.

Approach: Linked Stack Parameters

MaxPathMinChunk

Steps Break cycles Trace back

Special Cases Function pointers External calls

5

4

2

6

3

3

2

3

MaxPath = 8

EECS Advanced Operating Systems Northwestern University

Page 13: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley.

Approach: Linked Stack Parameters

MaxPathMinChunk

Steps Break cycles Trace back

Special Cases Function pointers External calls

5

4

2

3

3

2

3

MaxPath = 8

6

EECS Advanced Operating Systems Northwestern University

Page 14: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley.

Approach: Linked Stack Parameters

MaxPathMinChunk

Steps Break cycles Trace back

Special Cases Function pointers External calls

5

3

MaxPath = 8

6

3

2

2

4

3

EECS Advanced Operating Systems Northwestern University

Page 15: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley.

Approach: Linked Stack Parameters

MaxPathMinChunk

Steps Break cycles Trace back

Special Cases Function pointers External calls MaxPath = 8

6

3

2

2

4

3

3

3

EECS Advanced Operating Systems Northwestern University

Page 16: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley.

Approach: Scheduling

Advantages of event-based schedulingTailored for applicationsWith event handlers

Events provide two important pieces of information for schedulingWhether a process is close to

completionWhether a system is overloaded

EECS Advanced Operating Systems Northwestern University

Page 17: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley.

Approach: Scheduling -The Blocking Graph

Close

Write

WriteReadSleep

ThreadcreateMain

Thread-based View applications as

sequence of stages, separated by blocking calls

Analogous to event-based scheduler

EECS Advanced Operating Systems Northwestern University

Page 18: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley.

Approach: Resource-aware Scheduling Track resources used along BG edges

Memory, file descriptors, CPU Predict future from the past Algorithm

• Increase use when underutilized• Decrease use near saturation

Advantages Operate near the knee w/o thrashing Automatic admission control

EECS Advanced Operating Systems Northwestern University

Page 19: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley.

Experiment: Threading Microbenchmarks

SMP, two 2.4 GHz Xeon processors 1 GB memory two 10 K RPM SCSI Ultra II hard

drives Linux 2.5.70 Compared Capriccio, LinuxThreads,

and Native POSIX Threads for Linux

EECS Advanced Operating Systems Northwestern University

Page 20: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley.

Experiment: Thread Scalability

Producer-consumer microbenchmark LinuxThreads begin to degrade after

20 threads NPTL degrades after 100 Capriccio scales to 32K producers

and consumers (64K threads total)

EECS Advanced Operating Systems Northwestern University

Page 21: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley.

Results: Thread Primitive - Latency

Capriccio LinuxThreads NPTL

Thread creation

21.5 21.5 17.7

Thread context switch

0.24 0.71 0.65

Uncontended mutex lock

0.04 0.14 0.15

EECS Advanced Operating Systems Northwestern University

Page 22: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley.

Results: Thread Scalability

EECS Advanced Operating Systems Northwestern University

Page 23: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley.

Results: I/O performance

Network performanceToken passing among pipesSimulates the effect of slow client links10% overhead compared to epollTwice as fast as both LinuxThreads

and NPTL when more than 1000 threads

Disk I/O comparable to kernel threads

EECS Advanced Operating Systems Northwestern University

Page 24: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley.

Results: Runtime Overhead

Tested Apache 2.0.44 Stack linking

73% slowdown for null call 3-4% overall

Resource statistics 2% (on all the time) 0.1% (with sampling)

Stack traces 8% overhead

EECS Advanced Operating Systems Northwestern University

Page 25: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley.

Results: Web Server Performance

EECS Advanced Operating Systems Northwestern University

Page 26: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley.

Related Work

Programming Model of high concurrency Event based models are a result of poor thread implementations

User-Level Threads Capriccio is unique

Kernel Threads NPTL

Application Specific Optimization SPIN & Exokernel Burden on programmers Portability

Asynchronous I/O Stack Management

Using heap requires a garbage collector (ML of NJ)

EECS Advanced Operating Systems Northwestern University

Page 27: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley.

Related Work (cont’d)

Resource Aware SchedulingSeveral similar to capriccio

Page 28: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley.

Future Work

ThreadingMulti-CPU supportKernel interface

(enabled) Compile-time techniquesVariations on linked stacksStatic blocking graph

SchedulingMore sophisticated prediction

EECS Advanced Operating Systems Northwestern University

Page 29: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley.

Conclusion

Capriccio simplifies high concurrency Scalable & high performanceControl over concurrency model

• Stack safety• Resource-aware scheduling• Enables compiler support, invariants

Issues Additional burden to programmer Resource controlled sched.? What

hysteresis?

EECS Advanced Operating Systems Northwestern University

Page 30: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley.

OTHER GRAPHS

Page 31: Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley.

OTHER GRAPHS