CPU Scheduling and Queueing Theory
Transcript of CPU Scheduling and Queueing Theory
Last Time
• Scheduling policy: what to do next, when there are multiple threads ready to run
• Uniprocessor policies
  – FIFO, round robin, shortest job first
  – Multi-level feedback queues as approximation of shortest CPU task first
Main Points
• Multiprocessor scheduling
  – Affinity scheduling
  – Space vs. time sharing
• Queueing theory
  – Can you predict a system’s response time?
Multiprocessor Scheduling
• What would happen if we used MFQ on a multiprocessor?
  – Contention for scheduler spinlock
Multiprocessor Scheduling
• On modern processors, the CPU is 100x slower on a cache miss
• Cache effects of a single ready list:
  – Cache coherence overhead
    • MFQ data structure would ping between caches
    • Fetching data from other caches can be even slower than re-fetching from DRAM
  – Cache reuse
    • Thread’s data from the last time it ran is often still in its old cache
Amdahl’s Law
• Speedup on a multiprocessor limited by whatever runs sequentially
• Runtime >= Sequential portion + parallel portion / # CPUs
• Example:
  – Suppose scheduler lock is used 0.1% of the time
  – Suppose scheduler lock is 50x slower because of cache effects
  – Runtime >= 5% + 95% / # CPUs
    • System is only 2.5x faster with 100 processors than with 10
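The example’s arithmetic can be checked directly; a minimal sketch (function name is mine):

```python
# Amdahl's law: normalized runtime >= sequential fraction + parallel fraction / n_cpus
def amdahl_runtime(seq_fraction, n_cpus):
    """Runtime relative to a single CPU (which takes 1.0)."""
    return seq_fraction + (1.0 - seq_fraction) / n_cpus

# Slide's example: a lock held 0.1% of the time, made 50x slower by
# cache effects, behaves like a 5% sequential portion.
seq = 0.001 * 50                  # = 0.05
t10 = amdahl_runtime(seq, 10)     # 0.05 + 0.95/10  = 0.145
t100 = amdahl_runtime(seq, 100)   # 0.05 + 0.95/100 = 0.0595
print(t10 / t100)                 # ~2.44: only ~2.5x faster with 10x the CPUs
```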
Per-Processor Multi-level Feedback: Affinity Scheduling
(Diagram: CPU 1, CPU 2, and CPU 3 each maintain their own multi-level feedback queue)
Scheduling Parallel Programs
Oblivious: each processor time-slices its ready list independently of the other processors
(Diagram: over time, CPU1, CPU2, and CPU3 run threads from processes 1-3 in an arbitrary interleaving; px.y = thread y in process x)
Scheduling Parallel Programs
• What happens if one thread gets time-sliced while other threads from the same program are still running?
  – Assuming the program uses locks and condition variables, it will still be correct
  – What about performance?
Bulk Synchronous Parallel Program
(Diagram: over time, CPU 1 through CPU 4 alternate between local computation and communication phases, synchronized by barriers)
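A minimal sketch of one bulk synchronous superstep (worker structure and names are mine): each thread computes locally, then waits at a barrier before the communication phase, matching the diagram’s computation/barrier/communication pattern.

```python
import threading

N_WORKERS = 4
barrier = threading.Barrier(N_WORKERS)
results = [0] * N_WORKERS
totals = [0] * N_WORKERS

def worker(i):
    results[i] = i * i        # local computation phase
    barrier.wait()            # no one proceeds until everyone has finished
    totals[i] = sum(results)  # communication phase: safe to read others' results

threads = [threading.Thread(target=worker, args=(i,)) for i in range(N_WORKERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(totals)  # every worker saw the complete results: [14, 14, 14, 14]
```

This also shows why time-slicing one thread hurts: the barrier makes every worker wait for the slowest one.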
Co-Scheduling
(Diagram: in each time slice, CPU1, CPU2, and CPU3 together run the threads of a single process: first p1.1, p1.2, p1.3, then p2.1, p2.2, p2.3, then p3.1, p3.2, p3.3; px.y = thread y in process x)
Amdahl’s Law, Revisited
(Graph: performance (inverse response time) vs. number of processors, with three curves: perfectly parallel, diminishing returns, and limited parallelism)
Space Sharing
(Diagram: over time, CPU1-CPU3 are dedicated to process 1 and CPU4-CPU6 to process 2, partitioning the machine in space rather than time)
Scheduler activations: kernel informs the user-level library as to the # of processors assigned to that application, with upcalls every time the assignment changes
Queueing Theory
• Can we predict what will happen to user performance:
  – If a service becomes more popular?
  – If we buy more hardware?
  – If we change the implementation to provide more features?
Queueing Model
(Diagram: arrivals enter a queue feeding a server; departures from the server are the throughput)
Definitions
• Queueing delay: wait time
• Service time: time to service the request
• Response time = queueing delay + service time
• Utilization: fraction of time the server is busy
  – Service time * arrival rate
• Throughput: rate of task completions
  – If no overload, throughput = arrival rate
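The definitions compose directly; a small sketch with example values (all numbers are mine, for illustration):

```python
service_time = 0.01    # seconds the server spends per request
arrival_rate = 50.0    # requests per second
queueing_delay = 0.03  # seconds a request waits before service (example value)

utilization = service_time * arrival_rate       # fraction of time server is busy
response_time = queueing_delay + service_time   # total time in the system
# If not overloaded, completions keep pace with arrivals.
throughput = arrival_rate if utilization < 1 else 1.0 / service_time

print(utilization)  # 0.5: server busy half the time
```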
Queueing
• What is the best-case scenario for minimizing queueing delay?
  – Keeping arrival rate and service time constant
• What is the worst-case scenario?
Queueing: Best Case
(Graphs: with evenly spaced arrivals, throughput rises linearly with arrivals during an interval up to max throughput at full utilization, while response time stays at the service time all the way to full utilization)
Queueing: Worst Case
(Graphs: with bursty arrivals, throughput still saturates at max throughput at full utilization, but response time grows with arrivals during an interval, far beyond the service time)
Queueing: Average Case?
• Gaussian: arrivals are spread out, around a mean value
• Exponential: arrivals are memoryless
• Heavy-tailed: arrivals are bursty
(Graph: probability of event vs. event value x for an exponential distribution)
Exponential Distribution
(Diagram: birth-death Markov chain over states 0, 1, 2, 3, 4, ...; arrivals at rate λ move the system one state up, services at rate μ one state down)
Permits closed form solution to state probabilities, as a function of arrival rate and service rate
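For this birth-death chain the closed form is the standard M/M/1 result; a sketch, assuming utilization rho = λ/μ < 1 and my own example rates:

```python
lam, mu = 5.0, 10.0  # arrival rate and service rate (example values)
rho = lam / mu       # utilization = 0.5

def p_state(n):
    """P(n tasks in the system) for an M/M/1 queue: (1 - rho) * rho^n."""
    return (1.0 - rho) * rho ** n

print(p_state(0))                          # 0.5: the server is idle half the time
print(sum(p_state(n) for n in range(50)))  # ~1.0: state probabilities sum to one
```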
Response Time vs. Utilization
• R = S/(1-U)
  – Better if Gaussian
  – Worse if heavy-tailed
• Variance in R = S/(1-U)^2
(Graph: response time vs. utilization from 0 to 1; response time starts at the service time and grows without bound as utilization approaches 1)
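The blow-up as utilization approaches 1 is easy to see numerically; a sketch of R = S/(1-U) with an example service time of my choosing:

```python
S = 0.01  # service time in seconds (example value)

def response_time(u):
    """Expected response time R = S / (1 - U) for utilization u < 1."""
    return S / (1.0 - u)

for u in (0.5, 0.9, 0.99):
    print(u, response_time(u))  # 0.02, then 0.1, then 1.0 second
```

Going from 50% to 99% utilization costs a 50x increase in response time, which is why operators leave headroom.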
What if Multiple Resources?
• Response time = sum over all resources i of: (service time for resource i) / (1 – utilization of resource i)
• Implication
  – If you fix one bottleneck, the next-highest-utilized resource will limit performance
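The sum above, and the next-bottleneck implication, in a short sketch (resource names and values are mine):

```python
resources = {  # resource -> (service time in seconds, utilization)
    "cpu":  (0.005, 0.90),
    "disk": (0.010, 0.60),
    "net":  (0.002, 0.30),
}

def total_response_time(res):
    """Sum S_i / (1 - U_i) over all resources."""
    return sum(s / (1.0 - u) for s, u in res.values())

print(total_response_time(resources))  # the 90%-utilized cpu dominates

# Fix the cpu bottleneck (say, down to 40% utilized) and the disk,
# at 60% utilization, becomes the next limiter.
resources["cpu"] = (0.005, 0.40)
print(total_response_time(resources))
```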
Overload Management
• What if arrivals occur faster than the service can handle them?
  – If we do nothing, response time will become infinite
• Turn users away?
  – Which ones? Average response time is best if we turn away the users with the highest service demand
• Degrade service?
  – Compute the result with fewer resources
  – Example: CNN’s static front page on 9/11
  – Counterexample: highway congestion