Friendly Virtual Machines. Zhang, Bestavros et al., Boston Univ. ACM/USENIX VEE 2005

CSE 598c April 17, 2006

Bhuvan Urgaonkar

Problem Setting
• Growing trend of hosting applications at third-party platforms
• Two challenges
  • Isolation and security for co-located applications
  • Efficient and fair resource allocation
• Virtualization seen as a promising approach for isolation
• What about resource allocation?

Challenge - Resource Allocation in Hosting Environments
• Traditional solutions
  • Over-provisioning => wasteful
  • Fair schedulers in the OS, dynamic provisioning, admission control
    • Complex
    • Deprive the application of the chance to meaningfully adapt its behavior to match available resources
    • Against the famous end-to-end argument developed in the networking community

End-to-end Argument
• Saltzer, Reed, and Clark
  • Functionality should be pushed to the higher layer whenever possible
  • IP network implements packet forwarding, leaving congestion control to end systems
• When applied to hosting platforms
  • Let the applications decide how many resources they need

How do VMs make the end-to-end idea realizable?
• In a traditional hosting system, applications would have to be modified
  • Always undesirable, often impossible
• In a virtualized hosting system
  • VMM is like OS, guest OS is like application
  • Guest OS modification not so unacceptable
    • E.g., Xen, Denali
• Main idea: It is possible to achieve good efficiency and fairness using “friendly” virtual machines

Outline
• Motivation
• Approach
• Implementation
• Evaluation
• Conclusions

Friendly Virtual Machine
• Not malicious
• Dynamically adapts its resource needs to system conditions
• Inspiration: AIMD congestion control in TCP
  • Gradually increase resource requirements, back off when resource contention increases
• How a TCP researcher would approach the resource management problem in data centers

System Goals
• Efficiency
  • Resources should not be overloaded
    • E.g., Heavy paging during overload => low throughput
  • Resources should not be unnecessarily underutilized
• Fairness
  • Each VM is allocated a proportional share of the bottleneck resource for that VM

Overload Detection
• Unlike TCP, there are multiple resources to consider
  • CPU, virtual memory, network bandwidth
• Resource utilization metrics not reliable
  • E.g., CPU util may be high but the bottleneck may be the memory sub-system
• Use application-centric metrics like response time or throughput

Overload Detection
• Virtual Clock Time (VCT)
  • Real time interval between consecutive virtual clock cycles
• Bottleneck resource
  • The resource that is the first to trigger a significant increase in VCT
  • Bottleneck-equivalence classes
• Detection: Measure the ratio of current VCT to minimum VCT observed
  • Compare with a threshold (2)
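
The detection rule on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the class name `OverloadDetector`, the millisecond sample values, and the strict `>` comparison are assumptions. Only the ratio of current VCT to minimum observed VCT and the threshold of 2 come from the slide.

```python
class OverloadDetector:
    """Flags overload when current VCT / minimum observed VCT exceeds a threshold.

    Hypothetical helper for illustration; names and comparison are assumptions.
    """

    def __init__(self, threshold=2.0):
        self.threshold = threshold  # the slide mentions a threshold of 2
        self.min_vct = None         # smallest VCT seen so far

    def observe(self, vct):
        """Record one VCT sample and report whether it signals overload."""
        if vct <= 0:
            raise ValueError("VCT must be positive")
        if self.min_vct is None or vct < self.min_vct:
            self.min_vct = vct
        return vct / self.min_vct > self.threshold


# Made-up VCT samples (milliseconds), growing as contention builds up.
detector = OverloadDetector()
samples_ms = [10.0, 11.0, 9.0, 17.0, 19.0, 30.0]
flags = [detector.observe(s) for s in samples_ms]
```

Tracking the minimum rather than a fixed baseline lets the VM calibrate itself: the smallest VCT it has ever seen stands in for the uncontended case.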

Adaptation Mechanisms
• Control number of processes/threads
  • In practice, suspending running processes may not be a good idea
  • Alternatives
    • Suspend less important (e.g., younger) processes
    • Don’t allow new processes instead of suspending existing ones
• Rate control by forcing VM to sleep
• Follow an AIMD style adaptation that converges to fair/efficient allocation
• Paper presents a control-theoretic model to prove convergence/stability properties
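
A toy simulation shows why AIMD-style adaptation tends toward a fair split. It is a sketch under stated assumptions, not the paper's control-theoretic model: two VMs share one resource of hypothetical capacity 100, both see the same overload signal, and each adds 1 unit of demand when there is no overload and halves its demand on overload. The function names, step sizes, and starting demands are made up for illustration.

```python
def aimd_step(rate, overloaded, increase=1.0, decrease=0.5):
    """One AIMD update: additive increase, multiplicative decrease."""
    return rate * decrease if overloaded else rate + increase


def simulate(rates, capacity, steps):
    """Run synchronized AIMD for several VMs sharing one resource.

    Assumption: every VM observes the same overload signal,
    namely total demand exceeding the shared capacity.
    """
    rates = list(rates)
    for _ in range(steps):
        overloaded = sum(rates) > capacity
        rates = [aimd_step(r, overloaded) for r in rates]
    return rates


# Two VMs starting far apart (hypothetical demand units).
final = simulate([10.0, 80.0], capacity=100.0, steps=300)
gap = abs(final[0] - final[1])  # shrinks toward zero over time
```

Additive increase leaves the gap between the two demands unchanged, while each multiplicative back-off halves it, so the demands drift toward equal shares.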

Salient Features
• Underlying system requirements
  • Schedulers should be unbiased like round-robin, unlike multi-level feedback
  • VMM should implement resource policing to enforce AIMD behavior
• Various adaptation strategies can co-exist
  • Think TCP Reno, TCP Tahoe, …
• Suggestion: VMM could provide incentives for friendly behavior

Discussion
• Is this system practical?
  • Rate of adaptation
    • Would it be fast enough for hosted applications?
    • Applications need resources soon after overload starts
  • How would the system behave with biased schedulers?
• Can the adaptation mechanism be extended to handle different levels of importance?
  • This system might punish an application precisely when it is crucial for it to service its workload
    • E.g., An e-commerce app during Thanksgiving
• Global knowledge can be crucial for efficiency
  • E.g., LRU page replacement
• Security, isolation
  • To me it seems this would be as secure as a system with a more heavy-weight VMM

Implementation
• User-mode Linux
• Implement adaptation of number of processes and rate control
• 500 lines of code

Outline
• Motivation
• Approach
• Implementation
• Evaluation
• Conclusions

Memory intensive benchmark - Performance metrics vs # VMs

• Linux suspends processes arbitrarily when excessive thrashing occurs; their system spreads the punishment evenly

Benchmark - Performance metrics vs # threads/VM (2 VMs)

• Graceful degradation

How not to do Evaluation
• No confidence intervals!
  • Observations for light loads are meaningless
• Pick someone your own size
  • Of course, their system is better than vanilla UML, so what?
  • Should have compared with a system that implements fair schedulers

Apache - 4 VMs

• Graceful degradation

Evolution of VCT w/ UML

• Unfairness at high load

Evolution of VCT w/ FVM

• Fair CPU allocation

Tput per VM w/ UML

• Unfairness at high load
• Unfair CPU allocations due to different paging treatment and process suspension

Tput per VM w/ FVM

• Fair behavior

Conclusions
• Distributed, application-driven resource allocation
  • (+) Cool idea
  • (-) Needs more research to be convincing
    • Experimental evaluation not satisfactory

More Discussion
