The Impact of Dynamic Monitoring Interval Adjustment on Power Consumption in Virtualized Data Centers
Mark White BSc., H.Dip
Submitted in accordance with the requirements for the degree of
Master of Science in Computer Science and Information Technology
Discipline of Information Technology, College of Engineering and Informatics
National University of Ireland, Galway
Research Supervisors: Dr. Hugh Melvin, Dr. Michael Schukat
Research Director: Prof. Gerard Lyons
September 2014
The candidate confirms that the work submitted is his own and that appropriate credit has been given where reference has been made to the work of others
Contents
Chapter 1 Introduction..........................................................................................................................1
1.1 The Hybrid Cloud...................................................................................................................1
1.2 Migration...............................................................................................................................2
1.3 Energy Efficiency....................................................................................................................2
1.4 Cooling...................................................................................................................................4
1.5 Research Objectives...............................................................................................................4
1.5.1 Hypothesis.....................................................................................................................4
1.5.2 CloudSim........................................................................................................................5
1.5.3 Methodology.................................................................................................................6
1.6 Conclusion.............................................................................................................................6
Chapter 2 Literature Review..................................................................................................................8
Introduction.......................................................................................................................................8
2.1 Performance versus Power....................................................................................................8
2.2 Increased Density..................................................................................................................9
2.3 Hardware.............................................................................................................................12
2.3.1 Uninterruptible Power Supply (UPS) & Power Distribution.........................................12
2.3.2 Servers, Storage Devices & Network Equipment.........................................................13
2.3.3 Cooling.........................................................................................................................13
2.3.4 Industry Standards & Guidelines..................................................................................15
2.3.5 Three Seminal Papers..................................................................................................17
2.4 Software..............................................................................................................................24
2.4.1 Virtualization................................................................................................................24
2.4.2 Migration.....................................................................................................................25
2.5 Monitoring Interval..............................................................................................................34
2.5.1 Static Monitoring Interval............................................................................................36
2.5.2 Dynamic Monitoring Interval.......................................................................................37
2.6 Conclusion...........................................................................................................................38
Chapter 3 CloudSim.............................................................................................................................39
Introduction.....................................................................................................................................39
3.1 Overview..............................................................................................................................39
3.2 Workload.............................................................................................................................40
3.3 Capacity...............................................................................................................................42
3.4 Local Regression / Minimum Migration Time (LR / MMT)...................................................44
3.5 Selection Policy – Local Regression (LR)...............................................................................44
3.6 Allocation Policy – Minimum Migration Time (MMT)..........................................................45
3.7 Default LRMMT....................................................................................................................45
3.7.1 init()..........................................................................................................................45
3.7.2 start()...........................................................................................................................46
3.8 Over-utilization....................................................................................................................48
3.9 Migration.............................................................................................................................50
3.10 Reporting.............................................................................................................................52
3.11 Conclusion...........................................................................................................................52
Chapter 4 Implementation..................................................................................................................54
Introduction.....................................................................................................................................54
4.1 Interval Adjustment Algorithm............................................................................................54
4.2 Comparable Workloads.......................................................................................................58
4.3 C# Calculator........................................................................................................................61
4.4 Interval Adjustment Code....................................................................................................64
4.5 Reporting.............................................................................................................................69
4.6 Conclusion...........................................................................................................................70
Chapter 5 Tests, Results & Evaluation.................................................................................................71
Introduction.....................................................................................................................................71
5.1 Tests & Results.....................................................................................................................71
5.2 Evaluation of Test Results....................................................................................................75
5.2.1 Why are the Over-Utilized Hosts / Migration Counts Reduced?..................................76
5.2.2 Result of Reduced Migration Count.............................................................................77
5.2.3 Scalability.....................................................................................................................77
5.3 Evaluation of CloudSim........................................................................................................78
5.3.1 Local Regression Sliding Window.................................................................................78
5.3.2 RAM.............................................................................................................................79
5.3.3 Dynamic RAM Adjustment...........................................................................................79
5.3.4 SLA-based Migration....................................................................................................79
Chapter 6 Conclusions.........................................................................................................................81
REFERENCES........................................................................................................................................83
APPENDIX A.........................................................................................................................................89
APPENDIX B.........................................................................................................................................90
List of Figures
Figure 1 Data Center Service Supply Chain..........................................................................................3
Figure 2 Relative contributions to the thermal output of a typical DC.................................................12
Figure 3 A Typical AHU Direct Expansion (DX) Cooling System.....................................................14
Figure 4 A Typical DC Air Flow System............................................................................................15
Figure 5 Performance of web server during live migration (C. Clark).................................................30
Figure 6 Pre-Copy algorithm...............................................................................................................32
Figure 7 CloudSim Architecture..........................................................................................................41
Figure 8 Flow Chart Depicting the LR / MMT simulation process......................................................48
Figure 9 Application of the Monitoring Interval Based on Weighted Utilization Average..................57
Figure 10 A screenshot of the data generated for calculation of the default workload.........................59
Figure 11 Intervals calculated during the dynamic simulation.............................................................59
Figure 12 Calculation of the Average CPU Utilization for the Default Files.......................................60
Figure 13 How the dynamic interval adjustment code interacts with CloudSim..................................66
Figure 14 Interval Calculation for the Dynamic Simulation................................................................72
Figure 15 VM decommissioning comparison......................................................................................73
Figure 16 Operational Hosts - Default Simulation...............................................................................74
Figure 17 Operational Hosts - Dynamic Simulation............................................................................74
Figure 18 Average CPU Utilization - Dynamic Simulation.................................................................75
Acknowledgements
My supervisor, Dr. Hugh Melvin, who identified at an early stage that (for the most part) I
could be left to my own devices to get on with the work required. His supervisory approach
resulted in the freedom to progress at my own pace knowing he was available as and when I
needed a ‘boost’. When clarity was requested, Hugh demonstrated an enviable ability to
extract the salient issue and point me in the right direction. Though his reflection typically
took place while cycling up a steep hill on his way home from work, the momentary pauses
during review meetings while he considered an issue were often more productive than hours
of reading code. Future students should be so lucky to have him oversee their research
endeavours.
My second supervisor, Dr. Michael Schukat, who is capable of clarifying a complicated issue
with a carefully worded question – followed (invariably) with a reassuring smile.
Dr. Ripduman Sohan & Dr. Sherif Akoush in the Computing Laboratory at Cambridge
University without whom I would not have identified the approach taken in this thesis. Over
the course of a few (all too brief) visits with them, I also became aware of the extent of my
intellectual abilities (and limitations!!!).
The principal author of the CloudSim framework, Dr. Anton Beloglazov. Despite his having
moved on from the University of Melbourne where he wrote CloudSim for his doctoral
thesis, his detailed responses to the countless queries I posed during the course of my
research were invaluable and generous to a fault.
My colleagues in the Discipline of IT at NUI Galway for timely coffee-breaks, lunch
invitations and encounters in the corridors – because the breaks are a vital constituent of the
work and the queries as to progress and words of support were more important than you
could possibly have imagined.
My parents, who repeatedly remind me that:
‘You are capable of anything you put your mind to’
Deirdre (Dee) O'Connor – if convention allowed your name would be on the title page!
Abstract
Virtualization is one of the principal data center technologies increasingly deployed in recent
years to meet the challenges of escalating costs, industry standards and the search for a
competitive edge. This thesis presents a novel approach to management of the virtualized
system which dynamically adjusts the monitoring interval with respect to the average CPU
utilization for the data center. The potential for reduced power consumption, by identifying
performance opportunities at an earlier stage than typical virtualized systems which use a
static interval, is analysed. It is proposed that the adjusted interval will result in analysis of
data center metrics being performed at a more appropriate level of granularity than is
possible with current static monitoring systems.
Chapter 1 Introduction
The availability of cloud-based Data Centers (DCs) in recent years has introduced significant
opportunities for enterprises to reduce costs. The initial Capital Expenditure (CapEx)
associated with setting up a DC has been prohibitively high in the past, but this may no
longer be the primary concern. For example, start-ups choosing to implement Infrastructure-
as-a-Service (IaaS) cloud architectures are free to focus on optimizing other aspects of the
business rather than worrying about raising the capital to build (and maintain) fully equipped
DCs. A young enterprise can now pay a relatively small monthly fee to Amazon (EC2) or
Microsoft (Azure), for example, in return for a scalable infrastructure on which to build their
new product or service. Existing companies are also availing of significant savings and
opportunities by moving to the cloud.
1.1 The Hybrid Cloud
In the future the architecture of cloud computing infrastructure will facilitate a business
moving the public portion of their services from one remote DC to another for cost or
efficiency gains. For example, a DC provider in one US state may be charging less for
compute time because energy costs in that state are lower than those in a neighbouring state.
Migration of enterprise services to the less expensive location could be facilitated. To enable
this type of migratory activity, the Distributed Management Task Force (DMTF) has created
the Open Virtualization Format (OVF) specification. The OVF standard “provides an
intermediary format for Virtual Machine (VM) images. It lets an organization create a VM
instance on top of one hypervisor and then export it to the OVF so that it can be run by
another hypervisor” [4]. With the exception of Amazon, all the major cloud providers (Citrix
Systems, IBM, Microsoft, Oracle and VMware) are involved in the development of OVF.
The short- and medium-term solution to the interoperability issue will certainly be
‘hybrid’ clouds where the enterprise maintains the private portion of their infrastructure on
their local network and the public portion is hosted on a federated cloud - facilitating indirect
(but not direct) movement between providers e.g. in a similar fashion to switching broadband
providers, a software development company may initially choose to lease a Microsoft data
center for their infrastructure but subsequently transfer to Google if the latter offering
becomes more suitable for their purposes (e.g. proximity to client requests or more energy
efficient).
Development of new products may be performed securely on the enterprise Local
Area Network (LAN) and subsequently ‘released’ onto the public cloud for global
distribution. Movement from one provider to another is currently (and for the foreseeable
future will be) performed manually by the enterprise administrator using separate
management interfaces (e.g. an Amazon API or a Microsoft API). The vision of the DMTF is a
unified interface known as Cloud Infrastructure Management Interface (CIMI). It is currently
a work in progress but is ultimately intended to facilitate direct transfer of data between cloud
providers.
The core technology that will facilitate this data transfer between providers is
virtualization – more specifically, migration of VMs.
1.2 Migration
The practice of regularly migrating services between providers may well become feasible in
the future, providing enterprises with significant opportunities to dynamically reduce the
energy portion of their Operating Expenditure (OpEx) budget. It would also make operators
more competitive: with all performance metrics being equal, an operator could gain its edge
from increased energy efficiency efforts. Hybrid cloud environments also facilitate
smaller IT teams, resulting in reduced staffing costs.
1.3 Energy Efficiency
Data centers currently account for close to 3% of all global energy consumed on an annual
basis. It is certain that the industry will continue to expand as increasing volumes of data are
generated, transmitted, stored and analysed. This expansion will require significantly more
energy than is currently used by the sector, energy which must be managed as responsibly as
possible. Energy management, however, is not possible without measurement.
The measurement of a DC’s energy efficiency helps staff and management focus on
the various subsystems of the operation with a view to improving the overall efficiency of the
data center. While advances in hardware and software continue apace, the DC industry has
only recently begun to consider the responsibility of ensuring that the energy it uses is not
wasted. The global economic downturn of 2007 played no small part in motivating DC
operators to review their practices. In an attempt to remain competitive, while constantly
upgrading infrastructure and services to meet the needs of their customers, data center
operators have since identified energy efficiency as a cost opportunity. The moral aspects of
managing energy for the future are all well and good. It appears more likely, however, that
the potential operational savings in the short to medium term have provided the primary
motivation for data center operators to take stock.
In addition to the operational savings achieved when the data center becomes more
energy efficient on a daily basis, additional capital savings may also be realized. All items of
IT equipment have a replacement interval which may be increased due to redundancies
discovered during the energy efficiency audit. For example, should the existing cooling
volume of the room be found to be in excess of requirements, additional air handling units
(AHUs) could be switched to standby, not only reducing the power consumed by that unit but
also increasing the interval before the unit needs to be repaired or replaced.
The amount of power and cooling that a DC uses on a day-to-day basis determines
how much irreplaceable fossil fuel it consumes and the quantity of carbon emissions for
which it is responsible.
Figure 1 Data Center Service Supply Chain
Within the supply chain of DC services, illustrated in Figure 1, the main emissions occur at
the power generation site. Location is a key factor for the CO2 intensity of the power
consumed by the data center. A gas- or coal-fired utility creates much more CO2
than a hydro- or wind-powered utility. For this reason, many green-field DCs are now being
located near low-cost, environmentally friendly power sources.
1.4 Cooling
Location is also a key factor with respect to cooling. A data center in a cool climate such as
Ireland requires less cooling power than a data center in a warmer climate such as Mexico.
To avail of climate-related opportunities, large-scale DCs have recently been built in
temperate locations such as Dublin (e.g. Google) and Sweden (e.g. Facebook), demonstrating
the significance of the cost reductions possible. This being the case, if migration of DC
services across Wide Area Networks (WANs) becomes cost-feasible in the future, concepts
such as ‘follow the moon’ / ‘follow the sun’ (where the services provided by a DC are
moved, across the network, closer to where they are most needed throughout the day) may
become prevalent. Migration of data center services across both Local Area Networks
(LANs) and WANs is discussed in more detail in Chapter 2.
1.5 Research Objectives
1.5.1 Hypothesis
While the effort to optimize the individual component costs (e.g. downtime) of a migration is
worthwhile, this research aims to investigate further opportunities for energy savings if,
rather than optimising the individual component costs, a migration is viewed as a single all-
encompassing entity and focus is applied to reducing the total number of migrations taking
place in a DC. Throughout a migration both the source and the destination servers are
running. Quite apart from the extra CPU processing, RAM access and bandwidth required to
achieve a migration, there is an additional energy cost associated with simply keeping both
servers simultaneously powered for the duration of the migration. In addition, if the
destination server was not previously running before the migration was initiated, the time
delay starting it up (as a new host machine) must also be factored into any calculation of
efficiency.
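As a rough illustration of this cost, the energy attributable to a single migration can be
modelled as the power drawn by both hosts for the duration of the migration, plus any
start-up cost for the destination. The sketch below is a simplification introduced here for
exposition; the class, parameters and linear model are assumptions, not a formula from the
literature or from CloudSim:

```java
/**
 * Simplified energy model for a single VM migration.
 * All names, parameters and the linear model are illustrative assumptions.
 */
public class MigrationEnergyModel {

    /**
     * @param sourcePowerW      average power drawn by the source host (watts)
     * @param destPowerW        average power drawn by the destination host (watts)
     * @param migrationSeconds  duration of the migration (seconds)
     * @param startupJoules     energy to boot the destination host if it was off (joules)
     * @return estimated energy cost of the migration in joules
     */
    public static double energyJoules(double sourcePowerW, double destPowerW,
                                      double migrationSeconds, double startupJoules) {
        // Both hosts must remain powered for the full duration of the migration.
        return (sourcePowerW + destPowerW) * migrationSeconds + startupJoules;
    }
}
```

Under these assumptions, even a 60 second migration between two 200 W hosts costs 24 kJ
before any start-up overhead, which is why reducing the total migration count is attractive.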
The principal metric for monitoring the DC workload is CPU utilization, the CPU being one
of the primary resources associated with servicing that workload. In a virtualized environment CPU
utilization is an indication of the processing capacity being used by a host while serving the
requirements of the VMs located on it. In current practice, the CPU utilization value
delivered to monitoring systems is averaged over a constant monitoring interval (typically 300
seconds). This interval is typically pre-configured (via a management interface) by the data
center operator, rendering it static. With a relatively small percentage of the host's CPU
concerned with running the virtualization hypervisor, CPU utilization is primarily dependent
on the workload being serviced by the VMs located on the host. This workload typically
varies with time as requests to the servers fluctuate outside the DC. As such, the frequency of
change of the CPU utilization value closely tracks the frequency of change of the incoming
workload.
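In practice, the static scheme just described amounts to buffering CPU samples and emitting
one average per fixed interval. A minimal sketch follows; the 300 second figure comes from
the text, but the class name and the simple arithmetic averaging are illustrative assumptions:

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Sketch of a conventional static monitoring scheme: CPU samples are
 * collected and reported as a single average once per fixed interval.
 * Names and the averaging method are assumptions for exposition only.
 */
public class StaticIntervalMonitor {
    public static final double MONITORING_INTERVAL = 300.0; // seconds, pre-configured

    private final List<Double> samples = new ArrayList<>();

    public void recordSample(double cpuUtilization) {
        samples.add(cpuUtilization);
    }

    /** Average delivered to the monitoring system at the end of each interval. */
    public double reportAndReset() {
        double sum = 0.0;
        for (double s : samples) {
            sum += s;
        }
        double avg = samples.isEmpty() ? 0.0 : sum / samples.size();
        samples.clear();
        return avg;
    }
}
```

The weakness this thesis targets is visible in the sketch: a brief utilization spike recorded
mid-interval is smoothed into the 300 second average and only surfaces when the interval ends.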
This thesis investigates the merits of moving from a fixed interval to one which is
dynamically adjusted based on the overall CPU utilization average of the DC. At each
interval a weighted CPU utilization average for the DC is calculated and the next monitoring
interval is adjusted accordingly. By dynamically adjusting the monitoring interval with
respect to the average CPU utilization of the DC, this research analyses the potential for
reduced power consumption through identification of performance opportunities at an earlier
stage than systems which use a static 300 second interval. It is proposed that these
performance opportunities would otherwise have remained hidden mid-interval. Calculated
on the basis of how ‘busy’ the DC currently is, the adjusted interval is more likely to be at an
appropriate level of granularity than its static counterpart.
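The adjustment step described above can be sketched as follows. The class name, the
exponential weighting scheme and the 30-300 second bounds are illustrative assumptions made
for this sketch, not the scheme implemented in Chapter 4:

```java
import java.util.List;

/**
 * Illustrative sketch of dynamic monitoring-interval adjustment:
 * the busier the DC, the shorter (finer-grained) the next interval.
 * All names, constants and the weighting scheme are assumptions.
 */
public class IntervalAdjuster {
    private static final double MIN_INTERVAL = 30.0;   // seconds, when fully loaded
    private static final double MAX_INTERVAL = 300.0;  // seconds, the static default

    private double weightedAverage = 0.0;
    private final double alpha; // weight given to the newest DC-wide sample

    public IntervalAdjuster(double alpha) {
        this.alpha = alpha;
    }

    /** Exponentially weighted average of DC-wide CPU utilization (0.0 to 1.0). */
    public double updateAverage(List<Double> hostUtilizations) {
        double sum = 0.0;
        for (double u : hostUtilizations) {
            sum += u;
        }
        double dcAverage = hostUtilizations.isEmpty() ? 0.0 : sum / hostUtilizations.size();
        weightedAverage = alpha * dcAverage + (1 - alpha) * weightedAverage;
        return weightedAverage;
    }

    /** Next monitoring interval, interpolated between the static default and the minimum. */
    public double nextInterval() {
        return MAX_INTERVAL - weightedAverage * (MAX_INTERVAL - MIN_INTERVAL);
    }
}
```

With these assumed bounds, a DC whose weighted utilization average reaches 50% would see
its next interval shortened from 300 s to 165 s, while a lightly loaded DC stays close to the
static default and so incurs little extra monitoring overhead.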
1.5.2 CloudSim
A secondary objective of this research was to examine the efficacy of the CloudSim
framework with respect to simulation of power-aware DCs. Given the lack of access for
researchers to ‘real-world’ data center infrastructure, a robust simulator with which to
experiment is of paramount importance. CloudSim is one such framework and is currently
deployed by many researchers in the field of data center energy efficiency worldwide. It is
discussed in detail in Chapter 3.
The online forums relating to the CloudSim framework are very active with researchers
attempting to establish the best way to achieve their objectives. While the documentation for
the code is extensive (and there are a number of basic examples of how the software can be
used included in the CloudSim framework), there is little by way of explanation of the
methodologies used by the original author of the code, thus resulting in each individual
researcher having to spend an inordinate amount of time investigating the capabilities (and
limitations) of the framework. This can only be achieved by reviewing many thousands of
lines of code and testing to establish the functionality of each module and method.
Through the course of this research, a number of CloudSim issues were identified
which, it is hoped, will prove useful to future researchers. They are discussed chronologically
(at the point in development when they were identified) and relate to both the framework
code and the accuracy of virtual machine and migration simulation. They are also
summarized in Chapter 5.
1.5.3 Methodology
This thesis uses the CloudSim framework (described in more detail in Chapter 3) as the base
simulator for implementation and testing of the hypothesis. A considerable review of the
existing CloudSim code was required to establish the capabilities of the framework and also
to identify what additional code modules would be needed to meet the thesis objectives.
Ultimately it was found that no facility existed in CloudSim to test the hypothesis and thus a
number of extensions to the existing code were designed and developed. These were then
integrated with the framework such that the default CloudSim simulation could be reliably
compared with the dynamic extension created for this research, i.e. the code implementing dynamic
interval adjustment.
1.6 Conclusion
The remainder of this thesis is structured as follows. The literature review in Chapter 2
describes current (and future) efforts to improve energy efficiency in the data center industry.
Both hardware and software approaches are discussed, with a focus on virtualized systems,
installed as standard in all green-field DCs and retro-fitted to the majority of existing brown-
field sites. Chapter 3 details the specific modules in the CloudSim framework required to
build the test bed for analysis of the hypothesis. An explanation as to how these modules
interact with each other is also provided. Chapter 4 specifies the new Java methods written to
create and test the hypothesis. Integration of the new code with the existing framework is also
described. Chapter 5 discusses the tests performed to evaluate the hypothesis and analyses the
results in the context of current energy efficiency efforts in the data center industry. Chapter 6
concludes this thesis with a summary of the limitations identified in the CloudSim
framework, in the hope that the work of future researchers can more effectively benefit from,
and build upon, its code-base.
Chapter 2 Literature Review
Introduction
By adjusting the DC monitoring interval with respect to the incoming workload, this thesis
investigates opportunities for more energy efficient management of data center resources.
Given this objective, extensive examination of the evolution of DC resource management
methods over the last few years was required in an effort to identify an approach which had
not been previously applied.
This thesis is primarily concerned with the energy efficiency of DCs when migrating
VMs across the LAN and WAN. To contextualize the research more completely, the
following literature review extends the introductory discussion in Chapter 1 to encompass the
entire data center infrastructure, analyzing current (and previous) efforts by operators and
academic researchers to reduce power consumption from, not only a software, but also a
hardware perspective. The chapter closes with an in-depth review of existing monitoring
interval approaches and technologies.
2.1 Performance versus Power
Most of the advances achieved by both the DC industry and academic researchers before
2006 paid particular attention to the performance of the infrastructure, with the principal
focus of operator efforts set firmly on keeping the DC available to clients 24/7. In fact, when
advertising and selling the services they offer, operators still choose to feature their ‘uptime’
percentage as their primary Unique Selling Point (USP). The ‘5 nines’ (i.e. 99.999% uptime),
denoting High Availability (HA), are seldom omitted from a typical DC operator’s marketing
material. However, the increase in power consumption required to boost performance seldom
received more attention than summary recognition as an additional expense. The power /
performance trade-off is undoubtedly a difficult hurdle to overcome, especially while cost
competitiveness is uppermost in the minds of data center operators. Invariably, before 2006,
most commercial development efforts to improve the operation of DCs were focussed solely
on performance.
In more recent years, increased consumer demand for faster traffic and larger, more flexible,
storage solutions has changed how the industry views the resources required to operate
competitively. More equipment (e.g. servers, routers) has been required to meet demand but
the space required to accommodate this equipment has already been allocated to existing
equipment. The strategy adopted since 2006 by a DC industry looking to the future was to
increase the density of IT equipment rather than take the more expensive option of purchasing (or
renting) additional square footage. The solution combined new server technologies and
virtualization.
2.2 Increased Density
An analogy: increasing infrastructural density in a data center is similar to adding more
bedrooms to a house without extending the property. The house can now accommodate
private spaces for more people but each person has less space than before. In the data center
there are now more servers per square foot, resulting in more compute / storage capability.
Despite the space-saving advantages of VM technology and techniques (i.e. migration),
which reduced the number of servers required to host applications, the primary disadvantage
of increased density was that each new blade server required significantly more power than
its predecessor. A standard rack with 65-70 blades operating at high loads might require 20 -
30kW of power compared with previous rack consumptions of 2 - 5kW. This additional
power generates additional heat. In a similar manner to maintaining comfortable levels of
heat and humidity for people in a house, heat in the rack, and resultant heat in the server
room, must be removed to maintain the equipment at a safe operating temperature and
humidity. In summary, the introduction of increased server room density from 2006 onwards
resulted in increased power and cooling requirements for modern DCs.
At their 25th Annual Data Center Conference held in Las Vegas in late November
2006, Gartner analysts hypothesized that:
“…by 2008, 50% of current data centers will have insufficient
power and cooling capacity to meet the demands of high-density
equipment…” [1]
During his address to the conference, Gartner Research Vice President, Michael Bell
suggested that: “Although power and cooling challenges will not be a perpetual problem, it is
important for DC managers to focus on the electrical and cooling issue in the near term, and
adopt best practice to mitigate the problem before it results in equipment failure, downtime
and high remediation costs”. This was one of the first ‘shots across the bow’ for a data center
industry which, until then, had focussed solely on improving performance (e.g. uptime,
response time), almost without regard for escalating energy costs.
Based on data provided by IDC [2], Jonathan Koomey published a report [3] in
February 2007 estimating the electricity used by all DCs, both in the US and globally, in
2005. The executive summary states that:
“The total power demand in 2005 (including associated
infrastructure) is equivalent (in capacity terms) to about five 1000
MW power plants for the U.S. and 14 such plants for the world. The
total electricity bill for operating those servers and associated
infrastructure in 2005 was about $2.7 billion and $7.2 billion for the
U.S. and the world, respectively.”
A few months later the global economic downturn brought with it increasingly restrictive
operating budgets and higher energy prices. The competitive edge was becoming harder to
identify. Quite apart from the economic factors affecting the industry, the timely publication
by the EPA of its report to the US Congress [4] in August 2007 highlighted significant
opportunities to reduce both capital and operating costs by optimizing the power and cooling
infrastructure involved in data center operations. Industry analysts were once again
identifying an escalating power consumption trend which required immediate attention.
The report assessed the principal opportunities for energy efficiency improvements in
US DCs. The process of preparing the report brought all the major industry players together.
In an effort to identify a range of energy efficiency opportunities, 3 main improvement
scenarios were formulated:
1. Improved Operation: maximizes the efficiency of the existing data center
infrastructure by utilizing improvements such as ‘free cooling’ and raising
temperature / humidity set-points. Minimal capital cost (‘the low hanging fruit’) is
incurred by the operator
2. Best Practice: adopt practices and technologies used in the most energy-efficient
facilities
3. State-of-the-art: uses all available energy efficiency practices and technologies
The potential energy savings and associated capital cost calculated for each of the 3 scenarios
respectively were:
1. Improved Operation: 20% saving - least expensive
2. Best Practice: 45% saving
3. State-of-the-art: 55% saving - most expensive
Notably, a proviso was also offered by the report in that: “…due to local constraints, the best
strategy for a particular data center could only be ascertained by means of a site-specific
review - not all suggested scenarios apply to all data centers.” Regardless of which (if any)
strategy a particular data center operator subsequently adopted, a site-specific review
invariably demonstrated that reducing power consumption was a viable opportunity not only
to significantly reduce both capital and operating costs, but also to regain a competitive edge.
The economic downturn, the Gartner conference and the reports by both the EPA and
Koomey were a significant part of the catalyst for the energy approach beginning to receive a
level of attention closer, if not equal, to that of the performance approach in previous years.
Efficient management of power and cooling, while maintaining performance levels, became
the order of the day.
At the highest level, DC infrastructure can be subdivided into hardware and software.
While it is true that both are inextricably linked to the energy performance of the DC, it is
useful for the purposes of this review to examine them separately.
2.3 Hardware
Rasmussen [5] identified power distribution, conversion losses and cooling as representing
between 30 – 45% of the electricity bill in larger DCs. Cooling alone accounted for 30% of
this total.
Figure 2 Relative contributions to the thermal output of a typical DC
2.3.1 Uninterruptible Power Supply (UPS) & Power Distribution
The power being provided to the IT equipment in the racks is typically routed through an
Uninterruptible Power Supply (UPS) which feeds Power Distribution Units (PDUs) located
in or near the rack. Through use of better components, circuit design and right-sizing
strategies, manufacturers such as American Power Conversion (APC) and Liebert have
turned their attention to maximizing efficiency across the full load spectrum, without
sacrificing redundancy. Some opportunities may exist in efforts to re-balance the load across
the 3 phases supplying the power to the racks but efficiencies in the power supply &
distribution system are outside the scope of this research.
2.3.2 Servers, Storage Devices & Network Equipment
Manufacturers such as IBM and Intel are designing increasingly efficient server blades with
features such as chip-level thermal strategies (Dynamic Voltage & Frequency Scaling
(DVFS)), multicore processors and power management leading the way. Enterprise operators
such as Google and Facebook have recently designed and installed their own servers which
have demonstrated increased efficiencies but these servers are specifically ‘fit-for-purpose’.
They may not be sufficiently generic to be applicable to a majority of DC configurations.
2.3.3 Cooling
There are a variety of standard systems for cooling in data centers but all typically involve
Air Handling Units (AHUs) or Computer Room Air Handlers (CRAHs). Well-designed DCs
have aligned their racks in an alternating hot aisle / cold aisle configuration with cold air from
the AHU(s) entering the cold aisle through perforated or grated tiles above a sub-floor
plenum. Hot air is exhausted from the rear of the racks and removed from the room back to
the same AHU(s) forming a closed-loop system. The hot air is passed directly over an
evaporator (Figure 3: 4) in the AHU which contains a liquid refrigerant (e.g. ethylene glycol /
water solution). The amount of heat absorbed is determined by the speed of the air crossing
the coil and / or the flow rate of the refrigerant through the coil. The flow rate is controlled by
tandem scroll compressors (Fig 3: 1). A dead-band setting is applied to each AHU and is
divided equally between all the compressors in the system. As the temperature rises through
each dead-band increment above the set-point, another compressor engages to increase the
flow rate. As the temperature falls back through the increments toward the set-point, the
compressors disengage in turn, reducing the flow through the evaporator until the set-point is
reached again. The heat absorbed through the coil is fed to an array of condensers outside the
DC, where it is rejected to the atmosphere as exhaust or is reused in some other part of the
facility. The set point of the AHU is configured on installation of the unit and must (if
deemed appropriate) be changed manually by a member of staff following analysis and
review. Unfortunately these reviews happen all too seldom in typical DCs, despite the
inevitable changes taking place in the server room workload on a daily basis.
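The staged compressor behaviour described above can be sketched as follows. The function name and parameters are illustrative, not taken from any particular AHU controller; the sketch assumes the dead-band above the set-point is split into equal increments, one per compressor, as described in the text:

```python
def staged_compressors(return_temp, set_point, dead_band, n_compressors):
    """Number of compressors engaged for a given return-air temperature.

    The dead-band above the set-point is divided into equal increments,
    one per compressor; each increment crossed engages one more unit.
    All temperatures in degrees Celsius, dead_band in Kelvin.
    """
    if return_temp <= set_point:
        return 0  # at or below set-point: no additional flow needed
    step = dead_band / n_compressors
    engaged = int((return_temp - set_point) // step) + 1
    return min(engaged, n_compressors)  # never more than installed units
```

For example, with a 22 °C set-point, a 4 K dead-band and 4 compressors, a return-air temperature of 23.5 °C would engage two compressors under this scheme.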
Figure 3 A Typical AHU Direct Expansion (DX) Cooling System
Depending on the configuration, the heat removal system may consume as much as 50% of
a typical DC’s energy. Industry is currently embracing a number of opportunities involving
temperature and airflow analysis:
1. aisle containment strategies
2. increasing the temperature rise (ΔT) across the rack
3. raising the operating temperature of the AHU(s)
4. repositioning AHU temperature and humidity sensors
5. thermal management by balancing the IT load layout [6, 7]
6. ‘free cooling’ – eliminating the high-consumption chiller from the system through
the use of strategies such as air- and water-side economizers
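The benefit of increasing ΔT (item 2) follows from the sensible-heat relation Q = ṁ·c_p·ΔT: for a fixed heat load, a larger temperature rise across the rack means proportionally less airflow for the AHU fans to deliver. A minimal sketch (function name and default air properties are illustrative approximations for air near room temperature):

```python
def required_airflow_m3s(heat_kw, delta_t, rho=1.2, cp=1.005):
    """Volumetric airflow (m^3/s) needed to remove a sensible heat load.

    From Q = rho * V * cp * dT  =>  V = Q / (rho * cp * dT), with
    rho ~ 1.2 kg/m^3 and cp ~ 1.005 kJ/(kg K) for air near room temp.
    heat_kw: rack heat load in kW; delta_t: air temperature rise in K.
    """
    return heat_kw / (rho * cp * delta_t)
```

Under these assumptions a 25 kW rack at ΔT = 10 K needs roughly 2.1 m³/s of air; doubling ΔT to 20 K halves that requirement.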
Figure 4 A Typical DC Air Flow System
In addition to temperature maintenance, the AHUs also vary the humidity of the air entering
the server room according to set-points. Low humidity (dry air) may cause static which has
the potential to short electronic circuits. High levels of moisture in the air may lead to faster
component degradation. Although less of a concern as a result of field experience and recent
studies performed by Intel and others, humidity ranges have been defined for the industry and
should be observed to maximize the lifetime of the IT equipment. Maintaining these humidity
ranges lengthens the interval between equipment replacement cycles and, as a result, has a
net positive effect on capital expenditure budgets.
2.3.4 Industry Standards & Guidelines
2.3.4.1 Standards
Power Usage Effectiveness (PUE2) [8] is now the de facto standard used to measure a DC’s
efficiency. It is defined as the ratio of all electricity used by the DC to the electricity used just
by the IT equipment. In contrast to the original PUE [9] rated in kilowatts of power (kW),
PUE2 must be based on the highest measured kilowatt hour (kWh) reading taken during
analysis. In 3 of the 4 PUE2 categories now defined, the readings must span a 12 month
period, eliminating the effect of seasonal fluctuations in ambient temperatures:
PUE = Total Data Centre Electricity (kWh) / IT Equipment Electricity (kWh)
A PUE of 2.0 suggests that for each kWh of IT electricity used another kWh is used by the
infrastructure to supply and support it. The most recent PUE averages [10] for the industry
fall within the range of 1.83 – 1.92 with worst performers coming in at 3.6 and a few top
performers publishing results below 1.1 in recent months. Theoretically, the best possible
PUE is 1.0 but a web-hosting company (Pair Networks) recently quoted a PUE of 0.98 for
one of its DCs in Las Vegas, Nevada. Their calculation was based on receipt of PUE ‘credit’
for contributing unused power (generated on-site) back to the grid. Whether additional PUE
‘credit’ should be allowed for contributing to the electricity grid is debatable. If this were the
case, with sufficient on-site generation, PUE could potentially reach 0.0 and cease to have
meaning. Most DCs are now evaluating their own PUE ratio to identify possible
improvements in their power usage. Lower PUE ratios have become a very marketable aspect
of the data center business and have been recognized as such. Other standards and metrics
(2.3.4.2.1 – 2.3.4.2.4) have been designed for the industry but, due for the most part to the
complex processes required to calculate them, have not yet achieved the same widespread
popularity as PUE and PUE2.
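As a sketch of the PUE calculation (the function is illustrative; a real PUE2 assessment aggregates metered kWh readings over a 12-month period as described above):

```python
def pue(total_kwh, it_kwh):
    """PUE = total facility energy / IT equipment energy, both in kWh.

    A value of 2.0 means one kWh of overhead (cooling, power
    distribution, lighting) per kWh delivered to the IT equipment;
    1.0 is the theoretical minimum.
    """
    if it_kwh <= 0:
        raise ValueError("IT energy must be positive")
    return total_kwh / it_kwh
```

For example, a facility drawing 1,800 kWh in total while its IT equipment uses 1,000 kWh has a PUE of 1.8, close to the industry averages quoted above.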
2.3.4.2 Other Standards
2.3.4.2.1 Water Usage Effectiveness (WUE) measures DC water usage to provide an
assessment of the water used on-site for operation of the data center. This includes water used
for humidification and water evaporated on-site for energy production or cooling of the DC
and its support system.
2.3.4.2.2 Carbon Usage Effectiveness (CUE) measures DC-level carbon emissions.
CUE does not cover the emissions associated with the lifecycle of the equipment in the DC or
the building itself.
2.3.4.2.3 The Data Center Productivity (DCP) framework is a collection of metrics
which measure the consumption of a DC-related resource in terms of DC output. DCP looks
to define what a data center accomplishes relative to what it consumes.
2.3.4.2.4 Data Center Compute Efficiency (DCCE) enables data center operators to
determine the efficiency of compute resources. The metric makes it easier for data center
operators to discover unused servers (both physical and virtual) and decommission or
redeploy them.
Surprisingly, efficiency improvements have not been widely implemented. 73% of
respondents to a recent Uptime Institute survey [11] stated that
someone outside of the data center (the real estate / facilities department) was responsible for
paying the utility bill. 8% of data center managers weren’t even aware who paid the bill. The
lack of accountability is obvious and problematic. If managers are primarily concerned with
maintaining the DC on a daily basis there is an inevitable lack of incentive to implement even
the most basic energy efficiency strategy in the short to medium term. It is clear that a
paradigm shift is required to advance the cause of energy efficiency monitoring at the ‘C-
level’ (CEO, CFO, CIO) of data center operations.
2.3.4.3 Guidelines
Data center guidelines are intermittently published by The American Society of Heating,
Refrigeration and Air Conditioning Engineers (ASHRAE). These guidelines [12, 13] suggest
‘allowable’ and ‘recommended’ temperature and humidity ranges within which it is safe to
operate IT equipment. The most recent edition of the guidelines [14] suggests operating
temperatures of 18 – 27°C. The maximum for humidity is 60% RH.
One of the more interesting objectives of the recent guidelines is to have the rack inlet
recognized as the position from where the temperature and humidity should be measured. The
majority of DCs currently measure at the return inlet to the AHU, despite more relevant
temperature and humidity metrics being present at the inlet to the racks.
2.3.5 Three Seminal Papers
In the context of improving the hardware infrastructure of the DC post-2006, three academic
papers were found to be repeatedly referenced as forming a basis for the work of the most
prominent researchers in the field. They each undertake a similar methodology when
identifying solutions and are considered to have led the way for a significant number of
subsequent research efforts. The methodologies which are common to each of the papers (and
relevant to this thesis) include:
1. Identification of a power consumption opportunity within the DC and adoption of
a software-based solution
2. Demonstration of the absolute requirement for monitoring the DC environment as
accurately as possible without overloading the system with additional processing
A summary review of the three papers follows.
2.3.5.1 Paper 1: Viability of Dynamic Cooling Control in a Data Center Environment (2006)
In the context of dynamically controlling the cooling system Boucher et al. [15] focused their
efforts on 3 requirements:
1. A distributed sensor network to indicate the local conditions of the data center.
Solution: a network of temperature sensors was installed at:
Rack inlets
Rack outlets
Tile inlets
2. The ability to vary cooling resources locally. Solution: 4 actuation points, which exist
in a typical data center, were identified as having further potential in maintaining
optimal server room conditions:
2.1 CRAC supply temperature – this is the temperature of the conditioned air
entering the room. CRACs are typically operated on the basis of a single
temperature sensor at the return side of the unit. This sensor is responsible for
taking an average of the air temperature returning from the room. The CRAC then
correlates this reading with a set-point which is configured manually by data
center staff. The result of the correlation is the basis upon which the CRAC
decides by how much the temperature of the air sent back out into the room should
be adjusted. Variation is achieved in a Direct Expansion (DX) system with
variable capacity compressors varying the flow of refrigerant across the cooling
coil. In a water-cooled system chilled water supply valves modulate the
temperature.
2.2 The crucial element in the operational equation of the CRAC, regardless of the
system deployed, is the set-point. The set-point is manually set by data center staff
and generally requires considerable analysis of the DC environment before any
adjustment is made. Typically, the set-point is configured (when the CRAC is
initially installed) according to some prediction of the future cooling demand. Due
to a number of factors (including the cost of consultancy), regular analysis of the
room’s thermal dynamics is rarely, if ever, performed. This is despite the installation
of additional IT equipment (and increased workload on the existing infrastructure)
throughout the lifecycle of the data center. Clearly, a very static situation exists in
this case.
2.3 CRAC fan speed – the speed at which the fans in the CRAC blow the air into
the room (via a sub-floor plenum). In 2006 (at the time of this paper), typical
CRACs had fans running at a set speed and without further analysis no
reconfiguration took place after installation. Most CRACs since then have been
designed with Variable Speed Drives (VSDs) - which can vary the speed of the
fan according to some set of rules. However, with no dynamic thermal analysis of
the DC environment taking place on a regular basis, the VSD rules are effectively
hardwired into the system. The VSDs are an unused feature of the CRAC as a
result.
2.4 Floor tile openings – the openings of the floor tiles in the cold aisle. The
velocity at which the cold air leaving the CRAC enters the room is dependent
upon a number of factors. Assuming it has passed through the sub-floor plenum
with minimal pressure loss, the air will rise into the room at some velocity (via the
floor tile openings). Floor tiles are either perforated or grated. Perforated tiles
typically have 25% of their surface area open whereas grated tiles may have 40 –
60% of their surface open. The more open surface area available on the tile the
higher the velocity with which the air will enter the room. The authors had
previously designed and implemented a new tile - featuring an electronically
controlled sliding damper mechanism which could vary the size of the opening
according to requirements.
It is evident that, as a typical DC matures and the thermodynamics of the
environment change with higher CPU loads and additional IT equipment, a dynamic
cooling control system is needed to keep the cooling infrastructure configured for
maximum efficiency. Boucher et al. propose that this control system should be based
on the 4 available actuation points above.
3. Knowledge of each variable’s effect on the DC environment. Solution: the paper
focused on how each of the actuator variables (2.1 – 2.4 above) affects the thermal
dynamics of the data center.
Included in the findings of the study were:
CRAC supply temperatures have an approximate linear relationship with rack inlet
temperatures. An anomaly was identified where the magnitude of the rack inlet
response to a change in CRAC supply temperature was not of the same order. Further
study was suggested.
Under-provisioned flow provided by the CRAC fans affects the Supply Heat Index
(SHI*) but overprovisioning has a negligible effect. SHI is a non-dimensional
measure of the local magnitude of hot and cold air mixing. Slower air flow rates cause
an increase in SHI (more mixing) whereas faster air flow rates have little or no effect.
*SHI is also referred to as Heat Density Factor (HDF). The metric is based on the
principle of a thermal multiplier which was formulated by Sharma et al. [16]
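Based on the thermal-multiplier formulation of Sharma et al., SHI can be expressed as the fraction of the rack's total enthalpy rise that occurs before the inlet (i.e. due to recirculated hot air), using temperatures relative to the CRAC supply. A sketch assuming this common form of the metric:

```python
def supply_heat_index(t_rack_in, t_rack_out, t_crac_supply):
    """SHI = (T_in - T_sup) / (T_out - T_sup).

    0 means the rack inhales air at the CRAC supply temperature
    (no mixing); values approaching 1 indicate heavy recirculation
    of hot exhaust air into the cold aisle.
    """
    return (t_rack_in - t_crac_supply) / (t_rack_out - t_crac_supply)
```

For instance, with a 15 °C supply, an 18 °C rack inlet and a 33 °C rack outlet, one sixth of the total temperature rise happened before the inlet, giving an SHI of about 0.17.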
The study concluded that significant energy savings (in the order of 70% in this case) were
possible where a dynamic cooling control system, controlled by software, was appropriately
deployed.
2.3.5.2 Paper 2: Impact of Rack-level Compaction on the Data Center Cooling Ensemble (2008)
Shah et al. [17] deal with the impact on the data center cooling ensemble when the density of
compute power is increased. The cooling ‘ensemble’ is considered to be all elements of the
cooling system from the chip to the cooling tower.
Increasing density involves replacing low-density racks with high-density blade
servers and has been the chosen alternative to purchasing (or renting) additional space for
most DCs in recent years. New enterprise and co-location data centers also implement the
strategy to maximize the available space. Densification leads to increased power dissipation
and corresponding heat flux within the DC environment.
A typical cooling system performs two types of work:
1. Thermodynamic – removes the heat dissipated by the IT equipment
2. Airflow – moves the air through the data center and related systems
The metric chosen by Shah et al. for evaluation in this case is the ‘grand’ Coefficient of
Performance (COPG) which is a development of the original COP metric suggested by Patel
et al. [18, 19]. It measures the amount of heat removed by the cooling infrastructure per unit
of power input and does so at a more granular level than the traditional COP used in
thermodynamics, specifying heat removal at the chip, system, rack, room and facility levels.
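In essence, COP-style metrics divide the heat removed by the power spent removing it, and the ‘grand’ variant aggregates this across the ensemble’s components. An illustrative sketch (the simple summation shown is a simplification of the full COPG model, not Shah et al.’s exact formulation):

```python
def cop(heat_removed_kw, power_input_kw):
    """Classic coefficient of performance: heat out per unit work in."""
    return heat_removed_kw / power_input_kw

def cop_grand(component_heat_kw, component_power_kw):
    """Aggregate COP over cooling-ensemble components (chip, rack,
    room, facility): total heat extracted divided by total cooling
    power input across all components."""
    return sum(component_heat_kw) / sum(component_power_kw)
```

A system removing 100 kW of heat for 25 kW of cooling power has a COP of 4; higher values mean a more efficient ensemble, which is why the 15% COPG increase reported below represents a real energy saving.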
In order to calculate the COPG of the model used for the test case each component of
the cooling system needed to be evaluated separately, before applying each result to the
overall system. Difficulties arose where system-level data was either simply unavailable or,
due to high heterogeneity, impossible to infer. However, the model was generic enough that it
could be applied to the variety of cooling systems currently being used by ‘real world’ DCs.
Note: in a similar vein, the research for this thesis examines the CPU utilization of
each individual server in the data center such that an overall DC utilization metric at each
interval can be calculated. Servers which are powered-off at the time of monitoring have no
effect on the result and are excluded from the calculation.
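That calculation can be sketched as follows. This is a simplified illustration of the thesis methodology, with `None` used here to mark a server powered off at the monitoring instant:

```python
def dc_utilization(server_utils):
    """Mean CPU utilization across powered-on servers only.

    server_utils: per-server utilization fractions (0.0 - 1.0),
    with None for servers powered off at this monitoring interval;
    those servers are excluded from the average.
    """
    active = [u for u in server_utils if u is not None]
    return sum(active) / len(active) if active else 0.0
```

For example, `[0.5, None, 0.7]` yields an overall utilization of 0.6, since the powered-off server is ignored rather than counted as zero.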
The assumption that increased density necessarily leads to a less efficient cooling system is
incorrect. If elements of the cooling system were previously running at low loads they would
typically have been operating at sub-optimal efficiency levels. Increasing the load on a
cooling system may in fact increase its overall efficiency through improved operational
efficiencies in one or more of its subsystems.
For Shah’s research, 94 existing low-density racks were replaced with high-density Hewlett
Packard (HP) blades. The heat load increased from 1.9MW to 4.7MW. The new heat load was
still within the acceptable range for the existing cooling infrastructure. No modifications to
the ensemble were required.
Upon analysis of the results, COPG was found to have increased by 15%. This was, in part,
achieved with improved efficiencies in the compressor system of the CRACs. While it is
acknowledged that there is a crossover point at which compressors become less efficient, the
increase in heat flux of the test model resulted in raising the work of the compressor to a
point somewhere below this crossover. The improvement in compressor efficiency was
attributed to the higher density HP blade servers operating at a higher ΔT (reduced flow rates)
across the rack. The burden on the cooling ensemble was reduced - resulting in a higher
COPG.
With the largest individual source of DC power consumption (about 40% in this case)
typically coming from the CRAC - which contains the compressor - it makes sense to direct
an intelligent analysis of potential operational efficiencies at that particular part of the system.
The paper states that: “The continuously changing nature of the heat load distribution
in the room makes optimization of the layout challenging; therefore, to compensate for
recirculation effects, the CRAC units may be required to operate at higher speeds and lower
supply temperature than necessary. Utilization of a dynamically coupled thermal solution,
which modulates the CRAC operating points based on sensed heat load, can help reduce this
load”.
In this paper Shah et al. present a model for performing evaluation of the cooling
ensemble using COPG, filling the gap of knowledge through detailed experimentation with
measurements across the entire system. They conclude that energy efficiencies are possible
via increased COP in one or more of the cooling infrastructure components. Where thermal
management strategies capable of handling increased density are in place, there is significant
motivation to increase density without any adverse impact on energy efficiency.
2.3.5.3 Paper 3: Data Center Efficiency with Higher Ambient Temperatures and Optimized Cooling Control (2011)
Ahuja et al. [20] introduce the concept of ‘deviation from design intent’. When a data center
is first outfitted with a cooling system, best estimates are calculated for future use. The
intended use of the DC in the future is almost impossible to predict at this stage. As the
lifecycle of the DC matures, the IT equipment will deviate from the best estimates upon
which the cooling system was originally designed to operate. Without on-going analysis of
the DC’s thermal dynamics, the cooling system may become decreasingly ‘fit-for-purpose’.
As a possible solution to this deviation from intent, this paper proposes that cooling of
the DC environment should be controlled from the chip rather than a set of remote sensors in
the room or on the rack doors. Each new IT component would have chip-based sensing
already installed and therefore facilitate a “plug ‘n’ play” cooling system.
The newest Intel processors (since Intel® Pentium® M) on the market feature an ‘on-
die’ Digital Thermal Sensor (DTS). DTS provides the temperature of the processor and
makes the result available for reading via Model Specific Registers (MSRs). The Intel white
paper [21] which describes DTS states that:
“… applications that are more concerned about power consumption
can use thermal information to implement intelligent power
management schemes to reduce consumption.”
While Intel is referring to power management of the server itself, DTS could
theoretically be extended to the cooling management system also.
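The DTS digital readout in the IA32_THERM_STATUS register (MSR 0x19C) reports degrees Celsius below TjMax in bits 22:16, with a reading-valid flag in bit 31. A sketch of the decoding step follows; the default TjMax of 100 °C is an assumption for illustration, as the actual value is processor-specific (exposed via MSR_TEMPERATURE_TARGET), and reading the raw MSR itself requires privileged access:

```python
def dts_temperature(therm_status, tj_max=100):
    """Decode core temperature from a raw IA32_THERM_STATUS value.

    Bits 22:16 hold the 'digital readout': degrees Celsius below
    TjMax. The result is meaningful only when the reading-valid
    bit (bit 31) is set.
    """
    if not (therm_status >> 31) & 1:
        raise ValueError("DTS reading not valid")
    readout = (therm_status >> 16) & 0x7F  # 7-bit field, bits 22:16
    return tj_max - readout
```

A cooling control system polling such per-chip readings, as Ahuja et al. envisage, would act on the temperature the equipment actually experiences rather than on a remote average.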
Current DCs control the air temperature and flow rate from the chip to the chassis but
there is a lack of integration once the air has left the chassis. If the purpose of the data center
is to house, power and cool every chip then it has the same goal as the chassis and the chassis
is already taking its control data from the chip. This strategy needs to be extended to the
wider server room environment in an integrated manner.
The industry has recently been experimenting with positioning the cooling sensors at
the front of the rack rather than at the return inlet of the AHU. The motivation for this is to
sense the air temperature which matters most – the air which the IT equipment uses for
cooling. The disadvantage of these remote sensors (despite being better placed than sensors at
the AHU return inlet) is that they are statically positioned, a position which may later be
incorrect should changes in the thermal dynamics of the environment occur. The closer to the
server one senses - the more reliable the sensed data will be for thermal control purposes.
Ahuja et al. propose that the logical conclusion is to move the sensors even closer to the
server – in fact, right into the processor. If those sensors already exist (as is the case with the
Intel processors) then use should be made of them for a more accurate cooling management
system.
The paper investigates the possible gains by moving the temperature sensors (and
changing the set-point accordingly) to a variety of positions in the DC:
1. AHU return – 28°C
2. AHU supply – 18°C
3. Rack inlet – 23°C
4. Server – 30°C
The first test was carried out on a single isolated rack with those results then
extrapolated to a DC model with a cooling capacity of 100kW. Four perimeter down-flow AHUs
(N + 1 redundancy) performed the heat removal. While the 4 rows in the DC were not
contained they did follow the standard hot / cold aisle arrangement. The tests showed that use
of the server sensors resulted in more servers being maintained within the ASHRAE
guideline temperature range of 18 – 27°C. Controlling the cooling system at the server
yielded maximum benefit.
Ahuja et al. concluded that a processor-based set of metrics capable of controlling a
power management scheme on the server should, by extension, also be capable of controlling
a dynamic cooling control system outside the rack. If every server in a DC intermittently
reported its operating temperature (and air flow) to a cooling control system, that system
would operate on a more robust data set (i.e. more accurate readings), delivering higher
energy efficiency savings than were possible with previous DC configurations.
2.4 Software
2.4.1 Virtualization
In a virtualized data center, multiple Virtual Machines (VMs) are typically co-located on a
single physical server, sharing the processing capacity of the server's CPU between them.
When, for example, increased demands on the CPU result in reduced performance of one of
the VMs to the point where a Service Level Agreement (SLA) may be violated, virtualization
technology facilitates a migration. Migration relocates the services being provided by the VM
on this 'over-utilized' host to a similar VM on another physical server, where sufficient
capacity (e.g. CPU) is available to maintain SLA performance.
Conversely, reduced demand on the CPU of a host introduces opportunities for server
consolidation, the objective of which is to minimize the number of operational servers
consuming power. The remaining VMs on an 'under-utilized' host are migrated so that the
host can be switched off, saving power. Server consolidation provides significant energy
efficiency opportunities.
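A minimal static-threshold policy along these lines might be sketched as follows; the function name and the 30% / 80% thresholds are purely illustrative, not values taken from the literature reviewed here:

```python
def classify_host(cpu_util, lower=0.3, upper=0.8):
    """Classify a host for a simple threshold-based migration policy.

    Hosts above `upper` risk SLA violation and should shed a VM;
    hosts below `lower` are consolidation candidates: migrate all
    their VMs away and power the host off to save energy.
    """
    if cpu_util > upper:
        return "overutilized"   # migrate a VM away to protect SLAs
    if cpu_util < lower:
        return "underutilized"  # migrate all VMs, switch host off
    return "normal"             # leave the host as-is
```

Real allocation schemes refine both the detection step (e.g. adaptive rather than static thresholds) and the choice of which VM to migrate and where to place it.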
There are numerous resource allocation schemes for managing VMs in a data center,
all of which involve the migration of a VM from one host to another to achieve one, or a
combination of, objectives. Primarily these objectives will involve either increased
performance or reduced energy consumption - the former, until recently, receiving more of
the operator’s time and effort than the latter.
In particular, SLA@SOI has completed extensive research in recent years in the area
of SLA-focused (e.g. CPU, memory, location, isolation, hardware redundancy level) VM
allocation and re-provisioning [22]. The underlying concept is that VMs are assigned to the
most appropriate hosts in the DC according to both service level and power consumption
objectives. Interestingly, Hyser et al. [23] suggest that a provisioning scheme which also
includes energy constraints may choose to violate user-based SLAs ‘if the financial penalty
for doing so was [sic] less than the cost of the power required to meet the agreement’. In a
cost-driven DC it is clear that some trade-off (between meeting energy objectives and
compliance with strict user-based SLAs e.g. application response times) is required. A similar
power / performance trade-off may be required to maximize the energy efficiency of a host-
level migration.
2.4.2 Migration
The principal underlying technology which facilitates management of workload in a DC is
virtualization. Rather than each server hosting a single operating system (or application),
virtualization facilitates a number of VMs being hosted on a single physical server, each of
which may run a different operating system (or even different versions of the same operating
system). These VMs may be re-located (migrated) to a different host on the LAN for a
variety of reasons:
Maintenance
Servers intermittently need to be removed from the network for maintenance. The
applications running on these servers may need to be kept running during the
maintenance period so they are migrated to other servers for the duration.
Consolidation
In a virtualized DC some of the servers may be running at (or close to) idle – using
expensive power to maintain a machine which is effectively not being used to
capacity. To conserve power, resource allocation software moves the applications on
the under-utilized machine to a ‘busier’ machine - as long as the latter has the
required overhead to host the applications. The under-utilized machine can then be
switched off – saving on power and cooling.
Energy Efficiency
Hotspots regularly occur in the server room i.e. the cooling system is working too
hard in the effort to eliminate the exhaust air from a certain area. The particular
workload which is causing the problem can be identified and relocated to a cooler
region in the DC to relieve the pressure in the overheated area.
Virtual Machines may also be migrated to servers beyond the LAN, i.e. across the Wide Area Network (WAN):
Follow the sun - minimize network latency during office hours by placing VMs close
to where their applications are requested most often
Where latency is not a primary concern there are a number of different strategies
which may apply:
Availability of renewable energy / improved energy mix
Less expensive cooling overhead (e.g. ‘free’ cooling in more temperate / cooler climates)
Follow the moon (less expensive electricity at night)
Fluctuating electricity prices on the open market [24]
Disaster Recovery (DR)
Maintenance / Fault tolerance
Bursting i.e. temporary provisioning of additional resources
Backup / Mirroring
Regardless of the motivation, migration of virtual machines both within the DC and also to
other DCs (in the cloud or within the enterprise network) not only extends the opportunity for
significant cost savings but may also provide faster application response times if located
closer to clients. To maintain uptime and response Service Level Agreement (SLA) parameters of 99.999% (or higher), these migrations must be performed ‘hot’ or ‘live’,
keeping the application available to users while the virtual machine hosting the application
(and associated data) is moved to the destination server. Once all the data has been migrated,
requests coming into the source VM are redirected to the new machine and the source VM
can be switched off or re-allocated. The most popular algorithm by which virtual machines
are migrated is known as pre-copy and is deployed by both Citrix and VMware – currently
considered to be the global leaders in software solutions for migration and virtualized
systems. A variety of live migration algorithms have been developed in the years since 2007.
Some are listed below:
1. Pre-copy [25]
2. GA for Renewable Energy Placement [26]
3. pMapper: Power Aware Migration [27]
4. De-duplication, Smart Stop & Copy, Page Deltas & CBR (Content Based
Replication) [28]
5. Layer 3: IP LightPath [29]
6. Adaptive Memory Compression [30]
7. Parallel Data Compression [31]
8. Adaptive Pre-paging and Dynamic Self-ballooning [32]
9. Replication and Scheduling [33]
10. Reinforcement Learning [34]
11. Trace & Replay [35]
12. Distributed Replicated Block Device (DRBD) [36]
The LAN-based migration algorithm used by the Amazon EC2 virtualization hypervisor
product (Citrix XenMotion) is primarily based on pre-copy but also integrates some aspects
of the algorithms listed above. It serves as a good example of the live migration process. It is
discussed in the following section.
2.4.2.1 Citrix XenMotion Live Migration
The virtual machine on the source (or current machine) keeps running while transferring its
state to the destination. A helper thread iteratively copies the state needed while both end-
points keep evolving. The number of iterations determines the duration of live migration. As
a last step, a stop-and-copy approach is used. Its duration is referred to as downtime. All
implementations of live migration use heuristics to determine when to switch from iterating
to stop-and-copy.
Pre-copy starts by copying the whole source VM state to the destination system.
While copying, the source system keeps responding to client requests. As memory pages may
get updated (‘dirtied’) on the source system (Dirty Page Rate), even after they have been
copied to the destination system, the approach employs mechanisms to monitor page updates.
The performance of live VM migration is usually defined in terms of migration time
and system downtime. All existing techniques control migration time by limiting the rate of memory transfers, while system downtime is determined by how much state remains to be transferred when the ‘live’ process ends. Minimizing both of these metrics is correlated with
optimal VM migration performance and it is achieved using open-loop control techniques.
With open-loop control, the VM administrator manually sets configuration parameters for the
migration service thread, hoping that these conditions can be met. The input parameters are a
limit to the network bandwidth allowed to the migration thread and the acceptable downtime
for the last iteration of the migration. Setting a low bandwidth limit while ignoring page
modification rates can result in a backlog of pages to migrate and prolong migration. Setting
a high bandwidth limit can affect the performance of running applications. Checking the
estimated downtime to transfer the backlogged pages against the desired downtime can keep
the algorithm iterating indefinitely. Approaches that impose limits on the number of iterations
or statically increase the allowed downtime can render live migration equivalent to pure
stop-and-copy migration.
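The open-loop behaviour described above - a bandwidth cap, a downtime target, and an iteration limit to stop the algorithm iterating indefinitely - can be sketched as a simple model. The parameters and page counts below are illustrative only, not taken from any shipping hypervisor:

```java
/**
 * Simplified model of open-loop pre-copy control (illustrative, not the
 * actual Xen or VMware implementation): iterate while the backlog of
 * dirty pages is too large to meet the downtime target, capped at a
 * maximum iteration count to avoid iterating indefinitely.
 */
public class PreCopyModel {

    /**
     * @param initialPages         pages in the VM image (round 1 sends them all)
     * @param pagesDirtiedPerRound pages re-written at the source each round
     * @param pagesSentPerRound    pages the bandwidth cap allows per round
     * @param downtimeTargetPages  backlog small enough for stop-and-copy
     * @param maxIterations        hard cap on pre-copy rounds
     * @return pre-copy iterations performed before stop-and-copy
     */
    public static int iterationsBeforeStopAndCopy(long initialPages,
                                                  long pagesDirtiedPerRound,
                                                  long pagesSentPerRound,
                                                  long downtimeTargetPages,
                                                  int maxIterations) {
        long backlog = initialPages;
        int iterations = 0;
        while (backlog > downtimeTargetPages && iterations < maxIterations) {
            long sent = Math.min(backlog, pagesSentPerRound);
            backlog = backlog - sent + pagesDirtiedPerRound; // new dirt accumulates
            iterations++;
        }
        return iterations;
    }
}
```

Note that when the dirty rate meets or exceeds the send rate, the backlog never shrinks and the iteration cap is the only thing that terminates the loop - exactly the non-convergent case discussed in the following sections.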
2.4.2.2 Wide Area Network Migration
With WAN transmissions becoming increasingly feasible and affordable, live migration of
larger data volumes over significantly longer distances is becoming a realistic possibility [37,
38]. As a result, the existing algorithms, which have been refined for LAN migration, will be
required to perform the same functionality over the WAN. However, a number of constraints
present themselves when considering long distance migration of virtual machines. The
constraints unique to WAN migration are:
Bandwidth (I/O throughput – lower over WANs)
Latency (distance to destination VM – further on WANs)
Disk Storage (transfer of SAN / NAS data associated with the applications running on
the source VM to the destination VM)
Bandwidth (and latency) becomes an increasingly pertinent issue during WAN migration
because of the volume of data being transmitted across the network. In the time it takes to
transmit a single iteration of pre-copy memory to the destination, there is an increased chance
(relative to LAN migration) that the same memory may have been re-written at the source.
The rate at which memory is rewritten is known as the Page Dirty Rate (PDR) - calculated by
dividing the number of pages dirtied in the last round by the time the last round took (Mbits /
sec). This normalizes PDR for comparison with bandwidth. Xen implements variable
bandwidth during the pre-copy phase based on this comparison. There are 2 main categories
of PDR when live migration is being considered:
1. Low / Typical PDR: Memory is being re-written slower than the rate at which those
changes can be transmitted to the destination i.e. PDR < Migration bandwidth
2. Diabolical PDR (DPDR): The rate at which memory is being re-written at the source
VM exceeds the rate at which that re-written memory can be migrated ‘live’ to the
destination (PDR > Migration bandwidth). The result of this is that the pre-copy phase
may not converge at all. The PDR floods I/O and the pre-copy migration must be
immediately stopped i.e. pre-copy migration will not converge. All remaining pages
are then transferred to the destination. The result of this is a longer downtime (while
the pages are transferred), potential SLA violations and, most notably for the purposes
of this research, increased power consumption while both hosts are running
concurrently.
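The normalization of PDR into Mbit/s, so it can be compared against the migration bandwidth, can be sketched as follows; the method names and the 4 KB page size are assumptions of this sketch, not Xen's code:

```java
/** Sketch of PDR normalization and classification against migration bandwidth. */
public class PdrClassifier {

    /**
     * Normalize the Page Dirty Rate: pages dirtied in the last round,
     * converted to Mbit/s so it is directly comparable with bandwidth.
     *
     * @param pagesDirtied  pages re-written during the last round
     * @param pageSizeBytes size of a page in bytes (e.g. 4096)
     * @param roundSeconds  duration of the last round in seconds
     */
    public static double pdrMbitPerSec(long pagesDirtied, long pageSizeBytes,
                                       double roundSeconds) {
        double bits = pagesDirtied * pageSizeBytes * 8.0;
        return bits / roundSeconds / 1_000_000.0;
    }

    /** Diabolical PDR: memory is re-written faster than it can be migrated. */
    public static boolean isDiabolical(double pdrMbitPerSec,
                                       double bandwidthMbitPerSec) {
        return pdrMbitPerSec > bandwidthMbitPerSec;
    }
}
```

For example, 100,000 4 KB pages dirtied in one second is roughly 3,277 Mbit/s, which is diabolical for a 1 Gbit/s migration link.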
2.4.2.3 PDR Analysis and Compression of Transmitted Pages
Current algorithms send the entire VM state on the 1st iteration (Figure 5: 62 seconds). To
reduce the time spent on the 1st iteration, pages frequently ‘dirtied’ should be identified
before the 1st iteration - the objective being to hold back these pages until the final iteration
(reducing the number of pages resent during iterative pre-copy) or at least hold them back
until some analysis calculates that they are ‘unlikely’ (with some confidence interval) to be
dirtied again. There is a reasonable assumption that there will be multiple iterations in a high
PDR environment – in the (rare) case where a VM has no dirty pages, only a single iteration
would be required to transfer the entire state. Pre-migration analysis would not be continuous
(due to the CPU overhead) but should begin at some short interval before the migration takes
place i.e. just after the decision to migrate has been made.
Figure 5 Performance of web server during live migration (C. Clark)
With a pre-migration analysis phase the time required for the 1st iteration will be reduced.
There may be an argument that downtime is increased due to the additional pages held back
during the first iteration. High PDR pages - which have not been sent in the 1st iteration –
would likely be identified in the 2nd (or subsequent) iterations anyway – resulting in a very
similar Writable Working Set (WWS) on the final iteration. In low PDR environments
research suggests that the WWS in the majority of cases is a small proportion of the entire
data set (perhaps approximately 10%) which needs to be transferred – resulting in minimal
iterations being required before a stop condition is reached i.e. subsequent iterations would
yield diminishing returns. This is not the case where an application may be memory intensive
i.e. the PDR is diabolical and floods the I/O rate.
Conversely, if the WWS is so small, is the effort to identify it at the pre-iterative stage worthwhile? If the algorithm can be applied to diabolical environments as well as acceptable PDR environments then the answer is yes. There is an inevitable trade-off between the time (and CPU overhead) required to identify the WWS on each iteration and the resulting time saved during iterative pre-copy due to fewer pages being transferred. However, identifying a minimal WWS will intrinsically save time.
Finding the ‘threshold’ (current research suggests a simple high/low threshold) is an interesting research challenge. A bitmap indicating the Page Dirty Count is required to keep track of pages being repeatedly dirtied. A simple count, however, is probably too simplistic; an upper / lower bounded threshold may be more applicable. A bounded threshold would ‘hold’ the
pages which are above the lower threshold boundary but below the upper threshold boundary
i.e. deemed least likely to be dirtied again. Boundary calculation should include a confidence
interval - to minimize the un-synced pages before the final iteration occurs. These categorized
‘hold’ pages might be held until the next iteration and if they are found to still have a ‘hold’
status (fall between the upper and lower threshold boundaries) they are then transferred. With
successive iterations more is known about recent PDR patterns. Analysis of these should
theoretically yield boundary calculations which are more accurate as a result.
Note: An additional parallel check, before the nth iteration takes place, of all the pages
which were transmitted from the threshold area would identify those pages which have been
subsequently dirtied. The compressed deltas of these pages would be re-transmitted in the
final iteration – along with those that were still above the upper threshold. The success of the
new algorithm could be judged on the percentage error at this stage i.e. how many pages were
sent from the ‘hold’ area but subsequently dirtied?
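The hold/send classification sketched in the paragraphs above might look like the following. This is entirely hypothetical: the bounds, the action names, and the three-way split are illustrative of the proposal, not an existing implementation:

```java
/**
 * Sketch of the bounded-threshold idea proposed above (hypothetical).
 * Pages below the lower bound are cold and sent immediately; pages
 * between the bounds are held and re-evaluated next iteration; pages
 * at or above the upper bound are deferred to the final iteration.
 */
public class ThresholdHold {

    public enum Action { SEND, HOLD, DEFER_TO_FINAL }

    public static Action classify(int dirtyCount, int lower, int upper) {
        if (dirtyCount < lower) {
            return Action.SEND;           // rarely dirtied: safe to transfer now
        }
        if (dirtyCount < upper) {
            return Action.HOLD;           // uncertain: hold and re-check next round
        }
        return Action.DEFER_TO_FINAL;     // hot page: only worth sending at the end
    }
}
```

The success metric proposed in the note above - the percentage of ‘hold’ pages subsequently dirtied - would then measure how well the bounds were chosen.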
2.4.2.4 Parallel Identification of Dirty Pages and Multi-Threaded Adaptive Memory Compression
In addition to the pre-migration analysis stage it may also be useful to examine the potential
of parallel dirty page identification and compression. In Figure 6 the blue area is when dirty
pages are identified for the next round and delta compression takes place. However, in the
time this phase is taking place more pages will be dirtied. If the same interval was moved
back (to be in parallel with the previous data transfer) would more pages be dirtied? The
answer appears to be no i.e. the PDR is independent of the process which actually calculates
it. The benefit of this parallelism is that the algorithm is ready to move immediately to
transfer n + 1 when transfer n has completed – reducing the iterative pre-copy time by
eliminating the blue interval in Figure 6.
It is probable that some overlap may be optimal rather than full parallelism. Time-series analysis of dirtying patterns during the previous transfer interval might yield an
optimal overlap i.e. the best time to start identifying the new dirty pages, rather than waiting
until the transfer has completed. It would also be beneficial to investigate further whether, as
the number of dirty pages reduces with subsequent iterations, the time required to identify
(and compress) the dirty page deltas could also be reduced (research suggests cache access
times remain relatively constant). If this were true then the inner overlap could be sent deeper
back into the transfer time reducing the outer overlap further. Additionally, multi-threaded
compression would yield further reductions in the overlap interval.
Figure 6 Pre-Copy algorithm
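The overlap idea can be sketched with two worker threads, one transferring round n while the other already scans for round n + 1. The task bodies below are placeholders; a real implementation would scan shadow page tables and compress deltas:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/**
 * Sketch of the proposed overlap: while round n is being transferred,
 * a second thread identifies (and would compress) the dirty pages for
 * round n + 1, so the next transfer can start immediately.
 */
public class OverlappedPreCopy {

    public static long runRounds(int rounds) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            long pagesSent = 0;
            Future<Long> scan = pool.submit(OverlappedPreCopy::scanDirtyPages);
            for (int r = 0; r < rounds; r++) {
                long toSend = scan.get();                              // pages for this round
                Future<?> transfer = pool.submit(OverlappedPreCopy::transferPages);
                scan = pool.submit(OverlappedPreCopy::scanDirtyPages); // overlap scan
                transfer.get();                                        // round n finishes...
                pagesSent += toSend;                                   // ...scan n+1 already done
            }
            return pagesSent;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    private static Long scanDirtyPages() { return 100L; } // placeholder scan
    private static Void transferPages()  { return null; } // placeholder transfer
}
```

The partial-overlap variant discussed above would simply delay submitting the scan task by some fraction of the transfer interval rather than submitting it immediately.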
2.4.2.5 Throttling
The critical issue in high PDR environments is that the possibility of convergence is reduced
(if not eliminated altogether). It is similar to a funnel filling up too quickly. If the PDR
continues at a high rate the funnel will eventually overflow resulting in service timeouts i.e.
the application will not respond to subsequent requests, or response times will be significantly increased. The current solution is to abandon pre-copy migration, stop the VM and transfer
all memory i.e. empty the funnel. Unfortunately, in the time it takes to empty the funnel,
more pages have been dirtied because requests to the application do not stop. This may
actually prohibit the migration altogether because the downtime is such that an unacceptable
level of SLA violations occur.
If, however, the application’s response thread can be artificially slowed down (throttled) intermittently, then the funnel is given a better chance to empty its
current contents. This would be analogous to temporarily decreasing the flow from the tap to
reduce the volume in the funnel.
Previous solutions suggested that slowing response time to requests (known as
Dynamic Rate Limiting) would alter the rate at which I/O throughput was performed but
results proved that detrimental VM degradation tended to occur. In addition, other processes
on the same physical machine were negatively affected. Dedicated migration switches were
required to divert the additional load from the core. The focus was on the I/O throughput as
opposed to the incoming workload (PDR).
How the PDR could be intermittently throttled without adverse degradation of either
the VM in question, or other machine processes, is the central question.
Successful PDR throttling, in conjunction with threshold calculations and optimized
parallel adaptive memory compression / dirty page identification, would achieve a lower
PDR. However, the issue of PDR can be essentially circumvented if the number of migrations
taking place in a DC as a whole can be reduced.
In the majority of typical PDR environments Clark et al. [39] have shown that the
initial number of dirty pages i.e. the Writable Working Set (WWS), is a small proportion of
the entire page set (perhaps 10% or less) which needs to be transferred, typically resulting in
minimal iterations being required before a stop condition is reached i.e. subsequent iterations
would yield diminishing returns. This is not the case where an application may be particularly
memory-intensive i.e. the PDR is diabolical.
Degradation of application performance during live migration (due to DPDRs or for
other reasons) results in increased response times, threatening violation of SLAs and
increasing power consumption. For optimization of migration algorithms with DPDRs there
are 2 possible approaches for solving the DPDR problem:
1. Increase bandwidth
2. Decrease PDR
Typical applications only exhibit this DPDR-like behaviour as spikes or outliers in normal
write activity. Live migration was previously abandoned by commercial algorithms when
DPDRs were encountered. However, in its most recent version of vSphere (5.0), VMware has included an enhancement called ‘Stun During Page Send’ (SDPS) [40] which guarantees that the migration will continue despite experiencing a DPDR (VMware refers to DPDRs as ‘pathological’ loads). By tracking both the transmission rate and the PDR, a diabolical PDR can be identified. When a DPDR is identified, the response of the virtual machine is slowed down (‘stunned’) by introducing microsecond delays (sleep processes) to the vCPU. This increases the response time to application requests and thus slows the PDR to less than the migration bandwidth, ensuring convergence (PDR < bandwidth).
Xen implements a simple equivalent – limiting ‘rogue’ processes (other applications
or services running parallel to the migration) to 40 write faults before putting them on a wait
queue.
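A toy model of SDPS-style stunning might look like the following. The linear scaling of the injected delay with the PDR/bandwidth ratio is an assumption of this sketch, not VMware's published algorithm:

```java
/**
 * Toy model of 'stun'-style throttling: if the observed PDR exceeds the
 * migration bandwidth, compute a per-request delay that scales the dirty
 * rate down below the bandwidth. The proportional model is illustrative.
 */
public class StunThrottle {

    /**
     * @param pdrMbit       observed Page Dirty Rate (Mbit/s)
     * @param bandwidthMbit migration bandwidth (Mbit/s)
     * @param baseMicros    current per-request service time (microseconds)
     * @return extra microseconds of delay to inject per request (0 if none needed)
     */
    public static long delayMicros(double pdrMbit, double bandwidthMbit,
                                   long baseMicros) {
        if (pdrMbit <= bandwidthMbit) {
            return 0; // already convergent: PDR < bandwidth, no stun needed
        }
        // Assumption: dirtying is proportional to request rate, so
        // stretching each request from baseMicros to
        // baseMicros * (pdr / bandwidth) scales the PDR down to
        // roughly the migration bandwidth.
        double stretched = baseMicros * (pdrMbit / bandwidthMbit);
        return (long) Math.ceil(stretched) - baseMicros;
    }
}
```

For a PDR of twice the bandwidth and a 100 µs service time, the model injects a further 100 µs per request, halving the dirty rate.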
2.5 Monitoring Interval
Much effort has been applied to optimizing the live migration process in recent years. During
migration, the primary factors impacting on a VM’s response SLA are the migration time
and, perhaps more importantly, the downtime. These are the metrics which define the
efficiency of a migration. If a DC operator intends to migrate the VM(s) hosting a client
application it must factor these constraints into its SLA guarantee. It is clear that every
possible effort should be made to minimize the migration time (and downtime) - so that the
best possible SLAs may be offered to clients. This can only be achieved by choosing the VM
with the lowest potential PDR for each migration.
However, response and uptime SLAs become increasingly difficult to maintain if
reduction (or at least minimization) of power consumption is a primary objective because
each migration taking place consumes additional energy (while both servers are running and
processing cycles, RAM, bandwidth are being consumed). Based on this premise, Voorsluys
et al. [41] evaluate the cost of live migration, demonstrating that DC power consumption can
be reduced if there is a reduction in migrations. The cost of a migration (as shown) is
dependent on a number of factors, including the amount of RAM being used by the source
VM (which needs to be transferred to the destination) and the bandwidth available for the
migration. The higher the bandwidth the faster data can be transferred. Additionally, power
consumption is increased because 2 VMs (source and destination) are running concurrently
for much of the migration process.
In order to reduce the migration count in a DC each migration should be performed
under the strict condition that the destination host is chosen such that power consumption in
the DC is minimized post-migration. This can only be achieved by examining all possible
destinations before each migration begins - to identify the optimal destination host for each
migrating VM from a power consumption point-of-view. The critical algorithm for resource
(VM) management is the placement algorithm.
These 2 conditions i.e.
1. Migrate the VM with the lowest Page Dirty Rate
2. Choose the destination host for minimal power consumption post-migration
form the basis upon which the Local Regression / Minimum Migration Time (LRMMT)
algorithm in CloudSim [42] operates (c.f. Chapter 3).
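As a sketch, the two selection steps might be expressed as follows. This is simplified: CloudSim's actual MMT policy estimates migration time from RAM and available bandwidth, and its power model is utilization-based; the field names here are illustrative:

```java
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

/** Simplified sketch of the two-step migration selection (assumed field names). */
public class PlacementSketch {

    public static class Vm {
        public final String id;
        public final double ramMb;   // memory to transfer; proxies migration time
        public Vm(String id, double ramMb) { this.id = id; this.ramMb = ramMb; }
    }

    public static class Host {
        public final String id;
        public final double powerIncreaseWatts; // estimated increase if VM placed here
        public Host(String id, double w) { this.id = id; this.powerIncreaseWatts = w; }
    }

    /** Step 1: minimum migration time ~ least RAM over a fixed bandwidth. */
    public static Vm selectVmToMigrate(List<Vm> overloadedHostVms) {
        return Collections.min(overloadedHostVms,
                Comparator.comparingDouble((Vm v) -> v.ramMb));
    }

    /** Step 2: destination host with the smallest post-migration power increase. */
    public static Host selectDestination(List<Host> candidates) {
        return Collections.min(candidates,
                Comparator.comparingDouble((Host h) -> h.powerIncreaseWatts));
    }
}
```

Step 2 corresponds to examining all possible destinations before each migration, as argued in the previous section.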
2.5.1 Static Monitoring Interval
Recent research efforts in energy efficiency perform monitoring of the incoming workload
but almost exclusively focus on techniques for analysis of the data being collected rather than
improving the quality of the data.
In their hotspot identification paper, Xu and Sekiya [43] select a monitoring interval
of 2 minutes. The interval is chosen on the basis of balancing the cost of the additional
processing required against the benefit of performing the migration. The 2 minute interval
remains constant during experimentation.
Using an extended version of the First Fit Decreasing algorithm, Takeda et al. [44]
are motivated by consolidation of servers, to save power. They use a static 60 second
monitoring interval for their work.
Xu and Chen et al. [45] monitor the usage levels of a variety of server resources
(CPU, memory, and bandwidth), polling metrics as often as they become available. Their
results show that monitoring at such a granular level may not only lead to excessive data
processing but the added volume of network monitoring traffic (between multiple hosts and
the monitoring system) may also be disproportionate to the accuracy required.
The processing requirements of DC hosts vary as the workload varies and are not
known until they arrive at the VM, requesting service. While some a priori analysis of the
workload may be performed to predict future demand, as in the work of Gmach et al. [46],
unexpected changes may occur which have not been established by any previously identified
patterns. A more dynamic solution is required which reacts in real-time to the incoming
workload rather than making migration decisions based on a priori analysis.
VMware vSphere facilitates a combination of collection intervals and levels [47].
The interval is the time between data collection points and the level determines which metrics
are collected at each interval. Examples of vSphere metrics are as follows:
Collection Interval: 1 day
Collection Frequency: 5 minutes (static)
Level 1 data: 'cpuentitlement', 'totalmhz', 'usage', 'usagemhz'
Level 2 data: 'idle', 'reservedCapacity' + all of Level 1 data (above)
VMware intervals and levels in a DC are adjusted manually by the operator as circumstances
require. Once chosen, they remain constant until the operator re-configures them. Manual
adjustment decisions, which rely heavily on the experience and knowledge of the operator,
may not prove as accurate and consistent over time as an informed, dynamically adjusted
system.
In vSphere, the minimum collection frequency available is 5 minutes. Real-time data
is summarized at each interval and later aggregated for more permanent storage and analysis.
2.5.2 Dynamic Monitoring Interval
Chandra et al. [48] focus on dynamic resource allocation techniques which are sensitive to
fluctuations in data center application workloads. Typically SLA guarantees are managed by
reserving a percentage of available resources (e.g. CPU, network) for each application. The
portion allocated to each application depends on the expected workload and the SLA
requirements of the application. The workload of many applications (e.g. web servers) varies
over time, presenting a significant challenge when attempting to perform a priori estimation
of such workloads. Two issues arise when considering provisioning of resources for web
servers:
1. Over-provisioning based on worst case workload scenarios may result in potential
underutilization of resources e.g. higher CPU priority allocated to application which
seldom requires it
2. Under-provisioning may result in violation of SLAs e.g. not enough CPU priority
given to an application which requires it
An alternate approach is to allocate resources to applications dynamically based on
observation of their behaviour in real-time. Any remaining capacity is later allocated to those
applications as and when they are found to require it. Such a system reacts in real-time to
unanticipated workload fluctuations (in either direction), meeting QoS objectives which may
include optimization of power consumption in addition to typical performance SLAs such as
response time.
While Chandra and others [49, 50] have previously used dynamic workload analysis
approaches, their focus was on resource management to optimize SLA guarantees i.e.
performance. No consideration is given in their work to the effect on power consumption
when performance is enhanced. This research differentiates itself in that dynamic analysis of
the workload is performed for the purpose of identifying power consumption opportunities
while also maintaining (or improving) the performance of the DC infrastructure. The search
for improved energy efficiency is driven in this research by DC cost factors which were not as
significant an issue 10-15 years ago as they are now.
2.6 Conclusion
This chapter provided an in-depth analysis of data center energy efficiency state-of-the-art.
Software solutions to energy efficiency issues were presented, demonstrating that many
opportunities still exist for improvement in server room power consumption using a software
approach to monitoring (and control) of the complex systems which comprise a typical DC.
The principal lesson to take from prior (and existing) research in the field is that most of the
DC infrastructure can be monitored using software solutions but that monitoring (and
subsequent processing of the data collected) should not overwhelm the monitoring /
processing system and thus impact negatively on the operation of the DC infrastructure. This
thesis proposes that dynamic adjustment of the monitoring interval with respect to the
incoming workload may represent a superior strategy from an energy efficiency perspective.
Chapter 3 discusses in more detail the capabilities provided by the Java-based CloudSim
framework (used for this research) and the particular code modules relevant to testing the
hypothesis presented herein.
Chapter 3 CloudSim
Introduction
Researchers working on data center energy efficiency from a software perspective are
typically hindered by lack of access to real-world infrastructure because it is infeasible to add
additional workload to a data center which already has a significant ‘real-world’ workload to
service on a daily basis. From a commercial perspective, DC operators are understandably
unwilling to permit experimentation on a network which, for the most part, has been fine-
tuned to manage their existing workload. In this chapter, details of the CloudSim framework
are presented with special emphasis on those aspects particularly related to this MSc research
topic.
The CloudSim framework [42] is a Java-based simulator, designed and written by
Anton Beloglazov at the University of Melbourne for his doctoral thesis. It provides a limited
software solution to the above issues and is deployed in this research to simulate a standalone
power-aware data center with LAN-based migration capabilities. The Eclipse IDE is used to
run (and edit) CloudSim.
3.1 Overview
Default power-aware algorithms in CloudSim analyse the state of the DC infrastructure at
static 300-second intervals. This reflects current industry practice where an average CPU
utilization value for each host is polled every 5 minutes (i.e. 300 seconds) by virtualization
monitoring systems (e.g. VMware). At each interval the CPU utilization of all hosts in the
simulation is examined to establish whether or not they are adequately servicing the workload
which has been applied to the VMs placed on them.
If a host is found to be over-utilized (i.e. the CPU does not have the capacity to
service the complete workload of all the VMs placed on it) a decision is made to migrate one
or more of the VMs to another host where the required capacity to service the workload is
available.
Conversely, if a host is found to be under-utilized (i.e. the CPU is operating at such a low
capacity that power could be saved by switching it off), the remaining VMs are migrated to
another host and the machine is powered off. The CloudSim modules used only implement
migration when a host is over-utilized, reflecting the focus of this research.
There are two primary steps in the power-aware CloudSim migration algorithm for
over-utilized hosts:
1. Migrate the VM with the lowest Page Dirty Rate
2. Choose the destination host for minimal power consumption post-migration
The default CPU utilization threshold for an over-utilized host in CloudSim is 100%. An
adjustable safety parameter is also provided by CloudSim, effectively acting as overhead
provision. As an example, if the CPU utilization value were 90% and was then multiplied by
a safety parameter of 1.2, the resulting value of 108% would exceed the over-utilization
threshold. A safety parameter of 1.1 would result in a final value of 99% (for the same initial
utilization), thus not exceeding the threshold.
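The threshold check with the safety parameter, using the figures from the text, reduces to a one-line comparison (the method name is ours, not CloudSim's):

```java
/** Over-utilization check with a CloudSim-style safety parameter. */
public class OverUtilizationCheck {

    /** True if utilization, scaled by the safety parameter, exceeds the threshold. */
    public static boolean isOverUtilized(double utilizationPercent,
                                         double safetyParameter,
                                         double thresholdPercent) {
        return utilizationPercent * safetyParameter > thresholdPercent;
    }
}
```

With the example values: 90% × 1.2 = 108% exceeds the 100% threshold, while 90% × 1.1 = 99% does not.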
3.2 Workload
The workloads applied to the VMs on each host in a simulated DC for power-aware
CloudSim simulations are referred to as ‘cloudlets’. These are flat text files which contain
sample CPU utilization percentages gathered (per interval) from over 500 DC locations
worldwide. As long as no migration takes place (i.e. the host doesn’t become over-utilized),
the VM assigned to service the workload at the beginning of a simulation (depicted in Figure
7) remains associated with that workload until the cloudlet has been completed. However, if a
migration takes place (because the host has become over-utilized) the workload is then
applied to the VM on the destination host. Despite the term ‘VM Migration’, it is the
workload (not the VM) which changes location within the DC when a migration takes place.
Figure 7 CloudSim Architecture
The duration of default CloudSim simulations is 24 hours (i.e. 86,400 seconds). This equates
to 288 intervals of 5 minutes (300 seconds) each. Thus, each of the 1052 cloudlets (stored in
the PlanetLab directory) contains 288 values to make a value available for reading at each
interval of the simulation.
At the beginning of each simulation, the entire cloudlet is loaded into an array
(UtilizationModelPlanetLabInMemory.data[]) from which the values are read at
each interval throughout the simulation. Each cloudlet is assigned to a corresponding VM on
a one-to-one basis at the beginning of the simulation.
The values being read from the cloudlets are percentages which simulate ‘real-
world’ CPU utilization values. These need to be converted to a variable in CloudSim which is
related to actual work performed. CloudSim work performance is defined in MIs (Million
Instructions). The workload of the cloudlet (termed length) is a constant i.e. 2500 *
SIMULATION_LENGTH (2500 * 86400 = 216,000,000 MIs). CloudSim keeps track of the
VM workload already performed by subtracting the MIs completed during each interval from
the total cloudlet MI length. As such each cloudlet starts at t = 0 seconds with a workload of
216,000,000 MI and this load is reduced according to the work completed at each interval.
To check whether a cloudlet has been fully executed, the isFinished() method is called at each interval:

// checks whether this Cloudlet has finished or not
if (cl.isFinished()) {
	…
}

final long finish = resList.get(index).finishedSoFar;
final long result = cloudletLength - finish;
if (result <= 0.0) {
	completed = true;
}
From the code extract above it can be seen that when (or if) the VM's workload (represented by
the cloudletLength variable) is completed during the simulation, the VM will be 'de-
commissioned'.
3.3 Capacity
Each of the 4 VM types used in the CloudSim framework represents a ‘real-world’ virtual
machine. They are assigned a MIPS value (i.e. 500, 1000, 2000, 2500) before the simulation
begins. This value reflects the maximum amount of processing capacity on the host to which
a VM is entitled. Likewise each host CPU has an initial MIPS capacity of either 1860 or
2660, again reflecting ‘real-world’ servers. These configuration settings limit the number of
VMs which can be run on each host and also the volume of workload which can be
performed by each VM at each interval.
Example: A host has a capacity of 2660 MIPS. A VM (with a capacity of 500
MIPS) has just been started on the host and the first value read from the cloudlet array is 5%
of the host's capacity (i.e. 2660 / 20 = 133 MIPS). If the next interval is 30 seconds long, then
the number of instructions processed by the VM is 133 * 30 = 3990 MI.
This completed work is subtracted from the total cloudlet length (i.e. 216,000,000 –
3990 = 215,996,010 MI). At each subsequent interval throughout the simulation the same
algorithm is applied until such time as the remaining workload to be processed is at (or
below) zero. At this stage the VM is de-commissioned because the workload is complete.
In this example the 5% CPU percentage from the cloudlet (i.e. 133 MIPS) is
approximately 27% (133 / 500) of the CPU capacity allocated to the VM. If the original value
read from the cloudlet were greater than 18.79% (i.e. 500 / 2660) of the host's capacity, the VM would have
insufficient capacity to continue servicing the workload and SLA violations would occur. Two
options typically need to be considered when this happens:
1. Increase the VM’s capacity on the host – not facilitated in CloudSim
2. Apply the workload to a VM with a larger capacity on a different host, requiring a
migration. However, this will only occur if the host itself is over-utilized, which is a
significant shortfall in the CloudSim modules used for testing the hypothesis: an
over-utilized VM (causing SLA violations) will not, on its own, trigger a migration in the
version of CloudSim being used for this research. Additionally, the CloudSim reports
(generated at the end of the simulation) detail very low average SLA violations, which
indicates that the particular workload (cloudlet) percentages being applied in this
version of CloudSim are insufficient to push the VMs beyond their capacity.
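The worked example above, and the capacity limit that triggers option 2, can be verified with a short calculation (a sketch; the variable names are illustrative, not CloudSim's):

```java
public class CapacityExampleDemo {
    // 5% of the host's capacity, as read from the cloudlet.
    static double requestedMips(double hostMips) {
        return hostMips / 20;
    }

    // Instructions processed in one interval: MIPS * seconds.
    static double processedMi(double mips, double seconds) {
        return mips * seconds;
    }

    // The break-even cloudlet percentage is vmMips / hostMips (~18.79%);
    // anything above it demands more MIPS than the VM is entitled to.
    static boolean causesSlaViolation(double hostMips, double vmMips, double pct) {
        return hostMips * pct > vmMips;
    }

    public static void main(String[] args) {
        double mips = requestedMips(2660);          // 133 MIPS
        double mi = processedMi(mips, 30);          // 3990 MI
        long remaining = 216_000_000L - (long) mi;  // 215,996,010 MI
        System.out.println(mips + " MIPS -> " + mi + " MI, " + remaining
                + " MI remaining, SLA violation at 20%: "
                + causesSlaViolation(2660, 500, 0.20));
    }
}
```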
The difficulty of correctly sizing VM MIPS (and allocating appropriate host capacity to
them) so that they are capable of meeting their workload requirements can be seen from this
example. CloudSim goes some way towards achieving this by assigning VMs to hosts on a one-to-one
basis at the start of the simulation, i.e. in a default simulation with 1052 VMs being placed on
800 hosts, the first 800 VMs are assigned to the first 800 hosts and the remaining 252 VMs are
allocated to hosts 1 to 252. Therefore, when the simulation starts, 252 hosts have 2 VMs and
the remainder host a single VM.
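The initial placement described above amounts to a wrap-around assignment, which can be sketched as follows (an illustration of the counting only, not the actual CloudSim allocation code):

```java
public class InitialPlacementDemo {
    // Counts how many hosts receive a second VM when VMs are assigned to
    // hosts in order, wrapping around once the hosts are exhausted.
    static int hostsWithTwoVms(int numVms, int numHosts) {
        int[] vmsPerHost = new int[numHosts];
        for (int i = 0; i < numVms; i++) {
            vmsPerHost[i % numHosts]++;   // VM i -> host (i mod numHosts)
        }
        int count = 0;
        for (int c : vmsPerHost) {
            if (c == 2) count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Default simulation: 1052 VMs placed on 800 hosts
        System.out.println(hostsWithTwoVms(1052, 800));
    }
}
```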
As processing continues the VM placement algorithm attempts to allocate as many
VMs to each host as capacity will allow. The remaining (empty) hosts are then powered off -
simulating the server consolidation effort typical of most modern DCs. It is clear that there is
a conflict of interests taking place. On the one hand there is an attempt to maximize
performance by migrating VMs to hosts with excess capacity but, on the other hand,
competition for CPU cycles is being created by co-locating VMs on the same host,
potentially creating an over-utilization scenario.
3.4 Local Regression / Minimum Migration Time (LR / MMT)
Beloglazov concludes from his CloudSim experiments that the algorithm which combines
Local Regression and Minimum Migration Time (LR / MMT) is most efficient for
maintaining optimal performance and maximizing energy efficiency. Accordingly this
research uses the LR / MMT algorithmic combination as the basis for test and evaluation.
3.5 Selection Policy – Local Regression (LR)
Having passed the most recent CPU utilization values through the Local Regression (LR)
algorithm, hosts are considered over-utilized if the next predicted utilization value exceeds
the threshold of 100% [Appendix A]. LR predicts this value using a sliding window, each
new value being added at each subsequent interval throughout the simulation. The size of the
sliding window is 10. Until initial filling of the window has taken place (i.e. 10 intervals have
elapsed since the simulation began), CloudSim relies on a 'fallback' algorithm [Appendix A]
which considers a host to be over-utilized if its CPU utilization exceeds 70%.
VMs are chosen for migration according to MMT i.e. the VM with the lowest
predicted migration time will be selected for migration to another host. Migration time is
based on the amount of RAM being used by the VM. The VM using the least RAM will be
chosen as the primary candidate for migration, simulating minimization of the Dirty Page
Rate (DPR) during VM transfer [39] as previously discussed in Section 2.4.2.2.
3.6 Allocation Policy – Minimum Migration Time (MMT)
The destination host for the migration is chosen on the basis of power consumption following
migration i.e. the host with the lowest power consumption (post migration) is chosen as the
primary destination candidate. In some cases, more than one VM may require migration to
reduce the host's utilization below the threshold. Dynamic RAM adjustment is not facilitated
in CloudSim as the simulation proceeds. Rather, RAM values are read (during execution of
the MMT algorithm) on the basis of the initial allocation to each VM at the start of the
simulation.
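The allocation policy described in this section can be sketched as follows (a simplified model; the linear power figures and class names are hypothetical, not CloudSim's PowerHost implementation):

```java
import java.util.List;

public class DestinationSelectionDemo {
    // A hypothetical host with a simple linear power model:
    // 100 W idle, rising to 250 W at full CPU load.
    static class SimHost {
        final String name;
        final double utilization;     // current CPU utilization, 0.0 .. 1.0
        SimHost(String name, double utilization) {
            this.name = name;
            this.utilization = utilization;
        }
        double powerAt(double u) {
            return 100 + 150 * u;
        }
    }

    // Picks the host whose power consumption after receiving the migrated
    // load would be lowest, skipping hosts the load would over-utilize.
    static SimHost chooseDestination(List<SimHost> hosts, double extraUtilization) {
        SimHost best = null;
        double minPower = Double.MAX_VALUE;
        for (SimHost h : hosts) {
            double u = h.utilization + extraUtilization;
            if (u > 1.0) continue;    // would push the host over 100%
            double power = h.powerAt(u);
            if (power < minPower) {
                minPower = power;
                best = h;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<SimHost> hosts = List.of(new SimHost("h1", 0.6),
                                      new SimHost("h2", 0.3),
                                      new SimHost("h3", 0.9));
        // Moving 20% load: h3 would exceed 100%, h2 ends up cheapest
        System.out.println(chooseDestination(hosts, 0.2).name); // h2
    }
}
```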
3.7 Default LRMMT
The LRMMT algorithm begins in the main() method of the LrMmt.java class [Appendix
B]. A PlanetLabRunner() object is instantiated. The PlanetLabRunner() class inherits
from the RunnerAbstract() class and sets up the various parameters required to run the
LRMMT simulation. The parameters are passed to the initLogOutput() method in the
default constructor of the super class (RunnerAbstract) which creates the folders required
for saving the results of the simulation. Two methods are subsequently called:
3.7.1 init()
Defined in the sub-class (PlanetLabRunner), this method takes the location of the
PlanetLab workload (string inputFolder) as a parameter and initiates the simulation. A
new DatacenterBroker() object is instantiated. Among other responsibilities, the broker
will create the VMs for the simulation, bind the cloudlets to those VMs and assign the VMs
to the data center hosts. The broker’s ‘id’ is now passed to the
createCloudletListPlanetLab() method which prepares the cloudlet files in the input
folder for storage in a data[288] array. It is from this array that each cloudlet value will be
read so that an equivalent MI workload value can be calculated for each VM. Having created
the cloudletList, the number of cloudlets (files) in the PlanetLab folder is now known
and a list of VMs can be created with each cloudlet being assigned to an individual VM i.e.
there is a one-to-one relationship between a cloudlet and a VM at the start of the simulation. The
last call of the init() method creates a hostList which takes as a parameter the number
of hosts configured for the DC (i.e. 800) from the PlanetLabConstants() class. On
completion of the init() method the cloudlets (workload), hosts and VMs are all
instantiated and ready for the data center to be created.
3.7.2 start()
The start() method creates the data center, binds all the components created in the
init() method to the new data center and starts the simulation.
The first helper call of the start() method is to createDatacenter() which
sets up a number of parameters related to the characteristics of the DC. These include:
arch (string) – whether the DC has a 32 or 64 bit architecture
os (string) – the operating system running on the hosts – e.g. Linux / Windows
vmm (string) – the virtual machine manager running on the hosts e.g. Xen
time_zone (double) – where the DC is located – e.g. 10.0
cost (double) – the cost of processing in this resource – e.g. 3.0
costPerMem (double) – the cost of using memory in this resource – e.g. 0.05
costPerStorage (double) – the cost of using storage in this resource – e.g. 0.001
costPerBw (double) – the cost of using bandwidth in this resource – e.g. 0.0
In the case of a simulation of a cloud network, where more than one data center would be
required, these values can be altered for the purposes of calculating different infrastructural
costs across the cloud. In this research a single data center is being simulated. The defaults
are not adjusted.
Once the data center has been created a boolean value
(PowerDatacenter.disableMigrations - indicating whether or not migrations are
disabled) is set to false i.e. migrations are enabled for this simulation. The VM and cloudlet
lists are submitted (by the broker) to the datacenter object and the simulation is started i.e.
double lastClock = CloudSim.startSimulation();
The startSimulation() method calls the run() method, which waits for completion of
all entities. The entities (cloudlets running on VMs) are run as threads, and the stop
condition for the startSimulation() method is reached when the threads reach the
'non-RUNNABLE' state or when there are no more events in the future event queue.
Once this point has been reached, the clock time is returned to the calling method (i.e.
RunnerAbstract.start()) and the simulation is stopped.
Helper.printResults(datacenter, vmList, lastClock, experimentName,
		Constants.OUTPUT_CSV, outputFolder);
Results (if enabled) are printed to both log and trace files and the simulation is completed.
3.8 Over-utilization
Figure 8 Flow Chart Depicting the LR / MMT simulation process
The full simulation process is depicted in Figure 8. The scheduling interval (i.e. how often
analysis will be performed) is set as a static variable in the Constants class. For the default
CloudSim simulation, this interval is 300 seconds. At each interval the CPU utilization of
every host is examined. Using a sliding window of the last 10 CPU utilization values, the
local regression algorithm predicts the CPU utilization value for the next interval. If this
value is below 100% no action is taken. However, if the CPU is predicted to be greater than
100% (at the next interval) the host is considered over-utilized and the MMT portion of the
algorithm is called. As mentioned previously, a ‘fallback’ algorithm is used until the first 10
CPU values are available. The ‘fallback’ over-utilization threshold is 70%. The code for
testing a host for overutilization is shown below:
if (utilizationHistory.length < length) {
	return getFallbackVmAllocationPolicy().isHostOverUtilized(host);
}

The sliding window, referred to in the code as the utilizationHistory, has a length of 10.

try {
	estimates = getParameterEstimates(utilizationHistoryReversed);
}
The getParameterEstimates() call runs the local regression algorithm against the
sliding window and (after including the safety parameter as a multiplier) the predicted
utilization of the host is calculated.
predictedUtilization *= getSafetyParameter();

if (predictedUtilization >= 1) {
	Constants.OverUtilizedHostsThisInterval++;
}

return predictedUtilization >= 1;
A Boolean indicating the utilization state of the host is returned to the calling function. If the
host is predicted to be over-utilized at the next interval, the value of the returned Boolean will
be true.
3.9 Migration
One or more VMs need to be migrated from the host in order to bring the CPU utilization
back below the threshold. The VM(s) to be migrated are chosen on the basis of the amount of
RAM they are using. Thus, the VM with the least RAM will be the primary candidate for
migration. The VM types used by CloudSim are listed below. It can be seen that (for the most
part †) different RAM values are configured for each VM at the start of the simulation.
CloudSim does not include dynamic RAM adjustment so the static values applied initially
remain the same for the duration. Cloud providers such as Amazon use the term 'instance' to
denote a running VM. The CloudSim VM types provided simulate some of the VM
instances available to customers in Amazon EC2.
1. High-CPU Medium Instance: 2.5 EC2 Compute Units, 0.85 GB
2. Extra Large Instance: 2 EC2 Compute Units, 3.75 GB †
3. Small Instance: 1 EC2 Compute Unit, 1.7 GB
4. Micro Instance: 0.5 EC2 Compute Unit, 0.633 GB
public final static int VM_TYPES = 4;
public final static int[] VM_RAM = { 870, 1740, 1740†, 613 };
All types are deployed when the VMs are being created at the beginning of the simulation.
Assuming all 4 are on a single host when the host is found to be over-utilized, the order in
which the VMs will be chosen for migration is:
1. 613 [index 3]
2. 870 [index 0]
3. 1740 [index 1]
4. 1740 [index 2] † (note: the VM_RAM value in the default CloudSim code is 1740.
This does not reflect the ‘real-world’ server [Extra Large Instance] being simulated,
which has a RAM value of 3.75 GB)
If there is more than one VM with RAM of 613 on the host they will be queued for migration
before the first ‘870’ enters the queue. The chosen VM is then added to a migration map
which holds a key-value pair of the:
VM ID
Destination Host ID
Once all the hosts have been analysed, the VMs in the migration map are migrated to their
chosen destinations using the VM placement algorithm. The destination for each VM is
chosen with the objective of optimizing power consumption i.e. the host which will use the
least power post-migration is deemed the most suitable.
public Vm getVmToMigrate(PowerHost host) {
	List<PowerVm> migratableVms = getMigratableVms(host);
	if (migratableVms.isEmpty()) {
		return null;
	}
	Vm vmToMigrate = null;
	double minMetric = Double.MAX_VALUE;
	for (Vm vm : migratableVms) {
		if (vm.isInMigration()) {
			continue;
		}
		double metric = vm.getRam();
		if (metric < minMetric) {
			minMetric = metric;
			vmToMigrate = vm;
		}
	}
	return vmToMigrate;
}
From the code above it can be seen that the VM with the least RAM (vm.getRam()) is
chosen for migration, the objective of which is to minimize the downtime required to transfer
the final RAM pages during the migration. Increased downtime during migration would
result in potential SLA violations as described in detail in Section 2.4.2.3. It is clear that, to a
certain extent, CloudSim is replicating the effort to minimize SLA violations which takes
place during ‘real-world’ live migrations.
3.10 Reporting
CloudSim facilitates reporting on various metrics available during the simulation. Reports are
generated in either flat text or Comma Separated Values (CSV) format.
Additionally, metrics can be sent to the Eclipse console and read as the simulation progresses.
Below is a sample of the metrics summary from the trace file of the default CloudSim
simulation. Notable metrics include:
Number of Hosts
Number of VMs
Energy Consumption
Over-utilized Hosts
Number of VM Migrations
Average SLA Violation
Trace.printLine(String.format("Experiment name: " + experimentName));
Trace.printLine(String.format("Number of hosts: " + numberOfHosts));
Trace.printLine(String.format("Number of VMs: " + numberOfVms));
Trace.printLine(String.format("Total simulation time: %.2f sec", totalSimulationTime));
Trace.printLine(String.format("Energy consumption: %.2f kWh", energy));
Trace.printLine(String.format("Overutilized Hosts: %d", Constants.OverUtilizedHostsThisInterval));
Trace.printLine(String.format("Number of VM migrations: %d", numberOfMigrations));
Trace.printLine(String.format("SLA: %.5f%%", sla * 100));
Trace.printLine(String.format("SLA perf degradation due to migration: %.2f%%", slaDegradationDueToMigration * 100));
Trace.printLine(String.format("SLA time per active host: %.2f%%", slaTimePerActiveHost * 100));
Trace.printLine(String.format("Overall SLA violation: %.2f%%", slaOverall * 100));
Trace.printLine(String.format("Average SLA violation: %.2f%%", slaAverage * 100));
3.11 Conclusion
This chapter discussed some of the capabilities and limitations of the default CloudSim
framework being used for this research and identified the modules most relevant to testing the
hypothesis presented in this research. An explanation of how CloudSim processes the
workload applied from the cloudlets was also provided, and a range of shortfalls and
possible errors were identified. Chapter 4 details the changes made to the default
framework, and the new code which was integrated into CloudSim to evaluate the
hypothesis.
Chapter 4 Implementation
Introduction
As it is not provided in the default CloudSim package, additional code was required to test
the effect on power consumption when the monitoring interval is adjusted. Chapter 3
provided an overview of the default power-aware CloudSim simulation and the related
modules. The specific capabilities (and limitations) of the framework, as they apply to
dynamic interval adjustment, were also outlined. In this chapter the changes which were
required to implement dynamic adjustment of the monitoring interval are described in more
detail.
The primary contribution of the thesis is to evaluate the impact of moving from a
static to a dynamic monitoring process whereby the predicted average utilization for the DC
at each interval is used to adjust the next interval. Before writing the code, which would
ultimately be integrated with the existing LR/MMT modules in CloudSim, an algorithm was
designed to clarify the steps involved.
4.1 Interval Adjustment Algorithm
The dynamic interval adjustment algorithm involves two principal steps:
1. Calculate the weighted mean of the CPU utilization values for all operational hosts in the
data center, as in Equation 1. Non-operational hosts are excluded from the calculation, as
they do not affect the average CPU utilization for the DC:

$$\text{weighted mean} = \frac{\sum_{i=1}^{n} w_i\, x_i}{\sum_{i=1}^{n} w_i} \qquad (1)$$

where $w_i$ is the weight applied to the range within which the predicted utilization value $x_i$ for
each operational host falls, and $n$ is the number of operational hosts.
2. Choose and set the next monitoring interval with respect to the appropriate weighted
mean from Table 4.1:
The premise upon which weightings are applied to calculate the average utilization of the DC
is simplified for the purposes of this research. The primary objective is to adjust the
monitoring interval with respect to the upper utilization threshold. As such, a straightforward
set of weights (from 1 to 10) is applied to the CPU utilization for each host, such that hosts
which have a higher CPU utilization (i.e. are closer to 100%) are given more priority in the
calculation. If a simple average were taken of host utilization across the DC, this would have
the effect of masking hosts that are close to the threshold, where SLAs are in danger of being
violated. If the lower threshold were taken into consideration, a different set of weights would
be appropriate, with increased importance applied in the regions closer to both thresholds and
reduced importance at the center (e.g. 40-60% CPU utilization). There is certainly scope for
further investigation of the simplified set of weights applied in this research, depicted in
Table 4.1.
Table 4.1: Application of Weights to Predicted Utilization
Predicted Utilization (%) per host ( xi ) Weight Applied ( wi )
1 – 10 1
11 – 20 2
21 – 30 3
… …
91 – 100 10
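The banding in Table 4.1, together with Equation 1, can be captured in a few lines (a sketch; the actual CloudSim implementation is the AdjustInterval() / GetWeight() code shown in Section 4.4):

```java
public class WeightedMeanDemo {
    // Table 4.1 as arithmetic: utilization 1-10% -> weight 1, ..., 91-100% -> 10.
    static int weightFor(double utilizationPercent) {
        int w = (int) Math.ceil(utilizationPercent / 10.0);
        return Math.max(1, Math.min(10, w));
    }

    // Equation 1: sum(w_i * x_i) / sum(w_i), over operational hosts only.
    static double weightedMean(double[] hostUtilizations) {
        double numerator = 0, denominator = 0;
        for (double x : hostUtilizations) {
            if (x <= 0) continue;         // non-operational hosts are excluded
            int w = weightFor(x);
            numerator += w * x;
            denominator += w;
        }
        return numerator / denominator;
    }

    public static void main(String[] args) {
        // Hosts at 20% and 90% (weights 2 and 9), plus one powered-off host:
        // weighted mean = (2*20 + 9*90) / (2 + 9) = 850 / 11 ~ 77.3%
        System.out.println(weightedMean(new double[] {20, 90, 0}));
    }
}
```

Note how the 90% host pulls the mean well above the simple average of 55%, which is exactly the masking effect the weighting is designed to avoid.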
The monitoring intervals applied to the resulting weighted average prediction are depicted in
Figure 9. As with the weights discussed above, the intervals were chosen somewhat
arbitrarily and would benefit from further analysis. The maximum interval is aligned with the
existing default interval in CloudSim i.e. 300 seconds. A minimum interval of 30 seconds
facilitates 10 intervals in total, each having a corresponding 10% CPU utilization range from
0 to 100.
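Figure 9 itself is not reproduced here; assuming the stepping it depicts is uniform (ten 10% bands mapping 300 s down to 30 s in 30-second steps, consistent with the SetSchedulingIntervalRelativeToUtilization() code in Section 4.4), the mapping can be sketched as:

```java
public class IntervalMappingDemo {
    // Maps the weighted-average DC utilization (%) to the next monitoring
    // interval in seconds: 0-10% -> 300 s, 10-20% -> 270 s, ..., 90-100% -> 30 s.
    // Assumes a uniform 30-second step per 10% band, as implied by Figure 9.
    static int intervalFor(double weightedAveragePercent) {
        int band = (int) Math.ceil(weightedAveragePercent / 10.0);
        band = Math.max(1, Math.min(10, band));
        return 330 - 30 * band;
    }

    public static void main(String[] args) {
        System.out.println(intervalFor(5));    // 300 (lightly loaded DC)
        System.out.println(intervalFor(15));   // 270
        System.out.println(intervalFor(95));   // 30 (heavily loaded DC)
    }
}
```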
However, if the minimum interval of 30 seconds was applied for the full 24 hour simulation,
2880 values would be required (i.e. (60 x 60 x 24) / 30) in each PlanetLab cloudlet file to
ensure a value could be read at each interval. The 288 values in the 1052 default files
provided by CloudSim were thus concatenated (using a C# program written specifically for
this purpose) to ensure sufficient values were available, resulting in a total of 105 complete
files with 2880 values each.
if (Constants.IsDefault) {
	data = new double[288];   // default PlanetLab workload
} else {
	data = new double[2880];  // dynamic workload
}
The code extract above demonstrates the difference between the two data[] arrays which hold
the PlanetLab cloudlet values read at each interval during the simulation. The default array
is 288 in length while the dynamic array is 2880 – providing sufficient indices to store
the required number of cloudlet values throughout the simulation.
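The concatenation itself was performed by a purpose-built C# program; a minimal equivalent sketch in Java (the file paths and names here are illustrative):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class ConcatenateCloudlets {
    // Joins the lines of several 288-value cloudlet files into one list,
    // e.g. ten files -> 2880 values for the dynamic simulation.
    static List<String> concatenate(List<Path> sources) throws IOException {
        List<String> out = new ArrayList<>();
        for (Path p : sources) {
            out.addAll(Files.readAllLines(p));
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        List<Path> sources = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            sources.add(Path.of("planetlab/cloudlet_" + i + ".txt")); // illustrative paths
        }
        Files.write(Path.of("planetlab/concatenated_0.txt"), concatenate(sources));
    }
}
```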
Figure 9 Application of the Monitoring Interval Based on Weighted Utilization Average
As a result of the reduced file count after concatenation, the number of hosts running in the
simulation was reduced, from the CloudSim default of 800 to 80, to maintain the ratio of
VMs to hosts (Table 4.2):
Table 4.2: VMs to Hosts - Ratio Correction
Cloudlets / VMs Hosts
Default 1052 800
Dynamic 105 80
4.2 Comparable Workloads
This research focuses on a comparison of the default CloudSim simulation with a dynamic
version and therefore required comparable workloads. Ensuring that the
workloads are comparable (in simulations which monitor at different intervals) involves
applying the same amount of processing to each VM during each interval. Accordingly, a
new set of files was created for the default simulation. The values for these files were
calculated based on the average of the values used in the dynamic simulation by the time each
300 second interval had elapsed. This was achieved by running the dynamic simulation for 24
hours and recording the data (e.g. interval length, cumulative interval, average utilization per
interval) observed (shown in Figure 10). This data was initially written to a Microsoft Excel
worksheet from within the CloudSim reporting structure and then exported to a Microsoft
SQL Server database. The number (727) and length of the intervals in the dynamic simulation
can be seen in Figure 11.
From Figure 12, it is clear that a lower and/or upper offset may occur during
calculation, i.e. the dynamic interval 'straddles' the 300-second mark. To maintain as much
accuracy as possible in the calculation of the default file values, two new variables (i.e.
offsetBelow300, offsetAbove300) were introduced.
Figure 10 A screenshot of the data generated for calculation of the default workload
Figure 11 Intervals calculated during the dynamic simulation
Figure 12 Calculation of the Average CPU Utilization for the Default Files
The length of each interval is added to an accumulator until the total equals (or exceeds) 300
seconds. The average utilization for the accumulated intervals is then calculated. This average
includes (if required) the average for the final portion of any interval below the 300-second
mark. When an offset occurs above the 300-second mark (in the current accumulator), it is
‘held-over’ (i.e. added to the accumulator in the next 300-second interval). Some of the new
Java code written in CloudSim to monitor the interval and workload activity (generating the
data required for the calculator) is shown below – the code comments provide explanation:
if (Constants.IntervalGenerator) {
	int intervalDifference = 0;
	int iOffsetBelowForPrinting = 0;
	int iOffsetAboveForPrinting = 0;
	int iAccumulatedIntervalsForPrinting = 0;

	if (Constants.accumulatedIntervals >= 300) {
		// Constants.accumulatedIntervals is exactly 300
		int accumulated = 300;

		// calculate offsets
		if (Constants.accumulatedIntervals > 300) {
			accumulated = (int) Constants.accumulatedIntervals - (int) dInterval;
		}
		Constants.offsetBelow = 300 - accumulated;
		Constants.offsetAbove = Constants.accumulatedIntervals - 300;
	}
}
4.3 C# Calculator
Calculation of the new per-interval workloads was achieved using a separate 'calculator'
program written in C#. The calculator implements the process depicted in Figure 12. The
principal C# method used to calculate the new averages for the default files is
CreateDefaultFiles(). The comments in the code explain each step of the
process:
private void CreateDefaultFiles()
{
    // read in first 727 values from each file - used in dynamic simulation
    FileInfo[] Files = dinfo.GetFiles();
    string currentNumber = string.Empty;
    int iOffsetAboveFromPrevious = 0;
    // initialize at max to ensure not used on first iteration
    int iIndexForOffsetAbove = 727;

    foreach (FileInfo filex in Files)
    {
        using (var reader = new StreamReader(filex.FullName))
        {
            // fill dynamicIn
            for (int i = 0; i < 727; i++)
            {
                dynamicIn[i] = Convert.ToInt32(reader.ReadLine());
            }

            int iCurrentOutputIndex = 0;
            // calculate
            for (int k = 0; k < 727; k++)
            {
                // add each average used here - including any offset
                float iAccumulatedTotal = 0;

                // reached >= 300 accumulated intervals
                int iReadCount = Convert.ToInt32(ds.Tables[0].Rows[k]["ReadCount"]);
                if (iReadCount > 0)
                {
                    // first interval
                    if (k == 0)
                    {
                        int iValue = dynamicIn[k];
                        iAccumulatedTotal += iValue;
                    }
                    else
                    {
                        // readCount == 1: just check for offsets
                        if (iReadCount > 1)
                        {
                            for (int m = 1; m < iReadCount; m++)
                            {
                                int iValue = dynamicIn[k - m];
                                int iInterval = Convert.ToInt32(ds.Tables[0].Rows[k - m]["Interval"]);
                                iAccumulatedTotal += iValue * iInterval;
                            }
                        }
                    }

                    // offset - read this interval
                    int iOffsetBelow = Convert.ToInt32(ds.Tables[0].Rows[k]["OffsetBelow300"]);
                    if (iOffsetBelow > 0)
                    {
                        iAccumulatedTotal += iOffsetBelow * dynamicIn[k];
                    }

                    // use previous offset above in this calculation
                    if (k >= iIndexForOffsetAbove)
                    {
                        iAccumulatedTotal += iOffsetAboveFromPrevious;
                        // reset
                        iOffsetAboveFromPrevious = 0;
                        iIndexForOffsetAbove = 727;
                    }

                    // use this offset above in next calculation
                    int iOffsetAbove = Convert.ToInt32(ds.Tables[0].Rows[k]["OffsetAbove300"]);
                    if (iOffsetAbove > 0)
                    {
                        // value for offset above to add to next accumulator
                        iOffsetAboveFromPrevious = iOffsetAbove * dynamicIn[k];
                        // use in next calculation - at a minimum
                        iIndexForOffsetAbove = k;
                    }

                    int iAverage = Convert.ToInt32(iAccumulatedTotal / 300);
                    // first interval
                    if (k == 0)
                    {
                        iAverage = dynamicIn[k];
                    }
                    // save averaged value to array for writing
                    defaultOutput[iCurrentOutputIndex] = iAverage.ToString();
                    iCurrentOutputIndex++;
                }
            }
        }

        // print to text file for default cloudlet
        System.IO.File.WriteAllLines("C:\\Users\\scooby\\Desktop\\DefaultNewFiles\\" + filex.Name, defaultOutput);
    }
}
The code above depicts the process by which the values required to calculate the default file
averages were obtained. The results of the calculator
program were written back out to flat text files, i.e. the same format as the original CloudSim
cloudlet files. To compare the difference between the default and dynamic workloads, both
default and dynamic simulations were run using the new cloudlet files, with a few lines of
additional code (to monitor the workload during each simulation) added to the default
constructor of the UtilizationModelPlanetLabInMemory() class in CloudSim. This
additional code ensured that the workload would be observed (and accumulated) as each
cloudlet was processed.
As the data is being read into the data[] array from the PlanetLab cloudlet files
each value is added to a new accumulator variable i.e. Constants.totalWorkload:
int n = data.length;
for (int i = 0; i < n - 1; i++) {
	data[i] = Integer.valueOf(input.readLine());
	Constants.totalWorkload += data[i];
}
The Constants.totalWorkload value was then divided by the relevant number of
intervals (i.e. Default: 288 / Dynamic: 727) to calculate the average workload per interval. A
difference of less than 1% in the per-interval workloads was observed, validating, for the
most part, the results generated for the default cloudlet files by the C# program.
The negligible difference may be explained by migrations taking place during
collection of the CPU utilization data prior to export. For example, if a workload on a VM is
calculated at 10% of its host’s capacity and then migrated to a host with a lower capacity, the
same workload would require more time to complete – skewing the average CPU utilization
that would have otherwise been calculated had the migration not taken place. This scenario
was not factored into calculation of the per-interval average CPU utilization, resulting in the
difference between the workloads of approximately 1%. This error margin was considered
acceptable in the context of the overall thesis objectives.
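The per-interval comparison described above reduces to a few lines (a sketch with illustrative totals, not the values recorded during the experiments):

```java
public class WorkloadComparisonDemo {
    // Relative difference between the per-interval workload averages of the
    // default (288-interval) and dynamic (727-interval) simulations.
    static double relativeDifference(double defaultTotal, double dynamicTotal) {
        double defaultPerInterval = defaultTotal / 288;
        double dynamicPerInterval = dynamicTotal / 727;
        return Math.abs(defaultPerInterval - dynamicPerInterval) / defaultPerInterval;
    }

    public static void main(String[] args) {
        // Illustrative totals giving per-interval averages of 500 and ~497.9:
        // a difference of ~0.4%, inside the ~1% margin observed in this research
        double diff = relativeDifference(144_000, 362_000);
        System.out.println(diff < 0.01); // true
    }
}
```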
4.4 Interval Adjustment Code
The updateCloudletProcessing() method in the PowerDatacenter class is the
principal cloudlet processing method, run at each interval and provided by default in
CloudSim. As such it is the ideal place to position the call to the additional code
required to implement interval adjustment.
To differentiate between the default and dynamic simulations at runtime, a constant
boolean variable (IsDefault) was created which indicates which type of simulation is being
run. Based on the value of the IsDefault variable, the code will fork either to the default
CloudSim code or the dynamic code written to adjust the monitoring interval. The fork
returns to the default CloudSim code once the AdjustInterval() method has been
executed:
if (Constants.IsDefault) {
	// run default simulation
} else {
	// run dynamic simulation
	AdjustInterval();
}
The AdjustInterval() method (outlined below) is the entry point for the dynamic
monitoring interval adjustment simulation. Figure 13 depicts how the dynamic code interacts
with the CloudSim default:
protected void AdjustInterval(double currentTime) {
	double dTotalUsageForAverage = 0;
	double dAverageUsage = 0;
	int iDenominator = 0;
	int iWeight = 0;
	double timeDiff = currentTime - getLastProcessTime();

	for (PowerHost host : this.<PowerHost> getHostList()) {
		double utilizationOfCpu = host.getUtilizationOfCpu();
		if (utilizationOfCpu > 0) {
			iWeight = GetWeight(utilizationOfCpu);
			dTotalUsageForAverage += utilizationOfCpu * iWeight;
			iDenominator += iWeight;
		}
	}

	dAverageUsage = dTotalUsageForAverage / iDenominator;

	// alter scheduling interval according to average utilization
	SetSchedulingIntervalRelativeToUtilization(dAverageUsage);
}
Figure 13 How the dynamic interval adjustment code interacts with CloudSim
A host which is not running has a CPU utilization of 0. As the code shows, only
hosts with CPU utilization greater than 0 are included in the average CPU utilization for
the DC, i.e.

if (utilizationOfCpu > 0)
A weighting is then applied (helper function: GetWeight() - below) to each result obtained.
This weighting (cf. Table 4.1) is based on the CPU utilization calculated for each host by the
getUtilizationOfCpu() method provided by default in CloudSim:
public int GetWeight(double utilization) {
    double iUtilization = utilization * 100;
    int iWeight = 0;
    // check utilization value range
    if (iUtilization >= 0.00 && iUtilization <= 10.00) {
        iWeight = 1;
    } else if (iUtilization > 10.00 && iUtilization <= 20.00) {
        iWeight = 2;
    } else if (iUtilization > 20.00 && iUtilization <= 30.00) {
        iWeight = 3;
    } else if (iUtilization > 30.00 && iUtilization <= 40.00) {
        iWeight = 4;
    } else if (iUtilization > 40.00 && iUtilization <= 50.00) {
        iWeight = 5;
    } else if (iUtilization > 50.00 && iUtilization <= 60.00) {
        iWeight = 6;
    } else if (iUtilization > 60.00 && iUtilization <= 70.00) {
        iWeight = 7;
    } else if (iUtilization > 70.00 && iUtilization <= 80.00) {
        iWeight = 8;
    } else if (iUtilization > 80.00 && iUtilization <= 90.00) {
        iWeight = 9;
    } else if (iUtilization > 90.00 && iUtilization <= 100.00) {
        iWeight = 10;
    }
    return iWeight;
}
The average utilization for the DC is then passed to another helper function
(SetSchedulingIntervalRelativeToUtilization() – shown below) which will
adjust the next monitoring interval (i.e. Constants.SCHEDULING_INTERVAL) based on the
range within which the utilization falls.
public void SetSchedulingIntervalRelativeToUtilization(double dAverageUsage) {
    double iUtilization = dAverageUsage * 100;
    double dInterval = 300;
    if (iUtilization >= 0.00 && iUtilization <= 10.00) {
        dInterval = 300;
    } else if (iUtilization > 10.00 && iUtilization <= 20.00) {
        dInterval = 270;
    } else if (iUtilization > 20.00 && iUtilization <= 30.00) {
        dInterval = 240;
    } else if (iUtilization > 30.00 && iUtilization <= 40.00) {
        dInterval = 210;
    } else if (iUtilization > 40.00 && iUtilization <= 50.00) {
        dInterval = 180;
    } else if (iUtilization > 50.00 && iUtilization <= 60.00) {
        dInterval = 150;
    } else if (iUtilization > 60.00 && iUtilization <= 70.00) {
        dInterval = 120;
    } else if (iUtilization > 70.00 && iUtilization <= 80.00) {
        dInterval = 90;
    } else if (iUtilization > 80.00 && iUtilization <= 90.00) {
        dInterval = 60;
    } else {
        dInterval = 30;
    }
    setSchedulingInterval(dInterval);
    Constants.SCHEDULING_INTERVAL = dInterval;
}
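Since both if/else ladders are step functions over the same 10% utilization bands, they could optionally be collapsed into two lines of arithmetic. The sketch below is an equivalent refactoring, not the thesis code (the class and method names are illustrative): the weight is the 1-based band index, and the interval falls by 30 seconds per band.

```java
public class BandArithmeticSketch {
    // Equivalent to GetWeight(): band index 1..10 for utilization in (0, 1].
    static int weightOf(double utilization) {
        return Math.min(10, Math.max(1, (int) Math.ceil(utilization * 10)));
    }

    // Equivalent to the SetSchedulingIntervalRelativeToUtilization() ladder:
    // 330 - 30 * weight yields 300, 270, ..., 30 for weights 1..10.
    static double intervalOf(double averageUsage) {
        return 330 - 30 * weightOf(averageUsage);
    }

    public static void main(String[] args) {
        System.out.println(intervalOf(0.05)); // band 1  -> 300.0 s
        System.out.println(intervalOf(0.95)); // band 10 -> 30.0 s
    }
}
```

The closed form makes the relationship explicit: the monitoring interval shortens by 30 seconds for each 10% band of average DC utilization, from 300 s down to 30 s.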
The process then returns to the AdjustInterval() method which returns control back to
the default CloudSim code where the fork began. The default CloudSim code continues,
completing the (per-interval) updateCloudletProcessing() method and continuing the
simulation into the next interval, the length of which has now been adjusted with respect to
the predicted average CPU utilization for the DC.
4.5 Reporting
The metrics available in the default CloudSim reports (as described in Section 3.10) were
found to be sufficient for the purposes of the testing phase of this research (cf. Chapter 5).
However, some additional variables were needed during the design phase of the new
interval-adjustment algorithm. These were added as static fields in the Constants class so
that they would be globally available across a range of classes and could be used without
requiring instantiation of any new objects. Most are associated with calculation of the per-
interval workload for the default PlanetLab cloudlet files. They include:
public static int fileBeingRead = 0;
public static boolean IntervalGenerator = false;
public static int previousIntervalCount = 0;
public static int intervalCount = 0;
public static int offsetBelow = 0;
public static int offsetAbove = 0;
public static int accumulatedOffsetTotal = 0;
public static int intervalLengthTotal = 1;
The list below depicts a typical trace file for the default CloudSim LRMMT simulation. It
contains the output of the CloudSim reporting class (Helper), together with some of
the additional metrics added for the purpose of this research:
Experiment name: default_lr_mmt_1.2
Number of hosts: 80
Number of VMs: 105
Total simulation time: 86100.00 sec
Energy consumption: 16.76 kWh
Overutilized Hosts: 2249
Number of VM migrations: 2305
Total Workload: 3833.310000
SLA: 0.00428%
SLA perf degradation due to migration: 0.07%
SLA time per active host: 5.97%
Overall SLA violation: 0.53%
Average SLA violation: 11.61%
SLA time per host: 0.05%
Number of host shutdowns: 1184
Mean time before a host shutdown: 627.78 sec
StDev time before a host shutdown: 1443.06 sec
Mean time before a VM migration: 17.12 sec
StDev time before a VM migration: 7.67 sec
4.6 Conclusion
This chapter has discussed the modifications to CloudSim required to implement dynamic
adjustment of the monitoring interval, and has described how the new code integrates with
the default code provided by CloudSim. In Chapter 5 the tests carried out to compare the
default with the dynamic simulations are described and the results analysed. Finally,
potential opportunities for improvement of the CloudSim framework, identified during the
course of this research, are suggested.
Chapter 5 Tests, Results & Evaluation
Introduction
Chapter 4 detailed the code changes that were required in the CloudSim framework to
implement dynamic adjustment of the monitoring interval. The new code integrates
seamlessly with the existing framework. No alterations were made to the underlying
CloudSim architecture. This chapter deals with the specifics of the simulations carried out to
test the hypothesis that opportunities for reduction of power consumption can be identified
when the length of the interval changes with respect to the varying workload experienced by
a typical DC.
5.1 Tests & Results
Using the dynamic PlanetLab cloudlet files and interval adjustment code, the simulation
was run for the full duration (i.e. 86100 seconds) and compared with the CloudSim default
which used the cloudlet files generated by the C# calculator. Key results are presented in
Table 5.1, which shows a significant reduction in over-utilized hosts, migrations and power
consumption.
Table 5.1: Simulation Results

          Interval (s)   Time (s)   Interval Count   Over-utilized Hosts   Migration Count   Power Consumption (kWh)
Static    300            86100      287              2249                  2305              16.76
Dynamic   30 - 270       86100      727              1697                  979               8.23
Figure 14 depicts the intervals that were calculated during the dynamic simulation based on
the average CPU utilization for the DC. It can be seen that the interval ranges from a
minimum of 30 seconds to a maximum of 270 seconds - indicating that the average CPU
utilization for the DC did not exceed 90% nor drop below 10%. From Figure 14, and as
described in Chapter 4, the number of intervals in the dynamic simulation is 727, compared
with 288 in the static simulation.
Figure 14 Interval Calculation for the Dynamic Simulation
Figure 15 shows a comparison of the VM count during both simulations, indicating that
VMs are constantly 'de-commissioned' as their workloads are completed during the
simulation. There are 34 VMs still running at the end of the default simulation, whereas
only 6 VMs have not completed their workloads at the end of the dynamic simulation. This
indicates that the VM placement algorithm has performed more efficiently in the dynamic
simulation, i.e. more of the PlanetLab workload from the cloudlet files has been completed
by the time the dynamic simulation finishes.
Figure 15 VM decommissioning comparison
Comparing Figures 16 & 17, which depict the operational hosts at each interval in the default
and dynamic simulations, it can be seen that more efficient use is made of the hosts when the
interval is adjusted dynamically. A minimal number of operational servers is achieved sooner
in the dynamic simulation and the power-on / power-off behaviour of the default simulation
(which consumes both time and energy) is primarily absent from the dynamic. This is
discussed further in Section 5.2.1 below.
Figure 16 Operational Hosts - Default Simulation
Figure 17 Operational Hosts - Dynamic Simulation
Figure 18 depicts the per-interval average CPU utilization in the DC for the dynamic
simulation. A cluster of values can be seen at approximately 99% between 17 – 19 hours in
the dynamic simulation. There is a single operational host in this time period (Figure 17) with
a range of 9 - 19 VMs running on it. The high average CPU utilization is a direct result of
all the remaining VMs being placed on this single host. The VM placement algorithm is most
efficient at this point in the dynamic simulation from a power consumption perspective,
optimizing energy efficiency by minimizing the number of operational hosts required to
service the DC workload. This placement configuration would not be possible if the
PlanetLab cloudlet workloads were higher. It is the relatively low CPU values being
allocated to the VMs from the cloudlet files that make the placement on a single host in this
time period possible.
Figure 18 Average CPU Utilization - Dynamic Simulation
5.2 Evaluation of Test Results
In Section 5.2 the results summarized above are investigated, based on an understanding of
CloudSim as derived from code, code comments and help documentation / user forums. It
explains the research findings using an ‘under-the-hood’ analysis. Section 5.3 identifies a
number of limitations in the CloudSim framework which would benefit from further
investigation and is tailored more towards future researchers using the CloudSim framework
than DC operators.
5.2.1 Why are the Over-Utilized Hosts / Migration Counts Reduced?
CloudSim performs migrations when an over-utilized host is identified. One or more VMs are
chosen for migration to bring the CPU utilization of the host back below the over-utilization
threshold. It is not explicitly clear from the CloudSim metrics available why the migration
count is reduced in dynamic mode relative to static mode. The VM placement algorithm
defined by CloudSim is a complex module within the framework. The logic upon which it
works is that the most appropriate destination for the migrating VM is the one which results
in the lowest power consumption post-migration. However, it is clear from the operational
hosts observed in the dynamic simulation (depicted in Figure 17) that the VM placement
algorithm is also performing consolidation (c.f. Sections 2.4.1, 3.3 & 5.1 above).
During the period when only 1 host is operational in the dynamic simulation (Figure
17: 17 – 24 hours) it was observed that there were as many as 19 VMs running on that host
(c.f. Section 5.1). As a result, some over-allocation is occurring. Over-allocation occurs
when so many VMs are placed on a host that the host has insufficient capacity to service the
workload of every VM in each time frame. In the effort to consolidate, VMs will sometimes be placed
on a host which is currently running, rather than switching on a new host. The effect is that,
due to the increased length of the CPU queue on the host (i.e. more VMs are ‘waiting’ for
processing time slices), some VMs will not receive the CPU cycles required to complete their
workload in the available interval. The expected action would be migration of the ‘starved’
VMs but it is evident (from Figure 17) that no migrations are taking place i.e. no other host is
switched on. This is due to one of the limitations identified in the framework - that CloudSim
only performs a VM migration when the entire host is over-utilized – not when an individual
VM requires more capacity (c.f. Section 5.2.7 below). Clearly, there is a trade-off between
consolidation and migration. The conclusion reached, based on the results observed, is that
the reduced intervals in the dynamic simulation result in more frequent analysis, performing
this consolidation / migration trade-off more efficiently than the default simulation, resulting
in fewer over-utilized hosts and a reduced migration count.
5.2.2 Result of Reduced Migration Count
Beloglazov et al. [51] show that decreased power consumption can be achieved in a DC if the
VM migration count can be reduced. Their work is based on the premise that additional
resources are consumed during a migration due to the extra processing required to move the
memory of the VM from its current host to another. Those processes may include:

- identification of a suitable destination server, i.e. the VM placement algorithm
- network traffic
- CPU processing on both source and destination servers whilst concurrently running two VMs
In the case of live migration, transfer of the VM’s memory image is performed by the Virtual
Machine Manager (VMM) which copies the RAM, associated with the VM service, across to
the destination while the service on the VM is still running. RAM which is re-written on the
source must be transferred again. This process continues iteratively until the remaining
volume of RAM needing to be transferred is such that the service can be switched off with
minimal interruption. This period of time, while the service is unavailable, is known as
downtime. Any attempt to improve migration algorithms must take live-copy downtime into
consideration to prevent (or minimize) response SLAs. CloudSim achieves this (to some
extent) by choosing, for migration, the VM with the lowest RAM. However, the CloudSim
SLA metric in the modules used for this research does not take this downtime into
consideration.
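The iterative pre-copy behaviour described above can be sketched with a toy numerical model (all parameter values are illustrative assumptions; CloudSim itself does not model dirty-page iteration). Each round transfers the RAM dirtied during the previous round, and the VM is stopped once the residue drops below a threshold:

```java
public class PreCopySketch {
    // Returns the number of pre-copy rounds needed before the remaining
    // dirty RAM falls below stopThresholdMb, assuming a fixed dirty rate.
    // Model: each round takes (remaining / bandwidth) seconds, during which
    // dirtyRateMbPerSec * time MB of RAM is re-written and must be re-sent.
    static int preCopyRounds(double ramMb, double bandwidthMbPerSec,
                             double dirtyRateMbPerSec, double stopThresholdMb) {
        double remaining = ramMb;
        int rounds = 0;
        while (remaining > stopThresholdMb && rounds < 30) {
            double roundTime = remaining / bandwidthMbPerSec;
            remaining = dirtyRateMbPerSec * roundTime; // RAM dirtied meanwhile
            rounds++;
        }
        return rounds;
    }

    public static void main(String[] args) {
        // 1740 MB VM, 100 MB/s link, 20 MB/s dirty rate, stop below 10 MB:
        // remaining shrinks by a factor of 5 per round:
        // 1740 -> 348 -> 69.6 -> 13.92 -> 2.784, i.e. 4 rounds
        System.out.println(preCopyRounds(1740, 100, 20, 10));
    }
}
```

The loop only converges when the dirty rate is below the link bandwidth; the round cap guards against the divergent case, in which a real VMM would force a stop-and-copy.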
Dynamic adjustment of the monitoring interval, however, minimizes this issue of
RAM transfer by reducing the need for the migration in the first place. The power consumed
as a result of the migrations is saved when additional migrations are not required.
5.2.3 Scalability
As outlined in Section 2.1 there is a trade-off between DC monitoring overhead costs and net
DC benefits. The issue here is that the additional volume of processing which takes place
when shorter monitoring intervals are applied may become such that it would not be
beneficial to apply dynamic interval adjustment at all.
Take, for example, Amazon's EC2 EU West DC (located in Dublin, Ireland), which is
estimated to contain over 52,000 operational servers [52]. Processing of the data (CPU
utilization values) required to perform the interval adjustment is not an insignificant
additional workload. The algorithm will calculate the average CPU utilization of some 52,000
servers and apply the new interval. As such, if this calculation were to take place every 30
seconds (in a DC with an average CPU utilization above 90%), rather than every 300
seconds, there is a ten-fold increase in the total processing volume which includes both
collection and analysis of the data points. While it is unlikely that even the average CPU
utilization of the most efficient DC would exceed 90% for any extended period of time, it is
clear that the size of the DC (i.e. number of operational servers) does play a role in
establishing whether or not the interval adjustment algorithm described in this research
should be applied. Microsoft’s Chicago DC has approximately 140,000 servers installed.
With increasingly larger DCs being built to meet growing consumer demand, it is reasonable
to expect that DC server counts will reach 500,000 in the foreseeable future. Rather than
viewing the entire DC as a single entity from a monitoring perspective, perhaps the most
viable application of dynamic monitoring interval adjustment would be to sub-divide these
larger DCs into more manageable sections, calculating the monitoring interval for each
section separately.
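The ten-fold increase referred to above can be made concrete with back-of-the-envelope arithmetic (the server count is the estimate from [52]; the class and method names are illustrative):

```java
public class MonitoringOverheadSketch {
    // Number of CPU-utilization samples collected per day across the DC
    // for a given monitoring interval.
    static long samplesPerDay(long servers, int intervalSeconds) {
        return servers * (86400L / intervalSeconds);
    }

    public static void main(String[] args) {
        long servers = 52_000; // estimated EC2 EU West server count [52]
        // 300 s interval: 52,000 * 288 = 14,976,000 samples/day
        System.out.println(samplesPerDay(servers, 300));
        // 30 s interval: a ten-fold increase, 149,760,000 samples/day
        System.out.println(samplesPerDay(servers, 30));
    }
}
```

Sub-dividing the DC into sections, as proposed, bounds the per-section sample volume regardless of total DC size.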
5.3 Evaluation of CloudSim
5.3.1 Local Regression Sliding Window
The adjusted interval in this research (discussed in Section 3.5) results in the 'first
fill' of the window occurring sooner than the CloudSim default i.e. the longest interval (5
minutes) in the dynamic version is the minimum (static) interval in the CloudSim default.
The first 10 values in the sliding window take 3000 seconds (i.e. 300 x 10) in the default
CloudSim simulation whereas, in the dynamic version, the window is filled after 1480
seconds. The result is a small increase in the accuracy of the utilization prediction at the
beginning of the simulation because the less accurate ‘fallback’ algorithm is ‘discarded’
sooner.
The size of the sliding window in the default CloudSim framework is 10 i.e. the 10 most
recent CPU utilization values from the host are used each time the local regression algorithm
is performed. If there were more values in the window, the algorithm would be less sensitive
to short-term changes in the workload. Clearly the size of the sliding window should be
proportionate to the level of sensitivity required. The choice of this parameter would most
likely benefit from detailed sensitivity analysis.
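The first-fill timing can be reproduced by summing the first ten interval lengths. The sketch below (illustrative names; the dynamic sequence shown is hypothetical, since the actual sequence in any run depends on the observed workload) compares the static case of 10 × 300 s against a dynamic run that spends its early intervals in the mid-utilization bands:

```java
public class WindowFillSketch {
    // Simulation time needed to collect 'windowSize' utilization samples,
    // given the sequence of monitoring-interval lengths actually used.
    static double fillTime(double[] intervals, int windowSize) {
        double t = 0;
        for (int i = 0; i < windowSize && i < intervals.length; i++) {
            t += intervals[i];
        }
        return t;
    }

    public static void main(String[] args) {
        double[] staticIntervals = new double[10];
        java.util.Arrays.fill(staticIntervals, 300);
        System.out.println(fillTime(staticIntervals, 10)); // 3000.0 s

        // A hypothetical dynamic sequence fills the window far sooner:
        double[] dynamicIntervals = {300, 240, 180, 150, 120, 120, 90, 90, 90, 90};
        System.out.println(fillTime(dynamicIntervals, 10)); // 1470.0 s
    }
}
```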
5.3.2 RAM
Chapter 3 detailed the configuration settings provided by default in CloudSim in an effort to
simulate 'real-world' VMs. However, for reasons unclear from the code, two of the VM types
have the same RAM allocation applied to them, i.e. 1740 MB. It would be preferable if either:

- 4 distinct VM types were configured, to better reflect 'real-world' scenarios and improve the VM selection policy deployed by default, or
- provision for dynamic RAM adjustment were included in the CloudSim framework (c.f. Section 5.3.3 below).
5.3.3 Dynamic RAM Adjustment
No facility is provided in CloudSim to adjust the amount of RAM available to a VM while
the simulation proceeds. Should a VM require more RAM, a migration to another host, where
a higher-capacity VM can be provisioned, is required. While this simulates many 'real-world'
systems, the facility to dynamically adjust the amount of RAM allocated to a VM (without
requiring migration) would improve the VM selection algorithm.
5.3.4 SLA-based Migration
The basis upon which a migration takes place in the CloudSim module used for this research
is an over-utilized host. If a VM requires additional RAM to service its workload it must be
migrated to another host where a larger VM can be configured to run the workload. However
the module does not facilitate SLA-based migration. Rather, only VMs on a host which is
over-utilized are migrated. This is a significant limitation in the design of CloudSim. Even
in this scenario, the VMs which need additional RAM may not be those migrated, because
the algorithm for choosing the VMs to migrate selects the VM with the lowest RAM pages
requiring migration first. This will typically leave ‘starved’ VMs on the source host - still
requiring additional RAM. Clearly, some improvement is required in the VM selection and
allocation policies deployed by the default CloudSim framework.
Chapter 6 Conclusions
In order to identify a potentially novel energy efficiency approach for virtualized DCs, a large
part of the research effort in this thesis was dedicated to evaluating the current state-of-the-
art. On completion of this investigative phase it was decided to focus on opportunities
relating to DC management software i.e. virtualization. Following this, the concept of a
dynamic monitoring interval was then proposed. Once CloudSim had been identified as the
most accessible framework in which to build a test bed, a significant amount of time was
spent reviewing the existing code to establish the capabilities (and limitations) of the
framework.
The dynamic simulation presented in this thesis is differentiated from the default
LR/MMT CloudSim in that the duration of the next interval is adjusted (with respect to the
weighted average of the DC CPU utilization) rather than maintaining a static interval of
300 seconds, which is the standard monitoring interval used in commercial applications (e.g.
VMware, Citrix Xen).
The primary aim of this research (as outlined in the introductory chapter) was to
determine the impact on power consumption of dynamically adjusting the monitoring
interval. With shorter intervals, analysis of DC metrics is performed more often, making the
DC more sensitive to changes in CPU utilization. The focus of this research was the over-utilization
threshold. In calculating the average CPU utilization for the DC, shorter intervals are applied
as the average utilization rate increases. Results indicated that power consumption could be
reduced when the monitoring interval is adjusted with respect to the incoming workload. As
indicated, future work should also examine the potential for reduced power consumption as
the average CPU utilization for the DC approaches some under-utilization threshold. This
would improve the CloudSim VM placement algorithm, providing a more accurate
simulation of the server consolidation efforts used by industry.
In addition, this research had a secondary objective – to evaluate the efficacy of
CloudSim as a simulator for power-aware DCs. During the course of reviewing existing code
and writing new modules the specific issues (outlined above) were found to exist in the
CloudSim framework. Discovery and documentation of them in this thesis will undoubtedly
prove both informative and useful for researchers undertaking CloudSim-based simulations in
the future.
Recent reports suggest that Microsoft’s Chicago DC has approximately 140,000
servers installed [53]. With increasingly larger DCs being built to meet growing consumer
demand, it is reasonable to expect that individual DC server counts will reach 250,000 -
500,000 in the foreseeable future. Rather than viewing the entire DC as a single entity from a
monitoring perspective, perhaps the most viable application of dynamic monitoring interval
adjustment would be to sub-divide these larger DCs into more manageable sections,
calculating (and adjusting) the monitoring interval for each section separately – ensuring that
the granularity of analysis most appropriately caters for all possible data center sizes and
configurations. This analysis would also make a valuable contribution to the state-of-the-art.
REFERENCES
[1] http://www.gartner.com/newsroom/id/499090 - last accessed on 19/09/2014
[2] http://www.idc.com - last accessed on 19/09/2014
[3] Koomey J.G., “Estimating Total Power Consumption by Servers in the U.S. and the
World”, 2007
[4] Energy Star Program - U.S. Environmental Protection Agency, “EPA Report to Congress
on Server and Data Center Energy Efficiency”, EPA, Aug 2007.
[5] N. Rasmussen, “Calculating Total Cooling Requirements for Data Centers,” American
Power Conversion, White Paper #25, 2007.
[6] Sharma, R.K.; Bash, C.E.; Patel, C.D.; Friedrich, R.J.; Chase, J.S., "Balance of Power:
Dynamic Thermal Management for Internet Data Centers," Internet Computing, IEEE, vol.9,
no.1, pp. 42-49, January 2005
[7] Moore, J.; Sharma, R.; Shih, R.; Chase, J.; Patel, C.; Ranganathan, P., “Going Beyond
CPUs: The Potential of Temperature-Aware Solutions for the Data Center”, Hewlett Packard
Labs, 2002
[8] Data Center Efficiency Task Force, “Recommendations for Measuring and Reporting
Version 2 – Measuring PUE for Data Centers”, 17th May 2011
[9] C. Belady, A., Rawson, J. Pfleuger, and T., Cader, "Green Grid Data Center Power
Efficiency Metrics: PUE and DCIE," The Green Grid, 2008
[10] Koomey J.G., “Growth in Data Center Electricity Use 2005 to 2010”, report by
Analytics Press, completed at the request of The New York Times, August 2011
[11] The Uptime Institute, “Inaugural Annual Uptime Institute Data Center Industry Survey”,
Uptime Institute, May 2011
[12] ASHRAE, “Datacom Equipment Power Trends and Cooling Applications”, ASHRAE
INC, 2005
[13] ASHRAE, "Environmental Guidelines for Datacom Equipment - Expanding the
Recommended Environmental Envelope", ASHRAE INC, 2008
[14] ASHRAE, “Thermal Guidelines for Data Processing Environments – Expanded Data
Center Classes and Usage Guidance”, ASHRAE INC, August 2011
[15] Boucher, T.D.; Auslander, D.M.; Bash, C.E.; Federspiel, C.C.; Patel, C.D., "Viability of
Dynamic Cooling Control in a Data Center Environment," Thermal and Thermomechanical
Phenomena in Electronic Systems, 2004. ITHERM '04. The Ninth Intersociety Conference
on, pp. 593- 600 Vol. 1, 1-4 June 2004
[16] Sharma, R.K.; Bash, C.E.; Patel, C.D.; Friedrich, R.J.; Chase, J.S.; , "Balance of Power:
Dynamic Thermal Management for Internet Data Centers," Internet Computing, IEEE , vol.9,
no.1, pp. 42- 49, Jan.-Feb. 2005
[17] Shah, A.; Patel, C.; Bash, C.; Sharma, R.; Shih, R.; , "Impact of Rack-level Compaction
on the Data Center Cooling Ensemble," Thermal and Thermomechanical Phenomena in
Electronic Systems, 2008. ITHERM 2008. 11th Intersociety Conference on, pp.1175-1182,
28-31 May 2008
[18] C. Patel, et al., “Energy Flow in the Information Technology Stack: Coefficient of
Performance of the Ensemble and its Impact on the Total Cost of Ownership,” Technical
Report No. HPL-2006-55, Hewlett Packard Laboratories, March 2006
[19] C. Patel, et al., “Energy Flow in the Information Technology Stack: Introducing the
Coefficient of Performance of the Ensemble,” Proc. ASME IMECE, November 2006
[20] Ahuja, N.; Rego, C.; Ahuja, S.; Warner, M.; Docca, A.; "Data Center Efficiency with
Higher Ambient Temperatures and Optimized Cooling Control," Semiconductor Thermal
Measurement and Management Symposium (SEMI-THERM), 2011 27th Annual IEEE,
pp.105-109, 20-24 March 2011
[21] Berktold, M.; Tian, T., “CPU Monitoring With DTS/PECI”, Intel Corporation,
September 2010
[22] M. Stopar, SLA@SOI XLAB, Efficient Distribution of Virtual Machines, March 24,
2011.
[23] C. Hyser, B. McKee, R. Gardner, and B. Watson. Autonomic Virtual Machine
Placement in the Data Center. Technical Report HPL-2007-189, HP Laboratories, Feb. 2008.
[24] A. Qureshi, R. Weber, H. Balakrishnan, J. Guttag, and B. Maggs, “Cutting the Electric
Bill for Internet-Scale Systems,” in Proc. ACM Conference on Data Communication
(SIGCOMM’09), New York, NY, USA, 2009, pp. 123–134
[25] Bolin Hu; Zhou Lei; Yu Lei; Dong Xu; Jiandun Li; , "A Time-Series Based Precopy
Approach for Live Migration of Virtual Machines," Parallel and Distributed Systems
(ICPADS), 2011 IEEE 17th International Conference on , vol., no., pp.947-952, 7-9 Dec.
2011
[26] Carroll, R, Balasubramaniam, S, Botvich, D and Donnelly, W, Application of Genetic
Algorithm to Maximise Clean Energy usage for Data Centers, to appear in proceedings of
Bionetics 2010, Boston, December 2010
[27] Akshat Verma, Puneet Ahuja, Anindya Neogi, “pMapper: Power and Migration Cost
Aware Application Placement in Virtualized Systems”, Middleware 2008: 243-264
[28] P. Riteau, C. Morin, T. Priol, “Shrinker: Efficient Wide-Area Live Virtual Machine
Migration using Distributed Content-Based Addressing,”
http://hal.inria.fr/docs/00/45/47/27/PDF/RR-7198.pdf, 2010
[29] F. Travostino, P. Daspit, L. Gommans, C. Jog, C. de Laat, J. Mambretti, I. Monga, B.
van Oudenaarde, S. Raghunath, and P. Wang. Seamless Live Migration of Virtual Machines
over the MAN/WAN. iGrid, 2006
[30] Hai Jin, Li Deng, Song Wu, Xuanhua Shi, and Xiaodong Pan. Live virtual machine
migration with adaptive memory compression. In Cluster, 2009
[31] Jonghyun Lee, MarianneWinslett, Xiaosong Ma, and Shengke Yu. Enhancing Data
Migration Performance via Parallel Data Compression. In Proceedings of the 16th
International Parallel and Distributed Processing Symposium (IPDPS), pages 47–54, April
2002
[32] M. R. Hines and K. Gopalan, “Post-copy based live virtual machine migration using
adaptive pre-paging and dynamic self-ballooning,” in Proceedings of the ACM/Usenix
international conference on Virtual execution environments (VEE’09), 2009, pp. 51–60
[33] Bose, S.K.; Brock, S.; Skeoch, R.; Rao, S.; , "CloudSpider: Combining Replication with
Scheduling for Optimizing Live Migration of Virtual Machines across Wide Area
Networks," Cluster, Cloud and Grid Computing (CCGrid), 2011 11th IEEE/ACM
International Symposium on , vol., no., pp.13-22, 23-26 May 2011
[34] Cioara, T.; Anghel, I.; Salomie, I.; Copil, G.; Moldovan, D.; Kipp, A.; , "Energy Aware
Dynamic Resource Consolidation Algorithm for Virtualized Service Centers Based on
Reinforcement Learning," Parallel and Distributed Computing (ISPDC), 2011 10th
International Symposium on , vol., no., pp.163-169, 6-8 July 2011
[35] H. Liu, H. Jin, X. Liao, L. Hu, and C. Yu, “Live migration of virtual machine based on
full system trace and replay,” in Proceedings of the 18th International Symposium on High
Performance Distributed Computing (HPDC’09), 2009, pp. 101–110.
[36] http://www.drbd.org - last accessed on 19/09/2014
[37] K. Nagin, D. Hadas, Z. Dubitzky, A. Glikson, I. Loy, B. Rochwerger, and L. Schour,
“Inter-Cloud Mobility of Virtual Machines,” in Proc. of 4th Int’l Conf. on Systems & Storage
(SYSTOR). ACM, 2011, pp. 3:1–3:12.
[38] R. Bradford, E. Kotsovinos, A. Feldmann, and H. Schiöberg. Live Wide-Area Migration
of Virtual Machines including Local Persistent State. In VEE '07: Proceedings of the 3rd
international conference on Virtual execution environments, pages 169-179, New York, NY,
USA, 2007. ACM.
[39] C. Clark, K. Fraser, A. Hand, J. Hansen, E. Jul, C. Limpach, I. Pratt, A. Warfield. Live
Migration of Virtual Machines. in Proceedings of the Symposium on Networked Systems
Design and Implementation, 2005.
[40] VMware vSphere® vMotion®, Architecture, Performance and Best Practices in
VMware vSphere® 5. Performance Study, Technical White Paper, Oct 2011.
[41] Voorsluys W., Broberg J., Venugopal S., Buyya R.: Cost of Virtual Machine Live
Migration in Clouds: a Performance Evaluation. In: Proceedings of the 1st International
Conference on Cloud Computing. Vol. 2009. Springer (2009)
[42] Buyya, R., Ranjan, R., Calheiros, R. N.: Modeling and Simulation of Scalable Cloud
Computing Environments and the CloudSim Toolkit: Challenges and Opportunities. In: High
Performance Computing & Simulation, 2009. HPCS'09. International Conference on, pp. 1-
11. IEEE (2009)
[43] Xu, Y., Sekiya, Y.: Scheme of Resource Optimization using VM Migration for
Federated Cloud. In: Proceedings of the Asia-Pacific Advanced Network, vol. 32, pp. 36-44.
(2011)
[44] Takeda, S., and Toshinori T.: A Rank-Based VM Consolidation Method for Power
Saving in Data Centers. IPSJ Online Transactions, vol. 3 pp. 88-96. J-STAGE (2010)
[45] Xu, L., Chen, W., Wang, Z., Yang, S.: Smart-DRS: A Strategy of Dynamic Resource
Scheduling in Cloud Data Center. In: Cluster Computing Workshops (CLUSTER
WORKSHOPS), IEEE International Conference on, pp. 120-127. IEEE (2012)
[46] Gmach, D., Rolia, J., Cherkasova, L., Kemper, A.: Workload Analysis and Demand
Prediction of Enterprise Data Center Applications. In: Workload Characterization, 2007.
IISWC 2007. IEEE 10th International Symposium on, pp. 171-180. IEEE (2007)
[47] VMware,
http://pubs.vmware.com/vsphere-4-esx-vcenter/index.jsp?topic=/com.vmware.vsphere.bsa.do
c_40/vc_perfcharts_help/c_perfcharts_collection_intervals.html - last accessed on 19/09/2014
[48] Chandra, A., W. Gong, et al. (2003). Dynamic Resource Allocation for Shared Data
Centers Using Online Measurements. Proceedings of the Eleventh International Workshop on
Quality of Service (IWQoS 2003), Berkeley, Monterey, CA, Springer. pp. 381-400.
[49] M. Aron, P. Druschel, and S. Iyer. A Resource Management Framework for Predictable
Quality of Service in Web Servers, 2001.
[50] Carlstrom, J., Rom, R.: Application-Aware Admission Control and Scheduling in Web
Servers. In: Proceedings of IEEE INFOCOM 2002 (June 2002)
[51] Beloglazov, A., Buyya, R.: Energy Efficient Resource Management in
Virtualized Cloud Data Centers. In: Proceedings of the 2010 10th IEEE/ACM International
Conference on Cluster, Cloud and Grid Computing. IEEE Computer Society (2010)
[52] http://huanliu.wordpress.com/2012/03/13/amazon-data-center-size - last accessed on 19/09/2014
[53] http://www.datacenterknowledge.com/archives/2009/06/29/microsoft-to-open-two-massive-data-centers - last accessed on 19/09/2014
Appendices
APPENDIX A
The code which checks if a host is over-utilized. The ‘fallback’ algorithm is used until the sliding window (length = 10) has been filled.
@Override
protected boolean isHostOverUtilized(PowerHost host) {
    PowerHostUtilizationHistory _host = (PowerHostUtilizationHistory) host;
    double[] utilizationHistory = _host.getUtilizationHistory();
    int length = 10; // length of the sliding window
    if (utilizationHistory.length < length) {
        // not enough history yet: defer to the fallback allocation policy
        return getFallbackVmAllocationPolicy().isHostOverUtilized(host);
    }
    double[] utilizationHistoryReversed = new double[length];
    for (int i = 0; i < length; i++) {
        utilizationHistoryReversed[i] = utilizationHistory[length - i - 1];
    }
    double[] estimates = null;
    try {
        estimates = getParameterEstimates(utilizationHistoryReversed);
    } catch (IllegalArgumentException e) {
        return getFallbackVmAllocationPolicy().isHostOverUtilized(host);
    }
    double migrationIntervals =
            Math.ceil(getMaximumVmMigrationTime(_host) / Constants.SCHEDULING_INTERVAL);
    // extrapolate utilization to the end of the longest possible migration
    double predictedUtilization = estimates[0] + estimates[1] * (length + migrationIntervals);
    predictedUtilization *= getSafetyParameter();
    addHistoryEntry(host, predictedUtilization);
    if (predictedUtilization >= 1) {
        Constants.OverUtilizedHostsThisInterval++;
    }
    return predictedUtilization >= 1;
}
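For reference, the sketch below shows how getParameterEstimates(...) may be assumed to behave: an ordinary least-squares fit of the reversed utilization history against time steps 1..n, returning the {intercept, slope} pair that the method above extrapolates. This is an illustrative stand-in, not the CloudSim implementation; the class name LrEstimateSketch and the sample trend are hypothetical.

```java
public class LrEstimateSketch {

    // Ordinary least-squares fit of y over x = 1..y.length.
    // Returns {intercept, slope}, matching the order used in Appendix A.
    static double[] parameterEstimates(double[] y) {
        int n = y.length;
        double sumX = 0, sumY = 0, sumXY = 0, sumXX = 0;
        for (int i = 0; i < n; i++) {
            double x = i + 1;
            sumX += x;
            sumY += y[i];
            sumXY += x * y[i];
            sumXX += x * x;
        }
        double slope = (n * sumXY - sumX * sumY) / (n * sumXX - sumX * sumX);
        double intercept = (sumY - slope * sumX) / n;
        return new double[] { intercept, slope };
    }

    public static void main(String[] args) {
        // A perfectly linear utilization trend: u = 0.5 + 0.05 * t
        double[] history = new double[10];
        for (int t = 0; t < 10; t++) {
            history[t] = 0.5 + 0.05 * (t + 1);
        }
        double[] est = parameterEstimates(history);
        // Predict one scheduling interval past the window, as in Appendix A.
        double predicted = est[0] + est[1] * (10 + 1);
        System.out.printf("intercept=%.2f slope=%.2f predicted=%.2f%n",
                est[0], est[1], predicted);
    }
}
```

With a rising trend like this, the predicted value crossing 1.0 is what flags the host as over-utilized once the safety parameter is applied.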
APPENDIX B
The main() method of the LRMMT algorithm.
public static void main(String[] args) throws IOException {
    boolean enableOutput = true;
    boolean outputToFile = true;
    String inputFolder = "C:\\Users\\scooby\\Desktop\\Eclipse\\CloudSim\\examples\\workload\\planetlab";
    // default workload generated from dynamic averages
    String workload = "default";
    // dynamic workload
    if (!Constants.IsDefault) {
        workload = "dynamic"; // PlanetLab workload
    }
    if (Constants.IntervalGenerator) {
        Constants.SIMULATION_LIMIT = 86400; // limit the simulation to one day (in seconds)
    }
    String outputFolder = "C:\\Users\\scooby\\Desktop\\Eclipse\\Workspace\\output\\";
    String vmAllocationPolicy = "lr"; // Local Regression (LR) VM allocation policy
    String vmSelectionPolicy = "mmt"; // Minimum Migration Time (MMT) VM selection policy
    String parameter = "1.2"; // the safety parameter of the LR policy
    new PlanetLabRunner(enableOutput, outputToFile, inputFolder, outputFolder,
            workload, vmAllocationPolicy, vmSelectionPolicy, parameter);
}