
The Impact of Dynamic Monitoring Interval Adjustment on Power Consumption in Virtualized Data Centers

Mark White BSc., H.Dip

Submitted in accordance with the requirements for the degree of

Masters of Science in Computer Science and Information Technology

Discipline of Information Technology, College of Engineering and Informatics

National University of Ireland, Galway

Research Supervisors: Dr. Hugh Melvin, Dr. Michael Schukat

Research Director: Prof. Gerard Lyons

September 2014

The candidate confirms that the work submitted is his own and that appropriate credit has been given where reference has been made to the work of others

Contents

Chapter 1 Introduction..........................................................................................................................1

1.1 The Hybrid Cloud...................................................................................................................1

1.2 Migration...............................................................................................................................2

1.3 Energy Efficiency....................................................................................................................2

1.4 Cooling...................................................................................................................................4

1.5 Research Objectives...............................................................................................................4

1.5.1 Hypothesis.....................................................................................................................4

1.5.2 CloudSim........................................................................................................................5

1.5.3 Methodology.................................................................................................................6

1.6 Conclusion.............................................................................................................................6

Chapter 2 Literature Review..................................................................................................................8

Introduction.......................................................................................................................................8

2.1 Performance versus Power....................................................................................................8

2.2 Increased Density..................................................................................................................9

2.3 Hardware.............................................................................................................................12

2.3.1 Uninterruptible Power Supply (UPS) & Power Distribution.........................................12

2.3.2 Servers, Storage Devices & Network Equipment.........................................................13

2.3.3 Cooling.........................................................................................................................13

2.3.4 Industry Standards & Guidelines..................................................................................15

2.3.5 Three Seminal Papers..................................................................................................17

2.4 Software..............................................................................................................................24

2.4.1 Virtualization................................................................................................................24

2.4.2 Migration.....................................................................................................................25

2.5 Monitoring Interval..............................................................................................................34

2.5.1 Static Monitoring Interval............................................................................................36

2.5.2 Dynamic Monitoring Interval.......................................................................................37

2.6 Conclusion...........................................................................................................................38

Chapter 3 CloudSim.............................................................................................................................39

Introduction.....................................................................................................................................39

3.1 Overview..............................................................................................................................39

3.2 Workload.............................................................................................................................40

3.3 Capacity...............................................................................................................................42

3.4 Local Regression / Minimum Migration Time (LR / MMT)...................................................44

3.5 Selection Policy – Local Regression (LR)...............................................................................44

3.6 Allocation Policy – Minimum Migration Time (MMT)..........................................................45

3.7 Default LRMMT....................................................................................................................45

3.7.1 init()...........................................................................................................................45

3.7.2 start()...........................................................................................................................46

3.8 Over-utilization....................................................................................................................48

3.9 Migration.............................................................................................................................50

3.10 Reporting.............................................................................................................................52

3.11 Conclusion...........................................................................................................................52

Chapter 4 Implementation..................................................................................................................54

Introduction.....................................................................................................................................54

4.1 Interval Adjustment Algorithm............................................................................................54

4.2 Comparable Workloads.......................................................................................................58

4.3 C# Calculator........................................................................................................................61

4.4 Interval Adjustment Code....................................................................................................64

4.5 Reporting.............................................................................................................................69

4.6 Conclusion...........................................................................................................................70

Chapter 5 Tests, Results & Evaluation.................................................................................................71

Introduction.....................................................................................................................................71

5.1 Tests & Results.....................................................................................................................71

5.2 Evaluation of Test Results....................................................................................................75

5.2.1 Why are the Over-Utilized Hosts / Migration Counts Reduced?..................................76

5.2.2 Result of Reduced Migration Count.............................................................................77

5.2.3 Scalability.....................................................................................................................77

5.3 Evaluation of CloudSim........................................................................................................78

5.3.1 Local Regression Sliding Window.................................................................................78

5.3.2 RAM.............................................................................................................................79

5.3.3 Dynamic RAM Adjustment...........................................................................................79

5.3.4 SLA-based Migration....................................................................................................79

Chapter 6 Conclusions.........................................................................................................................81

REFERENCES........................................................................................................................................83

APPENDIX A.........................................................................................................................................89

APPENDIX B.........................................................................................................................................90

List of Figures

Figure 1 Data Center Service Supply Chain..........................................................................................3

Figure 2 Relative contributions to the thermal output of a typical DC.................................................12

Figure 3 A Typical AHU Direct Expansion (DX) Cooling System.....................................................14

Figure 4 A Typical DC Air Flow System............................................................................................15

Figure 5 Performance of web server during live migration (C. Clark).................................................30

Figure 6 Pre-Copy algorithm...............................................................................................................32

Figure 7 CloudSim Architecture..........................................................................................................41

Figure 8 Flow Chart Depicting the LR / MMT simulation process......................................................48

Figure 9 Application of the Monitoring Interval Based on Weighted Utilization Average..................57

Figure 10 A screenshot of the data generated for calculation of the default workload.........................59

Figure 11 Intervals calculated during the dynamic simulation.............................................................59

Figure 12 Calculation of the Average CPU Utilization for the Default Files.......................................60

Figure 13 How the dynamic interval adjustment code interacts with CloudSim..................................66

Figure 14 Interval Calculation for the Dynamic Simulation................................................................72

Figure 15 VM decommissioning comparison......................................................................................73

Figure 16 Operational Hosts - Default Simulation...............................................................................74

Figure 17 Operational Hosts - Dynamic Simulation............................................................................74

Figure 18 Average CPU Utilization - Dynamic Simulation.................................................................75

Acknowledgements

My supervisor, Dr. Hugh Melvin, who identified at an early stage that (for the most part) I

could be left to my own devices to get on with the work required. His supervisory approach

resulted in the freedom to progress at my own pace knowing he was available as and when I

needed a ‘boost’. When clarity was requested, Hugh demonstrated an enviable ability to

extract the salient issue and point me in the right direction. Although typically performed

cycling up a steep hill on his way home from work, the momentary pauses during review

meetings while he reflected on the issues were often more productive than hours of reading

code. Future students should be so lucky to have him oversee their research endeavours.

My second supervisor, Dr. Michael Schukat, who is capable of clarifying a complicated issue

with a carefully worded question – followed (invariably) with a reassuring smile.

Dr. Ripduman Sohan & Dr. Sherif Akoush in the Computing Laboratory at Cambridge

University without whom I would not have identified the approach taken in this thesis. Over

the course of a few (all too brief) visits with them, I also became aware of the extent of my

intellectual abilities (and limitations!!!).

The principal author of the CloudSim framework, Dr. Anton Beloglazov. Despite his having

moved on from the University of Melbourne where he wrote CloudSim for his doctoral

thesis, his detailed responses to the countless queries I posed during the course of my

research were invaluable and generous to a fault.

My colleagues in the Discipline of IT at NUI Galway for timely coffee-breaks, lunch

invitations and encounters in the corridors – because the breaks are a vital constituent of the

work and the queries as to progress and words of support were more important than you

could possibly have imagined.

My parents, who repeatedly remind me that:

‘You are capable of anything you put your mind to’

Deirdre (Dee) O’ Connor – if convention allowed your name would be on the title page!

Abstract

Virtualization is one of the principal data center technologies increasingly deployed in recent

years to meet the challenges of escalating costs, industry standards and the search for a

competitive edge. This thesis presents a novel approach to management of the virtualized

system which dynamically adjusts the monitoring interval with respect to the average CPU

utilization for the data center. The potential for reduced power consumption, by identifying

performance opportunities at an earlier stage than typical virtualized systems which use a

static interval, is analysed. It is proposed that the adjusted interval will result in analysis of

data center metrics being performed at a more appropriate level of granularity than current

static monitoring systems.

Chapter 1 Introduction

The availability of cloud-based Data Centers (DCs) in recent years has introduced significant

opportunities for enterprises to reduce costs. The initial Capital Expenditure (CapEx)

associated with setting up a DC has been prohibitively high in the past, but this may no

longer be the primary concern. For example, start-ups choosing to implement Infrastructure-

as-a-Service (IaaS) cloud architectures are free to focus on optimizing other aspects of the

business rather than worrying about raising the capital to build (and maintain) fully equipped

DCs. A young enterprise can now pay a relatively small monthly fee to Amazon (EC2) or

Microsoft (Azure), for example, in return for a scalable infrastructure on which to build their

new product or service. Existing companies are also availing of significant savings and

opportunities by moving to the cloud.

1.1 The Hybrid Cloud

In the future the architecture of cloud computing infrastructure will facilitate a business

moving the public portion of their services from one remote DC to another for cost or

efficiency gains. For example, a DC provider in one US state may be charging less for

compute time because energy costs in that state are lower than those in a neighbouring state.

Migration of enterprise services to the less expensive location could be facilitated. To enable

this type of migratory activity, the Distributed Management Task Force (DMTF) has created

the Open Virtualization Format (OVF) specification. The OVF standard “provides an

intermediary format for Virtual Machine (VM) images. It lets an organization create a VM

instance on top of one hypervisor and then export it to the OVF so that it can be run by

another hypervisor” [4]. With the exception of Amazon, all the major cloud providers (Citrix

Systems, IBM, Microsoft, Oracle and VMware) are involved in the development of OVF.

The short and medium term solution to the interoperability issue will certainly be

‘hybrid’ clouds where the enterprise maintains the private portion of their infrastructure on

their local network and the public portion is hosted on a federated cloud, facilitating indirect

(but not direct) movement between providers. In a similar fashion to switching broadband

providers, a software development company may initially choose to lease a Microsoft data

center for their infrastructure but subsequently transfer to Google if the latter’s offering

becomes more suitable for their purposes (e.g. closer proximity to client requests or greater

energy efficiency).

Development of new products may be performed securely on the enterprise Local

Area Network (LAN) and subsequently ‘released’ onto the public cloud for global

distribution. Movement from one provider to another is currently (and for the foreseeable

future will be) performed manually by the enterprise administrator using separate

management interfaces i.e. an Amazon API or a Microsoft API. The vision of the DMTF is a

unified interface known as Cloud Infrastructure Management Interface (CIMI). It is currently

a work-in-progress but ultimately hopes to facilitate direct transfer of data between cloud

providers.

The core technology upon which this data transfer between providers will be

facilitated is virtualization – most specifically, migration of VMs.

1.2 Migration

The practice of regularly migrating services between providers may well become feasible in

the future, providing enterprises with significant opportunities to dynamically reduce the

energy portion of their Operating Expenditure (OpEx) budget. This would also result in

operators becoming more competitive, perhaps, all performance metrics being equal, gaining

their edge from increased energy efficiency efforts. Hybrid cloud environments also facilitate

smaller IT teams, resulting in reduced staffing costs.

1.3 Energy Efficiency

Data centers currently account for close to 3% of all global energy consumed on an annual

basis. It is certain that the industry will continue to expand as increasing volumes of data are

generated, transmitted, stored and analysed. This expansion will require significantly more

energy than is currently used by the sector, energy which must be managed as responsibly as

possible. Energy management, however, is not possible without measurement.

The measurement of a DC’s energy efficiency helps staff and management focus on

the various subsystems of the operation with a view to improving the overall efficiency of the

data center. While advances in hardware and software continue apace, the DC industry has

only recently begun to consider the responsibility of ensuring that the energy it uses is not


wasted. The global economic downturn of 2007 played no small part in motivating DC

operators to review their practices. In an attempt to remain competitive, while constantly

upgrading infrastructure and services to meet the needs of their customers, data center

operators have since identified energy efficiency as a cost opportunity. The moral aspects of

managing energy for the future are all well and good. It appears more likely, however, that

the potential operational savings in the short to medium term have provided the primary

motivation for data center operators to take stock.

In addition to the operational savings achieved when the data center becomes more

energy efficient on a daily basis, additional capital savings may also be realized. All items of

IT equipment have a replacement interval which may be lengthened as a result of redundancies

discovered during an energy efficiency audit. For example, should the existing cooling

volume of the room be found to be in excess of requirements, surplus air handling units

(AHUs) could be switched to standby, not only reducing the power consumed by those units but

also increasing the interval before each unit needs to be repaired or replaced.

The amount of power and cooling that a DC uses on a day-to-day basis determines

how much irreplaceable fossil fuel it consumes and the quantity of carbon emissions for

which it is responsible.

Figure 1 Data Center Service Supply Chain


Within the supply chain of DC services, illustrated in Figure 1, the main emissions occur at

the power generation site. Location is a key factor for the CO2 intensity of the power

consumed by the data center. A gas- or coal-fired utility creates much more CO2

than a hydro- or wind-powered utility. For this reason, many green-field DCs are now being

located near low-cost, environmentally friendly power sources.

1.4 Cooling

Location is also a key factor with respect to cooling. A data center in a cool climate such as

Ireland requires less cooling power than a data center in a warmer climate such as Mexico.

To avail of climate-related opportunities, large-scale DCs have recently been built in

temperate locations such as Dublin (e.g. Google) and Sweden (e.g. Facebook), demonstrating

the significance of the cost reductions possible. This being the case, if migration of DC

services across Wide Area Networks (WANs) becomes cost-feasible in the future, concepts

such as ‘follow the moon’ / ‘follow the sun’ (where the services provided by a DC are

moved, across the network, closer to where they are most needed throughout the day) may

become prevalent. Migration of data center services across both Local Area Networks

(LANs) and WANs is discussed in more detail in Chapter 2.

1.5 Research Objectives

1.5.1 Hypothesis

While the effort to optimize the individual component costs (e.g. downtime) of a migration is

worthwhile, this research aims to investigate further opportunities for energy savings if,

rather than optimizing the individual component costs, a migration is viewed as a single all-

encompassing entity and focus is applied to reducing the total number of migrations taking

place in a DC. Throughout a migration both the source and the destination servers are

running. Quite apart from the extra CPU processing, RAM access and bandwidth required to

achieve a migration, there is an additional energy cost associated with simply keeping both

servers simultaneously powered for the duration of the migration. In addition, if the

destination server was not previously running before the migration was initiated, the time

delay starting it up (as a new host machine) must also be factored into any calculation of

efficiency.
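
To make the argument concrete, the following sketch (written in Java, with purely hypothetical power figures and timings; none of the values are measurements from this research) approximates the energy attributable to a single migration as described above:

// Illustrative sketch only: approximates the energy attributable to one migration
// when both source and destination hosts must remain powered for its duration.
// All figures used in main() are hypothetical placeholders.
public final class MigrationEnergySketch {

    /**
     * @param sourcePowerW     average power draw of the source host during migration (W)
     * @param destPowerW       average power draw of the destination host during migration (W)
     * @param migrationSeconds time taken to complete the migration (s)
     * @param startupEnergyJ   extra energy needed to boot the destination host; 0 if already running (J)
     * @return estimated energy cost of the migration in joules
     */
    static double migrationEnergyJoules(double sourcePowerW, double destPowerW,
                                        double migrationSeconds, double startupEnergyJ) {
        return (sourcePowerW + destPowerW) * migrationSeconds + startupEnergyJ;
    }

    public static void main(String[] args) {
        // e.g. two 250 W hosts, a 90 s migration and a 30 kJ boot cost for the destination
        double joules = migrationEnergyJoules(250, 250, 90, 30_000);
        System.out.printf("Approximate migration energy: %.1f kJ%n", joules / 1000);
    }
}

Reducing the total number of migrations reduces every term of this sum at once, which is the motivation for treating the migration as a single entity rather than optimizing its individual components.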


The principal metric for monitoring the DC workload is CPU utilization, the CPU being one of the

primary resources associated with servicing that workload. In a virtualized environment CPU

utilization is an indication of the processing capacity being used by a host while serving the

requirements of the VMs located on it. In current practice, the CPU utilization value

delivered to monitoring systems is averaged over a constant monitoring interval (e.g. 300

seconds). This interval is typically pre-configured (via a management interface) by the data

center operator, rendering it static. With a relatively small percentage of the host's CPU

concerned with running the virtualization hypervisor, CPU utilization is primarily dependent

on the workload being serviced by the VMs located on the host. This workload typically

varies with time as requests to the servers fluctuate outside the DC. As such, the frequency of

change of the CPU utilization value closely tracks the frequency of change of the incoming

workload.

This thesis investigates the merits of moving from a fixed interval to one which is

dynamically adjusted based on the overall CPU utilization average of the DC. At each

interval a weighted CPU utilization average for the DC is calculated and the next monitoring

interval is adjusted accordingly. By dynamically adjusting the monitoring interval with

respect to the average CPU utilization of the DC, this research analyses the potential for

reduced power consumption through identification of performance opportunities at an earlier

stage than systems which use a static 300 second interval. It is proposed that these

performance opportunities would otherwise have remained hidden mid-interval. Calculated

on the basis of how ‘busy’ the DC currently is, the adjusted interval is more likely to be at an

appropriate level of granularity than its static counterpart.
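
The adjustment algorithm actually implemented is specified in Chapter 4; the fragment below is only a minimal sketch of the idea, with hypothetical bounds and weighting, showing how a weighted DC-wide utilization average might be mapped onto the length of the next monitoring interval:

// Minimal sketch of the idea only: the algorithm and parameters actually used are
// defined in Chapter 4. The bounds and weighting chosen here are hypothetical.
public final class IntervalSketch {

    static final double MIN_INTERVAL_S = 30;   // hypothetical lower bound
    static final double MAX_INTERVAL_S = 300;  // the static default used for comparison
    static final double WEIGHT_NEW = 0.7;      // hypothetical weight on the latest sample

    /** Weighted DC-wide CPU utilization average (values in the range 0 to 1). */
    static double weightedAverage(double previousAverage, double latestUtilization) {
        return WEIGHT_NEW * latestUtilization + (1 - WEIGHT_NEW) * previousAverage;
    }

    /** The busier the DC, the shorter (finer-grained) the next monitoring interval. */
    static double nextIntervalSeconds(double weightedUtilization) {
        double interval = MAX_INTERVAL_S * (1 - weightedUtilization);
        return Math.max(MIN_INTERVAL_S, Math.min(MAX_INTERVAL_S, interval));
    }

    public static void main(String[] args) {
        double average = weightedAverage(0.40, 0.85);  // previous average 40%, latest sample 85%
        System.out.printf("Next interval: %.0f s%n", nextIntervalSeconds(average));
    }
}

A busy DC is therefore observed at a finer granularity than the static 300 second default, while an idle DC is sampled less often.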

1.5.2 CloudSim

A secondary objective of this research was to examine the efficacy of the CloudSim

framework with respect to simulation of power-aware DCs. Given the lack of access for

researchers to ‘real-world’ data center infrastructure, a robust simulator with which to

experiment is of paramount importance. CloudSim is one such framework and is currently

deployed by many researchers in the field of data center energy efficiency worldwide. It is

discussed in detail in Chapter 3.


The online forums relating to the CloudSim framework are very active with researchers

attempting to establish the best way to achieve their objectives. While the documentation for

the code is extensive (and there are a number of basic examples of how the software can be

used included in the CloudSim framework), there is little by way of explanation of the

methodologies used by the original author of the code, thus resulting in each individual

researcher having to spend an inordinate amount of time investigating the capabilities (and

limitations) of the framework. This can only be achieved by reviewing many thousands of

lines of code and testing to establish the functionality of each module and method.

Through the course of this research, a number of CloudSim issues were identified

which, it is hoped, will prove useful to future researchers. They are discussed chronologically

(at the point in development when they were identified) and relate to both the framework

code and the accuracy of virtual machine and migration simulation. They are also

summarized in Chapter 5.

1.5.3 Methodology

This thesis uses the CloudSim framework (described in more detail in Chapter 3) as the base

simulator for implementation and testing of the hypothesis. A considerable review of the

existing CloudSim code was required to establish the capabilities of the framework and also

to identify what additional code modules would be needed to meet the thesis objectives.

Ultimately it was found that no facility existed in CloudSim to test the hypothesis and thus a

number of extensions to the existing code were designed and developed. These were then

integrated with the framework such that the default CloudSim simulation could be reliably

compared with the dynamic extension created for this research i.e. implementing dynamic

interval adjustment.

1.6 Conclusion

The remainder of this thesis is structured as follows. The literature review in Chapter 2

describes current (and future) efforts to improve energy efficiency in the data center industry.

Both hardware and software approaches are discussed, with a focus on virtualized systems,

installed as standard in all green-field DCs and retro-fitted to the majority of existing brown-

field sites. Chapter 3 details the specific modules in the CloudSim framework required to

build the test bed for analysis of the hypothesis. An explanation as to how these modules


interact with each other is also provided. Chapter 4 specifies the new Java methods written to

create and test the hypothesis. Integration of the new code with the existing framework is also

described. Chapter 5 discusses the tests performed to evaluate the hypothesis and analyses the

results in the context of current energy efficiency efforts in the data center industry. Chapter 6

concludes this thesis with a summary of the limitations identified in the CloudSim

framework, in the hope that the work of future researchers can more effectively benefit from,

and build upon, its code-base.

Chapter 2 Literature Review

Introduction

By adjusting the DC monitoring interval with respect to the incoming workload, this thesis

investigates opportunities for more energy efficient management of data center resources.

Given this objective, extensive examination of the evolution of DC resource management

methods over the last few years was required in an effort to identify an approach which had

not been previously applied.

This thesis is primarily concerned with the energy efficiency of DCs when migrating

VMs across the LAN and WAN. To contextualize the research more completely, the

following literature review extends the introductory discussion in Chapter 1 to encompass the

entire data center infrastructure, analyzing current (and previous) efforts by operators and

academic researchers to reduce power consumption from, not only a software, but also a

hardware perspective. The chapter closes with an in-depth review of existing monitoring

interval approaches and technologies.

2.1 Performance versus Power

Most of the advances achieved by both the DC industry and academic researchers before

2006 paid particular attention to the performance of the infrastructure, with the principal

focus of operator efforts set firmly on keeping the DC available to clients 24/7. In fact, when

advertising and selling the services they offer, operators still choose to feature their ‘uptime’

percentage as their primary Unique Selling Point (USP). The ‘5 nines’ (i.e. 99.999% uptime),

denoting High Availability (HA), are seldom omitted from a typical DC operator’s marketing

material. However, the increase in power consumption required to boost performance seldom

received more attention than summary recognition as an additional expense. The power /

performance trade-off is undoubtedly a difficult hurdle to overcome, especially while cost

competitiveness is uppermost in the minds of data center operators. Invariably, before 2006,

most commercial development efforts to improve the operation of DCs were focussed solely

on performance.


In more recent years, increased consumer demand for faster traffic and larger, more flexible,

storage solutions has changed how the industry views the resources required to operate

competitively. More equipment (e.g. servers, routers) has been required to meet demand but

the space required to accommodate this equipment has already been allocated to existing

equipment. The strategy adopted, since 2006, by a DC industry looking to the future, was to

increase the density of IT equipment rather than the more expensive option of purchasing (or

renting) additional square footage. The solution combined new server technologies and

virtualization.

2.2 Increased Density

An analogy: increasing infrastructural density in a data center is similar to adding more

bedrooms to a house without extending the property. The house can now accommodate

private spaces for more people but each person has less space than before. In the data center

there are now more servers per square foot, resulting in more compute / storage capability.

Despite the space-saving advantages of VM technology and techniques (i.e. migration),

which reduced the number of servers required to host applications, the primary disadvantage

of increased density was that each new blade server required significantly more power than

its predecessor. A standard rack with 65-70 blades operating at high loads might require 20 -

30kW of power compared with previous rack consumptions of 2 - 5kW. This additional

power generates additional heat. In a similar manner to maintaining comfortable levels of

heat and humidity for people in a house, heat in the rack, and resultant heat in the server

room, must be removed to maintain the equipment at a safe operating temperature and

humidity. In summary, the introduction of increased server room density, from 2006 onwards,

resulted in increased power and cooling requirements for modern DCs.

At their 25th Annual Data Center Conference held in Las Vegas in late November

2006, Gartner analysts hypothesized that:

“…by 2008, 50% of current data centers will have insufficient

power and cooling capacity to meet the demands of high-density

equipment…” [1]


During his address to the conference, Gartner Research Vice President Michael Bell

suggested that: “Although power and cooling challenges will not be a perpetual problem, it is

important for DC managers to focus on the electrical and cooling issue in the near term, and

adopt best practice to mitigate the problem before it results in equipment failure, downtime

and high remediation costs”. This was one of the first ‘shots across the bow’ for a data center

industry which, until then, had been solely focussed on improving performance (e.g. uptime,

response time) with little regard for escalating energy costs.

Based on data provided by IDC [2], Jonathan Koomey published a report [3] in

February 2007 estimating the electricity used by all DCs in both the US and globally for

2005. The executive summary states that:

“The total power demand in 2005 (including associated

infrastructure) is equivalent (in capacity terms) to about five 1000

MW power plants for the U.S. and 14 such plants for the world. The

total electricity bill for operating those servers and associated

infrastructure in 2005 was about $2.7 billion and $7.2 billion for the

U.S. and the world, respectively.”

A few months later the global economic downturn brought with it increasingly restrictive

operating budgets and higher energy prices. The competitive edge was becoming harder to

identify. Quite apart from the economic factors affecting the industry, the timely publication

by the EPA of its report to the US Congress [4] in August 2007 highlighted significant

opportunities to reduce both capital and operating costs by optimizing the power and cooling

infrastructure involved in data center operations. Industry analysts were once again

identifying an escalating power consumption trend which required immediate attention.

The report assessed the principal opportunities for energy efficiency improvements in

US DCs. The process of preparing the report brought all the major industry players together.

In an effort to identify a range of energy efficiency opportunities, 3 main improvement

scenarios were formulated:

1. Improved Operation: maximizes the efficiency of the existing data center

infrastructure by utilizing improvements such as ‘free cooling’ and raising


temperature / humidity set-points. Minimal capital cost (‘the low hanging fruit’) is

incurred by the operator

2. Best Practice: adopt practices and technologies used in the most energy-efficient

facilities

3. State-of-the-art: uses all available energy efficiency practices and technologies

The potential energy savings and associated capital cost calculated for each of the 3 scenarios

respectively were:

1. Improved Operation: 20% saving - least expensive

2. Best Practice: 45% saving

3. State-of-the-art: 55% saving - most expensive

Notably, a proviso was also offered by the report in that: “…due to local constraints, the best

strategy for a particular data center could only be ascertained by means of a site-specific

review - not all suggested scenarios apply to all data centers.” Regardless of which (if any)

subsequent strategy was adopted by a particular data center operator, performance of a site-

specific review invariably served the purpose of demonstrating that reduction of power

consumption was indeed a viable opportunity not only to significantly reduce both capital

and operating costs but also to regain a competitive edge.

The economic downturn, the Gartner conference and the reports by both the EPA and

Koomey were a significant part of the catalyst for the energy approach beginning to receive a

level of attention closer, if not equal, to that of the performance approach in previous years.

Efficient management of power and cooling, while maintaining performance levels, became

the order of the day.

At the highest level, DC infrastructure can be subdivided into hardware and software.

While it is true that both are inextricably linked to the energy performance of the DC, it is

useful for the purposes of this review to examine them separately.


2.3 Hardware

Rasmussen [5] identified power distribution, conversion losses and cooling as representing

between 30 – 45% of the electricity bill in larger DCs. Cooling alone accounted for 30% of

this total.

Figure 2 Relative contributions to the thermal output of a typical DC

2.3.1 Uninterruptible Power Supply (UPS) & Power Distribution

The power being provided to the IT equipment in the racks is typically routed through an

Uninterruptible Power Supply (UPS) which feeds Power Distribution Units (PDUs) located

in or near the rack. Through use of better components, circuit design and right-sizing

strategies, manufacturers such as American Power Conversion (APC) and Liebert have

turned their attention to maximizing efficiency across the full load spectrum, without


sacrificing redundancy. Some opportunities may exist in efforts to re-balance the load across

the 3 phases supplying the power to the racks but efficiencies in the power supply &

distribution system are outside the scope of this research.

2.3.2 Servers, Storage Devices & Network Equipment

Manufacturers such as IBM and Intel are designing increasingly efficient server blades with

features such as chip-level thermal strategies (Dynamic Voltage & Frequency Scaling

(DVFS)), multicore processors and power management leading the way. Enterprise operators

such as Google and Facebook have recently designed and installed their own servers which

have demonstrated increased efficiencies but these servers are specifically ‘fit-for-purpose’.

They may not be sufficiently generic to be applicable to a majority of DC configurations.

2.3.3 Cooling

There are a variety of standard systems for cooling in data centers but all typically involve

Air Handling Units (AHUs) or Computer Room Air Handlers (CRAHs). Well-designed DCs

have aligned their racks in an alternating hot aisle / cold aisle configuration with cold air from

the AHU(s) entering the cold aisle through perforated or grated tiles above a sub-floor

plenum. Hot air is exhausted from the rear of the racks and removed from the room back to

the same AHU(s) forming a closed-loop system. The hot air is passed directly over an

evaporator (Figure 3: 4) in the AHU which contains a liquid refrigerant (e.g. ethylene glycol /

water solution). The amount of heat absorbed is determined by the speed of the air crossing

the coil and / or the flow rate of the refrigerant through the coil. The flow rate is controlled by

tandem scroll compressors (Fig 3: 1). A dead-band setting is applied to each AHU and is

divided equally between all the compressors in the system. As each dead-band range above

the set-point is reached a compressor will engage to increase the flow rate. As the

temperature returns (down through the dead-band increments) toward the set-point, the

compressors disengage – reducing the flow through the evaporator until the set-point is

reached again. The heat absorbed through the coil is fed to an array of condensers outside the

DC where it evaporates into the atmosphere as exhaust or is reused in some other part of the

facility. The set point of the AHU is configured on installation of the unit and must (if

deemed appropriate) be changed manually by a member of staff following analysis and

review. Unfortunately these reviews happen all too seldom in typical DCs, despite the

inevitable changes taking place in the server room workload on a daily basis.
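
As a rough illustration of the staging behaviour described above (the set-point, dead-band width and compressor count below are hypothetical values, not figures for any particular AHU), the number of engaged compressors could be determined along the following lines:

// Rough illustration of dead-band compressor staging as described above.
// Set-point, dead-band width and compressor count are hypothetical values.
public final class DeadBandSketch {

    /**
     * @param returnTempC air temperature returning to the AHU (degrees Celsius)
     * @param setPointC   configured set-point (degrees Celsius)
     * @param deadBandC   total dead-band above the set-point (degrees Celsius)
     * @param compressors number of tandem scroll compressors available
     * @return how many compressors should currently be engaged
     */
    static int engagedCompressors(double returnTempC, double setPointC,
                                  double deadBandC, int compressors) {
        if (returnTempC <= setPointC) {
            return 0;                               // at or below set-point: none engaged
        }
        double increment = deadBandC / compressors; // dead-band split equally per compressor
        int stages = (int) Math.ceil((returnTempC - setPointC) / increment);
        return Math.min(stages, compressors);       // cannot engage more than are installed
    }

    public static void main(String[] args) {
        // e.g. 24 C return air, 22 C set-point, 4 C dead-band, 4 compressors -> 2 engaged
        System.out.println(engagedCompressors(24.0, 22.0, 4.0, 4));
    }
}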


Figure 3 A Typical AHU Direct Expansion (DX) Cooling System

Depending on the configuration, the heat removal system can consume as much as 50% of

a typical DC’s energy. Industry is currently embracing a number of opportunities involving

temperature and airflow analysis:

1. aisle containment strategies

2. increasing the temperature rise (ΔT) across the rack

3. raising the operating temperature of the AHU(s)

4. repositioning AHU temperature and humidity sensors

5. thermal management by balancing the IT load layout [6, 7]

6. ‘free cooling’ – eliminating the high-consumption chiller from the system through

the use of strategies such as air- and water-side economizers


Figure 4 A Typical DC Air Flow System

In addition to temperature maintenance, the AHUs also vary the humidity of the air entering

the server room according to set-points. Low humidity (dry air) may cause static which has

the potential to short electronic circuits. High levels of moisture in the air may lead to faster

component degradation. Although less of a concern as a result of field experience and recent

studies performed by Intel and others, humidity ranges have been defined for the industry and

should be observed to maximize the lifetime of the IT equipment. Maintaining humidity

ranges increases the interval between equipment replacement schedules and, as a

result, has a net positive outcome on capital expenditure budgets.

2.3.4 Industry Standards & Guidelines

2.3.4.1 Standards

Power Usage Effectiveness (PUE2) [8] is now the de facto standard used to measure a DC’s

efficiency. It is defined as the ratio of all electricity used by the DC to the electricity used just

by the IT equipment. In contrast to the original PUE [9] rated in kilowatts of power (kW),

PUE2 must be based on the highest measured kilowatt hour (kWh) reading taken during

analysis. In 3 of the 4 PUE2 categories now defined, the readings must span a 12 month

period, eliminating the effect of seasonal fluctuations in ambient temperatures:

PUE = Total Data Centre Electricity (kWh) / IT Equipment Electricity (kWh)
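
Purely for illustration (the kWh figures below are invented and do not describe any surveyed facility), the ratio is computed directly from two annual readings:

// Illustration only: computes PUE from annual kWh readings. The figures used in
// main() are invented and do not describe any surveyed facility.
public final class PueSketch {

    static double pue(double totalFacilityKWh, double itEquipmentKWh) {
        return totalFacilityKWh / itEquipmentKWh;
    }

    public static void main(String[] args) {
        // e.g. 1,600,000 kWh total against 800,000 kWh of IT load gives a PUE of 2.0
        System.out.printf("PUE = %.2f%n", pue(1_600_000, 800_000));
    }
}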


A PUE of 2.0 suggests that for each kWh of IT electricity used another kWh is used by the

infrastructure to supply and support it. The most recent PUE averages [10] for the industry

fall within the range of 1.83 – 1.92 with worst performers coming in at 3.6 and a few top

performers publishing results below 1.1 in recent months. Theoretically, the best possible

PUE is 1.0 but a web-hosting company (Pair Networks) recently quoted a PUE of 0.98 for

one of its DCs in Las Vegas, Nevada. Their calculation was based on receipt of PUE ‘credit’

for contributing unused power (generated on-site) back to the grid. Whether additional PUE

‘credit’ should be allowed for contributing to the electricity grid is debatable. If this were the

case, with sufficient on-site generation, PUE could potentially reach 0.0 and cease to have

meaning. Most DCs are now evaluating their own PUE ratio to identify possible

improvements in their power usage. Lower PUE ratios have become a very marketable aspect

of the data center business and have been recognized as such. Other standards and metrics

(2.3.4.2.1 – 2.3.4.2.4) have been designed for the industry but, due for the most part to the

complex processes required to calculate them, have not as yet experienced the same wide-

spread popularity as PUE and PUE2.

2.3.4.2 Other Standards

2.3.4.2.1 Water Usage Effectiveness (WUE) measures DC water usage to provide an

assessment of the water used on-site for operation of the data center. This includes water used

for humidification and water evaporated on-site for energy production or cooling of the DC

and its support system.

2.3.4.2.2 Carbon Usage Effectiveness (CUE) measures DC-level carbon emissions.

CUE does not cover the emissions associated with the lifecycle of the equipment in the DC or

the building itself.

2.3.4.2.3 The Data Center Productivity (DCP) framework is a collection of metrics

which measure the consumption of a DC-related resource in terms of DC output. DCP looks

to define what a data center accomplishes relative to what it consumes.

2.3.4.2.4 Data Center Compute Efficiency (DCCE) enables data center operators to

determine the efficiency of compute resources. The metric makes it easier for data center

operators to discover unused servers (both physical and virtual) and decommission or

redeploy them.


Surprisingly, efforts to improve efficiency have not been implemented to the extent one

would expect. 73% of respondents to a recent Uptime Institute survey [11] stated that

someone outside of the data center (the real estate / facilities department) was responsible for

paying the utility bill. 8% of data center managers weren’t even aware who paid the bill. The

lack of accountability is obvious and problematic. If managers are primarily concerned with

maintaining the DC on a daily basis there is an inevitable lack of incentive to implement even

the most basic energy efficiency strategy in the short to medium term. It is clear that a

paradigm shift is required to advance the cause of energy efficiency monitoring at the ‘C-

level’ (CEO, CFO, CIO) of data center operations.

2.3.4.3 Guidelines

Data center guidelines are intermittently published by The American Society of Heating,

Refrigerating and Air-Conditioning Engineers (ASHRAE). These guidelines [12, 13] suggest

‘allowable’ and ‘recommended’ temperature and humidity ranges within which it is safe to

operate IT equipment. The most recent edition of the guidelines [14] suggests operating

temperatures of 18 – 27°C. The maximum for humidity is 60% RH.

One of the more interesting objectives of the recent guidelines is to have the rack inlet

recognized as the position from where the temperature and humidity should be measured. The

majority of DCs currently measure at the return inlet to the AHU, despite more relevant

temperature and humidity metrics being present at the inlet to the racks.

2.3.5 Three Seminal Papers

In the context of improving the hardware infrastructure of the DC post-2006, three academic

papers were found to be repeatedly referenced as forming a basis for the work of the most

prominent researchers in the field. They each undertake a similar methodology when

identifying solutions and are considered to have led the way for a significant number of

subsequent research efforts. The methodologies which are common to each of the papers (and

relevant to this thesis) include:

1. Identification of a power consumption opportunity within the DC and adoption of

a software-based solution


2. Demonstration of the absolute requirement for monitoring the DC environment as

accurately as possible without overloading the system with additional processing

Summary review of the three papers follows.

2.3.5.1 Paper 1: Viability of Dynamic Cooling Control in a Data Center Environment (2006)

In the context of dynamically controlling the cooling system Boucher et al. [15] focused their

efforts on 3 requirements:

1. A distributed sensor network to indicate the local conditions of the data center.

Solution: a network of temperature sensors was installed at rack inlets, rack outlets and tile inlets.

2. The ability to vary cooling resources locally. Solution: 4 actuation points, which exist

in a typical data center, were identified as having further potential in maintaining

optimal server room conditions:

2.1 CRAC supply temperature – this is the temperature of the conditioned air

entering the room. CRACs are typically operated on the basis of a single

temperature sensor at the return side of the unit. This sensor is responsible for

taking an average of the air temperature returning from the room. The CRAC then

correlates this reading with a set-point which is configured manually by data

center staff. The result of the correlation is the basis upon which the CRAC

decides by how much the temperature of the air sent back out into the room should

be adjusted. Variation is achieved in a Direct Expansion (DX) system with

variable capacity compressors varying the flow of refrigerant across the cooling

coil. In a water-cooled system chilled water supply valves modulate the

temperature.

2.2 The crucial element in the operational equation of the CRAC, regardless of the

system deployed, is the set-point. The set-point is manually set by data center staff

and generally requires considerable analysis of the DC environment before any

adjustment is made. Typically, the set-point is configured (when the CRAC is


initially installed) according to some prediction of the future cooling demand. Due

to a number of factors (including the cost of consultancy) it is all too common that

no regular analysis of the room’s thermal dynamics is performed (if at all). This is

despite installation of additional IT equipment (and increased workload on the

existing infrastructure) throughout the lifecycle of the data center. Clearly a very

static situation exists in this case.

2.3 CRAC fan speed – the speed at which the fans in the CRAC blow the air into

the room (via a sub-floor plenum). In 2006 (at the time of this paper), typical

CRACs had fans running at a set speed and without further analysis no

reconfiguration took place after installation. Most CRACs since then have been

designed with Variable Speed Drives (VSDs) - which can vary the speed of the

fan according to some set of rules. However, with no dynamic thermal analysis of

the DC environment taking place on a regular basis, the VSD rules are effectively

hardwired into the system. The VSDs are an unused feature of the CRAC as a

result.

2.4 Floor tile openings – the openings of the floor tiles in the cold aisle. The

velocity at which the cold air leaving the CRAC enters the room is dependent

upon a number of factors. Assuming it has passed through the sub-floor plenum

with minimal pressure loss, the air will rise into the room at some velocity (via the

floor tile openings). Floor tiles are either perforated or grated. Perforated tiles

typically have 25% of their surface area open whereas grated tiles may have 40 –

60% of their surface open. The more open surface area available on the tile the

higher the velocity with which the air will enter the room. The authors had

previously designed and implemented a new tile - featuring an electronically

controlled sliding damper mechanism which could vary the size of the opening

according to requirements.

So it is evident that as a typical DC matures and the thermodynamics of the

environment change with higher CPU loads and additional IT equipment, the

cooling system should have a dynamic control system to configure it for

continuous maximum efficiency. Boucher et al. propose that this control system

should be based on the 4 available actuation points above.


3. The knowledge of each variable’s effect on the DC environment. Solution: the paper

focused on how each of the actuator variables (2.1 – 2.4 above) can

affect the thermal dynamic of the data center.

Included in the findings of the study were:

CRAC supply temperatures have an approximate linear relationship with rack inlet

temperatures. An anomaly was identified where the magnitude of the rack inlet

response to a change in CRAC supply temperature was not of the same order. Further

study was suggested.

Under-provisioned flow provided by the CRAC fans affects the Supply Heat Index

(SHI*) but overprovisioning has a negligible effect. SHI is a non-dimensional

measure of the local magnitude of hot and cold air mixing. Slower air flow rates cause

an increase in SHI (more mixing) whereas faster air flow rates have little or no effect.

*SHI is also referred to as Heat Density Factor (HDF). The metric is based on the

principle of a thermal multiplier which was formulated by Sharma et al. [16]

The study concluded that significant energy savings (in the order of 70% in this case) were

possible where a dynamic cooling control system, controlled by software, was appropriately

deployed.

2.3.5.2 Paper 2: Impact of Rack-level Compaction on the Data Center Cooling Ensemble (2008)

Shah et al. [17] deal with the impact on the data center cooling ensemble when the density of

compute power is increased. The cooling ‘ensemble’ is considered to be all elements of the

cooling system from the chip to the cooling tower.

Increasing density involves replacing low-density racks with high-density blade

servers and has been the chosen alternative to purchasing (or renting) additional space for

most DCs in recent years. New enterprise and co-location data centers also implement the

strategy to maximize the available space. Densification leads to increased power dissipation

and corresponding heat flux within the DC environment.


A typical cooling system performs two types of work:

1. Thermodynamic – removes the heat dissipated by the IT equipment

2. Airflow – moves the air through the data center and related systems

The metric chosen by Shah et al. for evaluation in this case is the ‘grand’ Coefficient of

Performance (COPG) which is a development of the original COP metric suggested by Patel

et al. [18, 19]. It measures the amount of heat removed by the cooling infrastructure per unit

of power input and does so at a more granular level than the traditional COP used in

thermodynamics, specifying heat removal at the chip, system, rack, room and facility levels.
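
The exact formulation of COPG is that of Patel et al. [18, 19]; the fragment below is only an interpretation of the definition quoted above, aggregating hypothetical power figures for each stage of the ensemble:

// Interpretation only of the definition above (heat removed per unit of cooling
// power input, aggregated across the ensemble); the exact formulation is that of
// Patel et al. [18, 19]. The stage values used in main() are hypothetical.
public final class CopGSketch {

    /**
     * @param heatRemovedKW total heat extracted from the IT equipment (kW)
     * @param stagePowerKW  power drawn by each stage of the cooling ensemble,
     *                      e.g. server fans, CRAC blowers, compressors, cooling tower (kW)
     * @return the 'grand' coefficient of performance
     */
    static double grandCop(double heatRemovedKW, double... stagePowerKW) {
        double totalCoolingPowerKW = 0;
        for (double p : stagePowerKW) {
            totalCoolingPowerKW += p;
        }
        return heatRemovedKW / totalCoolingPowerKW;
    }

    public static void main(String[] args) {
        // e.g. 4700 kW of heat removed by stages drawing 300 kW, 900 kW and 400 kW
        System.out.printf("COP_G = %.2f%n", grandCop(4700, 300, 900, 400));
    }
}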

In order to calculate the COPG of the model used for the test case each component of

the cooling system needed to be evaluated separately, before applying each result to the

overall system. Difficulties arose where system-level data was either simply unavailable or,

due to high heterogeneity, impossible to infer. However, the model was generic enough that it

could be applied to the variety of cooling systems currently being used by ‘real world’ DCs.

Note: in a similar vein, the research for this thesis examines the CPU utilization of

each individual server in the data center such that an overall DC utilization metric at each

interval can be calculated. Servers which are powered-off at the time of monitoring have no

effect on the result and are excluded from the calculation.
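
A minimal sketch of that calculation is given below; it is not the CloudSim implementation (which is described in Chapters 3 and 4) and the host values used are illustrative only:

// Minimal sketch of the DC-level utilization calculation described in the note above.
// This is not the CloudSim implementation; the host values are illustrative only.
import java.util.List;

public final class DcUtilizationSketch {

    record Host(boolean poweredOn, double cpuUtilization) { }  // utilization in [0, 1]

    /** Average CPU utilization across powered-on hosts only; 0 if none are running. */
    static double dataCenterUtilization(List<Host> hosts) {
        double sum = 0;
        int running = 0;
        for (Host h : hosts) {
            if (!h.poweredOn()) {
                continue;              // powered-off hosts have no effect on the result
            }
            sum += h.cpuUtilization();
            running++;
        }
        return running == 0 ? 0 : sum / running;
    }

    public static void main(String[] args) {
        List<Host> hosts = List.of(new Host(true, 0.80),
                                   new Host(true, 0.40),
                                   new Host(false, 0.00));
        System.out.printf("DC utilization: %.2f%n", dataCenterUtilization(hosts));
    }
}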

The assumption that increased density leads to less efficiency in the cooling system is

not necessarily correct. If elements of the cooling system were previously running at low loads they would

typically have been operating at sub-optimal efficiency levels. Increasing the load on a

cooling system may in fact increase its overall efficiency through improved operational

efficiencies in one or more of its subsystems.

For Shah’s research, 94 existing low-density racks were replaced with high-density Hewlett-Packard (HP)

blades. The heat load increased from 1.9MW to 4.7MW. The new heat load was

still within the acceptable range for the existing cooling infrastructure. No modifications to

the ensemble were required.

Upon analysis of the results, COPG was found to have increased by 15%. This was, in part,

achieved with improved efficiencies in the compressor system of the CRACs. While it is

acknowledged that there is a crossover point at which compressors become less efficient, the

increase in heat flux of the test model resulted in raising the work of the compressor to a


point somewhere below this crossover. The improvement in compressor efficiency was

attributed to the higher density HP blade servers operating at a higher ΔT (reduced flow rates)

across the rack. The burden on the cooling ensemble was reduced - resulting in a higher

COPG.

With the largest individual source of DC power consumption (about 40% in this case)

typically coming from the CRAC - which contains the compressor - it makes sense to direct

an intelligent analysis of potential operational efficiencies at that particular part of the system.

The paper states that: “The continuously changing nature of the heat load distribution

in the room makes optimization of the layout challenging; therefore, to compensate for

recirculation effects, the CRAC units may be required to operate at higher speeds and lower

supply temperature than necessary. Utilization of a dynamically coupled thermal solution,

which modulates the CRAC operating points based on sensed heat load, can help reduce this

load”.

In this paper Shah et al. present a model for evaluating the cooling ensemble using COPG, filling a knowledge gap through detailed experimentation with measurements across the entire system. They conclude that energy efficiencies are possible

via increased COP in one or more of the cooling infrastructure components. Where thermal

management strategies capable of handling increased density are in place, there is significant

motivation to increase density without any adverse impact on energy efficiency.

2.3.5.3 Paper 3: Data Center Efficiency with Higher Ambient Temperatures and Optimized Cooling Control (2011)

Ahuja et al. [20] introduce the concept of ‘deviation from design intent’. When a data center

is first outfitted with a cooling system, best estimates are calculated for future use. The

intended use of the DC in the future is almost impossible to predict at this stage. As the

lifecycle of the DC matures, the IT equipment will deviate from the best estimates upon

which the cooling system was originally designed to operate. Without ongoing analysis of the DC’s thermal dynamics, the cooling system may become progressively less ‘fit-for-purpose’.

As a possible solution to this deviation from intent, this paper proposes that cooling of

the DC environment should be controlled from the chip rather than a set of remote sensors in


the room or on the rack doors. Each new IT component would have chip-based sensing

already installed and therefore facilitate a “plug ‘n’ play” cooling system.

Intel processors on the market since the Intel® Pentium® M feature an ‘on-die’ Digital Thermal Sensor (DTS). DTS provides the temperature of the processor and

makes the result available for reading via Model Specific Registers (MSRs). The Intel white

paper [21] which describes DTS states that:

“… applications that are more concerned about power consumption

can use thermal information to implement intelligent power

management schemes to reduce consumption.”

While Intel is referring to power management of the server itself, DTS could

theoretically be extended to the cooling management system also.

Current DCs control the air temperature and flow rate from the chip to the chassis but

there is a lack of integration once the air has left the chassis. If the purpose of the data center

is to house, power and cool every chip then it has the same goal as the chassis and the chassis

is already taking its control data from the chip. This strategy needs to be extended to the

wider server room environment in an integrated manner.

The industry has recently been experimenting with positioning the cooling sensors at

the front of the rack rather than at the return inlet of the AHU. The motivation for this is to

sense the air temperature which matters most – the air which the IT equipment uses for

cooling. The disadvantage of these remote sensors (despite being better placed than sensors at

the AHU return inlet) is that they are statically positioned, a position which may later be

incorrect should changes in the thermal dynamics of the environment occur. The closer to the server one senses, the more reliable the sensed data will be for thermal control purposes.

Ahuja et al. propose that the logical conclusion is to move the sensors even closer to the

server – in fact, right into the processor. If those sensors already exist (as is the case with the

Intel processors) then use should be made of them for a more accurate cooling management

system.


The paper investigates the possible gains by moving the temperature sensors (and

changing the set-point accordingly) to a variety of positions in the DC:

1. AHU return – 28°C

2. AHU supply – 18°C

3. Rack inlet – 23°C

4. Server – 30°C

The first test was carried out on a single isolated rack with those results then

extrapolated to a DC model with a cooling capacity of 100kW. Four perimeter down-flow AHUs (N + 1 redundancy) performed the heat removal. While the 4 rows in the DC were not

contained they did follow the standard hot / cold aisle arrangement. The tests showed that use

of the server sensors resulted in more servers being maintained within the ASHRAE

guideline temperature range of 18 – 27°C. Controlling the cooling system at the server

yielded maximum benefit.

Ahuja et al. concluded that a processor-based set of metrics capable of controlling a

power management scheme on the server should, by extension, also be capable of controlling

a dynamic cooling control system outside the rack. If every server in a DC were intermittently reporting its operating temperature (and air flow) to a cooling control system, that system would be operating on a more robust data set (i.e. more accurate readings), delivering greater energy efficiency savings than possible with previous DC configurations.

2.4 Software

2.4.1 Virtualization

In a virtualized data center, multiple Virtual Machines (VMs) are typically co-located on a

single physical server, sharing the processing capacity of the server's CPU between them.

When, for example, increased demands on the CPU result in reduced performance of one of

the VMs to the point where a Service Level Agreement (SLA) may be violated, virtualization

technology facilitates a migration. Migration relocates the services being provided by the VM

on this 'over-utilized' host to a similar VM on another physical server, where sufficient

capacity (e.g. CPU) is available to maintain SLA performance.


Conversely, reduced demand on the CPU of a host introduces opportunities for server

consolidation, the objective of which is to minimize the number of operational servers

consuming power. The remaining VMs on an 'under-utilized' host are migrated so that the

host can be switched off, saving power. Server consolidation provides significant energy

efficiency opportunities.

There are numerous resource allocation schemes for managing VMs in a data center,

all of which involve the migration of a VM from one host to another to achieve one, or a

combination of, objectives. Primarily these objectives will involve either increased

performance or reduced energy consumption - the former, until recently, receiving more of

the operator’s time and effort than the latter.

In particular, SLA@SOI has completed extensive research in recent years in the area

of SLA-focused (e.g. CPU, memory, location, isolation, hardware redundancy level) VM

allocation and re-provisioning [22]. The underlying concept is that VMs are assigned to the

most appropriate hosts in the DC according to both service level and power consumption

objectives. Interestingly, Hyser et al. [23] suggest that a provisioning scheme which also

includes energy constraints may choose to violate user-based SLAs ‘if the financial penalty

for doing so was [sic] less than the cost of the power required to meet the agreement’. In a

cost-driven DC it is clear that some trade-off (between meeting energy objectives and

compliance with strict user-based SLAs e.g. application response times) is required. A similar

power / performance trade-off may be required to maximize the energy efficiency of a host-

level migration.

2.4.2 Migration

The principal underlying technology which facilitates management of workload in a DC is

virtualization. Rather than each server hosting a single operating system (or application),

virtualization facilitates a number of VMs being hosted on a single physical server, each of

which may run a different operating system (or even different versions of the same operating

system). These VMs may be re-located (migrated) to a different host on the LAN for a

variety of reasons:

Maintenance


Servers intermittently need to be removed from the network for maintenance. The

applications running on these servers may need to be kept running during the

maintenance period so they are migrated to other servers for the duration.

Consolidation

In a virtualized DC some of the servers may be running at (or close to) idle – using

expensive power to maintain a machine which is effectively not being used to

capacity. To conserve power, resource allocation software moves the applications on

the under-utilized machine to a ‘busier’ machine - as long as the latter has the

required overhead to host the applications. The under-utilized machine can then be

switched off – saving on power and cooling.

Energy Efficiency

Hotspots regularly occur in the server room i.e. the cooling system is working too

hard in the effort to eliminate the exhaust air from a certain area. The particular

workload which is causing the problem can be identified and relocated to a cooler

region in the DC to relieve the pressure in the overheated area.

Virtual Machines may also be migrated to servers beyond the LAN (i.e. across the Wide Area Network (WAN)):

Follow the sun - minimize network latency during office hours by placing VMs close

to where their applications are requested most often

Where latency is not a primary concern there are a number of different strategies

which may apply:

Availability of renewable energy / improved energy mix

Less expensive cooling overhead (e.g. ‘free’ cooling in more temperate / cooler climates)

Follow the moon (less expensive electricity at night)

Fluctuating electricity prices on the open market [24]

Disaster Recovery (DR)

Maintenance / Fault tolerance

Bursting i.e. temporary provisioning of additional resources


Backup / Mirroring

Regardless of the motivation, migration of virtual machines both within the DC and also to

other DCs (in the cloud or within the enterprise network) not only extends the opportunity for

significant cost savings but may also provide faster application response times if located

closer to clients. To maintain uptime and response-time Service Level Agreement (SLA) parameters of 99.999% (or higher), these migrations must be performed ‘hot’ or ‘live’,

keeping the application available to users while the virtual machine hosting the application

(and associated data) is moved to the destination server. Once all the data has been migrated,

requests coming into the source VM are redirected to the new machine and the source VM

can be switched off or re-allocated. The most popular algorithm by which virtual machines

are migrated is known as pre-copy and is deployed by both Citrix and VMWare – currently

considered to be the global leaders in software solutions for migration and virtualized

systems. A variety of live migration algorithms have been developed in the years since 2007.

Some are listed below:

1. Pre-copy [25]

2. GA for Renewable Energy Placement [26]

3. pMapper: Power Aware Migration [27]

4. De-duplication, Smart Stop & Copy, Page Deltas & CBR (Content Based

Replication) [28]

5. Layer 3: IP LightPath [29]

6. Adaptive Memory Compression [30]

7. Parallel Data Compression [31]

8. Adaptive Pre-paging and Dynamic Self-ballooning [32]

9. Replication and Scheduling [33]

10. Reinforcement Learning [34]

11. Trace & Replay [35]

12. Distributed Replicated Block Device (DRBD) [36]


The LAN-based migration algorithm used by the Amazon EC2 virtualization hypervisor

product (Citrix XenMotion) is primarily based on pre-copy but also integrates some aspects

of the algorithms listed above. It serves as a good example of the live migration process. It is

discussed in the following section.

2.4.2.1 Citrix XenMotion Live Migration

The virtual machine on the source (or current machine) keeps running while transferring its

state to the destination. A helper thread iteratively copies the state needed while both end-

points keep evolving. The number of iterations determines the duration of live migration. As

a last step, a stop-and-copy approach is used. Its duration is referred to as downtime. All

implementations of live migration use heuristics to determine when to switch from iterating

to stop-and-copy.

Pre-copy starts by copying the whole source VM state to the destination system.

While copying, the source system keeps responding to client requests. As memory pages may

get updated (‘dirtied’) on the source system (Dirty Page Rate), even after they have been

copied to the destination system, the approach employs mechanisms to monitor page updates.

The performance of live VM migration is usually defined in terms of migration time

and system downtime. All existing techniques control migration time by limiting the rate of

memory transfers while system downtime is determined by how much state has been

transferred during the ‘live’ process. Minimizing both of these metrics is correlated with

optimal VM migration performance and it is achieved using open-loop control techniques.

With open-loop control, the VM administrator manually sets configuration parameters for the

migration service thread, hoping that these conditions can be met. The input parameters are a

limit to the network bandwidth allowed to the migration thread and the acceptable downtime

for the last iteration of the migration. Setting a low bandwidth limit while ignoring page

modification rates can result in a backlog of pages to migrate and prolong migration. Setting

a high bandwidth limit can affect the performance of running applications. Checking the

estimated downtime to transfer the backlogged pages against the desired downtime can keep

the algorithm iterating indefinitely. Approaches that impose limits on the number of iterations or statically increase the allowed downtime can render live migration equivalent to pure

stop-and-copy migration.


2.4.2.2 Wide Area Network Migration

With WAN transmissions becoming increasingly feasible and affordable, live migration of

larger data volumes over significantly longer distances is becoming a realistic possibility [37,

38]. As a result, the existing algorithms, which have been refined for LAN migration, will be

required to perform the same functionality over the WAN. However, a number of constraints

present themselves when considering long distance migration of virtual machines. The

constraints unique to WAN migration are:

Bandwidth (I/O throughput – lower over WANs)

Latency (distance to destination VM – further on WANs)

Disk Storage (transfer of SAN / NAS data associated with the applications running on

the source VM to the destination VM)

Bandwidth (and latency) becomes an increasingly pertinent issue during WAN migration

because of the volume of data being transmitted across the network. In the time it takes to

transmit a single iteration of pre-copy memory to the destination, there is an increased chance

(relative to LAN migration) that the same memory may have been re-written at the source.

The rate at which memory is rewritten is known as the Page Dirty Rate (PDR) - calculated by dividing the volume of memory dirtied in the last round by the time the last round took, and expressed in Mbits/sec. This normalizes PDR for direct comparison with bandwidth (a short illustrative sketch of this comparison is given after the two PDR categories below). Xen implements variable

bandwidth during the pre-copy phase based on this comparison. There are 2 main categories

of PDR when live migration is being considered:

1. Low / Typical PDR: Memory is being re-written slower than the rate at which those

changes can be transmitted to the destination i.e. PDR < Migration bandwidth

2. Diabolical PDR (DPDR): The rate at which memory is being re-written at the source

VM exceeds the rate at which that re-written memory can be migrated ‘live’ to the

destination (PDR > Migration bandwidth). The result is that the pre-copy phase may not converge at all: the PDR floods I/O and iterative pre-copy must be stopped immediately. All remaining pages

are then transferred to the destination. The result of this is a longer downtime (while

the pages are transferred), potential SLA violations and, most notably for the purposes


of this research, increased power consumption while both hosts are running

concurrently.
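To make the convergence condition concrete, the short Java sketch below normalizes the PDR to Mbits/sec and compares it with the migration bandwidth. It is purely illustrative: the names (pageDirtyRateMbps, isDiabolical) and the sample figures in main are assumptions for this example and do not correspond to any hypervisor's actual API.

public final class PdrCheck {

    // Normalize the page dirty rate to Mbits/sec so that it is directly
    // comparable with the migration bandwidth.
    static double pageDirtyRateMbps(long pagesDirtiedLastRound,
                                    long pageSizeBytes,
                                    double lastRoundSeconds) {
        double bitsDirtied = pagesDirtiedLastRound * pageSizeBytes * 8.0;
        return (bitsDirtied / lastRoundSeconds) / 1_000_000.0;
    }

    // PDR >= bandwidth is the 'diabolical' case: iterative pre-copy will not converge.
    static boolean isDiabolical(double pdrMbps, double migrationBandwidthMbps) {
        return pdrMbps >= migrationBandwidthMbps;
    }

    public static void main(String[] args) {
        double pdr = pageDirtyRateMbps(120_000, 4096, 10.0); // roughly 393 Mbit/s
        System.out.println(isDiabolical(pdr, 1000.0)); // false: pre-copy can converge
        System.out.println(isDiabolical(pdr, 200.0));  // true: DPDR, stop-and-copy is forced
    }
}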

2.4.2.3 PDR Analysis and Compression of Transmitted Pages

Current algorithms send the entire VM state on the 1st iteration (Figure 5: 62 seconds). To

reduce the time spent on the 1st iteration, pages frequently ‘dirtied’ should be identified

before the 1st iteration - the objective being to hold back these pages until the final iteration

(reducing the number of pages resent during iterative pre-copy) or at least hold them back

until some analysis calculates that they are ‘unlikely’ (with some confidence interval) to be

dirtied again. There is a reasonable assumption that there will be multiple iterations in a high

PDR environment – in the (rare) case where a VM has no dirty pages, only a single iteration

would be required to transfer the entire state. Pre-migration analysis would not be continuous

(due to the CPU overhead) but should begin at some short interval before the migration takes

place i.e. just after the decision to migrate has been made.

Figure 5 Performance of web server during live migration (C. Clark)

With a pre-migration analysis phase the time required for the 1st iteration will be reduced.

There may be an argument that downtime is increased due to the additional pages held back

during the first iteration. High PDR pages - which have not been sent in the 1st iteration - would likely be identified in the 2nd (or subsequent) iterations anyway, resulting in a very

similar Writable Working Set (WWS) on the final iteration. In low PDR environments

research suggests that the WWS in the majority of cases is a small proportion of the entire


data set (perhaps approximately 10%) which needs to be transferred – resulting in minimal

iterations being required before a stop condition is reached i.e. subsequent iterations would

yield diminishing returns. This is not the case where an application may be memory intensive

i.e. the PDR is diabolical and floods the I/O rate.

Conversely, if the WWS is so small, is identifying it at the pre-iterative stage worth the effort? If the algorithm can be applied to diabolical environments as well as

acceptable PDR environments then the answer is yes – the effort is worth it. There is an

inevitable trade-off between the time (and CPU overhead) required to identify the WWS on

each iteration and the resulting time saved during iterative pre-copy due to fewer pages being

transferred. However, identifying a minimal WWS will intrinsically save time.

Finding the ‘threshold’ (current research suggests a simple high/low threshold) is an

interesting research challenge! A bitmap indicating the Page Dirty Count is required to keep

track of pages being repeatedly dirtied. A count however is probably too simplistic. Would an

upper / lower bounded threshold be more applicable? A bounded threshold would ‘hold’ the

pages which are above the lower threshold boundary but below the upper threshold boundary

i.e. deemed least likely to be dirtied again. Boundary calculation should include a confidence

interval - to minimize the un-synced pages before the final iteration occurs. These categorized

‘hold’ pages might be held until the next iteration and if they are found to still have a ‘hold’

status (fall between the upper and lower threshold boundaries) they are then transferred. With

successive iterations more is known about recent PDR patterns. Analysis of these should

theoretically yield boundary calculations which are more accurate as a result.

Note: An additional parallel check, before the nth iteration takes place, of all the pages

which were transmitted from the threshold area would identify those pages which have been

subsequently dirtied. The compressed deltas of these pages would be re-transmitted in the

final iteration – along with those that were still above the upper threshold. The success of the

new algorithm could be judged on the percentage error at this stage i.e. how many pages were

sent from the ‘hold’ area but subsequently dirtied?
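One possible reading of the bounded-threshold scheme discussed above is sketched below in Java. The three-way classification, the threshold values and all identifiers (classify, Action, the sample dirty counts) are illustrative assumptions rather than an implementation of any published algorithm.

import java.util.EnumMap;
import java.util.Map;

// Sketch of the bounded-threshold idea: pages are classified by their dirty
// count into those sent now, those held for a later iteration, and those
// deferred to the final (stop-and-copy) round. Thresholds are illustrative.
public final class BoundedThresholdSketch {

    enum Action { SEND_NOW, HOLD, DEFER_TO_FINAL }

    static Action classify(int dirtyCount, int lowerBound, int upperBound) {
        if (dirtyCount < lowerBound) return Action.SEND_NOW;  // rarely dirtied - safe to send
        if (dirtyCount < upperBound) return Action.HOLD;      // re-check at the next iteration
        return Action.DEFER_TO_FINAL;                         // hottest pages - final round only
    }

    public static void main(String[] args) {
        int[] dirtyCounts = {0, 1, 3, 7, 15};                 // per-page counts from the bitmap
        Map<Action, Integer> tally = new EnumMap<>(Action.class);
        for (int count : dirtyCounts) {
            tally.merge(classify(count, 2, 8), 1, Integer::sum);
        }
        System.out.println(tally); // {SEND_NOW=2, HOLD=2, DEFER_TO_FINAL=1}
    }
}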

2.4.2.4 Parallel Identification of Dirty Pages and Multi-Threaded Adaptive Memory Compression

In addition to the pre-migration analysis stage it may also be useful to examine the potential

of parallel dirty page identification and compression. In Figure 6 the blue area is when dirty


pages are identified for the next round and delta compression takes place. However, in the

time this phase is taking place more pages will be dirtied. If the same interval was moved

back (to be in parallel with the previous data transfer) would more pages be dirtied? The

answer appears to be no i.e. the PDR is independent of the process which actually calculates

it. The benefit of this parallelism is that the algorithm is ready to move immediately to

transfer n + 1 when transfer n has completed – reducing the iterative pre-copy time by

eliminating the blue interval in Figure 6.

It is probable that some overlap may be optimal rather than full parallelism. Time-series analysis of dirtying patterns during the previous transfer interval might yield an

optimal overlap i.e. the best time to start identifying the new dirty pages, rather than waiting

until the transfer has completed. It would also be beneficial to investigate further whether, as

the number of dirty pages reduces with subsequent iterations, the time required to identify

(and compress) the dirty page deltas could also be reduced (research suggests cache access

times remain relatively constant). If this were true then the inner overlap could be sent deeper

back into the transfer time reducing the outer overlap further. Additionally, multi-threaded

compression would yield further reductions in the overlap interval.

Figure 6 Pre-Copy algorithm

2.4.2.5 Throttling

The critical issue in high PDR environments is that the possibility of convergence is reduced

(if not eliminated altogether). It is similar to a funnel filling up too quickly. If the PDR

continues at a high rate the funnel will eventually overflow resulting in service timeouts i.e.


the application will not respond to subsequent requests or response times will be significantly

increased. The current solution is to abandon pre-copy migration, stop the VM and transfer

all memory i.e. empty the funnel. Unfortunately, in the time it takes to empty the funnel,

more pages have been dirtied because requests to the application do not stop. This may

actually prohibit the migration altogether because the downtime is such that an unacceptable

level of SLA violations occur.

If, however, the application’s response thread can be artificially slowed down (throttled) intermittently, then the funnel is given a better chance to empty its

current contents. This would be analogous to temporarily decreasing the flow from the tap to

reduce the volume in the funnel.

Previous solutions suggested that slowing response time to requests (known as

Dynamic Rate Limiting) would alter the rate at which I/O throughput was performed but

results proved that detrimental VM degradation tended to occur. In addition, other processes

on the same physical machine were negatively affected. Dedicated migration switches were

required to divert the additional load from the core. The focus was on the I/O throughput as

opposed to the incoming workload (PDR).

How the PDR could be intermittently throttled without adverse degradation of either

the VM in question, or other machine processes, is the central question.

Successful PDR throttling, in conjunction with threshold calculations and optimized

parallel adaptive memory compression / dirty page identification, would achieve a lower

PDR. However, the issue of PDR can be essentially circumvented if the number of migrations

taking place in a DC as a whole can be reduced.

In the majority of typical PDR environments Clark et al. [39] have shown that the

initial number of dirty pages i.e. the Writable Working Set (WWS), is a small proportion of

the entire page set (perhaps 10% or less) which needs to be transferred, typically resulting in

minimal iterations being required before a stop condition is reached i.e. subsequent iterations

would yield diminishing returns. This is not the case where an application may be particularly

memory-intensive i.e. the PDR is diabolical.

Degradation of application performance during live migration (due to DPDRs or for

other reasons) results in increased response times, threatening violation of SLAs and


increasing power consumption. When optimizing migration algorithms for DPDRs, there are two possible approaches to solving the problem:

1. Increase bandwidth

2. Decrease PDR

Typical applications only exhibit this DPDR-like behaviour as spikes or outliers in normal

write activity. Live migration was previously abandoned by commercial algorithms when

DPDRs were encountered. However, in its most recent version of vSphere (5.0), VMWare

has included an enhancement called ‘Stun During Page Send’ (SDPS) [40] which guarantees

that the migration will continue despite experiencing a DPDR (VMWare refers to DPDRs as

‘pathological’ loads). Tracking both the transmission rate and the PDR, a diabolical PDR can

be identified. When a DPDR is identified by VMWare, the response time of the virtual

machine is slowed down (‘stunned’) by introducing microsecond delays (sleep processes) to

the vCPU. This slows the response to application requests and thus reduces the PDR to less than the migration bandwidth, ensuring convergence (PDR < bandwidth).

Xen implements a simple equivalent – limiting ‘rogue’ processes (other applications

or services running parallel to the migration) to 40 write faults before putting them on a wait

queue.
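The following Java fragment sketches the general shape of such throttling: while the measured PDR exceeds the migration bandwidth, short pauses are injected so that pre-copy can converge. It is a conceptual sketch only; the interface, the 100-microsecond pause and the toy vCPU in main are assumptions for illustration and do not reflect VMware's SDPS or Xen's internals.

public final class StunSketch {

    // Minimal stand-in for a vCPU exposing its measured page dirty rate.
    interface VcpuLike {
        double currentPdrMbps();
        void resampleDirtyRate();
    }

    static void throttleUntilConvergent(VcpuLike vcpu, double bandwidthMbps)
            throws InterruptedException {
        while (vcpu.currentPdrMbps() >= bandwidthMbps) {
            Thread.sleep(0, 100_000);   // inject a ~100 microsecond pause ('stun')
            vcpu.resampleDirtyRate();   // PDR falls while the guest is slowed
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Toy vCPU whose dirty rate halves each time it is stunned.
        VcpuLike vcpu = new VcpuLike() {
            private double pdr = 1600.0;
            public double currentPdrMbps() { return pdr; }
            public void resampleDirtyRate() { pdr /= 2.0; }
        };
        throttleUntilConvergent(vcpu, 1000.0);
        System.out.println("PDR after throttling: " + vcpu.currentPdrMbps() + " Mbit/s");
    }
}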

2.5 Monitoring Interval

Much effort has been applied to optimizing the live migration process in recent years. During

migration, the primary factors impacting on a VM’s response SLA are the migration time

and, perhaps more importantly, the downtime. These are the metrics which define the

efficiency of a migration. If a DC operator intends to migrate the VM(s) hosting a client

application it must factor these constraints into its SLA guarantee. It is clear that every

possible effort should be made to minimize the migration time (and downtime) - so that the

best possible SLAs may be offered to clients. This can only be achieved by choosing the VM

with the lowest potential PDR for each migration.

However, response and uptime SLAs become increasingly difficult to maintain if

reduction (or at least minimization) of power consumption is a primary objective because

each migration taking place consumes additional energy (while both servers are running and


processing cycles, RAM, bandwidth are being consumed). Based on this premise, Voorsluys

et al. [41] evaluate the cost of live migration, demonstrating that DC power consumption can

be reduced if there is a reduction in migrations. The cost of a migration (as shown) is

dependent on a number of factors, including the amount of RAM being used by the source

VM (which needs to be transferred to the destination) and the bandwidth available for the

migration. The higher the bandwidth the faster data can be transferred. Additionally, power

consumption is increased because 2 VMs (source and destination) are running concurrently

for much of the migration process.

In order to reduce the migration count in a DC each migration should be performed

under the strict condition that the destination host is chosen such that power consumption in

the DC is minimized post-migration. This can only be achieved by examining all possible

destinations before each migration begins - to identify the optimal destination host for each

migrating VM from a power consumption point-of-view. The critical algorithm for resource

(VM) management is the placement algorithm.


These 2 conditions i.e.

1. Migrate the VM with the lowest Page Dirty Rate

2. Choose the destination host for minimal power consumption post-migration

form the basis upon which the Local Regression / Minimum Migration Time (LRMMT)

algorithm in CloudSim [42] operates (c.f. Chapter 3).

2.5.1 Static Monitoring Interval

Recent research efforts in energy efficiency perform monitoring of the incoming workload

but almost exclusively focus on techniques for analysis of the data being collected rather than

improving the quality of the data.

In their hotspot identification paper, Xu and Sekiya [43] select a monitoring interval

of 2 minutes. The interval is chosen on the basis of balancing the cost of the additional

processing required against the benefit of performing the migration. The 2 minute interval

remains constant during experimentation.

Using an extended version of the First Fit Decreasing algorithm, Takeda et al. [44]

are motivated by consolidation of servers, to save power. They use a static 60 second

monitoring interval for their work.

Xu and Chen et al. [45] monitor the usage levels of a variety of server resources

(CPU, memory, and bandwidth), polling metrics as often as they become available. Their

results show that monitoring at such a granular level may not only lead to excessive data

processing but the added volume of network monitoring traffic (between multiple hosts and

the monitoring system) may also be disproportionate to the accuracy required.

The processing requirements of DC hosts vary as the workload varies and are not known until requests arrive at the VM for service. While some a priori analysis of the

workload may be performed to predict future demand, as in the work of Gmach et al. [46],

unexpected changes may occur which have not been established by any previously identified

patterns. A more dynamic solution is required which reacts in real-time to the incoming

workload rather than making migration decisions based on a priori analysis.


VMware vSphere facilitates a combination of collection intervals and levels [47].

The interval is the time between data collection points and the level determines which metrics

are collected at each interval. Examples of vSphere metrics are as follows:

Collection Interval: 1 day

Collection Frequency: 5 minutes (static)

Level 1 data: 'cpuentitlement', 'totalmhz', 'usage', 'usagemhz'

Level 2 data: 'idle', 'reservedCapacity' + all of Level 1 data (above)

VMware intervals and levels in a DC are adjusted manually by the operator as circumstances

require. Once chosen, they remain constant until the operator re-configures them. Manual

adjustment decisions, which rely heavily on the experience and knowledge of the operator,

may not prove as accurate and consistent over time as an informed, dynamically adjusted

system.

In vSphere, the minimum collection frequency available is 5 minutes. Real-time data

is summarized at each interval and later aggregated for more permanent storage and analysis.

2.5.2 Dynamic Monitoring Interval

Chandra et al. [48] focus on dynamic resource allocation techniques which are sensitive to

fluctuations in data center application workloads. Typically SLA guarantees are managed by

reserving a percentage of available resources (e.g. CPU, network) for each application. The

portion allocated to each application depends on the expected workload and the SLA

requirements of the application. The workload of many applications (e.g. web servers) varies

over time, presenting a significant challenge when attempting to perform a priori estimation

of such workloads. Two issues arise when considering provisioning of resources for web

servers:

1. Over-provisioning based on worst case workload scenarios may result in potential

underutilization of resources e.g. higher CPU priority allocated to application which

seldom requires it

2. Under-provisioning may result in violation of SLAs e.g. not enough CPU priority

given to an application which requires it


An alternate approach is to allocate resources to applications dynamically based on

observation of their behaviour in real-time. Any remaining capacity is later allocated to those

applications as and when they are found to require it. Such a system reacts in real-time to

unanticipated workload fluctuations (in either direction), meeting QoS objectives which may

include optimization of power consumption in addition to typical performance SLAs such as

response time.

While Chandra and others [49, 50] have previously used dynamic workload analysis

approaches, their focus was on resource management to optimize SLA guarantees i.e.

performance. No consideration is given in their work to the effect on power consumption

when performance is enhanced. This research differentiates itself in that dynamic analysis of

the workload is performed for the purpose of identifying power consumption opportunities

while also maintaining (or improving) the performance of the DC infrastructure. The search

for improved energy efficiency is driven in this research by DC cost factors which were not as

significant an issue 10-15 years ago as they are now.

2.6 Conclusion

This chapter provided an in-depth analysis of data center energy efficiency state-of-the-art.

Software solutions to energy efficiency issues were presented, demonstrating that many

opportunities still exist for improvement in server room power consumption using a software

approach to monitoring (and control) of the complex systems which comprise a typical DC.

The principal lesson to take from prior (and existing) research in the field is that most of the

DC infrastructure can be monitored using software solutions but that monitoring (and

subsequent processing of the data collected) should not overwhelm the monitoring /

processing system and thus impact negatively on the operation of the DC infrastructure. This

thesis proposes that dynamic adjustment of the monitoring interval with respect to the

incoming workload may represent a superior strategy from an energy efficiency perspective.

Chapter 3 discusses in more detail the capabilities provided by the Java-based CloudSim

framework (used for this research) and the particular code modules relevant to testing the

hypothesis presented herein.


Chapter 3 CloudSim

Introduction

Researchers working on data center energy efficiency from a software perspective are

typically hindered by lack of access to real-world infrastructure because it is infeasible to add

additional workload to a data center which already has a significant ‘real-world’ workload to

service on a daily basis. From a commercial perspective, DC operators are understandably

unwilling to permit experimentation on a network which, for the most part, has been fine-

tuned to manage their existing workload. In this chapter, details of the CloudSim framework

are presented with special emphasis on those aspects particularly related to this MSc research

topic.

The CloudSim framework [42] is a Java-based simulator developed at the University of Melbourne; its power-aware extensions were written by Anton Beloglazov for his doctoral thesis. It provides a limited

software solution to the above issues and is deployed in this research to simulate a standalone

power-aware data center with LAN-based migration capabilities. The Eclipse IDE is used to

run (and edit) CloudSim.

3.1 Overview

Default power-aware algorithms in CloudSim analyse the state of the DC infrastructure at

static 300-second intervals. This reflects current industry practice where an average CPU

utilization value for each host is polled every 5 minutes (i.e. 300 seconds) by virtualization

monitoring systems (e.g. VMware). At each interval the CPU utilization of all hosts in the

simulation is examined to establish whether or not they are adequately servicing the workload

which has been applied to the VMs placed on them.

If a host is found to be over-utilized (i.e. the CPU does not have the capacity to

service the complete workload of all the VMs placed on it) a decision is made to migrate one

or more of the VMs to another host where the required capacity to service the workload is

available.


Conversely, if a host is found to be under-utilized (i.e. the CPU is operating at such a low

capacity that power could be saved by switching it off), the remaining VMs are migrated to

another host and the machine is powered off. The CloudSim modules used only implement

migration when a host is over-utilized, reflecting the focus of this research.

There are two primary steps in the power-aware CloudSim migration algorithm for

over-utilized hosts:

1. Migrate the VM with the lowest Page Dirty Rate

2. Choose the destination host for minimal power consumption post-migration

The default CPU utilization threshold for an over-utilized host in CloudSim is 100%. An

adjustable safety parameter is also provided by CloudSim, effectively acting as overhead

provision. As an example, if the CPU utilization value were 90% and was then multiplied by

a safety parameter of 1.2, the resulting value of 108% would exceed the over-utilization

threshold. A safety parameter of 1.1 would result in a final value of 99% (for the same initial

utilization), thus not exceeding the threshold.

3.2 Workload

The workloads applied to the VMs on each host in a simulated DC for power-aware

CloudSim simulations are referred to as ‘cloudlets’. These are flat text files which contain

sample CPU utilization percentages gathered (per interval) from over 500 DC locations

worldwide. As long as no migration takes place (i.e. the host doesn’t become over-utilized),

the VM assigned to service the workload at the beginning of a simulation (depicted in Figure

7) remains associated with that workload until the cloudlet has been completed. However, if a

migration takes place (because the host has become over-utilized) the workload is then

applied to the VM on the destination host. Despite the term ‘VM Migration’, it is the

workload (not the VM) which changes location within the DC when a migration takes place.


Figure 7 CloudSim Architecture

The duration of default CloudSim simulations is 24 hours (i.e. 86,400 seconds). This equates

to 288 intervals of 5 minutes (300 seconds) each. Thus, each of the 1052 cloudlets (stored in

the PlanetLab directory) contains 288 values to make a value available for reading at each

interval of the simulation.


At the beginning of each simulation, the entire cloudlet is loaded into an array

(UtilizationModelPlanetLabInMemory.data[]) from which the values are read at

each interval throughout the simulation. Each cloudlet is assigned to a corresponding VM on

a 1-to-1 basis at the beginning of the simulation.

The values being read from the cloudlets are percentages which simulate ‘real-

world’ CPU utilization values. These need to be converted to a variable in CloudSim which is

related to actual work performed. CloudSim work performance is defined in MIs (Million

Instructions). The workload of the cloudlet (termed length) is a constant i.e. 2500 *

SIMULATION_LENGTH (2500 * 86400 = 216,000,000 MIs). CloudSim keeps track of the

VM workload already performed by subtracting the MIs completed during each interval from

the total cloudlet MI length. As such each cloudlet starts at t = 0 seconds with a workload of

216,000,000 MI and this load is reduced according to the work completed at each interval.

To check whether a cloudlet has been fully executed, the IsFinished() method is called at each interval.

// checks whether this Cloudlet has finished or not
if (cl.IsFinished()){
    …
}

final long finish = resList.get(index).finishedSoFar;
final long result = cloudletLength - finish;
if (result <= 0.0){
    completed = true;
}

From the code extract above it can be seen that when (or if) the VM’s workload (represented by

the cloudletLength variable) is completed during the simulation the VM will be ‘de-

commissioned’.

3.3 Capacity

Each of the 4 VM types used in the CloudSim framework represents a ‘real-world’ virtual

machine. They are assigned a MIPS value (i.e. 500, 1000, 2000, 2500) before the simulation

begins. This value reflects the maximum amount of processing capacity on the host to which


a VM is entitled. Likewise each host CPU has an initial MIPS capacity of either 1860 or

2660, again reflecting ‘real-world’ servers. These configuration settings limit the number of

VMs which can be run on each host and also the volume of workload which can be

performed by each VM at each interval.

Example: A host has a capacity of 2660 MIPS. A VM (with a capacity of 500

MIPS) has just been started on the host and the first value read from the cloudlet array is 5%

of the host’s capacity (i.e. 2660 / 20 = 133 MIPS). If the next interval is 30 seconds long then

the amount of instructions processed by the VM is 133 * 30 = 3990MI.

This completed work is subtracted from the total cloudlet length (i.e. 216,000,000 –

3990 = 215,996,010MI). At each subsequent interval throughout the simulation the same

algorithm is applied until such time as the remaining workload to be processed is at (or

below) zero. At this stage the VM is de-commissioned because the workload is complete.
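The arithmetic of this example can be expressed directly in code. The following sketch is not CloudSim code; the variable names are chosen for illustration, but the calculation mirrors the steps described above.

public final class CloudletProgressExample {

    public static void main(String[] args) {
        double hostMips = 2660.0;            // host CPU capacity
        double cloudletPercent = 0.05;       // 5% utilization value read from the cloudlet
        double intervalSeconds = 30.0;       // length of the next monitoring interval
        long remainingLength = 216_000_000L; // 2500 * 86,400 MI at the start of the simulation

        double mipsRequested = hostMips * cloudletPercent;           // 133 MIPS
        long miProcessed = (long) (mipsRequested * intervalSeconds); // 133 * 30 = 3,990 MI
        remainingLength -= miProcessed;                              // 215,996,010 MI left

        System.out.println("MI processed this interval: " + miProcessed);
        System.out.println("Remaining cloudlet length:  " + remainingLength);
        System.out.println("Cloudlet finished: " + (remainingLength <= 0));
    }
}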

In this example the 5% CPU percentage from the cloudlet (i.e. 133 MIPS) is approximately 27% (133 / 500) of the CPU capacity allocated to the VM. If the original value read from the cloudlet were greater than 18.79% (i.e. 500 / 2660), the VM would have insufficient capacity to continue servicing the workload and SLA violations would occur. Two options typically need to be considered when this happens:

1. Increase the VM’s capacity on the host – not facilitated in CloudSim

2. Apply the workload to a VM with a larger capacity on a different host, requiring a

migration. This will only occur if the host is also over-utilized which is a significant

shortfall in the CloudSim modules used for testing the hypothesis. An over-utilized

VM (causing SLA violations) will not result in a migration in the version of CloudSim

being used for this research. Additionally, it is notable that the CloudSim reports

(generated at the end of the simulation) detail very low SLA violation averages which

indicates that the particular workload (cloudlet) percentages being applied in this

version of CloudSim are insufficient to push the VMs beyond their capacity.

The difficulty of correctly sizing VM MIPS (and allocating appropriate host capacity to

them) so that they are capable of meeting their workload requirements can be seen from this

example. CloudSim goes some way to achieving this by applying VMs to hosts on a 1-to-1 basis at the start of the simulation, i.e. in a default simulation with 1052 VMs being placed on 800 hosts, the first 800 VMs are applied to the first 800 hosts and the remaining 252 VMs are allocated to hosts 1 -> 252. Therefore, when the simulation starts, 252 hosts have 2 VMs and

the remainder host a single VM.
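Assuming the wrap-around assignment described above, the initial placement can be pictured as a simple modulo mapping. The sketch below is illustrative only (it is not the CloudSim placement code) and confirms the resulting distribution of VMs across hosts.

public final class InitialPlacementSketch {

    public static void main(String[] args) {
        int numberOfVms = 1052;
        int numberOfHosts = 800;
        int[] vmsPerHost = new int[numberOfHosts];

        for (int vm = 0; vm < numberOfVms; vm++) {
            vmsPerHost[vm % numberOfHosts]++;   // VM 0 -> host 0, ..., VM 800 -> host 0 again
        }

        int hostsWithTwoVms = 0;
        for (int count : vmsPerHost) {
            if (count == 2) hostsWithTwoVms++;
        }
        System.out.println("Hosts with 2 VMs: " + hostsWithTwoVms); // 252
    }
}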

As processing continues the VM placement algorithm attempts to allocate as many

VMs to each host as capacity will allow. The remaining (empty) hosts are then powered off -

simulating the server consolidation effort typical of most modern DCs. It is clear that there is

a conflict of interests taking place. On the one hand there is an attempt to maximize

performance by migrating VMs to hosts with excess capacity but, on the other hand,

competition for CPU cycles is being created by co-locating VMs on the same host,

potentially creating an over-utilization scenario.

3.4 Local Regression / Minimum Migration Time (LR / MMT)

Beloglazov concludes from his CloudSim experiments that the algorithm which combines

Local Regression and Minimum Migration Time (LR / MMT) is most efficient for

maintaining optimal performance and maximizing energy efficiency. Accordingly this

research uses the LR / MMT algorithmic combination as the basis for test and evaluation.

3.5 Selection Policy – Local Regression (LR)

Having passed the most recent CPU utilization values through the Local Regression (LR)

algorithm, hosts are considered over-utilized if the next predicted utilization value exceeds

the threshold of 100% [Appendix A]. LR predicts this value using a sliding window, each

new value being added at each subsequent interval throughout the simulation. The size of the

sliding window is 10. Until initial filling of the window has taken place (i.e. 10 intervals have

elapsed since the simulation began), CloudSim relies on a 'fallback' algorithm [Appendix A]

which considers a host to be over-utilized if its CPU utilization exceeds 70%.

VMs are chosen for migration according to MMT i.e. the VM with the lowest

predicted migration time will be selected for migration to another host. Migration time is

based on the amount of RAM being used by the VM. The VM using the least RAM will be


chosen as the primary candidate for migration, simulating minimization of the Page Dirty Rate (PDR) during VM transfer [39] as previously discussed in Section 2.4.2.2.

3.6 Allocation Policy – Minimum Migration Time (MMT)

The destination host for the migration is chosen on the basis of power consumption following

migration i.e. the host with the lowest power consumption (post migration) is chosen as the

primary destination candidate. In some cases, more than one VM may require migration to

reduce the host's utilization below the threshold. Dynamic RAM adjustment is not facilitated

in CloudSim as the simulation proceeds. Rather, RAM values are read (during execution of

the MMT algorithm) on the basis of the initial allocation to each VM at the start of the

simulation.
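A simplified sketch of this selection criterion is given below: among hosts with sufficient capacity, the one with the smallest estimated increase in power consumption after receiving the VM is chosen. The interface and method names (canHost, estimatedPowerWattsAfterAllocation) are assumptions for illustration and are not CloudSim's actual API.

import java.util.List;

public final class MinPowerPlacementSketch {

    // Minimal host model exposing only what the selection criterion needs.
    interface HostModel {
        boolean canHost(double vmMips, int vmRamMb);
        double currentPowerWatts();
        double estimatedPowerWattsAfterAllocation(double vmMips, int vmRamMb);
    }

    static HostModel findDestination(List<HostModel> hosts, double vmMips, int vmRamMb) {
        HostModel best = null;
        double minPowerIncrease = Double.MAX_VALUE;
        for (HostModel host : hosts) {
            if (!host.canHost(vmMips, vmRamMb)) {
                continue;                          // insufficient capacity
            }
            double increase = host.estimatedPowerWattsAfterAllocation(vmMips, vmRamMb)
                    - host.currentPowerWatts();
            if (increase < minPowerIncrease) {     // smallest post-migration power increase wins
                minPowerIncrease = increase;
                best = host;
            }
        }
        return best;                               // null if no host has capacity
    }
}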

3.7 Default LRMMT

The LRMMT algorithm begins in the main() method of the LrMmt.java class [Appendix

B]. A PlanetLabRunner() object is instantiated. The PlanetLabRunner() class inherits

from the RunnerAbstract() class and sets up the various parameters required to run the

LRMMT simulation. The parameters are passed to the initLogOutput() method in the

default constructor of the super class (RunnerAbstract) which creates the folders required

for saving the results of the simulation. Two methods are subsequently called:

3.7.1 init()

Defined in the sub-class (PlanetLabRunner), this method takes the location of the

PlanetLab workload (string inputFolder) as a parameter and initiates the simulation. A

new DatacenterBroker() object is instantiated. Among other responsibilities, the broker

will create the VMs for the simulation, bind the cloudlets to those VMs and assign the VMs

to the data center hosts. The broker’s ‘id’ is now passed to the

createCloudletListPlanetLab() method which prepares the cloudlet files in the input

folder for storage in a data[288] array. It is from this array that each cloudlet value will be

read so that an equivalent MI workload value can be calculated for each VM. Having created

the cloudletList, the number of cloudlets (files) in the PlanetLab folder is now known

and a list of VMs can be created with each cloudlet being assigned to an individual VM i.e.

there is a 1-to-1 relationship between a cloudlet and a VM at the start of the simulation. The


last call of the init() method creates a hostList which takes as a parameter the number

of hosts configured for the DC (i.e. 800) from the PlanetLabConstants() class. On

completion of the init() method the cloudlets (workload), hosts and VMs are all

instantiated and ready for the data center to be created.

3.7.2 start()

The start() method creates the data center, binds all the components created in the

init() method to the new data center and starts the simulation.

The first helper call of the start() method is to createDatacenter() which

sets up a number of parameters related to the characteristics of the DC. These include:

arch (string) – whether the DC has a 32 or 64 bit architecture

os (string) – the operating system running on the hosts – e.g. Linux / Windows

vmm (string) – the virtual machine manager running on the hosts e.g. Xen

time_zone (double) – where the DC is located – e.g. 10.0

cost (double) – the cost of processing in this resource – e.g. 3.0

costPerMem (double) – the cost of using memory in this resource – e.g. 0.05

costPerStorage (double) – the cost of using storage in this resource – e.g. 0.001

costPerBw (double) – the cost of using bandwidth in this resource – e.g. 0.0

In the case of a simulation of a cloud network, where more than one data center would be

required, these values can be altered for the purposes of calculating different infrastructural

costs across the cloud. In this research a single data center is being simulated. The defaults

are not adjusted.

Once the data center has been created a boolean value

(PowerDatacenter.disableMigrations - indicating whether or not migrations are

disabled) is set to false i.e. migrations are enabled for this simulation. The VM and cloudlet

lists are submitted (by the broker) to the datacenter object and the simulation is started i.e.

double lastClock = CloudSim.startSimulation();


The startSimulation() method calls the run() method, which waits for completion of all entities, i.e. run() waits until the entities (cloudlets running on VMs) executing as threads reach the ‘non-RUNNABLE’ state or until there are no more events in the future event queue. Once this point has been reached, the clock time is returned to the calling method (i.e. RunnerAbstract.start()) and the simulation is stopped.

Helper.printResults(datacenter, vmList, lastClock, experimentName, Constants.OUTPUT_CSV, outputFolder);

Results (if enabled) are printed to both log and trace files and the simulation is completed.


3.8 Over-utilization

Figure 8 Flow Chart Depicting the LR / MMT simulation process

The full simulation process is depicted in Figure 8. The scheduling interval (i.e. how often

analysis will be performed) is set as a static variable in the Constants class. For the default

CloudSim simulation, this interval is 300 seconds. At each interval the CPU utilization of

every host is examined. Using a sliding window of the last 10 CPU utilization values, the

local regression algorithm predicts the CPU utilization value for the next interval. If this

value is below 100% no action is taken. However, if the CPU is predicted to be greater than


100% (at the next interval) the host is considered over-utilized and the MMT portion of the

algorithm is called. As mentioned previously, a ‘fallback’ algorithm is used until the first 10

CPU values are available. The ‘fallback’ over-utilization threshold is 70%. The code for

testing a host for overutilization is shown below:

if (utilizationHistory.length < length){
    return getFallbackVmAllocationPolicy().isHostOverUtilized(host);
}

The length of the sliding window is 10. This is known in code as the

utilizationHistory.

try{
    estimates = getParameterEstimates(utilizationHistoryReversed);
}

The getParameterEstimates() call runs the local regression algorithm against the

sliding window and (after including the safety parameter as a multiplier) the predicted

utilization of the host is calculated.

predictedUtilization *= getSafetyParameter();

if(predictedUtilization >= 1)

{

Constants.OverUtilizedHostsThisInterval++;

}

return predictedUtilization >= 1;

A Boolean indicating the utilization state of the host is returned to the calling function. If the

host is predicted to be over-utilized at the next interval, the value of the returned Boolean will

be true.


3.9 Migration

One or more VMs need to be migrated from the host in order to bring the CPU utilization

back below the threshold. The VM(s) to be migrated are chosen on the basis of the amount of

RAM they are using. Thus, the VM with the least RAM will be the primary candidate for

migration. The VM types used by CloudSim are listed below. It can be seen that (for the most

part †) different RAM values are configured for each VM at the start of the simulation.

CloudSim does not include dynamic RAM adjustment so the static values applied initially

remain the same for the duration. Cloud providers such as Amazon use the term ‘instance’ to

denote a spinning VM. The CloudSim VM types provided simulate some of the VM

instances available to customers in Amazon EC2.

1. High-CPU Medium Instance: 2.5 EC2 Compute Units, 0.85 GB

2. Extra Large Instance: 2 EC2 Compute Units, 3.75 GB †

3. Small Instance: 1 EC2 Compute Unit, 1.7 GB

4. Micro Instance: 0.5 EC2 Compute Unit, 0.633 GB

public final static int VM_TYPES = 4;

public final static int[] VM_RAM = { 870, 1740, 1740†, 613 };

All types are deployed when the VMs are being created at the beginning of the simulation.

Assuming all 4 are on a single host when the host is found to be over-utilized, the order in

which the VMs will be chosen for migration is:

1. 613 [index 3]

2. 870 [index 0]

3. 1740 [index 1]

4. 1740 [index 2] † (note: the VM_RAM value in the default CloudSim code is 1740.

This does not reflect the ‘real-world’ server [Extra Large Instance] being simulated,

which has a RAM value of 3.75 GB)

If there is more than one VM with RAM of 613 on the host they will be queued for migration

before the first ‘870’ enters the queue. The chosen VM is then added to a migration map

which holds a key-value pair of the:


VM ID

Destination Host ID

Once all the hosts have been analysed, the VMs in the migration map are migrated to their

chosen destinations using the VM placement algorithm. The destination for each VM is

chosen with the objective of optimizing power consumption i.e. the host which will use the

least power post-migration is deemed the most suitable.

public Vm getVmToMigrate(PowerHost host){
    List<PowerVm> migratableVms = getMigratableVms(host);
    if (migratableVms.isEmpty()){
        return null;
    }
    Vm vmToMigrate = null;
    double minMetric = Double.MAX_VALUE;
    for (Vm vm : migratableVms){
        if (vm.isInMigration()){
            continue;
        }
        double metric = vm.getRam();
        if (metric < minMetric){
            minMetric = metric;
            vmToMigrate = vm;
        }
    }
    return vmToMigrate;
}

From the code above it can be seen that the VM with the least RAM (vm.getRam()) is

chosen for migration, the objective of which is to minimize the downtime required to transfer

the final RAM pages during the migration. Increased downtime during migration would

result in potential SLA violations as described in detail in Section 2.4.2.3. It is clear that, to a

certain extent, CloudSim is replicating the effort to minimize SLA violations which takes

place during ‘real-world’ live migrations.


3.10 Reporting

CloudSim facilitates reporting on various metrics available during the simulation. Reports are

generated as either flat text or MS Excel-type Comma Separated Values (CSV) file formats.

Additionally, metrics can be sent to the Eclipse console and read as the simulation progresses.

The metrics written to the trace file of the default CloudSim simulation are produced by the reporting code shown below. Notable metrics include:

Number of Hosts

Number of VMs

Energy Consumption

Over-utilized Hosts

Number of VM Migrations

Average SLA Violation

Trace.printLine(String.format("Experiment name: " + experimentName));Trace.printLine(String.format("Number of hosts: " + numberOfHosts));Trace.printLine(String.format("Number of VMs: " + numberOfVms));Trace.printLine(String.format("Total simulation time: %.2f sec", totalSimulationTime));Trace.printLine(String.format("Energy consumption: %.2f kWh", energy));Trace.printLine(String.format("Overutilized Hosts: %d", Constants.OverUtilizedHostsThisInterval));Trace.printLine(String.format("Number of VM migrations: %d", numberOfMigrations));Trace.printLine(String.format("SLA: %.5f%%", sla * 100));Trace.printLine(String.format("SLA perf degradation due to migration: _%.2f%%", slaDegradationDueToMigration * 100));Trace.printLine(String.format("SLA time per active host: %.2f%%", slaTimePerActiveHost * 100));Trace.printLine(String.format("Overall SLA violation: %.2f%%", slaOverall _* 100));Trace.printLine(String.format("Average SLA violation: %.2f%%", slaAverage _* 100));

3.11 Conclusion


This chapter discussed some of the capabilities and limitations of the default CloudSim

framework being used for this research and identified the modules most related to testing the

hypothesis presented in this research. An explanation of how CloudSim processes the workload applied by the cloudlets was also provided, and a range of shortfalls and possible errors were identified. Chapter 4 details the changes made to the default framework and the new code that was integrated into CloudSim to evaluate the hypothesis.


Chapter 4 Implementation

Introduction

As it is not provided in the default CloudSim package, additional code was required to test

the effect on power consumption when the monitoring interval is adjusted. Chapter 3

provided an overview of the default power-aware CloudSim simulation and the related

modules. The specific capabilities (and limitations) of the framework, as they apply to

dynamic interval adjustment, were also outlined. In this chapter the changes which were

required to implement dynamic adjustment of the monitoring interval are described in more

detail.

The primary contribution of the thesis is to evaluate the impact of moving from a

static to a dynamic monitoring process whereby the predicted average utilization for the DC

at each interval is used to adjust the next interval. Before writing the code, which would

ultimately be integrated with the existing LR/MMT modules in CloudSim, an algorithm was

designed to clarify the steps involved.

4.1 Interval Adjustment Algorithm

The dynamic interval adjustment algorithm involves two principal steps:

1. Calculate the weighted mean of the predicted CPU utilization values for all operational hosts in the data center, as in Equation 1. Non-operational hosts are excluded from the calculation, as they do not affect the average CPU utilization for the DC:

\text{weighted mean} = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}, \qquad (1)


where $w_i$ is the weight applied to the range within which the predicted utilization value $x_i$ for each operational host falls and $n$ is the number of operational hosts.

2. Choose and set the next monitoring interval according to the range within which the resulting weighted mean falls, using the weights from Table 4.1 and the interval mapping shown in Figure 9.

The premise upon which weightings are applied to calculate the average utilization of the DC

is simplified for the purposes of this research. The primary objective is to adjust the

monitoring interval with respect to the upper utilization threshold. As such, a straightforward set of weights (from 1 to 10) is applied to the CPU utilization of each host, such that hosts with higher CPU utilization (i.e., closer to 100%) are given more weight in the calculation. If a simple average were taken of host utilization across the DC, this would have the effect of masking hosts that are close to the threshold, where SLAs are in danger of being violated. If the lower threshold were taken into consideration, a different set of weights would

be appropriate, with increased importance applied in the regions closer to both thresholds and

reduced importance at the center (e.g. 40-60% CPU utilization). There is certainly scope for

further investigation of the simplified set of weights applied in this research, depicted in

Table 4.1.

Table 4.1: Application of Weights to Predicted Utilization

Predicted Utilization (%) per host ($x_i$) | Weight Applied ($w_i$)
1 – 10                                     | 1
11 – 20                                    | 2
21 – 30                                    | 3
…                                          | …
91 – 100                                   | 10
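As a brief worked example (the utilization figures here are purely illustrative, not taken from the simulation), consider three operational hosts with predicted utilizations of 15%, 45% and 95%. Table 4.1 assigns weights of 2, 5 and 10 respectively, so Equation 1 gives

\text{weighted mean} = \frac{(15)(2) + (45)(5) + (95)(10)}{2 + 5 + 10} = \frac{1205}{17} \approx 70.9\%,

which is noticeably higher than the simple (unweighted) average of roughly 51.7%, reflecting the priority given to the heavily loaded host. Under the interval mapping described in Section 4.4, this weighted mean would select a 90-second monitoring interval.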

The monitoring intervals applied to the resulting weighted average prediction are depicted in

Figure 9. As with the weights discussed above, the intervals were chosen somewhat

arbitrarily and would benefit from further analysis. The maximum interval is aligned with the

existing default interval in CloudSim i.e. 300 seconds. A minimum interval of 30 seconds


facilitates 10 intervals in total, each having a corresponding 10% CPU utilization range from

0 to 100.

However, if the minimum interval of 30 seconds were applied for the full 24-hour simulation,

2880 values would be required (i.e. (60 x 60 x 24) / 30) in each PlanetLab cloudlet file to

ensure a value could be read at each interval. The 288 values in the 1052 default files

provided by CloudSim were thus concatenated (using a C# program written specifically for

this purpose) to ensure sufficient values were available, resulting in a total of 105 complete

files with 2880 values each.
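The concatenation itself was performed by a small purpose-written C# utility that is not reproduced here; the sketch below illustrates the same idea in Java, under the assumption that the 1052 default files (288 values each) are grouped in tens, with any incomplete final group discarded, yielding 105 files of 2880 values. The directory and output file names are illustrative only.

import java.io.IOException;
import java.nio.file.*;
import java.util.*;

public class CloudletConcatenator {
    public static void main(String[] args) throws IOException {
        Path inDir = Paths.get("planetlab/default");   // assumed location of the 1052 x 288-value files
        Path outDir = Paths.get("planetlab/dynamic");  // destination for the 105 x 2880-value files
        Files.createDirectories(outDir);

        List<Path> sources = new ArrayList<>();
        try (DirectoryStream<Path> ds = Files.newDirectoryStream(inDir)) {
            ds.forEach(sources::add);
        }
        Collections.sort(sources);

        // Concatenate every complete group of 10 source files into one 2880-value file.
        for (int g = 0; g + 10 <= sources.size(); g += 10) {
            List<String> combined = new ArrayList<>(2880);
            for (int i = 0; i < 10; i++) {
                combined.addAll(Files.readAllLines(sources.get(g + i)));
            }
            Files.write(outDir.resolve("cloudlet_" + (g / 10) + ".txt"), combined);
        }
    }
}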

if (Constants.IsDefault) {
    data = new double[288];   // PlanetLab workload
} else {
    data = new double[2880];
}

The code extract above demonstrates the difference between the two data[] arrays which hold

the PlanetLab cloudlet values read at each interval during the simulation. The default array

is 288 in length while the dynamic is 2880 – providing sufficient indices in the array to store

the required number of cloudlet values throughout the simulation.


Figure 9 Application of the Monitoring Interval Based on Weighted Utilization Average

As noted above, the intervals and CPU ranges were chosen somewhat arbitrarily and could be fine-tuned following further investigation. Additionally, as a result of the reduced

file count after concatenation, the number of hosts running in the simulation was reduced,

from the CloudSim default of 800 to 80, to maintain the ratio of VMs to hosts (Table 4.2):

Table 4.2: VMs to Hosts - Ratio Correction

        | Cloudlets / VMs | Hosts
Default | 1052            | 800
Dynamic | 105             | 80


4.2 Comparable Workloads

This research compares the default CloudSim simulation with a dynamic version and therefore requires comparable workloads. Ensuring that the

workloads are comparable (in a simulation which monitors at different intervals) involves

applying the same amount of processing to each VM during each interval. Accordingly, a

new set of files was created for the default simulation. The values for these files were

calculated based on the average of the values used in the dynamic simulation by the time each

300 second interval had elapsed. This was achieved by running the dynamic simulation for 24

hours and recording the data (e.g. interval length, cumulative interval, average utilization per

interval) observed (shown in Figure 10). This data was initially written to a Microsoft Excel

worksheet from within the CloudSim reporting structure and then exported to a Microsoft

SQL Server database. The number (727) and length of the intervals in the dynamic simulation

can be seen in Figure 11.

From Figure 12, it is clear that a lower and/or upper offset may occur during the calculation, i.e. a dynamic interval ‘straddles’ the 300-second mark. To maintain as much

accuracy as possible for the calculation of the default file values, two new variables (i.e.

offsetBelow300, offsetAbove300) were introduced.


Figure 10 A screenshot of the data generated for calculation of the default workload

Figure 11 Intervals calculated during the dynamic simulation


Figure 12 Calculation of the Average CPU Utilization for the Default Files

The length of each interval is added to an accumulator until the total equals (or exceeds) 300

seconds. The average utilization for the accumulated intervals is then calculated. This average

includes (if required) the average for the final portion of any interval below the 300-second

mark. When an offset occurs above the 300-second mark (in the current accumulator), it is

‘held-over’ (i.e. added to the accumulator in the next 300-second interval). Some of the new

Java code written in CloudSim to monitor the interval and workload activity (generating the

data required for the calculator) is shown below – the code comments provide explanation:

if (Constants.IntervalGenerator) {
    int intervalDifference = 0;
    int iOffsetBelowForPrinting = 0;
    int iOffsetAboveForPrinting = 0;
    int iAccumulatedIntervalsForPrinting = 0;

    if (Constants.accumulatedIntervals >= 300) {
        // Constants.accumulatedIntervals is exactly 300
        int accumulated = 300;

        // calculate offsets
        if (Constants.accumulatedIntervals > 300) {
            accumulated = (int) Constants.accumulatedIntervals - (int) dInterval;
        }
        Constants.offsetBelow = 300 - accumulated;
        Constants.offsetAbove = Constants.accumulatedIntervals - 300;
    }
}
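To make the offset handling concrete, the short stand-alone sketch below walks through a hypothetical sequence of dynamic intervals (120 s, 150 s and 60 s, chosen purely for illustration) that accumulates to 330 seconds: 30 seconds of the final interval complete the current 300-second window (offsetBelow) and the remaining 30 seconds are held over for the next one (offsetAbove).

public class OffsetExample {
    public static void main(String[] args) {
        // Hypothetical dynamic interval lengths accumulating past the 300-second mark.
        double[] intervals = { 120, 150, 60 };

        double accumulated = 0;
        for (double d : intervals) {
            accumulated += d;                                   // 330 s in total
        }
        double lastInterval = intervals[intervals.length - 1];  // 60 s
        double filledBeforeLast = accumulated - lastInterval;   // 270 s accumulated before the final interval
        double offsetBelow = 300 - filledBeforeLast;            // 30 s of the final interval close this window
        double offsetAbove = accumulated - 300;                 // 30 s carried into the next 300-second window

        System.out.printf("offsetBelow = %.0f s, offsetAbove = %.0f s%n", offsetBelow, offsetAbove);
    }
}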

4.3 C# Calculator

Calculation of the new per-interval workloads was achieved using a separate ‘calculator’

program written in C#. The calculator implements the process depicted in Figure 12. The

principal C# method used to calculate the new averages for the default files in the calculator

program is CreateDefaultFiles(). The comments in the code explain each step of the

process:

private void CreateDefaultFiles() {
    // read in first 727 values from each file - used in dynamic simulation
    FileInfo[] Files = dinfo.GetFiles();
    string currentNumber = string.Empty;
    int iOffsetAboveFromPrevious = 0;
    // initialize at max to ensure not used on first iteration
    int iIndexForOffsetAbove = 727;

    foreach (FileInfo filex in Files) {
        using (var reader = new StreamReader(filex.FullName)) {
            // fill dynamicIn
            for (int i = 0; i < 727; i++) {
                dynamicIn[i] = Convert.ToInt32(reader.ReadLine());
            }

            int iCurrentOutputIndex = 0;
            // Calculate
            for (int k = 0; k < 727; k++) {
                // add each average used here - including any offset
                float iAccumulatedTotal = 0;

                // reached > 300 accumulated intervals
                int iReadCount = Convert.ToInt32(ds.Tables[0].Rows[k]["ReadCount"]);
                if (iReadCount > 0) {
                    if (k == 0) {
                        // first interval
                        int iValue = dynamicIn[k];
                        iAccumulatedTotal += iValue;
                    } else if (iReadCount > 1) {
                        // readCount == 1: just check for offsets
                        for (int m = 1; m < iReadCount; m++) {
                            int iValue = dynamicIn[k - m];
                            int iInterval = Convert.ToInt32(ds.Tables[0].Rows[k - m]["Interval"]);
                            iAccumulatedTotal += iValue * iInterval;
                        }
                    }

                    // offset - read this interval
                    int iOffsetBelow = Convert.ToInt32(ds.Tables[0].Rows[k]["OffsetBelow300"]);
                    if (iOffsetBelow > 0) {
                        iAccumulatedTotal += iOffsetBelow * dynamicIn[k];
                    }

                    // use previous offset above in this calculation
                    if (k >= iIndexForOffsetAbove) {
                        iAccumulatedTotal += iOffsetAboveFromPrevious;
                        // reset
                        iOffsetAboveFromPrevious = 0;
                        iIndexForOffsetAbove = 727;
                    }

                    // use this offset above in next calculation
                    int iOffsetAbove = Convert.ToInt32(ds.Tables[0].Rows[k]["OffsetAbove300"]);
                    if (iOffsetAbove > 0) {
                        // value for offset above to add to next accumulator
                        iOffsetAboveFromPrevious = iOffsetAbove * dynamicIn[k];
                        // use in next calculation - at a minimum
                        iIndexForOffsetAbove = k;
                    }

                    float fAverage = iAccumulatedTotal / 300;
                    int iAverage = Convert.ToInt32(iAccumulatedTotal / 300);

                    // first interval
                    if (k == 0) {
                        iAverage = dynamicIn[k];
                    }

                    // save averaged value to array for writing
                    defaultOutput[iCurrentOutputIndex] = iAverage.ToString();
                    iCurrentOutputIndex++;
                }
            }
        }

        // Print to text file for default cloudlet
        System.IO.File.WriteAllLines("C:\\Users\\scooby\\Desktop\\DefaultNewFiles\\" + filex.Name, defaultOutput);
    }
}

The code above depicts the process by which the default cloudlet file averages were calculated. The results of the calculator program were written back out to flat text files, i.e. the same format as the original CloudSim cloudlet files. To compare the default and dynamic workloads, both simulations were then run using the new cloudlet files, with a few lines of additional code added to the default constructor of the UtilizationModelPlanetlabInMemory() class in CloudSim to monitor the workload during each simulation. This

additional code ensured that the workload would be observed (and accumulated) as each

cloudlet was processed.

As the data is being read into the data[] array from the PlanetLab cloudlet files

each value is added to a new accumulator variable i.e. Constants.totalWorkload:

int n = data.length;
for (int i = 0; i < n - 1; i++) {
    data[i] = Integer.valueOf(input.readLine());
    Constants.totalWorkload += data[i];
}

The Constants.totalWorkload value was then divided by the relevant number of

intervals (i.e. Default: 288 / Dynamic: 727) to calculate the average workload per interval. A

difference of less than 1% in the per-interval workloads was observed, validating, for the

most part, the results generated for the default cloudlet files by the C# program.
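A minimal sketch of that final check, purely illustrative (the two totals would be taken from the Constants.totalWorkload accumulator at the end of each run; the values passed in below are arbitrary placeholders, not simulation results):

public class WorkloadComparison {
    // Relative difference between the per-interval workloads of the default (288 intervals)
    // and dynamic (727 intervals) simulations.
    static double perIntervalDifferencePct(double defaultTotal, double dynamicTotal) {
        double defaultPerInterval = defaultTotal / 288;
        double dynamicPerInterval = dynamicTotal / 727;
        return Math.abs(defaultPerInterval - dynamicPerInterval) / defaultPerInterval * 100;
    }

    public static void main(String[] args) {
        // Arbitrary placeholder totals, for demonstration only.
        System.out.printf("Per-interval workload difference: %.2f%%%n",
                perIntervalDifferencePct(1000.0, 2500.0));
    }
}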

The negligible difference may be explained by migrations taking place during

collection of the CPU utilization data prior to export. For example, if a workload on a VM is

calculated at 10% of its host’s capacity and then migrated to a host with a lower capacity, the

same workload would require more time to complete – skewing the average CPU utilization

that would have otherwise been calculated had the migration not taken place. This scenario

was not factored into calculation of the per-interval average CPU utilization, resulting in the

difference between the workloads of approximately 1%. This error margin was considered

acceptable in the context of the overall thesis objectives.

4.4 Interval Adjustment Code

The updateCloudletProcessing() method in the PowerDatacenter class is the

principal cloudlet processing method, run at each interval and provided by default in

CloudSim. As such it is the ideal place to position the function call to the additional code

required to implement interval adjustment.

To differentiate between the default and dynamic simulations at runtime, a constant

boolean variable (IsDefault) was created which indicates which type of simulation is being

run. Based on the value of the IsDefault variable, the code will fork either to the default

CloudSim code or the dynamic code written to adjust the monitoring interval. The fork

returns to the default CloudSim code once the AdjustInterval() method has been

executed:

if (Constants.IsDefault) {
    // run default simulation
} else {
    // run dynamic simulation
    AdjustInterval();
}

The AdjustInterval() method (outlined below) is the entry point for the dynamic

monitoring interval adjustment simulation. Figure 13 depicts how the dynamic code interacts

with the CloudSim default:

protected void AdjustInterval(double currentTime) {
    double dTotalUsageForAverage = 0;
    double dAverageUsage = 0;
    int iDenominator = 0;
    int iWeight = 0;
    double timeDiff = currentTime - getLastProcessTime();

    for (PowerHost host : this.<PowerHost> getHostList()) {
        double utilizationOfCpu = host.getUtilizationOfCpu();
        if (utilizationOfCpu > 0) {
            iWeight = GetWeight(utilizationOfCpu);
            dTotalUsageForAverage += utilizationOfCpu * iWeight;
            iDenominator += iWeight;
        }
    }

    dAverageUsage = dTotalUsageForAverage / iDenominator;

    // alter scheduling interval according to average utilization
    SetSchedulingIntervalRelativeToUtilization(dAverageUsage);
}


Figure 13 How the dynamic interval adjustment code interacts with CloudSim


A host which is not running would have a CPU utilization of 0. As depicted by the code, only

hosts with CPU utilization greater than 0 will be included in the average CPU utilization for

the DC i.e.

if(utilizationOfCpu > 0)

A weighting is then applied (helper function: GetWeight() - below) to each result obtained.

This weighting (cf. Table 4.1) is based on the CPU utilization calculated for each host by the

getUtilizationOfCpu() method provided by default in CloudSim:

public int GetWeight(double utilization) {
    double iUtilization = utilization * 100;
    int iWeight = 0;

    // check utilization value range
    if (iUtilization >= 0.00 && iUtilization <= 10.00) {
        iWeight = 1;
    } else if (iUtilization > 10.00 && iUtilization <= 20.00) {
        iWeight = 2;
    } else if (iUtilization > 20.00 && iUtilization <= 30.00) {
        iWeight = 3;
    } else if (iUtilization > 30.00 && iUtilization <= 40.00) {
        iWeight = 4;
    } else if (iUtilization > 40.00 && iUtilization <= 50.00) {
        iWeight = 5;
    } else if (iUtilization > 50.00 && iUtilization <= 60.00) {
        iWeight = 6;
    } else if (iUtilization > 60.00 && iUtilization <= 70.00) {
        iWeight = 7;
    } else if (iUtilization > 70.00 && iUtilization <= 80.00) {
        iWeight = 8;
    } else if (iUtilization > 80.00 && iUtilization <= 90.00) {
        iWeight = 9;
    } else if (iUtilization > 90.00 && iUtilization <= 100.00) {
        iWeight = 10;
    }

    return iWeight;
}

The average utilization for the DC is then passed to another helper function

(SetSchedulingIntervalRelativeToUtilization() – shown below) which will

adjust the next monitoring interval (i.e. Constants.SCHEDULING_INTERVAL) based on the

range within which the utilization falls.

public void SetSchedulingIntervalRelativeToUtilization(double dAverageUsage) {
    double iUtilization = dAverageUsage * 100;
    double dInterval = 300;

    if (iUtilization >= 0.00 && iUtilization <= 10.00) {
        dInterval = 300;
    } else if (iUtilization > 10.00 && iUtilization <= 20.00) {
        dInterval = 270;
    } else if (iUtilization > 20.00 && iUtilization <= 30.00) {
        dInterval = 240;
    } else if (iUtilization > 30.00 && iUtilization <= 40.00) {
        dInterval = 210;
    } else if (iUtilization > 40.00 && iUtilization <= 50.00) {
        dInterval = 180;
    } else if (iUtilization > 50.00 && iUtilization <= 60.00) {
        dInterval = 150;
    } else if (iUtilization > 60.00 && iUtilization <= 70.00) {
        dInterval = 120;
    } else if (iUtilization > 70.00 && iUtilization <= 80.00) {
        dInterval = 90;
    } else if (iUtilization > 80.00 && iUtilization <= 90.00) {
        dInterval = 60;
    } else {
        dInterval = 30;
    }

    setSchedulingInterval(dInterval);
    Constants.SCHEDULING_INTERVAL = dInterval;
}

The process then returns to the AdjustInterval() method which returns control back to

the default CloudSim code where the fork began. The default CloudSim code continues,

completing the (per-interval) updateCloudletProcessing() method and continuing the

simulation into the next interval, the length of which has now been adjusted with respect to

the predicted average CPU utilization for the DC.

4.5 Reporting

The metrics available in the default CloudSim reports (as described in Section 3.10) were

found to be sufficient for the purposes of the testing phase of this research (cf. Chapter 5).

However, some additional variables were needed during the design phase of the new

algorithm to adjust the monitoring interval. These were added as static Constants so that

they would be globally available across a range of classes and could be used without

requiring instantiation of any new objects. Most are associated with calculation of the per-

interval workload for the default PlanetLab cloudlet files. They include:

public static int fileBeingRead = 0;
public static boolean IntervalGenerator = false;
public static int previousIntervalCount = 0;
public static int intervalCount = 0;
public static int offsetBelow = 0;
public static int offsetAbove = 0;
public static int accumulatedOffsetTotal = 0;
public static int intervalLengthTotal = 1;


The list below depicts a typical trace file for the default CloudSim LR/MMT simulation. It contains the output of the CloudSim reporting class, i.e. Helper(). It also includes some of the additional metrics added for the purpose of this research (e.g. Overutilized Hosts and Total Workload):

Experiment name: default_lr_mmt_1.2

Number of hosts: 80

Number of VMs: 105

Total simulation time: 86100.00 sec

Energy consumption: 16.76 kWh

Overutilized Hosts: 2249

Number of VM migrations: 2305

Total Workload: 3833.310000

SLA: 0.00428%

SLA perf degradation due to migration: 0.07%

SLA time per active host: 5.97%

Overall SLA violation: 0.53%

Average SLA violation: 11.61%

SLA time per host: 0.05%

Number of host shutdowns: 1184

Mean time before a host shutdown: 627.78 sec

StDev time before a host shutdown: 1443.06 sec

Mean time before a VM migration: 17.12 sec

StDev time before a VM migration: 7.67 sec

4.6 Conclusion

This chapter has discussed the modifications required to CloudSim to implement dynamic

adjustment of the monitoring interval. A description of how the new code integrates with the

default code provided by CloudSim was also provided. In Chapter 5 the tests carried out to

compare the default with the dynamic simulations are described and the results analysed. Finally, potential opportunities for improvement of the CloudSim framework, identified during the course of this research, are suggested.


Chapter 5 Tests, Results & Evaluation

Introduction

Chapter 4 detailed the code changes that were required in the CloudSim framework to

implement dynamic adjustment of the monitoring interval. The new code integrates

seamlessly with the existing framework. No alterations were made to the underlying

CloudSim architecture. This chapter deals with the specifics of the simulations carried out to

test the hypothesis that opportunities for reduction of power consumption can be identified

when the length of the interval changes with respect to the varying workload experienced by

a typical DC.

5.1 Tests & Results

Using the dynamic PlanetLab cloudlet files and interval adjustment code, the simulation

was run for the full duration (i.e. 86100 seconds) and compared with the CloudSim default

which used the cloudlet files generated by the C# calculator. Key results are presented in Table 5.1: a significant reduction in over-utilized hosts, migrations and power consumption was observed.

Table 5.1: Simulation Results

        | Interval (seconds)  | Time (seconds) | Interval Count | Over-utilized Hosts | Migration Count | Power Consumption (kWh)
Static  | 300                 | 86100          | 287            | 2249                | 2305            | 16.76
Dynamic | 30 – 270 (variable) | 86100          | 727            | 1697                | 979             | 8.23

Figure 14 depicts the intervals that were calculated during the dynamic simulation based on

the average CPU utilization for the DC. It can be seen that the interval ranges from a minimum of 30 seconds to a maximum of 270 seconds - indicating that the weighted average CPU utilization for the DC rose above 90% at times (triggering the 30-second minimum) but never fell to 10% or below (which would have triggered the 300-second maximum). From Figure 14, and as


described in Chapter 4, the number of intervals in the dynamic simulation is 727, compared

with 288 in the static simulation.

Figure 14 Interval Calculation for the Dynamic Simulation

Figure 15 shows a comparison of the VM count during both simulations - indicating that

VMs are being constantly ‘de-commissioned’ as their workloads are completed during the

simulation. There are 34 VMs still running at the end of the default simulation whereas there

are only 6 VMs which have not completed their workloads at the end of the dynamic. This

indicates that the VM placement algorithm has performed more efficiently in the dynamic

simulation i.e. more of the PlanetLab workload from the cloudlet files has been completed

by the time the dynamic simulation has finished.


Figure 15 VM decommissioning comparison

Comparing Figures 16 & 17, which depict the operational hosts at each interval in the default

and dynamic simulations, it can be seen that more efficient use is made of the hosts when the

interval is adjusted dynamically. A minimal number of operational servers is achieved sooner

in the dynamic simulation and the power-on / power-off behaviour of the default simulation

(which consumes both time and energy) is primarily absent from the dynamic. This is

discussed further in Section 5.2.1 below.


Figure 16 Operational Hosts - Default Simulation

Figure 17 Operational Hosts - Dynamic Simulation


Figure 18 depicts the per-interval average CPU utilization in the DC for the dynamic

simulation. A cluster of values can be seen at approximately 99% between 17 – 19 hours in

the dynamic simulation. There is a single operational host in this time period (Figure 17) with

a range of 9 - 19 VMs running on it. The high average CPU utilization is as a direct result of

all the remaining VMs being placed on this single host. The VM placement algorithm is most

efficient at this point in the dynamic simulation from a power consumption perspective,

optimizing energy efficiency by minimizing the number of operational hosts required to

service the DC workload. This placement configuration would not be possible if the

PlanetLab cloudlet workloads were higher. It is the relatively low CPU values being

allocated to the VMs from the cloudlet files that make the placement on a single host in this

time period possible.

Figure 18 Average CPU Utilization - Dynamic Simulation

5.2 Evaluation of Test Results

In Section 5.2 the results summarized above are investigated, based on an understanding of

CloudSim as derived from code, code comments and help documentation / user forums. It

explains the research findings using an ‘under-the-hood’ analysis. Section 5.3 identifies a

number of limitations in the CloudSim framework which would benefit from further


investigation and is tailored more towards future researchers using the CloudSim framework

than DC operators.

5.2.1 Why are the Over-Utilized Hosts / Migration Counts Reduced?

CloudSim performs migrations when an over-utilized host is identified. One or more VMs are

chosen for migration to bring the CPU utilization of the host back below the over-utilization

threshold. It is not explicitly clear from the CloudSim metrics available why the migration

count is reduced in dynamic mode relative to static mode. The VM placement algorithm

defined by CloudSim is a complex module within the framework. The logic upon which it

works is that the most appropriate destination for the migrating VM is the one which results

in the lowest power consumption post-migration. However, it is clear from the operational

hosts observed in the dynamic simulation (depicted in Figure 17) that the VM placement

algorithm is also performing consolidation (c.f. Sections 2.4.1, 3.3 & 5.1 above).

During the period when only 1 host is operational in the dynamic simulation (Figure

17: 17 – 24 hours) it was observed that there were as many as 19 VMs running on that host

(c.f. Section 5.1). As a result, some over-allocation is occurring. Over-allocation occurs when so many VMs are placed on a host that the host has insufficient capacity to service the workload of every VM in each time frame. In the effort to consolidate, VMs will sometimes be placed

on a host which is currently running, rather than switching on a new host. The effect is that,

due to the increased length of the CPU queue on the host (i.e. more VMs are ‘waiting’ for

processing time slices), some VMs will not receive the CPU cycles required to complete their

workload in the available interval. The expected action would be migration of the ‘starved’

VMs but it is evident (from Figure 17) that no migrations are taking place i.e. no other host is

switched on. This is due to one of the limitations identified in the framework - that CloudSim

only performs a VM migration when the entire host is over-utilized – not when an individual

VM requires more capacity (c.f. Section 5.2.7 below). Clearly, there is a trade-off between

consolidation and migration. The conclusion reached, based on the results observed, is that

the reduced intervals in the dynamic simulation result in more frequent analysis, performing

this consolidation / migration trade-off more efficiently than the default simulation, resulting

in fewer over-utilized hosts and a reduced migration count.
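As a simplified numerical illustration of over-allocation (the capacity and demand figures are hypothetical, not taken from the simulation): a host providing $2 \times 1860 = 3720$ MIPS that is hosting 19 VMs each demanding 500 MIPS faces an aggregate demand of $19 \times 500 = 9500$ MIPS, so roughly $\frac{9500 - 3720}{9500} \approx 61\%$ of the requested CPU cycles cannot be serviced in that time frame and the corresponding cloudlet processing is deferred.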


5.2.2 Result of Reduced Migration Count

Beloglazov et al. [51] show that decreased power consumption can be achieved in a DC if the

VM migration count can be reduced. Their work is based on the premise that additional

resources are consumed during a migration due to the extra processing required to move the

memory of the VM from its current host to another. Those processes may include:

Identification of a suitable destination server i.e. VM placement algorithm

Network traffic

CPU processing on both source and destination servers whilst concurrently running

two VMs

In the case of live migration, transfer of the VM’s memory image is performed by the Virtual

Machine Manager (VMM) which copies the RAM, associated with the VM service, across to

the destination while the service on the VM is still running. RAM which is re-written on the

source must be transferred again. This process continues iteratively until the remaining

volume of RAM needing to be transferred is such that the service can be switched off with

minimal interruption. This period of time, while the service is unavailable, is known as

downtime. Any attempt to improve migration algorithms must take live-copy downtime into consideration to prevent (or minimize) violations of response-time SLAs. CloudSim achieves this (to some

extent) by choosing, for migration, the VM with the lowest RAM. However, the CloudSim

SLA metric in the modules used for this research does not take this downtime into

consideration.

Dynamic adjustment of the monitoring interval, however, minimizes this issue of

RAM transfer by reducing the need for the migration in the first place. The power consumed

as a result of the migrations is saved when additional migrations are not required.

5.2.3 Scalability

As outlined in Section 2.1 there is a trade-off between DC monitoring overhead costs and net

DC benefits. The issue here is that the additional volume of processing which takes place

when shorter monitoring intervals are applied may become such that it would not be

beneficial to apply dynamic interval adjustment at all.


Take for example, Amazon’s EC2 EU West DC (located in Dublin, Ireland) which is

estimated to contain over 52,000 operational servers [52]. Processing of the data (CPU

utilization values) required to perform the interval adjustment is not an insignificant

additional workload. The algorithm will calculate the average CPU utilization of some 52,000

servers and apply the new interval. As such, if this calculation were to take place every 30

seconds (in a DC with an average CPU utilization above 90%), rather than every 300

seconds, there is a ten-fold increase in the total processing volume which includes both

collection and analysis of the data points. While it is unlikely that even the average CPU

utilization of the most efficient DC would exceed 90% for any extended period of time, it is

clear that the size of the DC (i.e. number of operational servers) does play a role in

establishing whether or not the interval adjustment algorithm described in this research

should be applied. Microsoft’s Chicago DC has approximately 140,000 servers installed.

With increasingly larger DCs being built to meet growing consumer demand, it is reasonable

to expect that DC server counts will reach 500,000 in the foreseeable future. Rather than

viewing the entire DC as a single entity from a monitoring perspective, perhaps the most

viable application of dynamic monitoring interval adjustment would be to sub-divide these

larger DCs into more manageable sections, calculating the monitoring interval for each

section separately.
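As a rough back-of-the-envelope illustration of the monitoring volumes involved (assuming, for simplicity, one utilization reading per server per interval): at the default 300-second interval a 52,000-server DC generates $52{,}000 \times 288 \approx 1.5 \times 10^{7}$ readings per day, whereas at the 30-second minimum it generates $52{,}000 \times 2{,}880 \approx 1.5 \times 10^{8}$ - the ten-fold increase referred to above.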

5.3 Evaluation of CloudSim

5.3.1 Local Regression Sliding Window

The adjusted interval in this research (discussed in Section 3.5) results in the 'first fill' of the window occurring sooner than in the CloudSim default, since the longest interval in the dynamic version (5 minutes) is equal to the fixed (static) interval used by the CloudSim default.

The first 10 values in the sliding window take 3000 seconds (i.e. 300 x 10) in the default

CloudSim simulation whereas, in the dynamic version, the window is filled after 1480

seconds. The result is a small increase in the accuracy of the utilization prediction at the

beginning of the simulation because the less accurate ‘fallback’ algorithm is ‘discarded’

sooner.

The size of the sliding window in the default CloudSim framework is 10 i.e. the 10 most

recent CPU utilization values from the host are used each time the local regression algorithm


is performed. If there were more values in the window, the algorithm would be less sensitive

to short-term changes in the workload. Clearly the size of the sliding window should be

proportionate to the level of sensitivity required. The choice of this parameter would most

likely benefit from detailed sensitivity analysis.

5.3.2 RAM

Chapter 3 detailed the configuration settings provided by default in CloudSim in an effort to

simulate ‘real-world’ VMs. However, for reasons unclear from the code, two of the VM types

have the same RAM applied to them i.e. 1740. It would be preferable if either:

4 distinct VM types were configured to better reflect ‘real-world’ scenarios and

improve the VM selection policy deployed by default

Provision for dynamic RAM adjustment was included in the CloudSim framework

(c.f. Section 5.3.3 below).

5.3.3 Dynamic RAM Adjustment

No facility is provided in CloudSim to adjust the amount of RAM available to a VM while

the simulation proceeds. A migration to another host with a higher-capacity VM is required

in CloudSim should a VM require more RAM. While this simulates many ‘real-world’

systems, the facility to dynamically adjust the amount of RAM allocated to a VM (without

requiring migration) would improve the VM selection algorithm.

5.3.4 SLA-based Migration

The basis upon which a migration takes place in the CloudSim module used for this research

is an over-utilized host. If a VM requires additional RAM to service its workload it must be

migrated to another host where a larger VM can be configured to run the workload. However

the module does not facilitate SLA-based migration. Rather, only VMs on a host which is

over-utilized are migrated. This is a significant limitation in the design of CloudSim. Even

in this scenario, the VMs which need additional RAM may not be the ones migrated, because the selection algorithm chooses the VM with the least RAM (i.e. the fewest pages to transfer) first. This will typically leave ‘starved’ VMs on the source host - still


requiring additional RAM. Clearly, some improvement is required in the VM selection and

allocation policies deployed by the default CloudSim framework.


Chapter 6 Conclusions

In order to identify a potentially novel energy efficiency approach for virtualized DCs, a large

part of the research effort in this thesis was dedicated to evaluating the current state-of-the-

art. On completion of this investigative phase it was decided to focus on opportunities

relating to DC management software i.e. virtualization. Following this, the concept of a

dynamic monitoring interval was then proposed. Once CloudSim had been identified as the

most accessible framework in which to build a test bed, a significant amount of time was

spent reviewing the existing code to establish the capabilities (and limitations) of the

framework.

The dynamic simulation presented in this thesis is differentiated from the default

LR/MMT CloudSim in that the duration of the next interval is adjusted (with respect to the

weighted average of the DC CPU utilization) rather than maintaining a static interval of

300 seconds which is the standard monitoring interval used in commercial applications (e.g.

VMWare, Citrix Xen).

The primary aim of this research (as outlined in the introductory chapter) was to

determine the impact on power consumption of dynamically adjusting the monitoring

interval. When the interval shortens, analysis of DC metrics is performed more often, making the management layer more responsive to changes in CPU utilization. The focus of this research was the over-utilization

threshold. In calculating the average CPU utilization for the DC, shorter intervals are applied

as the average utilization rate increases. Results indicated that power consumption could be

reduced when the monitoring interval is adjusted with respect to the incoming workload. As

indicated, future work should also examine the potential for reduced power consumption as

the average CPU utilization for the DC approaches some under-utilization threshold. This

would improve the CloudSim VM placement algorithm, providing a more accurate

simulation of the server consolidation efforts used by industry.

In addition, this research had a secondary objective – to evaluate the efficacy of

CloudSim as a simulator for power-aware DCs. During the course of reviewing existing code

and writing new modules the specific issues (outlined above) were found to exist in the

CloudSim framework. Discovery and documentation of them in this thesis will undoubtedly


prove both informative and useful for researchers undertaking CloudSim-based simulations in

the future.

Recent reports suggest that Microsoft’s Chicago DC has approximately 140,000

servers installed [53]. With increasingly larger DCs being built to meet growing consumer

demand, it is reasonable to expect that individual DC server counts will reach 250,000 -

500,000 in the foreseeable future. Rather than viewing the entire DC as a single entity from a

monitoring perspective, perhaps the most viable application of dynamic monitoring interval

adjustment would be to sub-divide these larger DCs into more manageable sections,

calculating (and adjusting) the monitoring interval for each section separately – ensuring that

the granularity of analysis most appropriately caters for all possible data center sizes and

configurations. This analysis would also make a valuable contribution to the state-of-the-art.


REFERENCES

[1] http://www.gartner.com/newsroom/id/499090 - last accessed on 19/09/2014

[2] http://www.idc.com - last accessed on 19/09/2014

[3] Koomey J.G., “Estimating Total Power Consumption by Servers in the U.S. and the

World”, 2007

[4] Energy Star Program - U.S. Environmental Protection Agency, “EPA Report to Congress

on Server and Data Center Energy Efficiency”, EPA, Aug 2007.

[5] N. Rasmussen, “Calculating Total Cooling Requirements for Data Centers,” American

Power Conversion, White Paper #25, 2007.

[6] Sharma, R.K.; Bash, C.E.; Patel, C.D.; Friedrich, R.J.; Chase, J.S., "Balance of Power:

Dynamic Thermal Management for Internet Data Centers," Internet Computing, IEEE, vol.9,

no.1, pp. 42-49, January 2005

[7] Moore, J.; Sharma, R.; Shih, R.; Chase, J.; Patel, C.; Ranganathan, P., “Going Beyond

CPUs: The Potential of Temperature-Aware Solutions for the Data Center”, Hewlett Packard

Labs, 2002

[8] Data Center Efficiency Task Force, “Recommendations for Measuring and Reporting

Version 2 – Measuring PUE for Data Centers”, 17th May 2011

[9] C. Belady, A., Rawson, J. Pfleuger, and T., Cader, "Green Grid Data Center Power

Efficiency Metrics: PUE and DCIE," The Green Grid, 2008

[10] Koomey J.G., “Growth in Data Center Electricity Use 2005 to 2010”, report by

Analytics Press, completed at the request of The New York Times, August 2011

[11] The Uptime Institute, “Inaugural Annual Uptime Institute Data Center Industry Survey”,

Uptime Institute, May 2011

[12] ASHRAE, “Datacom Equipment Power Trends and Cooling Applications”, ASHRAE

INC, 2005


[13] ASHRAE, "Environmental Guidelines for Datacom Equipment - Expanding the

Recommended Environmental Envelope", ASHRAE INC, 2008

[14] ASHRAE, “Thermal Guidelines for Data Processing Environments – Expanded Data

Center Classes and Usage Guidance”, ASHRAE INC, August 2011

[15] Boucher, T.D.; Auslander, D.M.; Bash, C.E.; Federspiel, C.C.; Patel, C.D., "Viability of

Dynamic Cooling Control in a Data Center Environment," Thermal and Thermomechanical

Phenomena in Electronic Systems, 2004. ITHERM '04. The Ninth Intersociety Conference

on, pp. 593- 600 Vol. 1, 1-4 June 2004

[16] Sharma, R.K.; Bash, C.E.; Patel, C.D.; Friedrich, R.J.; Chase, J.S.; , "Balance of Power:

Dynamic Thermal Management for Internet Data Centers," Internet Computing, IEEE , vol.9,

no.1, pp. 42- 49, Jan.-Feb. 2005

[17] Shah, A.; Patel, C.; Bash, C.; Sharma, R.; Shih, R.; , "Impact of Rack-level Compaction

on the Data Center Cooling Ensemble," Thermal and Thermomechanical Phenomena in

Electronic Systems, 2008. ITHERM 2008. 11th Intersociety Conference on, pp.1175-1182,

28-31 May 2008

[18] C. Patel, et al., “Energy Flow in the Information Technology Stack: Coefficient of

Performance of the Ensemble and its Impact on the Total Cost of Ownership,” Technical

Report No. HPL-2006-55, Hewlett Packard Laboratories, March 2006

[19] C. Patel, et al., “Energy Flow in the Information Technology Stack: Introducing the

Coefficient of Performance of the Ensemble,” Proc. ASME IMECE, November 2006

[20] Ahuja, N.; Rego, C.; Ahuja, S.; Warner, M.; Docca, A.; "Data Center Efficiency with

Higher Ambient Temperatures and Optimized Cooling Control," Semiconductor Thermal

Measurement and Management Symposium (SEMI-THERM), 2011 27th Annual IEEE,

pp.105-109, 20-24 March 2011

[21] Berktold, M.; Tian, T., “CPU Monitoring With DTS/PECI”, Intel Corporation,

September 2010

[22] M. Stopar, SLA@SOI XLAB, Efficient Distribution of Virtual Machines, March 24,

2011.


[23] C. Hyser, B. McKee, R. Gardner, and B. Watson. Autonomic Virtual Machine

Placement in the Data Center. Technical Report HPL-2007-189, HP Laboratories, Feb. 2008.

[24] A. Qureshi, R. Weber, H. Balakrishnan, J. Guttag, and B. Maggs, “Cutting the Electric

Bill for Internet-Scale Systems,” in Proc. ACM Conference on Data Communication

(SIGCOMM’09), New York, NY, USA, 2009, pp. 123–134

[25] Bolin Hu; Zhou Lei; Yu Lei; Dong Xu; Jiandun Li; , "A Time-Series Based Precopy

Approach for Live Migration of Virtual Machines," Parallel and Distributed Systems

(ICPADS), 2011 IEEE 17th International Conference on , vol., no., pp.947-952, 7-9 Dec.

2011

[26] Carroll, R, Balasubramaniam, S, Botvich, D and Donnelly, W, Application of Genetic

Algorithm to Maximise Clean Energy usage for Data Centers, to appear in proceedings of

Bionetics 2010, Boston, December 2010

[27] Akshat Verma, Puneet Ahuja, Anindya Neogi, “pMapper: Power and Migration Cost

Aware Application Placement in Virtualized Systems”, Middleware 2008: 243-264

[28] P. Riteau, C. Morin, T. Priol, “Shrinker: Efficient Wide-Area Live Virtual Machine

Migration using Distributed Content-Based Addressing,”

http://hal.inria.fr/docs/00/45/47/27/PDF/RR-7198.pdf, 2010

[29] F. Travostino, P. Daspit, L. Gommans, C. Jog, C. de Laat, J. Mambretti, I. Monga, B.

van Oudenaarde, S. Raghunath, and P. Wang. Seamless Live Migration of Virtual Machines

over the MAN/WAN. iGrid, 2006

[30] Hai Jin, Li Deng, Song Wu, Xuanhua Shi, and Xiaodong Pan. Live virtual machine

migration with adaptive memory compression. In Cluster, 2009

[31] Jonghyun Lee, Marianne Winslett, Xiaosong Ma, and Shengke Yu. Enhancing Data

Migration Performance via Parallel Data Compression. In Proceedings of the 16th

International Parallel and Distributed Processing Symposium (IPDPS), pages 47–54, April

2002

[32] M. R. Hines and K. Gopalan, “Post-copy based live virtual machine migration using

adaptive pre-paging and dynamic self-ballooning,” in Proceedings of the ACM/Usenix

international conference on Virtual execution environments (VEE’09), 2009, pp. 51–60


[33] Bose, S.K.; Brock, S.; Skeoch, R.; Rao, S.; , "CloudSpider: Combining Replication with

Scheduling for Optimizing Live Migration of Virtual Machines across Wide Area

Networks," Cluster, Cloud and Grid Computing (CCGrid), 2011 11th IEEE/ACM

International Symposium on , vol., no., pp.13-22, 23-26 May 2011

[34] Cioara, T.; Anghel, I.; Salomie, I.; Copil, G.; Moldovan, D.; Kipp, A.; , "Energy Aware

Dynamic Resource Consolidation Algorithm for Virtualized Service Centers Based on

Reinforcement Learning," Parallel and Distributed Computing (ISPDC), 2011 10th

International Symposium on , vol., no., pp.163-169, 6-8 July 2011

[35] H. Liu, H. Jin, X. Liao, L. Hu, and C. Yu, “Live migration of virtual machine based on

full system trace and replay,” in Proceedings of the 18th International Symposium on High

Performance Distributed Computing (HPDC’09), 2009, pp. 101–110.

[36] http://www.drbd.org - last accessed on 19/09/2014

[37] K. Nagin, D. Hadas, Z. Dubitzky, A. Glikson, I. Loy, B. Rochwerger, and L. Schour,

“Inter-Cloud Mobility of Virtual Machines,” in Proc. of 4th Int’l Conf. on Systems & Storage

(SYSTOR). ACM, 2011, pp. 3:1–3:12.

[38] R. Bradford, E. Kotsovinos, A. Feldmann, and H. Schiöberg. Live Wide-Area Migration of Virtual Machines including Local Persistent State. In VEE '07: Proceedings of the 3rd International Conference on Virtual Execution Environments, pages 169-179, New York, NY, USA, 2007. ACM.

[39] C. Clark, K. Fraser, S. Hand, J. Hansen, E. Jul, C. Limpach, I. Pratt, A. Warfield. Live

Migration of Virtual Machines. in Proceedings of the Symposium on Networked Systems

Design and Implementation, 2005.

[40] VMware vSphere® vMotion®, Architecture, Performance and Best Practices in

VMware vSphere® 5. Performance Study, Technical White Paper, Oct 2011.

[41] Voorsluys W., Broberg J., Venugopal S., Buyya R.: Cost of Virtual Machine Live

Migration in Clouds: a Performance Evaluation. In: Proceedings of the 1st International

Conference on Cloud Computing. Vol. 2009. Springer (2009)

[42] Buyya, R., Ranjan, R., Calheiros, R. N.: Modeling and Simulation of Scalable Cloud

Computing Environments and the CloudSim Toolkit: Challenges and Opportunities. In: High


Performance Computing & Simulation, 2009. HPCS'09. International Conference on, pp. 1-

11. IEEE (2009)

[43] Xu, Y., Sekiya, Y.: Scheme of Resource Optimization using VM Migration for

Federated Cloud. In: Proceedings of the Asia-Pacific Advanced Network, vol. 32, pp. 36-44.

(2011)

[44] Takeda, S., and Toshinori T.: A Rank-Based VM Consolidation Method for Power

Saving in Data Centers. IPSJ Online Transactions, vol. 3 pp. 88-96. J-STAGE (2010)

[45] Xu, L., Chen, W., Wang, Z., Yang, S.: Smart-DRS: A Strategy of Dynamic Resource

Scheduling in Cloud Data Center. In: Cluster Computing Workshops (CLUSTER

WORKSHOPS), IEEE International Conference on, pp. 120-127. IEEE (2012)

[46] Gmach, D., Rolia, J., Cherkasova, L., Kemper, A.: Workload Analysis and Demand

Prediction of Enterprise Data Center Applications. In: Workload Characterization, 2007.

IISWC 2007. IEEE 10th International Symposium on, pp. 171-180. IEEE (2007)

[47] VMware,

http://pubs.vmware.com/vsphere-4-esx-vcenter/index.jsp?topic=/com.vmware.vsphere.bsa.do

c_40/vc_perfcharts_help/c_perfcharts_collection_intervals.html - last accessed on 19/09/2014

[48] Chandra, A., W. Gong, et al. (2003). Dynamic Resource Allocation for Shared Data

Centers Using Online Measurements. Proceedings of the Eleventh International Workshop on

Quality of Service (IWQoS 2003), Berkeley, Monterey, CA, Springer. pp. 381-400.

[49] M. Aron, P. Druschel, and S. Iyer. A Resource Management Framework for Predictable

Quality of Service in Web Servers, 2001.

[50] J. Carlstrom and R. Rom. Application-Aware Admission Control and Scheduling in Web

Servers. In Proceedings of the IEEE Infocom 2002, June 2002.

[51] Beloglazov, A., and Rajkumar B.: Energy Efficient Resource Management in

Virtualized Cloud Data Centers. Proceedings of the 2010 10th IEEE/ACM International

Conference on Cluster, Cloud and Grid Computing. IEEE Computer Society, 2010.

[52] http://huanliu.wordpress.com/2012/03/13/amazon-data-center-size - last accessed on 19/09/2014


[53] http://www.datacenterknowledge.com/archives/2009/06/29/microsoft-to-open-two-massive-data-centers - last accessed on 19/09/2014


APPENDIX A

The code which checks if a host is over-utilized. The ‘fallback’ algorithm is used until the sliding window (length = 10) has been filled.

@Override
protected boolean isHostOverUtilized(PowerHost host) {
    PowerHostUtilizationHistory _host = (PowerHostUtilizationHistory) host;
    double[] utilizationHistory = _host.getUtilizationHistory();
    int length = 10;
    if (utilizationHistory.length < length) {
        return getFallbackVmAllocationPolicy().isHostOverUtilized(host);
    }
    double[] utilizationHistoryReversed = new double[length];
    for (int i = 0; i < length; i++) {
        utilizationHistoryReversed[i] = utilizationHistory[length - i - 1];
    }
    double[] estimates = null;
    try {
        estimates = getParameterEstimates(utilizationHistoryReversed);
    } catch (IllegalArgumentException e) {
        return getFallbackVmAllocationPolicy().isHostOverUtilized(host);
    }
    double migrationIntervals = Math.ceil(getMaximumVmMigrationTime(_host) / Constants.SCHEDULING_INTERVAL);
    double predictedUtilization = estimates[0] + estimates[1] * (length + migrationIntervals);
    predictedUtilization *= getSafetyParameter();

    addHistoryEntry(host, predictedUtilization);

    if (predictedUtilization >= 1) {
        Constants.OverUtilizedHostsThisInterval++;
    }
    return predictedUtilization >= 1;
}


APPENDIX B

The main() method of the LRMMT algorithm.

public static void main(String[] args) throws IOException {
    boolean enableOutput = true;
    boolean outputToFile = true;
    String inputFolder = "C:\\Users\\scooby\\Desktop\\Eclipse\\CloudSim\\examples\\workload\\planetlab";

    // default workload generated from dynamic averages
    String workload = "default";

    // dynamic workload
    if (!Constants.IsDefault) {
        workload = "dynamic"; // PlanetLab workload
    }

    if (Constants.IntervalGenerator) {
        Constants.SIMULATION_LIMIT = 86400;
    }

    String outputFolder = "C:\\Users\\scooby\\Desktop\\Eclipse\\Workspace\\output\\";
    String vmAllocationPolicy = "lr";  // Local Regression (LR) VM allocation policy
    String vmSelectionPolicy = "mmt";  // Minimum Migration Time (MMT) VM selection policy
    String parameter = "1.2";          // the safety parameter of the LR policy

    new PlanetLabRunner(enableOutput, outputToFile, inputFolder, outputFolder, workload,
            vmAllocationPolicy, vmSelectionPolicy, parameter);
}
