Modeling and Optimization of Resource Allocation in Cloud [PhD Thesis Proposal]

34
Introduction Main Topics of the Thesis Methods and Techniques Time Plan Conclusion Modeling and Optimization of Resource Allocation in Cloud PhD Thesis Proposal Atakan Aral Thesis Advisor: Asst. Prof. Dr. Tolga Ovatman Istanbul Technical University – Department of Computer Engineering June 25, 2014 1 / 40

description

As cloud services continue to grow rapidly, the need for more efficient utilization of resources and better optimized distribution of workload emerges. Our aim is to provide a solution to this optimization problem from three perspectives: (1) Infrastructure level; (2) Application level; and (3) Platform level. In infrastructure level, the problem is to distribute virtual machines (VMs) to the datacenters (DCs) with geographical locations in such a way that latency is minimized while WAN bandwidth and datacenter capacity limits are respected. We plan to model latency as a function of DC load, inter-DC communication and proximity to user. Both VM requests and cloud infrastructure can be represented by graphs where vertices correspond to machines (either physical or virtual) and edges correspond to network connections between them. Our approach will employ weighted graph similarity and subgraph matching to suggest an efficient placement or the list of migrations to reach an efficient placement. VM request is an input to our optimization in infrastructure level but the requested number and capacity of VMs as well as their inter-connections may not be optimal for a specific cloud service. In application level, on the other hand, we plan to deal with the configuration of the MapReduce programming model (More specifically one of its implementations: Apache Hadoop) especially in terms of number of maps and reduces. Previous work has shown that the optimum configuration depends on the resource consumption of the cloud service and it can be determined by executing a fraction to generate a signature. Our aim is to determine the number of maps and reduces statically by analyzing the design model, source code and/or the input data. Furthermore, in platform level we aim to analyze the shuffle and scheduling algorithms of the MapReduce system to achieve a better load balancing and less amount communication between nodes. Shuffle is the phase where the output of all map processors with the same key are assigned to the same reduce processor. We will carry out a cost based formal analysis of the existing Hadoop schedulers to determine which scheduler is more appropriate for given costs of map, reduce and shuffle. In summary, a holistic approach will be carried out to improve the performance of cloud software by optimizing resource allocation. Intended optimizations are on infrastructure level (Resource Selection and Assignment), application level (Map Reduce Configuration) and platform level (Work Distribution and Load Balancing). Cloud environment and resource allocation problem will be modeled based on graphs and automata, and the solutions will be within this context as well.

Transcript of Modeling and Optimization of Resource Allocation in Cloud [PhD Thesis Proposal]

Page 1: Modeling and Optimization of Resource Allocation in Cloud [PhD Thesis Proposal]

IntroductionMain Topics of the ThesisMethods and Techniques

Time PlanConclusion

Modeling and Optimization of Resource Allocation in CloudPhD Thesis Proposal

Atakan Aral

Thesis Advisor: Asst. Prof. Dr. Tolga Ovatman

Istanbul Technical University – Department of Computer Engineering

June 25, 2014

1 / 40

Page 2: Modeling and Optimization of Resource Allocation in Cloud [PhD Thesis Proposal]

IntroductionMain Topics of the ThesisMethods and Techniques

Time PlanConclusion

Outline

1 Introduction

2 Main Topics of the Thesis

3 Methods and TechniquesMapReduce ConfigurationResource Selection and OptimizationWork Distribution to Resources

4 Time Plan

5 Conclusion

2 / 40

Page 3: Modeling and Optimization of Resource Allocation in Cloud [PhD Thesis Proposal]

IntroductionMain Topics of the ThesisMethods and Techniques

Time PlanConclusion

Cloud Computing

Definition

Applications and services that run on a distributed network using virtualizedresources and accessed by common Internet protocols and networking standards.

Broad network access: Platform-independent, via standard methodsMeasured service: Pay-per-use, e.g. amount of storage/processing power,number of transactions, bandwidth etc.On-demand self-service: No need to contact provider to provision resourcesRapid elasticity: Automatic scale up/out, illusion of infinite resourcesResource pooling: Abstraction, virtualization, multi-tenancy

4 / 40

Page 4: Modeling and Optimization of Resource Allocation in Cloud [PhD Thesis Proposal]

IntroductionMain Topics of the ThesisMethods and Techniques

Time PlanConclusion

Cloud Computing Roots

Cloud computing paradigm is revolutionary, however the technology it is builton is only evolutionary.

5 / 40

Page 5: Modeling and Optimization of Resource Allocation in Cloud [PhD Thesis Proposal]

IntroductionMain Topics of the ThesisMethods and Techniques

Time PlanConclusion

Cloud Computing Benefits

Lower costsEase of utilizationQuality of ServiceReliability

Outsourced IT managementSimplified maintenance and upgradeLower barrier to entry

6 / 40

Page 6: Modeling and Optimization of Resource Allocation in Cloud [PhD Thesis Proposal]

IntroductionMain Topics of the ThesisMethods and Techniques

Time PlanConclusion

Cloud Computing Architecture

7 / 40

Page 7: Modeling and Optimization of Resource Allocation in Cloud [PhD Thesis Proposal]

IntroductionMain Topics of the ThesisMethods and Techniques

Time PlanConclusion

MapReduce

Definition

A programming model for processing large data sets with a parallel, distributedalgorithm on a cluster.

8 / 40

Page 8: Modeling and Optimization of Resource Allocation in Cloud [PhD Thesis Proposal]

IntroductionMain Topics of the ThesisMethods and Techniques

Time PlanConclusion

MapReduce Entities

DataNode: Stores blocks of data in distributed filesystem (HDFS).NameNode: Holds metadata (i.e. location information) of the files in HDFS.Jobtracker: Coordinates the MapReduce job.Tasktracker: Runs the tasks that the MapReduce job split into.

9 / 40

Page 9: Modeling and Optimization of Resource Allocation in Cloud [PhD Thesis Proposal]

IntroductionMain Topics of the ThesisMethods and Techniques

Time PlanConclusion

Life Cycle of Cloud Software

Development of Distributed

Software

• MapReduceConfiguration

Resource Allocation

• Resource Selection andOptimization

Load Balancing

• WorkDistribution to Resources

11 / 40

Page 10: Modeling and Optimization of Resource Allocation in Cloud [PhD Thesis Proposal]

IntroductionMain Topics of the ThesisMethods and Techniques

Time PlanConclusion

MapReduce Configuration

Motivation

The cost of using 1000 machines for 1 hour, is the same as using 1 machine for1000 hours in the cloud paradigm (Cost associativity).

Optimum number of maps and reduces that maximize resource utilization aredependent on the resource consumption profile of the cloud software.Bottleneck resources should be identified.Optimization at Application level

12 / 40

Page 11: Modeling and Optimization of Resource Allocation in Cloud [PhD Thesis Proposal]

IntroductionMain Topics of the ThesisMethods and Techniques

Time PlanConclusion

Resource Selection and Optimization

Motivation

In distributed computing environments, up to 85 percent of computing capacityremains idle mainly due to poor optimization of placement.

Better assignment of virtual nodes to physical nodes may result in moreefficient use of resources.There are several possible constraints to consider / optimize e.g. capacitylimits, proximity to user, latency etc.Optimization at Infrastructure level

13 / 40

Page 12: Modeling and Optimization of Resource Allocation in Cloud [PhD Thesis Proposal]

IntroductionMain Topics of the ThesisMethods and Techniques

Time PlanConclusion

Work Distribution to Resources

Motivation

Data flows between nodes only in the shuffle step of the MapReduce job, andtuning it can have a big impact on job execution time.

Mapper and reducer nodes should be selected carefully to minimize networktraffic.Dynamic load balancing should be ensured.Optimization at Platform level

14 / 40

Page 13: Modeling and Optimization of Resource Allocation in Cloud [PhD Thesis Proposal]

IntroductionMain Topics of the ThesisMethods and Techniques

Time PlanConclusion

MapReduce ConfigurationResource Selection and OptimizationWork Distribution to Resources

Problem

Aim

Maximizing the utilization of all nodes for a Hadoop job by calculating the optimumparameters i.e. number of mappers and reducers

Higher values mean higher parallelism but may cause resource contentionand coordination problems.Optimum parameters depend on the resource consumption of the software.

17 / 40

Page 14: Modeling and Optimization of Resource Allocation in Cloud [PhD Thesis Proposal]

IntroductionMain Topics of the ThesisMethods and Techniques

Time PlanConclusion

MapReduce ConfigurationResource Selection and OptimizationWork Distribution to Resources

Previous Solutions

Kambatla, K., Pathak, A., and Pucha, H. (2009). Towards Optimizing Hadoop Provisioning in theCloud. In Proceedings of the 1st USENIX Workshop on Hot Topics in Cloud Computing (HotCloud),118-122.

Calculates the optimum parameters (number of M/R) for the Hadoop job suchthat each resource set is fully utilized.Each application has a different bottleneck resource and utilization.Requires to run a small chunk of the application to create a signature

Split job into n intervals of same duration.Calculate the average consumption of all 3 resources (CPU, Disk, and Network)in each interval.

Uses the configuration of the most similar signature in the database.

18 / 40

Page 15: Modeling and Optimization of Resource Allocation in Cloud [PhD Thesis Proposal]

IntroductionMain Topics of the ThesisMethods and Techniques

Time PlanConclusion

MapReduce ConfigurationResource Selection and OptimizationWork Distribution to Resources

Previous Solutions

19 / 40

Page 16: Modeling and Optimization of Resource Allocation in Cloud [PhD Thesis Proposal]

IntroductionMain Topics of the ThesisMethods and Techniques

Time PlanConclusion

MapReduce ConfigurationResource Selection and OptimizationWork Distribution to Resources

Suggested Solution

Design model of the software will be statically analyzed in order to guessresource consumption pattern.Critical (bottleneck) resources will be identified.If required, source code or input data may also be included to the analysis.

20 / 40

Page 17: Modeling and Optimization of Resource Allocation in Cloud [PhD Thesis Proposal]

IntroductionMain Topics of the ThesisMethods and Techniques

Time PlanConclusion

MapReduce ConfigurationResource Selection and OptimizationWork Distribution to Resources

Output

An algorithm that calculates optimum Hadoop configuration for a givensoftwareAn API that receives software model and outputs a Hadoop configurationsuggestion

21 / 40

Page 18: Modeling and Optimization of Resource Allocation in Cloud [PhD Thesis Proposal]

IntroductionMain Topics of the ThesisMethods and Techniques

Time PlanConclusion

MapReduce ConfigurationResource Selection and OptimizationWork Distribution to Resources

Problem

Aim

Optimally assigning interconnected virtual nodes to substrate network withconstraints

Constraints:Datacenter capacities and bandwidthVirtual topology requests and incompletely known cloud topologyLocality and jurisdictionApplication interaction and scalability rules

Objectives (Minimization):Inter-DC communicationGeographical proximity to user and latency

23 / 40

Page 19: Modeling and Optimization of Resource Allocation in Cloud [PhD Thesis Proposal]

IntroductionMain Topics of the ThesisMethods and Techniques

Time PlanConclusion

MapReduce ConfigurationResource Selection and OptimizationWork Distribution to Resources

Previous Solutions

Papagianni, C., Leivadeas, A., Papavassiliou, S., Maglaris, V., Cervello-Pastor, C., and Monje, A.(2013). On the Optimal Allocation of Virtual Resources in Cloud Computing Networks. IEEETransactions on Computers (TC), 62(6):1060-1071.

Mapping user requests for virtual resources onto shared substrate resourcesProblem is solved in two coordinated phases: node and link mapping.In node mapping, the objective is to minimize the cost of mapping and themethod is the random relaxation of MIP to LP.In link mapping, the objective is to minimize the number of hops and themethod is shortest path or minimum cost flow algorithms.Suggested algorithm is compared with greedy heuristics in terms ofacceptance ratio and number of hops.

24 / 40

Page 20: Modeling and Optimization of Resource Allocation in Cloud [PhD Thesis Proposal]

IntroductionMain Topics of the ThesisMethods and Techniques

Time PlanConclusion

MapReduce ConfigurationResource Selection and OptimizationWork Distribution to Resources

Previous Solutions

Larumbe, F. and Sanso, B. (2013). A Tabu Search Algorithm for the Location of Data Centersand Software Components in Green Cloud Computing Networks. IEEE Transactions on CloudComputing (TCC), 1(1):22-35.

Which component of the software should be hosted at which datacenter?Minimize delay, cost, energy consumption, and CO2 emission.Greedy initial solution is improved by moving one component at each step.A random subset of neighbours are analyzed for the best improvement.Suggested tabu search heuristic is compared with MIP formulation andgreedy heuristic in terms of execution time and optimality.Tradeoff analysis between the multiple objectives

25 / 40

Page 21: Modeling and Optimization of Resource Allocation in Cloud [PhD Thesis Proposal]

IntroductionMain Topics of the ThesisMethods and Techniques

Time PlanConclusion

MapReduce ConfigurationResource Selection and OptimizationWork Distribution to Resources

Previous Solutions

Agarwal, S., Dunagan, J., Jain, N., Saroiu, S., Wolman, A., and Bhogan, H. (2011). Volley:Automated Data Placement for Geo-Distributed Cloud Services. In Proceedings of the 7thUSENIX Symposium on Networked Systems Design and Implementation (NSDI), 17-32.

Volley analyzes trace data and suggest migrations to improve DC capacityskew, inter-DC traffic and user latency.Considers client geographic diversity, client mobility and data dependency.Method contains of 3 sequential phases:

1 Compute initial placement using weighted spherical mean,2 Iteratively move data to reduce latency,3 Iteratively collapse data to datacenters.

Compared against 3 simple heuristics: oneDC, commonIP and hash.

26 / 40

Page 22: Modeling and Optimization of Resource Allocation in Cloud [PhD Thesis Proposal]

IntroductionMain Topics of the ThesisMethods and Techniques

Time PlanConclusion

MapReduce ConfigurationResource Selection and OptimizationWork Distribution to Resources

Suggested Solution

VN requests and DC network are represented as undirected weighted graphs.Nodes store computational capacities (i.e. CPU, memory) and storage.Edge weights represent bandwidth capacities.

Graph similarity - subgraph matching algorithms will be employed.Suggested solution will be static.

27 / 40

Page 23: Modeling and Optimization of Resource Allocation in Cloud [PhD Thesis Proposal]

IntroductionMain Topics of the ThesisMethods and Techniques

Time PlanConclusion

MapReduce ConfigurationResource Selection and OptimizationWork Distribution to Resources

Output

An algorithm that optimizes the resource provisioningAn API that receives graph and constraint inputs and outputs a matchingAral, A., and Ovatman, T. (2014). Improving Resource Utilization in CloudEnvironments using Application Placement Heuristics. In Proceedings ofthe 4th International Conference on Cloud Computing and Services Science(CLOSER), 527-534.

28 / 40

Page 24: Modeling and Optimization of Resource Allocation in Cloud [PhD Thesis Proposal]

IntroductionMain Topics of the ThesisMethods and Techniques

Time PlanConclusion

MapReduce ConfigurationResource Selection and OptimizationWork Distribution to Resources

Problem

Aim

Analysis of the shuffle and scheduling algorithms of Apache Hadoop framework.

Initial selection of DataNodes, Mappers and Reducers is critical to reducenetwork traffic.Dynamic re-distribution of work during execution results in better loadbalancing and better performance.

30 / 40

Page 25: Modeling and Optimization of Resource Allocation in Cloud [PhD Thesis Proposal]

IntroductionMain Topics of the ThesisMethods and Techniques

Time PlanConclusion

MapReduce ConfigurationResource Selection and OptimizationWork Distribution to Resources

Previous Solutions

FIFO scheduler: Integrated scheduling algorithm, manual prioritizationFair scheduler: Each job receives an equal share of resources over time, onaverage. Organizes jobs into user pools that are resourced fairly.Capacity scheduler: Allows sharing a large cluster while giving each user aminimum capacity guarantee. Priority queues are used instead of poolswhere excess capacity is reused.Hadoop On Demand: Provisions virtual clusters from within larger physicalclusters. Adapts and scales when the workload changes.Learning scheduler: Chooses the task that is expected to provide max utility.Adaptive scheduler: User specifies a deadline at job submission time, andthe scheduler adjusts the resources to meet that deadline.

31 / 40

Page 26: Modeling and Optimization of Resource Allocation in Cloud [PhD Thesis Proposal]

IntroductionMain Topics of the ThesisMethods and Techniques

Time PlanConclusion

MapReduce ConfigurationResource Selection and OptimizationWork Distribution to Resources

Suggested Solution

Cost based analysis of the existing schedulersWhich scheduler is more appropriate for given costs of map, reduce, shufflephases?How can DataNode selection for blocks and clones be optimized to reducenetwork traffic?

32 / 40

Page 27: Modeling and Optimization of Resource Allocation in Cloud [PhD Thesis Proposal]

IntroductionMain Topics of the ThesisMethods and Techniques

Time PlanConclusion

MapReduce ConfigurationResource Selection and OptimizationWork Distribution to Resources

Output

Formal modeling and analysis of Hadoop schedulersAn API that receives costs of MapReduce phases and outputs a schedulersuggestionA patch for open-source Apache Hadoop framework

33 / 40

Page 28: Modeling and Optimization of Resource Allocation in Cloud [PhD Thesis Proposal]

IntroductionMain Topics of the ThesisMethods and Techniques

Time PlanConclusion

Time Plan

Today

2014 2015 2016

1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12 1 2 3 4 5 6 7 8 9 10 11 12

Resource Selection

Research

Modeling

Development

Documentation

MapReduce Configuration

Research

Modeling

Development

Documentation

Work Distribution

Research

Modeling

Development

Documentation

35 / 40

Page 29: Modeling and Optimization of Resource Allocation in Cloud [PhD Thesis Proposal]

IntroductionMain Topics of the ThesisMethods and Techniques

Time PlanConclusion

Summary

Optimization on 3 related phases of the cloud software life cycle1 Map Reduce Configuration2 Resource Selection and Assignment3 Work Distribution and Load Balancing

Graph based modeling of the cloud environment and resource allocationproblemHolistic approach that aims to increase the performance of cloud software(Map Reduce) by optimizing resource allocation

37 / 40

Page 30: Modeling and Optimization of Resource Allocation in Cloud [PhD Thesis Proposal]

IntroductionMain Topics of the ThesisMethods and Techniques

Time PlanConclusion

Unique Aspects

Inclusion of cloud computing aspects to the RA problemVirtual topology requests and incompletely known network topologyLocality and JurisdictionScalability rules

Inclusion of software characteristics (i.e. design model) to the RA problemOptimization with different objectives at each level

38 / 40

Page 31: Modeling and Optimization of Resource Allocation in Cloud [PhD Thesis Proposal]

IntroductionMain Topics of the ThesisMethods and Techniques

Time PlanConclusion

Impact

Our goal is to contribute to the long-held goal of utility computing.Cloud providers will benefit since they will be able to use their resources moreefficiently and will be able to serve more customers without violating theservice level agreements.Cloud users will also benefit in terms of shorter application execution time andless infrastructure requirement (lower cost).

39 / 40

Page 32: Modeling and Optimization of Resource Allocation in Cloud [PhD Thesis Proposal]

IntroductionMain Topics of the ThesisMethods and Techniques

Time PlanConclusion

Thank you for your time.

40 / 40

Page 33: Modeling and Optimization of Resource Allocation in Cloud [PhD Thesis Proposal]

Appendix I: Cloud Computing Architecture

1 / 2

Page 34: Modeling and Optimization of Resource Allocation in Cloud [PhD Thesis Proposal]

Appendix II: Resource Allocation Problem in Cloud

Conceptual phase1 Resource modeling: Notations only exist for concrete resources. Different

levels of granularity can be chosen. Vertical and horizontal heterogeneity causeinteroperability problems.

2 Resource offering and treatment: Independent from modeling. Newrequirements are present in addition to common network and computation ones.

Operational phase3 Resource discovery and monitoring: Finding suitable candidate resources

based on proximity or impact on network. Passive or active monitoringstrategies exist.

4 Resource selection and optimization: Finding a configuration that fulfills allrequirements and optimizes infrastructure usage. Dynamicity, algorithmcomplexity, and large number of constraints are the main problems.

2 / 2