Post on 20-Jan-2016
Multicriteria Driven Resource Management Strategies in GRMS
Krzysztof Kurowski, Jarek Nabrzyski,
Ariel Oleksiak, Juliusz Pukacki
Poznan Supercomputing and Networking Center, Poland
The 4th Cracow Grid Workshop -CGW Krzysztof Kurowski
Resource management strategies
Grid Computing
GRMS Architecture
GRMS as a part of the GridLab
Introduction to MC
Motivation
  GRMS is a solution for non-dedicated resources
  We treat resource management as a multicriteria decision-making process with various stakeholders (end users, administrators, etc.)
  Stakeholders have different preferences (sets of criteria and their importance)
  The various criteria and stakeholders' preferences must be aggregated (negotiation and agreement processes are required)
  We focus on a compromise solution rather than the optimal one
  We want to satisfy many stakeholders rather than one in particular
  Criteria and constraints vary depending on the available information and scenarios
Requirements
  Flexibility (in terms of criteria and multicriteria methods)
  Re-use of common functionality (ranking, non-dominated solutions, solutions meeting constraints)
Introduction to MC
MCEvaluator
  Implementation of the needed multicriteria models and tools
  Set of classes providing abstractions of the entities used in multicriteria models
Job description/properties
  Including users' and administrators' preferences
  Taking into consideration multiple criteria, their importance, indifference thresholds and constraints
Two main efforts
  Multi-criteria analysis engine (MCEvaluator)
  Including user preferences in a job description
MCEvaluator - Design
MCEvaluator - Model
Main entities
  Criteria (objectives, soft constraints: the user is interested in their greatest or lowest possible values)
  Constraints (hard constraints: a solution that does not satisfy them is not taken into consideration)
  Solutions (e.g. resources, schedules, etc., along with the parameters that describe them)
  MCEvaluator (decision point)
Multi-criteria methods
  Evaluation function (e.g. weighted sum)
  Non-dominated solutions
  Lexicographic order
  Rule-based systems
  Other ...
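Of the methods listed, the non-dominated (Pareto) filter is easy to illustrate in isolation. The following is a minimal sketch, not MCEvaluator code: the class `ParetoSketch`, its method names and the sample values are hypothetical, and all criteria are assumed to be gain criteria (higher is better).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of the MCEvaluator API.
final class ParetoSketch {

    // a dominates b if a is at least as good on every criterion
    // and strictly better on at least one (all criteria are gains here).
    static boolean dominates(double[] a, double[] b) {
        boolean strictlyBetter = false;
        for (int i = 0; i < a.length; i++) {
            if (a[i] < b[i]) {
                return false;
            }
            if (a[i] > b[i]) {
                strictlyBetter = true;
            }
        }
        return strictlyBetter;
    }

    // Returns the indices of the solutions that no other solution dominates.
    static List<Integer> nonDominated(double[][] solutions) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < solutions.length; i++) {
            boolean dominated = false;
            for (int j = 0; j < solutions.length && !dominated; j++) {
                dominated = (j != i) && dominates(solutions[j], solutions[i]);
            }
            if (!dominated) {
                result.add(i);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Columns: memory (MB), storage (GB); the values are made up.
        double[][] hosts = { {100, 4}, {200, 1}, {100, 1}, {150, 2} };
        System.out.println(nonDominated(hosts));
    }
}
```

A filter of this kind only narrows the candidate set; choosing one solution from it still needs an evaluation function or another of the listed methods.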
Simple example
[Figure: End user 1 runs Application 1 (e.g. data analysis) and End user 2 runs Application 2 (e.g. data mining). For each user, resources R1 to R4 are plotted against Memory and Storage axes; the resource to select for each user is marked "???".]
Hard constraints (e.g. RSL… ClassAd, scripts): <Mem = 100MB>, <Storage = 1G>
Simple example
[Figure: End user 1 runs Application 1 (e.g. data analysis) and End user 2 runs Application 2 (e.g. data mining). For each user, resources R1 to R4 are plotted against Memory and Storage axes, and each plot carries its own objective function.]
MAX Z = 1*Mem + 2*Storage (where Z is the objective function)
MAX Z = 2*Mem + 1*Storage (where Z is the objective function)
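The two objective functions can be evaluated directly to see how different weights lead to different choices. The sketch below is illustrative only: the class name and the Mem/Storage values for R1 to R4 are invented (the slide does not give them), and the values are assumed to be on comparable scales already.

```java
// Hypothetical illustration; the resource values are made up.
final class SimpleExampleSketch {

    // Z = wMem * Mem + wStorage * Storage
    static double z(double wMem, double wStorage, double mem, double storage) {
        return wMem * mem + wStorage * storage;
    }

    // Index of the resource with the largest Z for the given weights
    // (the first one wins in case of a tie).
    static int best(double wMem, double wStorage, double[][] resources) {
        int bestIndex = 0;
        for (int i = 1; i < resources.length; i++) {
            double candidate = z(wMem, wStorage, resources[i][0], resources[i][1]);
            double current = z(wMem, wStorage,
                    resources[bestIndex][0], resources[bestIndex][1]);
            if (candidate > current) {
                bestIndex = i;
            }
        }
        return bestIndex;
    }

    public static void main(String[] args) {
        // {Mem, Storage} for R1..R4 in arbitrary, comparable units.
        double[][] r = { {5, 1}, {3, 3}, {1, 5}, {2, 2} };
        System.out.println("Z = 1*Mem + 2*Storage picks R" + (best(1, 2, r) + 1));
        System.out.println("Z = 2*Mem + 1*Storage picks R" + (best(2, 1, r) + 1));
    }
}
```

With these made-up values the storage-heavy weighting and the memory-heavy weighting select different resources, which is the point of letting each user state their own weights.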
Job Description
Expressing user preferences
  Criteria (I want as fast a CPU as possible)
  Weights (the amount of memory is twice as important as the number of processors)
  Constraints (the amount of free memory must be greater than x)
  Indifference thresholds (a difference of 10 MB is not significant)
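An indifference threshold can be read as a tolerance in pairwise comparison: two values closer than the threshold are treated as equally good. A minimal sketch under that assumption follows; the class and method names are hypothetical, and treating a difference exactly equal to the threshold as insignificant is a choice made here, not something the slides specify.

```java
// Hypothetical illustration; not the GRMS implementation.
final class IndifferenceSketch {

    // Compares two values of a gain criterion (higher is better).
    // Returns 1 if a is preferred, -1 if b is preferred, and 0 if the
    // difference does not exceed the indifference threshold.
    static int compareGain(double a, double b, double threshold) {
        double diff = a - b;
        if (Math.abs(diff) <= threshold) {
            return 0;
        }
        return diff > 0 ? 1 : -1;
    }

    public static void main(String[] args) {
        double thresholdMb = 10; // a difference of 10 MB is not significant
        System.out.println(compareGain(512, 505, thresholdMb)); // within the threshold
        System.out.println(compareGain(512, 490, thresholdMb)); // a is clearly better
    }
}
```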
Job Description - Schema
<constraints>
  <resConstraint parameter="MEMORY">
    <constraintModel preferenceType="PRIORITY" optimizationType="GAIN">
      <indiffThreshold>10</indiffThreshold>
      <value>1</value>
    </constraintModel>
    <constraintValue>
      <min indiffThreshold="0">100</min>
      <max indiffThreshold="20">200</max>
    </constraintValue>
  </resConstraint>
</constraints>
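A fragment like this can be read with the standard Java DOM API. The sketch below only extracts the values; how GRMS itself parses and interprets them is not shown in the slides, and the class `JobDescriptionSketch` and its helpers are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical illustration; not the GRMS job description parser.
final class JobDescriptionSketch {

    // The constraint fragment from the schema slide.
    static final String XML =
          "<constraints>"
        + "<resConstraint parameter=\"MEMORY\">"
        + "<constraintModel preferenceType=\"PRIORITY\" optimizationType=\"GAIN\">"
        + "<indiffThreshold>10</indiffThreshold><value>1</value>"
        + "</constraintModel>"
        + "<constraintValue>"
        + "<min indiffThreshold=\"0\">100</min>"
        + "<max indiffThreshold=\"20\">200</max>"
        + "</constraintValue>"
        + "</resConstraint>"
        + "</constraints>";

    // Parses an XML string into a DOM document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse job description", e);
        }
    }

    // Returns the first element with the given tag name.
    static Element first(Document doc, String tag) {
        return (Element) doc.getElementsByTagName(tag).item(0);
    }

    public static void main(String[] args) {
        Document doc = parse(XML);
        Element min = first(doc, "min");
        Element max = first(doc, "max");
        System.out.println("parameter = "
            + first(doc, "resConstraint").getAttribute("parameter"));
        System.out.println("min = " + min.getTextContent()
            + " (indiffThreshold " + min.getAttribute("indiffThreshold") + ")");
        System.out.println("max = " + max.getTextContent()
            + " (indiffThreshold " + max.getAttribute("indiffThreshold") + ")");
    }
}
```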
Examples
Selection of the best hosts for a job
  Based on host parameters: CPU load, total memory, free memory, CPU count, CPU speed
Selection of the best host or the best queue
  Based on estimated runtime and queue waiting time (+ estimated errors) taken from the prediction system
Selection of the best job to migrate (to release resources for a new job)
  Based on parameters of the host after job migration and migration costs
Selection of the best queue in the local resource management system
  Based on aggregated historical data taken from the GridLab Adaptive Service (Delphoi)
Selection of the best hosts for a job
Example 1 (Host Parameters)
Criteria
  CPU load
  CPU count
  CPU speed
  Total memory
  Free memory
Constraints
  No constraints (resources are filtered in the resource discovery phase)
Solutions
  Set on the basis of the parameters describing hosts
// criteria
NumericalCriterion[] criteria = new NumericalCriterion[5];
criteria[0] = new NumericalCriterion("cpuLoad", Criterion.GAIN);
criteria[1] = new NumericalCriterion("cpuCount", Criterion.GAIN);
criteria[2] = new NumericalCriterion("cpuSpeed", Criterion.GAIN);
criteria[3] = new NumericalCriterion("memory", Criterion.GAIN);
criteria[4] = new NumericalCriterion("memAvail", Criterion.GAIN);

// weights
float[] weights = new float[5];
weights[0] = weightLoad;
weights[1] = weightCPUCount;
weights[2] = weightCPUSpeed;
weights[3] = weightMemory;
weights[4] = weightMemAvail;

// evaluator
mcEvaluators = new Vector(1);
mcEvaluators.add(new MCFuncEvaluator("GRMSmatchmaking", criteria, weights, null));
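The listing configures the evaluator but does not show what a weighted evaluation function computes. The following sketch is one plausible reading, not the MCFuncEvaluator implementation: it assumes min-max normalisation of each gain criterion across the candidate hosts followed by a weighted sum. The class name, host values and weights are all hypothetical.

```java
import java.util.Arrays;

// Hypothetical illustration; not the MCFuncEvaluator implementation.
final class HostRankingSketch {

    // Min-max normalises one criterion across all hosts to [0, 1].
    static double[] normalise(double[] values) {
        double min = values[0];
        double max = values[0];
        for (double v : values) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        double[] out = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            // If all hosts are equal on this criterion, none is penalised.
            out[i] = (max == min) ? 1.0 : (values[i] - min) / (max - min);
        }
        return out;
    }

    // byCriterion[c][h] is the value of criterion c on host h
    // (all criteria are treated as gains in this sketch).
    static double[] scores(double[][] byCriterion, double[] weights) {
        double[] total = new double[byCriterion[0].length];
        for (int c = 0; c < byCriterion.length; c++) {
            double[] n = normalise(byCriterion[c]);
            for (int h = 0; h < total.length; h++) {
                total[h] += weights[c] * n[h];
            }
        }
        return total;
    }

    public static void main(String[] args) {
        double[][] byCriterion = {
            {2, 8, 4},          // cpuCount
            {2000, 2400, 3000}, // cpuSpeed (MHz)
            {512, 4096, 1024}   // memAvail (MB)
        };
        double[] weights = {1.0, 1.0, 2.0}; // memAvail counts double
        System.out.println(Arrays.toString(scores(byCriterion, weights)));
    }
}
```

Normalising first keeps a criterion with large raw values (memory in MB) from swamping one with small raw values (CPU count) regardless of the weights.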
Selection of the best host or the best queue
Example 2 (Time Prediction)
Criteria
  Time
  Error
  Standard deviation
  Maximal minus minimal time (max-min)
Constraints
  err <= stdev
  stdev < time/2
  max-min <= time (+- 1 minute)
Solutions
  Set on the basis of information from the prediction system
// criteria
NumericalCriterion[] criteriaMCT = new NumericalCriterion[4];
criteriaMCT[0] = new NumericalCriterion("time", Criterion.COST);
criteriaMCT[0].setIndiffThreshold(1);
criteriaMCT[1] = new NumericalCriterion("err", Criterion.COST);
criteriaMCT[2] = new NumericalCriterion("stdev", Criterion.COST);
criteriaMCT[3] = new NumericalCriterion("max-min", Criterion.COST);
criteriaMCT[3].setIndiffThreshold(1);

// weights
float[] weightsMCT = new float[4];
weightsMCT[0] = weightTime;
weightsMCT[1] = weightErr;
weightsMCT[2] = weightStdev;
weightsMCT[3] = weightMaxMin;

// constraints
MetricConstraint[] constraintsMCT = new MetricConstraint[3];
// err <= stdev
constraintsMCT[0] = new MetricConstraint("err", MetricConstraint.LE, null);
constraintsMCT[0].setOtherMetric("stdev");
// stdev < time/2
constraintsMCT[1] = new MetricConstraint("stdev", MetricConstraint.LS, null);
constraintsMCT[1].setOtherMetric("time");
constraintsMCT[1].setOtherMetricFactor(0.5f);
// max-min <= time (+- 1)
constraintsMCT[2] = new MetricConstraint("max-min", MetricConstraint.LE, null);
constraintsMCT[2].setOtherMetric("time");
constraintsMCT[2].setIndiffThreshold(1);
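The three constraints compare one metric of a prediction against another metric of the same prediction. Written out as plain Java they amount to the check below. This is a sketch, not MetricConstraint code: the class and parameter names are hypothetical, and reading "(+- 1 minute)" as relaxing the last bound by the tolerance is an assumption.

```java
// Hypothetical illustration; not the MetricConstraint implementation.
final class PredictionConstraintsSketch {

    // Returns true if a prediction satisfies all three hard constraints.
    // All values are in minutes; tolerance plays the role of the
    // indifference threshold on the last constraint (assumed semantics).
    static boolean feasible(double time, double err, double stdev,
                            double maxMinusMin, double tolerance) {
        boolean errOk = err <= stdev;                        // err <= stdev
        boolean stdevOk = stdev < 0.5 * time;                // stdev < time/2
        boolean spreadOk = maxMinusMin <= time + tolerance;  // max-min <= time (+- 1)
        return errOk && stdevOk && spreadOk;
    }

    public static void main(String[] args) {
        // Made-up predictions: time, err, stdev, max-min.
        System.out.println(feasible(10, 1, 2, 10.5, 1)); // spread within tolerance
        System.out.println(feasible(10, 3, 2, 5, 1));    // err exceeds stdev
    }
}
```

Predictions that fail the check would be dropped before the weighted criteria are evaluated, since hard constraints exclude a solution from consideration.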
Selection of the best job to migrate (to release resources for a new job)
Example 3 (Adaptive Service)
Criteria
  avgNumberWaitingJobs
  avgJobWaitingTime
  sdJobWaitingTime
  avgFreeNodes
  cpuTime
  maxJobs
  maxNodes
Constraints
  cpuTime
  maxJobs
  maxNodes
Solutions
  Set on the basis of information from the adaptive service
// nine criteria in total (indices 0..8)
NumericalCriterion[] criteria = new NumericalCriterion[9];

// based on historic information (dynamic parameters)
criteria[0] = new NumericalCriterion("avgNumberWaitingJobs", Criterion.COST);
criteria[1] = new NumericalCriterion("avgJobWaitingTime", Criterion.COST);
criteria[2] = new NumericalCriterion("sdJobWaitingTime", Criterion.COST);
criteria[3] = new NumericalCriterion("avgFreeNodes", Criterion.GAIN);

// based on queue configuration (static parameters)
criteria[4] = new NumericalCriterion("cpuTime", Criterion.GAIN);
criteria[5] = new NumericalCriterion("cpus", Criterion.GAIN);
criteria[6] = new NumericalCriterion("maxJobs", Criterion.GAIN);
criteria[7] = new NumericalCriterion("maxMem", Criterion.GAIN);
criteria[8] = new NumericalCriterion("maxNodes", Criterion.GAIN);
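This example mixes cost criteria (waiting jobs, waiting time) with gain criteria (free nodes, queue limits), so the values have to be brought onto a common "higher is better" scale before weights can be applied. The sketch below shows one way to do that; min-max scaling with inversion for cost criteria is an assumption, not necessarily what MCEvaluator does, and the class name, ranges and values are hypothetical.

```java
// Hypothetical illustration; not the MCEvaluator implementation.
final class QueueUtilitySketch {

    // Maps a raw value to [0, 1], where 1 is always the best value,
    // given the range observed across the candidate queues.
    static double utility(double value, double min, double max, boolean isCost) {
        if (max == min) {
            return 1.0; // all queues equal on this criterion
        }
        double scaled = (value - min) / (max - min);
        return isCost ? 1.0 - scaled : scaled;
    }

    public static void main(String[] args) {
        // avgJobWaitingTime is a cost: 120 s within an observed range of 60..300 s.
        System.out.println(utility(120, 60, 300, true));
        // avgFreeNodes is a gain: 8 nodes within an observed range of 0..16.
        System.out.println(utility(8, 0, 16, false));
    }
}
```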
MCEvaluator – Next Steps
Development of the engine
  Implementation of other multi-criteria methods
  Negotiable constraints
Use in other scenarios
  Use of other decision models (currently the evaluation function)
  Scheduling with advance reservation
  Resource co-allocations
  Other?
Conclusion
Flexible engine for criteria and constraints management, with various:
  criteria
  constraints
  decision making methods
Used in a few scenarios/testbeds
  Currently multi-criteria evaluation function
Job description containing user preferences
  Criteria (+ their importance) and constraints
It would be nice to use
  Criteria in agreement negotiations
  Criteria concerning deadlines, runtimes etc.
GRMS web page
www.gridlab.org/grms/
Thank you!