Nimrod & NetSolve

Sathish Vadhiyar


Transcript of Nimrod & NetSolve

Page 1: Nimrod & NetSolve

Nimrod & NetSolve

Sathish Vadhiyar

Page 2: Nimrod & NetSolve

Nimrod

Sources/Credits: Nimrod web site & papers

Page 3: Nimrod & NetSolve

Background

For execution of parametric experiments across distributed computers

User describes a plan file that declares the parameters

Parametric studies: a range of different simulations calculated using the same program

Need for a Grid
  • 3 variables with 4 values each give 4^3 = 64 experiments
  • Each experiment takes several hours
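The experiment count above comes from taking every combination of parameter values. A minimal Python sketch of that enumeration follows; the parameter names and values are made up for illustration, since the slide only says "3 variables, 4 values".

```python
from itertools import product

# Hypothetical parameters: the slide names none of them.
parameters = {
    "a": [1, 2, 3, 4],
    "b": [10, 20, 30, 40],
    "c": [0.1, 0.2, 0.3, 0.4],
}

# One experiment per combination of values: 4 * 4 * 4 = 64.
experiments = [
    dict(zip(parameters, values)) for values in product(*parameters.values())
]

print(len(experiments))
```

At several hours per experiment, running all 64 on one machine would take days, which is the motivation for spreading them over a Grid.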

Page 4: Nimrod & NetSolve

Sample plan file

    parameter iseed integer range from 100 to 4000 step 100;
    parameter thick label "BUC thickness" float range from 1.1 to 2.0 step 0.1;
    parameter jseed integer compute thick*1000;

    task nodestart
        copy ccal.$OS node:./ccal
        copy dummy node:.
        copy ccal.dat node:.
        copy skel.inp node:.
    endtask

    task main
        node:substitute skel.inp ccal.inp
        node:execute ./ccal
        copy node:ccal.op ccalout.$jobname
    endtask
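To show what the plan file describes, the sketch below expands its parameter declarations into individual jobs and fills in a skeleton input for each one, roughly what the `substitute` step does. It is an illustration in Python, not Nimrod's implementation. It assumes that ranges include their end points, and the skeleton text is hypothetical because the contents of skel.inp are not shown.

```python
from string import Template

# Parameter ranges copied from the sample plan file (end points assumed inclusive).
iseeds = list(range(100, 4001, 100))                    # 100, 200, ..., 4000
thicks = [round(1.1 + 0.1 * i, 1) for i in range(10)]   # 1.1, 1.2, ..., 2.0

# Hypothetical skeleton contents; the real skel.inp is not shown on the slide.
skeleton = Template("seed1 = $iseed\nseed2 = $jseed\nthickness = $thick\n")

jobs = []
for iseed in iseeds:
    for thick in thicks:
        # jseed is a computed parameter (thick*1000), so it adds no extra jobs.
        values = {"iseed": iseed, "thick": thick, "jseed": round(thick * 1000)}
        jobs.append((values, skeleton.substitute(values)))

print(len(jobs))    # 40 iseed values x 10 thick values
print(jobs[0][1])
```

Under these assumptions the plan expands to 400 jobs, which is consistent with the 400 tasks quoted for the ionization chamber experiment if that experiment used this plan.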

Page 5: Nimrod & NetSolve

Phases of Computational Experiment

1. Experiment pre-processing, when data is set up for the experiment;

2. Execution pre-processing, when data is prepared for a particular execution;

3. Execution, when the program is executed for a given set of parameter values;

4. Execution post-processing, when data from a particular execution is reduced;

5. Experiment post-processing, when results are processed, for example by running data interpretation or visualization software.
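The five phases can be read as a loop with setup before it and analysis after it. The Python sketch below is only an illustration of that structure; `run_experiment` and `toy_program` are hypothetical names, and the toy program stands in for a real simulation.

```python
def run_experiment(parameter_sets, program):
    """Run one program over many parameter sets, following the five phases."""
    # 1. Experiment pre-processing: set up data shared by all executions.
    shared = {"runs": len(parameter_sets)}

    reduced = []
    for params in parameter_sets:
        # 2. Execution pre-processing: prepare the input for this execution.
        job_input = {**shared, **params}
        # 3. Execution: run the program for this set of parameter values.
        raw_output = program(job_input)
        # 4. Execution post-processing: reduce this execution's output.
        reduced.append((params, max(raw_output)))

    # 5. Experiment post-processing: interpret the collected results.
    return max(reduced, key=lambda item: item[1])


def toy_program(job_input):
    """Hypothetical stand-in for the simulation program."""
    x = job_input["x"]
    return [x * t - t * t for t in range(5)]


best_params, best_value = run_experiment(
    [{"x": x} for x in range(1, 6)], toy_program
)
print(best_params, best_value)
```

Phases 2 to 4 are the part that runs once per parameter set, so they are the part a tool like Nimrod can farm out to remote machines.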

Page 6: Nimrod & NetSolve

Illustration

Page 7: Nimrod & NetSolve

Nimrod Architecture

Page 8: Nimrod & NetSolve

Architecture

Page 9: Nimrod & NetSolve

Architecture

Components
  • Client
  • Parametric engine
  • Scheduler
  • Dispatcher
  • Job wrapper

Page 10: Nimrod & NetSolve

Components

Parametric engine
  • Persistent job service
  • Interacts with the client, schedule advisor and dispatcher
  • Takes a declarative plan from the user

Scheduler
  • Objectives: meet deadlines, minimize cost

Dispatcher
  • Starts a remote component called the job wrapper
  • Reports task status to the parametric engine

Job wrapper
  • Responsible for staging in, execution and staging out
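A small Python sketch of how the dispatcher, job wrapper and parametric engine might fit together is given below. It runs everything in one process and all class and function names are hypothetical; the real components communicate across machines.

```python
from dataclasses import dataclass, field


@dataclass
class ParametricEngine:
    """Persistent job service: keeps the status of every job."""
    status: dict = field(default_factory=dict)

    def update(self, job_id, state):
        self.status[job_id] = state


def job_wrapper(job_id, inputs, program):
    """Stands in for the remote component: stage in, execute, stage out."""
    staged = dict(inputs)                       # stage in: copy inputs to the node
    result = program(staged)                    # execute the program
    return {"job": job_id, "output": result}    # stage out: hand results back


class Dispatcher:
    """Starts the job wrapper and reports task status to the engine."""

    def __init__(self, engine):
        self.engine = engine

    def dispatch(self, job_id, inputs, program):
        self.engine.update(job_id, "running")
        try:
            staged_out = job_wrapper(job_id, inputs, program)
        except Exception:
            self.engine.update(job_id, "failed")
            raise
        self.engine.update(job_id, "done")
        return staged_out


engine = ParametricEngine()
dispatcher = Dispatcher(engine)
print(dispatcher.dispatch("j1", {"x": 3}, lambda p: p["x"] ** 2))
print(engine.status)
```

Keeping the status in the engine rather than in the dispatcher reflects the slide's description of the engine as the persistent part of the system.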

Page 11: Nimrod & NetSolve

Cost Model

Cost / priority matrix defined based on specifications by resource providers

Nimrod/G scheduler performs discovery and allocation of resources based on specified execution times and cost constraints

Cost of an experiment varies depending on the load

Page 12: Nimrod & NetSolve

Scheduling Heuristic

1. Discovery
   1. Initial filtering of resources based on cost specifications
   2. Identification of the lowest-cost set of resources able to meet deadlines

2. Allocation
   1. Jobs allocated from the queue to the resources identified in step 1

3. Monitoring
   1. Completion time of jobs monitored
   2. Execution rate established

4. Refinement
   1. Execution rate used to update expected completion times of remaining jobs
   2. Revisit steps 1 and 2
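The four steps above can be sketched in Python as follows. This is a simplified illustration, not the Nimrod/G scheduler: the `Resource` fields, the per-job pricing and the function names are all assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    cost_per_job: float    # price set by the resource provider (assumed unit)
    jobs_per_hour: float   # current estimate of the execution rate


def capacity(resource, hours_left):
    """Whole jobs the resource is expected to finish before the deadline."""
    return math.floor(resource.jobs_per_hour * hours_left)


def discover(resources, jobs_left, hours_left, max_cost_per_job):
    """Step 1: filter on cost, then take the cheapest set that meets the deadline."""
    affordable = sorted(
        (r for r in resources if r.cost_per_job <= max_cost_per_job),
        key=lambda r: r.cost_per_job,
    )
    chosen, total = [], 0
    for r in affordable:
        if total >= jobs_left:
            break
        chosen.append(r)
        total += capacity(r, hours_left)
    return chosen if total >= jobs_left else None  # None: deadline cannot be met


def allocate(chosen, jobs_left, hours_left):
    """Step 2: assign queued jobs, cheapest resource first."""
    plan = {}
    for r in chosen:
        plan[r.name] = min(jobs_left, capacity(r, hours_left))
        jobs_left -= plan[r.name]
    return plan


def refine(resource, jobs_done, hours_elapsed):
    """Steps 3 and 4: replace the estimate with the measured execution rate."""
    resource.jobs_per_hour = jobs_done / hours_elapsed


pool = [
    Resource("cheap", 10, 1.0),
    Resource("mid", 12, 1.5),
    Resource("fast", 28, 4.0),
    Resource("pricey", 50, 10.0),
]
chosen = discover(pool, jobs_left=40, hours_left=10, max_cost_per_job=30)
print(allocate(chosen, jobs_left=40, hours_left=10))

# "cheap" turns out slower than estimated, so the remaining work is re-planned.
refine(pool[0], jobs_done=2, hours_elapsed=4)
chosen = discover(pool, jobs_left=30, hours_left=6, max_cost_per_job=30)
print(allocate(chosen, jobs_left=30, hours_left=6))
```

In the second plan the slower "cheap" resource receives fewer jobs and the more expensive "fast" resource picks up the difference, which is the effect the refinement step is meant to produce.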

Page 13: Nimrod & NetSolve

Experiments

Page 14: Nimrod & NetSolve

Ionization chamber calibration

Chamber response to front wall thickness

ion-pair = [expression not preserved in the transcript]

400 tasks

Each model run took about 40 to 140 minutes

3 experiments with deadlines of 10, 15 and 20 hours
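These figures give a rough sense of how many processors each deadline demands. The Python sketch below computes a lower bound by dividing total CPU time by the deadline; it ignores queueing, file transfer and uneven job lengths, so real requirements would be higher.

```python
import math


def min_processors(tasks, minutes_per_task, deadline_hours):
    """Lower bound on processors: total CPU time divided by the deadline."""
    return math.ceil(tasks * minutes_per_task / (60 * deadline_hours))


for deadline in (10, 15, 20):
    fastest = min_processors(400, 40, deadline)    # every run takes 40 minutes
    slowest = min_processors(400, 140, deadline)   # every run takes 140 minutes
    print(f"{deadline}-hr deadline: at least {fastest} to {slowest} processors")
```

For the 10-hour deadline the bound is 27 to 94 processors, and it falls to 14 to 47 for the 20-hour deadline, so a looser deadline lets the scheduler use fewer resources.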

Page 15: Nimrod & NetSolve

No. of resources vs time

Page 16: Nimrod & NetSolve

Cost vs Time

Page 17: Nimrod & NetSolve

Cost vs Time

Page 18: Nimrod & NetSolve

Cost vs Time

Page 19: Nimrod & NetSolve

Cost vs Time

Page 20: Nimrod & NetSolve

Another Experiment

Page 21: Nimrod & NetSolve

Experimental setup

165 CPU jobs, each 5 minutes in duration

Deadline: 2 hours

Budget: 396000

2 strategies:
  • Optimize time
  • Optimize cost
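A quick check of what this setup implies, sketched in Python. It assumes every job takes 5 minutes on any processor. The price calculation further assumes that cost is charged per CPU-second, which the slide does not state.

```python
import math

jobs, minutes_per_job = 165, 5
deadline_minutes, budget = 120, 396000

# Jobs one processor can finish before the deadline, and processors needed.
jobs_per_processor = deadline_minutes // minutes_per_job    # 24
processors_needed = math.ceil(jobs / jobs_per_processor)    # 7

# Assumption (not stated on the slide): cost is charged per CPU-second.
cpu_seconds = jobs * minutes_per_job * 60
affordable_price = budget / cpu_seconds

print(processors_needed, affordable_price)
```

Under those assumptions the deadline needs at least 7 processors running in parallel, and the budget covers an average price of 8 units per CPU-second.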

Page 22: Nimrod & NetSolve

Results

Page 23: Nimrod & NetSolve

Time Optimization

Page 24: Nimrod & NetSolve

Cost Optimization

Page 25: Nimrod & NetSolve

Scheduling

Adaptive scheduling algorithms

Time minimization with a limited budget (etime optimal)

Time minimization with an unlimited budget (etime highoptimal)

Cost minimization, limited by the deadline (ecost optimal)

No minimization, limited by time and cost (etime + ecost optimal)
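The difference between time minimization and cost minimization can be illustrated with a greedy placement loop. The Python sketch below is a simplification written for this purpose, not the adaptive algorithms from the Nimrod/G papers: the function name, the per-job pricing and the resource table are hypothetical, and the greedy budget check can reject problems that a smarter search would solve.

```python
def schedule(jobs, resources, deadline, budget, strategy):
    """Greedy job placement.

    resources maps a name to (seconds_per_job, cost_per_job).
    strategy is "time" (finish early) or "cost" (spend little).
    Returns (makespan_seconds, total_cost).
    """
    busy_until = {name: 0 for name in resources}
    spent = 0

    def finish(name):
        return busy_until[name] + resources[name][0]

    def price(name):
        return resources[name][1]

    for _ in range(jobs):
        # Only resources that keep this job within the deadline and the budget.
        feasible = [
            name for name in resources
            if finish(name) <= deadline and spent + price(name) <= budget
        ]
        if not feasible:
            raise RuntimeError("deadline or budget cannot be met")
        if strategy == "time":
            chosen = min(feasible, key=lambda n: (finish(n), price(n)))
        else:
            chosen = min(feasible, key=lambda n: (price(n), finish(n)))
        spent += price(chosen)
        busy_until[chosen] = finish(chosen)
    return max(busy_until.values()), spent


# Hypothetical resources: one slow and cheap, one fast and expensive.
resources = {"slow_cheap": (10, 1), "fast_dear": (5, 3)}
print(schedule(4, resources, deadline=40, budget=100, strategy="cost"))
print(schedule(4, resources, deadline=40, budget=100, strategy="time"))
```

With these numbers the cost strategy uses the whole deadline on the cheap resource, while the time strategy finishes much earlier at a higher total cost.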

Page 26: Nimrod & NetSolve

Nimrod/O

Optimization of parameters to minimize an objective function

Case study: optimize the shape and angle of attack of an airfoil to maximize the lift-to-drag ratio

Design optimization problem

Objective function can be non-linear, can contain noise, and can be continuous or discrete

No single optimization algorithm gives the best result for every problem

Nimrod/O supports a range of algorithms

Page 27: Nimrod & NetSolve

Contd …

Search algorithms
  • P-BFGS
  • Simplex
  • Divide-and-conquer
  • Simulated annealing
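As an illustration of why such searches suit a parametric tool, the Python sketch below shows one plausible reading of a divide-and-conquer search in one dimension: evaluate the objective on a coarse grid, then repeat on the interval around the best point. This is an assumption about the general idea, not Nimrod/O's implementation, and the quadratic objective is a stand-in for a real simulation.

```python
def divide_and_conquer_minimize(f, lo, hi, points=5, rounds=20):
    """Shrink [lo, hi] around the best grid point; assumes f has a single minimum."""
    for _ in range(rounds):
        step = (hi - lo) / (points - 1)
        # Each grid is a small parameter sweep; the evaluations are independent,
        # so a tool like Nimrod could run them on different machines.
        xs = [lo + i * step for i in range(points)]
        best = min(xs, key=f)
        lo, hi = max(lo, best - step), min(hi, best + step)
    return (lo + hi) / 2


def objective(x):
    """Hypothetical objective with its minimum at x = 1.7."""
    return (x - 1.7) ** 2


print(divide_and_conquer_minimize(objective, -10.0, 10.0))
```

Each round costs `points` independent runs of the objective, so the wall-clock time per round is roughly one run when enough machines are available.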

Page 28: Nimrod & NetSolve

Plan file modified by Nimrod/O

Page 29: Nimrod & NetSolve

References

Abramson, D., Lewis, A., Peachey, T. and Fletcher, C. "An Automatic Design Optimization Tool and its Application to Computational Fluid Dynamics", SuperComputing 2001, Denver, Nov 2001.

Abramson, D., Sosic, R., Giddy, J. and Cope, M. "The Laboratory Bench: Distributed Computing for Parametised Simulations", 1994 Parallel Computing and Transputers Conference, Wollongong, Nov 94, pp 17-27.

Abramson, D., Sosic, R., Giddy, J. and Hall, B. "Nimrod: A Tool for Performing Parametised Simulations using Distributed Workstations", The 4th IEEE Symposium on High Performance Distributed Computing, Virginia, August 1995.

Page 30: Nimrod & NetSolve

References

Abramson, D., Giddy, J. and Kotler, L. "High Performance Parametric Modeling with Nimrod/G: Killer Application for the Global Grid?", International Parallel and Distributed Processing Symposium (IPDPS), pp 520-528, Cancun, Mexico, May 2000.

Buyya, R., Abramson, D. and Giddy, J. "Nimrod/G: An Architecture of a Resource Management and Scheduling System in a Global Computational Grid", HPC Asia 2000, May 14-17, 2000, pp 283-289, Beijing, China.

Abramson, D., Buyya, R. and Giddy, J. "A Computational Economy for Grid Computing and its Implementation in the Nimrod-G Resource Broker", Future Generation Computer Systems, Volume 18, Issue 8, Oct 2002.

Buyya, R., Giddy, J. and Abramson, D. "An Evaluation of Economy-based Resource Trading and Scheduling on Computational Power Grids for Parameter Sweep Applications", Workshop on Active Middleware Services (AMS 2000), (in conjunction with Ninth IEEE International Symposium on High Performance Distributed Computing), Kluwer Academic Press, August 1, 2000, Pittsburgh, USA.

Page 31: Nimrod & NetSolve

Junk !!

Page 32: Nimrod & NetSolve

Nimrod Architecture

Page 33: Nimrod & NetSolve

Components

Generator
  • Input: plan file
  • Processes the plan file and gives the user choices regarding parameters
  • Output: run file (description of a job)

Dispatcher
  • Input: run file
  • Stages files to remote resources
  • Runs jobs on remote resources

Page 34: Nimrod & NetSolve

Nimrod-G Architecture

Origin
  • Implements scheduling and monitoring
  • Exists for the entire duration of the experiment
  • Responsible for execution of the experiment within specified time and cost constraints

Client
  • User interacts with the Origin process through the client
  • Multiple clients can connect to a single Origin process and monitor the same experiment

Page 35: Nimrod & NetSolve

Nimrod Components

Nimrod Resource Broker (NRB)
  • Origin process spawns an NRB on the remote site
  • Interacts with GRAM
  • Capabilities beyond GRAM, including file staging, creation of jobs and process control

Page 36: Nimrod & NetSolve

Experiments

90-second jobs over 10 simulated queues with different access costs (Q1 = 10, Q2 = 12, etc.)

100 jobs: 9000 seconds if run one after another

10 queues: 900 seconds optimal

Deadlines (seconds): 990, 1980, 2970

Costs: 252000, 171000, 126000
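The timing figures on this slide can be reproduced with a few lines of arithmetic, sketched here in Python. The sketch covers only the times and the number of queues each deadline requires. It does not attempt to reproduce the cost figures, because the slide does not say how they were computed.

```python
jobs, seconds_per_job, queues = 100, 90, 10

sequential = jobs * seconds_per_job    # one queue runs everything: 9000 s
optimal = sequential // queues         # work spread evenly over 10 queues: 900 s

queues_needed = {}
for deadline in (990, 1980, 2970):
    # How many 90-second jobs one queue can finish before the deadline.
    jobs_per_queue = deadline // seconds_per_job
    # Fewest queues that can finish all jobs in time (ceiling division).
    queues_needed[deadline] = -(-jobs // jobs_per_queue)

print(sequential, optimal, queues_needed)
```

The three deadlines are 1.1, 2.2 and 3.3 times the 900-second optimum, so the tightest one needs all 10 queues while the loosest can be met with 4.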

Page 37: Nimrod & NetSolve