DIRAC Review (12th December 2005), Stuart K. Paterson

DIRAC Review

Workload Management System

Contents

- Introduction & Overview
- The Life Cycle of a DIRAC Job
- Central Services & Interactions
- Distributed Workload Management
- WMS Strategies
- Outlook & Improvements

Introduction

- The WMS is a key component of DIRAC
  - Realizes the PULL mechanism
  - Community Overlaying Grid System (COGS) paradigm
- WMS services:
  - Rely on a MySQL Job Database
  - XML-RPC protocol used for client-service access
  - Jabber is used for communication between services
  - Condor ClassAd for job JDL

The Life of a DIRAC Job

- Consider a typical DaVinci analysis job, since other use cases involve a subset of these steps
- Job is submitted to the WMS via the DIRAC API
  - See tomorrow’s presentation

[Figure: User, GANGA, DIRAC API and DIRAC, with the calls submit(), status() and getOutput()]

WMS Overview

[Figure: WMS architecture diagram. Components: JobReceiver, LFC, Matcher, DataOptimiser, JobDB, TaskQueue, AgentDirector, AgentMonitor, Pilot Agent, LCG WMS, Computing Resource]

Secure Job Receiver

- Currently the only secure service
- Assigns Job ID
- Saves job in JobDB
- Also uploads and saves the proxy of the user
- Notifies the Optimiser (Data or FIFO, depending on the requirements of the job)

Input/Output Sandbox Services

- At present a MySQL DB is used for storing the I/O sandbox
  - Very fast and efficient
  - No problems observed for ‘small’ files (< 10 MB)
  - Limit on DB size of 4 GB
- Proposal is to move to Grid storage for this, but:
  - This will be slower
  - Extra dependency on LFC
- Will be implemented and tested
  - Final decision will be made based on performance

Input/Output Sandbox Mechanism

Job Database

- MySQL DB, but this is not accessed directly
  - Accessed through the JobDB class
- Contains full information about all the jobs
  - Job description and status
  - Primary job parameters
    - Common for all jobs, e.g. Owner; access optimized
  - Extra job parameters
    - Arbitrary key/value pairs

JobDB Interface

- Marks job status as ‘ready’ when added
- Not only a thin layer on top of SQL statements
  - Performs high-level operations
    - Adding jobs
    - Removing jobs
  - Provides bulk queries (e.g. for job monitoring)
- Scalability issues
  - System tested with up to 15000 (production and analysis) jobs without automatic cleaning; no problems so far…
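
The JobDB idea described above (high-level operations and bulk queries behind a class, rather than raw SQL from clients) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the table layout, the method names, and the use of an in-memory SQLite database as a stand-in for MySQL. It is not the actual DIRAC JobDB class.

```python
import sqlite3


class JobDB:
    """Hypothetical stand-in for the JobDB class (SQLite instead of MySQL)."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        # Primary parameters: common to all jobs, stored as real columns.
        self.conn.execute(
            "CREATE TABLE jobs (job_id INTEGER PRIMARY KEY, owner TEXT, "
            "status TEXT, jdl TEXT)"
        )
        # Extra parameters: arbitrary key/value pairs per job.
        self.conn.execute(
            "CREATE TABLE job_params (job_id INTEGER, name TEXT, value TEXT)"
        )

    def add_job(self, owner, jdl):
        """High-level operation: store the job and mark it 'ready'."""
        cur = self.conn.execute(
            "INSERT INTO jobs (owner, status, jdl) VALUES (?, 'ready', ?)",
            (owner, jdl),
        )
        return cur.lastrowid

    def set_param(self, job_id, name, value):
        self.conn.execute(
            "INSERT INTO job_params VALUES (?, ?, ?)", (job_id, name, str(value))
        )

    def remove_job(self, job_id):
        """High-level operation: remove the job and its extra parameters."""
        self.conn.execute("DELETE FROM job_params WHERE job_id = ?", (job_id,))
        self.conn.execute("DELETE FROM jobs WHERE job_id = ?", (job_id,))

    def get_statuses(self, job_ids):
        """Bulk query: one SQL statement for many jobs (e.g. for monitoring)."""
        if not job_ids:
            return {}
        marks = ",".join("?" * len(job_ids))
        rows = self.conn.execute(
            f"SELECT job_id, status FROM jobs WHERE job_id IN ({marks})",
            list(job_ids),
        )
        return dict(rows.fetchall())
```

A monitoring client would call `get_statuses()` once for a whole list of job IDs instead of issuing one query per job.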

WMS Overview

[Figure: WMS architecture diagram, same components as in the first overview; the labels JobReceiver, JobDB and DataOptimiser appear a second time]

Data Optimizer

- Instantiates File Catalog Client (LFC)
- Retrieves requirements of job from JobDB
  - Uses Condor ClassAd and MySQL
- Sets job status to ‘waitingdata’
- Checks LFC for input data files and determines suitable SEs
  - Job fails meaningfully if the data is not available
  - Read-only LFC could speed up this process
- Sets job status to ‘waiting’ ‘PilotAgent Submission’ if data is available
- Inserts job into Task Queue
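
The steps above can be sketched as a single function. This is only an illustration under stated assumptions: the file catalog is modelled as a plain dictionary from LFN to a set of SE names, an SE counts as suitable only if it holds every input file, and the ‘failed’ status name and the `reason` field are hypothetical.

```python
def optimise_job(job, replica_catalog, task_queue):
    """Check input data and queue the job (illustrative sketch only).

    job: dict with an 'input_data' list of LFNs.
    replica_catalog: dict mapping LFN -> set of SE names (stand-in for the LFC).
    task_queue: list that receives jobs ready for pilot submission.
    """
    job["status"] = "waitingdata"
    suitable = None
    for lfn in job["input_data"]:
        replicas = set(replica_catalog.get(lfn, ()))
        # Assumption: keep only SEs that hold every input file seen so far.
        suitable = replicas if suitable is None else suitable & replicas
        if not suitable:
            # Fail with a meaningful message rather than queueing the job.
            job["status"] = "failed"
            job["reason"] = f"no suitable SE for input data {lfn}"
            return False
    job["possible_ses"] = sorted(suitable or [])
    job["status"] = "waiting"
    job["minor_status"] = "PilotAgent Submission"
    task_queue.append(job)
    return True
```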

Data Optimizer - Improvements

- Currently uses proxy from the process owner certificate to access LFC
  - Could use user proxy or server certificate
  - Read-only LFC would solve this issue
- Optimizer currently checks all input files for each job
  - Can move to directories of datasets in the future
- Need to optimize LFC interaction for bulk queries
  - In cooperation with LFC developers

Task Queue

- Deliberately have many task queues
  - 1 Task Queue per set of requirements
  - Drastically reduces the matching time
- Works OK for production jobs
- For analysis jobs with many varied requirements this remains to be seen
  - ‘Double matching’ helps with this (see later)
- Too many queues can cause problems
  - Hierarchical organisation of queues with respect to requirements should improve matching
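
To show why one queue per set of requirements shortens matching, here is a minimal sketch in which requirements are compared once per queue instead of once per job. The class name, the dictionary representation of requirements and the equality-based matching rule are assumptions for illustration, not the DIRAC implementation.

```python
from collections import defaultdict, deque


class TaskQueues:
    """One FIFO queue per distinct set of requirements (illustrative sketch)."""

    def __init__(self):
        # Key: frozenset of (name, value) requirement pairs. Value: job IDs.
        self.queues = defaultdict(deque)

    def insert(self, job_id, requirements):
        self.queues[frozenset(requirements.items())].append(job_id)

    def match(self, resource):
        """Return the first job whose queue requirements the resource meets."""
        for key, jobs in self.queues.items():
            # Requirements are checked once per queue, not once per job.
            if jobs and all(resource.get(name) == value for name, value in key):
                return jobs.popleft()
        return None
```

With many production jobs sharing identical requirements, the number of comparisons grows with the number of queues rather than the number of jobs; with highly varied analysis jobs the two numbers approach each other, which is the concern raised above.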

WMS Overview

[Figure: WMS architecture diagram, same components as in the first overview; the labels JobReceiver, JobDB, DataOptimiser, TaskQueue and LFC appear a second time]

Agent Director (1)

- Agent Director is an API for Pilot Agent submission to LCG
  - Sets up the proxy of the user for submission to LCG
- Monitors jobs in ‘waiting’ ‘PilotAgent Submission’ status
  - Polls frequently
- After submission the job enters ‘waiting’ ‘PilotAgent Response’
  - Since the job remains in the ‘waiting’ state, it can be picked up at any time by existing Agents from the same user

Agent Director (2)

- Currently only used for ‘user’ jobs
- Move to AD for production
  - Can submit agents for each job in the Task Queue
  - Currently Pilot Agents are submitted by cron job independently of the Task Queue state
- Can also potentially have agents working in filling mode (some infinite queues)
- Could be used for submission to other Grids…

Agent Monitoring Service

- Checks Pilot Agents of jobs in ‘waiting’ state
  - Currently every 5 mins
- Monitoring of Pilot Agents on LCG allows us to:
  - Catch jobs spending too much time in the ‘Waiting’ state
  - Quickly spot the ‘Aborted’ status problem
    - Aborted agents are tracked and can be accounted
  - Spot the problem of Pilot Agents being submitted to batch queues
    - Assigns ‘waiting’ ‘Proxy Expired’ state
- Can flag jobs for Agent Director to submit further Pilot Agents as necessary

Future Developments of AD and AM Services

- Ensure PilotAgents are submitted to different LCG sites if the job requirements permit it
  - Not immediately obvious how to accomplish this
  - Would need direct submission to LCG CE
- Make use of MyProxy Server
  - Automatically renew user proxies before submission when necessary
  - Pipe long life proxy with jobs?
  - Run proxy monitor as a service which can deliver renewed user proxy?

WMS Overview

[Figure: WMS architecture diagram, same components as in the first overview; the labels JobReceiver, JobDB, DataOptimiser, TaskQueue, LFC, AgentMonitor and AgentDirector appear a second time]

Job State Machine up to this point

Jobs may be picked up as soon as they enter the ‘Waiting’ state

Pilot Agent running on LCG WN

- Simple wrapper script sent as LCG job
  - Installs DIRAC
  - Runs a standard DIRAC Agent which polls for the particular job from a particular user
  - If not successful, requests any job from that user whose requirements are satisfied by the site it runs on
    - ‘Filling’ mode (see later)
- DIRAC Agent starts the JobAgent module, which performs the job requests

Matcher (1)

- Receives request from Pilot Agent
- Only responds to sites in ‘mask’
  - Contains list of allowed sites
- Checks available jobs in task queue
- Matches requirements of job (e.g. possible SEs) to requirements from Agent (e.g. owner, JobID at site with particular LocalSE)
  - Double match: agent can put specific requirements on jobs (e.g. job of particular owner or certain priority level)

Matcher (2)

- Matcher has ‘semaphore’ mechanism to ensure job is only picked up once
- Assigns ‘matched’ state, sets the site for the job, logs this info and deletes job from task queue
- Sends job to WN
- Doesn’t need to be secure
  - However, must ensure jobs are not picked up in error
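
The double match and the ‘semaphore’ can be sketched together as follows. This is a hedged illustration, not the DIRAC Matcher: the job and agent dictionaries, their field names (`possible_ses`, `local_se`, `owner`) and the use of `threading.Lock` in place of the semaphore are all assumptions.

```python
import threading


class Matcher:
    """Illustrative double matching with a lock so a job is handed out once."""

    def __init__(self, site_mask):
        self.site_mask = set(site_mask)   # list of allowed sites
        self.task_queue = []              # job dicts waiting to be matched
        self.lock = threading.Lock()      # plays the role of the 'semaphore'

    def request_job(self, agent):
        if agent["site"] not in self.site_mask:
            return None                   # only respond to sites in the mask
        with self.lock:
            for index, job in enumerate(self.task_queue):
                # First match: the resource satisfies the job requirements.
                job_ok = agent["local_se"] in job["possible_ses"]
                # Second match: the job satisfies the agent requirements.
                agent_ok = agent.get("owner") in (None, job["owner"])
                if job_ok and agent_ok:
                    del self.task_queue[index]   # remove from the task queue
                    job["status"] = "matched"
                    job["site"] = agent["site"]
                    return job
        return None
```

Because the search and the removal happen under the same lock, two Pilot Agents asking at the same moment cannot both receive the same job.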

WMS Workflow

Job State Machine

Job Agent Module Running on WN

- Job Agent requests job from WMS, gets JDL if successful
- Installs any software not available locally
  - Links to any pre-installed software are created local to job during installation of DIRAC (dirac-install)
- Creates Job Wrapper using information local to the WN
  - Template + job-specific parameters (e.g. job JDL) + site-specific parameters (e.g. software paths)
- Job Agent executes Job Wrapper
  - Local InProcess CE
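
The "template + job-specific parameters + site-specific parameters" step can be pictured with Python's standard `string.Template`. The template text, the parameter names and the shell-script form of the wrapper are hypothetical and chosen only to keep the sketch short.

```python
from string import Template

# Hypothetical wrapper template; the real Job Wrapper template is not shown
# in these slides.
WRAPPER_TEMPLATE = Template("""#!/bin/sh
# Job wrapper for job $job_id (generated on the worker node)
export SOFTWARE_ROOT=$software_root
cd $work_dir
exec $executable $arguments
""")


def create_job_wrapper(job_params, site_params):
    """Fill the template with job-specific and site-specific parameters."""
    return WRAPPER_TEMPLATE.substitute({**job_params, **site_params})


script = create_job_wrapper(
    {"job_id": "1234", "executable": "DaVinci", "arguments": "options.opts"},
    {"software_root": "/opt/lhcb", "work_dir": "/tmp/job1234"},
)
```

`substitute()` raises `KeyError` if a placeholder has no value, so a wrapper is never written with a missing job or site parameter.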

Job Agent

[Figure: Job Agent workflow; stages shown: Job Preparation, Installing Application Software, Create Job Wrapper]

WMS Overview

[Figure: WMS architecture diagram, same components as in the first overview; the labels JobReceiver, JobDB, DataOptimiser, TaskQueue, LFC, AgentMonitor, AgentDirector, Matcher, Pilot Agent and LCG WMS appear a second time]

WMS Overview

[Figure: WMS architecture diagram; surviving labels: Pilot Agent, Computing Resource]

WMS Overview

[Figure: WMS architecture diagram; surviving label: Computing Resource]

Job Wrapper (1)

- Downloads input sandbox
  - Currently InputSandbox is a DIRAC WMS specific service
  - Can also use generic LFNs
- Provides access to the input data
  - Resolves the input data LFN into a “best replica” PFN for the execution site
  - Generates an appropriate Pool XML slice for protocol

Input Data Access Strategy

- Attempt to stage input data (multi-threaded, via lcg-gt)
  - Try rfio and dcap
  - Returns TURL for protocol (when it works)
    - The returned TURLs currently don’t work inside the applications (also can’t ‘pin’ files yet…)
- Currently use globally constructed TURL from DIRAC Storage Element class
  - Works fine with Gaudi applications
- If this isn’t available, bring datasets local

Job Wrapper (2)

- Invokes the job application in a child process
- Runs a watchdog process in parallel to the application:
  - Provides heart-beats for the Job Monitoring Service
  - Collects the application CPU and memory consumption to generate average numbers at the end
  - May catch the application in a ‘stalled’ state if no CPU consumption is detected
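
The stalled-job check can be illustrated with a small function that works on CPU readings taken at each heart-beat. The function name, the threshold of three idle heart-beats and the use of cumulative CPU seconds as input are assumptions for this sketch; the slides do not say how the watchdog decides.

```python
def watchdog_summary(cpu_samples, stall_after=3):
    """Summarise cumulative CPU readings taken at each heart-beat.

    cpu_samples: cumulative CPU seconds of the application, one per heart-beat.
    stall_after: hypothetical threshold of consecutive heart-beats with no
                 CPU consumption before the job is considered stalled.
    """
    idle_beats = 0
    stalled = False
    for previous, current in zip(cpu_samples, cpu_samples[1:]):
        idle_beats = idle_beats + 1 if current == previous else 0
        if idle_beats >= stall_after:
            stalled = True
            break
    # Average CPU consumed per heart-beat interval over the whole run.
    intervals = max(len(cpu_samples) - 1, 1)
    average = (cpu_samples[-1] - cpu_samples[0]) / intervals if cpu_samples else 0.0
    return {"stalled": stalled, "avg_cpu_per_beat": average}
```

For example, readings of `[0, 5, 5, 5, 5]` show three consecutive heart-beats without CPU consumption, so the job would be flagged as stalled under the assumed threshold.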

Job Wrapper (3)

- May receive messages through a messaging system
  - Jabber messaging was demonstrated
  - Possibility to kill the application gracefully or spy on the application output
  - Not used currently, as security issues should be sorted out first if at all possible
- Collects job execution environment and consumption parameters, passing them to the Job Monitoring Service
  - Local job IDs, worker node characteristics and load, total CPU consumption, job timing, etc.

Job Wrapper (4)

- Uploading output sandbox
  - Currently OutputSandbox is a DIRAC WMS specific service
  - Might be moved to a generic SE implementation soon
- Uploading output data
  - Uploads output data to a predefined SE
    - Default one or user defined
  - Chooses PFN path according to the LHCb conventions
    - May be overridden by user (not recommended)
- Notifies the Job Monitoring Service of the changes in the job state

Job Wrapper Workflow

Overview of State Machine After Job Reaches WN (1)

Overview of State Machine After Job Reaches WN (2)

Overview of State Machine After Job Reaches WN (3)

Overview of Status Machine for Failed Jobs

DIRAC Job in Final State

- Once the DIRAC Agent(s) have executed, the Pilot Agent terminates gracefully, freeing the resource
- If successful, the job is in the ‘outputready’ state, which means it is retrievable
- User can request output at any time

Job Logging

Logging is a mixture of primary (Job State) and secondary (App State) job states

JOB STATE                   DATE        TIME      SITE
submission                  2005-12-11  13:40:21  LCG.CNAF.it
ready                       2005-12-11  13:40:23  LCG.CNAF.it
waitingdata                 2005-12-11  13:40:24  LCG.CNAF.it
waiting                     2005-12-11  13:40:26  LCG.CNAF.it
matched                     2005-12-11  13:53:09  LCG.CNAF.it
Job received by Agent       2005-12-11  13:53:09  LCG.CNAF.it
Installing Software         2005-12-11  13:53:09  LCG.CNAF.it
Job prepared to submit      2005-12-11  13:55:01  LCG.CNAF.it
scheduled                   2005-12-11  13:55:01  LCG.CNAF.it
queued                      2005-12-11  13:55:01  LCG.CNAF.it
running                     2005-12-11  13:55:03  LCG.CNAF.it
Starting DIRAC job          2005-12-11  13:55:03  LCG.CNAF.it
DIRAC job initialization    2005-12-11  13:55:04  LCG.CNAF.it
Getting Input Data          2005-12-11  13:55:04  LCG.CNAF.it
Starting the application    2005-12-11  13:55:27  LCG.CNAF.it
DaVinci step 1 started      2005-12-11  13:55:30  LCG.CNAF.it
DaVinci execution, step 1   2005-12-11  13:55:38  LCG.CNAF.it
DaVinci, step 1 done        2005-12-11  13:56:54  LCG.CNAF.it
Job finalization            2005-12-11  13:56:54  LCG.CNAF.it
Job finished successfully   2005-12-11  13:56:55  LCG.CNAF.it
done                        2005-12-11  13:56:55  LCG.CNAF.it
outputready                 2005-12-11  13:56:56  LCG.CNAF.it
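
A log in this form is easy to post-process, for example to measure how long a job sat in ‘waiting’ before a Pilot Agent matched it. The sketch below is an illustration only: it uses a few rows of the log with the SITE column dropped, and the list of primary states is an assumption read off the lower-case entries in the log.

```python
from datetime import datetime

# Assumption: the lower-case entries in the log are the primary job states.
PRIMARY_STATES = [
    "submission", "ready", "waitingdata", "waiting", "matched",
    "scheduled", "queued", "running", "done", "outputready",
]

# A few rows from the log (SITE column omitted for brevity).
LOG = """\
submission 2005-12-11 13:40:21
waiting 2005-12-11 13:40:26
matched 2005-12-11 13:53:09
Job received by Agent 2005-12-11 13:53:09
outputready 2005-12-11 13:56:56
"""


def parse_log(text):
    """Split each row into (state, timestamp); states may contain spaces."""
    entries = []
    for line in text.splitlines():
        state, date, time = line.rsplit(" ", 2)
        stamp = datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M:%S")
        entries.append((state, stamp))
    return entries


def primary_timeline(entries):
    """Keep only primary job states, mapped to the time they were entered."""
    return {state: stamp for state, stamp in entries if state in PRIMARY_STATES}


timeline = primary_timeline(parse_log(LOG))
wait_seconds = (timeline["matched"] - timeline["waiting"]).total_seconds()
```

For the job shown above, `wait_seconds` covers the interval from 13:40:26 to 13:53:09, i.e. 12 minutes 43 seconds spent waiting for a Pilot Agent.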

Job Monitoring Service

- At all stages the Job Monitoring Service is used as an interface to update status information
  - Changes status in Job DB directly
  - Also updates the Job Logging information
- Two entry points, one for writing, one for reading
  - Move writing to a secure service in the future
- Optimized for bulk queries
  - One of the most solicited services
  - Could maintain a separate cache of state information to reduce load, if this becomes necessary in the future

Cleaning Up

- Cleaning Agent is used for Production jobs
- Accounting Agent monitors jobs in final state
  - Extracts accounting info and marks job as ‘deleted’
  - Cleaning Agent deletes job on next loop
- For analysis/user jobs, could mark as ‘Accounting Sent’ then mark as ‘Purgeable’
  - To be decided
- New Cleaning Agent can implement policy
  - If output retrieved, hold job for ~1 day more ??
  - If output not yet retrieved, hold job for ~1 week ??
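
The proposed retention policy could be expressed as a small predicate. This is only a sketch of the idea under discussion: the function name and arguments are hypothetical, and the one-day and one-week holds are the tentative values from the slide, not decided figures.

```python
from datetime import datetime, timedelta


def is_purgeable(finished_at, retrieved_at, now,
                 hold_after_retrieval=timedelta(days=1),
                 hold_unretrieved=timedelta(weeks=1)):
    """Decide whether a finished job may be purged (tentative policy).

    finished_at: time the job reached its final state.
    retrieved_at: time the user fetched the output, or None if not yet fetched.
    """
    if retrieved_at is not None:
        # Output retrieved: hold the job for about one more day.
        return now - retrieved_at >= hold_after_retrieval
    # Output not yet retrieved: hold the job for about one week.
    return now - finished_at >= hold_unretrieved
```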

Overall Job State Machine

Current WMS Pilot Agent Strategy

- Now submit up to 4 Pilot Agents per job (unless all are being ‘Aborted’)
  - These are submitted one at a time with a waiting period of 5 minutes between
- Agent ‘Filling’ mode
  - When a Pilot Agent arrives at a WN, it first requests a particular job from the user. Next, the Pilot Agent will request any job from the same user
    - If the requirements of the job match the site this is successful
  - This should be optimized to make the most of the available resource, e.g. check time left on WN

Possible WMS Strategies (1)

- Simplest strategy is no strategy, 1 Pilot Agent per job
- ‘Filling’ mode
- Multi-threaded Agent infrastructure in place
  - Can run jobs in parallel, can be a huge improvement
  - Especially when mixing jobs of different priority and nature
    - Reading data, downloading etc. can be complementary activities

Possible WMS Strategies (2)

- Picking up jobs with higher priority can be achieved through the ‘double matching’ mechanism
- Running jobs of different members of the same VO by the same Pilot Agent
  - Very promising mechanism to optimize the workload for the LHCb VO as a whole

Outlook & Improvements (1)

- Input / Output Sandbox move to Grid storage
- Job ‘failures’ on LCG will be recovered as much as possible
  - e.g. treatment of Stalled jobs
- Ensure PilotAgents are submitted to different LCG sites if the job requirements permit it
  - To cope with troublesome sites

Outlook & Improvements (2)

- Extended life of user proxies using MyProxy Server or …
- Explore use of Multi-Threaded Agent on the Grid
- Provide at least minimal interactivity with running job
  - Job killing/spying
  - Spotting stalled Applications