Grid Engine 6.2 Simple Workflow Intro - BioTeam · 2017-04-01 · [email protected] Job...

[email protected]

Grid Engine 6.2

Simple Workflow Intro

Created by The BioTeam, http://blog.bioteam.net

Topics

  Talk about using SGE more effectively
  Enabling workflows & pipelines via:
    Job Dependencies
    Array Jobs
  Some live examples
  SGE Troubleshooting (time permitting)


Automating Workflows

  A few comments:
    Methods we will talk about today are great for flexibility & ad-hoc development
      Especially for shell, Perl or Ruby scripters
    There are more formal methods available for “serious” cluster-aware scientific software
      DRMAA


DRMAA

  Distributed Resource Management Application API (“DRMAA”)
    Standard API for cluster job submission & control
    Lets you write cluster-aware software that will be portable across different cluster schedulers
  Available on:
    SGE, PBS, PBSPro, Torque and Platform LSF*


DRMAA

  Many DRMAA bindings
    Perl, C, C++, C#, Java, Python, Ruby
  Documentation & Tutorials
    http://www.drmaa.org


Being more effective


Job Dependencies

  The SGE scheduler does not promise to dispatch jobs in the order in which one submits them.
  What if I have jobs that need to run in a certain order?
  Imagine this scenario:
    Step 1 - Data staging script
    Step 2 - Data analysis script
    Step 3 - Result QC & staging script
    Step 4 - Cleanup script


Job Dependencies

  SGE Job Dependency Syntax allows for ordered job execution
  Hinges upon a simple SGE feature:
    Job Names
  Huh?
    We need job names or some other identifier because we can’t be sure what SGE jobID the scheduler will assign our task
    With assignable names we can reference jobs that are already pending, holding or running


Job Dependency Example

  qsub -N "worker1" my-job-script.sh
  qsub -N "worker2" my-job-script2.sh
  qsub -hold_jid worker1,worker2 cleanupJob.sh

  See what we did up there?
    Our worker scripts will run when resources are available
    The cleanup script won’t run until the workers are done
  It all hinges on this:
    By “naming” our jobs we can now reference them when using the “-hold_jid” argument.
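
The four-step scenario (staging, analysis, QC, cleanup) can be chained with the same two switches. The sketch below is only an illustration: the script names (stage.sh, analyze.sh, qc.sh, cleanup.sh) and job names are hypothetical, and the QSUB variable is an assumption added so the commands can be printed as a dry run on a machine without SGE. Set QSUB=qsub to submit for real.

```shell
#!/bin/sh
# Dry run by default: print each qsub command instead of submitting it.
# Set QSUB=qsub in the environment to submit for real.
QSUB="${QSUB:-echo qsub}"

submit_pipeline() {
    # Step 1 has no dependency and runs as soon as resources are free.
    $QSUB -N stage stage.sh
    # Each later step is held until the named job before it has finished.
    $QSUB -N analyze -hold_jid stage analyze.sh
    $QSUB -N qc -hold_jid analyze qc.sh
    $QSUB -N cleanup -hold_jid qc cleanup.sh
}

submit_pipeline
```

All four jobs are submitted immediately; SGE keeps the later ones on hold until the job they name has left the system.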


Job Dependency Live Demo

  Using example scripts in:
    { dag fill in path here! }


Array Jobs


Array Jobs

  Extremely common use case in life science clustering:
    “I need to run my program 100,000 times against 100,000 different input files”
  Most people would …
    Use ‘qsub’ to submit 100,000 separate jobs
  This will work but is not ideal
    Each job consumes filehandles and other system resources on the SGE qmaster host. This can slow down or even crash SGE at large enough scales
    For users it can be a pain to monitor 100,000 jobs via ‘qstat’
  There is a better way!


Array Jobs

  Array jobs let you submit many individual “tasks” within one job submission
  Benefits:
    Only one qsub required
    Only one jobID or name to monitor in qstat
    Significantly reduces load on SGE qmaster


Array Jobs: Qsub syntax

  This is a 10 element task submission

  qsub -t 1-10:1 -N arrayJob \
    ./my-arrayJobScript.sh


Array Jobs: Qsub syntax

  The “-t” switch:
    -t [FirstTask]-[LastTask]:[StepSize]
  Examples
    What is the difference in these two commands?
      qsub -t 1-100:1 ./my-array-job.sh
      qsub -t 1-100:2 ./my-array-job.sh
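
One way to see the difference is to list the task IDs each range covers. The sketch below approximates that locally with seq, which takes the same first, step and last values in a different order. It is not an SGE command, and task_ids is a hypothetical helper used only for illustration.

```shell
#!/bin/sh
# Print the task IDs that "-t FIRST-LAST:STEP" would cover, using seq.
# task_ids is a hypothetical helper for illustration only.
task_ids() {
    first=$1; last=$2; step=$3
    seq "$first" "$step" "$last"
}

# -t 1-100:1 covers every ID from 1 to 100, so 100 tasks.
task_ids 1 100 1 | wc -l
# -t 1-100:2 covers only the odd IDs 1, 3, ..., 99, so 50 tasks.
task_ids 1 100 2 | wc -l
```

With a step size of 2 the array has half as many tasks, and each task sees an odd value in its task ID.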


Array Jobs: How they work

  The secret is simple
    For each task in the array, SGE will populate a special environment variable
      $SGE_TASK_ID
    Running tasks can query this variable to learn their position in the array
    Often used to build paths to input or output files
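
A minimal sketch of an array job script that builds its file paths from $SGE_TASK_ID follows. The input/ and output/ layout and the two helper names are hypothetical, and the fallback value of 1 is an assumption added so the script can be tried outside of SGE.

```shell
#!/bin/sh
# SGE sets SGE_TASK_ID for every task of an array job.
# The ":-1" fallback exists only so this can be run by hand outside SGE.
# The input/ and output/ paths are hypothetical examples.
input_for_task() {
    echo "input/sample.${SGE_TASK_ID:-1}.dat"
}

output_for_task() {
    echo "output/result.${SGE_TASK_ID:-1}.out"
}

echo "task ${SGE_TASK_ID:-1}: $(input_for_task) -> $(output_for_task)"
# A real script would now run the application on those two paths.
```

Submitted with something like 'qsub -t 1-10 -N arrayJob ./my-arrayJobScript.sh', each of the ten tasks would pick up its own input and output file.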


Array Jobs: Live Demo

  Using example scripts from
    { dag put path here! }


Array Jobs: Final

  For advanced cases:
    A recent SGE enhancement allows for job dependency conditions among individual array job task elements


Synchronous qsub


Synchronous qsub

  What if running a cluster job is only a tiny part of a larger workflow or pipeline?
  Solution:
    Synchronous job submission will “block” until the job completes
    This lets you embed a qsub call into some other script or workflow
    When qsub completes, your script resumes
  Example
    qsub -sync y -b y /bin/sleep 10
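
Embedded in a larger script, the synchronous call might look like the sketch below. run_on_cluster is a hypothetical wrapper name, and the surrounding echo lines stand in for whatever the rest of the workflow does. It relies on qsub returning a non-zero exit status when a synchronous job fails.

```shell
#!/bin/sh
# run_on_cluster is a hypothetical wrapper: it submits a command with
# "-sync y" so that qsub blocks until the job has finished.
run_on_cluster() {
    if qsub -sync y -b y "$@"; then
        echo "cluster step finished: $*"
    else
        status=$?
        echo "cluster step failed (qsub exit status $status): $*" >&2
        return "$status"
    fi
}

# Only submit when qsub is actually available on this host.
if command -v qsub >/dev/null 2>&1; then
    echo "preparing input"
    run_on_cluster /bin/sleep 10 && echo "continuing with the rest of the workflow"
fi
```

Because the wrapper passes the exit status through, the calling script can stop the pipeline as soon as a cluster step fails.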


Questions?


Troubleshooting


Grid Engine Troubleshooting

  Let’s be honest
    Not many user-accessible troubleshooting methods
    The best resource is still the output and error files that your jobs produce
    The most powerful methods are available to cluster admins only


Grid Engine Troubleshooting

  There are two core problem types
    Job Level
      Cluster seems OK, example scripts work fine
      Some user jobs/apps fail
    Cluster Level
      Problems running all jobs
      Problems submitting to certain PE/queue/Project
      Problems with jobs on certain nodes


Grid Engine Troubleshooting

  Dealing with Cluster Level problems
    STDOUT/STDERR from user jobs still the best initial debug resource
    SGE messages and logs are usually very helpful
      $SGE_ROOT/$SGE_CELL/spool/qmaster/messages
      $SGE_ROOT/$SGE_CELL/spool/qmaster/schedd/messages
    Execd spool logs often hold job specific error data
      Remember that local spooling may be used (!)
      $SGE_ROOT/$SGE_CELL/spool/<node>/messages
    SGE panic location
      Will log to /tmp on any node when $SGE_ROOT not found or not writable


Job Level Troubleshooting

  Job dies instantly
    First pass
      Check the .o and .e files in the job directory
      Check .po and .pe files for parallel MPI jobs
      Best resource, usually clear error messages found:
        Permission problem, no license available, path problem, syntax error in app, etc.
    Second pass (admin assistance required)
      Check qmaster spool messages and node execd messages


Job Level Troubleshooting

  Job dies instantly …
    Third pass
      qsub -w v <full job request>
      This will tell you if the job can run assuming:
        All slots on all queues were empty
        All load values were ignored
      Good source of info on ‘why can’t my job be scheduled’ problems


Job Level Troubleshooting

  Job pending forever
    First Pass:
      qstat -j <job_id>
      This will tell you why the job is pending and if there are any reasons why queues cannot accept the job
    Possible root causes
      Impossible resource requested, license not available
      Scheduling oddness



  Job pending forever
    Second Pass (admin required)
      $SGE_ROOT/default/spool/qmaster/schedd/messages
      Just to see if anything weird is going on with the scheduler



  Job runs from command line on front end node, but not under Grid Engine
    Most common root cause:
      Difference in environment variables
      Difference in shell execution environment
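
One way to track down an environment difference is to dump the environment on the front end and inside a job, then compare the two dumps. The sketch below shows only the comparison step. env_diff is a hypothetical helper, and the two dumps in the demonstration are made-up values, not output from a real cluster.

```shell
#!/bin/sh
# env_diff is a hypothetical helper: given two environment dumps, it
# prints the lines that differ ("<" = first file, ">" = second file).
env_diff() {
    sort "$1" > "$1.sorted"
    sort "$2" > "$2.sorted"
    diff "$1.sorted" "$2.sorted" | grep '^[<>]'
    rm -f "$1.sorted" "$2.sorted"
}

# Local demonstration with two made-up dumps.
tmpdir=$(mktemp -d)
printf 'PATH=/usr/local/bin:/usr/bin\nSHELL=/bin/bash\n' > "$tmpdir/frontend.env"
printf 'PATH=/usr/bin\nSHELL=/bin/bash\n' > "$tmpdir/job.env"
env_diff "$tmpdir/frontend.env" "$tmpdir/job.env"
rm -rf "$tmpdir"
```

In practice the first dump would come from running env on the front end node and the second from running env inside a submitted job script.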


General Troubleshooting

  Many times the problems are not SGE related
    Permission, path or ENV problems
  Best thing to do is watch STDERR and STDOUT
    Use the qsub ‘-e’ and ‘-o’ switches to send output to a file that you can read
    Use qsub ‘-j y’ to send STDOUT and STDERR to the same file (useful for debugging)


General Troubleshooting (cont.)

  To get email listing why a job aborted
    Use: ‘qsub -m a -M user@host [rest of command]’


General Troubleshooting (cont.)

  Checking exit status and seeing if jobs ran to completion without error
    Use: ‘qacct -j <job_id>’ to query the accounting data
    Will also tell you if the job had to be requeued onto a different queue or exechost
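
When checking many jobs it can help to pull just the exit status out of the accounting record. The sketch below assumes the usual "field value" layout of qacct output, with one exit_status line per record. job_exit_status is a hypothetical helper, and the sample text is abbreviated and made up rather than real accounting data.

```shell
#!/bin/sh
# job_exit_status is a hypothetical helper: it reads "qacct -j" output
# on stdin and prints the value of each exit_status field it finds.
job_exit_status() {
    awk '$1 == "exit_status" { print $2 }'
}

# Abbreviated, made-up sample in the "field value" layout.
sample='jobname      sleeper.sh
failed       0
exit_status  0'

printf '%s\n' "$sample" | job_exit_status
```

On a cluster the helper would be fed from the real command, for example 'qacct -j <job_id> | job_exit_status'.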


Basic Debug Process

  Verify for yourself that the cluster and SGE are happy before you do anything else
    ‘qstat -f’, ‘qrsh hostname’, ‘qhost’, etc.
    This will quickly identify systemic or cluster-wide issues
  Then move on to dealing with the specific issue


Basic Debug Process

  If problems persist, verify that the application actually runs OUTSIDE of Grid Engine
    Easier to catch app/user/system issues
    Good way to catch the super subtle stuff
    This is especially useful for MPI parallel programs


Recommendation

  Build a personal portfolio of simple testing scripts
    qrsh hostname
    $SGE_ROOT/examples/jobs/simple.sh
    $SGE_ROOT/examples/jobs/sleeper.sh
  Get your users to supply you with example or dummy scripts that use real portfolio apps


Questions?

