Grid Engine 6.2 Simple Workflow Intro - BioTeam · 2017-04-01
Grid Engine 6.2
Simple Workflow Intro
Created by The BioTeam, http://blog.bioteam.net
Topics
  Talk about using SGE more effectively
  Enabling workflows & pipelines via:
    Job Dependencies
    Array Jobs
  Some live examples
  SGE Troubleshooting (time permitting)
Automating Workflows
  A few comments:
    The methods we will talk about today are great for flexibility & ad-hoc development
      Especially for shell, Perl or Ruby scripters
    There are more formal methods available for “serious” cluster-aware scientific software
      DRMAA
DRMAA
  Distributed Resource Management Application API (“DRMAA”)
  Standard API for cluster job submission & control
  Lets you write cluster-aware software that will be portable across different cluster schedulers
  Available on:
    SGE, PBS, PBSPro, Torque and Platform LSF*
DRMAA
  Many DRMAA bindings
    Perl, C, C++, C#, Java, Python, Ruby
  Documentation & Tutorials
    http://www.drmaa.org
Job Dependencies
  The SGE scheduler does not promise to dispatch jobs in the order in which one submits them.
  What if I have jobs that need to run in a certain order?
  Imagine this scenario:
    Step 1 - Data staging script
    Step 2 - Data analysis script
    Step 3 - Result QC & staging script
    Step 4 - Cleanup script
Job Dependencies
  SGE job dependency syntax allows for ordered job execution
  Hinges upon a simple SGE feature:
    Job Names
  Huh?
    We need job names or some other identifier because we can’t be sure what SGE jobID the scheduler will assign our task
    With assignable names we can reference jobs that are already pending, holding or running
Job Dependency Example
  qsub -N "worker1" my-job-script.sh
  qsub -N "worker2" my-job-script2.sh
  qsub -hold_jid worker1,worker2 cleanupJob.sh
  See what we did up there?
    Our worker scripts will run when resources are available
    The cleanup script won’t run until the workers are done
  It all hinges on this:
    By “naming” our jobs we can now reference them when using the “-hold_jid” argument.
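The four-step scenario from the earlier Job Dependencies slide can be chained the same way. This is a minimal sketch rather than a tested pipeline: the script names and job names are hypothetical, and the QSUB variable is an assumption added here so that the commands are only printed unless you run it with QSUB=qsub.

```shell
#!/bin/sh
# Dry run by default: print each qsub command instead of submitting it.
# Run with QSUB=qsub in the environment to submit for real.
QSUB="${QSUB:-echo qsub}"

# Script and job names below are hypothetical placeholders.
submit_pipeline() {
    # Step 1 has no dependency and runs as soon as resources are free.
    $QSUB -N stage stage-data.sh
    # Each later step is held until the named job before it has finished.
    $QSUB -N analyze -hold_jid stage analyze-data.sh
    $QSUB -N qc -hold_jid analyze qc-results.sh
    $QSUB -N cleanup -hold_jid qc cleanup.sh
}

submit_pipeline
```

All four jobs can be submitted at once; the scheduler keeps steps 2 to 4 on hold until the job they name has left the system.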
Job Dependency Live Demo
Using example scripts in: { dag fill in path here! }
Array Jobs
  Extremely common use case in life science clustering:
    “I need to run my program 100,000 times against 100,000 different input files”
  Most people would …
    Use ‘qsub’ to submit 100,000 separate jobs
  This will work but is not ideal
    Each job consumes filehandles and other system resources on the SGE qmaster host.
    This can slow down or even crash SGE at large enough scales
    For users it can be a pain to monitor 100,000 jobs via ‘qstat’
  There is a better way!
Array Jobs
  Array jobs let you submit many individual “tasks” within one job submission
  Benefits:
    Only one qsub required
    Only one jobID or name to monitor in qstat
    Significantly reduces load on SGE qmaster
Array Jobs: Qsub syntax
  This is a 10-element task submission
    qsub -t 1-10:1 -N arrayJob \
      ./my-arrayJobScript.sh
Array Jobs: Qsub syntax
  The “-t” switch:
    -t <FirstTask>-<LastTask>:<StepSize>
  Examples
    What is the difference between these two commands?
      qsub -t 1-100:1 ./my-array-job.sh
      qsub -t 1-100:2 ./my-array-job.sh
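One way to see the answer is to list the task IDs each range produces. The sketch below uses seq to mimic the first-last:step expansion outside of SGE; task_ids is a hypothetical helper written for this illustration, not an SGE command.

```shell
#!/bin/sh
# task_ids FIRST LAST STEP
# Print the task IDs that "qsub -t FIRST-LAST:STEP" would create.
task_ids() {
    seq "$1" "$3" "$2"
}

echo "1-100:1 -> $(task_ids 1 100 1 | grep -c .) tasks"
echo "1-100:2 -> $(task_ids 1 100 2 | grep -c .) tasks, last ID $(task_ids 1 100 2 | tail -n 1)"
```

With a step of 1 every ID from 1 to 100 gets a task. With a step of 2 only the odd IDs 1, 3, ... 99 do, so the second command runs half as many tasks.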
Array Jobs: How they work
  The secret is simple
  For each task in the array, SGE will populate a special environment variable
    $SGE_TASK_ID
  Running tasks can query this variable to learn their position in the array
  Often used to build paths to input or output files
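A minimal sketch of that path-building idea follows. The file layout (input.N.txt and output.N.txt), the DATA_DIR variable and both helper functions are assumptions made up for this example; only SGE_TASK_ID comes from Grid Engine.

```shell
#!/bin/sh
# Hypothetical layout: inputs are named input.1.txt, input.2.txt, ... in DATA_DIR.
# SGE sets SGE_TASK_ID for every array task; the fallback to 1 only exists
# so the script can be tried by hand outside of the cluster.
input_path() {
    echo "${DATA_DIR:-./data}/input.${SGE_TASK_ID:-1}.txt"
}

output_path() {
    echo "${DATA_DIR:-./data}/output.${SGE_TASK_ID:-1}.txt"
}

echo "task ${SGE_TASK_ID:-1}: reading $(input_path), writing $(output_path)"
```

Submitted as an array job with a range such as -t 1-100, each of the 100 tasks would pick its own input and output file from the task ID alone.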
Array Jobs: Live Demo
  Using example scripts from { dag put path here! }
Array Jobs: Final
  For advanced cases:
    A recent SGE enhancement (the qsub “-hold_jid_ad” option) allows for job dependency conditions among individual array job task elements
Synchronous qsub
  What if running a cluster job is only a tiny part of a larger workflow or pipeline?
  Solution:
    Synchronous job submission will “block” until the job completes
    This lets you embed a qsub call into some other script or workflow
    When qsub completes, your script resumes
  Example
    qsub -sync y -b y /bin/sleep 10
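Embedded in a larger script, the blocking call might look like the sketch below. It assumes that qsub -sync y returns a non-zero exit status when the job fails, which is how the qsub man page describes it and is worth confirming on your own cluster. The run_step helper and the QSUB dry-run variable are hypothetical additions for this example.

```shell
#!/bin/sh
# Dry run by default: print the qsub command instead of submitting it.
# Run with QSUB=qsub in the environment to submit for real.
QSUB="${QSUB:-echo qsub}"

# run_step COMMAND [ARGS...]
# Submit a binary as a job, wait for it, and report whether qsub succeeded.
run_step() {
    if $QSUB -sync y -b y "$@"; then
        echo "step finished: $*"
    else
        echo "step failed: $*" >&2
        return 1
    fi
}

# The rest of the workflow only runs if the cluster step came back cleanly.
run_step /bin/sleep 10 && echo "continuing with the rest of the workflow"
```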
Grid Engine Troubleshooting
  Let’s be honest
    Not many user-accessible troubleshooting methods
    Best resource is still the output and error files that your jobs produce
    The most powerful methods are available to cluster admins only
Grid Engine Troubleshooting
  There are two core problem types
    Job Level
      Cluster seems OK, example scripts work fine
      Some user jobs/apps fail
    Cluster Level
      Problems running all jobs
      Problems submitting to certain PE/queue/Project
      Problems with jobs on certain nodes
Grid Engine Troubleshooting
  Dealing with Cluster Level problems
    STDOUT/STDERR from user jobs is still the best initial debug resource
    SGE messages and logs are usually very helpful
      $SGE_ROOT/$SGE_CELL/spool/qmaster/messages
      $SGE_ROOT/$SGE_CELL/spool/qmaster/schedd/messages
    Execd spool logs often hold job-specific error data
      Remember that local spooling may be used (!)
      $SGE_ROOT/$SGE_CELL/spool/<node>/messages
    SGE panic location
      Will log to /tmp on any node when $SGE_ROOT is not found or not writable
Job Level Troubleshooting
  Job dies instantly
    First pass
      Check the .o and .e files in the job directory
      Check .po and .pe files for parallel MPI jobs
      Best resource, usually clear error messages found:
        Permission problem, no license available, path problem, syntax error in app, etc.
    Second pass (admin assistance required)
      Check qmaster spool messages and node execd messages
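For the first pass, a small helper can gather the files to read. This sketch assumes the default SGE naming, where a job called NAME with job ID 12345 writes NAME.o12345 and NAME.e12345 (plus NAME.po12345 and NAME.pe12345 for parallel jobs), and that no -o or -e switch redirected them elsewhere. The job_logs function and the job name and ID used below are hypothetical.

```shell
#!/bin/sh
# job_logs JOB_NAME JOB_ID [DIR]
# List the default SGE output files for one job, if they exist:
#   NAME.oID / NAME.eID    stdout / stderr of the job
#   NAME.poID / NAME.peID  stdout / stderr of the parallel environment
job_logs() {
    dir="${3:-.}"
    for kind in o e po pe; do
        f="${dir}/$1.${kind}$2"
        [ -f "$f" ] && echo "$f"
    done
    return 0
}

# Hypothetical job name and ID; prints nothing if no such files exist here.
job_logs my-job-script.sh 12345
```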
Job Level Troubleshooting
  Job dies instantly …
    Third pass
      qsub -w v <full job request>
      This will tell you if the job can run assuming:
        All slots on all queues were empty
        All load values were ignored
      Good source of info on ‘why can’t my job be scheduled’ problems
Job Level Troubleshooting
  Job pending forever
    First Pass:
      qstat -j <job_id>
      This will tell you why the job is pending and if there are any reasons why queues cannot accept the job
    Possible root causes
      Impossible resource requested, license not available
      Scheduling oddness
Job Level Troubleshooting
  Job pending forever
    Second Pass (admin required)
      $SGE_ROOT/default/spool/qmaster/schedd/messages
      Just to see if anything weird is going on with the scheduler
Job Level Troubleshooting
  Job runs from command line on front end node, but not under Grid Engine
    Most common root cause:
      Difference in environment variables
      Difference in shell execution environment
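One way to hunt for such differences is to save an "env" listing on the front end node and another from inside a job, then compare them. The sketch below is an illustration under assumptions: the file names login.env and job.env and the missing_in_job helper are hypothetical, and values that span several lines are not handled.

```shell
#!/bin/sh
# missing_in_job LOGIN_ENV JOB_ENV
# Print the lines of LOGIN_ENV (a saved "env" listing from the front end node)
# that do not appear verbatim in JOB_ENV (the same listing saved by a job).
missing_in_job() {
    grep -v -x -F -f "$2" "$1"
}

# Hypothetical file names. One possible way to create them:
#   env > login.env                                (on the front end node)
#   qsub -cwd -b y -j y -o job.env /usr/bin/env    (as a job)
if [ -f login.env ] && [ -f job.env ]; then
    missing_in_job login.env job.env
fi
```

Variables that show up in the output are set, or set differently, in the login shell. Submitting with qsub -V, which exports the current environment to the job, is a quick way to test whether one of them is the culprit.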
General Troubleshooting
  Many times the problems are not SGE related
    Permission, path or ENV problems
  Best thing to do is watch STDERR and STDOUT
    Use the qsub ‘-e’ and ‘-o’ switches to send output to a file that you can read
    Use qsub ‘-j y’ to merge STDERR into the STDOUT file (useful for debugging)
General Troubleshooting (cont.)
  To get email listing why a job aborted
    Use: ‘qsub -m a -M user@host [rest of command]’
General Troubleshooting (cont.)
  Checking exit status and seeing if jobs ran to completion without error
    Use: ‘qacct -j <job_id>’ to query the accounting data
    Will also tell you if the job had to be requeued onto a different queue or exechost
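The accounting record is a list of "field value" lines, so the exit status can be pulled out with a one-line filter. In this sketch the job_exit_status helper is hypothetical and the sample text is an illustrative excerpt written for this example, not real accounting data.

```shell
#!/bin/sh
# job_exit_status: read "qacct -j" output on stdin and print the value of
# each exit_status line (one per job or array task record).
job_exit_status() {
    awk '$1 == "exit_status" { print $2 }'
}

# Illustrative excerpt only, not real accounting data.
sample='jobname      my-job-script.sh
jobnumber    12345
failed       0
exit_status  0'

printf '%s\n' "$sample" | job_exit_status
# On a cluster: qacct -j <job_id> | job_exit_status
```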
Basic Debug Process
  Verify for yourself that the cluster and SGE are happy before you do anything else
    ‘qstat -f’, ‘qrsh hostname’, ‘qhost’, etc.
  This will quickly identify systemic or cluster-wide issues
  Then move on to dealing with the specific issue
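Those first checks can be wrapped in a small script so they are easy to repeat. This is a sketch only: the check helper and its labels are hypothetical, and it judges each command purely by its exit status.

```shell
#!/bin/sh
failures=0

# check LABEL COMMAND [ARGS...]
# Run a command quietly, print OK or FAIL next to its label, count failures.
check() {
    label="$1"; shift
    if "$@" >/dev/null 2>&1; then
        echo "OK   $label"
    else
        echo "FAIL $label"
        failures=$((failures + 1))
    fi
}

# The three commands named on the slide; each reports FAIL where SGE is absent.
check "queues (qstat -f)" qstat -f
check "exec hosts (qhost)" qhost
check "interactive job (qrsh hostname)" qrsh hostname

echo "$failures check(s) failed"
```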
Basic Debug Process
  If problems persist, verify that the application actually runs OUTSIDE of Grid Engine
    Easier to catch app/user/system issues
    Good way to catch the super subtle stuff
    This is especially useful for MPI parallel programs
Recommendation
  Build a personal portfolio of simple testing scripts
    qrsh hostname
    $SGE_ROOT/examples/jobs/simple.sh
    $SGE_ROOT/examples/jobs/sleeper.sh
  Get your users to supply you with example or dummy scripts that use real portfolio apps