SGE Array Jobs


From GridWiki

A common problem is that you have a large number of jobs to run, and they are largely identical in terms of the command to run. For example, you may have 1000 data sets, and you want to run a single program on each of them, using the cluster. The naive solution is to somehow generate 1000 shell scripts and submit them all to the queue. This is inefficient, both for you and for the head node.

Array jobs are the solution

There is an alternative on SGE systems – array jobs. The advantages are:

  1. You only have to write one shell script
  2. You don't have to worry about deleting thousands of shell scripts, etc.
  3. If you submit an array job, and realize you've made a mistake, you only have one job id to qdel, instead of figuring out how to remove 100s of them.
  4. You put less of a burden on the head node.

In fact, there are no disadvantages that I'm aware of. Submitting an array job to do 1000 computations is entirely equivalent to submitting 1000 separate scripts, but much less work for you.

The basic commands

In this section, I assume that you prefer the bash shell. To review, a basic SGE job using bash may look like the following:

#!/bin/sh
~/programs/program -i ~/data/input -o ~/results/output

Now, let's complicate things. Assume you have input files input.1, input.2, ..., input.10000, and you want the output to be placed in files with a similar numbering scheme. You could use perl to generate 10000 shell scripts, submit them, then clean up the mess later. Or, you could use an array job. The modification to the previous shell script is simple:

#!/bin/sh
# Tell the SGE that this is an array job, with "tasks" to be numbered 1 to 10000
#$ -t 1-10000
# When a single command in the array job is sent to a compute node,
# its task number is stored in the variable SGE_TASK_ID,
# so we can use the value of that variable to get the results we want:
~/programs/program -i ~/data/input.$SGE_TASK_ID -o ~/results/output.$SGE_TASK_ID

That's it. When the above script is submitted, SGE will dispatch its tasks to available nodes, with each task numbered according to the range given by the -t option. The array job is also subject to all the usual fair queueing rules. The above script is entirely equivalent to submitting 10000 separate scripts, but without the mess.
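
For completeness, here is a hedged sketch of how such an array job might be submitted and, if you change your mind, removed; the script name and job id below are just placeholders:

# Submit the array job (assuming the script above was saved as array_job.sh)
qsub array_job.sh
# SGE replies with a single job id, e.g. "Your job-array 12345.1-10000:1 ... has been submitted"

# Delete the entire array with that one job id
qdel 12345

# On most SGE versions you can also delete individual tasks, e.g. task 42 only
qdel 12345 -t 42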

A more complex example

This is a modification of the above which only runs the program if the output file is not present.

#!/bin/sh
# Tell the SGE that this is an array job, with "tasks" to be numbered 1 to 10000
#$ -t 1-10000
# When a single command in the array job is sent to a compute node,
# its task number is stored in the variable SGE_TASK_ID,
# so we can use the value of that variable to get the results we want:
if [ ! -e ~/results/output.$SGE_TASK_ID ]
then
~/programs/program -i ~/data/input.$SGE_TASK_ID -o ~/results/output.$SGE_TASK_ID
fi

Pulling data from the ith line of a file

Let's say you have a list of numbers in a file, one number per line. For example, the numbers could be random number seeds for a simulation. For each task in an array job, you want to get the ith line from the file, where i equals SGE_TASK_ID, and use that value as the seed. This can be accomplished by using the Unix awk command:

#!/bin/sh
#$ -t 1-10000
SEEDFILE=~/data/seeds
SEED=$(awk "NR==$SGE_TASK_ID" $SEEDFILE)
~/programs/simulation -s $SEED -o ~/results/output.$SGE_TASK_ID

You can use this trick for all sorts of things. For example, if your jobs all run the same program but with very different command-line options, you can list the options in a file, one set per line, and the exercise is basically the same as above. You then have only two files to handle (or three, if you have a perl script generate the file of command lines).

Alternatives using cat, head and tail, or sed or perl are of course also possible.
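
For instance, a sed-based variant of the seed-file script might look like the following (a sketch only; sed -n "Np" prints just the Nth line of the file):

#!/bin/sh
#$ -t 1-10000
SEEDFILE=~/data/seeds
# Print only the line whose number equals this task's ID
SEED=$(sed -n "${SGE_TASK_ID}p" $SEEDFILE)
~/programs/simulation -s $SEED -o ~/results/output.$SGE_TASK_ID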

What if you number files from 0 instead of 1?

The '-t' option will not accept 0 as part of the range; for example,

#$ -t 0-99

is invalid, and will generate an error. However, I often label my input files from 0 to n−1. That's easy to deal with:

#!/bin/sh
# Tell the SGE that this is an array job, with "tasks" to be numbered 1 to 10000
#$ -t 1-10000
i=$(expr $SGE_TASK_ID - 1)
if [ ! -e ~/results/output.$i ]
then
~/programs/program -i ~/data/input.$i -o ~/results/output.$i
fi
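
If your shell is bash or any other POSIX shell, the same offset can be computed with built-in arithmetic rather than an external expr call, for example:

# Equivalent to i=$(expr $SGE_TASK_ID - 1), but without spawning an extra process
i=$((SGE_TASK_ID - 1))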

Example: R Scripts with Grid Engine Job Arrays

All of the above applies to well-behaved programs that take their input and output files on the command line. Sometimes, however, you need to use R to analyze your data, and because R scripts are not interactive you would normally have to hardcode the file names into the script. This is a royal pain. However, there is a solution that makes use of HERE documents in bash. HERE documents also exist in perl, and an online tutorial for them in bash is at http://www.tldp.org/LDP/abs/html/here-docs.html. The short of it is that a HERE document lets you embed a skeleton document (here, an R script) inside a shell script. Let's concoct an example. You have 10 data files, labeled data.1 to data.10. Each file contains a single column of numbers, and you want to do some calculation for each of them, using R. Let's use a HERE document:

#!/bin/sh
#$ -t 1-10
WORKDIR=/Users/jl566/testing
INFILE=$WORKDIR/data.$SGE_TASK_ID
OUTFILE=$WORKDIR/data.$SGE_TASK_ID.out
# See comment below about paths to R
PATHTOR=/common/bin
if [ -e $OUTFILE ]
then
rm -f $OUTFILE
fi
# Below, the phrase "EOF" marks the beginning and end of the HERE document.
# Basically, what's going on is that we're running R, and suppressing all of
# its output to STDOUT, and then redirecting whatever's between the EOF words
# as an R script, and using variable substitution to act on the desired files.
$PATHTOR/R --quiet --no-save > /dev/null <<EOF
x<-read.table("$INFILE")
write(mean(x\$V1),"$OUTFILE")
EOF

So now you can use the cluster to analyze your data – just write the R script within the HERE document, and go from there. As I've only just figured this out, some caveats are necessary. If anyone experiments and figures out something neat, let me know. Be aware of the following:

  1. In my limited experience, indentation is important for HERE documents. In particular, the terminating delimiter (the final EOF line in the example above) must start at the left-hand edge of the line, i.e. not be indented at all (the <<- form of the redirection relaxes this by stripping leading tabs). So, if you use a HERE document inside a conditional or control statement, be mindful of this.
  2. In the mean command, I escaped the dollar sign with a backslash. In my limited experiments, both mean(x\$V1) and mean(x$V1) seem to work. However, escaping the dollar sign for the read.table command prevents the variable substitution from occurring in the shell, causing R to fail, because the input file named $INFILE cannot be found. In other words, escaping in that context causes the HERE doc to pass $INFILE as a string literal to R, rather than the value stored in the shell variable.
  3. This is more useful than just array jobs on an SGE system. If you know bash well enough, you can write a shell script that takes a load of arguments and processes them with a HERE document (see the sketch after this list). This solves a major limitation of R scripts themselves. You can do the same in perl, too, on your workstation, but on the cluster you must use a shell language.
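
As a rough illustration of point 3, here is a stand-alone wrapper that takes an input file and an output file as arguments and feeds them into an R HERE document. The script name and the mean() calculation are only placeholders, and R is assumed to be on $PATH:

#!/bin/sh
# Usage: rmean.sh input_file output_file
INFILE=$1
OUTFILE=$2
R --quiet --no-save > /dev/null <<EOF
x <- read.table("$INFILE")
write(mean(x\$V1), "$OUTFILE")
EOF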

Using Rscript

Since around R-2.7, R has shipped with the Rscript binary (in the same location as the R binary), which allows you to invoke R from a plain shell script using the hash-bang notation. For submitting jobs this may be cleaner than using a shell script with a HERE document. The only thing you have to take care of is to use the -shell no -b yes options so that no intermediate shell such as tcsh is used. There is an additional trick: if you make sure your $PATH contains the right Rscript binary (which may well differ across systems), then the following is an example of a script that can be submitted as an array job:

#!/bin/env Rscript
# example.R
library(somelibrary) # etc.

task.id <- Sys.getenv("SGE_TASK_ID") # note: returned as a character string; convert with as.numeric() if needed
results <- calc.something(i=task.id)
print(results)

q(save="no")

The first line, #!/bin/env Rscript, is the trick that makes the script run under Rscript rather than under a login shell such as tcsh. The SGE_TASK_ID variable is available from the environment.

Submitting would be done as follows:

qsub -t 1-10 -shell no -b yes -v PATH=$PATH  -v R_LIBS=$R_LIBS `pwd`/example.R

The PATH has to be exported so that /bin/env can find the Rscript binary; R_LIBS has to be exported so that R can find the right libraries. In practice you would of course have to supply other arguments to your R script. This is best done using some command-line argument parser (personally I prefer the simple one by Vincent Zoonekynd, shown at the bottom of this post).
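
For example, additional arguments can simply be appended after the script name on the qsub command line; inside R they are then available via commandArgs(trailingOnly=TRUE). The argument values below are made up for illustration:

# Pass two made-up arguments (an input file and a cutoff) to the R script;
# within example.R, commandArgs(trailingOnly=TRUE) returns them as character strings
qsub -t 1-10 -shell no -b yes -v PATH=$PATH -v R_LIBS=$R_LIBS `pwd`/example.R ~/data/counts.txt 0.05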


Old News ;-)

[Sep 19, 2014] SGE Array Jobs - Scalable Computing Support Center - DukeWiki, by John Pormann

Jul 10, 2012 | wiki.duke.edu

** NOTE ** In the examples below, data files are accessed via the shared cluster file system. This can result in slow performance, especially when the file server is overloaded. To improve your application's performance, and also to avoid adding additional load on the file server, please modify the scripts below so that your program is using Scratch Disk Space.

Introduction

An SGE Array Job is a script that is to be run multiple times. Note that this means EXACTLY the same script is going to be run multiple times; the only difference between each run is a single environment variable, $SGE_TASK_ID, so your script MUST be reasonably intelligent. However, compared to submitting hundreds of independent SGE jobs, Array Jobs can be more readable: there may be only one script that you have to understand (an alternative approach might need one script to write a set of batch scripts, one script to submit those batch jobs to SGE, and another script which performs the actual program logic that you want to run).

An array job is started like any other, by issuing a qsub command with the '-t' option:

qsub -t 1-1000 myscript.q

This will run the script (usually csh or bash) 1000 times, first with $SGE_TASK_ID=1, then with $SGE_TASK_ID=2, etc. Again, it is up to your script to do something different in each case ... perhaps just cd into a different directory, process inputfile.$SGE_TASK_ID, etc. If, for some reason, you want to process every Nth number in the sequence, you can use, e.g., "qsub -t 1-1000:4" to do every 4th number (1 .. 5 .. 9).

One of the easiest ways to use an array task is to pre-compute or set-up N different input files, or input directories if more than one input file is needed. Let's say files inputA and inputB are needed by the program. We could create 100 directories, dir.1, dir.2, through dir.100 and put different input data into each directory. Then the script might look like:

#!/bin/csh
# call this file 'example1.q'
cd dir.$SGE_TASK_ID
run_my_program

then we do "qsub -t 1-100 example1.q" to start all 100 jobs. SGE will start as many of the individual tasks as it can, as soon as it can.
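
The set-up step described above might itself be scripted roughly as follows (a sketch; the per-run input file names inputA.$i and inputB.$i are assumptions about how your data is laid out):

#!/bin/sh
# Create dir.1 ... dir.100 and place a per-run copy of the input files in each
for i in $(seq 1 100)
do
    mkdir -p dir.$i
    cp ~/data/inputA.$i dir.$i/inputA
    cp ~/data/inputB.$i dir.$i/inputB
done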

Note that you can also embed the "-t 1-100" into the .q file:

#!/bin/csh
#
#$ -cwd -m b,e
#$ -t 1-100
cd dir.$SGE_TASK_ID
run_my_program

With the '-t' embedded in the file, you can submit it as "qsub example.q".

Environment Variables

There are a few environment variables that SGE sets during array tasks: $SGE_TASK_FIRST, $SGE_TASK_LAST, $SGE_STEP_SIZE. So the following script may be useful if you need certain things to happen at the start or end of the job:

#!/bin/csh
#
#$ -cwd -m b,e
#$ -t 1-100
if( $SGE_TASK_ID == $SGE_TASK_FIRST ) then
    # do first-task stuff here
endif

# do normal processing here

if( $SGE_TASK_ID == $SGE_TASK_LAST ) then
    # do last-task stuff here
endif

NOTE: remember that SGE may start multiple jobs simultaneously! So you probably shouldn't expect the "do first-task stuff" code to do on-the-fly initializations - other jobs may already be running when that "first-task stuff" is starting! Do NOT expect the first task to build files or directories for the other tasks to use.

Similarly, the last task may NOT be the last one to complete - it is simply the last one to be started. Do NOT expect the last task to clean up after the other tasks - they may still be running! If the last task deletes a file that another task is using, it will probably crash and its output will be corrupted.

The first task could be used to write a temporary file so you know that it has started. The last task could be used to submit a new job to SGE ... but ONLY IF you make that second job dependent on the current one (i.e. force the second job to wait until SGE knows that all current tasks are complete) - see [SGE Job Dependencies] for more info.
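
A hedged sketch of that pattern: give the array job a known name, and make the follow-up job wait on it with qsub's -hold_jid option (the script names here are placeholders):

# Submit the array job under a known name
qsub -N myarray -t 1-100 example1.q

# Submit a follow-up job that SGE will hold until every task of "myarray" has finished
qsub -N cleanup -hold_jid myarray cleanup.q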

See SGE Env Vars for more information on accessing environment variables within other programming environments, e.g. Perl and Matlab.

Array Jobs vs. Multiple (Identical) Job Submissions

Using tsub instead of qsub for low-priority Array jobs

A tsub example:

[head4 ~]$ /opt/apps/bin/tsub myLowPrioArrayjob.q
Checking for free machine groups ...
qsub -q *@machinegroup1-n*,*@machinegroup2-n*,*@machinegroup3-n*,*@machinegroup4-n*, ... myLowPrioArrayjob.q
Your job 3168 ("myLowPrioArrayjob.q") has been submitted

Although the SGE scheduler can usually avoid preempting a low-priority job on a given processor core (or "slot"), the possibility of at least one task being preempted (actually, partially suspended) becomes increasingly likely when running large array jobs. An unfortunate side effect is that some tasks may take 10x (or even 20x) longer to finish than the others. One way to avoid this scenario is to use the SGE "-q" directive to direct the job to specific machine groups (see the "Requesting certain machines" section of [Submitting Single-CPU Jobs]) on which no high-priority jobs are currently running and on which, presumably, none will be launched before the low-priority job completes. To facilitate this, we have written the "tsub" wrapper for qsub, which does the following:

1. Parses the output of the command "qconf -shgrp @8cores" to generate a list of machine groups with the newest, fastest nodes in the DSCR (8-core minimum).

2. Queries qstat for each group to see if any of the nodes are running high-priority jobs. The names of machine groups not actively running high-priority jobs are appended (in the wildcard form above) to the "-q" list.

3. Prints the complete "-q" list to the screen for reference and invokes qsub with the "-q" directive and the batch script.

Note that tsub is only intended for low-priority jobs. If a batch script contains an "-l highprio" directive, then the job will be routed to the active group to which the user has high priority access and the "-q" directive will be ignored.

[gridengine users] limit the number of jobs users can put on queue (pending)

MacMullan, Hugh hughmac at wharton.upenn.edu
Wed Nov 12 14:21:11 UTC 2014

Hi Robert:

The task jobs are all submitted at the same time, and they also have the SGE_TASK_ID environment variable set, which is VERY useful! Give it a try:

echo 'HOSTNAME=`hostname`; echo "this is task $SGE_TASK_ID on $HOSTNAME"' | qsub -N arraytest -t 1-4 -j y

Use that SGE_TASK_ID to import options or data from a file or files, set a seed, etc.

Task array jobs rule! :)

-Hugh

William Hay w.hay at ucl.ac.uk
Wed Nov 12 15:00:40 UTC 2014

On Wed, 12 Nov 2014 13:56:24 +0000
Roberto Nunnari <roberto.nunnari at supsi.ch> wrote:

> On 12.11.2014 14:51, William Hay wrote:

> > It's a way to submit a bunch of jobs that are identical from grid engine's POV as a single job. This lightens the load on the scheduler
> > and means qstat normally only reports a single queued job. Probably what the user who caused your original issue should have submitted.
>
> Nice! :-)
>
> And can the tasks in an array job run in parallel, or do they
> always run serially?

Yes, grid engine can run multiple tasks from the same array job at the same time. Unlike the tasks of a parallel job, they
won't be synchronised and won't know about each other. In recent versions of grid engine there
is a qsub flag that lets the user limit the number of tasks from a given array job that will run simultaneously.
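
On installations that support it, the flag being referred to is typically -tc (maximum number of concurrently running tasks); for example:

# Submit a 1000-task array, but let at most 20 tasks run at any one time
qsub -t 1-1000 -tc 20 myscript.q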

[gridengine users] Multiple slots per task in array job

Christopher Heiny cheiny at synaptics.com
Tue Nov 11 05:37:10 UTC 2014

Hi all,

We're running OGS 2011.11p1, with a 1:1 slot:core mapping. We mostly
submit single-threaded array jobs, one slot per task. We now want
to multi-thread some of the programs in order to speed up processing,
but need a way to allocate <n> slots to an <n>-threaded task.

The first idea was to use a parallel environment to manage this, but
since the jobs all run a script (rather than a binary), we got bitten
by this bug:
http://gridengine.org/pipermail/dev/2011-December/000081.html

Unfortunately, patching/updating the GE software is not an option at the
moment. We've got to wait a few months for that, and in the meantime we
need a workaround for slot allocation.

Currently, I'm thinking of using a consumable equal to the number of
slots/cores on a machine to control allocation. I *think* this will
work OK as an interim solution. What I was wondering was: has anyone
encountered a similar situation? Did you use this trick to work around
it? Or is there a better workaround?

Thanks very much!
Chris
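
A rough sketch of the consumable approach mentioned above (all names and values here are assumptions, and the exact qconf steps depend on your site's configuration): define a consumable complex, give each execution host a capacity equal to its core count, and have each task request as many units as it has threads.

# 1. Add a consumable complex, e.g. "cores", to the complex list (qconf -mc):
#      cores   cores   INT   <=   YES   YES   1   0
# 2. On each execution host, set its capacity (qconf -me <hostname>):
#      complex_values   cores=8
# 3. Request the consumable per task, matching the program's thread count:
qsub -t 1-100 -l cores=4 run_threaded.q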

Recommended Links


Simple-Job-Array-Howto - GridWiki

SGE Array Jobs - Scalable Computing Support Center - DukeWiki



