Clusters run a batch scheduler, typically installed on the head node. When the batch scheduler is SGE, the submit command is qsub.
Once a job has been received by the batch server, the scheduler decides the placement and notifies the batch server, which in turn notifies qsub whether the job can be run. The current status (whether the job was successfully scheduled or not) is then returned to the user. You may use a command file or STDIN as input for qsub.
A job in SGE represents a task to be performed on a node (or multiple nodes) in the cluster and contains the command line used to start the task. A job may have specific resource requirements but in general should be agnostic to which node in the cluster it runs on as long as its resource requirements are met.
All jobs require at least one available slot on a node in the cluster to run. SGE does not deal with fractional slots.
Here is a simple example of the qsub command, launching a job that runs the hostname command on a given cluster node. You can't submit jobs unless your UID is greater than 100, which rules out submitting jobs as root. Note that in the example below the sgeadmin account is used.
sgeadmin@master:~$ qsub -o /Apps/myoutput.txt -b y -cwd -q all.q@b1 hostname
Your job 1 ("hostname") has been submitted
Notice that the qsub command, when successful, will print the job number to stdout. You can use the job number to monitor the job’s status and progress within the queue as we’ll see in the next section.
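If you need the job number inside a shell script, one approach is the -terse option (listed in the qsub help reproduced later on this page), which makes qsub print only the job id. A minimal sketch, assuming your SGE version supports -terse:

# capture the job id at submission time, then query the job's status
JOBID=$(qsub -terse -b y -cwd hostname)
echo "Submitted job $JOBID"
qstat -j "$JOBID"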
If you always submit jobs with the same options (e.g. the email notification options shown below), you can apply them automatically. To accomplish this, create the file $HOME/.sge_request and put the options you would otherwise give on the command line in there, one option per line. For example:
# mail me when the job starts & ends
-M [email protected]
-m be
# pass through some environment variables
-v PYTHONPATH
# pass through ALL environment variables
# -V
# use multiple cores
# NOTE: Divide h_vmem by the number of cores you request (e.g. 2)
-pe smp 2
Three options control the output streams: -o (stdout file), -e (stderr file), and -j (join stderr into the stdout file).
By default, the job's stdout and stderr files are named after the job, with an extension that ends in the job number. For the simple hostname job submitted above:
sgeadmin@master:~$ ls hostname.*
hostname.e1 hostname.o1
sgeadmin@master:~$ cat hostname.o1
b1
sgeadmin@master:~$ cat hostname.e1
sgeadmin@master:~$
Notice that Sun Grid Engine automatically named the job hostname and created two output files: hostname.e1 and hostname.o1. The e stands for stderr and the o for stdout. The 1 at the end of the extension is the job number. So if the job had been named my_new_job and submitted as job #23, the output files would be:
my_new_job.e23 my_new_job.o23
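The -N, -o, and -e options override these defaults, and -j y merges stderr into the stdout file. A hedged sketch (myjob.sh and the logs/ directory are placeholders; the directory must already exist):

qsub -N my_new_job -o logs/my_new_job.out -e logs/my_new_job.err myjob.sh
qsub -N my_new_job -o logs/my_new_job.log -j y myjob.sh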
If no queue is specified, the job is submitted to the default queue (typically all.q). A job can be submitted to a particular queue without any host specification, to a selected host (queue instance), or to a selected host group (queue domain). Note that the order is the opposite of DNS notation: here the queue (domain) comes before the host.
qsub -q queue_name job
qsub -q queue_name@host_name job
qsub -q queue_name@@hostgroup_name job
One can use regular expressions (single quoted) to specify the hosts. For example, to request any host in the given group regardless of the queue name:
qsub -q '*@@hostgroup_name' job
The -l option accepts resource=value pairs separated by commas and launches the job in a Sun Grid Engine queue that meets the given resource request list.
This option is available only for qsub, qsh, qrsh, qlogin, and qalter. There may be multiple -l switches in a single command, and they may be marked soft or hard in the same command line. For a serial job, multiple -l switches refine the definition of the sought queue.
Among other things, you can specify a hostname with this option. For example:
qsub -l hostname='p6.hpc.firma.com' ...
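Multiple resource requests can also be combined and marked hard or soft, as described above. A sketch (myjob.sh is a placeholder, and the resource names must exist in your site's complex configuration):

# hard limit of one hour of runtime; best effort to find 2 GB of free memory
qsub -hard -l h_rt=1:00:00 -soft -l mem_free=2G myjob.sh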
The complex(5) man page describes the list of available resources and their associated valid value specifiers:
DESCRIPTION
Complex reflects the format of the Sun Grid Engine complex configuration. The definition of complex attributes provides all pertinent information concerning the resource attributes a user may request for a Sun Grid Engine job via the qsub(1) -l option and for the interpretation of these parameters within the Sun Grid Engine system.
The Sun Grid Engine complex object defines all entries which are used for configuring the global, host, and queue objects. The system has a set of predefined entries, which are assigned to a host or queue by default. In addition, the user can define new entries and assign them to one or multiple objects. Each load value has to have its corresponding complex entry object, which defines the type and the relational operator for it.
Defining resource attributes
The complex configuration should not be accessed directly. In order to add or modify complex entries, the qconf(1) options -Mc and -mc should be used instead. While the -Mc option takes a complex configuration file as an argument and overrides the current configuration, the -mc option brings up an editor filled in with the current complex configuration.
The provided list contains all definitions of resource attributes in the system. Adding a new entry means providing: name, shortcut, type, relop, requestable, consumable, default, and urgency. The fields are described below. Changing an entry is done by updating the field to change; removing an entry is done by deleting its definition. An attribute can only be removed when it is no longer referenced in a host or queue object. The system also has a set of default resource attributes which are always attached to a host or queue. They cannot be deleted, nor can the type of such an attribute be changed.
Working with resource attributes: before a user can request a resource attribute, it has to be attached to the global, host, or cqueue object. The resource attribute exists only for the objects it is attached to: if attached to the global object (qconf -me global), it exists system-wide; if attached to a host object (qconf -me NAME), only on that host; if attached to a cqueue object (qconf -mq NAME), only on that cqueue.
When attaching a resource attribute to an object, the user also has to assign a value to it: the resource limit. Another way to give a resource attribute a value is to configure a load sensor for that attribute.
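For example, an administrator might attach an attribute to a particular host by editing the host object and setting a value under complex_values. A hedged sketch (node001 and ram_free are placeholders; ram_free must already be defined in the complex):

qconf -me node001
# then, inside the editor, set a line such as:
#   complex_values   ram_free=16G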
Default queue resource attributes: in its default form the complex contains a selection of parameters from the queue configuration as defined in queue_conf(5). The queue configuration parameters requestable by a job in principle are:
qname hostname notify calendar min_cpu_interval tmpdir seq_no s_rt h_rt s_cpu h_cpu s_data h_data s_stack h_stack s_core h_core s_rss h_rss

Default host resource attributes

The standard set of host-related attributes consists of two categories. The first category is built from several queue configuration attributes which are particularly suitable to be managed on a host basis. These attributes are:

slots s_vmem h_vmem s_fsize h_fsize

(please refer to queue_conf(5) for details). Note: defining these attributes in the host complex is no contradiction to having them also in the queue configuration. It allows maintaining the corresponding resources on a host level and at the same time on a queue level. Total virtual free memory (h_vmem) can be managed for a host, for example, and a subset of the total amount can be associated with a queue on that host.
The second attribute category in the standard host complex consists of the default load values. Every sge_execd(8) periodically reports load to sge_qmaster(8). The reported load values are either the standard Sun Grid Engine load values, such as the CPU load average (see uptime(1)), or load values defined by the Sun Grid Engine administration (see the load_sensor parameter in the cluster configuration sge_conf(5) and the Sun Grid Engine Installation and Administration guide for details). The characteristics definition for the standard load values is part of the default host complex, while administrator-defined load values require extension of the host complex. Please refer to the file
/doc/load_parameters.asc for detailed information on the standard set of load values.

Overriding attributes
One attribute can be assigned to the global object, host object, and queue object at the same time. On the host level it might get its value from the user-defined resource limit and a load sensor. In case the attribute is a consumable, we have, in addition to the resource limit and its load report on the host level, also the internal usage, which the system keeps track of. The merge is done as follows:
In general an attribute can be overridden on a lower level
- global by hosts and queues
- hosts by queues and load values or resource limits on the same level.
We have one limitation for overriding attributes based on its relational operator:
- !=, == operators can only be overridden on the same level, but not on a lower level. The user defined value always overrides the load value.
- >=, >, <=, < operators can only be overridden, when the new value is more restrictive than the old one.
In the case of a consumable on host level which also has a load sensor, the system checks the current usage: if the internal accounting is more restrictive than the load sensor report, the internal value is kept; if the load sensor report is more restrictive, that one is kept.
Note: Sun Grid Engine allows backslashes (\) to be used to escape newline characters. The backslash and the newline are replaced with a space (" ") character before any interpretation.
Scripts submitted via SGE can contain pseudo-comments that are processed by SGE and set options, much like options given to the qsub command itself. Here is an example of such comments, which start with "#$", in a simple job script:
$ cat sleep.sh
#!/bin/bash
#
#$ -cwd
#$ -j y
#$ -S /bin/bash
#
date
sleep 10
date

You can put in several lines which start with #$. SGE treats them as pseudo-comments: everything specified in them is treated as SGE options.
To submit such a wrapper job script, you can use the qsub command without any additional parameters, which is very convenient:
$ qsub sleep.sh
your job 16 ("sleep.sh") has been submitted
For a parallel MPI job script, take a look at this script, linpack.sh. Note that you need to put in two SGE variables, $NSLOTS and $TMP/machines within the job script.
$ cat linpack.sh
#!/bin/bash
#
#$ -cwd
#$ -j y
#$ -S /bin/bash
#
MPI_DIR=/opt/mpich/gnu/
HPL_DIR=/opt/hpl/mpich-hpl/
# OpenMPI part. Uncomment the following code and comment the above code
# to use OpenMPI rather than MPICH
# MPI_DIR=/opt/openmpi/
# HPL_DIR=/opt/hpl/openmpi-hpl/

$MPI_DIR/bin/mpirun -np $NSLOTS -machinefile $TMP/machines \
    $HPL_DIR/bin/xhpl
The command to submit an MPI parallel job script is similar to submitting a serial job script, but you will need to use -pe mpich N, where N is the number of processes you want to allocate to the MPI program. Here's an example of submitting a 2-process linpack run using this HPL.dat file:
$ qsub -pe mpich 2 linpack.sh
your job 17 ("linpack.sh") has been submitted
If you need to delete an already submitted job, you can use qdel, given its job id. Here's an example of deleting a fluent job under SGE:
$ qsub fluent.sh
your job 31 ("fluent.sh") has been submitted
$ qstat
job-ID prior name      user    state submit/start at     queue      master ja-task-ID
---------------------------------------------------------------------------------------------
31     0     fluent.sh sysadm1 t     12/24/2003 01:10:28 comp-pvfs- MASTER
$ qdel 31
sysadm1 has registered the job 31 for deletion
$ qstat
$
Although the example job scripts are bash scripts, SGE can also accept other types of shell scripts. It is trivial to wrap serial programs into an SGE job script. Similarly, for MPI parallel jobs, you just need to use the correct mpirun launcher and add the two SGE variables, $NSLOTS and $TMP/machines, within the job script. For parallel jobs other than MPI, a Parallel Environment (PE) needs to be defined. This is covered in the SGE documentation.
You can also supply qsub command line options within the script on lines beginning with #$. For example:

#$ -S /bin/bash
If #$ is not followed by a valid qsub option, you get the unhelpful message:

qsub: Unknown option

If you get this message, search the script you are submitting for #$ and make sure it is followed by a valid qsub command line option.
It is easy to inadvertently introduce #$ when you comment out a line that begins with $:

#$SOME_COMMAND

There is no qsub command line option SOME_COMMAND, so this is an error.
If you really must keep lines beginning with #$, you can specify a different prefix string using the qsub -C command line option.
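For example, a hedged sketch using a custom prefix (the #SGE prefix and myscript.sh are hypothetical):

qsub -C '#SGE' myscript.sh

where myscript.sh contains directives such as:

#SGE -cwd
#SGE -j y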
Inherited Job Environment
Job submission default settings files hierarchy:
$SGE_ROOT/$SGE_CELL/common/sge_request
$HOME/.sge_request
$PWD/.sge_request
The two commands that will be most useful to you are qhost and qstat.
Node or host status can be obtained by using the qhost command. An example listing is shown below.
HOSTNAME ARCH       NPROC LOAD MEMTOT MEMUSE SWAPTO SWAPUS
-------------------------------------------------------------------------------
global  -          -     -    -      -      -      -
node000 lx24-amd64 2     0.00 3.8G   35.8M  0.0    0.0
node001 lx24-amd64 2     0.00 3.8G   35.2M  0.0    0.0
node002 lx24-amd64 2     0.00 3.8G   35.7M  0.0    0.0
node003 lx24-amd64 2     0.00 3.8G   35.6M  0.0    0.0
node004 lx24-amd64 2     0.00 3.8G   35.7M  0.0    0.0
Once a computational job starts, the load jumps from zero toward the number of CPUs in use.
See the qhost man page for more details.
Queue status for your jobs can be found by issuing a qstat command. An example qstat issued by user deadline is shown below.
job-ID prior   name     user     state submit/start at     queue           slots ja-task-ID
---------------------------------------------------------------------------------
304    0.60500 Sleeper4 deadline r     01/18/2008 17:42:36 cluster@norbert 4
307    0.60500 Sleeper4 deadline r     01/18/2008 17:42:37 cluster@norbert 4
310    0.60500 Sleeper4 deadline qw    01/18/2008 17:42:29                 4
313    0.60500 Sleeper4 deadline qw    01/18/2008 17:42:29                 4
316    0.60500 Sleeper4 deadline qw    01/18/2008 17:42:29                 4
321    0.60500 Sleeper4 deadline qw    01/18/2008 17:42:30                 4
325    0.60500 Sleeper4 deadline qw    01/18/2008 17:42:30                 4
308    0.53833 Sleeper2 deadline qw    01/18/2008 17:42:29                 2

More detail can be found by using the -f option. An example qstat -f issued by user deadline is shown below.
[deadline@norbert sge-tests]$ qstat -f
queuename       qtype used/tot. load_avg arch       states
----------------------------------------------------------------------------
cluster@node0   BIP   2/2       0.00     lx26-amd64
    310 0.60500 Sleeper4 deadline r  01/18/2008 17:43:51 1
    313 0.60500 Sleeper4 deadline r  01/18/2008 17:43:52 1
----------------------------------------------------------------------------
cluster@node1   BIP   2/2       0.00     lx26-amd64
    310 0.60500 Sleeper4 deadline r  01/18/2008 17:43:51 1
    313 0.60500 Sleeper4 deadline r  01/18/2008 17:43:52 1
----------------------------------------------------------------------------
cluster@node2   BIP   2/2       0.00     lx26-amd64
    310 0.60500 Sleeper4 deadline r  01/18/2008 17:43:51 1
    313 0.60500 Sleeper4 deadline r  01/18/2008 17:43:52 1
----------------------------------------------------------------------------
cluster@norbert BIP   2/2       0.02     lx26-amd64
    310 0.60500 Sleeper4 deadline r  01/18/2008 17:43:51 1
    313 0.60500 Sleeper4 deadline r  01/18/2008 17:43:52 1
############################################################################
 - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS - PENDING JOBS
############################################################################
    316 0.60500 Sleeper4 deadline qw 01/18/2008 17:42:29 4
    321 0.60500 Sleeper4 deadline qw 01/18/2008 17:42:30 4
    325 0.60500 Sleeper4 deadline qw 01/18/2008 17:42:30 4
    308 0.53833 Sleeper2 deadline qw 01/18/2008 17:42:29 2

To look at jobs for all users, you must issue the following:
qstat -u "*"For queue details, you may add the -f option as shown above. If you prefer to always see all user jobs, you can use the alias command to make this the default behavior. For bash users add the following to your .bashrc file.
alias qstat='qstat -u "*"'
Even more information can be obtained by using the -F option (see the qstat man page for details). For parallel jobs, the output is not very easy to understand. In the above listing, the state is either qw (queued and waiting), t (transferring), or r (running).
A more convenient queue status package called userstat combines qstat, qhost, and qdel into a simple, easy to use "top"-like interface. Each command is described below; additional information on these commands is available via man command-name. Let's assume we have a script sge-date that looks like:
#!/bin/bash
/bin/date

We could run it using the command:

qsub sge-date

SGE will then run the program and place two files in your current directory:

sge-date.e#
sge-date.o#

where # is the job number assigned by SGE. After the job finishes, the sge-date.e# file contains the output from standard error and the sge-date.o# file contains the output from standard out. Note: if the home directory of the submitting user is not on NFS, the output will be left on the node where the job ran and will not be visible on the head node. See Viewing SGE Job Output.
The following basic options may be used to submit the job.
-A [account name]  -- Specify the account under which to run the job
-N [name]          -- The name of the job
-l h_rt=hr:min:sec -- Maximum walltime for this job
-r [y,n]           -- Should this job be re-runnable (default y)
-pe [type] [num]   -- Request [num] amount of [type] nodes
-cwd               -- Place the output files (.e,.o) in the current working directory. The default is to place them in the user's home directory.
-S [shell path]    -- Specify the shell to use when running the job script

Although it is possible to use command line options and script wrappers to submit jobs, it is usually more convenient to use a single script that includes all options for the job. The next section describes how this is done.
The most convenient method of submitting a job to SGE is a "job script" which contains SGE options as pseudo-comments. These pseudo-comments allow all options and the program invocation to be placed in a single batch file.
The following script will report the node on which it is running, sleep for 60 seconds, then exit. It also reports the start/end date and time as well as sending an email to user when the jobs starts and when the job finishes. Other SGE options are set as well. The example script can be found here as well.
#!/bin/bash
#
# Usage: sleeper.sh [time]
#        default for time is 60 seconds
#
# -- our name ---
#$ -N Sleeper1
#$ -S /bin/sh
# Make sure that the .e and .o file arrive in the
# working directory
#$ -cwd
# Merge the standard out and standard error to one file
#$ -j y

/bin/echo Here I am: `hostname`. Sleeping now at: `date`
/bin/echo Running on host: `hostname`.
/bin/echo In directory: `pwd`
/bin/echo Starting on: `date`

# Send mail at submission and completion of script
#$ -m be
#$ -M deadline@kronos

time=60
if [ $# -ge 1 ]; then
    time=$1
fi

sleep $time
echo Now it is: `date`

The "#$" prefix is used in the script to indicate an SGE option. If we name the script sleeper1.sh, we can submit it to SGE as follows:

qsub sleeper1.sh

The output will be in the file Sleeper1.o#, where # is the job number assigned by SGE. Here is an example output file for the sleeper1.sh script. When submitting MPI or PVM jobs, we will need additional information in the job script. See below.

Inheriting Your Environment
If you want to make sure your current environment variables are available to your SGE jobs, include the following in your submit script:
#$ -V
Recently, -V may have stopped working on some systems because of fixes for the Shellshock Bash bug.
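Until that is resolved on your system, a workaround is to pass only the variables you actually need with -v instead of -V. A sketch (the variable names are examples):

qsub -v PATH,PYTHONPATH,LD_LIBRARY_PATH myjob.sh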
#!/bin/sh
#
# EXAMPLE MPICH SCRIPT FOR SGE
# Modified by Basement Supercomputing 1/2/2006 DJE
# To use, change "MPICH_JOB", "NUMBER_OF_CPUS"
# and "MPICH_PROGRAM_NAME" to real values.
#
# Your job name
#$ -N MPICH_JOB
#
# Use current working directory
#$ -cwd
#
# Join stdout and stderr
#$ -j y
#
# pe request for MPICH. Set your number of processors here.
# Make sure you use the "mpich" parallel environment.
#$ -pe mpich NUMBER_OF_CPUS
#
# Run job through bash shell
#$ -S /bin/bash
#
# The following is for reporting only. It is not really needed
# to run the job. It will show up in your output file.
echo "Got $NSLOTS processors."
echo "Machines:"
cat $TMPDIR/machines

# Adjust MPICH procgroup to ensure smooth shutdown
export MPICH_PROCESS_GROUP=no
#
# Use full pathname to make sure we are using the right mpirun
/opt/mpi/tcp/mpich-gnu3/bin/mpirun -np $NSLOTS -machinefile $TMPDIR/machines MPICH_PROGRAM_NAME
The important option is the -pe line in the submit script. It must be set to the MPI environment for which you compiled your program. The following example submit scripts are available in the examples directory:
To use SGE with MPI simply copy the appropriate scripts to your working directory, edit the script to fill in the appropriate variables, rename it to reflect your program and use qsub to submit it to SGE.
See also
biowiki.org
After submitting your job to Grid Engine you may track its status by using either the qstat command, the GUI interface QMON, or by email.
Monitoring with qstat

The qstat command provides the status of all jobs and queues in the cluster. The most useful options are:
- qstat: Displays list of all jobs with no queue status information.
- qstat -u hpc1***: Displays list of all jobs belonging to user hpc1***
- qstat -f: gives full information about jobs and queues.
- qstat -j [job_id]: Gives the reason why the pending job (if any) is not being scheduled.
You can refer to the man pages for a complete description of all the options of the qstat command.
Monitoring Jobs by Electronic Mail

Another way to monitor your jobs is to have Grid Engine notify you by email about the status of the job.
In your batch script or on the command line, use the -m option to request that an email be sent and the -M option to specify the email address where it should be sent. This will look like:
#$ -M myaddress@work
#$ -m beas

where the -m option selects the events after which you want to receive email. In particular, you can choose to be notified at the beginning/end of the job, or when the job is aborted/suspended (see the sample script lines above).
And from the command line you can use the same options (for example):
qsub -M myaddress@work -m be job.sh
How do I control my jobs

Based on the status of the job displayed, you can control the job with the following actions:
- Modify a job: As a user, you have certain rights that apply exclusively to your jobs. The Grid Engine command used is qmod. Check the man pages for the options that you are allowed to use.
- Suspend/resume a job: this uses the UNIX kill command and applies only to running jobs. In practice you type:
qmod -s job_id (suspend) or qmod -r job_id (resume), where job_id is given by qstat or qsub.
- Delete a job: You can delete a job that is running or spooled in the queue by using the qdel command like this
qdel job_id (where job_id is given by qstat or qsub).
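qdel also accepts several job ids at once, and most SGE versions support deleting all jobs belonging to a user. A hedged sketch:

qdel 31 32 33
qdel -u $USER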
Monitoring and controlling with QMON

You can also use the GUI QMON, which gives a convenient window dialog specifically designed for monitoring and controlling jobs; the buttons are self-explanatory.
For further information, see the SGE User's Guide ( PDF, HTML).
Adding SGE Options
If you always need certain SGE options to be specified for a given job, you can embed those options into the SGE job script using lines that start with "#$":
#!/bin/tcsh
#
#$ -S /bin/tcsh -cwd
#$ -o simple.out -j y
#$ -l mem_free=500M
cd /home/username/seq/simple
myprog
The '-cwd' option tells SGE to run the job in the same directory where the qsub command was issued, i.e. SGE will 'cd' into the current working directory before it runs the job script. Again, the '-o' option is used to direct screen output to a file. The '-S' option is another way to indicate the shell type (tcsh, bash, or sh) to SGE. The '-j y' tells SGE to "join" the standard-error output with the standard output (screen output); in this case all of it will go to the file 'simple.out'. The '-l mem_free=500M' tells SGE to only run the job on a node with at least 500 megabytes of free RAM available. The '500M' should be changed to match the actual amount of RAM you expect your job to use, e.g. '-l mem_free=750K' (750 kilobytes) or '-l mem_free=4G' (4 gigabytes). Help with estimating your program's memory use can be found here: Monitoring Memory Usage.

In any shell script, lines starting with "#" are generally ignored as comments by the shell. Only SGE interprets the lines that start with "#$"; other systems will consider them to be comments. This makes it possible for your SGE script to still run on other Linux machines, for example.
** NOTE ** Only the first contiguous block of comments is searched by SGE for "#$" lines. SGE stops processing the "#$" lines when it sees the first command or blank line. It is usually easiest to just follow the above example and put all SGE information at the very top of the file.
** NOTE ** SGE is very particular about the formatting of the queue-submission scripts. In particular, you should make sure that you have a blank line at the end of your script, and that the script is saved in the standard Unix text format (and NOT in the standard Windows text format). Generally, this is only an issue if you copy job scripts from your Windows laptop to the cluster. If you do so, you may want to run the 'dos2unix' program on your script before submitting. If you are a 'vi' user and you see a '[ dos ]' tag on the status line at the bottom of the screen, you can do ':set fileformat=unix' and save the file; 'vi' will also show '[ noeol ]' on the status line if you don't have an end-of-line marker at the end of the file.

** NOTE ** The more memory your job requires, the more important it is to include a '-l mem_free' request in your script. See Monitoring Memory Usage for more info on determining how much the memory request should be.
Common SGE Options
-cwd -- use the current directory (where the job was submitted from) to store all output files, including the -o specified file
-M user@hostname -m b,e -- sends email to the specified account at the beginning (-m b) and end (-m e) of the job. This way, you know when the job has finally started if it got stuck in the queue. You can also use this to send yourself a text message.
-o file -e file -- directs the standard output (-o) and standard error output (-e) to the specified files. Note that this is output or error output that is not otherwise directed to files; i.e. csh redirection (myprog > file.out) takes precedence
-j y -- "joins" the error output with the standard output, thereby sending both to the same file (given by -o)
-S /bin/tcsh -- what shell to use, tcsh or sh or whatever you prefer
-N name -- use this name when displaying in the qstat output; defaults to the name of the script file
-hold_jid job_id_or_job_name -- if you have one job that must wait for another to complete (perhaps the first one creates an output file which is needed by the second program), then you can request that the job be held until that first job completes, see [SGE Job Dependencies]
-pe high 10-20 -- requests a high-priority "parallel environment" that spans several machines, in this case any number of CPUs between 10 and 20 (inclusive); a single number will request exactly that many CPUs. Note that if you request more CPUs than you actually have high-priority access to, your job will hang. See Submitting OpenMP Jobs or Submitting MPI Jobs
-q *@machineName-n* -- request a specific machine, or machine group
-l slots=2 -- requests that the job be given 2 slots (or 2 CPUs) instead of 1; you MUST use this if your program is multi-threaded, and should NOT use it otherwise
-l mem_free=1.5G -- requests that only machines with 1.5GB (=1536MB) or more free memory be used for this job; i.e. the job requires a lot of memory and thus is not suitable for all hosts. Note that 1G is equal to 1024M. (How do I determine how much memory my program needs? See the FAQ)
-l h_cpu=Xh -- requests that X hours be allocated for this job to run
-l scr_free=XG -- requests that only machines with X GB or more free disk space in the /scratch partition be used for this job; i.e. the job requires a lot of temporary file space and thus is not suitable for all hosts. Note that 1G is equal to 1024M
-l highprio -- requests that the job be placed in a high-priority queue or parallel environment
-soft -- putting -soft before a -l requirement indicates that it is a "soft" request; SGE will make a best-effort attempt to find a machine with the requested attribute, but the job may be queued on machines that do NOT have the attribute. Note that -soft applies to ALL -l options that come after it
-t start-stop:step -- submit an SGE "array" job, that is, run the same job multiple times, but set SGE_TASK_ID to the value start, then start+step, etc., up through stop (step may be omitted); it is up to your script to do something different for each task-ID; see [SGE Array Jobs]
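As an illustration, here is a hedged job script combining several of the options above (the names, paths, and resource values are placeholders):

#!/bin/tcsh
#$ -S /bin/tcsh -cwd
#$ -N analysis
#$ -o analysis.out -j y
#$ -l mem_free=2G
#$ -m e -M user@hostname
# run the program from the submission directory
./myprog input.dat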
Common Parallel Environment Options
-pe low-all 8 -- low priority, use any machines (Note: no longer working)
-pe low-core 8 -- low priority, only use core machines (Note: no longer working)
-pe threaded 8 -- low priority, use multiple slots on one machine
-l highprio -pe high 8 -- high priority, use any high-priority slots
** NOTE ** Jobs with no memory requests will be placed on ANY machine that is not heavily-loaded. If you need a significant amount of memory in order for your program to run, then you must explicitly request it with "-l mem_free=750M". (How do I determine this? FAQ)
** NOTE ** If you run an MPI job in low-priority mode and even one CPU gets slowed down due to a high-priority request, then all of your MPI tasks are likely to be slowed down as well. This will waste those computational resources that other people could have used.
** NOTE ** For shell script programmers, there are a number of "environment variables" that are automatically set by SGE, see [SGE Env Vars] (and see [SGE Array Jobs] for possible ways to use them).
** NOTE ** For users who "live" in multiple groups (perhaps you collaborate with multiple professors who each have machines in the DSCR), please see [SGE Multiple Groups] for info on how to properly allocate your runs to each group, if this is important to you.
See also [SGE Job Monitoring] or [FAQ]
web.njit.edu
Submitting a job to the queue: qsub
Qsub is used to submit a job to SGE. The qsub command has the following syntax:

qsub [ options ] [ scriptfile | -- [ script args ]]

Binary files may not be submitted directly to SGE. For example, if we wanted to submit the "date" command to SGE, we would need a script that looks like:

#!/bin/bash
/bin/date

If the script were called sge-date, then we could simply run:

$ qsub sge-date

SGE will then run the program and place two files in your current directory, sge-date.e# and sge-date.o#, where # is the job number assigned by SGE; the .e# file contains the standard error output and the .o# file the standard output. The basic submission options and the sleeper1.sh job-script walkthrough repeat what was shown earlier on this page.
- matyldaX
- scratchX
- ram_free, mem_free
- disk_free, tmp_free
- gpu
We have found that for some tasks it is advantageous to specify the required resources to SGE. This makes sense when heavy use of RAM or network storage is expected. Requests can be soft or hard (parameters -soft, -hard); the limits themselves have the form:
-l resource=value

For example, if a job needs at least 400MB RAM:

qsub -l ram_free=400M my_script.sh

Another often requested resource is space in /tmp:

qsub -l tmp_free=10G my_script.sh

Or both:
qsub -l ram_free=400M,tmp_free=10G my_script.sh

Of course, it is possible (and preferable if the number does not change) to use the construction #$ -l ram_free=400M directly in the script. The actual status of a given resource on all nodes can be obtained with qstat -F ram_free, or for several resources with qstat -F ram_free,tmp_free.
Details on other standard available resources are in /usr/local/share/SGE/doc/load_parameters.asc. If you do not specify a value for a given resource, an implicit value is used (1GB for space on /tmp, 100MB for RAM).
WARNING: You need to distinguish whether you request resources that are available at the time of submission (so-called non-consumable resources), or whether you need to allocate a given resource for the whole runtime of your computation. For example, your program will need 400MB of memory, but in the first 10 minutes of computation it allocates only 100MB. If you use the standard resource mem_free and other jobs are submitted to the given node during those first 10 minutes, SGE will interpret it the following way: you wanted 400MB but currently use only 100MB, so the remaining 300MB can be given to someone else (i.e. it will schedule other tasks requesting this memory).
For these purposes it is better to use consumable resources, which are accounted independently of the current status of the task: for memory it is ram_free, for disk tmp_free. For example, the resource ram_free does not look at the actual free RAM; it computes the occupation of RAM based only on the requests of individual scripts. It works with the size of RAM of the given machine and subtracts the amount requested by each job scheduled on that machine. If a job does not specify ram_free, the implicit value ram_free=100M is used.
For disk space in /tmp (tmp_free) the situation is trickier: if a job does not properly clean up its mess after it finishes, the disk can actually have less space than indicated by the resource. Unfortunately, nothing can be done about this.
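To see how such a consumable looks in the complex configuration (qconf -mc, with the fields name, shortcut, type, relop, requestable, consumable, default, and urgency described earlier), here is a hedged sketch of a ram_free line; the exact values are site-specific:

#name      shortcut  type    relop  requestable  consumable  default  urgency
ram_free   rf        MEMORY  <=     YES          YES         100M     0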
Known problems with SGE
- Use of paths: for the home directory it is necessary to use the official path, i.e. /homes/kazi/... or /homes/eva (or simply the variable $HOME). If the path of the internal mount point of the automounter is used, i.e. /var/mnt/..., an error will occur. (This is not an error of SGE; the internal path is not fully functional for access.)
- Availability of nodes: due to the existence of nodes with limited access (employees' PCs), it is necessary to specify a list of nodes on which your job can run. This can be done using the parameter -q. The machines that are available are the IBM Blade nodes and also some computer labs, in case you turn the machines on over night. The list of queues for -q must be on one line, even if it is very long. For the availability of given groups of nodes, the parameter -q can be used in the following way:
#$ -q all.q@@blade,all.q@@PCNxxx,all.q@@servers

The main groups of computers are @blade, @servers, @speech, @PCNxxx, @PCN2xxx; the full and current list can be obtained with qconf -shgrpl.
- The syntax for access is QUEUE@OBJECT, i.e. all.q@OBJECT. The object is either one computer, for example all.q@svatava, or a group of computers (whose name itself begins with @, e.g. @blade), i.e. all.q@@blade.
- The computers in the labs are sometimes restarted by students during a computation; we can't do much about this. If you really need the computation to finish (i.e. it is not easy to re-run a job if it is brutally killed), use the newly defined groups of computers:
@stable (= @blade, @servers) -- servers that run all the time without restarting
@PCOxxx, @PCNxxx -- computer labs; any node might be restarted at any time, as a student or someone can shut the machine down by error or "by error". It is more or less sure that these machines will run smoothly over night and during weekends. There is also a group for each independent lab, e.g. @PCN103.
- Running scripts other than bash: it is necessary to specify the interpreter on the first line of your script (it is probably already there), for example #!/usr/bin/perl, etc.
- Does your script generate heavy traffic on the matyldas? It is necessary to set -l matyldaX=10 (for example 10, i.e. in total 100/10 = 10 concurrent jobs on the given matyldaX), where X is the number of the matylda used (if you use several matyldas, specify -l matyldaX=Y several times). We have created an SGE resource for each matylda (each matylda has 100 points in total), and jobs using -l matyldaX=Y are scheduled as long as the given matylda has free points. This can be used to balance the load of a given storage server from the user side. The same holds for the scratch0X servers.
- Be careful with the parameter -cwd: it is not guaranteed to work all the time; it is better to use cd /where/do/i/want at the beginning of your script.
- If a node is restarted, a job will still be shown in SGE although it is not running any more. This is because SGE waits until the node confirms termination of the computation (i.e. until it boots Linux again and starts the SGE client). If you use qdel to delete such a job, it will only be marked with the flag d. Jobs marked with this flag are automatically deleted by the server every hour.
Parallel jobs - OpenMP
For parallel tasks with threads, it is enough to use the parallel environment smp and set the number of threads:
#!/bin/sh
#
#$ -N OpenMPjob
#$ -o $JOB_NAME.$JOB_ID.out
#$ -e $JOB_NAME.$JOB_ID.err
#
# PE_name CPU_Numbers_requested
#$ -pe smp 4
#
cd SOME_DIR_WITH_YOUR_PROGRAM
export OMP_NUM_THREADS=$NSLOTS
./your_openmp_program [options]

Parallel jobs - OpenMPI
- Open MPI is now fully supported, and it is the default parallel environment (mpirun is by default Open MPI)
- The SGE parallel environment is openmpi
- The allocation rule is $fill_up$, which means that the preferred allocation is on the same machine.
- Open MPI is compiled with tight SGE integration:
- mpirun will automatically submit to machines reserved by SGE
- qdel will automatically clean all MPI stubs
- For a parallel task, do not forget (preferably directly in the script) to use the parameter -R y; this turns on reservation of slots, i.e. you won't be jumped by processes requesting fewer slots.
- If a parallel task is launched using qlogin, there is no variable containing information on which slots were reserved. A useful tool is qstat -u `whoami` -g t | grep QLOGIN, which shows which parallel jobs are running.
Listing follows:
#!/bin/bash
# ---------------------------
# our name
#$ -N MPI_Job
#
# use reservation to stop starvation
#$ -R y
#
# pe request
#$ -pe openmpi 2-4
#
# ---------------------------
#
# $NSLOTS
# the number of tasks to be used
echo "Got $NSLOTS slots."
mpirun -n $NSLOTS /full/path/to/your/executable
I'm running some jobs on an SGE cluster. Is there a way to make qstat show me only jobs that are not on hold?
qstat -s p shows pending jobs, which is all those with state "qw" and "hqw".
qstat -s h shows hold jobs, which is all those with state "hqw".
I want to be able to see all jobs with state "qw" only and NOT state "hqw". The man pages seem to suggest it isn't possible, but I want to be sure I didn't miss something. It would be REALLY useful and it's really frustrating me that I can't make it work.
Other cluster users have a few thousand jobs on hold ("hqw") and only a handful actually in the queue waiting to run ("qw"). I want to see quickly and easily the stuff that is not on hold so I can see where my jobs are in the queue. It's a pain to have to show everything and then scroll back up to find the relevant part of the output.
Laura
So I figured out a way to show what I want by piping the output of qstat into grep:
qstat -u "*" | grep " qw"
(Note that I need to search for " qw", not just "qw", or it will return the "hqw" states as well.)

But I'd still love to know if it's possible using qstat options only.
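One stricter alternative is to test the state column itself rather than grep for a substring, e.g. with awk (this assumes state is the fifth column, as in the default qstat layout):

qstat -u "*" | awk '$5 == "qw"'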
columbia.edu
Besides the guide below, here are other links I found covering qsub:
https://www.nbcr.net/pub/wiki/index.php?title=Sample_SGE_Script
http://www.it.uu.se/datordrift/maskinpark/albireo/gridengine.html
http://www.rbvi.ucsf.edu/Resources/sge/user_guide.html
Guide to Using the Grid Engine
The main access to the nodes of the Beowulf Cluster is through the Sun Grid Engine batch system. Grid Engine will distribute the requested jobs over the nodes, depending on the current load of the nodes, the priority of the job, and the number of jobs a user already has running on the cluster (within the same priority level, jobs of users who have fewer jobs running are preferred). Direct login onto the nodes and interactive execution of programs are strongly discouraged, because they bypass the monitoring of the nodes by the Sun Gridengine and can cause incomplete execution of batch jobs. If users require interactive jobs, they can use the command qsh, which starts an xterm session through Grid Engine.

Submitting a Job
Programs cannot be submitted directly to the grid engine. Instead they require a small shell script, which acts as a wrapper for the program to be run. Note that the script must be executable (check with the ls -l command; if there is no x among the permissions in front of the shell script name, it is not executable; this can be changed with the command chmod +x <script name>). If the program requires interactive input (e.g. Genesis), the input has to be piped in by either the echo command or an external file. The minimal script genesis.sh to run Genesis would be:
#!/bin/bash
#$ -S /bin/sh
echo "lcls.in" | ~/bin/genesis

Note that this is a specific case, which requires that the genesis executable is located in the bin directory of your home directory. After a check that the script runs correctly (typing ./genesis.sh at the prompt should execute genesis without an error), the job is submitted with the qsub command:
qsub genesis.sh

The qsub command has many options, which should be explicitly defined for each submitted job. There are three methods of doing so, with increasing priority (a higher priority will overwrite an already defined option of a lower priority):
* Default options in the file .sge_request, located in your home directory. The format is just one line with white space between the list of options (e.g. '-cwd -A reiche -j y').
* Options embedded in your shell script. Normally lines starting with a pound sign are ignored, except when it is immediately followed by a dollar sign. Everything behind #$ is parsed by the grid engine as an option.
* Command line arguments of the qsub command (e.g. qsub -cwd genesis.sh).

In any case an option always starts with a minus sign and a keyword, followed, if necessary, by additional arguments. The following options are recommended to be set, preferably in the .sge_request file in the home directory:
-cwd Uses the directory, where the job has been submitted, as the working directory. Otherwise your home directory is used.
-C #$ Defines the letter sequence in the script which indicates additional option for submitting the job.
-A <login-name> Defines the user account of the job owner. If not defined it falls back to the user who submitted the job.
-j y Merges the normal output of the file and any error messages into one file, typically with the name <job-name>.o<job-id>.
-m aes Sun Grid Engine will notify the job owner by email if the job is either completed, suspended or aborted.
-M <email-address> The email address to which the notification is sent.
-p 0 The priority level of the submitted job. Jobs with a higher priority are preferred by the grid engine when assigning jobs to nodes.
-r Forces the grid engine to restart the job in case the system crashes or is rebooted (note: this does not apply if the job itself crashes).
The following options should be defined differently for each job, because their values are specific to the particular job and not generally applicable to all jobs.
-N <job-name> Defines a short name for the job to identify it besides the job ID. If omitted, the job name is the name of the shell script.
-o <outputfile> Names the output file. If omitted, the output filename is defined by <job-name>.o<job-id>.
-v <environment> Normally environment variables defined in your .bash_profile or related files are not exported to the node where the job runs. With this option the grid engine sets the environment variable prior to starting the job.
-notify If the code supports the signals SIGUSR1 and SIGUSR2, these signals will be sent to the program before it is terminated by the grid engine.
-pe <parallel environment> Needed for executing parallel jobs.
Use man qsub to see further options. All options can also be set in an interactive way by using the job submission feature of qmon.
Monitoring a Job and the Queue
Once the job is submitted, a job id is assigned and the job is placed in the queue. To see the status of the queue, the command qstat prints a list of all running and pending jobs with the most important information (job ID, job owner, job name, status, node). More information on a specific job can be obtained with qstat -j <job-id>. The status of the job is indicated by one or more characters:
r - running
t - transferring to a node
qw - waiting in the queue
d - marked for deletion
R - marked for restart
Normally the status d is rarely observed with qstat; if a job hangs in the queue for a long time marked for deletion, it indicates that the grid engine is not running properly. Please inform the system administrator about it.
To remove a job from the queue, the command qdel only requires the job-id. A job can also be changed after it has been submitted with the qalter command. It works similarly to the qsub command, but with the job-id instead of the shell script name.
The command qhost gives the status of all nodes. If the load is close to unity, it indicates that the machine is busy and most likely running a job (use the qstat command to check; if no job is shown, a user might have logged directly onto the node to run a job interactively).

Submitting an MPI-Job
To run a parallel job, the script requires some additional information. First, the option -pe has to be used to indicate the parallel environment. Right now only mpich is supported on the Beowulf cluster. The second mandatory argument of the -pe option is the number of requested nodes, which can also be given as a range. Sun Gridengine tries to maximize this number. It is recommended to add this line to the shell script:
#$ -pe mpich N

where N is the number of desired nodes. Right now it is limited to 14, corresponding loosely to one job per node/CPU. If multiple instances per node are required, please contact the system administrator to increase the maximum number of slots.
The invocation of mpirun also requires some non-standard placeholders (environment variables), which are filled in by the grid engine at the execution of the script. The format is (one line!):
/usr/local/mpich/bin/mpirun -np $NSLOTS -machinefile $TMPDIR/machines <path to mpi program + optional command line arguments>
Everything up to the path of the MPI program should be used as it is. $NSLOTS and $TMPDIR will be defined by the Sun Grid Engine. Note also that this script does not run correctly if executed directly. Further information on MPICH can be found here.

Interactive Sessions
If a user has to run an interactive session (e.g. Oopics), they can log onto a node with the qsh command. The Sun Grid Engine will then mark that node as busy and will not submit any further jobs to it until the user has logged out. The command qstat will show INTERACTIVE as the job name, indicating that an interactive session is running on that node.
For now the command qsh is not working properly, but the system administrator is currently working on fixing it.

The Interactive Monitor QMON
QMON is a graphical user interface that replaces all of the UNIX commands of the grid engine (e.g. qsub, qdel, ...). It is started by typing qmon at the command prompt, followed by a space and an ampersand, so that the prompt is not blocked. For the normal user only the first three buttons are of importance; they correspond to qstat, qhost, and qsub, respectively. The usage is mostly intuitive. You can also ask the system administrators for help. It is recommended to use the job submission panel at least once to define your default parameters and to save the settings. After filling out the parameters, press the 'Save Settings' button and name the file to be written. The generated file can be used as a template for .sge_request.

Overview of the Most Common Gridengine Commands
- qsub Submits a job to the queue. It requires a shell script, which is wrapped around the program to be run. Options can be defined as command line arguments, in the script file, or in the .sge_request file in your home directory. See above for more information.
- qdel Marks a job for deletion. It requires the job-id and not the job name, which can be ambiguous.
- qalter Changes the options of an already submitted job. The options are the same as for qsub, but it requires the job-id instead of the shell script name. If the job is already running, it will be restarted.
- qstat Shows the status of the queue or of a specific job if it is specified with the -j <job-id> option.
- qhost Shows the status of the nodes.
- qsh Starts an xterm session through the grid engine for interactive jobs.
- qhold Puts a job which hasn't been started yet on hold; it is not scheduled for execution by the gridengine till the hold is removed. Requires the job-id as argument.
- qrls Releases a job from a hold. It will be put back in the queue and scheduled for execution. Requires the job-id as argument.
- qmon Interactive monitor of the sun gridengine.
More information can be obtained via the man command at the command prompt. The User and Administration guides give a complete description of the sun gridengine; most of it can also be found on the official homepage.
wiki.ibest.uidaho.edu
qsub [options]
   [-a date_time]                    request a start time
   [-ac context_list]                add context variable(s)
   [-ar ar_id]                       bind job to advance reservation
   [-A account_string]               account string in accounting record
   [-b y[es]|n[o]]                   handle command as binary
   [-binding [env|pe|set] exp|lin|str]  binds job to processor cores
   [-c n s m x]                      define type of checkpointing for job
        n           no checkpoint is performed.
        s           checkpoint when batch server is shut down.
        m           checkpoint at minimum CPU interval.
        x           checkpoint when job gets suspended.
        <interval>  checkpoint in the specified time interval.
   [-ckpt ckpt-name]                 request checkpoint method
   [-clear]                          skip previous definitions for job
   [-cwd]                            use current working directory
   [-C directive_prefix]             define command prefix for job script
   [-dc simple_context_list]         delete context variable(s)
   [-dl date_time]                   request a deadline initiation time
   [-e path_list]                    specify standard error stream path(s)
   [-h]                              place user hold on job
   [-hard]                           consider following requests "hard"
   [-help]                           print this help
   [-hold_jid job_identifier_list]   define jobnet interdependencies
   [-hold_jid_ad job_identifier_list]  define jobnet array interdependencies
   [-i file_list]                    specify standard input stream file(s)
   [-j y[es]|n[o]]                   merge stdout and stderr stream of job
   [-js job_share]                   share tree or functional job share
   [-jsv jsv_url]                    job submission verification script to be used
   [-l resource_list]                request the given resources
   [-m mail_options]                 define mail notification events
   [-masterq wc_queue_list]          bind master task to queue(s)
   [-notify]                         notify job before killing/suspending it
   [-now y[es]|n[o]]                 start job immediately or not at all
   [-M mail_list]                    notify these e-mail addresses
   [-N name]                         specify job name
   [-o path_list]                    specify standard output stream path(s)
   [-P project_name]                 set job's project
   [-p priority]                     define job's relative priority
   [-pe pe-name slot_range]          request slot range for parallel jobs
   [-q wc_queue_list]                bind job to queue(s)
   [-R y[es]|n[o]]                   reservation desired
   [-r y[es]|n[o]]                   define job as (not) restartable
   [-sc context_list]                set job context (replaces old context)
   [-shell y[es]|n[o]]               start command with or without wrapping <loginshell> -c
   [-soft]                           consider following requests as soft
   [-sync y[es]|n[o]]                wait for job to end and return exit code
   [-S path_list]                    command interpreter to be used
   [-t task_id_range]                create a job-array with these tasks
   [-tc max_running_tasks]           throttle the number of concurrent tasks (experimental)
   [-terse]                          tersed output, print only the job-id
   [-v variable_list]                export these environment variables
   [-verify]                         do not submit, just verify
   [-V]                              export all environment variables
   [-w e|w|n|v|p]                    verify mode (error|warning|none|just verify|poke) for jobs
   [-wd working_directory]           use working_directory
   [-@ file]                         read commandline input from file
   [{command|-} [command_args]]
What is qsub?

Qsub is the command used for job submission to the cluster. It takes several command line arguments and can also use special directives found in the submission scripts or command file. Several of the most widely used arguments are described in detail below.
Environment variables in qsub

The qsub command will pass certain environment variables in the Variable_List attribute of the job. These variables will be available to the job. The values of the following variables are taken from the environment of the qsub command:
- HOME (the path to your home directory)
- LOGNAME (the name that you logged in with)
- PATH (standard path to executables)
- SHELL (command shell, e.g. bash, sh, zsh, csh, etc.)
- WORKDIR (the working directory)
- HOST (the name of the computer the job was submitted from)
These values are assigned to new names, which are the original names prefixed with the string "SGE_O_". For example, the job will have access to an environment variable named SGE_O_HOME which has the value of the variable HOME in the qsub command environment.
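A quick way to inspect these is a job that just dumps its environment. A minimal sketch (the exact variable names and case may differ between SGE versions):

#!/bin/sh
#$ -cwd -j y
# print all SGE-related variables the job inherits
env | grep '^SGE'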
Arguments to control behavior and request resources

As stated before, there are several arguments that you can use to make your jobs behave a specific way or to request resources. This is not an exhaustive list, but it covers some of the most widely used arguments and many that you will probably need to accomplish specific tasks.
Declare the date/time a job becomes eligible for execution
To set the date/time which a job becomes eligible to run, use the -a argument. The date/time format is [[[[CC]YY]MM]DD]hhmm[.SS]. If -a is not specified qsub assumes that the job should be run immediately.
Try it out
To test -a, get the current time from the command line and add a couple of minutes to it. It was 11:31 when I checked. Pass the resulting hhmm value to -a and submit a command from STDIN:
echo "sleep 30" | qsub -a 1133Manipulate the output files
By default, all jobs will print all stdout (standard output) messages to a file named in the format <job_name>.o<job_id>, and all stderr (standard error) messages will be sent to a file named <job_name>.e<job_id>. These files will be copied to your working directory when the job finishes. To rename the files or specify a different location for the standard output and error files, use -o for standard output and -e for the standard error file. You can also combine the output using -j.
Try it out
Create a simple submission file:
sleep.sh
#!/bin/sh
for i in `seq 1 60` ; do
    echo $i
    sleep 1
done

Then submit your job with the standard output file renamed to sleep.log:
qsub -o sleep.log sleep.sh

Submit your job with the standard error file renamed:
qsub -e sleep.log sleep.sh
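You can also merge stderr into the stdout file with -j (documented in the option summary above):

qsub -j y -o sleep.log sleep.sh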
Mail job status at the start and end of a job

The mailing options are set using the -m and -M arguments. The -m argument sets the conditions under which the batch server will send a mail message about the job, and -M defines the users to whom email will be sent (multiple users can be specified in a comma-separated list). The conditions for the -m argument include:
- a: mail is sent when the job is aborted.
- b: mail is sent when the job begins.
- e: mail is sent when the job ends.
Try it out
Using the sleep.sh script created earlier, submit a job that emails you for all conditions:
# qsub -m abe -M [email protected] sleep.sh
Submitting a job that uses specific resources

For now, let's look at how to check which resources are available and how to request them; a sketch follows below.
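The commands below are standard SGE client tools, but the exact complex (resource) names such as h_vmem and h_rt depend on your site's configuration, so treat this as a sketch and check qconf -sc first:

# list the resource attributes (complexes) defined in the cluster
qconf -sc
# show per-host resource availability
qhost -F
# request 2 GB of virtual memory and 1 hour of runtime at submission
qsub -l h_vmem=2G,h_rt=1:00:00 sleep.sh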
Submitting a job that is dependent on the output of another
To create a job that will not run until another job has completed, simply add the -hold_jid <job name> argument to your qsub invocation. This takes the place of PBS's '-W depend=afterok:$ID' argument.
An SGE script example, test2.sge:
#!/bin/sh
#$ -cwd
#$ -N test2
#$ -hold_jid test1
./test2

Now the 'test2' job will not run until test1 has completed. Note that this is a fairly static way of doing things. If you are building a batch submit script that creates job dependency trees, you could not replace 'test1' with 'test$1' ($1 being an argument or environment variable) in this submit script 'test2.sge'. Even though the #$ -hold_jid test$1 line is an active comment to SGE, it is still an ordinary comment to bash, so the $1 is never evaluated; it stays as the literal $1 and is then interpreted by SGE as an unset value. The solution is to simply pass the argument to qsub in the program call:
qsub -hold_jid $WAITONJOB test2.sge

This lets you choose the job to wait on dynamically. In the same way you can pass all the directives you want to qsub (like -cwd here) on the command line instead of using active comments. But if the argument is not dynamic, this complicates job submissions and is generally not a good way to go.
For more examples of dependent job submissions, see an example PBS pipeline. The -W depend=afterok:$ID directive would be replaced with our -hold_jid <job name> for SGE, and you would have the same thing. A minimal sketch of a dynamic two-job chain follows.
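Assuming the test1.sge and test2.sge scripts from above, the -terse option (see the option summary) makes qsub print only the job id, which can then be captured and used for the dependency:

# submit the first job and capture its job id
JOBID=$(qsub -terse test1.sge)
# make the second job wait for the first, by id instead of by name
qsub -hold_jid "$JOBID" test2.sge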
Opening an interactive shell to the compute node
See SGE_Tutorial:_Interactive_jobs
Passing an environment variable to your job
You can pass user defined environment variables to a job by using the -v argument.
Try it out

To test this we will use a simple script that prints out an environment variable.
variable.sh
#!/bin/sh
if [ "x" = "x$MYVAR" ] ; then
    echo "Variable is not set"
else
    echo "Variable says: $MYVAR"
fi

Next use qsub without the -v and check your standard output file:
# qsub variable.shThen use the -v to set the variable
# qsub -v MYVAR="hello" variable.sh
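The -v switch accepts a comma-separated variable list, so several variables can be passed at once (OTHER below is just an illustrative second variable); alternatively, -V exports the submitter's entire environment to the job:

# pass two variables explicitly
qsub -v MYVAR="hello",OTHER="world" variable.sh
# or export the whole submission environment
qsub -V variable.sh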
Excerpts from the qsub man page
NAME

qsub - submit a batch job to Sun Grid Engine.
SYNOPSIS

qsub [ options ] [ command | -- [ command_args ]]
DESCRIPTION

Qsub submits batch jobs to the Sun Grid Engine queuing system. Sun Grid Engine supports single- and multiple-node jobs. Command can be a path to a binary or a script (see -b below) which contains the commands to be run by the job using a shell (for example, sh(1) or csh(1)). Arguments to the command are given as command_args to qsub. If command is handled as a script then it is possible to embed flags in the script. If the first two characters of a script line either match '#$' or are equal to the prefix string defined with the -C option described below, the line is parsed for embedded command flags.
For qsub, the administrator and the user may define default request files (see sge_request(5)) which can contain any of the options described below. If an option in a default request file is understood by qsub and qlogin but not by qsh the option is silently ignored if qsh is invoked. Thus you can maintain shared default request files for both qsub and qsh.
A cluster wide default request file may be placed under $SGE_ROOT/$SGE_CELL/common/sge_request. User private default request files are processed under the locations $HOME/.sge_request and $cwd/.sge_request. The working directory local default request file has the highest precedence, then the home directory located file and then the cluster global file. The option arguments, the embedded script flags and the options in the default request files are processed in the following order: left to right in the script line, left to right in the default request files, from top to bottom of the script file (qsub only), from top to bottom of default request files, from left to right of the command line. In other words, the command line can be used to override the embedded flags and the default request settings. The embedded flags, however, will override the default settings.
Note that the -clear option can be used to discard any previous settings at any time in a default request file, in the embedded script flags, or in a command-line option. It is, however, not available with qalter.
The options described below can be requested either hard or soft. By default, all requests are considered hard until the -soft option (see below) is encountered. The hard/soft status remains in effect until its counterpart is encountered again. If all the hard requests for a job cannot be met, the job will not be scheduled. Jobs which cannot be run at the present time remain spooled.
OPTIONS
-a date_time

If this option is used with qsub or if a corresponding value is specified in qmon then a parameter named a and the value in the format CCYYMMDDhhmm.SS will be passed to the defined JSV instances (see -jsv option below or find more information concerning JSV in jsv(1))
-ar ar_id

Note that the -ar option implicitly adds the -w e option if not otherwise requested.
-binding [ binding_instance ] binding_strategy

To force Sun Grid Engine to select hardware on which the binding can be applied, use the -l switch in combination with the complex attribute m_topology.
binding_instance is an optional parameter. It might either be env, pe or set depending on which instance should accomplish the job-to-core binding. If the value for binding_instance is not specified then set will be used.
env means that the environment variable SGE_BINDING will be exported to the job environment of the job. This variable contains the selected operating system internal processor numbers. They might be more than the selected cores in presence of SMT or CMT because each core could be represented by multiple processor identifiers. The processor numbers are space separated.
pe means that the information about the selected cores appears in the fourth column of the pe_hostfile. Here the logical core and socket numbers are printed (they start at 0 and have no holes) in colon separated pairs (i.e. 0,0:1,0 which means core 0 on socket 0 and core 0 on socket 1). For more information about the $pe_hostfile check sge_pe(5)
set (default if nothing else is specified). The binding strategy is applied by Sun Grid Engine. How this is achieved depends on the underlying hardware architecture of the execution host where the submitted job will be started.
On Solaris 10 hosts a processor set will be created in which the job can run exclusively. Because of operating system limitations at least one core must remain unbound. This resource could of course be used by an unbound job.
On Linux hosts a processor affinity mask will be set to restrict the job to run exclusively on the selected cores. The operating system allows other unbound processes to use these cores. Please note that on Linux the binding requires a Linux kernel version of 2.6.16 or greater. It might even be possible to use a kernel with a lower version number, but in that case additional kernel patches have to be applied. The loadcheck tool in the utilbin directory can be used to check the host's capabilities. You can also use -sep in combination with -cb of the qconf(1) command to identify whether Sun Grid Engine is able to recognize the hardware topology.
Possible values for binding_strategy are as follows:
linear:<amount>[:<socket>,<core>]
striding:<amount>:<n>[:<socket>,<core>]
explicit:[<socket>,<core>;...]<socket>,<core>

For the binding strategies linear and striding there is an optional socket and core pair attached. These denote the mandatory starting point for the first core to bind on.
linear means that Sun Grid Engine tries to bind the job on amount successive cores. If socket and core is omitted then Sun Grid Engine first allocates successive cores on the first empty socket found. Empty means that there are no jobs bound to the socket by Sun Grid Engine. If this is not possible or is not sufficient, Sun Grid Engine tries to find (further) cores on the socket with the most unbound cores, and so on. If the amount of allocated cores is lower than the amount of requested cores, no binding is done for the job. If socket and core is specified then Sun Grid Engine tries to find amount empty cores beginning with this starting point. If this is not possible then binding is not done.
striding means that Sun Grid Engine tries to find cores with a certain offset. It will select amount empty cores with an offset of n-1 cores in between. The starting point for the search algorithm is socket 0, core 0. As soon as amount cores are found they will be used for the job binding. If there are not enough empty cores, or if the correct offset cannot be achieved, then no binding is done.
explicit binds the specified sockets and cores that are mentioned in the provided socket/core list. Each socket/core pair has to be specified only once. If a socket/core pair is already in use by a different job the whole binding request will be ignored.
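For illustration, two hedged examples of these strategies submitted as binary jobs (my_app is a hypothetical executable; explicit follows the syntax shown above):

# bind the job to 2 successive cores chosen by Sun Grid Engine
qsub -binding linear:2 -b y ./my_app
# bind the job to 2 cores, taking every second core (stride of 2), starting at socket 0, core 0
qsub -binding striding:2:2 -b y ./my_app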
If this option or a corresponding value in qmon is specified then these values will be passed to defined JSV instances as parameters with the names binding_strategy, binding_type, binding_amount, binding_step, binding_socket, binding_core, binding_exp_n, binding_exp_socket<id>, binding_exp_core<id>.
Please note that the length of the socket/core value list of the explicit binding is reported as binding_exp_n. <id> will be replaced by the position of the socket/core pair within the explicit list (0 <= id < binding_exp_n). The first socket/core pair of the explicit binding will be reported with the parameter names binding_exp_socket0 and binding_exp_core0.
Values that do not apply for the specified binding will not be reported to JSV. E.g. binding_step will only be reported for the striding binding, and all binding_exp_* values will be passed to JSV only if explicit binding was specified. (see -jsv option below or find more information concerning JSV in jsv(1))
-b y[es]|n[o]

Gives the user the possibility to indicate explicitly whether command should be treated as binary or script. If the value of -b is 'y', then command may be a binary or script. The command might not be accessible from the submission host. Nothing except the path of the command will be transferred from the submission host to the execution host. Path aliasing will be applied to the path of command before command will be executed.
If the value of -b is 'n' then command needs to be a script and it will be handled as script. The script file has to be accessible by the submission host. It will be transferred to the execution host. qsub/qrsh will search directive prefixes within script.
qsub will implicitly use -b n whereas qrsh will apply the -b y option if nothing else is specified.
The value specified with this option or the corresponding value specified in qmon will only be passed to defined JSV instances if the value is yes. The name of the parameter will be b. The value will be y also when the long form yes was specified during submission. (see -jsv option below or find more information concerning JSV in jsv(1)) Please note that submission of command as a script (-b n) can have a significant performance impact, especially for short running jobs and big job scripts. Script submission adds a number of operations to the submission process: the job script needs to be
-c occasion_specifier

n           no checkpoint is performed.
s           checkpoint when batch server is shut down.
m           checkpoint at minimum CPU interval.
x           checkpoint when job gets suspended.
<interval>  checkpoint in the specified time interval.
The minimum CPU interval is defined in the queue configuration (see queue_conf(5) for details). <interval> has to be specified in the format hh:mm:ss. The maximum of <interval> and the queue's minimum CPU interval is used if <interval> is specified. This is done to ensure that a machine is not overloaded by checkpoints being generated too frequently.
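A hedged example: assuming the administrator has defined a checkpointing environment (the name site_ckpt below is hypothetical), a job can be checkpointed at regular intervals like this:

# checkpoint in 90-minute intervals (or at the queue's minimum CPU interval, whichever is larger)
qsub -ckpt site_ckpt -c 01:30:00 job.sh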
The value specified with this option or the corresponding value specified in qmon will be passed to defined JSV instances. The <interval> will be available as a parameter with the name c_interval. The character sequence specified will be available as a parameter with the name c_occasion. Please note that if you change c_occasion via JSV then the last setting of c_interval will be overwritten and vice versa. (see -jsv option below or find more information concerning JSV in jsv(1))
-cwd

If this option or a corresponding value in qmon is specified then this value will be passed to defined JSV instances as a parameter with the name cwd. The value of this parameter will be the absolute path to the current working directory. JSV scripts can remove the path from jobs during the verification process by setting the value of this parameter to an empty string. As a result the job behaves as if -cwd was not specified during job submission. (see -jsv option below or find more information concerning JSV in jsv(1))
-h

`u' denotes a user hold.
`s' denotes a system hold.
`o' denotes an operator hold.
`n' denotes no hold (requires manager privileges).

As long as any hold other than `n' is assigned to the job, the job is not eligible for execution. Holds can be released via qalter and qrls(1). In the case of qalter this is supported by the following additional option specifiers for the -h switch:
`U' removes a user hold.
`S' removes a system hold.
`O' removes an operator hold.

Sun Grid Engine managers can assign and remove all hold types, Sun Grid Engine operators can assign and remove user and operator holds, and users can only assign or remove user holds.
In the case of qsub only user holds can be placed on a job and thus only the first form of the option with the -h switch alone is allowed. As opposed to this, qalter requires the second form described above.
An alternate means to assign holds is provided by the qhold(1) facility.
If the job is an array job (see the -t option below), all tasks specified via -t are affected by the -h operation simultaneously.
If this option is specified with qsub or during the submission of a job in qmon then the parameter h with the value u will be passed to the defined JSV instances indicating that the job will be in user hold after the submission finishes. (see -jsv option below or find more information concerning JSV in jsv(1))
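For example, a job can be submitted in user hold and released later (the job id 42 below is illustrative):

# submit the job with a user hold; it stays pending until the hold is released
qsub -h sleep.sh
# release the user hold later, by either of these equivalent commands
qrls 42
qalter -h U 42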
-hold_jid wc_job_list

Defines or redefines the job dependency list of the submitted job. A reference by job name or pattern is only accepted if the referenced job is owned by the same user as the referring job. The submitted job is not eligible for execution unless all jobs referenced in the comma-separated job id and/or job name list have completed. If any of the referenced jobs exits with exit code 100, the submitted job will remain ineligible for execution.
With the help of job names or regular patterns one can specify a job dependency on multiple jobs satisfying the regular pattern, or on all jobs with the requested name. The name dependencies are resolved at submit time and can only be changed via qalter. New jobs or name changes of other jobs will not be taken into account.

If this option or a corresponding value in qmon is specified then this value will be passed to defined JSV instances as a parameter with the name hold_jid. (see -jsv option below or find more information concerning JSV in jsv(1))
-hold_jid_ad wc_job_list

With the help of job names or regular patterns one can specify a job dependency on multiple jobs satisfying the regular pattern, or on all jobs with the requested name. The name dependencies are resolved at submit time and can only be changed via qalter. New jobs or name changes of other jobs will not be taken into account.
If either the submitted job or any job in wc_job_list is not an array job with the same range of sub-tasks (see -t option below), the request list will be rejected and the job create or modify operation will fail.
If this option or a corresponding value in qmon is specified then this value will be passed to defined JSV instances as parameter with the name hold_jid_ad. (see -jsv option below or find more information concerning JSV in jsv(1))
-i file_list

By default /dev/null is the input stream for the job.
It is possible to use certain pseudo variables, whose values will be expanded at runtime of the job and will be used to express the standard input stream as described in the -e option for the standard error stream.
If this option or a corresponding value in qmon is specified then this value will be passed to defined JSV instances as parameter with the name i. (see -jsv option below or find more information concerning JSV in jsv(1))
-jsv jsv_url

In contrast to other options this switch will not be overwritten if it is also used in sge_request files. Instead, all specified JSV instances will be executed to verify the job to be submitted.
The JSV instance which is directly passed with the command line of a client is executed first to verify the job specification. After that, the JSV instances which might have been defined in various sge_request files will be triggered to check the job. Find more details in the man pages jsv(1) and sge_request(5).
The syntax of the jsv_url is specified in sge_types(1).
-l resource_list

Launch the job in a Sun Grid Engine queue meeting the given resource request list. In case of qalter the previous definition is replaced by the specified one.
complex(5) describes how a list of available resources and their associated valid value specifiers can be obtained.
There may be multiple -l switches in a single command, and you may mix hard and soft -l requests on the same command line. In the case of a serial job, multiple -l switches refine the definition of the sought queue.
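A hedged example mixing hard and soft resource requests (the complex names mem_free and arch are common defaults but vary by site; check qconf -sc):

# hard-require 2 GB of free memory; prefer, but do not require, a 64-bit Linux host
qsub -hard -l mem_free=2G -soft -l arch=lx24-amd64 job.sh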
-M mail_list

Defines or redefines the list of users to which the server that executes the job has to send mail, if the server sends mail about the job. Default is the job owner at the originating host.
-masterq wc_queue_list

Defines or redefines a list of cluster queues, queue domains and queue instances which may be used to become the so-called master queue of this parallel job. A more detailed description of wc_queue_list can be found in sge_types(1). The master queue is defined as the queue where the parallel job is started. The other queues to which the parallel job spawns tasks are called slave queues. A parallel job only has one master queue.
This parameter has all the properties of a resource request and will be merged with requirements derived from the -l option described above.
-notify

Note that the Linux operating system "misused" the user signals SIGUSR1 and SIGUSR2 in some early POSIX thread implementations. You might not want to use the -notify option if you are running multi-threaded applications in your jobs under Linux, particularly on 2.0 or earlier kernels.
-P project_name

Specifies the project to which this job is assigned. The administrator needs to give permission to individual users to submit jobs to a specific project (see the -aprj option to qconf(1)).
If this option or a corresponding value in qmon is specified then this value will be passed to defined JSV instances as parameter with the name ot. (see -jsv option above or find more information concerning JSV in jsv(1))
-p priority

Users may only decrease the priority of their jobs. Sun Grid Engine managers and administrators may also increase the priority associated with jobs. If a pending job has higher priority, it becomes eligible for dispatch by the Sun Grid Engine scheduler earlier.
If this option or a corresponding value in qmon is specified and the priority is not 0 then this value will be passed to defined JSV instances as parameter with the name p. (see -jsv option above or find more information concerning JSV in jsv(1))
-pe pe-name slot_range

Parallel programming environment (PE) to instantiate. For more detail about PEs, please see sge_types(1).
-q wc_queue_list

Defines or redefines a list of cluster queues, queue domains or queue instances which may be used to execute this job. Please find a description of wc_queue_list in sge_types(1). This parameter has all the properties of a resource request and will be merged with requirements derived from the -l option described above. Qalter allows changing this option even while the job executes. The modified parameter will only be in effect after a restart or migration of the job, however.

If this option or a corresponding value in qmon is specified then these hard and soft resource requirements will be passed to defined JSV instances as parameters with the names q_hard and q_soft. If regular expressions are used for resource requests, then these expressions will be passed as they are. Also, shortcut names will not be expanded. (see -jsv option above or find more information concerning JSV in jsv(1))
-R y[es]|n[o]

Indicates whether a reservation for this job should be done. Reservation is never done for immediate jobs, i.e. jobs submitted using the -now yes option. Please note that regardless of the reservation request, job reservation might be disabled using max_reservation in sched_conf(5) and might be limited only to a certain number of high priority jobs.
By default jobs are submitted with the -R n option.
The value specified with this option or the corresponding value specified in qmon will only be passed to defined JSV instances if the value is yes. The name of the parameter will be R. The value will be y also when the long form yes was specified during submission. (see -jsv option above or find more information concerning JSV in jsv(1))
-r y[es]|n[o]

Identifies the ability of a job to be rerun or not. If the value of -r is 'yes', the job will be rerun if the job was aborted without leaving a consistent exit state. (This is typically the case if the node on which the job is running crashes.) If -r is 'no', the job will not be rerun under any circumstances. Interactive jobs submitted with qsh, qrsh or qlogin are not rerunnable.
-sc context_list

Contexts provide a way to dynamically attach and remove meta-information to and from a job. The context variables are not passed to the job's execution context in its environment.
-shell y[es]|n[o]

-shell n causes qsub to execute the command line directly, as if by exec(2). No command shell will be executed for the job. This option only applies when -b y is also used. Without -b y, -shell n has no effect.
This option can be used to speed up execution as some overhead, like the shell startup and sourcing the shell resource files is avoided.
This option can only be used if no shell-specific command line parsing is required. If the command line contains shell syntax, like environment variable substitution or (back) quoting, a shell must be started. In this case either do not use the -shell n option or execute the shell as the command line and pass the path to the executable as a parameter.
If a job executed with the -shell n option fails due to a user error, such as an invalid path to the executable, the job will enter the error state.
-shell y cancels the effect of a previous -shell n. Otherwise, it has no effect.
See -b and -noshell for more information.
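For example, to run a binary directly with no wrapping shell (this only makes sense together with -b y):

# execute /bin/hostname directly, avoiding login shell startup overhead
qsub -b y -shell n /bin/hostname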
The value specified with this option or the corresponding value specified in qmon will only be passed to defined JSV instances if the value is yes. The name of the parameter will be shell. The value will be y also when the long form yes was specified during submission. (see -jsv option above or find more information concerning JSV in jsv(1))
-soft

If this option or a corresponding value in qmon is specified then the corresponding -q and -l resource requirements will be passed to defined JSV instances as parameters with the names q_soft and l_soft. Find more information in the sections describing -q and -l. (see -jsv option above or find more information concerning JSV in jsv(1))
-sync y[es]|n[o]

-sync y causes qsub to wait for the job to complete before exiting. If the job completes successfully, qsub's exit code will be that of the completed job. If the job fails to complete successfully, qsub will print out an error message indicating why the job failed and will have an exit code of 1. If qsub is interrupted, e.g. with CTRL-C, before the job completes, the job will be canceled.
With the -sync n option, qsub will exit with an exit code of 0 as soon as the job is submitted successfully. -sync n is default for qsub.
If -sync y is used in conjunction with -now y, qsub will behave as though only -now y were given until the job has been successfully scheduled, after which time qsub will behave as though only -sync y were given. If -sync y is used in conjunction with -t n[-m[:i]], qsub will wait for all the job's tasks to complete before exiting. If all the job's tasks complete successfully, qsub's exit code will be that of the first completed job tasks with a non-zero exit code, or 0 if all job tasks exited with an exit code of 0. If any of the job's tasks fail to complete successfully, qsub will print out an error message indicating why the job task(s) failed and will have an exit code of 1. If qsub is interrupted, e.g. with CTRL-C, before the job completes, all of the job's tasks will be canceled.
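For example, -sync y turns qsub into a blocking call, which is handy in shell scripts (job.sh is a placeholder for any submission script):

# block until the job finishes, then propagate its exit status
qsub -sync y job.sh
echo "job finished with exit code $?"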
Furthermore, the pathname can be constructed with pseudo environment variables as described for the -e option above.
-S path_list

In the case of qsh the specified shell path is used to execute the corresponding command interpreter in the xterm(1) (via its -e option) started on behalf of the interactive job. Qalter allows changing this option even while the job executes. The modified parameter will only be in effect after a restart or migration of the job, however.
If this option or a corresponding value in qmon is specified then this value will be passed to defined JSV instances as parameter with the name S. (see -jsv option above or find more information concerning JSV in jsv(1))
-t task_id_range

The following restrictions apply to the values n and m:
1 <= n <= MIN(2^31-1, max_aj_tasks)
1 <= m <= MIN(2^31-1, max_aj_tasks)
n <= m

max_aj_tasks is defined in the cluster configuration (see sge_conf(5)).

The task id range specified in the option argument may be a single number, a simple range of the form n-m, or a range with a step size. Hence, the task id range specified by 2-10:2 would result in the task id indexes 2, 4, 6, 8, and 10, for a total of 5 identical tasks, each with the environment variable SGE_TASK_ID containing one of the 5 index numbers.
All array job tasks inherit the same resource requests and attribute definitions as specified in the qsub or qalter command line, except for the -t option. The tasks are scheduled independently and, provided enough resources exist, concurrently, very much like separate jobs. However, an array job or a sub-array thereof can be accessed as a single unit by commands like qmod(1) or qdel(1). See the corresponding manual pages for further detail.
Array jobs are commonly used to execute the same type of operation on varying input data sets correlated with the task index number. The number of tasks in an array job is unlimited.
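As a sketch, a minimal array job script (the input files input.1 ... input.10 and the process binary are hypothetical) selects its data set via SGE_TASK_ID:

#!/bin/sh
#$ -cwd
#$ -N array_demo
#$ -t 1-10
# each task processes the input file matching its task index
./process input.$SGE_TASK_ID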
STDOUT and STDERR of array job tasks will be written into different files with the default location
<jobname>.['e'|'o']<job_id>'.'<task_id>

In order to change this default, the -e and -o options (see above) can be used together with the pseudo environment variables $HOME, $USER, $JOB_ID, $JOB_NAME, $HOSTNAME, and $SGE_TASK_ID.
Note that you can use output redirection to divert the output of all tasks into the same file, but the result of this is undefined.
If this option or a corresponding value in qmon is specified then this value will be passed to defined JSV instances as parameters with the name t_min, t_max and t_step (see -jsv option above or find more information concerning JSV in jsv(1))
-v variable[=value],...

Defines or redefines the environment variables to be exported to the execution context of the job. If the -v option is present Sun Grid Engine will add the environment variables defined as arguments to the switch and, optionally, values of specified variables, to the execution context of the job.
All environment variables specified with -v, -V or the DISPLAY variable provided with -display will be exported to the defined JSV instances only optionally when this is requested explicitly during the job sub- mission verification. (see -jsv option above or find more information concerning JSV in jsv(1))
-w e|w|n|v|p

The specifiers e, w, n, v and p define the following validation modes:
`e'  error - jobs with invalid requests will be rejected.
`w'  warning - only a warning will be displayed for invalid requests.
`n'  none - switches off validation; the default for qsub, qalter, qrsh, qsh and qlogin.
`p'  poke - does not submit the job but prints a validation report based on a cluster as is with all resource utilizations in place.
`v'  verify - does not submit the job but prints a validation report based on an empty cluster.
Note that the necessary checks are performance-consuming and hence the checking is switched off by default. It should also be noted that load values are not taken into account with the verification since they are assumed to be too volatile. To cause -w e verification to be passed at submission time, it is possible to specify non-volatile values (non-consumables) or maximum values (consumables) in complex_values.
The cell name used is the one specified in the environment variable SGE_CELL, if it is set; otherwise the name of the default cell, i.e. default, is used.
In addition to those environment variables specified to be exported to the job via the -v or the -V option (see above) qsub, qsh, and qlogin add the following variables with the indicated values to the variable list:
if ( $?JOB_NAME ) then
    echo "Sun Grid Engine spooled job"
    exit 0
endif

Don't forget to set your shell's search path in your shell start-up before this code.

EXIT STATUS

The following exit values are returned:
EXAMPLES
The following is the simplest form of a Sun Grid Engine script file:

#!/bin/csh
a.out

The next example is a more complex Sun Grid Engine script:

#!/bin/csh
# Which account to be charged cpu time
#$ -A santa_claus
# date-time to run, format [[CC]yy]MMDDhhmm[.SS]
#$ -a 12241200
# to run I want 6 or more parallel processes
# under the PE pvm. the processes require
# 128M of memory
#$ -pe pvm 6- -l mem=128
# If I run on dec_x put stderr in /tmp/foo, if I
# run on sun_y, put stderr in /usr/me/foo
#$ -e dec_x:/tmp/foo,sun_y:/usr/me/foo
# Send mail to these users
#$ -M santa@northpole,claus@northpole
# Mail at beginning/end/on suspension
#$ -m bes
# Export these environmental variables
#$ -v PVM_ROOT,FOOBAR=BAR
# The job is located in the current
# working directory.
#$ -cwd

FILES
$REQUEST.oJID[.TASKID]                  STDOUT of job #JID
$REQUEST.eJID[.TASKID]                  STDERR of job
$REQUEST.poJID[.TASKID]                 STDOUT of par. env. of job
$REQUEST.peJID[.TASKID]                 STDERR of par. env. of job
$cwd/.sge_aliases                       cwd path aliases
$cwd/.sge_request                       cwd default request
$HOME/.sge_aliases                      user path aliases
$HOME/.sge_request                      user default request
<sge_root>/<cell>/common/sge_aliases    cluster path aliases
<sge_root>/<cell>/common/sge_request    cluster default request
<sge_root>/<cell>/common/act_qmaster    Sun Grid Engine master host file