The idea of the at command is to facilitate "one time" command scheduling. We will be talking about the Unix/Linux command, but an at command also exists in Windows (with a different syntax).
The value of at becomes instantly clear if, for example, you need to transfer files and the required bandwidth is available only at night (say 7PM to 7AM, or 19:00 to 07:00). Large data transfers can be scheduled for those hours. It is part of "playing nice". The at utility was written way before screen, tmux, and so on, and was one of the early ways to have a detached shell, i.e., one that would not die once you logged off the system.
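For instance, a large overnight transfer can simply be queued for the start of the window. A minimal sketch (the host and path names are made up):
# Queue a big copy for 19:00 (tonight, or tomorrow if 19:00 has already passed)
echo "rsync -a /data/outgoing/ backuphost:/vol/incoming/" | at 19:00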
The shutdown command is also often scheduled via the at command.
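For example, using the here-document form (the exact shutdown flags differ between systems; treat this as a sketch):
# Schedule a reboot for 23:30 tonight
at 23:30 <<'EOF'
/sbin/shutdown -r now
EOF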
In this sense it is complementary to the cron command, which is usually used to schedule periodic jobs. In other words, at is designed for occasional invocations of a program or a script that need to start at some point in the future, while cron was designed to run jobs that need to be executed repeatedly at fixed times.
At can still be used instead of the nohup command:
at -f myjob now
exit
NOTES:
It is important to understand that at is quite different from cron: the at command preserves the environment in which it was invoked, while cron does not (it executes commands in its own "cron" environment, and you should not expect that PATH and other variables from your login environment are preserved). In a cron job you need to source your .bashrc (or whatever dot file you are using) yourself to get correct values of PATH and other environment variables.
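A common workaround in a crontab entry is to source the dot file explicitly before the command; the dot-file and script paths here are assumptions, just to illustrate the pattern:
# m h dom mon dow  command
15 2 * * * . $HOME/.bashrc; /usr/local/bin/nightly_job.sh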
You can compile a crontab into a sequence of at commands and use at exclusively, creating a "day schedule" at the beginning of each day. This way you can consolidate all crontabs on one server and perform the compilation at the end of each scheduling day, distributing the results via ssh to the target servers for the next day. This scheme allows a more flexible environment than "distributed" cron, somewhat similar to expensive Enterprise Job schedulers; a minimal sketch is shown below.
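A toy version of such a "crontab compiler", assuming a simplified schedule file with one job per line in the form HH:MM<TAB>command (the file name, format, and paths are all assumptions, not a real tool):
#!/bin/bash
# compile_day_schedule.sh -- hypothetical sketch: turn a consolidated
# "day schedule" file into one-shot at jobs for the coming day.
SCHEDULE=/etc/day_schedule.txt          # assumed location and format

while IFS=$'\t' read -r when cmd; do
    [ -z "$when" ] && continue          # skip blank lines
    echo "$cmd" | at "$when" tomorrow   # queue the command for tomorrow
done < "$SCHEDULE"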
The only argument at accepts is a time specification. at allows fairly complex time specifications, extending the POSIX.2 standard. It accepts times both in the form HH:MM and HHMM (four digits without a colon) to run a job at a specific time of day. (If that time is already past, the next day is assumed.)
Instead of specifying an exact hour and minute, you may also use words such as midnight, noon, teatime (4pm), and now. For some strange (and somewhat perverse) reason you are even allowed to suffix a time-of-day with AM or PM for running in the morning or the evening. I think that those who can't specify time in 24-hour format should not use Unix/Linux ;-). This is somewhat akin to the inability of the USA to switch from imperial measures to the metric system, which has been a source of nasty jokes about the mental capacities (or, more correctly, incapacities) of the USA elite for 50 years or so. Such a provincial elite. The UK elite fared much better in this regard.
In addition to the time of day, you can also specify the day on which the job will be run. Here at allows additional baroque forms besides the classic YYYY-MM-DD. Among them there are two "USA-friendly" forms, MMDDYY and MM/DD/YY, and one EU-friendly form, DD.MM.YY. The specification of a date must follow the specification of the time of day.
For example, to run a job at 4 PM three days from now, you would do
at 16:00 + 3 days
To run a job at 10:00am on July 31, you would do
at 10:00 2014-07-31
There are also several specific keywords like tomorrow and the days of the week (Monday-Sunday). For example:
at 1:00 tomorrow
The exact definition of the time specification format can be found in /usr/share/doc/at-3.1.10/timespec.
The most important feature of the at command is its ability to operate in relative time terms. There are several important relative time specifications:
Relative intervals are specified relative to now or to any "fixed time" using the "+" sign:
now + count time-units
where the time-units can be minutes, hours, days, or weeks. You can tell at to run the job today by suffixing the time with today, and to run the job tomorrow by suffixing the time with tomorrow. The optional increment after a time specification is a number preceded by a plus sign (+) with one of the suffixes minutes, hours, days, or weeks, and permits specifying an offset from that time.
The spacing is very flexible as long as there are no ambiguities. Case generally does not matter. I wonder how many hours developers have spent debugging this "Christmas tree" specification and how many bugs still remain. For example:
at 0815am Jan 24
at 8 :15amjan24
at now "+ 1day"
at 5 pm FRIday
at '17 utc+30minutes'
The singular forms are also accepted, for example
at now + 1 minute
The keyword next can be used as an equivalent of the increment + 1. For example, the following two commands are equivalent:
at 2pm + 1 week
at 2pm next week
One can also use at for periodic but dynamic rescheduling of jobs. In such a case at is invoked inside the job itself with a new, dynamically computed interval.
A simplified example (and actually a questionable one, because if the interval is fixed crontab is a more appropriate vehicle for such work) is a script named my.daily which we force to run every day by invoking at inside the script itself:
# my.daily -- runs every day
at now tomorrow < my.daily
... ... ...
For large intervals (say, more than an hour) this is a better implementation than using sleep inside the script, as it releases resources between runs.
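A slightly fuller sketch of the same idea, using the pipable form shown later on this page (the script path and the daily-processing command are placeholders):
#!/bin/sh
# my.daily -- hypothetical self-rescheduling daily job (sketch)
/usr/local/bin/daily-processing            # do the actual daily work
# re-queue this script for roughly the same time tomorrow
echo "sh /usr/local/sbin/my.daily" | at now + 1 day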
You can inspect a queued job with at -c, which prints the whole job, including the environment captured at submission time:
# atq
4       2013-07-11 18:00 a root
basfimgw:/etc/cron.d # at -c 4
#!/bin/sh
# atrun uid=0 gid=0
# mail root 0
umask 22
USER=root; export USER
LS_COLORS=...; export LS_COLORS        (long value omitted here)
SUDO_USER=joeuser; export SUDO_USER
SUDO_UID=324547; export SUDO_UID
USERNAME=root; export USERNAME
PATH=/usr/bin:/bin:/usr/sbin:/sbin; export PATH
MAIL=/var/mail/root; export MAIL
PWD=/etc/; export PWD
LANG=en_US.UTF-8; export LANG
SHLVL=1; export SHLVL
SUDO_COMMAND=/bin/su; export SUDO_COMMAND
HOME=/root; export HOME
LS_OPTIONS=-A\ -N\ --color=tty\ -T\ 0; export LS_OPTIONS
LOGNAME=root; export LOGNAME
SUDO_GID=324547; export SUDO_GID
COLORTERM=1; export COLORTERM
OLDPWD=/lib/modules/3.0.80-0.5-default; export OLDPWD
cd /etc || {
        echo 'Execution directory inaccessible' >&2
        exit 1
}
reboot
Use the atrm command to remove a job if you change your mind. For example:
atrm 2
You can also pipe a command into at. Unlike cron, the environment is inherited, so you can often give a command without an explicit path. The colon between hours and minutes is optional if hours and minutes are specified with two digits:
echo "perl myjob" | at now
echo /sbin/reboot | at 18:00
echo reboot | at 2:00 July 14 2013
echo reboot | at 0400 2014-12-26
The at-job is executed in a separate invocation of the shell, running in a separate process group with no controlling terminal, except that the environment variables, current working directory, file creation mask (see umask(1)), and system resource limits in effect when at was invoked are retained and used when the at-job is executed.
The at command can be used similarly to the nohup and screen commands: you can execute a command (or shell script) on the server using at and then log out from the server:
at -f myjob now
exit
The at command is part of a set of four related commands: at, batch, atq, and atrm. For example, atq lists the queued jobs:
# atq
4       2013-07-11 18:00 a root
atrm [-V] job [job...] deletes jobs identified by their job number (use the atq command to get the job number).
The syntax of the Linux (GNU) implementation is as follows:
at [-V] [-q queue] [-f file] [-mldbv] TIME
at -c job [job...]
Job queues are designated by single letters (a-z, A-Z); the default queue for at is "a" and the default queue for batch is "b". Queues with higher letters run with increased niceness. The special queue "=" is reserved for jobs which are currently running. If a job is submitted to a queue designated with an uppercase letter, it is treated as if it had been submitted to batch at that time. If atq is given a specific queue, it will only show jobs pending in that queue.
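For example, low-priority work can be pushed into a higher-letter (nicer) queue and inspected separately; the queue letter and script path below are arbitrary:
# Submit a job to queue "z", which runs with higher niceness than the default "a"
echo "/usr/local/bin/reindex.sh" | at -q z 02:00
# List only the jobs pending in queue "z"
atq -q z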
Like the cron command, at is controlled by two files that list users, one per line, similar to cron.allow and cron.deny (locations vary by system):
/etc/at.allow
/etc/at.deny
/usr/lib/cron/at.allow
/usr/lib/cron/at.deny
Some sysadmins hardlink these files to the corresponding cron files to avoid synchronization problems: you can hardlink at.allow to cron.allow and at.deny to cron.deny.
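On a Linux box where the files live in /etc, the hardlinking trick is a one-time operation (the paths are an assumption; adjust for your platform):
# Keep at and cron access control in sync by making them the same file
ln -f /etc/cron.allow /etc/at.allow    # -f replaces an existing at.allow
ln -f /etc/cron.deny /etc/at.deny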
For example (the -m option mails a completion notice to the user even if the job produces no output):
echo "/sbin/chkconfig sendmail off" | at -m 19:00
atq
3       2013-06-07 19:00 a root
at -c 3
#!/bin/sh
# atrun uid=0 gid=0
# mail root 1
umask 22
MC_COLOR_TABLE=editnormal=black,white:editbold=white,black:editmarked=blue,cyan; export MC_COLOR_TABLE
MANPATH=/usr/man:/usr/local/man:/usr/share/man:/opt/OV/man:/opt/perf/man; export MANPATH
... ... ...
COLORTERM=1; export COLORTERM
cd /home/joeuser || {
        echo 'Execution directory inaccessible' >&2
        exit 1
}
/sbin/chkconfig sendmail off
Here-documents can be used to feed a multi-line job to at or batch:
at -m 0730 tomorrow <<!
sort < file > outfile
!
$ at now + 1 hour <<!
diff file1 file2 2>&1 >outfile | mailx mygroup
!
This sequence can be used at a terminal:
$ batch <<!
sort < file > outfile
!
This sequence, which demonstrates redirecting standard error to a pipe, is useful in a command procedure (the sequence of output redirection specifications is significant):
$ batch <<!
diff file1 file2 2>&1 >outfile | mailx mygroup
!
Aug 12, 2008 | www.linux.com
The Task Spooler project allows you to queue up tasks from the shell for batch execution. Task Spooler is simple to use and requires no configuration. You can view and edit queued commands, and you can view the output of queued commands at any time.
Task Spooler has some similarities with other delayed and batch execution projects, such as "at". While both Task Spooler and at handle multiple queues and allow the execution of commands at a later point, the at project handles output from commands by emailing the results to the user who queued the command, while Task Spooler allows you to get at the results from the command line instead. Another major difference is that Task Spooler is not aimed at executing commands at a specific time, but rather at simply adding to and executing commands from queues.
The main repositories for Fedora, openSUSE, and Ubuntu do not contain packages for Task Spooler. There are packages for some versions of Debian, Ubuntu, and openSUSE 10.x available along with the source code on the project's homepage. In this article I'll use a 64-bit Fedora 9 machine and install version 0.6 of Task Spooler from source. Task Spooler does not use autotools to build, so to install it, simply run
make; sudo make install

This will install the main Task Spooler command, ts, and its manual page into /usr/local.

A simple interaction with Task Spooler is shown below. First I add a new job to the queue and check the status. As the command is a very simple one, it is likely to have been executed immediately. Executing ts by itself with no arguments shows the executing queue, including tasks that have completed. I then use ts -c to get at the stdout of the executed command. The -c option uses cat to display the output file for a task. Using ts -i shows you information about the job. To clear finished jobs from the queue, use the ts -C command, not shown in the example.

$ ts echo "hello world"
6
$ ts
ID   State      Output               E-Level  Times(r/u/s)    Command [run=0/1]
6    finished   /tmp/ts-out.QoKfo9   0        0.00/0.00/0.00  echo hello world
$ ts -c 6
hello world
$ ts -i 6
Command: echo hello world
Enqueue time: Tue Jul 22 14:42:22 2008
Start time: Tue Jul 22 14:42:22 2008
End time: Tue Jul 22 14:42:22 2008
Time run: 0.003336s
The -t option operates like tail -f, showing you the last few lines of output and continuing to show you any new output from the task. If you would like to be notified when a task has completed, you can use the -m option to have the results mailed to you, or you can queue another command to be executed that just performs the notification. For example, I might add a tar command and want to know when it has completed. The below commands will create a tarball and use libnotify commands to create an inobtrusive popup window on my desktop when the tarball creation is complete. The popup will be dismissed automatically after a timeout.

$ ts tar czvf /tmp/mytarball.tar.gz liberror-2.1.80011
11
$ ts notify-send "tarball creation" "the long running tar creation process is complete."
12
$ ts
ID   State      Output               E-Level  Times(r/u/s)    Command [run=0/1]
11   finished   /tmp/ts-out.O6epsS   0        4.64/4.31/0.29  tar czvf /tmp/mytarball.tar.gz liberror-2.1.80011
12   finished   /tmp/ts-out.4KbPSE   0        0.05/0.00/0.02  notify-send tarball creation the long... is complete.

Notice in the output above, toward the far right of the header information, the run=0/1 text. This tells you that Task Spooler is executing nothing, and can possibly execute one task. Task Spooler allows you to execute multiple tasks at once from your task queue to take advantage of multicore CPUs. The -S option allows you to set how many tasks can be executed in parallel from the queue, as shown below.

$ ts -S 2
$ ts
ID   State      Output               E-Level  Times(r/u/s)    Command [run=0/2]
6    finished   /tmp/ts-out.QoKfo9   0        0.00/0.00/0.00  echo hello world

If you have two tasks that you want to execute with Task Spooler but one depends on the other having already been executed (and perhaps that the previous job has succeeded too) you can handle this by having one task wait for the other to complete before executing. This becomes more important on a quad core machine when you might have told Task Spooler that it can execute three tasks in parallel. The commands shown below create an explicit dependency, making sure that the second command is executed only if the first has completed successfully, even when the queue allows multiple tasks to be executed.
The first command is queued normally using ts. I use a subshell to execute the commands by having ts explicitly start a new bash shell. The second command uses the -d option, which tells ts to execute the command only after the successful completion of the last command that was appended to the queue. When I first inspect the queue I can see that the first command (28) is executing. The second command is queued but has not been added to the list of executing tasks because Task Spooler is aware that it cannot execute until task 28 is complete. The second time I view the queue, both tasks have completed.

$ ts bash -c "sleep 10; echo hi"
28
$ ts -d echo there
29
$ ts
ID   State      Output               E-Level  Times(r/u/s)     Command [run=1/2]
28   running    /tmp/ts-out.hKqDva                             bash -c sleep 10; echo hi
29   queued     (file)                                         && echo there
$ ts
ID   State      Output               E-Level  Times(r/u/s)     Command [run=0/2]
28   finished   /tmp/ts-out.hKqDva   0        10.01/0.00/0.01  bash -c sleep 10; echo hi
29   finished   /tmp/ts-out.VDtVp7   0        0.00/0.00/0.00   && echo there
$ cat /tmp/ts-out.hKqDva
hi
$ cat /tmp/ts-out.VDtVp7
there
You can also explicitly set dependencies on other tasks as shown below. Because the ts command prints the ID of a new task to the console, the first command puts that ID into a shell variable for use in the second command. The second command passes the task ID of the first task to ts, telling it to wait for the task with that ID to complete before returning. Because this is joined with the command we wish to execute with the && operation, the second command will execute only if the first one has finished and succeeded.

The first time we view the queue you can see that both tasks are running. The first task will be in the sleep command that we used explicitly to slow down its execution. The second command will be executing ts, which will be waiting for the first task to complete. One downside of tracking dependencies this way is that the second command is added to the running queue even though it cannot do anything until the first task is complete.

$ FIRST_TASKID=`ts bash -c "sleep 10; echo hi"`
$ ts sh -c "ts -w $FIRST_TASKID && echo there"
25
$ ts
ID   State      Output               E-Level  Times(r/u/s)     Command [run=2/2]
24   running    /tmp/ts-out.La9Gmz                             bash -c sleep 10; echo hi
25   running    /tmp/ts-out.Zr2n5u                             sh -c ts -w 24 && echo there
$ ts
ID   State      Output               E-Level  Times(r/u/s)     Command [run=0/2]
24   finished   /tmp/ts-out.La9Gmz   0        10.01/0.00/0.00  bash -c sleep 10; echo hi
25   finished   /tmp/ts-out.Zr2n5u   0        9.47/0.00/0.01   sh -c ts -w 24 && echo there
$ ts -c 24
hi
$ ts -c 25
there

Wrap-up

Task Spooler allows you to convert a shell command to a queued command by simply prepending ts to the command line. One major advantage of using ts over something like the at command is that you can effectively run tail -f on the output of a running task and also get at the output of completed tasks from the command line. The utility's ability to execute multiple tasks in parallel is very handy if you are running on a multicore CPU. Because you can explicitly wait for a task, you can set up very complex interactions where you might have several tasks running at once and have jobs that depend on multiple other tasks to complete successfully before they can execute.

Because you can make explicitly dependent tasks take up slots in the actively running task queue, you can effectively delay the execution of the queue until a time of your choosing. For example, if you queue up a task that waits for a specific time before returning successfully and have a small group of other tasks that are dependent on this first task to complete, then no tasks in the queue will execute until the first task completes.
Jun 23, 2018 | www.computerhope.com
at -m 01:35 < my-at-jobs.txt

Run the commands listed in the 'my-at-jobs.txt' file at 1:35 AM. All output from the job will be mailed to the user running the task. When this command has been successfully entered you should receive a prompt similar to the example below:

commands will be executed using /bin/sh
job 1 at Wed Dec 24 00:22:00 2014

at -l

This command will list each of the scheduled jobs in a format like the following:

1          Wed Dec 24 00:22:00 2003

...this is the same as running the command atq.

at -r 1

Deletes job 1. This command is the same as running the command atrm 1.

atrm 23

Deletes job 23. This command is the same as running the command at -r 23.
Jun 23, 2018 | stackoverflow.com
AL-Kateb ,Oct 23, 2013 at 13:33
I have a bash script that looks like this:

#!/bin/bash
wget LINK1 >/dev/null 2>&1
wget LINK2 >/dev/null 2>&1
wget LINK3 >/dev/null 2>&1
wget LINK4 >/dev/null 2>&1
# ..
# ..
wget LINK4000 >/dev/null 2>&1

But processing each line until the command is finished then moving to the next one is very time consuming, I want to process for instance 20 lines at once then when they're finished another 20 lines are processed.
I thought of

wget LINK1 >/dev/null 2>&1 &

to send the command to the background and carry on, but there are 4000 lines here; this means I will have performance issues, not to mention being limited in how many processes I should start at the same time, so this is not a good idea.

One solution that I'm thinking of right now is checking whether one of the commands is still running or not; for instance, after 20 lines I can add this loop:

while [ $(ps -ef | grep KEYWORD | grep -v grep | wc -l) -gt 0 ]; do
    sleep 1
done

Of course in this case I will need to append & to the end of the line! But I'm feeling this is not the right way to do it.
So how do I actually group each 20 lines together and wait for them to finish before going to the next 20 lines, this script is dynamically generated so I can do whatever math I want on it while it's being generated, but it DOES NOT have to use wget, it was just an example so any solution that is wget specific is not gonna do me any good.
kojiro ,Oct 23, 2013 at 13:46
wait
is the right answer here, but yourwhile [ $(ps
would be much better writtenwhile pkill -0 $KEYWORD
– using proctools that is, for legitimate reasons to check if a process with a specific name is still running. – kojiro Oct 23 '13 at 13:46VasyaNovikov ,Jan 11 at 19:01
I think this question should be re-opened. The "possible duplicate" QA is all about running a finite number of programs in parallel. Like 2-3 commands. This question, however, is focused on running commands in e.g. a loop. (see "but there are 4000 lines"). – VasyaNovikov Jan 11 at 19:01robinCTS ,Jan 11 at 23:08
@VasyaNovikov Have you read all the answers to both this question and the duplicate? Every single answer to this question here, can also be found in the answers to the duplicate question. That is precisely the definition of a duplicate question. It makes absolutely no difference whether or not you are running the commands in a loop. – robinCTS Jan 11 at 23:08VasyaNovikov ,Jan 12 at 4:09
@robinCTS there are intersections, but questions themselves are different. Also, 6 of the most popular answers on the linked QA deal with 2 processes only. – VasyaNovikov Jan 12 at 4:09Dan Nissenbaum ,Apr 20 at 15:35
I recommend reopening this question because its answer is clearer, cleaner, better, and much more highly upvoted than the answer at the linked question, though it is three years more recent. – Dan Nissenbaum Apr 20 at 15:35devnull ,Oct 23, 2013 at 13:35
Use the wait built-in:

process1 &
process2 &
process3 &
process4 &
wait
process5 &
process6 &
process7 &
process8 &
wait

For the above example, 4 processes process1 .. process4 would be started in the background, and the shell would wait until those are completed before starting the next set.

From the manual:

wait [jobspec or pid ...]
    Wait until the child process specified by each process ID pid or job specification jobspec exits and return the exit status of the last command waited for. If a job spec is given, all processes in the job are waited for. If no arguments are given, all currently active child processes are waited for, and the return status is zero. If neither jobspec nor pid specifies an active child process of the shell, the return status is 127.
kojiro ,Oct 23, 2013 at 13:48
So basically:

i=0; waitevery=4; for link in "${links[@]}"; do wget "$link" & (( i++%waitevery==0 )) && wait; done >/dev/null 2>&1
– kojiro Oct 23 '13 at 13:48rsaw ,Jul 18, 2014 at 17:26
Unless you're sure that each process will finish at the exact same time, this is a bad idea. You need to start up new jobs to keep the current total jobs at a certain cap .... parallel is the answer. – rsaw Jul 18 '14 at 17:26DomainsFeatured ,Sep 13, 2016 at 22:55
Is there a way to do this in a loop? – DomainsFeatured Sep 13 '16 at 22:55Bobby ,Apr 27, 2017 at 7:55
I've tried this but it seems that variable assignments done in one block are not available in the next block. Is this because they are separate processes? Is there a way to communicate the variables back to the main process? – Bobby Apr 27 '17 at 7:55choroba ,Oct 23, 2013 at 13:38
See parallel. Its syntax is similar to xargs, but it runs the commands in parallel.
chepner ,Oct 23, 2013 at 14:35
This is better than using wait, since it takes care of starting new jobs as old ones complete, instead of waiting for an entire batch to finish before starting the next. – chepner Oct 23 '13 at 14:35
Mr. Llama ,Aug 13, 2015 at 19:30
For example, if you have the list of links in a file, you can do cat list_of_links.txt | parallel -j 4 wget {} which will keep four wgets running at a time. – Mr. Llama Aug 13 '15 at 19:30
0x004D44 ,Nov 2, 2015 at 21:42
There is a new kid in town called pexec which is a replacement for parallel. – 0x004D44 Nov 2 '15 at 21:42
mat ,Mar 1, 2016 at 21:04
Not to be picky, but xargs can also parallelize commands. – mat Mar 1 '16 at 21:04Vader B ,Jun 27, 2016 at 6:41
In fact, xargs can run commands in parallel for you. There is a special -P max_procs command-line option for that. See man xargs.

You can run 20 processes and use the command:

wait

Your script will wait and continue when all your background jobs are finished.
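A minimal sketch of that batching pattern applied to the original wget list (the links.txt file name is an assumption):
#!/bin/bash
# Run the downloads 20 at a time: launch a batch in the background,
# then wait for the whole batch before starting the next one.
batch_size=20
count=0
while read -r url; do
    wget "$url" >/dev/null 2>&1 &
    count=$((count + 1))
    if [ $((count % batch_size)) -eq 0 ]; then
        wait    # block until every job in this batch has finished
    fi
done < links.txt
wait            # pick up the final, possibly partial batch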
Jun 23, 2018 | unix.stackexchange.com
Yan Zhu ,Apr 19, 2015 at 6:59
I am using xargs to call a python script to process about 30 million small files. I hope to use xargs to parallelize the process. The command I am using is:

find ./data -name "*.json" -print0 | xargs -0 -I{} -P 40 python Convert.py {} > log.txt

Basically, Convert.py will read in a small json file (4kb), do some processing and write to another 4kb file. I am running on a server with 40 CPU cores. And no other CPU-intense process is running on this server.

By monitoring htop (btw, is there any other good way to monitor the CPU performance?), I find that -P 40 is not as fast as expected. Sometimes all cores will freeze and decrease almost to zero for 3-4 seconds, then will recover to 60-70%. Then I try to decrease the number of parallel processes to -P 20-30, but it's still not very fast. The ideal behavior should be linear speed-up. Any suggestions for the parallel usage of xargs?
You are most likely hit by I/O: The system cannot read the files fast enough. Try starting more than 40: This way it will be fine if some of the processes have to wait for I/O. – Ole Tange Apr 19 '15 at 8:45Fox ,Apr 19, 2015 at 10:30
What kind of processing does the script do? Any database/network/io involved? How long does it run? – Fox Apr 19 '15 at 10:30PSkocik ,Apr 19, 2015 at 11:41
I second @OleTange. That is the expected behavior if you run as many processes as you have cores and your tasks are IO bound. First the cores will wait on IO for their task (sleep), then they will process, and then repeat. If you add more processes, then the additional processes that currently aren't running on a physical core will have kicked off parallel IO operations, which will, when finished, eliminate or at least reduce the sleep periods on your cores. – PSkocik Apr 19 '15 at 11:41Bichoy ,Apr 20, 2015 at 3:32
1- Do you have hyperthreading enabled? 2- in what you have up there, log.txt is actually overwritten with each call to convert.py ... not sure if this is the intended behavior or not. – Bichoy Apr 20 '15 at 3:32Ole Tange ,May 11, 2015 at 18:38
xargs -P
and>
is opening up for race conditions because of the half-line problem gnu.org/software/parallel/ Using GNU Parallel instead will not have that problem. – Ole Tange May 11 '15 at 18:38James Scriven ,Apr 24, 2015 at 18:00
I'd be willing to bet that your problem is python. You didn't say what kind of processing is being done on each file, but assuming you are just doing in-memory processing of the data, the running time will be dominated by starting up 30 million python virtual machines (interpreters).

If you can restructure your python program to take a list of files, instead of just one, you will get a huge improvement in performance. You can then still use xargs to further improve performance. For example, 40 processes, each processing 1000 files:

find ./data -name "*.json" -print0 | xargs -0 -L1000 -P 40 python Convert.py

This isn't to say that python is a bad/slow language; it's just not optimized for startup time. You'll see this with any virtual machine-based or interpreted language. Java, for example, would be even worse. If your program was written in C, there would still be a cost of starting a separate operating system process to handle each file, but it would be much less.
From there you can fiddle with -P to see if you can squeeze out a bit more speed, perhaps by increasing the number of processes to take advantage of idle processors while data is being read/written.
Stephen ,Apr 24, 2015 at 13:03
So firstly, consider the constraints: what is the constraint on each job? If it's I/O you can probably get away with multiple jobs per CPU core up till you hit the limit of I/O, but if it's CPU intensive, it's going to be worse than pointless running more jobs concurrently than you have CPU cores.
My understanding of these things is that GNU Parallel would give you better control over the queue of jobs etc.
See GNU parallel vs & (I mean background) vs xargs -P for a more detailed explanation of how the two differ.
As others said, check whether you're I/O-bound. Also, xargs' man page suggests using -n with -P; you don't mention the number of Convert.py processes you see running in parallel.

As a suggestion, if you're I/O-bound, you might try using an SSD block device, or try doing the processing in a tmpfs (of course, in this case you should check for enough memory, avoiding swap due to tmpfs pressure (I think), and the overhead of copying the data to it in the first place).
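For instance, a variant of the earlier pipeline that follows that advice, passing up to 1000 file names per invocation (the numbers are illustrative, not tuned):
find ./data -name "*.json" -print0 | xargs -0 -n 1000 -P 40 python Convert.py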
Jun 23, 2018 | superuser.com
Andrei ,Apr 10, 2013 at 14:26
I want the ability to schedule commands to be run in a FIFO queue. I DON'T want them to be run at a specified time in the future as would be the case with the "at" command. I want them to start running now, but not simultaneously. The next scheduled command in the queue should be run only after the first command finishes executing. Alternatively, it would be nice if I could specify a maximum number of commands from the queue that could be run simultaneously; for example if the maximum number of simultaneous commands is 2, then only at most 2 commands scheduled in the queue would be taken from the queue in a FIFO manner to be executed, the next command in the remaining queue being started only when one of the currently 2 running commands finishes.I've heard task-spooler could do something like this, but this package doesn't appear to be well supported/tested and is not in the Ubuntu standard repositories (Ubuntu being what I'm using). If that's the best alternative then let me know and I'll use task-spooler, otherwise, I'm interested to find out what's the best, easiest, most tested, bug-free, canonical way to do such a thing with bash.
UPDATE:
Simple solutions like ; or && from bash do not work. I need to schedule these commands from an external program, when an event occurs. I just don't want to have hundreds of instances of my command running simultaneously, hence the need for a queue. There's an external program that will trigger events where I can run my own commands. I want to handle ALL triggered events, I don't want to miss any event, but I also don't want my system to crash, so that's why I want a queue to handle my commands triggered from the external program.
Andrei ,Apr 11, 2013 at 11:40
Task Spooler:http://vicerveza.homeunix.net/~viric/soft/ts/
https://launchpad.net/ubuntu/+source/task-spooler/0.7.3-1
Does the trick very well. Hopefully it will be included in Ubuntu's package repos.
Hennes ,Apr 10, 2013 at 15:00
Use ;

For example:

ls ; touch test ; ls

That will list the directory. Only after ls has run will it run touch test, which will create a file named test. And only after that has finished will it run the next command. (In this case another ls, which will show the old contents and the newly created file).

Similar commands are || and &&.

; will always run the next command.

&& will only run the next command if the first returned success.
Example: rm -rf *.mp3 && echo "Success! All MP3s deleted!"

|| will only run the next command if the first command returned a failure (non-zero) return value.
Example: rm -rf *.mp3 || echo "Error! Some files could not be deleted! Check permissions!"

If you want to run a command in the background, append an ampersand (&).
Example:

make bzimage &
mp3blaster sound.mp3
make mytestsoftware ; ls ; firefox ; make clean

This will run two commands in the background (in this case a kernel build which will take some time and a program to play some music). And in the foreground it runs another compile job and, once that is finished, ls, firefox and a make clean (all sequentially).

For more details, see man bash.

[Edit after comment]

In pseudo code, something like this?

Program run_queue:
    While(true) {
        Wait_for_a_signal();
        While( queue not empty ) {
            run next command from the queue.
            remove this command from the queue.
            // If commands were added to the queue during execution then
            // the queue is not empty, keep processing them all.
        }
        // Queue is now empty, returning to wait_for_a_signal
    }

//
// Wait forever on commands and add them to a queue
// Signal run_queue when something gets added.
//
program add_to_queue() {
    While(true) {
        Wait_for_event();
        Append command to queue
        signal run_queue
    }
}
The easiest way would be to simply run the commands sequentially:

cmd1; cmd2; cmd3; cmdN

If you want the next command to run only if the previous command exited successfully, use &&:

cmd1 && cmd2 && cmd3 && cmdN

That is the only bash native way I know of doing what you want. If you need job control (setting a number of parallel jobs etc), you could try installing a queue manager such as TORQUE but that seems like overkill if all you want to do is launch jobs sequentially.
psusi ,Apr 10, 2013 at 15:24
You are looking for at's twin brother: batch. It uses the same daemon but instead of scheduling a specific time, the jobs are queued and will be run whenever the system load average is low.
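A quick sketch of that suggestion; each queued command starts only when the system load average drops below the daemon's threshold (the commands themselves are placeholders):
# Queue two heavy jobs for batch; they run when the machine is not busy
echo "ffmpeg -i input.avi output.mp4" | batch
echo "tar czf /tmp/home-backup.tar.gz /home/joeuser" | batch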
Apart from dedicated queuing systems (like the Sun Grid Engine) which you can also use locally on one machine and which offer dozens of possibilities, you can use something like

command1 && command2 && command3

which is the other extreme -- a very simple approach. The latter neither provides multiple simultaneous processes nor gradual filling of the "queue".
Bogdan Dumitru ,May 3, 2016 at 10:12
I went on the same route searching, trying out task-spooler and so on. The best of the best is this:

GNU Parallel --semaphore --fg

It also has -j for parallel jobs.
Jun 23, 2018 | vicerveza.homeunix.net
As in freshmeat.net :
task spooler is a Unix batch system where the tasks spooled run one after the other. The amount of jobs to run at once can be set at any time. Each user in each system has his own job queue. The tasks are run in the correct context (that of enqueue) from any shell/process, and its output/results can be easily watched. It is very useful when you know that your commands depend on a lot of RAM, a lot of disk use, give a lot of output, or for whatever reason it's better not to run them all at the same time, while you want to keep your resources busy for maximum benfit. Its interface allows using it easily in scripts.For your first contact, you can read an article at linux.com , which I like as overview, guide and examples (original url) . On more advanced usage, don't neglect the TRICKS file in the package.
Features
I wrote Task Spooler because I didn't have any comfortable way of running batch jobs in my linux computer. I wanted to:
- Queue jobs from different terminals.
- Use it locally in my machine (not as in network queues).
- Have a good way of seeing the output of the processes (tail, errorlevels, ...).
- Easy use: almost no configuration.
- Easy to use in scripts.
At the end, after some time using and developing ts , it can do something more:
- It works in most systems I use and some others, like GNU/Linux, Darwin, Cygwin, and FreeBSD.
- No configuration at all for a simple queue.
- Good integration with renice, kill, etc. (through `ts -p` and process groups).
- Have any amount of queues identified by name, writing a simple wrapper script for each (I use ts2, tsio, tsprint, etc).
- Control how many jobs may run at once in any queue (taking profit of multicores).
- It never removes the result files, so they can be reached even after we've lost the ts task list.
- Transparent if used as a subprogram with -nf .
- Optional separation of stdout and stderr.
You can look at an old (but representative) screenshot of ts-0.2.1 if you want.
Mailing list
I created a GoogleGroup for the program. You can look for the archive and the join methods in the taskspooler google group page .
Alessandro Öhler once maintained a mailing list for discussing newer functionalities and interchanging use experiences. I think this doesn't work anymore , but you can look at the old archive or even try to subscribe .
How it works
The queue is maintained by a server process. This server process is started if it isn't there already. The communication goes through a unix socket usually in /tmp/ .
When the user requests a job (using a ts client), the client waits for the server message to know when it can start. When the server allows starting, this client usually forks, and runs the command with the proper environment, because it is the client that runs the job, not the server (unlike in 'at' or 'cron'). So the ulimits, environment, pwd, etc. apply.
When the job finishes, the client notifies the server. At this time, the server may notify any waiting client, and stores the output and the errorlevel of the finished job.
Moreover the client can take advantage of a lot of information from the server: when a job finishes, where the job output goes, etc.
Download
Download the latest version (GPLv2+ licensed): ts-1.0.tar.gz - v1.0 (2016-10-19) - Changelog
Look at the version repository if you are interested in its development.
Андрей Пантюхин (Andrew Pantyukhin) maintains the BSD port .
Alessandro Öhler provided a Gentoo ebuild for 0.4 , which with simple changes I updated to the ebuild for 0.6.4 . Moreover, the Gentoo Project Sunrise already has an ebuild ( maybe old ) for ts .
Alexander V. Inyukhin maintains unofficial debian packages for several platforms. Find the official packages in the debian package system .
Pascal Bleser packed the program for SuSE and openSuSE in RPMs for various platforms .
Gnomeye maintains the AUR package .
Eric Keller wrote a nodejs web server showing the status of the task spooler queue ( github project ).
Manual
Look at its manpage (v0.6.1). Here you also have a copy of the help for the same version:

usage: ./ts [action] [-ngfmd] [-L <lab>] [cmd...]
Env vars:
  TS_SOCKET       the path to the unix socket used by the ts command.
  TS_MAILTO       where to mail the result (on -m). Local user by default.
  TS_MAXFINISHED  maximum finished jobs in the queue.
  TS_ONFINISH     binary called on job end (passes jobid, error, outfile, command).
  TS_ENV          command called on enqueue. Its output determines the job information.
  TS_SAVELIST     filename which will store the list, if the server dies.
  TS_SLOTS        amount of jobs which can run at once, read on server start.
Actions:
  -K        kill the task spooler server
  -C        clear the list of finished jobs
  -l        show the job list (default action)
  -S [num]  set the number of max simultanious jobs of the server.
  -t [id]   tail -f the output of the job. Last run if not specified.
  -c [id]   cat the output of the job. Last run if not specified.
  -p [id]   show the pid of the job. Last run if not specified.
  -o [id]   show the output file. Of last job run, if not specified.
  -i [id]   show job information. Of last job run, if not specified.
  -s [id]   show the job state. Of the last added, if not specified.
  -r [id]   remove a job. The last added, if not specified.
  -w [id]   wait for a job. The last added, if not specified.
  -u [id]   put that job first. The last added, if not specified.
  -U <id-id>  swap two jobs in the queue.
  -h        show this help
  -V        show the program version
Options adding jobs:
  -n        don't store the output of the command.
  -g        gzip the stored output (if not -n).
  -f        don't fork into background.
  -m        send the output by e-mail (uses sendmail).
  -d        the job will be run only if the job before ends well
  -L <lab>  name this task with a label, to be distinguished on listing.

Thanks
- To Raúl Salinas, for his inspiring ideas
- To Alessandro Öhler, the first non-acquaintance user, who proposed and created the mailing list.
- Андрею Пантюхину, who created the BSD port .
- To the useful, although sometimes uncomfortable, UNIX interface.
- To Alexander V. Inyukhin, for the debian packages.
- To Pascal Bleser, for the SuSE packages.
- To Sergio Ballestrero, who sent code and motivated the development of a multislot version of ts.
- To GNU, an ugly but working and helpful ol' UNIX implementation.
IBM Developerworks
Sometimes you may need to run a job just once, rather than regularly. For this you use the at command. The commands to be run are read from a file specified with the -f option, or from stdin if -f is not used. The -m option sends mail to the user even if there is no stdout from the command. The -v option displays the time at which the job will run before reading the job. The time is also displayed in the output.
Listing 5 shows an example of running the mycrontest.sh script that you used earlier. Listing 6 shows the output that is mailed back to the user after the job runs. Notice that it is somewhat more compact than the corresponding output from the cron job.
[ian@lyrebird ~]$ at -f mycrontest.sh -v 10:25
Sat Jul  7 10:25:00 2007
job 5 at Sat Jul  7 10:25:00 2007

From [email protected]  Sat Jul  7 10:25:00 2007
Date: Sat, 7 Jul 2007 10:25:00 -0400
From: Ian Shields <[email protected]>
Subject: Output from your job        5
To: [email protected]

It is now 10:25:00 on Saturday

Time specifications can be quite complex. Listing 7 shows a few examples. See the man page for at or the file /usr/share/doc/at/timespec or a file such as /usr/share/doc/at-3.1.10/timespec, where 3.1.10 in this example is the version of the at package.

Listing 7. Time values with the at command

[ian@lyrebird ~]$ at -f mycrontest.sh 10pm tomorrow
job 14 at Sun Jul  8 22:00:00 2007
[ian@lyrebird ~]$ at -f mycrontest.sh 2:00 tuesday
job 15 at Tue Jul 10 02:00:00 2007
[ian@lyrebird ~]$ at -f mycrontest.sh 2:00 july 11
job 16 at Wed Jul 11 02:00:00 2007
[ian@lyrebird ~]$ at -f mycrontest.sh 2:00 next week
job 17 at Sat Jul 14 02:00:00 2007

Being nice
The nice value for a job is a measure of how nice it is to other users. See our tutorial LPI exam 101 prep: GNU and UNIX commands for information on the nice and renice commands.
The at command also has a -q option. Increasing the queue increases the nice value for the job. There is also a batch command, which is similar to the at command except that jobs are run only when the system load is low enough. See the man pages for more details on these features.
Solaris provides an XPG4 version of at and batch in /usr/xpg4/bin. The syntax is:

/usr/xpg4/bin/at [-c| -k| -s] [-m] [-f file] [-p project] [-q queuename] -t time
/usr/xpg4/bin/at [-c| -k| -s] [-m] [-f file] [-p project] [-q queuename] timespec…
/usr/xpg4/bin/at -l [-p project] [-q queuename] [at_job_id...]
/usr/xpg4/bin/at -r at_job_id...
/usr/xpg4/bin/batch [-p project]
The two "twins" at and batch are very similar, with batch essentially being an alias for at with "now" supplied as the time of execution.
Commands of the forms:
/usr/bin/batch [-p project]
/usr/xpg4/bin/batch [-p project]
are respectively equivalent to:
/usr/bin/at -q b [-p project] now
/usr/xpg4/bin/at -q b -m [-p project] now
At the same time at is quite a different animal than cron: at preserves the environment in which it was invoked, while cron does not (it executes commands in its own "cron" environment, and you should not expect that PATH and other variables will be preserved).
The at utility is pipable: it can read commands from standard input and submit a job to be executed immediately (as in the example below) or at a later time.
echo "perl myjob" | at now
The at-job is executed in a separate invocation of the shell, running in a separate process group with no controlling terminal, except that the environment variables, current working directory, file creation mask (see umask(1)), and system resource limits (for sh and ksh only, see ulimit(1)) in effect when the at utility is executed are retained and used when the at-job is executed.
When the at-job is submitted, the at_job_id and scheduled time are written to standard error. The at_job_id is an identifier that is a string consisting solely of alphanumeric characters and the period character. The at_job_id is assigned by the system when the job is scheduled such that it uniquely identifies a particular job.
User notification and the processing of the job's standard output and standard error are described under the -m option.
As with cron, two files that list users one per line, similar to the cron control files, control the behavior of the command: /usr/lib/cron/at.allow and /usr/lib/cron/at.deny (on Solaris).
Rules
If /usr/lib/cron/at.allow exists, only users listed in it may use at. If that file does not exist, the file /usr/lib/cron/at.deny is checked to determine if the user should be denied access to at.
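For example, an at.allow restricted to a few accounts might look like this (the user names are placeholders):
# /usr/lib/cron/at.allow -- one user name per line; everyone else is refused
root
oracle
joeuser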
A denied user shows up in the cron log with a message like:
! bad user (webservd) Fri Apr 21 14:47:49 2006
That can also happen with human accounts if password aging is turned on: apparently if the password expires, cron jobs do not run.
The batch utility reads commands to be executed one after another and is equivalent to at now. The difference is that the queue used is a special at queue, that exists specifically for batch jobs. Execution of submitted jobs can be delayed by limits on the number of jobs allowed to run concurrently. See queuedefs(4).
If the -c, -k, or -s options are not specified, the SHELL environment variable by default determines which shell to use.
For /usr/xpg4/bin/at and /usr/xpg4/bin/batch, if SHELL is unset or NULL, /usr/xpg4/bin/sh is used.
For usr/bin/at and /usr/bin/batch, if SHELL is unset or NULL, /bin/sh is used.
The following options are supported (see the man page for the full list; the note below concerns -m):
If -m is not used, the job's standard output and standard error are provided to the user by means of mail, unless they are redirected elsewhere; if there is no such output to provide, the user is not notified of the job's completion.
The following operands are supported:
In the "C" locale, the following describes the three parts of the time specification string. All of the values from the LC_TIME categories in the "C" locale are recognized in a case-insensitive manner.
If no date is given, today is assumed if the given
time is greater than the current time, and tomorrow is assumed
if it is less. If the given month is less than the current month (and no
year is given), next year is assumed.
For example, the following two commands are equivalent:
at 2pm + 1 week
at 2pm next week
USAGE
The format of the at command line shown here is guaranteed only for the "C" locale. Other locales are not supported for midnight, noon, now, mon, abmon, day, abday, today, tomorrow, minutes, hours, days, weeks, months, years, and next.
Since the commands run in a separate shell invocation, running in a separate process group with no controlling terminal, open file descriptors, traps and priority inherited from the invoking environment are lost.