May the source be with you, but remember the KISS principle ;-)

Classic Unix Utilities


There are many people who use UNIX or Linux but who IMHO do not understand UNIX. UNIX is not just an operating system, it is a way of doing things, and the shell plays a key role by providing the glue that makes it work. The UNIX methodology relies heavily on reuse of a set of tools rather than on building monolithic applications. Even perl programmers often miss the point, writing the heart and soul of the application as perl script without making use of the UNIX toolkit.

David Korn (bold italic is mine -- BNN)

IMHO there are three Unix tools that can spell the difference between a really good programmer or sysadmin and a merely above-average one (solid knowledge of shell and Perl is necessary but not sufficient):

These tools can also be used as a fine test in interviews for advanced Unix-related positions if you have several similar candidates. Other things being equal, knowledge of them definitely demonstrates a level of Unix culture superior to the average "command line junkie" level ;-)

An overview of books about GNU/open source tools can be found in the Unix tools bibliography. There are not many good books on the subject; still, even an average book can give you insight into the usage of a tool that you might never get from daily practice.

Please note that Unix is a pretty complex system and some aspects of it are non-obvious even for those who have more than ten years of experience.

Dr. Nikolai Bezroukov



Old News ;-)

[Jul 05, 2018] Can rsync resume after being interrupted

Notable quotes:
"... as if it were successfully transferred ..."
Jul 05, 2018 |

Tim ,Sep 15, 2012 at 23:36

I used rsync to copy a large number of files, but my OS (Ubuntu) restarted unexpectedly.

After reboot, I ran rsync again, but from the output on the terminal, I found that rsync still copied those already copied before. But I heard that rsync is able to find differences between source and destination, and therefore to just copy the differences. So I wonder in my case if rsync can resume what was left last time?

Gilles ,Sep 16, 2012 at 1:56

Yes, rsync won't copy again files that it's already copied. There are a few edge cases where its detection can fail. Did it copy all the already-copied files? What options did you use? What were the source and target filesystems? If you run rsync again after it's copied everything, does it copy again? – Gilles Sep 16 '12 at 1:56

Tim ,Sep 16, 2012 at 2:30

@Gilles: Thanks! (1) I think I saw rsync copied the same files again from its output on the terminal. (2) Options are same as in my other post, i.e. sudo rsync -azvv /home/path/folder1/ /home/path/folder2 . (3) Source and target are both NTFS, but source is an external HDD, and target is an internal HDD. (4) It is now running and hasn't finished yet. – Tim Sep 16 '12 at 2:30

jwbensley ,Sep 16, 2012 at 16:15

There is also the --partial flag to resume partially transferred files (useful for large files) – jwbensley Sep 16 '12 at 16:15

Tim ,Sep 19, 2012 at 5:20

@Gilles: What are some "edge cases where its detection can fail"? – Tim Sep 19 '12 at 5:20

Gilles ,Sep 19, 2012 at 9:25

@Tim Off the top of my head, there's at least clock skew, and differences in time resolution (a common issue with FAT filesystems which store times in 2-second increments, the --modify-window option helps with that). – Gilles Sep 19 '12 at 9:25

DanielSmedegaardBuus ,Nov 1, 2014 at 12:32

First of all, regarding the "resume" part of your question, --partial just tells the receiving end to keep partially transferred files if the sending end disappears as though they were completely transferred.

While transferring files, they are temporarily saved as hidden files in their target folders (e.g. .TheFileYouAreSending.lRWzDC ), or a specifically chosen folder if you set the --partial-dir switch. When a transfer fails and --partial is not set, this hidden file will remain in the target folder under this cryptic name, but if --partial is set, the file will be renamed to the actual target file name (in this case, TheFileYouAreSending ), even though the file isn't complete. The point is that you can later complete the transfer by running rsync again with either --append or --append-verify .

So, --partial doesn't itself resume a failed or cancelled transfer. To resume it, you'll have to use one of the aforementioned flags on the next run. So, if you need to make sure that the target won't ever contain files that appear to be fine but are actually incomplete, you shouldn't use --partial . Conversely, if you want to make sure you never leave behind stray failed files that are hidden in the target directory, and you know you'll be able to complete the transfer later, --partial is there to help you.

With regards to the --append switch mentioned above, this is the actual "resume" switch, and you can use it whether or not you're also using --partial . Actually, when you're using --append , no temporary files are ever created. Files are written directly to their targets. In this respect, --append gives the same result as --partial on a failed transfer, but without creating those hidden temporary files.

So, to sum up, if you're moving large files and you want the option to resume a cancelled or failed rsync operation from the exact point that rsync stopped, you need to use the --append or --append-verify switch on the next attempt.

As @Alex points out below, since version 3.0.0 rsync now has a new option, --append-verify , which behaves like --append did before that switch existed. You probably always want the behaviour of --append-verify , so check your version with rsync --version . If you're on a Mac and not using rsync from homebrew , you'll (at least up to and including El Capitan) have an older version and need to use --append rather than --append-verify . Why they didn't keep the behaviour on --append and instead named the newcomer --append-no-verify is a bit puzzling. Either way, --append on rsync before version 3 is the same as --append-verify on the newer versions.

--append-verify isn't dangerous: It will always read and compare the data on both ends and not just assume they're equal. It does this using checksums, so it's easy on the network, but it does require reading the shared amount of data on both ends of the wire before it can actually resume the transfer by appending to the target.
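As a rough illustration of the resume step described above, the sketch below fakes an interrupted copy on the local machine and then completes it with --append-verify. The function name demo_resume, the file name big.bin and the byte counts are made up for the example, and it assumes rsync 3.0.0 or later is installed.

```shell
# Sketch: simulate a half-finished copy, then resume it with --append-verify.
demo_resume() {
    src=$(mktemp -d) && dst=$(mktemp -d) || return 1
    # Hypothetical payload standing in for a large file.
    head -c 200000 /dev/zero > "$src/big.bin"
    # Pretend the first run died halfway: the target holds only the first half.
    head -c 100000 "$src/big.bin" > "$dst/big.bin"
    # Second run: verify the bytes already present, then append the remainder.
    rsync -a --append-verify "$src/" "$dst/"
    cmp "$src/big.bin" "$dst/big.bin" && echo "resumed copy matches source"
    status=$?
    rm -rf "$src" "$dst"
    return "$status"
}

# Only run the demo where rsync is available.
if command -v rsync >/dev/null 2>&1; then
    demo_resume
fi
```

For a real transfer, the first run would typically use --partial so that the truncated file is kept under its final name, and the later run would add --append-verify.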

Second of all, you said that you "heard that rsync is able to find differences between source and destination, and therefore to just copy the differences."

That's correct, and it's called delta transfer, but it's a different thing. To enable this, you add the -c , or --checksum switch. Once this switch is used, rsync will examine files that exist on both ends of the wire. It does this in chunks, compares the checksums on both ends, and if they differ, it transfers just the differing parts of the file. But, as @Jonathan points out below, the comparison is only done when files are of the same size on both ends -- different sizes will cause rsync to upload the entire file, overwriting the target with the same name.

This requires a bit of computation on both ends initially, but can be extremely efficient at reducing network load if, for example, you're frequently backing up very large fixed-size files that often contain minor changes. Examples that come to mind are virtual hard drive image files used in virtual machines or iSCSI targets.

It is notable that if you use --checksum to transfer a batch of files that are completely new to the target system, rsync will still calculate their checksums on the source system before transferring them. Why I do not know :)

So, in short:

If you're often using rsync to just "move stuff from A to B" and want the option to cancel that operation and later resume it, don't use --checksum , but do use --append-verify .

If you're using rsync to back up stuff often, using --append-verify probably won't do much for you, unless you're in the habit of sending large files that continuously grow in size but are rarely modified once written. As a bonus tip, if you're backing up to storage that supports snapshotting such as btrfs or zfs , adding the --inplace switch will help you reduce snapshot sizes since changed files aren't recreated but rather the changed blocks are written directly over the old ones. This switch is also useful if you want to avoid rsync creating copies of files on the target when only minor changes have occurred.

When using --append-verify , rsync will behave just like it always does on all files that are the same size. If they differ in modification or other timestamps, it will overwrite the target with the source without scrutinizing those files further. --checksum will compare the contents (checksums) of every file pair of identical name and size.

UPDATED 2015-09-01 Changed to reflect points made by @Alex (thanks!)

UPDATED 2017-07-14 Changed to reflect points made by @Jonathan (thanks!)

Alex ,Aug 28, 2015 at 3:49

According to the documentation --append does not check the data, but --append-verify does. Also, as @gaoithe points out in a comment below, the documentation claims --partial does resume from previous files. – Alex Aug 28 '15 at 3:49

DanielSmedegaardBuus ,Sep 1, 2015 at 13:29

Thank you @Alex for the updates. Indeed, since 3.0.0, --append no longer compares the source to the target file before appending. Quite important, really! --partial does not itself resume a failed file transfer, but rather leaves it there for a subsequent --append(-verify) to append to it. My answer was clearly misrepresenting this fact; I'll update it to include these points! Thanks a lot :) – DanielSmedegaardBuus Sep 1 '15 at 13:29

Cees Timmerman ,Sep 15, 2015 at 17:21

This says --partial is enough. – Cees Timmerman Sep 15 '15 at 17:21

DanielSmedegaardBuus ,May 10, 2016 at 19:31

@CMCDragonkai Actually, check out Alexander's answer below about --partial-dir -- looks like it's the perfect bullet for this. I may have missed something entirely ;) – DanielSmedegaardBuus May 10 '16 at 19:31

Jonathan Y. ,Jun 14, 2017 at 5:48

What's your level of confidence in the described behavior of --checksum ? According to the man it has more to do with deciding which files to flag for transfer than with delta-transfer (which, presumably, is rsync 's default behavior). – Jonathan Y. Jun 14 '17 at 5:48

Alexander O'Mara ,Jan 3, 2016 at 6:34


TL;DR: Just specify a partial directory with --partial-dir, as the rsync man page recommends.


Longer explanation:

There is actually a built-in feature for doing this using the --partial-dir option, which has several advantages over the --partial and --append-verify / --append alternative.

Excerpt from the rsync man pages:
      A  better way to keep partial files than the --partial option is
      to specify a DIR that will be used  to  hold  the  partial  data
      (instead  of  writing  it  out to the destination file).  On the
      next transfer, rsync will use a file found in this dir  as  data
      to  speed  up  the resumption of the transfer and then delete it
      after it has served its purpose.

      Note that if --whole-file is specified (or  implied),  any  par-
      tial-dir  file  that  is  found for a file that is being updated
      will simply be removed (since rsync  is  sending  files  without
      using rsync's delta-transfer algorithm).

      Rsync will create the DIR if it is missing (just the last dir --
      not the whole path).  This makes it easy to use a relative  path
      (such  as  "--partial-dir=.rsync-partial")  to have rsync create
      the partial-directory in the destination file's  directory  when
      needed,  and  then  remove  it  again  when  the partial file is
      deleted.

      If the partial-dir value is not an absolute path, rsync will add
      an  exclude rule at the end of all your existing excludes.  This
      will prevent the sending of any partial-dir files that may exist
      on the sending side, and will also prevent the untimely deletion
      of partial-dir items on the receiving  side.   An  example:  the
      above  --partial-dir  option would add the equivalent of "-f '-p
      .rsync-partial/'" at the end of any other filter rules.

By default, rsync uses a random temporary file name which gets deleted when a transfer fails. As mentioned, using --partial you can make rsync keep the incomplete file as if it were successfully transferred , so that it is possible to later append to it using the --append-verify / --append options. However there are several reasons this is sub-optimal.

  1. Your backup files may not be complete, and without checking the remote file which must still be unaltered, there's no way to know.
  2. If you are attempting to use --backup and --backup-dir , you've just added a new version of this file that never even existed before to your version history.

However if we use --partial-dir , rsync will preserve the temporary partial file, and resume downloading using that partial file next time you run it, and we do not suffer from the above issues.
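A minimal sketch of the --partial-dir approach, using the relative directory name .rsync-partial from the man page excerpt above. The wrapper name copy_resumable and the demo file are hypothetical. Note that the demo copies between local directories, where rsync implies --whole-file, so it only shows the invocation; the resume benefit applies to remote transfers.

```shell
# copy_resumable SRC DEST
# An interrupted file is parked in DEST's .rsync-partial/ directory instead of
# appearing under its real name; re-running the same command picks it up again.
copy_resumable() {
    rsync -a --partial-dir=.rsync-partial "$1" "$2"
}

if command -v rsync >/dev/null 2>&1; then
    src=$(mktemp -d); dst=$(mktemp -d)
    head -c 100000 /dev/zero > "$src/big.bin"    # hypothetical payload
    copy_resumable "$src/" "$dst/"
    cmp "$src/big.bin" "$dst/big.bin" && echo "copy complete"
    rm -rf "$src" "$dst"
fi
```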

trs ,Apr 7, 2017 at 0:00

This is really the answer. Hey everyone, LOOK HERE!! – trs Apr 7 '17 at 0:00

JKOlaf ,Jun 28, 2017 at 0:11

I agree this is a much more concise answer to the question. the TL;DR: is perfect and for those that need more can read the longer bit. Strong work. – JKOlaf Jun 28 '17 at 0:11

N2O ,Jul 29, 2014 at 18:24

You may want to add the -P option to your command.

From the man page:

--partial By default, rsync will delete any partially transferred file if the transfer
         is interrupted. In some circumstances it is more desirable to keep partially
         transferred files. Using the --partial option tells rsync to keep the partial
         file which should make a subsequent transfer of the rest of the file much faster.

  -P     The -P option is equivalent to --partial --progress.   Its  pur-
         pose  is to make it much easier to specify these two options for
         a long transfer that may be interrupted.

So instead of:

sudo rsync -azvv /home/path/folder1/ /home/path/folder2

Do:

sudo rsync -azvvP /home/path/folder1/ /home/path/folder2

Of course, if you don't want the progress updates, you can just use --partial , i.e.:

sudo rsync --partial -azvv /home/path/folder1/ /home/path/folder2

gaoithe ,Aug 19, 2015 at 11:29

@Flimm not quite correct. If there is an interruption (network or receiving side) then when using --partial the partial file is kept AND it is used when rsync is resumed. From the manpage: "Using the --partial option tells rsync to keep the partial file which should <b>make a subsequent transfer of the rest of the file much faster</b>." – gaoithe Aug 19 '15 at 11:29

DanielSmedegaardBuus ,Sep 1, 2015 at 14:11

@Flimm and @gaoithe, my answer wasn't quite accurate, and definitely not up-to-date. I've updated it to reflect version 3 + of rsync . It's important to stress, though, that --partial does not itself resume a failed transfer. See my answer for details :) – DanielSmedegaardBuus Sep 1 '15 at 14:11

guettli ,Nov 18, 2015 at 12:28

@DanielSmedegaardBuus I tried it and the -P is enough in my case. Versions: client has 3.1.0 and server has 3.1.1. I interrupted the transfer of a single large file with ctrl-c. I guess I am missing something. – guettli Nov 18 '15 at 12:28

Yadunandana ,Sep 16, 2012 at 16:07

I think you are forcibly calling the rsync and hence all data is getting downloaded when you recall it again. use --progress option to copy only those files which are not copied and --delete option to delete any files if already copied and now it does not exist in source folder...
rsync -avz --progress --delete /home/path/folder1/ /home/path/folder2

If you are using ssh to login to other system and copy the files,

rsync -avz --progress --delete -e "ssh -o UserKnownHostsFile=/dev/null -o \
StrictHostKeyChecking=no" /home/path/folder1/ /home/path/folder2

let me know if there is any mistake in my understanding of this concept...

Fabien ,Jun 14, 2013 at 12:12

Can you please edit your answer and explain what your special ssh call does, and why you advise to do it? – Fabien Jun 14 '13 at 12:12

DanielSmedegaardBuus ,Dec 7, 2014 at 0:12

@Fabien He tells rsync to set two ssh options (rsync uses ssh to connect). The second one tells ssh to not prompt for confirmation if the host he's connecting to isn't already known (by existing in the "known hosts" file). The first one tells ssh to not use the default known hosts file (which would be ~/.ssh/known_hosts). He uses /dev/null instead, which is of course always empty, and as ssh would then not find the host in there, it would normally prompt for confirmation, hence option two. Upon connecting, ssh writes the now known host to /dev/null, effectively forgetting it instantly :) – DanielSmedegaardBuus Dec 7 '14 at 0:12

DanielSmedegaardBuus ,Dec 7, 2014 at 0:23

...but you were probably wondering what effect, if any, it has on the rsync operation itself. The answer is none. It only serves to not have the host you're connecting to added to your SSH known hosts file. Perhaps he's a sysadmin often connecting to a great number of new servers, temporary systems or whatnot. I don't know :) – DanielSmedegaardBuus Dec 7 '14 at 0:23

moi ,May 10, 2016 at 13:49

"use --progress option to copy only those files which are not copied" What? – moi May 10 '16 at 13:49

Paul d'Aoust ,Nov 17, 2016 at 22:39

There are a couple errors here; one is very serious: --delete will delete files in the destination that don't exist in the source. The less serious one is that --progress doesn't modify how things are copied; it just gives you a progress report on each file as it copies. (I fixed the serious error; replaced it with --remove-source-files .) – Paul d'Aoust Nov 17 '16 at 22:39
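Since --delete removes files from the destination, one cautious habit is to preview the run first. The sketch below combines it with --dry-run and --itemize-changes so that nothing is modified; the function name preview_delete and the file names are made up for the illustration.

```shell
# preview_delete SRC_DIR DEST_DIR
# Show what a --delete run would do without touching either directory.
preview_delete() {
    rsync -a --delete --dry-run --itemize-changes "$1/" "$2/"
}

if command -v rsync >/dev/null 2>&1; then
    src=$(mktemp -d); dst=$(mktemp -d)
    touch "$src/keep.txt" "$dst/keep.txt" "$dst/stale.txt"
    preview_delete "$src" "$dst"    # reports stale.txt as a deletion candidate
    ls "$dst"                       # stale.txt is still present after the dry run
    rm -rf "$src" "$dst"
fi
```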

[Jun 24, 2018] Three Ways to Script Processes in Parallel by Rudis Muiznieks

Sep 02, 2015 |

I was recently troubleshooting some issues we were having with Shippable , trying to get a bunch of our unit tests to run in parallel so that our builds would complete faster. I didn't care what order the different processes completed in, but I didn't want the shell script to exit until all the spawned unit test processes had exited. I ultimately wasn't able to satisfactorily solve the issue we were having, but I did learn more than I ever wanted to know about how to run processes in parallel in shell scripts. So here I shall impart unto you the knowledge I have gained. I hope someone else finds it useful!


The simplest way to achieve what I wanted was to use the wait command. You simply fork all of your processes with & , and then follow them with a wait command. Behold:


/usr/bin/my-process-1 --args1 &
/usr/bin/my-process-2 --args2 &
/usr/bin/my-process-3 --args3 &

wait
echo all processes complete

It's really as easy as that. When you run the script, all three processes will be forked in parallel, and the script will wait until all three have completed before exiting. Anything after the wait command will execute only after the three forked processes have exited.


Damn, son! It doesn't get any simpler than that!


I don't think there's really any way to determine the exit codes of the processes you forked. That was a deal-breaker for my use case, since I needed to know if any of the tests failed and return an error code from the parent shell script if they did.

Another downside is that output from the processes will be all mish-mashed together, which makes it difficult to follow. In our situation, it was basically impossible to determine which unit tests had failed because they were all spewing their output at the same time.
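For what it's worth, POSIX shells let wait take a PID and return that child's exit status, so per-process results can be recovered with a little bookkeeping. The following is only a sketch: the function name run_parallel and the placeholder commands are hypothetical, not part of the original script.

```shell
# run_parallel CMD...
# Start every command string in the background, then wait on each PID in turn.
# `wait PID` returns that child's exit status, so failures can be counted.
run_parallel() {
    pids=""
    failed=0
    for cmd in "$@"; do
        sh -c "$cmd" &
        pids="$pids $!"
    done
    for pid in $pids; do
        wait "$pid" || failed=$((failed + 1))
    done
    # The return value is the number of failed jobs (0 means all succeeded).
    return "$failed"
}

rc=0
run_parallel "sleep 0.2; exit 0" "sleep 0.1; exit 3" "exit 0" || rc=$?
echo "failed jobs: $rc"
```

Redirecting each job to its own log file (for example `sh -c "$cmd" > "job-$n.log" 2>&1 &`) would also keep the output of different jobs from being interleaved.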

GNU Parallel

There is a super nifty program called GNU Parallel that does exactly what I wanted. It works kind of like xargs in that you can give it a collection of arguments to pass to a single command which will all be run, only this will run them in parallel instead of in serial like xargs does (OR DOES IT??</foreshadowing>). It is super powerful, and all the different ways you can use it are beyond the scope of this article, but here's a rough equivalent to the example script above:


parallel /usr/bin/my-process-{} --args{} ::: 1 2 3
echo all processes complete

The official "10 seconds installation" method for the latest version of GNU Parallel (from the README) is as follows:

(wget -O - || curl || fetch -o - | bash


If any of the processes returns a non-zero exit code, parallel will return a non-zero exit code. This means you can use $? in your shell script to detect if any of the processes failed. Nice! GNU Parallel also (by default) collates the output of each process together, so you'll see the complete output of each process as it completes instead of a mash-up of all the output combined together as it's produced. Also nice!

I am such a damn fanboy I might even buy an official GNU Parallel mug and t-shirt . Actually I'll probably save the money and get the new Star Wars Battlefront game when it comes out instead. But I did seriously consider the parallel schwag for a microsecond or so.


Literally none.


So it turns out that our old friend xargs has supported parallel processing all along! Who knew? It's like the nerdy chick in the movies who gets a makeover near the end and it turns out she's even hotter than the stereotypical hot cheerleader chicks who were picking on her the whole time. Just pass it a -Pn argument and it will run your commands using up to n processes at a time. Check out this mega-sexy equivalent to the above scripts:


printf '1\n2\n3\n' | xargs -P3 -I{} /usr/bin/my-process-{} --args{}
echo all processes complete


xargs returns a non-zero exit code if any of the processes fails, so you can again use $? in your shell script to detect errors. The difference is that GNU xargs returns 123 when any invocation fails with a status of 1-125, whereas GNU Parallel, per its man page, by default returns the number of jobs that failed rather than passing through a job's own exit code. Another pro is that xargs is most likely already installed on your preferred distribution of Linux.


I have read reports that the non-GNU version of xargs does not support parallel processing, so you may or may not be out of luck with this option if you're on AIX or a BSD or something.

xargs also has the same problem as the wait solution where the output from your processes will be all mixed together.

Another con is that xargs is a little less flexible than parallel in how you specify the processes to run. You have to pipe your values into it, and if you use the -I argument for string-replacement then your values have to be separated by newlines (which is more annoying when running it ad-hoc). It's still pretty nice, but nowhere near as flexible or powerful as parallel .
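The mixed-output problem can be worked around by giving each job its own log file. A minimal sketch, assuming an xargs that supports -P (GNU xargs does); the echo commands stand in for real jobs, and job 2 is made to fail on purpose to show the exit status.

```shell
# Run three placeholder jobs, at most two at a time; each writes its own log.
logdir=$(mktemp -d)
status=0
printf '%s\n' 1 2 3 |
    xargs -P 2 -I{} sh -c 'echo "output of job {}" > "$0/job-{}.log"; [ {} -ne 2 ]' "$logdir" ||
    status=$?

echo "xargs exit status: $status"    # 123 with GNU xargs, because job 2 exits non-zero
cat "$logdir"/job-*.log              # per-job output, no interleaving
```

Passing the log directory as an extra argument makes it available as $0 inside the sh -c script, which avoids quoting problems with the {} replacement.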

Also there's no place to buy an xargs mug and t-shirt. Lame!

And The Winner Is

After determining that the Shippable problem we were having was completely unrelated to the parallel scripting method I was using, I ended up sticking with parallel for my unit tests. Even though it meant one more dependency on our build machine, the ease

[Jun 23, 2018] Queuing tasks for batch execution with Task Spooler by Ben Martin

Aug 12, 2008 |

The Task Spooler project allows you to queue up tasks from the shell for batch execution. Task Spooler is simple to use and requires no configuration. You can view and edit queued commands, and you can view the output of queued commands at any time.

Task Spooler has some similarities with other delayed and batch execution projects, such as " at ." While both Task Spooler and at handle multiple queues and allow the execution of commands at a later point, the at project handles output from commands by emailing the results to the user who queued the command, while Task Spooler allows you to get at the results from the command line instead. Another major difference is that Task Spooler is not aimed at executing commands at a specific time, but rather at simply adding to and executing commands from queues.

The main repositories for Fedora, openSUSE, and Ubuntu do not contain packages for Task Spooler. There are packages for some versions of Debian, Ubuntu, and openSUSE 10.x available along with the source code on the project's homepage. In this article I'll use a 64-bit Fedora 9 machine and install version 0.6 of Task Spooler from source. Task Spooler does not use autotools to build, so to install it, simply run make; sudo make install . This will install the main Task Spooler command ts and its manual page into /usr/local.

A simple interaction with Task Spooler is shown below. First I add a new job to the queue and check the status. As the command is a very simple one, it is likely to have been executed immediately. Executing ts by itself with no arguments shows the executing queue, including tasks that have completed. I then use ts -c to get at the stdout of the executed command. The -c option uses cat to display the output file for a task. Using ts -i shows you information about the job. To clear finished jobs from the queue, use the ts -C command, not shown in the example.

$ ts echo "hello world"

$ ts
ID State Output E-Level Times(r/u/s) Command [run=0/1]
6 finished /tmp/ts-out.QoKfo9 0 0.00/0.00/0.00 echo hello world

$ ts -c 6
hello world

$ ts -i 6
Command: echo hello world
Enqueue time: Tue Jul 22 14:42:22 2008
Start time: Tue Jul 22 14:42:22 2008
End time: Tue Jul 22 14:42:22 2008
Time run: 0.003336s

The -t option operates like tail -f , showing you the last few lines of output and continuing to show you any new output from the task. If you would like to be notified when a task has completed, you can use the -m option to have the results mailed to you, or you can queue another command to be executed that just performs the notification. For example, I might add a tar command and want to know when it has completed. The below commands will create a tarball and use libnotify commands to create an unobtrusive popup window on my desktop when the tarball creation is complete. The popup will be dismissed automatically after a timeout.

$ ts tar czvf /tmp/mytarball.tar.gz liberror-2.1.80011
$ ts notify-send "tarball creation" "the long running tar creation process is complete."
$ ts
ID State Output E-Level Times(r/u/s) Command [run=0/1]
11 finished /tmp/ts-out.O6epsS 0 4.64/4.31/0.29 tar czvf /tmp/mytarball.tar.gz liberror-2.1.80011
12 finished /tmp/ts-out.4KbPSE 0 0.05/0.00/0.02 notify-send tarball creation the long... is complete.

Notice in the output above, toward the far right of the header information, the run=0/1 line. This tells you that Task Spooler is executing nothing, and can possibly execute one task. Task spooler allows you to execute multiple tasks at once from your task queue to take advantage of multicore CPUs. The -S option allows you to set how many tasks can be executed in parallel from the queue, as shown below.

$ ts -S 2
$ ts
ID State Output E-Level Times(r/u/s) Command [run=0/2]
6 finished /tmp/ts-out.QoKfo9 0 0.00/0.00/0.00 echo hello world

If you have two tasks that you want to execute with Task Spooler but one depends on the other having already been executed (and perhaps that the previous job has succeeded too) you can handle this by having one task wait for the other to complete before executing. This becomes more important on a quad core machine when you might have told Task Spooler that it can execute three tasks in parallel. The commands shown below create an explicit dependency, making sure that the second command is executed only if the first has completed successfully, even when the queue allows multiple tasks to be executed. The first command is queued normally using ts . I use a subshell to execute the commands by having ts explicitly start a new bash shell. The second command uses the -d option, which tells ts to execute the command only after the successful completion of the last command that was appended to the queue. When I first inspect the queue I can see that the first command (28) is executing. The second command is queued but has not been added to the list of executing tasks because Task Spooler is aware that it cannot execute until task 28 is complete. The second time I view the queue, both tasks have completed.

$ ts bash -c "sleep 10; echo hi"
$ ts -d echo there
$ ts
ID State Output E-Level Times(r/u/s) Command [run=1/2]
28 running /tmp/ts-out.hKqDva bash -c sleep 10; echo hi
29 queued (file) && echo there
$ ts
ID State Output E-Level Times(r/u/s) Command [run=0/2]
28 finished /tmp/ts-out.hKqDva 0 10.01/0.00/0.01 bash -c sleep 10; echo hi
29 finished /tmp/ts-out.VDtVp7 0 0.00/0.00/0.00 && echo there
$ cat /tmp/ts-out.hKqDva
hi
$ cat /tmp/ts-out.VDtVp7
there

You can also explicitly set dependencies on other tasks as shown below. Because the ts command prints the ID of a new task to the console, the first command puts that ID into a shell variable for use in the second command. The second command passes the task ID of the first task to ts, telling it to wait for the task with that ID to complete before returning. Because this is joined with the command we wish to execute with the && operation, the second command will execute only if the first one has finished and succeeded.

The first time we view the queue you can see that both tasks are running. The first task will be in the sleep command that we used explicitly to slow down its execution. The second command will be executing ts , which will be waiting for the first task to complete. One downside of tracking dependencies this way is that the second command is added to the running queue even though it cannot do anything until the first task is complete.

$ FIRST_TASKID=`ts bash -c "sleep 10; echo hi"`
$ ts sh -c "ts -w $FIRST_TASKID && echo there"
$ ts
ID State Output E-Level Times(r/u/s) Command [run=2/2]
24 running /tmp/ts-out.La9Gmz bash -c sleep 10; echo hi
25 running /tmp/ts-out.Zr2n5u sh -c ts -w 24 && echo there
$ ts
ID State Output E-Level Times(r/u/s) Command [run=0/2]
24 finished /tmp/ts-out.La9Gmz 0 10.01/0.00/0.00 bash -c sleep 10; echo hi
25 finished /tmp/ts-out.Zr2n5u 0 9.47/0.00/0.01 sh -c ts -w 24 && echo there
$ ts -c 24
hi
$ ts -c 25
there

Wrap-up

Task Spooler allows you to convert a shell command to a queued command by simply prepending ts to the command line. One major advantage of using ts over something like the at command is that you can effectively run tail -f on the output of a running task and also get at the output of completed tasks from the command line. The utility's ability to execute multiple tasks in parallel is very handy if you are running on a multicore CPU. Because you can explicitly wait for a task, you can set up very complex interactions where you might have several tasks running at once and have jobs that depend on multiple other tasks to complete successfully before they can execute.

Because you can make explicitly dependent tasks take up slots in the actively running task queue, you can effectively delay the execution of the queue until a time of your choosing. For example, if you queue up a task that waits for a specific time before returning successfully and have a small group of other tasks that are dependent on this first task to complete, then no tasks in the queue will execute until the first task completes.



[Jun 23, 2018] at, batch, atq, and atrm examples

Jun 23, 2018 |
at -m 01:35 < my-at-jobs.txt

Run the commands listed in the ' my-at-jobs.txt ' file at 1:35 AM. All output from the job will be mailed to the user running the task. When this command has been successfully entered you should receive a prompt similar to the example below:

commands will be executed using /bin/sh
job 1 at Wed Dec 24 00:22:00 2014
at -l

This command will list each of the scheduled jobs in a format like the following:

1          Wed Dec 24 00:22:00 2003

...this is the same as running the command atq .

at -r 1

Deletes job 1 . This command is the same as running the command atrm 1 .

atrm 23

Deletes job 23. This command is the same as running the command at -r 23 .

[Jun 23, 2018] Bash script processing limited number of commands in parallel

Jun 23, 2018 |

AL-Kateb ,Oct 23, 2013 at 13:33

I have a bash script that looks like this:
wget LINK1 >/dev/null 2>&1
wget LINK2 >/dev/null 2>&1
wget LINK3 >/dev/null 2>&1
wget LINK4 >/dev/null 2>&1
# ..
# ..
wget LINK4000 >/dev/null 2>&1

But processing each line until the command is finished then moving to the next one is very time consuming, I want to process for instance 20 lines at once then when they're finished another 20 lines are processed.

I thought of wget LINK1 >/dev/null 2>&1 & to send the command to the background and carry on, but there are 4000 lines here this means I will have performance issues, not to mention being limited in how many processes I should start at the same time so this is not a good idea.

One solution that I'm thinking of right now is checking whether one of the commands is still running or not, for instance after 20 lines I can add this loop:

while [  $(ps -ef | grep KEYWORD | grep -v grep | wc -l) -gt 0 ]; do
sleep 1
done

Of course in this case I will need to append & to the end of the line! But I'm feeling this is not the right way to do it.

So how do I actually group each 20 lines together and wait for them to finish before going to the next 20 lines, this script is dynamically generated so I can do whatever math I want on it while it's being generated, but it DOES NOT have to use wget, it was just an example so any solution that is wget specific is not gonna do me any good.

kojiro ,Oct 23, 2013 at 13:46

wait is the right answer here, but your while [ $(ps would be much better written while pkill -0 $KEYWORD – using proctools that is, for legitimate reasons to check if a process with a specific name is still running. – kojiro Oct 23 '13 at 13:46

VasyaNovikov ,Jan 11 at 19:01

I think this question should be re-opened. The "possible duplicate" QA is all about running a finite number of programs in parallel. Like 2-3 commands. This question, however, is focused on running commands in e.g. a loop. (see "but there are 4000 lines"). – VasyaNovikov Jan 11 at 19:01

robinCTS ,Jan 11 at 23:08

@VasyaNovikov Have you read all the answers to both this question and the duplicate? Every single answer to this question here, can also be found in the answers to the duplicate question. That is precisely the definition of a duplicate question. It makes absolutely no difference whether or not you are running the commands in a loop. – robinCTS Jan 11 at 23:08

VasyaNovikov ,Jan 12 at 4:09

@robinCTS there are intersections, but questions themselves are different. Also, 6 of the most popular answers on the linked QA deal with 2 processes only. – VasyaNovikov Jan 12 at 4:09

Dan Nissenbaum ,Apr 20 at 15:35

I recommend reopening this question because its answer is clearer, cleaner, better, and much more highly upvoted than the answer at the linked question, though it is three years more recent. – Dan Nissenbaum Apr 20 at 15:35

devnull ,Oct 23, 2013 at 13:35

Use the wait built-in:
process1 &
process2 &
process3 &
process4 &
wait
process5 &
process6 &
process7 &
process8 &
wait

For the above example, 4 processes process1 .. process4 would be started in the background, and the shell would wait until those are completed before starting the next set ..

From the manual :

wait [jobspec or pid ...]

Wait until the child process specified by each process ID pid or job specification jobspec exits and return the exit status of the last command waited for. If a job spec is given, all processes in the job are waited for. If no arguments are given, all currently active child processes are waited for, and the return status is zero. If neither jobspec nor pid specifies an active child process of the shell, the return status is 127.
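
The exit-status behaviour described in the manual excerpt can be seen in a small sketch. It assumes nothing beyond a POSIX-style shell; `ok_job` and `bad_job` are hypothetical stand-ins for real commands, and the status 3 is arbitrary:

```bash
#!/usr/bin/env bash
# Hypothetical jobs: one succeeds, one fails with status 3.
ok_job()  { sleep 0.1; return 0; }
bad_job() { sleep 0.1; return 3; }

ok_job &
pid_ok=$!      # $! holds the PID of the most recent background job
bad_job &
pid_bad=$!

# wait PID returns the exit status of that particular child.
status_ok=0;  wait "$pid_ok"  || status_ok=$?
status_bad=0; wait "$pid_bad" || status_bad=$?
echo "ok_job exited with $status_ok, bad_job exited with $status_bad"

# With no arguments, wait covers all remaining children and returns 0.
status_all=0; wait || status_all=$?
```

Waiting on each PID separately is what lets a script notice that one background job failed; a bare `wait` always reports success.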

kojiro ,Oct 23, 2013 at 13:48

So basically i=0; waitevery=4; for link in "${links[@]}"; do wget "$link" & (( i++%waitevery==0 )) && wait; done >/dev/null 2>&1 – kojiro Oct 23 '13 at 13:48

rsaw ,Jul 18, 2014 at 17:26

Unless you're sure that each process will finish at the exact same time, this is a bad idea. You need to start up new jobs to keep the current total jobs at a certain cap .... parallel is the answer. – rsaw Jul 18 '14 at 17:26

DomainsFeatured ,Sep 13, 2016 at 22:55

Is there a way to do this in a loop? – DomainsFeatured Sep 13 '16 at 22:55

Bobby ,Apr 27, 2017 at 7:55

I've tried this but it seems that variable assignments done in one block are not available in the next block. Is this because they are separate processes? Is there a way to communicate the variables back to the main process? – Bobby Apr 27 '17 at 7:55
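
The comments above ask for a loop that keeps the number of running jobs at a cap rather than waiting for whole batches. One possible sketch, assuming bash 4.3 or newer (for `wait -n`); `fetch`, `run_throttled` and the cap of 3 are hypothetical names and values chosen for illustration:

```bash
#!/usr/bin/env bash
# Requires bash 4.3+ for `wait -n`.
max_jobs=3

# Hypothetical stand-in for the real command (e.g. one wget call).
fetch() { sleep 0.2; echo "fetched $1"; }

run_throttled() {
    local item running=0
    for item in "$@"; do
        if [ "$running" -ge "$max_jobs" ]; then
            wait -n || true            # block until any one job exits
            running=$((running - 1))
        fi
        fetch "$item" &                # runs in a subshell of this script
        running=$((running + 1))
    done
    wait                               # let the remaining jobs finish
}

run_throttled a b c d e f g
```

Each background job runs in its own subshell, so variables it assigns are not visible to the main script; results have to come back through output, files, or exit statuses.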

choroba ,Oct 23, 2013 at 13:38

See parallel . Its syntax is similar to xargs , but it runs the commands in parallel.

chepner ,Oct 23, 2013 at 14:35

This is better than using wait , since it takes care of starting new jobs as old ones complete, instead of waiting for an entire batch to finish before starting the next. – chepner Oct 23 '13 at 14:35

Mr. Llama ,Aug 13, 2015 at 19:30

For example, if you have the list of links in a file, you can do cat list_of_links.txt | parallel -j 4 wget {} which will keep four wget s running at a time. – Mr. Llama Aug 13 '15 at 19:30

0x004D44 ,Nov 2, 2015 at 21:42

There is a new kid in town called pexec which is a replacement for parallel . – 0x004D44 Nov 2 '15 at 21:42

mat ,Mar 1, 2016 at 21:04

Not to be picky, but xargs can also parallelize commands. – mat Mar 1 '16 at 21:04

Vader B ,Jun 27, 2016 at 6:41

In fact, xargs can run commands in parallel for you. There is a special -P max_procs command-line option for that. See man xargs .
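
A minimal sketch of the `-P` option, assuming an xargs that supports it (GNU and BSD xargs do; it is not required by POSIX). The `link1`...`link6` items and the `process_links` name are placeholders, and the `sh -c` body stands in for the real command such as wget:

```bash
#!/usr/bin/env bash
# Read one item per line on stdin and process up to 4 of them at a time.
process_links() {
    # -n 1: one item per command; -P 4: at most four commands in parallel.
    # The trailing "sh" fills $0 of the inner shell, so the item lands in $1.
    xargs -P 4 -n 1 sh -c 'sleep 0.1; echo "processed $1"' sh
}

printf '%s\n' link1 link2 link3 link4 link5 link6 | process_links
```

Unlike the batch-and-wait approach, xargs starts a new command as soon as one of the four slots frees up. The output order is not guaranteed.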

> ,

You can run 20 processes and use the command:

wait

Your script will wait and continue when all your background jobs are finished.

[Jun 23, 2018] parallelism - correct xargs parallel usage

Jun 23, 2018 |

Yan Zhu ,Apr 19, 2015 at 6:59

I am using xargs to call a python script to process about 30 million small files. I hope to use xargs to parallelize the process. The command I am using is:
find ./data -name "*.json" -print0 |
  xargs -0 -I{} -P 40 python {} > log.txt

Basically, the script will read in a small JSON file (4 KB), do some processing and write to another 4 KB file. I am running on a server with 40 CPU cores, and no other CPU-intensive process is running on this server.

By monitoring htop (btw, is there any other good way to monitor the CPU performance?), I find that -P 40 is not as fast as expected. Sometimes all cores will freeze and decrease almost to zero for 3-4 seconds, then will recover to 60-70%. Then I try to decrease the number of parallel processes to -P 20-30 , but it's still not very fast. The ideal behavior should be linear speed-up. Any suggestions for the parallel usage of xargs ?

Ole Tange ,Apr 19, 2015 at 8:45

You are most likely hit by I/O: The system cannot read the files fast enough. Try starting more than 40: This way it will be fine if some of the processes have to wait for I/O. – Ole Tange Apr 19 '15 at 8:45

Fox ,Apr 19, 2015 at 10:30

What kind of processing does the script do? Any database/network/io involved? How long does it run? – Fox Apr 19 '15 at 10:30

PSkocik ,Apr 19, 2015 at 11:41

I second @OleTange. That is the expected behavior if you run as many processes as you have cores and your tasks are IO bound. First the cores will wait on IO for their task (sleep), then they will process, and then repeat. If you add more processes, then the additional processes that currently aren't running on a physical core will have kicked off parallel IO operations, which will, when finished, eliminate or at least reduce the sleep periods on your cores. – PSkocik Apr 19 '15 at 11:41

Bichoy ,Apr 20, 2015 at 3:32

1- Do you have hyperthreading enabled? 2- in what you have up there, log.txt is actually overwritten with each call to ... not sure if this is the intended behavior or not. – Bichoy Apr 20 '15 at 3:32

Ole Tange ,May 11, 2015 at 18:38

xargs -P and > is opening up for race conditions because of the half-line problem. Using GNU Parallel instead will not have that problem. – Ole Tange May 11 '15 at 18:38

James Scriven ,Apr 24, 2015 at 18:00

I'd be willing to bet that your problem is python . You didn't say what kind of processing is being done on each file, but assuming you are just doing in-memory processing of the data, the running time will be dominated by starting up 30 million python virtual machines (interpreters).

If you can restructure your python program to take a list of files, instead of just one, you will get a huge improvement in performance. You can then still use xargs to further improve performance. For example, 40 processes, each processing 1000 files:

find ./data -name "*.json" -print0 |
  xargs -0 -L1000 -P 40 python

This isn't to say that python is a bad/slow language; it's just not optimized for startup time. You'll see this with any virtual machine-based or interpreted language. Java, for example, would be even worse. If your program was written in C, there would still be a cost of starting a separate operating system process to handle each file, but it would be much less.

From there you can fiddle with -P to see if you can squeeze out a bit more speed, perhaps by increasing the number of processes to take advantage of idle processors while data is being read/written.
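
The effect of batching on the number of process start-ups can be checked with a small sketch. It uses `-n` (maximum arguments per command) rather than `-L`, and `sh -c 'echo "$#"' sh` as a stand-in for the real script; the input size of 300 and batch size of 100 are arbitrary:

```bash
#!/usr/bin/env bash
# The stand-in command prints how many arguments it received, so the number
# of output lines equals the number of times the command was started.
one_per_call=$(( $(seq 1 300 | xargs -n 1   sh -c 'echo "$#"' sh | wc -l) ))
batched=$((      $(seq 1 300 | xargs -n 100 sh -c 'echo "$#"' sh | wc -l) ))

echo "invocations with -n 1:   $one_per_call"
echo "invocations with -n 100: $batched"
```

With 300 inputs the first form starts the command 300 times and the second only 3 times, which is the saving the answer describes for interpreter start-up cost.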

Stephen ,Apr 24, 2015 at 13:03

So firstly, consider the constraints:

What is the constraint on each job? If it's I/O, you can probably get away with multiple jobs per CPU core up until you hit the limit of I/O, but if it's CPU intensive, it's going to be worse than pointless running more jobs concurrently than you have CPU cores.

My understanding of these things is that GNU Parallel would give you better control over the queue of jobs etc.

See GNU parallel vs & (I mean background) vs xargs -P for a more detailed explanation of how the two differ.


As others said, check whether you're I/O-bound. Also, xargs' man page suggests using -n with -P , you don't mention the number of processes you see running in parallel.

As a suggestion, if you're I/O-bound, you might try using an SSD block device, or try doing the processing in a tmpfs (of course, in this case you should check for enough memory, avoiding swap due to tmpfs pressure (I think), and the overhead of copying the data to it in the first place).

[Jun 23, 2018] Linux/Bash, how to schedule commands in a FIFO queue?

Jun 23, 2018 |

Andrei ,Apr 10, 2013 at 14:26

I want the ability to schedule commands to be run in a FIFO queue. I DON'T want them to be run at a specified time in the future as would be the case with the "at" command. I want them to start running now, but not simultaneously. The next scheduled command in the queue should be run only after the first command finishes executing. Alternatively, it would be nice if I could specify a maximum number of commands from the queue that could be run simultaneously; for example if the maximum number of simultaneous commands is 2, then only at most 2 commands scheduled in the queue would be taken from the queue in a FIFO manner to be executed, the next command in the remaining queue being started only when one of the currently 2 running commands finishes.

I've heard task-spooler could do something like this, but this package doesn't appear to be well supported/tested and is not in the Ubuntu standard repositories (Ubuntu being what I'm using). If that's the best alternative then let me know and I'll use task-spooler, otherwise, I'm interested to find out what's the best, easiest, most tested, bug-free, canonical way to do such a thing with bash.


Simple solutions like ; or && from bash do not work. I need to schedule these commands from an external program, when an event occurs. I just don't want to have hundreds of instances of my command running simultaneously, hence the need for a queue. There's an external program that will trigger events where I can run my own commands. I want to handle ALL triggered events, I don't want to miss any event, but I also don't want my system to crash, so that's why I want a queue to handle my commands triggered from the external program.

Andrei ,Apr 11, 2013 at 11:40

Task Spooler:

Does the trick very well. Hopefully it will be included in Ubuntu's package repos.

Hennes ,Apr 10, 2013 at 15:00

Use ;

For example:
ls ; touch test ; ls

That will list the directory. Only after ls has run it will run touch test which will create a file named test. And only after that has finished it will run the next command. (In this case another ls which will show the old contents and the newly created file).

Similar commands are || and && .

; will always run the next command.

&& will only run the next command if the first returned success.
Example: rm -rf *.mp3 && echo "Success! All MP3s deleted!"

|| will only run the next command if the first command returned a failure (non-zero) return value. Example: rm -rf *.mp3 || echo "Error! Some files could not be deleted! Check permissions!"

If you want to run a command in the background, append an ampersand ( & ).
make bzimage &
mp3blaster sound.mp3 &
make mytestsoftware ; ls ; firefox ; make clean

This will run two commands in the background (in this case a kernel build, which will take some time, and a program to play some music). In the foreground it runs another compile job and, once that is finished, ls, firefox and a make clean (all sequentially).

For more details, see man bash

[Edit after comment]

in pseudo code, something like this?

Program run_queue:

While( true )
   Wait_for_a_signal()
   While( queue not empty )
       run next command from the queue.
       remove this command from the queue.
       // If commands were added to the queue during execution then
       // the queue is not empty, keep processing them all.
   // Queue is now empty, returning to wait_for_a_signal

// Wait forever on commands and add them to a queue
// Signal run_queue when something gets added.
Program add_to_queue:

While( true )
   Wait_for_event()
   Append command to queue
   signal run_queue
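
The pseudocode can be approximated in the shell with a named pipe. This is only a sketch, not the answer's own implementation: it assumes a system such as Linux where a FIFO can be opened read-write (POSIX leaves that unspecified), and the names `run_queue`, `add_to_queue`, the `__quit__` sentinel and the temporary paths are all made up for illustration:

```bash
#!/usr/bin/env bash
queue_dir=$(mktemp -d)
fifo="$queue_dir/queue"
log="$queue_dir/log"
mkfifo "$fifo"

# run_queue: run queued command lines one at a time, in arrival order.
run_queue() {
    # Opening the FIFO read-write keeps it open between writers, so read
    # blocks while the queue is empty instead of hitting end-of-file.
    exec 3<>"$fifo"
    while IFS= read -r cmd <&3; do
        [ "$cmd" = "__quit__" ] && break      # sentinel that stops the worker
        sh -c "$cmd" >>"$log" 2>&1 || true    # a failing command does not stop the queue
    done
}

# add_to_queue: append one command line to the queue.
add_to_queue() { printf '%s\n' "$1" >"$fifo"; }

run_queue &
worker=$!
add_to_queue 'sleep 0.2; echo first'
add_to_queue 'echo second'
add_to_queue '__quit__'
wait "$worker"
result=$(cat "$log")
echo "$result"
rm -rf "$queue_dir"
```

The second command is queued while the first is still sleeping, yet it runs only after the first has finished. Unlike Task Spooler, this sketch runs every command in the worker's environment and keeps no record of exit statuses.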

terdon ,Apr 10, 2013 at 15:03

The easiest way would be to simply run the commands sequentially:
cmd1; cmd2; cmd3; cmdN

If you want the next command to run only if the previous command exited successfully, use && :

cmd1 && cmd2 && cmd3 && cmdN

That is the only bash native way I know of doing what you want. If you need job control (setting a number of parallel jobs etc), you could try installing a queue manager such as TORQUE but that seems like overkill if all you want to do is launch jobs sequentially.

psusi ,Apr 10, 2013 at 15:24

You are looking for at 's twin brother: batch . It uses the same daemon but instead of scheduling a specific time, the jobs are queued and will be run whenever the system load average is low.

mpy ,Apr 10, 2013 at 14:59

Apart from dedicated queuing systems (like the Sun Grid Engine ) which you can also use locally on one machine and which offer dozens of possibilities, you can use something like
 command1 && command2 && command3

which is the other extreme -- a very simple approach. The latter provides neither multiple simultaneous processes nor gradual filling of the "queue".

Bogdan Dumitru ,May 3, 2016 at 10:12

I went on the same route searching, trying out task-spooler and so on. The best of the best is this:

GNU Parallel --semaphore --fg. It also has -j for parallel jobs.

[Jun 23, 2018] Task Spooler

Notable quotes:
"... As in : ..."
"... doesn't work anymore ..."
Jun 23, 2018 |

task spooler is a Unix batch system where the tasks spooled run one after the other. The number of jobs to run at once can be set at any time. Each user in each system has his own job queue. The tasks are run in the correct context (that of enqueue) from any shell/process, and its output/results can be easily watched. It is very useful when you know that your commands depend on a lot of RAM, a lot of disk use, give a lot of output, or for whatever reason it's better not to run them all at the same time, while you want to keep your resources busy for maximum benefit. Its interface allows using it easily in scripts.

For your first contact, you can read an article at , which I like as overview, guide and examples (original url) . On more advanced usage, don't neglect the TRICKS file in the package.


I wrote Task Spooler because I didn't have any comfortable way of running batch jobs in my linux computer. I wanted to:

At the end, after some time using and developing ts , it can do something more:

You can look at an old (but representative) screenshot of ts-0.2.1 if you want.

Mailing list

I created a GoogleGroup for the program. You can look for the archive and the join methods in the taskspooler google group page .

Alessandro Öhler once maintained a mailing list for discussing newer functionalities and interchanging use experiences. I think this doesn't work anymore , but you can look at the old archive or even try to subscribe .

How it works

The queue is maintained by a server process. This server process is started if it isn't there already. The communication goes through a unix socket usually in /tmp/ .

When the user requests a job (using a ts client), the client waits for the server message to know when it can start. When the server allows it to start, the client usually forks and runs the command with the proper environment, because it is the client that runs the job and not the server, as happens with 'at' or 'cron'. So the ulimits, environment, pwd, etc. apply.

When the job finishes, the client notifies the server. At this time, the server may notify any waiting client, and stores the output and the errorlevel of the finished job.

Moreover, the client can obtain a lot of information from the server: when a job finishes, where the job output goes, etc.


Download the latest version (GPLv2+ licensed): ts-1.0.tar.gz - v1.0 (2016-10-19) - Changelog

Look at the version repository if you are interested in its development.

Андрей Пантюхин (Andrew Pantyukhin) maintains the BSD port .

Alessandro Öhler provided a Gentoo ebuild for 0.4 , which with simple changes I updated to the ebuild for 0.6.4 . Moreover, the Gentoo Project Sunrise already has also an ebuild ( maybe old ) for ts .

Alexander V. Inyukhin maintains unofficial debian packages for several platforms. Find the official packages in the debian package system .

Pascal Bleser packed the program for SuSE and openSuSE in RPMs for various platforms .

Gnomeye maintains the AUR package .

Eric Keller wrote a nodejs web server showing the status of the task spooler queue ( github project ).


Look at its manpage (v0.6.1). Here you also have a copy of the help for the same version:

usage: ./ts [action] [-ngfmd] [-L <lab>] [cmd...]
Env vars:
  TS_SOCKET  the path to the unix socket used by the ts command.
  TS_MAILTO  where to mail the result (on -m). Local user by default.
  TS_MAXFINISHED  maximum finished jobs in the queue.
  TS_ONFINISH  binary called on job end (passes jobid, error, outfile, command).
  TS_ENV  command called on enqueue. Its output determines the job information.
  TS_SAVELIST  filename which will store the list, if the server dies.
  TS_SLOTS   amount of jobs which can run at once, read on server start.
  -K       kill the task spooler server
  -C       clear the list of finished jobs
  -l       show the job list (default action)
  -S [num] set the number of max simultanious jobs of the server.
  -t [id]  tail -f the output of the job. Last run if not specified.
  -c [id]  cat the output of the job. Last run if not specified.
  -p [id]  show the pid of the job. Last run if not specified.
  -o [id]  show the output file. Of last job run, if not specified.
  -i [id]  show job information. Of last job run, if not specified.
  -s [id]  show the job state. Of the last added, if not specified.
  -r [id]  remove a job. The last added, if not specified.
  -w [id]  wait for a job. The last added, if not specified.
  -u [id]  put that job first. The last added, if not specified.
  -U <id-id>  swap two jobs in the queue.
  -h       show this help
  -V       show the program version
Options adding jobs:
  -n       don't store the output of the command.
  -g       gzip the stored output (if not -n).
  -f       don't fork into background.
  -m       send the output by e-mail (uses sendmail).
  -d       the job will be run only if the job before ends well
  -L <lab> name this task with a label, to be distinguished on listing.

[Jun 23, 2018] bash - Shell Scripting Using xargs to execute parallel instances of a shell function

Jun 23, 2018 |

Gnats ,Jul 23, 2010 at 19:33

I'm trying to use xargs in a shell script to run parallel instances of a function I've defined in the same script. The function times the fetching of a page, and so it's important that the pages are actually fetched concurrently in parallel processes, and not in background processes (if my understanding of this is wrong and there's negligible difference between the two, just let me know).

The function is:

function time_a_url ()
{
     oneurltime=$($time_command -p wget -p $1 -O /dev/null 2>&1 1>/dev/null | grep real | cut -d" " -f2)
     echo "Fetching $1 took $oneurltime seconds."
}

How does one do this with an xargs pipe in a form that can take number of times to run time_a_url in parallel as an argument? And yes, I know about GNU parallel, I just don't have the privilege to install software where I'm writing this.

Dennis Williamson ,Jul 23, 2010 at 23:03

Here's a demo of how you might be able to get your function to work:
$ f() { echo "[$@]"; }
$ export -f f
$ echo -e "b 1\nc 2\nd 3 4" | xargs -P 0 -n 1 -I{} bash -c f\ \{\}
[b 1]
[d 3 4]
[c 2]

The keys to making this work are to export the function so the bash that xargs spawns will see it and to escape the space between the function name and the escaped braces. You should be able to adapt this to work in your situation. You'll need to adjust the arguments for -P and -n (or remove them) to suit your needs.

You can probably get rid of the grep and cut . If you're using the Bash builtin time , you can specify an output format using the TIMEFORMAT variable. If you're using GNU /usr/bin/time , you can use the --format argument. Either of these will allow you to drop the -p also.

You can replace this part of your wget command: 2>&1 1>/dev/null with -q . In any case, you have those reversed. The correct order would be >/dev/null 2>&1 .
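
The difference between the two redirection orders is easy to check. A small sketch, in which `both` is a hypothetical command that writes one line to each stream:

```bash
#!/usr/bin/env bash
# Hypothetical command that writes one line to stdout and one to stderr.
both() { echo "to stdout"; echo "to stderr" >&2; }

# 2>&1 first: stderr is pointed at the capture, then stdout alone is discarded.
stderr_only=$(both 2>&1 1>/dev/null)

# >/dev/null first: stdout is discarded, then stderr follows it to /dev/null.
nothing=$(both >/dev/null 2>&1)

echo "2>&1 1>/dev/null captured: '$stderr_only'"
echo ">/dev/null 2>&1 captured:  '$nothing'"
```

The first form keeps only what was written to stderr, and the second discards everything. Since timing reports are usually written to stderr, the first order may be what a pipeline into grep needs.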

Lee Netherton ,Aug 30, 2011 at 16:32

I used xargs -P0 -n1 -I{} bash -c "f {}" which still works, and seems a little tidier. – Lee Netherton Aug 30 '11 at 16:32

tmpvar ,Jul 24, 2010 at 15:21

On Mac OS X:

xargs: max. processes must be >0 (for: xargs -P [>0])

f() { echo "[$@]"; }
export -f f

echo -e "b 1\nc 2\nd 3 4" | sed 's/ /\\ /g' | xargs -P 10 -n 1 -I{} bash -c f\ \{\}

echo -e "b 1\nc 2\nd 3 4" | xargs -P 10 -I '{}' bash -c 'f "$@"' arg0 '{}'
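
Putting the pieces of this thread together, here is one way the degree of parallelism could be passed as an argument. It is a sketch that assumes bash and an xargs with `-P`; `show` stands in for `time_a_url`, and `run_parallel` is a made-up wrapper name:

```bash
#!/usr/bin/env bash
# Hypothetical function standing in for time_a_url.
show() { echo "[$1]"; }
export -f show    # make the function visible to the bash started by xargs

run_parallel() {
    # $1 = number of parallel processes; input lines arrive on stdin.
    # Each input line becomes "$1" of the inner bash; "_" fills its $0.
    xargs -P "$1" -I{} bash -c 'show "$1"' _ {}
}

# Sorted only to make the otherwise arbitrary output order stable.
printf '%s\n' 'b 1' 'c 2' 'd 3 4' | run_parallel 3 | sort
```

Passing the line as a positional parameter, rather than splicing `{}` into the command string, keeps spaces in the input from being re-split by the inner shell.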


If you install GNU Parallel on another system, you will see the functionality is in a single file (called parallel).

You should be able to simply copy that file to your own ~/bin.

[Jun 13, 2018] parsync - a parallel rsync wrapper for large data transfers by Harry Mangalam

Jan 22, 2017 |

v1.67 (Mac Beta)

Table of Contents

  1. Download
  2. Dependencies
  3. Overview
  4. parsync help

1. Download

If you already know you want it, get it here: parsync+utils.tar.gz (contains parsync plus the kdirstat-cache-writer , stats , and scut utilities below). Extract it into a dir on your $PATH and, after verifying the other dependencies below, give it a shot.

While parsync is developed for and tested on Linux, the latest version of parsync has been modified to (mostly) work on the Mac (tested on OSX 10.9.5). A number of the Linux-specific dependencies have been removed and there are a number of Mac-specific workarounds.

Thanks to Phil Reese for the code mods needed to get it started. It's the same package and instructions for both platforms.

2. Dependencies

parsync requires the following utilities to work:

non-default Perl utility: URI::Escape qw(uri_escape)
sudo yum install perl-URI  # CentOS-like

sudo apt-get install liburi-perl  # Debian-like

parsync needs to be installed only on the SOURCE end of the transfer and uses whatever rsync is available on the TARGET. It uses a number of Linux-specific utilities, so if you're transferring between Linux and a FreeBSD host, install parsync on the Linux side. In fact, as currently written, it will only PUSH data to remote targets; it will not pull data as rsync itself can do. This will probably change in the near future.

3. Overview

rsync is a fabulous data mover. Possibly more bytes have been moved (or have been prevented from being moved) by rsync than by any other application. So what's not to love? For transferring large, deep file trees, rsync will pause while it generates lists of files to process. Since Version 3, it does this pretty fast, but on sluggish filesystems, it can take hours or even days before it will start to actually exchange rsync data. Second, due to various bottlenecks, rsync will tend to use less than the available bandwidth on high speed networks. Starting multiple instances of rsync can improve this significantly. However, on such transfers, it is also easy to overload the available bandwidth, so it would be nice to both limit the bandwidth used if necessary and also to limit the load on the system. parsync tries to satisfy all these conditions and more by:

Important: Only use for LARGE data transfers

The main use case for parsync is really only very large data transfers through fairly fast network connections (>1Gb/s). Below this speed, a single rsync can saturate the connection, so there's little reason to use parsync. In fact, the overhead of testing for the existence of and starting more rsyncs tends to worsen its performance on small transfers to slightly less than rsync alone.

Beyond this introduction, parsync's internal help is about all you'll need to figure out how to use it; below is what you'll see when you type parsync -h . There are still edge cases where parsync will fail or behave oddly, especially with small data transfers, so I'd be happy to hear of such misbehavior or suggestions to improve it. Download the complete tarball of parsync, plus the required utilities, here: parsync+utils.tar.gz. Unpack it, move the contents to a dir on your $PATH , chmod it executable, and try it out.

parsync --help
or just
Below is what you should see:

4. parsync help

parsync version 1.67 (Mac compatibility beta) Jan 22, 2017
by Harry Mangalam

parsync is a Perl script that wraps Andrew Tridgell's miraculous 'rsync' to
provide some load balancing and parallel operation across network connections
to increase the amount of bandwidth it can use.

parsync is primarily tested on Linux, but (mostly) works on MaccOSX
as well.

parsync needs to be installed only on the SOURCE end of the
transfer and only works in local SOURCE -> remote TARGET mode
(it won't allow remote local SOURCE <- remote TARGET, emitting an
error and exiting if attempted).

It uses whatever rsync is available on the TARGET.  It uses a number
of Linux-specific utilities so if you're transferring between Linux
and a FreeBSD host, install parsync on the Linux side.

The only native rsync option that parsync uses is '-a' (archive) &
'-s' (respect bizarro characters in filenames).
If you need more, then it's up to you to provide them via
'--rsyncopts'. parsync checks to see if the current system load is
too heavy and tries to throttle the rsyncs during the run by
monitoring and suspending / continuing them as needed.

It uses the very efficient (also Perl-based) kdirstat-cache-writer
from kdirstat to generate lists of files which are summed and then
crudely divided into NP jobs by size.

It appropriates rsync's bandwidth throttle mechanism, using '--maxbw'
as a passthru to rsync's 'bwlimit' option, but divides it by NP so
as to keep the total bw the same as the stated limit.  It monitors and
shows network bandwidth, but can't change the bw allocation mid-job.
It can only suspend rsyncs until the load decreases below the cutoff.
If you suspend parsync (^Z), all rsync children will suspend as well,
regardless of current state.

Unless changed by '--interface', it tried to figure out how to set the
interface to monitor.  The transfer will use whatever interface routing
provides, normally set by the name of the target.  It can also be used for
non-host-based transfers (between mounted filesystems) but the network
bandwidth continues to be (usually pointlessly) shown.

[[NB: Between mounted filesystems, parsync sometimes works very poorly for
reasons still mysterious.  In such cases (monitor with 'ifstat'), use 'cp'
or 'tnc' for the initial data movement and a single
rsync to finalize.  I believe the multiple rsync chatter is interfering with
the transfer.]]

It only works on dirs and files that originate from the current dir (or
specified via "--rootdir").  You cannot include dirs and files from
discontinuous or higher-level dirs.

** the ~/.parsync files **
The ~/.parsync dir contains the cache (*.gz), the chunk files (kds*), and the
time-stamped log files. The cache files can be re-used with '--reusecache'
(which will re-use ALL the cache and chunk files.  The log files are
datestamped and are NOT overwritten.

** Odd characters in names **
parsync will sometimes refuse to transfer some oddly named files, altho
recent versions of rsync allow the '-s' flag (now a parsync default)
which tries to respect names with spaces and properly escaped shell
characters.  Filenames with embedded newlines, DOS EOLs, and other
odd chars will be recorded in the log files in the ~/.parsync dir.

** Because of the crude way that files are chunked, NP may be
adjusted slightly to match the file chunks. ie '--NP 8' -> '--NP 7'.
If so, a warning will be issued and the rest of the transfer will be
automatically adjusted.

[i] = integer number
[f] = floating point number
[s] = "quoted string"
( ) = the default if any

--NP [i] (sqrt(#CPUs)) ...............  number of rsync processes to start
      optimal NP depends on many vars.  Try the default and incr as needed
--startdir [s] (`pwd`)  .. the directory it works relative to. If you omit
                           it, the default is the CURRENT dir. You DO have
                           to specify target dirs.  See the examples below.
--maxbw [i] (unlimited) ..........  in KB/s max bandwidth to use (--bwlimit
       passthru to rsync).  maxbw is the total BW to be used, NOT per rsync.
--maxload [f] (NP+2)  ........ max total system load - if sysload > maxload,
                                               sleeps an rsync proc for 10s
--checkperiod [i] (5) .......... sets the period in seconds between updates
--rsyncopts [s]  ...  options passed to rsync as a quoted string (CAREFUL!)
           this opt triggers a pause before executing to verify the command.
--interface [s]  .............  network interface to /monitor/, not nec use.
      default: `/sbin/route -n | grep "^" | rev | cut -d' ' -f1 | rev`
      above works on most simple hosts, but complex routes will confuse it.
--reusecache  ..........  don't re-read the dirs; re-use the existing caches
--email [s]  .....................  email address to send completion message
                                      (requires working mail system on host)
--barefiles   .....  set to allow rsync of individual files, as oppo to dirs
--nowait  ................  for scripting, sleep for a few s instead of wait
--version  .................................  dumps version string and exits
--help  .........................................................  this help

-- Good example 1 --
% parsync  --maxload=5.5 --NP=4 --startdir='/home/hjm' dir1 dir2 dir3 hjm@remotehost:~/backups

  = "--startdir='/home/hjm'" sets the working dir of this operation to
      '/home/hjm' and dir1 dir2 dir3 are subdirs from '/home/hjm'
  = the target "hjm@remotehost:~/backups" is the same target rsync would use
  = "--NP=4" forks 4 instances of rsync
  = "--maxload=5.5" will start suspending rsync instances when the 5m system
      load gets to 5.5 and then unsuspending them when it goes below it.

  It uses 4 instances to rsync dir1 dir2 dir3 to hjm@remotehost:~/backups

-- Good example 2 --
% parsync --rsyncopts="--ignore-existing" --reusecache  --NP=3
  --barefiles  *.txt   /mount/backups/txt

  =  "--rsyncopts='--ignore-existing'" is an option passed thru to rsync
     telling it not to disturb any existing files in the target directory.
  = "--reusecache" indicates that the filecache shouldn't be re-generated,
    uses the previous filecache in ~/.parsync
  = "--NP=3" for 3 copies of rsync (with no "--maxload", the default is 4)
  = "--barefiles" indicates that it's OK to transfer barefiles instead of
    recursing thru dirs.
  = "/mount/backups/txt" is the target - a local disk mount instead of a network host.

  It uses 3 instances to rsync *.txt from the current dir to "/mount/backups/txt".

-- Error Example 1 --
% pwd
/home/hjm  # executing parsync from here

% parsync --NP4 --compress /usr/local  /media/backupdisk

why this is an error:
  = '--NP4' is not an option (parsync will say "Unknown option: np4")
    It should be '--NP=4'
  = if you were trying to rsync '/usr/local' to '/media/backupdisk',
    it will fail since there is no /home/hjm/usr/local dir to use as
    a source. This will be shown in the log files
    as a spew of "No such file or directory (2)" errors
  = the '--compress' is a native rsync option, not a native parsync option.
    You have to pass it to rsync with "--rsyncopts='--compress'"

The correct version of the above command is:

% parsync --NP=4  --rsyncopts='--compress' --startdir=/usr  local  /media/backupdisk

-- Error Example 2 --
% parsync --start-dir /home/hjm  mooslocal

why this is an error:
  = this command is trying to PULL data from a remote SOURCE to a
    local TARGET.  parsync doesn't support that kind of operation yet.

The correct version of the above command is:

# ssh to hjm@moo, install parsync, then:
% parsync  --startdir=/usr  local  hjm@remote:/home/hjm/mooslocal

[Jun 02, 2018] How to run Linux commands simultaneously with GNU Parallel

Jun 02, 2018 |

Scratching the surface

We've only just scratched the surface of GNU Parallel. I highly recommend you give the official GNU Parallel tutorial a read, and watch this video tutorial series on YouTube, so you can understand the complexities of the tool (of which there are many).

But this will get you started on a path to helping your data center Linux servers use commands with more efficiency.

[Jun 02, 2018] Parallelise rsync using GNU Parallel

Jun 02, 2018 |


Mandar Shinde ,Mar 13, 2015 at 6:51

I have been using a rsync script to synchronize data at one host with the data at another host. The data has numerous small-sized files that contribute to almost 1.2TB.

In order to sync those files, I have been using rsync command as follows:

rsync -avzm --stats --human-readable --include-from proj.lst /data/projects REMOTEHOST:/data/

The contents of proj.lst are as follows:

+ proj1
+ proj1/*
+ proj1/*/*
+ proj1/*/*/*.tar
+ proj1/*/*/*.pdf
+ proj2
+ proj2/*
+ proj2/*/*
+ proj2/*/*/*.tar
+ proj2/*/*/*.pdf
- *

As a test, I picked two of those projects (8.5GB of data) and executed the command above. Being a sequential process, it took 14 minutes 58 seconds to complete. So, for 1.2TB of data it would take several hours.

If I could run multiple rsync processes in parallel (using &, xargs or parallel), it would save time.

I tried the command below with parallel (after cd-ing to the source directory) and it took 12 minutes 37 seconds to execute:

parallel --will-cite -j 5 rsync -avzm --stats --human-readable {} REMOTEHOST:/data/ ::: .

This should have taken a fifth of the time, but it didn't. I think I'm going wrong somewhere.

How can I run multiple rsync processes in order to reduce the execution time?

Ole Tange ,Mar 13, 2015 at 7:25

Are you limited by network bandwidth? Disk iops? Disk bandwidth?

Mandar Shinde ,Mar 13, 2015 at 7:32

If possible, we would want to use 50% of total bandwidth. But parallelising multiple rsyncs is our first priority.

Ole Tange ,Mar 13, 2015 at 7:41

Can you let us know your network bandwidth, disk iops, disk bandwidth, and the bandwidth actually used?

Mandar Shinde ,Mar 13, 2015 at 7:47

In fact, I do not know the above parameters. For the time being, we can neglect the optimization part. Multiple rsyncs in parallel is the primary focus now.

Mandar Shinde ,Apr 11, 2015 at 13:53

The following steps did the job for me:
  1. Run rsync with --dry-run first in order to get the list of files that would be affected.

rsync -avzm --stats --safe-links --ignore-existing --dry-run --human-readable /data/projects REMOTE-HOST:/data/ > /tmp/transfer.log

  2. I fed the output of cat transfer.log to parallel in order to run 5 rsyncs in parallel, as follows:

cat /tmp/transfer.log | parallel --will-cite -j 5 rsync -avzm --relative --stats --safe-links --ignore-existing --human-readable {} REMOTE-HOST:/data/ > result.log

Here, the --relative option ensures that the directory structure of the affected files stays the same at the source and the destination (inside the /data/ directory), so the command must be run in the source folder (in this example, /data/projects).

Sandip Bhattacharya ,Nov 17, 2016 at 21:22

That would do an rsync per file. It would probably be more efficient to split up the whole file list using split and feed those filenames to parallel. Then use rsync's --files-from to get the filenames out of each file and sync them:

rm backups.*
split -l 3000 backup.list backups.
ls backups.* | parallel --line-buffer --verbose -j 5 rsync --progress -av --files-from {} /LOCAL/PARENT/PATH/ REMOTE_HOST:REMOTE_PATH/
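
The chunking idea in this comment can be tried without touching any real data. The following is only a sketch under assumptions: the list contents, the "chunk." prefix and the chunk size of 3 are made up for illustration, and echo is used so that the rsync commands are printed rather than executed.

```shell
#!/bin/sh
# Sketch: split a file list into chunks and print one rsync command per chunk.
# backup.list, the "chunk." prefix and REMOTE_HOST are illustrative names.
workdir=$(mktemp -d)
list="$workdir/backup.list"

# Stand-in file list: 7 relative paths, one per line.
for i in 1 2 3 4 5 6 7; do
    printf 'proj1/file%s.tar\n' "$i"
done > "$list"

# Break the list into chunks of at most 3 lines: chunk.aa, chunk.ab, chunk.ac
split -l 3 "$list" "$workdir/chunk."

# Each chunk becomes one rsync invocation; echo keeps this a dry run.
for chunk in "$workdir"/chunk.*; do
    echo rsync -a --files-from="$chunk" /LOCAL/PARENT/PATH/ REMOTE_HOST:REMOTE_PATH/
done > "$workdir/commands.txt"

cat "$workdir/commands.txt"
```

With real paths in place, the generated command lines could then be handed to a job runner such as GNU parallel, which executes each input line as a command when no command is given.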

Mike D ,Sep 19, 2017 at 16:42

How does the second rsync command handle the lines in result.log that are not files? i.e. receiving file list ... done created directory /data/ .

Cheetah ,Oct 12, 2017 at 5:31

On newer versions of rsync (3.1.0+), you can use --info=name in place of -v, and you'll get just the names of the files and directories. You may want to pass --protect-args to the 'inner' transferring rsync too if any files might have spaces or shell metacharacters in them.

Mikhail ,Apr 10, 2017 at 3:28

I would strongly discourage anybody from using the accepted answer; a better solution is to crawl the top-level directory and launch a proportional number of rsync operations.

I have a large zfs volume and my source was a cifs mount. Both are linked with 10G, and in some benchmarks can saturate the link. Performance was evaluated using zpool iostat 1 .

The source drive was mounted like:

mount -t cifs -o username=,password= //static_ip/70tb /mnt/Datahoarder_Mount/ -o vers=3.0

Using a single rsync process:

rsync -h -v -r -P -t /mnt/Datahoarder_Mount/ /StoragePod

the io meter reads:

StoragePod  30.0T   144T      0  1.61K      0   130M
StoragePod  30.0T   144T      0  1.61K      0   130M
StoragePod  30.0T   144T      0  1.62K      0   130M

In synthetic benchmarks (crystal disk), sequential write performance approaches 900 MB/s, which means the link is saturated. 130MB/s is not very good; it is the difference between waiting a weekend and waiting two weeks.

So, I built the file list and tried to run the sync again (I have a 64 core machine):

cat /home/misha/Desktop/rsync_logs_syncs/Datahoarder_Mount.log | parallel --will-cite -j 16 rsync -avzm --relative --stats --safe-links --size-only --human-readable {} /StoragePod/ > /home/misha/Desktop/rsync_logs_syncs/Datahoarder_Mount_result.log

and it had the same performance!

StoragePod  29.9T   144T      0  1.63K      0   130M
StoragePod  29.9T   144T      0  1.62K      0   130M
StoragePod  29.9T   144T      0  1.56K      0   129M

As an alternative I simply ran rsync on the root folders:

rsync -h -v -r -P -t /mnt/Datahoarder_Mount/Mikhail/Marcello_zinc_bone /StoragePod/Marcello_zinc_bone
rsync -h -v -r -P -t /mnt/Datahoarder_Mount/Mikhail/fibroblast_growth /StoragePod/fibroblast_growth
rsync -h -v -r -P -t /mnt/Datahoarder_Mount/Mikhail/QDIC /StoragePod/QDIC
rsync -h -v -r -P -t /mnt/Datahoarder_Mount/Mikhail/sexy_dps_cell /StoragePod/sexy_dps_cell

This actually boosted performance:

StoragePod  30.1T   144T     13  3.66K   112K   343M
StoragePod  30.1T   144T     24  5.11K   184K   469M
StoragePod  30.1T   144T     25  4.30K   196K   373M

In conclusion, as @Sandip Bhattacharya brought up, write a small script to get the directories and parallel that. Alternatively, pass a file list to rsync. But don't create new instances for each file.
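
A "small script to get the directories and parallel that" might look roughly like the sketch below. It is not the author's script: the DRY_RUN switch, the limit of four concurrent jobs and the directory names are assumptions made for illustration, and xargs -P stands in for GNU parallel so that nothing extra needs to be installed.

```shell
#!/bin/sh
# Sketch: one rsync job per top-level directory, at most 4 at a time.
# DRY_RUN, the job count and the directory names are illustrative assumptions.
src=$(mktemp -d)
dest=$(mktemp -d)
mkdir "$src/projA" "$src/projB" "$src/projC"

# With DRY_RUN=1 (the default here) the commands are printed, not executed.
if [ "${DRY_RUN:-1}" = 1 ]; then runner=echo; else runner=; fi

# -mindepth/-maxdepth restrict find to the immediate subdirectories;
# -print0 with xargs -0 keeps names containing spaces intact;
# -P 4 runs up to four jobs concurrently.
find "$src" -mindepth 1 -maxdepth 1 -type d -print0 |
    xargs -0 -P 4 -I{} $runner rsync -a {} "$dest/" > "$src.jobs"

# Job output order is not deterministic, so sort it for display.
sort "$src.jobs"
```

As the answer notes, this only helps when the data is spread over several directories of comparable size; one huge directory would still be handled by a single rsync.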

Julien Palard ,May 25, 2016 at 14:15

I personally use this simple one:
ls -1 | parallel rsync -a {} /destination/directory/

This is only useful when you have more than a few directories that are not nearly empty; otherwise almost every rsync terminates quickly and the last one ends up doing all the work alone.

Ole Tange ,Mar 13, 2015 at 7:25

A tested way to do the parallelized rsync is:

rsync is a great tool, but sometimes it will not fill up the available bandwidth. This is often a problem when copying several big files over high speed connections.

The following will start one rsync per big file in src-dir to dest-dir on the server fooserver:

cd src-dir; find . -type f -size +100000 | \
parallel -v ssh fooserver mkdir -p /dest-dir/{//}\; \
  rsync -s -Havessh {} fooserver:/dest-dir/{}

The directories created may end up with wrong permissions and smaller files are not being transferred. To fix those run rsync a final time:

rsync -Havessh src-dir/ fooserver:/dest-dir/

If you are unable to push data, but need to pull them and the files are called digits.png (e.g. 000000.png) you might be able to do:

seq -w 0 99 | parallel rsync -Havessh fooserver:src/*{}.png destdir/

Mandar Shinde ,Mar 13, 2015 at 7:34

Is there any alternative that avoids find?

Ole Tange ,Mar 17, 2015 at 9:20

Limit the -maxdepth of find.

Mandar Shinde ,Apr 10, 2015 at 3:47

If I use the --dry-run option in rsync, I would have a list of files that would be transferred. Can I provide that file list to parallel in order to parallelise the process?

Ole Tange ,Apr 10, 2015 at 5:51

cat files | parallel -v ssh fooserver mkdir -p /dest-dir/{//}\; rsync -s -Havessh {} fooserver:/dest-dir/{}

Mandar Shinde ,Apr 10, 2015 at 9:49

Can you please explain the mkdir -p /dest-dir/{//}\; part? Especially the {//} thing is a bit confusing.
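
The question is left unanswered in the thread. In GNU parallel, {//} is the replacement string for the directory part of the input line (what dirname would return) and {/} is the file name part. The sketch below reproduces both with dirname and basename so that it runs without parallel installed; the sample path is made up.

```shell
#!/bin/sh
# Sketch: what GNU parallel's path-related replacement strings expand to,
# reproduced with dirname/basename. The sample path is illustrative.
input='./photos/2015/big.tar'

dir=$(dirname "$input")      # same value as {//}
base=$(basename "$input")    # same value as {/}

echo "{//} -> $dir"
echo "{/}  -> $base"

# For this input, the command line from the answer would expand to:
echo "ssh fooserver mkdir -p /dest-dir/$dir; rsync -s -Havessh $input fooserver:/dest-dir/$input"
```

So the mkdir -p part creates the destination directory for each file on the server before rsync copies the file into it.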


For multi destination syncs, I am using
parallel rsync -avi /path/to/source ::: host1: host2: host3:

Hint: All ssh connections are established with public keys in ~/.ssh/authorized_keys

[Jun 02, 2018] Parallelizing rsync

Jun 02, 2018 |

rsync is a great tool, but sometimes it will not fill up the available bandwidth. This is often a problem when copying several big files over high speed connections.

The following will start one rsync per big file in src-dir to dest-dir on the server fooserver :

  cd src-dir; find . -type f -size +100000 | \
    parallel -v ssh fooserver mkdir -p /dest-dir/{//}\; \
      rsync -s -Havessh {} fooserver:/dest-dir/{}

The dirs created may end up with wrong permissions and smaller files are not being transferred. To fix those run rsync a final time:

  rsync -Havessh src-dir/ fooserver:/dest-dir/

If you are unable to push data, but need to pull them and the files are called digits.png (e.g. 000000.png) you might be able to do:

  seq -w 0 99 | parallel rsync -Havessh fooserver:src/*{}.png destdir/

[May 28, 2018] TIP: 7-zip's XZ compression on a multiprocessor system is often faster and compresses better than gzip (linuxadmin)

May 28, 2018 |

TIP: 7-zip's XZ compression on a multiprocessor system is often faster and compresses better than gzip ( self.linuxadmin )

kristopolous 4 years ago (4 children)

I did this a while back also. Here's a graph:

X axis is compression level (min to max) Y is the size of the file that was compressed

I forget what the file was.

TyIzaeL 4 years ago (3 children)
That is a great start (probably better than what I am doing). Do you have time comparisons as well?
kristopolous 4 years ago (1 child) there's the post
TyIzaeL 4 years ago (0 children)
Very nice. I might work on something similar to this soon next time I'm bored.
kristopolous 4 years ago (0 children)
TyIzaeL 4 years ago (0 children)
That's a great point to consider among all of this. Compression is always a tradeoff between how much CPU and memory you want to throw at something and how much space you would like to save. In my case, hammering the server for 3 minutes in order to take a backup is necessary because the uncompressed data would bottleneck at the LAN speed.
randomfrequency 4 years ago (0 children)
You might want to play with 'pigz' - it's gzip, multi-threaded. You can 'pv' to restrict the rate of the output, and it accepts signals to control the rate limiting.
rrohbeck 4 years ago (1 child)
Also pbzip2 -1 to -9 and pigz -1 to -9.

With -9 you can surely make backup CPU bound. I've given up on compression though: rsync is much faster than straight backup and I use btrfs compression/deduplication/snapshotting on the backup server.

TyIzaeL 4 years ago (0 children)
pigz -9 is already on the chart as pigz --best. I'm working on adding the others though.
TyIzaeL 4 years ago (0 children)
I'm running gzip, bzip2, and pbzip2 now (not at the same time, of course) and will add results soon. But in my case the compression keeps my db dumps from being IO bound by the 100mbit LAN connection. For example, lzop in the results above puts out 6041.632 megabits in 53.82 seconds for a total compressed data rate of 112 megabits per second, which would make the transfer IO bound. Whereas the pigz example puts out 3339.872 megabits in 81.892 seconds, for an output data rate of 40.8 megabits per second. This is just on my dual-core box with a static file, on the 8-core server I see the transfer takes a total of about three minutes. It's probably being limited more by the rate at which the MySQL server can dump text from the database, but if there was no compression it'd be limited by the LAN speed. If we were dumping 2.7GB over the LAN directly, we would need 122mbit/s of real throughput to complete it in three minutes.
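
The rates quoted in this comment can be rechecked with a few lines of awk. This is only a worked version of the arithmetic above; the one assumption is that the final figure treats 1 GB as 1024 MB, which is what reproduces the roughly 122 Mbit/s quoted.

```shell
#!/bin/sh
# rate MEGABITS SECONDS -> average megabits per second, one decimal place.
rate() {
    awk -v mb="$1" -v s="$2" 'BEGIN { printf "%.1f\n", mb / s }'
}

lzop_rate=$(rate 6041.632 53.82)    # lzop figures from the comment
pigz_rate=$(rate 3339.872 81.892)   # pigz figures from the comment

# 2.7 GB uncompressed in three minutes, assuming 1 GB = 1024 MB, 8 bits per byte.
raw_rate=$(awk 'BEGIN { printf "%.1f\n", 2.7 * 1024 * 8 / 180 }')

echo "lzop: $lzop_rate Mbit/s, pigz: $pigz_rate Mbit/s, uncompressed: $raw_rate Mbit/s"
```

Both lzop and the uncompressed dump come out above what a 100 Mbit link can carry, while pigz stays well below it, which is the point the comment makes.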
Shammyhealz 4 years ago (2 children)
I thought the best compression was supposed to be LZMA? Which is what the .7z archives are. I have no idea of the relative speed of LZMA and gzip
TyIzaeL 4 years ago (1 child)
xz archives use the LZMA2 format (which is also used in 7z archives). LZMA2 speed seems to range from a little slower than gzip to much slower than bzip2, but results in better compression all around.
primitive_screwhead 4 years ago (0 children)
However LZMA2 decompression speed is generally much faster than bzip2, in my experience, though not as fast as gzip. This is why we use it, as we decompress our data much more often than we compress it, and the space saving/decompression speed tradeoff is much more favorable for us than either gzip or bzip2.
crustang 4 years ago (2 children)
I mentioned how 7zip was superior to all other zip programs in /r/osx a few days ago and my comment was buried in favor of the osx circlejerk .. it feels good seeing this data.

I love 7zip

RTFMorGTFO 4 years ago (1 child)
Why... Tar supports xz, lzma, lzop, lzip, and any other kernel based compression algorithms. It's also much more likely to be preinstalled on your given distro.
crustang 4 years ago (0 children)
I've used 7zip at my old job for a backup of our business software's database. We needed speed, high level of compression, and encryption. Portability wasn't high on the list since only a handful of machines needed access to the data. All machines were multi-processor and 7zip gave us the best of everything given the requirements. I haven't really looked at anything deeply - including tar, which my old boss didn't care for.

[May 28, 2018] RPM RedHat EL 6 p7zip 9.20.1 x86_64 rpm

May 28, 2018 |
p7zip rpm build for : RedHat EL 6 . For other distributions click p7zip .
Name : p7zip
Version : 9.20.1 Vendor : Dag Apt Repository, http://dag_wieers_com/apt/
Release : 1.el6.rf Date : 2011-04-20 15:23:34
Group : Applications/Archiving Source RPM : p7zip-9.20.1-1.el6.rf.src.rpm
Size : 14.84 MB
Packager : Dag Wieers < dag_wieers_com>
Summary : Very high compression ratio file archiver
Description :
p7zip is a port of 7za.exe for Unix. 7-Zip is a file archiver with a very high
compression ratio. The original version can be found at

RPM found in directory: /mirror/

Download p7zip-9.20.1-1.el6.rf.x86_64.rpm

[May 28, 2018] TIL pigz exists A parallel implementation of gzip for modern multi-processor, multi-core machines linux

May 28, 2018 |

TIL pigz exists "A parallel implementation of gzip for modern multi-processor, multi-core machines" ( self.linux )

submitted 3 years ago by msiekkinen

tangre 3 years ago (74 children)

Why wouldn't gzip be updated with this functionality instead? Is there a point in keeping it separate?
ilikerackmounts 3 years ago (59 children)
There are certain file sizes where pigz makes no difference, and in general you need at least 2 cores to feel the benefits; there are quite a few reasons. That being said, pigz and its bzip counterpart pbzip2 can be symlinked in place when emerged with gentoo and using the "symlink" use flag.
adam@eggsbenedict ~ $ eix pigz
[I] app-arch/pigz
   Available versions:  2.2.5 2.3 2.3.1 (~)2.3.1-r1 {static symlink |test}
   Installed versions:  2.3.1-r1(02:06:01 01/25/14)(symlink -static -|test)
   Description:         A parallel implementation of gzip
msiekkinen 3 years ago (38 children)

in general you need at least 2 cores to feel the benefits

Is it even possible to buy any single core cpus outside of some kind of specialized embedded system these days?

exdirrk 3 years ago (5 children)
tw4 3 years ago (2 children)
Yes, but nevertheless it's possible to allocate only one.
too_many_secrets 3 years ago (0 children)

Giving a VM more than one CPU is quite a rare circumstance.

Depends on your circumstances. It's rare that we have any VMs with a single CPU, but we have thousands of servers and a lot of things going on.

FakingItEveryDay 3 years ago (0 children)
You can, but often shouldn't. I can only speak for vmware here, other hypervisors may work differently. Generally you want to size your VMware VMs so that they are around 80% cpu utilization. When any VM with multiple cores needs compute power the hypervisor will make it wait until it can free that number of CPUs, even if the task in the VM only needs one core. This makes the multi-core VM slower by having to wait longer to do its work, and it also makes other VMs on the hypervisor slower, as they must all wait for it to finish before they can get a core allocated.

[May 28, 2018] Solaris: Parallel Compression/Decompression

Notable quotes:
"... the following prstat, vmstat outputs show that gzip is compressing the ..."
"... tar file using a single thread – hence low CPU utilization. ..."
"... wall clock time is 25s compared to gzip's 3m 27s ..."
"... the following prstat, vmstat outputs show that pigz is compressing the ..."
"... tar file using many threads – hence busy system with high CPU utilization. ..."
"... shows that the pigz compressed file is ..."
"... compatible with gzip/gunzip ..."
"... compare gzip's 52s decompression time with pigz's 18s ..."
May 28, 2018 |

Posted on January 26, 2015 by Sandeep Shenoy

This topic is not Solaris specific, but certainly helps Solaris users who are frustrated with the single-threaded implementation of all officially supported compression tools such as compress, gzip, zip.

pigz (pig-zee) is a parallel implementation of gzip that is well suited to the latest multi-processor, multi-core machines. By default, pigz breaks up the input into multiple chunks of size 128 KB, and compresses each chunk in parallel with the help of light-weight threads. The number of compress threads is set by default to the number of online processors. The chunk size and the number of threads are configurable. Compressed files can be restored to their original form using the -d option of the pigz or gzip tools. As per the man page, decompression is not parallelized out of the box, but may show some improvement compared to the existing old tools.

The following example demonstrates the advantage of using pigz over gzip in compressing and decompressing a large file.
Original file, and the target hardware.

$ ls -lh PT8.53.04.tar
-rw-r--r-- 1 psft dba 4.8G Feb 28 14:03 PT8.53.04.tar

$ psrinfo -pv
The physical processor has 8 cores and 64 virtual processors (0-63)
The core has 8 virtual processors (0-7)
The core has 8 virtual processors (56-63)
SPARC-T5 (chipid 0, clock 3600 MHz)

gzip compression.

$ time gzip --fast PT8.53.04.tar
real 3m40.125s
user 3m27.105s
sys 0m13.008s

$ ls -lh PT8.53*
-rw-r--r-- 1 psft dba 3.1G Feb 28 14:03 PT8.53.04.tar.gz

/* the following prstat, vmstat outputs show that gzip is compressing the tar file using a single thread – hence low CPU utilization. */

$ prstat -p 42510
PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
42510 psft 2616K 2200K cpu16 10 0 0:01:00 1.5% gzip/1

$ prstat -m -p 42510
PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/NLWP
42510 psft 95 4.6 0.0 0.0 0.0 0.0 0.0 0.0 0 35 7K 0 gzip/1

$ vmstat 2
r b w swap free re mf pi po fr de sr s0 s1 s2 s3 in sy cs us sy id
0 0 0 776242104 917016008 0 7 0 0 0 0 0 0 0 52 52 3286 2606 2178 2 0 98
1 0 0 776242104 916987888 0 14 0 0 0 0 0 0 0 0 0 3851 3359 2978 2 1 97
0 0 0 776242104 916962440 0 0 0 0 0 0 0 0 0 0 0 3184 1687 2023 1 0 98
0 0 0 775971768 916930720 0 0 0 0 0 0 0 0 0 39 37 3392 1819 2210 2 0 98
0 0 0 775971768 916898016 0 0 0 0 0 0 0 0 0 0 0 3452 1861 2106 2 0 98

pigz compression.

$ time ./pigz PT8.53.04.tar
real 0m25.111s <== wall clock time is 25s compared to gzip's 3m 27s
user 17m18.398s
sys 0m37.718s

/* the following prstat, vmstat outputs show that pigz is compressing the tar file using many threads – hence busy system with high CPU utilization. */

$ prstat -p 49734
PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
49734 psft 59M 58M sleep 11 0 0:12:58 38% pigz/66

$ vmstat 2
kthr memory page disk faults cpu
r b w swap free re mf pi po fr de sr s0 s1 s2 s3 in sy cs us sy id
0 0 0 778097840 919076008 6 113 0 0 0 0 0 0 0 40 36 39330 45797 74148 61 4 35
0 0 0 777956280 918841720 0 1 0 0 0 0 0 0 0 0 0 38752 43292 71411 64 4 32
0 0 0 777490336 918334176 0 3 0 0 0 0 0 0 0 17 15 46553 53350 86840 60 4 35
1 0 0 777274072 918141936 0 1 0 0 0 0 0 0 0 39 34 16122 20202 28319 88 4 9
1 0 0 777138800 917917376 0 0 0 0 0 0 0 0 0 3 3 46597 51005 86673 56 5 39

$ ls -lh PT8.53.04.tar.gz
-rw-r--r-- 1 psft dba 3.0G Feb 28 14:03 PT8.53.04.tar.gz

$ gunzip PT8.53.04.tar.gz <== shows that the pigz compressed file is compatible with gzip/gunzip

$ ls -lh PT8.53*
-rw-r--r-- 1 psft dba 4.8G Feb 28 14:03 PT8.53.04.tar

Decompression.

$ time ./pigz -d PT8.53.04.tar.gz
real 0m18.068s
user 0m22.437s
sys 0m12.857s

$ time gzip -d PT8.53.04.tar.gz
real 0m52.806s <== compare gzip's 52s decompression time with pigz's 18s
user 0m42.068s
sys 0m10.736s

$ ls -lh PT8.53.04.tar
-rw-r--r-- 1 psft dba 4.8G Feb 28 14:03 PT8.53.04.tar
Of course, other tools such as Parallel BZIP2 (PBZIP2), a parallel implementation of the bzip2 tool, are worth a try too. The idea here is to highlight the fact that there are better tools out there to get the job done quickly compared to the existing/old tools that are bundled with the operating system distribution.
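
Because pigz output is gzip-compatible, a script can use pigz where it is installed and fall back to gzip elsewhere. The following is a minimal sketch of that pattern, not something taken from the article: the thread count of 4, the 128 KiB block size and the sample file are illustrative choices.

```shell
#!/bin/sh
# Sketch: use pigz when it is installed, otherwise fall back to gzip.
# -p sets the number of compression threads, -b the block size in KiB.
if command -v pigz >/dev/null 2>&1; then
    compress="pigz -p 4 -b 128"
else
    compress="gzip"
fi

workdir=$(mktemp -d)
seq 1 20000 > "$workdir/sample.txt"

# Compress to a separate file so the original stays available for comparison.
$compress -c "$workdir/sample.txt" > "$workdir/sample.txt.gz"

# Decompress with plain gzip to show that the formats are compatible.
gzip -dc "$workdir/sample.txt.gz" > "$workdir/roundtrip.txt"

cmp "$workdir/sample.txt" "$workdir/roundtrip.txt" && echo "round trip OK (compressor: $compress)"
```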

[Apr 22, 2018] Happy Sysadmin Appreciation Day 2016

Apr 22, 2018 |

Necessity is frequently the mother of invention. I knew very little about BASH scripting but that was about to change rapidly. Working with the existing script and using online help forums, search engines, and some printed documentation, I set up a Linux network-attached storage computer running Fedora Core. I learned how to create an SSH keypair and configure that along with rsync to move the backup file from the email server to the storage server. That worked well for a few days until I noticed that the storage server's disk space was rapidly disappearing. What was I going to do?

That's when I learned more about Bash scripting. I modified my rsync command to delete backed up files older than ten days. In both cases I learned that a little knowledge can be a dangerous thing, but in each case my experience and confidence as a Linux user and system administrator grew, and because of that I functioned as a resource for others. On the plus side, we soon realized that the disk to disk backup system was superior to tape when it came to restoring email files. In the long run it was a win but there was a lot of uncertainty and anxiety along the way.

[Apr 04, 2018] The gzip Recovery Toolkit

Apr 04, 2018 |

So you thought you had your files backed up - until it came time to restore. Then you found out that you had bad sectors and you've lost almost everything because gzip craps out 10% of the way through your archive. The gzip Recovery Toolkit has a program - gzrecover - that attempts to skip over bad data in a gzip archive. This saved me from exactly the above situation. Hopefully it will help you as well.

I'm very eager for feedback on this program. If you download and try it, I'd appreciate an email letting me know what your results were. My email is . Thanks.


99% of "corrupted" gzip archives are caused by transferring the file via FTP in ASCII mode instead of binary mode. Please re-transfer the file in the correct mode first before attempting to recover from a file you believe is corrupted.

Disclaimer and Warning

This program is provided AS IS with absolutely NO WARRANTY. It is not guaranteed to recover anything from your file, nor is what it does recover guaranteed to be good data. The bigger your file, the more likely that something will be extracted from it. Also keep in mind that this program gets faked out and is likely to "recover" some bad data. Everything should be manually verified.

Downloading and Installing

Note that version 0.8 contains major bug fixes and improvements. See the ChangeLog for details. Upgrading is recommended. The old version is provided in the event you run into troubles with the new release.

You need the following packages:

First, build and install zlib if necessary. Next, unpack the gzrt sources. Then cd to the gzrt directory and build the gzrecover program by typing make . Install manually by copying to the directory of your choice.


Run gzrecover on a corrupted .gz file. If you leave the filename blank, gzrecover will read from the standard input. Anything that can be read from the file will be written to a file with the same name, but with a .recovered appended (any .gz is stripped). You can override this with the -o option. The default filename when reading from the standard input is "stdin.recovered". To write recovered data to the standard output, use the -p option. (Note that -p and -o cannot be used together).

To get a verbose readout of exactly where gzrecover is finding bad bytes, use the -v option to enable verbose mode. This will probably overflow your screen with text, so it is best to redirect the stderr stream to a file. Once gzrecover has finished, you will need to manually verify any data recovered, as it is quite likely that your output file is corrupt and has some garbage data in it. Note that gzrecover will take longer than regular gunzip. The more corrupt your data, the longer it takes. If your archive is a tarball, read on.

For tarballs, the tar program will choke because GNU tar cannot handle errors in the file format. Fortunately, GNU cpio (tested at version 2.6 or higher) handles corrupted files out of the box.

Here's an example:

$ ls *.gz
$ gzrecover my-corrupted-backup.tar.gz
$ ls *.recovered
$ cpio -F my-corrupted-backup.tar.recovered -i -v

Note that newer versions of cpio can spew voluminous error messages to your terminal. You may want to redirect the stderr stream to /dev/null. Also, cpio might take quite a long while to run.
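
Before reaching for gzrecover it may be worth confirming that an archive really is damaged, for example after re-transferring it in binary mode. gzip's own -t option tests integrity. The sketch below is an illustration rather than part of the toolkit: the file names are made up, and the "damaged" archive is produced simply by truncating a good one.

```shell
#!/bin/sh
# Sketch: test archive integrity with gzip -t before attempting recovery.
# File names are illustrative; the damaged file is a truncated copy.
workdir=$(mktemp -d)
seq 1 5000 | gzip -c > "$workdir/good.gz"
head -c 200 "$workdir/good.gz" > "$workdir/truncated.gz"

# check FILE -> prints "intact" or "damaged" based on gzip -t's exit status.
check() {
    if gzip -t "$1" 2>/dev/null; then
        echo "intact"
    else
        echo "damaged"
    fi
}

good_status=$(check "$workdir/good.gz")
bad_status=$(check "$workdir/truncated.gz")
echo "good.gz: $good_status, truncated.gz: $bad_status"
```

Only archives reported as damaged would then be candidates for gzrecover.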


The gzip Recovery Toolkit v0.8
Copyright (c) 2002-2013 Aaron M. Renn ( )

[Jan 14, 2018] How to remount filesystem in read write mode under Linux

Jan 14, 2018 |

Most of the time, on newly created file systems or NFS filesystems, we see an error like the one below:

root@kerneltalks # touch file1
touch: cannot touch 'file1': Read-only file system

This is because the file system is mounted read-only. In such a scenario you have to mount it in read-write mode. Before that we will see how to check if a file system is mounted in read-only mode, and then we will get to how to remount it as a read-write filesystem.

How to check if file system is read only

To confirm file system is mounted in read only mode use below command –

# cat /proc/mounts | grep datastore
/dev/xvdf /datastore ext3 ro,seclabel,relatime,data=ordered 0 0

Grep for your mount point in /proc/mounts and observe the fourth column, which shows all options used by the mounted file system. Here ro denotes that the file system is mounted read-only.

You can also get these details using mount -v command

root@kerneltalks # mount -v | grep datastore
/dev/xvdf on /datastore type ext3 (ro,relatime,seclabel,data=ordered)

In this output, file system options are listed in parentheses in the last column.
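
The same check can be scripted. The sketch below is a hedged example rather than part of the original article: is_readonly is a hypothetical helper name, and it is run here against a small sample file in /proc/mounts format so that it works on any system.

```shell
#!/bin/sh
# is_readonly MOUNTPOINT [MOUNTS_FILE]
# Succeeds when the mount options (4th field) of MOUNTPOINT include "ro".
# MOUNTS_FILE defaults to /proc/mounts.
is_readonly() {
    awk -v mp="$1" '
        $2 == mp {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++) if (opts[i] == "ro") found = 1
        }
        END { exit found ? 0 : 1 }
    ' "${2:-/proc/mounts}"
}

# Sample data in /proc/mounts format (illustrative).
sample=$(mktemp)
cat > "$sample" <<'EOF'
/dev/xvdf /datastore ext3 ro,seclabel,relatime,data=ordered 0 0
/dev/xvda2 / ext4 rw,relatime 0 0
EOF

if is_readonly /datastore "$sample"; then
    echo "/datastore is read-only"
fi
```

On a live Linux system the second argument can be omitted, so that the function reads /proc/mounts directly.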

Re-mount file system in read-write mode

To remount file system in read-write mode use below command –

root@kerneltalks # mount -o remount,rw /datastore
root@kerneltalks # mount -v | grep datastore
/dev/xvdf on /datastore type ext3 (rw,relatime,seclabel,data=ordered)

Observe that after remounting, option ro changed to rw. Now the file system is mounted read-write and you can write files to it.

Note : It is recommended to fsck the file system before remounting it.

You can check file system by running fsck on its volume.

root@kerneltalks # df -h /datastore
Filesystem Size Used Avail Use% Mounted on
/dev/xvda2 10G 881M 9.2G 9% /

root@kerneltalks # fsck /dev/xvdf
fsck from util-linux 2.23.2
e2fsck 1.42.9 (28-Dec-2013)
/dev/xvdf: clean, 12/655360 files, 79696/2621440 blocks

Sometimes corrections need to be made to the file system that require a reboot, to make sure no processes are accessing the file system.

[Jan 14, 2018] Linux yes Command Tutorial for Beginners (with Examples)

Jan 14, 2018 |

You can see that the user has to type 'y' for each query. It's in situations like these that yes can help. For the above scenario specifically, you can use yes in the following way:

yes | rm -ri test

Q3. Is there any use of yes when it's used alone?

Yes, there's at least one use: to see how well a computer system handles high load. The reason is that a single yes process keeps one processor 100% busy. To apply this test on a system with multiple processors, you need to run one yes process per processor.
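The "one yes per processor" idea can be sketched as a small function. This is an illustration under assumptions, not something from the tutorial: the name burn_cpus is invented, and the processor count is taken from getconf _NPROCESSORS_ONLN, which is available on Linux and most other Unix systems.

```shell
#!/bin/sh
# burn_cpus [SECONDS]   (hypothetical helper)
# Start one `yes` per online processor, let them run, then stop them.
burn_cpus() {
    seconds=${1:-10}
    cpus=$(getconf _NPROCESSORS_ONLN)
    pids=""
    i=0
    while [ "$i" -lt "$cpus" ]; do
        yes >/dev/null &          # each yes keeps one processor busy
        pids="$pids $!"
        i=$((i + 1))
    done
    sleep "$seconds"
    kill $pids                    # unquoted on purpose: one argument per PID
    wait $pids 2>/dev/null        # reap the children
    return 0
}
```

Running burn_cpus 30 in one terminal while watching top or sar in another shows how the system behaves with every processor loaded.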

[Dec 09, 2017] How to rsync only a specific list of files - Stack Overflow

Notable quotes:
"... The filenames that are read from the FILE are all relative to the source dir ..."
Dec 09, 2017 |

ash, May 11, 2015 at 20:05

There is a flag --files-from that does exactly what you want. From man rsync :

Using this option allows you to specify the exact list of files to transfer (as read from the specified FILE or - for standard input). It also tweaks the default behavior of rsync to make transferring just the specified files and directories easier:

The filenames that are read from the FILE are all relative to the source dir -- any leading slashes are removed and no ".." references are allowed to go higher than the source dir. For example, take this command:

rsync -a --files-from=/tmp/foo /usr remote:/backup

If /tmp/foo contains the string "bin" (or even "/bin"), the /usr/bin directory will be created as /backup/bin on the remote host. If it contains "bin/" (note the trailing slash), the immediate contents of the directory would also be sent (without needing to be explicitly mentioned in the file -- this began in version 2.6.4). In both cases, if the -r option was enabled, that dir's entire hierarchy would also be transferred (keep in mind that -r needs to be specified explicitly with --files-from, since it is not implied by -a). Also note that the effect of the (enabled by default) --relative option is to duplicate only the path info that is read from the file -- it does not force the duplication of the source-spec path (/usr in this case).

In addition, the --files-from file can be read from the remote host instead of the local host if you specify a "host:" in front of the file (the host must match one end of the transfer). As a short-cut, you can specify just a prefix of ":" to mean "use the remote end of the transfer". For example:

rsync -a --files-from=:/path/file-list src:/ /tmp/copy

This would copy all the files specified in the /path/file-list file that was located on the remote "src" host.

If the --iconv and --protect-args options are specified and the --files-from filenames are being sent from one host to another, the filenames will be translated from the sending host's charset to the receiving host's charset.

NOTE: sorting the list of files in the --files-from input helps rsync to be more efficient, as it will avoid re-visiting the path elements that are shared between adjacent entries. If the input is not sorted, some path elements (implied directories) may end up being scanned multiple times, and rsync will eventually unduplicate them after they get turned into file-list elements.

Nicolas Mattia, Feb 11, 2016 at 11:06

Note that you still have to specify the directory where the files listed are located, for instance: rsync -av --files-from=file-list . target/ for copying files from the current dir.

ash, Feb 12, 2016 at 2:25

Yes, and to reiterate: The filenames that are read from the FILE are all relative to the source dir.

Michael, Nov 2, 2016 at 0:09

If the files-from file has anything starting with .. rsync appears to ignore the .. giving me an error like rsync: link_stat "/home/michael/test/subdir/test.txt" failed: No such file or directory (in this case running from the "test" dir and trying to specify "../subdir/test.txt", which does exist).


With --files-from, use / as the source directory if you want to keep the absolute paths intact. So your command would become something like below:

rsync -av --files-from=/path/to/file / /tmp/

This is useful when there are a large number of files and you want to copy all of them to some path. You would find the files and write the output to a file like below:

find /var -name '*.log' > file
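Putting the pieces of this thread together (paths relative to the source dir, a sorted list, and "-" for standard input), a possible sketch looks like the following. The function names build_list and copy_listed are hypothetical, and file names containing newlines are not handled.

```shell
#!/bin/sh
# build_list SRC PATTERN   (hypothetical helper)
# Print regular files under SRC whose names match PATTERN, as paths
# relative to SRC (leading "./" stripped), sorted so that rsync does not
# revisit shared path elements.
build_list() {
    ( cd "$1" && find . -type f -name "$2" | sed 's|^\./||' | sort )
}

# copy_listed SRC DEST   (hypothetical helper)
# Read the file list from standard input ("-") and copy it with rsync.
copy_listed() {
    rsync -av --files-from=- "$1" "$2"
}
```

A call would then look like: build_list /var '*.log' | copy_listed /var /tmp/copy/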

[Nov 13, 2017] 20 Sed (Stream Editor) Command Examples for Linux Users

Nov 13, 2017 |


by Pradeep Kumar · Published November 9, 2017 · Updated November 9, 2017

The sed command, or Stream Editor, is a very powerful utility offered by Linux/Unix systems. It is mainly used for text substitution (find & replace), but it can also perform other text manipulations like insertion, deletion, search, etc. With sed, we can edit complete files without actually having to open them. Sed also supports regular expressions, which makes it an even more powerful text manipulation tool.

In this article, we will learn to use the sed command with the help of some examples. The basic syntax for using the sed command is,

sed OPTIONS... [SCRIPT] [INPUTFILE...]


Now let's see some examples.

Example :1) Displaying partial text of a file

With sed, we can view only some part of a file rather than seeing whole file. To see some lines of the file, use the following command,

[linuxtechi@localhost ~]$ sed -n 22,29p testfile.txt

Here, the option '-n' suppresses the default printing of every line & the command 'p' prints only lines 22 to 29.

Example :2) Display all except some lines

To display all content of a file except for some portion, use the following command,

[linuxtechi@localhost ~]$ sed 22,29d testfile.txt

Option 'd' will remove the mentioned lines from output.

Example :3) Display every 3rd line starting with Nth line

To display every 3rd line starting with line number 2 (or any other line), use the following command. The first~step address form is a GNU sed extension.

[linuxtechi@localhost ~]$ sed -n '2~3p' file.txt
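A quick way to see which lines a first~step address selects is to feed sed numbered input. The sketch below assumes GNU sed for the "~" form and adds an awk line, not from the article, as a fallback for other sed implementations.

```shell
#!/bin/sh
# first~step is a GNU sed address: start at line 2, then take every 3rd line.
gnu_out=$(seq 1 10 | sed -n '2~3p')

# The same selection with awk, for systems whose sed lacks "~".
awk_out=$(seq 1 10 | awk 'NR >= 2 && (NR - 2) % 3 == 0')

printf '%s\n' "$gnu_out"    # prints lines 2, 5 and 8
```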

Example :4) Deleting a line using sed command

To delete a line with sed from a file, use the following command,

[linuxtechi@localhost ~]$ sed 'Nd' testfile.txt

where 'N' is the line number & the command 'd' deletes that line from the output. To delete the last line of the file, use the following (the quotes stop the shell from expanding $d as a variable),

[linuxtechi@localhost ~]$ sed '$d' testfile.txt

Example :5) Deleting a range of lines

To delete a range of lines from the file, run

[linuxtechi@localhost ~]$ sed '29,34d' testfile.txt

This will delete lines 29 to 34 from the output; testfile.txt itself is left unchanged unless '-i' is used.

Example :6) Deleting lines other than the mentioned

To delete lines other than the mentioned lines from a file, we will use '!'

[linuxtechi@localhost ~]$ sed '29,34!d' testfile.txt

Here '!' is used as not, so it reverses the condition, i.e. the mentioned lines will not be deleted. All lines other than 29 to 34 will be deleted from the output.

Example :7) Adding Blank lines/spaces

To add a blank line after every line, we will use the command 'G',

[linuxtechi@localhost ~]$ sed G testfile.txt

Example :8) Search and Replacing a string using sed

To search & replace a string from the file, we will use the following example,

[linuxtechi@localhost ~]$ sed 's/danger/safety/' testfile.txt

here option 's' will search for word 'danger' & replace it with 'safety' on every line for the first occurrence only.

Example :9) Search and replace a string from whole file using sed

To replace every occurrence of the word in the file, we will use the flag 'g' with 's',

[linuxtechi@localhost ~]$ sed 's/danger/safety/g' testfile.txt

Example :10) Replace the nth occurrence of string pattern

We can also substitute only the nth occurrence of a string on each line. For example, to replace 'danger' with 'safety' only on the second occurrence in each line,

[linuxtechi@localhost ~]$ sed 's/danger/safety/2' testfile.txt

To replace the 2nd and every later occurrence of 'danger' on each line (GNU sed behaviour), use

[linuxtechi@localhost ~]$ sed 's/danger/safety/2g' testfile.txt
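The difference between the two flags is easiest to see on a single line with three matches. This is a small sketch assuming GNU sed (other sed implementations may reject the combination of a number and 'g'); the variable names are made up.

```shell
#!/bin/sh
line='danger danger danger'

# Flag 2: replace only the 2nd match on the line.
second_only=$(printf '%s\n' "$line" | sed 's/danger/safety/2')

# Flags 2g (GNU sed): replace the 2nd match and every match after it.
second_onward=$(printf '%s\n' "$line" | sed 's/danger/safety/2g')

printf '%s\n' "$second_only"      # danger safety danger
printf '%s\n' "$second_onward"    # danger safety safety
```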

Example :11) Replace a string on a particular line

To replace a string only from a particular line, use

[linuxtechi@localhost ~]$ sed '4 s/danger/safety/' testfile.txt

This will only substitute the string on the 4th line of the file. We can also mention a range of lines instead of a single line,

[linuxtechi@localhost ~]$ sed '4,9 s/danger/safety/' testfile.txt

Example :12) Add a line after/before the matched search

To add a new line with some content after every pattern match, use option 'a' ,

[linuxtechi@localhost ~]$ sed '/danger/a "This is new line with text after match"' testfile.txt

To add a new line with some content before every pattern match, use option 'i',

[linuxtechi@localhost ~]$ sed '/danger/i "This is new line with text before match"' testfile.txt

Example :13) Change a whole line with matched pattern

To change a whole line to a new line when a search pattern matches we need to use option 'c' with sed,

[linuxtechi@localhost ~]$ sed '/danger/c "This will be the new line" ' testfile.txt

So when the pattern matches 'danger', whole line will be changed to the mentioned line.
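The three commands 'a', 'i' and 'c' can be compared side by side on a tiny input. The sketch below assumes GNU sed, which accepts the text on the same line as the command; the sample text and variable names are invented. Note that the text after the command is taken literally, so the double quotes used in the examples above would themselves appear in the output.

```shell
#!/bin/sh
input=$(printf 'safe\ndanger ahead\nsafe')

# a: add a line after each matching line
appended=$(printf '%s\n' "$input" | sed '/danger/a after match')

# i: add a line before each matching line
inserted=$(printf '%s\n' "$input" | sed '/danger/i before match')

# c: replace each matching line with new text
changed=$(printf '%s\n' "$input" | sed '/danger/c replaced line')

printf '%s\n' "$changed"
```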

Advanced options with sed

Up until now we were only using simple expressions with sed, now we will discuss some advanced uses of sed with regex,

Example :14) Running multiple sed commands

If we need to perform multiple sed expressions, we can use option 'e' to chain the sed commands,

[linuxtechi@localhost ~]$ sed -e 's/danger/safety/g' -e 's/hate/love/' testfile.txt

Example :15) Making a backup copy before editing a file

To create a backup copy of a file before we edit it, use option '-i.bak',

[linuxtechi@localhost ~]$ sed -i.bak -e 's/danger/safety/g'  testfile.txt

This will create a backup copy of the file with extension .bak. You can also use other extension if you like.

Example :16) Delete a file line starting with & ending with a pattern

To delete the part of a line that starts with a particular string & ends with another string, use

[linuxtechi@localhost ~]$ sed -e 's/danger.*stops//g' testfile.txt

This deletes the text from 'danger' through the last 'stops' on each line, with any number of characters in between ('.*' matches that part). The rest of the line is kept, and a line that consisted only of the match becomes empty.
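If the goal is to drop whole lines that begin with 'danger' and end with 'stops', an anchored address with the 'd' command is one way to do it. The following sketch, with invented sample text, contrasts it with the substitution used in this example.

```shell
#!/bin/sh
input=$(printf 'danger never stops\nno danger here\ndanger stops now')

# Anchored address + d: remove only lines that start with "danger"
# and end with "stops".
kept=$(printf '%s\n' "$input" | sed '/^danger.*stops$/d')

# The substitution removes the matched text but keeps the line itself,
# leaving an empty first line and " now" on the last one.
stripped=$(printf '%s\n' "$input" | sed 's/danger.*stops//g')

printf '%s\n' "$kept"
```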

Example :17) Appending lines

To add some content before every line with sed & regex, use

[linuxtechi@localhost ~]$ sed -e 's/.*/testing sed &/' testfile.txt

So now every line will have 'testing sed' before it.

Example :18) Removing all commented lines & empty lines

To remove all commented lines i.e. lines with # & all the empty lines, use

[linuxtechi@localhost ~]$ sed -e 's/#.*//;/^$/d' testfile.txt

To only strip the comments (blank lines are left in place), use

[linuxtechi@localhost ~]$ sed -e 's/#.*//' testfile.txt
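The substitution in this example also cuts off anything after a '#' inside an otherwise useful line, and it leaves whitespace-only lines behind. A possible alternative, sketched here with made-up sample input, deletes only lines that are entirely comment or entirely blank.

```shell
#!/bin/sh
input=$(printf '# full comment\nport=80 # inline\n\n   \nname=web')

# Delete lines whose first non-blank character is "#", then delete
# lines that are empty or contain only whitespace.
stripped=$(printf '%s\n' "$input" |
    sed -e '/^[[:space:]]*#/d' -e '/^[[:space:]]*$/d')

printf '%s\n' "$stripped"
```

With this variant the inline comment on the port=80 line is kept as written.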

Example :19) Get list of all usernames from /etc/passwd

To get the list of all usernames from /etc/passwd file, use

[linuxtechi@localhost ~]$  sed 's/\([^:]*\).*/\1/' /etc/passwd

A complete list of all usernames will be printed on screen as output.
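For comparison, the same first-field extraction can be written with cut. The sketch below runs both on two sample lines in /etc/passwd format (the sample data is illustrative) rather than on the real file.

```shell
#!/bin/sh
sample=$(printf 'root:x:0:0:root:/root:/bin/bash\ndaemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin')

# sed: keep everything before the first colon
with_sed=$(printf '%s\n' "$sample" | sed 's/\([^:]*\).*/\1/')

# cut: print field 1 using ":" as the delimiter
with_cut=$(printf '%s\n' "$sample" | cut -d: -f1)

printf '%s\n' "$with_cut"
```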

Example :20) Prevent overwriting of system links with sed command

The 'sed -i' command has been known to remove symbolic links & create regular files in place of the link files. To avoid such a situation & prevent 'sed -i' from destroying the links, use the '--follow-symlinks' option with the command being executed.

Let's assume I want to disable SELinux on CentOS or RHEL servers

[linuxtechi@localhost ~]# sed -i --follow-symlinks 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/sysconfig/selinux
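The effect can be reproduced safely in a scratch directory. This sketch assumes GNU sed (the --follow-symlinks option is GNU-specific); the file and link names are made up.

```shell
#!/bin/sh
dir=$(mktemp -d)
printf 'SELINUX=enforcing\n' > "$dir/config"
ln -s config "$dir/plain"
ln -s config "$dir/followed"

# Without the option, sed -i writes a new regular file in place of the
# link; the link target stays untouched.
sed -i 's/enforcing/permissive/' "$dir/plain"

# With the option, the link is kept and its target is edited.
sed -i --follow-symlinks 's/enforcing/disabled/' "$dir/followed"

ls -l "$dir"
```

Afterwards "plain" is a regular file, "followed" is still a symbolic link, and "config" contains SELINUX=disabled.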

These were some examples of sed in use; we can use them as a reference as & when needed. If you have any queries related to this or any other article, do share them with us.

[Nov 09, 2017] TERM strings by Tom Ryder

Jan 26, 2013 |

A certain piece of very misleading advice is often given online to users having problems with the way certain command-line applications are displaying in their terminals. This is to suggest that the user change the value of their TERM environment variable from within the shell, doing something like this:

$ TERM=xterm-256color

This misinformation sometimes extends to suggesting that users put the forced TERM change into their shell startup scripts. The reason this is such a bad idea is that it forces your shell to assume what your terminal is, and thereby disregards the initial terminal identity string sent by the emulator. This leads to a lot of confusion when one day you need to connect with a very different terminal emulator.

Accounting for differences

All terminal emulators are not created equal. Certainly, not all of them are xterm(1) , although many other terminal emulators do a decent but not comprehensive job of copying it. The value of the TERM environment variable is used by the system running the shell to determine what the terminal connecting to it can and cannot do, what control codes to send to the program to use those features, and how the shell should understand the input of certain key codes, such as the Home and End keys. These things in particular are common causes of frustration for new users who turn out to be using a forced TERM string.

Instead, focus on these two guidelines for setting TERM :

  1. Avoid setting TERM from within the shell, especially in your startup scripts like .bashrc or .bash_profile . If that ever seems like the answer, then you are probably asking the wrong question! The terminal identification string should always be sent by the terminal emulator you are using; if you do need to change it, then change it in the settings for the emulator.
  2. Always use an appropriate TERM string that accurately describes what your choice of terminal emulator can and cannot display. Don't make an rxvt(1) terminal identify itself as xterm ; don't make a linux console identify itself as vt100 ; and don't make an xterm(1) compiled without 256 color support refer to itself as xterm-256color .

In particular, note that sometimes for compatibility reasons, the default terminal identification used by an emulator is given as something generic like xterm , when in fact a more accurate or comprehensive terminal identity file is more than likely available for your particular choice of terminal emulator with a little searching.

An example that surprises a lot of people is the availability of the putty terminal identity file, when the application defaults to presenting itself as an imperfect xterm(1) emulator.

Configuring your emulator's string

Before you change your terminal string in its settings, check whether the default it uses is already the correct one, with one of these:

$ echo $TERM
$ tset -q

Most builds of rxvt(1) , for example, should already use the correct TERM string by default, such as rxvt-unicode-256color for builds with 256 colors and Unicode support.

Where to configure which TERM string your terminal uses will vary depending on the application. For xterm(1) , your .Xresources file should contain a definition like the below:

XTerm*termName: xterm-256color

For rxvt(1) , the syntax is similar:

URxvt*termName: rxvt-unicode-256color

Other GTK and Qt emulators sometimes include the setting somewhere in their preferences. Look for mentions of xterm , a common fallback default.

For Windows PuTTY, it's configurable under the "Connection > Data" section:

Setting the terminal string in PuTTY

More detail about configuring PuTTY for connecting to modern systems can be found in my article on configuring PuTTY .

Testing your TERM string

On GNU/Linux systems, an easy way to test the terminal capabilities (particularly effects like colors and reverse video) is using the msgcat(1) utility:

$ msgcat --color=test

This will output a large number of tests of various features to the terminal, so that you can check their appearance is what you expect.

Finding appropriate terminfo(5) definitions

On GNU/Linux systems, the capabilities and behavior of various terminal types are described using terminfo(5) files, usually installed as part of the ncurses package. These files are often installed in /lib/terminfo or /usr/share/terminfo , in subdirectories by first letter.

In order to use a particular TERM string, an appropriate file must exist in one of these directories. On Debian-derived systems, a large collection of terminal types can be installed to the system with the ncurses-term package.
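Rather than looking through the directories by hand, you can ask ncurses whether it can find a definition. The following is a minimal sketch, assuming the infocmp(1) utility from ncurses is installed; the function name have_terminfo is made up for illustration.

```shell
#!/bin/sh
# have_terminfo NAME   (hypothetical helper)
# Succeeds if infocmp can find a terminfo entry called NAME in any of
# the locations ncurses searches, including ~/.terminfo.
have_terminfo() {
    infocmp "$1" >/dev/null 2>&1
}
```

A login script could then warn instead of forcing a value, for example: have_terminfo "$TERM" || echo "no terminfo entry for $TERM" >&2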

For example, the following variants of the rxvt terminal emulator are all available:

$ cd /usr/share/terminfo/r
$ ls rxvt*
rxvt-16color  rxvt-256color  rxvt-88color  rxvt-color  rxvt-cygwin
rxvt-cygwin-native  rxvt+pcfkeys  rxvt-unicode-256color  rxvt-xpm

Private and custom terminfo(5) files

If you connect to a system that doesn't have a terminfo(5) definition to match the TERM definition for your particular terminal, you might get a message similar to this on login:

setterm: rxvt-unicode-256color: unknown terminal type
tput: unknown terminal "rxvt-unicode-256color"

If you're not able to install the appropriate terminal definition system-wide, one technique is to use a private .terminfo directory in your home directory containing the definitions you need:

$ cd ~/.terminfo
$ find

You can copy this to your home directory on the servers you manage with a tool like scp :

$ scp -r .terminfo server:

TERM and multiplexers

Terminal multiplexers like screen(1) and tmux(1) are special cases, and they cause perhaps the most confusion to people when inaccurate TERM strings are used. The tmux FAQ even opens by saying that most of the display problems reported by people are due to incorrect TERM settings, and a good portion of the codebase in both multiplexers is dedicated to negotiating the differences between terminal capacities.

This is because they are "terminals within terminals", and provide their own functionality only within the bounds of what the outer terminal can do. In addition to this, they have their own type for terminals within them; both of them use screen and its variants, such as screen-256color .

It's therefore very important to check that both the outer and inner definitions for TERM are correct. In .screenrc it usually suffices to use a line like the following:

term screen

Or in .tmux.conf :

set-option -g default-terminal screen

If the outer terminals you use consistently have 256 color capabilities, you may choose to use the screen-256color variant instead.

If you follow all of these guidelines, your terminal experience will be much smoother, as your terminal and your system will understand each other that much better. You may find that this fixes a lot of struggles with interactive tools like vim(1) , for one thing, because if the application is able to divine things like the available color space directly from terminal information files, it saves you from having to include nasty hacks on the t_Co variable in your .vimrc .

[Nov 09, 2017] PuTTY configuration by Tom Ryder

Dec 22, 2012 |

PuTTY is a terminal emulator with a free software license, including an SSH client. While it has cross-platform ports, it's used most frequently on Windows systems, because they otherwise lack a built-in terminal emulator that interoperates well with Unix-style TTY systems.

While it's very popular and useful, PuTTY's defaults are quite old, and are chosen for compatibility reasons rather than to take advantage of all the features of a more complete terminal emulator. For new users, this is likely an advantage as it can avoid confusion, but more advanced users who need to use a Windows client to connect to a modern GNU/Linux system may find the defaults frustrating, particularly when connecting to a more capable and custom-configured server.

Here are a few of the problems with the default configuration:

All of these things are fixable.

Terminal type

Usually the most important thing in getting a terminal working smoothly is to make sure it identifies itself correctly to the machine to which it's connecting, using an appropriate $TERM string. By default, PuTTY identifies itself as an xterm(1) terminal emulator, which most systems will support.

However, there's a terminfo(5) definition for putty and putty-256color available as part of ncurses , and if you have it available on your system then you should use it, as it slightly more precisely describes the features available to PuTTY as a terminal emulator.

You can check that you have the appropriate terminfo(5) definition installed by looking in /usr/share/terminfo/p :

$ ls -1 /usr/share/terminfo/p/putty*

On Debian and Ubuntu systems, these files can be installed with:

# apt-get install ncurses-term

If you can't install the files via your system's package manager, you can also keep a private repository of terminfo(5) files in your home directory, in a directory called .terminfo :

$ ls -1 $HOME/.terminfo/p

Once you have this definition installed, you can instruct PuTTY to identify with that $TERM string in the Connection > Data section:

Correct terminal definition in PuTTY

Here, I've used putty-256color ; if you don't need or want a 256 color terminal you could just use putty .

Once connected, make sure that your $TERM string matches what you specified, and hasn't been mangled by any of your shell or terminal configurations:

$ echo $TERM

Color space

Certain command line applications like Vim and Tmux can take advantage of a full 256 colors in the terminal. If you'd like to use this, set PuTTY's $TERM string to putty-256color as outlined above, and select Allow terminal to use xterm 256-colour mode in Window > Colours:

256 colours in PuTTY

You can test this is working by using a 256 color application, or by trying out the terminal colours directly in your shell using tput :

$ for ((color = 0; color <= 255; color++)); do
> tput setaf "$color"
> printf "test"
> done

If you see the word test in many different colors, then things are probably working. Type reset to fix your terminal after this:

$ reset

Using UTF-8

If you're connecting to a modern GNU/Linux system, it's likely that you're using a UTF-8 locale. You can check which one by typing locale . In my case, I'm using the en_NZ locale with UTF-8 character encoding:

$ locale
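If you only need a yes/no answer, for instance in a login script, the character encoding can be read directly with locale charmap. This is a small sketch; the function name is_utf8_locale is hypothetical.

```shell
#!/bin/sh
# is_utf8_locale   (hypothetical helper)
# Succeeds if the current locale's character encoding is UTF-8.
is_utf8_locale() {
    [ "$(locale charmap 2>/dev/null)" = "UTF-8" ]
}
```

For example: is_utf8_locale || echo "this session is not using UTF-8" >&2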

If the output of locale does show you're using a UTF-8 character encoding, then you should configure PuTTY to interpret terminal output using that character set; it can't detect it automatically (which isn't PuTTY's fault; it's a known hard problem). You do this in the Window > Translation section:

Using UTF-8 encoding in PuTTY

While you're in this section, it's best to choose the Use Unicode line drawing code points option as well. Line-drawing characters are most likely to work properly with this setting for UTF-8 locales and modern fonts:

Using Unicode line-drawing points in PuTTY

If Unicode and its various encodings is new to you, I highly recommend Joel Spolsky's classic article about what programmers should know about both.


Courier New is a workable monospace font, but modern Windows systems include Consolas , a much nicer terminal font. You can change this in the Window > Appearance section:

Using Consolas font in PuTTY

There's no reason you can't use another favourite Bitmap or TrueType font instead once it's installed on your system; DejaVu Sans Mono , Inconsolata , and Terminus are popular alternatives. I personally favor Ubuntu Mono .


Terminal bells by default in PuTTY emit the system alert sound. Most people find this annoying; some sort of visual bell tends to be much better if you want to use the bell at all. Configure this in Terminal > Bell.

Given the purpose of the alert is to draw attention to the window, I find that using a flashing taskbar icon works well; I use this to draw my attention to my prompt being displayed after a long task completes, or if someone mentions my name or directly messages me in irssi(1) .

Another option is using the Visual bell (flash window) option, but I personally find this even worse than the audible bell.

Default palette

The default colours for PuTTY are rather like those used in xterm(1) , and hence rather harsh, particularly if you're used to the slightly more subdued colorscheme of terminal emulators like gnome-terminal(1) , or have customized your palette to something like Solarized .

If you have decimal RGB values for the colours you'd prefer to use, you can enter those in the Window > Colours section, making sure that Use system colours and Attempt to use logical palettes are unchecked:

There are a few other default annoyances in PuTTY, but the above are the ones that seem to annoy advanced users most frequently. Dag Wieers has a similar post with a few more defaults to fix.

[Nov 09, 2017] Searching files

Notable quotes:
"... With all this said, there's a very popular alternative to grep called ack , which excludes this sort of stuff for you by default. It also allows you to use Perl-compatible regular expressions (PCRE), which are a favourite for many programmers. It has a lot of utilities that are generally useful for working with source code, so while there's nothing wrong with good old grep since you know it will always be there, if you can install ack I highly recommend it. There's a Debian package called ack-grep , and being a Perl script it's otherwise very simple to install. ..."
"... Unix purists might be displeased with my even mentioning a relatively new Perl script alternative to classic grep , but I don't believe that the Unix philosophy or using Unix as an IDE is dependent on sticking to the same classic tools when alternatives with the same spirit that solve new problems are available. ..."

More often than attributes of a set of files, however, you want to find files based on their contents, and it's no surprise that grep, in particular grep -R, is useful here. This searches the current directory tree recursively for anything matching 'someVar':

$ grep -FR someVar .

Don't forget the case insensitivity flag either, since by default grep works with fixed case:

$ grep -iR somevar .

Also, you can print a list of files that match without printing the matches themselves with grep -l:

$ grep -lR someVar .

If you write scripts or batch jobs using the output of the above, use a while loop with read to handle spaces and other special characters in filenames:

grep -lR someVar . | while IFS= read -r file; do
    head "$file"
done
If you're using version control for your project, this often includes metadata in the .svn, .git, or .hg directories. This is dealt with easily enough by excluding (grep -v) anything matching an appropriate fixed (grep -F) string:

$ grep -R someVar . | grep -vF .svn

Some versions of grep include --exclude and --exclude-dir options, which may be tidier.
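With GNU grep, the --exclude-dir form might look like the sketch below. It creates a throwaway tree with a fake .git directory so the effect is visible; all the names are invented.

```shell
#!/bin/sh
dir=$(mktemp -d)
mkdir -p "$dir/.git" "$dir/src"
printf 'someVar\n' > "$dir/.git/index"
printf 'someVar\n' > "$dir/src/main.c"

# Skip version control metadata while searching recursively.
matches=$(cd "$dir" &&
    grep -lR --exclude-dir=.git --exclude-dir=.svn --exclude-dir=.hg someVar .)

printf '%s\n' "$matches"    # ./src/main.c
```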

With all this said, there's a very popular alternative to grep called ack, which excludes this sort of stuff for you by default. It also allows you to use Perl-compatible regular expressions (PCRE), which are a favourite for many programmers. It has a lot of utilities that are generally useful for working with source code, so while there's nothing wrong with good old grep since you know it will always be there, if you can install ack I highly recommend it. There's a Debian package called ack-grep, and being a Perl script it's otherwise very simple to install.

Unix purists might be displeased with my even mentioning a relatively new Perl script alternative to classic grep, but I don't believe that the Unix philosophy or using Unix as an IDE is dependent on sticking to the same classic tools when alternatives with the same spirit that solve new problems are available.

[Nov 01, 2017] Cron best practices by Tom Ryder

May 08, 2016 |

The time-based job scheduler cron(8) has been around since Version 7 Unix, and its crontab(5) syntax is familiar even for people who don't do much Unix system administration. It's standardised , reasonably flexible, simple to configure, and works reliably, and so it's trusted by both system packages and users to manage many important tasks.

However, like many older Unix tools, cron(8) 's simplicity has a drawback: it relies upon the user to know some detail of how it works, and to correctly implement any other safety checking behaviour around it. Specifically, all it does is try and run the job at an appropriate time, and email the output. For simple and unimportant per-user jobs, that may be just fine, but for more crucial system tasks it's worthwhile to wrap a little extra infrastructure around it and the tasks it calls.

There are a few ways to make the way you use cron(8) more robust if you're in a situation where keeping track of the running job is desirable.

Apply the principle of least privilege

The sixth column of a system crontab(5) file is the username of the user as which the task should run:

0 * * * *  root  cron-task

To the extent that is practical, you should run the task as a user with only the privileges it needs to run, and nothing else. This can sometimes make it worthwhile to create a dedicated system user purely for running scheduled tasks relevant to your application.

0 * * * *  myappcron  cron-task

This is not just for security reasons, although those are good ones; it helps protect you against nasties like scripting errors attempting to remove entire system directories .

Similarly, for tasks with database systems such as MySQL, don't use the administrative root user if you can avoid it; instead, use or even create a dedicated user with a unique random password stored in a locked-down ~/.my.cnf file, with only the needed permissions. For a MySQL backup task, for example, only a few permissions should be required, including SELECT , SHOW VIEW , and LOCK TABLES .

In some cases, of course, you really will need to be root . In particularly sensitive contexts you might even consider using sudo(8) with appropriate NOPASSWD options, to allow the dedicated user to run only the appropriate tasks as root , and nothing else.

Test the tasks

Before placing a task in a crontab(5) file, you should test it on the command line, as the user configured to run the task and with the appropriate environment set. If you're going to run the task as root , use something like su or sudo -i to get a root shell with the user's expected environment first:

$ sudo -i -u cronuser
$ cron-task

Once the task works on the command line, place it in the crontab(5) file with the timing settings modified to run the task a few minutes later, and then watch /var/log/syslog with tail -f to check that the task actually runs without errors, and that the task itself completes properly:

May  7 13:30:01 yourhost CRON[20249]: (you) CMD (cron-task)

This may seem pedantic at first, but it becomes routine very quickly, and it saves a lot of hassles down the line as it's very easy to make an assumption about something in your environment that doesn't actually hold in the one that cron(8) will use. It's also a necessary acid test to make sure that your crontab(5) file is well-formed, as some implementations of cron(8) will refuse to load the entire file if one of the lines is malformed.
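One way to catch such environment assumptions before the job ever reaches the crontab is to run it under a stripped-down environment. The sketch below is an approximation, not what any particular cron(8) does exactly: the function name run_like_cron is made up, and the variable values are only typical defaults.

```shell
#!/bin/sh
# run_like_cron COMMAND   (hypothetical helper)
# Run COMMAND with /bin/sh and a minimal environment resembling the one
# many cron implementations provide: HOME, LOGNAME, SHELL and a short PATH.
run_like_cron() {
    env -i HOME="$HOME" LOGNAME="${LOGNAME:-$(id -un)}" \
        PATH=/usr/bin:/bin SHELL=/bin/sh \
        /bin/sh -c "$1"
}
```

For example, run_like_cron 'cron-task' will fail with "not found" if the task only works because of a directory your login shell adds to PATH.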

If necessary, you can set arbitrary environment variables for the tasks at the top of the file:


0 * * * *  you  cron-task

Don't throw away errors or useful output

You've probably seen tutorials on the web where in order to keep the crontab(5) job from sending standard output and/or standard error emails every five minutes, shell redirection operators are included at the end of the job specification to discard both the standard output and standard error. This kluge is particularly common for running web development tasks by automating a request to a URL with curl(1) or wget(1) :

*/5 * * * *  root  curl >/dev/null 2>&1

Ignoring the output completely is generally not a good idea, because unless you have other tasks or monitoring ensuring the job does its work, you won't notice problems (or know what they are), when the job emits output or errors that you actually care about.

In the case of curl(1) , there are just way too many things that could go wrong, that you might notice far too late:

The author has seen all of the above happen, in some cases very frequently.

As a general policy, it's worth taking the time to read the manual page of the task you're calling, and to look for ways to correctly control its output so that it emits only the output you actually want. In the case of curl(1) , for example, I've found the following formula works well:

curl -fLsS -o /dev/null

This way, the curl(1) request should stay silent if everything is well, per the old Unix philosophy Rule of Silence .
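For reference, the short options in that formula can be spelled out with their long equivalents. This is only a sketch: quiet_fetch is a hypothetical wrapper name, and the URL in the usage comment is a placeholder rather than the address from the original example:

```shell
# quiet_fetch: hypothetical wrapper around the curl(1) options discussed above.
quiet_fetch() {
    # --fail        exit non-zero on HTTP errors instead of printing the error page
    # --location    follow redirects
    # --silent      no progress meter
    # --show-error  still print a message on stderr if something fails
    # --output      discard the response body
    curl --fail --location --silent --show-error --output /dev/null "$1"
}

# Usage from a script (placeholder URL):
#   quiet_fetch https://example.com/cron.php || echo "request failed" >&2
```

With this combination the only output cron(8) would mail is an error message on stderr, which is the part worth reading.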

You may not agree with some of the choices above; you might think it important to e.g. log the complete output of the returned page, or to fail rather than silently accept a 301 redirect, or you might prefer to use wget(1) . The point is that you take the time to understand in more depth what the called program will actually emit under what circumstances, and make it match your requirements as closely as possible, rather than blindly discarding all the output and (worse) the errors. Work with Murphy's law ; assume that anything that can go wrong eventually will.

Send the output somewhere useful

Another common mistake is failing to set a useful MAILTO at the top of the crontab(5) file, as the specified destination for any output and errors from the tasks. cron(8) uses the system mail implementation to send its messages, and typically, default configurations for mail agents will simply deliver the message to an mbox file in /var/mail/$USER, which the user may never read. This defeats much of the point of mailing output and errors.

This is easily dealt with, though; ensure that you can send a message to an address you actually do check from the server, perhaps using mail(1) :

$ printf '%s\n' 'Test message' | mail -s 'Test subject'

Once you've verified that your mail agent is correctly configured and that the mail arrives in your inbox, set the address in a MAILTO variable at the top of your file:

0 * * * *    you  cron-task-1
*/5 * * * *  you  cron-task-2

If you don't want to use email for routine output, another method that works is sending the output to syslog with a tool like logger(1) :

0 * * * *   you  cron-task | logger -it cron-task

Alternatively, you can configure aliases on your system to forward system mail destined for you on to an address you check. For Postfix, you'd use an aliases(5) file.

I sometimes use this setup in cases where the task is expected to emit a few lines of output which might be useful for later review, but send stderr output via MAILTO as normal. If you'd rather not use syslog , perhaps because the output is high in volume and/or frequency, you can always set up a log file /var/log/cron-task.log but don't forget to add a logrotate(8) rule for it!
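As a rough sketch of such a rule, a file dropped into /etc/logrotate.d could look like the fragment below. The file name and the rotation settings are assumptions to adapt, not recommendations from the original article:

```
# /etc/logrotate.d/cron-task (hypothetical file name and settings)
/var/log/cron-task.log {
    weekly
    rotate 8
    compress
    missingok
    notifempty
}
```

Here missingok and notifempty keep logrotate(8) quiet when the task has not run or has produced nothing.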

Put the tasks in their own shell script file

Ideally, the commands in your crontab(5) definitions should only be a few words, in one or two commands. If the command is running off the screen, it's likely too long to be in the crontab(5) file, and you should instead put it into its own script. This is a particularly good idea if you want to reliably use features of bash or some other shell besides POSIX/Bourne /bin/sh for your commands, or even a scripting language like Awk or Perl; by default, cron(8) uses the system's /bin/sh implementation for parsing the commands.

Because crontab(5) files don't allow multi-line commands, and have other gotchas like the need to escape percent signs % with backslashes, keeping as much configuration out of the actual crontab(5) file as you can is generally a good idea.
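To make the percent-sign gotcha concrete: in a crontab(5) command field an unescaped % ends the command and the rest of the line is fed to it as standard input, so a date format has to be written as date +\%Y-\%m-\%d there. Inside a script no escaping is needed. A minimal sketch, with a made-up function and file name pattern:

```shell
#!/bin/sh
# Hypothetical body for ~/.local/bin/cron-task: the date format is written
# normally here, whereas in the crontab line itself each % would need a backslash.
log_name() {
    printf 'cron-task-%s.log\n' "$(date +%Y-%m-%d)"
}

log_name
```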

If you're running cron(8) tasks as a non-system user, and can't add scripts into a system bindir like /usr/local/bin , a tidy method is to start your own, and include a reference to it as part of your PATH . I favour ~/.local/bin , and have seen references to ~/bin as well. Save the script in ~/.local/bin/cron-task , make it executable with chmod +x , and include the directory in the PATH environment definition at the top of the file:


0 * * * *  you  cron-task

Having your own directory with custom scripts for your own purposes has a host of other benefits, but that's another article

Avoid /etc/crontab

If your implementation of cron(8) supports it, rather than having an /etc/crontab file a mile long, you can put tasks into separate files in /etc/cron.d :

$ ls /etc/cron.d

This approach allows you to group the configuration files meaningfully, so that you and other administrators can find the appropriate tasks more easily; it also allows you to make some files editable by some users and not others, and reduces the chance of edit conflicts. Using sudoedit(8) helps here too. Another advantage is that it works better with version control; if I start collecting more than a few of these task files, or updating them more often than every few months, I start a Git repository to track them:

$ cd /etc/cron.d
$ sudo git init
$ sudo git add --all
$ sudo git commit -m "First commit"

If you're editing a crontab(5) file for tasks related only to the individual user, use the crontab(1) tool; you can edit your own crontab(5) by typing crontab -e , which will open your $EDITOR to edit a temporary file that will be installed on exit. This will save the files into a dedicated directory, which on my system is /var/spool/cron/crontabs .

On the systems maintained by the author, it's quite normal for /etc/crontab never to change from its packaged template.

Include a timeout

cron(8) will normally allow a task to run indefinitely, so if this is not desirable, you should consider either using options of the program you're calling to implement a timeout, or including one in the script. If there's no option for the command itself, the timeout(1) command wrapper in coreutils is one possible way of implementing this:

0 * * * *  you  timeout 10s cron-task
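If the timeout lives in the script rather than the crontab(5) line, it can also say why the task stopped. A small sketch, assuming GNU timeout(1), which exits with status 124 when the limit is hit; run_with_timeout is a hypothetical name:

```shell
# run_with_timeout: hypothetical wrapper that reports when the limit was hit.
# GNU timeout(1) exits with status 124 if the command timed out.
run_with_timeout() {
    limit=$1
    shift
    timeout "$limit" "$@"
    status=$?
    if [ "$status" -eq 124 ]; then
        echo "timed out after $limit: $*" >&2
    fi
    return "$status"
}

# Usage inside a script:
#   run_with_timeout 10s cron-task || exit
```

Because the message goes to stderr, it ends up wherever cron(8) sends the job's errors.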

Greg's wiki has some further suggestions on ways to implement timeouts .

Include file locking to prevent overruns

cron(8) will start a new process regardless of whether its previous runs have completed, so if you wish to avoid overlapping runs of a long-running task, on GNU/Linux you could use the flock(1) wrapper for the flock(2) system call to take an exclusive lock on a lock file, which prevents more than one instance of the task from running in parallel:

0 * * * *  you  flock -nx /var/lock/cron-task cron-task

Greg's wiki has some more in-depth discussion of the file locking problem for scripts in a general sense, including important information about the caveats of "rolling your own" when flock(1) is not available.
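The same lock can also be taken inside the script, using the file descriptor form of flock(1) from util-linux. This is a sketch under that assumption; with_lock is a hypothetical helper name and the lock file path is up to you:

```shell
# with_lock: hypothetical helper; runs a command only if the lock is free.
with_lock() {
    lockfile=$1
    shift
    (
        # -n: fail immediately instead of waiting if another run holds the lock
        flock -n 9 || { echo "already running, skipping: $*" >&2; exit 1; }
        "$@"
    ) 9>"$lockfile"
}

# Usage inside a script:
#   with_lock /var/lock/cron-task cron-task
```

The lock is released automatically when the subshell exits and descriptor 9 is closed, so there is no stale lock file to clean up after a crash.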

If it's important that your tasks run in a certain order, consider whether it's necessary to have them in separate tasks at all; it may be easier to guarantee they're run sequentially by collecting them in a single shell script.

Do something useful with exit statuses

If your cron(8) task or commands within its script exit non-zero, it can be useful to run commands that handle the failure appropriately, including cleanup of appropriate resources, and sending information to monitoring tools about the current status of the job. If you're using Nagios Core or one of its derivatives, you could consider using send_nsca to send passive checks reporting the status of jobs to your monitoring server. I've written a simple script called nscaw to do this for me:

0 * * * *  you  nscaw CRON_TASK -- cron-task
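nscaw itself is not reproduced here. As a generic sketch of the same idea, a wrapper can run the task, report the result somewhere, and pass the exit status on; run_and_report is a made-up name and the echo calls stand in for whatever your monitoring expects:

```shell
# run_and_report: hypothetical wrapper; runs a command and reports its result.
run_and_report() {
    name=$1
    shift
    if "$@"; then
        echo "$name: OK"
    else
        status=$?
        echo "$name: FAILED with exit status $status" >&2
        return "$status"
    fi
}

# Usage:
#   run_and_report CRON_TASK cron-task
```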

Consider alternatives to cron(8)

If your machine isn't always on and your task doesn't need to run at a specific time, but rather needs to run once daily or weekly, you can install anacron and drop scripts into the cron.hourly , cron.daily , cron.monthly , and cron.weekly directories in /etc , as appropriate. Note that on Debian and Ubuntu GNU/Linux systems, the default /etc/crontab contains hooks that run these, but they run only if anacron(8) is not installed.

If you're using cron(8) to poll a directory for changes and run a script if there are such changes, on GNU/Linux you could consider using a daemon based on inotifywait(1) instead.

Finally, if you require more advanced control over when and how your task runs than cron(8) can provide, you could perhaps consider writing a daemon to run on the server consistently and fork processes for its task. This would allow running a task more often than once a minute, as an example. Don't get too bogged down into thinking that cron(8) is your only option for any kind of asynchronous task management!

[Nov 01, 2017] Listing files

Using ls is probably one of the first commands an administrator will learn for getting a simple list of the contents of the directory. Most administrators will also know about the -a and -l switches, to show all files including dot files and to show more detailed data about files in columns, respectively.

There are other switches to GNU ls which are less frequently used, some of which turn out to be very useful for programming:

Since the listing is text like anything else, you could, for example, pipe the output of this command into a vim process, so you could add explanations of what each file is for and save it as an inventory file or add it to a README:

$ ls -XR | vim -

This kind of stuff can even be automated by make with a little work, which I'll cover in another article later in the series.

[Nov 01, 2017] Default grep options by Tom Ryder

May 18, 2012 |

When you're searching a set of version-controlled files for a string with grep , particularly if it's a recursive search, it can get very annoying to be presented with swathes of results from the internals of the hidden version control directories like .svn or .git , or include metadata you're unlikely to have wanted in files like .gitmodules .

GNU grep uses an environment variable named GREP_OPTIONS to define a set of options that are always applied to every call to grep . This comes in handy when exported in your .bashrc file to set a "standard" grep environment for your interactive shell. Here's an example of a definition of GREP_OPTIONS that excludes a lot of patterns which you'd very rarely if ever want to search with grep :

for pattern in .cvs .git .hg .svn; do
    GREP_OPTIONS="$GREP_OPTIONS --exclude-dir=$pattern"
done
export GREP_OPTIONS

Note that --exclude-dir is a relatively recent addition to the options for GNU grep , but it should only be missing on very legacy GNU/Linux machines by now. If you want to keep your .bashrc file compatible, you could apply a little extra hackery to make sure the option is available before you set it up to be used:

if grep --help | grep -- --exclude-dir &>/dev/null; then
    for pattern in .cvs .git .hg .svn; do
        GREP_OPTIONS="$GREP_OPTIONS --exclude-dir=$pattern"
    done
    export GREP_OPTIONS
fi

Similarly, you can ignore single files with --exclude . There's also --exclude-from=FILE if your list of excluded patterns starts getting too long.

Other useful options available in GNU grep that you might wish to add to this environment variable include:

If you don't want to use GREP_OPTIONS , you could instead simply set up an alias :

alias grep='grep --exclude-dir=.git'

You may actually prefer this method as it's essentially functionally equivalent, but if you do it this way, when you want to call grep without your standard set of options, you only have to prepend a backslash to its call:

$ \grep pattern file

Commenter Andy Pearce also points out that using this method can avoid some build problems where GREP_OPTIONS would interfere.
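A shell function is a third option, and it also works in scripts, where aliases are normally not expanded. As far as I know, recent GNU grep releases warn about or ignore GREP_OPTIONS altogether, which is one more reason to prefer a wrapper. The sketch below uses a hypothetical name, vgrep, so that plain grep stays untouched:

```shell
# vgrep: hypothetical name for a grep wrapper that skips version control
# directories. --exclude-dir is a GNU grep option.
vgrep() {
    command grep --exclude-dir=.cvs --exclude-dir=.git \
        --exclude-dir=.hg --exclude-dir=.svn "$@"
}

# Usage:
#   vgrep -rn pattern .
```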

Of course, you could solve a lot of these problems simply by using ack but that's another post.

[Oct 31, 2017] Bash job control by Tom Ryder

Jan 31, 2012 |

Oftentimes you may wish to start a process on the Bash shell without having to wait for it to actually complete, but still be notified when it does. Similarly, it may be helpful to temporarily stop a task while it's running without actually quitting it, so that you can do other things with the terminal. For these kinds of tasks, Bash's built-in job control is very useful.

Backgrounding processes

If you have a process that you expect to take a long time, such as a long cp or scp operation, you can start it in the background of your current shell by adding an ampersand to it as a suffix:

$ cp -r /mnt/bigdir /home &
[1] 2305

This will start the copy operation as a child process of your bash instance, but will return you to the prompt to enter any other commands you might want to run while that's going.

The output from this command shown above gives both the job number of 1, and the process ID of the new task, 2305. You can view the list of jobs for the current shell with the builtin jobs :

$ jobs
[1]+  Running  cp -r /mnt/bigdir /home &

If the job finishes or otherwise terminates while it's backgrounded, you should see a message in the terminal the next time you update it with a newline:

[1]+  Done  cp -r /mnt/bigdir /home &
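Inside a script there is no prompt to print that notice, but the same mechanism is available through $! and the wait builtin. A minimal sketch, with sleep standing in for the long-running copy:

```shell
sleep 1 &            # start a stand-in for a long-running task
task_pid=$!          # $! holds the PID of the most recent background job
echo "working on something else while PID $task_pid runs"
wait "$task_pid"     # block until it finishes; wait returns its exit status
task_status=$?
echo "background task finished with status $task_status"
```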

Foregrounding processes

If you want to return a job in the background to the foreground, you can type fg :

$ fg
cp -r /mnt/bigdir /home &

If you have more than one job backgrounded, you should specify the particular job to bring to the foreground with a parameter to fg :

$ fg %1

In this case, for shorthand, you can optionally omit fg and it will work just the same:

$ %1

Suspending processes

To temporarily suspend a process, you can press Ctrl+Z:

$ cp -r /mnt/bigdir /home
[1]+  Stopped  cp -r /mnt/bigdir /home

You can then continue it in the foreground or background with fg %1 or bg %1 respectively, as above.

This is particularly useful while in a text editor; instead of quitting the editor to get back to a shell, or dropping into a subshell from it, you can suspend it temporarily and return to it with fg once you're ready.

Dealing with output

While a job is running in the background, it may still print its standard output and standard error streams to your terminal. You can head this off by redirecting both streams to /dev/null for verbose commands:

$ cp -rv /mnt/bigdir /home &>/dev/null &

However, if the output of the task is actually of interest to you, this may be a case where you should fire up another terminal emulator, perhaps in GNU Screen or tmux , rather than using simple job control.

Suspending SSH sessions

As a special case, you can suspend an SSH session using an SSH escape sequence . Type a newline followed by a ~ character, and finally press Ctrl+Z to background your SSH session and return to the terminal from which you invoked it.

tom@conan:~$ ssh crom
tom@crom:~$ ~^Z [suspend ssh]
[1]+  Stopped  ssh crom

You can then resume it as you would any job by typing fg :

tom@conan:~$ fg %1
ssh crom

[Oct 31, 2017] Elegant Awk usage by Tom Ryder

It's better to use Perl for this purpose...
Feb 06, 2012 |

For many system administrators, Awk is used only as a way to print specific columns of data from programs that generate columnar output, such as netstat or ps .

For example, to get a list of all the IP addresses and ports with open TCP connections on a machine, one might run the following:

# netstat -ant | awk '{print $5}'

This works pretty well, but among the data you actually wanted it also includes the fifth word of the opening explanatory note, and the heading of the fifth column:


There are varying ways to deal with this.

Matching patterns

One common way is to pipe the output further through a call to grep , perhaps to only include results with at least one number:

# netstat -ant | awk '{print $5}' | grep '[0-9]'

In this case, it's instructive to use the awk call a bit more intelligently by setting a regular expression which the applicable line must match in order for that field to be printed, with the standard / characters as delimiters. This eliminates the need for the call to grep :

# netstat -ant | awk '/[0-9]/ {print $5}'

We can further refine this by ensuring that the regular expression should only match data in the fifth column of the output, using the ~ operator:

# netstat -ant | awk '$5 ~ /[0-9]/ {print $5}'

Skipping lines

Another approach you could take to strip the headers out might be to use sed to skip the first two lines of the output:

# netstat -ant | awk '{print $5}' | sed 1,2d

However, this can also be incorporated into the awk call, using the NR variable and making it part of a conditional checking the line number is greater than two:

# netstat -ant | awk 'NR>2 {print $5}'
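To try the pattern and line-number filters without a live netstat, they can be run against canned text. The sample below is made up (the addresses come from the documentation ranges), but it has the same two header lines:

```shell
# Made-up sample of `netstat -ant` output.
sample_netstat() {
    cat <<'EOF'
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN
tcp        0      0 192.0.2.10:22           198.51.100.7:51234      ESTABLISHED
EOF
}

# Both filters drop the two header lines and print only real fifth-column data.
sample_netstat | awk '$5 ~ /[0-9]/ {print $5}'
sample_netstat | awk 'NR>2 {print $5}'
```

On this sample both commands print the same two lines. They differ only if a header happens to contain a digit in its fifth field, or if the number of header lines changes.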

Combining and excluding patterns

Another common idiom on systems that don't have the special pgrep command is to filter ps output for a string, but exclude the grep process itself from the output with grep -v grep :

# ps -ef | grep apache | grep -v grep | awk '{print $2}'

If you're using Awk to get columnar data from the output, in this case the second column containing the process ID, both calls to grep can instead be incorporated into the awk call:

# ps -ef | awk '/apache/ && !/awk/ {print $2}'

Again, this can be further refined if necessary to ensure you're only matching the expressions against the command name by specifying the field number for each comparison:

# ps -ef | awk '$8 ~ /apache/ && $8 !~ /awk/ {print $2}'
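The effect of the exclusion is easier to see on fixed input. The ps -ef lines below are invented for the purpose, including one that represents the awk process itself:

```shell
# Made-up sample of `ps -ef` output.
sample_ps() {
    cat <<'EOF'
UID        PID  PPID  C STIME TTY          TIME CMD
root      1001     1  0 09:00 ?        00:00:01 /usr/sbin/apache2 -k start
www-data  1002  1001  0 09:00 ?        00:00:00 /usr/sbin/apache2 -k start
tom       2001  1999  0 09:05 pts/0    00:00:00 awk /apache/ {print $2}
EOF
}

# Without the exclusion, awk's own line matches too: prints 1001, 1002, 2001
sample_ps | awk '/apache/ {print $2}'

# Excluding lines that mention awk leaves only the real processes: 1001, 1002
sample_ps | awk '/apache/ && !/awk/ {print $2}'

# Restricting both tests to the command name in field 8 gives the same result.
sample_ps | awk '$8 ~ /apache/ && $8 !~ /awk/ {print $2}'
```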

If you're used to using Awk purely as a column filter, the above might help to increase its utility for you and allow you to write shorter and more efficient command lines. The Awk Primer on Wikibooks is a really good reference for using Awk to its fullest for the sorts of tasks for which it's especially well-suited.

[Oct 31, 2017] Counting with grep and uniq by Tom Ryder

Feb 18, 2012 |

A common idiom in Unix is to count the lines of output in a file or pipe with wc -l :

$ wc -l example.txt
$ ps -e | wc -l

Sometimes you want to count the number of lines of output from a grep call, however. You might do it this way:

$ ps -ef | grep apache | wc -l

But grep has built-in counting of its own, with the -c option:

$ ps -ef | grep -c apache
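One detail worth knowing is that grep -c counts matching lines, not individual matches. If you need the number of occurrences, grep -o (available in GNU and BSD grep) combined with wc -l is one way to get it. A sketch on made-up input:

```shell
# Three lines of sample text; the first contains "apache" twice.
sample_text() {
    printf '%s\n' 'apache apache' 'sshd' 'apache'
}

sample_text | grep -c apache          # 2 matching lines
sample_text | grep -o apache | wc -l  # 3 occurrences
```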

The above is more a matter of good style than efficiency, but another tool with a built-in counting option that could save you time is the oft-used uniq . The below example shows a use of uniq to filter a sorted list into unique rows:

$ ps -ef | awk '{print $1}' | sort | uniq

If it would be useful to know in this case how many processes were being run by each of these users, you can include the -c option for uniq :

$ ps -ef | awk '{print $1}' | sort | uniq -c
    1 105
    1 daemon
    1 lp
    1 mysql
    1 nagios
    2 postfix
    78 root
    1 snmp
    7 tom
    1 UID
    5 www-data

You could even sort this output itself to show the users running the most processes first with sort -rn :

$ ps -ef | awk '{print $1}' | sort | uniq -c | sort -rn
    78 root
    8 tom
    5 www-data
    2 postfix
    1 UID
    1 snmp
    1 nagios
    1 mysql
    1 lp
    1 daemon
    1 105
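Note that the "UID" row in these listings is just the ps header being counted as if it were a user. A small variation skips it with NR>1; count_by_user is a hypothetical helper name and the input is made up so the example runs anywhere:

```shell
# count_by_user: hypothetical helper; reads `ps -ef`-style text on stdin and
# prints "count user" pairs, busiest first. NR>1 skips the header line so that
# "UID" is not counted as a user.
count_by_user() {
    awk 'NR>1 {print $1}' | sort | uniq -c | sort -rn
}

# Made-up sample input: a header line followed by six process lines.
printf '%s\n' \
    'UID PID' 'root 1' 'tom 2' 'root 3' 'root 4' 'tom 5' 'lp 6' | count_by_user
```

On a real system the call would be ps -ef | count_by_user.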

Incidentally, if you're not counting results and really do just want a list of unique users, you can leave out the uniq and just add the -u flag to sort :

$ ps -ef | awk '{print $1}' | sort -u

The above means I actually find myself using uniq with no options quite seldom.

[Oct 31, 2017] 256 colour terminals by Tom Ryder

February 23, 2012 |

Using 256 colours in terminals is well-supported in GNU/Linux distributions these days, and also in Windows terminal emulators like PuTTY. Using 256 colours is great for Vim colorschemes in particular, but also very useful for Tmux colouring or any other terminal application where a slightly wider colour space might be valuable. Be warned that once you get this going reliably, there's no going back if you spend a lot of time in the terminal.

Xterm

To set this up for xterm or emulators that use xterm as the default value for $TERM , such as xfce4-terminal or gnome-terminal , it generally suffices to check the options for your terminal emulator to ensure that it will allow 256 colors, and then use the TERM string xterm-256color for it.

An earlier version of this post suggested changing the TERM definition in .bashrc , which is generally not a good idea, even if bounded with conditionals as my example was. You should always set the terminal string in the emulator itself if possible, if you do it at all.

Be aware that older systems may not have terminfo definitions for this terminal, but you can always copy them in using a private .terminfo directory if need be.


Tmux

To use 256 colours in Tmux, you should set the default terminal in .tmux.conf to be screen-256color :

set -g default-terminal "screen-256color"

This will allow you to use color definitions like colour231 in your status lines and other configurations. Again, this particular terminfo definition may not be present on older systems, so you should copy it into ~/.terminfo/s/screen-256color on those systems if you want to use it everywhere.

GNU Screen

Similarly, to use 256 colours in GNU Screen, add the following to your .screenrc :

term screen-256color

With the applicable options from the above set, you should not need to change anything in Vim to be able to use 256-color colorschemes. If you're wanting to write or update your own 256-colour compatible scheme, it should either begin with set t_Co=256 , or more elegantly, check that the value of the corresponding option, &t_Co , is 256 before trying to use any of the extra colour set.

The Vim Tips Wiki contains a detailed reference of the colour codes for schemes in 256-color terminals.
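To check by eye that the terminal really renders the extended palette, a 256-colour index can be selected for the foreground with the escape sequence ESC[38;5;Nm. A small sketch; colour_text is a made-up helper name and the indices are arbitrary picks:

```shell
# colour_text: hypothetical helper; prints text in one of the 256 palette
# colours, then resets the attributes.
colour_text() {
    printf '\033[38;5;%dm%s\033[0m' "$1" "$2"
}

# Print a handful of sample indices on one line.
for n in 196 208 226 46 21 231; do
    colour_text "$n" "colour$n "
done
printf '\n'
```

If all of the samples come out in the same few basic colours, the emulator or the TERM setting is probably still limited to 8 or 16 colours.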

[Oct 22, 2017] Unix text editing - sed, tr, cut, od

Oct 22, 2017 |

A tr script to remove all non-printing characters from a file is below. Non-printing characters may be invisible, but cause problems with printing or sending the file via electronic mail. You run it from Unix command prompt, everything on one line:

> tr -d '\001'-'\011''\013''\014''\016'-'\037''\200'-'\377' < filein > fileout

This tr script deletes all characters with octal values from 001 to 011, characters 013 and 014, characters from 016 to 037, and characters from 200 to 377. All other characters are copied from filein to fileout, and these are printable. Remember that a line containing the tr command cannot be folded: everything must be on one line, however long it is. In practice, this script solves some mysterious Unix printing problems.

Type the line starting with tr above into a text file named "f127.TR". Print the file on screen with the cat f127.TR command, replace "filein" and "fileout" with your own file names (they must not be the same file), then copy and paste the line and run it. Please remember that this does not solve the Unix end-of-file problem, that is, the character '\000', also known as a 'null', in the file. Nor does it handle the binary file problem, that is, a file starting with two zeroes, '\060' and '\060'.

Sometimes there are some invisible characters causing havoc. This tr command line converts tab characters into hashes (#) and formfeed characters into stars (*).

> tr '\011\014' '#*'  < filein > fileout

The numeric value of tab is 9, hex 09, octal 011, and in C notation it is \t or \011. Formfeed is 12, hex 0C, octal 014, and in C notation it is \f or \014. Note that tr replaces each character from the first (leftmost) group with the corresponding character in the second group. Characters in octal format, like \014, are counted as one character each.
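Both tr commands are easy to try on a short test string before pointing them at a real file. In the sketch below the delete set is the same as in the first command, only written as ranges inside a single quoted argument; strip_nonprinting is a made-up function name, and LC_ALL=C is an assumption to make tr treat the input strictly as bytes:

```shell
# Replace tab with # and form feed with * (same mapping as above).
printf 'a\tb\fc\n' | tr '\011\014' '#*'          # prints a#b*c

# strip_nonprinting: hypothetical filter that deletes the same control and
# 8-bit bytes as the first command, written as ranges in one quoted set.
strip_nonprinting() {
    LC_ALL=C tr -d '\001-\011\013\014\016-\037\200-\377'
}

# \001 and the escape byte \033 are removed; the visible text stays.
printf 'ok\001\033[0m\n' | strip_nonprinting      # prints ok[0m
```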

[Oct 01, 2017] How to Use Script Command To Record Linux Terminal Session

Oct 01, 2017 |

May 30, 2014 By Pungki Arianto Updated June 14, 2017

The script command is very helpful for system administrators. If a problem occurs on the system, it is very difficult to find out which commands were executed previously, so system administrators know the importance of this command. Sometimes you are on a server and realise that your team, or somebody you know, is missing documentation on how to do a specific configuration. You can do the configuration yourself, record all the actions of your shell session, and show the recording to that person, who will see exactly what you had on your shell (the same output) at the moment of the configuration.

How does script command work?

script command records a shell session for you so that you can look at the output that you saw at the time and you can even record with timing so that you can have a real-time playback. It is really useful and comes in handy in the strangest kind of times and places.

The script command keeps action log for various tasks. The script records everything in a session such as things you type, things you see. To do this you just type script command on the terminal and type exit when finished. Everything between the script and the exit command is logged to the file. This includes the confirmation messages from script itself.

1. Record your terminal session

script makes a typescript of everything printed on your terminal. If the argument file is given, script saves all dialogue in the indicated file in the current directory. If no file name is given, the typescript is saved in the default file, typescript. To record what you are doing in the current shell session, just use the command below:

# script shell_record1
Script started, file is shell_record1

It indicates that a file shell_record1 is created. Let's check the file

# ls -l shell_*
-rw-r--r-- 1 root root 0 Jun 9 17:50 shell_record1

After completion of your task, you can enter exit or Ctrl-d to close down the script session and save the file.

# exit
Script done, file is shell_record1

You can see that script indicates the filename.

2. Check the content of a recorded terminal session

When you use the script command, it records everything in a session: the things you type and all of the output. As the output is saved into a file, you can check its content after exiting a recorded session. You can simply use a text editor or a text file viewer command.

# cat shell_record1 
Script started on Fri 09 Jun 2017 06:23:41 PM UTC
[root@centos-01 ~]# date
Fri Jun 9 18:23:46 UTC 2017
[root@centos-01 ~]# uname -a
Linux centos-01 3.10.0-514.16.1.el7.x86_64 #1 SMP Wed Apr 12 15:04:24 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
[root@centos-01 ~]# whoami
[root@centos-01 ~]# pwd
[root@centos-01 ~]# exit

Script done on Fri 09 Jun 2017 06:25:11 PM UTC

While viewing the file you will notice that script also stores line feeds and backspaces. It also records the time of the recording at the top and the end of the file.

3. Record several terminal sessions

You can record as many terminal sessions as you want. When you finish one recording, just begin another. This can be helpful if you want to record several configurations to show to your team or students, for example. You just need to name each recording file.

For example, let us assume that you have to do OpenLDAP , DNS , Machma configurations. You will need to record each configuration. To do this, just create a recording file for each configuration, and exit each recording when that configuration is finished.

# script openldap_record
    configuration step
# exit

When you have finished with the first configuration, begin to record the next configuration

# script machma_record
     configuration steps
# exit

And so on for the others. Note that if you run the script command with an existing file name, the file will be replaced, so you will lose everything in it.

Now, let us imagine that you have begun the Machma configuration but have to abort it in order to finish the DNS configuration because of some emergency. Now you want to continue the Machma configuration where you left off. That means you want to record the next steps into the existing file machma_record without deleting its previous content; to do this, use the script -a command to append the new output to the file.

This is the content of our recorded file

Now if we want to continue our recording in this file without deleting the content already present, we will do

# script -a machma_record
Script started, file is machma_record

Now continue the configuration, then exit when finished and let's check the content of the recorded file.

Note the new time of the new record which appears. You can see that the file has both the previous and the current records.

4. Replay a linux terminal session

We have seen that it is possible to see the content of the recorded file with commands that display a text file. The script command also makes it possible to watch the recorded session like a video. It means that you review exactly what you did, step by step, at the pace you were entering the commands. In other words, you play back the recorded terminal session.

To do this, you have to use the --timing option of the script command when you start the recording.

# script --timing=file_time shell_record1
Script started, file is shell_record1

See that the file into which to record is shell_record1. When the record is finished, exit normally

# exit
Script done, file is shell_record1

Let's check the content of file_time

# cat file_time 
0.807440 49
0.030061 1
116.131648 1
0.226914 1
0.033997 1
0.116936 1
0.104201 1
0.392766 1
0.301079 1
0.112105 2
0.363375 152

The --timing option outputs timing data to the file indicated. This data contains two fields, separated by a space: the first indicates how much time elapsed since the previous output, and the second how many characters were output this time. This information can be used to replay typescripts with realistic typing and output delays.
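Since the timing file is plain text, it can be summarised with awk. The sketch below assumes the classic two-field format shown in file_time (newer util-linux versions can also write other timing formats); timing_summary is a hypothetical helper name:

```shell
# timing_summary: hypothetical helper; sums both columns of a timing file
# written by `script --timing=FILE` (delay in seconds, then character count).
timing_summary() {
    awk '{ seconds += $1; chars += $2 }
         END { printf "%.2f seconds, %d characters\n", seconds, chars }' "$@"
}

# Two made-up timing records: 0.5 s then 10 characters, 1.25 s then 3 characters.
printf '0.5 10\n1.25 3\n' | timing_summary   # prints 1.75 seconds, 13 characters
```

Run as timing_summary file_time, it gives the total length of the recording and the amount of output that a replay will show.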

Now to replay the terminal session, we use the scriptreplay command instead of script, with the same syntax as when recording the session. Look below:

# scriptreplay --timing=file_time shell_record1

You will see the recorded session played back as if you were watching a video of everything you were doing. You can also pass the timing file as a plain argument, without writing out --timing=file_time. Look below:

# scriptreplay file_time shell_record1

So you understand that the first parameter is the timing file and the second is the recorded file.


The script command can be your go-to tool for documenting your work and showing others what you did in a session. It can be used as a way to log what you are doing in a shell session. When you run script, a new shell is forked. It reads standard input and output for your terminal tty and stores the data in a file.

[Aug 28, 2017] Rsync over ssh with root access on both sides

Aug 28, 2017 |

I have one older ubuntu server, and one newer debian server and I am migrating data from the old one to the new one. I want to use rsync to transfer data across to make final migration easier and quicker than the equivalent tar/scp/untar process.

As an example, I want to sync the home folders one at a time to the new server. This requires root access at both ends as not all files at the source side are world readable and the destination has to be written with correct permissions into /home. I can't figure out how to give rsync root access on both sides.

I've seen a few related questions, but none quite match what I'm trying to do.

I have sudo set up and working on both servers.

asked Apr 28 '10 at 9:18 Tim Abell

3 Answers

Accepted answer:

Actually you do NOT need to allow root authentication via SSH to run rsync as Antoine suggests. The transport and system authentication can be done entirely over user accounts as long as you can run rsync with sudo on both ends for reading and writing the files.

As a user on your destination server you can suck the data from your source server like this:

sudo rsync -aPe ssh --rsync-path='sudo rsync' boron:/home/fred /home/

The user you run as on both servers will need passwordless* sudo access to the rsync binary, but you do NOT need to enable ssh login as root anywhere. If the user you are using doesn't match on the other end, you can add user@boron: to specify a different remote user.

Good luck.

*or you will need to have entered the password manually inside the timeout window.

Caleb , answered Apr 28 '10 at 22:06 (edited Jun 30 '10 at 13:51)
Although this is an old question I'd like to add a word of CAUTION to this accepted answer. From my understanding, allowing passwordless "sudo rsync" is equivalent to opening the root account to remote login. This is because it makes it very easy to gain full root access, e.g. because all system files can be downloaded, modified and replaced without a password. – Ascurion Jan 8 '16 at 16:30

If your data is not highly sensitive, you could use tar and socat. In my experience this is often faster than rsync over ssh.

You need socat or netcat on both sides.

On the target host, go to the directory where you would like to put your data, then run:

socat TCP-LISTEN:4444 - | tar xzf -

Once the target host is listening, start the transfer on the source:

tar czf - /home/fred | socat - TCP:ip-of-remote-server:4444

For this setup you'll need a reliable connection between the two servers.

Jeroen Moors , answered Apr 28 '10 at 21:20
Good point. In a trusted environment, you'll pick up a lot of speed by not encrypting. It might not matter on small files, but with GBs of data it will. – pboin May 18 '10 at 10:53

Ok, I've pieced together all the clues to get something that works for me.

Let's call the servers "src" and "dest".

Set up a key pair for root on the destination server, and copy the public key to the source server:

dest $ sudo -i
dest # ssh-keygen
dest # exit
dest $ scp /root/ src:

Add the public key to root's authorized keys on the source server

src $ sudo -i
src # cp /home/tim/ .ssh/authorized_keys

Back on the destination server, pull the data across with rsync:

dest $ sudo -i
dest # rsync -aP src:/home/fred /home/

[Aug 28, 2017] Unix Rsync Copy Hidden Dot Files and Directories Only by Vivek Gite

Feb 06, 2014 |
Published November 9, 2012; last updated February 6, 2014. Categories: Commands, File system, Linux, UNIX

How do I use the rsync tool to copy only the hidden files and directory (such as ~/.ssh/, ~/.foo, and so on) from /home/jobs directory to the /mnt/usb directory under Unix like operating system?

The rsync program is used for synchronizing files over a network or local disks. To view or display only hidden files with ls command:

ls -ld ~/.??*


ls -ld ~/.[^.]*

Sample outputs:

Fig:01 ls command to view only hidden files

rsync not synchronizing all hidden .dot files?

In this example, you used the pattern .[^.]* or .??* to select and display only hidden files using ls command . You can use the same pattern with any Unix command including rsync command. The syntax is as follows to copy hidden files with rsync:

rsync -av /path/to/dir/.??* /path/to/dest
rsync -avzP /path/to/dir/.??* /mnt/usb
rsync -avzP $HOME/.??*
rsync -avzP ~/.[^.]*

In this example, copy all hidden files from my home directory to /mnt/test:

rsync -avzP ~/.[^.]* /mnt/test

Sample outputs:

Fig.02 Rsync example to copy only hidden files
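The two glob patterns used above are not interchangeable, which is easy to check without rsync. The following is a minimal sketch that only lists names in a throwaway directory; the file names are made up for illustration, and `[!.]` is used as the POSIX spelling of the `[^.]` bracket expression shown in the article.

```shell
# Scratch directory with a few hypothetical dot files.
demo=$(mktemp -d)
touch "$demo/.a" "$demo/.ab" "$demo/.bashrc" "$demo/visible"
mkdir "$demo/.ssh"

# .??* needs at least two characters after the dot, so it skips ".a".
two_plus=$(cd "$demo" && echo .??*)

# .[!.]* needs one non-dot character after the dot, so it keeps ".a"
# while still excluding "." and "..".
non_dot=$(cd "$demo" && echo .[!.]*)

echo "pattern .??*   matches: $two_plus"
echo "pattern .[!.]* matches: $non_dot"
rm -rf "$demo"
```

If single-letter dot files matter, the second pattern is the safer choice; note that it in turn skips names that start with two dots.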

Vivek Gite is the creator of nixCraft and a seasoned sysadmin and a trainer for the Linux operating system/Unix shell scripting. He has worked with global clients and in various industries, including IT, education, defense and space research, and the nonprofit sector.

[Aug 28, 2017] rsync doesn't copy files with restrictive permissions

Aug 28, 2017 |
Trying to copy files with rsync, it complains:
rsync: send_files failed to open "VirtualBox/Machines/Lubuntu/Lubuntu.vdi" \
(in media): Permission denied (13)

That file is not copied. Indeed the file permissions of that file are very restrictive on the server side:

-rw-------    1 1000     1000     3133181952 Nov  1  2011 Lubuntu.vdi

I call rsync with

sudo rsync -av --fake-super root@sheldon::media /mnt/media

The rsync daemon runs as root on the server. root can copy that file (of course). rsyncd has "fake super = yes" set in /etc/rsyncd.conf.

What can I do so that the file is copied without changing the permissions of the file on the server?

Torsten Bronger , asked Dec 29 '12 at 10:15
If you use RSync as daemon on destination, please post grep rsync /var/log/daemon to improve your question – F. Hauri Dec 29 '12 at 13:23

As you appear to have root access to both servers, have you tried --force?

Alternatively you could bypass the rsync daemon and try a direct sync e.g.

rsync -optg --rsh=/usr/bin/ssh --rsync-path=/usr/bin/rsync --verbose --recursive --delete-after --force  root@sheldon::media /mnt/media
arober11 , answered Dec 29 '12 at 13:21 (edited Jan 2 '13 at 10:55)
Using ssh means encryption, which makes things slower. --force does only affect directories, if I read the man page correctly. – Torsten Bronger Jan 1 '13 at 23:08
Unless you're using ancient kit, the CPU overhead of encrypting / decrypting the traffic shouldn't be noticeable, but you will lose 10-20% of your bandwidth through the encapsulation process. Then again 80% of a working link is better than 100% of a non working one :) – arober11 Jan 2 '13 at 10:52
I do have an "ancient kit". ;-) (Slow ARM CPU on a NAS.) But I now mount the NAS with NFS and use rsync (with "sudo") locally. This solves the problem (and is even faster). However, I still think that my original problem must be solvable using the rsync protocol (remote, no ssh). – Torsten Bronger Jan 4 '13 at 7:55

[Aug 28, 2017] Using rsync under target user to copy home directories

Aug 28, 2017 |

nixnotwin , asked Sep 21 '12 at 5:11

On my Ubuntu server there are about 150 shell accounts. All usernames begin with the prefix u12.. I have root access and I am trying to copy a directory named "somefiles" to all the home directories. After copying the directory the user and group ownership of the directory should be changed to user's. Username, group and home-dir name are same. How can this be done?

Gilles , answered Sep 21 '12 at 23:44

Do the copying as the target user. This will automatically make the copied files owned by that user. Make sure that the original files are world-readable (or at least readable by all the target users). Run chmod afterwards if you don't want the copied files to be world-readable.
getent passwd |
awk -F : '$1 ~ /^u12/ {print $1}' |
while IFS= read -r user; do
  su "$user" -c 'cp -Rp /original/location/somefiles ~/'
done
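
Before running a loop like this as root against 150 accounts, it can help to check which accounts the filter selects. The sketch below is a dry run under stated assumptions: list_users_with_prefix is a hypothetical helper, the passwd entries are invented sample data standing in for `getent passwd` output, and the prefix test uses awk's index() instead of the /^u12/ regular expression so that the prefix can be passed as a parameter.

```shell
# Select login names with a given prefix from passwd-format input.
# list_users_with_prefix is a hypothetical helper name.
list_users_with_prefix() {
  awk -F : -v prefix="$1" 'index($1, prefix) == 1 { print $1 }'
}

# Invented sample data standing in for `getent passwd` output.
sample='root:x:0:0:root:/root:/bin/bash
u12alice:x:1001:1001::/home/u12alice:/bin/bash
bob:x:1002:1002::/home/bob:/bin/bash
u12carol:x:1003:1003::/home/u12carol:/bin/sh'

selected=$(printf '%s\n' "$sample" | list_users_with_prefix u12)

# Dry run: print the copy command for each account instead of running it.
printf '%s\n' "$selected" | while IFS= read -r user; do
  echo "su $user -c 'cp -Rp /original/location/somefiles ~/'"
done
```

Once the printed commands look right, the echo can be replaced by the real su call and the sample data by `getent passwd`.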

[Aug 28, 2017] rsync over SSH preserve ownership only for www-data owned files

Aug 28, 2017 |
jeffery_the_wind , asked Mar 6 '12 at 15:36

I am using rsync to replicate a web folder structure from a local server to a remote server. Both servers are ubuntu linux. I use the following command, and it works well:
rsync -az /var/www/ user@

The usernames for the local system and the remote system are different. From what I have read it may not be possible to preserve all file and folder owners and groups. That is OK, but I would like to preserve owners and groups just for the www-data user, which does exist on both servers.

Is this possible? If so, how would I go about doing that?


** EDIT **

There is some mention of rsync being able to preserve ownership and groups on remote file syncs here:

** EDIT 2 **

I ended up getting the desired effect thanks to many of the helpful comments and answers here. Assuming the IP of the source machine is and the IP of the destination machine is I can use this line from the destination machine:

sudo rsync -az user@ /var/www/

This preserves the ownership and groups of the files that have a common user name, like www-data. Note that using rsync without sudo does not preserve these permissions.

ghoti , answered Mar 6 '12 at 19:01

You can also sudo the rsync on the target host by using the --rsync-path option:
# rsync -av --rsync-path="sudo rsync" /path/to/files user@targethost:/path

This lets you authenticate as user on targethost, but still get privileged write permission through sudo . You'll have to modify your sudoers file on the target host to avoid sudo's request for your password. See man sudoers or run sudo visudo for instructions and samples.

You mention that you'd like to retain the ownership of files owned by www-data, but not other files. If this is really true, then you may be out of luck unless you implement chown or a second run of rsync to update permissions. There is no way to tell rsync to preserve ownership for just one user .

That said, you should read about rsync's --files-from option.

rsync -av /path/to/files user@targethost:/path
find /path/to/files -user www-data -print | \
  rsync -av --files-from=- --rsync-path="sudo rsync" /path/to/files user@targethost:/path

I haven't tested this, so I'm not sure exactly how piping find's output into --files-from=- will work. You'll undoubtedly need to experiment.

xato , answered Mar 6 '12 at 15:39

As far as I know, you cannot chown files to somebody else than you, if you are not root. So you would have to rsync using the www-data account, as all files will be created with the specified user as owner. So you need to chown the files afterwards.

user2485267 , answered Jun 14 '13 at 8:22

I had a similar problem and cheated the rsync command,

rsync -avz --delete root@x.x.x.x:/home//domains/site/public_html/ /home/domains2/public_html && chown -R wwwusr:wwwgrp /home/domains2/public_html/

The && runs the chown against the folder only when the rsync completes successfully. (A single '&' would instead start rsync in the background and run the chown immediately; use ';' to run the chown afterwards regardless of rsync's exit status.)

Graham , answered Mar 6 '12 at 15:51

The root users for the local system and the remote system are different.

What does this mean? The root user is uid 0. How are they different?

Any user with read permission to the directories you want to copy can determine what usernames own what files. Only root can change the ownership of files being written .

You're currently running the command on the source machine, which restricts your writes to the permissions associated with user@ Instead, you can try to run the command as root on the target machine. Your read access on the source machine isn't an issue.

So on the target machine (, assuming the source is

# rsync -az user@ /var/www/

Make sure your groups match on both machines.

Also, set up access to user@ using a DSA or RSA key, so that you can avoid having passwords floating around. For example, as root on your target machine, run:

# ssh-keygen -t rsa

Then take the contents of the file /root/.ssh/ and add it to ~user/.ssh/authorized_keys on the source machine. You can ssh user@ as root from the target machine to see if it works. If you get a password prompt, check your error log to see why the key isn't working.

ghoti , answered Mar 6 '12 at 18:54

Well, you could skip the challenges of rsync altogether, and just do this through a tar tunnel.
sudo tar zcf - /path/to/files | \
  ssh user@remotehost "cd /some/path; sudo tar zxf -"

You'll need to set up your SSH keys as Graham described.

Note that this handles full directory copies, not incremental updates like rsync.
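
The shape of the tar tunnel can be tried out locally before involving ssh and sudo. This is a minimal sketch in which a subshell that changes directory stands in for the remote side; the directory and file names are invented for illustration.

```shell
# Local stand-in for the ssh tunnel; src and dst are throwaway directories.
src=$(mktemp -d)
dst=$(mktemp -d)
mkdir -p "$src/www/img"
echo "hello" > "$src/www/index.html"
chmod 640 "$src/www/index.html"

# Writer | reader, the same shape as "tar zcf - ... | ssh host 'tar zxf -'".
# -C switches directory before archiving; -p restores the stored permissions.
tar cf - -C "$src" www | (cd "$dst" && tar xpf -)

copied=$(cat "$dst/www/index.html")
mode=$(ls -l "$dst/www/index.html" | cut -c1-10)
if [ -d "$dst/www/img" ]; then subdir=present; else subdir=missing; fi
echo "content: $copied, mode: $mode, subdirectory: $subdir"
rm -rf "$src" "$dst"
```

Ownership is only restored when the extracting tar runs as root, which is why the remote side in the answer above is started with sudo.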

The idea here is that:

[Aug 28, 2017] rsync and file permissions

Aug 28, 2017 |
I'm trying to use rsync to copy a set of files from one system to another. I'm running the command as a normal user (not root). On the remote system, the files are owned by apache and when copied they are obviously owned by the local account (fred).

My problem is that every time I run the rsync command, all files are re-synched even though they haven't changed. I think the issue is that rsync sees the file owners are different and my local user doesn't have the ability to change ownership to apache, but I'm not including the -a or -o options so I thought this would not be checked. If I run the command as root, the files come over owned by apache and do not come a second time if I run the command again. However I can't run this as root for other reasons. Here is the command:

/usr/bin/rsync --recursive --rsh=/usr/bin/ssh --rsync-path=/usr/bin/rsync --verbose /local/dir
Fred Snertz , asked May 2 '11 at 23:43
Why can't you run rsync as root? On the remote system, does fred have read access to the apache-owned files? – chrishiestand May 3 '11 at 0:32
Ah, I left out the fact that there are ssh keys set up so that local fred can become remote root, so yes fred/root can read them. I know this is a bit convoluted but its real. – Fred Snertz May 3 '11 at 14:50
Always be careful when root can ssh into the machine. But if you have password and challenge response authentication disabled it's not as bad. – chrishiestand May 3 '11 at 17:32

Here's the answer to your problem:
-c, --checksum
      This changes the way rsync checks if the files have been changed and are in need of a  transfer.   Without  this  option,
      rsync  uses  a "quick check" that (by default) checks if each file's size and time of last modification match between the
      sender and receiver.  This option changes this to compare a 128-bit checksum for each file  that  has  a  matching  size.
      Generating  the  checksums  means  that both sides will expend a lot of disk I/O reading all the data in the files in the
      transfer (and this is prior to any reading that will be done to transfer changed files), so this  can  slow  things  down
      significantly.

      The  sending  side  generates  its checksums while it is doing the file-system scan that builds the list of the available
      files.  The receiver generates its checksums when it is scanning for changed files, and will checksum any file  that  has
      the  same  size  as the corresponding sender's file:  files with either a changed size or a changed checksum are selected
      for transfer.

      Note that rsync always verifies that each transferred file was correctly reconstructed on the receiving side by  checking
      a  whole-file  checksum  that is generated as the file is transferred, but that automatic after-the-transfer verification
      has nothing to do with this option's before-the-transfer "Does this file need to be updated?" check.

      For protocol 30 and beyond (first supported in 3.0.0), the checksum used is MD5.  For older protocols, the checksum  used
      is MD4.

So run:

/usr/bin/rsync -c --recursive --rsh=/usr/bin/ssh --rsync-path=/usr/bin/rsync --verbose /local/dir

Note there may be a time+disk churn tradeoff by using this option. Personally, I'd probably just sync the file's mtimes too:

/usr/bin/rsync -t --recursive --rsh=/usr/bin/ssh --rsync-path=/usr/bin/rsync --verbose /local/dir
chrishiestand , answered May 3 '11 at 17:48 (edited May 3 '11 at 17:55)
Awesome. Thank you. Looks like the second option is going to work for me and I found the first very interesting. – Fred Snertz May 3 '11 at 18:40
psst, hit the green checkbox to give my answer credit ;-) Thx. – chrishiestand May 12 '11 at 1:56
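
The reason -t fixes the repeated transfers can be illustrated without rsync. The sketch below imitates the "quick check" (same size and same modification time) with plain shell tests; quick_check_same is a hypothetical helper, not rsync's code, and cp with and without -p stands in for rsync with and without -t.

```shell
# quick_check_same is a hypothetical helper mimicking the size + mtime test.
quick_check_same() {
  [ "$(wc -c < "$1")" -eq "$(wc -c < "$2")" ] &&
    [ ! "$1" -nt "$2" ] && [ ! "$2" -nt "$1" ]
}

work=$(mktemp -d)
echo "payload" > "$work/orig"
touch -t 202001010000 "$work/orig"   # give the source an old mtime

cp    "$work/orig" "$work/plain"     # like rsync without -t: mtime is "now"
cp -p "$work/orig" "$work/kept"      # like rsync -t: mtime is preserved

if quick_check_same "$work/orig" "$work/plain"; then plain=skip; else plain=transfer; fi
if quick_check_same "$work/orig" "$work/kept"; then kept=skip; else kept=transfer; fi
echo "copy without preserved mtime: $plain"
echo "copy with preserved mtime:    $kept"
rm -rf "$work"
```

The copy whose mtime was not preserved looks changed on every run even though its content is identical, which matches the behaviour described in the question.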

[Aug 28, 2017] Why does rsync fail to copy files from /sys in Linux?

Notable quotes:
"... pseudo file system ..."
"... pseudo filesystems ..."
Aug 28, 2017 |

Eugene Yarmash , asked Apr 24 '13 at 16:35

I have a bash script which uses rsync to backup files in Archlinux. I noticed that rsync failed to copy a file from /sys , while cp worked just fine:
# rsync /sys/class/net/enp3s1/address /tmp    
rsync: read errors mapping "/sys/class/net/enp3s1/address": No data available (61)
rsync: read errors mapping "/sys/class/net/enp3s1/address": No data available (61)
ERROR: address failed verification -- update discarded.
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1052) [sender=3.0.9]

# cp  /sys/class/net/enp3s1/address /tmp   ## this works

I wonder why does rsync fail, and is it possible to copy the file with it?

mattdm , answered Apr 24 '13 at 18:20

Rsync has code which specifically checks if a file is truncated during read and gives this error: ENODATA . I don't know why the files in /sys have this behavior, but since they're not real files, I guess it's not too surprising. There doesn't seem to be a way to tell rsync to skip this particular check.

I think you're probably better off not rsyncing /sys and using specific scripts to cherry-pick out the particular information you want (like the network card address).

Runium , answered Apr 25 '13 at 0:23

First off, /sys is a pseudo file system . If you look at /proc/filesystems you will find a list of registered file systems, quite a few of which have nodev in front. This indicates that they are pseudo filesystems : they exist on a running kernel as RAM-based filesystems and do not require a block device.
$ cat /proc/filesystems
nodev   sysfs
nodev   rootfs
nodev   bdev

At boot the kernel mounts this file system and updates its entries as needed, e.g. when new hardware is found during boot or by udev .

In /etc/mtab you typically find the mount by:

sysfs /sys sysfs rw,noexec,nosuid,nodev 0 0

For a nice paper on the subject read Patrick Mochel's – The sysfs Filesystem .

stat of /sys files

If you go into a directory under /sys and do an ls -l you will notice that all files have the same size, typically 4096 bytes. This is reported by sysfs .

:/sys/devices/pci0000:00/0000:00:19.0/net/eth2$ ls -l
-r--r--r-- 1 root root 4096 Apr 24 20:09 addr_assign_type
-r--r--r-- 1 root root 4096 Apr 24 20:09 address
-r--r--r-- 1 root root 4096 Apr 24 20:09 addr_len

Further, you can do a stat on a file and notice another distinct feature: it occupies 0 blocks. Also, the inode of the root (stat /sys) is 1, /sys/fs typically has inode 2, etc.

rsync vs. cp

The easiest way to explain why rsync fails to synchronize pseudo files is perhaps by example.

Say we have a file named address that is 18 bytes. An ls or stat of the file reports 4096 bytes.

rsync:

  1. Opens file descriptor, fd.
  2. Uses fstat(fd) to get information such as size.
  3. Set out to read size bytes, i.e. 4096. That would be line 253 of the code linked by @mattdm . read_size == 4096
    1. Ask; read: 4096 bytes.
    2. A short string is read i.e. 18 bytes. nread == 18
    3. read_size = read_size - nread (4096 - 18 = 4078)
    4. Ask; read: 4078 bytes
    5. 0 bytes read (as first read consumed all bytes in file).
    6. nread == 0 , line 255
    7. Unable to read 4096 bytes. Zero out buffer.
    8. Set error ENODATA .
    9. Return.
  4. Report error.
  5. Retry. (Above loop).
  6. Fail.
  7. Report error.
  8. FINE.

During this process rsync actually reads the entire file. But because the reported size does not match what was read, it cannot validate the result, so failure is the only option.

cp:

  1. Opens file descriptor, fd.
  2. Uses fstat(fd) to get information such as st_size (also uses lstat and stat).
  3. Check if file is likely to be sparse. That is the file has holes etc.
    /* Use a heuristic to determine whether SRC_NAME contains any sparse
     * blocks.  If the file has fewer blocks than would normally be
     * needed for a file of its size, then at least one of the blocks in
     * the file is a hole.  */
    sparse_src = is_probably_sparse (&src_open_sb);

    As stat reports file to have zero blocks it is categorized as sparse.

  4. Tries to read file by extent-copy (a more efficient way to copy normal sparse files), and fails.
  5. Copy by sparse-copy.
    1. Starts out with max read size of MAXINT.
      Typically 18446744073709551615 bytes on a 64 bit system.
    2. Ask; read 4096 bytes. (Buffer size allocated in memory from stat information.)
    3. A short string is read i.e. 18 bytes.
    4. Check if a hole is needed, nope.
    5. Write buffer to target.
    6. Subtract 18 from max read size.
    7. Ask; read 4096 bytes.
    8. 0 bytes as all got consumed in first read.
    9. Return success.
  6. All OK. Update flags for file.
  7. FINE.
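
The difference between the two walkthroughs can be condensed into a toy model. The sketch below is an illustration under stated assumptions, not code from rsync or cp: read_like_rsync and read_like_cp are hypothetical helpers, a regular temporary file stands in for the sysfs entry, and 4096 is passed in by hand as the size that stat would report.

```shell
# Both helpers are hypothetical simplifications of the two read strategies.
read_like_rsync() {   # $1 = size reported by stat, $2 = file
  got=$(wc -c < "$2")
  if [ "$got" -lt "$1" ]; then
    # Fewer bytes than the reported size: treat the file as truncated.
    echo "ENODATA: wanted $1 bytes, got $got"
    return 1
  fi
  echo "ok: $got bytes"
}

read_like_cp() {      # $1 = file; read until EOF, ignore the reported size
  got=$(wc -c < "$1")
  echo "ok: $got bytes"
}

f=$(mktemp)
printf '00:11:22:33:44:55\n' > "$f"   # 18 bytes, like a MAC address entry

rsync_result=$(read_like_rsync 4096 "$f") || true
cp_result=$(read_like_cp "$f")
echo "rsync-style read: $rsync_result"
echo "cp-style read:    $cp_result"
rm -f "$f"
```

The strategy that trusts the reported size fails on the short read, while the strategy that reads until end of file succeeds with the same 18 bytes.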


Might be related, but extended attribute calls will fail on sysfs:

[root@hypervisor eth0]# lsattr address

lsattr: Inappropriate ioctl for device While reading flags on address

[root@hypervisor eth0]#

Looking at my strace it looks like rsync tries to pull in extended attributes by default:

22964 <... getxattr resumed> , 0x7fff42845110, 132) = -1 ENODATA (No data available)

I tried finding a flag to give rsync to see if skipping extended attributes resolves the issue but wasn't able to find anything ( --xattrs turns them on at the destination).

[Aug 28, 2017] Rsync doesn't copy everything

Aug 28, 2017 |

Scormen May 31st, 2009, 10:09 AM Hi all,

I'm having some trouble with rsync. I'm trying to sync my local /etc directory to a remote server, but this won't work.

The problem is that it doesn't seem to copy all the files.
The local /etc dir contains 15MB of data, after a rsync, the remote backup contains only 4.6MB of data.

Rsync is run as root. I'm using this command:

rsync --rsync-path="sudo rsync" -e "ssh -i /root/.ssh/backup" -avz --delete --delete-excluded -h --stats /etc kris@

I hope someone can help.


Scormen May 31st, 2009, 11:05 AM I found that if I do a local sync, everything goes fine.
But if I do a remote sync, it copies only 4.6MB.

Any idea?

LoneWolfJack May 31st, 2009, 05:14 PM never used rsync on a remote machine, but "sudo rsync" looks wrong. you probably can't call sudo like that so the ssh connection needs to have the proper privileges for executing rsync.

just an educated guess, though.

Scormen May 31st, 2009, 05:24 PM Thanks for your answer.

In /etc/sudoers I have added next line, so "sudo rsync" will work.

kris ALL=NOPASSWD: /usr/bin/rsync

I also tried without --rsync-path="sudo rsync", but without success.

I have also tried on the server to pull the files from the laptop, but that doesn't work either.

LoneWolfJack May 31st, 2009, 05:30 PM in the rsync help file it says that --rsync-path is for the path to rsync on the remote machine, so my guess is that you can't use sudo there as it will be interpreted as a path.

so you will have to do --rsync-path="/path/to/rsync" and make sure the ssh login has root privileges if you need them to access the files you want to sync.

--rsync-path="sudo rsync" probably fails because
a) sudo is interpreted as a path
b) the space isn't escaped
c) sudo probably won't allow itself to be called remotely

again, this is not more than an educated guess.

Scormen May 31st, 2009, 05:45 PM I understand what you mean, so I tried also:

rsync -Cavuhzb --rsync-path="/usr/bin/rsync" -e "ssh -i /root/.ssh/backup" /etc kris@

Then I get this error:

sending incremental file list
rsync: recv_generator: failed to stat "/home/kris/backup/laptopkris/etc/chatscripts/pap": Permission denied (13)
rsync: recv_generator: failed to stat "/home/kris/backup/laptopkris/etc/chatscripts/provider": Permission denied (13)
rsync: symlink "/home/kris/backup/laptopkris/etc/cups/ssl/server.crt" -> "/etc/ssl/certs/ssl-cert-snakeoil.pem" failed: Permission denied (13)
rsync: symlink "/home/kris/backup/laptopkris/etc/cups/ssl/server.key" -> "/etc/ssl/private/ssl-cert-snakeoil.key" failed: Permission denied (13)
rsync: recv_generator: failed to stat "/home/kris/backup/laptopkris/etc/ppp/peers/provider": Permission denied (13)
rsync: recv_generator: failed to stat "/home/kris/backup/laptopkris/etc/ssl/private/ssl-cert-snakeoil.key": Permission denied (13)

sent 86.85K bytes received 306 bytes 174.31K bytes/sec
total size is 8.71M speedup is 99.97
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1058) [sender=3.0.5]

And the same command with "root" instead of "kris".
Then, I get no errors, but I still don't have all the files synced.

Scormen June 1st, 2009, 09:00 AM Sorry for this bump.
I'm still having the same problem.

Any idea?


binary10 June 1st, 2009, 10:36 AM

Maybe there's a nicer way, but you could place a copy of /usr/bin/rsync in a private protected area, set its owner to root, set the setuid bit on it, and change your rsync-path argument, like this:

# on the remote side, aka kris@
mkdir priv-area
# protect it from normal users running a priv version of rsync
chmod 700 priv-area
cd priv-area
cp -p /usr/bin/rsync ./rsync-priv
sudo chown 0:0 ./rsync-priv
sudo chmod +s ./rsync-priv
ls -ltra # rsync-priv should now be 'bold-red' in bash

Looking at your flags, you've specified CVS-style ignore rules (-C), skipping of files that are newer on the target (-u), and backups of replaced files (-b).

rsync -Cavuhzb --rsync-path="/home/kris/priv-area/rsync-priv" -e "ssh -i /root/.ssh/backup" /etc kris@

From those qualifiers you're not going to be getting everything sync'd. It's doing what you're telling it to do.

If you really want a like-for-like backup (not keeping stuff that has been changed or deleted on the source), I'd go for something like the following.

rsync --archive --delete --hard-links --one-file-system --acls --xattrs --dry-run -i --rsync-path="/home/kris/priv-area/rsync-priv" --rsh="ssh -i /root/.ssh/backup" /etc/ kris@

Remove the --dry-run and -i when you're happy with the output, and it should do what you want. A word of warning, I get a bit nervous when not seeing trailing (/) on directories as it could lead to all sorts of funnies if you end up using rsync on softlinks.

Scormen June 1st, 2009, 12:19 PM Thanks for your help, binary10.

I've tried what you have said, but still, I only receive 4.6MB on the remote server.
Thanks for the warning, I'll note that!

Has anyone already tried to rsync their own /etc to a remote system? Just to know if this strange thing only happens to me...


binary10 June 1st, 2009, 01:22 PM

Ok so I've gone back and looked at your original post, how are you calculating 15MB of data under etc - via a du -hsx /etc/ ??

I do daily drive to drive backup copies via rsync and drive to network copies.. and have used them recently for restoring.

Sure my du -hsx /etc/ reports 17MB of data of which 10MB gets transferred via an rsync. My backup drives still operate.

rsync 3.0.6 has some fixes to do with ACLs and special devices rsyncing between solaris. but I think 3.0.5 is still ok with ubuntu to ubuntu systems.

Here is my test doing exactly what you're probably trying to do. I even check the remote end.

binary10@jsecx25:~/bin-priv$ ./rsync --archive --delete --hard-links --one-file-system --stats --acls --xattrs --human-readable --rsync-path="~/bin/rsync-priv-os-specific" --rsh="ssh" /etc/ rsyncbck@

Number of files: 3121
Number of files transferred: 1812
Total file size: 10.04M bytes
Total transferred file size: 10.00M bytes
Literal data: 10.00M bytes
Matched data: 0 bytes
File list size: 109.26K
File list generation time: 0.002 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 10.20M
Total bytes received: 38.70K

sent 10.20M bytes received 38.70K bytes 4.09M bytes/sec
total size is 10.04M speedup is 0.98

binary10@jsecx25:~/bin-priv$ sudo du -hsx /etc/
17M /etc/

And then on the remote system I do the du -hsx

binary10@lenovo-n200:/home/kris/backup/laptopkris/etc$ cd ..
binary10@lenovo-n200:/home/kris/backup/laptopkris$ sudo du -hsx etc
17M etc

Scormen June 1st, 2009, 01:35 PM "how are you calculating 15MB of data under etc - via a du -hsx /etc/ ??"
Indeed, on my laptop I see:

root@laptopkris:/home/kris# du -sh /etc/
15M /etc/

If I do the same thing after a fresh sync to the server, I see:

root@server:/home/kris# du -sh /home/kris/backup/laptopkris/etc/
4.6M /home/kris/backup/laptopkris/etc/

On both sides, I have installed Ubuntu 9.04, with version 3.0.5 of rsync.
So strange...

binary10 June 1st, 2009, 01:45 PM it does seem a bit odd.

I'd start doing a few diffs from the outputs find etc/ -printf "%f %s %p %Y\n" | sort

And see what type of files are missing.

- edit - Added the %Y file type.

Scormen June 1st, 2009, 01:58 PM Hmm, it's getting stranger.
Now I see that I have all my files on the server, but they don't have their full size (bytes).

I have uploaded the files, so you can look into them.


binary10 June 1st, 2009, 02:16 PM If you look at the files that differ, e.g. the SSL ones, they are links to files elsewhere, i.e. under /usr rather than within /etc/.

Those link targets are different on your laptop and on the server.

Scormen June 1st, 2009, 02:25 PM I understand that soft links are just copied, and not the "full file".

But, you have run the same command to test, a few posts ago.
How is it possible that you can see the full 15MB?

binary10 June 1st, 2009, 02:34 PM I was starting to think that this was a bug with du.

The de-referencing is a bit topsy.

If you rsync copy the remote backup back to a new location back onto the laptop and do the du command. I wonder if you'll end up with 15MB again.

Scormen June 1st, 2009, 03:20 PM Good tip.

On the server side, the backup of the /etc was still 4.6MB.
I have rsynced it back to the laptop, to a new directory.

If I go on the laptop to that new directory and do a du, it says 15MB.

binary10 June 1st, 2009, 03:34 PM

I think you've now confirmed that rsync DOES copy everything. It was just du confusing what you had expected, by counting the sizes of the link targets.

You might also think about what you're copying; maybe you need more than just /etc. Of course it depends on what you are trying to do with the backup :)


Scormen June 1st, 2009, 03:37 PM Yeah, it seems to work well.
So, the "problem" was just the soft links, which couldn't be counted on the server side?

binary10 June 1st, 2009, 04:23 PM

The links were copied as links as per the design of the --archive in rsync.

The contents that the links point to were different between your two systems, since those targets reside outside /etc/, in /usr, and so du reported them differently.

Scormen June 1st, 2009, 05:36 PM Okay, I got it.
Many thanks for the support, binary10!

Scormen June 1st, 2009, 05:59 PM Just to know, is it possible to copy the data from these links as real, hard data?

binary10 June 2nd, 2009, 09:54 AM

Yep absolutely

You should then look at other possibilities of:

-L, --copy-links transform symlink into referent file/dir
--copy-unsafe-links only "unsafe" symlinks are transformed
--safe-links ignore symlinks that point outside the source tree
-k, --copy-dirlinks transform symlink to a dir into referent dir
-K, --keep-dirlinks treat symlinked dir on receiver as dir

But then you'll have to start questioning why you are backing them up like that, especially stuff under /etc/. If you ever wanted to restore it, you'd be restoring full files and not symlinks; the restore result could be a nightmare and create future issues (upgrades etc.), and your backup will be significantly larger: it could be 150MB instead of 4MB.

Scormen June 2nd, 2009, 10:04 AM Okay, now I'm sure what it's doing :)
Is it also possible to show on a system the "real disk usage" of e.g. that /etc directory? So, without the links, so that we get an output of 4.6MB.

Thank you very much for your help!

binary10 June 2nd, 2009, 10:22 AM What does the following respond with.

sudo du --apparent-size -hsx /etc

If you want the real answer, the result from a dry-run rsync should be enough for you.

sudo rsync --dry-run --stats -h --archive /etc/ /tmp/etc/
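
One way two du totals for the same tree can disagree is symlink handling: by default du counts a symlink as a tiny link, while with -L it follows the link and counts the target. A minimal sketch of that effect, assuming a du that supports -k and -L; the directory and file names (outside, etc, big.bin, app.conf, big-link) are made up for the illustration and are not taken from the thread.

```shell
work=$(mktemp -d)
mkdir "$work/outside" "$work/etc"

# 1 MiB of incompressible data that lives outside the tree being measured
head -c 1048576 /dev/urandom > "$work/outside/big.bin"
echo "small config" > "$work/etc/app.conf"
ln -s "$work/outside/big.bin" "$work/etc/big-link"

plain_kb=$(du -sk "$work/etc" | cut -f1)    # symlink counted as a link
deref_kb=$(du -skL "$work/etc" | cut -f1)   # -L follows the link to its target

echo "without -L: ${plain_kb}K   with -L: ${deref_kb}K"
rm -rf "$work"
```

The second figure should be roughly 1 MiB larger than the first, which is the same kind of gap as the 4.6MB versus 15MB readings discussed above.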

[Aug 27, 2017] Diff A Directory Recursively, Ignoring All Binary Files

It is now possible to use -r to recursively compare directories
Aug 27, 2017 |
diff -r dir1/ dir2/ | sed '/Binary\ files\ /d' >outputfile

This recursively compares dir1 to dir2; sed deletes the lines that diff prints for binary files (those containing "Binary files "), and the result is redirected to outputfile.

-- Shannon VanWagner
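
The same filtering can be sketched with grep instead of sed. This is only an illustration under stated assumptions: the directory and file names are invented, and LC_ALL=C is set so that diff prints its English "Binary files ... differ" message, which the pattern relies on.

```shell
work=$(mktemp -d)
mkdir "$work/dir1" "$work/dir2"
echo "alpha" > "$work/dir1/note.txt"
echo "beta"  > "$work/dir2/note.txt"
# Files containing NUL bytes are treated as binary by diff
printf 'a\000b' > "$work/dir1/blob.bin"
printf 'a\000c' > "$work/dir2/blob.bin"

# Anchoring the pattern at line start avoids dropping text lines
# that merely mention the words "Binary files"
report=$(cd "$work" && LC_ALL=C diff -r dir1 dir2 | grep -v '^Binary files ')
printf '%s\n' "$report"
rm -rf "$work"
```

Only the differences in note.txt should remain in the report.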

[Aug 14, 2017] Cut command on RHEL 6.8 compatibility issues Unix Linux Forums Shell Programming and Scripting

Notable quotes:
"... Much better: change your scripts. Run the following fix_cut script on your scripts: ..."
Aug 14, 2017 |

Vikram Jain, 06-29-2016

Cut command on RHEL 6.8 compatibility issues

We have a lot of scripts using cut as:
cut -c 0-8
This works for cut (GNU coreutils) 5.97, but does not work for cut (GNU coreutils) 8.4, which gives the error:


cut: fields and positions are numbered from 1
Try `cut --help' for more information.
The position needs to start with 1 for later versions of cut, and this is causing an issue.

Is there a way where I can have multiple cut versions installed and use the older version of cut for the user which runs the script?

or any other work around without having to change the scripts?



Don Cragun (Administrator)

What are you trying to do when you invoke


cut -c 0-8

with your old version of cut?

With that old version of cut , is there any difference in the output produced by the two pipelines:


echo 0123456789abcdef | cut -c 0-8



echo 0123456789abcdef | cut -c 1-8

or do they produce the same output?

Vikram Jain, 06-30-2016

I am trying to get a value from the 1st line of the file and check if that value is a valid date or not.
Below is the output of the cut command from the new version:


$ echo 0123456789abcdef | cut -c 0-8
cut: fields and positions are numbered from 1
Try `cut --help' for more information.
$ echo 0123456789abcdef | cut -c 1-8

With the old version, both have the same results:


$ echo 0123456789abcdef | cut -c 0-8
$ echo 0123456789abcdef | cut -c 1-8



Scrutinizer (Moderator)

The use of 0 is not according to specification. Alternatively, you can just omit it, which should work across versions


$ echo 0123456789abcdef | cut -c -8

If you cannot adjust the scripts, you could perhaps create a wrapper script for cut, so that the 0 gets stripped..
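
A quick check of the portable forms, as a sketch: the open-ended range -8 and the explicit range 1-8 select the same characters, so either can replace 0-8 in a script.

```shell
line=0123456789abcdef

from_start=$(printf '%s\n' "$line" | cut -c -8)   # open-ended range: "up to character 8"
explicit=$(printf '%s\n' "$line" | cut -c 1-8)    # explicit 1-based range

echo "$from_start $explicit"   # prints: 01234567 01234567
```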



Vikram Jain

Yes, I don't want to adjust my scripts.
A wrapper for cut looks like something that would work.

Could you please tell me how I would use it, as in, how would I make sure that the wrapper is called and not the cut command which causes the issue?

Don Cragun (Administrator)

The only way to make sure that your wrapper is always called instead of the OS supplied utility is to move the OS supplied utility to a different location and install your wrapper in the location where your OS installed cut originally.

Of course, once you have installed this wrapper, your code might or might not work properly (depending on the quality of your wrapper) and no one else on your system will be able to look at the diagnostics produced by scripts that have bugs in the way they specify field and character ranges so they can identify and fix their code.

My personal opinion is that you should spend time fixing your scripts that call cut -c 0.... , cut -f 0... , and lots of other possible misuses of 0 that are now correctly diagnosed as errors by the new version of cut instead of debugging code to be sure that it changes all of the appropriate 0 characters in its argument list to 1 characters and doesn't change any 0 characters that are correctly specified and do not reference a character 0 or field 0.



MadeInGermany (Moderator)

An update of "cut" will overwrite your wrapper.

Much better: change your scripts. Run the following fix_cut script on your scripts:


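# fix_cut
# PRE is not defined in the post as quoted; the pattern below is an assumed reconstruction.
PRE='\b(cut\s+(?:-\S+\s+)*-[cf]\s*)0-'
for arg; do
  perl -ne 'exit 1 if m/'"$PRE"'/' "$arg" || {
    perl -i -pe 's/'"$PRE"'/${1}1-/g' "$arg"
  }
done

Example: fix all .sh scripts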


fix_cut *.sh


[Jul 17, 2017] Setup Centralized Rsyslog Server On CentOS 7

Jul 17, 2017 |

Install and configure Rsyslog server and client configuration on CentOS 7 server.

YUM configuration in Linux (Mar 24, 2017, 06:00)
kerneltalks: Learn YUM configuration in Linux.

8 Practical Examples of Linux Xargs Command for Beginners (Mar 27, 2017, 13:00)
HowToForge: The Linux xargs command may not be a hugely popular command line tool, but this doesn't take away the fact that it's extremely useful

14 Practical Examples of Linux Find Command for Beginners (Mar 27, 2017, 04:00)
HowToForge: Find is one of the most frequently used Linux commands, and it offers a plethora of features in the form of command line options.

[Jul 16, 2017] How to use a man page Faster than a Google search

Jul 16, 2017 |
It's easy to get into the habit of googling anything you want to know about a command or operation in Linux, but I'd argue there's something even better: a living and breathing, complete reference, the man pages , which is short for manual pages.

The history of man pages predates Linux, all the way back to the early days of Unix. According to Wikipedia , Dennis Ritchie and Ken Thompson wrote the first man pages in 1971, well before the days of personal computers, around the time when many calculators in use were the size of toaster ovens. Man pages also have a reputation of being terse and, in a way, have a language of their own. Just like Unix and Linux, the man pages have not been static, and they continue to be developed and maintained just like the kernel.

Man pages are divided into sections referenced by numbers:

  1. General user commands
  2. System calls
  3. Library functions
  4. Special files and drivers
  5. File formats
  6. Games and screensavers
  7. Miscellanea
  8. System administration commands and daemons

Even so, users generally don't need to know the section where a particular command lies to find what they need.

The files are formatted in a way that may look odd to many users today. Originally, they were written in an old form of markup called troff because they were designed to be printed through a PostScript printer, so they included formatting for headers and other layout aspects. In Linux, groff is used instead.

In my Fedora, the man pages are located in /usr/share/man with subdirectories (like man1 for Section 1 commands) as well as additional subdirectories for translations of the man pages.

If you look up the man page for the command man , you'll see the file man.1.gz , which is the man page compressed with the gzip utility. To access a man page, type a command such as:

man man

for example, to show the man page for man . This uncompresses the man page, interprets the formatting commands, and displays the results with less , so navigation is the same as when you use less .

All man pages should have the following subsections: Name , Synopsis , Description , Examples , and See Also . Many have additional sections, like Options , Exit Status , Environment , Bugs , Files , Author , Reporting Bugs , History , and Copyright .

Breaking down a man page

To explain how to interpret a typical man page, let's use the man page for ls as an example. Under Name , we see

ls - list directory contents

which tells us what ls means in the simplest terms.

Under Synopsis , we begin to see the terseness:

ls [OPTION]... [FILE]...

Any element that occurs inside brackets is optional. The above command means you can legitimately type ls and nothing else. The ellipsis after each element indicates that you can include as many options as you want (as long as they're compatible with each other) and as many files as you want. You can specify a directory name, and you can also use * as a wildcard. For example:


Under Description , we see a more verbose description of what the command does, followed by a list of the available options for the command. The first option for ls is

-a, --all
do not ignore entries starting with .

If we want to use this option, we can either type the short form syntax, -a , or the long form --all . Not all options have two forms (e.g., --author ), and even when they do, they aren't always so obviously related (e.g., -F, --classify ). When you want to use multiple options, you can either type the short forms with spaces in between or type them with a single hyphen and no spaces (as long as they do not require further sub-options). Therefore,




are equivalent.
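
As an illustration of that equivalence, here is a small sketch that compares the two spellings on a scratch directory. The options -a and -F are chosen only because they are mentioned above; the file names are made up.

```shell
work=$(mktemp -d)
touch "$work/.hidden" "$work/visible"

separate=$(ls -a -F "$work")   # short options given one by one
combined=$(ls -aF "$work")     # the same options behind a single hyphen

[ "$separate" = "$combined" ] && echo "same listing"
rm -rf "$work"
```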

The command tar is somewhat unique, presumably due to its long history, in that it doesn't require a hyphen at all for the short form. Therefore,

tar -cvf filearchive.tar thisdirectory

and

tar cvf filearchive.tar thisdirectory

are both legitimate.
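
A small sketch to confirm that both spellings produce the same archive contents. The v flag is left out here to keep the output quiet, and the archive names with-hyphen.tar and no-hyphen.tar are invented for the example.

```shell
work=$(mktemp -d)
mkdir "$work/thisdirectory"
echo "hello" > "$work/thisdirectory/file.txt"

(
  cd "$work" || exit 1
  tar -cf with-hyphen.tar thisdirectory   # options introduced by a hyphen
  tar cf  no-hyphen.tar   thisdirectory   # traditional form without the hyphen
)

with_hyphen=$(tar -tf "$work/with-hyphen.tar")
no_hyphen=$(tar tf "$work/no-hyphen.tar")

[ "$with_hyphen" = "$no_hyphen" ] && echo "both archives list the same members"
rm -rf "$work"
```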

On the ls man page, after Description are Author , Reporting Bugs , Copyright , and See Also .

The See Also section will often suggest related man pages, so it is generally worth a glance. After all, there is much more to man pages than just commands.

Certain commands that are specific to Bash and not system commands, like alias , cd , and a number of others, are listed together in a single BASH_BUILTINS man page. While the documentation for these is even more terse and compact, overall it contains similar information.

I find that man pages offer a lot of good, usable information, especially when I need a command I haven't used recently, and I need to brush up on the options and requirements. This is one place where the man pages' much-maligned terseness is actually very beneficial.

About the author: Greg Pittman - Greg is a retired neurologist in Louisville, Kentucky, with a long-standing interest in computers and programming, beginning with Fortran IV in the 1960s. When Linux and open source software came along, it kindled a commitment to learning more, and eventually contributing. He is a member of the Scribus Team.

[Jun 29, 2017] printf Command

[Feb 25, 2017] 5 basic cURL command examples

Feb 25, 2017 |
cURL is very useful command line tool to transfer data from or to a server. cURL supports various protocols like FILE, HTTP, HTTPS, IMAP, IMAPS, LDAP, DICT, LDAPS, TELNET, FTP, FTPS, GOPHER, RTMP, RTSP, SCP, SFTP, POP3, POP3S, SMB, SMBS, SMTP, SMTPS, and TFTP.

cURL can be used in many different and interesting ways. With this tool you can download, upload and manage files, check your email, or even update your status on some social media websites or check the weather outside. This article covers five of the most useful and basic uses of the cURL tool on any Linux VPS .

1. Check URL

One of the most common and simplest uses of cURL is typing the command itself, followed by the URL you want to check


This command will display the content of the URL on your terminal

2. Save the output of the URL to a file

The output of the cURL command can be easily saved to a file by adding the -o option to the command, as shown below

curl -o website

% Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
Dload  Upload   Total   Spent    Left  Speed
100 41793    0 41793    0     0   275k      0 --:--:-- --:--:-- --:--:--  2.9M

In this example, the output will be saved to a file named 'website' in the current working directory.

3. Download files with cURL

You can download files with cURL by adding the -O option to the command. It saves files on the local server under the same names they have on the remote server.

curl -O

In this example, the zip archive will be downloaded to the current working directory.

You can also download the file with a different name by adding the -o option to cURL.

curl -o

This way the archive will be downloaded and saved under the new name.
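
A self-contained sketch of the -o and -O options. It uses a local file:// URL as a stand-in for a real http(s) address so that it runs without network access; the file name page.html and the scratch directories are assumptions made for the example.

```shell
work=$(mktemp -d)
mkdir "$work/src" "$work/dl"
echo "<html>hello</html>" > "$work/src/page.html"
url="file://$work/src/page.html"   # local stand-in for a real http(s) URL

if command -v curl >/dev/null 2>&1; then
  (
    cd "$work/dl" || exit 1
    curl -s -o website "$url"   # -o: save under a name you choose
    curl -s -O "$url"           # -O: keep the remote file name, page.html
  )
  listing=$(ls "$work/dl")
  printf '%s\n' "$listing"
fi
rm -rf "$work"
```

The download directory should then hold two copies of the same content, one named website and one named page.html.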

cURL can also be used to download multiple files with a single command, as shown in the example below

curl -O -O

cURL can be also used to download files securely via SSH using the following command

curl -u user s

Note that you have to use the full path of the file you want to download

4. Get HTTP header information from a website

You can easily get HTTP header information from any website you want by adding the -I option (capital 'i') to cURL.

curl -I

HTTP/1.1 200 OK
Date: Sun, 16 Oct 2016 23:37:15 GMT
Server: Apache/2.4.23 (Unix)
X-Powered-By: PHP/5.6.24
Connection: close
Content-Type: text/html; charset=UTF-8

5. Access an FTP server

To access your FTP server with cURL use the following command

curl --user username:password

cURL will connect to the FTP server and list all files and directories in user's home directory

You can download a file via FTP

curl --user username:password

and upload a file to the FTP server

curl -T --user username:password

You can check cURL manual page to see all available cURL options and functionalities

man curl


[Feb 20, 2017] Using rsync to back up your Linux system

Feb 20, 2017 |
Another interesting option, and my personal favorite because it increases the power and flexibility of rsync immensely, is the --link-dest option. The --link-dest option allows a series of daily backups that take up very little additional space for each day and also take very little time to create.

Specify the previous day's target directory with this option and a new directory for today. rsync then creates today's new directory and a hard link for each file in yesterday's directory is created in today's directory. So we now have a bunch of hard links to yesterday's files in today's directory. No new files have been created or duplicated. Just a bunch of hard links have been created. Wikipedia has a very good description of hard links . After creating the target directory for today with this set of hard links to yesterday's target directory, rsync performs its sync as usual, but when a change is detected in a file, the target hard link is replaced by a copy of the file from yesterday and the changes to the file are then copied from the source to the target.

So now our command looks like the following.

rsync -aH --delete --link-dest=yesterdaystargetdir sourcedir todaystargetdir

There are also times when it is desirable to exclude certain directories or files from being synchronized. For this, there is the --exclude option. Use this option and the pattern for the files or directories you want to exclude. You might want to exclude browser cache files so your new command will look like this.

rsync -aH --delete --exclude Cache --link-dest=yesterdaystargetdir sourcedir todaystargetdir

Note that each file pattern you want to exclude must have a separate exclude option.

rsync can sync files with remote hosts as either the source or the target. For the next example, let's assume that the source directory is on a remote computer with the hostname remote1 and the target directory is on the local host. Even though SSH is the default communications protocol used when transferring data to or from a remote host, I always add the ssh option. The command now looks like this.

rsync -aH -e ssh --delete --exclude Cache --link-dest=yesterdaystargetdir remote1:sourcedir todaystargetdir

This is the final form of my rsync backup command.
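
The daily rotation described above can be sketched as a small shell function. This is an illustration, not the author's script: the function name daily_backup, the dated directory layout ROOT/YYYY-MM-DD, and the RSYNC override variable are all assumptions. ROOT is expected to be an absolute path, because rsync interprets a relative --link-dest directory relative to the destination directory.

```shell
# daily_backup SRC ROOT: snapshot SRC into ROOT/YYYY-MM-DD, hard-linking
# unchanged files against the most recent earlier snapshot, if there is one.
# Set RSYNC=echo to preview the command instead of running it.
daily_backup() {
  src=$1
  root=$2
  today=$(date +%Y-%m-%d)
  # Newest dated directory other than today's (the names sort chronologically)
  prev=$(ls -1d "$root"/[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] 2>/dev/null |
         grep -v "/$today\$" | tail -n 1)
  # --link-dest is only passed when an earlier snapshot exists
  ${RSYNC:-rsync} -aH --delete ${prev:+"--link-dest=$prev"} "$src" "$root/$today/"
}
```

A preview run such as RSYNC=echo daily_backup /home/ /backups prints the rsync command line that would be executed; both paths here are hypothetical.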

rsync has a very large number of options that you can use to customize the synchronization process. For the most part, the relatively simple commands that I have described here are perfect for making backups for my personal needs. Be sure to read the extensive man page for rsync to learn about more of its capabilities as well as the options discussed here.

[Feb 14, 2017] switching from gnu screen to tmux (updated) Linux~ized

Ability to watch the other user screen is a very valuable option...
Feb 14, 2017 |

ed says: June 16, 2010 at 15:15

screen is really cool, and does some things that I've yet to find counterparts to with tmux, such as the -x option:

[Feb 12, 2017] HowTo Use rsync For Transferring Files Under Linux or UNIX

Feb 12, 2017 |
So what is unique about the rsync command?

It can perform differential uploads and downloads (synchronization) of files across the network, transferring only data that has changed. The rsync remote-update protocol allows rsync to transfer just the differences between two sets of files across the network connection.

How do I install rsync?

Use any one of the following commands to install rsync. If you are using Debian or Ubuntu Linux, type the following command:
# apt-get install rsync
$ sudo apt-get install rsync
If you are using Red Hat Enterprise Linux (RHEL) / CentOS 4.x or older version, type the following command:
# up2date rsync
RHEL / CentOS 5.x or newer (or Fedora Linux) user type the following command:
# yum install rsync

Always use rsync over ssh

Since rsync does not provide any security while transferring data it is recommended that you use rsync over ssh session. This allows a secure remote connection. Now let us see some examples of rsync command.

Common rsync command options

Task : Copy file from a local computer to a remote server

Copy file from /www/backup.tar.gz to a remote server called
$ rsync -v -e ssh /www/backup.tar.gz

sent 19099 bytes  received 36 bytes  1093.43 bytes/sec
total size is 19014  speedup is 0.99

Please note that the symbol ~ indicates the user's home directory (/home/jerry).

Task : Copy file from a remote server to a local computer

Copy file /home/jerry/webroot.txt from a remote server to a local computer's /tmp directory:
$ rsync -v -e ssh /tmp

Task: Synchronize a local directory with a remote directory

$ rsync -r -a -v -e "ssh -l jerry" --delete /local/webroot

Task: Synchronize a remote directory with a local directory

$ rsync -r -a -v -e "ssh -l jerry" --delete /local/webroot

Task: Synchronize a local directory with a remote rsync server or vice versa

$ rsync -r -a -v --delete rsync:// /home/cvs
$ rsync -r -a -v --delete /home/cvs rsync://

Task: Mirror a directory between my "old" and "new" web server/ftp

You can mirror a directory between my "old" and "new" web servers with the following command (assuming that ssh keys are set up for passwordless authentication):
$ rsync -zavrR --delete --links --rsh="ssh -l vivek" /home/lighttpd

Other options – rdiff and rdiff-backup

The rdiff command uses the rsync algorithm. A utility called rdiff-backup has been created which is capable of maintaining a backup mirror of a file or directory over the network, on another server. rdiff-backup stores incremental rdiff deltas with the backup, with which it is possible to recreate any backup point. Next time I will write about these utilities.

rsync for Windows Server/XP/7/8

Please note if you are using MS-Windows, try any one of these programs:

  1. DeltaCopy
  2. NasBackup
Further readings

=> Read rsync man page
=> Official rsync documentation

[Feb 12, 2017] How to Sync Two Apache Web Servers-Websites Using Rsync

Feb 12, 2017 |
The purpose of creating a mirror of your web server with rsync is that, if your main web server fails, your backup server can take over to reduce downtime of your website. This way of creating a web server backup is very good and effective for small and medium size web businesses.

Advantages of Syncing Web Servers

The main advantages of creating a web server backup with rsync are as follows:

  1. Rsync syncs only those bytes and blocks of data that have changed.
  2. Rsync has the ability to check and delete those files and directories at backup server that have been deleted from the main web server.
  3. It takes care of permissions, ownerships and special attributes while copying data remotely.
  4. It also supports SSH protocol to transfer data in an encrypted manner so that you will be assured that all data is safe.
  5. Rsync uses compression and decompression method while transferring data which consumes less bandwidth.
How To Sync Two Apache Web Servers

Let's proceed with setting up rsync to create a mirror of your web server. Here, I'll be using two servers.

Main Server
  1. IP Address :
  2. Hostname :
Backup Server
  1. IP Address :
  2. Hostname :
Step 1: Install Rsync Tool

In this case, the web server data of the main server will be mirrored on the backup server. To do so, we first need to install rsync on both servers with the help of the following command.

[root@tecmint]# yum install rsync        [On Red Hat based systems]
[root@tecmint]# apt-get install rsync    [On Debian based systems]
Step 2: Create a User to run Rsync

We can set up rsync with the root user, but for security reasons you can create an unprivileged user on the main web server to run rsync.

[root@tecmint]# useradd tecmint
[root@tecmint]# passwd tecmint

Here I have created a user " tecmint " and assigned a password to user.

Step 3: Test Rsync Setup

It's time to test your rsync setup on your backup server. To do so, type the following command.

[root@backup www]# rsync -avzhe ssh /var/www
Sample Output's password:
receiving incremental file list
sent 128 bytes  received 32.67K bytes  5.96K bytes/sec
total size is 12.78M  speedup is 389.70

You can see that your rsync is now working absolutely fine and syncing data. I have used " /var/www " to transfer; you can change the folder location according to your needs.

Step 4: Automate Sync with SSH Passwordless Login

Now we are done with the rsync setup, and it's time to set up a cron job for rsync. As we are going to use rsync over the SSH protocol, ssh will ask for authentication, and since cron cannot supply a password, the job will not work. For cron to work smoothly, we need to set up passwordless ssh logins for rsync.

In this example, I am doing it as root to preserve file ownerships as well; you can do it for other users too.

First, we'll generate a public and private key with the following command on the backup server.

[root@backup]# ssh-keygen -t rsa -b 2048

When you enter this command, don't provide a passphrase; press Enter for an empty passphrase so that the rsync cron job will not need any password for syncing data.

Sample Output
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/
The key fingerprint is:
The key's randomart image is:
+--[ RSA 2048]----+
|          .o.    |
|           ..    |
|        ..++ .   |
|        o=E *    |
|       .Sooo o   |
|       =.o o     |
|      * . o      |
|     o +         |
|    . .          |

Now our public and private keys have been generated, and we have to share the public key with the main server so that the main web server will recognize this backup machine and allow it to log in without asking for a password while syncing data.

[root@backup html]# ssh-copy-id -i /root/.ssh/

Now try logging into the machine, with " ssh ' '", and check in .ssh/authorized_keys .

[root@backup html]#

Now, we are done with sharing keys. To know more in-depth about SSH password less login , you can read our article on it.

  1. SSH Passwordless Login in 5 Easy Steps
Step 5: Schedule Cron To Automate Sync

Let's setup a cron for this. To setup a cron, please open crontab file with the following command.

[root@backup ~]# crontab -e

It will open the current user's crontab (not the system-wide /etc/crontab file) in your default editor. In this example, I am writing a cron entry that runs every 5 minutes to sync the data.

*/5        *        *        *        *   rsync -avzhe ssh /var/www/

The above cron entry simply syncs " /var/www/ " from the main web server to the backup server every 5 minutes . You can change the time and folder location according to your needs. To be more creative and customize the rsync and cron commands, you can check out our more detailed articles at:

  1. 10 Rsync Commands to Sync Files/Folders in Linux
  2. 11 Cron Scheduling Examples in Linux

[Feb 12, 2017] How to Use rsync to Synchronize Files Between Servers Linux Server Training 101

Feb 12, 2017 |

Keith Pawson 2 years ago

Great demonstration and very easy to follow Don! Just a note to anyone who might come across this and start using it in production based systems is that you certainly would not want to be rsyncing with root accounts. In addition you would use key based auth with SSH as an additional layer of security. Just my 2cents ;-)

curtis shaw 11 months ago

Best rsync tutorial on the web. Thanks.

[Feb 12, 2017] An Easy Way To Monitor A Website From Command Line In Linux


We all know that ping command will tell you instantly whether the website is live or down. Usually, we all check whether a website is up or down like below.

ping -c 3

Sample output:

PING ( 56(84) bytes of data.
64 bytes from ( icmp_seq=1 ttl=51 time=376 ms
64 bytes from ( icmp_seq=2 ttl=51 time=374 ms

--- ping statistics ---
3 packets transmitted, 2 received, 33% packet loss, time 2000ms
rtt min/avg/max/mdev = 374.828/375.471/376.114/0.643 ms

But would you run this command every time to check whether your website is live or down? You may create a script to check your website status at periodic intervals. But wait. It's not necessary! Here is a simple command that will watch or monitor a site at a regular interval.

watch -n 1 curl -I http://DOMAIN_NAME/

For those who don't know, the watch command is used to run any command at regular intervals.



Let us check if site is live or down. To do so, run:

watch -n 1 curl -I

Sample output:

Every 1.0s: curl -I sk: Thu Dec 22 17:37:24 2016

% Total % Received % Xferd Average Speed Time Time Time Current
 Dload Upload Total Spent Left Speed
 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
 0 0 0 0 0 0 0 0 --:--:-- 0:00:01 --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- 0:00:01 --:--:-- 0
HTTP/1.1 200 OK 
Date: Thu, 22 Dec 2016 12:07:09 GMT
Server: Apache
Vary: Cookie
Link: <>; rel="", <https://w>; rel=shortlink
Content-Type: text/html; charset=UTF-8

The above command will monitor our site at every one second interval. You can change the monitoring time as you wish. Unlike ping command, it will keep watching your site status until you stop it. To stop this command, press CTRL+C.

If you got HTTP/1.1 200 OK in the output, great! It means your website is working and live.

[Aug 01, 2014] Getting Back To Coding


New submitter rrconan writes I always feel like I'm getting old because of the constant need to learn new tools to do the same job. At the end of projects, I get the impression that nothing changes: there are no real benefits to the new tools, and the only result is a lot of time wasted learning them instead of doing the work. We discussed this last week with Andrew Binstock's "Just Let Me Code" article, and now he's written a follow-up about reducing tool complexity and focusing on writing code. He says, "Tool vendors have several misperceptions that stand in the way. The first is a long-standing issue, which is 'featuritis': the tendency to create the perception of greater value in upgrades by adding rarely needed features. ... The second misperception is that many tool vendors view the user experience they offer as already pretty darn good. Compared with tools we had 10 years ago or more, UIs have indeed improved significantly. But they have not improved as fast as complexity has increased. And in that gap lies the problem." Now I understand that what I thought of as "getting old" was really "getting smart."

The most highly rated Linux commands of the past weeks at commandlinefu.

1- Save man-page as pdf

 man -t awk | ps2pdf - awk.pdf

2- Duplicate installed packages from one machine to the other (RPM-based systems)

ssh "rpm -qa" | xargs yum -y install

3- Stamp a text line on top of the pdf pages to quickly add some remark, comment, stamp text, … on top of (each of) the pages of the input pdf file

echo "This text gets stamped on the top of the pdf pages." | enscript -B -f Courier-Bold16 -o- | ps2pdf - | pdftk input.pdf stamp - output output.pdf

4- Display the number of connections to a MySQL Database

Count the number of active connections to a MySQL database.
The MySQL command “show processlist” gives a list of all the active clients.
However, by using the processlist table, in the information_schema database, we can sort and count the results within MySQL.

mysql -u root -p -BNe "select host,count(host) from processlist group by host;" information_schema

5- Create a local compressed tarball from remote host directory

ssh user@host "tar -zcf - /path/to/dir" > dir.tar.gz

This improves on #9892 by compressing the directory on the remote machine so that the amount of data transferred over the network is much smaller. The command uses ssh(1) to get to a remote host, uses tar(1) to archive and compress a remote directory, prints the result to STDOUT, which is written to a local file. In other words, we are archiving and compressing a remote directory to our local box.

6- tail a log over ssh

This is also handy for taking a look at resource usage of a remote box.

ssh -t remotebox "tail -f /var/log/remote.log"

7- Print diagram of user/groups

Parses /etc/group into “dot” format and passes it to “display” (ImageMagick) to show a useful diagram of users and groups (empty groups are not shown).

awk 'BEGIN{FS=":"; print "digraph{"}{split($4, a, ","); for (i in a) printf "\"%s\" [shape=box]\n\"%s\" -> \"%s\"\n", $1, a[i], $1}END{print "}"}' /etc/group|display

8- Draw kernel module dependency graph.

Parse 'lsmod' output, pass it to the 'dot' drawing utility, and finally pass the result to an image viewer

lsmod | perl -e 'print "digraph \"lsmod\" {";<>;while(<>){@_=split/\s+/; print "\"$_[0]\" -> \"$_\"\n" for split/,/,$_[3]}print "}"' | dot -Tpng | display -

9- Create strong, but easy to remember password

Why remember? Generate!
Up to 48 chars, works on any unix-like system

read -s pass; echo $pass | md5sum | base64 | cut -c -16

10- Find all files larger than 500M and less than 1GB

find / -type f -size +500M -size -1024M
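
One caveat worth knowing, assuming GNU find: -size rounds each file's size up to the next whole unit before comparing, so an upper bound written as -1G matches only empty files, while -1024M matches files of up to 1023 MiB. The following is a minimal sketch of that behaviour using small files and kilobyte units; the demo directory and file names are made up for illustration.

```shell
# Sketch of GNU find's round-up behaviour for -size, using KiB units.
# The directory and file names below are hypothetical.
demo=$(mktemp -d)
: > "$demo/empty"                                              # 0 bytes
dd if=/dev/zero of="$demo/small" bs=2500 count=1 2>/dev/null   # 2500 bytes, rounds up to 3k

# "-size -1k": fewer than 1 unit after rounding up, so only the empty file.
below_1k=$(find "$demo" -type f -size -1k)

# "-size +1k -size -4k": more than 1 unit and fewer than 4 units after rounding.
between=$(find "$demo" -type f -size +1k -size -4k)

echo "below 1k: $below_1k"
echo "between:  $between"
```

The same reasoning applies to M and G: the upper bound is safer when expressed in the smaller unit.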

11- Limit the cpu usage of a process

This will limit the average amount of CPU it consumes.

sudo cpulimit -p pid -l 50

[Jul 26, 2011] Pipe Viewer Online Man Page

pv allows a user to see the progress of data through a pipeline, by giving information such as time elapsed, percentage completed (with progress bar), current throughput rate, total data transferred, and ETA.

To use it, insert it in a pipeline between two processes, with the appropriate options. Its standard input will be passed through to its standard output and progress will be shown on standard error.

pv will copy each supplied FILE in turn to standard output (- means standard input), or if no FILEs are specified just standard input is copied. This is the same behaviour as cat(1).

A simple example to watch how quickly a file is transferred using nc(1):

pv file | nc -w 1 3000

A similar example, transferring a file from another process and passing the expected size to pv:

cat file | pv -s 12345 | nc -w 1 3000

A more complicated example using numeric output to feed into the dialog(1) program for a full-screen progress display:

(tar cf - . \
| pv -n -s $(du -sb . | awk '{print $1}') \
| gzip -9 > out.tgz) 2>&1 \
| dialog --gauge 'Progress' 7 70

Frequent use of this third form is not recommended as it may cause the programmer to overheat.

[Jan 24, 2008] Project details for cgipaf

The package also contains a Solaris binary of a chpasswd clone, which is extremely useful for mass password changes in corporate environments that include Solaris and other Unixes that do not have the chpasswd utility (HP-UX is another example in this category). Version 1.3.2 now includes a Solaris binary of chpasswd that works on Solaris 9 and 10.

cgipaf is a combination of three CGI programs.

All programs use PAM for user authentication. It is possible to run a script to update SAMBA passwords or NIS configuration when a password is changed. mailcfg.cgi creates a .procmailrc in the user's home directory. A user with too many invalid logins can be locked. The minimum and maximum UID can be set in the configuration file, so you can specify a range of UIDs that are allowed to use cgipaf.

[Aug 7, 2007] Expect plays a crucial role in network management by Cameron Laird

31 Jul 2007

If you manage systems and networks, you need Expect.

More precisely, why would you want to be without Expect? It saves the hours that common tasks otherwise demand. Even if you already depend on Expect, though, you might not be aware of the capabilities described below.

Expect automates command-line interactions

You don't have to understand all of Expect to begin profiting from the tool; let's start with a concrete example of how Expect can simplify your work on AIX® or other operating systems:

Suppose you have logins on several UNIX® or UNIX-like hosts and you need to change the passwords of these accounts, but the accounts are not synchronized by Network Information Service (NIS), Lightweight Directory Access Protocol (LDAP), or some other mechanism that recognizes you're the same person logging in on each machine. Logging in to a specific host and running the appropriate passwd command doesn't take long—probably only a minute, in most cases. And you must log in "by hand," right, because there's no way to script your password?

Wrong. In fact, the standard Expect distribution (full distribution) includes a command-line tool (and a manual page describing its use!) that precisely takes over this chore. passmass (see Resources) is a short script written in Expect that makes it as easy to change passwords on twenty machines as on one. Rather than retyping the same password over and over, you can launch passmass once and let your desktop computer take care of updating each individual host. You save yourself enough time to get a bit of fresh air, and multiple opportunities for the frustration of mistyping something you've already entered.

The limits of Expect

This passmass application is an excellent model—it illustrates many of Expect's general properties:

You probably know enough already to begin to write or modify your own Expect tools. As it turns out, the passmass distribution actually includes code to log in by means of ssh, but omits the command-line parsing to reach that code. Here's one way you might modify the distribution source to put ssh on the same footing as telnet and the other protocols:
Listing 1. Modified passmass fragment that accepts the -ssh argument

} "-rlogin" {
set login "rlogin"
} "-slogin" {
set login "slogin"
} "-ssh" {
set login "ssh"
} "-telnet" {
set login "telnet"

In my own code, I actually factor out more of this "boilerplate." For now, though, this cascade of tests, in the vicinity of line #100 of passmass, gives a good idea of Expect's readability. There's no deep programming here—no need for object-orientation, monadic application, co-routines, or other subtleties. You just ask the computer to take over typing you usually do for yourself. As it happens, this small step represents many minutes or hours of human effort saved.

[April 23, 2006] Port25 Running Windows Command Line Applications from a Linux Box

Interestingly, the answer does not mention that an ssh server is also available under SFU 3.5.
Research and Analysis

Wednesday, April 19, 2006 5:42 PM by admin

Running Command Line Applications on Windows XP/2000 from a Linux Box:


-----Original Message-----
From: swagner@********
Sent: Thursday, April 13, 2006 2:35 PM
To: Port25 Feedback
Subject: (Port25) : You guys should look into _____
Importance: High

Can you recommend anything for running command line applications on a Windows XP/2000 box from within a program that runs on Linux? For example I want a script to run on a Linux server that will connect to a Windows server, on our network, and run certain commands.


One way to do this would be to install an SSH daemon on the Windows machine and run commands via the ssh client on the Linux machine. Simply search the web for information on setting up the Cygwin SSH daemon as a service in Windows (there are docs about this everywhere). You can then run commands with ssh, somewhat like:

ssh administrator@<hostname> 'touch /cygdrive/c/blar'

That will create a file in C:\ called "blar". You can also access Windows commands if you alter the path in the Cygwin environment or use the full path to the command:

ssh administrator@<hostname> '/cygdrive/c/windows/system32/net.exe view'

re: Running Windows Command Line Applications from a Linux Box

Sunday, April 23, 2006 3:44 AM by nektar

I am disappointed that Microsoft does not offer an SSH implementation with Services for Unix or with SUA.

re: Running Windows Command Line Applications from a Linux Box

Sunday, April 23, 2006 4:36 PM by szlwzl

I would also very much like to see this as a built in feature - cygwin is great and I use it all the time but why not build something like this into the OS?

re: Running Windows Command Line Applications from a Linux Box

Sunday, April 23, 2006 6:05 PM by breiter

I'm stunned that you didn't recommend OpenSSH running on Interix from SFU 3.5 or SUA 5.2. I would much rather rely upon Interix than Cygwin. Interopsystems maintains both a free, straight OpenSSH package and a commercial enhanced version with an MMC-based GUI configurator.

re: Running Windows Command Line Applications from a Linux Box

Monday, April 24, 2006 1:12 AM by vox

Of course if there was an RDP client that could access Windows full screen using a browser (the same way as Virtual Labs work) you could run GUI programs as well

Replies to all

Monday, April 24, 2006 1:30 AM by einhverfr

Hi all.

Nektar wrote:

" I am disappointed that Microsoft does not offer an SSH implementation with Services for Unix or with SUA."

When I was at Microsoft, the legal department raised objections. Not sure if they were trademark related or what. But a good substitute would be a kerberized telnet client and server that would be capable of session encryption as per the Kerberos specification. People usually don't know that this is possible using Kerberos and telnet but it is. And given the architecture of AD, this would lead to close integration.

Vox wrote:
" Of course if there was an RDP client that could access Windows full screen using a browser (the same way as Virtual Labs work) you could run GUI programs as well"

Ever use rdesktop? It doesn't use a browser, but it is close enough that you can easily run GUI apps.

Best Wishes,
Chris Travers
Metatron Technology Consulting

re: Running Windows Command Line Applications from a Linux Box

Monday, April 24, 2006 12:42 PM by docsmooth

rdesktop -0 -f <servername> will work the same as mstsc /console with the fullscreen switch set. As Chris said, it's not a browser, but it's a 100% replacement for MSTSC, and fits every single option, security and otherwise, that is in MSTSC.

Also, KDE users have "krdc" which wraps around rdesktop and VNC, so you can connect to either, and save off your settings, just like saving .RDP files in Windows.


re: Running Windows Command Line Applications from a Linux Box

Monday, April 24, 2006 4:42 PM by docsmooth

I completely forgot this portion to my previous comment:

Is there anyone who has experience running Windows Resource Kit tools or Windows 2003 Support Tools from Wine or similar directly off of the Linux box? It would be fantastic to be able to run those and the MMC tools, perhaps with WinBind as the authentication path?

As things sit right now, I have to run a VMWare WinXP instance, or dual-boot to get access to those tools that I run less frequently than certain FOSS tools, but still need.

re: Running Windows Command Line Applications from a Linux Box

Thursday, April 27, 2006 4:39 PM by smither

Simply install vncserver from, for example,, then use vncviewer on the Linux box. You have your complete Windows desktop within a window in your X server. Open the terminal from the start menu.

re: Running Windows Command Line Applications from a Linux Box

Friday, April 28, 2006 2:49 PM by remdotc

You can either purchase a copy of Cross Over Office and/or Cedega, which allow you to run native Windows binaries on Linux (DirectX), or you can get these to work under Wine, though:
you need to install IE 6.1;
you need to set your OS in wine.conf to 2000;
you need to copy most of the files contained in sysroot/system32 to your winex install;
performance is horrible.

The better solution is to install an ssh server on the Windows box and then remote in via the command line. If you cannot afford a commercial one, you can always use Cygwin.

[Jan 25, 2005] Tool of the Month: ManEdit by Joe "Zonker" Brockmeier

ManEdit is provided by WolfPack Entertainment. I know, that doesn't sound like a company that would be releasing a manual page editor, but they did — and under the GNU General Public License, no less.

It's not terribly difficult to create manual pages using an editor like Emacs or Vim (see my December 2003 column if you'd like to start from scratch) but it's yet another skill that developers and admins have to tackle to learn how to write in *roff format. ManEdit actually uses an XML format that it converts to groff format when saving, so it's not necessary to delve into groff if you don't want to. (I would recommend having at least a passing familiarity with groff if you're going to be doing much development, but it's not absolutely necessary.)

ManEdit is an easy-to-use manual page editor and viewer that takes all the hassle out of creating manual pages (well, the formatting hassle, anyway — you still have to actually write the manual itself).

The ManEdit homepage has source and packages for Debian, Mandrake, Slackware, and SUSE Linux. The source should compile on FreeBSD and Solaris as well, so long as you have GTK 1.2.0. I used the SUSE packages without any problem on a SUSE 9.2 system.

Sys Admin Magazine

Useful Solaris Commands

truss -c (Solaris >= 8): This astounding option to truss provides a profile summary of the command being trussed:

$ truss -c grep asdf work.doc
syscall              seconds   calls  errors
_exit                    .00       1
read                     .01      24
open                     .00       8      4
close                    .00       5
brk                      .00      15
stat                     .00       1
fstat                    .00       4
execve                   .00       1
mmap                     .00      10
munmap                   .01       3
memcntl                  .00       2
llseek                   .00       1
open64                   .00       1
                        ----     ---    ---
sys totals:              .02      76      4
usr time:                .00
elapsed:                 .05

It can also show profile data on a running process. In this case, the data shows what the process did between when truss was started and when truss execution was terminated with a control-c. It’s ideal for determining why a process is hung without having to wade through the pages of truss output.

truss -d and truss -D (Solaris >= 8): These truss options show the time associated with each system call that truss reports, and they are excellent for finding performance problems in custom or commercial code. For example:

$ truss -d who
Base time stamp:  1035385727.3460  [ Wed Oct 23 11:08:47 EDT 2002 ]
 0.0000 execve("/usr/bin/who", 0xFFBEFD5C, 0xFFBEFD64)  argc = 1
 0.0032 stat("/usr/bin/who", 0xFFBEFA98)                = 0
 0.0037 open("/var/ld/ld.config", O_RDONLY)             Err#2 ENOENT
 0.0042 open("/usr/local/lib/", O_RDONLY)      Err#2 ENOENT
 0.0047 open("/usr/lib/", O_RDONLY)            = 3
 0.0051 fstat(3, 0xFFBEF42C)                            = 0
. . .

truss -D is even more useful, showing the time delta between system calls:

Dilbert> truss -D who
 0.0000 execve("/usr/bin/who", 0xFFBEFD5C, 0xFFBEFD64)  argc = 1
 0.0028 stat("/usr/bin/who", 0xFFBEFA98)                = 0
 0.0005 open("/var/ld/ld.config", O_RDONLY)             Err#2 ENOENT
 0.0006 open("/usr/local/lib/", O_RDONLY)      Err#2 ENOENT
 0.0005 open("/usr/lib/", O_RDONLY)            = 3
 0.0004 fstat(3, 0xFFBEF42C)                            = 0

In this example, the stat system call took a lot longer than the others.

truss -T: This is a great debugging help. It will stop a process at the execution of a specified system call. (“-U” does the same, but with user-level function calls.) A core could then be taken for further analysis, or any of the /proc tools could be used to determine many aspects of the status of the process.

truss -l (improved in Solaris 9): Shows the thread number of each call in a multi-threaded process. Solaris 9 truss -l finally makes it possible to watch the execution of a multi-threaded application.

Truss is truly a powerful tool. It can be used on core files to analyze what caused the problem, for example. It can also show details on user-level library calls (either system libraries or programmer libraries) via the “-u” option.

pkg-get: This is a nice tool for automatically getting freeware packages. It is configured via /etc/pkg-get.conf. Once it’s up and running, execute pkg-get -a to get a list of available packages, and pkg-get -i to get and install a given package.

plimit (Solaris >= 8): This command displays and sets the per-process limits on a running process. This is handy if a long-running process is running up against a limit (for example, number of open files). Rather than using limit and restarting the command, plimit can modify the running process.

coreadm (Solaris >= 8): In the “old” days (before coreadm), core dumps were placed in the process’s working directory. Core files would also overwrite each other. All this and more has been addressed by coreadm, a tool to manage core file creation. With it, you can specify whether to save cores, where cores should be stored, how many versions should be retained, and more. Settings can be retained between reboots by coreadm modifying /etc/coreadm.conf.

pgrep (Solaris >= 8): pgrep searches through /proc for processes matching the given criteria, and returns their process-ids. A great option is “-n”, which returns the newest process that matches.
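
As a rough sketch of the “-n” option in use, the function below returns the newest child of the current shell with a given name. The function name newest_child is made up for this example, and it assumes a pgrep that supports -n, -P and -x (as the Solaris and Linux procps versions do).

```shell
newest_child() {
    # -P limits the search to children of this shell, -x requires an exact
    # name match, and -n keeps only the most recently started match.
    pgrep -n -P "$$" -x "$1"
}
```

For example, after starting two background sleep commands a second apart, newest_child sleep should print the PID of the second one.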

preap (Solaris >= 9): Reaps zombie processes. Any processes stuck in the “z” state (as shown by ps), can be removed from the system with this command.

pargs (Solaris >= 9): Shows the arguments and environment variables of a process.

nohup -p (Solaris >= 9): The nohup command can be used to start a process, so that if the shell that started the process closes (i.e., the process gets a “SIGHUP” signal), the process will keep running. This is useful for backgrounding a task that should continue running no matter what happens around it. But what happens if you start a process and later want to HUP-proof it? With Solaris 9, nohup -p takes a process-id and causes SIGHUP to be ignored.
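
The classic start-time form can be sketched as follows. The helper name run_detached and its log-file argument are assumptions made for this example; redirecting all three standard streams keeps nohup from creating nohup.out.

```shell
run_detached() {
    # Usage: run_detached LOGFILE COMMAND [ARGS...]
    log=$1
    shift
    # nohup makes the command ignore SIGHUP; "&" puts it in the background.
    nohup "$@" > "$log" 2>&1 < /dev/null &
    echo "$!"    # print the PID so the caller can check on the job later
}
```

A call such as run_detached /tmp/job.log sh -c 'echo finished' leaves the job running after the starting shell exits, with its output collected in the log file.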

prstat (Solaris >= 8): prstat is top and a lot more. Both commands provide a screen’s worth of process and other information and update it frequently, for a nice window on system performance. prstat has much better accuracy than top. It also has some nice options. “-a” shows process and user information concurrently (sorted by CPU hog, by default). “-c” causes it to act like vmstat (new reports printed below old ones). “-C” shows processes in a processor set. “-j” shows processes in a “project”. “-L” shows per-thread information as well as per-process. “-m” and “-v” show quite a bit of per-process performance detail (including pages, traps, lock wait, and CPU wait). The output data can also be sorted by resident-set (real memory) size, virtual memory size, execute time, and so on. prstat is very useful on systems without top, and should probably be used instead of top because of its accuracy (and some sites care that it is a supported program).

trapstat (Solaris >= 9): trapstat joins lockstat and kstat as the most inscrutable commands on Solaris. Each shows gory details about the innards of the running operating system. Each is indispensable in solving strange happenings on a Solaris system. Best of all, their output is good to send along with bug reports, but further study can reveal useful information for general use as well.

vmstat -p (Solaris >= 8): Until this option became available, it was almost impossible (see the “se toolkit”) to determine what kind of memory demand was causing a system to page. vmstat -p is key because it not only shows whether your system is under memory stress (via the “sr” column), it also shows whether that stress is from application code, application data, or I/O. “-p” can really help pinpoint the cause of any mysterious memory issues on Solaris.

pmap -x (Solaris >= 8, bugs fixed in Solaris >= 9): If the process with memory problems is known, and more details on its memory use are needed, check out pmap -x. The target process-id has its memory map fully explained, as in:

# pmap -x 1779
1779:   -ksh
 Address  Kbytes     RSS    Anon  Locked Mode   Mapped File
00010000     192     192       -       - r-x--  ksh
00040000       8       8       8       - rwx--  ksh
00042000      32      32       8       - rwx--    [ heap ]
FF180000     680     664       -       - r-x--
FF23A000      24      24       -       - rwx--
FF240000       8       8       -       - rwx--
FF280000     568     472       -       - r-x--
FF31E000      32      32       -       - rwx--
FF326000      32      24       -       - rwx--
FF340000      16      16       -       - r-x--
FF350000      16      16       -       - r-x--
FF364000       8       8       -       - rwx--
FF380000      40      40       -       - r-x--
FF39A000       8       8       -       - rwx--
FF3A0000       8       8       -       - r-x--
FF3B0000       8       8       8       - rwx--    [ anon ]
FF3C0000     152     152       -       - r-x--
FF3F6000       8       8       8       - rwx--
FFBFE000       8       8       8       - rw---    [ stack ]
-------- ------- ------- ------- -------
total Kb    1848    1728      40       -

Here we see each chunk of memory, what it is being used for, how much space it is taking (virtual and real), and mode information.

df -h (Solaris >= 9): This command is popular on Linux, and just made its way into Solaris. df -h displays summary information about file systems in human-readable form:

$ df -h
Filesystem             size   used  avail capacity  Mounted on
/dev/dsk/c0t0d0s0      4.8G   1.7G   3.0G    37%    /
/proc                    0K     0K     0K     0%    /proc
mnttab                   0K     0K     0K     0%    /etc/mnttab
fd                       0K     0K     0K     0%    /dev/fd
swap                   848M    40K   848M     1%    /var/run
swap                   849M   1.0M   848M     1%    /tmp
/dev/dsk/c0t0d0s7       13G    78K    13G     1%    /export/home


Each administrator has a set of tools used daily, and another set of tools to help in a pinch. This column included a wide variety of commands and options that are lesser known, but can be very useful. Do you have favorite tools that have saved you in a bind? If so, please send them to me so I can expand my tool set as well. Alternately, send along any tools that you hate or that you feel are dangerous, which could also turn into a useful column!

[Jan 13, 2004] The art of writing Linux utilities Peter Seebach

What makes a good utility?

There is a wonderful discussion of this question in The UNIX Programming Environment, by Kernighan & Pike. A good utility is one that does its job as well as possible. It has to play well with others; it has to be amenable to being combined with other utilities. A program that doesn't combine with others isn't a utility; it's an application.

Utilities are supposed to let you build one-off applications cheaply and easily from the materials at hand. A lot of people think of them as being like tools in a toolbox. The goal is not to have a single widget that does everything, but to have a handful of tools, each of which does one thing as well as possible.

Some utilities are reasonably useful on their own, whereas others imply cooperation in pipelines of utilities. Examples of the former include sort and grep. On the other hand, xargs is rarely used except with other utilities, most often find.
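
As a small illustration of the second kind, here is a sketch of find feeding xargs. The function name count_txt_lines is made up for this example, and the -print0 and -0 options are assumed to be available (they are in the GNU and BSD tools).

```shell
count_txt_lines() {
    # find emits NUL-terminated names; xargs -0 turns them into arguments
    # for cat, so names containing spaces or newlines survive intact.
    find "$1" -type f -name '*.txt' -print0 | xargs -0 cat | wc -l
}
```

Neither find nor xargs knows anything about counting lines; the pipeline only works because each stage does one job.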

What language to write in?
Most of the UNIX system utilities are written in C. The examples here are in Perl and sh. Use the right tool for the right job. If you use a utility heavily enough, the cost of writing it in a compiled language might be justified by the performance gain. On the other hand, for the fairly common case where a program's workload is light, a scripting language may offer faster development.

If you aren't sure, you should use the language you know best. At least when you're prototyping a utility, or figuring out how useful it is, favor programmer efficiency over performance tuning. Most of the UNIX system utilities are in C, simply because they're heavily used enough to justify the development cost. Perl and sh (or ksh) can be good languages for a quick prototype. Utilities that tie other programs together may be easier to write in a shell than in a more conventional programming language. On the other hand, any time you want to interact with raw bytes, C is probably looming on your horizon.

Designing a utility

A good rule of thumb is to start thinking about the design of a utility the second time you have to solve a problem. Don't mourn the one-off hack you write the first time; think of it as a prototype. The second time, compare what you need to do with what you needed to do the first time. Around the third time, you should start thinking about taking the time to write a general utility. Even a merely repetitive task might merit the development of a utility; for instance, many generalized file-renaming programs have been written based on the frustration of trying to rename files in a generalized way.

Here are some design goals of utilities; each gets its own section, below.

Do one thing well

Do one thing well; don't do multiple things badly. The best example of doing one thing well is probably sort. No utilities other than sort have a sort feature. The idea is simple; if you only solve a problem once, you can take the time to do it well.

Imagine how frustrating it would be if most programs sorted data, but some supported only lexicographic sorts, while others supported only numeric sorts, and a few even supported selection of keys rather than sorting by whole lines. It would be annoying at best.

When you find a problem to solve, try to break the problem up into parts, and don't duplicate the parts for which utilities already exist. The more you can focus on a tool that lets you work with existing tools, the better the chances that your utility will stay useful.

You may need to write more than one program. The best way to solve a specialized task is often to write one or two utilities and a bit of glue to tie them together, rather than writing a single program to solve the whole thing. It's fine to use a 20-line shell script to tie your new utility together with existing tools. If you try to solve the whole problem at once, the first change that comes along might require you to rethink everything.

I have occasionally needed to produce two-column or three-column output from a database. It is generally more efficient to write a program to build the output in a single column and then glue it to a program that puts things in columns. The shell script that combines these two utilities is itself a throwaway; the separate utilities have outlived it.
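
One way to sketch the "columnizing" half with a standard tool is paste, which reads one input line per "-" operand for each output row. The function name three_columns is hypothetical; the producer can be any program that writes one item per line.

```shell
three_columns() {
    # Three "-" operands: paste takes three consecutive lines from standard
    # input and joins them with tabs to form each output row.
    paste - - -
}
```

For instance, printf '%s\n' a b c d e f | three_columns yields two tab-separated rows of three items each.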

Some utilities serve very specialized needs. If the output of ls in a crowded directory scrolls off the screen very quickly, it might be because there's a file with a very long name, forcing ls to use only a single column for output. Paging through it using more takes time. Why not just sort lines by length, and pipe the result through tail, as follows?

Listing 1. One of the smallest utilities anywhere, sl

#!/usr/bin/perl -w
# Print input lines sorted by length, shortest first.
print sort { length $a <=> length $b } <>;

The script in Listing 1 does exactly one thing. It takes no options, because it needs no options; it only cares about the length of lines. Thanks to Perl's convenient <> idiom, this automatically works either on standard input or on files named on the command line.
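
For comparison, roughly the same filter can be assembled from existing tools in shell. This is only a sketch, not the article's implementation: awk prefixes each line with its length, sort orders by that number, and cut strips the prefix again. Unlike the Perl version, lines of equal length come out in sort's own order rather than input order.

```shell
sl() {
    # "$@" lets awk read either the named files or standard input.
    awk '{ print length($0) "\t" $0 }' "$@" |
        sort -k1,1n |
        cut -f2-
}
```

Running printf 'ccc\na\nbb\n' | sl prints the three lines shortest first.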

Be a filter

Almost all utilities are best conceived of as filters, although a few very useful utilities don't fit this model. (For instance, a program that counts might be very useful, even though it doesn't work well as a filter. Programs that take only command-line arguments as input, and produce potentially complicated output, can be very useful.) Most utilities, though, should work as filters. By convention, filters work on lines of text. Most filters should have some support for running on multiple input files.

Remember that a utility needs to work on the command line and in scripts. Sometimes, the ideal behavior varies a little. For instance, most versions of ls automatically sort input into columns when writing to a terminal. The default behavior of grep is to print the file name in which a match was found only if multiple files were specified. Such differences should have to do with how users will want the utility to work, not with other agendas. For instance, old versions of GNU bc displayed an intrusive copyright notice when started. Please don't do that. Make your utility stick to doing its job.

Utilities like to live in pipelines. A pipeline lets a utility focus on doing its job, and nothing else. To live in a pipeline, a utility needs to read data from standard input and write data to standard output. If you want to deal with records, it's best if you can make each line be a "record." Existing programs such as sort and join are already thinking that way. They'll thank you for it.

One utility I occasionally use is a program that calls other programs iteratively over a tree of files. This makes very good use of the standard UNIX utility filter model, but it only works with utilities that read input and write output; you can't use it with utilities that operate in place, or take input and output file names.
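
A minimal sketch of such a tree-walking wrapper, assuming the filter reads standard input and writes standard output, might look like this. The name apply_filter and the ".out" suffix are choices made for this example, not the author's program.

```shell
apply_filter() {
    # Usage: apply_filter ROOT FILTER [ARGS...]
    root=$1
    shift
    # For every regular file (skipping earlier results), run the filter with
    # the file on stdin and write its output next to the file as FILE.out.
    find "$root" -type f ! -name '*.out' -exec sh -c '
        file=$1
        shift
        "$@" < "$file" > "$file.out"
    ' sh {} "$@" \;
}
```

Calling apply_filter docs sort would leave a sorted copy beside every file under docs. A filter argument consisting of a lone ";" would confuse find, so the sketch is only suitable for ordinary command lines.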

Most programs that can run from standard input can also reasonably be run on a single file, or possibly on a group of files. Note that this arguably violates the rule against duplicating effort; obviously, this could be managed by feeding cat into the next program in the series. However, in practice, it seems to be justified.
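
In a shell script, supporting both styles usually costs one line. The sketch below uses a made-up filter named upcase: with no arguments cat copies standard input, and with arguments it concatenates the named files, so the rest of the pipeline never needs to know the difference.

```shell
upcase() {
    # "cat --" reads stdin when "$@" is empty, otherwise the listed files.
    cat -- "$@" | tr '[:lower:]' '[:upper:]'
}
```

Both echo hello | upcase and upcase notes.txt then behave as a user would expect.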

Some programs may legitimately read records in one format but produce something entirely different. An example would be a utility to put material into columnar form. Such a utility might equate lines to records on input, but produce multiple records per line on output.

Not every utility fits entirely into this model. For instance, xargs takes not records but names of files as input, and all of the actual processing is done by some other program.


Try to think of tasks similar to the one you're actually performing; if you can find a general description of these tasks, it may be best to try to write a utility that fits that description. For instance, if you find yourself sorting text lexicographically one day and numerically another day, it might make sense to consider attempting a general sort utility.

Generalizing functionality sometimes leads to the discovery that what seemed like a single utility is really two utilities used in concert. That's fine. Two well-defined utilities can be easier to write than one ugly or complicated one.

Doing one thing well doesn't mean doing exactly one thing. It means handling a consistent but useful problem space. Lots of people use grep. However, a great deal of its utility comes from the ability to perform related tasks. The various options to grep do the work of a handful of small utilities that would have ended up sharing, or duplicating, a lot of code.

This rule, and the rule to do one thing, are both corollaries of an underlying principle: avoid duplication of code whenever possible. If you write a half-dozen programs, each of which sorts lines, you can end up having to fix similar bugs half a dozen times instead of having one better-maintained sort program to work on.

This is the part of writing a utility that adds the most work to the process of getting it completed. You may not have time to generalize something fully at first, but it pays off when you get to keep using the utility.

Sometimes, it's very useful to add related functionality to a program, even when it's not quite the same task. For instance, a program to pretty-print raw binary data might be more useful if, when run on a terminal device, it threw the terminal into raw mode. This makes it a lot easier to test questions involving keymaps, new keyboards, and the like. Not sure why you're getting tildes when you hit the delete key? This is an easy way to find out what's really getting sent. It's not exactly the same task, but it's similar enough to be a likely addition.

The errno utility in Listing 2 below is a good example of generalizing, as it supports both numeric and symbolic names.

Be robust

It's important that a utility be durable. A utility that crashes easily or can't handle real data is not a useful utility. Utilities should handle arbitrarily long lines, huge files, and so on. It is perhaps tolerable for a utility to fail on a data set larger than it can hold in memory, but some utilities don't do this; for instance, sort, by using temporary files, can generally sort data sets much larger than it can hold in memory.

Try to make sure you've figured out what data your utility can possibly run on. Don't just ignore the possibility of data you can't handle. Check for it and diagnose it. The more specific your error messages, the more helpful you are being to your users. Try to give the user enough information to know what happened and how to fix it. When processing data files, try to identify exactly what the malformed data was. When trying to parse a number, don't just give up; tell the user what you got, and if possible, what line of the input stream the data was on.
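A minimal sketch of this advice in shell, assuming the input is supposed to hold one integer per line. The name sum_column is hypothetical, and writing to "/dev/stderr" from awk is assumed to be supported by the local awk, which is common but not universal:

```shell
# sum_column (hypothetical name): add up one integer per input line.
# Malformed input is reported with its line number and content
# instead of being silently skipped.
sum_column() {
    awk '
        /^[ \t]*$/ { next }                      # blank lines are harmless
        !/^[ \t]*-?[0-9]+[ \t]*$/ {              # anything else must be an integer
            printf("sum_column: line %d: expected an integer, got \"%s\"\n", NR, $0) > "/dev/stderr"
            bad = 1
            exit
        }
        { total += $1 }
        END { if (bad) exit 2; print total + 0 }
    '
}

printf '10\n20\n12\n' | sum_column      # prints 42
```

The distinct exit status lets a calling script tell "bad data" apart from a normal result without parsing the message.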

As a good example, consider the difference between two implementations of dc. If you run dc /home, one of them says "Cannot use directory as input!" The other just returns silently; no error message, no unusual exit code. Which of these would you rather have in your path when you make a typo on a cd command? Similarly, the former will give verbose error messages if you feed it the stream of data from a directory, perhaps by doing dc < /home. On the other hand, it might be nice for it to give up early on when getting invalid data.

Security holes are often rooted in a program that isn't robust in the face of unexpected data. Keep in mind that a good utility might find its way into a shell script run as root. A buffer overflow in a program such as find is likely to be a risk to a great number of systems.

The better a program deals with unexpected data, the more likely it is to adapt well to varied circumstances. Often, trying to make a program more robust leads to a better understanding of its role, and better generalizations of it.

Be new

One of the worst kinds of utility to write is the one you already have. I wrote a wonderful utility called count. It allowed me to perform just about any counting task. It's a great utility, but there's a standard BSD utility called jot that does the same thing. Likewise, my very clever program for turning data into columns duplicates an existing utility, rs, also found on BSD systems, except that rs is much more flexible and better designed. See Resources below for more information on jot and rs.

If you're about to start writing a utility, take a bit of time to browse around a few systems to see if there might be one already. Don't be afraid to steal Linux utilities for use on BSD, or BSD utilities for use on Linux; one of the joys of utility code is that almost all utilities are quite portable.

Don't forget to look at the possibility of combining existing applications to make a utility. It is possible, in theory, that you'll find stringing existing programs together is not fast enough, but it's very rare that writing a new utility is faster than waiting for a slightly slow pipeline.
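As an illustration of building a utility purely from existing programs, here is a sketch of the well-known word-frequency pipeline. The wrapper name wordfreq is hypothetical, and the character classes assume plain ASCII text:

```shell
# wordfreq N (hypothetical name): print the N most frequent words on stdin.
wordfreq() {
    tr -cs 'A-Za-z' '\n' |      # one word per line
        tr 'A-Z' 'a-z' |        # fold case
        sort |                  # bring identical words together
        uniq -c |               # count each group
        sort -rn |              # most frequent first
        head -n "${1:-10}"
}

printf 'the cat saw the dog\nthe end\n' | wordfreq 1
```

Each stage is a stock tool, so a rewrite in C would mostly duplicate code that is already debugged.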

An example utility

In a sense this program is a counterexample, in that it is never useful as a filter. It works very well as a command-line utility, however.

This program does one thing only. It prints out errno lines from /usr/include/sys/errno.h in a slightly pretty-printed format. For instance:

$ errno 22
EINVAL [22]: Invalid argument

Listing 2. Errno finder

    usage() {
        echo >&2 "usage: errno [numbers or error names]"
        exit 1
    }

    for i
    do
        case "$i" in
        [0-9]*)    # numeric argument (pattern assumed): match the value field
            awk '/^#define/ && $3 == '"$i"' {
                for (i = 5; i < NF; ++i) {
                    foo = foo " " $i;
                }
                printf("%-22s%s\n", $2 " [" $3 "]:", foo);
                foo = ""
            }' < /usr/include/sys/errno.h
            ;;
        E*)        # symbolic argument (pattern assumed): match the name field
            awk '/^#define/ && $2 == "'"$i"'" {
                for (i = 5; i < NF; ++i) {
                    foo = foo " " $i;
                }
                printf("%-22s%s\n", $2 " [" $3 "]:", foo);
                foo = ""
            }' < /usr/include/sys/errno.h
            ;;
        *)
            echo >&2 "errno: can't figure out whether '$i' is a name or a number."
            usage
            ;;
        esac
    done

Does it generalize? Yes, nicely. It supports both numeric and symbolic names. On the other hand, it doesn't know about other files, such as /usr/include/sys/signal.h, that are likely in the same format. It could easily be extended to do that, but for a convenience utility like this, it's easier to just make a copy called "signal" that reads signal.h, and uses "SIG*" as the pattern to match a name.
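If the more general form were wanted instead of a copy, one possible sketch takes the header file as an argument. The helper name lookup_define and the stand-in header are hypothetical; passing the key with awk -v rather than splicing it into the program is a design choice of this sketch, not something the listing does:

```shell
# lookup_define HEADER KEY (hypothetical helper): print the "#define" lines
# of HEADER whose name or value equals KEY, in the errno listing's format.
lookup_define() {
    awk -v key="$2" '
        $1 == "#define" && ($2 == key || $3 == key) {
            text = ""
            for (i = 5; i < NF; ++i)        # words between the comment markers
                text = text " " $i
            printf("%-22s%s\n", $2 " [" $3 "]:", text)
            found = 1
        }
        END { exit !found }                 # non-zero status when nothing matched
    ' "$1"
}

# Demonstration on a small stand-in header rather than a real system file.
hdr=$(mktemp)
cat > "$hdr" <<'EOF'
#define EINVAL 22 /* Invalid argument */
#define SIGINT 2 /* Interrupt */
EOF
lookup_define "$hdr" 22
lookup_define "$hdr" SIGINT
rm -f "$hdr"
```

Because the key never becomes part of the awk program text, an odd argument cannot change what the program does.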

This is just a tad more convenient than using grep on system header files, but it's less error-prone. It doesn't produce garbled results from ill-considered arguments. On the other hand, it produces no diagnostic if a given name or number is not found in the header. It also doesn't bother to correct some invalid inputs. Still, as a command-line utility never intended to be used in an automated context, it's okay.

Another example might be a program to unsort input (see Resources for a link to this utility). This is simple enough; read in input files, store them in some way, then generate a random order in which to print out the lines. This is a utility of nearly infinite applications. It's also a lot easier to write than a sorting program; for instance, you don't need to specify which keys you're not sorting on, or whether you want things in a random order alphabetically, lexicographically, or numerically. The tricky part comes in reading in potentially very long lines. In fact, the provided version cheats; it assumes there will be no null bytes in the lines it reads. It's a lot harder to get that right, and I was lazy when I wrote it.
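The idea can be approximated with a pipeline. This is a sketch, not the author's unsort: the name shuffle_lines is hypothetical, awk's srand() is seeded from the clock (so two runs within the same second may produce the same order), and lines that draw the same key keep a fixed relative order:

```shell
# shuffle_lines (hypothetical name): print stdin lines in random order by
# tagging each line with a random key, sorting on the key, then dropping it.
shuffle_lines() {
    awk 'BEGIN { srand() } { printf("%d\t%s\n", int(rand() * 1000000), $0) }' |
        sort -n |
        cut -f 2-
}

printf 'red\ngreen\nblue\n' | shuffle_lines
```

Since sort spills to temporary files when it has to, this version inherits sort's ability to handle inputs larger than memory.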


If you find yourself performing a task repeatedly, consider writing a program to do it. If the program turns out to be reasonable to generalize a bit, generalize it, and you will have written a utility.

Don't design the utility the first time you need it. Wait until you have some experience. Feel free to write a prototype or two; a good utility is sufficiently better than a bad utility to justify a bit of time and effort on researching it. Don't feel bad if what you thought would be a great utility ends up gathering dust after you wrote it. If you find yourself frustrated by your new program's shortcomings, you just had another prototyping phase. If it turns out to be useless, well, that happens sometimes.

The thing you're looking for is a program that finds general application outside your initial usage patterns. I wrote unsort because I wanted an easy way to get a random series of colors out of an old X11 "rgb.txt" file. Since then, I've used it for an incredible number of tasks, not the least of which was producing test data for debugging and benchmarking sort routines.

One good utility can pay back the time you spent on all the near misses. The next thing to do is make it available for others, so they can experiment. Make your failed attempts available, too; other people may have a use for a utility you didn't need. More importantly, your failed utility may be someone else's prototype, and lead to a wonderful utility program for everyone.


[Jul 3, 2003] the m4 Macro Processor - updated link

"What is it about m4 that makes it so useful, and yet so overlooked? m4 -- a macro processor -- unfortunately has a dry name that disguises a great utility. A macro processor is basically a program that scans text and looks for defined symbols, which it replaces with other text or other symbols."

[Apr 17, 2003] Exploring processes with Truss: Part 1 By Sandra Henry-Stocker

The ps command can tell you quite a few things about each process running on your system. These include the process owner, memory use, accumulated time, the process status (e.g., waiting on resources) and many other things as well. But one thing that ps cannot tell you is what a process is doing - what files it is using, what ports it has opened, what libraries it is using and what system calls it is making. If you can't look at source code to determine how a program works, you can tell a lot about it by using a procedure called "tracing". When you trace a process (e.g., truss date), you get verbose commentary on the process' actions. For example, you will see a line like this each time the program opens a file:

open("/usr/lib/", O_RDONLY) = 4

The text on the left side of the equals sign clearly indicates what is happening. The program is trying to open the file /usr/lib/ and it's trying to open it in read-only mode (as you would expect, given that this is a system library). The right side is not nearly as self-evident. We have just the number 4. Open is not a Unix command, of course, but a system call. That means that you can only use the command within a program. Due to the nature of Unix, however, system calls are documented in man pages just like ls and pwd.

To determine what this number represents, you can skip down in this column or you can read the man page. If you elect to read the man page, you will undoubtedly read a line that tells you that the open() function returns a file descriptor for the named file. In other words, the number, 4 in our example, is the number of the file descriptor referred to in this open call. If the process that you are tracing opens a number of files, you will see a sequence of open calls. With other activity removed, the list might look something like this:

open("/dev/zero", O_RDONLY) = 3

open("/var/ld/ld.config", O_RDONLY) Err#2 ENOENT

open("/usr/lib/", O_RDONLY) = 4

open("/usr/lib/", O_RDONLY) = 4

open64("./../", O_RDONLY|O_NDELAY) = 3

open64("./../../", O_RDONLY|O_NDELAY) = 3

open("/etc/mnttab", O_RDONLY) = 4

Notice that the first file handle is 3 and that file handles 3 and 4 are used repeatedly. The initial file handle is always 3. This indicates that it is the first file handle following those that are the same for every process that you will run - 0, 1 and 2. These represent standard in, standard out and standard error.

The file handles shown in the example truss output above are repeated only because the associated files are subsequently closed. When a file is closed, the file handle that was used to access it can be used again.
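Descriptor reuse can be seen from the shell as well. In this sketch the shell is told explicitly to use descriptor 3, whereas a program calling open() is simply handed the lowest free number by the kernel; the temporary file is an assumption of the example:

```shell
tmp=$(mktemp)
printf 'first line\n' > "$tmp"

exec 3< "$tmp"        # open the file on descriptor 3
read -r line <&3      # read through the descriptor
exec 3<&-             # close(3): descriptor 3 is free again

exec 3< /dev/null     # a different file can now reuse number 3
exec 3<&-
rm -f "$tmp"
echo "$line"          # prints: first line
```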

The close commands include only the file handle, since the location of the file is known. A close command would, therefore, be something like close(3).

One of the lines shown above displays a different response - Err#2 ENOENT. This "error" (the word is put in quotes because this does not necessarily indicate that the process is defective in any way) indicates that the file the open call is attempting to open does not exist. Read "ENOENT" as "No such file".

Some open calls place multiple restrictions on the way that a file is opened. The open64 calls in the example output above, for example, specify both O_RDONLY and O_NDELAY. Again, reading the man page will help you to understand what each of these specifications means and will present you with a list of other options as well.

As you might expect, open is only one of many system calls that you will see when you run the truss command. Next week we will look at some additional system calls and determine what they are doing.

Exploring processes with Truss: part 2 By Sandra Henry-Stocker

While truss and its cousins on non-Solaris systems (e.g., strace on Linux and ktrace on many BSD systems) provide a lot of data on what a running process is doing, this information is only useful if you know what it means. Last week, we looked at the open call and the file handles that are returned by the call to open(). This week, we look at some other system calls and analyze what these system calls are doing. You've probably noticed that the nomenclature for system functions is to follow the name of the call with a set of empty parentheses, for example, open(). You will see this nomenclature in use whenever system calls are discussed.

The fstat() and fstat64() calls obtain information about open files - "fstat" refers to "file status". As you might expect, this information is retrieved from the files' inodes, including whether or not you are allowed to read the files' contents. If you trace the ls command (i.e., truss ls), for example, your trace will start with lines that resemble these:

1 execve("/usr/bin/ls", 0x08047BCC, 0x08047BD4) argc = 1

2 open("/dev/zero", O_RDONLY) = 3

3 mmap(0x00000000, 4096, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0xDFBFA000

4 xstat(2, "/usr/bin/ls", 0x08047934) = 0

5 open("/var/ld/ld.config", O_RDONLY) Err#2 ENOENT

6 sysconfig(_CONFIG_PAGESIZE) = 4096

7 open("/usr/lib/", O_RDONLY) = 4

8 fxstat(2, 4, 0x08047310) = 0


28 lstat64(".", 0x080478B4) = 0

29 open64(".", O_RDONLY|O_NDELAY) = 3

30 fcntl(3, F_SETFD, 0x00000001) = 0

31 fstat64(3, 0x0804787C) = 0

32 brk(0x08057208) = 0

33 brk(0x08059208) = 0

34 getdents64(3, 0x08056F40, 1048) = 424

35 getdents64(3, 0x08056F40, 1048) = 0

36 close(3) = 0

In line 31, we see a call to fstat64, but what file is it checking? The man page for fstat() and your intuition are probably both telling you that this fstat call is obtaining information on the file opened two lines before - "." or the current directory - and that it is referring to this file by its file handle (3) returned by the open64() call in line 29. Keep in mind that a directory is simply a file, though a different variety of file, so the same system calls are used as would be used to check a text file.

You will probably also notice that the file being opened is called /dev/zero (again, see line 2). Most Unix sysadmins will immediately know that /dev/zero is a special kind of file - primarily because it is stored in /dev. And, if moved to look more closely at the file, they will confirm that the file that /dev/zero points to (it is itself a symbolic link) is a special character file. What /dev/zero provides to system programmers, and to sysadmins if they care to use it, is an endless stream of zeroes. This is more useful than might first appear.

To see how /dev/zero works, you can create a 10M-byte file full of zeroes with a command like this:

/bin/dd < /dev/zero > zerofile bs=1024 seek=10240 count=1

This command works well because it creates the needed file with only a few read and write operations; in other words, it is very efficient.

You can verify that the file is zero-filled with od.

# od -x zerofile

0000000 0000 0000 0000 0000 0000 0000 0000 0000

*

Each string of four zeros (0000) represents two bytes of data. The * on the second line of output indicates that all of the remaining lines are identical to the first.
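The effect of count and seek can be checked on a small scale. This sketch uses a temporary file and much smaller sizes than the command above; the variable names are illustrative only:

```shell
zf=$(mktemp)

# Write ten 1024-byte blocks of zeroes.
dd if=/dev/zero of="$zf" bs=1024 count=10 2>/dev/null
full_size=$(wc -c < "$zf" | tr -d ' ')
nonzero=$(tr -d '\000' < "$zf" | wc -c | tr -d ' ')

# Skip ten blocks in the output, then write one: eleven blocks in total.
dd if=/dev/zero of="$zf" bs=1024 seek=10 count=1 2>/dev/null
seek_size=$(wc -c < "$zf" | tr -d ' ')

echo "$full_size $nonzero $seek_size"    # prints: 10240 0 11264
rm -f "$zf"
```

With seek=N and count=1 the file ends up N+1 blocks long, because the single block is written after the skipped region.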

Looking back at the truss output above, we cannot help but notice that the first line of the truss output includes the name of the command that we are tracing. The execve() system call executes a process. The first argument to execve() is the name of the file from which the new process image is to be loaded. The mmap() call which follows maps the process image into memory. In other words, it directly incorporates file data into the process address space. The getdents64() calls on lines 34 and 35 are extracting information from the directory file - "dents" refers to "directory entries".

The sequence of steps that we see at the beginning of the truss output - executing the entered command, opening /dev/zero, mapping memory and so on - looks the same whether you are tracing ls, pwd, date or restarting Apache. In fact, the first dozen or so lines in your truss output will be nearly identical regardless of the command you are running. You should, however, expect to see some differences between different Unix systems and different versions of Solaris.

Viewing the output of truss, you can get a solid sense of how the operating system works. The same insights are available if you are tracing your own applications or troubleshooting third party executables.


Sandra Henry-Stocker Using the ps command.

3.2. Displaying all processes owned by a specific user

$ ps ux
heyne      691  0.0  2.4 19272 9576 ?        S    13:35   0:00 kdeinit: kded    
heyne      700  0.1  1.0  5880 3944 ?        S    13:35   0:01 artsd -F 10 -S 40
... ... ... 

You can also use the syntax "ps U username".

As you can see, the ps command can give you a lot of interesting information. If, for example, you want to know what your friend is actually doing, just replace your login name with hers or his and you will see all processes belonging to her or him.

3.3. Own output format

If you are bored by the regular output, you could simply change the format. To do so use the formatting characters which are supported by the ps command.
If you execute the ps command with the 'o' parameter you can tell the ps command what you want to see:
Odd display with AIX field descriptors:

$ ps -o "%u : %U : %p : %a"
heyne    : heyne    :  3363 : bash
heyne    : heyne    :  3367 : ps -o %u : %U : %p : %a

developerWorks Concatenating files with cat Cat has two useful options:

Dogs of the Linux Shell Posted on Saturday, October 19, 2002 by Louis J. Iacona Could the command-line tools you've forgotten or never knew save time and some frustration?

One incarnation of the so called 80/20 rule has been associated with software systems. It has been observed that 80% of a user population regularly uses only 20% of a system's features. Without backing this up with hard statistics, my 20+ years of building and using software systems tells me that this hypothesis is probably true. The collection of Linux command-line programs is no exception to this generalization. Of the dozens of shell-level commands offered by Linux, perhaps only ten commands are commonly understood and utilized, and the remaining majority are virtually ignored.

Which of these dogs of the Linux shell have the most value to offer? I'll briefly describe ten of the less popular but useful Linux shell commands, those which I have gotten some mileage from over the years. Specifically, I've chosen to focus on commands that parse and format textual content.

The working examples presented here assume a basic familiarity with command-line syntax, simple shell constructs and some of the not-so-uncommon Linux commands. Even so, the command-line examples are fairly well commented and straightforward. Whenever practical, the output of usage examples is presented under each command-line execution.

The following eight commands parse, format and display textual content. Although not all provided examples demonstrate this, be aware that the following commands will read from standard input if file arguments are not presented.

Table 1. Summary of Commands


As their names imply, head and tail are used to display some amount of the top or bottom of a text block. head presents the beginning of a file to standard output while tail does the same with the end of a file. Review the following commented examples:

## (1) displays the first 6 lines of a file
   head -6 readme.txt
## (2) displays the last 25 lines of a file
   tail -25 mail.txt

Here's an example of using head and tail in concert to display the 11th through 20th line of a file.

# (3)
head -20 file | tail -10 
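The same idea can be wrapped for arbitrary ranges. A sketch, with hypothetical function names, that also shows the single-process sed equivalent:

```shell
# line_range FIRST LAST (hypothetical name): print lines FIRST..LAST of stdin.
line_range() {
    head -n "$2" | tail -n "$(( $2 - $1 + 1 ))"
}

# The same range with one sed process, for comparison.
line_range_sed() {
    sed -n "$1,$2p"
}

awk 'BEGIN { for (i = 1; i <= 30; i++) print i }' | line_range 11 20
```

The two differ when the input is shorter than LAST: head and tail then return the final lines of whatever is there, while sed prints only lines that really fall in the range.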

Manual pages show that the tail command has more command-line options than head. One of the more useful tail options is -f. When it is used, tail does not return when end-of-file is detected, unless it is explicitly interrupted. Instead, tail sleeps for a period and checks for new lines of data that may have been appended since the last read.

## (4) display ongoing updates to the given
##     log file 

tail -f /usr/tmp/logs/daemon_log.txt

Imagine that a dæmon process was continually appending activity logs to the /usr/tmp/logs/daemon_log.txt file. Using tail -f at a console window, for example, will more or less track all updates to the file in real time. (The -f option is applicable only when tail's input is a file).

If you give multiple arguments to tail, you can track several log files in the same window.

## track the mail log and the server error log
## at the same time.

tail -f /var/log/mail.log /var/log/apache/error_log

tac--Concatenate in Reverse

What is cat spelled backwards? Well, that's what tac's functionality is all about. It concatenates files like cat, but prints the lines of each file in reverse order. So what's its usefulness? It can be used on any task that requires ordering elements in a last-in, first-out (LIFO) manner. Consider the following command line to list the three most recently established user accounts from the most recent through the least recent.

# (5) last 3 /etc/passwd records - in reverse
$ tail -3 /etc/passwd | tac
curly:x:1003:100:3rd Stooge:/homes/curly:/bin/ksh
larry:x:1002:100:2nd Stooge:/homes/larry:/bin/ksh
moe:x:1001:100:1st Stooge:/homes/moe:/bin/ksh
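tac is not present on every Unix. A sketch of a wrapper (the name reverse_lines is hypothetical) that falls back to a classic sed idiom when tac is missing:

```shell
# reverse_lines (hypothetical name): use tac when present, else a sed idiom.
reverse_lines() {
    if command -v tac >/dev/null 2>&1; then
        tac
    else
        sed '1!G;h;$!d'     # accumulate the reversed text in the hold space
    fi
}

printf 'moe\nlarry\ncurly\n' | reverse_lines
```

The sed version keeps the whole input in memory, so it suits modest inputs only.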

nl--Numbered Line Output

nl is a simple but useful numbering filter. It displays input with each line numbered in the left margin, in a format dictated by command-line options. nl provides a plethora of options that specify every detail of its numbered output. The following commented examples demonstrate some of those options:

# (6) Display the first 4 entries of the password
#     file - numbers to be three columns wide and 
#     padded by zeros.
$ head -4 /etc/passwd | nl -nrz -w3
001	root:x:0:1:Super-User:/:/bin/ksh
002	daemon:x:1:1::/:
003	bin:x:2:2::/usr/bin:
004	sys:x:3:3::/:
# (7) Prepend ordered line numbers followed by an
#     '=' sign to each line -- start at 101.
$ nl -s= -v101 Data.txt
101=1st Line ...
102=2nd Line ...
103=3rd Line ...
104=4th Line ...
105=5th Line ...


The fmt command is a simple text formatter that focuses on making textual data conform to a maximum line width. It accomplishes this by joining and breaking lines around white space. Imagine that you need to maintain textual content that was generated with a word processor. The exported text may contain lines whose lengths vary from very short to much longer than a standard screen length. If such text is to be maintained in a text editor (like vi), fmt is the command of choice to transform the original text into a more maintainable format. The first example below shows fmt being asked to reformat file contents as text lines no greater than 60 characters long.

# (8) No more than 60 char lines
$ fmt -w 60 README.txt > NEW_README.txt
# (9) Force uniform spacing:
#     1 space between words, 2 between sentences
$ echo "Hello   World. Hello Universe." | fmt -u -w80 

Hello World.  Hello Universe.

fold--Break Up Input

fold is similar to fmt but is used typically to format data that will be used by other programs, rather than to make the text more readable to the human eye. The commented examples below are fairly easy to follow:

# (10) Format text in 3 column width lines
$ echo oxoxoxoxo | fold -w3 
# (11) Parse by triplet-char strings - 
#      search for 'xox'
$ echo oxoxoxoxo | fold -w3 | grep "xox"
# (12) One way to iterate through a string of chars
$ for i in $(echo 12345 | fold -w1)
> do
> ### perform some task ...
> print $i
> done

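For reference, a runnable variant of the fold examples with the results noted in comments. echo replaces the Korn shell's print so the loop works in any POSIX shell:

```shell
# (10) three characters per output line: oxo, xox, oxo
echo oxoxoxoxo | fold -w 3

# (11) only the chunk that contains 'xox' survives the grep
echo oxoxoxoxo | fold -w 3 | grep 'xox'

# (12) iterate over the characters of a string
for c in $(echo 12345 | fold -w 1)
do
    echo "$c"
done
```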

pr shares features with simpler commands like nl and fmt, but its command-line options make it ideal for converting text files into a format that's suitable for printing. pr offers options that allow you to specify page length, column width, margins, headers/footers, double line spacing and more.

Aside from being the best suited formatter for printing tasks, pr also offers other useful features. These features include allowing you to view multiple files vertically in adjacent columns or columnizing a list in a fixed number of columns (see Listing 2).

Listing 2. Using pr


The following two commands are specialized parsers used to pick apart file path pieces.


The basename and dirname commands are useful for presenting portions of a given file path. Quite often in scripting situations, it's convenient to be able to parse and capture a file name or the containing-directory name portions of a file path. These commands reduce this task to a simple one-line command. (There are other ways to approach this using the Korn shell or sed "magic", but basename and dirname are more portable and straightforward).

basename is used to strip off the directory, and optionally, the file suffix parts of a file path. Consider the following trivial examples:

# (21) Parse out the Java Class name
$ basename /usr/local/src/java/ .java 
# (22) Parse out the file name.  
$ basename srcs/C/main.c 

dirname is used to display the containing directory path, as much of the path as is provided. Consider the following examples:

# (23) absolute and relative directory examples
$ dirname /homes/curly/.profile 
$ dirname curly/.profile
# (24) From any korn-shell script, the following
#  line will assign the directory from where 
#  the script was launched 
SCRIPT_HOME="$(dirname $(whence $0))" 
# (25)
# Okay, how about a non-trivial practical example?
#  List all directories (under $PWD) that contain a
#  file called 'core'.
$ for i in $(find $PWD -name core)
> do 
> dirname $i
> done | sort -u

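A runnable sketch of the same operations, with results in comments. The file name Hello.java and the function name dirs_containing are hypothetical; the parameter expansions are the shell "magic" alternative mentioned above:

```shell
path=/usr/local/src/java/Hello.java    # hypothetical file name

basename "$path" .java           # Hello
basename srcs/C/main.c           # main.c
dirname /homes/curly/.profile    # /homes/curly
dirname curly/.profile           # curly

# Parameter expansion yields the same pieces without starting a process.
file=${path##*/}                 # Hello.java
dir=${path%/*}                   # /usr/local/src/java

# dirs_containing DIR NAME (hypothetical): directories under DIR holding NAME.
# Letting find run dirname avoids word splitting on paths with spaces.
dirs_containing() {
    find "$1" -name "$2" -exec dirname {} \; | sort -u
}
```

The find-based form is a variation on example (25): the for loop there breaks paths at whitespace, which the -exec form does not.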
ttyrec a tty recorder

ttyrec is a tty recorder. Recorded data can be played back with the included ttyplay command.

ttyrec is a derivative of the script command that also records timing information with microsecond accuracy.

It can record emacs -nw, vi, lynx, or any programs running on tty.

Understanding Archivers

In the next few articles, I'd like to take a look at backups and archiving utilities. If you're like I was when I started using Unix, you may be intimidated by the words tar, cpio and dump, and a quick peek at their respective man pages may not alleviate your fears.

Online Gnu Documentation

Links to the manuals for the Gnu tools most commonly used in embedded development: Using and Porting GNU CC * Using as, The GNU Assembler * GASP, an assembly preprocessor * Using ld, the GNU linker

Slashdot Articles Free Books Online Matt Braithwaite writes "Answering RMS's call for free documentation, Karl Fogel has written a book on CVS that is free (GPLed) and available online. (The paper version has additional non-free material.)" Also, edinator wrote to say that ORA has put the Using Samba text online. The entire text of the O'Reilly DocBook is downloadable.

Book Review: Professional Linux Programming
(Oct 21, 2000, 18:38 UTC) (Posted by john)
"This book takes a different approach in that it steps through the development of a fictional application. The application you will build is an interface for a DVD rental store."

RPM usage for newbies
(Oct 21, 2000, 18:03 UTC) (Posted by john)
"The Red Hat Package Manager (RPM) has established itself as one of the most popular distribution formats for linux software today. A first time user may feel overwhelmed by the vast number of options available and this article will help a newbie to get familiar with usage of this tool."

Signal Ground: Stupid dd Tricks (or, Why We Didn't buy Norton Ghost)

"The company that employs Tom and me builds big pieces of food processing machinery that cost upwards of $400K. Each machine includes an embedded PC running -- and I cringe -- NT 4. While the company's legacy currently dictates NT, those of us at the lower levels of the totem pole work to wedge Linux in wherever we can. What follows is a short story of a successful insertion that turned out to be (gasp!) financially beneficial to the company, too."

"...Ghost works well; it does exactly what we wanted it to. You boot off of a floppy (while the image medium is in another drive), and Ghost does the rest. The problem lies in Ghost's licensing. If you want to install in a situation like ours, you have to purchase a Value-Added Reseller (VAR) license from Symantec. And, every time you create a drive, you have to pay them about 17 dollars. When you also figure in the time needed to keep track of those licenses, that adds up in a hurry."

"It finally occurred to me that we could use Linux and a couple of simple tools (dd, gzip, and a shell script) to do the same thing as Ghost -- at least as far as our purposes go. ... The Results? We showed our little program to management, and they were impressed. We were able to create disk images almost as quickly as Norton Ghost, and we did it all in an afternoon using entirely free software. The rest is history."

Issue #87 Common Shell Tools - Focus On Linux - 05-25-00

"sort and uniq

The sort command is used to sort the lines in an input stream in alphanumeric or telephone book order. The simplest ways to use sort are to provide it with a filename to sort or an input stream whose data should be output in sorted form:

  sort myfile.txt
  cat myfile.txt | sort

This tool can be told to sort based on alternate fields and in several different orders. The uniq command is often used in conjunction with sort because it removes consecutive duplicate lines from an input stream before writing them to standard output. This provides a quick, easy way to sort a pool of data and then remove duplicate entries.
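A short sketch of why the two are paired; the fruit names are made-up sample data:

```shell
data=$(printf 'pear\napple\npear\nfig\napple\npear\n')

printf '%s\n' "$data" | sort | uniq      # apple, fig, pear: one line each
printf '%s\n' "$data" | sort -u          # the same result in one step
printf '%s\n' "$data" | uniq | wc -l     # 6: nothing was adjacent, so nothing collapsed
```

Without the sort, uniq sees no neighbouring repeats in this data and passes all six lines through.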

A more in-depth discussion of sort can be found in the past QuickTip called Sort and Uniq.


The tr command in its simplest form can be thought of as a simpler case of the sed command discussed earlier. It is used to replace all occurrences of a single character in an input stream with an alternate character before writing to the output stream. For example, to change all percent (%) characters to spaces, you might use:

  tr '%' ' ' 

Though sed can be used to accomplish the same task, it is often simpler to use tr when replacing a single character because the syntax is easy to remember and many special characters which must be escaped for sed can be supplied to tr without escaping.


The wc, or "word count" command does just what its name implies: it counts words. As an added feature, wc also counts lines and bytes. The formats for counting words, lines, or bytes in a file or input stream are:

  $ wc -w myfile.txt
      897 myfile.txt
  $ wc -l myfile.txt
      193 myfile.txt
  $ wc -c myfile.txt
     5927 myfile.txt

Notice that the output for wc normally includes the filename (when reading from a file) and always includes a number of spaces as well. Often, this behavior is undesirable, usually when a number is required without leading or trailing whitespace. In such cases, sed and cut can be used to eliminate them:

  $ wc -l myfile.txt | sed 's!^ *!!' | cut -d ' ' -f 1

Note that other methods for removing spaces or filenames include using a more complex sed command alone or even using awk, which we won't discuss in this issue.
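One of those other methods, sketched here with a temporary file as sample input, is to feed wc from standard input so that it has no file name to print:

```shell
f=$(mktemp)
printf 'one\ntwo\nthree\n' > "$f"

# With input redirected, wc prints only the count; tr removes the
# padding that some implementations add in front of it.
lines=$(wc -l < "$f" | tr -d ' ')
echo "$lines"       # prints: 3
rm -f "$f"
```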


The xargs utility is used to break long input streams into groups of lines so that the shell isn't overloaded by command substitution. For example, the following command may fail if too many files are present in the current directory tree for BASH to substitute correctly:

  lpr $(find .)

However, using xargs, the desired effect can be obtained:

  find . | xargs lpr

More information on using xargs can be found in the QuickTip called Long Argument Lists and on the xargs manual page.
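One caveat: plain xargs splits its input on whitespace, so file names containing spaces arrive as several arguments. A sketch of two common remedies, using cat in place of lpr and a temporary directory as sample data. The -print0 and -0 options are widely available in GNU and BSD tools but are assumed here rather than guaranteed:

```shell
dir=$(mktemp -d)
printf 'a\n' > "$dir/plain.txt"
printf 'b\n' > "$dir/two words.txt"

# find batches the arguments itself; no xargs needed.
find "$dir" -type f -exec cat {} + | sort

# NUL-separated names survive any whitespace in the path.
find "$dir" -type f -print0 | xargs -0 cat | sort
rm -rf "$dir"
```

Both forms still batch arguments, so the original goal of avoiding an overlong command line is kept.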

Linux Today PRNewswire SCO Contributes to the Open Source Community; Kicks Off Open Source Initiatives

"SCO is contributing source code for two developer tools -- "cscope" and "fur." The code is released under the terms of the BSD License and will be maintained by SCO. The first technology, cscope, is available to download. Software developers can use cscope to help design and debug programs coded with the C programming language. The second technology, Fur, will be available to download in several weeks. Fur is a real-time analysis program used to optimize application and system binaries for more effective run time execution. Dramatic results have been seen in high-level applications and database systems using fur."

[Jan 30, 2000] Use the Source, Luke Compiling and installing from source code LG #49

One of the greatest strengths of the Open Source movement is the availability of source code for almost every program. This article will discuss in general terms, with some examples, how to install a program from source code rather than a precompiled binary package. The primary audience for this article is the user who has some familiarity with installing programs from binaries, but isn't familiar with installing from source code. Some knowledge of compiling software is helpful, but not required.

[Jan 3, 2000] Advanced Programming in Expect A Bulletproof Interface LG #48 -- very interesting and useful paper. See also: Ext2- Automating interactive tasks with expect and crontab
QCad, the user-friendly CAD system for Linux, is now open source

PC Week: PC Week Labs evaluates open-source apps

They recommend Apache, Mozilla, Samba and Perl for enterprise use. The evaluations of particular products are second-rate and do not deserve attention; only the list is interesting.

[Jan 25, 1999] Win32 Editors page was added

Open Source Software Chronicles -- October-December, 1998

Open Source Software Chronicles -- July-September, 1998

See Also

Recommended Links



Main components (Core Gnu)



Linux, Unix, -etc Using the m4 Macro Processor nice m4 intro by Paul Dunne

"What is it about m4 that makes it so useful, and yet so overlooked? m4 -- a macro processor -- unfortunately has a dry name that disguises a great utility. A macro processor is basically a program that scans text and looks for defined symbols, which it replaces with other text or other symbols."

GNU macro processor - Table of Contents Programming Utilities Guide/m4

m4 macro processor Caldera

Programming in standard C and C++

m4 macro processor
Defining macros
Arithmetic built-ins
File inclusion
System command
String manipulation

General Programming Concepts Writing and Debugging Programs - m4 Macro Processor Overview

This chapter provides information about the m4 macro processor, which is a front-end processor for any programming language being used in the operating system environment.

The m4 macro processor is useful in many ways. At the beginning of a program, you can define a symbolic name or symbolic constant as a particular string of characters. You can then use the m4 program to replace unquoted occurrences of the symbolic name with the corresponding string. Besides replacing one string of text with another, the m4 macro processor provides features such as arithmetic built-ins, file inclusion, and string manipulation.

The m4 macro processor processes strings of letters and digits called tokens. The m4 program reads each alphanumeric token and determines if it is the name of a macro. The program then replaces the name of the macro with its defining text, and pushes the resulting string back onto the input to be rescanned. You can call macros with arguments, in which case the arguments are collected and substituted into the right places in the defining text before the defining text is rescanned.

The m4 program provides built-in macros such as define. You can also create new macros. Built-in and user-defined macros work the same way.
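
The define built-in described above can be tried from the shell. This is a minimal sketch, assuming m4 is installed; the macro names BUFSIZE and GREETING and the wrapper function expand_m4_demo are invented for the illustration. It defines one symbolic constant and one macro with an argument, then lets m4 expand them.

```shell
# expand_m4_demo: feed a small m4 program to m4 on standard input.
# The quoted 'EOF' delimiter stops the shell from interpreting the
# backquotes that m4 uses as its opening quote character.
# dnl discards the rest of the line, so the define lines leave no
# blank lines in the output.
expand_m4_demo() {
    m4 <<'EOF'
define(`BUFSIZE', `1024')dnl
define(`GREETING', `Hello, $1!')dnl
char buf[BUFSIZE];
GREETING(`world')
EOF
}
```

Calling expand_m4_demo prints "char buf[1024];" followed by "Hello, world!": the unquoted token BUFSIZE is replaced by its defining text, and $1 in GREETING is replaced by the collected argument before the result is rescanned.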

  • Autoconf



    Mailing lists


    Less sucks less more than more.
    That's why I use more less, and less more.



    The Last but not Least

    Copyright © 1996-2016 by Dr. Nikolai Bezroukov. was initially created as a service to the (now defunct) UN Sustainable Development Networking Programme (SDNP) in the author's free time and without any remuneration. This document is an industrial compilation designed and created exclusively for educational use and is distributed under the Softpanorama Content License.

    The site uses AdSense, so you need to be aware of the Google privacy policy. If you do not want to be tracked by Google, please disable Javascript for this site. This site is perfectly usable without Javascript.

    Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.

    FAIR USE NOTICE This site contains copyrighted material the use of which has not always been specifically authorized by the copyright owner. We are making such material available to advance understanding of computer science, IT technology, economic, scientific, and social issues. We believe this constitutes a 'fair use' of any such copyrighted material as provided by section 107 of the US Copyright Law according to which such material can be distributed without profit exclusively for research and educational purposes.

    This is a Spartan WHYFF (We Help You For Free) site written by people for whom English is not a native language. Grammar and spelling errors should be expected. The site contains some broken links as it develops like a living tree...


    The statements, views and opinions presented on this web page are those of the author (or referenced source) and are not endorsed by, nor do they necessarily reflect, the opinions of the author's present and former employers, SDNP or any other organization the author may be associated with. We do not warrant the correctness of the information provided or its fitness for any purpose.

    Last modified: July 16, 2018