
Classic Unix Utilities

There are many people who use UNIX or Linux but who IMHO do not understand UNIX. UNIX is not just an operating system, it is a way of doing things, and the shell plays a key role by providing the glue that makes it work. The UNIX methodology relies heavily on reuse of a set of tools rather than on building monolithic applications. Even perl programmers often miss the point, writing the heart and soul of the application as perl script without making use of the UNIX toolkit.

David Korn (bold italic is mine -- BNN)


IMHO there are three Unix tools that can spell the difference between a really good programmer or sysadmin and a just-above-average one (even if the latter has solid knowledge of shell and Perl; knowledge of shell and Perl is necessary but not sufficient):

These tools can also be used as a fine test in interviews for advanced Unix-related positions if you have several similar candidates. Other things being equal, knowledge of them demonstrates a level of Unix culture superior to the average "command line junkie" level ;-)

An overview of books about GNU/open source tools can be found in the Unix tools bibliography. There are not that many good books on the subject; still, even an average book can provide you with insights into the usage of a tool that you might never get from daily practice. Please note that Unix is a pretty complex system, and some aspects of it are non-obvious even for those who have more than ten years of experience.

Dr. Nikolai Bezroukov



Old News ;-)

_commandlinefu" href="#NEWS_TOC">

The most-rated Linux commands from the past weeks at commandlinefu.

1- Save man-page as pdf

 man -t awk | ps2pdf - awk.pdf

2- Duplicate installed packages from one machine to the other (RPM-based systems)

ssh root@remote.host "rpm -qa" | xargs yum -y install

3- Stamp a text line on top of the pdf pages to quickly add some remark, comment, stamp text, … on top of (each of) the pages of the input pdf file

echo "This text gets stamped on the top of the pdf pages." | enscript -B -f Courier-Bold16 -o- | ps2pdf - | pdftk input.pdf stamp - output output.pdf

4- Display the number of connections to a MySQL Database

Count the number of active connections to a MySQL database.
The MySQL command “show processlist” gives a list of all the active clients.
However, by using the processlist table, in the information_schema database, we can sort and count the results within MySQL.

mysql -u root -p -BNe "select host,count(host) from processlist group by host;" information_schema

5- Create a local compressed tarball from remote host directory

ssh user@host "tar -zcf - /path/to/dir" > dir.tar.gz

This improves on #9892 by compressing the directory on the remote machine so that the amount of data transferred over the network is much smaller. The command uses ssh(1) to get to a remote host, uses tar(1) to archive and compress a remote directory, prints the result to STDOUT, which is written to a local file. In other words, we are archiving and compressing a remote directory to our local box.
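
A variant for the opposite direction (archiving a local directory onto the remote host) follows the same pattern; the host name and paths are placeholders:

tar -zcf - /path/to/dir | ssh user@host "cat > dir.tar.gz"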

6- tail a log over ssh

This is also handy for taking a look at resource usage of a remote box.

ssh -t remotebox "tail -f /var/log/remote.log"

7- Print diagram of user/groups

Parses /etc/group into “dot” format and passes it to “display” (ImageMagick) to show a useful diagram of users and groups (empty groups are not shown).

awk 'BEGIN{FS=":"; print "digraph{"}{split($4, a, ","); for (i in a) printf "\"%s\" [shape=box]\n\"%s\" -> \"%s\"\n", $1, a[i], $1}END{print "}"}' /etc/group|display

8- Draw a kernel module dependency graph.

Parse `lsmod' output and pass it to the `dot' drawing utility, then pass the result to an image viewer.

lsmod | perl -e 'print "digraph \"lsmod\" {";<>;while(<>){@_=split/\s+/; print "\"$_[0]\" -> \"$_\"\n" for split/,/,$_[3]}print "}"' | dot -Tpng | display -

9- Create strong, but easy to remember password

Why remember? Generate!
Up to 48 chars, works on any unix-like system

read -s pass; echo $pass | md5sum | base64 | cut -c -16

10- Find all files larger than 500M and less than 1GB

find / -type f -size +500M -size -1G

11- Limit the cpu usage of a process

This will limit the average amount of CPU it consumes.

sudo cpulimit -p pid -l 50
_commandlinefu">

[Jul 26, 2011] ivarch.com Pipe Viewer Online Man Page

pv allows a user to see the progress of data through a pipeline, by giving information such as time elapsed, percentage completed (with progress bar), current throughput rate, total data transferred, and ETA.

To use it, insert it in a pipeline between two processes, with the appropriate options. Its standard input will be passed through to its standard output and progress will be shown on standard error.

pv will copy each supplied FILE in turn to standard output (- means standard input), or if no FILEs are specified just standard input is copied. This is the same behaviour as cat(1).

A simple example to watch how quickly a file is transferred using nc(1):

pv file | nc -w 1 somewhere.com 3000

A similar example, transferring a file from another process and passing the expected size to pv:

cat file | pv -s 12345 | nc -w 1 somewhere.com 3000

A more complicated example using numeric output to feed into the dialog(1) program for a full-screen progress display:
 

(tar cf - . \
| pv -n -s $(du -sb . | awk '{print $1}') \
| gzip -9 > out.tgz) 2>&1 \
| dialog --gauge 'Progress' 7 70

Frequent use of this third form is not recommended as it may cause the programmer to overheat.

 

[Jan 24, 2008] freshmeat.net Project details for cgipaf

The package also contains a Solaris binary of a chpasswd clone, which is extremely useful for mass password changes in corporate environments that include Solaris and other Unixes that do not have the chpasswd utility (HP-UX is another example in this category). Version 1.3.2 includes a Solaris binary of chpasswd which works on Solaris 9 and 10.
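
As a rough sketch of how a chpasswd-style tool is driven (assuming the clone follows the standard chpasswd interface; the account names and passwords are placeholders), it reads username:password pairs from standard input:

# run as root: change several passwords in one batch
chpasswd <<'EOF'
alice:NewPass1
bob:NewPass2
EOF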

cgipaf is a combination of three CGI programs.

All programs use PAM for user authentication. It is possible to run a script to update SAMBA passwords or NIS configuration when a password is changed. mailcfg.cgi creates a .procmailrc in the user's home directory. A user with too many invalid logins can be locked. The minimum and maximum UID can be set in the configuration file, so you can specify a range of UIDs that are allowed to use cgipaf.

[Aug 7, 2007] Expect plays a crucial role in network management  by Cameron Laird

31 Jul 2007 | www.ibm.com/developerworks

If you manage systems and networks, you need Expect.

More precisely, why would you want to be without Expect? It saves hours common tasks otherwise demand. Even if you already depend on Expect, though, you might not be aware of the capabilities described below.

Expect automates command-line interactions

You don't have to understand all of Expect to begin profiting from the tool; let's start with a concrete example of how Expect can simplify your work on AIX® or other operating systems:

Suppose you have logins on several UNIX® or UNIX-like hosts and you need to change the passwords of these accounts, but the accounts are not synchronized by Network Information Service (NIS), Lightweight Directory Access Protocol (LDAP), or some other mechanism that recognizes you're the same person logging in on each machine. Logging in to a specific host and running the appropriate passwd command doesn't take long—probably only a minute, in most cases. And you must log in "by hand," right, because there's no way to script your password?

Wrong. In fact, the standard Expect distribution (full distribution) includes a command-line tool (and a manual page describing its use!) that precisely takes over this chore. passmass (see Resources) is a short script written in Expect that makes it as easy to change passwords on twenty machines as on one. Rather than retyping the same password over and over, you can launch passmass once and let your desktop computer take care of updating each individual host. You save yourself enough time to get a bit of fresh air, and multiple opportunities for the frustration of mistyping something you've already entered.
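
A minimal invocation might look like the following (the host names are placeholders; passmass itself prompts for the old and new passwords):

passmass host1 host2 host3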

The limits of Expect

This passmass application is an excellent model—it illustrates many of Expect's general properties:

You probably know enough already to begin to write or modify your own Expect tools. As it turns out, the passmass distribution actually includes code to log in by means of ssh, but omits the command-line parsing to reach that code. Here's one way you might modify the distribution source to put ssh on the same footing as telnet and the other protocols:
Listing 1. Modified passmass fragment that accepts the -ssh argument                   

            ...
         } "-rlogin" {
            set login "rlogin"
            continue
        } "-slogin" {
            set login "slogin"
            continue
        } "-ssh" {
            set login "ssh"
            continue
        } "-telnet" {
            set login "telnet"
            continue
           ...
     

In my own code, I actually factor out more of this "boilerplate." For now, though, this cascade of tests, in the vicinity of line #100 of passmass, gives a good idea of Expect's readability. There's no deep programming here—no need for object-orientation, monadic application, co-routines, or other subtleties. You just ask the computer to take over typing you usually do for yourself. As it happens, this small step represents many minutes or hours of human effort saved. 

[April 23, 2006] Port25 Running Windows Command Line Applications from a Linux Box

What is interesting is that the comments do not mention that an ssh server is available under SFU 3.5.
Wednesday, April 19, 2006 5:42 PM by admin
 
Running Command Line Applications on Windows XP/2000 from a Linux Box:

Question:

-----Original Message-----
From: swagner@********
Sent: Thursday, April 13, 2006 2:35 PM
To: Port25 Feedback
Subject: (Port25) : You guys should look into _____
Importance: High

Can you recommend anything for running command line applications on a Windows XP/2000 box from within a program that runs on Linux?  For example I want a script to run on a Linux server that will connect to a Windows server, on our network, and run certain commands.

Answer:

One way to do this would be to install an SSH daemon on the Windows machine and run commands via the ssh client on the Linux machine.  Simply search the web for information on setting up the Cygwin SSH daemon as a service in Windows (there are docs about this everywhere).  You can then run commands with ssh, somewhat like:

      ssh administrator@<hostname> 'touch /cygdrive/c/blar'

That will create a file in C:\ called "blar".  You can also access Windows commands if you alter the path in the Cygwin environment or use the full path to the command:

      ssh administrator@<hostname> '/cygdrive/c/windows/system32/net.exe view'
 

re: Running Windows Command Line Applications from a Linux Box

Sunday, April 23, 2006 3:44 AM by nektar
I am disappointed that Microsoft does not offer an SSH implementation with Services for Unix or with SUA.
 

re: Running Windows Command Line Applications from a Linux Box

Sunday, April 23, 2006 4:36 PM by szlwzl
I would also very much like to see this as a built in feature - cygwin is great and I use it all the time but why not build something like this into the OS?

re: Running Windows Command Line Applications from a Linux Box

Sunday, April 23, 2006 6:05 PM by breiter
I'm stunned that you didn't recommend OpenSSH running on Interix from SFU 3.5 or SUA 5.2. I would much rather rely upon Interix than Cygwin. Interopsystems maintains both a free straight OpenSSH package and a commercial enhanced version with an MMC-based GUI configurator.

re: Running Windows Command Line Applications from a Linux Box

Monday, April 24, 2006 1:12 AM by vox
Of course if there was an RDP client that could access Windows full screen using a browser (the same way as Virtual Labs work) you could run GUI programs as well

Replies to all

Monday, April 24, 2006 1:30 AM by einhverfr
Hi all.

Nektar wrote:

" I am disappointed that Microsoft does not offer an SSH implementation with Services for Unix or with SUA."

When I was at Microsoft, the legal department raised objections.  Not sure if they were trademark related or what.  But a good substitute would be a kerberized telnet client and server that would be capable of session encryption as per the Kerberos specification.  People usually don't know that this is possible using Kerberos and telnet but it is.  And given the architecture of AD, this would lead to close integration.

Vox wrote:
" Of course if there was an RDP client that could access Windows full screen using a browser (the same way as Virtual Labs work) you could run GUI programs as well"

Ever use rdesktop? It doesn't use a browser, but it's close enough that you can easily run GUI apps.

Best Wishes,
Chris Travers
Metatron Technology Consulting
 

re: Running Windows Command Line Applications from a Linux Box

Monday, April 24, 2006 12:42 PM by docsmooth
rdesktop -0 -f <servername> will work the same as mstsc /console with the fullscreen switch set.  As Chris said, it's not a browser, but it's a 100% replacement for MSTSC, and fits every single option, security and otherwise, that is in MSTSC.

Also, KDE users have "krdc" which wraps around rdesktop and VNC, so you can connect to either, and save off your settings, just like saving .RDP files in Windows.

Rob

re: Running Windows Command Line Applications from a Linux Box

Monday, April 24, 2006 4:42 PM by docsmooth
I completely forgot this portion to my previous comment:

Is there anyone who has experience running Windows Resource Kit tools or Windows 2003 Support Tools from Wine or similar directly off of the Linux box?  It would be fantastic to be able to run those and the MMC tools, perhaps with WinBind as the authentication path?

As things sit right now, I have to run a VMWare WinXP instance, or dual-boot to get access to those tools that I run less frequently than certain FOSS tools, but still need.

re: Running Windows Command Line Applications from a Linux Box

Thursday, April 27, 2006 4:39 PM by smither
Simply install vncserver from, for example, realvnc.com, then use vncviewer on the Linux box.  You have your complete Windows desktop within a window in your X server.  Open the terminal from the start menu.
 

re: Running Windows Command Line Applications from a Linux Box

Friday, April 28, 2006 2:49 PM by remdotc
You can either purchase a copy of CrossOver Office and/or Cedega, which allow you to run Windows-native binaries on Linux (DirectX), or you can get these to work under Wine, though you need to install IE 6.1, set your O/S in wine.conf to 2000, and copy most of the files contained in sysroot/system32 to your winex install; performance is horrible.

The better solution is to install an ssh server on the Windows box and then remote in via the command line. If you cannot afford a commercial one, you can always use Cygwin.

[Jan 25, 2005] Tool of the Month: ManEdit by Joe "Zonker" Brockmeier

ManEdit is provided by WolfPack Entertainment. I know, that doesn't sound like a company that would be releasing a manual page editor, but they did — and under the GNU General Public License, no less.

It's not terribly difficult to create manual pages using an editor like Emacs or Vim (see my December 2003 column if you'd like to start from scratch), but writing in *roff format is yet another skill that developers and admins have to tackle. ManEdit actually uses an XML format that it converts to groff format when saving, so it's not necessary to delve into groff if you don't want to. (I would recommend having at least a passing familiarity with groff if you're going to be doing much development, but it's not absolutely necessary.)

ManEdit is an easy-to-use manual page editor and viewer that takes all the hassle out of creating manual pages (well, the formatting hassle, anyway — you still have to actually write the manual itself).

The ManEdit homepage has source and packages for Debian, Mandrake, Slackware, and SUSE Linux. The source should compile on FreeBSD and Solaris as well, so long as you have GTK 1.2.0. I used the SUSE packages without any problem on a SUSE 9.2 system.

Sys Admin Magazine

Useful Solaris Commands

truss -c (Solaris >= 8): This astounding option to truss provides a profile summary of the command being trussed:

$ truss -c grep asdf work.doc
syscall              seconds   calls  errors
_exit                    .00       1
read                     .01      24
open                     .00       8      4
close                    .00       5
brk                      .00      15
stat                     .00       1
fstat                    .00       4
execve                   .00       1
mmap                     .00      10
munmap                   .01       3
memcntl                  .00       2
llseek                   .00       1
open64                   .00       1
                        ----     ---    ---
sys totals:              .02      76      4
usr time:                .00
elapsed:                 .05

It can also show profile data on a running process. In this case, the data shows what the process did between when truss was started and when truss execution was terminated with a control-c. It’s ideal for determining why a process is hung without having to wade through the pages of truss output.

truss -d and truss -D (Solaris >= 8): These truss options show the time associated with each system call being shown by truss and are excellent for finding performance problems in custom or commercial code. For example:

$ truss -d who
Base time stamp:  1035385727.3460  [ Wed Oct 23 11:08:47 EDT 2002 ]
 0.0000 execve(“/usr/bin/who”, 0xFFBEFD5C, 0xFFBEFD64)  argc = 1
 0.0032 stat(“/usr/bin/who”, 0xFFBEFA98)                = 0
 0.0037 open(“/var/ld/ld.config”, O_RDONLY)             Err#2 ENOENT
 0.0042 open(“/usr/local/lib/libc.so.1”, O_RDONLY)      Err#2 ENOENT
 0.0047 open(“/usr/lib/libc.so.1”, O_RDONLY)            = 3
 0.0051 fstat(3, 0xFFBEF42C)                            = 0
. . .

truss -D is even more useful, showing the time delta between system calls:

Dilbert> truss -D who
 0.0000 execve(“/usr/bin/who”, 0xFFBEFD5C, 0xFFBEFD64)  argc = 1
 0.0028 stat(“/usr/bin/who”, 0xFFBEFA98)                = 0
 0.0005 open(“/var/ld/ld.config”, O_RDONLY)             Err#2 ENOENT
 0.0006 open(“/usr/local/lib/libc.so.1”, O_RDONLY)      Err#2 ENOENT
 0.0005 open(“/usr/lib/libc.so.1”, O_RDONLY)            = 3
 0.0004 fstat(3, 0xFFBEF42C)                            = 0

In this example, the stat system call took a lot longer than the others.

 

truss -T: This is a great debugging help. It will stop a process at the execution of a specified system call. (“-U” does the same, but with user-level function calls.) A core could then be taken for further analysis, or any of the /proc tools could be used to determine many aspects of the status of the process.
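
A hedged sketch of how this might be used (the system call name and target command are only examples):

# stop the traced command the first time it enters open()
truss -T open ls /etc
# from another window, once it is stopped, inspect it with the /proc tools:
pstack <pid>     # or gcore <pid> to save a core image for later analysis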

truss -l (improved in Solaris 9): Shows the thread number of each call in a multi-threaded process. Solaris 9 truss -l finally makes it possible to watch the execution of a multi-threaded application.

Truss is truly a powerful tool. It can be used on core files to analyze what caused the problem, for example. It can also show details on user-level library calls (either system libraries or programmer libraries) via the “-u” option.

pkg-get: This is a nice tool (http://www.bolthole.com/solaris) for automatically getting freeware packages. It is configured via /etc/pkg-get.conf. Once it’s up and running, execute pkg-get -a to get a list of available packages, and pkg-get -i to get and install a given package.
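
For example (the package name is only an illustration):

pkg-get -a            # list the packages available from the configured site
pkg-get -i wget       # download and install the wget package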

plimit (Solaris >= 8): This command displays and sets the per-process limits on a running process. This is handy if a long-running process is running up against a limit (for example, number of open files). Rather than using limit and restarting the command, plimit can modify the running process.
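
A sketch, assuming the -n option selects the open-file (descriptor) limit; the PID is a placeholder and the exact flag should be verified against the man page on your release:

plimit 1234                 # display the current per-process limits of PID 1234
plimit -n 4096,4096 1234    # raise the soft,hard open-file limits (flag assumed)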

coreadm (Solaris >= 8): In the “old” days (before coreadm), core dumps were placed in the process’s working directory. Core files would also overwrite each other. All this and more has been addressed by coreadm, a tool to manage core file creation. With it, you can specify whether to save cores, where cores should be stored, how many versions should be retained, and more. Settings can be retained between reboots by coreadm modifying /etc/coreadm.conf.
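
A typical setup might look like this (the target directory is a placeholder; run as root):

# name global cores after the executable and PID, and enable global core dumps
coreadm -g /var/cores/core.%f.%p -e global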

pgrep (Solaris >= 8): pgrep searches through /proc for processes matching the given criteria, and returns their process-ids. A great option is “-n”, which returns the newest process that matches.
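
For example, to get the newest sshd process owned by root (the process name and user are only illustrative):

pgrep -n -u root sshd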

preap (Solaris >= 9): Reaps zombie processes. Any processes stuck in the “z” state (as shown by ps), can be removed from the system with this command.

pargs (Solaris >= 9): Shows the arguments and environment variables of a process.

nohup -p (Solaris >= 9): The nohup command can be used to start a process, so that if the shell that started the process closes (i.e., the process gets a “SIGHUP” signal), the process will keep running. This is useful for backgrounding a task that should continue running no matter what happens around it. But what happens if you start a process and later want to HUP-proof it? With Solaris 9, nohup -p takes a process-id and causes SIGHUP to be ignored.

prstat (Solaris >= 8): prstat is top and a lot more. Both commands provide a screen’s worth of process and other information and update it frequently, for a nice window on system performance. prstat has much better accuracy than top. It also has some nice options. “-a” shows process and user information concurrently (sorted by CPU hog, by default). “-c” causes it to act like vmstat (new reports printed below old ones). “-C” shows processes in a processor set. “-j” shows processes in a “project”. “-L” shows per-thread information as well as per-process. “-m” and “-v” show quite a bit of per-process performance detail (including pages, traps, lock wait, and CPU wait). The output data can also be sorted by resident-set (real memory) size, virtual memory size, execute time, and so on. prstat is very useful on systems without top, and should probably be used instead of top because of its accuracy (and some sites care that it is a supported program).
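
Two illustrative invocations (the PID and the intervals are placeholders):

prstat -a 5                 # processes plus a per-user summary, refreshed every 5 seconds
prstat -mL -p 1234 2        # per-thread microstate details for PID 1234, every 2 seconds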

trapstat (Solaris >= 9): trapstat joins lockstat and kstat as the most inscrutable commands on Solaris. Each shows gory details about the innards of the running operating system. Each is indispensable in solving strange happenings on a Solaris system. Best of all, their output is good to send along with bug reports, but further study can reveal useful information for general use as well.

vmstat -p (Solaris >= 8): Until this option became available, it was almost impossible (see the “se toolkit”) to determine what kind of memory demand was causing a system to page. vmstat -p is key because it not only shows whether your system is under memory stress (via the “sr” column), it also shows whether that stress is from application code, application data, or I/O. “-p” can really help pinpoint the cause of any mysterious memory issues on Solaris.
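
For example:

vmstat -p 5        # paging activity, broken down by executable, anonymous and file-system pages, every 5 seconds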

pmap -x (Solaris >= 8, bugs fixed in Solaris >= 9): If the process with memory problems is known, and more details on its memory use are needed, check out pmap -x. The target process-id has its memory map fully explained, as in:

# pmap -x 1779
1779:   -ksh
 Address  Kbytes     RSS    Anon  Locked Mode   Mapped File
00010000     192     192       -       - r-x--  ksh
00040000       8       8       8       - rwx--  ksh
00042000      32      32       8       - rwx--    [ heap ]
FF180000     680     664       -       - r-x--  libc.so.1
FF23A000      24      24       -       - rwx--  libc.so.1
FF240000       8       8       -       - rwx--  libc.so.1
FF280000     568     472       -       - r-x--  libnsl.so.1
FF31E000      32      32       -       - rwx--  libnsl.so.1
FF326000      32      24       -       - rwx--  libnsl.so.1
FF340000      16      16       -       - r-x--  libc_psr.so.1
FF350000      16      16       -       - r-x--  libmp.so.2
FF364000       8       8       -       - rwx--  libmp.so.2
FF380000      40      40       -       - r-x--  libsocket.so.1
FF39A000       8       8       -       - rwx--  libsocket.so.1
FF3A0000       8       8       -       - r-x--  libdl.so.1
FF3B0000       8       8       8       - rwx--    [ anon ]
FF3C0000     152     152       -       - r-x--  ld.so.1
FF3F6000       8       8       8       - rwx--  ld.so.1
FFBFE000       8       8       8       - rw---    [ stack ]
-------- ------- ------- ------- -------
total Kb    1848    1728      40       -

Here we see each chunk of memory, what it is being used for, how much space it is taking (virtual and real), and mode information.

df -h (Solaris >= 9): This command is popular on Linux, and just made its way into Solaris. df -h displays summary information about file systems in human-readable form:

$ df -h
Filesystem             size   used  avail capacity  Mounted on
/dev/dsk/c0t0d0s0      4.8G   1.7G   3.0G    37%    /
/proc                    0K     0K     0K     0%    /proc
mnttab                   0K     0K     0K     0%    /etc/mnttab
fd                       0K     0K     0K     0%    /dev/fd
swap                   848M    40K   848M     1%    /var/run
swap                   849M   1.0M   848M     1%    /tmp
/dev/dsk/c0t0d0s7       13G    78K    13G     1%    /export/home

Conclusion

Each administrator has a set of tools used daily, and another set of tools to help in a pinch. This column included a wide variety of commands and options that are lesser known, but can be very useful. Do you have favorite tools that have saved you in a bind? If so, please send them to me so I can expand my tool set as well. Alternately, send along any tools that you hate or that you feel are dangerous, which could also turn into a useful column!

[Jan 13, 2004] The art of writing Linux utilities Peter Seebach

What makes a good utility?

There is a wonderful discussion of this question in The UNIX Programming Environment, by Kernighan & Pike. A good utility is one that does its job as well as possible. It has to play well with others; it has to be amenable to being combined with other utilities. A program that doesn't combine with others isn't a utility; it's an application.

Utilities are supposed to let you build one-off applications cheaply and easily from the materials at hand. A lot of people think of them as being like tools in a toolbox. The goal is not to have a single widget that does everything, but to have a handful of tools, each of which does one thing as well as possible.

Some utilities are reasonably useful on their own, whereas others imply cooperation in pipelines of utilities. Examples of the former include sort and grep. On the other hand, xargs is rarely used except with other utilities, most often find.

What language to write in?
Most of the UNIX system utilities are written in C. The examples here are in Perl and sh. Use the right tool for the right job. If you use a utility heavily enough, the cost of writing it in a compiled language might be justified by the performance gain. On the other hand, for the fairly common case where a program's workload is light, a scripting language may offer faster development.

If you aren't sure, you should use the language you know best. At least when you're prototyping a utility, or figuring out how useful it is, favor programmer efficiency over performance tuning. Most of the UNIX system utilities are in C, simply because they're heavily used enough to justify the development cost. Perl and sh (or ksh) can be good languages for a quick prototype. Utilities that tie other programs together may be easier to write in a shell than in a more conventional programming language. On the other hand, any time you want to interact with raw bytes, C is probably looming on your horizon.

Designing a utility

A good rule of thumb is to start thinking about the design of a utility the second time you have to solve a problem. Don't mourn the one-off hack you write the first time; think of it as a prototype. The second time, compare what you need to do with what you needed to do the first time. Around the third time, you should start thinking about taking the time to write a general utility. Even a merely repetitive task might merit the development of a utility; for instance, many generalized file-renaming programs have been written based on the frustration of trying to rename files in a generalized way.

Here are some design goals of utilities; each gets its own section, below.

Do one thing well

Do one thing well; don't do multiple things badly. The best example of doing one thing well is probably sort. No utilities other than sort have a sort feature. The idea is simple; if you only solve a problem once, you can take the time to do it well.

Imagine how frustrating it would be if most programs sorted data, but some supported only lexicographic sorts, while others supported only numeric sorts, and a few even supported selection of keys rather than sorting by whole lines. It would be annoying at best.

When you find a problem to solve, try to break the problem up into parts, and don't duplicate the parts for which utilities already exist. The more you can focus on a tool that lets you work with existing tools, the better the chances that your utility will stay useful.

You may need to write more than one program. The best way to solve a specialized task is often to write one or two utilities and a bit of glue to tie them together, rather than writing a single program to solve the whole thing. It's fine to use a 20-line shell script to tie your new utility together with existing tools. If you try to solve the whole problem at once, the first change that comes along might require you to rethink everything.

I have occasionally needed to produce two-column or three-column output from a database. It is generally more efficient to write a program to build the output in a single column and then glue it to a program that puts things in columns. The shell script that combines these two utilities is itself a throwaway; the separate utilities have outlived it.
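
A hedged sketch of such a throwaway glue script; report_rows stands in for the hypothetical single-column generator, and pr does the columnization:

#!/bin/sh
# throwaway glue: produce one item per line, then fold it into three columns
report_rows "$@" | pr -3 -t -w 78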

Some utilities serve very specialized needs. If the output of ls in a crowded directory scrolls off the screen very quickly, it might be because there's a file with a very long name, forcing ls to use only a single column for output. Paging through it using more takes time. Why not just sort lines by length, and pipe the result through tail, as follows?

Listing 1. One of the smallest utilities anywhere, sl
 

#!/usr/bin/perl -w
# sort input lines by length, shortest first
print sort { length $a <=> length $b } <>;

The script in Listing 1 does exactly one thing. It takes no options, because it needs no options; it only cares about the length of lines. Thanks to Perl's convenient <> idiom, this automatically works either on standard input or on files named on the command line.

Be a filter

Almost all utilities are best conceived of as filters, although a few very useful utilities don't fit this model. (For instance, a program that counts might be very useful, even though it doesn't work well as a filter. Programs that take only command-line arguments as input, and produce potentially complicated output, can be very useful.) Most utilities, though, should work as filters. By convention, filters work on lines of text. Most filters should have some support for running on multiple input files.

Remember that a utility needs to work on the command line and in scripts. Sometimes, the ideal behavior varies a little. For instance, most versions of ls automatically sort input into columns when writing to a terminal. The default behavior of grep is to print the file name in which a match was found only if multiple files were specified. Such differences should have to do with how users will want the utility to work, not with other agendas. For instance, old versions of GNU bc displayed an intrusive copyright notice when started. Please don't do that. Make your utility stick to doing its job.

Utilities like to live in pipelines. A pipeline lets a utility focus on doing its job, and nothing else. To live in a pipeline, a utility needs to read data from standard input and write data to standard output. If you want to deal with records, it's best if you can make each line be a "record." Existing programs such as sort and join are already thinking that way. They'll thank you for it.
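
A minimal filter skeleton, in the same spirit as Listing 1 (the per-line transformation is a placeholder):

#!/usr/bin/perl -w
# read records (lines) from standard input or from files named on the
# command line, transform each one, and write the result to standard output
while (<>) {
    chomp;
    # ... transform $_ here ...
    print "$_\n";
}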

One utility I occasionally use is a program that calls other programs iteratively over a tree of files. This makes very good use of the standard UNIX utility filter model, but it only works with utilities that read input and write output; you can't use it with utilities that operate in place, or take input and output file names.

Most programs that can run from standard input can also reasonably be run on a single file, or possibly on a group of files. Note that this arguably violates the rule against duplicating effort; obviously, this could be managed by feeding cat into the next program in the series. However, in practice, it seems to be justified.

Some programs may legitimately read records in one format but produce something entirely different. An example would be a utility to put material into columnar form. Such a utility might equate lines to records on input, but produce multiple records per line on output.

Not every utility fits entirely into this model. For instance, xargs takes not records but names of files as input, and all of the actual processing is done by some other program.

Generalize

Try to think of tasks similar to the one you're actually performing; if you can find a general description of these tasks, it may be best to try to write a utility that fits that description. For instance, if you find yourself sorting text lexicographically one day and numerically another day, it might make sense to consider attempting a general sort utility.

Generalizing functionality sometimes leads to the discovery that what seemed like a single utility is really two utilities used in concert. That's fine. Two well-defined utilities can be easier to write than one ugly or complicated one.

Doing one thing well doesn't mean doing exactly one thing. It means handling a consistent but useful problem space. Lots of people use grep. However, a great deal of its utility comes from the ability to perform related tasks. The various options to grep do the work of a handful of small utilities that would have ended up sharing, or duplicating, a lot of code.

This rule, and the rule to do one thing, are both corollaries of an underlying principle: avoid duplication of code whenever possible. If you write a half-dozen programs, each of which sorts lines, you can end up having to fix similar bugs half a dozen times instead of having one better-maintained sort program to work on.

This is the part of writing a utility that adds the most work to the process of getting it completed. You may not have time to generalize something fully at first, but it pays off when you get to keep using the utility.

Sometimes, it's very useful to add related functionality to a program, even when it's not quite the same task. For instance, a program to pretty-print raw binary data might be more useful if, when run on a terminal device, it threw the terminal into raw mode. This makes it a lot easier to test questions involving keymaps, new keyboards, and the like. Not sure why you're getting tildes when you hit the delete key? This is an easy way to find out what's really getting sent. It's not exactly the same task, but it's similar enough to be a likely addition.

The errno utility in Listing 2 below is a good example of generalizing, as it supports both numeric and symbolic names.

Be robust

It's important that a utility be durable. A utility that crashes easily or can't handle real data is not a useful utility. Utilities should handle arbitrarily long lines, huge files, and so on. It is perhaps tolerable for a utility to fail on a data set larger than it can hold in memory, but some utilities don't do this; for instance, sort, by using temporary files, can generally sort data sets much larger than it can hold in memory.

Try to make sure you've figured out what data your utility can possibly run on. Don't just ignore the possibility of data you can't handle. Check for it and diagnose it. The more specific your error messages, the more helpful you are being to your users. Try to give the user enough information to know what happened and how to fix it. When processing data files, try to identify exactly what the malformed data was. When trying to parse a number, don't just give up; tell the user what you got, and if possible, what line of the input stream the data was on.

As a good example, consider the difference between two implementations of dc. If you run dc /home, one of them says "Cannot use directory as input!" The other just returns silently; no error message, no unusual exit code. Which of these would you rather have in your path when you make a typo on a cd command? Similarly, the former will give verbose error messages if you feed it the stream of data from a directory, perhaps by doing dc < /home. On the other hand, it might be nice for it to give up early on when getting invalid data.

Security holes are often rooted in a program that isn't robust in the face of unexpected data. Keep in mind that a good utility might find its way into a shell script run as root. A buffer overflow in a program such as find is likely to be a risk to a great number of systems.

The better a program deals with unexpected data, the more likely it is to adapt well to varied circumstances. Often, trying to make a program more robust leads to a better understanding of its role, and better generalizations of it.

Be new

One of the worst kinds of utility to write is the one you already have. I wrote a wonderful utility called count. It allowed me to perform just about any counting task. It's a great utility, but there's a standard BSD utility called jot that does the same thing. Likewise, my very clever program for turning data into columns duplicates an existing utility, rs, also found on BSD systems, except that rs is much more flexible and better designed. See Resources below for more information on jot and rs.

If you're about to start writing a utility, take a bit of time to browse around a few systems to see if there might be one already. Don't be afraid to steal Linux utilities for use on BSD, or BSD utilities for use on Linux; one of the joys of utility code is that almost all utilities are quite portable.

Don't forget to look at the possibility of combining existing applications to make a utility. It is possible, in theory, that you'll find stringing existing programs together is not fast enough, but it's very rare that writing a new utility is faster than waiting for a slightly slow pipeline.

An example utility

In a sense this program is a counterexample, in that it is never useful as a filter. It works very well as a command-line utility, however.

This program does one thing only. It prints out errno lines from /usr/include/sys/errno.h in a slightly pretty-printed format. For instance:

$ errno 22
EINVAL [22]: Invalid argument

Listing 2. Errno finder
 


    #!/bin/sh
    usage() {
        echo >&2 "usage: errno [numbers or error names]\n"
        exit 1
    }

    for i
    do
        case "$i" in
        [0-9]*)
            awk '/^#define/ && $3 == '"$i"' {
                for (i = 5; i < NF; ++i) {
                    foo = foo " " $i;
                }
                printf("%-22s%s\n", $2 " [" $3 "]:", foo);
                foo = ""
            }' < /usr/include/sys/errno.h
            ;;
        E*)
            awk '/^#define/ && $2 == "'"$i"'" {
                for (i = 5; i < NF; ++i) {
                    foo = foo " " $i;
                }
                printf("%-22s%s\n", $2 " [" $3 "]:", foo);
                foo = ""
            }' < /usr/include/sys/errno.h
            ;;
        *)
            echo >&2 "errno: can't figure out whether '$i' is a name or a number."
            usage
            ;;
        esac
    done

Does it generalize? Yes, nicely. It supports both numeric and symbolic names. On the other hand, it doesn't know about other files, such as /usr/include/sys/signal.h, that are likely in the same format. It could easily be extended to do that, but for a convenience utility like this, it's easier to just make a copy called "signal" that reads signal.h, and uses "SIG*" as the pattern to match a name.

This is just a tad more convenient than using grep on system header files, but it's less error-prone. It doesn't produce garbled results from ill-considered arguments. On the other hand, it produces no diagnostic if a given name or number is not found in the header. It also doesn't bother to correct some invalid inputs. Still, as a command-line utility never intended to be used in an automated context, it's okay.

Another example might be a program to unsort input (see Resources for a link to this utility). This is simple enough; read in input files, store them in some way, then generate a random order in which to print out the lines. This is a utility of nearly infinite applications. It's also a lot easier to write than a sorting program; for instance, you don't need to specify which keys you're not sorting on, or whether you want things in a random order alphabetically, lexicographically, or numerically. The tricky part comes in reading in potentially very long lines. In fact, the provided version cheats; it assumes there will be no null bytes in the lines it reads. It's a lot harder to get that right, and I was lazy when I wrote it.
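
A minimal sketch of the idea (not the author's actual unsort program, and with the same "lines must fit in memory" caveat) is a Perl one-liner built on List::Util:

perl -MList::Util=shuffle -e 'print shuffle(<>)' file1 file2   # file names optional; reads stdin otherwise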

Summary

If you find yourself performing a task repeatedly, consider writing a program to do it. If the program turns out to be reasonable to generalize a bit, generalize it, and you will have written a utility.

Don't design the utility the first time you need it. Wait until you have some experience. Feel free to write a prototype or two; a good utility is sufficiently better than a bad utility to justify a bit of time and effort on researching it. Don't feel bad if what you thought would be a great utility ends up gathering dust after you wrote it. If you find yourself frustrated by your new program's shortcomings, you just had another prototyping phase. If it turns out to be useless, well, that happens sometimes.

The thing you're looking for is a program that finds general application outside your initial usage patterns. I wrote unsort because I wanted an easy way to get a random series of colors out of an old X11 "rgb.txt" file. Since then, I've used it for an incredible number of tasks, not the least of which was producing test data for debugging and benchmarking sort routines.

One good utility can pay back the time you spent on all the near misses. The next thing to do is make it available for others, so they can experiment. Make your failed attempts available, too; other people may have a use for a utility you didn't need. More importantly, your failed utility may be someone else's prototype, and lead to a wonderful utility program for everyone.

Resources

[Jul 3, 2003] dunne.dyn.dhs.org/Using the m4 Macro Processor - updated link

"What is it about m4 that makes it so useful, and yet so overlooked? m4 -- a macro processor -- unfortunately has a dry name that disguises a great utility. A macro processor is basically a program that scans text and looks for defined symbols, which it replaces with other text or other symbols."

[Apr 17, 2003] Exploring processes with Truss: Part 1 By Sandra Henry-Stocker

The ps command can tell you quite a few things about each process running on your system. These include the process owner, memory use, accumulated time, the process status (e.g., waiting on resources) and many other things as well. But one thing that ps cannot tell you is what a process is doing - what files it is using, what ports it has opened, what libraries it is using and what system calls it is making. If you can't look at source code to determine how a program works, you can tell a lot about it by using a procedure called "tracing". When you trace a process (e.g., truss date), you get verbose commentary on the process' actions. For example, you will see a line like this each time the program opens a file:

open("/usr/lib/libc.so.1", O_RDONLY) = 4

The text on the left side of the equals sign clearly indicates what is happening. The program is trying to open the file /usr/lib/libc.so.1 and it's trying to open it in read-only mode (as you would expect, given that this is a system library). The right side is not nearly as self-evident. We have just the number 4. Open is not a Unix command, of course, but a system call. That means that you can only use the command within a program. Due to the nature of Unix, however, system calls are documented in man pages just like ls and pwd.

To determine what this number represents, you can skip down in this column or you can read the man page. If you elect to read the man page, you will undoubtedly read a line that tells you that the open() function returns a file descriptor for the named file. In other words, the number, 4 in our example, is the number of the file descriptor referred to in this open call. If the process that you are tracing opens a number of files, you will see a sequence of open calls. With other activity removed, the list might look something like this:

open("/dev/zero", O_RDONLY) = 3

open("/var/ld/ld.config", O_RDONLY) Err#2 ENOENT

open("/usr/lib/libc.so.1", O_RDONLY) = 4

open("/usr/lib/libdl.so.1", O_RDONLY) = 4

open64("./../", O_RDONLY|O_NDELAY) = 3

open64("./../../", O_RDONLY|O_NDELAY) = 3

open("/etc/mnttab", O_RDONLY) = 4

Notice that the first file handle is 3 and that file handles 3 and 4 are used repeatedly. The initial file handle is always 3. This indicates that it is the first file handle following those that are the same for every process that you will run - 0, 1 and 2. These represent standard in, standard out and standard error.

The file handles shown in the example truss output above are repeated only because the associated files are subsequently closed. When a file is closed, the file handle that was used to access it can be used again. 

The close commands include only the file handle, since the location of the file is known. A close command would, therefore, be something like close(3). One of the lines shown above displays a different response - Err#2 ENOENT. This "error" (the word is put in quotes because this does not necessarily indicate that the process is defective in any way) indicates that the file the open call is attempting to open does not exist. Read "ENOENT" as "No such file".

Some open calls place multiple restrictions on the way that a file is opened. The open64 calls in the example output above, for example, specify both O_RDONLY and O_NDELAY. Again, reading the man page will help you to understand what each of these specifications means and will present with a list of other options as well.

As you might expect, open is only one of many system calls that you will see when you run the truss command. Next week we will look at some additional system calls and determine what they are doing.

Exploring processes with Truss: part 2 By Sandra Henry-Stocker

While truss and its cousins on non-Solaris systems (e.g., strace on Linux and ktrace on many BSD systems) provide a lot of data on what a running process is doing, this information is only useful if you know what it means. Last week, we looked at the open call and the file handles that are returned by the call to open(). This week, we look at some other system calls and analyze what these system calls are doing. You've probably noticed that the nomenclature for system functions is to follow the name of the call with a set of empty parentheses for example, open(). You will see this nomenclature in use whenever system calls are discussed.

The fstat() and fstat64() calls obtain information about open files - "fstat" refers to "file status". As you might expect, this information is retrieved from the files' inodes, including whether or not you are allowed to read the files' contents. If you trace the ls command (i.e., truss ls), for example, your trace will start with lines that resemble these:

1 execve("/usr/bin/ls", 0x08047BCC, 0x08047BD4) argc = 1

2 open("/dev/zero", O_RDONLY) = 3

3 mmap(0x00000000, 4096, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0xDFBFA000

4 xstat(2, "/usr/bin/ls", 0x08047934) = 0

5 open("/var/ld/ld.config", O_RDONLY) Err#2 ENOENT

6 sysconfig(_CONFIG_PAGESIZE) = 4096

7 open("/usr/lib/libc.so.1", O_RDONLY) = 4

8 fxstat(2, 4, 0x08047310) = 0

...

28 lstat64(".", 0x080478B4) = 0

29 open64(".", O_RDONLY|O_NDELAY) = 3

30 fcntl(3, F_SETFD, 0x00000001) = 0

31 fstat64(3, 0x0804787C) = 0

32 brk(0x08057208) = 0

33 brk(0x08059208) = 0

34 getdents64(3, 0x08056F40, 1048) = 424

35 getdents64(3, 0x08056F40, 1048) = 0

36 close(3) = 0

In line 31, we see a call to fstat64, but what file is it checking? The man page for fstat() and your intuition are probably both telling you that this fstat call is obtaining information on the file opened two lines before – "." or the current directory - and that it is referring to this file by its file handle (3) returned by the open64() call in line 29.

Keep in mind that a directory is simply a file, though a different variety of file, so the same system calls are used as would be used to check a text file.

You will probably also notice that the file being opened is called /dev/zero (again, see line 2). Most Unix sysadmins will immediately know that /dev/zero is a special kind of file - primarily because it is stored in /dev. And, if moved to look more closely at the file, they will confirm that the file that /dev/zero points to (it is itself a symbolic link) is a special character file. What /dev/zero provides to system programmers, and to sysadmins if they care to use it, is an endless stream of zeroes. This is more useful than might first appear.

To see how /dev/zero works, you can create a 10M-byte file full of zeroes with a command like this:

/bin/dd < /dev/zero > zerofile bs=1024 seek=10240 count=1

This command works well because it creates the needed file with only a few read and write operations; in other words, it is very efficient.

You can verify that the file is zero-filled with od.

# od -x zerofile

0000000 0000 0000 0000 0000 0000 0000 0000 0000

*

50002000

Each string of four zeros (0000) represents two bytes of data. The * on the second line of output indicates that all of the remaining lines are identical to the first.

Looking back at the truss output above, we cannot help but notice that the first line of the truss output includes the name of the command that we are tracing. The execve() system call executes a process. The first argument to execve() is the name of the file from which the new process image is to be loaded. The mmap() call which follows maps the process image into memory. In other words, it directly incorporates file data into the process address space. The getdents64() calls on lines 34 and 35 are extracting information from the directory file - "dents" refers to "directory entries".

The sequence of steps that we see at the beginning of the truss output - executing the entered command, opening /dev/zero, mapping memory and so on - looks the same whether you are tracing ls, pwd, date or restarting Apache. In fact, the first dozen or so lines in your truss output will be nearly identical regardless of the command you are running. You should, however, expect to see some differences between different Unix systems and different versions of Solaris.

Viewing the output of truss, you can get a solid sense of how the operating system works. The same insights are available if you are tracing your own applications or troubleshooting third party executables.

-------------------

Sandra Henry-Stocker

Linux.ie Using the ps command.

3.2. Displaying all processes owned by a specific user

$ ps ux
USER       PID %CPU %MEM   VSZ  RSS TTY      STAT START   TIME COMMAND
heyne      691  0.0  2.4 19272 9576 ?        S    13:35   0:00 kdeinit: kded    
heyne      700  0.1  1.0  5880 3944 ?        S    13:35   0:01 artsd -F 10 -S 40
... ... ... 


You can also use the syntax "ps U username".

As you can see, the ps command can give you a lot of interesting information. If, for example, you want to know what your friend is actually doing, just replace your login name with her/his name and you will see all processes belonging to her/him.

3.3. Own output format

If you are bored by the regular output, you can simply change the format. To do so, use the formatting characters supported by the ps command. If you execute ps with the 'o' parameter, you can tell it exactly what you want to see. For example, an odd display with AIX field descriptors:
 

$ ps -o "%u : %U : %p : %a"
RUSER    : USER     :   PID : COMMAND
heyne    : heyne    :  3363 : bash
heyne    : heyne    :  3367 : ps -o %u : %U : %p : %a

developerWorks: Concatenating files with cat (cat has two useful options):

Dogs of the Linux Shell. Posted on Saturday, October 19, 2002 by Louis J. Iacona. Could the command-line tools you've forgotten or never knew save time and some frustration?

One incarnation of the so called 80/20 rule has been associated with software systems. It has been observed that 80% of a user population regularly uses only 20% of a system's features. Without backing this up with hard statistics, my 20+ years of building and using software systems tells me that this hypothesis is probably true. The collection of Linux command-line programs is no exception to this generalization. Of the dozens of shell-level commands offered by Linux, perhaps only ten commands are commonly understood and utilized, and the remaining majority are virtually ignored.

Which of these dogs of the Linux shell have the most value to offer? I'll briefly describe ten of the less popular but useful Linux shell commands, those which I have gotten some mileage from over the years. Specifically, I've chosen to focus on commands that parse and format textual content.

The working examples presented here assume a basic familiarity with command-line syntax, simple shell constructs and some of the not-so-uncommon Linux commands. Even so, the command-line examples are fairly well commented and straightforward. Whenever practical, the output of usage examples is presented under each command-line execution.

The following eight commands parse, format and display textual content. Although not all provided examples demonstrate this, be aware that the following commands will read from standard input if file arguments are not presented.

Table 1. Summary of Commands

Head/Tail

As their names imply, head and tail are used to display some amount of the top or bottom of a text block. head presents the beginning of a file on standard output, while tail does the same with the end of a file. Review the following commented examples:

## (1) displays the first 6 lines of a file
   head -6 readme.txt
## (2) displays the last 25 lines of a file
   tail -25 mail.txt

Here's an example of using head and tail in concert to display the 11th through 20th line of a file.

# (3)
head -20 file | tail -10 

Manual pages show that the tail command has more command-line options than head. One of the more useful tail options is -f. When it is used, tail does not return when end-of-file is detected, unless it is explicitly interrupted. Instead, tail sleeps for a period and checks for new lines of data that may have been appended since the last read.

## (4) display ongoing updates to the given
##     log file 

tail -f /usr/tmp/logs/daemon_log.txt

Imagine that a dæmon process is continually appending activity logs to the /usr/tmp/logs/daemon_log.txt file. Using tail -f at a console window, for example, will more or less track all updates to the file in real time. (The -f option is applicable only when tail's input is a file.)

If you give multiple arguments to tail, you can track several log files in the same window.

## track the mail log and the server error log
## at the same time.

tail -f /var/log/mail.log /var/log/apache/error_log

tac--Concatenate in Reverse

What is cat spelled backwards? Well, that's what tac's functionality is all about: it concatenates files like cat, but writes each file's lines out in reverse order. So what's its usefulness? It can be used on any task that requires ordering elements in a last-in, first-out (LIFO) manner. Consider the following command line, which lists the three most recently created user accounts, most recent first.

# (5) last 3 /etc/passwd records - in reverse
$ tail -3 /etc/passwd | tac
curly:x:1003:100:3rd Stooge:/homes/curly:/bin/ksh
larry:x:1002:100:2nd Stooge:/homes/larry:/bin/ksh
moe:x:1001:100:1st Stooge:/homes/moe:/bin/ksh

nl--Numbered Line Output

nl is a simple but useful numbering filter. It displays its input with each line numbered in the left margin, in a format dictated by command-line options. nl provides a plethora of options that specify every detail of its numbered output. The following commented examples demonstrate some of those options:

# (6) Display the first 4 entries of the password
#     file - numbers to be three columns wide and 
#     padded by zeros.
$ head -4 /etc/passwd | nl -nrz -w3
001	root:x:0:1:Super-User:/:/bin/ksh
002	daemon:x:1:1::/:
003	bin:x:2:2::/usr/bin:
004	sys:x:3:3::/:
#
# (7) Prepend ordered line numbers followed by an
#     '=' sign to each line -- start at 101.
$ nl -s= -v101 Data.txt
101=1st Line ...
102=2nd Line ...
103=3rd Line ...
104=4th Line ...
105=5th Line ...
  .......

fmt--Format

The fmt command is a simple text formatter that focuses on making textual data conform to a maximum line width. It accomplishes this by joining and breaking lines around white space. Imagine that you need to maintain textual content that was generated with a word processor. The exported text may contain lines whose lengths vary from very short to much longer than a standard screen length. If such text is to be maintained in a text editor (like vi), fmt is the command of choice to transform the original text into a more maintainable format. The first example below shows fmt being asked to reformat file contents as text lines no greater than 60 characters long.

# (8) No more than 60 char lines
$ fmt -w 60 README.txt > NEW_README.txt
# 
# (9) Force uniform spacing:
#     1 space between words, 2 between sentences
$ echo "Hello   World. Hello Universe." | fmt -u -w80 

Hello World.  Hello Universe.

fold--Break Up Input

fold is similar to fmt but is typically used to format data that will be consumed by other programs, rather than to make text more readable to the human eye. The commented examples below are fairly easy to follow:

# (10) Format text in 3 column width lines
$ echo oxoxoxoxo | fold -w3 
oxo
xox
oxo
# (11) Parse by triplet-char strings - 
#      search for 'xox'
$ echo oxoxoxoxo | fold -w3 | grep "xox"
xox
# (12) One way to iterate through a string of chars
$ for i in $(echo 12345 | fold -w1)
> do
> ### perform some task ...
> print $i      ## ksh built-in; under bash use 'echo $i'
> done
1
2
3
4
5

pr

pr shares features with simpler commands like nl and fmt, but its command-line options make it ideal for converting text files into a format that's suitable for printing. pr offers options that allow you to specify page length, column width, margins, headers/footers, double line spacing and more.

Aside from being the best suited formatter for printing tasks, pr also offers other useful features. These features include allowing you to view multiple files vertically in adjacent columns or columnizing a list in a fixed number of columns (see Listing 2).

Listing 2. Using pr
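
Listing 2 itself is not reproduced here; the following is a minimal sketch of typical pr usage (the file names are only placeholders):

## paginate for printing: custom header, 66-line pages, left margin of 5
$ pr -h "Quarterly Report" -l 66 -o 5 report.txt | lpr

## view two files side by side in adjacent columns
$ pr -m -w 132 draft_v1.txt draft_v2.txt

## columnize a one-item-per-line list into four columns, no headers
$ ls | pr -4 -t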

Miscellaneous

The following two commands are specialized parsers used to pick apart file path pieces.

Basename/Dirname

The basename and dirname commands are useful for presenting portions of a given file path. Quite often in scripting situations, it's convenient to be able to parse and capture a file name or the containing-directory name portions of a file path. These commands reduce this task to a simple one-line command. (There are other ways to approach this using the Korn shell or sed "magic", but basename and dirname are more portable and straightforward).

basename is used to strip off the directory, and optionally, the file suffix parts of a file path. Consider the following trivial examples:

# (21) Parse out the Java class name
$ basename /usr/local/src/java/TheClass.java .java
TheClass
# (22) Parse out the file name.  
$ basename srcs/C/main.c 
main.c

dirname is used to display the containing directory path, as much of the path as is provided. Consider the following examples:

# (23) absolute and relative directory examples
$ dirname /homes/curly/.profile 
/homes/curly 
$ dirname curly/.profile
curly 
# 
# (24) From any korn-shell script, the following
#  line will assign the directory from where 
#  the script was launched 
SCRIPT_HOME="$(dirname $(whence $0))" 
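# A more portable variant of (24) for bash, where the ksh-only 'whence'
# is not available (this line is an added sketch, not part of the
# original article):
SCRIPT_HOME="$(cd "$(dirname "$0")" && pwd)"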
# 
# (25)
# Okay, how about a non-trivial practical example?
#  List all directories (under $PWD) that contain a
#  file called 'core'.
$ for i in $(find $PWD -name core)
> do 
> dirname $i
> done | sort -u
bin 
rje/gcc 
src/C

ttyrec -- a tty recorder

ttyrec is a tty recorder. Recorded data can be played back with the included ttyplay command.

ttyrec is a derivative of the script command that also records timing information with microsecond accuracy.

It can record emacs -nw, vi, lynx, or any other program running on a tty.
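
A minimal usage sketch (demo.tty is a placeholder file name; without an argument ttyrec records to a file called ttyrecord):

$ ttyrec demo.tty      ## starts a new shell; everything on the tty is recorded
$ top                  ## ...do some work in the recorded session...
$ exit                 ## leaving the shell stops the recording
$ ttyplay demo.tty     ## replay the session with its original timing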

Understanding Archivers

In the next few articles, I'd like to take a look at backups and archiving utilities. If you're like I was when I started using Unix, you may be intimidated by the words tar, cpio and dump, and a quick peek at their respective man pages will do little to alleviate those fears.

Online Gnu Documentation

Links to the manuals for the GNU tools most commonly used in embedded development: Using and Porting GNU CC; Using as, the GNU Assembler; GASP, an assembly preprocessor; Using ld, the GNU linker.
http://www.objsw.com/docs/

Slashdot Articles: Free Books Online. Matt Braithwaite writes: "Answering RMS's call for free documentation, Karl Fogel has written a book on CVS that is free (GPLed) and available online. (The paper version has additional non-free material.)" Also, edinator wrote to say that ORA has put the Using Samba text online. The entire text of the O'Reilly DocBook book is downloadable at www.docbook.org.

TheLinuxGurus.org: Book Review: Professional Linux Programming

(Oct 21, 2000, 18:38 UTC) (116 reads) (0 talkbacks) (Posted by john)
"This book takes a different approach in that it steps through the development of a fictional application. The application you will build is an interface for a DVD rental store."  

FreeOS.com: RPM usage for newbies

(Oct 21, 2000, 18:03 UTC) (203 reads) (0 talkbacks) (Posted by john)
"The Red Hat Package Manager (RPM) has establised itself as one of the most popular distrubution formats for linux software today. A first time user may feel overwhelmed by the vast number of options available and this article will help a newbie to get familiar with usage of this tool."

Signal Ground: Stupid dd Tricks (or, Why We Didn't buy Norton Ghost)

"The company that employs Tom and me builds big pieces of food processing machinery that cost upwards of $400K. Each machine includes an embedded PCs running -- and I cringe -- NT 4. While the company's legacy currently dictates NT, those of us at the lower levels of the totem pole work to wedge Linux in wherever we can. What follows is a short story of a successful insertion that turned out to be (gasp!) financially beneficial to the company, too."

"...Ghost works well; it does exactly what we wanted it to. You boot off of a floppy (while the image medium is in another drive), and Ghost does the rest. The problem lies in Ghost's licensing. If you want to install in a situation like ours, you have to purchase a Value-Added Reseller (VAR) license from Symantec. And, every time you create a drive, you have to pay them about 17 dollars. When you also figure in the time needed to keep track of those licenses, that adds up in a hurry."

"It finally occurred to me that we could use Linux and a couple of simple tools (dd, gzip, and a shell script) to do the same thing as Ghost -- at least as far as our purposes go. ... The Results? We showed our little program to management, and they were impressed. We were able to create disk images almost as quickly as Norton Ghost, and we did it all in an afternoon using entirely free software. The rest is history."

Issue #87 Common Shell Tools - Focus On Linux - 05-25-00

"sort and uniq

The sort command is used to sort the lines in an input stream in alphanumeric or telephone book order. The simplest ways to use sort are to provide it with a filename to sort or an input stream whose data should be output in sorted form:

  sort myfile.txt
  cat myfile.txt | sort

This tool can be told to sort based on alternate fields and in several different orders. The uniq command is often used in conjunction with sort because it removes consecutive duplicate lines from an input stream before writing them to standard output. This provides a quick, easy way to sort a pool of data and then remove duplicate entries.

A more in-depth discussion of sort can be found in the past QuickTip called Sort and Uniq.
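
A minimal sketch of the sort-plus-uniq combination described above (myfile.txt is the same placeholder file):

## remove duplicate lines
$ sort myfile.txt | uniq

## count how many times each line occurs, most frequent first
$ sort myfile.txt | uniq -c | sort -rn

## GNU sort can do the plain de-duplication by itself
$ sort -u myfile.txt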

tr

The tr command in its simplest form can be thought of as a simpler case of the sed command discussed earlier. It is used to replace all occurrences of a single character in an input stream with an alternate character before writing to the output stream. For example, to change all percent (%) characters to spaces, you might use:

  tr '%' ' ' < myfile.txt > newfile.txt

Though sed can be used to accomplish the same task, it is often simpler to use tr when replacing a single character because the syntax is easy to remember and many special characters which must be escaped for sed can be supplied to tr without escaping.
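
A small illustration of that point: the dot is a regular-expression metacharacter, so sed has to escape it while tr does not:

$ echo "a.b.c" | tr '.' ':'
a:b:c
$ echo "a.b.c" | sed 's/\./:/g'
a:b:c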

wc

The wc, or "word count", command does just what its name implies: it counts words. As an added feature, wc also counts lines and bytes. The formats for counting words, lines, or bytes in a file or input stream are:

  $ wc -w myfile.txt
      897 myfile.txt
  $ wc -l myfile.txt
      193 myfile.txt
  $ wc -c myfile.txt
     5927 myfile.txt
  $

Notice that the output for wc normally includes the filename (when reading from a file) and always includes a number of spaces as well. Often, this behavior is undesirable, usually when a number is required without leading or trailing whitespace. In such cases, sed and cut can be used to eliminate them:

  $ wc -l myfile.txt | cut -d ' ' -f 1 | sed 's! !!g'
  193
  $

Note that other methods for removing spaces or filenames include using a more complex sed command alone or even using awk, which we won't discuss in this issue.
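
One more simple trick: when wc reads from standard input it prints no file name at all, and GNU wc then emits the bare number (some older implementations still left-pad it with spaces):

  $ wc -l < myfile.txt
  193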

xargs

The xargs utility is used to break a long input stream into manageable groups of arguments, so that the resulting command line never exceeds the system's argument-length limit. For example, the following command may fail with an "Argument list too long" error if too many files are present under the current directory tree:

  lpr $(find .)

However, using xargs, the desired effect can be obtained:

  find . | xargs lpr

More information on using xargs can be found in the QuickTip called Long Argument Lists and on the xargs manual page.
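
If file names may contain spaces or other unusual characters, the usual refinement (assuming GNU find and xargs) is to pass NUL-separated names:

  find . -print0 | xargs -0 lpr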

Linux Today / PRNewswire: SCO Contributes to the Open Source Community; Kicks Off Open Source Initiatives

"SCO is contributing source code for two developer tools -- "cscope" and "fur." The code is released under the terms of the BSD License and will be maintained by SCO. The first technology, cscope, is available to download at www.sco.com/opensource. Software developers can use cscope to help design and debug programs coded with the C programming language. The second technology, Fur, will be available to download in several weeks. Fur is a real-time analysis program used to optimize application and system binaries for more effective run time execution. Dramatic results have been seen in high-level applications and database systems using fur." 

[Jan 30, 2000] Use the Source, Luke: Compiling and installing from source code (LG #49)

One of the greatest strengths of the Open Source movement is the availability of source code for almost every program. This article will discuss in general terms, with some examples, how to install a program from source code rather than a precompiled binary package. The primary audience for this article is the user who has some familiarity with installing programs from binaries, but isn't familiar with installing from source code. Some knowledge of compiling software is helpful, but not required.

[Jan 3, 2000] Advanced Programming in Expect: A Bulletproof Interface (LG #48) -- a very interesting and useful paper. See also: Ext2 - Automating interactive tasks with expect and crontab

QCad, the user-friendly CAD system for Linux, is now open source.

PC Week: PC Week Labs evaluates open-source apps. They recommend Apache, Mozilla, Samba and Perl for enterprise use. The evaluations of the individual products are second-rate and do not deserve attention; only the list itself is interesting.

[Jan 25, 1999] Win32 Editors page was added.

Open Source Software Chronicles -- October-December, 1998

Open Source Software Chronicles -- July-September, 1998


m4

Linux, Unix, etc.: Using the m4 Macro Processor -- a nice m4 intro by Paul Dunne:

"What is it about m4 that makes it so useful, and yet so overlooked? m4 -- a macro processor -- unfortunately has a dry name that disguises a great utility. A macro processor is basically a program that scans text and looks for defined symbols, which it replaces with other text or other symbols."

GNU macro processor - Table of Contents

docs.sun.com Programming Utilities Guide/m4

m4 macro processor Caldera

Programming in standard C and C++

m4 macro processor
        Defining macros
        Quoting
        Arguments
        Arithmetic built-ins
        File inclusion
        Diversions
        System command
        Conditionals
        String manipulation
        Printing

General Programming Concepts: Writing and Debugging Programs -- m4 Macro Processor Overview

This chapter provides information about the m4 macro processor, which is a front-end processor for any programming language being used in the operating system environment.

The m4 macro processor is useful in many ways. At the beginning of a program, you can define a symbolic name or symbolic constant as a particular string of characters. You can then use the m4 program to replace unquoted occurrences of the symbolic name with the corresponding string. Besides replacing one string of text with another, the m4 macro processor also provides macros with arguments, arithmetic built-ins, file inclusion, diversions, system command execution, conditionals and string manipulation (see the outline above).

The m4 macro processor processes strings of letters and digits called tokens. The m4 program reads each alphanumeric token and determines if it is the name of a macro. The program then replaces the name of the macro with its defining text, and pushes the resulting string back onto the input to be rescanned. You can call macros with arguments, in which case the arguments are collected and substituted into the right places in the defining text before the defining text is rescanned.

The m4 program provides built-in macros such as define. You can also create new macros. Built-in and user-defined macros work the same way.
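
A minimal sketch of a user-defined macro with an argument (the file and macro names are placeholders):

$ cat hello.m4
define(`AUTHOR', `World')dnl
define(`GREET', `Hello, $1!')dnl
GREET(AUTHOR)
$ m4 hello.m4
Hello, World!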

  • Autoconf: documentation, tutorials, mailing lists


    Humor

    Less sucks less more than more.
    That's why I use more less, and less more.




