Softpanorama

May the source be with you, but remember the KISS principle ;-)
Bigger doesn't imply better. Bigger often is a sign of obesity, of lost control, of overcomplexity, of cancerous cells

Using redirection and pipes


Extracted from Professor Nikolai Bezroukov unpublished lecture notes.

Copyright 2010-2018, Dr. Nikolai Bezroukov. This is a fragment of the copyrighted unpublished work. All rights reserved.


Introduction

Redirection and pipes are two closely related innovations that helped propel Unix to the role it plays today.

By default, when a command is executed it shows its results on the screen of the computer you are working on. The computer monitor serves as the target for the so-called standard output, also referred to as STDOUT. The shell also has a default destination for error messages (STDERR) and a default source of input (STDIN).

So if you run a command, it expects input from the keyboard and normally sends its output to the monitor of your computer, making no distinction between normal output and errors. Some commands, however, are started in the background rather than from a current terminal session, so they have no monitor or console session to send their output to, and they do not listen to the keyboard for their standard input. That is where redirection comes in handy.

It is important to understand that when messages aren't redirected in your program, the output goes through a special file called standard output. By default, standard output represents the screen. That means that everything sent through standard output is redirected to the screen.

Programs started from the command line have no idea what they are reading from or writing to. They just read from file descriptor 0 if they want to read from standard input, and they write to file descriptor number 1 to display output and to file descriptor 2 if they have error messages to be output. By default, these are connected to the keyboard and the screen.

In I/O redirection, files can be used to replace the default STDIN, STDOUT, and STDERR. You can also redirect to device files. A device file on Linux is a file that is used to access specific hardware. Your hard disk, for instance, can be referred to as /dev/sda, the console of your server is known as /dev/console or /dev/tty1, and if you want to discard a command's output, you can redirect it to /dev/null. To access most device files you need to be root.

If you use redirection symbols such as <, >, and |, the shell connects the file descriptors to files you specified or other commands.

Two major mechanisms increase the flexibility of Unix utilities: redirection and pipes.

Redirection basics

By default Unix/Linux assumes that all output goes to STDOUT, which is assigned to the user's screen/console, called /dev/tty. You can divert messages directed to standard output, for example from commands like echo, to files or other commands. Bash refers to this as redirection.

Before the shell executes a command, it scans the command line for redirection characters. These special symbols instruct the shell to redirect input and output accordingly. Redirection characters can appear anywhere in a simple command or can precede or follow a command. They are not passed on to the invoked command as parameters.

The most popular is the > operator, which redirects STDOUT to a file. The redirection operator is followed by the name of the file the messages should be written to. For example, to write a message with the time the processing started to a file named /tmp/my.log, you use

timestamp=`date`
echo "The processing started at $timestamp" > /tmp/my.log

Try to execute

echo "Hello world" > /dev/tty

You will see that the text is typed on your screen in exactly the same way as if you had executed the command

echo "Hello world"

because, as long as standard output has not been redirected, the two commands have the same effect.

Bash uses the symbol &1 to refer to standard output, and you can explicitly redirect messages to it. You can also redirect the output of a whole script to a file:

bash myscript.sh > mylisting.txt
This is the same as
bash myscript.sh 1> mylisting.txt

In this case any echo statement writes its information not to the screen but to the file you have redirected the output to, here mylisting.txt.

But you can also redirect each echo statement in your script. Let's see another set of examples:

echo "Don't forget to backup your data" > /dev/tty      # send explicitly to the screen
echo "Don't forget to backup your data"                 # sent to screen via standard output stream
echo "Don't forget to backup your data" >&1             # same as the last one
echo "Don't forget to backup your data" >/dev/stdout    # same as the last one
echo "Don't forget to backup your data" > warning.txt   # redirect to a file in the current directory

Using standard output is a way to send all the output from a script and any commands in it to a new destination.

A script doesn't usually need to know where the messages are going: There’s always the possibility they were redirected. However, when errors occur and when warning messages are printed to the user, you don't want these messages to get redirected along with everything else.

Linux defines a second file especially for messages intended for the user called standard error. This file represents the destination for all error messages. Because standard error, like standard output, is a file, standard error can likewise be redirected. The symbol for standard error is &2.  Instead of &2 /dev/stderr can also be used. The default destination for standard error, like standard output, is the screen. For example,

echo "$SCRIPT:$LINENO: No files available for processing" >&2

This command appears to work the same as an echo without the >&2 redirection, but there is an important difference: it displays the error message on the screen no matter where standard output has been redirected.

Redirection of standard error looks very similar, but naturally the operators begin with the number 2. For example

bash myscript.sh 2> myscript_errors.txt

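You can merge the standard output and standard error streams with 2>&1, which redirects stderr to wherever stdout currently points. Redirections are processed from left to right, so to capture both streams in a file, redirect stdout to the file first and then merge stderr into it:

bash myscript.sh > myscript_errors.txt 2>&1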
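
The following is a minimal sketch of separating and merging the two streams, and of why the order of redirections matters. The function emit and the temporary file names are hypothetical and serve only as an illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: writes one line to stdout and one line to stderr.
emit() {
    echo "normal output"
    echo "error output" >&2
}

tmpdir=$(mktemp -d)

# Separate the two streams into two files.
emit > "$tmpdir/out.txt" 2> "$tmpdir/err.txt"

# Merge both streams into one file: redirect stdout first, then point stderr at it.
emit > "$tmpdir/both.txt" 2>&1

# Reversed order: stderr is duplicated from the ORIGINAL stdout (here the
# command substitution), and only afterwards is stdout sent to the file.
leaked=$(emit 2>&1 > "$tmpdir/only_out.txt")

echo "out.txt:       $(cat "$tmpdir/out.txt")"
echo "err.txt:       $(cat "$tmpdir/err.txt")"
echo "both.txt has   $(wc -l < "$tmpdir/both.txt") lines"
echo "leaked stderr: $leaked"
```

With the reversed order the error message never reaches the file; it goes wherever standard output pointed before the > redirection was applied.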

There are several classic types of redirection. Among them: 

  1. <Sourcefile -- reading input from a specified file
  2. >Targetfile -- writing output to the specified file
  3. >> Targetfile -- adding output to the specified file
  4. <> Stream -- reading and writing to the specified stream
  5. <<[-] Source  -- using text embedded in the script (including variables and strings) as an input file

Source and target can be expressions. In this case bash performs command and parameter substitution before using the parameter. File name substitution occurs only if the pattern matches a single file.

The < operator

The Unix command cat is actually short for "catenate," i.e., link together. It accepts multiple filename arguments and copies them to the standard output. But let's pretend, for the moment, that cat and other utilities don't accept filename arguments and accept only standard input. The Unix shell lets you redirect standard input so that it comes from a file. The notation command < filename does the same as cat filename | command, with less overhead. In other words

cat /var/log/messages | more

and

more <  /var/log/messages

are equivalent, but the latter is slightly faster and uses CPU and memory more efficiently because it does not create a separate cat process.

Another example: the utility wc (word count) counts the lines in a file when given the option -l. That means you can count the number of lines in the file /etc/passwd, which represents the number of accounts on your system, using the command:

wc  -l <  /etc/passwd

Again, wc -l counts the lines of its input. Applied to your .bashrc in the same way (wc -l < ~/.bashrc), it reports the number of lines in that file. Printing this information from your .bash_profile script can be a useful reminder that alerts you to the fact that you recently modified your environment or, God forbid, that your .bashrc file disappeared without a trace :-)
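
As a small illustration of the difference between a filename argument and redirected input, here is a sketch that uses a temporary file with made-up account lines instead of the real /etc/passwd:

```shell
# Hypothetical sample data standing in for /etc/passwd.
tmpfile=$(mktemp)
printf 'alice:x:1000\nbob:x:1001\ncarol:x:1002\n' > "$tmpfile"

# With a filename argument wc prints the count followed by the file name.
wc -l "$tmpfile"

# With redirected standard input wc sees only a stream of bytes,
# so it prints just the count, which is convenient in scripts.
count=$(wc -l < "$tmpfile")
echo "accounts: $count"

rm -f "$tmpfile"
```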

The > operator

The > operator always overwrites the named file. If a series of messages are redirected from the echo command to the same file, only the last message appears:

echo "The processing started at" `date` > /tmp/myprogram/log
... ... ...
echo "There were no errors. Normal exit of the program" `date` > /tmp/myprogram/log

To add messages to a file without overwriting the earlier ones, Bash has an append operator, >>. This operator redirects messages to the end of a file.

echo "The processing started at" `date` > /tmp/myprogram/log
... ... ...
echo "There were no errors. Normal exit of the program" `date` >> /tmp/myprogram/log
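
The difference between the two operators can be checked with a short sketch; the log file here is a temporary file created only for the demonstration.

```shell
log=$(mktemp)

# Two writes with >: the second one truncates the file first.
echo "The processing started" > "$log"
echo "There were no errors"   > "$log"
after_overwrite=$(cat "$log")

# First write with >, second with >>: both lines survive.
echo "The processing started" > "$log"
echo "There were no errors"   >> "$log"
after_append=$(cat "$log")

printf 'with >:\n%s\n\nwith >>:\n%s\n' "$after_overwrite" "$after_append"
rm -f "$log"
```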

The << operator ("here file")

This operator imitates reading from a file inside the script by putting several lines directly into the script. The operator <<MARKER treats the lines following it in the script as if they were typed from the keyboard, until it reaches a line that consists of the word MARKER alone, starting in the first position. In other words, the lines treated as an input file are terminated by a special line containing a delimiter that you define yourself.

The data in the << list is known as a here file (or a here document) because, historically, the word HERE was often used in Bourne shell scripts as the marker of the end of the input lines.

In the following example the delimiter word is "EOF":

cat > /tmp/example <<EOF
this is a test demonstrating how you can
write several lines of text into
a file
EOF

If you use >> instead of > you can add lines to a file without using any editor. This is how sysadmins typically created small files during installation of the operating system, for example /etc/resolv.conf:

cat >>/etc/resolv.conf <<EOF
search datacenter.mycompany.com headquarters.mycompany.com
nameserver 10.100.20.5
nameserver 10.100.20.6
EOF

In the first example bash treats the three lines between <<EOF and the closing EOF marker as if they were typed from the keyboard and writes them to the file specified after > (/tmp/example in our case).

NOTE: the closing EOF marker must stand alone on its line, with no spaces before or after it.

Again, the name EOF is arbitrary. You can choose, for example, LINES_END instead. The only important thing is that no line of your text consists of that same word.

cat >>/etc/resolv.conf <<LINES_END
search datacenter.mycompany.com headquarters.mycompany.com
nameserver 10.100.20.5
nameserver 10.100.20.6
LINES_END

The marker must not appear on a line by itself anywhere in the included text; that is why an all-caps word makes sense as a marker.
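
A minimal sketch of a here file that can be run safely: it writes to a temporary file rather than to /etc/resolv.conf. The variable domain and the addresses are hypothetical placeholders. Note that with an unquoted marker the shell expands variables inside the included text.

```shell
conf=$(mktemp)
domain="example.com"          # hypothetical value, expanded inside the here file

cat > "$conf" <<LINES_END
search $domain
nameserver 192.0.2.1
nameserver 192.0.2.2
LINES_END

# The marker line itself is not part of the input.
line_count=$(wc -l < "$conf")
first_line=$(head -n 1 "$conf")
echo "lines written: $line_count"
echo "first line:    $first_line"
```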

Operator <<< ("here string")

Recent versions of bash have another redirection operator related to the here file, <<<, which passes a variable or a literal string to the command as its standard input.

cat > /tmp/example <<<  "this is another example of piping info into the file" 
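
Here strings are most useful for feeding the value of a variable to a command that reads standard input. A short sketch (bash-specific; the variable names are arbitrary):

```shell
#!/usr/bin/env bash
text="this is another example"

# Count the words in a variable without echo and a pipe.
word_count=$(wc -w <<< "$text")

# Split a string into variables with the read builtin.
read -r first second rest <<< "$text"

echo "words: $word_count"
echo "first=$first second=$second rest=$rest"
```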

Here is a summary of what we can do.

Exercise: Using I/O Redirection

  1. Open a shell as user user and type cd without any arguments. This ensures that the home directory of this user is the current directory while working on this exercise. Type pwd  to verify this.
  2. Type ls.  You’ll see the results onscreen.
  3. Type ls > /dev/null. This redirects the STDOUT to the null device, with the result that you will not see it.
  4. Type ls /root/nofile  > /dev/null. This command shows a “no such file or directory” message onscreen. You see the message because it is not STDOUT, but an error message that is written to STDERR.
  5. Type ls /root/nofile 2> /dev/null. Now you will not see the error message anymore.
  6. Type ls /root/bin  /root/etc  2> /dev/null. This command shows the content of the /root/bin directory while suppressing all error messages about the nonexistent /root/etc directory.
  7. Type ls /root/bin  /root/etc  2> /dev/null > output. In this command, you still write the error message to /dev/null while sending STDOUT to a file with the name output that will be created in your home directory.
  8. Type cat output  to show the contents of this file.
  9. Type echo hello > output. This overwrites the contents of the output file.
  10. Type ls >> output. This appends the result of the ls  command to the output file.
  11. Type ls -R /. This shows a long list of files and folders scrolling over your computer monitor. (You may want to type Ctrl+C  to stop [or wait a long time])
  12. Type ls -R / | less. This shows the same result, but in the pager less, where you can scroll up and down using the arrow keys on your keyboard.
  13. Type q  to close less. This will also end the ls  program.
  14. Type ls > /dev/tty1. This gives an error message because you are executing the command as an ordinary user (unless you were logged in to tty1). Only the user root has permission to write to device files directly.

Pipes as cascading redirection

A Linux administrator needs to know how to use pipes well, because pipes are used to construct simple sysadmin scripts on a daily basis. Pipes were one of the most significant innovations that Unix brought to operating systems. By combining multiple commands using pipes, you can create a kind of super command that makes almost anything possible. A pipe catches the output of one command and passes it as input to a second command, and so on.

Many text utilities are used as stages in multistage pipes. Utilities that accept standard input, process it, and write the result to standard output are called filters.

Pipeline programming involves a special style of componentization that breaks a problem into a number of small steps, each of which can then be performed by a simple program. We will call this type of componentization pipethink: wherever possible, the programmer relies on a preexisting collection of useful "stages" implemented by Unix filters. A quote from David Korn catches the essence of pipethink -- "reuse of a set of components rather than on building monolithic applications".

This process is called piping, and the shell uses the vertical bar (or pipe) operator | to specify it:

who | wc -l # count the number of users

Any number of commands can be strung together with vertical bar symbols. A group of such commands is called a pipeline. This is actually a language that system administrators keep learning throughout their career. The level of mastery of this language directly correlates with the qualification of a sysadmin. See also Pipes -- powerful and elegant programming paradigm

Pipes are often used for processing logs, for example to check whether certain types of errors are present or to select a fragment for further analysis. For example, to select lines 101 to 200 you can use a pipe with two filter stages:

cat /var/log/messages | head -200 | tail -100
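
To see exactly which lines such a head and tail combination selects, the pipeline can be tried on numbered input. This sketch uses a temporary file filled by seq instead of a real log; the sed command is shown only as an equivalent for comparison.

```shell
tmpfile=$(mktemp)
seq 1 300 > "$tmpfile"        # 300 lines, each containing its own line number

# First 200 lines, then the last 100 of those: lines 101 through 200.
selected=$(cat "$tmpfile" | head -n 200 | tail -n 100)
first=$(printf '%s\n' "$selected" | head -n 1)
last=$(printf '%s\n' "$selected" | tail -n 1)
echo "selected lines $first to $last"

# The same range expressed with a single sed command.
sed_selected=$(sed -n '101,200p' "$tmpfile")

rm -f "$tmpfile"
```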

If one command ends prematurely in a series of pipe commands, for example, because you interrupted a command with a Ctrl-C, Bash displays the message "Broken Pipe" on the screen.

 If a user runs the command ls, for instance, the output of the command is shown onscreen. If the user uses ls | less, the commands ls  and less  are started in parallel. The standard output of the ls  command is connected to the standard input of less. Everything that ls writes to the standard output will become available for read from standard input in less. The result is that the output of ls  is shown in a pager, where the user can browse up and down through the results easily.

This arrangement is called coroutines. If the result is not what you expect, you can break the pipe at any stage, redirect the output to a file, and check whether it is correct. After that you can redirect the file into the first stage of the remaining part of the pipe and debug stage by stage using the same method.

This way less can serve as an interactive frontend to any utility that does not have such capabilities.

You can use the filter pv (pipe viewer) to monitor each pipe stage on different datasets. pv passes the data through unchanged and reports on the terminal how much data has passed through it and how fast.

You can also split the pipe into two parts: one that is working properly and one that is not. Write the output of the last properly working stage into a file in /tmp and analyze it visually with less and other tools. This way you can find out what is wrong with the input of the problematic stage, if anything. After that you can add the next stage and repeat this procedure, and so on until the whole pipe is debugged.

You can use tee on any stage of the pipe to divert output to a file.
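
A minimal sketch of this debugging technique, assuming a made-up three-stage pipeline and hypothetical file names stage1.txt and stage2.txt: tee saves a copy of what flows between the stages without changing the data.

```shell
tmpdir=$(mktemp -d)

# Count distinct words; keep a copy of the data after each intermediate stage.
result=$(printf 'pear\napple\npear\nfig\n' |
    sort    | tee "$tmpdir/stage1.txt" |
    uniq -c | tee "$tmpdir/stage2.txt" |
    wc -l)

echo "distinct words: $result"
echo "output of the sort stage:"
cat "$tmpdir/stage1.txt"
echo "output of the uniq stage:"
cat "$tmpdir/stage2.txt"
```

Once the pipeline works, the tee stages can simply be deleted.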

Here is how Stephen G. Kochan explains this concept in his book (with Patrick Wood) Shell Programming in Unix, Linux and OS X, Fourth Edition

Pipes

As you will recall, the file users that was created previously contains a list of all the users currently logged in to the system. Because you know that there will be one line in the file for each user logged in to the system, you can easily determine the number of login sessions by counting the number of lines in the users file:

$ who > users
$ wc -l < users
      5
$

This output indicates that currently five users are logged in or that there are five login sessions, the difference being that users, particularly administrators, often log in more than once. Now you have a command sequence you can use whenever you want to know how many users are logged in.

Another approach to determine the number of logged-in users bypasses the intermediate file. As referenced earlier, Unix lets you “connect” two commands together. This connection is known as a pipe, and it enables you to take the output from one command and feed it directly into the input of another. A pipe is denoted by the character |, which is placed between the two commands. To create a pipe between the who and wc -l commands, you type who | wc -l:

$ who | wc -l
      5
$

Pipeline process: who | wc -l

When a pipe is established between two commands, the standard output from the first command is connected directly to the standard input of the second command. You know that the who command writes its list of logged-in users to standard output. Furthermore, you know that if no filename argument is specified to the wc command, it takes its input from standard input. Therefore, the list of logged-in users that is output from the who command automatically becomes the input to the wc command. Note that you never see the output of the who command at the terminal because it is piped directly into the wc command. This is depicted in Figure 1.13.

... ... ...

A pipe can be made between any two programs, provided that the first program writes its output to standard output, and the second program reads its input from standard input.

As another example, suppose you wanted to count the number of files contained in your directory. Knowledge of the fact that the ls command displays one line of output per file enables you to use the same type of approach as before:

$ ls | wc -l
      10
$

The output indicates that the current directory contains 10 files.

It is also possible to create a more complicated pipeline that consists of more than two programs, with the output of one program feeding into the input of the next. As you become a more sophisticated command line user, you’ll find many situations where pipelines can be tremendously powerful.

Filters

The term filter is often used in Unix terminology to refer to any program that can take input from standard input, perform some operation on that input, and write the results to standard output. More succinctly, a filter is any program that can be used to modify the output of other programs in a pipeline. So in the pipeline in the previous example, wc is considered a filter. ls is not because it does not read its input from standard input. As other examples, cat and sort are filters, whereas who, date, cd, pwd, echo, rm, mv, and cp are not.

 

Using AWK in pipes

Historically, AWK was the language used to write custom pipe stages (and it is still widely used in scripts, as it is a standalone utility that does not change). You can search for AWK one-liners for examples. A simple AWK program enclosed in single quotes can be passed as a parameter. For example, you can specify the field separator using the option -F and extract the fields you want, much like cut:

awk -F ':' '{ print $1 | "sort" }' /etc/passwd
This pipe
awk -F ':' '{ print $1 }' /etc/passwd | sort

does the same with an external sort stage: both commands print a sorted list of the login names of all users from /etc/passwd.
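
The two variants, and the cut equivalent, can be compared on sample data. This sketch uses a temporary file with made-up entries in /etc/passwd format, so it does not depend on the accounts of a real system.

```shell
# Hypothetical sample in /etc/passwd format.
passwd_sample=$(mktemp)
printf '%s\n' \
    'root:x:0:0:root:/root:/bin/bash' \
    'daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin' \
    'alice:x:1000:1000:Alice:/home/alice:/bin/bash' > "$passwd_sample"

# sort started by awk itself
names_internal=$(awk -F ':' '{ print $1 | "sort" }' "$passwd_sample")

# sort as a separate pipe stage
names_external=$(awk -F ':' '{ print $1 }' "$passwd_sample" | sort)

# the same field extracted with cut
names_cut=$(cut -d ':' -f 1 "$passwd_sample" | sort)

printf '%s\n' "$names_external"
rm -f "$passwd_sample"
```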

Generally AWK is more flexible than the cut utility and is often used instead.

Here are more examples from HANDY ONE-LINE SCRIPTS FOR AWK  by Eric Pement :

 # print the first 2 fields, in opposite order, of every line
 awk '{print $2, $1}' file

 # switch the first 2 fields of every line
 awk '{temp = $1; $1 = $2; $2 = temp; print}' file

 # print every line, deleting the second field of that line
 awk '{ $2 = ""; print }'

If an input file or output file are not specified, AWK will expect input from stdin or output to stdout.

AWK was one of the early Unix tools to make regular expressions a core part of a programming language. See AWK Regular expressions. But the pattern-matching capabilities of AWK are not limited to regular expressions. Patterns can be

  1. regular expressions enclosed by slashes, e.g.: /[a-z]+/
  2. relational expressions, e.g.: $3!=$4
  3. pattern-matching expressions, e.g.: $1 !~ /string/
  4. or any combination of these (This example selects lines where the two characters starting in the fifth column are xx and the third field matches nasty, plus lines beginning with The, plus lines ending with mean, plus lines in which the fourth field is greater than two.):
    (substr($0,5,2)=="xx" && $3 ~ /nasty/ ) || /^The/ || /mean$/ || $4>2
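
The combined pattern can be checked against a few made-up input lines. In this sketch each of the first, second, fourth and fifth lines satisfies exactly one of the four alternatives, and the third line satisfies none.

```shell
matched=$(printf '%s\n' \
    'The quick fox 1' \
    'abcdxx nothing nasty 0' \
    'a plain line 1' \
    'this is so mean' \
    'w x y 5' |
    awk '(substr($0,5,2)=="xx" && $3 ~ /nasty/) || /^The/ || /mean$/ || $4>2')

# A pattern without an action prints every matching line.
printf '%s\n' "$matched"
```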

Here are some examples (HANDY ONE-LINE SCRIPTS FOR AWK  by Eric Pement ):

 # substitute (find and replace) "foo" with "bar" on each line
 awk '{sub(/foo/,"bar")}; 1'           # replace only 1st instance
 gawk '{$0=gensub(/foo/,"bar",4)}; 1'  # replace only 4th instance
 awk '{gsub(/foo/,"bar")}; 1'          # replace ALL instances in a line
 
 # substitute "foo" with "bar" ONLY for lines which contain "baz"
 awk '/baz/{gsub(/foo/, "bar")}; 1'

 # substitute "foo" with "bar" EXCEPT for lines which contain "baz"
 awk '!/baz/{gsub(/foo/, "bar")}; 1'

 # change "scarlet" or "ruby" or "puce" to "red"
 awk '{gsub(/scarlet|ruby|puce/, "red")}; 1'
More examples:
SELECTIVE PRINTING OF CERTAIN LINES:

 # print first 10 lines of file (emulates behavior of "head")
 awk 'NR < 11'

 # print first line of file (emulates "head -1")
 awk 'NR>1{exit};1'

  # print the last 2 lines of a file (emulates "tail -2")
 awk '{y=x "\n" $0; x=$0};END{print y}'

 # print the last line of a file (emulates "tail -1")
 awk 'END{print}'

 # print only lines which match regular expression (emulates "grep")
 awk '/regex/'

 # print only lines which do NOT match regex (emulates "grep -v")
 awk '!/regex/'

 # print any line where field #5 is equal to "abc123"
 awk '$5 == "abc123"'

 # print only those lines where field #5 is NOT equal to "abc123"
 # This will also print lines which have less than 5 fields.
 awk '$5 != "abc123"'
 awk '!($5 == "abc123")'

 # matching a field against a regular expression
 awk '$7  ~ /^[a-f]/'    # print line if field #7 matches regex
 awk '$7 !~ /^[a-f]/'    # print line if field #7 does NOT match regex

 # print the line immediately before a regex, but not the line
 # containing the regex
 awk '/regex/{print x};{x=$0}'
 awk '/regex/{print (NR==1 ? "match on line 1" : x)};{x=$0}'

 # print the line immediately after a regex, but not the line
 # containing the regex
 awk '/regex/{getline;print}'

 # grep for AAA and BBB and CCC (in any order on the same line)
 awk '/AAA/ && /BBB/ && /CCC/'

 # grep for AAA and BBB and CCC (in that order)
 awk '/AAA.*BBB.*CCC/'

 # print only lines of 65 characters or longer
 awk 'length > 64'

 # print only lines of less than 65 characters
 awk 'length < 65'

 # print section of file from regular expression to end of file
 awk '/regex/,0'
 awk '/regex/,EOF'

 # print section of file based on line numbers (lines 8-12, inclusive)
 awk 'NR==8,NR==12'

 # print line number 52
 awk 'NR==52'
 awk 'NR==52 {print;exit}'          # more efficient on large files

 # print section of file between two regular expressions (inclusive)
 awk '/Iowa/,/Montana/'             # case sensitive

SELECTIVE DELETION OF CERTAIN LINES:

 # delete ALL blank lines from a file (same as "grep '.' ")
 awk NF
 awk '/./'

 # remove duplicate, consecutive lines (emulates "uniq")
 awk 'a !~ $0; {a=$0}'

 # remove duplicate, nonconsecutive lines
 awk '!a[$0]++'                     # most concise script
 awk '!($0 in a){a[$0];print}'      # most efficient script

 

Using Perl in pipes

A pipe lets you take the output from one command and use it as input to a custom script written in any scripting language that you know. Sysadmins typically use Perl for this purpose, but fashion changes.

Many sysadmins know Perl, and it is a powerful and flexible language for writing your own pipe stages. Python is now gradually displacing Perl from this role (and it is possible that in 20 years you will search for Python one-liners ;-), much like Perl partially displaced AWK in the 1990s. But Perl is still widely used and it is closer to the shell than Python, so it has the "home field advantage". I personally prefer Perl to Python, as many interesting one-liners are available for this scripting language. Just search for them using Bing or Google. They can be reused.

Minimal Perl that is used in pipes can be explained in just a few paragraphs:

Let's discuss several useful examples:

Here is a classic one-liner that replaces every match of the regular expression given in the first part of the s operator with the string given in the second part and writes the result to a new file:

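cat /etc/hosts | perl -pe 's/oldserver/newserver/g' > /tmp/newhosts

Another one-liner prints the first field of each line of df output, i.e. the device names (sample output follows the command):

df -k | perl -lane 'print $F[0] if $. > 1'
/dev/hda5
/dev/hda8
/dev/hda6
/dev/hdb5
/dev/hdb1
/dev/hda7
/dev/hda1
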
  1. Delete several lines from the beginning of a script or file. For example, the one-liner below deletes the first 10 lines:
    cat /etc/hosts | perl -ne 'print unless 1 .. 10' > /tmp/newhosts
  2. Emulation of the Unix cut utility in Perl. The option -a (autosplit mode) splits each line into the array @F. The splitting pattern is whitespace (space or \t), and unlike cut it accommodates consecutive spaces or \t mixed with spaces. Here's a simple one-line script that prints the first word of every line but skips any line beginning with a #, because it's a comment line.
    perl -nae 'next if /^#/; print "$F[0]\n"'

    Here is an example of how to print the first and the last fields:

    perl -lane 'print "$F[0]:$F[-1]"'
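
The autosplit one-liners can be tried on a few made-up lines that mix spaces and tabs. This sketch assumes that perl is installed and in the PATH; the sample text is hypothetical.

```shell
sample='# comment line
alpha   beta	gamma
one two'

# First word of every non-comment line.
first_words=$(printf '%s\n' "$sample" | perl -nae 'next if /^#/; print "$F[0]\n"')

# First and last field; -l strips the input newline and adds one on output.
first_last=$(printf '%s\n' "$sample" | perl -lane 'next if /^#/; print "$F[0]:$F[-1]"')

printf '%s\n' "$first_words"
printf '%s\n' "$first_last"
```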

Pipes in VIM

Everybody knows how pipes work at the command prompt. Text originates from some source, is processed via one or more filters, and the output goes either to the console display or is redirected to a file.

vi takes this same paradigm of pipes and filters and wraps it in an editor user interface in which the pipe is applied to the editing buffer both as a source and as a destination. A vi pipe can thus alter the buffer using standard Unix filters, which instantly become part of the editor toolbox. This is an extremely elegant idea. The ability to pass nearly arbitrary chunks of text through any Unix filter adds incredible flexibility at no "additional cost" in size or performance of the editor.

That was a stroke of genius in the design of vi, and it remains one of the most advanced features that vi (and by extension Vim) has. Unfortunately, few people understand it and use it to the full extent.

Pipes can be used both from ex command-line commands and from normal-mode commands:

With the vi filter command

:[address-range] ! external-command-name

you can process a range of lines with an external program. This enables you to do much more effective editing. For example, the command:

:1,$ ! indent

will beautify your program using  standard Unix beautifier (indent).  This is a classic example of using piping in vi. You can also create a macro for this operation using keystroke remapping. Instead of indent you can use any available batch beautifier most suitable for the language that you are using.

Without a filter name, the ! command prompts for the name of a Unix command (which should be a filter), then passes the selected lines through the filter, replacing those lines in the vi buffer with the output of the filter command.

Actually the ex  % operator is the easiest way to filter all the lines in your buffer, and this classic vi idiom should look like:

:%!indent

To filter the current line you need to use:

!!command

To filter the text from the cursor to the end of the paragraph:

!}command

If you use a cursor-movement keystroke as the object, it must move the cursor by more than one line (G, { }, ( ), [[ ]], +, -). To repeat the effect, a number may precede either the exclamation mark or the text object. (For example, both !10+ and 10!+ would indicate the next ten lines.) Objects such as w do not work unless enough of them are specified to exceed a single line.

You can also use a slash (/) followed by a regular expression and a carriage return to specify the object. This takes the text up to the pattern as input to the command.

The entire sequence can be preceded by a number that gives the repeat count:

20!!grep -v '^#'

or:

!20!grep -v '^#'

NOTE: To filter the text from the cursor to the end of the current paragraph you can use !}. That is a very convenient, often used idiom.

Among the Unix filters that can be used I would like to mention AWK and Perl. Of course, any filter can be used; for example, tr can be used for case conversion or deletion of some characters (tr '[a-z]' '[A-Z]').
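
Any filter intended for the ! command can be tried first at the shell prompt, because the editor simply feeds the selected lines to its standard input. In this sketch the variable buffer is a made-up stand-in for the editing buffer; inside the editor the same filters would be applied with :%!grep -v '^#' and :%!tr '[a-z]' '[A-Z]'.

```shell
# Hypothetical buffer contents.
buffer='# a comment
first line
# another comment
second line'

# Remove comment lines.
without_comments=$(printf '%s\n' "$buffer" | grep -v '^#')

# Convert everything to upper case.
upper=$(printf '%s\n' "$buffer" | tr '[a-z]' '[A-Z]')

printf '%s\n' "$without_comments"
printf '%s\n' "$upper"
```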

As most Unix/Linux sysadmins know some Perl along with bash, it is the natural choice and is gradually replacing AWK (although AWK remains attractive for Python users because it is a simpler and still quite powerful language).

For example, to pipe the whole buffer you need to specify the filter (which can be a Perl script written by you) after the ! command:

:1,$ ! perltidy

will beautify your program using perltidy. Beautifying your script is a classic example of using piping in vi. You can also create a macro using keystroke remapping. Instead of a beautifier you can use any other filter. This way, for example, you can temporarily remove all comments from the script, as that simplifies understanding, and then reread the file with comments back into the buffer.

To do the same for a selection you need to provide the boundaries of the selection to the command. The easiest way to do this is to use visual mode: the v, V, or Ctrl-v commands select the part of your script that you want to process. In this case the boundaries of your selection are defined by the two marks '< and '>, which represent the upper and lower boundary respectively. They are put on the command line automatically as soon as you enter : . For example:

'<,'>! perl -nle 'print "\#".$_'

(This operation is called "commenting out" and it can also be performed via a plugin, but it is still a good illustration of using Perl filters with piping.)

You can return to the original text if your pipe worked incorrectly

Vim passes your selection to the script as STDIN and inserts the standard output in place of it. If you incorrectly overwrite your selection or the whole buffer, you can restore it using the undo command (u). As the buffer is an "in-memory" copy of the file, rereading the file from disk (for example with :e!) also restores the text as it was last saved.

Repeating previous pipe command

To repeat the previous pipe command, you need to type:

! object !

Use of shell as a filter for execution of commands or command sequences in vi

The shell can be used as a filter, which gives you the ability to replace shell commands that you typed in the current line (or lines) with their output. That is a great, underappreciated and underutilized vi capability.

Here is a relevant quote from Vim tutorial (Vim Color Editor HOW-TO (Vi Improved with syntax color highlighting) Vi Tutorial)

Create a line in your file containing just the word who and absolutely no other text. Put the cursor on this line, and press !! This command is analogous to dd, cc, or yy, but instead of deleting, changing, or yanking the current line, it filters the current line. When you press the second !, the cursor drops down to the lower left corner of the screen and a single ! is displayed, prompting you to enter the name of a filter.

As the filter name, type sh and press the Return key. sh (the Bourne shell) is a filter! It reads standard input, does some processing of its input (that is, executes commands), and sends its output (the output of those commands) to standard output. Filtering the line containing who through sh causes the line containing who to be replaced with a list of the current users on the system - right in your file!

Try repeating this process with date. That is, create a line containing nothing but the word date, then put the cursor on the line, and press !!sh and the Return key. The line containing date is replaced with the output of the date command.

Put your cursor on the first line of the output of who. Count the number of lines. Suppose, for example, the number is six. Then select those six lines to be filtered through sort; press 6!!sort and the Return key. The six lines will be passed through sort, and sort's output replaces the original six lines.

The filter command can only be used on complete lines, not on characters or words.
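The claim that sh is itself a filter can be checked outside the editor. A minimal sketch (the two commands fed to sh are arbitrary examples):

```shell
# sh reads commands from standard input and writes their output to standard output.
out=$(printf 'echo first\necho second | tr a-z A-Z\n' | sh)
printf '%s\n' "$out"
# first
# SECOND
```

This is exactly what happens when lines of the buffer are filtered through sh: the lines are the commands, and their output replaces them.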

Some other filter commands (here, < CR > means press Return):

For example, suppose you need to insert into the document or script you are editing the IP address and netmask of the server on which you are working. You can type 'ifconfig eth0 | grep Mask' on a line of its own and filter that line through the shell with the ! command, so that the output appears exactly where it is needed. For example, if the command is on line 25:

:25 ! bash

Using vi as a simple program generator

You can use internal editor piping for a lot of interesting stuff. For example, you can read the list of files in the current directory into the buffer, convert them into commands, and then execute them in the shell:

$ vim
:r! ls *.c
:%s/\(.*\)\.c/mv & \1.bla
:w !sh
:q!

or as one liner:

:r! ls *.c :%s/\(.*\).c/mv & \1.bla :w 
!sh :q!
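The same "generate commands as text, then feed them to sh" idea also works without the editor. The following is only a sketch of that technique using sed in place of the vi substitution; it runs in a scratch directory with two made-up file names and assumes file names without spaces or newlines:

```shell
workdir=$(mktemp -d)            # scratch directory, so no real files are touched
cd "$workdir" || exit 1
touch alpha.c beta.c

# Step 1: generate the commands and look at them.
ls *.c | sed 's/\(.*\)\.c$/mv & \1.bla/'
# mv alpha.c alpha.bla
# mv beta.c beta.bla

# Step 2: once they look right, pipe the same text into sh to run it.
ls *.c | sed 's/\(.*\)\.c$/mv & \1.bla/' | sh
renamed=$(ls)
printf '%s\n' "$renamed"
```

Inspecting the generated commands before piping them into sh plays the same role as looking at the buffer before typing :w !sh.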

You can format text without the fmt program by using Perl's Text::Wrap module instead (especially useful if you are working in Cygwin):

:% ! perl -00 -MText::Wrap -ne 'BEGIN{$Text::Wrap::columns=40} print wrap("\t","",$_)'

Numbering items with pattern matches in vi

:! type foo.html | perl -pe"BEGIN{$i=1;} ++$i if s:<foo>:<bar$i>:;" > bar.html

Use filtering with a tool like perl to get variable interpolation into search patterns, unless you are lucky enough to have compiled-in support for perl or other tools that allow you to do this.

Global command

Another quite useful and powerful Vim command is the global command:

:[range]g[lobal][!]/pattern/cmd

It executes the Ex command cmd (default ":p") on each line within [range] where pattern matches. If the ! is present (:g!), cmd is executed only on the lines where a match does not occur.

The global command works by first scanning through the lines of the [range] and marking each line where a match occurs. In a second scan the [cmd] is executed for each marked line with its line number prepended. If a line is changed or deleted its mark disappears. The default for the [range] is the whole file.

Note: Non-Ex commands (normal mode commands) can also be executed from the command line using the :norm[al] {commands} mechanism.

Putting visual block boundaries into the command line for filtering

In the previous section, we saw that we could execute shell commands from within Vim, causing Vim to be moved to a background process while the command executed. There are more practical use cases for this feature, mainly, having the ability to manipulate our content via external filters.

Let's use the tr command as an example. This command translates one set of characters to another. For example, we might need to convert some characters that appear in text copied from the web into "normal" characters acceptable to a particular scripting language interpreter.

If we select a visual block and then type ":", we enter COMMAND-LINE mode with the boundaries of the block already placed on the command line. After that we can use the tr command:

:'<,'> ! tr '[:lower:]' '[:upper:]'

Using coroutines in shell: feeding pipe into a loop or sending output of the loop to a pipe

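In bash this capability is limited because, by default, bash runs every stage of a pipeline, including the last one, in a subshell rather than in the current process, so variables set inside a loop at the end of a pipe are lost when the loop ends. Ksh93 is better in this respect, and bash 4.2 introduced the lastpipe option (shopt -s lastpipe) to imitate the ksh93 behaviour.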
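The difference is easy to see with a counter that is incremented inside a loop fed by a pipe. A minimal sketch, assuming bash 4.2 or later is installed as bash; lastpipe takes effect only when job control is off, which is the case in scripts:

```shell
# Run the comparison in a non-interactive bash, where job control is off.
out=$(bash <<'EOF'
count=0
printf 'a\nb\nc\n' | while read -r line; do count=$((count + 1)); done
echo "default:  $count"     # the loop ran in a subshell, so count is still 0

shopt -s lastpipe
count=0
printf 'a\nb\nc\n' | while read -r line; do count=$((count + 1)); done
echo "lastpipe: $count"     # the loop ran in the current shell, so count is 3
EOF
)
printf '%s\n' "$out"
# default:  0
# lastpipe: 3
```

Where lastpipe is not available, feeding the loop with a redirection (done < file) instead of a pipe keeps the loop in the current shell.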

Let's assume that we need to find all files that contain the string "19%", which is typical of format strings such as "19%2d" used in printing commands.

cd /usr/bin
ls | while read file
do
    echo "$file"
    strings "$file" | grep '19%'
done

Here we use the ls command to generate the list of file names, and this list is piped into a loop. In the loop we echo the file name and then run strings piped to grep, looking for suspicious format strings.

Another example comes from O'Reilly's "Learning the Korn Shell" (first edition). Here we pipe awk output into the loop. This is a function that, given a pathname as argument, prints its equivalent in tilde notation if possible:

function tildize {
    if [[ $1 = $HOME* ]]; then
        print "\~/${1#$HOME}"
        return 0
    fi
    awk -F: '{print $1, $6}' /etc/passwd | 
        while read user homedir; do
            if [[ $homedir != / && $1 = ${homedir}?(/*) ]]; then
                print "\~$user/${1#$homedir}"
                return 0
            fi
        done
    print "$1"
    return 1
}

A loop can also serve as the source of input for a pipe. For example:

{ while read line'?adc> '; do
      print "$(alg2rpn $line)"
  done 
} | dc
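The same construction works in any Bourne-style shell: everything the loop body prints is collected into one stream for the command after the pipe. A minimal sketch with made-up numbers:

```shell
# The loop's combined output becomes the input of a single sort process.
sorted=$(for n in 30 4 100; do
             echo "$n"
         done | sort -n)
printf '%s\n' "$sorted"
# 4
# 30
# 100
```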

As an example, assume that you want to go through all files of a directory and, if they are readable to you, print their names converted to lowercase letters only.

There are two major ways to accomplish this:

  1. The first, more traditional, variant calls tr inside the for loop:
    #!/bin/ksh
    for x in * 
    do
      [ -r "$x" ] && echo "$x" | tr 'A-Z' 'a-z'
    done
    
  2. The second, more elegant variant uses a pipe to feed tr from the loop:
    #!/bin/ksh
    for x in * 
    do
      [ -r "$x" ] && echo "$x" 
    done | tr 'A-Z' 'a-z'

The same technique is used in submission scripts for SGE and other HPC schedulers. Here is one example, in which we generate the ./machines file for MPI using the SGE variable $PE_HOSTFILE:
    # get machine from $PE_HOSTFILE
    
    cat /dev/null > ./machines
    
    cat $PE_HOSTFILE | while read line; do
    host=`echo $line | cut -d" " -f1`
    cores=`echo $line | cut -d" " -f2`
    
    while (( $cores > 0 )) ; do
            echo $host >> machines
            let cores--
    done
    done
    
    ## done with $PE_HOSTFILE
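The hostfile loop above can also be written without cat and without the cut subprocesses by letting read split each line into fields. This is only a sketch: it assumes that each line of the host file starts with a host name followed by a slot count (as the cut calls above imply), and the sample file contents and host names are invented so that it can be run outside a scheduler:

```shell
PE_HOSTFILE=$(mktemp)     # stand-in for the file the scheduler would provide
printf 'node01 2 all.q@node01 UNDEFINED\nnode02 1 all.q@node02 UNDEFINED\n' > "$PE_HOSTFILE"
machines=$(mktemp)        # stand-in for ./machines

# read splits each line on whitespace: first field, second field, the rest.
while read -r host cores rest; do
    while [ "$cores" -gt 0 ]; do
        echo "$host"
        cores=$((cores - 1))
    done
done < "$PE_HOSTFILE" > "$machines"

cat "$machines"
# node01
# node01
# node02
```

Redirecting the whole loop once (done < input > output) replaces both the initial cat /dev/null and the repeated >> appends.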

Monitoring the progress of data  through a pipeline

There is also a useful terminal-based tool for monitoring the progress of data through a pipeline called pv (pipe viewer). It can be inserted into any normal pipeline between two processes to give a visual indication of how quickly data is passing through, how long it has taken, how near to completion it is, and an estimate of how long it will be until completion. It is available for all major Linux distributions, and a precompiled Solaris binary is also available.
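A typical use is to let pv read the input file, so that it knows the total size, and pass the data on to the next command. A minimal sketch, assuming pv may or may not be installed; the 1 MiB sample file is generated just for the demonstration:

```shell
src=$(mktemp)
head -c 1048576 /dev/zero > "$src"      # 1 MiB of sample data

if command -v pv >/dev/null 2>&1; then
    # pv reads the file, reports progress on STDERR, and passes the data through.
    pv "$src" | gzip -c > "$src.gz"
else
    # Without pv the pipeline is the same, just without the progress display.
    gzip -c < "$src" > "$src.gz"
fi
```

Because pv writes its progress display to STDERR, the data stream on STDOUT is not disturbed.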

 

Old News ;-)

[Nov 02, 2018] Working with data streams on the Linux command line by David Both The Linux Philosophy for SysAdmins And Everyone Who Wants To Be One by David Both. It is well worth $32.

Notable quotes:
"... This is the Unix philosophy: Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface." ..."
Oct 30, 2018 | opensource.com
Author's note: Much of the content in this article is excerpted, with some significant edits to fit the Opensource.com article format, from Chapter 3: Data Streams, of my new book, The Linux Philosophy for SysAdmins .

Everything in Linux revolves around streams of data -- particularly text streams. Data streams are the raw materials upon which the GNU Utilities , the Linux core utilities, and many other command-line tools perform their work.

As its name implies, a data stream is a stream of data -- especially text data -- being passed from one file, device, or program to another using STDIO. This chapter introduces the use of pipes to connect streams of data from one utility program to another using STDIO. You will learn that the function of these programs is to transform the data in some manner. You will also learn about the use of redirection to redirect the data to a file.

I use the term "transform" in conjunction with these programs because the primary task of each is to transform the incoming data from STDIO in a specific way as intended by the sysadmin and to send the transformed data to STDOUT for possible use by another transformer program or redirection to a file.

The standard term, "filters," implies something with which I don't agree. By definition, a filter is a device or a tool that removes something, such as an air filter removes airborne contaminants so that the internal combustion engine of your automobile does not grind itself to death on those particulates. In my high school and college chemistry classes, filter paper was used to remove particulates from a liquid. The air filter in my home HVAC system removes particulates that I don't want to breathe.

Although they do sometimes filter out unwanted data from a stream, I much prefer the term "transformers" because these utilities do so much more. They can add data to a stream, modify the data in some amazing ways, sort it, rearrange the data in each line, perform operations based on the contents of the data stream, and so much more. Feel free to use whichever term you prefer, but I prefer transformers. I expect that I am alone in this.

Data streams can be manipulated by inserting transformers into the stream using pipes. Each transformer program is used by the sysadmin to perform some operation on the data in the stream, thus changing its contents in some manner. Redirection can then be used at the end of the pipeline to direct the data stream to a file. As mentioned, that file could be an actual data file on the hard drive, or a device file such as a drive partition, a printer, a terminal, a pseudo-terminal, or any other device connected to a computer.

The ability to manipulate these data streams using these small yet powerful transformer programs is central to the power of the Linux command-line interface. Many of the core utilities are transformer programs and use STDIO.

In the Unix and Linux worlds, a stream is a flow of text data that originates at some source; the stream may flow to one or more programs that transform it in some way, and then it may be stored in a file or displayed in a terminal session. As a sysadmin, your job is intimately associated with manipulating the creation and flow of these data streams. In this post, we will explore data streams -- what they are, how to create them, and a little bit about how to use them.

Text streams -- a universal interface

The use of Standard Input/Output (STDIO) for program input and output is a key foundation of the Linux way of doing things. STDIO was first developed for Unix and has found its way into most other operating systems since then, including DOS, Windows, and Linux.

" This is the Unix philosophy: Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface."

-- Doug McIlroy, Basics of the Unix Philosophy

STDIO

STDIO was developed by Ken Thompson as a part of the infrastructure required to implement pipes on early versions of Unix. Programs that implement STDIO use standardized file handles for input and output rather than files that are stored on a disk or other recording media. STDIO is best described as a buffered data stream, and its primary function is to stream data from the output of one program, file, or device to the input of another program, file, or device.

There are three STDIO data streams, each of which is automatically opened as a file at the startup of a program -- well, those programs that use STDIO. Each STDIO data stream is associated with a file handle, which is just a set of metadata that describes the attributes of the file. File handles 0, 1, and 2 are explicitly defined by convention and long practice as STDIN, STDOUT, and STDERR, respectively.

STDIN, File handle 0 , is standard input which is usually input from the keyboard. STDIN can be redirected from any file, including device files, instead of the keyboard. It is not common to need to redirect STDIN, but it can be done.

STDOUT, File handle 1 , is standard output which sends the data stream to the display by default. It is common to redirect STDOUT to a file or to pipe it to another program for further processing.

STDERR, File handle 2 . The data stream for STDERR is also usually sent to the display.

If STDOUT is redirected to a file, STDERR continues to be displayed on the screen. This ensures that when the data stream itself is not displayed on the terminal, STDERR still is, so the user will see any errors resulting from execution of the program. STDERR can also be redirected to the same file or passed on to the next transformer program in a pipeline.
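The behaviour of file handles 1 and 2 can be observed with a command that writes to both. A minimal sketch; the messages and temporary files are made up for the demonstration:

```shell
out_file=$(mktemp)
err_file=$(mktemp)

# A command group that writes one line to each output stream.
{ echo "normal output"; echo "error message" >&2; } > "$out_file" 2> "$err_file"

cat "$out_file"     # normal output
cat "$err_file"     # error message

# 2>&1 sends STDERR to wherever STDOUT currently points, so both land in one file.
both_file=$(mktemp)
{ echo "normal output"; echo "error message" >&2; } > "$both_file" 2>&1
wc -l < "$both_file"
```

Note the order in the last command: the file redirection comes first and 2>&1 second, because 2>&1 copies whatever STDOUT refers to at that moment.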

STDIO is implemented as a C library, stdio.h , which can be included in the source code of programs so that it can be compiled into the resulting executable.

Simple streams

You can perform the following experiments safely in the /tmp directory of your Linux host. As the root user, make /tmp the PWD, create a test directory, and then make the new directory the PWD.

# cd /tmp ; mkdir test ; cd test

Enter and run the following command line program to create some files with content on the drive. We use the dmesg command simply to provide data for the files to contain. The contents don't matter as much as just the fact that each file has some content.

# for I in 0 1 2 3 4 5 6 7 8 9 ; do dmesg > file$I.txt ; done

Verify that there are now at least 10 files in /tmp/test/ with the names file0.txt through file9.txt .

# ll
total 1320
-rw-r--r-- 1 root root 131402 Oct 17 15:50 file0.txt
-rw-r--r-- 1 root root 131402 Oct 17 15:50 file1.txt
-rw-r--r-- 1 root root 131402 Oct 17 15:50 file2.txt
-rw-r--r-- 1 root root 131402 Oct 17 15:50 file3.txt
-rw-r--r-- 1 root root 131402 Oct 17 15:50 file4.txt
-rw-r--r-- 1 root root 131402 Oct 17 15:50 file5.txt
-rw-r--r-- 1 root root 131402 Oct 17 15:50 file6.txt
-rw-r--r-- 1 root root 131402 Oct 17 15:50 file7.txt
-rw-r--r-- 1 root root 131402 Oct 17 15:50 file8.txt
-rw-r--r-- 1 root root 131402 Oct 17 15:50 file9.txt

We have generated data streams using the dmesg command, which was redirected to a series of files. Most of the core utilities use STDIO as their output stream and those that generate data streams, rather than acting to transform the data stream in some way, can be used to create the data streams that we will use for our experiments. Data streams can be as short as one line or even a single character, and as long as needed.

Exploring the hard drive

It is now time to do a little exploring. In this experiment, we will look at some of the filesystem structures.

Let's start with something simple. You should be at least somewhat familiar with the dd command. Officially known as "disk dump," many sysadmins call it "disk destroyer" for good reason. Many of us have inadvertently destroyed the contents of an entire hard drive or partition using the dd command. That is why we will hang out in the /tmp/test directory to perform some of these experiments.

Despite its reputation, dd can be quite useful in exploring various types of storage media, hard drives, and partitions. We will also use it as a tool to explore other aspects of Linux.

Log into a terminal session as root if you are not already. We first need to determine the device special file for your hard drive using the lsblk command.

[root@studentvm1 test]# lsblk -i
NAME                                 MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                                    8:0    0   60G  0 disk
|-sda1                                 8:1    0    1G  0 part /boot
`-sda2                                 8:2    0   59G  0 part
  |-fedora_studentvm1-pool00_tmeta   253:0    0    4M  0 lvm
  | `-fedora_studentvm1-pool00-tpool 253:2    0    2G  0 lvm
  |   |-fedora_studentvm1-root       253:3    0    2G  0 lvm  /
  |   `-fedora_studentvm1-pool00     253:6    0    2G  0 lvm
  |-fedora_studentvm1-pool00_tdata   253:1    0    2G  0 lvm
  | `-fedora_studentvm1-pool00-tpool 253:2    0    2G  0 lvm
  |   |-fedora_studentvm1-root       253:3    0    2G  0 lvm  /
  |   `-fedora_studentvm1-pool00     253:6    0    2G  0 lvm
  |-fedora_studentvm1-swap           253:4    0   10G  0 lvm  [SWAP]
  |-fedora_studentvm1-usr            253:5    0   15G  0 lvm  /usr
  |-fedora_studentvm1-home           253:7    0    2G  0 lvm  /home
  |-fedora_studentvm1-var            253:8    0   10G  0 lvm  /var
  `-fedora_studentvm1-tmp            253:9    0    5G  0 lvm  /tmp
sr0                                   11:0    1 1024M  0 rom

We can see from this that there is only one hard drive on this host, that the device special file associated with it is /dev/sda , and that it has two partitions. The /dev/sda1 partition is the boot partition, and the /dev/sda2 partition contains a volume group on which the rest of the host's logical volumes have been created.

As root in the terminal session, use the dd command to view the boot record of the hard drive, assuming it is assigned to the /dev/sda device. The bs= argument is not what you might think; it simply specifies the block size, and the count= argument specifies the number of blocks to dump to STDIO. The if= argument specifies the source of the data stream, in this case, the /dev/sda device. Notice that we are not looking at the first block of the partition, we are looking at the very first block of the hard drive.

[root@studentvm1 test]# dd if=/dev/sda bs=512 count=1
�c�#�м���؎���|�#�#���!#��8#u
��#���u��#�#�#�|���t#�L#�#�|���#�����?t��pt#���y|1��؎м ��d|<�t#��R�|1��D#@�D��D#�##f�#\|f�f�#`|f�\
�D#p�B�#r�p�#�K`#�#��1��������#a`���#f��u#����f1�f�TCPAf�#f�#a�&Z|�#}�#�.}�4�3}�.�#��GRUB GeomHard DiskRead Error
�#��#�<u��ܻޮ�###��� ������ �_U�1+0 records in
1+0 records out
512 bytes copied, 4.3856e-05 s, 11.7 MB/s

This prints the text of the boot record, which is the first block on the disk -- any disk. In this case, there is information about the filesystem and, although it is unreadable because it is stored in binary format, the partition table. If this were a bootable device, stage 1 of GRUB or some other boot loader would be located in this sector. The last three lines contain data about the number of records and bytes processed.
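Raw binary is hard to read on a terminal, so it is common to pipe the dd output through a dump utility such as od. The sketch below uses a scratch file filled with zeros as a stand-in for the device, so it is safe to run as a non-root user; the file is an assumption made for the demonstration, not part of the original experiment:

```shell
img=$(mktemp)                                   # stand-in for a device such as /dev/sda
dd if=/dev/zero of="$img" bs=512 count=4 2>/dev/null

# Read only the first 512-byte block and render it as hex bytes with decimal offsets.
# 2>/dev/null hides the record statistics that dd prints on STDERR.
dump=$(dd if="$img" bs=512 count=1 2>/dev/null | od -A d -t x1)
printf '%s\n' "$dump"
```

od collapses repeated identical lines into a single *, and the final offset line confirms how many bytes were read. As root, replacing "$img" with the real device shows the actual boot record in the same format.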

Starting with the beginning of /dev/sda1 , let's look at a few blocks of data at a time to find what we want. The command is similar to the previous one, except that we have specified a few more blocks of data to view. You may have to specify fewer blocks if your terminal is not large enough to display all of the data at one time, or you can pipe the data through the less utility and use that to page through the data -- either way works. Remember, we are doing all of this as root user because non-root users do not have the required permissions.

Enter the same command as you did in the previous experiment, but increase the block count to be displayed to 100, as shown below, in order to show more data.

[root@studentvm1 test]# dd if=/dev/sda1 bs=512 count=100
##33��#:�##�� :o�[:o�[#��S�###�q[#
#<�#{5OZh�GJ͞#t�Ұ##boot/bootysimage/booC�dp��G'�*)�#A�##@
#�q[
�## ## ###�#���To=###<#8���#'#�###�#�����#�' �����#Xi �#��` qT���
<���
� r���� ]�#�#�##�##�##�#�##�##�##�#�##�##�#��#�#�##�#�##�##�#��#�#����# � �# �# �#

�#
�#
�#

�#
�#
�#

�#
�#
�#100+0 records in
100+0 records out
51200 bytes (51 kB, 50 KiB) copied, 0.00117615 s, 43.5 MB/s

Now try this command. I won't reproduce the entire data stream here because it would take up huge amounts of space. Use Ctrl-C to break out and stop the stream of data.

[root@studentvm1 test]# dd if=/dev/sda

This command produces a stream of data that is the complete content of the hard drive, /dev/sda , including the boot record, the partition table, and all of the partitions and their content. This data could be redirected to a file for use as a complete backup from which a bare metal recovery can be performed. It could also be sent directly to another hard drive to clone the first. But do not perform this particular experiment.

[root@studentvm1 test]# dd if=/dev/sda of=/dev/sdx

You can see that the dd command can be very useful for exploring the structures of various types of filesystems, locating data on a defective storage device, and much more. It also produces a stream of data on which we can use the transformer utilities in order to modify or view.

The real point here is that dd , like so many Linux commands, produces a stream of data as its output. That data stream can be searched and manipulated in many ways using other tools. It can even be used for ghost-like backups or disk duplication.

Randomness

It turns out that randomness is a desirable thing in computers -- who knew? There are a number of reasons that sysadmins might want to generate a stream of random data. A stream of random data is sometimes useful to overwrite the contents of a complete partition, such as /dev/sda1 , or even the entire hard drive, as in /dev/sda .

Perform this experiment as a non-root user. Enter this command to print an unending stream of random data to STDIO.

[student@studentvm1 ~]$ cat /dev/urandom

Use Ctrl-C to break out and stop the stream of data. You may need to use Ctrl-C multiple times.
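An endless stream becomes manageable as soon as another program in the pipeline decides when to stop reading. A minimal sketch; the byte counts and the character set are arbitrary choices:

```shell
# Take 16 bytes of random data and show them as hex instead of raw bytes.
head -c 16 /dev/urandom | od -A n -t x1

# Keep only letters and digits from the stream and stop after 20 characters.
token=$(LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom | head -c 20)
printf '%s\n' "$token"
```

When head exits after reading what it needs, the upstream command receives a broken-pipe signal on its next write and stops too, so no Ctrl-C is required.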

Random data is also used as the input seed to programs that generate random passwords and random data and numbers for use in scientific and statistical calculations. I will cover randomness and other interesting data sources in a bit more detail in Chapter 24: Everything is a file.

Pipe dreams

Pipes are critical to our ability to do the amazing things on the command line, so much so that I think it is important to recognize that they were invented by Douglas McIlroy during the early days of Unix (thanks, Doug!). The Princeton University website has a fragment of an interview with McIlroy in which he discusses the creation of the pipe and the beginnings of the Unix philosophy.

Notice the use of pipes in the simple command-line program shown next, which lists each logged-in user a single time, no matter how many logins they have active. Perform this experiment as the student user. Enter the command shown below:

[student@studentvm1 ~]$ w | tail -n +3 | awk '{print $1}' | sort | uniq
root
student
[student@studentvm1 ~]$

The results from this command produce two lines of data that show that the users root and student are both logged in. It does not show how many times each user is logged in. Your results will almost certainly differ from mine.

Pipes -- represented by the vertical bar ( | ) -- are the syntactical glue, the operator, that connects these command-line utilities together. Pipes allow the Standard Output from one command to be "piped," i.e., streamed from Standard Output of one command to the Standard Input of the next command.

The |& operator can be used to pipe the STDERR along with STDOUT to STDIN of the next command. This is not always desirable, but it does offer flexibility in the ability to record the STDERR data stream for the purposes of problem determination.
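In bash, |& is shorthand for 2>&1 | , and the longer form works in any POSIX shell. A minimal sketch that counts how many lines reach the next command in each case; the two messages are invented:

```shell
# Plain pipe: only STDOUT travels through it (STDERR is discarded here to keep the screen clean).
stdout_only=$( { echo "data"; echo "warning" >&2; } 2>/dev/null | wc -l )

# 2>&1 | : STDERR is merged into STDOUT before the pipe, so both lines arrive.
with_stderr=$( { echo "data"; echo "warning" >&2; } 2>&1 | wc -l )

echo "plain pipe: $stdout_only line(s), with STDERR: $with_stderr line(s)"
```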

A string of programs connected with pipes is called a pipeline, and the programs that use STDIO are referred to officially as filters, but I prefer the term "transformers."

Think about how this program would have to work if we could not pipe the data stream from one command to the next. The first command would perform its task on the data and then the output from that command would need to be saved in a file. The next command would have to read the stream of data from the intermediate file and perform its modification of the data stream, sending its own output to a new, temporary data file. The third command would have to take its data from the second temporary data file and perform its own manipulation of the data stream and then store the resulting data stream in yet another temporary file. At each step, the data file names would have to be transferred from one command to the next in some way.
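The contrast can be made concrete. The sketch below performs the last two steps of the pipeline both ways; the list of login names is sample data standing in for the first column of the w output:

```shell
# Sample login names, standing in for the first column of the w output.
logins='student
root
student
root
root'

# With a pipeline: no intermediate files.
piped=$(printf '%s\n' "$logins" | sort | uniq)

# Without pipes: each step writes a temporary file that the next step reads.
tmp1=$(mktemp)
tmp2=$(mktemp)
printf '%s\n' "$logins" > "$tmp1"
sort "$tmp1" > "$tmp2"
unpiped=$(uniq "$tmp2")
rm -f "$tmp1" "$tmp2"

printf '%s\n' "$piped"
# root
# student
```

Both variants produce the same result, but the second has to create, name, and clean up a file for every stage.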

I cannot even stand to think about that because it is so complex. Remember: Simplicity rocks!

Building pipelines

When I am doing something new, solving a new problem, I usually do not just type in a complete Bash command pipeline from scratch off the top of my head. I usually start with just one or two commands in the pipeline and build from there by adding more commands to further process the data stream. This allows me to view the state of the data stream after each of the commands in the pipeline and make corrections as they are needed.

It is possible to build up very complex pipelines that can transform the data stream using many different utilities that work with STDIO.

Redirection

Redirection is the capability to redirect the STDOUT data stream of a program to a file instead of to the default target of the display. The "greater than" ( > ) character, aka "gt", is the syntactical symbol for redirection of STDOUT.

Redirecting the STDOUT of a command can be used to create a file containing the results from that command.

[student@studentvm1 ~]$ df -h > diskusage.txt

There is no output to the terminal from this command unless there is an error. This is because the STDOUT data stream is redirected to the file and STDERR is still directed to the STDOUT device, which is the display. You can view the contents of the file you just created using this next command:

[student@studentvm1 test]# cat diskusage.txt
Filesystem                          Size  Used Avail Use% Mounted on
devtmpfs                            2.0G     0  2.0G   0% /dev
tmpfs                               2.0G     0  2.0G   0% /dev/shm
tmpfs                               2.0G  1.2M  2.0G   1% /run
tmpfs                               2.0G     0  2.0G   0% /sys/fs/cgroup
/dev/mapper/fedora_studentvm1-root  2.0G   50M  1.8G   3% /
/dev/mapper/fedora_studentvm1-usr    15G  4.5G  9.5G  33% /usr
/dev/mapper/fedora_studentvm1-var   9.8G  1.1G  8.2G  12% /var
/dev/mapper/fedora_studentvm1-tmp   4.9G   21M  4.6G   1% /tmp
/dev/mapper/fedora_studentvm1-home  2.0G  7.2M  1.8G   1% /home
/dev/sda1                           976M  221M  689M  25% /boot
tmpfs                               395M     0  395M   0% /run/user/0
tmpfs                               395M   12K  395M   1% /run/user/1000

When using the > symbol to redirect the data stream, the specified file is created if it does not already exist. If it does exist, the contents are overwritten by the data stream from the command. You can use double greater-than symbols, >>, to append the new data stream to any existing content in the file.

[student@studentvm1 ~]$ df -h >> diskusage.txt

You can use cat and/or less to view the diskusage.txt file in order to verify that the new data was appended to the end of the file.
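The difference between > and >> is easiest to see on a small scratch file. A minimal sketch; the file and its contents are made up:

```shell
log=$(mktemp)
echo "first"  >  "$log"    # > creates the file or overwrites its contents
echo "second" >> "$log"    # >> appends to whatever is already there
cat "$log"
# first
# second

echo "third"  >  "$log"    # overwriting again leaves only the new line
cat "$log"
# third
```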

The < (less than) symbol redirects data to the STDIN of the program. You might want to use this method to input data from a file to STDIN of a command that does not take a filename as an argument but that does use STDIN. Although input sources can be redirected to STDIN, such as a file that is used as input to grep, it is generally not necessary as grep also takes a filename as an argument to specify the input source. Most other commands also take a filename as an argument for their input source.
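tr is one of the commands that reads only STDIN and accepts no file name, so it makes a convenient illustration. A minimal sketch with an invented input file:

```shell
input=$(mktemp)
printf 'mixed Case text\n' > "$input"

# tr accepts no file name argument, so the file is supplied on STDIN with <.
shouted=$(tr 'a-z' 'A-Z' < "$input")
printf '%s\n' "$shouted"    # MIXED CASE TEXT
```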

Just grep'ing around

The grep command is used to select lines that match a specified pattern from a stream of data. grep is one of the most commonly used transformer utilities and can be used in some very creative and interesting ways. The grep command is one of the few that can correctly be called a filter because it does filter out all the lines of the data stream that you do not want; it leaves only the lines that you do want in the remaining data stream.
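The filtering behaviour can be tried on a tiny data set before moving to a large one. A minimal sketch; the four sample lines are invented:

```shell
sample=$(mktemp)
printf 'x7see;1\nabcdef\nq-see)>\nzzz\n' > "$sample"

grep see "$sample"                       # prints the two lines containing "see"
matches=$(grep -c see "$sample")         # -c counts matching lines instead of printing them
others=$(grep -vc see "$sample")         # -v inverts the match
echo "$matches matching, $others not matching"
# 2 matching, 2 not matching
```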

If the PWD is not the /tmp/test directory, make it so. Let's first create a stream of random data to store in a file. In this case, we want somewhat less random data that would be limited to printable characters. A good password generator program can do this. The following program (you may have to install pwgen if it is not already) creates a file that contains 50,000 passwords that are 80 characters long using every printable character. Try it without redirecting to the random.txt file first to see what that looks like, and then do it once redirecting the output data stream to the file.

$ pwgen -sy 80 50000 > random.txt

Considering that there are so many passwords, it is very likely that some character strings in them are the same. First, cat the random.txt file, then use the grep command to locate some short, randomly selected strings from the last ten passwords on the screen. I saw the word "see" in one of those ten passwords, so my command looked like this: grep see random.txt , and you can try that, but you should also pick some strings of your own to check. Short strings of two to four characters work best.

$ grep see random.txt
R=p)'s/~0}wr~2(OqaL.S7DNyxlmO69`"12u]h@rp[D2%3}1b87+>Vk,;4a0hX]d7see;1%9|wMp6Yl.
bSM_mt_hPy|YZ1<TY/Hu5{g#mQ<u_(@8B5Vt?w%i-&C>NU@[;zV2-see)>(BSK~n5mmb9~h)yx{a&$_e
cjR1QWZwEgl48[3i-(^x9D=v)seeYT2R#M:>wDh?Tn$]HZU7}j!7bIiIr^cI.DI)W0D"'vZU@.Kxd1E1
z=tXcjVv^G\nW`,y=bED]d|7%s6iYT^a^Bvsee:v\UmWT02|P|nq%A*;+Ng[$S%*s)-ls"dUfo|0P5+n

Summary

It is the use of pipes and redirection that allows many of the amazing and powerful tasks that can be performed with data streams on the Linux command line. It is pipes that transport STDIO data streams from one program or file to another. The ability to pipe streams of data through one or more transformer programs supports powerful and flexible manipulation of data in those streams.

Each of the programs in the pipelines demonstrated in the experiments is small, and each does one thing well. They are also transformers; that is, they take Standard Input, process it in some way, and then send the result to Standard Output. Implementation of these programs as transformers to send processed data streams from their own Standard Output to the Standard Input of the other programs is complementary to, and necessary for, the implementation of pipes as a Linux tool.

STDIO is nothing more than streams of data. This data can be almost anything from the output of a command to list the files in a directory, or an unending stream of data from a special device like /dev/urandom , or even a stream that contains all of the raw data from a hard drive or a partition.

Any device on a Linux computer can be treated like a data stream. You can use ordinary tools like dd and cat to dump data from a device into a STDIO data stream that can be processed using other ordinary Linux tools.

David Both is a Linux and Open Source advocate who resides in Raleigh, North Carolina. He has been in the IT industry for over forty years and taught OS/2 for IBM where he worked for over 20 years. While at IBM, he wrote the first training course for the original IBM PC in 1981. He has taught RHCE classes for Red Hat and has worked at MCI Worldcom, Cisco, and the State of North Carolina. He has been working with Linux and Open Source Software for almost 20 years. David has written articles for...

Copyright © 1996-2018 by Dr. Nikolai Bezroukov. www.softpanorama.org was initially created as a service to the (now defunct) UN Sustainable Development Networking Programme (SDNP) in the author free time and without any remuneration. This document is an industrial compilation designed and created exclusively for educational use and is distributed under the Softpanorama Content License. Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.

Last modified: December 07, 2018