Softpanorama
(slightly skeptical) Open Source Software Educational Society

May the source be with you, but remember the KISS principle ;-)

IFS variable and Field splitting in the Korn shell and Bash

IFS specifies the internal field separators (normally space, tab, and newline) used to separate the words that result from command or parameter substitution, and to separate the words read by the built-in command read. The first character of the IFS parameter is used to separate arguments for the "$*" substitution.

After performing command substitution, the Korn shell scans the results of substitutions for those field separator characters found in the IFS (Internal Field Separator) variable. Where such characters are found, the shell splits the substitutions into distinct arguments.

The shell retains explicit null arguments ("" or '') and removes implicit null arguments (those resulting from parameters that have no values).
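
As a minimal sketch of the splitting and null-argument rules above, assuming the default IFS and any POSIX-style shell (ksh, bash), the helper below simply reports how many arguments it was given. The name count_args and the sample strings are made up for illustration.

```shell
# count_args prints how many arguments it received (hypothetical helper).
count_args() {
    printf '%s\n' "$#"
}

words=$(printf 'one two\tthree\n')   # result of a command substitution
empty=                               # parameter with no value

count_args $words        # unquoted: split on space and tab, 3 arguments
count_args "$words"      # quoted: no splitting, 1 argument
count_args "" ''         # explicit null arguments are kept, 2 arguments
count_args $empty        # implicit null argument is removed, 0 arguments
```
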
  • Why are the elements of "$*" separated by the first character of IFS instead of just spaces? To give you output flexibility. As a simple example, let's say you want to print a list of positional parameters separated by commas. This script would do it:

    IFS=,
    print "$*"

    Changing IFS in a script is fairly risky, but it's probably OK as long as nothing else in the script depends on it. If this script were called arglist, then the command arglist bob dave ed would produce the output bob,dave,ed.
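
One way to get the same output without changing IFS for the rest of the script is to confine the change to a subshell. The sketch below assumes a POSIX-style shell; join_args is a hypothetical helper name, and printf is used in place of the ksh-only print so that the example also runs in bash.

```shell
# join_args SEP ARG...  (hypothetical helper)
# Joins the remaining arguments with the first character of SEP.
# The function body is a subshell, so the caller's IFS is never touched.
join_args() (
    IFS=$1
    shift
    printf '%s\n' "$*"      # "$*" must be quoted for the join to happen
)

join_args , bob dave ed     # prints bob,dave,ed
```
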

    The Answer Guy 54 Embedding Newlines in Shell and Environment Values

    Extra tidbit:
    I recently found a quirky difference between Korn shell ('93) and bash. Consider the following:
    echo foo | read bar; echo $bar
    ... whenever you see a "|" operator in a shell command sequence you should understand that there is implicitly a subshell (new process) that is created (forked) on one side of it or the other.
    Of course other processes (including subshells) cannot affect the values of your shell variables. So the sequence above consists of three commands: echo the string "foo", read something and assign it to a shell variable named "bar", and echo the value of the shell variable named "bar" (read the $ dereferencing operator as "the value of"). It consists of two processes, one on each side of the pipe. At the semicolon the shell waits for the completion of any programs and commands that precede it, and then continues with a new command sequence in the current shell.
    The question becomes whether the subshell was created on the left or the right of the | in this command. In bash it is clearly created on the right. The 'read' command executes in a subshell. That then exits (thus "forgetting" its variable and environment heaps). Thus $bar is unaffected after the semicolon.
    In ksh '93 and in zsh the subshell seems to be created to the left of the pipe. The 'read' command is executed in the current shell and thus the local value of "bar" is affected. Then the subsequent access to that shell variable does reflect the new value.
    As far as I know the POSIX spec is silent on this point. It may even be that ksh '93 and zsh are in violation of the spec. If so, the spec is wrong!
    It is very useful to be able to parse a set of command outputs into a local list of shell variables. Note that for a single variable this is easy:
    bar=$(echo foo)
    or:
    bar=`echo foo`
    ... are equivalent expressions and they work just fine.
    However, when we want to read the outputs into several values, and especially when we want to use the IFS environment value to parse these values, we have to resort to inordinate amounts of fussing in bash, while ksh '93 and newer versions of zsh allow us to do something like:
    grep ^joe /etc/passwd | IFS=":" read login pw uid gid gecos home sh
    (Note the form: 'VAR=val cmd' as shown here is also a bit obscure but handy. The value of VAR is only affected for the duration of the following command --- thus saving us the trouble of saving the old IFS value, executing our 'read' command and restoring the IFS).
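
A sketch of one workaround that behaves the same in bash and ksh: feed read from a here-document instead of a pipe, so that no subshell is involved on either side. The record below is invented sample data; on a real system it would come from something like $(grep ^joe /etc/passwd).

```shell
# A sample passwd-style record; the account "joe" is made up for illustration.
record='joe:x:1001:1001:Joe User:/home/joe:/bin/ksh'

# No pipe here, so read runs in the current shell and the variables survive.
# IFS=: applies only for the duration of this one read command.
IFS=: read -r login pw uid gid gecos home sh <<EOF
$record
EOF

printf 'login=%s uid=%s home=%s shell=%s\n' "$login" "$uid" "$home" "$sh"
```
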
    BTW: If you do need to save/restore something like IFS you must use proper quoting. For example:
    OLDIFS="$IFS"
    # MUST have double/soft quotes here!
    IFS=:,
    # do stuff parsing words on colons and commas
    IFS="$OLDIFS"
    # MUST also have double/soft quotes here!
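
The save/restore idiom above cannot tell an unset IFS from an empty one, and the two behave differently. The following is a rough sketch of a variant that records which case applies; split_on, old_ifs and had_ifs are hypothetical names, and the function assumes filename expansion was enabled when it was called.

```shell
# split_on SEPS STRING  (hypothetical helper)
# Prints each field of STRING on its own line, splitting on the characters in SEPS.
split_on() {
    # Remember whether IFS was set at all; unset and empty are not the same.
    if [ "${IFS+set}" = set ]; then
        old_ifs=$IFS
        had_ifs=1
    else
        had_ifs=0
    fi

    IFS=$1
    set -f              # no filename expansion on the split words
    set -- $2           # field splitting happens here
    set +f

    # Put IFS back exactly as it was found.
    if [ "$had_ifs" -eq 1 ]; then IFS=$old_ifs; else unset IFS; fi

    printf '%s\n' "$@"
}

split_on :, 'a:b,c'     # prints a, b and c on separate lines
```
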
    Anyway, I would like to do some more teaching in the field of shell scripting. I also plan to get as good with C and Python as I currently am with 'sh'. That'll take at least another year or so, and a lot more practice!

     

     

    sh vs. ksh

    Others have shown you:

      IFS='
    '   # actual newline in quotes

    My usual idiom for what you're doing is:

      ps | while read nxt; do
        ...
      done

    But there's a gotcha here.  The Bourne shell forks a separate shell for
    a set of shell commands that are being piped to.  It does a decent job
    of hiding this fact, but your:

      eval jbid_$dex=$e
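
The gotcha mentioned above can be sketched as follows. The sample file stands in for the output of a command such as ps, and the variable names are invented; mktemp is assumed to be available. In shells that run a piped loop in a subshell (bash, for one), the first counter is lost, while ksh '93 keeps it. Redirecting the loop's input instead of piping into it avoids the difference in POSIX shells.

```shell
# Sample input standing in for the output of a command such as ps.
tmpfile=$(mktemp) || exit 1
printf '%s\n' alpha beta gamma > "$tmpfile"

# Piped loop: where the loop runs in a subshell, the count is updated in a
# child process and is lost when the loop ends.
piped=0
cat "$tmpfile" | while read -r nxt; do
    piped=$((piped + 1))
done

# Redirected loop: no pipe, so the loop runs in the current shell.
redirected=0
while read -r nxt; do
    redirected=$((redirected + 1))
done < "$tmpfile"

rm -f "$tmpfile"
printf 'piped=%s redirected=%s\n' "$piped" "$redirected"
```
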

     

    Korn shell exec, read and miscellaneous commands

    The prompt can be specified in the read statement:

    $ read var?prompt
    
    If no variable name is given, the input is assigned to the variable REPLY. The field separator can be set with the IFS (Internal Field Separator) variable.
    Example:

    > cat kpwd
    #!/bin/ksh
    #-----------kpwd: read example in Korn shell
    echo Proc $0: type pwd info with Korn shell
    echo
    read ok?"Type pwd info? (y/n)"                  #read with prompt
    [[ $ok = @([Nn])* ]] && exit 1                  #test read variable
    echo pwd data are:
    echo ""
    IFS=:                                           #set IFS to :
    exec 0</etc/passwd                              #redirect stdin to /etc/passwd
    # list users
    #
    while read -r NAME PAS UID GID COM HOME SHELL
    do
       print "acct= $NAME - home= $HOME - shell= $SHELL:"
    done
    #----------end script------------------
    > kpwd
    
    Type pwd info? (y/n)y
    pwd data are:
    
    acct= john - home= /home/john - shell= /bin/tcsh:
    acct= mary - home= /home/mary - shell= /bin/tcsh:
    acct= tester - home= /d4/check - shell= /bin/sh:
    >
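
The script above sets IFS for the rest of the run and overwrites the shell's own HOME and SHELL variables. A possible variation, sketched here under the assumption of a POSIX-style shell, scopes IFS to the read command and uses lowercase names. list_accounts is a hypothetical helper, and the numeric ids and comment fields in the sample records are invented.

```shell
# list_accounts reads passwd-style records from standard input (hypothetical helper).
list_accounts() {
    # IFS=: applies to this read only; lowercase names leave HOME and SHELL alone.
    while IFS=: read -r name pass uid gid comment home shell; do
        printf 'acct= %s - home= %s - shell= %s\n' "$name" "$home" "$shell"
    done
}

# Sample records modelled on the output shown above.
list_accounts <<'EOF'
john:x:101:100:John:/home/john:/bin/tcsh
tester:x:103:100:Tester:/d4/check:/bin/sh
EOF
```

On a real system the same function could be run as list_accounts < /etc/passwd.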
    
    If you need to run a script in the current shell either use the dot [.] or the source command:

    $ . .profile
    $ source .profile
    $ rehash
    
    In this way any variable, alias, or function settings stay in effect. Note the rehash command, which rebuilds the shell's in-memory command table so that the new .profile definitions are picked up. (rehash is a csh/tcsh and zsh builtin; in ksh and bash the closest equivalent is hash -r.)

     

    [Jun 25, 2007] Useful Shell Scripting Variables - Part III - IFS (Internal Field Separator)

    October 13, 2003

    ... The shell uses the value stored in IFS, which is the space, tab, and newline characters by default, to delimit words for the read and set commands, when parsing output from command substitution, and when performing variable substitution.

    IFS can be redefined to parse one or more lines of data whose fields are not delimited by the default white-space characters.  Consider this sequence of variable assignments and for loops:

    $ line=learn:unix:at:livefire:labs
    ... ... ...
    $ OIFS=$IFS
    $ IFS=':'
    $ for i in $line; do echo $i; done
    learn
    unix
    at
    livefire
    labs
    $

    The first command assigns the string “learn:unix:at:livefire:labs” to the variable named line.  You can see from the first for loop that the shell treats the entire string as a single field.  This is because the string does not contain a space, tab, or newline character.

    After redefining IFS, the second for loop treats the string as five separate fields, each delimited by a colon.  Using a colon for IFS would be appropriate when parsing the fields in a record from /etc/passwd, the user account information file:

    livefire:x:100:1::/export/home/livefire:/bin/ksh

    Notice that the original value of IFS was stored in OIFS (“O” for original) prior to changing its value.  After you are finished using the new definition, it would be wise to return it to its original value to avoid unexpected side effects that may surface later on in your script.

    TIP – The current value of IFS may be viewed using the following pipeline:
    $ echo "$IFS" | od -b
    0000000 040 011 012 012
    0000004
    $

    The output of the echo command is piped into the octal dump command, giving you its octal equivalent.  You can then use an ASCII table to determine what characters are stored in the variable.  Hint: Ignore the first set of zeros and the second newline character (012), which was generated by echo.
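
A small variation on the tip above, offered as a sketch: printf '%s' adds no trailing newline, so there is nothing to ignore in the dump, and od -c shows the characters by name rather than as octal codes. show_ifs is a hypothetical helper name.

```shell
# show_ifs dumps the current IFS byte by byte (hypothetical helper).
# printf '%s' adds no trailing newline, so every byte shown belongs to IFS.
# -An suppresses the offset column; -c prints characters such as \t and \n.
show_ifs() {
    printf '%s' "$IFS" | od -An -c
}

show_ifs
```
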

    O'Reilly - Safari Books Online - 0201675234 - Korn Shell Unix and Linux Programming Manual, Third Edition, The

    Substitution

    The first step the shell takes in executing a simple-command is to perform substitutions on the words of the command. There are three kinds of substitution: parameter, command and arithmetic. Parameter substitutions, which are described in detail in the next section, take the form $name or ${...}; command substitutions take the form $(command) or `command`; and arithmetic substitutions take the form $((expression)). If a substitution appears outside of double quotes, the results of the substitution are generally subject to word or field splitting according to the current value of the IFS parameter. The IFS parameter specifies a list of characters which are used to break a string up into several words; any characters from the set space, tab and newline that appear in the IFS characters are called IFS white space. Sequences of one or more IFS white space characters, in combination with zero or one non-IFS white space characters, delimit a field. As a special case, leading and trailing IFS white space is stripped (i.e., no leading or trailing empty field is created by it); leading or trailing non-IFS white space does create an empty field. Example: if IFS is set to ':', the sequence of characters 'A:B::D' contains four fields: 'A', 'B', '' and 'D'. Note that if the IFS parameter is set to the null string, no field splitting is done; if the parameter is unset, the default value of space, tab and newline is used.
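
Some of these rules can be checked with a short sketch. count_fields is a hypothetical helper that runs in a subshell, so neither the IFS change nor set -f leaks out. The trailing-separator case is deliberately left out, since shells are not guaranteed to agree with the description above on it.

```shell
# count_fields IFS_VALUE STRING  (hypothetical helper)
# Prints the number of fields STRING splits into under the given IFS value.
count_fields() (
    IFS=$1
    set -f          # keep any wildcard characters in STRING literal
    set -- $2       # unquoted on purpose: this is where splitting happens
    printf '%s\n' "$#"
)

count_fields ':'  'A:B::D'      # 4 fields: A, B, an empty field, D
count_fields ' '  '  A  B  '    # 2: leading and trailing IFS white space is stripped
count_fields ':'  ':A'          # 2: a leading non-white-space separator makes an empty field
count_fields ''   'A:B C'       # 1: a null IFS disables splitting
```
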

    [Chapter 10] Korn Shell Administration

    Here is a script that looks for ^ in shell scripts in every directory in your PATH:

    [2] The exact message varies from system to system; make sure that yours prints this message when given the name of a shell script. If not, just substitute the message your file command prints for "shell script" in the code below.

     

    IFS=:
    for d in $PATH; do
        print checking $d:
        cd $d
        scripts=$(file * | grep 'shell script' | cut -d: -f1)
        for f in $scripts; do
            grep '\^' $f /dev/null
        done
    done

     

    The first line of this script makes it possible to use $PATH as an item list in the for loop. For each directory, it cds there and finds all shell scripts by piping the file command into grep and then, to extract the filename only, into cut. Then for each shell script, it searches for the ^ character. [3]

    [3] The inclusion of /dev/null in the grep command is a kludge that forces grep to print the names of files that contain a match, even if there is only one such file in a given directory.
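
The script above changes IFS and the working directory for whatever runs after it. As a hedged sketch of how the PATH-splitting step alone could be isolated, the hypothetical helper path_dirs below does the split in a subshell and prints one directory per line; a caller could then loop over that output.

```shell
# path_dirs PATHLIST  (hypothetical helper)
# Prints each directory of a colon-separated list on its own line.
path_dirs() (
    IFS=:
    set -f                          # directory names are not glob patterns
    for d in $1; do
        printf '%s\n' "${d:-.}"     # an empty entry means the current directory
    done
)

path_dirs "$PATH"
```
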



    Copyright © 1996-2009 by Dr. Nikolai Bezroukov. www.softpanorama.org was created as a service to the UN Sustainable Development Networking Programme (SDNP) in the author's free time. This document is an industrial compilation designed and created exclusively for educational use and is placed under the copyright of the Open Content License (OPL). Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.

    Last modified: August 15, 2009