May the source be with you, but remember the KISS principle ;-)

Contents Bulletin Scripting in shell and Perl Network troubleshooting History Humor

String Operations in Shell



Recommended Links

Double square bracket conditionals   Pattern Matching

Variable Substitution

(set if not defined)
# ## % %% KSH Substitutions
Length Index Substr Search and Replace Concatenation Trimming from left and right
BASH Debugging Bash Built-in Variables Annotated List of Bash Enhancements bash Tips and Tricks Tips and Tricks Humor Etc

Those notes are partially based on lecture notes by Professor Nikolai Bezroukov at FDU.

String operators allow you to manipulate the contents of a variable without resorting to AWK or Perl. Modern shells such as bash 3.x or ksh93 supports most of the standard string manipulation functions, but in a very pervert, idiosyncratic way. Anyway, standard functions like length, index, substr are available. Strings can be concatenated by juxtaposition and using double quoted strings. You can ensure that variables exist (i.e., are defined and have non-null values) and set default values for variables and catch errors that result from variables not being set. You can also perform basic pattern matching. There are several basic string operations available in bash, ksh93 and similar shells:


String operators in shell use unique among programming languages curly-bracket syntax. In shell any variable can be displayed as ${name_of_the_variable} instead of $name_of_the_variable. This notation was introduced to protect a variable name from merging with string that comes after it, but was extended to allow string operations on variables.  Here is example in which it is used for separation of a variable $var and a string "_string" using curly brackets:

$ export var='test' 
$ echo ${var}_string # var is a variable that uses syntax ${var} with the value test
$ echo $var_string # var_string is a variable that doesn't exist, echo doesn't print anything

In Korn 88 shell this notation was extended to allow expressions inside curvy brackets. For example


Each operation is encoded using special symbol or two symbols ( "digram", for example :- , := , etc). An argument that the operator may need is positioned after the symbol of the operation. Later this notation extended ksh93 and adopted by bash and other shells.

This "ksh-originated" group of operators is the most popular and probably the most widely used group of string-handling operators so it makes sense to learn them, if only in order to be able to modify old scripts.

Recent developments

Bash 3.2 introduced =~ operator with "normal" Perl-style regular expressions that can be used instead in many cases and they are definitely preferable in new scripts that you might write. Let's say we need to establish whether variable $x appears to be a social security number:

if [[ $x =~ [0-9]{3}-[0-9]{2}-[0-9]{4} ]]
	# process SSN
	# print error message

In bash-3.1, a string append operator (+=) was added:

echo "$PATH"

In bash 4.1 negative length specifications in the ${var:offset:length} expansion,   previously errors, are now treated as offsets from the end of the variable.

It also extended printf builtin which now has a new %(fmt)T specifier, which allows time values to use strftime-like formatting.

printf -v can now assign values to array indices.

The read builtin now has a new `-N nchars' option, which reads exactly NCHARS characters, ignoring delimiters like newline.

The {x} operators to the [[ conditional command now do string  comparison according to the current locale if the compatibility level is greater than 40.

Variable substitution

Introduced in ksh88 notation was and still it really very idiosyncratic. In examples below we will assume that the variable var has value "this is a test" (as produced by execution of statement export var="this is a test")

Although the # and % operators mnemonics looks arbitrary, they have a convenient mnemonic if you use the US keyboard. The # key is on the left side of the $ key and operates from the left, while % is to right and usually is used to the right of the string like is 100%. Also C preprocessor uses "#"; as a prefix to identify preprocessor statements (#define, #include).

Implementation of classic string operations in shell

Despite shell deficiencies in this area and idiosyncrasies preserved from 1970th most classic string operations can be implemented in shell. You can define functions that behave almost exactly like in Perl or other "more normal" language. In case shell facilities are not enough you can use AWK or Perl. It's actually sad that AWK was not integrated into shell.

Length Operator

There are several ways to get length of the string.

More complex example. Here's the function for validating that that string is within a given max length. It requires two parameters, the actual string and the maximum length the string should be.

# check_length 
# to call: check_length string max_length_of_string 
	# check we have the right params 
	if (( $# != 2 )) ; then 
	   echo "check_length need two parameters: a string and max_length" 
	   return 1 
	if (( ${#1} > $2 )) ; then 
	   return 1 
	return 0 

You could call the function check_length like this:

# test_name 
while : 
  echo -n "Enter customer name :" 
  read NAME 
  [ check_length $NAME 10 ] && break
  echo "The string $NAME is longer then 10 characters"    
echo $NAME

Determining the Length of Matching Substring at Beginning of String

This is pretty rarely used capability of expr built-in but still sometimes it can be useful:
expr match "$string" '$substring' 


#       |------|

echo `expr match "$my_regex" 'abc[A-Z]*.2'`   # 8
echo `expr "$my_regex" : 'abc[A-Z]*.2'`       # 8


Function index return the position of substring in string counting from one and 0 if substring is not found.

expr index $string $substring
Numerical position in $string of first character in $substring that matches.
echo `expr index "$stringZ" C12`             # 6
                                             # C position.

echo `expr index "$stringZ" c`              # 3
# 'c' (in #3 position) 

This is the close equivalent of strchr() in C.


Bash proves two implementation of substr fundtion which are not identical:

  1. Via expr function
  2. a part of pattern matching operators in the form ${param:offset[:length}.

Implementation via expr function

Classic subsr function is available  via expr function:

expr substr $string $position $length
Extracts $length characters from $string  starting at $position.. The first character has index one.
#       123456789......
#       1-based indexing.

echo `expr substr $stringZ 1 2`              # ab
echo `expr substr $stringZ 4 3`              # ABC


Implementation within pattern matching operators

Idiosyncratic implementation of substring function is also available as a part of pattern matching operators in the form

Extracts substring with $length characters from $string   starting at $position..

If the $string parameter is "*" or "@", then this extracts the positional parameters, starting at $position.

Extracts $length  characters of substring from $string  at $position.
#       0123456789.....
#       0-based indexing.

echo ${stringZ:0}                            # abcABC123ABCabc
echo ${stringZ:1}                            # bcABC123ABCabc
echo ${stringZ:7}                            # 23ABCabc

echo ${stringZ:7:3}                          # 23A
                                             # Three characters of substring.


You can also emulate it (substr):
# substr -- a function to emulate the ancient ksh builtin

# -l == shortest from left
# -L == longest from left
# -r == shortest from right (the default)
# -R == longest from right

	local flag pat str
	local usage="usage: substr -lLrR pat string or substr string pat"

	case "$1" in
	-l | -L | -r | -R)
		shift 2
		echo "substr: unknown option: $1"
		echo "$usage"
		return 1

	if [ "$#" -eq 0 ] || [ "$#" -gt 2 ] ; then
		echo "substr: bad argument count"
		return 2


	# We don't want -f, but we don't want to turn it back on if
	# we didn't have it already
	case "$-" in
		set -f

	case "$flag" in
		str="${str#$pat}"		# substr -l pat string
		str="${str##$pat}"		# substr -L pat string
		str="${str%$pat}"		# substr -r pat string
		str="${str%%$pat}"		# substr -R pat string
		str="${str%$2}"			# substr string pat

	echo "$str"

	# If we had file name generation when we started, re-enable it
	if [ "$fng" = "1" ] ; then
		set +f

Something really strange: working with the array of positional parameters

If the $string parameter is "*" or "@", then this extracts a maximum of $length positional parameters, starting at $position.

echo ${*:2}          # Echoes second and following positional parameters.
echo ${@:2}          # Same as above.

echo ${*:2:3}        # Echoes three positional parameters, starting at second.
expr substr $string $position $length


Search and Replace

You can search and replace substring in a variable using ksh syntax:

alpha='This is a test string in which the word "test" is replaced.' 

The string "beta" now contains an edited version of the original string in which the first case of the word "test" has been replaced by "replace". To replace all cases, not just the first, use this syntax:


Note the double "//" symbol.

Here is an example in which we replace one string with another in a multi-line block of text:

list="cricket frog cat dog" 
poem="I wanna be a x\n\ A x is what I'd love to be\n\ If I became a x\n\ How happy I would be.\n"
for critter in $list; do
   echo -e ${poem//x/$critter}
There are several additional capabilities
${var:pos[:len]} # extract substr from pos (0-based) for len
${var/substr/repl} # replace first match
${var//substr/repl} # replace all matches
${var/#substr/repl} # replace if matches at beginning (non-greedy)
${var/##substr/repl} # replace if matches at beginning (greedy)
${var/%substr/repl} # replace if matches at end (non-greedy)
${var/%%substr/repl} # replace if matches at end (greedy)
${#var} # returns length of $var
${!var} # indirect expansion


In bash-3.1, a string append operator (+=) was added, It is now a preferred solution:

echo "$PATH"

Traditionally in shell strings were concatenated by juxtaposition and using double quoted strings. For example


Double quoted string in shell is almost identical to double quoted string in Perl and performs macro expansion of all variables in it. The minor differences are the treatment of escaped characters and new line character.  If you want exact match you can use $'string'


# String expansion.Introduced with version 2 of Bash.

#  Strings of the form $'xxx' have the standard escaped characters interpreted. 

echo $'Ringing bell 3 times \a \a \a'
     # May only ring once with certain terminals.
echo $'Three form feeds \f \f \f'
echo $'10 newlines \n\n\n\n\n\n\n\n\n\n'
echo $'\102\141\163\150'   # Bash
                           # Octal equivalent of characters.

exit 0

Trimming from left and right

Using the wildcard character (?), you can imitate Perl chop function (which cuts the last character of the string and returns the rest) quite easily

echo "original='$test,timmed_first='$trimmed_first', trimmed_last='$trimmed_last'"

The first character of a string can also be obtained with printf:

printf -v char "%c" "$source"
Conditional chopping line in Perl chomp function or REXX function trim can be done using while loop, for example:
function trim
   while : # this is an infinite loop
   case $target in
      ' '*) target=${target#?} ;; ## if $target begins with a space remove it
      *' ') target=${target%?} ;; ## if $target ends with a space remove it
      *) break ;; # no more leading or trailing spaces, so exit the loop
   return target

A more Perl-style method to trim trailing blanks would be

spaces=${source_var##*[! ]} ## get the trailing blanks in var $spaces
The same trick can be used for removing leading spaces.

Assignment of default value for undefined variables

Operator: ${var:-bar} is useful for assigning a variable a default value.

 It word the following way: if $var exists and is not null, it returns $var. If it doesn't exist or is null, return bar.  This operator does not change the variable $var.


$ export var=""
$ echo ${var:-one}
$ echo $var

More complex example:

sort -nr $1 | head -${2:-10}

A typical usage include situations when you need to check if arguments were passed to the script and if not assign some default values::

export FROM=${1:-"~root/.profile"}
export TO=${2:-"~my/.profile"}
cp -p $FROM $TO

set variable if it is not defined with the operator ${var:=bar}

It works as following: If $var exists and is not null, return $var. If it doesn't exist or is null, set $var to bar and return bar.


$ export var=""
$ echo ${var:=one}
echo $var

String comparison via double square bracket conditionals

The [[ ]] construct was introduced in ksh88 as a way to compensate for multiple shortcomings and limitations of the [ ] (test) solution.  Essentially it makes [ ] construct obsolete except for running a program to get a return code.  In turn double round brackets ((..)) construct made  The [[ ]] construct obsolete for integer comparisons.

There are two types of operators that can be used inside double square bracket construct:

Paradoxically integer comparison operators are represented as strings ( -eq, -ne, -gt, etc)  while string comparison operators as delimiters ("=", "==", "!=", "<", ">", etc). 

The [[ ]] construct expects expression. In ksh delimiters [[ and ]] serve as single quotes so you do not have macro expansion inside: variable substitution and wildcard expansion aren't done within [[ and ]], making quoting less necessary. In bash this is less true :-). 

It can act as independent operator as it produces return code. So constructs like

[[ $string =∼ [aeiou] ]] && exit; 

are legitimate and actually pretty compact way to write if statements without else clause that contain a single statement in then block.

One of the [[ ]] construct warts is that it redefined == as a pattern matching operation, which anybody who programmed in C/C++/Java strongly resent. Latest bash version corrected that and allow using Perl-style =~ operator instead (I think ksh93 allow that too):

[[ $string =∼ [aeiou] ]]
echo $?

[[ $string =∼ h[sdfghjkl] ]]
echo $?

Like [ ] construct [[  ]] construct can be used as a separate statement that returns an exit status depending upon whether condition is true or not.  With && and || constructs discussed above this provides an alternative syntax for if-then and if-else constructs

if [[ -d  $HOME/$user ]] ; then echo " Home for user $user exists..."; fi

can be written simpler as

[[ -d  $HOME/$user ]] && echo " Home for user $user exists..."

There are several types of expressions that can be used inside [[  ... ]] construct:

One unpleasant reality (and probably the most common gotcha) of using legacy [[...]] integer comparison constructs is that if one of the variable is not initialized it produces syntax error. Various tricks are used to avoid this nasty macro substitution side effect, that came of a legacy of extremely week implementation of comparisons in Borne shell (there are way too many crazy things implemented in Borne shell, anyway ;-). 

There are two classic tricks to deal with this gotchas in old [[..]] construct as you will be dealing with scripts infested with those old constructs pretty often. They can be and often are used simultaneously:

Generally, it is better to initialize most variables explicitly. I know it is difficult as old habits die slowly, but this can be done. Here are the most common "legacy integer comparison operators":

is equal to

if [[ "$a" -eq "$b" ]]

is not equal to

if [[ "$a" -ne "$b" ]]

is greater than

if [[ "$a" -gt "$b" ]]

is greater than or equal to

if [[ "$a" -ge "$b" ]]

is less than

if [[ "$a" -lt "$b" ]]

is less than or equal to

if [[ "$a" -le "$b" ]]

String comparisons

String comparisons is all what left useful in this construct as for integer comparisons ((..)) construct is better and for file comparisons older [...] construct is equal. This topic is discussed at greater length at String Operations in Shell


Operator True if...
str = pat
str == pat
str matches pat. Note that in case of "==" that's not what you logically expect, if you have some experience with C /C++/Java programming !!!
str != pat str does not match pat.
str1 < str2 str1 is less than str2 is collation order used
str1 > str2 str1 is greater than str2.
-n str str is not null (has length greater than 0).
-z str str is null (has length 0).
file1 -ef file2 file1 is another name for file2 (hard or symbolic link)

While we're cleaning up code we wrote in the last chapter, let's fix up the error handling in the highest script  The code for that script is:

filename=${1:?"filename missing."}
sort -nr $filename | head -$howmany

Recall that if you omit the first argument (the filename), the shell prints the message highest: 1: filename missing. We can make this better by substituting a more standard "usage" message:

if [[ -z $1 ]]; then
    print 'usage: howmany filename [-N]'
    sort -nr $filename | head -$howmany

It is considered better programming style to enclose all of the code in the if-then-else, but such code can get confusing if you are writing a long script in which you need to check for errors and bail out at several points along the way. Therefore, a more usual style for shell programming is this:

if [[ -z $1 ]]; then
    print 'usage: howmany filename [-N]'
    return 1
sort -nr $filename | head -$howmany

Pattern Matching

There are two types of pattern matching is shell:

Unless you need to modify old scripts it does not make sense to use old ksh-style regex in bash.

Perl-style regular expressions

(partially borrowed from Bash Regular Expressions | Linux Journal)

Since version 3 of bash (released in 2004) bash implements  an extended regular expressions which are mostly compatible with Perl regex. They are also called POSIX regular expressions as they are defined in IEEE POSIX 1003.2. (which you should read and understand to use the full power provided). Extended regular expression are also used in egrep so they are  well known by system administrators.  Please note that Perl regular expressions are equivalent to extended regular expressions with a few additional features:

Predefined Character Classes

Extended regular expression support set of predefined character classes. When used between brackets, these define commonly used sets of characters. The POSIX character classes implemented in extended regular expressions include:

NOTE: I have problems with GNU bash, version 3.2.25(1)-release (x86_64-redhat-linux-gnu)  using those extended classes. It does accept them, but it does not match correctly.

Modifiers are similar to Perl

Extended regex Perl regex
a+ a+
a? a?
a|b a|b
(expression1) (expression1)
{m,n} {m,n}
{,n} {,n}
{m,} {m,}
{m} {m}

It returns 0 (success) if the regular expression matches the string, otherwise it returns 1 (failure).

In addition to doing simple matching, bash regular expressions support sub-patterns surrounded by parenthesis for capturing parts of the match. The matches are assigned to an array variable BASH_REMATCH. The entire match is assigned to BASH_REMATCH[0], the first sub-pattern is assigned to BASH_REMATCH[1], etc..

The following example script takes a regular expression as its first argument and one or more strings to match against. It then cycles through the strings and outputs the results of the match process:


if [[ $# -lt 2 ]]; then
    echo "Usage: $0 PATTERN STRINGS..."
    exit 1
echo "regex: $regex"

while [[ $1 ]]
    if [[ $1 =~ $regex ]]; then
        echo "$1 matches"
        while [[ $i -lt $n ]]
            echo "  capture[$i]: ${BASH_REMATCH[$i]}"
            let i++
        echo "$1 does not match"

Assuming the script is saved in "", the following sample shows its output:

  # sh 'aa(b{2,3}[xyz])cc' aabbxcc aabbcc
  regex: aa(b{2,3}[xyz])cc

  aabbxcc matches
    capture[1]: bbx
  aabbcc does not match

Old KSH Pattern-matching Operators

Pattern-matching operators were introduced in ksh88 in a very idiosyncratic way. The notation is different from used by Perl or utilities such as grep. That's a shame, but that's how it is. Life is not perfect. They are hard to remember, but there is a handy mnemonic tip: # matches the front because number signs precede numbers; % matches the rear because percent signs follow numbers.

There are two kinds of pattern matching available: matching from the left and matching from the right.

The operators, with their functions and an example, are shown in the following table (note that on keyboard symbol "#" is to the left and symbol "%" is to the right of dollar sign; that might help to memorize them):

Operator Meaning Example
${var#t*is} Deletes the shortest possible match from the left:  If the pattern matches the beginning of the variable's value, delete the shortest part that matches and return the rest. export $var="this is a test"
echo ${var#t*is}
is a test
${var##t*is} Deletes the longest possible match from the left: If the pattern matches the beginning of the variable's value, delete the longest part that matches and return the rest. export $var="this is a test"

echo ${var##t*is}

a test

${var%t*st} Deletes the shortest possible match from the right: If the pattern matches the end of the variable's value, delete the shortest part that matches and return the rest. export $var="this is a test"

echo ${var%t*st}

this is a

${var%%t*st} Deletes the longest possible match from the right: If the pattern matches the end of the variable's value, delete the longest part that matches and return the rest. export $var="this is a test" echo ${var%%t*is}

NOTE: While the #  and %  identifiers may not seem obvious, they have a convenient mnemonic. The #  key is on the left side of the $  key on the keyboard and operates from the left. The % key is on the right of the $  key and operated from the right.

These operators can be used to cu string both from right and left and extract the necessary part. In the example below this is done with uptime:

cores=`grep processor /proc/cpuinfo | wc -l`

cpuload=${cpuload#*average: }
if (( cpuload > cores + 1  )) ;  then 
   echo "Server $HOSTNAME overloaded: $cpuload on $corex cores"

 For example, the following script changes the extension of all .html files to .htm.

# quickly convert html filenames for use on a dossy system
# only handles file extensions, not filenames

for i in *.html; do
  if [ -f ${i%l} ]; then
    echo ${i%l} already exists
    mv $i ${i%l}

The classic use for pattern-matching operators is stripping off components of pathnames, such as directory prefixes and filename suffixes. With that in mind, here is an example that shows how all of the operators work. Assume that the variable path  has the value /home /billr/mem/; then:

Expression         	  Result
${path#/*/}              billr/mem/
$path              /home/billr/mem/
${path%.*}         /home/billr/mem/long.file
${path%%.*}        /home/billr/mem/long

Operator #: ${var#t*is} deletes the shortest possible match from the left


$ export var="this is a test"
$ echo ${var#t*is}
is a test

Operator ##: ${var##t*is} deletes the longest possible match from the left


$ export var="this is a test"
$ echo ${var##t*is}
a test

Operator %: ${var%t*st} Function: deletes the shortest possible match from the right


$ export var="this is a test" 
$ echo ${var%t*st} 
this is a
for i in *.htm*; do 
   if [ -f ${i%l} ]; then  
      echo "${i%l} already exists" 
      mv $i ${i%l} 

Operator %%: ${var%%t*st} deletes the longest possible match from the right


$ export var="this is a test" 
$ echo ${var%%t*st}

Ksh-style regular expressions

A KSH regular expression are now obsolete. Please use Perl-style regular expression instead.

They use idiosyncratic prefix-based notation that is difficult to learn.  Each such operator has the form x(exp), where x is the particular operator and exp is any regular expression (often simply a regular string). The operator determines how many occurrences of exp a string that matches the pattern can contain.

Operator Meaning
*(exp) 0 or more occurrences of exp
+(exp) 1 or more occurrences of exp
?(exp) 0 or 1 occurrences of exp
@(exp1|exp2|...) exp1 or exp2 or...
!(exp) Anything that doesn't match exp
Expression Matches
x x
*(x) Null string, x, xx, xxx, ...
+(x) x, xx, xxx, ...
?(x) Null string, x
!(x) Any string except x
@(x) x (see below)

The following section compares Korn shell regular expressions to analogous features in awk and egrep. If you aren't familiar with these, skip to the section entitled "Pattern-matching Operators."

shell basic regex vs awk/egrep regular expressions

Shell egrep/awk Meaning
*(exp) exp* 0 or more occurrences of exp
+(exp) exp+ 1 or more occurrences of exp
?(exp) exp? 0 or 1 occurrences of exp
@(exp1|exp2|...) exp1|exp2|... exp1 or exp2 or...
!(exp) (none) Anything that doesn't match exp

These equivalents are close but not quite exact. Actually, an exp within any of the Korn shell operators can be a series of exp1|exp2|... alternates. But because the shell would interpret an expression like dave|fred|bob  as a pipeline of commands, you must use @(dave|fred|bob)  for alternates

For example:

It is worth re-emphasizing that shell regular expressions can still contain standard shell wildcards. Thus, the shell wildcard ?  (match any single character) is the equivalent to .  in egrep or awk, and the shell's character set operator [...] is the same as in those utilities. For example, the expression +([0-9])  matches a number, i.e., one or more digits. The shell wildcard character * is equivalent to the shell regular expression * (?).

A few egrep and awk regexp operators do not have equivalents in the Korn shell. These include:

The first two pairs are hardly necessary, since the Korn shell doesn't normally operate on text files and does parse strings into words itself.

Using typeset for converting string to to lower or upper case

typeset function allow creating local varivales in shell.  But two of its options allow to perform really useful string operations:

You can turn off typeset options explicitly by typing typeset +o , where o is the option you turned on before.

In Korn shell there are two additional options (-L and -R) that allow also trimming string to fixed length and remove leading blanks. An obvious application for the -L and -R options is one in which you need fixed-width output.

Here are a  simple example taken from Leaning the Korn Shell [Chapter 6] 6.3 Arrays

 Assume that the variable alpha is assigned the letters of the alphabet, in alternating case, surrounded by three blanks on each side:

alpha="   aBcDeFgHiJkLmNoPqRsTuVwXyZ   "

Table 6.6 shows some typeset statements and their resulting values (assuming that each of the statements are run "independently").

Table 6.6: Examples of typeset String Formatting Options
Statement Value of v
typeset -L v=$alpha "aBcDeFgHiJkLmNoPqRsTuVwXyZ   "
typeset -L10 v=$alpha "aBcDeFgHiJ"
typeset -R v=$alpha "   aBcDeFgHiJkLmNoPqRsTuVwXyZ"
typeset -R16 v=$alpha "kLmNoPqRsTuVwXyZ"
typeset -l v=$alpha "   abcdefghijklmnopqrstuvwxyz"
typeset -uR5 v=$alpha "VWXYZ"
typeset -Z8 v= "123.50" "00123.50"



Top updates

Softpanorama Switchboard
Softpanorama Search


Old News :-)

Please visit Heiner Steven's SHELLdorado, the best shell scripting site on the Internet

[Nov 04, 2016] Coding Style rear-rear Wiki

Reading rear sources is an interesting exercise. It really demonstrates attempt to use "reasonable' style of shell programming and you can learn a lot.
Nov 04, 2016 |

Relax-and-Recover is written in Bash (at least bash version 3 is needed), a language that can be used in many styles. We want to make it easier for everybody to understand the Relax-and-Recover code and subsequently to contribute fixes and enhancements.

Here is a collection of coding hints that should help to get a more consistent code base.

Don't be afraid to contribute to Relax-and-Recover even if your contribution does not fully match all this coding hints. Currently large parts of the Relax-and-Recover code are not yet in compliance with this coding hints. This is an ongoing step by step process. Nevertheless try to understand the idea behind this coding hints so that you know how to break them properly (i.e. "learn the rules so you know how to break them properly").

The overall idea behind this coding hints is:

Make yourself understood

Make yourself understood to enable others to fix and enhance your code properly as needed.

From this overall idea the following coding hints are derived.

For the fun of it an extreme example what coding style should be avoided:

#!/bin/bash for i in `seq 1 2 $((2*$1-1))`;do echo $((j+=i));done


Try to find out what that code is about - it does a useful thing.

Code must be easy to read Code should be easy to understand

Do not only tell what the code does (i.e. the implementation details) but also explain what the intent behind is (i.e. why ) to make the code maintainable.

Here the initial example so that one can understand what it is about:

#!/bin/bash # output the first N square numbers # by summing up the first N odd numbers 1 3 ... 2*N-1 # where each nth partial sum is the nth square number # see # this way it is a little bit faster for big N compared to # calculating each square number on its own via multiplication N=$1 if ! [[ $N =~ ^[0-9]+$ ]] ; then echo "Input must be non-negative integer." 1>&2 exit 1 fi square_number=0 for odd_number in $( seq 1 2 $(( 2 * N - 1 )) ) ; do (( square_number += odd_number )) && echo $square_number done

Now the intent behind is clear and now others can easily decide if that code is really the best way to do it and easily improve it if needed.

Try to care about possible errors

By default bash proceeds with the next command when something failed. Do not let your code blindly proceed in case of errors because that could make it hard to find the root cause of a failure when it errors out somewhere later at an unrelated place with a weird error message which could lead to false fixes that cure only a particular symptom but not the root cause.

Maintain Backward Compatibility

Implement adaptions and enhancements in a backward compatible way so that your changes do not cause regressions for others.

Dirty hacks welcome

When there are special issues on particular systems it is more important that the Relax-and-Recover code works than having nice looking clean code that sometimes fails. In such special cases any dirty hacks that intend to make it work everywhere are welcome. But for dirty hacks the above listed coding hints become mandatory rules:

For example a dirty hack like the following is perfectly acceptable:

# FIXME: Dirty hack to make it work # on "FUBAR Linux version 666" # where COMMAND sometimes inexplicably fails # but always works after at most 3 attempts # see # Retries should have no bad effect on other systems # where the first run of COMMAND works. COMMAND || COMMAND || COMMAND || Error "COMMAND failed."

Character Encoding

Use only traditional (7-bit) ASCII charactes. In particular do not use UTF-8 encoded multi-byte characters.

Text Layout Variables Functions Relax-and-Recover functions

Use the available Relax-and-Recover functions when possible instead of re-implementing basic functionality again and again. The Relax-and-Recover functions are implemented in various lib/* files .

test, [, [[, (( Paired parenthesis See also

[Nov 01, 2014] How to determine if a string is a substring of another in bash?

I want to see if a string is inside a portion of another string.
'ab' in 'abc' -> true
'ab' in 'bcd' -> false

How can I do this in a conditional of a bash script?

A: You can use the form ${VAR/subs} where VAR contains the bigger string and subs is the substring your are trying to find:

if [ "${my_string/$substring}" = "$my_string" ] ; then
  echo "${substring} is not in ${my_string}"
  echo "${substring} was found in ${my_string}"

This works because ${VAR/subs} is equal to $VAR but with the first occurrence of the string subs removed, in particular if $VAR does not contains the word subs it won't be modified.

I think that you should change the sequence of the echo statements. Because I get ab is not in abc

Mmm.. No, the script is wrong. Like that I get ab was found in abc, but if I use substring=z I get z was found in abcLucio May 25 '13 at 0:08


Sorry again I forgot the $ in substring. – edwin May 25 '13 at 0:10

Now I get ab is not in abc. But z was found in abc. This is funny :D – Lucio May 25 '13 at 0:11


[[ "ab" =~ "bcd" ]]
[[ "ab" =~ "abc" ]]

the brackets are for the test, and as it is double brackets, it can so some extra tests like =~.

So you could use this form something like

if [[ "$var2" =~ "$var1" ]]; then
    echo "pass"
    echo "fail"

Edit: corrected "=~", had flipped.

I get fail with this parameters: var2="abcd"Lucio May 25 '13 at 0:02


@Lucio The correct is [[ $string =~ $substring ]]. I updated the answer. – Eric Carvalho May 25 '13 at 0:38


@EricCarvalho opps, thanks for correcting it. – demure May 25 '13 at 0:49


Using bash filename patterns (aka "glob" patterns)

[[ abc == *"$substr"* ]] && echo yes || echo no    # yes
[[ bcd == *"$substr"* ]] && echo yes || echo no    # no

The following two approaches will work on any POSIX-compatible environment, not just in bash:

for s in abc bcd; do
    if case ${s} in *"${substr}"*) true;; *) false;; esac; then
        printf %s\\n "'${s}' contains '${substr}'"
        printf %s\\n "'${s}' does not contain '${substr}'"
for s in abc bcd; do
    if printf %s\\n "${s}" | grep -qF "${substr}"; then
        printf %s\\n "'${s}' contains '${substr}'"
        printf %s\\n "'${s}' does not contain '${substr}'"

Both of the above output:

'abc' contains 'ab'
'bcd' does not contain 'ab'

The former has the advantage of not spawning a separate grep process.

Note that I use printf %s\\n "${foo}" instead of echo "${foo}" because echo might mangle ${foo} if it contains backslashes.


Mind the [[ and ":

[[ $a == z* ]]   # True if $a starts with an "z" (pattern matching).
[[ $a == "z*" ]] # True if $a is equal to z* (literal matching).

[ $a == z* ]     # File globbing and word splitting take place.
[ "$a" == "z*" ] # True if $a is equal to z* (literal matching).

So as @glenn_jackman said, but mind that if you wrap the whole second term in double quotes, it will switch the test to literal matching.


[Mar 16, 2011] Bash pattern substitution

Here's the actual formal definition from the bash man pages:

The pattern is expanded to produce a pattern just as in pathname expansion. Parameter is expanded and the longest match of pat- tern against its value is replaced with string. In the first form, only the first match is replaced. The second form causes all matches of pattern to be replaced with string. If pattern begins with #, it must match at the beginning of the expanded value of parameter. If pattern begins with %, it must match at the end of the expanded value of parameter. If string is null, matches of pattern are deleted and the / following pattern may be omitted. If parameter is @ or *, the substitution operation is applied to each positional parameter in turn, and the expan- sion is the resultant list. If parameter is an array variable subscripted with @ or *, the substitution operation is applied to each member of the array in turn, and the expansion is the resultant list.

[Mar 16, 2011] Bash info, scripting examples, regex parameter substitution, interactive shell, and more

Regular expressions and globbing

Globbing is use of * as a wildcard to glob file name list together. Use of wildcards is not a regular expression.

These following examples should also work inside bash scripts. These may or may not be compatible with sh. These are "interesting" regex or globbing examples. I say "interesting" because they don't seem to follow the path of "true" regular expressions used by Perl.

[mst3k@zeus ~]$ echo ${HOME/\/home\//}
[mst3k@zeus ~]$ echo ${HOME##home}
[mst3k@zeus ~]$ echo ${HOME##/home}
[mst3k@zeus ~]$ echo ${HOME##/home/}
[mst3k@zeus ~]$ echo ${HOME##*}

[mst3k@zeus ~]$ echo ${HOME##*/}

Trouble with the string replacement function in bash

I'm having trouble with the string replacement function in bash. The problem is that I want to replace a printing character, in this case &, by a non-printing character, either new line or null in this case. I don't see how to specify the non-printing character in the string replacement function ${variable//a/b}.

I have a long, URL-encoded-like file name that I would like to parse with grep. I have used & as a delimiter between variables within the long file name. I would like to use the string replacement function in bash to search for all instances of & and replace each one with either the null character or the new line character since grep can recognize either one.

How do I specify a non-printing character in the bash string replacement function ?

Thank you.

Special Syntax
Submitted by Mitch Frazier on Fri, 04/02/2010 - 11:43.
Use the $'\xNN' syntax for the non-printing character. Note though that a NULL character does not work:

$ cat


# ^^^^^^^ change to newline
echo -n ">>$v2<<" | hexdump -C

# ^^^^^^^ change to null (doesn't work)
echo -n ">>$v2<<" | hexdump -C
If you run this you can see that the substitution works for a newline but not for a NULL:

$ sh
00000000 3e 3e 68 65 6c 6c 6f 3d 79 65 73 0a 77 6f 72 6c |>>hello=yes.worl|
00000010 64 3d 6e 6f 3c 3c |d=no<<|
00000000 3e 3e 68 65 6c 6c 6f 3d 79 65 73 77 6f 72 6c 64 |>>hello=yesworld|
00000010 3d 6e 6f 3c 3c |=no<<|
Mitch Frazier is an Associate Editor for Linux Journal.

Multiple operations?
Submitted by Anonymous on Wed, 03/24/2010 - 10:56.
Very interesting article.
A question: is it possible to use in the same expression many operators, as:
${var#t*is%t*st} which uses both '#t*is' and '%t*st' which gives 'is a' in the example?
I tried some forms but it doesn't work... Has someone an idea?

Doesn't Work
Submitted by Mitch Frazier on Wed, 03/24/2010 - 14:59.
You can't do multiple operations in one expression.

Mitch Frazier is an Associate Editor for Linux Journal.

Submitted by First question (not verified) on Tue, 09/01/2009 - 07:07.
I want to do something like this using linux bash script:
a1="Chris Alonso"
echo $a$i #I only trying to write: echo $a1 using the variable i

Someone can help me, please?

Submitted by Mitch Frazier on Tue, 09/01/2009 - 13:41.
Eval will do this for you but you may decide you really don't want to do this after seeing it:

eval echo \$$(echo a$i)
eval echo \$`echo a$i`
A slightly less complicated sequence would be something like:

eval echo \$$v
It looks like what you're trying to do here is simulate arrays. If that's the case then you'd be better or using bash's built-in arrays.

Mitch Frazier is an Associate Editor for Linux Journal.

How about v=a$i echo ${!v}
Submitted by Anonymous (not verified) on Fri, 09/25/2009 - 15:09.
How about

echo ${!v}

simplification of indirect reference
Submitted by Anonymous on Fri, 03/12/2010 - 23:16.
Is there any way to rid the statements of the variable assignment? As in, make it so that:

echo ${!a$i}

works? I'm thinking that there has to be a way to escape the "a$i" inside the indirect reference construct. I have a case where I'm trying to do this with the result of a regex match, and am not able to figure out the right syntax:


for item in ${ARRAY[@]; do
[[ "$item" =~ hay(needle)stack ]] &&

But the seventh line (with the indirect reference) chokes with a "bad substitution" error. I should be able to do this on one line, without using eval with the right syntax, no?


Submitted by Mitch Frazier on Fri, 09/25/2009 - 15:36.
That works and is simpler than my solution.

Mitch Frazier is an Associate Editor for Linux Journal.

: or not to :
Submitted by Ash (not verified) on Mon, 01/08/2007 - 07:47.
Interesting, ":" can be ommited for "numeric" variables (script/function arguments).


First time I thought it is a typo, but it is not.

interesting, this will save a few seds and greps!
Submitted by mangoo (not verified) on Tue, 01/29/2008 - 06:24.
Interesting article, this will save me a few seds, greps and awks!

what if we are to operate on
Submitted by MgBaMa req (not verified) on Mon, 09/03/2007 - 22:44.
what if we are to operate on the param $1 $2, ...?
i mean is it feasible to see a result of ${4%/*}
to get a valule as from
$ export $1="this is a test/none"
$ echo ${$1/*}
> this is a test
$ echo ${$2#*/}
> none

variable contents confusing
Submitted by Paul Archerr (not verified) on Wed, 03/29/2006 - 09:39.
Minor typos not withstanding, I had a bit of a problem with the values of the variables used. assigning the value 'bar' to the variable bar makes it confusing to quickly figure out which is which. (Is that 'bar' another variable? Or a value?)
I would suggest making the simple change of putting your values in uppercase. They would stand out and make the article more readable.
For example:
$ export var=var
$ echo ${var}bar # var exists so this works as expected
$ echo $varbar # varbar doesn't exist, so this doesn't


$ export var=VAR
$ echo ${var}bar # var exists so this works as expected
$ echo $varbar # varbar doesn't exist, so this doesn't

You can see how the 'VARbar' on the third line becomes differentiated from the 'varbar' on the fourth line.

var, bar, +varbar: worst things for programming since Microsoft.
Submitted by Anonymous (not verified) on Mon, 09/10/2007 - 14:35.
Using var and bar to try to inform someone is pretty much uniformly bad everywhere it's done, as var and bar explicitly indicate things that don't have any meaning whatsoever. Varbar is even worse, since it is visibly only different from var bar because of a single " " (space).

If you're trying to confuse the reader, use var, bar, and especially varbar.

If you're trying to be informative, please, give your damn variables a short but logically useful name.

Part II?
Submitted by Anonymous (not verified) on Fri, 03/24/2006 - 08:58.
OK but there's a lot more to it than just this. How about some of the following?

${var:pos[:len]} # extract substr from pos (0-based) for len

${var/substr/repl} # replace first match
${var//substr/repl} # replace all matches
${var/#substr/repl} # replace if matches at beginning (non-greedy)
${var/##substr/repl} # replace if matches at beginning (greedy)
${var/%substr/repl} # replace if matches at end (non-greedy)
${var/%%substr/repl} # replace if matches at end (greedy)

${#var} # returns length of $var
${!var} # indirect expansion

...Sorry, those round parens

Submitted by Anonymous (not verified) on Fri, 03/24/2006 - 09:07.
...Sorry, those round parens should be curlies.

Submitted by Stephanie (not verified) on Sun, 03/12/2006 - 21:51.
I think they author did a great job explaining the article and am glad that I was able to learn from it and finally found something interesting to read online!

Examples in Table 1 are rubbish
Submitted by Anonymous (not verified) on Sun, 03/12/2006 - 20:56.
In addition to the incorrect $var= (should be var=), the last two examples don't illustrate the use of the construct they are supposed to . Pity the author did not proof-read the first table.

Examples using same operator yet differing results?!
Submitted by really-txtedmacs (not verified) on Fri, 03/10/2006 - 19:46.
${var#t*is} deletes the shortest possible match from the left export $var="this is a test" echo ${var#t*is} is a test

fine, but next in line is supposed to remove the maximum from the left, but uses the same exact operator, how does it get the correct result?

${var##t*is} deletes the longest possible match from the left export $var="this is a test" echo ${var#t*is} a test

Get's worse when going from the right, the original operation from the right is employed. Moreover, on my system an Ubuntu 05.10 descktop, this gave:

txtedmacs@phpserver:~$ export $var="this is a test"
bash: export: `=this is a test': not a valid identifier

Take out the $var, and it works fine.

Much easier to catch someone else's errors than one's own - I hate looking at my articles or emails.

Errors in article
Submitted by Anonymous (not verified) on Mon, 03/13/2006 - 22:49.
As really-txtedmacs tried to politely point out, there are errors in the Pattern Matching table - Example column, as of when he looked at it and as of now. Each instance of "export $var" should be "export var" in bash and most similar shells. Also, the operator in the echo command needs to match exactly the operator in the first column. Interestingly, some but not all of these errors still exist in the original article at, which is in issue 57, not 67.
Otherwise, a very good article. I will save the info in my bag of tricks.

You're channelling Larry Wall, dude!
Submitted by Jim Dennis (not verified) on Sat, 03/11/2006 - 18:30.
In Bourne shell and its ilk (like Korn shell and bash) the assignment syntax is:
var=... You only prefix a variable's name with $ when you're "dereferencing" it (expanding it into its value).

So the shell was parsing
export $var="this is a test" as:

export ???="this is a test" (where ??? is whatever "var" was set to before this statement ... probably the empty string if the variable was previously unset).

I know this is confusing because Perl does it completely differently. In Perl the $ is a "sigil" which, on an "lvalue" (a variable name or other assignable token) tells the interpeter what "type" of assignment is occuring. Thus a Perl statement like:
$var="this is a test"; (note the required semicolon, too) is a "scalar" assignment. This also sets the context of the assignment. In Perl a scalar value in scalar context is conceptually the closest to a normal shell variable assignment. However, a list value in a scalar assignment context is a different beast entirely. So a line of Perl like
perl -e '@bar=(1,2,3)]; $var=@bar; print $var ;' will set $var to the number of items in the bar array. (Of course we could use @var for the array name since they are different namespaces in Perl. But I wanted my example to be clear). So an array/list value in scalar context returns an integer (a type of scalar) which represents the number of elements in the list.

Anyway, just remembrer that the shell $ it more like the C programming * operator ... it dereferences the variable into its value.

The Linux Gazette "Answer Guy"

USA <> World :-)
Submitted by on Fri, 03/10/2006 - 14:57.
Although the # and % identifiers may not seem obvious, they have a convenient mnemonic. The # key is on the left side of the $ key and operates from the left.
In the USA, perhaps, but my UK keyboard has the # key nestling up against the Enter and right-Shift keys. Not to mention layouts such as Dvorak...!

Other (non-USA-specific?!) Mnemonics
Submitted by Anonymous (not verified) on Sat, 03/18/2006 - 16:29.
Another way to keep track is that we say "#1" and "1%", not "1#" and "%1". That is, unless you're using "#" to mean "pounds", in which case "1#" is correct, but it's antiquated at best in the USA, and presumably a nonissue for other countries that use metric...

C programmers are used to using "#" at the start of lines (#define, #include). LaTeX authors are used to "%" at the end of lines when writing macro definitions, as a comment to keep extraneous whitespace from creeping in--but "%" is comment to end-of-line so it's also likely to show up at the start of a line too...

Submitted by Island Joe (not verified) on Tue, 03/21/2006 - 04:25.
Thanks for sharing those mnemonic insights, it's most helpful.

[Feb 09, 2011] Pattern matching with replacement

${var:pos[:len]} # extract substr from pos (0-based) for len

${var/substr/repl} # replace first match
${var//substr/repl} # replace all matches
${var/#substr/repl} # replace if matches at beginning (non-greedy)
${var/##substr/repl} # replace if matches at beginning (greedy)
${var/%substr/repl} # replace if matches at end (non-greedy)
${var/%%substr/repl} # replace if matches at end (greedy)

${#var} # returns length of $var
${!var} # indirect expansion

[Feb 09, 2011] : or not to :

Jan 08, 2007


Interesting, ":" can be omitted for "numeric" variables (script/function arguments).

First time I thought it is a typo, but it is not.

bash String Manipulations By Jim Dennis,

The bash shell has many features that are sufficiently obscure you almost never see them used. One of the problems is that the man page offers no examples.

Here I'm going t... some of these features to do the sorts of simple string manipulations that are commonly needed on file and path names.

In traditional Bourne shell programming you might see references to the basename and dirname commands. These perform simple string manipulations on their arguments. You'll also see many uses of sed and awk or perl -e to perform simple string manipulations.

Often these machinations are necessary perform on lists of filenames and paths. There are many specialized programs that are conventionally included with Unix to perform these sorts of utility functions: tr, cut, paste, and join. Given a filename like /home/myplace/ which we'll call $f you could use commands like:

dirname $f 
basename $f 
basename $f.txt

... to see output like:


Notice that the GNU version of basename takes an optional parameter. This handy for specifying a filename "extension" like .tar.gz which will be stripped off of the output. Note that basename and dirname don't verify that these parameters are valid filenames or paths. They simple perform simple string operations on a single argument. You shouldn't use wild cards with them -- since dirname takes exactly one argument (and complains if given more) and basename takes one argument and an optional one which is not a filename.

Despite their simplicity these two commands are used frequently in shell programming because most shells don't have any built-in string handling functions -- and we frequently need to refer to just the directory or just the file name parts of a given full file specification.

Usually these commands are used within the "back tick" shell operators like TARGETDIR=`dirname $1`. The "back tick" operators are equivalent to the $(...) construct. This latter construct is valid in Korn shell and bash -- and I find it easier to read (since I don't have to squint at me screen wondering which direction the "tick" is slanted).

Although the basename and dirname commands embody the "small is beautiful" spirit of Unix -- they may push the envelope towards the "too simple to be worth a separate program" end of simplicity.

Naturally you can call on sed, awk, TCL or perl for more flexible and complete string handling. However this can be overkill -- and a little ungainly.

So, bash (which long ago abandoned the "small is beautiful" principal and went the way of emacs) has some built in syntactical candy for doing these operations. Since bash is the default shell on Linux systems then there is no reason not to use these features when writing scripts for Linux.

The bash man page is huge. In contains a complete reference to the "readline" libraries and how to write a .inputrc file (which I think should all go in a separate man page) -- and a run down of all the csh "history" or bang! operators (which I think should be replaced with a simple statement like: "Most of the csh work the same way in bash").

However, buried in there is a section on Parameter Substitution which tells us that $var is really a shorthand for ${var} which is really the simplest case of several ${var:operators} and similar constructs.

Are you confused, yet?

Here's where a few examples would have helped. To understand the man page I simply experimented with the echo command and several shell variables. This is what it all means:

Here we notice two different "operators" being used inside the parameters (curly braces). Those are the # and the % operators. We also see them used as single characters and in pairs. This gives us four combinations for trimming patterns off the beginning or end of a string:

Trim the shortest match from the end
Trim the longest match from the beginning
Trim the shortest match from the end
Trim the shortest match from the beginning

It's important to understand that these use shell "globbing" rather than "regular expressions" to match these patterns. Naturally a simple string like "txt" will match sequences of exactly those three characters in that sequence -- so the difference between "shortest" and "longest" only applies if you are using a shell wild card in your pattern.

A simple example of using these operators comes in the common question of copying or renaming all the *.txt to change the .txt to .bak (in MS-DOS' COMMAND.COM that would be REN *.TXT *.BAK).

This is complicated in Unix/Linux because of a fundamental difference in the programming API's. In most Unix shells the expansion of a wild card pattern into a list of filenames (called "globbing") is done by the shell -- before the command is executed. Thus the command normally sees a list of filenames (like "var.txt bar.txt etc.txt") where DOS (COMMAND.COM) hands external programs a pattern like *.TXT.

Under Unix shells, if a pattern doesn't match any filenames the parameter is usually left on the command like literally. Under bash this is a user-settable option. In fact, under bash you can disable shell "globbing" if you like -- there's a simple option to do this. It's almost never used -- because commands like mv, and cp won't work properly if their arguments are passed to them in this manner.

However here's a way to accomplish a similar result:

for i in *.txt; do cp $i ${i%.txt}.bak; done 

... obviously this is more typing. If you tried to create a shell function or alias for it -- you have to figure out how to pass this parameters. Certainly the following seems simple enough:

function cp-pattern { for i in $1; do cp $i ${i%$1}$2; done

... but that doesn't work like most Unix users would expect. You'd have to pass this command a pair of specially chosen, and quoted arguments like:

cp-pattern '*.txt' .bak 

... note how the second pattern has no wild cards and how the first is quoted to prevent any shell globbing. That's fine for something you might just use yourself -- if you remember to quote it right. It's easy enough to add check for the number of arguments and to ensure that there is at least one file that exists in the $1 pattern. However it becomes much harder to make this command reasonably safe and robust. Inevitably it becomes less "unix-like" and thus more difficult to use with other Unix tools.

I generally just take a whole different approach. Rather than trying to use cp to make a backup of each file under a slightly changed name I might just make a directory (usually using the date and my login ID as a template) and use a simple cp command to copy all my target files into the new directory.

Another interesting thing we can do with these "parameter expansion" features is to iterate over a list of components in a single variable.

For example, you might want to do something to traverse over every directory listed in your path -- perhaps to verify that everything listed therein is really a directory and is accessible to you.

Here's a command that will echo each directory named on your path on it's own line:

p=$PATH until [ $p = $d ]; do d=${p%%:*}; p=${p#*:}; echo $d; 

... obviously you can replace the echo $d part of this command with anything you like.

Another case might be where you'd want to traverse a list of directories that were all part of a path. Here's a command pair that echos each directory from the root down to the "current working directory":

p=$(pwd) until [ $p = $d ]; do p=${p#*/}; d=${p%%/*}; echo $d; 

... here we've reversed the assignments to p and d so that we skip the root directory itself -- which must be "special cased" since it appears to be a "null" entry if we do it the other way. The same problem would have occurred in the previous example -- if the value assigned to $PATH had started with a ":" character.

Of course, its important to realize that this is not the only, or necessarily the best method to parse a line or value into separate fields. Here's an example that uses the old IFS variable (the "inter-field separator in the Bourne, and Korn shells as well as bash) to parse each line of /etc/passwd and extract just two fields:

cat /etc/passwd | ( \
   IFS=: ;
   while read lognam pw id gp fname home sh; \
   do echo $home \"$fname\"; done \

Here we see the parentheses used to isolate the contents in a subshell -- such that the assignment to IFS doesn't affect our current shell. Setting the IFS to a "colon" tells the shell to treat that character as the separater between "words" -- instead of the usual "whitespace" that's assigned to it. For this particular function it's very important that IFS consist solely of that character -- usually it is set to "space," "tab," and "newline.

After that we see a typical while read loop -- where we read values from each line of input (from /etc/passwd into seven variables per line. This allows us to use any of these fields that we need from within the loop. Here we are just using the echo command -- as we have in the other examples.

My point here has been to show how we can do quite a bit of string parsing and manipulation directly within bash -- which will allow our shell scripts to run faster with less overhead and may be easier than some of the more complex sorts of pipes and command substitutions one might have to employ to pass data to the various external commands and return the results.

Many people might ask: Why not simply do it all in perl? I won't dignify that with a response. Part of the beauty of Unix is that each user has many options about how they choose to program something. Well written scripts and programs interoperate regardless of what particular scripting or programming facility was used to create them. Issue the command file /usr/bin/* on your system and and you may be surprised at how many Bourne and C shell scripts there are in there

In conclusion I'll just provide a sampler of some other bash parameter expansions:

Provide a default if parameter is unset or null.
echo ${1:-"default"}
Note: this would have to be used from within a functions or shell script -- the point is to show that some of the parameter substitutions can be use with shell numbered arguments. In this case the string "default" would be returned if the function or script was called with no $1 (or if all of the arguments had been shifted out of existence. ${parameter:=word}
Assign a value to parameter if it was previously unset or null.
echo ${HOME:="/home/.nohome"}
Generate an error if parameter is unset or null by printing word to stdout.
${TMP:?"Error: Must have a valid Temp Variable Set"}

This one just uses the shell "null command" (the : command) to evaluate the expression. If the variable doesn't exist or has a null value -- this will print the string to the standard error file handle and exit the script with a return code of one.

Oddly enough -- while it is easy to redirect the standard error of processes under bash -- there doesn't seem to be an easy portable way to explicitly generate message or redirect output to stderr. The best method I've come up with is to use the /proc/ filesystem (process table) like so:

function error { echo "$*" > /proc/self/fd/2 } 

... self is always a set of entries that refers to the current process -- and self/fd/ is a directory full of the currently open file descriptors. Under Unix and DOS every process is given the following pre-opened file descriptors: stdin, stdout, and stderr.

Alternative value. ${TMP:+"/mnt/tmp"}
use /mnt/tmp instead of $TMP but do nothing if TMP was unset. This is a weird one that I can't ever see myself using. But it is a logical complement to the ${var:-value} we saw above.
Return the length of the variable in characters.

    echo The length of your PATH is ${#PATH}

Manipulating Strings

From Advanced Bash-Scripting Guide: Chapter 10. Manipulating Variables

Bash supports a number of string manipulation operations. Unfortunately, these tools lack a unified focus. Some are a subset of parameter substitution, and others fall under the functionality of the UNIX expr command. This results in inconsistent command syntax and overlap of functionality, not to mention confusion.

expr match "$string" '\($substring\)'
Extracts $substring at beginning of $string, where $substring is a regular expression.
expr "$string" : '\($substring\)'
Extracts $substring at beginning of $string, where $substring is a regular expression.
#     =======	  echo `expr match "$stringZ" '\(.[b-c]*[A-Z]..[0-9]\)'` # abcABC1
echo `expr "$stringZ" : '\(.[b-c]*[A-Z]..[0-9]\)'`     # abcABC1
echo `expr "$stringZ" : '\(.......\)'`                 # abcABC1
# All of the above forms give an identical result.
expr match "$string" '.*\($substring\)'
Extracts $substring at end of $string, where $substring is a regular expression.
expr "$string" : '.*\($substring\)'
Extracts $substring at end of $string, where $substring is a regular expression.

#              ======

echo `expr match "$stringZ" '.*\([A-C][A-C][A-C][a-c]*\)'`  # ABCabc
echo `expr "$stringZ" : '.*\(......\)'`                     # ABCabc
Substring Removal
Strips shortest match of $substring from front of $string.
Strips longest match of $substring from front of $string.
#     |----|
#     |----------|

echo ${stringZ#a*C}    # 123ABCabc
# Strip out shortest match between 'a' and 'C'.

echo ${stringZ##a*C}   # abc
# Strip out longest match between 'a' and 'C'.

Strips shortest match of $substring from back of $string.
Strips longest match of $substring from back of $string.

#                  ||
#      |------------|

echo ${stringZ%b*c}    # abcABC123ABCa
# Strip out shortest match between 'b' and 'c', from back of $stringZ.

echo ${stringZ%%b*c}   # a
# Strip out longest match between 'b' and 'c', from back of $stringZ.
Example 9-10. Converting graphic file formats, with filename change
#  Converts all the MacPaint image files in a directory to "pbm" format.

#  Uses the "macptopbm" binary from the "netpbm" package,
#+ which is maintained by Brian Henderson (
#  Netpbm is a standard part of most Linux distros.

SUFFIX=pbm        # New filename suffix.

if [ -n "$1" ]
  directory=$1    # If directory name given as a script argument...
  directory=$PWD  # Otherwise use current working directory.
#  Assumes all files in the target directory are MacPaint image files,
# + with a ".mac" suffix.

for file in $directory/*  # Filename globbing.
  filename=${file%.*c}    #  Strip ".mac" suffix off filename
                          #+ ('.*c' matches everything
			  #+ between '.' and 'c', inclusive).
  $OPERATION $file > $filename.$SUFFIX
                          # Redirect conversion to new filename.
  rm -f $file             # Delete original files after converting. echo "$filename.$SUFFIX"  # Log what is happening to stdout.

exit 0
Substring Replacement
Replace first match of $substring with $replacement.
Replace all matches of $substring with $replacement.

echo ${stringZ/abc/xyz}         # xyzABC123ABCabc
                                # Replaces first match of 'abc' with 'xyz'.

echo ${stringZ//abc/xyz}        # xyzABC123ABCxyz
                                # Replaces all matches of 'abc' with # 'xyz'.

If $substring matches front end of $string, substitute $replacement for $substring.
If $substring matches back end of $string, substitute $replacement for $substring.

echo ${stringZ/#abc/XYZ}        # XYZABC123ABCabc
                                # Replaces front-end match of 'abc' with 'xyz'.

echo ${stringZ/%abc/XYZ}        # abcABC123ABCXYZ
                                # Replaces back-end match of 'abc' with 'xyz'.

Manipulating strings using awk

A Bash script may invoke the string manipulation facilities of awk as an alternative to using its built-in operations.

Example 9-11. Alternate ways of extracting substrings

#    012345678  Bash
#    123456789  awk
# Note different string indexing system:
# Bash numbers first character of string as '0'.
# Awk  numbers first character of string as '1'.

echo ${String:2:4} # position 3 (0-1-2), 4 characters long
                                       # skid

# The awk equivalent of ${string:pos:length} is substr(string,pos,length).
echo | awk '
{ print substr("'"${String}"'",3,4)    # skid
#  Piping an empty "echo" to awk gives it dummy input,
#+ and thus makes it unnecessary to supply a filename.

exit 0

For more on string manipulation in scripts, refer to Section 9.3 and the relevant section of the expr command listing. For script examples, see:

  1. Example 12-6
  2. Example 9-14
  3. Example 9-15
  4. Example 9-16
  5. Example 9-18

David Korn Tells All

# More to the point, thanks to the way ksh works, you can do this:
# make an array, words, local to the current function

typeset -A words

# read a full line

read line

# split the line into words

echo "$line" | read -A words

# Now you can access the line either word-wise or string-wise - useful if you want to, say, check for a command as the Nth parameter,
# but also keep formatting of the other parameters...

Recommended Links

Softpanorama hot topic of the month

Softpanorama Recommended


FAIR USE NOTICE This site contains copyrighted material the use of which has not always been specifically authorized by the copyright owner. We are making such material available in our efforts to advance understanding of environmental, political, human rights, economic, democracy, scientific, and social justice issues, etc. We believe this constitutes a 'fair use' of any such copyrighted material as provided for in section 107 of the US Copyright Law. In accordance with Title 17 U.S.C. Section 107, the material on this site is distributed without profit exclusivly for research and educational purposes.   If you wish to use copyrighted material from this site for purposes of your own that go beyond 'fair use', you must obtain permission from the copyright owner. 

ABUSE: IPs or network segments from which we detect a stream of probes might be blocked for no less then 90 days. Multiple types of probes increase this period.  


Groupthink : Two Party System as Polyarchy : Corruption of Regulators : Bureaucracies : Understanding Micromanagers and Control Freaks : Toxic Managers :   Harvard Mafia : Diplomatic Communication : Surviving a Bad Performance Review : Insufficient Retirement Funds as Immanent Problem of Neoliberal Regime : PseudoScience : Who Rules America : Neoliberalism  : The Iron Law of Oligarchy : Libertarian Philosophy


War and Peace : Skeptical Finance : John Kenneth Galbraith :Talleyrand : Oscar Wilde : Otto Von Bismarck : Keynes : George Carlin : Skeptics : Propaganda  : SE quotes : Language Design and Programming Quotes : Random IT-related quotesSomerset Maugham : Marcus Aurelius : Kurt Vonnegut : Eric Hoffer : Winston Churchill : Napoleon Bonaparte : Ambrose BierceBernard Shaw : Mark Twain Quotes


Vol 25, No.12 (December, 2013) Rational Fools vs. Efficient Crooks The efficient markets hypothesis : Political Skeptic Bulletin, 2013 : Unemployment Bulletin, 2010 :  Vol 23, No.10 (October, 2011) An observation about corporate security departments : Slightly Skeptical Euromaydan Chronicles, June 2014 : Greenspan legacy bulletin, 2008 : Vol 25, No.10 (October, 2013) Cryptolocker Trojan (Win32/Crilock.A) : Vol 25, No.08 (August, 2013) Cloud providers as intelligence collection hubs : Financial Humor Bulletin, 2010 : Inequality Bulletin, 2009 : Financial Humor Bulletin, 2008 : Copyleft Problems Bulletin, 2004 : Financial Humor Bulletin, 2011 : Energy Bulletin, 2010 : Malware Protection Bulletin, 2010 : Vol 26, No.1 (January, 2013) Object-Oriented Cult : Political Skeptic Bulletin, 2011 : Vol 23, No.11 (November, 2011) Softpanorama classification of sysadmin horror stories : Vol 25, No.05 (May, 2013) Corporate bullshit as a communication method  : Vol 25, No.06 (June, 2013) A Note on the Relationship of Brooks Law and Conway Law


Fifty glorious years (1950-2000): the triumph of the US computer engineering : Donald Knuth : TAoCP and its Influence of Computer Science : Richard Stallman : Linus Torvalds  : Larry Wall  : John K. Ousterhout : CTSS : Multix OS Unix History : Unix shell history : VI editor : History of pipes concept : Solaris : MS DOSProgramming Languages History : PL/1 : Simula 67 : C : History of GCC developmentScripting Languages : Perl history   : OS History : Mail : DNS : SSH : CPU Instruction Sets : SPARC systems 1987-2006 : Norton Commander : Norton Utilities : Norton Ghost : Frontpage history : Malware Defense History : GNU Screen : OSS early history

Classic books:

The Peter Principle : Parkinson Law : 1984 : The Mythical Man-MonthHow to Solve It by George Polya : The Art of Computer Programming : The Elements of Programming Style : The Unix Hater’s Handbook : The Jargon file : The True Believer : Programming Pearls : The Good Soldier Svejk : The Power Elite

Most popular humor pages:

Manifest of the Softpanorama IT Slacker Society : Ten Commandments of the IT Slackers Society : Computer Humor Collection : BSD Logo Story : The Cuckoo's Egg : IT Slang : C++ Humor : ARE YOU A BBS ADDICT? : The Perl Purity Test : Object oriented programmers of all nations : Financial Humor : Financial Humor Bulletin, 2008 : Financial Humor Bulletin, 2010 : The Most Comprehensive Collection of Editor-related Humor : Programming Language Humor : Goldman Sachs related humor : Greenspan humor : C Humor : Scripting Humor : Real Programmers Humor : Web Humor : GPL-related Humor : OFM Humor : Politically Incorrect Humor : IDS Humor : "Linux Sucks" Humor : Russian Musical Humor : Best Russian Programmer Humor : Microsoft plans to buy Catholic Church : Richard Stallman Related Humor : Admin Humor : Perl-related Humor : Linus Torvalds Related humor : PseudoScience Related Humor : Networking Humor : Shell Humor : Financial Humor Bulletin, 2011 : Financial Humor Bulletin, 2012 : Financial Humor Bulletin, 2013 : Java Humor : Software Engineering Humor : Sun Solaris Related Humor : Education Humor : IBM Humor : Assembler-related Humor : VIM Humor : Computer Viruses Humor : Bright tomorrow is rescheduled to a day after tomorrow : Classic Computer Humor

The Last but not Least

Copyright © 1996-2016 by Dr. Nikolai Bezroukov. was created as a service to the UN Sustainable Development Networking Programme (SDNP) in the author free time. This document is an industrial compilation designed and created exclusively for educational use and is distributed under the Softpanorama Content License.

The site uses AdSense so you need to be aware of Google privacy policy. You you do not want to be tracked by Google please disable Javascript for this site. This site is perfectly usable without Javascript.

Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.

FAIR USE NOTICE This site contains copyrighted material the use of which has not always been specifically authorized by the copyright owner. We are making such material available to advance understanding of computer science, IT technology, economic, scientific, and social issues. We believe this constitutes a 'fair use' of any such copyrighted material as provided by section 107 of the US Copyright Law according to which such material can be distributed without profit exclusively for research and educational purposes.

This is a Spartan WHYFF (We Help You For Free) site written by people for whom English is not a native language. Grammar and spelling errors should be expected. The site contain some broken links as it develops like a living tree...

You can use PayPal to make a contribution, supporting development of this site and speed up access. In case is down you can use the at


The statements, views and opinions presented on this web page are those of the author (or referenced source) and are not endorsed by, nor do they necessarily reflect, the opinions of the author present and former employers, SDNP or any other organization the author may be associated with. We do not warrant the correctness of the information provided or its fitness for any purpose.

Last modified: March 18, 2017