Line-based records are those where each line in the file is a complete record. Each record is usually divided into fields by a delimiting character, but sometimes the fields are defined by length: the first 20 characters are the name, the next 20 are the first line of the address, and so on.
When the files are large, the processing is usually done by an external utility such as sed or awk. Sometimes an external utility will be used to select a few records for the shell to process. This snippet searches the password file for users whose shell is bash and feeds the results to the shell to perform some (unspecified) checks:
grep 'bash$' /etc/passwd |
while read line
do
  : perform some checking here
done
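As a variation on the same idea, awk can test the shell field itself rather than the end of the whole line. The following is only a sketch: the sample records are made up, and the names sample, count, and names are not from the text. A here document feeds the selected records to the loop so that the variables set inside it remain available afterward.

```shell
# Made-up records in /etc/passwd format, used instead of the real file
sample='root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
alice:x:1000:1000:Alice:/home/alice:/bin/bash'

count=0
names=
while read -r line
do
  count=$(( count + 1 ))        ## count the selected records
  names="$names ${line%%:*}"    ## collect the login name (first field)
done <<EOF
$(printf '%s\n' "$sample" | awk -F: '$7 ~ /bash$/')
EOF

printf 'matched %d records:%s\n' "$count" "$names"
```

With the sample data, two records (root and alice) reach the loop.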
Most single-line records will have fields delimited by a certain character. In /etc/passwd, the delimiter is a colon. In other files, the delimiter may be a tab, tilde, or, very commonly, a comma. For these records to be useful, they must be split into their separate fields.
When records are received on an input stream, the easiest way to split them is to change IFS and read each field into its own variable:
grep 'bash$' /etc/passwd |
while IFS=: read user passwd uid gid name homedir shell
do
  printf "%16s: %s\n" \
      User "$user" \
      Password "$passwd" \
      "User ID" "$uid" \
      "Group ID" "$gid" \
      Name "$name" \
      "Home directory" "$homedir" \
      Shell "$shell"
  read < /dev/tty
done
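The same pattern works for other delimiters. This sketch applies it to comma-delimited input; the records and the variable names (total, report, city, country) are invented for illustration. Because IFS is set only as a prefix to the read command, the shell's own IFS is left untouched once the loop ends.

```shell
total=0
report=
while IFS=, read -r name city country   ## IFS=, applies to this read only
do
  total=$(( total + 1 ))
  report="$report$name lives in $city ($country); "
done <<EOF
John,Toronto,Canada
Maria,Lisbon,Portugal
EOF

printf '%s\n' "$report"
```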
Sometimes a record cannot be split as it is read, for example when it is needed in its entirety as well as split into its constituent fields. In such cases, the entire line can be read into a single variable and then split later using any of several techniques. For all of these, the examples here will use the root entry from /etc/passwd:
record=root:x:0:0:root:/root:/bin/bash
The fields can be extracted one at a time using parameter expansion:
for var in user passwd uid gid name homedir shell
do
  eval "$var=\${record%%:*}" ## extract the first field
  record=${record#*:}        ## and take it off the record
done
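Note that the loop above consumes the record as it goes. If the whole record is still needed afterward, one option is to work on a copy. The sketch below does that and also swaps eval for bash's printf -v, which assigns to a variable whose name is held in another variable; the name rest is an assumption made for this example.

```shell
record=root:x:0:0:root:/root:/bin/bash
rest=$record                            ## work on a copy; $record stays intact
for var in user passwd uid gid name homedir shell
do
  printf -v "$var" '%s' "${rest%%:*}"   ## store the first remaining field in $var
  rest=${rest#*:}                       ## remove that field and its delimiter
done

printf '%s\n' "$user" "$homedir" "$shell" "$record"
```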
As long as the delimiting character is not found within any field, records can be split by setting IFS to the delimiter. When doing this, file name expansion should be turned off (with set -f) to avoid expanding any wildcard characters. The fields can be stored in an array and variables can be set to reference them:
IFS=:
set -f
data=( $record )
user=0
passwd=1
uid=2
gid=3
name=4
homedir=5
shell=6
The variable names are the names of the fields that can then be used to retrieve values from the data array:
$ echo;printf "%16s: %s\n" \
    User "${data[$user]}" \
    Password "${data[$passwd]}" \
    "User ID" "${data[$uid]}" \
    "Group ID" "${data[$gid]}" \
    Name "${data[$name]}" \
    "Home directory" "${data[$homedir]}" \
    Shell "${data[$shell]}"

            User: root
        Password: x
         User ID: 0
        Group ID: 0
            Name: root
  Home directory: /root
           Shell: /bin/bash
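An alternative way to fill the array, offered here as a sketch rather than as the method used above, is bash's read -a with a here string. Since read does not perform file name expansion and IFS is set only for that one command, neither set -f nor a global change to IFS is needed. The sample record is made up; its password field holds an asterisk to show that it is not expanded.

```shell
record='daemon:*:1:1:daemon:/usr/sbin:/usr/sbin/nologin'   ## sample record

IFS=: read -r -a data <<< "$record"   ## split on colons into the array data

printf '%s fields, password field is %s\n' "${#data[@]}" "${data[1]}"
```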
It is more usual to assign each field to a scalar variable. This function (Listing 13-16) takes a passwd record, splits it on colons, and assigns the fields to variables.
Listing 13-16. split_passwd, Split a Record from /etc/passwd into Fields and Assign to Variables
split_passwd() #@ USAGE: split_passwd RECORD
{
  local opts=$-              ## store current shell options
  local IFS=:
  local record=${1:?} array
  set -f                     ## Turn off filename expansion
  array=( $record )          ## Split record into array
  case $opts in *f*);; *) set +f;; esac   ## Turn on expansion if previously set
  user=${array[0]}
  passwd=${array[1]}
  uid=${array[2]}
  gid=${array[3]}
  name=${array[4]}
  homedir=${array[5]}
  shell=${array[6]}
}
The same thing can be accomplished using a here document (Listing 13-17).
Listing 13-17. split_passwd, Split a Record from /etc/passwd into Fields and Assign to Variables
split_passwd()
{
  IFS=: read user passwd uid gid name homedir shell <<.
$1
.
}
More generally, any character-delimited record can be split into variables for each field with this function (Listing 13-18).
Listing 13-18. split_record, Split a Record by Reading Variables
split_record() #@ USAGE: split_record record delimiter var ...
{
  local record=${1:?} IFS=${2:?} ## record and delimiter must be provided
  : ${3:?}                       ## at least one variable is required
  shift 2                        ## remove record and delimiter, leaving variables

  ## Read record into a list of variables using a 'here document'
  read "$@" <<.
$record
.
}
Using the record defined earlier, here’s the output:
$ split_record "$record" : user passwd uid gid name homedir shell
$ sa "$user" "$passwd" "$uid" "$gid" "$name" "$homedir" "$shell"
:root:
:x:
:0:
:0:
:root:
:/root:
:/bin/bash:
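To illustrate the "any character" part, here is a sketch that calls the function with a tilde as the delimiter. The function is repeated so that the example runs on its own; the tilde-delimited record is made up, and printf stands in for the display step.

```shell
split_record() #@ USAGE: split_record record delimiter var ...
{
  local record=${1:?} IFS=${2:?}
  : ${3:?}
  shift 2
  read "$@" <<.
$record
.
}

## Made-up record with four tilde-delimited fields
split_record "John~123 Fourth Street~Toronto~Canada" "~" name address city country
printf ':%s:\n' "$name" "$address" "$city" "$country"
```

Spaces inside a field survive because the only splitting character in effect during the read is the tilde.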
Less common than delimited fields are fixed-length fields. They aren’t used often, but when they are, they can be parsed by looping through a list of name=width strings, which is how many text editors import data from fixed-length field data files:
line="John           123 Fourth Street   Toronto     Canada                "
for nw in name=15 address=20 city=12 country=22
do
  var=${nw%%=*}                   ## variable name precedes the equals sign
  width=${nw#*=}                  ## field width follows it
  eval "$var=\${line:0:width}"    ## extract field
  line=${line:width}              ## remove field from the record
done
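Fields extracted this way keep their padding. The following sketch extends the loop to strip trailing spaces from each field. It builds the sample line with printf so that the field widths are exact, and it uses printf -v in place of eval; the variable field and the trimming step are additions made for this example.

```shell
## Build a sample record padded to the widths 15, 20, 12, and 22
printf -v line '%-15s%-20s%-12s%-22s' John "123 Fourth Street" Toronto Canada

for nw in name=15 address=20 city=12 country=22
do
  var=${nw%%=*}                        ## variable name precedes the equals sign
  width=${nw#*=}                       ## field width follows it
  field=${line:0:width}                ## extract the padded field
  field=${field%"${field##*[! ]}"}     ## strip the trailing spaces
  printf -v "$var" '%s' "$field"       ## assign to the named variable
  line=${line:width}                   ## remove field from the record
done

printf ':%s:\n' "$name" "$address" "$city" "$country"
```

The inner expansion ${field##*[! ]} yields only the run of trailing spaces, which the outer expansion then removes from the end of the field.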