Softpanorama
(slightly skeptical) Open Source Software Educational Society

May the source be with you, but remember the KISS principle ;-)

Introduction to Perl for Unix System Administrators

(Perl without excessive complexity)

by Dr Nikolai Bezroukov


4.3. HERE string literals and DATA filehandle

Suppose you have several lines (say a dozen) that you want to put into a variable (or just print) without any changes, except that the variables found in the text should be interpolated. This can be accomplished with so-called HERE documents. A HERE document in Perl is just another notation for a string literal. Therefore, as with regular string literals, Perl allows three types of HERE documents, selected by the way the tag is quoted: double-quoted (interpolating), single-quoted (literal) and backticked (executed as a shell command).

A HERE string literal with a double-quoted tag is the analog of a double-quoted string. It allows interpolation but is more convenient for multi-line strings. For example:

$lang="Perl";
print <<"EOF";
Dear $lang Language Designer,
   It's not the first language that I learn 
   and I got used not expect much from the compiler/interpreter.
   But in this case I was really upset when I discovered 
   how difficult is to catch a syntax error in the $lang
   and how fuzzy the $lang diagnostic is. 

Your truly, 
Joe User
EOF

Please note that EOF is a closing tag chosen by the programmer: it tells Perl to treat the first line that consists of exactly this tag as the end of the string literal, which usually spans several lines.

Attention: There must be no space at the start of the line that contains the closing tag, and nothing after the tag on that line.
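The single-quoted type is easiest to see side by side with the double-quoted one. The following is a minimal sketch, not taken from the original text; the variable names and the sample sentence are made up for illustration. With a double-quoted tag `$lang` is interpolated and `\$` becomes a plain dollar sign; with a single-quoted tag the text is taken exactly as written.

```perl
use strict;
use warnings;

my $lang = "Perl";    # illustrative value

# Double-quoted tag: variables and backslash escapes are processed.
my $interpolated = <<"EOF";
Cost of $lang book: \$30
EOF

# Single-quoted tag: nothing is processed, not even the backslash.
my $literal = <<'EOF';
Cost of $lang book: \$30
EOF

print $interpolated;
print $literal;
```

The single-quoted form is convenient for text that is full of dollar signs or at-signs, such as embedded shell or awk fragments.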

This is a pretty old and very useful mechanism for incorporating streams of input data into scripts. It was probably first used in OS/360 job control language (JCL) and was later adopted by Unix shells. As with many such old mechanisms, the position of the end tag is crucial. Generally the following rules apply:

  1. There must be no space between << and the tag, unless the tag is quoted. Space between = and << is allowed.
  2. The statement that contains the << string literal operator is a regular Perl statement and needs its semicolon at the end of that same line, not after the end tag.
  3. You can't (easily) have any space in front of the end tag. If you want to indent the text in the here document, you can do this:

#2345678901234567890
  ($VAR = <<"EOF") =~ s/^\s+//gm;
  your text
  goes here

EOF

But the EOF mark still needs to start in the very first column. If you really want it to be indented, you have to include the spaces in the end tag too:

($quote = <<'    EOF') =~ s/^\s+//gm;
        ...
    EOF
$quote =~ s/\s*--/\n--/;
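Newer interpreters offer a shorter route to the same result. Perl 5.26 and later accept a tilde after `<<`, which lets the end tag be indented and strips that indentation from every line of the text. The sketch below assumes such an interpreter is available; the `banner` subroutine and its text are hypothetical.

```perl
use v5.26;    # the <<~ form is not available in older Perls
use warnings;

# Hypothetical helper that returns an indented HERE document.
sub banner {
    my ($name) = @_;
    my $text = <<~"EOF";
        Hello, $name
          indented detail
        EOF
    return $text;
}

print banner("admin");
```

The whitespace in front of the closing tag is removed from each line, so relative indentation inside the text survives.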

The classic Perl idiom that uses this construct is producing a help screen for a utility when you discover that either the number of parameters or some parameter is wrong. The main advantage is that you can use quotes and other special characters within the message without fear of confusing Perl about where the string ends.

die <<"EOF";

Usage $my_util -a[lr]  file

-a -- all
-l -- long format
-r -- recursive

EOF

Since the tag is written as "EOF" in double quotes, interpolation happens and $my_util will be replaced with its value. Also please note that the end marker on the closing line is not put in quotes; quoting it there would keep Perl from finding the terminator, which is a syntax error.
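To show the idiom in context, here is a minimal sketch of an argument check built around such a help screen. The subroutine names `usage` and `check_args`, and the rule that exactly one file argument is required, are assumptions made for illustration only.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

my $my_util = basename($0);    # name the script was started under

# Build the help screen; the double-quoted tag lets $my_util interpolate.
sub usage {
    return <<"EOF";
Usage: $my_util -a[lr] file

  -a -- all
  -l -- long format
  -r -- recursive
EOF
}

# Hypothetical check: exactly one file argument is required.
sub check_args {
    my @args = @_;
    die usage() unless @args == 1;
    return $args[0];
}

# A real script would simply call check_args(@ARGV) and let die end the run.
# Here the call is wrapped in eval so the message can be inspected instead.
my $file = eval { check_args() };
my $message = $@;
print $message unless defined $file;
```

Because the message ends with a newline, die prints it as is and does not append the usual "at ... line ..." suffix.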

The third type (backticked end tag) is probably the most interesting, as it permits a simple and convenient way to generate and then execute a shell script in Perl. For example:

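# "scriptname" is the file that holds the shell script
open(my $fd, '<', "scriptname") or die "Cannot open scriptname: $!";
my @script = <$fd>;
close($fd);

local $" = '';   # join the lines of @script as they are, without the default space
$line =<<`SCRIPT`;
@script
SCRIPT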

For simplicity this example just reads the script from a file instead of generating it on the fly. The output of the script is captured in the variable $line.
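Generating the script on the fly can be sketched as follows. This example assumes a Unix-like system with a Bourne-compatible shell and the echo command; the list of names and the generated commands are made up for illustration. Perl interpolates the text first and only then hands it to the shell.

```perl
use strict;
use warnings;

# Hypothetical work list; each entry becomes one shell command.
my @names = qw(alpha beta gamma);
my $commands = join '', map { "echo processing $_\n" } @names;

# The backticked tag runs the interpolated text through the shell
# and returns whatever the commands wrote to standard output.
my $output = <<`SCRIPT`;
$commands
SCRIPT

print $output;
```

Only standard output is captured; anything the commands write to standard error still goes to the terminal.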

DATA filehandle and __DATA__ and __END__ tokens

The DATA filehandle provides an elegant and simple mechanism to include a small datafile directly into the script.

Often simple scripts use just one small datafile. In this case you can attach this datafile to the code and process it 'in place'. That permits copying just one file for the utility. For example, the example above that generates a simple help screen for a utility can be rewritten using the DATA filehandle and the __DATA__ token. Note that lines read from DATA are plain data and are not interpolated, so the variable has to be substituted explicitly:

$my_util="supercopy";
while ($line = <DATA>) {
    $line =~ s/\$my_util/$my_util/g;   # DATA lines are not interpolated; substitute by hand
    print $line;
}
__DATA__
 Usage $my_util -a[lr]  file
   -a -- all
   -l -- long format
   -r -- recursive

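Warning:

A special token __END__ can also be used with the DATA special filehandle, but only in the top-level script, where the rest of the file becomes readable via main::DATA. The __DATA__ token works in any file, including modules loaded with require or use, and belongs to the package that is current where the token appears; its text is read through PACKAGE::DATA. Either token marks the logical end of the code in its file, so only the first one in a file has effect and everything after it is data. The __DATA__ token can therefore appear more than once in a program only across files: when a program consists of several files that define different packages, each of them can have its own DATA handle.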

Macrovariables __FILE__ and __LINE__

The special tokens __FILE__ and __LINE__ are often used in conjunction with special filehandles like DATA. It's important to understand that they are not variables but macros that are replaced with their values during compilation.

  • __FILE__ is replaced with the name of the current file.
  • __LINE__ is replaced with the number of the line that contains the statement where this macro variable is used.
  • They give information similar to caller(), but there is a difference: the caller() built-in function reports the file and line from which the current subroutine was called (the stack frame above the executing statement), not the place of the statement itself. Below is a test that illustrates this fact:

    print __FILE__, " ", __LINE__, "@{[caller()]}\n";

    Note that since they are not variables per se, they are not interpolated inside double quotes; keep them outside the quotes, as in the example above.
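The difference between __LINE__ and caller() becomes visible once a subroutine is involved. The following is a minimal sketch; the subroutine name `report` and the hash keys are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: compares its own position with that of its caller.
sub report {
    my (undef, $file, $line) = caller();   # where report() was called from
    return { here => __LINE__, called_from => $line, file => $file };
}

my $info = report();
my $call_line = __LINE__ - 1;   # line of the report() call just above

print "report() body is at line $info->{here}, called from line $info->{called_from}\n";

# Inside double quotes the tokens stay plain text unless wrapped in @{[ ]}.
print "Interpolated: @{[ __FILE__ ]}:@{[ __LINE__ ]}\n";
print "Not interpolated: __FILE__:__LINE__\n";
```

__LINE__ always names the line it is written on, while caller() names the line of the call, which is the more useful value for diagnostics issued from a shared subroutine.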

    A Sample Script

    A typical example of HERE document usage is the generation of HTML in CGI scripts. Perl lets a string literal run over multiple lines within a single print statement; only the closing quote and semicolon tell Perl that the statement is finished. For example, using a regular double-quoted literal, a simple HTML page can be generated in the following way:

    $language="Perl";
    print "
    <title>A Hello $language Page</title>
    <h1>A Hello $language Page</h1>
    <hr>
    <p>Hello $language.
    <p>It's not the first language that I learn
      and I got used not expect much from the compiler.
    <p>But I was pretty upset when I discovered
    how difficult is to catch a syntax error in $language
    and how fuzzy $language diagnostic is.
    <p>Your truly
    <br>Joe User
    "; 

    HERE documents provide a more convenient way to do the same thing:

    print <<"EOF";
    <title>A Hello $language Page</title>
    <h1>A Hello $language Page</h1>
    <hr>
    <p>Hello $language.
    <p>It's not the first language that I learn
     and I got used not expect much from the compiler.
    <p>But I was pretty upset when I discovered
    how difficult is to catch a syntax error in $language
    and how fuzzy $language diagnostic is.
    <p>Your truly
    <br>Joe User
    EOF

    __DATA__ is even more convenient, but text read from it is not interpolated. If interpolation of variables is required, it has to be done explicitly, for example via the eval function:

    ... ... ...

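    { local $/ = undef; $text = <DATA>; }   # $/ (not $\) is the input record separator; undef reads all of DATA at once
    $line = eval "<<\"__EOT__\";\n$text\n__EOT__\n";   # evaluate the text as a double-quoted HERE document

    ... ... ...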

    __DATA__

      <title>A Hello $language Page</title>
        <h1>A Hello $language Page</h1>
        <hr>

        <p>Hello $language.
        <p>It's not the first language that I learn
         and I got used not expect much from the compiler.
        <p>But I was pretty upset when I discovered
        how difficult is to catch a syntax error in $language
        and how fuzzy $language diagnostic is.
        <p>Your truly
        <br>Joe User
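Since eval executes whatever Perl code ends up in the text, a more conservative alternative is to substitute only names you know about. The sketch below is one possible approach under stated assumptions, not the method used above: the `%vars` hash and the shortened template are hypothetical, and a single-quoted HERE document stands in for the text read from DATA.

```perl
use strict;
use warnings;

# Hypothetical table of values that may appear in the template.
my %vars = (language => "Perl");

# Single-quoted tag: the template text is kept exactly as written.
my $template = <<'EOT';
<title>A Hello $language Page</title>
<p>Price: $5, author: $unknown
EOT

# Replace $name only when name is a known key; leave everything else alone.
(my $page = $template) =~ s/\$(\w+)/exists $vars{$1} ? $vars{$1} : "\$$1"/ge;

print $page;
```

Unknown names such as `$unknown` and stray dollar amounts pass through unchanged, so the data cannot trigger code execution.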


    Copyright © 1996-2009 by Dr. Nikolai Bezroukov. www.softpanorama.org was created as a service to the UN Sustainable Development Networking Programme (SDNP) in the author free time. This document is an industrial compilation designed and created exclusively for educational use and is placed under the copyright of the Open Content License(OPL). Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.

    Created: November 7 1998; Last modified: September 05, 2009