Perl Tips/Snippets


Here are some of the most useful Perl tips and snippets that I collected:

  1. If you use OFMs (orthodox file managers), it is easy to save keystrokes when checking Perl scripts. Configure the pl extension or a user menu item to invoke perl -cw !.! (!.! is the FAR idiom for the current file; other OFMs such as Midnight Commander use different macros).

     

  2. The special variable $^O contains the name of your operating system (for example, linux, MSWin32, or darwin); it is set when Perl is built.

    So there is no need for something like:

    $OS=`uname`; chomp $OS;

    Perl defines a number of other such scalars for you as well (see perlvar).
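    A quick sketch of how $^O is typically used (the strings shown are the standard $^O values for Linux, Windows, and macOS):

    use strict;
    use warnings;

    if    ( $^O eq 'linux' )   { print "Running on Linux\n"; }
    elsif ( $^O eq 'MSWin32' ) { print "Running on Windows\n"; }
    elsif ( $^O eq 'darwin' )  { print "Running on macOS\n"; }
    else                       { print "Running on $^O\n"; }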

  3. Create a log file and write important messages to it, as sketched below.
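    A minimal sketch of such a logger (the file name, severity labels, and message format here are arbitrary choices, not a standard):

    use strict;
    use warnings;

    my $LOG_FILE = 'myscript.log';   # hypothetical log file name

    sub logme {
       my ($severity, $message) = @_;
       open( my $log, '>>', $LOG_FILE ) or die "Can't open $LOG_FILE: $!";
       printf $log "%s [%s] %s\n", scalar(localtime), $severity, $message;
       close($log);
    }

    logme( 'INFO',    'script started' );
    logme( 'WARNING', 'something looks odd' );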
     

  4. In a more or less complex script, control the printing of debugging information with a variable (for example $debug). Design and maintain your own system of diagnostic output from the various subroutines of the program.

    For a more or less complex program, diagnostic output using special print statements is the most efficient debugging method. It should be controlled by a special variable, for example $debug, which can hold an integer or a set of bit flags. For example:

    ($debug) && print "text=$text";
    You can also use binary literals and the & operator, which lets you operate with a small set of debug flags, one for each section of the program. The following code snippet demonstrates this:
    # $debug holds eight flag bits (one byte)
    $debug=0b10110000;

    if ( $debug & 0b10000000) {
       print "Some diagnostic output\n";
    } elsif ( $debug & 0b01000000) {
       print "Other (possibly more detailed) diagnostic output\n";
    }
  5. To initialize a list of words, use qw (note that qw takes whitespace-separated words, not commas):
    @mylist=qw(one two three four);
  6. You can check Perl syntax in VIM on each save
    au BufWritePost *.pl,*.pm !perl -c %

    Every time you save a .pl or .pm file, it executes perl -c and shows you the output.

    ~~
    naChoZ

  7. Dynamic activation of the debugger (from the book "Perl Debugged"):
    while (<INPUT>) {
       $DB::trace = 1, next if /debug/;
       $DB::trace = 0, next if /nodebug/;
       # more code
    }

    When run under the debugger, this enables tracing when the loop encounters an input line containing "debug" and ceases tracing upon reading one containing "nodebug".

    You can switch to interactive debugging by using:

    $DB::single = 1;

    instead. That also provides a way to debug code in BEGIN blocks (which otherwise are executed before control is given to the debugger); see the sketch below.
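    A small sketch of forcing a debugger stop inside a BEGIN block (run it under perl -d; $DB::single is the documented debugger variable, everything else is made up):

    BEGIN {
       $DB::single = 1;   # under perl -d, the debugger stops at the next statement
       my $compile_time_setup = 'this code runs before the main program';
    }
    print "main program starts here\n";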

  8. Sometimes it makes sense to use regular expressions instead of substr. One such task is extracting the components of a date, for example:
    $cur_date='20060325';
    ($year, $month, $day)=$cur_date=~/(\d{4})(\d\d)(\d\d)/;
  9. Getting Perl cross-reference reports. The B::Xref module can be used to generate cross-reference reports for Perl programs:
    perl -MO=Xref[,OPTIONS] scriptname.plx
  10. Setting a parameter to a default value:
    # --- process the second parameter
    $msglevel=($ARGV[1]) ? $ARGV[1] : $msglevel; # the default is the three-digit constant (see below)
    ($msglevel1, $msglevel2, $testing) = split(//,$msglevel); # get one-byte flags
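    Since Perl 5.10 the defined-or operator expresses the same idea more directly (and, unlike the truth test above, accepts an explicit 0 as a value); a possible alternative:

    $msglevel = $ARGV[1] // $msglevel;   # keep the default unless a defined value was passed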
  11. Creating timestamp
    # Timestamp
    #
    ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst)=localtime(time);
    $year+=1900;
    $mon++;
    for ($mon, $mday, $hour, $min, $sec) {
       if (length($_)==1) {
           $_="0$_";
       }
    } 
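    The zero-padding loop above can also be replaced with a single sprintf; a compact sketch:

    my ($sec, $min, $hour, $mday, $mon, $year) = localtime(time);
    my $timestamp = sprintf( "%04d%02d%02d %02d:%02d:%02d",
                             $year + 1900, $mon + 1, $mday, $hour, $min, $sec );
    print "$timestamp\n";   # e.g. 20060325 14:02:09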
  12. Move a file via link/unlink (source and target should be on the same filesystem):
    link($_[0], $target);
    if (-e $target) {
       unlink($_[0]);
    } else {
       logger("Failed to move the file '$_[0]' to '$home/$_[1]/$target' \n");
       return;
    }
  13. Removing duplicates: here the second part of the || is evaluated only if $new{$match} was still zero or undefined (the post-increment then marks it as seen):
    if ( $new{$match}++ || !( $tags{$match} = sprintf( "%s\t%s\t?^%s\$?\n", $match, $ARGV, $_ ) ) )
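    For the common case of removing duplicates from a plain list, the usual %seen idiom is simpler; a small sketch:

    my @list = qw(one two one three two);
    my %seen;
    my @unique = grep { !$seen{$_}++ } @list;   # (one two three) - keeps the first occurrence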

Old News ;-)

[Sep 21, 2018] Preferred editor or IDE for development work - Red Hat Learning Community

PyCharm supports Perl, although this is not advertised.
Sep 21, 2018 | learn.redhat.com

Re: Preferred editor or IDE for development work

I don't do a lot of development work, but while learning Python I've found pycharm to be a robust and helpful IDE. Other than that, I'm old school like Proksch and use vi.

MICHAEL BAKER
SYSTEM ADMINISTRATOR, IT MAIL SERVICES

micjohns

Re: Preferred editor or IDE for development work

Yes, I'm the same as @Proksch. For my development environment at Red Hat, vim is easiest to use as I'm using Linux to pop in and out of files. Otherwise, I've had a lot of great experiences with Visual Studio.

[Sep 10, 2018] Parsing HTML with Perl by A. Sinan Unur

Notable quotes:
"... Editor's note: If you're looking for tips on how to write more efficient, robust, and maintainable Perl code, you'll want to check out Damien Conway's " Modern Perl Best Practices " video. ..."
Feb 06, 2014 | radar.oreilly.com

Efficiently manipulate documents on the Web

The need to extract interesting bits of an HTML document comes up often enough that by now we have all seen many ways of doing it wrong and some ways of doing it right for some values of "right".

One might think that one of the most fascinating answers on Stackoverflow has put an end to the desire to parse HTML using regular expressions, but time and again such a desire proves too tempting .

Let's say you want to check all the links on a page to identify stale ones, using regular expressions:

use strict;
use warnings;
use feature 'say';

my $re = qr/<a\s+href\s*=\s*["']([^"']+)["']/i;

my $html = do { local $/; <DATA> }; # slurp __DATA__ section

my @links = ($html =~ m{ $re }gx);

say for @links;

__DATA__
<html><body>
<p><a href="http://example.com/">An Example</a></p>
<!-- <a href="http://invalid.example.com/">An Example</a> -->
</body></html>

In this self-contained example, I put a small document in the __DATA__ section. This example corresponds to a situation where the maintainer of the page commented out a previously broken link, and replaced it with the correct link.

When run, this script produces the output:

$ ./href.pl
http://example.com/
http://invalid.example.com/

It is surprisingly easy to fix using HTML::TokeParser::Simple . Just replace the body of the script above with:

use HTML::TokeParser::Simple;

my $parser = HTML::TokeParser::Simple->new( handle => \*DATA );

while ( my $anchor = $parser->get_tag('a') ) {
    next unless defined( my $href = $anchor->get_attr('href') );
    say $href;
}

When run, this script correctly prints:

$ ./href
http://example.com/

And, it looks like we made it much more readable in the process!

Of course, interesting HTML parsing jobs involve more than just extracting links. While even that task can be made ever-increasingly complex for the regular expression jockey by, say, adding some interesting attributes between the a and the href , code using HTML::TokeParser::Simple would not be affected.

Another specialized HTML parsing module is HTML::TableExtract . In most cases, it makes going through tables on a page a breeze. For example, the State Actions to Address Health Insurance Exchanges contains State Table 2: Snapshot of State Actions and Figures. The contents of this page may change with new developments, so here is a screenshot of the first few lines of the table:

[Screenshot: the first few lines of State Table 2]

Parsing this table using HTML::TableExtract is straightforward:

use HTML::TableExtract;
use Text::Table;

my $doc = 'state-actions-to-implement-the-health-benefit.aspx';
my $headers = [ 'State', 'Selected a Plan' ];

my $table_extract = HTML::TableExtract->new( headers => $headers );
my $table_output  = Text::Table->new( @$headers );

$table_extract->parse_file($doc);
my ($table) = $table_extract->tables;

for my $row ( $table->rows ) {
    clean_up_spaces($row); # not shown for brevity
    $table_output->load($row);
}

print $table_output;

Running this script yields:

$ ./te
State    Selected a Plan
Alabama  624
Alaska   53
Arizona  739
Arkansas 250

Note that I did not even have to look at the underlying HTML code for this to work. If it hadn't worked, I would have had to delve into that mess to find the specific problem, but in this case, as in many others in my experience, HTML::TableExtract gave me just what I wanted. So long as the substrings I picked continue to match the content, my script will extract the desired columns even if some of the underlying HTML changes.

Both HTML::TokeParser::Simple (based on HTML::PullParser) and HTML::TableExtract (which subclasses HTML::Parser) parse a stream rather than loading the entire document into memory and building a tree. This made them performant enough for whatever I was able to throw at them in the past.

With HTML::TokeParser::Simple , it is also easy to stop processing a file once you have extracted what you need. That helps when you are dealing with thousands of documents, each several megabytes in size, where the interesting content is located towards the beginning. With HTML::TableExtract , performance can be improved by switching to less robust table identifiers such as depths and counts (sketched below). However, in certain pathological conditions I seem to run into a lot, you may need to play with regexes to first extract the exact region of the HTML source that contains the content of interest.
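For illustration, a minimal sketch of using the documented depth and count options instead of headers (the specific numbers are invented, and $doc is the file name from the example above):

use HTML::TableExtract;

# take the third table (count => 2) nested at the top level (depth => 0)
my $table_extract = HTML::TableExtract->new( depth => 0, count => 2 );
$table_extract->parse_file($doc);
my ($table) = $table_extract->tables;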

In one case I had to process large sets of HTML files where each file was about 8 MB. The interesting table occurred about three-quarters of the way through the HTML source, and it was clearly separated from the rest of the page by <!-- interesting content here --> style comments. In this particular case, slurping each file, extracting the interesting bit, and passing only that content to HTML::TableExtract helped. Throw a little Parallel::ForkManager into the mix, and a task that used to take a few hours went down to less than half an hour.
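A rough sketch of that slurp-and-extract approach (the file name and the comment markers below are placeholders, not taken from the original article):

use HTML::TableExtract;

my $file = 'big-page.html';   # placeholder name
my $html = do { local $/; open my $fh, '<', $file or die $!; <$fh> };   # slurp the whole file

# keep only the region between the marker comments (markers are invented here)
my ($region) = $html =~ m{<!-- interesting content here -->(.*?)<!-- end of interesting content -->}s;

my $table_extract = HTML::TableExtract->new( headers => [ 'State', 'Selected a Plan' ] );
$table_extract->parse($region);   # parse() accepts a string rather than a file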

Sometimes, you just need to be able to extract the contents of the third span within the sixth paragraph of the first content div on the right. Especially if you need to extract multiple pieces of information depending on various parts of the document, creating a tree structure will make that task simpler. It may have a huge performance cost, however, depending on the size of the document. Building trees out of the smallest possible HTML fragments can help here.

Once you have the tree structure, you can address each element or sets of elements. XPath is a way of addressing those elements. HTML::TreeBuilder builds a tree representation of HTML documents. HTML::TreeBuilder::XPath adds the ability to locate nodes in that representation using XPath expressions. So, if I wanted to get the table of contents of the same document, I could have used something along the lines of:

use HTML::TreeBuilder::XPath;
use Text::Table;

my $doc = 'state-actions-to-implement-the-health-benefit.aspx';

my $tree = HTML::TreeBuilder::XPath->new;
my $toc_table = Text::Table->new( 'Entry', 'Link' );

$tree->parse_file($doc);

my @toc = $tree->findnodes( '//table[@id="bookmark"]/tbody/*/*/*//li/a' );

for my $el ( @toc ) {
    $toc_table->add(
        $el->as_trimmed_text,
        $el->attr('href'),
    );
}

print $toc_table;

Mojo::DOM is an excellent module that uses jQuery-style selectors to address individual elements. It is extremely helpful when dealing with documents where HTML elements, classes, and ids were used in intelligent ways.
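As a small illustration (not from the original article; $html is assumed to already hold the document), link extraction with Mojo::DOM might look like:

use Mojo::DOM;
use feature 'say';

my $dom = Mojo::DOM->new($html);
say $_->attr('href') for $dom->find('a[href]')->each;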

XML::Twig will also work for some HTML documents, but in general, using an XML parser to parse HTML documents found in the wild is perilous. On the other hand, if you do have well-formed documents, or HTML::Tidy can make them nice, XML::Twig is a joy to use. Unfortunately, it is depressingly common to find documents pretending to be HTML, using a mish-mash of XML and HTML styles, and doing all sorts of things which browsers can accommodate, but XML parsers cannot.

And, if your purpose is just to clean some wild HTML document, use HTML::Tidy . It gives you an interface to the command line utility tidyp . For really convoluted HTML, it sometimes pays to pass through tidyp first before feeding it into one of the higher level modules.

Thanks to others who have built on HTML::Parser , I have never had to write a line of event handler code myself for real work. It is not that they are difficult to write. I do recommend you study the examples bundled with the distribution to see how the underlying machinery works. It is just that the modules others have built on top of and beyond HTML::Parser make life so much easier that I never had to worry much about going to the lowest possible level.

That's a good thing.

Editor's note: If you're looking for tips on how to write more efficient, robust, and maintainable Perl code, you'll want to check out Damian Conway's "Modern Perl Best Practices" video.

[May 28, 2018] Handling Binary Files in Perl

May 28, 2018 | www.devx.com

For some reason, there exists a common misconception that there is no cross-platform, built-in way in Perl to handle binary files. The copy_file code snippet below illustrates that Perl handles such tasks quite well. The trick is to use "binmode" on both the input and output files after opening them. "Binmode" switches files to binary mode, which for the input file means it won't stop reading at the first "end of text file" character (^Z in win/dos); for the output file binmode means it won't translate '\n' (LF) into '\r\n' (CRLF) when printing. In this way the files get copied byte for byte.

sub copy_file {
  my ($srcfile, $destfile) = @_;
  my $buffer;

  open INF, $srcfile
    or die "\nCan't open $srcfile for reading: $!\n";
  open OUTF, ">$destfile"
    or die "\nCan't open $destfile for writing: $!\n";

  binmode INF;
  binmode OUTF;

  while (
    read (INF, $buffer, 65536)  # read in (up to) 64k chunks, write
    and print OUTF $buffer      # exit if read or write fails
  ) {};
  die "Problem copying: $!\n" if $!;

  close OUTF
    or die "Can't close $destfile: $!\n";
  close INF
    or die "Can't close $srcfile: $!\n";
}
Atanas Banov
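For what it's worth, the core File::Copy module performs a binary-safe copy as well, so (assuming the same $srcfile and $destfile variables as above) a shorter alternative would be:

use File::Copy;

copy($srcfile, $destfile) or die "Copy failed: $!";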

[May 09, 2018] reading binary files with Perl

May 09, 2018 | www.perlmonks.org

jpk1292000 has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks, I'm new to the board and I've been struggling with this problem for some time now. Hope someone can give me some suggestions... I am trying to read a binary file with the following format: The 4-byte integer and (4 byte float) are in the native format of the machine.


*** First record ***
(4 byte integer)   byte size of record (4*N)    (f77 header)
(4 byte float)     value 1
(4 byte float)     value 2
...
(4 byte float)     value N     (N = number of grid points in the field)
(4 byte integer)   byte size of record (4*N)    (f77 trailer)

**** Second record ****
(same layout: f77 header, N float values, f77 trailer)


The data is meteorological data (temperature in degrees K) on a 614 x 428 grid. I tried coding up a reader for this, but am getting nonsensical results. Here is the code:


my $out_file = "/dicast2-papp/DICAST/smg_data/" . $gfn . ".bin"; #path
+

 to binary file my $template = "if262792i"; #binary layout (integer 262792 floats 
      in
+

teger) as described in the format documentation 
      above (not sure if th
+

is is correct) my $record_length 
      = 4; #not sure what record_length is supposed to rep
+

resent 
      (number of values in 1st record, or should it be length of var
+

iable 
      [4 bytes]) my (@fields,$record); open (FH, $out_files ) || die "couldn't open $out_files\n"; until (eof(FH)) { my $val_of_read 
      = read (FH, $record, $record_length) == $record_
+

length 
      or die "short read\n"; @fields = unpack ($template, $record); print "field = $fields[0]\n"; }

[download]


The results I get when I print out the first field are nonsensical (negative numbers, etc). I think the issue is that I'm not properly setting up my template and record length. Also, how do I find out what is "the native format of the machine"?






davorg (Chancellor) on Nov 16, 2006 at 15:53 UTC

Re: reading binary files with Perl

You can find out more about how "read" works by reading its documentation .

From there, you'll find out that the third parameter (your $record_length) is the number of bytes to read from the filehandle[1]. As your template is set up to handle all of the data for one record in one go, you'll need to read one record's worth of data. That's 4 * (1 + 262792 + 1) bytes of data. Currently you're reading four bytes, and the template is looking for a lot more.

The documentation for unpack says this:

If there are more pack codes or if the repeat count of a field or a group is larger than what the remainder of the input string allows, the result is not well defined: in some cases, the repeat count is decreased, or unpack() will produce null strings or zeroes, or terminate with an error. If the input string is longer than one described by the TEMPLATE, the rest is ignored.

[1] Actually, the number of _characters_ but let's assume single byte characters for the time being.


ikegami (Pope) on Nov 16, 2006 at 16:04 UTC

Re: reading binary files with Perl

Depending on your OS, another problem is the lack of binmode . Add binmode(FH) after the open so that Perl doesn't mess with the data. Not all OSes require binmode , but it's safe to use binmode on all OSes.

Oh and I'd use l instead of i . i is not guaranteed to be 4 bytes.

BrowserUk (Pope) on Nov 16, 2006 at 16:13 UTC

Re: reading binary files with Perl

Something like this should do it. See the docs and/or ask for anything you do not understand.


#! perl -slw
use strict;

my @grid;

open my $fh, '<:raw', 'the file' or die $!;

while( 1 ) {
    my( $recSize, $dummy, $record );

    sysread( $fh, $recSize, 4 ) or last;
    $recSize = unpack 'N', $recSize;                ##(*)

    sysread( $fh, $record, $recSize ) == $recSize
        or die "truncated record";

    sysread( $fh, $dummy, 4 ) == 4
        and unpack( 'N', $dummy ) == $recSize       ##(*)
        or die "missing or invalid trailer";

    ## (*) You may need V depending upon which platform your file was created on

    push @grid, [ unpack 'N*', $record ];
}

close $fh;

## @grid should now contain your data
## Addressable in the usual $grid[ X ][ Y ] manner.
## Though it might be $array[ Y ][ X ]
## I forget which order FORTRAN writes arrays in?




jmcnamara (Monsignor) on Nov 16, 2006 at 16:33 UTC

Re: reading binary files with Perl


Try something like the following:


#!/usr/bin/perl -w
use strict;

open FILE, 'file.bin' or die "Couldn't open file: $!\n";
binmode FILE;

my $record = 1;
my $buffer = '';

while ( read( FILE, $buffer, 4 ) ) {

    my $record_length = unpack 'N', $buffer;
    my $num_fields    = $record_length / 4;

    printf "Record %d. Number of fields = %d\n", $record, $num_fields;

    for (1 .. $num_fields) {
        read( FILE, $buffer, 4 );
        my $temperature = unpack 'f', $buffer;

        # Or if the above gives the wrong result try this:
        #my $temperature = unpack 'f', reverse $buffer;

        print "\t", $temperature, "\n";
    }

    # Read but ignore record trailer.
    read( FILE, $buffer, 4 );

    print "\n";
    $record++;
}

__END__


If the number of fields is wrong, substitute unpack 'V' for unpack 'N' . If the float is wrong, try the reversed value that is commented out.

Update: Added read for trailer.

--
John.

[May 09, 2018] How to read binary file in Perl - Stack Overflow

Notable quotes:
"... BTW: I don't think it's a good idea to read tons of binary files into memory at once. You can search them 1 by 1... ..."
May 09, 2018 | stackoverflow.com



Grace ,Jan 19, 2012 at 2:08

I'm having an issue with writing a Perl script to read a binary file.

My code is as follows, where the $file entries are files in binary format. I tried to search through the web and apply what I found in my code, and tried to print the results, but it doesn't seem to work well.

Currently it only prints the '&&&&&&&&&&&' and 'ppppppppppp' markers, but what I really want is for it to print out each $line , so that I can do some other post-processing later. Also, I'm not quite sure what $data is; I see it is part of the sample code from an article, which states it is supposed to be a scalar. I need somebody who can pinpoint where my code goes wrong. Below is what I did.

my $tmp = "$basedir/$key";
opendir (TEMP1, "$tmp");
my @dirs = readdir(TEMP1);
closedir(TEMP1);

foreach my $dirs (@dirs) {
    next if ($dirs eq "." || $dirs eq "..");
    print "---->$dirs\n";
    my $d = "$basedir/$key/$dirs";
    if (-d "$d") {
        opendir (TEMP2, $d) || die $!;
        my @files = readdir (TEMP2); # This should read binary files
        closedir (TEMP2);

        #my $buffer = "";
        #opendir (FILE, $d) || die $!;
        #binmode (FILE);
        #my @files =  readdir (FILE, $buffer, 169108570);
        #closedir (FILE);

        foreach my $file (@files) {
            next if ($file eq "." || $file eq "..");
            my $f = "$d/$file";
            print "==>$file\n";
            open FILE, $file || die $!;
            binmode FILE;
            foreach ($line = read (FILE, $data, 169108570)) {
                print "&&&&&&&&&&&$line\n";
                print "ppppppppppp$data\n";
            }
            close FILE;
        }
    }
}

I have altered my code as below. Now I can read $data . Thanks to J-16 SDiZ for pointing that out. I'm trying to push the info I got from the binary file into an array called "@array", intending to grep the array for any string matching "p04", but it fails. Can someone point out where the error is?

my $tmp = "$basedir/$key";
opendir (TEMP1, "$tmp");
my @dirs = readdir (TEMP1);
closedir (TEMP1);

foreach my $dirs (@dirs) {
    next if ($dirs eq "." || $dirs eq "..");
    print "---->$dirs\n";
    my $d = "$basedir/$key/$dirs";
    if (-d "$d") {
        opendir (TEMP2, $d) || die $!;
        my @files = readdir (TEMP2); #This should read binary files
        closedir (TEMP2);

        foreach my $file (@files) {
            next if ($file eq "." || $file eq "..");
            my $f = "$d/$file";
            print "==>$file\n";
            open FILE, $file || die $!;
            binmode FILE;
            foreach ($line = read (FILE, $data, 169108570)) {
                print "&&&&&&&&&&&$line\n";
                print "ppppppppppp$data\n";
                push @array, $data;
            }
            close FILE;
        }
    }
}

foreach $item (@array) {
    #print "==>$item<==\n"; # It prints out content of binary file without the ==> and <== if I uncomment this.. weird!
    if ($item =~ /p04(.*)/) {
        print "=>$item<===============\n"; # It prints "=><===============" according to the number of binary file I have.  This is wrong that I aspect it to print the content of each binary file instead :(
        next if ($item !~ /^w+/);
        open (LOG, ">log") or die $!;
        #print LOG $item;
        close LOG;
    }
}

Again, I changed my code as follows, but it still doesn't work: it is not able to grep "p04" correctly, judging by the "log" file. It grepped the whole file, including binary, like this: "@^@^@^@^G^D^@^@^@^^@p04bbhi06^@^^@^@^@^@^@^@^@^@hh^R^@^@^@^^@^@^@p04lohhj09^@^@^@^^@@". What I expect is for it to grep only entries containing p04, such as p04bbhi06 and p04lohhj09. Here is how my code goes:

foreach my $file (@files) {
    next if ($file eq "." || $file eq "..");
    my $f = "$d/$file";
    print "==>$file\n";
    open FILE, $f || die $!;
    binmode FILE;
    my @lines = <FILE>;
    close FILE;
    foreach $cell (@lines) {
        if ($cell =~ /b12/) {
            push @array, $cell;
        }
    }
}

#my @matches = grep /p04/, @lines;
#foreach $item (@matches) {
foreach $item (@array) {
    #print "-->$item<--";
    open (LOG, ">log") or die $!;
    print LOG $item;
    close LOG;
}

Brad Gilbert ,Jan 19, 2012 at 15:53

use autodie Brad Gilbert Jan 19 '12 at 15:53

reinierpost ,Jan 30, 2012 at 13:00

There is no such thing as 'binary format'. Please be more precise. What format are the files in? What characteristics do they have that cause you to call them 'in binary format'? – reinierpost Jan 30 '12 at 13:00

Grace ,Jan 31, 2012 at 6:56

It is in .gds format. This file can be read in Unix with the strings command. It was readable in my Perl script, but I am not able to grep the data I wanted (p04* here in my code). – Grace Jan 31 '12 at 6:56

mivk ,Nov 19, 2013 at 13:16

As already suggested, use File::Find or something to get your list of files. For the rest, what do you really want? Output the whole file content if you found a match? Or just the parts that match? And what do you want to match? p04(.*) matches anything from "p04" up to the next newline. You then have that "anything" in $1 . Leave out all the clumsy directory stuff and concentrate first on what you want out of a single file. How big are the files? You are only reading the first 170MB. And you keep overwriting the "log" file, so it only contains the last item from the last file. – mivk Nov 19 '13 at 13:16

jm666 ,May 12, 2015 at 6:44

@reinierpost the OP under the "binary file" probably mean the opposite of the text files - e.g. same thing as is in the perldoc's -X documentation see the -B explanation. (cite: -B File is a "binary" file (opposite of -T).) – jm666 May 12 '15 at 6:44

J-16 SDiZ ,Jan 19, 2012 at 2:19

Use:
$line = read (FILE, $data, 169108570);

The data is in $data ; and $line is the number of bytes read.

       my $f = "$d/$file" ;
       print "==>$file\n" ;
       open FILE, $file || die $! ;

I guess the full path is in $f , but you are opening $file . (In my testing -- even $f is not the full path, but I guess you may have some other glue code...)

If you just want to walk all the files in a directory, try File::DirWalk or File::Find .

Grace ,Jan 19, 2012 at 2:34

Hi J-16 SDiZ, thanks for the reply. Each of the $file entries is in binary format, and what I want to do is read each of the files, grep some information in readable format, and dump it into another file (which I consider here as post-processing). I want to perform something like "strings <filename> | grep <text syntax>" as in Unix, whereby the <filename> is the $file here in my code. My problem is that I cannot read the binary file, so I cannot proceed with the other stuff. Thanks. – Grace Jan 19 '12 at 2:34

Dimanoid ,Jan 20, 2012 at 8:51

I am not sure if I understood you right.

If you need to read a binary file, you can do the same as for a text file:

open F, "/bin/bash";
my $file = do { local $/; <F> };
close F;

Under Windows you may need to add binmode F; under *nix it works without it.

If you need to find which lines in an array contains some word, you can use grep function:

my @matches = grep /something/, @array_to_grep;

You will get all matched lines in the new array @matches .

BTW: I don't think it's a good idea to read tons of binary files into memory at once. You can search them 1 by 1...

If you need to find where the match occurs you can use another standard function, index :

my $offset = index($file, 'myword');   # index(STRING, SUBSTRING)

Grace ,Jan 30, 2012 at 4:30

Hi Dimanoid, thanks for your answer. I tried it, but it didn't work well for me. I tried to edit my code as above (my own code, and it didn't work). I also tried code as below, as you suggested, and it didn't work for me either. Can you point out where I went wrong? Thanks. – Grace Jan 30 '12 at 4:30

Peter Mortensen ,May 1, 2016 at 8:31

What will $file be assigned to? An array of characters? A string? Something else? – Peter Mortensen May 1 '16 at 8:31


I'm not sure I'll be able to answer the OP question exactly, but here are some notes that may be related. (edit: this is the same approach as answer by @Dimanoid, but with more detail)

Say you have a file, which is a mix of ASCII data, and binary. Here is an example in a bash terminal:

$ echo -e "aa aa\x00\x0abb bb" | tee tester.txt
aa aa
bb bb
$ du -b tester.txt 
13  tester.txt
$ hexdump -C tester.txt 
00000000  61 61 20 61 61 00 0a 62  62 20 62 62 0a           |aa aa..bb bb.|
0000000d

Note that byte 00 (specified as \x00 ) is a non-printable character, (and in C , it also means "end of a string") - thereby, its presence makes tester.txt a binary file. The file has size of 13 bytes as seen by du , because of the trailing \n added by the echo (as it can be seen from hexdump ).

Now, let's see what happens when we try to read it with perl 's <> diamond operator (see also What's the use of <> in perl? ):

$ perl -e '
open IN, "<./tester.txt";
binmode(IN);
$data = <IN>; # does this slurp entire file in one go?
close(IN);
print "length is: " . length($data) . "\n";
print "data is: --$data--\n";
'

length is: 7
data is: --aa aa
--

Clearly, the entire file didn't get slurped - it broke at the line end \n (and not at the binary \x00 ). That is because the diamond filehandle <FH> operator is actually a shortcut for readline (see Perl Cookbook: Chapter 8, File Contents )

The same link tells us that one should undef the input record separator, $/ (which by default is set to \n ), in order to slurp the entire file. You may want to have this change be only local, which is why the braces and local are used instead of undef (see Perl Idioms Explained - my $string = do { local $/; <FILEHANDLE> }; ); so we have:

$ perl -e '
open IN, "<./tester.txt";
print "_$/_\n"; # check if $/ is \n
binmode(IN);
{
local $/; # undef $/; is global
$data = <IN>; # this should slurp one go now
};
print "_$/_\n"; # check again if $/ is \n
close(IN);
print "length is: " . length($data) . "\n";
print "data is: --$data--\n";
'

_
_
_
_
length is: 13
data is: --aa aa
bb bb
--

... and now we can see the file is slurped in its entirety.

Since binary data implies unprintable characters, you may want to inspect the actual contents of $data by printing via sprintf or pack / unpack instead.
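For instance, a quick way to inspect the bytes (a sketch, not part of the original answer) is a hex dump via unpack:

print unpack("H*", $data), "\n";   # every byte of $data as two hex digits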

Hope this helps someone,
Cheers!

[May 04, 2018] bit manipulation - Bit operations in Perl

May 04, 2018 | stackoverflow.com



Toren ,Jan 12, 2011 at 14:50

I have an attribute (32 bits long), where each bit is responsible for a specific piece of functionality. The Perl script I'm writing should turn on the 4th bit, but preserve the previous settings of the other bits.

I use in my program:

sub BitOperationOnAttr
{
    my $a = "";
    MyGetFunc( $a );
    $a |= 0x00000008;
    MySetFunc( $a );
}

** MyGetFunc / MySetFunc are my own functions that read/write the value.

Questions:

  1. Is the usage of $a |= 0x00000008; right?
  2. How do I extract the hex value by regular expression from a string I have? For example:

"Attribute: Somestring: value (8 long (0x8))"

Michael Carman ,Jan 12, 2011 at 16:13

Your questions are not related; they should be posted separately. That makes it easier for other people with similar questions to find them. – Michael Carman Jan 12 '11 at 16:13

toolic ,Jan 12, 2011 at 16:47

Same question asked on PerlMonks: perlmonks.org/?node_id=881892toolic Jan 12 '11 at 16:47

psmears ,Jan 12, 2011 at 15:00

  1. Is the usage of $a |= 0x00000008; right?

Yes, this is fine.

  2. How do I extract the hex value by regular expression from a string I have? For example:

"Attribute: Somestring: value (8 long (0x8))"

I'm assuming you have a string like the above, and want to use a regular expression to extract the "0x8". In that case, something like:

if ($string =~ m/0x([0-9a-fA-F]+)/) {
    $value = hex($1);
} else {
    # string didn't match
}

should work.

Toren ,Jan 16, 2011 at 12:35

Thank you for quick answer. You show me the right way to solve the problem – Toren Jan 16 '11 at 12:35

Michael Carman ,Jan 12, 2011 at 16:32

Perl provides several ways of dealing with binary data: the bitwise operators, vec , and pack / unpack .

Your scenario sounds like a set of packed flags. The bitwise operators are a good fit for this:

my $mask = 1 << 3;   # 0x0008
$value |=  $mask;    # set bit
$value &= ~$mask;    # clear bit
if ($value & $mask)  # check bit

vec is designed for use with bit vectors. (Each element has the same size, which must be a power of two.) It could work here as well:

vec($value, 3, 1) = 1;  # set bit
vec($value, 3, 1) = 0;  # clear bit
if (vec($value, 3, 1))  # check bit

pack and unpack are better suited for working with things like C structs or endianness.
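A tiny illustrative sketch of that last point (the value is arbitrary):

my $packed = pack( 'V', 0x12345678 );   # 32-bit little-endian unsigned integer
my ($n)    = unpack( 'V', $packed );
printf "0x%08x\n", $n;                  # prints 0x12345678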

Toren ,Jan 16, 2011 at 12:36

Thank you . Your answer is very informative – Toren Jan 16 '11 at 12:36

sdaau ,Jul 15, 2014 at 5:01

I upvoted, but there is something very important missing: vec operates on a string! If we use a number; say: $val=5; printf("b%08b",$val); (this gives b00000101 ) -- then one can see that the "check bit" syntax, say: for($ix=7;$ix>=0;$ix--) { print vec($val, $ix, 1); }; print "\n"; will not work (it gives 00110101 , which is not the same number). The correct is to convert the number to ASCII char, i.e. print vec(sprintf("%c", $val), $ix, 1); . – sdaau Jul 15 '14 at 5:01

[Dec 21, 2017] Common Syntax Errors

Dec 21, 2017 | affy.blogspot.com

One very common error is to use elseif instead of the correct elsif keyword. As you program, you'll find that you consistently make certain kinds of errors. This is okay. Everyone has his or her own little quirks. Mine is that I keep using the assignment operator instead of the equality operator. Just remember what your particular blind spot is. When errors occur, check for your personal common errors first.

This section shows some common syntax errors and the error messages that are generated as a result. First, the error message is shown and then the script that generated it. After the script, I'll cast some light as to why that particular message was generated.

Missing semicolon in one of the statements

Scalar found where operator expected at test.pl line 2, near "$bar"
        (Missing semicolon on previous line?)
$foo = { }    # this line is missing a semi-colon.
$bar = 5;

Perl sees the anonymous hash on the first line and is expecting either an operator or the semicolon to follow it. The scalar variable that it finds, $bar, does not fit the syntax of an expression because two variables can't be right after each other. In this case, even though the error message indicates line 2, the problem is in line 1.

Missing quote

Bare word found where operator expected at
    test.pl line 2, near "print("This"
  (Might be a runaway multi-line "" string starting on line 1)
syntax error at test.pl line 2, near "print("This is "
String found where operator expected at test.pl line 3, near "print(""
  (Might be a runaway multi-line "" string starting on line 2)
        (Missing semicolon on previous line?)
Bare word found where operator expected at
    test.pl line 3, near "print("This"
String found where operator expected at test.pl line 3, at end of line
        (Missing operator before ");
?)
Can't find string terminator '"' anywhere before EOF at test.pl line 3.
print("This is a test.\n);    # this line is missing a ending quote.
print("This is a test.\n");
print("This is a test.\n");

In this example, a missing end quote has generated 12 lines of error messages! You really need to look only at the last one in order to find out that the problem is a missing string terminator. While the last error message describes the problem, it does not tell you where the problem is. For that piece of information, you need to look at the first line where it tells you to look at line two. Of course, by this time you already know that if the error message says line 2, the error is probably in line 1.

Unquoted literal

Can't call method "a" in empty package "test" at test.pl line 1.
print(This is a test.\n);    # this line is missing a beginning quote.

The error being generated here is very cryptic and has little to do with the actual problem. In order to understand why the message mentions methods and packages, you need to understand the different, arcane ways you can invoke methods when programming with objects. You probably need to add a beginning quote if you ever see this error message.

... ... ..

This list of syntax errors could go on for quite a while, but you probably understand the basic concepts.

[Dec 20, 2017] Teach Yourself Perl 5 in 21 days - Table of Contents

Dec 20, 2017 | www.davetill.com

Chapter 21 The Perl Debugger



Today's lesson describes the Perl debugging facility. You'll learn how to enter and exit the debugger, list your program, step through it, display variable values, set breakpoints, trace program execution, and define line actions.

Entering and Exiting the Perl Debugger

The following sections describe how to start the Perl debugger and how to exit.

Entering the Debugger

To debug a Perl program, specify the -d option when you run the program. For example, to debug a program named debugtest , specify the following command:

$ perl -d debugtest

You can supply other options along with -d if you want to.

When the Perl interpreter sees the -d option, it starts the Perl debugger. The debugger begins by displaying a message similar to the following one on your screen:

Loading DB routines from $RCSfile: perldb.pl,v  $$Revision: 4.0.1.3

$$Date: 92/06/08 13:43:57 $

Emacs support available.



Enter h for help.



main::(debugtest:3):        $dircount = 0;

  DB<1>

The first few lines display the date on which this version of the debugger was created. The only lines of interest are the last two.

The second-to-last line in this display lists the line that the debugger is about to execute. When the debugger starts, the first executable line of the program is displayed.

When the debugger displays a line that it is about to execute, it also provides the following information about the line: the package in which the line is contained (here, main), the name of the file containing the line (here, debugtest), and the line number (here, 3).

The last line of the display prompts you for a debugging command. The number enclosed in angle brackets indicates the command number; in this case, the number is 1 , because you are about to specify the first debugging command.

Later today you will learn how to use the debugging command number to re-enter debugging commands you have previously executed.

NOTE
To enter the debugger without supplying a program, supply the -e option with the -d option:
$ perl -d -e "1;"
This line starts the debugger with a "program" consisting of the single statement
1;
(which is an expression that doesn't do anything meaningful).
Starting the debugger without a program enables you to examine the predefined system variables or supply statements to be executed. You will learn how to perform both of these tasks later in today's lesson
Exiting the Debugger

To exit the debugger, enter the debugging command q :

DB<1> q

This command halts program execution immediately and returns you to the command shell.

Listing Your Program

You can list any part of your program from within the debugger. The following sections describe debugging commands that perform the display operations.

The l Command

The simplest way to list part of your program is with the l command, which lists the next few statements in your program:

  DB<1> l

3:      $dircount = 0;

4:      $curdir = "";

5:      while (1) {

6:              # if we don't have a current directory, get one

7:              if ($curdir eq "") {

8:                      print ("Enter directory to list:\n");

9:                      $curdir = <STDIN>;

10:                     $curdir =~ s/^\s+|\s+$//g;

11:                     $curdir = &followlink($curdir);

12:                     &readsubdirs($curdir);

The l command lists a window of executable statements, which is a group of statements small enough to be easily displayed on your screen. A window usually consists of about ten statements. The line number of each statement is displayed at the beginning of its line.

NOTE
The statements displayed in today's lesson are taken from the program presented in "Week 2 in Review."

Entering the l command again displays the window of statements immediately following the last displayed line, which in this example is line 12:

  DB<1> l

13:             }

14:             $curdir = &menudir($curdir);

15:     }

16:

17:

18:     # Find all subdirectories of the given directory,

19:     # and store them in an associative array.

20:     #

21:     # The associative array subscripts and values are:

22:     # <directory name>:       1

You can specify the lines displayed by the l command. If you specify a single line number with the l command, it displays that line:

  DB<1> l 10

10:                      $curdir =~ s/^\s+|\s+$//g;

To display a range of lines, specify the first and last line number, and separate them with a hyphen:

  DB<1> l 10-15

10:                      $curdir =~ s/^\s+|\s+$//g;

11:                      $curdir = &followlink($curdir);

12:                      &readsubdirs($curdir);

13:             }

14:             $curdir = &menudir($curdir);

15:     }

To display a specified number of lines starting at a certain point, supply the starting line number, a + character, and a number of additional lines to display:

  DB<1> l 10+5

10:                      $curdir =~ s/^\s+|\s+$//g;

11:                      $curdir = &followlink($curdir);

12:                      &readsubdirs($curdir);

13:             }

14:             $curdir = &menudir($curdir);

15:     }

You also can use the l command to specify a subroutine to display. To do this, provide the name of the subroutine to display:

  DB<1> l readsubdirs

26:     sub readsubdirs {

27:             local ($dirname) = @_;

28:             local ($dirvar, $subdircount, $name, $index);

29:

30:             # open the current directory;

31:             # $dircount ensures that each file variable is unique

32:             $dirvar = "DIR" . ++$dircount;

33:             if (!opendir ($dirvar, $dirname)) {

34:                     warn ("Can't open $dirname\n");

35:                     return;

This command lists the statements in the subroutine. If the subroutine is too large to fit in a single window, only the first few statements are listed; you can list subsequent statements by entering l with no arguments.

The - Command

You can display the lines immediately preceding the last displayed line by entering the - command. For example, the following - command lists the window of lines immediately preceding the subroutine readsubdirs .

  DB<1> -

16:

17:

18:     # Find all subdirectories of the given directory,

19:     # and store them in an associative array.

20:     #

21:     # The associative array subscripts and values are:

22:     # <directory name>:       1

23:     #       (indicates that directory has been read)

24:     # <directory name>.<num>  the <num>th subdirectory

25:

Subsequent - commands go back further in the file.

The w Command

To list a window of lines containing a specified line, use the w command, and specify the number of the line to be included:

  DB<1> w 7

4:      $curdir = "";

5:      while (1) {

6:              # if we don't have a current directory, get one

7:              if ($curdir eq "") {

8:                      print ("Enter directory to list:\n");

9:                      $curdir = <STDIN>;

10:                     $curdir =~ s/^\s+|\s+$//g;

11:                     $curdir = &followlink($curdir);

12:                     &readsubdirs($curdir);

13:             }

The w command displays the three lines before the specified line and fills the window with the lines following it.

The // and ?? Commands

You can search for a line containing a particular pattern by enclosing the pattern in slashes:

  DB<1> /Find/

18:     # Find all subdirectories of the given directory,

The debugger searches forward from the last displayed line for a line matching the specified pattern. If it finds such a line, the line is displayed.

To search backward for a particular pattern, enclose the pattern in question marks:

  DB<1> ?readsubdirs?

12:                      &readsubdirs($curdir);

This command starts with the last displayed line and searches backward until it finds a line matching the specified pattern.

NOTE
Patterns specified by // and ?? can contain any special character understood by the Perl interpreter.
You optionally can omit the final / or ? character when you match a pattern.
The S Command

The S command lists all the subroutines in the current file, one subroutine per line:

  DB<> S

main::display

main::followlink

main::menudir

main::readsubdirs

Each subroutine name is preceded by the package name and the :: package separator.

Stepping Through Programs

One of the most useful features of the Perl debugger is the capability to execute a program one statement at a time. The following sections describe the statements that carry out this action.

The s Command

To execute a single statement of your program, use the s command:

  DB<2> s

main::(debugtest:4):        $curdir = "";

This command executes one statement of your program and then displays the next statement to be executed. If the statement executed needs to read from the standard input file, the debugger waits until the input is provided before displaying the next line to execute.

TIP
If you have forgotten which line is the next line to execute (because, for example, you have displayed lines using the l command), you can list the next line to execute using the L command:
DB<2> L
3: $dircount = 0;
The L command lists the last lines executed by the program. It also lists any breakpoints and line actions that have been defined for particular lines. Breakpoints and line actions are discussed later today.

If the statement executed by the s command calls a subroutine, the Perl debugger enters the subroutine but does not execute any statements in it. Instead, it stops at the first executable statement in the subroutine and displays it. For example, if the following is the current line:

main::(debugtest:12):                      &readsubdirs($curdir);

specifying the s command tells the Perl debugger to enter readsubdirs and display the following, which is the first executable line of readsubdirs :

main::readsubdirs(debugtest:27):      local ($dirname) = @_;

The s command assumes that you want to debug the subroutine you have entered. If you know that a particular subroutine works properly and you don't want to step through it one statement at a time, use the n command, described in the following section.

The n Command

The n command, like the s command, executes one line of your program and displays the next line to be executed:

  DB<2> n

main::(debugtest:5):        while (1) {

The n statement, however, does not enter any subroutines. If the statement executed by n contains a subroutine call, the subroutine is executed in its entirety. After the subroutine is executed, the debugger displays the line immediately following the call.

For example, if the current line is

main::(debugtest:12):                      &readsubdirs($curdir);

the n command tells the debugger to execute readsubdirs and then display the next line in the program, which is

main::(debugtest:13:):             }

Combining the use of s and n ensures that the debugger examines only the subroutines you want to see.

NOTE
The Perl debugger does not enable you to enter any library functions. You can enter only subroutines that you have created yourself or that have been created previously and added to a subroutine library
The f Command

The f command tells the Perl debugger to execute the remainder of the statements in the current subroutine and then display the line immediately after the subroutine call. This is useful when you are looking for a bug and have determined that the current subroutine does not contain the problem.

The Carriage-Return Command

If you are stepping through a program using s or n , you can save yourself some typing by just pressing Enter when you want to execute another statement. When you press Enter, the debugger repeats the last s or n command executed.

For example, to step from line 5 to line 7, you can use the s command as usual:

  DB<3> s

main::(debugtest:7):              if ($curdir eq "") {

(Line 6 is skipped because it contains no executable statements.) To execute line 7, you can now just press Enter:

  DB<2>

main::(debugtest:8):              print ("Enter directory to list:\n");



NOTE
Pressing Enter has no effect if you have not specified any s or n commands.
The r Command

If you are inside a subroutine and decide that you no longer need to step through it, you can tell the Perl debugger to finish executing the subroutine and return to the statement after the subroutine call. To do this, use the r command:

  DB<4> r

main::(debugtest:13:):             }

The statement displayed by the debugger is the first statement following the call to the subroutine.

Displaying Variable Values

Another powerful feature of the Perl debugger is the capability to display the value of any variable at any time. The following sections describe the commands that perform this action.

The X Command

The X command displays variables in the current package (which is main if no other package has been specified). If the X command is specified by itself, it lists all the variables in the current package, including the system-defined variables and the variables used by the Perl interpreter itself. Usually, you won't want to use the X command by itself, because there are a lot of system-defined and internal variables known to the Perl interpreter.

To print the value of a particular variable or variables, specify the variable name or names with the X command:

  DB<5> X dircount

$dircount = '0'

This capability often is useful when you are checking for errors in your program.

You must not supply the $ character with the variable name when you use the X command. If you supply the $ character (or the @ or % characters for arrays), the debugger displays nothing.

You can use X to display the values of array variables and associative array variables.

  DB<6> X regarray

@regarray = (

  0     14

  1     'hello'

  2     36

)

  DB<7> X assocarray

%assoc_array = (

  'hi'  1

  'there' 2

)

Each command prints the subscripts of the array and their values. Regular arrays are printed in order of subscript; associative arrays are printed in no particular order.

NOTE
If you have an array variable and a scalar variable with the same name, the X command prints both variables:
DB<8> X var
$var = '0'
@var = (
0 'test1'
1 'test2'
)
There is no way to use X to display one variable but not the other.
The V Command

The V command is identical to the X command except that it prints the values of variables in any package. If you specify just a package name, as in the following, this command displays the values of all variables in the package (including system-defined and internal variables):

DB<9> V mypack

If you specify a package name and one or more variable names, as in the following, the debugger prints the values of the variables (if they are defined in that package):

  DB<10> V main dircount

$dircount = '0'

Breakpoints

As you have seen, you can tell the Perl debugger to execute one statement at a time. Another way of controlling program execution is to tell the debugger to execute up to a certain specified point in the program, called a breakpoint .

The following sections describe the commands that create breakpoints, and the command that executes until a breakpoint is detected.

The b Command

To set a breakpoint in your program, use the b command. This command tells the debugger to halt program execution whenever it is about to execute the specified line. For example, the following command tells the debugger to halt when it is about to execute line 10:

DB<11> b 10

(If the line is not breakable, the debugger will return Line 10 is not breakable .)

NOTE
You can have as many breakpoints in your program as you want. The debugger will halt program execution if it is about to execute any of the statements at which a breakpoint has been defined.

The b command also accepts subroutine names:

DB<12> b menudir

This sets a breakpoint at the first executable statement of the subroutine menudir .

You can use the b command to tell the program to halt only when a specified condition is true. For example, the following command tells the debugger to halt if it is about to execute line 10 and the variable $curdir is equal to the null string:

DB<12> b 10 ($curdir eq "")

The condition specified with the b statement can be any legal Perl conditional expression.

If a statement is longer than a single line, you can set a breakpoint only at the first line of the statement:
71: print ("Test",
72: " here is more output");
Here, you can set a breakpoint at line 71, but not line 72.
The c Command

After you have set a breakpoint, you can tell the debugger to execute until it reaches either the breakpoint or the end of the program. To do this, use the c command:

  DB<13> c

main::(debugtest:10):                  $curdir =~ s/^\s+|\s+$//g;

  DB<14>

When the debugger detects that it is about to execute line 10-the line at which the breakpoint was set-it halts and displays the line. (Recall that the debugger always displays the line it is about to execute.)

The debugger now prompts you for another debugging command. This action enables you to start executing one statement at a time using n or s , continue execution using c , set more breakpoints using b , or perform any other debugging operation.

You can specify a temporary (one-time-only) breakpoint with the c command by supplying a line number:

  DB<15> c 12

main::(debugtest:12):                      &readsubdirs($curdir);

The argument 12 supplied with the c command tells the debugger to define a temporary breakpoint at line 12 and then resume execution. When the debugger reaches line 12, it halts execution, displays the line, and deletes the breakpoint. (The line itself still exists, of course.)

Using c to define a temporary breakpoint is useful if you want to skip a few lines without wasting your time executing the program one statement at a time. Using c also means that you don't have to bother defining a breakpoint using b and deleting it using d (described in the following section).

TIP
If you intend to define breakpoints using c or b , it is a good idea to ensure that each line of your program contains at most one statement. If you are in the habit of writing lines that contain more than one statement, such as
$x++; $y++;
you won't get as much use out of the debugger, because it can't stop in the middle of a line.
The L Command and Breakpoints

To list all of your breakpoints, use the L command. This command lists the last few lines executed, the current line, the breakpoints you have defined, and the conditions under which the breakpoints go into effect.

  DB<16> L

3:      $dircount = 0;

4:      $curdir = "";

5:      while (1) {

7:              if ($curdir eq "") {

10:                      $curdir =~ s/^\s+|\s+$//g;

  break if (1)

Here, the program has executed lines 3-7, and a breakpoint is defined for line 10. (Line 6 is not listed because it is a comment.) You can distinguish breakpoints from executed lines by looking for the breakpoint conditional expression, which immediately follows the breakpoint. Here, the conditional expression is (1) , which indicates that the breakpoint is always in effect.

The d and D Commands

When you are finished with a breakpoint, you can delete it using the d command.

DB<16> d 10

This command tells the debugger to delete the breakpoint at line 10. The line itself remains in the program.

If you do not specify a breakpoint to delete, the debugger assumes that a breakpoint is defined for the next line to be executed, and deletes it.

main::(debugtest:12):                      &readsubdirs($curdir);

  DB<17> d

Here, line 12 is the next line to be executed, so the debugger deletes the breakpoint at line 12.

To delete all your breakpoints, use the D command.

DB<18> D

This command deletes all the breakpoints you have defined with the b command.

Tracing Program Execution

When you run a program using the Perl debugger, you can tell it to display each line as it is executed. When the debugger is doing this, it is said to be in trace mode .

To turn on trace mode, use the t command.

  DB<18> t

Trace = on

When a statement is executed in trace mode, the statement is displayed. For example, if the current line is line 5 and the command c 10 (which executes up to line 10) is entered, the following is displayed:

  DB<18> c 10

main::(debugtest:5):      while (1) {

main::(debugtest:7):              if ($curdir eq "") {

main::(debugtest:10):                      $curdir =~ s/^\s+|\s+$//g;

  DB<19>

The debugger prints and executes line 5 and line 7, then displays line 10 and waits for further instructions.

To turn off trace mode, specify the t command again.

  DB<19> t

Trace = off

At this point, trace mode is turned off until another t command is entered.

Line Actions

The Perl debugger enables you to specify one or more statements to be executed whenever the program reaches a specified line. Such statements are known as line actions. The most common line actions are printing the value of a variable and resetting a variable containing an erroneous value to the value you want.

The following sections describe the debugging commands that define line actions.

The a Command

To specify a line action for a particular line, use the a command.

DB<19> a 10 print ("curdir is $curdir\n");

This command tells the debugger to execute the statement

print ("curdir is $curdir\n");

whenever it is about to execute line 10 of the program. The debugger performs the action just after it displays the current line and before it asks for the next debugging command.

To create a line action containing more than one statement, just string the statements together. If you need more than one line for the statements, put a backslash at the end of the first line.

  DB<20> a 10 print ("curdir is $curdir\n"); print \

("this is a long line action\n");

In this case, when the debugger reaches line 10, it executes the following statements:

print ("curdir is $curdir\n");

print ("this is a long line action\n");

The A Command

To delete the line actions defined using the a command, use the A command.

DB<21> A

This command deletes all line actions currently defined.

NOTE
The A command does not affect the < and > commands, described in the following section.
The < and > Commands

To define a line action that is to be executed before the debugger executes any further statements, use the > command.

DB<21> > print ("curdir before execution is $curdir\n");

This command tells the debugger to print the value of $curdir before continuing.

Similarly, the < command defines a line action that is to be performed after the debugger has finished executing statements and before it asks for another debugging command:

DB<22> < print ("curdir after execution is $curdir\n");

This command tells the debugger to print the value of $curdir before halting execution again.

The < and > commands are useful when you know that one of your variables has the wrong value, but you don't know which statement assigned the wrong value to the variable. By single-stepping through the program using s or n , and printing the variable either before or after executing each statement, you can determine where the variable was given its incorrect value.

NOTE
To delete a line action defined by the < command, enter another < command with no line action defined.
DB<23> <
Similarly, the following command undoes the effects of a > command:
DB<24> >
Displaying Line Actions Using the L Command

The L command prints any line actions you have defined using the a command (as well as breakpoints and executed lines). For example, suppose that you have defined a line action using the following command:

DB<25> a 10 print ("curdir is $curdir\n");

The L command then displays this line action as shown here:

main::(debugtest:10):                      $curdir =~ s/^\s+|\s+$//g;

  action:  print ("curdir is $curdir\n");

The line action is always displayed immediately after the line for which it is defined. This method of display enables you to distinguish lines containing line actions from other lines displayed by the L command.

Other Debugging Commands

The following sections describe the debugging commands not previously covered.

Executing Other Perl Statements

In the debugger, anything that is not a debugging command is assumed to be a Perl statement and is performed right away. For example:

DB<4> @array = (1, 2, 3);

You can use statements such as this to alter values in your program as it is being executed. This capability is useful when you are testing your code.

NOTE
If you wish, you can omit the semicolon at the end of the statement.
The H Command: Listing Preceding Commands

The H (for "history") command lists the preceding few commands you have entered.

  DB<4> H

3: b 7

2: b 14

1: b 13

The commands are listed in reverse order, with the most recently executed command listed first. Each command is preceded by its command number, which is used by the ! command (described in the following section).

NOTE
The debugger saves only the commands that actually affect the debugging environment. Commands such as l and s , which perform useful work but do not change how the debugger behaves, are not listed by the H command.
This is not a significant limitation, because you can simply type such a command again when you need it.
The ! Command: Executing Previous Commands

Each command that is saved by the debugger and can be listed by the H command has a command number. You can use this command number to repeat a previously executed command. For example, to repeat command number 5, make the following entry:

  DB <11> !5

b 8

  DB <12>

The debugger displays command number 5 (in this case, the command b 8 ) and then executes it.

If you omit the number, the debugger repeats the last command executed.

  DB <12> $foo += $bar + 1

  DB <13> !

$foo += $bar + 1

  DB <14>

If you specify a negative number with ! , the debugger skips back that many commands:

  DB <14> $foo += $bar + 1

  DB <15> $foo *= 2

  DB <16> ! -2

$foo += $bar + 1

  DB <17>

Here, the ! -2 command refers to the command $foo += $bar + 1 .

You can use ! only to repeat commands that are actually repeatable. Use the H command to list the commands that the debugger has saved and that can be repeated.
The T Command: Stack Tracing

The T command enables you to display a stack trace, which is a collection of all the subroutines that have been called, listed in reverse order. Here is an example:

  DB <16> T

$ = &main::sub2('hi') from file debug1 line 7

$ = &main::sub1('hi') from file debug1 line 3

Here, the T command indicates that the program is currently inside subroutine sub2 , which was called from line 7 of your program; this subroutine is part of the main package. The call to sub2 is passed the argument 'hi' .

The $ = preceding the subroutine name indicates that the subroutine call is expecting a scalar return value. If the call is expecting a list to be returned, the characters @ = appear in front of the subroutine name.

The next line of the displayed output tells you that sub2 was called by another subroutine, sub1 . This subroutine was also passed the argument 'hi' , and it was called by line 3 of the program. Because the stack trace lists no more subroutines, line 3 is part of your main program.

NOTE
The list of arguments passed to a subroutine that is displayed by the stack trace is the list of actual values after variable substitution and expression evaluation are performed. This procedure enables you to use the stack trace to check whether your subroutines are being passed the values you expect.
The p Command: Printing an Expression

An easy way to print the value of an expression from inside the debugger is to use the p command.

  DB <17> p $curdir + 1

1

The p command evaluates the expression and displays the result.

NOTE
The p command writes to the screen even when the program has redirected STDOUT to a file.
The = Command: Defining Aliases

If you find yourself repeatedly entering a long debugging command and you want to save yourself some typing, you can define an alias for the long command by using the = command. For example:

  DB <15> = pc print ("curdir is $curdir\n");

= pc print ("curdir is $curdir\n");

The = command prints the alias you have just defined and then stores it in the associative array %DB'alias (package DB , array name alias ) for future reference. From here on, the command

DB <16> pc

is equivalent to the command

DB <16> print ("curdir is $curdir\n");

To list the aliases you have defined so far, enter the = command by itself:

  DB <17> =

pc =  print ("curdir is $curdir\n")

This command displays your defined aliases and their equivalent values.

Predefining Aliases

You can define aliases that are to be created every time you enter the Perl debugger.

When the debugger starts, it first searches for a file named .perldb in your home directory. If the debugger finds this file, it executes the statements contained there.

To create an alias, add it to the .perldb file. For example, to add the alias

= pc print ("curdir is $curdir\n");

add the following statement to your .perldb file:

$DB'alias{"pc"} = 's/^pc/print ("curdir is $curdir\n");/';

Here's how this works: when the Perl debugger creates an alias, it adds an element to the $DB'alias associative array. The subscript for this element is the alias you are defining, and the value is a substitution command that replaces the alias with the actual command you want to use. In the preceding example, the substitution takes any command starting with pc and replaces it with

print ("curdir is $curdir\n");


Be careful when you define aliases in this way. For example, your substitution should match only the beginning of a command, as in /^pc/ . Otherwise, the alias will replace any occurrence of the letters pc with your print command, which is not what you want.
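
For example (a sketch of the pitfall), suppose the anchor were left out:

$DB'alias{"pc"} = 's/pc/print ("curdir is $curdir\n");/';

Now any command that merely contains the letters pc (say, printing a hypothetical variable with p $pctotal ) would also be rewritten, turning the command you typed into something garbled.
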
The h Command: Debugger Help

The h (for help) command provides a list of each of the debugger commands listed in today's lesson, along with a one-line explanation of each. This is handy if you are in the middle of debugging a program and forget the syntax of a particular command.
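
For example, h by itself prints the full command summary, and h followed by a command name displays the help for just that command (the exact help text depends on your debugger version):

  DB<26> h
  DB<27> h b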

Summary

Today, you have learned about the Perl debugger, which enables you to step through your program one statement at a time, set breakpoints and line actions, trace execution, display stack traces, print the values of variables and expressions, and define command aliases.

Q&A
Q: Is it possible to enter more than one debugging command at a time?
A: No; however, there's no real need to do so. If you want to perform several single steps at once, use the c command to skip ahead to a specified point. If you want to both step ahead and print the value of a variable, use the < or > command.
Q: Is it possible to examine variables in one package while inside another?
A: Yes. Use the V command or the standard Perl package/variable syntax.
Q: If I discover that my program works and I want to turn off debugging, what do I do?
A: You cannot exit the debugger in the middle of a program. However, if you delete all breakpoints and line actions and then enter the c command, the program begins executing normally and is no longer under control of the debugger.
Q: How can I convert to a reusable breakpoint a one-time breakpoint created using c ?
A: By default, the b command sets a breakpoint at the line that is about to be executed. This is the line at which c has set its one-time breakpoint.
Q: How can I execute other UNIX commands from inside the debugger?
A: Enter a statement containing a call to the Perl system function. For example, to display the contents of the current directory, enter the following command:
DB <11> system ("ls");
To temporarily escape from the debugger to a UNIX shell, enter the following command:
DB <12> system ("sh");
When you are finished with the shell, enter the command exit, and you will return to the debugger.
Q: What special built-in variables can be accessed from inside the debugger?
A: All of them.
Workshop

The Workshop provides quiz questions to help you solidify your understanding of the material covered.

Quiz
  1. Define the following terms:
    1. trace mode
    2. stack trace
    3. breakpoint
    4. line action
  2. Explain the differences between the X and V commands.
  3. Explain the differences between the // and ?? commands.
  4. Explain the differences between the < and > commands.
  5. Explain the differences between the s and n commands.
  6. What do the following commands do?
    1. l
    2. l 26
    3. l 5-7
    4. l 5+7
    5. w


[Dec 20, 2017] debugging - Can the Perl debugger save the ReadLine history to a file

Dec 20, 2017 | stackoverflow.com
The way I do this is by having the following line in my ~/.perldb file:

&parse_options("HistFile=$ENV{HOME}/.perldb.hist");

Debugger commands are then stored in ~/.perldb.hist and accessible across sessions.

I did the following:

1) Created ~/.perldb , which did not exist previously.

2) Added &parse_options("HistFile=$ENV{HOME}/.perldb.hist"); from mirod's answer.

3) Added export PERLDB_OPTS=HistFile=$HOME/.perldb.history to ~/.bashrc from mephinet's answer.

4) Ran source .bashrc

5) Ran perl -d on my program.pl and got this warning/error:

perldb: Must not source insecure rcfile /home/ics/.perldb.
        You or the superuser must be the owner, and it must not 
        be writable by anyone but its owner.

6) I restricted ~/.perldb to its owner with chmod 700 ~/.perldb , and the error went away.

[Dec 20, 2017] The Perl Debugger

Dec 20, 2017 | nnc3.com

Subroutines

There is one more variation of the list code command, l . It is the ability to list the code of a subroutine, by typing l sub , where sub is the subroutine name.

Running the code in Listing 2 returns:

Loading DB routines from perl5db.pl version 1
Emacs support available.
Enter h or h h for help.
main::(./p2.pl:3): require 5.001;
 DB<1>

Entering l searchdir allows us to see the text of searchdir , which is the meat of this program.

22 sub searchdir { # takes directory as argument
23: my($dir) = @_;
24: my(@files, @subdirs);
25
26: opendir(DIR,$dir) or die "Can't open \"
27:     $dir\" for reading: $!\n";
28
29: while(defined($_ = readdir(DIR))) {
30: /^\./ and next; # if file begins with '.', skip
31
32 ### SUBTLE HINT ###
As you can see, I left a subtle hint. The bug is that I deleted an important line at this point.

Setting Breakpoints

If we were to step through every line of code in a subroutine that is supposed to be recursive, it would take all day. As I mentioned before, the code as in Listing 2 seems only to list the files in the current directory, and it ignores the files in any subdirectories. Since the code only prints the files in the current, initial directory, maybe the recursive calls aren't working. Invoke the Listing 2 code under the debugger.

Now, set a breakpoint. A breakpoint is a way to tell the debugger that we want normal execution of the program until it gets to a specific point in the code. To specify where the debugger should stop, we insert a breakpoint. In the Perl debugger, there are two basic ways to insert a breakpoint. The first is by line number, with the syntax b linenum . If linenum is omitted, the breakpoint is inserted at the next line about to be executed. However, we can also specify breakpoints by subroutine, by typing b sub , where sub is the subroutine name. Both forms of breakpointing take an optional second argument, a Perl conditional. If, when the flow of execution reaches the breakpoint, the conditional evaluates to true, the debugger stops at the breakpoint; otherwise, it continues. This gives greater control of execution.
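
For example, both forms accept the optional condition (the particular conditions below are only illustrative):

b 30 ( /^dir/ )
b searchdir ( $_[0] eq "dir2.0" )

The first stops at line 30 only when the file name in $_ begins with dir ; the second stops at the first line of searchdir only when it was called with the argument dir2.0 .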

For now we'll set a break at the searchdir subroutine with b searchdir . Once the breakpoint is set, we'll just execute until we hit the subroutine. To do this, enter c (for continue).

Adding Actions

Looking at the code in Listing 2, we can see that the first call to searchdir comes in the main code. This seems to work fine, or else nothing would be printed out. Press c again to continue to the next invocation of searchdir , which occurs in the searchdir routine.

We wish to know what is in the $dir variable, which represents the directory that will be searched for files and subdirectories. Specifically, we want to know the contents of this variable each time we cycle through the code. We can do this by setting an action. By looking at the program listing, we see that by line 25, the variable $dir has been assigned. So, set an action at line 25 in this way:

a 25 print "dir is $dir\n"

Now, whenever line 25 comes around, the print command will be executed. Note that for the a command, the line number is optional and defaults to the next line to be executed.

Pressing c will execute the code until we come across a breakpoint, executing action points that are set along the way. In our example, pressing c continuously will yield the following:

main::(../p2.pl:3): require 5.001;
 DB<1> b searchdir
 DB<2> a 25 print "dir is $dir\n"
 DB<3> c
main::searchdir(../p2.pl:23): my($dir) = @_;
 DB<3> c
dir is .
main::searchdir(../p2.pl:23): my($dir) = @_;
 DB<3> c
dir is dir1.0
main::searchdir(../p2.pl:23): my($dir) = @_;
 DB<3> c
dir is dir2.0
main::searchdir(../p2.pl:23): my($dir) = @_;
 DB<3> c
dir is dir3.0
file1
file1
file1
file1
DB::fake::(/usr/lib/perl5/perl5db.pl:2043):
2043: "Debugged program terminated. Use `q' to quit or `R' to
restart.";
 DB<3>

Note that older versions of the debugger don't output the last line as listed here, but instead exit the debugger. This newer version is nice because when the program has finished it still lets you have control so that you can restart the program.

It still seems that we aren't getting into any subdirectories. Enter D and A to clear all breakpoints and actions, respectively, and enter R to restart. Or, in older debugger versions, simply restart the program to begin again.

We now know that the searchdir subroutine isn't being called for any subdirectories except the first level ones. Looking back at the text of the program, notice in lines 44 through 46 that the only time the searchdir subroutine is called recursively is when there is something in the @subdirs list. Put an action at line 42 that will print the $dir and @subdirs variables by entering:

a 42 print "in $dir is @subdirs \n"

Now, put a breakpoint at line 12 to prevent the program from outputting to our screen ( b 12 ), then enter c . This will tell us all the subdirectories that our program thinks are in the directory.

main::(../p2.pl:3): require 5.001;
 DB<1> a 42 print "in $dir is @subdirs \n"
 DB<2> b 12
 DB<3> c
in . is dir1.0 dir2.0 dir3.0
in dir1.0 is
in dir2.0 is
in dir3.0 is
main::(../p2.pl:12): foreach (@files) {
 DB<3>
This program sees that there are directories in ".", but not in any of the subdirectories within ".". Since we are printing out the value of @subdirs at line 42, we know that @subdirs has no elements in it. (Notice that when listing line 42, there is the letter "a" after the line number and a colon. This tells us that there is an action point here.) So, nothing is being assigned to @subdirs in line 37, but should be if the current (as held in $_ ) file is a directory. If it is, it should be pushed into the @subdirs list. This is not happening.

One error I've committed (intentionally, of course) is on line 38. There is no catch-all "else" statement. I should probably put an error statement here. Instead of doing this, let's put in another action point. Reinitialize the program so that all points are cleared and enter the following:

a 34 if( ! -f $_ and ! -d $_ ) { print "in $dir: $_ is
weird!\n" }
b 12
c

which reveals:

main::(../p2.pl:3): require 5.001;
 DB<1> a 34 if( ! -f $_ and ! -d $_ ) { print "in $dir:
$_ is weird!\n" }
 DB<2> b 12
 DB<3> c
in dir1.0: dir1.1 is weird!
in dir1.0: dir2.1 is weird!
in dir1.0: file2 is weird!
in dir1.0: file3 is weird!
in dir2.0: dir2.1 is weird!
in dir2.0: dir1.1 is weird!
in dir2.0: file2 is weird!
in dir2.0: file3 is weird!
main::(../p2.pl:12): foreach (@files) {
 DB<3>
While the program can read (through the readdir call on line 29) that dir1.1 is a file of some type in dir1.0, the file test (the -f construct) on dir1.1 says that it is not.

It would be nice to halt the execution at a point (line 34) where we have a problem. We can use the conditional breakpoint that I mentioned earlier to do this. Reinitialize or restart the debugger, and enter:

b 34 ( ! -f $_ and ! -d $_ )
c
p
p $dir

You'll get output that looks like this:

main::(../p2.pl:3): require 5.001;
 DB<1> b 34 ( ! -f $_ and ! -d $_ )
 DB<2> c
main::searchdir(../p2.pl:34): if( -f $_) { # if its a file...
 DB<2> p
dir1.1
 DB<2> p $dir
dir1.0
 DB<3>
The first line sets the breakpoint, the next c executes the program until the break point stops it. The p prints the contents of the variable $_ and the last command, p $dir prints out $dir . So, dir1.1 is a file in dir1.0, but the file tests ( -d and -f ) don't admit that it exists, and therefore dir1.1 is not being inserted into @subdirs (if it's a directory) or into @files (if it's a file).

Now that we are back at a prompt, we could inspect all sorts of variables, subroutines or any other Perl construct. To save you from banging your heads against your monitors, and thus saving both your heads and your monitors, I'll tell you what is wrong.

All programs have something known as the current working directory (CWD). By default, the CWD is the directory where the program starts. Any and all file accesses (such as file tests or file and directory openings) are made in reference from the CWD. At no time does our program change its CWD. But the values returned by the readdir call on line 29 are simply file names relative to the directory that readdir is reading (which is in $dir ). So, when we do the readdir , $_ gets assigned a string representing a file (or directory) within the directory in $dir (which is why it's called a subdirectory). But when running the -f and -d file tests, they look for $_ in the context of the CWD. But it isn't in the CWD, it's in the directory represented by $dir . The moral of the story is that we should be working with $dir/$_ , not just $_ . So the string

###SUBTLE HINT###

should be replaced by

$_ = "$dir/$_"; # make all path names absolute
That sums it up. Our problem was we were dealing with relative paths, not absolute (from the CWD) paths.

Putting it back into our example, we need to check dir1.0/dir1.1 , not dir1.1 . To check to make sure that this is what we want, we can put in another action point. Try typing:

a 34 $_ = "$dir/$_"
c

In effect, this temporarily places the corrective measure into our code: the action point is evaluated before the rest of the line, so the assignment happens before the file tests run. You should now see the proper results of the execution of the program:

DB<1> a 34 $_ = "$dir/$_"
DB<2> c
./file1
./dir1.0/file1
./dir1.0/file2
./dir1.0/file3
./dir1.0/dir1.1/file1
./dir1.0/dir1.1/file2
./dir1.0/dir1.1/file3
./dir2.0/file1
./dir2.0/file2
./dir2.0/file3
./dir2.0/dir2.1/file1
./dir2.0/dir2.1/file2
./dir3.0/file1
DB::fake::(/usr/lib/perl5/perl5db.pl:2043):
2043: "Debugged program terminated. Use `q' to quit or `R' to
restart.";
 DB<2>

Stack Traces

Now that we've got the recursive call debugged, let's play with the calling stack a bit. Giving the command T will display the current calling stack. The calling stack is a list of the subroutines which have been called between the current point in execution and the beginning of execution. In other words, if the main portion of the code executes subroutine "a", which in turn executes subroutine "b", which calls "c", then pressing "T" while in the middle of subroutine "c" outputs a list going from "c" all the way back to "main".

Start up the program and enter the following commands (omit the second one if you have fixed the bug we discovered in the last section):

b 34 ( $_ =~ /file2$/)
a 34 $_ = "$dir/$_"
c

These commands set a breakpoint that will only stop execution if the value of the variable $_ ends with the string file2 . Effectively, this code will halt execution at arbitrary points in the program. Press T and you'll get this:

@ = main::searchdir('./dir1.0/file2') called from file '../p2.pl' line
45
@ = main::searchdir(.) called from file '../p2.pl' line 10

Enter c , then T again:

@ = main::searchdir('./dir1.0/dir1.1/file2') called from file
`../p2.pl' line 45
@ = main::searchdir(undef) called from file '../p2.pl' line 45
@ = main::searchdir(.) called from file '../p2.pl' line 10

Do it once more:

@ = main::searchdir('./dir2.0/file2') called from file '../p2.pl' line
45
@ = main::searchdir(.) called from file '../p2.pl' line 10

You can go on, if you so desire, but I think we have enough data from the arbitrary stack dumps we've taken.

We see here which subroutines were called, the debugger's best guess of which arguments were passed to the subroutine and which line of which file the subroutine was called from. Since the lines begin with @ = , we know that searchdir will return a list. If it were going to return a scalar value, we'd see $ = . For hashes (also known as associative arrays), we would see % = .

I say "best guess of what arguments were passed" because in Perl, the arguments to subroutines are placed into the @_ magic list. However, manipulating @_ (or $_ ) in the body of the subroutine is allowed and even encouraged. When a T is entered, the stack trace is printed out, and the current value of @_ is printed as the arguments to the subroutine. So when @_ is changed, the trace doesn't reflect what was actually passed as arguments to the subroutine.

[Dec 20, 2017] Creating Command Aliases

Notable quotes:
"... You use the = command without any arguments when you want a list of the current aliases. ..."
Dec 20, 2017 | affy.blogspot.com

The = command is used to create command aliases. If you find yourself issuing the same long command over and over again, you can create an alias for that command. For example, the debugger command

= pFoo print("foo=$foo\n");
creates an alias called pFoo . After this command is issued, typing pFoo at the debugger prompt produces the same results as typing print("foo=$foo\n"); .

You use the = command without any arguments when you want a list of the current aliases.

If you want to set up some aliases that will always be defined, create a file called .perldb and fill it with your alias definitions. Use the following line as a template:

$DB::alias{'pFoo'} = 'print("foo=$foo\n");';
After you create this file and its alias definitions, the aliases will be available in every debugging session.

[Dec 20, 2017] Perl Debugger Quick Reference Card by Andrew Ford

Notable quotes:
"... in the current and home directories ..."
"... Any input to the debugger that is not recognized is executed as Perl code in the current package. ..."
Dec 20, 2017 | shinnok.com

Revision 0.1 for Perl Debugger version 5.8.x

Copyright: Andrew Ford refcards.com™

... ... ...

Debugger Commands

The debugger reads commands from the files .perldb in the current and home directories, and stops before the first run-time executable statement, displaying the line it is about to execute and a prompt:

DB<1>

If you run code from the debugger and hit another breakpoint, the prompt will look like DB<<42>>. The numbers within the angle brackets are the command numbers, used when repeating commands.

Any input to the debugger that is not recognized is executed as Perl code in the current package.

Prefixing a command with ' | ' pipes the output to your current pager.
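
For example, a long variable listing can be paged rather than scrolled off the screen (V is just an illustration; any command can be prefixed this way):

  DB<2> |V main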

Help and Quitting

Debugger Control

... ... ...

[Dec 20, 2017] Chapter 30 -- Using the Perl Debugger

Dec 20, 2017 | ods.com.ua
... ... ... Looking at Values

To see the values of certain variables in the program, use the V command. Used by itself, V lists all the variables in scope at this time. Here's the syntax:

V [ package [ variable ]]

To look at values in your program, you'll want to look at the main package. For example, to print the value of $reply , use this command:

V main reply
$reply = '1'

Note that you do not include the dollar sign when naming a variable to V. Therefore, if you specify the command V main $reply , you are actually asking for the value of $$reply and not $reply .

The trace option is available with the t toggle command. Issuing trace once turns it on, and issuing it again turns it off. See Figure 30.4 for a sample use of the trace command on Listing 30.2. In this example, trace is turned on, and then the c command is issued to run the debugger continuously. In trace mode, the debugger prints out each line of code that executes.

Figure 30.4 : Using the trace command with breakpoints.
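
A rough sketch of such a session (the file name is invented here, and the exact output will differ): a breakpoint is set at line 11 of Listing 30.2, trace is toggled on with t, and c then prints each line as it executes until the breakpoint is reached.

  DB<1> b 11
  DB<2> t
Trace = on
  DB<3> c
main::(listing30-2.pl:7):       $header->VRML::startHeader;
main::(listing30-2.pl:9):       $header->VRML::startSeparator;
main::(listing30-2.pl:11):      my $cubeb = $header->VRML::putCube(
  DB<3>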

The X command is helpful when displaying values of variables in the current package. Remember that the main package is the default package for a Perl script. Issued by itself with no options, the X command displays all the variables in the current package. Avoid issuing the X command by itself because it can generate a very long listing of all the variables in the main package.

To see the value of a particular variable instead of all the variables, type the name of the variable after the X command. For example, the following command

X fileNumber

will print the value of the fileNumber variable in the current package. If you have array variables and scalar variables with the same name in the same package, the X command will display the values of both these variables. For example, if you have a scalar variable called names and an array called names , the X command will show the values of both variables:

DB<3> X names
$names = "kamran"
@names = (
"kamran"
"joe"
"donald"
)
Breakpoints

You can place breakpoints at suspect locations in your code and run the program until one of the specified breakpoints is hit. Breakpoints can be specified to be hit as soon as the line of code is about to be executed.

The c command is used to step forward until either the program stops or a specified breakpoint is hit. To specify a breakpoint at the current line, use the b command without any parameters. To specify a specific line, use the command of the form:

b linenumber

Usually, you use trace statements to see statements between the current execution point and a breakpoint (refer to Figure 30.4). The program is run in continuous mode with the c command until it hits a breakpoint. There is a breakpoint in Listing 30.1 that causes the debugger to stop. The L command is issued in the example to list the breakpoints in the system.

Breakpoints can also be specified to occur at the first executable line of code within a subroutine. Simply use the b command with the name of the subroutine as the first parameter. For example, to break at the first line of code in the xyc subroutine, try this command:

b xyc

You can also ask the debugger to look at a condition when a line is hit with a breakpoint tag on it. If the breakpoint is specified at a line and the condition is true, the debugger stops; otherwise, it keeps on going. For example, if you want the debugger to stop in xyc only when the global $reply is 1 , use this command:

b xyc ($reply == '1')

To list all breakpoints defined during a debug session, use the L command. If you issue unconditional breakpoints, you'll see breakpoints listed as this:

break if (1)

The L command will also list up to the last five executed lines of the program.

To remove a breakpoint, use the d command and specify the line number to delete. To remove all breakpoints, use the D command. For example, to delete a breakpoint at line 12, you would issue the command d 12 .

The DB package uses the following sequence to hit breakpoints and evaluate code on each line of executable code:

  1. Checks to see whether the breakpoint is defined at this line number. If there is no breakpoint defined for this line, it starts to process the next line. If there is a breakpoint at this line, the debugger prepares to stop. If the condition for the defined breakpoint is true, the debugger stops execution and presents a prompt to the user.
  2. Checks to see whether the line of code is printable. If so, it prints the entire line of code (including code spanning multiple lines).
  3. Checks to see whether there are any actions defined for this line and performs these actions. (An action is a set of Perl commands to be executed.)
  4. Checks to see whether the stop was due to a breakpoint. If the condition for the breakpoint is true and a breakpoint has been marked in this location, the debugger stops and presents a prompt for user interaction.
  5. Evaluates the line and gets ready to execute it. Gets user input if the user is stopping; otherwise, it executes the line and returns to item 1 in order to process the next line.
Actions

You can specify actions to take when a certain line of code is executed. This step is very important when you want to print out values as the program executes (see Figure 30.5). Notice how the value of $reply is printed out when line 73 is reached. The action is defined with this statement:

a 73 print "I am on line 73 and reply is $reply"

Figure 30.5 : Using actions in the debugger.

Notice that you did not have to terminate the action command with a semicolon. You need to use semicolons only if you have more than one statement for an action. If you forget to supply the terminating semicolon, the debugger will supply it for you. In any event, try to keep actions simple and short. Don't write lengthy actions unless absolutely necessary; otherwise, you'll slow down the debugger and clutter up the output on your terminal.

Actions are not limited to displaying values. For instance, you can use an action to reset a variable to a known value while in a loop, using a statement like this:

a 73 $reply = 1; print "forced reply to 1\n";

To execute statements within the debugged program's space, simply type the command at the prompt. For example, to explicitly create and set the value of $kw to 2 in the code, use the following commands at the DB<> prompt:

DB<1> $kw = 2
... nothing is printed here ...
DB<1> print $kw
2
DB<1> V main kw
$kw = '2'

In this example, the variable $kw is created and defined in the program environment. You cannot modify the source code in the original program, but you can add items to the name space.

In some cases, your program may have redirected STDOUT (to a file, for example), and therefore whatever it prints will not be shown on the console. To evaluate an expression and print its value to the console regardless of where STDOUT is redirected, you can use the p command. The p command evaluates an expression in the current program's environment and prints it to the debugger console. Basically, the print command prints to wherever STDOUT is redirected, whereas the p command is equivalent to the following print command:

print DB::OUT

The command above forces output from a print command to where the DB:: package prints its output.
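
For example, even if the script has reopened STDOUT to point at a log file, p still writes to the debugger console (using the $reply variable from the earlier examples, whose value was '1'):

DB<2> p $reply
1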

Searching for Patterns

To look for certain strings in the source code, you can use the forward slash command followed by the string to look for. Note that there are no spaces between the / and the string you are looking for. The string can be specified between two slashes, but the second slash is optional. Actually, you can search for regular expressions, just as in Perl.

To search forward in the file, use the / operator. To search backward, use the question mark operator ( ? ).

The history of the commands you have executed is tracked in the debugger. Only commands longer than one character are kept in this history. To re-execute a command from the history, use the bang operator ( ! ) followed by the index of the command. This should be familiar to Bash and C shell programmers.

To see the current history of commands in the buffer of commands in the debugger, type the H command. For example, in the middle of a debug session, if you type in the H command at the DB<3> prompt, you should expect to see three items listed in reverse order of execution:

DB<3> H
3: b 79
2: w 2
1: w 9
Subroutines

To list all the subroutines currently in the system, use the S command. The output from the S command lists all subroutines in any package that your code uses. For example, if you run the program in Listing 30.2 with the debugger, you will see output as shown in Figure 30.6.

Figure 30.6 : Listing subroutine names.


Listing 30.2. A sample listing.
1 #!/usr/bin/perl -d
2
3 use VRML;
4 use VRML::Cube;
5
6 my $header = VRML::new();
7 $header->VRML::startHeader;
8
9 $header->VRML::startSeparator;
10
11 my $cubeb = $header->VRML::putCube(
12 'width' => 0.5, 'height' => 0.5 , 'depth' => 0.5 ,
13 'translation' => [1,0,0]
14 );
15 my $cubed = $header->VRML::putCube(
16 'width' => 1, 'height' => 1 , 'depth' => 1 ,
17 'translation' => [1,1,0],
18 );
19 $header->VRML::stopSeparator;

At any time in a debug session, you can do a "stack trace," which is a listing of the calling order of the functions called so far. Be aware that if you are modifying the argument stack in any way, the values of the passed arguments might not be correct. The T command will do a stack trace for you.

Caveats

First of all, there is no way to restart the debugger if there is a problem. If you overstep something, you have to start all over. This means getting out of the program and restarting the debugger.

Second, the debugger itself is not completely debugged yet. If you notice certain problems, such as your commands not being recognized, it's probably because you typed too many characters at the prompt.

Table 30.1 lists the information about the available debugger commands. All information in this table is gleaned from the perl5db.pl source file. Keep this table handy so that you don't have to go to the file to see what options are available.

Table 30.1. The commands available from the debugger.
Command Description
a [ ln ] command Sets an action to take before the line is executed.
b Sets an unconditional breakpoint at the current line.
b [ ln ] [ cond ] Sets a breakpoint if the condition is true at the specified line number.
b sname [ cond ] Sets a breakpoint at the first line inside the subroutine sname() .
c Continues until the next breakpoint or until the end of the program.
c line Continues and stops at the specified line.
d [ line ] Deletes the breakpoint at a given line.
D Deletes all breakpoints.
f filename Switches to the filename as the default.
H - number Displays history of all commands longer than one character.
L Lists all breakpoints and actions.
l min+incr Lists incr+1 lines starting at line #min .
l min-max Lists lines from min to max , inclusively.
l line Lists one line of code at a specified line.
l Lists the next 10 lines of code from the last location.
l name Lists a subroutine by name.
n Next code at the same level. Steps over subroutine calls.
p expr Same as print DB::OUT expr in current package.
q or ^D Quits. You cannot use quit .
r Returns from current subroutine.
s Single-step over code. Steps into subroutines.
S Lists all known subroutine names in the current scope.
t Toggles trace mode on and off.
T Performs a stack trace.
V Lists all variables in all used packages.
V pkg List all variables in a given package.
V pkg var Lists all variables in a package that have var in them.
w line Lists five lines before and five lines after current line.
<CR> Repeats last n or s .
- Lists the previous window.
/ regexp / Searches forward for a pattern using a regular expression.
? regexp ? Searches backward for a pattern using a regular expression.
< command Defines the command before the prompt.
> command Defines the command after the prompt.
! number Redoes a command (the default is the previous command).
! - number Redoes the command that is number commands back in the history.
= [ alias value ] Starts a command alias.
= Lists all the current aliases.
command Executes as a Perl statement in the current package.
Customizing Your Debugger Environment

There are ways to customize your debugger environment. If you do not like the one-character commands that come with the debugger, you can use different aliases. There is a hash in the DB:: package called %alias() that contains the command strings. You can substitute your own commands in place of the existing ones using the = command. Since most of the time you'll want to keep your changes consistent between debug sessions, you can edit a file called .perldb in the current working directory and place the assignments there. Here's a sample .perldb file:

$DB::alias{'ln'} = 's/ln/p $1/';
$DB::alias{'z'} = 's/z/l/';

These two lines will substitute the value of p for every command ln you type, and the value of l for every z command. Of course, you'll probably want to alias long commands into short one-character sequences to save yourself some time.

Using the debugger should not be your only method for getting bugs out of the system. The -w switch is important if you want Perl to do checking and warn you of error conditions while executing. The types of messages generated vary from warnings to notifications of fatal errors that can cause the program to abort.

For More Information

Reading the source file perl5db.pl gives you a few clues about how the debugger works and the commands that are available during a debug session. Consult the perldebug.html page at www.metronet.com . This file contains the full list of all the options in the debug environment. Review the perldiag.html page for a list of possible diagnostic values you get from using the -w switch.

Summary

Nothing really beats the use of well-placed print statements to do debugging. However, Perl does offer a simple yet powerful debugging tool with the -d option. The interactive debugger lets you step through code, into or over subroutines, set breakpoints, execute commands, and look at variables in a Perl program.

[Dec 19, 2017] Perl IDE and Editor Poll, October 2009 - Which editor(s) or IDE(s) are you using for Perl development

Oct 01, 2009 | perlide.org

In October 2009 we ran a poll asking people: Which editor(s) or IDE(s) are you using for Perl development? The poll was promoted via the blog of Gabor Szabo, which is syndicated in several Perl-related sites such as the Iron Man Challenge, Perlsphere and Planet Perl. It was also promoted via Twitter, the Perl group in Reddit, the Perl Mongers group in LinkedIn and the Perl Community Adserver to get more people to cast their vote. A request was also sent to the Perl Monger group leaders. Some of them forwarded the request to their respective groups.

The list of editors was taken from the Perl Development Tools page on Perlmonks, and the "randomize answers" checkbox was clicked after filling in the data. No idea if that really randomized the answers. During the poll people could mark "other" and type in the name of an editor. Some of these editors were added to the list of possible answers during the poll. In addition, there were people who typed the name of an editor in the "other" field even though the name appeared on the list.

At the beginning we set the poll to allow multiple choice with up to 3 answers per person, but later on we noticed that at one of the updates it became multiple choice with unlimited answers. Unfortunately, the free polling system we used gave details only on the number of answers and not the number of people who answered.

The poll ran between 21-24 October 2009 for about 72 hours. There were 3,234 answers when it was closed.

The results are as follows (editor, number of answers, percentage of all answers):
Vim (or vi or gvim) 1097 34%
Emacs (or xemacs, with or without extensions) 430 13%
Ultra Edit (plain or Studio) 224 7%
Eclipse EPIC 210 6%
Other answer... 143 4%
Notepad++ 142 4%
Komodo IDE 128 4%
Komodo Edit 105 3%
TextMate 105 3%
Padre 101 3%
Kate 56 2%
Gedit 55 2%
TextPad 49 2%
nano 40 1%
SciTE 38 1%
Geany 36 1%
NEdit 27 1%
mcedit 26 1%
EditPlus 26 1%
BBEdit 25 1%
JEdit 23 1%
Joe 20 1%
Smultron 16 0%
TextWrangler 14 0%
PSPad 12 0%
Notepad2 12 0%
Open Perl IDE 10 0%
OptiPerl 9 0%
Pico 7 0%
Jed 6 0%
Kephra 6 0%
SlickEdit 6 0%
KDevelop 6 0%
Notepad 5 0%
Crimson 4 0%
Anjuta 3 0%
EngInSite-Perl 3 0%
KEdit 3 0%
Perl Express 2 0%
DzSoft Perl 2 0%
PerlWiz 1 0%
Far 1 0%
Perl Studio 0 0%
Perl Builder 0 0%
Editeur 0 0%
Perl Code Editor 0 0%
ED for Windows 0 0%
PerlEdit 0 0%
FTE 0 0%
visiPerl+ 0 0%
Prof. Notepad 0 0%
Perl Scripting Tool 0 0%

[Dec 19, 2017] Programming in Perl - Debugging

Mar 13, 2007 | cs.rpi.edu

On this page, I will post aids and tools that Perl provides which allow you to debug your Perl code more efficiently. I will post updates as we cover material necessary for understanding the tools mentioned.

CGI::Dump
Dump is one of the functions exported in CGI.pm's :standard set. Its functionality is similar to that of Data::Dumper . Rather than pretty-printing a complex data structure, however, this module pretty-prints all of the parameters passed to your CGI script. That is to say, when called, it generates an HTML list of each parameter's name and value, so that you can see exactly what parameters were passed to your script. Don't forget that you must print the return value of this function - it doesn't do any printing on its own.
use CGI qw/:standard/;
print Dump;
Benchmark
As you know by now, one of Perl's mottos is "There's More Than One Way To Do It" (TMTOWTDI). This is usually a Good Thing, but can occasionally lead to confusion. One of the most common forms of confusion that Perl's versatility causes is wondering which of multiple ways one should use to get the job done most quickly.

Analyzing two or more chunks of code to see how they compare time-wise is known as "Benchmarking". Perl provides a standard module that will Benchmark your code for you. It is named, unsurprisingly, Benchmark . Benchmark provides several helpful subroutines, but the most common is called cmpthese() . This subroutine takes two arguments: The number of iterations to run each method, and a hashref containing the code blocks (subroutines) you want to compare, keyed by a label for each block. It will run each subroutine the number of times specified, and then print out statistics telling you how they compare.

For example, my solution to ICA5 contained three different ways of creating a two dimensional array. Which one of these ways is "best"? Let's have Benchmark tell us:

#!/usr/bin/perl
use strict;
use warnings;
use Benchmark 'cmpthese';

sub explicit {
    my @two_d = ([ ('x') x 10 ],
                 [ ('x') x 10 ],
                 [ ('x') x 10 ],
                 [ ('x') x 10 ],
                 [ ('x') x 10 ]);
}

sub new_per_loop {
    my @two_d;
    for (0..4){
        my @inner = ('x') x 10;
        push @two_d, \@inner;
    }
}

sub anon_ref_per_loop {
    my @two_d;
    for (0..4){
        push @two_d, [ ('x') x 10 ];
    }
}

sub nested {
    my @two_d;
    for my $i (0..4){
        for my $j (0..9){
            $two_d[$i][$j] = 'x';
        }
    }
}
cmpthese (10_000, {
                 'Explicit'           => \&explicit,
                 'New Array Per Loop' => \&new_per_loop,
                 'Anon. Ref Per Loop' => \&anon_ref_per_loop,
                 'Nested Loops'       => \&nested,
             }
      );
The above code will print out the following statistics (numbers may be slightly off, of course):
Benchmark: timing 10000 iterations of Anon. Ref Per Loop, Explicit, Nested Loops, New Array Per Loop...
Anon. Ref Per Loop:  2 wallclock secs ( 1.53 usr +  0.00 sys =  1.53 CPU) @ 6535.95/s (n=10000)
Explicit:  1 wallclock secs ( 1.24 usr +  0.00 sys =  1.24 CPU) @ 8064.52/s (n=10000)
Nested Loops:  4 wallclock secs ( 4.01 usr +  0.00 sys =  4.01 CPU) @ 2493.77/s (n=10000)
New Array Per Loop:  2 wallclock secs ( 1.76 usr +  0.00 sys =  1.76 CPU) @ 5681.82/s (n=10000)
                     Rate Nested Loops New Array Per Loop Anon. Ref Per Loop Explicit
Nested Loops       2494/s           --               -56%               -62%     -69%
New Array Per Loop 5682/s         128%                 --               -13%     -30%
Anon. Ref Per Loop 6536/s         162%                15%                 --     -19%
Explicit           8065/s         223%                42%                23%       --

The benchmark first tells us how many iterations of which subroutines it's running. It then tells us how long each method took to run the given number of iterations. Finally, it prints out the statistics table, sorted from slowest to fastest. The Rate column tells us how many iterations each subroutine was able to perform per second. The remaining columns tell us how fast each method was in comparison to each of the other methods. (For example, 'Explicit' was 223% faster than 'Nested Loops', while 'New Array Per Loop' is 13% slower than 'Anon. Ref Per Loop'). From the above, we can see that 'Explicit' is by far the fastest of the four methods. It is, however, only 23% faster than 'Anon. Ref Per Loop', which requires far less typing and is much more easily maintainable (if your boss suddenly tells you he'd rather have the two-d array be 20x17, and each cell init'ed to 'X' rather than 'x', which of the two would you rather had been used?).

You can, of course, read more about this module, and see its other options, by reading: perldoc Benchmark

Command-line options
Perl provides several command-line options which make it possible to write very quick and very useful "one-liners". For more information on all the options available, refer to perldoc perlrun
-e
This option takes a string and evaluates the Perl code within. This is the primary means of executing a one-liner
perl -e'print qq{Hello World\n};'
(In windows, you may have to use double-quotes rather than single. Either way, it's probably better to use q// and qq// within your one liner, rather than remembering to escape the quotes).
-l
This option has two distinct effects that work in conjunction. First, it sets $\ (the output record terminator) to the current value of $/ (the input record separator). In effect, this means that every print statement will automatically have a newline appended. Secondly, it auto-chomps any input read via the <> operator, saving you the typing necessary to do it.
perl -le 'while (<>){ $_ .= q{testing};  print; }'
The above would automatically chomp $_, and then add the newline back on at the print statement, so that "testing" appears on the same line as the entered string.
-w
This is the standard way to enable warnings in your one liners. This saves you from having to type use warnings;
-M
This option auto- use s a given module.
perl -MData::Dumper -le'my @foo=(1..10); print Dumper(\@foo);'
-n
This disturbingly powerful option wraps your entire one-liner in a while (<>) { ... } loop. That is, your one-liner will be executed once for each line of each file specified on the command line, each time setting $_ to the current line and $. to current line number.
perl -ne 'print if /^\d/' foo.txt beta.txt
The above one-line of code would loop through foo.txt and beta.txt, printing out all the lines that start with a digit. ($_ is assigned via the implicit while (<>) loop, and both print and m// operate on $_ if an explict argument isn't given).
-p
This is essentially the same thing as -n , except that it places a continue { print; } block after the while (<>) { ... } loop in which your code is wrapped. This is useful for reading through a list of files, making some sort of modification, and printing the results.
perl -pe 's/Paul/John/' email.txt
Open the file email.txt, loop through each line, replacing any instance of "Paul" with "John", and print every line (modified or not) to STDOUT
-i
This one sometimes astounds people that such a thing is possible with so little typing. -i is used in conjunction with either -n or -p. It causes the files specified on the command line to be edited "in-place", meaning that while you're looping through the lines of the files, all print statements are directed back to the original files. (That goes for both explicit print s, as well as the print in the continue block added by -p.)
If you give -i a string, this string will be used to create a back-up copy of the original file. Like so:
perl -pi.bkp -e's/Paul/John/' email.txt msg.txt
The above opens email.txt, replaces each line's instance of "Paul" with "John", and prints the results back to email.txt. The original email.txt is saved as email.txt.bkp. The same is then done for msg.txt

Remember that any of the command-line options listed here can also be given at the end of the shebang in non-one-liners. (But please do not start using -w in your real programs - use warnings; is still preferred because of its lexical scope and configurability).
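
For example (a minimal sketch of a stand-alone filter; the script and file names are hypothetical), the switches can ride on the shebang line so the script behaves as if it had been run as a one-liner. Invoked as ./reverse.pl somefile.txt, it prints each line of the file reversed:

#!/usr/bin/perl -ln
# -n: wrap the body in an implicit while (<>) { ... } loop
# -l: chomp each input line and add "\n" back on print
print scalar reverse $_;   # emit each input line reversed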

Data::Dumper
The standard Data::Dumper module is very useful for examining exactly what is contained in your data structure (be it hash, array, or object (when we come to them) ). When you use this module, it exports one function, named Dumper . This function takes a reference to a data structure and returns a nicely formatted description of what that structure contains.
#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dumper;

my @foo = (5..10);
#add one element to the end of the array
#do you see the error?
$foo[@foo+1] = 'last';

print Dumper(\@foo);

When run, this program shows you exactly what is inside @foo:

$VAR1 = [
          5,
          6,
          7,
          8,
          9,
          10,
          undef,
          'last'
        ];

(I know we haven't covered references yet. For now, just accept my assertion that you create a reference by prepending the variable name with a backslash...)

__DATA__ & <DATA>
Perl uses the __DATA__ marker as a pseudo-datafile. You can use this marker to write quick tests which would involve finding a file name, opening that file, and reading from that file. If you just want to test a piece of code that requires a file to be read (but don't want to test the actual file opening and reading), place the data that would be in the input file under the __DATA__ marker. You can then read from this pseudo-file using <DATA>, without bothering to open an actual file:
#!/usr/bin/env perl
use strict;
use warnings;

while (my $line = <DATA>) {
  chomp $line;
  print "Size of line $.:  ", length $line, "\n";
}

__DATA__
hello world
42
abcde

The above program would print:

Size of line 1: 11
Size of line 2: 2
Size of line 3: 5
$.
The $. variable keeps track of the line numbers of the file currently being processed via a while (<$fh>) { ... } loop. More explicitly, it is the number of the last line read of the last file read.
__FILE__ & __LINE__
These are two special markers that return, respectively, the name of the file Perl is currently executing, and the Line number where it resides. These can be used in your own debugging statements, to remind yourself where your outputs were in the source code:
  print "On line " . __LINE__ . " of file " . __FILE__ . ", \$foo = $foo\n";
   
   
   
   

Note that neither of these markers is a variable, so they cannot be interpolated in a double-quoted string.

warn() & die()
These are the most basic of all debugging techniques. warn() takes a list of strings, and prints them to STDERR. If the last element of the list does not end in a newline, warn() will also print the current filename and line number on which the warning occurred. Execution then proceeds as normal.

die() is identical to warn() , with one major exception - the program exits after printing the list of strings.

All debugging statements should use warn() or die() rather than print(). This will ensure you see your debugging output even if STDOUT has been redirected, and it will give you the helpful clue of exactly where in your code the warning occurred.
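A minimal sketch of both calls and of the difference a trailing newline makes (the value and the app.conf file name are made up):

  use strict;
  use warnings;

  my $value = 42;                         # made-up value to report
  warn "value is $value\n";               # trailing newline: printed to STDERR as-is
  warn "value is $value";                 # no newline: Perl appends "at script.pl line N."

  open my $fh, '<', 'app.conf'            # hypothetical file name
      or die "Cannot open app.conf: $!";  # prints message plus location, then exits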

[Dec 19, 2017] Open Perl IDE - User Manual

Dec 19, 2017 | open-perl-ide.sourceforge.net

This section explains how to use Open Perl IDE for debugging.

Important: Open Perl IDE is not able to debug any scripts, if it does not know a path to "perl.exe". If the PATH environment variable contains a valid location, then "perl.exe" will be detected automatically. Otherwise it is necessary to enter a valid location into the "Preferences | General | Directories | Path to perl.exe" field.

There are two methods to debug a script:

After execution is stopped, it is possible to analyse the current state of the script. Furthermore, it is possible to set/delete breakpoints (see section 5.1 Breakpoints) or to continue/abort the execution of the script. The following table shows the different navigation possibilities:

Table: Debug Navigation

Name Shortcut Description
Run F9 Start/Continue script execution until next breakpoint is reached.
Step Over F8 Execute the current script line, not tracing into subroutines.
Step Into F7 Execute the next command in the current script line, tracing into subroutines.
Abort CTRL-F2 Request termination of debug session.
Force Termination CTRL-ALT-F12 Immediately terminate debug session.
You should only use "Force Termination" if you see no other way to stop script execution. Don't expect Open Perl IDE to work correctly after using forced termination!

If script execution has finished, then Open Perl IDE automatically switches back from debug mode to edit mode.

[Dec 19, 2017] Antibugging in Perl 7 Tips for Reducing Complexity

Notable quotes:
"... The complexity of a program is a function of several factors: ..."
Dec 19, 2017 | www.informit.com

"Complexity is the enemy, and our aim is to kill it." -Jan Baan

One of Perl's greatest strengths is its expressiveness and extreme conciseness. Complexity is the bane of software development: when a program grows beyond a certain size, it becomes much harder to test, maintain, read, or extend. Unfortunately, today's problems are large enough that this is true of nearly every program we need. Anything you can do to minimize the complexity of your program will pay handsome dividends.

The complexity of a program is a function of several factors:

Whenever a language allows you to change some code to reduce any of these factors, you reduce complexity.

3.7.1 Lose the Temporary Variables

The poster child for complexity is the temporary variable. Any time a language intrudes between you and the solution you visualize, it diminishes your ability to implement the solution. All languages do this to some degree; Perl less than most. In most languages, you swap two variables a and b with the following algorithm:

Declare temp to be of the same type as a and b
temp = a;
a = b;
b = temp;

But most languages are not Perl:

($b, $a) = ($a, $b);

Iterating over an array usually requires an index variable and a count of how many things are currently stored in the array:

int i;
for (i = 0; i < count_lines; i++) 
 {
 strcat (line[i], suffix);
 }

Whereas in Perl, you have the foreach construct borrowed from the shell:

foreach my $line (@lines) { $line .= $suffix }

And if you feel put out by having to type foreach instead of just for , you're in luck, because they're synonyms for each other; so just type for if you want (Perl can tell which one you mean).

Because functions can return lists, you no longer need to build special structures just to return multivalued data. Because Perl does reference-counting garbage collection, you can return variables from the subroutine in which they are created and know that they won't be trampled on, yet their storage will be released later when they're no longer in use. And because Perl doesn't have strong typing of scalars, you can fill a hierarchical data structure with heterogeneous values without having to construct a union datatype and some kind of type descriptor.
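For example, a made-up illustration of those points (not the article's own code):

  use strict;
  use warnings;

  # a sub can return a multi-valued result as a plain list...
  sub min_max {
      my @sorted = sort { $a <=> $b } @_;
      return ($sorted[0], $sorted[-1]);
  }
  my ($min, $max) = min_max(3, 9, 1, 7);

  # ...and a hash can hold heterogeneous values with no union type in sight
  my %record = (
      name    => 'sample run',
      scores  => [3, 9, 1, 7],                    # array reference
      summary => { min => $min, max => $max },    # hash reference
  );
  print "range: $record{summary}{min}..$record{summary}{max}\n";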

Because built-in functions take lists of arguments where it makes sense to do that, you can pass them the results of other functions without having to construct an iterative loop:

unlink grep /~$/, readdir DIR;

And the map function lets you form a new list from an old one with no unnecessary temporary variables:

open PASSWD, '/etc/passwd' or die "passwd: $!\n";
my @usernames = map /^([^:]+)/, <PASSWD>;
close PASSWD;

Because Perl's arrays grow and shrink automatically and there are simple operators for inserting, modifying, or deleting array elements, you don't need to build linked lists and worry if you've got the traversal termination conditions right. And because Perl has the hash data type, you can quickly locate a particular chunk of information by key or find out whether a member of a set exists.
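The hash-as-a-set idiom that last point describes can be as small as this (the word list is a made-up example):

  my %seen = map { $_ => 1 } qw(alpha beta gamma);

  print "beta is a member\n"       if $seen{beta};
  print "delta is not a member\n"  unless exists $seen{delta};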

3.7.2 Scope Out the Problem

Of course, sometimes temporary variables are unavoidable. Whenever you create one though, be sure and do it in the innermost scope possible (in other words, within the most deeply nested set of braces containing all references to the variable).

Create variables in the innermost scope possible.

For example, let's say somewhere in my program I am traversing my Netscape history file and want to save the URLs visited in the last 10 days in @URLs :

use Netscape::History;
my $history = new Netscape::History;
my (@URLs, $url);
while (defined($url = $history->next_url() )) 
 {
 push @URLs, $url if 
    time - $url->last_visit_time < 10 * 24 * 3600;
 }

This looks quite reasonable on the face of it, but what if later on in our program we create a variable called $history or $url ? We'd get the message

"my" variable $url masks earlier declaration in same scope

which would cause us to search backward in the code to find exactly which one it's referring to. Note the clause " in same scope " -- if in the meantime you created a variable $url at a different scope, well, that may be the one you find when searching backward with a text editor, but it won't be the right one. You may have to check your indentation level to see the scope level.

This process could be time-consuming. And really, the problem is in the earlier code, which created the variables $history or $url with far too wide a scope to begin with. We can (as of perl 5.004) put the my declaration of $url right where it is first used in the while statement and thereby limit its scope to the while block. As for $history , we can wrap a bare block around all the code to limit the scope of those variables:

use Netscape::History;
my @URLs;
 {
 my $history = new Netscape::History;
 while (defined(my $url = $history->next_url() )) 
  {
  push @URLs, $url 
   if time - $url->last_visit_time < 10 * 24 * 3600;
  }
 }

If you want to create a constant value to use in several places, use constant.pm to make sure it can't be overwritten:

$PI = 3.1415926535897932384;

use constant PI => 3.1415926535897932384;

my $volume = 4/3 * PI * $radius ** 3;

$PI = 3.0; # The 'Indiana maneuver' works!
PI = 3.0; # But this does not

In response to the last statement, Perl returns the error message, " Can't modify constant item in scalar assignment ."

constant.pm creates a subroutine of that name which returns the value you've assigned to it, so trying to overwrite it is like trying to assign a value to a subroutine call. Although the absurdity of that may sound like sufficient explanation for how use constant works, in fact, the latest version of perl allows you to assign a value to a subroutine call, provided the result of the subroutine is a place where you could store the value. For example, the subroutine could return a scalar variable. The term for this feature is lvaluable subroutine . But since the results of the subroutines created by use constant aren't lvalues, lvaluable subroutines won't cause problems for them.
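A minimal sketch of such an lvalue subroutine, written here for illustration (constants made with use constant return ordinary non-lvalue results, so they stay read-only):

  use strict;
  use warnings;

  my $counter = 0;
  sub counter :lvalue { $counter }    # last expression is a scalar variable

  counter() = 42;                     # assigning to the call stores into $counter
  print "counter is now $counter\n";  # prints 42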

[Dec 19, 2017] Cultured Perl: Debugging Perl with ease. Catch the bugs before they bite

Nov 01, 2000 | www.ibm.com

Bugs are as inevitable as death and taxes. Nevertheless, the following material should help you avoid the pitfalls of bugs.

... ... ...

First let's simply make sure the bug is repeatable. We'll set an action on line 8 to print $line where the error occurred, and run the program.

perl -d ./buggy.pl buggy.pl

use Data::Dumper

a 8 print 'The line variable is now ', Dumper $line

The Data::Dumper module loads so that the autoaction can use a nice output format. The autoaction is set to do a print statement every time line 8 is reached. Now let's watch the show.

[Dec 08, 2017] Perl Debugger Tutorial 10 Easy Steps to Debug Perl Program

Dec 08, 2017 | www.thegeekstuff.com

Perl Debugger Tutorial: 10 Easy Steps to Debug Perl Program by Balakrishnan Mariyappan on May 19, 2010

Earlier we discussed the basics of how to write and execute a perl program using Perl Hello World Example .

In this article, let us review how to debug a perl program / script using the Perl debugger, which is similar to the gdb tool for debugging C code.

To debug a perl program, invoke the perl debugger using "perl -d" as shown below.

# perl -d  ./perl_debugger.pl

To understand the perl debugger commands in detail, let us create the following sample perl program (perl_debugger.pl).

$ cat perl_debugger.pl
#!/usr/bin/perl -w

# Script to list out the filenames (in the pwd) that contains specific pattern.

#Enabling slurp mode
$/=undef;

# Function : get_pattern
# Description : to get the pattern to be matched in files.
sub get_pattern
{
my $pattern;
print "Enter search string: ";
chomp ($pattern = <> );
return $pattern;
}

# Function : find_files
# Description : to get list of filenames that contains the input pattern.
sub find_files
{
my $pattern = shift;
my (@files,@list,$file);

# using glob, obtaining the filenames,
@files = <./*>;

# taking out the filenames that contains pattern.
@list = grep {
$file = $_;
open $FH,"$file";
@lines = <$FH>;
$count = grep { /$pattern/ } @lines;
$file if($count);
} @files;
return @list;
}
# to obtain the pattern from STDIN
$pattern = get_pattern();

# to find-out the list of filenames which has the input pattern.
@list = find_files($pattern);

print join "\n",@list;
1. Enter Perl Debugger

# perl -d ./perl_debugger.pl

it prompts,
DB<1>

2. View specific lines or subroutine statements using (l)

DB<1> l 10
10: my $pattern;

DB<2> l get_pattern
11 {
12: my $pattern;
13: print "Enter search string: ";
14: chomp ($pattern = <> );
15: return $pattern;
16 }

3. Set a breakpoint on the find_files function using (b)

DB<3> b find_files

4. Set the breakpoint on specific line using (b)

DB<4> b 44

5. View the breakpoints using (L)

DB<5> L
./perl_debugger.pl:
22: my $pattern = shift;
break if (1)
44: print join "\n",@list;
break if (1)

6. step by step execution using (s and n)

DB<5> s
main::(./perl_debugger.pl:39): $pattern = get_pattern();

DB<5> s
main::get_pattern(./perl_debugger.pl:12):
12: my $pattern;

Options s and n execute the script statement by statement. The s option steps into subroutine calls, while the n option executes a subroutine call in a single step (stepping over it).

7. Continue till next breakpoint (or line number, or subroutine) using (c)

DB<5> c
Enter search string: perl
main::find_files(./perl_debugger.pl:22):
22: my $pattern = shift;

8. Continue down to the specific line number using (c)

DB<5> c 36
main::find_files(./perl_debugger.pl:36):
36: return @list;

9. Print the value in the specific variable using (p)

DB<6> p $pattern
perl

DB<7> c
main::(./perl_debugger.pl:44): print join "\n",@list;
DB<7> c
./perl_debugger.pl
Debugged program terminated. Use q to quit or R to restart,
use o inhibit_exit to avoid stopping after program termination,
h q, h R or h o to get additional info.

After the last continue operation, "./perl_debugger.pl" gets printed on stdout, since that file contains the pattern "perl".

10. Get debug commands from the file (source)

The Perl debugger can read debugger commands from a file and execute them. For example, create a file called "debug_cmds" with the following debugger commands:

c
p $pattern
q

Note that R restarts the debugged program (there is no need to quit and start the debugger again).
DB<7> R
DB<7> source debug_cmds
>> c
Enter search string: perl
./perl_debugger.pl
Debugged program terminated. Use q to quit or R to restart,
use o inhibit_exit to avoid stopping after program termination,
h q, h R or h o to get additional info.
>> p $pattern
perl
>> q

Note : If you are relatively new to perl, refer to our previous article: 20 perl programming tips for beginners .

Summary of perl debugger commands

The following commands, all used above, are available once you are at the debugger prompt: l (list source lines or a subroutine), b (set a breakpoint on a line or subroutine), L (list breakpoints), s (step into), n (step over), c (continue to a breakpoint, line, or subroutine), p (print the value of a variable), source (read debugger commands from a file), R (restart), h (help), and q (quit).

[Dec 03, 2017] Debugging Regular Expressions

Dec 03, 2017 | my.safaribooksonline.com

Not for the fainthearted, if you want to see how a regular expression runs when used in a match or substitution, use the core re pragma with its debug option:

% perl -Mstrict -Mwarnings
use re qw(debug);
$_ = "cats=purr, dog=bark";
my %sound = /(\w+)=(\w+)/g;
^D
Compiling REx `(\w+)=(\w+)'
size 15 first at 4
1: OPEN1(3)
3: PLUS(5)
4: ALNUM(0)
5: CLOSE1(7)
7: EXACT <=>(9)
9: OPEN2(11)
11: PLUS(13)
12: ALNUM(0)
13: CLOSE2(15)
15: END(0)
floating `=' at 1..2147483647 (checking floating) stclass `ALNUM' plus
minlen 3
Guessing start of match, REx `(\w+)=(\w+)' against `cats=purr,
dog=bark'...
Found floating substr `=' at offset 4...
Does not contradict STCLASS...
Guessed: match at offset 0
Matching REx `(\w+)=(\w+)' against `cats=purr, dog=bark'
Setting an EVAL scope, savestack=3
0 <> <cats=purr, d> | 1: OPEN1
0 <> <cats=purr, d> | 3: PLUS
ALNUM can match 4 times out of 32767...
Setting an EVAL scope, savestack=3
4 <cats> <=purr, d> | 5: CLOSE1
4 <cats> <=purr, d> | 7: EXACT <=>
5 <cats=> <purr, d> | 9: OPEN2
5 <cats=> <purr, d> | 11: PLUS

Setting an EVAL scope, savestack=3
9 <=purr> <, dog=b> | 13: CLOSE2
9 <=purr> <, dog=b> | 15: END
Match successful!
Guessing start of match, REx `(\w+)=(\w+)' against `, dog=bark'...
Found floating substr `=' at offset 5...
By STCLASS: moving 0 --> 2
Guessed: match at offset 2
Matching REx `(\w+)=(\w+)' against `dog=bark'
Setting an EVAL scope, savestack=3
11 <urr, > <dog=bar> | 1: OPEN1
11 <urr, > <dog=bar> | 3: PLUS
ALNUM can match 3 times out of 32767...
Setting an EVAL scope, savestack=3
14 <rr, dog> <=bark> | 5: CLOSE1
14 <rr, dog> <=bark> | 7: EXACT <=>
15 <rr, dog=> <bark> | 9: OPEN2
15 <rr, dog=> <bark> | 11: PLUS
ALNUM can match 4 times out of 32767...
Setting an EVAL scope, savestack=3
19 <rr, dog=bark> <> | 13: CLOSE2
19 <rr, dog=bark> <> | 15: END
Match successful!
Freeing REx: `(\w+)=(\w+)'

If you use the debugcolor option instead of debug, you'll get some form of highlighting or coloring in the output that will make it prettier, if not more understandable.
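For example, the same match can be rerun with colored output straight from the command line (output not reproduced here):

% perl -Mre=debugcolor -e'"cats=purr, dog=bark" =~ /(\w+)=(\w+)/'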

[Dec 03, 2017] Just writing the tests is often a damn fine way of finding bugs

Nov 15, 2017 | perlmonks.com

GrandFather (Sage) on Jul 12, 2006 at 08:38 UTC

Re^2: Strategies for maintenance of horrible code?

Actually, just writing the tests is often a damn fine way of finding bugs. Not exactly what the OP is after at the moment, but something that is at the forefront of my mind because I'm in the middle of writing a set of tests (in Perl :) for some XML processing C++ code and turning up a pile of bugs as I go.

However it does suggest another test avenue: write test harnesses for modules so that you can exercise them in isolation and better understand how they work. If the test harness ends up part of a regression test system so much the better.


DWIM is Perl's answer to Gödel

[Dec 03, 2017] Strategies for maintenance of horrible code?

Notable quotes:
"... Debugging this code is a whole different game, and I'd really appreciate some input from other monks who've dealt with this type of problem. ..."
Jul 12, 2006 | perlmonks.com

converter has asked for the wisdom of the Perl Monks concerning the following question:

For the past several months I've been busy rewriting the horrible Perl code left behind by my predecessor. His approach to development was "Write some code. If the code runs without revealing any of the damage it's done, ship it. If not, write some more code."

This code is so bad that when co-workers ask me what I'm working on, I tell them "The Madman's Diary." Yes, it would have been cheaper and faster to throw this code away and start over, but I wasn't given that option.

My latest assignment is the repair of a tangled mess of a show-stopper that was discovered in a product that was supposed to ship today. After adding an open() override that logs the arguments to open() and some quality time with the watch(1) utility observing changes to the files containing the data that are causing the problem, I've narrowed the list of suspects down to a couple in-house scripts and a few (probably altered) webmin modules.

Now that I know where to look, I'd like to identify as quickly as possible which details can be safely ignored. I plan to use Devel::DProf to produce an execution graph for reference and Tie::Watch to watch variables, but I wonder if there are other tools that I should look at. A utility or module that would allow me to incrementally build a profile with persistent notes would be wonderful.

Debugging this code is a whole different game, and I'd really appreciate some input from other monks who've dealt with this type of problem.

eyepopslikeamosquito (Chancellor) on Jul 12, 2006 at 08:30 UTC

Re: Strategies for maintenance of horrible code?
Yes, it would have been cheaper and faster to throw this code away and start over
Maybe. For another point of view, see Joel Spolsky on not rewriting from scratch .

I agree with adrianh . If a component is not broken, don't rewrite it. Rewrite a component when you find a number of bugs in it. But first write a regression test suite for the component. I've seen many folks over the years throw out old code, rewrite it ... and introduce a heap of new bugs in the process. If you come into a new company and introduce a swag of new bugs in previously working code, you will start to smell very badly.
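As a rough sketch of what "first write a regression test suite" can look like with Test::More (Legacy::Parser and its parse_line function are made-up stand-ins for whatever component is about to be rewritten):

  #!/usr/bin/env perl
  use strict;
  use warnings;
  use Test::More tests => 2;

  use_ok('Legacy::Parser');    # the old module still loads

  # pin down today's behaviour so a rewrite can be checked against it
  my $got = Legacy::Parser::parse_line('key=value');
  is_deeply($got, { key => 'value' }, 'parse_line still splits a simple pair');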


GrandFather (Sage) on Jul 12, 2006 at 08:38 UTC

Re^2: Strategies for maintenance of horrible code?

Actually, just writing the tests is often a damn fine way of finding bugs. Not exactly what the OP is after at the moment, but something that is at the forefront of my mind because I'm in the middle of writing a set of tests (in Perl :) for some XML processing C++ code and turning up a pile of bugs as I go.

However it does suggest another test avenue: write test harnesses for modules so that you can exercise them in isolation and better understand how they work. If the test harness ends up part of a regression test system so much the better.


DWIM is Perl's answer to Gödel

tinita (Parson) on Jul 12, 2006 at 12:28 UTC

Maybe. For another point of view, see Joel Spolsky on not rewriting from scratch.
uh oh. why does this remind me of perl 6? =)

adrianh (Chancellor) on Jul 12, 2006 at 07:49 UTC

Debugging this code is a whole different game, and I'd really appreciate some input from other monks who've dealt with this type of problem.

I'd recommend reading Perl Medic and Working Effectively with Legacy Code (the latter isn't Perl specific - but is chock full of useful advice).

I would not spend any time fixing the code if it's not breaking (assuming you're not being paid to review/fix the code). However evil it may be, if it's doing its job, leave it alone.

Instead - every time you need to fix a bug or add some new functionality just test/refactor the bits of the evil code that are touched by the changes. I've found incrementally adding tests and refactoring to be much more effective than any sort of "big bang" fixing things for the sake of them approach :-)

If you are being paid to do a review/fix then Perl::Critic might give you some useful places to look.

webfiend (Vicar) on Jul 14, 2006 at 21:24 UTC

Definitely agree about the approach of sorting things out with gradual refactoring and tests as the need arises. The problem with the "Big Bang" approach is that you have the potential for a very long stretch of time where there are two forks of the code: ugly shipping code that will need to be fixed and refactored as bugs are reported, and pretty nonfunctioning code that will need to incorporate those fixes as they are uncovered, resulting in a perpetual loop of "it's not quite ready yet."

Ovid (Cardinal) on Jul 12, 2006 at 10:37 UTC

Check out Suggestions for working with poor code and some of the replies.

Cheers,
Ovid

New address of my CGI Course .

GrandFather (Sage) on Jul 12, 2006 at 07:39 UTC

Re: Strategies for maintenance of horrible code?

What tools are you using already and on what platform? For a large range of "detail" debugging there is nothing as good as an IDE with a good integrated debugger. For a higher level view of where things are going, Devel::TraceCalls may be handy, although its output can be rather voluminous.


DWIM is Perl's answer to Gödel

Moron (Curate) on Jul 12, 2006 at 12:13 UTC

Some basic CYA I can see:

1) Ensure there is sufficient functional and technical design documentation against which the routines can be tested.

2) (updated) Make sure there is a sufficiently detailed project plan to include tasks for: systems analysis, functional and technical design, test planning, test script writing (e.g. using Expect ), developing, unit-, integrated and functional testing, rework and implementation, to include a GANTT chart of the work done so far and by who to what % of completion, to avoid getting the blame for not meeting poorly conceived targets over which you had no control.

In response to formal testing against the plan, I find it a useful aid to bug-fixing to monitor execution with perl -d, setting breakpoints and examining variables to hunt down which line of code causes each failure.

-M

Free your mind

aufflick (Deacon) on Jul 13, 2006 at 00:17 UTC

You might find the comments to my recent question Generating documentation from Perl code (not just POD) useful.

The Doxygen perl extension creates docs that are great for seeing what classes re-implement what methods etc. Also UML::Sequence sounds intriguing - it purports to generate a sequence diagram by monitoring code execution.

[Dec 03, 2017] Core module Tie::File - Access the lines of a disk file via a Perl array

Dec 03, 2017 | perldoc.perl.org

Tie::File

NAME

Tie::File - Access the lines of a disk file via a Perl array

SYNOPSIS
  # This file documents Tie::File version 0.98
  use Tie::File;

  tie @array, 'Tie::File', filename or die ...;

  $array[13] = 'blah';      # line 13 of the file is now 'blah'
  print $array[42];         # display line 42 of the file

  $n_recs = @array;         # how many records are in the file?
  $#array -= 2;             # chop two records off the end

  for (@array) {
    s/PERL/Perl/g;          # Replace PERL with Perl everywhere in the file
  }

  # These are just like regular push, pop, unshift, shift, and splice
  # Except that they modify the file in the way you would expect

  push @array, new recs...;
  my $r1 = pop @array;
  unshift @array, new recs...;
  my $r2 = shift @array;
  @old_recs = splice @array, 3, 7, new recs...;

  untie @array;             # all finished
DESCRIPTION

Tie::File represents a regular text file as a Perl array. Each element in the array corresponds to a record in the file. The first line of the file is element 0 of the array; the second line is element 1, and so on.

The file is not loaded into memory, so this will work even for gigantic files.

Changes to the array are reflected in the file immediately.

Lazy people and beginners may now stop reading the manual.

recsep

What is a 'record'? By default, the meaning is the same as for the <...> operator: It's a string terminated by $/ , which is probably "\n" . (Minor exception: on DOS and Win32 systems, a 'record' is a string terminated by "\r\n" .) You may change the definition of "record" by supplying the recsep option in the tie call:

  tie @array, 'Tie::File', $file, recsep => 'es';

This says that records are delimited by the string es . If the file contained the following data:

  Curse these pesky flies!\n

then the @array would appear to have four elements:

  1. "Curse th"
  2. "e p"
  3. "ky fli"
  4. "!\n"

An undefined value is not permitted as a record separator. Perl's special "paragraph mode" semantics (à la $/ = "" ) are not emulated.

Records read from the tied array do not have the record separator string on the end; this is to allow

  $array[17] .= "extra";

to work as expected.

(See autochomp , below.) Records stored into the array will have the record separator string appended before they are written to the file, if they don't have one already. For example, if the record separator string is "\n" , then the following two lines do exactly the same thing:

  $array[17] = "Cherry pie";
  $array[17] = "Cherry pie\n";

The result is that the contents of line 17 of the file will be replaced with "Cherry pie"; a newline character will separate line 17 from line 18. This means that this code will do nothing:

  chomp $array[17];

because the chomped value will have the separator reattached when it is written back to the file. There is no way to create a file whose trailing record separator string is missing.

Inserting records that contain the record separator string is not supported by this module. It will probably produce a reasonable result, but what this result will be may change in a future version. Use 'splice' to insert records or to replace one record with several.

autochomp

Normally, array elements have the record separator removed, so that if the file contains the text

  Gold
  Frankincense
  Myrrh

the tied array will appear to contain ("Gold", "Frankincense", "Myrrh"). If you set autochomp to a false value, the record separator will not be removed. If the file above was tied with

  1. tie @gifts "Tie::File" $gifts autochomp =>

then the array @gifts would appear to contain ("Gold\n", "Frankincense\n", "Myrrh\n"), or (on Win32 systems) ("Gold\r\n", "Frankincense\r\n", "Myrrh\r\n").

mode

Normally, the specified file will be opened for read and write access, and will be created if it does not exist. (That is, the flags O_RDWR | O_CREAT are supplied in the open call.) If you want to change this, you may supply alternative flags in the mode option. See Fcntl for a listing of available flags. For example:

  # open the file if it exists, but fail if it does not exist
  use Fcntl 'O_RDWR';
  tie @array, 'Tie::File', $file, mode => O_RDWR;

  # create the file if it does not exist
  use Fcntl 'O_RDWR', 'O_CREAT';
  tie @array, 'Tie::File', $file, mode => O_RDWR | O_CREAT;

  # open an existing file in read-only mode
  use Fcntl 'O_RDONLY';
  tie @array, 'Tie::File', $file, mode => O_RDONLY;

Opening the data file in write-only or append mode is not supported.

memory

This is an upper limit on the amount of memory that Tie::File will consume at any time while managing the file. This is used for two things: managing the read cache and managing the deferred write buffer .

Records read in from the file are cached, to avoid having to re-read them repeatedly. If you read the same record twice, the first time it will be stored in memory, and the second time it will be fetched from the read cache . The amount of data in the read cache will not exceed the value you specified for memory . If Tie::File wants to cache a new record, but the read cache is full, it will make room by expiring the least-recently visited records from the read cache.

The default memory limit is 2MiB. You can adjust the maximum read cache size by supplying the memory option. The argument is the desired cache size, in bytes.

  # I have a lot of memory, so use a large cache to speed up access
  tie @array, 'Tie::File', $file, memory => 20_000_000;

Setting the memory limit to 0 will inhibit caching; records will be fetched from disk every time you examine them.

The memory value is not an absolute or exact limit on the memory used. Tie::File objects contains some structures besides the read cache and the deferred write buffer, whose sizes are not charged against memory .

The cache itself consumes about 310 bytes per cached record, so if your file has many short records, you may want to decrease the cache memory limit, or else the cache overhead may exceed the size of the cached data.

dw_size

(This is an advanced feature. Skip this section on first reading.)

If you use deferred writing (See Deferred Writing , below) then data you write into the array will not be written directly to the file; instead, it will be saved in the deferred write buffer to be written out later. Data in the deferred write buffer is also charged against the memory limit you set with the memory option.

You may set the dw_size option to limit the amount of data that can be saved in the deferred write buffer. This limit may not exceed the total memory limit. For example, if you set dw_size to 1000 and memory to 2500, that means that no more than 1000 bytes of deferred writes will be saved up. The space available for the read cache will vary, but it will always be at least 1500 bytes (if the deferred write buffer is full) and it could grow as large as 2500 bytes (if the deferred write buffer is empty.)

If you don't specify a dw_size , it defaults to the entire memory limit.

Option Format

-mode is a synonym for mode. -recsep is a synonym for recsep. -memory is a synonym for memory. You get the idea.

Public Methods

The tie call returns an object, say $o . You may call

  $rec = $o->FETCH($n);
  $o->STORE($n, $rec);

to fetch or store the record at line $n , respectively; similarly the other tied array methods. (See perltie for details.) You may also call the following methods on this object:

flock
  $o->flock(MODE)

will lock the tied file. MODE has the same meaning as the second argument to the Perl built-in flock function; for example LOCK_SH or LOCK_EX | LOCK_NB . (These constants are provided by the use Fcntl ':flock' declaration.)

MODE is optional; the default is LOCK_EX .

Tie::File maintains an internal table of the byte offset of each record it has seen in the file.

When you use flock to lock the file, Tie::File assumes that the read cache is no longer trustworthy, because another process might have modified the file since the last time it was read. Therefore, a successful call to flock discards the contents of the read cache and the internal record offset table.

Tie::File promises that the following sequence of operations will be safe:

  my $o = tie @array, "Tie::File", $filename;
  $o->flock;

In particular, Tie::File will not read or write the file during the tie call. (Exception: Using mode => O_TRUNC will, of course, erase the file during the tie call. If you want to do this safely, then open the file without O_TRUNC , lock the file, and use @array = () .)

The best way to unlock a file is to discard the object and untie the array. It is probably unsafe to unlock the file without also untying it, because if you do, changes may remain unwritten inside the object. That is why there is no shortcut for unlocking. If you really want to unlock the file prematurely, you know what to do; if you don't know what to do, then don't do it.

All the usual warnings about file locking apply here. In particular, note that file locking in Perl is advisory , which means that holding a lock will not prevent anyone else from reading, writing, or erasing the file; it only prevents them from getting another lock at the same time. Locks are analogous to green traffic lights: If you have a green light, that does not prevent the idiot coming the other way from plowing into you sideways; it merely guarantees to you that the idiot does not also have a green light at the same time.

autochomp
  my $old_value = $o->autochomp(0);    # disable autochomp option
  my $old_value = $o->autochomp(1);    # enable autochomp option
  my $ac = $o->autochomp();            # recover current value

See autochomp , above.

defer , flush , discard , and autodefer

See Deferred Writing , below.

offset
  $off = $o->offset($n);

This method returns the byte offset of the start of the $n th record in the file. If there is no such record, it returns an undefined value.

Tying to an already-opened filehandle

If $fh is a filehandle, such as is returned by IO::File or one of the other IO modules, you may use:

  tie @array, 'Tie::File', $fh, ...;

Similarly if you opened that handle FH with regular open or sysopen , you may use:

  tie @array, 'Tie::File', \*FH, ...;

Handles that were opened write-only won't work. Handles that were opened read-only will work as long as you don't try to modify the array. Handles must be attached to seekable sources of data---that means no pipes or sockets. If Tie::File can detect that you supplied a non-seekable handle, the tie call will throw an exception. (On Unix systems, it can detect this.)

Note that Tie::File will only close any filehandles that it opened internally. If you passed it a filehandle as above, you "own" the filehandle, and are responsible for closing it after you have untied the @array.
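A small sketch of that ownership rule (data.txt is a made-up file name):

  use strict;
  use warnings;
  use Tie::File;

  open my $fh, '+<', 'data.txt' or die "data.txt: $!";
  tie my @lines, 'Tie::File', $fh or die "tie failed";

  $lines[0] = 'new first line';   # written back through the handle we supplied

  untie @lines;   # finished with the tied array...
  close $fh;      # ...but we opened the handle, so we must close it ourselves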

Deferred Writing

(This is an advanced feature. Skip this section on first reading.)

Normally, modifying a Tie::File array writes to the underlying file immediately. Every assignment like $a[3] = ... rewrites as much of the file as is necessary; typically, everything from line 3 through the end will need to be rewritten. This is the simplest and most transparent behavior. Performance even for large files is reasonably good.

However, under some circumstances, this behavior may be excessively slow. For example, suppose you have a million-record file, and you want to do:

  for (@FILE) {
    $_ = "> $_";
  }

The first time through the loop, you will rewrite the entire file, from line 0 through the end. The second time through the loop, you will rewrite the entire file from line 1 through the end. The third time through the loop, you will rewrite the entire file from line 2 to the end. And so on.

If the performance in such cases is unacceptable, you may defer the actual writing, and then have it done all at once. The following loop will perform much better for large files:

  (tied @a)->defer;
  for (@a) {
    $_ = "> $_";
  }
  (tied @a)->flush;

If Tie::File's memory limit is large enough, all the writing will be done in memory. Then, when you call ->flush, the entire file will be rewritten in a single pass.

(Actually, the preceding discussion is something of a fib. You don't need to enable deferred writing to get good performance for this common case, because Tie::File will do it for you automatically unless you specifically tell it not to. See autodeferring , below.)

Calling ->flush returns the array to immediate-write mode. If you wish to discard the deferred writes, you may call ->discard instead of ->flush . Note that in some cases, some of the data will have been written already, and it will be too late for ->discard to discard all the changes. Support for ->discard may be withdrawn in a future version of Tie::File .

Deferred writes are cached in memory up to the limit specified by the dw_size option (see above). If the deferred-write buffer is full and you try to write still more deferred data, the buffer will be flushed. All buffered data will be written immediately, the buffer will be emptied, and the now-empty space will be used for future deferred writes.

If the deferred-write buffer isn't yet full, but the total size of the buffer and the read cache would exceed the memory limit, the oldest records will be expired from the read cache until the total size is under the limit.

push , pop , shift , unshift , and splice cannot be deferred. When you perform one of these operations, any deferred data is written to the file and the operation is performed immediately. This may change in a future version.

If you resize the array with deferred writing enabled, the file will be resized immediately, but deferred records will not be written. This has a surprising consequence: @a = (...) erases the file immediately, but the writing of the actual data is deferred. This might be a bug. If it is a bug, it will be fixed in a future version.

Autodeferring

Tie::File tries to guess when deferred writing might be helpful, and to turn it on and off automatically.

  for (@a) {
    $_ = "> $_";
  }

In this example, only the first two assignments will be done immediately; after this, all the changes to the file will be deferred up to the user-specified memory limit.

You should usually be able to ignore this and just use the module without thinking about deferring. However, special applications may require fine control over which writes are deferred, or may require that all writes be immediate. To disable the autodeferment feature, use

  (tied @o)->autodefer(0);

or

  tie @array, 'Tie::File', $file, autodefer => 0;

Similarly, ->autodefer(1) re-enables autodeferment, and ->autodefer() recovers the current value of the autodefer setting.

CONCURRENT ACCESS TO FILES

Caching and deferred writing are inappropriate if you want the same file to be accessed simultaneously from more than one process. Other optimizations performed internally by this module are also incompatible with concurrent access. A future version of this module will support a concurrent => option that enables safe concurrent access.

Previous versions of this documentation suggested using memory => 0 for safe concurrent access. This was mistaken. Tie::File will not support safe concurrent access before version 0.96.

CAVEATS

(That's Latin for 'warnings'.)

SUBCLASSING

This version promises absolutely nothing about the internals, which may change without notice. A future version of the module will have a well-defined and stable subclassing API.

WHAT ABOUT DB_File ?

People sometimes point out that DB_File will do something similar, and ask why Tie::File module is necessary.

There are a number of reasons that you might prefer Tie::File . A list is available at http://perl.plover.com/TieFile/why-not-DB_File .

AUTHOR

Mark Jason Dominus

To contact the author, send email to: mjd perl tiefile @plover com

To receive an announcement whenever a new version of this module is released, send a blank email message to mjd perl tiefile subscribe @plover com .

The most recent version of this module, including documentation and any news of importance, will be available at

  1. http://perl.plover.com/TieFile/
LICENSE

Tie::File version 0.96 is copyright (C) 2003 Mark Jason Dominus.

This library is free software; you may redistribute it and/or modify it under the same terms as Perl itself.

These terms are your choice of any of (1) the Perl Artistic Licence, or (2) version 2 of the GNU General Public License as published by the Free Software Foundation, or (3) any later version of the GNU General Public License.

This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this library program; it should be in the file COPYING . If not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA

For licensing inquiries, contact the author at:

  1. Mark Jason Dominus
  2. 255 S. Warnock St.
  3. Philadelphia, PA 19107
WARRANTY

Tie::File version 0.98 comes with ABSOLUTELY NO WARRANTY. For details, see the license.

THANKS

[Dec 01, 2017] regex - Debugging Perl Regular expression

Dec 01, 2017 | stackoverflow.com


AnonGeek ,Jun 20, 2012 at 20:37

I am trying to debug few regular expressions using:
perl -Mre=debug file.pl

The file.pl script has many regular expression. Some of them are repeated. Using above syntax, all the regex in file.pl are being debugged.

Is there a way to tell Perl to debug only a particular regex in a script?

I am familiar with YAPE::Regex module, but that is not what I require. So please don't suggest to use that.

Ehtesh Choudhury ,Jun 20, 2012 at 20:45

Why not just comment out the other regexes, or run just the particular regex on the command line, via perl -e ? – Ehtesh Choudhury Jun 20 '12 at 20:45

AnonGeek ,Jun 20, 2012 at 20:55

the script is very complicated (12,000 LOC). If I comment out any regex, the execution will fail. Also if I provide dummy values, it will give unexpected results :( – AnonGeek Jun 20 '12 at 20:55

Oleg V. Volkov ,Jun 20, 2012 at 20:41

As with many other pragmas, you can use no to cancel previous use .
use re 'debug';

$str=~/\d{3}/;

no re 'debug';

$str=~/\d{3}/;

Denis Ibaev ,Jun 20, 2012 at 20:48

As of 5.9.5 the directive use re 'debug' and its equivalents are lexically scoped, as the other directives are.

Use:

{
    use re 'debug';
    # Debugged regexp here.
}

AnonGeek ,Jun 20, 2012 at 21:10

Is this also supported in 5.8.8? I am putting it under a condition but it is enabling it globally for all regex. – AnonGeek Jun 20 '12 at 21:10

Denis Ibaev ,Jun 21, 2012 at 5:24

No, only since version 5.9.5. In 5.8.8 you need to use the no statement. – Denis Ibaev Jun 21 '12 at 5:24

[Nov 30, 2017] debugging - Perl Debugger Filehandle as Input

Highly recommended!
Nov 30, 2017 | stackoverflow.com
I have this problem: I need to control the perl debugger from an external script. Through research I found various solutions, but I don't understand them. I failed to properly set up the RemotePort option (editing ".perldb"), which was the first thing I tried, and found no useful information on providing a filehandle from which the debugger would get its input (by somehow setting @cmdfhs). I found both options over here: http://search.cpan.org/~nwclark/perl-5.8.6/lib/perl5db.pl

It would be nice if you could tell me how to provide the filehandle from which the debugger gets its input, or if you know a link where this is explained?

Casper ,Jun 28, 2015 at 21:53

Here's a simple example setting it up using RemotePort , which seemed easier to me:

The trick to using RemotePort is that you have to have someone listening on the remote end BEFORE you launch the script to be debugged.

As soon as you launch your script with -d Perl will attempt to connect to RemotePort . So you have to make sure the initial connection succeeds by having someone listening there beforehand.

Here I assume some Linux/Unix variant, which has the netcat utility installed. We use netcat to wait for incoming connections in this example, but you can use anything else you wish too which is able to create a service port and shuffle data between that and the current TTY:

In terminal 1

 # Use netcat to listen for incoming connections on port 9999
 > nc -l -p 9999

In terminal 2

 # Start perl with -d and request a RemotePort connection 
 > PERLDB_OPTS=RemotePort=127.0.0.1:9999 perl -d my_script.pl

As soon as you do that in terminal 1 you will see something like this:

Loading DB routines from perl5db.pl version 1.39_10
Editor support available.

Enter h or 'h h' for help, or 'man perldebug' for more help.

main::(my_script.pl:4):
  DB<1>

There you go..debug away.

Devel::Trepan is a gdb-like debugger. Although it has remote control, you can also run it at the outset with the option --command which will "source" (in the gdb-sense) or run a series of debugger commands.

To go into remote control, either start the debugger using the --server option or inside the debugger use the " server " command once inside the debugger.

See Options for a list of options you can give at the outset.

[Nov 30, 2017] Working with character arrays in perl

Nov 30, 2017 | stackoverflow.com


cbg ,Jul 7, 2014 at 11:36

my @char_array = split "", $s1;

@char_array now contains all the characters of the $s1 string and it's possible to manipulate it, iterate over it or do whatever to it just like with any other array.

You can use splice to insert elements at a given position of the array:
echo -e 'hello\ndisk\ncaller' | perl -F'' -ane ' splice (@F,2,0," "); splice(@F,4,0," "); foreach(@F){print}'
he l lo
di s k
ca l ler

You can use Data::Dumper for better visualization when working with arrays:

echo -n 'hello' | perl -MData::Dumper -F'' -ane ' splice (@F,2,0," "); splice(@F,4,0," ");print Dumper(\@F)'
$VAR1 = [
          'h',
          'e',
          ' ',
          'l',
          ' ',
          'l',
          'o'
        ];

[Nov 30, 2017] regex - How to read perl regular expression debugger - Stack Overflow

Nov 30, 2017 | stackoverflow.com

Wakan Tanka ,Mar 20, 2015 at 13:17

I've come across following materials:
  1. Mastering Perl by brian d foy , chapter: Debugging Regular Expressions.
  2. Debugging regular expressions which mentions re::debug module for perl

I've also tried various other techniques:

  1. Module re=debugcolor which highlights its output.
  2. Used following construction ?{print "$1 $2\n"} .

but still did not get the point of how to read their output. I've also found other modules used for debugging regular expressions here, but I have not tried them yet. Can you please explain how to read the output of use re 'debug' or another command used for debugging regular expressions in Perl?

EDIT in reply to Borodin:

1st example:

perl -Mre=debug -e' "foobar"=~/(.)\1/'
Compiling REx "(.)\1"
Final program:
   1: OPEN1 (3)
   3:   REG_ANY (4)
   4: CLOSE1 (6)
   6: REF1 (8)
   8: END (0)
minlen 1
Matching REx "(.)\1" against "foobar"
   0 <> <foobar>             |  1:OPEN1(3)
   0 <> <foobar>             |  3:REG_ANY(4)
   1 <f> <oobar>             |  4:CLOSE1(6)
   1 <f> <oobar>             |  6:REF1(8)
                                  failed...
   1 <f> <oobar>             |  1:OPEN1(3)
   1 <f> <oobar>             |  3:REG_ANY(4)
   2 <fo> <obar>             |  4:CLOSE1(6)
   2 <fo> <obar>             |  6:REF1(8)
   3 <foo> <bar>             |  8:END(0)
Match successful!
Freeing REx: "(.)\1"
  1. What does OPEN1, REG_ANY, CLOSE1 ... mean ?
  2. What numbers like 1 3 4 6 8 mean?
  3. What does number in braces OPEN1(3) mean?
  4. Which output should I look at, Compiling REx or Matching REx?

2nd example:

 perl -Mre=debugcolor -e' "foobar"=~/(.*)\1/'
Compiling REx "(.*)\1"
Final program:
   1: OPEN1 (3)
   3:   STAR (5)
   4:     REG_ANY (0)
   5: CLOSE1 (7)
   7: REF1 (9)
   9: END (0)
minlen 0
Matching REx "(.*)\1" against "foobar"
   0 <foobar>|  1:OPEN1(3)
   0 <foobar>|  3:STAR(5)
                                  REG_ANY can match 6 times out of 2147483647...
   6 <foobar>|  5:  CLOSE1(7)
   6 <foobar>|  7:  REF1(9)
                                    failed...
   5 <foobar>|  5:  CLOSE1(7)
   5 <foobar>|  7:  REF1(9)
                                    failed...
   4 <foobar>|  5:  CLOSE1(7)
   4 <foobar>|  7:  REF1(9)
                                    failed...
   3 <foobar>|  5:  CLOSE1(7)
   3 <foobar>|  7:  REF1(9)
                                    failed...
   2 <foobar>|  5:  CLOSE1(7)
   2 <foobar>|  7:  REF1(9)
                                    failed...
   1 <foobar>|  5:  CLOSE1(7)
   1 <foobar>|  7:  REF1(9)
                                    failed...
   0 <foobar>|  5:  CLOSE1(7)
   0 <foobar>|  7:  REF1(9)
   0 <foobar>|  9:  END(0)
Match successful!
Freeing REx: "(.*)\1"
  1. Why are numbers descending 6 5 4 3 ... in this example?
  2. What does failed keyword mean?

Borodin ,Mar 20, 2015 at 14:45

Asking how to use a debugger is very broad. Can you show us the pattern that you are trying to debug, and explain what you don't understand? – Borodin Mar 20 '15 at 14:45

ThisSuitIsBlackNot ,Mar 20, 2015 at 15:38

When you run perl -Mre=debug , you're using the re module; you can see the documentation by running perldoc re . The section on "debug mode" is a bit sparse, but ends with "See 'Debugging regular expressions' in perldebug for additional info." perldoc perldebug is similarly short on details, but ends with "These matters are explored in some detail in 'Debugging regular expressions' in perldebguts." And now we have your answer . – ThisSuitIsBlackNot Mar 20 '15 at 15:38

axblount ,Mar 20, 2015 at 20:25

Regular expressions define finite state machines 1 . The debugger is more or less showing you how the state machine is progressing as the string is consumed character by character.

"Compiling REx" is the listing of instructions for that regular expression. The number in parenthesis after each instruction is where to go once the step succeeds. In /(.*)\1/ :

1: OPEN1 (3)
3:   STAR (5)
4:     REG_ANY (0)
5: CLOSE1 (7)

STAR (5) means compute STAR and once you succeed, go to instruction 5 CLOSE1 .

"Matching REx" is the step-by-step execution of those instructions. The number on the left is the total number of characters that have been consumed so far. This number can go down if the matcher has to go backwards because something it tried didn't work.

To understand these instructions, it's important to understand how regular expressions "work." Finite state machines are usually visualized as a kind of flow chart. I have produced a crude one below for /(.)\1/ . Because of the back reference to a capture group, I don't believe this regex is a strict finite state machine. The chart is useful none the less.

               Match                           
+-------+     Anything     +----------+        
| Start +------------------+  State 1 |        
+---^---+                  +--+---+---+        
    |                         |   |            
    |                         |   |Matched same
    +-------------------------+   | character  
            matched different     |            
                character    +----+------+     
                             |  Success  |     
                             +-----------+

We start on Start . It's easy to advance to the first state, we just consume any one character ( REG_ANY ). The only other thing that could happen is end of input. I haven't drawn that here. The REG_ANY instruction is wrapped in the capture group instructions. OPEN1 starts recording all matched characters into the first capture group. CLOSE1 stops recording characters to the first capture group.

Once we consume a character, we sit on State 1 and consume the next char. If it matches the previous char we move to success! REF1 is the instruction that attempts to match capture group #1. Otherwise, we failed and need to move back to the Start to try again. Whenever the matcher says "failed..." it's telling you that something didn't work, so it's returning to an earlier state (that may or may not include 'unconsuming' characters).

The example with * is more complicated. * (which corresponds to STAR ) tries to match the given pattern zero or more times, and it is greedy . That means it tries to match as many characters as it possibly can. Starting at the beginning of the string, it says "I can match up to 6 characters!" So, it matches all 6 characters ( "foobar" ), closes the capture group, and tries to match "foobar" again. That doesn't work! It tries again with 5, that doesn't work. And so on, until it tries to matching zero characters. That means the capture group is empty, matching the empty string always succeeds. So the match succeeds with \1 = "" .

I realize I've spent more time explaining regular expressions than I have Perl's regex debugger. But I think its output will become much more clear once you understand how regexes operate.

Here is a finite state machine simulator . You can enter a regex and see it executed. Unfortunately, it doesn't support back references.

1: I believe some of Perl's regular expression features push it beyond this definition but it's still useful to think about them this way.

The debug information contains a description of the bytecode. Numbers denote the node indices in the op tree. Numbers in round brackets tell the engine to jump to a specific node upon a match. The EXACT operator tells the regex engine to look for a literal string. REG_ANY means the . symbol. PLUS means the +. Code 0 is for the 'end' node. OPEN1 is a '(' symbol. CLOSE1 means ')'. STAR is a '*'. When the matcher reaches the end node, it returns a success code back to Perl, indicating that the entire regex has matched.

See more details at http://perldoc.perl.org/perldebguts.html#Debugging-Regular-Expressions and a more conceptual http://perl.plover.com/Rx/paper/

[Nov 30, 2017] Working with character arrays in perl

Nov 30, 2017 | stackoverflow.com

up vote down vote

cbg ,Jul 7, 2014 at 11:36

my @char_array = split "", $s1;

@char_array now contains all the characters of the $s1 string and it's possible to manipulate it, iterate over it or do whatever to it just like with any other array.

You can you splice to insert elements at a given position of the array:
echo -e 'hello\ndisk\ncaller' | perl -F'' -ane ' splice (@F,2,0," "); splice(@F,4,0," "); foreach(@F){print}'
he l lo
di s k
ca l ler

You can use Data::Dumper for better visualization when working with arrays:

echo -n 'hello' | perl -MData::Dumper -F'' -ane ' splice (@F,2,0," "); splice(@F,4,0," ");print Dumper(\@F)'
$VAR1 = [
          'h',
          'e',
          ' ',
          'l',
          ' ',
          'l',
          'o'
        ];

[Nov 30, 2017] regex - How to read perl regular expression debugger - Stack Overflow

Nov 30, 2017 | stackoverflow.com

Wakan Tanka ,Mar 20, 2015 at 13:17

I've come across following materials:
  1. Mastering Perl by brian d foy , chapter: Debugging Regular Expressions.
  2. Debugging regular expressions which mentions re::debug module for perl

I've also try to use various another techniques:

  1. Module re=debugcolor which highlights it's output.
  2. Used following construction ?{print "$1 $2\n"} .

but still did not get the point how to read their output. I've also found another modules used for debugging regular expressions here but I did not tried them yet, can you please explain how to read output of use re 'debug' or another command used for debugging regular expressions in perl?

EDIT in reply to Borodin:

1st example:

perl -Mre=debug -e' "foobar"=~/(.)\1/'
Compiling REx "(.)\1"
Final program:
   1: OPEN1 (3)
   3:   REG_ANY (4)
   4: CLOSE1 (6)
   6: REF1 (8)
   8: END (0)
minlen 1
Matching REx "(.)\1" against "foobar"
   0 <> <foobar>             |  1:OPEN1(3)
   0 <> <foobar>             |  3:REG_ANY(4)
   1 <f> <oobar>             |  4:CLOSE1(6)
   1 <f> <oobar>             |  6:REF1(8)
                                  failed...
   1 <f> <oobar>             |  1:OPEN1(3)
   1 <f> <oobar>             |  3:REG_ANY(4)
   2 <fo> <obar>             |  4:CLOSE1(6)
   2 <fo> <obar>             |  6:REF1(8)
   3 <foo> <bar>             |  8:END(0)
Match successful!
Freeing REx: "(.)\1"
  1. What does OPEN1, REG_ANY, CLOSE1 ... mean ?
  2. What numbers like 1 3 4 6 8 mean?
  3. What does number in braces OPEN1(3) mean?
  4. Which output should I look at, Compiling REx or Matching REx?

2nd example:

 perl -Mre=debugcolor -e' "foobar"=~/(.*)\1/'
Compiling REx "(.*)\1"
Final program:
   1: OPEN1 (3)
   3:   STAR (5)
   4:     REG_ANY (0)
   5: CLOSE1 (7)
   7: REF1 (9)
   9: END (0)
minlen 0
Matching REx "(.*)\1" against "foobar"
   0 <foobar>|  1:OPEN1(3)
   0 <foobar>|  3:STAR(5)
                                  REG_ANY can match 6 times out of 2147483647...
   6 <foobar>|  5:  CLOSE1(7)
   6 <foobar>|  7:  REF1(9)
                                    failed...
   5 <foobar>|  5:  CLOSE1(7)
   5 <foobar>|  7:  REF1(9)
                                    failed...
   4 <foobar>|  5:  CLOSE1(7)
   4 <foobar>|  7:  REF1(9)
                                    failed...
   3 <foobar>|  5:  CLOSE1(7)
   3 <foobar>|  7:  REF1(9)
                                    failed...
   2 <foobar>|  5:  CLOSE1(7)
   2 <foobar>|  7:  REF1(9)
                                    failed...
   1 <foobar>|  5:  CLOSE1(7)
   1 <foobar>|  7:  REF1(9)
                                    failed...
   0 <foobar>|  5:  CLOSE1(7)
   0 <foobar>|  7:  REF1(9)
   0 <foobar>|  9:  END(0)
Match successful!
Freeing REx: "(.*)\1"
  1. Why are numbers descending 6 5 4 3 ... in this example?
  2. What does failed keyword mean?

Borodin ,Mar 20, 2015 at 14:45

Asking how to use a debugger is very broad. Can you show us the pattern that you are trying to debug, and explain what you don't understand? – Borodin Mar 20 '15 at 14:45

ThisSuitIsBlackNot ,Mar 20, 2015 at 15:38

When you run perl -Mre=debug , you're using the re module; you can see the documentation by running perldoc re . The section on "debug mode" is a bit sparse, but ends with "See 'Debugging regular expressions' in perldebug for additional info." perldoc perldebug is similarly short on details, but ends with "These matters are explored in some detail in 'Debugging regular expressions' in perldebguts." And now we have your answer . – ThisSuitIsBlackNot Mar 20 '15 at 15:38
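
As a small sketch (not from the thread), the same trace can be requested from inside a script instead of via -Mre=debug on the command line:

use strict;
use warnings;
use re 'debug';                   # prints the compile-time and run-time trace to STDERR

if ( "foobar" =~ /(.)\1/ ) {      # the "oo" pair satisfies (.)\1
    print "matched: $1$1\n";      # prints "matched: oo"
}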

axblount ,Mar 20, 2015 at 20:25

Regular expressions define finite state machines 1 . The debugger is more or less showing you how the state machine is progressing as the string is consumed character by character.

"Compiling REx" is the listing of instructions for that regular expression. The number in parenthesis after each instruction is where to go once the step succeeds. In /(.*)\1/ :

1: OPEN1 (3)
3:   STAR (5)
4:     REG_ANY (0)
5: CLOSE1 (7)

STAR (5) means compute STAR and once you succeed, go to instruction 5 CLOSE1 .

"Matching REx" is the step-by-step execution of those instructions. The number on the left is the total number of characters that have been consumed so far. This number can go down if the matcher has to go backwards because something it tried didn't work.

To understand these instructions, it's important to understand how regular expressions "work." Finite state machines are usually visualized as a kind of flow chart. I have produced a crude one below for /(.)\1/ . Because of the back reference to a capture group, I don't believe this regex is a strict finite state machine. The chart is useful none the less.

               Match                           
+-------+     Anything     +----------+        
| Start +------------------+  State 1 |        
+---^---+                  +--+---+---+        
    |                         |   |            
    |                         |   |Matched same
    +-------------------------+   | character  
            matched different     |            
                character    +----+------+     
                             |  Success  |     
                             +-----------+

We start on Start . It's easy to advance to the first state, we just consume any one character ( REG_ANY ). The only other thing that could happen is end of input. I haven't drawn that here. The REG_ANY instruction is wrapped in the capture group instructions. OPEN1 starts recording all matched characters into the first capture group. CLOSE1 stops recording characters to the first capture group.

Once we consume a character, we sit on State 1 and consume the next char. If it matches the previous char we move to success! REF1 is the instruction that attempts to match capture group #1. Otherwise, we failed and need to move back to the Start to try again. Whenever the matcher says "failed..." it's telling you that something didn't work, so it's returning to an earlier state (that may or may not include 'unconsuming' characters).

The example with * is more complicated. * (which corresponds to STAR ) tries to match the given pattern zero or more times, and it is greedy . That means it tries to match as many characters as it possibly can. Starting at the beginning of the string, it says "I can match up to 6 characters!" So, it matches all 6 characters ( "foobar" ), closes the capture group, and tries to match "foobar" again. That doesn't work! It tries again with 5; that doesn't work. And so on, until it tries to match zero characters. That means the capture group is empty, and matching the empty string always succeeds. So the match succeeds with \1 = "" .
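
A quick way to confirm that from the command line (assuming a POSIX shell):

perl -le 'print "captured: <$1>" if "foobar" =~ /(.*)\1/'
captured: <>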

I realize I've spent more time explaining regular expressions than I have Perl's regex debugger. But I think its output will become much more clear once you understand how regexes operate.

Here is a finite state machine simulator . You can enter a regex and see it executed. Unfortunately, it doesn't support back references.

1: I believe some of Perl's regular expression features push it beyond this definition but it's still useful to think about them this way.

> ,

The debug information contains a description of the bytecode. Numbers denote the node indices in the op tree. Numbers in round brackets tell the engine to jump to a specific node upon match. The EXACT operator tells the regex engine to look for a literal string. REG_ANY means the . symbol. PLUS means the +. Code 0 is for the 'end' node. OPEN1 is a '(' symbol. CLOSE1 means ')'. STAR is a '*'. When the matcher reaches the end node, it returns a success code back to Perl, indicating that the entire regex has matched.

See more details at http://perldoc.perl.org/perldebguts.html#Debugging-Regular-Expressions and a more conceptual http://perl.plover.com/Rx/paper/

[Nov 29, 2017] How can I have variable assertions in Perl

Notable quotes:
"... Smart::Comments++ When used with the -ENV switch, it's a fantastic tool for this sort of thing. Much better than having to strip all the tests out before going to production, as someone else suggested. ..."
Nov 29, 2017 | stackoverflow.com

Alex ,Jun 21, 2009 at 12:45

How can I check that a variable has a specific value in Perl? Is there a command to stop a script's execution to look up some of its variables?

I wonder if I can use the Pythonic practice of inserting:

    assert 0, (foo, bar)

to debug scripts in a debugger-less way?

Telemachus ,Jun 21, 2009 at 12:58

A quick CPAN search suggests Carp::Assert .

Sinan Ünür ,Jun 21, 2009 at 13:30

+1 for typing in more characters faster than I did. I am going to edit the URL to be version agnostic though. – Sinan Ünür Jun 21 '09 at 13:30

Telemachus ,Jun 21, 2009 at 13:50

I was going to comment on your answer about the photo finish. As for the URL, I constantly forget that, so thanks. – Telemachus Jun 21 '09 at 13:50

Sinan Ünür ,Jun 21, 2009 at 12:58

See Carp::Assert .
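
A minimal Carp::Assert sketch (the variable names and values are just illustrative):

use strict;
use warnings;
use Carp::Assert;                        # exports assert(), affirm() and the DEBUG constant

my ($foo, $bar) = (42, 0);               # illustrative values

assert($foo > 0) if DEBUG;                            # passes silently
assert($bar != 0, 'bar must not be zero') if DEBUG;   # dies via Carp when the condition is false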

zoul ,Jun 21, 2009 at 13:44

Smart::Comments are nice.

RET ,Jun 22, 2009 at 3:28

Smart::Comments++ When used with the -ENV switch, it's a fantastic tool for this sort of thing. Much better than having to strip all the tests out before going to production, as someone else suggested.

RET Jun 22 '09 at 3:28
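
Roughly, the idea looks like this (a sketch from memory of the Smart::Comments interface; check its documentation for the exact directive names and the environment variable honoured by -ENV):

use strict;
use warnings;
use Smart::Comments -ENV;      # with -ENV the comments only take effect when the
                               # Smart_Comments environment variable is set (as I recall)

my $count = get_count();       # get_count() is a made-up subroutine for this sketch

### $count                     # dumps the variable when smart comments are active
### assert: $count >= 0        # dies if the condition is false, does nothing otherwise

sub get_count { return 3 }     # stub so the sketch runs on its own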

nik ,Jun 21, 2009 at 12:54

There is a script at PerlMonks that introduces a fast assert method.

Speed is important since Perl is interpreted and any inline checks will impact performance (unlike simple C macros for example)


I am not sure if these things are going to be directly usable.


OK! This is what I was looking for -- PDF Warning: Test-Tutorial.pdf . Test::Harness is used for writing Perl module tests.

Ape-inago ,Jun 21, 2009 at 13:51

$var_to_check =~ /sometest/ or die "bad variable!";

I tend to throw things like this in my code, and later use a find and replace to get rid of them (in production code).

Also, ' eval ' can be used to run a section of code and capture errors, which lets you build exception-handling functionality. If you are asserting that a value is not 0, perhaps you want to throw an exception and handle that case in a special way?
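
A minimal sketch of that eval-based pattern (the names and values are made up):

use strict;
use warnings;

my $value = 0;                                      # illustrative value

eval {
    die "value must not be zero\n" if $value == 0;  # "throw"
    # ... normal processing here ...
};
if ($@) {                                           # "catch": $@ holds the error message
    warn "caught exception: $@";
    # handle the special case here
}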

> ,

if ( $next_sunrise_time > 24*60*60 ) { warn( "assertion failed" ); } # Assert that the sun must rise in the next 24 hours.

You can do this if you do not have access to Perl 5.9 which is required for Carp::Assert .

[Nov 23, 2017] Simple Module Tutorial

Notable quotes:
"... Quite some years have gone by since the inital post, just want to point out that "use vars" is now considered deprecated, instead, "our" is prefered. ..."
Aug 06, 2001 | perlmonks.com
So you find the Perl docs on modules a bit confusing? OK here is the world's simplest Perl module demonstrating all the salient features of Exporter and a script that uses this module. We also give a short rundown on @INC and finish with a note on using warnings and modules. Here is the module code.

MyModule.pm

package MyModule;
use strict;
use Exporter;
use vars qw($VERSION @ISA @EXPORT @EXPORT_OK %EXPORT_TAGS);

$VERSION     = 1.00;
@ISA         = qw(Exporter);
@EXPORT      = ();
@EXPORT_OK   = qw(func1 func2);
%EXPORT_TAGS = ( DEFAULT => [qw(&func1)],
                 Both    => [qw(&func1 &func2)]);

sub func1 { return reverse @_ }
sub func2 { return map { uc } @_ }

1;

First we get a namespace by declaring a package name. This helps ensure our module's functions and variables remain separate from any script that uses it.

Use strict is a very good idea for modules to restrict the use of global variables. See use strict warnings and diagnostics or die for more details.

We need to use the Exporter module to export our functions from the MyModule:: namespace into the main:: namespace to make them available to scripts that 'use' MyModule.

We pacify strict with the use vars declaration of some variables. We can use an 'our' declaration in 5.6+

We now set a $VERSION number and make Exporter part of MyModule using the @ISA. See perlboot for all the gory details on what @ISA is or just use it as shown.

@EXPORT contains a list of functions that we export by default, in this case nothing. Generally the less you export by default using @EXPORT the better. This avoids accidentally clashing with functions defined in the script using the module. If a script wants a function let it ask.

@EXPORT_OK contains a list of functions that we export on demand so we export &func1 &func2 only if specifically requested to. Use this in preference to just blindly exporting functions via @EXPORT. You can also export variables like $CONFIG provided they are globals not lexicals scoped with my (read declare them with our or use vars).

%EXPORT_TAGS. For convenience we define two sets of export tags. The ':DEFAULT' tag exports only &func1; the ':Both' tag exports both &func1 &func2. This hash stores labels pointing to array references. In this case the arrays are anonymous.

We need the 1; at the end because when a module loads Perl checks to see that the module returns a true value to ensure it loaded OK. You could put any true value at the end (see Code::Police ) but 1 is the convention.

MyScript.pl (A simple script to use MyModule)

#!/usr/bin/perl -w
use strict;

# you may need to set @INC here (see below)

my @list = qw (J u s t ~ A n o t h e r ~ P e r l ~ H a c k e r !);

# case 1
# use MyModule;
# print func1(@list),"\n";
# print func2(@list),"\n";

# case 2
# use MyModule qw(&func1);
# print func1(@list),"\n";
# print MyModule::func2(@list),"\n";

# case 3
# use MyModule qw(:DEFAULT);
# print func1(@list),"\n";
# print func2(@list),"\n";

# case 4
# use MyModule qw(:Both);
# print func1(@list),"\n";
# print func2(@list),"\n";

We use MyModule in MyScript.pl as shown. Uncomment the examples to see what happens. Just uncomment one at a time.

Case 1: Because our module exports nothing by default we get errors, as &func1 and &func2 have not been exported and thus do not exist in the main:: namespace of the script.

Case 2: This works OK. We ask our module to export the &func1 so we can use it. Although &func2 was not exported we reference it with its full package name so this works OK.

Case 3: The ':DEFAULT' tag *should* export &func1 so you might expect the error here to concern a missing &func2. In fact Perl complains about &func1. Hmm, what is going on here? The DEFAULT tag name is special and is automatically set in our module's %EXPORT_TAGS hash like this DEFAULT => \@EXPORT.

Case 4: We specified the export of both our functions with the ':Both' thus this works.

A note on @INC

When you issue a use MyModule; directive perl searches the @INC array for a module with the correct name. @INC usually contains:

/perl/lib 
/perl/site/lib
.

The . directory (dot dir) is the current working directory. CORE modules are installed under perl/lib whereas non-CORE modules install under perl/site/lib. You can add directories to the module search path in @INC like this:

BEGIN { push @INC, '/my/dir' }
# or
BEGIN { unshift @INC, '/my/dir' }
# or
use lib '/my/dir';

We need to use a BEGIN block to shift values into @INC at compile time, as this is when perl checks for modules. If you wait until the script is compiled it is too late and perl will throw an exception saying it can't find MyModule in @INC... The difference between pushing a value and unshifting a value into @INC is that perl searches the @INC array for the module starting with the first dir in that array. Thus if you have a MyModule in /perl/lib/ and another in /perl/site/lib/ and another in ./ the one in /perl/lib will be found first and thus be the one used. The use lib pragma effectively does the same as the BEGIN { unshift @INC, $dir } block - see perlman:lib:lib for full specifics.
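
To see which directories perl will actually search, and which file a given module was really loaded from, something like this works (assuming MyModule can be found at all):

perl -e 'print "$_\n" for @INC'                          # list the search path in order
perl -MMyModule -e 'print $INC{"MyModule.pm"}, "\n"'     # show which file got loaded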

What use Foo::Bar means

use Foo::Bar does not mean look for a module called "Foo::Bar.pm" in the @INC directories. It means search @INC for a *subdir* called "Foo" and a *module* called "Bar.pm".

Now once we have "use'd" a module its functions are available via the fully specified &PACKAGE::FUNCTION syntax. When we say &Foo::Bar::some_func we are referring to the *package name* not the (dir::)file name that we used in the use. This allows you to have many package names in one use'd file. In practice the names are usually the same.
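
A small sketch of that last point, using made-up names: a single file Foo/Bar.pm can hold more than one package.

# Foo/Bar.pm  -- a hypothetical file holding two packages
package Foo::Bar;
use strict;
use warnings;

sub hello { return "hello from Foo::Bar" }

package Foo::Helper;           # a second package living in the same file

sub hello { return "hello from Foo::Helper" }

1;

A script that says use Foo::Bar; can then call both Foo::Bar::hello() and Foo::Helper::hello().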

use Warnings;

You should test your module with warnings enabled as this will pick up many subtle (and not so subtle :-) errors. You can activate warnings using the -w flag in the script you use to test the module. If you add use warnings to the module then your module will require Perl 5.6+ as this was not available before then. If you put $^W++ at the top of the module then you will globally enable warnings - this may break *other modules* a script may be using in addition to your module so is rather antisocial. An expert coder here called tye tests with warnings but does not include them directly in his/her modules.

Hope this explains how it works.

cheers

tachyon

Update

Fixed a typo and added a few comments. Thanks to John M. Dlugosz . Rewrote and restyled tute for compatibility with versions of Perl < 5.6 thanks to crazyinsomniac . Also thanks to tye for reminding me that $^W++ is globally scoped and a bit antisocial for a module.

s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

John M. Dlugosz (Monsignor) on Aug 06, 2001 at 04:30 UTC

Re: Simple Module Tutorial

Very nice, getting everything into a short page like that. But, I have a few comments:

Are you sure you want to make $VERSION a float, rather than a v-string? And if so, illustrate the three-digit convention (e.g. 5.005_001 for version 5.5.1).

I'm also shocked that your pm file doesn't use strict !

I would also suggest adding a comment to the 1; line, saying that this means "loaded OK".

-- John

tachyon (Chancellor) on Aug 06, 2001 at 05:42 UTC

Re: Re: Simple Module Tutorial


by tachyon (Chancellor) on Aug 06, 2001 at 05:42 UTC

Thanks John I've updated the text a bit in line with your suggestions. Forgot the strict in the module! Oops it is back in its rightful place right at the top. I just used the simple $VERSION numbering because this is a simple tute :-) Here is an excerpt from the Exporter manpage for those interested.

Module Version Checking

The Exporter module will convert an attempt to import a number from a module into a call to $module_name->require_version($value). This can be used to validate that the version of the module being used is greater than or equal to the required version. The Exporter module supplies a default require_version method which checks the value of $VERSION in the exporting module. Since the default require_version method treats the $VERSION number as a simple numeric value it will regard version 1.10 as lower than 1.9. For this reason it is strongly recommended that you use numbers with at least two decimal places, e.g., 1.09.

cheers

tachyon

s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

John M. Dlugosz (Monsignor) on Aug 06, 2001 at 08:18 UTC

Re: Re: Re: Simple Module Tutorial
by John M. Dlugosz (Monsignor) on Aug 06, 2001 at 08:18 UTC

Yea, I just posted a tutorial on VERSION.

For compatibility with mixing decimals and v-strings, the built-in UNIVERSAL::require_version uses three decimal digits per part.

If you have $MyModule::VERSION= 1.12; (a decimal number) and do a use MyModule 1.20.1 qw/bar/ , it will tell you that the module is 1.120 and you asked for 1.020, so that's OK. You expected 1.20 to be greater than 1.12, so that is not OK.

-- John

tye (Sage) on Aug 06, 2001 at 22:09 UTC

(tye)Re: Simple Module Tutorial

$^W++ will only give you run-time warnings and will affect other packages. Personally, I don't turn on warnings in modules that I write but I do make a point of testing them with warnings turned on (by putting "#!/usr/bin/perl -w" at the top of my test scripts).

- tye (but my friends call me "Tye")

tachyon (Chancellor) on Aug 06, 2001 at 22:36 UTC

Re: (tye)Re: Simple Module Tutorial


by tachyon (Chancellor) on Aug 06, 2001 at 22:36 UTC

Thanks, I changed it from the lexically scoped 'use warnings;' so that this is applicable to versions < 5.6, but as it adds little value to the tutorial and has the unwanted side effects you point out, I have just deleted it - saves a few lines of dubious value. I'll add a note on testing with warnings when I have a moment.

cheers

tachyon

s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

johnnywang (Priest) on Aug 09, 2004 at 22:58 UTC

Re: Simple Module Tutorial

Quite some years have gone by since the initial post; just want to point out that "use vars" is now considered deprecated and "our" is preferred instead. Another already mentioned point is not to use @EXPORT too much. So my basic module is like the following (I have an emacs function to output this):

package MyModule;

use strict;
use Exporter qw(import);

our $VERSION     = 1.00;
our @ISA         = qw(Exporter);
our @EXPORT_OK   = qw(func1 func2);
our %EXPORT_TAGS = ( DEFAULT => [qw(func1)],
                     Both    => [qw(func1 func2)]);

sub func1 { return reverse @_ }
sub func2 { return map { uc } @_ }

1;

For those emacs users, here's the simple script to generate the skeleton:

(defun perl-new-module ()
  "Generate a skeleton source for a traditional perl module."
  (interactive)
  (setq var (split-string
             (read-from-minibuffer "Enter module name (eg. Web::Test): " nil nil nil nil nil nil)
             " "))
  (setq name (car var))
  (insert (format "package %s;\n\n" name))
  (insert "use strict;\n\n")
  (insert "use Exporter qw(import);\n")
  (insert "our @ISA = qw(Exporter);\n")
  (insert "our @EXPORT_OK = qw();\n")
  (insert "our %EXPORT_TAGS = ();\n")
  (insert "our $VERSION = 1.00; \n\n")
  (insert "\n\n\n\n\n\n")
  (insert "1;")
  (insert "\n")
  (previous-line 6)
  (end-of-line))

adrianh (Chancellor) on Aug 10, 2004 at 00:30 UTC

Re^2: Simple Module Tutorial


by adrianh (Chancellor) on Aug 10, 2004 at 00:30 UTC

Quite some years have gone by since the initial post; just want to point out that "use vars" is now considered deprecated and "our" is preferred instead.

Some people prefer to use our rather than use vars (I'm one of them) - but the latter is not deprecated. Both do slightly different things and many people still prefer to use vars .

beretboy (Chaplain) on Aug 18, 2001 at 15:07 UTC

Re: Simple Module Tutorial

Excellent tutorial ++! I have never understood the writing of modules till now.

"Sanity is the playground of the unimaginative" -Unknown

Codmate (Novice) on Sep 06, 2001 at 20:18 UTC

Re: Re: Simple Module Tutorial


by Codmate (Novice) on Sep 06, 2001 at 20:18 UTC

Fantastic stuff - I just converted a full script into a module (and added some bits) and it worked 1st time - WITH NO DEBUGGING REQUIRED (and yes - I am using strict)!!! I was very scared of modules before but now feel like I could write a hundred. Thanks very much for this - invaluable tutorial for a newbie like me :))))))

Jaap (Curate) on Jul 22, 2002 at 09:35 UTC

Re: Simple Module Tutorial

This is a nice tutorial tachyon. Are you considering writing a more advanced tutorial on modules (combined with OO)?

Especially, what a GOOD module looks like. Should we use carp, dynaloader and what not?

gawatkins (Monsignor) on Apr 10, 2003 at 11:44 UTC

Re: Simple Module Tutorial

Great Tutorial, It helped to clear up the muddy water created by my Perl Black Book .

Thanks again,

Greg W.

twotone (Beadle) on Oct 14, 2007 at 05:06 UTC

Re: Simple Module Tutorial

Great summary of module basics!

Here's a little code I came up with to add my module location to @INC (in a cgi environment) by dynamically determining the document root for the script. It works on the remote apache server and when testing locally in windows. It might be of some interest:

BEGIN {
    # get doc root from %ENV;
    # fall back to a hard-coded root path if it is not found
    my $doc_root = $ENV{DOCUMENT_ROOT} || 'C:/Users/User/Documents/website/sites/mysite';

    # change \ to /
    $doc_root =~ s/\\/\//g;

    # add module folder location
    $doc_root .= "/cgi-bin/cms/";

    # add module location to @INC
    push(@INC, $doc_root);
}

bychan (Initiate) on Jan 28, 2008 at 08:45 UTC

Re^2: Simple Module Tutorial


by bychan (Initiate) on Jan 28, 2008 at 08:45 UTC

This tutorial is great. The only problem is that I get the following result if I uncomment all the cases:

!rekcaH~lreP~rehtonA~tsuJ
JUST~ANOTHER~PERL~HACKER!
!rekcaH~lreP~rehtonA~tsuJ
JUST~ANOTHER~PERL~HACKER!
!rekcaH~lreP~rehtonA~tsuJ
JUST~ANOTHER~PERL~HACKER!
!rekcaH~lreP~rehtonA~tsuJ
JUST~ANOTHER~PERL~HACKER!

Shouldn't there be some error messages or warnings?

chexmix (Hermit) on Aug 12, 2008 at 13:24 UTC

Re: Simple Module Tutorial

I like this post very much, but the following is opaque to me for some reason:

" Case 3: The ':DEFAULT' tag *should* export &func1 so you might expect the error here to concern a missing &func2. In fact Perl complains about &func1. Hmm, what is going on here. The DEFAULT tag name is special and is automatically set in our modules %EXPORT_TAGS hash like this DEFAULT => \@EXPORT."

I confess I still don't know what is going on here, and am wondering if someone can help me out.

The words "is automatically set in our modules %EXPORT_TAGS hash like this DEFAULT => \@EXPORT" seem inaccurate to me, since when I look up at the code for MyModule.pm, I see the line

%EXPORT_TAGS = ( DEFAULT => qw(&func1) ,

and not the line

%EXPORT_TAGS = ( DEFAULT => \@EXPORT,

Is the OP saying that the line as is in the MyModule.pm code is incorrect, because the "special" nature of DEFAULT overrides it with something else?

Thanks. I feel I am blanking on something obvious here, but just can't see it this rainy morning.

tye (Sage) on Aug 14, 2008 at 15:41 UTC

Re^2: Simple Module Tutorial (DEFAULT)


by tye (Sage) on Aug 14, 2008 at 15:41 UTC

Is the OP saying that the line as is in the MyModule.pm code is incorrect, because the "special" nature of DEFAULT overrides it with something else?

Yes. Exporter.pm wants :DEFAULT to match @EXPORT so the module is incorrect in trying to define its own meaning for :DEFAULT. Based on what you've quoted, it appears that Exporter.pm forces this issue, but the more important point is that you shouldn't set $EXPORT_TAGS{DEFAULT} yourself.

- tye
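
In other words, a corrected sketch leaves :DEFAULT alone (Exporter derives it from @EXPORT) and defines only its own tag names:

our @EXPORT      = ();                              # :DEFAULT always mirrors @EXPORT
our @EXPORT_OK   = qw(func1 func2);
our %EXPORT_TAGS = ( Both => [qw(func1 func2)] );   # define only your own tags here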

sg (Pilgrim) on Feb 05, 2011 at 23:13 UTC

Re: Simple Module Tutorial

Thanks for the exposition; my inclination regarding a simple module is as follows:

MyModule.pm

package MyModule;
use strict;
use warnings;
use diagnostics;
use Carp;

our $VERSION = 1.08;

sub see_me {
    my $foo = shift;
    print "\t\tDo you see this: $foo?\n";
}

1;

__END__
last line of the module needs to be true;
last line of the _file_ need not be true:
0;

The above module is exercised by the following script:

exercise_my_module.pl

#!/c/opt/perl/bin/perl
use strict;
use warnings;
use diagnostics;
use Carp;

use MyModule 1.05;
#use MyModule 1.10;    # will fail

MyModule::see_me( 8 );

__END__

chanslor (Acolyte) on Feb 27, 2015 at 19:47 UTC

Re^2: Simple Module Tutorial


by chanslor (Acolyte) on Feb 27, 2015 at 19:47 UTC

Thank you for this post! It has gotten me past the first barrier of writing my own module. Thanks again! - chanslor

Anonymous Monk on Mar 09, 2010 at 12:21 UTC

Re: Simple Module Tutorial

Excellent representation of what seemed a tough nut to swallow. Thank you very much. Tanuj Bhargava

Anonymous Monk on Mar 07, 2011 at 15:12 UTC

Re: Simple Module Tutorial

Thanks to the writer for your trouble. But everyone seems to get it except me. I have tried to duplicate your results.

I have:

- MyScript.pl under /storage/username/PERL_SCRIPTS/dev

- Test.pm under /storage/username/local/perl/site/lib/Test/Test.pm (I just replaced MyModule.pm with Test.pm).

The module code is exactly the same. In MyScript.pl I have added

use lib '/storage/username/local/perl/site/lib';

and typed in the first two cases. Running

perl MyScript.pl

gives: Undefined subroutine &main::func1 called at MyScript.pl line 10

Line 10 is:

print func1(@list),"\n";

after typing "use Test;"

What am I missing here? Also, is the BEGIN command supposed to be used in the Perl script? It gives syntax errors when I try to use it.

Thanks in advance,

Gideon

toolic (Bishop) on Mar 07, 2011 at 15:25 UTC

Re^2: Simple Module Tutorial


by toolic (Bishop) on Mar 07, 2011 at 15:25 UTC

"Test" is a poor choice of a module name because there is a Core module of the same name ( Test ) which is part of the standard Perl distribution. Furthermore, since you placed your .pm file under a directory named "Test", you would need to type use Test::Test; . I strongly recommend you change the name of your module to something more unique in order to avoid this naming collision.

Anonymous Monk on Mar 07, 2011 at 16:11 UTC

Re^3: Simple Module Tutorial
by Anonymous Monk on Mar 07, 2011 at 16:11 UTC

Dear toolic,

Thank you for your reply. I now have changed all instances of "Test" with "MyModule" and also changed the name of the module. MyModule.pm is now situated on

/storage/username/local/perl/site/lib

and I use

use lib '/storage/username/local/perl/site/lib';

with the second case (as per the example):

# case 2
use MyModule;
print func1(@list),"\n";
print MyModule::func2(@list),"\n";

but I still get the same error: Undefined subroutine &main::func1 called at MyScript line 15.

Just to make sure I copied the module exactly from the example but to no avail. Interestingly, when I comment out the print func1 part, the line after that produces correct output. I hope that someone could point out to me where I am at fault.

Best regards,

Gideon

toolic (Bishop) on Mar 07, 2011 at 16:34 UTC

Re^4: Simple Module Tutorial
by toolic (Bishop) on Mar 07, 2011 at 16:34 UTC

Anonymous Monk on Mar 07, 2011 at 16:58 UTC

Re^3: Simple Module Tutorial
by Anonymous Monk on Mar 07, 2011 at 16:58 UTC

Hi toolic

For some reason I can't reply to your latest post, but thanks a mil! I did copy the module exactly, but not the script. Somehow I mixed up case 1 and case 2. I expected case 1 not to work but case 2 to work, yet instead of coding

use MyModule qw(&func1);

I simply used

use MyModule;

Thanks a lot for pointing it out, it seems to work now. I have learned quite a bit.

Best regards,

Gideon

Anonymous Monk on Nov 18, 2014 at 00:13 UTC

Re: Simple Module Tutorial

Thank you for this topic, it is very useful for a beginner. However, I had trouble with the use of the module.

Anonymous Monk on Nov 18, 2014 at 01:06 UTC

Re^2: Simple Module Tutorial


by Anonymous Monk on Nov 18, 2014 at 01:06 UTC

You did not post any code

Anonymous Monk on Dec 16, 2014 at 23:03 UTC

Re^3: Simple Module Tutorial
by Anonymous Monk on Dec 16, 2014 at 23:03 UTC

Hello, I used exactly the same code as quoted at Re: Simple Module Tutorial by johnnywang on Aug 09, 2004 at 22:58 UTC. I don't have trouble with the code, but once I have used the module I can't modify a function in it and see the effects. It looks like the library is not updated. How could we do that?

[Nov 23, 2017] A Perl array 'contains' example by Alvin Alexander

Jun 03, 2016 | alvinalexander.com

Perl array FAQ: How can I test to see if a Perl array already contains a given value? (Also written as, How do I search an array with the Perl grep function?)

I use the Perl grep function to see if a Perl array contains a given entry. For instance, in this Perl code:

if ( grep { $_ eq $clientAddress} @ip_addresses ) {
  # the array already contains this ip address; skip it this time
  next;
} else {
  # the array does not yet contain this ip address; add it
  push @ip_addresses, $clientAddress;
}

I'm testing to see if the Perl array "@ip_addresses" contains an entry given by the variable "$clientAddress".

Just use this Perl array search technique in an "if" clause, as shown, and then add whatever logic you want within your if and else statements. In this case, if the current IP address is not already in the array, I add it to the array in the "else" clause, but of course your logic will be unique.
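
If the same test is needed in several places, the grep can be wrapped in a small helper sub (a sketch; the sub name is made up):

use strict;
use warnings;

# Returns true if $wanted is present in @candidates (string comparison).
sub array_contains {
    my ($wanted, @candidates) = @_;
    return scalar grep { $_ eq $wanted } @candidates;
}

my @ip_addresses = ('192.168.1.100', '10.0.0.5');
print "already known\n" if array_contains('10.0.0.5', @ip_addresses);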

An easier "Perl array contains" example

If it's easier to read without a variable in there, here's another example of this "Perl array contains" code:

if ( grep { $_ eq '192.168.1.100'} @ip_addresses )

If you'd like more details: I didn't realize it, but I have another good example out here in my " Perl grep array tutorial ." (It's pretty bad when you can't find things on your own website.)

[Nov 22, 2017] Turning a chain of if-elsif option checks into a dispatch table

Notable quotes:
"... Comment on possible multiple conflicting options ..."
"... Higher-Order Perl ..."
Nov 22, 2017 | stackoverflow.com


Speeddymon ,2 days ago

I've been reading up on dispatch tables and I get the general idea of how they work, but I'm having some trouble taking what I see online and applying the concept to some code I originally wrote as an ugly mess of if-elsif-else statements.

I have options parsing configured using Getopt::Long , and in turn, those options set a value in the %OPTIONS hash, depending on the option used.

Taking the below code as an example (updated with more detail):

use     5.008008;
use     strict;
use     warnings;
use     File::Basename qw(basename);
use     Getopt::Long qw(HelpMessage VersionMessage :config posix_default require_order no_ignore_case auto_version auto_help);

my $EMPTY       = q{};

sub usage
{
    my $PROG = basename($0);
    print {*STDERR} $_ for @_;
    print {*STDERR} "Try $PROG --help for more information.\n";
    exit(1);
}

sub process_args
{
    my %OPTIONS;

    $OPTIONS{host}              = $EMPTY;
    $OPTIONS{bash}              = 0;
    $OPTIONS{nic}               = 0;
    $OPTIONS{nicName}           = $EMPTY;
    $OPTIONS{console}           = 0;
    $OPTIONS{virtual}           = 0;
    $OPTIONS{cmdb}              = 0;
    $OPTIONS{policyid}          = 0;
    $OPTIONS{showcompliant}     = 0;
    $OPTIONS{backup}            = 0;
    $OPTIONS{backuphistory}     = 0;
    $OPTIONS{page}              = $EMPTY;

    GetOptions
      (
        'host|h=s'              => \$OPTIONS{host}               ,
        'use-bash-script'       => \$OPTIONS{bash}               ,
        'remote-console|r!'     => \$OPTIONS{console}            ,
        'virtual-console|v!'    => \$OPTIONS{virtual}            ,
        'nic|n!'                => \$OPTIONS{nic}                ,
        'nic-name|m=s'          => \$OPTIONS{nicName}            ,
        'cmdb|d!'               => \$OPTIONS{cmdb}               ,
        'policy|p=i'            => \$OPTIONS{policyid}           ,
        'show-compliant|c!'     => \$OPTIONS{showcompliant}      ,
        'backup|b!'             => \$OPTIONS{backup}             ,
        'backup-history|s!'     => \$OPTIONS{backuphistory}      ,
        'page|g=s'              => \$OPTIONS{page}               ,
        'help'                  => sub      { HelpMessage(-exitval => 0, -verbose => 1)     },
        'version'               => sub      { VersionMessage()  },
      ) or usage;

    if ($OPTIONS{host} eq $EMPTY)
    {
        print {*STDERR} "ERROR: Must specify a host with -h flag\n";
        HelpMessage;
    }

    sanity_check_options(\%OPTIONS);

    # Parse anything else on the command line and throw usage
    for (@ARGV)
    {
        warn "Unknown argument: $_\n";
        HelpMessage;
    }

    return {%OPTIONS};
}

sub sanity_check_options
{
    my $OPTIONS     = shift;

    if (($OPTIONS->{console}) and ($OPTIONS->{virtual}))
    {
        print "ERROR: Cannot use flags -r and -v together\n";
        HelpMessage;
    }
    elsif (($OPTIONS->{console}) and ($OPTIONS->{cmdb}))
    {
        print "ERROR: Cannot use flags -r and -d together\n";
        HelpMessage;
    }
    elsif (($OPTIONS->{console}) and ($OPTIONS->{backup}))
    {
        print "ERROR: Cannot use flags -r and -b together\n";
        HelpMessage;
    }
    elsif (($OPTIONS->{console}) and ($OPTIONS->{nic}))
    {
        print "ERROR: Cannot use flags -r and -n together\n";
        HelpMessage;
    }

    if (($OPTIONS->{virtual}) and ($OPTIONS->{backup}))
    {
        print "ERROR: Cannot use flags -v and -b together\n";
        HelpMessage;
    }
    elsif (($OPTIONS->{virtual}) and ($OPTIONS->{cmdb}))
    {
        print "ERROR: Cannot use flags -v and -d together\n";
        HelpMessage;
    }
    elsif (($OPTIONS->{virtual}) and ($OPTIONS->{nic}))
    {
        print "ERROR: Cannot use flags -v and -n together\n";
        HelpMessage;
    }

    if (($OPTIONS->{backup}) and ($OPTIONS->{cmdb}))
    {
        print "ERROR: Cannot use flags -b and -d together\n";
        HelpMessage;
    }
    elsif (($OPTIONS->{backup}) and ($OPTIONS->{nic}))
    {
        print "ERROR: Cannot use flags -b and -n together\n";
        HelpMessage;
    }

    if (($OPTIONS->{nic}) and ($OPTIONS->{cmdb}))
    {
        print "ERROR: Cannot use flags -n and -d together\n";
        HelpMessage;
    }

    if (($OPTIONS->{policyid} != 0) and not ($OPTIONS->{cmdb}))
    {
        print "ERROR: Cannot use flag -p without also specifying -d\n";
        HelpMessage;
    }

    if (($OPTIONS->{showcompliant}) and not ($OPTIONS->{cmdb}))
    {
        print "ERROR: Cannot use flag -c without also specifying -d\n";
        HelpMessage;
    }

    if (($OPTIONS->{backuphistory}) and not ($OPTIONS->{backup}))
    {
        print "ERROR: Cannot use flag -s without also specifying -b\n";
        HelpMessage;
    }

    if (($OPTIONS->{nicName}) and not ($OPTIONS->{nic}))
    {
        print "ERROR: Cannot use flag -m without also specifying -n\n";
        HelpMessage;
    }

    return %{$OPTIONS};
}

I'd like to turn the above code into a dispatch table, but can't figure out how to do it.

Any help is appreciated.

Jim Garrison ,2 days ago

Are the sets of conflicting options always pairs? Can you have situations where options a , b , and c cannot occur together but any two are OK? Before you can pick a representation you need to be sure your model can handle the logic you need in a general way. This is not an easy problem. – Jim Garrison 2 days ago

simbabque ,yesterday

Don't use English, it's horribly slow and makes your code harder to read. – simbabque yesterday

Speeddymon ,yesterday

Removed the English module and changed $ARG / @ARG to $_ / @_ . Added $EMPTY as I forgot I had it defined globally. – Speeddymon yesterday

Speeddymon ,yesterday

@JimGarrison -- you are correct. The if-elsif-else does not explicitly account for 3 options that conflict (though it does account for that implicitly). As an example, using -h is required with all of the other options. But using -h , -r , -v all together is not allowed, while -h , -r , and -d is allowed. – Speeddymon yesterday

ikegami ,yesterday

Since the host must be provided, it should be an argument, not an option. – ikegami yesterday

zdim ,2 days ago

I am not sure how a dispatch table would help since you need to go through pair-wise combinations of specific possibilities, and thus cannot trigger a suitable action by one lookup.

Here is another way to organize it

use List::MoreUtils 'firstval';
use feature 'say';    # the snippet below uses say()

sub sanity_check_options
{
    my ($OPTIONS, $opt_excl) = @_;

    # Check each of 'opt_excl' against all others for conflict
    my @excl = sort keys %$opt_excl;
    while (my $eo = shift @excl) 
    {
        if (my $cfl = firstval { $OPTIONS->{$eo} and $OPTIONS->{$_} } @excl) 
        {
            say "Can't use -$opt_excl->{$eo} and -$opt_excl->{$cfl} together";
            HelpMessage();
            last;
        }
    }

    # Go through specific checks on
    # policyid, showcompliant, backuphistory, and nicName
    ...
    return 1;  # or some measure of whether there were errors
}

# Mutually exclusive options
my %opt_excl = (
    console => 'r', virtual => 'v', cmdb => 'c', backup => 'b', nic => 'n'
); 

sanity_check_options(\%OPTIONS, \%opt_excl);

This checks all options listed in %opt_excl against each other for conflict, removing the segments of elsif involving the (five) options that are mutually exclusive. It uses List::MoreUtils::firstval . The few other specific invocations are best checked one by one.

There is no use of returning $OPTIONS since it is passed as reference so any changes apply to the original structure (while it's not meant to be changed either). Perhaps you can keep track of whether there were errors and return that if it can be used in the caller, or just return 1 .

This addresses the long elsif chain as asked, and doesn't go into the rest of code. Here is one comment though: There is no need for {%OPTIONS} , which copies the hash in order to create an anonymous one; just use return \%OPTIONS;
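A minimal sketch of that last point (the values here are made up): {%OPTIONS} builds a brand-new anonymous hash by copying, while \%OPTIONS just takes a reference to the existing one.

my %OPTIONS = ( host => 'example', backup => 1 );

my $copy = { %OPTIONS };   # flattens %OPTIONS into a list and builds a new anonymous hash
my $ref  = \%OPTIONS;      # a reference to the same hash; nothing is copied

$ref->{backup}  = 0;       # changes %OPTIONS
$copy->{backup} = 0;       # leaves %OPTIONS untouched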


Comment on possible multiple conflicting options

This answer as it stands does not print all conflicting options that have been used if there are more than two, as raised by ikegami in comments; it does catch any conflicts so that the run is aborted.

The code is readily adjusted for this: instead of stopping at the first clash found inside the if block, collect every flag from the exclusive group that is set and report them all, as in the sketch below.
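One possible sketch of such an adjustment (not part of zdim's original answer), using the same $OPTIONS and %$opt_excl structures as above:

# Collect every flag from the mutually exclusive group that is actually set,
# then report them all at once instead of stopping at the first clash.
my @set = grep { $OPTIONS->{$_} } sort keys %$opt_excl;
if (@set > 1) {
    say 'Conflicting flags: ', join ', ', map { "-$opt_excl->{$_}" } @set;
    HelpMessage();
}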

However, one is expected to know of allowed invocations and any listing of conflicts is a courtesy to the forgetful user (or a debugging aid); a usage message is printed as well anyway.

Given the high number of conflicting options the usage message should contain a prominent note on this. Also consider that so many conflicting options may indicate a design flaw.

Finally, this code fully relies on the fact that this processing goes once per run and operates with a handful of options; thus it is not concerned with efficiency and freely uses ancillary data structures.

Speeddymon ,yesterday

Updated the question to clarify. – Speeddymon yesterday

zdim ,yesterday

@Speeddymon Thank you, updated. This brings together the checks of those five options which can't go with one another. The remaining few I leave to be checked one by one; "encoding" one or two possibilities in some all-encompassing system would just increase complexity (and may end up less readable). – zdim yesterday

zdim ,yesterday

@Speeddymon Added the missing include, use List::MoreUtils 'firstval' . Edited a little in the meanwhile, as well. – zdim yesterday

Speeddymon ,yesterday

Thank you for the easy to follow example. I went with yours as it was the clearest and contained the least duplicate code. – Speeddymon yesterday

ikegami ,23 hours ago

@Speeddymon, Apparently it's not as clear as you think, since you didn't realize it doesn't work. It doesn't mention the error of using -r and -c together if -b is also provided. And why is a hash being used at all? Wasteful and needlessly complex. – ikegami 23 hours ago

simbabque ,yesterday

You can use a dispatch table if there are a lot of options. I would build that table programmatically. It might not be the best option here, but it works and the configuration is more readable than your elsif construct.
use strict;
use warnings;
use Ref::Util::XS 'is_arrayref';    # or Ref::Util

sub create_key {
    my $input = shift;

    # this would come from somewhere else, probably the Getopt config
    my @opts = qw( host bash nic nicName console virtual cmdb
        policyid showcompliant backup backuphistory page );

    # this is to cover the configuration with easier syntax
    $input = { map { $_ => 1 } @{$input} }
        if is_arrayref($input);

    # options are always prefilled with false values
    return join q{}, map { $input->{$_} ? 1 : 0 }
        sort @opts;
}

my %forbidden_combinations = (
    map { create_key( $_->[0] ) => $_->[1] } (
        [ [qw( console virtual )] => q{Cannot use flags -r and -v together} ],
        [ [qw( console cmdb )]    => q{Cannot use flags -r and -d together} ],
        [ [qw( console backup )]  => q{Cannot use flags -r and -b together} ],
        [ [qw( console nic )]     => q{Cannot use flags -r and -n together} ],
    )
);

p %forbidden_combinations; # from Data::Printer

The output of the p function is the dispatch table.

{
    00101   "Cannot use flags -r and -v together",
    00110   "Cannot use flags -r and -n together",
    01100   "Cannot use flags -r and -d together",
    10100   "Cannot use flags -r and -b together"
}

As you can see, we've sorted all the options ascii-betically to use them as keys. That way, you could in theory build all kinds of combinations like exclusive options.

Let's take a look at the configuration itself.

my %forbidden_combinations = (
    map { create_key( $_->[0] ) => $_->[1] } (
        [ [qw( console virtual )] => q{Cannot use flags -r and -v together} ],
        # ...
    )
);

We use a list of array references. Each entry is on one line and contains two pieces of information. Using the fat comma => makes it easy to read. The first part, which is much like a key in a hash, is the combination. It's a list of fields that should not occur together. The second element in the array ref is the error message. I've removed all the recurring elements, like the newline, to make it easier to change how and where the error can be displayed.

The map around this list of combination configuration runs the options through our create_key function, which translates it to a simple bitmap-style string. We assign all of it to a hash of that map and the error message.

Inside create_key , we check if it was called with an array reference as its argument. If that's the case, the call was for building the table, and we convert it to a hash reference so we have a proper map to look stuff up in. We know that %OPTIONS always contains all the keys that exist, and that those are pre-filled with values that all evaluate to false . We can harness that to convert the truthiness of those values to 1 or 0 , which then builds our key.

We will see in a moment why that is useful.

Now how do we use this?

sub HelpMessage { exit; }; # as a placeholder

# set up OPTIONS
my %OPTIONS = (
    host          => q{},
    bash          => 0,
    nic           => 0,
    nicName       => q{},
    console       => 0,
    virtual       => 0,
    cmdb          => 0,
    policyid      => 0,
    showcompliant => 0,
    backup        => 0,
    backuphistory => 0,
    page          => q{},
);

# read options with Getopt::Long ...
$OPTIONS{console} = $OPTIONS{virtual} = 1;

# ... and check for wrong invocations
if ( exists $forbidden_combinations{ my $key = create_key($OPTIONS) } ) {
    warn "ERROR: $forbidden_combinations{$key}\n";
    HelpMessage;
}

All we need to do now is get the $OPTIONS hash reference from Getopt::Long, and pass it through our create_key function to turn it into the map string. Then we can simply see if that key exists in our %forbidden_combinations dispatch table and show the corresponding error message.


Advantages of this approach

If you want to add more parameters, all you need to do is include them in @opts . In a full implementation that would probably be auto-generated from the config for the Getopt call. The keys will change under the hood, but since that is abstracted away you don't have to care.

Furthermore, this is easy to read. The create_key aside, the actual dispatch table syntax is quite concise and even has documentary character.

Disadvantages of this approach

There is a lot of programmatic generation going on for just a single call. It's certainly not the most efficient way to do it.


To take this further, you can write functions that auto-generate entries for certain scenarios.

I suggest you take a look at the second chapter in Mark Jason Dominus' excellent book Higher-Order Perl , which is available for free as a PDF.

Speeddymon ,yesterday

Thank you for the detailed answer. I've updated the question to help clarify how the $OPTIONS hash is setup. Can your example work within the bounds of what I have already, or should I rewrite the whole thing from scratch? – Speeddymon yesterday

simbabque ,yesterday

@Speeddymon yeah, that should work. I see you've got %OPTIONS , and it is always pre-set with values. That's going to be interesting. Let me try. – simbabque yesterday

Speeddymon ,yesterday

Speaking of the HOP book... That was actually what I was using to try to learn and where I was having trouble in applying the concept to my code. :-) I couldn't find a PDF version before, so thank you for the link! – Speeddymon yesterday

simbabque ,yesterday

@Speeddymon I've updated the answer and changed it to match your updated code. I suggest you read the diff first. What I don't like about it yet is that the possible keys are there twice, but that can be solved with some more trickery. I think that would blow up the answer even more, so I didn't do that. – simbabque yesterday

ikegami ,23 hours ago

Doesn't detect the case when -r , -v and -b are provided as an error. – ikegami 23 hours ago

ikegami ,2 days ago

You shouldn't be using elsif here because multiple conditions could be true. And since multiple conditions can be true, a dispatch table can't be used. Your code can still be simplified greatly.
my @errors;

push @errors, "ERROR: Host must be provided\n"
   if !defined($OPTIONS{host});

my @conflicting =
   map { my ($opt, $flag) = @$_; $OPTIONS->{$opt} ? $flag : () }
      [ 'console', '-r' ],
      [ 'virtual', '-v' ],
      [ 'cmdb',    '-d' ],
      [ 'backup',  '-b' ],
      [ 'nic',     '-n' ];

push @errors, "ERROR: Can only use one the following flags at a time: @conflicting\n"
   if @conflicting > 1;

push @errors, "ERROR: Can't use flag -p without also specifying -d\n"
   if defined($OPTIONS->{policyid}) && !$OPTIONS->{cmdb};

push @errors, "ERROR: Can't use flag -c without also specifying -d\n"
   if $OPTIONS->{showcompliant} && !$OPTIONS->{cmdb};

push @errors, "ERROR: Can't use flag -s without also specifying -b\n"
   if $OPTIONS->{backuphistory} && !$OPTIONS->{backup};

push @errors, "ERROR: Can't use flag -m without also specifying -n\n"
   if defined($OPTIONS->{nicName}) && !$OPTIONS->{nic};

push @errors, "ERROR: Incorrect number of arguments\n"
   if @ARGV;

usage(@errors) if @errors;

Note that the above fixes numerous errors in your code.


Help vs Usage Error

Help requested with --help and a usage error are two different situations: help output belongs on STDOUT with an exit status of 0, while a usage error message belongs on STDERR with a non-zero exit status. Calling HelpMessage indiscriminately in both situations is therefore incorrect.

Create the following sub named usage to use (without arguments) when GetOptions returns false, and with an error message when some other usage error occurs:

use File::Basename qw( basename );

sub usage {
   my $prog = basename($0);
   print STDERR $_ for @_;
   print STDERR "Try '$prog --help' for more information.\n";
   exit(1);
}

Keep using HelpMessage in response to --help , but the defaults for the arguments are not appropriate for --help . You should use the following:

'help' => sub { HelpMessage( -exitval => 0, -verbose => 1 ) },

Speeddymon ,yesterday

I wondered if it would be impossible because of multiple conditions being true, but based on other answers, it seems that it is possible to still build a table and compare... – Speeddymon yesterday

ikegami ,yesterday

What are you talking about? No answer used a dispatch table. All the answers (including mine) used a ( for or map ) loop that performs as many checks as there are conditions. The point of a dispatch table is to do a single check no matter how many conditions there are. Since all conditions can be true, you need to check all conditions, so a dispatch table is impossible by definition. (And that's without even mentioning that the value of a dispatch table should be a code reference or similar (something to dispatch to).) – ikegami yesterday

ikegami ,yesterday

The difference between mine and the others is that mine avoids using an inefficient unordered hash and uses an efficient ordered list instead. (You could place the list in an array if you prefer.) – ikegami yesterday

ikegami ,yesterday

Updated to match the updated question. The fact that none of the other answers can be extended for your updated question proves my point that trying to put everything into one loop or table just makes things less flexible, longer and more complex. – ikegami yesterday

Speeddymon ,yesterday

In response to the "help" tip -- HelpMessage is defined by GetOpt::Long and reads from the PODs at the end of the file. – Speeddymon yesterday

[Nov 22, 2017] Bitwise operators supported by Perl language

Nov 18, 2017 | www.tutorialspoint.com
Bitwise operators work on bits and perform bit-by-bit operations. Assume $a = 60 and $b = 13; in binary they look as follows:

$a      = 0011 1100
$b      = 0000 1101
-------------------
$a & $b = 0000 1100
$a | $b = 0011 1101
$a ^ $b = 0011 0001
~$a     = 1100 0011

Perl supports the following bitwise operators (again assuming $a = 60 and $b = 13):

1. &   Binary AND copies a bit to the result if it is set in both operands.
       Example: ($a & $b) gives 12, which is 0000 1100

2. |   Binary OR copies a bit if it is set in either operand.
       Example: ($a | $b) gives 61, which is 0011 1101

3. ^   Binary XOR copies a bit if it is set in one operand but not in both.
       Example: ($a ^ $b) gives 49, which is 0011 0001

4. ~   Binary ones' complement is unary and has the effect of 'flipping' bits.
       Example: (~$a) gives -61, which is 1100 0011 in two's complement form for a signed binary number.

5. <<  Binary left shift. The left operand's value is moved left by the number of bits specified by the right operand.
       Example: $a << 2 gives 240, which is 1111 0000

6. >>  Binary right shift. The left operand's value is moved right by the number of bits specified by the right operand.
       Example: $a >> 2 gives 15, which is 0000 1111

Example

Try the following example to understand all the bitwise operators available in Perl. Copy and paste the following Perl program into a test.pl file and execute it.

#!/usr/local/bin/perl

use integer;
 
$a = 60;
$b = 13;

print "Value of \$a = $a and value of \$b = $b\n";

$c = $a & $b;
print "Value of \$a & \$b = $c\n";

$c = $a | $b;
print "Value of \$a | \$b = $c\n";

$c = $a ^ $b;
print "Value of \$a ^ \$b = $c\n";

$c = ~$a;
print "Value of ~\$a = $c\n";

$c = $a << 2;
print "Value of \$a << 2 = $c\n";

$c = $a >> 2;
print "Value of \$a >> 2 = $c\n";

When the above code is executed, it produces the following result −

Value of $a = 60 and value of $b = 13
Value of $a & $b = 12
Value of $a | $b = 61
Value of $a ^ $b = 49
Value of ~$a = -61
Value of $a << 2 = 240
Value of $a >> 2 = 15
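A small application sketch that is not part of the tutorial text above: the same operators are commonly used to pack several boolean flags into a single integer and test them individually.

use strict;
use warnings;

# each flag occupies its own bit
my $READ  = 0b100;
my $WRITE = 0b010;
my $EXEC  = 0b001;

my $perms = $READ | $EXEC;                # set two flags: 0b101

print "readable\n"   if $perms & $READ;   # prints
print "writable\n"   if $perms & $WRITE;  # does not print
print "executable\n" if $perms & $EXEC;   # prints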

[Nov 22, 2017] perl - How can I also get an element's index when I grep through an array - Stack Overflow

Nov 22, 2017 | stackoverflow.com


Geo ,Jun 10, 2010 at 16:39

Let's say I have this list:
my @list = qw(one two three four five);

and I want to grab all the elements containing o . I'd have this:

my @containing_o = grep { /o/ } @list;

But what would I have to do to also receive an index, or to be able to access the index in grep 's body?


my @index_containing_o = grep { $list[$_] =~ /o/ } 0..$#list;  # ==> (0,1,3)

my %hash_of_containing_o = map { $list[$_]=~/o/?($list[$_]=>$_):() } 0..$#list
            # ==> ( 'one' => 0, 'two' => 1, 'four' => 3 )
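For what it's worth (this is not part of the answer above), List::MoreUtils ships an indexes function that performs the same index-returning grep directly:

use List::MoreUtils qw(indexes);

my @list = qw(one two three four five);
my @idx  = indexes { /o/ } @list;   # (0, 1, 3)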

[Nov 22, 2017] Perl grep array FAQ - How to search an array-list of strings alvinalexander.com

Nov 22, 2017 | alvinalexander.com

Perl grep array FAQ - How to search an array/list of strings
By Alvin Alexander. Last updated: June 3, 2016

Perl "grep array" FAQ: Can you demonstrate a Perl grep array example? (Related: Can you demonstrate how to search a Perl array?)

A very cool thing about Perl is that you can search lists (arrays) with the Perl grep function. This makes it very easy to find things in large lists -- without having to write your own Perl for/foreach loops.

A simple Perl grep array example (Perl array search)

Here's a simple Perl array grep example. First I create a small string array (pizza toppings), and then search the Perl array for the string "pepper":

# create a perl list/array of strings
@pizzas = qw(cheese pepperoni veggie sausage spinach garlic);

# use the perl grep function to search the @pizzas list for the string "pepper"
@results = grep /pepper/, @pizzas;

# print the results
print "@results\n";

As you might guess from looking at the code, my @results Perl array prints the following output:

pepperoni
Perl grep array - case-insensitive searching

If you're familiar with Perl regular expressions, you might also guess that it's very easy to make this Perl array search example case-insensitive using the standard i operator at the end of my search string.

Here's what our Perl grep array example looks like with this change:

@results = grep /pepper/i, @pizzas;
Perl grep array and regular expressions (regex)

You can also use more complex Perl regular expressions (regex) in your array search. For instance, if for some reason you wanted to find all strings in your array that contain at least eight consecutive word characters, you could use this search pattern:

@results = grep /\w{8}/, @pizzas;

That example results in the following output:

pepperoni
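One related detail the article does not mention: in scalar context grep returns the number of matching elements rather than the matches themselves, which is handy for counting.

my @pizzas = qw(cheese pepperoni veggie sausage spinach garlic);
my $count  = grep /pepper/i, @pizzas;   # 1
print "$count match(es)\n";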
Perl grep array - Summary

I hope this Perl grep array example (Perl array search example) has been helpful. For related Perl examples, see the Related block on this web page, or use the search form on this website. If you have any questions, or better yet, more Perl array search examples, feel free to use the Comments section below.

[Nov 22, 2017] Perl Searching for item in an Array - Stack Overflow

Nov 22, 2017 | stackoverflow.com

Perl: Searching for item in an Array

Majic Johnson ,Apr 20, 2012 at 4:53

Given an array @A we want to check if the element $B is in it. One way is to say this:
foreach $element (@A){
    if($element eq $B){
        print "$B is in array A";
    }
}

However, when it comes to Perl, I am always thinking about the most elegant way. And this is what I am thinking: Is there a way to find out if array A contains B if we convert A to a variable string and use

index(@A,$B)=>0

Is that possible?

cHao ,Apr 20, 2012 at 4:55

grep { $_ eq $B } @A ? – cHao Apr 20 '12 at 4:55

daxim ,Apr 20, 2012 at 7:06

Related: stackoverflow.com/questions/7898499/ , stackoverflow.com/questions/3086874/ – daxim Apr 20 '12 at 7:06

Nikhil Jain ,Apr 20, 2012 at 5:49

There are many ways to find out whether the element is present in the array or not:
  1. Using foreach
    foreach my $element (@a) {
        if($element eq $b) {
           # do something             
           last;
        }
    }
    
  2. Using Grep:
    my $found = grep { $_ eq $b } @a;
    
  3. Using List::Util module
    use List::Util qw(first); 
    
    my $found = first { $_ eq $b } @a;
    
  4. Using Hash initialised by a Slice
    my %check;
    @check{@a} = ();
    
    my $found = exists $check{$b};
    
  5. Using Hash initialised by map
    my %check = map { $_ => 1 } @a;
    
    my $found = $check{$b};
    

pilcrow ,May 2, 2012 at 19:56

The List::Util::first() example is (potentially) subtly incorrect when searching for false values, since $found will also evaluate false. ( die unless $found ... oops!) List::MoreUtils::any does the right thing here. – pilcrow May 2 '12 at 19:56
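A short sketch of pilcrow's point, with made-up data: List::MoreUtils::any returns a plain boolean, so matching a false value such as 0 or the empty string is still reported correctly, and it stops at the first match.

use strict;
use warnings;
use List::MoreUtils qw(any);

my @a = (0, 1, 2);
my $b = 0;                                # a false value we are searching for

print "found\n" if any { $_ eq $b } @a;   # prints "found"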

yazu ,Apr 20, 2012 at 4:56

use 5.10.1;

$B ~~ @A and say '$B in @A';

brian d foy ,Apr 20, 2012 at 13:07

You have to be very careful with this because this distributes the match over the elements. If @A has an array reference element that contains $B, this will still match even though $B isn't a top level element of @A. The smart match is fundamentally broken for this and many other reasons. – brian d foy Apr 20 '12 at 13:07

obmib ,Apr 20, 2012 at 5:51

use List::AllUtils qw/ any /;
print "\@A contains $B" if any { $B eq $_ } @A;

bvr ,Apr 20, 2012 at 7:43

I would recommend first in this case, as it does not have to traverse whole array. It can stop when item is found. – bvr Apr 20 '12 at 7:43

brian d foy ,Apr 20, 2012 at 13:10

any can stop too because it needs only one element to be true. – brian d foy Apr 20 '12 at 13:10

pilcrow ,May 3, 2012 at 1:38

Beware that first can also return a false value if it finds, e.g., "0", which would confound the example given in this answer. any has the desired semantics. – pilcrow May 3 '12 at 1:38

[Nov 22, 2017] perl modules

Nov 17, 2017 | perlmonks.com

Discipulus (Monsignor) on Nov 16, 2017 at 09:04 UTC

Re: perl modules

Hello codestroman and welcome to the monastery and to the wonderful world of Perl!

First of all, please, add <c> code tags </c> around your code and output.

Then be sure to have read the standard documentation: perlmod and perlnewmod

In fact a basic Perl module defines a package and uses Exporter to export functions into the Perl program that uses it.

In my home node I've collected a lot of links about module creation.

L*

Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.

thanos1983 (Priest) on Nov 16, 2017 at 09:17 UTC

Re: perl modules

Hello codestroman

Just to add a minor suggestion here, to the full cover reply of fellow monk Discipulus . It will assist you a lot also to read Simple Module Tutorial

Update: Direct answer to your question can be found here How to add a relative directory to @INC with multiple possible solutions. I would strongly recommend to go through all the articles that all monks proposed.

Hope this helps, BR.

Seeking for Perl wisdom...on the process of learning...not there...yet!

hippo (Abbot) on Nov 16, 2017 at 09:21 UTC

Re: perl modules (Can't locate in @INC)
PLEASE HELP!!

This is a monastery - a place of quiet contemplation. The louder you shout the less wisdom shall you receive.

The error message Can't locate dog.pm in @INC is pretty explicit. Either your module file is not called dog.pm in which case, change it or else your file dog.pm is not in any of the directories listed in @INC in which case either move it to one of those directories or else change @INC with use lib .

I also see, despite the lack of formatting in your post, that your module doesn't use any namespace. You should probably address that. Perhaps a solid read through Simple Module Tutorial would be a good idea?
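A minimal sketch combining both suggestions (the file and package names here are hypothetical): keep Dog.pm in a lib directory next to the script, give it a proper package name, and add that directory to @INC with use lib before loading it.

#!/usr/bin/perl
use strict;
use warnings;

use FindBin qw($Bin);    # directory holding the current script
use lib "$Bin/lib";      # prepend ./lib (relative to the script) to @INC

use Dog;                 # loads ./lib/Dog.pm, which starts with "package Dog;"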

Anonymous Monk on Nov 16, 2017 at 09:07 UTC

Re: perl modules

use an absolute pathname in use lib

Anonymous Monk on Nov 16, 2017 at 15:16 UTC

Re: perl modules

Welcome to the language ... and, to the Monastery. The "simple module tutorial" listed above is a very good place to start. Like all languages of its kind, Perl looks at runtime for external modules in a prescribed list of places, in a specified order. You can affect this in several ways, as the tutorials describe. Please read them carefully.

In the Perl(-5) language, this list is stored in a pre-defined array variable called @INC and it is populated from a variety of sources: a base-list that is compiled directly into the Perl interpreter, the PERL5LIB environment-variable, use lib statements, and even direct modification of the variable itself. Perl searches this list from beginning to end and processes (only) the first matching file that it finds.
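For example (these commands are illustrative, not from the post), you can inspect the search path and extend it from the environment:

# print the directories Perl will search, one per line
perl -le 'print for @INC'

# prepend a directory for a single run (hypothetical path and script name)
PERL5LIB=/opt/mylibs perl myscript.pl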

(Note that, in Perl, the use statement is actually a pragma, or declaration to the compiler, and as such it has many "uses" and a somewhat complicated syntax.)

Corion (Pope) on Nov 16, 2017 at 15:23 UTC

Re^2: perl modules



(Note that, in Perl, the use statement is actually a pragma, or declaration to the compiler, and as such it has many "uses" and a somewhat complicated syntax.)

Please no.

The word "pragma" has a special meaning in Perl, and it is highly confusing to claim that a Perl "keyword" would be a "pragma". use certainly is a keyword and nothing else.

If you mean to say something different, please describe in more words what you want to say.
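
Setting the terminology aside, here is a minimal sketch of the usual ways to influence @INC mentioned in this thread (the paths and the Dog module are hypothetical):

# 1. On the command line (prepended to @INC):
#      perl -I/opt/myapp/lib script.pl
# 2. Via the environment:
#      export PERL5LIB=/opt/myapp/lib
# 3. Inside the script:
use lib '/opt/myapp/lib';        # absolute path, as suggested above
use FindBin qw($Bin);
use lib "$Bin/../lib";           # relative to the script's own location
# 4. Direct modification (must happen before the use statement is compiled):
BEGIN { unshift @INC, '/opt/myapp/lib'; }
use Dog;                         # found if /opt/myapp/lib/Dog.pm exists and declares package Dog;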

[Nov 18, 2017] Using the built-in debugger of Perl as REPL by Gabor Szabo

YouTube video; mainly explains how to use the x command in the Perl debugger.
Nov 18, 2017 | www.youtube.com

The command line debugger that comes with perl is very powerful.
Not only does it allow us to debug scripts, but it can also be used as a REPL - a Read-Eval-Print Loop - to explore the capabilities of the language. There are a few basic examples in this screencast.

http://perlmaven.com/using-the-built-...

To see all the Perl tutorials visit http://perlmaven.com/perl-tutorial

About Perl Programming and Perl programmers.

In this screencast:

perl -d -e 1

p - print scalar
x - print data structure
b subname - set breakpoint
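
A short illustrative session (a sketch; the hex addresses and the subroutine name mysub are placeholders, and the exact output formatting varies between Perl versions):

$ perl -d -e 1

  DB<1> p 2**10
1024
  DB<2> x [ 'a', { b => 3 } ]
0  ARRAY(0x...)
   0  'a'
   1  HASH(0x...)
      'b' => 3
  DB<3> b mysub
  DB<4> q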

[Nov 17, 2017] Why keys applied to a hash reference adds the key to the hash if it does not exist

Nov 17, 2017 | perlmonks.com

nikmit has asked for the wisdom of the Perl Monks concerning the following question:

Dear monks,

I came across this behaviour in Perl which I find unintuitive, and was wondering what the use-case scenario for it is, or whether I have done something wrong to bring it about...

I had a statement checking for the existence of data, like so: return 0 unless keys %{$hashref->{$key}} , and I failed to realise that $key may not always exist.

I would have expected to see an error if $href->{$key} is undefined and therefore not a reference, but instead $key was just added to the hash.

Example:

#!/usr/bin/perl -w
# perl-5.22.3
use strict;
my $href = { cat => {milk => 1}, dog => {bone => 1} };
if (keys %{$href->{cow}}) {
    print "noop\n";
} else {
    if (exists $href->{cow}) {
        print "holy cow\n";
    } else {
        print "no cow\n";
    }
}

This prints 'holy cow'

Discipulus (Monsignor) on Nov 17, 2017 at 09:32 UTC

Re: keys %{$hash->{$href}} adds $href to the hash if it doesnt exist?

Hello nikmit ,

Isn't it a case of autovivification?

It is explained in perlref , and for more information see Explaining Autovivication , Autovivification in perl , and https://perlmaven.com/autovivification

On CPAN there is a pragma to disable it if unwanted.

PS exists fixes your snippet: if (exists $href->{cow} and keys %{$href->{cow}}) { # no cow

L*


Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.

1nickt (Prior) on Nov 17, 2017 at 13:54 UTC

Re^2: keys %{$hash->{$href}} adds $href to the hash if it doesnt exist?


by 1nickt (Prior) on Nov 17, 2017 at 13:54 UTC

Care must be used with exists as it will indeed autovivify intermediate hashes:

use strict; use warnings;
use feature 'say';
use Data::Dumper; $Data::Dumper::Sortkeys = $Data::Dumper::Indent = 1;
my $href = {
    cat => { milk => 1 },
    dog => { bone => 1 },
};
say exists $href->{'cow'}->{'alfalfa'} ? 'cow' : 'no cow';
say Dumper $href;
__END__

Output:

no cow
$VAR1 = {
  'cat' => {
    'milk' => 1
  },
  'cow' => {},   # uh-oh
  'dog' => {
    'bone' => 1
  }
};

So you would have to either use exists on all levels of the structure as haukex suggested:

use strict; use warnings;
use feature 'say';
use Data::Dumper; $Data::Dumper::Sortkeys = $Data::Dumper::Indent = 1;
my $href = {
    cat => { milk => 1 },
    dog => { bone => 1 },
};
say exists $href->{'cow'} && exists $href->{'cow'}->{'alfalfa'} ? 'cow' : 'no cow';
say Dumper $href;
__END__

Output:

no cow
$VAR1 = {
  'cat' => {
    'milk' => 1
  },
  'dog' => {
    'bone' => 1
  }
};

... or use autovivification :

use strict; use warnings;
use feature 'say';
use Data::Dumper; $Data::Dumper::Sortkeys = $Data::Dumper::Indent = 1;
my $href = {
    cat => { milk => 1 },
    dog => { bone => 1 },
};
no autovivification;
say exists $href->{'cow'}->{'alfalfa'} ? 'cow' : 'no cow';
say Dumper $href;
__END__

Output:

no cow
$VAR1 = {
  'cat' => {
    'milk' => 1
  },
  'dog' => {
    'bone' => 1
  }
};

Note that autovivification.pm has effect lexically:

use strict; use warnings;
use feature 'say';
use Data::Dumper; $Data::Dumper::Sortkeys = $Data::Dumper::Indent = 1;
my $href = {
    cat => { milk => 1 },
    dog => { bone => 1 },
};
{
    no autovivification;
    say exists $href->{'cow'}->{'alfalfa'} ? 'cow' : 'no cow';
    say Dumper $href;
}
say exists $href->{'cow'}->{'alfalfa'} ? 'cow' : 'still no cow';
say Dumper $href;
__END__

no cow
$VAR1 = {
  'cat' => {
    'milk' => 1
  },
  'dog' => {
    'bone' => 1
  }
};
still no cow
$VAR1 = {
  'cat' => {
    'milk' => 1
  },
  'cow' => {},   # uh-oh
  'dog' => {
    'bone' => 1
  }
};
The way forward always starts with a minimal test.

nikmit (Sexton) on Nov 17, 2017 at 10:15 UTC

Re^2: keys %{$hash->{$href}} adds $href to the hash if it doesnt exist?


by nikmit (Sexton) on Nov 17, 2017 at 10:15 UTC

Thanks - no autovivification will become a permanent presence for me, next to use strict .

haukex (Monsignor) on Nov 17, 2017 at 09:42 UTC

Re: keys %{$hash->{$href}} adds $href to the hash if it doesnt exist?

This is "autovivification" and was just discussed the other day, see the replies in the thread Array dereference in foreach() , including the ones deeper down in the thread.

Use exists to check if a hash key exists. As described in its documentation, if you have multi-level data structures (hashes of hashes), you need to check every level. Update: Discipulus just updated to show an example.

Eily (Parson) on Nov 17, 2017 at 10:01 UTC

Re: keys %{$hash->{$href}} adds $href to the hash if it doesnt exist?

FYI, while keys %hash returns the number of keys in scalar context, you can also use the hash itself: the value will be false if the hash is empty and true otherwise (actually 0 when empty, and information on the content otherwise). So: if (exists $href->{cow} and %{ $href->{cow} }) . Unlike the keys version, scalar %{ $href->{cow} } will not create a new hash (autovivify) if the cow key doesn't exist, but will die instead (at least if you forgot to check whether the key exists, you'll get an error in the right place).

[Nov 17, 2017] Indexed Flat File databases (for ISAM, NoSQL, Perl Embedded databases)

Nov 17, 2017 | perlmonks.com


by erichansen1836 on Oct 08, 2017 at 11:13

TOPIC: FAST!! Random-access indexed, relational flat-file databases, indexed by external Perl SDBM databases of key/value pairs tied to the program's in-memory hash tables. The key in the key/value pair is one or more fields and/or partial fields from the flat-file records, concatenated together (separated by a delimiter such as a pipe "|"), which lets you arbitrarily seek to a single record, or to a sorted/related group of records, within your database.

Since it has been over two years since I first posted about this topic, I wanted to alert the Perl community to the original thread, where you can now find Perl source code with examples of how to implement this joint database technology/methodology. In particular, see the King James Bible Navigator software DEMO I posted, which used FlatFile/SDBM for its database. I have made this a native Windows GUI application (TreeView/RichEdit combo interface) to demonstrate how to show your end users a summary of the data contained within a database, and to let them drill down to a small amount of specific information (e.g. verses within a single book/chapter) for actual viewing (and retrieval from the database). The TreeView double-click event was originally written to random-access the first verse within a chapter and then sequentially access the remaining verses, performing a READ for each verse. I posted a separate, modified TreeView double-click event for you to insert into the application, which reads an entire chapter in one giant READ and breaks out the individual verses (into an array) using the UNPACK statement. -- Eric

Joint Database Technology: http://www.perlmonks.org/?node_id=1121222
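
A minimal sketch of the general technique described above (the file names, the fixed record length, and the key format are hypothetical): an SDBM file tied to a hash maps a concatenated key to a byte offset, and the program seeks to that offset in a fixed-length-record flat file.

use strict;
use warnings;
use Fcntl;                        # for O_RDWR, O_CREAT
use SDBM_File;

my $RECLEN = 128;                 # hypothetical fixed record length
tie my %index, 'SDBM_File', 'verses_idx', O_RDWR|O_CREAT, 0666
    or die "Cannot tie index: $!";
open my $fh, '<', 'verses.dat' or die "Cannot open flat file: $!";

# The key is fields joined with '|', e.g. "Genesis|1|1"; the value is a byte offset.
my $offset = $index{'Genesis|1|1'};
if (defined $offset) {
    seek $fh, $offset, 0 or die "seek failed: $!";
    read $fh, my $record, $RECLEN;
    print "$record\n";
}

untie %index;
close $fh;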

[Nov 17, 2017] How do the Perl 6 set operations compare elements?

Notable quotes:
"... Running under moar (2016.10) ..."
Nov 17, 2017 | stackoverflow.com


brian d foy ,Nov 26, 2016 at 4:32

Running under moar (2016.10)

Consider this code that constructs a set and tests for membership:

my $num_set = set( < 1 2 3 4 > );
say "set: ", $num_set.perl;
say "4 is in set: ", 4 ∈ $num_set;
say "IntStr 4 is in set: ", IntStr.new(4, "Four") ∈ $num_set;
say "IntStr(4,...) is 4: ", IntStr.new(4, "Four") == 4;
say "5 is in set: ", 5 ∈ $num_set;

A straight 4 is not in the set, but the IntStr version is:

set: set(IntStr.new(4, "4"),IntStr.new(1, "1"),IntStr.new(2, "2"),IntStr.new(3, "3"))
4 is in set: False
IntStr 4 is in set: True
IntStr(4,...) is 4: True
5 is in set: False

I think most people aren't going to expect this, but the docs don't say anything about how this might work. I don't have this problem if I don't use the quote words (i.e. set( 1, 2, 3, 4 ) ).

timotimo ,Nov 26, 2016 at 5:47

You took a wrong turn in the middle. The important part is what nqp::existskey is called with: the k.WHICH . This method is there for value types, i.e. immutable types where the value - rather than identity - defines if two things are supposed to be the same thing (even if created twice). It returns a string representation of an object's value that is equal for two things that are supposed to be equal. For <1>.WHICH you get IntStr|1 and for 1.WHICH you get just Int|1 .

brian d foy ,Nov 26, 2016 at 6:18

Ah, okay. I can see a lot of pain for regular people trying to debug these things. – brian d foy Nov 26 '16 at 6:18

smls ,Nov 26, 2016 at 14:46

As explained in the Set documentation, sets compare object identity, same as the === operator:

Within a Set, every element is guaranteed to be unique (in the sense that no two elements would compare positively with the === operator)

The identity of an object is defined by the .WHICH method, as timotimo elaborates in his answer.

brian d foy ,Nov 26, 2016 at 21:28

That's not really clear from that statement. That's talking about which elements are in the set. Beyond that, even if you choose to compare with ===, you have to know how other things are stored. This is the sort of info that should show up next to the Set operators. – brian d foy Nov 26 '16 at 21:28

brian d foy ,Nov 26, 2016 at 23:09

Indeed, I think I've found a bug. The qw docs says this should be true: < a b 137 > eqv ( 'a', 'b', '137' ) , but in the same version of Rakudo Star I get false. It's different object types on each side. – brian d foy Nov 26 '16 at 23:09

brian d foy ,Nov 26, 2016 at 23:16

Despite all this, your answer was the A-ha! moment that led me to look at the right thing. Thanks for all of your help. – brian d foy Nov 26 '16 at 23:16

raiph ,Nov 27, 2016 at 4:50

Write your list of numbers using commas

As you mention in your answer, your code works if you write your numbers as a simple comma separated list rather than using the <...> construct.

Here's why:

4 ∈ set 1, 2, 3, 4 # True

A bare numeric literal in code, like the 4 to the left of the ∈ above, constructs a single value with a numeric type. (In this case the type is Int, an integer.) If a set constructor receives a list of similar literals on the right then everything works out fine.

<1 2 3 4> produces a list of "dual values"

The various <...> "quote words" constructs turn the list of whitespace separated literal elements within the angle brackets into an output list of values.

The foundational variant ( qw<...> ) outputs nothing but strings. Using it for your use case doesn't work:

4 ∈ set qw<1 2 3 4> # False

The 4 on the left constructs a single numeric value, type Int . In the meantime the set constructor receives a list of strings, type Str : ('1','2','3','4') . The ∈ operator doesn't find an Int in the set because all the values are Strs, so it returns False .

Moving along, the huffmanized <...> variant outputs Str s unless an element is recognized as a number. If an element is recognized as a number then the output value is a "dual value". For example a 1 becomes an IntStr .

According to the doc "an IntStr can be used interchangeably where one might use a Str or an Int". But can it?

Your scenario is a case in point. While 1 ∈ set 1,2,3 and <1> ∈ set <1 2 3> both work, 1 ∈ set <1 2 3> and <1> ∈ set 1, 2, 3 both return False .

So it seems the ∈ operator isn't living up to the quoted doc's claim of dual-value interchangeability.

This may already be recognized as a bug in the set operation and/or other operations. Even if not, this sharp "dual value" edge of the <...> list constructor may eventually be viewed as sufficiently painful that Perl 6 needs to change.

brian d foy ,Nov 26, 2016 at 23:29

I think this is a bug, but not in the set stuff. The other answers were very helpful in sorting out what was important and what wasn't.

I used the angle-brackets form of the quote words . The quote words form is supposed to be equivalent to the quoting version (that is, True under eqv ). Here's the doc example:

<a b c> eqv ('a', 'b', 'c')

But, when I try this with a word that is all digits, this is broken:

 $ perl6
 > < a b 137 > eqv ( 'a', 'b', '137' )
 False

But, the other forms work:

> qw/ a b 137 / eqv ( 'a', 'b', '137' )
True
> Q:w/ a b 137 / eqv ( 'a', 'b', '137' )
True

The angle-bracket word quoting uses IntStr :

> my @n = < a b 137 >
[a b 137]
> @n.perl
["a", "b", IntStr.new(137, "137")]

Without the word quoting, the digits word comes out as [Str]:

> ( 'a', 'b', '137' ).perl
("a", "b", "137")
> ( 'a', 'b', '137' )[*-1].perl
"137"
> ( 'a', 'b', '137' )[*-1].WHAT
(Str)
> my @n = ( 'a', 'b', '137' );
[a b 137]
> @n[*-1].WHAT
(Str)

You typically see these sorts of errors when there are two code paths to get to a final result instead of shared code that converges to one path very early. That's what I would look for if I wanted to track this down (but, I need to work on the book!)

This does highlight, though, that you have to be very careful about sets. Even if this bug was fixed, there are other, non-buggy ways that eqv can fail. I would have still failed because 4 as Int is not "4" as Str . I think this level of attention to data types is unperly in its DWIMery. It's certainly something I'd have to explain very carefully in a classroom and still watch everyone mess up on it.

For what it's worth, I think the results of gist tend to be misleading in their oversimplification, and sometimes the results of perl aren't rich enough (e.g. hiding Str which forces me to .WHAT ). The more I use those, the less useful I find them.

But, knowing that I messed up before I even started would have saved me from that code spelunking that ended up meaning nothing!

Christoph ,Nov 26, 2016 at 23:55

Could you clarify what you consider the bug to be? As far as I can tell, this is all by design: (a) <...> goes through &val , which returns allomorphs if possible (b) set membership is defined in terms of identity, which distinguishes between allomorphs and their corresponding value types; so I would not classify it as a bug, but 'broken' by design; or phrased another way, it's just the WAT that comes with this particular DWIMChristoph Nov 26 '16 at 23:55

Brad Gilbert ,Nov 26, 2016 at 23:59

This was intentionally added, and is part of the testsuite . ( I can't seem to find anywhere that tests for < > being equivalent to q:w:v< > and << >> / " " being equivalent to qq:ww:v<< >> ) – Brad Gilbert Nov 26 '16 at 23:59

brian d foy ,Nov 27, 2016 at 0:02

The docs say the two lists should be eqv, and they are not. If they are not meant to be equivalent, the docs need to change. Nothing in docs.perl6.org/language/quoting#Word_quoting:_qw mentions any of this stuff. – brian d foy Nov 27 '16 at 0:02

Christoph ,Nov 27, 2016 at 0:17

The documentation seems to be just wrong here, <...> does not correspond to qw(...) , but qw:v(...) . Cf S02 for the description of the adverb and this test that Brad already linked to – Christoph Nov 27 '16 at 0:17

Christoph ,Nov 27, 2016 at 0:45

or perhaps not outright wrong, but rather 'just' misleading: <...> is indeed a :w form, and the given example code does compare equal according to eqvChristoph Nov 27 '16 at 0:45

dwarring ,Nov 27, 2016 at 18:33

Just to add to the other answers and point out a consistency here between sets and object hashes .

An object hash is declared as my %object-hash{Any} . This effectively hashes on the objects' .WHICH method, which is similar to how sets distinguish individual members.

Substituting the set with an object hash:

my %obj-hash{Any};

%obj-hash< 1 2 3 4 > = Any;
say "hash: ", %obj-hash.keys.perl;
say "4 is in hash: ", %obj-hash{4}:exists;
say "IntStr 4 is in hash: ", %obj-hash{ IntStr.new(4, "Four") }:exists;
say "IntStr(4,...) is 4: ", IntStr.new(4, "Four") == 4;
say "5 is in hash: ", %obj-hash{5}:exists;

gives similar results to your original example:

hash: (IntStr.new(4, "4"), IntStr.new(1, "1"), IntStr.new(2, "2"), IntStr.new(3, "3")).Seq
4 is in hash: False
IntStr 4 is in hash: True
IntStr(4,...) is 4: True
5 is in hash: False

brian d foy ,Nov 29, 2016 at 21:52

Oh, this is going to suck hard while teaching a class. – brian d foy Nov 29 '16 at 21:52

dwarring ,Nov 30, 2016 at 4:21

I agree it's not great, as it is. – dwarring Nov 30 '16 at 4:21

dwarring ,Nov 30, 2016 at 18:26

Have raised an RT rt.perl.org/Ticket/Display.html?id=130222dwarring Nov 30 '16 at 18:26

[Nov 17, 2017] Introducing Test::Simple for testing Perl programs - YouTube

Nov 17, 2017 | www.youtube.com

Published on Oct 17, 2015

For details visit: http://perlmaven.com/introducing-test...
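
For reference, a minimal Test::Simple script looks roughly like this (the add function is a made-up example of the code under test):

use strict;
use warnings;
use Test::Simple tests => 2;

sub add { return $_[0] + $_[1] }      # hypothetical function under test

ok( add(2, 2) == 4,  'add(2, 2) gives 4' );
ok( add(-1, 1) == 0, 'add(-1, 1) gives 0' );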

[Nov 17, 2017] Bruce Gray - Your Perl 5 Brain, on Perl 6, by Bruce Gray

Nov 17, 2017 | www.youtube.com

Published on Jun 21, 2017

In which I detail the Perl 6 elements that have most changed my Perl 5 coding, and share the Perl 5 techniques I have adopted.

I eat, sleep, live, and breathe Perl!

Consultant and Contract Programmer Frequent PerlMongers speaker Dedicated Shakespeare theater-goer Armchair Mathematician Author of Blue_Tiger, a tool for modernizing Perl.

36 years coding 22 years Perl 16 years Married 15 YAPCs 7 Hackathons 3 PerlWhirls Perl interests: Refactoring, Perl Idioms / Micropatterns, RosettaCode, and Perl 6.

[Nov 17, 2017] Bit operations in Perl

Nov 17, 2017 | stackoverflow.com


Toren ,Jan 12, 2011 at 14:50

I have an attribute (32 bits long) in which each bit is responsible for a specific piece of functionality. The Perl script I'm writing should turn on the 4th bit, but preserve the previous settings of the other bits.

I use in my program:

sub BitOperationOnAttr
{
    my $a = "";
    MyGetFunc($a);
    $a |= 0x00000008;
    MySetFunc($a);
}

** MyGetFunc/MySetFunc are my own functions that read and set the value.

Questions:

  1. Is the usage of $a |= 0x00000008; right?
  2. How do I extract the hex value from a string I have, using a regular expression? For example:

"Attribute: Somestring: value (8 long (0x8))"

Michael Carman ,Jan 12, 2011 at 16:13

Your questions are not related; they should be posted separately. That makes it easier for other people with similar questions to find them. – Michael Carman Jan 12 '11 at 16:13

toolic ,Jan 12, 2011 at 16:47

Same question asked on PerlMonks: perlmonks.org/?node_id=881892toolic Jan 12 '11 at 16:47

psmears ,Jan 12, 2011 at 15:00

  1. Is the usage of $a |= 0x00000008; right?

Yes, this is fine.

  2. How do I extract the hex value from a string I have, using a regular expression? For example:

"Attribute: Somestring: value (8 long (0x8))"

I'm assuming you have a string like the above, and want to use a regular expression to extract the "0x8". In that case, something like:

if ($string =~ m/0x([0-9a-fA-F]+)/) {
    $value = hex($1);
} else {
    # string didn't match
}

should work.

Michael Carman ,Jan 12, 2011 at 16:32

Perl provides several ways of dealing with binary data: bitwise operators, vec, and pack/unpack.

Your scenario sounds like a set of packed flags. The bitwise operators are a good fit for this:

my $mask = 1 << 3;   # 0x0008
$value |=  $mask;    # set bit
$value &= ~$mask;    # clear bit
if ($value & $mask)  # check bit

vec is designed for use with bit vectors. (Each element has the same size, which must be a power of two.) It could work here as well:

vec($value, 3, 1) = 1;  # set bit
vec($value, 3, 1) = 0;  # clear bit
if (vec($value, 3, 1))  # check bit

pack and unpack are better suited for working with things like C structs or endianness.
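
For example, a small sketch of how pack/unpack handle the byte order of a 32-bit value:

my $value = 0x00000008;
my $be = pack 'N', $value;    # 32-bit big-endian:    "\x00\x00\x00\x08"
my $le = pack 'V', $value;    # 32-bit little-endian: "\x08\x00\x00\x00"
printf "back from big-endian:    0x%08X\n", unpack 'N', $be;
printf "back from little-endian: 0x%08X\n", unpack 'V', $le;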

sdaau ,Jul 15, 2014 at 5:01

I upvoted, but there is something very important missing: vec operates on a string!

If we use a number, say:

$val=5;
printf("b%08b",$val);

(this gives b00000101 ) -- then one can see that the "check bit" syntax, say:

for ($ix=7; $ix>=0; $ix--) {
    print vec($val, $ix, 1);
};
print "\n";

will not work (it gives 00110101 , which is not the same number). The correct approach is to convert the number to an ASCII character first, i.e.

print vec(sprintf("%c", $val), $ix, 1);

sdaau Jul 15 '14 at 5:01

[Nov 17, 2017] date - How to convert epoch seconds to normal time in perl - Stack Overflow

Nov 17, 2017 | stackoverflow.com

confused ,2 days ago

I have a string of epoch seconds, "1510652305", which when I convert it to normal time on the Unix command line using
`date -d @1510652305`

I get Tue Nov 14 15:08:25 IST 2017

But when I tried it in Perl using something like this

use POSIX qw(strftime);
use Time::Local;
use Time::localtime;

$kickoff_time=1510652305;
$kickoff_time=ctime($kickoff_time);

I get

Thu Jan 1 05:30:00 1970

How can I achieve in Perl the result I am getting in Linux?

Thanks!!

mwp ,2 days ago

Don't overthink it!
use feature 'say';
my $kickoff_time = localtime 1510652305;
say $kickoff_time; # Tue Nov 14 15:08:25 2017

If you absolutely, positively need the timezone in there:

use feature 'say';
use POSIX qw{strftime};

my $kickoff_time = strftime '%a %b %e %H:%M:%S %Z %Y', localtime 1510652305;
say $kickoff_time; # Tue Nov 14 15:08:25 IST 2017

Note that this is locale-dependent.

confused ,yesterday

We have to use localtime to convert time from epoch seconds and gmtime to convert time from normal seconds; I got it now.... Thanks!! – confused yesterday

mob ,yesterday

Still confused. Both localtime and gmtime expect the input to be epoch seconds. – mob yesterday

mwp ,4 hours ago

Exactly. To expound, localtime() takes the epoch and returns a string (or date parts array) representing the time in your local timezone; gmtime() takes the epoch and returns a string (or date parts array) representing the time in UTC. – mwp 4 hours ago

,

I would recommend using Time::Piece for this job - it's core in perl.
#!/usr/bin/env perl

use strict;
use warnings; 
use Time::Piece;

my $t = localtime ( 1510652305 );

print $t;

It'll print the default format, or you can format the output yourself using strftime.
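
For example (a sketch; the format strings are just illustrations), a Time::Piece object provides strftime and some convenience accessors:

use strict;
use warnings;
use Time::Piece;

my $t = localtime(1510652305);
print $t->strftime('%a %b %e %H:%M:%S %Y'), "\n";   # e.g. Tue Nov 14 15:08:25 2017 (local time)
print $t->ymd, ' ', $t->hms, "\n";                  # e.g. 2017-11-14 15:08:25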

[Nov 17, 2017] Meteoalarm - Weather warnings

Nov 17, 2017 | perlmonks.com


by walto on Sep 23, 2017 at 00:50

Meteoalarm.eu ( http://meteoalarm.eu ) is the official website from European national weather services that gives out warnings in extreme weather situations. It has been a while ago that i wrote a perl module for processing this information. ( Weather warnings from www.meteoalarm.eu ). The website is still on but has changed since. That made some changes necessary. I wrote the module only for informational purposes and it is not meant to use it for anything critical. Here is the code: #!/usr/bin/perl # # package Meteoalarm; use strict; use warnings; use Carp; use LWP; use HTML::Entities; use HTML::TreeBuilder; use utf8; binmode STDOUT, ":encoding(UTF-8)"; our $VERSION = "0.06"; sub new { my $class = shift; my $self = {}; my %passed_params = @_; $self->{'user_agent'} = _make_user_agent( $passed_params{'user_agent'} ); bless( $self, $class ); return $self; } sub countries { my $self = shift; my %passed_params = @_; my %type = ( 'all' => 0, 'wind' => 1, 'snow' => 2, 'ice' => 2, 'snow/ice' => 2, 'thunderstorm' => 3, 'fog' => 4, 'extreme high temperature' => 5, 'extreme low temperature' => 6, 'coastal event' => 7, 'forestfire' => 8, 'avalanches' => 9, 'rain' => 10, 'unnamed' => 11, 'flood' => 12, 'rainflood' => 13 ); if ( !$passed_params{type} ) { $passed_params{type} = 0; } elsif ( !$type{ $passed_params{type} } ) { $passed_params{type} = 0; } else { $passed_params{type} = $type{ $passed_params{type} }; } my %day = ( 'today' => 0, 'tomorrow' => 1 ); if ( !$passed_params{day} ) { $passed_params{day} = 0; } elsif ( !$day{ $passed_params{day} } ) { $passed_params{day} = 0; } else { $passed_params{day} = $day{ $passed_params{day} }; } my $url = _make_country_url( $passed_params{day}, $passed_params{type} ); my $content = _fetch_content( $url, $self->{'user_agent'} ); my $country_warnings = _parse_country_warnings($content); return $country_warnings; } sub regions { my ($self) = shift; my %passed_params = @_; my %day = ( 'today' => 0, 'tomorrow' => 1 ); if ( !$passed_params{day} ) { $passed_params{day} = 0; } elsif ( !$day{ $passed_params{day} } ) { $passed_params{day} = 0; } else { $passed_params{day} = $day{ $passed_params{day} }; } my %type = ( 'all' => 0, 'wind' => 1, 'snow' => 2, 'ice' => 2, 'snow/ice' => 2, 'thunderstorm' => 3, 'fog' => 4, 'extreme high temperature' => 5, 'extreme low temperature' => 6, 'coastal event' => 7, 'forestfire' => 8, 'avalanches' => 9, 'rain' => 10, 'unnamed' => 11, 'flood' => 12, 'rainflood' => 13 ); my %country_codes = ( 'AT' => 'AT-Austria', 'BA' => 'BA_Bosnia-Herzegovina', 'BE' => 'BE-Belgium', 'BG' => 'BG-Bulgaria', 'CH' => 'CH-Switzerland', 'CY' => 'CY-Cyprus', 'CZ' => 'CZ-Czechia', 'DE' => 'DE-Germany', 'DK' => 'DK-Denmark', 'EE' => 'EE-Estonia', 'ES' => 'ES-Spain', 'FI' => 'FI-Finland', 'FR' => 'FR-France', 'GR' => 'GR-Greece', 'HR' => 'HR-Croatia', 'HU' => 'HU-Hungary', 'IE' => 'IE-Ireland', 'IL' => 'IL-Israel', 'IS' => 'IS-Iceland', 'IT' => 'IT-Italy', 'LT' => 'LT-Lithuania', 'LU' => 'LU-Luxemburg', 'LV' => 'LV-Latvia', 'MD' => 'MD-Moldova', 'ME' => 'ME-Montenegro', 'MK' => 'MK-Former Yugoslav Republic of Macedonia', 'MT' => 'MT-Malta', 'NL' => 'NL-Netherlands', 'NO' => 'NO-Norway', 'PL' => 'PL-Poland', 'PT' => 'PT-Portugal', 'RO' => 'RO-Romania', 'RS' => 'RS-Serbia', 'SE' => 'SE-Sweden', 'SI' => 'SI-Slovenia', 'SK' => 'SK-Slovakia', 'UK' => 'UK-United-Kingdom' ); if ( !$passed_params{type} ) { $passed_params{type} = 0; } elsif ( !$type{ $passed_params{type} } ) { $passed_params{type} = 0; } else { $passed_params{type} = $type{ $passed_params{type} }; } 
croak "Invalid country_code: $passed_params{country_code}" unless $passed_params{country_code}; my $url = 'http://meteoalarm.eu/en_UK/' . $passed_params{day} . '/' . $passed_params{type} . '/' . $country_codes{ $passed_params{country_code} } . '.html'; my $content = _fetch_content( $url, $self->{'user_agent'} ); my $region_warnings = _parse_region_warnings($content); return $region_warnings; } sub details { my $self = shift; my %passed_params = @_; my %country_codes = ( 'AT' => 10, 'BA' => 10, 'BE' => 801, 'BG' => 28, 'CH' => 319, 'CY' => 1, 'CZ' => 14, 'DE' => 808, 'DK' => 8, 'EE' => 805, 'ES' => 831, 'FI' => 813, 'FR' => 94, 'GR' => 16, 'HR' => 806, 'HU' => 7, 'IE' => 804, 'IL' => 803, 'IS' => 11, 'IT' => 20, 'LT' => 801, 'LU' => 2, 'LV' => 804, 'MD' => 37, 'ME' => 3, 'MK' => 6, 'MT' => 1, 'NL' => 807, 'NO' => 814, 'PL' => 802, 'PT' => 26, 'RO' => 42, 'RS' => 11, 'SE' => 813, 'SI' => 801, 'SK' => 16, 'UK' => 16 ); my ( $region, $code ) = $passed_params{region_code} =~ /^([ABCDEFGHILMNPRSU][A-Z])(\d\d\ d)/; $code =~ s /^0//; croak "Invalid region_code: $passed_params{region_code}" unless ( $country_codes{$region} and ( $code <= $country_codes{$region} ) ); my $details; my %type = ( 'all' => 0, 'wind' => 1, 'snow' => 2, 'ice' => 2, 'snow/ice' => 2, 'thunderstorm' => 3, 'fog' => 4, 'extreme high temperature' => 5, 'extreme low temperature' => 6, 'coastal event' => 7, 'forestfire' => 8, 'avalanches' => 9, 'rain' => 10, 'unnamed' => 11, 'flood' => 12, 'rainflood' => 13 ); if ( !$passed_params{type} ) { $passed_params{type} = 0; } elsif ( !$type{ $passed_params{type} } ) { $passed_params{type} = 0; } else { $passed_params{type} = $type{ $passed_params{type} }; } my %day = ( 'today' => 0, 'tomorrow' => 1 ); if ( !$passed_params{day} ) { $passed_params{day} = 0; } elsif ( !$day{ $passed_params{day} } ) { $passed_params{day} = 0; } else { $passed_params{day} = $day{ $passed_params{day} }; } my $url = 'http://meteoalarm.eu/en_UK/' . $passed_params{day} . '/' . $passed_params{type} . '/' . $passed_params{region_code} . '.html'; my $content = _fetch_content( $url, $self->{'user_agent'} ); $details = _parse_details($content); return $details; } sub codes { my $self = shift; my @codes; my @countries_short; if (@_) { @countries_short = @_; } else { @countries_short = qw(AT BA BE BG CH CY CZ DE DK EE ES FI FR GR HR HU IE IL IS IT LT LU LV MD ME MK MT NL NO PL PT RO RS SE SI SK UK); } my %country_codes = ( 'AT' => 'AT-Austria', 'BA' => 'BA_Bosnia-Herzegovina', 'BE' => 'BE-Belgium', 'BG' => 'BG-Bulgaria', 'CH' => 'CH-Switzerland', 'CY' => 'CY-Cyprus', 'CZ' => 'CZ-Czechia', 'DE' => 'DE-Germany', 'DK' => 'DK-Denmark', 'EE' => 'EE-Estonia', 'ES' => 'ES-Spain', 'FI' => 'FI-Finland', 'FR' => 'FR-France', 'GR' => 'GR-Greece', 'HR' => 'HR-Croatia', 'HU' => 'HU-Hungary', 'IE' => 'IE-Ireland', 'IL' => 'IL-Israel', 'IS' => 'IS-Iceland', 'IT' => 'IT-Italy', 'LT' => 'LT-Lithuania', 'LU' => 'LU-Luxemburg', 'LV' => 'LV-Latvia', 'MD' => 'MD-Moldova', 'ME' => 'ME-Montenegro', 'MK' => 'MK-Former Yugoslav Republic of Macedonia', 'MT' => 'MT-Malta', 'NL' => 'NL-Netherlands', 'NO' => 'NO-Norway', 'PL' => 'PL-Poland', 'PT' => 'PT-Portugal', 'RO' => 'RO-Romania', 'RS' => 'RS-Serbia', 'SE' => 'SE-Sweden', 'SI' => 'SI-Slovenia', 'SK' => 'SK-Slovakia', 'UK' => 'UK-United-Kingdom' ); foreach my $country_short (@countries_short) { my $url = 'http://meteoalarm.eu/en_UK/' . '0' . '/' . '0' . '/' . $country_codes{$country_short} . 
'.html'; my $content = _fetch_content( $url, $self->{'user_agent'} ); push @codes, _parse_codes($content); } return @codes; } sub _make_country_url { my ( $day, $type ) = @_; my $url = 'http://meteoalarm.eu/en_UK/' . $day . '/' . $type . '/EU-Europe .html'; return $url; } sub _fetch_content { my ( $url, $user_agent ) = @_; my $ua = LWP::UserAgent->new; $ua->agent($user_agent); my $res = $ua->request( HTTP::Request->new( GET => $url ) ); croak " Can't fetch http://meteoalarm.eu: $res->status_line \n" unless ( $res->is_success ); return $res->decoded_content; } sub _parse_country_warnings { my $content = shift; my $p = HTML::TreeBuilder->new_from_content($content); my (%data); my @cells = $p->look_down( _tag => q{td}, class => qr/^col[12]$/ ) ; for my $cell (@cells) { my @src; my $div = $cell->look_down( _tag => q{div} ); my $id = $div->id; my $alt = $div->attr(q{alt}); $data{$id}{fullname} = $alt; my @weather_events = $div->look_down( _tag => 'span', class => qr{warn awt} ); $data{$id}{warnings} = _parse_weather_events( \@weather_events ); # # get tendency # my $tendency = $div->look_down( _tag => 'div', class => qr{tendenz awt nt l\d} ); if ( $tendency->{class} ) { $tendency->{class} =~ /tendenz awt nt l(\d)/; $data{$id}{tendency} = $1; } } return \%data; } sub _parse_region_warnings { my $content = shift; my $p = HTML::TreeBuilder->new_from_content($content); my (%data); my @cells = $p->look_down(_tag=>qr{div}, id=>qr{area}); for my $cell (@cells) { $cell->id =~ /area_([A-Z][A-Z]\d+)/; my $id = $1; my $fullname = $cell->look_down(_tag=>'span',id=>'cname')->as_text ; my $div = $cell->look_down( _tag => q{div} ); $data{$id}{fullname} = $fullname; my @weather_events = $div->look_down(_tag=> 'span', class=>qr{warnflag warn2}); $data{$id}{warnings} = _parse_weather_events( \@weather_events ); # # get tendency # my $tendency = $div->look_down( _tag => 'span', class => qr{tendenz awt\d l\d} ); if ( $tendency->{class} ) { $tendency->{class} =~ /tendenz awt\d l(\d)/; $data{$id}{tendency} = $1; } } return \%data; } sub _parse_weather_events { my $events = shift; my %weather_to_text = ( # lower case for consistency 1 => 'wind', 2 => 'snow/ice', 3 => 'thunderstorm', 4 => 'fog', 5 => 'extreme high temperature', 6 => 'extreme low temperature', 7 => 'coastal event', 8 => 'forestfire', 9 => 'avalanches', 10 => 'rain', 11 => 'unnamed', 12 => 'flood', 13 => 'rainflood' ); my %literal_warnings; for my $event (@$events) { #print $event->{class}, "\n"; $event->{class} =~ /warn\d* awt l(\d+) t(\d+)/; my $warn_level = $1; my $weather = $2; $literal_warnings{ $weather_to_text{$weather} } = $warn_level; } return \%literal_warnings; } sub _parse_details { my $content = shift; my (%data); my $p = HTML::TreeBuilder->new_from_content( decode_entities $cont ent); $data{fullname} = $p->look_down( _tag => q{h1} )->as_text; if ( $p->look_down( _tag => q{div}, class => q{warnbox awt nt l l1} ) ) { $data{warnings} = 'no warnings'; } else { my @warnboxes = $p->look_down( _tag => q{div}, class => qr/warnbox awt/ ); for my $warnbox (@warnboxes) { my ($as_txt); my @info_divs = $warnbox->look_down( _tag => q{div}, class => q{info} ); $as_txt = $info_divs[0]->as_text; my ( $from, $until ) = $as_txt =~ /valid from (.*) Until ( .*)$/; $as_txt = $info_divs[1]->as_text; my ( $warning, $level ) = $as_txt =~ /(.+?)\s+Awareness Level:\s+(.*)/; $warning =~ s/s$//; my $text = $warnbox->look_down( _tag => q{div}, class => q{text} )->as_text; $data{warnings}{ lc $warning } = { #lower case for constistency level => $level, from => $from, 
until => $until, text => $text, }; } } return \%data; } sub _parse_codes { my $content = shift; my $p = HTML::TreeBuilder->new_from_content($content); my (%data); #my @cells = $p->look_down( _tag => 'div', class => 'flags' ); my @cells = $p->look_down( _tag => qr{a} ); for my $cell (@cells) { if ( $cell->attr('xlink:href') ) { if ( $cell->attr('xlink:href') =~ /\/([A-Z][A-Z]\d+)-(.+?) .html/ ) { my $code = $1; my $fullname = $2; $data{$fullname} = $code; } } } return \%data; } sub _make_user_agent { my $ua = shift; $ua = 'Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:55.0) Gecko/20100101 Fire fox/55.0' unless ($ua); return $ua; } sub _extract_details_fullname { my $content = shift; my $region; if ( $content =~ /<h1>Weather warnings: (.+?)<\/h1>/ ) { $region = $1; decode_entities($region); if ( $region =~ /.??<.*<\/a>/ ) { $region =~ s/.??<.*<\/a>//; } } else { carp "Can't get region name\n"; } return $region; } 1; __END__ =head1 NAME B<Meteoalarm> - OO Interface for meteoalarm.eu =head1 SYNOPSIS This Module gets weather warnings from meteoalarm.eu. For further reading of terms and conditions see http://meteoalarm.eu/t erms.php?lang=en_UK use Meteoalarm; my $meteo = Meteoalarm->new( 'user_agent' => 'Meteobot 0.001' ); my $countries = $meteo -> countries ('type' => 'all', 'day' => 'today' ); foreach my $country_code (sort keys %{$countries}){ print "Country: $countries->{$country_code}->{'fullname'}\n"; print "Tendency = $countries->{$country_code}->{tendency}\n" if ( $countries->{$country_code}->{'tendency'}); if (keys %{$countries->{$country_code}->{'warnings'}}){ foreach my $warning (keys %{$countries->{$country_code}->{'warning s'}}){ print "Event: $warning, severity: $countries->{$country_co de}->{'warnings'}->{$warning}\n"; } } else {print "No Warnings\n";} } my $regions = $meteo->regions( 'country_code' => 'PT', 'day' => 'today ', 'type' => 'all' ); foreach my $code ( sort keys %{$regions} ) { print "Region : $regions->{$code}->{'fullname'}: region_code = $co de\n" if ( keys %{ $regions->{$code}->{'warnings'} } ); print "Tendency = $regions->{$code}->{tendency}\n" if ( $regions-> {$code}->{'tendency'}); foreach my $type ( keys %{ $regions->{$code}->{'warnings'} } ) { print "$type Severity: $regions->{$code}->{'warnings'}->{$type}\n"; } } my $details = $meteo->details( 'region_code' => 'UK010', 'day' => 'tod ay'); my $name = $details->{'fullname'}; print "$name\n"; if ( $details->{warnings} eq 'no warnings' ) { print $details->{warnings}, "\n"; } else { foreach my $warning ( keys %{ $details->{'warnings'} } ) { print "$warning\n"; foreach my $detail ( keys %{ $details->{'warnings'}->{$warning } } ) { print "$detail: $details->{'warnings'}->{$warning}->{$deta il}\n"; } } } my $codes = $meteo->codes('FR'); my @codes = $meteo->codes(); foreach my $code (@codes) { foreach my $region ( sort keys %{$code} ) { print "Region name: $region, region code: $code->{$region}\n"; } } =head1 DESCRIPTION $meteo -> countries returns hashref of warnings for all countries. 
$meteo -> regions returns hashref of warnings for all regions in a spe cified country $meteo -> details returns hashref of detailled warnings for a specifie d region $meteo -> codes returns arrayref of hash of name and region code of a country =head1 METHODS =head1 new( ) creates a new meteoalarm object =head2 Optional Arguments: new( 'user_agent' => 'Meteobot 0.001'); changes the user agent string =head1 my $country = $meteo -> countries (); =head2 Optional Arguments: 'day' => 'today' || 'tomorrow' if day is not defined, default value is today 'type' => 'all' || 'wind' || 'snow' || 'ice' || 'snow/ice' || 'snow' || 'ice' || 'thunderstorm' || 'fog' || 'extreme high temperature' || 'extreme low temperature' || 'coastal event' || 'fores tfire' || 'avalanches' || 'rain' if type is not defined, default type is all =head1 $regions = $meteo -> regions ('country_code' => 'DE'); country_code is a 2 letter abbreviation =head2 Optional arguments: day=> 'today' || 'tomorrow' if day is not defined, default value is today =head1 $details = $meteo->details ('region_code' => 'ES005'); region_code consits of 2 letters for the country and 3 digits =head2 Optional arguments: day=> 'today' || 'tomorrow' if day is not defined, default value is today =head1 $code = $meteo -> codes (); Returns arrayref of hash for region names and codes for all countries =head2 Optional Arguments $code = $meteo -> codes ('PL'); Countrycode for a specific country =cut

[Nov 17, 2017] Safe string handling

Nov 17, 2017 | perlmonks.com


by tdlewis77 on Aug 25, 2017 at 13:07

Dealing with data that comes from webpages can be really complicated. There is likely to be a combination of ASCII, UTF-8, and wide characters in the data returned, and you cannot depend on the website to tell you what type of content is being returned. The routines safeString, safeSubstr, testString, and trueLength can be used to easily manipulate these strings. Pass any string to safeString and you will never get a wide-character warning from print. Use safeSubstr to extract complete UTF-8 character sequences from a string. Use testString to tell you what's really in the string. Use trueLength to find out how many characters wide the output will be.

use Encode qw(encode);
use List::Util qw(min);

# This string has a mixture of ASCII, UTF-8, 2 byte wide, and 4 byte
# wide characters
my $crazy = "Hello\x{26c4}" . encode("utf-8","\x{26f0}") . "\x{10102}\x{2fa1b}";
# Now the string only has ASCII and UTF-8 characters
my $sane = safeString($crazy);
# testString($crazy) returns 7
# testString($sane)  returns 3
# length($sane)      returns 19
# trueLength($sane)  returns 9
my $snowman = safeSubstr($crazy,5,1);

########################################
# safeString($string)
# return a safe version of the string
sub safeString {
    my ($string) = @_;
    return "" unless defined($string);
    my $t = testString($string);
    return $string if $t <= 3;
    return encode("utf-8",$string) if $t <= 5;
    # The string has both UTF-8 and wide characters so it needs
    # tender-loving care
    my @s = unpack('C*',$string);
    my @r;
    for (my $i = 0; $i < scalar(@s);) {
        if ($s[$i] < 128) {
            push @r, $s[$i];
            $i++;
        } elsif ($s[$i] > 255) {
            # encode a wide character
            push @r, unpack("C*",encode("utf-8",chr($s[$i])));
            $i++;
        } else {
            # copy all the utf-8 bytes
            my $n = _charBytes($i,@s) - 1;
            map { push @r, $s[$i+$_] } 0..$n;
            $i += $n + 1;
        }
    }
    return pack("C*",@r);
}

########################################
# safeSubstr($string,$pos,$n)
# return a safe substring (treats utf-8 sequences as a single character)
sub safeSubstr {
    my ($string,$pos,$n) = @_;
    my $s = safeString($string);
    my $p = 0;
    my $rPos = 0;
    my $rEnd = -1;
    my @s = unpack('C*',$s);
    for (my $i = 0; $i < scalar(@s);) {
        if ($s[$i] < 128) {
            $i++;
        } elsif ($s[$i] > 255) {
            $i++;
        } else {
            $i += _charBytes($i,@s);
        }
        $p++;
        $rPos = $i   if $p == $pos;
        $rEnd = $i-1 if $p == $pos + $n;
    }
    $rEnd = scalar(@s) - 1 if $rEnd < 0;
    return "" if $rPos > $rEnd;
    my @r;
    map { push @r, $s[$_] } $rPos..$rEnd;
    return pack("C*",@r);
}

########################################
# testString($string)
# returns information about the characters in the string
#
# The 1, 2, and 4 bits of the result are for ASCII, UTF-8, and
# wide characters respectively. If multiple bits are set,
# characters of each type appear in the string.
# If the result is:
#   <= 1        simple ASCII string
#   <= 3        simple UTF-8 string
#   >3 && <= 5  mixed ASCII & wide characters
#   >= 6        mixed UTF-8 & wide characters
sub testString {
    my ($s) = @_;
    return undef unless defined($s);
    my $r = 0;
    my @s = unpack('C*',$s);
    for (my $i = 0; $i < scalar(@s);) {
        if ($s[$i] < 128) {
            $r |= 1;
            $i++;
        } elsif ($s[$i] > 255) {
            $r |= 4;
            $i++;
        } else {
            $r |= 2;
            $i += _charBytes($i,@s);
        }
    }
    return $r;
}

########################################
# trueLength($string)
# returns the number of UTF-8 characters in a string
sub trueLength {
    my ($s) = @_;
    return unless defined($s);
    my $len = 0;
    my @s = unpack('C*',$s);
    for (my $i = 0; $i < scalar(@s);) {
        if ($s[$i] < 128) {
            $i++;
        } elsif ($s[$i] > 255) {
            $i++;
        } else {
            $i += _charBytes($i,@s);
        }
        $len++;
    }
    return $len;
}

########################################
# String support routines
sub _charBytes {
    my $n = shift(@_);
    my $len = scalar(@_);
    if    ($_[$n] < 128)            { return 1; }
    elsif ($_[$n] > 65535)          { return 4; }
    elsif ($_[$n] > 255)            { return 2; }
    elsif (($_[$n] & 0xFC) == 0xFC) { return min(6,$len); }
    elsif (($_[$n] & 0xF8) == 0xF8) { return min(5,$len); }
    elsif (($_[$n] & 0xF0) == 0xF0) { return min(4,$len); }
    elsif (($_[$n] & 0xE0) == 0xE0) { return min(3,$len); }
    elsif (($_[$n] & 0xC0) == 0xC0) { return min(2,$len); }
    else                            { return 1; }
}

[Nov 17, 2017] Indexed Flat File databases (for ISAM, NoSQL, Perl Embedded databases)

Nov 17, 2017 | perlmonks.com


by erichansen1836 on Oct 08, 2017 at 11:13

TOPIC: FAST!! Random-access, indexed, relational flat-file databases, indexed by external Perl SDBM databases of key/value pairs tied to in-memory hash tables. The key in each key/value pair is one or more fields and/or partial fields from the flat-file records, concatenated together and separated by a delimiter such as a pipe ("|"); this lets you arbitrarily seek to a single record, or to a sorted/related group of records, within your database (a minimal sketch follows below).

Since it has been over two years since I first posted about this topic, I wanted to alert the Perl community to the original thread, where you can now find Perl source code with examples of how to implement this joint database technology/methodology. In particular, see the King James Bible Navigator software DEMO I posted, which uses FlatFile/SDBM for its database. I have made it a native Windows GUI application (a TreeView/RichEdit combo interface) to demonstrate how to show end-users a summary of the information contained within a database and let them drill down to a small amount of specific information (e.g. verses within a single book/chapter) for actual viewing (and retrieval from the database). The TreeView double-click event was originally written to random-access the first verse within a chapter, then sequentially access the remaining verses, performing a READ for each verse. I posted a separate, modified TreeView double-click event for you to insert into the application which reads an entire chapter in one giant READ, breaking out the individual verses (into an array) using the unpack statement. -- Eric

Joint Database Technology: http://www.perlmonks.org/?node_id=1121222
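
A minimal sketch of the indexing idea described above (not Eric's code): an SDBM file maps a pipe-delimited key to a record number in a fixed-length flat file, so a lookup is a hash fetch plus a seek. The file names, key format, and record length are assumptions for illustration only.

use strict;
use warnings;
use Fcntl;
use SDBM_File;

my $RECLEN = 128;                                  # fixed record length (assumption)
tie my %index, 'SDBM_File', 'bible.idx', O_RDWR | O_CREAT, 0666
    or die "Cannot tie index: $!";
open my $dat, '<', 'bible.dat' or die "Cannot open data file: $!";

my $key   = 'Genesis|1|1';                         # book|chapter|verse (assumed key format)
my $recno = $index{$key} // die "No such record";
seek $dat, $recno * $RECLEN, 0 or die "seek failed: $!";
read $dat, my $record, $RECLEN;
print "$record\n";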

[Nov 17, 2017] How do the Perl 6 set operations compare elements?

Nov 17, 2017 | stackoverflow.com


brian d foy ,Nov 26, 2016 at 4:32

Running under moar (2016.10)

Consider this code that constructs a set and tests for membership:

my $num_set = set( < 1 2 3 4 > );
say "set: ", $num_set.perl;
say "4 is in set: ", 4 ∈ $num_set;
say "IntStr 4 is in set: ", IntStr.new(4, "Four") ∈ $num_set;
say "IntStr(4,...) is 4: ", IntStr.new(4, "Four") == 4;
say "5 is in set: ", 5 ∈ $num_set;

A straight 4 is not in the set, but the IntStr version is:

set: set(IntStr.new(4, "4"),IntStr.new(1, "1"),IntStr.new(2, "2"),IntStr.new(3, "3"))
4 is in set: False
IntStr 4 is in set: True
IntStr(4,...) is 4: True
5 is in set: False

I think most people aren't going to expect this, but the docs don't say anything about how this might work. I don't have this problem if I don't use the quote words (i.e. set( 1, 2, 3, 4 ) ).

timotimo ,Nov 26, 2016 at 5:47

You took a wrong turn in the middle. The important part is what nqp::existskey is called with: the k.WHICH . This method is there for value types, i.e. immutable types where the value - rather than identity - defines if two things are supposed to be the same thing (even if created twice). It returns a string representation of an object's value that is equal for two things that are supposed to be equal. For <1>.WHICH you get IntStr|1 and for 1.WHICH you get just Int|1 .

brian d foy ,Nov 26, 2016 at 6:18

Ah, okay. I can see a lot of pain for regular people trying to debug these things. – brian d foy Nov 26 '16 at 6:18

smls ,Nov 26, 2016 at 14:46

As explained in the Set documentation, sets compare object identity, same as the === operator:

Within a Set, every element is guaranteed to be unique (in the sense that no two elements would compare positively with the === operator)

The identity of an object is defined by the .WHICH method, as timotimo elaborates in his answer.

brian d foy ,Nov 26, 2016 at 21:28

That's not really clear from that statement. That's talking about which elements are in the set. Beyond that, even if you choose to compare with ===, you have to know how other things are stored. This is the sort of info that should show up next to the Set operators. – brian d foy Nov 26 '16 at 21:28

brian d foy ,Nov 26, 2016 at 23:09

Indeed, I think I've found a bug. The qw docs says this should be true: < a b 137 > eqv ( 'a', 'b', '137' ) , but in the same version of Rakudo Star I get false. It's different object types on each side. – brian d foy Nov 26 '16 at 23:09

brian d foy ,Nov 26, 2016 at 23:16

Despite all this, your answer was the A-ha! moment that led me to look at the right thing. Thanks for all of your help. – brian d foy Nov 26 '16 at 23:16

raiph ,Nov 27, 2016 at 4:50

Write your list of numbers using commas

As you mention in your answer, your code works if you write your numbers as a simple comma separated list rather than using the <...> construct.

Here's why:

4 ∈ set 1, 2, 3, 4 # True

A bare numeric literal in code, like the 4 to the left of the ∈, constructs a single value with a numeric type. (In this case the type is Int, an integer.) If a set constructor receives a list of similar literals on the right then everything works out fine.

<1 2 3 4> produces a list of "dual values"

The various <...> "quote words" constructs turn the list of whitespace separated literal elements within the angle brackets into an output list of values.

The foundational variant ( qw<...> ) outputs nothing but strings. Using it for your use case doesn't work:

4 ∈ set qw<1 2 3 4> # False

The 4 on the left constructs a single numeric value, type Int. In the meantime the set constructor receives a list of strings, type Str: ('1','2','3','4'). The ∈ operator doesn't find an Int in the set because all the values are Strs, so it returns False.

Moving along, the huffmanized <...> variant outputs Str s unless an element is recognized as a number. If an element is recognized as a number then the output value is a "dual value". For example a 1 becomes an IntStr .

According to the doc "an IntStr can be used interchangeably where one might use a Str or an Int". But can it?

Your scenario is a case in point. While 1 ∈ set 1,2,3 and <1> ∈ set <1 2 3> both work, 1 ∈ set <1 2 3> and <1> ∈ set 1, 2, 3 both return False .

So it seems the ∈ operator isn't living up to the quoted doc's claim of dual-value interchangeability.

This may already be recognized as a bug in the set operation and/or other operations. Even if not, this sharp "dual value" edge of the <...> list constructor may eventually be viewed as sufficiently painful that Perl 6 needs to change.

brian d foy ,Nov 26, 2016 at 23:29

I think this is a bug, but not in the set stuff. The other answers were very helpful in sorting out what was important and what wasn't.

I used the angle-brackets form of the quote words . The quote words form is supposed to be equivalent to the quoting version (that is, True under eqv ). Here's the doc example:

<a b c> eqv ('a', 'b', 'c')

But, when I try this with a word that is all digits, this is broken:

 $ perl6
 > < a b 137 > eqv ( 'a', 'b', '137' )
 False

But, the other forms work:

> qw/ a b 137 / eqv ( 'a', 'b', '137' )
True
> Q:w/ a b 137 / eqv ( 'a', 'b', '137' )
True

The angle-bracket word quoting uses IntStr :

> my @n = < a b 137 >
[a b 137]
> @n.perl
["a", "b", IntStr.new(137, "137")]

Without the word quoting, the digits word comes out as [Str]:

> ( 'a', 'b', '137' ).perl
("a", "b", "137")
> ( 'a', 'b', '137' )[*-1].perl
"137"
> ( 'a', 'b', '137' )[*-1].WHAT
(Str)
> my @n = ( 'a', 'b', '137' );
[a b 137]
> @n[*-1].WHAT
(Str)

You typically see these sorts of errors when there are two code paths to get to a final result instead of shared code that converges to one path very early. That's what I would look for if I wanted to track this down (but, I need to work on the book!)

This does highlight, though, that you have to be very careful about sets. Even if this bug were fixed, there are other, non-buggy ways that eqv can fail. I would still have failed because 4 as Int is not "4" as Str. I think this level of attention to data types is unperly in its DWIMery. It's certainly something I'd have to explain very carefully in a classroom and still watch everyone mess up on it.

For what it's worth, I think the results of gist tend to be misleading in their oversimplification, and sometimes the results of perl aren't rich enough (e.g. hiding Str which forces me to .WHAT ). The more I use those, the less useful I find them.

But, knowing that I messed up before I even started would have saved me from that code spelunking that ended up meaning nothing!

Christoph ,Nov 26, 2016 at 23:55

Could you clarify what you consider the bug to be? As far as I can tell, this is all by design: (a) <...> goes through &val , which returns allomorphs if possible (b) set membership is defined in terms of identity, which distinguishes between allomorphs and their corresponding value types; so I would not classify it as a bug, but 'broken' by design; or phrased another way, it's just the WAT that comes with this particular DWIMChristoph Nov 26 '16 at 23:55

Brad Gilbert ,Nov 26, 2016 at 23:59

This was intentionally added, and is part of the testsuite . ( I can't seem to find anywhere that tests for < > being equivalent to q:w:v< > and << >> / " " being equivalent to qq:ww:v<< >> ) – Brad Gilbert Nov 26 '16 at 23:59

brian d foy ,Nov 27, 2016 at 0:02

The docs say the two lists should be eqv, and they are not. If they are not meant to be equivalent, the docs need to change. Nothing in docs.perl6.org/language/quoting#Word_quoting:_qw mentions any of this stuff. – brian d foy Nov 27 '16 at 0:02

Christoph ,Nov 27, 2016 at 0:17

The documentation seems to be just wrong here, <...> does not correspond to qw(...) , but qw:v(...) . Cf S02 for the description of the adverb and this test that Brad was <del>looking for</del> already linked toChristoph Nov 27 '16 at 0:17

Christoph ,Nov 27, 2016 at 0:45

or perhaps not outright wrong, but rather 'just' misleading: <...> is indeed a :w form, and the given example code does compare equal according to eqvChristoph Nov 27 '16 at 0:45

dwarring ,Nov 27, 2016 at 18:33

Just to add to the other answers and point out a consistency here between sets and object hashes.

An object hash is declared as my %object-hash{Any} . This effectively hashes on the objects' .WHICH method, which is similar to how sets distinguish individual members.

Substituting the set with an object hash:

my %obj-hash{Any};

%obj-hash< 1 2 3 4 > = Any;
say "hash: ", %obj-hash.keys.perl;
say "4 is in hash: ", %obj-hash{4}:exists;
say "IntStr 4 is in hash: ", %obj-hash{ IntStr.new(4, "Four") }:exists;
say "IntStr(4,...) is 4: ", IntStr.new(4, "Four") == 4;
say "5 is in hash: ", %obj-hash{5}:exists;

gives similar results to your original example:

hash: (IntStr.new(4, "4"), IntStr.new(1, "1"), IntStr.new(2, "2"), IntStr.new(3, "3")).Seq
4 is in hash: False
IntStr 4 is in hash: True
IntStr(4,...) is 4: True
5 is in hash: False

brian d foy ,Nov 29, 2016 at 21:52

Oh, this is going to suck hard while teaching a class. – brian d foy Nov 29 '16 at 21:52

dwarring ,Nov 30, 2016 at 4:21

I agree its not great, as it is. – dwarring Nov 30 '16 at 4:21

dwarring ,Nov 30, 2016 at 18:26

Have raised an RT rt.perl.org/Ticket/Display.html?id=130222dwarring Nov 30 '16 at 18:26

[Nov 17, 2017] Introducing Test::Simple for testing Perl programs - YouTube

Nov 17, 2017 | www.youtube.com

Published on Oct 17, 2015

For details visit: http://perlmaven.com/introducing-test...

[Nov 17, 2017] Bruce Gray - Your Perl 5 Brain, on Perl 6 by Bruce Gray

Nov 17, 2017 | www.youtube.com

Published on Jun 21, 2017

In which I detail the Perl 6 elements that have most changed my Perl 5 coding, and share the Perl 5 techniques I have adopted.

I eat, sleep, live, and breathe Perl!

Consultant and contract programmer; frequent PerlMongers speaker; dedicated Shakespeare theater-goer; armchair mathematician; author of Blue_Tiger, a tool for modernizing Perl.

36 years coding, 22 years Perl, 16 years married, 15 YAPCs, 7 hackathons, 3 PerlWhirls. Perl interests: refactoring, Perl idioms / micropatterns, RosettaCode, and Perl 6.

[Nov 17, 2017] Bit operations in Perl

Nov 17, 2017 | stackoverflow.com


Toren ,Jan 12, 2011 at 14:50

I have a 32-bit attribute in which each bit is responsible for a specific piece of functionality. The Perl script I'm writing should turn on the 4th bit but preserve the previous settings of the other bits.

I use in my program:

sub BitOperationOnAttr
{
    my $a = "";
    MyGetFunc( $a );
    $a |= 0x00000008;
    MySetFunc( $a );
}

** MyGetFunc / MySetFunc are my own functions that read/set the value.

Questions:

  1. Is the usage of $a |= 0x00000008; correct?
  2. How do I extract the hex value with a regular expression from a string I have? For example:

"Attribute: Somestring: value (8 long (0x8))"

Michael Carman ,Jan 12, 2011 at 16:13

Your questions are not related; they should be posted separately. That makes it easier for other people with similar questions to find them. – Michael Carman Jan 12 '11 at 16:13

toolic ,Jan 12, 2011 at 16:47

Same question asked on PerlMonks: perlmonks.org/?node_id=881892toolic Jan 12 '11 at 16:47

psmears ,Jan 12, 2011 at 15:00

  1. Is the usage of $a |= 0x00000008; correct?

Yes, this is fine.

  2. How do I extract the hex value with a regular expression from a string I have? For example:

"Attribute: Somestring: value (8 long (0x8))"

I'm assuming you have a string like the above, and want to use a regular expression to extract the "0x8". In that case, something like:

if ($string =~ m/0x([0-9a-fA-F]+)/) {
    $value = hex($1);
} else {
    # string didn't match
}

should work.

Michael Carman ,Jan 12, 2011 at 16:32

Perl provides several ways for dealing with binary data:

Your scenario sounds like a set of packed flags. The bitwise operators are a good fit for this:

my $mask = 1 << 3;   # 0x0008
$value |=  $mask;    # set bit
$value &= ~$mask;    # clear bit
if ($value & $mask)  # check bit

vec is designed for use with bit vectors. (Each element has the same size, which must be a power of two.) It could work here as well:

vec($value, 3, 1) = 1;  # set bit
vec($value, 3, 1) = 0;  # clear bit
if (vec($value, 3, 1))  # check bit

pack and unpack are better suited for working with things like C structs or endianness.

sdaau ,Jul 15, 2014 at 5:01

I upvoted, but there is something very important missing: vec operates on a string!

If we use a number, say:

$val = 5;
printf("b%08b", $val);

(this gives b00000101) -- then one can see that the "check bit" syntax, say:

for ($ix = 7; $ix >= 0; $ix--) {
    print vec($val, $ix, 1);
}
print "\n";

will not work (it gives 00110101, which is not the same number). The correct way is to convert the number to an ASCII char, i.e.

print vec(sprintf("%c", $val), $ix, 1);

sdaau Jul 15 '14 at 5:01

[Nov 17, 2017] date - How to convert epoch seconds to normal time in perl - Stack Overflow

Nov 17, 2017 | stackoverflow.com

confused ,2 days ago

I have a string of epoch seconds, "1510652305", which I convert to normal time on the Unix command line using
`date -d @1510652305`

and I get Tue Nov 14 15:08:25 IST 2017

But when I tried it in Perl using something like this

use POSIX qw(strftime);
use Time::Local;
use Time::localtime;

$kickoff_time=1510652305;
$kickoff_time=ctime($kickoff_time);

i get

Thu Jan 1 05:30:00 1970

How can I achieve in Perl the result I am getting on Linux?

Thanks!!

mwp ,2 days ago

Don't overthink it!
my $kickoff_time = localtime 1510652305;
say $kickoff_time; # Tue Nov 14 15:08:25 2017

If you absolutely, positively need the timezone in there:

use POSIX qw{strftime};

my $kickoff_time = strftime '%a %b %e %H:%M:%S %Z %Y', localtime 1510652305;
say $kickoff_time; # Tue Nov 14 15:08:25 IST 2017

Note that this is locale-dependent.

confused ,yesterday

We have to use localtime to convert in time from epoch seconds and gmtime to convert in time from normal seconds i got it now....Thanks!! – confused yesterday

mob ,yesterday

Still confused. Both localtime and gmtime expect the input to be epoch seconds. – mob yesterday

mwp ,4 hours ago

Exactly. To expound, localtime() takes the epoch and returns a string (or date parts array) representing the time in your local timezone; gmtime() takes the epoch and returns a string (or date parts array) representing the time in UTC. – mwp 4 hours ago


I would recommend using Time::Piece for this job - it's core in perl.
#!/usr/bin/env perl

use strict;
use warnings; 
use Time::Piece;

my $t = localtime ( 1510652305 );

print $t;

It'll print the default format, or you can produce formatted output using strftime:
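
For example, a small sketch of formatted output with Time::Piece's strftime (the format string is just an illustration):

use Time::Piece;

my $t = localtime(1510652305);
print $t->strftime('%a %b %e %H:%M:%S %Y'), "\n";   # e.g. Tue Nov 14 15:08:25 2017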

[Nov 17, 2017] perl modules

Nov 17, 2017 | perlmonks.com

Discipulus (Monsignor) on Nov 16, 2017 at 09:04 UTC

Re: perl modules

Hello codestroman and welcome to the monastery and to the wonderful world of Perl!

First of all, please, add <c> code tags </c> around your code and output.

Then be sure to have read the standard documentation: perlmod and perlnewmod

In fact a basic Perl module defines a package and uses Exporter to export functions into the program that uses it.

In my home node I've collected a lot of links about module creation

L*


Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.

thanos1983 (Priest) on Nov 16, 2017 at 09:17 UTC

Re: perl modules

Hello codestroman

Just to add a minor suggestion here, to the comprehensive reply of fellow monk Discipulus: it will also help you a lot to read the Simple Module Tutorial

Update: Direct answer to your question can be found here How to add a relative directory to @INC with multiple possible solutions. I would strongly recommend to go through all the articles that all monks proposed.

Hope this helps, BR.

Seeking for Perl wisdom...on the process of learning...not there...yet!

hippo (Abbot) on Nov 16, 2017 at 09:21 UTC

Re: perl modules (Can't locate in @INC)
PLEASE HELP!!

This is a monastery - a place of quiet contemplation. The louder you shout the less wisdom shall you receive.

The error message Can't locate dog.pm in @INC is pretty explicit. Either your module file is not called dog.pm in which case, change it or else your file dog.pm is not in any of the directories listed in @INC in which case either move it to one of those directories or else change @INC with use lib .
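
For instance, a minimal sketch of the use lib approach (the paths are placeholders; dog.pm is the module from the question):

use strict;
use warnings;

use lib '/home/user/perl/lib';     # absolute path to the directory holding dog.pm, or:
use FindBin qw($Bin);
use lib "$Bin/lib";                # a directory relative to the script's own location

use dog;                           # now Perl can locate dog.pm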

I also see, despite the lack of formatting in your post that your module doesn't use any namespace. You should probably address that. Perhaps a solid read through Simple Module Tutorial would be a good idea?

Anonymous Monk on Nov 16, 2017 at 09:07 UTC

Re: perl modules

use an absolute pathname in use lib

Anonymous Monk on Nov 16, 2017 at 15:16 UTC

Re: perl modules

Welcome to the language ... and, to the Monastery. The "simple module tutorial" listed above is a very good place to start. Like all languages of its kind, Perl looks at runtime for external modules in a prescribed list of places, in a specified order. You can affect this in several ways, as the tutorials describe. Please read them carefully.

In the Perl(-5) language, this list is stored in a pre-defined array variable called @INC and it is populated from a variety of sources: a base-list that is compiled directly into the Perl interpreter, the PERL5LIB environment-variable, use lib statements, and even direct modification of the variable itself. Perl searches this list from beginning to end and processes (only) the first matching file that it finds.

(Note that, in Perl, the use statement is actually a pragma, or declaration to the compiler, and as such it has many "uses" and a somewhat complicated syntax.)

Corion (Pope) on Nov 16, 2017 at 15:23 UTC

Re^2: perl modules


by Corion (Pope) on Nov 16, 2017 at 15:23 UTC

(Note that, in Perl, the use statement is actually a pragma, or declaration to the compiler, and as such it has many "uses" and a somewhat complicated syntax.)

Please no.

The word "pragma" has a special meaning in Perl, and it is highly confusing to claim that a Perl "keyword" would be a "pragma". use certainly is a keyword and nothing else.

If you mean to say something different, please describe in more words what you want to say.

[Nov 17, 2017] Using the built-in debugger of Perl by Gabor Szabo

Youtube video, 9 min.
Nov 17, 2017 | www.youtube.com

Perl comes with a very powerful built-in command line debugger. In this screencast you can see basics how to use it.

For blog entries and for more screencasts see http://perlmaven.com/

About Perl Programming and Perl programmers.

For the blog entry of this screencast visit
http://perlmaven.com/using-the-built-...

Debugger commands used:

q - quit,
h - help,
p - print,
s - step in,
n - step over,
r - step out,
T - stack trace
l - listing code
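
To put those commands in context, a minimal hedged sample session might look like this (the script name, prompts, and variable are illustrative only):

$ perl -d myscript.pl      # start the script under the debugger
  DB<1> l                  # list source around the current line
  DB<2> s                  # step into the next statement
  DB<3> n                  # step over the next statement
  DB<4> p $count           # print the value of a (hypothetical) variable
  DB<5> T                  # show the current stack trace
  DB<6> q                  # quit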

The Padre project can be found here: http://padre.perlide.org/

The book mentioned was Pro Perl Debugging: http://www.apress.com/9781590594544

If you are interested in on-site Perl training, contact me: http://szabgab.com/contact.html

[Nov 17, 2017] Modern Perl by chromatic

Notable quotes:
"... 'shift key br0ken' ..."
"... # appease the Mitchell estate ..."
Nov 17, 2017 | www.amazon.com

Regex Modifiers

Several modifiers change the behavior of the regular expression operators. These modifiers appear at the end of the match, substitution, and qr// operators. For example, here's how to enable case-insensitive matching:

my $pet = 'ELLie';
like $pet, qr/Ellie/, 'Nice puppy!';
like $pet, qr/Ellie/i, 'shift key br0ken';

The first like() will fail because the strings contain different letters. The second like() will pass, because the /i modifier causes the regex to ignore case distinctions. E and e are effectively equivalent in the second regex due to the modifier.

You may also embed regex modifiers within a pattern:

my $find_a_cat = qr/(?<feline>(?i)cat)/;

The (?i) syntax enables case-insensitive matching only for its enclosing group -- in this case, the named capture. You may use multiple modifiers with this form. Disable specific modifiers by preceding them with the minus character ( - ):

my $find_a_rational = qr/(?<number>(?-i)Rat)/;

... ... ...

The /e modifier lets you write arbitrary code on the right side of a substitution operation. If the match succeeds, the regex engine will use the return value of that code as the substitution value. The earlier global substitution example could be simpler with code like the following:

# appease the Mitchell estate
$sequel =~ s{Scarlett( O'Hara)?}
            {
              'Mauve' . defined $1
                      ? ' Midway'
                      : ''
            }ge;

Each additional occurrence of the /e modifier will cause another evaluation of the result of the expression, though only Perl golfers use anything beyond /ee.
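
A small illustration of /e (not from the book; the variable is hypothetical) -- the right side of the substitution is evaluated as Perl code and its result becomes the replacement:

my $prices = "apple=3 banana=5";
$prices =~ s/(\d+)/$1 * 2/ge;      # double every number
print "$prices\n";                 # apple=6 banana=10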

[Nov 16, 2017] Connecting with NetFTP in Perl, but fails to upload - Stack Overflow

Nov 16, 2017 | stackoverflow.com

Andrew Newby, Nov 10 at 11:33

I am trying to use Net::FTP ( http://search.cpan.org/~shay/libnet-3.10/lib/Net/FTP.pm ) to upload a file to a remote server. I have:
use Net::FTP;

my $ftp = Net::FTP->new("example.com", Debug => 1) or die "Cannot connect to example.com: $@";

$ftp->login("username",'xxxx') or die "Cannot login ", $ftp->message;
$ftp->pasv();
$ftp->binary();
$ftp->cwd("/web/example.com/public_html/cgi-bin/links/admin/IMPORT") or die "Cannot change working directory ", $ftp->message;
print "Currently in: " . $ftp->pwd(). "\n";
$ftp->put("/home/chambres/web/example.com/public_html/cgi-bin/links/admin/org.csv") or die "Cannot upload ", $ftp->message;
$ftp->quit;

However, when I run it I get:

Net::FTP>>> Net::FTP(3.05)
Net::FTP>>>   Exporter(5.72)
Net::FTP>>>   Net::Cmd(3.05)
Net::FTP>>>   IO::Socket::SSL(2.024)
Net::FTP>>>     IO::Socket::IP(0.37)
Net::FTP>>>       IO::Socket(1.38)
Net::FTP>>>         IO::Handle(1.35)
Net::FTP=GLOB(0x182e348)<<< 220 (vsFTPd 3.0.3)
Net::FTP=GLOB(0x182e348)>>> USER username
Net::FTP=GLOB(0x182e348)<<< 331 Please specify the password.
Net::FTP=GLOB(0x182e348)>>> PASS ....
Net::FTP=GLOB(0x182e348)<<< 230 Login successful.
Net::FTP=GLOB(0x182e348)>>> EPSV
Net::FTP=GLOB(0x182e348)<<< 229 Entering Extended Passive Mode (|||12065|)
Net::FTP=GLOB(0x182e348)>>> TYPE I
Net::FTP=GLOB(0x182e348)<<< 200 Switching to Binary mode.
Net::FTP=GLOB(0x182e348)>>> CWD /web/example.com/public_html/cgi-bin/links/admin/IMPORT
Net::FTP=GLOB(0x182e348)<<< 250 Directory successfully changed.
Net::FTP=GLOB(0x182e348)>>> PWD
Net::FTP=GLOB(0x182e348)<<< 257 "/web/example.com/public_html/cgi-bin/links/admin/IMPORT" is the current directory
Currently in: /web/example.com/public_html/cgi-bin/links/admin/IMPORT
Net::FTP=GLOB(0x182e348)>>> PORT 139,162,208,252,155,199
Net::FTP=GLOB(0x182e348)<<< 200 PORT command successful. Consider using PASV.
Net::FTP=GLOB(0x182e348)>>> FEAT
Net::FTP=GLOB(0x182e348)<<< 211-Features:
Net::FTP=GLOB(0x182e348)<<<  EPRT
Net::FTP=GLOB(0x182e348)<<<  EPSV
Net::FTP=GLOB(0x182e348)<<<  MDTM
Net::FTP=GLOB(0x182e348)<<<  PASV
Net::FTP=GLOB(0x182e348)<<<  REST STREAM
Net::FTP=GLOB(0x182e348)<<<  SIZE
Net::FTP=GLOB(0x182e348)<<<  TVFS
Net::FTP=GLOB(0x182e348)<<< 211 End
Net::FTP=GLOB(0x182e348)>>> HELP ALLO
Net::FTP=GLOB(0x182e348)<<< 214-The following commands are recognized.
Net::FTP=GLOB(0x182e348)<<<  ABOR ACCT ALLO APPE CDUP CWD  DELE EPRT EPSV FEAT HELP LIST MDTM MKD
Net::FTP=GLOB(0x182e348)<<<  MODE NLST NOOP OPTS PASS PASV PORT PWD  QUIT REIN REST RETR RMD  RNFR
Net::FTP=GLOB(0x182e348)<<<  RNTO SITE SIZE SMNT STAT STOR STOU STRU SYST TYPE USER XCUP XCWD XMKD
Net::FTP=GLOB(0x182e348)<<<  XPWD XRMD
Net::FTP=GLOB(0x182e348)<<< 214 Help OK.
Net::FTP=GLOB(0x182e348)>>> ALLO 37954326
Net::FTP=GLOB(0x182e348)<<< 202 ALLO command ignored.
Net::FTP=GLOB(0x182e348)>>> STOR org.csv
Net::FTP=GLOB(0x182e348)<<< 425 Failed to establish connection.
<h1>Software error:</h1>
<pre>Cannot upload Failed to establish connection.
</pre>
<p>
For help, please send mail to this site's webmaster, giving this error message
and the time and date of the error.

</p>
[Fri Nov 10 10:57:33 2017] export-csv-other-sites.cgi: Cannot upload Failed to establish connection.

It seems to work up until the put() command. Any ideas as to what is going on?

Gerhard Barnard, Nov 10 at 11:36

huh? $ftp->put("/home/chambres/web/example.com/public_html/cgi- That seems incomplete. – Gerhard Barnard Nov 10 at 11:36

Gerhard Barnard, Nov 10 at 11:38

Secondly, it is not connecting. It tells you that twice Net::FTP=GLOB(0x182e348)<<< 425 Failed to establish connection. and then again <pre>Cannot upload Failed to establish connectionGerhard Barnard Nov 10 at 11:38

Andrew Newby, Nov 10 at 11:41

@GerhardBarnard - I know that :) The weird part, is that it says it IS connected: Currently in: /web/example.com/public_html/cgi-bin/links/admin/IMPORT . – Andrew Newby Nov 10 at 11:41

Andrew Newby, Nov 10 at 11:42

"That seems incomplete." - what seems incomplete? – Andrew Newby Nov 10 at 11:42

Gerhard Barnard, Nov 10 at 11:43

I suspect it is not keeping the connection open. can you also fix the code? it seems incomplete. $ftp->put("/home/chambres/web/example.com/public_html/cgi-Gerhard Barnard Nov 10 at 11:43
Net::FTP=GLOB(0x182e348)>>> PORT 139,162,208,252,155,199
Net::FTP=GLOB(0x182e348)<<< 200 PORT command successful. Consider using PASV.

FTP uses a control connection for the command and data connections for each data transfer. With the PORT command your local system is instructing the server to connect to the given IP address (139.162.208.252) and port (39879=155*256+199). Connecting from outside to some arbitrary port on your system will not work if you are behind a firewall or some NAT or if there is a firewall configured on your system. In these cases it might work to use the passive mode where the client opens a connection to the server and not the server a connection to the client.

Net::FTP=GLOB(0x182e348)>>> STOR org.csv
Net::FTP=GLOB(0x182e348)<<< 425 Failed to establish connection.

It looks like the server could not connect to your system in order to create a connection to transfer the data. Probably a firewall or NAT involved. Try passive mode.

It looks like that you tried to use passive mode already:

$ftp->pasv();
...
Net::FTP=GLOB(0x182e348)>>> EPSV
Net::FTP=GLOB(0x182e348)<<< 229 Entering Extended Passive Mode (|||12065|)

Only you did it the wrong way. The command above just sends the PASV/EPSV command to the server but does not change which mode gets used for the next data transfer. To cite from the documentation :

If for some reason you want to have complete control over the data connection, this includes generating it and calling the response method when required, then the user can use these methods to do so.
However calling these methods only affects the use of the methods above that can return a data connection. They have no effect on methods get, put, put_unique and those that do not require data connections.

To instead enable passive mode in connection with put, get, etc., use passive, not pasv:

$ftp->passive(1);
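
A hedged sketch of the fix in context, reusing the host, credentials, and paths from the question; passive mode can also be requested directly in the constructor with the Passive option:

use Net::FTP;

my $ftp = Net::FTP->new("example.com", Debug => 1, Passive => 1)
    or die "Cannot connect to example.com: $@";
$ftp->login("username", 'xxxx') or die "Cannot login ", $ftp->message;
$ftp->binary();
$ftp->cwd("/web/example.com/public_html/cgi-bin/links/admin/IMPORT")
    or die "Cannot change working directory ", $ftp->message;
$ftp->put("/home/chambres/web/example.com/public_html/cgi-bin/links/admin/org.csv")
    or die "Cannot upload ", $ftp->message;
$ftp->quit;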

[Nov 16, 2017] Re^4 Strange behaviour of tr function in case the set1 is supplied by a variable

Nov 16, 2017 | perlmonks.com

likbez

// is an abbreviation for m// (be careful of context). But // can be replaced by (almost?) any delimiter, by using m or s or tr.

You make a very good point. Now I have started to understand why they put the description of tr, which is actually a function, into this strange place:

http://perldoc.perl.org/perlop.html#Quote-Like-Operators
Strings with arbitrary delimiters after tr, m, s, etc. are a special, additional type of literal, each with its own rules. And those rules are different from the rules that exist for single-quoted strings, double-quoted strings, or regexes (the three most popular types of literals in Perl).

For example, the treatment of backslash in "tr literal" is different from single quoted strings:

"A single-quoted, literal string. A backslash represents a backslash unless followed by the delimiter or another backslash, in which case the delimiter or backslash is interpolated."

This means that in Perl there are a dozen or so different types of literals, each with its own idiosyncratic rules. This creates confusion even for long-time Perl users, as they tend to forget the details of constructs they use rarely and extrapolate them from more often used constructs.

For example, in my case, I was burned by the fact that "m literals" allow interpolation of variables, but "tr literals" do not. I even created a test case to study this behavior :-)

In other words, the nature of those "context-dependent literals" (on the level of the lexical scanner they are all literals) is completely defined not by the delimiters they use (which are arbitrary), but by the operator that precedes them. If there is none, m is assumed.

This "design decision" (in retrospect it is a design decision, although in reality it was an "absence of a design decision" situation ;-) adds unnecessary complexity to the language and several new (and completely unnecessary) types of bugs.

This "design decision" is also poorly documented, and for typical "possible blunders" (for tr that would be usage of "[", "$", "@" without a preceding backslash) there are no warnings.

This trick of putting the tr description into http://perldoc.perl.org/perlop.html that I mentioned before can now be viewed as an attempt to hide this additional complexity. It might be beneficial to revise the docs along the lines I proposed.

In reality, in Perl q, qq, qr, m, s, and tr are functions, each of which accepts (and interprets) a specific, unique type of "context-dependent literal" as its argument. That's the reality of this pretty unique situation with the language, as I see it.

Quote-Like-Operators shows 2 interesting examples with tr: tr[aeiouy][yuoiea] or tr(+\-*/)/ABCD/.
The second variant looks like a perversion to me. I never thought that this was possible. I thought that the "arbitrary delimiter" is "caught" after the operator, and after that the delimiters should be uniform within the operator ;-).

And the first is not without problems either: if you "extrapolate" your regex skills into tr, you can write, instead of tr[aeiouy][yuoiea], the obviously incorrect tr/[aeiouy]/[yuoiea]/, which will work fine as long as the strings are of equal length.
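
As an aside illustrating the point about tr/// not interpolating variables, the usual workaround is to build the operator at run time with a string eval (a sketch; the variables are hypothetical):

my $str  = "hello world";
my $from = "aeiou";
my $to   = "AEIOU";
eval "\$str =~ tr/$from/$to/";     # tr/// lists are fixed at compile time,
die $@ if $@;                      # so interpolation requires a string eval
print "$str\n";                    # hEllO wOrld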

[Nov 16, 2017] Accessing a filehandle which is defined in main program from different modules

Nov 11, 2017 | stackoverflow.com

Rotch Miller, Nov 11 at 6:48

I have following query in Perl regarding the accessing of file handlers.

Consider the following code snippet which describes the exact scenario.

Main.pl
#!/usr/bin/perl -w
use warnings;
use strict;
use strict 'refs';

use File::Basename;
use Fcntl ':flock';

use feature qw/say switch/;

use File::Spec::Functions;
use File::Find;

require( "/home/rxa3kor/Mastering_Perl/sample.pm" );

our $LOGFILE = "sample";
open( LOGFILE, ">$LOGFILE" ) or die "__ERROR: can't open file\n'", $LOGFILE, "'!\n";
flock( LOGFILE, LOCK_EX );
print LOGFILE ( "Tool Start\n" );

&sample::func();

flock( LOGFILE, LOCK_UN );
close( LOGFILE );
sample.pm
#!/usr/bin/perl -w
package sample;

sub func() {
    print $main::LOGFILE ( "Printing in subroutine\n" );
}

when I execute the above said code snippet I am getting the following error.

print() on unopened filehandle Mastering at /home/rxa3kor/Mastering_Perl/sample.pm line 6.

The error is because the filehandle LOGFILE is not visible inside the sample.pm module.

How can this concept be implemented?

I want to open a file in Main.pl and I need this file handle to be accessible in different Perl modules.

Dave Cross ,Nov 11 at 6:54

I don't think this is the code you are using. This code doesn't compile. You are missing a semicolon at the end of the use File::Find line. And once I fix that, I get another problem as you are not loading sample.pm in your main program. Please don't waste our time by posting sample code where we have to fix simple errors like that. – Dave Cross Nov 11 at 6:54

Dave Cross ,Nov 11 at 6:55

Two more errors. sample.pm does not return a true value. And the filename is different between this sample code and the error message that you quote. – Dave Cross Nov 11 at 6:55

Rotch Miller ,Nov 11 at 7:01

Basically I wanted to know whether we can open a file in main.pl and have this filehandle accessible in different Perl modules. – Rotch Miller Nov 11 at 7:01

DavidO ,Nov 11 at 7:11

This is unrelated to the problem that you are asking about, but what do you think will happen when you open your logfile in '>' mode, and then discover you're unable to obtain an exclusive lock because someone else has it locked? – DavidO Nov 11 at 7:11

DavidO ,Nov 11 at 7:13

Hint: Clobber-output mode will clobber the output file before you've obtained a lock. This means if someone else already had the file opened with a lock, you just clobbered them. – DavidO Nov 11 at 7:13

Dave Cross ,Nov 11 at 7:19

The reason why you're seeing this error is that $main::LOGFILE refers to the scalar variable $LOGFILE which contains the filename, sample . The filehandle, LOGFILE , is a completely different variable. And here we see the dangers of having two variables of different types (scalar vs filehandle) with the same name.

Bareword filehandles (the ones in capital letters with no sigil attached, the type you are using) are slightly strange variables. They don't need a sigil, so you shouldn't use one. So the simplest fix is to just remove the $ .

sub func()
{
  print main::LOGFILE ("Printing in subroutine\n");
}

But using global variables like this is a terrible idea. It will quickly lead to your code turning into an unmaintainable mess.

Far better to use a lexical filehandle and to pass that into your subroutine.

our $LOGFILE="sample";
open( my $log_fh, ">$LOGFILE" ) or die "__ERROR: can't open file\n'",$LOGFILE,"'!\n";
flock( $log_fh, LOCK_EX );
print $log_fh ("Tool Start\n");
&sample::func($log_fh);
flock( $log_fh, LOCK_UN );
close( $log_fh );

And in sample.pm :

sub func
{
  my ($fh) = @_;
  print $fh ("Printing in subroutine\n");
}

Note that as I'm now passing a parameter to func(), I've removed the prototype saying that it takes no parameters (although the fact that you were calling it with & turns off prototype checking!)

A few other points.

I'd write your code like this:

# main.pl
use warnings;
use strict;

use File::Basename; # Not used. Remove?
use Fcntl ':flock';
use feature qw/say switch/;
use File::Spec::Functions; # Not used. Remove?
use File::Find; # Not used. Remove?
use Sample;

my $LOGFILE = 'sample';
# Lexical filehandle. Three-arg version of open()
open( my $log_fh, '>', $LOGFILE )
  or die "__ERROR: can't open file\n'$LOGFILE'!\n";
flock( $log_fh, LOCK_EX );

print $log_fh ("Tool Start\n");
sample::func($log_fh);

flock( $log_fh, LOCK_UN );
close( $log_fh );

And...

package Sample;
use strict;
use warnings;

sub func {
  my ($fh) = @_;
  print $fh ("Printing in subroutine\n");
}

1;

Rotch Miller ,Nov 11 at 7:29

Is there any method by which we can avoid passing the filehandle to a subroutine? I need to directly access, in the Perl modules, the filehandle that is opened in main.pl. The reason for this requirement is that I may have different Perl modules and different subroutines inside each module, and every time I would need to pass the filehandle to each of these subroutines. Another difficulty is that a subroutine need not always be called from main.pl; a subroutine defined in one *.pm file may call another subroutine defined in another *.pm module. – Rotch Miller Nov 11 at 7:29

Dave Cross ,Nov 11 at 7:32

@RotchMiller: My answer already tells you how to do that. But I think it's a very bad idea. – Dave Cross Nov 11 at 7:32

Rotch Miller ,Nov 11 at 7:54

The main underlying problem is how the filehandle can be made visible in the subroutines of different Perl modules. Just as we can export scalar variables from one *.pm module to other Perl modules using Exporter, a similar concept for filehandles would be good. – Rotch Miller Nov 11 at 7:54

Dave Cross ,Nov 11 at 7:59

@RotchMiller: Exporter works fine for filehandles. Obviously not if they're lexical variables. But for package variables and bareword filehandles, there's no problem. – Dave Cross Nov 11 at 7:59

Dave Cross ,Nov 11 at 8:03

@RotchMiller The traditional way to make a variable visible within subroutines in many different modules is to pass it in as a parameter. But if you want to ignore seventy years of good software engineering practice - feel free :-) – Dave Cross Nov 11 at 8:03


You've got an extremely detailed analysis from Dave Cross .

Here I'd like to offer a way to cleanly provide a log file for all modules to write to.

Introduce a module that performs the writes to a log file in a sub, and load it in all modules that need it. In that sub, open the log file for appending, using a state filehandle which thus stays open across calls. The modules then write by invoking this sub, and this can be initiated by a call from main .

The logger module

package LogAll;

use warnings;
use strict;
use feature qw(say state);
use Carp qw(croak);    
use Exporter qw(import);

our @EXPORT_OK = qw(write_log);

sub write_log {
    state $fh = do {               # initialize; stays open across calls
        my $log = 'LOG_FILE.txt';
        open my $afh, '>>', $log or croak "Can't open $log: $!";
        $afh;
    };  
    say $fh $_ for @_;
}
1;

Two other modules, that need to log, are virtually the same for this example; here is one

package Mod1;

use warnings;
use strict;

use Exporter qw(import);    
use LogAll qw(write_log);

our @EXPORT_OK = qw(f1);

sub f1 {
    write_log(__PACKAGE__ . ": @_");
}
1;

The main

use warnings;
use strict;

use LogAll qw(write_log);    
use Mod1 qw(f1);
use Mod2 qw(f2);

write_log('START');

f1("hi from " . __PACKAGE__);
f2("another " . __PACKAGE__);

A run results in the file LOG_FILE.txt

START
Mod1: hi from main
Mod2: another main

I print START for a demo but the file need not be opened from main .

Please develop the logger module further as suitable. For example, add a way for the file name to be passed optionally so that main can name the log (by varying the type and number of arguments), and add a way to close the log in a controlled manner.

[Nov 16, 2017] Generating a range of Unicode characters

Notable quotes:
"... The auto-increment operator has a little extra builtin magic to it. If you increment a variable that is numeric, or that has ever been used in a numeric context, you get a normal increment. If, however, the variable has been used in only string contexts since it was set, and has a value that is not the empty string and matches the pattern /^ a-zA-Z * 0-9 *\z/ , the increment is done as a string, preserving each character within its range, with carry: ..."
"... print ++($foo = "99"); # prints "100" print ++($foo = "a0"); # prints "a1" print ++($foo = "Az"); # prints "Ba" print ++($foo = "zz"); # prints "aaa" [download] ..."
Nov 16, 2017 | perlmonks.com

davido (Archbishop) on Nov 16, 2017 at 05:46 UTC

Re: Generating a range of Unicode characters

Check out perlop Auto-increment and Auto-decrement for an explanation.

The thing to consider here is that the .. range operator leverages the semantics provided by ++ (auto-increment). The documentation for auto-increment says this:

The auto-increment operator has a little extra builtin magic to it. If you increment a variable that is numeric, or that has ever been used in a numeric context, you get a normal increment. If, however, the variable has been used in only string contexts since it was set, and has a value that is not the empty string and matches the pattern /^[a-zA-Z]*[0-9]*\z/ , the increment is done as a string, preserving each character within its range, with carry:

print ++($foo = "99");    # prints "100"
print ++($foo = "a0");    # prints "a1"
print ++($foo = "Az");    # prints "Ba"
print ++($foo = "zz");    # prints "aaa"

The components of the range you are trying to construct do not meet the criteria for Perl's built-in autoincrement behavior.

However, if you're using Perl 5.26 or newer, and enable unicode_strings you can use the following, as documented in perlop Range Operators .

use charnames "greek"; my @greek_small = map { chr } (ord("\N{alpha}") .. ord("\N{omega}")); [download]

Or forgo the \N{charname} lookups and just use the actual ordinal values:

my @chars = map {chr} $ord_first .. $ord_last; [download]

Dave

Your Mother (Chancellor) on Nov 16, 2017 at 06:13 UTC

Re: Generating a range of Unicode characters

Is this what you're after?

perl -CSD -le 'print chr for 0xDF .. 0x0101'

Update: I hadn't read all the way down davido 's post. He is making the same suggestion already at the end.

[Nov 16, 2017] regex - Parsing a whole file in Perl

Nov 12, 2017 | stackoverflow.com

pleriche, Nov 12 at 9:52

I have an HTML file containing a 2-column table which I want to parse in order to extract pairs of strings representing the columns. The page layout of the HTML (white space, new lines) is arbitrary, hence I can't parse the file line by line.

I recall that you can parse such a thing by slurping the whole file into a string and operating on the entire string, which I'm finding a bit more challenging. I'm trying things like the following:

#!/usr/bin/perl

open(FILE, "Glossary") || die "Couldn't open file\n";
@lines = <FILE>;
close(FILE);

$data = join(' ', @lines);

while ($data =~ /<tr>.*(<td>.*<\/td>).*(<td>.*<\/td>).*<\/tr>/g) {
    print $1, ":", $2, "\n";
}

which gives a null output. Here's a section of the input file:

<table class="wikitable">
    <tr>
        <td><b>Term</b>
        </td>
        <td><b>Meaning</b>
        </td></tr>
    <tr>
        <td><span id="0-Day">0-Day</span>
        </td>
        <td>
        <p>See <a href="#Zero_Day">Zero Day</a>.
        </p>
        </td>

Can someone help me out?

Borodin ,Nov 12 at 21:20

Use HTML::TableExtractBorodin Nov 12 at 21:20

zdim ,Nov 12 at 21:46

To correct my early comment (removed), while I recommend HTML::TreeBuilder for general parsing of HTML (and there are others), here you indeed want HTML::TableExtract . And you do not want to use regex – zdim Nov 12 at 21:46

Dave Cross ,2 days ago

You can't parse HTML with a regexDave Cross 2 days ago

Miguel Prz ,Nov 12 at 10:03

There is a HTML::TableExtract module in CPAN, which simplifies the problem you are trying to solve:
use strict;
use warnings;
use HTML::TableExtract qw(tree);

my $te = HTML::TableExtract->new( headers => [qw(Term Meaning)] );
my $html_file = "Glossary";
$te->parse_file($html_file);
my $table = $te->first_table_found;
# ...

pleriche ,yesterday

Thank you and I'm sure TableExtract is the better way of doing it, but the object of my question was to improve my understanding of how to use regular expressions since they're so central to Perl. Adding gs to the regexpr as someone suggested (since deleted) was the leg-up I needed. – pleriche yesterday

Miguel Prz ,yesterday

I see your point, and it's really important to build solid knowledge of regexes. But, like other people have said, it's not a good idea to apply regexes to parsing HTML documents – Miguel Prz yesterday


You already have answers explaining why you shouldn't parse HTML with regexes. And you really shouldn't. But you've asked for an explanation of why your code doesn't work. So here goes...

You have two problems in your code. One stops it working and the other stops it working as you expect.

Firstly, you are using . in your regex to match any character. But . doesn't match any character. It matches any character except a newline. And you have newlines in your string. You fix that by adding the /s option to your match operator (so it has /gs instead of /s ).

With that fix in place, you get a result from your code. Using your test data, I see:

<td><b>Term</b>
         </td>:<td><b>Meaning</b>
         </td>

Which is correct. But looking at your test data, I wondered why I wasn't getting two results - because of the /g . I soon realised it was because your test data is missing the closing </td> . When I added that, I got this result:

<td><span id="0-Day">0-Day</span>
         </td>:<td>
         <p>See <a href="#Zero_Day">Zero Day</a>.
         </p>
         </td>

Ok. It's now finding the second result. But what has happened to the first one? That's the second error in your code.

You have .* a few times in your regex. That means "zero or more of any character". But it's the "or more" that is a problem here. By default, Perl regex qualifiers ( * or + ) are greedy. That means they will use up as much of the string as possible. And the first .* in your regex is eating up a lot of your string. All of it up to the second <tr> in fact.

The solution to that is to make the .* non-greedy. And you do that by adding ? to the end. So you can replace all of the .* with .*? . Having done that, I get this output:

<td><b>Term</b>
         </td>:<td><b>Meaning</b>
         </td>
<td><span id="0-Day">0-Day</span>
         </td>:<td>
         <p>See <a href="#Zero_Day">Zero Day</a>.
         </p>
         </td>

Which seems correct to me.

So, to summarise:

  1. By default, . doesn't match newlines. To do that, you need /s .
  2. Beware of greedy qualifiers.
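
Putting both fixes together, the loop from the question ends up roughly like this (still fragile -- a real parser such as HTML::TableExtract remains the better tool):

while ( $data =~ m{<tr>.*?(<td>.*?</td>).*?(<td>.*?</td>).*?</tr>}gs ) {
    print $1, ":", $2, "\n";
}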

[Nov 16, 2017] Reading/dumping a perl hash from shell

Nov 11, 2017 | stackoverflow.com

newbie ,Nov 11 at 0:27

I have a read-only perl file with a huge hash defined in it. Is there anyway for me to read this perl file and dump out the hash contents?

this is basic structure of the hash within the file.

%hash_name = {
    -files => [
         '<some_path>',
    ],
    -dirs => [
         '<some_path>',
         '<some_path>',
         '<some_path>',
         '<some_path>',
         '<some_path>',
    ],
};

Davy M ,Nov 11 at 0:30

can you not cat the file and redirect it into a one that does have write permissions? cat perl_file_name > new_perl_file_nameDavy M Nov 11 at 0:30

newbie ,Nov 11 at 0:34

yes I did consider that but will go with that approach only if there is no other way to dump the hash without creating a new file. – newbie Nov 11 at 0:34

zdim ,Nov 11 at 2:59

@newbie Thank you, and to repeat the question: Does this file have other Perl code or just this hash? Also, is the hash undeclared (just %hash_name ), as you show it, or is it "lexical," so with my such as: my %hash_name ? – zdim Nov 11 at 2:59

zdim ,Nov 11 at 3:12

@newbie What you show is invalid in Perl: the % in %hash_name indicates that the variable is a hash , but { .. } form a hash reference , which is a scalar variable (not a hash). So it should be either %hash_name = ( .. ) or it's $hashref_name = { .. }zdim Nov 11 at 3:12

Schwern ,Nov 11 at 6:59

Note this is an insecure way to store data. The data file must be evaluated as perl code. Any arbitrary code could be in the file. In addition, the data file can only be read by Perl programs. Instead, use JSON or similar data format. JSON::MaybeXS can convert between JSON and Perl. – Schwern Nov 11 at 6:59

zdim ,Nov 11 at 3:40

Ideally you'd copy the file so that you can edit it, then turn it into a module so to use it nicely.

But if for some reason this isn't feasible here are your options.

If that hash is the only thing in the file , "load" it using do and assign to a hash

use warnings;
use strict;

my $file = './read_this.pl';  # the file has *only* that one hash

my %hash = do $file;

This form of do executes the file (runs it as a script), returning the last expression that is evaluated. With only the hash in the file that last expression is the hash definition, precisely what you need.

If the hash is undeclared , so a global variable (or declared with our ), then declare as our a hash with the same name in your program and again load the file with do

our %hash_name;  # same name as in the file
do $file;        # file has "%hash" or "our %hash" (not "my %hash")

Here we "pick up" the hash that is evaluated as do runs the file by virtues of our

If the hash is "lexical" , declared as my %hash (as it should be!) ... well, this is bad. Then you need to parse the text of the file so to extract lines with the hash. This is in general very hard to do, as it amounts to parsing Perl. (A hash can be built using map , returned from a sub as a reference or a flat list ...) Once that is done you eval the variable which contains the text defining that hash.

However, if you know how the hash is built, as you imply, with no () anywhere inside

use warnings; 
use strict;

my $file = './read_this.pl';

my $content = do {  # "slurp" the file -- read it into a variable
    local $/;
    open my $fh, '<', $file or die "Can't open $file: $!";
    <$fh>;
};

my ($hash_text) = $content =~ /\%hash_name\s*=\s*(\(.*?\))/s;
my %hash = eval $hash_text;

This simple shot leaves out a lot, assuming squarely that the hash is as shown. Also note that this form of eval carries real and serious security risks.


Files are also loaded using require . Apart from it doing a lot more than do , the important thing here is that even if it runs multiple times require still loads that file only once . This matters for modules in the first place, which shouldn't be loaded multiple times, and use indeed uses require .

On the other hand, do does it every time, what makes it suitable for loading files to be used as data, which presumably should be read every time. This is the recommended method. Note that require itself uses do to actually load the file.

Thanks to Schwern for a comment.
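
For completeness, a hedged sketch of the "turn it into a module" alternative mentioned at the top (the package and file names are hypothetical):

# MyConfig.pm -- placed in a directory that is on @INC
package MyConfig;
use strict;
use warnings;
use Exporter 'import';
our @EXPORT_OK = ('config');

my %hash_name = (
    -files => [ '<some_path>' ],
    -dirs  => [ '<some_path>', '<some_path>' ],
);

sub config { return %hash_name }

1;

# In the main program:
#   use MyConfig 'config';
#   my %hash = config();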

Schwern ,Nov 11 at 4:31

do will always load the file. require will only load it once. Since you want to get data from the file, it's recommended to use do . Else the second or third time anything in that process loads the file they'll end up with 1 . – Schwern Nov 11 at 4:31

zdim ,Nov 11 at 4:41

@Schwern Right, thank you for the comment. I wanted to avoid excessive explanation thus I simply use do . (I still mention require since it is feasible that the data is loaded once.) But it is good to state this, thank you -- I am adding the comment. – zdim Nov 11 at 4:41

Schwern ,Nov 11 at 4:57

It's bad practice to use require because a future person maintaining the code may also require the same file elsewhere (not even in the same code file, it's per process) and not realize it has already been required. I'd suggest instead explaining why do is the right thing to do here instead of require , it's a necessary complexity. – Schwern Nov 11 at 4:57

zdim ,Nov 11 at 5:40

@Schwern A good point, thank you. Adjusted the post. – zdim Nov 11 at 5:40

zdim ,Nov 13 at 2:15

@DavyM Thank you for kind words. You are right, and there are so many such questions that it even seems more common (than otherwise) when people start out. All that we can do is to keep pointing it out, and write it always in full code examples. – zdim Nov 13 at 2:15

[Nov 16, 2017] perl perlpacktut not making sense for me - Stack Overflow

Nov 13, 2017 | stackoverflow.com

brian d foy ,Nov 13 at 2:34

The pack function puts one or more things together in a single string. It represents things as octets (bytes) in a way that it can unpack reliably in some other program. That program might be far away (like, the distance to Mars far away). It doesn't matter if it starts as something human readable or not. That's not the point.

Consider some task where you have a numeric ID that's up to about 65,000 and a string that might be up to six characters.

print pack 'S A6', 137, $ARGV[0];

It's easier to see what this is doing if you run it through a hex dumper as you run it:

$ perl pack.pl Snoopy | hexdump -C
00000000  89 00 53 6e 6f 6f 70 79                           |..Snoopy|

The first column counts the position in the output so ignore that. Then the first two octets represent the S (short, 'word', whatever, but two octets) format. I gave it the number 137 and it stored that as 0x8900. Then it stored 'Snoopy' in the next six octets.

Now try it with a shorter name:

$ perl test.pl Linus | hexdump -C
00000000  89 00 4c 69 6e 75 73 20                           |..Linus |

Now there's a space character at the end (0x20). The packed data still has six octets. Try it with a longer name:

$ perl test.pl 'Peppermint Patty' | hexdump -C
00000000  89 00 50 65 70 70 65 72                           |..Pepper|

Now it truncates the string to fit the six available spaces.

Consider the case where you immediately send this through a socket or some other way of communicating with something else. The thing on the other side knows it's going to get eight octets. It also knows that the first two will be the short and the next six will be the name. Suppose the other side stored that in $tidy_little_package . It gets the separate values by unpacking them:

my( $id, $name ) = unpack 'S A6', $tidy_little_package;

That's the idea. You can represent many values of different types in a binary format that's completely reversible. You send that packed string wherever it needs to be used.
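
One caveat worth adding (not from the original answer): S uses the machine's native byte order, so when the two ends of the connection may differ, a fixed-order format such as n (16-bit big-endian, "network order") is the safer choice. A small round-trip sketch:

my $packed = pack 'n A6', 137, 'Snoopy';    # big-endian 16-bit id + 6-char name
my ($id, $name) = unpack 'n A6', $packed;
print "$id $name\n";                        # 137 Snoopy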

I have many more examples of pack in Learning Perl and Programming Perl .

[Nov 16, 2017] perl - Grep Two Dimensional Array - Stack Overflow

Nov 16, 2017 | stackoverflow.com


Taranasaur ,yesterday

Since this is not a question directly covered here, thought best I ask and answer it.

I had an issue where I wanted to add a node name to a list only if the same node doesn't already exist. The array was built using:

push (@fin_nodes, [$node, $hindex, $e->{$hip}->{FREQ}]);

So given when given array (@fin_nodes) that looks like:

$VAR1 = [
      'first-node',
      '4',
      3
    ];
$VAR2 = [
      'second-node',
      '1',
      3
    ];
$VAR3 = [
      'another-node',
      '1',
      5
    ];
$VAR4 = [
      'some-node',
      '0',
      5
    ];

To do a grep on this the following works:

my @match = grep { grep { $_ =~ $node } @$_ } @fin_nodes;

So given a $node "second-node" the above statement will return @match as:

$VAR1 = [
  'second-node',
  '1',
  3
];

Sobrique ,yesterday

Why not use a hash instead? – Sobrique yesterday

ysth ,yesterday

when dumping an array, do Data::Dumper::Dumper(\@array), not ...(@array). if passed a list, Dumper dumps each element individually, which is not what you want here – ysth yesterday

,

I would say "don't" and instead:
my %fin_nodes;
$fin_nodes{$node} = [$hindex, $e->{$hip}->{FREQ}]);

And then you can simply if ($fin_nodes{$node}) {

Failing that though - you don't need to grep every element, as your node name is always first.

So:

 my @matches = grep { $_ -> [0] eq $node } @fin_nodes;

eq is probably a better choice than =~ here, because the latter will substring match. (And worse, can potentially do some quite unexpected things if you've metacharacters in there, since you're not quoting or escaping them)

E.g. in your example - if you look for a node called "node" you'll get multiple hits.

Note - if you're only looking for one match, you can do something like:

my ( $first_match ) =  grep { $_ -> [0] eq $node } @fin_nodes;

This will just get you the first result, and the rest will be discarded. (Which isn't too efficient, because grep will continue to iterate the whole list).

Taranasaur ,yesterday

Your last statement was on point, I only needed one match. Then before pushing a node onto fin_nodes this was enough: "if (!$first_match)" – Taranasaur yesterday

Borodin ,yesterday

@Taranasaur: I think you missed the point of Sobrique's answer. A hash is by far the better choice for this, and you can simply write $fin_nodes{$node} //= [ $hindex, $e->{$hip}{FREQ} ] and avoid the need for any explicit test altogether. – Borodin yesterday

Taranasaur ,yesterday

@Borodin, no I do get Sobrique's point. The fin_nodes array is being used for a simple list function that another method is already using quite happily in my program. I will at some point go back and create a hash as there might be more attributes I'll need to include in that array/hash – Taranasaur yesterday

ysth ,yesterday

"because the latter will substring match" assuming no regex metacharacters; if there are any, it will be even worse – ysth yesterday

Sobrique ,yesterday

Good point @ysth I will add that. – Sobrique yesterday

[Nov 16, 2017] Generating a range of Unicode characters

Nov 16, 2017 | perlmonks.com

davido (Archbishop) on Nov 16, 2017 at 05:46 UTC

Re: Generating a range of Unicode characters

Check out perlop Auto-increment and Auto-decrement for an explanation.

The thing to consider here is that the .. range operator leverages the semantics provided by ++ (auto-increment). The documentation for auto-increment says this:

The auto-increment operator has a little extra builtin magic to it. If you increment a variable that is numeric, or that has ever been used in a numeric context, you get a normal increment. If, however, the variable has been used in only string contexts since it was set, and has a value that is not the empty string and matches the pattern /^[a-zA-Z]*[0-9]*\z/ , the increment is done as a string, preserving each character within its range, with carry:

print ++($foo = "99"); # prints "100" print ++($foo = "a0"); # prints "a1" print ++($foo = "Az"); # prints "Ba" print ++($foo = "zz"); # prints "aaa" [download]

The components of the range you are trying to construct do not meet the criteria for Perl's built-in autoincrement behavior.

However, if you're using Perl 5.26 or newer, and enable unicode_strings you can use the following, as documented in perlop Range Operators .

use charnames "greek"; my @greek_small = map { chr } (ord("\N{alpha}") .. ord("\N{omega}")); [download]

Or forgo the \N{charname} lookups and just use the actual ordinal values:

    my @chars = map { chr } $ord_first .. $ord_last;

Dave
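
As a quick self-contained illustration of the map/chr approach (the Cyrillic range below is an arbitrary example, not from the original question):

    use strict;
    use warnings;
    use open ':std', ':encoding(UTF-8)';   # so the wide characters print cleanly

    my @cyrillic_caps = map { chr } 0x0410 .. 0x042F;   # U+0410 .. U+042F
    print "@cyrillic_caps\n";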

Your Mother (Chancellor) on Nov 16, 2017 at 06:13 UTC

Re: Generating a range of Unicode characters

Is this what you're after?

    perl -CSD -le 'print chr for 0xDF .. 0x0101'

Update: I hadn't read all the way down davido 's post. He is making the same suggestion already at the end.

[Nov 16, 2017] regex - Parsing a whole file in Perl

Nov 12, 2017 | stackoverflow.com

pleriche, Nov 12 at 9:52

I have an HTML file containing a 2-column table which I want to parse in order to extract pairs of strings representing the columns. The page layout of the HTML (white space, new lines) is arbitrary, hence I can't parse the file line by line.

I recall that you can parse such a thing by slurping the whole file into a string and operating on the entire string, which I'm finding a bit more challenging. I'm trying things like the following:

#!/usr/bin/perl

open(FILE, "Glossary") || die "Couldn't open file\n";
@lines = <FILE>;
close(FILE);

$data = join(' ', @lines);

while ($data =~ /<tr>.*(<td>.*<\/td>).*(<td>.*<\/td>).*<\/tr>/g) {
    print $1, ":", $2, "\n";
}

which gives a null output. Here's a section of the input file:

<table class="wikitable">
    <tr>
        <td><b>Term</b>
        </td>
        <td><b>Meaning</b>
        </td></tr>
    <tr>
        <td><span id="0-Day">0-Day</span>
        </td>
        <td>
        <p>See <a href="#Zero_Day">Zero Day</a>.
        </p>
        </td>

Can someone help me out?

Borodin ,Nov 12 at 21:20

Use HTML::TableExtractBorodin Nov 12 at 21:20

zdim ,Nov 12 at 21:46

To correct my early comment (removed), while I recommend HTML::TreeBuilder for general parsing of HTML (and there are others), here you indeed want HTML::TableExtract . And you do not want to use regex – zdim Nov 12 at 21:46

Dave Cross ,2 days ago

You can't parse HTML with a regexDave Cross 2 days ago

Miguel Prz ,Nov 12 at 10:03

There is a HTML::TableExtract module in CPAN, which simplifies the problem you are trying to solve:
use strict;
use warnings;
use HTML::TableExtract qw(tree);

my $te = HTML::TableExtract->new( headers => [qw(Term Meaning)] );
my $html_file = "Glossary";
$te->parse_file($html_file);
my $table = $te->first_table_found;
# ...

pleriche ,yesterday

Thank you and I'm sure TableExtract is the better way of doing it, but the object of my question was to improve my understanding of how to use regular expressions since they're so central to Perl. Adding gs to the regexpr as someone suggested (since deleted) was the leg-up I needed. – pleriche yesterday

Miguel Prz ,yesterday

I see your point, and it's really important to build solid knowledge of regexes. But, like other people have said, it's not a good idea to apply regexes to parsing HTML documents – Miguel Prz yesterday

,

You already have answers explaining why you shouldn't parse HTML with regexes. And you really shouldn't. But you've asked for an explanation of why your code doesn't work. So here goes...

You have two problems in your code. One stops it working and the other stops it working as you expect.

Firstly, you are using . in your regex to match any character. But . doesn't match any character. It matches any character except a newline. And you have newlines in your string. You fix that by adding the /s option to your match operator (so it has /gs instead of just /g ).

With that fix in place, you get a result from your code. Using your test data, I see:

<td><b>Term</b>
         </td>:<td><b>Meaning</b>
         </td>

Which is correct. But looking at your test data, I wondered why I wasn't getting two results - because of the /g . I soon realised it was because your test data is missing the closing </tr> . When I added that, I got this result:

<td><span id="0-Day">0-Day</span>
         </td>:<td>
         <p>See <a href="#Zero_Day">Zero Day</a>.
         </p>
         </td>

Ok. It's now finding the second result. But what has happened to the first one? That's the second error in your code.

You have .* a few times in your regex. That means "zero or more of any character". But it's the "or more" that is a problem here. By default, Perl regex quantifiers ( * or + ) are greedy. That means they will use up as much of the string as possible. And the first .* in your regex is eating up a lot of your string. All of it up to the second <tr> in fact.

The solution to that is to make the .* non-greedy. And you do that by adding ? to the end. So you can replace all of the .* with .*? . Having done that, I get this output:

<td><b>Term</b>
         </td>:<td><b>Meaning</b>
         </td>
<td><span id="0-Day">0-Day</span>
         </td>:<td>
         <p>See <a href="#Zero_Day">Zero Day</a>.
         </p>
         </td>

Which seems correct to me.

So, to summarise:

  1. By default, . doesn't match newlines. To do that, you need /s .
  2. Beware of greedy quantifiers.
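
Putting both fixes together, the loop from the question becomes something like this (still a regex approach, shown only to illustrate the /s flag and the non-greedy quantifiers; a real parser such as HTML::TableExtract remains the better tool):

    while ($data =~ /<tr>.*?(<td>.*?<\/td>).*?(<td>.*?<\/td>).*?<\/tr>/gs) {
        print $1, ":", $2, "\n";
    }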

[Nov 16, 2017] Reading/dumping a perl hash from shell

Nov 11, 2017 | stackoverflow.com

newbie ,Nov 11 at 0:27

I have a read-only perl file with a huge hash defined in it. Is there anyway for me to read this perl file and dump out the hash contents?

this is basic structure of the hash within the file.

%hash_name = {
    -files => [
         '<some_path>',
    ],
    -dirs => [
         '<some_path>',
         '<some_path>',
         '<some_path>',
         '<some_path>',
         '<some_path>',
    ],
};

Davy M ,Nov 11 at 0:30

can you not cat the file and redirect it into a one that does have write permissions? cat perl_file_name > new_perl_file_nameDavy M Nov 11 at 0:30

newbie ,Nov 11 at 0:34

yes I did consider that but will go with that approach only if there is no other way to dump the hash without creating a new file. – newbie Nov 11 at 0:34

zdim ,Nov 11 at 2:59

@newbie Thank you, and to repeat the question: Does this file have other Perl code or just this hash? Also, is the hash undeclared (just %hash_name ), as you show it, or is it "lexical," so with my such as: my %hash_name ? – zdim Nov 11 at 2:59

zdim ,Nov 11 at 3:12

@newbie What you show is invalid in Perl: the % in %hash_name indicates that the variable is a hash , but { .. } form a hash reference , which is a scalar variable (not a hash). So it should be either %hash_name = ( .. ) or it's $hashref_name = { .. }zdim Nov 11 at 3:12

Schwern ,Nov 11 at 6:59

Note this is an insecure way to store data. The data file must be evaluated as perl code. Any arbitrary code could be in the file. In addition, the data file can only be read by Perl programs. Instead, use JSON or similar data format. JSON::MaybeXS can convert between JSON and Perl. – Schwern Nov 11 at 6:59
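
A minimal sketch of the JSON route Schwern suggests (the file name and keys here are made up for illustration):

    use strict;
    use warnings;
    use JSON::MaybeXS qw(encode_json decode_json);

    # write the data out once as JSON ...
    open my $out, '>', 'paths.json' or die "Can't write paths.json: $!";
    print $out encode_json({ files => ['<some_path>'], dirs => ['<some_path>'] });
    close $out;

    # ... and any program, Perl or not, can read it back without eval-ing code
    open my $in, '<', 'paths.json' or die "Can't read paths.json: $!";
    my $data = decode_json(do { local $/; <$in> });
    print "$_\n" for @{ $data->{dirs} };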

zdim ,Nov 11 at 3:40

Ideally you'd copy the file so that you can edit it, then turn it into a module so that you can use it nicely.

But if for some reason this isn't feasible here are your options.

If that hash is the only thing in the file , "load" it using do and assign to a hash

use warnings;
use strict;

my $file = './read_this.pl';  # the file has *only* that one hash

my %hash = do $file;

This form of do executes the file (runs it as a script), returning the last expression that is evaluated. With only the hash in the file that last expression is the hash definition, precisely what you need.

If the hash is undeclared , that is, a global variable (or declared with our ), then declare as our a hash with the same name in your program and again load the file with do

our %hash_name;  # same name as in the file
do $file;        # file has "%hash_name" or "our %hash_name" (not "my %hash_name")

Here we "pick up" the hash that is evaluated as do runs the file by virtues of our

If the hash is "lexical" , declared as my %hash (as it should be!) ... well, this is bad. Then you need to parse the text of the file so to extract lines with the hash. This is in general very hard to do, as it amounts to parsing Perl. (A hash can be built using map , returned from a sub as a reference or a flat list ...) Once that is done you eval the variable which contains the text defining that hash.

However, if you know how the hash is built, as you imply, with no () anywhere inside

use warnings; 
use strict;

my $file = './read_this.pl';

my $content = do {  # "slurp" the file -- read it into a variable
    local $/;
    open my $fh, '<', $file or die "Can't open $file: $!";
    <$fh>;
};

my ($hash_text) = $content =~ /\%hash_name\s*=\s*(\(.*?\))/s;
my %hash = eval $hash_text;

This simple shot leaves out a lot, assuming squarely that the hash is as shown. Also note that this form of eval carries real and serious security risks.


Files are also loaded using require . Apart from doing a lot more than do , the important thing here is that even if it is run multiple times, require still loads the file only once .

On the other hand, do does it every time, which makes it suitable for loading files to be used as data, which presumably should be read every time. This is the recommended method. Note that require itself uses do to actually load the file.

Thanks to Schwern for a comment.
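
To make the difference concrete, a small sketch (the file name is the one from above; the data file is expected to end with the hash):

    my %first  = do './read_this.pl';   # executes the file, returns the hash
    my %second = do './read_this.pl';   # executes it again -- fresh data every time

    # require, by contrast, runs a given file only once per process;
    # a second require of the same file just returns true (1),
    # so you would not get the hash back again.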

Schwern ,Nov 11 at 4:31

do will always load the file. require will only load it once. Since you want to get data from the file, it's recommended to use do . Else the second or third time anything in that process loads the file they'll end up with 1 . – Schwern Nov 11 at 4:31

zdim ,Nov 11 at 4:41

@Schwern Right, thank you for the comment. I wanted to avoid excessive explanation thus I simply use do . (I still mention require since it is feasible that the data is loaded once.) But it is good to state this, thank you -- I am adding the comment. – zdim Nov 11 at 4:41

Schwern ,Nov 11 at 4:57

It's bad practice to use require because a future person maintaining the code may also require the same file elsewhere (not even in the same code file, it's per process) and not realize it has already been required. I'd suggest instead explaining why do is the right thing to do here instead of require , it's a necessary complexity. – Schwern Nov 11 at 4:57

zdim ,Nov 11 at 5:40

@Schwern A good point, thank you. Adjusted the post. – zdim Nov 11 at 5:40

zdim ,Nov 13 at 2:15

@DavyM Thank you for kind words. You are right, and there are so many such questions that it even seems more common (than otherwise) when people start out. All that we can do is to keep pointing it out, and write it always in full code examples. – zdim Nov 13 at 2:15

[Nov 16, 2017] perl perlpacktut not making sense for me - Stack Overflow

Nov 13, 2017 | stackoverflow.com

brian d foy ,Nov 13 at 2:34

The pack function puts one or more things together in a single string. It represents things as octets (bytes) in a way that it can unpack reliably in some other program. That program might be far away (like, the distance to Mars far away). It doesn't matter if it starts as something human readable or not. That's not the point.

Consider some task where you have a numeric ID that's up to about 65,000 and a string that might be up to six characters.

print pack 'S A6', 137, $ARGV[0];

It's easier to see what this is doing if you run it through a hex dumper as you run it:

$ perl pack.pl Snoopy | hexdump -C
00000000  89 00 53 6e 6f 6f 70 79                           |..Snoopy|

The first column counts the position in the output so ignore that. Then the first two octets represent the S (short, 'word', whatever, but two octets) format. I gave it the number 137 and it stored that as 0x8900. Then it stored 'Snoopy' in the next six octets.

Now try it with a shorter name:

$ perl test.pl Linus | hexdump -C
00000000  89 00 4c 69 6e 75 73 20                           |..Linus |

Now there's a space character at the end (0x20). The packed data still has six octets. Try it with a longer name:

$ perl test.pl 'Peppermint Patty' | hexdump -C
00000000  89 00 50 65 70 70 65 72                           |..Pepper|

Now it truncates the string to fit the six available spaces.

Consider the case where you immediately send this through a socket or some other way of communicating with something else. The thing on the other side knows it's going to get eight octets. It also knows that the first two will be the short and the next six will be the name. Suppose the other side stored that in $tidy_little_package . It gets the separate values by unpacking them:

my( $id, $name ) = unpack 'S A6', $tidy_little_package;

That's the idea. You can represent many values of different types in a binary format that's completely reversible. You send that packed string wherever it needs to be used.
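
A self-contained round trip of the same idea (the ID and the name are arbitrary examples):

    use strict;
    use warnings;

    my $packed = pack 'S A6', 137, 'Snoopy';     # 8 octets: 2 for the short, 6 for the name
    my ($id, $name) = unpack 'S A6', $packed;    # 'A' strips any trailing space padding
    print "$id $name\n";                         # prints "137 Snoopy"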

I have many more examples of pack in Learning Perl and Programming Perl .

[Nov 16, 2017] perl - Grep Two Dimensional Array - Stack Overflow

Nov 16, 2017 | stackoverflow.com

Grep Two Dimensional Array Ask Question up vote down vote favorite

Taranasaur ,yesterday

Since this is not a question directly covered here, thought best I ask and answer it.

I had an issue where I wanted to add a node name to a list only if the same node doesn't already exist. The array was built using:

push (@fin_nodes, [$node, $hindex, $e->{$hip}->{FREQ}]);

So given an array (@fin_nodes) that looks like:

$VAR1 = [
      'first-node',
      '4',
      3
    ];
$VAR2 = [
      'second-node',
      '1',
      3
    ];
$VAR3 = [
      'another-node',
      '1',
      5
    ];
$VAR4 = [
      'some-node',
      '0',
      5
    ];

To do a grep on this the following works:

my @match = grep { grep { $_ =~ $node } @$_ } @fin_nodes;

So given a $node "second-node" the above statement will return @match as:

$VAR1 = [
  'second-node',
  '1',
  3
];

Sobrique ,yesterday

Why not use a hash instead? – Sobrique yesterday

ysth ,yesterday

when dumping an array, do Data::Dumper::Dumper(\@array), not ...(@array). if passed a list, Dumper dumps each element individually, which is not what you want here – ysth yesterday

,

I would say "don't" and instead:
my %fin_nodes;
$fin_nodes{$node} = [$hindex, $e->{$hip}->{FREQ}];

And then you can simply write if ($fin_nodes{$node}) { ... }

Failing that though - you don't need to grep every element, as your node name is always first.

So:

 my @matches = grep { $_ -> [0] eq $node } @fin_nodes;

eq is probably a better choice than =~ here, because the latter will substring match. (And worse, can potentially do some quite unexpected things if you've metacharacters in there, since you're not quoting or escaping them)

E.g. in your example - if you look for a node called "node" you'll get multiple hits.

Note - if you're only looking for one match, you can do something like:

my ( $first_match ) =  grep { $_ -> [0] eq $node } @fin_nodes;

This will just get you the first result, and the rest will be discarded. (Which isn't too efficient, because grep will continue to iterate the whole list).
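
If that matters, List::Util's first stops at the first hit instead of scanning the rest of the list; a minimal sketch of the same check before the push:

    use List::Util qw(first);

    my $first_match = first { $_->[0] eq $node } @fin_nodes;
    push @fin_nodes, [$node, $hindex, $e->{$hip}{FREQ}] if !$first_match;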

Taranasaur ,yesterday

Your last statement was on point, I only needed one match. Then before pushing a node onto fin_nodes this was enough: "if (!$first_match)" – Taranasaur yesterday

Borodin ,yesterday

@Taranasaur: I think you missed the point of Sobrique's answer. A hash is by far the better choice for this, and you can simply write $fin_nodes{$node} //= [ $hindex, $e->{$hip}{FREQ} ] and avoid the need for any explicit test altogether. – Borodin yesterday

Taranasaur ,yesterday

@Borodin, no I do get Sobrique's point. The fin_nodes array is being used for a simple list function that another method is already using quite happily in my program. I will at some point go back and create a hash as there might be more attributes I'll need to include in that array/hash – Taranasaur yesterday

ysth ,yesterday

"because the latter will substring match" assuming no regex metacharacters; if there are any, it will be even worse – ysth yesterday

Sobrique ,yesterday

Good point @ysth I will add that. – Sobrique yesterday

[Nov 16, 2017] Namespaces and modules

Nov 16, 2017 | perlmonks.com

on Feb 09, 2015 at 13:21 UTC, kzwix has asked for the wisdom of the Perl Monks concerning the following question:

Greetings, Ô wise monks !

I come to you because of a mystery I'd like to unravel: The module import code doesn't work as I expected. So, as I'm thinking that it probably is a problem with my chair-keyboard interface, rather than with the language, I need your help.

So, there are these modules I have, the first one goes like this:

    use utf8;
    use Date::Manip;
    use LogsMarcoPolo;

    package LibOutils;

    BEGIN {
        require Exporter;

        # set the version for version checking
        our $VERSION = 1.00;

        # Inherit from Exporter to export functions and variables
        our @ISA = qw(Exporter);

        # Functions and variables which are exported by default
        our @EXPORT = qw(getDateDuJour getHeureActuelle getInfosSemaine
            getTailleRepertoire getInfosPartition getHashInfosContenuRepertoire dormir);

        # Functions and variables which can be optionally exported
        our @EXPORT_OK = qw();
    }

    # Under this line are definitions of local variables, and the subs.

I also have another module, which goes like that:

    use utf8;
    use strict;
    use warnings;
    use Cwd;      # Module "CORE"
    use Encode;
    use LibOutils qw(getHeureActuelle);

    package LogsMarcoPolo;

    BEGIN {
        require Exporter;

        # set the version for version checking
        our $VERSION = 1.00;

        # Inherit from Exporter to export functions and variables
        our @ISA = qw(Exporter);

        # Functions and variables which are exported by default
        our @EXPORT = qw(setNomProgramme ouvreFichierPourLog assigneFluxPourLog
            pushFlux popFlux init printAndLog);

        # Functions and variables which can be optionally exported
        our @EXPORT_OK = qw();
    }

    # Here are other definitions of variables and subs, which I removed for the sake of clarity

    sub init {
        my ($nomDuProgramme, $pathLogGeneral, $pathLogErreurs) = @_;

        my $date = LibOutils::getDateDuJour();     # date when init() was called
        my $time = LibOutils::getHeureActuelle();  # time when init() was called

        $nomProgramme = $nomDuProgramme;

        # Open the stream for STDOUT:
        my $stdout = assigneFluxPourLog(*STDOUT);
        # Add it to the 'OUT' stream list:
        pushFlux('OUT', $stdout);

        # Open the stream for STDERR:
        my $stderr = assigneFluxPourLog(*STDERR);
        # Add it to the 'ERR' list, and to the 'DUO' list:
        pushFlux('ERR', $stderr);
        pushFlux('DUO', $stderr);

        if (defined $pathLogGeneral) {
            my $plg = $pathLogGeneral;
            $plg =~ s/<DATE>/$date/g;
            $plg =~ s/<TIME>/$time/g;
            my $logG = ouvreFichierPourLog($plg);
            pushFlux('OUT', $logG);
            pushFlux('DUO', $logG);
        }

        if (defined $pathLogErreurs) {
            my $ple = $pathLogErreurs;
            $ple =~ s/<DATE>/$date/g;
            $ple =~ s/<TIME>/$time/g;
            my $logE = ouvreFichierPourLog($ple);
            pushFlux('ERR', $logE);
            pushFlux('DUO', $logE);
        }
    }

Now, look at the second module: When, in the "init" sub, I call the getDateDuJour() and getHeureActuelle() functions with an explicit namespace, it works fine.

If I remove the prefix, it doesn't work, even for the function whose name I put in the "qw(...)" chain after the use.

Would a fellow monk know why ?

choroba (Bishop) on Feb 09, 2015 at 13:24 UTC

Re: Namespaces and modules

By putting package after the use clauses, you are importing all the functions to the "main" namespace, not into your package's namespace. Moving the package declaration up should help. لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
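
In other words, the package statement has to come before the Exporter setup and before any imports meant for that package. A minimal corrected skeleton, trimmed to a couple of the exported names (the @ISA style from the original works equally well once package comes first):

    package LibOutils;

    use strict;
    use warnings;
    use utf8;

    use Exporter qw(import);
    our @EXPORT_OK = qw(getDateDuJour getHeureActuelle);

    # subroutine definitions go here ...

    1;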

kzwix (Sexton) on Feb 09, 2015 at 13:34 UTC

Re^2: Namespaces and modules


by kzwix (Sexton) on Feb 09, 2015 at 13:34 UTC

I wonder, could it have something to do with loop-including ?

I mean, package "LibOutils" uses "LogsMarcoPolo" (for its logging system), but "LogsMarcoPolo" uses "LibOutils" for its dates and times.

Could that circular include be the origin of this bug ?

Anonymous Monk on Feb 09, 2015 at 14:18 UTC

Re^3: Namespaces and modules
by Anonymous Monk on Feb 09, 2015 at 14:18 UTC
I wonder, could it have something to do with loop-including ?

Circular dependencies don't automatically cause a problem, it also depends on what the module does in its body (which you haven't shown). If you think there is a problem, a short piece of example code that reproduces the problem would help, see http://sscce.org/

But first, did you try what choroba suggested ?

kzwix (Sexton) on Feb 09, 2015 at 15:04 UTC

Re^4: Namespaces and modules
by kzwix (Sexton) on Feb 09, 2015 at 15:04 UTC

Corion (Pope) on Feb 09, 2015 at 15:11 UTC

Re^5: Namespaces and modules
by Corion (Pope) on Feb 09, 2015 at 15:11 UTC

Anonymous Monk on Feb 09, 2015 at 15:59 UTC

Re^5: Namespaces and modules
by Anonymous Monk on Feb 09, 2015 at 15:59 UTC

Anonymous Monk on Feb 09, 2015 at 14:11 UTC

Re: Namespaces and modules
doesn't work as I expected ... it works fine ... it doesn't work

What are the exact error messages? What is the expected behavior vs. the behavior you're getting? See How do I post a question effectively?


[Nov 16, 2017] perl - Passing an inner array to a function - Stack Overflow

Nov 16, 2017 | stackoverflow.com

,

There are no arrays in your code. And there are no method calls in your code.

Your hash is defined incorrectly. You cannot embed hashes inside other hashes. You need to use hash references. Like this:

my %data = (
    'a' => {
        x => 'Hello',
        y => 'World'
    },
    'b' => {
        x => 'Foo',
        y => 'Bar'
    }
);

Note, I'm using { ... } to define your inner hashes, not ( ... ) .

That still gives us an error though.

Type of arg 1 to main::p must be hash (not hash element) at passhash line 20, near "})"

If that's unclear, we can always try adding use diagnostics to get more details of the error:

(F) This function requires the argument in that position to be of a certain type. Arrays must be @NAME or @{EXPR}. Hashes must be %NAME or %{EXPR}. No implicit dereferencing is allowed--use the {EXPR} forms as an explicit dereference. See perlref.

Parameter type definitions come from prototypes. Your prototype is \% . People often think that means a hash reference. It doesn't. It means, "give me a real hash in this position and I'll take a reference to it and pass that reference to the subroutine".

(See, this is why people say that prototypes shouldn't be used in Perl - they often don't do what you think they do.)

You're not passing a hash. You're passing a hash reference. You can fix it by dereferencing the hash in the subroutine call.

p(%{$data{a}});

But that's a really silly idea. Take a hash reference and turn it into a hash, so that Perl can take its reference to pass it into a subroutine.

What you really want to do is to change the prototype to just $ so the subroutine accepts a hash reference. You can then check that you have a hash reference using ref .

But that's still overkill. People advise against using Perl prototypes for very good reasons. Just remove it
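
Since the original subroutine isn't shown, here is a minimal prototype-free sketch (the body, which just prints the two values, is an assumption):

    sub p {
        my ($h) = @_;                # expects a hash reference
        print "$h->{x} $h->{y}\n";
    }

    p($data{a});                     # pass the inner hash reference directly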

> ,

Your definition of the structure is wrong. Inner hashes need to use {} , not () .
my %data = (
    a => {
        x => 'Hello',
        y => 'World'
    },
    b => {
        x => 'Foo',
        y => 'Bar'
    }
);

Also, to get a single hash element, use $data{'a'} (or even $data{a} ), not %data{'a'} .

Moreover, see Why are Perl 5's function prototypes bad? on why not to use prototypes. After correcting the syntax as above, the code works even without the prototype. If you really need the prototype, use % , not \% . But you clearly don't know exactly what purpose prototypes serve, so don't use them.

[Nov 16, 2017] perl get reference to temp list returned by function without making a copy - Stack Overflow

Nov 16, 2017 | stackoverflow.com

newguy, 2 days ago

I have a function in perl that returns a list. It is my understanding that when foo() is assigned to a list, a copy is made:
sub foo() { return `ping 127.0.0.1` }

my @list = foo();

That @list then needs to be transferred to another list like @oldlist = @list; and another copy is made. So I was thinking can I just make a reference from the returned list like my $listref = \foo(); and then I can assign that reference, but that doesn't work.

The function I'm working with runs a command that returns a pretty big list (the ping command is just for example purposes) and I have to call it often, so I want to minimize the copies if possible. What is a good way to deal with that?

zdim ,2 days ago

Make an anonymous array reference of the list that is returned
my $listref = [ foo() ];

But, can you not return an arrayref to start with? That is better in general, too.


What you attempted "takes a reference of a list", which one cannot do in the literal sense; lists are "elusive" things, while a reference can be taken

By using the backslash operator on a variable, subroutine, or value.

and a "list" isn't either (with a subroutine we need syntax \&sub_name )

However, with the \ operator a reference is taken, either to all elements of the list if in list context

my @ref_of_LIST = \( 1,2,3 );  #-->  @ref_of_LIST: (\1, \2, \3)

or to a scalar if in scalar context, which is what happens in your attempt. Since your sub returns a list of values, they are evaluated by the comma operator and discarded, one by one, until the last one. The reference is then taken of that scalar

my $ref_of_LIST = \( 1,2,3 );  #--> $ref_of_LIST: \3

As it happens, all this applies without parens as well, with \foo() .
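
A minimal sketch of returning an array reference directly, so the list is never copied (the ping command is kept from the question purely as an example):

    sub foo {
        my @lines = `ping 127.0.0.1`;   # or simply: return [ `ping 127.0.0.1` ];
        return \@lines;
    }

    my $listref = foo();
    print scalar @$listref, " lines captured\n";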

newguy ,2 days ago

I don't know how to return an array ref from a command that returns a list. Would it be acceptable to do it as return [`ping 1.2.3.4`];newguy 2 days ago

zdim ,2 days ago

@newguy Yes, that would be a fine way to do it. Another is to store the command's return in an array variable (say, @ary ) -- if you need it elsewhere in the sub -- and then return \@ary;zdim 2 days ago

newguy ,2 days ago

Ok thanks. Wouldn't the @ary way create a copy though – newguy 2 days ago

zdim ,2 days ago

@newguy For one, those elements must be stored somewhere, either anonymously by [ .. ] or associated with a named variable by @ary = .. . I don't know whether an extra copy is made in order to construct an array, but I'd expect that it isn't. When you return \@ary no new copies are made. I would expect that they are about the same. – zdim 2 days ago

zdim ,2 days ago

@newguy I added an explanation of what happens with \foo()zdim 2 days ago

[Nov 16, 2017] Perl captured digits from string are always 1

Nov 16, 2017 | stackoverflow.com

The match operator in scalar context evaluates to a boolean that indicates whether the match succeeded or not.

my $success = $user =~ /(\d+)/;

The match operator in list context returns the captured strings (or 1 if there are no captures) on success and an empty list on failure.

my ($num) = $user =~ /(\d+)/;

You used the former, but you want the latter. That gives you the following (after a few other small fixes):

sub next_level {
    my ($user) = @_;
    my ($num) = $user =~ /(\d+)\z/;
    $user =~ s/\d+\z//g;
    $user .= ++$num;
    return $user;
}

But that approach is complicated and inefficient. Simpler solution:

sub next_level {
    my ($user) = @_;
    $user =~ s/(\d+)\z/ $1 + 1 /e;
    return $user;
}
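
A quick check of the simpler version (the usernames are just examples):

    print next_level('build41'), "\n";   # prints "build42"
    print next_level('user9'),   "\n";   # prints "user10"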

[Nov 16, 2017] regex - Use of uninitialized value $a in concatenation (.) or string - Stack Overflow

Nov 16, 2017 | stackoverflow.com

sampath, yesterday

I am trying to remove the old files in a dir if the count is more than 3 over SSH

Kindly suggest how to resolve the issue.

Please refer the code snippet

#!/usr/bin/perl
use strict;
use warnings;

my $HOME="/opt/app/latest";
my $LIBS="${HOME}/libs";
my $LIBS_BACKUP_DIR="${HOME}/libs_backups";
my $a;
my $b;
my $c;
my $d;

my $command =qq(sudo /bin/su - jenkins -c "ssh username\@server 'my $a=ls ${LIBS_BACKUP_DIR} | wc -l;my $b=`$a`;if ($b > 3); { print " Found More than 3 back up files , removing older files..";my $c=ls -tr ${LIBS_BACKUP_DIR} | head -1;my $d=`$c`;print "Old file name $d";}else { print "No of back up files are less then 3 .";} '");

print "$command\n";
system($command);

output:

sudo /bin/su - jenkins -c "ssh username@server 'my ; =ls /opt/app/latest/libs_backups | wc -l;my ; =``;if ( > 3); { print " Found More than 3 back up files , removing older files..";my ; =ls -tr /opt/app/latest/libs_backups | head -1;my ; =``;print "Old file name ";}else { print "No of back up files are less then 3 .";} '" Found: -c: line 0: unexpected EOF while looking for matching `'' Found: -c: line 1: syntax error: unexpected end of file

janh ,yesterday

Are you trying to execute parts of your local perl script in an ssh session on a remote server? That will not work. – janh yesterday

simbabque ,yesterday

Look into Object::Remote. Here is a good talk by the author from the German Perl Workshop 2014. It will essentially let you write Perl code locally, and execute it completely on a remote machine. It doesn't even matter what Perl version you have there. – simbabque yesterday

simbabque ,yesterday

You should also not use $a and $b . They are reserved global variables for sort . – simbabque yesterday

Chris Turner ,yesterday

Why are you sudoing when your command is running on an entirely different server? – Chris Turner yesterday

shawnhcorey ,yesterday

Never put sudo or su in a script. That is a security breach. Instead, run the script with sudo or su . – shawnhcorey yesterday

If you have three levels of escaping, you're bound to get it wrong if you do it manually. Use String::ShellQuote's shell_quote instead.

Furthermore, avoid generating code. You're bound to get it wrong! Pass the necessary information using arguments, the environment or some other channel of communication instead.

There were numerous errors in the interior Perl script on top of the fact that you tried to execute a Perl script without actually invoking perl !

#!/usr/bin/perl

use strict;
use warnings;

use String::ShellQuote qw( shell_quote );

my $HOME = "/opt/app/latest";
my $LIBS = "$HOME/libs";
my $LIBS_BACKUP_DIR = "$HOME/libs_backups";

my $perl_script = <<'__EOI__';
   use strict;
   use warnings;

   use String::ShellQuote qw( shell_quote );

   my ($LIBS_BACKUP_DIR) = @ARGV;

   my $cmd = shell_quote("ls", "-tr", "--", $LIBS_BACKUP_DIR);
   chomp( my @files =  `$cmd` );
   if (@files > 3) {
      print "Found more than 3 back up files. Removing older files...\n";
      print "$_\n" for @files;
   } else {
      print "Found three or fewer backup files.\n";
   }
__EOI__

my $remote_cmd = shell_quote("perl", "-e", $perl_script, "--", $LIBS_BACKUP_DIR);
my $ssh_cmd = shell_quote("ssh", 'username@server', "--", $remote_cmd);
my $local_cmd = shell_quote("sudo", "su", "-c", $ssh_cmd);
system($local_cmd);
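
For reference, a tiny illustration of what shell_quote does with an argument that needs quoting (the exact quoting shown is approximate):

    use String::ShellQuote qw( shell_quote );
    print shell_quote("echo", "don't panic"), "\n";
    # prints something like: echo 'don'\''t panic'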

[Nov 15, 2017] Strategies for maintenance of horrible code?

Nov 15, 2017 | perlmonks.com

converter (Priest) on Jul 12, 2006 at 05:21 UTC has asked for the wisdom of the Perl Monks concerning the following question:

For the past several months I've been busy rewriting the horrible perl code left behind by my predecessor. His approach to development was "Write some code. If the code runs without revealing any of the damage it's done, ship it. If not, write some more code." This code is so bad that when co-workers ask me what I'm working on, I tell them "The Madman's Diary." Yes, it would have been cheaper and faster to throw this code away and start over, but I wasn't given that option.

My latest assignment is the repair of a tangled mess of a show-stopper that was discovered in a product that was supposed to ship today. After adding an open() override that logs the arguments to open() and some quality time with the watch(1) utility observing changes to the files containing the data that are causing the problem, I've narrowed the list of suspects down to a couple in-house scripts and a few (probably altered) webmin modules.

Now that I know where to look, I'd like to identify as quickly as possible which details can be safely ignored. I plan to use Devel::DProf to produce an execution graph for reference and Tie::Watch to watch variables, but I wonder if there are other tools that I should look at. A utility or module that would allow me to incrementally build a profile with persistent notes would be wonderful.

Debugging this code is a whole different game, and I'd really appreciate some input from other monks who've dealt with this type of problem.

converter

eyepopslikeamosquito (Chancellor) on Jul 12, 2006 at 08:30 UTC

Re: Strategies for maintenance of horrible code?
Yes, it would have been cheaper and faster to throw this code away and start over
Maybe. For another point of view, see Joel Spolsky on not rewriting from scratch .

I agree with adrianh . If a component is not broken, don't rewrite it. Rewrite a component when you find a number of bugs in it. But first write a regression test suite for the component. I've seen many folks over the years throw out old code, rewrite it ... and introduce a heap of new bugs in the process. If you come into a new company and introduce a swag of new bugs in previously working code, you will start to smell very badly.

See also:

GrandFather (Sage) on Jul 12, 2006 at 08:38 UTC

Re^2: Strategies for maintenance of horrible code?


by GrandFather (Sage) on Jul 12, 2006 at 08:38 UTC

Actually, just writing the tests is often a damn fine way of finding bugs. Not exactly what the OP is after at the moment, but something that is at the forefront of my mind because I'm in the middle of writing a set of tests (in Perl :) for some XML processing C++ code and turning up a pile of bugs as I go.

However it does suggest another test avenue: write test harnesses for modules so that you can exercise them in isolation and better understand how they work. If the test harness ends up part of a regression test system so much the better.


DWIM is Perl's answer to Gödel

tinita (Parson) on Jul 12, 2006 at 12:28 UTC

Re^2: Strategies for maintenance of horrible code?


by tinita (Parson) on Jul 12, 2006 at 12:28 UTC

Maybe. For another point of view, see Joel Spolsky on not rewriting from scratch.
uh oh. why does this remind me of perl 6? =)

adrianh (Chancellor) on Jul 12, 2006 at 07:49 UTC

Re: Strategies for maintenance of horrible code?
Debugging this code is a whole different game, and I'd really appreciate some input from other monks who've dealt with this type of problem.

I'd recommend reading Perl Medic and Working Effectively with Legacy Code (the latter isn't Perl specific - but is chock full of useful advice).

I would not spend any time fixing the code if it's not breaking (assuming you're not being paid to review/fix the code). However evil it may be - if it's doing its job, leave it alone.

Instead - every time you need to fix a bug or add some new functionality just test/refactor the bits of the evil code that are touched by the changes. I've found incrementally adding tests and refactoring to be much more effective than any sort of "big bang" fixing things for the sake of them approach :-)

If you are being paid to do a review/fix then Perl::Critic might give you some useful places to look.

webfiend (Vicar) on Jul 14, 2006 at 21:24 UTC

Re^2: Strategies for maintenance of horrible code?


by webfiend (Vicar) on Jul 14, 2006 at 21:24 UTC

Definitely agree about the approach of sorting things out with gradual refactoring and tests as the need arises. The problem with the "Big Bang" approach is that you have the potential for a very long stretch of time where there are two forks of the code: ugly shipping code that will need to be fixed and refactored as bugs are reported, and pretty nonfunctioning code that will need to incorporate those fixes as they are uncovered, resulting in a perpetual loop of "it's not quite ready yet."

Ovid (Cardinal) on Jul 12, 2006 at 10:37 UTC

Re: Strategies for maintenance of horrible code?

Check out Suggestions for working with poor code and some of the replies.

Cheers,
Ovid

New address of my CGI Course .

GrandFather (Sage) on Jul 12, 2006 at 07:39 UTC

Re: Strategies for maintenance of horrible code?

What tools are you using already and on what platform? For a large range of "detail" debugging there is nothing like as good as an IDE with a good integrated debugger. For a higher level view of where things are going Devel::TraceCalls may be handy, although its output can be rather voluminous.


DWIM is Perl's answer to Gödel

Moron (Curate) on Jul 12, 2006 at 12:13 UTC

Re: Strategies for maintenance of horrible code?

Some basic CYA I can see:

1) Ensure there is sufficient functional and technical design documentation against which the routines can be tested.

2) (updated) Make sure there is a sufficiently detailed project plan to include tasks for: systems analysis, functional and technical design, test planning, test script writing (e.g. using Expect ), developing, unit-, integrated and functional testing, rework and implementation, to include a GANTT chart of the work done so far and by who to what % of completion, to avoid getting the blame for not meeting poorly conceived targets over which you had no control.

In response to formal testing against the plan, I find it a useful aid to bug-fixing to monitor execution with perl -d, setting breakpoints and examining variables to hunt down which line of code causes each failure.

-M

Free your mind
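
For reference, a minimal perl -d session of the kind Moron describes (the script name, subroutine and variable are placeholders):

    $ perl -d myscript.pl
      DB<1> b process_order      # break when the suspect sub is entered
      DB<2> c                    # continue to the breakpoint
      DB<3> x $order             # examine a variable at that point
      DB<4> n                    # step over the next statement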

aufflick (Deacon) on Jul 13, 2006 at 00:17 UTC

Re: Strategies for maintenance of horrible code?

You might find the comments to my recent question Generating documentation from Perl code (not just POD) useful.

The Doxygen perl extension creates docs that are great for seeing what classes re-implement what methods etc. Also the UML::Sequence sounds intriguing - it purports to generate a sequence diagram by monitoring code execution.

Anonymous Monk on Jul 12, 2006 at 06:40 UTC

Re: Strategies for maintenance of horrible code?

Ignore nothing. What's the nature of the problem?


[Nov 15, 2017] Suggestions for working with poor code

Notable quotes:
"... Still looking for time to record time usage ..."
Nov 15, 2017 | perlmonks.com

Suggestions for working with poor code by Ovid (Cardinal)

on May 10, 2001 at 01:34 UTC

I am currently working on adding a fair amount of functionality to a Web site whose programs have been designed very poorly. Amongst other things, taint checking and strict have not been used. Code has been thrown together without regard to side effects, massive Here docs are used to output HTML, etc. Since I am getting a fair amount of experience with these issues, I thought I would offer some of my observations for fellow monks. Some of these pertain to the existing code and concentrate on 'quick fixes'. Some pertain to new code that's added.

Quick (?) Fixes

Adding new functionality

Any and all tips that others wish to add are welcome!

Cheers,
Ovid

Join the Perlmonks Setiathome Group or just click on the link and check out our stats.

dws (Chancellor) on May 10, 2001 at 01:51 UTC

Re: Suggestions for working with poor code

Bad formatting can hide a number of sins.

If necessary, the first thing I do when taking on bad code is reformat it. It doesn't matter whether it's Perl, Java, C, or HTML. A surprising number of problems (like mangled boolean conditions in branches and loops) fall right out when the code is tidied up so that you can actually see what it's doing.

Then it's a lot easier to get on with the fixes Ovid suggests.

tinman (Curate) on May 10, 2001 at 02:02 UTC

Re: Suggestions for working with poor code

I've found that taking a deep breath and a step back from the turmoil of badly written code can help immensely.. quite a few instances where you can see places where code can be consolidated into a single reusable library..

With this in mind, trying to understand the basic intent of the code is really important to me.. I write down a small note describing what each section of code tries to do... this allows me to focus on reuse as well as consolidate several segments together..

Related to this: in addition to liberal comments, updating documentation or in some cases, writing some document that describes the structure and function of a code block is very helpful to any person maintaining the code.. you don't have to wonder "what was that guy thinking" or "why did he do *that* ?".. its all there in a document.. and also provides a cursory overview of what has been going on without jumping straight into the code (I'm a big fan of the saying that goes "the less time you spend planning, the more time you spend coding" )...
Caveat: Docs that aren't updated are worse than useless, though...

clemburg (Curate) on May 10, 2001 at 12:20 UTC

Re: Suggestions for working with poor code

Track how long it takes you to fix bugs.

I agree enthusiastically. It will be your only argument when somebody comes and asks you where all the hours have gone. For this kind of job (take responsibility for badly written code, fixing bugs, etc.) this is an absolute must.

For these purposes, two little forms (or spreadsheets, or editor modes/templates, or whatever) will be very helpful (pedantically detailed discussion of these can be found in An Introduction to the Personal Software Process , electronic materials are available at The PSP Resource Page , including time tracking tools, emacs modes, forms, etc.):

  • Time recording log
  • Defect recording log

These are the essentials of both (header columns, add date, person, project, client, etc. as you need):

Time recording log:

  • Start Time
  • Stop Time
  • Interruption Time
  • Delta Time
  • Activity Category (coding, testing, reading docs - make up your own)
  • Comments (more detailed description of task)

Defect recording log:

  • Defect ID (e.g., sequential number)
  • Type (one of: documentation, syntax, build/package, assignment, interface, checking, data, function, system, environment - your own are welcome)
  • Inject Phase (when was the defect put into the program - estimate - design, coding, testing, linking, etc.)
  • Remove Phase (when was the defect found - compile time, testing, etc.)
  • Fix Time (how long did it take to fix)
  • Description (description of defect)

Contrary to what you may think, it does *not* take much time to use these forms (or similar means to record the information). But it will give you all the data you need to be sure you did the Right Thing, and the confidence and evidence to convince your boss or client that what you did was worth the time and the money.

Christian Lemburg
Brainbench MVP for Perl
http://www.brainbench.com

coreolyn (Parson) on May 11, 2001 at 18:55 UTC

Re: Re: Suggestions for working with poor code


by coreolyn (Parson) on May 11, 2001 at 18:55 UTC

You mean these logs haven't been automated into a CPAN module yet??

coreolyn Still looking for time to record time usage

r.joseph (Hermit) on May 10, 2001 at 04:04 UTC

Re: Suggestions for working with poor code

Wonderful post Ovid - just added to my favs list. For someone who had the great misfortune a while back of inheriting a large, ill-maintained and astrociously coded website, I know what you mean and this post really highlights some of the main points that go into fixing it.

I also have to agree heartily with the replies, although I would like to add something. I find sometimes that it actually helps, with particularly insubordinate code, to take part of it out of the main file (say, a sub) and put it into another script that has major error-checking, lots of warnings and what not, and then test it from there. Sometimes this will yield a solution very quickly, and other times it has quickly allowed me to see what was wrong and what needed to be recoded.

Just thought I'd offer a quick idea...great job again!

r.
"Violence is a last resort of the incompetent" - Salvor Hardin, Foundation by Issac AsimovW

knobunc (Pilgrim) on May 10, 2001 at 18:19 UTC

Re: Suggestions for working with poor code

Very cool node.

With regard to the To Do list, I scatter them throughout my code if there is a place I need to do further work. However, I have a make rule for todo that searches for all of the lines with TODO in them and prints them out. So a usage of a TODO:

    if ($whatever) {
        # TODO - Finish code to take over the world
    }

Becomes:

    To Do List
    Dir/file.pl 132: Finish code to take over the world

When run through the following (ugly, suboptimal, but working) code in Tools/todo.sh :

    #!/bin/sh
    echo 'To Do List'
    find . -type f | xargs grep -n TODO | perl -ne '
        ($file, $line, $rest) = split /:/, $_, 3;
        $file =~ s|^./||;
        $rest =~ s|.*?TODO.*?[-\s:]+||;
        $rest =~ s|"[.;,]\s*$||;
        $rest =~ s|\\n||g;
        print "$file $line: \u$rest\n"' | sort | uniq | grep -v '.#' | grep -v Makefile | grep -v CVS

Which I call from my Makefile:

    todo:
    	Tools/todo.sh

Kinda ugly, but it lets me put the TODO statements where I actually need to do the work. So I can proof out a block of code by writing narrative comments with TODO at the start of the line (behind comment characters of course). Then fill in the code later and not worry about missing a piece. Also since the TODOs are where the stuff needs to be filled in, I have lots of context around the issue and don't need to write as much as I would if they were at the top of the file. Plus anyone without something to do in the group can just type make todo and add some code. Finally, it is easier to add a TODO right where you need it, than bop up to the top of the file and then have to find where you were back in the code.

-ben


[Nov 15, 2017] A crucial element in controlling time is controlling the amount of detail needed to gain understanding. It is easy to lose sight of the forest for the trees.

Notable quotes:
"... The Perl Monks website has 83 data tables, two main type hierarchies (nodetypes and perl classes), a core engine of about 12K and about 600 additional code units spread throughout the database. Documentation is scattered and mostly out of date. ..."
"... The initial architecture seems solid but its features have been used inconsistently over time. ..."
Nov 15, 2017 | perlmonks.com

Re^2: Swallowing an elephant in 10 easy steps
by ELISHEVA (Prior) on Aug 13, 2009 at 18:27 UTC

The time drivers are the overall quality of the design, ease of access to code and database schemas, and the size of the system: the number of database tables, the complexity of the type/class system(s), the amount of code, and the number of features in whatever subsystem you explore in step 10. Rather than an average, I'll take the most recent example, Perl Monks.

The Perl Monks website has 83 data tables, two main type hierarchies (nodetypes and perl classes), a core engine of about 12K and about 600 additional code units spread throughout the database. Documentation is scattered and mostly out of date.

The initial architecture seems solid but its features have been used inconsistently over time. Accessing the schema and code samples is slow because there is no tarball to download - it has to be done through the web interface or manually cut and pasted into files off line. The database/class assessment (1-4) took about 16 hours. Steps 5-7 took about 30 hours. Steps 8-10 took about 24 hours. All told that is 70 hours, including writing up documentation and formatting it with HTML.

However, I always like to leave myself some breathing space. If I were contracting to learn a system that size, I'd want 90 hours and an opportunity to reassess time schedules after the initial code walk through was complete. If a system is very poorly designed this process takes somewhat longer.

A crucial element in controlling time is controlling the amount of detail needed to gain understanding. It is easy to lose sight of the forest for the trees. That is why I advise stopping and moving on to the next phase once your categories give a place to most design elements and the categories work together to tell a story. That is also why I recommend backtracking as needed. Sometimes we make mistakes about which details really matter and which can be temporarily blackboxed. Knowing I can backtrack lets me err on the side of black boxing.

The other element affecting time is, of course, the skill of the analyst or developer. I have the advantage that I have worked both at the coding and the architecture level of software. I doubt I could work that fast if I didn't know how to read code fluently and trace the flow of data through code. Having been exposed to many different system designs over the years also helps - architectural strategies leave telltale footprints and experience helps me pick up on those quickly.

However one can also learn these skills by doing. The more you practice scanning, categorizing and tracing through code and data the better you get at it. It will take longer, but the steps are designed to build on themselves and are, in a way, self-teaching. That is why you can't just do the 10 steps in parallel as jdporter jokingly suggests below.

However some theoretical context and a naturally open mind definitely helps: if you think that database tables should always have a one-to-one relationship with classes you will be very very confused by a system where that isn't true. If I had to delegate this work to someone else I probably would work up a set of reading materials on different design strategies that have been used in the past 30 years. Alternatively or in addition, I might pair an analyst with a programmer so that they could learn from each other (with neither having priority!)

Best, beth

Update: expanded description of the PerlMonks system so that it addresses all of the time drivers mentioned in the first paragraph.

Update: fixed miscalculation of time

[Nov 15, 2017] Xref helped me make sense of the interactions in the old codebase. I didn't bother with any visualization tools or graph-creation, though. I just took the output of perl -MO=Xref filename for each file, removed some of the cruft with a text editor, ran it through mpage -4 to print, and spent a day with coffee and pencil, figuring out how things worked.

Nov 15, 2017 | perlmonks.com

dave0 (Friar) on Apr 15, 2005 at 15:32 UTC

Re: Analyzing large Perl code base.

Having recently done this on a fairly large codebase that grew organically (no design, no refactoring) over the course of four years, I feel your pain.

Writing a testsuite, on any level, is nearly essential for this. If you're rewriting an existing module, you'll need to ensure it's compatible with the old one, and the only sane way to do that is to test. If the old code is monolithic, it might be difficult to test individual units, but don't let that stop you from testing at a higher level.

B::Xref helped me make sense of the interactions in the old codebase. I didn't bother with any visualization tools or graph-creation, though. I just took the output of perl -MO=Xref filename for each file, removed some of the cruft with a text editor, ran it through mpage -4 to print, and spent a day with coffee and pencil, figuring out how things worked.

Pretty much the same tactic was used on the actual code. Print it out, annotate it away from the computer, and then sit down with the notes to implement the refactoring. If your codebase is huge (mine was about 4-5k lines in several .pl and .pm files, and was still manageable) you might not want to do this, though.
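
A rough sketch of automating the B::Xref dump step described above, one report per file (the file globs are assumptions):

    use strict;
    use warnings;

    for my $file (glob 'lib/*.pm bin/*.pl') {
        (my $report = "xref-$file.txt") =~ s{/}{_}g;
        system("perl -MO=Xref $file > $report 2>&1") == 0
            or warn "B::Xref failed for $file\n";
    }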

[Nov 15, 2017] Generating documentation from Perl code

Nov 15, 2017 | perlmonks.com

Re: Strategies for maintenance of horrible code?
by aufflick (Deacon) on Jul 13, 2006 at 00:17 UTC

You might find the comments to my recent question Generating documentation from Perl code (not just POD) useful.

The Doxygen perl extension creates docs that are great for seeing what classes re-implement what methods etc. Also the UML::Sequence sounds intriguing - it purports to generate a sequence diagram by monitoring code execution.

[Nov 15, 2017] Generating documentation from Perl code (not just POD)

Nov 15, 2017 | perlmonks.com

Generating documentation from Perl code (not just POD) by aufflick (Deacon)

on Jul 11, 2006 at 05:15 UTC, aufflick has asked for the wisdom of the Perl Monks concerning the following question:

Ideally a script/module would do that, and also interleave the POD from the file, so any POD directly before the method/sub would be linked to it. Any method/sub without POD would at least be documented by its name.

Of course a major limitation is that (for OO Perl at least), we have no idea what the method arguments are, simply from robotically inspecting the code. Something I always liked in OpenACS is the way that they replace the builtin Tcl proc keyword with a custom ad_proc keyword that works just the same as proc but which takes an optional documentation block that accepts javadoc-like keyword embedding and also a block detailing any arguments and their default values. Because of the tight coupling, the generated documentation is very rich for little developer effort.

Does anyone know of attempts at this sort of thing in Perl, or have any good ideas to offer? Ideally I want to come up with something that will work with existing Perl code, and that any extensions won't break normal Perl compilation (no literate programming preprocessors need apply).

/Mark

planetscape (Chancellor) on Jul 11, 2006 at 06:27 UTC

Re: Generating documentation from Perl code (not just POD)

These links should get you started:


DoxyFilt (Doxygen for Perl)
Analyzing large Perl code base.
Becoming familiar with a too-big codebase?

HTH,

aufflick (Deacon) on Jul 13, 2006 at 00:34 UTC

Re^2: Generating documentation from Perl code (not just POD)



Wow - Doxygen + Doxyfilt is *exactly* what I was looking for - fantastic!

planetscape (Chancellor) on Jul 13, 2006 at 00:39 UTC

Re^3: Generating documentation from Perl code (not just POD)

Glad to hear it! :-D

If you have any questions about configuring Doxyfile to run Doxygen / DoxyFilt under Cygwin , for instance, please /msg me .

HTH,

aufflick (Deacon) on Jul 13, 2006 at 02:12 UTC

Re^4: Generating documentation from Perl code (not just POD)

BrowserUk (Pope) on Jul 13, 2006 at 02:31 UTC

Re^3: Generating documentation from Perl code (not just POD)
Wow - Doxygen + Doxyfilt is *exactly* what I was looking for - fantastic!

Now, please someone with influence sell the P6 guys on Doxygen. Let's have it built into the language and allow the terminally-ill POD slip away peacefully.



philcrow (Priest) on Jul 11, 2006 at 13:41 UTC

Re: Generating documentation from Perl code (not just POD)

I'm interested in this area. Once, in the past, I wrote UML::Sequence, which runs Perl programs to produce sequence diagrams of what they actually do. This leads me to think that you could write a special driver using the debugger hooks to load the modules from your app, then dump out their inheritance relationships, etc. (based on what is loaded and what those modules have in their @ISA and symbol tables). Maybe that could be incorporated with some good POD parsing, but I'm just rambling now.

Phil

Replies are listed 'Best First'.

[Nov 15, 2017] With regard to the To Do list, I scatter them throughout my code if there is a place I need to do further work.

Nov 15, 2017 | perlmonks.com

knobunc (Pilgrim) on May 10, 2001 at 18:19 UTC

Re: Suggestions for working with poor code

Very cool node.

With regard to the To Do list, I scatter them throughout my code if there is a place I need to do further work. However, I have a make rule for todo that searches for all of the lines with TODO in them and prints them out. So a usage of a TODO:

if ($whatever) {
    # TODO - Finish code to take over the world
}

Becomes:

To Do List
Dir/file.pl 132: Finish code to take over the world

When run through the following (ugly, suboptimal, but working) code in Tools/todo.sh:

#!/bin/sh

echo 'To Do List'

find . -type f | xargs grep -n TODO | perl -ne \
'($file, $line, $rest) = split /:/, $_, 3;
$file =~ s|^\./||;
$rest =~ s|.*?TODO.*?[-\s:]+||;
$rest =~ s|[".;,]\s*$||;
$rest =~ s|\\n||g; print "$file $line: \u$rest\n"' |
sort | uniq | grep -v '\.#' | grep -v Makefile | grep -v CVS

Which I call from my Makefile:

todo:
	Tools/todo.sh

Kinda ugly, but it lets me put the TODO statements where I actually need to do the work.

So I can proof out a block of code by writing narrative comments with TODO at the start of the line (behind comment characters of course).

Then fill in the code later and not worry about missing a piece. Also since the TODOs are where the stuff needs to be filled in, I have lots of context around the issue and don't need to write as much as I would if they were at the top of the file. Plus anyone without something to do in the group can just type make todo and add some code. Finally, it is easier to add a TODO right where you need it, than bop up to the top of the file and then have to find where you were back in the code.

-ben

[Nov 15, 2017] Basic Debugger Commands

Notable quotes:
"... pseudo-signal handlers, ..."
"... programmatic debugger control ..."
Nov 15, 2017 | my.safaribooksonline.com

Debugging is just an extreme case of dynamic analysis. Third-party code can be extremely convoluted (so can your own code, of course, but you don't usually think of it that way because you're familiar with it; you knew it when it was just a subroutine); sometimes you just can't tell how part of the code fits in, or whether it's called at all. The code is laid out in some arrangement that makes no sense; if only you could see where the program would actually go when it was run.

Well, you can, using Perl's built-in debugger. Even though you're not actually trying to find a bug, the code-tracing ability of the debugger is perfect for the job.

This isn't the place for a full treatment of the debugger (you can see more detail in [ SCOTT01 ]), but fortunately you don't need a full treatment; a subset of the commands is enough for what you need to do. (Using the debugger is like getting in a fight; it's usually over very quickly without using many of the fancy moves you trained for.)

You invoke the debugger with the -d command-line flag; either edit the program to add -d to the shebang line, or run the program by invoking Perl explicitly:

% perl -d program argument argument...

Make sure that the perl in your path is the same one in the shebang line of program or you'll go crazy if there are differences between the two perls.

Basic Debugger Commands

Armed with these commands, we can go code spelunking. Suppose you are debugging a program containing the following code fragment:

77 for my $url (@url_queue)
78 {
79 my $res = $ua->request($url);
80 summarize($res->content);
81 }

and you know that whenever the program gets to the URL http://www.perlmedic.com/fnord.html something strange happens in the summarize() subroutine. You'd like to check the HTTP::Response object to see if there were any redirects you didn't know about. You start the program under the debugger and type:

DB<1> b 80 $url =~ /fnord/
DB<2>

The program will run until it has fetched the URL you're interested in, at which point you can examine the response object.

Perl 5.8.0 and later will give you a stack trace anyway if you run a program under the debugger and some code triggers a warning. But suppose you are either running under an earlier perl, or you'd really like to have a debugger prompt at the point the warning was about to happen.

You can combine two advanced features of Perl to do this: pseudo-signal handlers, and programmatic debugger control .

A signal handler is a subroutine you can tell Perl to execute whenever your program receives a signal. For instance, when the user interrupts your program by pressing Control-C, that works by sending an INT signal to your program, which interprets it by default as an instruction to stop executing.

There are two pseudo-signals, called __WARN__ and __DIE__ . They aren't real signals, but Perl "generates" them whenever it's told to issue a warning or to die, respectively. You can supply code to be run in those events by inserting a subroutine reference in the %SIG hash (see perlvar ) as follows:

$SIG{__WARN__} = sub { print "Ouch, I'm bad" };

(Try it on some code that generates a warning.)
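For instance, here is a minimal, self-contained sketch (not from the book's text) of installing a __WARN__ handler and triggering it:

#!/usr/bin/perl
use strict;
use warnings;

# Route all warnings through our own handler; the message arrives in $_[0].
$SIG{__WARN__} = sub { print "Caught a warning: $_[0]" };

my $x;
my $y = $x + 1;   # "Use of uninitialized value" warning goes to the handler
print "y=$y\n";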

The next piece of the solution is that the debugger can be controlled from within your program; the variable $single in the special package DB determines what Perl does at each statement: 0 means keep going, and 1 or 2 mean give a user prompt.[1] So setting $DB::single to 1 in a pseudo-signal handler will give us a debugger prompt at just the point we wanted.

1. The difference between the two values is that a 1 causes the debugger to act as though the last n or s command the user typed was s, whereas a 2 is equivalent to an n. When you type an empty command in the debugger (just hit Return), it repeats whatever the last n or s command was.

Putting the pieces together, you can start running the program under the debugger and give the commands:

DB<1> $SIG{__WARN__} = sub { warn @_; $DB::single = 1 }
DB<2>

Now the program will breakpoint where it was about to issue a warning, and you can issue a T command to see a stack trace, examine data, or do anything else you want.[2] The warning is still printed first.

2. Under some circumstances, the breakpoint might not occur at the actual place of warning: the current routine might return if the statement triggering the warning is the last one being executed in that routine.

Unfortunately, no __DIE__ pseudo-signal handler will return control to the debugger (evidently death is considered too pressing an engagement to be interrupted). However, you can get a stack trace by calling the confess() function in the Carp module:

DB<1> use Carp
DB<2> $SIG{__DIE__} = sub { confess (@_) }

The output will look something like this:

DB<3>
Insufficient privilege to launch preemptive strike at wargames line 109.
main::__ANON__[(eval 17)[/usr/lib/perl5/5.6.1/perl5db.pl:1521]:2]('Insufficient privilege to launch preemptive strike at wargames line 109.^J') called at wargames line 121
main::preemptive('Strike=HASH(0x82069d4)') called at wargames line 109
main::make_strike('ICBM=HASH(0x820692c)') called at wargames line 74
main::icbm('Silo_ND') called at wargames line 32
main::wmd('ICBM') called at wargames line 22
main::strike() called at wargames line 11
main::menu() called at wargames line 5
Debugged program terminated. Use q to quit or R to restart,
use O inhibit_exit to avoid stopping after program termination,
h q, h R or h O to get additional info.

I've often found it amusing that the debugger refers to the program at this point as "debugged."

[Nov 15, 2017] Strange behaviour of tr function in case the set1 is supplied by a variable

Notable quotes:
"... Characters may be literals or any of the escape sequences accepted in double-quoted strings. But there is no interpolation, so "$" and "@" are treated as literals. ..."
Nov 15, 2017 | perlmonks.com
Nov 16, 2017 at 02:50 UTC

likbez has asked for the wisdom of the Perl Monks concerning the following question:

Looks like in the tr function a scalar variable is accepted as the first argument, but it is not compiled properly into a set of characters:

use strict;
use warnings;

my $str1 = 'abcde';
my $str2 = 'eda';

my $diff1 = ($str1 =~ tr/$str2//);
my $diff2 = ($str1 =~ tr/eda//);
print "Test 1: strait set diff1=$diff1, diff2=$diff2\n";

$diff1 = ($str1 =~ tr/$str2//c);
$diff2 = ($str1 =~ tr/eda//c);
print "Test 2: complement set diff1=$diff1, diff2=$diff2\n";

This produces in Perl 5.26:

Test 1: strait set diff1=0, diff2=3
Test 2: complement set diff1=5, diff2=2

Obviously, only the second result in each test is correct. It looks like only an explicitly given first set is compiled correctly. Is this a feature or a bug?

Athanasius (Chancellor) on Nov 16, 2017 at 03:08 UTC

Re: Strange behaviour of tr function in case the set1 is supplied by a variable

Hello likbez ,

The transliteration operator tr/SEARCHLIST/REPLACEMENTLIST/ does not interpolate its SEARCHLIST, so in your first example the search list is simply the literal characters $, s, t, r, 2. See "Quote and Quote-like Operators" in perlop.

Hope that helps,


roboticus (Chancellor) on Nov 16, 2017 at 03:08 UTC

Re: Strange behaviour of tr function in case the set1 is supplied by a variable

likbez :

Feature, per the tr docs

Characters may be literals or any of the escape sequences accepted in double-quoted strings. But there is no interpolation, so "$" and "@" are treated as literals.

A hyphen at the beginning or end, or preceded by a backslash is considered a literal. Escape sequence details are in the table near the beginning of this section.

So if you want to use a string to specify the values in a tr statement, you'll probably have to do it via a string eval:

$ cat foo.pl
use strict;
use warnings;

my $str1 = 'abcde';
my $str2 = 'eda';
my $diff1 = 0;
eval "\$diff1=\$str1=~tr/$str2//";
print "diff1: $diff1\n";

$ perl foo.pl
diff1: 3

... roboticus

When your only tool is a hammer, all problems look like your thumb.
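As an aside (this is not part of the thread above): if the string eval feels too heavy, one alternative sketch for counting the characters of one string that occur in another is a regex character class built from the variable, with quotemeta (\Q...\E) neutralising any metacharacters. The variable names follow the example above.

use strict;
use warnings;

my $str1 = 'abcde';
my $str2 = 'eda';

# Count characters of $str1 that appear in $str2, without a string eval;
# the empty-list assignment forces list context so the matches are counted.
my $count = () = $str1 =~ /[\Q$str2\E]/g;
print "count: $count\n";   # prints 3, like tr/eda//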

Anonymous Monk on Nov 16, 2017 at 03:09 UTC

Re: Strange behaviour of tr function in case the set1 is supplied by a variable

Looks like in the tr function a scalar variable is accepted as the first argument, but it is not compiled properly into a set of characters

:)

You're guessing how tr/// works; you're guessing it works like s/// or m///. But you can't guess: it doesn't work like that, it doesn't interpolate variables. Read perldoc -f tr for the details.

likbez !!! on Nov 16, 2017 at 04:41 UTC

Re^2: Strange behaviour of tr function in case the set1 is supplied by a variable
You're guessing how tr/// works; you're guessing it works like s/// or m///. But you can't guess: it doesn't work like that, it doesn't interpolate variables. Read perldoc -f tr for the details.
Houston, we have a problem ;-)

First of all, that limits the area of applicability of tr.

Second, it's not that I am guessing; I just (wrongly) extrapolated regex behavior onto tr, as people use regexes more often than tr. Funny, but from searching my old code and the comments in it, it is clear that I remembered this nuance several years ago (probably discovered the hard way, not by reading the documentation ;-). Not now. Completely forgotten. Erased from memory. And that tells you something about Perl complexity (actually tr is not that frequently used by most programmers, especially for counting characters).

And that's a real situation that we face with Perl in other areas too (and not only with Perl): Perl exceeds the typical human memory capacity for holding information about the language. That's why we need "crutches" like strict.

You simply can't remember all the nuances of more than a dozen string-related built-in functions, can you? You probably can (and should) for index/rindex and substr, but that's about it.

So there are two problems here:

1. Are / / strings uniformly interpreted in the language, or is there a "gotcha" because they are interpreted differently by tr (essentially as single-quoted strings) and by regexes (as double-quoted strings)?

2. If so, what is the quality of warnings about this gotcha? There is no warning issued even if you use strict and warnings. BTW, it looks like $ can be escaped:

main::(-e:1): 0
DB<5> $_='\$bba\$'
DB<6> tr/\$/?/
DB<7> print $_
\?bba\?


Right now zero warnings are issued with use strict and use warnings enabled. Looks like this idea of using =~ for tr was not so good after all. Regular syntax like tr(set1, set2) would be much better, but it's too late to change, and now we need warnings to be implemented.

likbez !!! on Nov 16, 2017 at 03:10 UTC

Re: Strange behaviour of tr function in case the set1 is supplied by a variable

With eval the statement works correctly. So it looks like $ is treated by tr as a regular symbol, and no warnings are issued.

$statement='$diff1=$str1'."=~tr/$str2//;";
eval($statement);
print "With eval: diff1=$diff1\n";

that will produce:

With eval: diff1=3

ww (Archbishop) on Nov 16, 2017 at 03:16 UTC

Re: Strange behaviour of tr function in case the set1 is supplied by a variable

Same results in AS 5.24 under Win7x64.

Suspected the problem might have arisen from lack of strict and warnings. Wrong: same results, BUT using both remains a generally good idea.

Also wondered if compiling (with qr/.../ ) might change the outcome. Wrong again, albeit with variant (erroneous) output.

Correct me if I'm wrong, guessing that "strait" is a typo or personal shortening of "straight."

Update: Now that I've seen earlier replies... ouch, pounding forehead into brick wall!

[Nov 14, 2017] Exporter - search.cpan.org

Nov 14, 2017 | search.cpan.org

Exporter 5.72

NAME

Exporter - Implements default import method for modules

SYNOPSIS

In module YourModule.pm :

  package YourModule;
  require Exporter;
  @ISA = qw(Exporter);
  @EXPORT_OK = qw(munge frobnicate);  # symbols to export on request

or

  package YourModule;
  use Exporter 'import'; # gives you Exporter's import() method directly
  @EXPORT_OK = qw(munge frobnicate);  # symbols to export on request

In other files which wish to use YourModule :

  use YourModule qw(frobnicate);      # import listed symbols
  frobnicate ($left, $right)          # calls YourModule::frobnicate

Take a look at "Good Practices" for some variants you will like to use in modern Perl code.

DESCRIPTION

The Exporter module implements an import method which allows a module to export functions and variables to its users' namespaces. Many modules use Exporter rather than implementing their own import method because Exporter provides a highly flexible interface, with an implementation optimised for the common case.

Perl automatically calls the import method when processing a use statement for a module. Modules and use are documented in perlfunc and perlmod . Understanding the concept of modules and how the use statement operates is important to understanding the Exporter.

How to Export

The arrays @EXPORT and @EXPORT_OK in a module hold lists of symbols that are going to be exported into the users name space by default, or which they can request to be exported, respectively. The symbols can represent functions, scalars, arrays, hashes, or typeglobs. The symbols must be given by full name with the exception that the ampersand in front of a function is optional, e.g.

    @EXPORT    = qw(afunc $scalar @array);   # afunc is a function
    @EXPORT_OK = qw(&bfunc %hash *typeglob); # explicit prefix on &bfunc

If you are only exporting function names it is recommended to omit the ampersand, as the implementation is faster this way.

Selecting What to Export

Do not export method names!

Do not export anything else by default without a good reason!

Exports pollute the namespace of the module user. If you must export try to use @EXPORT_OK in preference to @EXPORT and avoid short or common symbol names to reduce the risk of name clashes.

Generally anything not exported is still accessible from outside the module using the YourModule::item_name (or $blessed_ref->method ) syntax. By convention you can use a leading underscore on names to informally indicate that they are 'internal' and not for public use.

(It is actually possible to get private functions by saying:

  my $subref = sub { ... };
  $subref->(@args);            # Call it as a function
  $obj->$subref(@args);        # Use it as a method

However if you use them for methods it is up to you to figure out how to make inheritance work.)

As a general rule, if the module is trying to be object oriented then export nothing. If it's just a collection of functions then @EXPORT_OK anything but use @EXPORT with caution. For function and method names use barewords in preference to names prefixed with ampersands for the export lists.

Other module design guidelines can be found in perlmod .

How to Import

In other files which wish to use your module there are three basic ways for them to load your module and import its symbols:

use YourModule;
This imports all the symbols from YourModule's @EXPORT into the namespace of the use statement.
use YourModule ();
This causes perl to load your module but does not import any symbols.
use YourModule qw(...);
This imports only the symbols listed by the caller into their namespace. All listed symbols must be in your @EXPORT or @EXPORT_OK , else an error occurs. The advanced export features of Exporter are accessed like this, but with list entries that are syntactically distinct from symbol names.

Unless you want to use its advanced features, this is probably all you need to know to use Exporter.

Advanced Features

Specialised Import Lists

If any of the entries in an import list begins with !, : or / then the list is treated as a series of specifications which either add to or delete from the list of names to import. They are processed left to right. Specifications are in the form:

    [!]name         This name only
    [!]:DEFAULT     All names in @EXPORT
    [!]:tag         All names in $EXPORT_TAGS{tag} anonymous array
    [!]/pattern/    All names in @EXPORT and @EXPORT_OK which match

A leading ! indicates that matching names should be deleted from the list of names to import. If the first specification is a deletion it is treated as though preceded by :DEFAULT. If you just want to import extra names in addition to the default set you will still need to include :DEFAULT explicitly.

e.g., Module.pm defines:

    @EXPORT      = qw(A1 A2 A3 A4 A5);
    @EXPORT_OK   = qw(B1 B2 B3 B4 B5);
    %EXPORT_TAGS = (T1 => [qw(A1 A2 B1 B2)], T2 => [qw(A1 A2 B3 B4)]);

Note that you cannot use tags in @EXPORT or @EXPORT_OK.

Names in EXPORT_TAGS must also appear in @EXPORT or @EXPORT_OK.

An application using Module can say something like:

    use Module qw(:DEFAULT :T2 !B3 A3);

Other examples include:

    use Socket qw(!/^[AP]F_/ !SOMAXCONN !SOL_SOCKET);
    use POSIX  qw(:errno_h :termios_h !TCSADRAIN !/^EXIT/);

Remember that most patterns (using //) will need to be anchored with a leading ^, e.g., /^EXIT/ rather than /EXIT/ .

You can say BEGIN { $Exporter::Verbose=1 } to see how the specifications are being processed and what is actually being imported into modules.

Exporting Without Using Exporter's import Method

Exporter has a special method, 'export_to_level' which is used in situations where you can't directly call Exporter's import method. The export_to_level method looks like:

    MyPackage->export_to_level(
        $where_to_export, $package, @what_to_export
    );

where $where_to_export is an integer telling how far up the calling stack to export your symbols, and @what_to_export is an array telling what symbols *to* export (usually this is @_ ). The $package argument is currently unused.

For example, suppose that you have a module, A, which already has an import function:

    package A;

    @ISA = qw(Exporter);
    @EXPORT_OK = qw($b);

    sub import
    {
        $A::b = 1;     # not a very useful import method
    }

and you want to Export symbol $A::b back to the module that called package A. Since Exporter relies on the import method to work, via inheritance, as it stands Exporter::import() will never get called. Instead, say the following:

    package A;
    @ISA = qw(Exporter);
    @EXPORT_OK = qw($b);

    sub import
    {
        $A::b = 1;
        A->export_to_level(1, @_);
    }

This will export the symbols one level 'above' the current package - ie: to the program or module that used package A.

Note: Be careful not to modify @_ at all before you call export_to_level - or people using your package will get very unexplained results!

Exporting Without Inheriting from Exporter

By including Exporter in your @ISA you inherit an Exporter's import() method but you also inherit several other helper methods which you probably don't want. To avoid this you can do:

  package YourModule;
  use Exporter qw(import);

which will export Exporter's own import() method into YourModule. Everything will work as before but you won't need to include Exporter in @YourModule::ISA .

Note: This feature was introduced in version 5.57 of Exporter, released with perl 5.8.3.

Module Version Checking

The Exporter module will convert an attempt to import a number from a module into a call to $module_name->VERSION($value) . This can be used to validate that the version of the module being used is greater than or equal to the required version.

For historical reasons, Exporter supplies a require_version method that simply delegates to VERSION . Originally, before UNIVERSAL::VERSION existed, Exporter would call require_version .

Since the UNIVERSAL::VERSION method treats the $VERSION number as a simple numeric value it will regard version 1.10 as lower than 1.9. For this reason it is strongly recommended that you use numbers with at least two decimal places, e.g., 1.09.
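For example (a minimal illustration using the hypothetical YourModule from the SYNOPSIS; both forms end up calling YourModule->VERSION(1.09)):

  use YourModule 1.09;                  # just the version check
  use YourModule 1.09 qw(frobnicate);   # version check plus an import list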

Managing Unknown Symbols

In some situations you may want to prevent certain symbols from being exported. Typically this applies to extensions which have functions or constants that may not exist on some systems.

The names of any symbols that cannot be exported should be listed in the @EXPORT_FAIL array.

If a module attempts to import any of these symbols the Exporter will give the module an opportunity to handle the situation before generating an error. The Exporter will call an export_fail method with a list of the failed symbols:

  @failed_symbols = $module_name->export_fail(@failed_symbols);

If the export_fail method returns an empty list then no error is recorded and all the requested symbols are exported. If the returned list is not empty then an error is generated for each symbol and the export fails. The Exporter provides a default export_fail method which simply returns the list unchanged.

Uses for the export_fail method include giving better error messages for some symbols and performing lazy architectural checks (put more symbols into @EXPORT_FAIL by default and then take them out if someone actually tries to use them and an expensive check shows that they are usable on that platform).
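A rough sketch of how @EXPORT_FAIL and export_fail might fit together (the module name, symbol, and platform check below are made up for illustration):

  package YourModule;
  use Exporter 'import';
  our @EXPORT_OK   = qw(munge frobnicate risky_func);
  our @EXPORT_FAIL = qw(risky_func);   # must pass export_fail before being exported

  sub export_fail {
      my ($class, @failed) = @_;
      # Allow risky_func everywhere except Windows (made-up check);
      # whatever we return is still treated as failed.
      return grep { !( $_ eq 'risky_func' && $^O ne 'MSWin32' ) } @failed;
  }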

Tag Handling Utility Functions

Since the symbols listed within %EXPORT_TAGS must also appear in either @EXPORT or @EXPORT_OK , two utility functions are provided which allow you to easily add tagged sets of symbols to @EXPORT or @EXPORT_OK :

  %EXPORT_TAGS = (foo => [qw(aa bb cc)], bar => [qw(aa cc dd)]);

  Exporter::export_tags('foo');     # add aa, bb and cc to @EXPORT
  Exporter::export_ok_tags('bar');  # add aa, cc and dd to @EXPORT_OK

Any names which are not tags are added to @EXPORT or @EXPORT_OK unchanged but will trigger a warning (with -w ) to avoid misspelt tags names being silently added to @EXPORT or @EXPORT_OK . Future versions may make this a fatal error.

Generating Combined Tags

If several symbol categories exist in %EXPORT_TAGS , it's usually useful to create the utility ":all" to simplify "use" statements.

The simplest way to do this is:

  %EXPORT_TAGS = (foo => [qw(aa bb cc)], bar => [qw(aa cc dd)]);

  # add all the other ":class" tags to the ":all" class,
  # deleting duplicates
  {
    my %seen;

    push @{$EXPORT_TAGS{all}},
      grep {!$seen{$_}++} @{$EXPORT_TAGS{$_}} foreach keys %EXPORT_TAGS;
  }

CGI.pm creates an ":all" tag which contains some (but not really all) of its categories. That could be done with one small change:

  # add some of the other ":class" tags to the ":all" class,
  # deleting duplicates
  {
    my %seen;

    push @{$EXPORT_TAGS{all}},
      grep {!$seen{$_}++} @{$EXPORT_TAGS{$_}}
        foreach qw/html2 html3 netscape form cgi internal/;
  }

Note that the tag names in %EXPORT_TAGS don't have the leading ':'.

AUTOLOADed Constants

Many modules make use of AUTOLOAD ing for constant subroutines to avoid having to compile and waste memory on rarely used values (see perlsub for details on constant subroutines). Calls to such constant subroutines are not optimized away at compile time because they can't be checked at compile time for constancy.

Even if a prototype is available at compile time, the body of the subroutine is not (it hasn't been AUTOLOAD ed yet). perl needs to examine both the () prototype and the body of a subroutine at compile time to detect that it can safely replace calls to that subroutine with the constant value.

A workaround for this is to call the constants once in a BEGIN block:

   package My ;

   use Socket ;

   foo( SO_LINGER );  ## SO_LINGER NOT optimized away; called at runtime
   BEGIN { SO_LINGER }
   foo( SO_LINGER );  ## SO_LINGER optimized away at compile time.

This forces the AUTOLOAD for SO_LINGER to take place before SO_LINGER is encountered later in My package.

If you are writing a package that AUTOLOAD s, consider forcing an AUTOLOAD for any constants explicitly imported by other packages or which are usually used when your package is use d.

Good Practices

Declaring @EXPORT_OK and Friends

When using Exporter with the standard strict and warnings pragmas, the our keyword is needed to declare the package variables @EXPORT_OK , @EXPORT , @ISA , etc.

  our @ISA = qw(Exporter);
  our @EXPORT_OK = qw(munge frobnicate);

If backward compatibility for Perls under 5.6 is important, one must write instead a use vars statement.

  use vars qw(@ISA @EXPORT_OK);
  @ISA = qw(Exporter);
  @EXPORT_OK = qw(munge frobnicate);

Playing Safe

There are some caveats with the use of runtime statements like require Exporter and the assignment to package variables, which can be very subtle for the unaware programmer. This may happen for instance with mutually recursive modules, which are affected by the time the relevant constructions are executed.

The ideal (but a bit ugly) way to never have to think about that is to use BEGIN blocks. So the first part of the "SYNOPSIS" code could be rewritten as:

  package YourModule;

  use strict;
  use warnings;

  our (@ISA, @EXPORT_OK);
  BEGIN {
     require Exporter;
     @ISA = qw(Exporter);
     @EXPORT_OK = qw(munge frobnicate);  # symbols to export on request
  }

The BEGIN will assure that the loading of Exporter.pm and the assignments to @ISA and @EXPORT_OK happen immediately, leaving no room for something to get awry or just plain wrong.

With respect to loading Exporter and inheriting, there are alternatives with the use of modules like base and parent .

  use base qw(Exporter);
  # or
  use parent qw(Exporter);

Any of these statements are nice replacements for BEGIN { require Exporter; @ISA = qw(Exporter); } with the same compile-time effect. The basic difference is that base code interacts with declared fields while parent is a streamlined version of the older base code to just establish the IS-A relationship.

For more details, see the documentation and code of base and parent .

Another thorough remedy to that runtime vs. compile-time trap is to use Exporter::Easy , which is a wrapper of Exporter that allows all boilerplate code at a single gulp in the use statement.

   use Exporter::Easy (
       OK => [ qw(munge frobnicate) ],
   );
   # @ISA setup is automatic
   # all assignments happen at compile time

What Not to Export

You have been warned already in "Selecting What to Export" not to export method names or anything else by default without a good reason.

There's one more item to add to this list. Do not export variable names. Just because Exporter lets you do that, it does not mean you should.

  @EXPORT_OK = qw($svar @avar %hvar); # DON'T!

Exporting variables is not a good idea. They can change under the hood, provoking horrible effects at-a-distance that are too hard to track and to fix. Trust me: they are not worth it.

To provide the capability to set/get class-wide settings, it is best instead to provide accessors as subroutines or class methods instead.
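A small sketch of that advice (the names are illustrative only): keep the variable lexical inside the module and export an accessor instead.

  package YourModule;
  use Exporter 'import';
  our @EXPORT_OK = qw(verbosity);   # export an accessor, not the variable

  my $verbosity = 0;                # class-wide setting, private to the module

  # verbosity() reads the setting; verbosity($new) changes it.
  sub verbosity {
      $verbosity = shift if @_;
      return $verbosity;
  }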

SEE ALSO

Exporter is definitely not the only module with symbol exporter capabilities. At CPAN, you may find a bunch of them. Some are lighter. Some provide improved APIs and features. Pick the one that fits your needs. The following is a sample list of such modules.

    Exporter::Easy
    Exporter::Lite
    Exporter::Renaming
    Exporter::Tidy
    Sub::Exporter / Sub::Installer
    Perl6::Export / Perl6::Export::Attrs

LICENSE

This library is free software. You can redistribute it and/or modify it under the same terms as Perl itself.

[Nov 14, 2017] Perl archeology Need help in refactoring of old Perl code that does not use strict

Nov 14, 2017 | perlmonks.com

likbez has asked for the wisdom of the Perl Monks concerning the following question:

This is kind of topic that previously was reserved to Cobol and PL/1 forums ;-) but now Perl is almost 30 years old and it looks like the space for Perl archeology is gradually opening ;-).

I got a dozen fairly large scripts (several thousand lines each) written for a (very) early version of Perl 5 (below Perl 5.6). I now need to:

1. Convert them to use the strict pragma. The problem is that all of them share (some heavily, some not) information from the main program to subroutines (and sometimes among subroutines too) via global variables, in addition to (or sometimes instead of) parameters. Those scripts mostly do not use my declarations either.

So I need to map variables into local and global namespaces for each subroutine (around 40 per script, each pretty small: less than a hundred lines) to declare them properly.

As an initial step I plan to just use global variables with namespace qualification, or our lists, for each subroutine. Currently I plan to postprocess the output of perl -MO=Xref old_perl_script.pl

and generate such statements. Is there a better way?
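By way of illustration only, here is a guess at what such a postprocessing script could look like (it is not from the thread, and it assumes B::Xref's indented "Subroutine / Package / symbol" report layout; the output is only a starting point for hand-written our declarations):

#!/usr/bin/perl
# Hypothetical usage: perl -MO=Xref old_perl_script.pl | perl xref2our.pl
use strict;
use warnings;

my (%vars, $sub, $pkg);
while (my $line = <>) {
    if    ($line =~ /^\s*Subroutine\s+(\S+)/) { $sub = $1; $pkg = ''; }
    elsif ($line =~ /^\s*Package\s+(\S+)/)    { $pkg = $1; }
    elsif (defined $sub && $pkg eq 'main' && $line =~ /^\s*([\$\@\%]\w+)\b/) {
        $vars{$sub}{$1} = 1;   # a main:: package variable used inside this sub
    }
}

# Print a candidate "our" list per subroutine, to be reviewed by hand.
for my $s (sort keys %vars) {
    print "$s: our(", join(', ', sort keys %{ $vars{$s} }), ");\n";
}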

2. If possible, I want to split the main namespace into at least two chunks, putting all subroutines into another namespace or module. I actually do not know how to export subroutine names into another namespace (for example main::) when just package statements are used in Perl, as in the example below. Modules do some magic via Exporter that I just use but do not fully understand. For example, if we have

#main_script
...
x::a(1,2,3);
...
package x;
sub a {...}
sub b {...}
sub c {...}
package y;
...

How can I access subs a, b, c without qualifying them with namespace x from the main:: namespace?

3. Generally, this task looks like a case of refactoring. I wonder if any Perl IDE has some of the required capabilities, or whether there are tools that can be helpful.

My time to make the conversion is limited, and using some off-the-shelf tools that speed up the process would be a great help.

Any advice will be greatly appreciated.

AnomalousMonk (Chancellor) on Nov 14, 2017 at 07:20 UTC

Re: Perl archeology: Need help in refactoring of old Perl code that does not use strict

I'd like to suggest that you also need a

Step 0: Write a test suite that the current code passes for all normal modes of operation and for all failure modes.
With this test suite, you can be reasonably certain that refactored code isn't just going to be spreading the devastation.

Given that you seem to be describing a spaghetti-coded application with communication from function to function via all kinds of secret tunnels and spooky-action-at-a-distance global variables, I'd say you have a job on your hands just with Step 0. But you've already taken a test suite into consideration... Right?



Monk::Thomas (Friar) on Nov 14, 2017 at 12:14 UTC

Re^2: Perl archeology: Need help in refactoring of old Perl code that does not use strict

by Monk::Thomas (Friar) on Nov 14, 2017 at 12:14 UTC

This is what I would do after 'Step 0':

If the variable does change during the run, then pick a different function first. When you have got the global state disentangled a bit, it's a lot easier to reason about what this code is doing. Everything that's still using a global needs to be treated with very careful attention.

Corion (Pope) on Nov 14, 2017 at 08:45 UTC

Re: Perl archeology: Need help in refactoring of old Perl code that does not use strict

In addition to AnomalousMonk's advice of a test suite, I would suggest at the very least to invest the time up front to run automatic regression tests between whatever development version of the program you have and the current "good" (but ugly) version. That way you can easily verify whether your change affected the output and operation of the program. Ideally, the output of your new program and the old program should remain identical while you are cleaning things up.

Note that you can enable strict locally in blocks, so you don't need to make the main program compliant but can start out with subroutines or files and slowly convert them.
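For example, a minimal illustration of tightening things up one subroutine at a time:

sub already_cleaned {
    use strict;      # lexically scoped: applies only inside this block
    use warnings;
    my ($arg) = @_;
    return 2 * $arg;
}
# the rest of the file is still compiled without strict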

For your second question, have a look at Exporter . Basically it allows you to im/export subroutine names between packages:

package x;
use Exporter 'import';
our @EXPORT_OK = ('a', 'b', 'c');

#main_script
use x 'a', 'b'; # makes a() and b() available in the main namespace

To find and collect the global variables, maybe it helps you to dump the global namespace before and after your program has run. All these names are good candidates for being at least declared via our to make them visible, and then ideally removed to pass the parameters explicitly instead of implicitly:

#!perl -w
use strict;

our $already_fixed = 1; # this won't show up

# Put this right before the "uncleaned" part of the script starts
my %initial_variables;
BEGIN {
    %initial_variables = %main::; # make a copy at the start of the program
}
END {
#use Data::Dumper;
#warn Dumper \%initial_variables;
#warn Dumper \%main::;
    # At the end, look what names came newly into being, and tell us about them:
    for my $key (sort keys %main::) {
        if( ! exists $initial_variables{ $key } ) {
            print "Undeclared global variable '$key' found\n";
            
            my $glob = $main::{ $key };
            
            if( defined *{ $glob }{GLOB}) {
                print "used as filehandle *'$key', replace by a lexical filehandle\n";
            };
            if( defined *{ $glob }{CODE}) {
                print "used as subroutine '$key'\n"; # so maybe a false alarm unless you dynamically load code?!
            };
            if( defined *{ $glob }{SCALAR}) {
                print "used as scalar \$'$key', declare as 'our'\n";
            };
            if( defined *{ $glob }{ARRAY}) {
                print "used as array \@'$key', declare as 'our'\n";
            };
            if( defined *{ $glob }{HASH}) {
                print "used as hash \%'$key', declare as 'our'\n";
            };
        };
    };
}
no strict;

$foo = 1;
@bar = (qw(baz bat man));
open LOG, '<', *STDIN;
sub foo_2 {}
 

The above code is a rough cut and for some reason it claims all global names as scalars in addition to their real use, but it should give you a start at generating a list of undeclared names.

Also see Of Symbol Tables and Globs .

Anonymous Monk on Nov 14, 2017 at 08:26 UTC

Re: Perl archeology: Need help in refactoring of old Perl code that does not use strict (hurry up and wait)

1) ... strict pragma ...My time to make the conversion is limited and using some off the shelf tools that speed up the process would be a great help.

Hurry up and leave it alone :)

use strict; itself confers no benefits; the benefits come from avoidance of the bad practices forbidden by strict :)

That pretty much means convert one at a time by hand after you have learned the understanding of importance of knowing :) Speed kills

2. If possible ... I do not understand ...

That is a hint you shouldn't be refactoring anything programmatically. There are a million nodes on perlmonks, and a readers digest version might be Modern Perl a loose description of how experienced and effective Perl 5 programmers work....You can learn this too.

Hurry up and bone up

3. Generally this task looks like a case of refactoring. I wonder, if any Perl IDE has some of required capabilities, or are there tools that can helpful.

I hope you have foot insurance :) happy hunting :) perlcritic , PPI / PPIx::XPath , PPIx::EditorTools ,
App::EditorTools - Command line tool for Perl code refactoring
Code::CutNPaste - Find Duplicate Perl Code

So enjoy, test first, step0++

[Nov 14, 2017] scoping - What is the difference between my and local in Perl - Stack Overflow

Notable quotes:
"... temporarily changes the value of the variable ..."
"... within the scope ..."
"... Unlike dynamic variables created by the local operator, lexical variables declared with my are totally hidden from the outside world, including any called subroutines. ..."
Nov 14, 2017 | stackoverflow.com


Brian G, Sep 24, 2008 at 20:12

I am seeing both of them used in this script I am trying to debug and the literature is just not clear. Can someone demystify this for me?

J.J., Sep 24, 2008 at 20:24

Dynamic Scoping. It is a neat concept. Many people don't use it, or understand it.

Basically think of my as creating and anchoring a variable to one block of {}, A.K.A. scope.

my $foo if (true); # $foo lives and dies within the if statement.

So a my variable is what you are used to, whereas with dynamic scoping $var can be declared anywhere and used anywhere. With local you basically suspend the use of that global variable and work with a "local value" instead. In other words, local gives a global variable a temporary value for the duration of the enclosing dynamic scope.

$var = 4;
print $var, "\n";
&hello;
print $var, "\n";

# subroutines
sub hello {
     local $var = 10;
     print $var, "\n";
     &gogo; # calling subroutine gogo
     print $var, "\n";
}
sub gogo {
     $var ++;
}

This should print:

4
10
11
4
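
For contrast, a sketch of my own (not part of the original answer): replace local with my inside hello and the lexical $var becomes invisible to gogo, which then increments the global instead, so the output changes to 4, 10, 10, 5.

$var = 4;
print $var, "\n";
&hello;
print $var, "\n";

sub hello {
     my $var = 10;   # lexical: hidden from gogo
     print $var, "\n";
     &gogo;          # gogo still sees and increments the global $var (4 -> 5)
     print $var, "\n";
}
sub gogo {
     $var ++;
}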

Brad Gilbert, Sep 24, 2008 at 20:50

You didn't call the subroutines. – Brad Gilbert Sep 24 '08 at 20:50

brian d foy, Sep 25, 2008 at 18:23

Don't conditionally declare lexical variables: it has undefined behavior. – brian d foy Sep 25 '08 at 18:23

Jeremy Bourque, Sep 24, 2008 at 20:26

The short answer is that my marks a variable as private in a lexical scope, and local marks a variable as private in a dynamic scope.

It's easier to understand my, since that creates a local variable in the usual sense. A new variable is created, and it's accessible only within the enclosing lexical block, which is usually marked by curly braces. There are some exceptions to the curly-brace rule, such as:

foreach my $x (@foo) { print "$x\n"; }

But that's just Perl doing what you mean. Normally you have something like this:

sub Foo {
   my $x = shift;

   print "$x\n";
}

In that case, $x is private to the subroutine and its scope is enclosed by the curly braces. The thing to note, and this is the contrast to local, is that the scope of a my variable is defined with respect to your code as it is written in the file. It's a compile-time phenomenon.

To understand local, you need to think in terms of the calling stack of your program as it is running. When a variable is localized, it is redefined from the point at which the local statement executes, for everything below that on the stack, until you return back up the stack to the caller of the block containing the local.

This can be confusing at first, so consider the following example.

sub foo { print "$x\n"; }
sub bar { local $x; $x = 2; foo(); }

$x = 1;
foo(); # prints '1'
bar(); # prints '2' because $x was localized in bar
foo(); # prints '1' again because the local in bar is no longer in effect

When foo is called the first time, it sees the global value of $x, which is 1. When bar is called and local $x runs, that redefines the global $x on the stack. Now when foo is called from bar, it sees the new value of 2 for $x. So far that isn't very special, because the same thing would have happened without the call to local. The magic is that when bar returns, we exit the dynamic scope created by local $x and the previous global $x comes back into scope. So for the final call of foo, $x is 1.

You will almost always want to use my, since that gives you the local variable you're looking for. Once in a blue moon, local is really handy to do cool things.
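
One such "cool thing" (my illustration, not from the answer): local also works on individual hash and array elements, so you can temporarily override a single configuration value for everything called within a dynamic scope:

use strict;
use warnings;

our %CONFIG = ( verbose => 0 );

sub log_msg {
    # Sees whatever value of $CONFIG{verbose} is in effect at call time.
    print "LOG: $_[0]\n" if $CONFIG{verbose};
}

sub noisy_section {
    local $CONFIG{verbose} = 1;        # temporary override, restored on return
    log_msg('inside noisy_section');   # printed
}

noisy_section();
log_msg('back outside');               # not printed: old value restored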

Drew Stephens, Sep 24, 2008 at 22:58

Quoting from Learning Perl:

But local is misnamed, or at least misleadingly named. Our friend Chip Salzenberg says that if he ever gets a chance to go back in a time machine to 1986 and give Larry one piece of advice, he'd tell Larry to call local by the name "save" instead.[14] That's because local actually will save the given global variable's value away, so it will later automatically be restored to the global variable. (That's right: these so-called "local" variables are actually globals!) This save-and-restore mechanism is the same one we've already seen twice now, in the control variable of a foreach loop, and in the @_ array of subroutine parameters.

So, local saves a global variable's current value and then sets it to some form of empty value. You'll often see it used to slurp an entire file rather than reading just a line:

my $file_content;
{
    local $/;
    open IN, "foo.txt";
    $file_content = <IN>;
}

Calling local $/ sets the input record separator (the value at which Perl stops reading a "line") to undef, causing the readline (diamond) operator to read the entire file, since it never hits an input record separator.
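
A modern equivalent of the snippet above (my sketch, not from the answer), using a lexical filehandle and three-argument open:

my $file_content;
{
    local $/;                      # undef record separator: slurp mode
    open my $in, '<', 'foo.txt'
        or die "Cannot open foo.txt: $!";
    $file_content = <$in>;
    close $in;
}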

Aristotle Pagaltzis, Sep 25, 2008 at 23:25

I can't believe no one has linked to Mark Jason Dominus' exhaustive treatises on the matter: "Coping with Scoping" and "Seven Useful Uses of local".

dan1111, Jan 28, 2013 at 11:21

Word of warning: both of these articles are quite old, and the second one (by the author's own warning) is obsolete. It demonstrates techniques for localization of file handles that have been superseded by lexical file handles in modern versions of Perl. – dan1111