
Neatperl -- a simple Perl prettyprinter
based on "fuzzy" determination of nesting level

Nikolai Bezroukov, 2019, Licensed under Perl Artistic license
Version 0.81 (Oct 9, 2019)


Neatperl can be called a "fuzzy" pretty-printer. It does not perform full lexical analysis, which is difficult for Perl because of its complex lexical structure, further complicated by such innovations as custom delimiters in q, qq and qr strings. Instead it relies on analysis of a limited context of each line (prefix and suffix) to "guess" the correct nesting level. It does not reorganize the text in any way other than re-indentation.
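The prefix/suffix idea can be illustrated with a minimal sketch. This is not Neatperl's actual code: the function name fuzzy_reindent and its simplifications (only a leading "}" and a trailing "{" are examined, and HERE strings and pseudo-comments are ignored) are assumptions made purely for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT taken from Neatperl: re-indent lines by looking
# only at the first and the last non-blank character of each line.
sub fuzzy_reindent {
    my ($tab, @lines) = @_;
    my $nest = 0;                                  # current guessed nesting level
    my @out;
    for my $line (@lines) {
        (my $text = $line) =~ s/^\s+|\s+$//g;      # drop old indentation and trailing blanks
        if ($text eq '') { push @out, ''; next; }  # keep empty lines empty
        # Prefix check: a line that starts with '}' closes a block before it is printed.
        $nest-- if $text =~ /^\}/ && $nest > 0;
        push @out, (' ' x ($tab * $nest)) . $text;
        # Suffix check: a line that ends with '{' opens a block for the following lines.
        $nest++ if $text =~ /\{$/;
    }
    return @out;
}

print "$_\n" for fuzzy_reindent(3, 'if ($x) {', 'print "a";', '} else {', 'print "b";', '}');
```

A one-liner such as if ($x) { print 1 } starts with "i" and ends with "}", so in this sketch it leaves the nesting level unchanged.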

Neatperl also computes some basic statistics about the code, such as the number of lines without comments, the number of subroutines, the number of blocks, etc.
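Statistics of this kind can be gathered with the same line-by-line approach. The sketch below is only an illustration under stated assumptions: the function basic_stats and its hash keys are hypothetical, "lines without comments" is interpreted here as lines that are neither blank nor comment-only, and a "block" is counted whenever a line ends with "{". Neatperl itself may count differently.

```perl
use strict;
use warnings;

# Hypothetical sketch, not Neatperl's code: collect a few per-line statistics.
sub basic_stats {
    my @lines = @_;
    my %stat = (code_lines => 0, subs => 0, blocks => 0);
    for my $line (@lines) {
        next if $line =~ /^\s*$/;                      # skip blank lines
        next if $line =~ /^\s*#/;                      # skip comment-only lines
        $stat{code_lines}++;
        $stat{subs}++   if $line =~ /^\s*sub\s+\w+/;   # named subroutine definition
        $stat{blocks}++ if $line =~ /\{\s*$/;          # line that opens a block
    }
    return %stat;
}

my %s = basic_stats('# demo', 'sub foo {', '   if ($x) {', '      return 1;', '   }', '}');
print "$_: $s{$_}\n" for sort keys %s;
```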

For a reasonable Perl style typically found in production scripts the results are quite satisfactory. Of course, it will not work for compressed or obscured code.

This is a relatively non-traditional approach, as a typical prettyprinter attempts to implement full lexical analysis of the language with some elements of syntax analysis. See, for example, my (very old) NEATPL pretty printer ( http://www.softpanorama.org/Articles/Oldies/neatpl.pdf ) -- one of the first programs that I have written. It implements a full lexical analyzer for PL/1. Perltidy architecture is closer to this traditional approach too.

The main advantage is that such an approach allows implementing a useful pretty printer in less than 500 lines of Perl source code, with around 200 lines implementing the formatting algorithm. Such small scripts are more maintainable and have less chance to "drop dead" and become abandonware after the initial author loses interest and no longer supports the script.

Neatperl does not depend on any non-standard Perl modules and its distribution consists of just two items: the script itself and the readme file. This can be an important advantage, as installing Perl modules in a corporate environment is often not that simple and you can run into a bureaucratic nightmare. Also, with many modules used you always risk compatibility hell. It is sad that Perl does not have a zipped format like jar files for Java, which allow packaging the program with its dependencies as a single file.

Another important advantage is that this is a very safe approach, which normally does not introduce any errors in Perl code, with the exception of indented HERE lines, which might be incorrectly "re-indented" based on the current nesting. As Perl has a very complex lexical structure which is not a context-free grammar, its parsing represents a daunting programming task and can never be guaranteed to be correct. This means that the "fuzzy" prettyprinter approach is the safer one for such a language.

But even it can mangle some parts of the script, such as HERE strings, when some "too inventive" delimiters are used. In Perl the delimiter can be defined via a single-quoted or double-quoted string; Neatperl does not recognize perverted HERE strings defined via q or qq notation.
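To make the HERE string problem concrete, here is a minimal sketch of how the terminator might be picked up from the opening line and how the body might then be marked as "leave intact". The function names heredoc_terminator and heredoc_body_flags are hypothetical and the regular expressions are an assumption, not Neatperl's own; only the quoted and bare forms are handled, which mirrors the limitation described above.

```perl
use strict;
use warnings;

# Hypothetical sketch: return the terminator of a HERE string opened on this
# line, or undef if none is found.
sub heredoc_terminator {
    my ($line) = @_;
    return $2 if $line =~ /<<(["'])(.*?)\1/;       # <<"EOT" or <<'EOT'
    return $1 if $line =~ /<<([A-Za-z_]\w*)/;      # bare <<EOT
    return;
}

# For each (already chomped) line return 1 if it belongs to a HERE string
# body or is its terminator line and so must not be re-indented, else 0.
sub heredoc_body_flags {
    my @lines = @_;
    my ($term, @flags);
    for my $line (@lines) {
        if (defined $term) {
            push @flags, 1;
            undef $term if $line eq $term;         # terminator must be the whole line
        } else {
            push @flags, 0;
            $term = heredoc_terminator($line);
        }
    }
    return @flags;
}

print join(',', heredoc_body_flags('print <<"EOT";', '  text', 'EOT', 'print 1;')), "\n";
```

If the terminator is missed, the body lines get flag 0 and would be re-indented like ordinary code, which is exactly the failure mode discussed here.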

Neatperl does not try to nest strings which start at the first position. Such strings always remain intact.

There is no free lunch, and such a limited-context approach means that sometimes (rarely) the nesting level can be determined incorrectly. There might also be problems with determining the correct end of HERE literals or q and qq literals. A missed HERE string that has a non-zero fixed indent can be shifted left or right, which might not be a good thing (HERE strings with zero indent are safe). So a fuzzy prettyprinter is best for your own scripts, in which you can maintain a safe, prettyprinter-friendly Perl style. For scripts written by other people your mileage may vary, but even in this case it is a great diagnostic tool. It also greatly helps in understanding scripts written by other people.

To correct this situation three pseudo-comments (pragmas) were introduced, with which you can control the formatting and correct formatting errors. All pseudo-comments should start at the beginning of the line. No leading spaces are allowed.

Currently Neatperl allows three types of pseudo-comments:

Switching formatting off and on for a set of lines. This idea is similar to HERE documents and allows skipping portions of the script which are too difficult to format correctly. One example is a HERE statement with indented lines, when re-indenting them to the current nesting level (which is the default action of the formatter) is undesirable.
- #%OFF -- (all capitals, should be the only text on the line, starting from the first position) stops formatting. All lines after this directive are not processed and are put into the listing and the formatted code buffer intact
- #%ON -- (all capitals, the only text on the line, starting from the first position with no leading blanks) resumes formatting
Correcting the nesting level if it was determined incorrectly. The directive is "#%NEST", which has several forms (more can be added if necessary ;-):
- Set the current nesting level to specified integer
```
#%NEST=digit
```
- Increment
```
#%NEST++
```
- Decrement
```
#%NEST--
```

For example, if Neatperl did not correctly recognize the point where a particular control structure closes, you can close it yourself with the directive

#%NEST--

or reset the nesting level to zero with

#%NEST=0

NOTES:

Again, all directives should start at the first position of the line. No leading blanks are allowed.
No spaces between NEST and = or between NEST and ++/-- are allowed.

You can also arbitrarily increase and decrease the indent with this directive.
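The rules for pseudo-comments described above can be summarized in a short sketch. This is an illustration only: the function apply_directive and the state keys "on" and "nest" are hypothetical, the sketch accepts a multi-digit level after "=", and whether Neatperl honors #%NEST while formatting is switched off is not stated here, so this sketch simply processes every directive it sees.

```perl
use strict;
use warnings;

# Hypothetical sketch, not Neatperl's code: apply one pseudo-comment to a
# formatter state hash. Returns 1 if the line was a directive, 0 otherwise.
sub apply_directive {
    my ($state, $line) = @_;
    # All patterns are anchored at the first position: no leading blanks,
    # no spaces inside the directive.
    if    ($line =~ /^#%OFF\s*$/)        { $state->{on} = 0 }
    elsif ($line =~ /^#%ON\s*$/)         { $state->{on} = 1 }
    elsif ($line =~ /^#%NEST=(\d+)\s*$/) { $state->{nest} = $1 }
    elsif ($line =~ /^#%NEST\+\+\s*$/)   { $state->{nest}++ }
    elsif ($line =~ /^#%NEST--\s*$/)     { $state->{nest}-- if $state->{nest} > 0 }
    else                                 { return 0 }
    return 1;
}

my %state = (on => 1, nest => 2);
apply_directive(\%state, '#%NEST--');
print "nest=$state{nest} on=$state{on}\n";
```

A line such as " #%ON" with a leading blank is not treated as a directive in this sketch, matching the rule that directives must start at the first position.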

As Neatperl maintains a stack of the control keywords that it recognizes, it also produces some useful diagnostic messages.

For most scripts Neatperl is able to determine the correct nesting level and proper indentation. Of course, to be successful, this approach requires a certain (very reasonable) layout of the script. The main requirement is that multiline control statements should start and end on a separate line.

One-liners (control statements which start and end on the same line) are acceptable.

While all of us have seen pretty perverted formatting styles in some scripts, this is typically an anomaly in production-quality scripts. Most production-quality scripts display a very reasonable layout of control statements, the one that is expected by this pretty printer. But again, that is why I call this pretty printer "fuzzy". For example, for any script compressed to eliminate whitespace this approach to pretty printing is not successful.

INVOCATION

 neatperl [options] [file_to_process]

 neatperl -f [other_options] [file_to_process] # in this case the text will be replaced with formatted text, 
                                              # backup will be saved in the same directory

cat file |  neatperl -p [other_options] > formatted_text # invocation as pipe

OPTIONS

-h -- help
-t number -- size of tab (emulated with spaces). The default is 3
-f -- "in place" formatting of a file: write the formatted text into the same file, creating a backup
-p -- work as a pipe
-r -- some minor (and subjective) readability improvements like replacing if (a==1... with if( a==1... and ...) { with ... ){. The idea is to add space before and after the logical expression in if, while, until and similar statements. IMHO that improves readability. Your mileage may vary.
-v -- provides additional warnings about unbalanced quotes and round parentheses. The balance is calculated per line and can be incorrect. You can specify the verbosity level:
- 0 -- only serious errors are displayed
- 1 -- serious errors and errors are displayed
- 2 -- serious errors, errors and warnings are displayed
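Why a per-line balance can be wrong is easy to see from a sketch of such a check. The function line_balance_warnings and its message texts below are hypothetical and are not Neatperl's implementation; the sketch simply counts parentheses and unescaped double quotes on a single line, so a statement legitimately continued on the next line is reported as unbalanced.

```perl
use strict;
use warnings;

# Hypothetical sketch of a per-line balance check, not Neatperl's own code.
sub line_balance_warnings {
    my ($line) = @_;
    my @warn;
    my $open  = ($line =~ tr/(//);               # count of '(' on the line
    my $close = ($line =~ tr/)//);               # count of ')' on the line
    my $dq    = () = $line =~ /(?<!\\)"/g;       # count of unescaped double quotes
    push @warn, 'unbalanced round parentheses' if $open != $close;
    push @warn, 'odd number of double quotes'  if $dq % 2;
    return @warn;
}

print "$_\n" for line_balance_warnings('if (defined($x) {');
```

A parenthesis inside a string or a regular expression is counted too, which is another reason such a check should be read as a hint rather than as an error report.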

PARAMETERS

1st -- name of the file to be formatted



Old News ;-)

[Sep 03, 2019] Uploaded to GitHub

This is still a raw version (0.4), so it is still beta, but usable. It works for all my scripts and for the scripts by other authors that I tested.
