Pythonizer user guide

Version 0.7 (Sept 18, 2020)

Introduction

Some organizations are now converting their old Perl codebases into other scripting languages, such as Python. But a more common task is maintaining existing Perl scripts when the person assigned to the task knows only Python.

University graduates now typically know Python but not Perl, and that creates difficulties in maintaining the old codebase. In this case, a program that "explains" Perl constructs in Python terms would be extremely useful and, sometimes, a lifesaver. Of course, Perl 5 is here to stay (please note what happened with people who were predicting the demise of Fortran ;-), and in most cases, old scripts will stay too.

The other role is to give a quick start to system administrators who want to learn Python (for example, because they need to support researchers who work with Python) but who currently know only Perl -- many old-school sysadmins dislike Python, and for a reason ;-)

Yet another role is to provide proof that the two languages are mostly compatible and that a program in one can be translated into the other with a modest amount of effort. Although such a translation is not necessarily the best fit, in most cases it is close enough and needs just minor manual editing.

Of course, complex constructs and idioms are often translated incorrectly. Several complex issues remain unresolved (implicit conversion to strings in Perl is one such issue). As experience with Google translation of natural languages attests, there are always around 10 to 20% of sentences (depending on the subject area of the text) that are translated incorrectly, and probably 2-3% that have an absurd or funny translation.

The idea here is that, using the "fuzzy translation" concept, it is possible to create such a tool with relatively modest effort: a tool, written with some knowledge of compiler technologies, that falls into the category of "small language compilers", with a total effort of around one man-year or less. Assuming ten lines of debugged code per day for a task of complexity comparable to writing compilers, the estimated size should be around 3-5K lines of code (~1K lines for phase 1 and 2-3K lines for phase 2).

As of version 0.7, it looks like the initial idea was a sound one: within the 5K LOC limit it is possible to create a useful utility that transcribes Perl into Python.

As the codebase currently exceeds 4K lines, it is close to the limit at which I can maintain it as a hobby project, so some enhancements need to be abandoned or moved to the pre-pythonizer phase.

For example, version 0.7 creates a list of global variables for each subroutine to maintain the same visibility in Python. But state variables are handled like regular global variables, which they are not, and that can cause problems (they generally do not belong to the global namespace: while their lifetime is similar to that of global variables, their namespace is local). Also, while at the current level of complexity it is possible to guess whether a variable is changed within a subroutine or just read, there is no easy way to use this knowledge to improve the global variables map of each sub. This should be done manually.
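
To make the visibility problem concrete, here is a minimal hand-written sketch (not actual Pythonizer output) of how a Perl sub that updates a file-level variable, and another that keeps a state counter, might look in Python. The names total, add_to_total and next_id are hypothetical, and the function attribute used for the state variable is just one possible manual fix.

```python
# Hand-written illustration; all names are hypothetical, not Pythonizer output.

total = 0  # corresponds to a file-level Perl variable such as $total


def add_to_total(amount):
    # Without this declaration, "total += amount" would raise
    # UnboundLocalError, because Python would treat total as a local variable.
    global total
    total += amount
    return total


def next_id():
    # A Perl "state $counter" keeps its value between calls but is visible
    # only inside the sub. A function attribute is one way to mimic that
    # without adding a new name to the module namespace.
    next_id.counter = getattr(next_id, "counter", 0) + 1
    return next_id.counter


add_to_total(5)
add_to_total(7)
print(total)                 # 12
print(next_id(), next_id())  # 1 2
```

A sub that only reads total would not need the global statement at all, which is why a generated list of globals can be longer than strictly necessary.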

The same is true of type conversions. Right now it is left to the programmer to fix those issues.

Pre-pythonizer  implements the first phase of translation

Processing consists of two passes, which currently are not integrated in any way. The first, pre_pythonizer, mainly refactors the Perl code to create a main procedure and move all subroutines up so that they are defined before use.

NOTE: In the next version it will create the list of variables that need to be declared global to preserve a part of the namespace that the Perl program has. This is a needed function, as Python has different rules of variable visibility; I decided to move it to the pre-pythonizer due to the size of the pythonizer codebase. Hopefully this part can be at least partially automated.

Running this phase also reformats the Perl script in a way that slightly increases the chances that the script will be translated with fewer errors. Opening and closing curly brackets are put on separate lines to ease the job of the lexical scanner in Pythonizer.

It can be used either as a separate utility or in an integrated way via the -r (refactor) option in pythonizer.

pre_pythonizer [options] <file>  # currently performs refactoring of the Perl script,
                                 # pushing subroutines to the top and creating a main sub
                                 # out of code not included in any subroutine.
pythonizer [options] <file>

The first pass is currently fully optional, and the needed transformations of the Perl code can be performed by other utilities. It just slightly increases the probability of a more correct translation of the code. It reformats the code so that curly brackets are mostly on separate lines (this was essential only for pythonizer up to version 0.2; later versions do not depend directly on this transformation).

It can be used as a separate program, which transforms the initial Perl script, creating a backup with the extension .original. The main useful function in the current version is refactoring the Perl program by pushing subroutines up, as in Python subroutines need to be defined before use.
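
The reason for this refactoring can be shown with a short hand-written sketch (the function names are hypothetical). In Perl a named sub can be called with parentheses earlier in the file than its definition, because subs are compiled before the program starts running. In Python a def is an ordinary statement that must execute before the name can be called.

```python
# Hand-written illustration; the function names are hypothetical.

try:
    helper()                 # the def statement below has not executed yet
except NameError as err:
    print("called too early:", err)


def helper():
    return "ok"


def main():
    # helper is looked up only when main() runs, by which time its def
    # statement has executed, so the order of the two defs does not matter.
    return helper()


if __name__ == "__main__":
    print(main())
```

Moving all subroutines to the top and wrapping the remaining code in a main sub, called at the very end, gives the layout shown here.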

It needs to be run only once for each Perl script you want to translate to Python. Subsequent modifications of the Perl script to make it more "Python-compatible" can be performed on this text instead of the original script.

Pythonizer implements actual transformation of Perl into Python

This guide corresponds to version 0.7 of pythonizer. Running  pythonizer -h provides a  list of options.

This is an alpha version, so do not expect perfection. It works, but with errors: it still has bugs, and not all constructs are transliterated correctly.

Also, detection of variable types is not implemented, and that increases the number of errors in the Python output, as conversion to string is not automatic as it is in Perl.
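
A small hand-written sketch (not Pythonizer output; variable names are hypothetical) shows the kind of manual fix this requires. Perl converts between numbers and strings silently, while Python raises TypeError when the two are mixed in a concatenation.

```python
# Hand-written illustration; variable names are hypothetical.
count = 3

# Perl:  $message = "Found " . $count . " files";   (number becomes a string)
try:
    message = "Found " + count + " files"
except TypeError:
    # Python refuses to concatenate str and int; convert explicitly.
    message = "Found " + str(count) + " files"

# Perl:  $sum = "10" + 5;                           (string becomes a number)
sum_value = int("10") + 5

print(message)    # Found 3 files
print(sum_value)  # 15
```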

Here is an example of the protocol: Full protocol of translation of pre_pythonizer.pl by version 0.2 of pythonizer

To increase the chances of correct transliteration, it is recommended to run the Perl script through pre_pythonizer.pl first.

Parts that can't be translated during the first invocation can be commented out; iterating in this way, you can reach the stage where the Perl script is completely Pythonized and can be corrected manually.

Some features:

Limitations

Options

NOTE: All options with small numeric values can be expressed by repeating the letter, so -p 2 is equivalent to -pp, and -d 3 is equivalent to -ddd.

Currently only a few user options are supported (pythonizer -h provides a list of options):
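
The equivalence of the two spellings can be illustrated with a short sketch. This is not pythonizer's own option parser (pythonizer itself is written in Perl); the function option_level is a hypothetical helper that only shows how both forms reduce to the same number.

```python
def option_level(args, letter):
    """Return the numeric level given for `letter` in either spelling."""
    level = 0
    i = 0
    while i < len(args):
        arg = args[i]
        if arg == "-" + letter and i + 1 < len(args) and args[i + 1].isdigit():
            level = int(args[i + 1])      # "-p 2" form
            i += 1                        # skip the numeric value
        elif arg.startswith("-") and set(arg[1:]) == {letter}:
            level = len(arg) - 1          # "-pp" form: count the letters
        i += 1
    return level


print(option_level(["-p", "2"], "p"), option_level(["-pp"], "p"))   # 2 2
print(option_level(["-d", "3"], "d"), option_level(["-ddd"], "d"))  # 3 3
```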

Options for developers

The same options work for pre_pythonizer, but usually the defaults are OK. There are no options to control refactoring of the script.

Logs

Currently logs are written to /tmp/Pythonizer. You can redirect this to any directory via a symlink. Currently there is no option to customize the location of the log.
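
The symlink can be created with ln -s or, as in the sketch below, from Python. The helper redirect_log_dir is hypothetical, and the demonstration runs in a scratch directory rather than touching the real /tmp/Pythonizer.

```python
import os
import tempfile


def redirect_log_dir(log_dir, target_dir):
    """Make log_dir a symlink that points at target_dir."""
    os.makedirs(target_dir, exist_ok=True)
    if os.path.islink(log_dir):
        os.unlink(log_dir)              # replace an earlier redirection
    elif os.path.exists(log_dir):
        # A real directory may hold earlier logs; leave it for the user to move.
        raise FileExistsError(log_dir + " exists and is not a symlink")
    os.symlink(target_dir, log_dir)


# Demonstration in a scratch directory, so the real /tmp/Pythonizer is untouched.
scratch = tempfile.mkdtemp()
link = os.path.join(scratch, "Pythonizer")
target = os.path.join(scratch, "my_logs")
redirect_log_dir(link, target)
print(os.path.islink(link))   # True
```

For real use the first argument would be /tmp/Pythonizer and the second a directory of your choice. If /tmp/Pythonizer already exists as a regular directory, it has to be moved or removed first.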

Structure

Pythonizer consists of the main program called, as you can guess, pythonizer, and three modules (Perlscan.pm, Pythonizer.pm and Softpano.pm, as the listings below show), which currently need to reside in the same directory as the main program.

The total size of the codebase in version 0.6 is over 4K lines:
wc -l Perlscan.pm Pythonizer.pm  pythonizer  Softpano.pm pre_pythonizer.pl
  1107 Perlscan.pm
   382 Pythonizer.pm
  1515 pythonizer
   339 Softpano.pm
   825 pre_pythonizer.pl
  4168 total

The total size of the codebase in version 0.5 is around 4K lines:

wc -l Perlscan.pm Pythonizer.pm  pythonizer  Softpano.pm pre_pythonizer.pl
  1051 Perlscan.pm
   317 Pythonizer.pm
  1442 pythonizer
   336 Softpano.pm
   825 pre_pythonizer.pl
  3971 total

The total size of the codebase in version 0.4 is around 3.6K lines:

wc -l Perlscan.pm Pythonizer.pm  pythonizer  Softpano.pm pre_pythonizer.pl
   866 Perlscan.pm
   292 Pythonizer.pm
  1268 pythonizer
   312 Softpano.pm
   833 pre_pythonizer.pl
  3571 total

Installation

You need to download the files or replicate the directory via git. In the latter case the main program and the three modules mentioned above should be put into a separate directory, for example /opt/Pythonizer.

Currently the main program and all modules should reside in a single directory, from which you will run the program.

ATTENTION: During invocation of pythonizer this directory should be current.

Invocation

You can run Pythonizer both in Cygwin and Linux.

To "pythonize" the Perl script /path/to/your/program.pl, use the following invocation (the directory in which pythonizer resides should be current; otherwise the modules will not be loaded):

cd /path/to/pythonizer && pythonizer /path/to/your/program.pl

If the program runs to the end, you will get the "pythonized" text in /path/to/your/program.py.

It also produces a protocol of translation in /tmp/Pythonizer with side-by-side Perl and Python code, which allows you to analyze the protocol and detect places that need to be commented out or translated manually.

If __DATA__ or __END__ is used, a separate file with the extension .data (/path/to/your/program.data for the example above) will be created with the content of this section of the Perl script.
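
If you want to drive the translation from a Python script, the invocation and the output locations described above can be sketched as follows. The helpers expected_outputs and run_pythonizer are hypothetical and not part of Pythonizer. The sketch assumes that the pythonizer file is executable and that the Perl script is given as an absolute path, since the working directory is changed for the run.

```python
import subprocess
from pathlib import Path


def expected_outputs(perl_script):
    """Paths that, per this guide, a successful run should produce."""
    script = Path(perl_script)
    return {
        "python": script.with_suffix(".py"),
        # Created only if the Perl script uses __DATA__ or __END__.
        "data": script.with_suffix(".data"),
    }


def run_pythonizer(install_dir, perl_script):
    """Run pythonizer with its own directory as the working directory."""
    install_dir = Path(install_dir)
    # Counterpart of: cd /path/to/pythonizer && pythonizer /path/to/your/program.pl
    result = subprocess.run(
        [str(install_dir / "pythonizer"), str(perl_script)],
        cwd=install_dir,
    )
    return result.returncode


print(expected_outputs("/path/to/your/program.pl")["python"])
```

Setting cwd in subprocess.run plays the role of the cd in the shell command, so the modules are found without changing the directory of the calling process.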

Subsequent transformations

For some ideas see Perl to Python translation

History

Changes since version 0.6.

This version creates a list of global variables for each subroutine to maintain the same visibility in Python as in Perl, and generates a global statement with the list of such variables, which is inserted into each Python subroutine definition if pythonizer determines that the subroutine accesses global variables. The list might be excessive.

Changes since version 0.5.

Regular expressions are now translated more correctly. Short-cut ifs like (debug>0) && say $line are translated in a more general way than before. This is the first version that translates the main test (pre_pythonizer.pl) without syntax errors. The generated source starts executing in the Python interpreter and runs until the first error. A list of internal functions was created. Translation of backquotes and the open statement was improved.

Changes since version 0.4

Regular expression and tr function translation was improved. Many other changes and error corrections. The -r (refactor) option was implemented to allow refactoring the Perl source via pre_pythonizer.pl in an integrated fashion.

Changes  since version 0.3

Changes since version 0.2:


Old News ;-)

[Sep 18, 2020] Version 0.7 uploaded

Changes since version 0.6

This version creates a list of global variables for each subroutine to maintain the same visibility in Python as in Perl, and generates a global statement with the list of such variables, which is inserted into each Python subroutine definition if pythonizer determines that the subroutine accesses global variables.

So far the specifics of Perl state variables are ignored, and they are assumed to be yet another type of global variable (they generally do not belong to the global namespace: while their lifetime is similar to that of global variables, their namespace is local).

[Sep 08, 2020] Version 0.6 uploaded

Regular expressions are now translated more correctly. Short-cut ifs like (debug>0) && say $line are translated in a more general way than before. This is the first version that translates the main test (pre_pythonizer.pl) without syntax errors. The generated source starts executing in the Python interpreter and runs until the first error. A list of internal functions was created. Translation of backquotes and the open statement was improved.

[Aug 31, 2020] Version 0.5 uploaded

Changes since version 0.4

[Aug 22, 2020] Version 0.4 uploaded

Changes since version 0.3

[Aug 17, 2020] Version 0.3 was uploaded

Changes since version 0.2:
