Softpanorama

May the source be with you, but remember the KISS principle ;-)
(slightly skeptical) Educational society promoting "Back to basics" movement against IT overcomplexity and  bastardization of classic Unix

Scriptorama: A Slightly Skeptical View
on Scripting Languages


This is the central page of the Softpanorama Web site, because I am strongly convinced that the development of scripting languages, not the replication of the efforts of the BSD group undertaken by Stallman and Torvalds, is the central part of open source. See Scripting languages as VHLL for more details.

 
Ordinarily technology changes fast. But programming languages are different: programming languages are not just technology, but what programmers think in.

They're half technology and half religion. And so the median language, meaning whatever language the median programmer uses, moves as slow as an iceberg.

Paul Graham: Beating the Averages

Libraries are more important than the language.

Donald Knuth


Introduction

A fruitful way to think about language development is to consider it to be a special type of theory building. Peter Naur suggested that programming in general is a theory-building activity in his 1985 paper "Programming as Theory Building". But the idea is especially applicable to compilers and interpreters. What Peter Naur failed to understand was that the design of programming languages has religious overtones and sometimes represents an activity that is pretty close to the process of creating a new, obscure cult ;-). Clueless academics publishing junk papers at obscure conferences are the high priests of the church of programming languages. Some, like Niklaus Wirth and Edsger W. Dijkstra, (temporarily) reached a status close to that of (false) prophets :-).

On a deep conceptual level, building a new language is a human way of solving complex problems. That means that compiler construction is probably the most underappreciated paradigm for programming large systems. Much more so than the greatly oversold object-oriented programming, whose benefits are greatly overstated.

For users, programming languages distinctly have religious aspects, so decisions about what language to use are often far from rational and are mainly cultural. Indoctrination at the university plays a very important role: universities were recently instrumental in making Java the new Cobol.

The second important observation about programming languages is that the language per se is just a tiny part of what can be called the language programming environment. The latter includes libraries, IDEs, books, the level of adoption at universities, popular and important applications written in the language, and the level of support by key players for the language on major platforms such as Windows and Linux.

A mediocre language with a good programming environment can give a run for the money to languages of superior design that come "naked". This is the story behind the success of Java and PHP. A critical application is also very important, and this is the story of the success of PHP, which is nothing but a bastardized derivative of Perl (with all the most interesting Perl features surgically removed ;-) adapted to the creation of dynamic web sites using the so-called LAMP stack.

Progress in programming languages has been very uneven and has contained several setbacks. Currently this progress is mainly limited to the development of so-called scripting languages. The field of traditional high-level languages has been stagnant for decades. From 2000 to 2017 we observed the huge success of JavaScript; Python encroached on Perl's territory (including genomics/bioinformatics), and R in turn started squeezing Python in several areas. At the same time Ruby, despite initial success, remained a niche language. PHP still holds its own in web-site design.

Some observations about scripting language design and  usage

At the same time there are some mysterious, unanswered questions about the factors that help a particular scripting language to increase its user base, or cause it to fail in popularity.

Nothing succeeds like success

Those are difficult questions to answer without some way of classifying languages into different categories. Several such classifications exist. First of all, as with natural languages, the number of people who "speak" a given language is a tremendous force that can overcome any real or perceived deficiencies of the language. In programming languages, as in natural languages, nothing succeeds like success.

The second interesting category is the number of applications written in a particular language that became part of Linux or, at least, are included in the standard RHEL/Fedora/CentOS or Debian/Ubuntu repositories.

The third relevant category is the number and quality of books available for the particular language.

Complexity Curse

The history of programming languages raises interesting general questions about the limit of complexity of programming languages. There is strong historical evidence that a language with a simpler, or even simplistic, core (Basic, Pascal) has better chances to acquire a high level of popularity.

The underlying fact here is probably that most programmers are at best mediocre, and such programmers tend, on an intuitive level, to avoid more complex, richer languages and prefer, say, Pascal to PL/1 and PHP to Perl. Or at least they avoid them at a particular phase of language development (C++ is not a simpler language than PL/1, but it was widely adopted because of the progress of hardware, the availability of compilers and, not least, because it was associated with OO exactly at the time OO became a mainstream fashion).

Complex non-orthogonal languages can succeed only as a result of a long period of language development (which usually adds complexity -- just compare Fortran IV with Fortran 90, or PHP 3 with PHP 5) from a smaller core. Attempts to ride some fashionable new trend by extending an existing popular language to the new "paradigm" also proved to be relatively successful (OO programming in the case of C++, which is a superset of C).

Historically, few complex languages were successful (PL/1, Ada, Perl, C++), and even when they were successful, their success typically was temporary rather than permanent (PL/1, Ada, Perl). As Professor Wilkes noted (iee90):

Things move slowly in the computer language field but, over a sufficiently long period of time, it is possible to discern trends. In the 1970s, there was a vogue among system programmers for BCPL, a typeless language. This has now run its course, and system programmers appreciate some typing support. At the same time, they like a language with low level features that enable them to do things their way, rather than the compiler's way, when they want to.

They continue to have a strong preference for a lean language. At present they tend to favor C in its various versions. For applications in which flexibility is important, Lisp may be said to have gained strength as a popular programming language.

Further progress is necessary in the direction of achieving modularity. No language has so far emerged which exploits objects in a fully satisfactory manner, although C++ goes a long way. ADA was progressive in this respect, but unfortunately it is in the process of collapsing under its own great weight.

ADA is an example of what can happen when an official attempt is made to orchestrate technical advances. After the experience with PL/1 and ALGOL 68, it should have been clear that the future did not lie with massively large languages.

I would direct the reader's attention to Modula-3, a modest attempt to build on the appeal and success of Pascal and Modula-2 [12].

The complexity of the compiler/interpreter also matters, as it affects portability: this is one thing that probably doomed PL/1 (and later Ada), although these days a new language typically comes with an open source compiler (or, in the case of scripting languages, an interpreter), so this is less of a problem.

Here is an interesting take on language design from the preface to The D Programming Language book (the D language failed to achieve any significant level of popularity):

Programming language design seeks power in simplicity and, when successful, begets beauty.

Choosing the trade-offs among contradictory requirements is a difficult task that requires good taste from the language designer as much as mastery of theoretical principles and of practical implementation matters. Programming language design is software-engineering-complete.

D is a language that attempts to consistently do the right thing within the constraints it chose: system-level access to computing resources, high performance, and syntactic similarity with C-derived languages. In trying to do the right thing, D sometimes stays with tradition and does what other languages do, and other times it breaks tradition with a fresh, innovative solution. On occasion that meant revisiting the very constraints that D ostensibly embraced. For example, large program fragments or indeed entire programs can be written in a well-defined memory-safe subset of D, which entails giving away a small amount of system-level access for a large gain in program debuggability.

You may be interested in D if the following values are important to you:

The role of fashion

At the initial, most difficult stage of language development, the language should solve an important problem that is inadequately solved by currently popular languages. But at the same time the language has few chances to succeed unless it perfectly fits the current software fashion. This "fashion factor" is probably as important as several other factors combined, with the notable exception of the "language sponsor" factor. The latter can make or break the language.

As in women's dress, fashion rules in language design, and with time this trend has become more and more pronounced. A new language should represent the current fashionable trend. For example, OO programming was the calling card into the world of "big, successful languages" since probably the early '90s (C++, Java, Python). Before that, "structured programming" and "verification" (Pascal, Modula) played a similar role.

Programming environment and the role of "powerful sponsor" in language success

PL/1, Java, C#, Ada, Python are languages that had powerful sponsors. Pascal, Basic, Forth, partially Perl (O'Reilly was a sponsor for a short period of time)  are examples of the languages that had no such sponsor during the initial period of development.  C and C++ are somewhere in between.

But the language itself is not enough. Any language now needs a "programming environment", which consists of a set of libraries, a debugger, and other tools (a make tool, lint, a pretty-printer, etc.). The set of "standard" libraries and the debugger are probably the two most important elements. They cost a lot of time (or money) to develop, and here the role of a powerful sponsor is difficult to overestimate.

While this is not a necessary condition for becoming popular, it really helps: other things being equal, the weight of the sponsor of the language does matter. For example Java, being a weak, inconsistent language (C-- with garbage collection and OO), was pushed down people's throats on the strength of marketing and the huge amount of money spent on creating the Java programming environment.

The same was partially true for C# and Python. That's why Python, despite its "non-Unix" origin, is a more viable scripting language now than, say, Perl (which is better integrated with Unix and has support for pointers and regular expressions that was pretty innovative for scripting languages), or Ruby (which has had support for coroutines from day one, not as a "bolted-on" feature as in Python).

As in political campaigns, negative advertising also matters. For example, Perl suffered greatly from smears comparing programs written in it to "white noise", and then from the withdrawal of O'Reilly from the role of sponsor of the language (although it continues to milk the Perl book publishing franchise ;-)

People have proved to be pretty gullible, and in this sense language marketing is not that different from the marketing of women's clothing :-)

Language level and success

One very important classification of programming languages is based on the so-called level of the language. Essentially, once there is at least one language that is successful on a given level, the success of other languages on the same level becomes more problematic. The better chances for success belong to languages that have an even slightly higher level than their successful predecessors.

The level of the language can informally be described as the number of statements (or, more correctly, the number of lexical units (tokens)) needed to write a solution to a particular problem in one language versus another. This way we can distinguish several levels of programming languages.
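To make the notion concrete, here is a minimal sketch (the token counts are rough, and numbers.txt is a hypothetical input file with one number per line): the same task, summing numbers, at two different language levels.

# High level (an AWK one-liner): roughly a dozen tokens
awk '{ sum += $1 } END { print sum }' numbers.txt

# A low-level language such as C needs includes, main(), file handling,
# a read loop, and output formatting -- easily fifty or more tokens
# for exactly the same result.

The lower the level, the more tokens go to bookkeeping rather than to the problem itself.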

 "Nanny languages" vs "Sharp razor" languages

Some people distinguish between "nanny languages" and "sharp razor" languages. The latter do not attempt to protect the user from his errors, while the former usually go too far... The right compromise is extremely difficult to find.

For example, I consider the explicit availability of pointers an important feature of a language that greatly increases its expressive power and far outweighs the risks of errors in the hands of unskilled practitioners. In other words, attempts to make a language "safer" often misfire.

Expressive style of the languages

Another useful typology is based on the expressive style of the language.

Those categories are not pure and somewhat overlap. For example, it's possible to program in an object-oriented style in C, or even in assembler. Some scripting languages like Perl have built-in regular expression engines that are part of the language, so they have a functional component despite being procedural. Some relatively low-level languages (Algol-style languages) implement garbage collection; a good example is Java. There are also scripting languages that compile into a common language runtime designed for high-level languages. For example, IronPython compiles into .NET.

Weak correlation between quality of design and popularity

The popularity of programming languages is not strongly connected to their quality. Some languages that look like a collection of language designer blunders (PHP, Java) became quite popular. Java became the new Cobol, and PHP dominates dynamic Web site construction. The dominant technology for such Web sites is often called LAMP, which means Linux-Apache-MySQL-PHP. Being a highly simplified but badly constructed subset of Perl (a kind of new Basic for dynamic Web sites), PHP provides a most depressing experience. I was unpleasantly surprised when I learned that the Wikipedia engine was rewritten from Perl to PHP some time ago, but this fact illustrates the trend well. The number of mediocre programmers outweighs the number of talented programmers by a factor of 100 or more.

So language design quality has little to do with a language's success in the marketplace. Simpler languages have wider appeal, as the success of PHP (which at the beginning came at the expense of Perl) suggests. In addition, much depends on whether the language has a powerful sponsor, as was the case with Java (Sun and IBM) and PHP (Facebook). This is partially true for Python (Google), although that came only after the designers of the language had spent many years fighting for survival.

Progress in programming languages has been very uneven and has contained several setbacks, like Java and PHP (and partially C++). Currently this progress is usually associated with scripting languages. The history of programming languages raises interesting general questions about the "laws" of programming language design. First, let's reproduce several notable quotes:

  1. Knuth law of optimization: "Premature optimization is the root of all evil (or at least most of it) in programming." - Donald Knuth
  2. "Greenspun's Tenth Rule of Programming: any sufficiently complicated C or Fortran program contains an ad hoc informally-specified bug-ridden slow implementation of half of Common Lisp." - Phil Greenspun
  3. "The key to performance is elegance, not battalions of special cases." - Jon Bentley and Doug McIlroy
  4. "Some may say Ruby is a bad rip-off of Lisp or Smalltalk, and I admit that. But it is nicer to ordinary people." - Matz, LL2
  5. Most papers in computer science describe how their author learned what someone else already knew. - Peter Landin
  6. "The only way to learn a new programming language is by writing programs in it." - Kernighan and Ritchie
  7. "If I had a nickel for every time I've written "for (i = 0; i < N; i++)" in C, I'd be a millionaire." - Mike Vanier
  8. "Language designers are not intellectuals. They're not as interested in thinking as you might hope. They just want to get a language done and start using it." - Dave Moon
  9. "Don't worry about what anybody else is going to do. The best way to predict the future is to invent it." - Alan Kay
  10. "Programs must be written for people to read, and only incidentally for machines to execute." - Abelson & Sussman, SICP, preface to the first edition

Please note that it is one thing to read a language manual and appreciate how good the concepts are, and quite another to bet your project on a new, unproven language without good debuggers, manuals and, very importantly, libraries. The debugger is very important, but standard libraries are crucial: they represent the factor that makes or breaks new languages.

In this sense languages are much like cars. For many people a car is the thing they use to get to work and to the shopping mall, and they are not very interested in whether the engine is inline or V-type, or whether the transmission uses fuzzy logic. What they care about is safety, reliability, mileage, insurance, and the size of the trunk. In this sense "worse is better" is very true. I already mentioned the importance of the debugger. The other important criterion is the quality and availability of libraries. Actually, libraries account for perhaps 80% of the usability of a language; moreover, in a sense libraries are more important than the language...

The popular belief that scripting is an "unsafe", "second-rate", or "prototype-only" solution is completely wrong. If a project has died, it does not matter what the implementation language was; for a successful project on a tough schedule, a scripting language (especially in a dual scripting-language-plus-C combination, for example Tcl+C) is an optimal blend for a large class of tasks. Such an approach helps to separate architectural decisions from implementation details much better than any OO model does.

Moreover, even for tasks that handle a fair amount of computation and data (computationally intensive tasks), such languages as Python and Perl are often (but not always!) competitive with C++, C# and, especially, Java.


Programming Language Development Timeline

Here is a timeline of programming languages, modified from Byte (for the original, see BYTE.com, September 1995 / 20th Anniversary):

Forties: ca. 1946, 1949

Fifties: 1951, 1952, 1957, 1958, 1959

Sixties: 1960, 1962, 1963, 1964, 1965, 1966, 1967, 1969

Seventies: 1970, 1972, 1974, 1975, 1976, 1977, 1978, 1979

Eighties: 1980, 1981, 1982, 1983, 1984, 1985, 1986, 1987, 1988, 1989

Nineties: 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997

Two-thousands and later: 2000, 2006, 2007, 2008, 2009, 2011, 2017

Special note on Scripting languages

Scripting helps to avoid the OO trap that is pushed by
"a horde of practically illiterate researchers
publishing crap papers in junk conferences."

Despite the fact that scripting languages are a really important computer science phenomenon, they are usually happily ignored in university curricula. Students are usually indoctrinated (or, in less politically correct terms, "brainwashed") in Java and OO programming ;-)

This site tries to give scripting languages proper emphasis and promotes scripting languages as an alternative to the mainstream reliance on the "Java as a new Cobol" approach to software development. Please read my introduction to the topic, which was recently converted into the article A Slightly Skeptical View on Scripting Languages.

The tragedy of the scripting language designer is that there is no way to overestimate the level of abuse of any feature of the language. Half of all programmers are by definition below average, and it is this half that matters most in an enterprise environment. In a way, the higher the level of the programmer, the less relevant the limitations of the language are to him. That's why statements like "Perl is poorly suited to large project development" are plainly silly. With proper discipline it is perfectly suitable, and programmers can be more productive with Perl than with Java. The real question is: "What are the quality and quantity of the team?"

Scripting is a part of the Unix cultural tradition, and Unix was the initial development platform for most of the mainstream scripting languages, with the exception of REXX. But they are portable, and now all of them can be used on Windows and other OSes.


Different scripting languages provide different levels of integration with the base OS API (for example, Unix or Windows). For example, IronPython compiles into .NET and provides a pretty high level of integration with Windows. The same is true for Perl and Unix: almost all Unix system calls are available directly from Perl. Moreover, Perl integrates most of the Unix API in a very natural way, making it a perfect replacement for the shell when coding complex scripts. It also has a very good debugger; the latter is a weak point of shells like bash and ksh93.

Unix proved that treating everything like a file is a powerful OS paradigm. In a similar way scripting languages proved that "everything is a string" is also an extremely powerful programming paradigm.
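A minimal bash sketch of this paradigm: command output, numbers, and messages all travel through the script as strings, and a string is reinterpreted as a number only at the moment of use.

count=$(ls /etc | wc -l)        # command output captured as a string
echo "found $count entries"     # spliced into another string
if [ "$count" -gt 100 ]; then   # the same string compared as a number
    echo "/etc is crowded"
fi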


There are also several separate pages devoted to scripting in different applications. The main emphasis is on shells and Perl. Right now I am trying to convert my old Perl lecture notes into an eBook: Nikolai Bezroukov, Introduction to Perl for Unix System Administrators.

Along with pages devoted to major scripting languages, this site has many pages devoted to scripting in different applications. There are more than a dozen "Perl/scripting tools for a particular area" pages. The most well developed and up-to-date pages of this set are probably Shells and Perl. This page's main purpose is to follow the changes in programming practices that can be called the "rise of scripting," as predicted in the famous John Ousterhout article Scripting: Higher Level Programming for the 21st Century in IEEE COMPUTER (1998). In this brilliant paper he wrote:

...Scripting languages such as Perl and Tcl represent a very different style of programming than system programming languages such as C or Java. Scripting languages are designed for "gluing" applications; they use typeless approaches to achieve a higher level of programming and more rapid application development than system programming languages. Increases in computer speed and changes in the application mix are making scripting languages more and more important for applications of the future.

...Scripting languages and system programming languages are complementary, and most major computing platforms since the 1960's have provided both kinds of languages. The languages are typically used together in component frameworks, where components are created with system programming languages and glued together with scripting languages. However, several recent trends, such as faster machines, better scripting languages, the increasing importance of graphical user interfaces and component architectures, and the growth of the Internet, have greatly increased the applicability of scripting languages. These trends will continue over the next decade, with more and more new applications written entirely in scripting languages and system programming languages used primarily for creating components.

My e-book Portraits of Open Source Pioneers contains several chapters on scripting (most are in early draft stage) that expand on this topic. 

The reader must understand that the treatment of scripting languages in the press, and especially the academic press, is far from fair: entrenched academic interests often promote old or commercially supported paradigms until they retire, so a change of paradigm is often possible only with the change of generations. And people tend to live longer these days... Please also be aware that even respectable academic magazines like Communications of the ACM and IEEE Software often promote "cargo cult software engineering" like the Capability Maturity Model (CMM).

Dr. Nikolai Bezroukov



NEWS CONTENTS

Old News ;-)


[Jun 23, 2021] How to make a pipe loop in bash

Jun 23, 2021 | stackoverflow.com




mweerden ,


Assume that I have programs P0, P1, ..., P(n-1) for some n > 0. How can I easily redirect the output of program Pi to program P((i+1) mod n) for all i (0 <= i < n)?

For example, let's say I have a program square, which repeatedly reads a number and then prints the square of that number, and a program calc, which sometimes prints a number after which it expects to be able to read the square of it. How do I connect these programs such that whenever calc prints a number, square squares it and returns it to calc?

Edit: I should probably clarify what I mean with "easily". The named pipe/fifo solution is one that indeed works (and I have used in the past), but it actually requires quite a bit of work to do properly if you compare it with using a bash pipe. (You need to get a not yet existing filename, make a pipe with that name, run the "pipe loop", clean up the named pipe.) Imagine you could no longer write prog1 | prog2 and would always have to use named pipes to connect programs.

I'm looking for something that is almost as easy as writing a "normal" pipe. For instance something like { prog1 | prog2 } >&0 would be great.
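(To illustrate the bookkeeping the question alludes to, here is a hedged sketch of the named-pipe approach; prog1 and prog2 are placeholders for the programs being connected:)

fifo=$(mktemp -u /tmp/loopfifo.XXXXXX)   # pick a not-yet-existing filename
mkfifo "$fifo"                           # make a pipe with that name
trap 'rm -f "$fifo"' EXIT                # clean up the named pipe on exit
prog1 < "$fifo" | prog2 > "$fifo"        # run the "pipe loop"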


mweerden ,


After spending quite some time yesterday trying to redirect stdout to stdin , I ended up with the following method. It isn't really nice, but I think I prefer it over the named pipe/fifo solution.

read | { P0 | ... | P(n-1); } >/dev/fd/0

The { ... } >/dev/fd/0 is to redirect stdout to stdin for the pipe sequence as a whole (i.e. it redirects the output of P(n-1) to the input of P0). Using >&0 or something similar does not work; this is probably because bash assumes 0 is read-only while it doesn't mind writing to /dev/fd/0 .

The initial read pipe is necessary because without it both the input and output file descriptors are the same pts device (at least on my system) and the redirect has no effect. (The pts device doesn't work as a pipe; writing to it puts things on your screen.) By making the input of the { ... } a normal pipe, the redirect has the desired effect.

To illustrate with my calc / square example:

function calc() {
  # calculate sum of squares of numbers 0,..,10

  sum=0
  for ((i=0; i<10; i++)); do
    echo $i                   # "request" the square of i

    read ii                   # read the square of i
    echo "got $ii" >&2          # debug message

    let sum=$sum+$ii
  done

  echo "sum $sum" >&2           # output result to stderr
}

function square() {
  # square numbers

  read j                         # receive first "request"
  while [ "$j" != "" ]; do
    let jj=$j*$j
    echo "square($j) = $jj" >&2  # debug message

    echo $jj                     # send square

    read j                       # receive next "request"
  done
}

read | { calc | square; } >/dev/fd/0

Running the above code gives the following output:

square(0) = 0
got 0
square(1) = 1
got 1
square(2) = 4
got 4
square(3) = 9
got 9
square(4) = 16
got 16
square(5) = 25
got 25
square(6) = 36
got 36
square(7) = 49
got 49
square(8) = 64
got 64
square(9) = 81
got 81
sum 285

Of course, this method is quite a bit of a hack. Especially the read part has an undesired side effect: termination of the "real" pipe loop does not lead to termination of the whole. I couldn't think of anything better than read, as it seems that you can only determine that the pipe loop has terminated by trying to write something to it.

regnarg ,

Nice solution. I had to do something similar using netcat inside a loop and worked around the 'read' side effect by 'closing' its input with an 'echo'. In the end it was something like this : echo | read | { P0 | ... | P(n-1); } >/dev/fd/0 – Thiago de Arruda Nov 30 '11 at 16:29

Douglas Leeder , 2008-09-02 20:57:53


A named pipe might do it:

$ mkfifo outside
$ <outside calc | square >outside &
$ echo "1" >outside ## Trigger the loop to start

Douglas Leeder ,

Could you explain the line "<outside calc | square >outside &"? I am unsure about <outside and >outside. – Léo Léopold Hertz 준영 May 7 '09 at 18:35

Mark Witczak ,


This is a very interesting question. I (vaguely) remember an assignment very similar in college 17 years ago. We had to create an array of pipes, where our code would get filehandles for the input/output of each pipe. Then the code would fork and close the unused filehandles.

I'm thinking you could do something similar with named pipes in bash. Use mknod or mkfifo to create a set of pipes with unique names you can reference, then fork your program.


Andreas Florath , 2015-03-14 20:30:14


My solution uses pipexec (most of the function implementation comes from your answer):

square.sh

function square() {
  # square numbers

  read j                         # receive first "request"
  while [ "$j" != "" ]; do
    let jj=$j*$j
    echo "square($j) = $jj" >&2  # debug message

    echo $jj                     # send square

    read j                       # receive next "request"
  done
}

square $@

calc.sh

function calc() {
  # calculate sum of squares of numbers 0,..,10

  sum=0
  for ((i=0; i<10; i++)); do
    echo $i                   # "request" the square of i

    read ii                   # read the square of i
    echo "got $ii" >&2          # debug message

    let sum=$sum+$ii
 done

 echo "sum $sum" >&2           # output result to stderr
}

calc $@

The command

pipexec [ CALC /bin/bash calc.sh ] [ SQUARE /bin/bash square.sh ] \
    "{CALC:1>SQUARE:0}" "{SQUARE:1>CALC:0}"

The output (same as in your answer)

square(0) = 0
got 0
square(1) = 1
got 1
square(2) = 4
got 4
square(3) = 9
got 9
square(4) = 16
got 16
square(5) = 25
got 25
square(6) = 36
got 36
square(7) = 49
got 49
square(8) = 64
got 64
square(9) = 81
got 81
sum 285

Comment: pipexec was designed to start processes and build arbitrary pipes in between. Because bash functions cannot be handled as processes, there is the need to have the functions in separate files and use a separate bash.


1729 ,


Named pipes.

Create a series of fifos, using mkfifo

i.e., fifo0, fifo1

Then attach each process in term to the pipes you want:

processn < fifo(n-1) > fifon


Penz ,


I doubt sh/bash can do it. ZSH would be a better bet, with its MULTIOS and coproc features.

Léo Léopold Hertz 준영 ,

Could you give an example about Zsh? I am interested in it. – Léo Léopold Hertz 준영 May 7 '09 at 18:36

Fritz G. Mehner ,


A command stack can be composed as string from an array of arbitrary commands and evaluated with eval. The following example gives the result 65536.

function square ()
{
  read n
  echo $((n*n))
}    # ----------  end of function square  ----------

declare -a  commands=( 'echo 4' 'square' 'square' 'square' )

#-------------------------------------------------------------------------------
#   build the command stack using pipes
#-------------------------------------------------------------------------------
declare     stack=${commands[0]}

for (( COUNTER=1; COUNTER<${#commands[@]}; COUNTER++ )); do
  stack="${stack} | ${commands[${COUNTER}]}"
done

#-------------------------------------------------------------------------------
#   run the command stack
#-------------------------------------------------------------------------------
eval "$stack"

reinierpost ,

I don't think you're answering the question. – reinierpost Jan 29 '10 at 15:04

[Jun 23, 2021] bash - How can I use a pipe in a while condition? - Ask Ubuntu

Notable quotes:
"... This is not at all what you are looking for. ..."
Jun 23, 2021 | askubuntu.com

John1024 , 2016-09-17 06:14:32


To get the logic right, just minor changes are required. Use:

while ! df | grep '/toBeMounted'
do
  sleep 2
done
echo -e '\a'Hey, I think you wanted to know that /toBeMounted is available finally.
Discussion

The corresponding code in the question was:

while df | grep -v '/toBeMounted'

The exit code of a pipeline is the exit code of the last command in the pipeline. grep -v '/toBeMounted' will return true (code=0) if at least one line of input does not match /toBeMounted . Thus, this tests whether there are other things mounted besides /toBeMounted . This is not at all what you are looking for.

To use df and grep to test whether /toBeMounted is mounted, we need

df | grep '/toBeMounted'

This returns true if /toBeMounted is mounted. What you actually need is the negation of this: you need a condition that is true if /toBeMounted is not mounted. To do that, we just need to use negation, denoted by ! :

! df | grep '/toBeMounted'

And, this is what we use in the code above.

Documentation

From the Bash manual :

The return status of a pipeline is the exit status of the last command, unless the pipefail option is enabled. If pipefail is enabled, the pipeline's return status is the value of the last (rightmost) command to exit with a non-zero status, or zero if all commands exit successfully. If the reserved word ! precedes a pipeline, the exit status of that pipeline is the logical negation of the exit status as described above. The shell waits for all commands in the pipeline to terminate before returning a value.
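A quick way to see these rules in action; a minimal sketch you can paste into a bash session:

false | true; echo $?     # 0: status of the last command in the pipeline
set -o pipefail
false | true; echo $?     # 1: the rightmost non-zero status now wins
! false | true; echo $?   # 0: the reserved word ! negates the pipeline status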


John1024 ,

Yeah it looks like my real problem wasn't the pipe, but not clearly thinking about the -v on a line by line basis. – dlamblin Sep 17 '16 at 6:47

Sergiy Kolodyazhnyy ,


The fact that you're using df with grep tells me that you're filtering the output of df until some device mounts to a specific directory, i.e., checking whether or not it's on the list.

Instead of filtering the list, focus on the directory that you want. Luckily for us, the utility mountpoint allows us to do exactly that, and lets us deal with the exit status of that command. Consider this:

$ mountpoint  /mnt/HDD/                                                        
/mnt/HDD/ is a mountpoint
$ echo $?
0
$ mountpoint  ~                                                                
/home/xieerqi is not a mountpoint
$ echo $?
1

Your script thus, can be rewritten as

while ! mountpoint /toBeMounted > /dev/null
do
   sleep 3
done
echo "Yup, /toBeMounted got mounted!"

Sample run with my own disk:

$ while ! mountpoint /mnt/HDD > /dev/null
> do 
>     echo "Waiting"
>     sleep 1
> done && echo "/mnt/HDD is mounted"
Waiting
Waiting
Waiting
Waiting
Waiting
/mnt/HDD is mounted

On a side note, you can fairly easily implement your own version of the mountpoint command, for instance in Python, like I did:

#!/usr/bin/env python3
from os import path
import sys

def main():

    if len(sys.argv) < 2:
       print('Missing a path')
       sys.exit(1)

    full_path = path.realpath(sys.argv[1])
    with open('/proc/self/mounts') as mounts:
       for line in mounts:
           if full_path in line:
              print(full_path,' is mountpoint')
              sys.exit(0)
    print(full_path,' is not a mountpoint')
    sys.exit(1)

if __name__ == '__main__':
    main()

Sample run:

$ python3 ./is_mountpoint.py /mnt/HDD                                          
/mnt/HDD  is mountpoint
$ python3 ./is_mountpoint.py ~                                                 
/home/xieerqi  is not a mountpoint

Sergiy Kolodyazhnyy ,

I was generally unclear on using a pipe in a conditional statement. But the specific case of checking for a mounted device, mountpoint sounds perfect, thanks. Though conceptually in this case I could have also just done: while [ ! -d /toBeMounted ]; do sleep 2; done; echo -e \\aDing the directory is available now.dlamblin Sep 20 '16 at 0:52

[Jun 23, 2021] bash - multiple pipes in loop, saving pipeline-result to array - Unix & Linux Stack Exchange

Jun 23, 2021 | unix.stackexchange.com




gugy , 2018-07-25 09:56:33


I am trying to do the following (using bash): search for files that always have the same name, and extract data from these files. I want to store the extracted data in new arrays. I am almost there, I think; see the code below.

The files I am searching for all have this format:

 #!/bin/bash
  echo "the concentration of NDPH is 2 mM, which corresponds to 2 molecules in a box of size 12 nm (12 x 12 x 12 nm^3)" > README_test

#find all the README* files and save the paths into an array called files
  files=()
  data1=()
  data2=()
  data3=()

  while IFS=  read -r -d $'\0'; do
files+=("$REPLY")
  #open all the files and extract data from them
  while read -r line
  do
name="$line"
echo "$name" | tr ' ' '\n'|  awk 'f{print;f=0;exit} /of/{f=1}' 
echo "$name" 
echo "$name" | tr ' ' '\n'|  awk 'f{print;f=0;exit} /of/{f=1}'
data1+=( "$echo "$name" | tr ' ' '\n'|  awk 'f{print;f=0;exit} /of/{f=1}' )" )    

# variables are not preserved...
# data2+= echo "$name"  | tr ' ' '\n'|  awk 'f{print;f=0;exit} /is/{f=1}'
echo "$name"  | tr ' ' '\n'|  awk 'f{print;f=0;exit} /size/{f=1}'
# variables are not preserved... 
# data3+= echo "$name"  | tr ' ' '\n'|  awk 'f{print;f=0;exit} /size/{f=1}'
  done < "$REPLY"
  done < <(find . -name "README*" -print0)
  echo ${data1[0]}

The issue is that the pipe giving me the exact output I want from the files is "not working" (variables are not preserved) in the loops. I have no idea how/if I can use process substitution to get what I want: an array (data1, data2, data3) filled with the output of the pipes.

UPDATE: So I was not assigning things to the array correctly (see data1, which is properly assigning something now). But why are

echo ${data1[0]}

and

echo "$name" | tr ' ' '\n'|  awk 'f{print;f=0;exit} /of/{f=1}'

not the same?

SOLUTION (as per ilkkachu's accepted answer):

  #!/bin/bash
  echo "the concentration of NDPH is 2 mM, which corresponds to 2 molecules in a box of size 12 nm (12 x 12 x 12 nm^3)" > README_test
  files=()
  data1=()
  data2=()
  data3=()

  get_some_field() {    
 echo "$1" | tr ' ' '\n'|  awk -vkey="$2" 'f{print;f=0;exit} $0 ~ key {f=1}' 
  }

  #find all the README* files and save the paths into an array called files
  while IFS=  read -r -d $'\0'; do
files+=("$REPLY")
  #open all the files and extract data from them
  while read -r line
  do
name="$line"
echo "$name" 
echo "$name" | tr ' ' '\n'|  awk 'f{print;f=0;exit} /of/{f=1}'
data1+=( "$(get_some_field "$name" of)" )
data2+=( "$(get_some_field "$name" is)" )
data3+=( "$(get_some_field "$name" size)" )

  done < "$REPLY"
 done < <(find . -name "README*" -print0)

  echo ${data1[0]}
  echo ${data2[0]}
  echo ${data3[0]}

steeldriver ,

data1+= echo... doesn't really do anything to the data1 variable. Do you mean to use data1+=( "$(echo ... | awk)" ) ? – ilkkachu Jul 25 '18 at 10:20


I'm assuming you want the output of the echo ... | awk stored in a variable, and in particular, appended to one of the arrays.

First, to capture the output of a command, use "$( cmd... )" (command substitution). As a trivial example, this prints your hostname:

var=$(uname -n)
echo $var

Second, to append to an array, you need to use the array assignment syntax, with parenthesis around the right hand side. This would append the value of var to the array:

array+=( $var )

And third, the expansion of $var and the command substitution $(...) are subject to word splitting, so you want to use parenthesis around them. Again a trivial example, this puts the full output of uname -a as a single element in the array:

array+=( "$(uname -a)" )

Or, in your case, in full:

data1+=( "$(echo "$1" | tr ' ' '\n'|  awk 'f{print;f=0;exit} /of/{f=1}')" )

(Note that the quotes inside the command substitution are distinct from the quotes outside it. The quote before $1 doesn't stop the quoting started outside $(), unlike what the syntax highlighting on SE seems to imply.)

You could make that slightly simpler to read by putting the pipeline in a function:

get_data1() {
    echo "$name" | tr ' ' '\n'|  awk 'f{print;f=0;exit} /of/{f=1}'
}
...
data1+=( "$(get_data1)" )

Or, as the pipelines seem similar, use the function to avoid repeating the code:

get_some_field() {
    echo "$1" | tr ' ' '\n'|  awk -vkey="$2" 'f{print;f=0;exit} $0 ~ key {f=1}'
}

and then

data1+=( "$(get_some_field "$name" of)" )
data2+=( "$(get_some_field "$name" is)" )
data3+=( "$(get_some_field "$name" size)" )

(If I read your pipeline right, that is, I didn't test the above.)

[Jun 23, 2021] Working with data streams on the Linux command line by David Both

From The Linux Philosophy for SysAdmins And Everyone Who Wants To Be One by David Both
"... This is the Unix philosophy: Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface." ..."
Notable quotes:
"... This is the Unix philosophy: Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface." ..."
Oct 30, 2018 | opensource.com
Author's note: Much of the content in this article is excerpted, with some significant edits to fit the Opensource.com article format, from Chapter 3: Data Streams, of my new book, The Linux Philosophy for SysAdmins .

Everything in Linux revolves around streams of data -- particularly text streams. Data streams are the raw materials upon which the GNU Utilities , the Linux core utilities, and many other command-line tools perform their work.

As its name implies, a data stream is a stream of data -- especially text data -- being passed from one file, device, or program to another using STDIO. This chapter introduces the use of pipes to connect streams of data from one utility program to another using STDIO. You will learn that the function of these programs is to transform the data in some manner. You will also learn about the use of redirection to redirect the data to a file.

I use the term "transform" in conjunction with these programs because the primary task of each is to transform the incoming data from STDIO in a specific way as intended by the sysadmin and to send the transformed data to STDOUT for possible use by another transformer program or redirection to a file.

The standard term, "filters," implies something with which I don't agree. By definition, a filter is a device or a tool that removes something, such as an air filter removes airborne contaminants so that the internal combustion engine of your automobile does not grind itself to death on those particulates. In my high school and college chemistry classes, filter paper was used to remove particulates from a liquid. The air filter in my home HVAC system removes particulates that I don't want to breathe.

Although they do sometimes filter out unwanted data from a stream, I much prefer the term "transformers" because these utilities do so much more. They can add data to a stream, modify the data in some amazing ways, sort it, rearrange the data in each line, perform operations based on the contents of the data stream, and so much more. Feel free to use whichever term you prefer, but I prefer transformers. I expect that I am alone in this.

Data streams can be manipulated by inserting transformers into the stream using pipes. Each transformer program is used by the sysadmin to perform some operation on the data in the stream, thus changing its contents in some manner. Redirection can then be used at the end of the pipeline to direct the data stream to a file. As mentioned, that file could be an actual data file on the hard drive, or a device file such as a drive partition, a printer, a terminal, a pseudo-terminal, or any other device connected to a computer.
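For example, a minimal pipeline of such transformers, ending with a redirection that parks the stream in a file (document.txt and top-words.txt are hypothetical names):

# split a document into words, count duplicates, keep the ten most
# frequent, and redirect the final stream to a file
tr -s ' ' '\n' < document.txt | sort | uniq -c | sort -rn | head -10 > top-words.txt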

The ability to manipulate these data streams using these small yet powerful transformer programs is central to the power of the Linux command-line interface. Many of the core utilities are transformer programs and use STDIO.

In the Unix and Linux worlds, a stream is a flow of text data that originates at some source; the stream may flow to one or more programs that transform it in some way, and then it may be stored in a file or displayed in a terminal session. As a sysadmin, your job is intimately associated with manipulating the creation and flow of these data streams. In this post, we will explore data streams -- what they are, how to create them, and a little bit about how to use them.

Text streams -- a universal interface

The use of Standard Input/Output (STDIO) for program input and output is a key foundation of the Linux way of doing things. STDIO was first developed for Unix and has found its way into most other operating systems since then, including DOS, Windows, and Linux.

" This is the Unix philosophy: Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface."

-- Doug McIlroy, Basics of the Unix Philosophy

STDIO

STDIO was developed by Ken Thompson as a part of the infrastructure required to implement pipes on early versions of Unix. Programs that implement STDIO use standardized file handles for input and output rather than files that are stored on a disk or other recording media. STDIO is best described as a buffered data stream, and its primary function is to stream data from the output of one program, file, or device to the input of another program, file, or device.

If STDOUT is redirected to a file, STDERR continues to be displayed on the screen. This ensures that when the data stream itself is not displayed on the terminal, that STDERR is, thus ensuring that the user will see any errors resulting from execution of the program. STDERR can also be redirected to the same or passed on to the next transformer program in a pipeline.
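A minimal sketch of that behavior:

ls /etc /nonexistent > out.txt           # STDOUT goes to the file; the error still reaches the screen
ls /etc /nonexistent > out.txt 2>&1      # STDERR redirected to the same place as STDOUT
ls /etc /nonexistent 2>&1 | grep cannot  # STDERR passed on down the pipeline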

STDIO is implemented as a C library, stdio.h , which can be included in the source code of programs so that it can be compiled into the resulting executable.

Simple streams

You can perform the following experiments safely in the /tmp directory of your Linux host. As the root user, make /tmp the PWD, create a test directory, and then make the new directory the PWD.

# cd /tmp ; mkdir test ; cd test

Enter and run the following command line program to create some files with content on the drive. We use the dmesg command simply to provide data for the files to contain. The contents don't matter as much as just the fact that each file has some content.

# for I in 0 1 2 3 4 5 6 7 8 9 ; do dmesg > file$I.txt ; done

Verify that there are now at least 10 files in /tmp/ with the names file0.txt through file9.txt .

# ll
total 1320
-rw-r--r-- 1 root root 131402 Oct 17 15:50 file0.txt
-rw-r--r-- 1 root root 131402 Oct 17 15:50 file1.txt
-rw-r--r-- 1 root root 131402 Oct 17 15:50 file2.txt
-rw-r--r-- 1 root root 131402 Oct 17 15:50 file3.txt
-rw-r--r-- 1 root root 131402 Oct 17 15:50 file4.txt
-rw-r--r-- 1 root root 131402 Oct 17 15:50 file5.txt
-rw-r--r-- 1 root root 131402 Oct 17 15:50 file6.txt
-rw-r--r-- 1 root root 131402 Oct 17 15:50 file7.txt
-rw-r--r-- 1 root root 131402 Oct 17 15:50 file8.txt
-rw-r--r-- 1 root root 131402 Oct 17 15:50 file9.txt

We have generated data streams using the dmesg command, which was redirected to a series of files. Most of the core utilities use STDIO as their output stream and those that generate data streams, rather than acting to transform the data stream in some way, can be used to create the data streams that we will use for our experiments. Data streams can be as short as one line or even a single character, and as long as needed.

Exploring the hard drive

It is now time to do a little exploring. In this experiment, we will look at some of the filesystem structures.

Let's start with something simple. You should be at least somewhat familiar with the dd command. Officially known as "disk dump," many sysadmins call it "disk destroyer" for good reason. Many of us have inadvertently destroyed the contents of an entire hard drive or partition using the dd command. That is why we will hang out in the /tmp/test directory to perform some of these experiments.

Despite its reputation, dd can be quite useful in exploring various types of storage media, hard drives, and partitions. We will also use it as a tool to explore other aspects of Linux.

Log into a terminal session as root if you are not already. We first need to determine the device special file for your hard drive using the lsblk command.

[root@studentvm1 test]# lsblk -i
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 60G 0 disk
|-sda1 8:1 0 1G 0 part /boot
`-sda2 8:2 0 59G 0 part
|-fedora_studentvm1-pool00_tmeta 253:0 0 4M 0 lvm
| `-fedora_studentvm1-pool00-tpool 253:2 0 2G 0 lvm
| |-fedora_studentvm1-root 253:3 0 2G 0 lvm /
| `-fedora_studentvm1-pool00 253:6 0 2G 0 lvm
|-fedora_studentvm1-pool00_tdata 253:1 0 2G 0 lvm
| `-fedora_studentvm1-pool00-tpool 253:2 0 2G 0 lvm
| |-fedora_studentvm1-root 253:3 0 2G 0 lvm /
| `-fedora_studentvm1-pool00 253:6 0 2G 0 lvm
|-fedora_studentvm1-swap 253:4 0 10G 0 lvm [SWAP]
|-fedora_studentvm1-usr 253:5 0 15G 0 lvm /usr
|-fedora_studentvm1-home 253:7 0 2G 0 lvm /home
|-fedora_studentvm1-var 253:8 0 10G 0 lvm /var
`-fedora_studentvm1-tmp 253:9 0 5G 0 lvm /tmp
sr0 11:0 1 1024M 0 rom

We can see from this that there is only one hard drive on this host, that the device special file associated with it is /dev/sda , and that it has two partitions. The /dev/sda1 partition is the boot partition, and the /dev/sda2 partition contains a volume group on which the rest of the host's logical volumes have been created.

As root in the terminal session, use the dd command to view the boot record of the hard drive, assuming it is assigned to the /dev/sda device. The bs= argument is not what you might think; it simply specifies the block size, and the count= argument specifies the number of blocks to dump to STDIO. The if= argument specifies the source of the data stream, in this case, the /dev/sda device. Notice that we are not looking at the first block of the partition, we are looking at the very first block of the hard drive.
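The command itself is elided in this excerpt; assuming the standard 512-byte sector size the description implies, it was presumably of this shape:

dd if=/dev/sda bs=512 count=1    # read-only dump of the first block (the boot record)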

... ... ...

This prints the text of the boot record, which is the first block on the disk -- any disk. In this case, there is information about the filesystem and, although it is unreadable because it is stored in binary format, the partition table. If this were a bootable device, stage 1 of GRUB or some other boot loader would be located in this sector. The last three lines contain data about the number of records and bytes processed.

Starting with the beginning of /dev/sda1 , let's look at a few blocks of data at a time to find what we want. The command is similar to the previous one, except that we have specified a few more blocks of data to view. You may have to specify fewer blocks if your terminal is not large enough to display all of the data at one time, or you can pipe the data through the less utility and use that to page through the data -- either way works. Remember, we are doing all of this as root user because non-root users do not have the required permissions.

Enter the same command as you did in the previous experiment, but increase the block count to be displayed to 100, as shown below, in order to show more data.

.... ... ...
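Again, the elided command would have looked something like this sketch, reading from the first partition this time and paging the larger stream through less:

[root@studentvm1 test]# dd if=/dev/sda1 bs=512 count=100 | less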

Now try this command. I won't reproduce the entire data stream here because it would take up huge amounts of space. Use Ctrl-C to break out and stop the stream of data.

[root@studentvm1 test]# dd if=/dev/sda

This command produces a stream of data that is the complete content of the hard drive, /dev/sda , including the boot record, the partition table, and all of the partitions and their content. This data could be redirected to a file for use as a complete backup from which a bare metal recovery can be performed. It could also be sent directly to another hard drive to clone the first. But do not perform this particular experiment.

[root@studentvm1 test]# dd if=/dev/sda of=/dev/sdx
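In contrast to the clone command above, which you should not run, dumping the stream into an image file on a separate, mounted filesystem is a safe way to take the full backup just described. A minimal sketch, assuming a hypothetical backup volume mounted at /mnt/backup with at least 60GB free:

[root@studentvm1 test]# dd if=/dev/sda of=/mnt/backup/sda-full.img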

You can see that the dd command can be very useful for exploring the structures of various types of filesystems, locating data on a defective storage device, and much more. It also produces a stream of data that we can view or modify with the transformer utilities.

The real point here is that dd , like so many Linux commands, produces a stream of data as its output. That data stream can be searched and manipulated in many ways using other tools. It can even be used for ghost-like backups or disk duplication.

Randomness

It turns out that randomness is a desirable thing in computers -- who knew? There are a number of reasons that sysadmins might want to generate a stream of random data. A stream of random data is sometimes useful to overwrite the contents of a complete partition, such as /dev/sda1 , or even the entire hard drive, as in /dev/sda .
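For illustration only -- do not run this -- such an overwrite would look something like the following sketch, and it irretrievably destroys everything on the target partition:

[root@studentvm1 ~]# dd if=/dev/urandom of=/dev/sda1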

Perform this experiment as a non-root user. Enter this command to print an unending stream of random data to STDIO.

[student@studentvm1 ~]$ cat /dev/urandom

Use Ctrl-C to break out and stop the stream of data. You may need to use Ctrl-C multiple times.

Random data is also used as the input seed to programs that generate random passwords and random data and numbers for use in scientific and statistical calculations. I will cover randomness and other interesting data sources in a bit more detail in Chapter 24: Everything is a file.

Pipe dreams

Pipes are critical to our ability to do the amazing things on the command line, so much so that I think it is important to recognize that they were invented by Douglas McIlroy during the early days of Unix (thanks, Doug!). The Princeton University website has a fragment of an interview with McIlroy in which he discusses the creation of the pipe and the beginnings of the Unix philosophy.

Notice the use of pipes in the simple command-line program shown next, which lists each logged-in user a single time, no matter how many logins they have active. Perform this experiment as the student user. Enter the command shown below:

[student@studentvm1 ~]$ w | tail -n +3 | awk '{print $1}' | sort | uniq
root
student
[student@studentvm1 ~]$

The results from this command produce two lines of data showing that the users root and student are both logged in. It does not show how many times each user is logged in. Your results will almost certainly differ from mine.

Pipes -- represented by the vertical bar ( | ) -- are the syntactical glue, the operator, that connects command-line utilities together. Pipes allow the Standard Output of one command to be "piped," i.e., streamed, to the Standard Input of the next command.

The |& operator can be used to pipe the STDERR along with STDOUT to STDIN of the next command. This is not always desirable, but it does offer flexibility in the ability to record the STDERR data stream for the purposes of problem determination.
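As a minimal sketch, running find over /etc as a non-root user generates "Permission denied" messages on STDERR; the |& operator sends them through the pipeline along with the normal results instead of letting them spill directly onto the screen:

[student@studentvm1 ~]$ find /etc -name passwd |& less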

A string of programs connected with pipes is called a pipeline, and the programs that use STDIO are referred to officially as filters, but I prefer the term "transformers."

Think about how this program would have to work if we could not pipe the data stream from one command to the next. The first command would perform its task on the data and then the output from that command would need to be saved in a file. The next command would have to read the stream of data from the intermediate file and perform its modification of the data stream, sending its own output to a new, temporary data file. The third command would have to take its data from the second temporary data file and perform its own manipulation of the data stream and then store the resulting data stream in yet another temporary file. At each step, the data file names would have to be transferred from one command to the next in some way.
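A sketch of what our earlier user-listing one-liner would look like under such a scheme, using hypothetical temporary file names:

[student@studentvm1 ~]$ w > tmp1.txt
[student@studentvm1 ~]$ tail -n +3 tmp1.txt > tmp2.txt
[student@studentvm1 ~]$ awk '{print $1}' tmp2.txt > tmp3.txt
[student@studentvm1 ~]$ sort tmp3.txt > tmp4.txt
[student@studentvm1 ~]$ uniq tmp4.txt
[student@studentvm1 ~]$ rm tmp[1-4].txt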

I cannot even stand to think about that because it is so complex. Remember: Simplicity rocks!

Building pipelines

When I am doing something new, solving a new problem, I usually do not just type in a complete Bash command pipeline from scratch off the top of my head. I usually start with just one or two commands in the pipeline and build from there by adding more commands to further process the data stream. This allows me to view the state of the data stream after each of the commands in the pipeline and make corrections as they are needed.

It is possible to build up very complex pipelines that can transform the data stream using many different utilities that work with STDIO.
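For example, the pipeline from the earlier experiment might be grown one stage at a time, inspecting the result after each addition:

[student@studentvm1 ~]$ w
[student@studentvm1 ~]$ w | tail -n +3
[student@studentvm1 ~]$ w | tail -n +3 | awk '{print $1}'
[student@studentvm1 ~]$ w | tail -n +3 | awk '{print $1}' | sort
[student@studentvm1 ~]$ w | tail -n +3 | awk '{print $1}' | sort | uniq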

Redirection

Redirection is the capability to redirect the STDOUT data stream of a program to a file instead of to the default target of the display. The "greater than" ( > ) character, aka "gt", is the syntactical symbol for redirection of STDOUT.

Redirecting the STDOUT of a command can be used to create a file containing the results from that command.

[student@studentvm1 ~]$ df -h > diskusage.txt

There is no output to the terminal from this command unless there is an error. This is because the STDOUT data stream is redirected to the file, while STDERR is still directed to its default target, the display. You can view the contents of the file you just created using this next command:

[student@studentvm1 ~]$ cat diskusage.txt
Filesystem Size Used Avail Use% Mounted on
devtmpfs 2.0G 0 2.0G 0% /dev
tmpfs 2.0G 0 2.0G 0% /dev/shm
tmpfs 2.0G 1.2M 2.0G 1% /run
tmpfs 2.0G 0 2.0G 0% /sys/fs/cgroup
/dev/mapper/fedora_studentvm1-root 2.0G 50M 1.8G 3% /
/dev/mapper/fedora_studentvm1-usr 15G 4.5G 9.5G 33% /usr
/dev/mapper/fedora_studentvm1-var 9.8G 1.1G 8.2G 12% /var
/dev/mapper/fedora_studentvm1-tmp 4.9G 21M 4.6G 1% /tmp
/dev/mapper/fedora_studentvm1-home 2.0G 7.2M 1.8G 1% /home
/dev/sda1 976M 221M 689M 25% /boot
tmpfs 395M 0 395M 0% /run/user/0
tmpfs 395M 12K 395M 1% /run/user/1000

When using the > symbol to redirect the data stream, the specified file is created if it does not already exist. If it does exist, the contents are overwritten by the data stream from the command. You can use double greater-than symbols, >>, to append the new data stream to any existing content in the file.

[student@studentvm1 ~]$ df -h >> diskusage.txt

You can use cat and/or less to view the diskusage.txt file in order to verify that the new data was appended to the end of the file.

The < (less than) symbol redirects data to the STDIN of the program. You might want to use this method to input data from a file to STDIN of a command that does not take a filename as an argument but that does use STDIN. Although input sources can be redirected to STDIN, such as a file that is used as input to grep, it is generally not necessary as grep also takes a filename as an argument to specify the input source. Most other commands also take a filename as an argument for their input source.
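As a minimal sketch, these two commands produce identical results; the first feeds the file to grep's STDIN via redirection, while the second passes the filename as an argument:

[student@studentvm1 ~]$ grep /dev/mapper < diskusage.txt
[student@studentvm1 ~]$ grep /dev/mapper diskusage.txt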

Just grep'ing around

The grep command is used to select lines that match a specified pattern from a stream of data. grep is one of the most commonly used transformer utilities and can be used in some very creative and interesting ways. The grep command is one of the few that can correctly be called a filter because it does filter out all the lines of the data stream that you do not want; it leaves only the lines that you do want in the remaining data stream.

If the PWD is not the /tmp/test directory, make it so. Let's first create a stream of random data to store in a file. In this case, we want somewhat less random data, limited to printable characters. A good password generator program can do this. The following program (you may have to install pwgen if it is not already installed) creates a file that contains 50,000 passwords that are 80 characters long, using every printable character. Try it without redirecting to the random.txt file first to see what that looks like, and then do it once redirecting the output data stream to the file.

$ pwgen -sy 80 50000 > random.txt

Considering that there are so many passwords, it is very likely that some character strings in them are the same. First, cat the random.txt file, then use the grep command to locate some short, randomly selected strings from the last ten passwords on the screen. I saw the word "see" in one of those ten passwords, so my command looked like this: grep see random.txt , and you can try that, but you should also pick some strings of your own to check. Short strings of two to four characters work best.

$ grep see random.txt
R=p)'s/~0}wr~2(OqaL.S7DNyxlmO69`"12u]h@rp[D2%3}1b87+>Vk,;4a0hX]d7see;1%9|wMp6Yl.
bSM_mt_hPy|YZ1<TY/Hu5{g#mQ<u_(@8B5Vt?w%i-&C>NU@[;zV2-see)>(BSK~n5mmb9~h)yx{a&$_e
cjR1QWZwEgl48[3i-(^x9D=v)seeYT2R#M:>wDh?Tn$]HZU7}j!7bIiIr^cI.DI)W0D"'[email protected]
z=tXcjVv^G\nW`,y=bED]d|7%s6iYT^a^Bvsee:v\UmWT02|P|nq%A*;+Ng[$S%*s)-ls"dUfo|0P5+n

Summary

It is the use of pipes and redirection that allows us to perform many of the amazing and powerful tasks with data streams on the Linux command line. It is pipes that transport STDIO data streams from one program or file to another. The ability to pipe streams of data through one or more transformer programs supports powerful and flexible manipulation of data in those streams.

Each of the programs in the pipelines demonstrated in the experiments is small, and each does one thing well. They are also transformers; that is, they take Standard Input, process it in some way, and then send the result to Standard Output. Implementation of these programs as transformers to send processed data streams from their own Standard Output to the Standard Input of the other programs is complementary to, and necessary for, the implementation of pipes as a Linux tool.

STDIO is nothing more than streams of data. This data can be almost anything: the output of a command that lists the files in a directory, an unending stream of data from a special device like /dev/urandom , or even a stream that contains all of the raw data from a hard drive or a partition.

Any device on a Linux computer can be treated like a data stream. You can use ordinary tools like dd and cat to dump data from a device into a STDIO data stream that can be processed using other ordinary Linux tools.


David Both is a Linux and Open Source advocate who resides in Raleigh, North Carolina. He has been in the IT industry for over forty years and taught OS/2 for IBM where he worked for over 20 years. While at IBM, he wrote the first training course for the original IBM PC in 1981. He has taught RHCE classes for Red Hat and has worked at MCI Worldcom, Cisco, and the State of North Carolina. He has been working with Linux and Open Source Software for almost 20 years. David has written articles for...

[Oct 11, 2020] Re^12- What esteemed monks think about changes necessary-desirable in Perl 7 outside of OO staff

Oct 11, 2020 | perlmonks.org

by likbez

on Oct 11, 2020 at 04:45 UTC


in reply to Re^10: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
in thread What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff


The problem is C-style delimiters for conditional statements (round brackets) and overuse of curvy brackets. The former is present in all C-style languages.

So IMHO omitting brackets in built-in functions was a false start; the problem that should be addressed is the elimination of brackets in prefix conditionals.

One possible way is to have a pragma "altblockdelim" or something like that, which would allow one to use, say, ?? and ;; or the classic "begin/end" pair instead of '{' and '}', which are overused in Perl. That would decrease parenthesis nesting.

After all, we can write && as "and" and some people like it.

It's as if within Perl 5 there exists a language with a more modern syntax that just wants to emerge.

by GrandFather on Oct 11, 2020 at 07:13 UTC


in reply to Re^11: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
in thread What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff

I'm not sure how a discussion about parenthesis ( ) morphed into too many brackets ("curvy brackets" - { } ), but I don't see the problem in any case. The use of brackets for block delimiters is visually quite distinct from any other use I'm familiar with so I don't see the problem.

There is a usage of ;; that I don't quite grok, but seems to be fairly common so the ;; option probably wouldn't fly in any case.

Perl's && and and operators have substantially different precedence. They must not be used interchangeably. Yes, subtle I know, but very useful.

Optimising for fewest key strokes only makes sense transmitting to Pluto or beyond

likbez on Oct 11, 2020 at 15:22 UTC

Re^13: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
I'm not sure how a discussion about parenthesis ( ) morphed into too many brackets ("curvy brackets" - { }), but I don't see the problem in any case.
In C you can write

if (i < 0) i = 0;

In Perl you can't, and should write

if( $i<0 ){ $i=0 }

because the only statement allowed after a conditional is a compound statement -- a block. That was a pretty elegant idea, which eliminates the problem of the "dangling else": https://www.sanfoundry.com/c-question-dangling-else-statements/

But the problem is that at this point the round parentheses become a wart. They are not needed, and they detract from readability. So if curvy brackets were not used anywhere else, you could simplify this to

if $i<0 {$i=0}

But you can't do this in Perl because curvy brackets are used for hashes.

There is a usage of ;; that I don't quite grok, but seems to be fairly common so the ;; option probably wouldn't fly in any case.

In Perl, ; is an empty (null) statement. So the current meaning of ;; is "the end of the previous statement, followed by a null statement".
main::(-e:1):   1
  DB<1> ;;;;

  DB<2> $a=5;;

  DB<3> print $a;;
5
The new meaning would be "the end of the current statement and the end of the block", which is a pretty elegant idea in its own way. Perl already allows omitting the semicolon before } as a special case; in the new syntax this becomes the general case, and the special case is no longer needed.

[Oct 02, 2020] Brian D Foy post that announced this new version is really weak. It essentially states We decided to rename 5.32 and you all should be happy. It does not contain any new ideas

Oct 02, 2020 | perlmonks.org

likbez on Oct 02, 2020 at 02:37 UTC

in reply to Re^14: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff (don't feed)
in thread What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff

For programming languages to evolve and flourish, we all need to accept other people's viewpoints and continue open-minded, civil and respectful dialogue.

In science, scientists always question everything; why shouldn't we question some features and point out deficiencies of Perl 5, which after version 5.10 became really stale feature-wise -- the last important addition was state variables in 5.10. Partially this happened because most resources were reallocated to Perl 6 (the Perl 6 project was announced in 2000), a robust interpreter for which failed to materialize for too long -- a situation that also slowed down development of the Perl 5 interpreter.

The question arises: should it be possible on PerlMonks to criticize some aspects of Perl 5's current features, implementation, and use without being denigrated in return?

At least after the split, Perl 5 has a theoretical chance to stand on its own and evolve like other languages have evolved (for example, FORTRAN after 1977 adopted an 11-year cycle for new versions). As Perl 5.10 was released in 2007, it is now 13 years since that date, and Perl 7 is really overdue. The question is what to include, what to exclude, and what glaring flaws need to be rectified (typically a new version of a programming language tries to rectify the most glaring design flaws in the language and introduce changes that could not be implemented while retaining full backward compatibility).

Brian D Foy's post that announced this new version is really weak. It essentially states "We decided to rename 5.32 and you all should be happy." It does not contain any new ideas, just the desire to release the new version of Perl as Perl 5.32 with a few new defaults (which, BTW, will break compatibility with old scripts, at least those written for 5.8 and earlier versions, as not all of them use the strict pragma; and the strict pragma implementation still has its own set of problems).

The question arises: is the game worth the candle? Unless new editions of the O'Reilly books are the goal. That's why I provided this contribution, suggesting some minor enhancements which might better justify calling the new version Perl 7. And what did I get in return?

I hoped that this post would be the start of a meaningful discussion. But people like you turned it into a flame-fest.

It looks like it is impossible to have a rational fact-based discussion on this subject with zealots like you.

[Sep 27, 2020] Is Perl dead- - Quora

Sep 27, 2020 | www.quora.com


Eric Christian Hansen, former Senior Systems Analyst (1999-2002). Answered May 15, 2019

PERL is not dead; it's only those guardians of PerlMonks dot org who lie in wait to pounce upon your most recent posts with spiteful replies loaded with falsehoods and hate and jealousy.

Good luck trying to impart your acquired PERL knowledge there. They will do their very best to attempt to discredit you and your ideas.

Alex Jones, works at Own My Own Business. Answered January 12, 2020

My answer refers to Perl 5 rather than Raku (Perl 6).

Perl 5 is a veteran computer language with a track record and pedigree of several decades. Perl has been around long enough that its strengths and weaknesses are known; it is a stable, predictable and reliable language that will deliver results with little effort.

In the new decade of 2020 and beyond, Perl, in my opinion, remains competitive in performance against any other computer language. Perl remains viable as a language to use in even the most advanced information technology projects.

Simple market forces have driven Perl out of the top computer languages of choice for projects. Because businesses find it hard to find Perl developers, they are forced to use a computer language where there are more developers, such as Python. And because fewer businesses are using Perl in their projects, the education system selects a language such as Python to train its students in.

Perl 5 will probably no longer be the universal language of choice for developers and businesses, but may dominate in a particular niche or market. There is a major campaign underway by supporters of Perl 5 and Raku to promote and encourage people to learn and use these languages again.

My startup is involved in AI, and I use Perl 5 for the projects I am developing. There are a number of strengths in Perl 5 which appeal to me in my projects. Perl 5 has a strong reputation for the ability to create and execute scripts of only a few lines of code to solve problems. As Perl 5 is designed to be like a natural spoken language, it becomes the practical choice for handling text. When handling complex patterns, the regex capabilities of Perl 5 are probably the best of any computer language. Lastly, Perl 5 was the glue that enabled the systems of the 1990s to work together, and might offer a pragmatic solution to bridging the old with the new in the modern era.

I would describe Perl as existing in a dormant phase, which is waiting for the right conditions to emerge where it will regain its place at the leading edge in a niche or market such as in artificial intelligence.

Joe Pepersack, Just Another Perl Hacker. Answered May 31, 2015

No. It's not dead. But it's not very active either, and it's lost a lot of mindshare to Ruby and Python. Hopefully the recently announced December release of Perl 6 (finally!) will renew interest in the language.

I found a really useful site the other day: Modulecounts. CPAN is Perl's greatest asset, but unfortunately it seems to have stagnated compared to PyPI or RubyGems. CPAN is getting 3 new modules per day, whereas RubyGems is getting 53 per day. RubyGems overtook CPAN in 2011, and PyPI overtook it in 2013.

Personally I think Python is Perl on training wheels and represents a step backwards if you're coming from Perl. Ruby is a great language and is pretty Perl-ish overall. Plus someone just recently ported Moose to Ruby so that's a huge win.

I would argue Perl is still worth learning for a couple main reasons:

  1. It's ubiquitous. Every Unix-ish system made in the last decade has some version of Perl on it.
  2. It's still unbeaten for text manipulation and for doing shell-scripty type things that are too hard to do in bash.
  3. Ad-hoc one-liners. Neither Ruby nor Python can match Perl for hacking together something on the command line.
  4. There's a lot of Perl code still out there doing important things. It's cheaper to maintain it than it is to re-write it in another language.
Tom Le, CSO • CISO • CTO | Security Expert. Answered April 3, 2015

Perl is certainly not dead, but it does face an adoption challenge. For example, fewer vendors are releasing Perl APIs or code samples (but the Perl community often steps in, at least for popular platforms). Finding new developers who know Perl is more difficult, while it is much less difficult to find developers who know Python and Java. The emerging technology areas such as big data and data science have a strong Python bent, but a lot of their tasks could be done faster in Perl (from my own experience).

What is great about Perl, despite its quirks, is that it is relatively easy to learn if you know other programming languages. What I have found amazing is that when developers are "forced to learn Perl" for a project, they are usually pleasantly surprised at how powerful and unique Perl is compared to their language of choice.

From a job value perspective, Perl knowledge has some interesting value quirks (just like the language has some interesting quirks). The market for Perl developers is not as large as other languages, but companies that need Perl developers have a hard time finding good candidates. Thus, you might find it easier to get a job with Perl skills even though there are fewer jobs that require it.

In short, Perl has an amazing ability to convert existing programmers, but fewer programmers are coming into the workforce with Perl experience.

Avi Mehenwal, ex-Perl programmer, but still cannot let it go -- I feel the RegEx attachment. Answered April 2, 2016

Perl has been around since 1987 and became an early darling of web developers. These days, however, you don't hear much about Perl. Everyone seems to be talking about trendier languages like PHP, Python and Ruby, with Perl left in the back as a neglected, not-so-hip cousin.

That might lead you to think that Perl is dying, but as it turns out, it's still used by plenty of websites out there, including some pretty big hitters.

Here are some of the more popular sites that use Perl extensively today:

Sources:

  1. Perl
  2. Perl far from dead, more popular than you think - Pingdom Royal
Dave Cross, I make things with software. Answered October 1, 2014

Depends what you mean by dead.

The language is still thriving. There's a new release every year and each release includes interesting new features (most recently, subroutine signatures). More modules are uploaded to CPAN every year. More authors contribute code to CPAN every year.

But I still think that Perl is dying and I would find it hard to recommend that anyone should choose a career in Perl at this point.

Ask yourself these three questions:



[Sep 22, 2020] Softsemicolon in Perl debate

Sep 22, 2020 | perlmonks.org
All of the following satisfy your criteria, are valid and normal Perl code, and would get a semicolon incorrectly inserted based on your criteria:
use softsemicolon;

$x = $a
   + $b;

$x = 1
    if $condition;

$x = 1 unless  $condition1
           && $condition2;
Yes in cases 1 and 2; it depends on the depth of look-ahead in case 3. Yes if it is one symbol; no if it is two (no Perl statement can start with && ).

As for "valid and normal" your millage may vary. For people who would want to use this pragma it is definitely not "valid and normal". Both 1 and 2 looks to me like frivolities without any useful meaning or justification. Moreover, case 1 can be rewritten as:

$x = ($a + $b);

Case 3 actually happens in Perl most often with a regular if, and here the opening bracket is obligatory:

if ( ( $tokenstr=~/a\[s\]/ || $tokenstr =~/h\[s\]/ )
&& ( $tokenstr... ) )
{ .... }


Also, the Python-inspired fascination with eliminating all brackets does not do any good here:

$a=$b=1;
$x=1 if $a==1
&& $b=2;


should generally be written

$a=$b=1;
$x=1 if( $a==1
&& $b=2);


I was surprised that the case without brackets was accepted by the syntax analyzer, because how one would interpret

$y=1 if $x{$i++};

without brackets is unclear to me. It has a dual meaning: it should be a syntax error in one reading,

$y=1
if $x {
$i++
};


and the test of an element of the hash %x in the other.


dave_the_m on Sep 12, 2020 at 06:52 UTC

Re^13: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
Both 1 and 2 looks to me like frivolities without any useful meaning or justification
You and I have vastly differing perceptions of what constitutes normal perl code. For example there are over 700 examples of the 'postfix if on next line' pattern in the .pm files distributed with the perl core.

There doesn't really seem to be any point in discussing this further. You have failed to convince me, and I am very unlikely to work on this myself or accept such a patch into core.

Dave.

likbez on Sep 12, 2020 at 19:53 UTC

Re^14: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
You and I have vastly differing perceptions of what constitutes normal perl code. For example there are over 700 examples of the 'postfix if on next line' pattern in the .pm files distributed with the perl core.
Probably yes. I am an adherent of "defensive programming" and am against over-complexity as well as arbitrary formatting (a pretty printer is preferable to me to manual formatting of code). Which in this audience unfortunately means that I am in the minority.

BTW, your idea that this pragma (which should be optional) matters for the Perl standard library has no connection to reality.

GrandFather on Sep 12, 2020 at 23:53 UTC

Re^15: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff

A very large proportion of the replies you have received in this thread are from people who put a high value on writing maintainable code. "maintainable" is short hand for code that is written to be understood and maintained with minimum effort over long periods of time and by different programmers of mixed ability.

There is a strong correlation with your stance of "defensive programming" ... against over-complexity as well as arbitrary formatting. None of us are arguing with that stance. We are arguing with the JavaScript-style semicolon handling that you would like introduced, based on a personal whim, in a context of limited understanding of Perl syntax and idiomatic use.

Personally I use an editor that has an on demand pretty printer which I use frequently. The pretty printer does very little work because I manually format my code as I go and almost always that is how the pretty printer will format it. I do this precisely to ensure my code is not overly complex and is maintainable. I do this in all the languages that I use and the hardest languages to do that in are Python, VBScript and JavaScript because of the way they deal with semi-colons.

Oh, and in case it is of interest, dave_the_m is one of the current maintainers of Perl. He is in a great position to know how the nuts and bolts of an optional semi-colon change might be made and has a great understanding of how Perl is commonly used. Both give him something of a position of authority in determining the utility of such a change.

Optimising for fewest key strokes only makes sense transmitting to Pluto or beyond

tobyink on Sep 12, 2020 at 22:24 UTC

Re^11: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff

"no Perl statement can start with the dot"

Yada-yada operator in Perl 5.12+.

toby döt ink

ikegami on Sep 14, 2020 at 22:15 UTC

Re^12: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff

Parser lookaheads are implemented in terms of tokens, not characters. The first token of yada is a triple-dot, not a dot. While you may think it starts with a dot, that's not how the parser sees it, so the existence of yada is not relevant here.

Tux on Sep 12, 2020 at 09:38 UTC

Re^7: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff

You also completely ruin maintainability and extensibility. Consider a filter module ...

my $fixed = $bad
    =~ y/\x{00d0}/\x{0110}/r               # Eth != D-stroke
    =~ y/\x{0189}/\x{0110}/r               # LETTER AFRICAN D != D-stroke
    =~ s{\bpra[ck]ti[sc]e\b}{practice}gr   # All 4 seen in document AB12.38C
    =~ s{\bX13\.GtrA\.14\b}{X13_GA12}gr    # Product got renamed
    =~ s{\b1234\s*zip\b}{1234ZIP}gir       # Receiver will crash on badly formed ZIP code
    =~ s{\bpays\s*-?\s*bas\b}
       {The Netherlands}gir                # French forms :(
    =~ ....;

The more examples I see posted by my esteemed co-monks, the less I like the idea, and I hated it already when I read it in the OP.

Enjoy, Have FUN! H.Merijn

likbez on Sep 13, 2020 at 19:48 UTC

Re^8: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff

Why are you concentrating on just one proposal? Are all the others equally bad?

As for the soft-semicolon, you completely misunderstood the situation:

First, nobody forces you to use this pragma, and if you do not use it you are not affected. I am now thinking that it should be enabled only with the -d option.

It does not make sense to conduct something like a "performance review" in a large corporation on my proposals, concentrating on the "soft-semicolon" idea and ignoring all the others, as if it were the only one worth any discussion. It might be the easiest one to dismiss, but it is far from the most important or far-reaching among those proposals.

There is no free lunch, and for some coding styles (including, but not limited to, coding styles used in many modules in the Perl standard library) it is definitely inappropriate. Nobody claims that it is suitable for all users. It is an optional facility for those who want and need it. In a way, it is a debugging aid that allows one to cut the number of debugging runs. And IMHO there is a non-empty subset of Perl users who would be interested in this capability, especially system administrators who systematically use bash along with Perl. And many of them do not use sophisticated editors; often it is just vi or the Midnight Commander editor.

Detractors can happily stay with the old formatting styles forever. Why is this so difficult to understand before producing such an example?

Moreover, how can you reconcile the amount of effort (and the resulting bugs) spent on the elimination of extra round brackets in Perl with this proposal? Is this not the same idea -- to lessen the possible number of user errors?

For me, it looks like pure hypocrisy: in one case we spend some effort following other scripting languages at some cost, but the other proposal, similar in its essence, is rejected blindly as just a bad fashion. If this is a fashion, then eliminating round brackets is also a bad fashion, IMHO.

And why is it that only I see some improvements possible at low cost in the current Perl implementation, while nobody else has proposed anything similar or better, or attempted to modify and enhance my proposals? After all, Perl 5.10 was a definite step forward for Perl. Perl 7 should be the same.

I think the effort spent here criticizing my proposal would be sufficient to introduce an additional parameter into the index function (a "to" limit), which is needed, and the absence of which dictates using substr to limit the search zone in long strings -- a sub-optimal solution unless the interpreter has advanced optimization capabilities and can recognize such a use as an attempt to impose a limit on the search.

Or both this and an option in tr that allows it to stop after the first character not in set1 and return that position. :-)

Constructive discussion does not mean trashing each and every one of my posts (one has -17 votes now, which looks a little bit like schoolyard bullying) -- you need to try to find the rational grain in them and, if it exists, try to revise and enhance the proposal.

The stance "I am happy with Perl 'as is' and go to hell with your suggestions" has its value and attraction, but it is unclear how it will affect the future of the language.

johngg on Sep 13, 2020 at 22:49 UTC

Re^9: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
As for soft-semicolon you completly misunderstood the situation: First, nobody force you to use this pragma. And if you do not use it you are not affected. I am thinking now that it should be enabled only with option -d.

In the OP you make no mention of a pragma in proposal 1; you just say that it would be "highly desirable" to have soft semicolons. This implies that you would like it to be the default behaviour in Perl 7, which, judging by the responses, would hack a lot of people off, me included. If you are proposing that soft semicolons are only enabled via a pragma, perhaps you should add a note to that effect in the OP, being sure to make it clear that it is an update rather than silently changing the text.

And IMHO there is not a zero subset of Perl users who would be interested in this capability. Especially system administrators who systematically use bash along with Perl.

I spent the last 26 years of my career as a systems administrator (I had no ambition to leave technical work and become a manager) on Unix/Linux systems and started using Perl in that role in 1994 with perl 4.036, quickly moving to 5. The lack of semicolon statement terminators in the various shell programming languages I had to use was a pain in the arse and moving to Perl was a huge relief as well as a boost to effectiveness. I would not be the slightest bit interested in soft semicolons and they would, to my mind, be either a debugging nightmare or would force me into a coding style alien to my usual practice.

In this post you say

Also, the Python-inspired fascination with eliminating all brackets does not do any good here

$a=$b=1;
$x=1 if $a==1
&& $b=2;

should generally be written

$a=$b=1;
$x=1 if( $a==1
&& $b=2);

to which I say, nonsense! Why add unnecessary round brackets to perfectly valid code? Use round brackets where they are needed to disambiguate precedence but not where they just add superfluous noise. Nothing to do with fascination, I've never touched Python!

You should be commended on the amount of thought that you have put into your proposals and such efforts should not be discouraged. It is unfortunate that your first proposal has been the most contentious and the one that most responses have latched onto. Sticking to one's guns is also a praiseworthy trait but doing so in the face of several powerful and cogent arguments to the contrary from experienced Perl users is perhaps taking it too far. Making it clear that soft semicolons would not be the default behaviour might apply some soothing balm to this thread.

Cheers,

JohnGG

dsheroh on Sep 14, 2020 at 08:09 UTC

Re^9: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
It does not make sense to conduct something like "performance review" in a large corporation for my proposals concentrating on "soft-semicolon" idea and ignoring all others. As if it is the only one worth any discussion.
Others have already contributed their thoughts on the rest of your proposals, which I generally agree with and (more significantly) you haven't disputed. IMO, the primary reason that all the discussion is focusing on soft semicolons is that it's the only point you're attempting to defend against our criticisms. There was also a brief subthread about your ideas on substring manipulation, and a slightly longer one about alternate braces which close multiple levels of blocks, but those only lasted as long as you continued the debate.
In a way, it is a debugging aid that allows to cut the number of debugging runs.
Seems like just the opposite to me. It may allow you to get your code to run sooner, but, when it does, any semicolon errors will still be there and need to be fixed in additional debugging runs. Maybe a marginal decrease in overall debugging time if there's a line where you never have to fix the semicolon error because that line ends up getting deleted before you finish, but it seems unlikely to provide any great savings if (as you assert) such errors are likely to be present on a significant proportion of lines.

Also, even if it does cut out some debugging runs, they're runs with a very fast turnaround and little-to-no cognitive effort involved. According to your "BlueJ" paper, even rank beginners need only 8 seconds to fix a missing semicolon error and initiate a new compile.

ikegami on Sep 14, 2020 at 22:11 UTC

Re^7: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff

Yes, and the user will get an error.

Then your suggestion would break a very useful feature. So useful that I take advantage of it in virtually every one of my programs/modules.

[Sep 17, 2020] Discussion of the proposal to add trim, ltrim and rtrim functions to Perl 7

Sep 17, 2020 | perlmonks.org


johngg on Sep 12, 2020 at 13:46 UTC

Re^5: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
if we assume that somebody uses this formatting to suffix conditionals

I do, pretty much all the time! The ability to span a statement over multiple lines without jumping through backslash hoops is one of the things that makes Perl so attractive. I also think it makes code much easier to read rather than having excessively long lines that involve either horizontal scrolling or line wrapping. As to your comment regarding excessive length identifiers, I come from a Fortran IV background where we had a maximum of 8 characters for identifiers (ICL 1900 Fortran compiler) so I'm all for long, descriptive and unambiguous identifiers that aid those who come after in understanding my code.

Cheers,

JohnGG

dsheroh on Sep 11, 2020 at 08:11 UTC

Re^5: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff

you !!! on Sep 13, 2020 at 21:25 UTC

Re^6: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff

It might make sense to enable it only with the -d option, as a help for debugging which cuts the number of debugging runs for those who do not have an editor with built-in syntax checking (like the ActiveState Komodo Editor, which really helps in such cases).

That list includes most Linux/Unix system administrators, who use just the command line and vi or similar. They also use bash on a daily basis along with Perl, which increases the probability of making such an error. And this is probably one of the most important categories of users for the future of Perl: Perl started with this group (Larry himself, Randal L. Schwartz, Tom Christiansen, etc.) and, after a short affair with Web programming (Yahoo, etc.) and bioinformatics (BioPerl), retreated to the status of the scripting language of choice for elite Unix sysadmins.

That does not exclude other users and applications, but I think the core of Perl users are now Unix sysadmins. And their interests should be reflected in Perl 7 with some priority.

BTW, I do not see the benefit of omitted semicolons in the final program (nor, in certain cases, of omitted round brackets).



[Sep 17, 2020] How the strengths of Lisp-family languages facilitate building complex and flexible bioinformatics applications - Briefings in Bioinformatics - Oxford Academic

Sep 17, 2020 | academic.oup.com

How the strengths of Lisp-family languages facilitate building complex and flexible bioinformatics applications
Bohdan B Khomtchouk, Edmund Weitz, Peter D Karp, Claes Wahlestedt
Briefings in Bioinformatics, Volume 19, Issue 3, May 2018, Pages 537–543, https://doi.org/10.1093/bib/bbw130
Published: 31 December 2016
A correction has been published: Briefings in Bioinformatics, Volume 18, Issue 5, September 2017, Page 905, https://doi.org/10.1093/bib/bbx016

Abstract

We present a rationale for expanding the presence of the Lisp family of programming languages in bioinformatics and computational biology research. Put simply, Lisp-family languages enable programmers to more quickly write programs that run faster than in other languages. Languages such as Common Lisp, Scheme and Clojure facilitate the creation of powerful and flexible software that is required for complex and rapidly evolving domains like biology. We will point out several important key features that distinguish languages of the Lisp family from other programming languages, and we will explain how these features can aid researchers in becoming more productive and creating better code. We will also show how these features make these languages ideal tools for artificial intelligence and machine learning applications. We will specifically stress the advantages of domain-specific languages (DSLs): languages that are specialized to a particular area, and thus not only facilitate easier research problem formulation, but also aid in the establishment of standards and best programming practices as applied to the specific research field at hand. DSLs are particularly easy to build in Common Lisp, the most comprehensive Lisp dialect, which is commonly referred to as the 'programmable programming language'. We are convinced that Lisp grants programmers unprecedented power to build increasingly sophisticated artificial intelligence systems that may ultimately transform machine learning and artificial intelligence research in bioinformatics and computational biology.

Keywords: lisp, software engineering, bioinformatics, computational biology, programming languages
Issue Section: Opinion Note

Introduction and background

The programming language Lisp is credited for pioneering fundamental computer science concepts that have influenced the development of nearly every modern programming language to date. Concepts such as tree data structures, automatic storage management, dynamic typing, conditionals, exception handling, higher-order functions, recursion and more have all shaped the foundations of today's software engineering community. The name Lisp derives from 'List processor' [ 1 ], as linked lists are one of Lisp's major data structures, and Lisp source code is composed of lists. Lists, which are a generalization of graphs, are extraordinarily well supported by Lisp. As such, programs that analyze sequence data (such as genomics), graph knowledge (such as pathways) and tabular data (such as that handled by R [ 2 ]) can be written easily, and can be made to work together naturally in Lisp. As a programming language, Lisp supports many different programming paradigms, each of which can be used exclusively or intermixed with others; this includes functional and procedural programming, object orientation, meta programming and reflection.

But more to the point, we have empirical evidence that Lisp is a more productive general-purpose programming language than the other usual suspects, and that most Lisp programs run faster than their counterparts in other languages. Gat [ 3 ] compared the run times, development times and memory usage of 16 programs written by 14 programmers in Lisp, C/C++ and Java. Development times for the Lisp programs ranged from 2 to 8.5 h, compared with 2 to 25 h for C/C++ and 4 to 63 h for Java (programmer experience alone does not account for the differences). The Lisp programs were also significantly shorter than the other programs.

And although the execution times of the fastest C/C++ programs were faster than the fastest Lisp programs, on average, the Lisp programs ran significantly faster than the C/C++ programs and much faster than the Java programs (mean runtimes were 41 s for Lisp versus 165 s for C/C++).

Lisp applications and dialects

In bioinformatics and computational biology, Lisp has successfully been applied to research in systems biology [ 4 , 5 ], high-performance computing (HPC) [ 6 ], database curation [ 7 , 8 ], drug discovery [ 9 ], computational chemistry and nanotechnology [ 10 , 11 ], network and pathway -omics analysis [ 12 , 13 , 14 , 15 , 16 ], single-nucleotide polymorphism analysis [ 17 , 18 , 19 ] and RNA structure prediction [ 20 , 21 , 22 ]. In general, the Lisp family of programming languages, which includes Common Lisp, Scheme and Clojure, has powered multiple applications across fields as diverse as [ 23 ]: animation and graphics, artificial intelligence (AI), bioinformatics, B2B and e-commerce, data mining, electronic design automation/semiconductor applications, embedded systems, expert systems, finance, intelligent agents, knowledge management, mechanical computer-aided design (CAD), modeling and simulation, natural language, optimization, risk analysis, scheduling, telecommunications and Web authoring.

Programmers often test a language's mettle by how successfully it has fared in commercial settings, where big money is often on the line. To this end, Lisp has been successfully adopted by commercial vendors such as the Roomba vacuuming robot [ 24 , 25 ], Viaweb (acquired by Yahoo! Store) [ 26 ], ITA Software (acquired by Google Inc. and in use at Orbitz, Bing Travel, United Airlines, US Airways, etc.) [ 27 ], Mirai (used to model the Gollum character for the Lord of the Rings movies) [ 28 ], Boeing [ 29 ], AutoCAD [ 30 ], among others. Lisp has also been the driving force behind open source applications like Emacs [ 31 ] and Maxima [ 32 ], which both have existed for decades and continue to be used worldwide.

Among the Lisp-family languages (LFLs), Common Lisp has been described as the most powerful and accessible modern language for advanced biomedical concept representation and manipulation [ 33 ]. For concrete code examples of Common Lisp's dominance over mainstream programming languages like R and Python, we refer the reader to Sections 4 and 5 of Ross Ihaka's (creator of the R programming language) seminal paper [ 34 ].

Scheme [ 35 ] is an elegant and compact version of Common Lisp that supports a minimalistic core language and an excellent suite of language extension tools. However, Scheme has traditionally mainly been used in teaching and computer science research and its implementors have thus prioritized small size, the functional programming paradigm and a certain kind of 'cleanliness' over more pragmatic features. As such, Scheme is considered far less popular than Common Lisp for building large-scale applications [ 24 ].

The third most common LFL, Clojure [ 36 , 37 ], is a rising star language in the modern software development community. Clojure specializes in the parallel processing of big data through the Java Virtual Machine (JVM), recently making its debut in bioinformatics and computational biology research [ 38 , 39 , 40 ]. Most recently, Clojure was used to parallelize the processing and analysis of SAM/BAM files [ 39 ]. Furthermore, the BioClojure project provides seeds for the bioinformatics community that can be used as building blocks for writing LFL applications. As of now, BioClojure consists of parsers for various kinds of file formats (UniProtXML, Genbank XML, FASTA and FASTQ), as well as wrappers of select data analysis programs (BLAST, SignalP, TMHMM and InterProScan) [ 39 ].

As a whole, Lisp continues to develop new offshoots. A relatively recent addition to the family is Julia [ 41 ]. Although it is sometimes touted as 'C for scientists' and caters to a different community because of its syntactical proximity to Python, it is a Lisp at heart and certainly worth watching.

Rewards and challenges

In general, early adopters of a language framework are better poised to reap the scientific benefits, as they are the first to set out building the critical libraries, ultimately attracting and retaining a growing share of the research and developer community. As library support for bioinformatics tasks in the Lisp family of programming languages (Clojure, Common Lisp and Scheme) is yet in its early stages and on the rise, and there is (as of yet) no officially established bioinformatics Lisp community, there is plenty of opportunity for high-impact work in this direction.

It is well known that the best language to choose from should be the one that is most well suited to the job at hand. Yet, in practice, few programmers may consider a nonmainstream programming language for a project, unless it offers strong, community-tested benefits over its popular contenders for the specific task under study. Often times, the choice comes down to library support: does language X already offer well-written, optimized code to help solve my research problem, as opposed to language Y (or perhaps language Z)? In general, new language adoption boils down to a chicken-and-egg problem: without a large user base, it is difficult to create and maintain large-scale, reproducible tools and libraries. But without these tools and libraries, there can never be a large user base. Hence, a new language must have a big advantage over the existing ones and/or a powerful corporate sponsorship behind it to compete [ 42 ]. Most often, a positive feedback loop is generated by repositories of useful libraries attracting users, who, in turn, add more functional libraries, thereby raising a programming language's popularity, rather than reflecting its theoretical potential.

With mainstream languages like R [ 2 ] and Python [ 43 ] dominating the bioinformatics and computational biology scene for years, large-scale software development and community support for other less popular language frameworks have waned to relative obscurity. Consequently, languages winning over increasingly growing proportions of a steadily expanding user base have the effect of shaping research paradigms and influencing modern research trends. For example, R programming generally promotes research that frequently leads to the deployment of R packages to Bioconductor [ 44 ], which has steadily grown into the largest bioinformatics package ecosystem in the world, whose package count is considerably ahead of BioPython [ 45 ], BioClojure [ 38 ], BioPerl [ 46 ], BioJava [ 47 ], BioRuby [ 48 ], BioJulia [ 49 ] or SCABIO [ 50 ]. Given the choice, R programmers interested in deploying large-scale applications are more likely to branch out to releasing Web applications (e.g. Shiny [ 51 ]) than to graphical user interface (GUI) binary executables, which are generally more popular with lower-level languages like C/C++ [ 52 ]. As such, language often dictates research direction, output and funding. Questions like 'who will be able to read my code?', 'is it portable?', 'does it already have a library for that?' or 'can I hire someone?' are pressing questions, often inexorably shaping the course and productivity of a project. However, despite its popularity, R has been severely criticized for its many shortcomings by its own creator, Ross Ihaka, who has openly proposed to scrap the language altogether and start afresh by using a Lisp-based engine as the foundation for a statistical computing system [ 34 , 53 ].

A community repository of bioinformatics packages named BioLisp does not yet exist as such (although the name currently denotes the native language of BioBike [ 4 , 54 ], a large-scale bioinformatics Lisp application), which means that there is certainly wide scope and potential for its rise and development in the bioinformatics community.

Macros and domain-specific languages

Lisp is a so-called homoiconic language, which means that Lisp code is represented as a data structure of the language itself in such a way that its syntactical structure is preserved. In more technical terms, while the Lisp compiler has to parse the textual representation of the program (the 'source code') into a so-called abstract syntax tree (like any other compiler of any programming language has to), a Lisp program has direct access to (and can modify) this abstract syntax tree, which is presented to the program in a convenient, structured way.
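As a minimal illustration of code-as-data (an editorial sketch, not from the original text; any ANSI Common Lisp implementation will do):

    ;; Build a form as a plain list; it is ordinary data the program can inspect.
    (defparameter *form* (list '+ 20 (list '* 2 21)))

    (first *form*)   ; => +          (the symbol naming the operator)
    (third *form*)   ; => (* 2 21)   (a sub-expression, itself a list)

    ;; Because code is data, the program can rewrite its own syntax tree...
    (setf (first *form*) '-)

    ;; ...and hand the modified tree back to the evaluator:
    (eval *form*)    ; => -22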

This property enables Lisp to have a macro system that remains undisputed in the programming language world [ 55 ]. Although 'macros' in languages like C have the same name, they are essentially just text substitutions performed on the source code before it is compiled and they cannot always reliably preserve the lexical structure of the code. Lisp macros, on the other hand, operate at the syntactic level. They transform the program structure itself and, as opposed to C macros, are written in the same language they work on and have the full language available all the time. Lisp macros are thus not only used for moderately simple 'find and replace' chores but can apply extensive structural changes to a program. This includes tasks that are impossible in other languages. Examples would be the introduction of new control structures (while Python users had to wait for the language designers to introduce the 'with' statement in version 2.5, Lisp programmers could always add something like that to the language themselves), pattern matching capabilities (while Lisp does not have pattern matching like ML or Haskell out of the box, it is easy to add [ 56 ]) or the integration of code with markup languages (if you want you can, e.g., write code that mimics the structure of an HTML document it is supposed to emit [ 57 , 58 ]).
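For a concrete flavor of 'introducing new control structures', here is a minimal sketch in standard Common Lisp (an illustration added here, not code from the cited references): the language has no built-in while loop, but a short macro adds one that is indistinguishable from native syntax.

    (defmacro while (test &body body)
      "Repeat BODY as long as TEST evaluates to true."
      `(loop (unless ,test (return))
             ,@body))

    ;; Used exactly like a native control structure:
    (let ((i 0))
      (while (< i 3)
        (print i)
        (incf i)))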

In addition to that, Common Lisp even offers access to its 'reader', which means that code can be manipulated (in Lisp) before it is parsed [ 59 ]. This enables Lisp programs to completely change their surface syntax if necessary. Examples would be code that adds Perl-like interpolation capabilities to Lisp strings [ 60 ] or a library [ 61 ] that enables Lisp to read arithmetic in 'infix' notation, i.e. to understand '20 + 2 * 21' in addition to the usual '(+ 20 (* 2 21))'.
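A toy sketch of this reader-level customization (the #u syntax is invented here purely for illustration; SET-DISPATCH-MACRO-CHARACTER is the standard mechanism):

    ;; Make #u"..." upcase a string at read time, before the compiler sees it.
    (set-dispatch-macro-character #\# #\u
      (lambda (stream subchar arg)
        (declare (ignore subchar arg))
        (string-upcase (read stream t nil t))))

    ;; After the definition above, the reader itself rewrites the source text:
    ;; #u"blast" is read as the string "BLAST".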

These features make Lisp an ideal tool for the creation of domain-specific languages: languages that are custom-tailored to a specific problem domain but can still have access to all of Lisp. A striking example is Common Prolog [ 62 ], a professional Prolog system implemented and embedded in Common Lisp. In bioinformatics, the Biolingua [ 5 ] project (now called BioBike) built a cloud-based general symbolic biocomputing domain-specific language (DSL) entirely in Common Lisp. The system, which could be programmed entirely through the browser, was its own complete biocomputing language, which included a built-in deductive reasoner, called BioDeducta [ 54 ]. Biolingua programs, guided by the reasoner, would invisibly call tools such as BLAST [ 63 ] and Bioconductor [ 44 ] on the server-side, as needed. Symbolic biocomputing has also previously been used to create user-friendly visual tools for interactive data analysis and exploration [ 64 ].

Other unique strengths

In addition to homoiconicity, Lisp has several other features that set it apart from mainstream languages.

It has been shown that these features, together with other amenities like powerful debugging tools that Lisp programmers take for granted, offer a significant productivity boost to programmers [ 3 ]. Lisp also gives programmers the ability to implement complex data operations and mathematical constructs in an expressive and natural idiom [ 69 ].

Speed considerations

The interactivity and flexibility of Lisp languages are something that can usually only be found (if at all) in interpreted languages. This might be the origin of the old myth that Lisp is interpreted and must thus be slow -- however, this is not true. Compilers for Lisp have existed since 1959, and all major Common Lisp implementations nowadays can compile directly to machine code, which is often on par with C code [ 70 , 71 , 72 ] or only slightly slower. Some also offer an interpreter in addition to the compiler, but examples like Clozure Common Lisp demonstrate that a programmer can have a compiler-only Common Lisp. For example, CL-PPCRE, a regular expression library written in Common Lisp, runs faster than Perl's regular expression engine on some benchmarks, even though Perl's engine is written in highly tuned C [ 24 ].

Although programmers who use interpreted languages like Python or Perl for their convenience and flexibility have to resort to writing in C/C++ for time-critical portions of their code, Lisp programmers can usually have their cake and eat it too. This was perhaps best shown with direct benchmarking by the creator of the R programming language, Ross Ihaka, whose benchmarks demonstrated that Lisp's optional type declarations and machine-code compiler allow for code that is 380 times faster than R and 150 times faster than Python [ 34 ]. And not only is the code created by Lisp compilers efficient by default; Common Lisp, in particular, offers unique features to optimize those parts of the code (usually only a tiny fraction) that really need to be as fast as possible [ 59 ]. This includes so-called compiler macros, which can transform function calls into more efficient code at compile time, and a mandatory disassembler, which enables programmers to fine-tune time-critical functions until the compiled code matches their expectations. It should also be emphasized that while the C or Java compiler is 'history' once the compiled program is started, the Lisp compiler is always present and can thus generate new, fast code while the program is already running. This is rarely used in finished applications (except for some areas of AI), but it is an important feature during development and helpful for explorative programming.
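A minimal sketch of this optimization workflow (illustrative only; the exact machine code printed by DISASSEMBLE is implementation-specific):

    ;; Type and optimization declarations let the compiler emit tight code.
    (defun dot (xs ys)
      (declare (type (simple-array double-float (*)) xs ys)
               (optimize (speed 3) (safety 0)))
      (let ((acc 0d0))
        (declare (type double-float acc))
        (dotimes (i (length xs) acc)
          (incf acc (* (aref xs i) (aref ys i))))))

    ;; The standard disassembler shows the generated code, so time-critical
    ;; functions can be tuned until the output matches expectations:
    (disassemble #'dot)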

To further debunk the popular misconception that Lisp languages are slow: Clojure was recently used to process and analyze SAM/BAM files [ 39 ] with significantly fewer lines of code and almost identical speed compared with SAMTools [ 73 ], which is written in the C programming language. In addition, Common Lisp was recently used to build a high-performance tool for preparing sequence alignment/map files for variant calling in sequencing pipelines [ 6 ]. This HPC tool was shown to significantly outperform SAMTools and Picard on a variety of benchmarks [ 6 ].

A case study: Pathway Tools

Pathway Tools [ 74 , 75 ] is an example of a large bioinformatics software system written in Common Lisp (Allegro Common Lisp from Franz Inc.). Pathway Tools offers among the broadest functionality of any bioinformatics software system, covering genome informatics, regulatory network informatics, metabolic pathway informatics and omics data analysis. For example, the software includes a genome browser that zooms from the nucleotide level to the chromosome level; it infers metabolic reconstructions from annotated genomes; it computes organism-specific layouts of metabolic map diagrams; it computes optimal routes within metabolic networks; and it can execute quantitative metabolic flux models.

The same Pathway Tools binary executable can run both as a desktop window application and as a Web server. In Web server mode, Pathway Tools powers the BioCyc.org Web site, which contains 7600 organism-specific Pathway/Genome Databases, and serves ∼500 000 unique visitors per year and up to 100 000 page views per day. Pathway Tools uses the 'hot-swapping' capabilities of Common Lisp to download and install software patches at user sites and within the running BioCyc Web server. Pathway Tools has been licensed by 7200 groups, and was found to have the best performance and documentation among multiple genome database warehousing systems [ 76 ].

Pathway Tools consists of 680 000 lines of Common Lisp code (roughly the equivalent of 1 400 000 lines of C or Java code), organized into 20 subsystems. In addition, 30 000 lines of JavaScript code are present within the Pathway Tools Web interface. We chose Common Lisp for development of Pathway Tools because of its excellent properties as a high-level, highly productive, easy-to-debug programming language; we strongly believe that the choice of Common Lisp has been a key factor behind our ability to develop and maintain this large and complex software system.

A case study: BioBike

BioBike provides an example of a large-scale application of the power of homoiconicity. In personal communication, the inventor of BioBike, Jeff Shrager, explained why Lisp (in this case, Common Lisp) was chosen as the implementation language, an unusual choice even in the early 2000s. According to Shrager, Lisp-style DSL creation is uniquely suited to 'living' domains, such as biology, where new concepts are introduced on an ongoing basis (as opposed to, for example, electronics, where the domain is better understood, and so the conceptual space is more at rest). Shrager pointed out that as Lisp-based DSLs are usually implemented through macros, this provides the unique capability of creating new language constructs that are embedded in the host programming language (here, in Lisp). This is a critical distinction: in most programming languages, DSLs are whole new programming languages built on top of the base language, whereas in Lisp, DSLs are built directly into the language.

Lisp-based DSLs commonly show up in two sorts of domain-specific control structures: WITH- clauses and MAP- clauses. By virtue of Lisp's homoiconicity, such constructs can take code as arguments, create code-local bindings and perform various specialized manipulations directly on the code itself, in accord with the semantics of the new construct. In non-homoiconic languages, users must do this either by creating new classes/objects, through function calls or via an ugly hack commonly referred to as 'Greenspun's 10th rule' [ 77 ], wherein users must first implement a quasi-LFL on top of the base language and then implement the DSL in that quasi-LFL. Both the object-creation and the function-call means of creating new constructs lead to encapsulation problems, often requiring ugly manipulations such as representing code as strings, passing code-conditionalizing arguments and then having to either globalize them or re-pass them throughout a large part of the codebase. With the Lisp method of embedding DSLs into the base language via macros, one can simply use, for example, a WITH-GENES or a MAP-GENES macro wrapper, within which one need only write normal everyday Lisp code; the wrapper, because it has access to and can modify the code that gets run, has no such firewalls, enabling a much more powerful sort of computation. This greatly simplifies the incremental creation and maintenance of the DSL, and it is for this reason, argues Shrager, that Lisp (and LFLs more generally) is well suited to biology: because biology is a science that constantly creates new concepts, it is especially important to be able to flexibly add concepts to the DSL.
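A minimal sketch of the WITH- pattern (FETCH-GENE and GENE-LENGTH are hypothetical stand-ins for illustration, not BioBike's actual API):

    (defmacro with-genes ((&rest gene-names) &body body)
      ;; Bind each listed symbol to its gene record, then run BODY,
      ;; which is normal, everyday Lisp code.
      `(let ,(mapcar (lambda (g) `(,g (fetch-gene ',g)))
                     gene-names)
         ,@body))

    ;; Usage: the wrapper sees, and could rewrite, the code it encloses.
    (with-genes (lacZ recA)
      (list (gene-length lacZ) (gene-length recA)))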

BioBike was created by a team led by Jeff Shrager and JP Massar, and later Jeff Elhai. Its core Web listener is almost 15 000 lines of Common Lisp code in 25 modules, and the entire BioBike system is nearly 400 000 lines of code in about 850 modules, including the Web listener, many specialized bioinformatics modules, a Scratch-like visual programming language (built using a specialized LFL that compiles to JavaScript, thanks to Peter Seibel), a specialized bioinformatics-oriented frame system (thanks to Mike Travers) and many other smaller modules.

Perspectives and outlook

Historically speaking, Lisp is the second oldest programming language still in use (second only to Fortran) and has influenced nearly every major programming language to date with its constructs [ 78 ]. For example, it may be surprising to learn that R is built on top of Scheme [ 79 ]. In fact, R borrows directly from its Lisp roots for creating embedded domain-specific languages within R's core language set [ 80 ]. For instance, ggplot2 [ 81 ], dplyr [ 82 ] and plyr [ 83 ] are all examples of DSLs in R. This highlights the importance and relevance of Lisp as a programmable programming language, namely one that is user-extensible beyond the core language set. Given the wide spectrum of domains and subdomains in bioinformatics and computational biology research, it follows that similar applications tailored to genomics, proteomics, metabolomics or other research fields may also be developed as extensible macros in Common Lisp. By way of analogy, perhaps a genomics equivalent of ggplot2 or dplyr is in store in the not-so-distant future. Advice for when such pursuits are useful is readily available [ 84 ]. Perhaps even more importantly, it is imperative to take into consideration the future of statistical computing [ 34 ], which will form the big-data backbone of artificial intelligence and machine learning applications in bioinformatics.

Conclusions

New programming language adoption in a scientific community is both a challenging and rewarding process. Here, we advocate for and propose a greater inclusion of LFLs in large-scale bioinformatics research, outlining the benefits and opportunities of the adoption process. We provide historical perspective on the influence of language choice on research trends and community standards, and emphasize Lisp's unparalleled support for homoiconicity, domain-specific languages, extensible macros and error handling, as well as the significance of these features for future bioinformatics research. We forecast that the current state of Lisp research in bioinformatics and computational biology is highly conducive to the timely establishment of robust community standards and support, centered not only on the development of bioinformatic domain-specific libraries but also on the rise of highly customizable and efficient machine learning and AI applications written in languages like Common Lisp, Clojure and Scheme.

Bohdan B. Khomtchouk is an NDSEG Fellow and PhD candidate in the Human Genetics and Genomics Graduate Program at the University of Miami Miller School of Medicine. His research interests include bioinformatics and computational biology applications in HPC, integrative multi-omics, artificial intelligence, machine learning, mathematical genetics, biostatistics, epigenetics, visualization, search engines and databases.

Edmund Weitz is a full professor at the University of Applied Sciences in Hamburg, Germany. He is a mathematician, and his research interests include set theory, logic and combinatorics.

Peter D. Karp is the director of the Bioinformatics Research Group within the Artificial Intelligence Center at SRI International. Dr Karp has authored >130 publications in bioinformatics and computer science in areas including metabolic pathway bioinformatics, computational genomics, scientific visualization and scientific databases.

Claes Wahlestedt is Leonard M. Miller Professor at the University of Miami Miller School of Medicine and is working on a range of basic science and translational efforts in his roles as Associate Dean and Center Director for Therapeutic Innovation. The author of some 250 peer-reviewed scientific publications, his ongoing research projects concern bioinformatics, epigenetics, genomics and drug/biomarker discovery across several therapeutic areas. He has experience not only from academia but also from leadership positions in the pharmaceutical and biotechnology industry.

Acknowledgements

B.B.K. dedicates this work to the memory of his uncle, Taras Khomchuk. B.B.K. wishes to acknowledge the financial support of the United States Department of Defense (DoD) through the National Defense Science and Engineering Graduate Fellowship (NDSEG) Program: this research was conducted with Government support under and awarded by DoD, Army Research Office (ARO), National Defense Science and Engineering Graduate (NDSEG) Fellowship, 32 CFR 168a. C.W. thanks Jeff Shrager for critical review and helpful comments on the manuscript.

[Sep 16, 2020] Rather heated discussion about the value of adding "softsemicolon" to Perl 7

Sep 16, 2020 | perlmonks.org
  1. [Edited] [Highly desirable] Make a semicolon optional at the end of the line, if there is a balance of brackets on the line and the statement looks syntactically correct (an optional pragma "softsemicolon", similar to the solution used in the famous IBM PL/I debugging compiler). That can help sysadmins who use bash and Perl in parallel, work from the command line with vi or similar editors, and are not using editors such as Komodo Edit which flag syntax errors. It might make sense to enable this pragma only via option -d of the interpreter. In this case it will serve as a pure debugging aid, cutting the number of iterations of editing the source before an actual run. It does not make much sense to leave statements without semicolons in the final, production version of the program. See, for example, the discussion on Stack Overflow: Do you recommend using semicolons after every statement in JavaScript

... ... ...


Re^5: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by johngg on Sep 12, 2020 at 13:46 UTC
if we assume that somebody uses this formatting to suffix conditionals

I do, pretty much all the time! The ability to span a statement over multiple lines without jumping through backslash hoops is one of the things that makes Perl so attractive. I also think it makes code much easier to read rather than having excessively long lines that involve either horizontal scrolling or line wrapping. As to your comment regarding excessive length identifiers, I come from a Fortran IV background where we had a maximum of 8 characters for identifiers (ICL 1900 Fortran compiler) so I'm all for long, descriptive and unambiguous identifiers that aid those who come after in understanding my code.

Cheers,

JohnGG

Re^5: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by dsheroh on Sep 11, 2020 at 08:11 UTC

Re^6: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by you !!! on Sep 13, 2020 at 21:25 UTC

It might make sense to enable it only with the -d option, as a help for debugging, which cuts the number of debugging runs for those who do not have an editor with built-in syntax checking (like ActiveState Komodo Editor, which really helps in such cases).

That list includes most Linux/Unix system administrators, who use just the command line and vi or similar editors. They also use bash on a daily basis along with Perl, which increases the probability of making such an error. And this is probably one of the most important categories of users for the future of Perl: Perl started with this group (Larry himself, Randal L. Schwartz, Tom Christiansen, etc.) and, after a short affair with Web programming (Yahoo, etc.) and bioinformatics (BioPerl), retreated to the status of the scripting language of choice for elite Unix sysadmins.

That does not exclude other users and applications, but I think the core Perl users are now Unix sysadmins, and their interests should be reflected in Perl 7 with some priority.

BTW, I do not see the benefit of omitted semicolons in the final program (nor, in certain cases, of omitted round brackets).

dave_the_m on Sep 11, 2020 at 10:37 UTC

Re^5: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by dave_the_m on Sep 11, 2020 at 10:37 UTC

In the following, the first line has a balance of brackets and looks syntactically correct. Would you expect the lexer to add a semicolon?

  $a = $b + $c
            + $d + $e;

If not, what are the exact criteria for things on the next line to trigger or not a semicolon?

Dave.

Re^6: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by you !!! on Sep 11, 2020 at 14:20 UTC
In the following, the first line has a balance of brackets and looks syntactically correct. Would you expect the lexer to add a semicolon?
  $a = $b + $c
            + $d + $e;
Yes, and the user will get an error. This is similar to the previous example with a trailing conditional on a new line:

if (1);

The first question is why one would want to format the code this way if one suffers from the "missing semicolon" problem, wants to avoid missing-semicolon errors and has, presumably, deliberately enabled the "softsemicolon" pragma for that reason.

This is the case where the user needs to use #\ to inform the scanner about his choice. But you are right in the sense that it creates a new type of error -- "missing continuation" -- and that there is no free lunch. This approach requires specific discipline in formatting your code.

Re^7: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by dave_the_m on Sep 11, 2020 at 14:52 UTC

The reason I gave that code as an example is that it's a perfectly normal way of spreading complex expressions over multiple lines, e.g. where you need to add several variables together and the variables have non-trivial (i.e. long) names:

    $pressure = $partial_pressure_nitrogen
              + $partial_pressure_oxygen
              + $partial_pressure_water_vapour
              + $partial_pressure_argon
              + $partial_pressure_carbon_dioxide;

In this case, the automatic semicolons are unhelpful and will give rise to confusing error messages. So you've just switched one problem for another, and raised the cognitive load - people now need to know about your pragma and also know when it's in scope.

Dave.

Re^8: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by you !!! on Sep 11, 2020 at 16:51 UTC

Yes, it discourages a certain formatting style. So what? If you can't live without such formatting (many can), do not use this pragma. BTW you can always use extra parentheses, which will be eliminated by the parser, as in

$pressure = (
       $partial_pressure_nitrogen
     + $partial_pressure_oxygen
     + $partial_pressure_water_vapour
     + $partial_pressure_argon
     + $partial_pressure_carbon_dioxide
     );

Re^9: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by dave_the_m on Sep 11, 2020 at 17:05 UTC

* How exactly does the lexer/parser know when it should insert a soft semicolon?

* How exactly does it give a meaningful error message when it inserts one where the user didn't intend for there to be one?

My problem with your proposal is that it seems to require the parser to apply some complex heuristics to determine when to insert and when to complain meaningfully. It is not obvious to me what these heuristics should be. My suspicion is that such an implementation would just add to perl's already colourful collection of edge cases, and confuse beginner and expert alike.

Bear in mind that I am one of just a handful of people who actively work on perl's lexer and parser, so I have a good understanding of how it works, and am painfully aware of its many complexities. (And it's quite likely that I would end up being the one implementing this.)

Dave.

Re^10: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by you !!! on Sep 11, 2020 at 18:51 UTC

The lexical analyser in Perl is quite sophisticated due to the lexical complexity of the language. So I think it already counts past lexemes and thus can determine the balance of "()", "[]" and "{}".

So you probably can initially experiment with the following scheme

If all the following conditions are true:

  1. You reached the EOL.
  2. Pragma "softsemicolon" is on.
  3. The bracket balance is zero.
  4. [Edited] The last processed token is not ',', '.', '=' (or a derivative such as '+=', '-=', '.='), a comparison operator ('==', '!=', '<', '>', '<=', '>=', 'eq', etc.), ':', '&', '&&', '!', '||', '+', '-', '*' or a similar token which implies the continuation of the statement.
  5. [Edited] The next token (not character but token, seen via the look-ahead buffer) is not one of '{', '}', ';', '.', '+' (but not '++'), '-', '*' and several others (see above).

then the lexical analyser inserts the lexeme "semicolon" into the stream of lexemes passed to the syntax analyser.

The warning issued should be something like:

"Attempt to correct missing semicolon was attempted. If this is incorrect please use extra parenthesis or disable pragma "softsemicolon" for this fragment."
From what I read, the Perl syntax analyser relies on the lexical analyser in some unorthodox ways, so it might be possible to use "clues" from the syntax analyser to improve this scheme. See, for example, the scheme proposed for recursive descent parsers in:

Follow set error recovery, C. Stirling, Software: Practice and Experience, 1985 (Wiley Online Library)
"Some accounts of the recovery scheme mention and make use of non-systematic changes to their recursive descent parsers in order to improve ... In the former he anticipates the possibility of a missing semicolon whereas in the latter he does not anticipate a missing comma ..."

Re^11: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by dave_the_m on Sep 11, 2020 at 22:02 UTC
So I think it already counts past lexemes and thus can determine the balance of "()", "[]" and "{}"
It can't currently.
If all the following conditions are true
All of the following satisfy your criteria, are valid and normal perl code, and would get a semicolon incorrectly inserted based on your criteria:

    use softsemicolon;

    $x = $a
       + $b;

    $x = 1
        if $condition;

    $x = 1 unless $condition1
               && $condition2;
The warning issued should be something like
I didn't ask what the text of the warning should be, I asked how the parser can determine when the warning should be issued.
the scheme proposed for recursive descent parsers
But perl uses an LR(1) parser, not a recursive descent parser.

Dave.

Re^12: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by you !!! on Sep 12, 2020 at 02:06 UTC
All of the following satisfy your criteria, are valid and normal Perl code, and would get a semicolon incorrectly inserted based on your criteria:
use softsemicolon;

$x = $a
   + $b;

$x = 1
    if $condition;

$x = 1 unless  $condition1
           && $condition2;

Yes in cases 1 and 2; in case 3 it depends on the depth of look-ahead: yes if the look-ahead is one symbol, no if it is two (no Perl statement can start with '&&').

As for "valid and normal" your millage may vary. For people who would want to use this pragma it is definitely not "valid and normal". Both 1 and 2 looks to me like frivolities without any useful meaning or justification. Moreover, case 1 can be rewritten as:

    $x = ($a + $b);

Case 3 actually happens in Perl most often with a regular if, where the opening bracket is obligatory:

    if ( ( $tokenstr=~/a\[s\]/ || $tokenstr =~/h\[s\]/ )
         && ( $tokenstr... ) ){ .... }

Also, the Python-inspired fascination with eliminating all brackets does no good here:

    $a=$b=1;
    $x=1 if $a==1
         && $b=2;

should generally be written

    $a=$b=1;
    $x=1 if( $a==1
          && $b=2);

I was surprised that the case without brackets was accepted by the syntax analyser, because how to interpret $x=1 if $a{$b}; without brackets is unclear to me. It has a dual meaning: it should be a syntax error in one case ($x=1 if $a{ $b };) and the test for an element of hash %a in the other.

Re^13: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by dave_the_m on Sep 12, 2020 at 06:52 UTC
Both 1 and 2 looks to me like frivolities without any useful meaning or justification
You and I have vastly differing perceptions of what constitutes normal perl code. For example there are over 700 examples of the 'postfix if on next line' pattern in the .pm files distributed with the perl core.

There doesn't really seem any point in discussing this further. You have failed to convince me, and I am very unlikely to work on this myself or accept such a patch into core.

Dave.

Re^14: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by you !!! on Sep 12, 2020 at 19:53 UTC
You and I have vastly differing perceptions of what constitutes normal perl code. For example there are over 700 examples of the 'postfix if on next line' pattern in the .pm files distributed with the perl core.
Probably yes. I am an adherent of "defensive programming" who is against over-complexity as well as arbitrary formatting (a pretty-printer is preferable, to me, to manual formatting of code). Which in this audience unfortunately means that I am in the minority.

BTW your idea that this pragma (which should be optional) matters for the Perl standard library has no connection to reality.

Re^15: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by GrandFather on Sep 12, 2020 at 23:53 UTC

A very large proportion of the replies you have received in this thread are from people who put a high value on writing maintainable code. "Maintainable" is shorthand for code that is written to be understood and maintained with minimum effort over long periods of time and by different programmers of mixed ability. There is a strong correlation with your stance of "defensive programming" ... against over-complexity as well as arbitrary formatting. None of us is arguing with that stance. We are arguing with the JavaScript-style semicolon handling that you would like introduced based on a personal whim, in a context of limited understanding of Perl syntax and idiomatic use.

Personally I use an editor that has an on demand pretty printer which I use frequently. The pretty printer does very little work because I manually format my code as I go and almost always that is how the pretty printer will format it. I do this precisely to ensure my code is not overly complex and is maintainable. I do this in all the languages that I use and the hardest languages to do that in are Python, VBScript and JavaScript because of the way they deal with semi-colons.

Oh, and in case it is of interest, dave_the_m is one of the current maintainers of Perl. He is in a great position to know how the nuts and bolts of an optional semi-colon change might be made and has a great understanding of how Perl is commonly used. Both give him something of a position of authority in determining the utility of such a change.

Optimising for fewest key strokes only makes sense transmitting to Pluto or beyond

Re^11: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by tobyink on Sep 12, 2020 at 22:24 UTC

"no Perl statement can start with the dot"

Yada-yada operator in Perl 5.12+.

toby döt ink

Re^12: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by ikegami on Sep 14, 2020 at 22:15 UTC

Parser lookaheads are implemented in terms of tokens, not characters. The first token of yada is a triple-dot, not a dot. While you may think it starts with a dot, that's not how the parser sees it, so the existence of yada is not relevant here.

Re^7: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by Tux on Sep 12, 2020 at 09:38 UTC

You also completely ruin maintainability and extensibility. Consider a filter module ...

    my $fixed = $bad
        =~ y/\x{00d0}/\x{0110}/r                       # Eth != D-stroke
        =~ y/\x{0189}/\x{0110}/r                       # LETTER AFRICAN D != D-stroke
        =~ s{\bpra[ck]ti[sc]e\b}{practice}gr           # All 4 seen in document AB12.38C
        =~ s{\bX13\.GtrA\.14\b}{X13_GA12}gr            # Product got renamed
        =~ s{\b1234\s*zip\b}{1234ZIP}gir               # Receiver will crash on badly formed ZIP code
        =~ s{\bpays\s*-?\s*bas\b}{The Netherlands}gir  # French forms :(
        =~ ....;

The more examples I see posted by my esteemed co-monks, the less I like the idea, and I hated it already when I read it in the OP.


Enjoy, Have FUN! H.Merijn

Re^8: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by you !!! on Sep 13, 2020 at 19:48 UTC

As for the soft-semicolon, you completely misunderstood the situation:

First, nobody forces you to use this pragma, and if you do not use it you are not affected. I am thinking now that it should be enabled only with option -d.

It does not make sense to conduct something like a corporate "performance review" of my proposals, concentrating on the "soft-semicolon" idea and ignoring all the others, as if it were the only one worth any discussion. It might be the easiest one to attack, but it is far from being the most important or far-reaching among those proposals.

There is no free lunch, and for some coding styles (including, but not limited to, styles used in many modules in the Perl standard library) it is definitely inappropriate. Nobody claims that it is suitable for all users. It is an optional facility for those who want and need it. In a way, it is a debugging aid that allows one to cut the number of debugging runs. And IMHO there is a non-empty subset of Perl users who would be interested in this capability, especially system administrators who systematically use bash along with Perl.

Detractors can happily stay with the old formatting styles forever. Why is this so difficult to understand before producing such an example?

Moreover, how can you reconcile the amount of effort (and the resulting bugs) spent on eliminating extra round brackets in Perl with the rejection of this proposal? Is this not the same idea -- to lessen the possible number of user errors?

For me, it looks like pure hypocrisy: in one case we spend some effort following other scripting languages at some cost, but the other proposal, similar in its essence, is rejected blindly as just a bad fashion. If this is a fashion, then eliminating round brackets is also a bad fashion, IMHO.

And why is it that only I see some improvements possible at low cost in the current Perl implementation, and nobody else has proposed anything similar or better, or attempted to modify/enhance my proposals? After all, Perl 5.10 was a definite step forward for Perl. Perl 7 should be the same.

I think the effort spent here criticizing my proposal would be adequate to introduce an additional parameter into the index function (a "to" limit), which is needed: its absence dictates using substr to limit the search zone in long strings, which is a suboptimal solution unless the interpreter has advanced optimization capabilities and can recognize such use as an attempt to impose a limit on the search.
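For illustration, the substr emulation being criticized looks roughly like this (index_upto is a hypothetical helper written for this note, not a core function):

    # Search $haystack for $needle, but only within positions $from..$to.
    sub index_upto {
        my ($haystack, $needle, $from, $to) = @_;
        # substr copies the search zone first; a built-in "to" limit
        # for index() would avoid this copy.
        my $pos = index(substr($haystack, $from, $to - $from), $needle);
        return $pos < 0 ? -1 : $pos + $from;
    }

    # Searches only the first 1000 characters, so this returns -1:
    my $pos = index_upto(("a" x 1_000_000) . "XYZ", "XYZ", 0, 1000);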

Constructive discussion does not mean dumping on each and every one of my posts (one has -17 votes now; it looks a little like schoolyard bullying) -- you need to try to find the rational grain in them and, if it exists, try to revise and enhance the proposal.

The stance "I am happy with Perl 'as is' and go to hell with your suggestions" has its value and attraction, but it is unclear how it will affect the future of the language.

Re^9: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by johngg on Sep 13, 2020 at 22:49 UTC
As for the soft-semicolon, you completely misunderstood the situation: First, nobody forces you to use this pragma, and if you do not use it you are not affected. I am thinking now that it should be enabled only with option -d.

In the OP you make no mention of a pragma in proposal 1; you just say that it would be "highly desirable" to have soft semicolons. This implies that you would like it to be the default behaviour in Perl 7, which, judging by the responses, would hack a lot of people off, me included. If you are proposing that soft semicolons are only enabled via a pragma, perhaps you should add a note to that effect in the OP, being sure to make it clear that it is an update rather than silently changing the text.

And IMHO there is a non-empty subset of Perl users who would be interested in this capability, especially system administrators who systematically use bash along with Perl.

I spent the last 26 years of my career as a systems administrator (I had no ambition to leave technical work and become a manager) on Unix/Linux systems and started using Perl in that role in 1994 with perl 4.036, quickly moving to 5. The lack of semicolon statement terminators in the various shell programming languages I had to use was a pain in the arse and moving to Perl was a huge relief as well as a boost to effectiveness. I would not be the slightest bit interested in soft semicolons and they would, to my mind, be either a debugging nightmare or would force me into a coding style alien to my usual practice.

In this post you say

Also, the Python-inspired fascination with eliminating all brackets does no good here:

    $a=$b=1;
    $x=1 if $a==1
         && $b=2;

should generally be written

    $a=$b=1;
    $x=1 if( $a==1
          && $b=2);

to which I say, nonsense! Why add unnecessary round brackets to perfectly valid code? Use round brackets where they are needed to disambiguate precedence, but not where they just add superfluous noise. Nothing to do with fascination; I've never touched Python!

You should be commended on the amount of thought that you have put into your proposals and such efforts should not be discouraged. It is unfortunate that your first proposal has been the most contentious and the one that most responses have latched onto. Sticking to one's guns is also a praiseworthy trait but doing so in the face of several powerful and cogent arguments to the contrary from experienced Perl users is perhaps taking it too far. Making it clear that soft semicolons would not be the default behaviour might apply some soothing balm to this thread.

Cheers,

JohnGG

Re^9: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by dsheroh on Sep 14, 2020 at 08:09 UTC
It does not make sense to conduct something like "performance review" in a large corporation for my proposals concentrating on "soft-semicolon" idea and ignoring all others. As if it is the only one worth any discussion.
Others have already contributed their thoughts on the rest of your proposals, which I generally agree with and (more significantly) you haven't disputed. IMO, the primary reason all the discussion is focusing on soft semicolons is that it's the only point you're attempting to defend against our criticisms. There was also a brief subthread about your ideas on substring manipulation, and a slightly longer one about alternate braces which close multiple levels of blocks, but those only lasted as long as you continued the debate.
In a way, it is a debugging aid that allows one to cut the number of debugging runs.
Seems like just the opposite to me. It may allow you to get your code to run sooner, but, when it does, any semicolon errors will still be there and need to be fixed in additional debugging runs. Maybe a marginal decrease in overall debugging time if there's a line where you never have to fix the semicolon error because that line ends up getting deleted before you finish, but it seems unlikely to provide any great savings if (as you assert) such errors are likely to be present on a significant proportion of lines.

Also, even if it does cut out some debugging runs, they're runs with a very fast turnaround and little-to-no cognitive effort involved. According to your "BlueJ" paper, even rank beginners need only 8 seconds to fix a missing semicolon error and initiate a new compile.

Re^7: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by ikegami on Sep 14, 2020 at 22:11 UTC

Yes, and the user will get an error.

Then your suggestion would break a very useful feature. So useful that I take advantage of it in virtually every one of my programs/modules.

Re^3: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by haj on Sep 10, 2020 at 18:35 UTC

That's neither a natural tendency nor an interesting psychological phenomenon. You just made that up.

Semicolons at the end of a statement are as natural as a full stop "." at the end of a sentence, regardless of whether the sentence is the last in a paragraph. The verification of whether a line "looks syntactically correct" takes longer than just hitting the ";" key, and a wrong assessment of "correct" may lead to wrong behavior of the software.

Language-aware editors inform you about a missing semicolon by indenting the following line as a continuation of the statement in the previous line, so it is hard to miss.

If, on the other hand, you want to omit semicolons, then the discussion should have informed you that you aren't going to find followers.

Re^4: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by you !!! on Sep 10, 2020 at 21:20 UTC
Semicolons at the end of a statement are as natural as a full stop "." at the end of a sentence, regardless of whether the sentence is the last in a paragraph.
I respectfully disagree, but your comment can probably explain the fierce rejection of this proposal in this forum. IMHO this is a wrong analogy, as the level of precision required is different. If you analyse books in print you will find paragraphs in which the full stop is missing at the end. Most people do not experience difficulties learning to put a full stop at the end of a sentence most of the time. Unfortunately, it does not work this way in programming languages with a semicolon at the end of a statement, because what is needed is not "most of the time" but "all of the time".

My view, supported by some circumstantial evidence and my own practice, is that this is a persistent error that arises independently of the level of qualification for most or all people, and that the semicolon at the end of the statement contradicts some psychological mechanism programmers have.

Re^5: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by haj on Sep 11, 2020 at 00:41 UTC
If you analyse books in print you will find paragraphs in which the full stop is missing at the end.

You are still making things up.

..and semicolon at the end of the statement contradicts some psychological mechanism programmers have.

There is no evidence for that.

You should have understood that your idea doesn't get support here. Defending it with made-up evidence doesn't help.

Re^6: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by Anonymous Monk on Sep 11, 2020 at 15:14 UTC

Re^3: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by dsheroh on Sep 11, 2020 at 08:07 UTC
Because people have a natural tendency to omit them at the end of the line.
Fascinating. I've never heard of, nor observed, such a tendency. Might you provide references to a few peer-reviewed studies on the topic? I don't necessarily need URLs or DOIs (although those would be most convenient) - bibliographic citations, or even just the titles, should be sufficient, since I have access to a good academic publication search system.

Offhand, the only potentially-related publication I can locate is "The Case of the Disappearing Semicolon: Expressive-Assertivism and the Embedding Problem" (Philosophia. Dec2018, Vol. 46 Issue 4), but that's a paper on meta-ethics, not programming.

Re^4: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by you !!! on Sep 11, 2020 at 16:38 UTC

Literature is available for free only to academic researchers, so some money might be involved in getting access.

You can start with

A statistical analysis of syntax errors (ScienceDirect)
"For example, approximately one-fourth of all original syntax errors in the Pascal sample were missing semicolons or use of a comma in place of a semicolon ... indicates that this type of error is quite infrequent and hence needn't be of as great a concern to recovery ..."

[PDF] Error log analysis in C programming language courses

Programming Languages (book), J.J. Horning, 1979 (books.google.com)
"... over 14% of the faults occurring in TOPPS programs during the second half of the experiment were still semicolon faults (compared to 1% for TOPPSII), and that missing semicolons were about ... Every decision takes time, and provides an opportunity for error ..."

An assessment of locally least-cost error recovery, S.O. Anderson, R.C. Backhouse, E.H. Bugge, The Computer Journal, 1983 (academic.oup.com)
"... sym = semicolon in the former, one is anticipating the possibility of a missing semicolon; in contrast, a missing comma is ... if sy = semicolon then insymbol else begin error(14); if sy = comma then insymbol end. Both conditional statements accept semicolons ..."

The role of systematic errors in developmental studies of programming language learners, J. Segal, K. Ahmad, M. Rogers, Journal of Educational ..., 1992 (journals.sagepub.com; cited by 9)
"Errors were classified by their surface characteristics into single token (missing ...) ... gathered from the students, was that they would experience considerable difficulties with using semicolons, and that ... the specific rule of ALGOL 68 syntax concerning the role of the semicolon ..."

Follow set error recovery, C. Stirling, Software: Practice and Experience, 1985 (Wiley Online Library)
"Some accounts of the recovery scheme mention and make use of non-systematic changes to their recursive descent parsers in order to improve ... In the former he anticipates the possibility of a missing semicolon whereas in the latter he does not anticipate a missing comma ..."

A first look at novice compilation behaviour using BlueJ, M.C. Jadud, Computer Science Education, 2005 (Taylor & Francis)
"... change programmer behaviour -- perhaps encouraging them to make fewer "missing semicolon" errors, or ... perhaps highlight places where semicolons should be when they are missing ..."

Making programming more conversational, A. Repenning, 2011 IEEE Symposium on Visual Languages, 2011 (ieeexplore.ieee.org)
"Miss one semicolon in a C program and the program may no longer work at all ... Similar to code auto-completion approaches, these kinds of visual programming environments prevent syntactic programming mistakes such as missing semicolons or typos."

Re^5: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by dsheroh on Sep 12, 2020 at 13:20 UTC
Literature is available for free only to academic researchers, so some money might be involved in getting access.
No problem here. Not only do I work at an academic library, I'm also the person primarily responsible for the proxy we use to provide journal access for off-campus researchers. All the benefits of being an academic researcher, with none of the grant proposals!
A statistical analysis of syntax errors - ScienceDirect
The first thing to catch my eye was that the abstract states it found that syntax errors as a whole (not just semicolon errors) "occur relatively infrequently", which seems to contradict your presentation of semicolon problems as something which constantly afflicts all programmers.

Going over the content of the paper itself, I couldn't help noticing that a substantial fraction of the semicolon errors discussed were in contexts idiosyncratic to Pascal which have no Perl equivalent, such as the use of semicolons to separate groups of formal parameters (vs. commas within each group); using semicolon after END most of the time, but a period at the end of the program; or incorrectly using a semicolon before ELSE. Aside from being idiosyncratic, these situations also have the common feature of being cases where sometimes a semicolon is correct and sometimes a semicolon is incorrect, depending on the context of the surrounding code - which is precisely the major criticism of your "make semicolons sometimes optional, and escaping line breaks sometimes required, depending on the context of the surrounding code". The primary issue in these cases is that the rules change based on context, and you've proposed propagating the larger problem in an attempt to resolve a smaller problem which, it seems, only you perceive.

I also note that the data used in this research consisted of code errors collected from two university programming classes, one of which was an introductory course and the other a relatively advanced one. It is to be expected that semicolon errors (particularly given the Pascal idiosyncrasies I mentioned above) would be common in code written for the introductory course. It would be interesting to see how the frequency compared between the two courses; I expect that it would be much, much lower in the advanced course - and lower still in code written by practicing professionals in the field, which was omitted entirely from the study.

Oh, and a number of other comments in this discussion have mentioned using syntax-aware editors. Did those even exist in 1978, when this paper was published? Sorry, I'm just being silly with that question - the paper mentions card decks and keypunch errors, and says that the students were asked to "access [the compiler] using a 'cataloged procedure' of job control statements". These programs weren't entered using anything like a modern text editor, much less one with syntax awareness. (I wasn't able to find a clear indication of whether the CDC 6000 Series, which is the computer these programs were compiled on, would have used a card reader or a keyboard for them to enter their code, but I did find that CDC didn't make a full-screen editor available to time-sharing users on the 6000 series until 1982, which is well after the paper's publication date.)

A first look at novice compilation behaviour using BlueJ
Yep, this one indeed found that missing semicolons were the most common type of compilation error at 18%, with unknown variable name and missing brackets in a dead heat for second place at 12%. Of course, it also found that the median time to correct and run another compile was only 8 seconds after getting a missing semicolon error, so hardly a major problem to resolve.

Also, once again, as even stated in the title of the paper, this was limited to code written by novice programmers, taking a one-hour-a-week introductory course, so it seems misguided to make assertions about the semicolon habits of experienced programmers based on its findings.

Making programming more conversational
The only mentions of semicolons in this document are " Miss one semicolon in a C program and the program may no longer work at all. " and " Instead of typing in text-based instructions, many visual programming languages use mechanisms such as drag and drop to compose programs. Similar to code auto-completion approaches, these kinds of visual programming environments prevent syntactic programming mistakes such as missing semicolons or typos. " While these statements confirm that semicolons are important and that programmers can sometimes get them wrong (neither of which has been in dispute here), they make no attempt to examine how commonly semicolon-related errors occur. Given that the purpose of this paper was to introduce a new form of computer-assisted programming rather than to examine existing coding practices, I doubt that the authors even considered looking into the frequency of semicolon errors.

I was not able to locate the remaining papers you mentioned by doing title or author searches using Ebsco's metasearch tools.

Tux on Sep 10, 2020 at 08:52 UTC

Re: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
  1. Highly desirable Make a semicolon optional at the end of the line
    Highly undesirable. If things are to be made optional for increased readability, it should not be this but rather making braces optional for single-statement blocks. But that won't happen either.
  2. Highly Questionable Introduce pragma that specify max allowed length of single and double quoted string
    Probably already possible with a CPAN module, but who would use it? This is more something for a linter or perltidy.
  3. Highly desirable Compensate for some deficiencies of using curvy brackets as the block delimiters
    Unlikely to happen and very undesirable. The first option is easy: } # LABEL (why introduce new syntax when comments will suffice). The second is just plain illogical and uncommon in most other languages. It will confuse the hell out of every programmer.
  4. Make function slightly more flexible
    a) no; b) await the new signatures; c) macros are unlikely to happen. See the problems they faced in Raku. Would be fun though.
  5. Long function names
    Feel free to introduce a CPAN module that does all you propose. A new function for trimming has recently been introduced and spun off a lot of debate. I think none of your proposed changes in this point is likely to gain momentum.
  6. Allow to specify and use "hyperstrings"
    I have no idea what is to be gained. Eager to learn though. Can you give better examples?
  7. Put more attention of managing namespaces
    I think a) is part of the proposed OO reworks for perl7 based on Cor; b) is just plain silly; c) could be useful, but not based on letters but on sigils or punctuation, like in Raku.
  8. Analyze structure of text processing functions in competing scripting languages
    Sounds like a great idea for a CPAN module, so all that require this functionality can use it
  9. Improve control statements
    Oooooh, enter the snake pit! There be dragons here, lots of nasty dragons. We have had given/when and several switch implementations and suggestions, and so far there has been no single solution to this. We all want it, but we all have different expectations for its feature sets and behavior. Wise people are still working on it, so expect *something* at some time.

Enjoy, Have FUN! H.Merijn

Re^2: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by you !!! on Sep 10, 2020 at 16:57 UTC Reputation: -4

Because }:LABEL actually forcefully closes all blocks in between, whereas the comment just informs you which opening bracket the closing bracket corresponds to and, as such, can be placed on the wrong closing bracket, especially if the indentation is wrong too, worsening an already bad situation.

Been there, done that.

Re^3: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by dsheroh on Sep 11, 2020 at 08:18 UTC

Your "one brace to close them all" idea is not needed if you have a decent editor - and, incidentally, would most likely break this feature in many/most/all editors which provide it.

Re^2: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by you !!! on Sep 11, 2020 at 16:45 UTC Reputation: -5

Highly desirable Make a semicolon optional at the end of the line
Highly undesirable. If things are to be made optional for increased readability, it should not be this but rather making braces optional for single-statement blocks. But that won't happen either.

Making braces optional for single-statement blocks is a programming language design blunder, repeated in PHP. It creates the so-called "dangling else" problem.

BTW, if this is "highly undesirable", can you please explain why the Perl designers took some effort to allow omitting the semicolon before a closing brace?

Tux on Sep 12, 2020 at 09:24 UTC

Re^3: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by Tux on Sep 12, 2020 at 09:24 UTC

I cannot answer the why in that question, but the only place where *I* use it on a very regular basis is

my @foo = map  { $_->[0] }
          sort { $a->[1] <=> $b->[1] }
          map  { [ m/^(.*(\d+).*)$/ ] }
          @bar;

Which works equally well when semicolons are added.

Following the complete discussion, I wonder why you persist. To me it is obvious that Perl is not (or should not be) your language of choice.

If you really think trailing semicolons should be omitted, do find a language that allows it. You have come up with exactly ZERO arguments that will convince the other users of Perl, or the Perl language designers and maintainers.

To me however, all the counter-arguments were very insightful, so thank you for starting it anyway.

/me wonders how many users would stop using perl completely if your rules were implemented (wild guess: 90%) and how many new users the language would gain (wild guess: 1%).


Enjoy, Have FUN! H.Merijn

ikegami on Sep 14, 2020 at 22:38 UTC

Re^3: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by ikegami on Sep 14, 2020 at 22:38 UTC

can you please explain why Perl designers took some effort to allow omitting the semicolon before a closing brace?

Because it's unambiguous, and because it allows one to treat it as a statement separator (like the comma) instead of a statement terminator.

Getting rid of the others would cause countless ambiguities.
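
For illustration, a minimal sketch of the separator-versus-terminator point (plain Perl, runnable as-is):

my $max = do {
    my ($x, $y) = (3, 7);
    $x > $y ? $x : $y     # last statement in the block: no semicolon needed
};
print "$max\n";           # prints 7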

[Sep 15, 2020] What esteemed monks think about changes necessary-desirable in Perl 7 outside of OO staff

Sep 15, 2020 | perlmonks.org

alexander_lunev on Sep 10, 2020 at 09:02 UTC

Perl use and abuse of curvy brackets dicussion

Making Perl more like modern Python or JS is not an improvement to the language; you need another word for that, something like "trends" or "fashion". I see this list as a simplification of the language (and in a bad way), not an improvement. As if some newbie programmer would not want to improve himself, to get himself up to match the complexity of the language, but instead blames language complexity and demands that the language complexity go down to his (low) level. "I don't want to count closing brackets, make something that will close them all", "I don't want to watch for semicolons, let the interpreter watch for the end of the sentence for me", "This complex function is hard to understand and remember how to use in the right way, give me a bunch of simple functions that will do the same as this one function, but will be easy to remember".

Making a tool simpler will not make it more powerful or more efficient; instead it could make it less efficient, because the tool will have to waste some of its power to compensate for the user's ineptitude. The interpreter would waste CPU and memory to comprehend sentence endings, these "new" closing brackets, and extra function calls, and what's the gain here? I see only one: a newbie programmer could write code with less mental effort. So it's not an improvement of the language to do more with less, but instead a change that will cause the tool to do the same with more. Is that an improvement? I don't think so.

you !!! on Sep 10, 2020 at 16:52 UTC

Re^2: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff


by you !!! on Sep 10, 2020 at 16:52 UTC Reputation: -7

As if some newbie programmer would not want to improve himself, to get himself up to match the complexity of the language, but instead blames language complexity and demands that the language complexity go down to his (low) level.

The programming language should be adapted to actual use by programmers, not to some illusion of actual use under the disguise of "experts do not commit those errors." If the errors committed by programmers in a particular language are chronic, as is the case for missing semicolons and missing closing braces, something needs to be done about them, IMHO.

The same is true for the problem of "overexposure" of global variables. Most programmers at some point suffer from this type of bug. That's why "my" was pushed into the language. But IMHO it does not go far enough, as it does not distinguish between reading and modifying a variable. And a "sunglasses" approach to the visibility of global variables might be beneficial.

BTW the problem of a missing closing brace affects all languages which use "{" and "}" as block delimiters, and the only implementation which solved this complex problem satisfactorily was closing labels on the closing block delimiter in PL/1 (on "}" in Perl terms; PL/1 used begin/end pairs). Like the "missing semicolon", this is a problem from which programmers suffer independently of their level of experience with the language.

So IMHO any measures that compensate for the "dangling '}'" problem and provide better coordination between opening and closing delimiters in nested blocks would be beneficial.

Again, the problem of a missing closing brace is a chronic one. As somebody mentioned here, an editor that has "match brace" can be used to track it, but that does not solve the problem itself; rather it provides a rather inefficient (for a complex script) way to troubleshoot it, which arises especially often if you modify a long script not written by you (or written by you long ago). I experienced even a case when the { } brace structure was syntactically correct but semantically wrong, and that was detected only after the program was moved to production. A closing label on the bracket would prevent it.
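
For contrast, a minimal sketch of what exists today: labels already work for loop control, and the closing brace can only be annotated with a comment, which the compiler never verifies (the proposed }:LABEL, which would verify it, is hypothetical and not valid Perl):

my @lines = ('# a comment', 'some code');
LINE: for my $line (@lines) {
    if ($line =~ /^\s*#/) {
        next LINE;        # label used for loop control
    }                     # if (comment)
}                         # LINE: for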

choroba on Sep 10, 2020 at 17:10 UTC

Re^3: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by choroba on Sep 10, 2020 at 17:10 UTC

If you write short subroutines, as you should, you don't suffer from misplaced closing curly braces. I had problems with them, especially when doing large edits on code not written by me, but the editor always saved me.

Both puns intended.

map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]

Fletch on Sep 10, 2020 at 19:27 UTC

Re^4: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by Fletch on Sep 10, 2020 at 19:27 UTC

More or less agree WRT mismatched closing curlies. I see it pretty much entirely as an editor issue.

(I mean isn't that the whole python argument for Semantic-Whitespace-As-Grouping? At least I recall that ("Your editor will keep it straight") being seriously offered as a valid dismissal of the criticism against S-W-A-G . . .)

The cake is a lie.
The cake is a lie.
The cake is a lie.

you !!! on Sep 10, 2020 at 21:37 UTC

Re^5: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by you !!! on Sep 10, 2020 at 21:37 UTC
I mean isn't that the whole python argument for Semantic-Whitespace-As-Grouping?
No, the argument is different, but using indentation to determine block nesting does allow multiple closure of blocks as a side effect. Python invented a strange mixed solution where there is an opening bracket (usually ":") but no closing bracket -- instead, the indent is used as the closing bracket.

The problem is that it breaks too many other things, so here the question of "whether it is worth it" is more pertinent than in the case of soft semicolons.

dsheroh on Sep 11, 2020 at 08:27 UTC

Re^3: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by dsheroh on Sep 11, 2020 at 08:27 UTC
As somebody mentioned here, an editor that has "match brace" can be used to track it, but that does not solve the problem itself; rather it provides a rather inefficient (for a complex script) way to troubleshoot it, which arises especially often if you modify the script.
I would submit that, if you have enough levels of nested blocks that "match brace" becomes a cumbersome and "inefficient" tool, then your problem is that your code is overly-complex and poorly-structured, not any issue with the language or the editor. Good code does not have 47-level-deep nested blocks.

atcroft on Sep 12, 2020 at 00:23 UTC

Re^4: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by atcroft on Sep 12, 2020 at 00:23 UTC
Good code does not have 47-level-deep nested blocks.
... 43 ... 44 ... 45.

<humor> *whew* Made it with 2 to spare. Glad you said "47-level-deep nested blocks".

Wait, there was one more conditi...</humor> :D :)

[Sep 15, 2020] Knuth-style multiple escape from the loop construct in Perl

Sep 15, 2020 | perlmonks.org

ikegami on Sep 14, 2020 at 22:21 UTC

Re: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff

Extend last to accept labels

last already accepts labels.
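
For reference, a minimal sketch of the existing behavior:

OUTER: for my $i (1 .. 3) {
    for my $j (1 .. 3) {
        last OUTER if $i * $j > 4;   # exits both loops at once
        print "$i,$j\n";
    }
}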

implement "post loop switch"

That's horrible. No one uses continue, since it doesn't give access to the lexical vars of the loop, and this suffers from the same problem.

See Donald Knuth, "Structured Programming with go to Statements".

Perl already has a goto statement.

That said, while I use goto regularly in C, there's no reason to use it in Perl.

[Sep 15, 2020] Extracting a substring in Perl

Sep 15, 2020 | perlmonks.org

ikegami on Sep 14, 2020 at 22:30 UTC

Re: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff

As extracting of substring is a very frequent operation

It's actually quite rare to want to extract a substring by position.

Implement tail and head functions as synonyms to substr ($line,0,$len) and substr($line,-$len)

Nothing's stopping you from doing that right now.
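
For example, a minimal sketch of the proposed helpers as plain user functions (the names head and tail are the OP's, not core Perl):

sub head { my ($line, $len) = @_; substr $line, 0, $len }
sub tail { my ($line, $len) = @_; substr $line, -$len }

print head("Hello, world", 5), "\n";   # Hello
print tail("Hello, world", 5), "\n";   # world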

likbez on Sep 15, 2020 at 04:12 UTC

Re^2: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff


by likbez on Sep 15, 2020 at 04:12 UTC Reputation: -1

Implement head and tail functions as synonyms to substr ($line,0,$len) and substr($line,-$len)
Nothing's stopping you from doing that right now.
Yes, you can do it, with certain limitations and a loss of flexibility, as a user function. The key question here is not whether "you can do it", but how convenient it will be in comparison with the status quo, whether key categories of users benefit directly from this addition (for Perl, first of all, whether sysadmins will benefit), and what the cost is -- how much trouble it is to add it to the already huge interpreter, which inevitably increases the already large number of built-in functions -- as well as whether, in the long run, the new functions can retire some "inferior" functions like chomp and chop.

NOTE: it is better to call them ltrim and rtrim.

With chomp, which is by far the more frequently used of the two, replacing it with rtrim is just a renaming operation; with chop you need some "inline" function capability (macro substitution). So rtrim($line) should be the equivalent of chomp($line), assuming that "\n" is the default second argument of rtrim.
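
A minimal sketch of such an rtrim as a user function, with "\n" as the default second argument (the name and the defaulting rule are the OP's proposal; unlike chomp, this returns the result instead of modifying its argument in place):

sub rtrim {
    my ($s, $suffix) = (@_, "\n");   # "\n" used when no second argument is given
    $s =~ s/\Q$suffix\E\z//;         # remove one trailing occurrence, if any
    return $s;
}

print rtrim("line\n"), "|\n";        # line|
print rtrim("path/", "/"), "|\n";    # path|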

Also, any user function by definition has more limited flexibility in comparison with a built-in function and is less efficient unless implemented in C.

Without the introduction of an additional argument for a user-defined function, it is impossible to determine whether the function ltrim has a target or not (if not, it should modify the first parameter in place).

So at the user-function level you need to have two functions (say, ltrim and myltrim), as in this case the second argument has a more natural meaning.

At the user-defined-function level you have quite limited capabilities to determine the lexical type of the second argument (at run time in Perl you can only distinguish between a number and a string -- not whether a regex or a translation table was passed). Actually, some languages allow specifying different entry points to a function depending on the number and types of the arguments (string, integer, float, pointer, etc.) passed. In Perl terms this would look something like extended signatures:

sub name {
   entry ($$)   { ... }
   entry (\$\$) { ... }
}

A couple of examples:

The call ltrim($line,7) should be interpreted as

$line=substr($line,7);

but the call $header=ltrim($line,'<h1>'); obviously should be interpreted as

$header=substr($line,index($line,'<h1>'));

Also, if you want to pass a regex or a translation table, you need somehow to distinguish the type of the last argument passed. So instead of the function call

$body=ltrim($line,/\s*/);

you need to use

$body=ltrim($line,'\s*','r');

which should be interpreted as

if ($line=~/^\s*(.+)$/) { return $1; }

The same problem arises if you want to pass a set of characters to be eliminated, as in tr/set1//d;

$body=ltrim($line," \t",'t'); # equivalent to ($body)=split(' ',$line,1);

One argument in favor of such functions is that in many languages the elimination of whitespace at the beginning and end of strings is recognized as an important special case, and a built-in function is provided for this purpose. Perl is one of the few in which there is no such special operation.
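
The usual Perl workaround, for reference, is a pair of one-line substitutions wrapped into user functions (the names are illustrative, not core Perl):

sub ltrim { my ($s) = @_; $s =~ s/^\s+//;        $s }
sub rtrim { my ($s) = @_; $s =~ s/\s+\z//;       $s }
sub trim  { my ($s) = @_; $s =~ s/^\s+|\s+\z//g; $s }

print trim("  hello  "), "|\n";   # hello|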

[Sep 15, 2020] Idea of introducing Fortran-style declaration of my variables in Perl

Sep 15, 2020 | perlmonks.org

ikegami on Sep 14, 2020 at 22:27 UTC

Re: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff

Allow default read access for global variables, but write mode only with own declaration via special pragma, for example use sunglasses.

You can do this already. But it doesn't make sense to do this instead of creating accessors.

Allow to specify set of characters, for which variable acquires my attribute automatically, as well as the default minimum length of non my variables via pragma my

There are a lot of problems with this. But hey, if you want this, there's nothing stopping you from writing a module that provides this "feature".

[Sep 15, 2020] The problem of the dangling closing bracket in Perl and other C-style languages

Sep 15, 2020 | perlmonks.org
Re^2: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by likbez on Sep 10, 2020 at 16:52 UTC

...BTW the problem of a missing closing brace affects all languages which use "{" and "}" as block delimiters, and the only implementation which solved this complex problem satisfactorily was closing labels on the closing block delimiter in PL/1 (on "}" in Perl terms; PL/1 used begin/end pairs). Like the "missing semicolon", this is a problem from which programmers suffer independently of their level of experience with the language.

So IMHO any measures that compensate for the "dangling '}'" problem and provide better coordination between opening and closing delimiters in nested blocks would be beneficial.

Again, the problem of a missing closing brace is a chronic one. As somebody mentioned here, an editor that has "match brace" can be used to track it, but that does not solve the problem itself; rather it provides a rather inefficient (for a complex script) way to troubleshoot it, which arises especially often if you modify a long script not written by you (or written by you long ago). I experienced even a case when the { } brace structure was syntactically correct but semantically wrong, and that was detected only after the program was moved to production. A closing label on the bracket would prevent it.

choroba on Sep 10, 2020 at 17:10 UTC

Re^3: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by choroba on Sep 10, 2020 at 17:10 UTC

If you write short subroutines, as you should, you don't suffer from misplaced closing curly braces. I had problems with them, especially when doing large edits on code not written by me, but the editor always saved me.

Both puns intended.

Re^4: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by Fletch on Sep 10, 2020 at 19:27 UTC

More or less agree WRT mismatched closing curlies. I see it pretty much entirely as an editor issue.

(I mean isn't that the whole python argument for Semantic-Whitespace-As-Grouping? At least I recall that ("Your editor will keep it straight") being seriously offered as a valid dismissal of the criticism against S-W-A-G . . .)

The cake is a lie.
The cake is a lie.
The cake is a lie.

you !!! on Sep 10, 2020 at 21:37 UTC

Re^5: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by you !!! on Sep 10, 2020 at 21:37 UTC
I mean isn't that the whole python argument for Semantic-Whitespace-As-Grouping?
No, the argument is different, but using indentation to determine block nesting does allow multiple closure of blocks as a side effect. Python invented a mixed solution where there is an opening bracket (usually ":") but no closing bracket -- instead, the change in indent is used as the proxy for the presence of the closing bracket.

The problem is that it breaks too many other things, so here the question of "whether it is worth it" is more pertinent than in the case of soft semicolons or "reverse labels" on "}" like "}LABEL".

[Sep 14, 2020] You Infinite Snake- Programming Language Wars- The Movie

Sep 14, 2020 | youinfinitesnake.blogspot.com

[Sep 14, 2020] Re^6- What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff by likbez

Sep 14, 2020 | perlmonks.org

on Sep 10, 2020 at 21:28 UTC (#11121583) Reputation: -10

OK. You are right. So it will now be interpreted as a syntax error, but was valid previously, if we assume that somebody uses this formatting for suffix conditionals.

That supports another critique of the same proposal -- it might break old Perl 5 scripts, so it should be implemented only as an optional pragma, useful only for programmers who experience this problem.

Because even the fact that this error is universal and occurs to all programmers is disputed here.

johngg on Sep 12, 2020 at 13:46 UTC

Re^5: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
if we assume that somebody uses this formatting for suffix conditionals

I do, pretty much all the time! The ability to span a statement over multiple lines without jumping through backslash hoops is one of the things that makes Perl so attractive. I also think it makes code much easier to read, rather than having excessively long lines that involve either horizontal scrolling or line wrapping. As to your comment regarding excessively long identifiers, I come from a Fortran IV background where we had a maximum of 8 characters for identifiers (ICL 1900 Fortran compiler), so I'm all for long, descriptive and unambiguous identifiers that aid those who come after in understanding my code.

Cheers,

JohnGG

dsheroh on Sep 11, 2020 at 08:11 UTC

Re^5: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff

We need not "assume that somebody uses this formatting". I do it frequently, and I have often seen it in other people's code. It is not a purely-hypothetical case.


[Sep 14, 2020] Perl 7 should probably be more sysadmin friendly, not OO friendly

Sep 14, 2020 | perlmonks.org

likbez on Sep 13, 2020 at 21:25 UTC

Re^6: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by likbez on Sep 13, 2020 at 21:25 UTC

It might make sense to enable it only with the -d option, as a help for debugging, which cuts the number of debugging runs for those who do not have an editor with built-in syntax checking (like the ActiveState Komodo Editor, which really helps in such cases).

That list includes most Linux/Unix system administrators, who use just the command line and vi or similar. And they also use bash on a daily basis along with Perl, which increases the probability of making such an error. And this is probably one of the most important categories of users for the future of Perl: Perl started with this group (Larry himself, Randal L. Schwartz, Tom Christiansen, etc.) and, after a short affair with Web programming (Yahoo, etc.) and bioinformatics (BioPerl), retreated back to the status of the scripting language of choice for elite Unix sysadmins.

That does not exclude other users and applications, but I think the core of Perl users are now Unix sysadmins. And their interests should be reflected in Perl 7 with some priority.

BTW, I do not see any benefit from omitted semicolons in the final program (nor, in certain cases, from omitted round brackets).

johngg on Sep 13, 2020 at 22:49 UTC

Re^9: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
As for the soft semicolon, you completely misunderstood the situation: First, nobody forces you to use this pragma. And if you do not use it you are not affected. I am thinking now that it should be enabled only with option -d.

In the OP you make no mention of a pragma in proposal 1, you just say that it would be "highly desirable" to have soft semicolons. This implies that you would like it to be the default behaviour in Perl 7, which, judging by the responses, would hack a lot of people off, me included. If you are proposing that soft semicolons are only enabled via a pragma perhaps you should add a note to that effect in the OP, being sure to make it clear that it is an update rather than silently changing the text.

And IMHO the subset of Perl users who would be interested in this capability is not zero -- especially system administrators, who systematically use bash along with Perl.

I spent the last 26 years of my career as a systems administrator (I had no ambition to leave technical work and become a manager) on Unix/Linux systems and started using Perl in that role in 1994 with perl 4.036, quickly moving to 5. The lack of semicolon statement terminators in the various shell programming languages I had to use was a pain in the arse and moving to Perl was a huge relief as well as a boost to effectiveness. I would not be the slightest bit interested in soft semicolons and they would, to my mind, be either a debugging nightmare or would force me into a coding style alien to my usual practice.

In this post you say

Also Python-inspired fascination with eliminating all brackets does not do any good here:

$a=$b=1;
$x=1 if $a==1
        && $b=2;

should generally be written

$a=$b=1;
$x=1 if( $a==1
        && $b=2);

to which I say, nonsense! Why add unnecessary round brackets to perfectly valid code? Use round brackets where they are needed to disambiguate precedence but not where they just add superfluous noise. Nothing to do with fascination, I've never touched Python!

You should be commended on the amount of thought that you have put into your proposals and such efforts should not be discouraged. It is unfortunate that your first proposal has been the most contentious and the one that most responses have latched onto. Sticking to one's guns is also a praiseworthy trait but doing so in the face of several powerful and cogent arguments to the contrary from experienced Perl users is perhaps taking it too far. Making it clear that soft semicolons would not be the default behaviour might apply some soothing balm to this thread.

Cheers,

JohnGG

[Sep 12, 2020] The discussion of the idea of soft semicolons in Perl

Sep 12, 2020 | perlmonks.org

likbez on Sep 10, 2020 at 20:41 UTC

Re^2: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff


by likbez on Sep 10, 2020 at 20:41 UTC Reputation: -11

Why would this be highly desirable? Consider:

print( "Hello World" )
if( 1 );

versus
print( "Hello World" )
    if( 1 < 2 ) {
         print("Goodbye");
    };
I do not understand your train of thought. In the first example the end of the line occurred when all brackets were balanced, so it will be interpreted as

print( "Hello World" );
if( 1 );

So this is a syntactically incorrect example, as it should be. The second example will be interpreted as

print( "Hello World" );
    if( 1 < 2 ) { print("Goodbye");
    };

Anonymous Monk on Sep 10, 2020 at 20:51 UTC

Re^3: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by Anonymous Monk on Sep 10, 2020 at 20:51 UTC

So this is a syntactically incorrect example, as it should be.

Wrong. print "Hello World" if 1; is valid Perl.

likbez on Sep 10, 2020 at 21:28 UTC

Re^4: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by likbez on Sep 10, 2020 at 21:28 UTC

That supports another critique of the same proposal -- it might break old Perl 5 scripts, so it should be implemented only as an optional pragma, useful only for programmers who experience this problem.

Because even the fact that this error is universal and occurs to all programmers is disputed here.

dsheroh on Sep 11, 2020 at 08:11 UTC

Re^5: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by dsheroh on Sep 11, 2020 at 08:11 UTC

johngg on Sep 12, 2020 at 13:46 UTC

Re^5: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by johngg on Sep 12, 2020 at 13:46 UTC
if we assume that somebody uses this formatting for suffix conditionals

I do, pretty much all the time! The ability to span a statement over multiple lines without jumping through backslash hoops is one of the things that makes Perl so attractive. I also think it makes code much easier to read, rather than having excessively long lines that involve either horizontal scrolling or line wrapping. As to your comment regarding excessively long identifiers, I come from a Fortran IV background where we had a maximum of 8 characters for identifiers (ICL 1900 Fortran compiler), so I'm all for long, descriptive and unambiguous identifiers that aid those who come after in understanding my code.

Cheers,

JohnGG

likbez on Sep 10, 2020 at 15:38 UTC

Re^2: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff


by likbez on Sep 10, 2020 at 15:38 UTC Reputation: -14

Because people have a natural tendency to omit them at the end of the line. That's why.

This is an interesting psychological phenomenon that does not depend on your level of mastery of the language and is not limited to novices.

dave_the_m on Sep 10, 2020 at 18:09 UTC

Re^3: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by dave_the_m on Sep 10, 2020 at 18:09 UTC

Dave.

likbez on Sep 10, 2020 at 20:56 UTC

Re^4: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by likbez on Sep 10, 2020 at 20:56 UTC

Can you please tell us how many times you corrected the missing semicolon error in your scripts during the last week?

dave_the_m on Sep 11, 2020 at 10:37 UTC

Re^5: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by dave_the_m on Sep 11, 2020 at 10:37 UTC

In the following, the first line has a balance of brackets and looks syntactically correct. Would you expect the lexer to add a semicolon?

$a = $b + $c
     + $d + $e;

If not, what are the exact criteria for things on the next line to trigger, or not trigger, a semicolon?

Dave.

likbez on Sep 11, 2020 at 14:20 UTC

Re^6: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by likbez on Sep 11, 2020 at 14:20 UTC
In the following, the first line has a balance of brackets and looks syntactically correct. Would you expect the lexer to add a semicolon?
  $a = $b + $c
            + $d + $e;
Yes, and the user will get an error. This is similar to the previous example, with the "if (1);" suffix trailing on a new line. The first question is why he/she would want to format the code this way if he/she suffers from this problem, wants to avoid the missing-semicolon error, and supposedly enabled the pragma "softsemicolons" for that.

This is the case where the user needs to use #\ to inform the scanner of his choice. But you are right in the sense that it creates a new type of error -- "missing continuation" -- and that there is no free lunch. This approach requires specific discipline in formatting your code.

dave_the_m on Sep 11, 2020 at 14:52 UTC

Re^7: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by dave_the_m on Sep 11, 2020 at 14:52 UTC

The reason I gave that code as an example is that it's a perfectly normal way of spreading complex expressions over multiple lines: e.g. where you need to add several variables together and the variables have non-trivial (i.e. long) names, e.g.

$pressure = $partial_pressure_nitrogen 
               + $partial_pressure_oxygen 
               + $partial_pressure_water_vapour
               + $partial_pressure_argon
               + $partial_pressure_carbon_dioxide;
In this case, the automatic semicolons are unhelpful and will give rise to confusing error messages. So you've just switched one problem for another, and raised the cognitive load -- people now need to know about your pragma and also know when it's in scope.

Dave.

likbez on Sep 11, 2020 at 16:51 UTC

Re^8: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by likbez on Sep 11, 2020 at 16:51 UTC

Yes, it discourages a certain formatting style. So what? If you can't live without such formatting (many can), do not use this pragma. BTW you can always use extra parentheses, which will be eliminated by the parser, as in

$pressure = (
       $partial_pressure_nitrogen
     + $partial_pressure_oxygen
     + $partial_pressure_water_vapour
     + $partial_pressure_argon
     + $partial_pressure_carbon_dioxide
     );

dave_the_m on Sep 11, 2020 at 17:05 UTC

Re^9: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by dave_the_m on Sep 11, 2020 at 17:05 UTC

* How exactly does the lexer/parser know when it should insert a soft semicolon?

* How exactly does it give a meaningful error message when it inserts one where the user didn't intend for there to be one?

My problem with your proposal is that it seems to require the parser to apply some complex heuristics to determine when to insert and when to complain meaningfully. It is not obvious to me what these heuristics should be. My suspicion is that such an implementation will just add to perl's already colourful collection of edge cases, and just confuse both beginner and expert alike.

Bear in mind that I am one of just a handful of people who actively work on perl's lexer and parser, so I have a good understanding of how it works, and am painfully aware of its many complexities. (And it's quite likely that I would end up being the one implementing this.)

Dave.

likbez on Sep 11, 2020 at 18:51 UTC

Re^10: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by likbez on Sep 11, 2020 at 18:51 UTC

The lexical analyzer in Perl is quite sophisticated due to the lexical complexity of the language. So I think it already tracks past lexemes and thus can determine the balance of "()", "[]" and "{}".

So you probably can initially experiment with the following scheme

If all the following conditions are true

  1. You reached the EOL
  2. Pragma "softsemicolon" is on
  3. The balance is zero
  4. The next symbol, via the look-ahead buffer, is not one of the set "{", "}", ';' and "." -- no Perl statement can start with a dot. Probably this set can be extended with "&&", '||' and "!". Also, a trailing ',' on the current line, and some other symbols clearly pointing toward extension of the statement onto the next line, should block the insertion.

then the lexical analyzer needs to insert a "semicolon" lexeme into the stream of lexemes passed to the syntax analyzer.
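
For illustration, a deliberately naive, line-oriented sketch of this scheme (it approximates condition 4 with a trailing-symbol test instead of look-ahead, and it ignores strings, comments and regexes, which a real lexer could not afford to do; the pragma itself does not exist):

my $balance = 0;
while (my $line = <DATA>) {
    chomp $line;
    $balance += () = $line =~ /[([{]/g;    # count openers
    $balance -= () = $line =~ /[)\]}]/g;   # count closers
    my $continues = $line =~ /(?:,|&&|\|\||[+.=])\s*$/;  # statement clearly unfinished
    $line .= ';' if $balance == 0 && $line =~ /\S/
                 && !$continues && $line !~ /[;{}]\s*$/;
    print "$line\n";
}
__DATA__
$x = $a + $b
$y = ($a +
      $b)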

The warning issued should be something like:

"Attempt to correct missing semicolon was attempted. If this is incorrect please use extra parenthesis or disable pragma "softsemicolon" for this fragment."
From what I read, the Perl syntax analyser relies on the lexical analyser in some unorthodox way, so it might be possible to use "clues" from the syntax analyser to improve this scheme. See, for example, the scheme proposed for recursive descent parsers in:

C. Stirling, "Follow set error recovery", Software: Practice and Experience, 1985 (Wiley Online Library).

dave_the_m on Sep 11, 2020 at 22:02 UTC

Re^11: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by dave_the_m on Sep 11, 2020 at 22:02 UTC
So I think it already tracks past lexemes and thus can determine the balance of "()", "[]" and "{}"
It can't currently.
If all the following conditions are true
All of the following satisfy your criteria, are valid and normal perl code, and would get a semicolon incorrectly inserted based on your criteria:
use softsemicolon; 
$x = $a 
     + $b; 
$x = 1 
     if $condition; 
$x = 1 unless $condition1 
     && $condition2;
The warning issued should be something like
I didn't ask what the text of the warning should be, I asked how the parser can determine when the warning should be issued.
the scheme proposed for recursive descent parsers
But perl uses an LR(1) parser, not a recursive descent parser.

Dave.

likbez on Sep 12, 2020 at 02:06 UTC

Re^12: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by likbez on Sep 12, 2020 at 02:06 UTC
All of the following satisfy your criteria, are valid and normal Perl code, and would get a semicolon incorrectly inserted based on your criteria:
use softsemicolon;

$x = $a
   + $b;

$x = 1
    if $condition;

$x = 1 unless  $condition1
            && $condition2;

Yes, in cases 1 and 2; in case 3 it depends on the depth of look-ahead: yes if it is one symbol, no if it is two (no Perl statement can start with &&).

As for "valid and normal" your millage may vary. For people who would want to use this pragma it is definitely not "valid and normal". Both 1 and 2 looks to me like frivolities without any useful meaning or justification. Moreover, case 1 can be rewritten as:

$x =($a
     + $b);

Case 3 actually happens in Perl most often with a regular if, where the opening bracket is obligatory:

if ( ( $tokenstr=~/a\[s\]/ || $tokenstr =~/h\[s\]/ )
     && ( $tokenstr... ) ){ .... }

Also Python-inspired fascination with eliminating all brackets does not do any good here:
$a=$b=1; 
$x=1 if $a==1  
        && $b=2;
should generally be written
$a=$b=1; 
$x=1 if( $a==1 
        && $b=2);

I was surprised that the case without brackets was accepted by the syntax analyser, because how one would interpret $x=1 if $a{$b}; without brackets is unclear to me. It has a dual meaning: it should be a syntax error in one case

$x=1 if $a{
        $b
      };
and the test for an element of hash %a in the other.

dave_the_m on Sep 12, 2020 at 06:52 UTC

Re^13: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by dave_the_m on Sep 12, 2020 at 06:52 UTC
Both 1 and 2 look to me like frivolities without any useful meaning or justification
You and I have vastly differing perceptions of what constitutes normal perl code. For example there are over 700 examples of the 'postfix if on next line' pattern in the .pm files distributed with the perl core.

There doesn't really seem any point in discussing this further. You have failed to convince me, and I am very unlikely to work on this myself or accept such a patch into core.

Dave.

likbez on Sep 12, 2020 at 19:53 UTC

Re^14: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by likbez on Sep 12, 2020 at 19:53 UTC
You and I have vastly differing perceptions of what constitutes normal perl code. For example there are over 700 examples of the 'postfix if on next line' pattern in the .pm files distributed with the perl core.
Probably yes. I am an adherent of "defensive programming", against over-complexity as well as arbitrary formatting (a pretty-printer is preferable, to me, to manual formatting of code), which in this audience unfortunately means that I am in the minority.

BTW your idea that this pragma (which should be optional) matters for the Perl standard library has no connection to reality.

[Sep 12, 2020] The Programming Language Wars

Jun 29, 2017 | www.microsoft.com
Speaker: Andreas Stefik
Affiliation: University of Nevada, Las Vegas
Series: Microsoft Research Talks

Overview

Modern society is built on software platforms that encompass a great deal of our lives. While this is well known, software is invented by people and this comes at considerable cost. Notably, approximately $331.7 billion are paid, in the U.S. alone, in wages every year for this purpose. Generally, developers in industry use programming languages to create their software, but there exists significant dispersion in the designs of competing language products. In some cases, this dispersion leads to trivial design inconsistencies (e.g., the meaning of the symbol +), while in other cases the approaches are radically different. Studies in the literature show that some of the broader debates, like the classic ones on static vs. dynamic typing or competing syntactic designs, provide consistent and replicable results in regard to their human factors impacts.

For example, programmers can generally write correct programs more quickly using static typing than dynamic typing, for reasons that are now known. In this talk, we will discuss three facets of language design dispersion, sometimes colloquially referred to as the "programming language wars."

First, we will flesh out the broader impacts inventing software has on society, including its cost to industry, education, and government. Second, recent evidence has shown that even research scholars are not gathering replicable and reliable data on the problem. Finally, we will give an overview of the facts now known about competing alternatives (e.g., types, syntax, compiler error design, lambdas).

[Sep 11, 2020] What esteemed monks think about changes necessary-desirable in Perl 7 outside of OO staff

Notable quotes:
"... Most people do not experience difficulties learning to put a full stop at the end of the sentence most of the time. Unfortunately this does work this way in programming languages with semicolon at the end of statement. Because what is needed is not "most of the time" but "all the time" ..."
"... My view supported by some circumstantial evidence and my own practice is the this is a persistent error that arise independently of the level of qualification for most or all people, and semicolon at the end of the statement contradicts some psychological mechanism programmers have. ..."
Sep 10, 2020 | perlmonks.org

likbez has asked for the wisdom of the Perl Monks concerning the following question:


What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff. I compiled some of my suggestions and would appreciate feedback:
  1. [Highly desirable] Make a semicolon optional at the end of the line, if there is a balance of brackets on the line and the statement looks syntactically correct ("soft semicolon" -- the solution used in the famous IBM PL/1 debugging compiler).
  2. [Highly Questionable] Introduce a pragma that specifies the max allowed length of single- and double-quoted strings (but not any other type of literal). That might simplify catching a missing quote (which is not a big problem with any decent Perl-aware editor anyway).
  3. [Highly desirable] Compensate for some deficiencies of using curvy brackets as the block delimiters:
    1. Treat "}:LABEL" as the bracket closing "LABEL:{" and all intermediate blocks (This idea was also first implemented in PL/1)
    2. Treat the "}.." symbol as closing all open brackets up to the subroutine/BEGIN block level, and "}..." as closing this level as well (closing up to nesting level zero). Along with conserving vertical space, this makes the search for a missing closing bracket more efficient.
  4. Make function slightly more flexible:
    1. Introduce a pragma that allows one to define synonyms for built-in functions, for example ss for substr and ix for index.
    2. Allow default read access for global variables with subroutines, but write mode only with own declaration via special pragma, for example use sunglasses;
    3. Introduce inline functions which will be expanded like macros at compile time: sub subindex inline{ $_[0]=substr($_[0],index($_[0],$_[1],$_[2])) }
  5. As extracting a substring is a very frequent operation, the use of such a long name as substr is counterproductive; it also contradicts the Perl goal of being concise and expressive.
    1. Allow extracting a substring via ':' or '..' notation, like $line[$from:$to] (a label can't be put inside square brackets in any case).
    2. Explicitly distinguish between translation table and regular expressions by introducing tt-strings
    3. Implement tail and head functions as synonyms to substr ($line,0,$len) and substr($line,-$len)
      With the ability to specify a string, regex or translation table (tr style) instead of a number as the length argument: tail($line,'#'); tail($line,/\s+#\w+$/); tail($line,tt/a-zA-Z/)
    4. Implement a function similar to head and tail called, for example, trim: trim(string, tt/left_character_set/, tt/right_character_set/); which deletes all characters from the first character set on the left and all characters from the second character set on the right; trim(string,,tt/right_character_set/)
      strips trailing characters only.
  6. Allow specifying and using "hyperstrings" -- strings with characters occupying any power of 2 bytes (2, 4, 8, ...). Unicode is just a special case of a hyperstring.
    1. $hyper_example1= h4/aaaa/bbbb/cccc/;
    2. $hyper_example2= h2[aa][bb][cc];
    3. $pos=index($hyper_example,h4/bbbb/cccc/)
  7. Pay more attention to managing namespaces.
    1. Allow default read access for global variables, but write mode only with own declaration via special pragma, for example use sunglasses.
    2. Allow specifying a set of characters for which a variable acquires the my attribute automatically, as well as the default minimum length of non-my variables, via pragma my (for example, variables with a length of less than three characters should always be my).
    3. Allow specifying a set of characters starting with which a variable is considered to be own, for example [A-Z], via pragma own.
  8. Analyze structure of text processing functions in competing scripting languages and implement several enhancements for existing functions. For example:
    1. Allow "TO" argument in index function, specifying upper range of the search.
    2. Implement a delete function for strings and arrays. For example adel(@array,$from,$to), and asubstr and aindex functions.
  9. Improve control statements
    1. Eliminate keyword 'given' and treat for(scalar) as a switch statement. Allow the when operator in all regular loops too (see the sketch after this list):
       for($var){
          when('b'){ ...;}   # means if ($var eq 'b') { ... ; last}
          when(>'c'){...;}
       } # for
    2. [Questionable] Extend last to accept labels and implement a "post loop switch" (see Donald Knuth, "Structured Programming with go to Statements"):
       my $rc=0;
       for(...){
          if (condition1) { $rc=1; last;}
          elsif(...){ $rc=2; last }
       }
       if    ($rc==0){...}
       elsif ($rc==1){...}
       elsif ($rc==3){...}

      Maybe (not that elegant, but more compact than the emulation above):

      for ...{ when (...); when (...); }
      with switch{
         default:
         1: ...
         2: ...
      }
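
For comparison, a minimal sketch of the closest construct that already works on a reasonably modern perl (the switch feature is experimental; a when inside a plain for(scalar) loop topicalizes $_ and implies last):

use feature 'switch';
no warnings 'experimental::smartmatch';

my $var = 'b';
for ($var) {
    when ('b') { print "saw b\n" }   # implicit last after the block
    when ('c') { print "saw c\n" }
    print "something else\n";        # reached only if no when matched
}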

Corion on Sep 10, 2020 at 07:03 UTC

Re: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
Highly desirable Make a semicolon optional at the end of the line, if there is a balance of brackets on the line and the statement looks syntactically correct ("soft semicolon", the solution used in famous IBM PL/1 debugging compiler).

Why would this be highly desirable? Consider:

print( "Hello World" ) if( 1 ); [download]

versus

print( "Hello World" ) if( 1 < 2 ) { print("Goodbye"); }; [download]

Adding your change idea makes the parser even more complex and introduces weird edge cases.

I think even Javascript now recommends using semicolons instead of eliding them at the end of a line.

Update: Some examples where ASI in Javascript goes wrong:

dsheroh on Sep 10, 2020 at 09:07 UTC

Re^2: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff


by dsheroh on Sep 10, 2020 at 09:07 UTC

Even aside from that, some of us don't like really long lines of code. Having to scroll horizontally in GUI editors sucks, and I do most of my coding in good-old 80-column terminal windows. So it's not uncommon for me to split up a long statement into multiple shorter lines, since whitespace has no syntactic significance.

If CRLF becomes a potential statement terminator, then breaking a single statement across multiple lines not only becomes a minefield of "will this be treated as one or multiple statements?", but the answer to that question may change depending on where in the statement the line breaks are inserted!

If implemented, this change would make a mockery of any claims that Perl 7 will just be "Perl 5 with different defaults", as well as any expectations that it could be used to run "clean" (by some definition) Perl 5 code without modification.

likbez on Sep 10, 2020 at 21:02 UTC

Re^3: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff

by likbez on Sep 10, 2020 at 21:02 UTC

If implemented, this change would make a mockery of any claims that Perl 7 will just be "Perl 5 with different defaults", as well as any expectations that it could be used to run "clean" (by some definition) Perl 5 code without modification.
Looks like a valid objection; I agree. With a certain formatting style it is possible. But do you understand that strict as the default will break a lot of old scripts too?

Per your critique, it probably should not be made the default, but implemented as a pragma similar to warnings and strict. You can call this pragma "softsemicolon".

What most people here do not understand is that it can be implemented completely at the lexical-scanner level, without affecting the syntax analyzer.

likbez on Sep 10, 2020 at 20:45 UTC

Re^3: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff

by likbez on Sep 10, 2020 at 20:45 UTC

If CRLF becomes a potential statement terminator, then breaking a single statement across multiple lines not only becomes a minefield of "will this be treated as one or multiple statements?", but the answer to that question may change depending on where in the statement the line breaks are inserted!
No. The classic solution of this problem was invented in FORTRAN in the early 50s -- a backslash at the end of the line. Perl can use #\ for this, as it is a pragma for the lexical scanner, not an element of the language.

Usually a long line in Perl is the initialization of an array or hash; after the split, the lines do not have balanced brackets and, as such, are not affected and do not require #\ at the end.
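
For illustration, this is how the proposed marker would look (hypothetical: under the proposed softsemicolon pragma a trailing #\ would suppress semicolon insertion; in today's Perl, #\ is just an ordinary comment, and the statement below continues anyway because it is not yet complete):

my $dir   = '/tmp';
my @names = qw(a b c);
my @files = grep { -f && -r }         #\
            map  { "$dir/$_" } @names;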

Question to you: how many times did you correct a missing semicolon in your Perl scripts during the last week? If you do not know, please count this during the next week and tell us.

hippo on Sep 10, 2020 at 21:46 UTC

Re^4: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO stuff

by hippo on Sep 10, 2020 at 21:46 UTC

The classic solution of this problem was invented in FORTRAN in early 50 -- it is a backslash at the end of the line.

Fortran didn't have a release until 1957, so not the early 50s. Fortran prior to F90 used a continuation character at the start (column 6) of the subsequent line, not at the end of the previous line. The continuation character in Fortran has never been specified as a backslash. Perhaps you meant some other language?

likbez on Sep 11, 2020 at 01:28 UTC

Re^5: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO stuff

by likbez on Sep 11, 2020 at 01:28 UTC

Yes, the first FORTRAN compiler was delivered in April 1957. I was wrong, sorry about that. Still, the idea of a continuation symbol belongs to FORTRAN, although the solution was different than I mentioned.

GrandFather on Sep 10, 2020 at 21:19 UTC

Re^4: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff

by GrandFather on Sep 10, 2020 at 21:19 UTC

how many times you corrected missing semicolon in your Perl scripts the last week

After running the code -- never. All the IDEs I use for all the languages I use flag missing semicolons and other similar foibles (like mismatched brackets).

There are nasty languages that I use occasionally, and even some respectable ones, that need to quote newlines to extend a statement across multiple lines. That is just nasty on so many levels. I very much agree with dsheroh that long lines are anathema. Code becomes much harder to read and understand when lines are long and statements are not chunked nicely.

Don't break what's not broken!

Optimising for fewest key strokes only makes sense transmitting to Pluto or beyond

likbez on Sep 10, 2020 at 20:41 UTC

Re^2: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff

by likbez on Sep 10, 2020 at 20:41 UTC

Why would this be highly desirable? Consider:

print( "Hello World" )
if( 1 );

versus
print( "Hello World" )
    if( 1 < 2 ) {
         print("Goodbye");
    };
I do not understand your train of thought. In the first example the end of the line occurred when all brackets were balanced, so it will be interpreted as

print( "Hello World" );
if( 1 );

So this is a syntactically incorrect example, as it should be. The second example will be interpreted as

print( "Hello World" );
    if( 1 < 2 ) { print("Goodbye");
    };

Anonymous Monk on Sep 10, 2020 at 20:51 UTC

Re^3: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff

by Anonymous Monk on Sep 10, 2020 at 20:51 UTC

So this is a syntactically incorrect example, as it should be.

Wrong. print "Hello World" if 1; is valid Perl.

likbez on Sep 10, 2020 at 21:28 UTC

Re^4: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff

by likbez on Sep 10, 2020 at 21:28 UTC

OK. You are right. So it will now be interpreted as a syntax error, but was valid previously, if we assume that somebody uses this formatting for suffix conditionals.

That supports another critique of the same proposal -- it might break old Perl 5 scripts, so it should be implemented only as an optional pragma, useful only for programmers who experience this problem.

Because even the fact that this error is universal and occurs to all programmers is disputed here.

likbez on Sep 10, 2020 at 15:38 UTC

Re^2: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff

by likbez on Sep 10, 2020 at 15:38 UTC

> Why would this be highly desirable?

Because people have a natural tendency to omit them at the end of the line. That's why. This is an interesting psychological phenomenon that does not depend on your level of mastery of the language and is not limited to novices.

dave_the_m on Sep 10, 2020 at 18:09 UTC

Re^3: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff

by dave_the_m on Sep 10, 2020 at 18:09 UTC

So instead, beginners would encounter the interesting psychological phenomenon where a physical end of line is sometimes interpreted by the compiler as an end of statement, and other times not. One set of errors would be replaced by another.

Dave.

likbez on Sep 10, 2020 at 20:56 UTC

Re^4: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff

by likbez on Sep 10, 2020 at 20:56 UTC

The problem is real and the solution is real. The objections so far were pretty superficial and stem from an insufficient understanding of how the proposal works at the level of the lexical scanner -- it essentially replaces the end of line with a semicolon if the brackets are balanced, and the syntax analyzer is not affected at all.

Can you please tell us how many times you corrected the missing semicolon error in your scripts during the last week?

choroba on Sep 10, 2020 at 21:16 UTC

Re^5: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff

by choroba on Sep 10, 2020 at 21:16 UTC

As I said, I don't forget to include semicolons. See for example this video; it's 7 years old, but my habits haven't changed much since then.

map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]

haj on Sep 10, 2020 at 18:35 UTC

Re^3: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff

by haj on Sep 10, 2020 at 18:35 UTC

That's neither a natural tendency nor an interesting psychological phenomenon. You just made that up.

Semicolons at the end of a statement are as natural as a full stop "." at the end of a sentence, regardless of whether the sentence is the last in a paragraph. The verification of whether a line "looks syntactically correct" takes longer than just hitting the ";" key, and a wrong assessment of "correct" may lead to wrong behavior of the software.

Language-aware editors inform you about a missing semicolon by indenting the following line as a continuation of the statement on the previous line, so it is hard to miss.

If, on the other hand, you want to omit semicolons, then the discussion should have informed you that you aren't going to find followers.

likbez on Sep 10, 2020 at 21:20 UTC

Re^4: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff

by likbez on Sep 10, 2020 at 21:20 UTC

Semicolons at the end of a statement are as natural as a full stop "." at the end of a sentence, regardless of whether the sentence is the last in a paragraph.
I respectfully disagree, but your comment can probably explain the fierce rejection of this proposal in this forum. IMHO this is a wrong analogy, as the level of precision required is different. If you analyze books in print you will find paragraphs in which the full stop is missing at the end. Most people do not experience difficulties learning to put a full stop at the end of a sentence most of the time. Unfortunately it does not work this way in programming languages with a semicolon at the end of a statement, because what is needed is not "most of the time" but "all the time".

My view, supported by some circumstantial evidence and my own practice, is that this is a persistent error that arises independently of the level of qualification for most or all people, and that the semicolon at the end of the statement contradicts some psychological mechanism programmers have.

haj on Sep 11, 2020 at 00:41 UTC

Re^5: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff

by haj on Sep 11, 2020 at 00:41 UTC

If you analyze books in print you will find paragraphs in which the full stop is missing at the end.

You are still making things up.

..and semicolon at the end of the statement contradicts some psychological mechanism programmers have.

There is no evidence for that.

You should have understood that your idea doesn't get support here. Defending it with made-up evidence doesn't help.

Tux on Sep 10, 2020 at 08:52 UTC

Re: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
  1. Highly desirable Make a semicolon optional at the end of the line
    Highly undesirable. If things are to be made optional for increased readability, it should not be this, but making braces optional for single-statement blocks. But that won't happen either.
  2. Highly Questionable Introduce pragma that specify max allowed length of single and double quoted string
    Probably already possible with a CPAN module, but who would use it? This is more something for a linter or perltidy.
  3. Highly desirable Compensate for some deficiencies of using curvy brackets as the block delimiters
    Unlikely to happen and very undesirable. The first option is easy: } # LABEL (why introduce new syntax when comments will suffice). The second is just plain illogical and uncommon in most other languages. It will confuse the hell out of every programmer.
  4. Make function slightly more flexible
a) no b) Await the new signatures c) Macros are unlikely to happen. See the problems they faced in Raku. Would be fun though.
  5. Long function names
    Feel free to introduce a CPAN module that does all you propose. A new function for trimming has recently been introduced and spun off a lot of debate. I think none of your proposed changes in this point is likely to gain momentum.
  6. Allow to specify and use "hyperstrings"
    I have no idea what is to be gained. Eager to learn though. Can you give better examples?
  7. Put more attention of managing namespaces
I think a) is part of the proposed OO reworks for perl7 based on Cor, b) is just plain silly, c) could be useful, but not based on letters but on sigils or interpunction, like in Raku.
  8. Analyze structure of text processing functions in competing scripting languages
    Sounds like a great idea for a CPAN module, so all that require this functionality can use it
  9. Improve control statements
Oooooh, enter the snake pit! There be dragons here, lots of nasty dragons. We have had given/when and several switch implementations and suggestions, and so far there has been no single solution to this. We all want it, but we all have different expectations for its feature sets and behavior. Wise people are still working on it, so expect *something* at some time.
Enjoy, Have FUN! H.Merijn

likbez on Sep 10, 2020 at 16:57 UTC

Re^2: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff

by likbez on Sep 10, 2020 at 16:57 UTC Reputation: 0

>The first option is easy: } # LABEL (why introduce new syntax when comments will suffice).

Because }:LABEL actually forcefully closes all blocks in between, while the comment just informs you which opening bracket this closing bracket corresponds to, and, as such, can be placed on the wrong closing bracket, especially if the indentation is wrong too -- worsening an already bad situation.

Been there, done that.
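To illustrate the distinction for readers following this exchange: the } # LABEL form Tux mentions is an ordinary comment that the parser never checks, while the }:LABEL form proposed here is hypothetical syntax that current Perl rejects. A minimal sketch (with invented file names) of how a stale closing-brace comment can mislead:

    use strict; use warnings;

    my @files = ('a.txt', 'b.txt');
    READ: {
        for my $f (@files) {
            print "processing $f\n";
        } # READ   <- stale comment: this brace actually closes the for loop
    }              # only this brace really ends the READ: block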

hippo on Sep 10, 2020 at 08:34 UTC

Re: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO stuff
9. ... a. Eliminate keyword 'given'

That I can agree with. The rest of your proposals seem either unnecessary (because the facilities already exist in the language) or potentially problematic or almost without utility to me. Sorry. That's not to say you shouldn't suggest them all to p5p for further review of course - it's only the opinion of a humble monk after all.

9. ... b. ... Extend last to accept labels

I have good news: it already does
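For reference, the existing labeled-loop behaviour hippo points to (standard Perl, documented in perlsyn) looks like this; the data is invented for illustration:

    use strict; use warnings;

    my @lines = ('alpha beta', 'gamma STOP delta', 'never reached');
    OUTER: for my $line (@lines) {
        for my $word (split ' ', $line) {
            print "$word\n";
            last OUTER if $word eq 'STOP';   # exits both loops at once
        }
    }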

likbez on Sep 10, 2020 at 15:16 UTC

Re^2: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO stuff

by likbez on Sep 10, 2020 at 15:16 UTC Reputation: 2

> I have good news: it already does

What I mean is a numeric "local" label (in Pascal style; it can be redefined later in other blocks) in the context of Knuth's idea of "continuations" outside the loop.

haj on Sep 10, 2020 at 11:00 UTC

Re: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff

That's quite some work you've invested here. I've looked at them from two perspectives:

In summary, your suggestions don't perform that great. These are rather nerdy ideas where I don't see which problem they solve. There isn't much to be added to the comments of other monks, so I'll keep attention to two items:

I challenge the claim that closing more than one block with one brace allows search for missing closing bracket to be more efficient . It just hides problems when you have lost control over your block structure. Source code editors easily allow you to jump from an opening to a closing brace, or to highlight matching braces, but they are extremely unlikely to support such constructs.

I challenge the claim that extracting of substring is a very frequent operation . It is not in the Perl repositories I've cloned. Many of them don't have a single occurrence of substr. Please support that claim with actual data.

likbez on Sep 10, 2020 at 21:49 UTC

Re^2: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff

by likbez on Sep 10, 2020 at 21:49 UTC Reputation: -1

The frequency per line of code is rather low -- slightly above 4% (156/3678)

But in my text processing scripts this is the most often used function. In comparison, the function "index" is used only 53 times, roughly a third as often. substr also exceeds the use of regular expressions -- 108 in 3678.
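Such counts are easy to reproduce independently; assuming a directory of .pl scripts (the glob is just an illustration), a rough tally of substr occurrences can be obtained with a one-liner like:

    perl -lne '$n++ while /\bsubstr\b/g; END { print $n // 0 }' *.pl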

GrandFather on Sep 10, 2020 at 22:26 UTC

Re^3: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff

by GrandFather on Sep 10, 2020 at 22:26 UTC

Strokes for folks. I use regexen vastly more often than substr and I almost never use index. Maybe you grew up with some other language (Visual Basic maybe?) and haven't actually learned to use Perl in an idiomatic way? Perl encourages a plethora of paradigms for solving problems. The flip side is that Perl doesn't do much to discourage hauling less appropriate "comfort coding practices" from other languages. That is no reason to assume that all Perl users abuse Perl in the same way you do, or have as much trouble typing statement terminators.

Optimising for fewest key strokes only makes sense transmitting to Pluto or beyond

likbez on Sep 11, 2020 at 02:55 UTC

Re^4: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff

by likbez on Sep 11, 2020 at 02:55 UTC

Have you ever noticed that anybody driving slower than you is an idiot, and anyone going faster than you is a maniac?

George Carlin

alexander_lunev on Sep 10, 2020 at 09:02 UTC

Re: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff

Making Perl more like modern Python or JS is not an improvement to the language; you need another word for that, something like "trends" or "fashion". I see this list as a simplification of the language (and in a bad way), not an improvement. As if some newbie programmer would not want to improve himself, to get up to the complexity of the language, but would blame language complexity and demand that the language come down to his (low) level. "I don't want to count closing brackets, make something that will close them all", "I don't want to watch for semicolons, let the interpreter watch for the end of a sentence for me", "This complex function is hard to understand and remember how to use the right way, give me a bunch of simple functions that will do the same as this one function, but will be easy to remember".

Making a tool simpler will not make it more powerful or more efficient; instead it could make it less efficient, because the tool will have to waste some of its power to compensate for the user's ineptitude. The interpreter would waste CPU and memory to comprehend sentence endings, these "new" closing brackets and extra function calls, and what's the gain here? I see only one: that a newbie programmer could write code with less mental effort. So it's not an improvement of the language to do more with less, but a change that will cause the tool to do the same with more. Is that improvement? I don't think so.

likbez on Sep 10, 2020 at 16:52 UTC

Re^2: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff

by likbez on Sep 10, 2020 at 16:52 UTC

As if some newbie programmer would not want to improve himself, to get up to the complexity of the language, but would blame language complexity and demand that the language come down to his (low) level.

The programming language should be adapted to actual use by programmers, not to some illusion of actual use under the disguise of "experts do not commit those errors". If the errors committed by programmers in a particular language are chronic, as is the case for semicolons and missing closing braces, something needs to be done about them, IMHO.

The same is true for the problem of "overexposure" of global variables. Most programmers at some point suffer from this type of bug. That's why "my" was pushed into the language. But IMHO it does not go far enough, as it does not distinguish between reading and modifying a variable. And a "sunglasses" approach to the visibility of global variables might be beneficial.

BTW the problem of a missing closing brace affects all languages which use "{" and "}" as block delimiters, and the only implementation which solved this complex problem satisfactorily was closing labels on the closing block delimiter in PL/1 (the "end" of a "begin/end" pair in PL/1, corresponding to "}" in Perl). Like the "missing semicolon", this is a problem from which programmers suffer independently of their level of experience with the language.

So IMHO any measures that compensate for the "dangling '}'" problem and provide better coordination between opening and closing delimiters in nested blocks would be beneficial.

Again, the problem of a missing closing brace is a chronic one. As somebody mentioned here, an editor with "match brace" can be used to track it down, but that does not solve the problem itself; rather it provides a rather inefficient (for a complex script) way to troubleshoot it, and the problem arises especially often when you modify the script. I have even seen a case where the { } brace structure was syntactically correct but semantically wrong, and that was detected only after the program was moved to production. A closing label on the bracket would have prevented it.

choroba on Sep 10, 2020 at 17:10 UTC

Re^3: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff

by choroba on Sep 10, 2020 at 17:10 UTC

I never had problems with omitting semicolons; maybe it's because of the extensive Pascal training.

If you write short subroutines, as you should, you don't suffer from misplaced closing curly braces. I had problems with them, especially when doing large edits on code not written by me, but the editor always saved me.

Both puns intended.

map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]

Fletch on Sep 10, 2020 at 19:27 UTC

Re^4: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff

by Fletch on Sep 10, 2020 at 19:27 UTC

More or less agree WRT mismatched closing curlies. I see it pretty much entirely as an editor issue.

(I mean isn't that the whole python argument for Semantic-Whitespace-As-Grouping? At least I recall that ("Your editor will keep it straight") being seriously offered as a valid dismissal of the criticism against S-W-A-G . . .)

likbez on Sep 10, 2020 at 21:37 UTC

Re^5: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff

by likbez on Sep 10, 2020 at 21:37 UTC

Reputation: -1
I mean isn't that the whole python argument for Semantic-Whitespace-As-Grouping?
No, the argument is different, but using indentation to determine block nesting does allow closing multiple blocks at once, as a side effect. Python adopted a strange mixed solution: there is an opening bracket (usually ":") but no closing bracket -- instead the dedent is used as the closing bracket.

The problem is that it breaks too many other things, so here the question "is it worth it" is more appropriate than in the case of soft semicolons.

swl on Sep 10, 2020 at 08:54 UTC

Re: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff

WRT 5d, a trim function has recently been discussed for the core. See https://github.com/Perl/perl5/issues/17952 and https://github.com/Perl/perl5/pull/17999 .

jo37 on Sep 10, 2020 at 17:08 UTC

Re: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
[Highly desirable] Make a semicolon optional at the end of the line, if there is a balance of brackets on the line and the statement looks syntactically correct ("soft semicolon", the solution used in famous IBM PL/1 debugging compiler).

I feel a bit ashamed to admit that I programmed in PL/I for several years. The reason why PL/I was so relaxed w.r.t. syntax is simple: you put your box full of punched cards on the operators' desk and got the compiler's result the next day. If the job had failed just because of a missing semicolon, you'd lose one full day. Nowadays there is absolutely no need for such stuff.

BTW, the really fatal errors in a PL/I program resulted in a compiler warning of the kind "conversion done by subroutine call". This happened e.g. when assigning a pointer to a character array.

I wouldn't like to see any of the fancy features of PL/I in Perl. Consult your fortune database:

Speaking as someone who has delved into the intricacies of PL/I, I am sure that only Real Men could have written such a machine-hogging, cycle-grabbing, all-encompassing monster. Allocate an array and free the middle third? Sure! Why not? Multiply a character string times a bit string and assign the result to a float decimal? Go ahead! Free a controlled variable procedure parameter and reallocate it before passing it back? Overlay three different types of variable on the same memory location? Anything you say! Write a recursive macro? Well, no, but Real Men use rescan. How could a language so obviously designed and written by Real Men not be intended for Real Man use?
Greetings,
-jo

$gryYup$d0ylprbpriprrYpkJl2xyl~rzg??P~5lp2hyl0p$

likbez on Sep 11, 2020 at 02:05 UTC

Re^2: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff


by likbez on Sep 11, 2020 at 02:05 UTC

PL/1 still exists, although as a niche language practically limited to mainframes. Along with being a base for C, it was also probably the first programming language that introduced exceptions as a mainstream language feature. It is also, IMHO, the origin of the functions substr, index and translate as we know them. PL/1 compilers were real masterpieces of software engineering and in many aspects probably remain unsurpassed.
https://www.ibm.com/support/knowledgecenter/zosbasics/com.ibm.zos.zmainframe/zmainframe_book.pdf

What is common between PL/1 and Perl is the amount of unjustified hate from CS departments and users of other languages toward them.

What I think is common about both is that, while being very unorthodox, they are expressive and useful. Fun to program with. As Larry Wall said: "Perl is, in intent, a cleaned up and summarized version of that wonderful semi-natural language known as 'Unix'."

The unorthodox nature of Perl and its solutions, which stem from the Unix shell, is probably what makes people coming from a Python/Java/JavaScript background hate it.

perlfan on Sep 10, 2020 at 14:25 UTC

Re: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff

Currently, the big push is to turn on warnings and strict by default; I like the initially slow approach. I don't have a strong opinion about any of your suggestions (good or bad) because I see none of them as particularly disruptive. Heck, I'd be happy to have say and state available without turning them on explicitly. Ultimately, I just look forward to moving towards a more aggressive model of having new features on by default.

[Sep 10, 2020] Perl is the Most Hated Programming Language, Developers Say - Slashdot

Well written Perl is readable and efficient. People who hate it as a language in general most likely have no idea what they are talking about.
Sep 10, 2020 | developers.slashdot.org

Thomas Claburn, writing for The Register: Developers really dislike Perl, and projects associated with Microsoft, at least among those who volunteer their views through Stack Overflow. The community coding site offers programmers a way to document their technical affinities on their developer story profile pages. Included therein is an input box for tech they'd prefer to avoid. For developers who have chosen to provide testaments of loathing, Perl tops the list of disliked programming languages, followed by Delphi and VBA. The yardstick here consists of the ratio of "likes" and "dislikes" listed in developer story profiles; to merit chart position, the topic or tag in question had to show up in at least 2,000 stories. Further down the list of unloved programming languages come PHP, Objective-C, CoffeeScript, and Ruby. In a blog post seen by The Register ahead of its publication today, Stack Overflow data scientist David Robinson said usually there's a relationship between how fast a particular tag is growing and how often it's disliked. "Almost everything disliked by more than 3 per cent of Stories mentioning it is shrinking in Stack Overflow traffic (except for the quite polarizing VBA, which is steady or slightly growing)," said Robinson. "And the least-disliked tags -- R, Rust, TypeScript and Kotlin -- are all among the fast-growing tags (TypeScript and Kotlin growing so quickly they had to be truncated in the plot)."

Problems ( Score: 5 , Funny) by saphena ( 322272 ) on Wednesday November 01, 2017 @10:03AM ( #55468971 )

You have a problem and you think Perl provides the solution. Now you have two problems.

Re:No COBOL? ( Score: 4 , Informative) by Shompol ( 1690084 ) on Wednesday November 01, 2017 @10:22AM ( #55469085 )

If you look at the original article [stackoverflow.blog] -- COBOL is there, as the 3rd most hated "tag".

Re:Perl Is Hated Because It's Difficult ( Score: 5 , Informative) by Doctor Memory ( 6336 ) on Wednesday November 01, 2017 @10:47AM ( #55469251 )

And once you want to move beyond some simple automation scripts, you find that Python doesn't have the performance to handle anything more taxing.

Re:Perl Is Hated Because It's Difficult ( Score: 4 , Interesting) by Anonymous Coward on Wednesday November 01, 2017 @11:05AM ( #55469365 )

Perl doesn't encourage or discourage you to write good or bad code. What it does very well is work with the philosophy of DWIM (Do What I Mean). Importantly, it doesn't throw a giant pile of (effectively) RFCs with an accompanying Nazi yelling, "YOU VILL VRITE CODE DIS VAY." at you the way Python does. I've seen great Perl code and poor Perl code. I've seen great Python code and poor Python code. A shitty developer writes shitty code and doesn't read documentation. A great developer can take a language like Perl and create great, readable code.

Real source ( Score: 4 , Informative) by Shompol ( 1690084 ) on Wednesday November 01, 2017 @10:12AM ( #55469013 )

The original study is here [stackoverflow.blog]. I found the "polarization of technology" diagram at the bottom even more interesting.

Experience-based opinions ( Score: 5 , Insightful) by Sarten-X ( 1102295 ) on Wednesday November 01, 2017 @10:16AM ( #55469047 )

Having worked in Perl (and many other languages) for about 15 years now, I'm curious how many of those polled actually use Perl regularly.

Whenever I have to introduce someone to my Perl scripts, their first reaction is usually the typical horror, which fades in a few days after they start using it. Yes, there are comments. Yes, there is decent design. No, the regular expressions are not worse than any other implementation. No, the "clever" one-liner you copied off of a PerlMonks golf challenge will not pass review.

Sure, there are a few weird warts on the language ("bless" being the most obvious example), but it's no worse than any other, and significantly better than some of today's much more popular languages. Mostly, I find that Perl just has a bad reputation because it allows you to write ugly code, just like C allows you to corrupt data and Java allows you to consume obscene amounts of memory. The language choice does not excuse being a bad programmer.

At least Perl stable. ( Score: 5 , Insightful) by Qbertino ( 265505 ) on Wednesday November 01, 2017 @10:38AM ( #55469163 )

Perl is a wacky language and only bearable if you can handle old school unix stunts, no doubt. It gave birth to PHP, which speaks volumes. I remember reading an O'Reilly introduction to Perl and laughing at the wackiness. I've done the same with PHP, but I've always respected both. Sort of.
Unlike newfangled fads and disasters like Ruby, Perl is a language that remains usable. Books on Perl from 18 years ago are still valid today, just like with awk, TCL and Emacs Lisp.

Complain all you want about the awkwardness of old-school languages - they still work and many of them run on just about anything that can be powered by electricity. These days I'm still a little reluctant to say which side Javascript will come up on now that Node has its very own version hodgepodge gumming up the works.

Two types of languages . . . ( Score: 5 , Insightful) by walterbyrd ( 182728 ) on Wednesday November 01, 2017 @10:42AM ( #55469203 )

Those people like, and those people use.

Nothing new ( Score: 5 , Interesting) by simplu ( 522692 ) on Wednesday November 01, 2017 @10:46AM ( #55469243 )

People hate what they don't understand.

Perl is a scripting language... ( Score: 4 , Interesting) by bobbied ( 2522392 ) on Wednesday November 01, 2017 @10:48AM ( #55469255 )

Personally I prefer Perl over similar scripting languages.

I write in KSH, CSH, Python and Perl regularly... Of these, Perl is my hands down favorite for a scripting language.

If you are writing applications in Perl though, it sucks. The implementation of objects is obtuse, it isn't geared for User Interfaces (Perl/TK anyone?) and performance is really horrid.

But... I cut my programming teeth on C (K&R, not ANSI) so I'm one of those old grey headed guys who go "tisk tisk" at all those new fangled, it's better because it's new things you young ones think are great.

Now get off my lawn...

Enjoyed Perl 5 ( Score: 5 , Insightful) by mattr ( 78516 ) on Wednesday November 01, 2017 @10:49AM ( #55469271 )

Funny, I quite enjoyed writing in Perl 5 and the feeling was empowerment, and the community was excellent. At the time Python was quite immature. Python has grown but Perl 5 is still quite useful.

There is also quite a difference between legacy code and code written today using modern extensions, though it seems people enjoy trashing things, instead of admitting they did not actually learn it.

Perl is just fine ( Score: 2 , Insightful) by Anonymous Coward on Wednesday November 01, 2017 @11:43AM ( #55469621 )

I love perl. What I don't love is the deliberately obfuscated perl written by someone trying to be clever and/or indispensable by writing code only they can (quickly) understand. A quick down-and-dirty perl script is one thing; doing that in reusable scripts is just immature and pointless. Especially those who refuse to document their code.

THAT is the part I detest.

Perl has too many choices ( Score: 2 ) by Hydrian ( 183536 ) on Wednesday November 01, 2017 @11:56AM ( #55469727 )

My biggest problem with Perl is that there are SO many ways to express similar operations, conditionals, etc. While this may be nice for single-developer projects, it is utter hell if someone has to read that code. This has happened because of Perl's long life and its iterations to add more and more contemporary programming concepts. This has made it possible (and thus it will happen) to make Perl code a spaghetti mess of syntaxes. This makes Perl code difficult to read, much less grok.

I'm not saying Perl is the only offender of this. PHP has the same issue with its older functional programming syntax style and its OOP syntax. But PHP has kept it mainly to two styles. Perl has way too many styles, so people get lost in syntax and find it hard to follow the code.

Re: It is surprising to me that enough developers have used Perl for it to be the most hated language. I would have guessed JavaScript, or maybe VB (#4 & #2 most hated).

Re: My usual experience with Perl goes like this: We can't process data this year, can you help us? Oh, this is a 20-year-old Perl script. Let the biopsy begin.

Re:Is that surprising? ( Score: 5 , Informative) by Austerity Empowers ( 669817 ) on Wednesday November 01, 2017 @11:05AM ( #55469361 )

My experience with the Perl hate is it's usually from younger people (by which I mean anyone under about 40). It violates everything some may have been taught as part of their software engineering program: it's difficult to read, maintain, and support.

But, it exists for a reason and it's ridiculously good at that purpose. If I want to process lots of text, I do not use Python, I whip out perl. And usually it's fine, the little bits of perl here and there that glue the world together aren't usually that egregious to maintain (particularly in context of the overall mechanism it's being used to glue together, usually).

If I'm going to write serious code, code that may form the basis for my corporation's revenue model or may seriously improve our cost structure, I use a serious language (C/C++, usually) and spend significant amounts of time architecting it properly. The problem is that more and more people are using scripting languages for this purpose, and it's becoming socially acceptable to do so. The slippery slope being loved by children and idiots alike, one might say "I know Perl, let's use that!" and countless innocents are harmed.

Re:Is that surprising? ( Score: 5 , Informative) by networkBoy ( 774728 ) on Wednesday November 01, 2017 @11:57AM ( #55469737 )

I *love* perl.
It is C for lazy programmers.
I tend to use it for four distinct problem domains:

* one-offs for data processing (file to file, file to stream, stream to file, stream to stream). When I'm done I don't need it any more

* glue code for complex build processes (think a preprocessor and puppetmaster for G/CMAKE)

* cgi scripts on websites. Taint is an amazing tool for dealing with untrusted user input. The heavy lifting may be done by a back end binary, but the perl script is what lives in the /cgi-bin dir.

* test applications. I do QA and Perl is a godsend for writing fuzzers and mutators. Since it's loosely typed and dynamically allocates/frees memory in a quite sane manner it is able to deal with the wacky data you want fuzzers to be working with.

Re:Is that surprising? ( Score: 5 , Insightful) by al0ha ( 1262684 ) on Wednesday November 01, 2017 @01:28PM ( #55470385 )

Yep - Perl is C for lazy programmers - well maybe not lazy, but programmers that don't want to have to deal with allocating and freeing memory, which is the bane of C and where many of the security problems arise. The other beautiful thing about Perl is no matter how you write your code, the interpreter compiles it into the most efficient form, just like C.

I think hate for Perl stems from the scripters who try to show off their Perl skills, writing the most concise code which is exasperatingly confusing and serves absolutely no purpose. Whether you write verbose code which takes many lines, or concise and hard-to-understand code that does the same thing, at run time they perform exactly the same.

Perl coders have only themselves to blame for the hate; thousands of lines of stupid hard to read code is a nightmare for the person that comes along months or years later and has to work on your code. Stop it damn it, stop it!!!!!

Re:Is that surprising? ( Score: 5 , Insightful) by fahrbot-bot ( 874524 ) on Wednesday November 01, 2017 @12:28PM ( #55469959 )

My experience with the Perl hate is it's usually from younger people (by which I mean anyone under about 40). It violates everything some may have been taught as part of their software engineering program: it's difficult to read, maintain, and support.

The quality of the program structure and the ability to read, maintain and support it are due to the programmer, not Perl. People can write programs well/poorly in any language. Like some others here, I *love* Perl and always endeavor to write clear, well-organized code - like I do in any other programming language - so others can make sense of it -- you know, in case I get hit by a bus tomorrow... It's called being professional.

Hate the programmer, not the programming language.

Re:Is that surprising? ( Score: 5 , Funny) by Anonymous Coward on Wednesday November 01, 2017 @10:16AM ( #55469039 )

Many of us who know perl (and think you're a hypersensitive snowflake of a developer) learned C before we learned Perl.

We're immune to coding horrors.

Re:Ruby... ( Score: 4 , Interesting) by Anonymous Coward on Wednesday November 01, 2017 @11:28AM ( #55469503 )

The problem is because people use the wrong tools for things. This is not a definitive list:

Perl is ONLY useful today as a server-side processing script. If you are using Perl on your front end, you will get dependency hell as your server updates things arbitrarily. Perl breaks super-frequently due to the move from manual updates to automatic updates of third party libraries/ports. Thus if you don't update Perl and everything that uses Perl at the same time, mass breakage. Thus "Don't update Perl you moron".

To that end PHP is on the other side of that coin. PHP is only useful for websites and nothing else. If you run PHP as a backend script it will typically time out, or run out of memory, because it's literally not designed to live very long. Unfortunately the monkeys that make Wordpress themes, plugins, and "frameworks" for PHP don't understand this. Symfony is popular; Symfony also is a huge fucking pain in the ass. Doctrine gobbles up memory and gets exponentially slower the longer the process runs.

Thus "Don't update Wordpress" mantra, because good lord there are a lot of shitty plugins and themes. PHP's only saving grace is that they don't break shit to cause dependency hell, they just break API's arbitrarily, thus rendering old PHP code broken until you update it, or abandon it.

Ruby is a poor all-purpose tool. In order to use it with the web, you basically need to have the equivalent of php-fpm for Ruby running, and if your server is exhausted, just like php, it just rolls over and dies. Ruby developers are just like Python developers (next) in that they don't fundamentally understand what they are doing , and leave (crashed) processes running perpetually. At least PHP gets a force-kill after a while. Ruby Gems create another dependency hell. In fact good luck getting Ruby on a CentOS installation, it will be obsolete and broken.

Python has all the worst of Perl's dependency hell with Ruby's clueless developers. Python simply doesn't exist on the web, but good lord so many "build tools" love to use it, and when it gets deprecated, whole projects that aren't even developed in Python stop working.

Which leads me to NodeJS/NodeWebkit. Hey, it's Javascript, everyone loves javascript. If you're not competent enough to write Javascript, turn in your developer's license. Despite that, just like Perl, Ruby and Python, setting up a build environment is an annoying pain in the ass. Stick to the web browser and don't bother with it.

So that covers all the interpreted languages that you will typically run into on the web.

Java is another language that sometimes pops up on servers, but it's more common in finance and math projects, which are usually secretive. Java, just like everything mentioned, breaks shit with every update.

C is the only language that hasn't adopted the "break shit with every update" pattern, because C cannot be "improved" on any level. Most of what has been added to the C API deals with threading and string handling. At the very basics, anything written in C can compile on everything as long as the platform has the same functions built into the runtime. Which isn't true when cross-compiling between Linux and Windows: Windows doesn't use a plain main(), while Linux has a bunch of POSIX functions that don't exist on Windows.

Ultimately the reason all these languages suck comes right back to dependency hell. A language that has a complete API, requiring no libraries, simply doesn't exist, and wouldn't be future-proof anyway.

People hate a lot of these languages because they don't adhere to certain programming habits they have, like object-oriented "override bullshit", abuse of global variables, or strongly typed languages. Thus what should work in X language doesn't work in Y language, because that language simply does it differently.

Weakly typed languages are probably supreme, at the expense of runtime performance, because they result in fewer errors. That said, =, == and === are different. In a strongly typed language, you can't fuck that up. In a weakly typed language, you can make mistakes like if(moose=squirrel){blowshitup();} and the script will assume you want to assign the value of squirrel to moose, AND run blowshitup() regardless of the result. Now if you meant ===, no type conversion.

Re:Ruby... ( Score: 5 , Interesting) by Darinbob ( 1142669 ) on Wednesday November 01, 2017 @02:20PM ( #55470827 )

You can write Forth code that is readable. Once you've got the reverse notation figured out it is very simple to deal with. The real problem with Perl is that the same variable name can mean many different things depending upon the prefix character and the context in which it is used. This can lead to a lot of subtle bugs, leads to a steep learning curve, and even a few months of vacation from the language can result in being unable to read one's own code.

On the other hand, Perl was never designed to be a typical computer language. I was berated by Larry Wall over this; he told me "you computer scientists are all alike". His goal was to get a flexible and powerful scripting language that can be used to get the job done. And it does just that - people use Perl because it can get stuff done. When it was new on Unix it was the only thing that could really replace that nasty mix of sh+awk+sed scripting that was common, instead being able to do all of that in a single script, and that led to its extremely fast rise in popularity in the early 90s. Yes, it's an ugly syntax but it's strong underneath, like the Lou Ferrigno of programming languages.

Re:Ruby... ( Score: 2 ) by Shompol ( 1690084 ) on Wednesday November 01, 2017 @10:27AM ( #55469105 )

Ruby is ahead of Perl, in the "medium-disliked" [stackoverflow.blog] category. I find it amusing that Ruby was conceived as a Python replacement, yet fell hopelessly behind in the popularity contest.

[Sep 10, 2020] Joe Zbiciak's answer to Why is Perl so hated and not commonly used- And why should I learn it- - Quora

Sep 10, 2020 | www.quora.com

Joe Zbiciak , Software engineer and System-on-a-Chip (SoC) architect Updated November 5, 2017 · Author has 2.8K answers and 11.2M answer views

Perl bashing is popular sport among a particularly vocal crowd.

Perl is extremely flexible. Perl holds up TIMTOWTDI ( There Is More Than One Way To Do It ) as a virtue. Larry Wall's Twitter handle is @TimToady, for goodness sake!

That flexibility makes it extremely powerful. It also makes it extremely easy to write code that nobody else can understand. (Hence, Tim Toady Bicarbonate.)

You can pack a lot of punch in a one-liner in Perl:

    print $fo map { sprintf(" .pword 0x%.6X\n", $_) } unpack("n*", $data);

That one-liner takes a block of raw data (in $data ), expands it to an array of 16-bit values, and then formats each one as a .pword assembler directive, printing the result to the filehandle $fo. …
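To make the one-liner concrete, here is a self-contained variant with invented sample data; the original $fo filehandle is replaced by STDOUT for illustration:

    use strict; use warnings;

    my $fo   = \*STDOUT;               # stands in for the original output filehandle
    my $data = "\x12\x34\x56\x78";     # two 16-bit big-endian words (invented sample)
    print $fo map { sprintf(" .pword 0x%.6X\n", $_) } unpack("n*", $data);
    # output:
    #  .pword 0x001234
    #  .pword 0x005678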

Steve J , Software Engineer. Answered November 4, 2017 · Author has 486 answers and 133.6K answer views. Originally Answered: Why is Perl so hated and not commonly used? And why should I learn it?

You should learn things that make your life easier or better. I am not an excellent Perl user, but it is usually my go-to scripting language for important projects. The syntax is difficult, and it's very easy to forget how to use it when you take significant time away from it.

That being said, I love how regular expressions work in Perl. I can use sed-like commands $myvar =~ s/old/new/g for string replacement when processing or filtering strings. It's much nicer than in other languages, imo.

I also like Perl's foreach loops and its data structures.

I tried writing a program of moderate length in Python…

Joachim Pense , Perl is my language of choice. Answered November 4, 2017 · Author has 6.5K answers and 8M answer views

It is still used, but its usage is declining. People use Python today in situations when they would have used Perl ten years ago.

The problem is that Perl is extremely pragmatic. It is designed to be "a language to get your job done", and it does that well; however, that led to rejection by language formalists. However, Perl is very well designed, only it is well designed for professionals who grab in the dark expecting that at this place there should be a button to do the desired functionality, and indeed, there will be the button. It is much safer to use than, for example, C (the sharp knife…)

Michael Yousrie , A programmer, A Problem Solver. Answered November 4, 2017 · Author has 82 answers and 169.1K answer views. Originally Answered: Why is Perl so hated and not commonly used? And why should I learn it?

You can read this ( The Fall Of Perl, The Web's Most Promising Language ). It explains quite a bit.

Allow me to state my opinion though; You can't have people agreeing on everything because that's just people. You can't expect every single person to agree on a certain thing, it's impossible. People argue over everything and anything, that's just people.

You will find people out there that are Perl fanatics, people that praise the language! You will also find people that don't like Perl at all and always give negative feedback about it.

To be honest, I never gave a damn about people's opinions…

Randal L. Schwartz , Literally "wrote the books" on it. Answered March 3, 2018 · Author has 109 answers and 105.1K answer views

The truth is that, by any metric, more Perl is being done today than during the dot-com boom. It's just a somewhat smaller piece of a much bigger pie. In fact, I've heard from some hiring managers that there's actually a shortage of Perl programmers, and not just for maintaining projects, but for new greenfield deploys.

Richard Conto , Programmer in multiple languages. Debugger in even more. Answered December 18, 2017 · Author has 7K answers and 5.4M answer views

Perl bashing is largely hear-say. People hear something and they say it. It doesn't require a great deal of thought.

As for Perl not commonly being used - that's BS. It may not be as common as the usual gang of languages, but there's an enormous amount of work done in Perl.

As for why you should learn Perl: it's for the same reason you would learn any other language - it helps you solve a particular problem better than the other languages available. And yes, that can be a very subjective decision to make.

Pekka Järveläinen , studied at Tampere University of Technology. Answered November 4, 2017

Because even the best features of Perl easily produce a write-only language. I have written a one-liner XML parser using Perl regexes. The program has worked perfectly for more than 10 years, but I have been afraid of any change or new feature requirement which I can't fulfill without writing a totally new program, because I can't understand my old one.

Reed White , former Engineer at Hewlett-Packard (1978-2000). Answered November 7, 2017 · Author has 3K answers and 695.9K answer views

Yes, Perl takes verbal abuse; but in truth, it is an extremely powerful, reliable language. In my opinion, one of its outstanding characteristics is that you don't need much knowledge before you can write useful programs. As time goes by, you gradually learn the real power of the language.

However, because Perl-bashing is popular, you might better put your efforts into learning Python, which is also quite capable.

Anonymous. Answered May 2, 2020

Perl is absolutely awesome and you should really learn it.

Don't believe the Perl-haters, you will not find a better language to write scripts, not Python, not Javascript, not anything else.

I wouldn't use it for large projects anymore, but for your quick "sort all lines of a file to make a statistic" script it is unparalleled.


[Sep 10, 2020] What esteemed monks think about changes necessary-desirable in Perl 7 outside of OO staff

Sep 10, 2020 | perlmonks.org

likbez has asked for the wisdom of the Perl Monks concerning the following question: Reputation: -1


What do esteemed monks think about changes necessary/desirable in Perl 7 outside of OO stuff? I compiled some of my suggestions and would appreciate feedback:
  1. [Highly desirable] Make a semicolon optional at the end of the line, if there is a balance of brackets on the line and the statement looks syntactically correct ("soft semicolon", the solution used in famous IBM PL/1 debugging compiler).
  2. [Highly Questionable] Introduce a pragma that specifies the max allowed length of single and double quoted strings (but not any other type of literal). That might simplify catching a missing quote (which is not a big problem with any decent Perl-aware editor anyway)
  3. [Highly desirable] Compensate for some deficiencies of using curvy brackets as the block delimiters:
    1. Treat "}:LABEL" as the bracket closing "LABEL:{" and all intermediate blocks (This idea was also first implemented in PL/1)
    2. Treat " }.. " symbol as closing all opened brackets up to the subroutine/BEGIN block level and }... including this level (closing up to the nesting level zero. ). Along with conserving vertical space, this allows search for missing closing bracket to be more efficient.
  4. Make function slightly more flexible:
    1. Introduce a pragma that allows defining synonyms for built-in functions, for example ss for substr and ix for index
    2. Allow default read access to global variables within subroutines, but write access only with their own declaration via a special pragma, for example use sunglasses;
    3. Introduce inline functions which will be expanded like macros at compile time: sub subindex inline { $_[0] = substr($_[0], index($_[0], $_[1], $_[2])) }
  5. As extracting a substring is a very frequent operation, the use of such a long name is counterproductive; it also contradicts the Perl goal of being concise and expressive (a sketch of the head/tail/trim functions proposed below appears after this list).
    1. Allow extracting a substring via ':' or '..' notation, like $line[$from:$to] (a label can't be put inside square brackets in any case)
    2. Explicitly distinguish between translation table and regular expressions by introducing tt-strings
    3. Implement head and tail functions as synonyms for substr($line,0,$len) and substr($line,-$len) respectively
      With the ability to specify a string, regex or translation table (tr style) instead of a number as the third argument: tail($line,'#') tail($line,/\s+#\w+$/) tail($line,tt/a-zA-Z/)
    4. Implement a function similar to head and tail called, for example, trim:
      trim(string, tt/left_character_set/, tt/right_character_set/);
      which deletes all characters from the first character set at the left and all characters from the second character set at the right; trim(string,,tt/right_character_set/)
      strips trailing characters only.
  6. Allow to specify and use "hyperstrings" -- strings with characters occupying any power of 2 bytes (2,4,8, ...). Unicode is just a special case of hyperstring
    1. $hyper_example1= h4/aaaa/bbbb/cccc/;
    2. $hyper_example2= h2[aa][bb][cc];
    3. $pos=index($hyper_example,h4/bbbb/cccc/)
  7. Put more attention of managing namespaces.
    1. Allow default read access for global variables, but write mode only with own declaration via special pragma, for example use sunglasses.
    2. Allow specifying a set of characters for which a variable acquires the my attribute automatically, as well as the default minimum length of non-my variables, via the pragma my (for example, variables with a length of less than three characters should always be my)
    3. Allow specifying a set of characters, starting from which a variable is considered to be own, for example [A-Z], via the pragma own.
  8. Analyze structure of text processing functions in competing scripting languages and implement several enhancements for existing functions. For example:
    1. Allow "TO" argument in index function, specifying upper range of the search.
    1. Implement delete function for strings and arrays. For example adel(@array,$from,$to) and asubstr and aindex functions. [download]
  9. Improve control statements
    1. Eliminate keyword 'given' and treat for(scalar) as a switch statement. Allow the when operator in all regular loops too:
      for($var){
         when('b'){...;} # means if ($var eq 'b') {...; last}
         when(>'c'){...;}
      } # for
    2. [Questionable] Extend last to accept labels and implement a "post-loop switch" (see Donald Knuth, "Structured Programming with go to Statements"):
      my $rc=0;
      for(...){
         if (condition1) { $rc=1; last; }
         elsif (...)     { $rc=2; last; }
      }
      if    ($rc==0) {...}
      elsif ($rc==1) {...}
      elsif ($rc==3) {...}

      Maybe (not that elegant, but more compact than the emulation above):

      for ...{
         when (...);
         when (...);
      } with switch {
         default:
         1: ...
         2: ...
      }
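A minimal sketch of what the head/tail/trim functions from item 5 might look like as plain Perl subs. The names and semantics follow the proposal above, not any existing built-ins, and the proposed tt-strings are simplified here to regex character classes:

    use strict; use warnings;

    sub head { my ($s, $len) = @_; return substr($s, 0, $len) }    # first $len characters
    sub tail { my ($s, $len) = @_; return substr($s, -$len) }      # last $len characters

    # trim: delete any characters of the left set at the start and any
    # characters of the right set at the end (sets given as character classes)
    sub trim {
        my ($s, $left, $right) = @_;
        $s =~ s/^[$left]+//  if defined $left;
        $s =~ s/[$right]+$// if defined $right;
        return $s;
    }

    print head("substring", 3), "\n";         # prints "sub"
    print tail("substring", 4), "\n";         # prints "ring"
    print trim("--data--", '-', '-'), "\n";   # prints "data"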

Corion on Sep 10, 2020 at 07:03 UTC

Re: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
Highly desirable Make a semicolon optional at the end of the line, if there is a balance of brackets on the line and the statement looks syntactically correct ("soft semicolon", the solution used in famous IBM PL/1 debugging compiler).

Why would this be highly desirable? Consider:

print( "Hello World" ) if( 1 ); [download]

versus

print( "Hello World" ) if( 1 < 2 ) { print("Goodbye"); }; [download]

Adding your change idea makes the parser even more complex and introduces weird edge cases.

I think even Javascript now recommends using semicolons instead of eliding them at the end of a line.

Update: Some examples where ASI in Javascript goes wrong: the best-known is a return statement followed by a line break -- ASI inserts a semicolon directly after return, so the value on the following line is silently never returned.

dsheroh on Sep 10, 2020 at 09:07 UTC

Re^2: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff


by dsheroh on Sep 10, 2020 at 09:07 UTC

If CRLF becomes a potential statement terminator, then breaking a single statement across multiple lines not only becomes a minefield of "will this be treated as one or multiple statements?", but the answer to that question may change depending on where in the statement the line breaks are inserted!

If implemented, this change would make a mockery of any claims that Perl 7 will just be "Perl 5 with different defaults", as well as any expectations that it could be used to run "clean" (by some definition) Perl 5 code without modification.

you !!! on Sep 10, 2020 at 21:02 UTC

Re^3: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by you !!! on Sep 10, 2020 at 21:02 UTC
If implemented, this change would make a mockery of any claims that Perl 7 will just be "Perl 5 with different defaults", as well as any expectations that it could be used to run "clean" (by some definition) Perl 5 code without modification.
Looks like a valid objection. I agree: with a certain formatting style it is possible. But do you understand that strict as the default will break a lot of old scripts too? Per your critique, it probably should not be made the default but implemented as a pragma similar to warnings and strict. You can call this pragma "softsemicolon".

What most people here do not understand is that it can be implemented completely at the lexical scanner level, without affecting the syntax analyser.

you !!! on Sep 10, 2020 at 20:45 UTC

Re^3: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by you !!! on Sep 10, 2020 at 20:45 UTC
If CRLF becomes a potential statement terminator, then breaking a single statement across multiple lines not only becomes a minefield of "will this be treated as one or multiple statements?", but the answer to that question may change depending on where in the statement the line breaks are inserted!
No. The classic solution to this problem was invented in FORTRAN in the early 50s -- it is a backslash at the end of the line. Perl can use #\ for this, as it is a pragma to the lexical scanner, not an element of the language.

Usually a long line in Perl is the initialization of an array or hash; after being split, such lines do not have balanced brackets and, as such, are not affected and do not require #\ at the end.

Question to you: how many times did you correct a missing semicolon in your Perl scripts last week? If you do not know, please count them during the next week and tell us.

GrandFather on Sep 10, 2020 at 21:19 UTC

Re^4: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by GrandFather on Sep 10, 2020 at 21:19 UTC
how many times you corrected missing semicolon in your Perl scripts the last week

After running the code - never. All the IDEs I use for all the languages I use flag missing semicolons and other similar foibles (like mismatched brackets).

There are nasty languages that I use occasionally, and even some respectable ones, that need newlines escaped to extend a statement across multiple lines. That is just nasty on so many levels. I very much agree with dsheroh that long lines are anathema. Code becomes much harder to read and understand when lines are long and statements are not chunked nicely.

Don't break what's not broken!

Optimising for fewest key strokes only makes sense transmitting to Pluto or beyond

hippo on Sep 10, 2020 at 21:46 UTC

Re^4: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO stuff
by hippo on Sep 10, 2020 at 21:46 UTC
The classic solution of this problem was invented in FORTRAN in early 50 -- it is a backslash at the end of the line.

Fortran didn't have a release until 1957 so not early 50s. Fortran prior to F90 used a continuation character at the start (column 6) of the subsequent line not the end of the previous line. The continuation character in Fortran has never been specified as a backslash. Perhaps you meant some other language?

🦛

you !!! on Sep 10, 2020 at 20:41 UTC

Re^2: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff


by you !!! on Sep 10, 2020 at 20:41 UTC Reputation: -2

Why would this be highly desirable? Consider:

print( "Hello World" ) if( 1 );

versus
print( "Hello World" )
    if( 1 < 2 ) {
         print("Goodbye");
    };
I do not understand your train of thought. In the first example the end of the line occurs when all brackets are balanced, so it will be interpreted as

print( "Hello World" ); if( 1 );

So this is a syntactically incorrect example, as it should be. The second example will be interpreted as

print( "Hello World" );
    if( 1 < 2 ) { print("Goodbye");
    };

Anonymous Monk on Sep 10, 2020 at 20:51 UTC

Re^3: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by Anonymous Monk on Sep 10, 2020 at 20:51 UTC

So this is a syntactically incorrect example, as it should be.

Wrong. print "Hello World" if 1; is valid Perl.

you !!! on Sep 10, 2020 at 21:28 UTC

Re^4: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by you !!! on Sep 10, 2020 at 21:28 UTC

That supports another critique of the same proposal -- it might break old Perl 5 scripts and so should be implemented only as an optional pragma, useful only for programmers who experience this problem.

Because even the fact that this error is universal and occurs to all programmers is disputed here.

you !!! on Sep 10, 2020 at 15:38 UTC

Re^2: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff


by you !!! on Sep 10, 2020 at 15:38 UTC Reputation: -10

Because people have a natural tendency to omit them at the end of the line. That's why.

This is an interesting psychological phenomenon that does not depend on your level of mastery of the language and is not limited to novices.

dave_the_m on Sep 10, 2020 at 18:09 UTC

Re^3: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by dave_the_m on Sep 10, 2020 at 18:09 UTC

Dave.

you !!! on Sep 10, 2020 at 20:56 UTC

Re^4: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by you !!! on Sep 10, 2020 at 20:56 UTC

Can you please tell us how many times you corrected the missing semicolon error in your scripts during the last week?

choroba on Sep 10, 2020 at 21:16 UTC

Re^5: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by choroba on Sep 10, 2020 at 21:16 UTC

See this video; it's 7 years old, but my habits haven't changed much since then.

map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]

haj on Sep 10, 2020 at 18:35 UTC

Re^3: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by haj on Sep 10, 2020 at 18:35 UTC

That's neither a natural tendency nor an interesting psychological phenomenon. You just made that up.

Semicolons at the end of a statement are as natural as a full stop "." at the end of a sentence, regardless of whether the sentence is the last in a paragraph. The verification process whether a line "looks syntactically correct" takes longer than just hitting the ";" key, and the chances of a wrong assessment of "correct" may lead to wrong behavior of the software.

Language-aware editors inform you about a missing semicolon by indenting the following line as a continuation of the statement in the previous line, so it is hard to miss.

If, on the other hand, you want to omit semicolons, then the discussion should have informed you that you aren't going to find followers.

you !!! on Sep 10, 2020 at 21:20 UTC

Re^4: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by you !!! on Sep 10, 2020 at 21:20 UTC
Semicolons at the end of a statement are as natural as a full stop "." at the end of a sentence, regardless of whether the sentence is the last in a paragraph.
I respectfully disagree, but your comment can probably explain fierce rejection of this proposal in this forum. IMHO this is a wrong analogy as the level of precision requred is different. If you analyse books in print you will find paragraphs in which full stop is missing at the end. Most people do not experience difficulties learning to put a full stop at the end of the sentence most of the time. Unfortunately this does work this way in programming languages with semicolon at the end of statement. Because what is needed is not "most of the time" but "all the time"

My view supported by some circumstantial evidence and my own practice is the this is a persistent error that arise independently of the level of qualification for most or all people, and semicolon at the end of the statement contradicts some psychological mechanism programmers have.

Tux on Sep 10, 2020 at 08:52 UTC

Re: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
  1. Highly desirable Make a semicolon optional at the end of the line
    Highly un desirable. If things to be made optional for increased readability, not this, but making braces optional for singles statement blocks. But that won't happen either.
  2. Highly Questionable Introduce pragma that specify max allowed length of single and double quoted string
    Probably already possible with a CPAN module, but who would use it? This is more something for a linter or perltidy.
  3. Highly desirable Compensate for some deficiencies of using curvy brackets as the block delimiters
    Unlikely to happen and very un undesirable. The first option is easy } # LABEL (why introduce new syntax when comments will suffice). The second is just plain illogical and uncommon in most other languages. It will confuse the hell out of every programmer.
  4. Make function slightly more flexible
    a) no b) Await the new signatures c) Macro's are unlikely to happen. See the problems they faced in Raku. Would be fun though
  5. Long function names
    Feel free to introduce a CPAN module that does all you propose. A new function for trimming has recently been introduced and spun off a lot of debate. I think none of your proposed changes in this point is likely to gain momentum.
  6. Allow to specify and use "hyperstrings"
    I have no idea what is to be gained. Eager to learn though. Can you give better examples?
  7. Put more attention of managing namespaces
    I think a) is part of the proposed OO reworks for perl7 based on Cor, b) is just plain silly, c) could be useful, but not based on letters but on sigils or interpunction, like in Raku.
  8. Analyze structure of text processing functions in competing scripting languages
    Sounds like a great idea for a CPAN module, so all that require this functionality can use it
  9. Improve control statements
    Oooooh, enter the snake pit! There be dragons here, lots of nasty dragons. We have had given/when and several switch implementations and suggestions, and so far there has been no single solution to this. We all want it, but we all have different expectations for its feature sets and behavior. Wise people are still working on it, so expect *something* at some time.

Enjoy, Have FUN! H.Merijn

you !!! on Sep 10, 2020 at 16:57 UTC

Re^2: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff


by you !!! on Sep 10, 2020 at 16:57 UTC Reputation: -2

Because }:LABEL actually forcefully closes all blocks in between, while the comment just informs you which opening bracket this closing bracket corresponds to and, as such, can be placed on the wrong closing bracket, especially if the indentation is wrong too, worsening an already bad situation.

Been there, done that.

hippo on Sep 10, 2020 at 08:34 UTC

Re: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO stuff
9. ... a. Eliminate keyword 'given'

That I can agree with. The rest of your proposals seem either unnecessary (because the facilities already exist in the language) or potentially problematic or almost without utility to me. Sorry. That's not to say you shouldn't suggest them all to p5p for further review of course - it's only the opinion of a humble monk after all.

9. ... b. ... Extend last to accept labels

I have good news: it already does
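For reference, a minimal sketch of the loop labels Perl already supports (the label names are arbitrary):

    OUTER: for my $i (0 .. 9) {
        for my $j (0 .. 9) {
            next OUTER if $j > $i;        # resume the next iteration of the outer loop
            last OUTER if $i * $j > 20;   # leave both loops at once
        }
    } # OUTER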

🦛

you !!! on Sep 10, 2020 at 15:16 UTC

Re^2: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO stuff


by you !!! on Sep 10, 2020 at 15:16 UTC Reputation: 1

What I mean is a numeric "local" label (in the Pascal style; it can be redefined later in other blocks) in the context of Knuth's idea of "continuations" outside the loop.

haj on Sep 10, 2020 at 11:00 UTC

Re: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff

That's quite some work you've invested here. I've looked at them from two perspectives:

In summary, your suggestions don't perform that great. These are rather nerdy ideas where I don't see which problem they solve. There isn't much to be added to the comments of other monks, so I'll restrict my attention to two items:

I challenge the claim that closing more than one block with one brace allows the search for a missing closing bracket to be more efficient. It just hides problems when you have lost control over your block structure. Source code editors easily allow you to jump from opening to closing brace, or to highlight matching braces, but they are extremely unlikely to support such constructs.

I challenge the claim that extracting a substring is a very frequent operation. It is not in the Perl repositories I've cloned. Many of them don't have a single occurrence of substr. Please support that claim with actual data.

you !!! on Sep 10, 2020 at 21:49 UTC

Re^2: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff


by you !!! on Sep 10, 2020 at 21:49 UTC Reputation: 0

alexander_lunev on Sep 10, 2020 at 09:02 UTC

Re: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff

Making Perl more like modern Python or JS is not an improvement to the language; you need another word for that, something like "trends" or "fashion". I see this list as a simplification of the language (and in a bad way), not an improvement. As if some newbie programmer would not want to improve himself, to get himself up to match the complexity of the language, but would blame language complexity and demand that the language complexity go down to his (low) level: "I don't want to count closing brackets, make something that will close them all", "I don't want to watch for semicolons, let the interpreter watch for the end of the sentence for me", "This complex function is hard to understand and remember how to use in a right way, give me a bunch of simple functions that will do the same as this one function, but will be easy to remember".

Making a tool simpler will not make it more powerful or more efficient; instead it could make it less efficient, because the tool will have to waste some of its power to compensate for the user's ineptitude. The interpreter would waste CPU and memory to comprehend sentence endings, these "new" closing brackets and extra function calls, and what's the gain here? I see only one: that a newbie programmer could write code with less mental effort. So it's not an improvement of a language to do more with less, but instead a change that will cause the tool to do the same with more. Is that an improvement? I don't think so.

you !!! on Sep 10, 2020 at 16:52 UTC

Re^2: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff


by you !!! on Sep 10, 2020 at 16:52 UTC Reputation: -4

As if some newbie programmer would not want to improve himself, to get himself up to match the complexity of the language, but would blame language complexity and demand that the language complexity go down to his (low) level.

The programming language should be adapted to actual use by programmers, not to some illusions of actual use under the disguise of "experts do not commit those errors." If the errors committed by programmers in a particular language are chronic, as is the case for semicolons and missing closing braces, something needs to be done about them, IMHO.

The same is true of the problem of "overexposure" of global variables. Most programmers at some point suffer from this type of bug. That's why "my" was pushed into the language.

BTW the problem of missing closing braces affects all languages which use "{" and "}" as block delimiters, and the only implementation which solved this complex problem satisfactorily was closing labels on the closing block delimiter ("}" in Perl; the begin/end pair in PL/1). Like the "missing semicolon", this is a problem from which programmers suffer independently of their level of experience with the language.

So IMHO any measures that compensate for "dangling '}' " problem and provide better coordination between opening and closing delimiters in the nested blocks would be beneficial.

Again, the problem of a missing closing brace is a chronic one. As somebody mentioned here, an editor that has "match brace" can be used to track it, but that does not solve the problem itself; it just provides a rather inefficient (for a complex script) way to troubleshoot it, and the problem arises especially often when you modify the script. I even experienced a case when the { } brace structure was syntactically correct but semantically wrong, and that was detected only after the program was moved to production. A closing label on the bracket would have prevented it.

But IMHO "my" does not go far enough, as it does not distinguish between reading and modifying a variable. And a "sunglasses" approach to the visibility of global variables might be beneficial.

choroba on Sep 10, 2020 at 17:10 UTC

Re^3: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by choroba on Sep 10, 2020 at 17:10 UTC

If you write short subroutines, as you should, you don't suffer from misplaced closing curly braces. I had problems with them, especially when doing large edits on code not written by me, but the editor always saved me.

Both puns intended.

map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]

Fletch on Sep 10, 2020 at 19:27 UTC

Re^4: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by Fletch on Sep 10, 2020 at 19:27 UTC

More or less agree WRT mismatched closing curlies. I see it pretty much entirely as an editor issue.

(I mean isn't that the whole python argument for Semantic-Whitespace-As-Grouping? At least I recall that ("Your editor will keep it straight") being seriously offered as a valid dismissal of the criticism against S-W-A-G . . .)

The cake is a lie.
The cake is a lie.
The cake is a lie.

you !!! on Sep 10, 2020 at 21:37 UTC

Re^5: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
by you !!! on Sep 10, 2020 at 21:37 UTC
I mean isn't that the whole python argument for Semantic-Whitespace-As-Grouping?
No, the argument is different, but using indentation to determine block nesting does allow closing multiple blocks at once, as a side effect. Python invented a strange mixed solution where there is an opening bracket (usually ":") but no closing bracket -- instead the indent is used as the closing bracket.

The problem is that this breaks too many other things, so here the question "is it worth it?" would be more appropriate than in the case of soft semicolons.

swl on Sep 10, 2020 at 08:54 UTC

Re: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff

WRT 5d, a trim function has recently been discussed for the core. See https://github.com/Perl/perl5/issues/17952 and https://github.com/Perl/perl5/pull/17999 .
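For context, a minimal sketch of the do-it-yourself trim that such a core function would replace (the helper name is arbitrary):

    sub trim {
        my ($s) = @_;
        $s =~ s/\A\s+|\s+\z//g;   # strip leading and trailing whitespace
        return $s;
    }
    print "[", trim("  hello  "), "]\n";   # prints [hello]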

jo37 on Sep 10, 2020 at 17:08 UTC

Re: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff
[Highly desirable] Make a semicolon optional at the end of the line, if there is a balance of brackets on the line and the statement looks syntactically correct ("soft semicolon", the solution used in the famous IBM PL/1 debugging compiler).

I feel a bit ashamed to admit that I had programmed in PL/I for several years. The reason why PL/I was so relaxed w.r.t. syntax is simple: you put your box full of punched cards on the operators' desk and you got the compiler's results the next day. If the job had failed just because of a missing semicolon, you'd lose one full day. Nowadays there is absolutely no need for such stuff.

BTW, the really fatal errors in a PL/I program resulted in a compiler warning of the kind "conversion done by subroutine call". This happened e.g. when assigning a pointer to a character array.

I wouldn't like to see any of the fancy features of PL/I in Perl. Consult your fortune database:

Speaking as someone who has delved into the intricacies of PL/I, I am sure that only Real Men could have written such a machine-hogging, cycle-grabbing, all-encompassing monster. Allocate an array and free the middle third? Sure! Why not? Multiply a character string times a bit string and assign the result to a float decimal? Go ahead! Free a controlled variable procedure parameter and reallocate it before passing it back? Overlay three different types of variable on the same memory location? Anything you say! Write a recursive macro? Well, no, but Real Men use rescan. How could a language so obviously designed and written by Real Men not be intended for Real Man use?
Greetings,
-jo

$gryYup$d0ylprbpriprrYpkJl2xyl~rzg??P~5lp2hyl0p$

perlfan on Sep 10, 2020 at 14:25 UTC

Re: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff

Currently, the big push is to turn on warnings and strict by default; I like the initially slow approach. I don't have a strong opinion about any of your suggestions (good or bad) because I see none of them as particularly disruptive. Heck, I'd be happy to have say and state available without turning them on explicitly. Ultimately, I just look forward to moving towards a more aggressive model of having new features on by default.
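For reference, a minimal sketch of the explicit opt-in currently required for say and state:

    use feature qw(say state);   # or a version bundle such as: use v5.10;
    sub counter {
        state $n = 0;            # unlike my, $n persists across calls
        return ++$n;
    }
    say counter() for 1 .. 3;    # prints 1, 2 and 3 on separate lines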

Perlbotics on Sep 10, 2020 at 19:24 UTC

Re: What esteemed monks think about changes necessary/desirable in Perl 7 outside of OO staff

I would barter all of these improvements for a less noisy but performant element accessor syntax:

i.e.

$it->{$key}->[$idx]->{section}->[$i]->{'some.doc'}->method1()->[sub1(5)]->method2($x);
# or
$it->{$key}[$idx]{section}[$i]{'some.doc'}->method1()->[sub1(5)]->method2($x);

becomes something like:

$it->@( $key $idx section $i some.doc method1() sub1(5) method2($x) );

or something smarter...

Disambiguation: if the actual element is blessed and can('method1'), it is invoked as a method. Otherwise it is treated as a function call (:: might be used for further disambiguation).

I.e. similar to Data::Diver, just more efficient, together with a pragma or other method to control auto-vivification. Yes, I am aware that I could build something similar as a module, but it would be pure Perl.


[Sep 08, 2020] A Question about Hostility Towards Perl

Notable quotes:
"... A lot of resources have been pushed into Python and The Cloud in the past decade. It seems to me that this has come at the opportunity cost of traditional Linux/Unix sysadmin skills. ..."
"... And in a lot of cases, there's no documentation, because after all, the guy was just trying to solve a problem, so why document it? It's "just for now," right? If you find yourself in this situation enough times, then it's easy to start resenting the thing that all of these pieces of code have in common: Perl. ..."
"... I'm glad you're finding Perl to be clever and useful. Because it is. And these days, there are lots of cool modules and frameworks that make it easier to write maintainable code. ..."
"... Perl was the tool of choice at the dawn of the web and as a result a lot of low to average skill coders produced a large amount of troublesome code much of which ended up being critical to business operations. ..."
"... As a Bioinformatician I also see a bunch of hostility for Perl, many people claim it is unreadable and inefficient, but as other pointed, it has a lot of flexibility and if the code is bad is because of the programmer, not the language. ..."
"... I don't hate Python but I prefer Perl because I feel more productive on it, so I don't understand people who said that Perl is far worse than Python. Both have their own advantages and disadvantages as any other computer language. ..."
"... When you look at a language like Go, it was designed to make writing good go easy and bad go hard. it's still possible to architect your program in a bad way, but at least your implementation details will generally make sense to the next person who uses it. ..."
"... I also commonly see bad code in Python or Java ..."
"... Much of the hostility comes from new programmers who have not had to work in more than one language, Part of this is the normal Kubler Ross grief cycle whenever programmers take on a legacy code base. Part of it has to do with some poorly written free code that became popular on early Web 1.0 websites from the 90's. Part of this come from the organizations where "scripting" languages are popular and their "Engineering In Place" approach to infrastructure. ..."
"... Perl might be better than Python3 for custom ETL work on large databases. ..."
"... Perl might be better than Python at just about everything. ..."
"... I use f-strings in Python for SQL and... I have no particular feelings about them. They're not the worst thing ever. Delimiters aren't awful. I'm not sure they do much more for me than scalar interpolation in Perl. ..."
"... I think Perl is objectively better, performance and toolset wise. My sense is the overhead of "objectifying" every piece of data is too much of a performance hit for high-volume database processing ..."
"... the Python paradigm of "everything is an object" introduces overhead to a process that doesn't need it and is repeated millions or billions of times, so even small latencies add up quickly. ..."
"... I think what the Perl haters are missing about the language is that Perl is FUN and USEFUL. It's a joy to code in. It accomplishes what I think was Larry Wall's primary goal in creating it: that it is linguistically expressive. There's a certain feeling of freedom when coding in it. I do get though that it's that linguistic expressiveness characteristic that makes people coming from Python/Java/JavaScript background dislike about it. ..."
"... Like you said, the way to appreciate Perl is to be aware that it is part of the Unix package. I think Larry Wall said it best: "Perl is, in intent, a cleaned up and summarized version of that wonderful semi-natural language known as 'Unix'." ..."
"... I don't know why, but people hating on Perl doesn't bother me as much as those who are adverse to Perl but fond of other languages that heavily borrow from Perl -- without acknowledging the aspects of their language that were born, and created first in Perl. Most notably regular expressions; especially Perl compatible expressions. ..."
"... My feelings is that Perl is an easy language to learn, but difficult to master. By master I don't just mean writing concise, reusable code, but I mean readable, clean, well-documented code. ..."
"... Larry Wall designed perl using a radically different approach from the conventional wisdom among the Computer Science intelligentsia, and it turned out to be wildly successful. They find this tremendously insulting: it was an attack on their turf by an outsider (a guy who kept talking about linguistics when all the cool people know that mathematical elegance is the only thing that matters). ..."
"... The CS-gang responded to this situation with what amounts to a sustained smear campaign, attacking perl at every opportunity, and pumping up Python as much as possible. ..."
"... The questionable objections to perl was not that it was useless-- the human genome project, web 1.0, why would anyone need to defend perl? The questionable objections were stuff like "that's an ugly language". ..."
"... I generally agree that weak *nix skills is part of it. People don't appreciate the fact that Perl has very tight integration with unix (fork is a top-level built in keyword) and think something like `let p :Process = new Process('ls', new CommandLineArguments(new Array<string>('-l'))` is clean and elegant. ..."
"... But also there's a lot of dumb prejudice that all techies are guilty of. Think Bing -- it's a decent search engine now ..."
"... On a completely different note, there's a lot of parallels between the fates of programming languages (and, dare I say, ideas in general ) and the gods of Terry Pratchett's Discworld. I mean, how they are born, how they compete for believers, how they dwindle, how they are reborn sometimes. ..."
"... You merely tell your conclusions and give your audience no chance to independently arrive at the same, they just have to believe you. Most of the presented facts are vague allusions, not hard and verifiable. If you cannot present your evidence and train of thought, then hardly anyone takes you serious even if the expressed opinions happen to reflect the truth. ..."
Sep 08, 2020 | www.reddit.com

A Question about Hostility Towards Perl

I like Perl, even though I struggle with it sometimes. I've slowly been pecking away at getting better at it. I'm a "the right tool for the job" kind of person and Perl really is the lowest common denominator across many OSes and Bash really has its limits. Perl still trips me up on occasion, but I find it a very clever and efficient language, so I like it.

I don't understand the hostility towards it given present reality. With the help of Perl Critic and Perl Tidy it's no longer a "write only language." I find it strange that people call it a "dead language" when it's still widely used in production.

A lot of resources have been pushed into Python and The Cloud in the past decade. It seems to me that this has come at the opportunity cost of traditional Linux/Unix sysadmin skills. Perl is part of that package, along with awk, sed, and friends along with a decent understanding of how the init system actually works, what kernel tunables do, etc.

I could be wrong, not nearly all seeming correlations are causal relationships. Am I alone in thinking a decent portion of the hostility towards Perl is a warning sign of weak sysadmin skills a decent chunk of the time?

60 Comments

m0llusk 19 points· 5 days ago

Just some thoughts:

Perl was the tool of choice at the dawn of the web and as a result a lot of low to average skill coders produced a large amount of troublesome code much of which ended up being critical to business operations. This was complicated by the fact that much early web interaction was dominated by CGI-based forms, which had many limitations, as well as by early Perl CGI modules having many quirks.

The long-term dreaming about the future that started with Perl 6 and matured into Rakudo also alienated a lot of people who had issues to resolve with the deployed base of mostly Perl 5 code.

readparse 14 points· 5 days ago

Yeah, this is where the hostility comes from. The only reason to be angry at Perl is that Perl allows you to do almost anything. And this large user base of people who weren't necessarily efficient programmers -- or even programmers at all -- people like me, that is... took up Perl on that challenge.

"OK, we'll do it HOWEVER we want."

Perl's flexibility makes it very powerful, and can also make it fairly dangerous. And whether that code continues to work or not (it generally does), somebody is inevitably going to have to come along and maintain it, and since anything goes, it can be an amazingly frustrating experience to try to piece together what the programmer was thinking.

And in a lot of cases, there's no documentation, because after all, the guy was just trying to solve a problem, so why document it? It's "just for now," right? If you find yourself in this situation enough times, then it's easy to start resenting the thing that all of these pieces of code have in common: Perl.

I'm glad you're finding Perl to be clever and useful. Because it is. And these days, there are lots of cool modules and frameworks that make it easier to write maintainable code.

lindleyw 8 points· 5 days ago

Matt's Script Archive has been the poster child for long-lived undesirable Perl coding practices for nearly a quarter century.

readparse 6 points· 5 days ago

It was also where I began to learn my craft. My coding practices improved as a learned more, but I appreciate that Matt was there at the time, offering solutions to those who needed them, when they needed them.

I also appreciate that Matt himself has spoken out about this, saying "The code you find at Matt's Script Archive is not representative of how even I would code these days."

It's easy to throw stones 25 years later, but I think he did more good than harm. That might be a minority opinion. In any case, I'm grateful for the start it gave me.

Speaking of his script archive, I believe his early scripts used cgi-lib.pl, which had a subroutine in it called ReadParse(). That is where my username comes from. It's a tribute to the subroutine that my career relied on in the early days, before I graduated to CGI.pm, before I graduated to mod_perl, before I graduated to Dancer and Nginx.

Urist_McPencil 7 points· 5 days ago

Perl was the tool of choice at the dawn of the web and as a result a lot of low to average skill coders produced a large amount of troublesome code much of which ended up being critical to business operations.

So in the context of webdev, it was JavaScript before JavaScript was a thing. No wonder people still have a bad taste in their mouth lol

LordLinxe 11 points· 5 days ago

As a Bioinformatician I also see a bunch of hostility for Perl; many people claim it is unreadable and inefficient, but as others have pointed out, it has a lot of flexibility, and if the code is bad it is because of the programmer, not the language.

I don't hate Python but I prefer Perl because I feel more productive on it, so I don't understand people who said that Perl is far worse than Python. Both have their own advantages and disadvantages as any other computer language.

fried_green_baloney 10 points· 5 days ago

unreadable

That usually means it took me two minutes to understand the Perl code, which replaces two pages of C++.

Grinnz 2 points· 4 days ago

Most of these comparisons are ultimately unfair. Context is everything, and that includes who will be writing or maintaining the code.

semi- 2 points· 4 days ago

'if the code is bad it is because of the programmer, not the language'

That's not going to make you feel any better about joining a project and having to work on a lot of badly written code, nor does it help when you need to trace through your dependencies and find that they, too, are badly written.

In the end it's entirely possible to write good Perl, but you have to go out of your way to do so. Bad Perl is just as valid and still works, so it gets used more often than good Perl.

When you look at a language like Go, it was designed to make writing good go easy and bad go hard. it's still possible to architect your program in a bad way, but at least your implementation details will generally make sense to the next person who uses it.

Personally I still really like Perl for one-off tasks that are primarily string manipulation. It's really good at that, and maintainability doesn't matter, nor does anyone else's code. For anything else, there's usually a better tool to reach for.

LordLinxe 2 points· 4 days ago

Agreed; I also commonly see bad code in Python or Java. I am tempted to learn Go too; I was looking into the syntax, but I don't have any project or requirement that needs it.

Also, Bioinformatics involves a lot of string manipulation (genes and genomes are represented as long strings), so Perl fits naturally. Hard tasks commonly use specific programs (C/Java/etc.), so you need to glue them; for that, Bash, Perl or even Python are perfectly fine.
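A toy sketch of the kind of string work meant here (the sequence is made up): a reverse complement in two lines of Perl.

    my $seq = 'ATGCGTTA';                      # a made-up DNA fragment
    (my $rc = reverse $seq) =~ tr/ACGT/TGCA/;  # reverse, then swap complementary bases
    print "$rc\n";                             # prints TAACGCAT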

crashorbit 11 points· 5 days ago
· edited 5 days ago

Much of the hostility comes from new programmers who have not had to work in more than one language. Part of this is the normal Kübler-Ross grief cycle whenever programmers take on a legacy code base. Part of it has to do with some poorly written free code that became popular on early Web 1.0 websites from the 90's. Part of it comes from the organizations where "scripting" languages are popular and their "Engineering In Place" approach to infrastructure.

And then there is the self-inflicted major version freeze for 20 years. Any normal project would have had three or more major version bumps for the amount of change between perl 5.0 and perl 5.30. Instead perl had a schism. Now perl5 teams are struggling to create the infrastructure needed to release a major version bump, even seeding the field ahead with mines just to make the bump from 5 to 7 harder.

5upertaco 8 points· 5 days ago

Perl might be better than Python3 for custom ETL work on large databases.

raevnos 13 points· 5 days ago

Perl might be better than Python at just about everything.

WesolyKubeczek 3 points· 3 days ago

What do you think about Javascript's template strings (which can be tagged for custom behaviors!) and Python's recent f-strings? Well, there's also Ruby's #{interpolation} which allows arbitrary expressions to be right there, and which existed for quite a while (maybe even inspiring similar additions elsewhere, directly or indirectly).

Having to either fall back on sprintf for readability or turn the strings into a concatenation-fest somewhat tarnishes Perl's reputation as the ideal language for text processing in this day and age.
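For what it's worth, Perl can interpolate arbitrary expressions through the @{[ ... ]} ("babycart") idiom, though it is arguably less readable than f-strings; a minimal sketch with made-up values:

    my @items = (2, 3, 5);
    # @{[ expr ]} evaluates any expression inside a double-quoted string:
    print "count=@{[ scalar @items ]} sum=@{[ $items[0] + $items[1] + $items[2] ]}\n";
    # prints: count=3 sum=10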

My other pet peeve with Perl is how fuzzy its boundaries between bytes and unicode are, and how you always need to go an extra mile to ensure the string has the exact state you expect it to be in, at all call sites which care. Basically, string handling in Perl is something that could be vastly improved for software where the bytes/characters distinction is important.
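A minimal sketch of the explicit boundary handling meant here, using the core Encode module (the sample string is made up):

    use Encode qw(decode encode);
    my $bytes = "caf\xC3\xA9";             # UTF-8 octets arriving from the outside world
    my $chars = decode('UTF-8', $bytes);   # decode at the boundary,
    print length($chars), "\n";            # so length() counts 4 characters, not 5 bytes
    my $out = encode('UTF-8', $chars);     # and encode again before writing out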

mr_chromatic 1 point· 3 days ago

What do you think about Javascript's template strings (which can be tagged for custom behaviors!) and Python's recent f-strings?

I use f-strings in Python for SQL and... I have no particular feelings about them. They're not the worst thing ever. Delimiters aren't awful. I'm not sure they do much more for me than scalar interpolation in Perl. Maybe because it's Python I'm always trying to write the most boring code ever because it feels like the language fights me when I'm not doing it Python's way.

My other pet peeve with Perl is how fuzzy its boundaries between bytes and unicode are, and how you always need to go an extra mile to ensure the string has the exact state you expect it to be in, at all callsites which care.

I'd have to know more details about what's bitten you here to have a coherent opinion. I think people should know the details of the encoding of their strings everywhere it matters, but I'm not sure that's what you mean.

WesolyKubeczek 1 point· 3 days ago

In Python's f-strings (and JS template strings) you can interpolate arbitrary expressions, thus no need to pollute the local scope with ad hoc scalars.

mr_chromatic 1 point· 2 days ago

Ad hoc scalar pollution hasn't been a problem in code I've worked on, mostly because I try to write the most boring Python code possible. I've seen Rust, Go, and plpgsql code get really ugly with lots of interpolation in formatted strings though, so I believe you.

relishketchup 5 points· 5 days ago

I think Perl is objectively better, performance- and toolset-wise. My sense is the overhead of "objectifying" every piece of data is too much of a performance hit for high-volume database processing. Just one datapoint, but as far as I know, Python doesn't support prepared statements in Postgres. psycopg2 is serviceable but a far cry from the nearly-perfect DBI. Sqlalchemy is a wonderful alternative to the also wonderful DBIx::Class, but performance-wise neither is suitable for ETL.
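For reference, a minimal DBI sketch of the prepare-once, execute-many pattern meant here (DSN, credentials, table and ids are all hypothetical):

    use DBI;
    my $dbh = DBI->connect('dbi:Pg:dbname=demo', 'user', 'secret',
                           { RaiseError => 1 });
    my $sth = $dbh->prepare('SELECT name FROM events WHERE id = ?');
    for my $id (1 .. 1000) {
        $sth->execute($id);                # parsed and planned once, re-executed here
        my ($name) = $sth->fetchrow_array;
    }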

WesolyKubeczek 2 points· 4 days ago

I think Perl is objectively better, performance and toolset wise. My sense is the overhead of "objectifying" every piece of data is too much of a performance hit for high-volume database processing.

It's because objects are not first-class in Perl. Python's objects run circles around plain Perl blessed hashes, and we aren't even talking about Moose at this point.

relishketchup 2 points· 4 days ago

The point is, for ETL in particular, the Python paradigm of "everything is an object" introduces overhead to a process that doesn't need it and is repeated millions or billions of times, so even small latencies add up quickly. And, if you are using psycopg2, the lack of prepared statements adds yet more latency. This is a very specific use case where Perl is unequivocally better.

WesolyKubeczek 3 points· 4 days ago

Do you have any measurements to back this overhead assertion, or are you just imagining it would be slower, because objects? int, str and float "objects" in Python are objects indeed, but they are also optimized highly enough to be on par with, if not, dare I say, faster than their Perl counterparts.

Also, you can run "PREPARE x AS SELECT" in psycopg2; it does not actively prevent you from doing so. I also bet the author would add this functionality if someone paid him, but even big corporations tend to be on the "all take and no give" side, which shouldn't come as news anyway.

Before you say "but it's inconvenient!", I'd really like my exceptions, context managers and generator functions in Perl, in a readable and caveat-less style, thank you very much, before we can continue our discussion about readability.

robdelacruz 5 points· 5 days ago

I think what the Perl haters are missing about the language is that Perl is FUN and USEFUL. It's a joy to code in. It accomplishes what I think was Larry Wall's primary goal in creating it: that it is linguistically expressive. There's a certain feeling of freedom when coding in it. I do get, though, that it's that linguistic expressiveness that makes people coming from a Python/Java/JavaScript background dislike it.

Like you said, the way to appreciate Perl is to be aware that it is part of the Unix package. I think Larry Wall said it best: "Perl is, in intent, a cleaned up and summarized version of that wonderful semi-natural language known as 'Unix'."

unphamiliarterritory 5 points· 5 days ago

I don't know why, but people hating on Perl doesn't bother me as much as those who are averse to Perl but fond of other languages that heavily borrow from Perl -- without acknowledging the aspects of their language that were born, and created first, in Perl. Most notably regular expressions; especially Perl-compatible expressions.

Strange thing perhaps to be annoyed about.

WesolyKubeczek 1 point· 4 days ago

What else is there, except for regular expressions, that wasn't borrowed by Perl itself from awk, C, or Algol?

Ruby, for example, does state quite plainly that it aims to absorb the good parts of Perl without its warts, and add good things of its own. So you have, for example, the "unless" keyword in it, as well as postfix conditionals, which are exceptionally good for guard clauses, IMO.

PHP started specifically as "Perl-lite", thus it borrowed a lot from Perl, variables having the $ sigil in front of them are taken specifically from Perl, nobody is denying that.

This doesn't mean this cross-pollination should ever stop, or that all other languages suddenly need to start paying tribute for things that might have been inspired by Perl. Making every little user on the internets acknowledge that this or that appeared in Perl first does little, alas, to make Perl better and catch up to what the world is doing today.

It's very much like modern Greeks are so enamored with their glorious past, Alexander the Great, putting a lot of effort into preserving their ancient history, and to remind the world about how glorious the ancient Greeks were while the barbarians of Europe were all unwashed and uncivilized, that they forget to build a glorious present and future.

Also an interesting quote from the Man Himself in 1995:

I certainly "borrowed" some OO ideas from Python, but it would be inaccurate to claim either that Perl borrowed all of Python's OO stuff, or that all of Perl's OO stuff is borrowed from Python.

Looking at Perl's OO system, I find myself mildly surprised, because it's nothing like Python's. But here you have, cross-pollination at work.

unphamiliarterritory 3 points· 5 days ago

My feeling is that Perl is an easy language to learn, but difficult to master. By master I don't just mean writing concise, reusable code; I mean readable, clean, well-documented code.

I can count on one hand the Perl developers I've known who really write such clean Perl code. I feel the freehand style of Perl has been a double-edged sword: with freedom comes, for many developers, a relaxed sense of responsibility.

I feel that the vast amount of poorly written code in the wild, which has earned Perl (as we've all heard at one time or another) the dubious honor of being the "duct tape and baling wire" language that glues the IT world together, has caused a lot of people to be biased against the language as a whole.

doomvox 3 points· 5 days ago
· edited 2 days ago

Larry Wall designed perl using a radically different approach from the conventional wisdom among the Computer Science intelligentsia, and it turned out to be wildly successful. They find this tremendously insulting: it was an attack on their turf by an outsider (a guy who kept talking about linguistics when all the cool people know that mathematical elegance is the only thing that matters).

You would have to say that Larry Wall has conceded he initially overdid some things, and the perl5 developers later set about fixing them as well as they could, but perl's detractors always seem to be unaware of these fixes: they've never heard of "use strict", and they certainly haven't ever heard of //x extended regex syntax.

The CS-gang responded to this situation with what amounts to a sustained smear campaign, attacking perl at every opportunity, and pumping up Python as much as possible.

Any attempt at understanding this situation is going to fail if you try to understand it on anything like rational grounds -- e.g., might there be some virtue in Python's rigid syntax? Maybe, but it can't possibly have been a big enough advantage to justify re-writing all of CPAN.

WesolyKubeczek 1 point· 3 days ago

You described it as if there had been a religious war, or a conspiracy, and not simple honest-to-god pragmatism at work. People have work to do, that's all there is to it.

doomvox 1 point· 2 days ago

and not simple honest-to-god pragmatism

Because I really don't think it was. I was there, and I've been around for quite some time, and I've watched many technical fads take off before there was any there there which then had people scrambling to back-fill the Latest Thing with enough of a foundation to keep working with it. Because once one gets going, it can't be stopped without an admission we were wrong again.

The questionable objections to perl were not that it was useless -- the human genome project, web 1.0, why would anyone need to defend perl? The questionable objections were stuff like "that's an ugly language".

ganjaptics 2 points· 5 days ago

I generally agree that weak *nix skills is part of it. People don't appreciate the fact that Perl has very tight integration with unix (fork is a top-level built-in keyword) and think something like `let p :Process = new Process('ls', new CommandLineArguments(new Array<string>('-l')))` is clean and elegant.
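For reference, a minimal sketch of that tight integration, with fork, exec and waitpid as bare builtins (the command is just an example):

    my $pid = fork() // die "fork failed: $!";
    if ($pid == 0) {
        exec 'ls', '-l' or die "exec failed: $!";   # child: replace itself with ls -l
    }
    waitpid($pid, 0);                               # parent: wait for the child to finish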

But also there's a lot of dumb prejudice that all techies are guilty of. Think Bing -- it's a decent search engine now, but everyone who has ever opened a terminal window thinks ill of it because it had a shaky first few years. Perl 4, and early Perl 5 which looked like Perl 4, was basically our "Bing".

WesolyKubeczek 1 point· 15 hours ago

On a completely different note, there's a lot of parallels between the fates of programming languages (and, dare I say, ideas in general ) and the gods of Terry Pratchett's Discworld. I mean, how they are born, how they compete for believers, how they dwindle, how they are reborn sometimes.

(Take it with a grain of salt, of course. But I generally like the idea of ideas being a kind of lifeforms unto themselves, of which we the minds are mere medium.)

s-ro_mojosa 1 point· 15 hours ago

I follow the analogy. Ideas are in some sense "alive" (and tend to follow a viral model in my view) to a great extent. I have not read Discworld, so the rest I do not follow.

Can you spell it out for me, especially any ideas you have about a Perl renaissance? I have a gut sense Perl is at the very start of one, but feel free to burst my bubble if you think I'm too far wrong.

WesolyKubeczek 0 points· 4 days ago

It was the year 2020.

In Perl, length(@list) still wasn't doing what any reasonable person would expect it to do.

CPAN was full of zombie modules. There is a maintainer who is apparently unaware what "deprecated" means, so you have a lot of useful and widely-used modules DEPRECATED without any alternative.

So far, there only exists one Perl implementation. Can you compile it down to Javascript so there is a way to run honest-to-god Perl in a browser? Many languages can do it without resorting to emscripten or WebAssembly.

I'm not aware of any new Perl 5 books which would promote good programming style, instead of trying to enamor newbies with the cuteness of "this string of sigils you can't make heads or tails of prints 'Yet another Perl hacker!' How powerful!". Heck, I'm not aware of any Perl books published in the last decade at all. So much for teaching good practices to newbies, eh?

Perl is a packaging nightmare, compared to Python (and Python is a nightmare too in this regard, but a far far more manageable one), Javascript (somewhat better), or Go. It takes a lot of mental gymnastics to make Perl CI-friendly and reproducible-process-friendly (not the same as reproducible-builds-friendly, but still a neat thing to have in this day and age).

Tell me about new Perl projects that started in 2020, and about teams who can count their money who would consciously choose Perl for new code.

And the community. There are feuds between that one developer and the world, between that other developer and the world, and it just so happens that those two wrote things 90% of functioning CPAN mass depends on one way or the other.

I don't hate Perl (it makes me decent money, why would I?), so much as I have a pity for it. It's a little engine that could, but not the whole way up.

mr_chromatic 6 points· 4 days ago

In Perl, length(@list) still wasn't doing what any reasonable person would expect it to do.

I'm not aware of any new Perl 5 books which would promote good programming style, instead of trying to enamor newbies with the cuteness of "this string of sigils you can't make heads or tails of prints 'Yet another Perl hacker!' How powerful!". Heck, I'm not aware of any Perl books published in the last decade at all.

I find neither of these points compelling.

raevnos 3 points· 4 days ago

The second one just shows that he hasn't heard of Modern Perl. (Which could use a new edition, granted, as the last one is from 2015)

mr_chromatic 4 points· 4 days ago

Which could use a new edition, granted, as the last one is from 2015

There are a couple of things I'd like to add in a new edition (postfix dereferencing, signatures), but I might wait to see if there's more clarification around Perl 7 first.

daxim 4 points· 4 days ago

You really need to explain how you arrive at the claim that "Perl is a packaging nightmare" – I am a packager – and also be less vague about the other things you mention. It is not possible to tell whether you are calibrated accurately against reality.

so there is a way to run honest-to-god Perl in a browser?

https://old.reddit.com/r/perl/comments/9mj63s/201841_merged_the_js_weekly_changes_in_and_around/e7kfw9r/

WesolyKubeczek 1 point· 4 days ago

It is a packaging nightmare if you have a sizable codebase to deploy somewhere. This kind of packaging.

Grinnz 4 points· 4 days ago

https://metacpan.org/pod/Carton
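For readers who haven't used it: Carton resolves the dependencies declared in a cpanfile and freezes the result in cpanfile.snapshot for reproducible installs. A minimal sketch (module names and versions are just examples):

    # cpanfile -- declare first-order dependencies
    requires 'Mojolicious', '>= 8.0';
    requires 'DBI';
    requires 'Some::XS::Module', '== 1.23';   # a hypothetical pinned dependency

    # carton install             resolves deps and writes cpanfile.snapshot
    # carton exec -- perl app.pl runs against the frozen tree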

daxim 2 points· 3 days ago

You merely tell your conclusions and give your audience no chance to independently arrive at the same, they just have to believe you. Most of the presented facts are vague allusions, not hard and verifiable. If you cannot present your evidence and train of thought, then hardly anyone takes you serious even if the expressed opinions happen to reflect the truth.

WesolyKubeczek 3 points· 3 days ago
· edited 3 days ago

The tooling around CPAN, including cpan and cpanm alike, last that I looked at them, did a depth-first dependency resolution. So, the first thing is a module A. It depends on module B, get the latest version of that. It depends on C, get the latest version of that. Finally, install C, B, and A. Next on the dependency list is module D, which depends on module E, which wants a particular, not the latest, version of B. But B is already installed and at the wrong version! So cpan and cpanm alike will just give up at this point, leaving my $INSTALL_BASE in a broken state.

Note that we're talking about second- and third-order dependencies here, in the ideal world, I'd prefer I didn't have to know about them. In a particular codebase I'm working on, there are 130 first-order dependencies already.

Carton, I see, is trying to sidestep the dependency resolution of cpanm. Good, but not good enough once your codebase depends on Image::Magick, which the one I care about does. You cannot install it from CPAN straight, not if you want a non-ancient version.

So I had to write another tool that is able to do a breadth-first search when resolving dependencies, so that I either get an installable set of modules or an error before it messes up the $INSTALL_BASE. In the process, I've learned a lot about the ecosystem in the category of "how sausages are made": for example, in late 2017 and early 2018, ExtUtils::Depends, according to MetaCPAN, was provided by Perl-Gtk. Look it up if you don't believe me; ask the MetaCPAN explorer this:

/module/_search?q=module.name.lowercase:"extutils::depends" AND maturity:released&sort=version_numified:desc&fields=module.name,distribution,date

The first entry rectifies the problem, but it was released in 2019. In 2018, MetaCPAN thought that ExtUtils::Depends was best served by Perl-Gtk. Also, to this day, MetaCPAN thinks that the best distribution to contain Devel::CheckLib is Sereal-Path.

Oh, and I wanted the ability to apply arbitrary patches to distributions -- patches which fix issues the maintainers can't be bothered to apply, or which remove annoyances. Not globally, like cpan+distroprefs does, but per-project. (Does Carton even work with distroprefs or things resembling them?) Yes, I know I can vendor in a dependency, but it's a third-order dependency, and why should I even bother for a one-liner patch?

Now, the deal is, I needed a tool that installs for me everything the project depends on, and does it right on the first try, because the primary user of the tool is CI, and there are few things I hate more than alarms about broken, flaky builds. Neither Carton, nor cpanminus, nor cpan could deliver on this. Maybe they do today, but I simply don't care anymore; good for them if they do. I came away with a very strong impression that the available Perl tooling is still firmly in the land of sysadmin practices from the 1990s, and it's going to take a long while before workflows that other stacks take for granted today arrive there.

P.S. I don't particularly care how seriously, if at all, I'm taken here. Questions were asked, so I'm relating my experience. Since comments which express a dislike of particular language warts but don't carry the disclaimer "don't get me wrong, I ABSOLUTELY LOVE Perl" get punished by downvoting here, I feel that the judgment may be mutual.

daxim 3 points· 2 days ago

I am glad that you wrote a post with substance, thank you for taking the time to gather the information. That makes it possible to comment on it and further the insight and reply with corrections where appropriate. From my point of view the situation is not as bad as you made it out to be initially, let me know what you think.

punished by downvoting

I didn't downvote you because I despise downvote abuse; it's a huge problem on Reddit and this subreddit is no exception. I upvoted the top post to prevent it from becoming hidden.

dependency resolution

You got into the proverbial weeds by trying to shove the square peg into the round hole, heaping work-around onto work-around. You should have noticed that when you "had to write another tool"; that would be the point in time to stop and realise you need to ask experts for advice. They would have put you on the right track: OS-level packaging. That's what I use, too, and it works pretty well, especially compared with other programming languages and their respective library archives. Each dep is built in a clean fakeroot, so it is impossible to run into "$INSTALL_BASE in a broken state". Image::Magick is not a problem because it's already packaged, and even a library in a similarly broken state can be packaged straightforwardly because OS-level packaging does not care about CPAN releases per se. Module E depending on a certain version of B is not a problem because a repo can contain many versions of B and the OS packager tool will resolve it appropriately at install time. Per-project patches are not a problem because patches are a built-in feature of the packaging toolchain, and one can set up self-contained repos for packages made from a patched dist if they should not be used in the general case.

MetaCPAN explorer

I hope you reported those two bugs. Using that API to resolve deps is a large blunder. cpan uses the index files, cpanm uses cpanmetadb.

sysadmin practices from the 1990s

There's nothing wrong with them, these practices do not lose value just by time passing by, the practices and corresponding software tools are continually updated over the years, and the fundamentals are applicable with innovative packaging targets (e.g. CI environments or open containers).

I simply don't care anymore

I omit the details showing how to do packaging properly.

WesolyKubeczek 1 point· 2 days ago

Re: OS packaging versus per-project silos. The latter approach is winning now. There's ongoing pressure that OS-packaged scripting runtimes (like Python, Ruby, and Perl) should only be used by the OS-provided software which happens to depend on them. I think I've read something about it even here on this sub.

And I'll tell you why. By depending on OS-provided versions of modules, you basically cast your fate to Canonical, or Red Hat, or whoever else is maintaining the OS, and they don't really care that what they thought was a minor upgrade broke your code (say, Crypt::PW44 changed its interface when moving from 0.13 to 0.14, how could anyone even suspect?). You are too small a fish for them to care. They go with whatever the upstream supplies. And you have better things to do than adapt whenever your underlying OS changes things behind your back. Keeping a balance between having to rebuild what's needed in response to key system libraries moving can be work enough.

There's also this problem when a package you absolutely need becomes orphaned by the distribution.

So any sane project with bigger codebases will keep their own dependency tree. Not being married to a particular OS distribution helps, too.

So, keeping your own dependency tree is sort of an established pattern now. Try to suggest to someone who maintains Node-based software that they use what, for example, CentOS provides in RPMs instead of directly using NPM. They will die of laughter, I'm afraid.

Re: sysadmin practices from 1990s. They are alive and well, the problem is that they are not nearly enough in the world of cloud where you can bring a hundred servers to life with an API call and tear down other 100 with another API call. "Sysadmin practices from 1990s" assume very dedicated humans who work with physical machines and know names of each of them, think a university lab. Today, you usually need a superset of these skills and practices. You need to manage machines with other machines. Perl's tooling could be vastly improved in this regard.

Re: CPAN, MetaCPAN, cpanmetadb and friends. So I'm getting confused. Which metadata database is authoritative, the most accurate and reliable for package metadata retrieval, including historical data? MetaCPAN, despite its pain points, looks the most complete so far. cpanmetadb doesn't have some of MetaCPAN's bugs, but I'm wary of it as it looks like a one-man show (one brilliant man's show, but still) and consists of archiving package metadata files as they change.

Also, if one Marc Lehmann provides such metadata in Gtk-Perl that MetaCPAN honestly starts thinking it provides ExtUtils::Depends (which is by the same author, so fair enough), I don't think there is anything that can be done about it. When I pointed out those things, I was lamenting the state of the ecosystem as such more than any tool. With MetaCPAN, my biggest peeve is that they use ElasticSearch as the database, which is wrong on many levels (like 404-ing legit packages because the search index died, WTF? It also appears anyone on the internets can purge the cache with a curl command, WTF???)

Grinnz 1 point· 2 days ago
· edited 2 days ago

The MetaCPAN API is not the canonical way to resolve module dependencies, and is not used by CPAN clients normally, only used by cpanm when a specific version or dev release of a module is requested. See https://cpanmeta.grinnz.com/packages for a way to search the 02packages index, which is canonical.

I understand you are beyond this experience, but for anyone who runs into similar problems and wants guidance for the correct solutions, please ask the experts at #toolchain on irc.perl.org (politely, we are all volunteers) before digging yourself further holes.

Grinnz 3 points· 4 days ago

In Perl, length(@list) still wasn't doing what any reasonable person would expect it to do.

I would be quite against overloading length to have a different effect when passed an array. But I don't disagree that "array in scalar context" being the standard way to get its size is unintuitive.
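For reference, a minimal sketch of the idioms in question (the array is made up):

    my @list = (10, 20, 30);
    print scalar(@list), "\n";   # 3 -- the idiomatic size: force scalar context
    my $n = @list;               # 3 -- scalar assignment also gives scalar context
    # length(@list) instead stringifies the count ("3") and measures that string,
    # which is why recent perls warn about it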

davorg 2 points· 4 days ago

Heck, I'm not aware of any Perl books published in the last decade at all.

Perl School

tarje 1 point· 2 days ago

Those are all self-published. Apress has published two Perl books in 2020 and two in 2019.

davorg 2 points· 2 days ago
· edited 2 days ago

Here is one of the Perl books published by Apress in 2019 - Beginning Perl Programming .

Looking at the preview, I see:

Looking at the table of contents, I see he calls hashes "associative array variables" (a term that Perl stopped using when Perl 5 was released in 1994).

This is not a book that I would recommend to anyone.

Update: And here's Pro Perl Programming by the same author. In the preview, he uses bareword filehandles instead of the lexical variables that have been recommended for about fifteen years.

davorg 1 point· 2 days ago

I know the Perl School books are self-published. I published them :-)

WesolyKubeczek 1 point· 1 day ago

Which of those would you recommend to beginners? Those are usually the people who would benefit the most from a book.

davorg 1 point· 1 day ago
· edited 1 day ago

Perl Taster is aimed at complete beginners.

szabgab 2 points· 3 days ago

length actually works very well on arrays. It gives this warning:

length() used on @event_ids (did you mean "scalar(@event_ids)"?)

This message just saved my failing memory.

frogspa 7 points· 5 days ago

Crosspost this on r/programming and see what the response is.

codon011 6 points· 5 days ago

I have run into developers who actively loathe Perl for its "line noise" quality. Unfortunately, I think they've mostly only ever encountered bad Perl, to which they would respond, "Is there any other kind of Perl?"

[Sep 05, 2020] C++ vs. Python vs. Perl vs. PHP performance benchmark (2016) by Ivan Zahariev

For a Perl-type problem (scanning and parsing big files), Perl is very fast.
Feb 09, 2016 | blog.famzah.net

47 Comments

The benchmarks here do not try to be complete, as they show the performance of the languages in one aspect only, mainly: loops, dynamic arrays with numbers, and basic math operations.

This is a redo of the tests done in previous years. You are strongly encouraged to read the additional information about the tests in the article.

Here are the benchmark results:

Language                    User     System   Total     vs C++   vs prev   Version      Source
C++ (optimized with -O2)     0.952    0.172     1.124      --       --     g++ 5.3.1    link
Java 8 (non-std lib)         1.332    0.096     1.428     27%      27%     1.8.0_72     link
Python 2.7 + PyPy            1.560    0.160     1.720     53%      20%     PyPy 4.0.1   link
Javascript (nodejs)          1.524    0.516     2.040     81%      19%     4.2.6        link
C++ (not optimized)          2.988    0.168     3.156    181%      55%     g++ 5.3.1    link
PHP 7.0                      6.524    0.184     6.708    497%     113%     7.0.2        link
Java 8                      14.616    0.908    15.524   1281%     131%     1.8.0_72     link
Python 3.5                  18.656    0.348    19.004   1591%      22%     3.5.1        link
Python 2.7                  20.776    0.336    21.112   1778%      11%     2.7.11       link
Perl                        25.044    0.236    25.280   2149%      20%     5.22.1       link
PHP 5.6                     66.444    2.340    68.784   6020%     172%     5.6.17       link

(CPU times are in seconds; "vs C++" and "vs prev" show how much slower each total time is than optimized C++ and than the previous row.)

[Aug 13, 2020] Perl is dying quick. Could be extinct by 2023. The HFT Guy

Aug 13, 2020 | thehftguy.com
  1. Curt J. Sampson says: 7 OCTOBER 2019 AT 09:27

    "One of the first programming languages." Wow. That kinda dismisses about 30 years of programming language history before Perl, and at least a couple of dozen major languages, including LISP, FORTRAN, Algol, BASIC, PL/1, Pascal, Smalltalk, ML, FORTH, Bourne shell and AWK, just off the top of my head. Most of what exists in today's common (and even not-so-common) programming languages was invented before Perl.

    That said, I know you're arm-waving the history here, and those details are not really part of the point of your post. But I do have a few comments on the meat of your post.

Perl is a bit punctuation- and magic-variable-heavy, but is far from unique in being so. One example I just happened to be looking at today is VTL-2 ("A Very Tiny Language") which, admittedly, ran under unusually heavy memory constraints (a 768-byte interpreter able to run not utterly trivial programs in a total of 1 KB of memory). This uses reading from and assignment to special "magic" variables for various functions. X=? would read a character or number from the terminal and assign it to X; ?=X would print that value. # was the current execution line; #=300 would goto line 300. Comparisons returned 0 or 1, so #=(X=25)*50 was, "If X is equal to 25, goto line 50."

    Nor is Perl at all exotic if you look at its antecedents. Much of its syntax and semantics are inspired by Bourne shell, AWK and similar languages, and a number of these ideas were even carried forward into Ruby. Various parts of that style (magic variables, punctuation prefixes/suffixes determining variable type, automatic variable interpolation in strings, etc.) have been slowly but steadily going out of style since the 70s, for good reasons, but those also came into existence for good reasons and were not at all unique to Perl. Perl may look exotic now, but to someone who had been scripting on Unix in the 80s and 90s, Perl was very comfortable because it was full of common idioms that they were already familiar with.

    I'm not sure what you mean by Perl "[not supporting] functions with arguments"; functions work the same way that they work in other languages, defined with sub foo { ... } and taking parameters; as with Bourne shell, the parameters need not be declared in the definition. It's far from the only language where parentheses need not be used to delimit parameters when calling a function. Further, it's got fairly good functional and object-oriented programming support.

    I'm not a huge fan of Perl (though I was back in the early '90s), nor do I think its decline is unwarranted (Ruby is probably a better language to use now if you want to program in that style), but I don't think you give it a fair shake here.

    Nor does its decline, along with COBOL's and Delphi's, have anything to do with age. Consider LISP, which is much older, arguably weirder, and yet is seeing if anything a resurgence of popularity (e.g., Clojure) in the last ten years.


  2. thehftguy says: 7 OCTOBER 2019 AT 13:51

    There are many languages indeed. Speaking from a career-oriented, professional perspective here: it could be quite difficult to make a career today out of those.

    About functions. What I mean is that Perl doesn't do functions with arguments like current languages do. "func myfunction(arg1, arg2, arg3)."
    It's correct to say that Perl has full support for routines and parameters, it does and even in multiple ways, but it's not comparable to what is in mainstream languages today.


    • dex4er says: 7 OCTOBER 2019 AT 15:59

      Of course Perl supports function arguments. I think since 2015. It is in the official documentation: https://perldoc.perl.org/5.30.0/perlsub.html#Signatures

      I can understand that you don't like Perl as a language, but that doesn't mean you should write misconceptions about it.

      Personally I think Perl won't go anywhere. Nobody wants to rewrite existing scripts that are used by system tools, i.e. dpkg utilities in Debian or Linux kernel profiling stuff. As a real scripting language for basic system tasks it is still good enough, and you probably won't find a better replacement.

      And nobody uses the CGI module from Perl in 2019. Really.


    • Curt J. Sampson says: 7 OCTOBER 2019 AT 16:21

      I see by "functions with arguments" you mean specifically call-site checking against a prototype. By that definition you can just as well argue that Python and Ruby "don't support functions with arguments" because they also don't do call-site checking against prototypes in the way that C and Java do, instead letting you pass a string to a function expecting an integer and waiting until it gets however much further down the call stack before generating an exception.

      "Dynamic" languages all rely to some degree or other on runtime checks; how and what you check is something you weigh against other tradeoffs in the language design. If you were saying that you don't like the syntax of sub myfunction { my ($arg1, $arg2, $arg3) = @_; ...} as compared to def myfunction(arg1, arg2, arg3): ... that would be fair enough, but going so far as to say "Perl doesn't support functions with arguments" is at best highly misleading and at worst flat-out wrong. Particularly when Perl does have prototypes with more call site checking than Python or Ruby do, albeit as part of a language feature for doing things that neither those nor any other language you mention support.

      In fact, many languages even deliberately provide support to remove parameter count checks and get Perl's @_ semantics. Python programmers regularly use def f(*args): ... ; C uses the more awkward varargs.
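
      For comparison, here is a minimal Python sketch of the def f(*args) pattern mentioned above; the function name total is made up for illustration:

      def total(*args):
          # args arrives as a tuple of whatever was passed, with no
          # call-site arity checking -- much like Perl's @_.
          return sum(args)

      print(total(1, 2))        # 3
      print(total(1, 2, 3, 4))  # 10 -- extra arguments are silently accepted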

      And again I reiterate (perhaps more clearly this time): Perl was in no way "genuinely unique and exotic" when it was introduced; it brought together and built on a bunch of standard language features from various languages that anybody programming on Unix above the level of C in the 80s and 90s was already very familiar with.

      Also, I forgot to mention this in my previous comment, but neither Python nor Perl have ever been required by POSIX (or even mentioned by it, as far as I know), nor did Python always come pre-installed on Linux distributions. Also, it seems unlikely to be a "matter of time" until Python gets removed from the default Ubuntu install since Snappy and other Canonical tools are written in it.

      There are plenty of folks making a career out of Clojure, which is one flavour of LISP, these days. According to your metric, Google Trends, it overtook OCaml years ago, and seems to be trending roughly even, which is better than Haskell is doing.


    • anon says: 7 OCTOBER 2019 AT 16:40

      @thehftguy what are you talking about? `$ perl -E 'use v5.30; use feature qw(signatures); sub foo ($a) { say $a }; foo(1)'`

[Aug 13, 2020] Now I Am Become Perl

Aug 13, 2020 | vector-of-bool.github.io

Now I Am Become Perl

Oct 31, 2018

Destroyer of verbosity.

A Defence of Terseness

Perl gets picked on for its syntax. It is able to represent very complex programs with minimalist tokens. A jumble of punctuation can serve to represent an intricate program. This is trivial terseness in comparison to programming languages like APL (or its later ASCII-suitable descendants, such as J), where not a single character is wasted.

The Learning Curb

Something can be said for terseness. Rust, having chosen fn to denote functions, seems to have hit a balance in that regard. There is very little confusion over what fn means these days, and a simple explanation can immediately alleviate any confusion. Don't confuse initial confusion with permanent confusion. Once you get over that initial "curb" of confusion, we don't have to worry any more.

Foreign != Confusing

You'll also find that, when encountering a new syntax, you will immediately fail to understand it and instead wish for something much simpler. Non-C++ programmers, for example, will raise an eyebrow at the following snippet:

[&, foo](auto&& item) mutable -> int { return item + foo.bar(something); }

I remember my first encounter with C++ lambdas, and I absolutely hated the syntax. It was foreign and unfamiliar, but other than that, my complaints stopped. I could have said "This is confusing," but after having written C++ lambda expressions for years the above syntax has become second nature and very intuitive. Do not confuse familiarity with simplicity.

Explicit is Better than Implicit

except when it is needlessly verbose.

Consider the following code:

template <typename T, typename U, int N>
class some_class {};

Pretty straightforward, right?

Now consider this:

class<T, U, int N> some_class {};

Whoa, that's not C++!

Sure, but it could be, if someone were convinced enough that it warranted a proposal, though I doubt it will happen any time soon.

So, you know it isn't valid C++, but do you know what the code means? I'd wager that the second example is quite clear to almost all readers. It's semantically identical to the former example, but significantly terser. It's visually distinct from any existing C++ construct, yet when shown the two "equivalent" code samples side-by-side you can immediately cross-correlate them to understand what I'm trying to convey.

There's a lot of bemoaning the verbosity of C++ class templates, especially in comparison to the syntax of generics in other languages. While they don't map identically, a lot of the template syntax is visual noise that was inserted to be "explicit" about what was going on, so as not to confuse a reader that didn't understand how template syntax works.

The template syntax, despite being an expert-friendly feature, uses a beginner-friendly syntax. As someone who writes a lot of C++ templates, I've often wished for terseness in this regard.

foo and bar considered harmful.

Consider this:

auto foo = frombulate();
std::sort(
    foo.begin(),
    foo.end(),
    [](auto&& lhs, auto&& rhs) {
        return lhs.bar() < rhs.bar();
    }
);

What?

What does the code even do? Obviously auto is harmful. It's completely obscuring the meaning of our code! Let's fix that by adding explicit types:

std::vector<data::person> foo = frombulate();
std::sort(
    foo.begin(),
    foo.end(),
    [](const data::person& lhs, const data::person& rhs) {
        return lhs.bar() < rhs.bar();
    }
);

Looking at the API for data::person, we can see that bar() is a deprecated alias of name(), and frombulate() is deprecated in favor of get_people(). And using the name foo to refer to a sequence of data::person seems silly. We have an English plural: people. Okay, let's fix all those things too:

std::vector<data::person> people = get_people();
std::sort(
    people.begin(),
    people.end(),
    [](const data::person& lhs, const data::person& rhs) {
        return lhs.name() < rhs.name();
    }
);

Perfect! We now know exactly what we're doing: sorting a list of people by name.

Crazy idea, though: let's put those autos back in and see what happens:

auto people = get_people();
std::sort(
    people.begin(),
    people.end(),
    [](auto&& lhs, auto&& rhs) {
        return lhs.name() < rhs.name();
    }
);

Oh no! Our code has suddenly become unreadable again and oh.

Oh wait.

No, it's just fine. We can see that we're sorting a list of people by their name. No explicit types needed. We can see perfectly well what's going on here. Using foo and bar while demonstrating why some syntax/semantics are bad is muddying the water. No one writes foo and bar in real production-ready code. (If you do, please don't send me any pull requests.)

Even Terser?

std::sort in the above example takes an iterator pair to represent a "range" of items to iterate over. Iterators are pretty cool, but the case of "iterate the whole thing" is common enough to warrant "we want ranges." Dealing with iterables should be straightforward and simple. With ranges, the iterator pair is extracted implicitly, and we might write the above code like this:

auto people = get_people();
std::sort(
    people,
    [](auto&& lhs, auto&& rhs) {
        return lhs.name() < rhs.name();
    }
);

That's cool! And we could even make it shorter (even fitting the whole sort() call on a single line) using an expression lambda:

auto people = get_people();
std::sort(people, [][&1.name() < &2.name()]);

What? You haven't seen this syntax before? Don't worry, you're not alone: I made it up. The &1 means "the first argument", and &2 means "the second argument."

Note: I'm going to be using range-based algorithms for the remainder of this post, just to follow the running theme of terseness.

A Modest Proposal: Expression Lambdas

If my attempt has been successful, you did not recoil in horror and disgust at the sight of my made-up "expression lambda" syntax:

[][&1.name() < &2.name()]

Here's what I hope:

Yes, the lead-in paragraphs were me buttering you up in preparation for me to unveil the horror and beauty of "expression lambdas."

Prior Art?

But Vector, this is just Abbreviated Lambdas!

I am aware of the abbreviated lambdas proposals, and I am aware that they were shot down as (paraphrasing) "they did not offer sufficient benefit for their added cost and complexity."

Besides that, "expression lambdas" are not abbreviated lambdas. Rather, the original proposal document cites this style as "hyper-abbreviated" lambdas. The original authors note that their abbreviated lambda syntax "is about as abbreviated as you can get, without loss of clarity or functionality." I take that as a challenge.

For one, I'd note that all their examples use simplistic variable names, like a, b, x, y, args, and several others. The motivation for the abbreviated lambda is to gain the ability to wield terseness where verbosity is unnecessary. Even in my own example, I named my parameters lhs and rhs to denote their position in the comparison, yet there is very little confusion as to what was going on. I could as well have named them a and b. We understood from the context what they were. The naming of parameters when we have such useful context clues is unnecessary!

I don't want abbreviated lambdas. I'm leap-frogging it and proposing hyper-abbreviated lambdas, but I'm going to call them "expression lambdas," because I want to be different (and I think it's a significantly better name).

Use-case: Calling an overload-set

C++ overload sets live in a weird semantic world of their own. They are not objects, and you cannot easily create an object from one. For additional context, see Simon Brand's talk on the subject. There are several proposals floating around to fill this gap, but I contend that "expression lambdas" can solve the problem quite nicely.

Suppose I have a function that takes a sequence of sequences. I want to iterate over each sequence and find the maximum-valued element within. I can use std::transform and std::max_element to do this work:

template <typename SeqOfSeq>
auto find_maximums(SeqOfSeq& s) {
    std::vector<typename SeqOfSeq::value_type::const_iterator> maximums;
    std::transform(s,
                   std::back_inserter(maximums),
                   std::max_element);
    return maximums;
}

Oops! I can't pass std::max_element because it is an overload set, including function templates. How might an "expression lambda" help us here? Well, take a look:

template <typename SeqOfSeq>
auto find_maximums(SeqOfSeq& s) {
    std::vector<typename SeqOfSeq::value_type::const_iterator> maximums;
    std::transform(s,
                   std::back_inserter(maximums),
                   [][std::max_element(&1)]);
    return maximums;
}

If you follow along, you can infer that the special token sequence &1 represents "Argument number 1" to the expression closure object.

What if we want to use a comparator with our expression lambda?

template <typename SeqOfSeq, typename Compare>
auto find_maximums(SeqOfSeq& s, Compare&& comp) {
    std::vector<typename SeqOfSeq::value_type::const_iterator> maximums;
    std::transform(s,
                   std::back_inserter(maximums),
                   [&][std::max_element(&1, comp)]);
    return maximums;
}

Cool. We capture like a regular lambda [&] and pass the comparator as an argument to max_element. What does the equivalent with regular lambdas look like?

template <typename SeqOfSeq, typename Compare>
auto find_maximums(SeqOfSeq& s, Compare&& comp) {
    std::vector<typename SeqOfSeq::value_type::const_iterator> maximums;
    std::transform(s,
                   std::back_inserter(maximums),
                   [&](auto&& arg) -> decltype(std::max_element(arg, comp)) {
                       return std::max_element(arg, comp);
                   });
    return maximums;
}

That's quite a bit more. And yes, that decltype(<expr>) is required for proper SFINAE when calling the closure object. It may not be used in this exact context, but it is useful in general.

What about variadics?

Simple:

[][some_function(&...)]

What about perfect forwarding?

Well we're still in the boat of using std::forward<decltype(...)> on that one. Proposals for a dedicated "forward" operator have been shot down repeatedly. As someone who does a lot of perfect forwarding, I would love to see a dedicated operator (I'll throw up the ~> spelling for now).

The story isn't much better for current generic lambdas, though:

[&](auto&&... args) -> decltype(do_work(std::forward<decltype(args)>(args)...)) {
    return do_work(std::forward<decltype(args)>(args)...);
}

"Expression lambdas" would face a similar ugliness:

[&][do_work(std::forward<decltype(&...)>(&...))]

At least it can get away from the -> decltype(...) part.

If we had a "forwarding operator", the code might look something like this:

[&](auto&&... args) -> decltype(do_work(~>args...)) {
    return do_work(~>args...);
}

And this for "expression lambdas":

[&][do_work(~>&...)]

Are we Perl yet?

Tell me if and why you love or hate my "expression lambda" concept.

[Aug 11, 2020] What are the drawbacks of Python?

Jan 01, 2012 | softwareengineering.stackexchange.com

Ask Question Asked 9 years, 9 months ago Active 7 years, 2 months ago Viewed 204k times


4 revs, 4 users 62%
, 2012-06-27 15:11:57

zvrba ,

I think that this is a helpful subjective question, and it would be a shame to close it. – Eric Wilson Oct 29 '10 at 0:09

2 revs
, 2010-10-29 01:02:45

I use Python somewhat regularly, and overall I consider it to be a very good language. Nonetheless, no language is perfect. Here are the drawbacks in order of importance to me personally:

  1. It's slow. I mean really, really slow. A lot of times this doesn't matter, but it definitely means you'll need another language for those performance-critical bits.

  2. Nested functions kind of suck in that you can't modify variables in the outer scope. Edit: I still use Python 2 due to library support, and this design flaw irritates the heck out of me, but apparently it's fixed in Python 3 due to the nonlocal statement (see the sketch after this list). Can't wait for the libs I use to be ported so this flaw can be sent to the ash heap of history for good.

  3. It's missing a few features that can be useful to library/generic code and IMHO are simplicity taken to unhealthy extremes. The most important ones I can think of are user-defined value types (I'm guessing these can be created with metaclass magic, but I've never tried) and ref function parameters.

  4. It's far from the metal. Need to write threading primitives or kernel code or something? Good luck.

  5. While I don't mind the lack of ability to catch semantic errors upfront as a tradeoff for the dynamism that Python offers, I wish there were a way to catch syntactic errors and silly things like mistyping variable names without having to actually run the code.

  6. The documentation isn't as good as that of languages like PHP and Java, which have strong corporate backing.
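
A minimal sketch of the Python 3 nonlocal fix mentioned in item 2 (the make_counter example is made up for illustration):

def make_counter():
    count = 0
    def bump():
        nonlocal count  # Python 3: rebind the variable in the enclosing scope
        count += 1
        return count
    return bump

counter = make_counter()
print(counter())  # 1
print(counter())  # 2 -- in Python 2 the assignment raises UnboundLocalError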

Mark Canlas ,

@Casey, I have to disagree. The index is horrible - try looking up the with statement, or methods on a list. Anything covered in the tutorial is basically unsearchable. I have much better luck with Microsoft's documentation for C++. – Mark Ransom Oct 29 '10 at 6:14

2 revs
, 2011-07-24 13:49:48

I hate that Python can't distinguish between declaration and usage of a variable. You don't need static typing to make that happen. It would just be nice to have a way to say "this is a variable that I deliberately declare, and I intend to introduce a new name, this is not a typo".

Furthermore, I usually use Python variables in a write-once style, that is, I treat variables as being immutable and don't modify them after their first assignment. Thanks to features such as list comprehension, this is actually incredibly easy and makes the code flow more easy to follow.

However, I can't document that fact. Nothing in Python prevents me from overwriting or reusing variables.

In summary, I'd like to have two keywords in the language: var and let. If I write to a variable not declared by either of those, Python should raise an error. Furthermore, let declares variables as read-only, while var variables are "normal".

Consider this example:

x = 42    # Error: Variable `x` undeclared

var x = 1 # OK: Declares `x` and assigns a value.
x = 42    # OK: `x` is declared and mutable.

var x = 2 # Error: Redeclaration of existing variable `x`

let y     # Error: Declaration of read-only variable `y` without value
let y = 5 # OK: Declares `y` as read-only and assigns a value.

y = 23    # Error: Variable `y` is read-only

Notice that the types are still implicit (but let variables are for all intents and purposes statically typed since they cannot be rebound to a new value, while var variables may still be dynamically typed).

Finally, all method arguments should automatically be let , i.e. they should be read-only. There's in general no good reason to modify a parameter, except for the following idiom:

def foo(bar = None):
    if bar is None: bar = [1, 2, 3]

This could be replaced by a slightly different idiom:

def foo(bar = None):
    let mybar = bar or [1, 2, 3]
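
The let keyword above is the answerer's invention, but the "or" half of the idiom works in today's Python. A small sketch, with the caveat that it also replaces legitimate falsy arguments:

def foo(bar=None):
    mybar = bar or [1, 2, 3]  # caveat: an empty list argument is replaced too
    return mybar

print(foo())      # [1, 2, 3]
print(foo([9]))   # [9]
print(foo([]))    # [1, 2, 3] -- probably not what the caller intended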

Evan Plaice ,

I so, so wish Python had a "var" statement. Besides the (very good) reason you state, it would also make it a lot easier to read the code because then you can just scan over the page to spot all the variable declarations. – jhocking Jul 11 '11 at 23:19

2 revs, 2 users 67%
, 2012-09-08 13:01:49

My main complaint is threading, which is not as performant in many circumstances (compared to Java, C and others) due to the global interpreter lock (see "Inside the Python GIL" (PDF link) talk)

However, there is a multiprocessing interface that is very easy to use, though it is going to be heavier on memory usage for the same number of processes vs. threads, and difficult if you have a lot of shared data. The benefit, however, is that once you have a program working with multiple processes, it can scale across multiple machines, something a threaded program can't do.
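
As a minimal sketch of that multiprocessing interface (the square worker is made up for illustration):

from multiprocessing import Pool

def square(n):
    # Runs in a separate worker process, so the GIL is not a bottleneck.
    return n * n

if __name__ == '__main__':
    with Pool(4) as pool:                    # four worker processes
        print(pool.map(square, range(10)))   # [0, 1, 4, ..., 81]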

I really disagree with the critique of the documentation; I think it is excellent, and better than that of most if not all major languages out there.

Also, you can catch many of the runtime bugs by running pylint.

dbr ,

+1 for pylint. I was unaware of it. Next time I do a project in Python, I'll try it out. Also, multithreading seems to work fine if you use Jython instead of the reference CPython implementation. OTOH Jython is somewhat slower than CPython, so this can partially defeat the purpose. – dsimcha Oct 29 '10 at 0:48

Jacob , 2010-10-28 22:33:08

Arguably, the lack of static typing, which can introduce certain classes of runtime errors, is not worth the added flexibility that duck typing provides.

Jacob ,

This is correct, though there are tools like PyChecker which can check for errors that a compiler for languages like C/Java would catch. – Oliver Weiler Oct 28 '10 at 23:42

2 revs
, 2010-10-29 14:14:06

I think the object-oriented parts of Python feel kind of "bolted on". The whole need to explicitly pass "self" to every method is a symptom that its OOP component wasn't expressly planned, you could say; it also shows Python's sometimes warty scoping rules that were criticized in another answer.

Edit:

When I say Python's object-oriented parts feel "bolted on", I mean that at times, the OOP side feels rather inconsistent. Take Ruby, for example: In Ruby, everything is an object, and you call a method using the familiar obj.method syntax (with the exception of overloaded operators, of course); in Python, everything is an object, too, but some methods you call as a function; i.e., you overload __len__ to return a length, but call it using len(obj) instead of the more familiar (and consistent) obj.length common in other languages. I know there are reasons behind this design decision, but I don't like them.

Plus, Python's OOP model lacks any sort of data protection, i.e., there aren't private, protected, and public members; you can mimic them using _ and __ in front of methods, but it's kind of ugly. Similarly, Python doesn't quite get the message-passing aspect of OOP right, either.
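
To make the len(obj) point concrete, a small sketch of the protocol involved (the Playlist class is hypothetical):

class Playlist:
    def __init__(self, songs):
        self._songs = list(songs)

    def __len__(self):
        # len(obj) is a free function, but it dispatches to this method.
        return len(self._songs)

p = Playlist(['a', 'b', 'c'])
print(len(p))       # 3 -- function-call syntax...
print(p.__len__())  # 3 -- ...backed by a method, unlike Ruby's obj.length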

ncoghlan ,

The self parameter is merely making explicit what other languages leave implicit. Those languages clearly have a "self" parameter. – Roger Pate Oct 29 '10 at 6:08

MAK , 2010-11-11 13:38:01

Things I don't like about Python:

  1. Threading (I know it's been mentioned already, but worth mentioning in every post).
  2. No support for multi-line anonymous functions ( lambda can contain only one expression); see the sketch after this list.
  3. Lack of a simple but powerful input reading function/class (like cin or scanf in C++ and C, or Scanner in Java).
  4. Strings are not Unicode by default (but this is fixed in Python 3).
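
A quick sketch of the lambda limitation from item 2, together with the nested-def workaround mentioned in the comment below:

# A lambda body must be a single expression; statements are rejected:
#   f = lambda x: print(x); return x + 1   # SyntaxError

# The usual workaround is a (possibly nested) named function:
def make_handler():
    def handler(x):
        print(x)        # statements are fine here
        return x + 1
    return handler

print(make_handler()(41))  # prints 41, then 42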

Bryan Oakley ,

Regarding (2), I think this is offset by the possibility to have nested functions. – Konrad Rudolph Dec 26 '10 at 12:13

2 revs
, 2011-07-25 22:43:03

Default arguments with mutable data types.

def foo(a, L = []):
    L.append(a)
    print L

>>> foo(1)
[1]
>>> foo(2)
[1, 2]

It's usually the cause of subtle bugs. I think it would be better if it created a new list object whenever a default argument was required (rather than creating a single object to use for every function call).

Edit: It's not a huge problem, but when something needs a special warning in the docs, it commonly means it's a problem. This shouldn't be required.

def foo(a, L = None):
    if L is None:
        L = []
    ...

Especially when that should have been the default. It's just a strange behavior that doesn't match what you would expect and isn't useful for a large number of circumstances.

Patrick Collins ,

I see lots of complaints about this, but why do people insist on having an empty list (that the function modifies) as a default argument? Is this really such a big problem? I.e., is this a real problem? – Martin Vilcans Jul 25 '11 at 21:22

3 revs
, 2011-07-15 03:15:50

Some of Python's features that make it so flexible as a development language are also seen as major drawbacks by those used to the "whole program" static analysis conducted by the compilation and linking process in languages such as C++ and Java.

Local variables are declared using the ordinary assignment statement. This means that variable bindings in any other scope require explicit annotation to be picked up by the compiler (global and nonlocal declarations for outer scopes, attribute access notation for instance scopes). This massively reduces the amount of boilerplate needed when programming, but means that third party static analysis tools (such as pyflakes) are needed to perform checks that are handled by the compiler in languages that require explicit variable declarations.

The contents of modules, class objects and even the builtin namespace can be modified at runtime. This is hugely powerful, allowing many extremely useful techniques. However, this flexibility means that Python does not offer some features common to statically typed OO languages. Most notably, the "self" parameter to instance methods is explicit rather than implicit (since "methods" don't have to be defined inside a class, they can be added later by modifying the class, meaning that it isn't particularly practical to pass the instance reference implicitly) and attribute access controls can't readily be enforced based on whether or not code is "inside" or "outside" the class (as that distinction only exists while the class definition is being executed).

Like many other high-level languages, Python tends to abstract away most hardware details. Systems programming languages like C and C++ are still far better suited to handling direct hardware access (however, Python will quite happily talk to those either via CPython extension modules or, more portably, via the ctypes library).


zvrba , 2010-10-29 07:20:33

  1. Using indentation for code blocks instead of {} / begin-end, whatever.
  2. Every newer modern language has proper lexical scoping, but not Python (see below).
  3. Chaotic docs (compare with Perl5 documentation, which is superb).
  4. Strait-jacket (there's only one way to do it).

Example of broken scoping; transcript from an interpreter session:

>>> x=0
>>> def f():
...     x+=3
...     print x
... 
>>> f()
Traceback (most recent call last):
  File "", line 1, in ?
  File "", line 2, in f
UnboundLocalError: local variable 'x' referenced before assignment

global and nonlocal keywords have been introduced to patch this design stupidity.
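
For reference, a minimal sketch of the patched version of the transcript's example using the global keyword:

x = 0

def f():
    global x   # explicitly opt in to rebinding the module-level x
    x += 3
    print(x)

f()  # prints 3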

Ben ,

regarding the scoping, it might be worth it for the curious to look at python.org/dev/peps/pep-3104 to understand the reasoning behind the current method. – Winston Ewert Oct 30 '10 at 1:13

2 revs
, 2012-09-08 16:49:46

I find python's combination of object-oriented this.method() and procedural/functional method(this) syntax very unsettling:

x = [0, 1, 2, 3, 4]
x.count(1)
len(x)
any(x)
x.reverse()
reversed(x)
x.sort()
sorted(x)

This is particularly bad because a large number of the functions (rather than methods) are just dumped into the global namespace: methods relating to lists, strings, numbers, constructors, metaprogramming, all mixed up in one big alphabetically-sorted list.

At the very least, functional languages like F# have all the functions properly namespaced in modules:

List.map(x)
List.reversed(x)
List.any(x)

So they aren't all together. Furthermore, this is a standard followed throughout the library, so at least it's consistent.

I understand the reasons for doing the function vs method thing, but I still think it's a bad idea to mix them up like this. I would be much happier if the method syntax was followed, at least for the common operations:

x.count(1)
x.len()
x.any()
x.reverse()
x.reversed()
x.sort()
x.sorted()

Whether the methods are mutating or not, having them as methods on the object has several advantages:

And of course it has advantages over the put-everything-in-global-namespace way of doing it. It's not that the current way is incapable of getting things done. It's even pretty terse (len(lst)), since nothing is namespaced! I understand the advantages in using functions (default behavior, etc.) over methods, but I still don't like it.

It's just messy. And in big projects, messiness is your worst enemy.

Wayne Werner ,

yeah... I really miss LINQ style (I'm sure LINQ isn't the first to implement it, but I'm most familiar with it) list handling. – CookieOfFortune Sep 8 '12 at 15:38

2 revs, 2 users 95%
, 2012-09-21 07:38:31

Lack of homoiconicity.

Python had to wait for 3.x to add a "with" keyword. In any homoiconic language it could have trivially been added in a library.

Most other issues I've seen in the answers are of one of 3 types:

1) Things that can be fixed with tooling (e.g. pyflakes)
2) Implementation details (GIL, performance)
3) Things that can be fixed with coding standards (i.e. features people wish weren't there)

#2 isn't a problem with the language; IMO #1 and #3 aren't serious problems.

dbr ,

with was available from Python 2.5 with from __future__ import with_statement, but I agree, I've occasionally found it unfortunate that statements like if/for/print/etc are "special" instead of regular functions – dbr Sep 9 '12 at 22:03

Martin Vilcans , 2011-07-25 22:21:42

Python is my favourite language as it is very expressive, but still keeps you from making too many mistakes. I still have a few things that annoy me:

Of these complaints, it's only the very first one that I care enough about that I think it should be added to the language. The other ones are rather minor, except for the last one, which would be great if it happened!

Zoran Pavlovic ,

+1 It makes me wonder whether to write datetime.datetime.now() when one project could write datetime.now and then mixing two projects one way of writing it rules out the other and surely this wouldn't have happened in Java which wouldn't name a module the same as a file(?) if you see how the common way seems to have the module confusing us with the file when both uses are practiced and explicit self I still try to understand since the calls don't have the same number of arguments as the functions. And you might think that the VM python has is slow? – Niklas R. Sep 1 '11 at 16:19

5 revs
, 2013-05-23 22:03:02

Python is not fully mature: the python 3.2 language at this moment in time has compatibility problems with most of the packages currently distributed (typically they are compatible with python 2.5). This is a big drawback which currently requires more development effort (find the package needed; verify compatibility; weigh choosing a not-as-good package which may be more compatible; take the best version, update it to 3.2 which could take days; then begin doing something useful).

Likely in mid-2012 this will be less of a drawback.

Note that I guess I got downvoted by a fan-boy. During a developer discussion our high level developer team reached the same conclusion though.

Maturity in one main sense means a team can use the technology and be very quickly up & running without hidden risks (including compatibility problems). 3rd party python packages and many apps do not work under 3.2 for the majority of the packages today. This creates more work of integration, testing, reimplementing the technology itself instead of solving the problem at hand == less mature technology.

Update for June 2013: Python 3 still has maturity problems. Every so often a team member will mention a package needed then say "except it is only for 2.6" (in some of these cases I've implemented a workaround via localhost socket to use the 2.6-only package with 2.6, and the rest of our tools stay with 3.2). Not even MoinMoin, the pure-python wiki, is written in Python 3.

Jonathan Cline IEEE ,

I agree with you only if your definition of maturity is not compatible with a version that is incompatible by design. – tshepang Jul 17 '11 at 7:25

Mason Wheeler , 2010-10-28 22:35:52

Python's scoping is badly broken, which makes object-oriented programming in Python very awkward.

LennyProgrammers ,

can you give an example? (I'm sure you are right, but I'd like an example) – Winston Ewert Oct 28 '10 at 22:36

missingfaktor , 2010-11-11 12:26:23

My gripes about Python:

JB. ,

Why the downvote? – missingfaktor Dec 26 '10 at 13:32

3 revs
, 2011-07-23 23:08:37

Access modifiers in Python are not enforceable - this makes it difficult to write well-structured, modularized code.

I suppose that's part of @Mason's broken scoping - a big problem in general with this language. For code that's supposed to be readable, it seems quite difficult to figure out what can and should be in scope and what a value will be at any given point in time - I'm currently thinking of moving on from the Python language because of these drawbacks.

Just because "we're all consenting adults" doesn't mean that we don't make mistakes and don't work better within a strong structure, especially when working on complex projects - indentation and meaningless underscores don't seem to be sufficient.

ncoghlan ,

So lack of access controls is bad... but explicit scoping of variable writes to any non-local namespace is also bad? – ncoghlan Jul 13 '11 at 3:25

dan_waterworth , 2010-12-26 13:05:49

  1. The performance is not good, but is improving with pypy,
  2. The GIL prevents the use of threading to speed up code, (although this is usually a premature optimization),
  3. It's only useful for application programming,

But it has some great redeeming features:

  1. It's perfect for RAD,
  2. It's easy to interface with C (and for C to embed a python interpreter),
  3. It's very readable,
  4. It's easy to learn,
  5. It's well documented,
  6. Batteries really are included: its standard library is huge and pypi contains modules for practically everything,
  7. It has a healthy community.

dan_waterworth ,

What inspired you to mention the advantages? The question asked for the problems. Anyway, what do you mean it's useful only for application programming? What other programming is there? What specifically is it not good for? – tshepang Dec 30 '10 at 13:27

Niklas R. , 2011-07-23 07:31:38

I do favor Python, and the first disadvantage that comes to my mind is that when commenting out a statement like if myTest(): you must change the indentation of the whole executed block, which you wouldn't have to do with C or Java. In fact, in Python, instead of commenting out an if-clause I've started to comment it out this way: if True: #myTest() so I won't also have to change the following code block. Since Java and C don't rely on indentation, it makes commenting out statements easier with C and Java.

Christopher Mahan ,

You would seriously edit C or Java code to change the block level of some code without changing its indentation? – Ben Jul 24 '11 at 6:35

Jed , 2012-09-08 13:49:04

Multiple dispatch does not integrate well with the established single-dispatch type system and is not very performant.

Dynamic loading is a massive problem on parallel file systems where POSIX-like semantics lead to catastrophic slow-downs for metadata-intensive operations. I have colleagues that have burned a quarter million core-hours just getting Python (with numpy, mpi4py, petsc4py, and other extension modules) loaded on 65k cores. (The simulation delivered significant new science results, so it was worth it, but it is a problem when more than a barrel of oil is burned to load Python once.) Inability to link statically has forced us to go to great contortions to get reasonable load times at scale, including patching libc-rtld to make dlopen perform collective file system access.

Jed ,

Wow, seems highly technical, do you have any reference material, examples, blog posts or articles on the subject? I wonder if I might be exposed to such cases in the near future. – vincent Sep 8 '12 at 20:00

vincent , 2012-09-08 16:03:54

Anyhow, Python has been my main language for 4 years now. Being fanboys, elitists or monomaniacs is not a part of the Python culture.

Andrew Janke ,

+1. Spec for memory and threading model is right on. But FWIW, the Java garbage collector being on a thread (and most everything else about the GC) is not an aspect of the Java language or VM specifications per se, but is a matter of a particular JVM's implementation. However, the main Sun/Oracle JVM is extensively documented wrt GC behavior and configurability, to the extent that there are whole books published on JVM tuning. In theory one could document CPython in the same way, regardless of language spec. – Andrew Janke Nov 26 '12 at 3:44

deamon , 2012-09-10 12:59:24

Konrad Rudolph ,

Personally I think that the incompatibility between 2.x and 3.x is one of Python's biggest advantages. Sure, it also is a disadvantage. But the audacity of the developers to break backwards compatibility also means that they didn't have to carry cruft around endlessly. More languages need such an overhaul. – Konrad Rudolph Sep 10 '12 at 13:39

Kosta , 2012-09-08 15:04:10

"Immutability" is not exactly it's strong point. AFAIK numbers, tuples and strings are immutable, everything else (i.e. objects) is mutable. Compare that to functional languages like Erlang or Haskell where everything is immutable (by default, at least).

However, immutability really, really shines with concurrency*, which is also not Python's strong point, so at least it's consistent.

(*= For the nitpickers: I mean concurrency which is at least partially parallel. I guess Python is ok with "single-threaded" concurrency, in which immutability is not as important. (Yes, FP-lovers, I know that immutability is great even without concurrency.))


rbanffy , 2012-09-08 16:47:42

I'd love to have explicitly parallel constructs. More often than not, when I write a list comprehension like

[ f(x) for x in lots_of_sx ]

I don't care about the order in which the elements will be processed. Sometimes, I don't even care in which order they are returned.

Even if CPython can't do it well when my f is pure Python, behavior like this could be defined for other implementations to use.
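
There is no parallel comprehension in the language, but a multiprocessing pool gives roughly the behavior the answer wishes for; f and lots_of_sx below are stand-ins for the answer's names:

from multiprocessing import Pool

def f(x):
    return x * x  # stand-in for the answer's pure-Python f

if __name__ == '__main__':
    lots_of_sx = range(100)
    with Pool() as pool:
        # Elements are processed in parallel; pool.imap_unordered would
        # additionally drop the ordering guarantee, as the answer suggests.
        results = pool.map(f, lots_of_sx)
    print(results[:5])  # [0, 1, 4, 9, 16]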

Zoran Pavlovic ,

//spawn bunch of threads //pass Queue que to all threads que.extend([x for x in lots_of_sx]) que.wait() # Wait for all lots_of_sx to be processed by threads. – Zoran Pavlovic Jan 7 '13 at 7:02

2 revs, 2 users 80%
, 2012-10-07 15:16:37

Python has no tail-call optimization, mostly for philosophical reasons. This means that tail-recursing on large structures can cost O(n) memory (because of the unnecessary stack that is kept) and will require you to rewrite the recursion as a loop to get O(1) memory.
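
A small sketch of the rewrite the answer describes: the recursive version keeps O(n) stack frames, while the loop version runs in O(1) stack space (the length functions are made up for illustration):

def length_rec(xs, acc=0):
    # Tail-recursive in shape, but CPython still grows the stack,
    # so a deep enough list hits RecursionError.
    if not xs:
        return acc
    return length_rec(xs[1:], acc + 1)

def length_loop(xs):
    acc = 0
    while xs:                     # same logic, constant stack depth
        xs, acc = xs[1:], acc + 1
    return acc

print(length_loop(list(range(5000))))  # 5000; length_rec would overflow here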



[Dec 26, 2019] What is the Python equivalent of Perl's 'if exists' for hashes?

Jan 01, 2015 | stackoverflow.com

Ask Question Asked 4 years, 5 months ago Active 4 years, 5 months ago Viewed 583 times


flybonzai ,

I'm writing a script for work, and need to be able to create a hash of arrays that will check to see if a key exists in the hash (or dictionary), and if it does I will roll up some values from the new line into the existing hash values. Here is my code in Perl, what would be the translation in Python?
if (exists($rollUpHash{$hashKey}))
        {
          say("Same key found, summing up!")
          $rollUpHash{$hashKey}[14] += $lineFields[14];
          $rollUpHash{$hashKey}[15] += $lineFields[15];
          $rollUpHash{$hashKey}[16] += $lineFields[16];
          $rollUpHash{$hashKey}[17] += $lineFields[17];
          $rollUpHash{$hashKey}[24] += $lineFields[24];
          push @{$rollUpHash{$hashKey}}, $sumDeduct_NonDeduct_ytd;
          # print %rollUpHash;
        }
      else
        {
          $rollUpHash{$hashKey} = \@lineFields;
        }

blasko , 2015-07-22 20:15:35

If you're just checking if the key exists, you can do if "key" in your_dictionary

Edit:

To handle the unintended second part of your question, about adding the new value to the array, you can do something like this

# -1 will give you the last item in the list every time
for key, value in nums.iteritems():
    nums[key].append(value[-1]+value[-1])

omri_saadon ,

You can use this as well
rollUpHash.get(key, None)

If the key exists then the function return the value of this key, else the function will return whatever you assigned as the default value (second parameter)

if rollUpHash.has_key(hashkey):
    print "Same key found, summing up!"
    rollUpHash[hashkey][14] += lineFields[14]
    ...
    ...
    rollUpHash[hashkey].append(sumDeduct_NonDeduct_ytd)
else:
    rollUpHash[hashkey] = lineFields
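
Note that has_key is Python 2 only (it was removed in Python 3; use the in operator instead). A Python 3 sketch of the same roll-up logic, with made-up sample data standing in for the real variables:

roll_up_hash = {'k': [0] * 25}        # hypothetical sample data
line_fields = list(range(25))
sum_deduct_non_deduct_ytd = 99
hash_key = 'k'

if hash_key in roll_up_hash:          # Python 3 spelling of has_key
    print('Same key found, summing up!')
    for i in (14, 15, 16, 17, 24):
        roll_up_hash[hash_key][i] += line_fields[i]
    roll_up_hash[hash_key].append(sum_deduct_non_deduct_ytd)
else:
    roll_up_hash[hash_key] = line_fields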

omri_saadon , 2015-07-23 16:31:30

So if I want to add the rest of the line as a list to my dictionary using key as the key, how would that look? – flybonzai Jul 22 '15 at 22:05

[Dec 23, 2019] when it rains, it pours

Notable quotes:
"... Out of the frying pan; into the fire ..."
Jan 01, 2019 | english.stackexchange.com

asked yesterday by user3477108


Edwin Ashworth , 2019-12-22 14:40:37

Does this answer your question? In my native language, we have a saying - a stone will get a wretched person, going uphill. 'Is there a similar saying or idiomatic expression in English which would correlate with the above-mentioned one, implying that misfortune will befall even those who are already in trouble?' – Edwin Ashworth 17 hours ago

Fraser Orr ,

A very common idiom is to say "when it rains, it pours."

"Pours" in this context means, "rains very heavily."

What this means, roughly speaking is "when one bad thing happens, you can expect a lot more bad things." So, for example, when talking to a friend who has just described a litany of bad luck in his life you'd say, "when it rains, it pours."

Laconic Droid , 2019-12-22 03:14:49

In the UK the longer "it never rains but it pours" is not uncommon. – Laconic Droid yesterday

Tim , 2019-12-22 19:04:57

A fairly well known option is add insult to injury

to worsen an unfavourable situation

-- wiktionary

Scoots , 2019-12-22 13:49:57

An alternative is:

Out of the frying pan; into the fire

Which is usually meant as escaping a bad situation only to find oneself in a worse situation.

nnnnnn , 2019-12-23 04:41:12

"Out of the frying pan, into the fire" is identified in the question as not applicable. (Unless the question was edited to add that after you answered, but there's no edit history shown.) But in any case as you have correctly noted it means to replace a problem with a worse problem, whereas the question is asking about adding additional problems without solving the original one. – nnnnnn 3 hours ago
When the first bad situation was recent and related to subsequent deterioration (so the "new difficulties" are related to the old ones),

This went downhill fast.

is a common way of expressing exasperation, particularly when human factors are involved in the deterioration (namely people taking things badly). For new difficulties unrelated to the old ones, I'd choose the previously mentioned "when it rains, it pours".

Mark Foskey ,

I believe there is no idiom that means exactly the same thing. Maybe you could just translate yours into English? "It's like I had a lightning strike followed by a snake bite." People won't even know you're using a cliche, or you can choose to say it's an expression from your native language.

Actually, come to think of it, the word "snakebit" means that someone has had bad luck, but it seems to be especially often used when someone has had a whole series of misfortunes. So it might do.

dimachaerus , 2019-12-22 14:30:22

between a rock and a hard place

This phrase became popular during The Great Depression of the 1930s, as many citizens were hit hard by the stock market crash and were left with no choice as unemployment levels rose also.

awe lotta , 2019-12-22 22:50:41

This would be improved with a little more explanation. – KillingTime 17 hours ago

[Dec 23, 2019] Are there shibboleths specific to native Russian speakers?

Jan 01, 2015 | english.stackexchange.com

Ask Question Asked 8 years, 9 months ago Active 5 years, 11 months ago Viewed 4k times



F'x ,

I am doing a lot of collaborative writing these days with a colleague born and raised in Russia, and now working in the US. He has very good English, and yet, as we circulated various texts, I noticed that he tends to drop the definite article, the, more than is acceptable. I attributed that to a trend of his native language.

Because I will continue working with him for some time, I hope to be aware of other such possible errors influenced by his mother tongue (especially because I'm not a native English speaker either!). So, what are common errors (or shibboleths) of native Russian speakers when they write in English?

Andrew Grimm , 2015-03-28 00:38:44

Russian is a very flexible language. Its complexity allows one to put words in a sentence in just about any order. There are 5 * 4 * 3 * 2 * 1 = 120 valid permutations of "I ate green apple yesterday", although some sound more weird than others. Run-on sentences are not a big deal - they are allowed. In fact, one can express a sentence "It is getting dark" with just a single word, and it is a valid sentence. The words from Latin made their way into many languages, but sometime their meanings have changed. – Job Mar 20 '11 at 2:18


mplungjan , 2011-03-19 09:31:44

A, an and the are all dropped. Using past tense with did (in my experience almost all non-natives do this until they learn not to). Sometimes using she instead of he. Word order is not as important in Russian as in English. Missing prepositions.

Russians I have met who have large vocabularies tend to stress words with more than two syllables in an idiosyncratic manner since they likely only ever read the words.
I have the same problem on rare occasions where I know a word, know how to use it but guess the pronunciation since I got it from literature.

More here

http://esl.fis.edu/grammar/langdiff/russian.htm

For example beginning learners often omit the auxiliary in questions or negatives: How you do that? / I no have it. The present simple is commonly used where the progressive form or perfect is needed: She has a bath now / How long are you in Germany?. In comparison with Russian the modal verb system in English is very complex. Mistakes such as Must you to work on Friday? / I will not can come, etc. are common among beginners. The lack of a copula in Russian leads to errors such as She good teacher.

MT_Head , 2012-09-28 00:50:51

Conversely, they may insert extraneous articles as a hypercorrection. – Mechanical snail May 23 '12 at 4:10


DVK ,

Aside from the items pointed above, a well-educated native russian speaker often writes (and speaks) in incredibly long, almost Hemingway-ish, compound sentences, where you can barely remember what the beginning of the sentence was about. I'm not sure if it's primarily the influence of russian prose, or something about the language itself which causes the brain to produce the long sentences.

kotekzot , 2012-06-14 09:21:37

+1 It is certainly true for well-educated ones. Our famous writers used to write sentences half a page long. Even for native speakers it is too much sometimes. – Edwin Ross Mar 19 '11 at 20:01


rem ,

Russian and English languages have somewhat different structure of verb tenses. For native speakers of Russian it can often be difficult to correctly use perfect tense forms due to the influence of their mother tongue.
The grammatical concepts behind the correct usage of English perfect tenses can be very confusing to Russian speakers, so they tend to replace it with Simple Past tense for example (in case of Present Perfect or Past Perfect), or just fail to use it appropriately.


Edwin Ross ,

I am from Russia and I work at an international company, so my colleagues and I have to use English all the time. There are really some common errors. The most difficult thing for us is to use articles properly. There is nothing similar to them in our native language. That is why we often use them where they are not needed and vice versa.

The second difficult part is the use of prepositions. We tend to use those that we would use in our language if they were translated. For example, instead of at office we tend to say in office; instead of to London we often say in London. There are many other examples.

We don't have the gerund in our language, so sometimes it is difficult for us to use it properly.

I can not agree with mplungjan that word order is not so important. It is important in any language, and in Russian too you can change the meaning of a sentence if you change the word order. Not always, though, but in English it does not happen every time either.

There is also a rather big problem with the sequence of tenses. In our language we do not have it. That is why we often misuse perfect tense and even past tense forms.

These are the most often encountered mistakes that I can spot when I talk to or read something from a native Russian speaker.

mplungjan , 2011-03-19 22:21:46

Almost all my closest colleagues are from the former Soviet Union. The order of the words in a sentence seems to a non-Russian speaker to at least have a different importance, since I see this very often. Perhaps the person wanted to make a point in his native tongue, but the end effect was an incorrect sentence. – mplungjan Mar 19 '11 at 17:48

konung , 2011-03-31 14:34:41

One thing that nobody seemed to mention is punctuation. It is of paramount importance in Russian, because it brings intonation across.

Here is a famous example from an old Soviet cartoon that is based on a tale by Hans Christian Andersen in which a little princess is asked to sign a decree of execution. Pay attention to the position of the comma.

I guess you could argue that you can do the same in English like so:

Execute cannot, pardon! vs Execute, cannot pardon!

And this would make sense to an attentive English speaker, but punctuation tends to be not emphasized as much as spelling; as a result it will most likely be ignored or at the very least be ambiguous. I was just trying to illustrate the point that punctuation is so important that they made a cartoon for little children about it :-)

In fact it's so important that in Russia, Russian language teachers usually give 2 grades for some written assignments: one for grammar and the other one for punctuation (it wasn't uncommon for me to get 5/3, or A/C in the American equivalent; I'm not a bad speller, but sometimes I can't get those punctuation signs right even if my life depended on it :-) )

To relate to this question though: you will notice that Russian speakers that finished at least 9 classes or high school in Russia will tend to use a lot more of , ; : " etc to bring extra nuances across, especially in run-on sentences because it's ingrained in the way language is taught. I see it with my Dad a lot. I've lived in the US for more than a decade now myself and I still tend to put commas in front of "that" in the middle of the sentence.

konung , 2011-04-06 22:36:00

+1: I never knew that about Russian. Here's a link that shows why punctuation is important (in English). – oosterwal Mar 31 '11 at 20:32


MT_Head , 2012-09-28 00:57:06

As previously mentioned, Russian doesn't use articles ( a, the ), so Russian speakers use them - or don't - by guesswork, and often get them wrong.

What I haven't seen anyone else mention, however, is that the present tense of to be ( I am, thou art†, he is, we are, you are, they are ) is rarely (if ever) used in Russian. As a result, again, Russian speakers sometimes make surprising mistakes in this area. (My favorite: " Is there is...? ")

In speech, of course, there are at least three major pitfalls: Russian lacks a "th" sound - foreign words that are imported into Russian tend to get substituted with "f" or "t". When speaking English, "th" tends to turn into "s" or "z". If you're feeling especially cruel, ask your Russian colleague to say "thither". (Of course, a lot of Americans also have trouble with that one.)

Russian also doesn't have an equivalent to English "h" - the Russian letter х , pronounced like the "ch" in loch , is not equivalent - so foreign (mostly German) words imported into Russian usually substitute "g". Russians speaking English will, at first, turn all of their aitches into gees; later on, some learn to pronounce an English h , while others convert h 's into х 's - the source of the infamous "kheavy Roossian excent".

Finally, several of the "short" English vowel sounds - the a in "at", i in "in", and u in "up" - don't exist in Russian, while Russian has at least one vowel sound (ы) that doesn't exist in English. (Hence "excent" instead of "accent".)


†Yes, I know - we don't actually use "thou" anymore. Russians do, however (ты) and so I mentioned it for completeness.


user57297 , 2013-11-13 13:57:30

Here's a conversation I had with my Russian colleague, who speaks English well:

me: Is Jane on board with this plan?

Russian: Jane's not on the board now. Didn't you know that?

me: No, I mean, does Jane agree with us on this?

Russian: What? What are you talking about?

me: "on board" means "is she on the same boat (page, etc) with us?"

To her, the word "the" should carry no significant change in meaning. She didn't 'get it' on an intuitive level, despite years of successful study of English.

Human languages gather their own logic. Shall we discuss 'verbs of motion' in Russian, for example? Why, if I am 200 miles outside of Moscow, do I have to specify whether I'm walking or going by a vehicle when I say, "I'm going to Moscow tomorrow." Isn't it obvious I won't be walking?

I'm enjoying learning Russian, because I'm uncovering the hidden logic in it. It's a beautiful language.


Pavel , 2013-02-21 08:37:48

Thanks for the very useful examples and explanations!

Actually I still keep "fighting" with English articles after my at least 15 years of good English experience. I tend to drop them in order to avoid using them wrongly. I remember very well how my colleagues and my chief cursed my inability to use articles when editing my English texts (looking for and fixing mostly only articles). The idea of articles in English (and in German and French, too) seems very weird to my Russian mind. Why does one need articles at all? There are much more logical words "this", "that", "these" in English, as in Russian (and many other languages). If we need to pinpoint the object (stress which one exactly) then we use these words in Russian: "this car". Otherwise we Russians just do not care to show that some "car" exists only in one piece (it's damn clear already since it's not "cars") like one should do in English by stressing "a car" or "une voiture" in French.

I wonder what happened in the old times in English (and other Germanic languages) to force people to use articles instead of the logical "this", "that", "these" words?

Surprisingly, it works much better for me with Swedish articles. Maybe because Swedes are not so strict about the articles, maybe because the Swedish article is always connected with a different ending of the word. They say and write not just "a car -- the car" but "en bil" (often dropping "en") - "den här bilen". This is somehow more complicated, but in some strange way it concentrates me more on the certain object. Here is a link with a professional explanation of the Swedish approach: http://www.thelocal.se/blogs/theswedishteacher/2012/04/11/denna-or-den-har/

, 2013-02-21 09:37:10

Well if you wonder what happened in the old times, look up the etymology of the and a . The former is the word "that", and the latter comes from the word "one". I.e. "the apple" is simply "that apple", and "an apple" is "one apple". So English is not that different from Russian, actually. – RegDwigнt ♦ Feb 21 '13 at 9:37

[Dec 23, 2019] Expressing the concept of "spreading oneself shallowly" in English

Jan 01, 2011 | english.stackexchange.com


brilliant , 2011-12-21 08:44:15

The words in bold in the quote below are meant to express something that I don't know how to put in English. The main idea is that someone is spending too much energy in many different areas, thinking that he is going to achieve considerable progress in all of them, while in fact he is only going to enjoy a small amount of success (if any) in each of them, due to the enormous scale of each area.

Jack: So what project did you choose for this semester?

Linda: Child illiteracy in inland towns in Uganda. The correlation between humans' eating habits and their behavioral patterns. The possibility of practical application of the Poincaré conjecture solution in the near future. The effect of global warming on blue whales' migratory patterns...

Jack: Wow! Isn't that too many? Why not focus on only one project and research it thoroughly instead? I suggest that you should not shallowly spread yourself on so many projects.


Pitarou , 2011-12-21 09:03:01

I can't think of an idiom that exactly expresses your meaning. We can suggest to Linda that she should not spread herself so thinly , but that suggests the risk of failure, rather than insufficient progress.

If Linda lives her life this way, she might become a jack of all trades, but master of none . I.e. she has acquired many "shallow skills" through her diverse experiences, but no deep ones.


Barrie England ,

The colloquial form is you should not spread yourself so thinly .

sq33G , 2011-12-21 09:47:08

thinly ? Or thin ? ( thinly would modify spread , thin would modify yourself ) – sq33G Dec 21 '11 at 9:03

[Dec 23, 2019] Is it right, or better way, to say someone "denies themselves agency"

Jan 01, 2016 | english.stackexchange.com


dsollen , 2016-01-06 19:25:34

I'm trying to express the idea of someone who consistently underestimates his own contributions or his ability to impact a situation, despite having high self-esteem. This is due to his seeing himself as fitting into a category of people who are not expected to be impactful in a situation, so he doesn't believe he should be impactful, despite actually being qualified to have an impact.

In essence, it's like someone saying they can't/didn't have a significant impact on a project because they are just an intern, and everyone knows interns are just there to learn, not to create something; or someone saying they couldn't/didn't help lead the direction of a project because they weren't a manager, and only the manager is allowed to do that, etc.

I struggle to explain this concept while stressing that the underestimation is not due to low self-esteem or negativity, simply to the fact that he does not believe he should be impactful and thus underestimates any impact he could have.

In this situation, would it be right to say that the individual is denying his agency? Or perhaps does not acknowledge his agency in the situation? I'm not certain whether it is right to say someone can be 'given' agency, or whether agency is an intrinsic quality that the person has whether or not he acknowledges its existence.

If the phrase isn't right, is there a better phrase to use?

Dan Bron , 2016-01-06 20:03:12

This is typically termed Impostor syndrome . – Dan Bron Jan 6 '16 at 19:27

, 2016-07-23 18:37:21

Please understand that this use of agency is going to confuse most people. – tchrist ♦ Jul 23 '16 at 18:37


haha ,

you may refer to:

self-doubt (Noun)

A lack of faith or confidence in oneself.

Self-doubting (Adjective)

self-distrust (Noun)

Lack of faith or confidence in one's own abilities.

Self-distrustful (Adjective)

Insecure (Adjective)

Not feeling at all confident about yourself, your abilities, or your relationships with people.

She's very insecure about her appearance. (Longman dictionary)

I want to say something along the lines of "Warrantless Deference", meaning:

1) that they are deferring to someone else without authorization to do so, and 2) that, though having the ability, the person assumes someone else will take on the responsibility for the task.

This also sounds like a less serious example of " The Bystander Effect " with a touch of " Diffusion of Responsibility "

[Dec 01, 2019] How can I export all subs in a Perl package?

Jan 01, 2009 | stackoverflow.com


Ville M ,

I would like to expose all subs into my namespace without having to list them one at a time:
@EXPORT = qw( firstsub secondsub third sub etc );

Using fully qualified names would require a bunch of changes to existing code, so I'd rather not do that.

Is there @EXPORT_ALL?

I think the documentation says it's a bad idea, but I'd like to do it anyway, or at least know how.

To answer Jon's why: right now, for a quick refactoring, I want to move a bunch of subs into their own package with the least hassle and the fewest code changes to the existing scripts (where those subs are currently used and often repeated).

Also, mostly, I was just curious (it seemed like Exporter might as well have that as a standard feature, but, somewhat surprisingly, based on the answers so far, it doesn't).

brian d foy , 2009-04-08 23:58:35

Don't do any exporting at all, and don't declare a package name in your library. Just load the file with require and everything will be in the current package. Easy peasy.
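
A minimal sketch of what that looks like (the file name and sub names below are invented for illustration):

# mysubs.pl - note: no "package" statement, so these subs are compiled
# into whatever package is current when the file is require'd
sub firstsub  { print "first\n"; }
sub secondsub { print "second\n"; }
1;   # a require'd file must return a true value

And a client script:

#!/usr/bin/perl
use strict;
use warnings;

require './mysubs.pl';   # the subs land in the current package (here, main)
firstsub();              # callable unqualified, no Exporter involved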

Michael Carman , 2009-04-09 00:15:10

Don't. But if you really want to... write a custom import that walks the symbol table and export all the named subroutines.
# Export all subs in package. Not for use in production code!
sub import {
    no strict 'refs';

    my $caller = caller;

    while (my ($name, $symbol) = each %{__PACKAGE__ . '::'}) {
        next if      $name eq 'BEGIN';   # don't export BEGIN blocks
        next if      $name eq 'import';  # don't export this sub
        next unless *{$symbol}{CODE};    # skip names that have no sub attached

        my $imported = $caller . '::' . $name;
        *{ $imported } = \*{ $symbol };  # alias the whole glob into the caller's package
    }
}
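
For example, if this import is dropped into a (hypothetical) MyUtils.pm, a plain use MyUtils; in a script runs it at compile time and aliases every sub from MyUtils into the script's own namespace:

use MyUtils;     # triggers the custom import() above
helper_sub();    # any sub defined in MyUtils is now callable unqualified (name invented)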

Chas. Owens ,

Warning: the following code is as bad an idea as exporting everything:
package Expo;

use base "Exporter";

seek DATA, 0, 0; # rewind DATA to the very start of this file, not just past __DATA__

#read this file looking for sub names
our @EXPORT = map { /^sub\s+([^({\s]+)/ ? $1 : () } <DATA>;

my $sub = sub {}; #make sure anon funcs aren't grabbed

sub foo($) {
    print shift, "\n";
}

sub bar ($) {
    print shift, "\n";
}

sub baz{
    print shift,"\n";
}

sub quux {
    print shift,"\n";
}

1;

__DATA__

Here is some code that uses the module:

#!/usr/bin/perl

use strict;
use warnings;

use Expo;

print map { "[$_]\n" } @Expo::EXPORT;

foo("foo");
bar("bar");
baz("baz");
quux("quux");

And here is its output:

[foo]
[bar]
[baz]
[quux]
foo
bar
baz
quux

Jon Ericson , 2009-04-08 22:33:36

You can always call subroutines in their fully-qualified form:
MyModule::firstsub();

For modules I write internally, I find this convention works fairly well. It's a bit more typing, but tends to be better documentation.

Take a look at perldoc perlmod for more information about what you are trying to accomplish.

More generally, you could look at Exporter 's code and see how it uses glob aliasing. Or you can examine your module's namespace and export each subroutine. (I don't care to search for how to do that at the moment, but Perl makes this fairly easy.) Or you could just stick your subroutines in the main package:

 package main;
 sub firstsub() { ... }

(I don't think that's a good idea, but you know better than I do what you are trying to accomplish.)

There's nothing wrong with doing this provided you know what you are doing and aren't just trying to avoid thinking about your interface to the outside world.

ysth , 2009-04-09 01:29:04

Perhaps you would be interested in one of the Export* modules on CPAN that let you mark subs as exportable simply by adding an attribute to the sub definition? (I don't remember which one it was, though.)

echo , 2014-10-11 18:23:01

https://metacpan.org/pod/Exporter::Auto

Exporter::Auto - this is all you need.
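
Per its documentation, the whole interface is a single use line in the exporting module; a sketch, with invented package and sub names:

package My::Utils;
use Exporter::Auto;      # every sub defined in this package gets exported

sub hello { print "hello\n"; }
1;

# caller's side:
use My::Utils;
hello();                 # imported with no @EXPORT bookkeeping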

Tero Niemi , 2013-04-02 00:32:25

Although it is not usually wise to dump all subs from a module into the caller's namespace, it is sometimes useful (and more DRY!) to automatically generate the @EXPORT_OK and %EXPORT_TAGS variables.

The easiest method is to extend Exporter. A simple example is something like this:

package Exporter::AutoOkay;
#
#   Automatically add all subroutines from caller package into the
#   @EXPORT_OK array. In the package use like Exporter, f.ex.:
#
#       use parent 'Exporter::AutoOkay';
#
use warnings;
use strict;
no strict 'refs';

require Exporter;

sub import {
    my $package = $_[0].'::';

    # Get the list of exportable items
    my @export_ok = (@{$package.'EXPORT_OK'});

    # Automatically add all subroutines from package into the list
    foreach (keys %{$package}) {
        next unless defined &{$package.$_};
        push @export_ok, $_;
    }

    # Set variable ready for Exporter
    @{$package.'EXPORT_OK'} = @export_ok;

    # Let Exporter do the rest
    goto &Exporter::import;
}

1;

Note the use of goto, which removes the current frame from the call stack, so that Exporter::import sees the original caller.

A more complete example can be found here: http://pastebin.com/Z1QWzcpZ It automatically generates tag groups from subroutine prefixes.
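
Assuming the package above has been saved as Exporter/AutoOkay.pm somewhere in @INC, a consumer module would look like this (package and sub names invented):

package My::Common;
use parent 'Exporter::AutoOkay';

sub greet { print "hello\n"; }
1;

# caller's side: every sub is already in @EXPORT_OK, so request it by name
use My::Common qw(greet);
greet();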

Sérgio , 2013-11-14 21:38:06

Case 1

The library is:

package mycommon;

use strict;
use warnings;

sub onefunctionthatyoumadeonlibary() {
}
1;

You can use it by calling it with the mycommon:: prefix:

#!/usr/bin/perl
use strict;
use warnings;
use mycommon;

mycommon::onefunctionthatyoumadeonlibary();

Case 2

The library is the same, but you simply export the function:

package mycommon;

use strict;
use warnings;

use base 'Exporter';

our @EXPORT = qw(onefunctionthatyoumadeonlibary);
sub onefunctionthatyoumadeonlibary() {
}
1;

Use it in the same "namespace":

#!/usr/bin/perl
use strict;
use warnings;
use mycommon qw(onefunctionthatyoumadeonlibary);

onefunctionthatyoumadeonlibary();

We can also mix these two cases: export the more commonly used functions, so that they can be called without the package name, and leave the other functions unexported, to be called only with the package prefix.

You will have to do some typeglob munging. I describe something similar here:

Is there a way to "use" a single file that in turn uses multiple others in Perl?

The import routine there should do exactly what you want -- just don't import any symbols into your own namespace.
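
For reference, the core trick behind that approach is a one-line typeglob alias (package and sub names below are invented):

no strict 'refs';                         # we are about to use a symbolic reference
*{'Target::helper'} = \&Source::helper;   # Target::helper() now runs Source::helper()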

[Nov 23, 2019] What is the meaning of a single and a double underscore before an object name?

Notable quotes:
"... does not import objects whose name starts with an underscore. ..."
Aug 19, 2009 | stackoverflow.com
Can someone please explain the exact meaning of having leading underscores before an object's name in Python? Also, explain the difference between a single and a double leading underscore. Also, does that meaning stay the same whether the object in question is a variable, a function, a method, etc.?

Andrew Keeton , 2009-08-19 17:15:53

Single Underscore

Names, in a class, with a leading underscore are simply to indicate to other programmers that the attribute or method is intended to be private. However, nothing special is done with the name itself.

To quote PEP-8 :

_single_leading_underscore: weak "internal use" indicator. E.g. from M import * does not import objects whose name starts with an underscore.

Double Underscore (Name Mangling)

From the Python docs :

Any identifier of the form __spam (at least two leading underscores, at most one trailing underscore) is textually replaced with _classname__spam , where classname is the current class name with leading underscore(s) stripped. This mangling is done without regard to the syntactic position of the identifier, so it can be used to define class-private instance and class variables, methods, variables stored in globals, and even variables stored in instances, private to this class on instances of other classes.

And a warning from the same page:

Name mangling is intended to give classes an easy way to define "private" instance variables and methods, without having to worry about instance variables defined by derived classes, or mucking with instance variables by code outside the class. Note that the mangling rules are designed mostly to avoid accidents; it still is possible for a determined soul to access or modify a variable that is considered private.

Example
>>> class MyClass():
...     def __init__(self):
...             self.__superprivate = "Hello"
...             self._semiprivate = ", world!"
...
>>> mc = MyClass()
>>> print mc.__superprivate
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: MyClass instance has no attribute '__superprivate'
>>> print mc._semiprivate
, world!
>>> print mc.__dict__
{'_MyClass__superprivate': 'Hello', '_semiprivate': ', world!'}

Alex Martelli , 2009-08-19 17:52:36

Excellent answers so far but some tidbits are missing. A single leading underscore isn't exactly just a convention: if you use from foobar import * , and module foobar does not define an __all__ list, the names imported from the module do not include those with a leading underscore. Let's say it's mostly a convention, since this case is a pretty obscure corner;-).

The leading-underscore convention is widely used not just for private names, but also for what C++ would call protected ones -- for example, names of methods that are fully intended to be overridden by subclasses (even ones that have to be overridden since in the base class they raise NotImplementedError !-) are often single-leading-underscore names to indicate to code using instances of that class (or subclasses) that said methods are not meant to be called directly.

For example, to make a thread-safe queue with a different queueing discipline than FIFO, one imports Queue, subclasses Queue.Queue, and overrides such methods as _get and _put ; "client code" never calls those ("hook") methods, but rather the ("organizing") public methods such as put and get (this is known as the Template Method design pattern -- see e.g. here for an interesting presentation based on a video of a talk of mine on the subject, with the addition of synopses of the transcript).

Ned Batchelder , 2009-08-19 17:21:29

__foo__ : this is just a convention, a way for the Python system to use names that won't conflict with user names.

_foo : this is just a convention, a way for the programmer to indicate that the variable is private (whatever that means in Python).

__foo : this has real meaning: the interpreter replaces this name with _classname__foo as a way to ensure that the name will not overlap with a similar name in another class.

No other form of underscores has meaning in the Python world.

There's no difference between class, variable, global, etc in these conventions.

, 2016-05-17 10:09:08

._variable is semiprivate and meant just as a convention

.__variable is often incorrectly considered superprivate, while its actual meaning is just name mangling to prevent accidental access [1]

.__variable__ is typically reserved for built-in methods or variables

You can still access .__mangled variables if you desperately want to. The double underscores just name-mangle (rename) the variable to something like instance._className__mangled

Example:

class Test(object):
    def __init__(self):
        self.__a = 'a'
        self._b = 'b'

>>> t = Test()
>>> t._b
'b'

t._b is accessible because it is only hidden by convention

>>> t.__a
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'Test' object has no attribute '__a'

t.__a isn't found because it doesn't exist under that name, due to name mangling

>>> t._Test__a
'a'

By accessing instance._className__variable instead of just the double underscore name, you can access the hidden value

, 2018-08-21 19:42:09

Single underscore at the beginning:

Python doesn't have real private methods. Instead, one underscore at the start of a method or attribute name means you shouldn't access this method, because it's not part of the API.

class BaseForm(StrAndUnicode):

    def _get_errors(self):
        "Returns an ErrorDict for the data provided for the form"
        if self._errors is None:
            self.full_clean()
        return self._errors

    errors = property(_get_errors)

(This code snippet was taken from django source code: django/forms/forms.py). In this code, errors is a public property, but the method this property calls, _get_errors, is "private", so you shouldn't access it.

Two underscores at the beginning:

This causes a lot of confusion. It should not be used to create a private method. It should be used to avoid your method being overridden by a subclass or accessed accidentally. Let's see an example:

class A(object):
    def __test(self):
        print "I'm a test method in class A"

    def test(self):
        self.__test()

a = A()
a.test()
# a.__test() # This fails with an AttributeError
a._A__test() # Works! We can access the mangled name directly!

Output:

$ python test.py
I'm a test method in class A
I'm a test method in class A

Now create a subclass B and customize the __test method:

class B(A):
    def __test(self):
        print "I'm test method in class B"

b = B()
b.test()

Output will be....

$ python test.py
I'm a test method in class A

As we have seen, A.test() didn't call B.__test() , as we might expect. But in fact, this is the correct behavior for __ . The two methods called __test() are automatically renamed (mangled) to _A__test() and _B__test() , so they do not accidentally override each other. When you create a method starting with __ , it means that you don't want anyone to be able to override it, and that you only intend to access it from inside its own class.

Two underscores at the beginning and at the end:

When we see a method like __this__ , don't call it. This is a method which Python is meant to call, not you. Let's take a look:

>>> name = "test string"
>>> name.__len__()
11
>>> len(name)
11

>>> number = 10
>>> number.__add__(40)
50
>>> number + 50
60

There is always an operator or native function which calls these magic methods. Sometimes it's just a hook Python calls in specific situations. For example, __init__() is called when the object is created, after __new__() has been called to build the instance...

Let's take an example...

class FalseCalculator(object):

    def __init__(self, number):
        self.number = number

    def __add__(self, number):
        return self.number - number

    def __sub__(self, number):
        return self.number + number

number = FalseCalculator(20)
print number + 10      # 10
print number - 20      # 40

For more details, see the PEP-8 guide . For more magic methods, see this PDF .

Tim D , 2012-01-11 16:28:22

Sometimes you have what appears to be a tuple with a leading underscore as in
def foo(bar):
    return _('my_' + bar)

In this case, what's going on is that _() is an alias for a localization function that operates on text to put it into the proper language, etc. based on the locale. For example, Sphinx does this, and you'll find among the imports

from sphinx.locale import l_, _

and in sphinx.locale, _() is assigned as an alias of some localization function.

Dev Maha , 2013-04-15 01:58:14

If one really wants to make a variable read-only, IMHO the best way would be to use property() with only a getter passed to it. With property() we can have complete control over the data.
class PrivateVarC(object):

    def __init__(self):
        self._x = None

    def get_x(self):
        return self._x

    def set_x(self, val):
        self._x = val

    rwvar = property(get_x, set_x)   # read-write

    ronly = property(get_x)          # read-only
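
A quick usage sketch of the class above:

pv = PrivateVarC()
pv.rwvar = 5        # goes through set_x
print(pv.rwvar)     # 5
pv.ronly = 6        # AttributeError: no setter was passed to property()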

I understand that the OP asked a slightly different question, but since I found another question asking 'how to set private variables' marked as a duplicate of this one, I thought of adding this additional info here.

SilentGhost ,

A single leading underscore is a convention: from the interpreter's point of view there is no difference whether a name starts with a single underscore or not.

Double leading and trailing underscores are used for built-in methods, such as __init__ , __bool__ , etc.

Double leading underscores without trailing counterparts are a convention too; however, such names inside a class body (methods and attributes alike) will be mangled by the interpreter. For module-level variables or plain function names nothing changes.
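
A small standalone check of those cases (any Python 2 or 3 interpreter behaves the same):

class C(object):
    _single = 1       # convention only, stored as-is
    __double = 2      # mangled by the interpreter

print('_single' in dir(C))      # True
print('_C__double' in dir(C))   # True: stored under the mangled name

__module_level = 3              # no mangling outside a class body
print(__module_level)           # 3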

3 revs
, 2018-12-16 11:41:34

Since so many people are referring to Raymond's talk , I'll just make it a little easier by writing down what he said:

The intention of the double underscores was not about privacy. The intention was to use it exactly like this

class Circle(object):

    def __init__(self, radius):
        self.radius = radius

    def area(self):
        p = self.__perimeter()
        r = p / math.pi / 2.0
        return math.pi * r ** 2.0

    def perimeter(self):
        return 2.0 * math.pi * self.radius

    __perimeter = perimeter  # local reference


class Tire(Circle):

    def perimeter(self):
        return Circle.perimeter(self) * 1.25

It's actually the opposite of privacy, it's all about freedom. It makes your subclasses free to override any one method without breaking the others .

Say you don't keep a local reference of perimeter in Circle . Now, a derived class Tire overrides the implementation of perimeter , without touching area . When you call Tire(5).area() , in theory it should still be using Circle.perimeter for computation, but in reality it's using Tire.perimeter , which is not the intended behavior. That's why we need a local reference in Circle.

But why __perimeter instead of _perimeter ? Because _perimeter still gives derived class the chance to override:

class Tire(Circle):

    def perimeter(self):
        return Circle.perimeter(self) * 1.25

    _perimeter = perimeter

Double underscores trigger name mangling, so there is very little chance that the local reference in the parent class gets overridden in a derived class; thus it " makes your subclasses free to override any one method without breaking the others ".

If your class won't be inherited, or method overriding does not break anything, then you simply don't need __double_leading_underscore .

u0b34a0f6ae , 2009-08-19 17:31:04

Your question is good, it is not only about methods. Functions and objects in modules are commonly prefixed with one underscore as well, and can be prefixed by two.

But __double_underscore names are not name-mangled in modules, for example. What happens is that names beginning with one (or more) underscores are not imported if you import all from a module (from module import *), nor are the names shown in help(module).
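
A two-file sketch of that import behavior (the module name mymod is made up for illustration):

# mymod.py
public = 1
_internal = 2

# main.py
from mymod import *
print(public)       # 1
print(_internal)    # NameError: name '_internal' is not defined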

Marc , 2014-08-22 19:15:48

Here is a simple illustrative example on how double underscore properties can affect an inherited class. So with the following setup:
class parent(object):
    __default = "parent"
    def __init__(self, name=None):
        self.default = name or self.__default

    @property
    def default(self):
        return self.__default

    @default.setter
    def default(self, value):
        self.__default = value


class child(parent):
    __default = "child"

if you then create a child instance in the python REPL, you will see the below

child_a = child()
child_a.default            # 'parent'
child_a._child__default    # 'child'
child_a._parent__default   # 'parent'

child_b = child("orphan")
## this will show 
child_b.default            # 'orphan'
child_b._child__default    # 'child'
child_b._parent__default   # 'orphan'

This may be obvious to some, but it caught me off guard in a much more complex environment.

aptro , 2015-02-07 17:57:10

"Private" instance variables that cannot be accessed except from inside an object don't exist in Python. However, there is a convention that is followed by most Python code: a name prefixed with an underscore (e.g. _spam) should be treated as a non-public part of the API (whether it is a function, a method or a data member). It should be considered an implementation detail and subject to change without notice.

reference https://docs.python.org/2/tutorial/classes.html#private-variables-and-class-local-references

grepit , 2019-01-21 22:23:39

Great answers, and all are correct. I have provided a simple example along with simple definitions/meanings.

Meaning:

some_variable --► it's public, anyone can see this.

_some_variable --► it's still public, but by convention the underscore indicates "treat this as private"... warning: no enforcement is done by Python.

__some_variable --► Python replaces the variable name with _classname__some_variable (AKA name mangling), which reduces/hides its visibility and makes it behave more like a private variable.

Just to be honest here, according to the Python documentation:

""Private" instance variables that cannot be accessed except from inside an object don't exist in Python"

The example:

class A():
    here="abc"
    _here="_abc"
    __here="__abc"


aObject=A()
print(aObject.here) 
print(aObject._here)
# trying to print __here fails with AttributeError, because the name is mangled to _A__here
#print(aObject.__here)

2 revs
, 2017-11-04 17:51:49

Getting the facts of _ and __ is pretty easy; the other answers express them pretty well. The usage is much harder to determine.

This is how I see it:

_

Should be used to indicate that a function is not for public use, i.e. not part of the API. This and the import restriction make it behave much like internal in C#.

__

Should be used to avoid name collisions in the inheritance hierarchy and to avoid late binding. Much like private in C#.

==>

If you want to indicate that something is not for public use, but it should act like protected use _ . If you want to indicate that something is not for public use, but it should act like private use __ .

This is also a quote that I like very much:

The problem is that the author of a class may legitimately think "this attribute/method name should be private, only accessible from within this class definition" and use the __private convention. But later on, a user of that class may make a subclass that legitimately needs access to that name. So either the superclass has to be modified (which may be difficult or impossible), or the subclass code has to use manually mangled names (which is ugly and fragile at best).

But the problem with that, in my opinion, is that if there's no IDE that warns you when you override methods, finding the error might take you a while if you have accidentally overridden a method from a base class.

[Nov 23, 2019] Static local variables in Perl

Jan 01, 2012 | stackoverflow.com



Charles , 2012-05-31 20:50:19

I'm looking for advice on Perl best practices. I wrote a script which had a complicated regular expression:
my $regex = qr/complicated/;

# ...

sub foo {
  # ...

  if (/$regex/)
  # ...
}

where foo is a function which is called often, and $regex is not used outside that function. What is the best way to handle situations like this? I only want it to be interpreted once, since it's long and complicated. But it seems a bit questionable to have it in global scope since it's only used in that sub. Is there a reasonable way to declare it static?

A similar issue arises with another possibly-unjustified global. It reads in the current date and time and formats it appropriately. This is also used many times, and again only in one function. But in this case it's even more important that it not be re-initialized, since I want all instances of the date-time to be the same from a given invocation of the script, even if the minutes roll over during execution.

At the moment I have something like

my ($regex, $DT);

sub driver {
  $regex = qr/complicated/;
  $DT = dateTime();
  # ...
}

# ...

driver();

which at least slightly segregates it. But perhaps there are better ways.

Again: I'm looking for the right way to do this, in terms of following best practices and Perl idioms. Performance is nice but readability and other needs take priority if I can't have everything.

hobbs ,

If you're using perl 5.10+, use a state variable.
use feature 'state';
# use 5.010; also works

sub womble {
    state $foo = something_expensive();
    return $foo ** 2;
}

will only call something_expensive once.

If you need to work with older perls, then use a lexical variable in an outer scope with an extra pair of braces:

{
    my $foo = something_expensive();
    sub womble {
        return $foo ** 2;
    }
}

this keeps $foo from leaking to anyone except for womble .

ikegami , 2012-05-31 21:14:04

Is there any interpolation in the pattern? If not, the pattern will only be compiled once no matter how many times the qr// is executed.
$ perl -Mre=debug -e'qr/foo/ for 1..10' 2>&1 | grep Compiling | wc -l
1

$ perl -Mre=debug -e'qr/foo$_/ for 1..10' 2>&1 | grep Compiling | wc -l
10

Even if there is interpolation, the pattern will only be compiled if the interpolated variables have changed.

$ perl -Mre=debug -e'$x=123; qr/foo$x/ for 1..10;' 2>&1 | grep Compiling | wc -l
1

$ perl -Mre=debug -e'qr/foo$_/ for 1..10' 2>&1 | grep Compiling | wc -l
10

Otherwise, you can use

{
   my $re = qr/.../;
   sub foo {
      ...
      /$re/
      ...
   }
}

or

use feature qw( state );
sub foo {
   state $re = qr/.../;
   ...
   /$re/
   ...
}

Alan Rocker , 2014-07-02 16:25:27

Regexes can be specified with the "o" modifier, which says "compile pattern once only" - see the 3rd edition of the Camel, p. 147.

zoul ,

There's a state keyword that might be a good fit for this situation:
sub foo {
    state $regex = qr/.../;
    ...
}

TrueY , 2015-01-23 10:14:12

I would like to complement ikegami 's great answer with a few more words about defining such variables in pre-5.10 Perl.

Let's see a simple example code:

#!/usr/bin/env perl

use strict;
use warnings;

{ # local
    my $local = "After Crying";
    sub show { print $local, "\n"; }
} # local

sub show2;

show;
show2;

exit;

{ # local
    my $local = "Solaris";
    sub show2 { print $local, "\n"; }
} # local

The user would expect that both subs will print their local variable, but this is not true!

Output:

After Crying
Use of uninitialized value $local in print at ./x.pl line 20.

The reason is that show2 is parsed, but the initialization of the local variable is not executed! (Of course, if exit is removed and a show2 call is added at the end, Solaris will be printed in the third line.)

This can be fixed easily:

{ # local
    my $local;
    BEGIN { $local = "Solaris"; }
    sub show2 { print $local, "\n"; }
} # local

And now the output is what was expected:

After Crying
Solaris

But state in 5.10+ is a better choice...

I hope this helps!

[Nov 23, 2019] How can I remove a trailing newline?

Jan 01, 2008 | stackoverflow.com



, 2008-11-08 18:25:24

What is the Python equivalent of Perl's chomp function, which removes the last character of a string if it is a newline?

9 revs, 7 users 37%
, 2017-05-11 19:54:59

Try the method rstrip() (see doc Python 2 and Python 3 )
>>> 'test string\n'.rstrip()
'test string'

Python's rstrip() method strips all kinds of trailing whitespace by default, not just one newline as Perl does with chomp .

>>> 'test string \n \r\n\n\r \n\n'.rstrip()
'test string'

To strip only newlines:

>>> 'test string \n \r\n\n\r \n\n'.rstrip('\n')
'test string \n \r\n\n\r '

There are also the methods lstrip() and strip() :

>>> s = "   \n\r\n  \n  abc   def \n\r\n  \n  "
>>> s.strip()
'abc   def'
>>> s.lstrip()
'abc   def \n\r\n  \n  '
>>> s.rstrip()
'   \n\r\n  \n  abc   def'

Ryan Ginstrom , 2008-11-09 05:52:43

And I would say the "pythonic" way to get lines without trailing newline characters is splitlines().
>>> text = "line 1\nline 2\r\nline 3\nline 4"
>>> text.splitlines()
['line 1', 'line 2', 'line 3', 'line 4']

Mike ,

The canonical way to strip end-of-line (EOL) characters is to use the string rstrip() method removing any trailing \r or \n. Here are examples for Mac, Windows, and Unix EOL characters.
>>> 'Mac EOL\r'.rstrip('\r\n')
'Mac EOL'
>>> 'Windows EOL\r\n'.rstrip('\r\n')
'Windows EOL'
>>> 'Unix EOL\n'.rstrip('\r\n')
'Unix EOL'

Using '\r\n' as the parameter to rstrip means that it will strip out any trailing combination of '\r' or '\n'. That's why it works in all three cases above.

This nuance matters in rare cases. For example, I once had to process a text file which contained an HL7 message. The HL7 standard requires a trailing '\r' as its EOL character. The Windows machine on which I was using this message had appended its own '\r\n' EOL character. Therefore, the end of each line looked like '\r\r\n'. Using rstrip('\r\n') would have taken off the entire '\r\r\n' which is not what I wanted. In that case, I simply sliced off the last two characters instead.
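
A sketch of that HL7 case (the message text here is made up): slicing keeps the protocol's '\r' and drops only the '\r\n' that Windows appended.

line = 'MSH|final segment\r\r\n'
if line.endswith('\r\n'):
    line = line[:-2]
print(repr(line))    # 'MSH|final segment\r'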

Note that unlike Perl's chomp function, this will strip all specified characters at the end of the string, not just one:

>>> "Hello\n\n\n".rstrip("\n")
"Hello"

, 2008-11-28 17:31:34

Note that rstrip doesn't act exactly like Perl's chomp() because it doesn't modify the string. That is, in Perl:
$x="a\n";

chomp $x

results in $x being "a" .

but in Python:

x="a\n"

x.rstrip()

will mean that the value of x is still "a\n" . Even x=x.rstrip() doesn't always give the same result, as it strips all whitespace from the end of the string, not just one newline at most.

Jamie ,

I might use something like this:
import os
s = s.rstrip(os.linesep)

I think the problem with rstrip("\n") is that you'll probably want to make sure the line separator is portable. (Some antiquated systems are rumored to use "\r\n" .) The other gotcha is that rstrip will strip out repeated occurrences of the separator characters. Hopefully os.linesep will contain the right characters; the above works for me.

kiriloff , 2013-05-13 16:41:22

You may use line = line.rstrip('\n') . This will strip all newlines from the end of the string, not just one.

slec , 2015-03-09 08:02:55

s = s.rstrip()

will remove all newlines at the end of the string s . The assignment is needed because rstrip returns a new string instead of modifying the original string.

Alien Life Form ,

This would exactly replicate Perl's chomp (minus its behavior on arrays) for a "\n" line terminator:
def chomp(x):
    if x.endswith("\r\n"): return x[:-2]
    if x.endswith("\n") or x.endswith("\r"): return x[:-1]
    return x

(Note: it does not modify the string 'in place'; it does not strip extra trailing whitespace; it takes \r\n into account.)

Hackaholic ,

you can use strip:
line = line.strip()

demo:

>>> "\n\n hello world \n\n".strip()
'hello world'

mihaicc ,

"line 1\nline 2\r\n...".replace('\n', '').replace('\r', '')
>>> 'line 1line 2...'

or you could always get geekier with regexps :)

have fun!

Carlos Valiente , 2011-04-27 11:43:20

Careful with "foo".rstrip(os.linesep) : that will only chomp the newline characters for the platform where your Python is being executed. Imagine you're chomping the lines of a Windows file under Linux, for instance:
$ python
Python 2.7.1 (r271:86832, Mar 18 2011, 09:09:48) 
[GCC 4.5.0 20100604 [gcc-4_5-branch revision 160292]] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import os, sys
>>> sys.platform
'linux2'
>>> "foo\r\n".rstrip(os.linesep)
'foo\r'
>>>

Use "foo".rstrip("\r\n") instead, as Mike says above.

minopret , 2013-10-23 01:32:11

An example in Python's documentation simply uses line.strip() .

Perl's chomp function removes one linebreak sequence from the end of a string only if it's actually there.

Here is how I plan to do that in Python, if process is conceptually the function that I need in order to do something useful to each line from this file:

import os
sep_pos = -len(os.linesep)
with open("file.txt") as f:
    for line in f:
        if line[sep_pos:] == os.linesep:
            line = line[:sep_pos]
        process(line)

ingydotnet ,

rstrip doesn't do the same thing as chomp, on so many levels. Read http://perldoc.perl.org/functions/chomp.html and see that chomp is very complex indeed.

However, my main point is that chomp removes at most 1 line ending, whereas rstrip will remove as many as it can.

Here you can see rstrip removing all the newlines:

>>> 'foo\n\n'.rstrip(os.linesep)
'foo'

A much closer approximation of typical Perl chomp usage can be accomplished with re.sub, like this:

>>> re.sub(os.linesep + r'\Z','','foo\n\n')
'foo\n'

Andrew Grimm ,

I don't program in Python, but I came across an FAQ at python.org advocating S.rstrip("\r\n") for python 2.2 or later.

, 2014-01-20 19:07:03

import re

r_unwanted = re.compile("[\n\t\r]")
r_unwanted.sub("", your_text)

Leozj ,

If your question is to clean up all the line breaks in a multiple line str object (oldstr), you can split it into a list according to the delimiter '\n' and then join this list into a new str(newstr).

newstr = "".join(oldstr.split('\n'))

kuzzooroo ,

I find it convenient to be able to get the chomped lines via an iterator, parallel to the way you can get the un-chomped lines from a file object. You can do so with the following code:
import operator

def chomped_lines(it):
    return map(operator.methodcaller('rstrip', '\r\n'), it)

Sample usage:

with open("file.txt") as infile:
    for line in chomped_lines(infile):
        process(line)

Chij , 2011-11-30 14:04:19

A workaround solution for a special case:

if the newline character is the last character (as is the case with most file inputs), then for any element in the collection you can index as follows:

foobar= foobar[:-1]

to slice out your newline character.

user3780389 , 2017-04-26 17:58:16

It looks like there is not a perfect analog for Perl's chomp . In particular, rstrip cannot handle multi-character newline delimiters like \r\n . However, splitlines does, as pointed out here . Following my answer on a different question, you can combine join and splitlines to remove/replace all newlines from a string s :
''.join(s.splitlines())

The following removes exactly one trailing newline (as chomp would, I believe). Passing True as the keepends argument to splitlines retains the delimiters. Then, splitlines is called again to remove the delimiters on just the last "line":

def chomp(s):
    if len(s):
        lines = s.splitlines(True)
        last = lines.pop()
        return ''.join(lines + last.splitlines())
    else:
        return ''

Taylor Edmiston ,

I'm bubbling up my regular-expression-based answer from one I posted earlier in the comments of another answer. I think using re is a clearer, more explicit solution to this problem than str.rstrip .
>>> import re

If you want to remove one or more trailing newline chars:

>>> re.sub(r'[\n\r]+$', '', '\nx\r\n')
'\nx'

If you want to remove newline chars everywhere (not just trailing):

>>> re.sub(r'[\n\r]+', '', '\nx\r\n')
'x'

If you want to remove only 1-2 trailing newline chars (i.e., \r , \n , \r\n , \n\r , \r\r , \n\n )

>>> re.sub(r'[\n\r]{1,2}$', '', '\nx\r\n\r\n')
'\nx\r'
>>> re.sub(r'[\n\r]{1,2}$', '', '\nx\r\n\r')
'\nx\r'
>>> re.sub(r'[\n\r]{1,2}$', '', '\nx\r\n')
'\nx'

I have a feeling that what most people really want here is to remove just one occurrence of a trailing newline character, either \r\n or \n , and nothing more.

>>> re.sub(r'(?:\r\n|\n)$', '', '\nx\n\n', count=1)
'\nx\n'
>>> re.sub(r'(?:\r\n|\n)$', '', '\nx\r\n\r\n', count=1)
'\nx\r\n'
>>> re.sub(r'(?:\r\n|\n)$', '', '\nx\r\n', count=1)
'\nx'
>>> re.sub(r'(?:\r\n|\n)$', '', '\nx\n', count=1)
'\nx'

(The ?: is to create a non-capturing group.)

(By the way, this is not what '...'.rstrip('\n').rstrip('\r') does, which may not be clear to others stumbling upon this thread. str.rstrip strips as many of the trailing characters as possible, so a string like foo\n\n\n would result in a false positive of foo , whereas you may have wanted to preserve the other newlines after stripping a single trailing one.)

Help me , 2016-05-20 12:29:21

Just use :
line = line.rstrip("\n")

or

line = line.strip("\n")

You don't need any of this complicated stuff.

, 2016-11-22 18:30:37

>>> '   spacious   '.rstrip()
'   spacious'
>>> "AABAA".rstrip("A")
'AAB'
>>> "ABBA".rstrip("AB") # both AB and BA are stripped
''
>>> "ABCABBA".rstrip("AB")
'ABC'

internetional , 2016-11-22 20:17:58

There are three types of line endings that we normally encounter: \n , \r and \r\n . A rather simple regular expression in re.sub , namely r"\r?\n?$" , is able to catch them all.

(And we gotta catch 'em all , am I right?)

import re

re.sub(r"\r?\n?$", "", the_text, 1)

With the last argument, we limit the number of occurrences replaced to one, mimicking chomp to some extent. Example:

import re

text_1 = "hellothere\n\n\n"
text_2 = "hellothere\n\n\r"
text_3 = "hellothere\n\n\r\n"

a = re.sub(r"\r?\n?$", "", text_1, 1)
b = re.sub(r"\r?\n?$", "", text_2, 1)
c = re.sub(r"\r?\n?$", "", text_3, 1)

... where a == b == c is True .

Venfah Nazir , 2018-06-15 07:24:21


This will work on both Windows and Linux (a bit expensive with re.sub if you are looking for a regex-only solution):

import re 
if re.search("(\\r|)\\n$", line):
    line = re.sub("(\\r|)\\n$", "", line)

Stephen Miller ,

If you are concerned about speed (say you have a looong list of strings) and you know the nature of the newline char, string slicing is actually faster than rstrip. A little test to illustrate this:
import time

loops = 50000000

def method1(loops=loops):
    test_string = 'num\n'
    t0 = time.time()
    for num in xrange(loops):
        out_string = test_string[:-1]
    t1 = time.time()
    print('Method 1: ' + str(t1 - t0))

def method2(loops=loops):
    test_string = 'num\n'
    t0 = time.time()
    for num in xrange(loops):
        out_string = test_string.rstrip()
    t1 = time.time()
    print('Method 2: ' + str(t1 - t0))

method1()
method2()

Output:

Method 1: 3.92700004578
Method 2: 6.73000001907

sim , 2019-10-22 07:43:27

s = '''Hello  World \t\n\r\tHi There'''
# import the module string   
import string
# use the method translate to convert 
s.translate({ord(c): None for c in string.whitespace})
>>'HelloWorldHiThere'

With regex

import re

s = '''  Hello  World 
\t\n\r\tHi '''
print(re.sub(r"\s+", "", s))  # \s matches all whitespace
>HelloWorldHi

Replace \n,\t,\r

s.replace('\n', '').replace('\t','').replace('\r','')
>'  Hello  World Hi '

With regex

s = '''Hello  World \t\n\r\tHi There'''
regex = re.compile(r'[\n\r\t]')
regex.sub("", s)
>'Hello  World Hi There'

with Join

s = '''Hello  World \t\n\r\tHi There'''
' '.join(s.split())
>'Hello  World Hi There'

DeepBlue , 2019-11-06 20:50:30

First split lines then join them by any separator you like.
  x = ' '.join(x.splitlines())

should work like a charm.

user4178860 , 2014-10-24 18:34:12

A catch all:
line = line.rstrip('\r|\n')

Flimm , 2016-06-30 16:20:15

rstrip does not take a regular expression. "hi|||\n\n".rstrip("\r|\n") returns "hi" – Flimm, Jun 30 '16 at 16:20

[Nov 22, 2019] Using global variables in a function

Jan 01, 2009 | stackoverflow.com

user46646 , 2009-01-08 05:45:02

How can I create or use a global variable in a function?

If I create a global variable in one function, how can I use that global variable in another function? Do I need to store the global variable in a local variable of the function which needs its access?

Paul Stephenson , 2009-01-08 08:39:44

You can use a global variable in other functions by declaring it as global in each function that assigns to it:
globvar = 0

def set_globvar_to_one():
    global globvar    # Needed to modify global copy of globvar
    globvar = 1

def print_globvar():
    print(globvar)     # No need for global declaration to read value of globvar

set_globvar_to_one()
print_globvar()       # Prints 1

I imagine the reason for it is that, since global variables are so dangerous, Python wants to make sure that you really know that's what you're playing with by explicitly requiring the global keyword.

See other answers if you want to share a global variable across modules.

Jeff Shannon , 2009-01-08 09:19:55

If I'm understanding your situation correctly, what you're seeing is the result of how Python handles local (function) and global (module) namespaces.

Say you've got a module like this:

# sample.py
myGlobal = 5

def func1():
    myGlobal = 42

def func2():
    print myGlobal

func1()
func2()

You might expect this to print 42, but instead it prints 5. As has already been mentioned, if you add a ' global ' declaration to func1() , then func2() will print 42.

def func1():
    global myGlobal
    myGlobal = 42

What's going on here is that Python assumes that any name that is assigned to , anywhere within a function, is local to that function unless explicitly told otherwise. If it is only reading from a name, and the name doesn't exist locally, it will try to look up the name in any containing scopes (e.g. the module's global scope).

When you assign 42 to the name myGlobal , therefore, Python creates a local variable that shadows the global variable of the same name. That local goes out of scope and is garbage-collected when func1() returns; meanwhile, func2() can never see anything other than the (unmodified) global name. Note that this namespace decision happens at compile time, not at runtime -- if you were to read the value of myGlobal inside func1() before you assign to it, you'd get an UnboundLocalError , because Python has already decided that it must be a local variable but it has not had any value associated with it yet. But by using the ' global ' statement, you tell Python that it should look elsewhere for the name instead of assigning to it locally.

(I believe that this behavior originated largely through an optimization of local namespaces -- without this behavior, Python's VM would need to perform at least three name lookups each time a new name is assigned to inside a function (to ensure that the name didn't already exist at module/builtin level), which would significantly slow down a very common operation.)
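
A minimal standalone sketch of that compile-time decision: reading the name before the assignment raises UnboundLocalError instead of falling back to the global.

myGlobal = 5

def func3():
    print(myGlobal)   # UnboundLocalError: local variable 'myGlobal' referenced before assignment
    myGlobal = 42

func3()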

gimel , 2009-01-08 05:59:04

You may want to explore the notion of namespaces . In Python, the module is the natural place for global data:

Each module has its own private symbol table, which is used as the global symbol table by all functions defined in the module. Thus, the author of a module can use global variables in the module without worrying about accidental clashes with a user's global variables. On the other hand, if you know what you are doing you can touch a module's global variables with the same notation used to refer to its functions, modname.itemname .

A specific use of global-in-a-module is described here - How do I share global variables across modules? , and for completeness the contents are shared here:

The canonical way to share information across modules within a single program is to create a special configuration module (often called config or cfg ). Just import the configuration module in all modules of your application; the module then becomes available as a global name. Because there is only one instance of each module, any changes made to the module object get reflected everywhere. For example:

File: config.py

x = 0   # Default value of the 'x' configuration setting

File: mod.py

import config
config.x = 1

File: main.py

import config
import mod
print config.x

SingleNegationElimination ,

Python uses a simple heuristic to decide which scope it should load a variable from, between local and global. If a variable name appears on the left hand side of an assignment, but is not declared global, it is assumed to be local. If it does not appear on the left hand side of an assignment, it is assumed to be global.
>>> import dis
>>> def foo():
...     global bar
...     baz = 5
...     print bar
...     print baz
...     print quux
... 
>>> dis.disassemble(foo.func_code)
  3           0 LOAD_CONST               1 (5)
              3 STORE_FAST               0 (baz)

  4           6 LOAD_GLOBAL              0 (bar)
              9 PRINT_ITEM          
             10 PRINT_NEWLINE       

  5          11 LOAD_FAST                0 (baz)
             14 PRINT_ITEM          
             15 PRINT_NEWLINE       

  6          16 LOAD_GLOBAL              1 (quux)
             19 PRINT_ITEM          
             20 PRINT_NEWLINE       
             21 LOAD_CONST               0 (None)
             24 RETURN_VALUE        
>>>

See how baz, which appears on the left side of an assignment in foo() , is the only LOAD_FAST variable.

J S , 2009-01-08 09:03:33

If you want to refer to a global variable in a function, you can use the global keyword to declare which variables are global. You don't have to use it in all cases (as someone here incorrectly claims) - if the name referenced in an expression cannot be found in local scope or scopes in the functions in which this function is defined, it is looked up among global variables.

However, if you assign to a new variable not declared as global in the function, it is implicitly declared as local, and it can overshadow any existing global variable with the same name.

Also, global variables are useful, contrary to some OOP zealots who claim otherwise - especially for smaller scripts, where OOP is overkill.
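
A short sketch of both cases described in this answer:

counter = 0             # global

def read_only():
    return counter + 1  # reading needs no declaration

def shadow():
    counter = 100       # assignment makes it local: the global is shadowed
    return counter

print(read_only())      # 1
print(shadow())         # 100
print(counter)          # 0, the global is untouched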

Rauni Lillemets ,

In addition to already existing answers and to make this more confusing:

In Python, variables that are only referenced inside a function are implicitly global . If a variable is assigned a new value anywhere within the function's body, it's assumed to be local , and you need to explicitly declare it as 'global' if you want the assignment to target the global one.

Though a bit surprising at first, a moment's consideration explains this. On one hand, requiring global for assigned variables provides a bar against unintended side-effects. On the other hand, if global was required for all global references, you'd be using global all the time. You'd have to declare as global every reference to a built-in function or to a component of an imported module. This clutter would defeat the usefulness of the global declaration for identifying side-effects.

Source: What are the rules for local and global variables in Python? .

Aaron Hall ,

If I create a global variable in one function, how can I use that variable in another function?

We can create a global with the following function:

def create_global_variable():
    global global_variable # must declare it to be a global first
    # modifications are thus reflected on the module's global scope
    global_variable = 'Foo'

Writing a function does not actually run its code. So we call the create_global_variable function:

>>> create_global_variable()
Using globals without modification

You can just use it, so long as you don't expect to change which object it points to:

For example,

def use_global_variable():
    return global_variable + '!!!'

and now we can use the global variable:

>>> use_global_variable()
'Foo!!!'
Modification of the global variable from inside a function

To point the global variable at a different object, you are required to use the global keyword again:

def change_global_variable():
    global global_variable
    global_variable = 'Bar'

Note that after writing this function, the code actually changing it has still not run:

>>> use_global_variable()
'Foo!!!'

So after calling the function:

>>> change_global_variable()

we can see that the global variable has been changed. The global_variable name now points to 'Bar' :

>>> use_global_variable()
'Bar!!!'

Note that "global" in Python is not truly global - it's only global to the module level. So it is only available to functions written in the modules in which it is global. Functions remember the module in which they are written, so when they are exported into other modules, they still look in the module in which they were created to find global variables.

Local variables with the same name

If you create a local variable with the same name, it will overshadow a global variable:

def use_local_with_same_name_as_global():
    # bad name for a local variable, though.
    global_variable = 'Baz' 
    return global_variable + '!!!'

>>> use_local_with_same_name_as_global()
'Baz!!!'

But using that misnamed local variable does not change the global variable:

>>> use_global_variable()
'Bar!!!'

Note that you should avoid using the local variables with the same names as globals unless you know precisely what you are doing and have a very good reason to do so. I have not yet encountered such a reason.

Bohdan , 2013-10-03 05:41:16

With parallel execution, global variables can cause unexpected results if you don't understand what is happening. Here is an example of using a global variable within multiprocessing. We can clearly see that each process works with its own copy of the variable:
import multiprocessing
import os
import random
import sys
import time

def worker(new_value):
    old_value = get_value()
    set_value(random.randint(1, 99))
    print('pid=[{pid}] '
          'old_value=[{old_value:2}] '
          'new_value=[{new_value:2}] '
          'get_value=[{get_value:2}]'.format(
          pid=str(os.getpid()),
          old_value=old_value,
          new_value=new_value,
          get_value=get_value()))

def get_value():
    global global_variable
    return global_variable

def set_value(new_value):
    global global_variable
    global_variable = new_value

global_variable = -1

print('before set_value(), get_value() = [%s]' % get_value())
set_value(new_value=-2)
print('after  set_value(), get_value() = [%s]' % get_value())

processPool = multiprocessing.Pool(processes=5)
processPool.map(func=worker, iterable=range(15))

Output:

before set_value(), get_value() = [-1]
after  set_value(), get_value() = [-2]
pid=[53970] old_value=[-2] new_value=[ 0] get_value=[23]
pid=[53971] old_value=[-2] new_value=[ 1] get_value=[42]
pid=[53970] old_value=[23] new_value=[ 4] get_value=[50]
pid=[53970] old_value=[50] new_value=[ 6] get_value=[14]
pid=[53971] old_value=[42] new_value=[ 5] get_value=[31]
pid=[53972] old_value=[-2] new_value=[ 2] get_value=[44]
pid=[53973] old_value=[-2] new_value=[ 3] get_value=[94]
pid=[53970] old_value=[14] new_value=[ 7] get_value=[21]
pid=[53971] old_value=[31] new_value=[ 8] get_value=[34]
pid=[53972] old_value=[44] new_value=[ 9] get_value=[59]
pid=[53973] old_value=[94] new_value=[10] get_value=[87]
pid=[53970] old_value=[21] new_value=[11] get_value=[21]
pid=[53971] old_value=[34] new_value=[12] get_value=[82]
pid=[53972] old_value=[59] new_value=[13] get_value=[ 4]
pid=[53973] old_value=[87] new_value=[14] get_value=[70]

user2876408 ,

As it turns out the answer is always simple.

Here is a small sample module with a simple way to show it in a main definition:

def five(enterAnumber,sumation):
    global helper
    helper  = enterAnumber + sumation

def isTheNumber():
    return helper

Here is how to show it in a main definition:

import TestPy

def main():
    atest  = TestPy
    atest.five(5,8)
    print(atest.isTheNumber())

if __name__ == '__main__':
    main()

This simple code works just like that, and it will execute. I hope it helps.

gxyd , 2014-12-04 06:27:43

What you are describing is reading the global into a local variable, like this:
globvar = 5

def f():
    var = globvar
    print(var)

f()  # Prints 5

But the better way is to use the global variable like this:

globvar = 5
def f():
    global globvar
    print(globvar)
f()   #prints 5

Both give the same output.

Mohamed El-Saka , 2014-12-20 12:45:26

You need to declare the global variable in every function in which you want to assign to it.

As follows:

var = "test"

def printGlobalText():
    global var # We are telling the function to explicitly use the global version
    var = "global from printGlobalText fun."
    print "var from printGlobalText: " + var

def printLocalText():
    #We are NOT telling to explicitly use the global version, so we are creating a local variable
    var = "local version from printLocalText fun"
    print "var from printLocalText: " + var

printGlobalText()
printLocalText()
"""
Output Result:
var from printGlobalText: global from printGlobalText fun.
var from printLocalText: local version from printLocalText fun
[Finished in 0.1s]
"""

Kylotan , 2009-01-09 11:56:19

You're not actually storing the global in a local variable, just creating a local reference to the same object that your original global reference refers to. Remember that pretty much everything in Python is a name referring to an object, and nothing gets copied in usual operation.
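
A sketch of that point: binding a global to a local name copies the reference, not the object, so mutating the object through either name is visible everywhere.

items = [1, 2]      # global list

def grab():
    local = items   # same object, new local name
    local.append(3)

grab()
print(items)        # [1, 2, 3]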

If you didn't have to explicitly specify when an identifier was to refer to a predefined global, then you'd presumably have to explicitly specify when an identifier is a new local variable instead (for example, with something like the 'var' command seen in JavaScript). Since local variables are more common than global variables in any serious and non-trivial system, Python's system makes more sense in most cases.

You could have a language which attempted to guess, using a global variable if it existed or creating a local variable if it didn't. However, that would be very error-prone. For example, importing another module could inadvertently introduce a global variable by that name, changing the behaviour of your program.

Sagar Mehta ,

Try this:
def x1():
    global x
    x = 6

def x2():
    global x
    x = x+1
    print x

x = 5
x1()
x2()  # output --> 7

Martin Thoma , 2017-04-07 18:52:13

In case you have a local variable with the same name, you might want to use the globals() function .
globals()['your_global_var'] = 42

, 2015-10-24 15:46:18

Following on, and as an add-on: use a file to contain all global variables, all declared locally, and then import it:

File initval.py :

Stocksin = 300
Prices = []

File getstocks.py :

import initval as iv

def getmystocks(): 
    iv.Stocksin = getstockcount()


def getmycharts():
    for ic in range(iv.Stocksin):
        pass  # loop body elided in the original

Mike Lampton , 2016-01-07 20:41:19

Writing to explicit elements of a global array apparently does not need the global declaration, though writing to it "wholesale" does have that requirement:
import numpy as np

hostValue = 3.14159
hostArray = np.array([2., 3.])
hostMatrix = np.array([[1.0, 0.0],[ 0.0, 1.0]])

def func1():
    global hostValue    # mandatory, else local.
    hostValue = 2.0

def func2():
    global hostValue    # mandatory, else UnboundLocalError.
    hostValue += 1.0

def func3():
    global hostArray    # mandatory, else local.
    hostArray = np.array([14., 15.])

def func4():            # no need for globals
    hostArray[0] = 123.4

def func5():            # no need for globals
    hostArray[1] += 1.0

def func6():            # no need for globals
    hostMatrix[1][1] = 12.

def func7():            # no need for globals
    hostMatrix[0][0] += 0.33

func1()
print "After func1(), hostValue = ", hostValue
func2()
print "After func2(), hostValue = ", hostValue
func3()
print "After func3(), hostArray = ", hostArray
func4()
print "After func4(), hostArray = ", hostArray
func5()
print "After func5(), hostArray = ", hostArray
func6()
print "After func6(), hostMatrix = \n", hostMatrix
func7()
print "After func7(), hostMatrix = \n", hostMatrix

Rafaël Dera ,

I'm adding this as I haven't seen it in any of the other answers and it might be useful for someone struggling with something similar. The globals() function returns a mutable global symbol dictionary where you can "magically" make data available for the rest of your code. For example:
from pickle import load
def loaditem(name):
    with open(r"C:\pickle\file\location"+"\{}.dat".format(name), "rb") as openfile:
        globals()[name] = load(openfile)
    return True

and

from pickle import dump
def dumpfile(name):
    with open(name+".dat", "wb") as outfile:
        dump(globals()[name], outfile)
    return True

Will just let you dump/load variables out of and into the global namespace. Super convenient, no muss, no fuss. Pretty sure it's Python 3 only.

llewellyn falco , 2017-08-19 08:48:27

Reference the class namespace where you want the change to show up.

In this example, runner is using max from the file config. I want my test to change the value of max when runner is using it.

main/config.py

max = 15000

main/runner.py

from main import config
def check_threads():
    return max < thread_count

tests/runner_test.py

from main import runner                # <----- 1. add file
from main.runner import check_threads
class RunnerTest(unittest):
   def test_threads(self):
       runner.max = 0                  # <----- 2. set global 
       check_threads()

[Nov 21, 2019] How to fix Python indentation

Nov 21, 2019 | stackoverflow.com

ephemient ,Feb 28, 2010 at 0:35

Use the reindent.py script that you find in the Tools/scripts/ directory of your Python installation:

Change Python (.py) files to use 4-space indents and no hard tab characters. Also trim excess spaces and tabs from ends of lines, and remove empty lines at the end of files. Also ensure the last line ends with a newline.

Have a look at that script for detailed usage instructions.

[Nov 21, 2019] Python Formatter Tool

Jul 3, 2014 | stackoverflow.com

tricasse ,Jul 3, 2014 at 19:15

I was wondering if there exists a sort of Python beautifier like the gnu-indent command line tool for C code. Of course indentation is not the point in Python, since it is the programmer's responsibility, but I wish to get my code written in a perfectly homogeneous way, taking care particularly of having always identical blank space between operands or after and before separators and between blocks.

Mike A ,Mar 1, 2010 at 17:49

I am the one who asked the question. In fact, the tool closest to my needs seems to be PythonTidy (it's a Python program, of course: Python is best served by itself ;) ).

tom ,Sep 29, 2014 at 18:26

autopep8 attempts to automate making your code conform to pep8 coding standards

https://pypi.python.org/pypi/autopep8

Eyal Levin ,Oct 31, 2017 at 9:31

You can also try yapf :

A formatter for Python files

https://github.com/google/yapf/

Vinko Vrsalovic ,Jun 23, 2009 at 12:57

PyLint has some formatting checks.

xxx ,Jun 23, 2009 at 12:57

Have you looked at pindent ?

[Nov 21, 2019] Passing an Array/List into Python

Dec 17, 2017 | stackoverflow.com

JAL ,Dec 17, 2017 at 8:12

I've been looking at passing arrays, or lists, as Python tends to call them, into a function.

I read something about using *args, such as:

def someFunc(*args):
    for x in args:
        print x

But not sure if this is right/wrong. Nothing seems to work as I want. I'm used to be able to pass arrays into PHP function with ease and this is confusing me. It also seems I can't do this:

def someFunc(*args, someString)

As it throws up an error.

I think I've just got myself completely confused and looking for someone to clear it up for me.

Rafał Rawicki ,Feb 13 at 15:08

When you define your function using this syntax:
def someFunc(*args):
    for x in args:
        print x

You're telling it that you expect a variable number of arguments. If you want to pass in a List (Array from other languages) you'd do something like this:

def someFunc(myList = [], *args):
    for x in myList:
        print x

Then you can call it with this:

items = [1,2,3,4,5]

someFunc(items)

You need to define named arguments before variable arguments, and variable arguments before keyword arguments. You can also have this:

def someFunc(arg1, arg2, arg3, *args, **kwargs):
    for x in args:
        print x

Which requires at least three arguments, and supports variable numbers of other arguments and keyword arguments.

JoshD ,Oct 18, 2010 at 20:28

You can pass lists just like other types:
l = [1,2,3]

def stuff(a):
   for x in a:
      print x


stuff(l)

This prints the items of the list l. Keep in mind lists are passed as references, not as a deep copy.

Gintautas Miliauskas ,Oct 18, 2010 at 16:14

Python lists (which are not just arrays, because their size can be changed on the fly) are normal Python objects and can be passed to functions like any variable. The * syntax is used for unpacking lists, which is probably not something you want to do now.
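
For completeness, a sketch of both directions of the * syntax just mentioned:

def add3(a, b, c):
    return a + b + c

args = [1, 2, 3]
print(add3(*args))    # unpacking: the list supplies the three positional arguments

def pack(*args):      # packing: extra positional arguments arrive as a tuple
    return sum(args)

print(pack(1, 2, 3))  # 6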

,

You don't need to use the asterisk to accept a list.

Simply give the argument a name in the definition, and pass in a list like

def takes_list(a_list):
    for item in a_list:
         print item

[Nov 21, 2019] How to print to stderr in Python?

Nov 21, 2019 | stackoverflow.com



Steve Howard ,Jul 31, 2013 at 14:05

There are several ways to write to stderr:
# Note: this first one does not work in Python 3
print >> sys.stderr, "spam"

sys.stderr.write("spam\n")

os.write(2, b"spam\n")

from __future__ import print_function
print("spam", file=sys.stderr)

That seems to contradict zen of Python #13 , so what's the difference here and are there any advantages or disadvantages to one way or the other? Which way should be used?

There should be one -- and preferably only one -- obvious way to do it.

Dan H ,May 16, 2017 at 22:51

I found this to be the only one short + flexible + portable + readable:
from __future__ import print_function
import sys

def eprint(*args, **kwargs):
    print(*args, file=sys.stderr, **kwargs)

The function eprint can be used in the same way as the standard print function:

>>> print("Test")
Test
>>> eprint("Test")
Test
>>> eprint("foo", "bar", "baz", sep="---")
foo---bar---baz

Dheeraj V.S. ,Jan 13, 2013 at 3:18

import sys
sys.stderr.write("spam\n")

Is my choice, just more readable and saying exactly what you intend to do and portable across versions.

Edit: being 'pythonic' is a third thought to me, after readability and performance... With these two things in mind, 80% of your Python code will be pythonic. List comprehensions are the 'big thing' that isn't used as often (readability).

Michael Scheper ,Aug 26 at 17:01

print >> sys.stderr is gone in Python3. http://docs.python.org/3.0/whatsnew/3.0.html says:
Old: print >>sys.stderr, "fatal error"
New: print("fatal error", file=sys.stderr)

For many of us, it feels somewhat unnatural to relegate the destination to the end of the command. The alternative

sys.stderr.write("fatal error\n")

looks more object oriented, and elegantly goes from the generic to the specific. But note that write is not a 1:1 replacement for print .
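
A sketch of that gap (Python 3 syntax, or Python 2 with the print_function future import): write() wants a single string and adds no newline, while print() converts its arguments and appends one.

import sys

sys.stderr.write("code 42\n")        # string only, newline is manual
print("code", 42, file=sys.stderr)   # converts and joins the arguments, adds the newline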

luketparkinson ,Apr 23, 2013 at 10:04

For Python 2 my choice is: print >> sys.stderr, 'spam' , because you can simply print lists/dicts etc. without converting them to strings: print >> sys.stderr, {'spam': 'spam'} instead of sys.stderr.write(str({'spam': 'spam'})) .

Mnebuerquo ,Jul 11 at 9:44

Nobody's mentioned logging yet, but logging was created specifically to communicate error messages. By default it is set up to write to stderr. This script:
# foo.py
import logging
logging.basicConfig(format='%(message)s')

logging.warning('I print to stderr by default')
logging.info('For this you must change the level and add a handler.')
print('hello world')

has the following result when run on the command line:

$ python3 foo.py > bar.txt
I print to stderr by default

(and bar.txt contains the 'hello world')

(Note, logging.warn has been deprecated , use logging.warning instead)

porgarmingduod ,Apr 15, 2016 at 1:37

I would say that your first approach:
print >> sys.stderr, 'spam'

is the "One . . . obvious way to do it" The others don't satisfy rule #1 ("Beautiful is better than ugly.")

Rebs ,Dec 30, 2013 at 2:26

I did the following using Python 3:
from sys import stderr

def print_err(*args, **kwargs):
    print(*args, file=stderr, **kwargs)

So now I'm able to add keyword arguments, for example, to avoid carriage return:

print_err("Error: end of the file reached. The word ", end='')
print_err(word, "was not found")

AMS ,Nov 5, 2015 at 14:15

This will mimic the standard print function but output on stderr
def print_err(*args):
    sys.stderr.write(' '.join(map(str,args)) + '\n')

Agi Hammerthief ,Dec 31, 2015 at 22:58

EDIT: In hindsight, I think the potential confusion with changing sys.stderr and not seeing the behaviour updated makes this answer not as good as just using a simple function, as others have pointed out.

Using partial only saves you 1 line of code. The potential confusion is not worth saving 1 line of code.

original

To make it even easier, here's a version that uses 'partial', which is a big help in wrapping functions.

from __future__ import print_function
import sys
from functools import partial

error = partial(print, file=sys.stderr)

You then use it like so

error('An error occured!')

You can check that it's printing to stderr and not stdout by doing the following (over-riding code from http://coreygoldberg.blogspot.com.au/2009/05/python-redirect-or-turn-off-stdout-and.html ):

# over-ride stderr to prove that this function works.
class NullDevice():
    def write(self, s):
        pass
sys.stderr = NullDevice()

# we must import print error AFTER we've removed the null device because
# it has been assigned and will not be re-evaluated.
# assume error function is in print_error.py
from print_error import error

# no message should be printed
error("You won't see this error!")

The downside to this is partial assigns the value of sys.stderr to the wrapped function at the time of creation. Which means, if you redirect stderr later it won't affect this function. If you plan to redirect stderr, then use the **kwargs method mentioned by aaguirre on this page.

Florian Castellane ,Jan 8 at 6:57

In Python 3, one can just use print():
print(*objects, sep=' ', end='\n', file=sys.stdout, flush=False)

almost out of the box:

import sys
print("Hello, world!", file=sys.stderr)

or:

from sys import stderr
print("Hello, world!", file=stderr)

This is straightforward and does not need to include anything besides sys.stderr .

phoenix ,Mar 2, 2016 at 23:57

The same applies to stdout:
print 'spam'
sys.stdout.write('spam\n')

As stated in the other answers, print offers a pretty interface that is often more convenient (e.g. for printing debug information), while write is faster and can also be more convenient when you have to format the output exactly in certain way. I would consider maintainability as well:

  1. You may later decide to switch between stdout/stderr and a regular file.
  2. print() syntax has changed in Python 3, so if you need to support both versions, write() might be better.

user1928764 ,Feb 10, 2016 at 2:29

I am working in python 3.4.3. I am cutting out a little typing that shows how I got here:
[18:19 jsilverman@JSILVERMAN-LT7 pexpect]$ python3
>>> import sys
>>> print("testing", file=sys.stderr)
testing
>>>
[18:19 jsilverman@JSILVERMAN-LT7 pexpect]$

Did it work? Try redirecting stderr to a file and see what happens:

[18:22 jsilverman@JSILVERMAN-LT7 pexpect]$ python3 2> /tmp/test.txt
>>> import sys
>>> print("testing", file=sys.stderr)
>>> [18:22 jsilverman@JSILVERMAN-LT7 pexpect]$
[18:22 jsilverman@JSILVERMAN-LT7 pexpect]$ cat /tmp/test.txt
Python 3.4.3 (default, May  5 2015, 17:58:45)
[GCC 4.9.2] on cygwin
Type "help", "copyright", "credits" or "license" for more information.
testing

[18:22 jsilverman@JSILVERMAN-LT7 pexpect]$

Well, aside from the fact that the little introduction that python gives you has been slurped into stderr (where else would it go?), it works.

hamish ,Oct 8, 2017 at 16:18

If you do a simple test:
import time
import sys

def run1(runs):
    x = 0
    cur = time.time()
    while x < runs:
        x += 1
        print >> sys.stderr, 'X'
    elapsed = (time.time()-cur)
    return elapsed

def run2(runs):
    x = 0
    cur = time.time()
    while x < runs:
        x += 1
        sys.stderr.write('X\n')
        sys.stderr.flush()
    elapsed = (time.time()-cur)
    return elapsed

def compare(runs):
    sum1, sum2 = 0, 0
    x = 0
    while x < runs:
        x += 1
        sum1 += run1(runs)
        sum2 += run2(runs)
    return sum1, sum2

if __name__ == '__main__':
    s1, s2 = compare(1000)
    print "Using (print >> sys.stderr, 'X'): %s" %(s1)
    print "Using (sys.stderr.write('X'),sys.stderr.flush()):%s" %(s2)
    print "Ratio: %f" %(float(s1) / float(s2))

You will find that sys.stderr.write() is consistently 1.81 times faster!

Vinay Kumar ,Jan 30, 2018 at 13:17

The answer to the question is: there are different ways to print to stderr in Python, but the choice depends on 1) which Python version we are using and 2) what exact output we want.

The difference between print and stderr's write function: stderr (standard error) is a pipe that is built into every UNIX/Linux system; when your program crashes and prints out debugging information (like a traceback in Python), it goes to the stderr pipe.

print : print is a wrapper that formats the inputs (the space between arguments and the newline at the end) and then calls the write function of a given object. The given object is sys.stdout by default, but we can pass a file, i.e. we can print the input into a file also.

Python 2: if we are using Python 2, then:

>>> import sys
>>> print "hi"
hi
>>> print("hi")
hi
>>> print >> sys.stderr, "hi"
hi

Python 2's trailing comma has in Python 3 become a parameter, so if we use trailing commas to avoid the newline after a print, this will in Python 3 look like print('Text to print', end=' ') , which is a syntax error under Python 2.

http://python3porting.com/noconv.html

If we check same above sceario in python3:

>>> import sys
>>> print("hi")
hi

Under Python 2.6 there is a future import to make print into a function. So to avoid any syntax errors and other differences, we should start any file where we use print() with from __future__ import print_function . The future import only works under Python 2.6 and later, so for Python 2.5 and earlier you have two options: you can either convert the more complex print to something simpler, or you can use a separate print function that works under both Python 2 and Python 3.

>>> from __future__ import print_function
>>> 
>>> def printex(*args, **kwargs):
...     print(*args, file=sys.stderr, **kwargs)
... 
>>> printex("hii")
hii
>>>

A point to be noted: sys.stderr.write() or sys.stdout.write() ( stdout (standard output) is a pipe that is built into every UNIX/Linux system) is not a replacement for print, but yes, we can use it as an alternative in some cases. Print is a wrapper which wraps the input with a space and a newline at the end and uses the write function to write. This is the reason sys.stderr.write() is faster.

Note: we can also trace and debug using logging:

#test.py
import logging
logging.info('This is the existing protocol.')
FORMAT = "%(asctime)-15s %(clientip)s %(user)-8s %(message)s"
logging.basicConfig(format=FORMAT)
d = {'clientip': '192.168.0.1', 'user': 'fbloggs'}
logging.warning("Protocol problem: %s", "connection reset", extra=d)

https://docs.python.org/2/library/logging.html#logger-objects

[Nov 21, 2019] How do I convert a binary string to a number in Perl - Stack Overflow

Nov 21, 2019 | stackoverflow.com



Nathan Fellman ,Jan 27, 2009 at 14:43

How can I convert the binary string $x_bin="0001001100101" to its numeric value $x_num=613 in Perl?

innaM ,Jan 28, 2009 at 0:04

sub bin2dec {
    return unpack("N", pack("B32", substr("0" x 32 . shift, -32)));
}

ysth ,Jan 28, 2009 at 1:48

My preferred way is:
$x_num = oct("0b" . $x_bin);

Quoting from man perlfunc :

    oct EXPR
    oct     Interprets EXPR as an octal string and returns the
            corresponding value. (If EXPR happens to start
            off with "0x", interprets it as a hex string. If
            EXPR starts off with "0b", it is interpreted as a
            binary string. Leading whitespace is ignored in
            all three cases.)

innaM ,Jan 27, 2009 at 16:23

As usual, there is also an excellent CPAN module that should be mentioned here: Bit::Vector .

The transformation would look something like this:

use Bit::Vector;

my $v = Bit::Vector->new_Bin( 32, '0001001100101' );
print "hex: ", $v->to_Hex(), "\n";
print "dec: ", $v->to_Dec(), "\n";

The binary strings can be of almost any length and you can do other neat stuff like bit-shifting, etc.

noswonky ,Jan 28, 2009 at 0:28

Actually you can just stick '0b' on the front and it's treated as a binary number.
perl -le 'print 0b101'
5

But this only works for a literal constant, not for a string in a variable.


You can use a string eval to work around the literal-only restriction:
eval "\$num=0b$str;";

[Nov 21, 2019] regex - Perl regular expression match nested brackets - Stack Overflow

Nov 21, 2019 | stackoverflow.com



i-- ,Mar 8, 2013 at 19:33

I'm trying to match nested {} brackets with a regular expressions in Perl so that I can extract certain pieces of text from a file. This is what I have currently:
my @matches = $str =~ /\{(?:\{.*\}|[^\{])*\}|\w+/sg;

foreach (@matches) {
    print "$_\n";
}

At certain times this works as expected. For instance, if $str = "abc {{xyz} abc} {xyz}" I obtain:

abc
{{xyz} abc}
{xyz}

as expected. But for other input strings it does not function as expected. For example, if $str = "{abc} {{xyz}} abc" , the output is:

{abc} {{xyz}}
abc

which is not what I expected. I would have wanted {abc} and {{xyz}} to be on separate lines, since each is balanced on its own in terms of brackets. Is there an issue with my regular expression? If so, how would I go about fixing it?

l--''''''---------'''''''''''' ,Dec 9, 2013 at 14:30

You were surprised how your pattern matched, but no one explained it? Here's how your pattern is matching:
my @matches = $str =~ /\{(?:\{.*\}|[^{])*\}|\w+/sg;
                       ^    ^ ^ ^  ^      ^
                       |    | | |  |      |
{ ---------------------+    | | |  |      |
a --------------------------)-)-)--+      |
b --------------------------)-)-)--+      |
c --------------------------)-)-)--+      |
} --------------------------)-)-)--+      |
  --------------------------)-)-)--+      |
{ --------------------------+ | |         |
{ ----------------------------+ |         |
x ----------------------------+ |         |
y ----------------------------+ |         |
z ----------------------------+ |         |
} ------------------------------+         |
} ----------------------------------------+

As you can see, the problem is that / \{.*\} / matches too much. What should be in there is something that matches

(?: \s* (?: \{ ... \} | \w+ ) )*

where the ... is

(?: \s* (?: \{ ... \} | \w+ ) )*

So you need some recursion. Named groups are an easy way of doing this.

say $1
   while /
      \G \s*+ ( (?&WORD) | (?&BRACKETED) )

      (?(DEFINE)
         (?<WORD>      \s* \w+ )
         (?<BRACKETED> \s* \{ (?&TEXT)? \s* \} )
         (?<TEXT>      (?: (?&WORD) | (?&BRACKETED) )+ )
      )
   /xg;

But instead of reinventing the wheel, why not use Text::Balanced .

Schwern ,Mar 8, 2013 at 20:01

The problem of matching balanced and nested delimiters is covered in perlfaq5 and I'll leave it to them to cover all the options including (?PARNO) and Regexp::Common .

But matching balanced items is tricky and prone to error, unless you really want to learn and maintain advanced regexes, leave it to a module. Fortunately there is Text::Balanced to handle this and so very much more. It is the Swiss Army Chainsaw of balanced text matching.

Unfortunately it does not handle escaping on bracketed delimiters .

use v5.10;
use strict;
use warnings;

use Text::Balanced qw(extract_multiple extract_bracketed);

my @strings = ("abc {{xyz} abc} {xyz}", "{abc} {{xyz}} abc");

for my $string (@strings) {
    say "Extracting from $string";

    # Extract all the fields, rather than one at a time.
    my @fields = extract_multiple(
        $string,
        [
            # Extract {...}
            sub { extract_bracketed($_[0], '{}') },
            # Also extract any other non whitespace
            qr/\S+/
        ],
        # Return all the fields
        undef,
        # Throw out anything which does not match
        1
    );

    say join "\n", @fields;
    print "\n";
}

You can think of extract_multiple like a more generic and powerful split .

arshajii ,Mar 8, 2013 at 20:34

You need a recursive regex. This should work:
my @matches;
push @matches, $1 while $str =~ /( [^{}\s]+ | ( \{ (?: [^{}]+ | (?2) )* \} ) )/xg;

or, if you prefer the non-loop version:

my @matches = $str =~ /[^{}\s]+ | \{ (?: (?R) | [^{}]+ )+ \} /gx;

nhahtdh ,Mar 8, 2013 at 20:40

To match nested brackets with just one pair at each level of nesting,
but any number of levels, e.g. {1{2{3}}} , you could use
/\{[^}]*[^{]*\}|\w+/g

To match when there may be multiple pairs at any level of nesting, e.g. {1{2}{2}{2}} , you could use

/(?>\{(?:[^{}]*|(?R))*\})|\w+/g

The (?R) is used to match the whole pattern recursively.

To match the text contained within a pair of brackets the engine must match (?:[^{}]*|(?R))* ,
i.e. either [^{}]* or (?R) , zero or more times * .

So in e.g. "{abc {def}}" , after the opening "{" is matched, the [^{}]* will match the "abc " and the (?R) will match the "{def}" , then the closing "}" will be matched.

The "{def}" is matched because (?R) is simply short for the whole pattern
(?>\{(?:[^{}]*|(?R))*\})|\w+ , which as we have just seen will match a "{" followed by text matching [^{}]* , followed by "}" .

Atomic grouping (?> ... ) is used to prevent the regex engine backtracking into bracketed text once it has been matched. This is important to ensure the regex will fail fast if it cannot find a match.
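As a side note, the same recursive construct is available outside Perl/PCRE: Python's third-party regex module (not the stdlib re, which has no recursion) accepts both (?R) and atomic groups, so a minimal sketch of the pattern above looks like this:

import regex   # third-party module (pip install regex); stdlib re lacks (?R)

s = "abc {{xyz} abc} {xyz}"
# a brace block contains runs of non-brace characters or a nested block
print(regex.findall(r'(?>\{(?:[^{}]*|(?R))*\})|\w+', s))
# -> ['abc', '{{xyz} abc}', '{xyz}']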

Francisco Zarabozo ,Mar 15, 2013 at 13:15

Wow. What a bunch of complicated answers to something that simple.

The problem you're having is that you're matching in greedy mode. That is, you are asking the regex engine to match as much as possible while keeping the expression true.

To avoid greedy match, just add a '?' after your quantifier. That makes the match as short as possible.

So, I changed your expression from:

my @matches = $str =~ /\{(?:\{.*\}|[^\{])*\}|\w+/sg;

To:

my @matches = $str =~ /\{(?:\{.*?\}|[^\{])*?\}|\w+/sg;

...and now it works exactly as you're expecting.

HTH

Francisco

Joel Berger ,Mar 8, 2013 at 20:01

One way using the built-in module Text::Balanced .

Content of script.pl :

#!/usr/bin/env perl

use warnings;
use strict;
use Text::Balanced qw<extract_bracketed>;

while ( <DATA> ) { 

    ## Remove '\n' from input string.
    chomp;

    printf qq|%s\n|, $_; 
    print "=" x 20, "\n";


    ## Extract all characters just before first curly bracket.
    my @str_parts = extract_bracketed( $_, '{}', '[^{}]*' );

    if ( $str_parts[2] ) { 
        printf qq|%s\n|, $str_parts[2];
    }   

    my $str_without_prefix = "@str_parts[0,1]";


    ## Extract data of balanced curly brackets, remove leading and trailing
    ## spaces and print.
    while ( my $match = extract_bracketed( $str_without_prefix, '{}' ) ) { 
        $match =~ s/^\s+//;
        $match =~ s/\s+$//;
        printf qq|%s\n|, $match;

    }   

    print "\n";
}

__DATA__
abc {{xyz} abc} {xyz}
{abc} {{xyz}} abc

Run it like:

perl script.pl

That yields:

abc {{xyz} abc} {xyz}
====================
abc 
{{xyz} abc}
{xyz}

{abc} {{xyz}} abc
====================
{abc}
{{xyz}}

Borodin ,Mar 8, 2013 at 19:54

Just modifies and extends the classic solution a bit:
(\{(?:(?1)|[^{}]*+)++\})|[^{}\s]++

Demo (This is in PCRE. The behavior is slightly different from Perl when it comes to recursive regex, but I think it should produce the same result for this case).

After some struggle (I am not familiar with Perl!), this is the demo on ideone . $& refers to the string matched by the whole regex.

my $str = "abc {{xyz} abc} {xyz} {abc} {{xyz}} abc";

while ($str =~ /(\{(?:(?1)|[^{}]*+)++\})|[^{}\s]++/g) {
    print "$&\n"
}

Note that this solution assumes that the input is valid. It will behave rather randomly on invalid input. It can be modified slightly to halt when invalid input is encountered. For that, I need more details on the input format (preferably as a grammar), such as whether abc{xyz}asd is considered valid input or not.

[Nov 21, 2019] regex - How can I extract a string between matching braces in Perl - Stack Overflow

Nov 21, 2019 | stackoverflow.com



Srilesh ,Apr 23, 2010 at 17:26

My input file is as below :
HEADER 
{ABC|*|DEF {GHI 0 1 0} {{Points {}}}}

{ABC|*|DEF {GHI 0 2 0} {{Points {}}}}

{ABC|*|XYZ:abc:def {GHI 0 22 0} {{Points {{F1 1.1} {F2 1.2} {F3 1.3} {F4 1.4}}}}}

{ABC|*|XYZ:ghi:jkl {JKL 0 372 0} {{Points {}}}}

{ABC|*|XYZ:mno:pqr {GHI 0 34 0} {{Points {}}}}

{
    ABC|*|XYZ:abc:pqr {GHI 0 68 0}
        {{Points {{F1 11.11} {F2 12.10} {F3 14.11} {F4 16.23}}}}
        }
TRAILER

I want to extract the file into an array as below :

$array[0] = "{ABC|*|DEF {GHI 0 1 0} {{Points {}}}}"

$array[1] = "{ABC|*|DEF {GHI 0 2 0} {{Points {}}}}"

$array[2] = "{ABC|*|XYZ:abc:def {GHI 0 22 0} {{Points {{F1 1.1} {F2 1.2} {F3 1.3} {F4 1.4}}}}}"

..
..

$array[5] = "{
    ABC|*|XYZ:abc:pqr {GHI 0 68 0}
        {{Points {{F1 11.11} {F2 12.10} {F3 14.11} {F4 16.23}}}}
        }"

Which means, I need to match the first opening curly brace with its closing curly brace and extract the string in between.

I have checked the link below, but it doesn't apply to my question. Regex to get string between curly braces "{I want what's between the curly braces}"

I am trying, but it would really help if someone could assist me with their expertise ...

Thanks Sri ...

Srilesh ,Apr 23, 2010 at 20:26

This can certainly be done with regex at least in modern versions of Perl:
my @array = $str =~ /( \{ (?: [^{}]* | (?0) )* \} )/xg;

print join "\n" => @array;

The regex matches a curly brace block that contains either non-curly-brace characters or a recursion into itself (which matches nested braces).

Edit: the above code works in Perl 5.10+, for earlier versions the recursion is a bit more verbose:

my $re; $re = qr/ \{ (?: [^{}]* | (??{$re}) )* \} /x;

my @array = $str =~ /$re/xg;

Srilesh ,Apr 23, 2010 at 18:34

Use Text::Balanced

Srilesh ,Apr 23, 2010 at 18:29

I second ysth's suggestion to use the Text::Balanced module. A few lines will get you on your way.
use strict;
use warnings;
use Text::Balanced qw/extract_multiple extract_bracketed/;

my $file;
open my $fileHandle, '<', 'file.txt';

{ 
  local $/ = undef; # or use File::Slurp
  $file = <$fileHandle>;
}

close $fileHandle;

my @array = extract_multiple(
                               $file,
                               [ sub{extract_bracketed($_[0], '{}')},],
                               undef,
                               1
                            );

print $_,"\n" foreach @array;

OUTPUT
{ABC|*|DEF {GHI 0 1 0} {{Points {}}}}
{ABC|*|DEF {GHI 0 2 0} {{Points {}}}}
{ABC|*|XYZ:abc:def {GHI 0 22 0} {{Points {{F1 1.1} {F2 1.2} {F3 1.3} {F4 1.4}}}}}
{ABC|*|XYZ:ghi:jkl {JKL 0 372 0} {{Points {}}}}
{ABC|*|XYZ:mno:pqr {GHI 0 34 0} {{Points {}}}}
{
    ABC|*|XYZ:abc:pqr {GHI 0 68 0}
        {{Points {{F1 11.11} {F2 12.10} {F3 14.11} {F4 16.23}}}}
        }

Srilesh ,Apr 23, 2010 at 18:19

You can always count braces:
my $depth = 0;
my $out = "";
my @list=();
foreach my $fr (split(/([{}])/,$data)) {
    $out .= $fr;
    if($fr eq '{') {
        $depth ++;
    }
    elsif($fr eq '}') {
        $depth --;
        if($depth ==0) {
            $out =~ s/^.*?({.*}).*$/$1/s; # trim
            push @list, $out;
            $out = "";
        }
    }
}
print join("\n==================\n",@list);

This is old, plain Perl style (and ugly, probably).

Srilesh ,Apr 23, 2010 at 18:30

I don't think pure regular expressions are what you want to use here (IMHO this might not even be parsable using regex).

Instead, build a small parser, similar to what's shown here: http://www.perlmonks.org/?node_id=308039 (see the answer by shotgunefx (Parson) on Nov 18, 2003 at 18:29 UTC)

UPDATE It seems it might be doable with a regex - I saw a reference to matching nested parentheses in Mastering Regular Expressions (that's available on Google Books and thus can be googled for if you don't have the book - see Chapter 5, section "Matching balanced sets of parentheses")

Borodin ,Mar 8, 2013 at 20:22

You're much better off using a state machine than a regex for this type of parsing.


Regular expressions are actually pretty bad for matching braces. Depending how deep you want to go, you could write a full grammar (which is a lot easier than it sounds!) for Parse::RecDescent . Or, if you just want to get the blocks, search through for opening '{' marks and closing '}', and just keep count of how many are open at any given time.

[Nov 21, 2019] Can the Perl debugger save the ReadLine history to a file?

Nov 21, 2019 | stackoverflow.com



eli ,Jun 7, 2018 at 14:13

I work quite a bit with the ReadLine library and the Perl ReadLine module.

Yet, the Perl debugger refuses to save the session command line history.

Thus, each time I invoke the debugger I lose all of my previous history.

Does anyone know how to have the Perl debugger save, and hopefully append, session history similar to the bash HISTFILE ?

mirod ,Jun 22, 2011 at 10:31

The way I do this is by having the following line in my ~/.perldb file:

&parse_options("HistFile=$ENV{HOME}/.perldb.hist");

Debugger commands are then stored in ~/.perldb.hist and accessible across sessions.

ysth ,Jul 13, 2011 at 9:37

Add parse_options("TTY=/dev/stdin ReadLine=0"); to .perldb, then:
rlwrap -H .perl_history perl -d ...

mephinet ,Feb 21, 2012 at 12:37

$ export PERLDB_OPTS=HistFile=$HOME/.perldb.history


I did the following:

1) Created ~/.perldb , which did not exist previously.

2) Added &parse_options("HistFile=$ENV{HOME}/.perldb.hist"); from mirod's answer.

3) Added export PERLDB_OPTS=HistFile=$HOME/.perldb.history to ~/.bashrc from mephinet's answer.

4) Ran source .bashrc

5) Ran perl -d myprogram.pl , and got this warning/error:

perldb: Must not source insecure rcfile /home/ics/.perldb.
        You or the superuser must be the owner, and it must not 
        be writable by anyone but its owner.

6) I restricted ~/.perldb to its owner with chmod 700 ~/.perldb , and the error went away.

[Nov 18, 2019] Python enclose each word of a space separated string in quotes - Stack Overflow

Nov 18, 2019 | stackoverflow.com



vaultah ,Sep 24, 2015 at 15:53

I have a string eg:
line="a sentence with a few words"

I want to convert the above in a string with each of the words in double quotes, eg:

 "a" "sentence" "with" "a" "few" "words"

Any suggestions?

mkc ,Sep 24, 2015 at 18:26

Split the line into words, wrap each word in quotes, then re-join:
' '.join('"{}"'.format(word) for word in line.split(' '))

Anand S Kumar ,Sep 24, 2015 at 15:54

Since you say -

I want to convert the above in a string with each of the words in double quotes

You can use the following regex -

>>> line="a sentence with a few words"
>>> import re
>>> re.sub(r'(\w+)',r'"\1"',line)
'"a" "sentence" "with" "a" "few" "words"'

This would take punctuation, etc. into consideration as well (if that is really what you wanted) -

>>> line="a sentence with a few words. And, lots of punctuations!"
>>> re.sub(r'(\w+)',r'"\1"',line)
'"a" "sentence" "with" "a" "few" "words". "And", "lots" "of" "punctuations"!'

user4568737 ,Aug 10, 2017 at 18:42

Or you can do something simpler (more implementation work, but easier for beginners): search for each space in the quote, then slice whatever is between the spaces, add " before and after it, and then print it.
quote = "they stumble who run fast"
first_space = 0
last_space = quote.find(" ")
while last_space != -1:
    print("\"" + quote[first_space:last_space] + "\"")
    first_space = last_space + 1
    last_space = quote.find(" ",last_space + 1)

Above code will output for you the following:

"they"
"stumble"
"who"
"run"

Stephen Rauch ,Jan 28, 2018 at 3:29

The first answer missed an instance of the original quote: the last word, "fast", was not printed. This solution will print the last word too (note that it prints the words without the surrounding quotes):
quote = "they stumble who run fast"

start = 0
location = quote.find(" ")

while location >=0:
    index_word = quote[start:location]
    print(index_word)

    start = location + 1
    location = quote.find(" ", location + 1)

#this runs outside the While Loop, will print the final word
index_word = quote[start:]
print(index_word)

This is the result:

they
stumble
who
run
fast

[Nov 15, 2019] Why are Unix system administrators still using Perl for scripting when they could use Python - Quora

Nov 15, 2019 | www.quora.com

Why are Unix system administrators still using Perl for scripting when they could use Python?

Joshua Day , Currently developing reporting and testing tools for Linux, Updated Apr 26

There are several reasons, and I'll try to name a few.

  1. Perl syntax and semantics closely resembles shell languages that are part of core Unix systems like sed, awk, and bash. Of these languages at least bash knowledge is required to administer a Unix system anyway.
  2. Perl was designed to replace or improve the shell languages in Unix/linux by combining all their best features into a single language whereby an administrator can write a complex script with a single language instead of 3 languages. It was essentially designed for Unix/linux system administration.
  3. Perl regular expressions (text manipulation) were modeled off of sed and then drastically improved upon to the extent that subsequent languages like python have borrowed the syntax because of just how powerful it is. This is infinitely powerful on a unix system because the entire OS is controlled using textual data and files. No other language ever devised has implemented regular expressions as gracefully as perl and that includes the beloved python. Only in perl is regex integrated with such natural syntax.
  4. Perl typically comes preinstalled on Unix and linux systems and is practically considered part of the collection of softwares that define such a system.
  5. Thousands of apps written for Unix and Linux utilize the unique properties of this language to accomplish any number of tasks. A Unix/Linux sysadmin must be somewhat familiar with Perl to be effective at all. To remove the language would take considerable effort on most systems, to the extent that it's not practical. Therefore, with regard to this environment, Perl will remain for years to come.
  6. Perl's module archive, called CPAN, already contains a massive quantity of modules geared directly for Unix systems. If you use Perl for your administration tasks, you can capitalize on these modules. These are not newly written and untested modules: these libraries have been controlling Unix systems reliably for 20 years and are the pinnacle of stability, running on Unix systems across the world.
  7. Perl is particularly good at glueing other software together. It can take the output of one application and manipulate it into a format that is easily consumable by another, mostly due to its simplistic text manipulation syntax. This has made Perl the number 1 glue language in the world. There are millions of softwares around the world that are talking to each other even though they were not designed to do so. This is in large part because of Perl. This particular niche will probably decline as standardization of interchange formats and APIs improves but it will never go away.

I hope this helps you understand why Perl is so prominent for Unix administrators. These features may not seem so obviously valuable on Windows systems and the like. However, on Unix systems this language comes alive like no other.

[Nov 15, 2019] Python Displaces C++ In TIOBE Index Top 3 - Slashdot

Nov 15, 2019 | developers.slashdot.org

Posted by EditorDavid on Saturday September 08, 2018 @03:34PM from the newer-kid-on-the-block dept. InfoWorld described the move as a "breakthrough": As expected, Python has climbed into the Top 3 of the Tiobe index of language popularity, achieving that milestone for the first time ever in the September 2018 edition of the index. With a rating of 7.653 percent, Python placed third behind first-place Java, which had a rating of 17.436 percent, and second-place C, rated at 15.447. Python displaced C++, which finished third last month and took fourth place this month, with a rating of 7.394 percent...

Python also has been scoring high in two other language rankings:

- The PyPL Popularity of Programming Language index, where it ranked No. 1 this month , as it has done before, and has had the most growth in the past five years.

- The RedMonk Programming Language Rankings, where Python again placed third .
Tiobe notes that Python's arrival in the top 3 "really took a long time," since it first entered their chart at the beginning of the 1990s. But today, "It is already the first choice at universities (for all kinds of subjects for which programming is demanded) and is now also conquering the industrial world." In February Tiobe also added a new programming language to their index: SQL. (Since "SQL appears to be Turing complete.")

"Other interesting moves this month are: Rust jumps from #36 to #31, Groovy from #44 to #34 and Julia from #50 to #39."



Anonymous Coward writes:

Congratulations to the Python maintainers ( Score: 1 )

... but if they hadn't handled the Python 2/3 fork so clumsily, this might have happened years ago.

raymorris ( 2726007 ) , Saturday September 08, 2018 @03:58PM ( #57276974 ) Journal
Losing backwards compatibility with point releases ( Score: 5 , Interesting)

Never mind Python 2 vs 3; one major reason I shy away from Python is the incompatibility in point releases. I'd see "requires Python 2.6" and see that I have Python 2.7, so it should be fine, right? Nope, code written for 2.6 won't run under Python 2.7. It needs to be EXACTLY 2.6.

It's at this point that some Python fanboi gets really upset and starts screaming about how that's no problem: with Python you set up separate virtual environments for each script, so that each one can have exactly the version of Python it is written for, with exactly the version of each library. When there is some bug or security issue you then hope that there is a patch for each, and deal with all that. (As opposed to every other piece of software in the world, which you simply upgrade to the latest version to get all the latest fixes.) Yes, you CAN deal with that problem; it's possible, in most cases. You shouldn't have to. Every other language does some simple things to maintain backward compatibility in point releases (and mostly in major releases too).

Also, the fact that most languages used every day, and for decades, use braces for blocks means my eyes and mind are very much trained for that. Braces aren't necessarily BETTER than just using indentation, but it's kinda like building a car which uses the pedals to steer and a hand stick for the gas. It's not necessarily inherently better or worse, but it would be almost undriveable for an experienced driver with decades of muscle memory in normal cars. Python's seemingly random choices on these things make it feel like using your feet to steer a car. There should be compelling reasons before you break long-established conventions, and Python seems to prefer to break conventions just to be different. It seems the Python team is a bit like Bernstein in that way. It's really annoying.

slack_justyb ( 862874 ) , Saturday September 08, 2018 @05:00PM ( #57277244 )
Re:Losing backwards compatibility with point relea ( Score: 5 , Informative)

Python 2.7. It needs to be EXACTLY 2.6

Yeah, just FYI Python 2.7 is in a way its own thing. Different from the 2.x and different from the 3.x series. 2.6 is a no holds barred pure 2.x whereas 2.7 is a mixture of 2.x and 3.x features. So if you want to compare point releases, best to try that with the 3.x series. Also, if you're using something that requires the 2.x series, you shouldn't use that unless it is absolutely critical with zero replacements.

You shouldn't have to. Every other language does some simple things to maintain backward compatibility in point releases (and mostly in major releases too).

Again, see the argument about 3.x, but yeah, not every language does this. The Java 8/9 transition breaks things. ASP.Net to ASP.Net Core breaks things along the way. I'm interested in what languages you have in mind, because I know quite a few languages that do maintain backwards compatibility (ish). For example, C++ pre and post namespaces breaks fstreams in programs, but compilers provide flags to override that, so it depends on what you mean by breaking. Does it count if the compiler by default breaks, but providing flags fixes it? Because if your definition means including flags breaks compatibility, then oooh boy are there a shit ton of broken languages.

Also the fact that most languages use every day and have used for decades use braces for blocks means my eyes and mind are very much trained for that

Yeah, it's clear that you've never used a positional programming language. I guess it'll be a sign of my age, but buddy, program COBOL or RPG on punch cards and let me know about that curly brace issue you're having. Positional notation and indentation have been used way, way, way longer than curly braces. That's not me knocking the curly braces, I love my C/C++ folks out there! But I hate to tell you, C and C-style is pretty recent in the timeline of all things computer.

raymorris ( 2726007 ) writes:
Depends on the results / message, actually ( Score: 3 , Insightful)

> C++ pre and post namespace breaks fstreams in programs, but compilers provide flags to override that, so it depends on what you mean by breaking. Does it count if the compiler by default breaks, but providing flags fixes it?

If it results in weird runtime errors, that's definitely a problem.
If the compiler I'm using gives the message "incompatible use of fstream, try '-fstreamcompat' flag", that's no big deal.

raymorris ( 2726007 ) writes:
Also deprecation warnings ( Score: 2 )

On a similar note, if something is marked deprecated long before it's removed, that matters. Five years of compiler/interpreter warnings saying "deprecated use of function in null context on line #47" gives plenty of opportunity to fix it. From the bit of Python I've worked with, the recommended method on Friday completely stops working on Monday.

shutdown -p now ( 807394 ) writes:
Re: ( Score: 2 )

That's plainly not true - Python follows the established deprecate-first-remove-next cycle. This is readily obvious when you look at the changelogs. For example, from the 2.6 changelog:

The threading module API is being changed to use properties such as daemon instead of setDaemon() and isDaemon() methods, and some methods have been renamed to use underscores instead of camel-case; for example, the activeCount() method is renamed to active_count(). Both the 2.6 and 3.0 versions of the module support the same properties and renamed methods, but don't remove the old methods. No date has been set for the deprecation of the old APIs in Python 3.x; the old APIs won't be removed in any 2.x version.

For another example, the ability to throw strings (rather than BaseException-derived objects) was deprecated in 2.3 (2003) and finally removed in 2.6 (2008).

For comparison, in the C++ land, dynamic exception specifications were deprecated in C++11, and removed in C++17. So the time scale is comparable.

raymorris ( 2726007 ) writes:
That's nice they deprecated something ( Score: 2 )

That's great that they deprecate something on some occasions.
MY experience with the Python I run is that one version gives no warning, going up one point release throws multiple fatal errors.

> This is readily obvious when you look at the changelogs

Maybe that's the thing - one has to read the changelogs to see what is deprecated, as opposed to getting a clear deprecation warning from the interpreter/compiler like you would with C, Perl, and other languages?

It's possible that a Python expert might be able to

shutdown -p now ( 807394 ) writes:
Re: ( Score: 2 )

MY experience with the Python I run is that one version gives no warning, going up one point release throws multiple fatal errors.

Can you give an example? I'm just not aware of any, and it makes me suspect that what you were running into was an issue in a third-party library (some of which do indeed have a cowboy attitude towards breaking changes - but that's common across all languages).

Maybe that's the thing - one has read the changelogs to see what is deprecated, as opposed to getting a clear deprecation warning from the interpreter/compiler like you would with C, Perl, and other languages?

Like this [python.org]?

And I have never, ever seen a deprecation warning in C or C++. You have to read the change sections for new standards to see what was deprecated or removed.

raymorris ( 2726007 ) writes:
Default warns, unless you turn it off in CFLAGS ( Score: 2 )

> And I have never, ever seen a deprecation warning in C or C++. You have to read the change sections for new standards to see what was deprecated or removed.

The default with gcc is to warn about deprecation.
You can turn the warnings off by setting the CFLAGS environment variable to include -Wno-deprecated, which you can do in your .bashrc or wherever. What's most often recommended is -Wall to show all warnings of all types.

johannesg ( 664142 ) writes:
Re: ( Score: 3 )

For example, C++ pre and post namespace breaks fstreams in programs, but compilers provide flags to override that

Dude, that was in 1990, back before there even was a standard C++. And I very much doubt those flags still exist today.

program COBOL or RPG on punch cards and let me know about that curly brace issue you're having

You seem to have forgotten how that really worked in your old age though. Punch cards had columns with specific functions assigned to them, so yes, of course you would have to skip certain columns on occasion. That was not indentation, though. You didn't have indentation; moving your holes by one position or one column meant the machine would interpret your instruction as something else ent

sjames ( 1099 ) writes:
Re: ( Score: 3 )

That must be an odd package. I have literally NEVER seen that with anything I have wanted to use, including my own pre-2.7.x software.

fluffernutter ( 1411889 ) writes:
Warrior ( Score: 1 )

Real code warriors don't need static types. If a variable is so badly named that the type is not clear, use type().

PhrostyMcByte ( 589271 ) writes:
Re: ( Score: 2 )

If a variable is so badly named that the type is not clear

Never fear, I've brought my LPCWSTR.

jd ( 1658 ) writes:
Re: Warrior ( Score: 3 , Insightful)

Static typing isn't just about clarity to the programmer. In strict typing languages, the rule is to use the type that matches the range that actually applies. This is to help testing (something coders should not ignore), automated validation, compilation (a compiler can choose sensible values, optimise the code, etc etc etc) and maintainers (a clear variable name won't tell anyone if a variable's range can be extended without impacting the compiled code).

Besides, I've looked at Python code. I'm not convinc

serviscope_minor ( 664417 ) writes:
Re: ( Score: 2 )

Real code warriors don't need static types. If a variable is so badly named that the type is not clear, use type().

pfaaah. Real programmers don't need names. If the type of a variable is not obvious from context, get another job.

Undead Waffle ( 1447615 ) writes:
Re: ( Score: 2 )

Type annotations and docstrings help with the whole lack of type declaration thing. Of course that requires discipline, which is in short supply from my experience. If you can force your developers to run pylint that will at least complain when they don't have docstrings.

Aighearach ( 97333 ) writes:
Re: ( Score: 2 )

You definitely never had to debug student's code.

Is that even a thing?

mi ( 197448 ) writes: < [email protected] > on Saturday September 08, 2018 @03:42PM ( #57276880 ) Homepage Journal
If Java is the first... ( Score: 5 , Insightful)

behind first-place Java

Whatever the list, if Java is in the first place, there is no honor in being anywhere near the top.

jd ( 1658 ) writes: < [email protected] minus threevowels > on Saturday September 08, 2018 @04:18PM ( #57277066 ) Homepage Journal
Re: If Java is the first... ( Score: 4 , Insightful)

The list is compiled from a restricted pool and lists popularity.

That may mean a vendor shipping ten individually packaged Python scripts counts as ten sources, while one C program of equal functionality counts as one. If that's the case, Python would be ten times as popular in the stats whilst being equally popular in practice.

So if Python needed ten times as many modules to be as versatile, it would seem popular whilst only being frustrating.

The fact is, we don't know their methodology. We don't know if they're weighting results in any way to factor in workarounds and hacks due to language deficiency that might show up as extra use.

We do know they don't factor in defect density, complexity or anything else like that as they do say that much. So are ten published attempts at a working program worth ten times one attempt in a language that makes it easy first time? We will never know.

nten ( 709128 ) writes:
vba ( Score: 2 )

I find Java in an uncanny valley. It's still a few times slower than C++ for the sort of stuff I do, but it isn't enough quicker to develop in than C++ to be worth that hit. Python is far slower than Java, even using numpy, but it's so easy to develop in that it is worth the gamble that it will be fast enough. And the rewrite in C++ will go quickly even if it isn't. The title is because VBA is 11x faster than numpy at small dense matrices and almost as easy to develop in.

phantomfive ( 622387 ) writes:
Re: vba ( Score: 2 )

Java is useful because you can throw a team of low-skill developers at it and they won't mess things up beyond the point of unmaintainability. It will be a pain to maintain, sure, but the same developers using C would make memory errors that push things beyond hopeless, and if they were using Python or JavaScript the types would become so jumbled as the size of the program increased that no one would be able to understand it and things would start breaking more and more. Java enforces a minimal lev

angel'o'sphere ( 80593 ) writes:
Re: ( Score: 2 )

Luckily I usually have high skilled developers in my Java projects :P

phantomfive ( 622387 ) writes:
Re: vba ( Score: 2 )

Then your codebase is better.

Tough Love ( 215404 ) writes:
Re: ( Score: 3 )

Tiobe is utter crap. Javascript (barf) is by far the most popular programming language today and Tiobe puts it in 8th place, behind Visual Basic.

ljw1004 ( 764174 ) writes:
Re: If Java is the first... ( Score: 2 )

From the same article - Visual Basic overtakes C#, PHP and Javascript...

El Cubano ( 631386 ) , Saturday September 08, 2018 @03:59PM ( #57276982 )
Love Python ( Score: 3 )

Tiobe notes that Python's arrival in the top 3 "really took a long time," since it first entered their chart at the beginning of the 1990s. But today, "It is already the first choice at universities (for all kinds of subjects for which programming is demanded)

Undergraduate was all C/C++ for me then I ended up at a graduate school where everything was Java. I disliked it so much that I decided to find an alternative and teach myself. I found Python and loved it. I still love it. You can't find anything better for both heavy duty programming and quick and dirty scripting. It's versatility makes It like the Linux of programming languages.

Tough Love ( 215404 ) , Saturday September 08, 2018 @05:02PM ( #57277252 )
Re:Love Python ( Score: 4 , Insightful)

I found Python and loved it. I still love it. You can't find anything better for both heavy duty programming...

What? Python is hopelessly inefficient for heavy duty programming, unless you happen to be doing something that is mainly handled by a Python library written in C. Python's interface to C is disgusting, so if you have a lot of small operations handled by a C library, you will get pathetic performance.

sjames ( 1099 ) writes:
Re: ( Score: 2 )

It really isn't. There are some apps that actually need something faster and a lot of apps that don't. It really doesn't help if a faster executable ends up waiting for I/O.

Tough Love ( 215404 ) writes:
Re: ( Score: 3 )

It really isn't.

It really is [debian.net] and you blathering about what you don't know does not change that fact. (Python 14 minutes vs C++ 8..24 seconds for N-Body simulation.)

sjames ( 1099 ) writes:
Re: ( Score: 2 )

Well, since 99.99999999999% of all software run by literally everybody is an n-body simulation....

That would be an example of the "some apps" I spoke of. I note that Intel Fortran was at the top of the list (not surprising). Would ifort be your first choice if you were writing a text editor or a tar file splitter? How about an smtp daemon?

I sure hope not.

Tough Love ( 215404 ) writes:
Re: ( Score: 2 )

Well, since 99.99999999999% of all software run by literally everybody is an n-body simulation..

Explaining the concept of "compute intensive" to you makes me feel more stupid. Check out any of the compute intensive Python benchmarks. Consider not waving your ignorance around quite so much.

sjames ( 1099 ) writes:
Re: ( Score: 2 )

Having actually built a cluster that was in the top 500 for a while, I am well acquainted with compute intensive applications. I am also aware that compute intensive is a subset of "heavy duty" programming which is a subset of general programming.

Now, pull your head out of your ass and look around, you might learn something. And while you're at it, consider working on your social skills.

Tough Love ( 215404 ) writes:
Re: ( Score: 2 )

Either you understand that Python is crap for compute intensive work, or you are lying about building a cluster. Or you just connected the cables, more like it, and really don't have a clue about how to use it.

sjames ( 1099 ) writes:
Re: ( Score: 2 )

I do understand that python isn't the right choice for compute intensive work. With the exception that if it is great for doing setup for something in FORTRAN or C that does the heavy lifting.

I am certain that YOU don't understand that compute intensive work is a small fraction of what is done on computers. For example, I/O intensive work doesn't really care if it is Python or FORTRAN that is waiting for I/O to complete. There is a reason people don't drive a top fuel dragster to work.

If you meant compute i

Tough Love ( 215404 ) writes:
Re: ( Score: 2 )

I am certain that YOU don't understand that compute intensive work is a small fraction of what is done on computers.

First, you have no idea what I do or do not understand because you find yourself way too entertained by your own blather, and second, computers are used more for browsing than any other single task these days, and wasteful use of the CPU translates into perceptible lag. Playing media is very CPU intensive. You don't write those things in Python because Python sucks for efficiency. My point.

Yes, I had you figured, you're a sysadmin with delusions about being a dev. Seen way too many of those. They tend to ta

sjames ( 1099 ) writes:
Re: ( Score: 2 )

I draw my conclusions from what you have written in this thread. You see one screw and think NOTHING is a nail.

You don't write a video codec in Python, but it's a great choice for handling the UI and feeding the stream to the codec.

You have much to learn.

Tough Love ( 215404 ) writes:
Re: ( Score: 2 )

As I said, "Python is hopelessly inefficient for heavy duty programming". WTF are you blathering on about. Fresh air is good for you, maybe get out of your basement more.

sjames ( 1099 ) writes:
Re: ( Score: 2 )

"Heavy Duty " != "compute intensive". Say what you mean or don't complain when people disagree.

Tough Love ( 215404 ) writes:
Re: ( Score: 2 )

OK, you have your own private definition of terminology. Enjoy life in your own private universe.

sjames ( 1099 ) writes:
Re: ( Score: 2 )

I made a couple attempts to clarify terminology but you were too busy looking for an excuse to make an ass of yourself to notice.

K. S. Kyosuke ( 729550 ) writes:
Re: ( Score: 2 )

It really is [debian.net] and you blathering about what you don't know does not change that fact. (Python 14 minutes vs C++ 8..24 seconds for N-Body simulation.)

I've just run it on my machine. C++: 2.3 seconds, Python: 22 seconds. That's for straightforward mathy Python against C++ code with vector intrinsics. Concerning C++ code without manual vectorization, it's 4 seconds against 22. Not terribly bad, I'd say. Not to mention that this isn't the kind of code that would be typical for a larger application.

Tough Love ( 215404 ) , Saturday September 08, 2018 @09:44PM ( #57278110 )
Re:Love Python ( Score: 3 )

5x+ penalty just for writing the code in Python, you call it not terribly bad? So this is how Python fans think.

angel'o'sphere ( 80593 ) writes: < angelo...schneider@@@oomentor...de > on Saturday September 08, 2018 @10:19PM ( #57278224 ) Journal
Re:Love Python ( Score: 2 )

The first python program I wrote was a test for a job interview.
It involved downloading meteorologic data from the internet.
Analyzing it, creating a kind of summary and using a graph plotting library to display a graph (generate a *.png)

It would not have been noticeably faster if I had written it in C++, because ... you know: downloading via a network.

K. S. Kyosuke ( 729550 ) writes:
Re: ( Score: 2 )

I'm pretty sure you get another 3x penalty for not writing in assembly, too. So this is how C++ fans think.

Tough Love ( 215404 ) writes:
Re: ( Score: 2 )

I'm pretty sure you get another 3x penalty for not writing in assembly, too...

You are sure of that, are you? I bet you have never coded in assembly yourself, or looked at the assembly that gcc puts out in O3.

K. S. Kyosuke ( 729550 ) writes:
Re: ( Score: 2 )

I did, for 6502, 8051 and 8086. And I've seen GCC's output.

Tough Love ( 215404 ) writes:
Re: ( Score: 2 )

Then you looked at it without understanding it. Do you seriously think you can out-optimize gcc's code generator? Do you even know how to use LEA for arithmetic?

K. S. Kyosuke ( 729550 ) writes:
Re: ( Score: 2 )

LEA is not going to help you much in a small fixed-size n-body kernel. That's going to be mostly unrolled AVX code.

dwpro ( 520418 ) writes:
Re: ( Score: 2 )

Have you worked on or have an example of a large, complex application in Python? I'd like to see how it's organized, seems like it'd be a nightmare.

Paul Carver ( 4555 ) writes:
Re: ( Score: 2 )

https://github.com/openstack/n... [github.com]

flargleblarg ( 685368 ) writes:
Re: ( Score: 2 )

Undergraduate was all C/C++ for me [...]

And I believe you! -- because...

It's versatility makes It [...]

...evidentally you forgot to take grammar class. ;)

Anonymous Coward writes:
I'm looking for a language ... ( Score: 1 )

Is there a programming language out there, that is as fast as C++ or even C, has a proper strict type system (no duck typing, nothing like Python or JS), fast garbage collection (no fuckin' auto_pointer worst-of-both-worlds), is elegant and emergent (so very powerful for its simplicity), and doesn't require an advanced degree in computer sciences to do simple things (Hello Haskell!).
Of course with key libraries being available for it. (The equivalent of a standard library, Vulkan, a simple GUI widget toolki

Tablizer ( 95088 ) writes:
Re: ( Score: 1 )

It's called "IwannaPony", and you are NOT going to get one. You are asking too much.

CQDX ( 2720013 ) writes:
Qt (with C++) ( Score: 2 )

I really like the Qt framework. It's well done, well documented and well supported. Sure it's C++ so it doesn't meet your need of finding a new language but the API is pretty clean and simple so that you can avoid the complications and ugliness of C++ in most cases. If you need to though, it's all right there so you don't give up the additional power if you need it. The Python version is good too and very similar to the C++ version so it's not hard to switch between languages as your needs change.

serviscope_minor ( 664417 ) writes:
Re: ( Score: 2 )

Is there a programming language out there, that is as fast as C++ or even C, has a proper strict type system (no duck typing, nothing like Python or JS), fast garbage collection

No.

Neither will there be. There's always a penalty for garbage collection.

rkcth ( 262028 ) writes:
Re: ( Score: 1 )

Is there a programming language out there, that is as fast as C++ or even C, has a proper strict type system (no duck typing, nothing like Python or JS), fast garbage collection

No.

Neither will there be. There's always a penalty for garbage collection.

I think go is the closest to your requested feature list.

serviscope_minor ( 664417 ) writes:
Re: ( Score: 2 )

I think go is the closest to your requested feature list.

The GP's, not mine. And yuck, no thanks. Go just seems, well, deeply mediocre in many places. It's like someone updated C, ignoring the last 40 years of language developments.

Sure I can program without generics, I'm at a loss to see why I'd want to though.

Spacelem ( 189863 ) writes:
Re: ( Score: 2 )

Julia? It ticks quite a few of those boxes.

Its performance is good enough that I'm able to drop C++ (I'm a mathematical modeller), it's amazing at multidimensional array manipulation, and its typing system is really good. It just feels nice to program in. Bonus, one of the inspirations was Lisp, so it's got good metaprogramming. Also it's free software, made by people at MIT, so your conscience can remain appeased.

It's still a young language, but libraries are being built for it at an impressive rate, and i

TechyImmigrant ( 175943 ) writes:
Re: ( Score: 2 )

I'm holding out for Jai.

jd ( 1658 ) writes:
I would advise against any university ( Score: 2 )

That advocated a language. Languages shift faster than sand on speed. Universities should teach logic, reasoning, methodology, good practices and programming technique.

Languages should be for the purpose of example only. Universities should teach programming, not Java, software engineering, not Python. Java and Python should be in there, yes, along with Perl, C and Ada. Syntax is just sugar over the semantics. Learn the semantics well and the syntax is irrelevant. You want universities to teach kids how to

Tablizer ( 95088 ) writes:
Re: ( Score: 1 )

when Cobol and Fortran were the in thing. Last forever, they thought.

Any evidence most universities believed that? (They are still around and relatively common, by the way.)

Universities have to pick something to program lesson projects in, and selecting language(s) common in the current market helps student job prospects. I suggest STEM students be required to learn at least one compiled/strong-typed language, and one script/dynamic language.

TechyImmigrant ( 175943 ) writes:
Re: ( Score: 2 )

My university (Manchester University, UK) certainly didn't pick a language.

We studied many languages, compiler design, formal semantics and a boatload of other computer science things but at no point did they try to teach me a programming language. In fact at induction they said explicitly that they expected us all to know how to program before we arrived.

That was 30 years ago. Things might have changed.

petermgreen ( 876956 ) writes:
Re: ( Score: 2 )

(note: this is a UK perspective, other places may vary)

Universities have to work with the students they can get.

I think you and your co-students were lucky to catch the height of the 80s microcomputer boom, the time when computers booted into BASIC, when using a computer pretty much meant programming it.

Then the Windows PCs with their pre-canned applications and no obvious programming language swept in. Learning to program now meant not just finding a suitable book, it often meant buying the programming lang

Tablizer ( 95088 ) writes:
Re: ( Score: 1 )

My university...didn't pick a language.

Didn't you have projects that involved turning in your code to the teacher/graders? The graders don't want to see every which language. Multi-lingual graders are more expensive. Most colleges dictate a narrow set of languages for such projects.

TechyImmigrant ( 175943 ) writes:
Re: ( Score: 2 )

My university...didn't pick a language.

Didn't you have projects that involved turning in your code to the teacher/graders? The graders don't want to see every which language. Multi-lingual graders are more expensive. Most colleges dictate a narrow set of languages for such projects.

Yes, but it was hardly narrow. We had homework to hand in using a variety of languages, depending on the course. Pascal tended to be used for general algorithm stuff. But Smalltalk, Prolog, ML and other usual suspects were used when they made sense for the course. You were supposed to leave with a CompSci degree where you understood the theory of languages more than the details of specific languages. Usually, for project work, you were free to choose your language and would be expected to justify the reason

Tablizer ( 95088 ) writes:
Re: ( Score: 1 )

I don't remember herds of graders. That's what postgrads are for.

Maybe the graduation % was too low at my U.

pauljlucas ( 529435 ) writes:
Re: ( Score: 2 )

If you want to teach semantics, use either Smalltalk or Scheme. You can teach the syntax for either in five minutes.

pilaftank ( 1096645 ) writes:
Go Groovy! ( Score: 1 )

Groovy from #44 to #34

That's a pretty big jump. Groovy is a well-thought-out language and nicely facilitates writing clean, readable, compact code (especially compared to Java). However, it needs a better framework than Grails (85% really good convention over configuration stuff but 15% convoluted j2ee era framework stuff).

Tablizer ( 95088 ) writes:
Hackers love TC ( Score: 1 )

[SQL added to list] since "SQL appears to be Turing complete."

Not sure that's a good thing.

cats-paw ( 34890 ) writes:
python3 for full application development. wtf? ( Score: 2 , Interesting)

Can someone explain to me why using a dynamically typed language is a good idea for "big" applications ?

Python is subject to all sorts of really horrendous bugs that would not happen in a compiled, type-checked language.

For example, if you are accessing an undefined variable in the else branch of an if statement, you won't know it's undefined unless that branch is taken. Which means if it's something like a rarely occurring error condition, it's kind of annoying. Yes, you can figure it out by writing enough t

Frankablu ( 812139 ) writes:
Re: ( Score: 1 )

It's really simple: writing an application in Python is 3x quicker than writing it in C/C++/Java, etc. That means you either get to market 3x faster or only need 1/3 the number of programmers. Everything else is completely and utterly irrelevant. As for "you won't know it's undefined unless that branch is taken": the code linter built into your Python IDE will tell you about it.

dwpro ( 520418 ) writes: < dgeller777@@@gmail...com > on Sunday September 09, 2018 @05:06AM ( #57278852 )
Re:python3 for full application development. wtf? ( Score: 2 )

Hogwash. Even if 3x were true, dev is roughly 20-40% of overall software cost. Unless you're arguing that every aspect of coding is reliably 3x faster in Python. Given the value of strong typing when refactoring, I'd wager Python is not even competitive price-wise past the proof of concept/one-off script scenario.

Frankablu ( 812139 ) , Sunday September 09, 2018 @07:52AM ( #57279148 ) Journal
Re:python3 for full application development. wtf? ( Score: 1 )

I am arguing that because it's true. There isn't much benefit to strong typing when refactoring, but the benefits of duck typing when it comes to unit testing are quite significant. I've done commercial software development in strongly and weakly typed languages before. The benefits of things like "strong typing" are generally not that much if you are on board with the whole agile bandwagon, writing unit tests and all that. You would be much better off with Python's significantly better unit testing facilities than with strong typing.

dwpro ( 520418 ) writes:
Re: ( Score: 2 )

I'm at least in the caravan trailing the agile/unit-test bandwagon, but those are orthogonal to typing (and being explicit generally). Looking at a method signature and knowing that it requires a decimal and enumeration of a given type is more than a run-time check; it provides information about the intent, limitations, and discoverability options. There are very real trade-offs for the speed and flexibility of a language like Python, and my view is that it's more jet-fuel than solar power.

Frankablu ( 812139 ) writes:
Re: ( Score: 1 )

If you are using a good IDE "provides information about the intent, limitations, and discoverability options" can all be found out with a couple of key strokes (git blame, find all usages, pylint, etc..). So putting that information into the language explicitly is an obsolete and backwards way of going about things :p The job of the compiler in Python has just been redistributed elsewhere. It's different but there are many ways to solve the same problems.

sjames ( 1099 ) writes:
Re: ( Score: 2 )

Six of one, half dozen of the other. The Python program will be smaller for the same functionality, and it won't have buffer overflows and memory leaks. The C program will run faster (unless it has to wait on I/O) and will check for variables used before assignment.

munch117 ( 214551 ) , Sunday September 09, 2018 @03:40AM ( #57278724 )
Re:python3 for full application development. wtf? ( Score: 4 , Interesting)

Using an undefined variable in Python triggers an exception, and you get a traceback. In a larger program you will normally have a system for capturing and storing such tracebacks for analysis, and with the traceback in hand, it's typically a very simple fix.

In C++ you get an incorrect value created by default-initialisation (or maybe undefined behaviour): the program hobbles along as best it can, and you may never find the problem. You just see your program behaving strangely sometimes, and as the program gets larger, those strange behaviours accumulate.

Python is subject to all sorts of really horrendous bugs that would not happen in a compiled, type-checked language.

Horrendous is not the right word. Bugs that come with tracebacks are simple bugs. Zen#10: "Errors should never pass silently" is exactly what you want in large-scale programming.
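For illustration, a minimal sketch of the kind of traceback-capturing system described above (the wrapped function and log file name are made up):

import logging

logging.basicConfig(filename='tracebacks.log', level=logging.ERROR)

def safe_call(fn, *args, **kwargs):
    # Run fn; if it fails (e.g. a NameError from an undefined variable),
    # store the full traceback for later analysis instead of letting the
    # error pass silently (Zen #10).
    try:
        return fn(*args, **kwargs)
    except Exception:
        logging.exception("error in %s", getattr(fn, '__name__', repr(fn)))
        return None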

shutdown -p now ( 807394 ) writes:
Re: ( Score: 2 )

By "undefined variable" I think he means undeclared (or, in Python, unassigned in the proper scope, since locals are declared implicitly).

donbudge ( 4988829 ) writes:
Re: ( Score: 1 )

Writing big applications in Java/C++ takes too long. And then management decides to avoid 'custom code' in favor of 'standard' vendor tools where you can drag and drop to build parts of the 'big' application. This applies to ETL, reporting, and messaging, to name a few. With Python, the development cycle shortens and you can still stick to writing code instead of dealing with vendor binaries, lock-ins, licensing, etc. Python with a strong emphasis on unit tests, coupled with plugging in C/C++ where necessary ...

nashv ( 1479253 ) writes:
Re: ( Score: 2 )

If you are really really asinine about strong typing, you can declare types in Python https://medium.com/@ageitgey/l... [medium.com]
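For reference, a minimal sketch of such type declarations (the function and names are made up); annotations are checked by external tools such as mypy, not enforced by CPython at runtime:

def greet(name: str, times: int = 1) -> str:
    # the annotations document intent; the interpreter itself ignores them
    return ', '.join([f'Hello, {name}'] * times)

message: str = greet('world', times=2)
print(message)  # Hello, world, Hello, world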

SuperKendall ( 25149 ) , Saturday September 08, 2018 @09:51PM ( #57278124 )
Hello Machine Learning ( Score: 4 , Interesting)

I think what has really propelled Python into a higher rank is machine learning, where it is simply the de-facto language of choice by quite a margin.

I have to admit I am impressed with the progress it has made; among the many recent CS grads I've talked to, it seemed to be the favorite language...

I have to admit that over the years I've not really enjoyed Python much myself in the on-and-off times I've used it; for me the spaces as indent levels get maybe too close to the meaningful whitespace of Fortran... I guess modern programmers do not have this hangup. :-)

So good work Python, a well deserved ascent!

shutdown -p now ( 807394 ) writes:
Re: ( Score: 2 )

It's not just ML, but data science in general.

And the other big thing was many universities switching to it from Java for their CS courses.

SuperKendall ( 25149 ) writes:
Re: ( Score: 2 )

That's a great point, and to be honest it is probably a better language for learning than Java... it would also explain why newer CS grads all like it more now.

The only downside is that most jobs are still using Java or something besides Python... but probably it means we'll see more Python used in businesses, I guess. That usually ends up following eventually.

CustomSolvers2 ( 4118921 ) , Sunday September 09, 2018 @03:06AM ( #57278678 ) Homepage
Python is very newcomer-friendly ( Score: 2 )

I have been sporadically using Python for some years already and never really liked it. Note that most of my experience is focused on C-based, strongly typed programming languages. Recently, I have been spending some time on a Python project and have come to appreciate its (newbie) friendliness.

I still don't quite understand the reason for all the tabs/spaces problems, consider it too slow, don't like the systematic need to rely on external resources, and will certainly continue using other languages before it. But I do understand now why newbies, those performing relatively small developments, or those wanting to rely on some of the associated resources might prefer Python (although I don't like being systematically forced to include external dependencies, I do recognise that Python deals with these aspects quite gracefully and that there are many interesting libraries). It is one of the most intuitive programming languages I have ever used, at least from what seems a newcomer perspective (e.g., the same command performing what are intuitively seen as similar actions).

zmooc ( 33175 ) writes:
Too simple ( Score: 2 )

I don't think this should be about lines of code written. A more interesting approach would be to also count all dependencies, counting things like libc a gazillion times. Even more interesting would be to count what's actually executed.

casperghst42 ( 580449 ) writes:
RIght... ( Score: 1 )

Not typesafe, and no switch/case ... whatever floats your boat.

vux984 ( 928602 ) writes:
Re: ( Score: 2 )

"Or any of the games in my library, which all appear to be C or C++, with a few C#."

From what I've seen a lot of the core engine stuff is C/C++; but a lot of the UI, AI, and "mod support" stuff is commonly done in Python and Lua.

Personally, I disagree with semantic whitespace so I don't like Python. (I think it's the editor's job to handle pretty formatting to reflect the structure, rather than the programmer's job to define structure with pretty formatting.) But I can see why Python would be a good learning language ...

angel'o'sphere ( 80593 ) writes:
Re: ( Score: 3 )

Eve Online is mostly written in Python, client and server.
It is the MMO game with the most concurrent users online at any time of the day.

Speed is not their problem.

vux984 ( 928602 ) writes:
Re: ( Score: 2 )

'bad habits' ? what sort do you think?

I sort of see it as the opposite... semantic whitespace teaches mostly good habits, it's just fucking irritating to maintain, and to work with snippets and code fragments, etc.

But it's highly readable, and pretty straightforward, and I don't see anything wrong with it as a beginning/educational language; for teaching flow control, algorithms, structured/modular programming, and so on.

[Nov 15, 2019] Go versus Python 3 -- fastest programs

Nov 15, 2019 | pages.debian.net
Back in April 2010, Russ Cox charitably suggested that only fannkuch-redux, fasta, k-nucleotide, mandelbrot, nbody, reverse-complement and spectral-norm were close to fair comparisons. As someone who implemented programming languages, his interest was "measuring the quality of the generated code when both compilers are presented with what amounts to the same program."

Differences in approach - to memory management, parallel programming, regex, arbitrary precision arithmetic, implementation technique - don't fit in that kind-of fair comparison -- but we still have to deal with them.

These are only the fastest programs. There may be additional measurements for programs which seem more like a fair comparison to you. Always look at the source code.

mandelbrot
source secs mem gz busy cpu load
Go 5.47 31,088 905 21.77 100% 99% 99% 99%
Python 3 259.50 48,192 688 1,036.70 100% 100% 100% 100%
spectral-norm
source secs mem gz busy cpu load
Go 3.94 2,740 548 15.74 100% 100% 100% 100%
Python 3 169.87 49,188 417 675.02 100% 99% 99% 99%
n-body
source secs mem gz busy cpu load
Go 21.25 1,588 1310 21.48 0% 1% 100% 0%
Python 3 865.18 8,176 1196 874.96 2% 20% 79% 0%
fasta
source secs mem gz busy cpu load
Go 2.08 3,560 1358 5.61 80% 37% 76% 78%
Python 3 63.55 844,180 1947 129.71 40% 71% 33% 61%
fannkuch-redux
source secs mem gz busy cpu load
Go 17.56 1,524 900 70.00 100% 100% 100% 100%
Python 3 534.40 47,236 950 2,104.05 99% 97% 99% 99%
k-nucleotide
source secs mem gz busy cpu load
Go 11.77 160,184 1607 44.52 94% 98% 94% 92%
Python 3 72.24 199,856 1967 275.38 94% 94% 96% 96%
reverse-complement
source secs mem gz busy cpu load
Go 3.83 1,782,940 1338 6.67 21% 74% 16% 62%
Python 3 16.93 1,777,852 434 17.58 78% 21% 4% 0%
binary-trees
source secs mem gz busy cpu load
Go 25.17 361,152 1005 86.86 87% 89% 86% 84%
Python 3 80.30 448,004 589 286.50 95% 87% 87% 88%
pidigits
source secs mem gz busy cpu load
Go 2.10 8,448 603 2.17 1% 48% 55% 0%
Python 3 3.47 10,356 386 3.53 1% 1% 0% 100%
regex-redux
source secs mem gz busy cpu load
Go 29.82 428,224 802 62.65 48% 45% 52% 65%
Python 3 18.45 457,340 512 37.52 69% 37% 50% 48%
Go go version go1.13 linux/amd64
Python 3 Python 3.8.0

[Nov 14, 2019] perl - package variable scope in module subroutine

Nov 14, 2019 | stackoverflow.com



brian d foy ,Jul 17, 2014 at 17:54

How do I change the value of a variable in the package used by a module so that subroutines in that module can use it?

Here's my test case:

testmodule.pm:

package testmodule;

use strict;
use warnings;
require Exporter;

our ($VERSION, @ISA, @EXPORT, @EXPORT_OK, %EXPORT_TAGS);

@ISA = qw(Exporter);
@EXPORT = qw(testsub);

my $greeting = "hello testmodule";
my $var2;

sub testsub {
    printf "__PACKAGE__: %s\n", __PACKAGE__;
    printf "\$main::greeting: %s\n", $main::greeting;
    printf "\$greeting: %s\n", $greeting;
    printf "\$testmodule::greeting: %s\n", $testmodule::greeting;
    printf "\$var2: %s\n", $var2;
} # End testsub
1;

testscript.pl:

#!/usr/bin/perl -w
use strict;
use warnings;
use testmodule;

our $greeting = "hello main";
my $var2 = "my var2 in testscript";

$testmodule::greeting = "hello testmodule from testscript";
$testmodule::var2 = "hello var2 from testscript";

testsub();

output:

Name "testmodule::var2" used only once: possible typo at ./testscript.pl line 11.
__PACKAGE__: testmodule
$main::greeting: hello main
$greeting: hello testmodule
$testmodule::greeting: hello testmodule from testscript
Use of uninitialized value $var2 in printf at testmodule.pm line 20.
$var2:

I expected $greeting and $testmodule::greeting to be the same since the package of the subroutine is testmodule .

I guess this has something to do with the way use'd modules are eval'd as if in a BEGIN block, but I'd like to understand it better.

I was hoping to set the value of the variable from the main script and use it in the module's subroutine without using the fully-qualified name of the variable.

perl-user ,Sep 5, 2013 at 13:58

As you found out, when you use my , you are creating a locally scoped non-package variable. To create a package variable, you use our and not my :
my $foo = "this is a locally scoped, non-package variable";
our $bar = "This is a package variable that's visible in the entire package";

Even better:

{
   my $foo = "This variable is only available in this block";
   our $bar = "This variable is available in the whole package":
}

print "$foo\n";    #Whoops! Undefined variable
print "$bar\n";    #Bar is still defined even out of the block

When you don't put use strict in your program, all variables defined are package variables. That's why when you don't put it, it works the way you think it should and putting it in breaks your program.

However, as you can see in the following example, using our will solve your dilemma:

File Local/Foo.pm
#! /usr/local/bin perl
package Local::Foo;

use strict;
use warnings;
use feature qw(say);

use Exporter 'import';
our @EXPORT = qw(testme);

our $bar = "This is the package's bar value!";
sub testme {

    # $foo is a locally scoped, non-package variable. It's undefined and an error
    say qq(The value of \$main::foo is "$main::foo");

    # $bar is defined in package main::, and will print out
    say qq(The value of \$main::bar is "$main::bar");

    # These both refer to $Local::Foo::bar
    say qq(The value of \$Local::Foo::bar is "$Local::Foo::bar");
    say qq(The value of bar is "$bar");
}

1;
File test.pl
#! /usr/local/bin perl
use strict;
use warnings;
use feature qw(say);
use Local::Foo;

my $foo = "This is foo";
our $bar = "This is bar";
testme;

say "";
$Local::Foo::bar = "This is the NEW value for the package's bar";
testme

And, the output is:

Use of uninitialized value $foo in concatenation (.) or string at Local/Foo.pm line 14.
The value of $main::foo is ""
The value of $main::bar is "This is bar"
The value of $Local::Foo::bar is "This is the package's bar value!"
The value of bar is "This is the package's bar value!"

Use of uninitialized value $foo in concatenation (.) or string at Local/Foo.pm line 14.
The value of $main::foo is ""
The value of $main::bar is "This is bar"
The value of $Local::Foo::bar is "This is the NEW value for the package's bar"
The value of bar is "This is the NEW value for the package's bar"

The error message you're getting is the result of $foo being a local variable, and thus isn't visible inside the package. Meanwhile, $bar is a package variable and is visible.

Sometimes, it can be a bit tricky:

if ($bar eq "one") {
   my $foo = 1;
}
else {
   my $foo = 2;
}

print "Foo = $foo\n";

That doesn't work because $foo only has a value inside the if block. You have to do this:

my $foo;
if ($bar eq "one") {
   $foo = 1;
}
else {
  $foo = 2;
}

print "Foo = $foo\n"; #This works!

Yes, it can be a bit tricky to get your head wrapped around initially, but the use of use strict; and use warnings; is now de rigueur, and for good reasons. Together they have probably eliminated 90% of the mistakes people make in Perl. You can't make the mistake of setting the value of $foo in one part of the program and attempting to use $Foo in another. It's one of the things I really miss in Python.

> ,

After reading Variable Scoping in Perl: the basics more carefully, I realized that a variable declared with my isn't in the current package. For example, in a simple script with no modules if I declare my $var = "hello" $main::var still doesn't have a value.

The way that this applies in this case is in the module. Since my $greeting is declared in the file, that hides the package's version of $greeting and that's the value which the subroutine sees. If I don't declare the variable first, the subroutine would see the package variable, but it doesn't get that far because I use strict .

If I don't use strict and don't declare my $greeting , it works as I would have expected. Another way to get the intended value and not break use strict is to use our $greeting . The difference being that my declares a variable in the current scope while our declares a variable in the current package .
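A minimal sketch of that difference (the package and variable names are made up):

package Demo;
use strict;

our $pkg = "package variable";   # our: aliased to $Demo::pkg
my  $lex = "lexical variable";   # my: visible in this scope only, not via the package

print "$Demo::pkg\n";                                          # "package variable"
print( (defined $Demo::lex) ? "defined\n" : "undefined\n" );   # "undefined"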

[Nov 13, 2019] How fast is Perl s smartmatch operator when searching for a scalar in an array

Nov 13, 2019 | stackoverflow.com

Paul Tomblin ,Oct 19, 2010 at 13:38

I want to repeatedly search for values in an array that does not change.

So far, I have been doing it this way: I put the values in a hash (so I have an array and a hash with essentially the same contents) and I search the hash using exists .

I don't like having two different variables (the array and the hash) that both store the same thing; however, the hash is much faster for searching.

I found out that there is a ~~ (smartmatch) operator in Perl 5.10. How efficient is it when searching for a scalar in an array?

> ,

If you want to search for a single scalar in an array, you can use List::Util 's first subroutine. It stops as soon as it knows the answer. I don't expect this to be faster than a hash lookup if you already have the hash , but when you consider creating the hash and having it in memory, it might be more convenient for you to just search the array you already have.
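For illustration, a minimal sketch of first (the array contents are made up); the block stops scanning at the first hit:

use List::Util qw(first);

my @array = qw(apple banana cherry);
my $match = first { $_ eq 'banana' } @array;
print( (defined $match) ? "found $match\n" : "not found\n" );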

As for the smarts of the smart-match operator, if you want to see how smart it is, test it. :)

There are at least three cases you want to examine. The worst case is that every element you want to find is at the end. The best case is that every element you want to find is at the beginning. The likely case is that the elements you want to find average out to being in the middle.

Now, before I start this benchmark, I expect that if the smart match can short-circuit (and it can; it's documented in perlsyn), the best case times will stay the same despite the array size, while the other ones get increasingly worse. If it can't short-circuit and has to scan the entire array every time, there should be no difference in the times because every case involves the same amount of work.

Here's a benchmark:

#!perl
use 5.12.2;
use strict;
use warnings;

use Benchmark qw(cmpthese);

my @hits = qw(A B C);
my @base = qw(one two three four five six) x ( $ARGV[0] || 1 );

my @at_end       = ( @base, @hits );
my @at_beginning = ( @hits, @base );

my @in_middle = @base;
splice @in_middle, int( @in_middle / 2 ), 0, @hits;

my @random = @base;
foreach my $item ( @hits ) {
    my $index = int rand @random;
    splice @random, $index, 0, $item;
    }

sub count {
    my( $hits, $candidates ) = @_;

    my $count;
    foreach ( @$hits ) { when( $candidates ) { $count++ } }
    $count;
    }

cmpthese(-5, {
    hits_beginning => sub { my $count = count( \@hits, \@at_beginning ) },
    hits_end       => sub { my $count = count( \@hits, \@at_end ) },
    hits_middle    => sub { my $count = count( \@hits, \@in_middle ) },
    hits_random    => sub { my $count = count( \@hits, \@random ) },
    control        => sub { my $count = count( [], [] ) },
  }
);
div class="answercell post-layout--right

,

Here's how the various parts did. Note that this is a logarithmic plot on both axes, so the slopes of the plunging lines aren't as close as they look:

So, it looks like the smart match operator is a bit smart, but that doesn't really help you because you still might have to scan the entire array. You probably don't know ahead of time where you'll find your elements. I expect a hash will perform the same as the best case smart match, even if you have to give up some memory for it.


Okay, so the smart match being smart times two is great, but the real question is "Should I use it?". The alternative is a hash lookup, and it's been bugging me that I haven't considered that case.

As with any benchmark, I start off thinking about what the results might be before I actually test them. I expect that if I already have the hash, looking up a value is going to be lightning fast. That case isn't a problem. I'm more interested in the case where I don't have the hash yet. How quickly can I make the hash and lookup a key? I expect that to perform not so well, but is it still better than the worst case smart match?

Before you see the benchmark, though, remember that there's almost never enough information about which technique you should use just by looking at the numbers. The context of the problem selects the best technique, not the fastest, contextless micro-benchmark. Consider a couple of cases that would select different techniques:

Now, keeping those in mind, I add to my previous program:

my %old_hash = map {$_,1} @in_middle; 

cmpthese(-5, {
    ...,
    new_hash       => sub { 
        my %h = map {$_,1} @in_middle; 
        my $count = 0;
        foreach ( @hits ) { $count++ if exists $h{$_} }
        $count;
        },
    old_hash       => sub { 
        my $count = 0;
        foreach ( @hits ) { $count++ if exists $old_hash{$_} }
        $count;
        },
    control_hash   => sub { 
        my $count = 0;
        foreach ( @hits ) { $count++ }
        $count;
        },
    }
);

Here's the plot. The colors are a bit difficult to distinguish. The lowest line there is the case where you have to create the hash any time you want to search it. That's pretty poor. The highest two (green) lines are the control for the hash (no hash actually there) and the existing hash lookup. This is a log/log plot; those two cases are faster than even the smart match control (which just calls a subroutine).

There are a few other things to note. The lines for the "random" case are a bit different. That's understandable because each benchmark (so, once per array scale run) randomly places the hit elements in the candidate array. Some runs put them a bit earlier and some a bit later, but since I only make the @random array once per run of the entire program, they move around a bit. That means that the bumps in the line aren't significant. If I tried all positions and averaged, I expect that "random" line to be the same as the "middle" line.

Now, looking at these results, I'd say that a smart-match is much faster in its worst case than the hash lookup is in its worst case. That makes sense. To create a hash, I have to visit every element of the array and also make the hash, which is a lot of copying. There's no copying with the smart match.

Here's a further case I won't examine though. When does the hash become better than the smart match? That is, when does the overhead of creating the hash spread out enough over repeated searches that the hash is the better choice?

,

Fast for small numbers of potential matches, but not faster than the hash. Hashes are really the right tool for testing set membership. Since hash access is amortized O(1) and smartmatch on an array is still an O(n) linear scan (albeit short-circuiting, unlike grep), with larger numbers of values in the allowed matches, smartmatch gets relatively worse. Benchmark code (matching against 3 values):
#!perl
use 5.12.0;
use Benchmark qw(cmpthese);

my @hits = qw(one two three);
my @candidates = qw(one two three four five six); # 50% hit rate
my %hash;
@hash{@hits} = ();

sub count_hits_hash {
  my $count = 0;
  for (@_) {
    $count++ if exists $hash{$_};
  }
  $count;
}

sub count_hits_smartmatch {
  my $count = 0;
  for (@_) {
    $count++ when @hits;
  }
  $count;
}

say count_hits_hash(@candidates);
say count_hits_smartmatch(@candidates);

cmpthese(-5, {
    hash => sub { count_hits_hash((@candidates) x 1000) },
    smartmatch => sub { count_hits_smartmatch((@candidates) x 1000) },
  }
);
Benchmark results:
             Rate smartmatch       hash
smartmatch  404/s         --       -65%
hash       1144/s       183%         --

[Nov 13, 2019] python - Running shell command and capturing the output

Nov 22, 2012 | stackoverflow.com

Vartec's answer doesn't read all lines, so I made a version that did:

import subprocess

def run_command(command):
    p = subprocess.Popen(command,
                         stdout=subprocess.PIPE,
                         stderr=subprocess.STDOUT)
    return iter(p.stdout.readline, b'')

Usage is the same as the accepted answer:

command = 'mysqladmin create test -uroot -pmysqladmin12'.split()
for line in run_command(command):
    print(line)
Max Ekman ,Oct 30, 2012 at 9:24

[Nov 13, 2019] Execute shell commands in Python

Nov 13, 2019 | unix.stackexchange.com



fooot ,Nov 8, 2017 at 21:39

I'm currently studying penetration testing and Python programming. I just want to know how I would go about executing a Linux command in Python. The commands I want to execute are:
echo 1 > /proc/sys/net/ipv4/ip_forward
iptables -t nat -A PREROUTING -p tcp --destination-port 80 -j REDIRECT --to-port 8080

If I just use print in Python and run it in the terminal, will it do the same as executing it as if you were typing it yourself and pressing Enter ?

binarysubstrate ,Feb 28 at 19:58

You can use os.system() , like this:
import os
os.system('ls')

Or in your case:

os.system('echo 1 > /proc/sys/net/ipv4/ip_forward')
os.system('iptables -t nat -A PREROUTING -p tcp --destination-port 80 -j REDIRECT --to-port 8080')

Better yet, you can use subprocess's call, it is safer, more powerful and likely faster:

from subprocess import call
call('echo "I like potatos"', shell=True)

Or, without invoking shell:

call(['echo', 'I like potatos'])

If you want to capture the output, one way of doing it is like this:

import subprocess
cmd = ['echo', 'I like potatos']
proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

o, e = proc.communicate()

print('Output: ' + o.decode('ascii'))
print('Error: '  + e.decode('ascii'))
print('code: ' + str(proc.returncode))

I highly recommend setting a timeout in communicate , and also to capture the exceptions you can get when calling it. This is a very error-prone code, so you should expect errors to happen and handle them accordingly.

https://docs.python.org/3/library/subprocess.html
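A minimal sketch of that advice (the command and timeout are arbitrary):

import subprocess

proc = subprocess.Popen(['sleep', '100'],
                        stdout=subprocess.PIPE, stderr=subprocess.PIPE)
try:
    out, err = proc.communicate(timeout=5)   # raises TimeoutExpired after 5 s
except subprocess.TimeoutExpired:
    proc.kill()                              # don't leave the child running
    out, err = proc.communicate()            # collect whatever it produced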

jordanm ,Oct 23, 2015 at 15:43

The first command simply writes to a file. You wouldn't execute that as a shell command because python can read and write to files without the help of a shell:
with open('/proc/sys/net/ipv4/ip_forward', 'w') as f:
    f.write("1")

The iptables command is something you may want to execute externally. The best way to do this is to use the subprocess module .

import subprocess
subprocess.check_call(['iptables', '-t', 'nat', '-A',
                       'PREROUTING', '-p', 'tcp', 
                       '--destination-port', '80',
                       '-j', 'REDIRECT', '--to-port', '8080'])

Note that this method also does not use a shell, which is unnecessary overhead.

Tom Hunt ,Oct 23, 2015 at 15:41

The quickest way:
import os
os.system("your command here")

This isn't the most flexible approach; if you need any more control over your process than "run it once, to completion, and block until it exits", then you should use the subprocess module instead.

jordanm ,Apr 5, 2018 at 9:23

As a general rule, you'd better use python bindings whenever possible (better Exception catching, among other advantages.)

For the echo command, it's obviously better to use python to write in the file as suggested in @jordanm's answer.

For the iptables command, maybe python-iptables ( PyPi page , GitHub page with description and doc ) would provide what you need (I didn't check your specific command).

This would make you depend on an external lib, so you have to weigh the benefits. Using subprocess works, but if you want to use the output, you'll have to parse it yourself, and deal with output changes in future iptables versions.

,

A python version of your shell. Be careful, I haven't tested it.
from subprocess import run

def bash(command):
    run(command.split())

>>> bash('find / -name null')
/dev/null
/sys/fs/selinux/null
/sys/devices/virtual/mem/null
/sys/class/mem/null
/usr/lib/kbd/consoletrans/null

[Nov 13, 2019] Static code analysis module in Perl - Stack Overflow

Nov 13, 2019 | stackoverflow.com


DavidO ,Jun 12, 2012 at 9:13

Is there any static code analysis module in Perl except B::Lint and Perl::Critic? How effective is Module::Checkstyle?

> ,

There is a post on perlmonks.org asking if PPI can be used for static analysis. PPI is the power behind Perl::Critic, according to the reviews of this module. (I have not used it yet).

Then there is perltidy .

[Nov 12, 2019] c - python (conditional-ternary) operator for assignments

Jul 30, 2014 | stackoverflow.com

karadoc ,May 14, 2013 at 13:01

Python has such an operator:
variable = something if condition else something_else

Alternatively, although not recommended (see @karadoc's comment):

variable = (condition and something) or something_else
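The reason the and/or form is not recommended: it silently misfires whenever something is falsy, e.g.:

condition = True
x = condition and 0 or 99    # x == 99: 0 is falsy, so the `or` arm wins
y = 0 if condition else 99   # y == 0: the conditional expression gets it right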


[Nov 11, 2019] python - How to access environment variable values

Nov 11, 2019 | stackoverflow.com
Environment variables are accessed through os.environ
import os
print(os.environ['HOME'])

Or you can see a list of all the environment variables using:

os.environ

As sometimes you might need to see a complete list!

# using get will return `None` if a key is not present rather than raise a `KeyError`
print(os.environ.get('KEY_THAT_MIGHT_EXIST'))

# os.getenv is equivalent, and can also give a default value instead of `None`
print(os.getenv('KEY_THAT_MIGHT_EXIST', default_value))

Python default installation on Windows is C:\Python . If you want to find out while running python you can do:

import sys
print(sys.prefix)

,

import sys
print sys.argv[0]

This will print foo.py for python foo.py , dir/foo.py for python dir/foo.py , etc. It's the first argument to python . (Note that after py2exe it would be foo.exe .)

[Nov 11, 2019] How can I find the current OS in Python

Nov 11, 2019 | stackoverflow.com

> ,May 29, 2012 at 21:57

Possible Duplicate:
Python: What OS am I running on?

As the title says, how can I find the current operating system in python?

Shital Shah ,Sep 23 at 23:34

I usually use sys.platform (docs) to get the platform. sys.platform will distinguish between linux, other unixes, and OS X, while os.name is "posix" for all of them.

For much more detailed information, use the platform module . This has cross-platform functions that will give you information on the machine architecture, OS and OS version, version of Python, etc. Also it has os-specific functions to get things like the particular linux distribution.

xssChauhan ,Sep 9 at 7:34

If you want user readable data but still detailed, you can use platform.platform()
>>> import platform
>>> platform.platform()
'Linux-3.3.0-8.fc16.x86_64-x86_64-with-fedora-16-Verne'

platform also has some other useful methods:

>>> platform.system()
'Windows'
>>> platform.release()
'XP'
>>> platform.version()
'5.1.2600'

Here's a few different possible calls you can make to identify where you are

import platform
import sys

def linux_distribution():
  try:
    return platform.linux_distribution()
  except:
    return "N/A"

print("""Python version: %s
dist: %s
linux_distribution: %s
system: %s
machine: %s
platform: %s
uname: %s
version: %s
mac_ver: %s
""" % (
sys.version.split('\n'),
str(platform.dist()),
linux_distribution(),
platform.system(),
platform.machine(),
platform.platform(),
platform.uname(),
platform.version(),
platform.mac_ver(),))

The outputs of this script ran on a few different systems (Linux, Windows, Solaris, MacOS) and architectures (x86, x64, Itanium, power pc, sparc) is available here: https://github.com/hpcugent/easybuild/wiki/OS_flavor_name_version

Steg ,Mar 31, 2015 at 15:13

import os
print os.name

This gives you the essential information you will usually need. To distinguish between, say, different editions of Windows, you will have to use a platform-specific method.
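For example, on Windows the platform module offers a more specific call (a sketch; the exact output varies by edition):

import platform

if platform.system() == 'Windows':
    # release, version, service pack, processor type
    release, version, csd, ptype = platform.win32_ver()
    print(release, version, csd, ptype)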

UnkwnTech ,Sep 21, 2008 at 6:17

https://docs.python.org/library/os.html

To complement Greg's post, if you're on a posix system, which includes MacOS, Linux, Unix, etc. you can use os.uname() to get a better feel for what kind of system it is.

> ,

Something along the lines:
import os
if (os.name == "posix"):
    print os.system("uname -a")
# insert other possible OSes here
# ...
else:
    print "unknown OS"

[Nov 11, 2019] Python read file from current line

Nov 11, 2019 | stackoverflow.com



dylanoo ,Feb 18, 2013 at 17:09

I have a problem with using Python to process a trace file (it contains billions of lines of data).

What I want to do is: the program will find one specific line in the file (say line # x), and then it needs to find another symbol starting from that line (line # x). Once it finds the line, it starts from (line # x) again to search for another one.

What I do now is as follows, but the problem is that it always needs to reopen the file and read from the beginning to find the matching lines (line # > x, and containing the symbol I want). For one big trace file, it takes too long to process.

1.

    i = 0
    for line in file.readlines():
        i += 1  # update the line number
        if i > x:
            if line.find(symbol) != -1:  # 'symbol' stands for whatever marker is being searched
                ...

or:

   for i, line in enumerate(open(file)):
      if i > x:
          if ....

Anyone can give me one hint on better ideas?

Thanks

dylanoo ,Feb 18, 2013 at 20:24

If the file is otherwise stable, use fileobj.tell() to remember your position in the file, then next time use fileobj.seek(pos) to return to that same position in the file.

This only works if you do not use the file object as an iterator (no for line in fileobject or next(fileobject)), as that uses a read-ahead buffer that will obscure the exact position.

Instead, use:

for line in iter(fileobj.readline, ''):

to still use fileobj in an iteration context.
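Putting the two pieces together, a minimal sketch (the file name and search symbol are made up):

with open('trace.log') as fileobj:
    pos = None
    for line in iter(fileobj.readline, ''):
        if 'SYMBOL_X' in line:
            pos = fileobj.tell()      # exact position: no read-ahead buffer here
            break
    if pos is not None:
        fileobj.seek(pos)             # later searches resume here, not at line 1
        for line in iter(fileobj.readline, ''):
            pass                      # continue scanning forward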

Martijn Pieters ♦ ,Feb 18, 2013 at 17:30

I suggest you use random access, and record where your line started. Something like:
index = []

fh = open("gash.txt")

for line in fh:
    if target in line:
        index.append(fh.tell() - len(line))

Then, when you want to recall the contents, use fh.seek(index[n]) .

A couple of "gotchas":

  1. Notice that the index position will not be the same as the line number. If you need the line number then maybe use a dictionary, with the line number as the key.
  2. On Windows, you will have to adjust the file position by -1. This is because the "\r" is stripped out and does not appear in the len(line) .

[Nov 08, 2019] Is losing BDFL a death sentence for open source projects such as Python? by Jason Baker

Jul 16, 2018 | opensource.com
What happens when a Benevolent Dictator For Life moves on from an open source project? (Image credits: original photo by Gabriel Kamener, Sown Together, modified by Jen Wike Huger.)

Guido van Rossum, creator of the Python programming language and Benevolent Dictator For Life (BDFL) of the project, announced his intention to step away.

Below is a portion of his message; the entire email is not terribly long and is worth taking the time to read if you're interested in the circumstances leading to van Rossum's departure.

I would like to remove myself entirely from the decision process. I'll still be there for a while as an ordinary core dev, and I'll still be available to mentor people -- possibly more available. But I'm basically giving myself a permanent vacation from being BDFL, and you all will be on your own.

After all that's eventually going to happen regardless -- there's still that bus lurking around the corner, and I'm not getting younger... (I'll spare you the list of medical issues.)

I am not going to appoint a successor.

So what are you all going to do? Create a democracy? Anarchy? A dictatorship? A federation?

It's worth zooming out for a moment to consider the issue at a larger scale. How an open source project is governed can have very real consequences on the long-term sustainability of its user and developer communities alike.

BDFLs tend to emerge from passion projects, where a single individual takes on a project before growing a community around it. Projects emerging from companies or other large organizations often lack this role, as the distribution of authority is more formalized, or at least more dispersed, from the start. Even then, it's not uncommon to need to figure out how to transition from one form of project governance to another as the community grows and expands.

But regardless of how an open source project is structured, ultimately, there needs to be some mechanism for deciding how to make technical decisions. Someone, or some group, has to decide which commits to accept, which to reject, and more broadly what direction the project is going to take from a technical perspective.

Surely the Python project will be okay without van Rossum. The Python Software Foundation has plenty of formalized structure in place bringing in broad representation from across the community. There's even been a humorous April Fools Python Enhancement Proposal (PEP) addressing the BDFL's retirement in the past.

That said, it's interesting that van Rossum did not heed the fifth lesson of Eric S. Raymond from his essay, The Mail Must Get Through (part of The Cathedral & the Bazaar ) , which stipulates: "When you lose interest in a program, your last duty to it is to hand it off to a competent successor." One could certainly argue that letting the community pick its own leadership, though, is an equally-valid choice.

What do you think? Are projects better or worse for being run by a BDFL? What can we expect when a BDFL moves on? And can someone truly step away from their passion project after decades of leading it? Will we still turn to them for the hard decisions, or can a community smoothly transition to new leadership without the pitfalls of forks or lost participants?

Can you truly stop being a BDFL? Or is it a title you'll hold, at least informally, until your death?

2 Comments

Mike James on 17 Jul 2018 Permalink

My take on the issue:
https://www.i-programmer.info/news/216-python/11967-guido-van-rossum-qui...

Maxim Stewart on 05 Aug 2018 Permalink

"So what are you all going to do? Create a democracy? Anarchy? A dictatorship? A federation?"

Power coalesced to one point is always scary when thought about in the context of succession. A vacuum invites anarchy and I often think about this for when Linus Torvalds leaves the picture. We really have no concrete answers for what is the best way forward but my hope is towards a democratic process. But, as current history indicates, a democracy untended by its citizens invites quite the nightmare and so too does this translate to the keeping up of a project.

[Nov 08, 2019] How to escape unicode characters in bash prompt correctly - Stack Overflow

Nov 08, 2019 | stackoverflow.com



Andy Ray ,Aug 18, 2011 at 19:08

I have a specific method for my bash prompt, let's say it looks like this:
CHAR="༇ "
my_function="
    prompt=\" \[\$CHAR\]\"
    echo -e \$prompt"

PS1="\$(${my_function}) \$ "

To explain the above: I'm building my bash prompt by executing a function stored in a string, which was a decision made as the result of this question . Let's pretend like it works fine, because it does, except when unicode characters get involved.

I am trying to find the proper way to escape a unicode character, because right now it messes with the bash line length. An easy way to test if it's broken is to type a long command, execute it, press CTRL-R and type to find it, and then pressing CTRL-A CTRL-E to jump to the beginning / end of the line. If the text gets garbled then it's not working.

I have tried several things to properly escape the unicode character in the function string, but nothing seems to be working.

Special characters like this work:

COLOR_BLUE=$(tput sgr0 && tput setaf 6)

my_function="
    prompt="\\[\$COLOR_BLUE\\] \"
    echo -e \$prompt"

Which is the main reason I made the prompt a function string. That escape sequence does NOT mess with the line length, it's just the unicode character.

Andy Ray ,Aug 23, 2011 at 2:09

The \[...\] sequence says to ignore this part of the string completely, which is useful when your prompt contains a zero-length sequence, such as a control sequence which changes the text color or the title bar, say. But in this case, you are printing a character, so the length of it is not zero. Perhaps you could work around this by, say, using a no-op escape sequence to fool Bash into calculating the correct line length, but it sounds like that way lies madness.

The correct solution would be for the line length calculations in Bash to correctly grok UTF-8 (or whichever Unicode encoding it is that you are using). Uhm, have you tried without the \[...\] sequence?

Edit: The following implements the solution I propose in the comments below. The cursor position is saved, then two spaces are printed, outside of \[...\] , then the cursor position is restored, and the Unicode character is printed on top of the two spaces. This assumes a fixed font width, with double width for the Unicode character.

PS1='\['"`tput sc`"'\]  \['"`tput rc`"'༇ \] \$ '

At least in the OSX Terminal, Bash 3.2.17(1)-release, this passes cursory [sic] testing.

In the interest of transparency and legibility, I have ignored the requirement to have the prompt's functionality inside a function, and the color coding; this just changes the prompt to the character, space, dollar prompt, space. Adapt to suit your somewhat more complex needs.

tripleee ,Aug 23, 2011 at 7:01

@tripleee wins it, posting the final solution here because it's a pain to post code in comments:
CHAR="༇"
my_function="
    prompt=\" \\[`tput sc`\\]  \\[`tput rc`\\]\\[\$CHAR\\] \"
    echo -e \$prompt"

PS1="\$(${my_function}) \$ "

The trick as pointed out in @tripleee's link is the use of the commands tput sc and tput rc which save and then restore the cursor position. The code is effectively saving the cursor position, printing two spaces for width, restoring the cursor position to before the spaces, then printing the special character so that the width of the line is from the two spaces, not the character.

> ,

(Not the answer to your problem, but some pointers and general experience related to your issue.)

I see the behaviour you describe about cmd-line editing (Ctrl-R, ... Ctrl-A Ctrl-E ...) all the time, even without unicode chars.

At one work-site, I spent the time to figure out the difference between the terminal's interpretation of the TERM setting vs. the TERM definition used by the OS (well, stty I suppose).

NOW, when I have this problem, I escape out of my current attempt to edit the line, bring the line up again, and then immediately go to the 'vi' mode, which opens the vi editor. (press just the 'v' char, right?). All the ease of use of a full-fledged session of vi; why go with less ;-)?

Looking again at your problem description, when you say

my_function="
    prompt=\" \[\$CHAR\]\"
    echo -e \$prompt"

That is just a string definition, right? And I'm assuming you're simplifying the problem definition by assuming this is the output of your my_function . It seems very likely that the steps of creating the function definition, calling the function AND using the returned values present a lot of opportunities for shell-quoting to not work the way you want it to.

If you edit your question to include the my_function definition, and its complete use (reducing your function to just what is causing the problem), it may be easier for others to help with this too. Finally, do you use set -vx regularly? It can help show the how/when/what of variable expansions; you may find something there.

Failing all of those, look at the O'Reilly termcap & terminfo book. You may need to look at the man page for your local system's stty and related cmds, AND you may do well to look for user groups specific to your Linux system (I'm assuming you use a Linux variant).

I hope this helps.

[Nov 02, 2019] Copied variable changes the original

Nov 14, 2011 | stackoverflow.com



André Freitas ,Nov 14, 2011 at 13:56

I have a simple problem in Python that is very very strange.
def estExt(matriz,erro):
    # (1) Determine the solution vector X
    print ("Matrix before:");
    print(matriz);

    aux=matriz;
    x=solucoes(aux); # IF aux is a copy of matriz, why is matriz changed??

    print ("Matrix after: ");
    print(matriz)

...

As you see below, the matrix matriz is changed in spite of the fact that aux is the one being changed by the function solucoes() .

Matrix before:
[[7, 8, 9, 24], [8, 9, 10, 27], [9, 10, 8, 27]]

Matrix after:
[[7, 8, 9, 24], [0.0, -0.14285714285714235, -0.2857142857142847, -0.42857142857142705], [0.0, 0.0, -3.0, -3.0000000000000018]]

André Freitas ,Nov 14, 2011 at 17:16

The line
aux=matriz;

Does not make a copy of matriz , it merely creates a new reference to matriz named aux . You probably want

aux=matriz[:]

Which will make a copy, assuming matriz is a simple data structure. If it is more complex, you should probably use copy.deepcopy

aux = copy.deepcopy(matriz)

As an aside, you don't need semi-colons after each statement, python doesn't use them as EOL markers.

André Freitas ,Nov 15, 2011 at 8:49

Use copy module
aux = copy.deepcopy(matriz) # there is copy.copy too for shallow copying

Minor one: semicolons are not needed.

aux is not a copy of matrix , it's just a different name that refers to the same object.
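For illustration, a minimal sketch of the three behaviours (alias, shallow copy, deep copy):

import copy

matriz = [[1, 2], [3, 4]]

alias   = matriz                  # same object: changes show through both names
shallow = matriz[:]               # new outer list, but the inner lists are shared
deep    = copy.deepcopy(matriz)   # fully independent copy

matriz[0][0] = 99
print(alias[0][0], shallow[0][0], deep[0][0])   # 99 99 1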

[Oct 25, 2019] unix - Remove a file on Linux using the inode number - Super User

Oct 25, 2019 | superuser.com

Some other methods include:

escaping the special chars:

[~]$rm \"la\*

use the find command and only search the current directory. The find command can search for inode numbers, and has a handy -delete switch:

[~]$ls -i
7404301 "la*

[~]$find . -maxdepth 1 -type f -inum 7404301
./"la*

[~]$find . -maxdepth 1 -type f -inum 7404301 -delete
[~]$ls -i
[~]$


Maybe I'm missing something, but...
rm '"la*'

Anyways, filenames don't have inodes, files do. Trying to remove a file without removing all filenames that point to it will damage your filesystem.

[Oct 22, 2019] Is there an advantage to using Bash over Perl or Python?

Oct 22, 2019 | stackoverflow.com



> ,May 2, 2011 at 18:58

Hey I've been using Linux for a while and thought it was time to finally dive into shell scripting.

The problem is I've failed to find any significant advantage of using Bash over something like Perl or Python. Are there any performance or power differences between the two? I'd figure Python/Perl would be more well suited as far as power and efficiency goes.

Sebastian ,May 2, 2011 at 15:21

Two advantages come to mind:

By the way, I usually have some python calls in my bash scripts (e.g. for plotting). Use whatever is best for the task!

Mario Peshev ,May 2, 2011 at 15:16

Perl scripts are usually (if not 100% of the time) faster than bash.

A discussion on that: Perl vs Bash

reinierpost ,May 7, 2011 at 12:16

bash isn't a language so much as a command interpreter that's been hacked to death to allow for things that make it look like a scripting language. It's great for the simplest 1-5 line one-off tasks, but things that are dead simple in Perl or Python like array manipulation are horribly ugly in bash. I also find that bash tends not to pass two critical rules of thumb:
  1. The 6-month rule, which says you should be able to easily discern the purpose and basic mechanics of a script you wrote but haven't looked at in 6 months.
  2. The 'WTF per minute' rule. Everyone has their limit, and mine is pretty small. Once I get to 3 WTFs/min, I'm looking elsewhere.

As for 'shelling out' in scripting languages like Perl and Python, I find that I almost never need to do this, fwiw (disclaimer: I code almost 100% in Python). The Python os and shutil modules have most of what I need most of the time, and there are built-in modules for handling tarfiles, gzip files, zip files, etc. There's a glob module, an fnmatch module... there's a lot of stuff there. If you come across something you need to parallelize, then indent your code a level, put it in a 'run()' method, put that in a class that extends either threading.Thread or multiprocessing.Process, instantiate as many of those as you want, calling 'start()' on each one. Less than 5 minutes to get parallel execution generally.

Best of luck. Hope this helps.
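For illustration, a minimal sketch of the threading pattern described above (the per-item work is a placeholder):

import threading

class Worker(threading.Thread):
    def __init__(self, chunk):
        super().__init__()
        self.chunk = chunk

    def run(self):              # the loop body, indented one level into run()
        for item in self.chunk:
            pass                # process item here

workers = [Worker(c) for c in (range(0, 50), range(50, 100))]
for w in workers:
    w.start()                   # kick off parallel execution
for w in workers:
    w.join()                    # wait for all workers to finish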

daotoad ,May 2, 2011 at 17:40

For big projects use a language like Perl.

There are a few things you can only do in bash (for example, altering the calling environment when a script is sourced rather than run). Also, shell scripting is commonplace. It is worthwhile to learn the basics and learn your way around the available docs.

Plus there are times when knowing a shell well can save your bacon (on a fork-bombed system where you can't start any new processes, or if /usr/bin and or /usr/local/bin fail to mount).

Sebastian ,May 3, 2011 at 8:47

The advantage is that it's right there. Unless you use Python (or Perl) as your shell, writing a script to do a simple loop is a bunch of extra work.

For short, simple scripts that call other programs, I'll use Bash. If I want to keep the output, odds are good that I'll trade up to Python.

For example:

for file in *; do process $file ; done

where process is a program I want to run on each file, or...

while true; do program_with_a_tendency_to_fail ; done

Doing either of those in Python or Perl is overkill.

For actually writing a program that I expect to maintain and use over time, Bash is rarely the right tool for the job. Particularly since most modern Unices come with both Perl and Python.

tchrist ,May 4, 2011 at 11:01

The most important advantage of POSIX shell scripts over Python or Perl scripts is that a POSIX shell is available on virtually every Unix machine. (There are also a few tasks shell scripts happen to be slightly more convenient for, but that's not a major issue.) If the portability is not an issue for you, I don't see much need to learn shell scripting.

tchrist ,May 3, 2011 at 23:50

If you want to execute programs installed on the machine, nothing beats bash. You can always make a system call from Perl or Python, but I find it to be a hassle to read return values, etc.

And since you know it will work pretty much anywhere throughout all of time...

Alexandr Ciornii ,May 3, 2011 at 8:26

The advantage of shell scripting is that it's globally present on *ix boxes, and has a relatively stable core set of features you can rely on to run everywhere. With Perl and Python you have to worry about whether they're available and if so what version, as there have been significant syntactical incompatibilities throughout their lifespans. (Especially if you include Python 3 and Perl 6.)

The disadvantage of shell scripting is everything else. Shell scripting languages are typically lacking in expressiveness, functionality and performance. And hacking command lines together from strings in a language without strong string processing features and libraries, to ensure the escaping is correct, invites security problems. Unless there's a compelling compatibility reason you need to go with shell, I would personally plump for a scripting language every time.

[Oct 22, 2019] Perl vs Python log processing performance

Oct 22, 2019 | stackoverflow.com


texasbruce ,Nov 11, 2012 at 2:05

I am working on a web-based log management system that will be built on the Grails framework and I am going to use one of the text processing languages like Python or Perl. I have created Python and Perl scripts that load log files and parse each line to save them to a MySQL database (the file contains about 40,000 lines, about 7MB). It took 1 min 2 secs using Perl and only 17 secs using Python .

I had supposed that Perl would be faster than Python, as Perl is the original text-processing language (my expectation was also based on various blog posts about Perl text-processing performance).

Also, I was not expecting a 47-second difference between Perl and Python. Why is Perl taking more time than Python to process my log file? Is it because I am using the wrong DB module, or can my Perl code and regular expression be improved?

Note: I am a Java and Groovy developer and I have no experience with Perl (I am using Strawberry Perl v5.16). Also I have made this test with Java (1 min 5 secs) and Groovy (1 min 7 secs) but more than 1 min to process the log file is too much, so both languages are out and now I want to choose between Perl and Python.

PERL Code

use DBI;
use DBD::mysql;
# make connection to database
$connection = DBI->connect("dbi:mysql:logs:localhost:3306","root","") || die      "Cannot connect: $DBI::errstr";

# set the value of your SQL query
$query = "insert into logs (line_number, dated, time_stamp, thread, level, logger, user, message)
        values (?, ?, ?, ?, ?, ?, ?, ?) ";

# prepare your statement for connecting to the database
$statement = $connection->prepare($query); 

$runningTime = time;

# open text file
open (LOG,'catalina2.txt') || die "Cannot read logfile!\n";

while (<LOG>) {
    my ($date, $time, $thread, $level, $logger, $user, $message) = /^(\d{4}-\d{2}-\d{2}) (\d{2}:\d{2}:\d{2},\d{3}) (\[.*\]) (.*) (\S*) (\(.*\)) - (.*)$/;

    $statement->execute(1, $date, $time, $thread, $level, $logger, $user, $message);
}  

# close the open text file
close(LOG);

# close database connection
$connection->disconnect;

$runningTime = time - $runningTime;
printf("\n\nTotal running time: %02d:%02d:%02d\n\n", int($runningTime / 3600),   int(($runningTime % 3600) / 60), int($runningTime % 60));

# exit the script
exit;

PYTHON Code

import re
import mysql.connector
import time

file = open("D:\catalina2.txt","r")
rexp = re.compile('^(\d{4}-\d{2}-\d{2}) (\d{2}:\d{2}:\d{2},\d{3}) (\[.*\]) (.*) (\S*) (\(.*\)) - (.*)$')
conn = mysql.connector.connect(user='root',host='localhost',database='logs')
cursor = conn.cursor()

tic = time.clock()

increment  = 1
for text in file.readlines():
    match = rexp.match(text)
    increment +=  1
cursor.execute('insert into logs (line_number,dated, time_stamp, thread,level,logger,user,message ) values (%s,%s,%s,%s,%s,%s,%s,%s)', (increment, match.group(1), match.group(2),match.group(3),match.group(4),match.group(5),match.group(6),match.group(7)))

conn.commit()
cursor.close()
conn.close()

toc = time.clock()
print "Total time: %s" % (toc - tic)

David-SkyMesh ,Nov 11, 2012 at 1:35

It is not a fair comparison:

You are only calling cursor.execute once in Python:

for text in file.readlines():
    match = rexp.match(text)
    increment +=  1
cursor.execute('insert into logs (line_number,dated, time_stamp, thread,level,logger,user,message ) values (%s,%s,%s,%s,%s,%s,%s,%s)', (increment, match.group(1), match.group(2),match.group(3),match.group(4),match.group(5),match.group(6),match.group(7)))

But you are calling $statement->execute many times in Perl:

while (<LOG>) {
    my ($date, $time, $thread, $level, $logger, $user, $message) = /^(\d{4}-\d{2}-\d{2}) (\d{2}:\d{2}:\d{2},\d{3}) (\[.*\]) (.*) (\S*) (\(.*\)) - (.*)$/;

    $statement->execute(1, $date, $time, $thread, $level, $logger, $user, $message);
}

By the way, for the Python version, calling cursor.execute once for every row will be slow. You can make it faster by using cursor.executemany :

sql = 'insert into logs (line_number,dated, time_stamp, thread,level,logger,user,message ) values (%s,%s,%s,%s,%s,%s,%s,%s)'
args = []
for text in file:
    match = rexp.match(text)
    increment +=  1
    args.append([increment] + list(match.groups()))

cursor.executemany(sql, args)

If there are too many lines in the log file, you may need to break this up into blocks:

args = []
for text in file:
    match = rexp.match(text)
    increment +=  1
    args.append([increment] + list(match.groups()))
    if increment % 1000 == 0:
        cursor.executemany(sql, args)
        args = []
if args:
    cursor.executemany(sql, args)

(Also, don't use file.readlines() because this creates a list (which may be huge). file is an iterator which spits out one line at a time, so for text in file suffices.)

[Oct 22, 2019] Python Code Glitch May Have Caused Errors In Over 100 Published Studies

Oct 22, 2019 | science.slashdot.org

(vice.com)

An anonymous reader quotes Motherboard: The glitch caused results of a common chemistry computation to vary depending on the operating system used, causing discrepancies among Mac, Windows, and Linux systems. The researchers published the revelation and a debugged version of the script, which amounts to roughly 1,000 lines of code, on Tuesday in the journal Organic Letters .

"This simple glitch in the original script calls into question the conclusions of a significant number of papers on a wide range of topics in a way that cannot be easily resolved from published information because the operating system is rarely mentioned," the new paper reads. "Authors who used these scripts should certainly double-check their results and any relevant conclusions using the modified scripts in the [supplementary information]."

Yuheng Luo, a graduate student at the University of Hawaii at Manoa, discovered the glitch this summer when he was verifying the results of research conducted by chemistry professor Philip Williams on cyanobacteria... Under supervision of University of Hawaii at Manoa assistant chemistry professor Rui Sun, Luo used a script written in Python that was published as part of a 2014 paper by Patrick Willoughby, Matthew Jansma, and Thomas Hoye in the journal Nature Protocols . The code computes chemical shift values for NMR, or nuclear magnetic resonance spectroscopy, a common technique used by chemists to determine the molecular make-up of a sample. Luo's results did not match up with the NMR values that Williams' group had previously calculated, and according to Sun, when his students ran the code on their computers, they realized that different operating systems were producing different results.

Sun then adjusted the code to fix the glitch, which had to do with how different operating systems sort files.
The researcher who wrote the flawed script told Motherboard that the new study was "a beautiful example of science working to advance the work we reported in 2014. They did a tremendous service to the community in figuring this out."

Sun described the original authors as "very gracious," saying they encouraged the publication of the findings.
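
The story doesn't reproduce the code, but the class of bug is easy to demonstrate: Python functions such as os.listdir() and glob.glob() return file names in an arbitrary, platform-dependent order, so any computation that depends on processing order can silently differ across operating systems. A minimal sketch of the defensive fix:

import glob

# glob (like os.listdir) returns names in an arbitrary,
# platform-dependent order -- never rely on it.
names = glob.glob("*.out")

# Sorting explicitly makes the processing order deterministic
# on every operating system.
for name in sorted(names):
    print(name)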

[Oct 22, 2019] Difference in regex behavior between Perl and Python?

Oct 22, 2019 | stackoverflow.com



Gumbo ,Apr 16, 2009 at 18:42

I have a couple of email addresses, '[email protected]' and '[email protected]'.

In perl, I could take the To: line of a raw email and find either of the above addresses with

/\w+@(tickets\.)?company\.com/i

In Python, I simply wrote the above regex as '\w+@(tickets\.)?company\.com' expecting the same result. However, [email protected] isn't found at all, and a findall on the second returns a list containing only 'tickets.'. So clearly the '(tickets\.)?' is the problem area, but what exactly is the difference in regular expression rules between Perl and Python that I'm missing?

Axeman ,Apr 16, 2009 at 21:10

The documentation for re.findall :
findall(pattern, string, flags=0)
    Return a list of all non-overlapping matches in the string.

    If one or more groups are present in the pattern, return a
    list of groups; this will be a list of tuples if the pattern
    has more than one group.

    Empty matches are included in the result.

Since (tickets\.) is a group, findall returns that instead of the whole match. If you want the whole match, put a group around the whole pattern and/or use non-grouping matches, i.e.

r'(\w+@(tickets\.)?company\.com)'
r'\w+@(?:tickets\.)?company\.com'

Note that you'll have to pick out the first element of each tuple returned by findall in the first case.
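
A quick interactive demonstration of the difference (the addresses here are made-up examples):

>>> import re
>>> s = "bob@tickets.company.com alice@company.com"
>>> re.findall(r'\w+@(tickets\.)?company\.com', s)
['tickets.', '']
>>> re.findall(r'\w+@(?:tickets\.)?company\.com', s)
['bob@tickets.company.com', 'alice@company.com']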

chaos ,Apr 16, 2009 at 18:45

I think the problem is in your expectations of extracted values. Try using this in your current Python code:
'(\w+@(?:tickets\.)?company\.com)'

Jason Coon ,Apr 16, 2009 at 18:46

Two problems jump out at me:
  1. You need to use a raw string to avoid having to escape " \ "
  2. You need to escape " . "

So try:

r'\w+@(tickets\.)?company\.com'

EDIT

Sample output:

>>> import re
>>> exp = re.compile(r'\w+@(tickets\.)?company\.com')
>>> bool(exp.match("[email protected]"))
True
>>> bool(exp.match("[email protected]"))
True


There isn't a difference in the regexes, but there is a difference in what you are looking for. In both regexes, the capture group matches only "tickets." when it is present. You probably want something like this:
#!/usr/bin/python

import re

regex = re.compile("(\w+@(?:tickets\.)?company\.com)");

a = [
    "[email protected]", 
    "[email protected]", 
    "[email protected]",
    "[email protected]"
];

for string in a:
    print regex.findall(string)

[Oct 22, 2019] Python for a Perl programmer

Oct 22, 2019 | stackoverflow.com



Hamish Grubijan ,Feb 17, 2010 at 17:56

I am an experienced Perl developer with some degree of experience and/or familiarity with other languages (working experience with C/C++, school experience with Java and Scheme, and passing familiarity with many others).

I might need to get some web work done in Python (most immediately, related to Google App Engine). As such, I'd like to ask SO overmind for good references on how to best learn Python for someone who's coming from Perl background (e.g. the emphasis would be on differences between the two and how to translate perl idiomatics into Python idiomatics, as opposed to generic Python references). Something also centered on Web development is even better. I'll take anything - articles, tutorials, books, sample apps?

Thanks!

FMc ,Dec 19, 2014 at 17:50

I've recently had to make a similar transition for work reasons, and it's been pretty painful. For better or worse, Python has a very different philosophy and way of working than Perl, and getting used to that can be frustrating. The things I've found most useful have been

Personally, I found Dive Into Python annoying and patronising, but it's freely available online, so you can form your own judgment on that.

Philip Durbin ,Feb 18, 2010 at 18:12

If you happen to be a fan of The Perl Cookbook , you might be interested in checking out PLEAC, the Programming Language Examples Alike Cookbook , specifically the section that shows the Perl Cookbook code translated into Python .

larley ,Feb 18, 2010 at 6:16

Being a hardcore Perl programmer, all I can say is DO NOT BUY O'Reilly's "Learning Python". It is nowhere NEAR as good as "Learning Perl", and there's no equivalent I know of to Larry Wall's "Programming Perl", which is simply unbeatable.

I've had the most success taking past Perl programs and translating them into Python, trying to make use of as many new techniques as possible.

Mike Graham ,Feb 17, 2010 at 18:02

Check out the official tutorial , which is actually pretty good. If you are interested in web development you should be ready at that point to jump right in to the documentation of the web framework you will be working with; Python has many to choose from, with zope, cherrypy, pylons, and werkzeug all having good reputations.

I would not try to search for things specifically meant to help you transition from Perl; they are unlikely to be of as high quality as references that are useful to a wider audience.

ghostdog74 ,Feb 18, 2010 at 1:17

This is the site you should really go to. There's a section called Getting Started which you should take a look at. There are also recommendations on books. On top of that, you might also be interested in this piece on "idioms".

sateesh ,Feb 17, 2010 at 18:08

If what you are looking at is succinct, concise reference to python then the book Python Essential Reference might be helpful.

Robert P ,May 31, 2013 at 22:39

I wouldn't try to compare Perl and Python too much in order to learn Python, especially since you have working knowledge of other languages. If you are unfamiliar with OOP/functional programming aspects and just looking to work procedurally as in Perl, start learning the Python language constructs/syntax and then do a couple of examples. If you are making a switch to OO or functional style paradigms, I would read up on OO fundamentals first, then start on Python syntax and examples, so you have a sort of mental blueprint of how things can be constructed before you start working with the actual materials. This is just my humble opinion, however.

[Oct 21, 2019] Differences between Perl and PHP [closed]

Notable quotes:
"... Perl has native regular expression support, ..."
"... Perl has quite a few more operators , including matching ..."
"... In PHP, new is an operator. In Perl, it's the conventional name of an object creation subroutine defined in packages, nothing special as far as the language is concerned. ..."
"... Perl logical operators return their arguments, while they return booleans in PHP. ..."
"... Perl gives access to the symbol table ..."
"... Note that "references" has a different meaning in PHP and Perl. In PHP, references are symbol table aliases. In Perl, references are smart pointers. ..."
"... Perl has different types for integer-indexed collections (arrays) and string indexed collections (hashes). In PHP, they're the same type: an associative array/ordered map ..."
"... Perl arrays aren't sparse ..."
"... Perl supports hash and array slices natively, ..."
Nov 23, 2013 | stackoverflow.com

jholster ,Nov 23, 2013 at 21:20

I'm planning to learn Perl 5 and as I have only used PHP until now, I wanted to know a bit about how the languages differ from each other.

As PHP started out as a set of "Perl hacks", it has obviously cloned some of Perl's features.

hobbs ,Jan 17, 2013 at 8:36

Perl and PHP are more different than alike. Let's consider Perl 5, since Perl 6 is still under development. Some differences, grouped roughly by subject:

PHP was inspired by Perl the same way Phantom of the Paradise was inspired by Phantom of the Opera , or Strange Brew was inspired by Hamlet . It's best to put the behavior specifics of PHP out of your mind when learning Perl, else you'll get tripped up.

My brain hurts now, so I'm going to stop.

Your Common Sense ,Mar 29, 2010 at 2:19

When PHP came onto the scene, everyone was impressed by the main differences from Perl:
  1. Input variables already in the global scope, no boring parsing.
  2. HTML embedding. Just <?php ... ?> anywhere. No boring templates.
  3. On-screen error messages. No boring error log peeks.
  4. Easy to learn. No boring book reading.

As the time passed, everyone learned that they were not a benefit, hehe...

Quentin ,Jan 15, 2016 at 3:27

I've noticed that most PHP vs. Perl pages seem to be of the

PHP is better than Perl because <insert lame reason here>

ilk, and rarely make reasonable comparisons.

Syntax-wise, you will find PHP is often easier to understand than Perl, particularly when you have little experience. For example, trimming a string of leading and trailing whitespace in PHP is simply

$string = trim($string);

In Perl it is the somewhat more cryptic

$string =~ s/^\s+//;
$string =~ s/\s+$//;

(I believe this is slightly more efficient than a single line capture and replace, and also a little more understandable.) However, even though PHP is often more English-like, it sometimes still shows its roots as a wrapper for low level C, for example, strpbrk and strspn are probably rarely used, because most PHP dabblers write their own equivalent functions for anything too esoteric, rather than spending time exploring the manual. I also wonder about programmers for whom English is a second language, as everybody is on equal footing with things such as Perl, having to learn it from scratch.

I have already mentioned the manual. PHP has a fine online manual, and unfortunately it needs it. I still refer to it from time to time for things that should be simple, such as order of parameters or function naming convention. With Perl, you will probably find you are referring to the manual a lot as you get started and then one day you will have an a-ha moment and never need it again. Well, at least not until you're more advanced and realize that not only is there more than one way, there is probably a better way, somebody else has probably already done it that better way, and perhaps you should just visit CPAN.

Perl does have a lot more options and ways to express things. This is not necessarily a good thing, although it allows code to be more readable if used wisely and at least one of the ways you are likely to be familiar with. There are certain styles and idioms that you will find yourself falling into, and I can heartily recommend reading Perl Best Practices (sooner rather than later), along with Perl Cookbook, Second Edition to get up to speed on solving common problems.

I believe the reason Perl is used less often in shared hosting environments is that historically the perceived slowness of CGI and hosts' unwillingness to install mod_perl due to security and configuration issues has made PHP a more attractive option. The cycle then continued, more people learned to use PHP because more hosts offered it, and more hosts offered it because that's what people wanted to use. The speed differences and security issues are rendered moot by FastCGI these days, and in most cases PHP is run out of FastCGI as well, rather than leaving it in the core of the web server.

Whether or not this is the case or there are other reasons, PHP became popular and a myriad of applications have been written in it. For the majority of people who just want an entry-level website with a simple blog or photo gallery, PHP is all they need so that's what the hosts promote. There should be nothing stopping you from using Perl (or anything else you choose) if you want.

At an enterprise level, I doubt you would find too much PHP in production (and please, no-one point at Facebook as a counter-example, I said enterprise level).

Leon Timmermans ,Mar 28, 2010 at 22:15

Perl is used plenty for websites, no less than Python and Ruby for example. That said, PHP is used way more often than any of those. I think the most important factors in that are PHP's ease of deployment and the ease to start with it.

The differences in syntax are too many to sum up here, but generally it is true that Perl has more ways to express yourself (this is known as TIMTOWTDI, There Is More Than One Way To Do It).

Brad Gilbert ,Mar 29, 2010 at 4:04

My favorite thing about Perl is the way it handles arrays/lists. Here's an example of how you would make and use a Perl function (or "subroutine"), which makes use of this for arguments:
sub multiply
{
    my ($arg1, $arg2) = @_; # @_ is the array of arguments
    return $arg1 * $arg2;
}

In PHP you could do a similar thing with list() , but it's not quite the same; in Perl lists and arrays are actually treated the same (usually). You can also do things like:

$week_day_name = ("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday")[$week_day_index];

And another difference that you MUST know about is numeric vs. string comparison operators. In Perl, if you use < , > , == , != , <=> , and so on, Perl converts both operands to numbers. If you want them compared as strings instead, you have to use lt , gt , eq , ne , cmp (the respective equivalents of the operators listed previously). Examples where this will really get you:

if ("a" == "b") { ... } # This is true.
if ("a" == 0) { ... } # This is also true, for the same reason.

Sorin Postelnicu, Aug 5, 2015 at 15:44

I don't need to add anything to outis's fantastic answer; I only want to show the answer to your question:

Why is Perl not used for dynamic websites very often anymore? What made PHP gain more popularity than it?

First check some "Job Trends" sites - then you can make the judgment yourself.

As you can see, Perl is still a leader - but preferable for real applications, not for toys. :)

[Oct 20, 2019] Using command line arguments to R CMD BATCH

Oct 20, 2019 | stackoverflow.com



Bryce Thomas ,Jan 5, 2013 at 0:26

I have been using R CMD BATCH my_script.R from a terminal to execute an R script. I am now at the point where I would like to pass an argument to the command, but am having some issues getting it working. If I do R CMD BATCH my_script.R blabla then blabla becomes the output file, rather than being interpreted as an argument available to the R script being executed.

I have tried Rscript my_script.R blabla which seems to pass on blabla correctly as an argument, but then I don't get the my_script.Rout output file that I get with R CMD BATCH (I want the .Rout file). While I could redirect the output of a call to Rscript to a file name of my choosing, I would not be getting the R input commands included in the file in the way R CMD BATCH does in the .Rout file.

So, ideally, I'm after a way to pass arguments to an R script being executed via the R CMD BATCH method, though would be happy with an approach using Rscript if there is a way to make it produce a comparable .Rout file.

Bryce Thomas ,Feb 11, 2015 at 20:59

My impression is that R CMD BATCH is a bit of a relic. In any case, the more recent Rscript executable (available on all platforms), together with commandArgs(), makes processing command line arguments pretty easy.

As an example, here is a little script -- call it "myScript.R" :

## myScript.R
args <- commandArgs(trailingOnly = TRUE)
rnorm(n=as.numeric(args[1]), mean=as.numeric(args[2]))

And here is what invoking it from the command line looks like

> Rscript myScript.R 5 100
[1]  98.46435 100.04626  99.44937  98.52910 100.78853

Edit:

Not that I'd recommend it, but ... using a combination of source() and sink() , you could get Rscript to produce an .Rout file like that produced by R CMD BATCH . One way would be to create a little R script -- call it RscriptEcho.R -- which you call directly with Rscript. It might look like this:

## RscriptEcho.R
args <- commandArgs(TRUE)
srcFile <- args[1]
outFile <- paste0(make.names(date()), ".Rout")
args <- args[-1]

sink(outFile, split = TRUE)
source(srcFile, echo = TRUE)

To execute your actual script, you would then do:

Rscript RscriptEcho.R myScript.R 5 100
[1]  98.46435 100.04626  99.44937  98.52910 100.78853

which will execute myScript.R with the supplied arguments and sink interleaved input, output, and messages to a uniquely named .Rout .

Edit2:
You can run Rscript verbosely and place the verbose output in a file.

Rscript --verbose myScript.R 5 100 > myScript.Rout

d2a2d ,Apr 9 at 22:33

After trying the options described here, I found this post from Forester in r-bloggers . I think it is a clean option to consider.

I put his code here:

From command line

$ R CMD BATCH --no-save --no-restore '--args a=1 b=c(2,5,6)' test.R test.out &

test.R

##First read in the arguments listed at the command line
args=(commandArgs(TRUE))

##args is now a list of character vectors
## First check to see if arguments are passed.
## Then cycle through each element of the list and evaluate the expressions.
if(length(args)==0){
    print("No arguments supplied.")
    ##supply default values
    a = 1
    b = c(1,1,1)
}else{
    for(i in 1:length(args)){
      eval(parse(text=args[[i]]))
    }
}

print(a*2)
print(b*3)

In test.out

> print(a*2)
[1] 2
> print(b*3)
[1]  6 15 18

Thanks to Forester !

user1563570 ,Apr 5, 2013 at 13:15

In your R script, called test.R :
args <- commandArgs(trailingOnly = F)
myargument <- args[length(args)]
myargument <- sub("-","",myargument)
print(myargument)
q(save="no")

From the command line run:

R CMD BATCH -4 test.R

Your output file, test.Rout, will show that the argument 4 has been successfully passed to R:

cat test.Rout

> args <- commandArgs(trailingOnly = F)
> myargument <- args[length(args)]
> myargument <- sub("-","",myargument)
> print(myargument)
[1] "4"
> q(save="no")
> proc.time()
user  system elapsed 
0.222   0.022   0.236

Bryce Thomas ,Oct 23, 2014 at 20:57

You need to put arguments before my_script.R and use - on the arguments, e.g.
R CMD BATCH -blabla my_script.R

commandArgs() will receive -blabla as a character string in this case. See the help for details:

$ R CMD BATCH --help
Usage: R CMD BATCH [options] infile [outfile]

Run R non-interactively with input from infile and place output (stdout
and stderr) to another file.  If not given, the name of the output file
is the one of the input file, with a possible '.R' extension stripped,
and '.Rout' appended.

Options:
  -h, --help        print short help message and exit
  -v, --version     print version info and exit
  --no-timing           do not report the timings
  --            end processing of options

Further arguments starting with a '-' are considered as options as long
as '--' was not encountered, and are passed on to the R process, which
by default is started with '--restore --save --no-readline'.
See also help('BATCH') inside R.

ClementWalter ,Mar 16, 2016 at 9:52

I add an answer because I think a one-line solution is always good! At the top of your myRscript.R file, add the following line:
eval(parse(text=paste(commandArgs(trailingOnly = TRUE), collapse=";")))

Then submit your script with something like:

R CMD BATCH [options] '--args arguments you want to supply' myRscript.R &

For example:

R CMD BATCH --vanilla '--args N=1 l=list(a=2, b="test") name="aname"' myscript.R &

Then:

> ls()
[1] "N"    "l"    "name"

Dagremu ,Dec 15, 2016 at 0:50

Here's another way to process command line args, using R CMD BATCH . My approach, which builds on an earlier answer here , lets you specify arguments at the command line and, in your R script, give some or all of them default values.

Here's an R file, which I name test.R :

defaults <- list(a=1, b=c(1,1,1)) ## default values of any arguments we might pass

## parse each command arg, loading it into global environment
for (arg in commandArgs(TRUE))
  eval(parse(text=arg))

## if any variable named in defaults doesn't exist, then create it
## with value from defaults
for (nm in names(defaults))
  assign(nm, mget(nm, ifnotfound=list(defaults[[nm]]))[[1]])

print(a)
print(b)

At the command line, if I type

R CMD BATCH --no-save --no-restore '--args a=2 b=c(2,5,6)' test.R

then within R we'll have a = 2 and b = c(2,5,6) . But I could, say, omit b , and add in another argument c :

R CMD BATCH --no-save --no-restore '--args a=2 c="hello"' test.R

Then in R we'll have a = 2 , b = c(1,1,1) (the default), and c = "hello" .

Finally, for convenience we can wrap the R code in a function, as long as we're careful about the environment:

## defaults should be either NULL or a named list
parseCommandArgs <- function(defaults=NULL, envir=globalenv()) {
  for (arg in commandArgs(TRUE))
    eval(parse(text=arg), envir=envir)

  for (nm in names(defaults))
    assign(nm, mget(nm, ifnotfound=list(defaults[[nm]]), envir=envir)[[1]], pos=envir)
}

## example usage:
parseCommandArgs(list(a=1, b=c(1,1,1)))

[Oct 20, 2019] How can I use multiple library paths?

Oct 20, 2019 | stackoverflow.com


user797963 ,Nov 28, 2016 at 16:11

I'm trying to set up an easy to use R development environment for multiple users. R is installed along with a set of other dev tools on an NFS mount.

I want to create a core set of R packages that also live on NFS so n users don't need to install their own copies of the same packages n times. Then, I was hoping users can install one off packages to a local R library. Has anyone worked with an R setup like this before? From the doc, it looks doable by adding both the core package and personal package file paths to .libPaths() .


You want to use the .Renviron file (see ?Startup ).

There are three places to put the file:

  1. R_HOME (the site-wide file, etc/Renviron.site, in the directory in which R is installed)
  2. HOME (the user's home directory)
  3. The project's (current working) directory

In this file you can specify the R_LIBS and R_LIBS_SITE environment variables.

For your particular problem, you probably want to add the NFS drive location to R_LIBS_SITE in the R_HOME/etc/Renviron.site file.


## To get R_HOME
Sys.getenv("R_HOME")

[Oct 20, 2019] How to update a package in R?

Oct 20, 2019 | stackoverflow.com



Joshua Ulrich ,Jan 31, 2014 at 8:15

I would like to upgrade one R package to the newer version which is already available. I tried
update.packages(c("R2jags"))

but it does nothing! No output on the console, no error, nothing. I used the same syntax as for install.packages but perhaps I'm doing something wrong. I have been looking at ?update.packages but I have not been able to figure out how it works, where to specify the package(s), etc. There is no example. I also tried to update the package using install.packages to "install" it again, but that says "Warning: package 'R2jags' is in use and will not be installed".

TMS ,Jan 30, 2014 at 16:47

You can't do this, I'm afraid - well, not with update.packages(). You need to call install.packages("R2jags") instead.

You can't install R2jags in the current session because you have already loaded the current version into the session. If you need to, save any objects you can't easily recreate, and quit out of R. Then start a new R session, immediately run install.packages("R2jags") , then once finished, load the package and reload in any previously saved objects. You could try to unload the package with:

detach(package:R2jags, unload = TRUE)

but it is quite complex to do this cleanly unless the package cleans up after itself.

update.packages() exists to update all outdated packages in a stated library location. That library location is given by the first argument (if not supplied, it works on all known library locations for the current R session). Hence you were asking it to update the packages in the library location R2jags , which is most unlikely to exist on your R installation.

amzu ,Jan 30, 2014 at 16:36

Additionally, you can install RStudio and update all packages by going to the Tools menu and selecting Check for Package Updates .

DJ6968 ,Dec 13, 2018 at 11:42

# The following two commands remove any previously installed H2O packages for R.
if ("package:h2o" %in% search()) { detach("package:h2o", unload=TRUE) }
if ("h2o" %in% rownames(installed.packages())) { remove.packages("h2o") }

# Next, we download packages that H2O depends on.
pkgs <- c("RCurl","jsonlite")
for (pkg in pkgs) {
if (! (pkg %in% rownames(installed.packages()))) { install.packages(pkg) }
}

# Now we download, install and initialize the H2O package for R.
install.packages("h2o", type="source", repos="http://h2o-release.s3.amazonaws.com/h2o/rel-xia/2/R")

# Finally, let's load H2O and start up an H2O cluster
library(h2o)
h2o.init()

[Oct 16, 2019] R libraries installation - Stack Overflow

Oct 31, 2016 | stackoverflow.com


horseshoe ,Oct 31, 2016 at 14:47

I am using a computer where I have read-only rights for the R library folder. When I am installing new packages I therefore use
libpath <- "c:/R/mylibraries"
.libPaths( c( .libPaths(), libpath) )
install.packages("htmltools",   lib=libpath)

Whenever I install a new package with dependencies (e.g. htmltools depends on lme4), I get errors like:

Error in .requirePackage(package) : 
  unable to find required package 'lme4'

although lme4 is installed and I have used it before. Other errors/warnings also appear, like:

Warning in install.packages :
  cannot remove prior installation of package 'Rcpp'

or:

Warning in install.packages :
  unable to move temporary installation 'c:\...\file17b033a54a21\jsonlite' to 'c:\...\jsonlite'

If I install them twice they usually work, but sometimes dependencies on packages that worked before are lost and I have to reinstall them again. Is there a way to circumvent this?


Put this in a file named .Renviron in your Documents folder and restart R:
R_LIBS=c:/R/mylibraries

From then on, you should be able to install packages into that location automatically, without having to fiddle around with .libPaths .


[Oct 16, 2019] R setting library path via R_LIBS

Oct 16, 2019 | stackoverflow.com


Xiongbing Jin ,Jun 12, 2016 at 12:42

I have read the R FAQs and other posts, but I am a bit confused and would be grateful to know whether I did everything correctly.

In Windows, in order to modify the default library folder I created a file Renviron.site and put it inside E:/Programs/R-3.3.0/etc. The file has only one line saying

R_LIBS=E:/Rlibrary

When I open R and run .libPaths() I see E:/Rlibrary as [1] and the default R library E:/Programs/R-3.3.0/library as [2].

This should mean that from now on all packages I will install will go in E:/Rlibrary but at the same time I will be able to load and use both packages in this folder and those in the default location. Am I correct?


When you load a package via library , it will go through each directory in .libPaths() in turn to find the required package. If the package hasn't been found, you will get an error. This means you can have multiple versions of a package (in different directories), but the package that will be used is determined by the order of .libPaths() .

Regarding how .libPaths() is constructed, from ?.R_LIBS

The library search path is initialized at startup from the environment variable 'R_LIBS' (which should be a colon-separated list of directories at which R library trees are rooted) followed by those in environment variable 'R_LIBS_USER'. Only directories which exist at the time will be included.

[Oct 16, 2019] The .Rprofile file

Oct 16, 2019 | csgillespie.github.io

3.3 R startup

Every time R starts, a number of files are read, in a particular order. The contents of these files determine how R performs for the duration of the session. Note that these files should only be changed with caution, as they may make your R version behave differently to other R installations. This could reduce the reproducibility of your code.

Files in three folders are important in this process:

  1. R_HOME, the directory in which R is installed
  2. HOME, the user's home directory
  3. R's current working directory

It is important to know the location of the .Rprofile and .Renviron set-up files that are being used out of these three options. R only uses one .Rprofile and one .Renviron in any session: if you have a .Rprofile file in your current project, R will ignore .Rprofile in R_HOME and HOME . Likewise, .Rprofile in HOME overrides .Rprofile in R_HOME . The same applies to .Renviron : you should remember that adding project specific environment variables with .Renviron will de-activate other .Renviron files.

To create a project-specific start-up script, simply create a .Rprofile file in the project's root directory and start adding R code, e.g. via file.edit(".Rprofile") . Remember that this will make .Rprofile in the home directory be ignored. The following commands will open your .Rprofile from within an R editor:

file.edit(file.path("~", ".Rprofile")) # edit .Rprofile in HOME
file.edit(".Rprofile") # edit project specific .Rprofile

Note that editing the .Renviron file in the same locations will have the same effect. The following code will create a user specific .Renviron file (where API keys and other cross-project environment variables can be stored), without overwriting any existing file.

user_renviron = path.expand(file.path("~", ".Renviron"))
if(!file.exists(user_renviron)) # check to see if the file already exists
  file.create(user_renviron)
file.edit(user_renviron) # open with another text editor if this fails

The location, contents and uses of each are outlined in more detail below.

3.3.1 The .Rprofile file

By default R looks for and runs .Rprofile files in the three locations described above, in a specific order. .Rprofile files are simply R scripts that run each time R runs and they can be found within R_HOME , HOME and the project's home directory, found with getwd() . To check if you have a site-wide .Rprofile , which will run for all users on start-up, run:

site_path = R.home(component = "home")
fname = file.path(site_path, "etc", "Rprofile.site")
file.exists(fname)

The above code checks for the presence of Rprofile.site in that directory. As outlined above, the .Rprofile located in your home directory is user-specific. Again, we can test whether this file exists using

file.exists("~/.Rprofile")

We can use R to create and edit .Rprofile (warning: do not overwrite your previous .Rprofile - we suggest you try project-specific .Rprofile first):

if(!file.exists("~/.Rprofile")) # only create if not already there
  file.create("~/.Rprofile")    # (don't overwrite it)
file.edit("~/.Rprofile")

3.3.2 Example .Rprofile settings

An .Rprofile file is just an R script that is run at start-up. The examples at the bottom of the .Rprofile help file

help("Rprofile")

give clues as to the types of things we could place in our profile.

3.3.2.1 Setting options

The function options allows you to query and set a number of global options. See help("options") or simply type options() to get an idea of what we can configure. In our .Rprofile file, we have the line

options(prompt="R> ", digits=4, show.signif.stars=FALSE)

This changes three features:

  1. The prompt changes from > to R>
  2. The default number of significant digits printed drops from 7 to 4
  3. Significance stars are no longer printed on model summaries

Typically we want to avoid adding options to the start-up file that make our code non-portable. For example, adding

options(stringsAsFactors=FALSE)

to your start-up script has knock-on effects for read.table and related functions including read.csv, making them convert text strings into characters rather than into factors as is the default. This may be useful for you, but it is dangerous as it may make your code less portable.

3.3.2.2 Setting the CRAN mirror

To avoid setting the CRAN mirror each time you run install.packages you can permanently set the mirror in your .Rprofile .

## local creates a new, empty environment
## This avoids polluting the global environment with
## the object r
local({
  r = getOption("repos")             
  r["CRAN"] = "https://cran.rstudio.com/"
  options(repos = r)
})

The RStudio mirror is a virtual machine run by Amazon's EC2 service, and it syncs with the main CRAN mirror in Austria once per day. Since RStudio is using Amazon's CloudFront, the repository is automatically distributed around the world, so no matter where you are in the world, the data doesn't need to travel very far, and is therefore fast to download.

3.3.2.3 The fortunes package

This section illustrates what .Rprofile does with reference to a package that was developed for fun. The code below could easily be altered to automatically connect to a database, or ensure that the latest packages have been downloaded.

The fortunes package contains a number of memorable quotes that the community has collected over many years, called R fortunes. Each fortune has a number. To get fortune number 50, for example, enter

fortunes::fortune(50)

It is easy to make R print out one of these nuggets of truth each time you start a session, by adding the following to ~/.Rprofile :

if(interactive()) 
  try(fortunes::fortune(), silent=TRUE)

The interactive function tests whether R is being used interactively in a terminal. The fortune function is called within try. If the fortunes package is not available, we avoid raising an error and move on. By using :: we avoid adding the fortunes package to our list of attached packages.

The function .Last , if it exists in the .Rprofile , is always run at the end of the session. We can use it to install the fortunes package if needed. To load the package, we use require , since if the package isn't installed, the require function returns FALSE and raises a warning.

.Last = function() {
  cond = suppressWarnings(!require(fortunes, quietly=TRUE))
  if(cond) 
    try(install.packages("fortunes"), silent=TRUE)
  message("Goodbye at ", date(), "\n")
}

3.3.2.4 Useful functions

You can also load useful functions in .Rprofile . For example, we could load the following two functions for examining data frames:

## ht == headtail
ht = function(d, n=6) rbind(head(d, n), tail(d, n))
  
## Show the first 5 rows & first 5 columns of a data frame
hh = function(d) d[1:5, 1:5]

and a function for setting a nice plotting window:

setnicepar = function(mar = c(3, 3, 2, 1), mgp = c(2, 0.4, 0), 
                      tck = -.01, cex.axis = 0.9, 
                      las = 1, mfrow = c(1, 1), ...) {
    par(mar = mar, mgp = mgp, tck = tck,
        cex.axis = cex.axis, las = las, 
        mfrow = mfrow, ...)
}

Note that these functions are for personal use and are unlikely to interfere with code from other people. For this reason, even if you use a certain package every day, we don't recommend loading it in your .Rprofile. Also beware the dangers of loading many functions by default: it may make your code less portable. Another downside of putting functions in your .Rprofile is that it can clutter up your workspace: when you run the ls() command, your .Rprofile functions will appear. Also, if you run rm(list=ls()), your functions will be deleted.

One neat trick to overcome this issue is to use hidden objects and environments. When an object name starts with . , by default it doesn't appear in the output of the ls() function

.obj = 1
".obj" %in% ls()
## [1] FALSE

This concept also works with environments. In the .Rprofile file we can create a hidden environment

.env = new.env()

and then add functions to this environment

.env$ht = function(d, n = 6) rbind(head(d, n), tail(d, n))

At the end of the .Rprofile file, we use attach , which makes it possible to refer to objects in the environment by their names alone.

attach(.env)

3.3.3 The .Renviron file

The .Renviron file is used to store system variables. It follows a similar start-up routine to the .Rprofile file: R first looks for a global .Renviron file, then for local versions. A typical use of the .Renviron file is to specify the R_LIBS path

## Linux
R_LIBS=~/R/library

## Windows
R_LIBS=C:/R/library

This variable points to a directory where R packages will be installed. When install.packages is called, new packages will be stored in R_LIBS .

Another common use of .Renviron is to store API keys that will be available from one session to another. The following line in .Renviron , for example, sets the ZEIT_KEY environment variable, which is used in the diezeit package:

ZEIT_KEY=PUT_YOUR_KEY_HERE

You will need to sign in and start a new R session for the environment variable (accessed by Sys.getenv ) to be visible. To test whether the example API key has been successfully added as an environment variable, run the following:

Sys.getenv("ZEIT_KEY")

Use of the .Renviron file for storing settings such as library paths and API keys is efficient because it reduces the need to update your settings for every R session. Furthermore, the same .Renviron file will work across different platforms, so keep it stored safely.

3.3.4 Exercises

  1. What are the three locations where the start-up files are stored? Where are these locations on your computer?
  2. For each location, does a .Rprofile or .Renviron file exist?
  3. Create a .Rprofile file in your current working directory that prints the message Happy efficient R programming each time you start R at this location.

[Oct 15, 2019] Perl to Python Function translation [closed]

Feb 01, 2014 | stackoverflow.com



Jim Garrison ,Feb 1, 2014 at 22:24

I am trying to translate a Perl function into a Python function, but I am having trouble figuring out what some of the Perl to Python function equivalents are.

Perl function:

sub reverse_hex {

 my $HEXDATE = shift;
 my @bytearry=();
 my $byte_cnt = 0;
 my $max_byte_cnt = 8;
 my $byte_offset = 0;
 while($byte_cnt < $max_byte_cnt) {
   my $tmp_str = substr($HEXDATE,$byte_offset,2);
    push(@bytearry,$tmp_str);
   $byte_cnt++;
   $byte_offset+=2;
 }
   return join('',reverse(@bytearry));
}

I am not sure what "push", "shift", and "substr" are doing here, or what their Python equivalents would be.

Any help will be much appreciated.

Kenosis ,Feb 1, 2014 at 22:17

The Perl subroutine seems rather complicated for what it does, viz., taking chunks of two chars at a time (the first 16 chars) from the sent string and then reversing their order. Another Perl option is:
sub reverse_hex {
    return join '', reverse unpack 'A2' x 8, $_[0];
}

First, unpack here takes two characters at a time (eight times) and produces a list. That list is reversed and joined to produce the final string.

Here's a Python subroutine to accomplish this:

def reverse_hex(HEXDATE):
    hexVals = [HEXDATE[i:i + 2] for i in xrange(0, 16, 2)]
    reversedHexVals = hexVals[::-1]
    return ''.join(reversedHexVals)

The list comprehension produces eight elements of two characters each. [::-1] reverses the list's elements, and the result is joined and returned.

Hope this helps!
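
For completeness, on Python 3 the same transformation can be written with the built-in bytes type - a sketch, assuming the input begins with at least 16 valid hex characters (note that bytes.hex(), available since Python 3.5, returns lowercase):

def reverse_hex(hexdate):
    # Parse the first 16 hex chars as 8 bytes, reverse the byte
    # order, and render the result back to a hex string.
    return bytes.fromhex(hexdate[:16])[::-1].hex()

print(reverse_hex("0123456789abcdef"))  # prints: efcdab8967452301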

MikeMayer67 ,Feb 2, 2014 at 2:10

I realize that you are asking about the perl to python translation, but if you have any control over the perl, I would like to point out that this function is a lot more complicated than it needs to be.

The entire thing could be replaced with:

sub reverse_hex
{
  my $hexdate = shift;
  my @bytes = $hexdate =~ /../g;  # break $hexdate into array of character pairs
  return join '', reverse(@bytes);
}

Not only is this shorter, it is much easier to get your head around. Of course, if you have no control over the perl, you are stuck with what you were dealt.

[Oct 15, 2019] What's the easiest way to install a missing Perl module

Oct 15, 2019 | stackoverflow.com

ikegami ,Dec 13, 2017 at 21:21

I get this error:

Can't locate Foo.pm in @INC

Is there an easier way to install it than downloading, untarring, making, etc?

brian d foy ,Jun 10, 2014 at 21:52

On Unix :

usually you start cpan in your shell:

# cpan

and type

install Chocolate::Belgian

or in short form:

cpan Chocolate::Belgian

On Windows :

If you're using ActivePerl on Windows, the PPM (Perl Package Manager) has much of the same functionality as CPAN.pm.

Example:

# ppm
ppm> search net-smtp
ppm> install Net-SMTP-Multipart

see How do I install Perl modules? in the CPAN FAQ

Many distributions ship a lot of perl modules as packages.

You should always prefer them, as you benefit from automatic (security) updates and ease of removal. Removal can be pretty tricky with the cpan tool itself.

For Gentoo there's a nice tool called g-cpan which builds/installs the module from CPAN and creates a Gentoo package ( ebuild ) for you.

Chas. Owens ,Jan 30, 2017 at 21:08

Try App::cpanminus :
# cpanm Chocolate::Belgian

It's great for just getting stuff installed. It provides none of the more complex functionality of CPAN or CPANPLUS, so it's easy to use, provided you know which module you want to install. If you haven't already got cpanminus, just type:

# cpan App::cpanminus

to install it.

It is also possible to install it without using cpan at all. The basic bootstrap procedure is,

curl -L http://cpanmin.us | perl - --sudo App::cpanminus

For more information go to the App::cpanminus page and look at the section on installation.

isomorphismes ,Mar 22, 2011 at 16:03

I note some folks suggesting one run cpan under sudo. That used to be necessary to install into the system directory, but modern versions of the CPAN shell allow you to configure it to use sudo just for installing. This is much safer, since it means that tests don't run as root.

If you have an old CPAN shell, simply install the new cpan ("install CPAN") and when you reload the shell, it should prompt you to configure these new directives.

Nowadays, when I'm on a system with an old CPAN, the first thing I do is update the shell and set it up to do this so I can do most of my cpan work as a normal user.

Also, I'd strongly suggest that Windows users investigate Strawberry Perl . This is a version of Perl that comes packaged with a pre-configured CPAN shell as well as a compiler. It also includes some hard-to-compile Perl modules with their external C library dependencies, notably XML::Parser. This means that you can do the same thing as every other Perl user when it comes to installing modules, and things tend to "just work" a lot more often.

Ivan X ,Sep 19 at 19:32

If you're on Ubuntu and you want to install the pre-packaged perl module (for example, geo::ipfree) try this:
    $ apt-cache search perl geo::ipfree
    libgeo-ipfree-perl - A look up country of ip address Perl module

    $ sudo apt-get install libgeo-ipfree-perl

brian d foy ,Sep 15, 2008 at 22:47

A couple of people mentioned the cpan utility, but it's more than just starting a shell. Just give it the modules that you want to install and let it do its work.
$prompt> cpan Foo::Bar

If you don't give it any arguments it starts the CPAN.pm shell. This works on Unix, Mac, and should be just fine on Windows (especially Strawberry Perl).

There are several other things that you can do with the cpan tool as well. Here's a summary of the current features (which might be newer than the one that comes with CPAN.pm and perl):

-a
Creates the CPAN.pm autobundle with CPAN::Shell->autobundle.

-A module [ module ... ]
Shows the primary maintainers for the specified modules

-C module [ module ... ]
Show the Changes files for the specified modules

-D module [ module ... ]
Show the module details. This prints one line for each out-of-date module (meaning,
modules locally installed but have newer versions on CPAN). Each line has three columns:
module name, local version, and CPAN version.

-L author [ author ... ]
List the modules by the specified authors.

-h
Prints a help message.

-O
Show the out-of-date modules.

-r
Recompiles dynamically loaded modules with CPAN::Shell->recompile.

-v
Print the script version and CPAN.pm version.

wytten ,Apr 25 at 19:26

sudo perl -MCPAN -e 'install Foo'

Corion ,Sep 16, 2008 at 6:36

Also see Yes, even you can use CPAN . It shows how you can use CPAN without having root or sudo access.

mikegrb ,Sep 16, 2008 at 19:25

Otto made a good suggestion . This works for Debian too, as well as any other Debian derivative. The missing piece is what to do when apt-cache search doesn't find something.
$ sudo apt-get install dh-make-perl build-essential apt-file
$ sudo apt-file update

Then whenever you have a random module you wish to install:

$ cd ~/some/path
$ dh-make-perl --build --cpan Some::Random::Module
$ sudo dpkg -i libsome-random-module-perl-0.01-1_i386.deb

This will give you a deb package that you can install to get Some::Random::Module. One of the big benefits here is man pages and sample scripts in addition to the module itself will be placed in your distro's location of choice. If the distro ever comes out with an official package for a newer version of Some::Random::Module, it will automatically be installed when you apt-get upgrade.

jm666 ,May 22, 2011 at 18:19

The question already has an accepted answer - but anyway:

IMHO the easiest way of installing CPAN modules (on Unix-like systems; I have no idea about Windows) is:

curl -L http://cpanmin.us | perl - --sudo App::cpanminus

The above installs the "zero configuration CPAN modules installer" called cpanm . (It can take several minutes to install - don't interrupt the process.)

and after - simply:

cpanm Foo
cpanm Module::One
cpanm Another::Module

brian d foy ,Oct 8, 2008 at 7:26

Lots of recommendation for CPAN.pm , which is great, but if you're using Perl 5.10 then you've also got access to CPANPLUS.pm which is like CPAN.pm but better.

And, of course, it's available on CPAN for people still using older versions of Perl. Why not try:

$ cpan CPANPLUS

IgorGanapolsky ,Jan 30, 2017 at 21:09

It often happens that the cpan install command fails with a message like "make test had returned bad status, won't install without force".

In that case following is the way to install the module:

perl -MCPAN -e "CPAN::Shell->force(qw(install Foo::Bar));"

community wiki ,Apr 13, 2015 at 14:50

On Ubuntu most Perl modules are already packaged, so installing is much faster than on most other systems, which have to compile.

To install Foo::Bar at a command prompt, for example, usually you just do:

sudo apt-get install libfoo-bar-perl

Sadly not all modules follow that naming convention.

community wiki ,Apr 13, 2015 at 14:52

This should also work:
cpan -i module_name

community wiki ,Apr 13, 2015 at 16:43

Use the cpan command as cpan Modulename :
$ cpan HTML::Parser

To install dependencies automatically, do the following:

$ perl -MCPAN -e shell
cpan[1]>  o conf prerequisites_policy follow
cpan[2]>  o conf commit
exit

I prefer App::cpanminus ; it installs dependencies automatically. Just do

$ cpanm HTML::Parser

brian d foy ,Sep 27, 2008 at 18:58

Two ways that I know of:

USING PPM :

With Windows (ActivePerl) I've used ppm

from the command line type ppm. At the ppm prompt ...

ppm> install foo

or

ppm> search foo

to get a list of foo modules available. Type help for all the commands

USING CPAN :

you can also use CPAN like this (on *nix systems):

perl -MCPAN -e 'shell'

gets you a prompt

cpan>

at the prompt ...

cpan> install foo  (again to install the foo module)

type h to get a list of commands for cpan

Bruce Alderman ,Nov 21, 2008 at 19:59

On Fedora you can use
# yum install foo

as long as Fedora has an existing package for the module.

community wiki ,Apr 13, 2015 at 14:51

On Fedora Linux or Enterprise Linux , yum also tracks perl library dependencies. So, if the perl module is available, and some rpm package exports that dependency, it will install the right package for you.
yum install 'perl(Chocolate::Belgian)'

(most likely perl-Chocolate-Belgian package, or even ChocolateFactory package)

Mister X ,Dec 28, 2016 at 11:16

The easiest way for me is this:
PERL_MM_USE_DEFAULT=1 perl -MCPAN -e 'install DateTime::TimeZone'

a) automatic recursive dependency detection/resolving/installing

b) it's a shell one-liner, good for setup scripts

venkrao ,Sep 11, 2013 at 18:06

If you want to put the new module into a custom location that your cpan shell isn't configured to use, then perhaps the following will be handy.
 #wget <URL to the module.tgz>
 ##unpack
 perl Build.PL
./Build destdir=$HOME install_base=$HOME
./Build destdir=$HOME install_base=$HOME install

community wiki ,Apr 13, 2015 at 14:50

Sometimes you can use the yum search foo to search the relative perl module, then use yum install xxx to install.

PW. ,Sep 15, 2008 at 19:26

On Windows with the ActiveState distribution of Perl, use the ppm command.

Kamal Nayan ,Oct 1, 2015 at 9:56

Simply executing cpan Foo::Bar in a shell would serve the purpose.

Ed Dunn ,Nov 4, 2016 at 15:26

Seems like you've already got your answer, but I figured I'd chime in. This is what I do in some scripts on an Ubuntu (or Debian) server:
#!/usr/bin/perl

use warnings;
use strict;

#I've gotten into the habit of setting this on all my scripts, prevents weird path issues if the script is not being run by root
$ENV{'PATH'} = '/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin';

#Fill this with the perl modules required for your project
my @perl = qw(LWP::Simple XML::LibXML MIME::Lite DBI DateTime Config::Tiny Proc::ProcessTable);

chomp(my $curl = `which curl`);

if(!$curl){ system('apt-get install curl -y > /dev/null'); }

my $cpanm = system('/bin/bash', '-c', 'which cpanm &>/dev/null');  # exit status is 0 if cpanm is found

#installs cpanm if missing
if($cpanm){ system('curl -s -L http://cpanmin.us | perl - --sudo App::cpanminus'); }

#loops through required modules and installs them if missing
foreach my $x (@perl){
    eval "use $x";
    if($@){
        system("cpanm $x");
        eval "use $x";
    }
}

This works well for me, maybe there is something here you can use.

[Oct 13, 2019] What is the system function in python

Oct 13, 2019 | stackoverflow.com



Eva Feldman ,Jul 6, 2010 at 15:55

I want to play with the system command in Python. For example, in Perl we have system("ls -la"); which runs ls -la. What is the equivalent function in Python? Thanks in advance.

Felix Kling ,Jul 6, 2010 at 15:58

It is os.system :
import os
os.system('ls -la')

But this won't return the output to you (it goes straight to the terminal). If you want the output as a string, subprocess.check_output is probably more what you want:

>>> import subprocess
>>> subprocess.check_output(["ls", "-l", "/dev/null"])
'crw-rw-rw- 1 root root 1, 3 Oct 18  2007 /dev/null\n'

KLee1 ,Jul 6, 2010 at 16:00

import os
os.system("ls -la")  # the command string goes inside the quotes


In the os module there is os.system() .

But if you want to do more advanced things with subprocesses the subprocess module provides a higher level interface with more possibilities that is usually preferable.

[Oct 13, 2019] python - Find size and free space of the filesystem containing a given file - Stack Overflow

Oct 13, 2019 | stackoverflow.com



Piskvor ,Aug 21, 2013 at 7:19

I'm using Python 2.6 on Linux. What is the fastest way:

1. to find out which device and mount point hold a given file, and
2. to get usage statistics (total/used/free space) of that filesystem?

Sven Marnach ,May 5, 2016 at 11:11

If you just need the free space on a device, see the answer using os.statvfs() below.

If you also need the device name and mount point associated with the file, you should call an external program to get this information. df will provide all the information you need -- when called as df filename it prints a line about the partition that contains the file.

To give an example:

import subprocess
df = subprocess.Popen(["df", "filename"], stdout=subprocess.PIPE)
output = df.communicate()[0]
device, size, used, available, percent, mountpoint = \
    output.split("\n")[1].split()

Note that this is rather brittle, since it depends on the exact format of the df output, but I'm not aware of a more robust solution. (There are a few solutions relying on the /proc filesystem below that are even less portable than this one.)

Halfgaar ,Feb 9, 2017 at 10:41

This doesn't give the name of the partition, but you can get the filesystem statistics directly using the statvfs Unix system call. To call it from Python, use os.statvfs('/home/foo/bar/baz') .

The relevant fields in the result, according to POSIX :

unsigned long f_frsize   Fundamental file system block size.
fsblkcnt_t    f_blocks   Total number of blocks on file system in units of f_frsize.
fsblkcnt_t    f_bfree    Total number of free blocks.
fsblkcnt_t    f_bavail   Number of free blocks available to non-privileged processes.

So to make sense of the values, multiply by f_frsize :

import os
statvfs = os.statvfs('/home/foo/bar/baz')

statvfs.f_frsize * statvfs.f_blocks     # Size of filesystem in bytes
statvfs.f_frsize * statvfs.f_bfree      # Actual number of free bytes
statvfs.f_frsize * statvfs.f_bavail     # Number of free bytes that ordinary users
                                        # are allowed to use (excl. reserved space)

Halfgaar ,Feb 9, 2017 at 10:44

import os

def get_mount_point(pathname):
    "Get the mount point of the filesystem containing pathname"
    pathname= os.path.normcase(os.path.realpath(pathname))
    parent_device= path_device= os.stat(pathname).st_dev
    while parent_device == path_device:
        mount_point= pathname
        pathname= os.path.dirname(pathname)
        if pathname == mount_point: break
        parent_device= os.stat(pathname).st_dev
    return mount_point

def get_mounted_device(pathname):
    "Get the device mounted at pathname"
    # uses "/proc/mounts"
    pathname= os.path.normcase(pathname) # might be unnecessary here
    try:
        with open("/proc/mounts", "r") as ifp:
            for line in ifp:
                fields= line.rstrip('\n').split()
                # note that line above assumes that
                # no mount points contain whitespace
                if fields[1] == pathname:
                    return fields[0]
    except EnvironmentError:
        pass
    return None # explicit

def get_fs_freespace(pathname):
    "Get the free space of the filesystem containing pathname"
    stat= os.statvfs(pathname)
    # use f_bfree for superuser, or f_bavail if filesystem
    # has reserved space for superuser
    return stat.f_bfree*stat.f_bsize

Some sample pathnames on my computer:

path 'trash':
  mp /home /dev/sda4
  free 6413754368
path 'smov':
  mp /mnt/S /dev/sde
  free 86761562112
path '/usr/local/lib':
  mp / rootfs
  free 2184364032
path '/proc/self/cmdline':
  mp /proc proc
  free 0
PS

If on Python ≥ 3.3, there's shutil.disk_usage(path), which returns a named tuple of (total, used, free) expressed in bytes.

Xiong Chiamiov ,Sep 30, 2016 at 20:39

As of Python 3.3, there's an easy and direct way to do this with the standard library:
$ cat free_space.py 
#!/usr/bin/env python3

import shutil

total, used, free = shutil.disk_usage(__file__)
print(total, used, free)

$ ./free_space.py 
1007870246912 460794834944 495854989312

These numbers are in bytes. See the documentation for more info.

Giampaolo Rodolà ,Aug 16, 2017 at 9:08

This should do everything you asked:
import os
from collections import namedtuple

disk_ntuple = namedtuple('partition',  'device mountpoint fstype')
usage_ntuple = namedtuple('usage',  'total used free percent')

def disk_partitions(all=False):
    """Return all mountd partitions as a nameduple.
    If all == False return phyisical partitions only.
    """
    phydevs = []
    f = open("/proc/filesystems", "r")
    for line in f:
        if not line.startswith("nodev"):
            phydevs.append(line.strip())

    retlist = []
    f = open('/etc/mtab', "r")
    for line in f:
        if not all and line.startswith('none'):
            continue
        fields = line.split()
        device = fields[0]
        mountpoint = fields[1]
        fstype = fields[2]
        if not all and fstype not in phydevs:
            continue
        if device == 'none':
            device = ''
        ntuple = disk_ntuple(device, mountpoint, fstype)
        retlist.append(ntuple)
    return retlist

def disk_usage(path):
    """Return disk usage associated with path."""
    st = os.statvfs(path)
    free = (st.f_bavail * st.f_frsize)
    total = (st.f_blocks * st.f_frsize)
    used = (st.f_blocks - st.f_bfree) * st.f_frsize
    try:
        percent = (float(used) / total) * 100
    except ZeroDivisionError:
        percent = 0
    # NB: the percentage is about 5% lower than what df shows, due to
    # reserved blocks that we are currently not considering:
    # http://goo.gl/sWGbH
    return usage_ntuple(total, used, free, round(percent, 1))


if __name__ == '__main__':
    for part in disk_partitions():
        print part
        print "    %s\n" % str(disk_usage(part.mountpoint))

On my box the code above prints:

giampaolo@ubuntu:~/dev$ python foo.py 
partition(device='/dev/sda3', mountpoint='/', fstype='ext4')
    usage(total=21378641920, used=4886749184, free=15405903872, percent=22.9)

partition(device='/dev/sda7', mountpoint='/home', fstype='ext4')
    usage(total=30227386368, used=12137168896, free=16554737664, percent=40.2)

partition(device='/dev/sdb1', mountpoint='/media/1CA0-065B', fstype='vfat')
    usage(total=7952400384, used=32768, free=7952367616, percent=0.0)

partition(device='/dev/sr0', mountpoint='/media/WB2PFRE_IT', fstype='iso9660')
    usage(total=695730176, used=695730176, free=0, percent=100.0)

partition(device='/dev/sda6', mountpoint='/media/Dati', fstype='fuseblk')
    usage(total=914217758720, used=614345637888, free=299872120832, percent=67.2)

AK47 ,Jul 7, 2016 at 10:37

The simplest way to find it out:
import os
from collections import namedtuple

DiskUsage = namedtuple('DiskUsage', 'total used free')

def disk_usage(path):
    """Return disk usage statistics about the given path.

    Will return the namedtuple with attributes: 'total', 'used' and 'free',
    which are the amount of total, used and free space, in bytes.
    """
    st = os.statvfs(path)
    free = st.f_bavail * st.f_frsize
    total = st.f_blocks * st.f_frsize
    used = (st.f_blocks - st.f_bfree) * st.f_frsize
    return DiskUsage(total, used, free)

tzot ,Aug 8, 2011 at 10:11

For the first point, you can try using os.path.realpath to get a canonical path, check it against /etc/mtab (I'd actually suggest calling getmntent, but I can't find a normal way to access it) to find the longest match. (To be sure, you should probably stat both the file and the presumed mountpoint to verify that they are in fact on the same device.)

For the second point, use os.statvfs to get block size and usage information.

(Disclaimer: I have tested none of this, most of what I know came from the coreutils sources)

andrew ,Dec 15, 2017 at 0:55

For the second part of your question, "get usage statistics of the given partition", psutil makes this easy with the disk_usage(path) function. Given a path, disk_usage() returns a named tuple including total, used, and free space expressed in bytes, plus the percentage usage.

Simple example from documentation:

>>> import psutil
>>> psutil.disk_usage('/')
sdiskusage(total=21378641920, used=4809781248, free=15482871808, percent=22.5)

Psutil works with Python versions from 2.6 to 3.6 and on Linux, Windows, and OSX among other platforms.

Donald Duck ,Jan 12, 2018 at 18:28

import os

def disk_stat(path):
    disk = os.statvfs(path)
    percent = (disk.f_blocks - disk.f_bfree) * 100 / (disk.f_blocks - disk.f_bfree + disk.f_bavail) + 1
    return percent


print disk_stat('/')
print disk_stat('/data')


Usually the /proc directory contains such information in Linux; it is a virtual filesystem. For example, /proc/mounts gives information about currently mounted disks, and you can parse it directly. Utilities like top and df all make use of /proc.

I haven't used it, but this might help too, if you want a wrapper: http://bitbucket.org/chrismiles/psi/wiki/Home

[Oct 09, 2019] scope - What is the difference between my and our in Perl - Stack Overflow

Oct 09, 2019 | stackoverflow.com



Nathan Fellman ,May 10, 2009 at 10:24

I know what my is in Perl. It defines a variable that exists only in the scope of the block in which it is defined. What does our do? How does our differ from my ?

Nathan Fellman ,Nov 20, 2016 at 1:15

Great question: How does our differ from my and what does our do?

In Summary:

Available since Perl 5, my is a way to declare non-package variables that are:

- private
- new
- non-global
- separate from any package, so that the variable cannot be accessed in the form $package_name::variable

On the other hand, our variables are package variables, and thus automatically:

- global
- definitely not private
- not new
- accessible outside the package (or lexical scope) with the qualified namespace, as $package_name::variable


Declaring a variable with our allows you to predeclare variables in order to use them under use strict without getting typo warnings or compile-time errors. Since Perl 5.6, it has replaced the obsolete use vars , which was only file-scoped, and not lexically scoped as is our .

For example, the formal, qualified name for variable $x inside package main is $main::x . Declaring our $x allows you to use the bare $x variable without penalty (i.e., without a resulting error), in the scope of the declaration, when the script uses use strict or use strict "vars" . The scope might be one, or two, or more packages, or one small block.
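
For instance, a minimal sketch (the package and variable names here are made up for illustration):

use strict;
package MyPkg;
our $x = 42;           # binds the lexical name $x to the package variable $MyPkg::x
print "$x\n";          # 42 -- the bare name works under strict in this scope
print "$MyPkg::x\n";   # 42 -- the same variable, fully qualified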

Georg ,Oct 1, 2016 at 6:41

The PerlMonks and PerlDoc links from cartman and Olafur are a great reference - below is my crack at a summary:

my variables are lexically scoped within a single block defined by {}, or within the same file if not inside {}s. They are not accessible from packages/subroutines defined outside of the same lexical scope/block.

our variables are scoped within a package/file and accessible from any code that use or require that package/file - name conflicts are resolved between packages by prepending the appropriate namespace.

Just to round it out, local variables are "dynamically" scoped, differing from my variables in that they are also accessible from subroutines called within the same block.
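
A minimal sketch of that last difference (sub and variable names are invented; note that local needs an existing package variable):

our $level = 0;                            # package variable

sub log_msg { print "level=$level\n" }     # reads whatever value is current

sub outer {
    local $level = 1;                      # temporarily replaces $main::level
    log_msg();                             # prints "level=1" -- callees see it
}

outer();
log_msg();                                 # prints "level=0" -- value restored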

Nathan Fellman ,Nov 20, 2015 at 18:46

An example:
use strict;

for (1 .. 2){
    # Both variables are lexically scoped to the block.
    our ($o);  # Belongs to 'main' package.
    my  ($m);  # Does not belong to a package.

    # The variables differ with respect to newness.
    $o ++;
    $m ++;
    print __PACKAGE__, " >> o=$o m=$m\n";  # $m is always 1.

    # The package has changed, but we still have direct,
    # unqualified access to both variables, because the
    # lexical scope has not changed.
    package Fubb;
    print __PACKAGE__, " >> o=$o m=$m\n";
}

# The our() and my() variables differ with respect to privacy.
# We can still access the variable declared with our(), provided
# that we fully qualify its name, but the variable declared
# with my() is unavailable.
print __PACKAGE__, " >> main::o=$main::o\n";  # 2
print __PACKAGE__, " >> main::m=$main::m\n";  # Undefined.

# Attempts to access the variables directly won't compile.
# print __PACKAGE__, " >> o=$o\n";
# print __PACKAGE__, " >> m=$m\n";

# Variables declared with use vars() are like those declared
# with our(): belong to a package; not private; and not new.
# However, their scoping is package-based rather than lexical.
for (1 .. 9){
    use vars qw($uv);
    $uv ++;
}

# Even though we are outside the lexical scope where the
# use vars() variable was declared, we have direct access
# because the package has not changed.
print __PACKAGE__, " >> uv=$uv\n";

# And we can access it from another package.
package Bubb;
print __PACKAGE__, " >> main::uv=$main::uv\n";

daotoad ,May 10, 2009 at 16:37

Coping with Scoping is a good overview of Perl scoping rules. It's old enough that our is not discussed in the body of the text. It is addressed in the Notes section at the end.

The article talks about package variables and dynamic scope and how that differs from lexical variables and lexical scope.

Chas. Owens ,Oct 7, 2013 at 14:02

my is used for local variables, whereas our is used for global variables. More reading over Variable Scoping in Perl: the basics.

ruffin ,Feb 10, 2015 at 19:47

It's an old question, but I've met some pitfalls with lexical declarations in Perl that messed me up, and which are also related to this question, so I'll just add my summary here:

1. definition or declaration?

local $var = 42; 
print "var: $var\n";

The output is var: 42. However, we couldn't tell whether local $var = 42; is a definition or a declaration. But how about this:

use strict;
use warnings;

local $var = 42;
print "var: $var\n";

The second program will throw an error:

Global symbol "$var" requires explicit package name.

$var is not defined, which means local $var; is just a declaration! Before using local to declare a variable, make sure that it is defined as a global variable previously.

But why doesn't this fail?

use strict;
use warnings;

local $a = 42;
print "var: $a\n";

The output is: var: 42 .

That's because $a, as well as $b, is a global variable pre-defined in Perl. Remember the sort function?
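
For example, sort's comparator blocks rely on exactly those pre-defined globals, which is why they need no declaration even under strict (a standard idiom, shown as a minimal sketch):

my @sorted = sort { $a <=> $b } (3, 1, 2);   # $a and $b are set by sort itself
print "@sorted\n";                           # 1 2 3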

2. lexical or global?

I was a C programmer before I started using Perl, so the concept of lexical and global variables seemed straightforward to me: they just correspond to auto and external variables in C. But there are small differences:

In C, an external variable is a variable defined outside any function block. On the other hand, an automatic variable is a variable defined inside a function block. Like this:

int global;

int main(void) {
    int local;
}

While in Perl, things are subtle:

sub main {
    $var = 42;
}

&main;

print "var: $var\n";

The output is var: 42; $var is a global variable even though it's defined in a function block! Actually in Perl, any variable is declared as global by default.

The lesson is to always add use strict; use warnings; at the beginning of a Perl program, which will force the programmer to declare lexical variables explicitly, so that we don't get tripped up by mistakes taken for granted.

Ólafur Waage ,May 10, 2009 at 10:25

The perldoc has a good definition of our.

Unlike my, which both allocates storage for a variable and associates a simple name with that storage for use within the current scope, our associates a simple name with a package variable in the current package, for use within the current scope. In other words, our has the same scoping rules as my, but does not necessarily create a variable.

Cosmicnet ,Nov 22, 2014 at 13:57

This is only somewhat related to the question, but I've just discovered a (to me) obscure bit of perl syntax that you can use with "our" (package) variables that you can't use with "my" (local) variables.
#!/usr/bin/perl

our $foo = "BAR";

print $foo . "\n";
${"foo"} = "BAZ";
print $foo . "\n";

Output:

BAR
BAZ

This won't work if you change 'our' to 'my'.

Okuma.Scott ,Sep 6, 2014 at 20:13

print "package is: " . __PACKAGE__ . "\n";
our $test = 1;
print "trying to print global var from main package: $test\n";

package Changed;

{
        my $test = 10;
        my $test1 = 11;
        print "trying to print local vars from a closed block: $test, $test1\n";
}

&Check_global;

sub Check_global {
        print "trying to print global var from a function: $test\n";
}
print "package is: " . __PACKAGE__ . "\n";
print "trying to print global var outside the func and from \"Changed\" package:     $test\n";
print "trying to print local var outside the block $test1\n";

Will Output this:

package is: main
trying to print global var from main package: 1
trying to print local vars from a closed block: 10, 11
trying to print global var from a function: 1
package is: Changed
trying to print global var outside the func and from "Changed" package: 1
trying to print local var outside the block

If "use strict" is used, you will get this failure when attempting to run the script:

Global symbol "$test1" requires explicit package name at ./check_global.pl line 24.
Execution of ./check_global.pl aborted due to compilation errors.

Nathan Fellman ,Nov 5, 2015 at 14:03

Just try the following program:
#!/usr/local/bin/perl
use feature ':5.10';
#use warnings;
package a;
{
my $b = 100;
our $a = 10;


print "$a \n";
print "$b \n";
}

package b;

#my $b = 200;
#our $a = 20 ;

print "in package b value of  my b $a::b \n";
print "in package b value of our a  $a::a \n";

Nathan Fellman ,May 16, 2013 at 11:07

#!/usr/bin/perl -l

use strict;

# if string below commented out, prints 'lol' , if the string enabled, prints 'eeeeeeeee'
#my $lol = 'eeeeeeeeeee' ;
# no errors or warnings at any case, despite of 'strict'

our $lol = eval {$lol} || 'lol' ;

print $lol;

Evgeniy ,Jan 27, 2016 at 4:57

Let us think what an interpreter actually is: it's a piece of code that stores values in memory and lets the instructions in a program that it interprets access those values by their names, which are specified inside these instructions. So, the big job of an interpreter is to shape the rules of how we should use the names in those instructions to access the values that the interpreter stores.

On encountering "my", the interpreter creates a lexical variable: a named value that the interpreter can access only while it executes a block, and only from within that syntactic block. On encountering "our", the interpreter makes a lexical alias of a package variable: it binds a name, which from then on it processes as a lexical variable's name until the block is finished, to the value of the package variable with the same name.

The effect is that you can then pretend that you're using a lexical variable and bypass the rules of 'use strict' on full qualification of package variables. Since the interpreter automatically creates package variables when they are first used, the side effect of using "our" may also be that the interpreter creates a package variable as well. In this case, two things are created: a package variable, which the interpreter can access from everywhere, provided it's properly designated as requested by 'use strict' (prepended with the name of its package and two colons), and its lexical alias.
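
A small sketch of that aliasing (names invented for illustration):

package Counter;
our $count = 7;              # creates $Counter::count and binds the lexical name $count to it

print "$count\n";            # 7 -- via the lexical alias
print "$Counter::count\n";   # 7 -- the same storage, fully qualified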


[Oct 09, 2019] Perl Import Package in different Namespace

Oct 09, 2019 | stackoverflow.com



choroba ,Sep 28, 2018 at 22:17

Is it possible to import (use) a Perl module within a different namespace?

Let's say I have a module A (an XS module with no methods exported; @EXPORT is empty) and I have no way of changing the module.

This module has a method, A::open.

Currently I can use that module in my main program (package main) by calling A::open. I would like to have that module inside my package main, so that I can directly call open.

I tried to manually push every key of %A:: into %main::, however that did not work as expected.

The only way that I know to achieve what I want is by using package A; inside my main program, effectively changing the package of my program from main to A. I'm not satisfied with this. I would really like to keep my program inside package main.

Is there any way to achieve this and still keep my program in package main?

Offtopic: Yes, I know you usually would not want to import everything into your namespace, but this module is used by us extensively, and we don't want to type A:: (well, the actual module name is way longer, which isn't making the situation better) in front of hundreds or thousands of calls.

Grinnz ,Oct 1, 2018 at 6:26

This is one of those "impossible" situations, where the clear solution -- to rework that module -- is off limits.

But, you can alias that package's subs names, from its symbol table, to the same names in main . Worse than being rude, this comes with a glitch: it catches all names that that package itself imported in any way. However, since this package is a fixed quantity it stands to reason that you can establish that list (and even hard-code it). It is just this one time, right?

main

use warnings;
use strict;
use feature 'say';

use OffLimits;

GET_SUBS: {
    # The list of names to be excluded
    my $re_exclude = qr/^(?:BEGIN|import)$/;  # ...
    my @subs = grep { !/$re_exclude/ } sort keys %OffLimits::;
    no strict 'refs';
    for my $sub_name (@subs) {
        *{ $sub_name } = \&{ 'OffLimits::' . $sub_name };
    }   
};

my $name = name('name() called from ' . __PACKAGE__);
my $id   = id('id() called from ' . __PACKAGE__);

say "name() returned: $name";
say "id()   returned: $id";

with OffLimits.pm

package OffLimits;    
use warnings;
use strict;

sub name { return "In " .  __PACKAGE__ . ": @_" }
sub id   { return "In " .  __PACKAGE__ . ": @_" }

1;

It prints

name() returned: In OffLimits: name() called from  main
id()   returned: In OffLimits: id() called from  main

You may need that code in a BEGIN block, depending on other details.

Another option is of course to hard-code the subs to be "exported" (in @subs ). Given that the module is in practice immutable this option is reasonable and more reliable.


This can also be wrapped in a module, so that you have the normal, selective, importing.

WrapOffLimits.pm

package WrapOffLimits;
use warnings;
use strict;

use OffLimits;

use Exporter qw(import);

our @sub_names;
our @EXPORT_OK   = @sub_names;
our %EXPORT_TAGS = (all => \@sub_names);

BEGIN { 
    # Or supply a hard-coded list of all module's subs in @sub_names
    my $re_exclude = qr/^(?:BEGIN|import)$/;  # ...
    @sub_names = grep { !/$re_exclude/ } sort keys %OffLimits::;

    no strict 'refs';
    for my $sub_name (@sub_names) {
        *{ $sub_name } = \&{ 'OffLimits::' . $sub_name };
    }   
};
1;

and now in the caller you can import either only some subs

use WrapOffLimits qw(name);

or all

use WrapOffLimits qw(:all);

with otherwise the same main as above for a test.

The module name is hard-coded, which should be OK as this is meant only for that module.


The following is added mostly for completeness.

One can pass the module name to the wrapper by writing one's own import sub, which is what gets used then. The import list can be passed as well, at the expense of an awkward interface of the use statement.

It goes along the lines of

package WrapModule;
use warnings;
use strict;

use OffLimits;

use Exporter qw();  # will need our own import 

our ($mod_name, @sub_names);

our @EXPORT_OK   = @sub_names;
our %EXPORT_TAGS = (all => \@sub_names);

sub import {
    my $mod_name = splice @_, 1, 1;  # remove mod name from @_ for goto

    my $re_exclude = qr/^(?:BEGIN|import)$/;  # etc

    no strict 'refs';
    @sub_names = grep { !/$re_exclude/ } sort keys %{ $mod_name . '::'};    
    for my $sub_name (@sub_names) {    
        *{ $sub_name } = \&{ $mod_name . '::' . $sub_name };
    }   

    push @EXPORT_OK, @sub_names;

    goto &Exporter::import;
}
1;

what can be used as

use WrapModule qw(OffLimits name id);  # or (OffLimits :all)

or, with the list broken-up so to remind the user of the unusual interface

use WrapModule 'OffLimits', qw(name id);

When used with the main above this prints the same output.

The use statement ends up using the import sub defined in the module, which exports symbols by writing to the caller's symbol table. (If no import sub is written then the Exporter 's import method is nicely used, which is how this is normally done.)

This way we are able to unpack the arguments and have the module name supplied at use invocation. With the import list supplied as well now we have to push manually to @EXPORT_OK since this can't be in the BEGIN phase. In the end the sub is replaced by Exporter::import via the (good form of) goto , to complete the job.

Simerax ,Sep 30, 2018 at 10:19

You can forcibly "import" a function into main using glob assignment to alias the subroutine (and you want to do it in BEGIN so it happens at compile time, before calls to that subroutine are parsed later in the file):
use strict;
use warnings;
use Other::Module;

BEGIN { *open = \&Other::Module::open }

However, another problem you might have here is that open is a builtin function, which may cause some problems. You can add use subs 'open'; to indicate that you want to override the built-in function in this case, since you aren't using an actual import function to do so.
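
Putting those two pieces together, a minimal sketch (Other::Module is the placeholder name from the snippet above):

use strict;
use warnings;
use subs 'open';     # predeclare open so calls resolve to the sub, not the builtin
use Other::Module;

BEGIN { *open = \&Other::Module::open }

open("some argument");   # now calls Other::Module::open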

Grinnz ,Sep 30, 2018 at 17:33

Here is what I now came up with. Yes, this is hacky, and yes, I also feel like I opened Pandora's box with this. However, at least a small dummy program ran perfectly fine.

I renamed the module in my code again. In my original post I used the example A::open; actually this module does not contain any method/variable reserved by the Perl core, which is why I blindly import everything here.

BEGIN {
    # using the caller to determine the parent. Usually this is main but maybe we want it somewhere else in some cases
    my ($parent_package) = caller;

    package A;

    foreach (keys(%A::)) {
        if (defined $$_) {
            eval '*'.$parent_package.'::'.$_.' = \$A::'.$_;
        }
        elsif (%$_) {
            eval '*'.$parent_package.'::'.$_.' = \%A::'.$_;
        }
        elsif (@$_) {
            eval '*'.$parent_package.'::'.$_.' = \@A::'.$_;
        }
        else {
            eval '*'.$parent_package.'::'.$_.' = \&A::'.$_;
        }
    }
}

[Oct 09, 2019] oop - Perl Importing Variables From Calling Module

Oct 09, 2019 | stackoverflow.com



Russell C. ,Aug 31, 2010 at 20:31

I have a Perl module (Module.pm) that initializes a number of variables, some of which I'd like to import ($VAR2, $VAR3) into additional submodules that it might load during execution.

The way I'm currently setting up Module.pm is as follows:

package Module;

use warnings;
use strict;

use vars qw($SUBMODULES $VAR1 $VAR2 $VAR3);

require Exporter;
our @ISA = qw(Exporter);
our @EXPORT = qw($VAR2 $VAR3);

sub new {
    my ($package) = @_;
    my $self = {};
    bless ($self, $package);
    return $self;
}

sub SubModules1 {
    my $self = shift;
    if($SUBMODULES->{'1'}) { return $SUBMODULES->{'1'}; }

    # Load & cache submodule
    require Module::SubModule1;
    $SUBMODULES->{'1'} = Module::SubModule1->new(@_);    
    return $SUBMODULES->{'1'};
}

sub SubModules2 {
    my $self = shift;
    if($SUBMODULES->{'2'}) { return $SUBMODULES->{'2'}; }

    # Load & cache submodule
    require Module::SubModule2;
    $SUBMODULES->{'2'} = Module::SubModule2->new(@_);    
    return $SUBMODULES->{'2'};
}

Each submodule is structured as follows:

package Module::SubModule1;

use warnings;
use strict;
use Carp;

use vars qw();

sub new {
    my ($package) = @_;
    my $self = {};
    bless ($self, $package);
    return $self;
}

I want to be able to import the $VAR2 and $VAR3 variables into each of the submodules without having to reference them as $Module::VAR2 and $Module::VAR3. I noticed that the calling script is able to access both the variables that I have exported in Module.pm in the desired fashion but SubModule1.pm and SubModule2.pm still have to reference the variables as being from Module.pm.

I tried updating each submodule as follows, which unfortunately didn't work as I was hoping:

package Module::SubModule1;

use warnings;
use strict;
use Carp;

use vars qw($VAR2 $VAR3);

sub new {
    my ($package) = @_;
    my $self = {};
    bless ($self, $package);
    $VAR2 = $Module::VAR2;
    $VAR3 = $Module::VAR3;
    return $self;
}

Please let me know how I can successfully export $VAR2 and $VAR3 from Module.pm into each Submodule. Thanks in advance for your help!

Russell C. ,Aug 31, 2010 at 22:37

In your submodules, are you forgetting to say
use Module;

? Calling use Module from another package (say Module::Submodule9 ) will try to run the Module::import method. Since you don't have that method, it will call the Exporter::import method, and that is where the magic that exports Module 's variables into the Module::Submodule9 namespace will happen.


In your program there is only one Module namespace and only one instance of the (global) variable $Module::VAR2 . Exporting creates aliases to this variable in other namespaces, so the same variable can be accessed in different ways. Try this in a separate script:

package Whatever;
use Module;
use strict;
use vars qw($VAR2);

$Module::VAR2 = 5;
print $Whatever::VAR2;    # should be 5.
$VAR2 = 14;               # same as $Whatever::VAR2 = 14
print $Module::VAR2;      # should be 14

Russell C. ,Aug 31, 2010 at 21:38

Well there is the easy way:

In M.pm:

package M;

use strict;
use warnings;

#our is better than "use vars" for creating package variables
#it creates an alias to $M::foo named $foo in the current lexical scope 
our $foo = 5;

sub inM { print "$foo\n" }

1;

In M/S.pm

package M;

#creates an alias to $M::foo that will last for the entire scope,
#in this case the entire file
our $foo;

package M::S;

use strict;
use warnings;

sub inMS { print "$foo\n" }

1;

In the script:

#!/usr/bin/perl

use strict;
use warnings;

use M;
use M::S;

M::inM();
M::S::inMS();

But I would advise against this. Global variables are not a good practice, and sharing global variables between modules is even worse.

[Oct 09, 2019] gzip - How can I recover files from a corrupted .tar.gz archive - Stack Overflow

Oct 09, 2019 | stackoverflow.com



George ,Jun 24, 2016 at 2:49

Are you sure that it is a gzip file? I would first run 'file SMS.tar.gz' to validate that.

Then I would read the The gzip Recovery Toolkit page.

JohnEye ,Oct 4, 2016 at 11:27

Recovery is possible but it depends on what caused the corruption.

If the file is just truncated, getting some partial result out is not too hard; just run

gunzip < SMS.tar.gz > SMS.tar.partial

which will give some output despite the error at the end.
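
You can then extract whatever intact members remain from the partial archive; tar will complain about the unexpected end of file but still unpacks the complete entries:

tar -xvf SMS.tar.partial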

If the compressed file has large missing blocks, it's basically hopeless after the bad block.

If the compressed file is systematically corrupted in small ways (e.g. transferring the binary file in ASCII mode, which smashes carriage returns and newlines throughout the file), it is possible to recover but requires quite a bit of custom programming, it's really only worth it if you have absolutely no other recourse (no backups) and the data is worth a lot of effort. (I have done it successfully.) I mentioned this scenario in a previous question .

The answers for .zip files differ somewhat, since zip archives have multiple separately-compressed members, so there's more hope (though most commercial tools are rather bogus, they eliminate warnings by patching CRCs, not by recovering good data). But your question was about a .tar.gz file, which is an archive with one big member.

,

Here is one possible scenario that we encountered. We had a tar.gz file that would not decompress; trying to unzip it gave the error:
gzip -d A.tar.gz
gzip: A.tar.gz: invalid compressed data--format violated

I figured out that the file may have been originally uploaded over a non-binary FTP connection (we don't know for sure).

The solution was relatively simple using the unix dos2unix utility

dos2unix A.tar.gz
dos2unix: converting file A.tar.gz to UNIX format ...
tar -xvf A.tar
file1.txt
file2.txt 
....etc.

It worked! This is one slim possibility, and maybe worth a try - it may help somebody out there.

[Oct 08, 2019] Perl constant array

Oct 08, 2019 | stackoverflow.com



Alec ,Sep 5, 2018 at 8:25

use constant {
    COLUMNS => qw/ TEST1 TEST2 TEST3 /,
};

Can I store an array using the constant package in Perl?

Whenever I then try to use the array, like my @attr = (COLUMNS);, it does not contain the values.

Сухой27 ,Aug 12, 2013 at 13:37

use constant {
  COLUMNS => [qw/ TEST1 TEST2 TEST3 /],
};

print @{+COLUMNS};   # the + forces COLUMNS to be parsed as a function call (returning the arrayref), not a bareword


Or remove the curly braces, as the docs show:

use strict;
use constant COLUMNS => qw/ TEST1 TEST2 TEST3 /;

my @attr = (COLUMNS);
print @attr;

which gives:

 % perl test.pl
TEST1TEST2TEST3

Your code actually defines two constants, COLUMNS and TEST2:

use strict;
use constant { COLUMNS => qw/ TEST1 TEST2 TEST3 /, };

my @attr = (COLUMNS);
print @attr;
print TEST2

and gives:

% perl test.pl
TEST1TEST3

[Oct 07, 2019] How to commit to remote git repository

Apr 28, 2012 | stackoverflow.com

Ahmed ,Apr 28, 2012 at 14:32

I am new to git.
I have done a clone of a remote repo as follows:
git clone https://[email protected]/repo.git

then I did

git checkout master

made some changes and committed these changes to my local repository like below:

git add .

git commit -m "my changes"

Now I have to push these changes to the remote repository. I am not sure what to do.

Would I do a merge of my repo to the remote? What steps do I need to take?

I have git bash and git gui

please advise,
thanks,

zafarkhaja ,Apr 28, 2012 at 14:39

All you have to do is git push origin master, where origin is the default name (alias) of your remote repository and master is the remote branch you want to push your changes to.

You may also want to check these out:

  1. http://gitimmersion.com/
  2. http://progit.org/book/

Sergey K. ,Apr 28, 2012 at 14:54

You just need to make sure you have the rights to push to the remote repository and do
git push origin master

or simply

git push
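
Note that a bare git push only works once the branch has an upstream configured; you can set that on the first push with the -u (--set-upstream) flag:

git push -u origin master

After that, a plain git push from this branch is enough.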

haziz ,Apr 28, 2012 at 21:30

git push

or

git push server_name master

should do the trick, after you have made a commit to your local repository.

Bryan Oakley ,Apr 28, 2012 at 14:34

Have you tried git push? gitref.org has a nice section dealing with remote repositories.

You can also get help from the command line using the --help option. For example:

% git push --help
GIT-PUSH(1)                             Git Manual                             GIT-PUSH(1)



NAME
       git-push - Update remote refs along with associated objects

SYNOPSIS
       git push [--all | --mirror | --tags] [-n | --dry-run] [--receive-pack=<git-receive-pack>]
                  [--repo=<repository>] [-f | --force] [-v | --verbose] [-u | --set-upstream]
                  [<repository> [<refspec>...]]
...

[Sep 30, 2019] How can I convert a string to a number in Perl?

Sep 30, 2019 | stackoverflow.com



TimK ,Mar 1, 2016 at 22:51

How would I convert a string holding a number into its numeric value in Perl?

OrangeDog ,May 22, 2014 at 15:24

You don't need to convert it at all:
% perl -e 'print "5.45" + 0.1;'
5.55

,

This is a simple solution:

Example 1

my $var1 = "123abc";
print $var1 + 0;

Result

123

Example 2

my $var2 = "abc123";
print $var2 + 0;

Result

0
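
If you need to check first whether a string really looks like a number (instead of silently getting 0, as in Example 2), the core module Scalar::Util provides looks_like_number; a minimal sketch:

use Scalar::Util qw(looks_like_number);

my $str = "123abc";
if (looks_like_number($str)) {
    print $str + 0, "\n";
} else {
    print "'$str' is not a number\n";   # this branch runs for "123abc"
}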

[Sep 24, 2019] How to properly use the try catch in perl that error.pm provides?

Sep 24, 2019 | stackoverflow.com

Sinan Ünür ,Apr 28, 2012 at 18:07

Last I checked, Error was deprecated. But here's how you would do it without that module:
eval {
    die "Oops!";
    1;
} or do {
    my $e = $@;
    print("Something went wrong: $e\n");
};

Basically, use eval instead of try, die instead of throw, and look for the exception in $@. The true value at the end of the eval block is part of an idiom to prevent $@ from unintentionally changing before it is used again in Perl versions older than 5.14; see P::C::P::ErrorHandling::RequireCheckingReturnValueOfEval for details. For example, this code suffers from this flaw.

# BAD, DO NOT USE WITH PERLS OLDER THAN 5.14
eval {
    die "Oops!";
};
if (my $e = $@) {
    print("Something went wrong: $e\n");
}
# BAD, DO NOT USE WITH PERLS OLDER THAN 5.14

But note that many Perl operations do not raise exceptions when they fail; they simply return an error code. This behavior can be altered via autodie for builtins and standard modules. If you're using autodie , then the standard way of doing try/catch is this (straight out of the autodie perldoc):

use feature qw(switch);

eval {
   use autodie;

   open(my $fh, '<', $some_file);

   my @records = <$fh>;

   # Do things with @records...

   close($fh);

};

given ($@) {
   when (undef)   { say "No error";                    }
   when ('open')  { say "Error from open";             }
   when (':io')   { say "Non-open, IO error.";         }
   when (':all')  { say "All other autodie errors."    }
   default        { say "Not an autodie error at all." }
}

For getting a stacktrace, look at Carp .


If you want something a bit more powerful than Try::Tiny, you might want to try looking at the TryCatch module in CPAN.

[Sep 24, 2019] What is the best way to handle exceptions in Perl - Stack Overflow

Sep 24, 2019 | stackoverflow.com

Michael Carman ,Apr 8 at 9:52

The consensus of the Perl community seems to be that Try::Tiny is the preferred way of doing exception handling. The "lenient policy" you refer to is probably due to a combination of historical factors.

In practice this means that you'll see a lot of code like this:

eval { something() };
if ($@) {
    warn "Oh no! [$@]\n";
}

That's exception handling even though it doesn't use try/catch syntax. It's fragile, though, and will break in a number of subtle edge cases that most people don't think about.

Try::Tiny and the other exception handling modules on CPAN were written to make it easier to get right.
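
For reference, a minimal Try::Tiny sketch (the error text is made up):

use Try::Tiny;

try {
    die "Oops!\n";
} catch {
    warn "Something went wrong: $_";   # the error is in $_ (and in @_)
} finally {
    # runs whether or not the try block died
};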

1. C does have setjmp() and longjmp() , which can be used for a very crude form of exception handling.

,

Never test $@ as is, because it is a global variable, so even the test itself can change it.

General eval-template:

my $result;

eval {
    $result= something();
    # ...
    1;  # ok
} or do {
    my $eval_error= $@ || "error";
    # ...
    die $eval_error;
};  # needs a semicolon

In practice that is the lightest way. It still leaves a tiny bit of room for funny $@ behaviour, but nothing that ever really concerned me.

[Sep 21, 2019] How Did Perl Lose Ground to Bash?

Notable quotes:
"... It baffles me the most because the common objection to Perl is legibility. Even if you assume that the objection is made from ignorance - i.e. not even having looked at some Perl to gauge its legibility - the nonsense you see in a complex bash script is orders of magnitude worse! ..."
"... Maybe it's not reassuring to hear that, but I took an interest in Perl precisely because it's seen as an underdog and "dead" despite having experienced users and a lot of code, kind of like TCL, Prolog, or Ada. ..."
"... There's a long history of bad code written by mediocre developers who became the only one who could maintain the codebase until they no longer worked for the organization. The next poor sap to go in found a mess of a codebase and did their best to not break it further. After a few iterations, the whole thing is ready for /dev/null and Perl gets the blame. ..."
"... All in all, Perl is still my first go-to language, but there are definitely some things I wish it did better. ..."
"... The Perl leadership Osborned itself with Perl6. 20/20 hindsight says the new project should have been given a different name at conception, that way all the "watch this space -- under construction" signage wouldn't have steered people away from perfectly usable Perl5. Again, IMO. ..."
"... I don't observe the premise at all though. Is bash really gaining ground over anything recently? ..."
"... Python again is loved, because "taught by rote" idiots. Now you can give them pretty little packages. And it's no wonder they can do little better than be glorified system admins (which id rather have a real sys admin, since he's likely to understand Perl) ..."
"... Making a new language means lots of new training. Lots of profit in this. Nobody profits from writing new books on old languages. Lots of profit in general from supporting a new language. In the end, owning the language gets you profits. ..."
"... And I still don't get why tab for blocks python is even remotely more readable than Perl. ..."
"... If anything, JavaScript is pretty dang godly at what it does, I understand why that's popular. But I don't get python one bit, except to employ millions of entry level minions who can't think on their own. ..."
"... "Every teacher I know has students using it. We do it because it's an easy language, there's only one way to do it, and with whitespace as syntax it's easy to grade. We don't teach it because it is some powerful or exceptional language. " ..."
Sep 21, 2019 | www.reddit.com

How Did Perl Lose Ground to Bash?

Setting aside Perl vs. Python for the moment, how did Perl lose ground to Bash? It used to be that Bash scripts often got replaced by Perl scripts because Perl was more powerful. Even with very modern versions of Bash, Perl is much more powerful.

The Linux Standards Base (LSB) has helped ensure that certain tools are in predictable locations. Bash has gotten a bit more powerful since the release of 4.x, sure. Arrays, handicapped to 2-D arrays, have improved somewhat. There is a native regex engine in Bash 3.x, which I admit is a big deal. There is also support for hash maps.

This is all good stuff for Bash. But, none of this is sufficient to explain why Perl isn't the thing you learn after Bash, or, after Bash and Python; take your pick. Thoughts?


oldmanwillow21 9 points · 9 days ago

Because Perl has suffered immensely in the popularity arena and is now viewed as undesirable. It's not that Bash is seen as an adequate replacement for Perl; that's where Python has landed.

emilper 8 points · 8 days ago

How did Perl5 lose ground to anything else?

Thusly

- "thou must use Moose for everything" -> "Perl is too slow" -> rewrite in Python because the architect loves Python -> Python is even slower -> architect shunned by the team and everything new written in Go, nobody dares to complain about speed now because the budget people don't trust them -> Perl is slow

- "globals are bad, singletons are good" -> spaghetti -> Perl is unreadable

- "lets use every single item from the gang of four book" -> insanity -> Perl is bad

- "we must be more OOP" -> everything is a faux object with everything else as attributes -> maintenance team quits and they all take PHP jobs, at least the PHP people know their place in the order of things and do less hype-driven-development -> Perl is not OOP enough

- "CGI is bad" -> app needs 6.54GB of RAM for one worker -> customer refuses to pay for more RAM, fires the team, picks a PHP team to do the next version -> PHP team laughs all the way to the bank, chanting "CGI is king"

recrof 2 points · 8 days ago

"CGI is bad" is real. PSGI or FCGI is much faster for web services, and if there are memory leaks, it's always possible to debug & fix them.

Grinnz 6 points · 8 days ago

CGI is fine, when it's all you need. There are many different use cases out there. Just don't use CGI.pm .

emilper 2 points · 7 days ago

memory leaks

memory leaks ... do huge monoliths count as "memory leaks" ?

Altreus 7 points · 8 days ago

It baffles me the most because the common objection to Perl is legibility. Even if you assume that the objection is made from ignorance - i.e. not even having looked at some Perl to gauge its legibility - the nonsense you see in a complex bash script is orders of magnitude worse!

Not to mention its total lack of common language features like first-class data and... Like, a compiler...

I no longer write bash scripts because it takes about 5 lines to become unmaintainable.

crashorbit 5 points · 9 days ago

Every language that reaches functional equity with Perl is perceived as better than it. Mostly because hey, at least it's not Perl.

oldmanwillow21 15 points · 9 days ago · edited 9 days ago

Jumbled mess of thoughts surely to follow.

When I discuss projects with peers and mention that I chose to develop in Perl, the responses range from passive bemusement, to scorn, to ridicule. The assumption is usually that I'm using a dead language that's crippled in functionality and uses syntax that will surely make everyone's eyes bleed to read. This is the culture everywhere from the casual hackers to the C-suite.

I've proven at work that I can write nontrivial software using Perl. I'm still asked to use Python or Go (edit: or node, ugh) for any project that'll have contributors from other teams, or to containerize apps using Docker to remove the need for Perl knowledge for end-users (no CPAN, carton, etc.). But I'll take what I can get, and now the attitude has gone from "get with the times" or "that's cute", to "ok but I don't expect everyone else to know it".

Perl has got a lot to offer, and I vastly enjoy using it over other languages I work with. I know that all the impassioned figures in the Perl community love it just the same, but the community's got some major fragmentation going on. I understand that everyone's got ideas about the future of the language, but is this really the best time to pull the community apart? I feel like if everyone was able to let go of their ego and put their heads together to bring us to a point of stability, even a place where we're not laughed at for professing our support for the language, it would be a major step in the right direction. I think we're heading to the bottom fast, otherwise.

In that spirit of togetherness, I think the language, particularly the community, needs to be made more accessible to newcomers. Not accessible to one Perl offshoot, but accessible to Perl. It needs to be decided what Perl means in today's day and age. What can it do? Why would I want to use it over another shiny language? What are the definitive places I can go to learn more? Who else will be there? How do I contribute and grow as a Perl developer? There need to be people talking about Perl in places that aren't necessarily hubs for other Perl enthusiasts. It needs to be something business decision-makers can look at and feel confident in using.

I really hope something changes. I'd be pretty sad if I had to spend the rest of my career writing whatever the trendy language of the day is. These are just observations from someone that likes writing Perl and has been watching from the sidelines.

PhloxPaniculata 2 points · 7 days ago

Maybe it's not reassuring to hear that, but I took an interest in Perl precisely because it's seen as an underdog and "dead" despite having experienced users and a lot of code, kind of like TCL, Prolog, or Ada.

Being able to read Modern Perl for free also helped a lot. I'm still lacking experience in Perl and I've yet to write anything of importance in it because I don't see an area in which it's clearly better than anything else, either because of the language, a package, or a framework, and I don't do a lot of text-munging anymore (I'm also a fan of awk so for small tasks it has the priority).

codon011 1 point · 9 days ago

Don't call it Perl. Unfortunately. Also, IME multitasking in Perl5 (or the lack thereof, and/or severe issues with it) has been a detriment to its standing in a "multithread all the things" world.

crashorbit 4 points · 8 days ago

So often I see people drag themselves down that "thread my app" path. Eventually realize that they are implementing a whole multi-processing operating system inside their app rather than taking advantage of the perfectly good one they are running on.

There are several perfectly good ways to do concurrency, multitasking, async IO and so on in perl. Many work well in the single node case and in the multi-node case. Anyone who tells you that multitasking systems are easy because of some implementation language choice has not made it through the whole Dunning Kruger cycle yet.

codon011 2 points · 8 days ago

Multithreading is never easy. The processors will always manage to do things in a "wrong" order unless you are very careful with your gatekeeping. However, other languages/frameworks have paradigms that make it seem easier such that those race conditions show up much later in your product lifecycle.

codon011 3 points · 9 days ago

There's a long history of bad code written by mediocre developers who became the only one who could maintain the codebase until they no longer worked for the organization. The next poor sap to go in found a mess of a codebase and did their best to not break it further. After a few iterations, the whole thing is ready for /dev/null and Perl gets the blame.

Bash has limitations, but that (usually) means fewer ways to mess it up. There's less domain knowledge to learn, (afaik) no CPAN equivalent, and fewer issues with things like "I need to upgrade this but I can't because this other thing uses this older version which is incompatible with the newer version so now we have to maintain two versions of the library and/or interpreter."

All in all, Perl is still my first go-to language, but there are definitely some things I wish it did better.

crb3 3 points · 9 days ago · edited 9 days ago

*[e:] Consider, not just core here, but CPAN pull-in as well. I had one project clobbered on a smaller-memory machine when I tried to set up a pure-Perl scp transfer -- there wasn't room enough for the full file to transfer if it was larger than about 50k, what with all the CPAN. Shelling to commandline scp worked just fine.

beermad 2 points · 8 days ago

To be fair, wrapping a Perl script around something that's (if I read your comment right) just running SCP is adding a pointless extra layer of complexity anyway.

It's a matter of using the best tool for each particular job, not just sticking with one. My own ~/bin directory has a big mix of Perl and pure shell, depending on the complexity of the job to be done.

crb3 2 points · 8 days ago · edited 7 days ago

Agreed; I brought that example up to illustrate the bulk issue. In it, I was feeling my way, not sure how much finagling I might have to do for the task (backdoor-passing legitimate sparse but possibly quite bulky email from one server to another), which is why I initially went for the pure-Perl approach, so I'd have the mechanics exposed for any needed hackery. The experience taught me to get by more on shelling to precompiled tooling where appropriate... and a healthy respect for CPAN pull-in, [e:] the way that this module depends on that module so it gets pulled in along with its dependencies in turn, and the pileup grows in memory. There was a time or two here and there where I only needed a teeny bit of what a module does, so I went in and studied the code, then implemented it internally as a function without the object's generalities and bulk. The caution learned on ancient x86 boxes now seems appropriate on ARM boards like rPi; what goes around comes around.

minimim 1 point · 4 days ago

wouldn't have steered people away from perfectly usable Perl5

Perl5 development was completely stalled at the time. Perl6 brought not only new blood into it's own effort, it reinvigorated Perl5 in the process.

It's completely backwards to suggest Perl 5 was fine until perl6 came along. It was almost dormant and became a lively language after Perl 6 was announced.

perlancar 2 points · 8 days ago

I don't observe the premise at all, though. Is bash really gaining ground over anything recently?

linearblade 3 points · 8 days ago

Perl is better than pretty much everything out there at what it does.

But keep in mind,

They say C sharp is loved by everyone, when in reality it's Microsoft pushing their narrative and the army of "learn by rote" engineers in developing countries.

Python again is loved, because "taught by rote" idiots. Now you can give them pretty little packages. And it's no wonder they can do little better than be glorified system admins (which id rather have a real sys admin, since he's likely to understand Perl)

Making a new language means lots of new training. Lots of profit in this. Nobody profits from writing new books on old languages. Lots of profit in general from supporting a new language. In the end, owning the language gets you profits.

And I still don't get why tab for blocks python is even remotely more readable than Perl.

If anything, JavaScript is pretty dang godly at what it does, I understand why that's popular. But I don't get python one bit, except to employ millions of entry level minions who can't think on their own.

duo-rotae 6 points · 8 days ago

I know a comp sci professor. I asked why he thought Python was so popular.

"Every teacher I know has students using it. We do it because it's an easy language, there's only one way to do it, and with whitespace as syntax it's easy to grade. We don't teach it because it is some powerful or exceptional language. "

Then he said if he really needs to get something done, it's Perl or C.

linearblade 2 points · 8 days ago

Yep that's pretty much my opinion from using it.

techsnapp 1 point · 2 days ago

So is Perl harder than Python because of the lack of everyone else using it?

duo-rotae 1 point · 2 days ago

Perl has a steeper and longer learning curve than Python, and there is more than one way to do anything. And there are quite a few who continue coding with it.

[Sep 19, 2019] List::MoreUtils's minmax is more efficient when you need both the min and the max (because it does fewer comparisons).

Notable quotes:
"... List::MoreUtils's minmax is more efficient when you need both the min and the max (because it does fewer comparisons). ..."
Sep 19, 2019 | stackoverflow.com

List::Util's min and max are fine,

use List::Util qw( min max );
my $min = min @numbers;
my $max = max @numbers;

But List::MoreUtils's minmax is more efficient when you need both the min and the max (because it does fewer comparisons).

use List::MoreUtils qw( minmax );
my ($min, $max) = minmax @numbers;

List::Util is part of core, but List::MoreUtils isn't.

--ikegami
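Why can minmax get away with fewer comparisons? It examines elements in pairs rather than one at a time: one comparison orders each pair, then the smaller element is tested only against the running minimum and the larger only against the running maximum, roughly 3 comparisons per 2 elements instead of 4. A minimal sketch of that pairwise idea (my illustration, not List::MoreUtils's actual code):

#!/usr/bin/perl
use strict;
use warnings;

# Pairwise scan: one comparison orders each pair, then the smaller element
# is tested only against the running min and the larger only against the
# running max -- about 3 comparisons per 2 elements instead of 4.
sub minmax_pairwise {
    my @n = @_;
    return unless @n;
    my ( $min, $max ) = ( $n[0], $n[0] );
    my $i = @n % 2;    # skip element 0 if the count is odd (it seeded min/max)
    while ( $i < $#n ) {
        my ( $lo, $hi ) = $n[$i] <= $n[ $i + 1 ]
                        ? ( $n[$i], $n[ $i + 1 ] )
                        : ( $n[ $i + 1 ], $n[$i] );
        $min = $lo if $lo < $min;
        $max = $hi if $hi > $max;
        $i += 2;
    }
    return ( $min, $max );
}

my ( $min, $max ) = minmax_pairwise( 3, 9, 1, 7, 5 );
print "$min $max\n";    # 1 9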

[Sep 16, 2019] How can I capture multiple matches from the same Perl regex - Stack Overflow

Sep 16, 2019 | stackoverflow.com



brian d foy ,May 22, 2010 at 15:42

I'm trying to parse a single string and get multiple chunks of data out from the same string with the same regex conditions. I'm parsing a single HTML doc that is static (For an undisclosed reason, I can't use an HTML parser to do the job.) I have an expression that looks like:
$string =~ /\<img\ssrc\="(.*)"/;

and I want to get the value of $1. However, in the one string there are many img tags like this, so I need something like an array returned (@1?). Is this possible?

VolatileRig ,Jan 14, 2014 at 19:41

As Jim's answer, use the /g modifier (in list context or in a loop).

But beware of greediness: you don't want the .* to match more than necessary (and don't escape < or = ; they are not special).

while($string =~ /<img\s+src="(.*?)"/g ) {
  ...
}

Robert Wohlfarth ,May 21, 2010 at 18:44

@list = ($string =~ m/\<img\ssrc\="(.*)"/g);

The g modifier matches all occurrences in the string. List context returns all of the matches. See the m// operator in perlop.

dalton ,May 21, 2010 at 18:42

You just need the global modifier /g at the end of the match. Then loop through until there are no matches remaining
my @matches;
while ($string =~ /\<img\ssrc\="(.*)"/g) {
        push(@matches, $1);
}

VolatileRig ,May 24, 2010 at 16:37

Use the /g modifier and list context on the left, as in
@result = $string =~ /\<img\ssrc\="(.*)"/g;
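Putting the answers above together, a small self-contained demo of both forms; the sample string is made up for illustration:

#!/usr/bin/perl
use strict;
use warnings;

my $string = '<img src="a.png"> text <img src="b.png"> and <img src="c.png">';

# List context with /g collects every capture-group match at once.
my @all = $string =~ /<img\s+src="(.*?)"/g;
print "@all\n";    # a.png b.png c.png

# A while loop with /g visits one match per iteration; pos() advances.
while ( $string =~ /<img\s+src="(.*?)"/g ) {
    print "found: $1\n";
}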

[Sep 16, 2019] Pretty-print for shell script

Sep 16, 2019 | stackoverflow.com

Benoit ,Oct 21, 2010 at 13:19

I'm looking for something similar to indent but for (bash) scripts. Console only, no colorizing, etc.

Do you know of one ?

Jamie ,Sep 11, 2012 at 3:00

Vim can indent bash scripts, but not reformat them before indenting.
Back up your bash script, open it with vim, type gg=GZZ, and the indentation will be corrected. (Note for the impatient: this overwrites the file, so be sure to do that backup!)

There are some bugs, though, with << heredocs (vim expects the EOF delimiter to be the first character on a line), e.g.

EDIT: ZZ not ZQ

Daniel Martí ,Apr 8, 2018 at 13:52

A bit late to the party, but it looks like shfmt could do the trick for you.

Brian Chrisman ,Sep 9 at 7:47

In bash I do this:
reindent() {
source <(echo "Zibri () {";cat "$1"; echo "}")
declare -f Zibri|head --lines=-1|tail --lines=+3 | sed -e "s/^\s\s\s\s//"
}

this eliminates comments and reindents the script "bash way".

If you have HEREDOCS in your script, they get ruined by the sed in the previous function.

So use:

reindent() {
source <(echo "Zibri () {";cat "$1"; echo "}")
declare -f Zibri|head --lines=-1|tail --lines=+3
}

But all of your script will have 4-space indentation.

Or you can do:

reindent () 
{ 
    rstr=$(mktemp -u "XXXXXXXXXX");
    source <(echo "Zibri () {";cat "$1"|sed -e "s/^\s\s\s\s/$rstr/"; echo "}");
    echo '#!/bin/bash';
    declare -f Zibri | head --lines=-1 | tail --lines=+3 | sed -e "s/^\s\s\s\s//;s/$rstr/    /"
}

which takes care also of heredocs.

> ,

Found this http://www.linux-kheops.com/doc/perl/perl-aubert/fmt.script .

Very nice; the only thing I took out is the [...]->test substitution.

[Sep 16, 2019] A command-line HTML pretty-printer Making messy HTML readable - Stack Overflow

Notable quotes:
"... Have a look at the HTML Tidy Project: http://www.html-tidy.org/ ..."
Sep 16, 2019 | stackoverflow.com

nisetama ,Aug 12 at 10:33

I'm looking for recommendations for HTML pretty printers which fulfill the following requirements:

> ,

Have a look at the HTML Tidy Project: http://www.html-tidy.org/

The granddaddy of HTML tools, with support for modern standards.

There used to be a fork called tidy-html5 which since became the official thing. Here is its GitHub repository .

Tidy is a console application for Mac OS X, Linux, Windows, UNIX, and more. It corrects and cleans up HTML and XML documents by fixing markup errors and upgrading legacy code to modern standards.

For your needs, here is the command line to call Tidy:

[Sep 11, 2019] string - Extract substring in Bash - Stack Overflow

Sep 11, 2019 | stackoverflow.com

Jeff ,May 8 at 18:30

Given a filename in the form someletters_12345_moreleters.ext , I want to extract the 5 digits and put them into a variable.

So to emphasize the point, I have a filename with x number of characters then a five digit sequence surrounded by a single underscore on either side then another set of x number of characters. I want to take the 5 digit number and put that into a variable.

I am very interested in the number of different ways that this can be accomplished.

Berek Bryan ,Jan 24, 2017 at 9:30

Use cut :
echo 'someletters_12345_moreleters.ext' | cut -d'_' -f 2

More generic:

INPUT='someletters_12345_moreleters.ext'
SUBSTRING=$(echo $INPUT| cut -d'_' -f 2)
echo $SUBSTRING

JB. ,Jan 6, 2015 at 10:13

If x is constant, the following parameter expansion performs substring extraction:
b=${a:12:5}

where 12 is the offset (zero-based) and 5 is the length

If the underscores around the digits are the only ones in the input, you can strip off the prefix and suffix (respectively) in two steps:

tmp=${a#*_}   # remove prefix ending in "_"
b=${tmp%_*}   # remove suffix starting with "_"

If there are other underscores, it's probably feasible anyway, albeit more tricky. If anyone knows how to perform both expansions in a single expression, I'd like to know too.

Both solutions presented are pure bash, with no process spawning involved, hence very fast.

A Sahra ,Mar 16, 2017 at 6:27

Generic solution where the number can be anywhere in the filename, using the first of such sequences:
number=$(echo $filename | egrep -o '[[:digit:]]{5}' | head -n1)

Another solution to extract exactly a part of a variable:

number=${filename:offset:length}

If your filename always have the format stuff_digits_... you can use awk:

number=$(echo $filename | awk -F _ '{ print $2 }')

Yet another solution to remove everything except digits, use

number=$(echo $filename | tr -cd '[[:digit:]]')

sshow ,Jul 27, 2017 at 17:22

In case someone wants more rigorous information, you can also search it in man bash like this
$ man bash [press return key]
/substring  [press return key]
[press "n" key]
[press "n" key]
[press "n" key]
[press "n" key]

Result:

${parameter:offset}
       ${parameter:offset:length}
              Substring Expansion.  Expands to  up  to  length  characters  of
              parameter  starting  at  the  character specified by offset.  If
              length is omitted, expands to the substring of parameter  start‐
              ing at the character specified by offset.  length and offset are
              arithmetic expressions (see ARITHMETIC  EVALUATION  below).   If
              offset  evaluates  to a number less than zero, the value is used
              as an offset from the end of the value of parameter.  Arithmetic
              expressions  starting  with  a - must be separated by whitespace
              from the preceding : to be distinguished from  the  Use  Default
              Values  expansion.   If  length  evaluates to a number less than
              zero, and parameter is not @ and not an indexed  or  associative
              array,  it is interpreted as an offset from the end of the value
              of parameter rather than a number of characters, and the  expan‐
              sion is the characters between the two offsets.  If parameter is
              @, the result is length positional parameters beginning at  off‐
              set.   If parameter is an indexed array name subscripted by @ or
              *, the result is the length members of the array beginning  with
              ${parameter[offset]}.   A  negative  offset is taken relative to
              one greater than the maximum index of the specified array.  Sub‐
              string  expansion applied to an associative array produces unde‐
              fined results.  Note that a negative offset  must  be  separated
              from  the  colon  by  at least one space to avoid being confused
              with the :- expansion.  Substring indexing is zero-based  unless
              the  positional  parameters are used, in which case the indexing
              starts at 1 by default.  If offset  is  0,  and  the  positional
              parameters are used, $0 is prefixed to the list.

Aleksandr Levchuk ,Aug 29, 2011 at 5:51

Building on jor's answer (which doesn't work for me):
substring=$(expr "$filename" : '.*_\([^_]*\)_.*')

kayn ,Oct 5, 2015 at 8:48

I'm surprised this pure bash solution didn't come up:
a="someletters_12345_moreleters.ext"
IFS="_"
set $a
echo $2
# prints 12345

You probably want to reset IFS to what value it was before, or unset IFS afterwards!

zebediah49 ,Jun 4 at 17:31

Here's how I'd do it:
FN=someletters_12345_moreleters.ext
[[ ${FN} =~ _([[:digit:]]{5})_ ]] && NUM=${BASH_REMATCH[1]}

Note: the above is a regular expression and is restricted to your specific scenario of five digits surrounded by underscores. Change the regular expression if you need different matching.

TranslucentCloud ,Jun 16, 2014 at 13:27

Following the requirements

I have a filename with x number of characters then a five digit sequence surrounded by a single underscore on either side then another set of x number of characters. I want to take the 5 digit number and put that into a variable.

I found some grep ways that may be useful:

$ echo "someletters_12345_moreleters.ext" | grep -Eo "[[:digit:]]+" 
12345

or better

$ echo "someletters_12345_moreleters.ext" | grep -Eo "[[:digit:]]{5}" 
12345

And then with -Po syntax:

$ echo "someletters_12345_moreleters.ext" | grep -Po '(?<=_)\d+' 
12345

Or if you want to make it fit exactly 5 characters:

$ echo "someletters_12345_moreleters.ext" | grep -Po '(?<=_)\d{5}' 
12345

Finally, to store the result in a variable you just need to use the var=$(command) syntax.

Darron ,Jan 9, 2009 at 16:13

Without any sub-processes you can:
shopt -s extglob
front=${input%%_+([a-zA-Z]).*}
digits=${front##+([a-zA-Z])_}

A very small variant of this will also work in ksh93.

user2350426 ,Aug 5, 2014 at 8:11

If we focus on the concept of:
"A run of (one or several) digits"

We could use several external tools to extract the numbers.
We could quite easily erase all other characters, with either sed or tr:

name='someletters_12345_moreleters.ext'

echo $name | sed 's/[^0-9]*//g'    # 12345
echo $name | tr -c -d 0-9          # 12345

But if $name contains several runs of numbers, the above will fail:

If "name=someletters_12345_moreleters_323_end.ext", then:

echo $name | sed 's/[^0-9]*//g'    # 12345323
echo $name | tr -c -d 0-9          # 12345323

We need to use regular expressions (regex).
To select only the first run (12345, not 323) in sed and perl:

echo $name | sed 's/[^0-9]*\([0-9]\{1,\}\).*$/\1/'
perl -e 'my $name='$name';my ($num)=$name=~/(\d+)/;print "$num\n";'

But we could as well do it directly in bash (1) :

regex='[^0-9]*([0-9]{1,}).*$'
[[ $name =~ $regex ]] && echo "${BASH_REMATCH[1]}"

This allows us to extract the FIRST run of digits of any length
surrounded by any other text/characters.

Note: regex='[^0-9]*([0-9]{5,5}).*$' will match only runs of exactly 5 digits. :-)

(1): faster than calling an external tool for each short text. Not faster than doing all processing inside sed or awk for large files.

codist ,May 6, 2011 at 12:50

Here's a prefix-suffix solution (similar to the solutions given by JB and Darron) that matches the first block of digits and does not depend on the surrounding underscores:
str='someletters_12345_morele34ters.ext'
s1="${str#"${str%%[[:digit:]]*}"}"   # strip off non-digit prefix from str
s2="${s1%%[^[:digit:]]*}"            # strip off non-digit suffix from s1
echo "$s2"                           # 12345

Campa ,Oct 21, 2016 at 8:12

I love sed's capability to deal with regex groups:
> var="someletters_12345_moreletters.ext"
> digits=$( echo $var | sed "s/.*_\([0-9]\+\).*/\1/p" -n )
> echo $digits
12345

A slightly more general option would be not to assume that you have an underscore _ marking the start of your digits sequence, hence for instance stripping off all non-numbers you get before your sequence: s/[^0-9]\+\([0-9]\+\).*/\1/p .


> man sed | grep s/regexp/replacement -A 2
s/regexp/replacement/
    Attempt to match regexp against the pattern space.  If successful, replace that portion matched with replacement.  The replacement may contain the special  character  &  to
    refer to that portion of the pattern space which matched, and the special escapes \1 through \9 to refer to the corresponding matching sub-expressions in the regexp.

More on this, in case you're not too confident with regexps:

All the \ escapes are there to make sed's regexp processing work.

Dan Dascalescu ,May 8 at 18:28

Given test.txt is a file containing "ABCDEFGHIJKLMNOPQRSTUVWXYZ":
cut -b19-20 test.txt > test1.txt   # this extracts chars 19 & 20, "ST"
while read -r; do
    x=$REPLY
done < test1.txt
echo $x
ST

Alex Raj Kaliamoorthy ,Jul 29, 2016 at 7:41

My answer gives you more control over what you want out of your string. Here is the code showing how you can extract 12345 out of your string:
str="someletters_12345_moreleters.ext"
str=${str#*_}
str=${str%_more*}
echo $str

This will be more efficient if you want to extract something that has any chars like abc or any special characters like _ or - . For example: If your string is like this and you want everything that is after someletters_ and before _moreleters.ext :

str="someletters_123-45-24a&13b-1_moreleters.ext"

With my code you can mention what exactly you want. Explanation:

#*  removes the preceding string, including the matching key. Here the key we mentioned is _
%   removes the following string, including the matching key. Here the key we mentioned is '_more*'

Do some experiments yourself and you would find this interesting.

Dan Dascalescu ,May 8 at 18:27

similar to substr('abcdefg', 2-1, 3) in php:
echo 'abcdefg'|tail -c +2|head -c 3

olibre ,Nov 25, 2015 at 14:50

Ok, here goes pure parameter substitution with an empty string. The caveat is that I have defined someletters and moreletters as letters only; if they are alphanumeric, this will not work as it is. Note that the @(...) and +(...) patterns used here require extglob:
shopt -s extglob
filename=someletters_12345_moreletters.ext
substring=${filename//@(+([a-z])_|_+([a-z]).*)}
echo $substring
12345

gniourf_gniourf ,Jun 4 at 17:33

There's also the expr command (an external utility, not a bash builtin):
INPUT="someletters_12345_moreleters.ext"  
SUBSTRING=`expr match "$INPUT" '.*_\([[:digit:]]*\)_.*' `  
echo $SUBSTRING

russell ,Aug 1, 2013 at 8:12

A little late, but I just ran across this problem and found the following:
host:/tmp$ asd=someletters_12345_moreleters.ext 
host:/tmp$ echo `expr $asd : '.*_\(.*\)_'`
12345
host:/tmp$

I used it to get millisecond resolution on an embedded system that does not have %N for date:

set `grep "now at" /proc/timer_list`
nano=$3
fraction=`expr $nano : '.*\(...\)......'`
$debug nano is $nano, fraction is $fraction

> ,Aug 5, 2018 at 17:13

A bash solution:
IFS="_" read -r x digs x <<<'someletters_12345_moreleters.ext'

This will clobber a variable called x . The var x could be changed to the var _ .

input='someletters_12345_moreleters.ext'
IFS="_" read -r _ digs _ <<<"$input"

[Sep 10, 2019] How do I avoid an uninitialized value

Sep 10, 2019 | stackoverflow.com

marto ,Jul 15, 2011 at 16:52

I use this scrub function to clean up output from other functions.
#!/usr/bin/perl
use warnings;
use strict;
use Data::Dumper;

my %h = (
    a => 1,
    b => 1
    );

print scrub($h{c});

sub scrub {
    my $a = shift;

    return ($a eq '' or $a eq '~' or not defined $a) ? -1 : $a;
}

The problem occurs when I also would like to handle the case, where the key in a hash doesn't exist, which is shown in the example with scrub($h{c}) .

What change should be make to scrub so it can handle this case?

Sandra Schlichting ,Jun 22, 2017 at 19:00

You're checking whether $a eq '' before checking whether it's defined, hence the warning "Use of uninitialized value in string eq". Simply change the order of things in the conditional:
return (!defined($a) or $a eq '' or $a eq '~') ? -1 : $a;

As soon as anything in the chain of 'or's matches, Perl will stop processing the conditional, thus avoiding the erroneous attempt to compare undef to a string.

Sandra Schlichting ,Jul 14, 2011 at 14:34

In scrub it is too late to check whether the hash has an entry for a given key: scrub() only sees a scalar, which is undef if the hash key does not exist. But a hash could also have an entry whose value is undef, like this:
my %h = (
 a => 1,
 b => 1,
 c => undef
);

So I suggest checking for hash entries with the exists function.
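A minimal sketch of that suggestion. Since exists must see the hash entry itself, the function below takes the hash and the key rather than the already-extracted value; the name scrub_key and the calling convention are my illustration, not the poster's code:

#!/usr/bin/perl
use strict;
use warnings;

my %h = ( a => 1, b => 1, c => undef );

# exists() must be asked about the hash entry itself, so the function
# receives a reference to the hash plus the key, not the extracted value.
sub scrub_key {
    my ( $h, $key ) = @_;
    return -1 unless exists $h->{$key};    # no such key at all
    my $v = $h->{$key};
    return ( !defined $v or $v eq '' or $v eq '~' ) ? -1 : $v;
}

print scrub_key( \%h, 'a' ), "\n";    # 1
print scrub_key( \%h, 'c' ), "\n";    # -1 (key exists, value undef)
print scrub_key( \%h, 'x' ), "\n";    # -1 (key missing)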

[Sep 10, 2019] How do I check if a Perl scalar variable has been initialized - Stack Overflow

Sep 10, 2019 | stackoverflow.com



brian d foy ,Sep 18, 2010 at 13:53

Is the following the best way to check if a scalar variable is initialized in Perl, using defined ?
my $var;

if (cond) {
    $var = "string1";
}

# Is this the correct way?
if (defined $var) {
    ...
}

mob ,Sep 25, 2010 at 21:35

Perl doesn't offer a way to check whether or not a variable has been initialized.

However, scalar variables that haven't been explicitly initialized with some value happen to have the value of undef by default. You are right about defined being the right way to check whether or not a variable has a value of undef .

There are several other ways, though. If you want to assign to the variable if it's undef, which your example code seems to indicate, you could, for example, use Perl's defined-or operator:

$var //= 'a default value';

vol7ron ,Sep 17, 2010 at 23:17

It depends on what you're trying to do. The proper C way to do things is to initialize variables when they are declared; however, Perl is not C , so one of the following may be what you want:
  1)   $var = "foo" unless defined $var;      # set default after the fact
  2)   $var = defined $var? $var : {...};     # ternary operation
  3)   {...} if !(defined $var);              # another way to write 1)
  4)   $var = $var || "foo";                  # set to $var unless it's falsy, in which case set to 'foo'
  5)   $var ||= "foo";                        # retain value of $var unless it's falsy, in which case set to 'foo' (same as previous line)
  6)   $var = $var // "foo";                  # set to $var unless it's undefined, in which case set to 'foo'
  7)   $var //= "foo";                        # 5.10+ ; retain value of $var unless it's undefined, in which case set to 'foo' (same as previous line)


C way of doing things ( not recommended ):

# initialize the variable to a default value during declaration
#   then test against that value when you want to see if it's been changed
my $var = "foo";
{...}
if ($var eq "foo"){
   ... # do something
} else {
   ... # do something else
}

Another long-winded way of doing this is to create a class that sets a flag when the variable has been changed, which is unnecessary.

Axeman ,Sep 17, 2010 at 20:39

If you don't care whether or not it's empty, it is. Otherwise you can check
if ( length( $str || '' )) {}

swilliams ,Sep 17, 2010 at 20:53

It depends on what you plan on doing with the variable, whether or not it is defined; as of Perl 5.10, you can do this (from perl5100delta):

A new operator // (defined-or) has been implemented. The following expression:

 $a // $b

is merely equivalent to

defined $a ? $a : $b

and the statement

$c //= $d;

can now be used instead of

$c = $d unless defined $c;

rafl ,Jun 24, 2012 at 7:53

'defined' will return true if a variable has a real value.

As an aside, in a hash, this can be true:

if(exists $h{$e} && !defined $h{$e})

[Sep 10, 2019] logging - Perl - Output the log files - Stack Overflow

Aug 27, 2015 | stackoverflow.com



Arunesh Singh ,Aug 27, 2015 at 8:53

I have created a Perl script that telnets to multiple switches. I would like to check whether telnet functions properly by telneting to each switch.

This is my code to telnet to the switches:

#!/usr/bin/perl
use warnings;
use Net::Telnet::Cisco;

open( OUTPUT, ">log.txt" );
open( SWITCHIP, "ip.txt" ) or die "couldn't open ip.txt";

my $count = 0;

while (<SWITCHIP>) {
    chomp($_);
    my $switch = $_;
    my $tl     = 0;
    my $t      = Net::Telnet::Cisco->new(
        Host => $switch,
        Prompt =>
            '/(?m:^(?:[\w.\/]+\:)?[\w.-]+\s?(?:\(config[^\)]*\))?\s?[\$#>]\s?(?:\(enable\))?\s*$)/',
        Timeout => 5,
        Errmode => 'return'
    ) or $tl = 1;

    my @output = ();
    if ( $tl != 1 ) {
        print "$switch Telnet success\n";
    }
    else {
        my $telnetstat = "Telnet Failed";
        print "$switch $telnetstat\n";
    }
    close(OUTPUT);
    $count++;
}

This is my output status after I was testing 7 switches:

10.xxx.3.17 Telnet success
10.xxx.10.12 Telnet success
10.xxx.136.10 Telnet success
10.xxx.136.12 Telnet success
10.xxx.188.188 Telnet Failed
10.xxx.136.13 Telnet success

I would like to convert the telnet result as log file.
How to separate successful and failed telnet results by using perl?

Danny Luk ,Aug 28, 2015 at 8:40

Please Try the following
#!/usr/bin/perl
use warnings;
use Net::Telnet::Cisco;
################################### S
open( OUTPUTS, ">log_Success.txt" );
open( OUTPUTF, ">log_Fail.txt" );
################################### E
open( SWITCHIP, "ip.txt" ) or die "couldn't open ip.txt";

my $count = 0;

while (<SWITCHIP>) {
    chomp($_);
    my $switch = $_;
    my $tl     = 0;
    my $t      = Net::Telnet::Cisco->new(
        Host => $switch,
        Prompt =>
            '/(?m:^(?:[\w.\/]+\:)?[\w.-]+\s?(?:\(config[^\)]*\))?\s?[\$#>]\s?(?:\(enable\))?\s*$)/',
        Timeout => 5,
        Errmode => 'return'
    ) or $tl = 1;

    my @output = ();
################################### S
    if ( $tl != 1 ) {
        print "$switch Telnet success\n"; # for printing it in screen
        print OUTPUTS "$switch Telnet success\n"; # it will print it in the log_Success.txt
    }
    else {
        my $telnetstat = "Telnet Failed";
        print "$switch $telnetstat\n"; # for printing it in screen
        print OUTPUTF "$switch $telnetstat\n"; # it will print it in the log_Fail.txt
    }
################################### E
    $count++;
}
################################### S
close(SWITCHIP);
close(OUTPUTS);
close(OUTPUTF);
################################### E

Danny Luk ,Aug 28, 2015 at 8:39

In the print statement, just write the filehandle name (which is OUTPUT in your code) after print:
print OUTPUT "$switch Telnet success\n";

and

print OUTPUT "$switch $telnetstat\n";

A side note: always use a lexical filehandle and three-argument open with error handling. The line open(OUTPUT, ">log.txt"); can be written like this:

open my $fhout, ">", "log.txt" or die $!;

Sobrique ,Aug 28, 2015 at 8:39

Use Sys::Syslog to write log messages.

But since you're opening a log.txt file with the handle OUTPUT , just change your two print statements to have OUTPUT as the first argument and the string as the next (without a comma).

my $telnetstat;
if($tl != 1) {
  $telnetstat = "Telnet success";
} else {
  $telnetstat = "Telnet Failed";
}
print OUTPUT "$switch $telnetstat\n";

# Or the shorter ternary operator line for all the above:
print OUTPUT $switch . (!$tl ? " Telnet success\n" : " Telnet Failed\n");

You might consider moving close to an END block:

END {
  close(OUTPUT);
}

Not least because your close(OUTPUT) currently sits inside the while loop, so the handle is closed after the first iteration.
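Pulling these answers together, a sketch of the logging skeleton only: lexical filehandles, three-argument open with error handling, handles opened once before the loop and closed once after it. The connect_to routine here is a stub standing in for the Net::Telnet::Cisco code from the question:

#!/usr/bin/perl
use strict;
use warnings;

# Lexical filehandles, three-argument open with error handling,
# one log per outcome, all opened once before the loop.
open my $ok_log,   '>', 'log_success.txt' or die "log_success.txt: $!";
open my $fail_log, '>', 'log_fail.txt'    or die "log_fail.txt: $!";
open my $ips,      '<', 'ip.txt'          or die "ip.txt: $!";

while ( my $switch = <$ips> ) {
    chomp $switch;
    my $failed = connect_to($switch);    # stub standing in for the telnet code
    my $status = $failed ? 'Telnet Failed' : 'Telnet success';
    print "$switch $status\n";           # screen
    print { $failed ? $fail_log : $ok_log } "$switch $status\n";
}

close $_ for $ips, $ok_log, $fail_log;

sub connect_to { return 0 }    # placeholder; real code would use Net::Telnet::Cisco->new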

[Sep 08, 2019] How to replace spaces in file names using a bash script

Sep 08, 2019 | stackoverflow.com



Mark Byers ,Apr 25, 2010 at 19:20

Can anyone recommend a safe solution to recursively replace spaces with underscores in file and directory names starting from a given root directory? For example:
$ tree
.
|-- a dir
|   `-- file with spaces.txt
`-- b dir
    |-- another file with spaces.txt
    `-- yet another file with spaces.pdf

becomes:

$ tree
.
|-- a_dir
|   `-- file_with_spaces.txt
`-- b_dir
    |-- another_file_with_spaces.txt
    `-- yet_another_file_with_spaces.pdf

Jürgen Hötzel ,Nov 4, 2015 at 3:03

Use rename (aka prename), which is a Perl script that may already be on your system. Do it in two steps:
find -name "* *" -type d | rename 's/ /_/g'    # do the directories first
find -name "* *" -type f | rename 's/ /_/g'

Based on Jürgen's answer and able to handle multiple layers of files and directories in a single bound using the "Revision 1.5 1998/12/18 16:16:31 rmb1" version of /usr/bin/rename (a Perl script):

find /tmp/ -depth -name "* *" -execdir rename 's/ /_/g' "{}" \;

oevna ,Jan 1, 2016 at 8:25

I use:
for f in *\ *; do mv "$f" "${f// /_}"; done

Though it's not recursive, it's quite fast and simple. I'm sure someone here could update it to be recursive.

The ${f// /_} part utilizes bash's parameter expansion mechanism to replace a pattern within a parameter with supplied string. The relevant syntax is ${parameter/pattern/string} . See: https://www.gnu.org/software/bash/manual/html_node/Shell-Parameter-Expansion.html or http://wiki.bash-hackers.org/syntax/pe .

armandino ,Dec 3, 2013 at 20:51

find . -depth -name '* *' \
| while IFS= read -r f ; do mv -i "$f" "$(dirname "$f")/$(basename "$f"|tr ' ' _)" ; done

I failed to get it right at first, because I didn't think of directories.

Edmund Elmer ,Jul 3 at 7:12

you can use detox by Doug Harple
detox -r <folder>

Dennis Williamson ,Mar 22, 2012 at 20:33

A find/rename solution. rename is part of util-linux.

You need to descend depth first, because a whitespace filename can be part of a whitespace directory:

find /tmp/ -depth -name "* *" -execdir rename " " "_" "{}" ";"

armandino ,Apr 26, 2010 at 11:49

bash 4.0
#!/bin/bash
shopt -s globstar
for file in **/*\ *
do 
    mv "$file" "${file// /_}"       
done

Itamar ,Jan 31, 2013 at 21:27

you can use this:
find . -name '* *' | while read fname
do
    new_fname=`echo $fname | tr " " "_"`

    if [ -e "$new_fname" ]
    then
        echo "File $new_fname already exists. Not replacing $fname"
    else
        echo "Creating new file $new_fname to replace $fname"
        mv "$fname" "$new_fname"
    fi
done

yabt ,Apr 26, 2010 at 14:54

Here's a (quite verbose) find -exec solution which writes "file already exists" warnings to stderr:
function trspace() {
   declare dir name bname dname newname replace_char
   [ $# -lt 1 -o $# -gt 2 ] && { echo "usage: trspace dir char"; return 1; }
   dir="${1}"
   replace_char="${2:-_}"
   find "${dir}" -xdev -depth -name $'*[ \t\r\n\v\f]*' -exec bash -c '
      for ((i=1; i<=$#; i++)); do
         name="${@:i:1}"
         dname="${name%/*}"
         bname="${name##*/}"
         newname="${dname}/${bname//[[:space:]]/${0}}"
         if [[ -e "${newname}" ]]; then
            echo "Warning: file already exists: ${newname}" 1>&2
         else
            mv "${name}" "${newname}"
         fi
      done
  ' "${replace_char}" '{}' +
}

trspace rootdir _

degi ,Aug 8, 2011 at 9:10

This one does a little bit more. I use it to rename my downloaded torrents (no special characters (non-ASCII), spaces, multiple dots, etc.).
#!/usr/bin/perl

&rena(`find . -type d`);
&rena(`find . -type f`);

sub rena
{
    ($elems)=@_;
    @t=split /\n/,$elems;

    for $e (@t)
    {
    $_=$e;
    # remove ./ of find
    s/^\.\///;
    # non ascii transliterate
    tr [\200-\377][_];
    tr [\000-\40][_];
    # special characters we do not want in paths
    s/[ \-\,\;\?\+\'\"\!\[\]\(\)\@\#]/_/g;
    # multiple dots except for extension
    while (/\..*\./)
    {
        s/\./_/;
    }
    # only one _ consecutive
    s/_+/_/g;
    next if ($_ eq $e ) or ("./$_" eq $e);
    print "$e -> $_\n";
    rename ($e,$_);
    }
}
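degi's script predates strict and shells out to find; the same space-to-underscore renaming can be sketched with the core File::Find module. finddepth visits children before their parent directories, which is exactly the depth-first order the other answers call for (my sketch, under those assumptions):

#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# Depth-first traversal, so files are renamed before their parent
# directories change names out from under us. Inside the callback,
# $_ is the basename and the cwd is the containing directory.
finddepth( sub {
    return unless / /;                 # only names containing spaces
    ( my $new = $_ ) =~ s/ /_/g;
    if ( -e $new ) {
        warn "skipping $File::Find::name: $new already exists\n";
    }
    else {
        rename( $_, $new ) or warn "rename $File::Find::name: $!\n";
    }
}, '.' );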

Junyeop Lee ,Apr 10, 2018 at 9:44

A recursive version of Naidim's answer:
find . -name "* *" | awk '{ print length, $0 }' | sort -nr -s | cut -d" " -f2- | while read f; do base=$(basename "$f"); newbase="${base// /_}"; mv "$(dirname "$f")/$(basename "$f")" "$(dirname "$f")/$newbase"; done

ghoti ,Dec 5, 2016 at 21:16

I found this script lying around; it may be interesting :)
 IFS=$'\n';for f in `find .`; do file=$(echo $f | tr [:blank:] '_'); [ -e $f ] && [ ! -e $file ] && mv "$f" $file;done;unset IFS

ghoti ,Dec 5, 2016 at 21:17

Here's a reasonably sized bash script solution
#!/bin/bash
(
IFS=$'\n'
    for y in $(ls $1)
      do
         mv $1/`echo $y | sed 's/ /\\ /g'` $1/`echo "$y" | sed 's/ /_/g'`
      done
)

user1060059 ,Nov 22, 2011 at 15:15

This only finds files inside the current directory and renames them. I have this aliased.

find ./ -name "* *" -type f -d 1 | perl -ple '$file = $_; $file =~ s/\s+/_/g; rename($_, $file);'

Hongtao ,Sep 26, 2014 at 19:30

I just made one for my own purposes. You may use it as a reference.
#!/bin/bash
cd /vzwhome/c0cheh1/dev_source/UB_14_8
for file in *
do
    echo $file
    cd "/vzwhome/c0cheh1/dev_source/UB_14_8/$file/Configuration/$file"
    echo "==> `pwd`"
    for subfile in *\ *; do [ -d "$subfile" ] && ( mv "$subfile" "$(echo $subfile | sed -e 's/ /_/g')" ); done
    ls
    cd /vzwhome/c0cheh1/dev_source/UB_14_8
done

Marcos Jean Sampaio ,Dec 5, 2016 at 20:56

For files in folder named /files
for i in `IFS="";find /files -name *\ *`
do
   echo $i
done > /tmp/list


while read line
do
   mv "$line" `echo $line | sed 's/ /_/g'`
done < /tmp/list

rm /tmp/list

Muhammad Annaqeeb ,Sep 4, 2017 at 11:03

For those struggling through this using macOS, first install all the tools:
 brew install tree findutils rename

Then, when you need to rename, alias GNU find (gfind) as find, and run @Michel Krelin's code:

alias find=gfind 
find . -depth -name '* *' \
| while IFS= read -r f ; do mv -i "$f" "$(dirname "$f")/$(basename "$f"|tr ' ' _)" ; done

[Sep 07, 2019] As soon as you stop writing code on a regular basis you stop being a programmer. You lose your qualification very quickly. That's a typical tragedy of talented programmers who became mediocre managers or, worse, theoretical computer scientists

Programming skills are somewhat similar to the skills of people who play violin or piano. As soon as you stop playing, the skills start to evaporate: first slowly, then more quickly. In two years you will probably lose 80%.
Notable quotes:
"... I happened to look the other day. I wrote 35 programs in January, and 28 or 29 programs in February. These are small programs, but I have a compulsion. I love to write programs and put things into it. ..."
Sep 07, 2019 | archive.computerhistory.org

Dijkstra said he was proud to be a programmer. Unfortunately he changed his attitude completely, and I think he wrote his last computer program in the 1980s. At this conference I went to in 1967 about simulation language, Chris Strachey was going around asking everybody at the conference what was the last computer program you wrote. This was 1967. Some of the people said, "I've never written a computer program." Others would say, "Oh yeah, here's what I did last week." I asked Edsger this question when I visited him in Texas in the 90s and he said, "Don, I write programs now with pencil and paper, and I execute them in my head." He finds that a good enough discipline.

I think he was mistaken on that. He taught me a lot of things, but I really think that if he had continued... One of Dijkstra's greatest strengths was that he felt a strong sense of aesthetics, and he didn't want to compromise his notions of beauty. They were so intense that when he visited me in the 1960s, I had just come to Stanford. I remember the conversation we had. It was in the first apartment, our little rented house, before we had electricity in the house.

We were sitting there in the dark, and he was telling me how he had just learned about the specifications of the IBM System/360, and it made him so ill that his heart was actually starting to flutter.

He intensely disliked things that he didn't consider clean to work with. So I can see that he would have distaste for the languages that he had to work with on real computers. My reaction to that was to design my own language, and then make Pascal so that it would work well for me in those days. But his response was to do everything only intellectually.

So, programming.

I happened to look the other day. I wrote 35 programs in January, and 28 or 29 programs in February. These are small programs, but I have a compulsion. I love to write programs and put things into it. I think of a question that I want to answer, or I have part of my book where I want to present something. But I can't just present it by reading about it in a book. As I code it, it all becomes clear in my head. It's just the discipline. The fact that I have to translate my knowledge of this method into something that the machine is going to understand just forces me to make that crystal-clear in my head. Then I can explain it to somebody else infinitely better. The exposition is always better if I've implemented it, even though it's going to take me more time.

[Sep 07, 2019] Knuth about computer science and money: At that point I made the decision in my life that I wasn't going to optimize my income;

Sep 07, 2019 | archive.computerhistory.org

So I had a programming hat when I was outside of Cal Tech, and at Cal Tech I am a mathematician taking my grad studies. A startup company, called Green Tree Corporation because green is the color of money, came to me and said, "Don, name your price. Write compilers for us and we will take care of finding computers for you to debug them on, and assistance for you to do your work. Name your price." I said, "Oh, okay. $100,000," assuming that this was an impossible number. In that era this was not quite at Bill Gates' level today, but it was sort of out there.

The guy didn't blink. He said, "Okay." I didn't really blink either. I said, "Well, I'm not going to do it. I just thought this was an impossible number."

At that point I made the decision in my life that I wasn't going to optimize my income; I was really going to do what I thought I could do for... well, I don't know. If you ask me what makes me most happy, number one would be somebody saying "I learned something from you". Number two would be somebody saying "I used your software". But number infinity would be... Well, no. Number infinity minus one would be "I bought your book". It's not as good as "I read your book", you know. Then there is "I bought your software"; that was not high in my own personal values. So that decision came up. I kept up with the literature about compilers. The Communications of the ACM was where the action was. I also worked with people on trying to debug the ALGOL language, which had problems. I published a few papers, like "The Remaining Trouble Spots in ALGOL 60". I chaired a committee called "Smallgol", which was to find a subset of ALGOL that would work on small computers. I was active in programming languages.

[Sep 07, 2019] Knuth: maybe 1 in 50 people have the "computer scientist's" type of intellect

Sep 07, 2019 | conservancy.umn.edu

Frana: You have made the comment several times that maybe 1 in 50 people have the "computer scientist's mind."

Knuth: Yes.

Frana: I am wondering if a large number of those people are trained professional librarians? [laughter] There is some strangeness there. But can you pinpoint what it is about the mind of the computer scientist that is....

Knuth: That is different?

Frana: What are the characteristics?

Knuth: Two things: one is the ability to deal with non-uniform structure, where you have case one, case two, case three, case four. Or that you have a model of something where the first component is integer, the next component is a Boolean, and the next component is a real number, or something like that, you know, non-uniform structure. To deal fluently with those kinds of entities, which is not typical in other branches of mathematics, is critical. And the other characteristic ability is to shift levels quickly, from looking at something in the large to looking at something in the small, and many levels in between, jumping from one level of abstraction to another. You know that, when you are adding one to some number, that you are actually getting closer to some overarching goal. These skills, being able to deal with nonuniform objects and to see through things from the top level to the bottom level, these are very essential to computer programming, it seems to me. But maybe I am fooling myself because I am too close to it.

Frana: It is the hardest thing to really understand that which you are existing within.

Knuth: Yes.

[Sep 07, 2019] Knuth: I can be a writer, who tries to organize other people's ideas into some kind of a more coherent structure so that it is easier to put things together

Sep 07, 2019 | conservancy.umn.edu

Knuth: I can be a writer, who tries to organize other people's ideas into some kind of a more coherent structure so that it is easier to put things together. I can see that I could be viewed as a scholar that does his best to check out sources of material, so that people get credit where it is due. And to check facts over, not just to look at the abstract of something, but to see what the methods were that did it and to fill in holes if necessary. I look at my role as being able to understand the motivations and terminology of one group of specialists and boil it down to a certain extent so that people in other parts of the field can use it. I try to listen to the theoreticians and select what they have done that is important to the programmer on the street; to remove technical jargon when possible.

But I have never been good at any kind of a role that would be making policy, or advising people on strategies, or what to do. I have always been best at refining things that are there and bringing order out of chaos. I sometimes raise new ideas that might stimulate people, but not really in a way that would be in any way controlling the flow. The only time I have ever advocated something strongly was with literate programming; but I do this always with the caveat that it works for me, not knowing if it would work for anybody else.

When I work with a system that I have created myself, I can always change it if I don't like it. But everybody who works with my system has to work with what I give them. So I am not able to judge my own stuff impartially. So anyway, I have always felt bad about if anyone says, 'Don, please forecast the future,'...

[Sep 06, 2019] Python vs. Ruby Which is best for web development Opensource.com

Sep 06, 2019 | opensource.com

Python was developed organically in the scientific space as a prototyping language that easily could be translated into C++ if a prototype worked. This happened long before it was first used for web development. Ruby, on the other hand, became a major player specifically because of web development; the Rails framework extended Ruby's popularity with people developing complex websites.

Which programming language best suits your needs? Here is a quick overview of each language to help you choose:

Approach: one best way vs. human-language Python

Python takes a direct approach to programming. Its main goal is to make everything obvious to the programmer. In Python, there is only one "best" way to do something. This philosophy has led to a language strict in layout.

Python's core philosophy consists of three key hierarchical principles: explicit is better than implicit, simple is better than complex, and complex is better than complicated.

This regimented philosophy results in Python being eminently readable and easy to learn -- which is why Python is great for beginning coders. Python has a big foothold in introductory programming courses. Its syntax is very simple, with little to remember. Because its code structure is explicit, the developer can easily tell where everything comes from, making it relatively easy to debug.

Python's hierarchy of principles is evident in many aspects of the language. Its use of whitespace to do flow control as a core part of the language syntax differs from most other languages, including Ruby. The way you indent code determines the meaning of its action. This use of whitespace is a prime example of Python's "explicit" philosophy: the shape a Python app takes spells out its logic and how the app will act.

Ruby

In contrast to Python, Ruby focuses on "human-language" programming, and its code reads like a verbal language rather than a machine-based one, which many programmers, both beginners and experts, like. Ruby follows the principle of " least astonishment ," and offers myriad ways to do the same thing. These similar methods can have multiple names, which many developers find confusing and frustrating.

Unlike Python, Ruby makes use of "blocks," a first-class object that is treated as a unit within a program. In fact, Ruby takes the concept of OOP (Object-Oriented Programming) to its limit. Everything is an object -- even global variables are actually represented within the ObjectSpace object. Classes and modules are themselves objects, and functions and operators are methods of objects. This ability makes Ruby especially powerful, especially when combined with its other primary strength: functional programming and the use of lambdas.

In addition to blocks and functional programming, Ruby provides programmers with many other features, including fragmentation, hashable and unhashable types, and mutable strings.

Ruby's fans find its elegance to be one of its top selling points. At the same time, Ruby's "magical" features and flexibility can make it very hard to track down bugs.

Communities: stability vs. innovation

Although features and coding philosophy are the primary drivers for choosing a given language, the strength of a developer community also plays an important role. Fortunately, both Python and Ruby boast strong communities.

Python

Python's community already includes a large Linux and academic community and therefore offers many academic use cases in both math and science. That support gives the community a stability and diversity that only grows as Python increasingly is used for web development.

Ruby

However, Ruby's community has focused primarily on web development from the get-go. It tends to innovate more quickly than the Python community, but this innovation also causes more things to break. In addition, while it has gotten more diverse, it has yet to reach the level of diversity that Python has.

Final thoughts

For web development, Ruby has Rails and Python has Django. Both are powerful frameworks, so when it comes to web development, you can't go wrong with either language. Your decision will ultimately come down to your level of experience and your philosophical preferences.

If you plan to focus on building web applications, Ruby is popular and flexible. There is a very strong community built upon it and they are always on the bleeding edge of development.

If you are interested in building web applications and would like to learn a language that's used more generally, try Python. You'll get a diverse community and lots of influence and support from the various industries in which it is used.

Tom Radcliffe - Tom Radcliffe has over 20 years experience in software development and management in both academia and industry. He is a professional engineer (PEO and APEGBC) and holds a PhD in physics from Queen's University at Kingston. Tom brings a passion for quantitative, data-driven processes to ActiveState .

[Sep 06, 2019] Knuth: Programming and architecture are interrelated and it is impossible to create good architecture without actually programming at least a prototype

Notable quotes:
"... When you're writing a document for a human being to understand, the human being will look at it and nod his head and say, "Yeah, this makes sense." But then there's all kinds of ambiguities and vagueness that you don't realize until you try to put it into a computer. Then all of a sudden, almost every five minutes as you're writing the code, a question comes up that wasn't addressed in the specification. "What if this combination occurs?" ..."
"... When you're faced with implementation, a person who has been delegated this job of working from a design would have to say, "Well hmm, I don't know what the designer meant by this." ..."
Sep 06, 2019 | archive.computerhistory.org

...I showed the second version of this design to two of my graduate students, and I said, "Okay, implement this, please, this summer. That's your summer job." I thought I had specified a language. I had to go away. I spent several weeks in China during the summer of 1977, and I had various other obligations. I assumed that when I got back from my summer trips, I would be able to play around with TeX and refine it a little bit. To my amazement, the students, who were outstanding students, had not completed it. They had a system that was able to do about three lines of TeX. I thought, "My goodness, what's going on? I thought these were good students." Well afterwards I changed my attitude to saying, "Boy, they accomplished a miracle."

Because going from my specification, which I thought was complete, they really had an impossible task, and they had succeeded wonderfully with it. These students, by the way, [were] Michael Plass, who has gone on to be the brains behind almost all of Xerox's Docutech software and all kind of things that are inside of typesetting devices now, and Frank Liang, one of the key people for Microsoft Word.

He did important mathematical things as well as his hyphenation methods which are quite used in all languages now. These guys were actually doing great work, but I was amazed that they couldn't do what I thought was just sort of a routine task. Then I became a programmer in earnest, where I had to do it. The reason is when you're doing programming, you have to explain something to a computer, which is dumb.

When you're writing a document for a human being to understand, the human being will look at it and nod his head and say, "Yeah, this makes sense." But then there's all kinds of ambiguities and vagueness that you don't realize until you try to put it into a computer. Then all of a sudden, almost every five minutes as you're writing the code, a question comes up that wasn't addressed in the specification. "What if this combination occurs?"

It just didn't occur to the person writing the design specification. When you're faced with implementation, a person who has been delegated this job of working from a design would have to say, "Well hmm, I don't know what the designer meant by this."

If I hadn't been in China they would've scheduled an appointment with me and stopped their programming for a day. Then they would come in at the designated hour and we would talk. They would take 15 minutes to present to me what the problem was, and then I would think about it for a while, and then I'd say, "Oh yeah, do this. " Then they would go home and they would write code for another five minutes and they'd have to schedule another appointment.

I'm probably exaggerating, but this is why I think Bob Floyd's Chiron compiler never got going. Bob worked many years on a beautiful idea for a programming language, where he designed a language called Chiron, but he never touched the programming himself. I think this was actually the reason that he had trouble with that project, because it's so hard to do the design unless you're faced with the low-level aspects of it, explaining it to a machine instead of to another person.

It was Forsythe, I think, who said, "People have said traditionally that you don't understand something until you've taught it in a class. The truth is you don't really understand something until you've taught it to a computer, until you've been able to program it." At this level, programming was absolutely important.

[Sep 06, 2019] Knuth: No, I stopped going to conferences. It was too discouraging. Computer programming keeps getting harder because more stuff is discovered

Sep 06, 2019 | conservancy.umn.edu

Knuth: No, I stopped going to conferences. It was too discouraging. Computer programming keeps getting harder because more stuff is discovered. I can cope with learning about one new technique per day, but I can't take ten in a day all at once. So conferences are depressing; it means I have so much more work to do. If I hide myself from the truth I am much happier.

[Sep 06, 2019] How TAOCP was hatched

Notable quotes:
"... Also, Addison-Wesley was the people who were asking me to do this book; my favorite textbooks had been published by Addison Wesley. They had done the books that I loved the most as a student. For them to come to me and say, "Would you write a book for us?", and here I am just a secondyear gradate student -- this was a thrill. ..."
"... But in those days, The Art of Computer Programming was very important because I'm thinking of the aesthetical: the whole question of writing programs as something that has artistic aspects in all senses of the word. The one idea is "art" which means artificial, and the other "art" means fine art. All these are long stories, but I've got to cover it fairly quickly. ..."
Sep 06, 2019 | archive.computerhistory.org

Knuth: This is, of course, really the story of my life, because I hope to live long enough to finish it. But I may not, because it's turned out to be such a huge project. I got married in the summer of 1961, after my first year of graduate school. My wife finished college, and I could use the money I had made -- the $5000 on the compiler -- to finance a trip to Europe for our honeymoon.

We had four months of wedded bliss in Southern California, and then a man from Addison-Wesley came to visit me and said "Don, we would like you to write a book about how to write compilers."

The more I thought about it, I decided "Oh yes, I've got this book inside of me."

I sketched out that day -- I still have the sheet of tablet paper on which I wrote -- I sketched out 12 chapters that I thought ought to be in such a book. I told Jill, my wife, "I think I'm going to write a book."

As I say, we had four months of bliss, because the rest of our marriage has all been devoted to this book. Well, we still have had happiness. But really, I wake up every morning and I still haven't finished the book. So I try to -- I have to -- organize the rest of my life around this, as one main unifying theme. The book was supposed to be about how to write a compiler. They had heard about me from one of their editorial advisors, that I knew something about how to do this. The idea appealed to me for two main reasons. One is that I did enjoy writing. In high school I had been editor of the weekly paper. In college I was editor of the science magazine, and I worked on the campus paper as copy editor. And, as I told you, I wrote the manual for that compiler that we wrote. I enjoyed writing, number one.

Also, Addison-Wesley was the people who were asking me to do this book; my favorite textbooks had been published by Addison Wesley. They had done the books that I loved the most as a student. For them to come to me and say, "Would you write a book for us?", and here I am just a secondyear gradate student -- this was a thrill.

Another very important reason at the time was that I knew that there was a great need for a book about compilers, because there were a lot of people who even in 1962 -- this was January of 1962 -- were starting to rediscover the wheel. The knowledge was out there, but it hadn't been explained. The people who had discovered it, though, were scattered all over the world and they didn't know of each other's work either, very much. I had been following it. Everybody I could think of who could write a book about compilers, as far as I could see, they would only give a piece of the fabric. They would slant it to their own view of it. There might be four people who could write about it, but they would write four different books. I could present all four of their viewpoints in what I would think was a balanced way, without any axe to grind, without slanting it towards something that I thought would be misleading to the compiler writer for the future. I considered myself as a journalist, essentially. I could be the expositor, the tech writer, that could do the job that was needed in order to take the work of these brilliant people and make it accessible to the world. That was my motivation. Now, I didn't have much time to spend on it then, I just had this page of paper with 12 chapter headings on it. That's all I could do while I'm a consultant at Burroughs and doing my graduate work. I signed a contract, but they said "We know it'll take you a while." I didn't really begin to have much time to work on it until 1963, my third year of graduate school, as I'm already finishing up on my thesis. In the summer of '62, I guess I should mention, I wrote another compiler. This was for Univac; it was a FORTRAN compiler. I spent the summer, I sold my soul to the devil, I guess you say, for three months in the summer of 1962 to write a FORTRAN compiler. I believe that the salary for that was $15,000, which was much more than an assistant professor. I think assistant professors were getting eight or nine thousand in those days.

Feigenbaum: Well, when I started in 1960 at [University of California] Berkeley, I was getting $7,600 for the nine-month year.

Knuth: Knuth: Yeah, so you see it. I got $15,000 for a summer job in 1962 writing a FORTRAN compiler. One day during that summer I was writing the part of the compiler that looks up identifiers in a hash table. The method that we used is called linear probing. Basically you take the variable name that you want to look up, you scramble it, like you square it or something like this, and that gives you a number between one and, well in those days it would have been between 1 and 1000, and then you look there. If you find it, good; if you don't find it, go to the next place and keep on going until you either get to an empty place, or you find the number you're looking for. It's called linear probing. There was a rumor that one of Professor Feller's students at Princeton had tried to figure out how fast linear probing works and was unable to succeed. This was a new thing for me. It was a case where I was doing programming, but I also had a mathematical problem that would go into my other [job]. My winter job was being a math student, my summer job was writing compilers. There was no mix. These worlds did not intersect at all in my life at that point. So I spent one day during the summer while writing the compiler looking at the mathematics of how fast does linear probing work. I got lucky, and I solved the problem. I figured out some math, and I kept two or three sheets of paper with me and I typed it up. ["Notes on 'Open' Addressing', 7/22/63] I guess that's on the internet now, because this became really the genesis of my main research work, which developed not to be working on compilers, but to be working on what they call analysis of algorithms, which is, have a computer method and find out how good is it quantitatively. I can say, if I got so many things to look up in the table, how long is linear probing going to take. It dawned on me that this was just one of many algorithms that would be important, and each one would lead to a fascinating mathematical problem. This was easily a good lifetime source of rich problems to work on. Here I am then, in the middle of 1962, writing this FORTRAN compiler, and I had one day to do the research and mathematics that changed my life for my future research trends. But now I've gotten off the topic of what your original question was.
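Knuth's description of linear probing above is concrete enough to turn straight into code. A toy Perl sketch of the method he describes, with an arbitrary table size and hash function of my own choosing:

#!/usr/bin/perl
use strict;
use warnings;

# Linear probing as described: scramble the key into a home slot, then
# walk forward (wrapping around) until we hit the key itself or an
# empty slot. Table size and hash function are arbitrary choices here.
my $SIZE  = 1000;
my @table = (undef) x $SIZE;

sub slot_for {
    my ($key) = @_;
    my $h = 0;
    $h = ( $h * 31 + ord $_ ) % $SIZE for split //, $key;   # scramble the name
    for ( my $i = 0; $i < $SIZE; $i++ ) {
        my $j = ( $h + $i ) % $SIZE;
        return $j if !defined $table[$j] || $table[$j] eq $key;
    }
    die "table full";
}

sub insert { my ($k) = @_; $table[ slot_for($k) ] = $k; }
sub lookup { my ($k) = @_; return defined $table[ slot_for($k) ]; }

insert($_) for qw(alpha beta gamma);
print lookup("beta")  ? "found\n" : "missing\n";    # found
print lookup("delta") ? "found\n" : "missing\n";    # missing

The quantitative question Knuth answered that day is exactly how many probes the for loop above performs on average as the table fills up.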

Feigenbaum: We were talking about sort of the.. You talked about the embryo of The Art of Computing. The compiler book morphed into The Art of Computer Programming, which became a seven-volume plan.

Knuth: Exactly. Anyway, I'm working on a compiler and I'm thinking about this. But now I'm starting, after I finish this summer job, then I began to do things that were going to be relating to the book. One of the things I knew I had to have in the book was an artificial machine, because I'm writing a compiler book but machines are changing faster than I can write books. I have to have a machine that I'm totally in control of. I invented this machine called MIX, which was typical of the computers of 1962.

In 1963 I wrote a simulator for MIX so that I could write sample programs for it, and I taught a class at Caltech on how to write programs in assembly language for this hypothetical computer. Then I started writing the parts that dealt with sorting problems and searching problems, like the linear probing idea. I began to write those parts, which are part of a compiler, of the book. I had several hundred pages of notes gathering for those chapters for The Art of Computer Programming. Before I graduated, I've already done quite a bit of writing on The Art of Computer Programming.

I met George Forsythe about this time. George was the man who inspired both of us [Knuth and Feigenbaum] to come to Stanford during the '60s. George came down to Southern California for a talk, and he said, "Come up to Stanford. How about joining our faculty?" I said "Oh no, I can't do that. I just got married, and I've got to finish this book first." I said, "I think I'll finish the book next year, and then I can come up [and] start thinking about the rest of my life, but I want to get my book done before my son is born." Well, John is now 40-some years old and I'm not done with the book. Part of my lack of expertise is any good estimation procedure as to how long projects are going to take. I way underestimated how much needed to be written about in this book. Anyway, I started writing the manuscript, and I went merrily along writing pages of things that I thought really needed to be said. Of course, it didn't take long before I had started to discover a few things of my own that weren't in any of the existing literature. I did have an axe to grind. The message that I was presenting was in fact not going to be unbiased at all. It was going to be based on my own particular slant on stuff, and that original reason for why I should write the book became impossible to sustain. But the fact that I had worked on linear probing and solved the problem gave me a new unifying theme for the book. I was going to base it around this idea of analyzing algorithms, and have some quantitative ideas about how good methods were. Not just that they worked, but that they worked well: this method worked 3 times better than this method, or 3.1 times better than this method. Also, at this time I was learning mathematical techniques that I had never been taught in school. I found they were out there, but they just hadn't been emphasized openly, about how to solve problems of this kind.

So my book would also present a different kind of mathematics than was common in the curriculum at the time, that was very relevant to analysis of algorithms. I went to the publishers, I went to Addison Wesley, and said "How about changing the title of the book from 'The Art of Computer Programming' to 'The Analysis of Algorithms'." They said that will never sell; their focus group couldn't buy that one. I'm glad they stuck to the original title, although I'm also glad to see that several books have now come out called "The Analysis of Algorithms", 20 years down the line.

But in those days, The Art of Computer Programming was very important because I'm thinking of the aesthetical: the whole question of writing programs as something that has artistic aspects in all senses of the word. The one idea is "art" which means artificial, and the other "art" means fine art. All these are long stories, but I've got to cover it fairly quickly.

I've got The Art of Computer Programming started out, and I'm working on my 12 chapters. I finish a rough draft of all 12 chapters by, I think it was like 1965. I've got 3,000 pages of notes, including a very good example of what you mentioned about seeing holes in the fabric. One of the most important chapters in the book is parsing: going from somebody's algebraic formula and figuring out the structure of the formula. Just the way I had done in seventh grade finding the structure of English sentences, I had to do this with mathematical sentences.

Chapter ten is all about parsing of context-free language, [which] is what we called it at the time. I covered what people had published about context-free languages and parsing. I got to the end of the chapter and I said, well, you can combine these ideas and these ideas, and all of a sudden you get a unifying thing which goes all the way to the limit. These other ideas had sort of gone partway there. They would say "Oh, if a grammar satisfies this condition, I can do it efficiently." "If a grammar satisfies this condition, I can do it efficiently." But now, all of a sudden, I saw there was a way to say I can find the most general condition that can be done efficiently without looking ahead to the end of the sentence. That you could make a decision on the fly, reading from left to right, about the structure of the thing. That was just a natural outgrowth of seeing the different pieces of the fabric that other people had put together, and writing it into a chapter for the first time. But I felt that this general concept, well, I didn't feel that I had surrounded the concept. I knew that I had it, and I could prove it, and I could check it, but I couldn't really intuit it all in my head. I knew it was right, but it was too hard for me, really, to explain it well.

So I didn't put it in The Art of Computer Programming. I thought it was beyond the scope of my book. Textbooks don't have to cover everything when you get to the harder things; then you have to go to the literature. My idea at that time [is] I'm writing this book and I'm thinking it's going to be published very soon, so any little things I discover and put in the book I didn't bother to write a paper and publish in the journal because I figure it'll be in my book pretty soon anyway. Computer science is changing so fast, my book is bound to be obsolete.

It takes a year for it to go through editing, and people drawing the illustrations, and then they have to print it and bind it and so on. I have to be a little bit ahead of the state-of-the-art if my book isn't going to be obsolete when it comes out. So I kept most of the stuff to myself that I had, these little ideas I had been coming up with. But when I got to this idea of left-to-right parsing, I said "Well here's something I don't really understand very well. I'll publish this, let other people figure out what it is, and then they can tell me what I should have said." I published that paper I believe in 1965, at the end of finishing my draft of the chapter, which didn't get as far as that story, LR(k). Well now, textbooks of computer science start with LR(k) and take off from there. But I want to give you an idea of

[Sep 03, 2019] bash - How to convert strings like 19-FEB-12 to epoch date in UNIX - Stack Overflow

Feb 11, 2013 | stackoverflow.com


hellish ,Feb 11, 2013 at 3:45

In UNIX how to convert to epoch milliseconds date strings like:
19-FEB-12
16-FEB-12
05-AUG-09

I need this to compare these dates with the current time on the server.


To convert a date to seconds since the epoch:
date --date="19-FEB-12" +%s

Current epoch:

date +%s

So, since your dates are in the past:

NOW=`date +%s`
THEN=`date --date="19-FEB-12" +%s`

let DIFF=$NOW-$THEN
echo "The difference is: $DIFF"

Using BSD's date command, you would need

$ date -j -f "%d-%B-%y" 19-FEB-12 +%s

Differences from GNU date :

  1. -j prevents date from trying to set the clock
  2. The input format must be explicitly set with -f
  3. The input date is a regular argument, not an option (viz. -d )
  4. When no time is specified with the date, use the current time instead of midnight.
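
Putting the two variants together: a minimal portability sketch, assuming only the GNU and BSD behaviors shown above (the format string mirrors the BSD example):

to_epoch() {
    # GNU date understands --date; BSD date does not, so this fails there
    if date --date="$1" +%s 2>/dev/null; then
        return 0
    fi
    # BSD fallback: -j (don't set the clock), -f (explicit input format)
    date -j -f "%d-%B-%y" "$1" +%s
}

to_epoch 19-FEB-12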

[Sep 03, 2019] command line - How do I convert an epoch timestamp to a human readable format on the cli - Unix Linux Stack Exchange

Sep 03, 2019 | unix.stackexchange.com

Gilles ,Oct 11, 2010 at 18:14

date -d @1190000000 Replace 1190000000 with your epoch

Stefan Lasiewski ,Oct 11, 2010 at 18:04

$ echo 1190000000 | perl -pe 's/(\d+)/localtime($1)/e' 
Sun Sep 16 20:33:20 2007

This can come in handy for those applications which use epoch time in the logfiles:

$ tail -f /var/log/nagios/nagios.log | perl -pe 's/(\d+)/localtime($1)/e'
[Thu May 13 10:15:46 2010] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;HOSTA;check_raid;0;check_raid.pl: OK (Unit 0 on Controller 0 is OK)

Stéphane Chazelas ,Jul 31, 2015 at 20:24

With bash-4.2 or above:
printf '%(%F %T)T\n' 1234567890

(where %F %T is the strftime() -type format)

That syntax is inspired from ksh93 .

In ksh93 however, the argument is taken as a date expression where various and hardly documented formats are supported.

For a Unix epoch time, the syntax in ksh93 is:

printf '%(%F %T)T\n' '#1234567890'

ksh93 however seems to use its own algorithm for the timezone and can get it wrong. For instance, in Britain, it was summer time all year in 1970, but:

$ TZ=Europe/London bash -c 'printf "%(%c)T\n" 0'
Thu 01 Jan 1970 01:00:00 BST
$ TZ=Europe/London ksh93 -c 'printf "%(%c)T\n" "#0"'
Thu Jan  1 00:00:00 1970

DarkHeart ,Jul 28, 2014 at 3:56

Custom format with GNU date :
date -d @1234567890 +'%Y-%m-%d %H:%M:%S'

Or with GNU awk :

awk 'BEGIN { print strftime("%Y-%m-%d %H:%M:%S", 1234567890); }'

Linked SO question: https://stackoverflow.com/questions/3249827/convert-from-unixtime-at-command-line


The two I frequently use are:
$ perl -leprint\ scalar\ localtime\ 1234567890
Sat Feb 14 00:31:30 2009

[Sep 02, 2019] bash - Pretty-print for shell script

Oct 21, 2010 | stackoverflow.com



Benoit ,Oct 21, 2010 at 13:19

I'm looking for something similar to indent but for (bash) scripts. Console only, no colorizing, etc.

Do you know of one ?

Jamie ,Sep 11, 2012 at 3:00

Vim can indent bash scripts, but it will not reformat them before indenting.
Backup your bash script, open it with vim, type gg=GZZ and indent will be corrected. (Note for the impatient: this overwrites the file, so be sure to do that backup!)

There are some bugs with << heredocs, though (vim expects the terminating EOF to be the first character on its line), e.g.

EDIT: ZZ not ZQ
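
If you would rather not do this interactively, vim can also be driven in batch mode. A minimal, hedged sketch (same caveat as above: it rewrites the file, so keep the backup):

cp script.sh script.sh.bak                      # the edit below is in place
vim -e -s -c 'normal gg=G' -c 'wq' script.sh    # re-indent the whole file and save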

Daniel Martí ,Apr 8, 2018 at 13:52

A bit late to the party, but it looks like shfmt could do the trick for you.

Brian Chrisman ,Aug 11 at 4:08

In bash I do this:
reindent() {
source <(echo "Zibri () {";cat "$1"; echo "}")
declare -f Zibri|head --lines=-1|tail --lines=+3 | sed -e "s/^\s\s\s\s//"
}

This eliminates comments and re-indents the script the "bash way".

If you have HEREDOCs in your script, they get mangled by the sed in the previous function.

So use:

reindent() {
source <(echo "Zibri () {";cat "$1"; echo "}")
declare -f Zibri|head --lines=-1|tail --lines=+3"
}

But your whole script will then have 4-space indentation.

Or you can do:

reindent () 
{ 
    rstr=$(cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 16 | head -n 1);
    source <(echo "Zibri () {";cat "$1"|sed -e "s/^\s\s\s\s/$rstr/"; echo "}");
    echo '#!/bin/bash';
    declare -f Zibri | head --lines=-1 | tail --lines=+3 | sed -e "s/^\s\s\s\s//;s/$rstr/    /"
}

which also takes care of heredocs.

Pius Raeder ,Jan 10, 2017 at 8:35

Found this http://www.linux-kheops.com/doc/perl/perl-aubert/fmt.script .

Very nice; the only thing I took out is the [...]->test substitution.

[Sep 02, 2019] How to get the current line number of a file open using Perl

Sep 02, 2019 | stackoverflow.com



tadmc ,May 8, 2011 at 17:08

open my $fp, '<', $file or die $!;

while (<$fp>) {
    my $line = $_;
    if ($line =~ /$regex/) {
        # How do I find out which line number this match happened at?
    }
}

close $fp;

tchrist ,Apr 22, 2015 at 21:16

Use $. (see perldoc perlvar ).
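
Applied to the loop from the question, that looks like this (a minimal sketch):

while (<$fp>) {
    if (/$regex/) {
        print "matched at line $.\n";   # $. is the line number of the last filehandle read
    }
}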

tchrist ,May 7, 2011 at 16:48

You can also do it through OO interface:
use IO::Handle;
# later on ...
my $n = $fp->input_line_number();

This is in perldoc perlvar , too.


Don't use $. , nor $_ or any global variable. Use this instead:
while(my $line = <FILE>) {
  print $line unless ${\*FILE}->input_line_number == 1;
}

To avoid this and a lot of other Perl gotchas you can use Atom or VSCode packages like linter-perl . Stop making Perl a write-only language!

[Aug 31, 2019] Complexity prevents programmers from ever learning the whole language; only a subset is learned and used

Aug 31, 2019 | ask.slashdot.org

Re:Neither! (Score 2, Interesting) by M. D. Nahas on Friday December 23, 2005 @06:08PM (#14329127) Attached to: Learning Java or C# as a Next Language?

The cleanest languages I've used are C, Java, and OCaml. By "clean", I mean the language has a few concepts that can be completely memorized, which results in less "gotchas" and manual reading. For these languages, you'll see small manuals (e.g., K&R's book for C) which cover the complete language and then lots of pages devoted to the libraries that come with the language. I'd definitely recommend Java (or C, or OCaml) over C# for this reason. C# seems to have combined every feature of C++, Java, and VBA into a single language. It is very complex and has a ton of concepts, for which I could never memorize the whole language. I have a feeling that most programmers will use the subset of C# that is closest to the language they understand, whether it is C++, Java or VBA. You might as well learn Java's style of programming, and then, if needed, switch to C# using its Java-like features.

[Aug 29, 2019] How do I parse command line arguments in Bash - Stack Overflow

Jul 10, 2017 | stackoverflow.com

Livven, Jul 10, 2017 at 8:11

Update: It's been more than 5 years since I started this answer. Thank you for LOTS of great edits/comments/suggestions. In order to save maintenance time, I've modified the code block to be 100% copy-paste ready. Please do not post comments like "What if you changed X to Y ". Instead, copy-paste the code block, see the output, make the change, rerun the script, and comment "I changed X to Y and " I don't have time to test your ideas and tell you if they work.
Method #1: Using bash without getopt[s]

Two common ways to pass key-value-pair arguments are:

Bash Space-Separated (e.g., --option argument ) (without getopt[s])

Usage demo-space-separated.sh -e conf -s /etc -l /usr/lib /etc/hosts

cat >/tmp/demo-space-separated.sh <<'EOF'
#!/bin/bash

POSITIONAL=()
while [[ $# -gt 0 ]]
do
key="$1"

case $key in
    -e|--extension)
    EXTENSION="$2"
    shift # past argument
    shift # past value
    ;;
    -s|--searchpath)
    SEARCHPATH="$2"
    shift # past argument
    shift # past value
    ;;
    -l|--lib)
    LIBPATH="$2"
    shift # past argument
    shift # past value
    ;;
    --default)
    DEFAULT=YES
    shift # past argument
    ;;
    *)    # unknown option
    POSITIONAL+=("$1") # save it in an array for later
    shift # past argument
    ;;
esac
done
set -- "${POSITIONAL[@]}" # restore positional parameters

echo "FILE EXTENSION  = ${EXTENSION}"
echo "SEARCH PATH     = ${SEARCHPATH}"
echo "LIBRARY PATH    = ${LIBPATH}"
echo "DEFAULT         = ${DEFAULT}"
echo "Number files in SEARCH PATH with EXTENSION:" $(ls -1 "${SEARCHPATH}"/*."${EXTENSION}" | wc -l)
if [[ -n $1 ]]; then
    echo "Last line of file specified as non-opt/last argument:"
    tail -1 "$1"
fi
EOF

chmod +x /tmp/demo-space-separated.sh

/tmp/demo-space-separated.sh -e conf -s /etc -l /usr/lib /etc/hosts

output from copy-pasting the block above:

FILE EXTENSION  = conf
SEARCH PATH     = /etc
LIBRARY PATH    = /usr/lib
DEFAULT         =
Number files in SEARCH PATH with EXTENSION: 14
Last line of file specified as non-opt/last argument:
#93.184.216.34    example.com
Bash Equals-Separated (e.g., --option=argument ) (without getopt[s])

Usage demo-equals-separated.sh -e=conf -s=/etc -l=/usr/lib /etc/hosts

cat >/tmp/demo-equals-separated.sh <<'EOF'
#!/bin/bash

for i in "$@"
do
case $i in
    -e=*|--extension=*)
    EXTENSION="${i#*=}"
    shift # past argument=value
    ;;
    -s=*|--searchpath=*)
    SEARCHPATH="${i#*=}"
    shift # past argument=value
    ;;
    -l=*|--lib=*)
    LIBPATH="${i#*=}"
    shift # past argument=value
    ;;
    --default)
    DEFAULT=YES
    shift # past argument with no value
    ;;
    *)
          # unknown option
    ;;
esac
done
echo "FILE EXTENSION  = ${EXTENSION}"
echo "SEARCH PATH     = ${SEARCHPATH}"
echo "LIBRARY PATH    = ${LIBPATH}"
echo "DEFAULT         = ${DEFAULT}"
echo "Number files in SEARCH PATH with EXTENSION:" $(ls -1 "${SEARCHPATH}"/*."${EXTENSION}" | wc -l)
if [[ -n $1 ]]; then
    echo "Last line of file specified as non-opt/last argument:"
    tail -1 "$1"
fi
EOF

chmod +x /tmp/demo-equals-separated.sh

/tmp/demo-equals-separated.sh -e=conf -s=/etc -l=/usr/lib /etc/hosts

output from copy-pasting the block above:

FILE EXTENSION  = conf
SEARCH PATH     = /etc
LIBRARY PATH    = /usr/lib
DEFAULT         =
Number files in SEARCH PATH with EXTENSION: 14
Last line of file specified as non-opt/last argument:
#93.184.216.34    example.com

To better understand ${i#*=} search for "Substring Removal" in this guide . It is functionally equivalent to `sed 's/[^=]*=//' <<< "$i"` which calls a needless subprocess or `echo "$i" | sed 's/[^=]*=//'` which calls two needless subprocesses.
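
A quick illustration of that expansion:

i="--extension=conf"
echo "${i#*=}"    # prints "conf": strips everything up to and including the first '='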

Method #2: Using bash with getopt[s]

from: http://mywiki.wooledge.org/BashFAQ/035#getopts

getopt(1) limitations (older, relatively-recent getopt versions):

  1. can't handle arguments that are empty strings
  2. can't handle arguments with embedded whitespace

More recent getopt versions don't have these limitations.

Additionally, the POSIX shell (and others) offer getopts which doesn't have these limitations. I've included a simplistic getopts example.

Usage demo-getopts.sh -vf /etc/hosts foo bar

cat >/tmp/demo-getopts.sh <<'EOF'
#!/bin/sh

# A POSIX variable
OPTIND=1         # Reset in case getopts has been used previously in the shell.

# Initialize our own variables:
output_file=""
verbose=0

while getopts "h?vf:" opt; do
    case "$opt" in
    h|\?)
        show_help
        exit 0
        ;;
    v)  verbose=1
        ;;
    f)  output_file=$OPTARG
        ;;
    esac
done

shift $((OPTIND-1))

[ "${1:-}" = "--" ] && shift

echo "verbose=$verbose, output_file='$output_file', Leftovers: $@"
EOF

chmod +x /tmp/demo-getopts.sh

/tmp/demo-getopts.sh -vf /etc/hosts foo bar

output from copy-pasting the block above:

verbose=1, output_file='/etc/hosts', Leftovers: foo bar

The advantages of getopts are:

  1. It's more portable, and will work in other shells like dash .
  2. It can handle multiple single options like -vf filename in the typical Unix way, automatically.

The disadvantage of getopts is that it can only handle short options ( -h , not --help ) without additional code.

There is a getopts tutorial which explains what all of the syntax and variables mean. In bash, there is also help getopts , which might be informative.
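
As an illustration of the "additional code" that long options require with plain getopts, one widely used trick is to declare - as an option taking an argument, so that --name arrives as option - with OPTARG set to name. A minimal sketch (bash; simple flags only, no --opt=value handling):

#!/bin/bash
verbose=0
while getopts ":vh-:" opt; do
    case "$opt" in
        -)  # long option: its name is delivered in $OPTARG
            case "$OPTARG" in
                verbose) verbose=1 ;;
                help)    echo "usage: $0 [-v|--verbose] [-h|--help]"; exit 0 ;;
                *)       echo "unknown option --$OPTARG" >&2; exit 1 ;;
            esac ;;
        v)  verbose=1 ;;
        h)  echo "usage: $0 [-v|--verbose] [-h|--help]"; exit 0 ;;
        \?) echo "unknown option -$OPTARG" >&2; exit 1 ;;
    esac
done
shift $((OPTIND-1))
echo "verbose=$verbose, leftovers: $*"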

johncip ,Jul 23, 2018 at 15:15

No answer mentions enhanced getopt . And the top-voted answer is misleading: It either ignores -vfd style short options (requested by the OP) or options after positional arguments (also requested by the OP); and it ignores parsing-errors. Instead:

The following calls

myscript -vfd ./foo/bar/someFile -o /fizz/someOtherFile
myscript -v -f -d -o/fizz/someOtherFile -- ./foo/bar/someFile
myscript --verbose --force --debug ./foo/bar/someFile -o/fizz/someOtherFile
myscript --output=/fizz/someOtherFile ./foo/bar/someFile -vfd
myscript ./foo/bar/someFile -df -v --output /fizz/someOtherFile

all return

verbose: y, force: y, debug: y, in: ./foo/bar/someFile, out: /fizz/someOtherFile

with the following myscript

#!/bin/bash
# saner programming env: these switches turn some bugs into errors
set -o errexit -o pipefail -o noclobber -o nounset

# -allow a command to fail with !'s side effect on errexit
# -use return value from ${PIPESTATUS[0]}, because ! hosed $?
! getopt --test > /dev/null 
if [[ ${PIPESTATUS[0]} -ne 4 ]]; then
    echo "I'm sorry, 'getopt --test' failed in this environment."
    exit 1
fi

OPTIONS=dfo:v
LONGOPTS=debug,force,output:,verbose

# -regarding ! and PIPESTATUS see above
# -temporarily store output to be able to check for errors
# -activate quoting/enhanced mode (e.g. by writing out "--options")
# -pass arguments only via   -- "$@"   to separate them correctly
! PARSED=$(getopt --options=$OPTIONS --longoptions=$LONGOPTS --name "$0" -- "$@")
if [[ ${PIPESTATUS[0]} -ne 0 ]]; then
    # e.g. return value is 1
    #  then getopt has complained about wrong arguments to stdout
    exit 2
fi
# read getopt's output this way to handle the quoting right:
eval set -- "$PARSED"

d=n f=n v=n outFile=-
# now enjoy the options in order and nicely split until we see --
while true; do
    case "$1" in
        -d|--debug)
            d=y
            shift
            ;;
        -f|--force)
            f=y
            shift
            ;;
        -v|--verbose)
            v=y
            shift
            ;;
        -o|--output)
            outFile="$2"
            shift 2
            ;;
        --)
            shift
            break
            ;;
        *)
            echo "Programming error"
            exit 3
            ;;
    esac
done

# handle non-option arguments
if [[ $# -ne 1 ]]; then
    echo "$0: A single input file is required."
    exit 4
fi

echo "verbose: $v, force: $f, debug: $d, in: $1, out: $outFile"

1 enhanced getopt is available on most "bash-systems", including Cygwin; on OS X try brew install gnu-getopt or sudo port install getopt
2 the POSIX exec() conventions have no reliable way to pass binary NULL in command line arguments; those bytes prematurely end the argument
3 first version released in 1997 or before (I only tracked it back to 1997)

Tobias Kienzler ,Mar 19, 2016 at 15:23

from : digitalpeer.com with minor modifications

Usage myscript.sh -p=my_prefix -s=dirname -l=libname

#!/bin/bash
for i in "$@"
do
case $i in
    -p=*|--prefix=*)
    PREFIX="${i#*=}"

    ;;
    -s=*|--searchpath=*)
    SEARCHPATH="${i#*=}"
    ;;
    -l=*|--lib=*)
    DIR="${i#*=}"
    ;;
    --default)
    DEFAULT=YES
    ;;
    *)
            # unknown option
    ;;
esac
done
echo PREFIX = ${PREFIX}
echo SEARCH PATH = ${SEARCHPATH}
echo DIRS = ${DIR}
echo DEFAULT = ${DEFAULT}

To better understand ${i#*=} search for "Substring Removal" in this guide . It is functionally equivalent to `sed 's/[^=]*=//' <<< "$i"` which calls a needless subprocess or `echo "$i" | sed 's/[^=]*=//'` which calls two needless subprocesses.

Robert Siemer ,Jun 1, 2018 at 1:57

getopt() / getopts() is a good option. Stolen from here :

The simple use of "getopt" is shown in this mini-script:

#!/bin/bash
echo "Before getopt"
for i
do
  echo $i
done
args=`getopt abc:d $*`
set -- $args
echo "After getopt"
for i
do
  echo "-->$i"
done

What we have said is that any of -a, -b, -c or -d will be allowed, but that -c is followed by an argument (the "c:" says that).

If we call this "g" and try it out:

bash-2.05a$ ./g -abc foo
Before getopt
-abc
foo
After getopt
-->-a
-->-b
-->-c
-->foo
-->--

We start with two arguments, and "getopt" breaks apart the options and puts each in its own argument. It also added "--".

hfossli ,Jan 31 at 20:05

More succinct way

script.sh

#!/bin/bash

while [[ "$#" -gt 0 ]]; do case $1 in
  -d|--deploy) deploy="$2"; shift;;
  -u|--uglify) uglify=1;;
  *) echo "Unknown parameter passed: $1"; exit 1;;
esac; shift; done

echo "Should deploy? $deploy"
echo "Should uglify? $uglify"

Usage:

./script.sh -d dev -u

# OR:

./script.sh --deploy dev --uglify

bronson ,Apr 27 at 23:22

At the risk of adding another example to ignore, here's my scheme.

Hope it's useful to someone.

while [ "$#" -gt 0 ]; do
  case "$1" in
    -n) name="$2"; shift 2;;
    -p) pidfile="$2"; shift 2;;
    -l) logfile="$2"; shift 2;;

    --name=*) name="${1#*=}"; shift 1;;
    --pidfile=*) pidfile="${1#*=}"; shift 1;;
    --logfile=*) logfile="${1#*=}"; shift 1;;
    --name|--pidfile|--logfile) echo "$1 requires an argument" >&2; exit 1;;

    -*) echo "unknown option: $1" >&2; exit 1;;
    *) handle_argument "$1"; shift 1;;
  esac
done

Robert Siemer ,Jun 6, 2016 at 19:28

I'm about 4 years late to this question, but want to give back. I used the earlier answers as a starting point to tidy up my old adhoc param parsing. I then refactored out the following template code. It handles both long and short params, using = or space separated arguments, as well as multiple short params grouped together. Finally it re-inserts any non-param arguments back into the $1,$2.. variables. I hope it's useful.
#!/usr/bin/env bash

# NOTICE: Uncomment if your script depends on bashisms.
#if [ -z "$BASH_VERSION" ]; then bash $0 $@ ; exit $? ; fi

echo "Before"
for i ; do echo - $i ; done


# Code template for parsing command line parameters using only portable shell
# code, while handling both long and short params, handling '-f file' and
# '-f=file' style param data and also capturing non-parameters to be inserted
# back into the shell positional parameters.

while [ -n "$1" ]; do
        # Copy so we can modify it (can't modify $1)
        OPT="$1"
        # Detect argument termination
        if [ x"$OPT" = x"--" ]; then
                shift
                for OPT ; do
                        REMAINS="$REMAINS \"$OPT\""
                done
                break
        fi
        # Parse current opt
        while [ x"$OPT" != x"-" ] ; do
                case "$OPT" in
                        # Handle --flag=value opts like this
                        -c=* | --config=* )
                                CONFIGFILE="${OPT#*=}"
                                shift
                                ;;
                        # and --flag value opts like this
                        -c* | --config )
                                CONFIGFILE="$2"
                                shift
                                ;;
                        -f* | --force )
                                FORCE=true
                                ;;
                        -r* | --retry )
                                RETRY=true
                                ;;
                        # Anything unknown is recorded for later
                        * )
                                REMAINS="$REMAINS \"$OPT\""
                                break
                                ;;
                esac
                # Check for multiple short options
                # NOTICE: be sure to update this pattern to match valid options
                NEXTOPT="${OPT#-[cfr]}" # try removing single short opt
                if [ x"$OPT" != x"$NEXTOPT" ] ; then
                        OPT="-$NEXTOPT"  # multiple short opts, keep going
                else
                        break  # long form, exit inner loop
                fi
        done
        # Done with that param. move to next
        shift
done
# Set the non-parameters back into the positional parameters ($1 $2 ..)
eval set -- $REMAINS


echo -e "After: \n configfile='$CONFIGFILE' \n force='$FORCE' \n retry='$RETRY' \n remains='$REMAINS'"
for i ; do echo - $i ; done


I have found writing portable argument parsing in scripts so frustrating that I have written Argbash - a FOSS code generator that can generate the argument-parsing code for your script, plus it has some nice features:

https://argbash.io

[Aug 29, 2019] shell - An example of how to use getopts in bash - Stack Overflow

The key thing to understand is that getopts just parses options. You need to shift them away as a separate operation:
shift $((OPTIND-1))
May 10, 2013 | stackoverflow.com


chepner ,May 10, 2013 at 13:42

I want to call myscript file in this way:
$ ./myscript -s 45 -p any_string

or

$ ./myscript -h >>> should display help
$ ./myscript    >>> should display help

My requirements are:

I tried so far this code:

#!/bin/bash
while getopts "h:s:" arg; do
  case $arg in
    h)
      echo "usage" 
      ;;
    s)
      strength=$OPTARG
      echo $strength
      ;;
  esac
done

But with that code I get errors. How to do it with Bash and getopt ?


#!/bin/bash

usage() { echo "Usage: $0 [-s <45|90>] [-p <string>]" 1>&2; exit 1; }

while getopts ":s:p:" o; do
    case "${o}" in
        s)
            s=${OPTARG}
            ((s == 45 || s == 90)) || usage
            ;;
        p)
            p=${OPTARG}
            ;;
        *)
            usage
            ;;
    esac
done
shift $((OPTIND-1))

if [ -z "${s}" ] || [ -z "${p}" ]; then
    usage
fi

echo "s = ${s}"
echo "p = ${p}"

Example runs:

$ ./myscript.sh
Usage: ./myscript.sh [-s <45|90>] [-p <string>]

$ ./myscript.sh -h
Usage: ./myscript.sh [-s <45|90>] [-p <string>]

$ ./myscript.sh -s "" -p ""
Usage: ./myscript.sh [-s <45|90>] [-p <string>]

$ ./myscript.sh -s 10 -p foo
Usage: ./myscript.sh [-s <45|90>] [-p <string>]

$ ./myscript.sh -s 45 -p foo
s = 45
p = foo

$ ./myscript.sh -s 90 -p bar
s = 90
p = bar

[Aug 27, 2019] linux - How to show line number when executing bash script

Aug 27, 2019 | stackoverflow.com



dspjm ,Jul 23, 2013 at 7:31

I have a test script which has a lot of commands and generates lots of output. I use set -x or set -v and set -e , so the script stops when an error occurs. However, it's still rather difficult for me to locate the line at which execution stopped. Is there a method that can output the line number of the script before each line is executed, or output the line number before the trace output generated by set -x ? Any method that helps locate the problem line would be a great help. Thanks.

Suvarna Pattayil ,Jul 28, 2017 at 17:25

You mention that you're already using -x . The variable PS4 holds the prompt that is printed before each command line is echoed when the -x option is set; it defaults to + followed by a space.

You can change PS4 to emit the LINENO (The line number in the script or shell function currently executing).

For example, if your script reads:

$ cat script
foo=10
echo ${foo}
echo $((2 + 2))

Executing it thus would print line numbers:

$ PS4='Line ${LINENO}: ' bash -x script
Line 1: foo=10
Line 2: echo 10
10
Line 3: echo 4
4

http://wiki.bash-hackers.org/scripting/debuggingtips gives the ultimate PS4 that would output everything you will possibly need for tracing:

export PS4='+(${BASH_SOURCE}:${LINENO}): ${FUNCNAME[0]:+${FUNCNAME[0]}(): }'

Deqing ,Jul 23, 2013 at 8:16

In Bash, $LINENO contains the line number of the script that is currently executing.

If you need to know the line number where the function was called, try $BASH_LINENO . Note that this variable is an array.

For example:

#!/bin/bash       

function log() {
    echo "LINENO: ${LINENO}"
    echo "BASH_LINENO: ${BASH_LINENO[*]}"
}

function foo() {
    log "$@"
}

foo "$@"

See here for details of Bash variables.

Eliran Malka ,Apr 25, 2017 at 10:14

Simple (but powerful) solution: place echo statements around the code you suspect causes the problem and move them line by line until the messages no longer appear on screen - the script has stopped earlier because of an error.

Even more powerful solution: Install bashdb , the bash debugger, and debug the script line by line.

kklepper ,Apr 2, 2018 at 22:44

Workaround for shells without LINENO

In a fairly sophisticated script I wouldn't like to see all line numbers; rather I would like to be in control of the output.

Define a function

echo_line_no () {
    grep -n "$1" $0 |  sed "s/echo_line_no//" 
    # grep the line(s) containing input $1 with line numbers
    # replace the function name with nothing 
} # echo_line_no

Use it with quotes like

echo_line_no "this is a simple comment with a line number"

Output is

16   "this is a simple comment with a line number"

if the number of this line in the source file is 16.

This basically answers the question How to show line number when executing bash script for users of ash or other shells without LINENO .

Anything more to add?

Sure. Why do you need this? How do you work with this? What can you do with this? Is this simple approach really sufficient or useful? Why do you want to tinker with this at all?

Want to know more? Read reflections on debugging

[Aug 27, 2019] perl defensive programming (die, assert, croak) - Stack Overflow

Aug 27, 2019 | stackoverflow.com



Zaid ,Feb 23, 2014 at 17:11

What is the best (or recommended) approach to do defensive programming in perl? For example if I have a sub which must be called with a (defined) SCALAR, an ARRAYREF and an optional HASHREF.

Three of the approaches I have seen:

sub test1 {
    die if !(@_ == 2 || @_ == 3);
    my ($scalar, $arrayref, $hashref) = @_;
    die if !defined($scalar) || ref($scalar);
    die if ref($arrayref) ne 'ARRAY';
    die if defined($hashref) && ref($hashref) ne 'HASH';
    #do s.th with scalar, arrayref and hashref
}

sub test2 {
    Carp::assert(@_ == 2 || @_ == 3) if DEBUG;
    my ($scalar, $arrayref, $hashref) = @_;
    if(DEBUG) {
        Carp::assert defined($scalar) && !ref($scalar);
        Carp::assert ref($arrayref) eq 'ARRAY';
        Carp::assert !defined($hashref) || ref($hashref) eq 'HASH';
    }
    #do s.th with scalar, arrayref and hashref
}

sub test3 {
    my ($scalar, $arrayref, $hashref) = @_;
    (@_ == 2 || @_ == 3 && defined($scalar) && !ref($scalar) && ref($arrayref) eq 'ARRAY' && (!defined($hashref) || ref($hashref) eq 'HASH'))
        or Carp::croak 'usage: test3(SCALAR, ARRAYREF, [HASHREF])';
    #do s.th with scalar, arrayref and hashref
}

tobyink ,Feb 23, 2014 at 21:44

use Params::Validate qw(:all);

sub Yada {
   my (...)=validate_pos(@_,{ type=>SCALAR },{ type=>ARRAYREF },{ type=>HASHREF,optional=>1 });
   ...
}

ikegami ,Feb 23, 2014 at 17:33

I wouldn't use any of them. Aside from not accepting many array and hash references, the checks you used are almost always redundant.
>perl -we"use strict; sub { my ($x) = @_; my $y = $x->[0] }->( 'abc' )"
Can't use string ("abc") as an ARRAY ref nda"strict refs" in use at -e line 1.

>perl -we"use strict; sub { my ($x) = @_; my $y = $x->[0] }->( {} )"
Not an ARRAY reference at -e line 1.

The only advantage to checking is that you can use croak to show the caller in the error message.


Proper way to check if you have a reference to an array:

defined($x) && eval { @$x; 1 }

Proper way to check if you have a reference to a hash:

defined($x) && eval { %$x; 1 }

Borodin ,Feb 23, 2014 at 17:23

None of the options you show display any message to give a reason for the failure, which I think is paramount.

It is also preferable to use croak instead of die from within library subroutines, so that the error is reported from the point of view of the caller.

I would replace all occurrences of if ! with unless . The former is a C programmer's habit.

I suggest something like this

sub test1 {
    croak "Incorrect number of parameters" unless @_ == 2 or @_ == 3;
    my ($scalar, $arrayref, $hashref) = @_;
    croak "Invalid first parameter" unless $scalar and not ref $scalar;
    croak "Invalid second parameter" unless $arrayref eq 'ARRAY';
    croak "Invalid third parameter" if defined $hashref and ref $hashref ne 'HASH';

    # do s.th with scalar, arrayref and hashref
}

[Aug 27, 2019] How do I get the filename and line number in Perl - Stack Overflow

Aug 27, 2019 | stackoverflow.com



Elijah ,Nov 1, 2010 at 17:35

I would like to get the current filename and line number within a Perl script. How do I do this?

For example, in a file called test.pl :

my $foo = 'bar';
print 'Hello World';
print functionForFilename() . ':' . functionForLineNo();

It would output:

Hello World
test.pl:3

tchrist ,Nov 2, 2010 at 19:13

These are available with the __LINE__ and __FILE__ tokens, as documented in perldoc perldata under "Special Literals":

The special literals __FILE__, __LINE__, and __PACKAGE__ represent the current filename, line number, and package name at that point in your program. They may be used only as separate tokens; they will not be interpolated into strings. If there is no current package (due to an empty package; directive), __PACKAGE__ is the undefined value.

Eric Strom ,Nov 1, 2010 at 17:41

The caller function will do what you are looking for:
sub print_info {
   my ($package, $filename, $line) = caller;
   ...
}

print_info(); # prints info about this line

This will get the information from where the sub is called, which is probably what you are looking for. The __FILE__ and __LINE__ directives only apply to where they are written, so you can not encapsulate their effect in a subroutine. (unless you wanted a sub that only prints info about where it is defined)


You can use:
print __FILE__. " " . __LINE__;

[Aug 26, 2019] bash - How to prevent rm from reporting that a file was not found

Aug 26, 2019 | stackoverflow.com



pizza ,Apr 20, 2012 at 21:29

I am using rm within a BASH script to delete many files. Sometimes the files are not present, so it reports many errors. I do not need this message. I have searched the man page for a command to make rm quiet, but the only option I found is -f , which from the description, "ignore nonexistent files, never prompt", seems to be the right choice, but the name does not seem to fit, so I am concerned it might have unintended consequences.

Keith Thompson ,Dec 19, 2018 at 13:05

The main use of -f is to force the removal of files that would not be removed using rm by itself (as a special case, it "removes" non-existent files, thus suppressing the error message).

You can also just redirect the error message using

$ rm file.txt 2> /dev/null

(or your operating system's equivalent). You can check the value of $? immediately after calling rm to see if a file was actually removed or not.
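
A minimal sketch of that check (file.txt is just a placeholder name):

rm file.txt 2>/dev/null
if [ $? -eq 0 ]; then
    echo "file.txt was removed"
else
    echo "file.txt was not removed (it probably did not exist)"
fi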

vimdude ,May 28, 2014 at 18:10

Yes, -f is the most suitable option for this.

tripleee ,Jan 11 at 4:50

-f is the correct flag, but for the test operator, not rm
[ -f "$THEFILE" ] && rm "$THEFILE"

this ensures that the file exists and is a regular file (not a directory, device node etc...)

mahemoff ,Jan 11 at 4:41

\rm -f file will never report not found.

Idelic ,Apr 20, 2012 at 16:51

As far as rm -f doing "anything else", it does force ( -f is shorthand for --force ) silent removal in situations where rm would otherwise ask you for confirmation. For example, when trying to remove a file not writable by you from a directory that is writable by you.

Keith Thompson ,May 28, 2014 at 18:09

I had the same issue with csh. The only solution I had was to create a dummy file that matched the pattern before running "rm" in my script.

[Aug 26, 2019] shell - rm -rf return codes

Aug 26, 2019 | superuser.com



SheetJS ,Aug 15, 2013 at 2:50

Can anyone let me know the possible return codes for the command rm -rf other than zero, i.e., the possible return codes for failure cases? I want to know the detailed reason for the failure of the command, rather than just that the command failed (returned something other than 0).

Adrian Frühwirth ,Aug 14, 2013 at 7:00

To see the return code, you can use echo $? in bash.

To see the actual meaning, some platforms (like Debian Linux) have the perror binary available, which can be used as follows:

$ rm -rf something/; perror $?
rm: cannot remove `something/': Permission denied
OS error code   1:  Operation not permitted

rm -rf automatically suppresses most errors. The most likely error you will see is 1 (Operation not permitted), which will happen if you don't have permission to remove the file. -f intentionally suppresses most errors.

Adrian Frühwirth ,Aug 14, 2013 at 7:21

grabbed coreutils from git....

looking at exit we see...

openfly@linux-host:~/coreutils/src $ cat rm.c | grep -i exit
  if (status != EXIT_SUCCESS)
  exit (status);
  /* Since this program exits immediately after calling 'rm', rm need not
  atexit (close_stdin);
          usage (EXIT_FAILURE);
        exit (EXIT_SUCCESS);
          usage (EXIT_FAILURE);
        error (EXIT_FAILURE, errno, _("failed to get attributes of %s"),
        exit (EXIT_SUCCESS);
  exit (status == RM_ERROR ? EXIT_FAILURE : EXIT_SUCCESS);

Now looking at the status variable....

openfly@linux-host:~/coreutils/src $ cat rm.c | grep -i status
usage (int status)
  if (status != EXIT_SUCCESS)
  exit (status);
  enum RM_status status = rm (file, &x);
  assert (VALID_STATUS (status));
  exit (status == RM_ERROR ? EXIT_FAILURE : EXIT_SUCCESS);

looks like there isn't much going on there with the exit status.

I see EXIT_FAILURE and EXIT_SUCCESS and not anything else.

so basically 0 and 1 / -1

To see specific exit() syscalls and how they occur in a process flow try this

openfly@linux-host:~/ $ strace rm -rf $whatever

fairly simple.

ref:

http://www.unix.com/man-page/Linux/EXIT_FAILURE/exit/

[Aug 26, 2019] debugging - How can I debug a Perl script - Stack Overflow

Jun 27, 2014 | stackoverflow.com

Matthew Lock ,Jun 27, 2014 at 1:01

To run your script under the perl debugger you should use the -d switch:
perl -d script.pl

But perl is flexible. It supplies some hooks, and you may force the debugger to work as you want.

So to use different debuggers you may do:

perl -d:DebugHooks::Terminal script.pl
# OR
perl -d:Trepan script.pl

Look at these modules here and here

There are several very interesting perl modules that hook into the perl debugger's internals: Devel::NYTProf , Devel::Cover

And many others
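
For instance, a typical Devel::NYTProf session looks like this (a sketch; the nytprofhtml tool ships with the module):

perl -d:NYTProf script.pl    # profile the run; writes ./nytprof.out
nytprofhtml                  # convert nytprof.out into an HTML report in ./nytprof/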


If you want to do remote debugging (for CGI, or if you don't want to mix program output with the debugger command line) use this:

given test:

use v5.14;
say 1;
say 2;
say 3;

Start a listener on whatever host and port on terminal 1 (here localhost:12345):

$ nc -v -l localhost -p 12345

for readline support use rlwrap (you can use it with perl -d too):

$ rlwrap nc -v -l localhost -p 12345

And start the test on another terminal (say terminal 2):

$ PERLDB_OPTS="RemotePort=localhost:12345" perl -d test

Input/Output on terminal 1:

Connection from 127.0.0.1:42994

Loading DB routines from perl5db.pl version 1.49
Editor support available.

Enter h or 'h h' for help, or 'man perldebug' for more help.

main::(test:2): say 1;
  DB<1> n
main::(test:3): say 2;
  DB<1> select $DB::OUT

  DB<2> n
2
main::(test:4): say 3;
  DB<2> n
3
Debugged program terminated.  Use q to quit or R to restart,
use o inhibit_exit to avoid stopping after program termination,
h q, h R or h o to get additional info.  
  DB<2>

Output on terminal 2:

1

Note the command to use if you want output on the debug terminal:

select $DB::OUT

If you are a vim user, install this plugin: dbg.vim , which provides basic support for perl.

[Aug 26, 2019] Debugging - How to use the Perl debugger

Aug 26, 2019 | stackoverflow.com
This is like "please can you give me an example of how to drive a car".

I have explained the basic commands that you will use most often. Beyond this you must read the debugger's inline help and reread the perldebug documentation

The debugger will do a lot more than this, but these are the basic commands that you need to know. You should experiment with them and look at the contents of the help text to get more proficient with the Perl debugger
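
For reference, the basic commands referred to above are the usual one-letter debugger commands documented in perldebug; a short summary (not the answer's original list):

perl -d script.pl    # run the script under the debugger
  n          # execute the next statement, stepping over subroutine calls
  s          # single-step, descending into subroutine calls
  c [line]   # continue until a breakpoint or the given line
  b line     # set a breakpoint
  p expr     # print the value of an expression
  x expr     # dump an expression, including nested structures
  l          # list source lines around the current position
  q          # quit the debugger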

[Aug 25, 2019] How to check if a variable is set in Bash?

Aug 25, 2019 | stackoverflow.com



Jens ,Jul 15, 2014 at 9:46

How do I know if a variable is set in Bash?

For example, how do I check if the user gave the first parameter to a function?

function a {
    # if $1 is set ?
}

Graeme ,Nov 25, 2016 at 5:07

(Usually) The right way
if [ -z ${var+x} ]; then echo "var is unset"; else echo "var is set to '$var'"; fi

where ${var+x} is a parameter expansion which evaluates to nothing if var is unset, and substitutes the string x otherwise.

Quotes Digression

Quotes can be omitted (so we can say ${var+x} instead of "${var+x}" ) because this syntax & usage guarantees this will only expand to something that does not require quotes (since it either expands to x (which contains no word breaks so it needs no quotes), or to nothing (which results in [ -z ] , which conveniently evaluates to the same value (true) that [ -z "" ] does as well)).

However, while quotes can be safely omitted, and it was not immediately obvious to all (it wasn't even apparent to the first author of this quotes explanation who is also a major Bash coder), it would sometimes be better to write the solution with quotes as [ -z "${var+x}" ] , at the very small possible cost of an O(1) speed penalty. The first author also added this as a comment next to the code using this solution giving the URL to this answer, which now also includes the explanation for why the quotes can be safely omitted.

(Often) The wrong way
if [ -z "$var" ]; then echo "var is blank"; else echo "var is set to '$var'"; fi

This is often wrong because it doesn't distinguish between a variable that is unset and a variable that is set to the empty string. That is to say, if var='' , then the above solution will output "var is blank".

The distinction between unset and "set to the empty string" is essential in situations where the user has to specify an extension, or additional list of properties, and that not specifying them defaults to a non-empty value, whereas specifying the empty string should make the script use an empty extension or list of additional properties.

The distinction may not be essential in every scenario though. In those cases [ -z "$var" ] will be just fine.
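
A quick demonstration of the difference (a sketch):

var=''
[ -z "${var+x}" ] && echo "unset" || echo "set"        # prints "set": var exists
[ -z "$var" ]     && echo "blank" || echo "nonempty"   # prints "blank": its value is empty
unset var
[ -z "${var+x}" ] && echo "unset" || echo "set"        # prints "unset"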

Flow ,Nov 26, 2014 at 13:49

To check for non-null/non-zero string variable, i.e. if set, use
if [ -n "$1" ]

It's the opposite of -z . I find myself using -n more than -z .

You would use it like:

if [ -n "$1" ]; then
  echo "You supplied the first parameter!"
else
  echo "First parameter not supplied."
fi

Jens ,Jan 19, 2016 at 23:30

Here's how to test whether a parameter is unset , or empty ("Null") or set with a value :
+--------------------+----------------------+-----------------+-----------------+
|                    |       parameter      |     parameter   |    parameter    |
|                    |   Set and Not Null   |   Set But Null  |      Unset      |
+--------------------+----------------------+-----------------+-----------------+
| ${parameter:-word} | substitute parameter | substitute word | substitute word |
| ${parameter-word}  | substitute parameter | substitute null | substitute word |
| ${parameter:=word} | substitute parameter | assign word     | assign word     |
| ${parameter=word}  | substitute parameter | substitute null | assign word     |
| ${parameter:?word} | substitute parameter | error, exit     | error, exit     |
| ${parameter?word}  | substitute parameter | substitute null | error, exit     |
| ${parameter:+word} | substitute word      | substitute null | substitute null |
| ${parameter+word}  | substitute word      | substitute word | substitute null |
+--------------------+----------------------+-----------------+-----------------+

Source: POSIX: Parameter Expansion :

In all cases shown with "substitute", the expression is replaced with the value shown. In all cases shown with "assign", parameter is assigned that value, which also replaces the expression.

Dan ,Jul 24, 2018 at 20:16

While most of the techniques stated here are correct, bash 4.2 supports an actual test for the presence of a variable ( man bash ), rather than testing the value of the variable.
[[ -v foo ]]; echo $?
# 1

foo=bar
[[ -v foo ]]; echo $?
# 0

foo=""
[[ -v foo ]]; echo $?
# 0

Notably, this approach will not cause an error when used to check for an unset variable in set -u / set -o nounset mode, unlike many other approaches, such as using [ -z .

chepner ,Sep 11, 2013 at 14:22

There are many ways to do this with the following being one of them:
if [ -z "$1" ]

This succeeds if $1 is null or unset

phkoester ,Feb 16, 2018 at 8:06

To see if a variable is nonempty, I use
if [[ $var ]]; then ...       # `$var' expands to a nonempty string

The opposite tests if a variable is either unset or empty:

if [[ ! $var ]]; then ...     # `$var' expands to the empty string (set or not)

To see if a variable is set (empty or nonempty), I use

if [[ ${var+x} ]]; then ...   # `var' exists (empty or nonempty)
if [[ ${1+x} ]]; then ...     # Parameter 1 exists (empty or nonempty)

The opposite tests if a variable is unset:

if [[ ! ${var+x} ]]; then ... # `var' is not set at all
if [[ ! ${1+x} ]]; then ...   # We were called with no arguments

Palec ,Jun 19, 2017 at 3:25

I always find the POSIX table in the other answer slow to grok, so here's my take on it:
   +----------------------+------------+-----------------------+-----------------------+
   |   if VARIABLE is:    |    set     |         empty         |        unset          |
   +----------------------+------------+-----------------------+-----------------------+
 - |  ${VARIABLE-default} | $VARIABLE  |          ""           |       "default"       |
 = |  ${VARIABLE=default} | $VARIABLE  |          ""           | $(VARIABLE="default") |
 ? |  ${VARIABLE?default} | $VARIABLE  |          ""           |       exit 127        |
 + |  ${VARIABLE+default} | "default"  |       "default"       |          ""           |
   +----------------------+------------+-----------------------+-----------------------+
:- | ${VARIABLE:-default} | $VARIABLE  |       "default"       |       "default"       |
:= | ${VARIABLE:=default} | $VARIABLE  | $(VARIABLE="default") | $(VARIABLE="default") |
:? | ${VARIABLE:?default} | $VARIABLE  |       exit 127        |       exit 127        |
:+ | ${VARIABLE:+default} | "default"  |          ""           |          ""           |
   +----------------------+------------+-----------------------+-----------------------+

Note that each group (with and without preceding colon) has the same set and unset cases, so the only thing that differs is how the empty cases are handled.

With the preceding colon, the empty and unset cases are identical, so I would use those where possible (i.e. use := , not just = , because the empty case is inconsistent).

Headings:

Values:

chepner ,Mar 28, 2017 at 12:26

On a modern version of Bash (4.2 or later I think; I don't know for sure), I would try this:
if [ ! -v SOMEVARIABLE ] #note the lack of a $ sigil
then
    echo "Variable is unset"
elif [ -z "$SOMEVARIABLE" ]
then
    echo "Variable is set to an empty string"
else
    echo "Variable is set to some string"
fi

Gordon Davisson ,May 15, 2015 at 13:53

if [ "$1" != "" ]; then
  echo \$1 is set
else
  echo \$1 is not set
fi

Although for arguments it is normally best to test $#, which is the number of arguments, in my opinion.

if [ $# -gt 0 ]; then
  echo \$1 is set
else
  echo \$1 is not set
fi

Jarrod Chesney ,Dec 9, 2016 at 3:34

You want to exit if it's unset

This worked for me. I wanted my script to exit with an error message if a parameter wasn't set.

#!/usr/bin/env bash

set -o errexit

# Get the value and empty validation check all in one
VER="${1:?You must pass a version of the format 0.0.0 as the only argument}"

This returns with an error when it's run

peek@peek:~$ ./setver.sh
./setver.sh: line 13: 1: You must pass a version of the format 0.0.0 as the only argument
Check only, no exit - Empty and Unset are INVALID

Try this option if you just want to check if the value set=VALID or unset/empty=INVALID.

TSET="good val"
TEMPTY=""
unset TUNSET

if [ "${TSET:-}" ]; then echo "VALID"; else echo "INVALID";fi
# VALID
if [ "${TEMPTY:-}" ]; then echo "VALID"; else echo "INVALID";fi
# INVALID
if [ "${TUNSET:-}" ]; then echo "VALID"; else echo "INVALID";fi
# INVALID

Or, Even short tests ;-)

[ "${TSET:-}"   ] && echo "VALID" || echo "INVALID"
[ "${TEMPTY:-}" ] && echo "VALID" || echo "INVALID"
[ "${TUNSET:-}" ] && echo "VALID" || echo "INVALID"
Check only, no exit - Only empty is INVALID

And this is the answer to the question. Use this if you just want to check if the value set/empty=VALID or unset=INVALID.

NOTE, the "1" in "..-1}" is insignificant, it can be anything (like x)

TSET="good val"
TEMPTY=""
unset TUNSET

if [ "${TSET+1}" ]; then echo "VALID"; else echo "INVALID";fi
# VALID
if [ "${TEMPTY+1}" ]; then echo "VALID"; else echo "INVALID";fi
# VALID
if [ "${TUNSET+1}" ]; then echo "VALID"; else echo "INVALID";fi
# INVALID

Short tests

[ "${TSET+1}"   ] && echo "VALID" || echo "INVALID"
[ "${TEMPTY+1}" ] && echo "VALID" || echo "INVALID"
[ "${TUNSET+1}" ] && echo "VALID" || echo "INVALID"

I dedicate this answer to @mklement0 (comments) who challenged me to answer the question accurately.

Reference http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06_02

Gilles ,Aug 31, 2010 at 7:30

To check whether a variable is set with a non-empty value, use [ -n "$x" ] , as others have already indicated.

Most of the time, it's a good idea to treat a variable that has an empty value in the same way as a variable that is unset. But you can distinguish the two if you need to: [ -n "${x+set}" ] ( "${x+set}" expands to set if x is set and to the empty string if x is unset).

To check whether a parameter has been passed, test $# , which is the number of parameters passed to the function (or to the script, when not in a function) (see Paul's answer ).

tripleee ,Sep 12, 2015 at 6:33

Read the "Parameter Expansion" section of the bash man page. Parameter expansion doesn't provide a general test for a variable being set, but there are several things you can do to a parameter if it isn't set.

For example:

function a {
    first_arg=${1-foo}
    # rest of the function
}

will set first_arg equal to $1 if it is assigned, otherwise it uses the value "foo". If a absolutely must take a single parameter, and no good default exists, you can exit with an error message when no parameter is given:

function a {
    : ${1?a must take a single argument}
    # rest of the function
}

(Note the use of : as a null command, which just expands the values of its arguments. We don't want to do anything with $1 in this example, just exit if it isn't set)

AlejandroVD ,Feb 8, 2016 at 13:31

In bash you can use -v inside the [[ ]] builtin:
#! /bin/bash -u

if [[ ! -v SOMEVAR ]]; then
    SOMEVAR='hello'
fi

echo $SOMEVAR

Palec ,Nov 16, 2016 at 15:01

For those that are looking to check for unset or empty when in a script with set -u :
if [ -z "${var-}" ]; then
   echo "Must provide var environment variable. Exiting...."
   exit 1
fi

The regular [ -z "$var" ] check will fail with var; unbound variable if set -u but [ -z "${var-}" ] expands to empty string if var is unset without failing.

user1387866 ,Jul 30 at 15:57

Note

I'm giving a heavily Bash-focused answer because of the bash tag.

Short answer

As long as you're only dealing with named variables in Bash, this function should always tell you if the variable has been set, even if it's an empty array.

is-variable-set() {
    declare -p $1 &>/dev/null
}
Why this works

In Bash (at least as far back as 3.0), if var is a declared/set variable, then declare -p var outputs a declare command that would set variable var to whatever its current type and value are, and returns status code 0 (success). If var is undeclared, then declare -p var outputs an error message to stderr and returns status code 1 . Using &>/dev/null , redirects both regular stdout and stderr output to /dev/null , never to be seen, and without changing the status code. Thus the function only returns the status code.

Why other methods (sometimes) fail in Bash
  • [ -n "$var" ] : This only checks if ${var[0]} is nonempty. (In Bash, $var is the same as ${var[0]} .)
  • [ -n "${var+x}" ] : This only checks if ${var[0]} is set.
  • [ "${#var[@]}" != 0 ] : This only checks if at least one index of $var is set.
When this method fails in Bash

This only works for named variables (including $_ ), not certain special variables ( $! , $@ , $# , $$ , $* , $? , $- , $0 , $1 , $2 , ..., and any I may have forgotten). Since none of these are arrays, the POSIX-style [ -n "${var+x}" ] works for all of these special variables. But beware of wrapping it in a function since many special variables change values/existence when functions are called.
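
A minimal illustration of that caveat (the function name check is arbitrary): inside a function, $1 and $# refer to the function's own arguments, so the value has to be passed in explicitly.

check() { [ -n "${1+x}" ] && echo "set" || echo "unset"; }

set -- outer     # the script now has one positional parameter
check            # prints "unset": inside check, $1 is check's own first argument
check "$1"       # prints "set": the script's $1 was passed in as a value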

Shell compatibility note

If your script has arrays and you're trying to make it compatible with as many shells as possible, then consider using typeset -p instead of declare -p . I've read that ksh only supports the former, but haven't been able to test this. I do know that Bash 3.0+ and Zsh 5.5.1 each support both typeset -p and declare -p , differing only in which one is an alternative for the other. But I haven't tested differences beyond those two keywords, and I haven't tested other shells.

If you need your script to be POSIX sh compatible, then you can't use arrays. Without arrays, [ -n "${var+x}" ] works.

Comparison code for different methods in Bash

This function unsets variable var , evals the passed code, runs tests to determine if var is set by the evaluated code, and finally shows the resulting status codes for the different tests.

I'm skipping test -v var , [ -v var ] , and [[ -v var ]] because they yield identical results to the POSIX standard [ -n "${var+x}" ] , while requiring Bash 4.2+. I'm also skipping typeset -p because it's the same as declare -p in the shells I've tested (Bash 3.0 thru 5.0, and Zsh 5.5.1).

is-var-set-after() {
    # Set var by passed expression.
    unset var
    eval "$1"

    # Run the tests, in increasing order of accuracy.
    [ -n "$var" ] # (index 0 of) var is nonempty
    nonempty=$?
    [ -n "${var+x}" ] # (index 0 of) var is set, maybe empty
    plus=$?
    [ "${#var[@]}" != 0 ] # var has at least one index set, maybe empty
    count=$?
    declare -p var &>/dev/null # var has been declared (any type)
    declared=$?

    # Show test results.
    printf '%30s: %2s %2s %2s %2s\n' "$1" $nonempty $plus $count $declared
}
Test case code

Note that test results may be unexpected due to Bash treating non-numeric array indices as "0" if the variable hasn't been declared as an associative array. Also, associative arrays are only valid in Bash 4.0+.

# Header.
printf '%30s: %2s %2s %2s %2s\n' "test" '-n' '+x' '#@' '-p'
# First 5 tests: Equivalent to setting 'var=foo' because index 0 of an
# indexed array is also the nonindexed value, and non-numerical
# indices in an array not declared as associative are the same as
# index 0.
is-var-set-after "var=foo"                        #  0  0  0  0
is-var-set-after "var=(foo)"                      #  0  0  0  0
is-var-set-after "var=([0]=foo)"                  #  0  0  0  0
is-var-set-after "var=([x]=foo)"                  #  0  0  0  0
is-var-set-after "var=([y]=bar [x]=foo)"          #  0  0  0  0
# '[ -n "$var" ]' fails when var is empty.
is-var-set-after "var=''"                         #  1  0  0  0
is-var-set-after "var=([0]='')"                   #  1  0  0  0
# Indices other than 0 are not detected by '[ -n "$var" ]' or by
# '[ -n "${var+x}" ]'.
is-var-set-after "var=([1]='')"                   #  1  1  0  0
is-var-set-after "var=([1]=foo)"                  #  1  1  0  0
is-var-set-after "declare -A var; var=([x]=foo)"  #  1  1  0  0
# Empty arrays are only detected by 'declare -p'.
is-var-set-after "var=()"                         #  1  1  1  0
is-var-set-after "declare -a var"                 #  1  1  1  0
is-var-set-after "declare -A var"                 #  1  1  1  0
# If 'var' is unset, then it even fails the 'declare -p var' test.
is-var-set-after "unset var"                      #  1  1  1  1
Test output

The test mnemonics in the header row correspond to [ -n "$var" ] , [ -n "${var+x}" ] , [ "${#var[@]}" != 0 ] , and declare -p var , respectively.

                         test: -n +x #@ -p
                      var=foo:  0  0  0  0
                    var=(foo):  0  0  0  0
                var=([0]=foo):  0  0  0  0
                var=([x]=foo):  0  0  0  0
        var=([y]=bar [x]=foo):  0  0  0  0
                       var='':  1  0  0  0
                 var=([0]=''):  1  0  0  0
                 var=([1]=''):  1  1  0  0
                var=([1]=foo):  1  1  0  0
declare -A var; var=([x]=foo):  1  1  0  0
                       var=():  1  1  1  0
               declare -a var:  1  1  1  0
               declare -A var:  1  1  1  0
                    unset var:  1  1  1  1
Summary
  • declare -p var &>/dev/null is (100%?) reliable for testing named variables in Bash since at least 3.0.
  • [ -n "${var+x}" ] is reliable in POSIX compliant situations, but cannot handle arrays.
  • Other tests exist for checking if a variable is nonempty, and for checking for declared variables in other shells. But these tests are suited for neither Bash nor POSIX scripts.

Peregring-lk ,Oct 18, 2014 at 22:09

Using [[ -z "$var" ]] is the easiest way to know if a variable was set or not, but that option -z doesn't distinguish between an unset variable and a variable set to an empty string:
$ set=''
$ [[ -z "$set" ]] && echo "Unset" || echo "Set"
Unset
$ [[ -z "$unset" ]] && echo "Unset" || echo "Set"
Unset

It's best to check it according to the type of variable: env variable, parameter or regular variable.

For an env variable:

[[ $(env | grep -c '^varname=') -eq 1 ]] && echo "Set" || echo "Unset"

For a parameter (for example, to check existence of parameter $5 ):

[[ $# -ge 5 ]] && echo "Set" || echo "Unset"

For a regular variable (using an auxiliary function, to do it in an elegant way):

function declare_var {
   declare -p "$1" &> /dev/null
}
declare_var "var_name" && echo "Set" || echo "Unset"

Notes:

  • $# : gives you the number of positional parameters.
  • declare -p : gives you the definition of the variable passed as a parameter. If it exists, returns 0, if not, returns 1 and prints an error message.
  • &> /dev/null : suppresses output from declare -p without affecting its return code.

Dennis Williamson ,Nov 27, 2013 at 20:56

You can do:
function a {
        if [ ! -z "$1" ]; then
                echo '$1 is set'
        fi
}

LavaScornedOven ,May 11, 2017 at 13:14

The answers above do not work when the Bash option set -u is enabled. Also, they are not dynamic: for example, how do you test whether a variable with the name "dummy" is defined? Try this:
is_var_defined()
{
    if [ $# -ne 1 ]
    then
        echo "Expected exactly one argument: variable name as string, e.g., 'my_var'"
        exit 1
    fi
    # Tricky.  Since Bash option 'set -u' may be enabled, we cannot directly test if a variable
    # is defined with this construct: [ ! -z "$var" ].  Instead, we must use default value
    # substitution with this construct: [ ! -z "${var:-}" ].  Normally, a default value follows the
    # operator ':-', but here we leave it blank for empty (null) string.  Finally, we need to
    # substitute the text from $1 as 'var'.  This is not allowed directly in Bash with this
    # construct: [ ! -z "${$1:-}" ].  We need to use indirection with eval operator.
    # Example: $1="var"
    # Expansion for eval operator: "[ ! -z \${$1:-} ]" -> "[ ! -z \${var:-} ]"
    # Code  execute: [ ! -z ${var:-} ]
    eval "[ ! -z \${$1:-} ]"
    return $?  # Pedantic.
}

Related: In Bash, how do I test if a variable is defined in "-u" mode

Aquarius Power ,Nov 15, 2014 at 17:55

My preferred way is this:
$var=10
$if ! ${var+false};then echo "is set";else echo "NOT set";fi
is set
$unset var
$if ! ${var+false};then echo "is set";else echo "NOT set";fi
NOT set

So basically, if the variable is set, the expansion yields false , and the negation makes the test true ("is set"). If it is unset, the expansion is empty, the empty command evaluates as true, and the negation makes the test false ("NOT set").

kenorb ,Sep 22, 2014 at 13:57

In a shell you can use the -z operator which is True if the length of string is zero.

A simple one-liner to set default MY_VAR if it's not set, otherwise optionally you can display the message:

[[ -z "$MY_VAR" ]] && MY_VAR="default"
[[ -z "$MY_VAR" ]] && MY_VAR="default" || echo "Variable already set."

Zlatan ,Nov 20, 2013 at 18:53

if [[ ${1:+isset} ]]
then echo "It was set and not null." >&2
else echo "It was not set or it was null." >&2
fi

if [[ ${1+isset} ]]
then echo "It was set but might be null." >&2
else echo "It was was not set." >&2
fi

solidsnack ,Nov 30, 2013 at 16:47

I found (much) better code for this if you want to check for anything in $@ .
if [[ $1 = "" ]]
then
  echo '$1 is blank'
else
  echo '$1 is filled up'
fi

Why this all? Everything in $@ exists in Bash, but by default it's blank, so test -z and test -n can't help you.

Update: You can also count the number of characters in a parameter.

if [ ${#1} = 0 ]
then
  echo '$1 is blank'
else
  echo '$1 is filled up'
fi

Steven Penny ,May 11, 2014 at 4:59

[[ $foo ]]

Or

(( ${#foo} ))

Or

let ${#foo}

Or

declare -p foo

Celeo ,Feb 11, 2015 at 20:58

if [[ ${!xx[@]} ]] ; then echo xx is defined; fi

HelloGoodbye ,Nov 29, 2013 at 22:41

I always use this one, because it is easy to understand for anybody who sees the code for the very first time:
if [ "$variable" = "" ]
    then
    echo "Variable X is empty"
fi

And, if you want to check that it is not empty:

if [ ! "$variable" = "" ]
    then
    echo "Variable X is not empty"
fi

That's it.

fr00tyl00p ,Nov 29, 2015 at 20:26

This is what I use every day:
#
# Check if a variable is set
#   param1  name of the variable
#
function is_set()
{
    [[ -n "${1}" ]] && test -n "$(eval "echo "\${${1}+x}"")"
}

This works well under Linux and Solaris down to bash 3.0.

bash-3.00$ myvar="TEST"
bash-3.00$ is_set myvar ; echo $?
0
bash-3.00$ myvar=""
bash-3.00$ is_set myvar ; echo $?
0
bash-3.00$ unset myvar
bash-3.00$ is_set myvar ; echo $?
1

Daniel S ,Mar 1, 2016 at 13:12

I like auxiliary functions to hide the crude details of bash. In this case, doing so adds even more (hidden) crudeness:
# The first ! negates the result (can't use -n to achieve this)
# the second ! expands the content of varname (can't do ${$varname})
function IsDeclared_Tricky
{
  local varname="$1"
  ! [ -z ${!varname+x} ]
}

Because I first had bugs in this implementation (inspired by the answers of Jens and Lionel), I came up with a different solution:

# Ask for the properties of the variable - fails if not declared
function IsDeclared()
{
  declare -p $1 &>/dev/null
}

I find it to be more straight-forward, more bashy and easier to understand/remember. Test case shows it is equivalent:

function main()
{
  declare -i xyz
  local foo
  local bar=
  local baz=''

  IsDeclared_Tricky xyz; echo "IsDeclared_Tricky xyz: $?"
  IsDeclared_Tricky foo; echo "IsDeclared_Tricky foo: $?"
  IsDeclared_Tricky bar; echo "IsDeclared_Tricky bar: $?"
  IsDeclared_Tricky baz; echo "IsDeclared_Tricky baz: $?"

  IsDeclared xyz; echo "IsDeclared xyz: $?"
  IsDeclared foo; echo "IsDeclared foo: $?"
  IsDeclared bar; echo "IsDeclared bar: $?"
  IsDeclared baz; echo "IsDeclared baz: $?"
}

main

The test case also shows that local var does NOT declare var (unless followed by '='). For quite some time I thought I declared variables this way, just to discover now that I merely expressed my intention... It's a no-op, I guess.

IsDeclared_Tricky xyz: 1
IsDeclared_Tricky foo: 1
IsDeclared_Tricky bar: 0
IsDeclared_Tricky baz: 0
IsDeclared xyz: 1
IsDeclared foo: 1
IsDeclared bar: 0
IsDeclared baz: 0

BONUS: usecase

I mostly use this test to give (and return) parameters to functions in a somewhat "elegant" and safe way (almost resembling an interface...):

#auxiliary functions
function die()
{
  echo "Error: $1"; exit 1
}

function assertVariableDeclared()
{
  IsDeclared "$1" || die "variable not declared: $1"
}

function expectVariables()
{
  while (( $# > 0 )); do
    assertVariableDeclared $1; shift
  done
}

# actual example
function exampleFunction()
{
  expectVariables inputStr outputStr
  outputStr="$inputStr world!"
}

function bonus()
{
  local inputStr='Hello'
  local outputStr= # remove this to trigger error
  exampleFunction
  echo $outputStr
}

bonus

If called with all required variables declared:

Hello world!

else:

Error: variable not declared: outputStr

Hatem Jaber ,Jun 13 at 12:08

After skimming all the answers, this also works:
if [[ -z $SOME_VAR ]]; then read -p "Enter a value for SOME_VAR: " SOME_VAR; fi
echo "SOME_VAR=$SOME_VAR"

Note that the $ is necessary in the -z test: [[ -z $SOME_VAR ]] tests the variable's value, while read -p ... SOME_VAR takes the bare name without $ ; writing it the other way round will not work.

Keith Thompson ,Aug 5, 2013 at 19:10

If you wish to test that a variable is bound or unbound, this works well, even after you've turned on the nounset option (note that printenv only sees environment variables, so the variable must be exported for this test):
set -o nounset

if printenv variableName >/dev/null; then
    # variable is bound to a value
else
    # variable is unbound
fi

> ,Jan 30 at 18:23

Functions to check if a variable is declared/unset, including an empty array $array=()


The following functions test if the given name exists as a variable

# The first parameter needs to be the name of the variable to be checked.
# (See example below)

var_is_declared() {
    { [[ -n ${!1+anything} ]] || declare -p $1 &>/dev/null;}
}

var_is_unset() {
    { [[ -z ${!1+anything} ]] && ! declare -p $1 &>/dev/null;} 
}

These functions test as shown in the following conditions:

a;       # is not declared
a=;      # is declared
a="foo"; # is declared
a=();    # is declared
a=("");  # is declared
unset a; # is not declared

a;       # is unset
a=;      # is not unset
a="foo"; # is not unset
a=();    # is not unset
a=("");  # is not unset
unset a; # is unset

For more details and a test script, see my answer to the question "How do I check if a variable exists in bash?" .

Remark: The similar usage of declare -p , as it is also shown by Peregring-lk 's answer , is truly coincidental. Otherwise I would of course have credited it!

[Aug 20, 2019] Is it possible to insert separator in midnight commander menu?

Jun 07, 2010 | superuser.com


okutane ,Jun 7, 2010 at 3:36

I want to insert some items into mc menu (which is opened by F2) grouped together. Is it possible to insert some sort of separator before them or put them into some submenu?
Probably not.
The format of the menu file is very simple. Lines that start with anything but
space or tab are considered entries for the menu (in order to be able to use
it like a hot key, the first character should be a letter). All the lines that
start with a space or a tab are the commands that will be executed when the
entry is selected.

But MC allows you to make multiple menu entries with the same shortcut and title, so you can make a menu entry that looks like a separator and does nothing, like:

a hello
  echo world
- --------
b world
  echo hello
- --------
c superuser
  ls /

This will look like a menu with dashed separator lines between the entries (the original answer included a screenshot).

[Aug 20, 2019] Midnight Commander, using date in User menu

Dec 31, 2013 | unix.stackexchange.com

user2013619 ,Dec 31, 2013 at 0:43

I would like to use MC (midnight commander) to compress the selected dir with date in its name, e.g: dirname_20131231.tar.gz

The command in the User menu is :

tar -czf dirname_`date '+%Y%m%d'`.tar.gz %d

The date part is mangled because %m and %d have another meaning in MC. I made an alias for the date, but it also doesn't work.

Does anybody solved this problem ever?

John1024 ,Dec 31, 2013 at 1:06

To escape the percent signs, double them:
tar -czf dirname_$(date '+%%Y%%m%%d').tar.gz %d

The above would compress the current directory (%d) to a file also in the current directory. If you want to compress the directory pointed to by the cursor rather than the current directory, use %f instead:

tar -czf %f_$(date '+%%Y%%m%%d').tar.gz %f

mc handles escaping of special characters so there is no need to put %f in quotes.

By the way, midnight commander's special treatment of percent signs occurs not just in the user menu file but also at the command line. This is an issue when using shell commands with constructs like ${var%.c} . At the command line, the same as in the user menu file, percent signs can be escaped by doubling them.

[Aug 19, 2019] mc - Is there any documentation about user-defined menu in midnight-commander - Unix Linux Stack Exchange

Aug 19, 2019 | unix.stackexchange.com



login ,Jun 11, 2014 at 13:13

I'd like to create my own user-defined menu for mc ( menu file). I see some lines like
+ t r & ! t t

or

+ t t

What does it mean?

goldilocks ,Jun 11, 2014 at 13:35

It is documented in the help, the node is "Edit Menu File" under "Command Menu"; if you scroll down you should find "Addition Conditions":

If the condition begins with '+' (or '+?') instead of '=' (or '=?') it is an addition condition. If the condition is true the menu entry will be included in the menu. If the condition is false the menu entry will not be included in the menu.

This is preceded by "Default conditions" (the = condition), which determine which entry will be highlighted as the default choice when the menu appears. Anyway, by way of example:

+ t r & ! t t

t r means if this is a regular file ("t(ype) r"), and ! t t means if the file has not been tagged in the interface.

Jarek

On top of what has been written above, this page can be browsed on the Internet when searching for man pages, e.g.: https://www.systutorials.com/docs/linux/man/1-mc/

Search for "Menu File Edit" .

Best regards, Jarek

[Aug 14, 2019] bash - PID background process - Unix Linux Stack Exchange

Aug 14, 2019 | unix.stackexchange.com



Raul ,Nov 27, 2016 at 18:21

As I understand pipes and commands, bash takes each command, spawns a process for each one and connects stdout of the previous one with the stdin of the next one.

For example, in "ls -lsa | grep feb", bash will create two processes, and connect the output of "ls -lsa" to the input of "grep feb".

When you execute a background command like "sleep 30 &" in bash, you get the pid of the background process running your command. Surprisingly for me, when I wrote "ls -lsa | grep feb &" bash returned only one PID.

How should this be interpreted? Does one process run both "ls -lsa" and "grep feb"? Or are several processes created, with me only getting the PID of one of them?

Raul ,Nov 27, 2016 at 19:21

Spawns 2 processes. The & displays the PID of the second process. Example below.
$ echo $$
13358
$ sleep 100 | sleep 200 &
[1] 13405
$ ps -ef|grep 13358
ec2-user 13358 13357  0 19:02 pts/0    00:00:00 -bash
ec2-user 13404 13358  0 19:04 pts/0    00:00:00 sleep 100
ec2-user 13405 13358  0 19:04 pts/0    00:00:00 sleep 200
ec2-user 13406 13358  0 19:04 pts/0    00:00:00 ps -ef
ec2-user 13407 13358  0 19:04 pts/0    00:00:00 grep --color=auto 13358
$


When you run a job in the background, bash prints the process ID of its subprocess, the one that runs the command in that job. If that job happens to create more subprocesses, that's none of the parent shell's business.

When the background job is a pipeline (i.e. the command is of the form something1 | something2 & , and not e.g. { something1 | something2; } & ), there's an optimization which is strongly suggested by POSIX and performed by most shells including bash: each of the elements of the pipeline are executed directly as subprocesses of the original shell. What POSIX mandates is that the variable $! is set to the last command in the pipeline in this case. In most shells, that last command is a subprocess of the original process, and so are the other commands in the pipeline.

When you run ls -lsa | grep feb , there are three processes involved: the one that runs the left-hand side of the pipe (a subshell that finishes setting up the pipe then executes ls ), the one that runs the right-hand side of the pipe (a subshell that finishes setting up the pipe then executes grep ), and the original process that waits for the pipe to finish.

You can watch what happens by tracing the processes:

$ strace -f -e clone,wait4,pipe,execve,setpgid bash --norc
execve("/usr/local/bin/bash", ["bash", "--norc"], [/* 82 vars */]) = 0
setpgid(0, 24084)                       = 0
bash-4.3$ sleep 10 | sleep 20 &

Note how the second sleep is reported and stored as $! , but the process group ID is the first sleep . Dash has the same oddity, ksh and mksh don't.
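
A small interactive sketch of the different views of a background pipeline (no PIDs shown, since they vary per run):

sleep 100 | sleep 200 &
echo "$!"        # PID of the last element of the pipeline (the second sleep)
jobs -p          # PID of the job's process group leader (the first sleep)
pgrep -P $$      # both sleeps: each element is a direct child of this shell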

[Aug 14, 2019] unix - How to get PID of process by specifying process name and store it in a variable to use further - Stack Overflow

Aug 14, 2019 | stackoverflow.com

Nidhi ,Nov 28, 2014 at 0:54

pids=$(pgrep <name>)

will get you the pids of all processes with the given name. To kill them all, use

kill -9 $pids

To refrain from using a variable and directly kill all processes with a given name issue

pkill -9 <name>

panticz.de ,Nov 11, 2016 at 10:11

On a single line...
pgrep -f process_name | xargs kill -9

flazzarini ,Jun 13, 2014 at 9:54

Another possibility would be to use pidof , which usually comes with most distributions. It will return the PID of a given process by using its name.
pidof process_name

This way you could store that information in a variable and execute kill -9 on it.

#!/bin/bash
pid=`pidof process_name`
kill -9 $pid

Pawel K ,Dec 20, 2017 at 10:27

First, use grep [n]ame instead of appending grep -v grep ; the bracket trick keeps the grep process itself from matching. Second, piping into xargs as shown above is risky for running arbitrary commands; you may want a mode that confirms each command before running it, otherwise you can have issues.

Instead of ps axf | grep name | grep -v grep | awk '{print "kill -9 " $1}' , isn't ps aux | grep [n]ame | awk '{print "kill -9 " $2}' better?

[Aug 14, 2019] linux - How to get PID of background process - Stack Overflow

Highly recommended!
Aug 14, 2019 | stackoverflow.com



pixelbeat ,Mar 20, 2013 at 9:11

I start a background process from my shell script, and I would like to kill this process when my script finishes.

How to get the PID of this process from my shell script? As far as I can see variable $! contains the PID of the current script, not the background process.

WiSaGaN ,Jun 2, 2015 at 14:40

You need to save the PID of the background process at the time you start it:
foo &
FOO_PID=$!
# do other stuff
kill $FOO_PID

You cannot use job control, since that is an interactive feature and tied to a controlling terminal. A script will not necessarily have a terminal attached at all so job control will not necessarily be available.
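
A common extension of this pattern (a sketch; foo stands for any long-running command) registers the kill in a trap, so the background process is cleaned up even if the script exits early:

foo &
FOO_PID=$!
trap 'kill "$FOO_PID" 2>/dev/null' EXIT
# do other stuff; foo is killed automatically when the script exits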

Phil ,Dec 2, 2017 at 8:01

You can use the jobs -l command to get to a particular job:
^Z
[1]+  Stopped                 guard

my_mac:workspace r$ jobs -l
[1]+ 46841 Suspended: 18           guard

In this case, 46841 is the PID.

From help jobs :

-l Report the process group ID and working directory of the jobs.

jobs -p is another option which shows just the PIDs.

Timo ,Dec 2, 2017 at 8:03

Here's a sample transcript from a bash session ( %1 refers to the ordinal number of background process as seen from jobs ):

$ echo $$
3748

$ sleep 100 &
[1] 192

$ echo $!
192

$ kill %1

[1]+  Terminated              sleep 100

lepe ,Dec 2, 2017 at 8:29

An even simpler way to kill all child processes of a bash script:
pkill -P $$

The -P flag works the same way with pkill and pgrep - it gets child processes, only with pkill the child processes get killed and with pgrep child PIDs are printed to stdout.
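
Note that pkill -P only reaches direct children. If the children have spawned children of their own, a recursive helper is needed; the following is a hypothetical sketch, not from the thread:

kill_tree() {
    local pid child
    pid=$1
    for child in $(pgrep -P "$pid"); do
        kill_tree "$child"     # kill descendants first
    done
    kill "$pid" 2>/dev/null
}

kill_tree "$SOME_PID"    # SOME_PID is a placeholder for the root of the tree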

Luis Ramirez ,Feb 20, 2013 at 23:11

This is what I have done. Check it out, hope it can help.
#!/bin/bash
#
# So something to show.
echo "UNO" >  UNO.txt
echo "DOS" >  DOS.txt
#
# Initialize Pid List
dPidLst=""
#
# Generate background processes
tail -f UNO.txt&
dPidLst="$dPidLst $!"
tail -f DOS.txt&
dPidLst="$dPidLst $!"
#
# Report process IDs
echo PID=$$
echo dPidLst=$dPidLst
#
# Show process on current shell
ps -f
#
# Start killing background processes from list
for dPid in $dPidLst
do
        echo killing $dPid. Process is still there.
        ps | grep $dPid
        kill $dPid
        ps | grep $dPid
        echo Just ran "'"ps"'" command, $dPid must not show again.
done

Then just run it as: ./bgkill.sh with proper permissions of course

root@umsstd22 [P]:~# ./bgkill.sh
PID=23757
dPidLst= 23758 23759
UNO
DOS
UID        PID  PPID  C STIME TTY          TIME CMD
root      3937  3935  0 11:07 pts/5    00:00:00 -bash
root     23757  3937  0 11:55 pts/5    00:00:00 /bin/bash ./bgkill.sh
root     23758 23757  0 11:55 pts/5    00:00:00 tail -f UNO.txt
root     23759 23757  0 11:55 pts/5    00:00:00 tail -f DOS.txt
root     23760 23757  0 11:55 pts/5    00:00:00 ps -f
killing 23758. Process is still there.
23758 pts/5    00:00:00 tail
./bgkill.sh: line 24: 23758 Terminated              tail -f UNO.txt
Just ran 'ps' command, 23758 must not show again.
killing 23759. Process is still there.
23759 pts/5    00:00:00 tail
./bgkill.sh: line 24: 23759 Terminated              tail -f DOS.txt
Just ran 'ps' command, 23759 must not show again.
root@umsstd22 [P]:~# ps -f
UID        PID  PPID  C STIME TTY          TIME CMD
root      3937  3935  0 11:07 pts/5    00:00:00 -bash
root     24200  3937  0 11:56 pts/5    00:00:00 ps -f

Phil ,Oct 15, 2013 at 18:22

You might also be able to use pstree:
pstree -p user

This typically gives a text representation of all the processes for the "user" and the -p option gives the process-id. It does not depend, as far as I understand, on having the processes be owned by the current shell. It also shows forks.

Phil ,Dec 4, 2018 at 9:46

pgrep can get you all of the child PIDs of a parent process. As mentioned earlier, $$ is the current script's PID. So, if you want a script that cleans up after itself, this should do the trick:
trap 'kill $( pgrep -P $$ | tr "\n" " " )' SIGINT SIGTERM EXIT

[Aug 10, 2019] How to check the file size in Linux-Unix bash shell scripting by Vivek Gite

Aug 10, 2019 | www.cyberciti.biz

The stat command shows information about the file. The syntax to get the file size with GNU/Linux stat is as follows:

stat -c %s "/etc/passwd"

OR

stat --format=%s "/etc/passwd"
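
The result can then be captured and compared numerically, e.g. (a sketch assuming GNU stat):

size=$(stat -c %s "/etc/passwd")
if [ "$size" -gt 1024 ]; then
    echo "/etc/passwd is larger than 1 KiB ($size bytes)"
fi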

[Aug 10, 2019] bash - How to check size of a file - Stack Overflow

Aug 10, 2019 | stackoverflow.com

[ -n file.txt ] doesn't check its size; it checks that the string file.txt is non-zero length, so it will always succeed.

If you want to say " size is non-zero", you need [ -s file.txt ] .

To get a file's size , you can use wc -c to get the size ( file length) in bytes:

file=file.txt
minimumsize=90000
actualsize=$(wc -c <"$file")
if [ $actualsize -ge $minimumsize ]; then
    echo size is over $minimumsize bytes
else
    echo size is under $minimumsize bytes
fi

In this case, it sounds like that's what you want.

But FYI, if you want to know how much disk space the file is using, you could use du -k to get the size (disk space used) in kilobytes:

file=file.txt
minimumsize=90
actualsize=$(du -k "$file" | cut -f 1)
if [ $actualsize -ge $minimumsize ]; then
    echo size is over $minimumsize kilobytes
else
    echo size is under $minimumsize kilobytes
fi

If you need more control over the output format, you can also look at stat . On Linux, you'd start with something like stat -c '%s' file.txt , and on BSD/Mac OS X, something like stat -f '%z' file.txt .

--Mikel

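
The two variants can be wrapped in a small portability shim (a sketch; it tries the GNU syntax first, then falls back to BSD/macOS):

file_size() {
    stat -c %s "$1" 2>/dev/null || stat -f %z "$1"
}

file_size /etc/passwd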

Oz Solomon ,Jun 13, 2014 at 21:44

It surprises me that no one mentioned stat to check file size. Some methods are definitely better: using -s to find out whether the file is empty or not is easier than anything else if that's all you want. And if you want to find files of a certain size, then find is certainly the way to go.

I also like du a lot to get file size in kb, but, for bytes, I'd use stat :

size=$(stat -f%z $filename) # BSD stat

size=$(stat -c%s $filename) # GNU stat?
An alternative solution with awk and double parentheses:
FILENAME=file.txt
SIZE=$(du -sb $FILENAME | awk '{ print $1 }')

if ((SIZE<90000)) ; then 
    echo "less"; 
else 
    echo "not less"; 
fi

[Aug 10, 2019] command line - How do I add file and directory comparison option to mc user menu - Ask Ubuntu

Aug 10, 2019 | askubuntu.com


sorin ,Mar 30, 2012 at 8:57

I want to add Beyond Compare diff to the mc (midnight commander) user menu.

All I know is that I need to add my custom command to ~/.mc/menu but I have no idea about the syntax to use.

I want to be able to compare two files from the two panes or the directories themselves.

The command that I need to run is bcompare file1 file2 & (same for directories, it will figure it out).

mivk ,Oct 17, 2015 at 15:35

Add this to ~/.mc/menu :
+ t r & ! t t
d       Diff against file of same name in other directory
        if [ "%d" = "%D" ]; then
          echo "The two directores must be different"
          exit 1
        fi
        if [ -f %D/%f ]; then        # if two of them, then
          bcompare %f %D/%f &
        else
          echo %f: No copy in %D/%f
        fi

x       Diff file to file
        if [ -f %D/%F ]; then        # if two of them, then
          bcompare %f %D/%F &
        else
          echo %f: No copy in %D/%f
        fi

D       Diff current directory against other directory
        if [ "%d" = "%D" ]; then
          echo "The two directores must be different"
          exit 1
        fi
        bcompare %d %D &

[Aug 10, 2019] midnight commander - How to configure coloring of the file names in MC - Super User

If colors are crazy, the simplest way to solve this problem is to turn them off
To turn off color you can also use the option mc --nocolor or the -b flag.
You can customize the colors displayed by defining them in ~/.mc/ini . But that requires some work. Have a look here for an example: http://ajnasz.hu/blog/20080101/midnight-commander-coloring .
Aug 10, 2019 | superuser.com

Mike L. ,Jan 9, 2011 at 17:21

Is it possible to configure the Midnight Commander (Ubuntu 10.10) to show certain file and directory names differently, e.g. all hidden (starting with a period) using grey color?

Mike L. ,Feb 20, 2018 at 5:51

Under Options -> Panel Options select File highlight -> File types .

See man mc in the Colors section for ways to choose particular colors by adding entries in your ~/.config/mc/ini file. Unfortunately, there doesn't appear to be a keyword for hidden files.
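
For illustration, a hypothetical ~/.config/mc/ini snippet (the exact keywords and color names are documented in the Colors section of man mc ; this is a sketch, not a tested configuration):

[Colors]
base_color=normal=lightgray,blue:selected=black,cyan:marked=yellow,blue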

[Aug 07, 2019] Find files and tar them (with spaces)

Aug 07, 2019 | stackoverflow.com



porges ,Sep 6, 2012 at 17:43

Alright, so simple problem here. I'm working on a simple back up code. It works fine except if the files have spaces in them. This is how I'm finding files and adding them to a tar archive:
find . -type f | xargs tar -czvf backup.tar.gz

The problem is when the file has a space in the name because tar thinks that it's a folder. Basically is there a way I can add quotes around the results from find? Or a different way to fix this?

Brad Parks ,Mar 2, 2017 at 18:35

Use this:
find . -type f -print0 | tar -czvf backup.tar.gz --null -T -

It will handle file names containing spaces or newlines: -print0 and --null use a NUL byte as the separator, and -T - makes tar read the file list from standard input.

czubehead ,Mar 19, 2018 at 11:51

There could be another way to achieve what you want. Basically,
  1. Use the find command to output path to whatever files you're looking for. Redirect stdout to a filename of your choosing.
  2. Then tar with the -T option which allows it to take a list of file locations (the one you just created with find!)
    find . -name "*.whatever" > yourListOfFiles
    tar -cvf yourfile.tar -T yourListOfFiles
    

gsteff ,May 5, 2011 at 2:05

Try running:
    find . -type f | xargs -d "\n" tar -czvf backup.tar.gz

Caleb Kester ,Oct 12, 2013 at 20:41

Why not:
tar czvf backup.tar.gz *

Sure it's clever to use find and then xargs, but you're doing it the hard way.

Update: Porges has commented with a find-option that I think is a better answer than my answer, or the other one: find -print0 ... | xargs -0 ....

Kalibur x ,May 19, 2016 at 13:54

If you have multiple files or directories and you want to create an independent *.gz archive for each of them, you can do this. Optional flags such as -type f and -mtime restrict the selection:
find -name "httpd-log*.txt" -type f -mtime +1 -exec tar -vzcf {}.gz {} \;

This will compress

httpd-log01.txt
httpd-log02.txt

to

httpd-log01.txt.gz
httpd-log02.txt.gz

Frank Eggink ,Apr 26, 2017 at 8:28

Why not give something like this a try: tar cvf scala.tar `find src -name '*.scala'` (quoting the pattern keeps the shell from expanding it).

tommy.carstensen ,Dec 10, 2017 at 14:55

Another solution as seen here :
find var/log/ -iname "anaconda.*" -exec tar -cvzf file.tar.gz {} +

Robino ,Sep 22, 2016 at 14:26

The best solution seems to be to create a file list and then archive the files, because you can use other sources and do something else with the list.

For example this allows using the list to calculate size of the files being archived:

#!/bin/sh

backupFileName="backup-big-$(date +"%Y%m%d-%H%M")"
backupRoot="/var/www"
backupOutPath=""

archivePath=$backupOutPath$backupFileName.tar.gz
listOfFilesPath=$backupOutPath$backupFileName.filelist

#
# Make a list of files/directories to archive
#
echo "" > $listOfFilesPath
echo "${backupRoot}/uploads" >> $listOfFilesPath
echo "${backupRoot}/extra/user/data" >> $listOfFilesPath
find "${backupRoot}/drupal_root/sites/" -name "files" -type d >> $listOfFilesPath

#
# Size calculation
#
sizeForProgress=`
cat $listOfFilesPath | while read nextFile;do
    if [ ! -z "$nextFile" ]; then
        du -sb "$nextFile"
    fi
done | awk '{size+=$1} END {print size}'
`

#
# Archive with progress
#
## simple with dump of all files currently archived
#tar -czvf $archivePath -T $listOfFilesPath
## progress bar
sizeForShow=$(($sizeForProgress/1024/1024))
echo -e "\nRunning backup [source files are $sizeForShow MiB]\n"
tar -cPp -T $listOfFilesPath | pv -s $sizeForProgress | gzip > $archivePath

user3472383 ,Jun 27 at 1:11

Would add a comment to @Steve Kehlet post but need 50 rep (RIP).

For anyone that has found this post through numerous googling, I found a way to not only find specific files given a time range, but also NOT include the relative paths OR whitespaces that would cause tarring errors. (THANK YOU SO MUCH STEVE.)

find . -name "*.pdf" -type f -mtime 0 -printf "%f\0" | tar -czvf /dir/zip.tar.gz --null -T -
  1. . relative directory
  2. -name "*.pdf" look for pdfs (or any file type)
  3. -type f type to look for is a file
  4. -mtime 0 look for files created in last 24 hours
  5. -printf "%f\0" Regular -print0 OR -printf "%f" did NOT work for me. From man pages:

This quoting is performed in the same way as for GNU ls. This is not the same quoting mechanism as the one used for -ls and -fls. If you are able to decide what format to use for the output of find then it is normally better to use '\0' as a terminator than to use newline, as file names can contain white space and newline characters.

  6. -czvf create archive, filter the archive through gzip , verbosely list files processed, archive name

[Aug 06, 2019] Tar archiving that takes input from a list of files

Aug 06, 2019 | stackoverflow.com



Kurt McKee ,Apr 29 at 10:22

I have a file that contains a list of files I want to archive with tar. Let's call it mylist.txt

It contains:

file1.txt
file2.txt
...
file10.txt

Is there a way I can issue a tar command that takes mylist.txt as input? Something like

tar -cvf allfiles.tar -[someoption?] mylist.txt

So that it is similar as if I issue this command:

tar -cvf allfiles.tar file1.txt file2.txt file10.txt

Stphane ,May 25 at 0:11

Yes:
tar -cvf allfiles.tar -T mylist.txt

drue ,Jun 23, 2014 at 14:56

Assuming GNU tar (as this is Linux), the -T or --files-from option is what you want.

Stphane ,Mar 1, 2016 at 20:28

You can also pipe in the file names which might be useful:
find /path/to/files -name \*.txt | tar -cvf allfiles.tar -T -

David C. Rankin ,May 31, 2018 at 18:27

Some versions of tar, for example, the default versions on HP-UX (I tested 11.11 and 11.31), do not include a command line option to specify a file list, so a decent work-around is to do this (note that the unquoted command substitution word-splits, so it breaks on file names containing spaces):
tar cvf allfiles.tar $(cat mylist.txt)

Jan ,Sep 25, 2015 at 20:18

On Solaris, you can use the option -I to read the filenames that you would normally state on the command line from a file. In contrast to the command line, this can create tar archives with hundreds of thousands of files (just did that).

So the example would read

tar -cvf allfiles.tar -I mylist.txt


For me on AIX, it worked as follows:
tar -L List.txt -cvf BKP.tar

[Aug 06, 2019] Shell command to tar directory excluding certain files-folders

Aug 06, 2019 | stackoverflow.com



Rekhyt ,Jun 24, 2014 at 16:06

Is there a simple shell command/script that supports excluding certain files/folders from being archived?

I have a directory that needs to be archived, with a sub directory that has a number of very large files I do not need to back up.

Not quite solutions:

The tar --exclude=PATTERN command matches the given pattern and excludes those files, but I need specific files & folders to be ignored (full file path), otherwise valid files might be excluded.

I could also use the find command to create a list of files and exclude the ones I don't want to archive and pass the list to tar, but that only works for a small number of files. I have tens of thousands.

I'm beginning to think the only solution is to create a file with a list of files/folders to be excluded, then use rsync with --exclude-from=file to copy all the files to a tmp directory, and then use tar to archive that directory.

Can anybody think of a better/more efficient solution?

EDIT: Charles Ma 's solution works well. The big gotcha is that the --exclude='./folder' MUST be at the beginning of the tar command. Full command (cd first, so backup is relative to that directory):

cd /folder_to_backup
tar --exclude='./folder' --exclude='./upload/folder2' -zcvf /backup/filename.tgz .

James O'Brien ,Nov 24, 2016 at 9:55

You can have multiple exclude options for tar so
$ tar --exclude='./folder' --exclude='./upload/folder2' -zcvf /backup/filename.tgz .

etc will work. Make sure to put --exclude before the source and destination items.

Johan Soderberg ,Jun 11, 2009 at 23:10

You can exclude directories with --exclude for tar.

If you want to archive everything except /usr you can use:

tar -zcvf /all.tgz / --exclude=/usr

In your case perhaps something like

tar -zcvf archive.tgz arc_dir --exclude=dir/ignore_this_dir

cstamas ,Oct 8, 2018 at 18:02

Possible options to exclude files/directories from backup using tar:

Exclude files using multiple patterns

tar -czf backup.tar.gz --exclude=PATTERN1 --exclude=PATTERN2 ... /path/to/backup

Exclude files using an exclude file filled with a list of patterns

tar -czf backup.tar.gz -X /path/to/exclude.txt /path/to/backup

Exclude files using tags by placing a tag file in any directory that should be skipped

tar -czf backup.tar.gz --exclude-tag-all=exclude.tag /path/to/backup

Anish Ramaswamy ,Apr 1 at 16:18

Old question with many answers, but I found that none were quite clear enough for me, so I would like to add my try.

if you have the following structure

/home/ftp/mysite/

with following file/folders

/home/ftp/mysite/file1
/home/ftp/mysite/file2
/home/ftp/mysite/file3
/home/ftp/mysite/folder1
/home/ftp/mysite/folder2
/home/ftp/mysite/folder3

So, you want to make a tar file that contains everything inside /home/ftp/mysite (to move the site to a new server), but file3 is just junk, and everything in folder3 is also not needed, so we will skip those two.

we use the format

tar -czvf <name of tar file> <what to tar> <any excludes>

where c = create, z = zip, and v = verbose (you can see the files as they are entered, useful to make sure none of the files you exclude are being added), and f = file.

So, my command would look like this:

cd /home/ftp/
tar -czvf mysite.tar.gz mysite --exclude='file3' --exclude='folder3'

Note the files/folders excluded are relative to the root of your tar (I have tried a full path here relative to / but I cannot make that work).

Hope this will help someone (and me next time I google it).

not2qubit ,Apr 4, 2018 at 3:24

You can use standard "ant notation" to exclude directories with relative patterns.
This works for me and excludes any .git or node_modules directories:
tar -cvf myFile.tar --exclude=**/.git/* --exclude=**/node_modules/*  -T /data/txt/myInputFile.txt 2> /data/txt/myTarLogFile.txt

myInputFile.txt Contains:

/dev2/java
/dev2/javascript

GeertVc ,Feb 9, 2015 at 13:37

I've experienced that, at least with the Cygwin version of tar I'm using ("CYGWIN_NT-5.1 1.7.17(0.262/5/3) 2012-10-19 14:39 i686 Cygwin" on a Windows XP Home Edition SP3 machine), the order of options is important.

While this construction worked for me:

tar cfvz target.tgz --exclude='<dir1>' --exclude='<dir2>' target_dir

that one didn't work:

tar cfvz --exclude='<dir1>' --exclude='<dir2>' target.tgz target_dir

This, while tar --help reveals the following:

tar [OPTION...] [FILE]

So the second command should also work, but apparently it doesn't. The likely reason: with bundled options like cfvz , f takes the next argument as the archive name, so in the second command that argument is --exclude='<dir1>' rather than target.tgz ...

Best rgds,

Scott Stensland ,Feb 12, 2015 at 20:55

This exclude pattern handles filename suffix like png or mp3 as well as directory names like .git and node_modules
tar --exclude={*.png,*.mp3,*.wav,.git,node_modules} -Jcf ${target_tarball}  ${source_dirname}

Michael ,May 18 at 23:29

I found this somewhere else so I won't take credit, but it worked better than any of the solutions above for my mac specific issues (even though this is closed):
tar zc --exclude __MACOSX --exclude .DS_Store -f <archive> <source(s)>

J. Lawson ,Apr 17, 2018 at 23:28

For those who have issues with it, some versions of tar would only work properly without the './' in the exclude value.
tar --version

tar (GNU tar) 1.27.1

Command syntax that work:

tar -czvf ../allfiles-butsome.tar.gz * --exclude=acme/foo

These will not work:

$ tar -czvf ../allfiles-butsome.tar.gz * --exclude=./acme/foo
$ tar -czvf ../allfiles-butsome.tar.gz * --exclude='./acme/foo'
$ tar --exclude=./acme/foo -czvf ../allfiles-butsome.tar.gz *
$ tar --exclude='./acme/foo' -czvf ../allfiles-butsome.tar.gz *
$ tar -czvf ../allfiles-butsome.tar.gz * --exclude=/full/path/acme/foo
$ tar -czvf ../allfiles-butsome.tar.gz * --exclude='/full/path/acme/foo'
$ tar --exclude=/full/path/acme/foo -czvf ../allfiles-butsome.tar.gz *
$ tar --exclude='/full/path/acme/foo' -czvf ../allfiles-butsome.tar.gz *

Jerinaw ,May 6, 2017 at 20:07

For Mac OSX I had to do

tar -zcv --exclude='folder' -f theOutputTarFile.tar folderToTar

Note the -f after the --exclude=

Aaron Votre ,Jul 15, 2016 at 15:56

I agree the --exclude flag is the right approach.
$ tar --exclude='./folder_or_file' --exclude='file_pattern' --exclude='fileA'

A word of warning for a side effect that I did not find immediately obvious: The exclusion of 'fileA' in this example will search for 'fileA' RECURSIVELY!

Example:A directory with a single subdirectory containing a file of the same name (data.txt)

data.txt
config.txt
--+dirA
  |  data.txt
  |  config.docx

Znik ,Nov 15, 2014 at 5:12

To avoid possible 'xargs: Argument list too long' errors due to the use of find ... | xargs ... when processing tens of thousands of files, you can pipe the output of find directly to tar using find ... -print0 | tar --null ... .
# archive a given directory, but exclude various files & directories 
# specified by their full file paths
find "$(pwd -P)" -type d \( -path '/path/to/dir1' -or -path '/path/to/dir2' \) -prune \
   -or -not \( -path '/path/to/file1' -or -path '/path/to/file2' \) -print0 | 
   gnutar --null --no-recursion -czf archive.tar.gz --files-from -
   #bsdtar --null -n -czf archive.tar.gz -T -

Mike ,May 9, 2014 at 21:29

After reading this thread, I did a little testing on RHEL 5 and here are my results for tarring up the abc directory:

This will exclude the directories error and logs and all files under the directories:

tar cvpzf abc.tgz abc/ --exclude='abc/error' --exclude='abc/logs'

Adding a wildcard after the excluded directory will exclude the files but preserve the directories:

tar cvpzf abc.tgz abc/ --exclude='abc/error/*' --exclude='abc/logs/*'

Alex B ,Jun 11, 2009 at 23:03

Use the find command in conjunction with the tar append (-r) option. This way you can add files to an existing tar in a single step, instead of a two pass solution (create list of files, create tar).
find /dir/dir -prune ... -o etc etc.... -exec tar rvf ~/tarfile.tar {} \;

frommelmak ,Sep 10, 2012 at 14:08

You can also use one of the "--exclude-tag" options depending on your needs:

The folder hosting the specified FILE will be excluded.

camh ,Jun 12, 2009 at 5:53

You can use cpio(1) to create tar files. cpio takes the files to archive on stdin, so if you've already figured out the find command you want to use to select the files to archive, pipe it into cpio to create the tar file:
find ... | cpio -o -H ustar | gzip -c > archive.tar.gz

PicoutputCls ,Aug 21, 2018 at 14:13

With GNU tar v1.26, the --exclude needs to come after the archive file and backup directory arguments, should have no leading or trailing slashes, and prefers no quotes (single or double). So, relative to the PARENT directory to be backed up, it's:

tar cvfz /path_to/mytar.tgz ./dir_to_backup --exclude=some_path/to_exclude

user2553863 ,May 28 at 21:41

After reading all these good answers for different versions and having solved the problem for myself, I think there are small details that are very important, rarely stressed in general GNU/Linux use, and that deserve more than comments.

So I'm not going to try to answer the question for every case, but instead try to register where to look when things don't work.

IT IS VERY IMPORTANT TO NOTICE:

  1. THE ORDER OF THE OPTIONS MATTERS: it is not the same to put the --exclude before the file option and the directories to back up as after them. This is unexpected, at least to me, because in my experience with GNU/Linux commands the order of the options usually doesn't matter.
  2. Different tar versions expects this options in different order: for instance, @Andrew's answer indicates that in GNU tar v 1.26 and 1.28 the excludes comes last, whereas in my case, with GNU tar 1.29, it's the other way.
  3. THE TRAILING SLASHES MATTER: at least in GNU tar 1.29, there shouldn't be any.

In my case, for GNU tar 1.29 on Debian stretch, the command that worked was

tar --exclude="/home/user/.config/chromium" --exclude="/home/user/.cache" -cf file.tar  /dir1/ /home/ /dir3/

The quotes didn't matter, it worked with or without them.

I hope this will be useful to someone.

jørgensen ,Dec 19, 2015 at 11:10

Your best bet is to use find with tar, via xargs (to handle the large number of arguments). For example:
find / -print0 | xargs -0 tar cjf tarfile.tar.bz2

Ashwini Gupta ,Jan 12, 2018 at 10:30

tar -cvzf destination_folder source_folder -X /home/folder/excludes.txt

-X indicates a file which contains a list of filenames which must be excluded from the backup. For instance, you can specify *~ in this file to not include any filenames ending with ~ in the backup.

George ,Sep 4, 2013 at 22:35

Possibly redundant answer, but since I found it useful, here it is:

While root on FreeBSD (i.e. using csh) I wanted to copy my whole root filesystem to /mnt but without /usr and (obviously) /mnt. This is what worked (I am at / ):

tar --exclude ./usr --exclude ./mnt --create --file - . | (cd /mnt && tar xvf -)

My whole point is that it was necessary (by putting the ./ ) to specify to tar that the excluded directories were part of the greater directory being copied.

My €0.02

t0r0X ,Sep 29, 2014 at 20:25

I had no luck getting tar to exclude a 5 Gigabyte subdirectory a few levels deep. In the end, I just used the unix Zip command. It worked a lot easier for me.

So for this particular example from the original post
(tar --exclude='./folder' --exclude='./upload/folder2' -zcvf /backup/filename.tgz . )

The equivalent would be:

zip -r /backup/filename.zip . -x upload/folder/**\* upload/folder2/**\*

(NOTE: Here is the post I originally used that helped me https://superuser.com/questions/312301/unix-zip-directory-but-excluded-specific-subdirectories-and-everything-within-t )

RohitPorwal ,Jul 21, 2016 at 9:56

Check it out
tar cvpzf zip_folder.tgz . --exclude=./public --exclude=./tmp --exclude=./log --exclude=fileName

tripleee ,Sep 14, 2017 at 4:38

The following bash script should do the trick. It uses the answer given here by Marcus Sundman.
#!/bin/bash

echo -n "Please enter the name of the tar file you wish to create with out extension "
read nam

echo -n "Please enter the path to the directories to tar "
read pathin

echo tar -czvf $nam.tar.gz
excludes=`find $pathin -iname "*.CC" -exec echo "--exclude \'{}\'" \;|xargs`
echo $pathin

echo tar -czvf $nam.tar.gz $excludes $pathin

This will print out the command you need and you can just copy and paste it back in. There is probably a more elegant way to provide it directly to the command line.

Just change *.CC for any other common extension, file name or regex you want to exclude and this should still work.

EDIT

Just to add a little explanation: find generates a list of files matching the chosen pattern (in this case *.CC). This list is passed via xargs to the echo command. This prints --exclude 'one entry from the list' . The backslashes (\) are escape characters for the ' marks.

[Aug 06, 2019] bash - More efficient way to find tar millions of files - Stack Overflow

Aug 06, 2019 | stackoverflow.com



theomega ,Apr 29, 2010 at 13:51

I've got a job running on my server at the command line prompt for a two days now:
find data/ -name filepattern-*2009* -exec tar uf 2009.tar {} \;

It is taking forever , and then some. Yes, there are millions of files in the target directory. (Each file is a measly 8 bytes in a well hashed directory structure.) But just running...

find data/ -name filepattern-*2009* -print > filesOfInterest.txt

...takes only two hours or so. At the rate my job is running, it won't be finished for a couple of weeks. That seems unreasonable. Is there a more efficient way to do this? Maybe with a more complicated bash script?

A secondary question is "why is my current approach so slow?"

Stu Thompson ,May 6, 2013 at 1:11

If you already did the second command that created the file list, just use the -T option to tell tar to read the files names from that saved file list. Running 1 tar command vs N tar commands will be a lot better.
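
In other words, a sketch reusing the file list from the question:

tar -cf 2009.tar -T filesOfInterest.txt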

Matthew Mott ,Jul 3, 2014 at 19:21

One option is to use cpio to generate a tar-format archive:
$ find data/ -name "filepattern-*2009*" | cpio -ov --format=ustar > 2009.tar

cpio works natively with a list of filenames from stdin, rather than a top-level directory, which makes it an ideal tool for this situation.

bashfu ,Apr 23, 2010 at 10:05

Here's a find-tar combination that can do what you want without the use of xargs or exec (which should result in a noticeable speed-up):
tar --version    # tar (GNU tar) 1.14 

# FreeBSD find (on Mac OS X)
find -x data -name "filepattern-*2009*" -print0 | tar --null --no-recursion -uf 2009.tar --files-from -

# for GNU find use -xdev instead of -x
gfind data -xdev -name "filepattern-*2009*" -print0 | tar --null --no-recursion -uf 2009.tar --files-from -

# added: set permissions via tar
find -x data -name "filepattern-*2009*" -print0 | \
    tar --null --no-recursion --owner=... --group=... --mode=... -uf 2009.tar --files-from -

Stu Thompson ,Apr 28, 2010 at 12:50

There is xargs for this:
find data/ -name filepattern-*2009* -print0 | xargs -0 tar uf 2009.tar

Guessing why it is slow is hard, as there is not much information: what is the structure of the directory, what filesystem do you use, how was it configured when created? Having millions of files in a single directory is quite a hard situation for most filesystems.

bashfu ,May 1, 2010 at 14:18

To correctly handle file names with weird (but legal) characters (such as newlines, ...) you should write your file list to filesOfInterest.txt using find's -print0:
find -x data -name "filepattern-*2009*" -print0 > filesOfInterest.txt
tar --null --no-recursion -uf 2009.tar --files-from filesOfInterest.txt

Michael Aaron Safyan ,Apr 23, 2010 at 8:47

The way you currently have things, you are invoking the tar command every single time it finds a file, which is not surprisingly slow. Instead of taking the two hours to print plus the amount of time it takes to open the tar archive, see if the files are out of date, and add them to the archive, you are actually multiplying those times together. You might have better success invoking the tar command once, after you have batched together all the names, possibly using xargs to achieve the invocation. By the way, I hope you are using 'filepattern-*2009*' and not filepattern-*2009* as the stars will be expanded by the shell without quotes.

ruffrey ,Nov 20, 2018 at 17:13

There is a utility for this called tarsplitter .
tarsplitter -m archive -i folder/*.json -o archive.tar -p 8

will use 8 threads to archive the files matching "folder/*.json" into an output archive of "archive.tar"

https://github.com/AQUAOSOTech/tarsplitter

syneticon-dj ,Jul 22, 2013 at 8:47

Simplest (also removes files after archive creation):
find *.1  -exec tar czf '{}.tgz' '{}' --remove-files \;

[Aug 06, 2019] backup - Fastest way combine many files into one (tar czf is too slow) - Unix Linux Stack Exchange

Aug 06, 2019 | unix.stackexchange.com



Gilles ,Nov 5, 2013 at 0:05

Currently I'm running tar czf to combine backup files. The files are in a specific directory.

But the number of files is growing. Using tar czf takes too much time (more than 20 minutes and counting).

I need to combine the files more quickly and in a scalable fashion.

I've found genisoimage , readom and mkisofs . But I don't know which is fastest and what the limitations are for each of them.

Rufo El Magufo ,Aug 24, 2017 at 7:56

You should check whether most of your time is being spent on CPU or on I/O. Either way, there are ways to improve it:

A: don't compress

You didn't mention "compression" in your list of requirements, so try dropping the "z" from your argument list: tar cf . This might speed things up a bit.

There are other techniques to speed up the process, like using "-N " to skip files you have already backed up before.

B: backup the whole partition with dd

Alternatively, if you're backing up an entire partition, take a copy of the whole disk image instead. This would save processing and a lot of disk head seek time. tar and any other program working at a higher level have the overhead of reading and processing directory entries and inodes to find where the file content is, and of doing more disk head seeks, reading each file from a different place on the disk.

To backup the underlying data much faster, use:

dd bs=16M if=/dev/sda1 of=/another/filesystem

(This assumes you're not using RAID, which may change things a bit)
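Restoring such an image is the same command with if and of swapped; a sketch under the same assumptions (the target partition must be unmounted and at least as large as the image, and this overwrites it completely):

dd bs=16M if=/another/filesystem of=/dev/sda1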


To repeat what others have said: we need to know more about the files that are being backed up. I'll go with some assumptions here.

Append to the tar file

If files are only being added to the directories (that is, no file is being deleted), make sure you are appending to the existing tar file rather than re-creating it every time. You can do this by specifying the existing archive filename in your tar command instead of a new one (or deleting the old one).

Write to a different disk

Reading from the same disk you are writing to may be killing performance. Try writing to a different disk to spread the I/O load. If the archive file needs to be on the same disk as the original files, move it afterwards.

Don't compress

Just repeating what @Yves said. If your backup files are already compressed, there's not much need to compress again. You'll just be wasting CPU cycles.
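Taken together, the three suggestions amount to something like this sketch (the paths are hypothetical; tar's -u mode appends only files newer than the copy already in the archive):

# uncompressed, append-only archive written to a second disk
tar uf /mnt/backupdisk/backup.tar /data/to/backup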

[Aug 02, 2019] linux - How to tar directory and then remove originals including the directory - Super User

Aug 02, 2019 | superuser.com



mit ,Dec 7, 2016 at 1:22

I'm trying to tar a collection of files in a directory called 'my_directory' and remove the originals by using the command:
tar -cvf files.tar my_directory --remove-files

However it is only removing the individual files inside the directory and not the directory itself (which is what I specified in the command). What am I missing here?

EDIT:

Yes, I suppose the 'remove-files' option is fairly literal, although I too found the man page unclear on that point. (In Linux I tend not to distinguish much between directories and files, and sometimes forget that they are not the same thing.) It looks like the consensus is that it doesn't remove directories.

However, my major prompting point for asking this question stems from tar's handling of absolute paths. Because you must specify a relative path to the file(s) to be compressed, you must change to the parent directory to tar it properly. As I see it, using any kind of follow-on 'rm' command is potentially dangerous in that situation. Thus I was hoping to simplify things by making tar itself do the removal.

For example, imagine a backup script where the directory to back up (i.e. tar) is included as a shell variable. If that shell variable value was badly entered, it is possible that the result could be files deleted from whatever directory you happened to be in last.

Arjan ,Feb 13, 2016 at 13:08

You are missing the part which says the --remove-files option removes files after adding them to the archive.

You could follow the archive and file-removal operation with a command like,

find /path/to/be/archived/ -depth -type d -empty -exec rmdir {} \;


Update: You may be interested in reading this short Debian discussion on,
Bug 424692: --remove-files complains that directories "changed as we read it" .

Kim ,Feb 13, 2016 at 13:08

Since the --remove-files option only removes files , you could try
tar -cvf files.tar my_directory && rm -R my_directory

so that the directory is removed only if tar returns an exit status of 0

redburn ,Feb 13, 2016 at 13:08

Have you tried putting the --remove-files directive after the archive name? It works for me.
tar -cvf files.tar --remove-files my_directory

shellking ,Oct 4, 2010 at 19:58

source={directory argument}            # e.g. source={FULL ABSOLUTE PATH}/my_directory
parent={parent directory of argument}  # e.g. parent={ABSOLUTE PATH of 'my_directory'}/
logFile={path to a run log that captures status messages}

Then you could execute something along the lines of:

cd ${parent}

tar cvf Tar_File.`date +%Y%m%d_%H%M%S` ${source}

if [ $? != 0 ]
then
    echo "Backup FAILED for ${source} at `date`" >> ${logFile}
else
    echo "Backup SUCCESS for ${source} at `date`" >> ${logFile}
    rm -rf ${source}
fi

mit ,Nov 14, 2011 at 13:21

This was probably a bug.

Also, the word "file" is ambiguous in this case. But because this is a command line switch, I would expect it to mean directories as well, because in Unix/Linux everything is a file, including directories. (The other interpretation is of course also valid, but it makes no sense to keep directories in such a case. I would consider it unexpected and confusing behavior.)

But I have found that GNU tar on some distributions actually removes the directory tree. Another indication that keeping the tree was a bug, or at least a workaround until they fixed it.

This is what I tried out on an Ubuntu 10.04 console:

mit:/var/tmp$ mkdir tree1                                                                                               
mit:/var/tmp$ mkdir tree1/sub1                                                                                          
mit:/var/tmp$ > tree1/sub1/file1                                                                                        

mit:/var/tmp$ ls -la                                                                                                    
drwxrwxrwt  4 root root 4096 2011-11-14 15:40 .                                                                              
drwxr-xr-x 16 root root 4096 2011-02-25 03:15 ..
drwxr-xr-x  3 mit  mit  4096 2011-11-14 15:40 tree1

mit:/var/tmp$ tar -czf tree1.tar.gz tree1/ --remove-files

# AS YOU CAN SEE THE TREE IS GONE NOW:

mit:/var/tmp$ ls -la
drwxrwxrwt  3 root root 4096 2011-11-14 15:41 .
drwxr-xr-x 16 root root 4096 2011-02-25 03:15 ..
-rw-r--r--  1 mit   mit    159 2011-11-14 15:41 tree1.tar.gz                                                                   


mit:/var/tmp$ tar --version                                                                                             
tar (GNU tar) 1.22                                                                                                           
Copyright © 2009 Free Software Foundation, Inc.

If you want to see it on your machine, paste this into a console at your own risk:

tar --version                                                                                             
cd /var/tmp
mkdir -p tree1/sub1                                                                                          
> tree1/sub1/file1                                                                                        
tar -czf tree1.tar.gz tree1/ --remove-files
ls -la

[Jul 31, 2019] Is Ruby moving toward extinction?

Jul 31, 2019 | developers.slashdot.org

timeOday ( 582209 ) , Monday July 29, 2019 @03:44PM ( #59007686 )

Re:ORLY ( Score: 5 , Insightful)

This is what it feels like to actually learn from an article instead of simply having it confirm your existing beliefs.

Here is what it says:

An analysis of Dice job-posting data over the past year shows a startling dip in the number of companies looking for technology professionals who are skilled in Ruby. In 2018, the number of Ruby jobs declined 56 percent. That's a huge warning sign that companies are turning away from Ruby - and if that's the case, the language's user-base could rapidly erode to almost nothing.

Well, what's your evidence-based rebuttal to that?

Wdomburg ( 141264 ) writes:
Re: ( Score: 2 )

If you actually look at the TIOBE rankings, it's #11 (not #12 as claimed in the article), and back on the upswing. If you look at RedMonk, which they say they looked at but don't reference with respect to Ruby, it is a respectable #8, being one of the top languages on GitHub and Stack Overflow.

We are certainly past the glory days of Ruby, when it was the Hot New Thing and everyone was deploying Rails, but to suggest that it is "probably doomed" seems a somewhat hysterical prediction.

OrangeTide ( 124937 ) , Tuesday July 30, 2019 @01:52AM ( #59010348 ) Homepage Journal
Re:ORLY ( Score: 4 , Funny)
How do they know how many Ruby jobs there are? Maybe they know how many Ruby job openings were announced, but not the actual number of jobs. Or maybe people are finding Ruby job applicants and openings via other means.

Maybe there is a secret list of Ruby job postings only available to the coolest programmers? Man! I never get to hang out with the cool kids.

jellomizer ( 103300 ) , Monday July 29, 2019 @03:48PM ( #59007714 )
Re:ORLY ( Score: 5 , Insightful)

Perhaps devops/web programming is a dying field.

But to be fair, Ruby had its peak about 10 years ago, with Ruby on Rails. However, the problem is that Rails started to get very dated, and Python and Node.js have taken its place.

whitelabrat ( 469237 ) , Monday July 29, 2019 @03:57PM ( #59007778 )
Re:ORLY ( Score: 5 , Insightful)

I don't see Ruby dying anytime soon, but I do get the feeling that Python is the go-to scripting language for all the things now. I learned Ruby and wish I spent that time learning Python.

Perl is perl. It will live on, but anybody writing new things with it probably needs to have a talkin' to.

phantomfive ( 622387 ) , Monday July 29, 2019 @07:32PM ( #59009188 ) Journal
Re:ORLY ( Score: 4 , Insightful)
I learned Ruby and wish I spent that time learning Python.

Ruby and Python are basically the same thing. With a little google, you can literally start programming in Python today. Search for "print python" and you can easily write a hello world. Search for 'python for loop' and suddenly you can do repetitious tasks. Search for "define function python" and you can organize your code.

After that do a search for hash tables and lists in Python and you'll be good enough to pass a coding interview in the language.

[Jul 31, 2019] 5 Programming Languages That Are Probably Doomed

The article is clickbait; entrenched languages seldom die. But some Slashdot comments are interesting.
Jul 31, 2019 | developers.slashdot.org

NoNonAlphaCharsHere ( 2201864 ) , Monday July 29, 2019 @03:39PM ( #59007638 )

Re:ORLY ( Score: 5 , Funny)

Perl has been "doomed" for over 30 years now, hasn't stopped it.

geekoid ( 135745 ) writes:
Re: ( Score: 2 )

OTOH, it's not exactly what it once was.

IMO: if you can't write good readable code in Perl, you should find a new business to work in.

Anonymous Coward writes:
check the job description ( Score: 3 , Funny)

Writing unreadable perl is the business.

ShanghaiBill ( 739463 ) writes:
Re: ( Score: 3 )
Perl has been "doomed" for over 30 years now, hasn't stopped it.

I love Perl, but today it is mostly small throw-away scripts and maintaining legacy apps.

It makes little sense to use Perl for a new project.

Perl won't disappear, but the glory days are in the past.

Anonymous Coward , Monday July 29, 2019 @03:59PM ( #59007794 )
Re:ORLY ( Score: 4 , Interesting)

I write new code in perl all the time. Cleanly written, well formatted and completely maintainable. Simply because YOU can't write perl in such a manner, that doesn't mean others can't.

Anonymous Coward writes:
Re: ORLY ( Score: 2 , Insightful)

Do you have someone else who is saying that about your code or is that your own opinion?

Sarten-X ( 1102295 ) , Monday July 29, 2019 @05:53PM ( #59008624 ) Homepage
Re: ORLY ( Score: 4 , Insightful)

I happen to read a lot of Perl in my day job, involving reverse-engineering a particular Linux-based appliance for integration purposes. I seldom come across scripts that are actually all that bad.

It's important to understand that Perl has a different concept of readability. It's more like reading a book than reading a program, because there are so many ways to write any given task. A good Perl programmer will incorporate that flexibility into their style, so intent can be inferred not just from the commands used, but also how the code is arranged. For example, a large block describing a complex function would be written verbosely for detailed clarity.

A trivial statement could be used, if it resolves an edge case.

Conversely, a good Perl reader will be familiar enough with the language to understand the idioms and shorthand used, so they can understand the story as written without being distracted by the ugly bits. Once viewed from that perspective, a Perl program can condense incredible amounts of description into just a few lines, and still be as readily-understood as any decent novel.

Sarten-X ( 1102295 ) writes: on Monday July 29, 2019 @07:06PM ( #59009056 ) Homepage
Re: ORLY ( Score: 4 , Insightful)

Since you brought it up...

In building several dev teams, I have never tried to hire everyone with any particular skill. I aim to have at least two people with each skill, but won't put effort to having more than that at first. After the initial startup of the team, I try to run projects in pairs, with an expert starting the project, then handing it to a junior (in that particular skill) for completion. After a few rounds of that, the junior is close enough to an expert, and somebody else takes the junior role. That way, even with turnover, expertise is shared among the team, and there's always someone who can be the expert.

Back to the subject at hand, though...

My point is that Perl is a more conversational language than others, and its structure reflects that. It is unreasonable to simply look at Perl code, see the variety of structures, and declare it "unreadable" simply because the reader doesn't understand the language.

As an analogy, consider the structural differences between Lord of the Rings and The Cat in the Hat . A reader who is only used to The Cat in the Hat would find Lord of the Rings ridiculously complex to the point of being unreadable, when Lord of the Rings is simply making use of structures and capabilities that are not permitted in the language of young children's books.

This is not to say that other languages are wrong to have a more limited grammar. They are simply different, and learning to read a more flexible language is a skill to be developed like any other. Similar effort must be spent to learn other languages with sufficiently-different structure, like Lisp or Haskell.

phantomfive ( 622387 ) , Monday July 29, 2019 @07:24PM ( #59009128 ) Journal
Re:ORLY ( Score: 3 )

FWIW DuckDuckGo is apparently written primarily in Perl.

fahrbot-bot ( 874524 ) , Monday July 29, 2019 @03:46PM ( #59007696 )
If your career is based on ... ( Score: 3 , Interesting)

From TFA:

Perl: Even if RedMonk has Perl's popularity declining, it's still going to take a long time for the language to flatten out completely, given the sheer number of legacy websites that still feature its code. Nonetheless, a lack of active development, and widespread developer embrace of other languages for things like building websites, means that Perl is going to just fall into increasing disuse.

First, Perl is used for many, many more things than websites -- and the focus in TFA is short-sighted. Second, I've written a LOT of Perl in my many years, but wouldn't say my (or most people's) career is based on it. Yes, I have written applications in Perl, but more often used it for utility, glue and other things that help get things done, monitor and (re)process data. Nothing (or very few things) can beat Perl for a quick knock-off script to do something or another.

Perl's not going anywhere and it will be a useful language to know for quite a while. Languages like Perl (and Python) are great tools to have in your toolbox, ones that you know how to wield well when you need them. Knowing when you need them, and not something else, is important.

TimHunter ( 174406 ) , Monday July 29, 2019 @05:22PM ( #59008400 )
Career based on *a* programming language? ( Score: 4 , Insightful)

Anybody whose career is based on a single programming language is doomed already. Programmers know how to write code. The language they use is beside the point. A good programmer can write code in whatever language is asked of them.

bobbied ( 2522392 ) , Monday July 29, 2019 @04:23PM ( #59007966 )
Re:Diversifying ( Score: 5 , Insightful)
The writer of this article should consider diversifying his skillset at some point, as not all bloggers endure forever and his popularity ranking on Slashdot has recently tanked.

I'd suggest that this writer quit their day job and take up stand up...

Old languages never really die until the platform dies. Languages may fall out of favor, but they don't usually die until the platform they are running on disappears and then the people who used them die. So, FORTRAN, C, C++, and COBOL and more are here to pretty much stay.

Specifically, Perl isn't going anywhere, being fundamental on Linux, and neither is Ruby; the rest have, to varying degrees, been out of favor for a while now, but none of the languages in the article are dead. They are, however, falling out of favor, and because of that it might be a good idea to be adding other tools to your programmer's toolbox if your livelihood depends on one of them.

[Jul 30, 2019] Python is overrated

Notable quotes:
"... R commits a substantial scale crime by being so dependent on memory-resident objects. Python commits major scale crime with its single-threaded primary interpreter loop. ..."
Jul 29, 2019 | developers.slashdot.org

epine ( 68316 ) , Monday July 29, 2019 @05:48PM ( #59008600 ) ( Score: 3 )


I had this naive idea that Python might substantially displace R until I learned more about the Python internals, which are pretty nasty. This is the new generation's big data language? If true, sure sucks to be young again.

Python isn't even really used to do big data. It's mainly used to orchestrate big data flows on top of other libraries or facilities. It has more or less become the lingua franca of high-level hand waving. Any real grunt is far below.

R commits a substantial scale crime by being so dependent on memory-resident objects. Python commits major scale crime with its single-threaded primary interpreter loop.

If I move away from R, it will definitely be Julia for any real work (as Julia matures, if it matures well), and not Python.

[Jul 30, 2019] The difference between tar and tar.gz archives

With tar.gz, to extract a file the archiver first creates an intermediate tarball x.tar from x.tar.gz by uncompressing the whole archive, and then unpacks the requested files from this intermediate tarball. If the tar.gz archive is large, unpacking can take several hours or even days.
Jul 30, 2019 | askubuntu.com
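For illustration, extracting a single member from a plain tar versus a gzipped one (file names are hypothetical):

tar xf  big.tar     path/to/file     # skips from header to header
tar xzf big.tar.gz  path/to/file     # must decompress the stream to locate the member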

[Jul 29, 2019] How do I tar a directory of files and folders without including the directory itself - Stack Overflow

Jan 05, 2017 | stackoverflow.com



tvanfosson ,Jan 5, 2017 at 12:29

I typically do:
tar -czvf my_directory.tar.gz my_directory

What if I just want to include everything (including any hidden system files) in my_directory, but not the directory itself? I don't want:

my_directory
   --- my_file
   --- my_file
   --- my_file

I want:

my_file
my_file
my_file

PanCrit ,Feb 19 at 13:04

cd my_directory/ && tar -zcvf ../my_dir.tgz . && cd -

should do the job in one line. It works well for hidden files as well. "*" doesn't expand to hidden files in pathname expansion, at least in bash. Below is my experiment:

$ mkdir my_directory
$ touch my_directory/file1
$ touch my_directory/file2
$ touch my_directory/.hiddenfile1
$ touch my_directory/.hiddenfile2
$ cd my_directory/ && tar -zcvf ../my_dir.tgz . && cd ..
./
./file1
./file2
./.hiddenfile1
./.hiddenfile2
$ tar ztf my_dir.tgz
./
./file1
./file2
./.hiddenfile1
./.hiddenfile2

JCotton ,Mar 3, 2015 at 2:46

Use the -C switch of tar:
tar -czvf my_directory.tar.gz -C my_directory .

The -C my_directory tells tar to change the current directory to my_directory , and then . means "add the entire current directory" (including hidden files and sub-directories).

Make sure you do -C my_directory before you do . or else you'll get the files in the current directory.

Digger ,Mar 23 at 6:52

You can also create archive as usual and extract it with:
tar --strip-components 1 -xvf my_directory.tar.gz
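A sketch of that workflow with the names used earlier in the thread: create the archive as usual, then drop the leading directory component on extraction:

tar -czvf my_directory.tar.gz my_directory
mkdir extracted
tar --strip-components 1 -xzvf my_directory.tar.gz -C extracted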

jwg ,Mar 8, 2017 at 12:56

Have a look at --transform / --xform , it gives you the opportunity to massage the file name as the file is added to the archive:
% mkdir my_directory
% touch my_directory/file1
% touch my_directory/file2
% touch my_directory/.hiddenfile1
% touch my_directory/.hiddenfile2
% tar -v -c -f my_dir.tgz --xform='s,my_directory/,,' $(find my_directory -type f)
my_directory/file2
my_directory/.hiddenfile1
my_directory/.hiddenfile2
my_directory/file1
% tar -t -f my_dir.tgz 
file2
.hiddenfile1
.hiddenfile2
file1

The transform expression is similar to that of sed , and we can use separators other than / ( , in the above example).
https://www.gnu.org/software/tar/manual/html_section/tar_52.html

Alex ,Mar 31, 2017 at 15:40

TL;DR
find /my/dir/ -printf "%P\n" | tar -czf mydir.tgz --no-recursion -C /my/dir/ -T -

With some conditions (archive only files, dirs and symlinks):

find /my/dir/ -printf "%P\n" -type f -o -type l -o -type d | tar -czf mydir.tgz --no-recursion -C /my/dir/ -T -
Explanation

The below unfortunately includes a parent directory ./ in the archive:

tar -czf mydir.tgz -C /my/dir .

You can move all the files out of that directory by using the --transform configuration option, but that doesn't get rid of the . directory itself. It becomes increasingly difficult to tame the command.

You could use $(find ...) to add a file list to the command (like in magnus' answer ), but that potentially causes a "file list too long" error. The best way is to combine it with tar's -T option, like this:

find /my/dir/ -printf "%P\n" -type f -o -type l -o -type d | tar -czf mydir.tgz --no-recursion -C /my/dir/ -T -

Basically what it does is list all files ( -type f ), links ( -type l ) and subdirectories ( -type d ) under your directory, make all filenames relative using -printf "%P\n" , and then pass that to the tar command (it takes filenames from STDIN using -T - ). The -C option is needed so tar knows where the files with relative names are located. The --no-recursion flag is so that tar doesn't recurse into folders it is told to archive (causing duplicate files).

If you need to do something special with filenames (filtering, following symlinks etc), the find command is pretty powerful, and you can test it by just removing the tar part of the above command:

$ find /my/dir/ -printf "%P\n" -type f -o -type l -o -type d
> textfile.txt
> documentation.pdf
> subfolder2
> subfolder
> subfolder/.gitignore

For example, if you want to filter out PDF files, add ! -name '*.pdf'

$ find /my/dir/ -printf "%P\n" -type f ! -name '*.pdf' -o -type l -o -type d
> textfile.txt
> subfolder2
> subfolder
> subfolder/.gitignore
Non-GNU find

The command uses printf (available in GNU find ) which tells find to print its results with relative paths. However, if you don't have GNU find , this works to make the paths relative (removes parents with sed ):

find /my/dir/ -type f -o -type l -o -type d | sed s,^/my/dir/,, | tar -czf mydir.tgz --no-recursion -C /my/dir/ -T -

BrainStone ,Dec 21, 2016 at 22:14

This answer should work in most situations. Notice however how the filenames are stored in the tar file as, for example, ./file1 rather than just file1 . I found that this caused problems when using this method to manipulate tarballs used as package files in BuildRoot .

One solution is to use some Bash globs to list all files except for .. like this:

tar -C my_dir -zcvf my_dir.tar.gz .[^.]* ..?* *

This is a trick I learnt from this answer .

Now tar will return an error if there are no files matching ..?* or .[^.]* , but it will still work. If the error is a problem (you are checking for success in a script), this works:

shopt -s nullglob
tar -C my_dir -zcvf my_dir.tar.gz .[^.]* ..?* *
shopt -u nullglob

Though now we are messing with shell options, we might decide that it is neater to have * match hidden files:

shopt -s dotglob
tar -C my_dir -zcvf my_dir.tar.gz *
shopt -u dotglob

This might not work where your shell globs * in the current directory, so alternatively, use:

shopt -s dotglob
cd my_dir
tar -zcvf ../my_dir.tar.gz *
cd ..
shopt -u dotglob

PanCrit ,Jun 14, 2010 at 6:47

cd my_directory
tar zcvf ../my_directory.tar.gz *

anion ,May 11, 2018 at 14:10

If it's a Unix/Linux system, and you care about hidden files (which will be missed by *), you need to do:
cd my_directory
tar zcvf ../my_directory.tar.gz * .??*

I don't know what hidden files look like under Windows.

gpz500 ,Feb 27, 2014 at 10:46

I would propose the following Bash function (first argument is the path to the dir, second argument is the basename of resulting archive):
function tar_dir_contents ()
{
    local DIRPATH="$1"
    local TARARCH="$2.tar.gz"
    local ORGIFS="$IFS"
    IFS=$'\n'
    tar -C "$DIRPATH" -czf "$TARARCH" $( ls -a "$DIRPATH" | grep -v '\(^\.$\)\|\(^\.\.$\)' )
    IFS="$ORGIFS"
}

You can run it in this way:

$ tar_dir_contents /path/to/some/dir my_archive

and it will generate the archive my_archive.tar.gz in the current directory. It works with hidden ( .* ) elements and with elements with spaces in their filenames.

med ,Feb 9, 2017 at 17:19

cd my_directory && tar -czvf ../my_directory.tar.gz $(ls -A) && cd ..

This one worked for me, and it includes all hidden files without putting all files in a root directory named "." as in tomoe's answer:

Breno Salgado ,Apr 16, 2016 at 15:42

Use pax.

Pax is a deprecated package but does the job perfectly and in a simple fashion.

pax -w > mydir.tar mydir

asynts ,Jun 26 at 16:40

Simplest way I found:

cd my_dir && tar -czvf ../my_dir.tar.gz *

marcingo ,Aug 23, 2016 at 18:04

# tar all files within and deeper in a given directory
# with no prefixes ( neither <directory>/ nor ./ )
# parameters: <source directory> <target archive file>
function tar_all_in_dir {
    { cd "$1" && find -type f -print0; } \
    | cut --zero-terminated --characters=3- \
    | tar --create --file="$2" --directory="$1" --null --files-from=-
}

Safely handles filenames with spaces or other unusual characters. You can optionally add a -name '*.sql' or similar filter to the find command to limit the files included.
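Usage sketch (paths are hypothetical; the archive path is given as absolute so it is unaffected by the cd inside the function):

tar_all_in_dir /var/www /tmp/site.tar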

user1456599 ,Feb 13, 2013 at 21:37

 tar -cvzf  tarlearn.tar.gz --remove-files mytemp/*

If the folder is mytemp, then applying the above will zip and remove all the files in the folder but leave the folder itself alone

 tar -cvzf  tarlearn.tar.gz --remove-files --exclude='*12_2008*' --no-recursion mytemp/*

You can give exclude patterns and also specify not to look into subfolders

Aaron Digulla ,Jun 2, 2009 at 15:33

tar -C my_dir -zcvf my_dir.tar.gz `ls my_dir`

[Jun 26, 2019] 7,000 Developers Report Their Top Languages: Java, JavaScript, and Python

The article mixes apples and oranges and demonstrates complete ignorance of language classification.
Two of the three top languages are scripting languages. This is a huge victory. But Python has problems with efficiency (not that they matter everywhere) and is far from being an elegant language. It entered the mainstream via its adoption at universities as the first programming language, displacing Java (which I think might be a mistake -- I think teaching should start with assembler and replicate the history of development: assembler -- compiled languages -- scripting languages)
Perl, which essentially heralded the era of scripting languages, is now losing its audience and shrinking to its initial purpose -- a tool for Unix system administrators. But I think in such surveys its use is underreported for obvious reasons -- it is not fashionable. But please note that Fortran is still widely used.
Go is just a variant of a "better C" -- a statically typed, compiled language. Rust is an attempt to improve C++. Both belong to the class of compiled languages. So compiled languages still hold their own and are an important part of the ecosystem. See also How Rust Compares to Other Programming Languages - The New Stack
Jun 26, 2019 | developers.slashdot.org
The report surveyed about 7,000 developers worldwide, and revealed Python is the most studied programming language, the most loved language , and the third top primary programming language developers are using... The top use cases developers are using Python for include data analysis, web development, machine learning and writing automation scripts, according to the JetBrains report . More developers are also beginning to move over to Python 3, with 9 out of 10 developers using the current version.

The JetBrains report also found while Go is still a young language, it is the most promising programming language. "Go started out with a share of 8% in 2017 and now it has reached 18%. In addition, the biggest number of developers (13%) chose Go as a language they would like to adopt or migrate to," the report stated...

Seventy-three percent of JavaScript developers use TypeScript, which is up from 17 percent last year. Seventy-one percent of Kotlin developers use Kotlin for work. Java 8 is still the most popular programming language, but developers are beginning to migrate to Java 10 and 11.
JetBrains (which designed Kotlin in 2011) also said that 60% of their survey's respondents identified themselves as professional web back-end developers (while 46% said they did web front-end, and 23% developed mobile applications). 41% said they hadn't contributed to open source projects "but I would like to," while 21% said they contributed "several times a year."

"16% of developers don't have any tests in their projects. Among fully-employed senior developers though, that statistic is just 8%. Like last year, about 30% of developers still don't have unit tests in their projects." Other interesting statistics: 52% say they code in their dreams. 57% expect AI to replace developers "partially" in the future. "83% prefer the Dark theme for their editor or IDE. This represents a growth of 6 percentage points since last year for each environment. 47% take public transit to work.

And 97% of respondents using Rust "said they have been using Rust for less than a year. With only 14% using it for work, it's much more popular as a language for personal/side projects." And more than 90% of the Rust developers who responded worked with codebases with less than 300 files.

[Jun 23, 2019] Utilizing multi core for tar+gzip-bzip compression-decompression

Highly recommended!
Notable quotes:
"... There is effectively no CPU time spent tarring, so it wouldn't help much. The tar format is just a copy of the input file with header blocks in between files. ..."
"... You can also use the tar flag "--use-compress-program=" to tell tar what compression program to use. ..."
Jun 23, 2019 | stackoverflow.com

user1118764 , Sep 7, 2012 at 6:58

I normally compress using tar zcvf and decompress using tar zxvf (using gzip due to habit).

I've recently gotten a quad core CPU with hyperthreading, so I have 8 logical cores, and I notice that many of the cores are unused during compression/decompression.

Is there any way I can utilize the unused cores to make it faster?

Warren Severin , Nov 13, 2017 at 4:37

The solution proposed by Xiong Chiamiov above works beautifully. I had just backed up my laptop with .tar.bz2 and it took 132 minutes using only one cpu thread. Then I compiled and installed tar from source: gnu.org/software/tar I included the options mentioned in the configure step: ./configure --with-gzip=pigz --with-bzip2=lbzip2 --with-lzip=plzip I ran the backup again and it took only 32 minutes. That's better than 4X improvement! I watched the system monitor and it kept all 4 cpus (8 threads) flatlined at 100% the whole time. THAT is the best solution. – Warren Severin Nov 13 '17 at 4:37

Mark Adler , Sep 7, 2012 at 14:48

You can use pigz instead of gzip, which does gzip compression on multiple cores. Instead of using the -z option, you would pipe it through pigz:
tar cf - paths-to-archive | pigz > archive.tar.gz

By default, pigz uses the number of available cores, or eight if it could not query that. You can ask for more with -p n, e.g. -p 32. pigz has the same options as gzip, so you can request better compression with -9. E.g.

tar cf - paths-to-archive | pigz -9 -p 32 > archive.tar.gz

user788171 , Feb 20, 2013 at 12:43

How do you use pigz to decompress in the same fashion? Or does it only work for compression?

Mark Adler , Feb 20, 2013 at 16:18

pigz does use multiple cores for decompression, but only with limited improvement over a single core. The deflate format does not lend itself to parallel decompression.

The decompression portion must be done serially. The other cores for pigz decompression are used for reading, writing, and calculating the CRC. When compressing on the other hand, pigz gets close to a factor of n improvement with n cores.
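For completeness, a sketch of the decompression side with pigz (the archive name is hypothetical; pigz accepts -d, so tar can also drive it directly):

pigz -dc archive.tar.gz | tar xf -
# or, letting tar invoke it:
tar -I pigz -xf archive.tar.gz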

Garrett , Mar 1, 2014 at 7:26

The hyphen here is stdout (see this page ).

Mark Adler , Jul 2, 2014 at 21:29

Yes. 100% compatible in both directions.

Mark Adler , Apr 23, 2015 at 5:23

There is effectively no CPU time spent tarring, so it wouldn't help much. The tar format is just a copy of the input file with header blocks in between files.

Jen , Jun 14, 2013 at 14:34

You can also use the tar flag "--use-compress-program=" to tell tar what compression program to use.

For example use:

tar -c --use-compress-program=pigz -f tar.file dir_to_zip

Valerio Schiavoni , Aug 5, 2014 at 22:38

Unfortunately by doing so the concurrent feature of pigz is lost. You can see for yourself by executing that command and monitoring the load on each of the cores. – Valerio Schiavoni Aug 5 '14 at 22:38

bovender , Sep 18, 2015 at 10:14

@ValerioSchiavoni: Not here, I get full load on all 4 cores (Ubuntu 15.04 'Vivid'). – bovender Sep 18 '15 at 10:14

Valerio Schiavoni , Sep 28, 2015 at 23:41

On compress or on decompress ? – Valerio Schiavoni Sep 28 '15 at 23:41

Offenso , Jan 11, 2017 at 17:26

I prefer tar cf - dir_to_zip | pv | pigz > tar.file . pv helps me estimate progress; you can skip it. But it is still easier to write and remember. – Offenso Jan 11 '17 at 17:26

Maxim Suslov , Dec 18, 2014 at 7:31

Common approach

There is an option for the tar program:

-I, --use-compress-program PROG
      filter through PROG (must accept -d)

You can use a multithreaded version of an archiver or compressor utility.

The most popular multithreaded archivers are pigz (instead of gzip) and pbzip2 (instead of bzip2). For instance:

$ tar -I pbzip2 -cf OUTPUT_FILE.tar.bz2 paths_to_archive
$ tar --use-compress-program=pigz -cf OUTPUT_FILE.tar.gz paths_to_archive

The archiver must accept -d. If your replacement utility doesn't have this parameter and/or you need to specify additional parameters, then use pipes (add parameters if necessary):

$ tar cf - paths_to_archive | pbzip2 > OUTPUT_FILE.tar.bz2
$ tar cf - paths_to_archive | pigz > OUTPUT_FILE.tar.gz

The input and output of the single-threaded and multithreaded versions are compatible: you can compress using the multithreaded version and decompress using the single-threaded version, and vice versa.

p7zip

For compression with p7zip you need a small shell script like the following:

#!/bin/sh
case $1 in
  -d) 7za -txz -si -so e;;
   *) 7za -txz -si -so a .;;
esac 2>/dev/null

Save it as 7zhelper.sh. Here is an example of usage:

$ tar -I 7zhelper.sh -cf OUTPUT_FILE.tar.7z paths_to_archive
$ tar -I 7zhelper.sh -xf OUTPUT_FILE.tar.7z
xz

Regarding multithreaded XZ support: if you are running version 5.2.0 or above of XZ Utils, you can utilize multiple cores for compression by setting -T or --threads to an appropriate value via the environment variable XZ_DEFAULTS (e.g. XZ_DEFAULTS="-T 0" ).

This is a fragment of the man page for the 5.1.0alpha version:

Multithreaded compression and decompression are not implemented yet, so this option has no effect for now.

However, this will not work for decompression of files that haven't also been compressed with threading enabled. From the man page for version 5.2.2:

Threaded decompression hasn't been implemented yet. It will only work on files that contain multiple blocks with size information in block headers. All files compressed in multi-threaded mode meet this condition, but files compressed in single-threaded mode don't even if --block-size=size is used.
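A sketch of threaded xz compression driven from tar, under the assumption of XZ Utils 5.2.0+ (the file names are the placeholders used above):

XZ_DEFAULTS="-T 0" tar -cJf OUTPUT_FILE.tar.xz paths_to_archive
# equivalent with the option passed explicitly (assumes a GNU tar new enough
# to accept arguments inside --use-compress-program):
tar -I 'xz -T0' -cf OUTPUT_FILE.tar.xz paths_to_archive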

Recompiling with replacement

If you build tar from sources, then you can recompile with parameters

--with-gzip=pigz
--with-bzip2=lbzip2
--with-lzip=plzip

After recompiling tar with these options you can check the output of tar's help:

$ tar --help | grep "lbzip2\|plzip\|pigz"
  -j, --bzip2                filter the archive through lbzip2
      --lzip                 filter the archive through plzip
  -z, --gzip, --gunzip, --ungzip   filter the archive through pigz

mpibzip2 , Apr 28, 2015 at 20:57

I just found pbzip2 and mpibzip2 . mpibzip2 looks very promising for clusters or if you have a laptop and a multicore desktop computer for instance. – user1985657 Apr 28 '15 at 20:57

oᴉɹǝɥɔ , Jun 10, 2015 at 17:39

Processing STDIN may in fact be slower. – oᴉɹǝɥɔ Jun 10 '15 at 17:39

selurvedu , May 26, 2016 at 22:13

Plus 1 for xz option. It the simplest, yet effective approach. – selurvedu May 26 '16 at 22:13

panticz.de , Sep 1, 2014 at 15:02

You can use the shortcut -I for tar's --use-compress-program switch, and invoke pbzip2 for bzip2 compression on multiple cores:
tar -I pbzip2 -cf OUTPUT_FILE.tar.bz2 DIRECTORY_TO_COMPRESS/

einpoklum , Feb 11, 2017 at 15:59

A nice TL;DR for @MaximSuslov's answer . – einpoklum Feb 11 '17 at 15:59
If you want to have more flexibility with filenames and compression options, you can use:
find /my/path/ -type f -name "*.sql" -o -name "*.log" -exec \
tar -P --transform='s@/my/path/@@g' -cf - {} + | \
pigz -9 -p 4 > myarchive.tar.gz
Step 1: find

find /my/path/ -type f -name "*.sql" -o -name "*.log" -exec

This command will look for the files you want to archive, in this case /my/path/*.sql and /my/path/*.log . Add as many -o -name "pattern" as you want.

-exec will execute the next command using the results of find : tar

Step 2: tar

tar -P --transform='s@/my/path/@@g' -cf - {} +

--transform is a simple string replacement parameter. It will strip the path of the files from the archive so the tarball's root becomes the current directory when extracting. Note that you can't use the -C option to change directory, as you would lose the benefits of find : all files of the directory would be included.

-P tells tar to use absolute paths, so it doesn't trigger the warning "Removing leading `/' from member names". The leading '/' will be removed by --transform anyway.

-cf - tells tar to write the archive to standard output; the actual file name is given later, at the pigz redirection

{} + passes every file that find found to a single tar invocation

Step 3: pigz

pigz -9 -p 4

Use as many parameters as you want. In this case -9 is the compression level and -p 4 is the number of cores dedicated to compression. If you run this on a heavily loaded webserver, you probably don't want to use all available cores.

Step 4: archive name

> myarchive.tar.gz

Finally.
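To sanity-check the result, list a few members; pigz output is a normal gzip stream, so plain tar can read it:

tar tzf myarchive.tar.gz | head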

[May 24, 2019] How to send keystrokes from one computer to another by USB?

Notable quotes:
"... On a different note, have you considered a purely software/network solution such as TightVNC ? ..."
Aug 05, 2018 | stackoverflow.com



Yehonatan ,Aug 5, 2018 at 6:34

Is there a way to use one computer to send keystrokes to another by usb ?

What I'm looking to do is to capture the USB signal used by a keyboard (with USBTrace, for example) and use it with PC-1 to send it to PC-2, so that PC-2 recognizes it as regular keyboard input.

Some leads to do this would be very appreciated.

Lucas ,Jan 16, 2011 at 19:18

What you essentially need is a USB port on PC-1 that will act as a USB device for PC-2.

That is not possible for the vast majority of PC systems because USB is an asymmetric bus, with a host/device (or master/slave, if you wish) architecture. USB controllers (and their ports) on most PCs can only work in host mode and cannot simulate a device.

That is the reason that you cannot network computers through USB without a special cable with specialized electronics.

The only exception is if you somehow have a PC that supports the USB On-The-Go standard that allows for a USB port to act in both host and device mode. USB-OTG devices do exist, but they are usually embedded devices (smartphones etc). I don't know if there is a way to add a USB-OTG port to a commodity PC.

EDIT:

If you do not need a keyboard before the OS on PC-2 boots, you might be able to use a pair of USB Bluetooth dongles - one on each PC. You'd have to use specialised software on PC-1, but it is definitely possible - I've already seen a possible implementation on Linux , and I am reasonably certain that there must be one for Windows. You will also need Bluetooth HID drivers on PC-2, if they are not already installed.

On a different note, have you considered a purely software/network solution such as TightVNC ?

bebbo ,Sep 20, 2017 at 18:14

There is a solution: https://github.com/Flowm/etherkey

This uses a network connection from your computer to the raspi which is connected to a teensy (usb developer board) to send the key strokes.

This solution is not an out-of-the-box product. The required skill is similar to programming some other devices like arduino. But it's a complete and working setup.

Yehonatan ,Jan 25, 2011 at 5:51

The cheapest options are commercial microcontrollers (e.g. the Arduino platform, PIC, etc.) or ready-built USB keyboard controllers (e.g. I-PAC, arcade controllers, etc.)

Benoit-Pierre DEMAINE ,Oct 27, 2017 at 17:17

SEARCH THIS PROGRAM:

TWedge: Keyboard Wedge Software (RS232, Serial, TCP, Bluetooth)

then, MAKE YOUR OWN CONNECTION CABLE WITH:

(usb <-> rs232) + (NULL MODEM) + (rs232 <-> usb)

Connect the 2 computers, write your own program to send signals to your (usb <-> rs232) unit, and then you can control the other computer with the help of TWedge.


The above-mentioned https://github.com/Flowm/etherkey is one way. The keyboard is emulated from an rPi, but the principle can be used from PC to PC (or Mac to whatever). The core answer to your question is to use an OTG-capable chip, and then you control this chip via a USB-serial adapter.

https://euer.krebsco.de/a-software-kvm-switch.html uses a very similar method, using an Arduino instead of the Teensy.

The generic answer is: you need an OTG-capable, or slave-capable, device: Arduino, Teensy, Pi Zero (from either the Raspberry or Orange brands, both work; only the ZERO models are OTG capable), or an rPi-A with heavy customisation (since it does not include a USB hub, it can theoretically be converted into a slave; I never found any public tutorial on doing it), or any smartphone (Samsung, Nokia, HTC, Oukitel ... most smartphones are OTG capable). If you go for a Pi or a phone, then you want to dig around USB Gadget. Cheaper solutions (Arduino/Teensy) need custom firmware.

[Mar 20, 2019] How do I troubleshoot a yum repository problem that has an error No package available error?

Mar 20, 2019 | unix.stackexchange.com

Kiran ,Jan 2, 2017 at 23:57

I have three RHEL 6.6 servers. One has a yum repository that I know works. The other two servers I will refer to as "yum clients." These two are configured to use the same yum repository (the first server described). When I do yum install httpd on each of these two yum client servers, I get two different results. One server prepares for the installation as normal and prompts me with a y/n prompt. The second server says

No package httpd available.

The /etc/yum.conf files on the two servers are identical. The /etc/yum.repos.d/ directories have the same .repo files. Why does one yum client not see the httpd package? I use httpd as an example: one yum client cannot install any package, while the other can install anything. Neither has access to the Internet, or to servers that the other one does not have access to.

XXX

If /etc/yum.conf is identical on all servers, and that package is not listed there in an exclude line, check if the repo is enabled on all the servers.

Do

grep enabled /etc/yum.repos.d/filename.repo

and see if it is set to 0 or 1.

The value of enabled needs to be set to 1 for yum to use that repo.

If the repo is not enabled, you can edit the repo file and change enabled to 1, or try to run yum with the enablerepo switch to enable it for that operation.

Try to run yum like this.

yum --enablerepo=repo_name install package_name
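For reference, a minimal sketch of what such a .repo file looks like with the repo enabled (the repo name and URL are hypothetical):

[myrepo]
name=My local repository
baseurl=http://yumserver.example.com/repo/
enabled=1
gpgcheck=0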

[Mar 20, 2019] How to I print to STDERR only if STDOUT is a different destination?

Mar 14, 2013 | stackoverflow.com

squiguy, Mar 14, 2013 at 19:06

I would like Perl to write to STDERR only if STDOUT is not the same. For example, if both STDOUT and STDERR would redirect output to the Terminal, then I don't want STDERR to be printed.

Consider the following example (outerr.pl):

#!/usr/bin/perl

use strict;
use warnings;

print STDOUT "Hello standard output!\n";
print STDERR "Hello standard error\n" if ($someMagicalFlag);
exit 0

Now consider this (this is what I would like to achieve):

bash $ outerr.pl
Hello standard output!

However, if I redirect out to a file, I'd like to get:

bash $ outerr.pl > /dev/null
Hello standard error

and similary the other way round:

bash $ outerr.pl 2> /dev/null
Hello standard output!

If I re-direct both out/err to the same file, then only stdout should be displayed:

bash $ outerr.pl > foo.txt 2>&1
bash $ cat foo.txt
Hello standard output!

So is there a way to evaluate / determine whether OUT and ERR are pointing to the same "thing" (descriptor?)?

tchrist ,Mar 15, 2013 at 5:07

On Unix-style systems, you should be able to do:
my @stat_err = stat STDERR;
my @stat_out = stat STDOUT;

my $stderr_is_not_stdout = (($stat_err[0] != $stat_out[0]) ||
                            ($stat_err[1] != $stat_out[1]));

But that won't work on Windows, which doesn't have real inode numbers. It gives both false positives (thinks they're different when they aren't) and false negatives (thinks they're the same when they aren't).
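As an aside, the same device-and-inode comparison can be done from the shell with the test operator -ef, which is true when two paths refer to the same device and inode (this sketch assumes /dev/fd is available, as it is on Linux):

if [ /dev/fd/1 -ef /dev/fd/2 ]; then
    echo "STDOUT and STDERR point to the same place"
fi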

Jim Stewart ,Mar 14, 2013 at 20:59

You can do that (almost) with -t:
-t STDERR

will be true if it is a terminal, and likewise for STDOUT.

This still would not tell you which terminal, and if you redirect to the same file, you may still get both.

Hence, if

-t STDERR && ! (-t STDOUT) || -t STDOUT && !(-t STDERR)

or shorter

-t STDOUT ^ -t STDERR  # thanks to @mob

you know you're okay.

EDIT: Solutions for the case that both STDERR and STDOUT are regular files:

Tom Christiansen suggested to stat and compare the dev and ino fields. This will work in UNIX, but, as @cjm pointed out, not in Windows.

If you can guarantee that no other program will write to the file, you could do the following both in Windows and UNIX:

  1. Check the position the file descriptors for STDOUT and STDERR are at; if they are not equal, you redirected one of them with >> to a nonempty file.
  2. Otherwise, write 42 bytes to file descriptor 2.
  3. Seek to the end of file descriptor 1. If it is 42 more than before, chances are high that both are redirected to the same file. If it is unchanged, the files are different. If it is changed, but not by 42, someone else is writing there and all bets are off (but then you're not on Windows, so the stat method will work).

[Mar 17, 2019] Translating Perl to Python

Mar 17, 2019 | stackoverflow.com

John Kugelman ,Jul 1, 2009 at 3:29

I found this Perl script while migrating my SQLite database to mysql

I was wondering (since I don't know Perl) how could one rewrite this in Python?

Bonus points for the shortest (code) answer :)

edit : sorry I meant shortest code, not strictly shortest answer

#! /usr/bin/perl

while ($line = <>){
    if (($line !~  /BEGIN TRANSACTION/) && ($line !~ /COMMIT/) && ($line !~ /sqlite_sequence/) && ($line !~ /CREATE UNIQUE INDEX/)){

        if ($line =~ /CREATE TABLE \"([a-z_]*)\"(.*)/){
                $name = $1;
                $sub = $2;
                $sub =~ s/\"//g; #"
                $line = "DROP TABLE IF EXISTS $name;\nCREATE TABLE IF NOT EXISTS $name$sub\n";
        }
        elsif ($line =~ /INSERT INTO \"([a-z_]*)\"(.*)/){
                $line = "INSERT INTO $1$2\n";
                $line =~ s/\"/\\\"/g; #"
                $line =~ s/\"/\'/g; #"
        }else{
                $line =~ s/\'\'/\\\'/g; #'
        }
        $line =~ s/([^\\'])\'t\'(.)/$1THIS_IS_TRUE$2/g; #'
        $line =~ s/THIS_IS_TRUE/1/g;
        $line =~ s/([^\\'])\'f\'(.)/$1THIS_IS_FALSE$2/g; #'
        $line =~ s/THIS_IS_FALSE/0/g;
        $line =~ s/AUTOINCREMENT/AUTO_INCREMENT/g;
        print $line;
    }
}

Some additional code was necessary to successfully migrate the sqlite database (it handles one-line CREATE TABLE statements and foreign keys, and fixes a bug in the original program that converted empty fields '' to \' ).

I posted the code on the migrating my SQLite database to mysql Question

Jiaaro ,Jul 2, 2009 at 10:15

Here's a pretty literal translation with just the minimum of obvious style changes (putting all code into a function, using string rather than re operations where possible).
import re, fileinput

def main():
  for line in fileinput.input():
    process = False
    for nope in ('BEGIN TRANSACTION','COMMIT',
                 'sqlite_sequence','CREATE UNIQUE INDEX'):
      if nope in line: break
    else:
      process = True
    if not process: continue
    m = re.search('CREATE TABLE "([a-z_]*)"(.*)', line)
    if m:
      name, sub = m.groups()
      line = '''DROP TABLE IF EXISTS %(name)s;
CREATE TABLE IF NOT EXISTS %(name)s%(sub)s
'''
      line = line % dict(name=name, sub=sub)
    else:
      m = re.search('INSERT INTO "([a-z_]*)"(.*)', line)
      if m:
        line = 'INSERT INTO %s%s\n' % m.groups()
        line = line.replace('"', r'\"')
        line = line.replace('"', "'")
    line = re.sub(r"([^'])'t'(.)", r"\1THIS_IS_TRUE\2", line)
    line = line.replace('THIS_IS_TRUE', '1')
    line = re.sub(r"([^'])'f'(.)", r"\1THIS_IS_FALSE\2", line)
    line = line.replace('THIS_IS_FALSE', '0')
    line = line.replace('AUTOINCREMENT', 'AUTO_INCREMENT')
    print line,

main()

dr jimbob ,May 20, 2018 at 0:54

Alex Martelli's solution above works well, but needs some fixes and additions:

In the lines using regular expression substitution, the insertion of the matched groups must be double-escaped OR the replacement string must be prefixed with r to mark it as a raw string:

line = re.sub(r"([^'])'t'(.)", "\\1THIS_IS_TRUE\\2", line)

or

line = re.sub(r"([^'])'f'(.)", r"\1THIS_IS_FALSE\2", line)

Also, this line should be added before print:

line = line.replace('AUTOINCREMENT', 'AUTO_INCREMENT')

Last, the column names in create statements should be backticks in MySQL. Add this in line 15:

  sub = sub.replace('"','`')

Here's the complete script with modifications:

import re, fileinput

def main():
  for line in fileinput.input():
    process = False
    for nope in ('BEGIN TRANSACTION','COMMIT',
                 'sqlite_sequence','CREATE UNIQUE INDEX'):
      if nope in line: break
    else:
      process = True
    if not process: continue
    m = re.search('CREATE TABLE "([a-z_]*)"(.*)', line)
    if m:
      name, sub = m.groups()
      sub = sub.replace('"','`')
      line = '''DROP TABLE IF EXISTS %(name)s;
CREATE TABLE IF NOT EXISTS %(name)s%(sub)s
'''
      line = line % dict(name=name, sub=sub)
    else:
      m = re.search('INSERT INTO "([a-z_]*)"(.*)', line)
      if m:
        line = 'INSERT INTO %s%s\n' % m.groups()
        line = line.replace('"', r'\"')
        line = line.replace('"', "'")
    line = re.sub(r"([^'])'t'(.)", "\\1THIS_IS_TRUE\\2", line)
    line = line.replace('THIS_IS_TRUE', '1')
    line = re.sub(r"([^'])'f'(.)", "\\1THIS_IS_FALSE\\2", line)
    line = line.replace('THIS_IS_FALSE', '0')
    line = line.replace('AUTOINCREMENT', 'AUTO_INCREMENT')
    if re.search('^CREATE INDEX', line):
        line = line.replace('"','`')
    print line,

main()

Brad Gilbert ,Jul 1, 2009 at 18:43

Here is a slightly better version of the original.
#! /usr/bin/perl
use strict;
use warnings;
use 5.010; # for s/\K//;

while( <> ){
  next if m'
    BEGIN TRANSACTION   |
    COMMIT              |
    sqlite_sequence     |
    CREATE UNIQUE INDEX
  'x;

  if( my($name,$sub) = m'CREATE TABLE \"([a-z_]*)\"(.*)' ){
    # remove "
    $sub =~ s/\"//g; #"
    $_ = "DROP TABLE IF EXISTS $name;\nCREATE TABLE IF NOT EXISTS $name$sub\n";

  }elsif( /INSERT INTO \"([a-z_]*)\"(.*)/ ){
    $_ = "INSERT INTO $1$2\n";

    # " => \"
    s/\"/\\\"/g; #"
    # " => '
    s/\"/\'/g; #"

  }else{
    # '' => \'
    s/\'\'/\\\'/g; #'
  }

  # 't' => 1
  s/[^\\']\K\'t\'/1/g; #'

  # 'f' => 0
  s/[^\\']\K\'f\'/0/g; #'

  s/AUTOINCREMENT/AUTO_INCREMENT/g;
  print;
}

Mickey Mouse ,Jun 14, 2011 at 15:48

None of the scripts on this page can deal with this simple sqlite3 dump:
PRAGMA foreign_keys=OFF;
BEGIN TRANSACTION;
CREATE TABLE Filename (
  FilenameId INTEGER,
  Name TEXT DEFAULT '',
  PRIMARY KEY(FilenameId) 
  );
INSERT INTO "Filename" VALUES(1,'');
INSERT INTO "Filename" VALUES(2,'bigfile1');
INSERT INTO "Filename" VALUES(3,'%gconf-tree.xml');

None were able to reformat "table_name" into MySQL's proper `table_name` . Some messed up empty string values.

Sinan Ünür ,Jul 1, 2009 at 3:24

I am not sure what is so hard to understand about this that it requires a snide remark, as in your comment above. Note that <> is called the diamond operator, s/// is the substitution operator, and // is the match operator m// .

Ken_g6 ,Jul 1, 2009 at 3:22

Based on http://docs.python.org/dev/howto/regex.html ...
  1. Replace $line =~ /.*/ with re.search(r".*", line) .
  2. $line !~ /.*/ is just !($line =~ /.*/) .
  3. Replace $line =~ s/.*/x/g with line=re.sub(r".*", "x", line) .
  4. Replace $1 through $9 inside re.sub with \1 through \9 respectively.
  5. Outside a sub, save the return value, i.e. m=re.search() , and replace $1 with the return value of m.group(1) .
  6. For "INSERT INTO $1$2\n" specifically, you can do "INSERT INTO %s%s\n" % (m.group(1), m.group(2)) .

hpavc ,Jul 1, 2009 at 12:33

The real issue is: do you actually know how to migrate the database? What is presented is merely a search-and-replace loop.


Shortest? The tilde signifies a regex in perl. "import re" and go from there. The only key differences are that you'll be using \1 and \2 instead of $1 and $2 when you assign values, and you'll be using %s when you're replacing regexp matches inside strings.

[Mar 16, 2019] Translating Perl to Python - Stack Overflow

Mar 16, 2019 | stackoverflow.com

Translating Perl to Python Ask Question 21


John Kugelman ,Jul 1, 2009 at 3:29

I found this Perl script while migrating my SQLite database to mysql

I was wondering (since I don't know Perl) how could one rewrite this in Python?

Bonus points for the shortest (code) answer :)

edit : sorry I meant shortest code, not strictly shortest answer

#! /usr/bin/perl

while ($line = <>){
    if (($line !~  /BEGIN TRANSACTION/) && ($line !~ /COMMIT/) && ($line !~ /sqlite_sequence/) && ($line !~ /CREATE UNIQUE INDEX/)){

        if ($line =~ /CREATE TABLE \"([a-z_]*)\"(.*)/){
                $name = $1;
                $sub = $2;
                $sub =~ s/\"//g; #"
                $line = "DROP TABLE IF EXISTS $name;\nCREATE TABLE IF NOT EXISTS $name$sub\n";
        }
        elsif ($line =~ /INSERT INTO \"([a-z_]*)\"(.*)/){
                $line = "INSERT INTO $1$2\n";
                $line =~ s/\"/\\\"/g; #"
                $line =~ s/\"/\'/g; #"
        }else{
                $line =~ s/\'\'/\\\'/g; #'
        }
        $line =~ s/([^\\'])\'t\'(.)/$1THIS_IS_TRUE$2/g; #'
        $line =~ s/THIS_IS_TRUE/1/g;
        $line =~ s/([^\\'])\'f\'(.)/$1THIS_IS_FALSE$2/g; #'
        $line =~ s/THIS_IS_FALSE/0/g;
        $line =~ s/AUTOINCREMENT/AUTO_INCREMENT/g;
        print $line;
    }
}

Some additional code was necessary to successfully migrate the sqlite database (it handles one-line CREATE TABLE statements and foreign keys, and fixes a bug in the original program that converted empty fields '' to \').

I posted the code on the "migrating my SQLite database to mysql" question.

Jiaaro ,Jul 2, 2009 at 10:15

Here's a pretty literal translation with just the minimum of obvious style changes (putting all code into a function, using string rather than re operations where possible).
import re, fileinput

def main():
  for line in fileinput.input():
    process = False
    for nope in ('BEGIN TRANSACTION','COMMIT',
                 'sqlite_sequence','CREATE UNIQUE INDEX'):
      if nope in line: break
    else:
      process = True
    if not process: continue
    m = re.search('CREATE TABLE "([a-z_]*)"(.*)', line)
    if m:
      name, sub = m.groups()
      line = '''DROP TABLE IF EXISTS %(name)s;
CREATE TABLE IF NOT EXISTS %(name)s%(sub)s
'''
      line = line % dict(name=name, sub=sub)
    else:
      m = re.search('INSERT INTO "([a-z_]*)"(.*)', line)
      if m:
        line = 'INSERT INTO %s%s\n' % m.groups()
        line = line.replace('"', r'\"')
        line = line.replace('"', "'")
    line = re.sub(r"([^'])'t'(.)", r"\1THIS_IS_TRUE\2", line)
    line = line.replace('THIS_IS_TRUE', '1')
    line = re.sub(r"([^'])'f'(.)", r"\1THIS_IS_FALSE\2", line)
    line = line.replace('THIS_IS_FALSE', '0')
    line = line.replace('AUTOINCREMENT', 'AUTO_INCREMENT')
    print line,

main()

dr jimbob ,May 20, 2018 at 0:54

Alex Martelli's solution above works well, but needs some fixes and additions:

In the lines using regular expression substitution, the backreferences to the matched groups must be double-escaped, OR the replacement string must be prefixed with r to mark it as a raw string:

line = re.sub(r"([^'])'t'(.)", "\\1THIS_IS_TRUE\\2", line)

or

line = re.sub(r"([^'])'f'(.)", r"\1THIS_IS_FALSE\2", line)

Also, this line should be added before print:

line = line.replace('AUTOINCREMENT', 'AUTO_INCREMENT')

Last, the column names in CREATE statements should be quoted with backticks in MySQL. Add this at line 15:

  sub = sub.replace('"','`')

Here's the complete script with modifications:

import re, fileinput

def main():
  for line in fileinput.input():
    process = False
    for nope in ('BEGIN TRANSACTION','COMMIT',
                 'sqlite_sequence','CREATE UNIQUE INDEX'):
      if nope in line: break
    else:
      process = True
    if not process: continue
    m = re.search('CREATE TABLE "([a-z_]*)"(.*)', line)
    if m:
      name, sub = m.groups()
      sub = sub.replace('"','`')
      line = '''DROP TABLE IF EXISTS %(name)s;
CREATE TABLE IF NOT EXISTS %(name)s%(sub)s
'''
      line = line % dict(name=name, sub=sub)
    else:
      m = re.search('INSERT INTO "([a-z_]*)"(.*)', line)
      if m:
        line = 'INSERT INTO %s%s\n' % m.groups()
        line = line.replace('"', r'\"')
        line = line.replace('"', "'")
    line = re.sub(r"([^'])'t'(.)", "\\1THIS_IS_TRUE\\2", line)
    line = line.replace('THIS_IS_TRUE', '1')
    line = re.sub(r"([^'])'f'(.)", "\\1THIS_IS_FALSE\\2", line)
    line = line.replace('THIS_IS_FALSE', '0')
    line = line.replace('AUTOINCREMENT', 'AUTO_INCREMENT')
    if re.search('^CREATE INDEX', line):
        line = line.replace('"','`')
    print line,

main()

Brad Gilbert ,Jul 1, 2009 at 18:43

Here is a slightly better version of the original.
#! /usr/bin/perl
use strict;
use warnings;
use 5.010; # for s/\K//;

while( <> ){
  next if m'
    BEGIN TRANSACTION   |
    COMMIT              |
    sqlite_sequence     |
    CREATE UNIQUE INDEX
  'x;

  if( my($name,$sub) = m'CREATE TABLE \"([a-z_]*)\"(.*)' ){
    # remove "
    $sub =~ s/\"//g; #"
    $_ = "DROP TABLE IF EXISTS $name;\nCREATE TABLE IF NOT EXISTS $name$sub\n";

  }elsif( /INSERT INTO \"([a-z_]*)\"(.*)/ ){
    $_ = "INSERT INTO $1$2\n";

    # " => \"
    s/\"/\\\"/g; #"
    # " => '
    s/\"/\'/g; #"

  }else{
    # '' => \'
    s/\'\'/\\\'/g; #'
  }

  # 't' => 1
  s/[^\\']\K\'t\'/1/g; #'

  # 'f' => 0
  s/[^\\']\K\'f\'/0/g; #'

  s/AUTOINCREMENT/AUTO_INCREMENT/g;
  print;
}

Mickey Mouse ,Jun 14, 2011 at 15:48

None of the scripts on this page can deal with output as simple as this from sqlite3:
PRAGMA foreign_keys=OFF;
BEGIN TRANSACTION;
CREATE TABLE Filename (
  FilenameId INTEGER,
  Name TEXT DEFAULT '',
  PRIMARY KEY(FilenameId) 
  );
INSERT INTO "Filename" VALUES(1,'');
INSERT INTO "Filename" VALUES(2,'bigfile1');
INSERT INTO "Filename" VALUES(3,'%gconf-tree.xml');

None were able to reformat "table_name" into MySQL's proper `table_name`. Some mangled the empty string value.

Sinan Ünür ,Jul 1, 2009 at 3:24

I am not sure what is so hard to understand about this that it requires a snide remark as in your comment above. Note that <> is called the diamond operator, s/// is the substitution operator, and // is short for the match operator m//.

Ken_g6 ,Jul 1, 2009 at 3:22

Based on http://docs.python.org/dev/howto/regex.html ...
  1. Replace $line =~ /.*/ with re.search(r".*", line) .
  2. $line !~ /.*/ is just !($line =~ /.*/) .
  3. Replace $line =~ s/.*/x/g with line=re.sub(r".*", "x", line) .
  4. Replace $1 through $9 inside re.sub with \1 through \9 respectively.
  5. Outside a sub, save the return value, i.e. m=re.search() , and replace $1 with the return value of m.group(1) .
  6. For "INSERT INTO $1$2\n" specifically, you can do "INSERT INTO %s%s\n" % (m.group(1), m.group(2)) .

hpavc ,Jul 1, 2009 at 12:33

Real issue is do you know actually how to migrate the database? What is presented is merely a search and replace loop.

> ,

Shortest? The tilde signifies a regex match in Perl. "import re" and go from there. The only key differences are that you'll be using \1 and \2 instead of $1 and $2 when you assign values, and you'll be using %s when you're interpolating regexp matches into strings.

[Mar 16, 2019] Regex translation from Perl to Python - Stack Overflow

Mar 16, 2019 | stackoverflow.com

Regex translation from Perl to Python


royskatt ,Jan 30, 2014 at 14:45

I would like to rewrite a small Perl program in Python. I am processing text files with it as follows:

Input:

00000001;Root;;
00000002;  Documents;;
00000003;    oracle-advanced_plsql.zip;file;
00000004;  Public;;
00000005;  backup;;
00000006;    20110323-JM-F.7z.001;file;
00000007;    20110426-JM-F.7z.001;file;
00000008;    20110603-JM-F.7z.001;file;
00000009;    20110701-JM-F-via-summer_school;;
00000010;      20110701-JM-F-yyy.7z.001;file;

Desired output:

00000001;;Root;;
00000002;  ;Documents;;
00000003;    ;oracle-advanced_plsql.zip;file;
00000004;  ;Public;;
00000005;  ;backup;;
00000006;    ;20110323-JM-F.7z.001;file;
00000007;    ;20110426-JM-F.7z.001;file;
00000008;    ;20110603-JM-F.7z.001;file;
00000009;    ;20110701-JM-F-via-summer_school;;
00000010;      ;20110701-JM-F-yyy.7z.001;file;

Here is the working Perl code:

#!/usr/bin/perl -w
# filename: perl_regex.pl
while(<>) {                                                           
  s/^(.*?;.*?)(\w)/$1;$2/;                                            
  print $_;                                                           
}

I call it from the command line: perl_regex.pl input.txt

Explanation of the Perl-style regex:

s/        # start search-and-replace regexp
  ^       # start at the beginning of this line
  (       # save the matched characters until ')' in $1
    .*?;  # go forward until finding the first semicolon
    .*?   # go forward until finding... (to be continued below)
  )
  (       # save the matched characters until ')' in $2
    \w    # ... the next alphanumeric character.
  )
/         # continue with the replace part
  $1;$2   # write all characters found above, but insert a ; before $2
/         # finish the search-and-replace regexp.

Could anyone tell me, how to get the same result in Python? Especially for the $1 and $2 variables I couldn't find something alike.

royskatt ,Jan 31, 2014 at 6:18

Python regular expressions are very similar to Perl's; the main difference relevant here is the substitution syntax.

Use re.sub to replace:

import re
import sys

for line in sys.stdin: # Explicitly iterate standard input line by line
    # `line` contains trailing newline!
    line = re.sub(r'^(.*?;.*?)(\w)', r'\1;\2', line)
    #print(line) # print() would add a second trailing newline
    sys.stdout.write(line) # Print the replaced string back.

royskatt ,Jan 31, 2014 at 16:36

The replace instruction for s/pattern/replace/ in python regexes is the re.sub(pattern, replace, string) function, or re.compile(pattern).sub(replace, string). In your case, you will do it so:
_re_pattern = re.compile(r"^(.*?;.*?)(\w)")
result = _re_pattern.sub(r"\1;\2", line)

Note that $1 becomes \1. As in Perl, you need to iterate over your lines however you prefer (open, fileinput, splitlines, ...).

[Mar 13, 2019] amp html - Convert img to amp-img - Stack Overflow

Mar 13, 2019 | stackoverflow.com

> ,

Which is the default way to convert an <img> to a <amp-img> ?

Let me explain: on the site that I'm converting to AMP I have a lot of images without width and height, e.g.:

<img src="/img/image.png" alt="My image">

If I do not specify the layout, layout="container" is set by default and most of the images throw the following error:

amp-img error: Layout not supported for: container

On the other hand, most of the images don't fit the responsive layout, which is recommended by Google for most cases.

I have been checking the layout types in the documentation:

But none of them seems to fit an image that has to be shown at its real size, without specifying width or height.

So, in that case, which is the equivalent in AMP?

,

Since you have multiple images, it's better to use layout="responsive"; with that, you will at least make your images responsive.

Now regarding the width and height: they are a must.

If you read about the purpose of AMP, one of its goals is pages free of jumping/flickering content, which is what happens when no width is specified for images.

By specifying the width, the (mobile) browser can calculate the precise space to reserve for that image and lay out the content around it. That way, there won't be any flickering of the content as the page and images load.

Regarding the rewriting of your HTML, one tip I can provide: you can write a small utility in PHP, Python or Node.js which reads each source image, calculates its dimensions, and rewrites your IMG tags.
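For illustration, a minimal shell sketch of such a utility, assuming ImageMagick's identify(1) is installed (the img/ directory and file pattern are hypothetical):

for img in img/*.png; do
  # identify prints the image's real pixel dimensions
  read -r w h < <(identify -format '%w %h\n' "$img")
  printf '<amp-img src="/%s" width="%s" height="%s" layout="responsive"></amp-img>\n' \
         "$img" "$w" "$h"
done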

Hope this helps and wish you good luck for your AMP powered site :-)

[Mar 10, 2019] How do I detach a process from Terminal, entirely?

Mar 10, 2019 | superuser.com

stackoverflow.com, Aug 25, 2016 at 17:24

I use Tilda (drop-down terminal) on Ubuntu as my "command central" - pretty much the way others might use GNOME Do, Quicksilver or Launchy.

However, I'm struggling with how to completely detach a process (e.g. Firefox) from the terminal it's been launched from - i.e. prevent such a (non-)child process from polluting the terminal and from dying when the terminal is closed.

For example, in order to start Vim in a "proper" terminal window, I have tried a simple script like the following:

exec gnome-terminal -e "vim $@" &> /dev/null &

However, that still causes pollution (also, passing a file name doesn't seem to work).

lhunath, Sep 23, 2016 at 19:08

First of all: once you've started a process, you can background it by first stopping it (hit Ctrl-Z) and then typing bg to let it resume in the background. It's now a "job", and its stdout/stderr/stdin are still connected to your terminal.

You can start a process as backgrounded immediately by appending a "&" to the end of it:

firefox &

To run it in the background silenced, use this:

firefox </dev/null &>/dev/null &

Some additional info:

nohup is a program you can use to run your application such that its stdout/stderr are sent to a file instead and closing the parent script won't SIGHUP the child. However, you need to have had the foresight to use it before you started the application. Because of the way nohup works, you can't just apply it to a running process .

disown is a bash builtin that removes a shell job from the shell's job list. What this basically means is that you can't use fg , bg on it anymore, but more importantly, when you close your shell it won't hang or send a SIGHUP to that child anymore. Unlike nohup , disown is used after the process has been launched and backgrounded.
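As a quick illustration (a sketch; sleep stands in for any long-running command):

sleep 1000 &        # starts in the background; it is now in the shell's job table
jobs                # shows:  [1]+  Running    sleep 1000 &
disown %1           # removed from the job table: no SIGHUP when this shell exits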

What you can't do is change the stdout/stderr/stdin of a process after having launched it. At least not from the shell. If you launch your process and tell it that its stdout is your terminal (which is what you do by default), then that process is configured to output to your terminal. Your shell has no say in the process's FD setup; that's purely something the process itself manages. The process itself can decide whether to close its stdout/stderr/stdin or not, but you can't use your shell to force it to do so.

To manage a background process' output, you have plenty of options from scripts, "nohup" probably being the first to come to mind. But for interactive processes you start but forgot to silence ( firefox < /dev/null &>/dev/null & ) you can't do much, really.

I recommend you get GNU screen . With screen you can just close your running shell when the process' output becomes a bother and open a new one ( ^Ac ).


Oh, and by the way, don't use " $@ " where you're using it.

$@ means $1, $2, $3, ..., which would turn your command into:

gnome-terminal -e "vim $1" "$2" "$3" ...

That's probably not what you want because -e only takes one argument. Use $1 to show that your script can only handle one argument.

It's really difficult to get multiple arguments working properly in the scenario that you gave (with gnome-terminal -e) because -e takes only one argument, which is a shell command string. You'd have to encode your arguments into one. The best and most robust, but rather kludgy, way is like so:

gnome-terminal -e "vim $(printf "%q " "$@")"

Limited Atonement ,Aug 25, 2016 at 17:22

nohup cmd &

nohup detaches the process completely (daemonizes it)

Randy Proctor ,Sep 13, 2016 at 23:00

If you are using bash , try disown [ jobspec ] ; see bash(1) .

Another approach you can try is at now . If you're not superuser, your permission to use at may be restricted.

Stephen Rosen ,Jan 22, 2014 at 17:08

Reading these answers, I was under the initial impression that issuing nohup <command> & would be sufficient. Running zsh in gnome-terminal, I found that nohup <command> & did not prevent my shell from killing child processes on exit. Although nohup is useful, especially with non-interactive shells, it only guarantees this behavior if the child process does not reset its handler for the SIGHUP signal.

In my case, nohup should have prevented hangup signals from reaching the application, but the child application (VMWare Player in this case) was resetting its SIGHUP handler. As a result, when the terminal emulator exits, it could still kill your subprocesses. This can only be resolved, to my knowledge, by ensuring that the process is removed from the shell's jobs table. If nohup is overridden with a shell builtin, as is sometimes the case, this may be sufficient; however, in the event that it is not...


disown is a shell builtin in bash , zsh , and ksh93 ,

<command> &
disown

or

<command> & disown

if you prefer one-liners. This has the generally desirable effect of removing the subprocess from the jobs table. This allows you to exit the terminal emulator without accidentally signaling the child process at all. No matter what the SIGHUP handler looks like, this should not kill your child process.

After the disown, the process is still a child of your terminal emulator (play with pstree if you want to watch this in action), but after the terminal emulator exits, you should see it attached to the init process. In other words, everything is as it should be, and as you presumably want it to be.

What to do if your shell does not support disown ? I'd strongly advocate switching to one that does, but in the absence of that option, you have a few choices.

  1. screen and tmux can solve this problem, but they are much heavier weight solutions, and I dislike having to run them for such a simple task. They are much more suitable for situations in which you want to maintain a tty, typically on a remote machine.
  2. For many users, it may be desirable to see if your shell supports a capability like zsh's setopt nohup . This can be used to specify that SIGHUP should not be sent to the jobs in the jobs table when the shell exits. You can either apply this just before exiting the shell, or add it to shell configuration like ~/.zshrc if you always want it on. (For the bash counterpart, see the sketch after this list.)
  3. Find a way to edit the jobs table. I couldn't find a way to do this in tcsh or csh , which is somewhat disturbing.
  4. Write a small C program to fork off and exec() . This is a very poor solution, but the source should only consist of a couple dozen lines. You can then pass commands as commandline arguments to the C program, and thus avoid a process specific entry in the jobs table.
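Regarding option 2: bash's closest counterpart is the huponexit shell option; a sketch (note that huponexit only affects interactive login shells, and is off by default):

shopt -u huponexit   # when set, an interactive login bash SIGHUPs its jobs on exit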

Sheljohn ,Jan 10 at 10:20

  1. nohup $COMMAND &
  2. $COMMAND & disown
  3. setsid command

I've been using number 2 for a very long time, but number 3 works just as well. Also, disown has a 'nohup' flag of '-h', can disown all processes with '-a', and can disown all running processes with '-ar'.

Silencing is accomplished by '$COMMAND &>/dev/null'.

Hope this helps!

dunkyp ,Mar 25, 2009 at 1:51

I think screen might solve your problem.

Nathan Fellman ,Mar 23, 2009 at 14:55

in tcsh (and maybe in other shells as well), you can use parentheses to detach the process.

Compare this:

> jobs # shows nothing
> firefox &
> jobs
[1]  + Running                       firefox

To this:

> jobs # shows nothing
> (firefox &)
> jobs # still shows nothing
>

This removes firefox from the jobs listing, but it is still tied to the terminal; if you logged in to this node via 'ssh', trying to log out will still hang the ssh process.

,

To dissociate from the controlling tty, run the command through a sub-shell, e.g.:

(command)&

When exit is used the terminal closes, but the process is still alive.

Check:

(sleep 100) & exit

Open another terminal:

ps aux | grep sleep

Process is still alive.

[Mar 10, 2019] How to run tmux/screen with systemd 230 ?

Aug 02, 2018 | askubuntu.com

MvanGeest ,May 10, 2017 at 20:59

I run 16.04 and systemd now kills tmux when the user disconnects ( summary of the change ).

Is there a way to run tmux or screen (or any similar program) with systemd 230? I read all the heated discussion about the pros and cons of the behaviour, but no solution was suggested.

(I see the behaviour in 229 as well)

WoJ ,Aug 2, 2016 at 20:30

RemainAfterExit=

Takes a boolean value that specifies whether the service shall be considered active even when all its processes exited. Defaults to no.

jpath ,Feb 13 at 12:29

The proper solution is to disable the offending systemd behavior system-wide.

Edit /etc/systemd/logind.conf ( you must sudo , of course) and set

KillUserProcesses=no

You can also put this setting in a separate file, e.g. /etc/systemd/logind.conf.d/99-dont-kill-user-processes.conf .

Then restart systemd-logind.service .

sudo systemctl restart systemd-logind
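If you prefer the drop-in file mentioned above, a minimal sketch of its contents (logind options live in the [Login] section):

# /etc/systemd/logind.conf.d/99-dont-kill-user-processes.conf
[Login]
KillUserProcesses=no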

sarnold ,Dec 9, 2016 at 11:59

Based on @Rinzwind's answer and inspired by a unit description, the best I could find is to use TaaS (Tmux as a Service) - a generic detached instance of tmux that one reattaches to.
# cat /etc/systemd/system/tmux@.service

[Unit]
Description=tmux default session (detached)
Documentation=man:tmux(1)

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/bin/tmux new-session -d -s %I
ExecStop=/usr/bin/tmux kill-server
KillMode=none

[Install]
WantedBy=multiplexer.target

# systemctl start tmux@instanceone.service
# systemctl start tmux@instancetwo.service
# tmux list-sessions

instanceone: 1 windows (created Sun Jul 24 00:52:15 2016) [193x49]
instancetwo: 1 windows (created Sun Jul 24 00:52:19 2016) [193x49]

# tmux attach-session -t instanceone

(instanceone)#

Robin Hartmann ,Aug 2, 2018 at 20:23

You need to set the Type of the service to forking , as explained here .

Let's assume the service you want to run in screen is called minecraft . Then you would open minecraft.service in a text editor and add or edit the entry Type=forking under the section [Service] .
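For illustration, a minimal sketch of the relevant part of such a unit (the paths and the start script are hypothetical; screen -dmS starts a detached session and forks, which is the behavior Type=forking expects):

# excerpt of a hypothetical minecraft.service
[Service]
Type=forking
ExecStart=/usr/bin/screen -dmS minecraft /opt/minecraft/start.sh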

> ,

According to https://unix.stackexchange.com/a/287282/117599 invoking tmux using
systemd-run --user --scope tmux

should also do the trick.

[Mar 10, 2019] linux - How to attach terminal to detached process

Mar 10, 2019 | unix.stackexchange.com



Gilles ,Feb 16, 2012 at 21:39

I have detached a process from my terminal, like this:
$ process &

That terminal is now long closed, but process is still running and I want to send some commands to that process's stdin. Is that possible?

Samuel Edwin Ward ,Dec 22, 2018 at 13:34

Yes, it is. First, create a pipe: mkfifo /tmp/fifo . Use gdb to attach to the process: gdb -p PID

Then close stdin: call close (0) ; and open it again: call open ("/tmp/fifo", 0600)

Finally, write away (from a different terminal, as gdb will probably hang):

echo blah > /tmp/fifo
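Putting the steps above together as one session (a sketch: it requires ptrace permission and libc symbols in the target, and gdb may block on the open() call until a writer appears, as noted):

mkfifo /tmp/fifo
gdb -p "$PID"
(gdb) call close(0)
(gdb) call open("/tmp/fifo", 0)    # 0 = O_RDONLY
(gdb) detach
(gdb) quit
# then, from another terminal:
echo blah > /tmp/fifo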

NiKiZe ,Jan 6, 2017 at 22:52

When original terminal is no longer accessible...

reptyr might be what you want, see https://serverfault.com/a/284795/187998

Quote from there:

Have a look at reptyr , which does exactly that. The github page has all the information.
reptyr - A tool for "re-ptying" programs.

reptyr is a utility for taking an existing running program and attaching it to a new terminal. Started a long-running process over ssh, but have to leave and don't want to interrupt it? Just start a screen, use reptyr to grab it, and then kill the ssh session and head on home.

USAGE

reptyr PID

"reptyr PID" will grab the process with id PID and attach it to your current terminal.

After attaching, the process will take input from and write output to the new terminal, including ^C and ^Z. (Unfortunately, if you background it, you will still have to run "bg" or "fg" in the old terminal. This is likely impossible to fix in a reasonable way without patching your shell.)

manatwork ,Nov 20, 2014 at 22:59

I am quite sure you can not.

Check using ps x . If a process has a ? as controlling tty , you can not send input to it any more.

9942 ?        S      0:00 tail -F /var/log/messages
9947 pts/1    S      0:00 tail -F /var/log/messages

In this example, you can send input to 9947 doing something like echo "test" > /dev/pts/1 . The other process ( 9942 ) is not reachable.

Next time, you could use screen or tmux to avoid this situation.

Stéphane Gimenez ,Feb 16, 2012 at 16:16

EDIT: As Stephane Gimenez said, it's not that simple. It only allows you to print to a different terminal.

You can try to write to this process using /proc . It should be located in /proc/ pid /fd/0 , so a simple :

echo "hello" > /proc/PID/fd/0

should do it. I have not tried it, but it should work, as long as this process still has a valid stdin file descriptor. You can check it with ls -l on /proc/ pid /fd/ .

See nohup for more details about how to keep processes running.

Stéphane Gimenez ,Nov 20, 2015 at 5:08

Just ending the command line with & will not completely detach the process, it will just run it in the background. (With zsh you can use &! to actually detach it; otherwise you have to disown it later).

When a process runs in the background, it won't receive input from its controlling terminal anymore. But you can send it back into the foreground with fg and then it will read input again.

Otherwise, it's not possible to externally change its filedescriptors (including stdin) or to reattach a lost controlling terminal unless you use debugging tools (see Ansgar's answer , or have a look at the retty command).

[Mar 10, 2019] linux - Preventing tmux session created by systemd from automatically terminating on Ctrl+C - Stack Overflow

Mar 10, 2019 | stackoverflow.com

Preventing tmux session created by systemd from automatically terminating on Ctrl+C


Jim Stewart ,Nov 10, 2018 at 12:55

For a few days now I've been successfully running the new Minecraft Bedrock Edition dedicated server on my Ubuntu 18.04 LTS home server. Because it should be available 24/7 and start up automatically after boot, I created a systemd service for a detached tmux session:

tmux.minecraftserver.service

[Unit]
Description=tmux minecraft_server detached

[Service]
Type=forking
WorkingDirectory=/home/mine/minecraftserver
ExecStart=/usr/bin/tmux new -s minecraftserver -d "LD_LIBRARY_PATH=. /home/mine/minecraftser$
User=mine

[Install]
WantedBy=multi-user.target

Everything works as expected but there's one tiny thing that keeps bugging me:

How can I prevent tmux from terminating its whole session when I press Ctrl+C? I just want to terminate the Minecraft server process itself instead of the whole tmux session. When starting the server from the command line in a manually created tmux session this does work (the session stays alive), but not when the session was brought up by systemd.

FlKo ,Nov 12, 2018 at 6:21

When starting the server from the command line in a manually created tmux session this does work (session stays alive) but not when the session was brought up by systemd .

The difference between these situations is actually unrelated to systemd. In one case, you're starting the server from a shell within the tmux session, and when the server terminates, control returns to the shell. In the other case, you're starting the server directly within the tmux session, and when it terminates there's no shell to return to, so the tmux session also dies.

tmux has an option to keep the session alive after the process inside it dies (look for remain-on-exit in the manpage), but that's probably not what you want: you want to be able to return to an interactive shell, to restart the server, investigate why it died, or perform maintenance tasks, for example. So it's probably better to change your command to this:

'LD_LIBRARY_PATH=. /home/mine/minecraftserver/ ; exec bash'

That is, first run the server, and then, after it terminates, replace the process (the shell which tmux implicitly spawns to run the command, but which will then exit) with another, interactive shell. (For some other ways to get an interactive shell after the command exits, see e.g. this question – but note that the <(echo commands) syntax suggested in the top answer is not available in systemd unit files.)
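Applied to the unit above, the ExecStart line would become something like this sketch (it relies on the WorkingDirectory already set in the unit):

ExecStart=/usr/bin/tmux new -s minecraftserver -d "LD_LIBRARY_PATH=. ./bedrock_server ; exec bash"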

FlKo ,Nov 12, 2018 at 6:21

I was able to solve this by using systemd's ExecStartPost and tmux's send-keys like this:
[Unit]
Description=tmux minecraft_server detached

[Service]
Type=forking
WorkingDirectory=/home/mine/minecraftserver
ExecStart=/usr/bin/tmux new -d -s minecraftserver
ExecStartPost=/usr/bin/tmux send-keys -t minecraftserver "cd /home/mine/minecraftserver/" Enter "LD_LIBRARY_PATH=. ./bedrock_server" Enter

User=mine

[Install]
WantedBy=multi-user.target

[Mar 01, 2019] Creating symlinks instead of /bin /sbin /lib and /lib64 directories in RHEL7

That change essentially means that /usr should be on the root partition, not on a separate partition, which, given current hard drive sizes, is a reasonable requirement.
Notable quotes:
"... On Linux /bin and /usr/bin are still separate because it is common to have /usr on a separate partition (although this configuration breaks in subtle ways, sometimes). In /bin is all the commands that you will need if you only have / mounted. ..."
Mar 01, 2019 | unix.stackexchange.com

balki ,May 2, 2015 at 6:17

What? No, /bin is not a symlink to /usr/bin on any FHS-compliant system. Note that there are still popular Unixes and Linuxes that ignore this - for example, /bin and /sbin are symlinked to /usr/bin on Arch Linux (the reasoning being that you don't need /bin for rescue/single-user mode, since you'd just boot a live CD).

/bin

contains commands that may be used by both the system administrator and by users, but which are required when no other filesystems are mounted (e.g. in single user mode). It may also contain commands which are used indirectly by scripts

/usr/bin/

This is the primary directory of executable commands on the system.

essentially, /bin contains executables which are required by the system for emergency repairs, booting, and single user mode. /usr/bin contains any binaries that aren't required.

I will note that while they can be on separate disks/partitions, /bin must be on the same disk as /; /usr/bin can be on another disk - although this configuration has been kind of broken for a while (which is why e.g. systemd warns about it on boot).

For full correctness: some Unices may ignore the FHS, as I believe it is only a Linux standard; I'm not aware that it has yet been included in SUS, POSIX or any other UNIX standard, though it should be, IMHO. It is part of the LSB standard, though.

LawrenceC ,Jan 13, 2015 at 16:12

/sbin - Binaries needed for booting, low-level system repair, or maintenance (run level 1 or S)

/bin - Binaries needed for normal/standard system functioning at any run level.

/usr/bin - Application/distribution binaries meant to be accessed by locally logged in users

/usr/sbin - Application/distribution binaries that support or configure stuff in /sbin.

/usr/share/bin - Application/distribution binaries or scripts meant to be accessed via the web, i.e. Apache web applications

*local* - Binaries not part of a distribution; locally compiled or manually installed. There's usually never a /local/bin but always a /usr/local/bin and /usr/local/share/bin .

JonnyJD ,Jan 3, 2013 at 0:17

Some kind of "update" on this issue:

Recently some Linux distributions are merging /bin into /usr/bin and relatedly /lib into /usr/lib . Sometimes also (/usr)/sbin to /usr/bin (Arch Linux). So /usr is expected to be available at the same time as / .

The distinction between the two hierarchies is taken to be unnecessary complexity now. The idea was once to have only /bin available at boot, but having an initial ramdisk makes this obsolete.

I know of Fedora Linux (2011) and Arch Linux (2012) going this way, and Solaris has been doing this for a long time (> 15 years).

xenoterracide ,Jan 17, 2011 at 16:23

On Linux /bin and /usr/bin are still separate because it is common to have /usr on a separate partition (although this configuration breaks in subtle ways, sometimes). In /bin is all the commands that you will need if you only have / mounted.

On Solaris and Arch Linux (and probably others) /bin is a symlink to /usr/bin . Arch also has /sbin and /usr/sbin symlinked to /usr/bin .

Of particular note, the statement that /bin is for "system administrator" commands and /usr/bin is for user commands is not true (unless you think that bash and ls are for admins only, in which case you have a lot to learn). Administrator commands are in /sbin and /usr/sbin .
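A quick, non-authoritative way to check whether your own system uses the merged layout:

ls -ld /bin /sbin /lib /lib64
# on a merged-/usr distribution each line looks something like:
#   lrwxrwxrwx 1 root root 7 ... /bin -> usr/bin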

[Feb 21, 2019] The rm='rm -i' alias is an horror

Feb 21, 2019 | superuser.com

The rm='rm -i' alias is a horror because after a while of using it, you will expect rm to prompt you by default before removing files. Of course, one day you'll run it with an account that doesn't have that alias set, and before you understand what's going on, it is too late.

... ... ...

If you want safe aliases, but don't want to risk getting used to commands working differently on your system than on others, you can disable rm like this:
alias rm='echo "rm is disabled, use remove or trash or /bin/rm instead."'

Then you can create your own safe alias, e.g.

alias remove='/bin/rm -irv'

or use trash instead.

[Feb 21, 2019] What is the minimum I have to do to create an RPM file?

Feb 21, 2019 | stackoverflow.com

webwesen ,Jan 29, 2016 at 6:42

I just want to create an RPM file to distribute my Linux binary "foobar", with only a couple of dependencies. It has a config file, /etc/foobar.conf and should be installed in /usr/bin/foobar.

Unfortunately the documentation for RPM is 27 chapters long and I really don't have a day to sit down and read this, because I am also busy making .deb and EXE installers for other platforms.

What is the absolute minimum I have to do to create an RPM? Assume the foobar binary and foobar.conf are in the current working directory.

icasimpan ,Apr 10, 2018 at 13:33

I often build binary RPMs to package proprietary apps - even monsters like WebSphere - on Linux. So my experience may be useful to you as well, although it would be better to build a true (from-source) RPM if you can. But I digress.

The basic steps for packaging your (binary) program are as follows. Suppose the program is toybinprog, version 1.0, with a config file to be installed as /etc/toybinprog/toybinprog.conf and a binary called toybinprog to be installed in /usr/bin:

1. create your rpm build env for RPM < 4.6,4.7
mkdir -p ~/rpmbuild/{RPMS,SRPMS,BUILD,SOURCES,SPECS,tmp}

cat <<EOF >~/.rpmmacros
%_topdir   %(echo $HOME)/rpmbuild
%_tmppath  %{_topdir}/tmp
EOF

cd ~/rpmbuild
2. create the tarball of your project
mkdir toybinprog-1.0
mkdir -p toybinprog-1.0/usr/bin
mkdir -p toybinprog-1.0/etc/toybinprog
install -m 755 toybinprog toybinprog-1.0/usr/bin
install -m 644 toybinprog.conf toybinprog-1.0/etc/toybinprog/

tar -zcvf toybinprog-1.0.tar.gz toybinprog-1.0/
3. Copy to the sources dir
cp toybinprog-1.0.tar.gz SOURCES/

cat <<EOF > SPECS/toybinprog.spec
# Don't try fancy stuff like debuginfo, which is useless on binary-only
# packages. Don't strip binary too
# Be sure buildpolicy set to do nothing
%define        __spec_install_post %{nil}
%define          debug_package %{nil}
%define        __os_install_post %{_dbpath}/brp-compress

Summary: A very simple toy bin rpm package
Name: toybinprog
Version: 1.0
Release: 1
License: GPL+
Group: Development/Tools
Source0: %{name}-%{version}.tar.gz
URL: http://toybinprog.company.com/

BuildRoot: %{_tmppath}/%{name}-%{version}-%{release}-root

%description
%{summary}

%prep
%setup -q

%build
# Empty section.

%install
rm -rf %{buildroot}
mkdir -p  %{buildroot}

# in builddir
cp -a * %{buildroot}


%clean
rm -rf %{buildroot}


%files
%defattr(-,root,root,-)
%config(noreplace) %{_sysconfdir}/%{name}/%{name}.conf
%{_bindir}/*

%changelog
* Thu Apr 24 2009  Elia Pinto <[email protected]> 1.0-1
- First Build

EOF
4. build the source and the binary rpm
rpmbuild -ba SPECS/toybinprog.spec

And that's all.

Hope this helps.
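The built packages then land under ~/rpmbuild. A usage sketch (the architecture subdirectory may be x86_64, noarch, etc., depending on the spec and build host):

ls ~/rpmbuild/RPMS/*/toybinprog-1.0-1.*.rpm ~/rpmbuild/SRPMS/
sudo rpm -ivh ~/rpmbuild/RPMS/*/toybinprog-1.0-1.*.rpm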

> ,

As an application distributor, fpm sounds perfect for your needs. There is an example here which shows how to package an app from source. FPM can produce both deb and RPM files.

[Feb 21, 2019] perl - How to prompt for input and exit if the user entered an empty string - Stack Overflow

Feb 20, 2019 | stackoverflow.com

NewLearner ,Mar 12, 2012 at 3:22

I'm new to Perl and I'm writing a program where I want to force the user to enter a word. If the user enters an empty string then the program should exit.

This is what I have so far:

print "Enter a word to look up: ";

chomp ($usrword = <STDIN>);

DVK , Nov 19, 2015 at 19:11

You're almost there.
print "Enter a word to look up: ";
my $userword = <STDIN>; # I moved chomp to a new line to make it more readable
chomp $userword; # Get rid of newline character at the end
exit 0 if ($userword eq ""); # If empty string, exit.

Pondy , Jul 6 '16 at 22:11

Output is buffered by default. Since the prompt has no trailing newline, it is still sitting in the output buffer. You can disable buffering on STDOUT by adding this line of code before printing:
select((select(STDOUT), $|=1)[0]);

[Feb 11, 2019] Resuming rsync on a interrupted transfer

May 15, 2013 | stackoverflow.com

Glitches , May 15, 2013 at 18:06

I am trying to back up my file server to a remote file server using rsync. Rsync does not successfully resume when a transfer is interrupted. I used the partial option, but rsync can't find the file it already started transferring, because it renames it to a temporary file; when resumed, it creates a new file and starts from the beginning.

Here is my command:

rsync -avztP -e "ssh -p 2222" /volume1/ myaccont@backup-server-1:/home/myaccount/backup/ --exclude "@spool" --exclude "@tmp"

When this command is run, a backup file named OldDisk.dmg from my local machine gets created on the remote machine as something like .OldDisk.dmg.SjDndj23 .

Now when the internet connection gets interrupted and I have to resume the transfer, I have to find where rsync left off by locating the temp file, e.g. .OldDisk.dmg.SjDndj23, and renaming it to OldDisk.dmg so that rsync sees there already exists a file it can resume.

How do I fix this so I don't have to manually intervene each time?

Richard Michael , Nov 6, 2013 at 4:26

TL;DR : Use --timeout=X (X in seconds) to change the default rsync server timeout, not --inplace .

The issue is the rsync server processes (of which there are two, see rsync --server ... in ps output on the receiver) continue running, to wait for the rsync client to send data.

If the rsync server processes do not receive data for a sufficient time, they will indeed time out, self-terminate and clean up by moving the temporary file to its "proper" name (e.g., no temporary suffix). You'll then be able to resume.

If you don't want to wait for the long default timeout to cause the rsync server to self-terminate, then when your internet connection returns, log into the server and clean up the rsync server processes manually. However, you must politely terminate rsync -- otherwise, it will not move the partial file into place; but rather, delete it (and thus there is no file to resume). To politely ask rsync to terminate, do not SIGKILL (e.g., -9 ), but SIGTERM (e.g., pkill -TERM -x rsync - only an example, you should take care to match only the rsync processes concerned with your client).

Fortunately there is an easier way: use the --timeout=X (X in seconds) option; it is passed to the rsync server processes as well.

For example, if you specify rsync ... --timeout=15 ... , both the client and server rsync processes will cleanly exit if they do not send/receive data in 15 seconds. On the server, this means moving the temporary file into position, ready for resuming.
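Applied to the original command from the question, the fix is just the extra flag; a sketch (same paths and account as above, timeout value picked arbitrarily):

rsync -avztP --timeout=15 -e "ssh -p 2222" /volume1/ \
      myaccont@backup-server-1:/home/myaccount/backup/ \
      --exclude "@spool" --exclude "@tmp"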

I'm not sure how long the various rsync processes will try to send/receive data before they die by default (it might vary with the operating system). In my testing, the server rsync processes remain running longer than the local client. On a "dead" network connection, the client terminates with a broken pipe (e.g., no network socket) after about 30 seconds; you could experiment or review the source code. Meaning, you could try to "ride out" the bad internet connection for 15-20 seconds.

If you do not clean up the server rsync processes (or wait for them to die), but instead immediately launch another rsync client process, two additional server processes will launch (for the other end of your new client process). Specifically, the new rsync client will not re-use/reconnect to the existing rsync server processes. Thus, you'll have two temporary files (and four rsync server processes) -- though, only the newer, second temporary file has new data being written (received from your new rsync client process).

Interestingly, if you then clean up all rsync server processes (for example, stop your client, which will stop the new rsync servers, then SIGTERM the older rsync servers), it appears to merge (assemble) all the partial files into the new properly named file. So, imagine a long-running partial copy which dies (and you think you've "lost" all the copied data), and a short-running re-launched rsync (oops!): you can stop the second client, SIGTERM the first servers, and it will merge the data so you can resume.


JamesTheAwesomeDude , Dec 29, 2013 at 16:50

Just curious: wouldn't SIGINT (aka ^C ) be 'politer' than SIGTERM ? – JamesTheAwesomeDude Dec 29 '13 at 16:50

Richard Michael , Dec 29, 2013 at 22:34

I didn't test how the server-side rsync handles SIGINT, so I'm not sure it will keep the partial file - you could check. Note that this doesn't have much to do with Ctrl-c ; it happens that your terminal sends SIGINT to the foreground process when you press Ctrl-c , but the server-side rsync has no controlling terminal. You must log in to the server and use kill . The client-side rsync will not send a message to the server (for example, after the client receives SIGINT via your terminal Ctrl-c ) - might be interesting though. As for anthropomorphizing, not sure what's "politer". :-) – Richard Michael Dec 29 '13 at 22:34

d-b , Feb 3, 2015 at 8:48

I just tried this timeout argument rsync -av --delete --progress --stats --human-readable --checksum --timeout=60 --partial-dir /tmp/rsync/ rsync://$remote:/ /src/ but then it timed out during the "receiving file list" phase (which in this case takes around 30 minutes). Setting the timeout to half an hour kind of defeats the purpose. Any workaround for this? – d-b Feb 3 '15 at 8:48

Cees Timmerman , Sep 15, 2015 at 17:10

@user23122 --checksum reads all data when preparing the file list, which is great for many small files that change often, but should be done on-demand for large files. – Cees Timmerman Sep 15 '15 at 17:10

[Jan 29, 2019] Do journaling filesystems guarantee against corruption after a power failure

Jan 29, 2019 | unix.stackexchange.com

Nathan Osman ,May 6, 2011 at 1:50

I am asking this question on behalf of another user who raised the issue in the Ubuntu chat room.

Do journaling filesystems guarantee that no corruption will occur if a power failure occurs?

If this answer depends on the filesystem, please indicate which ones do protect against corruption and which ones don't.

Andrew Lambert ,May 6, 2011 at 2:51

There are no guarantees. A Journaling File System is more resilient and is less prone to corruption, but not immune.

All a journal is is a list of operations which have recently been done to the file system. The crucial part is that the journal entry is made before the operations take place. Most operations have multiple steps. Deleting a file, for example might entail deleting the file's entry in the file system's table of contents and then marking the sectors on the drive as free. If something happens between the two steps, a journaled file system can tell immediately and perform the necessary clean up to keep everything consistent. This is not the case with a non-journaled file system which has to look at the entire contents of the volume to find errors.

While this journaling is much less prone to corruption than not journaling, corruption can still occur. For example, if the hard drive is mechanically malfunctioning or if writes to the journal itself are failing or interrupted.

The basic premise of journaling is that writing a journal entry is much quicker, usually, than the actual transaction it describes will be. So, the period between the OS ordering a (journal) write and the hard drive fulfilling it is much shorter than for a normal write: a narrower window for things to go wrong in, but there's still a window.


Nathan Osman ,May 6, 2011 at 2:57

Could you please elaborate a little bit on why this is true? Perhaps you could give an example of how corruption would occur in a certain scenario. – Nathan Osman May 6 '11 at 2:57

Andrew Lambert ,May 6, 2011 at 3:21

@George Edison See my expanded answer. – Andrew Lambert May 6 '11 at 3:21

psusi ,May 6, 2011 at 17:58

That last bit is incorrect; there is no window for things to go wrong. Since it records what it is about to do before it starts doing it, the operation can be restarted after the power failure, no matter at what point it occurs during the operation. It is a matter of ordering, not timing. – psusi May 6 '11 at 17:58

Andrew Lambert ,May 6, 2011 at 21:23

@psusi there is still a window for the write to the journal to be interrupted. Journal writes may appear atomic to the OS but they're still writes to the disk. – Andrew Lambert May 6 '11 at 21:23

psusi ,May 7, 2011 at 1:57

@Amazed they are atomic because they have sequence numbers and/or checksums, so the journal entry is either written entirely, or not. If it is not written entirely, it is simply ignored after the system restarts, and no further changes were made to the fs so it remains consistent. – psusi May 7 '11 at 1:57

Mikel ,May 6, 2011 at 6:03

No.

The most common type of journaling, called metadata journaling, only protects the integrity of the file system, not of data. This includes xfs , and ext3 / ext4 in the default data=ordered mode.

If a non-journaling file system suffers a crash, it will be checked using fsck on the next boot. fsck scans every inode on the file system, looking for blocks that are marked as used but are not reachable (i.e. have no file name), and marks those blocks as unused. Doing this takes a long time.

With a metadata journaling file system, instead of doing an fsck , it knows which blocks it was in the middle of changing, so it can mark them as free without searching the whole partition for them.

There is a less common type of journaling, called data journaling, which is what ext3 does if you mount it with the data=journal option.

It attempts to protect all your data by writing not just a list of logical operations, but also the entire contents of each write to the journal. But because it's writing your data twice, it can be much slower.

As others have pointed out, even this is not a guarantee, because the hard drive might have told the operating system it had stored the data, when it fact it was still in the hard drive's cache.

For more information, take a look at the Wikipedia Journaling File System article and the Data Mode section of the ext4 documentation .
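As a sketch of how the data mode is selected (the device and mount point are hypothetical; data=journal applies to ext3/ext4):

mount -o data=journal /dev/sdb1 /mnt/data
# or persistently, via /etc/fstab:
# /dev/sdb1   /mnt/data   ext4   data=journal   0 2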

SplinterReality ,May 6, 2011 at 8:03

+1 for the distinction between file system corruption and data corruption. That little distinction is quite the doozy in practice. – SplinterReality May 6 '11 at 8:03

boehj ,May 6, 2011 at 10:57

Excuse my utter ignorance, but doesn't data=journal as a feature make no sense at all? – boehj May 6 '11 at 10:57

psusi ,May 6, 2011 at 18:11

Again, the OS knows when the drive caches data and forces it to flush it when needed in order to maintain a coherent fs. Your data file of course, can be lost or corrupted if the application that was writing it when the power failed was not doing so carefully, and that applies whether or not you use data=journal. – psusi May 6 '11 at 18:11

user3338098 ,Aug 1, 2016 at 16:30

@psusi doesn't matter how careful the program is in writing the data, plenty of hard drives silently corrupt the data on READING stackoverflow.com/q/34141117/3338098user3338098 Aug 1 '16 at 16:30

psusi ,Aug 21, 2016 at 3:22

@user3338098, drives that silently corrupt data are horribly broken and should not ever be used, and are an entirely different conversation than corruption caused by software doing the wrong thing. – psusi Aug 21 '16 at 3:22

camh ,May 6, 2011 at 3:26

A filesystem cannot guarantee its own consistency if a power failure occurs, because it does not know what the hardware will do.

If a hard drive buffers data for write but tells the OS that it has written the data and does not support the appropriate write barriers, then out-of-order writes can occur where an earlier write has not hit the platter, but a later one has. See this serverfault answer for more details.

Also, the position of the head on a magnetic HDD is controlled with electro-magnets. If power fails in the middle of a write, it is possible for some data to continue to be written while the heads move, corrupting data on blocks that the filesystem never intended to be written.

Nathan Osman ,May 6, 2011 at 6:43

Isn't the drive's firmware smart enough to suspend writing when retracting the head? – Nathan Osman May 6 '11 at 6:43

camh ,May 6, 2011 at 7:54

@George: It's going to depend on the drive. There's a lot out there and you don't know how well your (cheap) drive does things. – camh May 6 '11 at 7:54

psusi ,May 6, 2011 at 18:05

The hard drive tells the OS if it uses a write behind cache, and the OS takes measures to ensure they are flushed in the correct order. Also drives are designed so that when the power fails, they stop writing. I have seen some cases where the sector being written at the time of power loss becomes corrupt because it did not finish updating the ecc ( but can be easily re-written correctly ), but never heard of random sectors being corrupted on power loss. – psusi May 6 '11 at 18:05

jlliagre ,May 6, 2011 at 8:35

ZFS, which is close to but not exactly a journaling filesystem, guarantees by design against corruption after a power failure.

It doesn't matter if an ongoing write is interrupted in the middle: in that case its checksum will certainly be incorrect, so the block will be ignored. As the file system is copy-on-write, the previous correct data (or metadata) is still on disk and will be used instead.

sakisk ,May 6, 2011 at 10:13

The answer is, in most cases, no.

Nathan Osman ,May 6, 2011 at 16:35

What events could lead to a corrupt journal? The only thing I could think of was bad sectors - is there anything else? – Nathan Osman May 6 '11 at 16:35

sakisk ,May 7, 2011 at 13:21

That's right, hardware failures are the usual case. – sakisk May 7 '11 at 13:21

[Jan 29, 2019] Split string into an array in Bash

May 14, 2012 | stackoverflow.com

Lgn ,May 14, 2012 at 15:15

In a Bash script I would like to split a line into pieces and store them in an array.

The line:

Paris, France, Europe

I would like to have them in an array like this:

array[0] = Paris
array[1] = France
array[2] = Europe

I would like to use simple code; the command's speed doesn't matter. How can I do it?

antak ,Jun 18, 2018 at 9:22

This is #1 Google hit but there's controversy in the answer because the question unfortunately asks about delimiting on , (comma-space) and not a single character such as comma. If you're only interested in the latter, answers here are easier to follow: stackoverflow.com/questions/918886/ – antak Jun 18 '18 at 9:22

Dennis Williamson ,May 14, 2012 at 15:16

IFS=', ' read -r -a array <<< "$string"

Note that the characters in $IFS are treated individually as separators so that in this case fields may be separated by either a comma or a space rather than the sequence of the two characters. Interestingly though, empty fields aren't created when comma-space appears in the input because the space is treated specially.

To access an individual element:

echo "${array[0]}"

To iterate over the elements:

for element in "${array[@]}"
do
    echo "$element"
done

To get both the index and the value:

for index in "${!array[@]}"
do
    echo "$index ${array[index]}"
done

The last example is useful because Bash arrays are sparse. In other words, you can delete an element or add an element and then the indices are not contiguous.

unset "array[1]"
array[42]=Earth

To get the number of elements in an array:

echo "${#array[@]}"

As mentioned above, arrays can be sparse so you shouldn't use the length to get the last element. Here's how you can in Bash 4.2 and later:

echo "${array[-1]}"

in any version of Bash (from somewhere after 2.05b):

echo "${array[@]: -1:1}"

Larger negative offsets select farther from the end of the array. Note the space before the minus sign in the older form. It is required.

l0b0 ,May 14, 2012 at 15:24

Just use IFS=', ' , then you don't have to remove the spaces separately. Test: IFS=', ' read -a array <<< "Paris, France, Europe"; echo "${array[@]}" – l0b0 May 14 '12 at 15:24

Dennis Williamson ,May 14, 2012 at 16:33

@l0b0: Thanks. I don't know what I was thinking. I like to use declare -p array for test output, by the way. – Dennis Williamson May 14 '12 at 16:33

Nathan Hyde ,Mar 16, 2013 at 21:09

@Dennis Williamson - Awesome, thorough answer. – Nathan Hyde Mar 16 '13 at 21:09

dsummersl ,Aug 9, 2013 at 14:06

MUCH better than multiple cut -f calls! – dsummersl Aug 9 '13 at 14:06

caesarsol ,Oct 29, 2015 at 14:45

Warning: the IFS variable means split by one of these characters , so it's not a sequence of chars to split by. IFS=', ' read -a array <<< "a,d r s,w" => ${array[*]} == a d r s w – caesarsol Oct 29 '15 at 14:45

Jim Ho ,Mar 14, 2013 at 2:20

Here is a way without setting IFS:
string="1:2:3:4:5"
set -f                      # avoid globbing (expansion of *).
array=(${string//:/ })
for i in "${!array[@]}"
do
    echo "$i=>${array[i]}"
done

The idea is using string replacement:

${string//substring/replacement}

to replace all matches of $substring with whitespace and then using the substituted string to initialize an array:

(element1 element2 ... elementN)

Note: this answer makes use of the split+glob operator . Thus, to prevent expansion of some characters (such as * ) it is a good idea to pause globbing for this script.

Werner Lehmann ,May 4, 2013 at 22:32

Used this approach... until I came across a long string to split. 100% CPU for more than a minute (then I killed it). It's a pity because this method allows splitting by a string, not just by some character in IFS. – Werner Lehmann May 4 '13 at 22:32

Dieter Gribnitz ,Sep 2, 2014 at 15:46

WARNING: Just ran into a problem with this approach. If you have an element named * you will get all the elements of your cwd as well. Thus string="1:2:3:4:*" will give some unexpected and possibly dangerous results depending on your implementation. Did not get the same error with (IFS=', ' read -a array <<< "$string") and this one seems safe to use. – Dieter Gribnitz Sep 2 '14 at 15:46

akostadinov ,Nov 6, 2014 at 14:31

Not reliable for many kinds of values; use with care. – akostadinov Nov 6 '14 at 14:31

Andrew White ,Jun 1, 2016 at 11:44

quoting ${string//:/ } prevents shell expansion – Andrew White Jun 1 '16 at 11:44

Mark Thomson ,Jun 5, 2016 at 20:44

I had to use the following on OSX: array=(${string//:/ }) – Mark Thomson Jun 5 '16 at 20:44

bgoldst ,Jul 19, 2017 at 21:20

All of the answers to this question are wrong in one way or another.

Wrong answer #1

IFS=', ' read -r -a array <<< "$string"

1: This is a misuse of $IFS . The value of the $IFS variable is not taken as a single variable-length string separator, rather it is taken as a set of single-character string separators, where each field that read splits off from the input line can be terminated by any character in the set (comma or space, in this example).

Actually, for the real sticklers out there, the full meaning of $IFS is slightly more involved. From the bash manual :

The shell treats each character of IFS as a delimiter, and splits the results of the other expansions into words using these characters as field terminators. If IFS is unset, or its value is exactly <space><tab><newline> , the default, then sequences of <space> , <tab> , and <newline> at the beginning and end of the results of the previous expansions are ignored, and any sequence of IFS characters not at the beginning or end serves to delimit words. If IFS has a value other than the default, then sequences of the whitespace characters <space> , <tab> , and <newline> are ignored at the beginning and end of the word, as long as the whitespace character is in the value of IFS (an IFS whitespace character). Any character in IFS that is not IFS whitespace, along with any adjacent IFS whitespace characters, delimits a field. A sequence of IFS whitespace characters is also treated as a delimiter. If the value of IFS is null, no word splitting occurs.

Basically, for non-default non-null values of $IFS , fields can be separated with either (1) a sequence of one or more characters that are all from the set of "IFS whitespace characters" (that is, whichever of <space> , <tab> , and <newline> ("newline" meaning line feed (LF) ) are present anywhere in $IFS ), or (2) any non-"IFS whitespace character" that's present in $IFS along with whatever "IFS whitespace characters" surround it in the input line.

For the OP, it's possible that the second separation mode I described in the previous paragraph is exactly what he wants for his input string, but we can be pretty confident that the first separation mode I described is not correct at all. For example, what if his input string was 'Los Angeles, United States, North America' ?

IFS=', ' read -ra a <<<'Los Angeles, United States, North America'; declare -p a;
## declare -a a=([0]="Los" [1]="Angeles" [2]="United" [3]="States" [4]="North" [5]="America")

2: Even if you were to use this solution with a single-character separator (such as a comma by itself, that is, with no following space or other baggage), if the value of the $string variable happens to contain any LFs, then read will stop processing once it encounters the first LF. The read builtin only processes one line per invocation. This is true even if you are piping or redirecting input only to the read statement, as we are doing in this example with the here-string mechanism, and thus unprocessed input is guaranteed to be lost. The code that powers the read builtin has no knowledge of the data flow within its containing command structure.

You could argue that this is unlikely to cause a problem, but still, it's a subtle hazard that should be avoided if possible. It is caused by the fact that the read builtin actually does two levels of input splitting: first into lines, then into fields. Since the OP only wants one level of splitting, this usage of the read builtin is not appropriate, and we should avoid it.

3: A non-obvious potential issue with this solution is that read always drops the trailing field if it is empty, although it preserves empty fields otherwise. Here's a demo:

string=', , a, , b, c, , , '; IFS=', ' read -ra a <<<"$string"; declare -p a;
## declare -a a=([0]="" [1]="" [2]="a" [3]="" [4]="b" [5]="c" [6]="" [7]="")

Maybe the OP wouldn't care about this, but it's still a limitation worth knowing about. It reduces the robustness and generality of the solution.

This problem can be solved by appending a dummy trailing delimiter to the input string just prior to feeding it to read , as I will demonstrate later.


Wrong answer #2

string="1:2:3:4:5"
set -f                     # avoid globbing (expansion of *).
array=(${string//:/ })

Similar idea:

t="one,two,three"
a=($(echo $t | tr ',' "\n"))

(Note: I added the missing parentheses around the command substitution which the answerer seems to have omitted.)

Similar idea:

string="1,2,3,4"
array=(`echo $string | sed 's/,/\n/g'`)

These solutions leverage word splitting in an array assignment to split the string into fields. Funnily enough, just like read , general word splitting also uses the $IFS special variable, although in this case it is implied that it is set to its default value of <space><tab><newline> , and therefore any sequence of one or more IFS characters (which are all whitespace characters now) is considered to be a field delimiter.

This solves the problem of two levels of splitting committed by read , since word splitting by itself constitutes only one level of splitting. But just as before, the problem here is that the individual fields in the input string can already contain $IFS characters, and thus they would be improperly split during the word splitting operation. This happens to not be the case for any of the sample input strings provided by these answerers (how convenient...), but of course that doesn't change the fact that any code base that used this idiom would then run the risk of blowing up if this assumption were ever violated at some point down the line. Once again, consider my counterexample of 'Los Angeles, United States, North America' (or 'Los Angeles:United States:North America' ).
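To see that failure concretely, here is a quick demo (mine, not from the cited answers):

string='Los Angeles:United States:North America'
set -f; array=(${string//:/ }); set +f; declare -p array;
## declare -a array=([0]="Los" [1]="Angeles" [2]="United" [3]="States" [4]="North" [5]="America")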

Also, word splitting is normally followed by filename expansion (aka pathname expansion, aka globbing), which, if done, would potentially corrupt words containing the characters * , ? , or [ followed by ] (and, if extglob is set, parenthesized fragments preceded by ? , * , + , @ , or ! ) by matching them against file system objects and expanding the words ("globs") accordingly. The first of these three answerers has cleverly undercut this problem by running set -f beforehand to disable globbing. Technically this works (although you should probably add set +f afterward to reenable globbing for subsequent code which may depend on it), but it's undesirable to have to mess with global shell settings in order to hack a basic string-to-array parsing operation in local code.
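Here is roughly what that glob corruption looks like; the exact output depends on the contents of your working directory, so treat this as a sketch run in a scratch directory:

cd "$(mktemp -d)"; touch file1 file2;
string='*:2:3'; array=(${string//:/ }); declare -p array;
## declare -a array=([0]="file1" [1]="file2" [2]="2" [3]="3") ## the "*" field matched files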

Another issue with this answer is that all empty fields will be lost. This may or may not be a problem, depending on the application.

Note: If you're going to use this solution, it's better to use the ${string//:/ } "pattern substitution" form of parameter expansion , rather than going to the trouble of invoking a command substitution (which forks the shell), starting up a pipeline, and running an external executable ( tr or sed ), since parameter expansion is purely a shell-internal operation. (Also, for the tr and sed solutions, the input variable should be double-quoted inside the command substitution; otherwise word splitting would take effect in the echo command and potentially mess with the field values. Also, the $(...) form of command substitution is preferable to the old `...` form since it simplifies nesting of command substitutions and allows for better syntax highlighting by text editors.)


Wrong answer #3

str="a, b, c, d"  # assuming there is a space after ',' as in Q
arr=(${str//,/})  # delete all occurrences of ','

This answer is almost the same as #2 . The difference is that the answerer has made the assumption that the fields are delimited by two characters, one of which is represented in the default $IFS , and the other not. He has solved this rather specific case by removing the non-IFS-represented character using a pattern substitution expansion and then using word splitting to split the fields on the surviving IFS-represented delimiter character.

This is not a very generic solution. Furthermore, it can be argued that the comma is really the "primary" delimiter character here, and that stripping it and then depending on the space character for field splitting is simply wrong. Once again, consider my counterexample: 'Los Angeles, United States, North America' .
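Concretely (a quick check of the counterexample, not from the original answer):

str='Los Angeles, United States, North America'
arr=(${str//,/}); declare -p arr;
## declare -a arr=([0]="Los" [1]="Angeles" [2]="United" [3]="States" [4]="North" [5]="America")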

Also, again, filename expansion could corrupt the expanded words, but this can be prevented by temporarily disabling globbing for the assignment with set -f and then set +f .

Also, again, all empty fields will be lost, which may or may not be a problem depending on the application.


Wrong answer #4

string='first line
second line
third line'

oldIFS="$IFS"
IFS='
'
IFS=${IFS:0:1} # this is useful to format your code with tabs
lines=( $string )
IFS="$oldIFS"

This is similar to #2 and #3 in that it uses word splitting to get the job done, only now the code explicitly sets $IFS to contain only the single-character field delimiter present in the input string. It should be repeated that this cannot work for multicharacter field delimiters such as the OP's comma-space delimiter. But for a single-character delimiter like the LF used in this example, it actually comes close to being perfect. The fields cannot be unintentionally split in the middle as we saw with previous wrong answers, and there is only one level of splitting, as required.

One problem is that filename expansion will corrupt affected words as described earlier, although once again this can be solved by wrapping the critical statement in set -f and set +f .

Another potential problem is that, since LF qualifies as an "IFS whitespace character" as defined earlier, all empty fields will be lost, just as in #2 and #3 . This would of course not be a problem if the delimiter happens to be a non-"IFS whitespace character", and depending on the application it may not matter anyway, but it does vitiate the generality of the solution.

So, to sum up, assuming you have a one-character delimiter, and it is either a non-"IFS whitespace character" or you don't care about empty fields, and you wrap the critical statement in set -f and set +f , then this solution works, but otherwise not.

(Also, for information's sake, assigning a LF to a variable in bash can be done more easily with the $'...' syntax, e.g. IFS=$'\n'; .)
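Here is a quick demonstration of that empty-field loss, using the $'...' syntax just mentioned (my demo, not from the original answer):

string=$'a\n\nb\nc'
oldIFS="$IFS"; IFS=$'\n'
set -f; lines=( $string ); set +f
IFS="$oldIFS"; declare -p lines;
## declare -a lines=([0]="a" [1]="b" [2]="c") ## the empty middle field vanished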


Wrong answer #5

countries='Paris, France, Europe'
OIFS="$IFS"
IFS=', ' array=($countries)
IFS="$OIFS"

Similar idea:

IFS=', ' eval 'array=($string)'

This solution is effectively a cross between #1 (in that it sets $IFS to comma-space) and #2-4 (in that it uses word splitting to split the string into fields). Because of this, it suffers from most of the problems that afflict all of the above wrong answers, sort of like the worst of all worlds.

Also, regarding the second variant, it may seem like the eval call is completely unnecessary, since its argument is a single-quoted string literal, and therefore is statically known. But there's actually a very non-obvious benefit to using eval in this way. Normally, when you run a simple command which consists of a variable assignment only , meaning without an actual command word following it, the assignment takes effect in the shell environment:

IFS=', '; ## changes $IFS in the shell environment

This is true even if the simple command involves multiple variable assignments; again, as long as there's no command word, all variable assignments affect the shell environment:

IFS=', ' array=($countries); ## changes both $IFS and $array in the shell environment

But, if the variable assignment is attached to a command name (I like to call this a "prefix assignment") then it does not affect the shell environment, and instead only affects the environment of the executed command, regardless whether it is a builtin or external:

IFS=', ' :; ## : is a builtin command, the $IFS assignment does not outlive it
IFS=', ' env; ## env is an external command, the $IFS assignment does not outlive it

Relevant quote from the bash manual :

If no command name results, the variable assignments affect the current shell environment. Otherwise, the variables are added to the environment of the executed command and do not affect the current shell environment.

It is possible to exploit this feature of variable assignment to change $IFS only temporarily, which allows us to avoid the whole save-and-restore gambit like that which is being done with the $OIFS variable in the first variant. But the challenge we face here is that the command we need to run is itself a mere variable assignment, and hence it would not involve a command word to make the $IFS assignment temporary. You might think to yourself, well why not just add a no-op command word to the statement like the : builtin to make the $IFS assignment temporary? This does not work because it would then make the $array assignment temporary as well:

IFS=', ' array=($countries) :; ## fails; new $array value never escapes the : command

So, we're effectively at an impasse, a bit of a catch-22. But, when eval runs its code, it runs it in the shell environment, as if it was normal, static source code, and therefore we can run the $array assignment inside the eval argument to have it take effect in the shell environment, while the $IFS prefix assignment that is prefixed to the eval command will not outlive the eval command. This is exactly the trick that is being used in the second variant of this solution:

IFS=', ' eval 'array=($string)'; ## $IFS does not outlive the eval command, but $array does

So, as you can see, it's actually quite a clever trick, and accomplishes exactly what is required (at least with respect to the effect of the assignments) in a rather non-obvious way. I'm actually not against this trick in general, despite the involvement of eval ; just be careful to single-quote the argument string to guard against security threats.

But again, because of the "worst of all worlds" agglomeration of problems, this is still a wrong answer to the OP's requirement.


Wrong answer #6

IFS=', '; array=(Paris, France, Europe)

IFS=' ';declare -a array=(Paris France Europe)

Um... what? The OP has a string variable that needs to be parsed into an array. This "answer" starts with the verbatim contents of the input string pasted into an array literal. I guess that's one way to do it.

It looks like the answerer may have assumed that the $IFS variable affects all bash parsing in all contexts, which is not true. From the bash manual:

IFS The Internal Field Separator that is used for word splitting after expansion and to split lines into words with the read builtin command. The default value is <space><tab><newline> .

So the $IFS special variable is actually only used in two contexts: (1) word splitting that is performed after expansion (meaning not when parsing bash source code) and (2) for splitting input lines into words by the read builtin.

Let me try to make this clearer. I think it might be good to draw a distinction between parsing and execution . Bash must first parse the source code, which obviously is a parsing event, and then later it executes the code, which is when expansion comes into the picture. Expansion is really an execution event. Furthermore, I take issue with the description of the $IFS variable that I just quoted above; rather than saying that word splitting is performed after expansion , I would say that word splitting is performed during expansion, or, perhaps even more precisely, word splitting is part of the expansion process. The phrase "word splitting" refers only to this step of expansion; it should never be used to refer to the parsing of bash source code, although unfortunately the docs do seem to throw around the words "split" and "words" a lot. Here's a relevant excerpt from the linux.die.net version of the bash manual:

Expansion is performed on the command line after it has been split into words. There are seven kinds of expansion performed: brace expansion , tilde expansion , parameter and variable expansion , command substitution , arithmetic expansion , word splitting , and pathname expansion .

The order of expansions is: brace expansion; tilde expansion, parameter and variable expansion, arithmetic expansion, and command substitution (done in a left-to-right fashion); word splitting; and pathname expansion.

You could argue the GNU version of the manual does slightly better, since it opts for the word "tokens" instead of "words" in the first sentence of the Expansion section:

Expansion is performed on the command line after it has been split into tokens.

The important point is, $IFS does not change the way bash parses source code. Parsing of bash source code is actually a very complex process that involves recognition of the various elements of shell grammar, such as command sequences, command lists, pipelines, parameter expansions, arithmetic substitutions, and command substitutions. For the most part, the bash parsing process cannot be altered by user-level actions like variable assignments (actually, there are some minor exceptions to this rule; for example, see the various compatxx shell settings , which can change certain aspects of parsing behavior on-the-fly). The upstream "words"/"tokens" that result from this complex parsing process are then expanded according to the general process of "expansion" as broken down in the above documentation excerpts, where word splitting of the expanded (expanding?) text into downstream words is simply one step of that process. Word splitting only touches text that has been spit out of a preceding expansion step; it does not affect literal text that was parsed right off the source bytestream.
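A two-line experiment makes the parsing-versus-expansion distinction concrete (my demo, not from the original answer):

IFS=','
arr=(a,b,c); declare -p arr; ## literal source text: never word-split
## declare -a arr=([0]="a,b,c")
s='a,b,c'; arr=($s); declare -p arr; ## expansion result: split on $IFS
## declare -a arr=([0]="a" [1]="b" [2]="c")
unset IFS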


Wrong answer #7

string='first line
        second line
        third line'

while read -r line; do lines+=("$line"); done <<<"$string"

This is one of the best solutions. Notice that we're back to using read . Didn't I say earlier that read is inappropriate because it performs two levels of splitting, when we only need one? The trick here is that you can call read in such a way that it effectively only does one level of splitting, specifically by splitting off only one field per invocation, which necessitates the cost of having to call it repeatedly in a loop. It's a bit of a sleight of hand, but it works.

But there are problems. First: When you provide at least one NAME argument to read , it automatically ignores leading and trailing whitespace in each field that is split off from the input string. This occurs whether $IFS is set to its default value or not, as described earlier in this post. Now, the OP may not care about this for his specific use-case, and in fact, it may be a desirable feature of the parsing behavior. But not everyone who wants to parse a string into fields will want this. There is a solution, however: A somewhat non-obvious usage of read is to pass zero NAME arguments. In this case, read will store the entire input line that it gets from the input stream in a variable named $REPLY , and, as a bonus, it does not strip leading and trailing whitespace from the value. This is a very robust usage of read which I've exploited frequently in my shell programming career. Here's a demonstration of the difference in behavior:

string=$'  a  b  \n  c  d  \n  e  f  '; ## input string

a=(); while read -r line; do a+=("$line"); done <<<"$string"; declare -p a;
## declare -a a=([0]="a  b" [1]="c  d" [2]="e  f") ## read trimmed surrounding whitespace

a=(); while read -r; do a+=("$REPLY"); done <<<"$string"; declare -p a;
## declare -a a=([0]="  a  b  " [1]="  c  d  " [2]="  e  f  ") ## no trimming

The second issue with this solution is that it does not actually address the case of a custom field separator, such as the OP's comma-space. As before, multicharacter separators are not supported, which is an unfortunate limitation of this solution. We could try to at least split on comma by specifying the separator to the -d option, but look what happens:

string='Paris, France, Europe';
a=(); while read -rd,; do a+=("$REPLY"); done <<<"$string"; declare -p a;
## declare -a a=([0]="Paris" [1]=" France")

Predictably, the unaccounted surrounding whitespace got pulled into the field values, and hence this would have to be corrected subsequently through trimming operations (this could also be done directly in the while-loop). But there's another obvious error: Europe is missing! What happened to it? The answer is that read returns a failing return code if it hits end-of-file (in this case we can call it end-of-string) without encountering a final field terminator on the final field. This causes the while-loop to break prematurely and we lose the final field.

Technically this same error afflicted the previous examples as well; the difference there is that the field separator was taken to be LF, which is the default when you don't specify the -d option, and the <<< ("here-string") mechanism automatically appends a LF to the string just before it feeds it as input to the command. Hence, in those cases, we sort of accidentally solved the problem of a dropped final field by unwittingly appending an additional dummy terminator to the input. Let's call this solution the "dummy-terminator" solution. We can apply the dummy-terminator solution manually for any custom delimiter by concatenating it against the input string ourselves when instantiating it in the here-string:

a=(); while read -rd,; do a+=("$REPLY"); done <<<"$string,"; declare -p a;
declare -a a=([0]="Paris" [1]=" France" [2]=" Europe")

There, problem solved. Another solution is to only break the while-loop if both (1) read returned failure and (2) $REPLY is empty, meaning read was not able to read any characters prior to hitting end-of-file. Demo:

a=(); while read -rd,|| [[ -n "$REPLY" ]]; do a+=("$REPLY"); done <<<"$string"; declare -p a;
## declare -a a=([0]="Paris" [1]=" France" [2]=$' Europe\n')

This approach also reveals the secretive LF that automatically gets appended to the here-string by the <<< redirection operator. It could of course be stripped off separately through an explicit trimming operation as described a moment ago, but obviously the manual dummy-terminator approach solves it directly, so we could just go with that. The manual dummy-terminator solution is actually quite convenient in that it solves both of these two problems (the dropped-final-field problem and the appended-LF problem) in one go.

So, overall, this is quite a powerful solution. Its only remaining weakness is a lack of support for multicharacter delimiters, which I will address later.


Wrong answer #8

string='first line
        second line
        third line'

readarray -t lines <<<"$string"

(This is actually from the same post as #7 ; the answerer provided two solutions in the same post.)

The readarray builtin, which is a synonym for mapfile , is ideal. It's a builtin command which parses a bytestream into an array variable in one shot; no messing with loops, conditionals, substitutions, or anything else. And it doesn't surreptitiously strip any whitespace from the input string. And (if -O is not given) it conveniently clears the target array before assigning to it. But it's still not perfect, hence my criticism of it as a "wrong answer".

First, just to get this out of the way, note that, just like the behavior of read when doing field-parsing, readarray drops the trailing field if it is empty. Again, this is probably not a concern for the OP, but it could be for some use-cases. I'll come back to this in a moment.

Second, as before, it does not support multicharacter delimiters. I'll give a fix for this in a moment as well.

Third, the solution as written does not parse the OP's input string, and in fact, it cannot be used as-is to parse it. I'll expand on this momentarily as well.

For the above reasons, I still consider this to be a "wrong answer" to the OP's question. Below I'll give what I consider to be the right answer.


Right answer

Here's a naïve attempt to make #8 work by just specifying the -d option:

string='Paris, France, Europe';
readarray -td, a <<<"$string"; declare -p a;
## declare -a a=([0]="Paris" [1]=" France" [2]=$' Europe\n')

We see the result is identical to the result we got from the double-conditional approach of the looping read solution discussed in #7 . We can almost solve this with the manual dummy-terminator trick:

readarray -td, a <<<"$string,"; declare -p a;
## declare -a a=([0]="Paris" [1]=" France" [2]=" Europe" [3]=$'\n')

The problem here is that readarray preserved the trailing field, since the <<< redirection operator appended the LF to the input string, and therefore the trailing field was not empty (otherwise it would've been dropped). We can take care of this by explicitly unsetting the final array element after-the-fact:

readarray -td, a <<<"$string,"; unset 'a[-1]'; declare -p a;
## declare -a a=([0]="Paris" [1]=" France" [2]=" Europe")

The only two problems that remain, which are actually related, are (1) the extraneous whitespace that needs to be trimmed, and (2) the lack of support for multicharacter delimiters.

The whitespace could of course be trimmed afterward (for example, see How to trim whitespace from a Bash variable? ). But if we can hack a multicharacter delimiter, then that would solve both problems in one shot.

Unfortunately, there's no direct way to get a multicharacter delimiter to work. The best solution I've thought of is to preprocess the input string to replace the multicharacter delimiter with a single-character delimiter that will be guaranteed not to collide with the contents of the input string. The only character that has this guarantee is the NUL byte . This is because, in bash (though not in zsh, incidentally), variables cannot contain the NUL byte. This preprocessing step can be done inline in a process substitution. Here's how to do it using awk :

readarray -td '' a < <(awk '{ gsub(/, /,"\0"); print; }' <<<"$string, "); unset 'a[-1]';
declare -p a;
## declare -a a=([0]="Paris" [1]="France" [2]="Europe")

There, finally! This solution will not erroneously split fields in the middle, will not cut out prematurely, will not drop empty fields, will not corrupt itself on filename expansions, will not automatically strip leading and trailing whitespace, will not leave a stowaway LF on the end, does not require loops, and does not settle for a single-character delimiter.


Trimming solution

Lastly, I wanted to demonstrate my own fairly intricate trimming solution using the obscure -C callback option of readarray . Unfortunately, I've run out of room against Stack Overflow's draconian 30,000 character post limit, so I won't be able to explain it. I'll leave that as an exercise for the reader.

function mfcb { local val="$4"; "$1"; eval "$2[$3]=\$val;"; };   ## args: filter function, array name, index, field value
function val_ltrim { if [[ "$val" =~ ^[[:space:]]+ ]]; then val="${val:${#BASH_REMATCH[0]}}"; fi; };   ## strip leading whitespace from $val
function val_rtrim { if [[ "$val" =~ [[:space:]]+$ ]]; then val="${val:0:${#val}-${#BASH_REMATCH[0]}}"; fi; };   ## strip trailing whitespace from $val
function val_trim { val_ltrim; val_rtrim; };   ## strip both ends
readarray -c1 -C 'mfcb val_trim a' -td, <<<"$string,"; unset 'a[-1]'; declare -p a;
## declare -a a=([0]="Paris" [1]="France" [2]="Europe")

fbicknel ,Aug 18, 2017 at 15:57

It may also be helpful to note (though understandably you had no room to do so) that the -d option to readarray first appears in Bash 4.4. – fbicknel Aug 18 '17 at 15:57

Cyril Duchon-Doris ,Nov 3, 2017 at 9:16

You should add a "TL;DR : scroll 3 pages to see the right solution at the end of my answer" – Cyril Duchon-Doris Nov 3 '17 at 9:16

dawg ,Nov 26, 2017 at 22:28

Great answer (+1). If you change your awk to awk '{ gsub(/,[ ]+|$/,"\0"); print }' and eliminate that concatenation of the final ", " then you don't have to go through the gymnastics on eliminating the final record. So: readarray -td '' a < <(awk '{ gsub(/,[ ]+/,"\0"); print; }' <<<"$string") on Bash that supports readarray . Note your method is Bash 4.4+ I think because of the -d in readarray – dawg Nov 26 '17 at 22:28

datUser ,Feb 22, 2018 at 14:54

Looks like readarray is not an available builtin on OSX. – datUser Feb 22 '18 at 14:54

bgoldst ,Feb 23, 2018 at 3:37

@datUser That's unfortunate. Your version of bash must be too old for readarray . In this case, you can use the second-best solution built on read . I'm referring to this: a=(); while read -rd,; do a+=("$REPLY"); done <<<"$string,"; (with the awk substitution if you need multicharacter delimiter support). Let me know if you run into any problems; I'm pretty sure this solution should work on fairly old versions of bash, back to version 2-something, released like two decades ago. – bgoldst Feb 23 '18 at 3:37

Jmoney38 ,Jul 14, 2015 at 11:54

t="one,two,three"
a=($(echo "$t" | tr ',' '\n'))
echo "${a[2]}"

Prints three

shrimpwagon ,Oct 16, 2015 at 20:04

I actually prefer this approach. Simple. – shrimpwagon Oct 16 '15 at 20:04

Ben ,Oct 31, 2015 at 3:11

I copied and pasted this and it did not work with echo, but did work when I used it in a for loop. – Ben Oct 31 '15 at 3:11

Pinaki Mukherjee ,Nov 9, 2015 at 20:22

This is the simplest approach. Thanks – Pinaki Mukherjee Nov 9 '15 at 20:22

abalter ,Aug 30, 2016 at 5:13

This does not work as stated. @Jmoney38 or shrimpwagon if you can paste this in a terminal and get the desired output, please paste the result here. – abalter Aug 30 '16 at 5:13

leaf ,Jul 17, 2017 at 16:28

@abalter Works for me with a=($(echo $t | tr ',' "\n")) . Same result with a=($(echo $t | tr ',' ' ')) . – leaf Jul 17 '17 at 16:28

Luca Borrione ,Nov 2, 2012 at 13:44

Sometimes it happened to me that the method described in the accepted answer didn't work, especially if the separator is a carriage return.
In those cases I solved it this way:
string='first line
second line
third line'

oldIFS="$IFS"
IFS='
'
IFS=${IFS:0:1} # this is useful to format your code with tabs
lines=( $string )
IFS="$oldIFS"

for line in "${lines[@]}"
    do
        echo "--> $line"
done

Stefan van den Akker ,Feb 9, 2015 at 16:52

+1 This completely worked for me. I needed to put multiple strings, divided by a newline, into an array, and read -a arr <<< "$strings" did not work with IFS=$'\n' . – Stefan van den Akker Feb 9 '15 at 16:52

Stefan van den Akker ,Feb 10, 2015 at 13:49

Here is the answer to make the accepted answer work when the delimiter is a newline . – Stefan van den Akker Feb 10 '15 at 13:49

,Jul 24, 2015 at 21:24

The accepted answer works for values in one line.
If the variable has several lines:
string='first line
        second line
        third line'

We need a very different command to get all lines:

while read -r line; do lines+=("$line"); done <<<"$string"

Or the much simpler bash readarray :

readarray -t lines <<<"$string"

Printing all lines is very easy taking advantage of a printf feature:

printf ">[%s]\n" "${lines[@]}"

>[first line]
>[        second line]
>[        third line]

Mayhem ,Dec 31, 2015 at 3:13

While not every solution works for every situation, your mention of readarray... replaced my last two hours with 5 minutes... you got my vote – Mayhem Dec 31 '15 at 3:13

Derek 朕會功夫 ,Mar 23, 2018 at 19:14

readarray is the right answer. – Derek 朕會功夫 Mar 23 '18 at 19:14

ssanch ,Jun 3, 2016 at 15:24

This is similar to the approach by Jmoney38, but using sed:
string="1,2,3,4"
array=(`echo $string | sed 's/,/\n/g'`)
echo ${array[0]}

Prints 1

dawg ,Nov 26, 2017 at 19:59

The key to splitting your string into an array is the multi-character delimiter of ", " . Any solution using IFS for multi-character delimiters is inherently wrong since IFS is a set of those characters, not a string.

If you assign IFS=", " then the string will break on EITHER "," OR " " or any combination of them which is not an accurate representation of the two character delimiter of ", " .

You can use awk or sed to split the string, with process substitution:

#!/bin/bash

str="Paris, France, Europe"
array=()
while read -r -d $'\0' each; do   # use a NUL terminated field separator 
    array+=("$each")
done < <(printf "%s" "$str" | awk '{ gsub(/,[ ]+|$/,"\0"); print }')
declare -p array
# declare -a array=([0]="Paris" [1]="France" [2]="Europe") output

It is more efficient to use a regex directly in Bash:

#!/bin/bash

str="Paris, France, Europe"

array=()
while [[ $str =~ ([^,]+)(,[ ]+|$) ]]; do
    array+=("${BASH_REMATCH[1]}")   # capture the field
    i=${#BASH_REMATCH}              # length of field + delimiter
    str=${str:i}                    # advance the string by that length
done                                # the loop deletes $str, so make a copy if needed

declare -p array
# declare -a array=([0]="Paris" [1]="France" [2]="Europe") output...

With the second form, there is no subshell and it will be inherently faster.


Edit by bgoldst: Here are some benchmarks comparing my readarray solution to dawg's regex solution, and I also included the read solution for the heck of it (note: I slightly modified the regex solution for greater harmony with my solution) (also see my comments below the post):

## competitors
function c_readarray { readarray -td '' a < <(awk '{ gsub(/, /,"\0"); print; };' <<<"$1, "); unset 'a[-1]'; };
function c_read { a=(); local REPLY=''; while read -r -d ''; do a+=("$REPLY"); done < <(awk '{ gsub(/, /,"\0"); print; };' <<<"$1, "); };
function c_regex { a=(); local s="$1, "; while [[ $s =~ ([^,]+),\  ]]; do a+=("${BASH_REMATCH[1]}"); s=${s:${#BASH_REMATCH}}; done; };

## helper functions
function rep {
    local -i i=-1;
    for ((i = 0; i<$1; ++i)); do
        printf %s "$2";
    done;
}; ## end rep()

function testAll {
    local funcs=();
    local args=();
    local func='';
    local -i rc=-1;
    while [[ "$1" != ':' ]]; do
        func="$1";
        if [[ ! "$func" =~ ^[_a-zA-Z][_a-zA-Z0-9]*$ ]]; then
            echo "bad function name: $func" >&2;
            return 2;
        fi;
        funcs+=("$func");
        shift;
    done;
    shift;
    args=("$@");
    for func in "${funcs[@]}"; do
        echo -n "$func ";
        { time $func "${args[@]}" >/dev/null 2>&1; } 2>&1| tr '\n' '/';
        rc=${PIPESTATUS[0]}; if [[ $rc -ne 0 ]]; then echo "[$rc]"; else echo; fi;
    done| column -ts/;
}; ## end testAll()

function makeStringToSplit {
    local -i n=$1; ## number of fields
    if [[ $n -lt 0 ]]; then echo "bad field count: $n" >&2; return 2; fi;
    if [[ $n -eq 0 ]]; then
        echo;
    elif [[ $n -eq 1 ]]; then
        echo 'first field';
    elif [[ "$n" -eq 2 ]]; then
        echo 'first field, last field';
    else
        echo "first field, $(rep $[$1-2] 'mid field, ')last field";
    fi;
}; ## end makeStringToSplit()

function testAll_splitIntoArray {
    local -i n=$1; ## number of fields in input string
    local s='';
    echo "===== $n field$(if [[ $n -ne 1 ]]; then echo 's'; fi;) =====";
    s="$(makeStringToSplit "$n")";
    testAll c_readarray c_read c_regex : "$s";
}; ## end testAll_splitIntoArray()

## results
testAll_splitIntoArray 1;
## ===== 1 field =====
## c_readarray   real  0m0.067s   user 0m0.000s   sys  0m0.000s
## c_read        real  0m0.064s   user 0m0.000s   sys  0m0.000s
## c_regex       real  0m0.000s   user 0m0.000s   sys  0m0.000s
##
testAll_splitIntoArray 10;
## ===== 10 fields =====
## c_readarray   real  0m0.067s   user 0m0.000s   sys  0m0.000s
## c_read        real  0m0.064s   user 0m0.000s   sys  0m0.000s
## c_regex       real  0m0.001s   user 0m0.000s   sys  0m0.000s
##
testAll_splitIntoArray 100;
## ===== 100 fields =====
## c_readarray   real  0m0.069s   user 0m0.000s   sys  0m0.062s
## c_read        real  0m0.065s   user 0m0.000s   sys  0m0.046s
## c_regex       real  0m0.005s   user 0m0.000s   sys  0m0.000s
##
testAll_splitIntoArray 1000;
## ===== 1000 fields =====
## c_readarray   real  0m0.084s   user 0m0.031s   sys  0m0.077s
## c_read        real  0m0.092s   user 0m0.031s   sys  0m0.046s
## c_regex       real  0m0.125s   user 0m0.125s   sys  0m0.000s
##
testAll_splitIntoArray 10000;
## ===== 10000 fields =====
## c_readarray   real  0m0.209s   user 0m0.093s   sys  0m0.108s
## c_read        real  0m0.333s   user 0m0.234s   sys  0m0.109s
## c_regex       real  0m9.095s   user 0m9.078s   sys  0m0.000s
##
testAll_splitIntoArray 100000;
## ===== 100000 fields =====
## c_readarray   real  0m1.460s   user 0m0.326s   sys  0m1.124s
## c_read        real  0m2.780s   user 0m1.686s   sys  0m1.092s
## c_regex       real  17m38.208s   user 15m16.359s   sys  2m19.375s
##

bgoldst ,Nov 27, 2017 at 4:28

Very cool solution! I never thought of using a loop on a regex match, nifty use of $BASH_REMATCH . It works, and does indeed avoid spawning subshells. +1 from me. However, by way of criticism, the regex itself is a little non-ideal, in that it appears you were forced to duplicate part of the delimiter token (specifically the comma) so as to work around the lack of support for non-greedy multipliers (also lookarounds) in ERE ("extended" regex flavor built into bash). This makes it a little less generic and robust. – bgoldst Nov 27 '17 at 4:28

bgoldst ,Nov 27, 2017 at 4:28

Secondly, I did some benchmarking, and although the performance is better than the other solutions for smallish strings, it worsens exponentially due to the repeated string-rebuilding, becoming catastrophic for very large strings. See my edit to your answer. – bgoldst Nov 27 '17 at 4:28

dawg ,Nov 27, 2017 at 4:46

@bgoldst: What a cool benchmark! In defense of the regex, for 10's or 100's of thousands of fields (what the regex is splitting) there would probably be some form of record (like \n delimited text lines) comprising those fields so the catastrophic slow-down would likely not occur. If you have a string with 100,000 fields -- maybe Bash is not ideal ;-) Thanks for the benchmark. I learned a thing or two. – dawg Nov 27 '17 at 4:46

Geoff Lee ,Mar 4, 2016 at 6:02

Try this
IFS=', '; array=(Paris, France, Europe)
for item in ${array[@]}; do echo $item; done

It's simple. If you want, you can also add a declare (and also remove the commas):

IFS=' ';declare -a array=(Paris France Europe)

The IFS line is there to undo the change made above; in a fresh bash instance it works without it.

MrPotatoHead ,Nov 13, 2018 at 13:19

Pure bash multi-character delimiter solution.

As others have pointed out in this thread, the OP's question gave an example of a comma-delimited string to be parsed into an array, but did not indicate if he/she was only interested in comma delimiters, single-character delimiters, or multi-character delimiters.

Since Google tends to rank this answer at or near the top of search results, I wanted to provide readers with a strong answer to the question of multiple character delimiters, since that is also mentioned in at least one response.

If you're in search of a solution to a multi-character delimiter problem, I suggest reviewing Mallikarjun M 's post, in particular the response from gniourf_gniourf who provides this elegant pure BASH solution using parameter expansion:

#!/bin/bash
str="LearnABCtoABCSplitABCaABCString"
delimiter=ABC
s=$str$delimiter
array=();
while [[ $s ]]; do
    array+=( "${s%%"$delimiter"*}" );
    s=${s#*"$delimiter"};
done;
declare -p array
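For reference, by my reading of the loop the declare -p above should print (my annotation, not part of the cited post):

## declare -a array=([0]="Learn" [1]="to" [2]="Split" [3]="a" [4]="String")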

Link to cited comment/referenced post

Link to cited question: Howto split a string on a multi-character delimiter in bash?

Eduardo Cuomo ,Dec 19, 2016 at 15:27

Use this:
countries='Paris, France, Europe'
OIFS="$IFS"
IFS=', ' array=($countries)
IFS="$OIFS"

#${array[0]} == Paris
#${array[1]} == France
#${array[2]} == Europe

gniourf_gniourf ,Dec 19, 2016 at 17:22

Bad: subject to word splitting and pathname expansion. Please don't revive old questions with good answers to give bad answers. – gniourf_gniourf Dec 19 '16 at 17:22

Scott Weldon ,Dec 19, 2016 at 18:12

This may be a bad answer, but it is still a valid answer. Flaggers / reviewers: For incorrect answers such as this one, downvote, don't delete! – Scott Weldon Dec 19 '16 at 18:12

George Sovetov ,Dec 26, 2016 at 17:31

@gniourf_gniourf Could you please explain why it is a bad answer? I really don't understand when it fails. – George Sovetov Dec 26 '16 at 17:31

gniourf_gniourf ,Dec 26, 2016 at 18:07

@GeorgeSovetov: As I said, it's subject to word splitting and pathname expansion. More generally, splitting a string into an array as array=( $string ) is a (sadly very common) antipattern: word splitting occurs: string='Prague, Czech Republic, Europe' ; Pathname expansion occurs: string='foo[abcd],bar[efgh]' will fail if you have a file named, e.g., food or barf in your directory. The only valid usage of such a construct is when string is a glob. – gniourf_gniourf Dec 26 '16 at 18:07
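The word-splitting failure mode is easy to reproduce (my demo, not from the comment):

OIFS="$IFS"; string='Prague, Czech Republic, Europe'
IFS=', ' array=($string); IFS="$OIFS"; declare -p array;
## declare -a array=([0]="Prague" [1]="Czech" [2]="Republic" [3]="Europe") ## "Czech Republic" was split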

user1009908 ,Jun 9, 2015 at 23:28

UPDATE: Don't do this, due to problems with eval.

With slightly less ceremony:

IFS=', ' eval 'array=($string)'

e.g.

string="foo, bar,baz"
IFS=', ' eval 'array=($string)'
echo ${array[1]} # -> bar

caesarsol ,Oct 29, 2015 at 14:42

eval is evil! don't do this. – caesarsol Oct 29 '15 at 14:42

user1009908 ,Oct 30, 2015 at 4:05

Pfft. No. If you're writing scripts large enough for this to matter, you're doing it wrong. In application code, eval is evil. In shell scripting, it's common, necessary, and inconsequential. – user1009908 Oct 30 '15 at 4:05

caesarsol ,Nov 2, 2015 at 18:19

put a $ in your variable and you'll see... I write many scripts and I never ever had to use a single eval – caesarsol Nov 2 '15 at 18:19

Dennis Williamson ,Dec 2, 2015 at 17:00

Eval command and security issues – Dennis Williamson Dec 2 '15 at 17:00

user1009908 ,Dec 22, 2015 at 23:04

You're right, this is only usable when the input is known to be clean. Not a robust solution. – user1009908 Dec 22 '15 at 23:04

Eduardo Lucio ,Jan 31, 2018 at 20:45

Here's my hack!

Splitting strings by strings is a pretty boring thing to do using bash. What happens is that we have limited approaches that only work in a few cases (split by ";", "/", "." and so on) or we have a variety of side effects in the outputs.

The approach below has required a number of maneuvers, but I believe it will work for most of our needs!

#!/bin/bash

# --------------------------------------
# SPLIT FUNCTION
# ----------------

F_SPLIT_R=()
f_split() {
    : 'It does a "split" into a given string and returns an array.

    Args:
        TARGET_P (str): Target string to "split".
        DELIMITER_P (Optional[str]): Delimiter used to "split". If not 
    informed the split will be done by spaces.

    Returns:
        F_SPLIT_R (array): Array with the provided string separated by the 
    informed delimiter.
    '

    F_SPLIT_R=()
    TARGET_P=$1
    DELIMITER_P=$2
    if [ -z "$DELIMITER_P" ] ; then
        DELIMITER_P=" "
    fi

    REMOVE_N=1
    if [ "$DELIMITER_P" == "\n" ] ; then
        REMOVE_N=0
    fi

    # NOTE: This was the only parameter that has been a problem so far! 
    # By Questor
    # [Ref.: https://unix.stackexchange.com/a/390732/61742]
    if [ "$DELIMITER_P" == "./" ] ; then
        DELIMITER_P="[.]/"
    fi

    if [ ${REMOVE_N} -eq 1 ] ; then

        # NOTE: Due to bash limitations we have some problems getting the 
        # output of a split by awk inside an array and so we need to use 
        # "line break" (\n) to succeed. Seen this, we remove the line breaks 
        # momentarily afterwards we reintegrate them. The problem is that if 
        # there is a line break in the "string" informed, this line break will 
        # be lost, that is, it is erroneously removed in the output! 
        # By Questor
        TARGET_P=$(awk 'BEGIN {RS="dn"} {gsub("\n", "3F2C417D448C46918289218B7337FCAF"); printf $0}' <<< "${TARGET_P}")

    fi

    # NOTE: The replace of "\n" by "3F2C417D448C46918289218B7337FCAF" results 
    # in more occurrences of "3F2C417D448C46918289218B7337FCAF" than the 
    # amount of "\n" that there was originally in the string (one more 
    # occurrence at the end of the string)! We can not explain the reason for 
    # this side effect. The line below corrects this problem! By Questor
    TARGET_P=${TARGET_P%????????????????????????????????}

    SPLIT_NOW=$(awk -F"$DELIMITER_P" '{for(i=1; i<=NF; i++){printf "%s\n", $i}}' <<< "${TARGET_P}")

    while IFS= read -r LINE_NOW ; do
        if [ ${REMOVE_N} -eq 1 ] ; then

            # NOTE: We use "'" to prevent blank lines with no other characters 
            # in the sequence being erroneously removed! We do not know the 
            # reason for this side effect! By Questor
            LN_NOW_WITH_N=$(awk 'BEGIN {RS="dn"} {gsub("3F2C417D448C46918289218B7337FCAF", "\n"); printf $0}' <<< "'${LINE_NOW}'")

            # NOTE: We use the commands below to revert the intervention made 
            # immediately above! By Questor
            LN_NOW_WITH_N=${LN_NOW_WITH_N%?}
            LN_NOW_WITH_N=${LN_NOW_WITH_N#?}

            F_SPLIT_R+=("$LN_NOW_WITH_N")
        else
            F_SPLIT_R+=("$LINE_NOW")
        fi
    done <<< "$SPLIT_NOW"
}

# --------------------------------------
# HOW TO USE
# ----------------

STRING_TO_SPLIT="
 * How do I list all databases and tables using psql?

\"
sudo -u postgres /usr/pgsql-9.4/bin/psql -c \"\l\"
sudo -u postgres /usr/pgsql-9.4/bin/psql <DB_NAME> -c \"\dt\"
\"

\"
\list or \l: list all databases
\dt: list all tables in the current database
\"

[Ref.: https://dba.stackexchange.com/questions/1285/how-do-i-list-all-databases-and-tables-using-psql]


"

f_split "$STRING_TO_SPLIT" "bin/psql -c"

# --------------------------------------
# OUTPUT AND TEST
# ----------------

ARR_LENGTH=${#F_SPLIT_R[*]}
for (( i=0; i<=$(( $ARR_LENGTH -1 )); i++ )) ; do
    echo " > -----------------------------------------"
    echo "${F_SPLIT_R[$i]}"
    echo " < -----------------------------------------"
done

if [ "$STRING_TO_SPLIT" == "${F_SPLIT_R[0]}bin/psql -c${F_SPLIT_R[1]}" ] ; then
    echo " > -----------------------------------------"
    echo "The strings are the same!"
    echo " < -----------------------------------------"
fi

sel-en-ium ,May 31, 2018 at 5:56

Another way to do it without modifying IFS:
read -r -a myarray <<< "${string//, /$IFS}"

Rather than changing IFS to match our desired delimiter, we can replace all occurrences of our desired delimiter ", " with contents of $IFS via "${string//, /$IFS}" .

Maybe this will be slow for very large strings though?

This is based on Dennis Williamson's answer.

rsjethani ,Sep 13, 2016 at 16:21

Another approach can be:
str="a, b, c, d"  # assuming there is a space after ',' as in Q
arr=(${str//,/})  # delete all occurrences of ','

After this 'arr' is an array with four strings. This doesn't require dealing with IFS or read or any other special machinery, and hence is much simpler and more direct.

gniourf_gniourf ,Dec 26, 2016 at 18:12

Same (sadly common) antipattern as other answers: subject to word splitting and filename expansion. – gniourf_gniourf Dec 26 '16 at 18:12

Safter Arslan ,Aug 9, 2017 at 3:21

Another way would be:
string="Paris, France, Europe"
IFS=', ' arr=(${string})

Now your elements are stored in "arr" array. To iterate through the elements:

for i in ${arr[@]}; do echo $i; done

bgoldst ,Aug 13, 2017 at 22:38

I cover this idea in my answer ; see Wrong answer #5 (you might be especially interested in my discussion of the eval trick). Your solution leaves $IFS set to the comma-space value after-the-fact. – bgoldst Aug 13 '17 at 22:38

[Jan 28, 2019] regex - Safe rm -rf function in shell script

Jan 28, 2019 | stackoverflow.com

community wiki
5 revs
,May 23, 2017 at 12:26

This question is similar to What is the safest way to empty a directory in *nix?

I'm writing a bash script which defines several path constants and will use them for file and directory manipulation (copying, renaming and deleting). Often it will be necessary to do something like:

rm -rf "/${PATH1}"
rm -rf "${PATH2}/"*

While developing this script I'd want to protect myself from mistyping names like PATH1 and PATH2 and avoid situations where they are expanded to empty strings, thus resulting in wiping the whole disk. I decided to create a special wrapper:

rmrf() {
    if [[ $1 =~ "regex" ]]; then
        echo "Ignoring possibly unsafe path ${1}"
        exit 1
    fi

    shopt -s dotglob
    rm -rf -- $1
    shopt -u dotglob
}

Which will be called as:

rmrf "/${PATH1}"
rmrf "${PATH2}/"*

Regex (or sed expression) should catch paths like "*", "/*", "/**/", "///*" etc. but allow paths like "dir", "/dir", "/dir1/dir2/", "/dir1/dir2/*". Also I don't know how to enable shell globbing in a case like "/dir with space/*". Any ideas?

EDIT: this is what I came up with so far:

rmrf() {
    local RES
    local RMPATH="${1}"
    SAFE=$(echo "${RMPATH}" | sed -r 's:^((\.?\*+/+)+.*|(/+\.?\*+)+.*|[\.\*/]+|.*/\.\*+)$::g')
    if [ -z "${SAFE}" ]; then
        echo "ERROR! Unsafe deletion of ${RMPATH}"
        return 1
    fi

    shopt -s dotglob
    if [ '*' == "${RMPATH: -1}" ]; then
        echo rm -rf -- "${RMPATH/%\*/}"*
        RES=$?
    else
        echo rm -rf -- "${RMPATH}"
        RES=$?
    fi
    shopt -u dotglob

    return $RES
}

Intended use is (note an asterisk inside quotes):

rmrf "${SOMEPATH}"
rmrf "${SOMEPATH}/*"

where $SOMEPATH is not system or /home directory (in my case all such operations are performed on filesystem mounted under /scratch directory).

CAVEATS:

SpliFF ,Jun 14, 2009 at 13:45

I've found that a big danger with rm in bash is that bash usually doesn't stop for errors. That means that:
cd $SOMEPATH
rm -rf *

Is a very dangerous combination if the change directory fails. A safer way would be:

cd $SOMEPATH && rm -rf *

Which will ensure the rm won't run unless you are really in $SOMEPATH. This doesn't protect you from a bad $SOMEPATH but it can be combined with the advice given by others to help make your script safer.

EDIT: @placeybordeaux makes a good point that if $SOMEPATH is undefined or empty, cd doesn't treat it as an error and returns 0. In light of that this answer should be considered unsafe unless $SOMEPATH is validated as existing and non-empty first. I believe cd with no args should be an illegal command, since at best it performs a no-op and at worst it can lead to unexpected behaviour, but it is what it is.
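A minimal illustration of the failure mode (hypothetical path, with rm neutralized by echo so nothing is deleted):

SOMEPATH=/nonexistent
cd $SOMEPATH; echo rm -rf * ## cd fails, but the "rm" still runs in whatever directory you were in
cd $SOMEPATH && echo rm -rf * ## cd fails, and the && guard keeps "rm" from ever running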

Sazzad Hissain Khan ,Jul 6, 2017 at 11:45

nice trick, I am one stupid victim. – Sazzad Hissain Khan Jul 6 '17 at 11:45

placeybordeaux ,Jun 21, 2018 at 22:59

If $SOMEPATH is empty won't this rm -rf the user's home directory? – placeybordeaux Jun 21 '18 at 22:59

SpliFF ,Jun 27, 2018 at 4:10

@placeybordeaux The && only runs the second command if the first succeeds - so if cd fails rm never runs – SpliFF Jun 27 '18 at 4:10

placeybordeaux ,Jul 3, 2018 at 18:46

@SpliFF at least in ZSH the return value of cd $NONEXISTANTVAR is 0 – placeybordeaux Jul 3 '18 at 18:46

ruakh ,Jul 13, 2018 at 6:46

Instead of cd $SOMEPATH , you should write cd "${SOMEPATH?}" . The ${varname?} notation ensures that the expansion fails with a warning-message if the variable is unset or empty (such that the && ... part is never run); the double-quotes ensure that special characters in $SOMEPATH , such as whitespace, don't have undesired effects. – ruakh Jul 13 '18 at 6:46

community wiki
2 revs
,Jul 24, 2009 at 22:36

There is a set -u bash directive that will cause the shell to exit when an uninitialized variable is used. I read about it here , with rm -rf as an example. I think that's what you're looking for. And here is set's manual .
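A minimal sketch of the idea (variable name illustrative; rm neutralized with echo):

set -u
echo rm -rf "/${PATH1}" ## if PATH1 is unset, bash aborts here with an "unbound variable" error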

,Jun 14, 2009 at 12:38

I think "rm" command has a parameter to avoid the deleting of "/". Check it out.

Max ,Jun 14, 2009 at 12:56

Thanks! I didn't know about such option. Actually it is named --preserve-root and is not mentioned in the manpage. – Max Jun 14 '09 at 12:56

Max ,Jun 14, 2009 at 13:18

On my system this option is on by default, but it can't help in cases like rm -ri /* – Max Jun 14 '09 at 13:18

ynimous ,Jun 14, 2009 at 12:42

I would recommend using realpath(1) on the argument rather than using it directly, so that you can avoid things like /A/B/../ or symbolic links.

Max ,Jun 14, 2009 at 13:30

Useful but non-standard command. I've found a possible bash replacement: archlinux.org/pipermail/pacman-dev/2009-February/008130.html – Max Jun 14 '09 at 13:30

Jonathan Leffler ,Jun 14, 2009 at 12:47

Generally, when I'm developing a command with operations such as ' rm -fr ' in it, I will neutralize the remove during development. One way of doing that is:
RMRF="echo rm -rf"
...
$RMRF "/${PATH1}"

This shows me what should be deleted - but does not delete it. I will do a manual clean up while things are under development - it is a small price to pay for not running the risk of screwing up everything.

The notation ' "/${PATH1}" ' is a little unusual; normally, you would ensure that PATH1 simply contains an absolute pathname.

Using the metacharacter with ' "${PATH2}/"* ' is unwise and unnecessary. The only difference between using that and using just ' "${PATH2}" ' is that if the directory specified by PATH2 contains any files or directories with names starting with dot, then those files or directories will not be removed. Such a design is unlikely and is rather fragile. It would be much simpler just to pass PATH2 and let the recursive remove do its job. Adding the trailing slash is not necessarily a bad idea; the system would have to ensure that $PATH2 contains a directory name, not just a file name, but the extra protection is rather minimal.

Using globbing with ' rm -fr ' is usually a bad idea. You want to be precise and restrictive and limiting in what it does - to prevent accidents. Of course, you'd never run the command (shell script you are developing) as root while it is under development - that would be suicidal. Or, if root privileges are absolutely necessary, you neutralize the remove operation until you are confident it is bullet-proof.

Max ,Jun 14, 2009 at 13:09

To delete subdirectories and files starting with dot I use "shopt -s dotglob". Using rm -rf "${PATH2}" is not appropriate because in my case PATH2 can only be removed by the superuser, and this results in an error status for the "rm" command (and I verify it to track other errors). – Max Jun 14 '09 at 13:09

Jonathan Leffler ,Jun 14, 2009 at 13:37

Then, with due respect, you should use a private sub-directory under $PATH2 that you can remove. Avoid glob expansion with commands like 'rm -rf' like you would avoid the plague (or should that be A/H1N1?). – Jonathan Leffler Jun 14 '09 at 13:37

Max ,Jun 14, 2009 at 14:10

Meanwhile I've found this perl project: http://code.google.com/p/safe-rm/

community wiki
too much php
,Jun 15, 2009 at 1:55

If it is possible, you should try and put everything into a folder with a hard-coded name which is unlikely to be found anywhere else on the filesystem, such as ' foofolder '. Then you can write your rmrf() function as:
rmrf() {
    rm -rf "foofolder/$PATH1"
    # or
    rm -rf "$PATH1/foofolder"
}

There is no way that function can delete anything but the files you want it to.

vadipp ,Jan 13, 2017 at 11:37

Actually there is a way: if PATH1 is something like ../../someotherdir – vadipp Jan 13 '17 at 11:37

community wiki
btop
,Jun 15, 2009 at 6:34

You may use
set -f    # cf. help set

to disable filename generation (*).

community wiki
Howard Hong
,Oct 28, 2009 at 19:56

You don't need to use regular expressions.
Just assign the directories you want to protect to a variable and then iterate over the variable. e.g.:
protected_dirs="/ /bin /usr/bin /home $HOME"
for d in $protected_dirs; do
    if [ "$1" = "$d" ]; then
        rm=0
        break;
    fi
done
if [ ${rm:-1} -eq 1 ]; then
    rm -rf $1
fi

,

Add the following codes to your ~/.bashrc
# safe delete
move_to_trash () { now="$(date +%Y%m%d_%H%M%S)"; mv "$@" ~/.local/share/Trash/files/"$@_$now"; }
alias del='move_to_trash'

# safe rm
alias rmi='rm -i'

Every time you need to rm something, first consider del , you can change the trash folder. If you do need to rm something, you could go to the trash folder and use rmi .

One small bug with del : when deleting a folder, for example my_folder , you should write del my_folder and not del my_folder/ , because I attach the time information to the end of the name ( "$@_$now" ) to allow a possible later restore. For files, it works fine.

[Jan 17, 2019] How do I launch the default web browser in Perl on any operating system

Jan 17, 2019 | stackoverflow.com

The second hit on "open url" at search.cpan brings up Browser::Open:

use Browser::Open qw( open_browser );

my $url = 'http://www.google.com/';
open_browser($url);

If your OS isn't supported, send a patch or a bug report.

--cjm


[Jan 10, 2019] linux - How does cat EOF work in bash - Stack Overflow

Notable quotes:
"... The $sql variable now holds the new-line characters too. You can verify with echo -e "$sql" . ..."
"... The print.sh file now contains: ..."
"... The b.txt file contains bar and baz lines. The same output is printed to stdout . ..."
Jan 10, 2019 | stackoverflow.com

How does "cat << EOF" work in bash? Ask Question 454


hasen ,Mar 23, 2010 at 13:57

I needed to write a script to enter multi-line input to a program ( psql ).

After a bit of googling, I found the following syntax works:

cat << EOF | psql ---params
BEGIN;

`pg_dump ----something`

update table .... statement ...;

END;
EOF

This correctly constructs the multi-line string (from BEGIN; to END; , inclusive) and pipes it as an input to psql .

But I have no idea how/why it works, can some one please explain?

I'm referring mainly to cat << EOF , I know > outputs to a file, >> appends to a file, < reads input from file.

What does << exactly do?

And is there a man page for it?

Dennis Williamson ,Mar 23, 2010 at 18:28

That's probably a useless use of cat . Try psql ... << EOF ... See also "here strings". mywiki.wooledge.org/BashGuide/InputAndOutput?#Here_Strings – Dennis Williamson Mar 23 '10 at 18:28

hasen ,Mar 23, 2010 at 18:54

@Dennis: good point, and thanks for the link! – hasen Mar 23 '10 at 18:54

Alex ,Mar 23, 2015 at 23:31

I'm surprised it works with cat but not with echo. cat should expect a file name as stdin, not a char string. psql << EOF sounds logical, but not otherwise. Works with cat but not with echo. Strange behaviour. Any clue about that? – Alex Mar 23 '15 at 23:31

Alex ,Mar 23, 2015 at 23:39

Answering to myself: cat without parameters executes and replicates to the output whatever is sent via input (stdin), hence using its output to fill the file via > . In fact a file name read as a parameter is not a stdin stream. – Alex Mar 23 '15 at 23:39

The-null-Pointer- ,Jan 1, 2018 at 18:03

@Alex echo just prints its command line arguments while cat reads stdin (when piped to it) or reads a file that corresponds to its command line args – The-null-Pointer- Jan 1 '18 at 18:03

kennytm ,Mar 23, 2010 at 13:58

This is called the heredoc format: it provides a way to feed a string into stdin. See https://en.wikipedia.org/wiki/Here_document#Unix_shells for more details.

From man bash :

Here Documents

This type of redirection instructs the shell to read input from the current source until a line containing only word (with no trailing blanks) is seen.

All of the lines read up to that point are then used as the standard input for a command.

The format of here-documents is:

          <<[-]word
                  here-document
          delimiter

No parameter expansion, command substitution, arithmetic expansion, or pathname expansion is performed on word . If any characters in word are quoted, the delimiter is the result of quote removal on word , and the lines in the here-document are not expanded. If word is unquoted, all lines of the here-document are subjected to parameter expansion, command substitution, and arithmetic expansion. In the latter case, the character sequence \<newline> is ignored, and \ must be used to quote the characters \ , $ , and ` .

If the redirection operator is <<- , then all leading tab characters are stripped from input lines and the line containing delimiter . This allows here-documents within shell scripts to be indented in a natural fashion.
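For example, quoting the delimiter is what disables expansion inside the body (a small illustration; the printed path is just an example):

cat <<EOF
$HOME
EOF
## prints something like: /home/user

cat <<'EOF'
$HOME
EOF
## prints the literal text: $HOME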

Xeoncross ,May 26, 2011 at 22:51

I was having the hardest time disabling variable/parameter expansion. All I needed to do was use "double-quotes" and that fixed it! Thanks for the info! – Xeoncross May 26 '11 at 22:51

trkoch ,Nov 10, 2015 at 17:23

Concerning <<- please note that only leading tab characters are stripped -- not soft tab characters. This is one of those rare cases when you actually need the tab character. If the rest of your document uses soft tabs, make sure to show invisible characters and (e.g.) copy and paste a tab character. If you do it right, your syntax highlighting should correctly catch the ending delimiter. – trkoch Nov 10 '15 at 17:23

BrDaHa ,Jul 13, 2017 at 19:01

I don't see how this answer is more helpful than the ones below. It merely regurgitates information that can be found in other places (that have likely already been checked) – BrDaHa Jul 13 '17 at 19:01

Vojtech Vitek ,Feb 4, 2014 at 10:28

The cat <<EOF syntax is very useful when working with multi-line text in Bash, e.g. when assigning a multi-line string to a shell variable, file, or pipe. Examples of cat <<EOF syntax usage in Bash:

1. Assign a multi-line string to a shell variable
$ sql=$(cat <<EOF
SELECT foo, bar FROM db
WHERE foo='baz'
EOF
)

The $sql variable now holds the new-line characters too. You can verify with echo -e "$sql" .

2. Pass a multi-line string to a file in Bash
$ cat <<EOF > print.sh
#!/bin/bash
echo \$PWD
echo $PWD
EOF

The print.sh file now contains:

#!/bin/bash
echo $PWD
echo /home/user

3. Pass a multi-line string to a pipe in Bash
$ cat <<EOF | grep 'b' | tee b.txt
foo
bar
baz
EOF

The b.txt file contains bar and baz lines. The same output is printed to stdout .

edelans ,Aug 22, 2014 at 8:48

In your case, "EOF" is known as a "Here Tag". Basically <<Here tells the shell that you are going to enter a multiline string until the "tag" Here . You can name this tag as you want, it's often EOF or STOP .

Some rules about the Here tags:

  1. The tag can be any string, uppercase or lowercase, though most people use uppercase by convention.
  2. The tag will not be considered as a Here tag if there are other words in that line. In this case, it will merely be considered part of the string. The tag should be by itself on a separate line, to be considered a tag.
  3. The tag should have no leading or trailing spaces in that line to be considered a tag. Otherwise it will be considered as part of the string.

example:

$ cat >> test <<HERE
> Hello world HERE <-- Not by itself on a separate line -> not considered end of string
> This is a test
>  HERE <-- Leading space, so not considered end of string
> and a new line
> HERE <-- Now we have the end of the string

oemb1905 ,Feb 22, 2017 at 7:17

this is the best actual answer ... you define both and clearly state the primary purpose of the use instead of related theory ... which is important but not necessary ... thanks - super helpful – oemb1905 Feb 22 '17 at 7:17

The-null-Pointer- ,Jan 1, 2018 at 18:05

@edelans you must add that when <<- is used, a leading tab will not prevent the tag from being recognized – The-null-Pointer- Jan 1 '18 at 18:05

JawSaw ,Oct 28, 2018 at 13:44

your answer clicked for me at "you are going to enter a multiline string" – JawSaw Oct 28 '18 at 13:44

Ciro Santilli 新疆改造中心 六四事件 法轮功 ,Jun 9, 2015 at 9:41

POSIX 7

kennytm quoted man bash , but most of that is also POSIX 7: http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_07_04 :

The redirection operators "<<" and "<<-" both allow redirection of lines contained in a shell input file, known as a "here-document", to the input of a command.

The here-document shall be treated as a single word that begins after the next <newline> and continues until there is a line containing only the delimiter and a <newline>, with no <blank> characters in between. Then the next here-document starts, if there is one. The format is as follows:

[n]<<word
    here-document
delimiter

where the optional n represents the file descriptor number. If the number is omitted, the here-document refers to standard input (file descriptor 0).

If any character in word is quoted, the delimiter shall be formed by performing quote removal on word, and the here-document lines shall not be expanded. Otherwise, the delimiter shall be the word itself.

If no characters in word are quoted, all lines of the here-document shall be expanded for parameter expansion, command substitution, and arithmetic expansion. In this case, the <backslash> in the input behaves as the <backslash> inside double-quotes (see Double-Quotes). However, the double-quote character ( '"' ) shall not be treated specially within a here-document, except when the double-quote appears within "$()", "``", or "${}".

If the redirection symbol is "<<-", all leading <tab> characters shall be stripped from input lines and the line containing the trailing delimiter. If more than one "<<" or "<<-" operator is specified on a line, the here-document associated with the first operator shall be supplied first by the application and shall be read first by the shell.

When a here-document is read from a terminal device and the shell is interactive, it shall write the contents of the variable PS2, processed as described in Shell Variables, to standard error before reading each line of input until the delimiter has been recognized.

Examples

Some examples not yet given.

Quotes prevent parameter expansion

Without quotes:

a=0
cat <<EOF
$a
EOF

Output:

0

With quotes:

a=0
cat <<'EOF'
$a
EOF

or (ugly but valid):

a=0
cat <<E"O"F
$a
EOF

Outputs:

$a

Hyphen removes leading tabs

Without hyphen:

cat <<EOF
<tab>a
EOF

where <tab> is a literal tab, and can be inserted with Ctrl + V <tab>

Output:

<tab>a

With hyphen:

cat <<-EOF
<tab>a
<tab>EOF

Output:

a

This exists of course so that you can indent your cat like the surrounding code, which is easier to read and maintain. E.g.:

if true; then
    cat <<-EOF
    a
    EOF
fi

Unfortunately, this does not work for space characters: POSIX favored tab indentation here. Yikes.

David C. Rankin ,Aug 12, 2015 at 7:10

In your last example discussing <<- and <tab>a , it should be noted that the purpose was to allow normal indentation of code within the script while allowing heredoc text presented to the receiving process to begin in column 0. It is a not-too-commonly-seen feature and a bit more context may prevent a good deal of head-scratching... – David C. Rankin Aug 12 '15 at 7:10

Ciro Santilli 新疆改造中心 六四事件 法轮功 ,Aug 12, 2015 at 8:22

@DavidC.Rankin updated to clarify that, thanks. – Ciro Santilli 新疆改造中心 六四事件 法轮功 Aug 12 '15 at 8:22

Jeanmichel Cote ,Sep 23, 2015 at 19:58

How should I escape expansion if some of the content between my EOF tags needs to be expanded and some doesn't? – Jeanmichel Cote Sep 23 '15 at 19:58

Jeanmichel Cote ,Sep 23, 2015 at 20:00

...just use the backslash in front of the $ – Jeanmichel Cote Sep 23 '15 at 20:00

Ciro Santilli 新疆改造中心 六四事件 法轮功 ,Sep 23, 2015 at 20:01

@JeanmichelCote I don't see a better option :-) With regular strings you can also consider mixing up quotes like "$a"'$b'"$c" , but there is no analogue here AFAIK. – Ciro Santilli 新疆改造中心 六四事件 法轮功 Sep 23 '15 at 20:01

Andreas Maier ,Feb 13, 2017 at 12:14

Using tee instead of cat

Not exactly as an answer to the original question, but I wanted to share this anyway: I had the need to create a config file in a directory that required root rights.

The following does not work for that case:

$ sudo cat <<EOF >/etc/somedir/foo.conf
# my config file
foo=bar
EOF

because the redirection is handled outside of the sudo context.

I ended up using this instead:

$ sudo tee <<EOF /etc/somedir/foo.conf >/dev/null
# my config file
foo=bar
EOF

user9048395 ,Jun 6, 2018 at 0:15

This isn't necessarily an answer to the original question, but a sharing of some results from my own testing. This:
<<test > print.sh
#!/bin/bash
echo \$PWD
echo $PWD
test

will produce the same file as:

cat <<test > print.sh
#!/bin/bash
echo \$PWD
echo $PWD
test

So, I don't see the point of using the cat command.
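Before adopting the cat-free form, it may be worth testing it in the shell you actually use: as far as I can tell, plain bash runs no command at all for a bare redirection (so the target file ends up empty), whereas zsh substitutes $NULLCMD, which defaults to cat. A minimal check (file names are just examples):

<<test > no_cat.sh
echo \$PWD
test

cat <<test > with_cat.sh
echo \$PWD
test

cmp no_cat.sh with_cat.sh && echo same || echo different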

> ,Dec 19, 2013 at 21:40

Worth noting that here docs work in bash loops too. This example shows how to get the column list of a table:
export postgres_db_name='my_db'
export table_name='my_table_name'

# start copy 
while read -r c; do test -z "$c" || echo $table_name.$c , ; done < <(cat << EOF | psql -t -q -d $postgres_db_name -v table_name="${table_name:-}"
SELECT column_name
FROM information_schema.columns
WHERE 1=1
AND table_schema = 'public'
AND table_name   =:'table_name'  ;
EOF
)
# stop copy , now paste straight into the bash shell ...

output: 
my_table_name.guid ,
my_table_name.id ,
my_table_name.level ,
my_table_name.seq ,

or even without the newlines:

while read -r c; do test -z "$c" || echo $table_name.$c , | perl -ne 's/\n//gm;print' ; done < <(cat << EOF | psql -t -q -d $postgres_db_name -v table_name="${table_name:-}"
SELECT column_name
FROM information_schema.columns
WHERE 1=1
AND table_schema = 'public'
AND table_name   =:'table_name'  ;
EOF
)

# output: daily_issues.guid ,daily_issues.id ,daily_issues.level ,daily_issues.seq ,daily_issues.prio ,daily_issues.weight ,daily_issues.status ,daily_issues.category ,daily_issues.name ,daily_issues.description ,daily_issues.type ,daily_issues.owner

[Jan 03, 2019] Using Lua for working with excel - Stack Overflow

Jan 03, 2019 | stackoverflow.com

Using Lua for working with excel


Animesh ,Oct 14, 2009 at 12:04

I am planning to learn Lua for my desktop scripting needs. I want to know if there is any documentation available, and also whether the standard library covers everything I would need.

uroc ,Oct 14, 2009 at 12:09

You should check out Lua for Windows -- a 'batteries included environment' for the Lua scripting language on Windows

http://luaforwindows.luaforge.net/

It includes the LuaCOM library, from which you can access the Excel COM object.

Try looking at the LuaCOM documentation, there are some Excel examples in that:

http://www.tecgraf.puc-rio.br/~rcerq/luacom/pub/1.3/luacom-htmldoc/

I've only ever used this for very simplistic things. Here is a sample to get you started:

-- test.lua
require('luacom')
excel = luacom.CreateObject("Excel.Application")
excel.Visible = true
wb = excel.Workbooks:Add()
ws = wb.Worksheets(1)

for i=1, 20 do
    ws.Cells(i,1).Value2 = i
end

Animesh ,Oct 14, 2009 at 12:26

Thanks uroc for your quick response. If possible, please let me know of any beginner tutorial or at least some sample code for using COM programming via Lua. :) – Animesh Oct 14 '09 at 12:26

sagasw ,Oct 16, 2009 at 1:02

A more complex code example for Lua working with Excel:
require "luacom"

excel = luacom.CreateObject("Excel.Application")

local book  = excel.Workbooks:Add()
local sheet = book.Worksheets(1)

excel.Visible = true

for row=1, 30 do
  for col=1, 30 do
    sheet.Cells(row, col).Value2 = math.floor(math.random() * 100)
  end
end


local range = sheet:Range("A1")

for row=1, 30 do
  for col=1, 30 do
    local v = sheet.Cells(row, col).Value2

    if v > 50 then
        local cell = range:Offset(row-1, col-1)

        cell:Select()
        excel.Selection.Interior.Color = 65535
    end
  end
end

excel.DisplayAlerts = false
excel:Quit()
excel = nil

Another example, which adds a line chart:

require "luacom"

excel = luacom.CreateObject("Excel.Application")

local book  = excel.Workbooks:Add()
local sheet = book.Worksheets(1)

excel.Visible = true

for row=1, 30 do
  sheet.Cells(row, 1).Value2 = math.floor(math.random() * 100)
end

local chart = excel.Charts:Add()
chart.ChartType = 4  --  xlLine

local range = sheet:Range("A1:A30")
chart:SetSourceData(range)

Incredulous Monk ,Oct 19, 2009 at 4:17

A quick suggestion: fragments of code will look better if you format them as code (use the little "101 010" button). – Incredulous Monk Oct 19 '09 at 4:17

[Jan 01, 2019] mc - How can I set the default (user defined) listing mode in Midnight Commander? - Unix Linux Stack Exchange

Jan 01, 2019 | unix.stackexchange.com

papaiatis ,Jul 14, 2016 at 11:51

I defined my own listing mode and I'd like to make it permanent so that on the next mc start my defined listing mode will be set. I found no configuration file for mc.

,

You probably have Auto save setup turned off in the Options->Configuration menu.

You can save the configuration manually by Options->Save setup .

Panels setup is saved to ~/.config/mc/panels.ini .

[Dec 05, 2018] How to make putty ssh connection never to timeout when user is idle?

Dec 05, 2018 | askubuntu.com

David MZ ,Feb 13, 2013 at 18:07

I have an Ubuntu 12.04 server I bought. If I connect with PuTTY using ssh and a sudoer user, PuTTY gets disconnected by the server after some time if I am idle. How do I configure Ubuntu to keep this connection alive indefinitely?

das Keks ,Feb 13, 2013 at 18:24

If you go to your putty settings -> Connection and set the value of "Seconds between keepalives" to 30 seconds, this should solve your problem.

kokbira ,Feb 19 at 11:42

?????? "0 to turn off" or 30 to turn off????????? I think he must put 0 instead of 30! – kokbira Feb 19 at 11:42

das Keks ,Feb 19 at 11:46

No, it's the time between keepalives. If you set it to 0, no keepalives are sent but you want putty to send keepalives to keep the connection alive. – das Keks Feb 19 at 11:46

Aaron ,Mar 19 at 20:39

I did this but still it drops.. – Aaron Mar 19 at 20:39

0xC0000022L ,Feb 13, 2013 at 19:29

In addition to the answer from "das Keks" there is at least one other aspect that can affect this behavior. Bash (usually the default shell on Ubuntu) has a variable TMOUT which governs (as a decimal value in seconds) how long an idle shell session is allowed to last before the user is logged out, leading to a disconnect in an SSH session.
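A minimal sketch of checking and clearing it for the current session (assuming TMOUT was not declared read-only by a profile script):

echo $TMOUT    # non-empty means the shell logs you out after this many idle seconds
unset TMOUT    # disables the idle timeout for this session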

In addition I would strongly recommend that you do something else entirely. Set up byobu (or even just tmux alone as it's superior to GNU screen ) and always log in and attach to a preexisting session (that's GNU screen and tmux terminology). This way even if you get forcibly disconnected - let's face it, a power outage or network interruption can always happen - you can always resume your work where you left. And that works across different machines. So you can connect to the same session from another machine (e.g. from home). The possibilities are manifold and it's a true productivity booster. And not to forget, terminal multiplexers overcome one of the big disadvantages of PuTTY: no tabbed interface. Now you get "tabs" in the form of windows and panes inside GNU screen and tmux .

apt-get install tmux
apt-get install byobu

Byobu is a nice frontend to both terminal multiplexers, but tmux is so comfortable that in my opinion it obsoletes byobu to a large extent. So my recommendation would be tmux .

Also search for "dotfiles", in particular tmux.conf and .tmux.conf on the web for many good customizations to get you started.
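A minimal sketch of the attach/resume workflow described above (the session name "work" is arbitrary):

tmux new -s work       # start a named session on the server
# ... the connection drops, or you detach with Ctrl+b d ...
tmux attach -t work    # later: reattach and resume where you left off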

Rajesh ,Mar 19, 2015 at 15:10

Go to PuTTy options --> Connection
  1. Change the default value for "Seconds between keepalives(0 to turn off)" : from 0 to 600 (10 minutes) --This varies...reduce if 10 minutes doesn't help
  2. Check the "Enable TCP_keepalives (SO_KEEPALIVE option)" check box.
  3. Finally save setting for session

,

I keep my PuTTY sessions alive by monitoring the cron logs
tail -f /var/log/cron

I want the PuTTY session alive because I'm proxying through socks.

[Dec 05, 2018] How can I scroll up to see the past output in PuTTY?

Dec 05, 2018 | superuser.com

user1721949 ,Dec 12, 2012 at 8:32

I have a script which, when I run it from PuTTY, scrolls the screen. Now I want to go back to see the errors, but when I scroll up, I can see the past commands, but not the output of the command.

How can I see the past output?

Rico ,Dec 13, 2012 at 8:24

Shift+Pgup/PgDn should work for scrolling without using the scrollbar.

> ,Jul 12, 2017 at 21:45

If shift pageup/pagedown fails, try this command: "reset", which seems to correct the display. – user530079 Jul 12 '17 at 21:45

RedGrittyBrick ,Dec 12, 2012 at 9:31

If you don't pipe the output of your commands into something like less , you will be able to use Putty's scroll-bars to view earlier output.

Putty has settings for how many lines of past output it retains in its buffer.


before scrolling

after scrolling back (upwards)

If you use something like less the output doesn't get into Putty's scroll buffer


after using less

David Dai ,Dec 14, 2012 at 3:31

why is putty different from the native linux console on this point? – David Dai Dec 14 '12 at 3:31

konradstrack ,Dec 12, 2012 at 9:52

I would recommend using screen if you want to have good control over the scroll buffer on a remote shell.

You can change the scroll buffer size to suit your needs by setting:

defscrollback 4000

in ~/.screenrc , which will specify the number of lines you want to be buffered (4000 in this case).

Then you should run your script in a screen session, e.g. by executing screen ./myscript.sh or first executing screen and then ./myscript.sh inside the session.

It's also possible to enable logging of the console output to a file. You can find more info on the screen's man page .
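For example, starting screen with -L writes a transcript of the session to ./screenlog.0 (a sketch; details may vary by version, see the man page):

screen -L ./myscript.sh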

,

From your description, it sounds like the "problem" is that you are using screen, tmux, or another window manager dependent on them (byobu). Normally you should be able to scroll back in putty with no issue. Exceptions include if you are in an application like less or nano that creates its own "window" on the terminal.

With screen and tmux you can generally scroll back with SHIFT + PGUP (same as you could from the physical terminal of the remote machine). They also both have a "copy" mode that frees the cursor from the prompt and lets you use arrow keys to move it around (for selecting text to copy with just the keyboard). It also lets you scroll up and down with the PGUP and PGDN keys. Copy mode under byobu using screen or tmux backends is accessed by pressing F7 (careful, F6 disconnects the session). To do so directly under screen you press CTRL + a then ESC or [ . You can use ESC to exit copy mode. Under tmux you press CTRL + b then [ to enter copy mode and ] to exit.

The simplest solution, of course, is not to use either. I've found both to be quite a bit more trouble than they are worth. If you would like to use multiple different terminals on a remote machine simply connect with multiple instances of putty and manage your windows using, er... Windows. Now forgive me but I must flee before I am burned at the stake for my heresy.

EDIT: almost forgot, some keys may not be received correctly by the remote terminal if putty has not been configured correctly. In your putty config check Terminal -> Keyboard . You probably want the function keys and keypad set to be either Linux or Xterm R6 . If you are seeing strange characters on the terminal when attempting the above this is most likely the problem.

[Dec 01, 2018] Lua editors WoWWiki FANDOM powered by Wikia

Dec 01, 2018 | wowwiki.wikia.com

[Nov 13, 2018] Resuming rsync partial (-P/--partial) on a interrupted transfer

Notable quotes:
"... should ..."
May 15, 2013 | stackoverflow.com

Glitches , May 15, 2013 at 18:06

I am trying to backup my file server to a remote file server using rsync. Rsync is not successfully resuming when a transfer is interrupted. I used the partial option, but rsync doesn't find the file it already started because it renames it to a temporary file, and when resumed it creates a new file and starts from the beginning.

Here is my command:

rsync -avztP -e "ssh -p 2222" /volume1/ myaccont@backup-server-1:/home/myaccount/backup/ --exclude "@spool" --exclude "@tmp"

When this command is run, a backup file named OldDisk.dmg from my local machine gets created on the remote machine as something like .OldDisk.dmg.SjDndj23 .

Now when the internet connection gets interrupted and I have to resume the transfer, I have to find where rsync left off by finding the temp file like .OldDisk.dmg.SjDndj23 and rename it to OldDisk.dmg so that it sees there already exists a file that it can resume.

How do I fix this so I don't have to manually intervene each time?

Richard Michael , Nov 6, 2013 at 4:26

TL;DR : Use --timeout=X (X in seconds) to change the default rsync server timeout, not --inplace .

The issue is the rsync server processes (of which there are two, see rsync --server ... in ps output on the receiver) continue running, to wait for the rsync client to send data.

If the rsync server processes do not receive data for a sufficient time, they will indeed timeout, self-terminate and clean up by moving the temporary file to its "proper" name (e.g., no temporary suffix). You'll then be able to resume.

If you don't want to wait for the long default timeout to cause the rsync server to self-terminate, then when your internet connection returns, log into the server and clean up the rsync server processes manually. However, you must politely terminate rsync -- otherwise, it will not move the partial file into place; but rather, delete it (and thus there is no file to resume). To politely ask rsync to terminate, do not SIGKILL (e.g., -9 ), but SIGTERM (e.g., pkill -TERM -x rsync - only an example, you should take care to match only the rsync processes concerned with your client).

Fortunately there is an easier way: use the --timeout=X (X in seconds) option; it is passed to the rsync server processes as well.

For example, if you specify rsync ... --timeout=15 ... , both the client and server rsync processes will cleanly exit if they do not send/receive data in 15 seconds. On the server, this means moving the temporary file into position, ready for resuming.
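Applied to the asker's original command, that would look something like this (a sketch; 15 is just the example value from above):

rsync -avztP --timeout=15 -e "ssh -p 2222" /volume1/ myaccont@backup-server-1:/home/myaccount/backup/ --exclude "@spool" --exclude "@tmp"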

I'm not sure how long the various rsync processes will try to send/receive data before they die (it might vary with operating system). In my testing, the server rsync processes remain running longer than the local client. On a "dead" network connection, the client terminates with a broken pipe (e.g., no network socket) after about 30 seconds; you could experiment or review the source code. Meaning, you could try to "ride out" the bad internet connection for 15-20 seconds.

If you do not clean up the server rsync processes (or wait for them to die), but instead immediately launch another rsync client process, two additional server processes will launch (for the other end of your new client process). Specifically, the new rsync client will not re-use/reconnect to the existing rsync server processes. Thus, you'll have two temporary files (and four rsync server processes) -- though, only the newer, second temporary file has new data being written (received from your new rsync client process).

Interestingly, if you then clean up all rsync server processes (for example, stop your client, which will stop the new rsync servers, then SIGTERM the older rsync servers), it appears to merge (assemble) all the partial files into the new properly named file. So, imagine a long running partial copy which dies (and you think you've "lost" all the copied data), and a short running re-launched rsync (oops!)... you can stop the second client, SIGTERM the first servers, it will merge the data, and you can resume.

Finally, a few short remarks:

JamesTheAwesomeDude , Dec 29, 2013 at 16:50

Just curious: wouldn't SIGINT (aka ^C ) be 'politer' than SIGTERM ? – JamesTheAwesomeDude Dec 29 '13 at 16:50

Richard Michael , Dec 29, 2013 at 22:34

I didn't test how the server-side rsync handles SIGINT, so I'm not sure it will keep the partial file - you could check. Note that this doesn't have much to do with Ctrl-c ; it happens that your terminal sends SIGINT to the foreground process when you press Ctrl-c , but the server-side rsync has no controlling terminal. You must log in to the server and use kill . The client-side rsync will not send a message to the server (for example, after the client receives SIGINT via your terminal Ctrl-c ) - might be interesting though. As for anthropomorphizing, not sure what's "politer". :-) – Richard Michael Dec 29 '13 at 22:34

d-b , Feb 3, 2015 at 8:48

I just tried this timeout argument rsync -av --delete --progress --stats --human-readable --checksum --timeout=60 --partial-dir /tmp/rsync/ rsync://$remote:/ /src/ but then it timed out during the "receiving file list" phase (which in this case takes around 30 minutes). Setting the timeout to half an hour kind of defeats the purpose. Any workaround for this? – d-b Feb 3 '15 at 8:48

Cees Timmerman , Sep 15, 2015 at 17:10

@user23122 --checksum reads all data when preparing the file list, which is great for many small files that change often, but should be done on-demand for large files. – Cees Timmerman Sep 15 '15 at 17:10

[Nov 08, 2018] How to find which process is regularly writing to disk?

Notable quotes:
"... tick...tick...tick...trrrrrr ..."
"... /var/log/syslog ..."
Nov 08, 2018 | unix.stackexchange.com

Cedric Martin , Jul 27, 2012 at 4:31

How can I find which process is constantly writing to disk?

I like my workstation to be close to silent and I just built a new system (P8B75-M + Core i5 3450s -- the 's' because it has a lower max TDP) with quiet fans etc. and installed Debian Wheezy 64-bit on it.

And something is getting on my nerves: I can hear some kind of pattern, as if the hard disk was writing or seeking something ( tick...tick...tick...trrrrrr rinse and repeat every second or so).

I had a similar issue in the past (many, many years ago) and it turned out to be some CUPS log or something, and I simply redirected that one (not important) log to a (real) RAM disk.

But here I'm not sure.

I tried the following:

ls -lR /var/log > /tmp/a.tmp && sleep 5 && ls -lR /var/log > /tmp/b.tmp && diff /tmp/?.tmp

but nothing is changing there.

Now the strange thing is that I also hear the pattern when the prompt asking me to enter my LVM decryption passphrase is showing.

Could it be something in the kernel/system I just installed or do I have a faulty harddisk?

hdparm -tT /dev/sda reports a correct HD speed (130 GB/s non-cached, sata 6GB) and I've already installed and compiled from big sources (Emacs) without issue, so I don't think the system is bad.

(HD is a Seagate Barracuda 500GB)

Mat , Jul 27, 2012 at 6:03

Are you sure it's a hard drive making that noise, and not something else? (Check the fans, including PSU fan. Had very strange clicking noises once when a very thin cable was too close to a fan and would sometimes very slightly touch the blades and bounce for a few "clicks"...) – Mat Jul 27 '12 at 6:03

Cedric Martin , Jul 27, 2012 at 7:02

@Mat: I'll take the hard drive outside of the case (the connectors should be long enough) to be sure and I'll report back ; ) – Cedric Martin Jul 27 '12 at 7:02

camh , Jul 27, 2012 at 9:48

Make sure your disk filesystems are mounted relatime or noatime. File reads can be causing writes to inodes to record the access time. – camh Jul 27 '12 at 9:48

mnmnc , Jul 27, 2012 at 8:27

Did you try to examine what programs like iotop show? It will tell you exactly which process is currently writing to the disk.

example output:

Total DISK READ: 0.00 B/s | Total DISK WRITE: 0.00 B/s
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND
    1 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % init
    2 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [kthreadd]
    3 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [ksoftirqd/0]
    6 rt/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [migration/0]
    7 rt/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [watchdog/0]
    8 rt/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [migration/1]
 1033 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [flush-8:0]
   10 be/4 root        0.00 B/s    0.00 B/s  0.00 %  0.00 % [ksoftirqd/1]

Cedric Martin , Aug 2, 2012 at 15:56

thanks for that tip. I didn't know about iotop . On Debian I did an apt-cache search iotop to find out that I had to apt-get iotop . Very cool command! – Cedric Martin Aug 2 '12 at 15:56

ndemou , Jun 20, 2016 at 15:32

I use iotop -o -b -d 10 which every 10 secs prints a list of processes that read/wrote to disk and the amount of IO bandwidth used. – ndemou Jun 20 '16 at 15:32

scai , Jul 27, 2012 at 10:48

You can enable IO debugging via echo 1 > /proc/sys/vm/block_dump and then watch the debugging messages in /var/log/syslog . This has the advantage of obtaining some type of log file with past activities whereas iotop only shows the current activity.

dan3 , Jul 15, 2013 at 8:32

It is absolutely crazy to leave syslogging enabled when block_dump is active. Logging causes disk activity, which causes logging, which causes disk activity etc. Better stop syslog before enabling this (and use dmesg to read the messages) – dan3 Jul 15 '13 at 8:32

scai , Jul 16, 2013 at 6:32

You are absolutely right, although the effect isn't as dramatic as you describe it. If you just want to have a short peek at the disk activity there is no need to stop the syslog daemon. – scai Jul 16 '13 at 6:32

dan3 , Jul 16, 2013 at 7:22

I tried it about 2 years ago and it brought my machine to a halt. One of these days when I have nothing important running I'll try it again :) – dan3 Jul 16 '13 at 7:22

scai , Jul 16, 2013 at 10:50

I tried it, nothing really happened. Especially because of file system buffering. A write to syslog doesn't immediately trigger a write to disk. – scai Jul 16 '13 at 10:50

Volker Siegel , Apr 16, 2014 at 22:57

I would assume there is general rate limiting in place for the log messages, which handles this case too(?) – Volker Siegel Apr 16 '14 at 22:57

Gilles , Jul 28, 2012 at 1:34

Assuming that the disk noises are due to a process causing a write and not to some disk spindown problem , you can use the audit subsystem (install the auditd package ). Put a watch on the sync calls and its friends:
auditctl -S sync -S fsync -S fdatasync -a exit,always

Watch the logs in /var/log/audit/audit.log . Be careful not to do this if the audit logs themselves are flushed! Check in /etc/auditd.conf that the flush option is set to none .

If files are being flushed often, a likely culprit is the system logs. For example, if you log failed incoming connection attempts and someone is probing your machine, that will generate a lot of entries; this can cause a disk to emit machine gun-style noises. With the basic log daemon sysklogd, check /etc/syslog.conf : if a log file name is not preceded by - , then that log is flushed to disk after each write.
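A sketch of what the two forms look like in /etc/syslog.conf (the mail facility is only an example):

# with the leading "-" the file is buffered, not synced after every message:
mail.*    -/var/log/mail.log
# without it, every single write is flushed to disk:
mail.*    /var/log/mail.log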

Gilles , Mar 23 at 18:24

@StephenKitt Huh. No. The asker mentioned Debian so I've changed it to a link to the Debian package. – Gilles Mar 23 at 18:24

cas , Jul 27, 2012 at 9:40

It might be your drives automatically spinning down, lots of consumer-grade drives do that these days. Unfortunately on even a lightly loaded system, this results in the drives constantly spinning down and then spinning up again, especially if you're running hddtemp or similar to monitor the drive temperature (most drives stupidly don't let you query the SMART temperature value without spinning up the drive - cretinous!).

This is not only annoying, it can wear out the drives faster as many drives have only a limited number of park cycles. e.g. see https://bugs.launchpad.net/ubuntu/+source/hdparm/+bug/952556 for a description of the problem.

I disable idle-spindown on all my drives with the following bit of shell code. you could put it in an /etc/rc.boot script, or in /etc/rc.local or similar.

for disk in /dev/sd? ; do
  /sbin/hdparm -q -S 0 "$disk"    # $disk already expands to /dev/sdX, so no /dev/ prefix needed
done

Cedric Martin , Aug 2, 2012 at 16:03

that you can't query SMART readings without spinning up the drive leaves me speechless :-/ Now obviously the "spinning down" issue can become quite complicated. Regarding disabling the spinning down: wouldn't that in itself cause the HD to wear out faster? I mean: it's never ever "resting" as long as the system is on then? – Cedric Martin Aug 2 '12 at 16:03

cas , Aug 2, 2012 at 21:42

IIRC you can query some SMART values without causing the drive to spin up, but temperature isn't one of them on any of the drives I've tested (incl models from WD, Seagate, Samsung, Hitachi). Which is, of course, crazy because concern over temperature is one of the reasons for idling a drive. re: wear: AIUI 1. constant velocity is less wearing than changing speed. 2. the drives have to park the heads in a safe area and a drive is only rated to do that so many times (IIRC up to a few hundred thousand - easily exceeded if the drive is idling and spinning up every few seconds) – cas Aug 2 '12 at 21:42

Micheal Johnson , Mar 12, 2016 at 20:48

It's a long debate regarding whether it's better to leave drives running or to spin them down. Personally I believe it's best to leave them running - I turn my computer off at night and when I go out but other than that I never spin my drives down. Some people prefer to spin them down, say, at night if they're leaving the computer on or if the computer's idle for a long time, and in such cases the advantage of spinning them down for a few hours versus leaving them running is debatable. What's never good though is when the hard drive repeatedly spins down and up again in a short period of time. – Micheal Johnson Mar 12 '16 at 20:48

Micheal Johnson , Mar 12, 2016 at 20:51

Note also that spinning the drive down after it's been idle for a few hours is a bit silly, because if it's been idle for a few hours then it's likely to be used again within an hour. In that case, it would seem better to spin the drive down promptly if it's idle (like, within 10 minutes), but it's also possible for the drive to be idle for a few minutes when someone is using the computer and is likely to need the drive again soon. – Micheal Johnson Mar 12 '16 at 20:51

,

I just found that s.m.a.r.t was causing an external USB disk to spin up again and again on my raspberry pi. Although SMART is generally a good thing, I decided to disable it again and since then it seems that unwanted disk activity has stopped

[Nov 08, 2018] Determining what process is bound to a port

Mar 14, 2011 | unix.stackexchange.com
I know that using the command:
lsof -i TCP

(or some variant of parameters with lsof) I can determine which process is bound to a particular port. This is useful, say, if I'm trying to start something that wants to bind to 8080 and something else is already using that port, but I don't know what.

Is there an easy way to do this without using lsof? I spend time working on many systems and lsof is often not installed.

Cakemox , Mar 14, 2011 at 20:48

netstat -lnp will list the pid and process name next to each listening port. This will work under Linux, but not all others (like AIX.) Add -t if you want TCP only.
# netstat -lntp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:24800           0.0.0.0:*               LISTEN      27899/synergys
tcp        0      0 0.0.0.0:8000            0.0.0.0:*               LISTEN      3361/python
tcp        0      0 127.0.0.1:3306          0.0.0.0:*               LISTEN      2264/mysqld
tcp        0      0 0.0.0.0:80              0.0.0.0:*               LISTEN      22964/apache2
tcp        0      0 192.168.99.1:53         0.0.0.0:*               LISTEN      3389/named
tcp        0      0 192.168.88.1:53         0.0.0.0:*               LISTEN      3389/named

etc.

xxx , Mar 14, 2011 at 21:01

Cool, thanks. Looks like that works under RHEL, but not under Solaris (as you indicated). Anybody know if there's something similar for Solaris? – user5721 Mar 14 '11 at 21:01

Rich Homolka , Mar 15, 2011 at 19:56

netstat -p above is my vote. Also look at lsof . – Rich Homolka Mar 15 '11 at 19:56

Jonathan , Aug 26, 2014 at 18:50

As an aside, for Windows it's similar: netstat -aon | more – Jonathan Aug 26 '14 at 18:50

sudo , May 25, 2017 at 2:24

What about for SCTP? – sudo May 25 '17 at 2:24

frielp , Mar 15, 2011 at 13:33

On AIX, netstat & rmsock can be used to determine process binding:
[root@aix] netstat -Ana|grep LISTEN|grep 80
f100070000280bb0 tcp4       0      0  *.37               *.*        LISTEN
f1000700025de3b0 tcp        0      0  *.80               *.*        LISTEN
f1000700002803b0 tcp4       0      0  *.111              *.*        LISTEN
f1000700021b33b0 tcp4       0      0  127.0.0.1.32780    *.*        LISTEN

# Port 80 maps to f1000700025de3b0 above, so we type:
[root@aix] rmsock f1000700025de3b0 tcpcb
The socket 0x25de008 is being held by process 499790 (java).

Olivier Dulac , Sep 18, 2013 at 4:05

Thanks for this! Is there a way, however, to just display what process listens on the socket (instead of using rmsock which attempts to remove it)? – Olivier Dulac Sep 18 '13 at 4:05

Vitor Py , Sep 26, 2013 at 14:18

@OlivierDulac: "Unlike what its name implies, rmsock does not remove the socket, if it is being used by a process. It just reports the process holding the socket." ( ibm.com/developerworks/community/blogs/cgaix/entry/ ) – Vitor Py Sep 26 '13 at 14:18

Olivier Dulac , Sep 26, 2013 at 16:00

@vitor-braga: Ah thx! I thought it was trying, but it just said which process holds it when it couldn't remove it. Apparently it doesn't even try to remove it when a process holds it. That's cool! Thx! – Olivier Dulac Sep 26 '13 at 16:00

frielp , Mar 15, 2011 at 13:27

Another tool available on Linux is ss . From the ss man page on Fedora:
NAME
       ss - another utility to investigate sockets
SYNOPSIS
       ss [options] [ FILTER ]
DESCRIPTION
       ss is used to dump socket statistics. It allows showing information 
       similar to netstat. It can display more TCP and state informations  
       than other tools.

Example output below - the final column shows the process binding:

[root@box] ss -ap
State      Recv-Q Send-Q      Local Address:Port          Peer Address:Port
LISTEN     0      128                    :::http                    :::*        users:(("httpd",20891,4),("httpd",20894,4),("httpd",20895,4),("httpd",20896,4)
LISTEN     0      128             127.0.0.1:munin                    *:*        users:(("munin-node",1278,5))
LISTEN     0      128                    :::ssh                     :::*        users:(("sshd",1175,4))
LISTEN     0      128                     *:ssh                      *:*        users:(("sshd",1175,3))
LISTEN     0      10              127.0.0.1:smtp                     *:*        users:(("sendmail",1199,4))
LISTEN     0      128             127.0.0.1:x11-ssh-offset                  *:*        users:(("sshd",25734,8))
LISTEN     0      128                   ::1:x11-ssh-offset                 :::*        users:(("sshd",25734,7))

Eugen Constantin Dinca , Mar 14, 2011 at 23:47

For Solaris you can use pfiles and then grep by sockname: or port: .

A sample (from here ):

pfiles `ptree | awk '{print $1}'` | egrep '^[0-9]|port:'

rickumali , May 8, 2011 at 14:40

I was once faced with trying to determine what process was behind a particular port (this time it was 8000). I tried a variety of lsof and netstat, but then took a chance and tried hitting the port via a browser (i.e. http://hostname:8000/ ). Lo and behold, a splash screen greeted me, and it became obvious what the process was (for the record, it was Splunk ).

One more thought: "ps -e -o pid,args" (YMMV) may sometimes show the port number in the arguments list. Grep is your friend!
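For example (a sketch; 8080 stands for whatever port you are chasing):

ps -e -o pid,args | grep 8080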

Gilles , Oct 8, 2015 at 21:04

In the same vein, you could telnet hostname 8000 and see if the server prints a banner. However, that's mostly useful when the server is running on a machine where you don't have shell access, and then finding the process ID isn't relevant. – Gilles May 8 '11 at 14:45

[Nov 08, 2018] How to split one string into multiple variables in bash shell? [duplicate]

Nov 08, 2018 | stackoverflow.com

Rob I , May 9, 2012 at 19:22

For your second question, see @mkb's comment to my answer below - that's definitely the way to go! – Rob I May 9 '12 at 19:22

Dennis Williamson , Jul 4, 2012 at 16:14

See my edited answer for one way to read individual characters into an array. – Dennis Williamson Jul 4 '12 at 16:14

Nick Weedon , Dec 31, 2015 at 11:04

Here is the same thing in a more concise form: var1=$(cut -f1 -d- <<<$STR) – Nick Weedon Dec 31 '15 at 11:04

Rob I , May 9, 2012 at 17:00

If your solution doesn't have to be general, i.e. only needs to work for strings like your example, you could do:
var1=$(echo $STR | cut -f1 -d-)
var2=$(echo $STR | cut -f2 -d-)

I chose cut here because you could simply extend the code for a few more variables...

crunchybutternut , May 9, 2012 at 17:40

Can you look at my post again and see if you have a solution for the followup question? thanks! – crunchybutternut May 9 '12 at 17:40

mkb , May 9, 2012 at 17:59

You can use cut to cut characters too! cut -c1 for example. – mkb May 9 '12 at 17:59

FSp , Nov 27, 2012 at 10:26

Although this is very simple to read and write, it is a very slow solution because it forces you to read the same data ($STR) twice ... if you care about your script's performance, the @anubhava solution is much better – FSp Nov 27 '12 at 10:26

tripleee , Jan 25, 2016 at 6:47

Apart from being an ugly last-resort solution, this has a bug: You should absolutely use double quotes in echo "$STR" unless you specifically want the shell to expand any wildcards in the string as a side effect. See also stackoverflow.com/questions/10067266/ – tripleee Jan 25 '16 at 6:47

Rob I , Feb 10, 2016 at 13:57

You're right about double quotes of course, though I did point out this solution wasn't general. However I think your assessment is a bit unfair - for some people this solution may be more readable (and hence extensible etc) than some others, and doesn't completely rely on arcane bash feature that wouldn't translate to other shells. I suspect that's why my solution, though less elegant, continues to get votes periodically... – Rob I Feb 10 '16 at 13:57

Dennis Williamson , May 10, 2012 at 3:14

read with IFS are perfect for this:
$ IFS=- read var1 var2 <<< ABCDE-123456
$ echo "$var1"
ABCDE
$ echo "$var2"
123456

Edit:

Here is how you can read each individual character into array elements:

$ read -a foo <<<"$(echo "ABCDE-123456" | sed 's/./& /g')"

Dump the array:

$ declare -p foo
declare -a foo='([0]="A" [1]="B" [2]="C" [3]="D" [4]="E" [5]="-" [6]="1" [7]="2" [8]="3" [9]="4" [10]="5" [11]="6")'

If there are spaces in the string:

$ IFS=$'\v' read -a foo <<<"$(echo "ABCDE 123456" | sed 's/./&\v/g')"
$ declare -p foo
declare -a foo='([0]="A" [1]="B" [2]="C" [3]="D" [4]="E" [5]=" " [6]="1" [7]="2" [8]="3" [9]="4" [10]="5" [11]="6")'

insecure , Apr 30, 2014 at 7:51

Great, the elegant bash-only way, without unnecessary forks. – insecure Apr 30 '14 at 7:51

Martin Serrano , Jan 11 at 4:34

this solution also has the benefit that if delimiter is not present, the var2 will be empty – Martin Serrano Jan 11 at 4:34

mkb , May 9, 2012 at 17:02

If you know it's going to be just two fields, you can skip the extra subprocesses like this:
var1=${STR%-*}
var2=${STR#*-}

What does this do? ${STR%-*} deletes the shortest substring of $STR that matches the pattern -* starting from the end of the string. ${STR#*-} does the same, but with the *- pattern and starting from the beginning of the string. They each have counterparts %% and ## which find the longest anchored pattern match. If anyone has a helpful mnemonic to remember which does which, let me know! I always have to try both to remember.
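A small demo of all four operators on one string (the value is chosen to make the difference visible):

STR="a-b-c"
echo ${STR%-*}     # a-b  (%  removes the shortest matching suffix)
echo ${STR%%-*}    # a    (%% removes the longest matching suffix)
echo ${STR#*-}     # b-c  (#  removes the shortest matching prefix)
echo ${STR##*-}    # c    (## removes the longest matching prefix)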

Jens , Jan 30, 2015 at 15:17

Plus 1 For knowing your POSIX shell features, avoiding expensive forks and pipes, and the absence of bashisms. – Jens Jan 30 '15 at 15:17

Steven Lu , May 1, 2015 at 20:19

Dunno about "absence of bashisms" considering that this is already moderately cryptic .... if your delimiter is a newline instead of a hyphen, then it becomes even more cryptic. On the other hand, it works with newlines , so there's that. – Steven Lu May 1 '15 at 20:19

mkb , Mar 9, 2016 at 17:30

@KErlandsson: done – mkb Mar 9 '16 at 17:30

mombip , Aug 9, 2016 at 15:58

I've finally found documentation for it: Shell-Parameter-Expansion – mombip Aug 9 '16 at 15:58

DS. , Jan 13, 2017 at 19:56

Mnemonic: "#" is to the left of "%" on a standard keyboard, so "#" removes a prefix (on the left), and "%" removes a suffix (on the right). – DS. Jan 13 '17 at 19:56

tripleee , May 9, 2012 at 17:57

Sounds like a job for set with a custom IFS .
IFS=-
set $STR
var1=$1
var2=$2

(You will want to do this in a function with a local IFS so you don't mess up other parts of your script where you require IFS to be what you expect.)
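A minimal sketch of that function-local form (the function name is just an example):

split_pair() {
    local IFS=-
    set -- $1            # word splitting on "-" happens here
    var1=$1
    var2=$2
}
split_pair "ABCDE-123456"   # var1=ABCDE, var2=123456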

Rob I , May 9, 2012 at 19:20

Nice - I knew about $IFS but hadn't seen how it could be used. – Rob I May 9 '12 at 19:20

Sigg3.net , Jun 19, 2013 at 8:08

I used triplee's example and it worked exactly as advertised! Just change the last two lines to myvar1=`echo $1` && myvar2=`echo $2` if you need to store them throughout a script with several "thrown" variables. – Sigg3.net Jun 19 '13 at 8:08

tripleee , Jun 19, 2013 at 13:25

No, don't use a useless echo in backticks . – tripleee Jun 19 '13 at 13:25

Daniel Andersson , Mar 27, 2015 at 6:46

This is a really sweet solution if we need to write something that is not Bash specific. To handle IFS troubles, one can add OLDIFS=$IFS at the beginning before overwriting it, and then add IFS=$OLDIFS just after the set line. – Daniel Andersson Mar 27 '15 at 6:46

tripleee , Mar 27, 2015 at 6:58

FWIW the link above is broken. I was lazy and careless. The canonical location still works; iki.fi/era/unix/award.html#echo – tripleee Mar 27 '15 at 6:58

anubhava , May 9, 2012 at 17:09

Using bash regex capabilities:
re="^([^-]+)-(.*)$"
[[ "ABCDE-123456" =~ $re ]] && var1="${BASH_REMATCH[1]}" && var2="${BASH_REMATCH[2]}"
echo $var1
echo $var2

OUTPUT

ABCDE
123456

Cometsong , Oct 21, 2016 at 13:29

Love pre-defining the re for later use(s)! – Cometsong Oct 21 '16 at 13:29

Archibald , Nov 12, 2012 at 11:03

string="ABCDE-123456"
IFS=- # use "local IFS=-" inside the function
set $string
echo $1 # >>> ABCDE
echo $2 # >>> 123456

tripleee , Mar 27, 2015 at 7:02

Hmmm, isn't this just a restatement of my answer ? – tripleee Mar 27 '15 at 7:02

Archibald , Sep 18, 2015 at 12:36

Actually yes. I just clarified it a bit. – Archibald Sep 18 '15 at 12:36

[Nov 08, 2018] How to split a string in shell and get the last field

Nov 08, 2018 | stackoverflow.com

cd1 , Jul 1, 2010 at 23:29

Suppose I have the string 1:2:3:4:5 and I want to get its last field ( 5 in this case). How do I do that using Bash? I tried cut , but I don't know how to specify the last field with -f .

Stephen , Jul 2, 2010 at 0:05

You can use string operators :
$ foo=1:2:3:4:5
$ echo ${foo##*:}
5

This trims everything from the front until a ':', greedily.

${foo  <-- from variable foo
  ##   <-- greedy front trim
  *    <-- matches anything
  :    <-- until the last ':'
 }

eckes , Jan 23, 2013 at 15:23

While this is working for the given problem, the answer of William below ( stackoverflow.com/a/3163857/520162 ) also returns 5 if the string is 1:2:3:4:5: (while using the string operators yields an empty result). This is especially handy when parsing paths that could contain (or not) a finishing / character. – eckes Jan 23 '13 at 15:23

Dobz , Jun 25, 2014 at 11:44

How would you then do the opposite of this? to echo out '1:2:3:4:'? – Dobz Jun 25 '14 at 11:44

Mihai Danila , Jul 9, 2014 at 14:07

And how does one keep the part before the last separator? Apparently by using ${foo%:*} . # - from beginning; % - from end. # , % - shortest match; ## , %% - longest match. – Mihai Danila Jul 9 '14 at 14:07

Putnik , Feb 11, 2016 at 22:33

If i want to get the last element from path, how should I use it? echo ${pwd##*/} does not work. – Putnik Feb 11 '16 at 22:33

Stan Strum , Dec 17, 2017 at 4:22

@Putnik that command sees pwd as a variable. Try dir=$(pwd); echo ${dir##*/} . Works for me! – Stan Strum Dec 17 '17 at 4:22

a3nm , Feb 3, 2012 at 8:39

Another way is to reverse before and after cut :
$ echo ab:cd:ef | rev | cut -d: -f1 | rev
ef

This makes it very easy to get the last but one field, or any range of fields numbered from the end.
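For example, under the same approach the second field from the end just changes the cut field number:

$ echo ab:cd:ef | rev | cut -d: -f2 | rev
cd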

Dannid , Jan 14, 2013 at 20:50

This answer is nice because it uses 'cut', which the author is (presumably) already familiar. Plus, I like this answer because I am using 'cut' and had this exact question, hence finding this thread via search. – Dannid Jan 14 '13 at 20:50

funroll , Aug 12, 2013 at 19:51

Some cut-and-paste fodder for people using spaces as delimiters: echo "1 2 3 4" | rev | cut -d " " -f1 | revfunroll Aug 12 '13 at 19:51

EdgeCaseBerg , Sep 8, 2013 at 5:01

the rev | cut -d: -f1 | rev is so clever! Thanks! Helped me a bunch (my use case was rev | cut -d ' ' -f2- | rev ) – EdgeCaseBerg Sep 8 '13 at 5:01

Anarcho-Chossid , Sep 16, 2015 at 15:54

Wow. Beautiful and dark magic. – Anarcho-Chossid Sep 16 '15 at 15:54

shearn89 , Aug 17, 2017 at 9:27

I always forget about rev , was just what I needed! cut -b20- | rev | cut -b10- | revshearn89 Aug 17 '17 at 9:27

William Pursell , Jul 2, 2010 at 7:09

It's difficult to get the last field using cut, but here are solutions in awk and perl
$ echo 1:2:3:4:5 | awk -F: '{print $NF}'
5
$ echo 1:2:3:4:5 | perl -F: -wane 'print $F[-1]'
5

eckes , Jan 23, 2013 at 15:20

great advantage of this solution over the accepted answer: it also matches paths that contain or do not contain a finishing / character: /a/b/c/d and /a/b/c/d/ yield the same result ( d ) when processing pwd | awk -F/ '{print $NF}' . The accepted answer results in an empty result in the case of /a/b/c/d/eckes Jan 23 '13 at 15:20

stamster , May 21 at 11:52

@eckes In case of the AWK solution, on GNU bash, version 4.3.48(1)-release, that's not true, as it matters whether you have a trailing slash or not. Simply put, AWK will use / as delimiter, and if your path is /my/path/dir/ it will use the value after the last delimiter, which is simply an empty string. So it's best to avoid the trailing slash if you need to do such a thing like I do. – stamster May 21 at 11:52

Nicholas M T Elliott , Jul 1, 2010 at 23:39

Assuming fairly simple usage (no escaping of the delimiter, for example), you can use grep:
$ echo "1:2:3:4:5" | grep -oE "[^:]+$"
5

Breakdown - find all the characters not the delimiter ([^:]) at the end of the line ($). -o only prints the matching part.

Dennis Williamson , Jul 2, 2010 at 0:05

One way:
var1="1:2:3:4:5"
var2=${var1##*:}

Another, using an array:

var1="1:2:3:4:5"
saveIFS=$IFS
IFS=":"
var2=($var1)
IFS=$saveIFS
var2=${var2[@]: -1}

Yet another with an array:

var1="1:2:3:4:5"
saveIFS=$IFS
IFS=":"
var2=($var1)
IFS=$saveIFS
count=${#var2[@]}
var2=${var2[$count-1]}

Using Bash (version >= 3.2) regular expressions:

var1="1:2:3:4:5"
[[ $var1 =~ :([^:]*)$ ]]
var2=${BASH_REMATCH[1]}

liuyang1 , Mar 24, 2015 at 6:02

Thanks so much for the array style, as I need this feature but don't have cut or awk available. – liuyang1 Mar 24 '15 at 6:02

user3133260 , Dec 24, 2013 at 19:04

$ echo "a b c d e" | tr ' ' '\n' | tail -1
e

Simply translate the delimiter into a newline and choose the last entry with tail -1 .

Yajo , Jul 30, 2014 at 10:13

It will fail if the last item contains a \n , but for most cases is the most readable solution. – Yajo Jul 30 '14 at 10:13

Rafael , Nov 10, 2016 at 10:09

Using sed :
$ echo '1:2:3:4:5' | sed 's/.*://' # => 5

$ echo '' | sed 's/.*://' # => (empty)

$ echo ':' | sed 's/.*://' # => (empty)
$ echo ':b' | sed 's/.*://' # => b
$ echo '::c' | sed 's/.*://' # => c

$ echo 'a' | sed 's/.*://' # => a
$ echo 'a:' | sed 's/.*://' # => (empty)
$ echo 'a:b' | sed 's/.*://' # => b
$ echo 'a::c' | sed 's/.*://' # => c

Ab Irato , Nov 13, 2013 at 16:10

If your last field is a single character, you could do this:
a="1:2:3:4:5"

echo ${a: -1}
echo ${a:(-1)}

Check string manipulation in bash .

gniourf_gniourf , Nov 13, 2013 at 16:15

This doesn't work: it gives the last character of a , not the last field . – gniourf_gniourf Nov 13 '13 at 16:15

Ab Irato , Nov 25, 2013 at 13:25

True, that's the idea, if you know the length of the last field it's good. If not you have to use something else... – Ab Irato Nov 25 '13 at 13:25

sphakka , Jan 25, 2016 at 16:24

Interesting, I didn't know of these particular Bash string manipulations. It also resembles to Python's string/array slicing . – sphakka Jan 25 '16 at 16:24

ghostdog74 , Jul 2, 2010 at 1:16

Using Bash.
$ var1="1:2:3:4:0"
$ IFS=":"
$ set -- $var1
$ eval echo  \$${#}
0

Sopalajo de Arrierez , Dec 24, 2014 at 5:04

I would buy some details about this method, please :-) . – Sopalajo de Arrierez Dec 24 '14 at 5:04

Rafa , Apr 27, 2017 at 22:10

Could have used echo ${!#} instead of eval echo \$${#} . – Rafa Apr 27 '17 at 22:10
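To unpack that: after set -- , $# holds the number of positional parameters, and ${!#} is an indirect expansion, i.e. the value of the parameter whose number is $# . A minimal sketch:

var1="1:2:3:4:0"
IFS=:
set -- $var1      # positional parameters become 1 2 3 4 0
echo "$#"         # 5, the number of fields
echo "${!#}"      # same as $5 here: the last field, without eval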

Crytis , Dec 7, 2016 at 6:51

echo "a:b:c:d:e"|xargs -d : -n1|tail -1

First use xargs to split the string using ":" ; -n1 means every line has only one part. Then print the last part with tail -1 .

BDL , Dec 7, 2016 at 13:47

Although this might solve the problem, one should always add an explanation to it. – BDL Dec 7 '16 at 13:47

Crytis , Jun 7, 2017 at 9:13

already added.. – Crytis Jun 7 '17 at 9:13

021 , Apr 26, 2016 at 11:33

There are many good answers here, but still I want to share this one using basename :
 basename $(echo "a:b:c:d:e" | tr ':' '/')

However it will fail if there are already some '/' in your string . If slash / is your delimiter then you just have to (and should) use basename.

It's not the best answer but it just shows how you can be creative using bash commands.

Nahid Akbar , Jun 22, 2012 at 2:55

for x in `echo $str | tr ";" "\n"`; do echo $x; done

chepner , Jun 22, 2012 at 12:58

This runs into problems if there is whitespace in any of the fields. Also, it does not directly address the question of retrieving the last field. – chepner Jun 22 '12 at 12:58
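A whitespace-tolerant variant (a sketch, not from the thread; note the loop body runs in a subshell because of the pipe, so variables set inside it won't survive the loop):

echo "$str" | tr ";" "\n" | while IFS= read -r x; do
    printf '%s\n' "$x"
done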

Christoph Böddeker , Feb 19 at 15:50

For those that comfortable with Python, https://github.com/Russell91/pythonpy is a nice choice to solve this problem.
$ echo "a:b:c:d:e" | py -x 'x.split(":")[-1]'

From the pythonpy help: -x treat each row of stdin as x .

With that tool, it is easy to write python code that gets applied to the input.

baz , Nov 24, 2017 at 19:27

a solution using the read builtin
IFS=':' read -a field <<< "1:2:3:4:5"
echo ${field[4]}
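If the number of fields is not known in advance, bash 4.3 and later also accepts negative array subscripts, which address the array from its end:

IFS=':' read -ra field <<< "1:2:3:4:5"
echo "${field[-1]}"    # last field, however many fields there are (bash >= 4.3)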

[Nov 08, 2018] How do I split a string on a delimiter in Bash?

Notable quotes:
"... Bash shell script split array ..."
"... associative array ..."
"... pattern substitution ..."
"... Debian GNU/Linux ..."
Nov 08, 2018 | stackoverflow.com

stefanB , May 28, 2009 at 2:03

I have this string stored in a variable:
IN="[email protected];[email protected]"

Now I would like to split the strings by ; delimiter so that I have:

ADDR1="[email protected]"
ADDR2="[email protected]"

I don't necessarily need the ADDR1 and ADDR2 variables. If they are elements of an array that's even better.


After suggestions from the answers below, I ended up with the following which is what I was after:

#!/usr/bin/env bash

IN="[email protected];[email protected]"

mails=$(echo $IN | tr ";" "\n")

for addr in $mails
do
    echo "> [$addr]"
done

Output:

> [[email protected]]
> [[email protected]]

There was a solution involving setting Internal_field_separator (IFS) to ; . I am not sure what happened with that answer, how do you reset IFS back to default?

RE: IFS solution, I tried this and it works, I keep the old IFS and then restore it:

IN="[email protected];[email protected]"

OIFS=$IFS
IFS=';'
mails2=$IN
for x in $mails2
do
    echo "> [$x]"
done

IFS=$OIFS

BTW, when I tried

mails2=($IN)

I only got the first string when printing it in loop, without brackets around $IN it works.

Brooks Moses , May 1, 2012 at 1:26

With regards to your "Edit2": You can simply "unset IFS" and it will return to the default state. There's no need to save and restore it explicitly unless you have some reason to expect that it's already been set to a non-default value. Moreover, if you're doing this inside a function (and, if you aren't, why not?), you can set IFS as a local variable and it will return to its previous value once you exit the function. – Brooks Moses May 1 '12 at 1:26

dubiousjim , May 31, 2012 at 5:21

@BrooksMoses: (a) +1 for using local IFS=... where possible; (b) -1 for unset IFS , this doesn't exactly reset IFS to its default value, though I believe an unset IFS behaves the same as the default value of IFS ($' \t\n'), however it seems bad practice to be assuming blindly that your code will never be invoked with IFS set to a custom value; (c) another idea is to invoke a subshell: (IFS=$custom; ...) when the subshell exits IFS will return to whatever it was originally. – dubiousjim May 31 '12 at 5:21
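Idea (c) from the comment above might look like this as a minimal sketch (the printf runs inside the subshell, and IFS outside remains untouched):

IN="one;two;three"
(IFS=';'; set -- $IN; printf '> [%s]\n' "$@")
# > [one]
# > [two]
# > [three]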

nicooga , Mar 7, 2016 at 15:32

I just want to have a quick look at the paths to decide where to throw an executable, so I resorted to run ruby -e "puts ENV.fetch('PATH').split(':')" . If you want to stay pure bash won't help but using any scripting language that has a built-in split is easier. – nicooga Mar 7 '16 at 15:32

Jeff , Apr 22 at 17:51

This is kind of a drive-by comment, but since the OP used email addresses as the example, has anyone bothered to answer it in a way that is fully RFC 5322 compliant, namely that any quoted string can appear before the @ which means you're going to need regular expressions or some other kind of parser instead of naive use of IFS or other simplistic splitter functions. – Jeff Apr 22 at 17:51

user2037659 , Apr 26 at 20:15

for x in $(IFS=';';echo $IN); do echo "> [$x]"; doneuser2037659 Apr 26 at 20:15

Johannes Schaub - litb , May 28, 2009 at 2:23

You can set the internal field separator (IFS) variable, and then let it parse into an array. When this happens in a command, then the assignment to IFS only takes place to that single command's environment (to read ). It then parses the input according to the IFS variable value into an array, which we can then iterate over.
IFS=';' read -ra ADDR <<< "$IN"
for i in "${ADDR[@]}"; do
    # process "$i"
done

It will parse one line of items separated by ; , pushing it into an array. Stuff for processing whole of $IN , each time one line of input separated by ; :

 while IFS=';' read -ra ADDR; do
      for i in "${ADDR[@]}"; do
          # process "$i"
      done
 done <<< "$IN"

Chris Lutz , May 28, 2009 at 2:25

This is probably the best way. How long will IFS persist in it's current value, can it mess up my code by being set when it shouldn't be, and how can I reset it when I'm done with it? – Chris Lutz May 28 '09 at 2:25

Johannes Schaub - litb , May 28, 2009 at 3:04

now after the fix applied, only within the duration of the read command :) – Johannes Schaub - litb May 28 '09 at 3:04

lhunath , May 28, 2009 at 6:14

You can read everything at once without using a while loop: read -r -d '' -a addr <<< "$in" # The -d '' is key here, it tells read not to stop at the first newline (which is the default -d) but to continue until EOF or a NULL byte (which only occur in binary data). – lhunath May 28 '09 at 6:14

Charles Duffy , Jul 6, 2013 at 14:39

@LucaBorrione Setting IFS on the same line as the read with no semicolon or other separator, as opposed to in a separate command, scopes it to that command -- so it's always "restored"; you don't need to do anything manually. – Charles Duffy Jul 6 '13 at 14:39

chepner , Oct 2, 2014 at 3:50

@imagineerThis There is a bug involving herestrings and local changes to IFS that requires $IN to be quoted. The bug is fixed in bash 4.3. – chepner Oct 2 '14 at 3:50

palindrom , Mar 10, 2011 at 9:00

Taken from Bash shell script split array :
IN="[email protected];[email protected]"
arrIN=(${IN//;/ })

Explanation:

This construction replaces all occurrences of ';' (the initial // means global replace) in the string IN with ' ' (a single space), then interprets the space-delimited string as an array (that's what the surrounding parentheses do).

The syntax used inside of the curly braces to replace each ';' character with a ' ' character is called Parameter Expansion .

There are some common gotchas:

  1. If the original string has spaces, you will need to use IFS :
    • IFS=':'; arrIN=($IN); unset IFS;
  2. If the original string has spaces and the delimiter is a new line, you can set IFS with:
    • IFS=$'\n'; arrIN=($IN); unset IFS;

Oz123 , Mar 21, 2011 at 18:50

I just want to add: this is the simplest of all, you can access array elements with ${arrIN[1]} (starting from zeros of course) – Oz123 Mar 21 '11 at 18:50

KomodoDave , Jan 5, 2012 at 15:13

Found it: the technique of modifying a variable within a ${} is known as 'parameter expansion'. – KomodoDave Jan 5 '12 at 15:13

qbolec , Feb 25, 2013 at 9:12

Does it work when the original string contains spaces? – qbolec Feb 25 '13 at 9:12

Ethan , Apr 12, 2013 at 22:47

No, I don't think this works when there are also spaces present... it's converting the ',' to ' ' and then building a space-separated array. – Ethan Apr 12 '13 at 22:47

Charles Duffy , Jul 6, 2013 at 14:39

This is a bad approach for other reasons: For instance, if your string contains ;*; , then the * will be expanded to a list of filenames in the current directory. -1 – Charles Duffy Jul 6 '13 at 14:39

Chris Lutz , May 28, 2009 at 2:09

If you don't mind processing them immediately, I like to do this:
for i in $(echo $IN | tr ";" "\n")
do
  # process
done

You could use this kind of loop to initialize an array, but there's probably an easier way to do it. Hope this helps, though.

Chris Lutz , May 28, 2009 at 2:42

You should have kept the IFS answer. It taught me something I didn't know, and it definitely made an array, whereas this just makes a cheap substitute. – Chris Lutz May 28 '09 at 2:42

Johannes Schaub - litb , May 28, 2009 at 2:59

I see. Yeah i find doing these silly experiments, i'm going to learn new things each time i'm trying to answer things. I've edited stuff based on #bash IRC feedback and undeleted :) – Johannes Schaub - litb May 28 '09 at 2:59

lhunath , May 28, 2009 at 6:12

-1, you're obviously not aware of wordsplitting, because it's introducing two bugs in your code. One is when you don't quote $IN and the other is when you pretend a newline is the only delimiter used in wordsplitting. You are iterating over every WORD in IN, not every line, and DEFINITELY not every element delimited by a semicolon, though it may appear to have the side-effect of looking like it works. – lhunath May 28 '09 at 6:12

Johannes Schaub - litb , May 28, 2009 at 17:00

You could change it to echo "$IN" | tr ';' '\n' | while read -r ADDY; do # process "$ADDY"; done to make him lucky, i think :) Note that this will fork, and you can't change outer variables from within the loop (that's why i used the <<< "$IN" syntax) then – Johannes Schaub - litb May 28 '09 at 17:00

mklement0 , Apr 24, 2013 at 14:13

To summarize the debate in the comments: Caveats for general use : the shell applies word splitting and expansions to the string, which may be undesired; just try it with IN="[email protected];[email protected];*;broken apart" . In short: this approach will break if your tokens contain embedded spaces and/or chars such as * that happen to make a token match filenames in the current folder. – mklement0 Apr 24 '13 at 14:13

F. Hauri , Apr 13, 2013 at 14:20

Compatible answer

For this SO question, there are already a lot of different ways to do this in bash . But bash has many special features, so-called bashisms , that work well but won't work in any other shell .

In particular, arrays , associative arrays , and pattern substitution are pure bashisms and may not work under other shells .

On my Debian GNU/Linux , there is a standard shell called dash , but I know many people who like to use ksh .

Finally, in very small environments, there is a special tool called busybox with its own shell interpreter ( ash ).

Requested string

The string sample in SO question is:

IN="[email protected];[email protected]"

As this could be useful with whitespaces and as whitespaces could modify the result of the routine, I prefer to use this sample string:

 IN="[email protected];[email protected];Full Name <[email protected]>"
Split string based on delimiter in bash (version >=4.2)

Under pure bash, we may use arrays and IFS :

var="[email protected];[email protected];Full Name <[email protected]>"
oIFS="$IFS"
IFS=";"
declare -a fields=($var)
IFS="$oIFS"
unset oIFS

IFS=\; read -a fields <<<"$var"

Using this syntax under recent bash doesn't change $IFS for the current session, but only for the current command:

set | grep ^IFS=
IFS=$' \t\n'

Now the string var is split and stored into an array (named fields ):

set | grep ^fields=\\\|^var=
fields=([0]="[email protected]" [1]="[email protected]" [2]="Full Name <[email protected]>")
var='[email protected];[email protected];Full Name <[email protected]>'

We could request for variable content with declare -p :

declare -p var fields
declare -- var="[email protected];[email protected];Full Name <[email protected]>"
declare -a fields=([0]="[email protected]" [1]="[email protected]" [2]="Full Name <[email protected]>")

read is the quickest way to do the split, because there are no forks and no external resources called.

From there, you could use the syntax you already know for processing each field:

for x in "${fields[@]}";do
    echo "> [$x]"
    done
> [[email protected]]
> [[email protected]]
> [Full Name <[email protected]>]

or drop each field after processing (I like this shifting approach):

while [ "$fields" ] ;do
    echo "> [$fields]"
    fields=("${fields[@]:1}")
    done
> [[email protected]]
> [[email protected]]
> [Full Name <[email protected]>]

or even for simple printout (shorter syntax):

printf "> [%s]\n" "${fields[@]}"
> [[email protected]]
> [[email protected]]
> [Full Name <[email protected]>]
Split string based on delimiter in shell

But if you would write something usable under many shells, you have to not use bashisms .

There is a syntax, used in many shells, for splitting a string across first or last occurrence of a substring:

${var#*SubStr}  # will drop begin of string up to first occur of `SubStr`
${var##*SubStr} # will drop begin of string up to last occur of `SubStr`
${var%SubStr*}  # will drop part of string from last occur of `SubStr` to the end
${var%%SubStr*} # will drop part of string from first occur of `SubStr` to the end

(The absence of this syntax from the other answers is the main reason I published this one ;)

As pointed out by Score_Under :

# and % delete the shortest possible matching string, and

## and %% delete the longest possible.

This little sample script works well under bash , dash , ksh , busybox and was tested under Mac OS's bash too:

var="[email protected];[email protected];Full Name <[email protected]>"
while [ "$var" ] ;do
    iter=${var%%;*}
    echo "> [$iter]"
    [ "$var" = "$iter" ] && \
        var='' || \
        var="${var#*;}"
  done
> [[email protected]]
> [[email protected]]
> [Full Name <[email protected]>]

Have fun!

Score_Under , Apr 28, 2015 at 16:58

The # , ## , % , and %% substitutions have what is IMO an easier explanation to remember (for how much they delete): # and % delete the shortest possible matching string, and ## and %% delete the longest possible. – Score_Under Apr 28 '15 at 16:58

sorontar , Oct 26, 2016 at 4:36

The IFS=\; read -a fields <<<"$var" fails on newlines and add a trailing newline. The other solution removes a trailing empty field. – sorontar Oct 26 '16 at 4:36

Eric Chen , Aug 30, 2017 at 17:50

The shell delimiter is the most elegant answer, period. – Eric Chen Aug 30 '17 at 17:50

sancho.s , Oct 4 at 3:42

Could the last alternative be used with a list of field separators set somewhere else? For instance, I mean to use this as a shell script, and pass a list of field separators as a positional parameter. – sancho.s Oct 4 at 3:42

F. Hauri , Oct 4 at 7:47

Yes, in a loop: for sep in "#" "ł" "@" ; do ... var="${var#*$sep}" ...F. Hauri Oct 4 at 7:47

DougW , Apr 27, 2015 at 18:20

I've seen a couple of answers referencing the cut command, but they've all been deleted. It's a little odd that nobody has elaborated on that, because I think it's one of the more useful commands for doing this type of thing, especially for parsing delimited log files.

In the case of splitting this specific example into a bash script array, tr is probably more efficient, but cut can be used, and is more effective if you want to pull specific fields from the middle.

Example:

$ echo "[email protected];[email protected]" | cut -d ";" -f 1
[email protected]
$ echo "[email protected];[email protected]" | cut -d ";" -f 2
[email protected]

You can obviously put that into a loop, and iterate the -f parameter to pull each field independently.

This gets more useful when you have a delimited log file with rows like this:

2015-04-27|12345|some action|an attribute|meta data

cut is very handy to be able to cat this file and select a particular field for further processing.

MisterMiyagi , Nov 2, 2016 at 8:42

Kudos for using cut , it's the right tool for the job! Much clearer than any of those shell hacks. – MisterMiyagi Nov 2 '16 at 8:42

uli42 , Sep 14, 2017 at 8:30

This approach will only work if you know the number of elements in advance; you'd need to program some more logic around it. It also runs an external tool for every element. – uli42 Sep 14 '17 at 8:30

Louis Loudog Trottier , May 10 at 4:20

Exactly what I was looking for, trying to avoid empty strings in a CSV. Now I can pick the exact 'column' value as well. Works with IFS already used in a loop. Better than expected for my situation. – Louis Loudog Trottier May 10 at 4:20

, May 28, 2009 at 10:31

How about this approach:
IN="[email protected];[email protected]" 
set -- "$IN" 
IFS=";"; declare -a Array=($*) 
echo "${Array[@]}" 
echo "${Array[0]}" 
echo "${Array[1]}"

Source

Yzmir Ramirez , Sep 5, 2011 at 1:06

+1 ... but I wouldn't name the variable "Array" ... pet peev I guess. Good solution. – Yzmir Ramirez Sep 5 '11 at 1:06

ata , Nov 3, 2011 at 22:33

+1 ... but the "set" and declare -a are unnecessary. You could as well have used just IFS=";" && Array=($IN)ata Nov 3 '11 at 22:33

Luca Borrione , Sep 3, 2012 at 9:26

+1 Only a side note: shouldn't it be recommendable to keep the old IFS and then restore it? (as shown by stefanB in his edit3) people landing here (sometimes just copying and pasting a solution) might not think about this – Luca Borrione Sep 3 '12 at 9:26

Charles Duffy , Jul 6, 2013 at 14:44

-1: First, @ata is right that most of the commands in this do nothing. Second, it uses word-splitting to form the array, and doesn't do anything to inhibit glob-expansion when doing so (so if you have glob characters in any of the array elements, those elements are replaced with matching filenames). – Charles Duffy Jul 6 '13 at 14:44

John_West , Jan 8, 2016 at 12:29

Suggest to use $'...' : IN=$'[email protected];[email protected];bet <d@\ns* kl.com>' . Then echo "${Array[2]}" will print a string with newline. set -- "$IN" is also necessary in this case. Yes, to prevent glob expansion, the solution should include set -f . – John_West Jan 8 '16 at 12:29

Steven Lizarazo , Aug 11, 2016 at 20:45

This worked for me:
string="1;2"
echo $string | cut -d';' -f1 # output is 1
echo $string | cut -d';' -f2 # output is 2

Pardeep Sharma , Oct 10, 2017 at 7:29

this is short and sweet :) – Pardeep Sharma Oct 10 '17 at 7:29

space earth , Oct 17, 2017 at 7:23

Thanks...Helped a lot – space earth Oct 17 '17 at 7:23

mojjj , Jan 8 at 8:57

cut works only with a single char as delimiter. – mojjj Jan 8 at 8:57

lothar , May 28, 2009 at 2:12

echo "[email protected];[email protected]" | sed -e 's/;/\n/g'
[email protected]
[email protected]

Luca Borrione , Sep 3, 2012 at 10:08

-1 what if the string contains spaces? for example IN="this is first line; this is second line" arrIN=( $( echo "$IN" | sed -e 's/;/\n/g' ) ) will produce an array of 8 elements in this case (an element for each word space separated), rather than 2 (an element for each line semi colon separated) – Luca Borrione Sep 3 '12 at 10:08

lothar , Sep 3, 2012 at 17:33

@Luca No the sed script creates exactly two lines. What creates the multiple entries for you is when you put it into a bash array (which splits on white space by default) – lothar Sep 3 '12 at 17:33

Luca Borrione , Sep 4, 2012 at 7:09

That's exactly the point: the OP needs to store entries into an array to loop over it, as you can see in his edits. I think your (good) answer missed to mention to use arrIN=( $( echo "$IN" | sed -e 's/;/\n/g' ) ) to achieve that, and to advice to change IFS to IFS=$'\n' for those who land here in the future and needs to split a string containing spaces. (and to restore it back afterwards). :) – Luca Borrione Sep 4 '12 at 7:09

lothar , Sep 4, 2012 at 16:55

@Luca Good point. However the array assignment was not in the initial question when I wrote up that answer. – lothar Sep 4 '12 at 16:55

Ashok , Sep 8, 2012 at 5:01

This also works:
IN="[email protected];[email protected]"
echo ADD1=`echo $IN | cut -d \; -f 1`
echo ADD2=`echo $IN | cut -d \; -f 2`

Be careful, this solution is not always correct. In case you pass "[email protected]" only, it will assign it to both ADD1 and ADD2.

fersarr , Mar 3, 2016 at 17:17

You can use -s to avoid the mentioned problem: superuser.com/questions/896800/ "-f, --fields=LIST select only these fields; also print any line that contains no delimiter character, unless the -s option is specified" – fersarr Mar 3 '16 at 17:17

Tony , Jan 14, 2013 at 6:33

I think AWK is the best and most efficient command to resolve your problem. AWK is included by default in almost every Linux distribution.
echo "[email protected];[email protected]" | awk -F';' '{print $1,$2}'

will give

[email protected] [email protected]

Of course your can store each email address by redefining the awk print field.

Jaro , Jan 7, 2014 at 21:30

Or even simpler: echo "[email protected];[email protected]" | awk 'BEGIN{RS=";"} {print}' – Jaro Jan 7 '14 at 21:30

Aquarelle , May 6, 2014 at 21:58

@Jaro This worked perfectly for me when I had a string with commas and needed to reformat it into lines. Thanks. – Aquarelle May 6 '14 at 21:58

Eduardo Lucio , Aug 5, 2015 at 12:59

It worked in this scenario -> "echo "$SPLIT_0" | awk -F' inode=' '{print $1}'"! I had problems when trying to use strings (" inode=") instead of characters (";"). $1, $2, $3, $4 are set as positions in an array! If there is a way of setting an array... better! Thanks! – Eduardo Lucio Aug 5 '15 at 12:59

Tony , Aug 6, 2015 at 2:42

@EduardoLucio, what I'm thinking about is maybe you can first replace your delimiter inode= with ; , for example by sed -i 's/inode\=/\;/g' your_file_to_process , then define -F';' when applying awk ; hope that can help you. – Tony Aug 6 '15 at 2:42

nickjb , Jul 5, 2011 at 13:41

A different take on Darron's answer , this is how I do it:
IN="[email protected];[email protected]"
read ADDR1 ADDR2 <<<$(IFS=";"; echo $IN)

ColinM , Sep 10, 2011 at 0:31

This doesn't work. – ColinM Sep 10 '11 at 0:31

nickjb , Oct 6, 2011 at 15:33

I think it does! Run the commands above and then "echo $ADDR1 ... $ADDR2" and i get "[email protected] ... [email protected]" output – nickjb Oct 6 '11 at 15:33

Nick , Oct 28, 2011 at 14:36

This worked REALLY well for me... I used it to itterate over an array of strings which contained comma separated DB,SERVER,PORT data to use mysqldump. – Nick Oct 28 '11 at 14:36

dubiousjim , May 31, 2012 at 5:28

Diagnosis: the IFS=";" assignment exists only in the $(...; echo $IN) subshell; this is why some readers (including me) initially think it won't work. I assumed that all of $IN was getting slurped up by ADDR1. But nickjb is correct; it does work. The reason is that echo $IN command parses its arguments using the current value of $IFS, but then echoes them to stdout using a space delimiter, regardless of the setting of $IFS. So the net effect is as though one had called read ADDR1 ADDR2 <<< "[email protected] [email protected]" (note the input is space-separated not ;-separated). – dubiousjim May 31 '12 at 5:28

sorontar , Oct 26, 2016 at 4:43

This fails on spaces and newlines, and also expand wildcards * in the echo $IN with an unquoted variable expansion. – sorontar Oct 26 '16 at 4:43

gniourf_gniourf , Jun 26, 2014 at 9:11

In Bash, a bullet proof way, that will work even if your variable contains newlines:
IFS=';' read -d '' -ra array < <(printf '%s;\0' "$in")

Look:

$ in=$'one;two three;*;there is\na newline\nin this field'
$ IFS=';' read -d '' -ra array < <(printf '%s;\0' "$in")
$ declare -p array
declare -a array='([0]="one" [1]="two three" [2]="*" [3]="there is
a newline
in this field")'

The trick for this to work is to use the -d option of read (delimiter) with an empty delimiter, so that read is forced to read everything it's fed. And we feed read with exactly the content of the variable in , with no trailing newline thanks to printf . Note that we're also putting the delimiter in printf to ensure that the string passed to read has a trailing delimiter. Without it, read would trim potential trailing empty fields:

$ in='one;two;three;'    # there's an empty field
$ IFS=';' read -d '' -ra array < <(printf '%s;\0' "$in")
$ declare -p array
declare -a array='([0]="one" [1]="two" [2]="three" [3]="")'

the trailing empty field is preserved.


Update for Bash≥4.4

Since Bash 4.4, the builtin mapfile (aka readarray ) supports the -d option to specify a delimiter. Hence another canonical way is:

mapfile -d ';' -t array < <(printf '%s;' "$in")
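A quick check of the result, assuming bash >= 4.4:

$ in='one;two three;*'
$ mapfile -d ';' -t array < <(printf '%s;' "$in")
$ declare -p array
declare -a array=([0]="one" [1]="two three" [2]="*")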

John_West , Jan 8, 2016 at 12:10

I found it to be the rare solution on that list that works correctly with \n , spaces and * simultaneously. Also, no loops; the array variable is accessible in the shell after execution (contrary to the highest upvoted answer). Note, in=$'...' , it does not work with double quotes. I think it needs more upvotes. – John_West Jan 8 '16 at 12:10

Darron , Sep 13, 2010 at 20:10

How about this one liner, if you're not using arrays:
IFS=';' read ADDR1 ADDR2 <<<$IN

dubiousjim , May 31, 2012 at 5:36

Consider using read -r ... to ensure that, for example, the two characters "\t" in the input end up as the same two characters in your variables (instead of a single tab char). – dubiousjim May 31 '12 at 5:36

Luca Borrione , Sep 3, 2012 at 10:07

-1 This is not working here (ubuntu 12.04). Adding echo "ADDR1 $ADDR1"\n echo "ADDR2 $ADDR2" to your snippet will output ADDR1 [email protected] [email protected]\nADDR2 (\n is newline) – Luca Borrione Sep 3 '12 at 10:07

chepner , Sep 19, 2015 at 13:59

This is probably due to a bug involving IFS and here strings that was fixed in bash 4.3. Quoting $IN should fix it. (In theory, $IN is not subject to word splitting or globbing after it expands, meaning the quotes should be unnecessary. Even in 4.3, though, there's at least one bug remaining--reported and scheduled to be fixed--so quoting remains a good idea.) – chepner Sep 19 '15 at 13:59

sorontar , Oct 26, 2016 at 4:55

This breaks if $in contain newlines even if $IN is quoted. And adds a trailing newline. – sorontar Oct 26 '16 at 4:55

kenorb , Sep 11, 2015 at 20:54

Here is a clean 3-liner:
in="foo@bar;bizz@buzz;fizz@buzz;buzz@woof"
IFS=';' list=($in)
for item in "${list[@]}"; do echo $item; done

where IFS delimits words based on the separator and () is used to create an array . Then [@] is used to return each item as a separate word.

If you've any code after that, you also need to restore $IFS , e.g. unset IFS .

sorontar , Oct 26, 2016 at 5:03

The use of $in unquoted allows wildcards to be expanded. – sorontar Oct 26 '16 at 5:03

user2720864 , Sep 24 at 13:46

+ for the unset command – user2720864 Sep 24 at 13:46

Emilien Brigand , Aug 1, 2016 at 13:15

Without setting the IFS

If you just have one colon you can do that:

a="foo:bar"
b=${a%:*}
c=${a##*:}

you will get:

b = foo
c = bar

Victor Choy , Sep 16, 2015 at 3:34

There is a simple and smart way like this:
echo "add:sfff" | xargs -d: -i  echo {}

But you must use gnu xargs, BSD xargs cant support -d delim. If you use apple mac like me. You can install gnu xargs :

brew install findutils

then

echo "add:sfff" | gxargs -d: -i  echo {}

Halle Knast , May 24, 2017 at 8:42

The following Bash/zsh function splits its first argument on the delimiter given by the second argument:
split() {
    local string="$1"
    local delimiter="$2"
    if [ -n "$string" ]; then
        local part
        while read -d "$delimiter" part; do
            echo $part
        done <<< "$string"
        echo $part
    fi
}

For instance, the command

$ split 'a;b;c' ';'

yields

a
b
c

This output may, for instance, be piped to other commands. Example:

$ split 'a;b;c' ';' | cat -n
1   a
2   b
3   c

Compared to the other solutions given, this one has the advantage that it does not modify IFS and works in both Bash and zsh.

If desired, the function may be put into a script as follows:

#!/usr/bin/env bash

split() {
    # ...
}

split "$@"

sandeepkunkunuru , Oct 23, 2017 at 16:10

works and neatly modularized. – sandeepkunkunuru Oct 23 '17 at 16:10

Prospero , Sep 25, 2011 at 1:09

This is the simplest way to do it.
spo='one;two;three'
OIFS=$IFS
IFS=';'
spo_array=($spo)
IFS=$OIFS
echo ${spo_array[*]}

rashok , Oct 25, 2016 at 12:41

IN="[email protected];[email protected]"
IFS=';'
read -a IN_arr <<< "${IN}"
for entry in "${IN_arr[@]}"
do
    echo $entry
done

Output

[email protected]
[email protected]

System : Ubuntu 12.04.1

codeforester , Jan 2, 2017 at 5:37

IFS is not getting set in the specific context of read here and hence it can upset rest of the code, if any. – codeforester Jan 2 '17 at 5:37

shuaihanhungry , Jan 20 at 15:54

you can apply awk to many situations
echo "[email protected];[email protected]"|awk -F';' '{printf "%s\n%s\n", $1, $2}'

also you can use this

echo "[email protected];[email protected]"|awk -F';' '{print $1,$2}' OFS="\n"

ghost , Apr 24, 2013 at 13:13

If there is no space, why not this?
IN="[email protected];[email protected]"
arr=(`echo $IN | tr ';' ' '`)

echo ${arr[0]}
echo ${arr[1]}

eukras , Oct 22, 2012 at 7:10

There are some cool answers here (errator esp.), but for something analogous to split in other languages -- which is what I took the original question to mean -- I settled on this:
IN="[email protected];[email protected]"
declare -a a="(${IN/;/ })";

Now ${a[0]} , ${a[1]} , etc, are as you would expect. Use ${#a[*]} for number of terms. Or to iterate, of course:

for i in ${a[*]}; do echo $i; done

IMPORTANT NOTE:

This works in cases where there are no spaces to worry about, which solved my problem, but may not solve yours. Go with the $IFS solution(s) in that case.

olibre , Oct 7, 2013 at 13:33

Does not work when IN contains more than two e-mail addresses. Please refer to same idea (but fixed) at palindrom's answerolibre Oct 7 '13 at 13:33

sorontar , Oct 26, 2016 at 5:14

Better use ${IN//;/ } (double slash) to make it also work with more than two values. Beware that any wildcard ( *?[ ) will be expanded. And a trailing empty field will be discarded. – sorontar Oct 26 '16 at 5:14

jeberle , Apr 30, 2013 at 3:10

Use the set built-in to load up the $@ array:
IN="[email protected];[email protected]"
IFS=';'; set $IN; IFS=$' \t\n'

Then, let the party begin:

echo $#
for a; do echo $a; done
ADDR1=$1 ADDR2=$2

sorontar , Oct 26, 2016 at 5:17

Better use set -- $IN to avoid some issues with "$IN" starting with dash. Still, the unquoted expansion of $IN will expand wildcards ( *?[ ). – sorontar Oct 26 '16 at 5:17

NevilleDNZ , Sep 2, 2013 at 6:30

Two bourne-ish alternatives where neither require bash arrays:

Case 1 : Keep it nice and simple: Use a NewLine as the Record-Separator... eg.

IN="[email protected]
[email protected]"

while read i; do
  # process "$i" ... eg.
    echo "[email:$i]"
done <<< "$IN"

Note: in this first case no sub-process is forked to assist with list manipulation.

Idea: Maybe it is worth using NL extensively internally , and only converting to a different RS when generating the final result externally .

Case 2 : Using a ";" as a record separator... eg.

NL="
" IRS=";" ORS=";"

conv_IRS() {
  exec tr "$1" "$NL"
}

conv_ORS() {
  exec tr "$NL" "$1"
}

IN="[email protected];[email protected]"
IN="$(conv_IRS ";" <<< "$IN")"

while read i; do
  # process "$i" ... eg.
    echo -n "[email:$i]$ORS"
done <<< "$IN"

In both cases a sub-list can be composed within the loop and is persistent after the loop has completed. This is useful when manipulating lists in memory, instead of storing lists in files. {p.s. keep calm and carry on B-) }

fedorqui , Jan 8, 2015 at 10:21

Apart from the fantastic answers that were already provided, if it is just a matter of printing out the data you may consider using awk :
awk -F";" '{for (i=1;i<=NF;i++) printf("> [%s]\n", $i)}' <<< "$IN"

This sets the field separator to ; , so that it can loop through the fields with a for loop and print accordingly.

Test
$ IN="[email protected];[email protected]"
$ awk -F";" '{for (i=1;i<=NF;i++) printf("> [%s]\n", $i)}' <<< "$IN"
> [[email protected]]
> [[email protected]]

With another input:

$ awk -F";" '{for (i=1;i<=NF;i++) printf("> [%s]\n", $i)}' <<< "a;b;c   d;e_;f"
> [a]
> [b]
> [c   d]
> [e_]
> [f]

18446744073709551615 , Feb 20, 2015 at 10:49

In Android shell, most of the proposed methods just do not work:
$ IFS=':' read -ra ADDR <<<"$PATH"                             
/system/bin/sh: can't create temporary file /sqlite_stmt_journals/mksh.EbNoR10629: No such file or directory

What does work is:

$ for i in ${PATH//:/ }; do echo $i; done
/sbin
/vendor/bin
/system/sbin
/system/bin
/system/xbin

where // means global replacement.

sorontar , Oct 26, 2016 at 5:08

Fails if any part of $PATH contains spaces (or newlines). Also expands wildcards (asterisk *, question mark ? and braces [ ]). – sorontar Oct 26 '16 at 5:08

Eduardo Lucio , Apr 4, 2016 at 19:54

Okay guys!

Here's my answer!

DELIMITER_VAL='='

read -d '' F_ABOUT_DISTRO_R <<"EOF"
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=14.04
DISTRIB_CODENAME=trusty
DISTRIB_DESCRIPTION="Ubuntu 14.04.4 LTS"
NAME="Ubuntu"
VERSION="14.04.4 LTS, Trusty Tahr"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 14.04.4 LTS"
VERSION_ID="14.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
EOF

SPLIT_NOW=$(awk -F$DELIMITER_VAL '{for(i=1;i<=NF;i++){printf "%s\n", $i}}' <<<"${F_ABOUT_DISTRO_R}")
while read -r line; do
   SPLIT+=("$line")
done <<< "$SPLIT_NOW"
for i in "${SPLIT[@]}"; do
    echo "$i"
done

Why this approach is "the best" for me?

Because of two reasons:

  1. You do not need to escape the delimiter;
  2. You will not have problems with blank spaces . The values will be properly separated in the array!

[]'s

gniourf_gniourf , Jan 30, 2017 at 8:26

FYI, /etc/os-release and /etc/lsb-release are meant to be sourced, and not parsed. So your method is really wrong. Moreover, you're not quite answering the question about splitting a string on a delimiter. – gniourf_gniourf Jan 30 '17 at 8:26

Michael Hale , Jun 14, 2012 at 17:38

A one-liner to split a string separated by ';' into an array is:
IN="[email protected];[email protected]"
ADDRS=( $(IFS=";" echo "$IN") )
echo ${ADDRS[0]}
echo ${ADDRS[1]}

This only sets IFS in a subshell, so you don't have to worry about saving and restoring its value.

Luca Borrione , Sep 3, 2012 at 10:04

-1 this doesn't work here (ubuntu 12.04). it prints only the first echo with all $IN value in it, while the second is empty. you can see it if you put echo "0: "${ADDRS[0]}\n echo "1: "${ADDRS[1]} the output is 0: [email protected];[email protected]\n 1: (\n is new line) – Luca Borrione Sep 3 '12 at 10:04

Luca Borrione , Sep 3, 2012 at 10:05

please refer to nickjb's answer at for a working alternative to this idea stackoverflow.com/a/6583589/1032370 – Luca Borrione Sep 3 '12 at 10:05

Score_Under , Apr 28, 2015 at 17:09

-1, 1. IFS isn't being set in that subshell (it's being passed to the environment of "echo", which is a builtin, so nothing is happening anyway). 2. $IN is quoted so it isn't subject to IFS splitting. 3. The process substitution is split by whitespace, but this may corrupt the original data. – Score_Under Apr 28 '15 at 17:09

ajaaskel , Oct 10, 2014 at 11:33

IN='[email protected];[email protected];Charlie Brown <[email protected];!"#$%&/()[]{}*? are no problem;simple is beautiful :-)'
set -f
oldifs="$IFS"
IFS=';'; arrayIN=($IN)
IFS="$oldifs"
for i in "${arrayIN[@]}"; do
echo "$i"
done
set +f

Output:

[email protected]
[email protected]
Charlie Brown <[email protected]
!"#$%&/()[]{}*? are no problem
simple is beautiful :-)

Explanation: A simple assignment using parentheses () converts a semicolon-separated list into an array, provided you have the correct IFS while doing that. A standard FOR loop handles individual items in that array as usual. Notice that the list given for the IN variable must be "hard" quoted, that is, with single ticks.

IFS must be saved and restored since Bash does not treat an assignment the same way as a command. An alternate workaround is to wrap the assignment inside a function and call that function with a modified IFS. In that case separate saving/restoring of IFS is not needed. Thanks to "Bize" for pointing that out.
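That function workaround could look like the following sketch (the function name split_in is mine, not from the answer; set -f mirrors the original's globbing guard):

split_in() {
    local IFS=';'    # scoped to the function, so no save/restore dance
    set -f           # globbing still has to be disabled, as in the original
    arrayIN=($1)     # word splitting on ';' happens here
    set +f
}

split_in "$IN"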

gniourf_gniourf , Feb 20, 2015 at 16:45

!"#$%&/()[]{}*? are no problem well... not quite: []*? are glob characters. So what about creating this directory and file: `mkdir '!"#$%&'; touch '!"#$%&/()[]{} got you hahahaha - are no problem' and running your command? simple may be beautiful, but when it's broken, it's broken. – gniourf_gniourf Feb 20 '15 at 16:45

ajaaskel , Feb 25, 2015 at 7:20

@gniourf_gniourf The string is stored in a variable. Please see the original question. – ajaaskel Feb 25 '15 at 7:20

gniourf_gniourf , Feb 25, 2015 at 7:26

@ajaaskel you didn't fully understand my comment. Go in a scratch directory and issue these commands: mkdir '!"#$%&'; touch '!"#$%&/()[]{} got you hahahaha - are no problem' . They will only create a directory and a file, with weird looking names, I must admit. Then run your commands with the exact IN you gave: IN='[email protected];[email protected];Charlie Brown <[email protected];!"#$%&/()[]{}*? are no problem;simple is beautiful :-)' . You'll see that you won't get the output you expect. Because you're using a method subject to pathname expansions to split your string. – gniourf_gniourf Feb 25 '15 at 7:26

gniourf_gniourf , Feb 25, 2015 at 7:29

This is to demonstrate that the characters * , ? , [...] and even, if extglob is set, !(...) , @(...) , ?(...) , +(...) are problems with this method! – gniourf_gniourf Feb 25 '15 at 7:29

ajaaskel , Feb 26, 2015 at 15:26

@gniourf_gniourf Thanks for detailed comments on globbing. I adjusted the code to have globbing off. My point was however just to show that rather simple assignment can do the splitting job. – ajaaskel Feb 26 '15 at 15:26

> , Dec 19, 2013 at 21:39

Maybe not the most elegant solution, but works with * and spaces:
IN="bla@so me.com;*;[email protected]"
for i in `delims=${IN//[^;]}; seq 1 $((${#delims} + 1))`
do
   echo "> [`echo $IN | cut -d';' -f$i`]"
done

Outputs

> [bla@so me.com]
> [*]
> [[email protected]]

Other example (delimiters at beginning and end):

IN=";bla@so me.com;*;[email protected];"
> []
> [bla@so me.com]
> [*]
> [[email protected]]
> []

Basically it removes every character other than ; , making delims e.g. ;;; . Then it does a for loop from 1 to the number of delimiters, as counted by ${#delims} . The final step is to safely get the $i th part using cut .

[Nov 08, 2018] Utilizing multi core for tar+gzip-bzip compression-decompression

Nov 08, 2018 | stackoverflow.com



user1118764 , Sep 7, 2012 at 6:58

I normally compress using tar zcvf and decompress using tar zxvf (using gzip due to habit).

I've recently gotten a quad core CPU with hyperthreading, so I have 8 logical cores, and I notice that many of the cores are unused during compression/decompression.

Is there any way I can utilize the unused cores to make it faster?

Warren Severin , Nov 13, 2017 at 4:37

The solution proposed by Xiong Chiamiov above works beautifully. I had just backed up my laptop with .tar.bz2 and it took 132 minutes using only one cpu thread. Then I compiled and installed tar from source: gnu.org/software/tar I included the options mentioned in the configure step: ./configure --with-gzip=pigz --with-bzip2=lbzip2 --with-lzip=plzip I ran the backup again and it took only 32 minutes. That's better than 4X improvement! I watched the system monitor and it kept all 4 cpus (8 threads) flatlined at 100% the whole time. THAT is the best solution. – Warren Severin Nov 13 '17 at 4:37

Mark Adler , Sep 7, 2012 at 14:48

You can use pigz instead of gzip, which does gzip compression on multiple cores. Instead of using the -z option, you would pipe it through pigz:
tar cf - paths-to-archive | pigz > archive.tar.gz

By default, pigz uses the number of available cores, or eight if it could not query that. You can ask for more with -p n, e.g. -p 32. pigz has the same options as gzip, so you can request better compression with -9. E.g.

tar cf - paths-to-archive | pigz -9 -p 32 > archive.tar.gz

user788171 , Feb 20, 2013 at 12:43

How do you use pigz to decompress in the same fashion? Or does it only work for compression? – user788171 Feb 20 '13 at 12:43

Mark Adler , Feb 20, 2013 at 16:18

pigz does use multiple cores for decompression, but only with limited improvement over a single core. The deflate format does not lend itself to parallel decompression. The decompression portion must be done serially. The other cores for pigz decompression are used for reading, writing, and calculating the CRC. When compressing on the other hand, pigz gets close to a factor of n improvement with n cores. – Mark Adler Feb 20 '13 at 16:18
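For example, decompression is the mirror image of the compression pipe:

pigz -dc archive.tar.gz | tar xf -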

Garrett , Mar 1, 2014 at 7:26

The hyphen here is stdout (see this page ). – Garrett Mar 1 '14 at 7:26

Mark Adler , Jul 2, 2014 at 21:29

Yes. 100% compatible in both directions. – Mark Adler Jul 2 '14 at 21:29

Mark Adler , Apr 23, 2015 at 5:23

There is effectively no CPU time spent tarring, so it wouldn't help much. The tar format is just a copy of the input file with header blocks in between files. – Mark Adler Apr 23 '15 at 5:23

Jen , Jun 14, 2013 at 14:34

You can also use the tar flag "--use-compress-program=" to tell tar what compression program to use.

For example use:

tar -c --use-compress-program=pigz -f tar.file dir_to_zip

ranman , Nov 13, 2013 at 10:01

This is an awesome little nugget of knowledge and deserves more upvotes. I had no idea this option even existed and I've read the man page a few times over the years. – ranman Nov 13 '13 at 10:01

Valerio Schiavoni , Aug 5, 2014 at 22:38

Unfortunately by doing so the concurrent feature of pigz is lost. You can see for yourself by executing that command and monitoring the load on each of the cores. – Valerio Schiavoni Aug 5 '14 at 22:38

bovender , Sep 18, 2015 at 10:14

@ValerioSchiavoni: Not here, I get full load on all 4 cores (Ubuntu 15.04 'Vivid'). – bovender Sep 18 '15 at 10:14

Valerio Schiavoni , Sep 28, 2015 at 23:41

On compress or on decompress ? – Valerio Schiavoni Sep 28 '15 at 23:41

Offenso , Jan 11, 2017 at 17:26

I prefer tar - dir_to_zip | pv | pigz > tar.file . pv helps me estimate; you can skip it. But still it's easier to write and remember. – Offenso Jan 11 '17 at 17:26

Maxim Suslov , Dec 18, 2014 at 7:31

Common approach

There is option for tar program:

-I, --use-compress-program PROG
      filter through PROG (must accept -d)

You can use multithread version of archiver or compressor utility.

Most popular multithread archivers are pigz (instead of gzip) and pbzip2 (instead of bzip2). For instance:

$ tar -I pbzip2 -cf OUTPUT_FILE.tar.bz2 paths_to_archive
$ tar --use-compress-program=pigz -cf OUTPUT_FILE.tar.gz paths_to_archive

Archiver must accept -d. If your replacement utility doesn't have this parameter and/or you need to specify additional parameters, then use pipes (add parameters if necessary):

$ tar cf - paths_to_archive | pbzip2 > OUTPUT_FILE.tar.bz2
$ tar cf - paths_to_archive | pigz > OUTPUT_FILE.tar.gz

Input and output of the singlethreaded and multithreaded versions are compatible. You can compress using the multithreaded version and decompress using the singlethreaded version and vice versa.
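Extraction can use the same -I mechanism; since tar passes -d to the compress program when extracting (see the help text above), the multithreaded replacements work directly:

$ tar -I pigz -xf OUTPUT_FILE.tar.gz
$ tar -I pbzip2 -xf OUTPUT_FILE.tar.bz2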

p7zip

For p7zip for compression you need a small shell script like the following:

#!/bin/sh
case $1 in
  -d) 7za -txz -si -so e;;
   *) 7za -txz -si -so a .;;
esac 2>/dev/null

Save it as 7zhelper.sh. Here the example of usage:

$ tar -I 7zhelper.sh -cf OUTPUT_FILE.tar.7z paths_to_archive
$ tar -I 7zhelper.sh -xf OUTPUT_FILE.tar.7z
xz

Regarding multithreaded XZ support. If you are running version 5.2.0 or above of XZ Utils, you can utilize multiple cores for compression by setting -T or --threads to an appropriate value via the environmental variable XZ_DEFAULTS (e.g. XZ_DEFAULTS="-T 0" ).

This is a fragment of man for 5.1.0alpha version:

Multithreaded compression and decompression are not implemented yet, so this option has no effect for now.

However this will not work for decompression of files that haven't also been compressed with threading enabled. From man for version 5.2.2:

Threaded decompression hasn't been implemented yet. It will only work on files that contain multiple blocks with size information in block headers. All files compressed in multi-threaded mode meet this condition, but files compressed in single-threaded mode don't even if --block-size=size is used.
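For example, to compress with as many threads as there are cores through tar's -J (assuming XZ Utils >= 5.2.0):

XZ_DEFAULTS="-T 0" tar -Jcf archive.tar.xz paths-to-archive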

Recompiling with replacement

If you build tar from sources, then you can recompile with parameters

--with-gzip=pigz
--with-bzip2=lbzip2
--with-lzip=plzip

After recompiling tar with these options you can check the output of tar's help:

$ tar --help | grep "lbzip2\|plzip\|pigz"
  -j, --bzip2                filter the archive through lbzip2
      --lzip                 filter the archive through plzip
  -z, --gzip, --gunzip, --ungzip   filter the archive through pigz

> , Apr 28, 2015 at 20:41

This is indeed the best answer. I'll definitely rebuild my tar! – user1985657 Apr 28 '15 at 20:41

mpibzip2 , Apr 28, 2015 at 20:57

I just found pbzip2 and mpibzip2 . mpibzip2 looks very promising for clusters or if you have a laptop and a multicore desktop computer for instance. – user1985657 Apr 28 '15 at 20:57

oᴉɹǝɥɔ , Jun 10, 2015 at 17:39

This is a great and elaborate answer. It may be good to mention that multithreaded compression (e.g. with pigz ) is only enabled when it reads from the file. Processing STDIN may in fact be slower. – oᴉɹǝɥɔ Jun 10 '15 at 17:39

selurvedu , May 26, 2016 at 22:13

Plus 1 for the xz option. It's the simplest, yet effective approach. – selurvedu May 26 '16 at 22:13

panticz.de , Sep 1, 2014 at 15:02

You can use the shortcut -I for tar's --use-compress-program switch, and invoke pbzip2 for bzip2 compression on multiple cores:
tar -I pbzip2 -cf OUTPUT_FILE.tar.bz2 DIRECTORY_TO_COMPRESS/

einpoklum , Feb 11, 2017 at 15:59

A nice TL;DR for @MaximSuslov's answer . – einpoklum Feb 11 '17 at 15:59

,

If you want to have more flexibility with filenames and compression options, you can use:
find /my/path/ -type f -name "*.sql" -o -name "*.log" -exec \
tar -P --transform='s@/my/path/@@g' -cf - {} + | \
pigz -9 -p 4 > myarchive.tar.gz
Step 1: find

find /my/path/ -type f -name "*.sql" -o -name "*.log" -exec

This command will look for the files you want to archive, in this case /my/path/*.sql and /my/path/*.log . Add as many -o -name "pattern" as you want.

-exec will execute the next command using the results of find : tar

Step 2: tar

tar -P --transform='s@/my/path/@@g' -cf - {} +

--transform is a simple string replacement parameter. It will strip the path of the files from the archive so the tarball's root becomes the current directory when extracting. Note that you can't use -C option to change directory as you'll lose benefits of find : all files of the directory would be included.

-P tells tar to use absolute paths, so it doesn't trigger the warning "Removing leading `/' from member names". The leading '/' will be removed by --transform anyway.

-cf - tells tar to write the archive to standard output (the actual file name is given later, after pigz)

{} + passes every file that find found previously

Step 3: pigz

pigz -9 -p 4

Use as many parameters as you want. In this case -9 is the compression level and -p 4 is the number of cores dedicated to compression. If you run this on a heavily loaded webserver, you probably don't want to use all available cores.

Step 4: archive name

> myarchive.tar.gz

Finally.

[Oct 29, 2018] Getting all the matches with 'grep -f' option

Perverted example, but interesting question.
Oct 29, 2018 | stackoverflow.com

Arturo ,Mar 24, 2017 at 8:59

I would like to find all the matches of the text I have in one file ('file1.txt') that are found in another file ('file2.txt') using the grep option -f, which tells grep to read the expressions to be found from a file.

'file1.txt'

a

a

'file2.txt'

a

When I run the command:

grep -f file1.txt file2.txt -w

I get the output 'a' only once; instead I would like to get it twice, because it occurs twice in my 'file1.txt' file. Is there a way to make grep (or any other unix/linux tool) output a match for each line it reads? Thanks in advance. Arturo

RomanPerekhrest ,Mar 24, 2017 at 9:02

the matches of the text - some exact text? should it compare line to line? – RomanPerekhrest Mar 24 '17 at 9:02

Arturo ,Mar 24, 2017 at 9:04

Yes it contains exact match. I added the -w options, following your input. Yes, it is a comparison line by line. – Arturo Mar 24 '17 at 9:04

Remko ,Mar 24, 2017 at 9:19

Grep works as designed, giving only one output line. You could use another approach:
while IFS= read -r pattern; do
    grep -e "$pattern" file2.txt    # quote the pattern so it is not word-split or glob-expanded
done < file1.txt

This would use every line in file1.txt as a pattern for the grep, thus resulting in the output you're looking for.

Arturo ,Mar 24, 2017 at 9:30

That did the trick!. Thank you. And it is even much faster than my previous grep command. – Arturo Mar 24 '17 at 9:30

ar7 ,Mar 24, 2017 at 9:12

When you use
grep -f pattern.txt file.txt

It means match the pattern found in pattern.txt in the file file.txt .

It is giving you only one output because that is all there is in the second file.

Try interchanging the files,

grep -f file2.txt file1.txt -w

Does this answer your question?

Arturo ,Mar 24, 2017 at 9:17

I understand that, but still I would like to find a way to print a match each time a pattern (even a repeated one) from 'pattern.txt' is found in 'file.txt'. Even a tool or a script rather then 'grep -f' would suffice. – Arturo Mar 24 '17 at 9:17

[Oct 23, 2018] To switch from vertical split to horizontal split fast in Vim

Nov 24, 2013 | stackoverflow.com

ДМИТРИЙ МАЛИКОВ, Nov 24, 2013 at 7:55

How can you switch your current windows from horizontal split to vertical split and vice versa in Vim?

I did that a moment ago by accident but I cannot find the key again.

Mark Rushakoff

Vim mailing list says (re-formatted for better readability):

  • To change two vertically split windows to horizontal split: Ctrl - W t Ctrl - W K
  • Horizontally to vertically: Ctrl - W t Ctrl - W H

Explanations:

  • Ctrl - W t -- makes the first (topleft) window current
  • Ctrl - W K -- moves the current window to full-width at the very top
  • Ctrl - W H -- moves the current window to full-height at far left

Note that the t is lowercase, and the K and H are uppercase.

Also, with only two windows, it seems like you can drop the Ctrl - W t part because if you're already in one of only two windows, what's the point of making it current?

Too much php Aug 13 '09 at 2:17

So if you have two windows split horizontally, and you are in the lower window, you just use ^WL

Alex Hart Dec 7 '12 at 14:10

There are a ton of interesting ^w commands (b, w, etc)

holms Feb 28 '13 at 9:07

somehow doesn't work for me.. =/

Lambart Mar 26 at 19:34

Just toggle your NERDTree panel closed before 'rotating' the splits, then toggle it back open. :NERDTreeToggle (I have it mapped to a function key for convenience).

xxx Feb 19 '13 at 20:26

^w followed by capital H , J , K or L will move the current window to the far left , bottom , top or right respectively like normal cursor navigation.

The lower case equivalents move focus instead of moving the window.

respectTheCode, Jul 21 '13 at 9:55

Wow, cool! Thanks! :-) – infous Feb 6 at 8:46

It's much better since users use hjkl to move between buffers. – Afshin Mehrabani

In VIM, take a look at the following to see different alternatives for what you might have done:

:help opening-window

For instance:

Ctrl - W s   (split the current window horizontally)
Ctrl - W o   (make the current window the only one on screen)
Ctrl - W v   (split the current window vertically)
Ctrl - W o   (back to a single window)
Ctrl - W s   (and split it horizontally again)

Anon, Apr 29 at 21:45

The command ^W-o is great! I did not know it. – Masi Aug 13 '09 at 2:20

The following ex commands will (re-)split any number of windows: :ball splits all buffers horizontally, and :vertical ball splits them vertically.

If there are hidden buffers, issuing these commands will also make the hidden buffers visible.

Mark Oct 22 at 19:31

When you have two or more windows open horizontally or vertically and want to switch them all to the other orientation, you can use :windo wincmd K (all horizontal) or :windo wincmd H (all vertical).

[Oct 22, 2018] move selection to a separate file

Highly recommended!
Oct 22, 2018 | superuser.com

greg0ire ,Jan 23, 2013 at 13:29

With vim, how can I move a piece of text to a new file? For the moment, I do this: select the text, write it to the new file with :w, select it again, and delete it.

Is there a more efficient way to do this?

Before

a.txt

sometext
some other text
some other other text
end
After

a.txt

sometext
end

b.txt

some other text
some other other text

Ingo Karkat, Jan 23, 2013 at 15:20

How about these custom commands:
:command! -bang -range -nargs=1 -complete=file MoveWrite  <line1>,<line2>write<bang> <args> | <line1>,<line2>delete _
:command! -bang -range -nargs=1 -complete=file MoveAppend <line1>,<line2>write<bang> >> <args> | <line1>,<line2>delete _
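A usage sketch with the example files above: visually select the two middle lines of a.txt, then run

:'<,'>MoveWrite b.txt

which writes the selection to b.txt and deletes it from a.txt in one step; MoveAppend does the same but appends to an existing file instead of overwriting it.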

greg0ire ,Jan 23, 2013 at 15:27

This is very ugly, but hey, it seems to do in one step exactly what I asked for (I tried). +1, and accepted. I was looking for a native way to do this quickly but since there does not seem to be one, yours will do just fine. Thanks! – greg0ire Jan 23 '13 at 15:27

Ingo Karkat ,Jan 23, 2013 at 16:15

Beauty is in the eye of the beholder. I find this pretty elegant; you only need to type it once (into your .vimrc). – Ingo Karkat Jan 23 '13 at 16:15

greg0ire ,Jan 23, 2013 at 16:21

You're right, "very ugly" should have been "very unfamiliar". Your command is very handy, and I think I'm definitely going to carve it into my .vimrc – greg0ire Jan 23 '13 at 16:21

embedded.kyle ,Jan 23, 2013 at 14:08

By "move a piece of text to a new file" I assume you mean cut that piece of text from the current file and create a new file containing only that text.

Various examples:

The above only copies the text and creates a new file containing that text. You will then need to delete afterward.

This can be done using the same range and the d command:

Or by using dd for the single line case.

If you instead select the text using visual mode, and then hit : while the text is selected, you will see the following on the command line:

:'<,'>

Which indicates the selected text. You can then expand the command to:

:'<,'>w >> old_file

Which will append the text to an existing file. Then delete as above.


One liner:

:2,3 d | new +put! "

The breakdown: :2,3 d deletes lines 2 through 3 into the unnamed register; | chains the next command; new opens a new window on an empty buffer; and the +put! " argument puts the just-deleted lines into it above the cursor.

greg0ire, Jan 23, 2013 at 14:09

Your assumption is right. This looks good, I'm going to test. Could you explain 2. a bit more? I'm not very familiar with ranges. EDIT: If I try this on the second line, it writes the first line to the other file, not the second line. – greg0ire Jan 23 '13 at 14:09

embedded.kyle ,Jan 23, 2013 at 14:16

@greg0ire I got that a bit backward, I'll edit to better explain – embedded.kyle Jan 23 '13 at 14:16

greg0ire ,Jan 23, 2013 at 14:18

I added an example to make my question clearer. – greg0ire Jan 23 '13 at 14:18

embedded.kyle ,Jan 23, 2013 at 14:22

@greg0ire I corrected my answer. It's still two steps. The first copies and writes. The second deletes. – embedded.kyle Jan 23 '13 at 14:22

greg0ire ,Jan 23, 2013 at 14:41

Ok, if I understand well, the trick is to use ranges to select and write in the same command. That's very similar to what I did. +1 for the detailed explanation, but I don't think this is more efficient, since the trick with hitting ':' is what I do for the moment. – greg0ire Jan 23 '13 at 14:41

Xyon ,Jan 23, 2013 at 13:32

Select the text in visual mode, then press y to "yank" it into the buffer (copy) or d to "delete" it into the buffer (cut).

Then you can :split <new file name> to split your vim window up, and press p to paste in the yanked text. Write the file as normal.

To close the split again, run :q in the split you want to close.

greg0ire ,Jan 23, 2013 at 13:42

I have 4 steps for the moment: select, write, select, delete. With your method, I have 6 steps: select, delete, split, paste, write, close. I asked for something more efficient :P – greg0ire Jan 23 '13 at 13:42

Xyon ,Jan 23, 2013 at 13:44

Well, if you use :x in the split instead, you can combine writing and closing into one step and make it five steps. :P – Xyon Jan 23 '13 at 13:44

greg0ire ,Jan 23, 2013 at 13:46

That's better, but 5 still > 4 :P – greg0ire Jan 23 '13 at 13:46

Based on @embedded.kyle's answer and this Q&A, I ended up with this one liner to append a selection to a file and delete it from the current file. After selecting some lines with Shift+V, hit : and run:
'<,'>w >> test | normal gvd

The first part appends the selected lines to the file test. The second command enters normal mode and runs gvd to reselect the last selection and delete it.

[Oct 22, 2018] linux - If I rm -rf a symlink will the data the link points to get erased, too?

Notable quotes:
"... Put it in another words, those symlink-files will be deleted. The files they "point"/"link" to will not be touch. ..."
Oct 22, 2018 | unix.stackexchange.com

user4951 ,Jan 25, 2013 at 2:40

This is the contents of the /home3 directory on my system:
./   backup/    hearsttr@  lost+found/  randomvi@  sexsmovi@
../  freemark@  investgr@  nudenude@    romanced@  wallpape@

I want to clean this up but I am worried because of the symlinks, which point to another drive.

If I say rm -rf /home3 will it delete the other drive?

John Sui

rm -rf /home3 will delete all files and directories within home3, and home3 itself, including the symlink files, but it will not "follow" (dereference) those symlinks.

Put it another way: those symlink files will be deleted. The files they "point"/"link" to will not be touched.

[Oct 22, 2018] Does rm -rf follow symbolic links?

Jan 25, 2012 | superuser.com
I have a directory like this:
$ ls -l
total 899166
drwxr-xr-x 12 me scicomp       324 Jan 24 13:47 data
-rw-r--r--  1 me scicomp     84188 Jan 24 13:47 lod-thin-1.000000-0.010000-0.030000.rda
drwxr-xr-x  2 me scicomp       808 Jan 24 13:47 log
lrwxrwxrwx  1 me scicomp        17 Jan 25 09:41 msg -> /home/me/msg

And I want to remove it using rm -r .

However I'm scared rm -r will follow the symlink and delete everything in that directory (which is very bad).

I can't find anything about this in the man pages. What would be the exact behavior of running rm -rf from a directory above this one?

LordDoskias, Jan 25, 2012 at 16:43

How hard is it to create a dummy dir with a symlink pointing to a dummy file and execute the scenario? Then you will know for sure how it works!

hakre ,Feb 4, 2015 at 13:09

X-Ref: If I rm -rf a symlink will the data the link points to get erased, too? ; Deleting a folder that contains symlinks – hakre Feb 4 '15 at 13:09

Susam Pal ,Jan 25, 2012 at 16:47

Example 1: Deleting a directory containing a soft link to another directory.
susam@nifty:~/so$ mkdir foo bar
susam@nifty:~/so$ touch bar/a.txt
susam@nifty:~/so$ ln -s /home/susam/so/bar/ foo/baz
susam@nifty:~/so$ tree
.
├── bar
│   └── a.txt
└── foo
    └── baz -> /home/susam/so/bar/

3 directories, 1 file
susam@nifty:~/so$ rm -r foo
susam@nifty:~/so$ tree
.
└── bar
    └── a.txt

1 directory, 1 file
susam@nifty:~/so$

So, we see that the target of the soft-link survives.

Example 2: Deleting a soft link to a directory

susam@nifty:~/so$ ln -s /home/susam/so/bar baz
susam@nifty:~/so$ tree
.
├── bar
│   └── a.txt
└── baz -> /home/susam/so/bar

2 directories, 1 file
susam@nifty:~/so$ rm -r baz
susam@nifty:~/so$ tree
.
└── bar
    └── a.txt

1 directory, 1 file
susam@nifty:~/so$

Only the soft link is deleted. The target of the soft-link survives.

Example 3: Attempting to delete the target of a soft-link

susam@nifty:~/so$ ln -s /home/susam/so/bar baz
susam@nifty:~/so$ tree
.
├── bar
│   └── a.txt
└── baz -> /home/susam/so/bar

2 directories, 1 file
susam@nifty:~/so$ rm -r baz/
rm: cannot remove 'baz/': Not a directory
susam@nifty:~/so$ tree
.
├── bar
└── baz -> /home/susam/so/bar

2 directories, 0 files

The file in the target of the symbolic link does not survive.

The above experiments were done on a Debian GNU/Linux 9.0 (stretch) system.

Wyrmwood ,Oct 30, 2014 at 20:36

rm -rf baz/* will remove the contents – Wyrmwood Oct 30 '14 at 20:36

Buttle Butkus ,Jan 12, 2016 at 0:35

Yes, if you do rm -rf [symlink], then the contents of the original directory will be obliterated! Be very careful. – Buttle Butkus Jan 12 '16 at 0:35

frnknstn ,Sep 11, 2017 at 10:22

Your example 3 is incorrect! On each system I have tried, the file a.txt will be removed in that scenario. – frnknstn Sep 11 '17 at 10:22

Susam Pal ,Sep 11, 2017 at 15:20

@frnknstn You are right. I see the same behaviour you mention on my latest Debian system. I don't remember on which version of Debian I performed the earlier experiments. In my earlier experiments on an older version of Debian, either a.txt must have survived in the third example or I must have made an error in my experiment. I have updated the answer with the current behaviour I observe on Debian 9 and this behaviour is consistent with what you mention. – Susam Pal Sep 11 '17 at 15:20

Ken Simon ,Jan 25, 2012 at 16:43

Your /home/me/msg directory will be safe if you rm -rf the directory from which you ran ls. Only the symlink itself will be removed, not the directory it points to.

The only thing I would be cautious of, would be if you called something like "rm -rf msg/" (with the trailing slash.) Do not do that because it will remove the directory that msg points to, rather than the msg symlink itself.

> ,Jan 25, 2012 at 16:54

"The only thing I would be cautious of, would be if you called something like "rm -rf msg/" (with the trailing slash.) Do not do that because it will remove the directory that msg points to, rather than the msg symlink itself." - I don't find this to be true. See the third example in my response below. – Susam Pal Jan 25 '12 at 16:54

Andrew Crabb ,Nov 26, 2013 at 21:52

I get the same result as @Susam ('rm -r symlink/' does not delete the target of symlink), which I am pleased about as it would be a very easy mistake to make. – Andrew Crabb Nov 26 '13 at 21:52

,

rm should remove files and directories. If the file is a symbolic link, the link is removed, not the target. rm does not interpret (follow) a symbolic link. Consider, for example, the behavior when deleting 'broken' links: rm exits with 0, not with a non-zero status, so no failure is indicated.
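A quick way to verify this behavior safely, along the lines LordDoskias suggests above (the paths are made up for the test):

mkdir -p /tmp/t/real && touch /tmp/t/real/f    # a real dir with a file in it
ln -s /tmp/t/real /tmp/t/link                  # and a symlink pointing to it
rm -rf /tmp/t/link                             # removes only the symlink
ls /tmp/t/real                                 # f is still there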

[Oct 21, 2018] Common visual block selection scenarios

Notable quotes:
"... column oriented ..."
Oct 21, 2018 | stackoverflow.com
You are talking about text selecting and copying, I think that you should give a look to the Vim Visual Mode .

In the visual mode, you are able to select text using Vim commands, then you can do whatever you want with the selection.

Consider the following common scenarios:

You need to select to the next matching parenthesis.

You could do: v% (with the cursor on a parenthesis).

You want to select text between quotes: vi" (or va" to include the quotes).

You want to select a curly brace block (very common on C-style languages): viB (or vi{ ).

You want to select the entire file: ggVG

Visual block selection is another really useful feature, it allows you to select a rectangular area of text, you just have to press Ctrl - V to start it, and then select the text block you want and perform any type of operation such as yank, delete, paste, edit, etc. It's great to edit column oriented text.
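A small worked example (the text and comment prefix are made up): to prepend a comment marker to three lines, put the cursor at the start of the first one and type

Ctrl-V 2j I// Esc

Ctrl-V starts the block, 2j extends it down two lines, I enters insert mode at the left edge of the block, and on Esc the typed "// " is repeated on every line of the block.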

[Oct 21, 2018] Moving lines between split windows in vim

Notable quotes:
"... "send the line I am on (or the test I selected) to the other window" ..."
Oct 21, 2018 | superuser.com

brad ,Nov 24, 2015 at 12:28

I have two files, say a.txt and b.txt , in the same session of vim and I split the screen so I have file a.txt in the upper window and b.txt in the lower window.

I want to move lines here and there from a.txt to b.txt : I select a line with Shift + v , then I move to b.txt in the lower window with Ctrl + w , paste with p , get back to a.txt with Ctrl + w and I can repeat the operation when I get to another line I want to move.

My question: is there a quicker way to tell vim "send the line I am on (or the text I selected) to the other window"?

Chong ,Nov 24, 2015 at 12:33

Use q macro? q[some_letter] [whatever operations] q , then call the macro with [times to be called]@qChong Nov 24 '15 at 12:33

Anthony Geoghegan ,Nov 24, 2015 at 13:00

I presume that you're deleting the line that you've selected in a.txt . If not, you'd be pasting something else into b.txt . If so, there's no need to select the line first. – Anthony Geoghegan Nov 24 '15 at 13:00

Anthony Geoghegan ,Nov 24, 2015 at 13:17

This sounds like a good use case for a macro. Macros are commands that can be recorded and stored in a Vim register. Each register is identified by a letter from a to z.

Recording

From Recording keys for repeated jobs - Vim Tips

To start recording, press q in Normal mode followed by a letter (a to z). That starts recording keystrokes to the specified register. Vim displays "recording" in the status line. Type any Normal mode commands, or enter Insert mode and type text. To stop recording, again press q while in Normal mode.

For this particular macro, I chose the m (for move) register to store it.

I pressed qm to record the following commands: dd (cut the current line), Ctrl-W j (move to the window below), p (paste the line), and Ctrl-W k (move back to the window above).

When I typed q to finish recording the macro, the contents of the m register were:

dd^Wjp^Wk

Usage: with the cursor on a line in the upper window, run @m to send that line to the lower window; @@ repeats the last-played macro for further lines.
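The same motion can also be baked into a mapping instead of a macro (a minimal sketch; the key choice is illustrative, not from the thread):

nnoremap <leader>m dd<C-w>jp<C-w>k    " send the current line to the split below

With this in your .vimrc, <leader>m sends the current line from the upper window to the one below it.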

brad ,Nov 24, 2015 at 14:26

I asked to see if there is a command unknown to me that does the job: it seems there is none. In absence of such a command, this can be a good solution. – brad Nov 24 '15 at 14:26

romainl ,Nov 26, 2015 at 9:54

@brad, you can find all the commands available to you in the documentation. If it's not there, it doesn't exist; no need to ask random strangers. – romainl Nov 26 '15 at 9:54

brad ,Nov 26, 2015 at 10:17

@romainl, yes, I know this, but vim documentation is really huge and, although it doesn't scare me, there is always the possibility of missing something. Moreover, it could be that you can obtain the effect by combining two commands, in which case it would hardly be documented – brad Nov 26 '15 at 10:17

[Oct 21, 2018] How to move around buffers in vim?

Oct 21, 2018 | stackoverflow.com

user3721893 ,Jul 23, 2014 at 5:43

I normally work with more than 5 files at a time. I use buffers to open different files. I use commands such as :buf file1, :buf file2 etc. Is there a faster way to move to different files?

eckes ,Jul 23, 2014 at 5:49

What I use:

And have a short look at :he buffer

And the wiki entry on Easier Buffer Switching on the Vim Wiki: http://vim.wikia.com/wiki/Easier_buffer_switching

SO already has a question regarding yours: How do you prefer to switch between buffers in Vim?

romainl ,Jul 23, 2014 at 6:13

A few mappings can make your life a lot easier.

This one lists your buffers and prompts you for a number:

nnoremap gb :buffers<CR>:buffer<Space>

This one lists your buffers in the "wildmenu". Depends on the 'wildcharm' option as well as 'wildmenu' and 'wildmode' :

nnoremap <leader>b :buffer <C-z>

These ones allow you to cycle between all your buffers without much thinking:

nnoremap <PageUp>   :bprevious<CR>
nnoremap <PageDown> :bnext<CR>

Also, don't forget <C-^> which allows you to alternate between two buffers.

mikew ,Jul 23, 2014 at 6:38

Once the buffers are already open, you can just type :b partial_filename to switch

So if :ls shows that I have my ~/.vimrc open, then I can just type :b vimr or :b rc to switch to that buffer

Brady Trainor ,Jul 25, 2014 at 22:13

Below I describe some excerpts from sections of my .vimrc . It includes mapping the leader key, setting wilds tab completion, and finally my buffer nav key choices (all mostly inspired by folks on the interweb, including romainl). Edit: Then I ramble on about my shortcuts for windows and tabs.
" easier default keys {{{1

let mapleader=','
nnoremap <leader>2 :@"<CR>

The leader key is a prefix key for mostly user-defined key commands (some plugins also use it). The default is \ , but many people suggest the easier to reach , .

The second line there is a command to @ execute from the " clipboard, in case you'd like to quickly try out various key bindings (without relying on :so % ). (My mnemonic is that Shift - 2 is @ .)

" wilds {{{1

set wildmenu wildmode=list:full
set wildcharm=<C-z>
set wildignore+=*~ wildignorecase

For built-in completion, wildmenu is probably the part that shows up yellow on your Vim when using tab completion on command-line. wildmode is set to a comma-separated list, each coming up in turn on each tab completion (that is, my list is simply one element, list:full ). list shows rows and columns of candidates. full 's meaning includes maintaining existence of the wildmenu . wildcharm is the way to include Tab presses in your macros. The *~ is for my use in :edit and :find commands.

" nav keys {{{1
" windows, buffers and tabs {{{2
" buffers {{{3

nnoremap <leader>bb :b <C-z><S-Tab>
nnoremap <leader>bh :ls!<CR>:b<Space>
nnoremap <leader>bw :ls!<CR>:bw<Space>
nnoremap <leader>bt :TSelectBuffer<CR>
nnoremap <leader>be :BufExplorer<CR>
nnoremap <leader>bs :BufExplorerHorizontalSplit<CR>
nnoremap <leader>bv :BufExplorerVerticalSplit<CR>
nnoremap <leader>3 :e#<CR>
nmap <C-n> :bn<cr>
nmap <C-p> :bp<cr>

The ,3 is for switching between the "two" last buffers (easier to reach than the built-in Ctrl - 6 ). Mnemonic: Shift - 3 is # , and # is the register symbol for the last buffer. (See :marks .)

,bh is to select from hidden buffers ( ! ).

,bw is to bwipeout buffers by number or name. For instance, you can wipeout several while looking at the list, with ,bw 1 3 4 8 10 <CR> . Note that wipeout is more destructive than :bdelete . They have their pros and cons. For instance, :bdelete leaves the buffer in the hidden list, while :bwipeout removes global marks (see :help marks , and the description of uppercase marks).

I haven't settled on these keybindings; I would sort of prefer that my ,bb was simply ,b (but defining ,b alone while leaving the others defined makes Vim pause to see if you'll enter more).

Those shortcuts for :BufExplorer are actually the defaults for that plugin, but I have it written out so I can change them if I want to start using ,b without a hang.

You didn't ask for this:

If you still find Vim buffers a little awkward to use, try to combine the functionality with tabs and windows (until you get more comfortable?).

" windows {{{3

" window nav
nnoremap <leader>w <C-w>
nnoremap <M-h> <C-w>h
nnoremap <M-j> <C-w>j
nnoremap <M-k> <C-w>k
nnoremap <M-l> <C-w>l
" resize window
nnoremap <C-h> <C-w><
nnoremap <C-j> <C-w>+
nnoremap <C-k> <C-w>-
nnoremap <C-l> <C-w>>

Notice how nice ,w is for a prefix. Also, I reserve Ctrl key for resizing, because Alt ( M- ) is hard to realize in all environments, and I don't have a better way to resize. I'm fine using ,w to switch windows.

" tabs {{{3

nnoremap <leader>t :tab
nnoremap <M-n> :tabn<cr>
nnoremap <M-p> :tabp<cr>
nnoremap <C-Tab> :tabn<cr>
nnoremap <C-S-Tab> :tabp<cr>
nnoremap tn :tabe<CR>
nnoremap te :tabe<Space><C-z><S-Tab>
nnoremap tf :tabf<Space>
nnoremap tc :tabc<CR>
nnoremap to :tabo<CR>
nnoremap tm :tabm<CR>
nnoremap ts :tabs<CR>

nnoremap th :tabr<CR>
nnoremap tj :tabn<CR>
nnoremap tk :tabp<CR>
nnoremap tl :tabl<CR>

" or, it may make more sense to use
" nnoremap th :tabp<CR>
" nnoremap tj :tabl<CR>
" nnoremap tk :tabr<CR>
" nnoremap tl :tabn<CR>

In summary of my window and tabs keys, I can navigate both of them with Alt , which is actually pretty easy to reach. In other words:

" (modifier) key choice explanation {{{3
"
"       KEYS        CTRL                  ALT            
"       hjkl        resize windows        switch windows        
"       np          switch buffer         switch tab      
"
" (resize windows is hard to do otherwise, so we use ctrl which works across
" more environments. i can use ',w' for windowcmds o.w.. alt is comfortable
" enough for fast and gui nav in tabs and windows. we use np for navs that 
" are more linear, hjkl for navs that are more planar.) 
"

This way, if the Alt is working, you can actually hold it down while you find your "open" buffer pretty quickly, amongst the tabs and windows.

,

There are many ways to solve. The best is the best that WORKS for YOU. You have lots of fuzzy match plugins that help you navigate. The 2 things that impress me most are

1) CtrlP or Unite's fuzzy buffer search

2) LustyExplorer and/or LustyJuggler

And the simplest :

:map <F5> :ls<CR>:e #

Pressing F5 lists all buffers; then just type a buffer number and press Enter.

[Oct 21, 2018] What is vim recording and how can it be disabled?

Oct 21, 2018 | stackoverflow.com

vehomzzz, Oct 6, 2009 at 20:03

I keep seeing the recording message at the bottom of my gvim 7.2 window.

What is it and how do I turn it off?

Joey Adams, Aug 17, 2010 at 16:26

To turn off vim recording for good, add map q <Nop> to your .vimrc file. – Joey Adams Aug 17 '10 at 16:26

0xc0de, Aug 12, 2016 at 9:04

I can't believe you want to turn recording off! I would show a really annoying popup 'Are you sure?' if one asks to turn it off (or probably would like to give options like the Windows 10 update gives). – 0xc0de Aug 12 '16 at 9:04

yogsototh, Oct 6, 2009 at 20:08

You start recording by q<letter> and you can end it by typing q again.

Recording is a really useful feature of Vim.

It records everything you type. You can then replay it simply by typing @<letter> . Record search, movement, replacement...

One of the best features of Vim, IMHO.
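A minimal illustrative session (the particular edit is made up): record into register a a macro that deletes the first word of a line and moves down, then replay it:

qa      start recording into register a
0dw     delete the first word of the line
j       move to the next line
q       stop recording
5@a     replay the macro on the next five lines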

Cascabel, Oct 6, 2009 at 20:13

As seen other places, it's q followed by a register. A really cool (and possibly non-intuitive) part of this is that these are the same registers used by things like delete, yank, and put. This means that you can yank text from the editor into a register, then execute it as a command. – Cascabel Oct 6 '09 at 20:13

Tolga E, Aug 17, 2013 at 3:07

One more thing to note is you can hit any number before the @ to replay the recording that many times like (100@<letter>) will play your actions 100 times – Tolga E Aug 17 '13 at 3:07

anisoptera, Dec 4, 2014 at 9:43

You could add it afterward, by editing the register with put/yank. But I don't know why you'd want to turn recording on or off as part of a macro. ('q' doesn't affect anything when typed in insert mode.) – anisoptera Dec 4 '14 at 9:43

L0j1k, Jul 16, 2015 at 21:08

Vim is so freakin' cool, man. – L0j1k Jul 16 '15 at 21:08

Cascabel, Jul 29, 2015 at 14:52

@Wade " - it's called the default register. – Cascabel Jul 29 '15 at 14:52

ephemient, Oct 6, 2009 at 20:17

Type :h recording to learn more.
                           *q* *recording*
q{0-9a-zA-Z"}           Record typed characters into register {0-9a-zA-Z"}
                        (uppercase to append).  The 'q' command is disabled
                        while executing a register, and it doesn't work inside
                        a mapping.  {Vi: no recording}

q                       Stops recording.  (Implementation note: The 'q' that
                        stops recording is not stored in the register, unless
                        it was the result of a mapping)  {Vi: no recording}


                                                        *@*
@{0-9a-z".=*}           Execute the contents of register {0-9a-z".=*} [count]
                        times.  Note that register '%' (name of the current
                        file) and '#' (name of the alternate file) cannot be
                        used.  For "@=" you are prompted to enter an
                        expression.  The result of the expression is then
                        executed.  See also |@:|.  {Vi: only named registers}

Tim Henigan, Oct 6, 2009 at 20:07

It sounds like you have macro recording turned on. To shut it off, press q .

Refer to " :help recording " for further information.

Related links:

mitchus, Feb 13, 2015 at 14:16

Typing q starts macro recording, and the recording stops when the user hits q again.

As Joey Adams mentioned, to disable recording, add the following line to .vimrc in your home directory:

map q <Nop>

n611x007, Oct 4, 2015 at 7:16

only answer about the "how to turn off" part of the question. Well, it makes recording inaccessible, effectively turning it off - at least no one expects vi to have a separate thread for this code, I guess, including me. – n611x007 Oct 4 '15 at 7:16

JeffH, Oct 6, 2009 at 20:10

As others have said, it's macro recording, and you turn it off with q. Here's a nice article about how-to and why it's useful.

John Millikin, Oct 6, 2009 at 20:06

It means you're in "record macro" mode. This mode is entered by typing q followed by a register name, and can be exited by typing q again.

ephemient, Oct 6, 2009 at 20:08

It's actually entered by typing q followed by any register name, which is 0-9, a-z, A-Z, and ". – ephemient Oct 6 '09 at 20:08

Cascabel, Oct 6, 2009 at 20:08

Actually, it's q{0-9a-zA-Z"} - you can record a macro into any register (named by digit, letter, "). In case you actually want to use it... you execute the contents of a register with @<register>. See :help q and :help @ if you're interested in using it. – Cascabel Oct 6 '09 at 20:08

[Oct 21, 2018] vim - how to move a block or column of text

Oct 21, 2018 | stackoverflow.com

How to move a block or column of text


David.Chu.ca ,Mar 6, 2009 at 20:47

I have the following text as a simple case:

...
abc xxx 123 456
wer xxx 345 678676
...

what I need to move a block of text xxx to another location:

...
abc 123 xxx 456
wer 345 xxx 678676
...

I think I can use visual mode to select a column of text; what are the other commands to move the block to another location?

Paul ,Mar 6, 2009 at 20:52

You should use blockwise visual mode ( Ctrl + v ). Then d to delete block, p or P to paste block.

Klinger ,Mar 6, 2009 at 20:53

Try these links:


Marking text (visual mode)

Visual commands

Cut and Paste

Kemin Zhou ,Nov 6, 2015 at 23:59

One of the few useful commands I learned at the beginning of learning VIM is :1,3 mo 5 . This moves lines 1 through 3 to below line 5.

J�da Ron�n ,Jan 18, 2017 at 21:20

And you can select the lines in visual mode, then press : to get :'<,'> (equivalent to the :1,3 part in your answer), and add mo N . If you want to move a single line, just :mo N . If you are really lazy, you can omit the space (e.g. :mo5 ). Use marks with mo '{a-zA-Z} . � J�da Ron�n Jan 18 '17 at 21:20

Miles ,Jun 29, 2017 at 23:44

just m also works – Miles Jun 29 '17 at 23:44

John Ellinwood ,Mar 6, 2009 at 20:52

  1. In VIM, press Ctrl + V to go into Visual Block mode
  2. Select the required columns with your arrow keys and press x to cut them into the buffer.
  3. Move the cursor to row 1, column 9, and press P (that's capital P) in command mode.
  4. Press Ctrl + Shift + b to get in and out of it.

SergioAraujo ,Jan 4 at 21:49

Using an external command, "awk", as a filter over the whole buffer:

:%!awk '{print $1,$3,$2,$4}'
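The same awk filter can be restricted to a visual selection rather than the whole buffer (a small variation, not from the thread): select the lines, hit : and run

:'<,'>!awk '{print $1,$3,$2,$4}'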

With pure vim

:%s,\v(\w+) (\w+) (\w+) (\w+),\1 \3 \2 \4,g

Another vim solution using the global command (for each non-empty line: w moves to the second word, dw deletes it, w moves one more word forward, and P pastes the deleted word back in before the cursor):

:g/./normal wdwwP

[Oct 21, 2018] Camtasia Studio 8

Notable quotes:
"... What did you use to make the gif? ..."
"... That was done with Camtasia Studio 8. Very easy actually. ..."
"... @Zoltán you can use LiceCap, which is small size – ..."
"... GifCam is a simple to use tool, too: blog.bahraniapps.com/gifcam ..."
Oct 21, 2018 | stackoverflow.com


Adam ,Feb 7, 2014 at 22:20

Doesn't get any simpler than this! From normal mode:

yy

then move to the line you want to paste at and

p

Zoltán ,Jul 2, 2014 at 7:42

What did you use to make the gif? – Zoltán Jul 2 '14 at 7:42

Adam ,Sep 19, 2014 at 20:15

That was done with Camtasia Studio 8. Very easy actually. – Adam Sep 19 '14 at 20:15

onmyway133 ,Feb 23, 2016 at 15:29

@Zoltán you can use LiceCap, which is small size – onmyway133 Feb 23 '16 at 15:29

Jared ,Jun 2, 2016 at 13:44

GifCam is a simple to use tool, too: blog.bahraniapps.com/gifcamJared Jun 2 '16 at 13:44

[Oct 21, 2018] Favorite (G)Vim plugins/scripts?

Dec 27, 2009 | stackoverflow.com
What are your favorite (G)Vim plugins/scripts?

community wiki 2 revs ,Jun 24, 2009 at 13:35

Nerdtree

The NERD tree allows you to explore your filesystem and to open files and directories. It presents the filesystem to you in the form of a tree which you manipulate with the keyboard and/or mouse. It also allows you to perform simple filesystem operations.

The tree can be toggled easily with :NERDTreeToggle which can be mapped to a more suitable key. The keyboard shortcuts in the NERD tree are also easy and intuitive.

Edit: Added synopsis

SpoonMeiser ,Sep 17, 2008 at 19:32

For those of us not wanting to follow every link to find out about each plugin, care to furnish us with a brief synopsis? – SpoonMeiser Sep 17 '08 at 19:32

AbdullahDiaa ,Sep 10, 2012 at 19:51

and NERDTree with NERDTreeTabs are an awesome combination github.com/jistr/vim-nerdtree-tabs – AbdullahDiaa Sep 10 '12 at 19:51

community wiki 2 revs ,May 27, 2010 at 0:08

Tim Pope has some kickass plugins. I love his surround plugin.

Taurus Olson ,Feb 21, 2010 at 18:01

Surround is a great plugin for sure. – Taurus Olson Feb 21 '10 at 18:01

Benjamin Oakes ,May 27, 2010 at 0:11

Link to all his vim contributions: vim.org/account/profile.php?user_id=9012 – Benjamin Oakes May 27 '10 at 0:11

community wiki SergioAraujo, Mar 15, 2011 at 15:35

Pathogen plugin and more things commented by Steve Losh

Patrizio Rullo ,Sep 26, 2011 at 12:11

Pathogen is the FIRST plugin you have to install on every Vim installation! It resolves the plugin management problems every Vim developer has. – Patrizio Rullo Sep 26 '11 at 12:11

Profpatsch ,Apr 12, 2013 at 8:53

I would recommend switching to Vundle . It's better by a long shot and truly automates. You can give vim-addon-manager a try, too. – Profpatsch Apr 12 '13 at 8:53

community wiki JPaget, Sep 15, 2008 at 20:47

Taglist , a source code browser plugin for Vim, is currently the top rated plugin at the Vim website and is my favorite plugin.

mindthief ,Jun 27, 2012 at 20:53

A more recent alternative to this is Tagbar , which appears to have some improvements over Taglist. This blog post offers a comparison between the two plugins. – mindthief Jun 27 '12 at 20:53

community wiki 1passenger, Nov 17, 2009 at 9:15

I love snipMate . It's similar to snippetsEmu, but has a much better syntax to read (like Textmate).

community wiki cschol, Aug 22, 2008 at 4:19

A very nice grep replacement for GVim is Ack . A search plugin written in Perl that beats Vim's internal grep implementation and externally invoked greps, too. It also by default skips any CVS directories in the project directory, e.g. '.svn'. This blog shows a way to integrate Ack with vim.

FUD, Aug 27, 2013 at 15:50

github.com/mileszs/ack.vim – FUD Aug 27 '13 at 15:50

community wiki Dominic Dos Santos ,Sep 12, 2008 at 12:44

A.vim is a great little plugin. It allows you to quickly switch between header and source files with a single command. The default is :A , but I remapped it to F2 to reduce keystrokes.

community wiki 2 revs, Aug 25, 2008 at 15:06

I really like the SuperTab plugin, it allows you to use the tab key to do all your insert completions.

community wiki Greg Hewgill, Aug 25, 2008 at 19:23

I have recently started using a plugin that highlights differences in your buffer from a previous version in your RCS system (Subversion, git, whatever). You just need to press a key to toggle the diff display on/off. You can find it here: http://github.com/ghewgill/vim-scmdiff . Patches welcome!

Nathan Fellman, Sep 15, 2008 at 18:51

Do you know if this supports bitkeeper? I looked on the website but couldn't even see whom to ask. – Nathan Fellman Sep 15 '08 at 18:51

Greg Hewgill, Sep 16, 2008 at 9:26

It doesn't explicitly support bitkeeper at the moment, but as long as bitkeeper has a "diff" command that outputs a normal patch file, it should be easy enough to add. – Greg Hewgill Sep 16 '08 at 9:26

Yogesh Arora, Mar 10, 2010 at 0:47

does it support ClearCase? – Yogesh Arora Mar 10 '10 at 0:47

Greg Hewgill, Mar 10, 2010 at 1:39

@Yogesh: No, it doesn't support ClearCase at this time. However, if you can add ClearCase support, a patch would certainly be accepted. – Greg Hewgill Mar 10 '10 at 1:39

Olical ,Jan 23, 2013 at 11:05

This version can be loaded via pathogen in a git submodule: github.com/tomasv/vim-scmdiff – Olical Jan 23 '13 at 11:05

community wiki 4 revs, May 23, 2017 at 11:45

  1. Elegant (mini) buffer explorer - This is the multiple file/buffer manager I use. Takes very little screen space. It looks just like most IDEs where you have a top tab-bar with the files you've opened. I've tested some other similar plugins before, and this is my pick.
  2. TagList - Small file explorer, without the "extra" stuff the other file explorers have. Just lets you browse directories and open files with the "enter" key. Note that this has already been noted by previous commenters to your questions.
  3. SuperTab - Already noted by WMR in this post, looks very promising. It's an auto-completion replacement key for Ctrl-P.
  4. Desert256 color scheme - Readable, dark one.
  5. Moria color scheme - Another good, dark one. Note that it's gVim only.
  6. Enhanced Python syntax - If you're using Python, this is an enhanced syntax version. Works better than the original. I'm not sure, but this might be already included in the newest version. Nonetheless, it's worth adding to your syntax folder if you need it.
  7. Enhanced JavaScript syntax - Same as the above.
  8. EDIT: Comments - Great little plugin to [un]comment chunks of text. Language recognition included ("#", "/", "/* .. */", etc.).

community wiki Konrad Rudolph, Aug 25, 2008 at 14:19

Not a plugin, but I advise any Mac user to switch to the MacVim distribution which is vastly superior to the official port.

As for plugins, I used VIM-LaTeX for my thesis and was very satisfied with the usability boost. I also like the Taglist plugin which makes use of the ctags library.

community wiki Yariv ,Nov 25, 2010 at 19:58

clang complete - the best c++ code completion I have seen so far. By using an actual compiler (that would be clang) the plugin is able to complete complex expressions including STL and smart pointers.

community wiki Greg Bowyer, Jul 30, 2009 at 19:51

No one has mentioned matchit yet? It makes HTML / XML soup much nicer: http://www.vim.org/scripts/script.php?script_id=39

community wiki 2 revs, 2 users 91% ,Nov 24, 2011 at 5:18

Tomas Restrepo posted on some great Vim scripts/plugins . He has also pointed out some nice color themes on his blog, too. Check out his Vim category .

community wiki HaskellElephant ,Mar 29, 2011 at 17:59,

With version 7.3, undo branches were added to vim. A very powerful feature, but hard to use, until Steve Losh made Gundo , which makes this feature usable with an ASCII representation of the tree and a diff of the change. A must for using undo branches.

community wiki, Auguste ,Apr 20, 2009 at 8:05

Matrix Mode .

community wiki wilhelmtell ,Dec 10, 2010 at 19:11

My latest favourite is Command-T . Granted, to install it you need to have Ruby support and you'll need to compile a C extension for Vim. But oy-yoy-yoy does this plugin make a difference in opening files in Vim!

Victor Farazdagi, Apr 19, 2011 at 19:16

Definitely! Let not the ruby + c compiling stop you, you will be amazed at how well this plugin enhances your toolset. I have been ignoring this plugin for too long; installed it today and already find myself using NERDTree less and less. – Victor Farazdagi Apr 19 '11 at 19:16

datentyp ,Jan 11, 2012 at 12:54

With ctrlp now there is something as awesome as Command-T written in pure Vimscript! It's available at github.com/kien/ctrlp.vim – datentyp Jan 11 '12 at 12:54

FUD ,Dec 26, 2012 at 4:48

just my 2 cents.. being a naive user of both plugins, with the first few characters of a file name I saw much better results with the Command-T plugin and a lot of false positives for ctrlp. – FUD Dec 26 '12 at 4:48

community wiki f3lix, Mar 15, 2011 at 12:55

Conque Shell : Run interactive commands inside a Vim buffer

Conque is a Vim plugin which allows you to run interactive programs, such as bash on linux or powershell.exe on Windows, inside a Vim buffer. In other words it is a terminal emulator which uses a Vim buffer to display the program output.

http://code.google.com/p/conque/

http://www.vim.org/scripts/script.php?script_id=2771

community wiki 2 revs ,Nov 20, 2009 at 14:51

The vcscommand plugin provides global ex commands for manipulating version-controlled source files and it supports CVS,SVN and some other repositories.

You can do almost all repository related tasks from with in vim:
* Taking the diff of current buffer with repository copy
* Adding new files
* Reverting the current buffer to the repository copy by nullifying the local changes....

community wiki Sirupsen ,Nov 20, 2009 at 15:00

Just gonna name a few I didn't see here, but which I still find extremely helpful:

community wiki thestoneage ,Dec 22, 2011 at 16:25

One plugin that is missing in the answers is NERDCommenter , which lets you do almost anything with comments. For example {add, toggle, remove} comments. And more. See this blog entry for some examples.

community wiki james ,Feb 19, 2010 at 7:17

I like taglist and fuzzyfinder; those are very cool plugins

community wiki JAVH ,Aug 15, 2010 at 11:54

TaskList

This script is based on the eclipse Task List. It will search the file for FIXME, TODO, and XXX (or a custom list) and put them in a handy list for you to browse which at the same time will update the location in the document so you can see exactly where the tag is located. Something like an interactive 'cw'

community wiki Peter Hoffmann ,Aug 29, 2008 at 4:07

I really love the snippetsEmu Plugin. It emulates some of the behaviour of Snippets from the OS X editor TextMate, in particular the variable bouncing and replacement behaviour.

community wiki Anon ,Sep 11, 2008 at 10:20

Zenburn color scheme and good fonts - Droid Sans Mono ( http://en.wikipedia.org/wiki/Droid_(font) ) on Linux, Consolas on Windows.

Gary Willoughby ,Jul 7, 2011 at 21:21

Take a look at DejaVu Sans Mono too: dejavu-fonts.org/wiki/Main_Page – Gary Willoughby Jul 7 '11 at 21:21

Santosh Kumar ,Mar 28, 2013 at 4:48

Droid Sans Mono makes capital m and 0 appear the same. – Santosh Kumar Mar 28 '13 at 4:48

community wiki julienXX ,Jun 22, 2010 at 12:05

If you're on a Mac, you got to use peepopen , fuzzyfinder on steroids.

Khaja Minhajuddin ,Apr 5, 2012 at 9:24

Command+T is a free alternative to this: github.com/wincent/Command-T – Khaja Minhajuddin Apr 5 '12 at 9:24

community wiki Peter Stuifzand ,Aug 25, 2008 at 19:16

I use the following two plugins all the time:

Csaba_H ,Jun 24, 2009 at 13:47

vimoutliner is really good for managing small pieces of information (from tasks/todo-s to links) – Csaba_H Jun 24 '09 at 13:47

ThiefMaster ♦ ,Nov 25, 2010 at 20:35

Adding some links/descriptions would be nice – ThiefMaster ♦ Nov 25 '10 at 20:35

community wiki chiggsy ,Aug 26, 2009 at 18:22

For vim I like a little help with completions. Vim has tons of completion modes, but really, I just want vim to complete anything it can, whenever it can.

I hate typing ending quotes, but fortunately this plugin obviates the need for such misery.

Those two are my heavy hitters.

This one may step up to roam my code like an unquiet shade, but I've yet to try it.

community wiki Brett Stahlman, Dec 11, 2009 at 13:28

Txtfmt (The Vim Highlighter) Screenshots

The Txtfmt plugin gives you a sort of "rich text" highlighting capability, similar to what is provided by RTF editors and word processors. You can use it to add colors (foreground and background) and formatting attributes (all combinations of bold, underline, italic, etc...) to your plain text documents in Vim.

The advantage of this plugin over something like Latex is that with Txtfmt, your highlighting changes are visible "in real time", and as with a word processor, the highlighting is WYSIWYG. Txtfmt embeds special tokens directly in the file to accomplish the highlighting, so the highlighting is unaffected when you move the file around, even from one computer to another. The special tokens are hidden by the syntax; each appears as a single space. For those who have applied Vince Negri's conceal/ownsyntax patch, the tokens can even be made "zero-width".

community wiki 2 revs, Dec 10, 2010 at 4:37

tcomment

"I map the "Command + /" keys so I can just comment stuff out while in insert mode."

[Oct 21, 2018] What is your most productive shortcut with Vim?

Notable quotes:
"... less productive ..."
"... column oriented ..."
Feb 17, 2013 | stackoverflow.com
I've heard a lot about Vim, both pros and cons. It really seems you should be (as a developer) faster with Vim than with any other editor. I'm using Vim to do some basic stuff and I'm at best 10 times less productive with Vim.

The only two things you should care about when you talk about speed (you may not care enough about them, but you should) are:

  1. Using alternatively left and right hands is the fastest way to use the keyboard.
  2. Never touching the mouse is the second way to be as fast as possible. It takes ages for you to move your hand, grab the mouse, move it, and bring it back to the keyboard (and you often have to look at the keyboard to be sure you returned your hand properly to the right place)

Here are two examples demonstrating why I'm far less productive with Vim.

Copy/Cut & paste. I do it all the time. With all the contemporary editors you press Shift with the left hand, and you move the cursor with your right hand to select text. Then Ctrl + C copies, you move the cursor and Ctrl + V pastes.

With Vim it's horrible:

Another example? Search & replace.

And everything with Vim is like that: it seems I don't know how to handle it the right way.

NB : I've already read the Vim cheat sheet :)

My question is: What is the way you use Vim that makes you more productive than with a contemporary editor?

community wiki 18 revs, 16 users 64%, Dec 22, 2011 at 11:43

Your problem with Vim is that you don't grok vi .

You mention cutting with yy and complain that you almost never want to cut whole lines. In fact programmers, editing source code, very often want to work on whole lines, ranges of lines and blocks of code. However, yy is only one of many ways to yank text into the anonymous copy buffer (or "register" as it's called in vi ).

The "Zen" of vi is that you're speaking a language. The initial y is a verb. The statement yy is a synonym for y_ . The y is doubled up to make it easier to type, since it is such a common operation.

This can also be expressed as dd P (delete the current line and paste a copy back into place; leaving a copy in the anonymous register as a side effect). The y and d "verbs" take any movement as their "subject." Thus yW is "yank from here (the cursor) to the end of the current/next (big) word" and y'a is "yank from here to the line containing the mark named ' a '."

If you only understand basic up, down, left, and right cursor movements then vi will be no more productive than a copy of "notepad" for you. (Okay, you'll still have syntax highlighting and the ability to handle files larger than a piddling ~45KB or so; but work with me here).

vi has 26 "marks" and 26 "registers." A mark is set to any cursor location using the m command. Each mark is designated by a single lower case letter. Thus ma sets the ' a ' mark to the current location, and mz sets the ' z ' mark. You can move to the line containing a mark using the ' (single quote) command. Thus 'a moves to the beginning of the line containing the ' a ' mark. You can move to the precise location of any mark using the ` (backquote) command. Thus `z will move directly to the exact location of the ' z ' mark.

Because these are "movements" they can also be used as subjects for other "statements."

So, one way to cut an arbitrary selection of text would be to drop a mark (I usually use ' a ' as my "first" mark, ' z ' as my next mark, ' b ' as another, and ' e ' as yet another (I don't recall ever having interactively used more than four marks in 15 years of using vi ; one creates one's own conventions regarding how marks and registers are used by macros that don't disturb one's interactive context). Then we go to the other end of our desired text; we can start at either end, it doesn't matter. Then we can simply use d`a to cut or y`a to copy. Thus the whole process has a 5 keystrokes overhead (six if we started in "insert" mode and needed to Esc out command mode). Once we've cut or copied then pasting in a copy is a single keystroke: p .

I say that this is one way to cut or copy text. However, it is only one of many. Frequently we can more succinctly describe the range of text without moving our cursor around and dropping a mark. For example if I'm in a paragraph of text I can use { and } movements to the beginning or end of the paragraph respectively. So, to move a paragraph of text I cut it using { d} (3 keystrokes). (If I happen to already be on the first or last line of the paragraph I can then simply use d} or d{ respectively.

The notion of "paragraph" defaults to something which is usually intuitively reasonable. Thus it often works for code as well as prose.

Frequently we know some pattern (regular expression) that marks one end or the other of the text in which we're interested. Searching forwards or backwards are movements in vi . Thus they can also be used as "subjects" in our "statements." So I can use d/foo to cut from the current line to the next line containing the string "foo" and y?bar to copy from the current line to the most recent (previous) line containing "bar." If I don't want whole lines I can still use the search movements (as statements of their own), drop my mark(s) and use the `x commands as described previously.

In addition to "verbs" and "subjects" vi also has "objects" (in the grammatical sense of the term). So far I've only described the use of the anonymous register. However, I can use any of the 26 "named" registers by prefixing the "object" reference with " (the double quote modifier). Thus if I use "add I'm cutting the current line into the ' a ' register and if I use "by/foo then I'm yanking a copy of the text from here to the next line containing "foo" into the ' b ' register. To paste from a register I simply prefix the paste with the same modifier sequence: "ap pastes a copy of the ' a ' register's contents into the text after the cursor and "bP pastes a copy from ' b ' to before the current line.

This notion of "prefixes" also adds the analogs of grammatical "adjectives" and "adverbs' to our text manipulation "language." Most commands (verbs) and movement (verbs or objects, depending on context) can also take numeric prefixes. Thus 3J means "join the next three lines" and d5} means "delete from the current line through the end of the fifth paragraph down from here."

This is all intermediate level vi . None of it is Vim specific and there are far more advanced tricks in vi if you're ready to learn them. If you were to master just these intermediate concepts then you'd probably find that you rarely need to write any macros because the text manipulation language is sufficiently concise and expressive to do most things easily enough using the editor's "native" language.


A sampling of more advanced tricks:

There are a number of : commands, most notably the :% s/foo/bar/g global substitution technique. (That's not advanced but other : commands can be). The whole : set of commands was historically inherited by vi 's previous incarnations as the ed (line editor) and later the ex (extended line editor) utilities. In fact vi is so named because it's the visual interface to ex .

: commands normally operate over lines of text. ed and ex were written in an era when terminal screens were uncommon and many terminals were "teletype" (TTY) devices. So it was common to work from printed copies of the text, using commands through an extremely terse interface (common connection speeds were 110 baud, or, roughly, 11 characters per second -- which is slower than a fast typist; lags were common on multi-user interactive sessions; additionally there was often some motivation to conserve paper).

So the syntax of most : commands includes an address or range of addresses (line number) followed by a command. Naturally one could use literal line numbers: :127,215 s/foo/bar to change the first occurrence of "foo" into "bar" on each line between 127 and 215. One could also use some abbreviations such as . or $ for current and last lines respectively. One could also use relative prefixes + and - to refer to offsets after or before the current line, respectively. Thus: :.,$j meaning "from the current line to the last line, join them all into one line". :% is synonymous with :1,$ (all the lines).

The :... g and :... v commands bear some explanation as they are incredibly powerful. :... g is a prefix for "globally" applying a subsequent command to all lines which match a pattern (regular expression) while :... v applies such a command to all lines which do NOT match the given pattern ("v" from "conVerse"). As with other ex commands these can be prefixed by addressing/range references. Thus :.,+21g/foo/d means "delete any lines containing the string "foo" from the current one through the next 21 lines" while :.,$v/bar/d means "from here to the end of the file, delete any lines which DON'T contain the string "bar."

It's interesting that the common Unix command grep was actually inspired by this ex command (and is named after the way in which it was documented). The ex command :g/re/p (grep) was the way they documented how to "globally" "print" lines containing a "regular expression" (re). When ed and ex were used, the :p command was one of the first that anyone learned and often the first one used when editing any file. It was how you printed the current contents (usually just one page full at a time using :.,+25p or some such).

Note that :% g/.../d or (its reVerse/conVerse counterpart: :% v/.../d are the most common usage patterns. However there are couple of other ex commands which are worth remembering:

We can use m to move lines around, and j to join lines. For example if you have a list and you want to separate all the stuff matching (or conversely NOT matching some pattern) without deleting them, then you can use something like: :% g/foo/m$ ... and all the "foo" lines will have been moved to the end of the file. (Note the other tip about using the end of your file as a scratch space). This will have preserved the relative order of all the "foo" lines while having extracted them from the rest of the list. (This would be equivalent to doing something like: 1G!GGmap!Ggrep foo<ENTER>1G:1,'a g/foo'/d (copy the file to its own tail, filter the tail through grep, and delete all the stuff from the head).

To join lines usually I can find a pattern for all the lines which need to be joined to their predecessor (all the lines which start with "^ " rather than "^ * " in some bullet list, for example). For that case I'd use: :% g/^ /-1j (for every matching line, go up one line and join them). (BTW: for bullet lists trying to search for the bullet lines and join to the next doesn't work for a couple reasons ... it can join one bullet line to another, and it won't join any bullet line to all of its continuations; it'll only work pairwise on the matches).

Almost needless to mention you can use our old friend s (substitute) with the g and v (global/converse-global) commands. Usually you don't need to do so. However, consider some case where you want to perform a substitution only on lines matching some other pattern. Often you can use a complicated pattern with captures and use back references to preserve the portions of the lines that you DON'T want to change. However, it will often be easier to separate the match from the substitution: :% g/foo/s/bar/zzz/g -- for every line containing "foo" substitute all "bar" with "zzz." (Something like :% s/\(.*foo.*\)bar\(.*\)/\1zzz\2/g would only work for the cases those instances of "bar" which were PRECEDED by "foo" on the same line; it's ungainly enough already, and would have to be mangled further to catch all the cases where "bar" preceded "foo")

The point is that there are more than just p, s, and d lines in the ex command set.

The : addresses can also refer to marks. Thus you can use: :'a,'bg/foo/j to join any line containing the string foo to its subsequent line, if it lies between the lines between the ' a ' and ' b ' marks. (Yes, all of the preceding ex command examples can be limited to subsets of the file's lines by prefixing with these sorts of addressing expressions).

That's pretty obscure (I've only used something like that a few times in the last 15 years). However, I'll freely admit that I've often done things iteratively and interactively that could probably have been done more efficiently if I'd taken the time to think out the correct incantation.

Another very useful vi or ex command is :r to read in the contents of another file. Thus: :r foo inserts the contents of the file named "foo" at the current line.

More powerful is the :r! command. This reads the results of a command. It's the same as suspending the vi session, running a command, redirecting its output to a temporary file, resuming your vi session, and reading in the contents from the temp. file.

Even more powerful are the ! (bang) and :... ! ( ex bang) commands. These also execute external commands and read the results into the current text. However, they also filter selections of our text through the command! Thus we can sort all the lines in our file using 1G!Gsort ( G is the vi "goto" command; it defaults to going to the last line of the file, but can be prefixed by a line number, such as 1, the first line). This is equivalent to the ex variant :1,$!sort . Writers often use ! with the Unix fmt or fold utilities for reformatting or "word wrapping" selections of text. A very common macro is {!}fmt (reformat the current paragraph). Programmers sometimes use it to run their code, or just portions of it, through indent or other code reformatting tools.

Using the :r! and ! commands means that any external utility or filter can be treated as an extension of our editor. I have occasionally used these with scripts that pulled data from a database, or with wget or lynx commands that pulled data off a website, or ssh commands that pulled data from remote systems.

Another useful ex command is :so (short for :source ). This reads the contents of a file as a series of commands. When you start vi it normally, implicitly, performs a :source on ~/.exinitrc file (and Vim usually does this on ~/.vimrc, naturally enough). The use of this is that you can change your editor profile on the fly by simply sourcing in a new set of macros, abbreviations, and editor settings. If you're sneaky you can even use this as a trick for storing sequences of ex editing commands to apply to files on demand.

For example I have a seven line file (36 characters) which runs a file through wc, and inserts a C-style comment at the top of the file containing that word count data. I can apply that "macro" to a file by using a command like: vim +'so mymacro.ex' ./mytarget

(The + command line option to vi and Vim is normally used to start the editing session at a given line number. However it's a little known fact that one can follow the + by any valid ex command/expression, such as a "source" command as I've done here; for a simple example I have scripts which invoke: vi +'/foo/d|wq!' ~/.ssh/known_hosts to remove an entry from my SSH known hosts file non-interactively while I'm re-imaging a set of servers).

Usually it's far easier to write such "macros" using Perl, AWK, or sed (which, like grep, is a utility inspired by the ed command).

The @ command is probably the most obscure vi command. In close to a decade of occasionally teaching advanced systems administration courses, I've met very few people who've ever used it. @ executes the contents of a register as if it were a vi or ex command.
Example: I often use: :r!locate ... to find some file on my system and read its name into my document. From there I delete any extraneous hits, leaving only the full path to the file I'm interested in. Rather than laboriously Tab -ing through each component of the path (or worse, if I happen to be stuck on a machine without Tab completion support in its copy of vi ) I just use:

  1. 0i:r followed by Esc (to turn the current line into a valid :r command),
  2. "cdd (to delete the line into the "c" register), and
  3. @c (to execute that command).

That's only 10 keystrokes (and the expression "cdd @c is effectively a finger macro for me, so I can type it almost as quickly as any common six letter word).


A sobering thought

I've only scratched the surface of vi's power, and none of what I've described here is even part of the "improvements" for which vim is named! All of what I've described here should work on any old copy of vi from 20 or 30 years ago.

There are people who have used considerably more of vi 's power than I ever will.

Jim Dennis, Feb 12, 2010 at 4:08

@Wahnfieden -- grok is exactly what I meant: en.wikipedia.org/wiki/Grok (It's apparently even in the OED --- the closest we anglophones have to a canonical lexicon). To "grok" an editor is to find yourself using its commands fluently ... as if they were your natural language. – Jim Dennis Feb 12 '10 at 4:08

knittl, Feb 27, 2010 at 13:15

wow, a very well written answer! i couldn't agree more, although i use the @ command a lot (in combination with q : record macro) – knittl Feb 27 '10 at 13:15

Brandon Rhodes, Mar 29, 2010 at 15:26

Superb answer that utterly redeems a really horrible question. I am going to upvote this question, that normally I would downvote, just so that this answer becomes easier to find. (And I'm an Emacs guy! But this way I'll have somewhere to point new folks who want a good explanation of what vi power users find fun about vi. Then I'll tell them about Emacs and they can decide.) – Brandon Rhodes Mar 29 '10 at 15:26

Marko, Apr 1, 2010 at 14:47

Can you make a website and put this tutorial there, so it doesn't get buried here on stackoverflow. I have yet to read a better introduction to vi than this. – Marko Apr 1 '10 at 14:47

CMS, Aug 2, 2009 at 8:27

You are talking about text selection and copying; I think you should have a look at the Vim Visual Mode.

In the visual mode, you are able to select text using Vim commands, then you can do whatever you want with the selection.

Consider the following common scenarios:

You need to select to the next matching parenthesis: you could do v% (start the selection with v, then jump to the match with %).

You want to select text between quotes: vi" (or vi' for single quotes).

You want to select a curly brace block (very common in C-style languages): viB (or, equivalently, vi{).

You want to select the entire file: ggVG.

Visual block selection is another really useful feature; it allows you to select a rectangular area of text. Just press Ctrl-V to start it, then select the text block you want and perform any type of operation such as yank, delete, paste, or edit. It's great for editing column-oriented text.
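A typical column edit, for illustration (the count and the inserted text are arbitrary):

Ctrl-V 3j    " start a block selection and extend it three lines down
I// <Esc>    " insert '// ' at the block's column on every selected line

A appends after the block instead of before it, and r replaces every character in the block.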

finnw, Aug 2, 2009 at 8:49

Every editor has something like this, it's not specific to vim. – finnw Aug 2 '09 at 8:49

guns, Aug 2, 2009 at 9:54

Yes, but it was a specific complaint of the poster. Visual mode is Vim's best method of direct text-selection and manipulation. And since vim's buffer traversal methods are superb, I find text selection in vim fairly pleasurable. – guns Aug 2 '09 at 9:54

Hamish Downer, Mar 16, 2010 at 13:34

I think it is also worth mentioning Ctrl-V to select a block - ie an arbitrary rectangle of text. When you need it it's a lifesaver. – Hamish Downer Mar 16 '10 at 13:34

CMS, Apr 2, 2010 at 2:07

@viksit: I'm using Camtasia, but there are plenty of alternatives: codinghorror.com/blog/2006/11/screencasting-for-windows.html – CMS Apr 2 '10 at 2:07

Nathan Long, Mar 1, 2011 at 19:05

Also, if you've got a visual selection and want to adjust it, o will hop to the other end. So you can move both the beginning and the end of the selection as much as you like. – Nathan Long Mar 1 '11 at 19:05

community wiki, 12 revs, 3 users 99%, Oct 29, 2012 at 18:51

Some productivity tips:

Smart movements

Quick editing commands

Combining commands

Most commands accept a count and a direction, for example:

Useful programmer commands

Macro recording

By using very specific commands and movements, VIM can replay those exact actions for the next lines (e.g. A for append-to-end, b / e to move the cursor to the beginning or end of a word respectively).
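A minimal recording sketch (register a and the edit itself are arbitrary choices):

qa       " start recording into register a
A;<Esc>  " e.g. append a semicolon to the end of the line
j        " move to the next line
q        " stop recording
10@a     " replay the recording on the next ten lines
@@       " repeat the last replay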

Example of well built settings

# reset to vim-defaults
if &compatible          # only if not set before:
  set nocompatible      # use vim-defaults instead of vi-defaults (easier, more user friendly)
endif

# display settings
set background=dark     # enable for dark terminals
set nowrap              # dont wrap lines
set scrolloff=2         # 2 lines above/below cursor when scrolling
set number              # show line numbers
set showmatch           # show matching bracket (briefly jump)
set showmode            # show mode in status bar (insert/replace/...)
set showcmd             # show typed command in status bar
set ruler               # show cursor position in status bar
set title               # show file in titlebar
set wildmenu            # completion with menu
set wildignore=*.o,*.obj,*.bak,*.exe,*.py[co],*.swp,*~,*.pyc,.svn
set laststatus=2        # use 2 lines for the status bar
set matchtime=2         # show matching bracket for 0.2 seconds
set matchpairs+=<:>     # specially for html

# editor settings
set esckeys             # map missed escape sequences (enables keypad keys)
set ignorecase          # case insensitive searching
set smartcase           # but become case sensitive if you type uppercase characters
set smartindent         # smart auto indenting
set smarttab            # smart tab handling for indenting
set magic               # change the way backslashes are used in search patterns
set bs=indent,eol,start # Allow backspacing over everything in insert mode

set tabstop=4           # number of spaces a tab counts for
set shiftwidth=4        # spaces for autoindents
#set expandtab           # turn tabs into spaces

set fileformat=unix     # file mode is unix
#set fileformats=unix,dos    # only detect unix file format, displays that ^M with dos files

# system settings
set lazyredraw          # no redraws in macros
set confirm             # get a dialog when :q, :w, or :wq fails
set nobackup            # no backup~ files.
set viminfo='20,\"500   # remember copy registers after quitting in the .viminfo file -- 20 jump links, regs up to 500 lines
set hidden              # remember undo after quitting
set history=50          # keep 50 lines of command history
set mouse=v             # use mouse in visual mode (not normal, insert, command, or help mode)


# color settings (if terminal/gui supports it)
if &t_Co > 2 || has("gui_running")
  syntax on          # enable colors
  set hlsearch       # highlight search (very useful!)
  set incsearch      # search incremently (search while typing)
endif

# paste mode toggle (needed when using autoindent/smartindent)
map <F10> :set paste<CR>
map <F11> :set nopaste<CR>
imap <F10> <C-O>:set paste<CR>
imap <F11> <nop>
set pastetoggle=<F11>

# Use of the filetype plugins, auto completion and indentation support
filetype plugin indent on

# file type specific settings
if has("autocmd")
  # For debugging
  #set verbose=9

  # if bash is sh.
  let bash_is_sh=1

  # change to directory of current file automatically
  autocmd BufEnter * lcd %:p:h

  # Put these in an autocmd group, so that we can delete them easily.
  augroup mysettings
    au FileType xslt,xml,css,html,xhtml,javascript,sh,config,c,cpp,docbook set smartindent shiftwidth=2 softtabstop=2 expandtab
    au FileType tex set wrap shiftwidth=2 softtabstop=2 expandtab

    # Confirm to PEP8
    au FileType python set tabstop=4 softtabstop=4 expandtab shiftwidth=4 cinwords=if,elif,else,for,while,try,except,finally,def,class
  augroup END

  augroup perl
    # reset (disable previous 'augroup perl' settings)
    au!  

    au BufReadPre,BufNewFile
    \ *.pl,*.pm
    \ set formatoptions=croq smartindent shiftwidth=2 softtabstop=2 cindent cinkeys='0{,0},!^F,o,O,e' " tags=./tags,tags,~/devel/tags,~/devel/C
    # formatoption:
    #   t - wrap text using textwidth
    #   c - wrap comments using textwidth (and auto insert comment leader)
    #   r - auto insert comment leader when pressing <return> in insert mode
    #   o - auto insert comment leader when pressing 'o' or 'O'.
    #   q - allow formatting of comments with "gq"
    #   a - auto formatting for paragraphs
    #   n - auto wrap numbered lists
    #   
  augroup END


  # Always jump to the last known cursor position. 
  # Don't do it when the position is invalid or when inside
  # an event handler (happens when dropping a file on gvim). 
  autocmd BufReadPost * 
    \ if line("'\"") > 0 && line("'\"") <= line("$") | 
    \   exe "normal g`\"" | 
    \ endif 

endif # has("autocmd")

The settings can be stored in ~/.vimrc, or system-wide in /etc/vimrc.local, and then be read from the /etc/vimrc file using:

source /etc/vimrc.local

(You'll have to replace the # comment character with " to make it work in Vim; # was used here only to get proper syntax highlighting.)

The commands I've listed here are pretty basic, and the main ones I use so far. They already make me considerably more productive, without my having to know all the fancy stuff.

naught101, Apr 28, 2012 at 2:09

Better than '. is g;, which jumps back through the changelist . Goes to the last edited position, instead of last edited line – naught101 Apr 28 '12 at 2:09

community wiki, 5 revs, 4 users 53%, Apr 12, 2012 at 7:46

The Control + R mechanism is very useful :-) In either insert mode or command mode (i.e. on the : line when typing commands), continue with a numbered or named register:

See :help i_CTRL-R and :help c_CTRL-R for more details, and snoop around nearby for more CTRL-R goodness.
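For example (these are standard built-in registers):

Ctrl-R 0    " in insert mode: insert the most recent yank
Ctrl-R %    " on the : line: insert the current file name
Ctrl-R /    " on the : line: insert the last search pattern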

vdboor, Jun 3, 2010 at 9:08

FYI, this refers to Ctrl+R in insert mode . In normal mode, Ctrl+R is redo. – vdboor Jun 3 '10 at 9:08

Aryeh Leib Taurog, Feb 26, 2012 at 19:06

+1 for current/alternate file name. Control-A also works in insert mode for last inserted text, and Control-@ to both insert last inserted text and immediately switch to normal mode. – Aryeh Leib Taurog Feb 26 '12 at 19:06

community wiki, Benson, Apr 1, 2010 at 3:44

Vim Plugins

There are a lot of good answers here, and one amazing one about the zen of vi. One thing I don't see mentioned is that vim is extremely extensible via plugins. There are scripts and plugins to make it do all kinds of crazy things the original author never considered. Here are a few examples of incredibly handy vim plugins:

rails.vim

Rails.vim is a plugin written by tpope. It's an incredible tool for people doing rails development. It does magical context-sensitive things that allow you to easily jump from a method in a controller to the associated view, over to a model, and down to unit tests for that model. It has saved me dozens if not hundreds of hours as a rails developer.

gist.vim

This plugin allows you to select a region of text in visual mode and type a quick command to post it to gist.github.com . This allows for easy pastebin access, which is incredibly handy if you're collaborating with someone over IRC or IM.

space.vim

This plugin provides special functionality to the spacebar. It turns the spacebar into something analogous to the period, but instead of repeating actions it repeats motions. This can be very handy for moving quickly through a file in a way you define on the fly.

surround.vim

This plugin gives you the ability to work with text that is delimited in some fashion. It gives you objects which denote things inside of parens, things inside of quotes, etc. It can come in handy for manipulating delimited text.

supertab.vim

This script brings fancy tab completion functionality to vim. The autocomplete stuff is already there in the core of vim, but this brings it to a quick tab rather than multiple different multikey shortcuts. Very handy, and incredibly fun to use. While it's not VS's IntelliSense, it's a great step and brings a great deal of the functionality you'd expect from a tab completion tool.

syntastic.vim

This tool brings external syntax checking commands into vim. I haven't used it personally, but I've heard great things about it and the concept is hard to beat. Checking syntax without having to do it manually is a great time saver and can help you catch syntactic bugs as you introduce them rather than when you finally stop to test.

fugitive.vim

Direct access to git from inside of vim. Again, I haven't used this plugin, but I can see the utility. Unfortunately I'm in a culture where svn is considered "new", so I won't likely see git at work for quite some time.

nerdtree.vim

A tree browser for vim. I started using this recently, and it's really handy. It lets you put a treeview in a vertical split and open files easily. This is great for a project with a lot of source files you frequently jump between.

FuzzyFinderTextmate.vim

This is an unmaintained plugin, but still incredibly useful. It provides the ability to open files using a "fuzzy" descriptive syntax. It means that in a sparse tree of files you need only type enough characters to disambiguate the files you're interested in from the rest of the cruft.

Conclusion

There are a lot of incredible tools available for vim. I'm sure I've only scratched the surface here, and it's well worth searching for tools applicable to your domain. The combination of traditional vi's powerful toolset, vim's improvements on it, and plugins which extend vim even further makes this one of the most powerful ways to edit text ever conceived. Vim is easily as powerful as emacs, eclipse, visual studio, and textmate.

Thanks

Thanks to duwanis for his vim configs from which I have learned much and borrowed most of the plugins listed here.

Tom Morris, Apr 1, 2010 at 8:50

The magical tests-to-class navigation in rails.vim is one of the more general things I wish Vim had that TextMate absolutely nails across all languages: if I am working on Person.scala and I do Cmd+T, usually the first thing in the list is PersonTest.scala. – Tom Morris Apr 1 '10 at 8:50

Gavin Gilmour, Jan 15, 2011 at 13:44

I think it's time FuzzyFinderTextmate started to get replaced with github.com/wincent/Command-T – Gavin Gilmour Jan 15 '11 at 13:44

Nathan Long, Mar 1, 2011 at 19:07

+1 for Syntastic. That, combined with JSLint, has made my Javascript much less error-prone. See superuser.com/questions/247012/ about how to set up JSLint to work with Syntastic. – Nathan Long Mar 1 '11 at 19:07

AlG, Sep 13, 2011 at 17:37

@Benson Great list! I'd toss in snipMate as well. Very helpful automation of common coding stuff. if<tab> instant if block, etc. – AlG Sep 13 '11 at 17:37

EarlOfEgo, May 12, 2012 at 15:13

I think nerdcommenter is also a good plugin. Like its name says, it is for commenting your code. – EarlOfEgo May 12 '12 at 15:13

community wiki, 4 revs, 2 users 89%, Mar 31, 2010 at 23:01

. Repeat last text-changing command

I save a lot of time with this one.

Visual mode was mentioned previously, but block visual mode (accessed with Ctrl-V) has saved me a lot of time when editing fixed-size columns in a text file.

vdboor, Apr 1, 2010 at 8:34

Additionally, if you use a concise command (e.g. A for append-at-end) to edit the text, vim can repeat that exact same action for the next line you press the . key at. – vdboor Apr 1 '10 at 8:34

community wiki, 3 revs, 3 users 87%, Dec 24, 2012 at 14:50

gi

Go to the last edited location (very useful if you performed some searching and then want to go back to editing).

^P and ^N

Complete previous (^P) or next (^N) text.

^O and ^I

Go to the previous (^O -- "O" for old) location or to the next (^I -- "I" is just next to "O"). When you perform searches, edit files, etc., you can navigate through these "jumps" forward and back.

R. Martinho Fernandes, Apr 1, 2010 at 3:02

Thanks for gi ! Now I don't need marks for that! – R. Martinho Fernandes Apr 1 '10 at 3:02

Kungi, Feb 10, 2011 at 16:23

I Think this can also be done with `` – Kungi Feb 10 '11 at 16:23

Grant McLean, Aug 23, 2011 at 8:21

@Kungi `. will take you to the last edit `` will take you back to the position you were in before the last 'jump' - which /might/ also be the position of the last edit. – Grant McLean Aug 23 '11 at 8:21

community wiki, Ronny Brendel, Mar 31, 2010 at 19:37

I recently discovered this site: http://vimcasts.org/

It's pretty new and really really good. The guy who is running the site switched from textmate to vim and hosts very good and concise casts on specific vim topics. Check it out!

Jeromy Anglim, Jan 13, 2011 at 6:40

If you like vim tutorials, check out Derek Wyatt's vim videos as well. They're excellent. – Jeromy Anglim Jan 13 '11 at 6:40

community wiki, 2 revs, 2 users 67%, Feb 27, 2010 at 11:20

CTRL + A increments the number you are standing on.

innaM, Aug 3, 2009 at 9:14

... and CTRL-X decrements. – innaM Aug 3 '09 at 9:14

SolutionYogi, Feb 26, 2010 at 20:43

It's a neat shortcut but so far I have NEVER found any use for it. – SolutionYogi Feb 26 '10 at 20:43

matja, Feb 27, 2010 at 14:21

if you run vim in screen and wonder why this doesn't work - ctrl+A, A – matja Feb 27 '10 at 14:21

hcs42, Feb 27, 2010 at 19:05

@SolutionYogi: Consider that you want to add line number to the beginning of each line. Solution: ggI1<space><esc>0qqyawjP0<c-a>0q9999@q – hcs42 Feb 27 '10 at 19:05

blueyed, Apr 1, 2010 at 14:47

Extremely useful with Vimperator, where it increments (or decrements, Ctrl-X) the last number in the URL. Useful for quickly surfing through image galleries etc. – blueyed Apr 1 '10 at 14:47

community wiki, 3 revs, Aug 28, 2009 at 15:23

All in Normal mode:

f<char> to move to the next instance of a particular character on the current line, and ; to repeat.

F<char> to move to the previous instance of a particular character on the current line and ; to repeat.

If used intelligently, the above two can make you killer-quick moving around in a line.

* on a word to search for the next instance.

# on a word to search for the previous instance.

Jim Dennis, Mar 14, 2010 at 6:38

Whoa, I didn't know about the * and # (search forward/back for word under cursor) binding. That's kinda cool. The f/F and t/T and ; commands are quick jumps to characters on the current line. f/F put the cursor on the indicated character while t/T puts it just up "to" the character (the character just before or after it according to the direction chosen). ; simply repeats the most recent f/F/t/T jump (in the same direction). – Jim Dennis Mar 14 '10 at 6:38

Steve K, Apr 3, 2010 at 23:50

:) The tagline at the top of the tips page at vim.org: "Can you imagine how many keystrokes could have been saved, if I only had known the "*" command in time?" - Juergen Salk, 1/19/2001 – Steve K Apr 3 '10 at 23:50

puk, Feb 24, 2012 at 6:45

As Jim mentioned, the "t/T" combo is often just as good, if not better; for example, ct( will erase the word and put you in insert mode, but keep the parentheses! – puk Feb 24 '12 at 6:45

community wiki, agfe2, Aug 19, 2010 at 8:08

Session

a. save session

:mks sessionname

b. force save session

:mks! sessionname

c. load session

gvim or vim -S sessionname


Adding and Subtracting

a. Adding and Subtracting

CTRL-A: Add [count] to the number or alphabetic character at or after the cursor. {not in Vi}

CTRL-X: Subtract [count] from the number or alphabetic character at or after the cursor. {not in Vi}

b. Windows key unmapping

On Windows, Ctrl-A is already mapped to whole-file selection (by mswin.vim), so you need to unmap it in your rc file: either comment out the CTRL-A mapping part of mswin.vim, or add an unmap line to your own rc file.

c. With Macro

The CTRL-A command is very useful in a macro. Example: Use the following steps to make a numbered list.

  1. Create the first list entry; make sure it starts with a number.
  2. qa - start recording into register 'a'
  3. Y - yank the entry
  4. p - put a copy of the entry below the first one
  5. CTRL-A - increment the number
  6. q - stop recording
  7. @a - repeat the yank, put and increment [count] times
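Put together as raw keystrokes (register a and the count of five are arbitrary choices):

qaYp<C-A>q    " record: yank the line, put a copy, increment the number
5@a           " replay five times to create five more entries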

Don Reba, Aug 22, 2010 at 5:22

Any idea what the shortcuts are in Windows? – Don Reba Aug 22 '10 at 5:22

community wiki, 8 revs, 2 users 98%, Aug 18, 2012 at 21:44

Last week at work our project inherited a lot of Python code from another project. Unfortunately the code did not fit into our existing architecture - it was all done with global variables and functions, which would not work in a multi-threaded environment.

We had ~80 files that needed to be reworked to be object oriented - all the functions moved into classes, parameters changed, import statements added, etc. We had a list of about 20 types of fix that needed to be done to each file. I would estimate that doing it by hand one person could do maybe 2-4 per day.

So I did the first one by hand and then wrote a vim script to automate the changes. Most of it was a list of vim commands e.g.

" delete an un-needed function "
g/someFunction(/ d

" add wibble parameter to function foo "
%s/foo(/foo( wibble,/

" convert all function calls bar(thing) into method calls thing.bar() "
g/bar(/ normal nmaf(ldi(`aPa.

The last one deserves a bit of explanation:

g/bar(/  executes the following command on every line that contains "bar("
normal   execute the following text as if it was typed in in normal mode
n        goes to the next match of "bar(" (since the :g command leaves the cursor position at the start of the line)
ma       saves the cursor position in mark a
f(       moves forward to the next opening bracket
l        moves right one character, so the cursor is now inside the brackets
di(      delete all the text inside the brackets
`a       go back to the position saved as mark a (i.e. the first character of "bar")
P        paste the deleted text before the current cursor position
a.       go into insert mode and add a "."

For a couple of more complex transformations such as generating all the import statements I embedded some python into the vim script.

After a few hours of working on it I had a script that would do at least 95% of the conversion. I just open a file in vim, run :source fixit.vim, and the file is transformed in the blink of an eye.

We still have the work of changing the remaining 5% that was not worth automating and of testing the results, but by spending a day writing this script I estimate we have saved weeks of work.

Of course it would have been possible to automate this with a scripting language like Python or Ruby, but it would have taken far longer to write and would be less flexible - the last example would have been difficult since regex alone would not be able to handle nested brackets, e.g. to convert bar(foo(xxx)) to foo(xxx).bar() . Vim was perfect for the task.

Olivier Pons, Feb 28, 2010 at 14:41

Thanks a lot for sharing; it's really nice to learn from "useful & not classical" macros. – Olivier Pons Feb 28 '10 at 14:41

Ipsquiggle, Mar 23, 2010 at 16:55

%s/\(bar\)(\(.\+\))/\2.\1()/ would do that too. (Escapes are compatible with :set magic .) Just for the record. :) – Ipsquiggle Mar 23 '10 at 16:55

Ipsquiggle, Mar 23, 2010 at 16:56

Or if you don't like vim-style escapes, use \v to turn on Very Magic: %s/\v(bar)\((.+)\)/\2.\1()/ – Ipsquiggle Mar 23 '10 at 16:56

Dave Kirby, Mar 23, 2010 at 17:16

@lpsquiggle: your suggestion would not handle complex expressions with more than one set of brackets. e.g. if bar(foo(xxx)) or wibble(xxx): becomes if foo(xxx)) or wibble(xxx.bar(): which is completely wrong. – Dave Kirby Mar 23 '10 at 17:16

community wiki, 2 revs, Aug 2, 2009 at 11:17

Use the built-in file explorer! The command is :Explore and it allows you to navigate through your source code very, very fast. I have these mappings in my .vimrc:
map <silent> <F8>   :Explore<CR>
map <silent> <S-F8> :sp +Explore<CR>

The explorer allows you to make file modifications, too. Pressing <F1> will give you the full list of keys.

Svend, Aug 2, 2009 at 8:48

I always thought the default methods for browsing kinda sucked for most stuff. It's just slow to browse, if you know where you wanna go. LustyExplorer from vim.org's script section is a much needed improvement. – Svend Aug 2 '09 at 8:48

Taurus Olson, Aug 6, 2009 at 17:37

Your second mapping could be more simple: map <silent> <S-F8> :Sexplore<CR> – Taurus Olson Aug 6 '09 at 17:37

kprobst, Apr 1, 2010 at 3:53

I recommend NERDtree instead of the built-in explorer. It has changed the way I used vim for projects and made me much more productive. Just google for it. – kprobst Apr 1 '10 at 3:53

dash-tom-bang, Aug 24, 2011 at 0:35

I never feel the need to explore the source tree, I just use :find, :tag and the various related keystrokes to jump around. (Maybe this is because the source trees I work on are big and organized differently than I would have done? :) ) – dash-tom-bang Aug 24 '11 at 0:35

community wiki, 2 revs, 2 users 92%, Jun 15, 2011 at 13:39

I am a member of the American Cryptogram Association. The bimonthly magazine includes over 100 cryptograms of various sorts. Roughly 15 of these are "cryptarithms" - various types of arithmetic problems with letters substituted for the digits. Two or three of these are sudokus, except with letters instead of numbers. When the grid is completed, the nine distinct letters will spell out a word or words, on some line, diagonal, spiral, etc., somewhere in the grid.

Rather than working with pencil, or typing the problems in by hand, I download the problems from the members area of their website.

When working with these sudokus, I use vi, simply because I'm using facilities that vi has that few other editors have. Mostly in converting the lettered grid into a numbered grid, because I find it easier to solve, and then the completed numbered grid back into the lettered grid to find the solution word or words.

The problem is formatted as nine groups of nine letters, with - s representing the blanks, written in two lines. The first step is to format these into nine lines of nine characters each. There's nothing special about this, just inserting eight linebreaks in the appropriate places.

The result will look like this:

T-O-----C
-E-----S-
--AT--N-L
---NASO--
---E-T---
--SPCL---
E-T--OS--
-A-----P-
S-----C-T

So, the first step in converting this into numbers is to make a list of the distinct letters. First, I make a copy of the block. I position the cursor at the top of the block, then type y}}p (normal-mode commands). y yanks the text covered by the next movement command; since } is a move to the end of the next paragraph, y} yanks the paragraph. The second } then moves the cursor to the end of the paragraph, and p pastes what we had yanked just after the cursor. So y}}p creates a copy of the next paragraph, and ends up with the cursor between the two copies.

Next, I turn one of those copies into a list of distinct letters. That command is a bit more complex:

!}tr -cd A-Z | sed 's/\(.\)/\1\n/g' | sort -u | tr -d '\n'

Here ! is the normal-mode filter operator, and } supplies its range: !} filters the next paragraph through a command line. That command line uses tr to strip out everything except upper-case letters, sed to print each letter on its own line, sort -u to sort those lines and remove duplicates, and a final tr to strip out the newlines, leaving the nine distinct letters in a single line, replacing the nine lines that had made up the paragraph originally. In this case, the letters are: ACELNOPST .

The next step is to make another copy of the grid, and then to use the letters I've just identified to replace each of those letters with a digit (the nine letters map onto 0-8). That's simple: !}tr ACELNOPST 0-9 . The result is:

8-5-----1
-2-----7-
--08--4-3
---4075--
---2-8---
--7613---
2-8--57--
-0-----6-
7-----1-8

This can then be solved in the usual way, or entered into any sudoku solver you might prefer. The completed solution can then be converted back into letters with !}tr 0-9 ACELNOPST .

There is power in vi that is matched by very few others. The biggest problem is that only a very few of the vi tutorial books, websites, help-files, etc., do more than barely touch the surface of what is possible.

hhh, Jan 14, 2011 at 17:12

and an irritation is that some distros such as ubuntu have aliases from the word "vi" to "vim", so people won't really see vi. Excellent example, have to try... +1 – hhh Jan 14 '11 at 17:12

dash-tom-bang, Aug 24, 2011 at 0:45

Doesn't vim check the name it was started with so that it can come up in the right 'mode'? – dash-tom-bang Aug 24 '11 at 0:45

sehe, Mar 4, 2012 at 20:47

I'm baffled by this repeated error: you say you need : to go into command mode, but then invariably you specify normal mode commands (like y}}p ) which cannot possibly work from the command mode?! – sehe Mar 4 '12 at 20:47

sehe, Mar 4, 2012 at 20:56

My take on the unique chars challenge: :se tw=1 fo= (preparation) VG:s/./& /g (insert spaces), gvgq (split onto separate lines), V{:sort u (sort and remove duplicates) – sehe Mar 4 '12 at 20:56

community wiki, jqno, Aug 2, 2009 at 8:59

Bulk text manipulations!

Either through macros, or through regular expressions.

(But be warned: if you do the latter, you'll have 2 problems :).)

Jim Dennis, Jan 10, 2010 at 4:03

+1 for the Jamie Zawinski reference. (No points taken back for failing to link to it, even). :) – Jim Dennis Jan 10 '10 at 4:03

jqno, Jan 10, 2010 at 10:06

@Jim I didn't even know it was a Jamie Zawinski quote :). I'll try to remember it from now on. – jqno Jan 10 '10 at 10:06

Jim Dennis, Feb 12, 2010 at 4:15

I find the following trick increasingly useful ... for cases where you want to join lines that match (or that do NOT match) some pattern to the previous line: :% g/foo/-1j or :'a,'z v/bar/-1j for example (where the former is "all lines matching the pattern" while the latter is "lines between mark a and mark z which fail to match the pattern"). The part after the pattern in a g or v ex command can be any other ex command; -1j is just a relative line movement and join command. – Jim Dennis Feb 12 '10 at 4:15

JustJeff, Feb 27, 2010 at 12:54

of course, if you name your macro '2', then when it comes time to use it, you don't even have to move your finger from the '@' key to the 'q' key. Probably saves 50 to 100 milliseconds every time right there. =P – JustJeff Feb 27 '10 at 12:54

Simon Steele, Apr 1, 2010 at 13:12

@JustJeff Depends entirely on your keyboard layout, my @ key is at the other side of the keyboard from my 2 key. – Simon Steele Apr 1 '10 at 13:12

community wiki, David Pope, Apr 2, 2012 at 7:56

I recently discovered q: . It opens the "command window" and shows your most recent ex-mode (command-mode) commands. You can move as usual within the window, and pressing <CR> executes the command. You can edit, etc. too. Priceless when you're messing around with some complex command or regex and you don't want to retype the whole thing, or if the complex thing you want to do was 3 commands back. It's almost like bash's set -o vi, but for vim itself (heh!).

See :help q: for more interesting bits for going back and forth.
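In the same vein (as the quotes at the top of this section note), q/ and q? open the same kind of window for your search-pattern history, and Ctrl-F pops up the command history window when you're already on the : line.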

community wiki, 2 revs, 2 users 56%, Feb 27, 2010 at 11:29

I just discovered Vim's omnicompletion the other day, and while I'll admit I'm a bit hazy on which does what, I've had surprisingly good results just mashing either Ctrl + x Ctrl + u or Ctrl + n / Ctrl + p in insert mode. It's not quite IntelliSense, but I'm still learning it.

Try it out! :help ins-completion

community wiki, tfmoraes, Mar 14, 2010 at 19:49

These are not shortcuts, but they are related:
  1. Make capslock an additional ESC (or Ctrl)
  2. map leader to "," (comma), with this command: let mapleader=","

They boost my productivity.

Olivier Pons, Mar 15, 2010 at 10:09

Hey nice hint about the "\"! Far better to type "," than "\". – Olivier Pons Mar 15 '10 at 10:09

R. Martinho Fernandes, Apr 1, 2010 at 3:30

To make Caps Lock an additional Esc in Windows (what's a caps lock key for? An "any key"?), try this: web.archive.org/web/20100418005858/http://webpages.charter.net/ – R. Martinho Fernandes Apr 1 '10 at 3:30

Tom Morris, Apr 1, 2010 at 8:45

On Mac, you need PCKeyboardHack - details at superuser.com/questions/34223/ – Tom Morris Apr 1 '10 at 8:45

Jeromy Anglim, Jan 10, 2011 at 4:43

On Windows I use AutoHotKey with Capslock::Escape – Jeromy Anglim Jan 10 '11 at 4:43

community wiki, Costyn, Sep 20, 2010 at 10:34

Another useful vi "shortcut" I frequently use is 'xp'. This will swap the character under the cursor with the next character.

tester, Aug 22, 2011 at 17:19

Xp to go the other way – tester Aug 22 '11 at 17:19

kguest, Aug 27, 2011 at 8:21

Around the time that Windows xp came out, I used to joke that this is the only good use for it. – kguest Aug 27 '11 at 8:21

community wiki, Peter Ellis, Aug 2, 2009 at 9:47

<Ctrl> + W, V to split the screen vertically
<Ctrl> + W, W to shift between the windows

:!python % [args] to run the script I am editing in this window

zf in visual mode to fold arbitrary lines

Andrew Scagnelli, Apr 1, 2010 at 2:58

<Ctrl> + W and j/k will let you navigate absolutely (j down, k up, as with normal vim). This is great when you have 3+ splits. – Andrew Scagnelli Apr 1 '10 at 2:58

coder_tim, Jan 30, 2012 at 20:08

+1 for zf in visual mode, I like code folding, but did not know about that. – coder_tim Jan 30 '12 at 20:08

puk, Feb 24, 2012 at 7:00

after bashing my keyboard I have deduced that <C-w>n or <C-w>s is new horizontal window, <C-w>b is bottom right window, <C-w>c or <C-w>q is close window, <C-w>x is increase and then decrease window width (??), <C-w>p is last window, <C-w>backspace is move left(ish) window – puk Feb 24 '12 at 7:00

sjas, Jun 25, 2012 at 0:25

:help ctrl-w FTW... do yourself a favour, and force yourself to try these things for at least 15 minutes! – sjas Jun 25 '12 at 0:25

community wiki, 2 revs, Apr 1, 2010 at 17:00

Visual Mode

As several other people have said, visual mode is the answer to your copy/cut & paste problem. Vim gives you 'v', 'V', and C-v. Lower case 'v' in vim is essentially the same as the shift key in notepad. The nice thing is that you don't have to hold it down. You can use any movement technique to navigate efficiently to the starting (or ending) point of your selection. Then hit 'v', and use efficient movement techniques again to navigate to the other end of your selection. Then 'd' or 'y' allows you to cut or copy that selection.

The advantage vim's visual mode has over Jim Dennis's description of cut/copy/paste in vi is that you don't have to get the location exactly right. Sometimes it's more efficient to use a quick movement to get to the general vicinity of where you want to go and then refine that with other movements than to think up a more complex single movement command that gets you exactly where you want to go.

The downside to using visual mode extensively in this manner is that it can become a crutch that you use all the time which prevents you from learning new vi(m) commands that might allow you to do things more efficiently. However, if you are very proactive about learning new aspects of vi(m), then this probably won't affect you much.

I'll also re-emphasize that the visual line and visual block modes give you variations on this same theme that can be very powerful...especially the visual block mode.

On Efficient Use of the Keyboard

I also disagree with your assertion that alternating hands is the fastest way to use the keyboard. It has an element of truth in it. Speaking very generally, repeated use of the same thing is slow. The most significant example of this principle is that consecutive keystrokes typed with the same finger are very slow. Your assertion probably stems from the natural tendency to use the s/finger/hand/ transformation on this pattern. To some extent it's correct, but at the extremely high end of the efficiency spectrum it's incorrect.

Just ask any pianist. Ask them whether it's faster to play a succession of a few notes alternating hands or using consecutive fingers of a single hand in sequence. The fastest way to type 4 keystrokes is not to alternate hands, but to type them with 4 fingers of the same hand in either ascending or descending order (call this a "run"). This should be self-evident once you've considered this possibility.

The more difficult problem is optimizing for this. It's pretty easy to optimize for absolute distance on the keyboard. Vim does that. It's much harder to optimize at the "run" level, but vi(m) with its modal editing gives you a better chance at being able to do it than any non-modal approach (ahem, emacs) ever could.

On Emacs

Lest the emacs zealots completely disregard my whole post on account of that last parenthetical comment, I feel I must describe the root of the difference between the emacs and vim religions. I've never spoken up in the editor wars and I probably won't do it again, but I've never heard anyone describe the differences this way, so here it goes. The difference is the following tradeoff:

Vim gives you unmatched raw text editing efficiency. Emacs gives you unmatched ability to customize and program the editor.

The blind vim zealots will claim that vim has a scripting language. But it's an obscure, ad-hoc language that was designed to serve the editor. Emacs has Lisp! Enough said. If you don't appreciate the significance of those last two sentences or have a desire to learn enough about functional programming and Lisp to develop that appreciation, then you should use vim.

The emacs zealots will claim that emacs has viper mode, and so it is a superset of vim. But viper mode isn't standard. My understanding is that viper mode is not used by the majority of emacs users. Since it's not the default, most emacs users probably don't develop a true appreciation for the benefits of the modal paradigm.

In my opinion these differences are orthogonal. I believe the benefits of vim and emacs as I have stated them are both valid. This means that the ultimate editor doesn't exist yet. It's probably true that emacs would be the easiest platform on which to base the ultimate editor. But modal editing is not entrenched in the emacs mindset. The emacs community could move that way in the future, but that doesn't seem very likely.

So if you want raw editing efficiency, use vim. If you want the ultimate environment for scripting and programming your editor use emacs. If you want some of both with an emphasis on programmability, use emacs with viper mode (or program your own mode). If you want the best of both worlds, you're out of luck for now.

community wiki, konryd, Mar 31, 2010 at 22:44

Spend 30 mins doing the vim tutorial (run vimtutor instead of vim in terminal). You will learn the basic movements, and some keystrokes, this will make you at least as productive with vim as with the text editor you used before. After that, well, read Jim Dennis' answer again :)

dash-tom-bang, Aug 24, 2011 at 0:47

This is the first thing I thought of when reading the OP. It's obvious that the poster has never run this; I ran through it when first learning vim two years ago and it cemented in my mind the superiority of Vim to any of the other editors I've used (including, for me, Emacs since the key combos are annoying to use on a Mac). – dash-tom-bang Aug 24 '11 at 0:47

community wiki, Johnsyweb, Jan 12, 2011 at 22:52

What is the way you use Vim that makes you more productive than with a contemporary editor?

Being able to execute complex, repetitive edits with very few keystrokes (often using macros ). Take a look at VimGolf to witness the power of Vim!

After over ten years of almost daily usage, it's hard to imagine using any other editor.

community wiki, 2 revs, 2 users 67%, Jun 15, 2011 at 13:42

Use \c anywhere in a search to ignore case (overriding your ignorecase or smartcase settings). E.g. /\cfoo or /foo\c will match foo, Foo, fOO, FOO, etc.

Use \C anywhere in a search to force case matching. E.g. /\Cfoo or /foo\C will only match foo.

community wiki, 2 revs, 2 users 67%, Jun 15, 2011 at 13:44

I was surprised to find no one mention the t movement. I frequently use it with parameter lists, in the form of dt, or yt, (that is, delete or yank up to the next comma).

hhh, Jan 14, 2011 at 17:09

or dfx, dFx, dtx, ytx, etc where x is a char, +1 – hhh Jan 14 '11 at 17:09

dash-tom-bang, Aug 24, 2011 at 0:48

@hhh yep, T t f and F are all pretty regular keys for me to hit... – dash-tom-bang Aug 24 '11 at 0:48

markle976, Mar 30, 2012 at 13:52

Yes! And don't forget ct (change to). – markle976 Mar 30 '12 at 13:52

sjas, Jun 24, 2012 at 23:35

t for teh win!!! – sjas Jun 24 '12 at 23:35

community wiki, 3 revs, May 6, 2012 at 20:50

Odd nobody's mentioned ctags. Download "exuberant ctags" and put it ahead of the crappy preinstalled version you already have in your search path. cd to the root of whatever you're working on; for example the Android kernel distribution. Type "ctags -R ." to build an index of source files anywhere beneath that dir in a file named "tags". This contains all tags, no matter the language or where in the directory tree, in one file, so cross-language work is easy.

Then open vim in that folder and read :help ctags for some commands. A few I use often:
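A few of the standard ones (not necessarily the poster's own list):

Ctrl-]      " jump to the definition of the identifier under the cursor
Ctrl-T      " jump back to where you were before the tag jump
:tag foo    " jump to the tag foo (a placeholder name)
:tn, :tp    " go to the next / previous matching tag
:ts         " list all matching tags and pick one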

community wiki, 2 revs, 2 users 67%, Feb 27, 2010 at 11:19

Automatic indentation:

gg (go to start of document)
= (indent time!)
shift-g (go to end of document)

In other words, gg=G re-indents the entire file. You'll need 'filetype plugin indent on' in your .vimrc file, and probably appropriate 'shiftwidth' and 'expandtab' settings.

xcramps, Aug 28, 2009 at 17:14

Or just use the ":set ai" (auto-indent) facility, which has been in vi since the beginning. – xcramps Aug 28 '09 at 17:14

community wiki, autodidakto, Jul 24, 2010 at 5:41

You asked about productive shortcuts, but I think your real question is: Is vim worth it? The answer to this stackoverflow question is -> "Yes"

You must have noticed two things: Vim is powerful, and vim is hard to learn. Much of its power lies in its expandability and endless combinations of commands. Don't feel overwhelmed. Go slow. One command, one plugin at a time. Don't overdo it.

All that investment you put into vim will pay back a thousand fold. You're going to be inside a text editor for many, many hours before you die. Vim will be your companion.

community wiki, 2 revs, 2 users 67%, Feb 27, 2010 at 11:23

Multiple buffers, and in particular fast jumping between them to compare two files with :bp and :bn (properly remapped to a single Shift + p or Shift + n )

vimdiff mode (splits in two vertical buffers, with colors to show the differences)

Area-copy with Ctrl + v

And finally, tab completion of identifiers (search for "mosh_tab_or_complete"). That's a life changer.

community wiki, David Wolever, Aug 28, 2009 at 16:07

Agreed with the top poster - the :r! command is very useful.

Most often I use it to "paste" things:

:r!cat
(Ctrl-V to paste from the OS clipboard)
^D

This way I don't have to fiddle with :set paste .

R. Martinho Fernandes, Apr 1, 2010 at 3:17

Probably better to set the clipboard option to unnamed ( set clipboard=unnamed in your .vimrc) to use the system clipboard by default. Or if you still want the system clipboard separate from the unnamed register, use the appropriately named clipboard register: "*p . – R. Martinho Fernandes Apr 1 '10 at 3:17

kevpie, Oct 12, 2010 at 22:38

Love it! I had been exasperated by pasting code examples from the web just as I was starting to feel proficient in vim. That was the command I dreamed up on the spot. This was when vim totally hooked me. – kevpie Oct 12 '10 at 22:38

Ben Mordecai, Feb 6, 2013 at 19:54

If you're developing on a Mac, Command+C and Command+V copy and paste using the system clipboard, no remap required. – Ben Mordecai Feb 6 '13 at 19:54

David Wolever, Feb 6, 2013 at 20:55

Only with GVim. From the console, pasting without :set paste doesn't work so well if autoindent is enabled. – David Wolever Feb 6 '13 at 20:55

[Oct 21, 2018] What are the dark corners of Vim your mom never told you about?

Notable quotes:
"... Want to look at your :command history? q: Then browse, edit and finally to execute the command. ..."
"... from the ex editor (:), you can do CTRL-f to pop up the command history window. ..."
"... q/ and q? can be used to do a similar thing for your search patterns. ..."
"... adjacent to the one I just edit ..."
Nov 16, 2011 | stackoverflow.com

Asked Nov 16, 2011 at 0:44

There are a plethora of questions where people talk about common tricks, notably " Vim+ctags tips and tricks ".

However, I don't refer to commonly used shortcuts that someone new to Vim would find cool. I am talking about a seasoned Unix user (be they a developer, administrator, both, etc.), who thinks they know something 99% of us never heard or dreamed about. Something that not only makes their work easier, but also is COOL and hackish .

After all, Vim resides in the most dark-corner-rich OS in the world, thus it should have intricacies that only a few privileged know about and want to share with us.

user3218088, Jun 16, 2014 at 9:51

:Sex -- Split window and open integrated file explorer (horizontal split) – user3218088 Jun 16 '14 at 9:51

community wiki, 2 revs, Apr 7, 2009 at 19:04

Might not be one that 99% of Vim users don't know about, but it's something I use daily and that any Linux+Vim poweruser must know.

Basic command, yet extremely useful.

:w !sudo tee %

I often forget to sudo before editing a file I don't have write permissions on. When I come to save that file and get a permission error, I just issue that vim command in order to save the file without the need to save it to a temp file and then copy it back again.

You obviously have to be on a system with sudo installed and have sudo rights.
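For the curious, the mechanics (standard vim and Unix behavior): :w !cmd pipes the buffer to the standard input of cmd instead of writing a file, % expands to the current file's name, so sudo tee % writes the buffer contents to that file with root privileges (tee's copy to stdout is simply discarded on the terminal).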

jm666, May 12, 2011 at 6:09

cmap w!! w !sudo tee % – jm666 May 12 '11 at 6:09

Gerardo Marset, Jul 5, 2011 at 0:49

You should never run sudo vim . Instead you should export EDITOR as vim and run sudoedit . – Gerardo Marset Jul 5 '11 at 0:49

migu, Sep 2, 2013 at 20:42

@maximus: vim replaces % by the name of the current buffer/file. – migu Sep 2 '13 at 20:42

community wiki, Chad Birch, Apr 7, 2009 at 18:09

Something I just discovered recently that I thought was very cool:
:earlier 15m

Reverts the document back to how it was 15 minutes ago. Can take various arguments for the amount of time you want to roll back, and is dependent on undolevels. Can be reversed with the opposite command :later

ephemient, Apr 8, 2009 at 16:15

@skinp: If you undo and then make further changes from the undone state, you lose that redo history. This lets you go back to a state which is no longer in the undo stack. – ephemient Apr 8 '09 at 16:15

Etienne PIERRE, Jul 21, 2009 at 13:53

Also very useful are g+ and g- to go backward and forward in time. This is so much more powerful than an undo/redo stack since you don't lose the history when you do something after an undo. – Etienne PIERRE Jul 21 '09 at 13:53

Ehtesh Choudhury, Nov 29, 2011 at 12:09

You don't lose the redo history if you make a change after an undo. It's just not easily accessed. There are plugins to help you visualize this, like Gundo.vim – Ehtesh Choudhury Nov 29 '11 at 12:09

Igor Popov, Dec 29, 2011 at 6:59

Wow, so now I can just do :later 8h and I'm done for today? :P – Igor Popov Dec 29 '11 at 6:59

Ring Ø, Jul 11, 2014 at 5:14

Your command assumes one will spend at least 15 minutes in vim ! – Ring Ø Jul 11 '14 at 5:14

community wiki, 2 revs, 2 users 92%, Mar 31, 2016 at 17:54

:! [command] executes an external command while you're in Vim.

But add a dot after the colon, :.! [command], and it'll dump the output of the command into your current window. That's : . !

For example:

:.! ls

I use this a lot for things like adding the current date into a document I'm typing:

:.! date

saffsd, May 6, 2009 at 14:41

This is quite similar to :r! The only difference as far as I can tell is that :r! opens a new line, :.! overwrites the current line. – saffsd May 6 '09 at 14:41

hlovdal, Jan 25, 2010 at 21:11

An alternative to :.!date is to write "date" on a line and then run !$sh (alternatively having the command followed by a blank line and run !jsh ). This will pipe the line to the "sh" shell and substitute with the output from the command. – hlovdal Jan 25 '10 at 21:11

Nefrubyr, Mar 25, 2010 at 16:24

:.! is actually a special case of :{range}!, which filters a range of lines (the current line when the range is . ) through a command and replaces those lines with the output. I find :%! useful for filtering whole buffers. – Nefrubyr Mar 25 '10 at 16:24

jabirali, Jul 13, 2010 at 4:30

@sundar: Why pass a line to sed, when you can use the similar built-in ed / ex commands? Try running :.s/old/new/g ;-) – jabirali Jul 13 '10 at 4:30

aqn, Apr 26, 2013 at 20:52

And also note that '!' is like 'y', 'd', 'c' etc. i.e. you can do: !!, number!!, !motion (e.g. !Gshell_command<cr> replace from current line to end of file ('G') with output of shell_command). – aqn Apr 26 '13 at 20:52

community wiki, 2 revs, Apr 8, 2009 at 12:17

Not exactly obscure, but there are several "delete in" commands which are extremely useful.
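A few of the standard ones, for illustration:

diw    " delete inside the current word
di(    " delete inside the enclosing parentheses
di"    " delete inside the enclosing double quotes
dit    " delete inside the enclosing HTML/XML tag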

Others can be found on :help text-objects

sjh, Apr 8, 2009 at 15:33

dab "delete arounb brackets", daB for around curly brackets, t for xml type tags, combinations with normal commands are as expected cib/yaB/dit/vat etc – sjh Apr 8 '09 at 15:33

Don Reba, Apr 13, 2009 at 21:41

@Masi: yi(va(p deletes only the brackets – Don Reba Apr 13 '09 at 21:41

thomasrutter, Apr 26, 2009 at 11:11

This is possibly the biggest reason for me staying with Vim. That and its equivalent "change" commands: ciw, ci(, ci", as well as dt<space> and ct<space> – thomasrutter Apr 26 '09 at 11:11

Roger Pate, Oct 12, 2010 at 16:40

@thomasrutter: Why not dW/cW instead of dt<space>?

Roger Pate, Oct 12, 2010 at 16:43

@Masi: With the surround plugin: ds(.

community wiki, 9 revs, 9 users 84%, ultraman, Apr 21, 2017 at 14:06

de Delete everything till the end of the word; then press . at your heart's desire to keep deleting words.

ci(xyz[Esc] -- This is a weird one. Here, the 'i' does not mean insert mode. Instead it means inside the parentheses. So this sequence cuts the text inside the parentheses you're standing in and replaces it with "xyz". It also works inside square and curly brackets -- just do ci[ or ci{ correspondingly. Naturally, you can do di instead of ci if you just want to delete all the text without typing anything. You can also do a instead of i if you want to delete the parentheses as well and not just the text inside them.

ci" - cuts the text in current quotes

ciw - cuts the current word. This works just like the previous one except that ( is replaced with w .

C - cut the rest of the line and switch to insert mode.

ZZ -- save and close current file (WAY faster than Ctrl-F4 to close the current tab!)

ddp - move current line one row down

xp -- move current character one position to the right

U - uppercase, so viwU uppercases the word

~ - switches case, so viw~ will reverse casing of entire word

Ctrl+u / Ctrl+d scroll the page half-a-screen up or down. This seems to be more useful than the usual full-screen paging as it makes it easier to see how the two screens relate. For those who still want to scroll an entire screen at a time there's Ctrl+f for Forward and Ctrl+b for Backward. Ctrl+E and Ctrl+Y scroll the window down or up one line at a time, respectively.

A crazy but very useful command is zz -- it scrolls the screen to make this line appear in the middle. This is excellent for putting the piece of code you're working on in the center of your attention. Sibling commands -- zt and zb -- make this line the top or the bottom one on the screen, which is not quite as useful.

% finds and jumps to the matching parenthesis.

de -- delete from cursor to the end of the word (you can also do dE to delete until the next space)

bde -- delete the current word, from left to right delimiter

df[space] -- delete up until and including the next space

dt. -- delete until next dot

dd -- delete this entire line

ye (or yE) -- yanks text from here to the end of the word

ce - cuts through the end of the word

bye -- copies current word (makes me wonder what "hi" does!)

yy -- copies the current line

cc -- cuts the current line, you can also do S instead. There's also lower cap s which cuts current character and switches to insert mode.

viwy or viwc . Yank or change current word. Hit w multiple times to keep selecting each subsequent word, use b to move backwards

vi{ - select all text in figure brackets. va{ - select all text including {}s

vi(p - highlight everything inside the ()s and replace with the pasted text

b and e move the cursor word-by-word, similarly to how Ctrl+Arrows normally do. The definition of a word is a little different though, as several consecutive delimiters are treated as one word. If you start at the middle of a word, pressing b will always get you to the beginning of the current word, and each consecutive b will jump to the beginning of the next word. Similarly, and easy to remember, e gets the cursor to the end of the current, and each subsequent, word.

similar to b / e, capital B and E move the cursor word-by-word using only whitespaces as delimiters.

capital D (take a deep breath) Deletes the rest of the line to the right of the cursor, same as Shift+End/Del in normal editors (notice 2 keypresses -- Shift+D -- instead of 3)

Nick Lewis, Jul 17, 2009 at 16:41

zt is quite useful if you use it at the start of a function or class definition. – Nick Lewis Jul 17 '09 at 16:41

Nathan Fellman, Sep 7, 2009 at 8:27

vity and vitc can be shortened to yit and cit respectively. – Nathan Fellman Sep 7 '09 at 8:27

Laurence Gonsalves, Feb 19, 2011 at 23:49

All the things you're calling "cut" is "change". eg: C is change until the end of the line. Vim's equivalent of "cut" is "delete", done with d/D. The main difference between change and delete is that delete leaves you in normal mode but change puts you into a sort of insert mode (though you're still in the change command which is handy as the whole change can be repeated with . ). – Laurence Gonsalves Feb 19 '11 at 23:49

Almo, May 29, 2012 at 20:09

I thought this was for a list of things that not many people know. yy is very common, I would have thought. – Almo May 29 '12 at 20:09

Andrea Francia, Jul 3, 2012 at 20:50

bye does not work when you are in the first character of the word. yiw always does. – Andrea Francia Jul 3 '12 at 20:50

community wiki, 2 revs, 2 users 83%, Sep 17, 2010 at 16:55

One that I rarely find in most Vim tutorials, but it's INCREDIBLY useful (at least to me), is the

g; and g,

to move (forward, backward) through the changelist.

Let me show how I use it. Sometimes I need to copy and paste a piece of code or string, say a hex color code in a CSS file, so I search, jump (not caring where the match is), copy it and then jump back (g;) to where I was editing the code to finally paste it. No need to create marks. Simpler.

Just my 2cents.

aehlke, Feb 12, 2010 at 1:19

similarly, '. will go to the last edited line, And `. will go to the last edited position – aehlke Feb 12 '10 at 1:19

Kimball Robinson, Apr 16, 2010 at 0:29

Ctrl-O and Ctrl-I (tab) will work similarly, but not the same. They move backward and forward in the "jump list", which you can view by doing :jumps or :ju For more information do a :help jumplist – Kimball Robinson Apr 16 '10 at 0:29

Kimball Robinson, Apr 16, 2010 at 0:30

You can list the change list by doing :changes – Kimball Robinson Apr 16 '10 at 0:30

Wayne Werner, Jan 30, 2013 at 14:49

Hot dang that's useful. I use <C-o> / <C-i> for this all the time - or marking my place. – Wayne Werner Jan 30 '13 at 14:49

community wiki, 4 revs, 4 users 36%, May 5, 2014 at 13:06

:%!xxd

Make vim into a hex editor.

:%!xxd -r

Revert.

Warning: If you don't edit with binary (-b), you might damage the file. – Josh Lee in the comments.

Christian, Jul 7, 2009 at 19:11

And how do you revert it back? – Christian Jul 7 '09 at 19:11

Naga Kiran, Jul 8, 2009 at 13:46

:!xxd -r //To revert back from HEX – Naga Kiran Jul 8 '09 at 13:46

Andreas Grech, Nov 14, 2009 at 10:37

I actually think it's :%!xxd -r to revert it back – Andreas Grech Nov 14 '09 at 10:37

dotancohen, Jun 7, 2013 at 5:50

@JoshLee: If one is careful not to traverse newlines, is it safe to not use the -b option? I ask because sometimes I want to make a hex change, but I don't want to close and reopen the file to do so. – dotancohen Jun 7 '13 at 5:50

Bambu, Nov 23, 2014 at 23:58

@dotancohen: If you don't want to close/reopen the file you can do :set binary – Bambu Nov 23 '14 at 23:58

community wiki AaronS, Jan 12, 2011 at 20:03

gv

Reselects last visual selection.

community wiki, 3 revs, 2 users 92%, Jul 7, 2014 at 19:10

Sometimes a setting in your .vimrc will get overridden by a plugin or autocommand. To debug this a useful trick is to use the :verbose command in conjunction with :set. For example, to figure out where cindent got set/unset:
:verbose set cindent?

This will output something like:

cindent
    Last set from /usr/share/vim/vim71/indent/c.vim

This also works with maps and highlights. (Thanks joeytwiddle for pointing this out.) For example:

:verbose nmap U
n  U             <C-R>
        Last set from ~/.vimrc

:verbose highlight Normal
Normal         xxx guifg=#dddddd guibg=#111111 font=Inconsolata Medium 14
        Last set from ~/src/vim-holodark/colors/holodark.vim

Artem Russakovskii, Oct 23, 2009 at 22:09

Excellent tip - exactly what I was looking for today. – Artem Russakovskii Oct 23 '09 at 22:09

joeytwiddle, Jul 5, 2014 at 22:08

:verbose can also be used before nmap l or highlight Normal to find out where the l keymap or the Normal highlight were last defined. Very useful for debugging! – joeytwiddle Jul 5 '14 at 22:08

SidOfc, Sep 24, 2017 at 11:26

When you get into creating custom mappings, this will save your ass so many times, probably one of the most useful ones here (IMO)! – SidOfc Sep 24 '17 at 11:26

community wiki, 3 revs, 3 users 70%, May 31, 2015 at 19:30

Not sure if this counts as dark-corner-ish at all, but I've only just learnt it...
:g/match/y A

will yank (copy) all lines containing "match" into the "a / @a register. (The capitalization as A makes vim append yankings instead of replacing the previous register contents.) I used it a lot recently when making Internet Explorer stylesheets.
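
A small usage sketch; since the capital A appends, clear the register first:

qaq              record an empty macro into a, i.e. empty register a
:g/match/y A     append every line containing "match" to register a
:put a           paste the collected lines below the cursor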

tsukimi, May 27, 2012 at 6:17

You can use :g! to find lines that don't match a pattern e.x. :g!/set/normal dd (delete all lines that don't contain set) – tsukimi May 27 '12 at 6:17

pandubear, Oct 12, 2013 at 8:39

Sometimes it's better to do what tsukimi said and just filter out lines that don't match your pattern. An abbreviated version of that command though: :v/PATTERN/d Explanation: :v is an abbreviation for :g!, and the :g command applies any ex command to lines. :y[ank] works and so does :normal, but here the most natural thing to do is just :d[elete] . – pandubear Oct 12 '13 at 8:39

Kimball Robinson, Feb 5, 2016 at 17:58

You can also do :g/match/normal "Ayy -- the normal keyword lets you tell it to run normal-mode commands (which you are probably more familiar with). – Kimball Robinson Feb 5 '16 at 17:58

community wiki, 2 revs, 2 users 80%, Apr 5, 2013 at 15:55

:%TOhtml Creates an html rendering of the current file.

kenorb, Feb 19, 2015 at 11:27

Related: How to convert a source code file into HTML? at Vim SE – kenorb Feb 19 '15 at 11:27

community wiki, 2 revs, 2 users 86%, May 11, 2011 at 19:30

Want to look at your :command history? q: Then browse, edit, and finally press Enter to execute the command.

Ever make similar changes to two files and switch back and forth between them? (Say, source and header files?)

:set hidden
:map <TAB> :e#<CR>

Then tab back and forth between those files.

Josh Lee, Sep 22, 2009 at 16:58

I hit q: by accident all the time... – Josh Lee Sep 22 '09 at 16:58

Jason Down, Oct 6, 2009 at 4:14

Alternatively, from the ex editor (:), you can do CTRL-f to pop up the command history window. – Jason Down Oct 6 '09 at 4:14

bradlis7, Mar 23, 2010 at 17:10

@jleedev me too. I almost hate this command, just because I use it accidentally way too much. – bradlis7 Mar 23 '10 at 17:10

bpw1621, Feb 19, 2011 at 15:01

q/ and q? can be used to do a similar thing for your search patterns. – bpw1621 Feb 19 '11 at 15:01

idbrii, Feb 23, 2011 at 19:07

Hitting <C-f> after : or / (or any time you're in command mode) will bring up the same history menu. So you can remap q: if you hit it accidentally a lot and still access this awesome mode. – idbrii Feb 23 '11 at 19:07

community wiki, 2 revs, 2 users 89%, Jun 4, 2014 at 14:52

Vim will open a URL, for example
vim http://stackoverflow.com/

Nice when you need to pull up the source of a page for reference.

Ivan Vučica, Sep 21, 2010 at 8:07

For me it didn't open the source; instead it apparently used elinks to dump rendered page into a buffer, and then opened that. – Ivan Vučica Sep 21 '10 at 8:07

Thomas, Apr 19, 2013 at 21:00

Works better with a slash at the end. Neat trick! – Thomas Apr 19 '13 at 21:00

Isaac Remuant, Jun 3, 2013 at 15:23

@Vdt: It'd be useful if you posted your error. If it's this one: " error (netrw) neither the wget nor the fetch command is available" you obviously need to make one of those tools available from your PATH environment variable. – Isaac Remuant Jun 3 '13 at 15:23

Dettorer, Oct 29, 2014 at 13:47

I find this one particularly useful when people send links to a paste service and forgot to select a syntax highlighting, I generally just have to open the link in vim after appending "&raw". – Dettorer Oct 29 '14 at 13:47

community wiki, 2 revs, 2 users 94%, Jan 20, 2015 at 23:14

Macros can call other macros, and can also call themselves.

eg:

qq0dwj@qq@q

...will delete the first word from every line until the end of the file.

This is quite a simple example but it demonstrates a very powerful feature of vim

Kimball Robinson, Apr 16, 2010 at 0:39

I didn't know macros could repeat themselves. Cool. Note: qx starts recording into register x (he uses qq for register q). 0 moves to the start of the line. dw deletes a word. j moves down a line. @q will run the macro again (defining a loop). But you forgot to end the recording with a final "q", then actually run the macro by typing @q. – Kimball Robinson Apr 16 '10 at 0:39

Yktula, Apr 18, 2010 at 5:32

I think that's intentional, as a nested and recursive macro. – Yktula Apr 18 '10 at 5:32

Gerardo Marset, Jul 5, 2011 at 1:38

qqqqqifuu<Esc>h@qq@q – Gerardo Marset Jul 5 '11 at 1:38

Nathan Long, Aug 29, 2011 at 15:33

Another way of accomplishing this is to record a macro in register a that does some transformation to a single line, then linewise highlight a bunch of lines with V and type :normal! @a to apply your macro to every line in your selection. – Nathan Long Aug 29 '11 at 15:33

dotancohen, May 14, 2013 at 6:00

I found this post googling recursive VIM macros. I could find no way to stop the macro other than killing the VIM process. – dotancohen May 14 '13 at 6:00

community wiki, Brian Carper, Apr 8, 2009 at 1:15

Assuming you have Perl and/or Ruby support compiled in, :rubydo and :perldo will run a Ruby or Perl one-liner on every line in a range (defaults to entire buffer), with $_ bound to the text of the current line (minus the newline). Manipulating $_ will change the text of that line.

You can use this to do certain things that are easy to do in a scripting language but not so obvious using Vim builtins. For example to reverse the order of the words in a line:

:perldo $_ = join ' ', reverse split

To insert a random string of 8 characters (A-Z) at the end of every line:

:rubydo $_ += ' ' + (1..8).collect{('A'..'Z').to_a[rand 26]}.join

You are limited to acting on one line at a time and you can't add newlines.

Sujoy, May 6, 2009 at 18:27

what if i only want perldo to run on a specified line? or a selected few lines? – Sujoy May 6 '09 at 18:27

Brian Carper, May 6, 2009 at 18:52

You can give it a range like any other command. For example :1,5perldo will only operate on lines 1-5. – Brian Carper May 6 '09 at 18:52

Greg, Jul 2, 2009 at 16:41

Could you do $_ += '\nNEWLINE!!!' to get a newline after the current one? – Greg Jul 2 '09 at 16:41

Brian Carper, Jul 2, 2009 at 17:26

Sadly not, it just adds a funky control character to the end of the line. You could then use a Vim search/replace to change all those control characters to real newlines though. – Brian Carper Jul 2 '09 at 17:26

Derecho, Mar 14, 2014 at 8:48

Similarly, pydo and py3do work for python if you have the required support compiled in. – Derecho Mar 14 '14 at 8:48

community wiki, 4 revs, Jul 28, 2009 at 19:05

^O and ^I

Go to older/newer position. When you are moving through the file (by searching, movement commands etc.) vim remembers these "jumps", so you can repeat these jumps backward (^O - O for old) and forward (^I - just next to I on the keyboard). I find it very useful when writing code and performing a lot of searches.

gi

Go to position where Insert mode was stopped last. I find myself often editing and then searching for something. To return to editing place press gi.

gf

put the cursor on a file name (e.g. an included header file), press gf and the file is opened

gF

similar to gf but recognizes the format "[file name]:[line number]". Pressing gF will open [file name] and set the cursor to [line number].

^P and ^N

Auto-complete text while editing (^P - previous match, ^N - next match)

^X^L

While editing, completes to the same line (useful for programming). You write code and then you recall that you have the same code somewhere in the file. Just press ^X^L and the full line is completed.

^X^F

Complete file names. You write "/etc/pass"... Hmm, you forgot the file name. Just press ^X^F and the filename is completed.

^Z or :sh

Move temporarily to the shell when you need a quick shell session: ^Z suspends Vim (bring it back with fg), while :sh starts a subshell (return to Vim with exit).

sehe, Mar 4, 2012 at 21:50

With ^X^F my pet peeve is that filenames include = signs, making it do rotten things in many occasions (ini files, makefiles etc). I use se isfname-== to end that nuisance – sehe Mar 4 '12 at 21:50

joeytwiddle, Jul 5, 2014 at 22:10

+1 the built-in autocomplete is just sitting there waiting to be discovered. – joeytwiddle Jul 5 '14 at 22:10

community wiki, 2 revs, Apr 7, 2009 at 18:59

This is a nice trick to reopen the current file with a different encoding:
:e ++enc=cp1250 %:p

Useful when you have to work with legacy encodings. The supported encodings are listed in a table under encoding-values (see help encoding-values ). Similar thing also works for ++ff, so that you can reopen file with Windows/Unix line ends if you get it wrong for the first time (see help ff ).
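
The same ++ trick applied to line endings might look like this:

:e ++ff=unix %     reopen the current file forcing Unix line ends
:e ++ff=dos %      ... or forcing DOS/Windows line ends
:w ++ff=unix       convert the line endings when writing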

Sasha, Apr 7, 2009 at 18:43

Never had to use this sort of a thing, but we'll certainly add to my arsenal of tricks... – Sasha Apr 7 '09 at 18:43

Adriano Varoli Piazza, Apr 7, 2009 at 18:44

great tip, thanks. For bonus points, add a list of common valid encodings. – Adriano Varoli Piazza Apr 7 '09 at 18:44

Ivan Vučica, Jul 8, 2009 at 19:29

I have used this today, but I think I didn't need to specify "%:p"; just opening the file and :e ++enc=cp1250 was enough. – Ivan Vučica Jul 8 '09 at 19:29

laz, Jul 8, 2009 at 19:32

would :set encoding=cp1250 have the same effect? – laz Jul 8 '09 at 19:32

intuited, Jun 4, 2010 at 2:51

`:e +b %' is similarly useful for reopening in binary mode (no munging of newlines) – intuited Jun 4 '10 at 2:51

community wiki, 4 revs, 3 users 48%, Nov 6, 2012 at 8:32

" insert range ip's
"
"          ( O O )
" =======oOO=(_)==OOo======

:for i in range(1,255) | .put='10.0.0.'.i | endfor

Ryan Edwards, Nov 16, 2011 at 0:42

I don't see what this is good for (besides looking like a joke answer). Can anybody else enlighten me? – Ryan Edwards Nov 16 '11 at 0:42

Codygman, Nov 6, 2012 at 8:33

open vim and then do ":for i in range(1,255) | .put='10.0.0.'.i | endfor" – Codygman Nov 6 '12 at 8:33

Ruslan, Sep 30, 2013 at 10:30

@RyanEdwards filling /etc/hosts maybe – Ruslan Sep 30 '13 at 10:30

dotancohen, Nov 30, 2014 at 14:56

This is a terrific answer. Not the bit about creating the IP addresses, but the bit that implies that VIM can use for loops in commands. – dotancohen Nov 30 '14 at 14:56

BlackCap, Aug 31, 2017 at 7:54

Without ex-mode: i10.0.0.1<Esc>Y254p$<C-v>}g<C-a> – BlackCap Aug 31 '17 at 7:54

community wiki, 2 revs, Aug 6, 2010 at 0:30

Typing == will correct the indentation of the current line based on the line above.

Actually, you can do one = sign followed by any movement command. = {movement}

For example, you can use the % movement which moves between matching braces. Position the cursor on the { in the following code:

if (thisA == that) {
//not indented
if (some == other) {
x = y;
}
}

And press =% to instantly get this:

if (thisA == that) {
    //not indented
    if (some == other) {
        x = y;
    }
}

Alternately, you could do =a{ within the code block, rather than positioning yourself right on the { character.

Ehtesh Choudhury, May 2, 2011 at 0:48

Hm, I didn't know this about the indentation. – Ehtesh Choudhury May 2 '11 at 0:48

sehe, Mar 4, 2012 at 22:03

No need, usually, to be exactly on the braces. Though frequently I'd just =} or vaBaB= because it is less dependent. Also, v}}:!astyle -bj matches my code style better, but I can get it back into your style with a simple %!astyle -aj – sehe Mar 4 '12 at 22:03

kyrias, Oct 19, 2013 at 12:12

gg=G is quite neat when pasting in something. – kyrias Oct 19 '13 at 12:12

kenorb, Feb 19, 2015 at 11:30

Related: Re-indenting badly indented code at Vim SE – kenorb Feb 19 '15 at 11:30

Braden Best, Feb 4, 2016 at 16:16

@kyrias Oh, I've been doing it like ggVG= . – Braden Best Feb 4 '16 at 16:16

community wiki, Trumpi, Apr 19, 2009 at 18:33

imap jj <esc>

hasen, Jun 12, 2009 at 6:08

how will you type jj then? :P – hasen Jun 12 '09 at 6:08

ojblass, Jul 5, 2009 at 18:29

How often to you type jj? In English at least? – ojblass Jul 5 '09 at 18:29

Alex, Oct 5, 2009 at 5:32

I remapped capslock to esc instead, as it's an otherwise useless key. My mapping was OS wide though, so it has the added benefit of never having to worry about accidentally hitting it. The only drawback IS ITS HARDER TO YELL AT PEOPLE. :) – Alex Oct 5 '09 at 5:32

intuited, Jun 4, 2010 at 4:18

@Alex: definitely, capslock is death. "wait, wtf? oh, that was ZZ?....crap." – intuited Jun 4 '10 at 4:18

brianmearns, Oct 3, 2012 at 12:45

@ojblass: Not sure how many people ever write MATLAB code in Vim, but ii and jj are commonly used for counter variables, because i and j are reserved for complex numbers. – brianmearns Oct 3 '12 at 12:45

community wiki, 4 revs, 3 users 71%, Feb 12, 2015 at 15:55

Let's see some pretty little IDE editor do column transposition.
:%s/\(.*\)^I\(.*\)/\2^I\1/

Explanation

\( and \) is how to remember stuff in regex-land. And \1, \2 etc is how to retrieve the remembered stuff.

>>> \(.*\)^I\(.*\)

Remember everything followed by ^I (tab) followed by everything.

>>> \2^I\1

Replace the above stuff with "2nd stuff you remembered" followed by "1st stuff you remembered" - essentially doing a transpose.
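
A tiny worked example: after the substitution, every line like

foo^Ibar

becomes

bar^Ifoo

(the ^I stands for a literal tab; type it as Ctrl+V Tab, or use \t in the search half of the pattern).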

chaos, Apr 7, 2009 at 18:33

Switches a pair of tab-separated columns (separator arbitrary, it's all regex) with each other. – chaos Apr 7 '09 at 18:33

rlbond, Apr 26, 2009 at 4:11

This is just a regex; plenty of IDEs have regex search-and-replace. – rlbond Apr 26 '09 at 4:11

romandas, Jun 19, 2009 at 16:58

@rlbond - It comes down to how good is the regex engine in the IDE. Vim's regexes are pretty powerful; others.. not so much sometimes. – romandas Jun 19 '09 at 16:58

Kimball Robinson, Apr 16, 2010 at 0:32

The * will be greedy, so this regex assumes you have just two columns. If you want it to be nongreedy use {-} instead of * (see :help non-greedy for more information on the {} multiplier) – Kimball Robinson Apr 16 '10 at 0:32

mk12, Jun 22, 2012 at 17:31

This is actually a pretty simple regex, it's only escaping the group parentheses that makes it look complicated. – mk12 Jun 22 '12 at 17:31

community wiki, KKovacs, Apr 11, 2009 at 7:14

Not exactly a dark secret, but I like to put the following mapping into my .vimrc file, so I can hit "-" (minus) anytime to open the file explorer to show files adjacent to the one I just edited. In the file explorer, I can hit another "-" to move up one directory, providing seamless browsing of complex directory structures (like the ones used by the MVC frameworks nowadays):
map - :Explore<cr>

These may be also useful for somebody. I like to scroll the screen and advance the cursor at the same time:

map <c-j> j<c-e>
map <c-k> k<c-y>

Tab navigation - I love tabs and I need to move easily between them:

map <c-l> :tabnext<enter>
map <c-h> :tabprevious<enter>

Only on Mac OS X: Safari-like tab navigation:

map <S-D-Right> :tabnext<cr>
map <S-D-Left> :tabprevious<cr>

Roman Plášil, Oct 1, 2009 at 21:33

You can also browse files within Vim itself, using :Explore – Roman Plášil Oct 1 '09 at 21:33

KKovacs, Oct 15, 2009 at 15:20

Hi Roman, this is exactly what this mapping does, but assigns it to a "hot key". :) – KKovacs Oct 15 '09 at 15:20

community wiki, rampion, Apr 7, 2009 at 20:11

Often, I like changing current directories while editing - so I have to specify paths less.
:cd %:h

Leonard, May 8, 2009 at 1:54

What does this do? And does it work with autchdir? – Leonard May 8 '09 at 1:54

rampion, May 8, 2009 at 2:55

I suppose it would override autochdir temporarily (until you switched buffers again). Basically, it changes directory to the root directory of the current file. It gives me a bit more manual control than autochdir does. – rampion May 8 '09 at 2:55

Naga Kiran, Jul 8, 2009 at 13:44

:set autochdir //this also serves the same functionality and it changes the current directory to that of file in buffer – Naga Kiran Jul 8 '09 at 13:44

community wiki, 4 revs, Jul 21, 2009 at 1:12

I like to use 'sudo bash', and my sysadmin hates this. He locked down 'sudo' so it could only be used with a handful of commands (ls, chmod, chown, vi, etc), but I was able to use vim to get a root shell anyway:
bash$ sudo vi +'silent !bash' +q
Password: ******
root#

RJHunter, Jul 21, 2009 at 0:53

FWIW, sudoedit (or sudo -e) edits privileged files but runs your editor as your normal user. – RJHunter Jul 21 '09 at 0:53

sundar, Sep 23, 2009 at 9:41

@OP: That was cunning. :) – sundar Sep 23 '09 at 9:41

jnylen, Feb 22, 2011 at 15:58

yeah... I'd hate you too ;) you should only need a root shell VERY RARELY, unless you're already in the habit of running too many commands as root which means your permissions are all screwed up. – jnylen Feb 22 '11 at 15:58

d33tah, Mar 30, 2014 at 17:50

Why does your sysadmin even give you root? :D – d33tah Mar 30 '14 at 17:50

community wiki, Taurus Olson, Apr 7, 2009 at 21:11

I often use many windows when I work on a project and sometimes I need to resize them. Here's what I use:
map + <C-W>+
map - <C-W>-

These mappings allow you to increase and decrease the size of the current window. It's quite simple but it's fast.

Bill Lynch, Apr 8, 2009 at 2:49

There's also Ctrl-W =, which makes the windows equal width. – Bill Lynch Apr 8 '09 at 2:49

joeytwiddle, Jan 29, 2012 at 18:12

Don't forget you can prepend numbers to perform an action multiple times in Vim. So to expand the current window height by 8 lines: 8<C-W>+ – joeytwiddle Jan 29 '12 at 18:12

community wiki, Roberto Bonvallet, May 6, 2009 at 7:38

:r! <command>

pastes the output of an external command into the buffer.

Do some math and get the result directly in the text:

:r! echo $((3 + 5 + 8))

Get the list of files to compile when writing a Makefile:

:r! ls *.c

Don't look up that fact you read on wikipedia, have it directly pasted into the document you are writing:

:r! lynx -dump http://en.wikipedia.org/wiki/Whatever

Sudhanshu, Jun 7, 2010 at 8:40

^R=3+5+8 in insert mode will let you insert the value of the expression (3+5+8) in text with fewer keystrokes. – Sudhanshu Jun 7 '10 at 8:40

dcn, Mar 27, 2011 at 10:13

How can I get the result/output to a different buffer than the current? – dcn Mar 27 '11 at 10:13

kenorb, Feb 19, 2015 at 11:31

Related: How to dump output from external command into editor? at Vim SE – kenorb Feb 19 '15 at 11:31

community wiki, jqno, Jul 8, 2009 at 19:19

Map F5 to quickly ROT13 your buffer:
map <F5> ggg?G``

You can use it as a boss key :).

sehe, Mar 4, 2012 at 21:57

I don't know what you are writing... But surely, my boss would be more curious when he saw me write ROT13 jumble :) – sehe Mar 4 '12 at 21:57

romeovs, Jun 19, 2014 at 19:22

or to spoof your friends: nmap i ggg?G`` . Or the diabolical: nmap i ggg?G``i ! – romeovs Jun 19 '14 at 19:22

Amit Gold, Aug 7, 2016 at 10:14

@romeovs 2nd one is infinite loop, use nnoremap – Amit Gold Aug 7 '16 at 10:14

community wiki, mohi666, Mar 4, 2011 at 2:20

Not an obscure feature, but very useful and time saving.

If you want to save a session of your open buffers, tabs, markers and other settings, you can issue the following:

:mksession session.vim

You can open your session using:

vim -S session.vim
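
What gets saved is controlled by the 'sessionoptions' option, and the ! variant overwrites an existing session file:

:set sessionoptions?     list what is saved (buffers, tabs, window sizes, ...)
:mksession! session.vim  overwrite the session file if it already exists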

TankorSmash, Nov 3, 2012 at 13:45

You can also :so session.vim inside vim. – TankorSmash Nov 3 '12 at 13:45

community wiki, Grant Limberg, May 11, 2009 at 21:59

I just found this one today via NSFAQ :

Comment blocks of code.

Enter Blockwise Visual mode by hitting CTRL-V.

Mark the block you wish to comment.

Hit I (capital I) and enter your comment string at the beginning of the line. (// for C++)

Hit ESC and all lines selected will have // prepended to the front of the line.

Neeraj Singh, Jun 17, 2009 at 16:56

I added # to comment out a block of code in ruby. How do I undo it. – Neeraj Singh Jun 17 '09 at 16:56

Grant Limberg, Jun 17, 2009 at 19:29

well, if you haven't done anything else to the file, you can simply type u for undo. Otherwise, I haven't figured that out yet. – Grant Limberg Jun 17 '09 at 19:29

nos, Jul 28, 2009 at 20:00

You can just hit ctrl+v again, mark the //'s and hit x to "uncomment" – nos Jul 28 '09 at 20:00

ZyX, Mar 7, 2010 at 14:18

I use NERDCommenter for this. – ZyX Mar 7 '10 at 14:18

Braden Best, Feb 4, 2016 at 16:23

Commented out code is probably one of the worst types of comment you could possibly put in your code. There are better uses for the awesome block insert. – Braden Best Feb 4 '16 at 16:23

community wiki, 2 revs, 2 users 84%, Ian H, Jul 3, 2015 at 23:44

I use vim for just about any text editing I do, so I often use copy and paste. The problem is that by default vim will often distort imported text when pasting. The way to stop this is to use
:set paste

before pasting in your data. This will keep it from messing up.

Note that you will have to issue :set nopaste to recover auto-indentation. Alternative ways of pasting pre-formatted text are the clipboard registers ( * and + ), and :r!cat (you will have to end the pasted fragment with ^D).
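
If you toggle 'paste' a lot, the 'pastetoggle' option binds it to a key of your choice (the key here is just an example):

set pastetoggle=<F2>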

It is also sometimes helpful to turn on a high contrast color scheme. This can be done with

:color blue

I've noticed that it does not work on all the versions of vim I use but it does on most.

jamessan, Dec 28, 2009 at 8:27

The "distortion" is happening because you have some form of automatic indentation enabled. Using set paste or specifying a key for the pastetoggle option is a common way to work around this, but the same effect can be achieved with set mouse=a as then Vim knows that the flood of text it sees is a paste triggered by the mouse. – jamessan Dec 28 '09 at 8:27

kyrias, Oct 19, 2013 at 12:15

If you have gvim installed you can often (though it depends on what options your distro compiles vim with) use the X clipboard directly from vim through the * register. For example "*p to paste from the X clipboard. (It works from terminal vim, too, it's just that you might need the gvim package if they're separate) – kyrias Oct 19 '13 at 12:15

Braden Best, Feb 4, 2016 at 16:26

@kyrias for the record, * is the PRIMARY ("middle-click") register. The clipboard is + – Braden Best Feb 4 '16 at 16:26

community wiki, viraptor, Apr 7, 2009 at 22:29

Here's something not obvious. If you have a lot of custom plugins / extensions in your $HOME and you need to work from su / sudo / ... sometimes, then this might be useful.

In your ~/.bashrc:

export VIMINIT=":so $HOME/.vimrc"

In your ~/.vimrc:

if $HOME=='/root'
        if $USER=='root'
                if isdirectory('/home/your_typical_username')
                        let rtuser = 'your_typical_username'
                elseif isdirectory('/home/your_other_username')
                        let rtuser = 'your_other_username'
                endif
        else
                let rtuser = $USER
        endif
        let &runtimepath = substitute(&runtimepath, $HOME, '/home/'.rtuser, 'g')
endif

It will allow your local plugins to load - whatever way you use to change the user.

You might also like to take the *.swp files out of your current path and into ~/vimtmp (this goes into .vimrc):

if ! isdirectory(expand('~/vimtmp'))
   call mkdir(expand('~/vimtmp'))
endif
if isdirectory(expand('~/vimtmp'))
   set directory=~/vimtmp
else
   set directory=.,/var/tmp,/tmp
endif

Also, some mappings I use to make editing easier - makes ctrl+s work like escape and ctrl+h/l switch the tabs:

inoremap <C-s> <ESC>
vnoremap <C-s> <ESC>
noremap <C-l> gt
noremap <C-h> gT

Kyle Challis, Apr 2, 2014 at 21:18

Just in case you didn't already know, ctrl+c already works like escape. – Kyle Challis Apr 2 '14 at 21:18

shalomb, Aug 24, 2015 at 8:02

I prefer never to run vim as root/under sudo - and would just run the command from vim e.g. :!sudo tee %, :!sudo mv % /etc or even launch a login shell :!sudo -i – shalomb Aug 24 '15 at 8:02

community wiki, 2 revs, 2 users 67%, Nov 7, 2009 at 7:54

Ctrl-n while in insert mode will auto complete whatever word you're typing based on all the words that are in open buffers. If there is more than one match it will give you a list of possible words that you can cycle through using ctrl-n and ctrl-p.

community wiki, daltonb, Feb 22, 2010 at 4:28

gg=G

Corrects indentation for entire file. I was missing my trusty <C-a><C-i> in Eclipse but just found out vim handles it nicely.

sjas, Jul 15, 2012 at 22:43

I find G=gg easier to type. – sjas Jul 15 '12 at 22:43

sri, May 12, 2013 at 16:12

=% should do it too. – sri May 12 '13 at 16:12

community wiki, mohi666, Mar 24, 2011 at 22:44

Ability to run Vim in a client/server mode.

For example, suppose you're working on a project with a lot of buffers, tabs and other info saved on a session file called session.vim.

You can open your session and create a server by issuing the following command:

vim --servername SAMPLESERVER -S session.vim

Note that you can open regular text files if you want to create a server; it doesn't necessarily have to be a session.

Now, suppose you're in another terminal and need to open another file. If you open it regularly by issuing:

vim new_file.txt

Your file would be opened in a separate Vim instance, which makes it hard to interact with the files in your session. In order to open new_file.txt in a new tab on your server, use this command:

vim --servername SAMPLESERVER --remote-tab-silent new_file.txt

If there's no server running, this file will be opened just like a regular file.

Since providing those flags every time you want to run them is very tedious, you can create separate aliases for the client and the server.

I placed the following in my .bashrc file:

alias vims='vim --servername SAMPLESERVER'
alias vimc='vim --servername SAMPLESERVER --remote-tab-silent'
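
With those aliases, a typical run might look like this:

vims -S session.vim    # terminal 1: start the server with your session
vimc new_file.txt      # terminal 2: the file opens as a new tab in terminal 1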

You can find more information about this at: http://vimdoc.sourceforge.net/htmldoc/remote.html

community wiki, jm666, May 11, 2011 at 19:54

Variation of sudo write:

into .vimrc

cmap w!! w !sudo tee % >/dev/null

After reloading vim you can "sudo save" with:

:w!!
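
How it works, piece by piece: :w !{cmd} pipes the buffer to the standard input of {cmd} instead of writing a file; sudo tee % then writes that input to the current file name (%) with root privileges; and >/dev/null hides tee's copy of the text.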

community wiki, 3 revs, 3 users 74%, Sep 17, 2010 at 17:06

HOWTO: Auto-complete Ctags when using Vim in Bash. For anyone else who uses Vim and Ctags, I've written a small auto-completer function for Bash. Add the following into your ~/.bash_completion file (create it if it does not exist):

Thanks go to stylishpants for his many fixes and improvements.

_vim_ctags() {
    local cur prev

    COMPREPLY=()
    cur="${COMP_WORDS[COMP_CWORD]}"
    prev="${COMP_WORDS[COMP_CWORD-1]}"

    case "${prev}" in
        -t)
            # Avoid the complaint message when no tags file exists
            if [ ! -r ./tags ]
            then
                return
            fi

            # Escape slashes to avoid confusing awk
            cur=${cur////\\/}

            COMPREPLY=( $(compgen -W "`awk -vORS=" "  "/^${cur}/ { print \\$1 }" tags`" ) )
            ;;
        *)
            _filedir_xspec
            ;;
    esac
}

# Files matching this pattern are excluded
excludelist='*.@(o|O|so|SO|so.!(conf)|SO.!(CONF)|a|A|rpm|RPM|deb|DEB|gif|GIF|jp?(e)g|JP?(E)G|mp3|MP3|mp?(e)g|MP?(E)G|avi|AVI|asf|ASF|ogg|OGG|class|CLASS)'

complete -F _vim_ctags -f -X "${excludelist}" vi vim gvim rvim view rview rgvim rgview gview

Once you restart your Bash session (or create a new one) you can type:

Code:

~$ vim -t MyC<tab key>

and it will auto-complete the tag the same way it does for files and directories:

Code:

MyClass MyClassFactory
~$ vim -t MyC

I find it really useful when I'm jumping into a quick bug fix.

Sasha, Apr 8, 2009 at 3:05

Amazing....I really needed it – Sasha Apr 8 '09 at 3:05

TREE, Apr 27, 2009 at 13:19

can you summarize? If that external page goes away, this answer is useless. :( – TREE Apr 27 '09 at 13:19

Hamish Downer, May 5, 2009 at 16:38

Summary - it allows ctags autocomplete from the bash prompt for opening files with vim. – Hamish Downer May 5 '09 at 16:38

community wiki, 2 revs, 2 users 80%, Dec 22, 2016 at 7:44

I often want to highlight a particular word/function name, but don't want to search to the next instance of it yet:
map m* *#

René Nyffenegger, Dec 3, 2009 at 7:36

I don't understand this one. – René Nyffenegger Dec 3 '09 at 7:36

Scotty Allen, Dec 3, 2009 at 19:55

Try it :) It basically highlights a given word, without moving the cursor to the next occurrence (like * would). – Scotty Allen Dec 3 '09 at 19:55

jamessan, Dec 27, 2009 at 19:10

You can do the same with "nnoremap m* :let @/ = '\<' . expand('<cword>') . '\>'<cr>" – jamessan Dec 27 '09 at 19:10

community wiki, Ben, Apr 9, 2009 at 12:37

% is also good when you want to diff files across two different copies of a project without wearing out the pinkies (from root of project1):
:vert diffs /project2/root/%

community wiki, Naga Kiran, Jul 8, 2009 at 19:07

:setlocal autoread

Auto-reloads the current buffer. Especially useful while viewing log files; it almost gives you the functionality of the Unix "tail" program from within vim.

Checking for compile errors from within vim: set the makeprg variable depending on the language. Say, for Perl:

:setlocal makeprg=perl\ -c\ %

For PHP

set makeprg=php\ -l\ %
set errorformat=%m\ in\ %f\ on\ line\ %l

Issuing ":make" runs the associated makeprg and displays the compilation errors/warnings in quickfix window and can easily navigate to the corresponding line numbers.

community wiki, 2 revs, 2 users 73%, Sep 14 at 20:16

Want an IDE?

:make will run the makefile in the current directory and parse the compiler output; you can then use :cn and :cp to step through the compiler errors, opening each file and seeking to the line number in question.
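
A minimal quickfix round trip (all standard commands):

:make     run 'makeprg' and load any errors into the quickfix list
:copen    open the quickfix window to see them all at once
:cn       jump to the next error
:cp       jump back to the previous one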

:syntax on turns on vim's syntax highlighting.

community wiki, Luper Rouch, Apr 9, 2009 at 12:53

Input a character from its hexadecimal value (insert mode):
<C-Q>x[type the hexadecimal byte]
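
For instance, in insert mode (use <C-V> instead of <C-Q> on most setups, as the comments below note):

<C-V>x41      inserts A (hexadecimal 41)
<C-V>u00e9    inserts é (4-digit Unicode code point)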

MikeyB, Sep 22, 2009 at 21:57

<C-V> is the more generic command that works in both the text-mode and gui – MikeyB Sep 22 '09 at 21:57

jamessan, Dec 27, 2009 at 19:06

It's only <C-q> if you're using the awful mswin.vim (or you mapped it yourself). – jamessan Dec 27 '09 at 19:06

community wiki, Brad Cox, May 8, 2009 at 1:54

I was sure someone would have posted this already, but here goes.

Take any build system you please; make, mvn, ant, whatever. In the root of the project directory, create a file of the commands you use all the time, like this:

mvn install
mvn clean install
... and so forth

To do a build, put the cursor on the line and type !!sh, i.e. filter that line: write it to a shell and replace it with the results.

The build log replaces the line, ready to scroll, search, whatever.

When you're done viewing the log, type u to undo and you're back to your file of commands.

ojblass, Jul 5, 2009 at 18:27

This doesn't seem to fly on my system. Can you show an example only using the ls command? – ojblass Jul 5 '09 at 18:27

Brad Cox, Jul 29, 2009 at 19:30

!!ls replaces current line with ls output (adding more lines as needed). – Brad Cox Jul 29 '09 at 19:30

jamessan, Dec 28, 2009 at 8:29

Why wouldn't you just set makeprg to the proper tool you use for your build (if it isn't set already) and then use :make ? :copen will show you the output of the build as well as allowing you to jump to any warnings/errors. – jamessan Dec 28 '09 at 8:29

community wiki, 2 revs, 2 users 95%, Dec 28, 2009 at 8:38

==========================================================
In normal mode
==========================================================
gf ................ open file under cursor in same window --> see :h path
Ctrl-w f .......... open file under cursor in new window
Ctrl-w q .......... close current window
Ctrl-w 6 .......... open alternate file --> see :h #
gi ................ init insert mode in last insertion position
'0 ................ place the cursor where it was when the file was last edited

Braden Best, Feb 4, 2016 at 16:33

I believe it's <C-w> c to close a window, actually. :h ctrl-w – Braden Best Feb 4 '16 at 16:33

community wiki, 2 revs, 2 users 84%, Sep 17, 2010 at 16:53

Due to the latency and lack of colors (I love color schemes :) I don't like programming on remote machines in PuTTY. So I developed this trick to work around this problem. I use it on Windows.

You will need rsync on both machines, and on the local (Windows) side the PuTTY suite (Plink, Pageant) and Cygwin.

Setting up remote machine

Configure rsync to make your working directory accessible. I use an SSH tunnel and only allow connections from the tunnel:

address = 127.0.0.1
hosts allow = 127.0.0.1
port = 40000
use chroot = false
[bledge_ce]
    path = /home/xplasil/divine/bledge_ce
    read only = false

Then start rsyncd: rsync --daemon --config=rsyncd.conf

Setting up local machine

Install rsync from Cygwin. Start Pageant and load your private key for the remote machine. If you're using SSH tunelling, start PuTTY to create the tunnel. Create a batch file push.bat in your working directory which will upload changed files to the remote machine using rsync:

rsync --blocking-io *.cc *.h SConstruct rsync://localhost:40001/bledge_ce

SConstruct is a build file for scons. Modify the list of files to suit your needs. Replace localhost with the name of remote machine if you don't use SSH tunelling.

Configuring Vim That is now easy. We will use the quickfix feature (:make and error list), but the compilation will run on the remote machine. So we need to set makeprg:

set makeprg=push\ &&\ plink\ -batch\ [email protected]\ \"cd\ /home/xplasil/divine/bledge_ce\ &&\ scons\ -j\ 2\"

This will first start the push.bat task to upload the files and then execute the commands on remote machine using SSH ( Plink from the PuTTY suite). The command first changes directory to the working dir and then starts build (I use scons).

The results of the build will show up conveniently in your local gVim error list.

matpie, Sep 17, 2010 at 23:02

A much simpler solution would be to use bcvi: sshmenu.sourceforge.net/articles/bcvi – matpie Sep 17 '10 at 23:02

Uri Goren, Jul 20 at 20:21

cmder is much easier and simpler, it also comes with its own ssh client – Uri Goren Jul 20 at 20:21

community wiki, 3 revs, 2 users 94%, Jan 16, 2014 at 14:10

I use Vim for everything. When I'm editing an e-mail message, I use:

gqap (or gwap )

extensively to easily and correctly reformat on a paragraph-by-paragraph basis, even with quote leadin characters. In order to achieve this functionality, I also add:

-c 'set fo=tcrq' -c 'set tw=76'

to the command to invoke the editor externally. One noteworthy addition would be to add ' a ' to the fo (formatoptions) parameter. This will automatically reformat the paragraph as you type and navigate the content, but may interfere or cause problems with errant or odd formatting contained in the message.

Andrew Ferrier, Jul 14, 2014 at 22:22

autocmd FileType mail set tw=76 fo=tcrq in your ~/.vimrc will also work, if you can't edit the external editor command. – Andrew Ferrier Jul 14 '14 at 22:22

community wiki, 2 revs, 2 users 94%, May 6, 2009 at 12:22

Put this in your .vimrc to have a command to pretty-print xml:
function FormatXml()
    %s:\(\S\)\(<[^/]\)\|\(>\)\(</\):\1\3\r\2\4:g
    set filetype=xml
    normal gg=G
endfunction

command FormatXml :call FormatXml()

David Winslow, Nov 24, 2009 at 20:43

On linuxes (where xmllint is pretty commonly installed) I usually just do :%! xmllint - for this. – David Winslow Nov 24 '09 at 20:43

community wiki, searlea, Aug 6, 2009 at 9:33

:sp %:h - directory listing / file-chooser using the current file's directory

(belongs as a comment under rampion's cd tip, but I don't have commenting-rights yet)

bpw1621, Feb 19, 2011 at 15:13

":e ." does the same thing for your current working directory which will be the same as your current file's directory if you set autochdir – bpw1621 Feb 19 '11 at 15:13

community wiki, 2 revs, Sep 22, 2009 at 22:23

Just before copying and pasting to stackoverflow:
:retab 1
:% s/^I/ /g
:% s/^/    /

Now copy and paste code.

As requested in the comments:

retab 1. This sets the tab size to one. But it also goes through the code and adds extra tabs and spaces so that the formatting does not move any of the actual text (i.e. the text looks the same after retab).

% s/^I/ /g: Note the ^I is the result of hitting tab. This searches for all tabs and replaces them with a single space. Since we just did a retab this should not cause the formatting to change, but since putting tabs into a website is hit and miss it is good to remove them.

% s/^/ /: Replace the beginning of the line with four spaces. Since you can't actually replace the beginning of the line with anything, it inserts four spaces at the beginning of the line (this is needed by SO formatting to make the code stand out).

vehomzzz, Sep 22, 2009 at 20:52

explain it please... – vehomzzz Sep 22 '09 at 20:52

cmcginty, Sep 22, 2009 at 22:31

so I guess this won't work if you use 'set expandtab' to force all tabs to spaces. – cmcginty Sep 22 '09 at 22:31

Martin York, Sep 23, 2009 at 0:07

@Casey: The first two lines will not apply. The last line will make sure you can just cut and paste into SO. – Martin York Sep 23 '09 at 0:07

Braden Best, Feb 4, 2016 at 16:40

Note that you can achieve the same thing with cat <file> | awk '{print "    " $line}' . So try :w ! awk '{print "    " $line}' | xclip -i . That's supposed to be four spaces between the "" – Braden Best Feb 4 '16 at 16:40

community wiki, Anders Holmberg, Dec 28, 2009 at 9:21

When working on a project where the build process is slow I always build in the background and pipe the output to a file called errors.err (something like make debug 2>&1 | tee errors.err ). This makes it possible for me to continue editing or reviewing the source code during the build process. When it is ready (using pynotify on GTK to inform me that it is complete) I can look at the result in vim using quickfix . Start by issuing :cf[ile] which reads the error file and jumps to the first error. I personally like to use cwindow to get the build result in a separate window.
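
A sketch of the two halves, with the file name from the paragraph above:

make debug 2>&1 | tee errors.err    in the shell: build and keep a log
:cfile errors.err                   in vim: read the log, jump to the first error
:cwindow                            open the quickfix window if there are errors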

community wiki, quabug, Jul 12, 2011 at 12:21

set colorcolumn=+1 or set cc=+1 for vim 7.3

Luc M, Oct 31, 2012 at 15:12

A short explanation would be appreciated... I tried it and could be very usefull! You can even do something like set colorcolumn=+1,+10,+20 :-) – Luc M Oct 31 '12 at 15:12

DBedrenko, Oct 31, 2014 at 16:17

@LucM If you tried it why didn't you provide an explanation? – DBedrenko Oct 31 '14 at 16:17

mjturner, Aug 19, 2015 at 11:16

colorcolumn allows you to specify columns that are highlighted (it's ideal for making sure your lines aren't too long). In the original answer, set cc=+1 highlights the column after textwidth . See the documentation for more information. – mjturner Aug 19 '15 at 11:16

community wiki, mpe, May 11, 2009 at 4:39

For making vim a little more like an IDE editor, turn on line numbers:

:set number

Rook, May 11, 2009 at 4:42

How does that make Vim more like an IDE ?? – Rook May 11 '09 at 4:42

mpe, May 12, 2009 at 12:29

I did say "a little" :) But it is something many IDEs do, and some people like it, eg: eclipse.org/screenshots/images/JavaPerspective-WinXP.png – mpe May 12 '09 at 12:29

Rook, May 12, 2009 at 21:25

Yes, but that's like saying yank/paste functions make an editor "a little" more like an IDE. Those are editor functions. Pretty much everything that goes with the editor that concerns editing text and that particular area is an editor function. IDE functions would be, for example, project/files management, connectivity with compiler & linker, error reporting, build automation tools, debugger ... i.e. the stuff that doesn't actually have anything to do with editing text. Vim has some functions & plugins so it can gravitate a little more towards being an IDE, but these are not the ones in question. – Rook May 12 '09 at 21:25

Rook, May 12, 2009 at 21:26

After all, an IDE = editor + compiler + debugger + building tools + ... – Rook May 12 '09 at 21:26

Rook, May 12, 2009 at 21:31

Also, just FYI, vim has an option to set invnumber. That way you don't have to "set nu" and "set nonu", i.e. remember two functions - you can just toggle. – Rook May 12 '09 at 21:31

community wiki, 2 revs, 2 users 50%, PuzzleCracker, Sep 13, 2009 at 23:20

I love the :ls command.

aehlke, Oct 28, 2009 at 3:16

Well what does it do? – aehlke Oct 28 '09 at 3:16

user59634, Dec 7, 2009 at 10:51

gives the current file name opened ? – user59634 Dec 7 '09 at 10:51

Nona Urbiz, Dec 20, 2010 at 8:25

:ls lists all the currently opened buffers. :be opens a file in a new buffer, :bn goes to the next buffer, :bp to the previous, :b filename opens buffer filename (it auto-completes too). buffers are distinct from tabs, which i'm told are more analogous to views. – Nona Urbiz Dec 20 '10 at 8:25

community wiki, 2 revs, 2 users 80%, Sep 17, 2010 at 16:45

A few useful ones:
:set nu # displays line numbers
:44     # go to line 44
'.      # go to last modification line

My favourite: Ctrl + n WORD COMPLETION!

community wiki, 2 revs, Jun 18, 2013 at 11:10

In insert mode, ctrl + x, ctrl + p will complete (with a menu of possible completions if that's how you like it) the current long identifier that you are typing.
if (SomeCall(LONG_ID_ <-- type c-x c-p here
            [LONG_ID_I_CANT_POSSIBLY_REMEMBER]
             LONG_ID_BUT_I_NEW_IT_WASNT_THIS_ONE
             LONG_ID_GOSH_FORGOT_THIS
             LONG_ID_ETC
             ∶

Justin L., Jun 13, 2013 at 16:21

i type ctrl+p way too much by accident while trying to hit ctrl+[ >< – Justin L. Jun 13 '13 at 16:21

community wiki, Fritz G. Mehner, Apr 22, 2009 at 16:41

Use the right mouse key to toggle insert mode in gVim with the following settings in ~/.gvimrc :
"
"------------------------------------------------------------------
" toggle insert mode <--> 'normal mode with the <RightMouse>-key
"------------------------------------------------------------------
nnoremap  <RightMouse> <Insert>
inoremap  <RightMouse> <ESC>
"

Andreas Grech, Jun 20, 2010 at 17:22

This is stupid. Defeats the productivity gains from not using the mouse. – Andreas Grech Jun 20 '10 at 17:22

Brady Trainor, Jul 5, 2014 at 21:07

Maybe fgm has head gestures mapped to mouse clicks. – Brady Trainor Jul 5 '14 at 21:07

community wiki, AIB, Apr 27, 2009 at 13:06

Replace all:

  :%s/oldtext/newtext/igc

The flags: i ignores case, g replaces every occurrence on each line, c asks for confirmation. At the confirmation prompt, answer a to replace all remaining matches :)

Nathan Fellman, Jan 12, 2011 at 20:58

or better yet, instead of typing a, just remove the c. c means confirm replacement. – Nathan Fellman Jan 12 '11 at 20:58

community wiki, 2 revs, Sep 13, 2009 at 18:39

Neither of the following is really diehard, but I find it extremely useful.

Trivial bindings, but I just can't live without them. They enable hjkl-style movement in insert mode (using the ctrl key). In normal mode: ctrl-k/j scrolls half a screen up/down and ctrl-l/h goes to the next/previous buffer. The µ and ù mappings are especially for an AZERTY keyboard and go to the next/previous make error.

imap <c-j> <Down>
imap <c-k> <Up>
imap <c-h> <Left>
imap <c-l> <Right>
nmap <c-j> <c-d>
nmap <c-k> <c-u>
nmap <c-h> <c-left>
nmap <c-l> <c-right>

nmap ù :cp<RETURN>
nmap µ :cn<RETURN>

A small function I wrote to highlight functions, globals, macros, structs and typedefs. (Might be slow on very large files.) Each type gets different highlighting (see ":help group-name" to get an idea of your current colortheme's settings). Usage: save the file with <Leader>ww (by default "\ww"). You need ctags for this.

nmap <Leader>ww :call SaveCtagsHighlight()<CR>

"Based on: http://stackoverflow.com/questions/736701/class-function-names-highlighting-in-vim
function SaveCtagsHighlight()
    write

    let extension = expand("%:e")
    if extension!="c" && extension!="cpp" && extension!="h" && extension!="hpp"
        return
    endif

    silent !ctags --fields=+KS *
    redraw!

    let list = taglist('.*')
    for item in list
        let kind = item.kind

        if     kind == 'member'
            let kw = 'Identifier'
        elseif kind == 'function'
            let kw = 'Function'
        elseif kind == 'macro'
            let kw = 'Macro'
        elseif kind == 'struct'
            let kw = 'Structure'
        elseif kind == 'typedef'
            let kw = 'Typedef'
        else
            continue
        endif

        let name = item.name
        if name != 'operator=' && name != 'operator ='
            exec 'syntax keyword '.kw.' '.name
        endif
    endfor
    echo expand("%")." written, tags updated"
endfunction

I have the habit of writing lots of code and functions and I don't like to write prototypes for them. So I made some functions to generate a list of prototypes within a C-style sourcefile. It comes in two flavors: one that removes the formal parameter's name and one that preserves it. I just refresh the entire list every time I need to update the prototypes. It avoids having out-of-sync prototypes and function definitions. Also needs ctags.

"Usage: in normal mode, where you want the prototypes to be pasted:
":call GenerateProptotypes()
function GeneratePrototypes()
    execute "silent !ctags --fields=+KS ".expand("%")
    redraw!
    let list = taglist('.*')
    let line = line(".")
    for item in list
        if item.kind == "function"  &&  item.name != "main"
            let name = item.name
            let retType = item.cmd
            let retType = substitute( retType, '^/\^\s*','','' )
            let retType = substitute( retType, '\s*'.name.'.*', '', '' ) 

            if has_key( item, 'signature' )
                let sig = item.signature
                let sig = substitute( sig, '\s*\w\+\s*,',        ',',   'g')
                let sig = substitute( sig, '\s*\w\+\(\s)\)', '\1', '' )
            else
                let sig = '()'
            endif
            let proto = retType . "\t" . name . sig . ';'
            call append( line, proto )
            let line = line + 1
        endif
    endfor
endfunction


function GeneratePrototypesFullSignature()
    "execute "silent !ctags --fields=+KS ".expand("%")
    let dir = expand("%:p:h")
    execute "silent !ctags --fields=+KSi --extra=+q ".dir."/*"
    redraw!
    let list = taglist('.*')
    let line = line(".")
    for item in list
        if item.kind == "function"  &&  item.name != "main"
            let name = item.name
            let retType = item.cmd
            let retType = substitute( retType, '^/\^\s*','','' )
            let retType = substitute( retType, '\s*'.name.'.*', '', '' ) 

            if has_key( item, 'signature' )
                let sig = item.signature
            else
                let sig = '(void)'
            endif
            let proto = retType . "\t" . name . sig . ';'
            call append( line, proto )
            let line = line + 1
        endif
    endfor
endfunction

community wiki, Yada, Nov 24, 2009 at 20:21

I collected these over the years.
" Pasting in normal mode should append to the right of cursor
nmap <C-V>      a<C-V><ESC>
" Saving
imap <C-S>      <C-o>:up<CR>
nmap <C-S>      :up<CR>
" Insert mode control delete
imap <C-Backspace> <C-W>
imap <C-Delete> <C-O>dw
nmap    <Leader>o       o<ESC>k
nmap    <Leader>O       O<ESC>j
" tired of my typo
nmap :W     :w

community wiki, jonyamo, May 10, 2010 at 15:01

Create a function to execute the current buffer using its shebang (assuming one is set) and call it with ctrl-x.
map <C-X> :call CallInterpreter()<CR>

au BufEnter *
\ if match (getline(1),  '^\#!') == 0 |
\   execute("let b:interpreter = getline(1)[2:]") |
\ endif

fun! CallInterpreter()
    if exists("b:interpreter")
        exec("! ".b:interpreter." %")
    endif
endfun

community wiki, Marcus Borkenhagen, Jan 12, 2011 at 15:22

map macros

I rather often find it useful to define some key mapping on the fly, just like one would define a macro. The twist here is that the mapping is recursive and is executed until it fails.

Example:

enum ProcStats
{
        ps_pid,
        ps_comm,
        ps_state,
        ps_ppid,
        ps_pgrp,
:map X /ps_<CR>3xixy<Esc>X

Gives:

enum ProcStats
{
        xypid,
        xycomm,
        xystate,
        xyppid,
        xypgrp,

Just a silly example :).

I am completely aware of all the downsides - it just so happens that I found it rather useful on some occasions. Also it can be interesting to watch it at work ;).

00dani, Aug 2, 2013 at 11:25

Macros are also allowed to be recursive and work in pretty much the same fashion when they are, so it's not particularly necessary to use a mapping for this. – 00dani Aug 2 '13 at 11:25

Reuse

Motions to mix with other commands, more here:

tx
fx
Fx

Use your favorite tools in Vim:

:r !python anything you want or awk or Y something

Repeat in visual mode, powerful when combined with the tips above:

;

[Oct 21, 2018] What are your suggestions for an ideal Vim configuration for Perl development?

Notable quotes:
"... The .vimrc settings should be heavily commented ..."
"... Look also at perl-support.vim (a Perl IDE for Vim/gVim). Comes with suggestions for customizing Vim (.vimrc), gVim (.gvimrc), ctags, perltidy, and Devel:SmallProf beside many other things. ..."
"... Perl Best Practices has an appendix on Editor Configurations . vim is the first editor listed. ..."
"... Andy Lester and others maintain the official Perl, Perl 6 and Pod support files for Vim on Github: https://github.com/vim-perl/vim-perl ..."
Aug 18, 2016 | stackoverflow.com
There are a lot of threads pertaining to how to configure Vim/GVim for Perl development on PerlMonks.org.

My purpose in posting this question is to try to create, as much as possible, an ideal configuration for Perl development using Vim/GVim. Please post your suggestions for .vimrc settings as well as useful plugins.

I will try to merge the recommendations into a set of .vimrc settings as well as a list of recommended plugins, ftplugins and syntax files.

.vimrc settings
"Create a command :Tidy to invoke perltidy"
"By default it operates on the whole file, but you can give it a"
"range or visual range as well if you know what you're doing."
command -range=% -nargs=* Tidy <line1>,<line2>!
    \perltidy -your -preferred -default -options <args>

"make tab in v mode indent code"
vmap <tab> >gv
vmap <s-tab> <gv

"make tab in normal mode indent code"
nmap <tab> I<tab><esc>
nmap <s-tab> ^i<bs><esc>

let perl_include_pod   = 1    "include pod.vim syntax file with perl.vim"
let perl_extended_vars = 1    "highlight complex expressions such as @{[$x, $y]}"
let perl_sync_dist     = 250  "use more context for highlighting"

set nocompatible "Use Vim defaults"
set backspace=2  "Allow backspacing over everything in insert mode"

set autoindent   "Always set auto-indenting on"
set expandtab    "Insert spaces instead of tabs in insert mode. Use spaces for indents"
set tabstop=4    "Number of spaces that a <Tab> in the file counts for"
set shiftwidth=4 "Number of spaces to use for each step of (auto)indent"

set showmatch    "When a bracket is inserted, briefly jump to the matching one"
syntax plugins ftplugins CPAN modules Debugging tools

I just found out about VimDebug. I have not yet been able to install it on Windows, but it looks promising from the description.

innaM, Oct 15, 2009 at 19:06

The .vimrc settings should be heavily commented. E.g., what does perl_include_pod do? – innaM Oct 15 '09 at 19:06

Sinan Ünür, Oct 15, 2009 at 20:02

@Manni: You are welcome. I have been using the same .vimrc for many years and a recent bunch of vim related questions got me curious. I was too lazy to wade through everything that was posted on PerlMonks (and see what was current etc.), so I figured we could put together something here. – Sinan Ünür Oct 15 '09 at 20:02

innaM, Oct 16, 2009 at 8:22

I think that that's a great idea. Sorry that my own contribution is that lame. – innaM Oct 16 '09 at 8:22

Telemachus, Jul 8, 2010 at 0:40

Rather than closepairs, I would recommend delimitMate or one of the various autoclose plugins. (There are about three named autoclose, I think.) The closepairs plugin can't handle a single apostrophe inside a string (i.e. print "This isn't so hard, is it?" ), but delimitMate and others can. github.com/Raimondi/delimitMate – Telemachus Jul 8 '10 at 0:40

community wiki, 2 revs, 2 users 74%, Dec 21, 2009 at 20:57

From chromatic's blog (slightly adapted to be able to use the same mapping from all modes).

vmap ,pt :!perltidy<CR>
nmap ,pt :%! perltidy<CR>

hit ,pt in normal mode to clean up the whole file, or in visual mode to clean up the selection. You could also add:

imap ,pt <ESC>:%! perltidy<CR>

But using commands from input mode is not recommended.

innaM, Oct 21, 2009 at 9:21

I seem to be missing something here: How can I type ,ptv without vim running perltidy on the entire file? – innaM Oct 21 '09 at 9:21

innaM, Oct 21, 2009 at 9:23

Ovid's comment (#3) seems to offer a much better solution. – innaM Oct 21 '09 at 9:23

innaM, Oct 21, 2009 at 13:22

Three hours later: turns out that the 'p' in that mapping is a really bad idea. It will bite you when vim's got something to paste. – innaM Oct 21 '09 at 13:22

Ether, Oct 21, 2009 at 15:23

@Manni: select a region first: with the mouse if using gvim, or with visual mode ( v and then use motion commands). – Ether Oct 21 '09 at 15:23

Ether, Oct 21, 2009 at 19:44

@Manni: I just gave it a try: if you type ,pt, vim waits for you to type something else (e.g. <cr>) as a signal that the command is ended. Hitting ,ptv will immediately format the region. So I would expect that vim recognizes that there is overlap between the mappings, and waits for disambiguation before proceeding. – Ether Oct 21 '09 at 19:44

community wiki hobbs, Oct 16, 2009 at 0:35

" Create a command :Tidy to invoke perltidy.
" By default it operates on the whole file, but you can give it a
" range or visual range as well if you know what you're doing.
command -range=% -nargs=* Tidy <line1>,<line2>!
    \perltidy -your -preferred -default -options <args>

community wiki Fritz G. Mehner, Oct 17, 2009 at 7:44

Look also at perl-support.vim (a Perl IDE for Vim/gVim). Comes with suggestions for customizing Vim (.vimrc), gVim (.gvimrc), ctags, perltidy, and Devel::SmallProf beside many other things.

innaM, Oct 19, 2009 at 12:32

I hate that one. The comments feature alone deserves a thorough 'rm -rf', IMHO. – innaM Oct 19 '09 at 12:32

sundar, Mar 11, 2010 at 20:54

I hate the fact that \$ is changed automatically to a "my $" declaration (same with \@ and \%). Does the author never use references or what?! – sundar Mar 11 '10 at 20:54

chiggsy, Sep 14, 2010 at 13:48

I take pieces of that one. If it were a man, you'd say about him, "He was only good for transplants..." – chiggsy Sep 14 '10 at 13:48

community wiki Permanuno, Oct 20, 2009 at 16:55

Perl Best Practices has an appendix on Editor Configurations . vim is the first editor listed.

community wiki 2 revs, 2 users 67% ,May 10, 2014 at 21:08

Andy Lester and others maintain the official Perl, Perl 6 and Pod support files for Vim on Github: https://github.com/vim-perl/vim-perl

Sinan Ünür, Jan 26, 2010 at 21:20

Note that that link is already listed in the body of the question (look under syntax). – Sinan Ünür Jan 26 '10 at 21:20

community wiki 2 revs, 2 users 94% ,Dec 7, 2010 at 19:42

For tidying, I use the following; either \t to tidy the whole file, or I select a few lines in shift+V mode and then do \t
nnoremap <silent> \t :%!perltidy -q<Enter>
vnoremap <silent> \t :!perltidy -q<Enter>

Sometimes it's also useful to deparse code. As with the lines above, this works either for the whole file or for a selection.

nnoremap <silent> \D :.!perl -MO=Deparse 2>/dev/null<CR>
vnoremap <silent> \D :!perl -MO=Deparse 2>/dev/null<CR>

community wiki 3 revs, 3 users 36% ,Oct 16, 2009 at 14:25

.vimrc:
" Allow :make to run 'perl -c' on the current buffer, jumping to 
" errors as appropriate
" My copy of vimparse: http://irc.peeron.com/~zigdon/misc/vimparse.pl

set makeprg=$HOME/bin/vimparse.pl\ -c\ %\ $*

" point at wherever you keep the output of pltags.pl, allowing use of ^-]
" to jump to function definitions.

set tags+=/path/to/tags

innaM, Oct 15, 2009 at 19:34

What is pltags.pl? Is it better than ctags? – innaM Oct 15 '09 at 19:34

Sinan Ünür, Oct 15, 2009 at 19:46

I think search.cpan.org/perldoc/Perl::Tags is based on it. – Sinan Ünür Oct 15 '09 at 19:46

Sinan Ünür, Oct 15, 2009 at 20:00

Could you please explain if there are any advantages to using pltags.pl rather than taglist.vim w/ ctags? – Sinan Ünür Oct 15 '09 at 20:00

innaM, Oct 16, 2009 at 14:24

And vimparse.pl really works for you? Is that really the correct URL? – innaM Oct 16 '09 at 14:24

zigdon, Oct 16, 2009 at 18:51

@sinan it enables quickfix - all it does is reformat the output of perl -c so that vim parses it as compiler errors. Then the usual quickfix commands work. – zigdon Oct 16 '09 at 18:51

community wiki innaM, Oct 19, 2009 at 8:57

Here's an interesting module I found on the weekend: App::EditorTools::Vim . Its most interesting feature seems to be its ability to rename lexical variables. Unfortunately, my tests revealed that it doesn't seem to be ready yet for any production use, but it sure seems worth keeping an eye on.

community wiki 3 revs, 2 users 79%, Oct 19, 2009 at 13:50

Here are a couple of my .vimrc settings. They may not be Perl specific, but I couldn't work without them:
set nocompatible        " Use Vim defaults (much better!) "
set bs=2                " Allow backspacing over everything in insert mode "
set ai                  " Always set auto-indenting on "
set showmatch           " show matching brackets "

" for quick scripts, just open a new buffer and type '_perls' "
iab _perls #!/usr/bin/perl<CR><BS><CR>use strict;<CR>use warnings;<CR>
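
For reference, typing _perls (followed by a space) in insert mode should expand to roughly this skeleton; the <BS> is presumably there to eat automatic indentation:

#!/usr/bin/perl

use strict;
use warnings;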

community wiki, J.J., Feb 17, 2010 at 21:35

I have two.

The first one I picked up in part from someone else, but I can't remember who. Sorry, unknown person. Here's how I made Ctrl+N autocomplete work with Perl modules. Here are my .vimrc commands.

" to use CTRL+N with modules for autocomplete "
set iskeyword+=:
set complete+=k~/.vim_extras/installed_modules.dat

Then I set up a cron job to create the installed_modules.dat file. Mine is for my Mandriva system; adjust accordingly.

locate *.pm | grep "perl5" | sed -e "s/\/usr\/lib\/perl5\///" | sed -e "s/5.8.8\///" | sed -e "s/5.8.7\///" | sed -e "s/vendor_perl\///" | sed -e "s/site_perl\///" | sed -e "s/x86_64-linux\///" | sed -e "s/\//::/g" | sed -e "s/\.pm//" >/home/jeremy/.vim_extras/installed_modules.dat

The second one allows me to use gf in Perl. gf is a shortcut to other files: just place your cursor over the file name and type gf, and it will open that file.

" To use gf with perl "
set path+=$PWD/**,
set path+=/usr/lib/perl5/*,
set path+=/CompanyCode/*,   " directory containing work code "
autocmd BufRead *.p? set include=^use
autocmd BufRead *.pl set includeexpr=substitute(v:fname,'\\(.*\\)','\\1.pm','i')

community wiki zzapper, Sep 23, 2010 at 9:41

I find the following abbreviations useful
iab perlb  print "Content-type: text/html\n\n <p>zdebug + $_ + $' + $`  line ".__LINE__.__FILE__."\n";exit;
iab perlbb print "Content-type: text/html\n\n<p>zdebug  <C-R>a  line ".__LINE__.__FILE__."\n";exit;
iab perlbd do{print "Content-type: text/html\n\n<p>zdebug  <C-R>a  line ".__LINE__."\n";exit} if $_ =~ /\w\w/i;
iab perld print "Content-type: text/html\n\n dumper";use Data::Dumper;$Data::Dumper::Pad="<br>";print Dumper <C-R>a ;exit;

iab perlf foreach $line ( keys %ENV )<CR> {<CR> }<LEFT><LEFT>
iab perle while (($k,$v) = each %ENV) { print "<br>$k = $v\n"; }
iab perli x = (i<4) ? 4 : i;
iab perlif if ($i==1)<CR>{<CR>}<CR>else<CR>{<CR>}
iab perlh $html=<<___HTML___;<CR>___HTML___<CR>

You can make them Perl-only with

au bufenter *.pl iab xbug print "<p>zdebug ::: $_ :: $' :: $`  line ".__LINE__."\n";exit;

mfontani, Dec 7, 2010 at 14:47

there's no my anywhere there; I take it you usually write CGIs with no use strict; ? (just curious if this is so) – mfontani Dec 7 '10 at 14:47

Sean McMillan, Jun 30, 2011 at 0:43

Oh wow, and without CGI.pm as well. It's like a 15 year flashback. – Sean McMillan Jun 30 '11 at 0:43

community wiki Benno, May 3, 2013 at 12:45

By far the most useful are
  1. Perl filetype plugin (ftplugin) - this colour-codes various code elements
  2. Creating a check-syntax-before-saving command "W" preventing you from saving bad code (you can override with the normal 'w').

Installing the plugins is a bit tricky, as different versions of vim (and Linux) put the plugins in different places. Mine are in ~/.vim/after/

My .vimrc is below.

set vb
set ts=2
set sw=2
set enc=utf-8
set fileencoding=utf-8
set fileencodings=ucs-bom,utf8,prc
set guifont=Monaco:h11
set guifontwide=NSimsun:h12
set pastetoggle=<F3>
command -range=% -nargs=* Tidy <line1>,<line2>!
    \perltidy
filetype plugin on
augroup JumpCursorOnEdit
 au!
 autocmd BufReadPost *
 \ if expand("<afile>:p:h") !=? $TEMP |
 \ if line("'\"") > 1 && line("'\"") <= line("$") |
 \ let JumpCursorOnEdit_foo = line("'\"") |
 \ let b:doopenfold = 1 |
 \ if (foldlevel(JumpCursorOnEdit_foo) > foldlevel(JumpCursorOnEdit_foo - 1)) |
 \ let JumpCursorOnEdit_foo = JumpCursorOnEdit_foo - 1 |
 \ let b:doopenfold = 2 |
 \ endif |
 \ exe JumpCursorOnEdit_foo |
 \ endif |
 \ endif
 " Need to postpone using "zv" until after reading the modelines.
 autocmd BufWinEnter *
 \ if exists("b:doopenfold") |
 \ exe "normal zv" |
 \ if(b:doopenfold > 1) |
 \ exe "+".1 |
 \ endif |
 \ unlet b:doopenfold |
 \ endif
augroup END

[Oct 21, 2018] Duplicate a whole line in Vim

Notable quotes:
"... Do people not run vimtutor anymore? This is probably within the first five minutes of learning how to use Vim. ..."
"... Can also use capital Y to copy the whole line. ..."
"... I think the Y should be "copy from the cursor to the end" ..."
"... In normal mode what this does is copy . copy this line to just below this line . ..."
"... And in visual mode it turns into '<,'> copy '> copy from start of selection to end of selection to the line below end of selection . ..."
"... I like: Shift + v (to select the whole line immediately and let you select other lines if you want), y, p ..."
"... Multiple lines with a number in between: y7yp ..."
"... 7yy is equivalent to y7y and is probably easier to remember how to do. ..."
"... or :.,.+7 copy .+7 ..."
"... When you press : in visual mode, it is transformed to '<,'> so it pre-selects the line range the visual selection spanned over ..."
Oct 21, 2018 | stackoverflow.com

sumek, Sep 16, 2008 at 15:02

How do I duplicate a whole line in Vim in a similar way to Ctrl + D in IntelliJ IDEA/Resharper or Ctrl + Alt + / in Eclipse?

dash-tom-bang, Feb 15, 2016 at 23:31

Do people not run vimtutor anymore? This is probably within the first five minutes of learning how to use Vim. – dash-tom-bang Feb 15 '16 at 23:31

Mark Biek, Sep 16, 2008 at 15:06

yy or Y to copy the line
or
dd to delete (cutting) the line

then

p to paste the copied or deleted text after the current line
or
P to paste the copied or deleted text before the current line

camflan, Sep 28, 2008 at 15:55

Can also use capital Y to copy the whole line. – camflan Sep 28 '08 at 15:55

nXqd, Jul 19, 2012 at 11:35

@camflan I think the Y should be "copy from the cursor to the end" – nXqd Jul 19 '12 at 11:35

Amir Ali Akbari, Oct 9, 2012 at 10:33

and 2yy can be used to copy 2 lines (and for any other n) – Amir Ali Akbari Oct 9 '12 at 10:33

zelk, Mar 9, 2014 at 13:29

To copy two lines, it's even faster just to go yj or yk, especially since you don't double up on one character. Plus, yk is a backwards version that 2yy can't do, and you can put the number of lines to reach backwards in y9j or y2k, etc.. Only difference is that your count has to be n-1 for a total of n lines, but your head can learn that anyway. – zelk Mar 9 '14 at 13:29

DarkWiiPlayer, Apr 13 at 7:26

I know I'm late to the party, but whatever; I have this in my .vimrc:
nnoremap <C-d> :copy .<CR>
vnoremap <C-d> :copy '><CR>

the :copy command just copies the selected line or the range (always whole lines) to below the line number given as its argument.

In normal mode what this does is :copy . , i.e. "copy this line to just below this line".

And in visual mode it turns into :'<,'>copy '> , i.e. "copy from the start of the selection to the end of the selection, to the line below the end of the selection".
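
Since :copy takes any Ex address, the same idea works ad hoc without a mapping; for instance:

:copy .     Duplicate the current line below itself
:5copy .    Copy line 5 to below the current line
:copy 0     Copy the current line to the top of the file
:copy $     Copy the current line to the end of the file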

yolenoyer, Apr 11 at 16:34

I like to use this mapping:
:nnoremap yp Yp

because it makes it consistent to use alongside the native YP command.

Gabe add a comment, Jul 14, 2009 at 4:45

I like: Shift + v (to select the whole line immediately and let you select other lines if you want), y, p

jedi, Feb 11 at 17:20

If you would like to duplicate a line and paste it right away below the current line, just like in Sublime Ctrl + Shift + D, then you can add this to your .vimrc file.

imap <S-C-d> <Esc>Yp

jedi, Apr 14 at 17:48

This works perfectly fine for me: imap <S-C-d> <Esc>Ypi in insert mode and nmap <S-C-d> <Esc>Yp in normal mode – jedi Apr 14 at 17:48

Chris Penner, Apr 20, 2015 at 4:33

Default is yyp, but I've been using this rebinding for a year or so and love it:

" set Y to duplicate lines, works in visual mode as well. nnoremap Y yyp vnoremap Y y`>pgv

yemu, Oct 12, 2013 at 18:23

yyp - paste after

yyP - paste before

Mikk, Dec 4, 2015 at 9:09

@A-B-B However, there is a minor difference here - which line your cursor will land on. – Mikk Dec 4 '15 at 9:09

theschmitzer, Sep 16, 2008 at 15:16

yyp - remember it with "yippee!"

Multiple lines with a number in between: y7yp

graywh, Jan 4, 2009 at 21:25

7yy is equivalent to y7y and is probably easier to remember how to do. – graywh Jan 4 '09 at 21:25

Nefrubyr, Jul 29, 2014 at 14:09

y7yp (or 7yyp) is rarely useful; the cursor remains on the first line copied so that p pastes the copied lines between the first and second line of the source. To duplicate a block of lines use 7yyP – Nefrubyr Jul 29 '14 at 14:09

DarkWiiPlayer, Apr 13 at 7:28

@Nefrubyr or :.,.+7 copy .+7 :P – DarkWiiPlayer Apr 13 at 7:28

Michael, May 12, 2016 at 14:54

For someone who doesn't know vi, some answers from above might mislead them with phrases like "paste ... after/before current line".
It's actually "paste ... after/before cursor".

yy or Y to copy the line
or
dd to delete the line

then

p to paste the copied or deleted text after the cursor
or
P to paste the copied or deleted text before the cursor


For more key bindings, you can visit this site: vi Complete Key Binding List

ap-osd, Feb 10, 2016 at 13:23

For those starting to learn vi, here is a good introduction to vi by listing side by side vi commands to typical Windows GUI Editor cursor movement and shortcut keys. It lists all the basic commands including yy (copy line) and p (paste after) or P (paste before).

vi (Vim) for Windows Users

pjz, Sep 16, 2008 at 15:04

yy

will yank the current line without deleting it

dd

will delete the current line

p

will put a line grabbed by either of the previous methods

Benoit, Apr 17, 2012 at 15:17

Normal mode: see other answers.

The Ex way:
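
For instance, with standard Ex addressing ( :t is the short form of :copy ):

:t.         Copy the current line to just below itself (duplicate it)
:t7         Copy the current line to below line 7
:t0         Copy the current line to the top of the file
:t$         Copy the current line to the end of the file
:'<,'>t0    Copy the visually selected lines to the top of the file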

If you need to move instead of copying, use :m instead of :t .

This can be really powerful if you combine it with :g or :v :
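
A sketch of that combination:

:g/TODO/t$    Append a copy of every line containing TODO to the end of the file
:v/TODO/t$    The same for every line *not* containing TODO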

Reference: :help range, :help :t, :help :g, :help :m and :help :v

Benoit, Jun 30, 2012 at 14:17

When you press : in visual mode, it is transformed to '<,'> so it pre-selects the line range the visual selection spanned over. So, in visual mode, :t0 will copy the lines at the beginning. – Benoit Jun 30 '12 at 14:17

Niels Bom, Jul 31, 2012 at 8:21

For the record: when you type a colon (:) you go into command line mode where you can enter Ex commands. vimdoc.sourceforge.net/htmldoc/cmdline.html Ex commands can be really powerful and terse. The yyp solutions are "Normal mode" commands. If you want to copy/move/delete a far-away line or range of lines an Ex command can be a lot faster. – Niels Bom Jul 31 '12 at 8:21

Burak Erdem, Jul 8, 2016 at 16:55

:t. is the exact answer to the question. – Burak Erdem Jul 8 '16 at 16:55

Aaron Thoma, Aug 22, 2013 at 23:31

Y is usually remapped to y$ (yank (copy) until end of line (from current cursor position, not beginning of line)) though. With this line in .vimrc : :nnoremap Y y$ – Aaron Thoma Aug 22 '13 at 23:31

Kwondri, Sep 16, 2008 at 15:37

If you want another way :-)

"ayy this will store the line in buffer a

"ap this will put the contents of buffer a at the cursor.

There are many variations on this.

"a5yy this will store the 5 lines in buffer a

see http://www.vim.org/htmldoc/help.html for more fun

frbl, Jun 21, 2015 at 21:04

Thanks, I used this as a bind: map <Leader>d "ayy"ap – frbl Jun 21 '15 at 21:04

Rook, Jul 14, 2009 at 4:37

Another option would be to go with:
nmap <C-d> mzyyp`z

gives you the advantage of preserving the cursor position.
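
Read key by key, the mapping does the following:

" nmap <C-d> mzyyp`z
" mz   set mark z at the current cursor position
" yy   yank the current line
" p    paste it below the current line
" `z   jump back to the exact position of mark z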

,Sep 18, 2008 at 20:32

You can also try <C-x><C-l> which will repeat the last line from insert mode and bring you a completion window with all of the lines. It works almost like <C-p>

Jorge Gajon, May 11, 2009 at 6:38

This is very useful, but to avoid having to press many keys I have mapped it to just CTRL-L, this is my map: inoremap ^L ^X^L – Jorge Gajon May 11 '09 at 6:38

cori, Sep 16, 2008 at 15:06

1 gotcha: when you use "p" to put the line, it puts it after the line your cursor is on, so if you want to add the line after the line you're yanking, don't move the cursor down a line before putting the new line.

Ghoti, Jan 31, 2016 at 11:05

or use capital P - put before – Ghoti Jan 31 '16 at 11:05

[Oct 21, 2018] Indent multiple lines quickly in vi

Oct 21, 2018 | stackoverflow.com

Allain Lalonde, Oct 25, 2008 at 3:27

Should be trivial, and it might even be in the help, but I can't figure out how to navigate it. How do I indent multiple lines quickly in vi?

Greg Hewgill, Oct 25, 2008 at 3:28

Use the > command. To indent 5 lines, 5>> . To mark a block of lines and indent it, Vjj> to indent 3 lines (vim only). To indent a curly-braces block, put your cursor on one of the curly braces and use >% .

If you're copying blocks of text around and need to align the indent of a block in its new location, use ]p instead of just p . This aligns the pasted block with the surrounding text.

Also, the shiftwidth setting allows you to control how many spaces to indent.

akdom, Oct 25, 2008 at 3:31

<shift>-v also works to select a line in Vim. – akdom Oct 25 '08 at 3:31

R. Martinho Fernandes, Feb 15, 2009 at 17:26

I use >i} (indent inner {} block). Works in vim. Not sure it works in vi. – R. Martinho Fernandes Feb 15 '09 at 17:26

Kamran Bigdely, Feb 28, 2011 at 23:25

My problem (in gVim) is that the command > indents much more than 2 blanks (I want just two blanks but > indents something like 5 blanks) – Kamran Bigdely Feb 28 '11 at 23:25

Greg Hewgill, Mar 1, 2011 at 18:42

@Kamran: See the shiftwidth setting for the way to change that. – Greg Hewgill Mar 1 '11 at 18:42

Greg Hewgill, Feb 28, 2013 at 3:36

@MattStevens: You can find extended discussion about this phenomenon here: meta.stackexchange.com/questions/9731/ – Greg Hewgill Feb 28 '13 at 3:36

Michael Ekoka, Feb 15, 2009 at 5:42

When you select a block and use > to indent, it indents then goes back to normal mode. I have this in my .vimrc file:
vnoremap < <gv

vnoremap > >gv

It lets you indent your selection as many time as you want.

sundar, Sep 1, 2009 at 17:14

To indent the selection multiple times, you can simply press . to repeat the previous command. – sundar Sep 1 '09 at 17:14

masukomi, Dec 6, 2013 at 21:24

The problem with . in this situation is that you have to move your fingers. With @mike's solution (same one I use) you've already got your fingers on the indent key and can just keep whacking it to keep indenting rather than switching and doing something else. Using period takes longer because you have to move your hands and it requires more thought because it's a second, different, operation. – masukomi Dec 6 '13 at 21:24

Johan, Jan 20, 2009 at 21:11

A big selection would be:
gg=G

It is really fast, and everything gets indented ;-)

asgs, Jan 28, 2014 at 21:57

I've an XML file and turned on syntax highlighting. Typing gg=G just puts every line starting from position 1. All the white spaces have been removed. Is there anything else specific to XML? – asgs Jan 28 '14 at 21:57

Johan, Jan 29, 2014 at 6:10

stackoverflow.com/questions/7600860/

Amanuel Nega, May 19, 2015 at 19:51

I think set cindent should be in vimrc, or you should run :set cindent before running that command – Amanuel Nega May 19 '15 at 19:51

Amanuel Nega, May 19, 2015 at 19:57

I think cindent must be set first. And @asgs, I think this only works for C-style programming languages. – Amanuel Nega May 19 '15 at 19:57

sqqqrly, Sep 28, 2017 at 23:59

I use block-mode visual selection:

This is not a uni-tasker. It works:

oligofren, Mar 27 at 15:23

This is cumbersome, but is the way to go if you do formatting outside of core VIM (for instance, using vim-prettier instead of the default indenting engine). Using > will otherwise royally screw up the formatting done by Prettier. – oligofren Mar 27 at 15:23

sqqqrly, Jun 15 at 16:30

Funny, I find it anything but cumbersome. This is not a uni-tasker! Learning this method has many uses beyond indenting. – sqqqrly Jun 15 at 16:30

user4052054, Aug 17 at 17:50

I find it better than the accepted answer, as I can see what is happening, the lines I'm selecting and the action I'm doing, and not just type some sort of vim incantation. – user4052054 Aug 17 at 17:50

Sergio, Apr 25 at 7:14

Suppose | represents the position of the cursor in Vim. If the text to be indented is enclosed in a code block like:
int main() {
line1
line2|
line3
}

you can do >i{ which means " indent ( > ) inside ( i ) block ( { ) " and get:

int main() {
    line1
    line2|
    line3
}

Now suppose the lines are contiguous but outside a block, like:

do
line2|
line3
line4
done

To indent lines 2 thru 4 you can visually select the lines and type > . Or even faster you can do >2j to get:

do
    line2|
    line3
    line4
done

Note that >Nj means indent from current line to N lines below. If the number of lines to be indented is large, it could take some seconds for the user to count the proper value of N . To save valuable seconds you can activate the option of relative number with set relativenumber (available since Vim version 7.3).

Sagar Jain, Apr 18, 2014 at 18:41

The master of all commands is
gg=G

This indents the entire file!

And below are some of the simple and elegant commands used to indent lines quickly in Vim or gVim.

To indent the current line
==

To indent the all the lines below the current line

=G

To indent n lines below the current line

n==

For example, to indent 4 lines below the current line

4==

To indent a block of code, go to one of the braces and use command

=%

These are the simplest, yet powerful commands to indent multiple lines.

rojomoke, Jul 30, 2014 at 15:48

This is just in vim, not vi. – rojomoke Jul 30 '14 at 15:48

Sagar Jain, Jul 31, 2014 at 3:40

@rojomoke: No, it works in vi as well – Sagar Jain Jul 31 '14 at 3:40

rojomoke, Jul 31, 2014 at 10:09

Not on my Solaris or AIX boxes it doesn't. The equals key has always been one of my standard ad hoc macro assignments. Are you sure you're not looking at a vim that's been linked to as vi? – rojomoke Jul 31 '14 at 10:09

rojomoke, Aug 1, 2014 at 8:22

Yeah, on Linux, vi is almost always a link to vim. Try running the :ve command inside vi. – rojomoke Aug 1 '14 at 8:22

datelligence, Dec 28, 2015 at 17:28

I love this kind of answers: clear, precise and succinct. Worked for me in Debian Jessie. Thanks, @SJain – datelligence Dec 28 '15 at 17:28

kapil, Mar 1, 2015 at 13:20

To indent every line in a file, type esc then G=gg

zundarz, Sep 10, 2015 at 18:41

:help left

In ex mode you can use :left or :le to align lines a specified amount. Specifically, :left will Left align lines in the [range]. It sets the indent in the lines to [indent] (default 0).

:%le3 or :%le 3 or :%left3 or :%left 3 will align the entire file by padding with three spaces.

:5,7 le 3 will align lines 5 through 7 by padding them with 3 spaces.

:le without any value or :le 0 will left align with a padding of 0.

This works in vim and gvim .

Subfuzion, Aug 11, 2017 at 22:02

Awesome, just what I was looking for (a way to insert a specific number of spaces -- 4 spaces for markdown code -- to override my normal indent). In my case I wanted to indent a specific number of lines in visual mode, so shift-v to highlight the lines, then :'<,'>le4 to insert the spaces. Thanks! – Subfuzion Aug 11 '17 at 22:02

Nykakin, Aug 21, 2015 at 13:33

There is one more way that hasn't been mentioned yet - you can use the norm i command to insert given text at the beginning of the line. To insert 10 spaces before lines 2-10:
:2,10norm 10i

Remember that there has to be a space character at the end of the command - this will be the character we want to have inserted. We can also indent lines with any other text, for example to indent every line in the file with 5 underscore characters:

:%norm 5i_

Or something even more fancy:

:%norm 2i[ ]

A more practical example is commenting Bash/Python/etc code with the # character:

:1,20norm i#

To un-indent, use x instead of i . For example, to remove the first 5 characters from every line:

:%norm 5x

Eliethesaiyan, Jun 13, 2016 at 14:18

this starts from the left side of the file...not the current position of the block – Eliethesaiyan Jun 13 '16 at 14:18

John Sonderson, Jan 31, 2015 at 19:17

Suppose you use 2 spaces to indent your code. Type:
:set shiftwidth=2

Then:

You get the idea.

( Empty lines will not get indented, which I think is kind of nice. )


I found the answer in the (g)vim documentation for indenting blocks:

:help visual-block
/indent

If you want to give a count to the command, do this just before typing the operator character: "v{move-around}3>" (move lines 3 indents to the right).

Michael, Dec 19, 2014 at 20:18

To indent all file by 4:
esc 4G=G

underscore_d, Oct 17, 2015 at 19:35

...what? 'indent by 4 spaces'? No, this jumps to line 4 and then indents everything from there to the end of the file, using the currently selected indent mode (if any). – underscore_d Oct 17 '15 at 19:35

Abhishesh Sharma, Jul 15, 2014 at 9:22

:line_num_start,line_num_end>

e.g.

14,21> shifts line number 14 to 21 to one tab

Increase the '>' symbol for more tabs

e.g.

14,21>>> for 3 tabs

HoldOffHunger, Dec 5, 2017 at 15:50

There are clearly a lot of ways to solve this, but this is the easiest to implement, as line numbers show by default in vim and it doesn't require math. – HoldOffHunger Dec 5 '17 at 15:50

rohitkadam19, May 7, 2013 at 7:13

5== will indent 5 lines from the current cursor position, so you can type any number before == and it will indent that many lines. This is in command mode.

gg=G will indent the whole file from top to bottom.

Kamlesh Karwande, Feb 6, 2014 at 4:04

I don't know why it's so difficult to find a simple answer like this one...

I myself had to struggle a lot to learn this; it's very simple.

Edit your .vimrc file under your home directory and add this line:

set cindent

Then, in the file you want to indent properly, type the following in normal/command mode:

10==   (this will indent 10 lines from the current cursor location )
gg=G   (complete file will be properly indented)

Michael Durrant, Nov 4, 2013 at 22:57

Go to the start of the text

Eric Leschinski, Dec 23, 2013 at 3:30

How to indent highlighted code in vi immediately by a # of spaces:

Option 1: Indent a block of code in vi to three spaces with Visual Block mode:

  1. Select the block of code you want to indent. Do this using Ctrl+V in normal mode and arrowing down to select text. While it is selected, enter : to give a command to the block of selected text.
  2. The following will appear in the command line: :'<,'>
  3. To set indent to 3 spaces, type le 3 and press enter. This is what appears: :'<,'>le 3
  4. The selected text is immediately indented to 3 spaces.

Option 2: Indent a block of code in vi to three spaces with Visual Line mode:

  1. Open your file in VI.
  2. Put your cursor over some code
  3. In normal mode, press the following keys:
    Vjjjj:le 3
    

    Interpretation of what you did:

    V means start selecting text.

    jjjj arrows down 4 lines, highlighting 4 lines.

    : tells vi you will enter an instruction for the highlighted text.

    le 3 means set the indent of the highlighted text to 3 spaces.

    The selected code is immediately increased or decreased to three spaces indentation.

Option 3: use Visual Block mode and special insert mode to increase indent:

  1. Open your file in VI.
  2. Put your cursor over some code
  3. In normal mode, press the following keys:

    Ctrl+V

    jjjj

    Shift+i

    (press spacebar 5 times)

    Esc

    All the highlighted text is indented an additional 5 spaces.

ire_and_curses, Mar 6, 2011 at 17:29

This answer summarises the other answers and comments of this question, and adds extra information based on the Vim documentation and the Vim wiki . For conciseness, this answer doesn't distinguish between Vi and Vim-specific commands.

In the commands below, "re-indent" means "indent lines according to your indentation settings ." shiftwidth is the primary variable that controls indentation.

General Commands

>>   Indent line by shiftwidth spaces
<<   De-indent line by shiftwidth spaces
5>>  Indent 5 lines
5==  Re-indent 5 lines

>%   Increase indent of a braced or bracketed block (place cursor on brace first)
=%   Reindent a braced or bracketed block (cursor on brace)
<%   Decrease indent of a braced or bracketed block (cursor on brace)
]p   Paste text, aligning indentation with surroundings

=i{  Re-indent the 'inner block', i.e. the contents of the block
=a{  Re-indent 'a block', i.e. block and containing braces
=2a{ Re-indent '2 blocks', i.e. this block and containing block

>i{  Increase inner block indent
<i{  Decrease inner block indent

You can replace { with } or B, e.g. =iB is a valid block indent command. Take a look at "Indent a Code Block" for a nice example to try these commands out on.

Also, remember that

.    Repeat last command

, so indentation commands can be easily and conveniently repeated.

Re-indenting complete files

Another common situation is requiring indentation to be fixed throughout a source file:

gg=G  Re-indent entire buffer

You can extend this idea to multiple files:

" Re-indent all your c source code:
:args *.c
:argdo normal gg=G
:wall

Or multiple buffers:

" Re-indent all open buffers:
:bufdo normal gg=G
:wall

In Visual Mode

Vjj> Visually mark and then indent 3 lines

In insert mode

These commands apply to the current line:

CTRL-t   insert indent at start of line
CTRL-d   remove indent at start of line
0 CTRL-d remove all indentation from line

Ex commands

These are useful when you want to indent a specific range of lines, without moving your cursor.

:< and :> Given a range, apply indentation e.g.
:4,8>   indent lines 4 to 8, inclusive

Indenting using markers

Another approach is via markers :

ma     Mark top of block to indent as marker 'a'

...move cursor to end location

>'a    Indent from marker 'a' to current location

Variables that govern indentation

You can set these in your .vimrc file .

set expandtab       "Use softtabstop spaces instead of tab characters for indentation
set shiftwidth=4    "Indent by 4 spaces when using >>, <<, == etc.
set softtabstop=4   "Indent by 4 spaces when pressing <TAB>

set autoindent      "Keep indentation from previous line
set smartindent     "Automatically inserts indentation in some cases
set cindent         "Like smartindent, but stricter and more customisable

Vim has intelligent indentation based on filetype. Try adding this to your .vimrc:

if has ("autocmd")
    " File type detection. Indent based on filetype. Recommended.
    filetype plugin indent on
endif

References

Amit, Aug 10, 2011 at 13:26

Both this answer and the one above it were great. But I +1'd this because it reminded me of the 'dot' operator, which repeats the last command. This is extremely useful when needing to indent an entire block several shiftwidths (or indentations) without needing to keep pressing >} . Thanks a lot – Amit Aug 10 '11 at 13:26

Wipqozn, Aug 24, 2011 at 16:00

5>> Indent 5 lines : This command indents the fifth line, not 5 lines. Could this be due to my VIM settings, or is your wording incorrect? – Wipqozn Aug 24 '11 at 16:00

ire_and_curses, Aug 24, 2011 at 16:21

@Wipqozn - That's strange. It definitely indents the next five lines for me, tested on Vim 7.2.330. – ire_and_curses Aug 24 '11 at 16:21

Steve, Jan 6, 2012 at 20:13

>42gg Indent from where you are to line 42. – Steve Jan 6 '12 at 20:13

aqn, Mar 6, 2013 at 4:42

Great summary! Also note that the "indent inside block" and "indent all block" (<i{ >a{ etc.) also works with parentheses and brackets: >a( <i] etc. (And while I'm at it, in addition to <>'s, they also work with d,c,y etc.) – aqn Mar 6 '13 at 4:42

NickSoft, Nov 5, 2013 at 16:19

I didn't find a method I use in the comments, so I'll share it (I think vim only):
  1. Esc to enter command mode
  2. Move to the first character of the last line you want to indent
  3. ctrl - v to start block select
  4. Move to the first character of the first line you want to indent
  5. shift - i to enter special insert mode
  6. type as many spaces/tabs as you need to indent to (2 for example)
  7. press Esc and the spaces will appear in all lines

This is useful when you don't want to change indent/tab settings in vimrc, or to remember to change them while editing.

To unindent I use the same ctrl - v block select to select the spaces and delete them with d .

svec, Oct 25, 2008 at 4:21

Also try this for C-style indenting; do :help = for more info:

={

That will auto-indent the current code block you're in.

Or just:

==

to auto-indent the current line.

underscore_d, Oct 17, 2015 at 19:39

doesn't work for me, just dumps my cursor to the line above the opening brace of 'the current code block i'm in'. – underscore_d Oct 17 '15 at 19:39

John La Rooy, Jul 2, 2013 at 7:24

Using Python a lot, I find myself frequently needing to shift blocks by more than one indent. You can do this by using any of the block selection methods, and then just enter the number of indents you wish to jump right before the >

Eg. V5j3> will indent 5 lines 3 times - which is 12 spaces if you use 4 spaces for indents

Juan Lanus, Sep 18, 2012 at 14:12

The beauty of vim's UI is that it's consistent. Editing commands are made up of the command and a cursor move. The cursor moves are always the same:

So, in order to use vim you have to learn to move the cursor and remember a repertoire of commands like, for example, > to indent (and < to "outdent").
Thus, for indenting the lines from the cursor position to the top of the screen you do >H, >G to indent to the bottom of the file.

If, instead of typing >H, you type dH then you are deleting the same block of lines, cH for replacing it, etc.

Some cursor movements fit better with specific commands. In particular, the % command is handy to indent a whole HTML or XML block.
If the file has syntax highlighting ( :syn on ) then setting the cursor in the text of a tag (like in the "i" of <div>) and entering >% will indent up to the closing </div> tag.

This is how vim works: one has to remember only the cursor movements and the commands, and how to mix them.
So my answer to this question would be "go to one end of the block of lines you want to indent, and then type the > command and a movement to the other end of the block" if indent is interpreted as shifting the lines, = if indent is interpreted as in pretty-printing.

aqn, Mar 6, 2013 at 4:38

I would say that vi/vim is mostly consistent. For instance, D does not behave the same as S and Y! :) – aqn Mar 6 '13 at 4:38

Kent Fredric, Oct 25, 2008 at 9:16

Key-Presses for more visual people:
  1. Enter Command Mode:
    Escape
  2. Move around to the start of the area to indent:
    hjkl↑↓←→
  3. Start a block:
    v
  4. Move around to the end of the area to indent:
    hjkl↑↓←→
  5. (Optional) Type the number of indentation levels you want
    0..9
  6. Execute the indentation on the block:
    >

Shane Reustle, Mar 10, 2011 at 22:24

This is great, but it uses spaces and not tabs. Any possible way to fix this? – Shane Reustle Mar 10 '11 at 22:24

Kent Fredric, Mar 16, 2011 at 8:33

If it's using spaces instead of tabs, then it's probably because you have indentation set to use spaces. =). – Kent Fredric Mar 16 '11 at 8:33

Kent Fredric, Mar 16, 2011 at 8:36

When the 'expandtab' option is off (this is the default) Vim uses <Tab>s as much as possible to make the indent. ( :help :> ) – Kent Fredric Mar 16 '11 at 8:36

Shane Reustle, Dec 2, 2012 at 3:17

The only tab/space related vim setting I've changed is :set tabstop=3. It's actually inserting this every time I use >>: "<tab><space><space>". Same with indenting a block. Any ideas? – Shane Reustle Dec 2 '12 at 3:17

Kent Fredric, Dec 2, 2012 at 17:08

The three settings you want to look at for "spaces vs tabs" are 1. tabstop 2. shiftwidth 3. expandtab. You probably have "shiftwidth=5 noexpandtab", so a "tab" is 3 spaces, and an indentation level is "5" spaces, so it makes up the 5 with 1 tab, and 2 spaces. – Kent Fredric Dec 2 '12 at 17:08

mda, Jun 4, 2012 at 5:12

For me, the MacVim (visual) solution was to select with the mouse and press ">", after putting the following lines in "~/.vimrc", since I like spaces instead of tabs:
set expandtab
set tabstop=2
set shiftwidth=2

Also it's useful to be able to call MacVim from the command-line (Terminal.app), so I have the following helper directory "~/bin", where I place a script called "macvim":

#!/usr/bin/env bash
/usr/bin/open -a /Applications/MacPorts/MacVim.app $@

And of course in "~/.bashrc":

export PATH=$PATH:$HOME/bin

Macports messes with "~/.profile" a lot, so the PATH environment variable can get quite long.

jash, Feb 17, 2012 at 15:16

>} or >{ indent from current line up to next paragraph

<} or <{ same un-indent

Eric Kigathi, Jan 4, 2012 at 0:41

A quick way to do this using VISUAL MODE uses the same process as commenting a block of code.

This is useful if you would prefer not to change your shiftwidth or use any set directives and is flexible enough to work with TABS or SPACES or any other character.

  1. Position cursor at the beginning on the block
  2. v to switch to -- VISUAL MODE --
  3. Select the text to be indented
  4. Type : to switch to the prompt
  5. Replacing with 3 leading spaces:

    :'<,'>s/^/   /g

  6. Or replacing with leading tabs:

    :'<,'>s/^/\t/g

  7. Brief Explanation:

    '<,'> - Within the Visually Selected Range

    s/^/   /g - Insert 3 spaces at the beginning of every line within the whole range

    (or)

    s/^/\t/g - Insert Tab at the beginning of every line within the whole range
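
The same substitution idea works in reverse for un-indenting; a sketch, assuming three leading spaces or one leading tab:

    :'<,'>s/^   //    Remove three leading spaces from every selected line

    :'<,'>s/^\t//     Remove one leading tab from every selected line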

pankaj ukumar, Nov 11, 2009 at 17:33

do this
$vi .vimrc

and add this line

autocmd FileType cpp setlocal expandtab shiftwidth=4 softtabstop=4 cindent

this is only for cpp files; you can do this for other file types as well just by modifying the filetype...

SteveO, Nov 10, 2010 at 19:16

I like to mark text for indentation:
  1. go to beginning of line of text then type ma (a is the label from the 'm'ark: it could be any letter)
  2. go to end line of text and type mz (again z could be any letter)
  3. :'a,'z> or :'a,'z< will indent or outdent (is this a word?)
  4. Voila! the text is moved (empty lines remain empty with no spaces)

PS: you can use :'a,'z technique to mark a range for any operation (d,y,s///, etc) where you might use lines, numbers, or %

Paul Tomblin, Oct 25, 2008 at 4:08

As well as the offered solutions, I like to do things a paragraph at a time with >}

aqn, Mar 6, 2013 at 4:47

Yup, and this is why one of my big peeves is white spaces on an otherwise empty line: they mess up vim's notion of a "paragraph". – aqn Mar 6 '13 at 4:47

Daniel Spiewak, Oct 25, 2008 at 4:00

In addition to the answer already given and accepted, it is also possible to place a marker and then indent everything from the current cursor to the marker. Thus, enter ma where you want the top of your indented block, cursor down as far as you need and then type >'a (note that " a " can be substituted for any valid marker name). This is sometimes easier than 5>> or vjjj> .

user606723, Mar 17, 2011 at 15:31

This is really useful. I am going to have to look up what all works with this. I know d'a and y'a, what else? – user606723 Mar 17 '11 at 15:31

ziggy, Aug 25, 2014 at 14:14

This is very useful as it avoids the need to count how many lines you want to indent. – ziggy Aug 25 '14 at 14:14

[Oct 19, 2018] Vim faster way to select blocks of text in visual mode - Stack Overflow

Oct 19, 2018 | stackoverflow.com

Vim: faster way to select blocks of text in visual mode


Calvin Cheng ,Sep 13, 2011 at 18:52

I have been using vim for quite some time and am aware that selecting blocks of text in visual mode is as simple as SHIFT + V and moving the arrow key up or down line-by-line until I reach the end of the block of text that I want selected.

My question is - is there a faster way in visual mode to select a block of text, for example by SHIFT + V followed by specifying the line number at which I want the selection to stop? (via :35 for example, where 35 is the line number I want to select up to - this obviously does not work, so my question is to find out if something similar to this can be done...)

user786653 ,Sep 13, 2011 at 19:08

+1 Good question as I have found myself doing something like this often. I am wondering if perhaps this isn't the place start using using v% or v/pattern or something else? – user786653 Sep 13 '11 at 19:08

SergioAraujo ,Sep 13, 2011 at 20:30

vip selects inner paragraph, vis selects inner sentence. – SergioAraujo Sep 13 '11 at 20:30

Stephan ,Sep 29, 2014 at 22:49

V35G will visually select from current line to line 35, also V10j or V10k will visually select the next or previous 10 lines – Stephan Sep 29 '14 at 22:49

shriek ,Feb 20, 2015 at 4:28

@Stephan, that's just what I was looking for. Thanks!! – shriek Feb 20 '15 at 4:28

Mikhail V ,Mar 27, 2015 at 16:52

for line selecting I use shortcut: nnoremap <Space> V . When in visual line mode just right-click with mouse to define selection (at least on linux it is so). Anyway, more effective than with keyboard only. – Mikhail V Mar 27 '15 at 16:52

Jay ,Sep 13, 2011 at 19:05

In addition to what others have said, you can also expand your selection using pattern searches.

For example, v/foo will select from your current position to the next instance of "foo." If you actually wanted to expand to the next instance of "foo," on line 35, for example, just press n to expand selection to the next instance, and so on.

update

I don't often do it, but I know that some people use marks extensively to make visual selections. For example, if I'm on line 5 and I want to select to line 35, I might press ma to place mark a on line 5, then :35 to move to line 35. Shift + v to enter linewise visual mode, and finally `a to select back to mark a .

Calvin Cheng ,Sep 13, 2011 at 19:19

now this is COOL. Thanks! – Calvin Cheng Sep 13 '11 at 19:19

Peter Rincker ,Sep 13, 2011 at 19:41

If you need to include the pattern you can use v/foo/e . The e stands for "end" of the matched pattern. – Peter Rincker Sep 13 '11 at 19:41

bheeshmar ,Sep 13, 2011 at 20:29

And you can modify from that line with offsets: V/foo/+5 or V/foo/-5 (I'm using linewise visual mode like the author). – bheeshmar Sep 13 '11 at 20:29

Jay ,Oct 31, 2013 at 0:18

@DanielPark To select the current word, use v i w . If you want to select the current contiguous non-whitespace, use v i Shift + w . The difference would be when the caret is here MyCla|ss.Method , the first combo would select MyClass and second would select the whole thing. – Jay Oct 31 '13 at 0:18

Daniel Park ,Oct 31, 2013 at 1:55

Thanks. Found that also using v i w s allows you to effectively do a "replace" operation. – Daniel Park Oct 31 '13 at 1:55

bheeshmar ,Sep 13, 2011 at 19:00

G                       Goto line [count], default last line, on the first
                        non-blank character linewise.  If 'startofline' not
                        set, keep the same column.
                        G is one of the jump-motions.

V35G achieves what you want

Daniel Kobe ,Apr 24, 2017 at 18:16

My vim gives me the error Not an editor command: V – Daniel Kobe Apr 24 '17 at 18:16

kzh ,Apr 24, 2017 at 19:42

@Daniel vimdoc.sourceforge.net/htmldoc/visual.html#V – kzh Apr 24 '17 at 19:42

bheeshmar ,Apr 25, 2017 at 19:06

@DanielKobe, it's a Normal mode command, so don't press ":". – bheeshmar Apr 25 '17 at 19:06

kzh ,Jun 1, 2013 at 12:29

Vim is a language. To really understand Vim, you have to know the language. Many commands are verbs, and vim also has objects and prepositions.
V100G
V100gg

This means "select the current line up to and including line 100."

Text objects are where a lot of the power is at. They introduce more objects with prepositions.

Vap

This means "select around the current paragraph", that is select the current paragraph and the blank line following it.

V2ap

This means "select around the current paragraph and the next paragraph."

}V-2ap

This means "go to the end of the current paragraph and then visually select it and the preceding paragraph."

Understanding Vim as a language will help you to get the best mileage out of it.

After you have selecting down, then you can combine with other commands:

Vapd

With the above command, you can select around a paragraph and delete it. Change the d to a y to copy or to a c to change or to a p to paste over.

Once you get the hang of how all these commands work together, then you will eventually not need to visually select anything. Instead of visually selecting and then deleting a paragraph, you can just delete the paragraph with the dap command.

Daniel Kobe ,Apr 24, 2017 at 18:17

My vim gives me the error Not an editor command: V – Daniel Kobe Apr 24 '17 at 18:17

kzh ,Apr 24, 2017 at 19:43

@Daniel vimdoc.sourceforge.net/htmldoc/visual.html#V – kzh Apr 24 '17 at 19:43

michaelmichael ,Sep 13, 2011 at 18:58

v35G will select everything from the cursor up to line 35.

v puts you in select mode, 35 specifies the line number that you want to G go to.

You could also use v} which will select everything up to the beginning of the next paragraph.

mateusz.fiolka ,Sep 13, 2011 at 18:58

For selecting number of lines:

shift+v 9j - select 10 lines

Peter Rincker ,Sep 13, 2011 at 19:38

For small ranges this is good, especially when paired with :set rnu – Peter Rincker Sep 13 '11 at 19:38

µBio ,Sep 13, 2011 at 18:55

v 35 j


Peng Zhang ,Feb 17, 2016 at 3:28

Shift+V n j or Shift+V n k

This selects the current line and the next/previous n lines. I find it very useful.

Kevin Yue ,Sep 24, 2016 at 5:20

It's very useful, thanks. – Kevin Yue Sep 24 '16 at 5:20

Arsal ,Apr 24, 2017 at 20:09

This is a simple way I was looking for. Thanks – Arsal Apr 24 '17 at 20:09

> ,

Text objects: http://vim.wikia.com/wiki/Creating_new_text_objects

http://vimdoc.sourceforge.net/htmldoc/motion.html#text-objects

[Oct 19, 2018] Accessing Multiple Files in VIM

Oct 19, 2018 | stackoverflow.com

Léo Léopold Hertz, Apr 19, 2009 at 2:44

The book "Unix in a Nutshell" discusses about accessing multiple files on pages 572-573. There seem to be very useful commands such as ":e", ":e #", ":e new_file", ":n files", ":args", ":prev" and ":n!". The commands confuse me:

":n Edit next file in the list of files."

":args Display list of files to be edited."

":prev Edit previous file in the list of files."

I cannot see a real list when I do ":args"; there is only a small text at the corner. I would like to see all files that I accessed with ":e", i.e. a list of files in the buffer.

Where can I see the list when I do the command ":n files"? What are the commands ":prev" and ":n" supposed to do? I got the error message:

There is only one file to edit.

Rook, Apr 19, 2009 at 3:25

Regarding the last part: If you have only one buffer open, then you cannot toggle through them ('cause there is only one open). – Rook Apr 19 '09 at 3:25

Rook, Apr 19, 2009 at 3:20

I've not read the book you mention, but I'll try to explain how vim handles files.

Vim has buffers. You open every file with:

:e name_of_the_file.txt (loads file in a buffer, i.e. "opens file")

You can also:

:e *.txt

Useful options while doing this are

:set laststatus=2 (to always show the statusline)

:set wildmenu (to ease opening files)

If you have standard vim with netrw plugin, you can:

:e . (for one of vim's file managers, so to say)

To manage buffers:

:ls will give you a list of currently opened buffers (files)

:bnext , and :bprevious (or :bn and :bp ) enable you to cycle through buffers

:bd closes the buffer/file (buffer done)

Other buffer types serve other purposes (yanking/pasting, temporary, vim's internal, ... etc.)

Léo Léopold Hertz 준영 ,Apr 19, 2009 at 3:57

Is the notation of buffer the same as in Emacs? Interestingly, the book defines buffer only for Emacs :( It states "When you open a file in Emacs, the file is put into a Buffer. -- The view of the buffer contents that you have at any point in time is called a window." Are the buffers and windows different to the things in Vim? – Léo Léopold Hertz 준영 Apr 19 '09 at 3:57

Rook ,Apr 19, 2009 at 11:08

Yes, you could say that. There are some differences in the types of available buffers, but in principle, that's it. I'm not sure about emacs; it has windows/frames.., while vim has windows/tabs. Regarding vim: a window is only a method of showing what vim has in a buffer. A tab is a method of showing several windows on screen (tabs in vim have only recently been introduced). – Rook Apr 19 '09 at 11:08

Kyle Strand ,Sep 28, 2015 at 17:24

I don't think :e *.txt can be used to open multiple files. :next appears to work, though: stackoverflow.com/a/12304605/1858225 – Kyle Strand Sep 28 '15 at 17:24

Brian Carper ,Apr 19, 2009 at 4:05

In addition to what Jonathan Leffler said, if you don't invoke Vim with multiple files from the commandline, you can set Vim's argument list after Vim is open via:
:args *.c

Note that the argument list is different from the list of open buffers you get from :ls . Even if you close all open buffers in Vim, the argument list stays the same. :n and :prev may open a brand new buffer in Vim (if a buffer for that file isn't already open), or may take you to an existing buffer.

Similarly you can open multiple buffers in Vim without affecting the argument list (or even if the arg list is empty). :e opens a new buffer but doesn't necessarily affect the argument list. The list of open buffers and the argument list are independent. If you want to iterate through the list of open buffers rather than iterate through the argument list, use :bn and :bp and friends.
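
A compact recap of the two independent lists as Ex commands (foo.txt is just a placeholder name):

:args *.c     Set the argument list (or display it, with no arguments)
:n  :prev     Move forward/backward through the argument list
:ls           List open buffers, independent of the argument list
:bn  :bp      Cycle forward/backward through the buffer list
:e foo.txt    Open a new buffer without touching the argument list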

Jonathan Leffler ,Apr 19, 2009 at 2:54

For those commands to make sense, you do:
vim *.c

in a directory where there are twenty C files, for example. With a single file, there is no next or previous or significant list of files.

Léo Léopold Hertz 준영 ,Apr 19, 2009 at 3:48

Wow! Very nice example :) Great thanks! – Léo Léopold Hertz 준영 Apr 19 '09 at 3:48

DanM ,Apr 19, 2009 at 4:02

The :n :p :ar :rew :last operate on the command line argument list.

E.g.

> touch aaa.txt bbb.txt ccc.txt
> gvim *.txt

vim opens in aaa.txt

:ar gives a status line

[aaa.txt] bbb.txt ccc.txt

:n moves to bbb.txt

:ar gives the status line

aaa.txt [bbb.txt] ccc.txt

:rew rewinds us back to the start of the command line arg list to aaa.txt

:last sends us to ccc.txt

:e ddd.txt edits a new file ddd.txt

:ar gives the status line

aaa.txt bbb.txt [ccc.txt]

So the command set only operates on the initial command line argument list.

,

To clarify, Vim has the argument list, the buffer list, windows, and tab pages. The argument list is the list of files you invoked vim with (e.g. vim file1 file2); the :n and :p commands work with this. The buffer list is the list of in-memory copies of the files you are editing, just like emacs. Note that all the files loaded at start (in the argument list) are also in the buffer list. Try :help buffer-list for more information on both.

Windows are viewports for buffers. Think of windows as "desks" on which you can put buffers to work on them. Windows can be empty or be displaying buffers that can also be displayed in other windows, which you can use for example to look at two different areas of the same buffer at the same time. Try :help windows for more info.

Tabs are collections of windows. For example, you can have one tab with one window, and another tab with two windows vertically split. Try :help tabpage for more info

[Oct 18, 2018] 'less' command clearing screen upon exit - how to switch it off?

Notable quotes:
"... To prevent less from clearing the screen upon exit, use -X . ..."
Oct 18, 2018 | superuser.com

Wojciech Kaczmarek ,Feb 9, 2010 at 11:21

How to force the less program to not clear the screen upon exit?

I'd like it to behave like git log command:

Any ideas? I haven't found any suitable less options nor env variables in a manual, I suspect it's set via some env variable though.

sleske ,Feb 9, 2010 at 11:59

To prevent less from clearing the screen upon exit, use -X .

From the manpage:

-X or --no-init

Disables sending the termcap initialization and deinitialization strings to the terminal. This is sometimes desirable if the deinitialization string does something unnecessary, like clearing the screen.

As to less exiting if the content fits on one screen, that's option -F :

-F or --quit-if-one-screen

Causes less to automatically exit if the entire file can be displayed on the first screen.

-F is not the default though, so it's likely preset somewhere for you. Check the env var LESS .

markpasc ,Oct 11, 2010 at 3:44

This is especially annoying if you know about -F but not -X , as then moving to a system that resets the screen on init will make short files simply not appear, for no apparent reason. This bit me with ack when I tried to take my ACK_PAGER='less -RF' setting to the Mac. Thanks a bunch! – markpasc Oct 11 '10 at 3:44

sleske ,Oct 11, 2010 at 8:45

@markpasc: Thanks for pointing that out. I would not have realized that this combination would cause this effect, but now it's obvious. – sleske Oct 11 '10 at 8:45

Michael Goldshteyn ,May 30, 2013 at 19:28

This is especially useful for the man pager, so that man pages do not disappear as soon as you quit less with the 'q' key. That is, you scroll to the position in a man page that you are interested in only for it to disappear when you quit the less pager in order to use the info. So, I added: export MANPAGER='less -s -X -F' to my .bashrc to keep man page info up on the screen when I quit less, so that I can actually use it instead of having to memorize it. – Michael Goldshteyn May 30 '13 at 19:28

Michael Burr ,Mar 18, 2014 at 22:00

It kinda sucks that you have to decide when you start less how it must behave when you're going to exit. – Michael Burr Mar 18 '14 at 22:00

Derek Douville ,Jul 11, 2014 at 19:11

If you want any of the command-line options to always be default, you can add to your .profile or .bashrc the LESS environment variable. For example:
export LESS="-XF"

will always apply -X -F whenever less is run from that login session.

Sometimes commands are aliased (even by default in certain distributions). To check for this, type

alias

without arguments to see if it got aliased with options that you don't want. To run the actual command in your $PATH instead of an alias, just preface it with a back-slash :

\less

To see if a LESS environment variable is set in your environment and affecting behavior:

echo $LESS

dotancohen ,Sep 2, 2014 at 10:12

In fact, I add export LESS="-XFR" so that the colors show through less as well. – dotancohen Sep 2 '14 at 10:12

Giles Thomas ,Jun 10, 2015 at 12:23

Thanks for that! -XF on its own was breaking the output of git diff , and -XFR gets the best of both worlds -- no screen-clearing, but coloured git diff output. – Giles Thomas Jun 10 '15 at 12:23

[Oct 18, 2018] Isn't less just more

Highly recommended!
Oct 18, 2018 | unix.stackexchange.com

Bauna ,Aug 18, 2010 at 3:07

less is a lot more than more , for instance you have a lot more functionality:
g: go top of the file
G: go bottom of the file
/: search forward
?: search backward
N: show line number
: goto line
F: similar to tail -f, stop with ctrl+c
S: split lines

And I don't remember more ;-)

törzsmókus ,Feb 19 at 13:19

h : everything you don't remember ;) – törzsmókus Feb 19 at 13:19

KeithB ,Aug 18, 2010 at 0:36

There are a couple of things that I do all the time in less that don't work in more (at least the versions on the systems I use). One is using G to go to the end of the file, and g to go to the beginning. This is useful for log files, when you are looking for recent entries at the end of the file. The other is search, where less highlights the match, while more just brings you to the section of the file where the match occurs, but doesn't indicate where it is.

geoffc ,Sep 8, 2010 at 14:11

Less has a lot more functionality.

You can use v to jump into the current $EDITOR. You can convert to tail -f mode with f as well as all the other tips others offered.

Ubuntu still has distinct less/more bins. At least mine does, or the more command is sending different arguments to less.

In any case, to see the difference, find a file that has more rows than you can see at one time in your terminal. Type cat , then the file name. It will just dump the whole file. Type more , then the file name. If on ubuntu, or at least my version (9.10), you'll see the first screen, then --More--(27%) , which means there's more to the file, and you've seen 27% so far. Press space to see the next page. less allows moving line by line, back and forth, plus searching and a whole bunch of other stuff.

Basically, use less . You'll probably never need more for anything. I've used less on huge files and it seems OK. I don't think it does crazy things like load the whole thing into memory ( cough Notepad). Showing line numbers could take a while, though, with huge files.

[Oct 18, 2018] What are the differences between most, more and less

Highly recommended!
Jun 29, 2013 | unix.stackexchange.com

Smith John ,Jun 29, 2013 at 13:16

more

more is an old utility. When the text passed to it is too large to fit on one screen, it pages it. You can scroll down but not up.

Some systems hardlink more to less , providing users with a strange hybrid of the two programs that looks like more and quits at the end of the file like more but has some less features such as backwards scrolling. This is a result of less 's more compatibility mode. You can enable this compatibility mode temporarily with LESS_IS_MORE=1 less ... .

more passes raw escape sequences by default. Escape sequences tell your terminal which colors to display.

less

less was written by a man who was fed up with more 's inability to scroll backwards through a file. He turned less into an open source project and over time, various individuals added new features to it. less is massive now. That's why some small embedded systems have more but not less . For comparison, less 's source is over 27000 lines long. more implementations are generally only a little over 2000 lines long.

In order to get less to pass raw escape sequences, you have to pass it the -r flag. You can also tell it to only pass ANSI escape characters by passing it the -R flag.
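For example, a small sketch of paging colored output with -R (the pattern and file names are placeholders; both commands below support --color=always):

# Preserve ANSI colors while paging:
grep --color=always pattern file | less -R
ls --color=always | less -R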

most

most is supposed to be more than less . It can display multiple files at a time. By default, it truncates long lines instead of wrapping them and provides a left/right scrolling mechanism. most's website has no information about most 's features. Its manpage indicates that it is missing at least a few less features such as log-file writing (you can use tee for this though) and external command running.

By default, most uses strange non-vi-like keybindings. man most | grep '\<vi.?\>' doesn't return anything so it may be impossible to put most into a vi-like mode.

most has the ability to decompress gunzip-compressed files before reading. Its status bar has more information than less 's.

most passes raw escape sequences by default.

tifo ,Oct 14, 2014 at 8:44

Short answer: Just use less and forget about more

Longer version:

more is an old utility. You can't browse stepwise with more ; you can use space to browse page-wise, or enter to go line by line, and that is about it. less is more plus additional features: you can browse page-wise or line-wise both up and down, and search.

Jonathan.Brink ,Aug 9, 2015 at 20:38

If "more" is lacking for you and you know a few vi commands use "less" – Jonathan.Brink Aug 9 '15 at 20:38

Wilko Fokken ,Jan 30, 2016 at 20:31

There is one single use case in which I prefer more to less :

To check my LATEST modified log files (in /var/log/ ), I use ls -AltF | more .

While less clears the screen after exiting with q , more leaves those files and directories listed by ls on the screen, sparing me from memorizing their names for examination.

(Should anybody know a parameter or configuration enabling less to keep its text after exiting, that would render this post obsolete.)

Jan Warchoł ,Mar 9, 2016 at 10:18

The parameter you want is -X (long form: --no-init ). From less 's manpage:

Disables sending the termcap initialization and deinitialization strings to the terminal. This is sometimes desirable if the deinitialization string does something unnecessary, like clearing the screen.

Jan Warchoł Mar 9 '16 at 10:18
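Applied to the ls use case above, a minimal sketch:

# List newest entries first and keep the listing on screen after quitting:
ls -AltF | less -X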

[Sep 17, 2018] Which HTML5 tag should I use to mark up an author's name - Stack Overflow

Sep 17, 2018 | stackoverflow.com

Which HTML5 tag should I use to mark up an author's name?


Quang Van ,Sep 3, 2011 at 1:03

For example of a blog-post or article.
<article>
<h1>header</h1>
<time>09-02-2011</time>
<author>John</author>
My article....
</article>

The author tag doesn't exist though... So what is the commonly used HTML5 tag for authors? Thanks.

(If there isn't, shouldn't there be one?)

Joseph Marikle ,Sep 3, 2011 at 1:10

<cite> maybe? I don't know lol. :P Doesn't make very much of a difference in style though. – Joseph Marikle Sep 3 '11 at 1:10

jalgames ,Apr 23, 2014 at 12:53

It's not about style. Technically, you can use a <p> to create a heading just by increasing the font size. But search engines won't understand it like that. – jalgames Apr 23 '14 at 12:53

Andreas Rejbrand ,Nov 17, 2014 at 18:58

You are not allowed to use the time element like that. Since dd-mm-yyyy isn't one of the recognised formats, you have to supply a machine-readable version (in one of the recognised formats) in a datetime attribute of the time element. See w3.org/TR/2014/REC-html5-20141028/ – Andreas Rejbrand Nov 17 '14 at 18:58

Dan Dascalescu ,Jun 13, 2016 at 0:23

There's a better answer now than the accepted (robertc's) one. – Dan Dascalescu Jun 13 '16 at 0:23

robertc ,Sep 3, 2011 at 1:13

HTML5 has an author link type :
<a href="http://johnsplace.com" rel="author">John</a>

The weakness here is that it needs to be on some sort of link, but if you have that there's a long discussion of alternatives here . If you don't have a link, then just use a class attribute, that's what it's for:

<span class="author">John</span>

robertc ,Sep 3, 2011 at 1:21

@Quang Yes, I think a link type without an actual link would defeat the purpose of trying to mark it up semantically. – robertc Sep 3 '11 at 1:21

Paul D. Waite ,Sep 5, 2011 at 7:19

@Quang: the rel attribute is there to describe what the link's destination is. If the link has no destination, rel is meaningless. – Paul D. Waite Sep 5 '11 at 7:19

Michael Mior ,Jan 19, 2013 at 14:36

You might also want to look at schema.org for ways of expressing this type of information. – Michael Mior Jan 19 '13 at 14:36

robertc ,Jan 1, 2014 at 15:09

@jdh8 Because John is not the title of a work – robertc Jan 1 '14 at 15:09

Dan Dascalescu ,Jun 13, 2016 at 0:19

This answer just isn't the best any longer. Google no longer supports rel="author" , and as ryanve and Jason mention, the address tag was explicitly design for expressing authorship as well. – Dan Dascalescu Jun 13 '16 at 0:19

ryanve ,Sep 3, 2011 at 18:28

Both rel="author" and <address> are designed for this exact purpose. Both are supported in HTML5. The spec tells us that rel="author" can be used on <link> <a> , and <area> elements. Google also recommends its usage . Combining use of <address> and rel="author" seems optimal. HTML5 best affords wrapping <article> headlines and bylines info in a <header> like so:
<article>
    <header>
        <h1 class="headline">Headline</h1>
        <div class="byline">
            <address class="author">By <a rel="author" href="/author/john-doe">John Doe</a></address> 
            on <time pubdate datetime="2011-08-28" title="August 28th, 2011">8/28/11</time>
        </div>
    </header>

    <div class="article-content">
    ...
    </div>
</article>

If you want to add the hcard microformat , then I would do so like this:

<article>
    <header>
        <h1 class="headline">Headline</h1>
        <div class="byline vcard">
            <address class="author">By <a rel="author" class="url fn n" href="/author/john-doe">John Doe</a></address> 
            on <time pubdate datetime="2011-08-28" title="August 28th, 2011">8/28/11</time>
        </div>
    </header>

    <div class="article-content">
    ...
    </div>
</article>

aridlehoover ,Jun 24, 2013 at 18:12

Shouldn't "By " precede the <address> tag? It's not actually a part of the address. – aridlehoover Jun 24 '13 at 18:12

ryanve ,Jun 24, 2013 at 20:40

@aridlehoover Either seems correct according to whatwg.org/specs/web-apps/current-work/multipage/ - If outside, use .byline address { display:inline; font-style:inherit } to override the block default in browsers. – ryanve Jun 24 '13 at 20:40

ryanve ,Jun 24, 2013 at 20:46

@aridlehoover I also think that <dl> is viable. See the byline markup in the source of demo.actiontheme.com/sample-page for example. – ryanve Jun 24 '13 at 20:46

Paul Kozlovitch ,Apr 16, 2015 at 11:36

Since the pubdate attribute is gone from both the WHATWG and W3C specs, as Bruce Lawson writes here , I suggest you to remove it from your answer. – Paul Kozlovitch Apr 16 '15 at 11:36

Nathan Hornby ,May 28, 2015 at 12:41

This should really be the accepted answer. – Nathan Hornby May 28 '15 at 12:41

Jason Gennaro ,Sep 3, 2011 at 2:18

According to the HTML5 spec, you probably want address .

The address element represents the contact information for its nearest article or body element ancestor.

The spec further references address in respect to authors here

Under 4.4.4

Author information associated with an article element (q.v. the address element) does not apply to nested article elements.

Under 4.4.9

Contact information for the author or editor of a section belongs in an address element, possibly itself inside a footer.

All of which makes it seem that address is the best tag for this info.

That said, you could also give your address a rel or class of author .

<address class="author">Jason Gennaro</address>

Read more: http://dev.w3.org/html5/spec/sections.html#the-address-element

Quang Van ,Feb 29, 2012 at 9:24

Thanks Jason, do you know what "q.v." means? Under 4.4.4: "Author information associated with an article element (q.v. the address element) does not apply to nested article elements." – Quang Van Feb 29 '12 at 9:24

pageman ,Feb 10, 2014 at 5:44

@QuangVan - (wait, your initials are ... q.v. hmm) - q.v. means "quod vide" or "on this (matter) go see" - so on the matter of "q.v." go see english.stackexchange.com/questions/25252/ (q.v.) haha – pageman Feb 10 '14 at 5:44

Jason Gennaro ,Feb 10, 2014 at 13:19

@pageman well done rocking the Latin! – Jason Gennaro Feb 10 '14 at 13:19

pageman ,Feb 11, 2014 at 1:10

@JasonGennaro haha nanos gigantum humeris insidentes! – pageman Feb 11 '14 at 1:10

Jason Gennaro ,Feb 11, 2014 at 13:16

@pageman aren't we all. – Jason Gennaro Feb 11 '14 at 13:16

remyActual ,Sep 7, 2015 at 0:49

Google support for rel="author" is deprecated :

"Authorship markup is no longer supported in web search."

Use a Description List (Definition List in HTML 4.01) element.

From the HTML5 spec :

The dl element represents an association list consisting of zero or more name-value groups (a description list). A name-value group consists of one or more names (dt elements) followed by one or more values (dd elements), ignoring any nodes other than dt and dd elements. Within a single dl element, there should not be more than one dt element for each name.

Name-value groups may be terms and definitions, metadata topics and values, questions and answers, or any other groups of name-value data.

Authorship and other article meta information fits perfectly into this key:value pair structure:

An opinionated example:

<article>
  <header>
    <h1>Article Title</h1>
    <p class="subtitle">Subtitle</p>
    <dl class="dateline">
      <dt>Author:</dt>
      <dd>Remy Schrader</dd>
      <dt>All posts by author:</dt>
      <dd><a href="http://www.blog.net/authors/remy-schrader/">Link</a></dd>
      <dt>Contact:</dt>
      <dd><a href="mailto:[email protected]"><img src="email-sprite.png"></a></dd>
    </dl>
  </header>
  <section class="content">
    <!-- article content goes here -->
  </section>
</article>

As you can see when using the <dl> element for article meta information, we are free to wrap <address> , <a> and even <img> tags in <dt> and/or <dd> tags according to the nature of the content and its intended function.
The <dl> , <dt> and <dd> tags are free to do their job -- semantically -- conveying information about the parent <article> ; <a> , <img> and <address> are similarly free to do their job -- again, semantically -- conveying information regarding where to find related content, non-verbal visual presentation, and contact details for authoritative parties, respectively.

steveax ,Sep 3, 2011 at 1:39

How about microdata :
<article>
<h1>header</h1>
<time>09-02-2011</time>
<div id="john" itemscope itemtype="http://microformats.org/profile/hcard">
 <h2 itemprop="fn">
  <span itemprop="n" itemscope>
   <span itemprop="given-name">John</span>
  </span>
 </h2>
</div>
My article....
</article>

Raphael ,Feb 26, 2016 at 11:29

You can use
<meta name="author" content="John Doe">

in the header as per the HTML5 specification .


If you were including contact details for the author, then the <address> tag is appropriate.

But if it's literally just the author's name, there isn't a specific tag for that. HTML doesn't include much related to people.

[Aug 25, 2018] What are the various link types in Windows? How do I create them?

Notable quotes:
"... SeCreateSymbolicLinkPrivilege ..."
"... shell links ..."
"... directory junctions ..."
"... indirectly accessed ..."
"... junction point ..."
"... on the local drives ..."
"... will be gone ..."
"... volume mount point ..."
"... Why should you even be viewing anything but your personal files on a daily basis? ..."
"... and that folder ..."
"... one-to-many relationship ..."
"... My brain explodes... ..."
Aug 25, 2018 | superuser.com

Cookie ,Oct 18, 2011 at 17:43

Is it possible to link two files or folders without having a different extension under Windows?

I'm looking for functionality equivalent to the soft and hard links in Unix.

barlop ,Jun 11, 2016 at 17:30

this is related: superuser.com/questions/343074/ – barlop Jun 11 '16 at 17:30

barlop ,Jun 11, 2016 at 21:07

great article here: cects.com/ – watch out for junctions pre-W7 – barlop Jun 11 '16 at 21:07

grawity ,Oct 18, 2011 at 18:00

Please note that the only unfortunate difference is that you need Administrator rights to create symbolic links, i.e. you need an elevated prompt. (A workaround: the SeCreateSymbolicLinkPrivilege privilege can be granted to normal Users via secpol.msc .)

Note on terminology: Windows shortcuts are not called "symlinks"; they are shell links , as they are simply files that the Windows Explorer shell treats specially.


Symlinks: How do I create them on NTFS file system?

Windows Vista and later versions support Unix-style symlinks on NTFS filesystems. Remember that they also follow the same path resolution – relative links are created relative to the link's location, not to the current directory. People often forget that. They can also be created with an absolute path, e.g. c:\windows\system32 instead of \system32 (which resolves to a system32 directory relative to the link's location).
Symlinks are implemented using reparse points and generally have the same behavior as Unix symlinks.

For files you can execute:

mklink linkname targetpath

For directories you can execute:

mklink /d linkname targetpath

Hardlinks: How do I create them on NTFS file systems?

All versions of Windows NT support Unix-style hard links on NTFS filesystems. Using mklink on Vista and up:

mklink /h linkname targetpath

For Windows 2000 and XP, use fsutil .

fsutil hardlink create linkname targetpath

These also work the same way as Unix hard links – multiple file table entries point to the same inode .
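A minimal cmd.exe sketch of that behavior (file names are hypothetical):

rem Create a file, then a second directory entry for the same data:
echo data > original.txt
mklink /h copy.txt original.txt
rem Deleting one name leaves the data reachable through the other:
del original.txt
type copy.txt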


Directory Junctions: How do I create them on NTFS file systems?

Windows 2000 and later support directory junctions on NTFS filesystems. They are different from symlinks in that they are always absolute and only point to directories, never to files.

mklink /j linkname targetpath

On versions which do not have mklink , download junction from Sysinternals:

junction linkname targetpath

Junctions are implemented using reparse points .


How can I mount a volume using a reparse point in Windows?

For completeness, on Windows 2000 and later , reparse points can also point to volumes , resulting in persistent Unix-style disk mounts :

mountvol mountpoint \\?\Volume{volumeguid}

Volume GUIDs are listed by mountvol ; they are static but only within the same machine.
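A hedged sketch (the volume GUID is a placeholder; run mountvol with no arguments to list the real ones):

rem Mount a volume at an empty NTFS directory instead of a drive letter:
mkdir C:\mnt\data
mountvol C:\mnt\data \\?\Volume{xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx}\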


Is there a way to do this in Windows Explorer?

Yes, you can use the shell extension Link Shell Extension which makes it very easy to make the links that have been described above. You can find the downloads at the bottom of the page .

The NTFS file system implemented in NT4, Windows 2000, Windows XP, Windows XP64, and Windows 7 supports a facility known as hard links (referred to herein as Hardlinks). Hardlinks provide the ability to keep a single copy of a file yet have it appear in multiple folders (directories). They can be created with the POSIX command ln included in the Windows Resource Kit, the fsutil command utility included in Windows XP, or my command line ln.exe utility.

The extension allows the user to select one or many files or folders, then, using the mouse, complete the creation of the required Links - Hardlinks, Junctions or Symbolic Links, or in the case of folders to create Clones consisting of Hard or Symbolic Links. LSE is supported on all Windows versions that support NTFS version 5.0 or later, including Windows XP64 and Windows 7. Hardlinks, Junctions and Symbolic Links are NOT supported on FAT file systems, nor is the Cloning and Smart Copy process supported on FAT file systems.

The source can simply be picked using a right-click menu.

And depending on what you picked, you right-click on a destination folder and get a menu with options.

This makes it very easy to create links. For an extensive guide, read the LSE documentation .

Downloads can be found at the bottom of their page .


Tom Wijsman ,Dec 31, 2011 at 3:42

In this answer I will attempt to outline what the different types of links in directory management are, why they are useful, and when they could be used. When trying to achieve a certain organization of your file volumes, knowing the different types as well as how to create them is valuable knowledge.

For information on how a certain link can be made, refer to grawity 's answer .

What is a link?

A link is a relationship between two entities; in the context of directory management, a link can be seen as a relationship between the following two entities:

  1. Directory Table

    This table keeps track of the files and folders that reside in a specific folder.

    A directory table is a special type of file that represents a directory (also known as a folder). Each file or directory stored within it is represented by a 32-byte entry in the table. Each entry records the name, extension, attributes (archive, directory, hidden, read-only, system and volume), the date and time of last modification, the address of the first cluster of the file/directory's data and finally the size of the file/directory.

  2. Data Cluster

    More specifically, the first cluster of the file or directory.

    A cluster is the smallest logical amount of disk space that can be allocated to hold a file.

The special thing about this relationship is that it allows one to have only one data cluster but many links to that data cluster, which allows us to show data as being present in multiple locations. However, there are multiple ways to do this, and each method has its own effects.

To see where this comes from, let's go back to the past...

What is a shell link and why is it not always sufficient?

Although it might not sound familiar, we all know this one! File shortcuts are undoubtedly the most frequently used way of linking files. These were found in some of the early versions of Windows 9x and have been there for a long while.

These allow you to quickly create a shortcut to any file or folder; they are specifically made to store extra information alongside the link itself, for example the working directory the file is executed in, the arguments to provide to the program, and options like whether to maximize the program.

The downside to this approach of linking is exactly that: the extra information requires this type of link to have a data cluster of its own to contain it. The problem is not necessarily that it takes disk space, but rather that the link is accessed indirectly, as the data cluster first has to be requested before we get to the actual link. If the path referred to in the actual link is gone, the shell link will still exist.

If you were to operate on the file being referred to, you would actually first have to figure out in which directory the file is. You can't simply open the link in an editor as you would then be editing the .lnk file rather than the file being linked to. This locks out a lot of possible use cases for shell links.

How does a junction point link try to solve these problems?

A NTFS junction point allows one to create a symbolic link to a directory on the local drives , in such a way that it behaves just like a normal directory. So, you have one directory of files stored on your disk but can access it from multiple locations.

When removing the junction point, the original directory remains. When removing the original directory, the junction point remains. It is very costly to enumerate the disk to check for junction points that have to be deleted. This is a downside as a result of its implementation.

The NTFS junction point is implemented using NTFS reparse points , which are NTFS file system objects introduced with Windows 2000.

An NTFS reparse point is a type of NTFS file system object. Reparse points provide a way to extend the NTFS filesystem by adding extra information to the directory entry, so a file system filter can interpret how the operating system will treat the data. This allows the creation of junction points, NTFS symbolic links and volume mount points, and is a key feature to Windows 2000's Hierarchical Storage System.

That's right, the invention of the reparse point allows us to do more sophisticated ways of linking.

The NTFS junction point is a soft link , which means that it just links to the name of the file. This means that whenever the link is deleted, the original data stays intact ; but whenever the original data is deleted, that data will be gone (the junction point remains, pointing at nothing).

Can I also soft link files? Are there symbolic links?

Yes, when Windows Vista came around they decided to extend the functionality of the NTFS file system object(s) by providing the NTFS symbolic link , which is a soft link that acts in the same way as the NTFS junction point but can be applied to both files and directories.

They again share the same deletion behavior; in some use cases this can be a pain for files, as you don't want a useless dangling link to a file hanging around. This is why the notion of hard links was also implemented.

What is a hard link and how does it behave as opposed to soft links?

Hard links are not NTFS file system objects; they are instead a link to a file (in detail, they refer to the MFT entry, as that stores extra information about the actual file). The MFT entry has a field that remembers the number of times a file is hard-linked. The data will still be accessible as long as at least one link that points to it still exists.

So, the data no longer depends on a single link to exist . As long as there is a hard link around, the data will survive. This prevents accidental deletion in cases where one does not want to remember where the original file was.

You could for example make a folder "movies I still have to watch", a folder "movies that I take on vacation", and a folder "favorite movies". Movies that are in none of these will be properly deleted, while movies that are in any of them will continue to exist even when you have watched them.

What is a volume mount point link for?

Some IT or business people might dislike having to remember or type the different drive letters their system has. What does M: really mean anyway? Was it Music? Movies? Models? Maps?

Microsoft has made efforts over the years to try to migrate users away from working in drive C: toward working in your user folder . I could undoubtedly say that the users with UAC and permission problems are those that do not follow these guidelines, but doesn't that make them wonder:

Why should you even be viewing anything but your personal files on a daily basis?

Volume mount points are the professional IT way of not being limited by drive letters as well as having a directory structure that makes sense for them, but...

My files are in different places, can I use links to get them together?

In Windows 7, Libraries were introduced exactly for this purpose. Done with music files that are located in this folder, and that folder, and that folder . From a lower level of view, a library can be viewed as multiple links. They are again implemented as a file system object that can contain multiple references. It is in essence a one-to-many relationship ...

My brain explodes... Can you summarize when to use them?

Medinoc ,Sep 29, 2014 at 16:28

Libraries are shell-level like shortcut links, right? – Medinoc Sep 29 '14 at 16:28

Tom Wijsman ,Sep 29, 2014 at 21:53

@Medinoc: No, they aggregate the content of multiple locations. – Tom Wijsman Sep 29 '14 at 21:53

Medinoc ,Sep 30, 2014 at 20:52

But do they do so at filesystem level in such way that, say, cmd.exe and dir can list the aggregated content (in which case, where in the file system are they, I can't find it), or do they only aggregate at shell level, where only Windows Explorer and file dialogs can show them? I was under the impression it was the latter, but your "No" challenges this unless I wrote my question wrong (I meant to say "Libraries are shell-level like shortcut links are , right?" ). – Medinoc Sep 30 '14 at 20:52

Tom Wijsman ,Oct 1, 2014 at 9:32

@Medinoc: They are files at C:\Users\{User}\AppData\Roaming\Microsoft\Windows\Libraries . – Tom Wijsman Oct 1 '14 at 9:32

Tom Wijsman ,Apr 25, 2015 at 10:23

@Pacerier: Windows uses the old location system, where you can for example move a music folder around from its properties. Libraries are a new addition, which the OS itself barely uses as a result. Therefore I doubt if anything would break; as they are intended solely for display purposes, ... – Tom Wijsman Apr 25 '15 at 10:23

GeminiDomino ,Oct 18, 2011 at 17:49

If you're on Windows Vista or later, and have admin rights, you might check out the mklink command (it's a command line tool). I'm not sure how symlink-y it actually is, since Windows gives it the little arrow icon it puts on shortcuts, but a quick Notepad++ test on a text file suggests it might work for what you're looking for.

You can run mklink with no arguments for a quick usage guide.

I hope that helps.

jcrawfordor ,Oct 18, 2011 at 17:58

mklink uses NTFS junction points (I believe that's what they're called) to more or less perfectly duplicate Unix-style linking. Windows can tell that it's a junction, though, so it'll give it the traditional arrow icon. iirc you can remove this with some registry fiddling, but I don't remember where. – jcrawfordor Oct 18 '11 at 17:58

grawity ,Oct 18, 2011 at 18:01

@jcrawfordor: The disk structures are "reparse points" . Junctions and symlinks are two different types of reparse points; volume mountpoints are third. – grawity Oct 18 '11 at 18:01

grawity ,Oct 18, 2011 at 18:08

And yes, @Gemini, mklink -made symlinks were specifically implemented to work just like Unix ones . – grawity Oct 18 '11 at 18:08

GeminiDomino ,Oct 18, 2011 at 18:13

Thanks grawity, for the confirmation. I've never played around with them much, so I just wanted to include disclaim.h ;) – GeminiDomino Oct 18 '11 at 18:13

barlop ,Jun 11, 2016 at 17:32

this article has some distinctions

one important distinction is that, in a sense, junctions pre-Win7 were a bit unsafe, in that deleting them with the wrong tool would delete the target directory.

http://cects.com/overview-to-understanding-hard-links-junction-points-and-symbolic-links-in-windows/

A Junction Point should never be removed in Win2k, Win2003 and WinXP with Explorer, the del or del /s commands, or with any utility that recursively walks directories since these will delete the target directory and all its subdirectories. Instead, use the rmdir command, the linkd utility, or fsutil (if using WinXP or above) or a third party tool to remove the junction point without affecting the target. In Vista/Win7, it's safe to delete Junction Points with Explorer or with the rmdir and del commands.

[Aug 07, 2018] May I sort the -etc-group and -etc-passwd files

Aug 07, 2018 | unix.stackexchange.com

Ned64 ,Feb 18 at 13:52

My /etc/group has grown by adding new users as well as installing programs that have added their own user and/or group. The same is true for /etc/passwd . Editing has now become a little cumbersome due to the lack of structure.

May I sort these files (e.g. by numerical id or alphabetical by name) without negative effect on the system and/or package managers?

I would guess that it does not matter, but just to be sure I would like to get a second opinion. Maybe root needs to be the 1st line or within the first 1k lines or something?

The same goes for /etc/*shadow .

Kevin ,Feb 19 at 23:50

"Editing has now become a little cumbersome due to the lack of structure" Why are you editing those files by hand? – Kevin Feb 19 at 23:50

Barmar ,Feb 21 at 20:51

How does sorting the file help with editing? Is it because you want to group related accounts together, and then do similar changes in a range of rows? But will related accounts be adjacent if you sort by uid or name? – Barmar Feb 21 at 20:51

Ned64 ,Mar 13 at 23:15

@Barmar It has helped mainly because user accounts are grouped by ranges and separate from system accounts (when sorting by UID). Therefore it is easier e.g. to spot the correct line to examine or change when editing with vi . – Ned64 Mar 13 at 23:15

ErikF ,Feb 18 at 14:12

You should be OK doing this : in fact, according to the article and reading the documentation, you can sort /etc/passwd and /etc/group by UID/GID with pwck -s and grpck -s , respectively.
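For example, a minimal sketch (per the pwck and grpck manpages, -s sorts by id; both tools lock the account files while rewriting them):

sudo pwck -s     # sorts /etc/passwd and /etc/shadow by UID
sudo grpck -s    # sorts /etc/group and /etc/gshadow by GID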

hvd ,Feb 18 at 22:59

@Menasheh This site's colours don't make them stand out as much as on other sites, but "OK doing this" in this answer is a hyperlink. – hvd Feb 18 at 22:59

mickeyf ,Feb 19 at 14:05

OK, fine, but... In general, are there valid reasons to manually edit /etc/passwd and similar files? Isn't it considered better to access these via the tools that are designed to create and modify them? – mickeyf Feb 19 at 14:05

ErikF ,Feb 20 at 21:21

@mickeyf I've seen people manually edit /etc/passwd when they're making batch changes, like changing the GECOS field for all users due to moving/restructuring (global room or phone number changes, etc.) It's not common anymore, but there are specific reasons that crop up from time to time. – ErikF Feb 20 at 21:21

hvd ,Feb 18 at 17:28

Although ErikF is correct that this should generally be okay, I do want to point out one potential issue:

You're allowed to map different usernames to the same UID. If you make use of this, tools that map a UID back to a username will generally pick the first username they find for that UID in /etc/passwd . Sorting may cause a different username to appear first. For display purposes (e.g. ls -l output), either username should work, but it's possible that you've configured some program to accept requests from username A, where it will deny those requests if it sees them coming from username B, even if A and B are the same user.
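As an illustration, two hypothetical /etc/passwd lines mapping different names to UID 0:

root:x:0:0:superuser:/root:/bin/bash
toor:x:0:0:alternate superuser:/root:/bin/sh

Whichever line comes first is the name most tools will report for UID 0; a sort that changes which one comes first therefore changes the reported name.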

Rui F Ribeiro ,Feb 19 at 17:53

Having root on the first line has been a long-time de facto "standard" and is very convenient if you ever have to fix its shell or delete its password when dealing with problems or recovering systems.

Likewise I prefer to have daemons/utils users in the middle and standard users at the end of both passwd and shadow .

hvd's answer is also very good about disturbing the users' order, especially on systems with many users maintained by hand.

If you do manage to sort the files, sorting only the standard users, for instance, would be more sensible than changing the order of all users, imo.

Barmar ,Feb 21 at 20:13

If you sort numerically by UID, you should get your preferred order. Root is always 0 , and daemons conventionally have UIDs under 100. – Barmar Feb 21 at 20:13

Rui F Ribeiro ,Feb 21 at 20:16

@Barmar If sorting by UID and not by name, indeed, thanks for remembering. – Rui F Ribeiro Feb 21 at 20:16

[Jul 30, 2018] Non-root user getting root access after running sudo vi -etc-hosts

Notable quotes:
"... as the original user ..."
Jul 30, 2018 | unix.stackexchange.com

Gilles, Mar 10, 2018 at 10:24

If sudo vi /etc/hosts is successful, it means that the system administrator has allowed the user to run vi /etc/hosts as root. That's the whole point of sudo: it lets the system administrator authorize certain users to run certain commands with extra privileges.

Giving a user the permission to run vi gives them the permission to run any vi command, including :sh to run a shell and :w to overwrite any file on the system. A rule allowing only to run vi /etc/hosts does not make any sense since it allows the user to run arbitrary commands.

There is no "hacking" involved. The breach of security comes from a misconfiguration, not from a hole in the security model. Sudo does not particularly try to prevent against misconfiguration. Its documentation is well-known to be difficult to understand; if in doubt, ask around and don't try to do things that are too complicated.

It is in general a hard problem to give a user a specific privilege without giving them more than intended. A bulldozer approach like giving them the right to run an interactive program such as vi is bound to fail. A general piece of advice is to give the minimum privileges necessary to accomplish the task. If you want to allow a user to modify one file, don't give them the permission to run an editor. Instead, either allow them to run sudoedit for that file, which edits a temporary copy with an unprivileged editor and then installs it, or give write permission on that one file to a dedicated group.

Note that allowing a user to edit /etc/hosts may have an impact on your security infrastructure: if there's any place where you rely on a host name corresponding to a specific machine, then that user will be able to point it to a different machine. Consider that it is probably unnecessary anyway .
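For the sudoedit route mentioned above, a hedged sudoers sketch (the user name is hypothetical; edit with visudo): sudoedit copies the file to a temporary location, runs the user's editor without privileges, and copies the result back.

# Allow alice to edit only /etc/hosts, via an unprivileged editor:
alice ALL = sudoedit /etc/hosts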

[Jul 05, 2018] Can rsync resume after being interrupted

Notable quotes:
"... as if it were successfully transferred ..."
Jul 05, 2018 | unix.stackexchange.com

Tim ,Sep 15, 2012 at 23:36

I used rsync to copy a large number of files, but my OS (Ubuntu) restarted unexpectedly.

After reboot, I ran rsync again, but from the output on the terminal, I found that rsync still copied those already copied before. But I heard that rsync is able to find differences between source and destination, and therefore to just copy the differences. So I wonder in my case if rsync can resume what was left last time?

Gilles ,Sep 16, 2012 at 1:56

Yes, rsync won't copy again files that it's already copied. There are a few edge cases where its detection can fail. Did it copy all the already-copied files? What options did you use? What were the source and target filesystems? If you run rsync again after it's copied everything, does it copy again? – Gilles Sep 16 '12 at 1:56

Tim ,Sep 16, 2012 at 2:30

@Gilles: Thanks! (1) I think I saw rsync copy the same files again from its output on the terminal. (2) Options are the same as in my other post, i.e. sudo rsync -azvv /home/path/folder1/ /home/path/folder2 . (3) Source and target are both NTFS, but source is an external HDD, and target is an internal HDD. (4) It is now running and hasn't finished yet. – Tim Sep 16 '12 at 2:30

jwbensley ,Sep 16, 2012 at 16:15

There is also the --partial flag to resume partially transferred files (useful for large files) – jwbensley Sep 16 '12 at 16:15

Tim ,Sep 19, 2012 at 5:20

@Gilles: What are some "edge cases where its detection can fail"? – Tim Sep 19 '12 at 5:20

Gilles ,Sep 19, 2012 at 9:25

@Tim Off the top of my head, there's at least clock skew, and differences in time resolution (a common issue with FAT filesystems which store times in 2-second increments, the --modify-window option helps with that). – Gilles Sep 19 '12 at 9:25

DanielSmedegaardBuus ,Nov 1, 2014 at 12:32

First of all, regarding the "resume" part of your question, --partial just tells the receiving end to keep partially transferred files if the sending end disappears as though they were completely transferred.

While transferring files, they are temporarily saved as hidden files in their target folders (e.g. .TheFileYouAreSending.lRWzDC ), or a specifically chosen folder if you set the --partial-dir switch. When a transfer fails and --partial is not set, this hidden file will remain in the target folder under this cryptic name, but if --partial is set, the file will be renamed to the actual target file name (in this case, TheFileYouAreSending ), even though the file isn't complete. The point is that you can later complete the transfer by running rsync again with either --append or --append-verify .

So, --partial doesn't itself resume a failed or cancelled transfer. To resume it, you'll have to use one of the aforementioned flags on the next run. So, if you need to make sure that the target won't ever contain files that appear to be fine but are actually incomplete, you shouldn't use --partial . Conversely, if you want to make sure you never leave behind stray failed files that are hidden in the target directory, and you know you'll be able to complete the transfer later, --partial is there to help you.

With regards to the --append switch mentioned above, this is the actual "resume" switch, and you can use it whether or not you're also using --partial . Actually, when you're using --append , no temporary files are ever created. Files are written directly to their targets. In this respect, --append gives the same result as --partial on a failed transfer, but without creating those hidden temporary files.

So, to sum up, if you're moving large files and you want the option to resume a cancelled or failed rsync operation from the exact point that rsync stopped, you need to use the --append or --append-verify switch on the next attempt.

As @Alex points out below, since version 3.0.0 rsync now has a new option, --append-verify , which behaves like --append did before that switch existed. You probably always want the behaviour of --append-verify , so check your version with rsync --version . If you're on a Mac and not using rsync from homebrew , you'll (at least up to and including El Capitan) have an older version and need to use --append rather than --append-verify . Why they didn't keep the behaviour on --append and instead named the newcomer --append-no-verify is a bit puzzling. Either way, --append on rsync before version 3 is the same as --append-verify on the newer versions.

--append-verify isn't dangerous: It will always read and compare the data on both ends and not just assume they're equal. It does this using checksums, so it's easy on the network, but it does require reading the shared amount of data on both ends of the wire before it can actually resume the transfer by appending to the target.
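Putting this together, a small sketch of the resume workflow described above (paths are placeholders):

rsync -av --partial /src/ /dst/           # interrupted run keeps the partial file
rsync -av --append-verify /src/ /dst/     # next run verifies and appends
                                          # (use --append on rsync older than 3.0.0)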

Second of all, you said that you "heard that rsync is able to find differences between source and destination, and therefore to just copy the differences."

That's correct, and it's called delta transfer, but it's a different thing. To enable this, you add the -c , or --checksum switch. Once this switch is used, rsync will examine files that exist on both ends of the wire. It does this in chunks, compares the checksums on both ends, and if they differ, it transfers just the differing parts of the file. But, as @Jonathan points out below, the comparison is only done when files are of the same size on both ends -- different sizes will cause rsync to upload the entire file, overwriting the target with the same name.

This requires a bit of computation on both ends initially, but can be extremely efficient at reducing network load if, for example, you're frequently backing up very large fixed-size files that often contain minor changes. Examples that come to mind are virtual hard drive image files used in virtual machines or iSCSI targets.

It is notable that if you use --checksum to transfer a batch of files that are completely new to the target system, rsync will still calculate their checksums on the source system before transferring them. Why I do not know :)

So, in short:

If you're often using rsync to just "move stuff from A to B" and want the option to cancel that operation and later resume it, don't use --checksum , but do use --append-verify .

If you're using rsync to back up stuff often, using --append-verify probably won't do much for you, unless you're in the habit of sending large files that continuously grow in size but are rarely modified once written. As a bonus tip, if you're backing up to storage that supports snapshotting such as btrfs or zfs , adding the --inplace switch will help you reduce snapshot sizes since changed files aren't recreated but rather the changed blocks are written directly over the old ones. This switch is also useful if you want to avoid rsync creating copies of files on the target when only minor changes have occurred.
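A hedged example of that backup pattern (paths are placeholders):

# On a snapshotting target (btrfs/zfs), write changed blocks over the old
# ones instead of recreating files, keeping snapshot deltas small:
rsync -av --inplace /data/ /mnt/backup/data/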

When using --append-verify , rsync will behave just like it always does on all files that are the same size. If they differ in modification or other timestamps, it will overwrite the target with the source without scrutinizing those files further. --checksum will compare the contents (checksums) of every file pair of identical name and size.

UPDATED 2015-09-01 Changed to reflect points made by @Alex (thanks!)

UPDATED 2017-07-14 Changed to reflect points made by @Jonathan (thanks!)

Alex ,Aug 28, 2015 at 3:49

According to the documentation --append does not check the data, but --append-verify does. Also, as @gaoithe points out in a comment below, the documentation claims --partial does resume from previous files. – Alex Aug 28 '15 at 3:49

DanielSmedegaardBuus ,Sep 1, 2015 at 13:29

Thank you @Alex for the updates. Indeed, since 3.0.0, --append no longer compares the source to the target file before appending. Quite important, really! --partial does not itself resume a failed file transfer, but rather leaves it there for a subsequent --append(-verify) to append to it. My answer was clearly misrepresenting this fact; I'll update it to include these points! Thanks a lot :) – DanielSmedegaardBuus Sep 1 '15 at 13:29

Cees Timmerman ,Sep 15, 2015 at 17:21

This says --partial is enough. – Cees Timmerman Sep 15 '15 at 17:21

DanielSmedegaardBuus ,May 10, 2016 at 19:31

@CMCDragonkai Actually, check out Alexander's answer below about --partial-dir -- looks like it's the perfect bullet for this. I may have missed something entirely ;) – DanielSmedegaardBuus May 10 '16 at 19:31

Jonathan Y. ,Jun 14, 2017 at 5:48

What's your level of confidence in the described behavior of --checksum ? According to the man it has more to do with deciding which files to flag for transfer than with delta-transfer (which, presumably, is rsync 's default behavior). – Jonathan Y. Jun 14 '17 at 5:48

Alexander O'Mara ,Jan 3, 2016 at 6:34

TL;DR:

Just specify a partial directory as the rsync man pages recommends:

--partial-dir=.rsync-partial

Longer explanation:

There is actually a built-in feature for doing this using the --partial-dir option, which has several advantages over the --partial and --append-verify / --append alternative.

Excerpt from the rsync man pages:
--partial-dir=DIR
      A  better way to keep partial files than the --partial option is
      to specify a DIR that will be used  to  hold  the  partial  data
      (instead  of  writing  it  out to the destination file).  On the
      next transfer, rsync will use a file found in this dir  as  data
      to  speed  up  the resumption of the transfer and then delete it
      after it has served its purpose.

      Note that if --whole-file is specified (or  implied),  any  par-
      tial-dir  file  that  is  found for a file that is being updated
      will simply be removed (since rsync  is  sending  files  without
      using rsync's delta-transfer algorithm).

      Rsync will create the DIR if it is missing (just the last dir --
      not the whole path).  This makes it easy to use a relative  path
      (such  as  "--partial-dir=.rsync-partial")  to have rsync create
      the partial-directory in the destination file's  directory  when
      needed,  and  then  remove  it  again  when  the partial file is
      deleted.

      If the partial-dir value is not an absolute path, rsync will add
      an  exclude rule at the end of all your existing excludes.  This
      will prevent the sending of any partial-dir files that may exist
      on the sending side, and will also prevent the untimely deletion
      of partial-dir items on the receiving  side.   An  example:  the
      above  --partial-dir  option would add the equivalent of "-f '-p
      .rsync-partial/'" at the end of any other filter rules.

By default, rsync uses a random temporary file name which gets deleted when a transfer fails. As mentioned, using --partial you can make rsync keep the incomplete file as if it were successfully transferred , so that it is possible to later append to it using the --append-verify / --append options. However there are several reasons this is sub-optimal.

  1. Your backup files may not be complete, and without checking the remote file which must still be unaltered, there's no way to know.
  2. If you are attempting to use --backup and --backup-dir , you've just added a new version of this file that never even existed before to your version history.

However if we use --partial-dir , rsync will preserve the temporary partial file, and resume downloading using that partial file next time you run it, and we do not suffer from the above issues.
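In practice the full command would look something like this (paths are placeholders):

# Keep resumable partial data in .rsync-partial inside each destination dir:
rsync -av --partial-dir=.rsync-partial /src/ /dst/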

trs ,Apr 7, 2017 at 0:00

This is really the answer. Hey everyone, LOOK HERE!! – trs Apr 7 '17 at 0:00

JKOlaf ,Jun 28, 2017 at 0:11

I agree this is a much more concise answer to the question. the TL;DR: is perfect and for those that need more can read the longer bit. Strong work. – JKOlaf Jun 28 '17 at 0:11

N2O ,Jul 29, 2014 at 18:24

You may want to add the -P option to your command.

From the man page:

--partial By default, rsync will delete any partially transferred file if the transfer
         is interrupted. In some circumstances it is more desirable to keep partially
         transferred files. Using the --partial option tells rsync to keep the partial
         file which should make a subsequent transfer of the rest of the file much faster.

  -P     The -P option is equivalent to --partial --progress.   Its  pur-
         pose  is to make it much easier to specify these two options for
         a long transfer that may be interrupted.

So instead of:

sudo rsync -azvv /home/path/folder1/ /home/path/folder2

Do:

sudo rsync -azvvP /home/path/folder1/ /home/path/folder2

Of course, if you don't want the progress updates, you can just use --partial , i.e.:

sudo rsync --partial -azvv /home/path/folder1/ /home/path/folder2

gaoithe ,Aug 19, 2015 at 11:29

@Flimm not quite correct. If there is an interruption (network or receiving side) then when using --partial the partial file is kept AND it is used when rsync is resumed. From the manpage: "Using the --partial option tells rsync to keep the partial file which should make a subsequent transfer of the rest of the file much faster." – gaoithe Aug 19 '15 at 11:29

DanielSmedegaardBuus ,Sep 1, 2015 at 14:11

@Flimm and @gaoithe, my answer wasn't quite accurate, and definitely not up-to-date. I've updated it to reflect version 3 + of rsync . It's important to stress, though, that --partial does not itself resume a failed transfer. See my answer for details :) – DanielSmedegaardBuus Sep 1 '15 at 14:11

guettli ,Nov 18, 2015 at 12:28

@DanielSmedegaardBuus I tried it and the -P is enough in my case. Versions: client has 3.1.0 and server has 3.1.1. I interrupted the transfer of a single large file with ctrl-c. I guess I am missing something. – guettli Nov 18 '15 at 12:28

Yadunandana ,Sep 16, 2012 at 16:07

I think you are forcibly calling the rsync and hence all data is getting downloaded when you recall it again. use --progress option to copy only those files which are not copied and --delete option to delete any files if already copied and now it does not exist in source folder...
rsync -avz --progress --delete /home/path/folder1/ /home/path/folder2

If you are using ssh to login to other system and copy the files,

rsync -avz --progress --delete -e "ssh -o UserKnownHostsFile=/dev/null -o \
StrictHostKeyChecking=no" /home/path/folder1/ /home/path/folder2

let me know if there is any mistake in my understanding of this concept...

Fabien ,Jun 14, 2013 at 12:12

Can you please edit your answer and explain what your special ssh call does, and why you advice to do it? – Fabien Jun 14 '13 at 12:12

DanielSmedegaardBuus ,Dec 7, 2014 at 0:12

@Fabien He tells rsync to set two ssh options (rsync uses ssh to connect). The second one tells ssh to not prompt for confirmation if the host he's connecting to isn't already known (by existing in the "known hosts" file). The first one tells ssh to not use the default known hosts file (which would be ~/.ssh/known_hosts). He uses /dev/null instead, which is of course always empty, and as ssh would then not find the host in there, it would normally prompt for confirmation, hence option two. Upon connecting, ssh writes the now known host to /dev/null, effectively forgetting it instantly :) – DanielSmedegaardBuus Dec 7 '14 at 0:12

DanielSmedegaardBuus ,Dec 7, 2014 at 0:23

...but you were probably wondering what effect, if any, it has on the rsync operation itself. The answer is none. It only serves to not have the host you're connecting to added to your SSH known hosts file. Perhaps he's a sysadmin often connecting to a great number of new servers, temporary systems or whatnot. I don't know :) – DanielSmedegaardBuus Dec 7 '14 at 0:23

moi ,May 10, 2016 at 13:49

"use --progress option to copy only those files which are not copied" What? – moi May 10 '16 at 13:49

Paul d'Aoust ,Nov 17, 2016 at 22:39

There are a couple errors here; one is very serious: --delete will delete files in the destination that don't exist in the source. The less serious one is that --progress doesn't modify how things are copied; it just gives you a progress report on each file as it copies. (I fixed the serious error; replaced it with --remove-source-files .) – Paul d'Aoust Nov 17 '16 at 22:39

[Jul 04, 2018] How do I parse command line arguments in Bash

Notable quotes:
"... enhanced getopt ..."
Jul 04, 2018 | stackoverflow.com

Lawrence Johnston ,Oct 10, 2008 at 16:57

Say, I have a script that gets called with this line:
./myscript -vfd ./foo/bar/someFile -o /fizz/someOtherFile

or this one:

./myscript -v -f -d -o /fizz/someOtherFile ./foo/bar/someFile

What's the accepted way of parsing this such that in each case (or some combination of the two) $v , $f , and $d will all be set to true and $outFile will be equal to /fizz/someOtherFile ?

Inanc Gumus ,Apr 15, 2016 at 19:11

See my very easy and no-dependency answer here: stackoverflow.com/a/33826763/115363 – Inanc Gumus Apr 15 '16 at 19:11

dezza ,Aug 2, 2016 at 2:13

For zsh-users there's a great builtin called zparseopts which can do: zparseopts -D -E -M -- d=debug -debug=d and have both -d and --debug in the $debug array; echo $+debug[1] will return 0 or 1 if one of those is used. Ref: zsh.org/mla/users/2011/msg00350.html – dezza Aug 2 '16 at 2:13

Bruno Bronosky ,Jan 7, 2013 at 20:01

Preferred Method: Using straight bash without getopt[s]

I originally answered the question as the OP asked. This Q/A is getting a lot of attention, so I should also offer the non-magic way to do this. I'm going to expand upon guneysus's answer to fix the nasty sed and include Tobias Kienzler's suggestion .

Two of the most common ways to pass key value pair arguments are:

Straight Bash Space Separated

Usage ./myscript.sh -e conf -s /etc -l /usr/lib /etc/hosts

#!/bin/bash

POSITIONAL=()
while [[ $# -gt 0 ]]
do
key="$1"

case $key in
    -e|--extension)
    EXTENSION="$2"
    shift # past argument
    shift # past value
    ;;
    -s|--searchpath)
    SEARCHPATH="$2"
    shift # past argument
    shift # past value
    ;;
    -l|--lib)
    LIBPATH="$2"
    shift # past argument
    shift # past value
    ;;
    --default)
    DEFAULT=YES
    shift # past argument
    ;;
    *)    # unknown option
    POSITIONAL+=("$1") # save it in an array for later
    shift # past argument
    ;;
esac
done
set -- "${POSITIONAL[@]}" # restore positional parameters

echo FILE EXTENSION  = "${EXTENSION}"
echo SEARCH PATH     = "${SEARCHPATH}"
echo LIBRARY PATH    = "${LIBPATH}"
echo DEFAULT         = "${DEFAULT}"
echo "Number files in SEARCH PATH with EXTENSION:" $(ls -1 "${SEARCHPATH}"/*."${EXTENSION}" | wc -l)
if [[ -n $1 ]]; then
    echo "Last line of file specified as non-opt/last argument:"
    tail -1 "$1"
fi
Straight Bash Equals Separated

Usage ./myscript.sh -e=conf -s=/etc -l=/usr/lib /etc/hosts

#!/bin/bash

for i in "$@"
do
case $i in
    -e=*|--extension=*)
    EXTENSION="${i#*=}"
    shift # past argument=value
    ;;
    -s=*|--searchpath=*)
    SEARCHPATH="${i#*=}"
    shift # past argument=value
    ;;
    -l=*|--lib=*)
    LIBPATH="${i#*=}"
    shift # past argument=value
    ;;
    --default)
    DEFAULT=YES
    shift # past argument with no value
    ;;
    *)
          # unknown option
    ;;
esac
done
echo "FILE EXTENSION  = ${EXTENSION}"
echo "SEARCH PATH     = ${SEARCHPATH}"
echo "LIBRARY PATH    = ${LIBPATH}"
echo "Number files in SEARCH PATH with EXTENSION:" $(ls -1 "${SEARCHPATH}"/*."${EXTENSION}" | wc -l)
if [[ -n $1 ]]; then
    echo "Last line of file specified as non-opt/last argument:"
    tail -1 "$1"
fi

To better understand ${i#*=} search for "Substring Removal" in this guide . It is functionally equivalent to `sed 's/[^=]*=//' <<< "$i"` which calls a needless subprocess or `echo "$i" | sed 's/[^=]*=//'` which calls two needless subprocesses.
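A quick demonstration of that parameter expansion:

i="--extension=conf"
echo "${i#*=}"    # prints "conf": removes everything up to the first '='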

Using getopt[s]

from: http://mywiki.wooledge.org/BashFAQ/035#getopts

Never use getopt(1). getopt cannot handle empty argument strings, or arguments with embedded whitespace. Please forget that it ever existed.

The POSIX shell (and others) offer getopts which is safe to use instead. Here is a simplistic getopts example:

#!/bin/sh

# A POSIX variable
OPTIND=1         # Reset in case getopts has been used previously in the shell.

# Initialize our own variables:
output_file=""
verbose=0

while getopts "h?vf:" opt; do
    case "$opt" in
    h|\?)
        show_help
        exit 0
        ;;
    v)  verbose=1
        ;;
    f)  output_file=$OPTARG
        ;;
    esac
done

shift $((OPTIND-1))

[ "${1:-}" = "--" ] && shift

echo "verbose=$verbose, output_file='$output_file', Leftovers: $@"

# End of file

The advantages of getopts are:

  1. It's portable, and will work in e.g. dash.
  2. It can handle things like -vf filename in the expected Unix way, automatically.

The disadvantage of getopts is that it can only handle short options ( -h , not --help ) without trickery.

There is a getopts tutorial which explains what all of the syntax and variables mean. In bash, there is also help getopts , which might be informative.
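For example, invocations of the script above (assuming it is saved as myscript.sh):

./myscript.sh -v -f out.txt leftover1 leftover2
./myscript.sh -vf out.txt leftover1      # clustered options work automatically
# each prints: verbose=1, output_file='out.txt', plus any leftovers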

Livven ,Jun 6, 2013 at 21:19

Is this really true? According to Wikipedia there's a newer GNU enhanced version of getopt which includes all the functionality of getopts and then some. man getopt on Ubuntu 13.04 outputs getopt - parse command options (enhanced) as the name, so I presume this enhanced version is standard now. – Livven Jun 6 '13 at 21:19

szablica ,Jul 17, 2013 at 15:23

That something is a certain way on your system is a very weak premise to base asumptions of "being standard" on. – szablica Jul 17 '13 at 15:23

Stephane Chazelas ,Aug 20, 2014 at 19:55

@Livven, that getopt is not a GNU utility, it's part of util-linux . – Stephane Chazelas Aug 20 '14 at 19:55

Nicolas Mongrain-Lacombe ,Jun 19, 2016 at 21:22

If you use -gt 0 , remove your shift after the esac , augment all the shift by 1 and add this case: *) break;; . You can then handle non-optional arguments. Ex: pastebin.com/6DJ57HTc – Nicolas Mongrain-Lacombe Jun 19 '16 at 21:22

kolydart ,Jul 10, 2017 at 8:11

You do not echo --default . In the first example, I notice that if --default is the last argument, it is not processed (considered as non-opt), unless while [[ $# -gt 1 ]] is set as while [[ $# -gt 0 ]] – kolydart Jul 10 '17 at 8:11

Robert Siemer ,Apr 20, 2015 at 17:47

No answer mentions enhanced getopt . And the top-voted answer is misleading: It ignores -vfd style short options (requested by the OP), options after positional arguments (also requested by the OP), and it ignores parsing errors. Instead:

The following calls

myscript -vfd ./foo/bar/someFile -o /fizz/someOtherFile
myscript -v -f -d -o/fizz/someOtherFile -- ./foo/bar/someFile
myscript --verbose --force --debug ./foo/bar/someFile -o/fizz/someOtherFile
myscript --output=/fizz/someOtherFile ./foo/bar/someFile -vfd
myscript ./foo/bar/someFile -df -v --output /fizz/someOtherFile

all return

verbose: y, force: y, debug: y, in: ./foo/bar/someFile, out: /fizz/someOtherFile

with the following myscript

#!/bin/bash

getopt --test > /dev/null
if [[ $? -ne 4 ]]; then
    echo "I'm sorry, `getopt --test` failed in this environment."
    exit 1
fi

OPTIONS=dfo:v
LONGOPTIONS=debug,force,output:,verbose

# -temporarily store output to be able to check for errors
# -e.g. use "--options" parameter by name to activate quoting/enhanced mode
# -pass arguments only via   -- "$@"   to separate them correctly
PARSED=$(getopt --options=$OPTIONS --longoptions=$LONGOPTIONS --name "$0" -- "$@")
if [[ $? -ne 0 ]]; then
    # e.g. $? == 1
    #  then getopt has complained about wrong arguments to stdout
    exit 2
fi
# read getopt's output this way to handle the quoting right:
eval set -- "$PARSED"

# now enjoy the options in order and nicely split until we see --
while true; do
    case "$1" in
        -d|--debug)
            d=y
            shift
            ;;
        -f|--force)
            f=y
            shift
            ;;
        -v|--verbose)
            v=y
            shift
            ;;
        -o|--output)
            outFile="$2"
            shift 2
            ;;
        --)
            shift
            break
            ;;
        *)
            echo "Programming error"
            exit 3
            ;;
    esac
done

# handle non-option arguments
if [[ $# -ne 1 ]]; then
    echo "$0: A single input file is required."
    exit 4
fi

echo "verbose: $v, force: $f, debug: $d, in: $1, out: $outFile"

1 enhanced getopt is available on most "bash-systems", including Cygwin; on OS X try brew install gnu-getopt
2 the POSIX exec() conventions have no reliable way to pass binary NULL in command line arguments; those bytes prematurely end the argument
3 first version released in 1997 or before (I only tracked it back to 1997)

johncip ,Jan 12, 2017 at 2:00

Thanks for this. Just confirmed from the feature table at en.wikipedia.org/wiki/Getopts , if you need support for long options, and you're not on Solaris, getopt is the way to go. – johncip Jan 12 '17 at 2:00

Kaushal Modi ,Apr 27, 2017 at 14:02

I believe that the only caveat with getopt is that it cannot be used conveniently in wrapper scripts where one might have few options specific to the wrapper script, and then pass the non-wrapper-script options to the wrapped executable, intact. Let's say I have a grep wrapper called mygrep and I have an option --foo specific to mygrep , then I cannot do mygrep --foo -A 2 , and have the -A 2 passed automatically to grep ; I need to do mygrep --foo -- -A 2 . Here is my implementation on top of your solution.Kaushal Modi Apr 27 '17 at 14:02
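A minimal sketch of such a wrapper ( mygrep and --foo are the hypothetical names from the comment; everything after -- is passed to grep untouched):

#!/bin/bash
# hypothetical "mygrep": one wrapper-specific option, the rest goes to grep
PARSED=$(getopt --options='' --longoptions=foo --name "$0" -- "$@") || exit 2
eval set -- "$PARSED"
foo=0
while true; do
    case "$1" in
        --foo) foo=1; shift ;;   # wrapper-specific behavior would key off $foo
        --)    shift; break ;;
    esac
done
exec grep "$@"                   # e.g. "$@" is now: -A 2 pattern file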

bobpaul ,Mar 20 at 16:45

Alex, I agree and there's really no way around that since we need to know the actual return value of getopt --test . I'm a big fan of "Unofficial Bash Strict mode", (which includes set -e ), and I just put the check for getopt ABOVE set -euo pipefail and IFS=$'\n\t' in my script. – bobpaul Mar 20 at 16:45

Robert Siemer ,Mar 21 at 9:10

@bobpaul Oh, there is a way around that. And I'll edit my answer soon to reflect my collections regarding this issue ( set -e )... – Robert Siemer Mar 21 at 9:10

Robert Siemer ,Mar 21 at 9:16

@bobpaul Your statement about util-linux is wrong and misleading as well: the package is marked "essential" on Ubuntu/Debian. As such, it is always installed. – Which distros are you talking about (where you say it needs to be installed on purpose)? – Robert Siemer Mar 21 at 9:16

guneysus ,Nov 13, 2012 at 10:31

From digitalpeer.com with minor modifications.

Usage: myscript.sh -p=my_prefix -s=dirname -l=libname

#!/bin/bash
for i in "$@"
do
case $i in
    -p=*|--prefix=*)
    PREFIX="${i#*=}"

    ;;
    -s=*|--searchpath=*)
    SEARCHPATH="${i#*=}"
    ;;
    -l=*|--lib=*)
    DIR="${i#*=}"
    ;;
    --default)
    DEFAULT=YES
    ;;
    *)
            # unknown option
    ;;
esac
done
echo PREFIX = ${PREFIX}
echo SEARCH PATH = ${SEARCHPATH}
echo DIRS = ${DIR}
echo DEFAULT = ${DEFAULT}

To better understand ${i#*=} search for "Substring Removal" in this guide . It is functionally equivalent to `sed 's/[^=]*=//' <<< "$i"` which calls a needless subprocess or `echo "$i" | sed 's/[^=]*=//'` which calls two needless subprocesses.
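A quick illustration of that expansion, which strips everything up to and including the first = :

$ i="--searchpath=/usr/local"
$ echo "${i#*=}"
/usr/local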

Tobias Kienzler ,Nov 12, 2013 at 12:48

Neat! Though this won't work for space-separated arguments à la mount -t tempfs ... . One can probably fix this via something like while [ $# -ge 1 ]; do param=$1; shift; case $param in; -p) prefix=$1; shift;; etc – Tobias Kienzler Nov 12 '13 at 12:48
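Spelled out, the fix sketched in that comment might look like this (reusing the variable names from the answer above):

while [ $# -ge 1 ]; do
    param="$1"; shift
    case "$param" in
        -p|--prefix)     PREFIX="$1"; shift ;;
        -s|--searchpath) SEARCHPATH="$1"; shift ;;
        -l|--lib)        DIR="$1"; shift ;;
        --default)       DEFAULT=YES ;;
        *) ;;            # unknown option, ignored
    esac
done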

Robert Siemer ,Mar 19, 2016 at 15:23

This can't handle -vfd style combined short options. – Robert Siemer Mar 19 '16 at 15:23

bekur ,Dec 19, 2017 at 23:27

link is broken! – bekur Dec 19 '17 at 23:27

Matt J ,Oct 10, 2008 at 17:03

getopt() / getopts() is a good option. Stolen from here :

The simple use of "getopt" is shown in this mini-script:

#!/bin/bash
echo "Before getopt"
for i
do
  echo $i
done
args=`getopt abc:d $*`
set -- $args
echo "After getopt"
for i
do
  echo "-->$i"
done

What we have said is that any of -a, -b, -c or -d will be allowed, but that -c is followed by an argument (the "c:" says that).

If we call this "g" and try it out:

bash-2.05a$ ./g -abc foo
Before getopt
-abc
foo
After getopt
-->-a
-->-b
-->-c
-->foo
-->--

We start with two arguments, and "getopt" breaks apart the options and puts each in its own argument. It also added "--".

Robert Siemer ,Apr 16, 2016 at 14:37

Using $* is broken usage of getopt . (It hoses arguments with spaces.) See my answer for proper usage. – Robert Siemer Apr 16 '16 at 14:37

SDsolar ,Aug 10, 2017 at 14:07

Why would you want to make it more complicated? – SDsolar Aug 10 '17 at 14:07

thebunnyrules ,Jun 1 at 1:57

@Matt J, the first part of the script (for i) would be able to handle arguments with spaces in them if you use "$i" instead of $i. The getopts does not seem to be able to handle arguments with spaces. What would be the advantage of using getopt over the for i loop? – thebunnyrules Jun 1 at 1:57

bronson ,Jul 15, 2015 at 23:43

At the risk of adding another example to ignore, here's my scheme.

Hope it's useful to someone.

while [ "$#" -gt 0 ]; do
  case "$1" in
    -n) name="$2"; shift 2;;
    -p) pidfile="$2"; shift 2;;
    -l) logfile="$2"; shift 2;;

    --name=*) name="${1#*=}"; shift 1;;
    --pidfile=*) pidfile="${1#*=}"; shift 1;;
    --logfile=*) logfile="${1#*=}"; shift 1;;
    --name|--pidfile|--logfile) echo "$1 requires an argument" >&2; exit 1;;

    -*) echo "unknown option: $1" >&2; exit 1;;
    *) handle_argument "$1"; shift 1;;
  esac
done

rhombidodecahedron ,Sep 11, 2015 at 8:40

What is the "handle_argument" function? – rhombidodecahedron Sep 11 '15 at 8:40

bronson ,Oct 8, 2015 at 20:41

Sorry for the delay. In my script, the handle_argument function receives all the non-option arguments. You can replace that line with whatever you'd like, maybe *) die "unrecognized argument: $1" or collect the args into a variable *) args+="$1"; shift 1;; . – bronson Oct 8 '15 at 20:41
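For example, a minimal hypothetical handle_argument that just collects the positionals into an array:

args=()
handle_argument() { args+=("$1"); }   # afterwards, use "${args[@]}"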

Guilherme Garnier ,Apr 13 at 16:10

Amazing! I've tested a couple of answers, but this is the only one that worked for all cases, including many positional parameters (both before and after flags) – Guilherme Garnier Apr 13 at 16:10

Shane Day ,Jul 1, 2014 at 1:20

I'm about 4 years late to this question, but want to give back. I used the earlier answers as a starting point to tidy up my old ad-hoc param parsing. I then refactored out the following template code. It handles both long and short params, using = or space-separated arguments, as well as multiple short params grouped together. Finally it re-inserts any non-param arguments back into the $1, $2, ... variables. I hope it's useful.
#!/usr/bin/env bash

# NOTICE: Uncomment if your script depends on bashisms.
#if [ -z "$BASH_VERSION" ]; then bash $0 $@ ; exit $? ; fi

echo "Before"
for i ; do echo - $i ; done


# Code template for parsing command line parameters using only portable shell
# code, while handling both long and short params, handling '-f file' and
# '-f=file' style param data and also capturing non-parameters to be inserted
# back into the shell positional parameters.

while [ -n "$1" ]; do
        # Copy so we can modify it (can't modify $1)
        OPT="$1"
        # Detect argument termination
        if [ x"$OPT" = x"--" ]; then
                shift
                for OPT ; do
                        REMAINS="$REMAINS \"$OPT\""
                done
                break
        fi
        # Parse current opt
        while [ x"$OPT" != x"-" ] ; do
                case "$OPT" in
                        # Handle --flag=value opts like this
                        -c=* | --config=* )
                                CONFIGFILE="${OPT#*=}"
                                shift
                                ;;
                        # and --flag value opts like this
                        -c* | --config )
                                CONFIGFILE="$2"
                                shift
                                ;;
                        -f* | --force )
                                FORCE=true
                                ;;
                        -r* | --retry )
                                RETRY=true
                                ;;
                        # Anything unknown is recorded for later
                        * )
                                REMAINS="$REMAINS \"$OPT\""
                                break
                                ;;
                esac
                # Check for multiple short options
                # NOTICE: be sure to update this pattern to match valid options
                NEXTOPT="${OPT#-[cfr]}" # try removing single short opt
                if [ x"$OPT" != x"$NEXTOPT" ] ; then
                        OPT="-$NEXTOPT"  # multiple short opts, keep going
                else
                        break  # long form, exit inner loop
                fi
        done
        # Done with that param. move to next
        shift
done
# Set the non-parameters back into the positional parameters ($1 $2 ..)
eval set -- $REMAINS


echo -e "After: \n configfile='$CONFIGFILE' \n force='$FORCE' \n retry='$RETRY' \n remains='$REMAINS'"
for i ; do echo - $i ; done

Robert Siemer ,Dec 6, 2015 at 13:47

This code can't handle options with arguments like this: -c1 . And the use of = to separate short options from their arguments is unusual... – Robert Siemer Dec 6 '15 at 13:47

sfnd ,Jun 6, 2016 at 19:28

I ran into two problems with this useful chunk of code: 1) the "shift" in the case of "-c=foo" ends up eating the next parameter; and 2) 'c' should not be included in the "[cfr]" pattern for combinable short options. – sfnd Jun 6 '16 at 19:28

Inanc Gumus ,Nov 20, 2015 at 12:28

A more succinct way

script.sh

#!/bin/bash

while [[ "$#" > 0 ]]; do case $1 in
  -d|--deploy) deploy="$2"; shift;;
  -u|--uglify) uglify=1;;
  *) echo "Unknown parameter passed: $1"; exit 1;;
esac; shift; done

echo "Should deploy? $deploy"
echo "Should uglify? $uglify"

Usage:

./script.sh -d dev -u

# OR:

./script.sh --deploy dev --uglify

hfossli ,Apr 7 at 20:58

This is what I am doing. I have to use while [[ "$#" > 1 ]] if I want to support ending the line with a boolean flag ./script.sh --debug dev --uglify fast --verbose . Example: gist.github.com/hfossli/4368aa5a577742c3c9f9266ed214aa58 – hfossli Apr 7 at 20:58

hfossli ,Apr 7 at 21:09

I sent an edit request. I just tested this and it works perfectly. – hfossli Apr 7 at 21:09

hfossli ,Apr 7 at 21:10

Wow! Simple and clean! This is how I'm using this: gist.github.com/hfossli/4368aa5a577742c3c9f9266ed214aa58hfossli Apr 7 at 21:10

Ponyboy47 ,Sep 8, 2016 at 18:59

My answer is largely based on the answer by Bruno Bronosky , but I sort of mashed his two pure bash implementations into one that I use pretty frequently.
# As long as there is at least one more argument, keep looping
while [[ $# -gt 0 ]]; do
    key="$1"
    case "$key" in
        # This is a flag type option. Will catch either -f or --foo
        -f|--foo)
        FOO=1
        ;;
        # Also a flag type option. Will catch either -b or --bar
        -b|--bar)
        BAR=1
        ;;
        # This is an arg value type option. Will catch -o value or --output-file value
        -o|--output-file)
        shift # past the key and to the value
        OUTPUTFILE="$1"
        ;;
        # This is an arg=value type option. Will catch -o=value or --output-file=value
        -o=*|--output-file=*)
        # No need to shift here since the value is part of the same string
        OUTPUTFILE="${key#*=}"
        ;;
        *)
        # Do whatever you want with extra options
        echo "Unknown option '$key'"
        ;;
    esac
    # Shift after checking all the cases to get the next option
    shift
done

This allows you to have both space-separated options/values, as well as equals-defined values.

So you could run your script using:

./myscript --foo -b -o /fizz/file.txt

as well as:

./myscript -f --bar -o=/fizz/file.txt

and both should have the same end result.

PROS:

CONS:

These are the only pros/cons I can think of off the top of my head

bubla ,Jul 10, 2016 at 22:40

I have found writing portable argument parsing in scripts so frustrating that I have written Argbash - a FOSS code generator that can generate the argument-parsing code for your script, plus it has some nice features:

https://argbash.io

RichVel ,Aug 18, 2016 at 5:34

Thanks for writing argbash, I just used it and found it works well. I mostly went for argbash because it's a code generator supporting the older bash 3.x found on OS X 10.11 El Capitan. The only downside is that the code-generator approach means quite a lot of code in your main script, compared to calling a module. – RichVel Aug 18 '16 at 5:34

bubla ,Aug 23, 2016 at 20:40

You can actually use Argbash in a way that it produces tailor-made parsing library just for you that you can have included in your script or you can have it in a separate file and just source it. I have added an example to demonstrate that and I have made it more explicit in the documentation, too. – bubla Aug 23 '16 at 20:40

RichVel ,Aug 24, 2016 at 5:47

Good to know. That example is interesting but still not really clear - maybe you can change name of the generated script to 'parse_lib.sh' or similar and show where the main script calls it (like in the wrapping script section which is more complex use case). – RichVel Aug 24 '16 at 5:47

bubla ,Dec 2, 2016 at 20:12

The issues were addressed in recent version of argbash: Documentation has been improved, a quickstart argbash-init script has been introduced and you can even use argbash online at argbash.io/generatebubla Dec 2 '16 at 20:12

Alek ,Mar 1, 2012 at 15:15

I think this one is simple enough to use:
#!/bin/bash
#

readopt='getopts $opts opt;rc=$?;[ $rc$opt == 0? ]&&exit 1;[ $rc == 0 ]||{ shift $[OPTIND-1];false; }'

opts=vfdo:

# Enumerating options
while eval $readopt
do
    echo OPT:$opt ${OPTARG+OPTARG:$OPTARG}
done

# Enumerating arguments
for arg
do
    echo ARG:$arg
done

Invocation example:

./myscript -v -do /fizz/someOtherFile -f ./foo/bar/someFile
OPT:v 
OPT:d 
OPT:o OPTARG:/fizz/someOtherFile
OPT:f 
ARG:./foo/bar/someFile

erm3nda ,May 20, 2015 at 22:50

I read them all and this one is my preferred one. I don't like the -a=1 argc style. I prefer to put the main options first with -options and later the special ones with single spacing, -o option . I'm looking for the simplest-vs-better way to read argvs. – erm3nda May 20 '15 at 22:50

erm3nda ,May 20, 2015 at 23:25

It's working really well, but if you pass an argument to an option not declared with a colon (the -d option is not set as d: ), all the following options are taken as arguments. You can check this line ./myscript -v -d fail -o /fizz/someOtherFile -f ./foo/bar/someFile with your own script. – erm3nda May 20 '15 at 23:25

unsynchronized ,Jun 9, 2014 at 13:46

Expanding on the excellent answer by @guneysus, here is a tweak that lets the user use whichever syntax they prefer, e.g.
command -x=myfilename.ext --another_switch

vs

command -x myfilename.ext --another_switch

That is to say the equals can be replaced with whitespace.

This "fuzzy interpretation" might not be to your liking, but if you are making scripts that are interchangeable with other utilities (as is the case with mine, which must work with ffmpeg), the flexibility is useful.

STD_IN=0

prefix=""
key=""
value=""
for keyValue in "$@"
do
  case "${prefix}${keyValue}" in
    -i=*|--input_filename=*)  key="-i";     value="${keyValue#*=}";; 
    -ss=*|--seek_from=*)      key="-ss";    value="${keyValue#*=}";;
    -t=*|--play_seconds=*)    key="-t";     value="${keyValue#*=}";;
    -|--stdin)                key="-";      value=1;;
    *)                                      value=$keyValue;;
  esac
  case $key in
    -i) MOVIE=$(resolveMovie "${value}");  prefix=""; key="";;
    -ss) SEEK_FROM="${value}";          prefix=""; key="";;
    -t)  PLAY_SECONDS="${value}";           prefix=""; key="";;
    -)   STD_IN=${value};                   prefix=""; key="";; 
    *)   prefix="${keyValue}=";;
  esac
done

vangorra ,Feb 12, 2015 at 21:50

getopt works great if #1 you have it installed and #2 you intend to run it on the same platform. OSX and Linux (for example) behave differently in this respect (the BSD and GNU versions of getopt are not compatible).

Here is a (non getopts) solution that supports equals, non-equals, and boolean flags. For example you could run your script in this way:

./script --arg1=value1 --arg2 value2 --shouldClean

# parse the arguments.
COUNTER=0
ARGS=("$@")
while [ $COUNTER -lt $# ]
do
    arg=${ARGS[$COUNTER]}
    let COUNTER=COUNTER+1
    nextArg=${ARGS[$COUNTER]}

    if [[ $skipNext -eq 1 ]]; then
        echo "Skipping"
        skipNext=0
        continue
    fi

    argKey=""
    argVal=""
    if [[ "$arg" =~ ^\- ]]; then
        # if the format is: -key=value
        if [[ "$arg" =~ \= ]]; then
            argVal=$(echo "$arg" | cut -d'=' -f2)
            argKey=$(echo "$arg" | cut -d'=' -f1)
            skipNext=0

        # if the format is: -key value
        elif [[ ! "$nextArg" =~ ^\- ]]; then
            argKey="$arg"
            argVal="$nextArg"
            skipNext=1

        # if the format is: -key (a boolean flag)
        elif [[ "$nextArg" =~ ^\- ]] || [[ -z "$nextArg" ]]; then
            argKey="$arg"
            argVal=""
            skipNext=0
        fi
    # if the format has not flag, just a value.
    else
        argKey=""
        argVal="$arg"
        skipNext=0
    fi

    case "$argKey" in 
        --source-scmurl)
            SOURCE_URL="$argVal"
        ;;
        --dest-scmurl)
            DEST_URL="$argVal"
        ;;
        --version-num)
            VERSION_NUM="$argVal"
        ;;
        -c|--clean)
            CLEAN_BEFORE_START="1"
        ;;
        -h|--help|-help|--h)
            showUsage
            exit
        ;;
    esac
done

akostadinov ,Jul 19, 2013 at 7:50

This is how I do it in a function, to avoid breaking getopts runs happening at the same time somewhere higher in the stack:
function waitForWeb () {
   local OPTIND=1 OPTARG OPTION
   local host=localhost port=8080 proto=http
   while getopts "h:p:r:" OPTION; do
      case "$OPTION" in
      h)
         host="$OPTARG"
         ;;
      p)
         port="$OPTARG"
         ;;
      r)
         proto="$OPTARG"
         ;;
      esac
   done
...
}

Renato Silva ,Jul 4, 2016 at 16:47

EasyOptions does not require any parsing:
## Options:
##   --verbose, -v  Verbose mode
##   --output=FILE  Output filename

source easyoptions || exit

if test -n "${verbose}"; then
    echo "output file is ${output}"
    echo "${arguments[@]}"
fi

Oleksii Chekulaiev ,Jul 1, 2016 at 20:56

I give you The Function parse_params that will parse params:
  1. Without polluting the global scope.
  2. It effortlessly returns ready-to-use variables so that you can build further logic on them.
  3. The number of dashes before params does not matter ( --all equals -all equals all=all )

The script below is a copy-paste working demonstration. See show_use function to understand how to use parse_params .

Limitations:

  1. Does not support space delimited params ( -d 1 )
  2. Param names will lose dashes so --any-param and -anyparam are equivalent
  3. eval $(parse_params "$@") must be used inside bash function (it will not work in the global scope)

#!/bin/bash

# Universal Bash parameter parsing
# Parse equal sign separated params into named local variables
# Standalone named parameter value will equal its param name (--force creates variable $force=="force")
# Parses multi-valued named params into an array (--path=path1 --path=path2 creates ${path[*]} array)
# Parses un-named params into ${ARGV[*]} array
# Additionally puts all named params into ${ARGN[*]} array
# Additionally puts all standalone "option" params into ${ARGO[*]} array
# @author Oleksii Chekulaiev
# @version v1.3 (May-14-2018)
parse_params ()
{
    local existing_named
    local ARGV=() # un-named params
    local ARGN=() # named params
    local ARGO=() # options (--params)
    echo "local ARGV=(); local ARGN=(); local ARGO=();"
    while [[ "$1" != "" ]]; do
        # Escape asterisk to prevent bash asterisk expansion
        _escaped=${1/\*/\'\"*\"\'}
        # If equals delimited named parameter
        if [[ "$1" =~ ^..*=..* ]]; then
            # Add to named parameters array
            echo "ARGN+=('$_escaped');"
            # key is part before first =
            local _key=$(echo "$1" | cut -d = -f 1)
            # val is everything after key and = (protect from param==value error)
            local _val="${1/$_key=}"
            # remove dashes from key name
            _key=${_key//\-}
            # search for existing parameter name
            if (echo "$existing_named" | grep "\b$_key\b" >/dev/null); then
                # if name already exists then it's a multi-value named parameter
                # re-declare it as an array if needed
                if ! (declare -p _key 2> /dev/null | grep -q 'declare \-a'); then
                    echo "$_key=(\"\$$_key\");"
                fi
                # append new value
                echo "$_key+=('$_val');"
            else
                # single-value named parameter
                echo "local $_key=\"$_val\";"
                existing_named=" $_key"
            fi
        # If standalone named parameter
        elif [[ "$1" =~ ^\-. ]]; then
            # Add to options array
            echo "ARGO+=('$_escaped');"
            # remove dashes
            local _key=${1//\-}
            echo "local $_key=\"$_key\";"
        # non-named parameter
        else
            # Escape asterisk to prevent bash asterisk expansion
            _escaped=${1/\*/\'\"*\"\'}
            echo "ARGV+=('$_escaped');"
        fi
        shift
    done
}

#--------------------------- DEMO OF THE USAGE -------------------------------

show_use ()
{
    eval $(parse_params "$@")
    # --
    echo "${ARGV[0]}" # print first unnamed param
    echo "${ARGV[1]}" # print second unnamed param
    echo "${ARGN[0]}" # print first named param
    echo "${ARG0[0]}" # print first option param (--force)
    echo "$anyparam"  # print --anyparam value
    echo "$k"         # print k=5 value
    echo "${multivalue[0]}" # print first value of multi-value
    echo "${multivalue[1]}" # print second value of multi-value
    [[ "$force" == "force" ]] && echo "\$force is set so let the force be with you"
}

show_use "param 1" --anyparam="my value" param2 k=5 --force --multi-value=test1 --multi-value=test2

Oleksii Chekulaiev ,Sep 28, 2016 at 12:55

To use the demo to parse params that come into your bash script you just do show_use "$@"Oleksii Chekulaiev Sep 28 '16 at 12:55

Oleksii Chekulaiev ,Sep 28, 2016 at 12:58

Basically I found out that github.com/renatosilva/easyoptions does the same in the same way but is a bit more massive than this function. – Oleksii Chekulaiev Sep 28 '16 at 12:58

galmok ,Jun 24, 2015 at 10:54

I'd like to offer my version of option parsing, which allows for the following:
-s p1
--stage p1
-w somefolder
--workfolder somefolder
-sw p1 somefolder
-e=hello

Also allows for this (could be unwanted):

-s--workfolder p1 somefolder
-se=hello p1
-swe=hello p1 somefolder

You have to decide before use if = is to be used on an option or not. This is to keep the code clean(ish).

while [[ $# > 0 ]]
do
    key="$1"
    while [[ ${key+x} ]]
    do
        case $key in
            -s*|--stage)
                STAGE="$2"
                shift # option has parameter
                ;;
            -w*|--workfolder)
                workfolder="$2"
                shift # option has parameter
                ;;
            -e=*)
                EXAMPLE="${key#*=}"
                break # option has been fully handled
                ;;
            *)
                # unknown option
                echo Unknown option: $key #1>&2
                exit 10 # either this: my preferred way to handle unknown options
                break # or this: do this to signal the option has been handled (if exit isn't used)
                ;;
        esac
        # prepare for next option in this key, if any
        [[ "$key" = -? || "$key" == --* ]] && unset key || key="${key/#-?/-}"
    done
    shift # option(s) fully processed, proceed to next input argument
done

Luca Davanzo ,Nov 14, 2016 at 17:56

what's the meaning for "+x" on ${key+x} ? – Luca Davanzo Nov 14 '16 at 17:56

galmok ,Nov 15, 2016 at 9:10

It is a test to see if 'key' is present or not. Further down I unset key and this breaks the inner while loop. – galmok Nov 15 '16 at 9:10
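In other words, ${key+x} expands to x when key is set (even to the empty string) and to nothing when key is unset, so it is a reliable presence test. For example:

$ unset key; [[ ${key+x} ]] && echo set || echo unset
unset
$ key=""; [[ ${key+x} ]] && echo set || echo unset
set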

Mark Fox ,Apr 27, 2015 at 2:42

Mixing positional and flag-based arguments

--param=arg (equals delimited)

Freely mixing flags between positional arguments:

./script.sh dumbo 127.0.0.1 --environment=production -q -d
./script.sh dumbo --environment=production 127.0.0.1 --quiet -d

can be accomplished with a fairly concise approach:

# process flags
pointer=1
while [[ $pointer -le $# ]]; do
   param=${!pointer}
   if [[ $param != "-"* ]]; then ((pointer++)) # not a parameter flag so advance pointer
   else
      case $param in
         # parameter-flags with arguments
         -e=*|--environment=*) environment="${param#*=}";;
                  --another=*) another="${param#*=}";;

         # binary flags
         -q|--quiet) quiet=true;;
                 -d) debug=true;;
      esac

      # splice out pointer frame from positional list
      [[ $pointer -gt 1 ]] \
         && set -- ${@:1:((pointer - 1))} ${@:((pointer + 1)):$#} \
         || set -- ${@:((pointer + 1)):$#};
   fi
done

# positional remain
node_name=$1
ip_address=$2

--param arg (space delimited)

It's usually clearer not to mix --flag=value and --flag value styles.

./script.sh dumbo 127.0.0.1 --environment production -q -d

This is a little dicey to read, but is still valid

./script.sh dumbo --environment production 127.0.0.1 --quiet -d

Source

# process flags
pointer=1
while [[ $pointer -le $# ]]; do
   if [[ ${!pointer} != "-"* ]]; then ((pointer++)) # not a parameter flag so advance pointer
   else
      param=${!pointer}
      ((pointer_plus = pointer + 1))
      slice_len=1

      case $param in
         # parameter-flags with arguments
         -e|--environment) environment=${!pointer_plus}; ((slice_len++));;
                --another) another=${!pointer_plus}; ((slice_len++));;

         # binary flags
         -q|--quiet) quiet=true;;
                 -d) debug=true;;
      esac

      # splice out pointer frame from positional list
      [[ $pointer -gt 1 ]] \
         && set -- ${@:1:((pointer - 1))} ${@:((pointer + $slice_len)):$#} \
         || set -- ${@:((pointer + $slice_len)):$#};
   fi
done

# positional remain
node_name=$1
ip_address=$2

schily ,Oct 19, 2015 at 13:59

Note that getopt(1) was a short-lived mistake from AT&T.

getopt was created in 1984, but already buried in 1986 because it was not really usable.

Proof that getopt(1) is very outdated is that its man page still mentions "$*" instead of "$@" , which was added to the Bourne Shell in 1986 together with the getopts(1) shell builtin in order to deal with arguments containing spaces.
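The difference matters as soon as an argument contains whitespace; a quick demonstration:

set -- "one two" three
printf '[%s]\n' "$*"   # one word:  [one two three]
printf '[%s]\n' "$@"   # two words: [one two] [three]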

BTW: if you are interested in parsing long options in shell scripts, it may be of interest to know that the getopt(3) implementation from libc (Solaris) and ksh93 both added a uniform long option implementation that supports long options as aliases for short options. This causes ksh93 and the Bourne Shell to implement a uniform interface for long options via getopts .

An example for long options taken from the Bourne Shell man page:

getopts "f:(file)(input-file)o:(output-file)" OPTX "$@"

shows how long option aliases may be used in both Bourne Shell and ksh93.

See the man page of a recent Bourne Shell:

http://schillix.sourceforge.net/man/man1/bosh.1.html

and the man page for getopt(3) from OpenSolaris:

http://schillix.sourceforge.net/man/man3c/getopt.3c.html

and last, the getopt(1) man page to verify the outdated $*:

http://schillix.sourceforge.net/man/man1/getopt.1.html

Volodymyr M. Lisivka ,Jul 9, 2013 at 16:51

Use module "arguments" from bash-modules

Example:

#!/bin/bash
. import.sh log arguments

NAME="world"

parse_arguments "-n|--name)NAME;S" -- "$@" || {
  error "Cannot parse command line."
  exit 1
}

info "Hello, $NAME!"

Mike Q ,Jun 14, 2014 at 18:01

It might also be useful to know that you can set a default value and, if someone provides input, override the default with that value:

myscript.sh -f ./serverlist.txt or just ./myscript.sh (and it takes defaults)

    #!/bin/bash
    # --- set the value, if there is inputs, override the defaults.

    HOME_FOLDER="${HOME}/owned_id_checker"
    SERVER_FILE_LIST="${HOME_FOLDER}/server_list.txt"

    while [[ $# > 1 ]]
    do
    key="$1"
    shift

    case $key in
        -i|--inputlist)
        SERVER_FILE_LIST="$1"
        shift
        ;;
    esac
    done


    echo "SERVER LIST   = ${SERVER_FILE_LIST}"

phk ,Oct 17, 2015 at 21:17

Another solution without getopt[s], POSIX, old Unix style

Similar to the solution Bruno Bronosky posted, here is one without the use of getopt(s) .

The main differentiating feature of my solution is that it allows options to be concatenated, just like tar -xzf foo.tar.gz is equal to tar -x -z -f foo.tar.gz . And just as in tar , ps etc., the leading hyphen is optional for a block of short options (but this can be changed easily). Long options are supported as well (but when a block starts with one, two leading hyphens are required).

Code with example options
#!/bin/sh

echo
echo "POSIX-compliant getopt(s)-free old-style-supporting option parser from phk@[se.unix]"
echo

print_usage() {
  echo "Usage:

  $0 {a|b|c} [ARG...]

Options:

  --aaa-0-args
  -a
    Option without arguments.

  --bbb-1-args ARG
  -b ARG
    Option with one argument.

  --ccc-2-args ARG1 ARG2
  -c ARG1 ARG2
    Option with two arguments.

" >&2
}

if [ $# -le 0 ]; then
  print_usage
  exit 1
fi

opt=
while :; do

  if [ $# -le 0 ]; then

    # no parameters remaining -> end option parsing
    break

  elif [ ! "$opt" ]; then

    # we are at the beginning of a fresh block
    # remove optional leading hyphen and strip trailing whitespaces
    opt=$(echo "$1" | sed 's/^-\?\([a-zA-Z0-9\?-]*\)/\1/')

  fi

  # get the first character -> check whether long option
  first_chr=$(echo "$opt" | awk '{print substr($1, 1, 1)}')
  [ "$first_chr" = - ] && long_option=T || long_option=F

  # note to write the options here with a leading hyphen less
  # also do not forget to end short options with a star
  case $opt in

    -)

      # end of options
      shift
      break
      ;;

    a*|-aaa-0-args)

      echo "Option AAA activated!"
      ;;

    b*|-bbb-1-args)

      if [ "$2" ]; then
        echo "Option BBB with argument '$2' activated!"
        shift
      else
        echo "BBB parameters incomplete!" >&2
        print_usage
        exit 1
      fi
      ;;

    c*|-ccc-2-args)

      if [ "$2" ] && [ "$3" ]; then
        echo "Option CCC with arguments '$2' and '$3' activated!"
        shift 2
      else
        echo "CCC parameters incomplete!" >&2
        print_usage
        exit 1
      fi
      ;;

    h*|\?*|-help)

      print_usage
      exit 0
      ;;

    *)

      if [ "$long_option" = T ]; then
        opt=$(echo "$opt" | awk '{print substr($1, 2)}')
      else
        opt=$first_chr
      fi
      printf 'Error: Unknown option: "%s"\n' "$opt" >&2
      print_usage
      exit 1
      ;;

  esac

  if [ "$long_option" = T ]; then

    # if we had a long option then we are going to get a new block next
    shift
    opt=

  else

    # if we had a short option then just move to the next character
    opt=$(echo "$opt" | awk '{print substr($1, 2)}')

    # if block is now empty then shift to the next one
    [ "$opt" ] || shift

  fi

done

echo "Doing something..."

exit 0

For the example usage please see the examples further below.

Position of options with arguments

For what it's worth, options with arguments don't need to be last (only long options do). So while e.g. in tar (at least in some implementations) the f option needs to be last because the file name follows ( tar xzf bar.tar.gz works but tar xfz bar.tar.gz does not), this is not the case here (see the later examples).

Multiple options with arguments

As another bonus, the option parameters are consumed in the order of the options, by the options that require parameters. Just look at the output of my script here with the command line abc X Y Z (or -abc X Y Z ):

Option AAA activated!
Option BBB with argument 'X' activated!
Option CCC with arguments 'Y' and 'Z' activated!

Long options concatenated as well

You can also have long options in an option block, given that they occur last in the block. So the following command lines are all equivalent (including the order in which the options and their arguments are being processed):

All of these lead to:

Option CCC with arguments 'Z' and 'Y' activated!
Option BBB with argument 'X' activated!
Option AAA activated!
Doing something...
Not in this solution

Optional arguments

Options with optional arguments should be possible with a bit of work, e.g. by looking ahead to see whether there is a block without a hyphen; the user would then need to put a hyphen in front of every block following a block with a parameter that has an optional argument. Maybe this is too complicated to communicate to the user, so better to just require a leading hyphen altogether in this case.

Things get even more complicated with multiple possible parameters. I would advise against making the options try to be smart by determining whether an argument might be meant for them or not (e.g. an option that just takes a number as an optional argument), because this might break in the future.

I personally favor additional options instead of optional arguments.

Option arguments introduced with an equal sign

Just like with optional arguments, I am not a fan of this (BTW, is there a thread for discussing the pros/cons of different parameter styles?), but if you want this you could probably implement it yourself, just as it is done at http://mywiki.wooledge.org/BashFAQ/035#Manual_loop with a --long-with-arg=?* case statement and then stripping the equal sign (this is, BTW, the site that says that parameter concatenation is possible with some effort, but "left [it] as an exercise for the reader", which made me take them at their word; however, I started from scratch).

Other notes

POSIX-compliant, works even on ancient Busybox setups I had to deal with (with e.g. cut , head and getopts missing).

Noah ,Aug 29, 2016 at 3:44

Solution that preserves unhandled arguments. Demos included.

Here is my solution. It is VERY flexible and unlike others, shouldn't require external packages and handles leftover arguments cleanly.

Usage is: ./myscript -flag flagvariable -otherflag flagvar2

All you have to do is edit the validflags line. It prepends a hyphen to each flag and searches all arguments for it. It then assigns the next argument as the flag's value, e.g.

./myscript -flag flagvariable -otherflag flagvar2
echo $flag $otherflag
flagvariable flagvar2

The main code (short version, verbose with examples further down, also a version with erroring out):

#!/usr/bin/env bash
#shebang.io
validflags="rate time number"
count=1
for arg in $@
do
    match=0
    argval=$1
    for flag in $validflags
    do
        sflag="-"$flag
        if [ "$argval" == "$sflag" ]
        then
            declare $flag=$2
            match=1
        fi
    done
        if [ "$match" == "1" ]
    then
        shift 2
    else
        leftovers=$(echo $leftovers $argval)
        shift
    fi
    count=$(($count+1))
done
#Cleanup then restore the leftovers
shift $#
set -- $leftovers

The verbose version with built in echo demos:

#!/usr/bin/env bash
#shebang.io
rate=30
time=30
number=30
echo "all args
$@"
validflags="rate time number"
count=1
for arg in $@
do
    match=0
    argval=$1
#   argval=$(echo $@ | cut -d ' ' -f$count)
    for flag in $validflags
    do
            sflag="-"$flag
        if [ "$argval" == "$sflag" ]
        then
            declare $flag=$2
            match=1
        fi
    done
        if [ "$match" == "1" ]
    then
        shift 2
    else
        leftovers=$(echo $leftovers $argval)
        shift
    fi
    count=$(($count+1))
done

#Cleanup then restore the leftovers
echo "pre final clear args:
$@"
shift $#
echo "post final clear args:
$@"
set -- $leftovers
echo "all post set args:
$@"
echo arg1: $1 arg2: $2

echo leftovers: $leftovers
echo rate $rate time $time number $number

Final one, this one errors out if an invalid -argument is passed through.

#!/usr/bin/env bash
#shebang.io
rate=30
time=30
number=30
validflags="rate time number"
count=1
for arg in $@
do
    argval=$1
    match=0
        if [ "${argval:0:1}" == "-" ]
    then
        for flag in $validflags
        do
                sflag="-"$flag
            if [ "$argval" == "$sflag" ]
            then
                declare $flag=$2
                match=1
            fi
        done
        if [ "$match" == "0" ]
        then
            echo "Bad argument: $argval"
            exit 1
        fi
        shift 2
    else
        leftovers=$(echo $leftovers $argval)
        shift
    fi
    count=$(($count+1))
done
#Cleanup then restore the leftovers
shift $#
set -- $leftovers
echo rate $rate time $time number $number
echo leftovers: $leftovers

Pros: What it does, it handles very well. It preserves unused arguments, which a lot of the other solutions here don't. It also allows for variables to be called without being defined by hand in the script, and it allows prepopulation of variables if no corresponding argument is given (see the verbose example).

Cons: It can't parse a single complex arg string, e.g. -xcvf would be processed as a single argument. You could somewhat easily write additional code into mine that adds this functionality, though.
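One way to add that, as a sketch: a pre-pass that splits a cluster like -xcvf into -x -c -v -f before the main loop (this assumes no clustered option takes a value):

expanded=()
for a in "$@"; do
    if [[ $a == -[!-]?* ]]; then              # e.g. -xcvf, but not --long or a bare -x
        for ((i=1; i<${#a}; i++)); do
            expanded+=("-${a:i:1}")
        done
    else
        expanded+=("$a")
    fi
done
set -- "${expanded[@]}"                       # now every short option stands alone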

Daniel Bigham ,Aug 8, 2016 at 12:42

The top answer to this question seemed a bit buggy when I tried it -- here's my solution which I've found to be more robust:
boolean_arg=""
arg_with_value=""

while [[ $# -gt 0 ]]
do
key="$1"
case $key in
    -b|--boolean-arg)
    boolean_arg=true
    shift
    ;;
    -a|--arg-with-value)
    arg_with_value="$2"
    shift
    shift
    ;;
    -*)
    echo "Unknown option: $1"
    exit 1
    ;;
    *)
    arg_num=$(( $arg_num + 1 ))
    case $arg_num in
        1)
        first_normal_arg="$1"
        shift
        ;;
        2)
        second_normal_arg="$1"
        shift
        ;;
        *)
        bad_args=TRUE
    esac
    ;;
esac
done

# Handy to have this here when adding arguments to
# see if they're working. Just edit the '0' to be '1'.
if [[ 0 == 1 ]]; then
    echo "first_normal_arg: $first_normal_arg"
    echo "second_normal_arg: $second_normal_arg"
    echo "boolean_arg: $boolean_arg"
    echo "arg_with_value: $arg_with_value"
    exit 0
fi

if [[ $bad_args == TRUE || $arg_num < 2 ]]; then
    echo "Usage: $(basename "$0") <first-normal-arg> <second-normal-arg> [--boolean-arg] [--arg-with-value VALUE]"
    exit 1
fi

phyatt ,Sep 7, 2016 at 18:25

This example shows how to use getopt , eval , a HEREDOC and shift to handle short and long parameters, with and without required values that follow. Also the case statement is concise and easy to follow.
#!/usr/bin/env bash

# usage function
function usage()
{
   cat << HEREDOC

   Usage: $progname [--num NUM] [--time TIME_STR] [--verbose] [--dry-run]

   optional arguments:
     -h, --help           show this help message and exit
     -n, --num NUM        pass in a number
     -t, --time TIME_STR  pass in a time string
     -v, --verbose        increase the verbosity of the bash script
     --dry-run            do a dry run, don't change any files

HEREDOC
}  

# initialize variables
progname=$(basename $0)
verbose=0
dryrun=0
num_str=
time_str=

# use getopt and store the output into $OPTS
# note the use of -o for the short options, --long for the long name options
# and a : for any option that takes a parameter
OPTS=$(getopt -o "hn:t:v" --long "help,num:,time:,verbose,dry-run" -n "$progname" -- "$@")
if [ $? != 0 ] ; then echo "Error in command line arguments." >&2 ; usage; exit 1 ; fi
eval set -- "$OPTS"

while true; do
  # uncomment the next line to see how shift is working
  # echo "\$1:\"$1\" \$2:\"$2\""
  case "$1" in
    -h | --help ) usage; exit; ;;
    -n | --num ) num_str="$2"; shift 2 ;;
    -t | --time ) time_str="$2"; shift 2 ;;
    --dry-run ) dryrun=1; shift ;;
    -v | --verbose ) verbose=$((verbose + 1)); shift ;;
    -- ) shift; break ;;
    * ) break ;;
  esac
done

if (( $verbose > 0 )); then

   # print out all the parameters we read in
   cat <<-EOM
   num=$num_str
   time=$time_str
   verbose=$verbose
   dryrun=$dryrun
EOM
fi

# The rest of your script below

The most significant lines of the script above are these:

OPTS=$(getopt -o "hn:t:v" --long "help,num:,time:,verbose,dry-run" -n "$progname" -- "$@")
if [ $? != 0 ] ; then echo "Error in command line arguments." >&2 ; exit 1 ; fi
eval set -- "$OPTS"

while true; do
  case "$1" in
    -h | --help ) usage; exit; ;;
    -n | --num ) num_str="$2"; shift 2 ;;
    -t | --time ) time_str="$2"; shift 2 ;;
    --dry-run ) dryrun=1; shift ;;
    -v | --verbose ) verbose=$((verbose + 1)); shift ;;
    -- ) shift; break ;;
    * ) break ;;
  esac
done

Short, to the point, readable, and handles just about everything (IMHO).

Hope that helps someone.

Emeric Verschuur ,Feb 20, 2017 at 21:30

I have written a bash helper for writing nice bash tools.

project home: https://gitlab.mbedsys.org/mbedsys/bashopts

example:

#!/bin/bash -ei

# load the library
. bashopts.sh

# Enable backtrace display on error
trap 'bashopts_exit_handle' ERR

# Initialize the library
bashopts_setup -n "$0" -d "This is myapp tool description displayed on help message" -s "$HOME/.config/myapprc"

# Declare the options
bashopts_declare -n first_name -l first -o f -d "First name" -t string -i -s -r
bashopts_declare -n last_name -l last -o l -d "Last name" -t string -i -s -r
bashopts_declare -n display_name -l display-name -t string -d "Display name" -e "\$first_name \$last_name"
bashopts_declare -n age -l number -d "Age" -t number
bashopts_declare -n email_list -t string -m add -l email -d "Email address"

# Parse arguments
bashopts_parse_args "$@"

# Process argument
bashopts_process_args

will give help:

NAME:
    ./example.sh - This is myapp tool description displayed on help message

USAGE:
    [options and commands] [-- [extra args]]

OPTIONS:
    -h,--help                          Display this help
    -n,--non-interactive true          Non interactive mode - [$bashopts_non_interactive] (type:boolean, default:false)
    -f,--first "John"                  First name - [$first_name] (type:string, default:"")
    -l,--last "Smith"                  Last name - [$last_name] (type:string, default:"")
    --display-name "John Smith"        Display name - [$display_name] (type:string, default:"$first_name $last_name")
    --number 0                         Age - [$age] (type:number, default:0)
    --email                            Email address - [$email_list] (type:string, default:"")

enjoy :)

Josh Wulf ,Jun 24, 2017 at 18:07

I get this on Mac OS X: lib/bashopts.sh: line 138: declare: -A: invalid option; declare: usage: declare [-afFirtx] [-p] [name[=value] ...]; Error in lib/bashopts.sh:138. 'declare -x -A bashopts_optprop_name' exited with status 2; Call tree: 1: lib/controller.sh:4 source(...); Exiting with status 1 – Josh Wulf Jun 24 '17 at 18:07

Josh Wulf ,Jun 24, 2017 at 18:17

You need Bash version 4 to use this. On Mac, the default version is 3. You can use home brew to install bash 4. – Josh Wulf Jun 24 '17 at 18:17

a_z ,Mar 15, 2017 at 13:24

Here is my approach - using regexp.

script:

#!/usr/bin/env bash

help_menu() {
  echo "Usage:

  ${0##*/} [-h][-l FILENAME][-d]

Options:

  -h, --help
    display this help and exit

  -l, --logfile=FILENAME
    filename

  -d, --debug
    enable debug
  "
}

parse_options() {
  case $opt in
    h|help)
      help_menu
      exit
     ;;
    l|logfile)
      logfile=${attr}
      ;;
    d|debug)
      debug=true
      ;;
    *)
      echo "Unknown option: ${opt}\nRun ${0##*/} -h for help.">&2
      exit 1
  esac
}
options=$@

until [ "$options" = "" ]; do
  if [[ $options =~ (^ *(--([a-zA-Z0-9-]+)|-([a-zA-Z0-9-]+))(( |=)(([\_\.\?\/\\a-zA-Z0-9]?[ -]?[\_\.\?a-zA-Z0-9]+)+))?(.*)|(.+)) ]]; then
    if [[ ${BASH_REMATCH[3]} ]]; then # for --option[=][attribute]
      opt=${BASH_REMATCH[3]}
      attr=${BASH_REMATCH[7]}
      options=${BASH_REMATCH[9]}
    elif [[ ${BASH_REMATCH[4]} ]]; then # for block options -qwert[=][attribute] or single short option -a[=][attribute]
      pile=${BASH_REMATCH[4]}
      while (( ${#pile} > 1 )); do
        opt=${pile:0:1}
        attr=""
        pile=${pile/${pile:0:1}/}
        parse_options
      done
      opt=$pile
      attr=${BASH_REMATCH[7]}
      options=${BASH_REMATCH[9]}
    else # leftovers that don't match
      opt=${BASH_REMATCH[10]}
      options=""
    fi
    parse_options
  fi
done

mauron85 ,Jun 21, 2017 at 6:03

Like this one. Maybe just add -e param to echo with new line. – mauron85 Jun 21 '17 at 6:03

John ,Oct 10, 2017 at 22:49

Assume we create a shell script named test_args.sh as follows:
#!/bin/bash
until [ $# -eq 0 ]
do
  name=${1:1}; shift;
  if [[ -z "$1" || $1 == -* ]] ; then eval "export $name=true"; else eval "export $name=$1"; shift; fi  
done
echo "year=$year month=$month day=$day flag=$flag"

After we run the following command:

bash test_args.sh -year 2017 -flag -month 12 -day 22

The output would be:

year=2017 month=12 day=22 flag=true

Will Barnwell ,Oct 10, 2017 at 23:57

This takes the same approach as Noah's answer , but has less safety checks / safeguards. This allows us to write arbitrary arguments into the script's environment and I'm pretty sure your use of eval here may allow command injection. – Will Barnwell Oct 10 '17 at 23:57
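To make that warning concrete: a hypothetical call like bash test_args.sh -day '22; touch /tmp/pwned' makes the eval execute the injected touch command. A safer sketch drops eval entirely, since export accepts a single name=value word:

export "$name=true"   # instead of: eval "export $name=true"
export "$name=$1"     # instead of: eval "export $name=$1"; the value is never re-parsed as code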

Masadow ,Oct 6, 2015 at 8:53

Here is my improved version of Bruno Bronosky's answer, using bash arrays.

It lets you mix parameter positions and gives you a parameter array that preserves the order, without the options.

#!/bin/bash

echo $@

PARAMS=()
SOFT=0
SKIP=()
for i in "$@"
do
case $i in
    -n=*|--skip=*)
    SKIP+=("${i#*=}")
    ;;
    -s|--soft)
    SOFT=1
    ;;
    *)
        # unknown option
        PARAMS+=("$i")
    ;;
esac
done
echo "SKIP            = ${SKIP[@]}"
echo "SOFT            = $SOFT"
    echo "Parameters:"
    echo ${PARAMS[@]}

Will output for example:

$ ./test.sh parameter -s somefile --skip=.c --skip=.obj
parameter -s somefile --skip=.c --skip=.obj
SKIP            = .c .obj
SOFT            = 1
Parameters:
parameter somefile

Jason S ,Dec 3, 2017 at 1:01

You use shift on the known arguments and not on the unknown ones so your remaining $@ will be all but the first two arguments (in the order they are passed in), which could lead to some mistakes if you try to use $@ later. You don't need the shift for the = parameters, since you're not handling spaces and you're getting the value with the substring removal #*=Jason S Dec 3 '17 at 1:01

Masadow ,Dec 5, 2017 at 9:17

You're right, in fact, since I build a PARAMS variable, I don't need to use shift at all – Masadow Dec 5 '17 at 9:17

[Jul 04, 2018] file permissions - Bash test if a directory is writable by a given UID - Stack Overflow

Jul 04, 2018 | stackoverflow.com


You can use sudo to execute the test in your script. For instance:
sudo -u mysql -H sh -c "if [ -w $directory ] ; then echo 'Eureka' ; fi"

To do this, the user executing the script will need sudo privileges of course.

If you explicitly need the uid instead of the username, you can also use:

sudo -u \#42 -H sh -c "if [ -w $directory ] ; then echo 'Eureka' ; fi"

In this case, 42 is the uid of the mysql user. Substitute your own value if needed.

UPDATE (to support non-sudo-privileged users)
Getting a bash script to change users without sudo requires the ability to suid ("set user id"). This, as pointed out by this answer , is a security restriction that requires a hack to work around. Check this blog for an example of "how to" work around it (I haven't tested/tried it, so I can't confirm its success).

My recommendation, if possible, would be to write a program in C that is given permission to suid (try chmod 4755 file-name ). Then, you can call setuid(#) from the C program to set the current user's id and either continue code execution from the C application, or have it execute a separate bash script that runs whatever commands you need/want. This is also a pretty hacky method, but as far as non-sudo alternatives go it's probably one of the easiest (in my opinion).


I've written a function can_user_write_to_file which will return 1 if the user passed to it either is the owner of the file/directory, or is member of a group which has write access to that file/directory. If not, the method returns 0 .
## Method which returns 1 if the user can write to the file or
## directory.
##
## $1 :: user name
## $2 :: file
function can_user_write_to_file() {
  if [[ $# -lt 2 || ! -r $2 ]]; then
    echo 0
    return
  fi

  local user_id=$(id -u ${1} 2>/dev/null)
  local file_owner_id=$(stat -c "%u" $2)
  if [[ ${user_id} == ${file_owner_id} ]]; then
    echo 1
    return
  fi

  local file_access=$(stat -c "%a" $2)
  local file_group_access=${file_access:1:1}
  local file_group_name=$(stat -c "%G" $2)
  local user_group_list=$(groups $1 2>/dev/null)

  if [ ${file_group_access} -ge 6 ]; then
    for el in ${user_group_list-nop}; do
      if [[ "${el}" == ${file_group_name} ]]; then
        echo 1
        return
      fi
    done
  fi

  echo 0
}

To test it, I wrote a wee test function:

function test_can_user_write_to_file() {
  echo "The file is: $(ls -l $2)"
  echo "User is:" $(groups $1 2>/dev/null)
  echo "User" $1 "can write to" $2 ":" $(can_user_write_to_file $1 $2)
  echo ""
}

test_can_user_write_to_file root /etc/fstab
test_can_user_write_to_file invaliduser /etc/motd
test_can_user_write_to_file torstein /home/torstein/.xsession
test_can_user_write_to_file torstein /tmp/file-with-only-group-write-access

At least from these tests, the method works as intended considering file ownership and group write access :-)


Because I had to make some changes to @chepner's answer in order to get it to work, I'm posting my ad-hoc script here for easy copy & paste. It's a minor refactoring only, and I have upvoted chepner's answer. I'll delete mine if the accepted answer is updated with these fixes. I have already left comments on that answer pointing out the things I had trouble with.

I wanted to do away with the Bashisms so that's why I'm not using arrays at all. The (( arithmetic evaluation )) is still a Bash-only feature, so I'm stuck on Bash after all.

for f; do
    set -- $(stat -Lc "0%a %G %U" "$f")
    (("$1" & 0002)) && continue
    if (("$1" & 0020)); then
        case " "$(groups "$USER")" " in *" "$2" "*) continue ;; esac
    elif (("$1" & 0200)); then
        [ "$3" = "$USER" ] && continue
    fi
    echo "$0: Wrong permissions" "$@" "$f" >&2
done

Without the comments, this is even fairly compact.
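A hypothetical run, assuming the loop is saved as check_write.sh and run by an unprivileged user (actual owners and modes vary by system):

$ bash check_write.sh /etc/fstab /tmp/scratch
check_write.sh: Wrong permissions 0644 root root /etc/fstab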

[Jul 02, 2018] How can I detect whether a symlink is broken in Bash - Stack Overflow

Jul 02, 2018 | stackoverflow.com

How can I detect whether a symlink is broken in Bash?


zoltanctoth ,Nov 8, 2011 at 10:39

I run find and iterate through the results with [ \( -L $F \) ] to collect certain symbolic links.

I am wondering if there is an easy way to determine if the link is broken (points to a non-existent file) in this scenario.

Here is my code:

FILES=`find /target/ | grep -v '\.disabled$' | sort`

for F in $FILES; do
    if [ -L $F ]; then
        DO THINGS
    fi
done

Roger ,Nov 8, 2011 at 10:45

# test if file exists (test actual file, not symbolic link)
if [ ! -e "$F" ] ; then
    # code if the symlink is broken
fi

Calimo ,Apr 18, 2017 at 19:50

Note that the code will also be executed if the file does not exist at all. It is fine with find but in other scenarios (such as globs) should be combined with -h to handle this case, for instance [ -h "$F" -a ! -e "$F" ] . – Calimo Apr 18 '17 at 19:50

Sridhar-Sarnobat ,Jul 13, 2017 at 22:36

You're not really testing the symbolic link with this approach. – Sridhar-Sarnobat Jul 13 '17 at 22:36

Melab ,Jul 24, 2017 at 15:22

@Calimo There is no difference. – Melab Jul 24 '17 at 15:22

Shawn Chin ,Nov 8, 2011 at 10:51

This should print out links that are broken:
find /target/dir -type l ! -exec test -e {} \; -print

You can also chain in operations to find command, e.g. deleting the broken link:

find /target/dir -type l ! -exec test -e {} \; -exec rm {} \;

Andrew Schulman ,Nov 8, 2011 at 10:43

readlink -q will fail silently if the link is bad:
for F in $FILES; do
    if [ -L $F ]; then
        if readlink -q $F >/dev/null ; then
            DO THINGS
        else
            echo "$F: bad link" >/dev/stderr
        fi
    fi
done

zoltanctoth ,Nov 8, 2011 at 10:55

this seems pretty nice as this only returns true if the file is actually a symlink. But even with adding -q, readlink outputs the name of the link on linux. If this is the case in general maybe the answer should be updated with 'readlink -q $F > /dev/null'. Or am I missing something? – zoltanctoth Nov 8 '11 at 10:55

Andrew Schulman ,Nov 8, 2011 at 11:02

No, you're right. Corrected, thanks. – Andrew Schulman Nov 8 '11 at 11:02

Chaim Geretz ,Mar 31, 2015 at 21:09

Which version? I don't see this behavior on my system readlink --version readlink (coreutils) 5.2.1 – Chaim Geretz Mar 31 '15 at 21:09

Aquarius Power ,May 4, 2014 at 23:46

this will work if the symlink was pointing to a file or a directory, but now is broken
if [[ -L "$strFile" ]] && [[ ! -a "$strFile" ]];then 
  echo "'$strFile' is a broken symlink"; 
fi

ACyclic ,May 24, 2014 at 13:02

This finds all files of type "link" that also resolve to type "link", i.e. broken symlinks:
find /target -type l -xtype l

cdelacroix ,Jun 23, 2015 at 12:59

variant: find -L /target -type lcdelacroix Jun 23 '15 at 12:59

Sridhar-Sarnobat ,Jul 13, 2017 at 22:38

Can't you have a symlink to a symlink that isn't broken? – Sridhar-Sarnobat Jul 13 '17 at 22:38

,

If you don't mind traversing non-broken dir symlinks, to find all orphaned links:
$ find -L /target -type l | while read -r file; do echo $file is orphaned; done

To find all files that are not orphaned links:

$ find -L /target ! -type l

[Jul 02, 2018] command line - How can I find broken symlinks

Mar 15, 2012 | unix.stackexchange.com

gabe, Mar 15, 2012 at 16:29

Is there a way to find all symbolic links that don't point anywhere?

find ./ -type l

will give me all symbolic links, but makes no distinction between links that go somewhere and links that don't.

I'm currently doing:

find ./ -type l -exec file {} \; |grep broken

But I'm wondering what alternate solutions exist.

rozcietrzewiacz ,May 15, 2012 at 7:01

I'd strongly suggest not to use find -L for the task (see below for explanation). Here are some other ways to do this:
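Judging from the comments below, the alternative this answer relies on is GNU find's -xtype test, which classifies a symlink by the type of its resolved target and therefore flags the broken ones:

find . -xtype l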

The find -L trick quoted by solo from commandlinefu looks nice and hacky, but it has one very dangerous pitfall : All the symlinks are followed. Consider directory with the contents presented below:

$ ls -l
total 0
lrwxrwxrwx 1 michal users  6 May 15 08:12 link_1 -> nonexistent1
lrwxrwxrwx 1 michal users  6 May 15 08:13 link_2 -> nonexistent2
lrwxrwxrwx 1 michal users  6 May 15 08:13 link_3 -> nonexistent3
lrwxrwxrwx 1 michal users  6 May 15 08:13 link_4 -> nonexistent4
lrwxrwxrwx 1 michal users 11 May 15 08:20 link_out -> /usr/share/

If you run find -L . -type l in that directory, all /usr/share/ would be searched as well (and that can take really long) 1 . For a find command that is "immune to outgoing links", don't use -L .


1 This may look like a minor inconvenience (the command will "just" take long to traverse all /usr/share ) – but can have more severe consequences. For instance, consider chroot environments: They can exist in some subdirectory of the main filesystem and contain symlinks to absolute locations. Those links could seem to be broken for the "outside" system, because they only point to proper places once you've entered the chroot. I also recall that some bootloader used symlinks under /boot that only made sense in an initial boot phase, when the boot partition was mounted as / .

So if you use a find -L command to find and then delete broken symlinks from some harmless-looking directory, you might even break your system...

quornian ,Nov 17, 2012 at 21:56

I think -type l is redundant since -xtype l will operate as -type l on non-links. So find -xtype l is probably all you need. Thanks for this approach. – quornian Nov 17 '12 at 21:56

qwertzguy ,Jan 8, 2015 at 21:37

Be aware that those solutions don't work for all filesystem types. For example it won't work for checking if /proc/XXX/exe link is broken. For this, use test -e "$(readlink /proc/XXX/exe)" . – qwertzguy Jan 8 '15 at 21:37

weakish ,Apr 8, 2016 at 4:57

@Flimm find . -xtype l means "find all symlinks whose (ultimate) target files are symlinks". But the ultimate target of a symlink cannot be a symlink, otherwise we could still follow the link and it would not be the ultimate target. Since there are no such symlinks, we can define them as something else, i.e. broken symlinks. – weakish Apr 8 '16 at 4:57

weakish ,Apr 22, 2016 at 12:19

@JoóÁdám "which can only be a symbolic link in case it is broken". Giving "broken symbolic link" or "non-existent file" an individual type, instead of overloading l , would be less confusing to me. – weakish Apr 22 '16 at 12:19

Alois Mahdal ,Jul 15, 2016 at 0:22

The warning at the end is useful, but note that this does not apply to the -L hack but rather to (blindly) removing broken symlinks in general. – Alois Mahdal Jul 15 '16 at 0:22

Sam Morris ,Mar 15, 2012 at 17:38

The symlinks command from http://www.ibiblio.org/pub/Linux/utils/file/symlinks-1.4.tar.gz can be used to identify symlinks with a variety of characteristics. For instance:
$ rm a
$ ln -s a b
$ symlinks .
dangling: /tmp/b -> a

qed ,Jul 27, 2014 at 20:32

Is this tool available for osx? – qed Jul 27 '14 at 20:32

qed ,Jul 27, 2014 at 20:51

Never mind, got it compiled. – qed Jul 27 '14 at 20:51

Daniel Jonsson ,Apr 11, 2015 at 22:11

Apparently symlinks is pre-installed on Fedora. – Daniel Jonsson Apr 11 '15 at 22:11

pooryorick ,Sep 29, 2012 at 14:02

As rozcietrzewiacz has already commented, find -L can have the unexpected consequence of expanding the search into symlinked directories, so it isn't the optimal approach. What no one has mentioned yet is that
find /path/to/search -xtype l

is the more concise, and logically identical command to

find /path/to/search -type l -xtype l

None of the solutions presented so far will detect cyclic symlinks, which is another type of breakage. This question addresses portability. To summarize, the portable way to find broken symbolic links, including cyclic links, is:

find /path/to/search -type l -exec test ! -e {} \; -print

For more details, see this question or ynform.org . Of course, the definitive source for all this is the findutils documentation .
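For instance, a hedged sketch of how this portable idiom could be extended to actually remove the broken links it finds (review the dry-run output first; rm is irreversible):

find /path/to/search -type l -exec test ! -e {} \; -print              # dry run: list the broken links
find /path/to/search -type l -exec test ! -e {} \; -exec rm -- {} +    # then remove them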

Flimm ,Oct 7, 2014 at 13:00

Short, concise, and addresses the find -L pitfall as well as cyclical links. +1 – Flimm Oct 7 '14 at 13:00

neu242 ,Aug 1, 2016 at 10:03

Nice. The last one works on MacOSX as well, while @rozcietrzewiacz's answer didn't. – neu242 Aug 1 '16 at 10:03

kwarrick ,Mar 15, 2012 at 16:52

I believe adding the -L flag to your command will allow you to get rid of the grep:
$ find -L . -type l

http://www.commandlinefu.com/commands/view/8260/find-broken-symlinks

from the man:

 -L      Cause the file information and file type (see stat(2)) returned 
         for each symbolic link to be those of the file referenced by the
         link, not the link itself. If the referenced file does not exist,
         the file information and type will be for the link itself.

rozcietrzewiacz ,May 15, 2012 at 7:37

At first I've upvoted this, but then I've realised how dangerous it may be. Before you use it, please have a look at my answer ! – rozcietrzewiacz May 15 '12 at 7:37

andy ,Dec 26, 2012 at 6:56

If you need different behavior depending on whether the link is broken or cyclic, you can also use %Y with find:
$ touch a
$ ln -s a b  # link to existing target
$ ln -s c d  # link to non-existing target
$ ln -s e e  # link to itself
$ find . -type l -exec test ! -e {} \; -printf '%Y %p\n' \
   | while read type link; do
         case "$type" in
         N) echo "do something with broken link $link" ;;
         L) echo "do something with cyclic link $link" ;;
         esac
      done
do something with broken link ./d
do something with cyclic link ./e

This example is copied from this post (site deleted) .


syntaxerror ,Jun 25, 2015 at 0:28

Yet another shorthand for those whose find command does not support xtype can be derived from this: find . -type l -printf "%Y %p\n" | grep -w '^N' . As andy beat me to it with the same (basic) idea in his script, I was reluctant to write it as a separate answer. :) – syntaxerror Jun 25 '15 at 0:28

Alex ,Apr 30, 2013 at 6:37

find -L . -type l | xargs symlinks will give you info on whether the link exists or not, on a per-file basis.

conradkdotcom ,Oct 24, 2014 at 14:33

This will print out the names of broken symlinks in the current directory.
for l in $(find . -type l); do
    cd "$(dirname "$l")"
    if [ ! -e "$(readlink "$(basename "$l")")" ]; then
        echo "$l"
    fi
    cd - > /dev/null
done

Works in Bash. Don't know about other shells.

Iskren ,Aug 8, 2015 at 14:01

I use this for my case and it works quite well, as I know the directory to look for broken symlinks:
find -L $path -maxdepth 1 -type l

and my folder does include a link to /usr/share but it doesn't traverse it. Cross-device links and those that are valid for chroots, etc. are still a pitfall but for my use case it's sufficient.

,

Simple no-brainer answer, which is a variation on OP's version. Sometimes, you just want something easy to type or remember:
find . | xargs file | grep -i "broken symbolic link"

[Jul 02, 2018] Explanation of % directives in find -printf

Jul 02, 2018 | unix.stackexchange.com

san1512 ,Jul 11, 2015 at 6:24

find /tmp -printf '%s %p\n' |sort -n -r | head

This command is working fine but what are the %s %p options used here? Are there any other options that can be used?

Cyrus ,Jul 11, 2015 at 6:41

Take a look at find's manpage. – Cyrus Jul 11 '15 at 6:41

phuclv ,Oct 9, 2017 at 3:13

possible duplicate of Where to find printf formatting reference? – phuclv Oct 9 '17 at 3:13

Hennes ,Jul 11, 2015 at 6:34

What are the %s %p options used here?

From the man page :

%s File's size in bytes.

%p File's name.

Scroll down on that page, past all the regular printf letters, and read the entries that come prefixed with a %.

%n Number of hard links to file.

%p File's name.

%P File's name with the name of the starting-point under which it was found removed.

%s File's size in bytes.

%t File's last modification time in the format returned by the C `ctime' function.

Are there any other options that can be used?

There are. See the link to the manpage.
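For instance, a small illustrative combination of several of these directives (the path here is just an example):

# Size, modification time, and name relative to the starting point, largest first:
find /var/log -type f -printf '%s\t%t\t%P\n' | sort -nr | head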

Kusalananda ,Nov 17, 2017 at 9:53

@don_crissti I'll never understand why people prefer random web documentation to the documentation installed on their systems (which has the added benefit of actually being relevant to their system). – Kusalananda Nov 17 '17 at 9:53

don_crissti ,Nov 17, 2017 at 12:52

@Kusalananda - Well, I can think of one scenario in which people would include a link to a web page instead of a quote from the documentation installed on their system: they're not on a linux machine at the time of writing the post... However, the link should point (imo) to the official docs (hence my comment above, which, for some unknown reason, was deleted by the mods...). That aside, I fully agree with you: the OP should consult the manual page installed on their system. – don_crissti Nov 17 '17 at 12:52

runlevel0 ,Feb 15 at 12:10

@don_crissti Or they are on a server that has no manpages installed which is rather frequent. – runlevel0 Feb 15 at 12:10

Hennes ,Feb 16 at 16:16

My manual pages tend to be from FreeBSD though. Unless I happen to have a Linux VM within reach. And I have the impression that most questions are GNU/Linux based. – Hennes Feb 16 at 16:16

[Jun 23, 2018] Bash script processing limited number of commands in parallel

Jun 23, 2018 | stackoverflow.com

AL-Kateb ,Oct 23, 2013 at 13:33

I have a bash script that looks like this:
#!/bin/bash
wget LINK1 >/dev/null 2>&1
wget LINK2 >/dev/null 2>&1
wget LINK3 >/dev/null 2>&1
wget LINK4 >/dev/null 2>&1
# ..
# ..
wget LINK4000 >/dev/null 2>&1

But processing each line until its command is finished, then moving on to the next one, is very time consuming. I want to process, for instance, 20 lines at once, and when they're finished, process the next 20 lines.

I thought of wget LINK1 >/dev/null 2>&1 & to send the command to the background and carry on, but there are 4000 lines here, which means I will have performance issues, not to mention being limited in how many processes I should start at the same time, so this is not a good idea.

One solution that I'm thinking of right now is checking whether one of the commands is still running or not, for instance after 20 lines I can add this loop:

while [  $(ps -ef | grep KEYWORD | grep -v grep | wc -l) -gt 0 ]; do
sleep 1
done

Of course in this case I will need to append & to the end of the line! But I'm feeling this is not the right way to do it.

So how do I actually group each 20 lines together and wait for them to finish before going to the next 20 lines? This script is dynamically generated, so I can do whatever math I want on it while it's being generated, but it DOES NOT have to use wget; it was just an example, so any wget-specific solution is not going to do me any good.

kojiro ,Oct 23, 2013 at 13:46

wait is the right answer here, but your while [ $(ps would be much better written while pkill -0 $KEYWORD – using proctools that is, for legitimate reasons to check if a process with a specific name is still running. – kojiro Oct 23 '13 at 13:46

VasyaNovikov ,Jan 11 at 19:01

I think this question should be re-opened. The "possible duplicate" QA is all about running a finite number of programs in parallel. Like 2-3 commands. This question, however, is focused on running commands in e.g. a loop. (see "but there are 4000 lines"). – VasyaNovikov Jan 11 at 19:01

robinCTS ,Jan 11 at 23:08

@VasyaNovikov Have you read all the answers to both this question and the duplicate? Every single answer to this question here, can also be found in the answers to the duplicate question. That is precisely the definition of a duplicate question. It makes absolutely no difference whether or not you are running the commands in a loop. – robinCTS Jan 11 at 23:08

VasyaNovikov ,Jan 12 at 4:09

@robinCTS there are intersections, but questions themselves are different. Also, 6 of the most popular answers on the linked QA deal with 2 processes only. – VasyaNovikov Jan 12 at 4:09

Dan Nissenbaum ,Apr 20 at 15:35

I recommend reopening this question because its answer is clearer, cleaner, better, and much more highly upvoted than the answer at the linked question, though it is three years more recent. – Dan Nissenbaum Apr 20 at 15:35

devnull ,Oct 23, 2013 at 13:35

Use the wait built-in:
process1 &
process2 &
process3 &
process4 &
wait
process5 &
process6 &
process7 &
process8 &
wait

For the above example, 4 processes process1 .. process4 would be started in the background, and the shell would wait until those are completed before starting the next set.

From the manual :

wait [jobspec or pid ...]

Wait until the child process specified by each process ID pid or job specification jobspec exits and return the exit status of the last command waited for. If a job spec is given, all processes in the job are waited for. If no arguments are given, all currently active child processes are waited for, and the return status is zero. If neither jobspec nor pid specifies an active child process of the shell, the return status is 127.
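A minimal sketch of the same batching idea applied to the original loop, assuming the 4000 URLs have been put into a links.txt file (one per line; the file name and batch size are assumptions):

#!/bin/bash
batch=20
i=0
while IFS= read -r url; do
    wget "$url" >/dev/null 2>&1 &
    (( ++i % batch == 0 )) && wait   # block until the whole batch of 20 has finished
done < links.txt
wait                                 # catch the final, possibly partial batch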

kojiro ,Oct 23, 2013 at 13:48

So basically i=0; waitevery=4; for link in "${links[@]}"; do wget "$link" & (( i++%waitevery==0 )) && wait; done >/dev/null 2>&1kojiro Oct 23 '13 at 13:48

rsaw ,Jul 18, 2014 at 17:26

Unless you're sure that each process will finish at the exact same time, this is a bad idea. You need to start up new jobs to keep the current total jobs at a certain cap .... parallel is the answer. – rsaw Jul 18 '14 at 17:26

DomainsFeatured ,Sep 13, 2016 at 22:55

Is there a way to do this in a loop? – DomainsFeatured Sep 13 '16 at 22:55

Bobby ,Apr 27, 2017 at 7:55

I've tried this but it seems that variable assignments done in one block are not available in the next block. Is this because they are separate processes? Is there a way to communicate the variables back to the main process? – Bobby Apr 27 '17 at 7:55

choroba ,Oct 23, 2013 at 13:38

See parallel . Its syntax is similar to xargs , but it runs the commands in parallel.

chepner ,Oct 23, 2013 at 14:35

This is better than using wait , since it takes care of starting new jobs as old ones complete, instead of waiting for an entire batch to finish before starting the next. – chepner Oct 23 '13 at 14:35

Mr. Llama ,Aug 13, 2015 at 19:30

For example, if you have the list of links in a file, you can do cat list_of_links.txt | parallel -j 4 wget {} which will keep four wget s running at a time. – Mr. Llama Aug 13 '15 at 19:30

0x004D44 ,Nov 2, 2015 at 21:42

There is a new kid in town called pexec which is a replacement for parallel . – 0x004D44 Nov 2 '15 at 21:42

mat ,Mar 1, 2016 at 21:04

Not to be picky, but xargs can also parallelize commands. – mat Mar 1 '16 at 21:04

Vader B ,Jun 27, 2016 at 6:41

In fact, xargs can run commands in parallel for you. There is a special -P max_procs command-line option for that. See man xargs .
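For example, a hedged one-liner along those lines, again assuming a links.txt file with one URL per line:

xargs -n 1 -P 4 wget -q < links.txt   # keep at most four wget processes running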


You can run 20 processes and use the command:
wait

Your script will wait and continue when all your background jobs are finished.

[Jun 23, 2018] parallelism - correct xargs parallel usage

Jun 23, 2018 | unix.stackexchange.com

Yan Zhu ,Apr 19, 2015 at 6:59

I am using xargs to call a python script to process about 30 million small files. I hope to use xargs to parallelize the process. The command I am using is:
find ./data -name "*.json" -print0 |
  xargs -0 -I{} -P 40 python Convert.py {} > log.txt

Basically, Convert.py will read in a small json file (4kb), do some processing and write to another 4kb file. I am running on a server with 40 CPU cores. And no other CPU-intense process is running on this server.

By monitoring htop (btw, is there any other good way to monitor CPU performance?), I find that -P 40 is not as fast as expected. Sometimes all cores freeze and CPU usage drops almost to zero for 3-4 seconds, then recovers to 60-70%. Then I tried to decrease the number of parallel processes to -P 20-30 , but it's still not very fast. The ideal behavior should be linear speed-up. Any suggestions for the parallel usage of xargs ?

Ole Tange ,Apr 19, 2015 at 8:45

You are most likely hit by I/O: The system cannot read the files fast enough. Try starting more than 40: This way it will be fine if some of the processes have to wait for I/O. – Ole Tange Apr 19 '15 at 8:45

Fox ,Apr 19, 2015 at 10:30

What kind of processing does the script do? Any database/network/io involved? How long does it run? – Fox Apr 19 '15 at 10:30

PSkocik ,Apr 19, 2015 at 11:41

I second @OleTange. That is the expected behavior if you run as many processes as you have cores and your tasks are IO bound. First the cores will wait on IO for their task (sleep), then they will process, and then repeat. If you add more processes, then the additional processes that currently aren't running on a physical core will have kicked off parallel IO operations, which will, when finished, eliminate or at least reduce the sleep periods on your cores. – PSkocik Apr 19 '15 at 11:41

Bichoy ,Apr 20, 2015 at 3:32

1- Do you have hyperthreading enabled? 2- in what you have up there, log.txt is actually overwritten with each call to convert.py ... not sure if this is the intended behavior or not. – Bichoy Apr 20 '15 at 3:32

Ole Tange ,May 11, 2015 at 18:38

xargs -P combined with > opens you up to race conditions because of the half-line problem ( gnu.org/software/parallel/ ). Using GNU Parallel instead will not have that problem. – Ole Tange May 11 '15 at 18:38

James Scriven ,Apr 24, 2015 at 18:00

I'd be willing to bet that your problem is python . You didn't say what kind of processing is being done on each file, but assuming you are just doing in-memory processing of the data, the running time will be dominated by starting up 30 million python virtual machines (interpreters).

If you can restructure your python program to take a list of files, instead of just one, you will get a huge improvement in performance. You can then still use xargs to further improve performance. For example, 40 processes, each processing 1000 files:

find ./data -name "*.json" -print0 |
  xargs -0 -L1000 -P 40 python Convert.py

This isn't to say that python is a bad/slow language; it's just not optimized for startup time. You'll see this with any virtual machine-based or interpreted language. Java, for example, would be even worse. If your program was written in C, there would still be a cost of starting a separate operating system process to handle each file, but it would be much less.

From there you can fiddle with -P to see if you can squeeze out a bit more speed, perhaps by increasing the number of processes to take advantage of idle processors while data is being read/written.

Stephen ,Apr 24, 2015 at 13:03

So firstly, consider the constraints:

What is the constraint on each job? If it's I/O you can probably get away with multiple jobs per CPU core up till you hit the limit of I/O, but if it's CPU intensive, it's going to be worse than pointless running more jobs concurrently than you have CPU cores.

My understanding of these things is that GNU Parallel would give you better control over the queue of jobs etc.

See GNU parallel vs & (I mean background) vs xargs -P for a more detailed explanation of how the two differ.


As others said, check whether you're I/O-bound. Also, the xargs man page suggests using -n with -P ; you don't mention the number of Convert.py processes you see running in parallel.

As a suggestion, if you're I/O-bound, you might try using an SSD block device, or try doing the processing in a tmpfs (of course, in this case you should check for enough memory, avoiding swap due to tmpfs pressure (I think), and the overhead of copying the data to it in the first place).
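A sketch of the tmpfs idea (the mount point and size are assumptions; this needs enough free RAM, and results must be copied back out before unmounting):

sudo mkdir -p /mnt/scratch
sudo mount -t tmpfs -o size=8G tmpfs /mnt/scratch
cp -a ./data /mnt/scratch/    # process the copy there, then save results elsewhere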

[Jun 23, 2018] Linux/Bash, how to schedule commands in a FIFO queue?

Jun 23, 2018 | superuser.com

Andrei ,Apr 10, 2013 at 14:26

I want the ability to schedule commands to be run in a FIFO queue. I DON'T want them to be run at a specified time in the future as would be the case with the "at" command. I want them to start running now, but not simultaneously. The next scheduled command in the queue should be run only after the first command finishes executing. Alternatively, it would be nice if I could specify a maximum number of commands from the queue that could be run simultaneously; for example if the maximum number of simultaneous commands is 2, then only at most 2 commands scheduled in the queue would be taken from the queue in a FIFO manner to be executed, the next command in the remaining queue being started only when one of the currently 2 running commands finishes.

I've heard task-spooler could do something like this, but this package doesn't appear to be well supported/tested and is not in the Ubuntu standard repositories (Ubuntu being what I'm using). If that's the best alternative then let me know and I'll use task-spooler, otherwise, I'm interested to find out what's the best, easiest, most tested, bug-free, canonical way to do such a thing with bash.

UPDATE:

Simple solutions like ; or && from bash do not work. I need to schedule these commands from an external program, when an event occurs. I just don't want to have hundreds of instances of my command running simultaneously, hence the need for a queue. There's an external program that will trigger events where I can run my own commands. I want to handle ALL triggered events, I don't want to miss any event, but I also don't want my system to crash, so that's why I want a queue to handle my commands triggered from the external program.

Andrei ,Apr 11, 2013 at 11:40

Task Spooler:

http://vicerveza.homeunix.net/~viric/soft/ts/

https://launchpad.net/ubuntu/+source/task-spooler/0.7.3-1

Does the trick very well. Hopefully it will be included in Ubuntu's package repos.

Hennes ,Apr 10, 2013 at 15:00

Use ;

For example:
ls ; touch test ; ls

That will list the directory. Only after ls has run will it run touch test , which creates a file named test. And only after that has finished will it run the next command (in this case another ls , which will show the old contents and the newly created file).

Similar commands are || and && .

; will always run the next command.

&& will only run the next command if the first returned success.
Example: rm -rf *.mp3 && echo "Success! All MP3s deleted!"

|| will only run the next command if the first command returned a failure (non-zero) return value. Example: rm -rf *.mp3 || echo "Error! Some files could not be deleted! Check permissions!"

If you want to run a command in the background, append an ampersand ( & ).
Example:
make bzimage &
mp3blaster sound.mp3
make mytestsoftware ; ls ; firefox ; make clean

This will run two commands in the background (in this case a kernel build, which will take some time, and a program to play some music). In the foreground it runs another compile job and, once that is finished, ls , firefox and make clean (all sequentially).

For more details, see man bash


[Edit after comment]

in pseudo code, something like this?

Program run_queue:

While(true)
{
   Wait_for_a_signal();

   While( queue not empty )
   {
       run next command from the queue.
       remove this command from the queue.
       // If commands were added to the queue during execution then
       // the queue is not empty, keep processing them all.
   }
   // Queue is now empty, returning to wait_for_a_signal
}
// 
// Wait forever on commands and add them to a queue
// Signal run_queue when something gets added.
//
program add_to_queue()
{
   While(true)
   {
       Wait_for_event();
       Append command to queue
       signal run_queue
   }    
}
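A minimal bash rendering of that pseudocode, using a named pipe as the queue (the path and the one-command-per-line protocol are assumptions; concurrent producers writing long lines would need extra locking):

#!/bin/bash
queue=/tmp/cmdqueue
[ -p "$queue" ] || mkfifo "$queue"

while true; do
    # Opening the FIFO blocks until a producer writes to it.
    while IFS= read -r cmd; do
        bash -c "$cmd"     # the next command starts only after this one exits
    done < "$queue"
done

A producer would then enqueue work with something like echo 'some_command arg' > /tmp/cmdqueue .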

terdon ,Apr 10, 2013 at 15:03

The easiest way would be to simply run the commands sequentially:
cmd1; cmd2; cmd3; cmdN

If you want the next command to run only if the previous command exited successfully, use && :

cmd1 && cmd2 && cmd3 && cmdN

That is the only bash native way I know of doing what you want. If you need job control (setting a number of parallel jobs etc), you could try installing a queue manager such as TORQUE but that seems like overkill if all you want to do is launch jobs sequentially.

psusi ,Apr 10, 2013 at 15:24

You are looking for at 's twin brother: batch . It uses the same daemon but instead of scheduling a specific time, the jobs are queued and will be run whenever the system load average is low.

mpy ,Apr 10, 2013 at 14:59

Apart from dedicated queuing systems (like the Sun Grid Engine ) which you can also use locally on one machine and which offer dozens of possibilities, you can use something like
 command1 && command2 && command3

which is the other extreme -- a very simple approach. The latter neither provides multiple simultaneous processes nor gradual filling of the "queue".

Bogdan Dumitru ,May 3, 2016 at 10:12

I went on the same route searching, trying out task-spooler and so on. The best of the best is this:

GNU Parallel --semaphore --fg . It also has -j for parallel jobs.
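For the record, a hedged example of that semaphore-as-queue usage (the --id name is arbitrary):

parallel --semaphore --id myqueue -j1 long_running_command arg1   # returns immediately, job is queued
parallel --semaphore --id myqueue -j1 another_command             # queued behind the first
parallel --semaphore --id myqueue --wait                          # block until the queue drains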

[Jun 20, 2018] bash - sudo as another user with their environment

Using strace is an interesting debugging tip
Jun 20, 2018 | unix.stackexchange.com

user80551 ,Jan 2, 2015 at 4:29

$ whoami
admin
$ sudo -S -u otheruser whoami
otheruser
$ sudo -S -u otheruser /bin/bash -l -c 'echo $HOME'
/home/admin

Why isn't $HOME being set to /home/otheruser even though bash is invoked as a login shell?

Specifically, /home/otheruser/.bashrc isn't being sourced. Also, /home/otheruser/.profile isn't being sourced. - ( /home/otheruser/.bash_profile doesn't exist)

EDIT: The exact problem is actually https://stackoverflow.com/questions/27738224/mkvirtualenv-with-fabric-as-another-user-fails

Pavel Šimerda ,Jan 2, 2015 at 8:29

A solution to this question will solve the other question as well, you might want to delete the other question in this situation. – Pavel Šimerda Jan 2 '15 at 8:29

Pavel Šimerda ,Jan 2, 2015 at 8:27

To invoke a login shell using sudo just use -i . When a command is not specified you'll get a login shell prompt, otherwise you'll get the output of your command.

Example (login shell):

sudo -i

Example (with a specified user):

sudo -i -u user

Example (with a command):

sudo -i -u user whoami

Example (print user's $HOME ):

sudo -i -u user echo \$HOME

Note: The backslash character ensures that the dollar sign reaches the target user's shell and is not interpreted in the calling user's shell.

I have just checked the last example with strace which tells you exactly what's happening. The output below shows that the shell is being called with --login and with the specified command, just as in your explicit call to bash, but in addition sudo can do its own work like setting $HOME .

# strace -f -e process sudo -S -i -u user echo \$HOME
execve("/usr/bin/sudo", ["sudo", "-S", "-i", "-u", "user", "echo", "$HOME"], [/* 42 vars */]) = 0
...
[pid 12270] execve("/bin/bash", ["-bash", "--login", "-c", "echo \\$HOME"], [/* 16 vars */]) = 0
...

I noticed that you are using -S and I don't think it is generally a good technique. If you want to run commands as a different user without performing authentication from the keyboard, you might want to use SSH instead. It works for localhost as well as for other hosts and provides public key authentication that works without any interactive input.

ssh user@localhost echo \$HOME

Note: You don't need any special options with SSH as the SSH server always creates a login shell to be accessed by the SSH client.

John_West ,Nov 23, 2015 at 11:12

sudo -i -u user echo \$HOME doesn't work for me. Output: $HOME . strace gives the same output as yours. What's the issue? – John_West Nov 23 '15 at 11:12

Pavel Šimerda ,Jan 20, 2016 at 19:02

No idea, it still works for me, I'd need to see it or maybe even touch the system. – Pavel Šimerda Jan 20 '16 at 19:02

Jeff Snider ,Jan 2, 2015 at 8:04

You're giving Bash too much credit. All "login shell" means to Bash is what files are sourced at startup and shutdown. The $HOME variable doesn't figure into it.

The Bash docs explain some more what login shell means: https://www.gnu.org/software/bash/manual/html_node/Bash-Startup-Files.html#Bash-Startup-Files

In fact, Bash doesn't do anything to set $HOME at all. $HOME is set by whatever invokes the shell (login, ssh, etc.), and the shell inherits it. Whatever started your shell as admin set $HOME and then exec-ed bash ; sudo by design doesn't alter the environment unless asked or configured to do so, so bash as otheruser inherited it from your shell.

If you want sudo to handle more of the environment in the way you're expecting, look at the -i switch for sudo. Try:

sudo -S -u otheruser -i /bin/bash -l -c 'echo $HOME'

The man page for sudo describes it in more detail, though not really well, I think: http://linux.die.net/man/8/sudo

user80551 ,Jan 2, 2015 at 8:11

$HOME isn't set by bash - Thanks, I didn't know that. – user80551 Jan 2 '15 at 8:11

Pavel Šimerda ,Jan 2, 2015 at 9:46

Look for strace in my answer. It shows that you don't need to build /bin/bash -l -c 'echo $HOME' command line yourself when using -i .

palswim ,Oct 13, 2016 at 20:21

That sudo syntax threw an error on my machine. ( su uses the -c option, but I don't think sudo does.) I had better luck with: HomeDir=$( sudo -u "$1" -H -s echo "\$HOME" )palswim Oct 13 '16 at 20:21

[Jun 20, 2018] What are the differences between su, sudo -s, sudo -i, sudo su

Notable quotes:
"... (which means "substitute user" or "switch user") ..."
"... (hmm... what's the mnemonic? Super-User-DO?) ..."
"... The official meaning of "su" is "substitute user" ..."
"... Interestingly, Ubuntu's manpage does not mention "substitute" at all. The manpage at gnu.org ( gnu.org/software/coreutils/manual/html_node/su-invocation.html ) does indeed say "su: Run a command with substitute user and group ID". ..."
"... sudo -s runs a [specified] shell with root privileges. sudo -i also acquires the root user's environment. ..."
"... To see the difference between su and sudo -s , do cd ~ and then pwd after each of them. In the first case, you'll be in root's home directory, because you're root. In the second case, you'll be in your own home directory, because you're yourself with root privileges. There's more discussion of this exact question here . ..."
"... I noticed sudo -s doesnt seem to process /etc/profile ..."
Jun 20, 2018 | askubuntu.com

Sergey ,Oct 22, 2011 at 7:21

The main difference between these commands is in the way they restrict access to their functions.

su (which means "substitute user" or "switch user") - does exactly that, it starts another shell instance with privileges of the target user. To ensure you have the rights to do that, it asks you for the password of the target user . So, to become root, you need to know root password. If there are several users on your machine who need to run commands as root, they all need to know root password - note that it'll be the same password. If you need to revoke admin permissions from one of the users, you need to change root password and tell it only to those people who need to keep access - messy.

sudo (hmm... what's the mnemonic? Super-User-DO?) is completely different. It uses a config file (/etc/sudoers) which lists which users have rights to specific actions (run commands as root, etc.) When invoked, it asks for the password of the user who started it - to ensure the person at the terminal is really the same "joe" who's listed in /etc/sudoers . To revoke admin privileges from a person, you just need to edit the config file (or remove the user from a group which is listed in that config). This results in much cleaner management of privileges.

As a result of this, in many Debian-based systems root user has no password set - i.e. it's not possible to login as root directly.

Also, /etc/sudoers allows you to specify some additional options - e.g. user X is only able to run program Y, etc.

The often-used sudo su combination works as follows: first sudo asks you for your password, and, if you're allowed to do so, invokes the next command ( su ) as a super-user. Because su is invoked by root, su itself does not ask for any password; the net effect is that you enter your own password instead of root's.

So, sudo su allows you to open a shell as another user (including root), if you're allowed super-user access by the /etc/sudoers file.

dr jimbob ,Oct 22, 2011 at 13:47

I've never seen su as "switch user", but always as superuser; the default behavior without another's user name (though it makes sense). From wikipedia : "The su command, also referred to as super user[1] as early as 1974, has also been called "substitute user", "spoof user" or "set user" because it allows changing the account associated with the current terminal (window)."

Sergey ,Oct 22, 2011 at 20:33

@dr jimbob: you're right, but I find that "switch user" better describes what it does - though historically it stands for "super user". I'm also delighted to find that the wikipedia article is very similar to my answer - I never saw the article before :)

Angel O'Sphere ,Nov 26, 2013 at 13:02

The official meaning of "su" is "substitute user". See: "man su". – Angel O'Sphere Nov 26 '13 at 13:02

Sergey ,Nov 26, 2013 at 20:25

@AngelO'Sphere: Interestingly, Ubuntu's manpage does not mention "substitute" at all. The manpage at gnu.org ( gnu.org/software/coreutils/manual/html_node/su-invocation.html ) does indeed say "su: Run a command with substitute user and group ID". I think gnu.org is a canonical source :) – Sergey Nov 26 '13 at 20:25

Mike Scott ,Oct 22, 2011 at 6:28

sudo lets you run commands in your own user account with root privileges. su lets you switch user so that you're actually logged in as root.

sudo -s runs a [specified] shell with root privileges. sudo -i also acquires the root user's environment.

To see the difference between su and sudo -s , do cd ~ and then pwd after each of them. In the first case, you'll be in root's home directory, because you're root. In the second case, you'll be in your own home directory, because you're yourself with root privileges. There's more discussion of this exact question here .

Sergey ,Oct 22, 2011 at 7:28

"you're yourself with root privileges" is not what's actually happening :) Actually, it's not possible to be "yourself with root privileges" - either you're root or you're yourself. Try typing whoami in both cases. The fact that cd ~ results are different is a result of sudo -s not setting $HOME environment variable. – Sergey Oct 22 '11 at 7:28

Octopus ,Feb 6, 2015 at 22:15

@Sergey, whoami says 'root' because you are running the whoami command as though you sudoed it, so temporarily (for the duration of that command) you appear to be the root user, but you might still not have full root access according to the sudoers file. – Octopus Feb 6 '15 at 22:15

Sergey ,Feb 6, 2015 at 22:24

@Octopus: what I was trying to say is that in Unix, a process can only have one UID, and that UID determines the permissions of the process. You can't be "yourself with root privileges", a program either runs with your UID or with root's UID (0). – Sergey Feb 6 '15 at 22:24

Sergey ,Feb 6, 2015 at 22:32

Regarding "you might still not have full root access according to the sudoers file": the sudoers file controls who can run which command as another user, but that happens before the command is executed. However, once you were allowed to start a process as, say, root -- the running process has root's UID and has a full access to the system, there's no way for sudo to restrict that.

Again, you're always either yourself or root, there's no "half-n-half". So, if sudoers file allows you to run shell as root -- permissions in that shell would be indistinguishable from a "normal" root shell. – Sergey Feb 6 '15 at 22:32

dotancohen ,Nov 8, 2014 at 14:07

This answer is a dupe of my answer on a dupe of this question , put here on the canonical answer so that people can find it!

The major difference between sudo -i and sudo -s is:

sudo -i gives you the root environment, i.e. your own ~/.bashrc is ignored.
sudo -s gives you the user's environment, so your own ~/.bashrc is respected.

Here is an example: you can see that I have an application lsl in my ~/.bin/ directory which is accessible via sudo -s but not accessible with sudo -i . Note also that the Bash prompt changes as well with sudo -i but not with sudo -s :

dotancohen@melancholy:~$ ls .bin
lsl

dotancohen@melancholy:~$ which lsl
/home/dotancohen/.bin/lsl

dotancohen@melancholy:~$ sudo -i

root@melancholy:~# which lsl

root@melancholy:~# exit
logout

dotancohen@melancholy:~$ sudo -s
Sourced .bashrc

dotancohen@melancholy:~$ which lsl
/home/dotancohen/.bin/lsl

dotancohen@melancholy:~$ exit
exit

Though sudo -s is convenient for giving you the environment that you are familiar with, I recommend the use of sudo -i for two reasons:

  1. The visual reminder that you are in a 'root' session.
  2. The root environment is far less likely to be poisoned with malware, such as a rogue line in .bashrc .

meffect ,Feb 23, 2017 at 5:21

I noticed sudo -s doesn't seem to process /etc/profile , or anything I have in /etc/profile.d/ ... any idea why? – meffect Feb 23 '17 at 5:21

Marius Gedminas ,Oct 22, 2011 at 19:38

su asks for the password of the user "root".

sudo asks for your own password (and also checks if you're allowed to run commands as root, which is configured through /etc/sudoers -- by default all user accounts that belong to the "admin" group are allowed to use sudo).

sudo -s launches a shell as root, but doesn't change your working directory. sudo -i simulates a login into the root account: your working directory will be /root , and root's .profile etc. will be sourced as if on login.
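A quick way to see the difference for yourself (assuming a sudo-enabled account):

$ cd /tmp
$ sudo -s pwd    # root shell, but your working directory and environment are kept
/tmp
$ sudo -i pwd    # simulated root login, starts in root's home
/root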

DJCrashdummy ,Jul 29, 2017 at 0:58

to make the answer more complete: sudo -s is almost equal to su ($HOME is different) and sudo -i is equal to su -
In Ubuntu or a related system, I don't find much use for su in the traditional, super-user sense. sudo handles that case much better. However, su is great for becoming another user in one-off situations where configuring sudoers would be silly.

For example, if I'm repairing my system from a live CD/USB, I'll often mount my hard drive and other necessary stuff and chroot into the system. In such a case, my first command is generally:

su - myuser  # Note the '-'. It means to act as if that user had just logged in.

That way, I'm operating not as root, but as my normal user, and I then use sudo as appropriate.

[Jun 20, 2018] permission - allow sudo to another user without password

Jun 20, 2018 | apple.stackexchange.com


zio ,Feb 17, 2013 at 13:12

I want to be able to 'su' to a specific user, allowing me to run any command without a password being entered.

For example:

If my login were user1 and the user I want to 'su' to is user2:

I would use the command:

su - user2

but then it prompts me with

Password:

Global nomad ,Feb 17, 2013 at 13:17

Ask the other user for the password. At least the other user knows what's been done under his/her id. – Global nomad Feb 17 '13 at 13:17

zio ,Feb 17, 2013 at 13:24

This is nothing to do with another physical user. Both ID's are mine. I know the password as I created the account. I just don't want to have to type the password every time. – zio Feb 17 '13 at 13:24

bmike ♦ ,Feb 17, 2013 at 15:32

Would it be OK to ssh to that user, or do you need to inherit one shell in particular and need su to work? – bmike ♦ Feb 17 '13 at 15:32

bmike ♦ ,Feb 17, 2013 at 23:59

@zio Great use case. Does open -na Skype not work for you? – bmike ♦ Feb 17 '13 at 23:59

user495470 ,Feb 18, 2013 at 4:50

You could also try copying the application bundle and changing CFBundleIdentifier . – user495470 Feb 18 '13 at 4:50

Huygens ,Feb 18, 2013 at 7:39

sudo can do just that for you :)

It needs a bit of configuration though, but once done you would only do this:

sudo -u user2 -s

And you would be logged in as user2 without entering a password.

Configuration

To configure sudo, you must edit its configuration file via visudo . Note: this command will open the configuration using the vi text editor; if you are uncomfortable with that, you need to set another editor (using export EDITOR=<command> ) before executing the following line. Another command line editor sometimes regarded as easier is nano , so you would do export EDITOR=/usr/bin/nano . You usually need super user privilege for visudo :

sudo visudo

This file is structured in different sections: the aliases, then the defaults, and finally, at the end, the rules. That is where you need to add the new line. So navigate to the end of the file and add this:

user1    ALL=(user2) NOPASSWD: /bin/bash

You can also replace /bin/bash by ALL , and then you could launch any command as user2 without a password: sudo -u user2 <command> .

Update

I have just seen your comment regarding Skype. You could consider adding Skype directly to the sudo's configuration file. I assume you have Skype installed in your Applications folder:

user1    ALL=(user2) NOPASSWD: /Applications/Skype.app/Contents/MacOS/Skype

Then you would call from the terminal:

sudo -u user2 /Applications/Skype.app/Contents/MacOS/Skype

bmike ♦ ,May 28, 2014 at 16:04

This is far less complicated than the ssh keys idea, so use this unless you need the ssh keys for remote access as well. – bmike ♦ May 28 '14 at 16:04

Stan Kurdziel ,Oct 26, 2015 at 16:56

One thing to note from a security perspective is that specifying a specific command implies that it should be a read-only command for user1; otherwise, they can overwrite the command with something else and run that as user2. And if you don't care about that, then you might as well specify that user1 can run any command as user2 and therefore have a simpler sudo config. – Stan Kurdziel Oct 26 '15 at 16:56

Huygens ,Oct 26, 2015 at 19:24

@StanKurdziel good point! Although it is something to be aware of, it's really seldom to have system executables writable by users unless you're root but in this case you don't need sudo ;-) But you're right to add this comment because it's so seldom that I've probably overlooked it more than one time. – Huygens Oct 26 '15 at 19:24

Gert van den Berg ,Aug 10, 2016 at 14:24

To get it nearer to the behaviour su - user2 instead of su user2 , the commands should probably all involve sudo -u user2 -i , in order to simulate an initial login as user2 – Gert van den Berg Aug 10 '16 at 14:24

bmike ,Feb 18, 2013 at 0:05

I would set up public/private ssh keys for the second account and store the key in the first account.

Then you could run a command like:

 ssh user@localhost -n /Applications/Skype.app/Contents/MacOS/Skype &

You'd still have the issues where Skype gets confused since two instances are running on one user account and files read/written by that program might conflict. It also might work well enough for your needs and you'd not need an iPod touch to run your second Skype instance.

calum_b ,Feb 18, 2013 at 9:54

This is a good secure solution for the general case of password-free login to any account on any host, but I'd say it's probably overkill when both accounts are on the same host and belong to the same user. – calum_b Feb 18 '13 at 9:54

bmike ♦ ,Feb 18, 2013 at 14:02

@scottishwildcat It's far more secure than the alternative of scripting the password and feeding it in clear text or using a variable and storing the password in the keychain and using a tool like expect to script the interaction. I just use sudo su - blah and type my password. I think the other answer covers sudo well enough to keep this as a comment. – bmike ♦ Feb 18 '13 at 14:02

calum_b ,Feb 18, 2013 at 17:47

Oh, I certainly wasn't suggesting your answer should be removed; I didn't even down-vote. It's a perfectly good answer. – calum_b Feb 18 '13 at 17:47

bmike ♦ ,Feb 18, 2013 at 18:46

We appear to be in total agreement - thanks for the addition - feel free to edit it into the answer if you can improve on it. – bmike ♦ Feb 18 '13 at 18:46

Gert van den Berg ,Aug 10, 2016 at 14:20

The accepted solution ( sudo -u user2 <...> ) does have the advantage that it can't be used remotely, which might help for security - there is no private key for user1 that can be stolen. – Gert van den Berg Aug 10 '16 at 14:20

[Jun 20, 2018] linux - Automating the sudo su - user command

Jun 20, 2018 | superuser.com


sam ,Feb 9, 2011 at 11:11

I want to automate
sudo su - user

from a script. It should then ask for a password.

grawity ,Feb 9, 2011 at 12:07

Don't sudo su - user , use sudo -iu user instead. (Easier to manage through sudoers , by the way.) – grawity Feb 9 '11 at 12:07

Hello71 ,Feb 10, 2011 at 1:33

How are you able to run sudo su without being able to run sudo visudo ? – Hello71 Feb 10 '11 at 1:33

Torian ,Feb 9, 2011 at 11:37

I will try and guess what you asked.

If you want to use sudo su - user without a password, you should (if you have the privileges) do the following in your sudoers file:

<youuser>  ALL = NOPASSWD: /bin/su - <otheruser>

where <youuser> is your username and <otheruser> is the user you want to run commands as.

Then put into the script:

sudo /bin/su - <otheruser>

Doing just this won't make subsequent commands run as <otheruser> ; it will spawn a new shell. If you want to run another command from within the script as this other user, you should use something like:

 sudo -u <otheruser> <command>

And in sudoers file:

<yourusername>  ALL = (<otheruser>) NOPASSWD: <command>

Obviously, a more generic line like:

<yourusername> ALL = (ALL) NOPASSWD: ALL

Will get things done, but would grant the permission to do anything as anyone.

sam ,Feb 9, 2011 at 11:43

When the sudo su - user command gets executed, it asks for a password. I want a solution in which the script automatically reads the password from somewhere. I don't have permission to do what you suggested earlier. – sam Feb 9 '11 at 11:43

sam ,Feb 9, 2011 at 11:47

I have permission to store the password in a file. The script should read the password from that file. – sam Feb 9 '11 at 11:47

Olli ,Feb 9, 2011 at 12:46

You can use command
 echo "your_password" | sudo -S [rest of your parameters for sudo]

(Of course without [ and ])

Please note that you should protect your script from read access by unauthorized users. If you want to read the password from a separate file, you can use

  sudo -S [rest of your parameters for sudo] < /etc/sudo_password_file

(Or whatever is the name of password file, containing password and single line break.)

From sudo man page:

   -S          The -S (stdin) option causes sudo to read the password from
               the standard input instead of the terminal device.  The
               password must be followed by a newline character.
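If the password does live in a file, it is worth locking that file down so that only the account running the script can read it (a general precaution, not a sudo feature):

chmod 600 /etc/sudo_password_file    # readable and writable by the owner only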

AlexandruC ,Dec 6, 2014 at 8:10

This actually works for me. – AlexandruC Dec 6 '14 at 8:10

Oscar Foley ,Feb 8, 2016 at 16:36

This is brilliant – Oscar Foley Feb 8 '16 at 16:36

Mikel ,Feb 9, 2011 at 11:26

The easiest way is to make it so that user doesn't have to type a password at all.

You can do that by running visudo , then changing the line that looks like:

someuser  ALL=(ALL) ALL

to

someuser  ALL=(ALL) NOPASSWD: ALL

However if it's just for one script, it would be more secure to restrict passwordless access to only that script, and remove the (ALL) , so they can only run it as root, not any user , e.g.

Cmnd_Alias THESCRIPT = /usr/local/bin/scriptname

someuser  ALL=NOPASSWD: THESCRIPT

Run man 5 sudoers to see all the details in the sudoers man page .

sam ,Feb 9, 2011 at 11:34

I do not have permission to edit the sudoers file. Is there any other way, so that the script reads the password from somewhere and this can be automated? – sam Feb 9 '11 at 11:34

Torian ,Feb 9, 2011 at 11:40

You are out of luck... you could do this with, let's say, expect , but that would leave the password for your user hardcoded somewhere where people could see it (and even if you set up permissions the right way, it could still be read by root). – Torian Feb 9 '11 at 11:40

Mikel ,Feb 9, 2011 at 11:40

Try using expect . man expect for details. – Mikel Feb 9 '11 at 11:40


[Jun 20, 2018] sudo - What does ALL ALL=(ALL) ALL mean in sudoers

Jun 20, 2018 | unix.stackexchange.com


LoukiosValentine79 ,May 6, 2015 at 19:29

If a server has the following in /etc/sudoers:
Defaults targetpw
ALL ALL=(ALL) ALL

Then what does this mean? All the users can sudo to all the commands, and only their password is needed?

lcd047 ,May 6, 2015 at 20:51

It means "security Nirvana", that's what it means. ;) – lcd047 May 6 '15 at 20:51

poz2k4444 ,May 6, 2015 at 20:19

From the sudoers(5) man page:

The sudoers policy plugin determines a user's sudo privileges.

For the targetpw:

sudo will prompt for the password of the user specified by the -u option (defaults to root) instead of the password of the invoking user when running a command or editing a file.

sudo(8) allows you to execute commands as someone else

So, basically it says that any user can run any command on any host as any user and yes, the user just has to authenticate, but with the password of the other user, in order to run anything.

The first ALL is the users allowed
The second one is the hosts
The third one is the user you are running the command as
The last one is the commands allowed
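Read field by field, a more restrictive rule would therefore look like this (the user, host, run-as user, and command are all hypothetical):

# users   hosts      run-as    commands
alice     webhost1 = (deploy)  /usr/bin/systemctl restart nginx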

LoukiosValentine79 ,May 7, 2015 at 16:37

Thanks! In the meantime I found the "Defaults targetpw" entry in sudoers.. updated the Q – LoukiosValentine79 May 7 '15 at 16:37

poz2k4444 ,May 7, 2015 at 18:24

@LoukiosValentine79 I just update the answer, does that answer your question? – poz2k4444 May 7 '15 at 18:24

evan54 ,Feb 28, 2016 at 20:24

Wait, he has to enter his own password, not that of the other user, right? – evan54 Feb 28 '16 at 20:24

x-yuri ,May 19, 2017 at 12:20

with targetpw the one of the other (target) user – x-yuri May 19 '17 at 12:20

[Jun 20, 2018] sudo - What is ALL ALL=!SUDOSUDO for

Jun 20, 2018 | unix.stackexchange.com

gasko peter ,Dec 6, 2012 at 12:50

The last line of the /etc/sudoers file is:
grep -i sudosudo /etc/sudoers
Cmnd_Alias SUDOSUDO = /usr/bin/sudo
ALL ALL=!SUDOSUDO

why? What does it exactly do?

UPDATE#1: Now I know that it prevents users from running "/usr/bin/sudo" through sudo.

UPDATE#2: not allowing "root ALL=(ALL) ALL" is not a solution.

Updated Question: What is better besides this "SUDOSUDO"? (The problem with this is that the sudo binary could be copied...)

Chris Down ,Dec 6, 2012 at 12:53

SUDOSUDO is probably an alias. Does it exist elsewhere in the file? – Chris Down Dec 6 '12 at 12:53

gasko peter ,Dec 6, 2012 at 14:21

question updated :D - so what does it mean exactly? – gasko peter Dec 6 '12 at 14:21

gasko peter ,Dec 6, 2012 at 14:30

is "ALL ALL=!SUDOSUDO" as the last line is like when having DROP iptables POLICY and still using a -j DROP rule as last rule in ex.: INPUT chain? :D or does it has real effects? – gasko peter Dec 6 '12 at 14:30

Kevin ,Dec 6, 2012 at 14:48

I'm not 100% sure, but I believe it only prevents anyone from running sudo sudo ... . – Kevin Dec 6 '12 at 14:48

[Jun 18, 2018] Copy and paste text in midnight commander (MC) via putty in Linux

Notable quotes:
"... IF you're using putty in either Xorg or Windows (i.e terminal within a gui) , it's possible to use the "conventional" right-click copy/paste behavior while in mc. Hold the shift key while you mark/copy. ..."
"... Putty has ability to copy-paste. In mcedit, hold Shift and select by mouse ..."
Jun 18, 2018 | superuser.com

Den ,Mar 1, 2015 at 22:50

I use Midnight Commander (MC) editor over putty to edit files

I want to know how to copy text from one file, close it then open another file and paste it?

If it is not possible with Midnight Commander, is there another easy way to copy and paste specific text from different files?

szkj ,Mar 12, 2015 at 22:40

I would do it like this:
  1. switch to block selection mode by pressing F3
  2. select a block
  3. switch off block selection mode with F3
  4. press Ctrl+F which will open Save block dialog
  5. press Enter to save it to the default location
  6. open the other file in the editor, and navigate to the target location
  7. press Shift+F5 to open Insert file dialog
  8. press Enter to paste from the default file location (which is same as the one in Save block dialog)

NOTE: There are other, environment-related methods that could be more conventional nowadays, but the above one does not depend on any desktop-environment-related clipboard (terminal emulator features, putty, Xorg, etc.). This is a pure mcedit feature which works everywhere.

Andrejs ,Apr 28, 2016 at 8:13

To copy: (hold) Shift + Select with mouse (copies to clipboard)

To paste in windows: Ctrl+V

To paste in another file in PuTTY/MC: Shift + Ins

Piotr Dobrogost ,Mar 30, 2017 at 17:32

If you get unwanted indents in what was pasted then while editing file in Midnight Commander press F9 to show top menu and in Options/Generals menu uncheck Return does autoindent option. Yes, I was happy when I found it too :) – Piotr Dobrogost Mar 30 '17 at 17:32

mcii-1962 ,May 26, 2015 at 13:17

IF you're using putty in either Xorg or Windows (i.e terminal within a gui) , it's possible to use the "conventional" right-click copy/paste behavior while in mc. Hold the shift key while you mark/copy.

Eden ,Feb 15, 2017 at 4:09

  1. Hold down the Shift key, and drag the mouse through the text you want to copy. The text's background will become dark orange.
  2. Release the Shift key and press Shift + Ctrl + c . The text will be copied.
  3. Now you can paste the text to anywhere you want by pressing Shift + Ctrl + v , even to the new page in MC.

xoid ,Jun 6, 2016 at 6:37

Putty has ability to copy-paste. In mcedit, hold Shift and select by mouse

mcii-1962 ,Jun 20, 2016 at 23:01

LOL - did you actually read the other answers? And your answer is incomplete, you should include what to do with the mouse in order to "select by mouse".
According to help in MC:

Ctrl + Insert copies to the mcedit.clip, and Shift + Insert pastes from mcedit.clip.

It doesn't work for me for some reason, but pressing F9 brings up a menu, and Edit > Copy to clipfile worked fine.

[Jun 13, 2018] reverse engineering - Is there a C++ decompiler

Jun 13, 2018 | stackoverflow.com

David Holm ,Oct 15, 2008 at 15:08

You can use IDA Pro by Hex-Rays . You will usually not get good C++ out of a binary unless you compiled in debugging information. Prepare to spend a lot of manual labor reversing the code.

If you didn't strip the binaries there is some hope as IDA Pro can produce C-alike code for you to work with. Usually it is very rough though, at least when I used it a couple of years ago.

davenpcj ,May 5, 2012 at

To clarify, IDA will only give the disassembly. There's an add-on to it called Hex-Rays that will decompile the rest of the way into C/C++ source, to the extent that's possible. – davenpcj May 5 '12 at

Dustin Getz ,Oct 15, 2008 at 15:15

information is discarded in the compiling process. Even if a decompiler could produce the logical equivalent code with classes and everything (it probably can't), the self-documenting part is gone in optimized release code. No variable names, no routine names, no class names - just addresses.

Darshan Chaudhary ,Aug 14, 2017 at 17:36

"the soul" of the program is gone, just an empty shell of it's former self..." – Darshan Chaudhary Aug 14 '17 at 17:36


Yes, but none of them will manage to produce code readable enough to be worth the effort. You will spend more time trying to read the decompiled source with assembler blocks inside than rewriting your old app from scratch.

[Jun 13, 2018] MC_HOME allows you to run mc with alternative mc.init

Notable quotes:
"... MC_HOME variable can be set to alternative path prior to starting mc. Man pages are not something you can find the answer right away =) ..."
"... A small drawback of this solution: if you set MC_HOME to a directory different from your usual HOME, mc will ignore the content of your usual ~/.bashrc so, for example, your custom aliases defined in that file won't work anymore. Workaround: add a symlink to your ~/.bashrc into the new MC_HOME directory ..."
"... at the same time ..."
Jun 13, 2018 | unix.stackexchange.com

Tagwint ,Dec 19, 2014 at 16:41

That turned out to be simpler than one might think. The MC_HOME variable can be set to an alternative path prior to starting mc. Man pages are not something where you find the answer right away =)

Here's how it works. The usual way:

[jsmith@wstation5 ~]$ mc -F
Root directory: /home/jsmith

[System data]
<skipped>

[User data]
    Config directory: /home/jsmith/.config/mc/
    Data directory:   /home/jsmith/.local/share/mc/
        skins:          /home/jsmith/.local/share/mc/skins/
        extfs.d:        /home/jsmith/.local/share/mc/extfs.d/
        fish:           /home/jsmith/.local/share/mc/fish/
        mcedit macros:  /home/jsmith/.local/share/mc/mc.macros
        mcedit external macros: /home/jsmith/.local/share/mc/mcedit/macros.d/macro.*
    Cache directory:  /home/jsmith/.cache/mc/

and the alternative way:

[jsmith@wstation5 ~]$ MC_HOME=/tmp/MCHOME mc -F
Root directory: /tmp/MCHOME

[System data]
<skipped>    

[User data]
    Config directory: /tmp/MCHOME/.config/mc/
    Data directory:   /tmp/MCHOME/.local/share/mc/
        skins:          /tmp/MCHOME/.local/share/mc/skins/
        extfs.d:        /tmp/MCHOME/.local/share/mc/extfs.d/
        fish:           /tmp/MCHOME/.local/share/mc/fish/
        mcedit macros:  /tmp/MCHOME/.local/share/mc/mc.macros
        mcedit external macros: /tmp/MCHOME/.local/share/mc/mcedit/macros.d/macro.*
    Cache directory:  /tmp/MCHOME/.cache/mc/

Use case of this feature:

You have to share the same user name on a remote server (access can be distinguished by RSA keys) and want to use your favorite mc configuration without overwriting it. Concurrent sessions do not interfere with each other.

This works well as part of the sshrc approach described in https://github.com/Russell91/sshrc
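Building on that, a minimal per-session wrapper is easy to sketch (the directory naming below is illustrative, not from the original thread):

export MC_HOME="/tmp/mc-home-$$"        # $$ = PID of this shell, so each session is distinct
mkdir -p "$MC_HOME/.config"
cp -a ~/.config/mc "$MC_HOME/.config/" 2>/dev/null   # seed from the usual config, if present
mc

Each session gets its own config tree under /tmp, so concurrent sessions cannot clobber each other's ini file.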

Cri ,Sep 5, 2016 at 10:26

A small drawback of this solution: if you set MC_HOME to a directory different from your usual HOME, mc will ignore the content of your usual ~/.bashrc, so, for example, your custom aliases defined in that file won't work anymore. Workaround: add a symlink to your ~/.bashrc into the new MC_HOME directory. – Cri Sep 5 '16 at 10:26

goldilocks ,Dec 18, 2014 at 16:03

If you mean, you want to be able to run two instances of mc as the same user at the same time with different config directories, as far as I can tell you can't. The path is hardcoded.

However, if you mean, you want to be able to switch which config directory is being used, here's an idea (tested, works). You probably want to do it without mc running:
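(The script itself did not survive in this excerpt; what follows is a minimal reconstruction of what such a switch_mc helper likely looked like, with the name, the killall mc step, and the symlink behaviour taken from the surrounding discussion.)

#!/bin/sh
# switch_mc: point ~/.config/mc at one of several saved config directories
# usage: switch_mc one|two|...   (profile names are illustrative)
killall mc 2>/dev/null                 # make sure mc is not running
rm -f ~/.config/mc                     # drop the current symlink
ln -s ~/.config/mc."$1" ~/.config/mc   # select the named profile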

Hopefully it's clear what's happening there: the script sets the config directory path up as a symlink. Whatever configuration changes you now make and save will be in that one directory. You can then exit and run switch_mc two, reverting to the old config, then start mc again, make changes and save them, etc.

You could get away with removing the killall mc and playing around; the configuration stuff is in the ini file, which is read at start-up (so you can't switch on the fly this way). It's then not touched until exit unless you "Save setup", but at exit it may be overwritten, so the danger here is that you erase something you did earlier or outside of the running instance.

Tagwint ,Dec 18, 2014 at 16:52

That works indeed; your idea is pretty clear, thank you for your time. However, my idea was to be able to run differently configured mc's under the same account without them interfering with each other. I should have specified that in my question. The path to the config dir is in fact hardcoded, but it is hardcoded RELATIVE to the user's home dir, that is, the value of $HOME; thus changing it before mc starts DOES change the config dir location - I've checked that. The drawback is that $HOME stays changed as long as mc runs, which could be resolved if mc had a kind of startup hook to restore the original HOME. – Tagwint Dec 18 '14 at 16:52

Tagwint ,Dec 18, 2014 at 17:17

I've extended my original question with the 'same time' condition - it did not fit within my previous comment's size limit. – Tagwint Dec 18 '14 at 17:17

[Jun 13, 2018] MC (Midnight Commander) mc/ini settings file location

Jun 13, 2018 | unix.stackexchange.com

UVV ,Oct 13, 2014 at 7:51

It's in the following file: ~/.config/mc/ini .

obohovyk ,Oct 13, 2014 at 7:53

Unfortunately not... – obohovyk Oct 13 '14 at 7:53

UVV ,Oct 13, 2014 at 8:02

@alexkowalski then it's ~/.config/mc/ini – UVV Oct 13 '14 at 8:02

obohovyk ,Oct 13, 2014 at 8:41

Yeah, thanks!!! – obohovyk Oct 13 '14 at 8:41

,

If you have not made any changes, the config file does not yet exist.

The easy way to change from the default skin:

  1. Start Midnight Commander
    sudo mc
    
  2. F9 , O for Options, or cursor to "Options" and press Enter
  3. A for Appearance, or cursor to Appearance and press Enter

    You will see that default is the current skin.

  4. Press Enter to see the other skin choices
  5. Cursor to the skin you want and select it by pressing Enter
  6. Click OK

After you do this, the ini file will exist and can be edited, but it is easier to change skins using the method I described.
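If you do prefer editing the file directly, the chosen skin ends up as a plain key in the ini; on the versions I have seen it looks like the fragment below (verify the section name against your build):

[Midnight-Commander]
skin=dark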

[Jun 02, 2018] Parallelise rsync using GNU Parallel

Jun 02, 2018 | unix.stackexchange.com



Mandar Shinde ,Mar 13, 2015 at 6:51

I have been using an rsync script to synchronize data at one host with the data at another host. The data consists of numerous small files that add up to almost 1.2 TB.

In order to sync those files, I have been using the rsync command as follows:

rsync -avzm --stats --human-readable --include-from proj.lst /data/projects REMOTEHOST:/data/

The contents of proj.lst are as follows:

+ proj1
+ proj1/*
+ proj1/*/*
+ proj1/*/*/*.tar
+ proj1/*/*/*.pdf
+ proj2
+ proj2/*
+ proj2/*/*
+ proj2/*/*/*.tar
+ proj2/*/*/*.pdf
...
...
...
- *

As a test, I picked two of those projects (8.5 GB of data) and executed the command above. Being a sequential process, it took 14 minutes 58 seconds to complete. So, for 1.2 TB of data it would take several hours.

If I could run multiple rsync processes in parallel (using & , xargs or parallel ), it would save time.

I tried with below command with parallel (after cd ing to source directory) and it took 12 minutes 37 seconds to execute:

parallel --will-cite -j 5 rsync -avzm --stats --human-readable {} REMOTEHOST:/data/ ::: .

This should have been about 5 times faster, but it wasn't. I think I'm going wrong somewhere.

How can I run multiple rsync processes in order to reduce the execution time?

Ole Tange ,Mar 13, 2015 at 7:25

Are you limited by network bandwidth? Disk iops? Disk bandwidth? – Ole Tange Mar 13 '15 at 7:25

Mandar Shinde ,Mar 13, 2015 at 7:32

If possible, we would want to use 50% of total bandwidth. But parallelising multiple rsyncs is our first priority. – Mandar Shinde Mar 13 '15 at 7:32

Ole Tange ,Mar 13, 2015 at 7:41

Can you let us know your: Network bandwidth, disk iops, disk bandwidth, and the bandwidth actually used? – Ole Tange Mar 13 '15 at 7:41

Mandar Shinde ,Mar 13, 2015 at 7:47

In fact, I do not know the above parameters. For the time being, we can neglect the optimization part. Multiple rsyncs in parallel are the primary focus now. – Mandar Shinde Mar 13 '15 at 7:47

Mandar Shinde ,Apr 11, 2015 at 13:53

The following steps did the job for me:
  1. Run rsync with --dry-run first, in order to get the list of files that would be affected.

rsync -avzm --stats --safe-links --ignore-existing --dry-run --human-readable /data/projects REMOTE-HOST:/data/ > /tmp/transfer.log

  2. I fed the output of cat /tmp/transfer.log to parallel in order to run 5 rsyncs in parallel, as follows:

cat /tmp/transfer.log | parallel --will-cite -j 5 rsync -avzm --relative --stats --safe-links --ignore-existing --human-readable {} REMOTE-HOST:/data/ > result.log

Here, the --relative option ensured that the directory structure for the affected files, at the source and destination, remains the same (inside the /data/ directory), so the command must be run in the source folder (in this example, /data/projects ).

Sandip Bhattacharya ,Nov 17, 2016 at 21:22

That would do one rsync per file. It would probably be more efficient to split up the whole file list using split and feed those filenames to parallel, then use rsync's --files-from to get the filenames out of each chunk and sync them (see the formatted sketch below): rm backups.* ; split -l 3000 backup.list backups. ; ls backups.* | parallel --line-buffer --verbose -j 5 rsync --progress -av --files-from {} /LOCAL/PARENT/PATH/ REMOTE_HOST:REMOTE_PATH/ – Sandip Bhattacharya Nov 17 '16 at 21:22
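Spelled out on separate lines, that suggestion looks like this (all paths are placeholders):

# split the file list into chunks of 3000 names each
rm -f backups.*
split -l 3000 backup.list backups.

# one rsync per chunk, 5 at a time, reading names via --files-from
ls backups.* | parallel --line-buffer --verbose -j 5 \
    rsync --progress -av --files-from {} /LOCAL/PARENT/PATH/ REMOTE_HOST:REMOTE_PATH/

This avoids paying rsync's startup and handshake cost once per file.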

Mike D ,Sep 19, 2017 at 16:42

How does the second rsync command handle the lines in result.log that are not files? i.e. receiving file list ... done created directory /data/ . – Mike D Sep 19 '17 at 16:42

Cheetah ,Oct 12, 2017 at 5:31

On newer versions of rsync (3.1.0+), you can use --info=name in place of -v , and you'll get just the names of the files and directories. You may want to use --protect-args to the 'inner' transferring rsync too if any files might have spaces or shell metacharacters in them. – Cheetah Oct 12 '17 at 5:31

Mikhail ,Apr 10, 2017 at 3:28

I would strongly discourage anybody from using the accepted answer; a better solution is to crawl the top-level directory and launch a proportional number of rsync operations.

I have a large zfs volume and my source was a cifs mount. Both are linked with 10G, and in some benchmarks can saturate the link. Performance was evaluated using zpool iostat 1 .

The source drive was mounted like:

mount -t cifs -o username=,password= //static_ip/70tb /mnt/Datahoarder_Mount/ -o vers=3.0

Using a single rsync process:

rsync -h -v -r -P -t /mnt/Datahoarder_Mount/ /StoragePod

the io meter reads:

StoragePod  30.0T   144T      0  1.61K      0   130M
StoragePod  30.0T   144T      0  1.61K      0   130M
StoragePod  30.0T   144T      0  1.62K      0   130M

In synthetic benchmarks (CrystalDiskMark), sequential write performance approaches 900 MB/s, which means the link is saturated. 130 MB/s is not very good, and is the difference between waiting a weekend and waiting two weeks.

So, I built the file list and tried to run the sync again (I have a 64 core machine):

cat /home/misha/Desktop/rsync_logs_syncs/Datahoarder_Mount.log | parallel --will-cite -j 16 rsync -avzm --relative --stats --safe-links --size-only --human-readable {} /StoragePod/ > /home/misha/Desktop/rsync_logs_syncs/Datahoarder_Mount_result.log

and it had the same performance!

StoragePod  29.9T   144T      0  1.63K      0   130M
StoragePod  29.9T   144T      0  1.62K      0   130M
StoragePod  29.9T   144T      0  1.56K      0   129M

As an alternative I simply ran rsync on the root folders:

rsync -h -v -r -P -t /mnt/Datahoarder_Mount/Mikhail/Marcello_zinc_bone /StoragePod/Marcello_zinc_bone
rsync -h -v -r -P -t /mnt/Datahoarder_Mount/Mikhail/fibroblast_growth /StoragePod/fibroblast_growth
rsync -h -v -r -P -t /mnt/Datahoarder_Mount/Mikhail/QDIC /StoragePod/QDIC
rsync -h -v -r -P -t /mnt/Datahoarder_Mount/Mikhail/sexy_dps_cell /StoragePod/sexy_dps_cell

This actually boosted performance:

StoragePod  30.1T   144T     13  3.66K   112K   343M
StoragePod  30.1T   144T     24  5.11K   184K   469M
StoragePod  30.1T   144T     25  4.30K   196K   373M

In conclusion, as @Sandip Bhattacharya brought up: write a small script to get the directories and parallelise over those, or pass a file list to rsync. But don't create a new instance for each file.
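A minimal sketch of that per-directory approach (paths and job count are placeholders):

# one rsync per top-level directory, 4 jobs at a time
find /mnt/src -mindepth 1 -maxdepth 1 -type d | \
    parallel -j 4 rsync -a {}/ /dest/{/}/

Here {/} is GNU Parallel's "basename of the input line" replacement string, so each source directory lands in a matching directory under /dest.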

Julien Palard ,May 25, 2016 at 14:15

I personally use this simple one:
ls -1 | parallel rsync -a {} /destination/directory/

This is only useful when you have more than a few non-nearly-empty directories; otherwise you'll end up with almost every rsync terminating and the last one doing all the work alone.

Ole Tange ,Mar 13, 2015 at 7:25

A tested way to do the parallelized rsync is: http://www.gnu.org/software/parallel/man.html#EXAMPLE:-Parallelizing-rsync

rsync is a great tool, but sometimes it will not fill up the available bandwidth. This is often a problem when copying several big files over high speed connections.

The following will start one rsync per big file in src-dir to dest-dir on the server fooserver:

cd src-dir; find . -type f -size +100000 | \
parallel -v ssh fooserver mkdir -p /dest-dir/{//}\; \
  rsync -s -Havessh {} fooserver:/dest-dir/{}

The directories created may end up with wrong permissions and smaller files are not transferred. To fix those, run rsync a final time:

rsync -Havessh src-dir/ fooserver:/dest-dir/

If you are unable to push data, but need to pull them and the files are called digits.png (e.g. 000000.png) you might be able to do:

seq -w 0 99 | parallel rsync -Havessh fooserver:src/*{}.png destdir/

Mandar Shinde ,Mar 13, 2015 at 7:34

Any other alternative in order to avoid find ? – Mandar Shinde Mar 13 '15 at 7:34

Ole Tange ,Mar 17, 2015 at 9:20

Limit the -maxdepth of find. – Ole Tange Mar 17 '15 at 9:20

Mandar Shinde ,Apr 10, 2015 at 3:47

If I use --dry-run option in rsync , I would have a list of files that would be transferred. Can I provide that file list to parallel in order to parallelise the process? – Mandar Shinde Apr 10 '15 at 3:47

Ole Tange ,Apr 10, 2015 at 5:51

cat files | parallel -v ssh fooserver mkdir -p /dest-dir/{//}\; rsync -s -Havessh {} fooserver:/dest-dir/{} – Ole Tange Apr 10 '15 at 5:51

Mandar Shinde ,Apr 10, 2015 at 9:49

Can you please explain the mkdir -p /dest-dir/{//}\; part? Especially the {//} thing is a bit confusing. – Mandar Shinde Apr 10 '15 at 9:49
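For reference: {//} is GNU Parallel's replacement string for the directory part of the input line ({} is the line itself), so mkdir -p /dest-dir/{//} recreates each file's parent directory on the remote side before rsync copies the file. A quick way to see it in action:

echo a/b/c.txt | parallel echo {//}    # prints: a/b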


For multi-destination syncs, I am using
parallel rsync -avi /path/to/source ::: host1: host2: host3:

Hint: All ssh connections are established with public keys in ~/.ssh/authorized_keys

[May 13, 2018] What is the difference between FASTA, FASTQ, and SAM file formats

May 13, 2018 | bioinformatics.stackexchange.com

Konrad Rudolph ,Jun 2, 2017 at 12:16

Let's start with what they have in common: All three formats store
  1. sequence data, and
  2. sequence metadata.

Furthermore, all three formats are text-based.

However, beyond that all three formats are different and serve different purposes.

Let's start with the simplest format:

FASTA

FASTA stores a variable number of sequence records, and for each record it stores the sequence itself, and a sequence ID. Each record starts with a header line whose first character is > , followed by the sequence ID. The next lines of a record contain the actual sequence.

The Wikipedia article gives several examples for peptide sequences, but since FASTQ and SAM are used exclusively (?) for nucleotide sequences, here's a nucleotide example:

>Mus_musculus_tRNA-Ala-AGC-1-1 (chr13.trna34-AlaAGC)
GGGGGTGTAGCTCAGTGGTAGAGCGCGTGCTTAGCATGCACGAGGcCCTGGGTTCGATCC
CCAGCACCTCCA
>Mus_musculus_tRNA-Ala-AGC-10-1 (chr13.trna457-AlaAGC)
GGGGGATTAGCTCAAATGGTAGAGCGCTCGCTTAGCATGCAAGAGGtAGTGGGATCGATG
CCCACATCCTCCA

The ID can be in any arbitrary format, although several conventions exist.

In the context of nucleotide sequences, FASTA is mostly used to store reference data, that is, data extracted from a curated database. The above is adapted from GtRNAdb (a database of tRNA sequences).

FASTQ

FASTQ was conceived to solve a specific problem of FASTA files: when sequencing, the confidence in a given base call (that is, the identity of a nucleotide) varies. This is expressed in the Phred quality score . FASTA had no standardised way of encoding this. By contrast, a FASTQ record contains a sequence of quality scores for each nucleotide.

A FASTQ record has the following format:

  1. A line starting with @ , containing the sequence ID.
  2. One or more lines that contain the sequence.
  3. A line starting with the character + , which is either empty or repeats the sequence ID.
  4. One or more lines that contain the quality scores.

Here's an example of a FASTQ file with two records:

@071112_SLXA-EAS1_s_7:5:1:817:345
GGGTGATGGCCGCTGCCGATGGCGTC
AAATCCCACC
+
IIIIIIIIIIIIIIIIIIIIIIIIII
IIII9IG9IC
@071112_SLXA-EAS1_s_7:5:1:801:338
GTTCAGGGATACGACGTTTGTATTTTAAGAATCTGA
+
IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII6IBI

FASTQ files are mostly used to store short-read data from high-throughput sequencing experiments. As a consequence, the sequence and quality scores are usually put into a single line each, and indeed many tools assume that each record in a FASTQ file is exactly four lines long, even though this isn't guaranteed.
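That four-line convention is why quick shell checks on FASTQ files are so common; a minimal sketch, valid only for strictly four-line records:

# number of records, assuming exactly 4 lines per record
echo $(( $(wc -l < reads.fastq) / 4 ))

# print only the sequence lines (line 2 of every record)
awk 'NR % 4 == 2' reads.fastq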

As with FASTA, the format of the sequence ID isn't standardised, but different producers of FASTQ use fixed notations that follow strict conventions.

SAM

SAM files are so complex that a complete description [PDF] takes 15 pages. So here's the short version.

The original purpose of SAM files is to store mapping information for sequences from high-throughput sequencing. As a consequence, a SAM record needs to store more than just the sequence and its quality, it also needs to store information about where and how a sequence maps into the reference.

Unlike the previous formats, SAM is tab-based, and each record, consisting of either 11 or 12 fields, fills exactly one line. Here's an example (tabs replaced by fixed-width spacing):

r001  99  chr1  7 30  17M         =  37  39  TTAGATAAAGGATACTG   IIIIIIIIIIIIIIIII
r002  0   chrX  9 30  3S6M1P1I4M  *  0   0   AAAAGATAAGGATA      IIIIIIIIII6IBI    NM:i:1

For a description of the individual fields, refer to the documentation. The relevant bit is this: SAM can express exactly the same information as FASTQ, plus, as mentioned, the mapping information. However, SAM is also used to store read data without mapping information.

In addition to sequence records, SAM files can also contain a header, which stores information about the reference that the sequences were mapped to, and the tool used to create the SAM file. Header information precedes the sequence records and consists of lines starting with @ .

SAM itself is almost never used as a storage format; instead, files are stored in BAM format, which is a compact binary representation of SAM. It stores the same information, just more efficiently, and in conjunction with a search index allows fast retrieval of individual records from the middle of the file (= fast random access). BAM files are also much more compact than compressed FASTQ or FASTA files.
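As an illustration, a typical round trip with samtools looks like this (file names are placeholders; recent samtools versions detect SAM input automatically):

samtools sort -o aln.sorted.bam aln.sam          # convert and coordinate-sort
samtools index aln.sorted.bam                    # build the .bai search index
samtools view aln.sorted.bam chr1:10000-20000    # fast random access to one region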


The above implies a hierarchy in what the formats can store: FASTA ⊂ FASTQ ⊂ SAM.

In a typical high-throughput analysis workflow, you will encounter all three file types:

  1. FASTA to store the reference genome/transcriptome that the sequence fragments will be mapped to.
  2. FASTQ to store the sequence fragments before mapping.
  3. SAM/BAM to store the sequence fragments after mapping.

Scott Gigante ,Aug 17, 2017 at 6:01

FASTQ is used for long-read sequencing as well, which could have a single record being thousands of 80-character lines long. Sometimes these are split by line breaks, sometimes not. – Scott Gigante Aug 17 '17 at 6:01

Konrad Rudolph ,Aug 17, 2017 at 10:03

@ScottGigante I alluded to this by saying that the sequence can take up several lines. – Konrad Rudolph Aug 17 '17 at 10:03

Scott Gigante ,Aug 17, 2017 at 13:22

Sorry, should have clarified: I was just referring to the line "FASTQ files are (almost?) exclusively used to store short-read data from high-throughput sequencing experiments." Definitely not exclusively. – Scott Gigante Aug 17 '17 at 13:22

Konrad Rudolph ,Feb 21 at 17:06

@charlesdarwin I have no idea. The line with the plus sign is completely redundant. The original developers of the FASTQ format probably intended it as a redundancy to simplify error checking (= to see if the record was complete) but it fails at that. In hindsight it shouldn't have been included. Unfortunately we're stuck with it for now. – Konrad Rudolph Feb 21 at 17:06

Wouter De Coster ,Feb 21 at 23:16

@KonradRudolph as far as I know fastq is a combination of fasta and qual files, see also ncbi.nlm.nih.gov/pmc/articles/PMC2847217 This explains the header of the quality part. It, however, doesn't make sense we're stuck with it... – Wouter De Coster Feb 21 at 23:16

eastafri ,May 16, 2017 at 18:57

In a nutshell,

FASTA file format is a DNA sequence format for specifying or representing DNA sequences and was first described by Pearson (Pearson,W.R. and Lipman,D.J. (1988) Improved tools for biological sequence comparison. Proc. Natl Acad. Sci. USA, 85, 2444–2448)

FASTQ is another DNA sequence file format that extends the FASTA format with the ability to store the sequence quality. The quality scores are often represented as ASCII characters which correspond to a Phred score.

Both FASTA and FASTQ are common sequence representation formats and have emerged as key data interchange formats for molecular biology and bioinformatics.

SAM is a format for representing sequence alignment information from a read aligner. It represents sequence information with respect to a given reference sequence. The information is stored in a series of tab-delimited ASCII columns. The full SAM format specification is available at http://samtools.sourceforge.net/SAM1.pdf

user172818 ♦ ,May 16, 2017 at 19:07

On a historical note, the Sanger Institute first used the FASTQ format. – user172818 ♦ May 16 '17 at 19:07

Konrad Rudolph ,Jun 2, 2017 at 10:43

SAM can also store unaligned sequence information (and is increasingly used for it, see PacBio), and in this regard is equivalent to FASTQ. – Konrad Rudolph Jun 2 '17 at 10:43

bli ,Jun 2, 2017 at 11:30

Note that fasta is also often used for protein data, not just DNA. – bli Jun 2 '17 at 11:30

BaCh ,May 16, 2017 at 18:53

Incidentally, the first part of your question is something you could have looked up yourself, as the first hits on Google for "NAME format" point you to primers on Wikipedia, no less. In future, please do that before asking a question.
  1. FASTA
  2. FASTQ
  3. SAM

FASTA (officially) just stores the name of a sequence and the sequence; unofficially, people also add comment fields after the name of the sequence. FASTQ was invented to store both sequence and associated quality values (e.g. from sequencing instruments). SAM was invented to store alignments of (small) sequences (e.g. generated from sequencing), with associated quality values and some further data, onto larger sequences, called reference sequences, the latter being anything from a tiny virus sequence to ultra-large plant sequences.

Alon Gelber ,May 16, 2017 at 19:50

FASTA and FASTQ formats are both file formats that contain sequencing reads, while SAM files are these reads aligned to a reference sequence. In other words, FASTA and FASTQ are the "raw data" of sequencing, while SAM is the product of aligning the sequencing reads to a refseq.

A FASTA file contains a read name followed by the sequence. An example of one of these reads for RNASeq might be:

>Flow cell number: lane number: chip coordinates etc.
ATTGGCTAATTGGCTAATTGGCTAATTGGCTAATTGGCTAATTGGCTAATTGGCTAATTGGCTA

The FASTQ version of this read will have two more lines: one + as a placeholder and then a line of quality scores for the base calls. The qualities are given as characters, with '!' being the lowest and '~' being the highest, in increasing ASCII value. It would look something like this:

@Flow cell number: lane number: chip coordinates etc.
ATTGGCTAATTGGCTAATTGGCTAATTGGCTAATTGGCTAATTGGCTAATTGGCTAATTGGCTA
+
!''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>CCCCCCC65

A SAM file has many fields for each alignment; the header begins with the @ character. The alignment contains 11 mandatory fields and various optional ones. You can find the spec file here: https://samtools.github.io/hts-specs/SAMv1.pdf

Often you'll see BAM files, which are just compressed binary versions of SAM files. You can view these alignment files using various tools, such as SAMtools, IGV or the UCSC Genome Browser.

As to the benefits, FASTA/FASTQ vs. SAM/BAM is comparing apples and oranges. I do a lot of RNASeq work, so generally we take the FASTQ files and align them to a refseq using an aligner such as STAR, which outputs SAM/BAM files. There's a lot you can do with just these alignment files, looking at expression, but usually I'll use a tool such as RSEM to "count" the reads from various genes to create an expression matrix, samples as columns and genes as rows. Whether you get FASTQ or FASTA files just depends on your sequencing platform. I've never heard of anybody really using the quality scores.

Konrad Rudolph ,Jun 2, 2017 at 10:47

Careful, the FASTQ format description is wrong: a FASTQ record can span more than four lines; also, + isn't a placeholder, it's a separator between the sequence and the quality score, with an optional repetition of the record ID following it. Finally, the quality score string has to be the same length as the sequence. – Konrad Rudolph Jun 2 '17 at 10:47

[May 09, 2018] How to read binary file in Perl - Stack Overflow

Notable quotes:
"... BTW: I don't think it's a good idea to read tons of binary files into memory at once. You can search them 1 by 1... ..."
May 09, 2018 | stackoverflow.com



Grace ,Jan 19, 2012 at 2:08

I'm having an issue with writing a Perl script to read a binary file.

My code is as follows, whereby the $file entries are files in binary format. I tried to search the web and apply what I found in my code, and tried to print it out, but it doesn't seem to work well.

Currently it only prints the '&&&&&&&&&&&' and 'ppppppppppp', but what I really want is for it to print out each $line , so that I can do some other post-processing later. Also, I'm not quite sure what $data is, as I see it is part of the code from a sample in an article, where it is stated to be a scalar. I need somebody to pinpoint where my code goes wrong. Below is what I did.

my $tmp = "$basedir/$key";
opendir (TEMP1, "$tmp");
my @dirs = readdir(TEMP1);
closedir(TEMP1);

foreach my $dirs (@dirs) {
    next if ($dirs eq "." || $dirs eq "..");
    print "---->$dirs\n";
    my $d = "$basedir/$key/$dirs";
    if (-d "$d") {
        opendir (TEMP2, $d) || die $!;
        my @files = readdir (TEMP2); # This should read binary files
        closedir (TEMP2);

        #my $buffer = "";
        #opendir (FILE, $d) || die $!;
        #binmode (FILE);
        #my @files =  readdir (FILE, $buffer, 169108570);
        #closedir (FILE);

        foreach my $file (@files) {
            next if ($file eq "." || $file eq "..");
            my $f = "$d/$file";
            print "==>$file\n";
            open FILE, $file || die $!;
            binmode FILE;
            foreach ($line = read (FILE, $data, 169108570)) {
                print "&&&&&&&&&&&$line\n";
                print "ppppppppppp$data\n";
            }
            close FILE;
        }
    }
}

I have altered my code so that it goes as below. Now I can read $data . Thanks, J-16 SDiZ, for pointing that out. I'm trying to push the info I got from the binary file to an array called @array, thinking to grep data from the array for strings matching "p04", but it fails. Can someone point out where the error is?

my $tmp = "$basedir/$key";
opendir (TEMP1, "$tmp");
my @dirs = readdir (TEMP1);
closedir (TEMP1);

foreach my $dirs (@dirs) {
    next if ($dirs eq "." || $dirs eq "..");
    print "---->$dirs\n";
    my $d = "$basedir/$key/$dirs";
    if (-d "$d") {
        opendir (TEMP2, $d) || die $!;
        my @files = readdir (TEMP2); #This should read binary files
        closedir (TEMP2);

        foreach my $file (@files) {
            next if ($file eq "." || $file eq "..");
            my $f = "$d/$file";
            print "==>$file\n";
            open FILE, $file || die $!;
            binmode FILE;
            foreach ($line = read (FILE, $data, 169108570)) {
                print "&&&&&&&&&&&$line\n";
                print "ppppppppppp$data\n";
                push @array, $data;
            }
            close FILE;
        }
    }
}

foreach $item (@array) {
    #print "==>$item<==\n"; # It prints out content of binary file without the ==> and <== if I uncomment this.. weird!
    if ($item =~ /p04(.*)/) {
        print "=>$item<===============\n"; # It prints "=><===============" according to the number of binary file I have.  This is wrong that I aspect it to print the content of each binary file instead :(
        next if ($item !~ /^w+/);
        open (LOG, ">log") or die $!;
        #print LOG $item;
        close LOG;
    }
}

Again, I changed my code as follows, but it still doesn't work, as it is not able to grep "p04" correctly, judging by the "log" file. It grepped the whole file, including binary, like this: "@^@^@^@^G^D^@^@^@^^@p04bbhi06^@^^@^@^@^@^@^@^@^@hh^R^@^@^@^^@^@^@p04lohhj09^@^@^@^^@@". What I'm expecting is for it to grep only things with p04, such as p04bbhi06 and p04lohhj09. Here is how my code goes:

foreach my $file (@files) {
    next if ($file eq "." || $file eq "..");
    my $f = "$d/$file";
    print "==>$file\n";
    open FILE, $f || die $!;
    binmode FILE;
    my @lines = <FILE>;
    close FILE;
    foreach $cell (@lines) {
        if ($cell =~ /b12/) {
            push @array, $cell;
        }
    }
}

#my @matches = grep /p04/, @lines;
#foreach $item (@matches) {
foreach $item (@array) {
    #print "-->$item<--";
    open (LOG, ">log") or die $!;
    print LOG $item;
    close LOG;
}

Brad Gilbert ,Jan 19, 2012 at 15:53

use autodie – Brad Gilbert Jan 19 '12 at 15:53

reinierpost ,Jan 30, 2012 at 13:00

There is no such thing as 'binary format'. Please be more precise. What format are the files in? What characteristics do they have that cause you to call them 'in binary format'? – reinierpost Jan 30 '12 at 13:00

Grace ,Jan 31, 2012 at 6:56

It is in .gds format. The file can be read in Unix with the strings command. It was readable in my Perl script, but I am not able to grep the data I wanted (p04* here in my code). – Grace Jan 31 '12 at 6:56

mivk ,Nov 19, 2013 at 13:16

As already suggested, use File::Find or something to get your list of files. For the rest, what do you really want? Output the whole file content if you found a match? Or just the parts that match? And what do you want to match? p04(.*) matches anything from "p04" up to the next newline. You then have that "anything" in $1 . Leave out all the clumsy directory stuff and concentrate first on what you want out of a single file. How big are the files? You are only reading the first 170MB. And you keep overwriting the "log" file, so it only contains the last item from the last file. – mivk Nov 19 '13 at 13:16

jm666 ,May 12, 2015 at 6:44

@reinierpost the OP under the "binary file" probably means the opposite of text files - e.g. the same thing as in the perldoc's -X documentation, see the -B explanation. (cite: -B File is a "binary" file (opposite of -T).) – jm666 May 12 '15 at 6:44

J-16 SDiZ ,Jan 19, 2012 at 2:19

Use:
$line = read (FILE, $data, 169108570);

The data is in $data ; and $line is the number of bytes read.

       my $f = "$d/$file" ;
       print "==>$file\n" ;
       open FILE, $file || die $! ;

I guess the full path is in $f , but you are opening $file . (In my testing -- even $f is not the full path, but I guess you may have some other glue code...)

If you just want to walk all the files in a directory, try File::DirWalk or File::Find .

Grace ,Jan 19, 2012 at 2:34

Hi J-16 SDiZ, thanks for the reply. Each $file is in binary format, and what I want to do is to read each file to grep some information in readable format and dump it into another file (which I consider here as post-processing). I want to perform something like "strings <filename> | grep <text syntax>" as in Unix, whereby the <filename> is the $file here in my code. My problem is that I cannot read the binary file, so I cannot proceed with the other stuff. Thanks. – Grace Jan 19 '12 at 2:34

Dimanoid ,Jan 20, 2012 at 8:51

I am not sure if I understood you right.

If you need to read a binary file, you can do the same as for a text file:

open F, "/bin/bash";
my $file = do { local $/; <F> };
close F;

Under Windows you may need to add binmode F; under *nix it works without it.

If you need to find which lines in an array contain some word, you can use the grep function:

my @matches = grep /something/, @array_to_grep;

You will get all matched lines in the new array @matches .

BTW: I don't think it's a good idea to read tons of binary files into memory at once. You can search them 1 by 1...

If you need to find where the match occurs you can use another standard function, index :

my $offset = index($file, 'myword');   # index(STRING, SUBSTRING) searches STRING for SUBSTRING

Grace ,Jan 30, 2012 at 4:30

Hi Dimanoid, thanks for your answer. I tried it, but it didn't work well for me. I tried to edit my code as above (my own code, and it didn't work), and also tried the code below as you suggested; it didn't work for me either. Can you point out where I went wrong? Thanks. – Grace Jan 30 '12 at 4:30

Peter Mortensen ,May 1, 2016 at 8:31

What will $file be assigned to? An array of characters? A string? Something else? – Peter Mortensen May 1 '16 at 8:31


I'm not sure I'll be able to answer the OP question exactly, but here are some notes that may be related. (edit: this is the same approach as answer by @Dimanoid, but with more detail)

Say you have a file, which is a mix of ASCII data, and binary. Here is an example in a bash terminal:

$ echo -e "aa aa\x00\x0abb bb" | tee tester.txt
aa aa
bb bb
$ du -b tester.txt 
13  tester.txt
$ hexdump -C tester.txt 
00000000  61 61 20 61 61 00 0a 62  62 20 62 62 0a           |aa aa..bb bb.|
0000000d

Note that byte 00 (specified as \x00 ) is a non-printable character, (and in C , it also means "end of a string") - thereby, its presence makes tester.txt a binary file. The file has size of 13 bytes as seen by du , because of the trailing \n added by the echo (as it can be seen from hexdump ).

Now, let's see what happens when we try to read it with perl 's <> diamond operator (see also What's the use of <> in perl? ):

$ perl -e '
open IN, "<./tester.txt";
binmode(IN);
$data = <IN>; # does this slurp entire file in one go?
close(IN);
print "length is: " . length($data) . "\n";
print "data is: --$data--\n";
'

length is: 7
data is: --aa aa
--

Clearly, the entire file didn't get slurped - it broke at the line end \n (and not at the binary \x00 ). That is because the diamond filehandle operator <FH> is actually a shortcut for readline (see Perl Cookbook: Chapter 8, File Contents ).

The same link tells us that one should undef the input record separator, $/ (which by default is set to \n ), in order to slurp the entire file. You may want this change to be only local, which is why the braces and local are used instead of undef (see Perl Idioms Explained - my $string = do { local $/; <FILE> }; ); so we have:

$ perl -e '
open IN, "<./tester.txt";
print "_$/_\n"; # check if $/ is \n
binmode(IN);
{
local $/; # undef $/; is global
$data = <IN>; # this should slurp one go now
};
print "_$/_\n"; # check again if $/ is \n
close(IN);
print "length is: " . length($data) . "\n";
print "data is: --$data--\n";
'

_
_
_
_
length is: 13
data is: --aa aa
bb bb
--

... and now we can see the file is slurped in its entirety.

Since binary data implies unprintable characters, you may want to inspect the actual contents of $data by printing via sprintf or pack / unpack instead.

Hope this helps someone,
Cheers!

[May 04, 2018] Why are tar archive formats switching to xz compression to replace bzip2, and what about gzip?

May 04, 2018 | unix.stackexchange.com

のbるしtyぱんky ,Jan 6, 2014 at 18:39

More and more tar archives use the xz format based on LZMA2 for compression instead of the traditional bzip2 (bz2) compression. In fact, kernel.org made a late "Good-bye bzip2" announcement on 27th Dec. 2013, indicating kernel sources would from this point on be released in both tar.gz and tar.xz format - and on the main page of the website, what's directly offered is in tar.xz.

Are there any specific reasons explaining why this is happening and what is the relevance of gzip in this context?


For distributing archives over the Internet, the following things are generally a priority:
  1. Compression ratio (i.e., how small the compressor makes the data);
  2. Decompression time (CPU requirements);
  3. Decompression memory requirements; and
  4. Compatibility (how wide-spread the decompression program is)

Compression memory & CPU requirements aren't very important, because you can use a large fast machine for that, and you only have to do it once.

Compared to bzip2, xz has a better compression ratio and lower (better) decompression time. However, at the compression settings typically used, it requires more memory to decompress [1] and is somewhat less widespread. Gzip uses less memory than either.

So, both gzip and xz format archives are posted, allowing you to pick: xz for the smallest download, or gzip for wider compatibility and lower decompression memory.

There isn't really a realistic combination of factors that'd get you to pick bzip2. So it's being phased out.

I looked at compression comparisons in a blog post. I didn't attempt to replicate the results, and I suspect some of it has changed (mostly, I expect xz has improved, as it's the newest).

(There are some specific scenarios where a good bzip2 implementation may be preferable to xz: bzip2 can compress a file with lots of zeros and genome DNA sequences better than xz. Newer versions of xz now have an (optional) block mode which allows data recovery after a point of corruption, and parallel compression and [in theory] decompression. Previously, only bzip2 offered these. [2] However, none of these are relevant for kernel distribution.)


1: In archive size, xz -3 is around bzip2 -9 . At that setting, xz uses less memory to decompress. But xz -9 (as, e.g., used for Linux kernel tarballs) uses much more than bzip2 -9 . (And even xz -0 needs more than gzip -9 .)

2: F21 System Wide Change: lbzip2 as default bzip2 implementation


First of all, this question is not directly related to tar . Tar just creates an uncompressed archive; the compression is applied later on.

Gzip is known to be relatively fast when compared to LZMA2 and bzip2. If speed matters, gzip (especially the multithreaded implementation pigz ) is often a good compromise between compression speed and compression ratio, although there are alternatives if speed is the main issue (e.g. LZ4).

However, if a high compression ratio is desired LZMA2 beats bzip2 in almost every aspect. The compression speed is often slower, but it decompresses much faster and provides a much better compression ratio at the cost of higher memory usage.

There is not much reason to use bzip2 any more, except of backwards compatibility. Furthermore, LZMA2 was desiged with multithreading in mind and many implementations by default make use of multicore CPUs (unfortunately xz on Linux does not do this, yet). This makes sense since the clock speeds won't increase any more but the number of cores will.

There are multithreaded bzip2 implementations (e.g. pbzip ), but they are often not installed by default. Also note that multithreaded bzip2 only really pays off while compressing, whereas decompression uses a single thread if the file was compressed using a single-threaded bzip2, in contrast to LZMA2. Parallel bzip2 variants can only leverage multicore CPUs if the file was compressed using a parallel bzip2 version, which is often not the case.
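For completeness, the multithreaded tools mentioned above are invoked like this (pigz and pbzip2 are separate packages; xz gained its own -T option in version 5.2, after this answer was written):

pigz -9 big.tar          # parallel gzip
pbzip2 -9 big.tar        # parallel bzip2
xz -T0 big.tar           # xz 5.2+: -T0 spawns one thread per core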

Slyx ,Jan 6, 2014 at 19:14

Short answer: xz is more efficient in terms of compression ratio, so it saves disk space and optimizes the transfer through the network.
You can see this Quick Benchmark to discover the difference by practical tests.

Mark Warburton ,Apr 14, 2016 at 14:15

LZMA2 is a block compression system whereas gzip is not. This means that LZMA2 lends itself to multi-threading. Also, if corruption occurs in an archive, you can generally recover data from subsequent blocks with LZMA2 but you cannot do this with gzip. In practice, you lose the entire archive with gzip subsequent to the corrupted block. With an LZMA2 archive, you only lose the file(s) affected by the corrupted block(s). This can be important in larger archives with multiple files.


[May 04, 2018] bit manipulation - Bit operations in Perl

May 04, 2018 | stackoverflow.com



Toren ,Jan 12, 2011 at 14:50

I have an attribute (32 bits long), where each bit is responsible for a specific functionality. The Perl script I'm writing should turn on the 4th bit, but preserve the previous settings of the other bits.

I use this in my program:

sub BitOperationOnAttr
{
    my $a = "";
    MyGetFunc($a);
    $a |= 0x00000008;
    MySetFunc($a);
}

** MyGetFunc/MySetFunc are my own functions that read/fix the value.

Questions:

  1. Is the usage of $a |= 0x00000008; right?
  2. How do I extract a hex value by regular expression from a string I have? For example:

"Attribute: Somestring: value (8 long (0x8))"

Michael Carman ,Jan 12, 2011 at 16:13

Your questions are not related; they should be posted separately. That makes it easier for other people with similar questions to find them. – Michael Carman Jan 12 '11 at 16:13

toolic ,Jan 12, 2011 at 16:47

Same question asked on PerlMonks: perlmonks.org/?node_id=881892toolic Jan 12 '11 at 16:47

psmears ,Jan 12, 2011 at 15:00

  1. Is the usage of $a |= 0x00000008; right?

Yes, this is fine.

  2. How do I extract a hex value by regular expression from a string? For example:

"Attribute: Somestring: value (8 long (0x8))"

I'm assuming you have a string like the above, and want to use a regular expression to extract the "0x8". In that case, something like:

if ($string =~ m/0x([0-9a-fA-F]+)/) {
    $value = hex($1);
} else {
    # string didn't match
}

should work.

Toren ,Jan 16, 2011 at 12:35

Thank you for quick answer. You show me the right way to solve the problem – Toren Jan 16 '11 at 12:35

Michael Carman ,Jan 12, 2011 at 16:32

Perl provides several ways for dealing with binary data:

Your scenario sounds like a set of packed flags. The bitwise operators are a good fit for this:

my $mask = 1 << 3;   # 0x0008
$value |=  $mask;    # set bit
$value &= ~$mask;    # clear bit
if ($value & $mask)  # check bit

vec is designed for use with bit vectors. (Each element has the same size, which must be a power of two.) It could work here as well:

vec($value, 3, 1) = 1;  # set bit
vec($value, 3, 1) = 0;  # clear bit
if (vec($value, 3, 1))  # check bit

pack and unpack are better suited for working with things like C structs or endianness.

Toren ,Jan 16, 2011 at 12:36

Thank you . Your answer is very informative – Toren Jan 16 '11 at 12:36

sdaau ,Jul 15, 2014 at 5:01

I upvoted, but there is something very important missing: vec operates on a string! If we use a number, say: $val=5; printf("b%08b",$val); (this gives b00000101 ) -- then one can see that the "check bit" syntax, say: for($ix=7;$ix>=0;$ix--) { print vec($val, $ix, 1); }; print "\n"; will not work (it gives 00110101 , which is not the same number). The correct way is to convert the number to an ASCII char, i.e. print vec(sprintf("%c", $val), $ix, 1); . – sdaau Jul 15 '14 at 5:01

[Apr 29, 2018] Clear unused space with zeros (ext3, ext4)

Notable quotes:
"... Purpose: I'd like to compress partition images, so filling unused space with zeros is highly recommended. ..."
"... Such an utility is zerofree . ..."
"... Be careful - I lost ext4 filesystem using zerofree on Astralinux (Debian based) ..."
"... If the "disk" your filesystem is on is thin provisioned (e.g. a modern SSD supporting TRIM, a VM file whose format supports sparseness etc.) and your kernel says the block device understands it, you can use e2fsck -E discard src_fs to discard unused space (requires e2fsprogs 1.42.2 or higher). ..."
"... If you have e2fsprogs 1.42.9, then you can use e2image to create the partition image without the free space in the first place, so you can skip the zeroing step. ..."
Apr 29, 2018 | unix.stackexchange.com

Grzegorz Wierzowiecki, Jul 29, 2012 at 10:02

How do I clear unused space with zeros? (ext3, ext4)

I'm looking for something smarter than

cat /dev/zero > /mnt/X/big_zero ; sync; rm /mnt/X/big_zero

Something like FSArchiver, which looks for "used space" and ignores the rest - except that I want the opposite side (the unused space).

Purpose: I'd like to compress partition images, so filling unused space with zeros is highly recommended.

Btw. For btrfs : Clear unused space with zeros (btrfs)

Mat, Jul 29, 2012 at 10:18

Check this out: superuser.com/questions/19326/Mat Jul 29 '12 at 10:18

Totor, Jan 5, 2014 at 2:57

Two different kinds of answer are possible. What are you trying to achieve? Either 1) security, by forbidding someone to read those data, or 2) optimizing compression of the whole partition, or SSD performance ( en.wikipedia.org/wiki/Trim_(computing) )? – Totor Jan 5 '14 at 2:57

enzotib, Jul 29, 2012 at 11:45

Such a utility is zerofree .

From its description:

Zerofree finds the unallocated, non-zeroed blocks in an ext2 or ext3 file-system and fills them with zeroes. This is useful if the device on which this file-system resides is a disk image. In this case, depending on the type of disk image, a secondary utility may be able to reduce the size of the disk image after zerofree has been run. Zerofree requires the file-system to be unmounted or mounted read-only.

The usual way to achieve the same result (zeroing the unused blocks) is to run "dd" to create a file full of zeroes that takes up the entire free space on the drive, and then delete this file. This has many disadvantages, which zerofree alleviates:

  • it is slow
  • it makes the disk image (temporarily) grow to its maximal extent
  • it (temporarily) uses all free space on the disk, so other concurrent write actions may fail.

Zerofree has been written to be run from GNU/Linux systems installed as guest OSes inside a virtual machine. If this is not your case, you almost certainly don't need this package.

UPDATE #1

The description of the .deb package now contains the following paragraph, which would imply this will work fine with ext4 too.

Description: zero free blocks from ext2, ext3 and ext4 file-systems Zerofree finds the unallocated blocks with non-zero value content in an ext2, ext3 or ext4 file-system and fills them with zeroes...

Grzegorz Wierzowiecki, Jul 29, 2012 at 14:08

Is this the official page of the tool: intgat.tigress.co.uk/rmy/uml/index.html ? Do you think it's safe to use with ext4? – Grzegorz Wierzowiecki Jul 29 '12 at 14:08

enzotib, Jul 29, 2012 at 14:12

@GrzegorzWierzowiecki: yes, that is the page, but for Debian and friends it is already in the repos. I used it on an ext4 partition on a virtual disk to successively shrink the disk file image, and had no problem. – enzotib Jul 29 '12 at 14:12

jlh, Mar 4, 2016 at 10:10

This isn't equivalent to the crude dd method in the original question, since it doesn't work on mounted file systems. – jlh Mar 4 '16 at 10:10

endolith, Oct 14, 2016 at 16:33

zerofree page talks about a patch that lets you do "filesystem is mounted with the zerofree option" so that it always zeros out deleted files continuously. does this require recompiling the kernel then? is there an easier way to accomplish the same thing? – endolith Oct 14 '16 at 16:33

Hubbitus, Nov 23, 2016 at 22:20

Be careful - I lost ext4 filesystem using zerofree on Astralinux (Debian based)Hubbitus Nov 23 '16 at 22:20

Anon, Dec 27, 2015 at 17:53

Summary of the methods (as mentioned in this question and elsewhere) to clear unused space on ext2/ext3/ext4; the available options differ depending on whether the file system is mounted or unmounted:

Having the filesystem unmounted will give better results than having it mounted. Discarding tends to be the fastest method when a lot of previously used space needs to be zeroed but using zerofree after the discard process can sometimes zero a little bit extra (depending on how discard is implemented on the "disk").
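For concreteness, the two workhorses look like this (device and mount point are placeholders; invocations as far as I know them):

zerofree -v /dev/sdX1    # unmounted (or read-only mounted) ext2/3/4 filesystem
fstrim -v /mnt/X         # mounted filesystem on a discard-capable device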

Making the image file smaller

If the image is in a dedicated VM format:

You will need to use an appropriate disk image tool (such as qemu-img convert src_image dst_image ) to enable the zeroed space to be reclaimed and to allow the file representing the image to become smaller.

If the image is a raw file:

One of the following techniques can be used to make the file sparse (so runs of zero stop taking up space):
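The list itself did not survive in this excerpt; two common techniques, to the best of my knowledge, are:

cp --sparse=always image.raw image.sparse.raw   # rewrite, punching holes where blocks are zero
fallocate --dig-holes image.raw                 # punch holes in place (util-linux)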

These days it might be easier to use a tool like virt-sparsify to do these steps and more in one go.

Sources

cas, Jul 29, 2012 at 11:45

sfill from secure-delete can do this and several other related jobs.

e.g.

sfill -l -l -z /mnt/X
UPDATE #1

There is a source tree that appears to be used by the ArchLinux project on github that contains the source for sfill which is a tool included in the package Secure-Delete.

Also a copy of sfill 's man page is here:

cas, Jul 29, 2012 at 12:04

that URL is obsolete. no idea where its home page is now (or even if it still has one), but it's packaged for debian and ubuntu. probably other distros too. if you need source code, that can be found in the debian archives if you can't find it anywhere else. – cas Jul 29 '12 at 12:04

mwfearnley, Jul 31, 2017 at 13:04

The obsolete manpage URL is fixed now. Looks like "Digipedia" is no longer a thing. – mwfearnley Jul 31 '17 at 13:04

psusi, Apr 2, 2014 at 15:27

If you have e2fsprogs 1.42.9, then you can use e2image to create the partition image without the free space in the first place, so you can skip the zeroing step.

mwfearnley, Mar 3, 2017 at 13:36

I couldn't (easily) find any info online about these parameters, but they are indeed given in the 1.42.9 release notes: e2fsprogs.sf.net/e2fsprogs-release.html#1.42.9mwfearnley Mar 3 '17 at 13:36

user64219, Apr 2, 2014 at 14:39

You can use sfill . It's a better solution for thin volumes.

Anthon, Apr 2, 2014 at 15:01

If you want to comment on cas answer, wait until you have enough reputation to do so. – Anthon Apr 2 '14 at 15:01

derobert, Apr 2, 2014 at 17:01

I think the answer is referring to manpages.ubuntu.com/manpages/lucid/man1/sfill.1.html ... which is at least an attempt at answering. ("online" in this case meaning "with the filesystem mounted", not "on the web"). – derobert Apr 2 '14 at 17:01

[Apr 28, 2018] tar exclude single files/directories, not patterns

The important detail about this is that the excluded file name must match exactly the notation reported by the tar listing.
Apr 28, 2018 | stackoverflow.com

Udo G ,May 9, 2012 at 7:13

I'm using tar to make daily backups of a server and want to avoid backing up the /proc and /sys system directories, but without excluding any directories named "proc" or "sys" somewhere else in the file tree.

For, example having the following directory tree (" bla " being normal files):

# find
.
./sys
./sys/bla
./foo
./foo/sys
./foo/sys/bla

I would like to exclude ./sys but not ./foo/sys .

I can't seem to find an --exclude pattern that does that...

# tar cvf /dev/null * --exclude=sys
foo/

or...

# tar cvf /dev/null * --exclude=/sys
foo/
foo/sys/
foo/sys/bla
sys/
sys/bla

Any ideas? (Linux Debian 6)

drinchev ,May 9, 2012 at 7:19

Are you sure there is no exclude? If you are using MAC OS it is a different story! Look here – drinchev May 9 '12 at 7:19

Udo G ,May 9, 2012 at 7:21

Not sure I understand your question. There is a --exclude option, but I don't know how to match it for single, absolute file names (not any file by that name) - see my examples above. – Udo G May 9 '12 at 7:21

paulsm4 ,May 9, 2012 at 7:22

Look here: stackoverflow.com/questions/984204/ – paulsm4 May 9 '12 at 7:22

CharlesB ,May 9, 2012 at 7:29

You can specify absolute paths to the exclude pattern, this way other sys or proc directories will be archived:
tar --exclude=/sys --exclude=/proc /

Udo G ,May 9, 2012 at 7:34

True, but the important detail about this is that the excluded file name must match exactly the notation reported by the tar listing. For my example that would be ./sys - as I just found out now. – Udo G May 9 '12 at 7:34
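Spelled out against the example above, that means something like:

tar cvf /dev/null . --exclude=./sys    # '.' is listed as ./sys, so the pattern must match that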

pjv ,Apr 9, 2013 at 18:14

In this case you might want to use:
--anchored --exclude=sys/\*

because in case your tar does not show the leading "/" you have a problem with the filter.

Savvas Radevic ,May 9, 2013 at 10:44

This did the trick for me, thank you! I wanted to exclude a specific directory, not all directories/subdirectories matching the pattern. bsdtar does not have "--anchored" option though, and with bsdtar we can use full paths to exclude specific folders. – Savvas Radevic May 9 '13 at 10:44

Savvas Radevic ,May 9, 2013 at 10:58

ah found it! in bsdtar the anchor is "^": bsdtar cvjf test.tar.bz2 --exclude myfile.avi --exclude "^myexcludedfolder" * – Savvas Radevic May 9 '13 at 10:58

Stephen Donecker ,Nov 8, 2012 at 19:12

Using tar you can exclude directories by placing a tag file in any directory that should be skipped.

Create tag files,

touch /sys/.exclude_from_backup
touch /proc/.exclude_from_backup

Then,

tar -czf backup.tar.gz --exclude-tag-all=.exclude_from_backup *

pjv ,Apr 9, 2013 at 17:58

Good idea in theory but often /sys and /proc cannot be written to. – pjv Apr 9 '13 at 17:58

[Apr 27, 2018] Shell command to tar directory excluding certain files-folders

Highly recommended!
Notable quotes:
"... Trailing slashes at the end of excluded folders will cause tar to not exclude those folders at all ..."
"... I had to remove the single quotation marks in order to exclude sucessfully the directories ..."
"... Exclude files using tags by placing a tag file in any directory that should be skipped ..."
"... Nice and clear thank you. For me the issue was that other answers include absolute or relative paths. But all you have to do is add the name of the folder you want to exclude. ..."
"... Adding a wildcard after the excluded directory will exclude the files but preserve the directories: ..."
"... You can use cpio(1) to create tar files. cpio takes the files to archive on stdin, so if you've already figured out the find command you want to use to select the files the archive, pipe it into cpio to create the tar file: ..."
Apr 27, 2018 | stackoverflow.com

deepwell ,Jun 11, 2009 at 22:57

Is there a simple shell command/script that supports excluding certain files/folders from being archived?

I have a directory that needs to be archived, with a subdirectory that has a number of very large files I do not need to back up.

Not quite solutions:

The tar --exclude=PATTERN command matches the given pattern and excludes those files, but I need specific files & folders to be ignored (full file path), otherwise valid files might be excluded.

I could also use the find command to create a list of files and exclude the ones I don't want to archive and pass the list to tar, but that only works for a small number of files. I have tens of thousands.

I'm beginning to think the only solution is to create a file with a list of files/folders to be excluded, then use rsync with --exclude-from=file to copy all the files to a tmp directory, and then use tar to archive that directory.

Can anybody think of a better/more efficient solution?

EDIT: cma 's solution works well. The big gotcha is that the --exclude='./folder' MUST be at the beginning of the tar command. Full command (cd first, so backup is relative to that directory):

cd /folder_to_backup
tar --exclude='./folder' --exclude='./upload/folder2' -zcvf /backup/filename.tgz .

Rekhyt ,May 1, 2012 at 12:55

Another thing caught me out on that, might be worth a note:

Trailing slashes at the end of excluded folders will cause tar to not exclude those folders at all. – Rekhyt May 1 '12 at 12:55

Brice ,Jun 24, 2014 at 16:06

I had to remove the single quotation marks in order to successfully exclude the directories. ( tar -zcvf gatling-charts-highcharts-1.4.6.tar.gz /opt/gatling-charts-highcharts-1.4.6 --exclude=results --exclude=target ) – Brice Jun 24 '14 at 16:06

Charles Ma ,Jun 11, 2009 at 23:11

You can have multiple exclude options for tar so
$ tar --exclude='./folder' --exclude='./upload/folder2' -zcvf /backup/filename.tgz .

etc will work. Make sure to put --exclude before the source and destination items.

shasi kanth ,Feb 27, 2015 at 10:49

As an example, if you are trying to back up your wordpress project folder, excluding the uploads folder, you can use this command:

tar -cvf wordpress_backup.tar wordpress --exclude=wp-content/uploads

Alfred Bez ,Jul 16, 2015 at 7:28

I came up with the following command: tar -zcv --exclude='file1' --exclude='pattern*' --exclude='file2' -f /backup/filename.tgz . Note that the -f flag needs to precede the name of the tar file.

flickerfly ,Aug 21, 2015 at 16:22

A "/" on the end of the exclude directory will cause it to fail. I guess tar thinks an ending / is part of the directory name to exclude. BAD: --exclude=mydir/ GOOD: --exclude=mydir – flickerfly Aug 21 '15 at 16:22

NightKnight on Cloudinsidr.com ,Nov 24, 2016 at 9:55

> Make sure to put --exclude before the source and destination items. OR use an absolute path for the exclude: tar -cvpzf backups/target.tar.gz --exclude='/home/username/backups' /home/username – NightKnight on Cloudinsidr.com Nov 24 '16 at 9:55

Johan Soderberg ,Jun 11, 2009 at 23:10

To clarify, you can use full path for --exclude. – Johan Soderberg Jun 11 '09 at 23:10

Stephen Donecker ,Nov 8, 2012 at 0:22

Possible options to exclude files/directories from backup using tar:

Exclude files using multiple patterns

tar -czf backup.tar.gz --exclude=PATTERN1 --exclude=PATTERN2 ... /path/to/backup

Exclude files using an exclude file filled with a list of patterns

tar -czf backup.tar.gz -X /path/to/exclude.txt /path/to/backup

Exclude files using tags by placing a tag file in any directory that should be skipped

tar -czf backup.tar.gz --exclude-tag-all=exclude.tag /path/to/backup

Anish Ramaswamy ,May 16, 2015 at 0:11

This answer definitely helped me! The gotcha for me was that my command looked something like tar -czvf mysite.tar.gz mysite --exclude='./mysite/file3' --exclude='./mysite/folder3' , and this didn't exclude anything. – Anish Ramaswamy May 16 '15 at 0:11

Hubert ,Feb 22, 2017 at 7:38

Nice and clear, thank you. For me the issue was that other answers include absolute or relative paths. But all you have to do is add the name of the folder you want to exclude. – Hubert Feb 22 '17 at 7:38

GeertVc ,Dec 31, 2013 at 13:35

Just want to add to the above that it is important that the directory to be excluded should NOT contain a trailing slash. So, --exclude='/path/to/exclude/dir' is CORRECT , --exclude='/path/to/exclude/dir/' is WRONG . – GeertVc Dec 31 '13 at 13:35

Eric Manley ,May 14, 2015 at 14:10

You can use standard "ant notation" to exclude directories given as relative patterns.
This works for me and excludes any .git or node_module directories.
tar -cvf myFile.tar --exclude=**/.git/* --exclude=**/node_modules/*  -T /data/txt/myInputFile.txt 2> /data/txt/myTarLogFile.txt

myInputFile.txt contains:

/dev2/java
/dev2/javascript

not2qubit ,Apr 4 at 3:24

I believe this requires the Bash shell option globstar to be enabled. Check with shopt globstar (and enable with shopt -s globstar ). I think it is off by default on most Unix-based OSes. From the Bash manual: " globstar: If set, the pattern ** used in a filename expansion context will match all files and zero or more directories and subdirectories. If the pattern is followed by a '/', only directories and subdirectories match. " – not2qubit Apr 4 at 3:24

Benoit Duffez ,Jun 19, 2016 at 21:14

Don't forget COPYFILE_DISABLE=1 when using tar, otherwise you may get ._ files in your tarball. – Benoit Duffez Jun 19 '16 at 21:14

Scott Stensland ,Feb 12, 2015 at 20:55

This exclude pattern handles filename suffixes like .png or .mp3 as well as directory names like .git and node_modules :
tar --exclude={*.png,*.mp3,*.wav,.git,node_modules} -Jcf ${target_tarball}  ${source_dirname}

Alex B ,Jun 11, 2009 at 23:03

Use the find command in conjunction with the tar append (-r) option. This way you can add files to an existing tar in a single step, instead of a two pass solution (create list of files, create tar).
find /dir/dir -prune ... -o etc etc.... -exec tar rvf ~/tarfile.tar {} \;

carlo ,Mar 4, 2012 at 15:18

To avoid possible 'xargs: Argument list too long' errors due to the use of find ... | xargs ... when processing tens of thousands of files, you can pipe the output of find directly to tar using find ... -print0 | tar --null ... .
# archive a given directory, but exclude various files & directories 
# specified by their full file paths
find "$(pwd -P)" -type d \( -path '/path/to/dir1' -or -path '/path/to/dir2' \) -prune \
   -or -not \( -path '/path/to/file1' -or -path '/path/to/file2' \) -print0 | 
   gnutar --null --no-recursion -czf archive.tar.gz --files-from -
   #bsdtar --null -n -czf archive.tar.gz -T -

Znik ,Mar 4, 2014 at 12:20

You can quote the 'exclude' string, like this: 'somedir/filesdir/*' ; then the shell won't expand the asterisks and other wildcard characters.

Tuxdude ,Nov 15, 2014 at 5:12

xargs -n 1 is another option to avoid the xargs: Argument list too long error ;) – Tuxdude Nov 15 '14 at 5:12

Aaron Votre ,Jul 15, 2016 at 15:56

I agree the --exclude flag is the right approach.
$ tar --exclude='./folder_or_file' --exclude='file_pattern' --exclude='fileA'

A word of warning for a side effect that I did not find immediately obvious: The exclusion of 'fileA' in this example will search for 'fileA' RECURSIVELY!

Example: A directory with a single subdirectory containing a file of the same name (data.txt):

data.txt
config.txt
--+dirA
  |  data.txt
  |  config.docx

Mike ,May 9, 2014 at 21:26

After reading this thread, I did a little testing on RHEL 5 and here are my results for tarring up the abc directory:

This will exclude the directories error and logs and all files under the directories:

tar cvpzf abc.tgz abc/ --exclude='abc/error' --exclude='abc/logs'

Adding a wildcard after the excluded directory will exclude the files but preserve the directories:

tar cvpzf abc.tgz --exclude='abc/error/*' --exclude='abc/logs/*' abc/
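Listing the second archive makes the difference visible; a hedged check: with the wildcard form, the directory entries abc/error/ and abc/logs/ should still appear, but nothing underneath them.

tar -tzf abc.tgz | grep -E '^abc/(error|logs)'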

camh ,Jun 12, 2009 at 5:53

You can use cpio(1) to create tar files. cpio takes the files to archive on stdin, so if you've already figured out the find command you want to use to select the files to archive, pipe it into cpio to create the tar file:
find ... | cpio -o -H ustar | gzip -c > archive.tar.gz
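
A slightly fuller sketch combining this with a prune-style find (paths here are hypothetical); GNU cpio's -0 pairs with find -print0 so filenames with spaces survive:

# skip /data/cache, archive the rest as a gzipped ustar (tar-compatible) file
find /data -path /data/cache -prune -o -type f -print0 \
  | cpio -0 -o -H ustar | gzip -c > archive.tar.gz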

frommelmak ,Sep 10, 2012 at 14:08

You can also use one of the --exclude-tag options, depending on your needs: --exclude-tag=FILE excludes the contents of any directory containing FILE, but still archives the directory itself and the tag file; --exclude-tag-under=FILE also excludes the tag file itself; and --exclude-tag-all=FILE omits the tagged directory entirely. In each case, the folder hosting the specified FILE is the one excluded.

Joe ,Jun 11, 2009 at 23:04

Your best bet is to use find with tar, via xargs (to handle the large number of arguments). For example:
find / -print0 | xargs -0 tar cjf tarfile.tar.bz2

jørgensen ,Mar 4, 2012 at 15:23

That can cause tar to be invoked multiple times - and will also pack files repeatedly. Correct is: find / -print0 | tar -T- --null --no-recursion -cjf tarfile.tar.bz2 – jørgensen Mar 4 '12 at 15:23

Stphane ,Dec 19, 2015 at 11:10

I read somewhere that when using xargs , one should use tar's r option instead of c , because when find returns lots of results, xargs will split them (based on the local command-line argument limit) into chunks and invoke tar once per chunk. With c this results in an archive containing only the last chunk returned by xargs , not all the results found by the find command. – Stphane Dec 19 '15 at 11:10

Andrew ,Apr 14, 2014 at 16:21

With GNU tar v1.26, the --exclude needs to come after the archive file and backup directory arguments, should have no leading or trailing slashes, and prefers no quotes (single or double). So, relative to the PARENT directory to be backed up, it's:

tar cvfz /path_to/mytar.tgz ./dir_to_backup --exclude=some_path/to_exclude

Ashwini Gupta ,Jan 12 at 10:30

tar -cvzf destination_archive.tar.gz source_folder -X /home/folder/excludes.txt

-X indicates a file which contains a list of patterns which must be excluded from the backup. For instance, you can specify *~ in this file to not include any filenames ending with ~ in the backup.
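
A small sketch of the exclude-file workflow (the patterns are just examples):

# build the exclude list, one pattern per line
cat > /home/folder/excludes.txt <<'EOF'
*~
*.log
tmp/*
EOF
# then archive, excluding everything that matches
tar -cvzf backup.tar.gz -X /home/folder/excludes.txt source_folder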

Georgios ,Sep 4, 2013 at 22:35

Possible redundant answer but since I found it useful, here it is:

While on a FreeBSD root (i.e. using csh) I wanted to copy my whole root filesystem to /mnt but without /usr and (obviously) /mnt. This is what worked (I am at /):

tar --exclude ./usr --exclude ./mnt --create --file - . | (cd /mnt && tar xvf -)

My whole point is that it was necessary (by putting the ./ ) to specify to tar that the excluded directories were part of the greater directory being copied.

My €0.02

user2792605 ,Sep 30, 2013 at 20:07

I had no luck getting tar to exclude a 5-gigabyte subdirectory a few levels deep. In the end, I just used the Unix zip command. It was a lot easier for me.

So for this particular example from the original post
(tar --exclude='./folder' --exclude='./upload/folder2' -zcvf /backup/filename.tgz . )

The equivalent would be:

zip -r /backup/filename.zip . -x upload/folder/**\* upload/folder2/**\*

(NOTE: Here is the post I originally used that helped me https://superuser.com/questions/312301/unix-zip-directory-but-excluded-specific-subdirectories-and-everything-within-t )

t0r0X ,Sep 29, 2014 at 20:25

Beware: zip does not pack empty directories, but tar does! – t0r0X Sep 29 '14 at 20:25

RohitPorwal ,Jul 21, 2016 at 9:56

Check it out
tar cvpzf zip_folder.tgz . --exclude=./public --exclude=./tmp --exclude=./log --exclude=fileName

James ,Oct 28, 2016 at 14:01

The following bash script should do the trick. It uses the answer given here by Marcus Sundman.
#!/bin/bash

echo -n "Please enter the name of the tar file you wish to create (without extension) "
read nam

echo -n "Please enter the path to the directories to tar "
read pathin

echo tar -czvf $nam.tar.gz
excludes=`find $pathin -iname "*.CC" -exec echo "--exclude \'{}\'" \;|xargs`
echo $pathin

echo tar -czvf $nam.tar.gz $excludes $pathin

This will print out the command you need and you can just copy and paste it back in. There is probably a more elegant way to provide it directly to the command line.

Just change *.CC to any other common extension, file name or pattern you want to exclude and this should still work.

EDIT

Just to add a little explanation: find generates a list of files matching the chosen pattern (in this case *.CC ). This list is passed via xargs to the echo command, which prints --exclude 'one entry from the list' . The backslashes (\) are escape characters for the ' marks.

tripleee ,Sep 14, 2017 at 4:27

Requiring interactive input is a poor design choice for most shell scripts. Make it read command-line parameters instead and you get the benefit of the shell's tab completion, history completion, history editing, etc. – tripleee Sep 14 '17 at 4:27

tripleee ,Sep 14, 2017 at 4:38

Additionally, your script does not work for paths which contain whitespace or shell metacharacters. You should basically always put variables in double quotes unless you specifically require the shell to perform whitespace tokenization and wildcard expansion. For details, please see stackoverflow.com/questions/10067266/ – tripleee Sep 14 '17 at 4:38
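
Putting tripleee's two points together, a minimal non-interactive rework of the script might look like this (a sketch with the same hypothetical *.CC exclusion; the quoting and the array keep paths with whitespace intact):

#!/bin/bash
# usage: ./tarex.sh <tarball-name-without-extension> <path-to-tar>
name=$1
pathin=$2
excludes=()
# collect one --exclude argument per matching file, NUL-separated to survive spaces
while IFS= read -r -d '' f; do
    excludes+=( "--exclude=$f" )
done < <(find "$pathin" -iname '*.CC' -print0)
tar -czvf "$name.tar.gz" "${excludes[@]}" "$pathin"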

> ,Apr 18 at 0:31

For those who have issues with it: some versions of tar only work properly without the './' prefix in the exclude value.
$ tar --version

tar (GNU tar) 1.27.1

Command syntax that works:

tar -czvf ../allfiles-butsome.tar.gz * --exclude=acme/foo

These will not work:

$ tar -czvf ../allfiles-butsome.tar.gz * --exclude=./acme/foo
$ tar -czvf ../allfiles-butsome.tar.gz * --exclude='./acme/foo'
$ tar --exclude=./acme/foo -czvf ../allfiles-butsome.tar.gz *
$ tar --exclude='./acme/foo' -czvf ../allfiles-butsome.tar.gz *
$ tar -czvf ../allfiles-butsome.tar.gz * --exclude=/full/path/acme/foo
$ tar -czvf ../allfiles-butsome.tar.gz * --exclude='/full/path/acme/foo'
$ tar --exclude=/full/path/acme/foo -czvf ../allfiles-butsome.tar.gz *
$ tar --exclude='/full/path/acme/foo' -czvf ../allfiles-butsome.tar.gz *

[Apr 04, 2018] gzip - How can I recover files from a corrupted .tar.gz archive

Apr 04, 2018 | stackoverflow.com



Tom Melluish ,Oct 14, 2008 at 14:30

I have a large number of files in a .tar.gz archive. Checking the file type with the command
file SMS.tar.gz

gives the response

gzip compressed data - deflate method, max compression

When I try to extract the archive with gunzip, after a delay I receive the message

gunzip: SMS.tar.gz: unexpected end of file

Is there any way to recover even part of the archive?

David Segonds ,Oct 14, 2008 at 14:32

Are you sure that it is a gzip file? I would first run 'file SMS.tar.gz' to validate that.

Then I would read The gzip Recovery Toolkit page.

George ,Mar 1, 2014 at 18:57

gzrecover does not come installed on Mac OS. However, Liudvikas Bukys's method worked fine: I had tcpdump piped into gzip, killed it with Control-C, and got an unexpected EOF trying to decompress the resulting file. – George Mar 1 '14 at 18:57

Nemo ,Jun 24, 2016 at 2:49

gzip Recovery Toolkit is tremendous. Thanks! – Nemo Jun 24 '16 at 2:49

Liudvikas Bukys ,Oct 21, 2008 at 18:29

Recovery is possible but it depends on what caused the corruption.

If the file is just truncated, getting some partial result out is not too hard; just run

gunzip < SMS.tar.gz > SMS.tar.partial

which will give some output despite the error at the end.
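
If that gunzip step produces a partial tar, the usual next move is to list and extract it; tar will unpack everything up to the point of truncation and then complain (a sketch, assuming the file name from above):

tar tvf SMS.tar.partial      # see what made it through
tar xvf SMS.tar.partial      # extract the intact part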

If the compressed file has large missing blocks, it's basically hopeless after the bad block.

If the compressed file is systematically corrupted in small ways (e.g. transferring the binary file in ASCII mode, which smashes carriage returns and newlines throughout the file), it is possible to recover, but it requires quite a bit of custom programming; it's really only worth it if you have absolutely no other recourse (no backups) and the data is worth a lot of effort. (I have done it successfully.) I mentioned this scenario in a previous question.

The answers for .zip files differ somewhat, since zip archives have multiple separately-compressed members, so there's more hope (though most commercial tools are rather bogus, they eliminate warnings by patching CRCs, not by recovering good data). But your question was about a .tar.gz file, which is an archive with one big member.

JohnEye ,Oct 4, 2016 at 11:27

There will most likely be an unreadable file after this procedure. Fortunately, there is a tool to fix this and get the partial data from it too: riaschissl.bestsolution.at/2015/03/ – JohnEye Oct 4 '16 at 11:27

> ,

Here is one possible scenario that we encountered. We had a tar.gz file that would not decompress; trying to decompress it gave the error:
gzip -d A.tar.gz
gzip: A.tar.gz: invalid compressed data--format violated

I figured out that the file may have been originally uploaded over a non-binary FTP connection (we don't know for sure).

The solution was relatively simple, using the Unix dos2unix utility:

dos2unix A.tar.gz
dos2unix: converting file A.tar.gz to UNIX format ...
tar -xvf A.tar
file1.txt
file2.txt 
....etc.

It worked! This is one slim possibility, and maybe worth a try - it may help somebody out there.
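
Before reaching for dos2unix, a quick way to look for this kind of ASCII-mode damage: a gzip file should begin with the magic bytes 1f 8b, and stray 0d (carriage return) bytes scattered through the dump are the telltale sign (a hedged check, not a guarantee):

od -A x -t x1z A.tar.gz | head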

[Mar 15, 2018] Why Python Sucks

Mar 15, 2018 | dannyman.toldme.com

[Mar 15, 2018] programming languages - What are the drawbacks of Python - Software Engineering Stack Exchange

Mar 15, 2018 | softwareengineering.stackexchange.com


> ,

I use Python somewhat regularly, and overall I consider it to be a very good language. Nonetheless, no language is perfect. Here are the drawbacks in order of importance to me personally:
  1. It's slow. I mean really, really slow. A lot of times this doesn't matter, but it definitely means you'll need another language for those performance-critical bits.
  2. Nested functions kind of suck in that you can't modify variables in the outer scope. Edit: I still use Python 2 due to library support, and this design flaw irritates the heck out of me, but apparently it's fixed in Python 3 due to the nonlocal statement. Can't wait for the libs I use to be ported so this flaw can be sent to the ash heap of history for good.
  3. It's missing a few features that can be useful for library/generic code; IMHO these omissions are simplicity taken to an unhealthy extreme. The most important ones I can think of are user-defined value types (I'm guessing these can be created with metaclass magic, but I've never tried) and ref function parameters.
  4. It's far from the metal. Need to write threading primitives or kernel code or something? Good luck.
  5. While I don't mind the lack of ability to catch semantic errors upfront as a tradeoff for the dynamism that Python offers, I wish there were a way to catch syntactic errors and silly things like mistyping variable names without having to actually run the code.
  6. The documentation isn't as good as languages like PHP and Java that have strong corporate backings.

[Mar 15, 2018] Why Python Sucks Armin Ronacher's Thoughts and Writings

Mar 15, 2018 | lucumr.pocoo.org

Armin Ronacher 's Thoughts and Writings

Why Python Sucks

written on Monday, June 11, 2007

And of course also why it sucks a lot less than any other language. But it's not perfect. My personal problems with Python:

Why does it still suck less? Good question. Probably because the metaprogramming capabilities are great, the libraries are awesome, indentation-based syntax is hip, first-class functions, it's quite fast, there are many bindings (PyGTK FTW!), and the community is nice and friendly. And there is WSGI!

[Mar 13, 2018] git log - View the change history of a file using Git versioning

Mar 13, 2018 | stackoverflow.com

Richard ,Nov 10, 2008 at 15:42

How can I view the change history of an individual file in Git, complete details with what has changed?

I have got as far as:

git log -- [filename]

which shows me the commit history of the file, but how do I get at the content of each of the file changes?

I'm trying to make the transition from MS SourceSafe, where this used to be a simple right-click → Show History.

chris ,May 10, 2010 at 8:58

The above link is no longer valid. This link is working today: Git Community Book – chris May 10 '10 at 8:58

Claudio Acciaresi ,Aug 24, 2009 at 12:05

For this I'd use:
gitk [filename]

or to follow filename past renames

gitk --follow [filename]

Egon Willighagen ,Apr 6, 2010 at 15:50

But I'd rather have a tool that combined the above with 'git blame', allowing me to browse the source of a file as it changes over time... – Egon Willighagen Apr 6 '10 at 15:50

Dan Moulding ,Mar 30, 2011 at 23:17

Unfortunately, this doesn't follow the history of the file past renames. – Dan Moulding Mar 30 '11 at 23:17

Florian Gutmann ,Apr 26, 2011 at 9:05

I was also looking for the history of files that were previously renamed and found this thread first. The solution is to use "git log --follow <filename>" as Phil pointed out here . – Florian Gutmann Apr 26 '11 at 9:05

mikemaccana ,Jul 18, 2011 at 15:17

The author was looking for a command line tool. While gitk comes with GIT, it's neither a command line app nor a particularly good GUI. – mikemaccana Jul 18 '11 at 15:17

hdgarrood ,May 13, 2013 at 14:57

Was he looking for a command line tool? "right click -> show history" certainly doesn't imply it. – hdgarrood May 13 '13 at 14:57

VolkA ,Nov 10, 2008 at 15:56

You can use
git log -p filename

to let git generate the patches for each log entry.

See

git help log

for more options - it can actually do a lot of nice things :) To get just the diff for a specific commit you can

git show HEAD

or any other revision by identifier. Or use

gitk

to browse the changes visually.

Jonas Byström ,Feb 17, 2011 at 17:13

git show HEAD shows all files, do you know how to track an individual file (as Richard was asking for)? – Jonas Byström Feb 17 '11 at 17:13

Marcos Oliveira ,Feb 9, 2012 at 21:44

You use: git show <revision> -- filename , which will show the diffs for that revision, in case one exists. – Marcos Oliveira Feb 9 '12 at 21:44

Raffi Khatchadourian ,May 9, 2012 at 22:29

--stat is also helpful. You can use it together with -p. – Raffi Khatchadourian May 9 '12 at 22:29

Paulo Casaretto ,Feb 27, 2013 at 18:05

This is great. gitk does not behave well when specifying paths that do not exist anymore. I used git log -p -- path . – Paulo Casaretto Feb 27 '13 at 18:05

ghayes ,Jul 21, 2013 at 19:28

Plus gitk looks like it was built by the boogie monster. This is a great answer and is best tailored to the original question. – ghayes Jul 21 '13 at 19:28

Dan Moulding ,Mar 30, 2011 at 23:25

git log --follow -p -- file

This will show the entire history of the file (including history beyond renames and with diffs for each change).

In other words, if the file named bar was once named foo , then git log -p bar (without the --follow option) will only show the file's history up to the point where it was renamed -- it won't show the file's history when it was known as foo . Using git log --follow -p bar will show the file's entire history, including any changes to the file when it was known as foo . The -p option ensures that diffs are included for each change.
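
A throwaway demonstration of the difference (all names here are hypothetical):

# in a scratch repository
git init demo && cd demo
echo v1 > foo && git add foo && git commit -m 'add foo'
git mv foo bar && git commit -m 'rename foo to bar'
echo v2 >> bar && git commit -am 'edit bar'
git log -p bar             # history only since the rename
git log --follow -p bar    # full history, including its life as foo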

Raffi Khatchadourian ,May 9, 2012 at 22:29

--stat is also helpful. You can use it together with -p. – Raffi Khatchadourian May 9 '12 at 22:29

zzeroo ,Sep 6, 2012 at 14:11

Dan's answer is the only real one! git log --follow -p file – zzeroo Sep 6 '12 at 14:11

Trevor Boyd Smith ,Sep 11, 2012 at 18:54

I agree this is the REAL answer. (1.) --follow ensures that you see file renames (2.) -p ensures that you see how the file gets changed (3.) it is command line only. – Trevor Boyd Smith Sep 11 '12 at 18:54

Dan Moulding ,May 28, 2015 at 16:10

@Benjohn The -- option tells Git that it has reached the end of the options and that anything that follows -- should be treated as an argument. For git log this only makes any difference if you have a path name that begins with a dash. Say you wanted to know the history of a file that has the unfortunate name "--follow": git log --follow -p -- --follow – Dan Moulding May 28 '15 at 16:10

NHDaly ,May 30, 2015 at 6:03

@Benjohn: Normally, the -- is useful because it can also guard against any revision names that match the filename you've entered, which can actually be scary. For example: If you had both a branch and a file named foo , git log -p foo would show the git log history up to foo , not the history for the file foo . But @DanMoulding is right that since the --follow command only takes a single filename as its argument, this is less necessary since it can't be a revision . I just learned that. Maybe you were right to leave it out of your answer then; I'm not sure. – NHDaly May 30 '15 at 6:03

Falken ,Jun 7, 2012 at 10:23

If you prefer to stay text-based, you may want to use tig .

Quick Install:

Use it to view history on a single file: tig [filename]
Or browse detailed repo history: tig

Similar to gitk but text based. Supports colors in terminal!

Tom McKenzie ,Oct 24, 2012 at 5:28

Excellent text-based tool, great answer. I freaked out when I saw the dependencies for gitk installing on my headless server. Would upvote again A+++ – Tom McKenzie Oct 24 '12 at 5:28

gloriphobia ,Oct 27, 2017 at 12:05

You can look at specific files with tig too, i.e. tig -- path/to/specific/file – gloriphobia Oct 27 '17 at 12:05

farktronix ,Nov 11, 2008 at 6:12

git whatchanged -p filename is also equivalent to git log -p filename in this case.

You can also see when a specific line of code inside a file was changed with git blame filename . This will print out a short commit id, the author, the timestamp, and the complete line of code for every line in the file. This is very useful after you've found a bug and you want to know when it was introduced (or whose fault it was).
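
For large files, blame can be restricted to a line range with -L, which keeps the output manageable (the line numbers here are arbitrary):

git blame -L 120,140 filename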

rockXrock ,Mar 8, 2013 at 9:45

+1, but filename is not optional in command git blame filename . – rockXrock Mar 8 '13 at 9:45

ciastek ,Mar 18, 2014 at 8:03

"New users are encouraged to use git-log instead. (...) The command is kept primarily for historical reasons;" – ciastek Mar 18 '14 at 8:03

Mark Fox ,Jul 30, 2013 at 18:55

SourceTree users

If you use SourceTree to visualize your repository (it's free and quite good) you can right click a file and select Log Selected

The display (below) is much friendlier than gitk and most the other options listed. Unfortunately (at this time) there is no easy way to launch this view from the command line -- SourceTree's CLI currently just opens repos.

Chris ,Mar 13, 2015 at 13:07

I particularly like the option "Follow renamed files", which allows you to see if a file was renamed or moved. – Chris Mar 13 '15 at 13:07

Sam Lewallen ,Jun 30, 2015 at 6:16

but unless i'm mistaken (please let me know!), one can only compare two versions at a time in the gui? Are there any clients which have an elegant interface for diffing several different versions at once? Possibly with a zoom-out view like in Sublime Text? That would be really useful I think. – Sam Lewallen Jun 30 '15 at 6:16

Mark Fox ,Jun 30, 2015 at 18:47

@SamLewallen If I understand correctly you want to compare three different commits? This sounds similar to a three-way merge (mine, yours, base) -- usually this strategy is used for resolving merge conflicts, not necessarily comparing three arbitrary commits. There are many tools that support three-way merges ( stackoverflow.com/questions/10998728/ ) but the trick is feeding these tools the specific revisions: gitready.com/intermediate/2009/02/27/ – Mark Fox Jun 30 '15 at 18:47

Sam Lewallen ,Jun 30, 2015 at 19:02

Thanks Mark Fox, that's what I mean. Do you happen to know of any applications that will do that? – Sam Lewallen Jun 30 '15 at 19:02

AechoLiu ,Jan 25 at 6:58

You save my life. You can use gitk to find the SHA1 hash, and then open SourceTree to enter Log Selected.. based on the found SHA1 . – AechoLiu Jan 25 at 6:58

yllohy ,Aug 11, 2010 at 13:01

To show what revision and author last modified each line of a file:
git blame filename

or if you want to use the powerful blame GUI:

git gui blame filename

John Lawrence Aspden ,Dec 5, 2012 at 18:38

Summary of other answers after reading through them and playing a bit:

The usual command line command would be

git log --follow --all -p dir/file.c

But you can also use either gitk (gui) or tig (text-ui) to give much more human-readable ways of looking at it.

gitk --follow --all -p dir/file.c

tig --follow --all -p dir/file.c

Under debian/ubuntu, the install command for these lovely tools is as expected :

sudo apt-get install gitk tig

And I'm currently using:

alias gdf='gitk --follow --all -p'

so that I can just type gdf dir to get a focussed history of everything in subdirectory dir .

PopcornKing ,Feb 25, 2013 at 17:11

I think this is a great answer. Maybe you aren't getting voted up as much because you show other (IMHO better) ways to see the changes, i.e. via gitk and tig, in addition to git. – PopcornKing Feb 25 '13 at 17:11

parasrish ,Aug 16, 2016 at 10:04

Just to add to the answer: locate the path (in git space) up to the part that still exists in the repository, then use the command stated above, "git log --follow --all -p <folder_path/file_path>". It may be the case that the file/folder was removed at some point in the history, so locate the deepest path that still exists and try to fetch its history. Works! – parasrish Aug 16 '16 at 10:04

cregox ,Mar 18, 2017 at 9:44

--all is for all branches, the rest is explained in @Dan's answer – cregox Mar 18 '17 at 9:44

Palesz ,Jun 26, 2013 at 20:12

Add this alias to your .gitconfig:
[alias]
    lg = log --all --graph --pretty=format:'%Cred%h%Creset -%C(yellow)%d%Creset %s %Cgreen(%cr) %C(bold blue)<%an>%Creset' --abbrev-commit --date=relative

And use the command like this:

> git lg
> git lg -- filename

The output will look almost exactly the same as the gitk output. Enjoy.

jmbeck ,Jul 22, 2013 at 14:40

After I ran that lg shortcut, I said (and I quote) "Beautiful!". However, note that a stray "\n" after "--graph" would be an error. – jmbeck Jul 22 '13 at 14:40

Egel ,Mar 27, 2015 at 12:11

git lg -p filename can also be used – it returns a beautiful diff of the searched file. – Egel Mar 27 '15 at 12:11

Jian ,Nov 19, 2012 at 6:25

I wrote git-playback for this exact purpose
pip install git-playback
git playback [filename]

This has the benefit of both displaying the results in the command line (like git log -p ) while also letting you step through each commit using the arrow keys (like gitk ).

George Anderson ,Sep 17, 2010 at 16:50

Or:

gitx -- <path/to/filename>

if you're using gitx

Igor Ganapolsky ,Sep 4, 2011 at 16:19

For some reason my gitx opens up blank. – Igor Ganapolsky Sep 4 '11 at 16:19

zdsbs ,Jan 3, 2014 at 18:17

@IgorGanapolsky you have to make sure you're at the root of your git repository – zdsbs Jan 3 '14 at 18:17

Adi Shavit ,Aug 7, 2012 at 13:57

If you want to see the whole history of a file, including on all other branches use:
gitk --all <filename>

lang2 ,Nov 10, 2015 at 11:53

Lately I discovered tig and found it very useful. There are some cases where I wish it did A or B, but most of the time it's rather neat.

For your case, tig <filename> might be what you're looking for.

http://jonas.nitro.dk/tig/

PhiLho ,Nov 28, 2012 at 15:58

With the excellent Git Extensions , you go to a point in the history where the file still existed (if it has been deleted; otherwise just go to HEAD), switch to the File tree tab, right-click on the file and choose File history .

By default, it follows the file through renames, and the Blame tab allows you to see the name at a given revision.

It has some minor gotchas, like showing fatal: Not a valid object name in the View tab when clicking on the deletion revision, but I can live with that. :-)

Evan Hahn ,May 9, 2013 at 20:39

Worth noting that this is Windows-only. – Evan Hahn May 9 '13 at 20:39

Shmil The Cat ,Aug 9, 2015 at 17:42

@EvanHahn not accurate; via Mono one can use GitExtensions on Linux too. We use it on Ubuntu and are quite happy with it. See git-extensions-documentation.readthedocs.org/en/latest/ – Shmil The Cat Aug 9 '15 at 17:42

cori ,Nov 10, 2008 at 15:56

If you're using the git GUI (on Windows) under the Repository menu you can use "Visualize master's History". Highlight a commit in the top pane and a file in the lower right and you'll see the diff for that commit in the lower left.

jmbeck ,Jul 22, 2013 at 14:42

How does this answer the question? – jmbeck Jul 22 '13 at 14:42

cori ,Jul 22, 2013 at 15:34

Well, OP didn't specify command line, and moving from SourceSafe (which is a GUI) it seemed relevant to point out that you could do pretty much the same thing that you can do in VSS in the Git GUI on Windows. – cori Jul 22 '13 at 15:34

Malks ,Dec 1, 2011 at 5:24

The answer I was looking for that wasn't in this thread is to see changes in files that I'd staged for commit. i.e.
git diff --cached

ghayes ,Jul 21, 2013 at 19:47

If you want to include local (unstaged) changes, I often run git diff origin/master to show the complete differences between your local branch and the master branch (which can be updated from remote via git fetch ) – ghayes Jul 21 '13 at 19:47

Brad Koch ,Feb 22, 2014 at 0:04

-1, That's a diff, not a change history. – Brad Koch Feb 22 '14 at 0:04

user3885927 ,Aug 12, 2015 at 21:22

If you use TortoiseGit you should be able to right click on the file and do TortoiseGit --> Show Log . In the window that pops up, make sure:

Noam Manos ,Nov 30, 2015 at 10:54

TortoiseGit (and Eclipse Git as well) somehow misses revisions of the selected file, don't count on it! – Noam Manos Nov 30 '15 at 10:54

user3885927 ,Nov 30, 2015 at 23:20

@NoamManos, I haven't encountered that issue, so I cannot verify if your statement is correct. – user3885927 Nov 30 '15 at 23:20

Noam Manos ,Dec 1, 2015 at 12:06

My mistake, it only happens in Eclipse, but in TortoiseGit you can see all revisions of a file if unchecking "show all project" + checking "all branches" (in case the file was committed on another branch, before it was merged to main branch). I'll update your answer. – Noam Manos Dec 1 '15 at 12:06

Lukasz Czerwinski ,May 20, 2013 at 17:17

git diff -U <filename> gives you a unified diff.

It should be colored in red and green. If it's not, run git config color.ui auto first.

jitendrapurohit ,Aug 13, 2015 at 5:41

You can also try this, which lists the commits that have changed a specific part of a file (implemented in Git 1.8.4).

The result returned is the list of commits that modified this particular part. The command goes like:

git log --pretty=short -u -L <upperLimit>,<lowerLimit>:<path_to_filename>

where upperLimit is the start line number and lowerLimit is the ending line number in the file.
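
A concrete invocation (file name and line range hypothetical); newer Git versions also accept a function name in place of the numeric range:

git log --pretty=short -u -L 40,60:src/parser.c
git log -L :parse_header:src/parser.c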

Antonín Slejška ,Jun 1, 2016 at 10:44

SmartGit :
  1. In the menu, enable displaying unchanged files: View / Show unchanged files
  2. Right click the file and select 'Log' or press 'Ctrl-L'

AhHatem ,Jan 2, 2013 at 19:35

If you are using eclipse with the git plugin, it has an excellent comparison view with history. Right click the file and select "compare with"=> "history"

avgvstvs ,Sep 27, 2013 at 13:22

That won't allow you to find a deleted file however. – avgvstvs Sep 27 '13 at 13:22

golimar ,Oct 30, 2012 at 14:35

Comparing two versions of a file is different from viewing the change history of a file. – golimar May 7 '15 at 10:22

[Dec 06, 2017] Install R on RedHat errors on dependencies that don't exist

Highly recommended!
Dec 06, 2017 | stackoverflow.com

Jon ,Jul 11, 2014 at 23:55

I have installed R before on a machine running RedHat EL6.5, but I recently had a problem installing new packages (i.e. install.packages()). Since I couldn't find a solution to this, I tried reinstalling R using:
sudo yum remove R

and

sudo yum install R

But now I get:

....
---> Package R-core-devel.x86_64 0:3.1.0-5.el6 will be installed
--> Processing Dependency: blas-devel >= 3.0 for package: R-core-devel-3.1.0-5.el6.x86_64
--> Processing Dependency: libicu-devel for package: R-core-devel-3.1.0-5.el6.x86_64
--> Processing Dependency: lapack-devel for package: R-core-devel-3.1.0-5.el6.x86_64
---> Package xz-devel.x86_64 0:4.999.9-0.3.beta.20091007git.el6 will be installed
--> Finished Dependency Resolution
Error: Package: R-core-devel-3.1.0-5.el6.x86_64 (epel)
           Requires: blas-devel >= 3.0
Error: Package: R-core-devel-3.1.0-5.el6.x86_64 (epel)
       Requires: lapack-devel
Error: Package: R-core-devel-3.1.0-5.el6.x86_64 (epel)
       Requires: libicu-devel
 You could try using --skip-broken to work around the problem
 You could try running: rpm -Va --nofiles --nodigest

I already checked, and blas-devel is installed, but the newest version is 0.2.8. Checked using:

yum info openblas-devel.x86_64

Any thoughts as to what is going wrong? Thanks.

Scott Ritchie ,Jul 12, 2014 at 0:31

A cursory search of blas-devel in google shows that the latest version is at least version 3.2. You probably used to have an older version of R installed, and the newer version depends on a version of BLAS not available in RedHat? – Scott Ritchie Jul 12 '14 at 0:31

bdemarest ,Jul 12, 2014 at 0:31

Can solve this by sudo yum install lapack-devel , etc.. until the errors stop. – bdemarest Jul 12 '14 at 0:31

Jon ,Jul 14, 2014 at 4:08

sudo yum install lapack-devel does not work. Returns: No package lapack-devel available. Scott - you are right that blas-devel is not available in yum. What is the best way to fix this? – Jon Jul 14 '14 at 4:08

Owen ,Aug 27, 2014 at 18:33

I had the same issue. Not sure why these packages are missing from RHEL's repos, but they are in CentOS 6.5, so the following solution works if you want to keep things in the package paradigm:
wget http://mirror.centos.org/centos/6/os/x86_64/Packages/lapack-devel-3.2.1-4.el6.x86_64.rpm
wget http://mirror.centos.org/centos/6/os/x86_64/Packages/blas-devel-3.2.1-4.el6.x86_64.rpm
wget http://mirror.centos.org/centos/6/os/x86_64/Packages/texinfo-tex-4.13a-8.el6.x86_64.rpm
wget http://mirror.centos.org/centos/6/os/x86_64/Packages/libicu-devel-4.2.1-9.1.el6_2.x86_64.rpm
sudo yum localinstall *.rpm

cheers


UPDATE: Leon's answer is better -- see below.

DavidJ ,Mar 23, 2015 at 19:50

When installing texinfo-tex-5.1-4.el7.x86_64, it complains about requiring tex(epsd.tex), but I've no idea which package supplies that. This is on RHEL7, obviously (and using CentOS7 packages). – DavidJ Mar 23 '15 at 19:50

Owen ,Mar 24, 2015 at 21:07

Are you trying to install using rpm or yum? yum should attempt to resolve dependencies. – Owen Mar 24 '15 at 21:07

DavidJ ,Mar 25, 2015 at 14:18

It was yum complaining. Adding the analogous CentOS repo to /etc/yum.repos.d temporarily, installing just the missing dependencies, then removing it and installing R fixed the issue. It is apparently an issue/bug with the RHEL package dependencies. I had to be careful to ensure that all other packages came from the RHEL repos, not CentOS; hence it is not a good idea to install R itself while the CentOS repo is active. – DavidJ Mar 25 '15 at 14:18

Owen ,Mar 26, 2015 at 4:49

Glad you figured it out. When I stumbled on this last year I was also surprised that the Centos repos seemed more complete than RHEL. – Owen Mar 26 '15 at 4:49

Dave X ,May 28, 2015 at 19:33

They are in the RHEL optional RPMs. See Leon's answer. – Dave X May 28 '15 at 19:33

Leon ,May 21, 2015 at 18:38

Do the following:
  1. vim /etc/yum.repos.d/redhat.repo
  2. Change enabled = 0 in [rhel-6-server-optional-rpms] section of the file to enabled=1
  3. yum install R

DONE!
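
The same toggle can be done without editing the repo file by hand, assuming the yum-utils package (which provides yum-config-manager) is installed:

yum-config-manager --enable rhel-6-server-optional-rpms
yum install R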

I should credit the site where I found this solution:

https://bluehatrecord.wordpress.com/2014/10/13/installing-r-on-red-hat-enterprise-linux-6-5/

Dave X ,May 28, 2015 at 19:31

Works for RHEL7 with [rhel-7-server-optional-rpms] change too. – Dave X May 28 '15 at 19:31

Jon ,Aug 4, 2014 at 4:49

The best solution I could come up with was to install from source. This worked and was not too bad. However, now it isn't in my package manager.

[Dec 06, 2017] Download RStudio Server -- RStudio

Dec 06, 2017 | www.rstudio.com
RStudio Server v0.99 requires RedHat or CentOS version 5.4 (or higher) as well as an installation of R. You can install R for RedHat and CentOS using the instructions on CRAN: https://cran.rstudio.com/bin/linux/redhat/README .

RedHat/CentOS 6 and 7

To download and install RStudio Server open a terminal window and execute the commands corresponding to the 32 or 64-bit version as appropriate.

64bit
Size: 43.5 MB MD5: 1e973cd9532d435d8a980bf84ec85c30 Version: 1.1.383 Released: 2017-10-09

$ wget https://download2.rstudio.org/rstudio-server-rhel-1.1.383-x86_64.rpm
$ sudo yum install --nogpgcheck rstudio-server-rhel-1.1.383-x86_64.rpm
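
Once installed, the server ships a self-check subcommand (documented for recent RStudio Server releases; verify against your version):

$ sudo rstudio-server verify-installation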

See the Getting Started document for information on configuring and managing the server.

Read the RStudio Server Professional Admin Guide for more detailed instructions.

[Dec 06, 2017] The difficulties of moving from Python to R

Dec 06, 2017 | blog.danwin.com

This post is in response to: Python, Machine Learning, and Language Wars , by Sebastian Raschka

As someone who's switched from Ruby to Python (because the latter is far easier to teach, IMO) and who has also put significant time into learning R just to use ggplot2, I was really surprised at the lack of relevant Google results for "switching from python to r" – or similarly phrased queries. In fact, that particular query brings up more results for R to Python, e.g. " Python Displacing R as The Programming Language For Data ". The use of R is so ubiquitous in academia (and in the wild, ggplot2 tends to wow nearly on the same level as D3) that I had just assumed there were a fair number of Python/Ruby developers who had tried jumping into R. But there aren't: minimaxir's guides are the most comprehensive – and virtually the only – how-to-do-R-as-written-by-an-outsider guides I've seen on the web.

By far the most common shift seems to be Raschka's – going from R to Python:

Well, I guess it's no big secret that I was an R person once. I even wrote a book about it. So, how can I summarize my feelings about R? I am not exactly sure where this quote comes from – I picked it up from someone somewhere some time ago – but it is great for explaining the difference between R and Python: "R is a programming language developed by statisticians for statisticians; Python was developed by a computer scientist, and it can be used by programmers to apply statistical techniques." Part of the message is that both R and Python are similarly capable for "data science" tasks; however, the Python syntax simply feels more natural to me – it's a personal taste.

That said, one of the things I've appreciated about R is how it "just works": I usually install R through Homebrew, but installing RStudio via point-and-click is also straightforward. I can see why that's a huge appeal for both beginners and people who want to do computation but not necessarily become developers. Hell, I've been struggling for what feels like months to do just the most rudimentary GIS work in Python 3. But in just a couple weeks of learning R – and leveraging however it manages to package GDAL and all its other geospatial dependencies with rgdal – I've been able to create some decent geospatial visualizations (and queries):

... ... ...

I'm actually enjoying plotting with Matplotlib and seaborn, but it's hard to beat the elegance of ggplot2 – it's worth learning R just to be able to read and better understand Wickham's ggplot2 book and its explanation of the "Grammar of Graphics" . And there's nothing else quite like ggmap in other languages.

Also, I used to hate how <- was used for assignment. Now, that's one of the things I miss most about using R. I've grown up with single-equals-sign assignment in every other language I've learned, but after having to teach some programming, I've found that the difference between == and = is a common and often hugely stumping error for beginners. Not only that, they have trouble remembering how assignment even works, even for basic variable assignment. I've come to realize that I've programmed so long that I immediately recognize the pattern, but that can't possibly be the case for novices, who, if they've taken general math classes, have never seen the equals sign used that way. The <- operator makes a lot more sense, though I would never have thought that if I hadn't read Hadley Wickham's style guide .

Speaking of Wickham's style guide, one thing I wish I had done at the very early stages of learning R is to have read Wickham's Advanced R book – which is free online (and contains the style guide). Not only is it just a great read for any programmer, like everything Wickham writes, it is not at all an "advanced" book if you are coming from another language. It goes over the fundamentals of how the language is designed. For example, one major pain point for me was not realizing that R does not have scalars – things that appear to be scalars happen to be vectors of length one. This is something Wickham's book mentions in its Data structures chapter .

Another vital and easy-to-read chapter: Wickham's explanation of R's non-standard evaluation has totally illuminated for me why a programmer of Wickham's caliber enjoys building in R, and why I would find it infuriating to teach R versus Python to beginners.

(Here's another negative take on non-standard evaluation , by an R-using statistician)

FWIW, Wickham has posted a repo attempting to chart and analyze various trends and metrics about R and Python usage . I won't be that methodical; on Reddit, r/Python seems to be by far the biggest programming subreddit. At the time of writing, it has 122,690 readers . By comparison, r/ruby and r/javascript have 31,200 and 82,825 subscribers, respectively. The R-focused subreddit, r/rstats , currently has 8,500 subscribers.

The Python community is so active on Reddit that it has its own learners subreddit – r/learnpython – with 54,300 subscribers .

From anecdotal observations, I don't think Python shows much sign of diminishing popularity on Hacker News, either. Not just because Python-language specific posts keep making the front page, but because of the general increased interest in artificial intelligence, coinciding with Google's recent release of TensorFlow , which they've even quickly ported to Python 3.x .

[Nov 16, 2017] Which is better, Perl or Python? Which one is more robust? How do they compare with each other?

Notable quotes:
"... Python is relatively constraining - in the sense that it does not give the same amount of freedom as PERL in implementing something (note I said 'same amount' - there is still some freedom to do things). But I also see that as Python's strength - by clamping down the parts of the language that could lead to chaos we can live without, I think it makes for a neat and tidy language. ..."
"... Perl 5, Python and Ruby are nearly the same, because all have copied code from each other and describe the same programming level. Perl 5 is the most backward compatible language of all. ..."
"... Python is notably slower than Perl when using regex or data manipulation. ..."
"... you could use PyCharm with Perl too ..."
May 30, 2015 | www.quora.com

Joe Pepersack , Just Another Perl Hacker Answered May 30 2015

Perl is better. Perl has almost no constraints. Its philosophy is that there is more than one way to do it (TIMTOWTDI, pronounced Tim Toady). Python artificially restricts what you can do as a programmer. Its philosophy is that there should be one way to do it. If you don't agree with Guido's way of doing it, you're sh*t out of luck.

Basically, Python is Perl with training wheels. Training wheels are a great thing for a beginner, but eventually you should outgrow them. Yes, riding without training wheels is less safe. You can wreck and make a bloody mess of yourself. But you can also do things that you can't do if you have training wheels. You can go faster and do interesting and useful tricks that aren't possible otherwise. Perl gives you great power, but with great power comes great responsibility.

A big thing that Pythonistas tout as their superiority is that Python forces you to write clean code. That's true, it does... at the point of a gun, sometimes at the detriment of simplicity or brevity. Perl merely gives you the tools to write clean code (perltidy, perlcritic, use strict, /x option for commenting regexes) and gently encourages you to use them.

Perl gives you more than enough rope to hang yourself (and not just rope, Perl gives you bungee cords, wire, chain, string, and just about any other thing you can possibly hang yourself with). This can be a problem. Python was a reaction to this, and their idea of "solving" the problem was to only give you one piece of rope and make it so short you can't possibly hurt yourself with it. If you want to tie a bungee cord around your waist and jump off a bridge, Python says "no way, bungee cords aren't allowed". Perl says "Here you go, hope you know what you are doing... and by the way here are some things that you can optionally use if you want to be safer"

Some clear advantages of Perl:

Advantages of Python
Dan Lenski , I do all my best thinking in Python, and I've written a lot of it Updated Jun 1 2015
Though I may get flamed for it, I will put it even more bluntly than others have: Python is better than Perl . Python's syntax is cleaner, its object-oriented type system is more modern and consistent, and its libraries are more consistent. ( EDIT: As Christian Walde points out in the comments, my criticism of Perl OOP is out-of-date with respect to the current de facto standard of Moo/se. I do believe that Perl's utility is still encumbered by historical baggage in this area and others.)

I have used both languages extensively for both professional work and personal projects (Perl mainly in 1999-2007, Python mainly since), in domains ranging from number crunching (both PDL and NumPy are excellent) to web-based programming (mainly with Embperl and Flask ) to good ol' munging of text files and database CRUD.

Both Python and Perl have large user communities including many programmers who are far more skilled and experienced than I could ever hope to be. One of the best things about Python is that the community generally espouses this aspect of "The Zen of Python"

Python's philosophy rejects the Perl " there is more than one way to do it " approach to language design in favor of "there should be one -- and preferably only one -- obvious way to do it".
... while this principle might seem stultifying or constraining at first, in practice it means that most good Python programmers think about the principle of least surprise and make it easier for others to read and interface with their code.

In part as a consequence of this discipline, and definitely because of Python's strong typing , and arguably because of its "cleaner" syntax, Python code is considerably easier to read than Perl code. One of the main events that motivated me to switch was the experience of writing code to automate lab equipment in grad school. I realized I couldn't read Perl code I'd written several months prior, despite the fact that I consider myself a careful and consistent programmer when it comes to coding style ( Dunning–Kruger effect , perhaps? :-P).

One of the only things I still use Perl for occasionally is writing quick-and-dirty code to manipulate text strings with regular expressions; Sai Janani Ganesan pointed out Perl's value for this as well . Perl has a lot of nice syntactic sugar and command-line options for munging text quickly*, and in fact there's one regex idiom I used a lot in Perl and for which I've never found a totally satisfying substitute in Python.


* For example, the one-liner perl -i.bak -pe 's/foo/bar/g' *.txt will go through a bunch of text files and replace foo with bar everywhere, while making backup files with the .bak extension.
David Roth , Linux sysadmin, developer, vim enthusiast Answered Jun 9, 2015
Neither language is objectively better. If you get enthused about a language - and by all means, get enthused about a language! - you're going to find aspects of it that just strike you as beautiful. Other people who don't share your point of view may find that same aspect pointless. Opinions will vary widely. But the sentence in your question that attracted my attention, and which forms the basis of my answer, is "But, most of my teammates use Perl."

If you have enough charisma to sway your team to Python, great. But in your situation, Perl is probably better . Why? Because you're part of a team, and a team is more effective when it can communicate easily. If your team has a significant code base written in Perl, you need to be fluent in it. And you need to contribute to the code base in that same language. Life will be easier for you and your team.

Now, it may be that you've got some natural lines of demarcation in your areas of interest where it makes sense to write code for one problem domain in Perl and for another in Python - I've seen this kind of thing before, and as long as the whole team is on board, it works nicely. But if your teammates universally prefer Perl, then that is what you should focus on.

It's not about language features, it's about team cohesion.

Debra Klopfenstein , Used C++/VHDL/Verilog as an EE. now its Python for BMEng Answered Jul 22, 2015
I used Perl since about 1998 and almost no Python until 2013. I have now (almost) completely switched over to Python (written 10s of thousands of lines already for my project) and now only use Perl one-liners on the Linux command line when I want to use regex and format the print results when grepping through multiple files in my project. Perl's one-liners are great.

This surprisingly easy transition to fully adopt Python and pretty much drop Perl simply would not be have been possible without the Python modules, collections and re (regex package). Python's numpy, matplotlib, and scipy help seal the deal.

The collections package makes complex variables even easier to create than in Perl. It was created in 2004, I think. The regex package, re , works great, but I wish it was built into the Python language like it is in Perl, because regex usage is smooth in Perl and clunky(er) in Python.

OOP is super easy and not as verbose as in C++, so I find it faster to write, if needed in Python.

I've drank the Kool-Aid and there is no going back. Python is it. (for now)

Sai Janani Ganesan , Postdoctoral Scholar at UCSF Updated May 25, 2015
I like and use both; a few quick points: I can't think of anything that you can do with Perl that you can't with Python. If I were you, I'd stick with Python.
Reese Currie , Professional programmer for over 25 years Answered May 26, 2015
It's kind of funny; back in the early 1970's, C won out over Pascal despite being much more cryptic. They actually have the same level of control over the machine, Pascal is just more verbose about it, and so it's quicker to write little things in C. Today, with Python and Perl, the situation is reversed; Python, the far less cryptic of the two, has won out over Perl. It shows how values change over time, I suppose.

One of the values of Python is the readability of the code. It's certainly a better language for receiving someone else's work and being able to comprehend it. I haven't had the problem where I can't read my own Perl scripts years later, but that's a matter of fairly strict discipline on my part. I've certainly received some puzzling and unreadable code from other Perl developers. I rather hate PowerShell, in part because the messy way it looks on-screen and its clumsy parameter passing reminds me of Perl.

For collaboration, the whole team would do better on Python than on Perl because of the inherent code readability. Python is an extreme pain in the neck to write without a syntax-aware editor to help with all the whitespace, and that could create a barrier for some of your co-workers. Also Python isn't as good for a really quick-and-dirty script, because nothing does quick (or dirty) better than Perl. There are a number of things you can do in Perl more quickly and I've done some things with Perl I probably wouldn't be interested in even trying in Python. But if I'm doing a scripting task today, I'll consider Python and even Scala in scripting mode before I'll consider Perl.

I'll take a guess that your co-workers are on average older than you, and probably started Perl before Python came along. There's a lot of value in both. Don't hate Perl because of your unfamiliarity with it, and if it's a better fit for the task, maybe you will switch to Perl. It's great to have the choice to use something else, but on the other hand, you may pick up an awful lot of maintenance burden if you work principally in a language nobody else in your company uses.

Abhishek Ghose , works at [24]7 Ai. Answered Jun 11, 2015
I have used Perl and Python both. Perl from 2004-2007, and Python, early 2009 onward. Both are great, malleable languages to work with. I would refrain from making any comments on the OOP model of PERL since my understanding is most likely out-of-date now.

Library-wise PERL and Python both have a fantastic number of user-contributed libraries. In the initial days of Python, PERL definitely had an edge in this regard - you could find almost anything in its library repository CPAN, but I am not sure whether this edge exists anymore; I think not.

Python is relatively constraining - in the sense that it does not give the same amount of freedom as PERL in implementing something (note I said 'same amount' - there is still some freedom to do things). But I also see that as Python's strength - by clamping down the parts of the language that could lead to chaos we can live without, I think it makes for a neat and tidy language.

I loved PERL to death in the years I used it. I was an ardent proselytizer. My biggest revelation/disappointment around the language came when I was asked to revisit a huge chunk of production code I had written a mere 6 months ago. I had a hard time understanding various parts of the code; most of it my own code! I realized that with the freedom that PERL offers, you and your team would probably work better (i.e. write code that's maintainable ) if you also had some coding discipline to go with it. So although PERL provides you a lot of freedom, it is difficult to make productive use of it unless you bring in your own discipline (why not Python then?) or you are so good/mature that any of the many ways to do something in the language is instantly comprehensible to you (i.e. a steeper learning curve if you are not looking forward to brain-teasers during code-maintenance).

The above is not an one-sided argument; it so happened that some years later I was in a similar situation again; only this time the language was Python. This time around I was easily able to understand the codebase. The consistency of doing things in Python helps. Emerson said consistency is the hobgoblin of little minds , but maybe faced with the daunting task of understanding huge legacy codebases we are relatively little minds. Or maybe its just me (actually not, speaking from experience :) ).

All said and done, I am still a bit miffed at the space/tab duality in Python :)

Jay Jarman , BS, MS, and PhD in Computer Science. Currently an Asst Prof in CS Answered Sep 5, 2015
You're asking the wrong question, because the answer to your question is: it depends. There is no best language. It depends on what you're going to use it for. About 25 years ago, some bureaucrat decided that every system written by the DoD was to be written in Ada. The purpose was that we'd only have to train software developers in one language. The DoD could painlessly transfer developers from project to project. The only problem was that Ada wasn't the best language for all situations. So: what problem do you wish to solve by programming? That will help determine which language is best.
Steffen Winkler , software developer Answered Nov 24, 2015
Perl 5, Python and Ruby are nearly the same, because all have copied code from each other and describe the same programming level. Perl 5 is the most backward compatible language of all.

... ... ...

Mauro Lacy , Software Developer Answered Jun 8, 2015
Python is much better, if only because it's much more readable. Python programs are therefore easily modifiable, and then more maintainable, extensible, etc.

Perl is a powerful language, and a Perl script can be fun to write. But it is a pain to read, even by the one who wrote it, after just a couple of months, or even weeks. Perl code usually looks like (and in many cases just is) a hack to get something done quickly and "easily".

What is still confusing and annoying in Python are the many different frameworks employed to build and install modules, and the fact that much code and many modules are still in beta stage, unavailable on certain platforms, or have difficult or almost impossible to satisfy binary (and also non-binary, i.e. due to different/incompatible versions) dependencies. Of course, this is related to the fact that Python is still a relatively young language. Python 3 vs. 2 incompatibilities, though justifiable and logical, also don't help in this regard.

Jim Abraham , worked at Takeda Pharmaceuticals Answered Jun 16

Perl is the unix of programming languages. Those who don't know it are condemned to re-implement it (and here I'm looking at you, Guido), badly. I've read a lot of Python books, and many of the things they crow about as being awesome in Python have existed in Perl since the beginning. And most of those Perl stole from sed or awk or lisp or shell. Sometimes I think Pythonistas are simply kids who don't know any other languages, so of course they think Python is awesome.

Eran Zimbler , works at Cloud of Things (2016-present) Answered May 30, 2015
As someone who worked professionally with Perl and moved to Python only two years ago, the answer sounds simple: work with whatever feels more comfortable. However, if your teammates use Perl, it would be better to learn Perl in order to share code and avoid creating code that cannot be reused.

In terms of comparison:

1. Python is object-oriented by design; Perl can be object-oriented, but it is not a must.
2. Python has a very good standard library; Perl has CPAN, with almost everything.
3. Perl is everywhere, and in most cases you won't need a newer perl for most CPAN modules; Python is a bit more problematic in this regard.
4. Python is more readable after a month than Perl.

There are other reasons, but those are the first that come to mind.

Alain Debecker , Carbon based biped Answered May 22, 2015
Your question sounds a bit like "Which is better, a boat or a car?". You simply do not do the same things with them. Or, more precisely, some things are easier to do with Perl and others are easier with Python. Neither of them is robust (compare with Eiffel or Oberon; if you have never heard of these, it is because robustness is not so important to you). So learn both, and choose for yourself. And also pick a nice one from http://en.wikipedia.org/wiki/Tim... (why not Julia?), a language that none of your friends knows about, and stick your tongue out at them.
Jelani Jenkins , studied at ITT Technical Institute Answered Jul 16

Python is notably slower than Perl for regex work or data manipulation. I think if you're worried about the appeal of Perl, you could use PyCharm with Perl too. Furthermore, I believe the primary reason why someone would use an interpreted language on the job is to manipulate or analyze data. Therefore, I would use Perl in order to get the job done quickly and worry about the appeal another time.
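Claims like this are easy to check against your own workload. A minimal micro-benchmark of the Python side, using only the standard library (the log-like corpus and the pattern are made up for illustration; the equivalent Perl one-liner would be the baseline to beat):

    import re
    import timeit

    # Hypothetical corpus: 10,000 copies of a log-like line.
    text = ("2017-11-04 10:15:01 host sshd[1234]: Accepted password for alice\n"
            * 10000)
    pattern = re.compile(r"sshd\[(\d+)\]: Accepted \w+ for (\w+)")

    def scan():
        # Count all matches in one pass over the corpus.
        return sum(1 for _ in pattern.finditer(text))

    runs = 10
    total = timeit.timeit(scan, number=runs)
    print(scan(), "matches,", round(total / runs, 4), "s per pass")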

Oscar Philips , BSEE Electrical Engineering Answered Jun 13

I had to learn Perl in an old position at work, where the system build scripts were written in Perl -- and we are talking system builds that would take 6 to 8 hours to run and had to run in both Unix and Windows environments.

That said, I have continued to use Perl for a number of reasons.

Yes, I now need to use Python at work, but I have actually found it more difficult to learn than Perl, especially for pulling in a web page and extracting data out of it (see the sketch below).
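For comparison, the task he describes needs only the standard library in Python. A minimal sketch (the URL is a placeholder and the extraction is deliberately simple):

    from html.parser import HTMLParser
    from urllib.request import urlopen

    # Collect the visible text of every <a> tag on a page.
    class LinkTextParser(HTMLParser):
        def __init__(self):
            super().__init__()
            self.in_link = False
            self.links = []

        def handle_starttag(self, tag, attrs):
            if tag == "a":
                self.in_link = True

        def handle_endtag(self, tag):
            if tag == "a":
                self.in_link = False

        def handle_data(self, data):
            if self.in_link and data.strip():
                self.links.append(data.strip())

    with urlopen("https://example.com/") as response:
        page = response.read().decode("utf-8", errors="replace")

    parser = LinkTextParser()
    parser.feed(page)
    print(parser.links)

(A Perl old-timer would reach for LWP plus a regex; whether the Python version above is easier or harder is exactly the disagreement in this thread.)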

Franky Yang , Bioinformatics Ph.D. Answered Dec 23, 2015

For a beginner, I suggest Python. Perl 6 has been released, and Perl 5 will be replaced by Perl 6 or Ruby. Python is easier for a beginner and more useful later in life, no matter whether you want to be a scientist or a programmer. Perl is also a powerful language, but it is only popular in the U.S., Japan, etc.

Anyway, select the one you like, and learn the other as a second-language.

Zach Phillips-Gary , College of Wooster '17, CS & Philosophy Double Major and Web Developer Answered May 20, 2015
Perl is an older language and isn't really in vogue. Python has a wider array of applications and, as a "trendy" language, greater value on the job market.
James McInnes Answered May 29, 2015
You don't give your criteria for "better". They are tools, each suited to various tasks -- sometimes interchangeable.

Python is fairly simple to learn and easy to read. The object model is simple to understand and use.

Perl allows you to be terse and handles regular expressions more naturally. The object model is complex, but has more features.

I tend to write one-off and quick-and-dirty processing tasks in Perl, and anything that might be used by someone else in Python.

Garry Taylor , Been programming since 8 bit computers Answered Dec 20, 2015

First, I'll say it's almost impossible to say if any one language is 'better' than another, especially without knowing what you're using them for, but...

Short answer: it's Python. To be honest, I don't know what you mean by "robust"; they are both established languages, been around a long time, and both are, to all intents and purposes, bug-free, especially if you're just automating a few tasks. I've not used Perl in quite some time, except when I delve into a very old project I wrote. Python has terrible syntax, and Perl has worse, in my opinion. Just to make my answer a bit more incendiary: Perl and Python both suck; Python sucks significantly less.

[Nov 16, 2017] The Decline of Perl - My Opinion

Rumors of Perl's demise were greatly exaggerated...
Notable quotes:
"... Secondly I am personally not overly concerned with what the popular language of the day is. As I commented ages ago at RE (tilly) 1: Java vs. Perl from the CB , the dream that is sold to PHBs of programmers as interchangable monkeys doesn't appeal to me, and is a proven recipe for IT disasters. See Choose the most powerful language for further discussion, and a link to an excellent article by Paul Graham. As long as I have freedom to be productive, I will make the best choice for me. Often that is Perl. ..."
"... Indeed languages like Python and Ruby borrow some of Perl's good ideas, and make them conveniently available to people who want some of the power of Perl, but who didn't happen to click with Perl. I think this is a good thing in the end. Trying different approaches allows people to figure out what they like and why they like it, leading to better languages later. ..."
"... Well, to start off, I think you're equating "popularity" with "success". It's true that Perl awareness is not as widespread as the other languages you mention. (By the way, I notice that two of them, C# and Java, are products of corporations who would like to run away with the market regardless of the technical merit of their solution). But when I first read the title of your note, I thought a lot of other things. To me, "decline" means disuse or death, neither of which I think apply to Perl today. ..."
"... a real program ..."
"... Perl is declining ..."
Feb 02, 2002 | www.perlmonks.org
trs80 (Priest) on Feb 02, 2002 at 18:54 UTC
I love Perl and have been using it since 1996 at my work for administrative tasks as well as web based products. I use Perl on Unix and Windows based machines for numerous tasks.

Before I get in depth about the decline, let me give a little background on myself. I got my first computer in 1981: a TRS-80 Model III, 4 MHz Z80 processor, 16K RAM, no hard disk, no floppy drive, just a cassette tape sequential read/write for storage and retrieval. The TRS-80 line allowed assembler or BASIC programs to be run on it. I programmed in both BASIC and assembler, but mostly BASIC, since I had limited memory and using the tape became very annoying. Let's time warp forward to 1987, when Perl was first released.

The introduction of Perl was not household knowledge; the use of computers in the home was still considerably low. Those that did have computers most likely used them for very specific tasks, such as bringing work home from the office. So it is fairly safe to say that Perl was not targeted at inexperienced computer users, but more at system administrators -- and boy, did system administrators love it. Now let's time warp ahead to 1994.

1994 marked what I consider the start of the rush to the WWW (not the Internet), and it was the birth year of Perl 5 and DBI. The WWW brought us the ability to easily link to any other site/document/page via hypertext markup language or, as we like to say, HTML. This "new" idea caused a stir in the non-tech world for the first time. The WWW, as HTML progressed, started to make using and knowing about computers a little less geeky. Most servers were Unix-based, and as the need for dynamic content or form handling grew, what language was there to assist? Perl. So Perl became, in a way, the default web language for people that hadn't been entrenched in programming another CGI-capable language and just wanted to process a form, create a flat-file data store, etc. -- that is, non-techies.

Perl served us all well, but on the horizon were the competitors. The web had proven itself not to be a flash in the pan, but a tool through which commerce and social interaction could take new form, giving people that had never considered using a computer before a reason to purchase one. So the big software and hardware giants looked for ways to gain control over the web and the Internet in general. There were even smaller players that were open source and freeware, just like Perl.

So by 2000 there were several mature choices for Internet development, and Perl was adrift in the sea of choice. I also see 2000 as the year the tide went out for Internet development in general. The "rush" had subsided; companies and investors started to really analyze what they had done and what the new media really offered them. Along with analyzing come consultants. Consultants have an interest in researching or developing the best product possible for a company. The company is interested in being able to maintain what it bought once it terminates the contract with the consultant. This brings us to the rub on Perl. How can a consultant convince a company that his application language of choice is free and isn't backed by a company? ActiveState, I believe, backs Perl to some extent, but one company generally isn't enough to put a CTO at ease.

So the decline of Perl use can be summarized with these facts:

  1. Perl is thought of as a UNIX administrative tool
  2. There are not enough professional organizations or guilds to bolster confidence with corporations that investing in a Perl solution is a good long-term plan.
  3. Perl doesn't have large-scale advertising and full-time advocates who keep Perl in major computing publications and remind companies that when they choose, they should choose Perl.
  4. There is no official certification. I have seen Larry's comments on this and I agree with him, but lack of certification hurts in the corporate world.
  5. Lack of college or university Perl classes, or maybe better stated, lack of Perl promotion by colleges.
I suppose all of this really only matters to people that don't make their living extending Perl or using it for system admin work that isn't approved by a board or committee. People that make final products based on Perl for the Internet and as standalone applications are affected by the myths and facts of Perl.

Last year, a possible opportunity I had to produce a complete package for a large telecommunications firm failed, in part due to lack of confidence in Perl as the language of choice, despite the fact that two districts had been successfully using the prototype and had increased efficiency.

Another factor is overseas development services. My most recent employer had a subsidiary in India with 30 developers. Training for Perl was unheard of. There were signs literally everywhere for C++, C# and Java, but no mention of Perl. It seems Perl is used for down-and-dirty utilities, not full-scale applications.

Maybe Perl isn't "supposed" to be for large-scale applications, but I think it can be, and I think it's more than mature enough and well supported enough to provide a corporation with a robust and wise long-term solution.

I am very interested in your opinions about why you feel Perl is or isn't gaining ground.

tilly (Archbishop) on Feb 02, 2002 at 23:30 UTC

Re (tilly) 1: The Decline of Perl - My Opinion

I have many opinions about your points.

First of all I don't know whether Perl is declining. Certainly I know that some of the Perl 6 effort has done exactly what it was intended to do, and attracted effort and interest in Perl. I know that at my job we have replaced the vast majority of our work with Perl, and the directions we are considering away from Perl are not exactly popularly publicized ones.

Secondly I am personally not overly concerned with what the popular language of the day is. As I commented ages ago at RE (tilly) 1: Java vs. Perl from the CB , the dream that is sold to PHBs of programmers as interchangeable monkeys doesn't appeal to me, and is a proven recipe for IT disasters. See Choose the most powerful language for further discussion, and a link to an excellent article by Paul Graham. As long as I have freedom to be productive, I will make the best choice for me. Often that is Perl.

Third I don't see it as a huge disaster if Perl at some point falls by the wayside. Perl is not magically great to push just because it is Perl. Perl is good because it does things very well. But other languages can adopt some of Perl's good ideas and do what Perl already does. Indeed languages like Python and Ruby borrow some of Perl's good ideas, and make them conveniently available to people who want some of the power of Perl, but who didn't happen to click with Perl. I think this is a good thing in the end. Trying different approaches allows people to figure out what they like and why they like it, leading to better languages later.

Perhaps I am being narrow minded in focusing so much on what makes for good personal productivity, but I don't think so. Lacking excellent marketing, Perl can't win in the hype game. It has to win by actually being better for solving problems. Sure, you don't see Perl advertised everywhere. But smart management understands that something is up when a small team of Perl programmers in 3 months manages to match what a fair sized team of Java programmers had done in 2 years. And when the Perl programmers come back again a year later and in a similar time frame do what the Java programmers had planned to do over the next 5 years...

VSarkiss (Monsignor) on Feb 02, 2002 at 22:36 UTC

Re: The Decline of Perl - My Opinion

Well, to start off, I think you're equating "popularity" with "success". It's true that Perl awareness is not as widespread as the other languages you mention. (By the way, I notice that two of them, C# and Java, are products of corporations who would like to run away with the market regardless of the technical merit of their solution). But when I first read the title of your note, I thought a lot of other things. To me, "decline" means disuse or death, neither of which I think apply to Perl today.

The fact that there isn't a company with a lot of money standing behind Perl is probably the cause of the phenomenon you observe. The same applies to other software that is produced and maintained by enthusiasts rather than companies. Linux is a very good example. Until recently, Linux was perceived as just a hobbyist's toy. Now there are small glimmers of its acceptance in the corporate world, mainly from IBM stepping up and putting together a marketing package that non-technical people can understand. (I'm thinking of recent TV ad campaigns.) But does that make Linux "better"? Now there's a way to start an argument.

I agree that except in certain cases (look at The Top Perl Shops for some examples), most companies don't "get" Perl. Last year at my current client, I suggested that a new application be prototyped using a Perl backend. My suggestion was met with something between ridicule and disbelief ("We don't want a bunch of scripts, we want a real program." That one stung.) To give them credit -- and this lends credence to one of your points -- they had very few people who could program Perl. And none of them were very good at it.

So has this lack of knowledge among some people made Perl any worse, or any less useful? No, definitely not. I think the ongoing work with Perl 6 is some of the most interesting and exciting stuff around. I think the language continues to be used in fascinating leading-edge application areas, such as bioinformatics. The state of Perl definitely doesn't fit my definition of "decline".

Nonetheless, I think your point of "corporate acceptance" is well-taken. It's not that the language is declining, it's that it's not making inroads in the average boardroom. How do you get past that barrier? For my part, I think the p5ee project is a step in the right direction. We need to simplify Perl training, which is one of the goals of the standardization, and provide something for a corporate executive to hold on to -- which is a topic of discussion in the mailing list right now.

And the nice part is that the standardized framework doesn't stop all the wonderful and creative Cool Uses for Perl that we've become accustomed to. If the lack of corporate acceptance is of concern to you, then join the group. "Don't curse the darkness, light a candle" is an old Chinese proverb.

trs80 (Priest) on Feb 02, 2002 at 22:48 UTC

Re: Re: The Decline of Perl - My Opinion

There was a post on the mod_perl list later the same day I wrote this that shows a steady rise in the use of mod_perl-based servers. The graph.

Maybe a better title would be "Top Hindrances in Selling Perl Solutions".

derby (Abbot) on Feb 02, 2002 at 23:00 UTC

Re: The Decline of Perl - My Opinion

trs80,

I agree with your timeline and general ideas about what brought Perl into focus. I also agree with Perl being adrift in a sea of choices and with the rush subsiding in 2000. I wholeheartedly disagree with your arguments about why Perl will decline. Let's look not at your summary but at its pillars.

1. Perl is thought of as a UNIX administrative tool.

You really don't have much support for this point. You state that system administrators love it -- but so do a whole lot of developers!

2. There are not enough professional organizations or guilds to bolster confidence with corporations that investing in a Perl solution is a good long-term plan.

Well, you're at one of the most professional guilds right now. I don't see a ColdFusion monks site, do you? What do you want? Certification? I think more of my BS and BA than I do of any certification. As for "good long-term plans": very few businesses see past the quarter. While I think this is generally bad, I think it's going to work wonders for open software. Where can you trim the budget to ensure profitability? Cut those huge $$$$ software licenses down to nothing.

3. Perl doesn't have large-scale advertising and full-time advocates who keep Perl in major computing publications and remind companies that when they choose, they should choose Perl.

Hmmm... not sure I want to base my IT selection on what the mags have to say -- I've seen a whole lot of shelfware that was bought due to what some wags said in the latest issue of some Ziff Davis trash.

4. There is no official certification. I have seen Larry's comments on this and I agree with him, but lack of certification hurts in the corporate world.

There are only two certifications that count: one is years of experience and the other is a sheepskin. Anything else is pure garbage. As long as you have the fundamentals of algorithm design down, who cares what the cert is.

5. Lack of college or university Perl classes, or maybe better stated, lack of Perl promotion by colleges.

I wouldn't expect anyone right out of college to be productive in any language. I would expect them to know what makes a good algorithm -- and that, my friend, is language-agnostic. Be it VB, C, C++, or Perl, you have to know big-O.

It sounds like you're a little worried about corporations' perception of our language of choice. I wouldn't be. Given the current perception of corporate management (à la Enron), I think the people who make the (ehh) long-range plans may be around a lot less than us tech weenies. Bring it in through the back door if you want. Rename it if you have to -- a SOAP-enabled back-end XML processor may be more appealing than an apache/mod_perl engine (that's where the BA comes in).

It also sounds like you're worried about overseas chop shops. Ed Yourdon rang that bell about ten years ago with "Decline and Fall of the American Programmer." I must say I lost a lot of respect for Ed on that one. Farming out development to India has proven to be more of a lead egg than a golden hen -- time-zone headaches and culture clash have proved very hard to overcome.

Perl is moving on... it seemed static because everyone was catching up to it. That being said, some day other OSS languages may overtake it, but Python and Ruby are still in Perl's rear-view mirror.

Just like loved ones, we tend to ignore those who are around us every day. This Valentine's Day, do us all a favor and buy some chocolates and a few flowers for our hard-working and beloved partner. We love you, Perl.

-derby

dws (Chancellor) on Feb 02, 2002 at 23:53 UTC

Re: The Decline of Perl - My Opinion

A few thoughts and data points: Perl may have gained ground initially in system administration, but since the Web came along, Perl is now thought of more as the language of CGIs.

My previous employer also had a subsidiary in India, and my distributed project included a team there, working on a large (80K-line) web application in Perl. Folks coming out of university in India are more likely to have been trained in C++ or Java, but Perl isn't unknown.

On Certification: Be very careful with this. HR departments might like to see certifications on resumes, but in the development teams I've worked in, a Very Big Flag is raised by someone claiming to have a certification. The implication, fair or not, is "this person is a Bozo."

random (Monk) on Feb 03, 2002 at 02:44 UTC

Re: The Decline of Perl - My Opinion

Here's the thing: to some extent, you're right, but at the same time, you're still way off base. Now let's say, just for a second, that the only thing Perl can / should be used for is system administration and web programming (please don't flame me, guys. I know that is by no means the extent of Perl's capabilities, but I don't have time to write a post covering all the bases right now.) Even assuming that's true, you're still wrong.

Considering the web side of it: yes, Perl is being used in far fewer places now. There are two major reasons for this: one is the abundance of other languages (PHP and ::shudder:: ASP, for example). Another is the fact that a lot of sites backed by Perl programming crashed and burned when the dot-coms did. You know what? I don't see this as a big deal. The techies who wrote those sites are still around, likely still using Perl, and hopefully not hurting its reputation by using it to support companies that will never, ever, make any money or do anything useful... ever. (Not to say that all dot-coms were this way, but c'mon, there were quite a few useless companies out there.) These sites were thrown together (in many cases) to make a quick buck. Granted, Perl can be great for quick-and-dirty code, but do we really want that making up the majority of the code out there?

System administration: I still think Perl is one of the finest languages ever created for system administration, especially cross-platform system admin, for those of us who don't want to learn the ins and outs of shell scripting 100x over. I really don't think there'll be much argument there, so I'll move on.

The community: has Perl ever before had organizations as devoted as the Perl Mongers and Yet Another Society? Do you see a website as popular and helpful as Perlmonks for PHP? Before last year, I don't remember there ever being programmers whose sole job is to improve and evangelize Perl. Do you? I can't remember ever before seeing an argument on the Linux kernel mailing list on whether Linux should switch to a quasi-Perl method of patch management. (This past week, for those who haven't been reading the kernel mailing list or Slashdot.)

To be honest, I have no idea what you're talking about. As far as I'm concerned, Perl is as strong as it's ever been. If not, then it's purely a matter of evangelization. I know I've impressed more than one boss by getting what would've been a several-day-long job done in an hour with Perl. Have you?

- Mike -

Biker (Priest) on Feb 03, 2002 at 14:56 UTC

Re: The Decline of Perl - My Opinion

I'm not sure how you back up the statement that Perl is declining , but anyway.

There's a huge difference between writing sysadmin tools and writing business-oriented applications. (Unless your business is to provide sysadmin tools. ;-)

In my shop, the sysadmins are free to use almost any tools, languages, etc. to get their work done. OTOH, when it comes to business-supporting applications with an end-user interface, the situation is very different. This is when our middle-level management starts to worry.

My personal experience is that Perl does have a very short and low learning curve when it comes to writing different 'tools' to be used by IT folks.

The learning curve may quickly become long and steep when you want to create a business-oriented application, very often combining techniques like CGI and DBI (or SybPerl), using non-standard libraries written as OO or not, plus adding your own modules to cover your shop-specific topics. Especially if you want to create shared business objects to be reused by other Perl applications. Add to this that the CGI application has to generate JavaScript to help the user navigate through the application, sometimes even JS that eval()'s more JS, and it becomes tricky. (Did someone mention 'security' as well?)

Furthermore, there is (at least in Central Europe) a huge lack of good training. There are commercial training courses, but those that I've found around where I live are all beginners courses covering the first chapters in the llama. Which is good, but not enough.

Because after the introduction course, my colleagues ask me how to proceed. When I tell them to go online, read books and otherwise participate, they are unhappy. Yes, many of them still haven't really learnt how to take advantage of the 'Net. And yes again, not enough people (fewer and fewer?) are willing to RTFM. They all want quick solutions. And no, they don't want to spend their evenings reading the Cookbook or the Camel. (Some odd colleagues, I admit. ;-)

You can repeatedly get quick solutions using Perl, but that requires effort to learn. And this learning stage (where I'm still at) is not quick if you need to do something big.

Too many people (around where I make my living) want quick solutions to everything with no or little investment.
(Do I sound old? Yeah, I guess I am. That's how you become after 20+ years in this business. ;-)

Conclusion: In my shop, middle management is worried about what will happen if I leave.

Questions like "Who will take over all this Perl stuff?" and "How can we get a new colleague up to speed on this Perl thingy within a week or two?" are commonplace here. Which in a way creates resistance.

I'd say: momentum! When there is a horde of Perl programmers available, middle management will sleep well at night. (At least concerning "The Perl Problem". ;-)

I'm working very hard to create the momentum that is necessary in our shop, and I hope you do the same. (Biker points a finger at the reader.)

"Livet är hårt" sa bonden.
"Grymt" sa grisen...

beebware (Pilgrim) on Feb 03, 2002 at 19:48 UTC

Re: The Decline of Perl - My Opinion

I agree with most of this, but a major problem I have noticed is that with the recent 'dot.gone burn', the job market is literally flooded with experienced Perl programmers, but there are no positions for them. Mostly this is due to Perl being free. To learn C++, C# or Java you've got to spend a lot of money. Courses do not come cheap, the software doesn't come cheap, and certification (where it is offered) isn't cheap.

However, anybody with basic programming knowledge in anything from BASIC upwards and the willingness to learn can get hold of software that is free to use, free to download, and free to 'learn by example' (the quickest and easiest way to learn, IMHO) -- so you can probably have a Perl programmer who can write a basic script in next to no time. He runs his own website off it and other scripts he wrote, learns, looks at other scripts, and before you know it, he's writing a complete intranet-based management information system in it... If that person had to buy C++ and learn by reading books and going to courses (with the associated costs -- there isn't that much code available to read and learn from), it would have taken much longer, and if you are on a budget, it isn't an option.

Compare Linux+MySQL+Apache+Perl to (shudder) Windows 2000 Advanced Server+MS SQL Server+IIS+ASP. The latter costs a lot more in setup, staffing and maintenance -- let alone security holes. But which do big corporations go for (even though every techie knows which is the better one)? Why? Because 'oh, things are free for a reason -- if we've got to pay lots of money for it, it has got to be damn good; just look at all those ASP programmers asking 60,000 UKP upwards, it must be good if they are charging that much'.

All in all, if Perl 6+ had a license fee and a 'certification' were made available, AND Perl programmers put up their 'minimum wage', Perl would take off again big time. Of course, it's all IMHO, but you did ask for my opinion :-)

tilly (Archbishop) on Feb 03, 2002 at 20:11 UTC

Re (tilly) 2: The Decline of Perl - My Opinion

If you really think that by charging money for Perl and closing it up you would make it more popular and profitable, then I challenge you to go out and try to do it.

If you read the licensing terms, you can take Perl, take advantage of the artistic license, rename it slightly, and make your own version which can be proprietary if you want. (See oraperl and sybperl for examples where this was done with Perl 4.)

My prediction, based on both theory and observation of past examples (particularly examples of what people in the Lisp world do wrong time after time), is that you will put in a lot of energy, lose money, and never achieve popularity. For some of the theory, the usual starting place is The Cathedral and the Bazaar.

Of course, if you want to charge money for something and can get away with it, go ahead. No less than Larry Wall has said, "It's almost like we're doing Windows users a favor by charging them money for something they could get for free, because they get confused otherwise." But I think that as time goes by it is becoming more mainstream to accept that it is possible for software to be both free and good at the same time.

beebware (Pilgrim) on Feb 04, 2002 at 01:23 UTC

Re: Re (tilly) 2: The Decline of Perl - My Opinion

I know, but that's how heads of departments and corporate management think: these are the people who believe the FUD that Microsoft puts out ("XP is more secure -- we'd best get it then", never mind that Linux is more secure and that Windows 2000 machines are also secure). Sometimes it's just the brand name that helps. I know of a certain sausage manufacturer who makes sausages for two major supermarkets. People say "Oh, that's from Supermarket X" upon tasting, although it is just the same sausage.

All in all, it comes down to packaging. "Tart" something up with packaging, brand names and high prices and, despite the rival product being better in every respect, the "well packaged" product will win.

[Nov 04, 2017] Which is the best book for learning python for absolute beginners on their own?

Nov 04, 2017 | www.quora.com

Robert Love Software Engineer at Google

Mark Lutz's Learning Python is a favorite of many. It is a good book for novice programmers. The new fifth edition is updated to both Python 2.7 and 3.3.

Aditi Sharma , i love coding Answered Jul 10 2016

Originally Answered: Which is the best book for learning Python from beginners to advanced level?

Instead of a book, I would advise you to start learning Python from CodesDope, which is a wonderful site for learning Python from the absolute beginning. Its content explains everything step-by-step, with typography that makes learning fun and much easier. It also provides a number of practice questions for each topic, so you can strengthen your grasp by solving questions right after reading, without having to go hunting for practice material. Moreover, it has a discussion forum which is very responsive in resolving doubts.

Alex Forsyth , Computer science major at MIT Answered Dec 28 2015
Originally Answered: What is the best way to learn to code? Specifically Python.

There are many good websites for learning the basics, but for going a bit deeper, I'd suggest MIT OCW 6.00SC. This is how I learned Python back in 2012 and what ultimately led me to MIT and to major in CS. 6.00 teaches Python syntax but also teaches some basic computer science concepts. There are lectures from John Guttag, which are generally well done and easy to follow. It also provides access to some of the assignments from that semester, which I found extremely useful in actually learning Python.

After completing that, you'd probably have a better idea of what direction you wanted to go. Some examples could be completing further OCW courses or completing projects in Python.

[Nov 04, 2017] Should I learn Python or Ruby

Notable quotes:
"... Depends on you use. If you are into number crunching, system related scripts, Go to Python by all means. ..."
Nov 04, 2017 | www.quora.com

Joshua Inkenbrandt , Engineer @ Pinterest Updated Nov 18, 2010 · Upvoted by Daniel Roy Greenfeld , Python and Django Developer
Originally Answered: If I only have time to learn Python or Ruby, which should I choose and why?

First, it should be said that both languages are great at solving a vast array of problems. You're asking which one you should choose, so I'll tell you my opinion.

Choose Python. It has a bigger and more diverse community of developers. Ruby has a great community as well, but much of the community is far too focused on Rails. The fact that a lot of people use Rails and Ruby interchangeably is pretty telling.

As far as differences in their syntax go: they both have their high and low points. Python has some very nice functional features like list comprehensions and generators. Ruby gives you more freedom in how you write your code: parentheses are optional and blocks aren't whitespace-delimited like they are in Python. From a syntactical point of view, they're both in a league of their own.
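A two-line illustration of the Python features just mentioned (an added sketch, not part of the original answer):

    # List comprehension: build the list of squares in one expression.
    squares = [n * n for n in range(10)]

    # Generator expression: same shape, but values are produced lazily,
    # so a long sequence is consumed in constant memory.
    total = sum(n * n for n in range(1000000))

    print(squares[:5], total)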

My reason for choosing Python (as my preferred language) though, had everything to do with the actual language itself. It actually has a real module system. It's clean and very easy to read. It has an enormous standard library and an even larger community contributed one.

One last thing I'll add, so you know I'm not a complete homer. Some of my favorite tools are written in Ruby, like Homebrew [ https://github.com/mxcl/homebrew ], Puppet [ http://www.puppetlabs.com/puppet... ] and Capistrano [ https://github.com/halorgium/cap... ]. I think Ruby has some really awesome features, like real closures and anonymous functions. And if you could, it wouldn't hurt to learn both. Just my $0.02.

Seyram Komla Sapaty , from __future__ import braces Updated Jul 22, 2012

Originally Answered: If I only have time to learn Python or Ruby, which should I choose and why?

Learn Python.

1. Python is widely used across several domains. It's installed on almost all Linux distros. Perfect for system administration. It's the de facto scripting language in the animation/visual effects industry. The following industry-standard programs use Python as their scripting language:
Autodesk Maya, Softimage, Toxik, Houdini, 3DSMAX, Modo, MotionBuilder, Nuke, Blender, Cinema4D, RealFlow and a lot more....
If you love visual-effects-intensive movies and video games, chances are Python helped in making them. Python is used extensively by Industrial Light & Magic, Sony Imageworks, Weta Digital, Luma Pictures, Valve, etc.

2. Researchers use it heavily, resulting in free, high-quality libraries: NumPy, SciPy, Matplotlib, etc.
3. Python is backed by a giant like Google
4. Both Python and Ruby are slow. But Ruby is slower!

And oh, the One Laptop per Child project uses python a lot

Ruby is more or less a DSL. It is used widely for web development -- so much so that the name Ruby is interchangeable with Rails.

Dave Aronson , T. Rex at Codosaurus, LLC (1990-present) Answered Feb 27

This was apparently asked in 2010, but I've just been A2A'ed in 2017, so maybe an updated perspective will help. Unfortunately I haven't done much Python since about 2008, and only a little bit in 2014, but I have done a ton of Ruby in the meantime.

From what I hear, the main changes around Python are that, on the downside, it is no longer the main darling of Google, but on the upside, more and better and faster libraries have come out for assorted types of scientific computation, Django has continued to climb in popularity, and there are apparently some new and much easier techniques for integrating C libraries into Python. Ruby has gotten much faster and more popular in settings other than Rails, and meanwhile Rails itself has gotten faster and more powerful. So it's still very much not a no-brainer.

There are some good libraries in both for games and for other uses of GUI interfaces (from ordinary dialog boxes and such, to custom graphics with motion). If you mean a highly visual video game, needing fine resolution and a high frame rate, forget 'em both; that's usually done in C++.

For websites, it depends what kind. If you want to just show some info, and have the ability to change what's there pretty easily, from what I've heard Django is excellent at making that kind of system, i.e., a CMS (Content Management System). If you want it to do some storage, processing, and retrieval of user-supplied data, then Ruby is a better choice, via frameworks such as Rails (excellent for making something quickly, even if the result isn't lightning-fast, though it can be with additional work) or Sinatra (better for fast stuff, but it does less for you). If you don't want to do either, then you don't need either language, just raw HTML, CSS (including maybe some libraries like Bootstrap or Foundation), and maybe a bit of JavaScript (including maybe some libraries like jQuery) -- and you'll have to learn all that anyway!

"Apps" is a bit broad of a term. These days it usually means mobile apps. I haven't heard of ways to do them in Python, but for Ruby there is RubyMotion, which used to be iOS-only but now supports Android too. Still, though, you'd be better off with Swift (or Objective-C) for iOS and Java for Android. On the other claw, if you just mean any kind of application, either one will do very nicely for a wide range.

Vikash Vikram , Software Architect, worked at SAP, studied at Indian Institute of Technology, Delhi, lived in New Delhi

Depends on your use. If you are into number crunching or system-related scripts, go to Python by all means.

If you are interested in speed, go somewhere else, as both languages are slow. Compared to each other, they are more or less the same (you are not going to see an order-of-magnitude difference in their performance). If you are into web application development, want to create a backend for your mobile apps, or are writing scripts, only then do you have to dig deeper.

The major difference is in the mindset of the community. Python is conservative and Ruby is adventurous. Rails is the shining example of that. Rails guys are just crazy about trying new ideas and getting them integrated. They have been among the first to have REST, CoffeeScript and Sass by default. With Rails 5, they will have ES6 integrated as well. So if you want to bet on a framework that lets you try all the new stuff, then Rails, and hence Ruby, is the way.

Personally I like Ruby because of Rails. I like Rails because I like the philosophy of its creators. The syntax of Ruby is icing on the cake. I am not fond of Python syntax, and I am a web developer, so the versatility of Python libraries does not cut it for me. In the end, you won't go wrong with either, but you will have to find out for yourself which one you love to code in (if that is important to you).

Max Mautner , I <3 python, a lot Answered Jan 5, 2015

PyGame would be a fine place to start: http://pygame.org/news.html

Or more generally, Ludum Dare: http://ludumdare.com/compo/

Ash Agrawal , Principal Engineer at Slideshare Updated Jan 17, 2013

Following may help you decide better:

Ruby and Python. Two languages. Two communities. Both have a similar target: to make software development better -- better than Java, better than PHP, and better for everyone. But where is the difference? And which language is "better"? For the last question I can say: neither is better. Both camps are awesome and do tons of great stuff. But for the first question the answer is longer. And I hope to provide it in this little article.

Is the difference in the toolset around the language? No, I don't think so. Both have good package managers, tons of libraries for all kinds of stuff and a few decent web frameworks. Both promote test-driven development. On the language side, one is whitespace-sensitive, the other isn't. Is that so important? Maybe a little, but I think there is something else that is way more important: the culture.

It all started with a stupid Python troll at the SIGINT who wanted to troll our cologne.rb booth. To be prepared for the next troll attack I started to investigate Python. For that I talked with a lot of Python guys and wrote a few little things in Python to get a feel for the language and the ecosystem. Luckily, at FrOSCon our Ruby booth was right next to the pycologne folks and we talked a lot about the differences. During that time I got the feeling that I knew what was different in the culture of both parties. Last month I had the opportunity to test my theory in real life: the cologne.rb and the Django Cologne folks did a joint meetup. And I took the opportunity to test my theory. And it got confirmed by lots of the Python people.

Okay, now what is the difference in the culture? It is pretty easy. Python folks are really conservative and afraid of change; Ruby folks love the new shiny stuff even if it breaks older things. It's that simple. But it has huge consequences, one of which you can see in the adoption of Ruby 1.9 vs. Python 3. Both new versions made tons of breaking changes. A lot of code needed changes to run on the new platform. In the Ruby world the transition went pretty quickly. In the Python world it is very troublesome. Some Python people even say that Python 3 is broken and all energy should be focused on the 2.x branch of the language. The Ruby community saw the opportunities. The Python community only saw the update problems. Yes, there were update problems in the Ruby world, but we found an easy way to deal with them: isitruby19.com, a simple platform that showed whether a gem was ready for 1.9. And if it wasn't and the gem was important, it got fixed with pull requests or something similar. And the problems went away fast.
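Two of the canonical breaking changes alluded to above, shown as a minimal added illustration (not part of the original post):

    # In Python 2 this was a statement:  print "hello"
    print("hello")   # Python 3: print is a function

    # In Python 2, 5 / 2 evaluated to 2 (integer floor division).
    print(5 / 2)     # Python 3: true division -> 2.5
    print(5 // 2)    # floor division must now be explicit -> 2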

Both models of thinking have pros and cons. The Python world is more stable; you can update your Django installation without much trouble. But that also means new technology is added only very slowly. The Ruby world loves change -- so much that most of the "new stuff" in the Python world was tested in the Ruby world first. We love change so much that the Rails core is built around that idea. You can easily change nearly everything and extend everything. Most of the new stuff the Rails core team is testing right now for version 4 is available as a plugin for Rails 3. This is pretty interesting if you love new things, love change, and love playing around with stuff. If you don't, and hate the idea of breaking changes, you may be better suited to the Python way. But don't be afraid of breaking changes. They are all pretty well documented in the release guides. It's not voodoo.

I for myself love the Ruby mindset. Something like Rails or asset pipelines or all the other things would not be possible if we were stuck with "no, don't change that, it works pretty well that way". Someone has to be the leader. Someone has to play around with new ideas. Yes, some ideas won't fly; some are removed pretty quickly. But at least we tried them. Yes, I know that some people prefer the conservative way. If you consider yourself to be like that, you should at least try Python. I stay with Ruby.


Source: http://bitboxer.de/2012/10/03/ru...

Andrew Korzhuev , I'm interested in everything Answered Apr 11, 2012
Should I learn Python or Ruby next? Why not try both and choose which one you like more? [1] [2]

[1] http://hackety.com/
[2] http://www.trypython.org/

[Nov 04, 2017] The top 20 programming languages the GitHut and Tiobe rankings - JAXenter

Nov 04, 2017 | jaxenter.com

GitHut lists its ranking according to the following characteristics: active repositories, the number of pushes and pushes per repository, as well as new forks per repository, open issues per repository and new watchers per repository.

GitHut's top 20 ranking currently looks like this:

  1. JavaScript
  2. Java
  3. Python
  4. Ruby
  5. CSS
  6. PHP
  7. C++
  8. Shell
  9. C#
  10. Objective-C
  11. VimL
  12. Go
  13. Perl

[Nov 04, 2017] GitHub - EbookFoundation-free-programming-books Freely available programming books

Nov 04, 2017 | github.com

View the English list

This list was originally a clone of stackoverflow - List of Freely Available Programming Books by George Stocker.

The list was moved to GitHub by Victor Felder for collaborative updating and maintenance. It grew to become one of the most popular repositories on GitHub, with over 80,000 stars, over 4,000 commits, over 800 contributors, and over 20,000 forks.

The repo is now administered by the Free Ebook Foundation , a not-for-profit organization devoted to promoting the creation, distribution, archiving and sustainability of free ebooks. Donations to the Free Ebook Foundation are tax-deductible in the US.

[Sep 18, 2017] The Fall Of Perl, The Webs Most Promising Language by Conor Myhrvold

The author pays outsize attention to superficial things like popularity with particular groups of users. For a sysadmin this matters less than the level of integration with the underlying OS and the quality of the debugger.
The real story is that Python has a less steep initial learning curve, and that helped to entrench it in universities. Students then brought it to large companies like Red Hat. The rest is history. Google's support was also a positive factor. Python also basked in OO hype. So it is now a more widespread language, much like Microsoft Basic once was. That does not automatically make it a better language in the sysadmin domain.
The phrase "Perl's quirky stylistic conventions, such as using $ in front to declare variables, are in contrast with the other declarative symbol $ for practical programmers today -- the money that goes into the continued development and feature set of Perl's frenemies such as Python and Ruby" smells of "syntax junkie" mentality. What is wrong with dereferencing using the $ symbol? Yes, it creates problems if you are simultaneously using other languages like C or Python, but for an experienced programmer this is a minor thing. Yes, Perl has some questionable syntax choices, but so does any other language in existence. While painful, it is the semantics and the "programming environment" that matter most.
My impression is that Perl has returned to its roots -- migrated back to being an excellent sysadmin tool -- as there is strong synergy between Perl and Unix shells. The fact that Perl 5 is reasonably stable is a huge plus in this area.
Notable quotes:
"... By the late 2000s Python was not only the dominant alternative to Perl for many text parsing tasks typically associated with Perl (i.e. regular expressions in the field of bioinformatics ) but it was also the most proclaimed popular language , talked about with elegance and eloquence among my circle of campus friends, who liked being part of an up-and-coming movement. ..."
"... Others point out that Perl is left out of the languages to learn first �in an era where Python and Java had grown enormously, and a new entrant from the mid-2000s, Ruby, continues to gain ground by attracting new users in the web application arena (via Rails ), followed by the Django framework in Python (PHP has remained stable as the simplest option as well). ..."
"... In bioinformatics, where Perl's position as the most popular scripting language powered many 1990s breakthroughs like genetic sequencing, Perl has been supplanted by Python and the statistical language R (a variant of S-plus and descendent of S , also developed in the 1980s). ..."
"... By 2013, Python was the language of choice in academia, where I was to return for a year, and whatever it lacked in OOP classes, it made up for in college classes. Python was like Google, who helped spread Python and employed van Rossum for many years. Meanwhile, its adversary Yahoo (largely developed in Perl ) did well, but comparatively fell further behind in defining the future of programming. Python was the favorite and the incumbent; roles had been reversed. ..."
"... from my experience? Perl's eventual problem is that if the Perl community cannot attract beginner users like Python successfully has ..."
"... The fact that you have to import a library, or put up with some extra syntax, is significantly easier than the transactional cost of learning a new language and switching to it. ..."
"... MIT Python replaced Scheme as the first language of instruction for all incoming freshman, in the mid-2000s ..."
Jan 13, 2014 | www.fastcompany.com

And the rise of Python. Does Perl have a future?

I first heard of Perl when I was in middle school in the early 2000s. It was one of the world's most versatile programming languages, dubbed the Swiss army knife of the Internet. But compared to its rival Python, Perl has faded from popularity. What happened to the web's most promising language?

Perl's low entry barrier, compared to compiled, lower-level language alternatives (namely, C), meant that Perl attracted users without a formal CS background (read: script kiddies and beginners who wrote poor code). It also boasted a small group of power users ("hardcore hackers") who could quickly and flexibly write powerful, dense programs that fueled Perl's popularity with a new generation of programmers.

A central repository (the Comprehensive Perl Archive Network, or CPAN) meant that for every person who wrote code, many more in the Perl community (the Programming Republic of Perl) could employ it. This, along with the witty evangelism by eclectic creator Larry Wall, whose interest in language ensured that Perl led in text parsing, was a formula for success during a time in which lots of text information was spreading over the Internet.

As the 21st century approached, many pearls of wisdom were wrought to move and analyze information on the web. Perl did have a learning curve -- often meaning that it was the third or fourth language learned by adopters -- but it sat at the top of the stack.

"In the race to the millennium, it looks like C++ will win, Java will place, and Perl will show," Wall said in the third State of Perl address in 1999. "Some of you no doubt will wish we could erase those top two lines, but I don't think you should be unduly concerned. Note that both C++ and Java are systems programming languages. They're the two sports cars out in front of the race. Meanwhile, Perl is the fastest SUV, coming up in front of all the other SUVs. It's the best in its class. Of course, we all know Perl is in a class of its own."

Then came the upset.

The Perl vs. Python Grudge Match

Then Python came along. Compared to Perl's straight-jacketed scripting, Python was a lopsided affair. It even took after its namesake, Monty Python's Flying Circus. Fittingly, most of Wall's early references to Python were lighthearted jokes at its expense. Well, the millennium passed, computers survived Y2K, and my teenage years came and went. I studied math, science, and humanities but kept myself an arm's distance away from typing computer code. My knowledge of Perl remained like the start of a new text file: cursory, followed by a lot of blank space to fill up.

In college, CS friends at Princeton raved about Python as their favorite language (in spite of popular professor Brian Kernighan on campus, who helped popularize C). I thought Python was new, but I later learned it was around when I grew up as well, just not visible on the charts.

By the late 2000s Python was not only the dominant alternative to Perl for many text parsing tasks typically associated with Perl (i.e. regular expressions in the field of bioinformatics) but it was also the most proclaimed popular language, talked about with elegance and eloquence among my circle of campus friends, who liked being part of an up-and-coming movement.

Side By Side Comparison: Binary Search
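The code this heading refers to did not survive the copy; the original article showed Perl and Python versions side by side. As a stand-in, not the article's code, here is a minimal iterative binary search in Python:

    def binary_search(items, target):
        """Return the index of target in the sorted sequence items, or -1."""
        lo, hi = 0, len(items) - 1
        while lo <= hi:
            mid = (lo + hi) // 2
            if items[mid] == target:
                return mid
            if items[mid] < target:
                lo = mid + 1
            else:
                hi = mid - 1
        return -1

    print(binary_search([1, 3, 5, 7, 9, 11], 7))   # -> 3
    print(binary_search([1, 3, 5, 7, 9, 11], 4))   # -> -1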

Despite Python and Perl's well-documented rivalry and design-decision differences -- which persist to this day -- they occupy a similar niche in the programming ecosystem. Both are frequently referred to as "scripting languages," even though later versions are retrofitted with object-oriented programming (OOP) capabilities.

Stylistically, Perl and Python have different philosophies. Perl's best-known motto is "There's More Than One Way to Do It". Python is designed to have one obvious way to do it. Python's construction gave an advantage to beginners: a syntax with more rules and stylistic conventions (for example, requiring whitespace indentation for functions) ensured newcomers would see a more consistent set of programming practices; code that accomplished the same task would look more or less the same. Perl's construction favors experienced programmers: a more compact, less verbose language with built-in shortcuts which make programming for the expert a breeze.

During the dotcom era and the tech recovery of the mid-to-late 2000s, high-profile websites and companies such as Dropbox (Python) and Amazon and Craigslist (Perl), in addition to some of the world's largest news organizations (BBC, Perl), used the languages to accomplish tasks integral to doing business on the Internet. But over the course of the last 15 years, not only has the way companies do business changed and grown, but so have the tools they use, unequally to the detriment of Perl. (A growing trend that was identified in the last comparison of the languages, "A Perl Hacker in the Land of Python," as well as, from the Python side, a Pythonista's evangelism aggregator, also done in the year 2000.)

Perl's Slow Decline

Today, Perl's growth has stagnated. At the Orlando Perl Workshop in 2013, one of the talks was titled "Perl is not Dead, It is a Dead End," and claimed that Perl now existed on an island. Once Perl programmers checked out, they always left for good, never to return. Others point out that Perl is left out of the languages to learn first -- in an era where Python and Java had grown enormously, and a new entrant from the mid-2000s, Ruby, continues to gain ground by attracting new users in the web application arena (via Rails), followed by the Django framework in Python (PHP has remained stable as the simplest option as well).

In bioinformatics, where Perl's position as the most popular scripting language powered many 1990s breakthroughs like genetic sequencing, Perl has been supplanted by Python and the statistical language R (a variant of S-plus and descendant of S, also developed in the 1980s).

In scientific computing, my present field, Python, not Perl, is the open source overlord, even expanding at Matlab's expense (also a child of the 1980s, and similarly retrofitted with OOP abilities). And upstart PHP grew in size to the point where it is now arguably the most common language for web development (although its position is dynamic, as Ruby and Python have quelled PHP's dominance and are now entrenched as legitimate alternatives).

While Perl is not in danger of disappearing altogether, it is in danger of losing cultural relevance , an ironic fate given Wall's love of language. How has Perl become the underdog, and can this trend be reversed? (And, perhaps more importantly, will Perl 6 be released!?)

How I Grew To Love Python

Why Python, and not Perl? Perhaps an illustrative example of what happened to Perl is my own experience with the language. In college, I still stuck to the contained environments of Matlab and Mathematica, but my programming perspective changed dramatically in 2012. I realized that lacking knowledge of structured computer code outside the "walled garden" of a desktop application prevented me from fully simulating hypotheses about the natural world, let alone analyzing data sets using the web, which was also becoming an increasingly intellectual, and financially lucrative, skill set.

One year after college, I resolved to learn a "real" programming language in a serious manner: an all-in immersion taking me over the hump of knowledge so that, even if I took a break, I would still retain enough to pick up where I left off. An older alum from my college who shared similar interests -- and an experienced programmer since the late 1990s -- convinced me of his favorite language to sift and sort through text in just a few lines of code, and "get things done": Perl. Python, he dismissed, was "what academics used to think." I was about to be acquainted formally.

Before making a definitive decision on which language to learn, I took stock of online resources, lurked on PerlMonks , and acquired several used O'Reilly books, the Camel Book and the Llama Book , in addition to other beginner books. Yet once again, Python reared its head , and even Perl forums and sites dedicated to the language were lamenting the digital siege their language was succumbing to . What happened to Perl? I wondered. Ultimately undeterred, I found enough to get started (quality over quantity, I figured!), and began studying the syntax and working through examples.

But it was not to be. In trying to overcome the engineered flexibility of Perl's syntax choices, I hit a wall. I had adopted Perl for text analysis, but upon accepting an engineering graduate program offer, switched to Python to prepare.

By this point, CPAN's enormous advantage had been whittled away by ad hoc, hodgepodge efforts from uncoordinated but overwhelming groups of Pythonistas that now assemble in Meetups , at startups, and on college and corporate campuses to evangelize the Zen of Python . This has created a lot of issues with importing ( pointed out by Wall ), and package download synchronizations to get scientific computing libraries (as I found), but has also resulted in distributions of Python such as Anaconda that incorporate the most important libraries besides the standard library to ease the time tariff on imports.

As if to capitalize on the zeitgeist, technical book publisher O'Reilly ran this ad, inflaming Perl devotees.


By 2013, Python was the language of choice in academia, where I was to return for a year, and whatever it lacked in OOP classes, it made up for in college classes. Python was like Google, which helped spread Python and employed van Rossum for many years. Meanwhile, its adversary Yahoo (largely developed in Perl) did well, but comparatively fell further behind in defining the future of programming. Python was the favorite and the incumbent; roles had been reversed.

So after six months of Perl-making effort, this straw of reality broke the Perl camel's back and caused a coup that overthrew the programming Republic which had established itself on my laptop. I sheepishly abandoned the llama . Several weeks later, the tantalizing promise of a new MIT edX course teaching general CS principles in Python, in addition to numerous n00b examples , made Perl's syntax all too easy to forget instead of regret.

Measurements of the popularity of programming languages, in addition to friends and fellow programming enthusiasts I have met in the development community in the past year and a half, have confirmed this trend, along with the rise of Ruby in the mid-2000s, which has also eaten away at Perl's ubiquity in stitching together programs written in different languages.

Historically, many arguments could explain away any one of these studies: perhaps Perl programmers do not cheerlead their language as much, since they are too busy productively programming; job listings or search-engine hits could mean that a programming language has many errors and issues with it, or that there is simply a large temporary gap between supply and demand.

The concomitant picture, and one that many in the Perl community now acknowledge, is that Perl is now essentially a second-tier language, one that has its place but will not be among the first several languages learned outside the Computer Science domain, such as Java, C, or now Python.

The Future Of Perl (Yes, It Has One)

I believe Perl has a future, but it could be one for a limited audience. Present-day Perl is more suitable for users who have worked with the language from its early days, already dressed to impress. Perl's quirky stylistic conventions, such as using $ in front of variables, stand in contrast with the other meaning of the symbol $ for practical programmers today -- the money that goes into the continued development and feature set of Perl's frenemies such as Python and Ruby -- and the high activation cost of learning Perl instead of implementing a Python solution. Ironically, much in the same way that Perl jested at other languages, Perl now finds itself on the receiving end.

What's wrong with Perl, from my experience? Perl's eventual problem is that if the Perl community cannot attract beginner users as Python successfully has, it runs the risk of becoming like Children of Men, dwindling away to a standstill; vast repositories of hieroglyphic code looming in sections of the Internet and in data center partitions like the halls of the Mines of Moria. (Awe-inspiring and historical? Yes. Lively? No.)

Perl 6 has been an ongoing development since 2000. Yet after 14 years it is not officially done , making it the equivalent of Chinese Democracy for Guns N' Roses. In Larry Wall's words : "We're not trying to make Perl a better language than C++, or Python, or Java, or JavaScript. We're trying to make Perl a better language than Perl. That's all." Perl may be on the same self-inflicted path to perfection as Axl Rose, underestimating not others but itself. "All" might still be too much.

Absent a game-changing Perl release (which still could be "too little, too late"), people who learn to program in Python have no need to switch if Python can fulfill their needs, even if it is widely regarded as second or third best in some areas. The fact that you have to import a library, or put up with some extra syntax, is significantly easier than the transactional cost of learning a new language and switching to it. So over time, Python's audience stays young through its gateway strategy that van Rossum himself pioneered, Computer Programming for Everybody. (This effort has been a complete success. For example, at MIT Python replaced Scheme as the first language of instruction for all incoming freshmen in the mid-2000s.)

Python Plows Forward

Python continues to gain footholds one by one in areas of interest, such as visualization (where Python still lags behind other languages' graphics, like Matlab's, Mathematica's, or the recent d3.js), website creation (the Django framework is now a mainstream choice), scientific computing (including NumPy/SciPy), parallel programming (mpi4py with CUDA), machine learning, and natural language processing (scikit-learn and NLTK) -- and the list continues.

While none of these efforts are centrally coordinated by van Rossum himself, a continually expanding user base, and getting to CS students before other languages do (even Java or C), increases the odds that practitioners within a discipline will collaborate to build Python libraries for themselves, in the same open source spirit that made Perl a success in the 1990s.

As for me? I'm open to returning to Perl if it can offer me a significantly different experience from Python (but "being frustrating" doesn't count!). Perhaps Perl 6 will be that release. However, in the interim, I have heeded the advice of many others with a similar dilemma on the web. I'll just wait and C .

[Sep 03, 2017] Which is better, Perl or Python? Which one is more robust? How do they compare with each other? - Quora

Dan Lenski ("I do all my best thinking in Python, and I've written a lot of it"), updated Jun 1, 2015: Though I may get flamed for it, I will put it even more bluntly than others have: Python is better than Perl. Python's syntax is cleaner, its object-oriented type system is more modern and consistent, and its libraries are more consistent. (EDIT: As Christian Walde points out in the comments, my criticism of Perl OOP is out-of-date with respect to the current de facto standard of Moo/se. I do believe that Perl's utility is still encumbered by historical baggage in this area and others.)

I have used both languages extensively for both professional work and personal projects (Perl mainly in 1999-2007, Python mainly since), in domains ranging from number crunching (both PDL and NumPy are excellent) to web-based programming (mainly with Embperl and Flask) to good ol' munging of text files and database CRUD.

Both Python and Perl have large user communities including many programmers who are far more skilled and experienced than I could ever hope to be. One of the best things about Python is that the community generally espouses this aspect of "The Zen of Python":

Python's philosophy rejects the Perl "there is more than one way to do it" approach to language design in favor of "there should be one -- and preferably only one -- obvious way to do it".

... while this principle might seem stultifying or constraining at first, in practice it means that most good Python programmers think about the principle of least surprise and make it easier for others to read and interface with their code.

In part as a consequence of this discipline, and definitely because of Python's strong typing , and arguably because of its "cleaner" syntax, Python code is considerably easier to read than Perl code. One of the main events that motivated me to switch was the experience of writing code to automate lab equipment in grad school. I realized I couldn't read Perl code I'd written several months prior, despite the fact that I consider myself a careful and consistent programmer when it comes to coding style ( Dunning–Kruger effect , perhaps? :-P).

One of the only things I still use Perl for occasionally is writing quick-and-dirty code to manipulate text strings with regular expressions; Sai Janani Ganesan pointed out Perl's value for this as well. Perl has a lot of nice syntactic sugar and command line options for munging text quickly*, and in fact there's one regex idiom I used a lot in Perl and for which I've never found a totally satisfying substitute in Python.


* For example, the one-liner perl -i.bak -pe 's/foo/bar/g' *.txt will go through a bunch of text files and replace foo with bar everywhere, while making backup files with the .bak extension.
Sep 03, 2017 | www.quora.com

[Sep 03, 2017] When To Choose Python Over Perl (And Vice Versa)

When to use perl:

When to use python:

Edit after 5 years: Istvan Albert (University Park, USA) wrote:

I think very few people are completely agnostic about programming languages especially when it comes to languages with very similar strengths and weaknesses like: perl/python/ruby - therefore there is no general reason for using one language vs the other.

It is more common to find someone equally proficient in C and perl, than say equally proficient in perl and python. My guess would be that complementary languages require complementary skill sets that occupy different parts of the brain, whereas similar concepts will clash more easily.

Michael Dondrup (Bergen, Norway) wrote:

You are totally correct with your initial assumption. This question is similar to choosing between Spanish and English, which language to choose? Well if you go to Spain,...

All (programming) languages are equal, in the sense that you can solve the same class of problems with them. Once you know one language, you can easily learn all imperative languages. Use the language that you already master or that suits your style. Both (Perl & Python) are interpreted languages and have their merits. Both have extensive Bio-libraries, and both have large archives of contributed packages.

An important criterion to decide is the availability of rich, stable, and well maintained libraries. Choose the language that provides the library you need. For example, if you want to program web-services (using SOAP not web-sites), you better use Java or maybe C#.

Conclusion: it does no harm to learn new languages. And no flame wars please.

Sep 03, 2017 | www.biostars.org

[Sep 03, 2017] Perl-Python-Ruby Comparison

What is the Josephus problem? To quote from Concepts, Techniques, and Models of Computer Programming (a daunting title if ever there was one):

Flavius Josephus was a Roman historian of Jewish origin. During the Jewish-Roman wars of the first century AD, he was in a cave with fellow soldiers, 40 men in all, surrounded by enemy Roman troops. They decided to commit suicide by standing in a ring and counting off each third man. Each man so designated was to commit suicide... Josephus, not wanting to die, managed to place himself in the position of the last survivor.

In the general version of the problem, there are n soldiers numbered from 1 to n and each k-th soldier will be eliminated. The count starts from the first soldier. What is the number of the last survivor?

I decided to model this situation using objects in three different scripting languages, Perl, Ruby, and Python. The solution in each of the languages is similar. A Person class is defined, which knows whether it is alive or dead, who the next person in the circle is, and what position number it is in. There are methods to pass along a kill signal, and to create a chain of people. Either of these could have been implemented using iteration, but I wanted to give recursion a whirl, since it's tougher on the languages. Here are my results.
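The article's three implementations are not reproduced in this excerpt, but a minimal sketch of the Person-chain idea is easy to give. The following Perl version is hypothetical (names and all), and uses iteration rather than the recursion the author favored:

use strict;
use warnings;

package Person;

sub new {
    my ($class, $position) = @_;
    return bless { position => $position, alive => 1, next => undef }, $class;
}

# Build a closed circle of $n people and return the first one.
sub chain {
    my ($class, $n) = @_;
    my $first = $class->new(1);
    my $prev  = $first;
    for my $i (2 .. $n) {
        my $p = $class->new($i);
        $prev->{next} = $p;
        $prev = $p;
    }
    $prev->{next} = $first;    # close the circle
    return $first;
}

package main;

my ($n, $k) = (40, 3);         # 40 soldiers, every 3rd one eliminated
my $p     = Person->chain($n);
my $alive = $n;
my $count = 0;

while ($alive > 1) {
    if ($p->{alive}) {
        $count++;
        if ($count == $k) {    # this is the k-th living person: eliminate him
            $p->{alive} = 0;
            $alive--;
            $count = 0;
        }
    }
    $p = $p->{next};
}

$p = $p->{next} until $p->{alive};    # walk to the survivor
print "The survivor is number $p->{position}\n";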
Sep 03, 2017 | www.danvk.org

[Jul 10, 2017] Crowdsourcing, Open Data and Precarious Labour by Allana Mayer Model View Culture

Jul 10, 2017 | modelviewculture.com
Crowdsourcing and microtransactions are two halves of the same coin: they both mark new stages in the continuing devaluation of labour.

by Allana Mayer on February 24th, 2016

The cultural heritage industries (libraries, archives, museums, and galleries, often collectively called GLAMs) like to consider themselves the tech industry's little siblings. We're working to develop things like Linked Open Data, a decentralized network of collaboratively-improved descriptive metadata; we're building our own open-source tech to make our catalogues and collections more useful; we're pushing scholarly publishing out from behind paywalls and into open-access platforms; we're driving innovations in accessible tech.

We're only different in a few ways. One, we're a distinctly feminized set of professions, which comes with a large set of internally- and externally-imposed assumptions. Two, we rely very heavily on volunteer labour, and not just in the internship-and-exposure vein: often retirees and non-primary wage-earners are the people we "couldn't do without." Three, the underlying narrative of a "helping" profession -- essentially a social service -- can push us to ignore the first two distinctions, while driving ourselves to perform more and expect less.

I suppose the major way we're different is that tech doesn't acknowledge us, treat us with respect, build things for us, or partner with us, unless they need a philanthropic opportunity. Although, when some ingenue autodidact bootstraps himself up to a billion-dollar IPO, there's a good chance he's been educating himself using our free resources. Regardless, I imagine a few of the issues true in GLAMs are also true in tech culture, especially in regards to labour and how it's compensated.

Crowdsourcing

Notecards in a filing drawer: old-fashioned means of recording metadata.

Photo CC-BY Mace Ojala.

Here's an example. One of the latest trends is crowdsourcing: admitting we don't have all the answers, and letting users suggest some metadata for our records. (Not to be confused with crowdfunding.) The biggest example of this is Flickr Commons: the Library of Congress partnered with Yahoo! to publish thousands of images that had somehow ended up in the LOC's collection without identifying information. Flickr users were invited to tag pictures with their own keywords or suggest descriptions using comments.

Many orphaned works (content whose copyright status is unclear) found their way conclusively out into the public domain (or back into copyright) this way. Other popular crowdsourcing models include gamification , transcription of handwritten documents (which can't be done with Optical Character Recognition), or proofreading OCR output on digitized texts. The most-discussed side benefits of such projects include the PR campaign that raises general awareness about the organization, and a "lifting of the curtain" on our descriptive mechanisms.

The problem with crowdsourcing is that it's been conclusively proven not to function in the way we imagine it does: a handful of users end up contributing massive amounts of labour, while the majority of those signed up might do a few tasks and then disappear. Seven users in the "Transcribe Bentham" project contributed to 70% of the manuscripts completed; 10 "power-taggers" did the lion's share of the Flickr Commons' image-identification work. The function of the distributed digital model of volunteerism is that those users won't be compensated, even though many came to regard their accomplishments as full-time jobs .

It's not what you're thinking: many of these contributors already had full-time jobs , likely ones that allowed them time to mess around on the Internet during working hours. Many were subject-matter experts, such as the vintage-machinery hobbyist who created entire datasets of machine-specific terminology in the form of image tags. (By the way, we have a cute name for this: "folksonomy," a user-built taxonomy. Nothing like reducing unpaid labour to a deeply colonial ascription of communalism.) In this way, we don't have precisely the free-labour-for-exposure/project-experience problem the tech industry has ; it's not our internships that are the problem. We've moved past that, treating even our volunteer labour as a series of microtransactions. Nobody's getting even the dubious benefit of job-shadowing, first-hand looks at business practices, or networking. We've completely obfuscated our own means of production. People who submit metadata or transcriptions don't even have a means of seeing how the institution reviews and ingests their work, and often, to see how their work ultimately benefits the public.

All this really says to me is: we could've hired subject experts to consult, and given them a living wage to do so, instead of building platforms to dehumanize labour. It also means our systems rely on privilege , and will undoubtedly contain and promote content with a privileged bias, as Wikipedia does. (And hey, even Wikipedia contributions can sometimes result in paid Wikipedian-in-Residence jobs.)

For example, the Library of Congress's classification and subject headings have long collected books about the genocide of First Nations peoples during the colonization of North America under terms such as "first contact," "discovery and exploration," "race relations," and "government relations." No "subjugation," "cultural genocide," "extermination," "abuse," or even "racism" in sight. Also, the term "homosexuality" redirected people to "sexual perversion" up until the 1970s. Our patrons are disrespected and marginalized in the very organization of our knowledge.

If libraries continue on with their veneer of passive and objective authorities that offer free access to all knowledge, this underlying bias will continue to propagate subconsciously. As in Mechanical Turk , being "slightly more diverse than we used to be" doesn't get us any points, nor does it assure anyone that our labour isn't coming from countries with long-exploited workers.

Labor and Compensation

Rows and rows of books in a library, on vast curving shelves.

Photo CC-BY Samantha Marx.

I also want to draw parallels between the free labour of crowdsourcing and the free labour offered in civic hackathons or open-data contests. Specifically, I'd argue that open-data projects are less ( but still definitely ) abusive to their volunteers, because at least those volunteers have a portfolio object or other deliverable to show for their work. They often work in groups and get to network, whereas heritage crowdsourcers work in isolation.

There's also the potential for converting open-data projects to something monetizable: for example, a Toronto-specific bike-route app can easily be reconfigured for other cities and sold; while the Toronto version stays free under the terms of the civic initiative, freemium options can be added. The volunteers who supply thousands of transcriptions or tags can't usually download their own datasets and convert them into something portfolio-worthy, let alone sellable. Those data are useless without their digital objects, and those digital objects still belong to the museum or library.

Crowdsourcing and microtransactions are two halves of the same coin: they both mark new stages in the continuing devaluation of labour, and they both enable misuse and abuse of people who increasingly find themselves with few alternatives. If we're not offering these people jobs, reference letters, training, performance reviews, a "foot in the door" (cronyist as that is), or even acknowledgement by name, what impetus do they have to contribute? As with Wikipedia, I think the intrinsic motivation for many people to supply us with free labour is one of two things: either they love being right, or they've been convinced by the feel-good rhetoric that they're adding to the net good of the world. Of course, trained librarians, archivists, and museum workers have fallen sway to the conflation of labour and identity , too, but we expect to be paid for it.

As in tech, stereotypes and PR obfuscate labour in cultural heritage. For tech, an entrepreneurial spirit and a tendency to buck traditional thinking; for GLAMs, a passion for public service and opening up access to treasures ancient and modern. Of course, tech celebrates the autodidactic dropout; in GLAMs, you need a masters. Period. Maybe two. And entry-level jobs in GLAMs require one or more years of experience, across the board.

When library and archives students go into massive student debt, they're rarely apprised of the constant shortfall of funding for government-agency positions, nor do they get told how much work is done by volunteers (and, consequently, how much of the job is monitoring and babysitting said volunteers). And they're not trained with enough technological competency to sysadmin anything , let alone build a platform that pulls crowdsourced data into an authoritative record. The costs of commissioning these platforms aren't yet being made public, but I bet paying subject experts for their hourly labour would be cheaper.

Solutions

I've tried my hand at many of the crowdsourcing and gamifying interfaces I'm here to critique. I've never been caught up in the "passion" ascribed to those super-volunteers who deliver huge amounts of work. But I can tally up other ways I contribute to this problem: I volunteer for scholarly tasks such as peer-reviewing, committee work, and travelling on my own dime to present. I did an unpaid internship without receiving class credit. I've put my research behind a paywall. I'm complicit in the established practices of the industry, which sits uneasily between academic and social work: neither of those spheres have ever been profit-generators, and have always used their codified altruism as ways to finagle more labour for less money.

It's easy to suggest that we outlaw crowdsourced volunteer work, and outlaw microtransactions on Fiverr and MTurk, just as the easy answer would be to outlaw Uber and Lyft for divorcing administration from labour standards. Ideally, we'd make it illegal for technology to wade between workers and fair compensation.

But that's not going to happen, so we need alternatives. Just as unpaid internships are being eliminated ad-hoc through corporate pledges, rather than being prohibited region-by-region, we need pledges from cultural-heritage institutions that they will pay for labour where possible, and offer concrete incentives to volunteer or intern otherwise. Budgets may be shrinking, but that's no reason not to compensate people at least through resume and portfolio entries. The best template we've got so far is the Society of American Archivists' volunteer best practices , which includes "adequate training and supervision" provisions, which I interpret to mean outlawing microtransactions entirely. The Citizen Science Alliance , similarly, insists on "concrete outcomes" for its crowdsourcing projects, to " never waste the time of volunteers ." It's vague, but it's something.

We can boycott and publicly shame those organizations that promote these projects as fun ways to volunteer, and lobby them to instead seek out subject experts for more significant collaboration. We've seen a few efforts to shame job-posters for unicorn requirements and pathetic salaries, but they've flagged without productive alternatives to blind rage.

There are plenty more band-aid solutions. Groups like Shatter The Ceiling offer cash to women of colour who take unpaid internships. GLAM-specific internship awards are relatively common , but could: be bigger, focus on diverse applicants who need extra support, and have eligibility requirements that don't exclude people who most need them (such as part-time students, who are often working full-time to put themselves through school). Better yet, we can build a tech platform that enables paid work, or at least meaningful volunteer projects. We need nationalized or non-profit recruiting systems (a digital "volunteer bureau") that matches subject experts with the institutions that need their help. One that doesn't take a cut from every transaction, or reinforce power imbalances, the way Uber does. GLAMs might even find ways to combine projects, so that one person's work can benefit multiple institutions.

GLAMs could use plenty of other help, too: feedback from UX designers on our catalogue interfaces, helpful tools , customization of our vendor platforms, even turning libraries into Tor relays or exits . The open-source community seems to be looking for ways to contribute meaningful volunteer labour to grateful non-profits; this would be a good start.

What's most important is that cultural heritage preserves the ostensible benefits of crowdsourcing – opening our collections and processes up for scrutiny, and admitting the limits of our knowledge – without the exploitative labour practices. Just like in tech, a few more glimpses behind the curtain wouldn't go astray. But it would require deeper cultural shifts, not least in the self-perceptions of GLAM workers: away from overprotective stewards of information, constantly threatened by dwindling budgets and unfamiliar technologies, and towards facilitators, participants in the communities whose histories we hold.


[Nov 11, 2016] pdxdoughnut

Notable quotes:
"... Show the command line for a PID, converting nulls to spaces and a newline ..."
Nov 11, 2016 | www.commandlinefu.com


Terminal - Commands by pdxdoughnut - 26 results

whatinstalled() { which "$@" | xargs -r readlink -f | xargs -r dpkg -S ;}

2016-11-08 20:59:25

Tags: which grep dpkg function packages realpath
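For each command name given, this finds the executable with which, resolves any symlinks with readlink -f, and then asks dpkg -S which Debian package owns the resulting file.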


command ${MYVAR:+--someoption=$MYVAR}

2015-11-04 19:47:24

Use a var with more text only if it exists. See "Parameter Expansion" in the bash manpage. They refer to this as "Use Alternate Value", but we're including the var itself in the alternate value.


[ -n "$REMOTE_USER" ] || read -p "Remote User: " -er -i "$LOGNAME" REMOTE_USER

2015-10-30 17:08:17

If (and only if) the variable is not set, prompt users and give them a default option already filled in. The read command reads input and puts it into a variable. With -i you set an initial value. In this case I used a known environment variable.


debsecan --format detail

2015-10-22 18:46:41

List known debian vulnerabilities on your system -- many of which may not yet be patched.

You can search for CVEs at https://security-tracker.debian.org/tracker/ or use --report to get full links. This can be added to cron, but unless you're going to do manual patches, you'd just be torturing yourself.


tr '\0' ' ' </proc/21679/cmdline ; echo

2015-09-25 22:08:31

Show the command line for a PID, converting nulls to spaces and a newline


mv -iv $FILENAME{,.$(stat -c %y $FILENAME | awk '{print $1}')}

2014-12-01 22:41:38

Rename file to same name plus datestamp of last modification.

Note that the -i will not help in a script. Proper error checking is required.


echo "I am $BASH_SUBSHELL levels nested";

2014-06-20 20:33:43


diff -qr /dirA /dirB

2014-04-01 21:42:19

Shows which files differ in two directories


find ./ -type l -ls

2014-03-21 17:13:39

Show all symlinks


ls | xargs WHATEVER_COMMAND

xargs will automatically determine how many args are too many and only pass a reasonable number of them at a time. In the example, 500,002 file names were split across 26 instantiations of the command "echo".

killall conky

Kill a process (e.g. conky) by its name; useful when debugging conky :)

nmap -sn 192.168.1.0/24

2014-01-28 23:32:18

Ping all hosts on 192.168.1.0/24

find . -name "pattern" -type f -printf "%s\n" | awk '{total += $1} END {print total}'

2014-01-16 01:16:18

Find files and calculate size of result in shell. Using find's internal stat to get the file size is about 50 times faster than using -exec stat.

mkdir Epub ; mv -v --target-directory=Epub $(fgrep -lr epub *)

2014-01-16 01:07:29

Move all files containing the keyword "epub" into an Epub folder


CMD=chrome ; ps h -o pmem -C $CMD | awk '{sum+=$1} END {print sum}'

Show the total cumulative memory usage (sum of %MEM) of a process that spawns multiple instances of itself

mussh -h 192.168.100.{1..50} -m -t 10 -c uptime

2013-11-27 18:01:12

This will run them at the same time and time out for each host in ten seconds. Also, mussh will append the IP address to the beginning of the output so you know which host responded with which time.

The use of the sequence expression {1..50} is not specific to mussh. The `seq ...` works, but is less efficient.


du -Sh | sort -h | tail

2013-11-27 17:50:11

Which files/dirs waste my disk space

I added -S to du so that you don't include /foo/bar/baz.iso in /foo, and changed sort's -n to -h so that it can properly sort the human-readable sizes.

base64 /dev/urandom | head -c 33554432 | split -b 8192 -da 4 - dummy.

2013-11-12 17:56:23

Create a bunch of dummy text files

Avoiding a for loop brought this time down to less than 3 seconds on my old machine. And just to be clear, 33554432 = 8192 * 4096.


sudo lsof -iTCP:25 -sTCP:LISTEN

2013-11-12 17:32:34

Check whether anything is listening on TCP port 25

for i in {1..4096}; do base64 /dev/urandom | head -c 8192 > dummy$i.rnd ; done

2013-11-12 00:36:10

Create a bunch of dummy text files

Using the 'time' command, running this with 'tr' took 28 seconds (and change) each time, but using base64 only took 8 seconds (and change). If the file doesn't have to be viewable, pulling straight from urandom with head only took 6 seconds (and change).


find -name .git -prune -o -type f -exec md5sum {} \; | sort -k2 | md5sum

2013-10-28 22:14:08

Create md5sum of a directory


logger -t MyProgramName "Whatever you're logging"

2013-10-22 16:34:49

Create your own logging function for your script. You could also pipe to logger.

find garbage/ -type f -delete

2013-10-21 23:26:51

rm filenames with spaces

I _think_ you were trying to delete files whether or not they had spaces. This would do that. You should probably be more specific though.


lsof -iTCP:8080 -sTCP:LISTEN

2013-10-07 18:22:32

check to see what is running on a specific port number

find . -size +100M

2013-10-07 18:16:14

Recursively search the current directory for files larger than 100MB


[Sep 18, 2016] R Weekly

Sep 18, 2016 | tm.durusau.net
September 12th, 2016 R Weekly

A new weekly publication of R resources that began on 21 May 2016 with Issue 0 .

Mostly titles of posts and news articles, which is useful, but not as useful as short summaries, including the author's name, would be.

[Nov 08, 2015] Abstraction

nickgeoghegan.net

Filed in Programming No Comments

A few things can confuse programming students, or new people to programming. One of these is abstraction.

Wikipedia says:

In computer science, abstraction is the process by which data and programs are defined with a representation similar to its meaning (semantics), while hiding away the implementation details. Abstraction tries to reduce and factor out details so that the programmer can focus on a few concepts at a time. A system can have several abstraction layers whereby different meanings and amounts of detail are exposed to the programmer. For example, low-level abstraction layers expose details of the hardware where the program is run, while high-level layers deal with the business logic of the program.

That might be a bit too wordy for some people, and not at all clear. Here's my analogy of abstraction.

Abstraction is like a car

A car has a few features that make it unique.

If someone can drive a Manual transmission car, they can drive any Manual transmission car. Automatic drivers, sadly, cannot drive a Manual transmission car without "relearning" the car. That is an aside; we'll assume that all cars are Manual transmission cars, as is the case in Ireland for most cars.

Since I can drive my car, which is a Mitsubishi Pajero, that means that I can drive your car, whether a Honda Civic, Toyota Yaris, or Volkswagen Passat.

All I need to know, in order to drive a car, any car, is how to use the brakes, accelerator, steering wheel, clutch and transmission. Since I already know this in my car, I can abstract away your car and its controls.

I do not need to know the inner workings of your car in order to drive it, just the controls. I don't need to know how exactly the brakes work in your car, only that they work. I don't need to know that your car has a turbocharger, only that when I push the accelerator, the car moves. I also don't need to know the exact revs at which I should gear up or gear down (although that would be better on the engine!)

Virtually all controls are the same. Standardization means that the clutch, brake and accelerator are all in the same place, regardless of the car. This means that I do not need to relearn how a car works. To me, a car is just a car, and is interchangeable with any other car.

Abstraction means not caring

As a programmer, or someone using a third-party API (for example), abstraction means not caring how the inner workings of some function work -- the linked-list data structure, the variable names inside the function, the sorting algorithm used, etc. -- just that I have a standard (preferably unchanging) interface to do whatever I need to do.

Abstraction can be thought of as a black box: for input, you get output. That shouldn't always be the case, but often is. We need abstraction so that, as programmers, we can concentrate on other aspects of the program -- this is the corner-stone for large-scale, multi-developer software projects.
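To put that black box into code: here is a minimal Perl sketch (the function name and implementation are illustrative, not the author's). Callers depend only on the interface -- numbers in, average out -- and never see how the sum is computed:

use strict;
use warnings;
use List::Util qw(sum);

# The interface: pass in numbers, get their average back.
# The implementation (List::Util today, a hand-rolled loop tomorrow)
# can change without any caller noticing.
sub average {
    my @numbers = @_;
    die "average() needs at least one number\n" unless @numbers;
    return sum(@numbers) / @numbers;    # @numbers in scalar context is the count
}

print average(4, 8, 15, 16, 23, 42), "\n";    # prints 18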

[Feb 05, 2015] JavaScript, PHP Top Most Popular Languages, With Apple's Swift Rising Fast

It's not about the language, it's about the ecosystem

Feb 04, 2015 | slashdot.org

Posted by samzenpus
from the king-of-the-hill dept.

Nerval's Lobster writes: Developers assume that Swift, Apple's newish programming language for iOS and Mac OS X apps, will become extremely popular over the next few years. According to new data from RedMonk, a tech-industry analyst firm, Swift could reach that apex of popularity sooner rather than later. While the usual stalwarts -- including JavaScript, Java, PHP, Python, C#, C++, and Ruby -- top RedMonk's list of the most-used languages, Swift has, well, swiftly ascended 46 spots in the six months since the firm's last update, from 68th to 22nd. RedMonk pulls data from GitHub and Stack Overflow to create its rankings, due to those sites' respective sizes and the public nature of their data. While its top-ranked languages don't trade positions much between reports, there's a fair amount of churn at the lower end of the rankings. Among those "smaller" languages, R has enjoyed stable popularity over the past six months, Rust and Julia continue to climb, and Go has exploded upwards -- although CoffeeScript, often cited as a language to watch, has seen its support crumble a bit.


Dutch Gun (899105) on Wednesday February 04, 2015 @09:45PM (#48985989)

Re:not really the whole story (Score:5, Insightful)

More critically, the question I always ask about this is: "Used for what?"

Without that context, why does popularity even matter? For example, I'm a game developer, so my programming life revolves around C++, at least for game-side or engine-level code - period. Nothing else is even on the radar when you're talking about highly-optimized, AAA games. For scripting, Lua is a popular contender. For internal tools, C# seems to be quite popular. I've also seen Python used for tool extensions, or for smaller tools in their own right. Javascript is generally only used for web-based games, or by the web development teams for peripheral stuff.

I'll bet everyone in their own particular industry has their own languages which are dominant. For instance, if you're working on the Linux kernel, you're obviously working in C. It doesn't matter what the hell everyone else does. If you're working in scientific computing, are you really looking seriously at Swift? Of course not. Fortran, F#, or C++ are probably more appropriate, or perhaps others I'm not aware of. A new lightweight iOS app? Swift it is!

Languages are not all equal. The popularity of Javascript is not the measure of merit of that particular language. It's a measure of how popular web-based development is (mostly). C/C++ is largely a measure of how many native, high-performance-required applications there are (games, OS development, large native applications). Etc, etc.

Raw popularity numbers probably only have one practical use, and that's finding a programming job without concern for the particular industry. Or I suppose if you're so emotionally invested in a particular language, it's nice to know where it stands among them all.

unrtst (777550) on Wednesday February 04, 2015 @10:34PM (#48986283)

... And not sure public github or stack overflow are really as representative as they want to believe

Yeah.. why is this any better than:

TIOBE index: http://www.tiobe.com/index.php... [tiobe.com]

This story about python surpassing java as top learning language:

http://developers.slashdot.org... [slashdot.org]

Or this about 5 languages you'll need to learn for the next year and on:

http://news.dice.com/2014/07/2... [dice.com]

... those are all from the past year on slashdot, and there's loads more.

Next "top languages" post I see, I hope it just combines all the other existing stats to provide a weightable index (allow you to tweak what's most important). Maybe BH can address that :-)

gavron (1300111) on Wednesday February 04, 2015 @08:21PM (#48985495)

68th to 22nd and there are many to go (Score:5, Insightful)

All new languages start out at the bottom, as Swift did. In time, the ones that don't get used fall down.

Swift has gotten up to 22nd, but the rest of the climb past the stragglers won't ever happen.

However, to be "the most popular language" is clearly no contest worth winning. Paris Hilton and Kim Kardashian are most popular compared to Steven Hawking and Isaac Asimov.

Being popular doesn't mean better, useful, or even of any value whatsoever. It just means someone has a better marketing-of-crap department.

There's a time to have popularity contests. It's called high school.


coop247 (974899) on Wednesday February 04, 2015 @08:54PM (#48985695)

Being popular doesn't mean better, useful, or even of any value whatsoever

PHP runs facebook, yahoo, wordpress, and wikipedia. Javascript runs everything on the internet. Yup, no value there.

UnknownSoldier (67820) on Wednesday February 04, 2015 @08:39PM (#48985617)

Popularity != Quality (Score:5, Insightful)

McDonalds may serve billions, but no one is trying to pass it off as gourmet food.

Kind of like PHP and Javascript. The most fucked up languages are the most popular ... Go figure.

* http://dorey.github.io/JavaScr... [github.io]

Xest (935314) on Thursday February 05, 2015 @04:47AM (#48987365)

Re:Popularity != Quality (Score:2)

I think this is the SO distortion effect.

Effectively the more warts a language has and/or the more poorly documented it is, the more questions that are bound to be asked about it, hence the more apparent popularity if you use SO as a metric.

So if companies like Microsoft and Oracle produce masses of great documentation for their respective technologies and provide entire sites of resources for them (such as www.asp.net or the MSDN developer forums) then they'll inherently see reduced "popularity" on SO.

Similarly, some languages have a higher bar to entry. PHP and Javascript are both repeatedly sold as languages that beginners can start with, so it should be unsurprising that more questions are asked about them than by people who have moved up the chain to enterprise languages like C#, Java, and C++.

But I shouldn't complain too much, SO popularity whilst still blatantly flawed is still a far better metric than TIOBE whose methodology is just outright broken (they explain their methodology on their site, and even without high school statistics knowledge it shouldn't take more than 5 seconds to spot gaping holes in their methodology).

I'm still amazed no one's done an actual useful study on popularity and simply scraped data from job sites each month. It'd be nice to know what companies are actually asking for, and what they're paying. That is after all the only thing anyone really wants to know when they talk about popularity - how likely is it to get me a job, and how well is it likely to pay? Popularity doesn't matter beyond that as you just choose the best tool for the job regardless of how popular it is.

Shados (741919) on Wednesday February 04, 2015 @11:08PM (#48986457)

Re:Just learn C and Scala (Score:2)

It's not about the language, it's about the ecosystem. I.e.: .NET may be somewhere in between Java and Scala, and the basics of the framework are the same, but if you do high-end stuff, JVM languages and CLR languages are totally different. Different in how you debug it in production, different in what the standards are, different in what patterns people expect you to use when they build a library, different gotchas. And while you can pick up the basics in an afternoon, it can take years to really push it.

Doesn't fucking matter if you're doing yet-another-e-commerce-site (and if you are, why?). Really fucking big deal if you do something that's never been done before with a ridiculous amount of users.

[Oct 18, 2013] Tom Clancy, Best-Selling Master of Military Thrillers, Dies at 66

Fully applicable to programming...
NYTimes.com

"I tell them you learn to write the same way you learn to play golf," he once said. "You do it, and keep doing it until you get it right. A lot of people think something mystical happens to you, that maybe the muse kisses you on the ear. But writing isn't divinely inspired - it's hard work."

[Oct 12, 2013] YaST Developers Explain Move to Ruby by Susan Linton

Oct. 10, 2013 | OStatic

Last summer Lukas Ocilka mentioned the completion of the basic conversion of YaST from YCP to Ruby. At the time it was said the change was needed to encourage contributions from a wider set of developers, and Ruby is said to be simpler and more flexible. Well, today Jos Poortvliet posted an interview with two YaST developers explaining the move in more detail.

In a discussion with Josef Reidinger and David Majda, Poortvliet discovered the reason for the move was because all the original YCP developers had moved on to other things and everyone else felt YCP slowed them down. "It didn't support many useful concepts like OOP or exception handling, code written in it was hard to test, there were some annoying features (like a tendency to be "robust", which really means hiding errors)."

Ruby was chosen because it is a well known language over at the openSUSE camp and was already being used on other SUSE projects (such as WebYaST). "The internal knowledge and standardization was the decisive factor." The translation went smoothly according to developers because they "automated the whole process and did testing builds months in advance. We even did our custom builds of openSUSE 13.1 Milestones 2 and 3 with pre-release versions of YaST in Ruby."

For now, performance under the Ruby code is comparable to the YCP version, because developers were concentrating on getting it working well during these first few phases, and users will notice very little, if any, visual change to the YaST interface. No more major changes are planned for this development cycle, but the new YaST will be used in 13.1, due out November 19.

See the full interview for lots more detail.

[Oct 19, 2012] Google's Engineers Are Well Paid, Not Just Well Fed

October 18, 2012 | Slashdot

Anonymous Coward

Re:$128,000?

writes: on Thursday , @12:38PM (#41694241)

I make more than $40k as a software developer, but it wasn't too long ago that I was making right around that amount.

I have an AAS (not a fancy degree, if you didn't already know), my GPA was 2.8, and I assure you that neither of those things has EVER come up in a job interview. I'm also old enough that my transcripts are gone. (Schools only keep them for about 10 years. After that, nobody's looking anyway.)

The factors that kept me from making more are:

So when I did finally land a programming job, it was as a code monkey in a PHP sweatshop. The headhunter wanted a decent payout, so I started at $40k. No raises. Got laid off after a year and a half due to it being a sweatshop and I had outstayed my welcome. (Basically, I wanted more money and they didn't want to give me any more money.)

Next job was a startup. Still $40k. Over 2.5 years, I got a couple of small raises. I topped out at $45k-ish before I got laid off during the early days of the recession.

Next job was through a headhunter again. I asked for $50k, but the employer could only go $40k. After 3 years and a few raises, I'm finally at $50k.

I could probably go to the larger employers in this city and make $70k, but that's really the limit in this area. Nobody in this line of work makes more than about $80k here.

aralin

Not accurate, smaller companies pay more

This survey must be only talking about companies above a certain size. Our Silicon Valley startup has about 50 employees and the average engineering salaries are north of $150,000. Large companies like Google actually don't have to pay that much, because the hours are more reasonable. I know there are other companies too that pay more than Google in the area.


Re:Not accurate, smaller companies pay more (Score:4, Interesting)
by MisterSquid (231834) writes: on Thursday October 18, @11:16AM (#41693121)
Our Silicon Valley startup has about 50 employees and the average engineering salaries are north of $150,000.

I suppose there are some start-ups that do pay developers the value of the labor, but my own experience is a bit different in that it was more stereotypical of Silicon-Valley startup compensation packages. That is, my salary was shamefully low (I was new to the profession), just about unlivable for the Bay Area, and was offset with a very accelerated stock options plan.

[Mar 14, 2012] Perl vs Python Why the debate is meaningless " The ByteBaker

Arvind Padmanabhan:

I don't know Python but I can comment on Perl. I have written many elegant scripts for complex problems and I still love it. I often come across comments about how a programmer went back to his program six months later and had difficulty understanding it. For my part, I haven't had this problem, primarily because I consistently use a single syntax. If Perl provides more than one way to do things, I choose and use only one. Secondly, I do agree that objects in Perl are clunky and make for difficult writing/reading. I have never used them. This makes it difficult for me to write Perl scripts for large projects. Perhaps this is where Python succeeds.

shane o mac

I was forced to learn Python in order to write scripts within Blender (Open Source 3D Modeler).

White Hat:
1. dir( object )
This is nice as it shows functions and constants.

Black Hat:
1. Indentation to denote code blocks. (Crap?!)

PERL was more of an experiment than a necessity. Much of what I know about regular expressions probably came from reading about PERL. I never even wrote much code in PERL. You see, CPAN (the repository for modules) alone makes up for all the drawbacks I can't think of at the moment.

White Hat:
FreeBSD used to extensively use PERL for installation routines (4.7; I keep a copy of it in my music case, although I don't know why, as I feel it's a good luck charm of sorts). Then I read that in 5.0 they started removing it in favor of shell script (Bash). Why?

Black Hat:
I'm drawing a blank here.

With freedom there are costs: you are allowed to do as you please, place variables as you must, and toss code amuck.

You can discipline yourself to write great code in any language. VB-Script I write has the appearance of a standard practices C application.

I think it's a waste of time to sit and debate over which language is suited for which project. Pick one and master it.

So you can't read PERL? Break the modules into several files. It's more about information management than artisan ability. Divide and Conquer.

Peter

I disagree with mastering one language. Often there will be trade-offs in a program that match very nicely with a particular language.

For example, you could code everything in C++\C\assembler, but that only makes sense when you really need speed or memory compactness. After all, I find it difficult to write basic file processing applications in C in under 10 minutes.

Perl examples use a lot of default variables and various ways to approach problems, but this is really a nightmare when you have to maintain someone else's code, especially if you don't know Perl. I think it's hard to understand (without a background in Perl).

Juls

I'm a perl guy not a py programmer so I won't detract from python [except for the braces, Guido should at least let the language compile with them].

Note: Perl is like English -- it's a reflective language. So I can make nouns into adjectives and use the power of reflection. For example, 'The book on my living room table' vs. [Spanish] 'The book on the table of my living room'.

And this makes sense, because Larry Wall was a linguist, and was very influenced by the fact that reflective languages can say more with less, because much is implied based on usage. These languages can also say the same thing many different ways. [Perl makes me pull my hair out. | Perl makes me pull out my hair.] And being that we have chromosomes that wire us for human language, these difficulties are soon mastered even by children. But we don't have the same affinity for programming languages (well, most of us), so yes, Perl can be a struggle in the beginning. But once you achieve a strong familiarity and stop trying to turn Perl into C or Python and allow Perl just to be Perl, you really, really start to enjoy it for those reasons you didn't like it before.

The biggest failure of Perl has been its users enjoying the higher end values of the language and failing to publish and document simple examples to help non-monks get there. You shouldn't have to be a monk to seek wisdom at the monastic gates.

Example: Perl classes. Obtuse and hard to understand, you say? It doesn't have to be -- I think that most programmers will understand and be able to write their own after just looking at this simple example. Keep in mind, we just use 'package' instead of 'class'. Bless tells the interpreter your intentions and is used explicitly because you can bless all kinds of things -- including a class (package).

my $calc = Calc->new;    # the indirect form "new Calc" also works, but is discouraged
print $calc->Add(1);
print $calc->Add(9);
print $calc->Pie(67);

package Calc;

sub new
{
    my $class = shift;    # the package name, 'Calc'
    my $self = {
        _undue      => undef,    # holds the previous value, for undo
        _currentVal => 0,
        _pie        => 3.14
    };
    bless $self, $class;    # mark $self as an object of class 'Calc'
    return $self;
}

sub Add
{
    my ($self, $val) = @_;
    $self->{_undue}      = $self->{_currentVal};          # save off the last value
    $self->{_currentVal} = $self->{_currentVal} + $val;   # add the scalars
    return $self->{_currentVal};                          # return the new value
}

sub Pie
{
    my ($self, $val) = @_;
    $self->{_undue}      = $self->{_currentVal};          # save off the last value
    $self->{_currentVal} = $self->{_pie} * $val;          # multiply by pi
    return $self->{_currentVal};                          # return the new value
}

[Mar 14, 2012] How To Contribute To Open Source Without Being a Programming Rock Star

Esther Schindler writes "Plenty of people want to get involved in open source, but don't know where to start. In this article, Andy Lester lists several ways to help out even if you lack confidence in your technical chops. Here are a couple of his suggestions: 'Maintenance of code and the systems surrounding the code often are neglected in the rush to create new features and to fix bugs. Look to these areas as an easy way to get your foot into a project. Most projects have a publicly visible trouble ticket system, linked from the front page of the project's website and included in the documentation. It's the primary conduit of communication between the users and the developers. Keeping it current is a great way to help the project. You may need to get special permissions in the ticketing system, which most project leaders will be glad to give you when you say you want to help clean up the tickets.'" What's your favorite low-profile way to contribute?

[Dec 25, 2010] [PDF] Brian Kernighan - Random thoughts on scripting languages

I would expect better, higher level of thinking from Brian...


[Dec 20, 2010] What hurts you the most in Perl

LinkedIn

Steve Carrobis

Perl is a far better applications type language than JAVA/C/C#. Each has their niche. Threads were always an issue in Perl, and like OO, if you don't need it or know it don't use it.

My issue with Perl is when people get Overly Obfuscated with their code because they think that fewer characters and a few pointers make the code faster.

Unless you do some real smart OOesque building, all you are doing is making it harder to figure out what you were thinking about. And please, Perl programmers, don't buy into the "self documenting code" idea. I am an old mainframer, and self-documenting code meant that as you wrote, you added comments to the core parts of the code... I can call my subroutine "apple" to describe it, but is it really an apple? Or is it a tomato or pomegranate? If written properly, Perl is very efficient code, and like all the other languages, if written incorrectly it's HORRIBLE. I have been writing Perl since almost before 3.0 ;-)

Thats my 3 cents.. Have a HAPPY and a MERRY!

Nikolai Bezroukov

@steve Thanks for a valuable comment about the threat of overcomplexity junkies in Perl. That's a very important threat that can undermine the language's future.

@Gabor: It is a well-known fact that PHP, which is a horrible language in both general design and the implementation of most features you mentioned, is very successful and is widely used for large Web applications with a database backend (MediaWiki is one example). Also, if we think about all the dull, stupid and unreliable Java coding of large business applications that we see in the marketplace, the question arises whether we want this type of success ;-)

@Douglas: Mastering Perl requires a slightly higher level of qualification from developers than "Basic-style" development in PHP or commercial Java development (where Java typically plays the role of Cobol), which is mainstream these days. Also, many important factors are outside the technical domain: the ecosystem for Java is tremendous and is supported by players with deep pockets. The same is true for Python. Still, Perl has unique advantages, is universally deployed on Unix, and as such is and always will be attractive for thinking developers.

I think that for many large business applications, which these days often means a Web application with a database backend, one can use the virtual appliance model and rely on OS facilities for multitasking. There is nothing wrong with this approach on modern hardware. Here Perl provides important advantages due to its good integration with Unix.

Also, structuring a large application into modules that use pipes and sockets as the communication mechanism often provides very good maintainability. Pointers are also very helpful and almost unique to Perl: scripting languages typically do not provide pointers, but Perl does (as references), and as such gives the developer unique power and flexibility (with additional risks as an inevitable side effect).
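As an illustration of that point, here is a minimal sketch of Perl references used as pointers (all names are invented):

use strict;
use warnings;

my %config = ( host => 'localhost', port => 8080 );
my $cfg = \%config;                  # a reference: Perl's (safe) pointer
print $cfg->{port}, "\n";            # dereference with the arrow operator

sub update_port {
    my ($conf, $port) = @_;          # the hash arrives by reference, no copy
    $conf->{port} = $port;           # the caller's hash is changed in place
}
update_port( $cfg, 9090 );
print $config{port}, "\n";           # prints 9090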

Another important advantage of Perl is that it is a higher-level language than Python (to say nothing of Java) and stimulates prototyping, which is tremendously important for large projects, as the initial specification is usually incomplete and incorrect. Also, despite the proliferation of overcomplexity junkies in the Perl community, some aspects of Perl prevent an excessive number of layers/classes, a common trap that undermines large projects in Java. Look at the IBM fiasco with Lotus Notes 8.5.

I think that Perl is great in the way it integrates with Unix and promotes thinking of complex applications as virtual appliances. BTW, this approach also permits the use of a second language for those parts of the system for which Perl does not present clear advantages.

Also, Perl provides an important bridge to system administrators, who often know the language and can use a subset of it productively. That makes it preferable for large systems that depend on customization, such as monitoring systems.

The absence of a bytecode compiler hurts the development of commercial applications in Perl in more ways than one, but that's just a question of money. I wonder why ActiveState missed this opportunity to increase its revenue stream. I also agree that the quality of many CPAN modules could be improved, but abuse of CPAN, along with fixation on OO, is a typical trait of overcomplexity junkies, so this has a positive aspect too :-).

I don't think that OO is a problem for Perl if you use it where it belongs: in GUI interfaces. In many cases OO is used when hierarchical namespaces are sufficient. Perl provides a clean implementation of the concept of namespaces. The problem is that many people are trained in the Java/C++ style of OO, and as we know, to a hammer everything looks like a nail. ;-)
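A minimal sketch of that namespace idea (package and key names are invented): related functions and shared state grouped in a hierarchical package, with no bless and no class machinery:

use strict;
use warnings;

package App::Config;                 # hierarchical namespace, not a class
my %settings;                        # state shared by the subs below
sub set { my ($key, $value) = @_; $settings{$key} = $value; }
sub get { my ($key) = @_; return $settings{$key}; }

package main;
App::Config::set( retries => 3 );    # plain qualified calls, no objects
print App::Config::get('retries'), "\n";   # prints 3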

Allan Bowhill:

I think the original question Gabor posed implies there is a problem 'selling' Perl to companies for large projects. Maybe it's a question of narrowing its role.

It seems to me that if you want an angle to sell Perl on, it would make sense to cast it (in a marketing sense) into a narrower role that doesn't pretend to be everything to everyone. Because, despite what some hard-core Perl programmers might say, the language is somewhat dated. It hasn't really changed all that much since the 1990s.

Perl isn't particularly specialized so it has been used historically for almost every kind of application imaginable. Since it was (for a long time in the dot-com era) a mainstay of IT development (remember the 'duct tape' of the internet?) it gained high status among people who were developing new systems in short time-frames. This may in fact be one of the problems in selling it to people nowadays.

The FreeBSD OS even included Perl as part of its main (full) distribution for some time, and if I remember correctly, Perl scripts were included to manage the ports/packaging system for all the 3rd-party software. It was taken out of the OS shortly after the bust and the committee reorganization at FreeBSD, where it was moved into third-party software. The package-management scripts were re-written in C. Other package-management utilities were effectively displaced by a Ruby package.

A lot of technologies have come along since the 90s which are more appealing platforms than Perl for web development, which is mainly what it's about now.

If you are going to build modern web sites these days, you'll more than likely use some framework that utilizes object-oriented languages. I suppose the Moose augmentation of Perl would have some appeal with that, but CPAN modules and addons like Moose are not REALLY the Perl language itself. So if we are talking about selling the Perl language alone to potential adopters, you have to be honest in discussing the merits of the language itself without all the extras.

Along those lines I could see Perl having special appeal being cast in a narrower role, as a kind of advanced systems batching language - more capable and effective than say, NT scripting/batch files or UNIX shell scripts, but less suitable than object-oriented languages, which pretty much own the market for web and console utilities development now.

But there is a substantial role for high-level batching languages, particularly in systems that build data for consumption by other systems. These are traditionally implemented in the highest-level batching language possible. Such systems build things like help files, structured (non-relational) databases (often used in high-volume commercial web services), and software. Not to mention the automation of many systems administration tasks.

There are not many features or advantages of Perl that are unique in the realm of scripting languages any more, as they were in the 90s. The simplicity of built-in Perl data structures and regular expression capabilities is reflected almost identically in Ruby, and is at least accessible in other strongly-typed languages like Java and C#.

The fact that Perl is easy to learn, stays consistent with the idea that "everything is a string", and imposes no need to formalize things into an object-oriented model are a few of its selling points. If it is cast as an advanced batching language, there are almost no other languages that could compete with it in that role.
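To make the "advanced batching language" point concrete, here is a minimal sketch of a typical batch job (the log name and format are invented):

use strict;
use warnings;

# summarise one system's output for another system to consume
my %count;
open my $log, '<', 'access.log' or die "access.log: $!";
while (<$log>) {
    $count{$1}++ if /\s(\d{3})\s/;   # first three-digit field = status code
}
close $log;
printf "%s %d\n", $_, $count{$_} for sort keys %count;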

Dean Hamstead:

@Pascal: bytecode is nasty for the poor Sysadmin/Devop who has to run your code. She/he can never fix it when bugs arise. There is no advantage to bytecode over interpreted.

Which in fact leads me to a good point.

All the 'selling points' of Java have failed to be of any real substance.

In truth, Java is popular because it is popular.

Lots of people don't like Perl because it's not popular any more, similar to how lots of people hate Macs but have no logical reason for doing so.

Douglas is almost certainly right, that Python is rapidly becoming the new fad language.

I'm not sure how Perl OO is a 'hack'. When you bless a reference into an object, it becomes an object... I can see that some people are confused by Perl's honesty about what an object is. Other languages attempt to hide away how they have implemented objects in their compiler; who cares? Ultimately the objects are all converted into machine code and executed.

In general, Perl objects are more object-oriented than Java objects. They are certainly more polymorphic.

Perl objects can fully hide their internals if that's something you want to do. It's not even hard, and you don't need to use Moose. But does it afford any real benefit? Not really.

At the end of the day, if you want good software you need to hire good programmers; it has nothing to do with the language. Even though some languages try to force the code to be neat (Python) or try to force certain behaviours (Java?), you can write complete garbage in any of them, and then curse that language for allowing the author to do so.

A syntactic argument is pointless, as is one oriented around OO. The benefits Perl brings to a business are...

- massive centralised website of libraries (CPAN)
- MVCs
- DBI
- POE
- other frameworks, etc.
- automated code review (perlcritic)
- automated code formatting and tidying (perltidy)
- document as you code (POD)
- natural test-driven development (Test::More etc.; see the sketch after this list)
- platform independence
- Perl environments on more platforms than Java
- Perl comes out of the box on every Unix
- excellent canon of printed literature, from beginner to expert
- a common language between Sysadmin/Devops and traditional developer roles (with source code always available to *fix* the problem quickly, rather than having to set up an ant environment and roll a new WAR file)
- rolled-up Perl applications (PAR files)
- Perl can use more than 3.6 GB of RAM (try that in Java)
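To make the Test::More point concrete, a minimal sketch, assuming the Calc package from the class example earlier on this page has been saved as Calc.pm:

use strict;
use warnings;
use Test::More tests => 3;

use_ok('Calc');                        # loads Calc.pm
my $calc = Calc->new;
is( $calc->Add(1),  1, 'Add starts from zero' );
is( $calc->Add(9), 10, 'Add accumulates' );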

Brian Martin

Well said Dean.

Personally, I don't really care if a system is written in Perl or Python or some other high level language, I don't get religious about which high level language is used.

There are many [very] high level languages, any one of them is vastly more productive & consequently less buggy than developing in a low level language like C or Java. Believe me, I have written more vanilla C code in my career than Perl or Python, by a factor of thousands, yet I still prefer Python or Perl as quite simply a more succinct expression of the intended algorithm.

If anyone wants to argue the meaning of "high level", well, basically APL wins, OK? In APL, inverting a matrix is a single operator. If you've never had to implement a matrix inversion from scratch, then you've never done serious programming. Meanwhile, Python and Perl are pretty convenient.

What I mean by a "[very] high level language" is basically: how many pages of code does it take to play a decent game of draughts (chequers), or chess?

[Nov 18, 2010] The Law of Software Development By James Kwak

November 17, 2010 | The Baseline Scenario

I recently read a frightening 2008 post by David Pogue about the breakdown of homemade DVDs. This inspired me to back up my old DVDs of my dog to my computer (now that hard drives are so much bigger than they used to be), which led me to install HandBrake. The Handbrake web site includes this gem:

"The Law of Software Development and Envelopment at MIT:

Every program in development at MIT expands until it can read mail."
I thought of that when I heard that Facebook is launching a (beyond) email service.

(The side benefit of this project is that now I get to watch videos of my dog sleeping whenever I want to.)

Nemo

Pogue should have mentioned whether he was talking about DVD-R or DVD-RW.

The rewritable variants are vastly more perishable than the write-once variants.

http://www.thexlab.com/faqs/opticalmedialongevity.html

Jason

That law is originally due to Jamie Zawinski (rather famous software developer known for his work on Netscape Navigator and contributions to Mozilla and XEmacs). In its original form:

Every program attempts to expand until it can read mail. Those programs which cannot so expand are replaced by ones which can. - Jamie Zawinski

Ted K

James, you remind me of myself when I'm drinking and thinking of past things. I mean, I'm not criticizing you; please don't take it as criticism. But watching videos of the dog that passed away... Don't torture yourself, man. Wait some time, and when you and the family are ready, get a new dog. It should probably be a different breed unless you're super-sold on that breed. Let the kids choose one from a set of breeds you like.

We got Pogue's book on the iPod because Apple's manual is so crappy. He is "da man".

You know, if Apple gave a damn about customers they would include a charger cord with that damned thing to hook into the wall instead of making you shop for the charger cord separately, but Mr. "I'm an assh*le" Steve Jobs couldn't be bothered to show he's customer-service inclined. He's slowly but surely going the way of Bill Gates.

Ted K

Mr. Kwak,
There is a super good story they're running on PBS NewsHour today with the former "NOW" host David Brancaccio on a show called "Fixing the Future". James you need to try to download that or catch the show. That looks really good and shows some promise for the future. Catch this link people. Nov 18 (Thursday)
http://www.pbs.org/now/fixing-the-future/index.html

David Petraitis

@Ted K
It looks like programs need to expand until they can delete uploaded Youtube videos which are seriously off topic.

As for James' original point, most applications today are Internet-aware and use the Internet in their base functionality (which is what was meant by the original email capability). The next level is for them to be mobile- and location-aware, and it is already happening.

Bruce E. Woych

Facebook launches beyond e-mail...

I was browsing through some tech books at B&N's and came across some works assessing the questionable fields of cyberspace tech wars and the current trends in development. The book has no axe to grind and was written well before the Facebook attempt to dominate the media with multimedia dependency. Here's what was so interesting in the text that applies:

The two greatest "vulnerabilities of the future" involve what is categorized as consolidation and convergence.

Now it first occurred to me that this is much like micro and macro economics... but then I realized that it is precisely (in the field) like too big to fail!

So are we on another monopoly trail down the primrose path of self destructive dependencies?

Isn't this just another brand media Octopus looking to knock out variations and dominate our choices with their market offerings? And is this going to set us up for I.T. crisis of authorization for the systemic network and future of "ownership" wars in essential services?

3-D

Facebook is recreating AOL. A gigantic walled garden that becomes "the internet" for most of the people with computers. Look how AOL ended up.

And Handbrake is a great little program. I've been using it to rip my DVD collection to the 2TB of network storage I now have on my home network. A very convenient way to watch movies.

Anonymous

Shrub already handed them out to all his war + torture buddies, as well as Greenspan � and Daddy Shrub gave one to the teabaggers' favorite faux-economist (Hayek) and to Darth Cheney, so I'd say the reputation of the medal is pretty much already in the sewer.

[Aug 25, 2010] Sometimes the Old Ways Are Best by Brian Kernighan

IEEE Software Nov/Dec 2008, pp.18-19

As I write this column, I'm in the middle of two summer projects; with luck, they'll both be finished by the time you read it. One involves a forensic analysis of over 100,000 lines of old C and assembly code from about 1990, and I have to work on Windows XP. The other is a hack to translate code written in weird language L1 into weird language L2 with a program written in scripting language L3, where none of the L's even existed in 1990; this one uses Linux. Thus it's perhaps a bit surprising that I find myself relying on much the same toolset for these very different tasks.

... ... ...

There has surely been much progress in tools over the 25 years that IEEE Software has been around, and I wouldn't want to go back in time. But the tools I use today are mostly the same old ones: grep, diff, sort, awk, and friends.

This might well mean that I'm a dinosaur stuck in the past. On the other hand, when it comes to doing simple things quickly, I can often have the job done while experts are still waiting for their IDE to start up. Sometimes the old ways are best, and they're certainly worth knowing well.

Embed Lua for scriptable apps

The Lua programming language is a small scripting language specifically designed to be embedded in other programs.

Lua's C API allows exceptionally clean and simple code both to call Lua from C, and to call C from Lua.

This allows developers who want a convenient runtime scripting language to easily implement the basic API elements needed by the scripting language, then use Lua code from their applications.

This article introduces the Lua language as a possible tool for simplifying common development tasks, and discusses some of the reasons to embed a scripting language in the first place.

Hope For Multi-Language Programming

chthonicdaemon is pretty naive and does not understand that the combination of a scripting language with a compiled language like C (or a semi-compiled language like Java) is a more productive environment than almost any other known... You need a common runtime, as on Windows, to make it a smooth approach (IronPython). Scripting helps to avoid the OO trap that is pushed by "a horde of practically illiterate researchers publishing crap papers in junk conferences."
"I have been using Linux as my primary environment for more than ten years. In this time, I have absorbed all the lore surrounding the Unix Way - small programs doing one thing well, communicating via text and all that. I have found the command line a productive environment for doing many of the things I often do, and I find myself writing lots of small scripts that do one thing, then piping them together to do other things. While I was spending the time learning grep, sed, awk, python and many other more esoteric languages, the world moved on to application-based programming, where the paradigm seems to be to add features to one program written in one language. I have traditionally associated this with Windows or MacOS, but it is happening with Linux as well. Environments have little or no support for multi-language projects - you choose a language, open a project and get it done. Recent trends in more targeted build environments like cmake or ant are understandably focusing on automatic dependency generation and cross-platform support, unfortunately making it more difficult to grow a custom build process for a multi-language project organically. All this is a bit painful for me, as I know how much is gained by using a targeted language for a particular problem. Now the question: Should I suck it up and learn to do all my programming in C++/Java/(insert other well-supported, popular language here) and unlearn ten years of philosophy, or is there hope for the multi-language development process?"

[Dec 22, 2008] 13 reasons why Ruby, Python and the gang will push Java to die� of old age

Very questionable logic. The cost of programming, and especially the cost of maintenance of an application, depends on the level of the language. It is less with Ruby/Python than with Java.
Lately I seem to find everywhere lots of articles about the imminent dismissal of Java and its replacement with the scripting language of the day or sometimes with other compiled languages.

No, that is not gonna happen. Java is gonna die eventually of old age many many years from now. I will share the reasoning behind my statement. Let's first look at some metrics.

Language popularity status as of May 2008

For this I am gonna use the TIOBE index (tiobe.com) and the nice graphs at langpop.com. I know lots of people don't like them because their statistics are based on search engine results, but I think they are a reasonably fair indicator of popularity.

Facts from the TIOBE index:

TIOBE Index Top 20

What I find significant here is the huge share the "C like syntax" languages have.

C (15.292) + C++ (10.484) + Java (20.176) + C# (3.963) = 49.915%

This means 4 languages get half of all the attention on the web.

If we add PHP (10.637) here (somehow uses a similar syntax) we get 60.552%

As a result we can extract:

Reason number 1: Syntax is very important because it builds on previous knowledge. Also similar syntax means similar concepts. Programmers have to make less effort to learn the new syntax, can reuse the old concepts and thus they can concentrate on understanding the new concepts.

Let's look at a group of 10 challengers:

He forgot to add Perl

Python (4.613) + Ruby (2.851) + Lisp/Scheme (0.449) + Lua (0.393) + SmallTalk (0.138) +
Haskell (0.137) + Groovy (0.131) + Erlang (0.110) + Caml (0.090) + Scala (0.073) = 8.985%

This is less than the attention Visual Basic gets (10.782%) and leads us to...

TIOBE Index Top 21-50

Reason number 2: Too much noise is distracting. Programmers are busy and learning 10 languages to the level where they can evaluate them and make an educated decision is too much effort. The fact that most of these languages have a different syntax and introduce different (sometimes radically different) concepts doesn't help either.

Looking at the trend for the last 7 years we can see a pretty flat evolution in popularity for most of the languages. There are a few exceptions like the decline of Perl but nothing really is earth shattering. There are seasonal variations but in long term nothing seems to change.

TIOBE Trend

This shows that while various languages catch the mind of the programmer for a short time, they are put back on the shelf pretty fast. This might be caused by the lack of opportunity to use them in real life projects. Most of the programmers in the world work on ongoing projects.

Reason number 3: Lack of pressure on the programmers to switch. The market is pretty stable, the existing languages work pretty well and the management doesn't push programmers to learn new languages.

Number of new projects started

Looking at another site that does language popularity analysis, langpop.com, we see a slightly different view but the end result is almost the same from the point of view of challenger languages.

What I found interesting here was the analysis regarding new projects started in various languages. The sources for information are Freshmeat.net and Google Code. The results show a clear preference for C/C++/Java with Python getting some attention.

Reason number 4: Challenger languages don't seem to catch momentum in order to create an avalanche of new projects started with them. This can be again due to the fact that they spread thin when they are evaluated. They are too many.

Other interesting charts at langpop.com are those about books on programming languages at amazon.com and about language discussions statistics. Book writers write about subjects that have a chance to sell. On the other hand a lot of discussion about all theses new languages takes place online. One thing I noticed in these discussion is the attitude the supporters of certain languages have. There is a lot of elitism and concentration on what is wrong with Java instead of pointing to what their language of choice brings useful and on creating good tutorials for people wanting to attempt a switch.

Reason number 5: Challenger languages' communities don't do a good job at attracting programmers from established languages. Telling somebody why she is wrong will most likely create a counter-reaction, not interest.

Let's look now at what is happening on the job market. I used the tools offered by indeed.com and I compared a bunch of languages to produce this graph:


Java, C, C++, C#, Python, Ruby, PHP, Scala Job Trends graph

Reason number 6: There is no great incentive to switch to one of the challenger languages since gaining this skill is not likely to translate into income in the near future.

Well, I looked at all these statistics and I extracted some reasons, but what are the qualities a language needs and what are the external conditions that will make a programming language popular?

How and when does a language become popular

So we can draw more reasons:

For the curious, here is a list of talked-about languages with their birth dates:
Ruby (mid 1990s), Python (1991), Lisp (1958), Scheme (1970s), Lua (1993), Smalltalk (1969-1980), Haskell (1990), Erlang (1987), Caml (1985), OCaml (1996), Groovy (2003), Scala (2003)

Compare this with older successful languages:
C (1972), C++ (1983), Java (1995), C# (2001), BASIC (1964), Pascal (1970), FORTRAN (1957), Ada (1983), COBOL (1959)

It is pretty obvious that most of these "new" languages missed the train to success.

Why many of the new languages will never be popular

Reason number 11: "Features" that look and are dangerous for big projects. Since there are not a lot of big projects written in any of these languages it is hard to make an unbiased evaluation. But bias is in the end a real obstacle for their adoption.

Reason number 12: Unnatural concepts (for the majority of programmers) raise the entry barrier. Functional languages make you write code like mathematical equations. But how many people actually love math so much as to write everything in it? Object-oriented languages provide a great advantage: they let programmers think about the domain they want to model, not about the language or the machine.

Reason number 13: Lack of advanced tools for development and refactoring cripple the programmer and the development teams when faced with big amounts of lines of code.

What would a real Java challenger look like?

Pick me, pick mee, pick meeee!!!

Looking at all those (smart) languages and all the heated discussions that surround them makes me think about the donkey from Shrek yelling "Pick me! Pick mee!! Pick meeee!!!". In the end only one can be the real winner even if in a limited part of the market.

The danger for Java doesn't come from outside. None of these new (actually, most of them are pretty old) languages have the potential to displace Java. The danger for Java comes from inside, and it is caused by too many "features" making their way into the language, transforming it from a language that wanted to keep only the essential features of C++ into a trash box for features and concepts from all languages.

In the end I want to make it clear that I am not advocating against any of those languages. There is TAO in all of them. I actually find them interesting, cool and useful as exercise for my brain, when I have time. I recommend to every programmer to look around from time to time and try to understand what is going on the language market.

This article is part of a series of opinions and rants:

[Dec 12, 2008] The A-Z of Programming Languages Perl

What new elements does Perl 5.10.0 bring to the language? In what way is it preparing for Perl 6?

Perl 5.10.0 involves backporting some ideas from Perl 6, like switch statements and named pattern matches.

One of the most popular things is the use of "say" instead of "print".

This is an explicit programming design in Perl - easy things should be easy and hard things should be possible. It's optimised for the common case. Similar things should look similar but similar things should also look different, and how you trade those things off is an interesting design principle.

Huffman Coding is one of those principles that makes similar things look different.
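A small sketch of how those 5.10 additions look in code (values are invented; note that later perls mark given/when as experimental):

use v5.10;                             # enables say, given/when, named captures
use strict;
use warnings;

my $line = 'status=404 path=/index.html';
if ( $line =~ /status=(?<code>\d+)/ ) {     # named pattern match
    say "code is $+{code}";                 # say() is print() plus a newline
}

given ( $+{code} ) {                   # the switch statement backported from Perl 6
    when (404) { say 'not found' }
    default    { say 'something else' }
}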

In your opinion, what lasting legacy has Perl brought to computer development?

An increased awareness of the interplay between technology and culture. Ruby has borrowed a few ideas from Perl and so has PHP. I don't think PHP understands the use of sigils, but all languages borrow from other languages, otherwise they risk being single-purpose languages. Competition is good.

It's interesting to see PHP follow along with the same mistakes Perl made over time and recover from them. But Perl 6 also borrows back from other languages too, like Ruby. My ego may be big, but it's not that big.

Where do you envisage Perl's future lying?

My vision of Perl's future is that I hope I don't recognize it in 20 years.

Where do you see computer programming languages heading in the future, particularly in the next 5 to 20 years?

Don't design everything you will need in the next 100 years, but design the ability to create the things we will need in 20 or 100 years. The heart of the Perl 6 effort is the extensibility we have built into the parser, introducing language changes as non-destructively as possible.

Linux Today's comments

> Given the horrible mess that is Perl (and, BTW, I derive 90% of my income from programming in Perl),
Did the thought that the 'horrible mess' you produce with $language 'for an income' could be YOUR horrible mess ever cross your mind? The language itself doesn't write any code.

> You just said something against his beloved
> Perl and compounded your heinous crime by
> saying something nice about Python...in his
> narrow view you are the antithesis of all that is
> right in the world. He will respond with his many
> years of Perl == good and everything else == bad
> but just let it go...
That's a pretty pointless insult. Languages don't write code. People do. A statement like 'I think that code written in Perl looks very ugly because of the large amount of non-alphanumeric characters' would make sense. Trying to elevate entirely subjective, aesthetic preferences into 'general principles' doesn't. 'a mess' is something inherently chaotic, hence, this is not a sensible description for a regularly structured program of any kind. It is obviously possible to write (or not write) regularly structured programs in any language providing the necessary abstractions for that. This set includes Perl.
I have had the displeasure of dealing with messes created by people in both Perl and Python (and a couple of other languages) in the past. You've probably heard the saying that "real programmers can write FORTRAN in any language" already.

It is even true that the most horrible code mess I have seen so far had been written in Perl. But this just means that a fairly chaotic person happened to use this particular programming language.

[Dec 9, 2008] What Programming Language For Linux Development

Slashdot

by dkf (304284) <[email protected]> on Saturday December 06, @07:08PM (#26016101) Homepage

C/C++ are the languages you'd want to go for. They can do *everything*, have great support, are fast etc.

Let's be honest here. C and C++ are very fast indeed if you use them well (very little can touch them; most other languages are actually implemented in terms of them), but they're also very easy to use really badly. They're genuine professional power tools: they'll do what you ask them to really quickly, even if that is just to spin on the spot chopping people's legs off. Care required!

If you use a higher-level language (I prefer Tcl, but you might prefer Python, Perl, Ruby, Lua, Rexx, awk, bash, etc. - the list is huge) then you probably won't go as fast. But unless you're very good at C/C++ you'll go acceptably fast at a much earlier calendar date. It's just easier for most people to be productive in higher-level languages. Well, unless you're doing something where you have to be incredibly close to the metal like a device driver, but even then it's best to keep the amount of low-level code small and to try to get to use high-level things as soon as you can.

One technique that is used quite a bit, especially by really experienced developers, is to split the program up into components that are then glued together. You can then write the components in a low-level language if necessary, but use the far superior gluing capabilities of a high-level language effectively. I know many people are very productive doing this.
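As a minimal sketch of that glue pattern in Perl (du(1) stands in here for the low-level component):

use strict;
use warnings;

# let a compiled Unix tool do the heavy lifting; use the high-level
# language only to combine and reshape its output
open my $du, '-|', 'du', '-sk', '.' or die "cannot run du: $!";
my ($kbytes) = split ' ', scalar <$du>;    # first field of "12345  ."
close $du;
printf "current tree uses %.1f MB\n", $kbytes / 1024;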

[Apr 25, 2008] Interview with Donald Knuth by Donald E. Knuth & Andrew Binstock


Andrew Binstock and Donald Knuth converse on the success of open source, the problem with multicore architecture, the disappointing lack of interest in literate programming, the menace of reusable code, and that urban legend about winning a programming contest with a single compilation.

Andrew Binstock: You are one of the fathers of the open-source revolution, even if you aren't widely heralded as such. You previously have stated that you released TeX as open source because of the problem of proprietary implementations at the time, and to invite corrections to the code - both of which are key drivers for open-source projects today. Have you been surprised by the success of open source since that time?

Donald Knuth:

The success of open source code is perhaps the only thing in the computer field that hasn't surprised me during the past several decades. But it still hasn't reached its full potential; I believe that open-source programs will begin to be completely dominant as the economy moves more and more from products towards services, and as more and more volunteers arise to improve the code.

For example, open-source code can produce thousands of binaries, tuned perfectly to the configurations of individual users, whereas commercial software usually will exist in only a few versions. A generic binary executable file must include things like inefficient "sync" instructions that are totally inappropriate for many installations; such wastage goes away when the source code is highly configurable. This should be a huge win for open source.

Yet I think that a few programs, such as Adobe Photoshop, will always be superior to competitors like the Gimp - for some reason, I really don't know why! I'm quite willing to pay good money for really good software, if I believe that it has been produced by the best programmers.

Remember, though, that my opinion on economic questions is highly suspect, since I'm just an educator and scientist. I understand almost nothing about the marketplace.

Andrew: A story states that you once entered a programming contest at Stanford (I believe) and you submitted the winning entry, which worked correctly after a single compilation. Is this story true? In that vein, today's developers frequently build programs writing small code increments followed by immediate compilation and the creation and running of unit tests. What are your thoughts on this approach to software development?

Donald:

The story you heard is typical of legends that are based on only a small kernel of truth. Here's what actually happened: John McCarthy decided in 1971 to have a Memorial Day Programming Race. All of the contestants except me worked at his AI Lab up in the hills above Stanford, using the WAITS time-sharing system; I was down on the main campus, where the only computer available to me was a mainframe for which I had to punch cards and submit them for processing in batch mode. I used Wirth's ALGOL W system (the predecessor of Pascal). My program didn't work the first time, but fortunately I could use Ed Satterthwaite's excellent offline debugging system for ALGOL W, so I needed only two runs. Meanwhile, the folks using WAITS couldn't get enough machine cycles because their machine was so overloaded. (I think that the second-place finisher, using that "modern" approach, came in about an hour after I had submitted the winning entry with old-fangled methods.) It wasn't a fair contest.

As to your real question, the idea of immediate compilation and "unit tests" appeals to me only rarely, when I'm feeling my way in a totally unknown environment and need feedback about what works and what doesn't. Otherwise, lots of time is wasted on activities that I simply never need to perform or even think about. Nothing needs to be "mocked up."

Andrew: One of the emerging problems for developers, especially client-side developers, is changing their thinking to write programs in terms of threads. This concern, driven by the advent of inexpensive multicore PCs, surely will require that many algorithms be recast for multithreading, or at least to be thread-safe. So far, much of the work you've published for Volume 4 of The Art of Computer Programming (TAOCP) doesn't seem to touch on this dimension. Do you expect to enter into problems of concurrency and parallel programming in upcoming work, especially since it would seem to be a natural fit with the combinatorial topics you're currently working on?

Donald:

The field of combinatorial algorithms is so vast that I'll be lucky to pack its sequential aspects into three or four physical volumes, and I don't think the sequential methods are ever going to be unimportant. Conversely, the half-life of parallel techniques is very short, because hardware changes rapidly and each new machine needs a somewhat different approach. So I decided long ago to stick to what I know best. Other people understand parallel machines much better than I do; programmers should listen to them, not me, for guidance on how to deal with simultaneity.

Andrew: Vendors of multicore processors have expressed frustration at the difficulty of moving developers to this model. As a former professor, what thoughts do you have on this transition and how to make it happen? Is it a question of proper tools, such as better native support for concurrency in languages, or of execution frameworks? Or are there other solutions?

Donald:

I don't want to duck your question entirely. I might as well flame a bit about my personal unhappiness with the current trend toward multicore architecture. To me, it looks more or less like the hardware designers have run out of ideas, and that they're trying to pass the blame for the future demise of Moore's Law to the software writers by giving us machines that work faster only on a few key benchmarks! I won't be surprised at all if the whole multithreading idea turns out to be a flop, worse than the "Itanium" approach that was supposed to be so terrific - until it turned out that the wished-for compilers were basically impossible to write.

Let me put it this way: During the past 50 years, I've written well over a thousand programs, many of which have substantial size. I can't think of even five of those programs that would have been enhanced noticeably by parallelism or multithreading. Surely, for example, multiple processors are no help to TeX.[1]

How many programmers do you know who are enthusiastic about these promised machines of the future? I hear almost nothing but grief from software people, although the hardware folks in our department assure me that I'm wrong.

I know that important applications for parallelism exist - rendering graphics, breaking codes, scanning images, simulating physical and biological processes, etc. But all these applications require dedicated code and special-purpose techniques, which will need to be changed substantially every few years.

Even if I knew enough about such methods to write about them in TAOCP, my time would be largely wasted, because soon there would be little reason for anybody to read those parts. (Similarly, when I prepare the third edition of Volume 3 I plan to rip out much of the material about how to sort on magnetic tapes. That stuff was once one of the hottest topics in the whole software field, but now it largely wastes paper when the book is printed.)

The machine I use today has dual processors. I get to use them both only when I'm running two independent jobs at the same time; that's nice, but it happens only a few minutes every week. If I had four processors, or eight, or more, I still wouldn't be any better off, considering the kind of work I do - even though I'm using my computer almost every day during most of the day. So why should I be so happy about the future that hardware vendors promise? They think a magic bullet will come along to make multicores speed up my kind of work; I think it's a pipe dream. (No - that's the wrong metaphor! "Pipelines" actually work for me, but threads don't. Maybe the word I want is "bubble.")

From the opposite point of view, I do grant that web browsing probably will get better with multicores. I've been talking about my technical work, however, not recreation. I also admit that I haven't got many bright ideas about what I wish hardware designers would provide instead of multicores, now that they've begun to hit a wall with respect to sequential computation. (But my MMIX design contains several ideas that would substantially improve the current performance of the kinds of programs that concern me most - at the cost of incompatibility with legacy x86 programs.)

Andrew: One of the few projects of yours that hasn't been embraced by a widespread community is literate programming. What are your thoughts about why literate programming didn't catch on? And is there anything you'd have done differently in retrospect regarding literate programming?

Donald:

Literate programming is a very personal thing. I think it's terrific, but that might well be because I'm a very strange person. It has tens of thousands of fans, but not millions.

In my experience, software created with literate programming has turned out to be significantly better than software developed in more traditional ways. Yet ordinary software is usually okay - I'd give it a grade of C (or maybe C++), but not F; hence, the traditional methods stay with us. Since they're understood by a vast community of programmers, most people have no big incentive to change, just as I'm not motivated to learn Esperanto even though it might be preferable to English and German and French and Russian (if everybody switched).

Jon Bentley probably hit the nail on the head when he once was asked why literate programming hasn't taken the whole world by storm. He observed that a small percentage of the world's population is good at programming, and a small percentage is good at writing; apparently I am asking everybody to be in both subsets.

Yet to me, literate programming is certainly the most important thing that came out of the TeX project. Not only has it enabled me to write and maintain programs faster and more reliably than ever before, and been one of my greatest sources of joy since the 1980s - it has actually been indispensable at times. Some of my major programs, such as the MMIX meta-simulator, could not have been written with any other methodology that I've ever heard of. The complexity was simply too daunting for my limited brain to handle; without literate programming, the whole enterprise would have flopped miserably.

If people do discover nice ways to use the newfangled multithreaded machines, I would expect the discovery to come from people who routinely use literate programming. Literate programming is what you need to rise above the ordinary level of achievement. But I don't believe in forcing ideas on anybody. If literate programming isn't your style, please forget it and do what you like. If nobody likes it but me, let it die.

On a positive note, I've been pleased to discover that the conventions of CWEB are already standard equipment within preinstalled software such as Makefiles, when I get off-the-shelf Linux these days.

Andrew: In Fascicle 1 of Volume 1, you reintroduced the MMIX computer, which is the 64-bit upgrade to the venerable MIX machine comp-sci students have come to know over many years. You previously described MMIX in great detail in MMIXware. I've read portions of both books, but can't tell whether the Fascicle updates or changes anything that appeared in MMIXware, or whether it's a pure synopsis. Could you clarify?

Donald:

Volume 1 Fascicle 1 is a programmer's introduction, which includes instructive exercises and such things. The MMIXware book is a detailed reference manual, somewhat terse and dry, plus a bunch of literate programs that describe prototype software for people to build upon. Both books define the same computer (once the errata to MMIXware are incorporated from my website). For most readers of TAOCP, the first fascicle contains everything about MMIX that they'll ever need or want to know.

I should point out, however, that MMIX isn't a single machine; it's an architecture with almost unlimited varieties of implementations, depending on different choices of functional units, different pipeline configurations, different approaches to multiple-instruction-issue, different ways to do branch prediction, different cache sizes, different strategies for cache replacement, different bus speeds, etc. Some instructions and/or registers can be emulated with software on "cheaper" versions of the hardware. And so on. It's a test bed, all simulatable with my meta-simulator, even though advanced versions would be impossible to build effectively until another five years go by (and then we could ask for even further advances just by advancing the meta-simulator specs another notch).

Suppose you want to know if five separate multiplier units and/or three-way instruction issuing would speed up a given MMIX program. Or maybe the instruction and/or data cache could be made larger or smaller or more associative. Just fire up the meta-simulator and see what happens.

Andrew: As I suspect you don't use unit testing with MMIXAL, could you step me through how you go about making sure that your code works correctly under a wide variety of conditions and inputs? If you have a specific work routine around verification, could you describe it?

Donald:

Most examples of machine language code in TAOCP appear in Volumes 1-3; by the time we get to Volume 4, such low-level detail is largely unnecessary and we can work safely at a higher level of abstraction. Thus, I've needed to write only a dozen or so MMIX programs while preparing the opening parts of Volume 4, and they're all pretty much toy programs - nothing substantial. For little things like that, I just use informal verification methods, based on the theory that I've written up for the book, together with the MMIXAL assembler and MMIX simulator that are readily available on the Net (and described in full detail in the MMIXware book).

That simulator includes debugging features like the ones I found so useful in Ed Satterthwaite's system for ALGOL W, mentioned earlier. I always feel quite confident after checking a program with those tools.

Andrew: Despite its formulation many years ago, TeX is still thriving, primarily as the foundation for LaTeX. While TeX has been effectively frozen at your request, are there features that you would want to change or add to it, if you had the time and bandwidth? If so, what are the major items you would add or change?

Donald:

I believe changes to TeX would cause much more harm than good. Other people who want other features are creating their own systems, and I've always encouraged further development - except that nobody should give their program the same name as mine. I want to take permanent responsibility for TeX and Metafont, and for all the nitty-gritty things that affect existing documents that rely on my work, such as the precise dimensions of characters in the Computer Modern fonts.

Andrew: One of the little-discussed aspects of software development is how to do design work on software in a completely new domain. You were faced with this issue when you undertook TeX: No prior art was available to you as source code, and it was a domain in which you weren't an expert. How did you approach the design, and how long did it take before you were comfortable entering into the coding portion?

Donald:

That's another good question! I've discussed the answer in great detail in Chapter 10 of my book Literate Programming, together with Chapters 1 and 2 of my book Digital Typography. I think that anybody who is really interested in this topic will enjoy reading those chapters. (See also Digital Typography Chapters 24 and 25 for the complete first and second drafts of my initial design of TeX in 1977.)

Andrew: The books on TeX and the program itself show a clear concern for limiting memory usage - an important problem for systems of that era. Today, the concern for memory usage in programs has more to do with cache sizes. As someone who has designed a processor in software, the issues of cache-aware and cache-oblivious algorithms surely must have crossed your radar screen. Is the role of processor caches on algorithm design something that you expect to cover, even if indirectly, in your upcoming work?

Donald:

I mentioned earlier that MMIX provides a test bed for many varieties of cache. And it's a software-implemented machine, so we can perform experiments that will be repeatable even a hundred years from now. Certainly the next editions of Volumes 1-3 will discuss the behavior of various basic algorithms with respect to different cache parameters.

In Volume 4 so far, I count about a dozen references to cache memory and cache-friendly approaches (not to mention a "memo cache," which is a different but related idea in software).

Andrew: What set of tools do you use today for writing TAOCP? Do you use TeX? LaTeX? CWEB? Word processor? And what do you use for the coding?

Donald:

My general working style is to write everything first with pencil and paper, sitting beside a big wastebasket. Then I use Emacs to enter the text into my machine, using the conventions of TeX. I use tex, dvips, and gv to see the results, which appear on my screen almost instantaneously these days. I check my math with Mathematica.

I program every algorithm that's discussed (so that I can thoroughly understand it) using CWEB, which works splendidly with the GDB debugger. I make the illustrations with MetaPost (or, in rare cases, on a Mac with Adobe Photoshop or Illustrator). I have some homemade tools, like my own spell-checker for TeX and CWEB within Emacs. I designed my own bitmap font for use with Emacs, because I hate the way the ASCII apostrophe and the left open quote have morphed into independent symbols that no longer match each other visually. I have special Emacs modes to help me classify all the tens of thousands of papers and notes in my files, and special Emacs keyboard shortcuts that make bookwriting a little bit like playing an organ. I prefer rxvt to xterm for terminal input. Since last December, I've been using a file backup system called backupfs, which meets my need beautifully to archive the daily state of every file.

According to the current directories on my machine, I've written 68 different CWEB programs so far this year. There were about 100 in 2007, 90 in 2006, 100 in 2005, 90 in 2004, etc. Furthermore, CWEB has an extremely convenient "change file" mechanism, with which I can rapidly create multiple versions and variations on a theme; so far in 2008 I've made 73 variations on those 68 themes. (Some of the variations are quite short, only a few bytes; others are 5KB or more. Some of the CWEB programs are quite substantial, like the 55-page BDD package that I completed in January.) Thus, you can see how important literate programming is in my life.

I currently use Ubuntu Linux, on a standalone laptop - it has no Internet connection. I occasionally carry flash memory drives between this machine and the Macs that I use for network surfing and graphics; but I trust my family jewels only to Linux. Incidentally, with Linux I much prefer the keyboard focus that I can get with classic FVWM to the GNOME and KDE environments that other people seem to like better. To each his own.

Andrew: You state in the preface of Fascicle 0 of Volume 4 of TAOCP that Volume 4 surely will comprise three volumes and possibly more. It's clear from the text that you're really enjoying writing on this topic. Given that, what is your confidence in the note posted on the TAOCP website that Volume 5 will see light of day by 2015?

Donald:

If you check the Wayback Machine for previous incarnations of that web page, you will see that the number 2015 has not been constant.

You're certainly correct that I'm having a ball writing up this material, because I keep running into fascinating facts that simply can't be left out - even though more than half of my notes don't make the final cut.

Precise time estimates are impossible, because I can't tell until getting deep into each section how much of the stuff in my files is going to be really fundamental and how much of it is going to be irrelevant to my book or too advanced. A lot of the recent literature is academic one-upmanship of limited interest to me; authors these days often introduce arcane methods that outperform the simpler techniques only when the problem size exceeds the number of protons in the universe. Such algorithms could never be important in a real computer application. I read hundreds of such papers to see if they might contain nuggets for programmers, but most of them wind up getting short shrift.

From a scheduling standpoint, all I know at present is that I must someday digest a huge amount of material that I've been collecting and filing for 45 years. I gain important time by working in batch mode: I don't read a paper in depth until I can deal with dozens of others on the same topic during the same week. When I finally am ready to read what has been collected about a topic, I might find out that I can zoom ahead because most of it is eminently forgettable for my purposes. On the other hand, I might discover that it's fundamental and deserves weeks of study; then I'd have to edit my website and push that number 2015 closer to infinity.

Andrew: In late 2006, you were diagnosed with prostate cancer. How is your health today?

Donald:

Naturally, the cancer will be a serious concern. I have superb doctors. At the moment I feel as healthy as ever, modulo being 70 years old. Words flow freely as I write TAOCP and as I write the literate programs that precede drafts of TAOCP. I wake up in the morning with ideas that please me, and some of those ideas actually please me also later in the day when I've entered them into my computer.

On the other hand, I willingly put myself in God's hands with respect to how much more I'll be able to do before cancer or heart disease or senility or whatever strikes. If I should unexpectedly die tomorrow, I'll have no reason to complain, because my life has been incredibly blessed. Conversely, as long as I'm able to write about computer science, I intend to do my best to organize and expound upon the tens of thousands of technical papers that I've collected and made notes on since 1962.

Andrew: On your website, you mention that the Peoples Archive recently made a series of videos in which you reflect on your past life. In segment 93, "Advice to Young People," you advise that people shouldn't do something simply because it's trendy. As we know all too well, software development is as subject to fads as any other discipline. Can you give some examples that are currently in vogue, which developers shouldn't adopt simply because they're currently popular or because that's the way they're currently done? Would you care to identify important examples of this outside of software development?

Donald:

Hmm. That question is almost contradictory, because I'm basically advising young people to listen to themselves rather than to others, and I'm one of the others. Almost every biography of every person whom you would like to emulate will say that he or she did many things against the "conventional wisdom" of the day.

Still, I hate to duck your questions even though I also hate to offend other people's sensibilities-given that software methodology has always been akin to religion. With the caveat that there's no reason anybody should care about the opinions of a computer scientist/mathematician like me regarding software development, let me just say that almost everything I've ever heard associated with the term "extreme programming" sounds like exactly the wrong way to go...with one exception. The exception is the idea of working in teams and reading each other's code. That idea is crucial, and it might even mask out all the terrible aspects of extreme programming that alarm me.

I also must confess to a strong bias against the fashion for reusable code. To me, "re-editable code" is much, much better than an untouchable black box or toolkit. I could go on and on about this. If you're totally convinced that reusable code is wonderful, I probably won't be able to sway you anyway, but you'll never convince me that reusable code isn't mostly a menace.

Here's a question that you may well have meant to ask: Why is the new book called Volume 4 Fascicle 0, instead of Volume 4 Fascicle 1? The answer is that computer programmers will understand that I wasn't ready to begin writing Volume 4 of TAOCP at its true beginning point, because we know that the initialization of a program can't be written until the program itself takes shape. So I started in 2005 with Volume 4 Fascicle 2, after which came Fascicles 3 and 4. (Think of Star Wars, which began with Episode 4.)

[Apr 23, 2008] Dr. Dobb's Programming Languages Everyone Has a Favorite One by Deirdre Blake

I think the article misstates the position of Perl (according to the TIOBE index it is No. 6, just above C#, Python, and Ruby). The author definitely does not understand the value and staying power of C. There is also an inherent problem with their methodology (as with any metric based on web presence). This is visible in the position of C#, which now definitely looks stronger than Python and Perl (and maybe even PHP), as well as in the positions of bash, awk, and PowerShell. That means all other statements should be taken with a grain of salt...
April 23, 2008 | DDJ

From what Paul Jansen has seen, everyone has a favorite programming language.

Paul Jansen is managing director of TIOBE Software (www.tiobe.com). He can be contacted at [email protected].


DDJ: Paul, can you tell us about the TIOBE Programming Community Index?

PJ: The TIOBE index tries to measure the popularity of programming languages by monitoring their web presence. The most popular search engines (Google, Yahoo!, Microsoft, and YouTube) are used to calculate these figures. YouTube has been added recently as an experiment (and only counts for 4 percent of the total). Since the TIOBE index has now been published for more than 6 years, it gives an interesting picture of trends in the area of programming languages. I started the index because I was curious to know whether my programming skills were still up to date and to know for which programming languages our company should create development tools. It is amazing to see that programming languages are something very personal. Every day we receive e-mails from people who are unhappy with the position of "their" specific language in the index. I am also a bit overwhelmed by the vast and constant traffic this index generates.

DDJ: Which language has moved to the top of the heap, so to speak, in terms of popularity, and why do you think this is the case?

PJ: If we take a look at the top 10 programming languages, not much has happened in the last five years. Only Python entered the top 10, replacing COBOL. This comes as a surprise because the IT world is moving so fast that in most areas the market changes completely in five years' time. Python managed to reach the top 10 because it is the truly object-oriented successor of Perl. Other winners of the last couple of years are Visual Basic, Ruby, JavaScript, C#, and D (a successor of C++). I expect that in five years' time there will be two main languages: Java and C#, closely followed by good old Visual Basic. There is no new paradigm foreseen.

DDJ: Which languages seem to be losing ground?

PJ: C and C++ are definitely losing ground. There is a simple explanation for this. Languages without automated garbage collection are getting out of fashion. The chance of running into all kinds of memory problems is gradually outweighing the performance penalty you have to pay for garbage collection. Another language that has had its day is Perl. It was once the standard language for every system administrator and build manager, but now everyone has been waiting on a new major release for more than seven years. That is considered far too long.

DDJ: On the flip side, what other languages seem to be moving into the limelight?

PJ: It is interesting to observe that dynamically typed object-oriented (scripting) languages are evolving the most. Hardly has a new language arrived on the scene before it is replaced by another emerging language. I think this has to do with the increase in web programming. The web programming area demands a language that is easy to learn, powerful, and secure. New languages pop up every day, trying to be leaner and meaner than their predecessors. A couple of years ago, Ruby was rediscovered (thanks to Rails). Recently Lua was the hype, but now other scripting languages such as ActionScript, Groovy, and Factor are about to claim a top 20 position. There is quite some talk on the Internet about the NBL (next big language). But although those web-programming languages generate a lot of attention, there is never a real breakthrough.

DDJ: What are the benefits of introducing coding standards into an organization? And how does an organization choose a standard that is a "right fit" for its development goals?

PJ: Coding standards help to improve the general quality of software. A good coding standard focuses on best programming practices (avoiding known language pitfalls), not only on style and naming conventions. Every language has constructions that are perfectly legitimate according to its language definition but will lead to reliability, security, or maintainability problems. Coding standards help engineers stick to a subset of a programming language to make sure that these problems do not occur. The advantage of introducing coding standards as a means to improve quality is that, once they are in place, they do not change too often. This is in contrast with dynamic testing: every change in your program calls for a change in your dynamic tests. In short, dynamic tests are far more labor intensive than coding standards. On the other hand, coding standards can only take care of nonfunctional defects; bugs concerning incorrectly implemented requirements remain undetected. The best way to start with coding standards is to download a code checker and tweak it to your needs. It is our experience that if you do not check the rules of your coding standard automatically, the coding standard will soon end up as a dusty document on some shelf.
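To make the "legitimate but dangerous" point concrete, here is a small bash illustration (mine, not Jansen's): the unquoted expansion below is perfectly legal according to the language definition, yet it is exactly the kind of construct a bash coding standard would ban and an automated checker can flag mechanically.

#!/bin/bash
# Legal bash, but a classic reliability pitfall: an unquoted expansion
# word-splits on whitespace.
touch "my file.txt"
f="my file.txt"
ls $f        # word splitting: ls is handed 'my' and 'file.txt' -- both fail
ls -- "$f"   # the quoted form a coding-standard rule would require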

[Apr 18, 2008] CoScripter

A useful Firefox plug-in

CoScripter is a system for recording, automating, and sharing processes performed in a web browser such as printing photos online, requesting a vacation hold for postal mail, or checking flight arrival times. Instructions for processes are recorded and stored in easy-to-read text here on the CoScripter web site, so anyone can make use of them. If you are having trouble with a web-based process, check to see if someone has written a CoScript for it!

[Mar 12, 2008] Tiny Eclipse by Eric Suen

About: Tiny Eclipse is a distribution of Eclipse for development with dynamic languages for the Web, such as JSP, PHP, Ruby, TCL, and Web Services. It features a small download size, the ability to choose the features you want to install, and GUI installers for Win32 and Linux GTK x86.

[Jan 2, 2008] Java is becoming the new Cobol by Bill Snyder

"Simply put, developers are saying that Java slows them down"
Dec 28, 2007 | infoworld.com

Simply put, developers are saying that Java slows them down. "There were big promises that Java would solve incompatibility problems [across platforms]. But now there are different versions and different downloads, creating complications," says Peter Thoneny, CEO of Twiki.net, which produces a certified version of the open source Twiki wiki-platform software. "It has not gotten easier. It's more complicated," concurs Ofer Ronen, CEO of Sendori, which routes domain traffic to online advertisers and ad networks. Sendori has moved to Ruby on Rails. Ronen says Ruby offers pre-built structures - say, a shopping cart for an e-commerce site - that you'd have to code from the ground up using Java.

Another area of weakness is the development of mobile applications. Java's UI capabilities and its memory footprint simply don't measure up, says Samir Shah, CEO of software testing provider Zephyr. No wonder the mobile edition of Java has all but disappeared, and no wonder Google is creating its own version (Android).

These weaknesses are having a real effect. Late last month, Info-Tech Research Group said its survey of 1,850 businesses found .Net the choice over Java among businesses of all sizes and industries, thanks to its promotion via Visual Studio and SharePoint. "Microsoft is driving uptake of the .Net platform at the expense of Java," says George Goodall, a senior research analyst at Info-Tech.

One bit of good news: developers and analysts agree that Java is alive and well for internally developed enterprise apps. "On the back end, there is still a substantial amount of infrastructure available that makes Java a very strong contender," says Zephyr's Shah.

The Bottom Line: Now that Java is no longer the unchallenged champ for Internet-delivered apps, it makes sense for companies to find programmers who are skilled in the new languages. If you're a Java developer, now's the time to invest in new skills.

[Jan 2, 2008] Java is Becoming the New Cobol

I think that the author of this comment is deeply mistaken: the length of the code has a tremendous influence on the cost of maintenance and the number of errors, and here Java sucks.
Linux Today

It's true. Putting together an Enterprise-scale Java application takes a considerable amount of planning, design, and co-ordination. Scripted languages like Python are easier - just hack something out and you've got a working webapp by the end of the day.

But then you get called in at midnight, because a lot of the extra front-end work in Java has to do with the fact that the compiler is doing major datatype validation. You're a lot less likely to have something blow up after it went into production, since a whole raft of potential screw-ups get caught at build time.

Scripting systems like Python, Perl, PHP, etc. not only have late binding, but frequently have late compiling as well, so until the offending code is invoked, it's merely a coiled-up snake.
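The "coiled-up snake" effect is easy to demonstrate. The comment names Python, Perl, and PHP, but bash behaves the same way (my illustration):

#!/bin/bash
# The typo below ('ech') raises no error when the function is defined...
broken() { ech "this never prints"; }
echo "script loaded and runs fine"
broken    # ...only now does bash report: ech: command not found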

In fact, after many years and many languages, I'm just about convinced that the amount of time and effort for producing a debugged major app in just about any high-level language is about the same.

Myself, I prefer an environment that keeps me from having to wear a pager. For those who need less sleep and more Instant Gratification, they're welcome to embrace the other end of the spectrum.

[Dec 28, 2007] My Programming Language History by Keith Waclena

It's pretty strange for a system admin and CGI programmer to prefer Python to Perl... It goes without saying that in any case such evaluations should be taken with a grain of salt. What makes this comparison interesting is that the author claims to have substantial programming experience in Perl 4, Tcl, and Python.

Before going any further, read the disclaimer!


Some languages I've used and how I've felt about them. This may help you figure out where I'm coming from. I'm only listing the highlights here, and not including the exotica (Trac, SAM76, Setl, Rec, Convert, J...) and languages I've only toyed with or programmed in my head (Algol 68, BCPL, APL, S-Algol, Pop-2 / Pop-11, Refal, Prolog...).
Pascal.
My first language. Very nearly turned me off programming before I got started. I hated it. Still do.
Sail.
The first language I loved, Sail was Algol 60 with zillions of extensions, from dynamically allocated strings to Leap, a weird production system / logic programming / database language that I never understood (and now can barely recall), and access to every Twenex JSYS and TOPS 10 UUO! Pretty much limited to PDP-10s; supposedly reincarnated as MAINSAIL, but I never saw that.
Teco.
The first language I actually became fluent in; also the first language I ever got paid to program in.
Snobol4.
The first language I actually wrote halfway-decent sizable code in, developed a personal subroutine library for, wrote multi-platform code in, and used on an IBM mainframe (Spitbol -- but I did all my development under Twenex with Sitbol, thank goodness). I loved Snobol: I used to dream in it.
PL/I.
The first language I ever thought was great at first and then grew to loathe. Subset G only, but that was enough.
Forth.
The first language I ever ran on my own computer; also the first language I ever wrote useful assembler in -- serial I/O routines for the Z/80 (my first assembly language was a handful of toy programs in IBM 360 assembly language, using the aptly-named SPASM assembler), and the first language I thought was really wonderful but had a really difficult time writing useful programs in. Also the first language whose implementation I actually understood, the first language that really taught me about hardware, and the first language implementation I installed myself. Oh and the first language I taught.
C.
What can I say here? It's unpleasant, but it works, it's everywhere, and you have to use it.
Lisp.
The first language I thought was truly brilliant and still think so to this day. I programmed in Maclisp and Muddle (MDL) at first, on the PDP-10; Franz and then Common Lisp later (but not much).
Scheme.
How could you improve on Lisp? Scheme is how.
Perl.
The first language I wrote at least hundreds of useful programs in (Perl 4 (and earlier) only). Probably the second language I thought was great and grew to loathe (for many of the same reasons I grew to loathe PL/I, interestingly enough -- but it took longer).
Lazy Functional Languages.
How could you improve on Scheme? Lazy functional languages is how, but can you actually do anything with them (except compile lazy functional languages, of course)?
Tcl.
My previous standard, daily language. It's got a lot of problems, and it's amazing that Tcl programs ever get around to terminating, but they do, and astonishingly quickly (given the execution model...). I've developed a large library of Tcl procs that allow me to whip up substantial programs really quickly, the mark of a decent language. And it's willing to dress up as Lisp to fulfill my kinky desires.
Python
My current standard, daily language. Faster than Tcl, about as fast as Perl and with nearly as large a standard library, but with a reasonable syntax and real data structures. It's by no means perfect -- still kind of slow, not enough of an expression language to suit me, dynamically typed, no macro system -- but I'm really glad I found it.

[Oct 5, 2007] Turn Vim into a bash IDE By Joe 'Zonker' Brockmeier

June 11, 2007 | Linux.com

By itself, Vim is one of the best editors for shell scripting. With a little tweaking, however, you can turn Vim into a full-fledged IDE for writing scripts. You could do it yourself, or you can just install Fritz Mehner's Bash Support plugin.

To install Bash Support, download the zip archive, copy it to your ~/.vim directory, and unzip the archive. You'll also want to edit your ~/.vimrc to include a few personal details; open the file and add these three lines:

let g:BASH_AuthorName   = 'Your Name'
let g:BASH_Email        = '[email protected]'
let g:BASH_Company      = 'Company Name'

These variables will be used to fill in some headers for your projects, as we'll see below.
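For reference, the download-copy-unzip steps described above amount to something like the following from a shell; the archive file name is an illustrative placeholder for whatever version you actually downloaded:

# Install Bash Support into the per-user Vim directory
mkdir -p ~/.vim
cp bash-support.zip ~/.vim/
cd ~/.vim && unzip bash-support.zip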

The Bash Support plugin works in the Vim GUI (gVim) and text mode Vim. It's a little easier to use in the GUI, and Bash Support doesn't implement most of its menu functions in Vim's text mode, so you might want to stick with gVim when scripting.

When Bash Support is installed, gVim will include a new menu, appropriately titled Bash. This puts all of the Bash Support functions right at your fingertips (or mouse button, if you prefer). Let's walk through some of the features, and see how Bash Support can make Bash scripting a breeze.

Header and comments

If you believe in using extensive comments in your scripts, and I hope you do, you'll really enjoy using Bash Support. Bash Support provides a number of functions that make it easy to add comments to your bash scripts and programs automatically, or with just a mouse click or a few keystrokes.

When you start a non-trivial script that will be used and maintained by others, it's a good idea to include a header with basic information -- the name of the script, usage, description, notes, author information, copyright, and any other info that might be useful to the next person who has to maintain the script. Bash Support makes it a breeze to provide this information. Go to Bash -> Comments -> File Header, and gVim will insert a header like this in your script:

#!/bin/bash
#===============================================================================
#
#          FILE:  test.sh
#
#         USAGE:  ./test.sh
#
#   DESCRIPTION:
#
#       OPTIONS:  ---
#  REQUIREMENTS:  ---
#          BUGS:  ---
#         NOTES:  ---
#        AUTHOR:  Joe Brockmeier, [email protected]
#       COMPANY:  Dissociated Press
#       VERSION:  1.0
#       CREATED:  05/25/2007 10:31:01 PM MDT
#      REVISION:  ---
#===============================================================================

You'll need to fill in some of the information, but Bash Support grabs the author, company name, and email address from your ~/.vimrc, and fills in the file name and created date automatically. To make life even easier, if you start Vim or gVim with a new file that ends with an .sh extension, it will insert the header automatically.

As you're writing your script, you might want to add comment blocks for your functions as well. To do this, go to Bash -> Comment -> Function Description to insert a block of text like this:

#===  FUNCTION  ================================================================
#          NAME:
#   DESCRIPTION:
#    PARAMETERS:
#       RETURNS:
#===============================================================================

Just fill in the relevant information and carry on coding.

The Comment menu also allows you to insert other types of comments, insert the current date and time, and turn selected code into a comment (and vice versa).

Statements and snippets

Let's say you want to add an if-else statement to your script. You could type out the statement, or you could just use Bash Support's handy selection of pre-made statements. Go to Bash -> Statements and you'll see a long list of pre-made statements that you can just plug in and fill in the blanks. For instance, if you want to add a while statement, you can go to Bash -> Statements -> while, and you'll get the following:

while _; do
done

The cursor will be positioned where the underscore (_) is above. All you need to do is add the test statement and the actual code you want to run in the while statement. Sure, it'd be nice if Bash Support could do all that too, but there's only so far an IDE can help you.
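For instance, a filled-in version of the skeleton might read as follows (my example, not plugin output):

i=1
while [ "$i" -le 5 ]; do    # the test goes where the cursor was positioned
    echo "iteration $i"
    i=$((i + 1))
done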

However, you can help yourself. When you do a lot of bash scripting, you might have functions or code snippets that you reuse in new scripts. Bash Support allows you to add your snippets and functions by highlighting the code you want to save, then going to Bash -> Statements -> write code snippet. When you want to grab a piece of prewritten code, go to Bash -> Statements -> read code snippet. Bash Support ships with a few included code fragments.

Another way to add snippets to the statement collection is to just place a text file with the snippet under the ~/.vim/bash-support/codesnippets directory.
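Creating such a snippet by hand takes only a couple of commands; the snippet name "die" here is just an example:

# Save a reusable 'die' helper as a Bash Support code snippet
mkdir -p ~/.vim/bash-support/codesnippets
cat > ~/.vim/bash-support/codesnippets/die <<'EOF'
# die MESSAGE -- print an error to stderr and exit
die() { echo "$0: $*" >&2; exit 1; }
EOF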

Running and debugging scripts

Once you have a script ready to go, it's testing and debugging time. You could exit Vim, make the script executable, run it to see if it has any bugs, and then go back to Vim to edit it, but that's tedious. Bash Support lets you stay in Vim while doing your testing.

When you're ready to make the script executable, just choose Bash -> Run -> make script executable. To save and run the script, press Ctrl-F9, or go to Bash -> Run -> save + run script.
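For comparison, the manual cycle those two menu entries replace is simply:

chmod +x test.sh    # Bash -> Run -> make script executable
./test.sh           # Bash -> Run -> save + run script (after :w)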

Bash Support also lets you call the bash debugger (bashdb) directly from within Vim. On Ubuntu, it's not installed by default, but that's easily remedied with apt-get install bashdb. Once it's installed, you can debug the script you're working on with F9 or Bash -> Run -> start debugger.

If you want a "hard copy" -- a PostScript printout -- of your script, you can generate one by going to Bash -> Run -> hardcopy to FILENAME.ps. This is where Bash Support comes in handy for any type of file, not just bash scripts. You can use this function within any file to generate a PostScript printout.

Bash Support has several other functions to help run and test scripts from within Vim. One useful feature is syntax checking, which you can access with Alt-F9. If you have no syntax errors, you'll get a quick OK. If there are problems, you'll see a small window at the bottom of the Vim screen with a list of syntax errors. From that window you can highlight the error and press Enter, and you'll be taken to the line with the error.
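Outside of Vim you can get a comparable syntax-only check from the shell itself, which is presumably what the plugin wraps:

bash -n test.sh && echo "no syntax errors"   # -n parses the script but never executes it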

Put away the reference book...

Don't you hate it when you need to include a regular expression or a test in a script, but can't quite remember the syntax? That's no problem when you're using Bash Support, because you have Regex and Tests menus with all you'll need. For example, if you need to verify that a file exists and is owned by the correct user ID (UID), go to Bash -> Tests -> file exists and is owned by the effective UID. Bash Support will insert the appropriate test ([ -O _]) with your cursor in the spot where you have to fill in the file name.
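Filled in, the generated test might end up looking like this (the file name is mine):

file=/etc/passwd
if [ -O "$file" ]; then    # true if the file exists and is owned by the effective UID
    echo "$file is yours"
else
    echo "$file is not owned by the effective UID"
fi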

To build regular expressions quickly, go to the Bash menu, select Regex, then pick the appropriate expression from the list. It's fairly useful when you can't remember exactly how to express "zero or one" or other regular expressions.

Bash Support also includes menus for environment variables, bash builtins, shell options, and a lot more.

Hotkey support

Vim users can access many of Bash Support's features using hotkeys. While not as simple as clicking the menu, the hotkeys do follow a logical scheme that makes them easy to remember. For example, all of the comment functions are accessed with \c, so if you want to insert a file header, you use \ch; if you want a date inserted, type \cd; and for a line end comment, use \cl.

Statements can be accessed with \a. Use \ac for a case statement, \aie for an "if then else" statement, \af for a "for in..." statement, and so on. Note that the online docs are incorrect here, and indicate that statements begin with \s, but Bash Support ships with a PDF reference card (under .vim/bash-support/doc/bash-hot-keys.pdf) that gets it right.

Run commands are accessed with \r. For example, to save the file and run a script, use \rr; to make a script executable, use \re; and to start the debugger, type \rd. I won't try to detail all of the shortcuts, but you can pull up a reference using :help bashsupport-usage-vim when in Vim, or use the PDF. The full Bash Support reference is available within Vim by running :help bashsupport, or you can read it online.

Of course, we've covered only a small part of Bash Support's functionality. The next time you need to whip up a shell script, try it using Vim with Bash Support. This plugin makes scripting in bash a lot easier.

[Sep 26, 2007] Beware Exotic tools can kill you!

Killersites.com

Once in a while you may come across what would seem to be a 'killer' piece of software, or maybe a cool new programming language - something that would appear to give you some advantage.

That MAY be the case, but many times it isn't really so - think twice before you leap!

Consider these points:

Do you notice a pattern here?

Yes, it's all about time. All this junk (software, programming languages, markup languages, etc.) has one purpose in the end: to save you time.

Keep that in mind when you approach things - ask yourself:

'Will using this save me time?'

[Sep 26, 2007] Will Ruby kill PHP

KILLERPHP.COM

OO is definitely overkill for a lot of web projects. It seems to me that so many people use OO frameworks like Ruby and Zope because "it's enterprise level". But using an 'enterprise' framework for small to medium sized web applications just adds so much overhead and frustration at having to learn the framework that it just doesn't seem worth it to me.

Having said all this, I must point out that I'm distrustful of large corporations and hate their dehumanising hierarchical structure. Therefore I am naturally drawn towards open source and away from the whole OO/enterprise/hierarchy paradigm. Maybe people want to push open source to the enterprise level in the hope that enterprises will adopt the technology and they will have more job security. Get over it - go and learn Java and .NET if you want job security, and preserve open source software as an oasis of freedom away from the corporate world. Just my 2c

===

OOP has its place, but the diversity of frameworks is just as challenging to figure out as a new class you didn't write, if not more. None of them work the same or keep a standard convention between them that makes learning them easier. Frameworks are great, but sometimes I think maybe they don't all have to be OO. I keep a small personal library of functions I've (and others have) written procedurally and include them just like I would a class.

Beyond the overhead issues is complexity. OOP has you chasing declarations over many files to figure out what's happening. If you're trying to learn how that unique class you need works, it can be time consuming to read through it and see how the class is structured. By the time you're done you may as well have written the class yourself, at least by then you'd have a solid understanding. Encapsulation and polymorphism have their advantages, but the cost is complexity which can equal time. And for smaller projects that will likely never expand, that time and energy can be a waste.

Not trying to bash OOP, just to defend procedural style. They each have their place.

===

Sorry, but I don't like your text, because you mix up Ruby and Ruby on Rails a lot. Ruby is in my opinion easier to use than PHP, because PHP has no design principle beside "make it work, somehow easy to use". Ruby has some really cool stuff I miss quite often when I have to program in PHP again (blocks, for example), and has a clearer and more logical syntax.

Ruby on Rails is of course not that easy to use, at least when speaking about small-scale projects. This is because it does a lot more than PHP does. Of course, there are other good reasons to prefer PHP over Rails (like better support from hosting providers, more modules, more documentation), but in my opinion, most projects done in PHP at or above the complexity of a blog could profit from being programmed in Rails, from a purely technical point of view. At least I won't program in PHP again unless a customer asks me to.

===

I have a reasonable level of experience with PHP and Python but unfortunately haven't touched Ruby yet. They both seem to be a good choice for low complexity projects. I can even say that I like Python a lot. But I would never consider it again for projects where design is an issue. They also say it is for (rapid) prototyping. My experience is that as long as you can't afford a proper IDE, Python is maybe the best place to go. But a properly "equipped" environment can formidably boost your productivity with a statically typed language like Java. In that case Python's advantage shrinks to the benefits of quick tests accessible through its command line.

Another problem of Python is that it wants to be everything: simple and complete, flexible and structured, high-level while allowing for low-level programming. The result is a series of obscure features.

Having said all that I must give Python all the credits of a good language. It's just not perfect. Maybe it's Ruby :-)
My apologies for not sticking too closely to the subject of the article.

===

The one thing I hate is OOP geeks trying to prove that they can write code that does nothing useful and nobody understands.

"You don't have to use OOP in ruby! You can do it PHP way! So you better do your homework before making such statements!"

Then why use ruby in the first place?

"What is really OVERKILL to me, is to know the hundrets of functions, PHP provides out of the box, and available in ANY scope! So I have to be extra careful whether I can use some name. And the more functions - the bigger the MESS."

On the other hand, in ruby you use only functions available for particular object you use.

I would rather say: "some text".length than strlen("some text"); which is much more meaningful! Ruby language itself much more descriptive. I remember myself, from my old PHP days, heaving always to look up the php.net for appropriate function, but now I can just guess!"

Yeah you must have weak memory and can`t remember whether strlen() is for strings or for numbers�.

Doesn`t ruby have the same number of functions just stored in objects?

Look if you can`t remember strlen than invent your own classes you can make a whole useless OOP framework for PHP in a day��

[Sep 26, 2007] Will Ruby on Rails kill .net and Java

www dot james mckay dot net

Ruby on Rails 1.1 has been released.

Dion Hinchcliffe has posted a blog entry at the end of which he asks the question, will it be a nail in the coffin for .net and Java?

Rails certainly looks beautiful. It is fully object oriented, with built in O/R mapping, powerful AJAX support, an elegant syntax, a proper implementation of the Model-View-Controller design pattern, and even a Ruby to Javascript converter which lets you write client side web code in Ruby.

However, I don't think it's the end of the line for C# and Java by a long shot. Even if it does draw a lot of fire, there is a heck of a lot of code knocking around in these languages, and there likely still will be for a very long time to come. Even throwaway code and hacked together interim solutions have a habit of living a lot longer than anyone ever expects. Look at how much code is still out there in Fortran, COBOL and Lisp, for instance.

Like most scripting languages such as Perl, Python, PHP and so on, Ruby is still a dynamically typed language. For this reason it will be slower than statically typed languages such as C#, C++ and Java. So it won't be used so much in places where you need lots of raw power. However, most web applications don't need such raw power in the business layer: the main bottleneck in web development is database access and network latency in communicating with the browser, so using C# rather than Rails would have only a very minor impact on performance. But some applications do need that raw power, and in such cases the solutions often have different parts of the application written in different languages and even running on different servers. One of the solutions that we have developed, for instance, has a web front end in PHP running on a Linux box, with a back end application server running a combination of Python and C++ on a Windows server.

Rails certainly knocks the spots off PHP though...

[May 7, 2007] The Hundred-Year Language by Paul Graham

April 2003

(Keynote from PyCon2003)

...I have a hunch that the main branches of the evolutionary tree pass through the languages that have the smallest, cleanest cores. The more of a language you can write in itself, the better.

...Languages evolve slowly because they're not really technologies. Languages are notation. A program is a formal description of the problem you want a computer to solve for you. So the rate of evolution in programming languages is more like the rate of evolution in mathematical notation than, say, transportation or communications. Mathematical notation does evolve, but not with the giant leaps you see in technology.

...I learned to program when computer power was scarce. I can remember taking all the spaces out of my Basic programs so they would fit into the memory of a 4K TRS-80. The thought of all this stupendously inefficient software burning up cycles doing the same thing over and over seems kind of gross to me. But I think my intuitions here are wrong. I'm like someone who grew up poor, and can't bear to spend money even for something important, like going to the doctor.

Some kinds of waste really are disgusting. SUVs, for example, would arguably be gross even if they ran on a fuel which would never run out and generated no pollution. SUVs are gross because they're the solution to a gross problem. (How to make minivans look more masculine.) But not all waste is bad. Now that we have the infrastructure to support it, counting the minutes of your long-distance calls starts to seem niggling. If you have the resources, it's more elegant to think of all phone calls as one kind of thing, no matter where the other person is.

There's good waste, and bad waste. I'm interested in good waste-- the kind where, by spending more, we can get simpler designs. How will we take advantage of the opportunities to waste cycles that we'll get from new, faster hardware?

The desire for speed is so deeply engrained in us, with our puny computers, that it will take a conscious effort to overcome it. In language design, we should be consciously seeking out situations where we can trade efficiency for even the smallest increase in convenience.

Most data structures exist because of speed. For example, many languages today have both strings and lists. Semantically, strings are more or less a subset of lists in which the elements are characters. So why do you need a separate data type? You don't, really. Strings only exist for efficiency. But it's lame to clutter up the semantics of the language with hacks to make programs run faster. Having strings in a language seems to be a case of premature optimization.

... Inefficient software isn't gross. What's gross is a language that makes programmers do needless work. Wasting programmer time is the true inefficiency, not wasting machine time. This will become ever more clear as computers get faster

... Somehow the idea of reusability got attached to object-oriented programming in the 1980s, and no amount of evidence to the contrary seems to be able to shake it free. But although some object-oriented software is reusable, what makes it reusable is its bottom-upness, not its object-orientedness. Consider libraries: they're reusable because they're language, whether they're written in an object-oriented style or not.

I don't predict the demise of object-oriented programming, by the way. Though I don't think it has much to offer good programmers, except in certain specialized domains, it is irresistible to large organizations. Object-oriented programming offers a sustainable way to write spaghetti code. It lets you accrete programs as a series of patches. Large organizations always tend to develop software this way, and I expect this to be as true in a hundred years as it is today.

... As this gap widens, profilers will become increasingly important. Little attention is paid to profiling now. Many people still seem to believe that the way to get fast applications is to write compilers that generate fast code. As the gap between acceptable and maximal performance widens, it will become increasingly clear that the way to get fast applications is to have a good guide from one to the other.

...One of the most exciting trends in the last ten years has been the rise of open-source languages like Perl, Python, and Ruby. Language design is being taken over by hackers. The results so far are messy, but encouraging. There are some stunningly novel ideas in Perl, for example. Many are stunningly bad, but that's always true of ambitious efforts. At its current rate of mutation, God knows what Perl might evolve into in a hundred years.

... One helpful trick here is to use the length of the program as an approximation for how much work it is to write. Not the length in characters, of course, but the length in distinct syntactic elements-- basically, the size of the parse tree. It may not be quite true that the shortest program is the least work to write, but it's close enough that you're better off aiming for the solid target of brevity than the fuzzy, nearby one of least work. Then the algorithm for language design becomes: look at a program and ask, is there any way to write this that's shorter?

[Dec 15, 2006] Ralph Griswold died

Ralph Griswold, the creator of the Snobol and Icon programming languages, died in October 2006 of cancer. Until recently computer science was a discipline where the founders were still around. That's changing. Griswold was an important pioneer of programming language design; Snobol's string manipulation facilities are different from, and somewhat faster than, regular expressions.
Lambda the Ultimate

Ralph Griswold died two weeks ago. He created several programming languages, most notably Snobol (in the 60s) and Icon (in the 70s) - both outstandingly innovative, integral, and efficacious in their areas. Despite the abundance of scripting and other languages today, Snobol and Icon are still unsurpassed in many respects, both as elegance of design and as practicality.

[Dec 15, 2006] Ralph Griswold

See also Ralph Griswold 1934-2006 and Griswold Memorial Endowment
Ralph E. Griswold died in Tucson on October 4, 2006, of complications from pancreatic cancer. He was Regents Professor Emeritus in the Department of Computer Science at the University of Arizona.

Griswold was born in Modesto, California, in 1934. He was an award winner in the 1952 Westinghouse National Science Talent Search and went on to attend Stanford University, culminating in a PhD in Electrical Engineering in 1962.

Griswold joined the staff of Bell Telephone Laboratories in Holmdel, New Jersey, and rose to become head of Programming Research and Development. In 1971, he came to the University of Arizona to found the Department of Computer Science, and he served as department head through 1981. His insistence on high standards brought the department recognition and respect. In recognition of his work the university granted him the title of Regents Professor in 1990.

While at Bell Labs, Griswold led the design and implementation of the groundbreaking SNOBOL4 programming language with its emphasis on string manipulation and high-level data structures. At Arizona, he developed the Icon programming language, a high-level language whose influence can be seen in Python and other recent languages.

Griswold authored numerous books and articles about computer science. After retiring in 1997, his interests turned to weaving. While researching mathematical aspects of weaving design he collected and digitized a large library of weaving documents and maintained a public website. He published technical monographs and weaving designs that inspired the work of others, and he remained active until his final week.

-- Gregg Townsend, Staff Scientist, The University of Arizona

[Dec 15, 2006] The mythical open source miracle by Neil McAllister

Actually, Spolsky does not understand the role of scripting languages. But he is right on target with his critique of OO. Object-oriented programming is no silver bullet.
Dec 14, 2006 | Computerworld

(InfoWorld) Joel Spolsky is one of our most celebrated pundits on the practice of software development, and he's full of terrific insight. In a recent blog post, he decries the fallacy of "Lego programming" -- the all-too-common assumption that sophisticated new tools will make writing applications as easy as snapping together children's toys. It simply isn't so, he says -- despite the fact that people have been claiming it for decades -- because the most important work in software development happens before a single line of code is written.

By way of support, Spolsky reminds us of a quote from the most celebrated pundit of an earlier generation of developers. In his 1987 essay "No Silver Bullet," Frederick P. Brooks wrote, "The essence of a software entity is a construct of interlocking concepts ... I believe the hard part of building software to be the specification, design, and testing of this conceptual construct, not the labor of representing it and testing the fidelity of the representation ... If this is true, building software will always be hard. There is inherently no silver bullet."

As Spolsky points out, in the 20 years since Brooks wrote "No Silver Bullet," countless products have reached the market heralded as the silver bullet for effortless software development. Similarly, in the 30 years since Brooks published " The Mythical Man-Month" -- in which, among other things, he debunks the fallacy that if one programmer can do a job in ten months, ten programmers can do the same job in one month -- product managers have continued to buy into various methodologies and tricks that claim to make running software projects as easy as stacking Lego bricks.

Don't you believe it. If, as Brooks wrote, the hard part of software development is the initial design, then no amount of radical workflows or agile development methods will get a struggling project out the door, any more than the latest GUI rapid-development toolkit will.

And neither will open source. Too often, commercial software companies decide to turn over their orphaned software to "the community" -- if such a thing exists -- in the naive belief that open source will be a miracle cure to get a flagging project back on track. This is just another fallacy, as history demonstrates.

In 1998, Netscape released the source code to its Mozilla browser to the public to much fanfare, but only lukewarm response from developers. As it turned out, the Mozilla source was much too complex and of too poor quality for developers outside Netscape to understand it. As Jamie Zawinski recounts, the resulting decision to rewrite the browser's rendering engine from scratch set the project back anywhere from six to ten months.

This is a classic example of the fallacy of the mythical man-month. The problem with the Mozilla code was poor design, not lack of an able workforce. Throwing more bodies at the project didn't necessarily help; it may have even hindered it. And while implementing a community development process may have allowed Netscape to sidestep its own internal management problems, it was certainly no silver bullet for success.

The key to developing good software the first time around is doing the hard work at the beginning: good design, and rigorous testing of that design. Fail that, and you've got no choice but to take the hard road. As Brooks observed all those years ago, successful software will never be easy. No amount of open source process will change that, and to think otherwise is just more Lego-programming nonsense.

[Oct 26, 2006] Cobol Not Dead Yet

It's interesting that Perl is at 30% in this survey (as unscientific as it is)
What programming languages do you use in your organization? Choose all that apply.
Visual Basic - 67%
Cobol - 62%
Java - 61%
JavaScript - 55%
VB.Net - 47%
C++ - 47%
Perl - 30%
C - 26%
C# - 23%
ColdFusion - 15%
PHP - 13%
Fortran - 7%
PL/1 - 5%
Python - 5%
Pascal - 4%
Ada - 2%
Source: Computerworld survey of 352 readers

[Oct 25, 2006] Sun Gets Rubyfied by Jon Erickson

DDJ Portal Blog

On the heels of last weekend's Ruby Conference in Denver (for a report, see Jack Woehr's blog), Sun Microsystems made a Ruby-related announcement of its own. Led by Charles Nutter and Thomas Enebo, the chief maintainers of JRuby, a 100% pure Java implementation of the Ruby language, Sun has released JRuby 0.9.1. Among the features of this release are:

In related news, Ola Bini has been inducted into JRuby as a core developer during this development cycle.
Details are available at Thomas Enebo's blog and Ola Bini's blog.

[Sep 30, 2006] Ruby Book Sales Surpass Python

O'Reilly Radar

I was just looking at our BookScan data mart to update a reporter on Java vs. C# adoption. (The answer to his query: in the last twelve weeks, Java book sales are off 4% vs. the same period last year, while C# book sales are up 16%.) While I was looking at the data, though, I noticed something perhaps more newsworthy: in the same period, Ruby book sales surpassed Python book sales for the first time. Python is up 20% vs. the same period last year, but Ruby is up 1552%! (Perl is down 3%.) Perl is still the most commonly used of the three languages, at least according to book sales, but Python and now Ruby are narrowing the gap.

[Sep 30, 2006] Industry demands Python, not Ruby or Rails

Andrew L Smith

RoR, AJAX, SOA -- these are hype. The reality is that Java, JSP, PHP, Python, Perl, and Tcl/Tk are what people use these days. And if it's a web app, they use PHP or JSP.

RoR is too time-consuming. It takes too long to learn and provides too few results in that timeframe compared to a PHP developer.

AJAX is also time-consuming and assumes too much about the stability of the browser client. And it puts way too much power into ugly Javascript. It's good for only a smidgen of things in only a smidgen of environments.

Java and JSP are around only because of seniority. They were hyped and backed by big companies as the only other option to Microsoft's stuff. Now we're stuck with them. In reality, JSP and Java programmers spend too much time in meetings going over OOP minutiae. PHP programmers may use a little OOP, but they focus mostly on just getting it done.

Python seems to have taken hold on Linux only as far as a rich client environment. It's not used enough for web apps.

By supermike, at 9:41 PM

[Sep 30, 2006] The departure of the hyper-enthusiasts

A couple of additional feedback posts
Weblogs Forum

Re: The departure of the hyper-enthusiasts Posted: Dec 18, 2005 5:54 PM

The issue at hand is comparable to the old "to use or not to use EJB" question. I, too, had a bad time trying to use EJBs, so maybe you can show some sympathy for a now-Ruby user who can't seem to use any other language.

I claim that Ruby is simple enough for me to concentrate on the problem and not on the language (the primary tool). Maybe for you Python is cleaner, but to me Python is harder than Ruby when I try to read the code. Ruby has a nice convention of "CamelCase", "method_names", "AClass.new", etc., that makes the code easier to read than similar Python code, because Python doesn't have a good convention for that. Also, when I require 'afile.rb' in Ruby, it's much easier to read than the "import this.that.Something" in Python. Thus, despite the forced indentation, I prefer the way that Ruby code looks and feels in comparison to Python code.

On the supported libraries, Python has a very good selection, indeed. I would say that the Python libraries might be very good in comparison to the Ruby libraries. On the other hand, Ruby has very unique libraries which feel good to use. So, even if Python has more libraries, Ruby should have some quality libraries that compensate a lot for the difference. Considering that one should be well served by either Ruby or Python in terms of libraries, Python's advantage over Ruby diminishes quite a bit.

Finally, when you are up to the task of creating something new, like a library or program, you may be much better served by using Ruby if you get to the point of fluid Ruby programming. But, if all you want is to create some web app, maybe Rails already fulfills your requirements.

Even if you consider Python better, for example, because Google uses it and you want to work for Google or something, that won't make us Ruby users give up on improving the language and the available tools. I simply love Ruby and I will keep using it for the foreseeable future -- in the future, if I can make a major contribution to the Ruby community, I will.

[Sep 28, 2006] Rants Get Famous By Not Programming -- scripting language as a framework

The author is really incoherent in this rant (also, he cannot be right by definition, as he loves Emacs ;-). The key problem with the arguments presented is that he mixes apples with oranges (greatness of a programmer as an artist and greatness of a programmer as an innovator). A strong point of this rant is the idea that easy extensibility is a huge advantage and that openness of code does not matter much per se. Another good observation (made by many other authors) is that "Any sufficiently flexible and programmable environment - say Emacs, or Ruby on Rails, or Firefox, or even my game Wyvern - begins to take on characteristics of ... operating system as it grows."
Stevey's Blog

Any sufficiently flexible and programmable environment - say Emacs, or Ruby on Rails, or Firefox, or even my game Wyvern - begins to take on characteristics of both language and operating system as it grows. So I'm lumping together a big class of programs that have similar characteristics. I guess you could call them frameworks, or extensible systems.

... ... ...

Not that we'd really know, because how often do we go look at the source code for the frameworks we use? How much time have you spent examining the source code of your favorite programming language's compiler, interpreter or VM? And by the time such systems reach sufficient size and usefulness, how much of that code was actually penned by the original author?

Sure, we might go look at framework code sometimes. But it just looks like, well, code. There's usually nothing particularly famous-looking or even glamorous about it. Go look at the source code for Emacs or Rails or Python or Firefox, and it's just a big ball of code. In fact, often as not it's a big hairy ball, and the original author is focused on refactoring or even rewriting big sections of it.

[Sep 28, 2006] Rants Blogger's Block #4 Ruby and Java and Stuff

Stevey's Blog

I was in Barnes today, doing my usual weekend stroll through the tech section. Helps me keep up on the latest trends. And wouldn't you know it, I skipped a few weeks there, and suddenly Ruby and Rails have almost as many books out as Python. I counted eleven Ruby/RoR titles tonight, and thirteen for Python (including one Zope book). And Ruby had a big display section at the end of one of the shelves.

Not all the publishers were O'Reilly and Pragmatic Press. I'm pretty sure there were two or three others there, so it's not just a plot by Tim O'Reilly to sell more books. Well, actually that's exactly what it is, but it's based on actual market research that led him to the conclusion that Rails and Ruby are both gathering steam like nobody's business.

... ... ...

I do a lot more programming in Python than in Ruby -- Jython in my game server, and Python at work, since that's what everyone there uses for scripting. I have maybe 3x more experience with Python than with Ruby (and 10x more experience with Perl). But Perl and Python both have more unnecessary conceptual overhead, so I find I have to consult the docs more often with both of them. And when all's said and done, Ruby code generally winds up being the most direct and succinct, whether it's mine or someone else's.

I have a lot of trouble writing about Ruby, because I find there's nothing to say. It's why I almost never post to the O'Reilly Ruby blog. Ruby seems so self-explanatory to me. It makes it almost boring; you try to focus on Ruby and you wind up talking about some problem domain instead of the language. I think that's the goal of all programming languages, but so far Ruby's one of the few to succeed at it so well.

... ... ...

I think next year Ruby's going to be muscling in on Perl in terms of mindshare, or shelf-share, at B&N.

[May 10, 2006] Google Code - Summer of Code

Among the participating organizations are PHP, the Python Software Foundation, and Ruby Central. See also the Student FAQ and Mentor FAQ
Google Code

Welcome to the Summer of Code 2006 site. We are no longer accepting applications from students or mentoring organizations. Students can view previously submitted applications and respond to mentor comments via the student home page. Accepted student projects will be announced on code.google.com/soc/ on May 23, 2006. You can talk to us in the Summer-Discuss-2006 group or via IRC in #summer-discuss on SlashNET.

If you're feeling nostalgic, you can still access the Summer of Code 2005 site.

[May 2, 2006] Embeddable scripting with Lua

While interpreted programming languages such as Perl, Python, PHP, and Ruby are increasingly favored for Web applications -- and have long been preferred for automating system administration tasks -- compiled programming languages such as C and C++ are still necessary. The performance of compiled programming languages remains unmatched (exceeded only by the performance of hand-tuned assembly), and certain software -- including operating systems and device drivers -- can only be implemented efficiently using compiled code. Indeed, whenever software and hardware need to mesh seamlessly, programmers instinctively reach for a C compiler: C is primitive enough to get "close to the bare metal" -- that is, to capture the idiosyncrasies of a piece of hardware -- yet expressive enough to offer some high-level programming constructs, such as structures, loops, named variables, and scope.

However, scripting languages have distinct advantages, too. For example, after a language's interpreter is successfully ported to a platform, the vast majority of scripts written in that language run on the new platform unchanged -- free of dependencies such as system-specific function libraries. (Think of the many DLL files of the Microsoft Windows operating system or the many libcs of UNIX and Linux.) Additionally, scripting languages typically offer higher-level programming constructs and convenience operations, which programmers claim boost productivity and agility. Moreover, programmers working in an interpreted language can work faster, because the compilation and link steps are unnecessary. The "code, build, link, run" cycle of C and its ilk is reduced to a hastened "script, run."
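The difference between the two cycles is easiest to see side by side (file names are illustrative):

# C: code, build, link, run
cc -o hello hello.c && ./hello
# Lua: script, run
lua hello.lua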

Lua novelties

Like every scripting language, Lua has its own peculiarities:

Find more examples of Lua code in Programming in Lua and in the Lua-users wiki (for links, see the Resources section below).

As in all engineering pursuits, choosing between a compiled language and an interpreted language means measuring the pros and cons of each in context, weighing the trade-offs, and accepting compromises.

[Apr 10, 2006] Digg PHP's Scalability and Performance

O'Reilly ONLamp Blog

Several weeks ago there was a notable bit of controversy over some comments made by James Gosling, father of the Java programming language. He has since addressed the flame war that erupted, but the whole ordeal got me thinking seriously about PHP and its scalability and performance abilities compared to Java. I knew that several hugely popular Web 2.0 applications were written in scripting languages like PHP, so I contacted Owen Byrne - Senior Software Engineer at digg.com - to learn how he addressed any problems they encountered during their meteoric growth. This article addresses the all-too-common false assumptions about the cost of scalability and performance in PHP applications.

At the time Gosling's comments were made, I was working on tuning and optimizing the source code and server configuration for the launch of Jobby, a Web 2.0 resume tracking application written using the WASP PHP framework. I really hadn't done any substantial research on how to best optimize PHP applications at the time. My background is heavy in the architecture and development of highly scalable applications in Java, but I realized there were enough substantial differences between Java and PHP to cause me concern. In my experience, it was certainly faster to develop web applications in languages like PHP; but I was curious as to how much of that time savings might be lost to performance tuning and scaling costs. What I found was both encouraging and surprising.

What are Performance and Scalability?

Before I go on, I want to make sure the ideas of performance and scalability are understood. Performance is measured by the output behavior of the application. In other words, performance is whether or not the app is fast. A good performing web application is expected to render a page in around or under 1 second (depending on the complexity of the page, of course). Scalability is the ability of the application to maintain good performance under heavy load with the addition of resources. For example, as the popularity of a web application grows, it can be called scalable if you can maintain good performance metrics by simply making small hardware additions. With that in mind, I wondered how PHP would perform under heavy load, and whether it would scale well compared with Java.

Hardware Cost

My first concern was raw horsepower. Executing scripting language code is more hardware intensive because the code isn't compiled. The hardware we had available for the launch of Jobby was a single hosted Linux server with a 2GHz processor and 1GB of RAM. On this single modest server I was going to have to run both Apache 2 and MySQL. Previous applications I had worked on in Java had been deployed on 10-20 application servers with at least 2 dedicated, massively parallel, ultra expensive database servers. Of course, these applications handled traffic in the millions of hits per month.

To get a better idea of what was in store for a heavily loaded PHP application, I set up an interview with Owen Byrne, cofounder and Senior Software Engineer at digg.com. From talking with Owen I learned digg.com gets on the order of 200 million page views per month, and they're able to handle it with only 3 web servers and 8 small database servers (I'll discuss the reason for so many database servers in the next section). Even better news was that they were able to handle their first year's worth of growth on a single hosted server like the one I was using. My hardware worries were relieved. The hardware requirements to run high-traffic PHP applications didn't seem to be more costly than for Java.

Database Cost

Next I was worried about database costs. The enterprise Java applications I had worked on were powered by expensive database software like Oracle, Informix, and DB2. I had decided early on to use MySQL for my database, which is of course free. I wondered whether the simplicity of MySQL would be a liability when it came to trying to squeeze the last bit of performance out of the database. MySQL has had a reputation for being slow in the past, but most of that seems to have come from sub-optimal configuration and the overuse of MyISAM tables. Owen confirmed that using InnoDB tables for read/write data makes a massive performance difference.

There are some scalability issues with MySQL, one being the need for large numbers of slave databases. However, these issues are decidedly not PHP-related, and are being addressed in future versions of MySQL. It could be argued that even with the large number of slave databases needed, the hardware required to support them is less expensive than the 8+ CPU boxes that typically power large Oracle or DB2 installations. The database requirements to run massive PHP applications still weren't more costly than for Java.
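For context, scaling MySQL reads in that era typically meant master-slave replication, with writes going to one master and reads fanned out to the slaves. A minimal sketch of such a setup is below; the server ids, hostnames, and credentials are illustrative, not taken from digg's configuration, and real deployments also pin the slave to specific binlog coordinates:

# my.cnf on the master: enable the binary log that slaves replay
[mysqld]
log-bin=mysql-bin
server-id=1

# my.cnf on each slave: just needs a unique server id
[mysqld]
server-id=2

-- then, once, on each slave (SQL):
CHANGE MASTER TO MASTER_HOST='master.example.com', MASTER_USER='repl', MASTER_PASSWORD='secret';
START SLAVE;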

PHP Coding Cost

Lastly, and most importantly, I was worried about scalability and performance costs directly attributed to the PHP language itself. During my conversation with Owen I asked him if there were any performance or scalability problems he encountered that were related to having chosen to write the application in PHP. A bit to my surprise, he responded by saying, "none of the scaling challenges we faced had anything to do with PHP," and that "the biggest issues faced were database related." He even added, "in fact, we found that the lightweight nature of PHP allowed us to easily move processing tasks from the database to PHP in order to deal with that problem." Owen mentioned they use the APC PHP accelerator platform as well as MCache to lighten their database load. Still, I was skeptical. I had written Jobby entirely in PHP 5 using a framework which uses a highly object oriented MVC architecture to provide application development scalability. How would this hold up to large amounts of traffic?

My worries were largely related to the PHP engine having to parse and interpret every included class on each page load. This turned out to be just my misunderstanding of the best way to configure a PHP server. After doing some research, I found that by using a combination of Apache 2's worker threads, FastCGI, and a PHP accelerator, this was no longer a problem. Any class or script loading overhead was only encountered on the first page load. Subsequent page loads were comparable in performance to a typical Java application. Making these configuration changes was trivial and generated massive performance gains. With regard to scalability and performance, PHP itself, even PHP 5 with heavy OO, was not more costly than Java.
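For readers who want to try this, here is a minimal sketch of the opcode-cache side of such a setup. The directive names are real APC settings, but the values are illustrative and not taken from the Jobby configuration; check the APC documentation for your version:

; php.ini (sketch): enable the APC opcode cache so scripts are parsed only once
extension=apc.so       ; load the APC extension
apc.enabled=1
apc.shm_size=64        ; MB of shared memory for cached opcodes (newer APC also accepts "64M")
apc.stat=1             ; re-check file modification times; 0 trades freshness for extra speed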

Conclusion

Jobby was launched successfully on its single modest server and, thanks to links from Ajaxian and TechCrunch, went on to happily survive hundreds of thousands of hits in a single week. Assuming I applied all of my newfound PHP tuning knowledge correctly, the application should be able to handle much more load on its current hardware.

Digg is in the process of preparing to scale to 10 times current load. I asked Owen Byrne if that meant an increase in headcount and he said that wasn't necessary. The only real change they identified was a switch to a different database platform. There doesn't seem to be any additional manpower cost to PHP scalability either.

It turns out that it really is fast and cheap to develop applications in PHP. Scaling and performance challenges are almost always rooted in the data layer, and those challenges are common across all language platforms. Even as a self-proclaimed PHP evangelist, I was startled to find that the theories I subscribed to were true. There is simply no truth to the idea that Java is better than scripting languages at writing scalable web applications. I won't go as far as to say that PHP is better than Java, because it is never that simple. However, it just isn't true to say that PHP doesn't scale, and with the rise of Web 2.0, sites like Digg, Flickr, and even Jobby are proving that large-scale applications can be rapidly built and maintained on the cheap, by one or two developers.


[Mar 25, 2006] AutoHotkey - Free Mouse and Keyboard Macro Program with Hotkeys and AutoText

A kind of Expect for Windows. A nice addition to Windows.
AutoHotkey

AutoHotkey is a free, open-source utility for Windows. With it, you can automate almost anything by sending keystrokes and mouse clicks, create hotkeys for your keyboard, joystick, and mouse, and expand abbreviations as you type them (AutoText).

Getting started might be easier than you think. Check out the quick-start tutorial.

More About Hotkeys

AutoHotkey unleashes the full potential of your keyboard, joystick, and mouse. For example, in addition to the typical Control, Alt, and Shift modifiers, you can use the Windows key and the Capslock key as modifiers. In fact, you can make any key or mouse button act as a modifier. For these and other capabilities, see Advanced Hotkeys.

Other Features

License: GNU General Public License

[Feb 16, 2006] Introducing Lua by Keith Fieldhouse

What if you could provide a seamlessly integrated, fully dynamic language with a conventional syntax while increasing your application's size by less than 200K on an x86? You can do it with Lua!
ONLamp.com

There's no reason that web developers should have all the fun. Web 2.0 APIs enable fascinating collaborations between developers and an extended community of developer-users. Extension and configuration APIs added to traditional applications can generate the same benefits.

Of course, extensibility isn't a particularly new idea. Many applications have a plugin framework (think Photoshop) or an extension language (think Emacs). What if you could provide a seamlessly integrated, fully dynamic language with a conventional syntax while increasing your application's size by less than 200K on an x86? You can do it with Lua!

Lua Basics

Roberto Ierusalimschy of the Pontifical Catholic University of Rio de Janeiro in Brazil leads the development of Lua. The most recent version (5.0.2; version 5.1 should be out soon) is made available under the MIT license. Lua is written in 99 percent ANSI C. Its main design goals are to be compact, efficient, and easy to integrate with other C/C++ programs. Game developers (such as World of Warcraft developer Blizzard Entertainment) are increasingly using Lua as an extension and configuration language.

Virtually anyone with any kind of programming experience should find Lua's syntax concise and easy to read. Two dashes introduce comments. An end statement delimits control structures (if, for, while). All variables are global unless explicitly declared local. Lua's fundamental data types include numbers (represented as double-precision floating-point values), strings, and Booleans. Lua has true and false as keywords; any expression that does not evaluate to nil or false is true. Note that 0 and arithmetic expressions that evaluate to 0 do not evaluate to nil, so Lua considers them true when you use them in a conditional statement.
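A few lines of Lua (my own illustration, not from the original article) show these rules in action:

-- two dashes introduce a comment
local count = 0            -- declared local; without "local" it would be global

if count then              -- 0 is neither nil nor false, so this branch runs
    print("zero counts as true in Lua")
end

for i = 1, 3 do
    print("iteration " .. i)   -- .. concatenates; the number is coerced to a string
end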

Finally, Lua supports userdata as one of its fundamental data types. By definition, a userdata value can hold an ANSI C pointer and thus is useful for passing data references back and forth across the C-Lua boundary.

Despite the small size of the Lua interpreter, the language itself is quite rich. Lua uses subtle but powerful forms of syntactic sugar to allow the language to be used in a natural way in a variety of problem domains, without adding complexity (or size) to the underlying virtual machine. The carefully chosen sugar results in very clean-looking Lua programs that effectively convey the nature of the problem being solved.

The only built-in data structure in Lua is the table. Perl programmers will recognize this as a hash; Python programmers will no doubt see a dictionary. Here are some examples of table usage in Lua:

a      = {}       -- Initializes an empty table
a[1]   = "Fred"   -- Assigns "Fred" to the entry indexed by the number 1
a["1"] = 7        -- Assigns the number 7 to the entry indexed by the string "1"

Any Lua data type can serve as a table index, making tables a very powerful construct in and of themselves. Lua extends the capabilities of the table by providing different syntactic styles for referencing table data. The standard table constructor looks like this:

t = { "Name"="Keith", "Address"="Ballston Lake, New York"}

A table constructor written like

t2 = { "First", "Second","Third"}

is the equivalent of

t3 = { 1="First", 2="Second", 3="Third" }

This last form initializes a table that for all practical purposes behaves as an array. Arrays created in this way have the integer 1 as their first index, rather than 0 as in most other languages.

The following two forms of accessing the table are equivalent when the table key is a string that forms a valid Lua identifier:

t3["Name"] = "Keith"
t3.Name    = "Keith"

Tables behave like a standard struct or record when accessed in this fashion.
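Here is a short sketch of my own, extending the article's examples (the field values are illustrative): a table used as a record and then traversed with the standard pairs and ipairs iterators.

person = { name = "Keith", city = "Ballston Lake" }   -- record-style table
person.email = "keith@example.com"                    -- dot syntax adds a new field

for key, value in pairs(person) do     -- visits every key/value pair, in no fixed order
    print(key, value)
end

list = { "First", "Second", "Third" }
for i, v in ipairs(list) do            -- visits the array part in order, starting at index 1
    print(i, v)
end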

[Feb 14, 2006] OOP Criticism

Object Oriented Programming Oversold by B. Jacobs (Reality Check 101, "Snake OOil") -- OOP criticism and OOP problems. The emperor has no clothes! Among other things, the site maintains a long list of OOP myths it debunks.
5/14/2005

SymbianOne Feature

Simkin started life in 1995. At that time Lateral Arts' Simon Whiteside was involved in the development of "Animals of Farthing Wood", an adventure game being produced by the BBC, and was asked to produce the game code. "When I started the project it became clear that while the game's designers had clear objectives for what they were trying to achieve, the details of much of the game were not defined," says Simon. "Faced with the prospect of rewriting sections of the game as the design progressed, which I realized was going to be time consuming in C running on Windows 3.0, I looked for some alternative solutions." Simon's initial solution was to allow the game to be manipulated using configuration files; as time progressed the need for an expression evaluator was identified, and later loops were added to give greater control and flexibility, and so the scripting language emerged.

From the Farthing Wood project Simon took this technology with him to a project for Sibelius, the best-selling music notation application. The developers of Sibelius wanted a macro capability similar to the facilities available in a word processor, and Simon created this feature using Simkin to provide the Sibelius plug-in capability.

When Simon left Sibelius in 1997 he decided to make Simkin available as a product, and after productizing it he spent about six months working on licensing it. In that period he sold a couple of licenses but eventually realized that his core interest was in bespoke application development. Rather than let the product die, Simon decided to release it as an open source project, and in 1999 it was released through SourceForge. "Simkin certainly gained interest as an open source product," says Simon. "I received a lot of feedback and several bug fixes, so I was happy that open source was the right way to go with Simkin."

Since open sourcing Simkin, Simon has developed Java and XML versions, as well as a pilot J2ME version.

The Symbian version started with an inquiry from Hewlett-Packard in early 2002. Hewlett-Packard Research Laboratories Europe was running the Bristol Wearable Computing Project in partnership with Bristol University. The project looked at various applications for wearable computing devices, from games to guides. One application provided a guide to the works in the city art gallery, fed with information from wireless access points which had been set up around Bristol. As part of the project HP wanted to build an interactive game to run on the HP iPAQ. To give the game a simple customization mechanism, they approached Simon to port Simkin to the iPAQ and so provide the ability to use XML schemas to describe elements of the game.

"Once we had done that HP wanted to extend the project to use phones," says Simon. "They had identified Symbian OS phones as the emerging technology in this arena and they asked me to do a port." Through contacts Simon approached Symbian who provided comprehensive support in porting Simkin. However HP did not proceed with the use of Symbian phones in the wearables project, although Simon notes that there has been a fair amount of interest from Symbian developers since the port was made available through Sourceforge

In porting to Symbian, Simon wanted to retain source code compatibility with the other versions of Simkin. "Maintaining compatibility created two main challenges, due to the fact that Symbian C++ does not include the ability to process C++ exceptions and you cannot use the Symbian leave mechanism in a C++ constructor," says Simon. "I managed to overcome most of these problems by using C++ macros, part of which I had started for the HP port, as Windows CE also lacked support for exceptions. For the most part this approach worked, but there were still some places that needed Symbian-specific code."

Simkin is not a language that can be used to develop applications from scratch. As Simon describes it: "Simkin is a language that can be used to configure application behavior; I call it an embeddable scripting language. It bolts onto an application to allow a script to make the final decisions about the application's behavior, or to allow users to control aspects of what the application does, but the real functionality is still in the host application." Simon believes Simkin is well suited to games, where performance is an issue, as the intrinsic game functions can be developed in C or C++ and then controlled by the lightweight Simkin. "Using a conventional scripting language would simply not be possible for that type of application," says Simon.

[Jan 31, 2006] The departure of the hyper-enthusiasts by Bruce Eckel (the piece discussed in O'Reilly Ruby's "A little anti-anti-hype")

It's kind of funny to see this negative opinion about Ruby, that "Perl with exceptions, co-routines and OO done right," coming from an OO evangelist who managed to ride the Java wave, chanting the standard mantras "Java is a great OO language" and "OO programming in Java is holy and the best thing on Earth since sliced bread" to the fullest extent possible. All those cries were enthusiastically performed despite the fact that Bruce understands C++ well enough to have felt all the deficiencies of Java from day one.
What the author misses here is that the length of a program (an indirect expression of the level of the language) is an extremely important measure of the success of a language design. And here Ruby beats Python and trashes Java.
BTW, by this measure Java, with all its OO holiness, is a miserable failure in comparison with Ruby or Python, as most Java programs are even more verbose than the equivalent programs in C++.
Actually, if you read Thinking in Java attentively, you realize that Bruce Eckel is more a language-feature-collector type of person (a type very well suited to sitting on language standardization committees) than a software/language architect type of person.
He also by and large depends on the correct choice of the "next hype wave" for the success of his consulting business, and might be slightly resentful that Ruby recently got a lot of positive press when he bet on Python. But the problem for consultants like Bruce might be not Python vs. Ruby, but that there might be no "next wave" at all.

December 18, 2005

Ruby is to Perl what C++ was to C. Ruby improves and simplifies the Perl language (the name "Ruby" is even a tribute to Perl), and adds workable OO features (If you've ever tried to use classes or references in Perl, you know what I'm talking about. I have no idea whether Perl 6 will rectify this or not, but I stopped paying attention long ago). But it also seems to carry forward some of the Perl warts. For anyone used to, and tired of, Perl, this certainly seems like a huge improvement, but I'm tired of all impositions by a language upon my thinking process, and so arbitrary naming conventions, reversed syntax and begin-end statements all seem like impediments to me.

But it's hard to argue that Ruby hasn't moved things forward, just like C++ and Java have. It has clearly shaken the Python community up a little; for a long time they were on the high ground of pure, clean language design. ... Ruby, for example, has coroutines, as I learned from Tate's book. The expression of coroutines in Ruby (at least, according to Tate's example) is awkward, but they are there, and I suspect that this may be why coroutines -- albeit in a much more elegant form -- are appearing in Python 2.5. Python's coroutines also allow straightforward continuations, and so we may see continuation servers implemented using Python 2.5.

... ... ...

... the resulting code has 20 times the visual bulk of a simpler approach. One of the basic tenets of the Python language has been that code should be simple and clear to express and to read, and Ruby has followed this idea, although not as far as Python has because of the inherited Perlisms. But for someone who has invested Herculean effort to use EJBs just to baby-sit a database, Rails must seem like the essence of simplicity. The understandable reaction for such a person is that everything they did in Java was a waste of time, and that Ruby is the one true path.

... ... ...

So -- sorry, Jim (Fulton, not Kirk) -- I'm going to find something drop-dead simple to solve my drop-dead simple problems. Probably PHP5, which actually includes most of Java and C++ syntax, amazingly enough, and I wonder if that isn't what made IBM adopt it.

... ... ...

However, I can't see Ruby, or anything other than C#, impacting the direction of the Java language, because of the way things have always happened in the Java world. And I think the direction of C# 3.0 may be too forward-thinking for Java to catch up to.

But here's something interesting. I was on the C++ standards committee from the initial meeting and for about 8 years. When Java burst on the scene with its onslaught of Sun marketing, a number of people on the standards committee told me they were moving over to Java, and stopped coming to meetings. And although some users of Python like Martin Fowler (who, it could be argued, was actually a Smalltalk programmer looking for a substitute, because the Smalltalk relationship never really worked out in the real world) have moved to Ruby, I have not heard of any of the rather significant core of Python language and library developers saying "hey, this Ruby thing really solves a lot of problems we've been having in Python, I'm going over there." Instead, they write PEPs (Python Enhancement Proposals) and morph the language to incorporate the good features.

Dick Ford Re: The departure of the hyper-enthusiasts Posted: Dec 20, 2005 9:05 PM
Reply to this message Reply
Remember, both Tate and Eckel make a living writing and talking about programming technologies. So both have to be looking down the road to when Java goes into COBOL-like legacy status. It takes a lot of investment in time to learn a programming language and its libraries well enough to write and lecture about them. So if Ruby "is the one" to make it into the enterprise eventually, and Python never makes the leap, then that's a huge amount of re-tooling that Eckel has to do. It looks like he's trying to protect his Python investment.
Lars Stitz Re: The departure of the hyper-enthusiasts Posted: Dec 22, 2005 6:10 AM
Reply to this message Reply
The "hyper-enthusiasts", as they are euphemistically called by the article, are no more (and no less) than a bunch of consultants who want to advertise their expertise in order to gain more consulting gigs. They are not that outspoken because they know more or are brighter than other developers, but because their blogs and websites like Artima or TheServerSide grant them the benefit of publicity promotion at no cost.

Now, as mainstream development has moved to Javaland for good, these consultants are not needed anymore. Everybody and their dog can write a good Java application that uses decent frameworks and OR mappers and thus performs well in most tasks. So, more enthusiastic propaganda for Java does not pay the bill for these folks anymore -- they have to discover a new niche market where their service is still needed. In this case, what could be better than a language that only few people know yet? Instantly, the consultant's services appear valuable again!

My advice: Don't follow the hype unless you have good reasons to do so. "A cobbler should stick to his last," as we say in Germany. Sure, PHP, Perl, Ruby, and C# all have their place in software development. But they are not about to replace Java -- for now, their benefits over Java are still too small.

Cheers, Lars

Kyrill Alyoshin Re: The departure of the hyper-enthusiasts Posted: Dec 18, 2005 1:00 PM
Reply to this message Reply
Very glad that you touched on the Tate's book. How about "I've never written a single EJB in my life" from an author of "Bitter EJB"?..

I am sensing quite a bit of commercial pressure from Bruce and his camp. They are simply not making enough margin teaching Java anymore. To make that margin, you have to work hard, maybe not as hard as B. Eckel but still really hard: play with the intricacies of the language, dig ever deeper and deeper, invest the time to write a book... But that's hard to do; kayaking is way more interesting.

So, it seems like Ruby has potential, why not throw a book or two at it, run a few $1000 a day courses... If you read "Beyond Java", this is exactly what Jason Hunter says in his interview.

I think mercantilism of "hyper-enthusiasts" is yet to be analyzed.

That said, I am not buying another Tate book ever again, no matter how "pragmatic" it is.

Jakub Pawlowicz Re: The departure of the hyper-enthusiasts Posted: Dec 18, 2005 5:01 PM
Reply to this message Reply
I think you are mainly right about the reasons people are moving to Ruby, and about its influence on the Python and Java languages.
But by saying that "Java-on-rails might actually tempt me into creating a web app using Java again," and by comparing development in Ruby to development in EJB 1/2 (or even EJB 3), you are missing the fact that part of the server-side Java community has already moved to lightweight approaches such as the Spring Framework.

From one and a half years of experience working as a Spring web developer, I must admit that server-side Java development can be much simpler with lightweight approaches than it was in the EJB 1/2 times.

Steven E. Newton Re: The departure of the hyper-enthusiasts Posted: Dec 19, 2005 9:37 PM
Reply to this message Reply
> Does anyone have an Open Source Ruby application they can
> point me to besides "ROR" (preferably a desktop app)? I
> wouldn't mind analysing some of its code, if one exists,
> so I can get a better sense of why its going to be a great
> desktop application programming language for me to use.

How about a Ruby/Cocoa application? I'm speaking of the graphical TestRunner for Ruby's Test::Unit I wrote: http://rubyforge.org/projects/crtestrunner/
It provides an interface similar to JUnit's for running Ruby unit tests.

Also check out Rake and RubyGems. Actually any of the top projects on RubyForge are worth digging into.

DougHolton Re: The departure of the hyper-enthusiasts Posted: Dec 20, 2005 12:27 AM
Reply to this message Reply
I highly recommend skimming thru these short resources to get a much more in depth feel for what ruby is like, if you are already familiar with java or python like myself. I did recently and I am finally "getting" ruby much better (the perl-isms turned me off from seriously looking at it earlier, just like it took me a year to get over python's indenting):

Ruby user's guide:
http://www.rubyist.net/~slagell/ruby/
10 things every java programmer should know about ruby:
http://onestepback.org/articles/10things/
Coming to ruby from java:
http://fhwang.net/blog/40.html

Things I like:
-blocks
-you can be more expressive in ruby and essentially twist it into different domain-specific languages, see: http://blog.ianbicking.org/ruby-python-power.html
-I like how standalone functions essentially become protected extensions of the object class (like C# 3.0 extension methods):
http://www.rubyist.net/~slagell/ruby/accesscontrol.html
-using "end" instead of curly braces (easier for beginners and more readable)

Things I don't like and never will:
-awkward syntax for some things like symbols and properties
-awful perlisms like $_,$$,$0,$1,?,<<,=begin
-80's style meaningless and over-abbreviated keywords and method names like "def", "to_s", "puts", etc.
-:: (double colon) vs. . (period).

Ruby is definitely better than python, but still not perfect, and still an order of magnitude slower than statically typed languages.

Re: The departure of the hyper-enthusiasts Posted: Dec 20, 2005 5:42 PM
Reply to this message Reply
> 'For example, the beautiful little feature where you can
> ask an array for its size as well as for its length
> (beautiful because it doesn't terrorize you into having to
> remember the exact precise syntax; it approximates it,
> which is the way most humans actually work),'
>
> you've intrigued me, which means I might be one of those
> programmers who lacks the imagination to see the
> difference between an arrays size and its length. :D What
> exactly is the difference?

There is no difference. The terms are synonymous. Ruby, being a common-sense oriented language, allows for synonymous terms without throwing a fit. It accommodates the way humans tend to think.

Java is the exact opposite. It is very stern, very non-commonsense oriented. It will throw a fit if you send the message 'length()' to an ArrayList. Although in the commonsense world we all know what the question "what is your length?" should mean for an ArrayList, Java bureaucratically insists that our question is dead wrong, and that we should be asking it for its 'size()'. Java is absolutely non-lenient.

Now, if you ask me, such a boneheaded bureaucratic mindset is very dumb, very stupid. This is why anyone who develops in such bureaucratic languages feels their debilitating effects. And that's why switching to Ruby feels like a full-blown liberation!

James Watson Re: The departure of the hyper-enthusiasts Posted: Dec 20, 2005 5:56 PM
Reply to this message Reply
> Java is the exact opposite. It is very stern, very
> non-commonsense oriented. It will throw a fit if you send
> the message 'length()' to an ArrayList. Although in the
> commonsense world, we all know what the meaning of the
> question: "what is your length?" should be for an
> ArrayList. Still, Java bureaucratically insists that our
> question is dead wrong, and that we should be asking it
> for its 'size()'. Java is absolutely non lenient.

And then what? You give up? You go and cry? The world explodes? I don't get it. What's the big problem? You try to compile, the compiler says, "sorry, I don't get your meaning," and you correct the mistake. Is that really a soul-crushing experience? And that was in the stone age, when we didn't have IDEs for Java. Now you type '.', a list comes up, and you select the appropriate method. Not that difficult.

Re: The departure of the hyper-enthusiasts Posted: Dec 20, 2005 6:19 PM
Reply to this message Reply
How many times has your project enabled you to create reusable components?

Without blackboxes, you will always be starting over and creating as many parts from scratch as needed.

Take Rails, for example. It's a framework built from components. One person was responsible for creating the main components, like HTTP Interface, ORM, general framework, etc. One person only! And the components were so good that people were able to use them with extreme ease (now known as "hype").

How many Java projects could have enjoyed a way to create good components, instead of poor frameworks and libraries that barely work together? I would say most Java projects could enjoy a componentized approach because they generally involve lots of very skilled people and lots of resources. :-)

What's a component compared to a library or a module? A component is code that has a published interface and works like a black box -- you don't need to know how it works, only that it works. Even a single object can be a component, as stated by Anders Hejlsberg (C#, Delphi):

"Anders Hejlsberg: The great thing about the word component is that you can just sling it about, and it sounds great, but we all think about it differently. In the simplest form, by component I just mean a class plus stuff. A component is a self-contained unit of software that isn't just code and data. It is a class that exposes itself through properties, methods, and events. It is a class that has additional attributes associated with it, in the form of metadata or naming patterns or whatever. The attributes provide dynamic additional information about how the component slots into a particular hosting environment, how it persists itself-all these additional things you want to say with metadata. The metadata enables IDEs to intelligently reason about what a component does and show you its documentation. A component wraps all of that up."
http://www.artima.com/intv/simplexity3.html

So, to me, components are truly the fine-grained units of code reuse. With Ruby, I not only can create my own components in a succinct way, but can also use its Domain Specific Language capabilities to create easy interfaces for using and exercising the components. All this happens in Rails. All this happens in my own libraries. And all this happens in the libraries of people who use Ruby. We are not starting our projects from scratch and hoping for the best. We are enjoying some powerful programmability!

Alex Bunardzic Re: The departure of the hyper-enthusiasts Posted: Dec 21, 2005 1:57 AM
Reply to this message Reply
> The length/size inconsistency has nothing to do with Java
> and everything to do with poor API design decisions made
> in 1995, probably by some very inexperienced programmer
> who had no idea that Java would become so successful.

This is akin to saying that the Inquisition had nothing to do with the fanaticism of the Catholic church, and everything to do with poor decisions some clergy made at that time. In reality, however, the Inquisition was inspired by the broader climate of the Catholic church fanaticism.

In the same way, poor API design that Java is infested with was/is directly inspired by the bureaucratic nature of the language itself.

Re: The departure of the hyper-enthusiasts Posted: Dec 22, 2005 12:58 AM
Reply to this message Reply
Ruby is good for Python

Because it offers more proof that dynamically typed, loosely coupled languages can be more productive in creating robust solutions than statically typed, stricter languages with deeply nested class hierarchies. Java and C# essentially lead us down the same path for tackling problems. One may be a better version of the other (I like C# more) but the methodology is very similar. In fact the release of C# only validated the Java-style methodology by emulating it (albeit offering a more productive way to follow it).

Enter Python or Ruby, both different from the Java/C# style, and both producing 'enlightening' experiences in an ever-growing list of seasoned, fairly well-known static-style developers (Bruce Eckel, Bruce Tate, Martin Fowler...). As the knowledge spreads, it pokes holes in the strong Java/C# meme in people's minds. Then people start to explore and experiment, and discover the Python (or Ruby) productivity gain. Some may prefer one, some the other. Ruby, in the end, validates the fact that Java/C#-style methods may not be the best for everything, something the Python advocates have been saying for quite some time.

(copy from my posting at http://www.advogato.org/person/shalabh/diary.html?start=46)

Re: The departure of the hyper-enthusiasts Posted: Dec 22, 2005 5:21 AM
Reply to this message Reply
> The person I want to hear from is the core Python expert,
> someone who knows that language incredibly well ...

Ok Bruce(s). I'm not sure I qualify as your "core Python expert". I'm a core Python developer, though: http://sourceforge.net/project/memberlist.php?group_id=5470

I've taken you up (well kinda), here are my rambling thoughts. http://nnorwitz.blogspot.com/2005/12/confessions-of-language-bigot.html

Notes on Postmodern Programming

CS-TR-02-9. Authors: James Noble, Robert Biddle, Elvis Software Design Research Group. Source: GZipped PostScript (1700kb); Adobe PDF (1798kb)

These notes have the status of letters written to ourselves: we wrote them down because, without doing so, we found ourselves making up new arguments over and over again. When reading what we had written, we were always too satisfied. For one thing, we felt they suffered from a marked silence as to what postmodernism actually is. Yet, we will not try to define postmodernism, first because a complete description of postmodernism in general would be too large for the paper, but secondly (and more importantly) because an understanding of postmodern programming is precisely what we are working towards. Very few programmers tend to see their (sometimes rather general) difficulties as the core of the subject and as a result there is a widely held consensus as to what programming is really about. If these notes prove to be a source of recognition or to give you the appreciation that we have simply written down what you already know about the programmer's trade, some of our goals will have been reached.

[Jan 31, 2006] A little anti-anti-hype

O'Reilly Ruby

Everyone's buzzing about Bruce Eckel's "anti-hype" article. I hope the irony isn't lost on him.

... ... ...

First, inferior languages and technologies are just as likely to win. Maybe even more likely, since it takes less time to get them right. Java beat Smalltalk; C++ beat Objective-C; Perl beat Python; VHS beat Beta; the list goes on. Technologies, especially programming languages, do not win on merit. They win on marketing. Begging for fair, unbiased debate is going to get your language left in the dust.

You can market a language by pumping money into a hype machine, the way Sun and IBM did with Java, or Borland did back with Turbo Pascal. It's pretty effective, but prohibitively expensive for most. More commonly, languages are marketed by a small group of influential writers, and the word-of-mouth hyping extends hierarchically down into the workplace, where a bunch of downtrodden programmers wishing they were having more fun stage a coup and start using a new "forbidden" language on the job. Before long, hiring managers start looking for this new language on resumes, which drives book sales, and the reactor suddenly goes supercritical.

Perl's a good example: how did it beat Python? They were around at more or less the same time. Perl might predate Python by a few years, but not enough for it to matter much. Perl captured roughly ten times as many users as Python, and has kept that lead for a decade. How? Perl's success is the result of Larry Wall's brilliant marketing, combined with the backing of a strong publisher in O'Reilly.

"Programming Perl" was a landmark language book: it was chatty, it made you feel welcome, it was funny, and you felt as if Perl had been around forever when you read it; you were just looking at the latest incarnation. Double marketing points there: Perl was hyped as a trustworthy, mature brand name (like Barnes and Noble showing up overnight and claiming they'd been around since 1897 or whatever), combined with that feeling of being new and special. Larry continued his campaigning for years. Perl's ugly deficiencies and confusing complexities were marketed as charming quirks. Perl surrounded you with slogans, jargon, hip stories, big personalities, and most of all, fun. Perl was marketed as fun.

What about Python? Is Python hip, funny, and fun? Not really. The community is serious, earnest, mature, and professional, but they're about as fun as a bunch of tax collectors.

... ... ...

Pedantry: it's just how things work in the Python world. The status quo is always correct by definition. If you don't like something, you are incorrect. If you want to suggest a change, put in a PEP, Python's equivalent of Java's equally glacial JSR process. The Python FAQ goes to great lengths to rationalize a bunch of broken language features. They're obviously broken if they're frequently asked questions, but rather than 'fessing up and saying "we're planning on fixing this", they rationalize that the rest of the world just isn't thinking about the problem correctly. Every once in a while some broken feature is actually fixed (e.g. lexical scoping), and they say they changed it because people were "confused". Note that Python is never to blame.

In contrast, Matz is possibly Ruby's harshest critic; his presentation "How Ruby Sucks" exposes so many problems with his language that it made my blood run a bit cold. But let's face it: all languages have problems. I much prefer the Ruby crowd's honesty to Python's blaming, hedging and overt rationalization.

As for features, Perl had a very different philosophy from Python: Larry would add in just about any feature anyone asked for. Over time, the Perl language has evolved from a mere kitchen sink into a vast landfill of flotsam and jetsam from other languages. But they never told anyone: "Sorry, you can't do that in Perl." That would have been bad for marketing.

Today, sure, Perl's ugly; it's got generations of cruft, and they've admitted defeat by turning their focus to Perl 6, a complete rewrite. If Perl had started off with a foundation as clean as Ruby's, it wouldn't have had to mutate so horribly to accommodate all its marketing promises, and it'd still be a strong contender today. But now it's finally running out of steam. Larry's magical marketing vapor is wearing off, and people are realizing that Perl's useless toys (references, contexts, typeglobs, ties, etc.) were only fun back when Perl was the fastest way to get things done. In retrospect, the fun part was getting the job done and showing your friends your cool software; only half of Perl's wacky features were helping with that.

So now we have a void. Perl's running out of steam for having too many features; Java's running out of steam for being too bureaucratic. Both are widely beginning to be perceived as offering too much resistance to getting cool software built. This void will be filled by... you guessed it: marketing. Pretty soon everyone (including hiring managers) will see which way the wind is blowing, and one of Malcolm Gladwell's tipping points will happen.

We're in the middle of this tipping-point situation right now. In fact it may have already tipped, with Ruby headed to become the winner, a programming-language force as prominent on resumes and bookshelves as Java is today. This was the entire point of Bruce Tate's book. You can choose to quibble over the details, as Eckel has done, or you can go figure out which language you think is going to be the winner, and get behind marketing it, rather than complaining that other language enthusiasts aren't being fair.

Could Python be the next mega-language? Maybe. It's a pretty good language (not that this really matters much). To succeed, they'd have to get their act together today. Not in a year, or a few months, but today -- and they'd have to realize they're behind already. Ruby's a fine language, sure, but now it has a killer app. Rails has been a huge driving and rallying force behind Ruby adoption. The battleground is the web framework space, and Python's screwing it up badly. There are at least five major Python frameworks that claim to be competing with Rails: Pylons, Django, TurboGears, Zope, and Subway. That's at least three (maybe four) too many. From a marketing perspective, it doesn't actually matter which one is the best, as long as the Python community gets behind one of them and starts hyping it exclusively. If they don't, each one will get 20% of the developers, and none will be able to keep pace with the innovation in Rails.

The current battle may be over web frameworks, but the war is broader than that. Python will have to get serious about marketing, which means finding some influential writers to crank out some hype books in a hurry. Needless to say, they also have to abandon their anti-hype position, or it's a lost cause. Sorry, Bruce. Academic discussions won't get you a million new users. You need faith-based arguments. People have to watch you having fun, and envy you.

My guess is that the Python and Java loyalists will once again miss the forest for the trees. They'll debate my points one by one, and declare victory when they've proven beyond a doubt that I'm mistaken: that marketing doesn't really matter. Or they'll say "gosh, it's not really a war; there's room for all of us", and they'll continue to wonder why the bookshelves at Barnes are filling up with Ruby books.

I won't be paying much attention though, 'cuz Ruby is soooo cool. Did I mention that "quit" exits the shell in Ruby? It does, and so does Ctrl-D. Ruby's da bomb. And Rails? Seriously, you don't know what you're missing. It's awesome. Ruby's dad could totally beat up Python's dad. Check out Why's Poignant Guide if you don't believe me. Ruby's WAY fun -- it's like the only language I want to use these days. It's so easy to learn, too. Not that I'm hyping it or anything. You just can't fake being cool.

[Jan 28, 2006] Draft of the paper "In Praise of Scripting: Real Programming Pragmatism" by Ronald P. Loui, Associate Professor of CSE, Washington University in St. Louis.

This article's main purpose is to review the changes in programming practices known collectively as the "rise of scripting," as predicted by Ousterhout in IEEE Computer in 1998. It attempts to be both brief and definitive, drawing on many of the essays that have appeared in online forums. The main new idea is that programming language theory needs to move beyond semantics and take language pragmatics more seriously.

... ... ...

Part of the problem is that scripting has risen in the shadow of object-oriented programming and highly publicized corporate battles between Sun, Netscape, and Microsoft with their competing software practices. Scripting has been appearing language by language, including object-oriented scripting languages now. Another part of the problem is that scripting is only now mature enough to stand up against its legitimate detractors. Today, there are answers to many of the persistent questions about scripting:

Continued...

Recommended Links

Softpanorama Recommended

Top articles

[Aug 14, 2019] linux - How to get PID of background process - Stack Overflow Published on Aug 14, 2019 | stackoverflow.com

[Jun 23, 2019] Utilizing multi core for tar+gzip-bzip compression-decompression Published on Jun 23, 2019 | stackoverflow.com

[Oct 22, 2018] move selection to a separate file Published on Oct 22, 2018 | superuser.com

[Oct 18, 2018] Isn't less just more Published on Oct 18, 2018 | unix.stackexchange.com

[Oct 18, 2018] What are the differences between most, more and less Published on Jun 29, 2013 | unix.stackexchange.com

[Apr 27, 2018] Shell command to tar directory excluding certain files-folders Published on Apr 27, 2018 | stackoverflow.com

[Dec 06, 2017] Install R on RedHat errors on dependencies that don't exist Published on Dec 06, 2017 | stackoverflow.com

Sites

Classic Papers Icon Rebol S-Lang History Humor Etc.

Internal

External

Major sites and directories

Perl

Python Programming Language

Ruby Home Page

Dave Winer's Scripting News Weblog

OOP Oversold



Icon

The classic language from the creator of SNOBOL. Icon introduced many interesting constructs (generators); moreover, Icon's constructs were done right, unlike similar attempts in Perl and Python.

The Icon Programming Language -- the main page

FAQ - Icon Programming Language

Books:


Random Findings

A remark about the danger of mixing arbitrary languages in large projects (in this case Java and Perl). It is true that Perl and Java are an odd couple for a large project (everybody probably would have been better off using Jython for this particular project ;-)

Scripting In Large Projects (Score:2)
by Jboy_24 (88864) on Monday February 24, @04:02PM (#5373095)
(http://www.john-nelson.org/factor/)

is like a bacterial infestation. I worked on a large Perl-based ecommerce project and a large Java-based ecommerce project. In the end, to ensure quality code we had to make 100% sure that "use strict" was used, and we had to forbid many things Perl programmers pride themselves on in order to get 8 developers to stop duplicating work, stop stepping on each other's code, and keep our code malleable to changes in specs.

In the Java project it was sooo much easier. Sure, it took a little longer to start up -- creating the Beans, the database layer, etc. -- but once we were going everyone used the code we created, and adding features and dealing with changing specs were SOO much easier.

Now comes the point of the title: we were on a tight deadline, so the bosses got a team from another part of the company to write a PDF generator. That piece came in Perl. Now, the piece was written by good, skilled programmers, but dealing with different error log locations, creating processes for the Perl interpreter to live in, etc., was a nightmare. If we had paid the $$ for a 3rd-party Java PDF writer, or developed our own, we could have saved a good 2-3 man-months on the code. I learned pretty quickly, as the only 'Perl' guy on the Java side of the project: you should NEVER, EVER mix languages in a project.

Scripting languages are fine for small one- or two-page CGI programs, but unless you can crack a whip and get the programmers to fall in line, you'd better let the language and environment do that.

btw, J2EE is frustrating to script programmers because it was DESIGNED to be. But if you were ever in charge of divvying out tasks in a large project, you'd realize how J2EE was designed for you.




The last but not least: "Technology is dominated by two types of people: those who understand what they do not manage and those who manage what they do not understand." ~ Archibald Putt, Ph.D.


Copyright © 1996-2021 by Softpanorama Society. www.softpanorama.org was initially created as a service to the (now defunct) UN Sustainable Development Networking Programme (SDNP) without any remuneration. This document is an industrial compilation designed and created exclusively for educational use and is distributed under the Softpanorama Content License. Copyright of original materials belongs to their respective owners. Quotes are made for educational purposes only, in compliance with the fair use doctrine.

FAIR USE NOTICE This site contains copyrighted material the use of which has not always been specifically authorized by the copyright owner. We are making such material available to advance understanding of computer science, IT technology, economic, scientific, and social issues. We believe this constitutes a 'fair use' of any such copyrighted material as provided by section 107 of the US Copyright Law according to which such material can be distributed without profit exclusively for research and educational purposes.

This is a Spartan WHYFF (We Help You For Free) site written by people for whom English is not a native language. Grammar and spelling errors should be expected. The site contains some broken links, as it develops like a living tree...

You can use PayPal to buy a cup of coffee for the authors of this site.

Disclaimer:

The statements, views and opinions presented on this web page are those of the author (or referenced source) and are not endorsed by, nor do they necessarily reflect, the opinions of the Softpanorama society. We do not warrant the correctness of the information provided or its fitness for any purpose. The site uses AdSense, so you need to be aware of Google's privacy policy. If you do not want to be tracked by Google, please disable JavaScript for this site. This site is perfectly usable without JavaScript.

Last modified: October 14, 2020