Softpanorama

May the source be with you, but remember the KISS principle ;-)
(slightly skeptical) Educational society promoting "Back to basics" movement against IT overcomplexity and  bastardization of classic Unix

Unix find tutorial


Part 12: Typical Errors In Using Find


Find is a complex and powerful utility, and Unix sysadmin folklore contains many examples of the tremendous damage that a sysadmin can do to a system by using find incorrectly.

Probably the most important source of blunders is using the -exec option without sufficient testing under time pressure. "Make haste slowly" is one of the sayings that is very true for sysadmins. Sometimes your emotional state contributes to the problem: you did not get much sleep, or your mind was distracted by problems in your personal life. On such days it is important to slow down and be extra cautious.

The utility think from Softpanorama sysadmin utilities provides some limited protection against this type of blunder. It also allows "on the fly" conversion of a find -exec or find -delete command into a find -ls command, so that you can run the converted command first and see the list of files affected.
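The same dry-run habit can be practiced by hand, without any helper utility. The following is a minimal sketch under stated assumptions: the sandbox directory and the file names are made up for illustration, and the only find features used are -print and -exec with the {} + form.

```shell
# Throwaway sandbox; all names here are illustrative assumptions.
sandbox=$(mktemp -d)
touch "$sandbox/keep.conf" "$sandbox/old1.tmp" "$sandbox/old2.tmp"

# Step 1: dry run. Same tests as the real command, but the action only lists.
preview=$(find "$sandbox" -type f -name '*.tmp' -print | sort)
printf '%s\n' "$preview"

# Step 2: only after reviewing the list, repeat the SAME tests with the action.
find "$sandbox" -type f -name '*.tmp' -exec rm -- {} +

# Check what survived.
remaining=$(find "$sandbox" -type f -print)
printf 'left: %s\n' "$remaining"
```

The point of keeping the tests identical in both steps is that the preview is only trustworthy if nothing but the action changes between the two runs.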

If we try to classify typical blunders in using find, they fall into several categories:

Find filesystem traversal errors

One common mistake when running find from the root directory, or from a top-level directory such as /var or /opt, is forgetting that the tree can contain mounted filesystems with write access. If you intend to make changes only on the local filesystem, always put -xdev in the find command. That prevents find from traversing mounted NFS and other filesystems. Here is one such story:

If you're doing this using find always put -xdev in:

 find /tmp/ -xdev -fstype 4.2 -type f -atime +5 -exec rm {} \;

This stops find from working its way down filesystems mounted under /tmp/. If you're using, say, perl you have to stat . and .. and see if they are mounted on the same device. The fstype 4.2 is pure paranoia.

Needless to say, I once forgot to do this. All was well for some weeks until Convex's version of NQS decided to temporarily mount /mnt under /tmp... Interestingly, only two people noticed. Yes, the chief op. Keeps good backups!

Other triumphs: I created a list of a user's files that hadn't been accessed for three months and a perl script for him to delete them. Of course, it had to be tested, I mislaid a quote from a print statement... This did turn into a triumph, he only wanted a small fraction of them back so we saved 20 MB.

I once deleted the only line from within an if.. then statement in rc.local, the sun refused to come up, and it was surprisingly difficult to come up single user with a writeable file system.

AIX is a whole system of nightmares strung together. If you stray outside of the sort of setup IBM implicitly assume you have (all IBM kit, no non IBM hosts on the network, etc.) you're liable to end up in deep doodoo.

One thing I would like all vendors to do (I know one or two do) is to give root the option of logging in using another shell. Am I the only one to have mangled a root shell?

John Rowe
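A present-day version of the advice in this story might look like the sketch below. It is only an illustration: the sandbox tree and the *.log pattern are assumptions, and on a real system you would also want to inspect the mount table before trusting the result.

```shell
# Illustrative sandbox; directory and file names are assumptions.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/local/sub"
touch "$sandbox/local/a.log" "$sandbox/local/sub/b.log"

# Which filesystem does the starting point live on? (df -P is the POSIX format.)
df -P "$sandbox/local"

# -xdev: do not descend into directories that are on a different filesystem.
# This is a preview only; nothing is removed here.
found=$(find "$sandbox/local" -xdev -type f -name '*.log' -print | sort)
printf '%s\n' "$found"
```

In this sandbox everything is on one filesystem, so -xdev changes nothing; its effect only shows when something is mounted below the starting directory.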

Another common find filesystem traversal error is the side effect of performing operations on home or application directories that contain links to other directories, especially to system directories. This is a pretty common mistake, and I have committed it myself several times with various, but always unpleasant, consequences. Here is one example 
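A minimal sketch of how such an accident can happen, using a throwaway directory (all paths are hypothetical and not taken from a real incident). By default find does not follow symbolic links, but with -L it does, and the search escapes into the linked tree.

```shell
# Hypothetical layout: an application directory with a link into a "system" tree.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/home/app" "$sandbox/system"
touch "$sandbox/home/app/data.txt" "$sandbox/system/critical.conf"
ln -s "$sandbox/system" "$sandbox/home/app/syslink"

# Default behaviour (-P): the symlink is not descended into.
physical=$(find "$sandbox/home" -type f -print | sort)

# With -L the symlink is followed, so files outside the tree show up too.
logical=$(find -L "$sandbox/home" -type f -print | sort)

# Before any recursive change, list the symlinks in the tree and review them.
links=$(find "$sandbox/home" -type l -print)

printf 'physical:\n%s\nlogical:\n%s\nlinks:\n%s\n' "$physical" "$logical" "$links"
```

Recursive options of other tools (chown -R, chmod -R and so on) have their own rules for symlinks, so it is worth checking the manual page of the tool that will actually make the change.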

Gotchas connected with the presence of spaces or special characters in file names

Here is an extreme example of the problems that using blank-delimited names can cause. If the following command is run daily from cron, then any user can remove any file on the system:

     find / -name '#*' -atime +7 -print | xargs rm

To delete other files, for example /u/joeuser/.plan, you could do this:

     eg$ mkdir '#
     '
     eg$ cd '#
     '
     eg$ mkdir u u/joeuser u/joeuser/.plan'
     '
     eg$ echo > u/joeuser/.plan'
     /#foo'
     eg$ cd ..
     eg$ find . -name '#*' -print | xargs echo
     ./# ./# /u/joeuser/.plan /#foo
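The usual way around this problem is to let find hand each name to the command as a single argument instead of piping a text list through xargs. The sketch below is an illustration with made-up file names; it uses a space rather than a newline in the name, which is enough to show the splitting.

```shell
# Illustrative sandbox: one junk file with a space in its name, one innocent file.
sandbox=$(mktemp -d)
cd "$sandbox" || exit 1
touch '#old report' 'report'

# Unsafe: xargs splits its input on blanks, so the single name "./#old report"
# becomes two words, and the second word names the innocent file.
unsafe=$(find . -name '#*' -print | xargs -n 1 echo)
printf 'xargs would pass:\n%s\n' "$unsafe"

# Safer: -exec ... {} + passes every file name as one whole argument.
safe=$(find . -name '#*' -exec printf '%s\n' {} +)
printf 'find -exec passes:\n%s\n' "$safe"

# The actual removal, done the safe way.
find . -name '#*' -exec rm -- {} +
```

Where both tools support it (GNU and BSD versions do), find -print0 piped into xargs -0 is another common way to get the same protection.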

 

Not testing a complex change or deletion using find, especially before executing it on a production box

Such errors are often made under time pressure. They are often real mini-disasters, and they are usually connected with attempts to use find for a recursive change of files in a certain subtree using rm, chown, or chmod. Such attempts are dangerous if you make them without first testing with -ls to see the set of files to which the operation will be applied.

If you attempt to make changes that involve system directories, it is better to do it in two stages. First create a file with the list of changes using find and verify that it is accurate. Then use xargs to process this file.
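The two stages can be sketched as follows. This is an illustration only: the tree, the *.bak pattern, and the list file name are assumptions, and the NUL-separated list relies on -print0 and xargs -0, which GNU and BSD tools provide.

```shell
# Illustrative sandbox; names are assumptions.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/tree"
touch "$sandbox/tree/a.bak" "$sandbox/tree/b.bak" "$sandbox/tree/keep.txt"
list="$sandbox/targets.list"

# Stage 1: record the candidates (NUL-separated, so odd names survive intact).
find "$sandbox/tree" -type f -name '*.bak' -print0 > "$list"

# Review the list in human-readable form before doing anything else.
tr '\0' '\n' < "$list" | sort

# Stage 2: act on exactly the reviewed list, not on a fresh find run.
xargs -0 rm -- < "$list"

left=$(find "$sandbox/tree" -type f -print)
printf 'left: %s\n' "$left"
```

Working from the saved list means the second stage touches only what was reviewed, even if the tree changes between the two stages.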

You should always use the ls -Rl command to test complex rm -R commands (-R, --recursive means process subdirectories recursively).

Unintended mass changes of file ownership or file permissions are also common when using chown or chmod with find. Here are a couple of examples:
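One hypothetical illustration (the directory names and modes are assumptions, not taken from a real incident): a single recursive chmod treats directories and regular files alike, while find with -type lets you preview and change each kind separately.

```shell
# Illustrative sandbox with deliberately restrictive starting permissions.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/site/docs"
touch "$sandbox/site/index.html" "$sandbox/site/docs/readme.txt"
chmod 700 "$sandbox/site" "$sandbox/site/docs"
chmod 600 "$sandbox/site/index.html" "$sandbox/site/docs/readme.txt"

# Preview which objects each command would touch.
find "$sandbox/site" -type d -print
find "$sandbox/site" -type f -print

# Directories need the x bit to be traversable; regular files usually do not.
find "$sandbox/site" -type d -exec chmod 755 {} +
find "$sandbox/site" -type f -exec chmod 644 {} +

# Read back the first ten characters of the mode string for a spot check.
dir_mode=$(ls -ld "$sandbox/site/docs" | cut -c1-10)
file_mode=$(ls -l "$sandbox/site/index.html" | cut -c1-10)
printf '%s %s\n' "$dir_mode" "$file_mode"
```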

Problems with find have been understood for several decades. Here is an amusing description from The UNIX-HATERS Handbook:

Find

 

The most horrifying thing about Unix is that, no matter how many
 times you hit yourself over the head with it, you never quite manage
 to lose consciousness. It just goes on and on.

—Patrick Sobalvarro

Losing a file in a large hierarchical filesystem is a common occurrence. (Think of Imelda Marcos trying to find her pink shoes with the red toe ribbon among all her closets.) This problem is now hitting PC and Apple users with the advent of large, cheap disks. To solve this problem computer systems provide programs for finding files that match given criteria, that have a particular name, or type, or were created after a particular date. The Apple Macintosh and Microsoft Windows have powerful file locators that are relatively easy to use and extremely reliable. These file finders were designed with a human user and modern networking in mind. The Unix file finder program, find, wasn’t designed to work with humans, but with cpio —a Unix backup utility program. Find couldn’t anticipate networks or enhancements to the file system such as symbolic links; even after extensive modifications, it still doesn’t work well with either. As a result, despite its importance to humans who’ve misplaced their files, find doesn’t work reliably or predictably.

The authors of Unix tried to keep find up to date with the rest of Unix, but it is a hard task. Today’s find has special flags for NFS file systems, symbolic links, executing programs, conditionally executing programs if the user types “y,” and even directly archiving the found files in cpio or cpio-c format. Sun Microsystems modified find so that a background daemon builds a database of every file in the entire Unix file system which, for some strange reason, the find command will search if you type “find filename” without any other arguments. (Talk about a security violation!) Despite all of these hacks, find still doesn’t work properly.

For example, the csh follows symbolic links, but find doesn’t: csh was written at Berkeley (where symbolic links were implemented), but find dates back to the days of AT&T, pre-symlink. At times, the culture clash between East and West produces mass confusion.

Date: Thu, 28 Jun 1990 18:14 EDT
From: [email protected]
Subject: more things to hate about Unix
To: UNIX-HATERS

This is one of my favorites. I’m in some directory, and I want to search another directory for files, using find. I do:

    po> pwd
    /ath/u1/pgs
    po> find ~halstead -name "*.trace" -print
    po>

The files aren’t there. But now:

    po> cd ~halstead
    po> find . -name "*.trace" -print
    ./learnX/fib-3.trace
    ./learnX/p20xp20.trace
    ./learnX/fib-3i.trace
    ./learnX/fib-5.trace
    ./learnX/p10xp10.trace
    po>

Hey, now the files are there! Just have to remember to cd to random directories in order to get find to find things in them. What a crock of Unix.

Poor Halstead must have the entry for his home directory in /etc/passwd pointing off to some symlink that points to his real directory, so some commands work for him and some don’t.

Why not modify find to make it follow symlinks? Because then any symlink that pointed to a directory higher up the tree would throw find into an endless loop. It would take careful forethought and real programming to design a system that didn’t scan endlessly over the same directory time after time. The simple, Unix, copout solution is just not to follow symlinks, and force the users to deal with the result.

As networked systems become more and more complicated, these problems are becoming harder and harder:

Date: Wed, 2 Jan 1991 16:14:27 PST
From: Ken Harrenstien <[email protected]>
Subject: Why find doesn’t find anything
To: UNIX-HATERS

I just figured out why the “find” program isn’t working for me anymore.

Even though the syntax is rather clumsy and gross, I have relied on it for a long time to avoid spending hours fruitlessly wandering up and down byzantine directory hierarchies in search of the source for a program that I know exists somewhere (a different place on each machine, of course).

It turns out that in this brave new world of NFS and symbolic links, “find” is becoming worthless. The so-called file system we have here is a grand spaghetti pile combining several different fileservers with lots and lots of symbolic links hither and thither, none of which the program bothers to follow up on. There isn’t even a switch to request this… the net effect is that enormous chunks of the search space are silently excluded. I finally realized this when my request to search a fairly sizeable directory turned up nothing (not entirely surprising, but it did nothing too fast) and investigation finally revealed that the directory was a symbolic link to some other place.

I don’t want to have to check out every directory in the tree I give to find—that should be find’s job, dammit. I don’t want to mung the system software every time misfeatures like this come up. I don’t want to waste my time fighting SUN or the entire universe of Unix weeniedom. I don’t want to use Unix. Hate, hate, hate, hate, hate, hate, hate.

—Ken (feeling slightly better but still pissed)

Writing a complicated shell script that actually does something with the files that are found produces strange results, a sad result of the shell’s method for passing arguments to commands.

Date: Sat, 12 Dec 92 01:15:52 PST
From: Jamie Zawinski <[email protected]>
Subject: Q: what’s the opposite of ‘find?’ A: ‘lose.’
To: UNIX-HATERS

I wanted to find all .el files in a directory tree that didn’t have a corresponding .elc file. That should be easy. I tried to use find.

What was I thinking.

First I tried:

    % find . -name '*.el' -exec 'test -f {}c'
    find: incomplete statement

Oh yeah, I remember, it wants a semicolon.

    % find . -name '*.el' -exec 'test -f {}c' \;
    find: Can't execute test -f {}c:
    No such file or directory

Oh, great. It’s not tokenizing that command like most other things do.

    % find . -name '*.el' -exec test -f {}c \;

Well, that wasn’t doing anything…

    % find . -name '*.el' -exec echo test -f {}c \;
    test -f c
    test -f c
    test -f c
    test -f c

Great. The shell thinks curly brackets are expendable.

    % find . -name '*.el' -exec echo test -f '{}'c \;
    test -f {}c
    test -f {}c
    test -f {}c
    test -f {}c
    ...

Huh? Maybe I’m misremembering, and {} isn’t really the magic “substitute this file name” token that find uses. Or maybe…

    % find . -name '*.el' \
    -exec echo test -f '{}' c \;
    test -f ./bytecomp/bytecomp-runtime.el c
    test -f ./bytecomp/disass.el c
    test -f ./bytecomp/bytecomp.el c
    test -f ./bytecomp/byte-optimize.el c
    ...

Oh, great. Now what. Let’s see, I could use “sed…”

Now at this point I should have remembered that profound truism: “Some people, when confronted with a Unix problem, think ‘I know, I’ll use sed.’ Now they have two problems.”

Five tries and two searches through the sed man page later, I had come up with:

    % echo foo.el | sed 's/$/c/'
    foo.elc

and then:

    % find . -name '*.el' \
    -exec echo test -f `echo '{}' \
    | sed 's/$/c/'` \;
    test -f c
    test -f c
    test -f c
    ...

OK, let’s run through the rest of the shell-quoting permutations until we find one that works.

    % find . -name '*.el' \
    -exec echo test -f "`echo '{}' |\
    sed 's/$/c/'`" \;
    Variable syntax.

    % find . -name '*.el' \
    -exec echo test -f '`echo "{}" |\
    sed "s/$/c/"`' \;
    test -f `echo "{}" | sed "s/$/c/"`
    test -f `echo "{}" | sed "s/$/c/"`
    test -f `echo "{}" | sed "s/$/c/"`
    ...

Hey, that last one was kind of close. Now I just need to…

    % find . -name '*.el' \
    -exec echo test -f '`echo {} | \
    sed "s/$/c/"`' \;
    test -f `echo {} | sed "s/$/c/"`
    test -f `echo {} | sed "s/$/c/"`
    test -f `echo {} | sed "s/$/c/"`
    ...

Wait, that’s what I wanted, but why isn’t it substituting the filename for the {}??? Look, there are spaces around it, what do you want, the blood of a goat spilt under a full moon?

Oh, wait. That backquoted form is one token.

Maybe I could filter the backquoted form through sed. Um. No.

So then I spent half a minute trying to figure out how to do something that involved “-exec sh -c …”, and then I finally saw the light,  and wrote some emacs-lisp code to do it. It was easy. It was fast. It worked.

I was happy. I thought it was over.

But then in the shower this morning I thought of a way to do it. I couldn’t stop myself. I tried and tried, but the perversity of the task had pulled me in, preying on my morbid fascination. It had the same attraction that the Scribe implementation of Towers of Hanoi has. It only took me 12 tries to get it right. It only spawns two processes per file in the directory tree we're iterating over. It’s the Unix Way!

    % find . -name '*.el' -print \
    | sed 's/^/FOO=/'|\
    sed 's/$/; if [ ! -f \ ${FOO}c ]; then \
    echo \ $FOO ; fi/' | sh

BWAAAAAHH HAAAAHH HAAAAHH HAAAAHH
HAAAAHH HAAAAHH HAAAAHH HAAAAHH HAAAAHH!!!!

—Jamie
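With a POSIX find and shell, the task from this letter (list every .el file that has no matching .elc file) can be sketched in one command. This is only an illustration under stated assumptions: the sandbox and file names are made up, and the trick is to pass the file names to an inline sh script as positional parameters instead of embedding {} inside a quoted string.

```shell
# Illustrative sandbox: one compiled pair and one source file without a .elc.
sandbox=$(mktemp -d)
cd "$sandbox" || exit 1
touch compiled.el compiled.elc source-only.el

# The names arrive in the inline script as "$@", so no quoting games with {}.
missing=$(find . -name '*.el' -exec sh -c '
    for f in "$@"; do
        [ -f "${f}c" ] || printf "%s\n" "$f"
    done' sh {} +)

printf '%s\n' "$missing"
```

The extra word sh after the script becomes $0 of the inline shell, and the file names supplied by {} + become $1, $2 and so on.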


Copyright © 1996-2021 by Softpanorama Society. www.softpanorama.org was initially created as a service to the (now defunct) UN Sustainable Development Networking Programme (SDNP) without any remuneration. This document is an industrial compilation designed and created exclusively for educational use and is distributed under the Softpanorama Content License. Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.


Last modified: November 28, 2020;
