Softpanorama

May the source be with you, but remember the KISS principle ;-)
(slightly skeptical) Educational society promoting "Back to basics" movement against IT overcomplexity and  bastardization of classic Unix

Unix find tutorial


Part 3: Finding files using file name or path


Introduction

Find is now more than 40 years old, and naturally there are several generations of it. Since we are talking about GNU find, there are multiple versions of it too, each with a different level of support for regular expressions. Here we see the sad truth of Donald Knuth's humorous definition of Unix/Linux as an OS with six different types of regular expressions.

The current versions of GNU find (4.5.11 as of Sept 2014) support more than six different types of regular expressions, with the exception of the most needed one: as of August 2014 it still cannot use Perl regular expressions. Starting in 1997, Philip Hazel developed PCRE (Perl Compatible Regular Expressions), a library that attempts to closely mimic Perl's regular expression functionality and is used by many modern tools, including PHP and Apache HTTP Server. Unfortunately, GNU find does not use it yet.

Name predicate and shell patterns

The first and probably the most popular way of finding files by name is the -name predicate, which takes a shell pattern (a shell-style or DOS-style wildcard pattern) rather than a true regular expression:

-name "shell_pattern"
Shell patterns, also called "shell globbing patterns", are different from POSIX regular expressions and Perl regular expressions, but they are well known to people who use the Unix shell (or the DOS shell).

"Globbing patterns" are not as powerful as regular expressions, but they are easier to read, and they are convenient for simple matching of filenames. They also are well known by most Unix sysadmins. See below for more complete discussion.

Nuances of usage of name predicate

The most common name-related predicate used with find is -name.

The -name predicate operates only on the basename of the file (with the path removed). The expression is true if the file name matches the specified shell pattern. For example, to find files with the extension .conf in the /etc directory:

find /etc -name '*.conf'

The predicate -iname pattern does the same thing, but the matching is case-insensitive.
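The difference between -name and -iname, and the fact that both look only at the base name, can be checked on a throwaway tree. This is a minimal sketch: the directory and file names (etc/app, main.conf, BACKUP.CONF) are made up for illustration, and the availability of mktemp is assumed.

```shell
# Build a scratch tree so the commands are safe to run anywhere.
demo=$(mktemp -d)
mkdir -p "$demo/etc/app"
touch "$demo/etc/app/main.conf" "$demo/etc/app/BACKUP.CONF" "$demo/etc/app/notes.txt"

# -name is case-sensitive: only main.conf matches.
name_hits=$(cd "$demo" && find etc -name '*.conf' | LC_ALL=C sort)

# -iname ignores case: BACKUP.CONF matches as well.
iname_hits=$(cd "$demo" && find etc -iname '*.conf' | LC_ALL=C sort)

# -name sees only the base name, so a pattern containing a slash
# never matches (GNU find also prints a warning, discarded here).
slash_hits=$(cd "$demo" && find etc -name 'app/*.conf' 2>/dev/null | LC_ALL=C sort)

printf '%s\n' "-name:" "$name_hits" "-iname:" "$iname_hits" "slash: [$slash_hits]"
rm -rf "$demo"
```

To match on directory components as well, use a predicate that tests the whole path rather than -name.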

Notes:

Searching fully qualified file name and path

The predicate -name (or -iname) is not the only game in town. There are two other important possibilities:

Using

-path shell_pattern

you can match the pattern against the whole path of the file instead of just its name.

The predicate -wholename does the same thing: in GNU find it is a synonym for -path, and it is less portable. In both cases the pattern is matched against the path plus the name. With a relative directory path such as ./my, the path starts from the start point of the search, so it is not a fully qualified file name.

The predicates -ipath and -iwholename are similar, but the match is case-insensitive.

Note: For the predicates -path, -wholename, -ipath and -iwholename, a path consists of all the directories traversed from find's start point to the file being tested, followed by the base name of the file itself. These paths are absolute only if the start point itself is given as an absolute path (for example, the root directory).

For example

cd /tmp
mkdir -p foo/bar/baz
find foo -path foo/bar -print # the first find command (prints foo/bar)
find foo -path /tmp/foo/bar -print # the second find command (prints nothing)
find /tmp/foo -path /tmp/foo/bar -print # the third find command (prints /tmp/foo/bar)

Notice that because the search starts at foo, the second find command prints nothing, even though /tmp/foo/bar exists.
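The same rule applies when the start point is the current directory: every tested path then begins with "./", and the pattern has to account for that prefix. The sketch below also shows -ipath. The tree (Src/lib/util.c, Src/main.c) is hypothetical and exists only for the demonstration.

```shell
# Scratch tree with a mixed-case directory name.
demo=$(mktemp -d)
mkdir -p "$demo/Src/lib"
touch "$demo/Src/lib/util.c" "$demo/Src/main.c"

# The start point is ".", so tested paths look like ./Src/main.c.
# The * also crosses the slash, so files in Src/lib match too.
dot_hits=$(cd "$demo" && find . -path './Src/*.c' | LC_ALL=C sort)

# Same search without the "./" prefix in the pattern: nothing matches.
bare_hits=$(cd "$demo" && find . -path 'Src/*.c' | LC_ALL=C sort)

# -ipath ignores case, so "src" in the pattern matches the directory "Src".
ipath_hits=$(cd "$demo" && find . -ipath './src/*.c' | LC_ALL=C sort)

printf '%s\n' "dot:" "$dot_hits" "bare: [$bare_hits]" "ipath:" "$ipath_hits"
rm -rf "$demo"
```

A pattern that starts with a wildcard, such as '*/Src/*.c', avoids the dependence on how the start point was written.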

Unlike file name expansion on the command line, a * in the pattern will match both / and leading dots in file names:

find .  -path '*f'
./quux/bar/baz/f
find .  -path '*/*config'
./quux/bar/baz/.config

Regular expressions

The name tests of find (-name, -path and so on) use shell pattern matching (DOS-style wildcards), not regular expressions. Here is a quote from the GNU find manual (Finding Files):

find and locate can compare file names, or parts of file names, to shell patterns. A shell pattern is a string that may contain the following special characters, which are known as wildcards or metacharacters.

You must quote patterns that contain metacharacters to prevent the shell from expanding them itself. Double and single quotes both work; so does escaping with a backslash.

*
Matches any zero or more characters.
?
Matches any one character.
[string]
Matches exactly one character that is a member of the string string. This is called a character class. As a shorthand, string may contain ranges, which consist of two characters with a dash between them. For example, the class [a-z0-9_] matches a lowercase letter, a number, or an underscore. You can negate a class by placing a ! or ^ immediately after the opening bracket. Thus, [^A-Z@] matches any character except an uppercase letter or an at sign.
\
Removes the special meaning of the character that follows it. This works even in character classes.

In the find tests that do shell pattern matching ( -name , -wholename , etc.), wildcards in the pattern will match a . at the beginning of a file name. This is also the case for locate. Thus, find -name '*macs' will match a file named .emacs, as will locate '*macs' .

Slash characters have no special significance in the shell pattern matching that find and locate do, unlike in the shell, in which wildcards do not match them. Therefore, a pattern foo*bar can match a file name foo3/bar , and a pattern ./sr*sc can match a file name ./src/misc .

If you want to locate some files with the locate command but don't need to see the full list you can use the --limit option to see just a small number of results, or the --count option to display only the total number of matches.
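The wildcard rules quoted above can be tried out with find on a scratch directory. This is only an illustrative sketch: the file names (app1.log, appX.log, .hidden.log, a*b and so on) are invented for the purpose, and every pattern is single-quoted so that the shell passes it to find unchanged.

```shell
# Scratch directory with a few deliberately awkward file names.
demo=$(mktemp -d)
mkdir -p "$demo/logs"
touch "$demo/logs/app1.log" "$demo/logs/app2.log" "$demo/logs/appX.log" \
      "$demo/logs/.hidden.log" "$demo/logs/a*b" "$demo/logs/axb"

# [0-9] is a character class: exactly one digit.
class_hits=$(cd "$demo" && find . -name 'app[0-9].log' | LC_ALL=C sort)

# [!0-9] negates the class: exactly one character that is not a digit.
neg_hits=$(cd "$demo" && find . -name 'app[!0-9].log' | LC_ALL=C sort)

# In find, * also matches a leading dot in the file name.
star_hits=$(cd "$demo" && find . -name '*hidden.log' | LC_ALL=C sort)

# A backslash removes the special meaning of *: only the file
# literally named a*b matches, not axb.
escaped_hits=$(cd "$demo" && find . -name 'a\*b' | LC_ALL=C sort)

printf '%s\n' "$class_hits" "$neg_hits" "$star_hits" "$escaped_hits"
rm -rf "$demo"
```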

The type of regular expression that the -regex and -iregex tests will use can be specified with the option -regextype. In the best GNU traditions you need to select from several options, only half of which are useful (see Finding Files)

See Regular Expressions for more information on the regular expression dialects... There are many books about regular expressions that provide good guidance in this esoteric area. For in-depth coverage of regular expressions, see the recommendations on the page Best books about Regular Expressions
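As a rough illustration of -regextype, the sketch below selects the posix-extended dialect. It assumes GNU find (the option is a GNU extension), and the file names are made up. Two points are worth noting: -regextype has to appear before the -regex test it should affect, and -regex must match the whole path as find prints it, not just a part of it.

```shell
# Scratch directory with hypothetical report files.
demo=$(mktemp -d)
touch "$demo/report1.txt" "$demo/report22.log" "$demo/report.txt" "$demo/notes.md"

# posix-extended allows + and (a|b) without backslashes.
# The regex covers the whole path, including the leading "./".
regex_hits=$(cd "$demo" && find . -regextype posix-extended \
    -regex '\./report[0-9]+\.(txt|log)' | LC_ALL=C sort)

# Without the "./" part the regex cannot match the whole path,
# so nothing is printed.
partial_hits=$(cd "$demo" && find . -regextype posix-extended \
    -regex 'report[0-9]+\.(txt|log)' | LC_ALL=C sort)

printf '%s\n' "full:" "$regex_hits" "partial: [$partial_hits]"
rm -rf "$demo"
```

Starting the expression with '.*/' is a common way to avoid depending on the exact form of the start point.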






Copyright © 1996-2021 by Softpanorama Society. www.softpanorama.org was initially created as a service to the (now defunct) UN Sustainable Development Networking Programme (SDNP) without any remuneration. This document is an industrial compilation designed and created exclusively for educational use and is distributed under the Softpanorama Content License. Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.


Last modified: March 12, 2019;