Softpanorama


Unix find tutorial


Part 3: Finding files using file name or path


Introduction

Find is now more than 40 years old, and naturally there are several generations of it. Even GNU find, the subject of this tutorial, exists in multiple versions, each with a different level of support for regular expressions. Here we see the sad truth of the humorous definition, attributed to Donald Knuth, of Unix/Linux as an OS with six different types of regular expressions.

The current versions of GNU find (4.5.11 as of September 2014) support more than six different types of regular expressions, but not the most needed one: as of August 2014 find still cannot use Perl regular expressions. Starting in 1997, Philip Hazel developed PCRE (Perl Compatible Regular Expressions), a library that attempts to closely mimic Perl's regular expression functionality and is used by many modern tools, including PHP and Apache HTTP Server. Unfortunately, GNU find does not use it yet.

Name predicate and shell patterns

The first and probably the most popular way to find files by pattern is the -name predicate, which takes a shell-style (DOS-style) pattern, called a basic regular expression in this tutorial:

-name "basic_regular_expression"
Basic regular expressions, also called "shell globbing patterns", are different from POSIX regular expressions and Perl regular expressions, but they are well known to people who use the Unix shell (or the DOS shell). The key ideas are:

"Globbing patterns" are not as powerful as regular expressions, but they are easier to read and convenient for simple matching of file names. They are also well known to most Unix sysadmins. See below for a more complete discussion.

Nuances of usage of name predicate

The most common name-related predicate used with find is -name.

The -name predicate operates only on the base name of the file (with the path removed). The expression is true if the file name matches the specified shell pattern. For example, to find files with the extension .conf in the /etc directory:

find /etc -name '*.conf'

The predicate -iname pattern does the same thing, but the matching is case-insensitive.
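The difference between -name and -iname can be sketched in a scratch directory. The directory comes from mktemp and the file names (app.conf, DB.CONF, conf.d/notes.txt) are made up for this illustration; they are not from the tutorial.

```shell
# Scratch directory with hypothetical file names (illustration only).
demo=$(mktemp -d)
mkdir -p "$demo/conf.d"
touch "$demo/app.conf" "$demo/DB.CONF" "$demo/conf.d/notes.txt"

# -name tests only the base name and is case sensitive:
# DB.CONF is skipped, and neither conf.d nor notes.txt ends in ".conf".
name_hits=$(find "$demo" -name '*.conf')

# -iname applies the same pattern but ignores case.
iname_hits=$(find "$demo" -iname '*.conf' | LC_ALL=C sort)

printf '%s\n' "-name:" "$name_hits" "-iname:" "$iname_hits"
rm -rf "$demo"
```

The pattern is in single quotes so that the shell passes it to find unchanged instead of expanding it against the current directory.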

Notes:

Searching fully qualified file name and path

The -name predicate (and -iname) is not the only game in town. There are other important possibilities:

Using

-path shell_pattern

you can match the pattern against the whole path of the file instead of just its name.

The predicate -wholename does the same thing: in GNU find it is a synonym for -path, and -path is the more portable spelling. In both cases the pattern is matched against the path plus name as find builds it from the starting point, so with a relative starting directory such as ./my the tested path is not a fully qualified file name.

The predicates -ipath and -iwholename are the case-insensitive versions of -path and -wholename.

Note: For the predicates -path, -wholename, -ipath and -iwholename, a path consists of all the directories traversed from find's start point to the file being tested, followed by the base name of the file itself. These paths are absolute only if the search starts from an absolute path, such as the root directory.

For example

cd /tmp
mkdir -p foo/bar/baz
find foo -path foo/bar -print # the first find command
foo/bar
find foo -path /tmp/foo/bar -print # the second find command (prints nothing)
find /tmp/foo -path /tmp/foo/bar -print # the third find command
/tmp/foo/bar

Notice that because the search starts at foo, the second find command prints nothing, even though /tmp/foo/bar exists.
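The same behaviour can be checked for the other path predicates. The sketch below assumes GNU find, whose manual documents -wholename as a synonym for -path. The directory names Foo and Bar are hypothetical and chosen only to show case sensitivity.

```shell
# Scratch tree with hypothetical names (illustration only).
demo=$(mktemp -d)
mkdir -p "$demo/Foo/Bar"
cd "$demo" || exit 1

# The tested path begins with the starting point exactly as typed ("Foo").
path_hit=$(find Foo -path 'Foo/Bar')

# -wholename is expected to give the same result as -path.
wholename_hit=$(find Foo -wholename 'Foo/Bar')

# -path is case sensitive, so a lowercase pattern finds nothing here ...
path_lower=$(find Foo -path 'foo/bar')

# ... while -ipath ignores case.
ipath_lower=$(find Foo -ipath 'foo/bar')

printf 'path=%s wholename=%s path_lower=%s ipath_lower=%s\n' \
    "$path_hit" "$wholename_hit" "${path_lower:-<none>}" "$ipath_lower"

cd / && rm -rf "$demo"
```

Because all four tests see the path as find constructs it, changing the starting point from Foo to "$demo/Foo" would require changing every pattern as well.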

Unlike file name expansion on the command line, a * in the pattern will match both / and leading dots in file names:

find .  -path '*f'
./quux/bar/baz/f
find .  -path '*/*config'
./quux/bar/baz/.config

Regular expressions

The name-matching predicates of find use shell patterns (DOS-style wildcards, called basic regular expressions above) by default. Here is a quote from the GNU find manual (Finding Files):

find and locate can compare file names, or parts of file names, to shell patterns. A shell pattern is a string that may contain the following special characters, which are known as wildcards or metacharacters.

You must quote patterns that contain metacharacters to prevent the shell from expanding them itself. Double and single quotes both work; so does escaping with a backslash.

*
Matches any zero or more characters.
?
Matches any one character.
[string]
Matches exactly one character that is a member of the string string. This is called a character class. As a shorthand, string may contain ranges, which consist of two characters with a dash between them. For example, the class [a-z0-9_] matches a lowercase letter, a number, or an underscore. You can negate a class by placing a ! or ^ immediately after the opening bracket. Thus, [^A-Z@] matches any character except an uppercase letter or an at sign.
\
Removes the special meaning of the character that follows it. This works even in character classes.

In the find tests that do shell pattern matching ( -name , -wholename , etc.), wildcards in the pattern will match a . at the beginning of a file name. This is also the case for locate. Thus, find -name '*macs' will match a file named .emacs, as will locate '*macs' .

Slash characters have no special significance in the shell pattern matching that find and locate do, unlike in the shell, in which wildcards do not match them. Therefore, a pattern foo*bar can match a file name foo3/bar , and a pattern ./sr*sc can match a file name ./src/misc .
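The wildcard rules quoted above can be tried out with a short sketch. The file names are invented for the illustration, and -type f is added only so that the scratch directory itself can never match a pattern.

```shell
# Scratch directory with hypothetical file names (illustration only).
demo=$(mktemp -d)
touch "$demo/log1" "$demo/log2" "$demo/logA" \
      "$demo/.emacs" "$demo/a*b" "$demo/axb"

# Range class: "log" followed by exactly one digit (log1, log2).
digit_hits=$(find "$demo" -type f -name 'log[0-9]' | wc -l)

# Negated class: "log" followed by one character that is not a digit (logA).
nondigit_hits=$(find "$demo" -type f -name 'log[!0-9]' | wc -l)

# The wildcard also matches a leading dot, so .emacs is found.
dot_hits=$(find "$demo" -type f -name '*macs' | wc -l)

# A backslash makes * literal: only the file really named "a*b" matches,
# whereas the unescaped pattern matches both "a*b" and "axb".
literal_hits=$(find "$demo" -type f -name 'a\*b' | wc -l)
wild_hits=$(find "$demo" -type f -name 'a*b' | wc -l)

echo "digit=$digit_hits nondigit=$nondigit_hits dot=$dot_hits" \
     "literal=$literal_hits wild=$wild_hits"
rm -rf "$demo"
```

All patterns are in single quotes, so the backslash and the wildcards reach find untouched by the shell.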

If you want to locate some files with the locate command but don't need to see the full list you can use the --limit option to see just a small number of results, or the --count option to display only the total number of matches.

The type of regular expression that find uses for the -regex and -iregex predicates can be specified with the option -regextype. In the best GNU traditions, you need to select from several options, only half of which are useful (see Finding Files).
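As a rough illustration of -regextype, the sketch below runs the same search in two dialects. It assumes GNU find (other implementations may not have -regextype), and the file names are hypothetical.

```shell
# Scratch directory with hypothetical file names (illustration only).
demo=$(mktemp -d)
touch "$demo/img1.png" "$demo/img22.jpg" "$demo/notes.txt"

# -regex has to match the whole path, hence the leading ".*/".
# -regextype must appear before the -regex test it should affect.
ere_hits=$(find "$demo" -regextype posix-extended \
    -regex '.*/img[0-9]+\.(png|jpg)' | wc -l)

# The same search in the default (Emacs-style) dialect, where grouping
# and alternation are written with backslashes.
emacs_hits=$(find "$demo" -regex '.*/img[0-9]+\.\(png\|jpg\)' | wc -l)

echo "posix-extended=$ere_hits default=$emacs_hits"
rm -rf "$demo"
```

Both commands are expected to select img1.png and img22.jpg; only the notation for the group differs between the dialects.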

See Regular Expressions for more information on the regular expression dialects. There are many books about regular expressions that provide good guidance in this esoteric area. For in-depth coverage of regular expressions, see the recommendations on the page Best books about Regular Expressions.






Copyright 1996-2018 by Dr. Nikolai Bezroukov. www.softpanorama.org was initially created as a service to the (now defunct) UN Sustainable Development Networking Programme (SDNP) in the author free time and without any remuneration. This document is an industrial compilation designed and created exclusively for educational use and is distributed under the Softpanorama Content License. Original materials copyright belong to respective owners. Quotes are made for educational purposes only in compliance with the fair use doctrine.


Last modified: March 12, 2019;