Find can perform actions on the files and directories it finds by means of the options
-exec and -execdir (the latter is an extension found in GNU find and the BSDs). At the same time
it is a perfect tool for destroying your filesystem, because -exec blindly and very quickly
executes the command you specified for every file that find selects, which might not be the set you
expected. The stress here is on "very quickly", which is especially noticeable with
-exec /bin/rm {} \; and a wrong file set ;-)
Unix system administrators' folklore contains many horror
stories of important filesystems wiped out because somebody misunderstood which files would be affected.
The first rule of using the -exec option with any destructive command is to replace it with
the -ls option first and visually inspect the resulting file set.
This takes five or ten minutes, which is much less than the several hours (or days) of desperate efforts
to recover from the damage inflicted by an unforeseen side effect of a complex find command.
Even a typo can be deadly here (for example, an extra space before an asterisk). You are warned.
It also pays to be aware of several more specialized options:
-print prints the names of the files found on standard output;
this list can be piped to another script for post processing. This is the default action, so you
can usually omit it.
-print0 (GNU and BSD find) tells find to terminate each pathname with the null character (\0)
instead of a newline. This is the safer option
if your filenames can contain blanks or other special characters. It is strongly recommended to
use -print0 whenever you pipe the list to xargs (which then needs its -0 option).
-ls produces a listing almost identical to ls -l (more precisely, ls -dils). The listing can be
post processed by AWK or Perl scripts.
-delete deletes files or directories; true if removal succeeded. If the
removal failed, an error message is issued. As deleted files are difficult to recover, you are strongly
advised to test the command first by substituting -ls for -delete, especially if you are
deleting files from system or other critically important directories. Five minutes of testing often
saves five hours of frantic recovery efforts ;-). Instead of deleting, you can also use the mv command
to move the files to some "trash" folder.
There is also a more specialized Linux utility, misleadingly named tmpwatch, which can delete files
based on their age more safely.
The -delete option can also be used for deleting files with "strange" characters in their names,
but it is better to rename them first.
First, determine the inode of the file or directory. For example
ls -lhi *.html
Then use the find command with the -inum test and the inode of the troublemaker to rename it.
After that you can look at the content of those files and, if they are useless, delete them either
manually or with some command (if you gave them unique names).
To simply delete such a file you can combine -inum with the -delete option of GNU find, but again you
need to know what you are doing (hint: list the content first).
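The inode procedure above can be sketched as follows. This is only an illustration in a scratch
directory: the file names are invented, and -maxdepth and -delete are assumed to be available
(they are in GNU and BSD find).

```shell
# Sketch: handle awkwardly named files through their inode numbers.
# All file names below are invented for the demonstration.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch -- '-odd name?.html' 'junk *.html'

# Step 1: list inode numbers (first column of the listing).
ls -li

# Step 2: pick up the inode of the first troublemaker and inspect the match.
inode=$(ls -i ./-odd* | awk '{print $1}')
find . -maxdepth 1 -inum "$inode" -ls

# Step 3: rename it; {} expands to ./name, so the leading dash is harmless.
find . -maxdepth 1 -inum "$inode" -exec mv {} renamed.html \;

# Deleting by inode works the same way.
inode2=$(ls -i ./junk* | awk '{print $1}')
find . -maxdepth 1 -inum "$inode2" -delete
ls -l
```

Inode numbers are unique only within one filesystem, which is one more reason to keep the search
narrow (here with -maxdepth 1).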
Options -exec command and -execdir command execute the specified command for each file found. These
are the most powerful (and thus the most dangerous) options that find provides. The difference between
them is that -execdir runs the command from the directory that contains the matched file and passes
the name as ./name, which avoids race conditions in resolving the path and as such is safer.
Please be aware that the -exec option has some notorious side effects if used incorrectly. Making
a backup of the filesystem before doing something complex is highly recommended. Making a backup of
/etc/ before making any changes is a must for any seasoned Unix sysadmin. When using destructive commands,
treat the operation as surgery and first check that the set of files is correct with -ls or another
non-destructive command.
Both -exec and -execdir use the parameterless macro {},
which is expanded to the current file. More precisely, with -exec the macro {} is
expanded to a path starting with the name of one of the starting directories, rather
than just the basename of the matched file. With -execdir it is expanded to the basename
prefixed with ./ and the command is run from the directory containing the file.
You can use several instances of {} in the command: GNU find
replaces {} wherever it appears.
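A small sketch of using {} twice in one command, in a scratch directory with invented file names.
Note the assumption: replacing {} inside a longer argument such as {}.bak is GNU find behavior;
POSIX leaves that case implementation-defined.

```shell
# Sketch: make a .bak copy next to every *.conf file, using {} twice.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'a\n' > a.conf
printf 'b\n' > b.conf

# Dry run: echo prints the commands that would be executed.
find . -name '*.conf' -type f -exec echo cp -p {} {}.bak \;

# Real run: the second {} is expanded inside the argument {}.bak.
find . -name '*.conf' -type f -exec cp -p {} {}.bak \;
ls -l
```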
For more complex things, post processing the output of the find command
with xargs is the safer option: you
first write the list to a file, check it, and only then run xargs on that file. This prevents
running some potentially irreversible action on files beyond the subset you intended to process.
The option -execdir is a more modern option, available in GNU
find, that attempts to be a safer version of -exec. It
has the same semantics as -exec with two important enhancements:
It runs the command from the directory that holds the matched file and passes the name as ./name,
so the path cannot be redirected (for example through a swapped symbolic link) between the moment
find examines the file and the moment the command runs.
It also checks
the PATH variable for safety (if dot is present in the PATH environment variable,
you could pick up an executable from the wrong directory).
Here is a relevant quote from the man page:
Execute command; true if zero status is returned. find takes all arguments
after -exec to be part of the command until an argument consisting of ;
is reached. It replaces the string {} by the current file name being processed everywhere
it occurs in the command. Both of these constructions need to be escaped (with a \) or quoted
to protect them from expansion by the shell. The command is executed in the directory in which
find was run.
For example, to compare each C header file in or below the current directory with the file
/tmp/master:
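The command itself is missing from this copy of the quote. As far as I recall, the GNU findutils
documentation gives it as find . -name '*.h' -execdir diff -u '{}' /tmp/master ';' (treat the exact
spelling as an assumption). The sketch below runs the same idea in a scratch directory, with a
temporary master file standing in for /tmp/master; the header names are invented.

```shell
# Sketch: compare every header below the current directory with one master file.
workdir=$(mktemp -d)
master="$workdir/master"            # stands in for /tmp/master
mkdir -p "$workdir/src/sub"
printf '#define X 1\n' > "$master"
printf '#define X 1\n' > "$workdir/src/same.h"
printf '#define X 2\n' > "$workdir/src/sub/changed.h"
cd "$workdir/src" || exit 1

# -execdir runs diff inside each file's directory, so the master file
# must be given by absolute path. diff prints nothing for identical files.
find . -name '*.h' -execdir diff -u '{}' "$master" ';' > "$workdir/report.txt"
cat "$workdir/report.txt"
```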
If you use -execdir, you must ensure that the $PATH variable contains only
absolute directory names. Having an empty element in $PATH or explicitly including
. (or any other non-absolute name) is insecure. GNU find will refuse to run
if you use -execdir and it thinks your $PATH setting is insecure.
That said, old habits die slowly, and the -exec option dominates the literature,
including this site. In most cases an example written with -exec keeps working after you replace
-exec with -execdir, provided the command does not rely on relative paths other than {} itself
(the command now runs in the directory of each matched file). The change makes the command safer with
so little effort that it is worth adopting as your standard practice.
On modern servers creating a tar archive of the /etc/ directory takes seconds, and a cpio backup of
critical partitions takes several minutes. That means a backup should always be the first step in
preparing to run a global find command with an -exec option that contains a potentially destructive
command (rm, chmod, etc). This is rule No. 1.
Rule No. 2 of using the -exec option is very simple: unless you enjoy the situation commonly
called SNAFU, always test a find command containing -exec by using the
-ls option instead of -exec (or -execdir) to see whether the files selected are
the files you really wish to process.
Never use the -exec or -execdir option in a
hurry or under pressure. Always test the correctness of the selected files with the -ls option
before running a "destructive" command on them.
Again, if you deal with important files it is better to experiment first to see if everything is right.
Five minutes of testing can save five or more hours of desperate attempts to recover accidentally deleted
files.
Here are examples of "good practices" of using find. We will use chmod as the base
of the examples. Many people do not think about commands like
chmod or chown as particularly dangerous, but applied to the root filesystem they can
be pretty devastating. Please note that we first get to the target directory using cd and only
then use the find command with the "." (dot) argument. This avoids such unpleasant
situations as typing "/ etc" instead of "/etc".
Or worse, "/etc" instead of a local etc directory (the intention
was to get to the local etc directory, but the string "/etc" is hardwired in sysadmin brains and this
slip has cost many sysadmins tremendous pain):
Test command:
find . -type f -ls
Final command:
find `pwd` -type f -execdir /bin/chmod 500 {} ';'
A typical command of this kind searches the current directory and all subdirectories and changes the
permissions of each matching file as specified. Here an additional danger comes from being in the wrong
directory and from having mount points within the target directory.
For instance, all files named *rc.conf can be processed by the chmod o+r command. The argument {} is
a macro that expands to each found file. The \; argument indicates that the -exec command has ended;
you can write ';' instead.
The end result of such a command is that all *rc.conf files have the read bit set in the "other"
permissions.
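The original command did not survive on this page, so the following is a reconstruction under
assumptions: a scratch directory, invented file names, and chmod o+r as the permission change.
It follows the test-first pattern recommended above.

```shell
# Sketch: give "other" read permission to every *rc.conf file in a scratch tree.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir -p etc/app
touch etc/rc.conf etc/app/apprc.conf etc/other.txt
chmod 600 etc/rc.conf etc/app/apprc.conf etc/other.txt

# Test command: list what would be touched.
find . -name '*rc.conf' -type f -ls

# Final command, terminated with \;
find . -name '*rc.conf' -type f -exec chmod o+r {} \;

# The same command terminated with ';' (harmless to repeat here).
find . -name '*rc.conf' -type f -exec chmod o+r {} ';'

# Verify: only the two *rc.conf files are readable by others.
find . -type f -perm -004 -print
```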
The find command is commonly used to remove core files that are more than a few 24-hour
periods (days) old. These core files are copies of the memory image of a running program, written when
the program dies unexpectedly. They can be huge, so occasionally trimming them is wise.
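A sketch of such a cleanup in a scratch directory. The four-day threshold and the directory names
are arbitrary choices for the illustration, and backdating a file with touch -d assumes GNU
coreutils.

```shell
# Sketch: remove core files older than a few days, after listing them first.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir -p proj/a proj/b
touch proj/a/core proj/b/core
# Backdate one file by ten days (GNU touch syntax).
touch -d '10 days ago' proj/a/core

# -mtime +4 matches files whose age, counted in whole 24-hour periods,
# is greater than four, i.e. files at least five days old.
find . -type f -name core -mtime +4 -ls

# Same selection, now with the destructive action.
find . -type f -name core -mtime +4 -exec rm -f {} \;
find . -name core -print
```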
With grep, the extra /dev/null argument can be used to show the name of the file
along with the text that is found, because grep prints file names only when it is given more than
one file. Without it, only the matching text is printed. An equivalent mechanism
in GNU grep is the "-H" (or "--with-filename") option.
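The difference is easy to see in a scratch directory. The file names and the search string below
are invented; -H is assumed to be supported by your grep (GNU and BSD grep both have it).

```shell
# Sketch: three ways to call grep from -exec, and what each one prints.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'alpha\nneedle here\n' > one.txt
printf 'nothing\n' > two.txt

# One file per grep call: the match is printed without a file name.
find . -name '*.txt' -exec grep 'needle' {} \; > plain.out

# /dev/null is a second file argument, so grep prefixes the file name.
find . -name '*.txt' -exec grep 'needle' {} /dev/null \; > devnull.out

# -H requests the file name prefix directly.
find . -name '*.txt' -exec grep -H 'needle' {} \; > dash_h.out

cat plain.out devnull.out dash_h.out
```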
Limit the scope of the command to the minimum number of directories:
Unless you really need to process the whole subtree, use -maxdepth 1 to
prevent getting extra files in the results.
If you do not need to follow symbolic links, do not ask find to do so (it does not follow them by
default). Note that -xtype is like the opposite of -type for symbolic links. If
-follow isn't given, -xtype checks the file that the symlink points to; otherwise,
-xtype checks the symlink itself.
Use options that limit the command to the local filesystem, such as
-mount This predicate is always true. Restricts the search to the file system
containing the directory specified. Does not traverse mount points to other file systems.
-xdev Same as the -mount primary. Always evaluates to true.
Prevents the find command from descending into a file system different from the one
that holds the starting directory.
Always use the option -0 with the xargs command if you supply it with a list of files generated by
find. Correspondingly, always use the option -print0 of the find command to generate such a list.
This prevents mistreating files with spaces in the name (which typically come from a Windows environment).
The command that you want to execute needs to contain the special macro argument {}, which will be
replaced by the matched filename on each invocation of the -exec or -execdir predicate. You can use {}
multiple times in the command; it evaluates to the same file and path each time you use it.
You need to specify \; (or ';') at the end
of the command. (If the \ is left out, the shell will interpret the ;
as the end of the find command.)
For example, a command that ends in \; and the same command ending in ';' are equivalent.
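The equivalence of the two terminators can be checked directly. This sketch uses a scratch directory
with invented subdirectory names and echo as a harmless command.

```shell
# Sketch: \; and ';' are two ways of passing a literal semicolon to find.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir -p d1 d2/d3

with_backslash=$(find . -type d -exec echo dir: {} \;)
with_quotes=$(find . -type d -exec echo dir: {} ';')

printf '%s\n' "$with_backslash"
# An unprotected ; would be consumed by the shell, and find would
# complain that -exec is missing its terminating argument.
```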
NOTE: If the {} macro parameter is the last item
in the command, there must be a space between the {} and the \;. For example:
find . -type d -execdir /bin/ls -ld {} \;
If the macro substitution argument {} is the last item in the generated command, you can use {} +
instead of {} \; . In this case find
processes multiple files in xargs fashion, grouping as many file names as fit into a single
command.
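The grouping done by {} + can be made visible by counting output lines, since echo prints one line
per invocation. A minimal sketch with invented file names:

```shell
# Sketch: compare one-command-per-file (\;) with batched execution (+).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch a.txt b.txt c.txt

# \; runs echo once per file: three output lines.
find . -name '*.txt' -exec echo {} \; > per_file.out

# + appends as many names as fit to one command: a single line here.
find . -name '*.txt' -exec echo {} + > batched.out

wc -l per_file.out batched.out
```

With a real command such as chmod or grep, the batched form saves one process start per file.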
If you attempt to make changes that involve system directories, it is better
to do it in two stages. First create a file with the list of affected files using find with
the -print0 option and verify that it is accurate. Then use xargs with the option
-p (see below) to process this file.
For deleting files GNU find has the option -delete, which is
safer than "-execdir /bin/rm {} \;". Or, better, move the files to some directory instead. For example find / -name core -delete
There is a classic problem with using rm: among the files you are trying to delete there
might be one or several filenames with spaces, for example files that migrated to the Unix filesystem
from Windows. In Windows, unfortunately, using spaces in filenames is a common practice.
Specifying "Windows-style" names for deletion often leads to nasty surprises. For example, if
you have a set of files like "report2015.doc copy", you can accidentally delete the
originals, whose names end with "doc", by executing the command:
find /mnt/zip -name "*doc copy" -print | xargs /bin/rm
Without -print0 and -0, xargs splits each name at the space, so rm receives "report2015.doc" and
"copy" as two separate arguments.
If you have several hundred important Word documents in this folder, this is akin to a fire
in the library, and you can only imagine the size of the disaster if your backup proves to be unreliable.
Especially with a filesystem for which undelete tools do not exist. Your only option is to switch to
runlevel 1, take a dump of the disk with dd, and then painstakingly try to locate the deleted
files one by one. See dd and
Recovery of lost files using DD.
There are several ways to prevent this nasty error:
Always check the list of the files you are deleting, or on which you are performing
some potentially damaging operation, by first writing it to a file, then inspecting it
visually, and then running several grep commands (to detect spaces in filenames,
single quotes, dots as arguments and other "gotchas").
Always use xargs with the option -0 and the find command with
the option -print0 (see the discussion below, in the section devoted
to xargs).
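The effect of the two options can be shown without deleting anything by letting echo stand in for
rm. The sketch below uses a scratch directory and the file names from the example above; the
"would-remove:" label is just an illustrative marker.

```shell
# Sketch: how xargs sees a name with a space, with and without -print0 / -0.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch 'report2015.doc' 'report2015.doc copy'

# Unsafe: whitespace-delimited input splits the name into two arguments,
# one of which is the original document.
find . -name '*doc copy' -print | xargs -n 1 echo would-remove: > unsafe.out

# Safe: NUL-delimited input keeps the name in one piece.
find . -name '*doc copy' -print0 | xargs -0 -n 1 echo would-remove: > safe.out

cat unsafe.out safe.out
```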
Many users frequently ask why xargs should be used when shell command substitution
achieves the same results. Take a look at this example:
grep foo `find /usr/src/linux -name "*.html"`
The drawback with commands such as this is that if the set of files returned by find
is longer than the system's command-line length limit, the command will fail.
One way to solve this problem is to use xargs. This approach gets around this problem
because xargs runs the command as many times as is required, instead of just once.
But the ability of xargs to split its input across several invocations can be a source of problems too.
Consider an attempt to create a backup of all java files in the current tree by piping find into
xargs with /bin/tar -cf. If the list is too long for a single /bin/tar command, xargs will split
it into multiple commands, and each subsequent tar command will overwrite the previous tar archive.
As a result the archive will contain only a fraction of the files, and without testing you might
discover this sad side effect too late.
To solve this problem you can either use a file with the list of files to include in the archive (tar
can read a list of files from a file using the option -T) or the option "-r", which
tells tar to append to the archive (the option '-c' means "create").
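The overwrite problem and both fixes can be reproduced on a tiny tree. In this sketch xargs -n 1
forces the split that would otherwise need a very long file list; the archive and file names are
invented, and --null with -T - assumes GNU tar.

```shell
# Sketch: why xargs + tar -c loses files, and two ways around it.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir -p src/pkg
touch src/A.java src/B.java src/pkg/C.java

# Problem: each tar -cf call recreates the archive, so only the files
# from the last invocation survive.
find . -name '*.java' -print0 | xargs -0 -n 1 tar -cf broken.tar
tar -tf broken.tar

# Fix 1: let tar read the whole NUL-delimited list itself.
find . -name '*.java' -print0 | tar -cf good.tar --null -T -
tar -tf good.tar

# Fix 2: append with -r, so repeated invocations accumulate members.
find . -name '*.java' -print0 | xargs -0 -n 1 tar -rf appended.tar
tar -tf appended.tar
```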
One of the biggest limitations of the -exec option terminated by \; (or predicate with a side effect,
to be more correct) is that it runs the specified command on only one file at a time.
Always check the correctness of the list of files
selected by the find
command. Write the list to a file and inspect it visually to avoid gotchas.
You can write a script that checks the result for some common gotchas. The expected number of files
in the result can also be used in such a script. This is especially important if you
are making changes across the whole operating system (all filesystems).
The xargs command (or the parallel
command, which can be used as an xargs substitute by Perl enthusiasts and people
who want parallel execution of commands) solves two problems:
It enables users to run a single command on many files at one time. In general, it is
much faster to run one command on many files, because this cuts down on the number of invocations
of the particular command/utility. For example, one often needs to find files containing a specific
pattern in multiple directories. One can use the -exec option in find and then pipe the
result to the tee command:
find . -type f -execdir /usr/bin/grep -iH '#!/bin/ksh' {} \; | tee /tmp/allfiles
But there is a more elegant and more Unix-like way of accomplishing the same task using xargs
and pipes. You can use xargs to read the output of find and build a pipeline
that invokes grep. This way, grep is called only four or five times
even though it might check through 200 or 300 files. By default, xargs always appends
the list of filenames to the end of the specified command, so using it with grep and most
other Unix commands is pretty natural:
find . -type f -print0 | xargs -0 /usr/bin/grep -il 'bin/ksh' | tee /tmp/allfiles
This finds the same files a lot faster. The option -l in grep prints only the names of files with
matching lines, separated by NEWLINE characters. It does
not repeat the names of files when the pattern is found more than once.
It provides a debugging mode with the options -t and -p, which can be used together:
Option -t echoes each command before executing it.
Option -p prompts the user before executing each command.
This way you can manually debug a small set of files found by find (and see the generated command for
each of them). You can just answer NO to each prompt, and stop the test with Ctrl-C when you
become confident that everything is OK, or if things go wrong.
If the output of find contains hundreds
of entries, those options are not enough. It is safer to first write
the list to a file, inspect it, and only then run xargs with the option
-t.
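The two-stage approach might look like the sketch below. The original example is missing from this
page, so the file names, the list file name filelist.bin and the use of rm -f are all assumptions
made for the illustration.

```shell
# Sketch: stage 1 writes the list, stage 2 runs the command on the reviewed list.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir -p data
touch data/keep.txt 'data/old report.tmp' data/cache.tmp

# Stage 1: write the NUL-delimited list to a file and review it
# (tr makes it readable, one name per line).
find ./data -type f -name '*.tmp' -print0 > filelist.bin
tr '\0' '\n' < filelist.bin

# Stage 2: only after the review, feed the same list to xargs.
# -t echoes each generated command to stderr before running it.
xargs -0 -t rm -f < filelist.bin 2> commands.log
cat commands.log
```

Because xargs reads the saved list rather than a fresh find run, the command acts on exactly the
files you inspected.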
Note: the find option -print0 prints the list of filenames with the null character (\0) instead of
a newline as the delimiter between
the pathnames found. This is the safer option when you use find with xargs and files can contain
blanks or other special characters (the -0 argument is then needed in
xargs).
You can also filter the output using an additional grep stage of the pipeline before xargs.
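A filtering stage can keep the NUL delimiters intact if grep is told to use them as well. The sketch
below assumes GNU grep, whose -z option treats input and output records as NUL-terminated; the file
names and the "selected:" label are invented.

```shell
# Sketch: drop *.bak names from a NUL-delimited list before it reaches xargs.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch 'notes one.txt' notes.bak data.txt

# grep -z -v removes the records that match, keeping NUL as the delimiter,
# so names with spaces still arrive at xargs in one piece.
selected=$(find . -type f -print0 | grep -z -v '\.bak$' | xargs -0 -n 1 echo selected:)
printf '%s\n' "$selected"
```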
By default xargs places input arguments at the end of each generated command.
In this case you do not need to use a file placeholder macro like '{}' in the option -exec.
Option -i in xargs
Option -i in xargs provides explicit specification of a symbol or a
group of symbols used to denote a parameterless macro (macro substitution string). In find, as we know,
this is fixed to '{}'. In xargs you can specify your own macro substitution string.
In GNU xargs -i is deprecated in favor of -I, which requires one parameter: the current file macro
placeholder.
If the macro you choose contains characters that have special meaning in the shell, you need to
put them in single quotes or put a backslash (\) before each such character to keep the shell from
interpreting them.
For example, "^" is more convenient in most cases than {}, as this symbol is rarely used.
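A sketch of a custom placeholder, using the -I spelling that the xargs documentation recommends over
-i. The file names are invented, and the placeholder is quoted because some shells treat ^ specially.

```shell
# Sketch: use ^ as the xargs placeholder instead of {}.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'x\n' > 'my file.txt'
printf 'y\n' > plain.txt

# Preview: echo shows the command generated for each input name.
find . -name '*.txt' -print0 | xargs -0 -I '^' echo cp -p '^' '^.bak'

# Real run: every ^ in the arguments is replaced by the current name,
# and the command is executed once per name.
find . -name '*.txt' -print0 | xargs -0 -I '^' cp -p '^' '^.bak'
ls -l
```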
One common problem is that, without special precautions, a file whose name contains spaces
is treated as multiple arguments.
As we mentioned before, the option -0 prevents mistreating files with spaces in the name (such files
typically come from a Windows environment). The feed file from find should be generated with the option
-print0. You should always use this option with xargs. I mean always.
I would like to stress again and again that this is a vital option, which prevents "mistreatment"
not only of filenames with spaces, but also of filenames with a single quote (for example,
can't_open_display.txt) and several other important cases. As there is a pretty high
chance of encountering such a file in any large set of files in a modern Unix environment, you should
always use this option.
If you feed xargs from the find command, you need to use -print0
in find and the option -0 in the xargs command.
This way you avoid the danger of processing a file name with blanks as multiple files,
with potentially catastrophic consequences if you use some destructive command in xargs.
Using the option -p you can require manual confirmation of each action. The reason this is needed is
that xargs runs the specified command with the filenames from its standard input, so interactive commands
such as cp -i, mv -i, and rm -i (to which cp, mv and rm are often aliased)
don't work right: their prompts cannot be answered from the pipe. Aliases are also ignored, which is
one more reason to give the full path to the executable, such as /bin/rm.
So when you run the command for the first time you can use this option as a safety valve. After
several operations with confirmation, to which you answered NO, you can cancel the command and run it
without the option -p. The -p option catches a typo that you did not notice but
that dramatically affects what find or xargs is doing. In the preceding example,
the -p option would have made the initial run safer, because you could answer no
to each prompt and then rerun the command without the option -p.
People are doing pretty complex stuff this way. For example
(Ubuntu Forums, March
23rd, 2010)
FakeOutdoorsman
I'm trying to convert Nikon NEF images to jpg. Usually I use find and xargs for batch processes
like this for example:
As we mentioned, when xargs is used with grep or another command, the
latter gets multiple filenames. If grep gets multiple file arguments, it automatically
prefixes each match with the name of the file that contains it. Still, for grep you do need the option
-H (or the addition of /dev/null to the list of files), as the last "chunk" of filenames
can contain a single file.
When used in combination, find, grep, and xargs are a potent team to help
find files lost or misplaced anywhere in the UNIX file system. This is an important and recurrent problem
with modern filesystems, which often contain thousands of files, and I strongly encourage you to
experiment further. This is a vital sysadmin skill that is really necessary in the current environment.
Even directories like /etc in modern Unixes contain way too many files, and often you do not remember
whether the necessary config file is in the /etc directory itself or in one of its subdirectories
like /etc/ssh.
If a regular file is lost, it is important to think about distinctive criteria that you can use to
find it. It might be not only the name but some string within it, the date of last modification, the
size or any other attribute. The more precise the search, the better your chances of finding the file.
Often you need to experiment with different criteria to achieve a useful result. Even if the file is
not found because it was accidentally deleted, its content might still be present on the disk. In this
case just do a dd dump of the whole disk and search it for some unique string.
With SSD disks a find command over the root filesystem is almost instant. In other cases searching
with find is an interesting indicator of the speed of the filesystem. It shows a clear difference between
15K RPM drives and 10K RPM drives of the same size. It also shows that large 10K RPM drives beat
smaller 15K RPM drives. In any case it is a very interesting and revealing test of the I/O
subsystem and the filesystem used (ext3 is actually not a bad filesystem for a large number of
relatively small files).
You can also use the time command to see the dramatic difference in speed between find with the
-exec option and results piped to xargs. In the simplest form, prefix each variant with time:
first a find ... -exec command {} \; run, then the equivalent find ... -print0 | xargs -0 command
pipeline.
On any substantial set of files xargs works considerably faster. The difference becomes
even greater when more complex commands are run and the list of files is longer.
The -exec option in the find command is a very sharp tool. Below we'll present
some of the horror stories (see also
Typical Errors In Using Find). Such errors are often
made under time pressure or when the person is very tired and situational awareness is low.
Please remember that five minutes of testing can usually save five or more hours of desperate
attempts to recover from the results of an incorrectly run find command with the option
-exec that contains some 'destructive' action.
Typically "find blunders" are committed when a complex find command that changes the
files in a certain subtree using the rm, chown, or chmod command is constructed
and run without any testing. So in many cases this is a direct result of the recklessness of the
sysadmin. Sometimes it is the result of time pressure, or of being extremely tired (in this situation
people often try to cut corners, even if they understand the risk). It can also be the result of a lack
of situational awareness (like many pilot errors) due to information overload or other factors.
Often you just can't foresee the results of a particular find command without testing. For
example, sometimes the directories involved contain symbolic links to directories in another part
of the filesystem, and find starts "running wild" on a subtree that you never intended it to touch.
Sometimes the pattern that you use has an unintended side effect. Sometimes it is just a silly typo.
It's always safer to create a list of the files to which you apply the particular command, inspect it
carefully, and only then, as a separate stage, execute the command for each of those files. And by
"inspect it carefully" I mean running several grep commands to detect the two most disastrous "hidden" cases:
Cases when your command contains dots. Here just creating the list is not enough: if
the list contains a "hidden" command like mv . /net/location that you did not detect, you
are still hosed despite the fact that you created the list of files and supposedly inspected it.
For example, such a command is generated if you do the following:
find . -type d -execdir echo /bin/mv {} /new/location \;
Don't rely on your eyes. Use grep to find such "suspicious" cases. Here the best strategy
is to proceed slowly and with extreme caution.
Cases when the generated commands do not use absolute path to files processed (simple
cases of this type might be OK). It is a good strategy to generate command with absolute path, not
relative path. It prevents some gotchas.
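Such a grep check on a generated command list might look like the sketch below. It uses -exec rather
than -execdir so that the generated paths are predictable, a scratch tree with invented names, and a
regular expression that is only one possible way to spot a bare dot operand.

```shell
# Sketch: generate the commands as text, then grep for the dangerous ones.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir -p tree/a tree/b
cd tree || exit 1

# echo turns the run into a dry run: commands are written, not executed.
find . -type d -exec echo /bin/mv {} /new/location \; > "$workdir/commands.txt"
cat "$workdir/commands.txt"

# Flag commands whose operand is the starting directory itself
# (a bare "." or "./." between spaces).
grep -n -E ' \.(/\.)? ' "$workdir/commands.txt"
```

Here the first generated line moves the starting directory itself, which is exactly the case that
is easy to overlook by eye.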
The life of a sysadmin is a complex one, so a little testing does wonders in preventing nasty surprises
that come from overconfidence in your own abilities :-).
I sent one of my support guys to do an Oracle update in Madrid.
As instructed he created a new user called esf and changed the files in /u/appl to owner esf,
however in doing so he *must* have cocked up his find command, the command was:
find /u/appl -user appl -exec chown esf {} \;
He rang me up to tell me there was a problem, I logged in via x25 and about 75% of files on system belonged to owner esf.
VERY little worked on system. What a mess, it took me a while and I came up with a brain wave
to fix it but it really screwed up the system.
Moral: be *very* careful of find execs, get the syntax right!!!!
Maroo 07.01.09 at 4:46 pm
I issued the following command on a BackOffice Trading Box in an attempt to clean out a user's
directory. But issued it in the /local. The command ended up taking out the Application
mounted SAN directory and the /local directory.
find . -name "foo.log*" -exec
ls -l {} \; | cut -f2 -d "/" | while
read NAME; do gzip -c $NAME > $NAME.gz; rm -r $NAME;
done
Took out the server for an entire day.
Ville 07.14.09 at 12:17 am
I run a periodic (daily) script on a BSD system to clean out a temp directory for joe (the editor).
Anything older than a day gets wiped out. For some historical reason the temp directory sits in
/usr/joe-cache rather than in, for instance, /usr/local/joe-cache or /var/joe-cache
or /tmp/joe-cache. The first version of the line in the script that does the deleting
looked like this:
Good thing the only files in /usr were two symlinks that were neither mission critical
nor difficult to recreate as the above also matches "/usr/edit-cache/.." In the above
the rather extraneous (joe doesn't save backup files in sub-directories) "-maxdepth 1"
saved the entire /usr from being wiped out!
Option processing in both Unix and GNU xargs is badly written and does not use a lexical
scanner. This is especially dangerous with the option -i.
You should already be using the -print0 option in find and the -0 option in xargs
to avoid this error. The error also arises when the -i option is used without an argument (it should
be at least -i{}, with no space between -i and {}).
If you cat a list of files into xargs, use tr to translate \n to \0, and always use
the option -0 with xargs. The problem is that xargs is somewhat capricious
and is sensitive to quotes and spaces in filenames. Without the option -0 it will complain about
a single quote in a filename, but will process filenames with blanks, possibly leading to disastrous
consequences.
If you put a space after the -i argument, this error is observed even with the -print0
and -0 options, which is pretty unexpected and puts your troubleshooting off track.
See the discussion below to get a better understanding of this gotcha.
I am confused. Isn't the use of xargs supposed to precisely help with this problem?
Note: I know that I can techincally use -exec in find, but I would like to understand
why the above fails, since my understanding is that xargs is supposed to know how to split the
input into a manageable size to the argument that it runs. Is this not true?
This is all with zsh.
slm
Well for one thing the -i switch is deprecated:
-i[replace-str]
This option is a synonym for -Ireplace-str if replace-str is specified.
If the replace-str argument is missing, the effect is the same as -I{}.
This option is deprecated; use -I instead.
So when I changed your command around to this, it worked:
This approach shouldn't be used since running this command construct:
$ find -print0 ... | xargs -I{} -0 ...
implicitly turns on these switches to xargs, -x and -L 1. The -L 1 configures
xargs so that it's calling the commands you want it to run the files through in a single fashion.
So this defeats the purpose of using xargs here since if you give it 1000 files it's going
to run the mv command 1000 times.
If so, how does xargs know in this case where in the mv command to feed in the arguments
it gets from the pipe? (does it always place them last?)
slm, Jul 21 '13 at 6:54
@user815423426
Doing it with just the find ... -exec ... is a better way or if you want to use
xargs the find ... | xargs ... mv -t ... is fine too.
Yup it always puts them last. That's why that method needs the -t.
Gilles
The option -i takes an optional argument. Since you put a space
after -i, there was no argument to the -i option and therefore the subsequent -0 was not an option
to xargs but the second of 6 operands {} -0 mv -t /some/path {}.
With only the option -i, xargs expected a newline-separated list of file names. Since there was
probably no newline in the input, xargs received what looked like a huge file name (with embedded
null bytes, but xargs didn't check that). This single string containing the whole output of find
was longer than the maximum command line length, hence the error "command line too long".
Your command would have worked with -i{} instead of -i {}. Alternatively,
you could have used -I {}: -I is similar to -i, but takes a mandatory argument, so the next argument
passed to the xargs is used as the argument of the -I option. Then the argument after that is
-0 which is interpreted as an option, and so on.
However, you shouldn't use -I {} at all. Using -I has three effects:
-I turns off quote processing, which -0 already does.
-I changes the string to replace, but {} is the default value.
-I causes the command to be executed separately for each input record, which is useless
here since your command (mv -t) is specifically intended to cope with multiple files per invocation.
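The command from the question is missing on this page, so the sketch below only illustrates the
recommendation in the answers: no -I at all, with mv -t taking the target directory first. The
directory and file names are invented, and mv -t assumes GNU coreutils.

```shell
# Sketch: move many files with one mv call, no placeholder needed.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir -p src dest
touch 'src/a b.log' src/c.log src/keep.txt

# mv -t DIR names the target first, so xargs can append the sources
# at the end, as it does by default.
find ./src -name '*.log' -print0 | xargs -0 mv -t ./dest
ls dest

# The same without xargs would be:
#   find ./src -name '*.log' -exec mv -t ./dest {} +
```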
root@dwarf /var/spool/clientmqueue # rm spam-*
/bin/rm: Argument list too long.
Ever seen this error in Linux when you have too many files in a directory and you are unable
to delete them with a simple rm -rf *? I have run into this problem a number
of times. After doing a bit of research online I came across a neat solution to work around this
issue.
find . -name 'spam-*' | xargs rm
In the above instance the command will delete all files in the current directory
and its subdirectories whose names begin with spam-. You can replace the spam-* with anything
you like. You can also replace it with just a * if you want to remove all files
in the folder.
find . -name '*' | xargs rm
We have covered the Linux find
command in great detail earlier.
Xargs is a Linux command that makes passing a large number of arguments to a command easier.
LetsTalkTexasTurkeyHere
I got this error from my RaspberryPi trying to erase a large amount of jpeg images from the
current working directory.
(works even for those shared hosts that block access to find, like nexus)
Kevin Polley
Big thanks - find . -type f -print0 | xargs -0 /bin/rm saved the day for me with an
overflowing pop acct
Michael T
Good catch using the print0 option, that's an important one.
Most find commands do not require the "-name" predicate. What's usually more important is to make
sure you're deleting *files* and not something else you might not have intended. For this use
"-type f" in place of the "-name" option....
find . -type f -print0 | xargs -0 /bin/rm
A) Use the full path to the 'rm' command so your aliases don't muck with things.
B) Check your xargs command, you can sometimes, if needed, tell it to use one "result" at a time,
such as (if you didn't use print0 but regular print) "-l1"
One common problem is that, without special precautions, a file name that contains spaces
will be treated as multiple arguments.
As we mentioned before, the option -0 prevents mishandling of files with
spaces in the name (such files typically come from a Windows environment) and should be used
together with the -print0 option of the find command.
I would like to stress again that this is a vital option if you can have filenames with
spaces in your filesystem, and there is a pretty high chance of encountering such a file in any large set
of files in a modern Unix environment.
I recommend using it as the default option. That means always. If you add option
-print0 to the find command and option -0 to the xargs command, you
avoid the danger of processing a file name with blanks as multiple files, with potentially catastrophic
consequences if you use some destructive command in -exec or xargs.
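A small experiment shows the difference. It is a sketch that assumes an xargs with the -0 option (GNU and BSD versions have it) and uses a made-up file name; it only counts arguments with echo and deletes nothing.

```shell
# Scratch directory with one file whose name contains a blank.
work=$(mktemp -d)
touch "$work/my report.txt"

# Plain pipe: xargs splits the single name at the blank into two arguments.
unsafe=$(find "$work" -type f | xargs -n 1 echo | wc -l)

# NUL-delimited pipe: the name arrives as one argument.
safe=$(find "$work" -type f -print0 | xargs -0 -n 1 echo | wc -l)

echo "arguments without -print0/-0: $unsafe"
echo "arguments with -print0/-0:    $safe"
```

With rm in place of echo, the first pipeline would try to remove two paths that were never matched by find.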
Using option -p you can require manual confirmation of each action. This is needed because
xargs feeds the specified command filenames taken from its own standard input, so interactive commands
such as cp -i, mv -i, and rm -i (to which cp, mv and rm are often aliased)
don't work right. For the same reason you should provide the full path to the executable,
such as /bin/rm, to make the command work right.
As we mentioned, when xargs is used with grep or another command, that command
receives multiple filenames. If grep gets multiple file arguments, it automatically
prefixes each match with the name of the file that contains it. Still, for grep you do need option
-H (or the addition of /dev/null to the list of files), as the last "chunk" of filenames
can contain a single file.
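The single-file case is easy to reproduce. The sketch below assumes a grep that supports -H (GNU and BSD grep do); the file name is invented for the demo.

```shell
# Scratch directory with exactly one matching file, so grep gets one argument.
work=$(mktemp -d)
printf 'foo\n' > "$work/only.html"

# With a single file argument grep omits the filename from its output.
without_h=$(find "$work" -name '*.html' -print0 | xargs -0 grep foo)

# -H forces the filename prefix regardless of how many files grep receives.
with_h=$(find "$work" -name '*.html' -print0 | xargs -0 grep -H foo)

echo "without -H: $without_h"
echo "with -H:    $with_h"
```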
Many users frequently ask why xargs should be used when shell command substitution
achieves the same results. Take a look at this example:
grep foo `find /usr/src/linux -name "*.html"`
The drawback with commands such as this is that if the set of files returned by find
is longer than the system's command-line length limit, the command will fail.
One way to solve this problem is to use xargs. This approach gets around this problem
because xargs runs the command as many times as is required, instead of just once.
But the ability of xargs to split its input across multiple invocations can be a source of problems too.
For example, suppose an attempt is made to create a backup of all Java files in the current tree by piping
find into xargs tar. If the list is too long for a single tar command, xargs will split it into
multiple commands, and each subsequent tar command will overwrite the archive written by the previous one.
As a result the archive will contain only a fraction of the files, and without testing you might discover
this sad side effect too late.
To solve this problem you can use either a file with the list of files to include in the archive (tar
can read a list of files from a file using option -T) or option "-r", which tells
tar to append to the archive (option '-c' means "create").
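The -T route can be sketched like this, assuming GNU tar (bsdtar accepts the same --null -T - combination); the file names and the backup.tar name are made up. Because tar reads the list itself, it runs exactly once and nothing gets overwritten.

```shell
# Scratch source tree with two Java files and one unrelated file.
work=$(mktemp -d)
mkdir -p "$work/src/pkg"
touch "$work/src/A.java" "$work/src/pkg/B.java" "$work/src/notes.txt"

# find emits NUL-terminated names; tar reads them from stdin (-T -) in one run.
# --null must come before -T so the list is parsed as NUL-delimited.
(cd "$work/src" && find . -type f -name '*.java' -print0 \
    | tar --null -T - -cf "$work/backup.tar")

count=$(tar -tf "$work/backup.tar" | wc -l)
echo "files in archive: $count"
```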
Tip 4: Always try to use -execdir instead of -exec.
It is a safer option because the command is run from the directory that contains the matched file
Tip 5: For a large number of files to be processed, inspect the list of files
before running any "destructive" command
Tip 6: If the {} placeholder is the last argument of the generated
command, you can use "{} +" instead of "{} \;".
In this case find will pass multiple arguments per invocation, in xargs fashion.
All of the -exec examples end with "{} \;" which means they would be more efficient and faster
if they ended with "{} +" instead. Using '+' instead of ';' makes find aggregate pathnames
and execute far fewer commands, instead of one command for each pathname.
You can't use a '+' if the last command argument is not "{}", for example you can't do:
find . -name "*.old" -exec mv {} oldfiles +
but there is a way around that involving the shell:
find . -name "*.old" -exec sh -c 'mv "$@" oldfiles' sh {} +
This uses two process per aggregated set of pathnames, but is still way more efficient than:
find . -name "*.old" -exec mv {} oldfiles \;
if there are more than a couple of files.
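The difference in the number of commands executed can be observed directly. This sketch uses a trivial sh -c 'echo run' as a stand-in for the real command, so each line of output corresponds to one invocation; the .old file names are made up.

```shell
# Scratch directory with three matching files.
work=$(mktemp -d)
touch "$work/a.old" "$work/b.old" "$work/c.old"

# Terminated with \; : one command per pathname.
per_file=$(find "$work" -name '*.old' -exec sh -c 'echo run' sh {} \; | wc -l)

# Terminated with + : pathnames are aggregated into as few commands as possible.
batched=$(find "$work" -name '*.old' -exec sh -c 'echo run' sh {} + | wc -l)

echo "invocations with \\; : $per_file"
echo "invocations with +  : $batched"
```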
James Youngman, 09/27/2012 at 00:40
Ed wrote:
I am surprised you didn't mention that the -exec option can overflow the command line
if find returns too many objects
That should not happen, which is the whole point of -exec. If you can reproduce this problem with
GNU find, please report it (with clear, reproducible instructions on how to reproduce the problem!)
as a bug.
Geoff, 09/27/2012 at 08:05
Ed, you are mixing up two different problems. The overflow problem is with -print and command
substitution.
The problem with -exec, as stated in the article you referred to, was efficiency.
The original solution to that was xargs, but these days find has that functionality built in:
you use '+' instead of ';' to terminate the -exec command, and it is
preferred because of the problems xargs has with spaces and other special characters in filenames.
(GNU solved the xargs problem a different way by inventing find -print0 and xargs
-0, but those aren't as widely implemented as find's "-exec command {} +", and they
are less efficient because of the extra xargs process and the I/O through the pipe.)
Xargs, along with the find command, can also be used to copy or move a set of files from one directory
to another. For example, to move all the text files that are more than 10 minutes old from the current
directory to the previous directory, use the following command:
The -I command line option is used by the xargs command to define a replace-string which gets replaced
with names read from the output of the find command. Here the replace-string is {}, but it could be
anything. For example, you can use "file" as a replace-string.
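A minimal sketch of such a move, assuming GNU or BSD find (for -mmin and -maxdepth) and an xargs that supports -0 and -I. The directory layout and file names are invented for the demo, and "previous directory" is taken to mean the parent directory.

```shell
# Scratch layout: one old and one fresh text file in a subdirectory.
work=$(mktemp -d)
mkdir "$work/sub"
touch -t 202001010000 "$work/sub/old.txt"   # timestamp far older than 10 minutes
touch "$work/sub/new.txt"                   # modified just now

# {} is the replace-string; xargs runs one mv per file name it reads.
(cd "$work/sub" && find . -maxdepth 1 -type f -name '*.txt' -mmin +10 -print0 \
    | xargs -0 -I {} mv {} ..)

ls "$work" "$work/sub"
```

Only old.txt ends up in the parent directory; new.txt fails the -mmin +10 test and stays where it was.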
Suppose you want to list the details of all the .txt files present in the current directory. As already explained, it can be easily
done using the following command:
find . -name "*.txt" | xargs ls -l
But there is one problem: the xargs command will execute the ls command even if the find command
fails to find any .txt file. The following is an example.
So you can see that there are no .txt files in the directory, but that didn't stop xargs from
executing the ls command. To change this behavior, use the -r command line option:
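The behavior can be checked in an empty directory. This sketch assumes GNU xargs, where -r is short for --no-run-if-empty (BSD xargs does not run the command on empty input by default); echo stands in for ls so that each run produces exactly one line.

```shell
# Empty scratch directory: find will match nothing.
work=$(mktemp -d)

# GNU xargs still runs the command once when its input is empty.
runs_default=$(find "$work" -name '*.txt' | xargs echo ran | wc -l)

# With -r the command is not run at all.
runs_r=$(find "$work" -name '*.txt' | xargs -r echo ran | wc -l)

echo "runs without -r: $runs_default"
echo "runs with -r:    $runs_r"
```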
I think this approach is way too complex. A simpler and more reliable approach is first to create the directory structure and then,
as the second stage, to copy the files.
Use of the cp command option is interesting though.
Notable quotes:
"... create the intermediate parent directories if needed to preserve the parent directory structure. ..."
By default xargs reads items from standard input, separated by blanks and newlines, and
passes them as arguments to a command, fitting as many as possible into each invocation. In the following
example standard input is piped to xargs and the mkdir command is run once with all three arguments,
creating three folders.
echo 'one two three' | xargs mkdir
ls
one two three
How to use xargs with find
The most common usage of xargs is to use it with the find command. This uses find
to search for files or directories and then uses xargs to operate on the results.
Typical examples of this are removing files, changing the ownership of files or moving
files.
find and xargs can be used together to operate on files that match
certain attributes. In the following example files older than two weeks in the temp folder are
found and then piped to the xargs command which runs the rm command on each file
and removes them.
find /tmp -mtime +14 | xargs rm
xargs v exec {}
The find command supports the -exec option that allows arbitrary
commands to be executed on files that are found. The following are equivalent.
find ./foo -type f -name "*.txt" -exec rm {} \;
find ./foo -type f -name "*.txt" | xargs rm
So which one is faster? Let's compare a folder with 1000 files in it.
time find . -type f -name "*.txt" -exec rm {} \;
0.35s user 0.11s system 99% cpu 0.467 total
time find ./foo -type f -name "*.txt" | xargs rm
0.00s user 0.01s system 75% cpu 0.016 total
Clearly using xargs is far more efficient. In fact several benchmarks suggest using
xargs over exec {} is six times more efficient.
How to print commands that are executed
The -t option prints each command that will be executed to the terminal. This
can be helpful when debugging scripts.
echo 'one two three' | xargs -t rm
rm one two three
How to view the command and prompt for execution
The -p option will print the command to be executed and prompt the user to run
it. This can be useful for destructive operations where you really want to be sure of the
command to be run.
echo 'one two three' | xargs -p touch
touch one two three ?...
How to run multiple commands with xargs
It is possible to run multiple commands with xargs by using the -I
flag. This replaces occurrences of the placeholder with the argument passed to xargs. The
following echoes a string and creates a folder.
cat foo.txt
one
two
three
cat foo.txt | xargs -I % sh -c 'echo %; mkdir %'
one
two
three
ls
one two three
George Ornbo is a hacker, futurist, blogger and Dad based in Buckinghamshire,
England. He is the author of Sams Teach Yourself
Node.js in 24 Hours. He can be found in most of the usual places as shapeshed, including
Twitter and GitHub.
If you just want to see some examples and skip the reading, here are a little more than
thirty find command examples to get you started. Almost every command is followed
by a short description to explain the command; others are described more fully at the URLs
shown:
basic 'find file' commands
--------------------------
find / -name foo.txt -type f -print # full command
find / -name foo.txt -type f # -print isn't necessary
find / -name foo.txt # don't have to specify "type==file"
find . -name foo.txt # search under the current dir
find . -name "foo.*" # wildcard
find . -name "*.txt" # wildcard
find /users/al -name Cookbook -type d # search '/users/al' dir
case-insensitive searching
--------------------------
find . -iname foo # find foo, Foo, FOo, FOO, etc.
find . -iname foo -type d # same thing, but only dirs
find . -iname foo -type f # same thing, but only files
find files with different extensions
------------------------------------
find . -type f \( -name "*.c" -o -name "*.sh" \) # *.c and *.sh files
find . -type f \( -name "*cache" -o -name "*xml" -o -name "*html" \) # three patterns
find files that don't match a pattern (-not)
--------------------------------------------
find . -type f -not -name "*.html" # find all files not ending in ".html"
find files by text in the file (find + grep)
--------------------------------------------
find . -type f -name "*.java" -exec grep -l StringBuffer {} \; # find StringBuffer in all *.java files
find . -type f -name "*.java" -exec grep -il string {} \; # ignore case with -i option
find . -type f -name "*.gz" -exec zgrep 'GET /foo' {} \; # search for a string in gzip'd files
5 lines before, 10 lines after grep matches
-------------------------------------------
find . -type f -name "*.scala" -exec grep -B5 -A10 'null' {} \;
(see http://alvinalexander.com/linux-unix/find-grep-print-lines-before-after-search-term)
find files and act on them (find + exec)
----------------------------------------
find /usr/local -name "*.html" -type f -exec chmod 644 {} \; # change html files to mode 644
find htdocs cgi-bin -name "*.cgi" -type f -exec chmod 755 {} \; # change cgi files to mode 755
find . -name "*.pl" -exec ls -ld {} \; # run ls command on files found
find and copy
-------------
find . -type f -name "*.mp3" -exec cp {} /tmp/MusicFiles \; # cp *.mp3 files to /tmp/MusicFiles
copy one file to many dirs
--------------------------
find dir1 dir2 dir3 dir4 -type d -exec cp header.shtml {} \; # copy the file header.shtml to those dirs
find and delete
---------------
find . -type f -name "Foo*" -exec rm {} \; # remove all "Foo*" files under current dir
find . -type d -name CVS -exec rm -r {} \; # remove all subdirectories named "CVS" under current dir
find files by modification time
-------------------------------
find . -mtime 1 # 24 hours
find . -mtime -7 # last 7 days
find . -mtime -7 -type f # just files
find . -mtime -7 -type d # just dirs
find files by modification time using a temp file
-------------------------------------------------
touch -t 09301330 poop # 1) create a temp file with a specific timestamp (MMDDhhmm)
find . -mnewer poop # 2) returns a list of new files
rm poop # 3) rm the temp file
find with time: this works on mac os x
--------------------------------------
find / -newerct '1 minute ago' -print
find and tar
------------
find . -type f -name "*.java" | xargs tar cvf myfile.tar
find . -type f -name "*.java" | xargs tar rvf myfile.tar
(see
http://alvinalexander.com/blog/post/linux-unix/using-find-xargs-tar-create-huge-archive-cygwin-linux-unix
for more information)
find, tar, and xargs
--------------------
find . -type f -name '*.mp3' -mtime -180 -print0 | xargs -0 tar rvf music.tar
(-print0 helps handle spaces in filenames)
(see
http://alvinalexander.com/mac-os-x/mac-backup-filename-directories-spaces-find-tar-xargs)
find and pax (instead of xargs and tar)
---------------------------------------
find . -type f -name "*html" | xargs tar cvf jw-htmlfiles.tar -
find . -type f -name "*html" | pax -w -f jw-htmlfiles.tar
Note that the code will also be executed if the file does not exist at all. It is fine with
find but in other scenarios (such as globs) should be combined with -h to handle
this case, for instance [ -h "$F" -a ! -e "$F" ] . – Calimo
Apr 18 '17 at 19:50
this seems pretty nice as this only returns true if the file is actually a symlink. But even
with adding -q, readlink outputs the name of the link on linux. If this is the case in
general maybe the answer should be updated with 'readlink -q $F > dev/null'. Or am I
missing something? – zoltanctoth
Nov 8 '11 at 10:55
I'd strongly suggest not to use find -L for the task (see below for
explanation). Here are some other ways to do this:
If you want to use a "pure find " method, it should rather look like this:
find . -xtype l
( xtype is a test performed on a dereferenced link) This may not be
available in all versions of find , though. But there are other options as
well:
You can also exec test -e from within the find command:
find . -type l ! -exec test -e {} \; -print
Even some grep trick could be better (i.e., safer ) than
find -L , but not exactly such as presented in the question (which greps in
entire output lines, including filenames):
find . -type l -exec sh -c 'file -b "$1" | grep -q ^broken' sh {} \; -print
The find -L trick quoted by solo from commandlinefu
looks nice and hacky, but it has one very dangerous pitfall : All the symlinks are followed.
Consider directory with the contents presented below:
$ ls -l
total 0
lrwxrwxrwx 1 michal users 6 May 15 08:12 link_1 -> nonexistent1
lrwxrwxrwx 1 michal users 6 May 15 08:13 link_2 -> nonexistent2
lrwxrwxrwx 1 michal users 6 May 15 08:13 link_3 -> nonexistent3
lrwxrwxrwx 1 michal users 6 May 15 08:13 link_4 -> nonexistent4
lrwxrwxrwx 1 michal users 11 May 15 08:20 link_out -> /usr/share/
If you run find -L . -type l in that directory, all /usr/share/
would be searched as well (and that can take really long) 1 . For a
find command that is "immune to outgoing links", don't use -L .
1 This may look like a minor inconvenience (the command will "just" take long
to traverse all /usr/share ) – but can have more severe consequences. For
instance, consider chroot environments: They can exist in some subdirectory of the main
filesystem and contain symlinks to absolute locations. Those links could seem to be broken
for the "outside" system, because they only point to proper places once you've entered the
chroot. I also recall that some bootloader used symlinks under /boot that only
made sense in an initial boot phase, when the boot partition was mounted as /.
So if you use a find -L command to find and then delete broken symlinks from
some harmless-looking directory, you might even break your system...
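The test -e variant shown above can be tried safely in a scratch directory. This is a small sketch with made-up link names; it only lists broken links and deletes nothing.

```shell
# Scratch directory with one valid and one broken symlink.
work=$(mktemp -d)
touch "$work/target"
ln -s target "$work/good"
ln -s nonexistent "$work/broken"

# -type l selects symlinks without following them;
# test -e follows the link and fails when the target is missing.
found=$(find "$work" -type l ! -exec test -e {} \; -print)

echo "broken links: $found"
```

Since -L is not used, a symlink pointing at a large outside directory would be tested but never traversed.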
I think -type l is redundant since -xtype l will operate as
-type l on non-links. So find -xtype l is probably all you need.
Thanks for this approach. – quornian
Nov 17 '12 at 21:56
Be aware that those solutions don't work for all filesystem types. For example it won't work
for checking if /proc/XXX/exe link is broken. For this, use test -e
"$(readlink /proc/XXX/exe)" . – qwertzguy
Jan 8 '15 at 21:37
@Flimm find . -xtype l means "find all symlinks whose (ultimate) target files
are symlinks". But the ultimate target of a symlink cannot be a symlink, otherwise we can
still follow the link and it is not the ultimate target. Since there is no such symlinks, we
can define them as something else, i.e. broken symlinks. – weakish
Apr 8 '16 at 4:57
@JoóÁdám "which can only be a symbolic link in case it is broken". Give
"broken symbolic link" or "non exist file" an individual type, instead of overloading
l , is less confusing to me. – weakish
Apr 22 '16 at 12:19
The warning at the end is useful, but note that this does not apply to the -L
hack but rather to (blindly) removing broken symlinks in general. – Alois Mahdal
Jul 15 '16 at 0:22
As rozcietrzewiacz has already commented, find -L can have unexpected
consequence of expanding the search into symlinked directories, so isn't the optimal
approach. What no one has mentioned yet is that
find /path/to/search -xtype l
is the more concise, and logically identical command to
find /path/to/search -type l -xtype l
None of the solutions presented so far will detect cyclic symlinks, which is another type
of breakage.
this question addresses portability. To summarize, the portable way to find broken
symbolic links, including cyclic links, is:
find /path/to/search -type l -exec test ! -e {} \; -print
-L Cause the file information and file type (see stat(2)) returned
for each symbolic link to be those of the file referenced by the
link, not the link itself. If the referenced file does not exist,
the file information and type will be for the link itself.
If you need a different behavior whether the link is broken or cyclic you can also use %Y
with find:
$ touch a
$ ln -s a b # link to existing target
$ ln -s c d # link to non-existing target
$ ln -s e e # link to itself
$ find . -type l -exec test ! -e {} \; -printf '%Y %p\n' \
    | while read type link; do
        case "$type" in
            N) echo "do something with broken link $link" ;;
            L) echo "do something with cyclic link $link" ;;
        esac
    done
do something with broken link ./d
do something with cyclic link ./e
Yet another shorthand for those whose find command does not support
-xtype can be derived from this: find . -type l -printf "%Y %p\n" | grep -w
'^N' . As andy beat me to it with the same (basic) idea in his script, I was reluctant
to write it as separate answer. :) – syntaxerror
Jun 25 '15 at 0:28
I use this for my case and it works quite well, as I know the directory to look for broken
symlinks:
find -L $path -maxdepth 1 -type l
and my folder does include a link to /usr/share but it doesn't traverse it.
Cross-device links and those that are valid for chroots, etc. are still a pitfall but for my
use case it's sufficient.
Simple no-brainer answer, which is a variation on OP's version. Sometimes, you just want
something easy to type or remember:
It's always better to wrap the search term (name parameter) in double or single quotes. Not
doing so will seem to work sometimes and give strange results at other times.
3. Limit depth of directory traversal
The find command by default travels down the entire directory tree recursively, which is
time and resource consuming. However, the depth of directory traversal can be specified. For
example, we may not want to go more than 2 or 3 levels down in the subdirectories. This is done
using the maxdepth option.
The second example uses maxdepth of 1, which means it will not go lower than 1 level deep,
i.e. only in the current directory.
This is very useful when we want to do a limited search only in the current directory or max
1 level deep subdirectories and not the entire directory tree, which would take more time.
Just like maxdepth there is an option called mindepth which does what the name suggests,
that is, it will go at least N levels deep before searching for the files.
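Both options can be tried on a small tree. The sketch assumes a find that supports -maxdepth and -mindepth (GNU and BSD find do; they are not in POSIX), and the directory and file names are made up.

```shell
# Scratch tree three levels deep, one text file per level.
work=$(mktemp -d)
mkdir -p "$work/a/b"
touch "$work/top.txt" "$work/a/mid.txt" "$work/a/b/deep.txt"

# -maxdepth 1: only entries directly inside the starting directory.
shallow=$(find "$work" -maxdepth 1 -name '*.txt' | wc -l)

# -mindepth 2: skip the starting directory and its direct entries.
deeper=$(find "$work" -mindepth 2 -name '*.txt' | wc -l)

echo "matches with -maxdepth 1: $shallow"
echo "matches with -mindepth 2: $deeper"
```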
4. Invert match
It is also possible to search for files that do not match a given name or pattern. This is
helpful when we know which files to exclude from the search.
So in the above example we found all files that do not have the extension of php, i.e.
non-php files. The find command also supports the exclamation mark in place of not.
find ./test ! -name "*.php"
5. Combine multiple search criteria
It is possible to use multiple criteria when specifying name and inverting. For example
The above find command looks for files that begin with abc in their names and do not have a
php extension. This is an example of how powerful search expressions can be built with the find
command.
OR operator
When using multiple name criteria, the find command combines them with the AND operator,
which means that only those files which satisfy all criteria will be matched. However, if we
need to perform OR-based matching, the find command has the "-o" switch.
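The two ways of combining name tests can be compared side by side. This is a sketch with invented file names; the parentheses around the -o expression are escaped so that the shell passes them to find unchanged.

```shell
# Scratch directory with a mix of names and extensions.
work=$(mktemp -d)
touch "$work/abc.php" "$work/abc.txt" "$work/xyz.php" "$work/notes.txt"

# Implicit AND plus negation: name starts with abc and is not a php file.
and_not=$(find "$work" -type f -name 'abc*' ! -name '*.php' | wc -l)

# Explicit OR: php files or txt files.
either=$(find "$work" -type f \( -name '*.php' -o -name '*.txt' \) | wc -l)

echo "abc* but not *.php: $and_not"
echo "*.php or *.txt:     $either"
```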
@don_crissti I'll never understand why people prefer random web documentation to the
documentation installed on their systems (which has the added benefit of actually being
relevant to their system). – Kusalananda
Nov 17 '17 at 9:53
@Kusalananda - Well, I can think of one scenario in which people would include a link to a
web page instead of a quote from the documentation installed on their system: they're not on
a linux machine at the time of writing the post... However, the link should point (imo) to
the official docs (hence my comment above, which, for some unknown reason, was deleted by the
mods...). That aside, I fully agree with you: the OP should consult the manual page installed
on their system. – don_crissti
Nov 17 '17 at 12:52
My manual page tends to be from FreeBSD though. Unless I happen to have a Linux VM within
reach. And I have the impression that most questions are GNU/Linux based. – Hennes
Feb 16 at 16:16
The file tool gives you a one-line summary of what kind of file you're looking at,
based on its extension, headers and other cues. This is very handy used with find when
examining a set of unfamiliar files:
$ find . -exec file {} +
.: directory
./hanoi: Perl script, ASCII text executable
./.hanoi.swp: Vim swap file, version 7.3
./factorial: Perl script, ASCII text executable
./bits.c: C source, ASCII text
./bits: ELF 32-bit LSB executable, Intel 80386, version ...
This question already has an answer here: How to delete files older than X hours
I have this command that I run every 24 hours currently.
find /var/www/html/audio -daystart -maxdepth 1 -mtime +1 -type f -name "*.mp3" -exec rm -f {} \;
I would like to run it every 1 hour and delete files that are older than 1 hour.
could I just use -mmin +59?
This question has been asked before and already has an answer. If those answers do not fully address
your question, please ask a new question.
If you are using GNU find (and you most likely are) you can also pass the -delete flag instead of
the -exec rm business. I think that more clearly expresses the intent. Joost Baaij Nov 16 '11 at
10:32
From man find: -mmin n
File's data was last modified n minutes ago.
Also, make sure to test this first! ... -exec echo rm -f '{}' \;
Add the 'echo' so you just see the commands that are going to get
run instead of actually trying them first.
Sean Bright
Wouldn't -mmin 60 only find the files modified exactly 60 minutes ago? I think it needs to be -mmin
+59 or such. Otis Feb 12 '09 at 23:17
I updated based on Otis' comments. Nice catch! Sean Bright Feb 12 '09 at 23:21
Thanks. :) I'm curious if the modification needs to be 60 minutes or greater or if 59m 1s would trip
it. I'm not sure it needs to be that precise for what Abs is doing. Otis Feb 12 '09 at 23:24
I'll let you know in 54 minutes and 12 seconds ;-) Otis++ on a random post of yours Sean Bright
Feb 12 '09 at 23:25
instead of -exec rm -f {} \; you can simply use -delete denis2342 Nov 26 '13 at 9:11
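The echo dry run suggested above can be combined with -mmin +59 as in the sketch below. It assumes a find with -mmin (GNU or BSD), uses a scratch directory instead of /var/www/html/audio, and leaves out -daystart on the assumption that ages should be measured from the current time rather than from the start of the day.

```shell
# Scratch directory with one old and one fresh mp3 file.
work=$(mktemp -d)
touch -t 202001010000 "$work/old.mp3"   # far older than one hour
touch "$work/fresh.mp3"                 # modified just now

# echo prints the rm command that would run; nothing is deleted.
plan=$(find "$work" -maxdepth 1 -type f -name '*.mp3' -mmin +59 -exec echo rm -f {} \;)

echo "$plan"
```

Once the printed list looks right, the echo can be dropped, or with GNU find the whole -exec part can be replaced by -delete as the comments suggest.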
In such cases the UID of the files is often different from the UID of "legitimate" files in polluted directories, and you can probably
use this fact for quick elimination of the tar bomb. But the idea of using the list of files from the tar bomb to eliminate offending
files also works if you observe some precautions: some directories that were created can have the same names as existing directories.
Never do rm in -exec or via xargs without testing.
Notable quotes:
"... You don't want to just rm -r everything that tar tf tells you, since it might include directories that were not empty before unpacking! ..."
"... Another nice trick by @glennjackman, which preserves the order of files, starting from the deepest ones. Again, remove echo when done. ..."
"... One other thing: you may need to use the tar option --numeric-owner if the user names and/or group names in the tar listing make the names start in an unpredictable column. ..."
"... That kind of (antisocial) archive is called a tar bomb because of what it does. Once one of these "explodes" on you, the solutions in the other answers are way better than what I would have suggested. ..."
"... The easiest (laziest) way to do that is to always unpack a tar archive into an empty directory. ..."
"... The t option also comes in handy if you want to inspect the contents of an archive just to see if it has something you're looking for in it. If it does, you can, optionally, just extract the file(s) you want. ..."
This can be piped to xargs directly, but beware : do the deletion very carefully. You don't want to just rm -r
everything that tar tf tells you, since it might include directories that were not empty before unpacking!
You could do
tar tf archive.tar | xargs -d'\n' rm -v
tar tf archive.tar | sort -r | xargs -d'\n' rmdir -v
to first remove all files that were in the archive, and then the directories that are left empty.
sort -r (glennjackman suggested tac instead of sort -r in the comments to the accepted
answer, which also works since tar 's output is regular enough) is needed to delete the deepest directories first; otherwise
a case where dir1 contains a single empty directory dir2 will leave dir1 after the rmdir
pass, since it was not empty before dir2 was removed.
This will generate a lot of
rm: cannot remove `dir/': Is a directory
and
rmdir: failed to remove `dir/': Directory not empty
rmdir: failed to remove `file': Not a directory
Shut this up with 2>/dev/null if it annoys you, but I'd prefer to keep as much information on the process as possible.
And don't do it until you are sure that you match the right files. And perhaps try rm -i to confirm everything. And
have backups, eat your breakfast, brush your teeth, etc.
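The two-pass cleanup can be rehearsed on a throwaway archive before touching real data. The sketch assumes GNU xargs (for -d) and GNU or BSD tar; the archive, the file names and the "precious" file that must survive are all invented for the demo.

```shell
# Build a small tar bomb: no top-level directory, files land in the cwd.
work=$(mktemp -d)
mkdir -p "$work/build/dir1/dir2"
touch "$work/build/top" "$work/build/dir1/dir2/f"
(cd "$work/build" && tar -cf "$work/bomb.tar" top dir1)

# A directory with pre-existing content, polluted by extracting the bomb.
mkdir "$work/victim"
touch "$work/victim/precious"
(cd "$work/victim" && tar -xf "$work/bomb.tar")

# Pass 1 removes the files; pass 2 removes directories, deepest first.
# Errors for directories in pass 1 and for files in pass 2 are expected.
(
  cd "$work/victim" || exit 1
  tar -tf "$work/bomb.tar" | xargs -d '\n' rm -v 2>/dev/null || true
  tar -tf "$work/bomb.tar" | sort -r | xargs -d '\n' rmdir -v 2>/dev/null || true
)

left=$(ls "$work/victim")
echo "left after cleanup: $left"
```

rmdir refuses to remove non-empty directories, which is what protects a directory that already held other files before the archive was unpacked.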
===
List the contents of the tar file like so:
tar tzf myarchive.tar.gz
Then, delete those file names by iterating over that list:
while IFS= read -r file; do echo "$file"; done < <(tar tzf myarchive.tar.gz)
This will still just list the files that would be deleted. Replace echo with rm if you're really sure these are the ones you want
to remove. And maybe make a backup to be sure.
In a second pass, remove the directories that are left over:
while IFS= read -r file; do rmdir "$file"; done < <(tar tzf myarchive.tar.gz)
This prevents directories that still contain other files from being deleted if they already existed before.
Another nice trick by @glennjackman, which preserves the order of files, starting from the deepest ones. Again, remove echo
when done.
tar tf myarchive.tar | tac | xargs -d'\n' echo rm
This could then be followed by the normal rmdir cleanup.
Here's a possibility that will take the extracted files and move them to a subdirectory, cleaning up your main folder.
#!/usr/bin/perl -w
use strict;
use Getopt::Long;

my $clean_folder = "clean";
my $DRY_RUN;
die "Usage: $0 [--dry] [--clean=dir-name]\n"
    if ( ! GetOptions("dry!"    => \$DRY_RUN,
                      "clean=s" => \$clean_folder));

# Protect the 'clean_folder' string from shell substitution
$clean_folder =~ s/'/'\\''/g;

# Process the "tar tv" listing and output a shell script.
print "#!/bin/sh\n" if ( ! $DRY_RUN );
while (<>)
{
    chomp;

    # Strip out permissions string and the directory entry from the 'tar' list
    my $perms  = substr($_, 0, 10);
    my $dirent = substr($_, 48);

    # Drop entries that are in subdirectories
    next if ( $dirent =~ m:/.: );

    # If we're in "dry run" mode, just list the permissions and the directory
    # entries.
    #
    if ( $DRY_RUN )
    {
        print "$perms|$dirent\n";
        next;
    }

    # Emit the shell code to clean up the folder
    $dirent =~ s/'/'\\''/g;
    print "mv -i '$dirent' '$clean_folder'/.\n";
}
Save this to the file fix-tar.pl and then execute it like this:
$ tar tvf myarchive.tar | perl fix-tar.pl --dry
This will confirm that your tar list is like mine. You should get output like:
-rw-rw-r--|batch
-rw-rw-r--|book-report.png
-rwx------|CaseReports.png
-rw-rw-r--|caseTree.png
-rw-rw-r--|tree.png
drwxrwxr-x|sample/
If that looks good, then run it again like this:
$ mkdir cleanup
$ tar tvf myarchive.tar | perl fix-tar.pl --clean=cleanup > fixup.sh
The fixup.sh script will be the shell commands that will move the top-level files and directories into a "clean"
folder (in this instance, the folder called cleanup). Have a peek through this script to confirm that it's all kosher.
If it is, you can now clean up your mess with:
$ sh fixup.sh
I prefer this kind of cleanup because it doesn't destroy anything that isn't already destroyed by being overwritten by that initial
tar xv.
Note: if that initial dry run output doesn't look right, you should be able to fiddle with the numbers in the two substr
function calls until they look proper. The $perms variable is used only for the dry run so really only the $dirent
substring needs to be proper.
One other thing: you may need to use the tar option --numeric-owner if the user names and/or group names
in the tar listing make the names start in an unpredictable column.
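If the column offsets prove too fiddly, one alternative (not the method of the script above) is to take the names from the plain tar tf listing, which has no columns at all, and keep only the first path component of each entry. A minimal sketch, assuming names without newlines; the function name top_level_entries is hypothetical:

```shell
# top_level_entries: reads "tar tf" output on stdin and prints each
# distinct top-level name once, in order of first appearance.
top_level_entries() {
    # Drop a leading "./", cut everything from the first "/" onwards,
    # then skip empty lines and names that were already printed.
    sed -e 's|^\./||' -e 's|/.*$||' | awk 'length($0) && !seen[$0]++'
}
```

It would be used as tar tf myarchive.tar | top_level_entries, and the resulting list can be reviewed before anything is moved.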
===
That kind of (antisocial) archive is called a tar bomb because of what it does. Once one of these "explodes" on you, the solutions
in the other answers are way better than what I would have suggested.
The best "solution", however, is to prevent the problem in the first place.
The easiest (laziest) way to do that is to always unpack a tar archive into an empty directory. If it includes a top
level directory, then you just move that to the desired destination. If not, then just rename your working directory (the one that
was empty) and move that to the desired location.
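The empty-directory habit described above can be sketched as a small shell function. The name extract_safely and the naming of the working directory are assumptions made for this illustration; the sketch also assumes that no top-level name in the archive contains a newline.

```shell
# extract_safely ARCHIVE DEST   (hypothetical helper)
# Unpacks ARCHIVE into a fresh, empty working directory, then turns the
# result into DEST without touching anything else in the current folder.
extract_safely() (
    archive=$1
    dest=$2
    if [ -e "$dest" ]; then
        echo "refusing to overwrite $dest" >&2
        exit 1
    fi
    work=$dest.unpack.$$
    mkdir "$work" || exit 1              # fails if it already exists
    tar -xf "$archive" -C "$work" || exit 1
    # Count the top-level entries, dotfiles included.
    count=$(ls -A "$work" | wc -l | tr -d ' ')
    only=$(ls -A "$work")
    if [ "$count" -eq 1 ] && [ -d "$work/$only" ]; then
        # A single top-level directory: move it to the destination.
        mv "$work/$only" "$dest" && rmdir "$work"
    else
        # A tar bomb: the working directory itself becomes the destination.
        mv "$work" "$dest"
    fi
)
```

For example, extract_safely myarchive.tar myproject leaves everything under myproject, whether or not the archive had a top-level directory of its own.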
If you just want to get it right the first time, you can run tar -tvf archive-file.tar | less and it will list the contents of
the archive so you can see how it is structured and then do what is necessary to extract it to the desired location to start with.
The t option also comes in handy if you want to inspect the contents of an archive just to see if it has something you're
looking for in it. If it does, you can, optionally, just extract the file(s) you want.
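Checking for a file and extracting only that file can be put together roughly as follows. This is a sketch under stated assumptions: extract_one is a made-up name, the member name must match the listing exactly, and names that start with a dash are not handled.

```shell
# extract_one ARCHIVE MEMBER DEST   (hypothetical helper)
# Extracts a single member into DEST, but only if the listing contains it.
extract_one() (
    archive=$1
    member=$2
    dest=$3
    # -x matches the whole line, -F treats the name as a fixed string.
    if ! tar -tf "$archive" | grep -qxF -- "$member"; then
        echo "$member: not found in $archive" >&2
        exit 1
    fi
    mkdir -p "$dest" || exit 1
    tar -xf "$archive" -C "$dest" "$member"
)
```

For example, extract_one archive-file.tar docs/README /tmp/peek would unpack only docs/README under /tmp/peek (both names here are placeholders).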
I would like to know how to change a UID (user ID) and GID (group ID), together with the ownership of all the
files that belong to them, on a Linux system. Say I want to change the UID from 1005 to 2005 and the GID from 1005 to 2005.
How do I make this change for the files and directories that belong to the user and group?
The procedure is pretty simple:
First, assign a new UID to the user with the usermod command.
Second, assign a new GID to the group with the groupmod command.
Finally, use the chown and chgrp commands to update the files that still carry the old UID and GID. You can
automate this with the help of the find command.
It cannot be stressed enough how important it is to make a backup of your system before you do
this. Make a backup. Let us say:
Our sample user name: foo
Foo's old UID: 1005
Foo's new UID: 2005
Our sample group name: foo
Foo's old GID: 2000
Foo's new GID: 3000
Commands
To assign a new UID to user called foo, enter: # usermod -u 2005 foo
To assign a new GID to group called foo, enter: # groupmod -g 3000 foo
Please note that files located in the user's home directory have their owning UID changed automatically
by usermod. However, files outside the user's home directory, and files that carry the old GID,
need to be changed manually. To change the files that still have the old GID and UID, enter:
# find / -group 2000 -exec chgrp -h foo {} \;
# find / -user 1005 -exec chown -h foo {} \;
The -exec action runs the chgrp or chown command on each matching file. The -h option
passed to chgrp/chown makes it change a symbolic link itself instead of the file the link points to. Use the
following commands to verify the result:
# ls -l /home/foo/
# id -u foo
# id -g foo
# grep foo /etc/passwd
# grep foo /etc/group
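In line with the advice to inspect the file set before running any -exec that changes things, the chown step can be wrapped so that it only prints what it would do unless asked to act. This is a sketch, not the article's procedure: the name reown and the --force switch are made up, and -xdev (stay on one filesystem) is an added precaution, so each mounted filesystem has to be processed separately.

```shell
# reown ROOT OLD_UID NEW_OWNER [--force]   (hypothetical helper)
# Without --force it prints one "chown -h ..." line per matching file.
reown() (
    root=$1
    old_uid=$2
    new_owner=$3
    mode=${4:-}
    if [ "$mode" = --force ]; then
        # "{} +" hands many files to each chown call instead of one.
        find "$root" -xdev -user "$old_uid" -exec chown -h "$new_owner" {} +
    else
        find "$root" -xdev -user "$old_uid" -exec echo chown -h "$new_owner" {} \;
    fi
)
```

With the sample values above, reown / 1005 foo shows the affected files, and reown / 1005 foo --force changes them; the same pattern with -group and chgrp would cover the GID.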