One way of estimating the relative importance of the tasks that folk use Linux for would be to count the number of different applications that have been written to perform each of those tasks. Given the rather large number of programs that exist for "finding stuff", we might conclude that the thing users do most often is to lose it in the first place!
In this tutorial you'll learn how to find files on the command line by specifying all sorts of different search criteria.
Searches based on file name
The simplest kinds of search are those based on file name, and the shell's filename wildcard matching provides a starting point for this. For example, the command
$ ls *invoice*
will list all file names in the current directory containing the string invoice. Not too impressive? Why not try something like:
$ ls */*invoice*
which will list files with invoice in the name in any subdirectories of your current directory? Then you can extend the idea to whatever level you want, maybe using something like this:
$ ls *invoice* */*invoice* */*/*invoice*
If you want to search the entire file system for a file based on a file name, the slocate command provides a solution. For example,
$ slocate invoice
will find all files with names that contain the string invoice. You'll find that slocate is lightning fast because it uses a pre-built index of filenames. This index is built using the program updatedb (the slocate command with the -u option does the same thing) which is usually run once a day via cron or anacron.
On Ubuntu installations, the slocate database is /var/lib/slocate/slocate.db. The one downside of slocate is that it won't find files that were created since updatedb was last run.
Don't give me bad news...
Among the output from find you'll often notice a bunch of error messages relating to directories we don't have permission to search. Sometimes, there can be so many of these messages that they entirely swamp the 'good' output. You can easily suppress the error messages by redirecting them to the 'black hole' device, /dev/null. To do this, simply append 2> /dev/null to the command line.
S for secure
In case you were wondering, the s in slocate stands for 'secure'. Here's the scoop on this: the updatedb program (the one that builds the index) runs with root privilege, so it can be sure of seeing all the files. This means that potentially there will be files listed in the slocate.db index that ordinary users should not be able to see. These might be system files, or they might be private files belonging to other users.
The slocate index also keeps a record of the ownership and permissions on the files, and the slocate program is careful not to show you file names that you shouldn't be able to see. There was (I think) an older program called locate that wasn't this smart, but on a modern Linux distribution, slocate and locate are links to the same program.
Specialised search: which and whereis
There are a couple of more specialised search tools, whereis and which, that should be mentioned for the sake of completeness. The program whereis searches for the executable, source code and documentation (manual page) for a specified command. It looks in a pre-defined list of directories. For example:
$ whereis ls
ls: /bin/ls /usr/share/man/man1/ls.1.gz
tells us the location of the executable (binary) and the man page for the ls command. The which command is even more specialised. It simply looks up a specified command on our search path, reporting where it would first find it. For example:
$ which vi
/usr/bin/vi
tells us that the vi command is in /usr/bin/vi. Effectively, this command answers the question "If I entered the command vi, which program would actually get run?"
Searching on steroids: find
At the other end of the scale is the top-of-the-range search tool, find. In addition to filename-based searching, find is able to locate files based on ownership, access permissions, time of last access, size, and much else besides. Of course, the price you pay for all this flexibility is a rather perplexing command syntax. We'll dive into the details later, but here's an example to give you the idea:
$ find /etc -name '*.conf' -user cupsys -print
find: /etc/ssl/private: Permission denied
find: /etc/cups/ssl: Permission denied
/etc/cups/cupsd.conf
/etc/cups/printers.conf
In this example, find is searching in (and below) the directory /etc for files whose name ends in .conf and that are owned by the cupsys account.
Generally, the syntax of the find command is of the form:
$ find <where to look> <what to look for> <what to do with it>
The "where to look" part is simply a space-separated list of the directories we want find to search. For each one, find will recursively descend into every directory beneath those specified. Our table below, titled Search Criteria For Find, lists the most useful search criteria (the "what to look for" part of the command).
Search criteria for find
-name string | File name matches string (wildcards are allowed) | -name '*.jpg' |
-iname string | Same as -name but not case sensitive | -iname '*tax*' |
-user username | File is owned by username | -user chris |
-group groupname | File has group groupname | -group admin |
-type x | File is of type x, one of: f (regular file), d (directory), l (symbolic link), c (character device), b (block device), p (named pipe, FIFO) | -type d |
-size +N | File is bigger than N 512-byte blocks (use suffix c for bytes, k for kilobytes, M for megabytes) | -size +100M |
-size -N | File is smaller than N blocks (use suffix c for bytes, k for kilobytes, M for megabytes) | -size -50c |
-mtime -N | File was last modified less than N days ago | -mtime -1 |
-mtime +N | File was last modified more than N days ago | -mtime +14 |
-mmin -N | File was last modified less than N minutes ago | -mmin -10 |
-perm mode | The file's permissions exactly match mode. The mode can be specified in octal, or using the same symbolic notation that chmod supports | -perm 644 |
-perm -mode | All of the permission bits specified by mode are set. | -perm -ugo=x |
-perm /mode | Any of the permission bits specified by mode is set | -perm /011 |
And the smaller table Actions For Find, below, lists the most useful actions (the "what to do with it" part of the command). Neither of these is a complete list, so check the manual page for the full story.
Actions for find
-print | Print the full pathname of the file to standard output |
-ls | Give a full listing of the file, equivalent to running ls -dils |
-delete | Delete the file |
-exec command | Execute the specified command. All following arguments to find are taken to be arguments to the command until a ';' is encountered. The string {} is replaced by the current file name. |
If no other action is specified, the -print action is assumed, with the result that the pathname of the selected file is printed (or to be more exact, written to standard output). This is a very common use of find. I should perhaps point out that many of the search criteria supported by find are really intended to help in rounding up files to perform some administrative operation on them (make a backup of them, perhaps) rather than helping you find odd files you happen to have mislaid.
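To make the three parts of the general form concrete, here is a minimal sketch that runs against a scratch directory. The directory layout and file names are invented for the illustration, and mktemp is assumed to be available.

```shell
# Build a throwaway tree so the example does not depend on your own files.
d=$(mktemp -d)
mkdir -p "$d/etc/cups" "$d/var/log"
touch "$d/etc/cups/cupsd.conf" "$d/etc/hosts" "$d/var/log/app.conf"

# where to look:      two starting directories
# what to look for:   names ending in .conf
# what to do with it: print the path (sort just makes the order predictable)
found=$(find "$d/etc" "$d/var" -name '*.conf' -print | sort)
echo "$found"

rm -rf "$d"
```

Leaving out -print here should give the same listing, since it is the default action.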
Why is this not a command?
The which command can - occasionally - give a misleading answer, if the command in question also happens to be a built-in command of the bash shell. For example:
$ which kill
/bin/kill
tells us that the kill command lives in /bin. However, kill is also a built-in bash command, so if I enter a command like
$ kill -HUP 1246
it will actually run the shell's built-in kill and not the external command.
To find out whether a command is recognised as a shell built-in, an alias, or an external command, you can use the type command, like this:
$ type kill
kill is a shell builtin
Learning by Example
It takes a while to get your head around all this syntax, so maybe a few examples would help ...
Example 1 This is a simple name-based search, starting in my home directory and looking for all PowerPoint (.ppt) files. Notice we've put the filename wildcard expression in quotes to stop the shell trying to expand it. We want to pass the argument '*.ppt' directly and let find worry about the wildcard matching.
$ find ~ -name '*.ppt'
Example 2 You can supply multiple "what to look for" tests to find and by default they will be logically AND-ed, that is, they must all be true in order for the file to match. Here, we look for directories under /var that are owned by daemon:
$ find /var -type d -user daemon
Example 3 This shows how you can OR tests together rather than AND-ing them. Here, we're looking in /etc for files that are either owned by the account cupsys or are completely empty:
$ find /etc -user cupsys -or -size 0
Example 4 This uses the '!' operator to reverse the sense of a test. Here, we're searching /usr/bin for files that aren't owned by root:
$ find /usr/bin ! -user root
Example 5 The tests that make numeric comparisons are especially confusing. Just remember that '+' in front of a number means 'more than', '-' means 'less than', and if there is no '+' or '-', find looks for an exact match. These three examples search for files that have been modified less than 10 minutes ago, more than 1 year ago, and exactly 4 days ago. (This third example is probably not very useful.)
$ find ~ -mmin -10
$ find ~ -mtime +365
$ find ~ -mtime 4
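If you want to check the '+' and '-' conventions without waiting a year, a small sketch like the following can help. It assumes GNU touch (for the -d 'N days ago' syntax) and uses made-up file names in a scratch directory.

```shell
d=$(mktemp -d)
touch "$d/new.txt"                   # modified just now
touch -d '3 days ago' "$d/old.txt"   # GNU touch: backdate the timestamp

# -type f keeps the scratch directory itself out of the results.
recent=$(find "$d" -type f -mmin -10)   # less than 10 minutes ago
older=$(find "$d" -type f -mtime +1)    # more than 1 day ago
echo "recent: $recent"
echo "older:  $older"

rm -rf "$d"
```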
Example 6 Perhaps the most confusing tests of all are those made on a file's access permissions. This example isn't too bad; it looks for an exact match on the permissions 644 (which would be represented symbolically by ls -l as rw-r--r--):
$ find ~ -perm 644
Example 7 Here we look for files that are writeable by anybody (that is, either the owner, the group, or rest-of-world). The two examples are equivalent; the first uses the traditional octal notation, the second uses the same symbolic notation for representing permissions that chmod uses:
$ find ~ -perm /222
$ find ~ -perm /ugo=w
Example 8 Here we look for files that are writeable by everybody (that is, by the owner and the group and the rest-of-world):
$ find ~ -perm -222
$ find ~ -perm -ugo=w
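The difference between the '-' form (all of the listed bits) and the '/' form (any of the listed bits) is easy to mix up, so here is a small sketch that tries both on two scratch files. The file names are hypothetical, and the '/' form assumes GNU find.

```shell
d=$(mktemp -d)
touch "$d/owner_only" "$d/everyone"
chmod 644 "$d/owner_only"   # rw-r--r--: only the owner may write
chmod 666 "$d/everyone"     # rw-rw-rw-: owner, group and others may write

all_bits=$(find "$d" -type f -perm -222 | sort)   # every write bit is set
any_bit=$(find "$d" -type f -perm /222 | sort)    # at least one write bit is set
echo "all write bits: $all_bits"
echo "any write bit:  $any_bit"

rm -rf "$d"
```

Only the 666 file should satisfy -perm -222, while both files should satisfy -perm /222.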
Example 9 So far we've just used the default -print action of find to display the names of the matching files. Here's an example that uses the -exec option to move all matching files into a backup directory. There are a couple of points to note here. First, the notation {} gets replaced by the full pathname of the matching file, and the ';' is used to mark the end of the command that follows -exec. Remember: ';' is also a shell metacharacter, so we need to put the backslash in front to prevent the shell interpreting it.
$ find ~ -mtime +365 -exec mv {} /tmp/mybackup \;
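Before pointing -exec mv at real files, it may be worth rehearsing on a scratch tree. The following sketch uses invented directory and file names and selects by name rather than by age, purely to keep it self-contained.

```shell
d=$(mktemp -d)
mkdir "$d/docs" "$d/backup"
touch "$d/docs/a.ppt" "$d/docs/b.ppt" "$d/docs/notes.txt"

# One mv is run per matching file; {} is replaced by that file's path.
find "$d/docs" -name '*.ppt' -exec mv {} "$d/backup" \;

moved=$(ls "$d/backup")
left=$(ls "$d/docs")
echo "moved: $moved"
echo "left:  $left"

rm -rf "$d"
```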
Never mind the file name, what's in the file?
As we've seen, tools such as find can track down files based on file name, size, ownership, timestamps, and much else, but find cannot select files based on their content. It turns out that we can do some quite nifty content-based searching using grep in conjunction with the shell's wildcards. This example is taken from my personal file system:
$ grep -l Hudson */*
Desktop/suse_book_press_release.txt
google-earth/README.linux
Mail/inbox.ev-summary
Mail/sent-mail.ev-summary
snmp_training/enterprise_mib_list
Here, we're asking grep to report the names of the files containing a match for the string Hudson. The wildcard notation */* is expanded by the shell to a list of all files that are one level below the current directory. If we wanted to be a bit more selective on the file name, we could do something like:
$ grep -l Hudson */*.txt
Desktop/search_tools.txt
Desktop/suse_book_press_release.txt
which would only search in files with names ending in .txt. In principle you could extend the search to more directory levels, but in practice you may find that the number of file names matched by the shell exceeds the number of arguments that can appear in the argument list, as happened when I tried it on my system:
$ grep -l Hudson */* */*/*
bash: /bin/grep: Argument list too long
A more powerful approach to content-based searching is to use grep in conjunction with find. This example shows a search for files under my home directory ('~') whose names end in .txt, that contain the string Hudson.
$ find ~ -name '*.txt' -exec grep -q Hudson {} \; -print
/home/chris/Desktop/search_tools.txt
/home/chris/Desktop/suse_book_press_release.txt
This approach does not suffer from the argument list overflow problem that our previous example suffered from. Remember, too, that find is capable of searching on many more criteria than just file name, and grep is capable of searching for regular expressions, not just fixed text, so there is a lot more power here than this simple example suggests.
If you're unclear about the syntax of this example, read The truth about find, below. In this example, the predicate -exec grep -q Hudson {} \; returns true if grep finds a match for the string Hudson in the specified file, and false if not. If the predicate is false, find does not continue to evaluate any following expressions, that is, it does not execute the -print action.
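To see the grep predicate acting as a filter, here is a minimal sketch on a scratch directory. The file names and contents are made up for the illustration.

```shell
d=$(mktemp -d)
printf 'Hudson River notes\n' > "$d/match.txt"
printf 'nothing to see\n'     > "$d/other.txt"
printf 'Hudson again\n'       > "$d/match.log"

# -print is reached only when -name matched AND grep -q exited successfully.
hits=$(find "$d" -name '*.txt' -exec grep -q Hudson {} \; -print)
echo "$hits"

rm -rf "$d"
```

match.log contains the string but fails the -name test, and other.txt passes the -name test but fails the grep predicate, so only match.txt should be printed.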
Finding a file containing a particular text string on a Linux server
grep "text string to search" directory-path
Examples
For example, to search for the string redeem reward in all text files in the /home/tom directory, use:
$ grep "redeem reward" /home/tom/*.txt
Task: Search all subdirectories recursively
You can search for a text string in all files under each directory, recursively, with the -r option:
$ grep -r "redeem reward" /home/tom
Task: Only print filenames
By default, the grep command prints the matching lines. You can pass the -H option to print the filename for each match:
$ grep -H -r "redeem reward" /home/tom
Output:
... filename.txt: redeem reward ...
To print just the filename, use the cut command as follows:
$ grep -H -R vivek /etc/* | cut -d: -f1
Output:
... filename.txt ...
Find Files Containing Text
Find files that contain a text string:
grep -lir "some text" *
The -l switch outputs only the names of files in which the text occurs (instead of each line containing the text), the -i switch ignores the case, and the -r descends into subdirectories.
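A quick way to convince yourself of what -l, -i and -r do together is to try them on a scratch directory. The sketch below uses hypothetical file names and contents, and passes an explicit directory instead of * so that it does not depend on where it is run.

```shell
d=$(mktemp -d)
mkdir -p "$d/sub"
printf 'Please REDEEM REWARD today\n' > "$d/sub/offer.txt"
printf 'unrelated text\n'             > "$d/readme.txt"

# -l: names only, -i: ignore case, -r: descend into subdirectories
files=$(grep -lir "redeem reward" "$d")
echo "$files"

rm -rf "$d"
```

Because -l already prints each matching file name once, no cut post-processing is needed here.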
Find syntax
find [-H] [-L] [-P] path... [expression]
The three options control how the find command should treat symbolic links. The default behaviour is never to follow symbolic links. This can be explicitly specified using the -P flag. The -L flag will cause the find command to follow symbolic links. The -H flag will only follow symbolic links while processing the command line arguments. These flags are not available with some older versions of find.
At least one path must precede the expression.
find is capable of interpreting wildcards internally, and commands must be constructed carefully in order to control shell globbing.
Expression elements are whitespace-separated and evaluated from left to right. They can contain logical elements such as AND (-a) and OR (-o) as well as more complex predicates.
The GNU find has a large number of additional features not specified by POSIX.
POSIX protection from infinite output
Real-world filesystems often contain looped structures created through the use of hard or soft links. The POSIX standard requires that:
The find utility shall detect infinite loops; that is, entering a previously visited directory that is an ancestor of the last file encountered. When it detects an infinite loop, find shall write a diagnostic message to standard error and shall either recover its position in the hierarchy or terminate.
Examples
From current directory
find . -name 'my*'
This searches in the current directory (represented by the dot character) and below it, for files and directories with names starting with my. The quotes prevent shell expansion; without them the shell would replace my* with the list of files whose names begin with my in the current directory. In newer versions of the program, the directory may be omitted, and it will imply the current directory.
Note that on Red Hat Linux 9 the unquoted form find . -name my* fails with the error "find: paths must precede expression" when the shell expands my* to more than one file name. The quoted form find . -name "my*" works fine.
Files only
find . -name "my*" -type f
This limits the results of the above search to only regular files, therefore excluding directories, special files, pipes, symbolic links, etc. my* is enclosed in quotes as otherwise the shell would replace it with the list of files in the current directory starting with my...
Commands
The previous examples created listings of results because, by default, find executes the '-print' action. (Note that early versions of the find command had no default action at all; therefore the resulting list of files would be discarded, to the bewilderment of users.)
find . -name "my*" -type f -ls
This prints extended file information.
Search all directories
find / -name "myfile" -type f -print
This searches every directory on the computer for a file with the name myfile and prints its path to the screen. It is generally not a good idea to look for data files this way. This can take a considerable amount of time, so it is best to specify the directory more precisely. Some operating systems may mount dynamic filesystems that are not congenial to find.
Search all but one directory subtree
find / -path excluded_path -prune -o -type f -name myfile -print
This searches every folder on the computer except the subtree excluded_path (full path including the leading /), for a file with the name myfile. It will not detect directories, devices, links, doors, or other "special" filetypes.
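The -prune idiom can be tried safely on a scratch tree first. In this sketch the directory names keep and skip are invented, and the excluded path is given in full because -path is matched against the whole path that find builds from its starting point.

```shell
d=$(mktemp -d)
mkdir -p "$d/keep" "$d/skip"
touch "$d/keep/myfile" "$d/skip/myfile"

# If the path is the excluded subtree, prune it (do not descend);
# otherwise (-o) apply the remaining tests and print matches.
kept=$(find "$d" -path "$d/skip" -prune -o -type f -name myfile -print)
echo "$kept"

rm -rf "$d"
```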
Specify a directory
find /home/weedly -name "myfile" -type f -print
This searches for files named myfile in the /home/weedly directory, the home directory for userid weedly. You should always specify the directory to the deepest level you can remember.
Search several directories
find local /tmp -name mydir -type d -print
This searches for directories named mydir in the local subdirectory of the current working directory and the /tmp directory.
Ignore errors
If you're doing this as a user other than root, you might want to ignore permission denied (and any other) errors. Since errors are printed to stderr, they can be suppressed by redirecting stderr to /dev/null. The following example shows how to do this in the bash shell:
find / -name "myfile" -type f -print 2>/dev/null
If you are a csh or tcsh user, you cannot redirect stderr without redirecting stdout as well. You can use sh to run the find command to get around this. The whole command must be passed to sh -c as a single quoted string:
sh -c 'find / -name "myfile" -type f -print 2>/dev/null'
An alternate method when using csh or tcsh is to pipe the output from stdout and stderr into a grep command. This example shows how to suppress lines that contain permission denied errors.
find . -name "myfile" |& grep -v "Permission denied"
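For sh-compatible shells, the effect of the redirection can be seen in a small sketch. Rather than relying on a permission error (which would not appear when running as root), it provokes an error by naming a starting directory that does not exist; that path is purely hypothetical.

```shell
d=$(mktemp -d)
touch "$d/myfile"
missing="$d/no-such-dir"    # hypothetical path that does not exist

# Keep stdout, discard the error message. find exits non-zero because of
# the missing directory, hence the "|| true".
out=$(find "$d" "$missing" -name myfile -type f -print 2>/dev/null) || true

# The reverse: capture only stderr, to show a message really was produced.
err=$(find "$d" "$missing" -name myfile -type f -print 2>&1 >/dev/null) || true

echo "stdout: $out"
echo "stderr: $err"

rm -rf "$d"
```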
Find any one of differently named files
find . \( -name "*jsp" -o -name "*java" \) -type f -ls
The -ls option prints extended information, and the example finds any file whose name ends with either 'jsp' or 'java'. Note that the parentheses are required. Also note that the operator -or can be abbreviated as -o. The "and" operator is assumed where no operator is given. In many shells the parentheses must be escaped with a backslash, "\(" and "\)", to prevent them from being interpreted as special shell characters. The -ls option and the -or operator are not available on all versions of find.
Execute an action
find /var/ftp/mp3 -name "*.mp3" -type f -exec chmod 644 {} \;
This command changes the permissions of all files with a name ending in .mp3 in the directory /var/ftp/mp3 and below. The action is carried out by specifying the option -exec chmod 644 {} \; in the command. For every file whose name ends in .mp3, the command chmod 644 {} is executed, replacing {} with the name of the file. The semicolon (backslashed to avoid the shell interpreting it as a command separator) indicates the end of the command. Permission 644, usually shown as rw-r--r--, gives the file owner permission to read and write the file, while other users have read-only access. In some shells, the {} must be quoted.
Note that the command itself should *not* be quoted; otherwise you get error messages like
find: echo "mv ./3bfn rel071204": No such file or directory
which means that find is trying to run a file called 'echo "mv ./3bfn rel071204"' and failing.
If running under Windows, don't include the backslash before the semicolon:
find . -exec grep blah {} ;
If you will be executing a command over many results, it is more efficient to pipe the results to the xargs command instead. xargs passes many file names to each invocation of the command rather than starting a new process for every file, so it copes better with long lists. The -print0 option of find can be used with this.
The following command will ensure that filenames with whitespaces are passed to the executed COMMAND without being split up by the shell. It looks complicated at first glance, but is widely used.
find . -print0 | xargs -0 COMMAND
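As a concrete instance of this pattern, the sketch below uses ls -d as a stand-in for COMMAND and a made-up file name containing a space. It assumes a find and xargs that support -print0 and -0 (GNU and BSD versions do).

```shell
d=$(mktemp -d)
touch "$d/plain.txt" "$d/with space.txt"

# Names are separated by NUL bytes, so "with space.txt" reaches ls as a
# single argument and ls prints exactly one line per file.
count=$(find "$d" -type f -print0 | xargs -0 ls -d | wc -l)
echo "files listed: $count"

rm -rf "$d"
```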
The list of files generated by find (whilst it is being generated) is simultaneously piped to xargs, which then executes COMMAND with the files as arguments. See xargs for more examples and options.
Delete files and directories
Delete empty files and directories and print the names
find /foo -empty -delete -print
Delete empty files
find /foo -type f -empty -delete
Delete empty directories
find /foo -type d -empty -delete
Delete files and directories (if empty) named bad
find /foo -name bad -delete
Warning: -delete should be used with other operators such as -empty or -name.
find /foo -delete (this deletes everything in foo)
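Given how destructive -delete can be, one cautious habit is to run the same expression with -print first and only then swap in -delete. The sketch below does this on a scratch tree with invented names; -mindepth in the final listing is a GNU find option.

```shell
d=$(mktemp -d)
mkdir "$d/emptydir" "$d/fulldir"
touch "$d/fulldir/empty.txt"
printf 'data\n' > "$d/fulldir/data.txt"

# Preview: list what the expression selects before anything is removed.
find "$d" -empty -print

# Same expression, now deleting the empty file and the empty directory.
find "$d" -empty -delete

# What survives: the non-empty directory and its non-empty file.
remaining=$(find "$d" -mindepth 1 | sort)
echo "$remaining"

rm -rf "$d"
```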
Search for a string
This command will search for a string in all files from the /tmp directory and below:
find /tmp -exec grep "search string" '{}' /dev/null \; -print
The /dev/null argument is used to show the name of the file before the text that is found. Without it, only the text found is printed. An equivalent mechanism is to use the "-H" or "--with-filename" option to grep:
find /tmp -exec grep -H "search string" '{}' \; -print
GNU grep can be used on its own to perform this task:
grep -r "search string" /tmp
Example of search for "LOG" in jsmith's home directory
find ~jsmith -exec grep "LOG" '{}' /dev/null \; -print
/home/jsmith/scripts/errpt.sh:cp $LOG $FIXEDLOGNAME
/home/jsmith/scripts/errpt.sh:cat $LOG
/home/jsmith/scripts/title:USER=$LOGNAME
Example of search for the string "ERROR" in all XML files in the current directory and all sub-directories
find . -name "*.xml" -exec grep "ERROR" '{}' \; -print
The double quotes (" ") surrounding the search string and single quotes (' ') surrounding the braces are optional in this example, but needed to allow spaces and other special characters in the string.
Search for all files owned by a user
find . -user <userid>
Search in case insensitive mode
find . -iname "MyFile*"
If the -iname switch is not supported on your system then workaround techniques may be possible such as:
find . -name "[mM][yY][fF][iI][lL][eE]*"
This uses Perl to build the above command for you:
echo "'MyFile*'" |perl -pe 's/([a-zA-Z])/[\L\1\U\1]/g;s/(.*)/find . -name \1/'|sh
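To check that the bracket workaround selects the same files as -iname, the two can be compared on a scratch directory. The file names below are made up, and -iname is assumed to be available (it is in GNU and BSD find).

```shell
d=$(mktemp -d)
touch "$d/MyFile.txt" "$d/myfile.bak" "$d/other.txt"

with_iname=$(find "$d" -iname 'myfile*' | sort)
with_brackets=$(find "$d" -name '[mM][yY][fF][iI][lL][eE]*' | sort)

echo "$with_iname"
[ "$with_iname" = "$with_brackets" ] && echo "both forms agree"

rm -rf "$d"
```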
Search files by size
Example of searching files with size between 100 kilobytes and 500 kilobytes.
find . -size +100k -a -size -500k
Example of searching empty files.
find . -size 0k
Example of searching non-empty files.
find . -not -size 0k
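The size tests can be exercised on files of known size. This sketch creates four hypothetical files with head -c reading from /dev/zero and then applies the range test and the empty-file test.

```shell
d=$(mktemp -d)
head -c 10240  /dev/zero > "$d/small.bin"    # 10 KiB
head -c 204800 /dev/zero > "$d/medium.bin"   # 200 KiB
head -c 614400 /dev/zero > "$d/large.bin"    # 600 KiB
: > "$d/empty.bin"                           # 0 bytes

between=$(find "$d" -type f -size +100k -a -size -500k)
empty=$(find "$d" -type f -size 0)
echo "between 100k and 500k: $between"
echo "empty: $empty"

rm -rf "$d"
```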
Search files by name and size
find /usr/src -not \( -name "*,v" -o -name ".*,v" \) -print
This command will search in the /usr/src directory and all subdirectories. All files that are of the form '*,v' and '.*,v' are excluded. Important arguments to note are:
- -not means the negation of the expression that follows.
- \( means the start of a complex expression.
- \) means the end of a complex expression.
- -o means a logical or of a complex expression. In this case the complex expression is all files like '*,v' or '.*,v'.
for file in `find /opt \( -name error_log -o -name 'access_log' -o -name 'ssl_engine_log' -o -name 'rewrite_log' -o -name 'catalina.out' \) -size +300000k -a -size -5000000k`; do cat /dev/null > $file; done
The units should be one of [bckw], 'b' means 512-byte blocks, 'c' means byte, 'k' means kilobytes and 'w' means 2-byte words. The size does not count indirect blocks, but it does count blocks in sparse files that are not actually allocated.
Operators
Operators can be used to enhance the expressions of the find command. Operators are listed in order of decreasing precedence:
- ( expr ) Force precedence.
- ! expr True if expr is false.
- -not expr Same as ! expr.
- expr1 expr2 And (implied); expr2 is not evaluated if expr1 is false.
- expr1 -a expr2 Same as expr1 expr2.
- expr1 -and expr2 Same as expr1 expr2.
- expr1 -o expr2 Or; expr2 is not evaluated if expr1 is true.
- expr1 -or expr2 Same as expr1 -o expr2.
- expr1 , expr2 List; both expr1 and expr2 are always evaluated. The value of expr1 is discarded; the value of the list is the value of expr2.
find . -name 'fileA_*' -or -name 'fileB_*'
This command searches for files whose names begin with "fileA_" or "fileB_" in the current directory and below.
find . -name 'foo.cpp' -not -path '*/.svn/*'
This command searches for files with the name "foo.cpp" in the current directory and all of its subdirectories, leaving out any that lie inside a ".svn" directory.
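One practical consequence of the precedence rules is that an explicit action binds to the test immediately before it, because the implied AND is stronger than -o. The following sketch, using made-up file names in a scratch directory, contrasts the ungrouped and grouped forms.

```shell
d=$(mktemp -d)
touch "$d/fileA_1" "$d/fileB_1" "$d/other"

# Parsed as: -name 'fileA_*'  -o  ( -name 'fileB_*' -a -print )
# so only the fileB match is printed.
loose=$(find "$d" -name 'fileA_*' -o -name 'fileB_*' -print | sort)

# Parentheses group the alternatives, so -print applies to either match.
grouped=$(find "$d" \( -name 'fileA_*' -o -name 'fileB_*' \) -print | sort)

echo "without parentheses: $loose"
echo "with parentheses:    $grouped"

rm -rf "$d"
```

When no action is given at all, find wraps the whole expression before applying the default -print, which is why the ungrouped form without -print still lists both files.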
The truth about find
The individual components of a find command are known as expressions (or, more technically, as predicates). For example, -user cupsys is a predicate. The find command operates by examining each and every file under the directory you ask it to search and evaluating each of the predicates in turn against that file.
Each predicate returns either true or false, and the results of the predicates are logically AND-ed together. If one of the predicates returns a false result, find does not evaluate the remaining predicates. So for example in a command such as:
$ find . -user chris -name '*.txt' -print
if the predicate -user chris is false (that is, if the file is not owned by chris) find will not evaluate the remaining predicates. Only if -user chris and -name '*.txt' both return true will find evaluate the -print predicate (which writes the file name to standard output and also returns the result 'true').