How does the find command search for files - linux

Branching off from another thread, "Move file that has aged x minutes", this question came up:
How does the find command typically found on Linux search for files in the current directory?
Consider a directory that contains a fairly large number of files. Then:
First, find MY_FILE.txt returns immediately; second, find . -name MY_FILE.txt takes much longer.
I used strace -c to see what happens for both and I learned that the second command invokes a directory scan, which explains why it's slower.
So, the first command must be optimized. Can anybody point me to the appropriate resource or provide a quick explanation how this might be implemented?

The syntax for find is find <paths> <expression>, where paths is a list of files and directories to start the search from. find starts from those locations and then recurses (if they're directories).
When you write find . -name MY_FILE.txt it performs a recursive search under the ./ directory. But if you write find MY_FILE.txt then you're telling it to start the search at ./MY_FILE.txt, and so it does:
$ strace -e file find MY_FILE.txt
...
newfstatat(AT_FDCWD, "MY_FILE.txt", 0x556688ecdc68, AT_SYMLINK_NOFOLLOW) = -1 ENOENT (No such file or directory)
...
find: ‘MY_FILE.txt’: No such file or directory
+++ exited with 1 +++
Since the path doesn't exist, it only takes a single system call to determine that there's no such file. It calls newfstatat(), gets a No such file or directory error, and that's that.
In other words, find MY_FILE.txt isn't equivalent to find . -name MY_FILE.txt. Heck, I wouldn't even call it useful because you're not asking it to search. You're just asking it to tell you if MY_FILE.txt exists in the current directory or not. But you could find that out by simply calling ls MY_FILE.txt.
Here's the difference:
[~]$ cd /usr
[/usr]$ find . -name sha384sum
./bin/sha384sum
[/usr]$ find sha384sum
find: ‘sha384sum’: No such file or directory
The first one performs a recursive search and finds /usr/bin/sha384sum. The second one doesn't recurse and immediately fails because /usr/sha384sum doesn't exist. It doesn't look any deeper, so it finishes almost instantly.
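To see the two behaviours side by side without touching a real directory, here is a minimal sketch in a scratch tree. The directory layout and variable names are made up for the illustration; only find and mktemp are assumed to be available.

```shell
#!/bin/sh
# Scratch tree: the file exists only in a subdirectory.
demo=$(mktemp -d)
mkdir -p "$demo/sub"
touch "$demo/sub/MY_FILE.txt"
cd "$demo" || exit 1

# Starting point "." plus a -name test: walks the tree and finds the file.
recursive=$(find . -name MY_FILE.txt)

# Starting point "MY_FILE.txt": one lookup that fails, no directory scan.
find MY_FILE.txt 2>/dev/null
direct_status=$?

echo "recursive search found: $recursive"
echo "direct lookup exit status: $direct_status"
```

The recursive form reports ./sub/MY_FILE.txt, while the direct form exits with a non-zero status because ./MY_FILE.txt itself does not exist.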

Related

Is it possible to somehow undo the results of the mv command?

Here's the problem. I had a bunch of files in a directory. Then I created another directory in that directory. Then I cobbled together this command:
find . -maxdepth 1 -type f -exec mv {} ./1 \;
This command was supposed to take all the files in the directory and move them to that newly-created directory, but instead of providing the name of the directory, I screwed up and typed 1, as you can see from the code snippet. So, I ended up having just one text file named 1 that now contains the stuff from one of the disappeared files and that's all.
Is there any chance I could recover the lost files (or possibly the actual data from the files--they were all text files) or are they pretty much permanently gone?
Before:
misha@hp-laptop:~/Documents/prgmg/work$ ls
add.s bubble.s cpuid.s div.s hello.s mult.s sum.s test.s
a.out c demo.s gas.txt max.s print_arr.s test.c
misha@hp-laptop:~/Documents/prgmg/work$ mkdir asm
After:
misha@hp-laptop:~/Documents/prgmg/work$ ls
1 asm c
So, as you can see, I wanted to put all assembly language files into the asm directory. And as things stand now, 1 is a text file and it contains the stuff from gas.txt.
No. Not easily. Sorry.
Restoring from backup would be the best option.
See the answers to the question "Recovering accidentally deleted files" over at Unix & Linux, if you feel like doing a bit of low-level file access.
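This doesn't bring the files back, but the same typo can be made harmless next time. The sketch below assumes GNU coreutils, where mv -t DIRECTORY requires the target to be an existing directory. The file names and the scratch directory are invented for the demonstration.

```shell
#!/bin/sh
# Scratch directory with two files and the intended target "asm".
work=$(mktemp -d)
cd "$work" || exit 1
printf 'one\n' > a.s
printf 'two\n' > b.s
mkdir asm

# Typo case: "1" does not exist. With -t (a GNU mv option), mv insists
# that the target is a directory, so it fails instead of renaming every
# file to "1" in turn.
find . -maxdepth 1 -type f -exec mv -t ./1 {} + 2>/dev/null
after_typo=$(find . -maxdepth 1 -type f | wc -l)

# Correct target: both files are moved into asm/.
find . -maxdepth 1 -type f -exec mv -t ./asm {} +

echo "files still present after the typo: $after_typo"
ls asm
```

After the mistyped command both files are still in place, and the corrected command moves them into asm.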

Linux "find" returns all files

A few days ago I was reading about the Linux find tool and based on that I issued the following command to see if I have the Python.h file:
find . 'Python.h'
The problem is that all files in current dir and subdirs are returned. Shouldn't I get what I'm looking for?
You left out the parameter specifier -name:
find ./ -name 'Python.h'
find will recurse through all directories in the current directory. If you just want to see whether you have a file in the current directory, use ls:
ls Python.h
Use -name switch:
find . -name 'Python.h'
Otherwise find treats the name as another starting point to search from.
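A small sketch of what the answers above describe, run in a throwaway tree (the include directory and notes.txt are invented for the example). It also shows -maxdepth 1, available in GNU and BSD find, as a find-based way to check only the current directory.

```shell
#!/bin/sh
demo=$(mktemp -d)
mkdir -p "$demo/include"
touch "$demo/include/Python.h" "$demo/notes.txt"
cd "$demo" || exit 1

# Two starting points and no tests: everything under "." is printed, and
# the missing path "Python.h" only produces an error on stderr.
listed=$(find . Python.h 2>/dev/null | wc -l)

# One starting point plus a -name test: only matching entries are printed.
matched=$(find . -name 'Python.h')

# -maxdepth 1 restricts the test to the current directory itself.
top_only=$(find . -maxdepth 1 -name 'Python.h')

echo "entries listed without -name: $listed"
echo "match with -name: $matched"
echo "match at top level only: ${top_only:-none}"
```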

Fast way to find file names in Linux and specify directory

This command is slow: find / -name 'program.c' 2>/dev/null
1) Any faster alternatives?
2) Is there an alternative to the above command to search for a file within a specific nested directory (but not the entire system)?
The first / in your command is the base directory from which find will begin searching. You can specify any directory you like, so if you know, for example, that program.c is somewhere in your home directory you could do find ~ -name 'program.c' or if it's in, say, /usr/src do find /usr/src -name 'program.c'
That should help with both 1 and 2.
If you want a command that's not find that can be faster you can check out the mlocate stuff. If you've done a recent updatedb (or had cron do it for you overnight) you can do locate <pattern> and it will show you everywhere that matches that pattern in a file/directory name, and that's usually quite fast.
For fast searching, you probably want locate.
It is usually set up to do a daily scan of the filesystem and index the files.
http://linux.die.net/man/1/locate
Although locate and updatedb cover the whole system, the search is usually faster.
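If you stay with find, the two things that usually help are a narrower starting point and stopping at the first hit. A minimal sketch, assuming a find that supports -quit (GNU and BSD find do; POSIX does not require it) and using a made-up tree in place of something like /usr/src:

```shell
#!/bin/sh
# Hypothetical source tree standing in for a real one.
root=$(mktemp -d)
mkdir -p "$root/src/app" "$root/docs/old"
touch "$root/src/app/program.c" "$root/docs/old/readme.txt"

# Search only the subtree that can contain the file, and stop at the
# first match instead of walking the rest of the tree.
first=$(find "$root/src" -type f -name 'program.c' -print -quit)

echo "$first"
```

Dropping -quit lists every match, which is what you want when the file may exist in several places.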

How to find a particular folder through terminal in fedora

I am currently using Linux (Fedora 15) and am trying to search the entire file system for a folder with the command below:
find / -name "apache-tomcat*"
The command takes longer than a user can reasonably wait, and the output looks something like this:
[root@user fedrik]# find / -name "apache-tomcat*"
find: `/proc/6236/task/6236/ns/net': No such file or directory
find: `/proc/6236/task/6236/ns/uts': No such file or directory
find: `/proc/6236/task/6236/ns/ipc': No such file or directory
find: `/proc/6236/ns/net': No such file or directory
find: `/proc/6236/ns/uts': No such file or directory
find: `/proc/6236/ns/ipc': No such file or directory
find: `/proc/6462/task/6462/ns/net': No such file or directory
.................
.................
As mentioned, it takes a long time and sometimes appears to hang. Can anyone tell me how to search for a particular folder by name from the Linux terminal in a way that is fast and still covers the entire file system, as with the '/' I used above?
Edit
My actual intention is to find folders named like apache-tomcat-7.0.37 anywhere on the file system.
For example, there may be many folders such as apache-tomcat-6.0.45, apache-tomcat-5.1.7, apache-tomcat-5.0.37, and so on, in different locations.
Only the numeric suffix changes while the rest of the folder name stays the same, so is there a way to search for these folders regardless of the suffix, for example with a regular expression?
In short, I want to find folders of the form apache-tomcat-xxxxxxx on the entire file system. Searching for just apache-tomcat returns hundreds or even thousands of results, which are difficult to sift through.
Try this:
locate apache-tomcat
It uses a database (updated by the hilariously-named updatedb, which you can run with sudo updatedb to refresh the search index).
locate apache-tomcat | grep -E '/apache-tomcat-[[:digit:]]+\.[[:digit:]]+\.[[:digit:]]+$'
or just use [0-9] instead of [[:digit:]]. That's probably more readable. Or
locate apache-tomcat | perl -ne 'print if m{/apache-tomcat-\d+\.\d+\.\d+$}'
(locate prints full paths, so the pattern anchors on the / before the folder name rather than on ^.)
Whatever you do, you definitely want to use locate instead of find, as it will be much faster.
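Here is a rough sketch of both approaches on made-up data, so it runs without a locate database. The sample paths are invented, and the second half uses a scratch tree whose proc directory stands in for /proc. The first half filters a list of full paths down to version-numbered folders; the second shows how find can skip a directory with -prune and match only directories.

```shell
#!/bin/sh
# Made-up sample of what locate might print: one absolute path per line.
sample='/opt/apache-tomcat-7.0.37
/opt/apache-tomcat-7.0.37/bin/startup.sh
/home/user/apache-tomcat-6.0.45
/home/user/apache-tomcat-notes.txt'

# Keep only paths whose last component is apache-tomcat-<n>.<n>.<n>.
# The leading "/" ties the match to the start of that component.
versions=$(printf '%s\n' "$sample" |
    grep -E '/apache-tomcat-[0-9]+\.[0-9]+\.[0-9]+$')
printf '%s\n' "$versions"

# find variant on a scratch tree: skip the "proc" subtree entirely and
# report only directories whose name starts with apache-tomcat-.
tree=$(mktemp -d)
mkdir -p "$tree/opt/apache-tomcat-7.0.37" "$tree/proc/apache-tomcat-0.0.0"
dirs=$(find "$tree" -path "$tree/proc" -prune -o \
    -type d -name 'apache-tomcat-*' -print)
printf '%s\n' "$dirs"
```

On a real system the equivalent find would start at / and prune /proc, which also avoids the "No such file or directory" noise from processes that exit while find is running.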

Linux shell: Is it possible to speed up finding files using "find" by using a predefined list of files/folders?

I primarily program in Linux, using the tcsh shell. By default, my current directory is the root of my code base. I use "find" to locate whichever file I want to modify, and once find shows the file's location I edit it in Vim.
The problem is that, due to the size of the code base, every find takes at least 4-5 seconds to complete, which is too short to use for anything else! Since new files are added to the code base only rarely, I'm looking for a way to:
1) Generate the list of all files in my code base
2) Have find look in only those locations/files to answer my query
I've seen how opening up files in cscope is lightning fast, as it stores the list of files previously. I'd like to use the same mechanism for find, just not from within the cscope window, but from the generic cmd line.
Any ideas ?
Install the locate, mlocate, or slocate package from your distribution, and either wait for cron to run the update task :) or run the updatedb command manually via the /etc/cron.daily/mlocate or similar file.
$ time locate kernel.txt
/home/sarnold/Local/linux-2.6/Documentation/sysctl/kernel.txt
/home/sarnold/Local/linux-2.6-config-all/Documentation/sysctl/kernel.txt
/home/sarnold/Local/linux-apparmor/Documentation/sysctl/kernel.txt
/usr/share/doc/libfuse2/kernel.txt.gz
real 0m0.595s
Yes. See slocate (or updatedb & locate).
The -U flag is particularly interesting because you can index just the directory that contains your code (and thus, updating or creating the database will be quick).
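You could write a list of directories to a file and use them as the starting points of your find command. Because the list already names every subdirectory, add -maxdepth 1 so each directory is checked once instead of being walked again:
$ find /path/to/src -type d > dirs
$ find $(cat dirs) -maxdepth 1 -type f -name "foo"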
Alternatively, write a list of files to a file and use grep on it. The list of files is more likely to change than the list of dirs though.
$ find /path/to/src -type f > files
$ vi $(grep foo files)
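The file-list idea can be wrapped up as a small script. This is only a sketch under stated assumptions: the tree, the .filelist name, and the lookup helper are all hypothetical, and it is written for POSIX sh, so from tcsh it would be run as a script rather than pasted into the prompt.

```shell
#!/bin/sh
# Hypothetical code base.
src=$(mktemp -d)
mkdir -p "$src/lib" "$src/tools"
touch "$src/lib/foo.c" "$src/lib/bar.c" "$src/tools/foo_test.c"

# Step 1: build the index once (rerun when files are added or removed).
# The index file itself is excluded from the listing.
index="$src/.filelist"
find "$src" -type f ! -name '.filelist' > "$index"

# Step 2: answer queries from the index instead of walking the tree.
# lookup is a hypothetical helper; it matches against the file name only,
# not against the directories leading to it.
lookup() {
    grep -E "/[^/]*$1[^/]*\$" "$index"
}

hits=$(lookup foo)
printf '%s\n' "$hits"
```

Each query is now a grep over one text file, so its cost depends on the length of the list rather than on the number of directories to traverse. The trade-off is the same as with locate: the answer is only as fresh as the last rebuild of the index.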
Using find with xargs instead of -exec can also make a significant difference in execution time:
http://forrestrunning.wordpress.com/2011/08/01/find-exec-xargs/

Resources