Linux shell:Is it possible to speedup finding files using "find" by using a predefined list of files/folders? - linux

I primarily program in Linux, using tcsh shell. By default, my current directory is the root of my code base - I use "find" to locate whichever file I'm interested in modifying, and then once find shows up the location of the file, I can then edit/modify on Vim.
The problem is, due to the size of the code base, every time I ask find to show up the location of a file , it takes at least 4-5 seconds to complete the search, which are too short to be used for anything else !! So, since the rate is new files being added to the code base is very small, i'm looking for a way as follows:
1) Generate the list of all files in my code base
2) Have find look in only those locations/files to answer my query
I've seen how opening up files in cscope is lightning fast, as it stores the list of files previously. I'd like to use the same mechanism for find, just not from within the cscope window, but from the generic cmd line.
Any ideas ?

Install the locate, mlocate, or slocate package from your distribution, and either wait for cron to run the update task :) or run the updatedb command manually via the /etc/cron.daily/mlocate or similar file.
$ time locate kernel.txt
/home/sarnold/Local/linux-2.6/Documentation/sysctl/kernel.txt
/home/sarnold/Local/linux-2.6-config-all/Documentation/sysctl/kernel.txt
/home/sarnold/Local/linux-apparmor/Documentation/sysctl/kernel.txt
/usr/share/doc/libfuse2/kernel.txt.gz
real 0m0.595s

Yes. See slocate (or updatedb & locate).
The -U flag is particularily interesting because you can just index the directory that contains your code (and thus, updating or creating the database will be quick).

You could write a list of directories to a file and use them in your find command:
$ find /path/to/src -type d > dirs
$ find $(cat dirs) -type f -name "foo"
Alternatively, write a list of files to a file and use grep on it. The list of files is more likely to change than the list of dirs though.
$ find /path/to/src -type f > files
$ vi $(grep foo files)

find in conjunction with xargs (substituting -exec) does differ significantly in execution timings:
http://forrestrunning.wordpress.com/2011/08/01/find-exec-xargs/

Related

How to find the source of a copied file?

I have a file that I copied sometime back, but I forgot the source of it. Is there a way to find the source of the copied file? I don't remember which terminal I have used to try and check with Esc+P
Command used: cp -rf $source/file $destination/file
Thanks in advance!
You could try history | grep your_filename.
A Linux system has many files (and if you think of /proc/, it could change at every moment). And some other process can write or create (or append or truncate) files (e.g. some crontab(1) job...)
Assume you do know some parent directory containing the source file. Suppose it is /home/foo.
Then, you might use find(1) and some hashing command like md5sum(1) to compute and collect the hash of every file.
Use the property that two files A and B with identical contents (a sequence of bytes) have the same md5sum. Of course, the converse is false, but in practice unlikely.
So run first
find /home/foo -type f -exec md5sum '{}' \; > /tmp/foo-md5
then do seekingmd5=$(md5sum A )
then grep $seekingmd5 /tmp/foo-md5 will find lines for files having the same md5 than your original A
Depending on your filesystem and hardware, this could take hours.
You could accelerate slightly things by writing a C program using nftw(3) with md5init etc...

bash: get path from current directory given sub-directory name

Trying to write a script to clean up environment files after a resource is deleted. The problem is all the script is given as input is the name of the resource (this cannot be changed) with zero identifying information beyond that. How can I find the path of the directory the resource is sitting in?
The directory is set up a bit like the following, although much more extensive. All of these are directories, not files. There can be as many as 40+ directories to search, but the desired one is generally not more than 2-3 directories deep.
foo
aaa
aaa_green
aaa_blue
bbb
ccc
ccc_green
bar
ddd
eee
eee_green
eee_blue
fff
fff_green
fff_blue
fff_pink
I might be handed input like aaa_green or just ddd.
As an example, given eee_blue as input, I need to know eee_blue's path from the working directory so I can cd there and delete the directory. IE, I would expect to return bar/eee/eee_blue/ or bar/eee/, either is acceptable.
The "best" option I can see currently is to cd into the lowest level of each directory via multiple greps, get each's contents and look for a match, and when it does (eventually) match save that cd'ing as the path. This frankly sounds awful and inefficient.
The only other alternative method I could think of was a straight recursive grep, but I tested it and at 8 minutes it still hadn't finished running.
This script needs to run on both mac and linux, although in a desperate pinch I could go linux only.
The standard Unix tool for doing this sort of task is the find command. The GNU version of find has more extensive options than the POSIX specification (by quite a margin). The version on macOS Sierra (and Mac OS X) is similar to the GNU version. I found an online manual for OS X 10.9 at Apple find, but there's probably a better location somewhere.
It looks like you might want to run:
find . -name 'eee_blue'
which will print the names of matching files or directories, or perhaps:
find . -name 'eee_blue' -exec rm -fr {} +
which will run the rm -fr command on each name. You can run a custom script you create in place of rm -fr if you prefer; if the logic is complex, it's what I do.
Be extremely cautious before using rm -fr automatically!

Fast way to find file names in Linux and specify directory

This command is slow: find / -name 'program.c' 2>/dev/null
1) Any faster alternatives?
2) Is there an alternative to the above command to search for a file within a specific nested directory (but not the entire system)?
The first / in your command is the base directory from which find will begin searching. You can specify any directory you like, so if you know, for example, that program.c is somewhere in your home directory you could do find ~ -name 'program.c' or if it's in, say, /usr/src do find /usr/src -name 'program.c'
That should help with both 1 and 2.
If you want a command that's not find that can be faster you can check out the mlocate stuff. If you've done a recent updatedb (or had cron do it for you overnight) you can do locate <pattern> and it will show you everywhere that matches that pattern in a file/directory name, and that's usually quite fast.
For fast searching, you probably want locate
It is usually setup to do a daily scan of the filesystem, and index the files.
http://linux.die.net/man/1/locate
although locate & updatedb is for the whole system, the search usually is faster.

How to find a particular folder through terminal in fedora

Presently i am using linux(Fedora 15) and i ma trying to search a folder in the entire file system like with below command
find / -name "apache-tomcat*"
The execution of the above command is taking more and more time that a user cant wait and results are some thing like below
[root#user fedrik]# find / -name "apache-tomcat*"
find: `/proc/6236/task/6236/ns/net': No such file or directory
find: `/proc/6236/task/6236/ns/uts': No such file or directory
find: `/proc/6236/task/6236/ns/ipc': No such file or directory
find: `/proc/6236/ns/net': No such file or directory
find: `/proc/6236/ns/uts': No such file or directory
find: `/proc/6236/ns/ipc': No such file or directory
find: `/proc/6462/task/6462/ns/net': No such file or directory
.................
.................
But as i have mentioned it is taking long time to process and sometimes it is been strucked, so can anyone please let me know on how to search a particular folder by name with a command from linux terminal that will be very fast and should search in the entire file system like above i used '/'
Edit
Actually my intention is to search the folder something like apache-tomcat-7.0.37 in the entire filesystem,
for example there may be many folders like apache-tomcat-6.0.45, apache-tomcat-5.1.7, apache-tomcat-5.0.37........... on different locations on filesystem
So as we can observe only the last part(which is numerical part) is changing and the entire folder name is same, so is there a way to search for these kind of folders irrespective of the last numerical part , like by using regular expression or somethingl ike that.
Finally my intention is to find the folders of the format apache-tomcat-xxxxxxx on the entire file system, because if we search for just apache-tomcat we will get hundreds of results and even thousands too sometimes which is difficult to analyze and search from them
?
Try this:
locate apache-tomcat
It uses a database (updated by the hilariously-named updatedb, which you can run with sudo updatedb to refresh the search index).
locate apache-tomcat | grep -E '^apache-tomcat-[[:digit:]]+\.[[:digit:]]+\.[[:digit:]]+$'
or just use [0-9] instead of [[:digit:]]. That's probably more readable. Or
locate apache-tomcat | perl -ne 'print if /^apache-tomcat-\d+\.\d+\.\d+$/'
Whatever you do, you definitely want to use locate instead of find, as it will be much faster.

How to do partial search in Linux with locate?

I prefer to seach with locate command but I don't know how to perform a partial search with it.
Suppose I want to search file containing the word libevent. How can I do that?
Locate searches for file names. Not file contents.
The ugly way is to use grep It'll start searching from / directory.
grep -irn 'libevent' /
The better way is to narrow down the suspected directories where this files could exists. Suppose those directories' full paths are /path/to/dir1, /path/to/dir2 etc. Then invoke the following command.
for dir in /path/to/dir1 /path/to/dir2
do
grep -irn 'libevent' $dir
done
The locate command is not searching inside the content of files like grep (and other commands) do. It is simply searching inside file paths.
locate work by using a cache index of file paths, and this index is often updated by the updatedb utility.
addenda
A useful way to search some pattern inside (the content of) some files is to use the ability of zsh or some recent versions of bash to expand the ** file pattern, like e.g.
grep foo ~/gee/**/*.[ch]
with zsh this search inside all files named *.c or *.h under $HOME/gee/ containing foo. I find this feature tremendously useful, and justifying alone the adoption of zsh as my interactive shell. With other shells you might type the much longer
find $HOME/gee -name '*.ch' | xargs grep foo

Resources