bash: get path from current directory given sub-directory name - linux

Trying to write a script to clean up environment files after a resource is deleted. The problem is that the only input the script is given is the name of the resource (this cannot be changed), with zero identifying information beyond that. How can I find the path of the directory the resource is sitting in?
The directory is set up a bit like the following, although much more extensive. All of these are directories, not files. There can be as many as 40+ directories to search, but the desired one is generally not more than 2-3 directories deep.
foo
    aaa
        aaa_green
        aaa_blue
    bbb
    ccc
        ccc_green
bar
    ddd
    eee
        eee_green
        eee_blue
    fff
        fff_green
        fff_blue
        fff_pink
I might be handed input like aaa_green or just ddd.
As an example, given eee_blue as input, I need to know eee_blue's path from the working directory so I can cd there and delete the directory. I.e., I would expect to get back bar/eee/eee_blue/ or bar/eee/; either is acceptable.
The "best" option I can see currently is to cd into the lowest level of each directory via multiple greps, get each's contents and look for a match, and when it does (eventually) match save that cd'ing as the path. This frankly sounds awful and inefficient.
The only other alternative method I could think of was a straight recursive grep, but I tested it and at 8 minutes it still hadn't finished running.
This script needs to run on both mac and linux, although in a desperate pinch I could go linux only.

The standard Unix tool for this sort of task is the find command. The GNU version of find has far more extensive options than the POSIX specification requires. The version on macOS Sierra (and Mac OS X before it) is similar to the GNU version. I found an online manual for OS X 10.9 at Apple find, but there's probably a better location somewhere.
It looks like you might want to run:
find . -name 'eee_blue'
which will print the names of matching files or directories, or perhaps:
find . -name 'eee_blue' -exec rm -fr {} +
which will run the rm -fr command on each name. You can run a custom script of your own in place of rm -fr if you prefer; that's what I do when the logic is complex.
Be extremely cautious before using rm -fr automatically!
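Putting that together for the cleanup script, here is a minimal sketch of my own (assuming the resource name arrives as the script's first argument, that it matches a unique directory somewhere under the working directory, and that the search can be capped at a few levels deep; -maxdepth is supported by both GNU and macOS find):
#!/bin/sh
# cleanup.sh (hypothetical name) - remove the directory named after the given resource
resource=$1
target=$(find . -maxdepth 4 -type d -name "$resource" | head -n 1)
if [ -n "$target" ]; then
    echo "Removing $target"
    rm -rf -- "$target"
else
    echo "No directory named $resource found" >&2
    exit 1
fi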

Related

Go to bottom most directory?

I'm working with a directory with a lot of nested folders like /path/to/project/users/me/tutorial
I found a neat way to navigate up the folders here:
https://superuser.com/questions/449687/using-cd-to-go-up-multiple-directory-levels
But I'm wondering how to go down them. This seems significantly more difficult, but a couple of things about the directory structure help: each directory only has another directory in it, or maybe a directory and a README.
The directory I'm looking for looks more like a traditional project and might have random directories and files in it (more than any of the other higher directories certainly).
Right now I'm working on a solution using, uh, recursive bash functions cd'ing into the only directory underneath until there are either 0 or 2+ directories to loop through. This doesn't work yet.
Am I overcomplicating this? I feel like there could be some sweet solution using find. Ideally I want to be able to type something like:
down path
where path is a top-level folder. And that will take me down to the bottom folder tutorial.
There is an environment variable named CDPATH. It is used by cd in the same way the shell uses PATH when searching for executables.
For example, if you have the following directories:
/path/to/project/users/me
/path/to/project/users/me/tutorial
/path/to/project/users/him
/path/to/project/users/him/test
/path/to/project/users/her
/path/to/project/users/her/uat
/path/to/project/users/her/dev
/path/to/application
/path/to/application/conf
/path/to/application/bin
/path/to/application/share
export CDPATH=/path/to/project/users/me:/path/to/project/users/him:/path/to/project/users/her:/path/to/application
A simple command such as cd tutorial will search the above paths for tutorial.
Let's say /path/to/application has the directories conf, bin, and share underneath it. A simple cd conf will send you to /path/to/application/conf as long as none of the paths listed before it contain a conf directory. This behavior is similar to executables in PATH: the first occurrence always gets chosen.
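For example (a sketch using the hypothetical paths above; listing . first keeps the current directory ahead of the CDPATH entries):
export CDPATH=.:/path/to/project/users/me:/path/to/application
cd tutorial    # ends up in /path/to/project/users/me/tutorial
cd conf        # ends up in /path/to/application/conf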
My attempt - this actually works now! I'm still afraid it could easily loop forever with symbolic links or some such.
Also, I have to run this like
. down
from within the first empty folder.
#!/bin/bash
# Keep descending while there is exactly one subdirectory below the current one.
function GoDownOnce {
    local Dirs NumDirs
    Dirs=$(find ./ -maxdepth 1 -mindepth 1 -type d)
    NumDirs=$(echo "$Dirs" | wc -w)    # word count, so this miscounts names containing spaces
    echo "$Dirs"
    echo "$NumDirs"
    if [ "$NumDirs" -eq 1 ]; then
        cd "$Dirs" || return
        GoDownOnce
    fi
}
GoDownOnce
A friend also suggested this sweet one-liner:
cd $(find . -type d -name tutorial)
Admittedly this isn't quite what I asked, but it gets the job done pretty well.
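If you still want the down helper from the question, here is a rough sketch of my own (not from the original answers): it cds to the deepest directory under the given path, assumes directory names contain no newlines, and must be sourced or defined in your shell rc file so the cd sticks:
down() {
    # Rank every directory by its depth (number of /-separated components) and pick the deepest
    local deepest
    deepest=$(find "${1:-.}" -type d | awk -F/ '{ print NF, $0 }' | sort -rn | head -n 1 | cut -d' ' -f2-)
    [ -n "$deepest" ] && cd "$deepest"
}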

How to list recently deleted files from a directory?

I'm not even sure if this is easily possible, but I would like to list the files that were recently deleted from a directory, recursively if possible.
I'm looking for a solution that does not require the creation of a temporary file containing a snapshot of the original directory structure against which to compare, because write access might not always be available. Edit: If it's possible to achieve the same result by storing the snapshot in a shell variable instead of a file, that would solve my problem.
Something like:
find /some/directory -type f -mmin -10 -deletedFilesOnly
Edit: OS: I'm using Ubuntu 14.04 LTS, but the command(s) would most likely be running in a variety of Linux boxes or Docker containers, most or all of which should be using ext4, and to which I would most likely not have access to make modifications.
You can use the debugfs utility.
debugfs is an interactive file system debugger for ext2/ext3/ext4 file systems (part of the e2fsprogs package).
First, run debugfs /dev/hda13 in your terminal (replacing /dev/hda13 with your own disk/partition).
(NOTE: You can find the name of your disk by running df / in the terminal).
Once in debug mode, you can use the command lsdel to list inodes corresponding with deleted files.
When files are removed in Linux they are only unlinked, but their inodes (the addresses on the disk where the file data actually lives) are not removed.
To get the paths of these deleted files, run ncheck 320236 inside debugfs (or, from the shell, debugfs -R "ncheck 320236" /dev/hda13), replacing the number with your particular inode.
Inode Pathname
320236 /path/to/file
From here you can also inspect the contents of deleted files with cat. (NOTE: You can also recover from here if necessary).
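For instance, a quick sketch of dumping a deleted file's contents from the shell (assuming the inode 320236 used above and the same /dev/hda13 device):
sudo debugfs -R 'cat <320236>' /dev/hda13 > recovered.txt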
Great post about this here.
So, a few things:
1) You may have zero success if your partition is ext2; it works best with ext4.
2) df /
3) Use the device reported by #2; in my case:
sudo debugfs /dev/mapper/q4os--desktop--vg-root
4) lsdel
5) q (to exit out of debugfs)
6) sudo debugfs -R 'ncheck 528754' /dev/sda2 2>/dev/null (replace the number with one from step #4)
Thanks for your comments & answers guys. debugfs seems like an interesting solution to the initial requirements, but it is a bit overkill for the simple & light solution I was looking for; if I'm understanding correctly, the kernel must be built with debugfs support and the target directory must be in a debugfs mount. Unfortunately, that won't really work for my use-case; I must be able to provide a solution for existing, "basic" kernels and directories.
As this seems virtually impossible to accomplish, I've been able to negotiate and relax the requirements down to listing the amount of files that were recently deleted from a directory, recursively if possible.
This is the solution I ended up implementing:
1) A simple find command piped into wc counts the original number of files in the target directory (recursively). The result can then easily be stored in a shell or script variable, without requiring write access to the file system.
DEL_SCAN_ORIG_AMOUNT=$(find /some/directory -type f | wc -l)
2) We can then run the same command again later to get the updated number of files.
DEL_SCAN_NEW_AMOUNT=$(find /some/directory -type f | wc -l)
3) Then we can store the difference between the two in another variable and update the original amount.
DEL_SCAN_DEL_AMOUNT=$(($DEL_SCAN_ORIG_AMOUNT - $DEL_SCAN_NEW_AMOUNT));
DEL_SCAN_ORIG_AMOUNT=$DEL_SCAN_NEW_AMOUNT
4) We can then print a simple message if the number of files went down.
if [ $DEL_SCAN_DEL_AMOUNT -gt 0 ]; then echo "$DEL_SCAN_DEL_AMOUNT deleted files"; fi;
5) Return to step 2.
Unfortunately, this solution won't report anything if the same amount of files have been created and deleted during an interval, but that's not a huge issue for my use case.
To circumvent this, I'd have to store the actual list of files instead of the amount, but I haven't been able to make that work using shell variables. If anyone could figure that out, it would help me immensely, as it would meet the initial requirements!
I'd also like to know if anyone has comments on either of the two approaches.
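For what it's worth, here is a sketch of that list-based variant (my own, not from the original thread; it assumes bash for process substitution and file names without embedded newlines):
DEL_SCAN_ORIG_LIST=$(find /some/directory -type f | sort)
# ... later, take a fresh snapshot ...
DEL_SCAN_NEW_LIST=$(find /some/directory -type f | sort)
# Lines present in the old list but missing from the new one are the deleted files
comm -23 <(printf '%s\n' "$DEL_SCAN_ORIG_LIST") <(printf '%s\n' "$DEL_SCAN_NEW_LIST")
DEL_SCAN_ORIG_LIST=$DEL_SCAN_NEW_LIST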
Try:
lsof -nP | grep -i deleted
which lists files that have been deleted but are still held open by some process. You can also dump your shell history:
history >> history.txt
and look for all rm statements.

cronjob to remove files older than 99 days

I have to make a cronjob to remove files older than 99 days in a particular directory but I'm not sure the file names were made by trustworthy Linux users. I must expect special characters, spaces, slash characters, and others.
Here is what I think could work:
find /path/to/files -mtime +99 -exec rm {} \;
But I suspect this will fail if there are special characters or if it finds a file that's read-only (cron may not be running with superuser privileges). I need it to carry on if it meets such files.
When you use -exec rm {} \;, you shouldn't have any problems with spaces, tabs, returns, or special characters because find calls the rm command directly and passes it the name of each file one at a time.
Directories won't be removed with that command because you aren't passing it the -r parameter, and you probably don't want to; that could end up being a bit dangerous. You might also want to include the -f parameter to force removal in case you don't have write permission. Run the cron script as root, and you should be fine.
The only thing I'd worry about is that you might end up hitting a file that you don't want to remove, but that has not been modified in the past 100 days. For example, the password to stop the autodestruct sequence at your work. Chances are that file hasn't been modified in the past 100 days, but once that autodestruct sequence starts, you wouldn't want to be the one blamed because the password was lost.
Okay, more reasonable might be applications that are used but rarely modified. Maybe someone's resume that hasn't been updated because they are holding a current job, etc.
So, be careful with your assumptions. Just because a file hasn't been modified in 100 days doesn't mean it isn't used. A better criterion (although still questionable) is whether the file has been accessed in the last 100 days. Maybe something like this as a final command:
find /path/to/files -atime +99 -type f -exec rm -f {} \;
One more thing...
Some find commands have a -delete parameter which can be used instead of the -exec rm parameter:
find /path/to/files -atime +99 -delete
That will delete both matching directories and files, though -delete only removes a directory if it is already empty.
One more small recommendation: for the first week, save the names of the files found to a log file instead of removing them, and then examine the log. This way, you make sure that you're not deleting something important. Once you're happy that there's nothing in the log file you don't want to touch, you can remove those files and revert the find command to do the delete for you.
If you run rm with the -f option, your file is going to be deleted regardless of whether you have write permission on the file or not (what matters is write permission on the containing folder). So either you can erase all the files in the folder, or none of them. Also add -r if you want to erase subfolders.
And I have to say it: be very careful! You're playing with fire ;) I suggest you debug with something less harmful, like the file command.
You can test this out by creating a bunch of files like, e.g.:
touch {a,b,c,d,e,f}
And setting permissions as desired on each of them
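A sketch of such a dry run (my own; it assumes GNU touch for the -d option and uses file as the harmless stand-in suggested above):
touch {a,b,c,d,e,f}
touch -a -d '200 days ago' a b c     # backdate the access time on a few of them
chmod 444 a b c                      # and make those read-only
find . -atime +99 -type f -exec file {} \;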
You should use -execdir instead of -exec. Even better, read the full Security considerations for find chapter in the findutils manual.
Please always use rm [opts] -- [files]; this will save you from errors with files like -rf, which would otherwise be parsed as options. The -- marks the end of options, so everything after it is treated as a file name.
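Putting these suggestions together, a sketch of the crontab entry might look like this (my own combination, assuming the /path/to/files location from the question and an arbitrary 03:00 schedule; -execdir is available in both GNU and BSD find):
# Every day at 03:00, delete regular files not accessed in more than 99 days
0 3 * * * find /path/to/files -type f -atime +99 -execdir rm -f -- {} \;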

Linux shell: Is it possible to speed up finding files using "find" with a predefined list of files/folders?

I primarily program in Linux, using the tcsh shell. By default, my current directory is the root of my code base; I use "find" to locate whichever file I'm interested in modifying, and once find shows the location of the file, I can edit it in Vim.
The problem is that, due to the size of the code base, every time I ask find to locate a file it takes at least 4-5 seconds to complete the search, which is too short a wait to be used for anything else! So, since the rate at which new files are added to the code base is very small, I'm looking for a workflow as follows:
1) Generate the list of all files in my code base
2) Have find look in only those locations/files to answer my query
I've seen how opening files in cscope is lightning fast, since it stores the list of files beforehand. I'd like to use the same mechanism with find, just not from within the cscope window but from the regular command line.
Any ideas ?
Install the locate, mlocate, or slocate package from your distribution, and either wait for cron to run the update task :) or run the updatedb command manually via /etc/cron.daily/mlocate (or the equivalent on your system).
$ time locate kernel.txt
/home/sarnold/Local/linux-2.6/Documentation/sysctl/kernel.txt
/home/sarnold/Local/linux-2.6-config-all/Documentation/sysctl/kernel.txt
/home/sarnold/Local/linux-apparmor/Documentation/sysctl/kernel.txt
/usr/share/doc/libfuse2/kernel.txt.gz
real 0m0.595s
Yes. See slocate (or updatedb & locate).
The -U flag is particularly interesting because it lets you index just the directory that contains your code (so updating or creating the database will be quick).
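For example, a sketch of a per-user database rooted at the code base (assuming mlocate's updatedb with its -U and -o options, and locate's -d option; the paths are hypothetical):
updatedb -U ~/src/codebase -o ~/.codebase.db    # index only the code base
locate -d ~/.codebase.db foo.c                  # search that private database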
You could write a list of directories to a file and use them in your find command:
$ find /path/to/src -type d > dirs
$ find $(cat dirs) -type f -name "foo"
Alternatively, write a list of files to a file and use grep on it. The list of files is more likely to change than the list of dirs though.
$ find /path/to/src -type f > files
$ vi $(grep foo files)
find in conjunction with xargs (substituted for -exec) can differ significantly in execution time:
http://forrestrunning.wordpress.com/2011/08/01/find-exec-xargs/
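For reference, the two forms being compared look roughly like this (a sketch; -print0 and xargs -0 are supported by both GNU and BSD tools):
find /path/to/src -type f -name 'foo*' -exec grep -l bar {} \;          # one grep process per file
find /path/to/src -type f -name 'foo*' -print0 | xargs -0 grep -l bar   # files batched into a few grep processes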

How to directly overwrite with 'unexpand' (spaces-to-tabs conversion)?

I'm trying to use something along the lines of
unexpand -t 4 *.php
but am unsure how to write this command to do what I want.
Weirdly,
unexpand -t 4 file.php > file.php
gives me an empty file. (i.e. overwriting file.php with nothing)
I can specify multiple files okay, but don't know how to then overwrite each file.
I could use my IDE, but there are ~67000 instances of to be replaced over 200 files, and this will take a while.
I expect that the answers to my question(s) will be standard unix fare, but I'm still learning...
You can very seldom use output redirection to replace the input: the shell truncates the output file before the command even starts reading it, which is why you ended up with an empty file. In-place replacement works only with commands that support it internally (since they then do the basic steps themselves). From the shell level, it's far better to work in two steps, like so:
Do the operation on foo, creating foo.tmp
Move (rename) foo.tmp to foo, overwriting the original
This will be fast. It will require a bit more disk space, but if you do both steps before continuing to the next file, you will only need as much extra space as the largest single file, so this should not be a problem.
Sketch script:
for a in *.php
do
    unexpand -t 4 "$a" > "$a-notab"    # quoted in case a name contains spaces
    mv "$a-notab" "$a"
done
You could do better (error-checking, and so on), but that is the basic outline.
Here's the command I used:
for p in $(find . -iname "*.js")
do
    unexpand -t 4 $(dirname $p)/"$(basename $p)" > $(dirname $p)/"$(basename $p)-tab"
    mv $(dirname $p)/"$(basename $p)-tab" $(dirname $p)/"$(basename $p)"
done
This version changes all files within the directory hierarchy rooted at the current working directory.
In my case, I only wanted to make this change to .js files; you can omit the iname clause from find if you wish, or use different args to cast your net differently.
My version wraps filenames in quotes, but it doesn't use quotes around 'interesting' directory names that appear in the paths of matching files.
To get it all on one line, add a semicolon after lines 1, 3, and 4.
This is potentially dangerous, so make a backup or use git before running the command. If you're using git, you can verify that only whitespace was changed with git diff -w.
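An alternative sketch that sidesteps the quoting problem entirely by letting find hand the file names straight to a small shell script (assuming a POSIX sh and the same unexpand options):
find . -iname "*.js" -exec sh -c '
    for f do
        unexpand -t 4 "$f" > "$f.tmp" && mv "$f.tmp" "$f"
    done
' sh {} +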
