I'm not even sure if this is easily possible, but I would like to list the files that were recently deleted from a directory, recursively if possible.
I'm looking for a solution that does not require the creation of a temporary file containing a snapshot of the original directory structure against which to compare, because write access might not always be available. Edit: If it's possible to achieve the same result by storing the snapshot in a shell variable instead of a file, that would solve my problem.
Something like:
find /some/directory -type f -mmin -10 -deletedFilesOnly
Edit: OS: I'm using Ubuntu 14.04 LTS, but the command(s) would most likely be running in a variety of Linux boxes or Docker containers, most or all of which should be using ext4, and to which I would most likely not have access to make modifications.
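You can use the debugfs utility from e2fsprogs,
debugfs is an interactive file system debugger for ext2/ext3/ext4
file systems. Despite the name, it is unrelated to the kernel's RAM-based debugfs file system, so it needs no special kernel support or mount.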
First, run debugfs /dev/hda13 in your terminal (replacing /dev/hda13 with your own disk/partition).
(NOTE: You can find the name of your disk by running df / in the terminal).
Once in debug mode, you can use the command lsdel to list inodes corresponding with deleted files.
When files are removed in Linux they are only unlinked: the directory
entry goes away, but the inode (the on-disk record describing the file) is
not immediately wiped
To get paths of these deleted files you can use debugfs -R "ncheck 320236" /dev/hda13, replacing the number with your particular inode and the device with your own.
Inode Pathname
320236 /path/to/file
From here you can also inspect the contents of deleted files with cat. (NOTE: You can also recover from here if necessary).
Great post about this here.
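So a few things:
1. You may have little or no success depending on the file system: the debugfs man page notes that lsdel was useful on ext2, but it generally cannot list files deleted on ext3 or ext4.
2. df /
3. Pass the device reported in step 2 to debugfs, in my case:
sudo debugfs /dev/mapper/q4os--desktop--vg-root
4. lsdel
5. q (to exit out of debugfs)
6. sudo debugfs -R 'ncheck 528754' /dev/sda2 2>/dev/null (replace the number with an inode from step 4 and the device with your own)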
Thanks for your comments & answers guys. debugfs seems like an interesting solution to the initial requirements, but it is a bit overkill for the simple & light solution I was looking for; if I'm understanding correctly, the kernel must be built with debugfs support and the target directory must be in a debugfs mount. Unfortunately, that won't really work for my use-case; I must be able to provide a solution for existing, "basic" kernels and directories.
As this seems virtually impossible to accomplish, I've been able to negotiate and relax the requirements down to listing the number of files that were recently deleted from a directory, recursively if possible.
This is the solution I ended up implementing:
1. A simple find command piped into wc counts the original number of files in the target directory (recursively). The result can then easily be stored in a shell or script variable, without requiring write access to the file system.
DEL_SCAN_ORIG_AMOUNT=$(find /some/directory -type f | wc -l)
2. We can then run the same command again later to get the updated number of files.
DEL_SCAN_NEW_AMOUNT=$(find /some/directory -type f | wc -l)
3. Then we can store the difference between the two in another variable and update the original amount.
DEL_SCAN_DEL_AMOUNT=$((DEL_SCAN_ORIG_AMOUNT - DEL_SCAN_NEW_AMOUNT))
DEL_SCAN_ORIG_AMOUNT=$DEL_SCAN_NEW_AMOUNT
4. We can then print a simple message if the number of files went down.
if [ "$DEL_SCAN_DEL_AMOUNT" -gt 0 ]; then echo "$DEL_SCAN_DEL_AMOUNT deleted files"; fi
5. Return to step 2.
Unfortunately, this solution won't report anything if the same number of files has been created and deleted during an interval, but that's not a huge issue for my use case.
To circumvent this, I'd have to store the actual list of files instead of the count, but I haven't been able to make that work using shell variables. If anyone could figure that out, it'd help me immensely as it would meet the initial requirements!
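One possible way to keep the actual file list in a shell variable is sketched below. It is only a sketch under stated assumptions: the function names snapshot and deleted_between are made up for illustration, and file names are assumed not to contain newlines. Both snapshots are fed through a pipe to awk, so nothing is written to disk.

```shell
# Hypothetical helpers (names are illustrative, not from the original post).
# Assumption: no file name contains a newline character.

# Print a sorted list of all regular files under directory "$1".
snapshot() {
    find "$1" -type f | LC_ALL=C sort
}

# Print every path that is in the old snapshot ($1) but not in the new one ($2).
# Lines are tagged N (new) or O (old) so awk can tell the two lists apart.
deleted_between() {
    {
        printf '%s\n' "$2" | sed 's/^/N/'
        printf '%s\n' "$1" | sed 's/^/O/'
    } | awk '
        /^N/ { seen[substr($0, 2)] = 1; next }
        /^O/ { p = substr($0, 2); if (p != "" && !(p in seen)) print p }
    '
}
```

Usage would look like OLD=$(snapshot /some/directory), then later NEW=$(snapshot /some/directory), then deleted_between "$OLD" "$NEW", followed by OLD=$NEW before the next round. Unlike the counting approach, this still reports deletions when the same number of files was created in the meantime.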
I'd also like to know if anyone has comments on either of the two approaches.
Try:
lsof -nP | grep -i deleted
history >> history.txt
Look for all rm statements.
Related
I have a file that I copied sometime back, but I forgot the source of it. Is there a way to find the source of the copied file? I don't remember which terminal I have used to try and check with Esc+P
Command used: cp -rf $source/file $destination/file
Thanks in advance!
You could try history | grep your_filename.
A Linux system has many files (and if you think of /proc/, it could change at every moment). And some other process can write or create (or append or truncate) files (e.g. some crontab(1) job...)
Assume you do know some parent directory containing the source file. Suppose it is /home/foo.
Then, you might use find(1) and some hashing command like md5sum(1) to compute and collect the hash of every file.
Use the property that two files A and B with identical contents (the same sequence of bytes) have the same md5sum. The converse is not guaranteed, since different contents can collide, but in practice a collision is unlikely.
So first run
find /home/foo -type f -exec md5sum '{}' \; > /tmp/foo-md5
then do seekingmd5=$(md5sum < A | cut -d' ' -f1) (md5sum also prints the file name, so keep only the hash)
then grep "$seekingmd5" /tmp/foo-md5 will find the lines for files having the same md5 as your original A
Depending on your filesystem and hardware, this could take hours.
You could speed things up slightly by writing a C program using nftw(3) with md5init etc...
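The hashing approach above can be sketched as a single function. The name find_same_content is hypothetical, GNU md5sum is assumed to be installed, and file names are assumed to contain no newlines or backslashes. The -size filter is an addition of mine, not part of the answer above: it skips hashing any file whose byte size differs from the target, which may save a lot of time on large trees.

```shell
# Hypothetical helper: list files under directory "$2" whose content matches file "$1".
# Assumptions: GNU md5sum is available; no newlines or backslashes in file names.
find_same_content() {
    # Keep only the hash; reading from stdin avoids the file name in the output.
    want=$(md5sum < "$1" | cut -d' ' -f1)
    # Byte size of the target; the arithmetic expansion strips any padding from wc.
    size=$(( $(wc -c < "$1") ))
    # Hash only files of the same size, keep matching lines, then drop the
    # 32-character hash and the two separator characters to leave the path.
    find "$2" -type f -size "${size}c" -exec md5sum {} + |
        grep "^$want " |
        cut -c35-
}
```

For example, find_same_content A /home/foo would print the candidate source files for A.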
Trying to write a script to clean up environment files after a resource is deleted. The problem is all the script is given as input is the name of the resource (this cannot be changed) with zero identifying information beyond that. How can I find the path of the directory the resource is sitting in?
The directory is set up a bit like the following, although much more extensive. All of these are directories, not files. There can be as many as 40+ directories to search, but the desired one is generally not more than 2-3 directories deep.
foo
  aaa
    aaa_green
    aaa_blue
  bbb
  ccc
    ccc_green
bar
  ddd
  eee
    eee_green
    eee_blue
  fff
    fff_green
    fff_blue
    fff_pink
I might be handed input like aaa_green or just ddd.
As an example, given eee_blue as input, I need to know eee_blue's path from the working directory so I can cd there and delete the directory. I.e., I would expect it to return bar/eee/eee_blue/ or bar/eee/; either is acceptable.
The "best" option I can see currently is to cd into the lowest level of each directory via multiple greps, get each's contents and look for a match, and when it does (eventually) match save that cd'ing as the path. This frankly sounds awful and inefficient.
The only other alternative method I could think of was a straight recursive grep, but I tested it and at 8 minutes it still hadn't finished running.
This script needs to run on both mac and linux, although in a desperate pinch I could go linux only.
The standard Unix tool for doing this sort of task is the find command. The GNU version of find has more extensive options than the POSIX specification (by quite a margin). The version on macOS Sierra (and Mac OS X) is similar to the GNU version. I found an online manual for OS X 10.9 at Apple find, but there's probably a better location somewhere.
It looks like you might want to run:
find . -name 'eee_blue'
which will print the names of matching files or directories, or perhaps:
find . -name 'eee_blue' -exec rm -fr {} +
which will run the rm -fr command on each name. You can run a custom script you create in place of rm -fr if you prefer; if the logic is complex, it's what I do.
Be extremely cautious before using rm -fr automatically!
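In the spirit of that caution, here is a minimal sketch that only resolves the path and refuses to answer unless exactly one directory matches. The function name resource_path is hypothetical. It assumes the resource name contains no glob characters (find's -name treats its argument as a pattern) and that no directory name contains a newline. The -type d, -prune and -print primaries are available in both GNU and BSD find.

```shell
# Hypothetical helper: print the path of the single directory named "$1"
# below the current directory; return non-zero if there are zero or several.
resource_path() {
    # -prune stops find from descending into a directory once it has matched.
    matches=$(find . -type d -name "$1" -prune -print)
    [ -n "$matches" ] || return 1
    # Count matching lines; the arithmetic expansion tolerates padded wc output.
    count=$(printf '%s\n' "$matches" | wc -l)
    [ "$((count))" -eq 1 ] || return 1
    printf '%s\n' "$matches"
}
```

A caller could then run path=$(resource_path eee_blue) and only go on to delete "$path" when the call succeeds.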
I am hoping that a more experienced set of eyes will find something obvious that I am missing or will be able to help me work around the errors that mv and rsync are producing. Up for the challenge?
Basic idea:
I have a bash script in which I am automating the move of files from one directory to another.
The problem:
When I run the script, periodically I get the following error from the mv command:
mv: cannot stat `/shares/directory with spaces/test file.txt': No such file or directory. The error code from the mv command is 1. Even more odd, the file move actually succeeds sometimes.
In addition, I have a branch of logic in the script that will alternately use rsync to move/copy specific files (from the same local file system source and destination as the mv command mentioned above). I get a similar error related to the stat() system call:
rsync: link_stat "/shares/directory with spaces/test file.txt" failed: No such file or directory (2)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1070) [sender=3.0.9]
This error does not always manifest itself when the script is run. Sometimes it completes the file move without complaint, while other times it will return the error consistently when the script is run successive times.
There is one additional ingredient you should be aware of (and I am growing to suspect this as a key ingredient in my grief): the directory /shares/ is a directory that is being monitored by an installation of Dropbox -- meaning it is watched and mirrored by an installation of Dropbox. At this point, I am unable to determine if dropboxd is somehow locking the file, or the like, such that it cannot be stat-ed. To be clear, the files are eventually freed from this state without further intervention and are mv-able.
The code:
mv -v --no-clobber "${SOURCEPATH}${item}" "${DESTINATIONPATH}${item}"
More info:
The following might, or might not, be relevant:
mount indicates the filesystem is ext4
Presumably, ownership and permissions shouldn't be an issue as the script is being run by root. Especially if the file system is not fuse-based.
The base "directory" in the path (e.g. /shares/) is a symlink to another directory on the same file system.
The flavor of Linux is Debian.
Troubleshooting:
In trying to eliminate any issues with the variable expansion or their contents, I tried hardwiring the bash script like so:
mv -v --no-clobber "/shares/directory with spaces/test file.txt" "/new destination/directory with spaces/test file.txt" after verifying via ls -al that "test file.txt" existed. For reference the permissions were: -rw-r--r--
Unfortunately, this too results in the same error.
Other possible issues I could think of and what I have done to try to rule them out:
>> possible issue: slow HDD (or drive is in low power mode) or external USB drive
>> findings: The drives are all local SATA disks set to not park heads. In addition, even when forcing a consistent read from the file system, the same error happens
>> possible issue: non-Linux, NFS or fuse-based file system
>> findings: nope, source and destination are on the same local file system and mount indicates the file system is ext4
>> possible issue: white space or other unprintable chars in the file path
>> findings: verified that the source and destination paths were properly wrapped in quotes
>> possible issue: continuation issues after escaped newline (space after \ in wrapped command)
>> findings: made sure the command was all on one line, still the same error
>> possible issue: globbing (use of * in specifying the files to move)
>> findings: nope, each file is specified directly by path and name
>> possible issue: path confusion from the use of local path
>> findings: nope, file paths are fully qualified starting from /
>> possible issue: the files are not actually in the path specified
>> findings: nope, verified the file existed right prior to executing the script via ls -al
>> possible issue: somehow the --no-clobber of mv was causing issues
>> findings: nope, tried it without, same error
>> possible issue: only files created via Dropbox sync to the file system are problematic
>> findings: nope, created a local file directly via touch new-local-file.txt and it too produced the same stat() error
My analysis:
The fact that mv and rsync produce similar stat() errors leads me to believe:
there is some systemic underlying boundary case (e.g. file permissions/ownership or file busy) that is not accounted for in the bash script; or
the same bug is plaguing me in both the mv and the rsync scenarios.
Desired outcomes:
1. The root cause of the intermittent errors can be identified.
2. The root cause can be resolved or worked around.
3. The bash script can be improved to gracefully handle when the error occurs.
So, with a lot more troubleshooting I found an errant rsync statement some 200 lines earlier in the script that was conditionally executed (thus the seemingly inconsistent behavior). That rsync --archive ... statement was being passed /shares/ as its source directory, therefore it affected the /shares/directory with spaces/ subdirectory. That subdirectory was the ${SOURCEPATH} of the troubling mv command mentioned in my post above.
Ultimately, it was a missing --dry-run flag on the rsync --archive ... statement that caused the trampling of the files that the script later expected to pass to mv to process.
Thanks to all who took the time to read my post. Though I am bummed to have spent my and your time on what turned out to be a bug in my script, it is reassuring to know that:
- computers are not irrational
- I am not insane
- there is not some nefarious, deep rooted bug in the linux file system
For those that stumble upon this post in the future because you are experiencing an error of cannot stat, please read my troubleshooting notes above. Much research went into that list. One of those might be your issue. If not, keep debugging, there is an explanation. Good luck!
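On the third desired outcome (handling the error gracefully), one option is to check for the source right before moving it and to report a distinct status when it has vanished. This is only a sketch: the function name safe_move and the return code 2 are arbitrary choices for illustration. A small window remains between the check and the mv, so the exit status of mv itself should still be checked by the caller.

```shell
# Hypothetical helper: move "$1" to "$2" without overwriting, but skip with a
# warning (return code 2, an arbitrary choice) if the source no longer exists.
safe_move() {
    # -L also accepts a dangling symlink, which -e alone would report as missing.
    if [ ! -e "$1" ] && [ ! -L "$1" ]; then
        printf 'skip: %s no longer exists\n' "$1" >&2
        return 2
    fi
    # -n is the short form of --no-clobber; -- ends option parsing.
    mv -n -- "$1" "$2"
}
```

In the script above this could replace the bare mv call, e.g. safe_move "${SOURCEPATH}${item}" "${DESTINATIONPATH}${item}", with the caller logging and continuing when the status is 2.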
Now, I get the feeling that some people will think that there was no original file of a hard link, but I would strongly disagree because of the following experiment I did.
Let's create a file with the content pwd and make a hard link to a subfolder:
echo "pwd" > original
mkdir subfolder
cp -l original subfolder/hardlink
Now let's see what the files output if I run it with shell:
sh original
sh subfolder/hardlink
The output is the same, even though the file hardlink is in a subfolder!
Sorry for the long intro, but I wanted to make sure that nobody says that my following question is irrelevant.
So my question now is: If the content of the original file was not conveniently pwd, how do I find out the path to the original file from a hard link file?
I know that linux programs seem to know the path somehow, but not the filename, because some programs returned error messages that <path to original file>/hardlinkname was not found. But how do they do that?
Thanks in advance for an answer!
Edit: Btw, I fixed the error messages mentioned above by naming the hard links the same as the original file.
But how do they do that?
By looking for the same inode value. Here's one way you can list files with the same inode:
find /home -xdev -samefile original
replace /home with any other starting directory for find to start searching.
how do I find out the path to the original file from a hard link file?
For hard links there are no multiple files, just one file (inode) with multiple (file) names.
ADDENDUM:
is there no other way to find the hard links of an inode than searching through folders?
ln, ls, find, and stat are the common ways of discovering and querying the filesystem for inodes. Then depending on what next you want to accomplish, many file, directory, archiving, and searching commands recognize inode values. Some may require a special -inum or --follow or equivalent option to specify inodes.
The find example I gave above is just one such usage. Another is to combine with xargs to operate on all the found files. Here's one way to delete them all:
find /home -xdev -samefile original -print0 | xargs -0 rm
Look under --help for other standard os commands. Most Linux distributions also come with help files that explain inodes and which tools work with inodes.
pwd is the present working directory, so of course the output should be the same, since you didn't cd into your subfolder.
Sorry to say, but there is no "original" file if you create other hardlinks. If you want to get other hardlinks of a file, look at How to find all hard links to a given file? for example.
Agree with #Emacs User. Your example of pwd is irrelevant and confused you.
There is no concept of an original file for hard links. Each file name is just one more reference to the content on disk pointed to by the inode, and the inode keeps a count of those references (see 'ls -li original subfolder/hardlink'). So even if you delete the original file, hardlink still points to the same content.
It is impossible to find out as all hard links are treated the same way pointing to one inode.
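To make the point in these answers concrete, here is a small sketch that lists every name of a file under a given directory. The function name list_hard_links is hypothetical, and it relies on the -samefile primary, which GNU and BSD find provide but POSIX does not require.

```shell
# Hypothetical helper: print every path under directory "$2" that is a hard
# link to (shares an inode with) file "$1". -xdev keeps the search on one
# file system, since hard links cannot cross file systems.
list_hard_links() {
    find "$2" -xdev -samefile "$1"
}
```

Running list_hard_links subfolder/hardlink . in the experiment from the question would print both ./original and ./subfolder/hardlink, with nothing to mark either one as the "original". Removing original afterwards leaves subfolder/hardlink fully readable.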
I want to move /bin/ls to /root, but I typed a wrong dir:
mv /bin/ls /roo
Now I can't find the ls command file. How can I retrieve it?
First of all, why do you want to do that?? Careful with root privilege!!
Unless you have an extremely good reason and know exactly what you're doing, don't move unix commands from /bin. For one thing, other OS components and libraries may depend on them and you could totally hose your system.
ls is used from various binaries in subprocesses to list files.
Do this to recover, if you're sure what you're showing here is what you did to move it exactly.
mv /roo /bin/ls