How to identify renamed files on linux?

I am using the 'find' command to identify modified files. But I've noticed that my method only identifies content-modified files and new files. It does not identify files where the only change was a rename. Is there a way to use 'find' to identify renamed files? If not, is there some other Linux command that can be used for this?
Here is my current method for identifying changed files going back roughly one month (this method does NOT identify renamed files):
$ touch --date "2017-09-10T16:00:00" ~/Desktop/tmp
$ find ~/Home -newer ~/Desktop/tmp -type f > modified-files

You should replace -newer ~/Desktop/tmp with \( -newer ~/Desktop/tmp -o -cnewer ~/Desktop/tmp \) in order to also catch changes to file metadata; a rename updates the file's ctime (status change time), so -cnewer picks it up.
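Applied to the example above, that would look something like this (same reference file; an untested sketch):
$ touch --date "2017-09-10T16:00:00" ~/Desktop/tmp
$ find ~/Home \( -newer ~/Desktop/tmp -o -cnewer ~/Desktop/tmp \) -type f > modified-files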

Related

Finding and following symbolic links but without deleting them

The current find command is used to find and delete outdated files and directories. Which data is expired, and where to look, is determined by a given properties file.
If the properties file says…
"/home/some/working/directory;.;180"
…then we want files and empty subdirectories deleted after 180 days.
The original command is…
"find ${var[0]} -mtime +${var[2]} -delete &"
…but I now need to modify it, because we've discovered it has deleted symbolic links that existed in the specified sub-directories once they passed the expiration age given in the properties file. The path and expiration time variables are designated in the properties file (as previously demonstrated).
I have been testing using…
"find -L"
…to follow the symbolic links to make sure this clean up command reaches the destinations, as desired.
I have also been testing using…
"\! -type l"
…to ignore deleting symbolic links, so the command I've been trying is…
"find -L ${var[0]} ! -type l -mtime +${var[2]} -delete &"
…but I haven't achieved the desired results. Help please; I am still fairly new to Linux and my research hasn't led me to an answer. Thank you for your time.
Change
\! -type l
to
\! -xtype l
find -L ${var[0]} \! -xtype l -mtime +${var[2]} -delete &
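With -L in effect, -type l only matches broken links (working links are followed to their targets), while -xtype l tests the link itself, which is what you want to exclude before -delete. A minimal demonstration, using a made-up scratch directory:
mkdir -p /tmp/xtypedemo && cd /tmp/xtypedemo
touch realfile
ln -s realfile goodlink
find -L . ! -type l          # still lists ./goodlink, because the link is followed to a regular file
find -L . ! -xtype l         # excludes ./goodlink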

Best way to tar and zip files meeting specific name criteria?

I'm writing a shell script on a Linux machine to be run via a crontab which is meant to move all files older than the current day to a new folder, and then tar and zip the entire folder. Seems like a simple task but for some reason, I'm running into all kinds of roadblocks. I'm new to this and self-taught so any help or redirection would be greatly appreciated.
Specific criteria for which files to archive:
All log files are in /home/tech/logs/ and all pdfs are in /home/tech/logs/pdf
All files are over a day old as indicated by the file name (file name does not include $CURRENT_DATE)
All files must be *.log or *.pdf (i.e. even if a file doesn't include $CURRENT_DATE, don't archive it if it isn't a log or pdf file).
Filename formatting specifics:
All the log file names in /home/tech/logs follow the format NAME 00_20180510.log, and all the pdf files are in a "pdf" subdirectory (/home/tech/logs/pdf) with the format NAME 00_20180510_00000000.pdf ("20180510" is whenever the file was created and the 0's can be any digits). I need to use the name rather than the file metadata for the creation date, and all files (pdf/log) whose name does not include the current date are "old". I also can't simply move every file that doesn't contain $CURRENT_DATE in its name, because that would take any non-*.pdf or non-*.log files with it.
Right now the script creates a new folder with a new pdf subdir for the old files (mkdir -p /home/tech/logs/$ARCHIVE_NAME/pdf). I then want to move the old logs into $ARCHIVE_NAME, and move all old pdfs from the original pdf subdirectory into $ARCHIVE_NAME/pdf.
Current code:
find /home/tech/logs -maxdepth 1 -name ( "*[^$CURRENT_DATE].log" "*.log" ) -exec mv -t "$ARCHIVE_NAME" '{}' ';'
find /home/tech/logs/pdf -maxdepth 1 -name ( "*[^$CURRENT_DATE]*.pdf" "*.pdf" ) -exec mv -t "$ARCHIVE_NAME/pdf" '{}' ';'
This hasn't been working because the bracket expression treats the digits in $CURRENT_DATE as a set of individual characters to exclude rather than as a literal string.
I've considered just using tar's exclude options like this:
tar -cvzPf "$ARCHIVE_NAME.tgz" --directory /home/tech/logs --exclude="$CURRENT_DATE" --no-unquote --recursion --remove-files --files-from="/home/tech/logs/"
But a) it doesn't work, and b) it would theoretically include all files that weren't *.pdf or *.log files, which would be a problem.
Am I overcomplicating this? Is there a better way to go about this?
I would go about this using bash's extended glob features, which allow you to negate a pattern:
#!/bin/bash
shopt -s extglob
mv /home/tech/logs/!(*"$CURRENT_DATE"*).log "$ARCHIVE_NAME"
mv /home/tech/logs/pdf/!(*"$CURRENT_DATE"*).pdf "$ARCHIVE_NAME"/pdf
With extglob enabled, !(pattern) expands to everything that doesn't match the pattern (or list of pipe-separated patterns).
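A quick way to convince yourself the pattern does what you want is to try it on scratch files first (the directory and filenames below are made up):
shopt -s extglob
CURRENT_DATE=20180510
mkdir -p /tmp/globtest && cd /tmp/globtest
touch "NAME 00_20180509.log" "NAME 00_20180510.log" notes.txt
echo !(*"$CURRENT_DATE"*).log     # prints only "NAME 00_20180509.log"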
Using find it should also be possible:
find /home/tech/logs -name '*.log' -not -name "*$CURRENT_DATE*" -exec mv -t "$ARCHIVE_NAME" {} +
Building on @Tom Fenech's answer, optimized to avoid many mv invocations (using a NUL-delimited list, since the filenames contain spaces):
find /home/tech/logs -maxdepth 1 -name '*.log' -not -name "*_${CURRENT_DATE?}.log" -print0 | \
xargs -0 mv -t "${ARCHIVE_NAME?}"
An interesting side effect of processing the file list through pipes is that you can filter it with extra tools (e.g. grep), which can arguably be more readable:
find /home/tech/logs -maxdepth 1 -name '*.log' -print0 | grep -zvF "_${CURRENT_DATE?}" | \
xargs -0 mv -t "${ARCHIVE_NAME?}"
Then do the same for the pdf ones. By the way, you can "dry-run" the above by just replacing mv with echo mv.
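For example, the pdf pass could look like this (an untested sketch using the same variables; drop the echo once the output looks right):
find /home/tech/logs/pdf -maxdepth 1 -name '*.pdf' -print0 | grep -zvF "_${CURRENT_DATE?}" | \
xargs -0 echo mv -t "${ARCHIVE_NAME?}/pdf"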
--jjo

Linux backup all files with known extensions with timestamps

I want to backup all files with a given extension in a directory but I want them to be with timestamps.
Given a directory:
Sample/, with multiple subdirectories and a subfolder named BACKUPS.
cd Sample
find . -name '*.xml' -exec cp {} BACKUPS \;
Say I have multiple xml files in this Sample folder and I want them copied to the BACKUPS folder, but with a timestamp appended to each,
say...
text.xml.20171107
conf.xml.20171107
I am able to backup the files but I could not figure out how to append a timestamp to the files using the find command.
You could try this:
find . -name '*.xml' -execdir cp {} "$PWD/BACKUPS/{}.$(date +%Y%m%d)" \;
As before, we use find . -name '*.xml' to locate all the files. However, in order to get rid of the subdirectory names, we use -execdir instead of -exec. This causes the specified command to be run from inside the subdirectory the current file is in and replaces {} with the file's base name (prefixed with ./).
This means we have to modify cp's second argument (the target filename). We now pass "$PWD/BACKUPS" to create an absolute path ($PWD is the current working directory). This way cp always targets the right directory, even when invoked from a subdirectory of Sample.
Finally, the filename we use is constructed from {}.$(date +%Y%m%d). $( ) runs the specified command and substitutes its output (the current date, in this case). This is done by the shell before find is invoked, so find just sees .../{}.20171107. The {} part is replaced by find itself just before it runs each cp.
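To preview the copies before running them for real, one low-risk check (the same command with cp prefixed by echo) is:
cd Sample
find . -name '*.xml' -execdir echo cp {} "$PWD/BACKUPS/{}.$(date +%Y%m%d)" \;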

grep files based on time stamp

This should be pretty simple, but I am not figuring it out. I have a large code base, more than 4 GB, under Linux. A few header files and xml files are generated during the build (using GNU make). If it matters, the header files are generated from the xml files.
I want to search for a keyword in header files that were last modified after a given time (my compile start time), and similarly in xml files, as separate grep queries.
If I run it on all header or xml files, it takes a lot of time; I only need the ones that were auto-generated. Further, the search has to be recursive, since there are a lot of directories and sub-directories.
You could use the find command:
find . -mtime 0 -type f
prints a list of all files (-type f) in and below the current directory (.) that were modified in the last 24 hours (-mtime 0, 1 would be 48h, 2 would be 72h, ...). Try
grep "pattern" $(find . -mtime 0 -type f)
To find 'pattern' in all files newer than some_file in the current directory and its sub-directories recursively:
find -newer some_file -type f -exec grep 'pattern' {} +
You could also specify the timestamp directly in date -d format (GNU find accepts this via -newermt) and combine it with other find tests, e.g. -name, -mmin.
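For example, if the compile started at a known time, one sketch (the reference file, time, and keyword here are made up) is to create a timestamp file and combine -newer with -name:
touch -d "2017-11-07 09:30" /tmp/compile_start       # assumed compile start time
find . -type f -name '*.h' -newer /tmp/compile_start -exec grep -l 'MY_KEYWORD' {} +
find . -type f -name '*.xml' -newer /tmp/compile_start -exec grep -l 'MY_KEYWORD' {} +
grep -l lists only the names of the files that contain the keyword.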
The file list could also be generated by your build system if find is too slow.
More specific tools such as ack, etags, GCCSense might be used instead of grep.
Use this instead, because if find doesn't return any files, the grep in the command above will sit waiting for input on stdin and halt the script:
find . -mtime 0 -type f -print0 | xargs -0 grep "pattern"

Bash script to recursively step through folders and delete files

Can anyone give me a bash script or one-line command I can run on Linux to recursively go through each folder from the current folder and delete all files or directories starting with '._'?
Change directory to the root directory you want (or change . to the directory) and execute:
find . -name "._*" -print0 | xargs -0 rm -rf
xargs allows you to pass several parameters to a single command, so it will be faster than using the find -exec syntax with \; for each file. Also, you can run the find part alone (without the pipe) first to view the files it will delete and make sure it is safe.
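For example, a preview pass followed by the real run would look like this (same pattern as above):
find . -name "._*"                              # list what would be deleted
find . -name "._*" -print0 | xargs -0 rm -rf    # then actually delete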
find . -name '._*' -exec rm -Rf {} \;
I've had a similar problem a while ago (I assume you are trying to clean up a drive that was connected to a Mac which saves a lot of these files), so I wrote a simple python script which deletes these and other useless files; maybe it will be useful to you:
http://github.com/houbysoft/short/blob/master/tidy
find /path -name "._*" -exec rm -fr "{}" +
Instead of deleting the AppleDouble files, you could merge them with the corresponding files. You can use dot_clean.
dot_clean -- Merge ._* files with corresponding native files.
For each dir, dot_clean recursively merges all ._* files with their corresponding native files according to the rules specified with the given arguments. By default, if there is an attribute on the native file that is also present in the ._ file, the most recent attribute will be used.
If no operands are given, a usage message is output. If more than one directory is given, directories are merged in the order in which they are specified.
Because dot_clean works recursively by default, use:
dot_clean <directory>
If you want to turn off the recursive merge, use -f for a flat merge.
dot_clean -f <directory>
find . -name '._*' -delete
A bit shorter, and it performs better with an extremely long list of files.
