Automatically hardlinking files, but only once - linux

I have a script that runs every 30 minutes to find files matching a string and hard-link them into another folder. That folder is then uploaded to a backup and its contents are removed locally.
My current setup works, but it inevitably hard-links a file again after it has been removed locally.
I want to implement a way of logging what has already been linked, so that when something is matched, it is also checked against a "hardlinklog.txt" file.
find . -name '*FILE*' -print0 | xargs -0 ln -t ~/media/
That is my current script with changed paths and filter.

This would be a job for grep -v -x -f hardlinklog.txt
-v: Print only non-matching lines
-f <file>: Read the patterns to match against from <file>
-x: Match entire lines only
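One way to wire that into your existing pipeline (a minimal sketch: the ~/media/ target, the *FILE* pattern and hardlinklog.txt are the names from your question, and filenames containing newlines are not handled, since the log stores one path per line):

log=~/hardlinklog.txt
touch "$log"
find . -name '*FILE*' |
    grep -vxF -f "$log" |        # skip anything already recorded in the log
    while IFS= read -r file; do
        ln -t ~/media/ -- "$file" && printf '%s\n' "$file" >> "$log"
    done

The -F makes grep treat each logged path as a literal string rather than a regular expression; with an empty log nothing matches, so every file passes through on the first run.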

Related

bash rm to delete old files only deleting the first one

I'm using Ubuntu 16.04.1 LTS
I found a script to delete everything but the 'n' newest files in a directory.
I modified it to this:
sudo rm /home/backup/`ls -t /home/backup/ | awk 'NR>5'`
It deletes only one file. It reports the following message about the rest of the files it should have deleted:
rm: cannot remove 'delete_me_02.tar': No such file or directory
rm: cannot remove 'delete_me_03.tar': No such file or directory
...
I believe that the problem is the path. It's looking for delete_me_02.tar (and subsequent files) in the current directory, and it's somehow lost its reference to the correct directory.
How can I modify my command to keep looking in the /home/backup/ directory for all 'n' files?
Maybe find could help you do what you want:
find /home/backup -type f | xargs ls -t | tail -n +6 | xargs rm
But I would first check what the pipeline returns (just remove | xargs rm) to verify what is going to be removed.
The command in the backticks will be expanded to the list of relative file paths:
`ls -t /home/backup/ | awk 'NR>5'`
a.txt b.txt c.txt ...
so the full command will now look like this:
sudo rm /home/backup/a.txt b.txt c.txt
which, I believe, makes it pretty obvious why only the first file is removed.
There is also a limit on the number of arguments you can pass to rm, so
you are better off modifying your script to use xargs instead:
ls -t /home/backup | tail -n +6 | xargs -I{} echo rm /home/backup/'{}'
(just remove echo once you have verified that it produces the expected results for you)
After the command substitution expands, your command line looks like
sudo rm /home/backup/delete_me_01.tar delete_me_02.tar delete_me_03.tar etc
/home/backup is not prefixed to each word from the output. (Aside: don't use ls in a script; see http://mywiki.wooledge.org/ParsingLs.)
Frankly, this is something most shells just don't make easy to do properly. (Exception: with zsh, you would just use sudo rm /home/backup/*(Om[1,-6]).) I would use some other language.
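For completeness, a sketch of a bash/GNU-coreutils alternative that keeps the 5 newest files without parsing ls (assuming GNU find, sort, tail, sed and xargs; the -z/-0 options are GNU extensions, and /home/backup and the keep-count of 5 are taken from the question):

find /home/backup -maxdepth 1 -type f -printf '%T@ %p\0' |
    sort -zrn |               # newest first, NUL-delimited
    tail -z -n +6 |           # drop the 5 newest entries, keep the rest
    sed -z 's/^[^ ]* //' |    # strip the leading timestamp field
    xargs -0 -r echo rm --    # remove "echo" once the list looks right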

executing Linux sed command and version control complain

I have a folder which contains JSP files and is under version control. I used find and sed to change part of the text in some of the files. The command successfully changed all occurrences of the specified pattern, but
when I synchronize the folder with the remote repository I see many files listed as modified in which nothing has actually changed. I suspect something is wrong with the whitespace. Could anyone shed some light on this matter?
I'm trying to replace ../../images/spacer with ${pageContext.request.contextPath}/static/images/spacer in all JSP files under the current folder.
The command I'm using is as follows:
find . -name '*.jsp' -exec sed -i 's/..\/..\/images\/spacer/${pageContext.request.contextPath}\/static\/images\/spacer/g' {} \;
On most systems, grep has an option to search recursively for files that contain a pattern, avoiding the need for find.
So, the command would be:
grep -r -l -m1 "\.\./\.\./images/spacer" --include \*.jsp |
xargs -r sed -i 's!\.\./\.\./\(images/spacer\)!${pageContext.request.contextPath}/static/\1!g'
Explanation
Both grep and sed work with regular expressions, in which the dot character . matches any character, including the dot itself. To indicate a literal dot, it must be escaped with a backslash, \. So to search for .. it is necessary to specify \.\., otherwise the pattern can also match text like ab/cd/.
Now, about the grep options:
-m1 stops searching a file after the first match, avoiding reading the entire file.
-r searches recursively through the directories.
--include \*.jsp searches only in files whose names match the *.jsp pattern.
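If you want to preview the changes before rewriting anything, one option (a sketch, not part of the original answer) is to run the same sed without -i and with -n/p so it prints only the lines it would modify:

grep -r -l -m1 "\.\./\.\./images/spacer" --include \*.jsp |
    xargs -r sed -n 's!\.\./\.\./\(images/spacer\)!${pageContext.request.contextPath}/static/\1!gp'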

Create symlink from the contents of a file

I have an inefficient workflow in which I have to copy directories between a Linux machine and a Windows machine. The directories contain symlinks which, after the Linux > Windows > Linux round trip, end up as plain files containing the link target as text (e.g. foobar.C contains the text ../../../Foo/Bar/foobar.C).
Is there an efficient way to recreate the symlinks from the file contents, recursively, for a complete directory?
I have tried:
find . | xargs ln -s ??A?? ??B?? && mv ??B?? ??A??
where I really have no idea how to populate the placeholders, but ??A?? should be the symlink's destination read from the file and ??B?? should be the name of the file with the suffix _temp appended.
If you are certain that all the files contain a symlink, it's not very hard.
find . -type f -print0 | xargs -r -0 sh -c '
    for f; do ln -s "$(cat "$f")" "${f}_temp" && mv "${f}_temp" "$f"; done' _
The _ dummy argument is necessary because the first argument after the sh -c command string is used to populate $0 in the subshell. The shell itself is necessary because you cannot directly pass multiple commands to xargs.
The -print0 and corresponding xargs -0 are a GNU extension to correctly cope with tricky file names. See the find manual for details.
I would perhaps add a simple verification check before proceeding with the symlinking; for example, if grep -cm2 . on the file returns 2, skip the file (it contains more than one non-empty line of text). If you can be more specific (say, all the symlinks begin with ../), by all means be more specific.
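A sketch of what such a check might look like folded into the loop above (the single-line test and the ../ prefix filter are just the heuristics suggested here; adjust them to your data):

find . -type f -print0 | xargs -r -0 sh -c '
    for f; do
        # A copied symlink should hold exactly one non-empty line.
        [ "$(grep -c -m2 . "$f")" -eq 1 ] || continue
        target=$(cat "$f")
        # Be stricter if you can, e.g. only accept relative targets.
        case $target in ../*) ;; *) continue ;; esac
        ln -s "$target" "${f}_temp" && mv "${f}_temp" "$f"
    done' _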

Removing files in a sub directory based on modification date [duplicate]

This question already has answers here:
bash script to remove directories based on modified file date
(3 answers)
Closed 8 years ago.
Hi, I'm trying to remove old backup files from a subdirectory when the number of files exceeds a maximum, and I found this command to do that:
ls -t | sed -e '1,10d' | xargs -d '\n' rm
And my changes are as follows
ls -t subdirectory | sed -e '1,$f' | xargs -d '\n' rm
Obviously when I try running the script it gives me an error saying unknown commands: f
My only concern right now is that I'm passing the maximum number of files allowed in as an argument and storing it in f, but I'm not sure how to use that variable in the command above instead of hard-coding a specific number.
Can anyone give me any pointers? And is there anything else I'm doing wrong?
Thanks!
The title of your question says "based on modification date", so why not simply use find with its -mtime option?
find subdirectory -type f -mtime +5 -exec rm -v {} \;
This will delete all files older than 5 days.
The problem is that the file list you are passing to xargs does not contain the needed path information to delete the files. When called from the current directory, no path is needed, but if you call it with subdirectory, you need to then rm subdirectory/file from the current directory. Try it:
ls -t subdirectory # returns files with no path info
What you need to do is change to the subdirectory, run the removal pipeline there, then change back. Note that the sed expression also needs double quotes (so the shell expands $f) and the d command after the address range. In one line it could be done with:
pushd subdirectory &>/dev/null; ls -t | sed -e "1,${f}d" | xargs -d '\n' rm; popd
Other than doing it in a similar manner, you are probably better off writing a slightly longer and more flexible script that forms the list of files with the find command, to ensure the path information is retained.
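A minimal sketch of that find-based approach, assuming GNU find and coreutils, with the keep-count taken from the f variable mentioned in the question (filenames containing embedded newlines are still not handled here):

#!/bin/bash
# Keep the $f newest files in subdirectory/ and delete the rest.
f=${1:?keep-count required}
find subdirectory -maxdepth 1 -type f -printf '%T@\t%p\n' |
    sort -rn |                    # newest first
    tail -n +"$((f + 1))" |       # drop the $f newest entries
    cut -f2- |                    # keep only the path (tab-delimited)
    xargs -d '\n' -r echo rm --   # remove "echo" after checking the output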

Retrieving the sub-directory, which had most recently been modified, in a Linux shell script?

How can I retrieve the sub-directory that was most recently modified within a directory?
I am using a shell script on a Linux distribution (Ubuntu).
Sounds like you want the ls options
-t sort by modification time, newest first
To show only directories, use something like this answer suggests (Listing only directories using ls in bash: An examination):
ls -d */
If you want each directory listed on its own line (and your file/dir names contain no newlines or other odd characters), I'd add -1. So, all together, this should list the directories in the current directory with the most recently modified at the top:
ls -1td */
And only the single newest directory:
ls -1td */ | head -n 1
Or, if you want to compare against a specific time, you can use find and its options like -cmin, -cnewer, -ctime, -mmin and -mtime; find can also handle odd names such as those containing newlines or spaces using null-terminated-name options like -print0.
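For example, a sketch of a find-based variant (the -printf, -z and --zero-terminated options assume GNU find and coreutils):

# Print the most recently modified subdirectory of the current directory,
# using NUL delimiters so names with spaces or newlines survive.
find . -mindepth 1 -maxdepth 1 -type d -printf '%T@\t%p\0' |
    sort -z -rn |    # newest first
    head -z -n 1 |   # keep only the newest entry
    cut -z -f2- |    # strip the timestamp field
    tr '\0' '\n'     # finish with a newline for display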
How much the subdirectory is modified is irrelevant. Do you know the name of the subdirectory? Get its content like this:
files=$(ls subdir-name)
for file in ${files}; do
    echo "I see there is a file named ${file}"
done
