Run additional command when rsync detects a file - linux

I am currently running the following script to make an automatic backup of my Music:
#!/bin/bash
while :; do
    rsync -ruv /mnt/hdd1/Music/ /mnt/hdd2/Music/
done
Whenever a new file is added to my music folder, it is detected by rsync and copied to my other disk. This script runs fine, but I would also like to convert each detected file to an Ogg Opus file to put on my phone.
My question is: How do I run a command on a new file found by rsync -u?
I will also accept answers which work totally differently, but have the same result.

rsync -ruv /mnt/hdd1/Music /mnt/hdd2/ | sed -n 's|^Music/||p' >~/filelist.tmp
while IFS= read -r filename
do
    # filenames in the list are relative to the Music directory (prefix stripped by sed above)
    [ -f "$filename" ] || continue
    # do something with the file
    echo "Now processing '$filename'"
done <~/filelist.tmp
With the -v option, rsync prints the names of files it copies to stdout. I use sed to capture just those filenames, excluding the informational messages, to a file. The filenames in that file can be processed later as you like.
The sed approach above depends on rsync displaying filenames that start with the final component of the source directory, e.g. "Music/" in my example, which is then stripped on the assumption that you don't need it. Alternatively, one could take an explicit approach to excluding the informational noise.
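To cover the conversion step the question asks about, here is a minimal sketch that feeds the captured list to ffmpeg; it assumes ffmpeg is built with libopus and uses a hypothetical /mnt/hdd2/Opus/ output tree mirroring the Music layout:
# Sketch only: convert each newly copied file to Ogg Opus.
while IFS= read -r filename
do
    src="/mnt/hdd2/Music/$filename"
    dst="/mnt/hdd2/Opus/${filename%.*}.opus"
    [ -f "$src" ] || continue                 # skip directories and vanished files
    [ -f "$dst" ] && continue                 # skip files already converted
    mkdir -p "$(dirname "$dst")"
    ffmpeg -nostdin -i "$src" -c:a libopus -b:a 128k "$dst"
done <~/filelist.tmp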

Related

Bash Scripting with xargs to BACK UP files

I need to copy a file from multiple locations to a backup directory while retaining its directory structure. For example, I have a file "a.txt" at the following locations: /a/b/a.txt /a/c/a.txt /a/d/a.txt /a/e/a.txt. I now need to copy this file from those locations to the backup directory /tmp/backup. The end result should be:
when I list /tmp/backup/a --> it should contain /b/a.txt /c/a.txt /d/a.txt & /e/a.txt.
For this, I used the command: echo /a/*/a.txt | xargs -I {} -n 1 sudo cp --parent -vp {} /tmp/backup. This throws the error "cp: cannot stat '/a/b/a.txt /a/c/a.txt a/d/a.txt a/e/a.txt': No such file or directory"
The -I option is taking the complete input from echo as a single value instead of individual values (as -n 1 would). If someone can help debug this issue, that would be more helpful than an alternative command.
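The diagnosis above is right: echo prints every matched path on a single line, so -I substitutes {} with that entire line. A minimal sketch of a direct fix, keeping xargs, is to emit one path per line (with --parents spelled out):
# One path per line, so each {} substitution is a single path.
printf '%s\n' /a/*/a.txt | xargs -I {} sudo cp --parents -vp {} /tmp/backup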
Use rsync with the --relative (-R) option to keep (parts of) the source paths.
I've used a wildcard for the source to match your example command rather than the explicit list of directories mentioned in your question.
rsync -avR /a/*/a.txt /tmp/backup/
Do the backups need to be exactly the same as the originals? In most cases, I'd prefer a little compression. [tar](https://man7.org/linux/man-pages/man1/tar.1.html) does a great job of bundling things including the directory structure.
tar cvzf /path/to/backup/tarball.tgz /source/path/
tar can't update compressed archives, so you can skip the compression
tar uf /path/to/backup/tarball.tar /source/path/
This gives you versioning of a sort, since it only updates changed files but keeps both the before and after versions.
If you have time and cycles and still want the compression, you can decompress before and recompress after.
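If you go that route, a minimal sketch of the decompress/update/recompress cycle (assuming GNU tar and gzip, and an archive created as above) might look like this:
gunzip /path/to/backup/tarball.tgz                  # unpacks to /path/to/backup/tarball.tar
tar uf /path/to/backup/tarball.tar /source/path/    # append files newer than their archived copies
gzip /path/to/backup/tarball.tar                    # produces tarball.tar.gz; rename it if you prefer .tgz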

Recursive Text Substitution and File Extension Rename

I am using an application that creates a text file on a Linux server. I then have the ability to execute a shell script (BASH 3.2.57) in which I need to convert the text file from Unix line endings to DOS and also change the extension of the file from .txt to .log.
I currently have a sed-based command to do this. This command is rewritten by the application at run time to point to the specific folder and file name; in this example that is ABC (any three capital letters in my examples are a placeholder that can be any three letters).
pushd /rootfolder/parentfolder/ABC/
sed 's/$/\r/' prABC.txt > prABC.log
popd
The problem with this is that if a user runs the application for two different groups, say ABC and DEF, at nearly the same time, the script will get overwritten with the DEF variables before ABC has had a chance to fire off and do its thing with the file. Additionally, the .txt is left in the folder regardless, and I would like it to be removed.
A friend of mine came up with the following code that seems to work, if it's determined to be our best solution, but I would think and hope we have a cleaner, more dynamic way to do this. Also, this current method requires that when my user decides to add a GHI directory and file, I have to update the code. I can program my application to do that for me, but I don't want this script to have to be rewritten every time the application wants to use it.
pushd /rootfolder/parentfolder/ABC
if [[ -f prABC.txt ]]
then
    sed 's/$/\r/' prABC.txt > prABC.log
    rm prABC.txt
fi
popd
pushd /rootfolder/parentfolder/DEF
if [[ -f prDEF.txt ]]
then
    sed 's/$/\r/' prDEF.txt > prDEF.log
    rm prDEF.txt
fi
popd
I would like to call this script at any time from my application and have it find any file named pr*.txt below the /rootfolder/parentfolder/ directory (if that has to include the parentfolder in its search, that won't be a problem), convert the line endings from LF to CRLF, and change the extension of the file from .txt to .log.
I've done a ton of searching and have found near-solutions, but not exactly what I need, and I want it to be as safe as possible (there are known issues with using find with a for loop). I don't know what utilities are installed on this build, so I would like to keep it as basic and supportable as possible. Thanks in advance :)
You should almost never need pushd and popd in scripts. In fact, you rarely need cd, either.
#!/bin/bash
for d in /rootfolder/parentfolder/ABC /rootfolder/parentfolder/DEF
do
    name=$(basename "$d")    # ABC, DEF, ...
    if [[ -f "$d/pr$name.txt" ]]
    then
        sed 's/$/\r/' "$d/pr$name.txt" > "$d/pr$name.log" &&
            rm "$d/pr$name.txt"
    fi
done
Recall that a && b is shorthand for
if a; then
    b
fi
In other words, if sed fails (because the source file can't be read, or the destination can't be written) we don't rm the source file. There should be an error message already so we don't add another one.
Not only is this more succinct, it is also easier to change if you decide that the old file should be renamed instead of removed, or you want to filter out all lines which contain "beef" in the sed script. Generally you should avoid repeated code; see also the DRY principle on Wikipedia.
Something is seriously wrong somewhere if you require DOS line endings in your files on Unix.
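Since the question also asks for something that picks up any pr*.txt below /rootfolder/parentfolder/ without listing each directory, here is a hedged sketch using find; the path and naming convention come from the question, the rest is an assumption:
#!/bin/bash
# Sketch: convert every pr*.txt under the parent folder to a CRLF .log and remove the .txt.
find /rootfolder/parentfolder/ -type f -name 'pr*.txt' -print0 |
while IFS= read -r -d '' txt
do
    log="${txt%.txt}.log"
    sed 's/$/\r/' "$txt" > "$log" &&
        rm "$txt"
done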

linux - watch a directory for new files, then run a script

I want to watch a directory in Ubuntu 14.04, and when a new file is created in this directory, run a script.
Specifically, I have security cameras that upload captured video via FTP when they detect motion. I want to run a script on this FTP server so that when new files are created, they get mirrored (uploaded) to a cloud storage service immediately, which is done via a script I've already written.
I found iWatch, which lets me do this (http://iwatch.sourceforge.net/index.html). The problem I am having is that iwatch kicks off the cloud upload script the instant the file is created in the FTP directory, even while the file is still being uploaded. This causes the cloud sync script to upload 0-byte files, which are useless to me.
I could maybe add a 'wait' in the cloud upload script, but that seems hacky, and it's impossible to predict how long to wait, as it depends on file size, network conditions, etc.
What's a better way to do this?
Although inotifywait was mentioned in comments, a complete solution might be useful to others. This seems to be working:
inotifywait -m -e close_write /tmp/upload/ | gawk '{print $1$3; fflush()}' | xargs -L 1 yourCommandHere
will run
yourCommandHere /tmp/upload/filename
when a newly uploaded file is closed.
Notes:
inotifywait is part of apt package inotify-tools in Ubuntu. It uses the kernel inotify service to monitor file or directory events
-m option is monitor mode, outputs one line per event to stdout
-e close_write for file close events for files that were open for writing. File close events hopefully avoid receiving incomplete files.
/tmp/upload can be replaced with some other directory to monitor
the pipe to gawk reformats the inotifywait output lines to drop the 2nd column, which is a repeat of the event type. It combines the dirname in column 1 with the filename in column 3 to make a new line, which is flushed every line to defeat buffering and encourage immediate action by xargs
xargs takes a list of files and runs the given command for each file, appending the filename on the end of the command. -L 1 causes xargs to run after each line received on standard input.
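For illustration only, yourCommandHere could be a small wrapper script like the sketch below; rclone and the remote name are placeholders, since the question doesn't say which cloud service or tool is used:
#!/bin/bash
# Hypothetical upload wrapper: receives one absolute path from xargs and mirrors it.
# "remote:camera-backup" is a placeholder rclone destination.
file="$1"
[ -f "$file" ] || exit 0
rclone copy "$file" remote:camera-backup/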
You were close to the solution there. You can watch many different events with iwatch; the one that interests you is close_write. Syntax:
iwatch -e close_write <directory_name>
This of course only works if the file is closed once writing is complete, which is a sane assumption, though not necessarily a true one (yet it often is).
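Combining that with iwatch's command option might look like the sketch below; it assumes the -c flag and the %f placeholder (expanded to the path of the file that triggered the event) behave as described in the iwatch documentation, and the script path and directory are placeholders:
# Sketch: run the existing upload script only when an uploaded file has been closed.
iwatch -r -e close_write -c '/path/to/cloud-upload.sh %f' /path/to/ftp/upload/dir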
Here's another version of reacting to a filesystem event by making a POST request to a given URL.
#!/bin/bash
set -euo pipefail
cd "$(dirname "$0")"
watchRoot=$1
uri=$2
function post()
{
    while read -r path action file; do
        echo '{"Directory": "", "File": ""}' |
            jq ".Directory |= \"$path\"" |
            jq ".File |= \"$file\"" |
            curl --data-binary @- -H 'Content-Type: application/json' -X POST "$uri" || continue
    done
}
inotifywait -r -m -e close_write "$watchRoot" | post
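A usage sketch, assuming the script above were saved as watch-post.sh (a placeholder name) and made executable:
# Watch /tmp/upload recursively and POST each finished file's metadata to the given URL.
./watch-post.sh /tmp/upload http://example.com/notify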

How can I automatically rename, copy and delete files in linux for my ip camera webcam?

I have an IP camera that automatically FTPs images every few seconds to a directory on my Ubuntu Server web server. I'd like to make a simple webcam page that references a static image and just refreshes every few seconds. The problem is that my IP camera's firmware automatically names every file with a date_time.jpg-style filename and does not have the option to overwrite the same file name over and over.
I'd like to have a script running on my Linux machine that automatically copies each new file that has been FTP'd into that directory to a different directory, renames it in the process, and then deletes the original.
Regards,
Glen
I made a quick script; you would need to uncomment the rm -f line to make it actually delete things :)
It currently prints the command it would have run, so you can test with higher confidence.
You also need to set the WORK_DIR and DEST_DIR variables near the top of the script.
#!/bin/bash
#########################
# configure vars
YYYYMMDD=$(date +%Y%m%d)
WORK_DIR=/Users/neil/linuxfn
DEST_DIR=/Users/neil/linuxfn/dest_dir
##########################
LATEST=$(ls -tr "$WORK_DIR/$YYYYMMDD"* 2>/dev/null | tail -1)
echo "rm -f $DEST_DIR/image.jpg ; mv $LATEST $DEST_DIR/image.jpg"
#rm -f $DEST_DIR/image.jpg ; mv $LATEST $DEST_DIR/image.jpg
This gives me the following output when I run it on my laptop:
mba1:linuxfn neil$ bash renamer.sh
rm -f /Users/neil/linuxfn/dest_dir/image.jpg ; mv /Users/neil/linuxfn/20150411-2229 /Users/neil/linuxfn/dest_dir/image.jpg
Inotify (http://en.wikipedia.org/wiki/Inotify) can be set up to do as you ask, but it would probably be better to use a simple web script (PHP, Python, Perl, etc.) to serve the latest file from the directory, instead.
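If you do go the inotify route, here is a hedged sketch along the lines of the inotifywait answers earlier on this page; the watch directory and web root below are placeholders, not paths from the question:
#!/bin/bash
# Sketch: whenever the camera finishes uploading a JPEG, publish it under a fixed name.
WATCH_DIR=/path/to/ftp/upload/dir      # placeholder
DEST=/var/www/html/webcam/image.jpg    # placeholder
inotifywait -m -e close_write --format '%w%f' "$WATCH_DIR" |
while IFS= read -r newfile
do
    case "$newfile" in
        *.jpg) mv -f "$newfile" "$DEST" ;;
    esac
done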

How to test whether file at path exists in repo?

Given a path (on my computer), how can I test whether that file is under version control (ie. a copy exists in the Perforce depot)? I'd like to test this at the command line.
Check p4 help files. In short, you run p4 files <your path here> and it will give you the depot path to that file. If it isn't in the depot, you'll get "no such file(s)".
For scripting, p4 files FILE is insufficient because it doesn't change its exit code when there is no such file.
Instead, you can pass it through grep, which looks for Perforce paths' telltale pair of leading slashes:
# silently true when the file exists or false when it does not.
p4_exists() {
    p4 files -e "$1" 2>/dev/null | grep -q ^//
}
You can get rid of the 2>/dev/null and grep's -q if you want visible output.
Before version 2012.1 (say, version 2011.1), p4 files didn't support -e; in that case, you'll have to add | grep -v ' - delete [^-]*$' before the grep above.
Warning: A future p4 release could change the formatting and break this logic.
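A quick usage sketch of the function above, with a placeholder depot path:
# Sketch: branch on whether a (placeholder) file is already tracked in the depot.
if p4_exists //depot/project/main/foo.c; then
    echo "already in Perforce"
else
    echo "not in the depot"
fi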
Similar to Adam Katz's solution, but more likely to be supported by future releases of p4: you can pass the global option -s, which 'prepends a descriptive field' to each output line. The field is one of 'text', 'info', 'error' or 'exit', followed by a colon (and a space, it seems), and is intended to facilitate scripting.
For all files passed in to the p4 -s files command, you should get one line back per file. If the file exists in the depot, the line starts with info: whereas if the file does not exist in the depot, the line starts with error:. For example:
info: //depot/<depot-path-to-file>
error: <local-path-to-file>
So, essentially, the status lines are the equivalent of an exit code but on a per-file basis. An exit code alone wouldn't cope neatly with an arbitrary number of files passed in.
Note that if there is a different error (e.g. a connection error) then an error line is still output. So, if you really want the processing to be robust, you may want to combine this with what Adam Katz suggested or perhaps grep for the basename of the file in the output line.
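A hedged sketch of that per-file check, relying only on the info:/error: prefixes described above:
# Sketch: classify each argument by the status prefix that p4 -s files prepends.
check_in_depot() {
    p4 -s files "$@" | while IFS= read -r line
    do
        case "$line" in
            info:*)  echo "in depot:     ${line#info: }" ;;
            error:*) echo "not in depot: ${line#error: }" ;;
        esac
    done
}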
