inotifywait misses events while script is running - linux

I am running an inotifywait script that triggers a bash script which calls a function to synchronize my database of files whenever files are modified, created, or deleted.
#!/bin/sh
while inotifywait -r -e modify -e create -e delete /var/my/path/Documents; do
cd /var/scripts/
./sync.sh
done
This actually works quite well, except that during the 10 seconds it takes my sync script to run, the watch doesn't pick up any additional changes. There are instances where the sync has already looked at a directory, an additional change then occurs, and it isn't detected by inotifywait because the watches haven't been re-established yet.
Is there any way for inotifywait to trigger the script and still maintain the watch?

Use the -m option so that it runs continuously, instead of exiting after each event.
inotifywait -q -m -r -e modify -e create -e delete /var/my/path/Documents | \
while read event; do
cd /var/scripts
./sync.sh
done
This would actually have the opposite problem: if multiple changes occur while the sync script is running, the loop will run it again that many times afterwards. You might want to put something in the sync.sh script that prevents it from running again if it has run too recently.
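For example, a minimal sketch of such a guard at the top of sync.sh (the 30-second window, the stamp-file path, and the use of GNU stat are all just assumptions for illustration):
#!/bin/sh
# Skip this run if the previous sync finished less than 30 seconds ago.
stamp=/tmp/sync.stamp
now=$(date +%s)
last=$(stat -c %Y "$stamp" 2>/dev/null || echo 0)
if [ $((now - last)) -lt 30 ]; then
    exit 0
fi
# ... existing sync work goes here ...
touch "$stamp"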

Related

BASH: simultaneous execution of a multiloop function without waiting

Usecase:
I need to transfer binary files (1Gb) to an array of IPs and start executing them upon arrival at their destinations, without waiting for all binaries to be transferred/executed. A sort of parallel mode.
Situation:
I have 2 functions - transfer and execution (depending on approach it can be shortened to 1 with 2 loops).
for N in "${NODES[#]}"; do
rsync -Pcz -e "ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null" --timeout=10 $FILE user#$N
done
and
for N in "${NODES[#]}"; do
ssh user#$N "cd ~/; ./exec.sh"
done
The point is that in this case I have to wait until all transfers finish first (and there can sometimes be tens of addresses) and only afterwards start the execution.
If I combine the loops into a single one, I have to wait again - this time for transfer+execution per node.
Expectation:
I'd like to transfer a file to the first node, start its execution, and switch to the second node with the same process, and so on. So timing would count for the transfers only, whereas each node executes the file on its own in parallel.
Obstacles:
1- I need to be able to get the execution output from each node
2- additional packages, like screen, are not an option.
What did I try:
I was thinking about injecting some script to the remote nodes via the loop to control the execution from there. But I'm sure there must be some less barbaric option.
What can be done here?
You should be able to use a single loop, and run the ssh command with a & suffix, which runs it in the background (i.e. without waiting for it to finish), and then after the loop use wait to wait for all of them to finish. Collecting output will be more interesting... I think you'll need to collect each run's output into a file, and then print the files at the end. Something like this (note that I have not tested this properly):
tmpdir="$(mktemp -qd -t "$(basename "$0")")" || {
echo "Error creating temporary directory" >&2
exit 1
}
for nodenum in "${!NODES[#]}"; do
# The ${!array[#]} idiom gets a list of array *indexes*, not elements; get the element by index:
N=${NODES[nodenum]}
# Copy file, and wait for copy to finish:
rsync -Pcz -e "ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null" --timeout=10 $FILE user@$N
# Start the script, and *don't* wait for it to finish:
ssh user@$N "cd ~/ sh exec.sh" >"$tmpdir/$nodenum.out" 2>&1 &
done
# Wait for all of the scripts to finish
wait
# Print all of the outputs (in order)
for nodenum in "${!NODES[@]}"; do
echo
echo "Output from ${NODES[nodenum]}:"
cat "$tmpdir/$nodenum.out"
done
# Clean up the temp directory
rm -R "$tmpdir"
BTW, the remote command "cd ~/ sh exec.sh" doesn't make sense. Is there supposed to be a semicolon in there? Also, I recommend using lower or mixed-case variable names to avoid conflicts with the many all-caps variables that have some sort of special meaning, and putting double-quotes around variable references (i.e. rsync ... "$FILE" "user@$N" instead of rsync ... $FILE user@$N).
EDIT: this assumes you want to start the script on each host as soon as that particular copy is done; if you want to wait until all copies are done, then fire all scripts at once, use two loops: one to do the copies, then a second that does the ssh commands in the background (collecting output as above), then wait for those to all finish, then print all of the outputs.
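A rough sketch of that two-loop variant, reusing the same $tmpdir setup and output-printing loop as above (again, untested):
# First pass: copy the file to every node, one at a time.
for nodenum in "${!NODES[@]}"; do
    N=${NODES[nodenum]}
    rsync -Pcz -e "ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null" --timeout=10 "$FILE" "user@$N"
done
# Second pass: start every remote script in the background, collecting output per node.
for nodenum in "${!NODES[@]}"; do
    N=${NODES[nodenum]}
    ssh "user@$N" "cd ~/; ./exec.sh" >"$tmpdir/$nodenum.out" 2>&1 &
done
# Wait for all of the remote scripts to finish, then print the outputs as before.
wait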
You could do the transfer and script as a single background task, so that the script on a particular host starts as soon as its transfer is complete
for N in "${NODES[#]}"; do
(rsync -Pcz -e "ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null" --timeout=10 $FILE user#$N
ssh user#$N "cd ~/; ./exec.sh") > ${N}.log 2>&1 &
done
You then collect all of the hostname.log files.
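A minimal sketch of that collection step, assuming the loop above has already been run:
# Wait for every (rsync + ssh) background pair to finish, then print each log in order.
wait
for N in "${NODES[@]}"; do
    echo "Output from ${N}:"
    cat "${N}.log"
done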

inotifywait shell script run as daemon

I have a script that watches a directory (recursively) and performs a command when a file changes. This is working correctly when the monitoring flag is used as below:
#!/bin/sh
inotifywait -m -r /path/to/directory |
while read path action file; do
if [ <perform a check> ]
then
my_command
fi
done
However, I want to run this on startup and in the background, so naïvely thought I could change the -m flag to -d (run inotifywait as daemon, and include an --outfile location) and then add this to rc.local to have this run at startup. Where am I going wrong?
Well... with -d it backgrounds itself and outputs ONLY to the outfile, so your whole pipe & loop construct is moot, and it never sees any data.
Incron is a cron-like daemon for inotify events.
Just need to use incrontab and an entry for your task:
/path/to/directory IN_ALL_EVENTS /usr/local/bin/my-script $@ $# $%
And /usr/local/bin/my-script would be:
#! /bin/bash
path=$1
action=$2
file=$3
if [ <perform a check> ]
then
my_command
fi
You need to add a single & to the end of the command in your /etc/rc.local.
Putting a single & at the end of a command means "run this program in the background so the user can still have input".
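For example, assuming the watcher script above is saved as /usr/local/bin/watch-directory.sh (the path is just an illustration), the rc.local entry would be:
# /etc/rc.local - start the inotifywait watcher in the background at boot
/usr/local/bin/watch-directory.sh &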

How to kill a process on no output for some period of time

I've written a program that is supposed to run for a long time and it outputs its progress to stdout; however, under some circumstances it begins to hang, and the easiest thing to do is to restart it.
My question is: Is there a way to do something that would kill the process only if it had no output for a specific number of seconds?
I have started thinking about it, and the only thing that comes to mind is something like this:
./application > output.log &
tail -f output.log
then create a script which would look at the date and time of the last modification of output.log and restart the whole thing.
But it looks very tedious, and I would hate to go through all that if there were an existing command for it.
As far as I know, there isn't a standard utility to do it, but a good start for a one-liner would be:
timeout=10; if [ -z "`find output.log -newermt @$[$(date +%s)-${timeout}]`" ]; then killall -TERM application; fi
At least, this will avoid the tedious part of coding a more complex script.
Some hints:
Using the find utility to compare the last modification date of the output.log file against a time reference.
The time reference is returned by date utility as the current time in seconds (+%s) since EPOCH (1970-01-01 UTC).
Using the bash $[] arithmetic operation to subtract the $timeout value (10 seconds in the example)
If no output is returned from the above find, then the file wasn't changed for more than 10 seconds. This will trigger a true in the if condition and the killall command will be executed.
You can also set an alias for that, using:
alias kill_application='timeout=10; if [ -z "`find output.log -newermt @$[$(date +%s)-${timeout}]`" ]; then killall -TERM application; fi';
And then use it whenever you want by just issuing the command kill_application
If you want to automatically restart the application without human intervention, you can install a crontab entry to run every minute or so and also issue the application restart command after the killall (Probably you may also want to change the -TERM to -KILL, just in case the application becomes unresponsive to handleable signals).
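A sketch of that crontab approach (the paths, application name, and restart command are assumptions; note that % has to be escaped inside a crontab line, so it is simpler to put the check in a small script):
* * * * * /usr/local/bin/check_application.sh
with /usr/local/bin/check_application.sh being roughly:
#!/bin/bash
timeout=10
# If output.log hasn't been modified in the last $timeout seconds, restart the application.
if [ -z "$(find /path/to/output.log -newermt @$[$(date +%s)-${timeout}])" ]; then
    killall -TERM application
    /path/to/application > /path/to/output.log &
fi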
inotifywait could help here; it efficiently waits for changes to files. The exit status can be checked to identify whether the event (modify) occurred within the specified interval of time.
$ inotifywait -e modify -t 10 output.log
Setting up watches.
Watches established.
$ echo $?
2
Some related info from man:
OPTIONS
-e <event>, --event <event>
Listen for specific event(s) only.
-t <seconds>, --timeout <seconds>
Exit if an appropriate event has not occurred within <seconds> seconds.
EXIT STATUS
2 The -t option was used and an event did not occur in the specified interval of time.
EVENTS
modify A watched file or a file within a watched directory was written to.
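A minimal watchdog sketch built on that exit status (the application name, log path, and 10-second window are assumptions):
#!/bin/bash
./application > output.log &
while :; do
    inotifywait -qq -e modify -t 10 output.log
    if [ $? -eq 2 ]; then
        # No write to output.log within 10 seconds: restart the application.
        killall -TERM application
        ./application > output.log &
    fi
done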

Simple bash script runs asynchronously when run as a cron job

I have a backup script written that will do the following in this order:
Zip up files via SSH on a remote backup server
Dump my local database
Transfer my local database via SSH rsync to the backup server
Now when I run this script from the command line in RHEL it works a-ok perfectly fine.
BUT when I set this script to run via a cronjob, the script does run, but from what I can tell, it's somehow running the above 3 commands simultaneously. Because of that, things are getting done out of order (my local database finishes dumping and is transferred before the #1 zip job is actually complete).
Has anyone run across such a strange scenario? As the most simple fix, is there a way to force a script to run synchronously? Maybe add some kind of command to wait for the prior line to complete before moving on?
EDIT: I added an example version of my backup script. It seems that the second line of my script runs at the same time as the first line, so while the SSH command has been issued, it has not yet completed before my second line triggers and an SQL dump begins.
#!/bin/bash
THEDIR="sample"
THEDBNAME="mydatabase"
ssh -i /rsync/mirror-rsync-key sample@sample.com "tar zcvpf /$THEDIR/old-1.tar /$THEDIR/public_html/*"
mysqldump --opt -Q $THEDBNAME > mySampleDb
/usr/bin/rsync -avz --delete --exclude=**/stats --exclude=**/error -e "ssh -i /rsync/mirror-rsync-key" /$THEDIR/public_html/ sample@sample.com:/$THEDIR/public_html/
/usr/bin/rsync -avz --delete --exclude=**/stats --exclude=**/error -e "ssh -i /rsync/mirror-rsync-key" /$THEDIR/ sample@sample.com:/$THEDIR/
Unless you're explicitly using backgrounding (&), everything should run one-by-one, each command waiting until the prior one finishes.
Perhaps you are actually seeing overlapping prior executions by cron? If so, you can prevent multi-execution by calling your script with flock
e.g. midnight cron entry from
0 0 * * * backup.sh
to
0 0 * * * flock -n /tmp/backup.lock -c backup.sh
If you want to run commands in sequential order you can use the ; operator.
; – semicolon operator
This operator runs multiple commands in one go, but in sequential order. If we take three commands separated by semicolons, the second command will run after the first command completes, and the third will run only after the second completes. One point to note is that the second command runs regardless of the first command's exit status.
Execute the ls, pwd, and whoami commands in one line, sequentially one after the other.
ls;pwd;whoami
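If you instead want each command to run only when the previous one succeeded, the && operator is the counterpart that also checks the exit status, e.g.:
ls && pwd && whoami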
Please correct me if I am not understanding your question correctly.

Rsync cronjob that will only run if rsync isn't already running

I have checked for a solution here but cannot seem to find one. I am dealing with a very slow WAN connection, about 300kb/sec. For my downloads I am using a remote box, and then I am downloading them to my house. I am trying to run a cronjob that will rsync two directories on my remote and local server every hour. I got everything working, but if there is a lot of data to transfer the rsync runs overlap and end up creating two instances of the same file, thus sending duplicate data.
I want to instead call a script that would run my rsync command, but only if rsync isn't already running.
The problem with creating a "lock" file, as suggested in a previous solution, is that the lock file might already exist if the script responsible for removing it terminates abnormally.
This could for example happen if the user terminates the rsync process, or due to a power outage. Instead one should use flock, which does not suffer from this problem.
As it happens flock is also easy to use, so the solution would simply look like this:
flock -n lock_file -c "rsync ..."
The command after the -c option is only executed if there is no other process holding a lock on the lock_file. If the locking process terminates for any reason, the lock on the lock_file is released. The -n option says that flock should be non-blocking, so if another process is locking the file, nothing will happen.
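Applied to the hourly rsync job from the question, the crontab entry might look something like this (the lock-file path, rsync arguments, and log file are all illustrative):
0 * * * * flock -n /tmp/rsync_backup.lock -c "rsync -avz remotehost:/downloads/ /home/user/downloads/" >> /var/log/rsync_backup.log 2>&1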
Via the script you can create a "lock" file. If the file exists, the cronjob should skip the run; else it should proceed. Once the script completes, it should delete the lock file.
if [ -e /home/myhomedir/rsyncjob.lock ]
then
echo "Rsync job already running...exiting"
exit
fi
touch /home/myhomedir/rsyncjob.lock
#your code in here
#delete lock file at end of your job
rm /home/myhomedir/rsyncjob.lock
To use the lock file example given by @User above, a trap should be used to verify that the lock file is removed when the script exits for any reason.
if [ -e /home/myhomedir/rsyncjob.lock ]
then
echo "Rsync job already running...exiting"
exit
fi
touch /home/myhomedir/rsyncjob.lock
#delete lock file at end of your job
trap 'rm /home/myhomedir/rsyncjob.lock' EXIT
#your code in here
This way the lock file will be removed even if the script exits before reaching its end.
A simple solution without using a lock file is to just do this:
pgrep rsync > /dev/null || rsync -avz ...
This will work as long as it is the only rsync job you run on the server, and you can then run this directly in cron, but you will need to redirect the output to a log file.
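For example (the directories and log file are illustrative):
0 * * * * pgrep rsync > /dev/null || rsync -avz /data/ otherhost:/data/ >> /var/log/rsync_data.log 2>&1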
If you do run multiple rsync jobs, you can get pgrep to match against the full command line with a pattern like this:
pgrep -f rsync.*/data > /dev/null || rsync -avz --delete /data/ otherhost:/data/
pgrep -f rsync.*/www > /dev/null || rsync -avz --delete /var/www/ otherhost:/var/www/
As a last-resort solution, kill any running rsync processes in crontab before the new one starts.