BASH: simultaneous execution of a multi-loop function without waiting (Linux)

Use case:
I need to transfer binary files (1 GB) to an array of IPs and start executing them as soon as they arrive at their destinations, without waiting for all binaries to be transferred/executed. A sort of parallel mode.
Situation:
I have two functions - transfer and execution (depending on the approach, this can be shortened to one function with two loops).
for N in "${NODES[@]}"; do
    rsync -Pcz -e "ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null" --timeout=10 $FILE user@$N:
done
and
for N in "${NODES[@]}"; do
    ssh user@$N "cd ~/; ./exec.sh"
done
The point is that in this case I have to wait until all transfers finish first (and there can sometimes be tens of addresses) and only afterwards start the execution.
If I combine the loops into a single one, I have to wait again - this time for transfer+execution per node.
Expectation:
I'd like to transfer a file to the first node, start its execution, and switch to the second node with the same process, and so on. So timing would count for the transfers only, whereas each node executes the file on its own in parallel.
Obstacles:
1 - I need to be able to get the execution output from each node.
2 - Additional packages, like screen, are not an option.
What did I try:
I was thinking about injecting some script into the remote nodes via the loop to control the execution from there. But I'm sure there must be some less barbaric option.
What can be done here?

You should be able to use a single loop, and run the ssh command with an & suffix, which runs it in the background (i.e. without waiting for it to finish), and then after the loop use wait to wait for all of them to finish. Collecting output will be more interesting... I think you'll need to collect each run's output into a file, and then print the files at the end. Something like this (note that I have not tested this properly):
tmpdir="$(mktemp -qd -t "$(basename "$0").XXXXXX")" || {
    echo "Error creating temporary directory" >&2
    exit 1
}
for nodenum in "${!NODES[@]}"; do
    # The ${!array[@]} idiom gets a list of array *indexes*, not elements; get the element by index:
    N=${NODES[nodenum]}
    # Copy file, and wait for copy to finish:
    rsync -Pcz -e "ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null" --timeout=10 $FILE user@$N:
    # Start the script, and *don't* wait for it to finish:
    ssh user@$N "cd ~/; ./exec.sh" >"$tmpdir/$nodenum.out" 2>&1 &
done
# Wait for all of the scripts to finish
wait
# Print all of the outputs (in order)
for nodenum in "${!NODES[@]}"; do
echo
echo "Output from ${NODES[nodenum]}:"
cat "$tmpdir/$nodenum.out"
done
# Clean up the temp directory
rm -R "$tmpdir"
BTW, I recommend using lower or mixed-case variable names to avoid conflicts with the many all-caps variables that have some sort of special meaning, and putting double-quotes around variable references (i.e. rsync ... "$FILE" "user@$N:" instead of rsync ... $FILE user@$N:).
EDIT: this assumes you want to start the script on each host as soon as that particular copy is done; if you want to wait until all copies are done and only then fire all scripts at once, use two loops: one to do the copies, then a second that does the ssh commands in the background (collecting output as above), then wait for those to all finish, then print all of the outputs.
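A rough, untested sketch of that two-loop variant, reusing the same NODES, FILE and tmpdir variables as above:
# First loop: copy the file to every node, one transfer after another:
for N in "${NODES[@]}"; do
    rsync -Pcz -e "ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null" --timeout=10 "$FILE" "user@$N:"
done
# Second loop: fire all of the remote scripts at once, in the background:
for nodenum in "${!NODES[@]}"; do
    N=${NODES[nodenum]}
    ssh "user@$N" "cd ~/; ./exec.sh" >"$tmpdir/$nodenum.out" 2>&1 &
done
# Then wait for them all and print the collected outputs exactly as in the code above.
wait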

You could do the transfer and script as a single background task, so that the script on a particular host starts as soon as its transfer is complete:
for N in "${NODES[@]}"; do
    (rsync -Pcz -e "ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null" --timeout=10 $FILE user@$N:
    ssh user@$N "cd ~/; ./exec.sh") > ${N}.log 2>&1 &
done
You can then collect all of the hostname.log files once the background jobs have finished, for example:
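A minimal, untested way to do that, assuming the loop above has already started the background subshells:
# Wait for every background transfer+execute subshell to finish:
wait
# Print each node's log in order:
for N in "${NODES[@]}"; do
    echo "==== $N ===="
    cat "${N}.log"
done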

Related

Taking sequentially output of multi ssh with bash scripting

I wrote a bash script and I don't have the option of downloading pssh. However, I need to run multiple ssh commands in parallel. There is no problem so far, but when I run the ssh commands I want to see the output from each remote machine sequentially. I mean, each ssh produces multiple lines of output and they get mixed up because more than one ssh is running at a time.
#!/bin/bash
pid_list=""
while read -r list
do
ssh user@$list 'commands' &
c_pid=$!
pid_list="$pid_list $c_pid"
done < list.txt
for pid in $pid_list
do
wait $pid
done
What should I add to the code to take the output unmixed?
The most obvious way to me would be to write the outputs in a file and cat the files at the end:
#!/bin/bash
me=$$
pid_list=""
while read -r list
do
ssh user@$list 'hostname; sleep $((RANDOM%5)); hostname; sleep $((RANDOM%5))' > /tmp/log-${me}-$list &
c_pid=$!
pid_list="$pid_list $c_pid"
done < list.txt
for pid in $pid_list
do
wait $pid
done
cat /tmp/log-${me}-*
rm /tmp/log-${me}-* 2> /dev/null
I didn't handle stderr because that wasn't in your question. Nor did I address the order of output because that isn't specified either. Nor is whether the output should appear as each host finishes. If you want those aspects covered, please improve your question.
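If you do want stderr captured as well, one small (untested) tweak to the ssh line above is to redirect it into the same per-host log file:
# Capture both stdout and stderr of the remote command in the per-host log:
ssh user@$list 'commands' > /tmp/log-${me}-$list 2>&1 &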

How do I setup two curl commands to execute at different times forever?

For example, I want to run one command every 10 seconds and the other command every 5 minutes. I can only get the first one to log properly to a text file. Below is the shell script I am working on:
echo "script Running. Press CTRL-C to stop the process..."
while sleep 10;
do
curl -s -I --http2 https://www.ubuntu.com/ >> new.txt
echo "------------1st command--------------------" >> logs.txt;
done
||
while sleep 300;
do
curl -s -I --http2 https://www.google.com/
echo "-----------------------2nd command---------------------------" >> logs.txt;
done
I would advise you to go with @Marvin Crone's answer, but researching cronjobs and background processes doesn't seem like the kind of hassle I would go through for this little script. Instead, try putting both loops into separate scripts; like so:
script1.sh
echo "job 1 Running. Type fg 1 and press CTRL-C to stop the process..."
while sleep 10;
do
echo $(curl -s -I --http2 https://www.ubuntu.com/) >> logs.txt;
done
script2.sh
echo "job 2 Running. Type fg 2 and press CTRL-C to stop the process..."
while sleep 300;
do
echo $(curl -s -I --http2 https://www.google.com/) >> logs.txt;
done
adding executable permissions
chmod +x script1.sh
chmod +x script2.sh
and last but not least running them:
./script1.sh & ./script2.sh &
This creates two separate jobs in the background that you can bring to the foreground by typing:
fg (1 or 2)
and stop with CTRL-C, or send to the background again by typing CTRL-Z.
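For reference, an illustrative job-control session for those two background scripts might look like this:
jobs      # list the background jobs and their job numbers
fg %1     # bring script1.sh to the foreground (CTRL-C stops it, CTRL-Z suspends it)
bg %1     # resume a suspended job in the background
kill %2   # terminate script2.sh without bringing it to the foreground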
I think what is happening is that you start the first loop. Your first loop needs to complete before the second loop will start. But, the first loop is designed to be infinite.
I suggest you put each curl loop in a separate script file.
Then, you can run each script separately, in the background.
I offer two suggestions for you to investigate for your solution.
One, research the use of crontab and set up a cron job to run the scripts.
Two, research the use of nohup as a means of running the scripts.
I strongly suggest you also research the means of monitoring the jobs and knowing how to terminate the jobs if anything goes wrong. You are setting up infinite loops. A simple Control C will not terminate jobs running in the background. You are treading in areas that can get out of control. You need to know what you are doing.
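As a rough, untested illustration of those two suggestions (the paths are placeholders):
# nohup keeps the scripts running after you log out; each logs to its own file:
nohup /home/user/script1.sh >> /home/user/script1.out 2>&1 &
nohup /home/user/script2.sh >> /home/user/script2.out 2>&1 &
# Stop them later by name:
pkill -f script1.sh
pkill -f script2.sh
# Or have cron start them at boot instead (add via crontab -e):
# @reboot /home/user/script1.sh >> /home/user/script1.out 2>&1
# @reboot /home/user/script2.sh >> /home/user/script2.out 2>&1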

inotifywait misses events while script is running

I am running an inotifywait script that triggers a bash script to call a function to synchronize my database of files whenever files are modified, created, or deleted.
#!/bin/sh
while inotifywait -r -e modify -e create -e delete /var/my/path/Documents; do
cd /var/scripts/
./sync.sh
done
This actually works quite well, except that during the 10 seconds it takes my sync script to run, the watch doesn't pick up any additional changes. There are instances where the sync has already looked at a directory and an additional change occurs that isn't detected by inotifywait because it hasn't re-established its watches.
Is there any way for inotifywait to trigger the script and still maintain the watch?
Use the -m option so that it runs continuously, instead of exiting after each event.
inotifywait -q -m -r -e modify -e create -e delete /var/my/path/Documents | \
while read event; do
cd /var/scripts
./sync.sh
done
This would actually have the opposite problem: if multiple changes occur while the sync script is running, it will run it again that many times. You might want to put something in the sync.sh script that prevents it from running again if it has run too recently.
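One possible (untested) guard along those lines in sync.sh: skip the run if the previous one finished less than 10 seconds ago (the stamp file path is just an example):
#!/bin/sh
stamp=/tmp/sync.stamp
now=$(date +%s)
last=$(cat "$stamp" 2>/dev/null || echo 0)
# Skip this run if the previous sync finished less than 10 seconds ago:
if [ $((now - last)) -lt 10 ]; then
    exit 0
fi
# ... the existing sync work goes here ...
date +%s > "$stamp"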

Rsync cronjob that will only run if rsync isn't already running

I have checked for a solution here but cannot seem to find one. I am dealing with a very slow WAN connection, about 300kb/sec. For my downloads I am using a remote box, and then I am downloading them to my house. I am trying to run a cronjob that will rsync two directories on my remote and local server every hour. I got everything working, but if there is a lot of data to transfer the rsyncs overlap and end up creating two instances of the same file, thus sending duplicate data.
I want to instead call a script that would run my rsync command but only if rsync isn't running?
The problem with creating a "lock" file, as suggested in a previous solution, is that the lock file might already exist if the script responsible for removing it terminates abnormally.
This could for example happen if the user terminates the rsync process, or due to a power outage. Instead one should use flock, which does not suffer from this problem.
As it happens flock is also easy to use, so the solution would simply look like this:
flock -n lock_file -c "rsync ..."
The command after the -c option is only executed if there is no other process holding a lock on lock_file. If the locking process terminates for any reason, the lock on lock_file is released. The -n option says that flock should be non-blocking, so if another process is locking the file, nothing will happen.
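For an hourly cron job, the flock approach could look something like this (paths and rsync arguments are placeholders):
# Run at the top of every hour; skip silently if the previous run still holds the lock:
0 * * * * flock -n /home/myhomedir/rsyncjob.lock -c "rsync -az remotebox:/downloads/ /home/myhomedir/downloads/" >> /home/myhomedir/rsyncjob.log 2>&1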
Via the script you can create a "lock" file. If the file exists, the cronjob should skip the run; else it should proceed. Once the script completes, it should delete the lock file.
if [ -e /home/myhomedir/rsyncjob.lock ]
then
echo "Rsync job already running...exiting"
exit
fi
touch /home/myhomedir/rsyncjob.lock
#your code in here
#delete lock file at end of your job
rm /home/myhomedir/rsyncjob.lock
To use the lock file example given by @User above, a trap should be used to ensure that the lock file is removed when the script exits for any reason.
if [ -e /home/myhomedir/rsyncjob.lock ]
then
echo "Rsync job already running...exiting"
exit
fi
touch /home/myhomedir/rsyncjob.lock
#delete lock file at end of your job
trap 'rm /home/myhomedir/rsyncjob.lock' EXIT
#your code in here
This way the lock file will be removed even if the script exits before the end of the script.
A simple solution without using a lock file is to just do this:
pgrep rsync > /dev/null || rsync -avz ...
This will work as long as it is the only rsync job you run on the server, and you can then run this directly in cron, but you will need to redirect the output to a log file.
If you do run multiple rsync jobs, you can get pgrep to match against the full command line with a pattern like this:
pgrep -f rsync.*/data > /dev/null || rsync -avz --delete /data/ otherhost:/data/
pgrep -f rsync.*/www > /dev/null || rsync -avz --delete /var/www/ otherhost:/var/www/
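Put into an hourly crontab entry with the output redirection mentioned above, that might look like this (pattern, paths and log file are illustrative):
0 * * * * pgrep -f 'rsync.*/data' > /dev/null || rsync -avz --delete /data/ otherhost:/data/ >> /var/log/rsync-data.log 2>&1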
As a blunter alternative, you can kill any still-running rsync processes from the crontab entry before a new one starts.

Redirecting Output of Bash Child Scripts

I have a basic script that outputs various status messages. e.g.
~$ ./myscript.sh
0 of 100
1 of 100
2 of 100
...
I wanted to wrap this in a parent script, in order to run a sequence of child-scripts and send an email upon overall completion, e.g. topscript.sh
#!/bin/bash
START=$(date +%s)
/usr/local/bin/myscript.sh
/usr/local/bin/otherscript.sh
/usr/local/bin/anotherscript.sh
RET=$?
END=$(date +%s)
echo -e "Subject:Task Complete\nBegan on $START and finished at $END and exited with status $RET.\n" | sendmail -v group#mydomain.com
I'm running this like:
~$ topscript.sh >/var/log/topscript.log 2>&1
However, when I run tail -f /var/log/topscript.log to inspect the log I see nothing, even though running top shows myscript.sh is currently being executed, and therefore, presumably outputting status messages.
Why isn't the stdout/stderr from the child scripts being captured in the parent's log? How do I fix this?
EDIT: I'm also running these on a remote machine, connected via ssh using pseudo-tty allocation, e.g. ssh -t user#host. Could the pseudo-tty be interfering?
I just tried the following: I have three files t1.sh, t2.sh, and t3.sh, all with the following content:
#!/bin/bash
for((i=0;i<10;i++)) ; do
echo $i of 9
sleep 1
done
And a script called myscript.sh with the following content:
#!/bin/bash
./t1.sh
./t2.sh
./t3.sh
echo "All Done"
When I run ./myscript.sh > topscript.log 2>&1 and then in another terminal run tail -f topscript.log I see the lines being output just fine in the log file.
Perhaps the things being run in your subscripts use a large output buffer? I know when I've run Python scripts before, they have a pretty big output buffer, so you don't see any output for a while. Do you actually see the entire output in the email that gets sent out at the end of topscript.sh? Is it just that while the processes run you're not seeing the output?
Try:
unbuffer topscript.sh >/var/log/topscript.log 2>&1
Note that unbuffer is not always available as a standard binary on older Unix platforms and may require finding and installing a package that provides it.
I hope this helps.
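If unbuffer isn't available, stdbuf from GNU coreutils can serve a similar purpose for programs that buffer their output through C stdio; an untested equivalent would be:
# Force line-buffered stdout for the script and the children it spawns:
stdbuf -oL topscript.sh >/var/log/topscript.log 2>&1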
