Shell script for running subprocesses two at a time - Linux

Let's assume there are a total of 10 subprocesses which I want my shell script to run; for simplicity, call these subprocesses (i.e. processes created within the shell script) x1 through x10. A normal shell script would have 10 lines, each calling ./xi. However, to maximize efficiency, I know my hardware allows two of the subprocesses to run at the same time. Therefore, at any point in time, two of these processes should be running: the moment one finishes, the next is launched. No order should be assumed for how they finish; any order is fine, since they are independent. Is there an elegant way of doing this in a shell script? Note that each of x1...x10 should run only once.

seq 10 | xargs -P2 -I{} ./x{}
seq 10 - outputs the numbers 1 through 10, one per line.
xargs - runs a command for each line of input.
-P2 - runs up to two processes at a time.
-I{} - replaces each {} in the command with the current input line.
./x{} - the command to run; {} becomes the number, so ./x1 through ./x10 are each launched exactly once.
Final answer: cat myshellscript.sh | xargs -L 1 -I CMD -P 2 bash -c CMD
With myshellscript.sh being a file like this:
./task-jsuqh
./task-siuww
./task-uqywh
./task-sdqaw
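If xargs is not available, a pure-bash sketch of the same idea is possible, assuming bash 4.3+ for wait -n and the executables ./x1 through ./x10 from the question:

#!/usr/bin/env bash
running=0
for i in $(seq 10); do
    ./x"$i" &                  # launch the next subprocess in the background
    running=$((running + 1))
    if [ "$running" -ge 2 ]; then
        wait -n                # block until any one background job finishes
        running=$((running - 1))
    fi
done
wait                           # wait for whatever is still running

wait -n returns as soon as any single background job exits, which is what keeps two jobs in flight at all times.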

Related

"watch" and tell how long it has been watching

I need to repeatedly execute a command (e.g. ls) on a linux OS and I also would like to know how long it has been running since I started watching it.
I know I can use watch to execute the command. My question is how I can show when the "watch" was started? Or how long it has been watching?
It is not supported out of the box, but you can hack around it by supplying a more complex command.
watch bash -c '"echo $(($(date +%s) - '$(date +%s)'))s; echo ---; ls"'
Instead of directly watching the command you want (ls in this toy example), watch a bash instance, because it can parse a command line and do more. We tell this bash to execute three commands: the first calculates the difference between the seconds since the epoch at bash invocation (i.e. now, on each refresh) and the seconds since the epoch at watch invocation, the second prints a separator line for prettiness, and the third executes the desired command.
Showing when the watch started follows the same idea, but is simpler:
watch bash -c "'echo $(date); echo ---; ls'"
Note that the placement of the quotes is not arbitrary. The last command could also have been written similarly to the one above:
watch bash -c '"echo '$(date)'; echo ---; ls"'
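The same elapsed-time trick can also be written by capturing the start time in a shell variable first, which some may find easier to read (a sketch; ls again stands in for the real command):

start=$(date +%s)
# $start is expanded now; the escaped \$(...) parts run on every refresh
watch -n 2 "echo elapsed: \$(( \$(date +%s) - $start ))s; echo ---; ls"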

Run "dummy" background command with specific text

I'm looking for a bash command I can run in the background that will sleep for a bit (60 seconds), and the command will contain a specific text string I can grep out of a ps command.
I can't release a "dummy" script I'm afraid, so it needs to be a one line command.
I tried
echo "textneeded">/dev/null && sleep 60 &
But of course the only text I can grep for is the sleep, as the echo is over in a flash.
(The reasoning for this is it's for putting another script in "test" mode so it doesn't create child processes, but other functionality that ensures there are none of these processes running will still find something, and therefore wait. The other functionality isn't in a bash script.)
I had to do this to test a process killing script. You can use perl to set the process name.
perl -e '$0="textneeded"; sleep 60' &
Original props go to this guy
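If perl happens not to be installed, bash's own exec -a option can set argv[0] in a similar way (a sketch; ps will show the chosen name, but pgrep without -f may still report the process as sleep):

bash -c 'exec -a textneeded sleep 60' &   # argv[0] becomes "textneeded" in ps output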

Linux - Execute shell scripts simultaneously in the background and know when they're done

I'm using rsync to transfer files from a server to another server (both owned by me), my only problem is that these files are over 50GB and I got a ton of them to transfer (Over 200 of them).
Now I could just open multiple tabs and run rsync or add the "&" at the end of the script to execute it in the background.
So my question is: how can I execute this command in the background and, when it's done transferring, have a message shown on the terminal window that executed the script?
(rsync -av --progress [FOLDER_NAME] [DESTINATION]:[PATH] &) && echo 'Finished'
I know that's completely wrong, but I need to use & to run it in the background and && to run echo after rsync finishes.
Besides the screen-based solution, you could also use the xargs tool.
echo '/srcpath1 host1 /dstpath1
/srcpath2 host2 /dstpath2
/srcpath3 host3 /dstpath3'| \
xargs -P 5 --max-lines 1 bash -c 'rsync -av --progress "$1" "$2":"$3"' bash
xargs reads its input from stdin and executes a command for every word or line of it; in this case, for every line.
What makes it very good: it can run its child processes in parallel. In this configuration, xargs keeps up to 5 child processes running at once; this number can be 1 or, in principle, unlimited.
xargs exits once all of its children have finished, and it handles Ctrl-C and child management gracefully and fault-tolerantly.
Instead of the echo, the input of xargs can also come from a file, from a previous command in the pipeline, or from a for or while loop.
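For example, with the same three-column lines stored in a file (transfers.txt is a hypothetical name), the pipeline becomes:

xargs -P 5 --max-lines 1 bash -c 'rsync -av --progress "$1" "$2":"$3"' bash < transfers.txt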
You could use GNU screen for that; screen can monitor output for silence and for activity. An additional benefit: you can close the terminal and reattach to the screen session later. Even better, if you run screen on the server, you can shut down or reboot your own machine and the processes inside screen will still be executing.
Well, to answer your specific question, your invocation:
(rsync ... &) && echo 'Finished'
creates a subshell - the ( ... ) bit - in which rsync is run in the background, which means the subshell will exit as soon as it has started rsync, not after rsync finishes. The && echo ... part then notices that the subshell has exited successfully and does its thing, which is not what you want, because rsync is most likely still running.
To accomplish what you want, you need to do this:
(rsync ... && echo 'Finished') &
That will put the subshell itself in the background, and the subshell will run rsync and then echo. If you need to wait for that subshell to finish at some point later in your script, simply insert a wait at the appropriate point.
You could also structure it this way:
rsync ... &
# other stuff to do while rsync runs
wait
echo 'Finished'
Which is "better" is really just a matter of preference. There's one minor difference: && will run echo only if rsync doesn't report an error exit code, and replacing && with ; would make the two patterns more equivalent. The second method also makes the echo synchronous with the rest of your script's output, so it doesn't show up in the middle of other output; it might be slightly preferable in that respect, but capturing the exit status of rsync would be more complicated if that were necessary...
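For the original use case of many transfers, the second pattern extends naturally to a loop (a sketch; the source directories and destination here are hypothetical placeholders):

for dir in /data/part-*; do
    ( rsync -av --progress "$dir" user@remote:/backup/ && echo "Finished $dir" ) &
done
wait
echo 'All transfers finished'

Note that this starts all transfers at once; the xargs -P approach above is preferable if the number of parallel transfers needs to be limited.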

Why 'wait' doesn't wait for detached jobs

I followed this blog entry to parallelize sort by splitting a large file, sorting and merging.
The steps are:
split -l5000000 data.tsv '_tmp'
ls -1 _tmp* | while read FILE; do sort $FILE -o $FILE & done
sort -m _tmp* -o data.tsv.sorted
Between step 2 and 3, one must wait until the sorting step has finished.
I assumed that wait without any arguments would be the right thing, since according to the man page, if wait is called without arguments all currently active child processes are waited for.
However, when I try this in the shell (i.e. executing steps 1 and 2, and then wait), wait returns immediately, although top shows the sort processes are still running.
Ultimately I want to increase the speed of a script with this, so it's not a one-time thing I could do manually in the shell.
I know sort has a --parallel option since version 8, however on the cluster I am running this, an older version is installed, and I am also curious about how to solve this issue.
Here's a simple test case reproducing your problem:
true | { sleep 10 & }
wait
echo "This echos immediately"
The problem is that the pipe creates a subshell, and the forked processes are part of that subshell. The solution is to wait in that subshell instead of your main parent shell:
true | { sleep 10 & wait; }
echo "This waits"
Translated back into your code, this means:
ls -1 _tmp* | { while read FILE; do sort $FILE -o $FILE & done; wait; }
From the bash man page:
Each command in a pipeline is executed as a separate process (i.e., in a subshell).
So when you pipe to while, a subshell is created. Everything else in step 2 is executed within this subshell, (ie, all the background processes). The script then exits the while loop, leaving the subshell, and wait is executed in the parent shell, where there is nothing to wait for. You can avoid using the pipeline by using a process substitution:
while read FILE; do
sort $FILE -o $FILE &
done < <(ls -1 _tmp*)
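Because that loop runs in the parent shell, the backgrounded sort jobs belong to it, so the plain wait from the original plan now works as intended; the remaining steps are simply:

wait                              # waits for the backgrounded sort jobs
sort -m _tmp* -o data.tsv.sorted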

How to use a Linux bash function to "trigger two processes in parallel"

Please kindly consider the following sample code snippet:
function cogstart
{
nohup /home/michael/..../cogconfig.sh
nohup /home/michael/..../freshness_watch.sh
watch -n 15 -d 'tail -n 1 /home/michael/nohup.out'
}
Basically, freshness_watch.sh and the final watch command are supposed to be executed in parallel, i.e., the watch command shouldn't have to wait for its predecessor to finish. I am trying to work out a way, like using xterm, but since freshness_watch.sh is a script that lasts 15 minutes at most (due to my poor way of writing a file-monitoring script in Linux), I definitely want to trigger the last watch command while this script is still executing...
Any thoughts? Maybe in two separate/independent terminals?
Many thanks in advance for any hint/help.
As schemathings indirectly indicates, you probably want to append the '&' character (without the quotes) to the end of the line that runs freshness_watch.sh. I don't see any reason to use '&' on your final watch command, unless you add more commands after it.
'&' at the end of a Unix command line means 'run in the background'.
You might want to insert a sleep ${someNumOfSecs} after the call to freshness_watch.sh, to give it some time to have the CPU to itself.
Seeing as you mention xterm: do you know about the crontab facility, which lets you schedule a job to run any time you want, without the user having to log in? (Maybe this will help with your issue.) I like setting up jobs in crontab because you can then capture any trace information you care about, and any output to stderr, in a log/trace file.
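Putting the first suggestion together, a sketch of the function with the watcher script backgrounded (paths abbreviated exactly as in the question; the sleep duration is an arbitrary placeholder):

function cogstart
{
    nohup /home/michael/..../cogconfig.sh
    nohup /home/michael/..../freshness_watch.sh &   # backgrounded so watch can start immediately
    sleep 5                                         # optional: give the watcher a moment to start
    watch -n 15 -d 'tail -n 1 /home/michael/nohup.out'
}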
( nohup wc -l * || nohup ls -l * ) &
( nohup wc -l * ; nohup ls -l * ) &
I'm not clear on what you're attempting to do - the question seems self contradictory.
