How to run multiple shell scripts one by one in a single go using another shell script - linux

Dear experts, I have a small problem. I am trying to run, in one go, multiple shell scripts with the same extension (.sh) that are present inside a directory. So far I wrote a common script like the one below, but the problem is that it never finishes; it just keeps running, and I am unable to find out where the problem is. I hope some expert can look into it. My small script is below. If I do something like bash scriptone.sh, bash scriptkk.sh it works fine, but I do not want to do it manually. Thanks.
#!/bin/sh
for f in *.sh; do
bash "$f" -H
done

You are probably calling yourself recursively: the glob *.sh also matches the wrapper script itself, so it keeps starting new copies of itself. Skip it explicitly:
#!/bin/sh
for f in *.sh; do
    # Skip this wrapper script itself, otherwise it re-runs recursively
    if [ "$f" = "$(basename -- "$0")" ]; then
        continue
    else
        echo "running: $f"
        bash "$f" -H
    fi
done

You are running them sequentially.
Maybe one of the other scripts is still going?
Try starting them all in the background.
Simple version -
for f in *.sh; do bash "$f" -H & done
If there's any output this will be a mess though. Likewise, if you log out, they will be killed by the hangup signal (SIGHUP). Here's an elaborated version to handle such things -
for f in *.sh; do
nohup bash "$f" -H <&- > "$f.log" 2>&1 &
done
The & at the end puts each script into the background so that the loop can start the next one without waiting for the current $f to finish. nohup makes the script ignore SIGHUP, so if it takes a long time you can disconnect and come back later.
<&- closes stdin. > "$f.log" gives each script a log of its own so you can check them individually without the output getting all intermixed. 2>&1 just makes sure any error output goes into the same log as the stdout. Be aware that stderr is unbuffered while stdout is usually block-buffered when redirected to a file, so if an error seems to land in a weird place (too early) in the log, you can switch the redirections around:
nohup bash "$f" -H <&- 2>"$f.log" 1>&2 &
though note this only changes which redirection owns the log file; both streams still end up collated in the same log, and the buffering itself is unchanged.
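If the interleaving really matters, a more dependable option (an addition here, assuming GNU coreutils is available; it only affects programs that use C stdio buffering) is to force line buffering with stdbuf:
# stdbuf -oL line-buffers stdout, so its lines reach the log as they are
# produced instead of in large blocks, keeping them roughly in order
# with the unbuffered stderr.
nohup stdbuf -oL bash "$f" -H <&- > "$f.log" 2>&1 &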
Why do you give them all the same -H argument?
Since you mention below that you have 5k scripts to run, that may explain why it's taking so long... You might not want to pound the server with all of those at once. Let's elaborate that just a little more...
Minimally, I'd do something like this:
for f in *.sh; do
nohup nice "$f" -H <&- > "$f.log" 2>&1 &
sleep 0.1 # fractional seconds ok, longer pause means fewer per sec
done
This will start nine or ten per second until all of them have been processed, and nohup nice will run $f with a lower priority so normal system requests will be able to get ahead of it.
A better solution might be a tool like GNU parallel, which limits how many scripts run at once.
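For example, a sketch using GNU parallel (the limit of four concurrent jobs is an arbitrary choice, and this assumes you run it from the command line rather than from a wrapper that *.sh would match):
# Run at most 4 scripts at a time; each job logs to its own file so the
# output does not interleave.
parallel -j 4 'bash {} -H > {}.log 2>&1' ::: *.sh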

Related

When piping in BASH, is it possible to get the PID of the left command from within the right command?

The Problem
Given a BASH pipeline:
./a.sh | ./b.sh
The PID of ./a.sh being 10.
Is there way to find the PID of ./a.sh from within ./b.sh?
I.e. if there is, and if ./b.sh looks something like the below:
#!/bin/bash
...
echo $LEFT_PID
cat
Then the output of ./a.sh | ./b.sh would be:
10
... Followed by whatever else ./a.sh printed to stdout.
Background
I'm working on this bash script, named cachepoint, that I can place in a pipeline to speed things up.
E.g. cat big_data | sed 's/a/b/g' | uniq -c | cachepoint | sort -n
This is a purposefully simple example.
The pipeline may run slowly at first, but on subsequent runs, it will be quicker, as cachepoint starts doing the work.
The way I picture cachepoint working is that it would use the first few hundred lines of input, along with a list of commands before it, in order to form a hash ID for the previously cached data, thus breaking the stdin pipeline early on subsequent runs, resorting instead to printing the cached data. Cached data would get deleted every hour or so.
I.e. everything left of | cachepoint would continue running, perhaps to 1,000,000 lines, in normal circumstances, but on subsequent executions of cachepoint pipelines, everything left of | cachepoint would exit after maybe 100 lines, and cachepoint would simply print the millions of lines it has cached. For the hash of the pipe sources and pipe content, I need a way for cachepoint to read the PIDs of what came before it in the pipeline.
I use pipelines a lot for exploring data sets, and I often find myself piping to temporary files in order to bypass repeating the same costly pipeline more than once. This is messy, so I want cachepoint.
This Shellcheck-clean code should work for your b.sh program on any Linux system:
#! /bin/bash
shopt -s extglob
shopt -s nullglob
left_pid=
# Get the identifier for the pipe connected to the standard input of this
# process (e.g. 'pipe:[10294010]')
input_pipe_id=$(readlink "/proc/self/fd/0")
if [[ $input_pipe_id != pipe:* ]]; then
    echo 'ERROR: standard input is not a pipe' >&2
    exit 1
fi
# Find the process that has standard output connected to the same pipe
for stdout_path in /proc/+([[:digit:]])/fd/1; do
    output_pipe_id=$(readlink -- "$stdout_path")
    if [[ $output_pipe_id == "$input_pipe_id" ]]; then
        procpid=${stdout_path%/fd/*}
        left_pid=${procpid#/proc/}
        break
    fi
done
if [[ -z $left_pid ]]; then
    echo "ERROR: Failed to set 'left_pid'" >&2
    exit 1
fi
echo "$left_pid"
cat
It depends on the fact that, on Linux, for a process with id PID, /proc/PID/fd/0 is a symlink to whatever is connected to the standard input of the process, and /proc/PID/fd/1 is a symlink to whatever is connected to its standard output.
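To sanity-check it, a hypothetical a.sh that reports its own PID on stderr (so the pipe contents stay clean) might look like this:
#!/bin/bash
# Print our own PID on stderr for comparison, then send some data down
# the pipe to b.sh.
echo "a.sh is PID $$" >&2
printf '%s\n' one two three
Running ./a.sh | ./b.sh should then show the same number twice: once from a.sh on stderr and once printed by b.sh, followed by the three lines that cat passes through.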

Multiple scripts making rest calls interfering

So I am running into a problem with Unix scripts that use curl to make REST calls. I have one script that runs two other scripts inside of it.
cat example.sh
FILE="file1.txt"
RECIP="wilfred#blamagam.com"
rm -f $FILE
./script1.sh > $FILE
mail -s "subject" $RECIP < $FILE
RECIP="bob#blamagam.com"
rm -f $FILE
./script2.sh > $FILE
mail -s "subject" $RECIP < $FILE
exit 0
Each script makes REST calls to the same service. It is my understanding that script1.sh should completely finish before script2.sh is run, however that is not the case. In the logs for the REST service I see a call from the second script in the middle of the first one still executing. The second script then fails because of this (it does not get any data returned).
I am modifying this process so I am not the one who originally wrote it. I am not seeing any forked processes, or background processes at all and I have been banging my head against the wall.
I do know that script2.sh works. Whenever script1.sh takes under a minute, script2.sh works just fine, but more often than not script1.sh takes over a minute, causing the second script to fail.
This is run by a cron job, and the contents of the files are mailed out, so I can't just default to running them manually. Any suggestions for what to look into would be much appreciated!
EDIT: Here is a rough pseudo-code example
script1.sh
ITEMS=`/usr/bin/curl -m 10 -k -u userName:passWord -L "https://server/rest-service/rest?where=clause=value;clause2=value2&sel=field" 2>/dev/null | sed 's/<\/\?Attribute[^>]*>/\n/g' | grep -v '^<' | grep -v '^$' | sed 's/ //g'`
echo "\n Subject for these metrics"
echo "$ITEMS"
Both scripts have lots of entries like this. There are 2 or 3 for loops, but they are simple and I do not see any background processes being called. It's a large script, so I can only provide a snippet. Could the REST calls feeding into pipes be causing an issue?
Edit:
Just tested this on my system and it seems to work.
cat example.sh
FILE="file1.txt"
RECIP="wilfred#blamagam.com"
rm -f "$FILE"
(./script1.sh > "$FILE") &
procscript1=$!
wait "$procscript1"
mail -s "subject" "$RECIP" < "$FILE"
RECIP="bob#blamagam.com"
rm -f "$FILE"
(./script2.sh > "$FILE") &
procscript2=$!
wait "$procscript2"
mail -s "subject" "$RECIP" < "$FILE"
exit 0
Put each script execution in the background with &.
Get the process ID of each script execution with $!.
Use the wait command to block until that execution is done.
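If the underlying worry is that script1.sh sometimes fails or is still producing output when the mail goes out, a small guard (a sketch reusing the variables from the example above) makes that failure explicit instead of letting script2.sh run against bad data:
./script1.sh > "$FILE"
status=$?
# Bail out if script1.sh failed or wrote nothing, rather than mailing an
# empty file and moving straight on to script2.sh.
if [ "$status" -ne 0 ] || [ ! -s "$FILE" ]; then
    echo "script1.sh failed (exit $status) or produced no output" >&2
    exit 1
fi
mail -s "subject" "$RECIP" < "$FILE"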

Add seconds option to positional parameters in bash

Hello, I'm writing a bash script which has some positional parameters, but what's the best approach for adding an optional seconds parameter which will allow some function to run for x seconds?
This is what code looks like:
doaction()
{
(run a process)
}
while [ $# -gt 0 ]; do
case "$1" in
--action|-a)
doaction ;;
--seconds|-s)
???????? $2
shift ;;
esac
shift
done
After x seconds, kill the process.
Also, what happens when I run the script like
./script -s 10 -a
instead of
./script -a -s 10
Thanks
It looks like the timeout command is probably useful here. Note that timeout runs a command rather than a shell function, so the part you want to limit needs to be runnable as a separate script (as far as I can tell).
For your second question, the way you currently have things written, if you use ./script -a -s 10 then the result would be that the action would run before the delay is set. You can fix this by using a flag to indicate that the action should be executed, and you can ensure that the timeout is set (if at all) before the execution.
Here is my suggestion for a possible solution:
action=false
while [ $# -gt 0 ]; do
    case "$1" in
        --action|-a)
            action=true ;;
        --seconds|-s)
            time="$2"
            shift ;;
    esac
    shift
done
if "$action"; then
    # A duration of 0 disables the timeout (GNU coreutils), so this still
    # works when -s was not given.
    timeout "${time:-0}" /path/to/action.sh
else
    # do something else
    :
fi
Where /path/to/action.sh is the location of the script that you want to run for a specific amount of time. You can test that the script exits after the specified number of seconds by replacing it with a long-running command such as top, or anything else that runs indefinitely.
You can use "getopts" to solve your problem. You might find the information on this link to be useful for your scenario.

Why does part of a script executed by cron fail unless stderr is directed to /dev/null?

Here is a snippet from a script which I generally execute from cron:
if [ "$RESCAN_COMMAND" = "wipecache" ]; then
log "Linking cover art."
find $FLAC_DIR -name "*.jpg" | while read f; do c=`echo $f | sed -e 's/flac/mp3/g'`; ln -s "$f" "$c"; done
log "Done linking cover art"
fi
The script works perfectly when run from the command line. But when run by cron (as the same user) it fails somewhere in the find line. The "Done" message is not logged and the script does not continue beyond the if block.
The find line creates links from files like flac/Artist/Album/cover.jpg to mp3/Artist/Album/cover.jpg. There are a few hundred files to link. The command generates a lot of output to stderr, because most, if not all, of the links already exist.
On a hunch, I tried redirecting the stderr of the ln command to /dev/null:
find $FLAC_DIR -name "*.jpg" | while read f; do c=`echo $f | sed -e 's/flac/mp3/g'`; ln -s "$f" "$c" 2>/dev/null; done
With that change, the script executes successfully from cron (as well as from the command line).
I would be interested to understand why.
Could it be this bug report: https://bugs.launchpad.net/ubuntu/+source/cron/+bug/151231
It's probably producing too much output. This really isn't a bug, but a feature, as cron typically sends emails with its output. MTAs don't like messages with very many lines, so cron just quits. Maybe the silent quit is a bug, though.
You could also use ln -sf, which replaces pre-existing links instead of complaining about them, so those errors never occur in the first place.
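If you would rather keep those errors for debugging than throw them away, another option (a sketch; the log path is an assumption) is to redirect the loop's stderr to a file instead of /dev/null:
# Same link-creation loop, but collect ln's complaints in a log file so
# they neither reach cron's mail nor get lost entirely.
find "$FLAC_DIR" -name "*.jpg" | while read -r f; do
    c=$(echo "$f" | sed -e 's/flac/mp3/g')
    ln -s "$f" "$c"
done 2>> /var/log/coverart-link.log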

Delayed Monitoring in Bash

What's the best way to "wait" in a Bash script, until the result of a command contains a specific pattern?
I've written a simple script to repair a RAID array, and now I want the script to wait until the command cat /proc/mdstat reports that the rebuilding of the array is complete. I want to wait, because afterwards, I need to install Grub on the new device via sudo grub-install /dev/sd*
Something like
#! /bin/bash
# Adjust doneString to whatever /proc/mdstat actually reports when the
# rebuild has finished; the string below is only a placeholder.
doneString="RAIDFix Completed successfully"
until ${mdstat_done:-false} ; do
    if grep "${doneString:?}" /proc/mdstat > /dev/null ; then
        sudo grub-install /dev/sd*
        mdstat_done=true
    else
        sleep ${sleepSecs:-60}
    fi
done
Do you need an explanation?
I hope this helps.
(there are more succinct ways of doing some of this for just bash, but I tend to write in portable SH when doing things like this:)
weredone=0
while test $weredone = 0 ; do
    # this is not actually what you want to grep for, but you get the idea:
    grep complete /proc/mdstat > /dev/null 2>&1
    if test $? = 0 ; then
        weredone=1
    else
        sleep 5
    fi
done
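For the specific /proc/mdstat case, a more robust condition is the absence of a resync/recovery line rather than the presence of a completion message. A sketch (the /dev/sdb device is a placeholder for whichever disk was replaced):
#!/bin/sh
# Poll until /proc/mdstat no longer reports a rebuild in progress, then
# install GRUB on the replaced disk.
while grep -Eq 'resync|recovery' /proc/mdstat; do
    sleep 60
done
sudo grub-install /dev/sdb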
