wanted to know pid for outcome of script - linux

./abcd.sh #this script is responsible to run a java code for creating a zip file in /tmp/abcd/
Some times abcd.sh script takes 30 seconds to create a zip file and some times it takes 60 seconds to create.
Since i don't have permission to edit abcd.sh file, I wrote this code to get pid for ./abcd.sh, but dont know how to get the pid for its child process.
./abcd.sh &
pid=$!
wait $pid
This code is waiting till the ./abcd.sh executes, but its not waiting till the zip file is complete.
Is there any way where it can wait till the zip file creates? my idea is, if we get to know the pid of zip file creation we can use wait $zipfilepid, but not sure how to get the pid for zip file creation.
.abcd.sh
sleep 60
I know sleep is an alternative for this, but i don't want to wait even if the zip file is created.

If you know the name of the "zip" program, you can find it in the process list and then determine its process ID. Assuming the name of the process you're waiting on is simply "zip", you can use something like this:
PROCNAME="zip"
PROCRUN=`ps h -ef|egrep -e "$PROCNAME"|grep -v " $$ "`
PIDS=`echo "$PROCRUN"|tr -s " "|cut -d" " -f2`
wait $PIDS
This will work for multiple PIDs. you may need to alter the PROCNAME variable's value to better target a specific process, such as:
PROCNAME=" zip "
PROCNAME="zip name-of-file-to-zip"
PROCNAME="zipscriptname"
...etc. Hope this helps.

After doing some research, i came to a conclusion that this approach may not be possible because i dont have control over the java code that creating a zip file and shell script(abcd.sh) that is instantiating to run the java code.
The approach that i took is:
find out whether zip file is still writing/open using /usr/sbin/lsof /path/to/zip/abcd.zip, with this command i will $? 0 if file is writing/open, i get $? 1 if file completed/close.
put $? into while loop and if $? -eq 0 go for sleep else exit the program.
/usr/sbin/lsof /path/to/zip/abcd.zip
rc=$?
while [ $rc -eq 0 ]
do
echo "Zip file is still creating, sleeping for 3 seconds"
sleep 3
done
Please let me know if you have better approach.

Related

Start programs synchronized in bash

What I simply want to do is to do a wait for release lock.
I have for example 4 (because I have 4 core) identical script that works each on a part of a project each script looks like that:
#!/bin/bash
./prerenderscript $1
scriptsync step1 4
./renderscript $1
scriptsync step2 4
./postprod $1
when I run the main script that call the four script, I want each script to work individualy but at some point, I want to have each script waiting for each other because the next part need all data from the first part.
For now I used some logic like the number of file or a file that get created for each process and their existance getting tested with other one.
I also got the idea to use a makefile and to have
prerender%: source
./prerender $#
renderscript%: prerender1 prerender2 prerender3 prerender4
./renderscript $#
postprod: renderscript1 renderscript2 renderscript3 renderscript4
./postprod $#
But actually the process is simplified here the script is more complex and for each step the thread need to keep his variables.
Is there anyway to get the script in sync instead of the placeholder command scriptsync.
To achieve this in Bash, one way to do it is using inter-process communication to cause a task to wait for the previous one to finish. Here is an example.
#!/bin/bash
# $1 is received to allow for an example command, not required for the mechanism suggested
task_a()
{
# Do some work
sleep $1 # This is just a dummy command as an example
echo "Task A/$1 completed" >&2
# Send status to stdin, telling next task to proceed
echo "OK"
}
task_b()
{
IFS= read status ; [[ $status = OK ]] || return 1
# Do some work
sleep $1 # This is just a dummy command as an example
echo "Task B/$1 completed" >&2
}
task_a 2 | task_b 2 &
task_a 1 | task_b 1 &
wait
You will notice that the read could be anywhere in task B, so you could do some work, then wait (read) for the other task, then continue. You could have many signals sent by task A to task B, and several corresponding read statements.
As shown in the example, you can launch several pipelines in parallel.
One limit of this approach is that a pipeline establishes a communication channel between one writer and one reader. If a task needs to wait for signals from several tasks, you would need FIFOs to allow the task with dependencies to read from multiple sources.

Bash output happening after prompt, not before, meaning I have to manually press enter

I am having a problem getting bash to do exactly what I want, it's not a major issue, but annoying.
1.) I have a third party software I run that produces some output as stderr. Some of it is useful, some of it is regularly stuff I don't care about and I don't want this dumped to screen, however I do want the useful parts of the stderr dumped to screen. I figured the best way to achieve this was to pass stderr to a function, then use conditions in that function to either show the stderr or not.
2.) This works fine. However the solution I have implemented dumped out my errors at the right time, but then returns a bash prompt and I want to summarise the status of the errors at the end of the function, but echo-ing here prints the text after the prompt meaning that I have to press enter to get back to a clean prompt. It shall become clear with the example below.
My error stream generator:
./TestErrorStream.sh
#!/bin/bash
echo "test1" >&2
My function to process this:
./Function.sh
#!/bin/bash
function ProcessErrors()
{
while read data;
do
echo Line was:"$data"
done
sleep 5 # This is used simply to simulate the processing work I'm doing on the errors.
echo "Completed"
}
I source the Function.sh file to make ProcessErrors() available, then I run:
2> >(ProcessErrors) ./TestErrorStream.sh
I expect (and want) to get:
user#user-desktop:~/path$ 2> >(ProcessErrors) ./TestErrorStream.sh
Line was:test1
Completed
user#user-desktop:~/path$
However what I really get is:
user#user-desktop:~/path$ 2> >(ProcessErrors) ./TestErrorStream.sh
Line was:test1
user#user-desktop:~/path$ Completed
And no clean prompt. Of course the prompt is there, but "Completed" is being printed after the prompt, I want to printed before, and then a clean prompt to appear.
NOTE: This is a minimum working example, and it's contrived. While other solutions to my error stream problem are welcome I also want to understand how to make bash run this script the way I want it to.
Thanks for your help
Joey
Your problem is that the while loop stay stick to stdin until the program exits.
The release of stdin occurs at the end of the "TestErrorStream.sh", so your prompt is almost immediately available compared to what remains to process in the function.
I suggest you wrap the command inside a script so you'll be able to handle the time you want before your prompt is back (I suggest 1sec more than the suspected time needed for the function to process the remaining lines of codes)
I successfully managed to do this like that :
./Functions.sh
#!/bin/bash
function ProcessErrors()
{
while read data;
do
echo Line was:"$data"
done
sleep 5 # simulate required time to process end of function (after TestErrorStream.sh is over and stdin is released)
echo "Completed"
}
./TestErrorStream.sh
#!/bin/bash
echo "first"
echo "firsterr" >&2
sleep 20 # any number here
./WrapTestErrorStream.sh
#!/bin/bash
source ./Functions.sh
2> >(ProcessErrors) ./TestErrorStream.sh
sleep 6 # <= this one is important
With the above you'll get a nice "Completed" before your prompt after 26 seconds of processing. (Works fine with or without the additional "time" command)
user#host:~/path$ time ./WrapTestErrorStream.sh
first
Line was:firsterr
Completed
real 0m26.014s
user 0m0.000s
sys 0m0.000s
user#host:~/path$
Note: the process substitution ">(ProcessErrors)" is a subprocess of the script "./TestErrorStream.sh". So when the script ends, the subprocess is no more tied to it nor to the wrapper. That's why we need that final "sleep 6"
#!/bin/bash
function ProcessErrors {
while read data; do
echo Line was:"$data"
done
sleep 5
echo "Completed"
}
# Open subprocess
exec 60> >(ProcessErrors)
P=$!
# Do the work
2>&60 ./TestErrorStream.sh
# Close connection or else subprocess would keep on reading
exec 60>&-
# Wait for process to exit (wait "$P" doesn't work). There are many ways
# to do this too like checking `/proc`. I prefer the `kill` method as
# it's more explicit. We'd never know if /proc updates itself quickly
# among all systems. And using an external tool is also a big NO.
while kill -s 0 "$P" &>/dev/null; do
sleep 1s
done
Off topic side-note: I'd love to see how posturing bash veterans/authors try to own this. Or perhaps they already did way way back from seeing this.

Need an in-depth explanation of how to use flock in Linux shell scripting

I am working on a tiny Raspberry Pi cluster (4 pis). I have 3 Raspberry Pi nodes that will be leaving a message in a message.txt file on the head Pi. The head Pi will be in a loop checking the message.txt file to see if it has any lines. When it does I want to lock the file and then extract the info I need. The problem I am having is that I need to do multiple commands. The only ways I have found that allows multiple commands look like this...
(
flock -s 200
# ... commands executed under lock ...
) 200>/var/lock/mylockfile
The problem with this way is that it uses a sub shell. The problem with that is that I have "job" files labeled job_1 job_2 etc..... that I want to be able to use a counter with. If I place the increment of the counter in the subshell it will be considered only in the scope of the subshell. If I pull the incrementation out there is a chance that another pi will add an entry before I increment the counter and lock the file.
I have heard talk that there is a way to lock the file and run multiple commands and flow control and then unlock it all using flock. I have not seen any good examples though.
Here is my current code.
# Now go into loop to send out jobs as pis ask for more work
while [ $jobsLeftCount -gt 0 ]
do
echo "launchJobs.sh: About to check msg file"
msgLines=$(wc -l < $msgLocation)
if [ $msgLines ]; then
#FIND WAY TO LOCK FILE AND DO THAT HERE
echo "launchJobs.sh: Messages found. Locking message file to read contents"
(
flock -e 350
echo "Message Received"
while read line; do
#rename file to be sent to node "job"
mv $jobLocation$jobName$jobsLeftCount /home/pi/algo2/Jobs/job
#transfer new job to each script that left a message
scp /home/pi/algo2/Jobs/job pi#192.168.0.$line:/home/pi/algo2/Jobs/
jobsLeftCount=$jobsLeftCount-1;
echo $line
done < $msgLocation
#clear msg file
>$msgLocation
#UNLOCK MESG FILE HERE
) 350>>$msgLocation
echo "Head node has $jobsLeftCount remaining"
fi
#jobsLeftCount=$jobsLeftCount-1;
#echo "here is $jobsLeftCount file"
done
If the sub-shell environment is not acceptable, use braces in place of parentheses to group the commands:
{
flock -s 200
# ... commands executed under lock ...
} 200>/var/lock/mylockfile
This runs the commands executed under lock in a new I/O context, but does not start a sub-shell. Within the braces, all the commands executed will have file descriptor 200 open to the locked lock file.

bash script to watch a folder

I have the following situation:
There is a windows folder that has been mounted on a Linux machine. There could be multiple folders (setup before hand)
in this windows mount. I have to do something (preferably a script to start with) to watch these folders.
These are the steps:
Watch for any incoming file(s). Make sure they are transferred completely.
Move it to another folder.
I do not have any control over the file transfer program on the windows machine. It is a secure FTP I believe.
So I cannot ask that process to send me a trailer file to ensure the completion of file transfer.
I have written a bash script. I would like to know about any potential pitfalls with this approach. Reason is,
there is a possibility of mulitple copies of this script running for multiple directories like this.
At the moment, there could be upto 100 directories that may have to be monitored.
Following is the script. I'm sorry for pasting a very long one here. Please take your time to review it and
comment / criticize it. :-)
It takes 3 parameters, the folder that has to be watched, the folder where the file has to be moved,
and a time interval, which has been explained below.
I'm sorry there seems to be a problem with the alignment. Markdown doesn't seem to like it. I tried to organize it properly, but not able to do so.
Linux servername 2.6.9-42.ELsmp #1 SMP Wed Jul 12 23:27:17 EDT 2006 i686 i686 i386
GNU/Linux
#!/bin/bash
log_this()
{
message="$1"
now=`date "+%D-%T"`
echo $$": "$now ": " $message
}
usage()
{
cat << EOF
Usage: $0 <Directory to be watched> <Directory to transfer> <time interval>
Time interval is the amount of time after which the modification time of a
file will be monitored.
EOF
`exit 1`
}
if [ $# -lt 2 ]
then
usage
fi
WATCH_DIR=$1
APP_DIR=$2
if [ ! -d "$WATCH_DIR" ]
then
log_this "FATAL: WATCH_DIR, $WATCH_DIR does not exist. Exiting"
exit 1
fi
if [ ! -d "$APP_DIR" ]
then
log_this "APP_DIR: $APP_DIR does not exist. Exiting"
exit 1
fi
# This needs to be set after considering the rate of file transfer.
# Represents the seconds elapsed after the last modification to the file.
# If not supplied as parameter, defaults to 3.
seconds_between_mods=$3
if ! [[ "$seconds_between_mods" =~ ^[0-9]+$ ]]; then
if [ ${#seconds_between_mods} -eq 0 ]; then
log_this "No value supplied for elapse time. Defaulting to 3."
seconds_between_mods=3
else
log_this "Invalid value provided for elapse time"
exit 1
fi
fi
log_this "Start Monitor."
while true
do
ls -1 $WATCH_DIR | while read file_name
do
log_this "Start Monitoring for $file_name"
# Refer only the modification with reference to the mount folder.
# If there is a diff in time between servers, we are in trouble.
token_file=$WATCH_DIR/foo.$$
current_time=`touch $token_file && stat -c "%Y" $token_file`
rm -f $token_file 2>/dev/null
log_this "Current Time: $current_time"
last_mod_time=`stat -c "%Y" $WATCH_DIR/$file_name`
elapsed_time=`expr $current_time - $last_mod_time`
log_this "Elapsed time ==> $elapsed_time"
if [ $elapsed_time -ge $seconds_between_mods ]
then
log_this "Moving $file_name to $APP_DIR"
# In case if there is no space left on the target mount, hide the file
# in the mount itself and remove the incomplete file from APP_DIR.
mv $WATCH_DIR/$file_name $APP_DIR
if [ $? -ne 0 ]
then
log_this "FATAL: mv failed!! Hiding $file_name"
rm $APP_DIR/$file_name
mv $WATCH_DIR/$file_name $WATCH_DIR/.$file_name
log_this "Removed $APP_DIR/$file_name. Look for $WATCH_DIR/.$file_name and submit later."
fi
log_this "End Monitoring for $file_name"
else
log_this "$file_name: Transfer seems to be in progress"
fi
done
log_this "Nothing more to monitor."
echo
sleep 5
done
This isn't going to work for any length of time. In production, you will have network problems and other errors which can leave a partial file in the upload directory. I also don't like the idea of a "trailer" file. The usual approach is to upload the file under a temporary name and then rename it after the upload completes.
This way, you just have to list the directory, filter the temporary names out and and if there is anything left, use it.
If you can't make this change, then ask your boss for a written permission to implement something which can lead to arbitrary data corruption. This is for two purposes: 1) To make them understand that this is a real problem and not something which you make up and 2) to protect yourself when it breaks ... because it will and guess who'll get all the blame?
I believe a much saner approach would be the use of a kernel-level filesystem notify item. Such as inotify. Get also the tools here.
incron is an "inotify cron" system. It consists of a daemon and a table manipulator. You can use it a similar way as the regular cron. The difference is that the inotify cron handles filesystem events rather than time periods.
First make sure inotify-tools in installed.
Then use them like this:
logOfChanges="/tmp/changes.log.csv" # Set your file name here.
# Lock and load
inotifywait -mrcq $DIR > "$logOfChanges" & # monitor, recursively, output CSV, be quiet.
IN_PID=$$
# Do your stuff here
...
# Kill and analyze
kill $IN_PID
cat "$logOfChanges" | while read entry; do
# Split your CSV, but beware that file names may contain spaces too.
# Just look up how to parse CSV with bash. :)
path=...
event=...
... # Other stuff like time stamps
# Depending on the event…
case "$event" in
SOME_EVENT) myHandlingCode path ;;
...
*) myDefaultHandlingCode path ;;
done
Alternatively, using --format instead of -c on inotifywait would be an idea.
Just man inotifywait and man inotifywatch for more infos.
To be honest a python app set up to run at start-up will do this quickly and efficiently. Python has amazing OS support and its rather complete.
Running the script will likely work, but it will be troublesome to take care and manage. I take it you will run these as frequent cron jobs?
To get you off your feet here is a small app I wrote which takes a path and looks at the binary output of jpeg files. I never quite finished it, but it will get you started and to see the structure of python as well as some use of os..
I wouldnt spend to much time worrying about my code.
import time, os, sys
#analyze() takes in a path and moves into the output_files folder, to then analyze files
def analyze(path):
list_outputfiles = os.listdir(path + "/output_files")
print list_outputfiles
for i in range(len(list_outputfiles)):
#print list_outputfiles[i]
f = open(list_outputfiles[i], 'r')
f.readlines()
#txtmaker reads the media file and writes its binary contents to a text file.
def txtmaker(c_file):
print c_file
os.system("cat" + " " + c_file + ">" + " " + c_file +".txt")
os.system("mv *.txt output_files")
#parser() takes in the inputed path, reads and lists all files, creates a directory, then calls txtmaker.
def parser(path):
os.chdir(path)
os.mkdir(path + "/output_files", 0777)
list_files = os.listdir(path)
for i in range(len(list_files)):
if os.path.isdir(list_files[i]) == True:
print (list_files[i], "is a directory")
else:
txtmaker(list_files[i])
analyze(path)
def main():
path = raw_input("Enter the full path to the media: ")
parser(path)
if __name__ == '__main__':
main()

Multi-threaded BASH programming - generalized method?

Ok, I was running POV-Ray on all the demos, but POV's still single-threaded and wouldn't utilize more than one core. So, I started thinking about a solution in BASH.
I wrote a general function that takes a list of commands and runs them in the designated number of sub-shells. This actually works but I don't like the way it handles accessing the next command in a thread-safe multi-process way:
It takes, as an argument, a file with commands (1 per line),
To get the "next" command, each process ("thread") will:
Waits until it can create a lock file, with: ln $CMDFILE $LOCKFILE
Read the command from the file,
Modifies $CMDFILE by removing the first line,
Removes the $LOCKFILE.
Is there a cleaner way to do this? I couldn't get the sub-shells to read a single line from a FIFO correctly.
Incidentally, the point of this is to enhance what I can do on a BASH command line, and not to find non-bash solutions. I tend to perform a lot of complicated tasks from the command line and want another tool in the toolbox.
Meanwhile, here's the function that handles getting the next line from the file. As you can see, it modifies an on-disk file each time it reads/removes a line. That's what seems hackish, but I'm not coming up with anything better, since FIFO's didn't work w/o setvbuf() in bash.
#
# Get/remove the first line from FILE, using LOCK as a semaphore (with
# short sleep for collisions). Returns the text on standard output,
# returns zero on success, non-zero when file is empty.
#
parallel__nextLine()
{
local line rest file=$1 lock=$2
# Wait for lock...
until ln "${file}" "${lock}" 2>/dev/null
do sleep 1
[ -s "${file}" ] || return $?
done
# Open, read one "line" save "rest" back to the file:
exec 3<"$file"
read line <&3 ; rest=$(cat<&3)
exec 3<&-
# After last line, make sure file is empty:
( [ -z "$rest" ] || echo "$rest" ) > "${file}"
# Remove lock and 'return' the line read:
rm -f "${lock}"
[ -n "$line" ] && echo "$line"
}
#adjust these as required
args_per_proc=1 #1 is fine for long running tasks
procs_in_parallel=4
xargs -n$args_per_proc -P$procs_in_parallel povray < list
Note the nproc command coming soon to coreutils will auto determine
the number of available processing units which can then be passed to -P
If you need real thread safety, I would recommend to migrate to a better scripting system.
With python, for example, you can create real threads with safe synchronization using semaphores/queues.
sorry to bump this after so long, but I pieced together a fairly good solution for this IMO
It doesnt work perfectly, but it will limit the script to a certain number of child tasks running, and then wait for all the rest at the end.
#!/bin/bash
pids=()
thread() {
local this
while [ ${#} -gt 6 ]; do
this=${1}
wait "$this"
shift
done
pids=($1 $2 $3 $4 $5 $6)
}
for i in 1 2 3 4 5 6 7 8 9 10
do
sleep 5 &
pids=( ${pids[#]-} $(echo $!) )
thread ${pids[#]}
done
for pid in ${pids[#]}
do
wait "$pid"
done
it seems to work great for what I'm doing (handling parallel uploading of a bunch of files at once) and keeps it from breaking my server, while still making sure all the files get uploaded before it finishes the script
I believe you're actually forking processes here, and not threading. I would recommend looking for threading support in a different scripting language like perl, python, or ruby.

Resources