KSH: constrain the number of threads that can run at one time - Linux

I have a script that loops; each iteration invokes a process that runs in the background, like below:
xn_run_process.sh
...
for each in `ls ${INPUT_DIR}/MDX*.txt`
do
  java -Xms256m -Xmx1024m -cp ${CLASSPATH} com.wf.xn.etcc.Main -config=${CONFIG_FILE}
  ...
  for SCALE_PDF in `ls ${PROCESS_DIR}/*.pdf`
  do
    OUTPUT_AFP=${OUTPUT_DIR}/`basename ${SCALE_PDF}`
    OUTPUT_AFP=`print ${OUTPUT_AFP} | sed s/pdf/afp/g`
    ${PROJ_DIR}/myscript.sh -i ${SCALE_PDF} -o ${OUTPUT_AFP} &
    sleep 30
  done
done
When I wrote this, I assumed the sleep 30 would keep the number of concurrent instances of myscript.sh to about 5. In practice, however, things changed and up to 30 instances ran at once, each doing quite heavy processing. How do I constrain the number of concurrent processes to 5?

While this is possible in pure shell scripting, the easiest approach is to use a parallelization tool such as GNU parallel or GNU make. Makefile example:
SOURCES = ${SOME_LIST}
STAMPS = $(SOME_LIST:=.did-run-stamp)

all : $(STAMPS)

%.did-run-stamp : %
	/full/path/myscript.sh -f $<
	touch $@

and then calling make as make -j 5. (The recipe must touch the stamp file, or make will consider the target out of date on every run.)
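A self-contained version of the stamp idiom can be tried directly (the file names below are made up for the demo). make -j 2 runs at most two recipes at a time, and the touched stamps keep finished inputs from being reprocessed on a rerun:

```shell
#!/bin/bash
# Build a throwaway directory with three input files and a Makefile using
# the stamp idiom, then run make with a concurrency limit of 2.
dir=$(mktemp -d)
cd "$dir" || exit 1
touch a.txt b.txt c.txt
printf '%s\n' \
  'SOURCES = a.txt b.txt c.txt' \
  'STAMPS = $(SOURCES:=.did-run-stamp)' \
  'all: $(STAMPS)' \
  '%.did-run-stamp: %' \
  $'\t@echo processing $<' \
  $'\ttouch $@' > Makefile
make -s -j 2 all
ls *.did-run-stamp
```

Rerunning make here reports nothing to be done, because the stamps are newer than their inputs.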

Use GNU Parallel (adjust -j as you see fit; remove it to default to one job per CPU):
for each in `ls ${INPUT_DIR}/MDX*.txt`
do
  java -Xms256m -Xmx1024m -cp ${CLASSPATH} com.wf.xn.etcc.Main -config=${CONFIG_FILE}
  ...
  for SCALE_PDF in `ls ${PROCESS_DIR}/*.pdf`
  do
    OUTPUT_AFP=${OUTPUT_DIR}/`basename ${SCALE_PDF}`
    OUTPUT_AFP=`print ${OUTPUT_AFP} | sed s/pdf/afp/g`
    sem --id myid -j 5 ${PROJ_DIR}/myscript.sh -i ${SCALE_PDF} -o ${OUTPUT_AFP}
  done
done
sem --wait --id myid
sem is part of GNU Parallel.
This will keep 5 jobs running until there are only 5 jobs left; then it will allow your java command to run while the last 5 finish. sem --wait waits until those last 5 are finished, too.
Alternatively:
for each ...
java ...
...
ls ${PROCESS_DIR}/*.pdf |
parallel -j 5 ${PROJ_DIR}/myscript.sh -i {} -o ${OUTPUT_DIR}/{/.}.afp
done
This will run 5 jobs in parallel and only let the next java run once all the jobs are finished.
Alternatively you can use the queue trick described in GNU Parallel's man page: https://www.gnu.org/software/parallel/man.html#example__gnu_parallel_as_queue_system_batch_manager
echo >jobqueue; tail -f jobqueue | parallel -j5 &
for each ...
...
ls ${PROCESS_DIR}/*.pdf |
parallel echo ${PROJ_DIR}/myscript.sh -i {} -o ${OUTPUT_DIR}/{/.}.afp >> jobqueue
done
echo killall -TERM parallel >> jobqueue
wait
This will run java, then add the jobs to be run to a queue. After adding the jobs, the next java is run immediately. At all times up to 5 jobs from the queue will be running, until the queue is empty.
You can install GNU Parallel simply by:
wget http://git.savannah.gnu.org/cgit/parallel.git/plain/src/parallel
chmod 755 parallel
cp parallel sem
Watch the intro videos to learn more: https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1 and walk through the tutorial (man parallel_tutorial). Your command line will love you for it.

If you have ksh93, check whether JOBMAX is available:
JOBMAX
    This variable defines the maximum number of background
    jobs that can run at a time. When this limit is reached, the
    shell will wait for a job to complete before starting a new job.
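JOBMAX is specific to ksh93. If your shell lacks it, the same throttle can be approximated in plain bash by polling the job table. This is a sketch: the limit of 5 matches the question, but the sleep commands are made-up stand-ins for myscript.sh:

```shell
#!/bin/bash
MAXJOBS=5   # analogous to JOBMAX=5 in ksh93

for i in $(seq 1 12); do
  # block while MAXJOBS background jobs are still running
  while [ "$(jobs -pr | wc -l)" -ge "$MAXJOBS" ]; do
    sleep 0.2
  done
  sleep 1 &   # stand-in for: ${PROJ_DIR}/myscript.sh -i ... -o ... &
done
wait          # let the last batch finish
echo "all jobs done"
```

The polling loop never lets more than MAXJOBS stand-in jobs run at once; the final wait drains whatever is left.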

Related

Running a regression with shell script and make utility

I want to run a regression with a shell script which should start each test via make command. The following is a simple version of my script:
#!/bin/sh
testlist="testlist.txt"
while read line
do
  test_name=$(echo $line | awk '{print $1}')
  program_path=$(echo $line | awk '{print $2}')
  make sim TEST_NAME=$test_name PROGRAM_PATH=$program_path
done < "$testlist"
The problem with the above script is that when the make command starts a program, the script moves on to the next iteration without waiting for that program to complete, and continues reading the next line from the file.
Is there any option in make utility to make sure that it waits for the completion of the program? Maybe I'm missing something else.
This is the related part of Makefile:
sim:
	vsim -i -novopt \
	  -L $(QUESTA_HOME)/uvm-1.1d \
	  -L questa_mvc_lib \
	  -do "add wave top/AL_0/*;log -r /*" \
	  -G/path=$(PROGRAM_PATH) \
	  +UVM_TESTNAME=$(TEST_NAME) +UVM_VERBOSITY=UVM_MEDIUM -sv_seed random
As the OP stated in a comment, vsim forks a few other processes that are still running after vsim itself finishes, so he needs to wait until those processes are finished as well.
I emulated your vsim command with a script that forks some sleep processes:
#!/bin/bash
forksome() {
  sleep 3 &
  sleep 2 &
  sleep 5 &
}
echo "Forking some sleep"
forksome
I made a Makefile that demonstrates your problem with make normal.
Once you know which processes are forked, you can wait for them, as demonstrated with make new.
normal: sim
	ps -f

new: sim mywait
	ps -f

sim:
	./vsim

mywait:
	echo "Waiting for process(es):"
	while pgrep sleep; do sleep 1; done
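Note that pgrep sleep matches any sleep process on the machine, not just the ones your wrapper forked. If you can modify the forking wrapper, recording the child PIDs and waiting on exactly those is more robust; a sketch, where the sleeps stand in for vsim's forked children:

```shell
#!/bin/bash
# Collect the PIDs of the forked children so the wait targets only them,
# instead of pgrep-matching every 'sleep' on the system.
pids=()
sleep 1 & pids+=($!)
sleep 2 & pids+=($!)
sleep 1 & pids+=($!)
wait "${pids[@]}"   # blocks until every recorded PID has exited
echo "all forked processes finished"
```

This only works for direct children of the script; processes forked another level down still need a pgrep-style check.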

qsub Job using GNU parallel not running

I am trying to execute a qsub job on multiple nodes (2) with a PPN of 20 using GNU parallel; however, it shows an error.
#!/bin/bash
#PBS -l nodes=2:ppn=20
#PBS -l walltime=02:00:00
#PBS -N down
cd $PBS_O_WORKDIR
module load gnu-parallel
for cdr in /scratch/data/v/mt/Downscale/*; do
  (cp /scratch/data/v/mt/DWN_FILE_NEW/* $cdr/)
  (cd $cdr && parallel -j20 --sshloginfile $PBS_NODEFILE 'echo {} | ./vari_1st_imge' ::: *.DS0)
done
When I run the above code I get the following error. (Please note: all the paths have been checked, and the same code runs properly without qsub on a normal computer.)
$ ./down
parallel: Error: Cannot open echo {} | ./vari_1st_imge.
And with qsub down, no output is created at all.
I am using GNU parallel 20140622 (from parallel --version).
Please help me solve this problem.
First try adding --dryrun to parallel.
But my feeling is that $PBS_NODEFILE is not set for some reason, and that GNU Parallel therefore tries to read the command as the --sshloginfile argument.
To test this:
echo $PBS_NODEFILE
(cd $cdr && parallel --sshloginfile $PBS_NODEFILE -j20 'echo {} | ./vari_1st_imge' ::: *.DS0 )
If GNU Parallel now tries to open -j20, then it is clear that $PBS_NODEFILE is empty.
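A defensive check along these lines can make the failure explicit. This is a sketch: the local-only fallback is an assumption, not part of the original job script, and the echoed command stands in for the real parallel invocation:

```shell
#!/bin/bash
# If $PBS_NODEFILE is unset or unreadable, drop --sshloginfile and run on
# the local node only, instead of letting parallel misparse its arguments.
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "${PBS_NODEFILE:-}" ]; then
  SSHOPT="--sshloginfile $PBS_NODEFILE"
else
  echo "warning: PBS_NODEFILE not usable, running locally" >&2
  SSHOPT=""
fi
# The real invocation would then be:
#   parallel $SSHOPT -j20 'echo {} | ./vari_1st_imge' ::: *.DS0
echo "would run: parallel $SSHOPT -j20 ..."
```

Outside a PBS job (where PBS_NODEFILE is unset), this prints the warning and falls back cleanly.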

SLURM sbatch multiple parallel calls to executable

I have an executable that takes multiple options and multiple file inputs in order to run. The executable can be called with a variable number of cores to run.
E.g. executable -a -b -c -file fileA --file fileB ... --file fileZ --cores X
I'm trying to create an sbatch file that will enable me to make multiple calls to this executable with different inputs. Each call should be allocated to a different node (in parallel with the rest), using X cores. Parallelization at the core level is taken care of by the executable, while at the node level it is handled by SLURM.
I tried with ntasks and multiple sruns, but the first srun was called multiple times.
Another take was to rename the files and use a SLURM process or node number as the filename before the extension, but that's not really practical.
Any insight on this?
I always do these kinds of jobs with the help of a bash script that I submit with sbatch. The easiest approach is a loop in the sbatch script that spawns the different jobs and job steps under your executable with srun, specifying e.g. the corresponding node name in your partition with -w. You may also want to read up on SLURM array jobs, if that suits you better. Alternatively, you could store all parameter combinations in a file and loop over them in the script; have a look at the "array job" manual page.
Maybe the following script (which I just wrapped up) gives you a feeling for what I have in mind (I hope it's what you need). It's not tested, so don't just copy and paste it!
#!/bin/bash
parameter=(10 5 2)
node_names=(node1 node2 node3)

# let's run one job per node, each taking one parameter
for parameter in "${parameter[@]}"
do
  # assign parameter to node
  # script some if/else condition here to specify parameters
  # -w specifies the name of the node to use
  # -N specifies the number of nodes

  # assign the first node in the list to this job
  node=${node_names[0]}
  # delete the first node from the list
  unset node_names[0]
  # re-index the list
  node_names=("${node_names[@]}")

  JOBNAME="myjob$node-$parameter"
  srun -N1 -w$node -psomepartition -J$JOBNAME executable.sh model_parameter &
done
You will have the problem that you need to force your sbatch script to wait for the last job step. In this case the following additional while loop might help you.
# Wait for the last job step to complete
while true
do
  # wait for the last job to finish; use the state from sacct for that
  echo "waiting for last job to finish"
  sleep 10
  # sacct shows your jobs; -s R,PD restricts to running and pending steps
  sacct -s R,PD | grep "myjob"   # your job-name indicator
  # check the status code of grep (1 if nothing was found)
  if [ "$?" == "1" ]
  then
    echo "found no running jobs anymore"
    sacct -s R | grep "myjob"
    echo "stopping loop"
    break
  fi
done
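The array-job alternative mentioned above can be sketched like this (untested; the parameter values, job name, and array bounds are made up, and your cluster's partition settings will differ). SLURM starts one instance of the script per array index, so the loop and the node bookkeeping disappear:

```shell
#!/bin/bash
#SBATCH --array=0-2    # one array task per parameter
#SBATCH --nodes=1
#SBATCH -J myjob

# Each array task picks its own parameter by index.
params=(10 5 2)
srun executable.sh "${params[$SLURM_ARRAY_TASK_ID]}"
```

With sbatch script.sh, SLURM schedules the three tasks independently and the job finishes only when all of them have.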
I managed to find a possible solution, so I'm posting it for reference:
I declared as many tasks as there are calls to the executable, as well as the nodes and the desired number of CPUs per call.
Then a separate srun for each call, declaring the number of nodes and tasks at each call. All the sruns are put in the background with ampersands (&):
srun -n 1 -N 1 --exclusive executable -a1 -b1 -c1 -file fileA1 --file fileB1 ... --file fileZ1 --cores X1 &
srun -n 1 -N 1 --exclusive executable -a2 -b2 -c2 -file fileA2 --file fileB2 ... --file fileZ2 --cores X2 &
....
srun -n 1 -N 1 --exclusive executable -aN -bN -cN -file fileAN --file fileBN ... --file fileZN --cores XN &
wait
--Edit: After some tests (as I mentioned in a comment below), if the process of the last srun ends before the rest, it seems to end the whole job, leaving the rest unfinished.
--Edited based on the comment by Carles Fenoy.
Write a bash script to populate multiple xyz.slurm files and submit each of them using sbatch. The following script does a nested for loop to create 8 files, then iterates over them to replace a string in those files, and then submits them with sbatch. You might need to modify the script to suit your needs.
#!/usr/bin/env bash
#Path Where you want to create slurm files
slurmpath=~/Desktop/slurms
rm -rf $slurmpath
mkdir -p $slurmpath/sbatchop
mkdir -p /exports/home/schatterjee/reports
echo "Folder /slurms and /reports created"
declare -a threads=("1" "2" "4" "8")
declare -a chunks=("1000" "32000")
declare -a modes=("server" "client")
## now loop through the above array
for i in "${threads[@]}"
{
for j in "${chunks[@]}"
{
#following are the content of each slurm file
cat <<EOF >$slurmpath/net-$i-$j.slurm
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --output=$slurmpath/sbatchop/net-$i-$j.out
#SBATCH --wait-all-nodes=1
echo \$SLURM_JOB_NODELIST
cd /exports/home/schatterjee/cs553-pa1
srun ./MyNETBench-TCP placeholder1 $i $j
EOF
#Now schedule them
for m in "${modes[@]}"
{
for value in {1..5}
do
#Following command replaces placeholder1 with the value of m
sed -i -e 's/placeholder1/'"$m"'/g' $slurmpath/net-$i-$j.slurm
sbatch $slurmpath/net-$i-$j.slurm
done
}
}
}
You can also try this Python wrapper, which can execute your command on the files you provide.

Executing several bash scripts simultaneously from one script?

I want to make a bash script that will execute around 30 or so other scripts simultaneously; these 30 scripts all have wget commands iterating through some lists.
I thought of doing something with screen (send ctrl + shift + a + d) or sending the scripts to the background, but really I don't know what to do.
To summarize: one master script execution should trigger all the other 30 scripts to execute at the same time.
PS: I've seen the other questions, but I don't quite understand how they work, or they are a bit more than what I need (expecting a return value, etc.).
EDIT:
Small snippet of the script (this part is the one that executes with the config params I specified):
if [ $WP_RANGE_STOP -gt 0 ]; then
  #WP RANGE
  for (( count="$WP_RANGE_START"; count<"$WP_RANGE_STOP"+1; count=count+1 ))
  do
    if cat downloaded.txt | grep "$count" >/dev/null
    then
      echo "File already downloaded!"
    else
      echo $count >> downloaded.txt
      wget --keep-session-cookies --load-cookies=cookies.txt --referer=server.com http://server.com/wallpaper/$count
      cat $count | egrep -o "http://wallpapers.*(png|jpg|gif)" | wget --keep-session-cookies --load-cookies=cookies.txt --referer=http://server.com/wallpaper/$number -i -
      rm $count
    fi
  done
fi
Probably the most straightforward approach would be to use xargs -P or GNU parallel. Generate the different arguments for each child in the master script. For simplicity's sake, let's say you just want to download a bunch of different content at once. Either of
xargs -P 30 wget < urls_file
parallel -j 30 wget '{}' < urls_file
will spawn up to 30 simultaneous wget processes with different args from the given input. If you give more information about the scripts you want to run, I might be able to provide more specific examples.
parallel has some more sophisticated tuning options than xargs, such as the ability to automatically split jobs across cores or CPUs.
If you're just trying to run a bunch of heterogeneous different bash scripts in parallel, define each individual script in its own file, then make each file executable and pass it to parallel:
$ cat list_of_scripts
/path/to/script1 arg1 arg2
/path/to/script2 -o=5 --beer arg3
…
/path/to/scriptN
then
parallel -j 30 < list_of_scripts

How to (trivially) parallelize with the Linux shell by starting one task per Linux core?

Today's CPUs typically comprise several physical cores. These might even be multi-threaded, so that the Linux kernel sees quite a large number of logical cores and runs a scheduler instance for each of them. When multiple tasks run on a Linux system, the scheduler normally achieves a good distribution of the total workload across all cores (some of which may share the same physical core).
Now, say I have a large number of files to process with the same executable. I usually do this with the "find" command:
find <path> <option> <exec>
However, this starts just one task at a time and waits for its completion before starting the next one. Thus, only one core is in use at any given moment, leaving the majority of cores idle (if this find command is the only task running on the system). It would be much better to launch N tasks at the same time, where N is the number of cores seen by the Linux kernel.
Is there a command that would do that ?
Use find with the -print0 option and pipe it to xargs with the -0 option. xargs also accepts the -P option to specify the number of parallel processes; -P should be used in combination with -n or -L.
Read man xargs for more information.
An example command:
find . -print0 | xargs -0 -P4 -n4 grep searchstring
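A self-contained variant that sizes the worker pool to the number of cores the kernel reports (a sketch: the directory and file contents are fabricated for the demo, and nproc comes from GNU coreutils):

```shell
#!/bin/bash
# One grep worker per logical core; -n1 hands each worker a single file.
dir=$(mktemp -d)
printf 'searchstring here\n' > "$dir/match.txt"
printf 'nothing here\n'      > "$dir/other.txt"
# grep -l prints only the names of matching files; grep's nonzero exit on
# the non-matching file propagates through xargs, hence the || true
result=$(find "$dir" -type f -print0 | xargs -0 -P"$(nproc)" -n1 grep -l searchstring || true)
echo "$result"
rm -rf "$dir"
```

Only the file containing the search string is printed, however many workers run in parallel.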
If you have GNU Parallel http://www.gnu.org/software/parallel/ installed you can do this:
find | parallel do stuff {} --option_a\; do more stuff {}
You can install GNU Parallel simply by:
wget http://git.savannah.gnu.org/cgit/parallel.git/plain/src/parallel
chmod 755 parallel
cp parallel sem
Watch the intro videos for GNU Parallel to learn more:
https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1
GNU parallel or xargs -P is probably a better way to handle this, but you can also write a sort-of multitasking framework in bash. It's a little messy and unreliable, however, due to the lack of certain facilities.
#!/bin/bash
MAXJOBS=3
CJ=0
SJ=""

# strip the [ ] + - decoration from a job spec like "[2]-" to get the job number
gj() {
  echo ${1//[][+-]/}
}

endj() {
  trap "" sigchld
  ej=$(gj $(jobs | grep Done))
  jobs %$ej
  wait %$ej
  CJ=$(( $CJ - 1 ))
  if [ -n "$SJ" ]; then
    kill $SJ
    SJ=""
  fi
}

startj() {
  j=$*
  while [ $CJ -ge $MAXJOBS ]; do
    sleep 1000 &
    SJ=$!
    echo too many jobs running: $CJ
    echo waiting for sleeper job [$SJ]
    trap endj sigchld
    wait $SJ 2>/dev/null
  done
  CJ=$(( $CJ + 1 ))
  echo $CJ jobs running. starting: $j
  eval "$j &"
}

set -m

# test
startj sleep 2
startj sleep 10
startj sleep 1
startj sleep 1
startj sleep 1
startj sleep 1
startj sleep 1
startj sleep 1
startj sleep 2
startj sleep 10

wait
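On bash 4.3 or newer, the throttling part of such a framework is much simpler with the built-in wait -n, which blocks until any one background job exits. A sketch with made-up workloads standing in for real tasks:

```shell
#!/bin/bash
MAXJOBS=3
for i in $(seq 1 8); do
  # if the pool is full, wait for any single job to finish
  if [ "$(jobs -pr | wc -l)" -ge "$MAXJOBS" ]; then
    wait -n
  fi
  sleep 0.3 &   # stand-in for a real task
done
wait            # drain the remaining jobs
echo "queue drained"
```

This avoids the SIGCHLD traps and sleeper jobs entirely, at the cost of requiring a reasonably recent bash.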
