Is there any way to run more than one parallel job simultaneously using a single job script? - slurm

Is there any way to run more than one parallel job simultaneously using a single job script? I have written a script like the one below, but it does not process the four jobs simultaneously: only 12 of the 48 cores are in use at any time, and the four codes (from four different directories) run one after another.
#!/bin/sh
#SBATCH --job-name=my_job_name # Job name
#SBATCH --ntasks-per-node=48
#SBATCH --nodes=1
#SBATCH --time=24:00:00 # Time limit hrs:min:sec
#SBATCH -o cpu_srun_new.out
#SBATCH --partition=medium
module load compiler/intel/2019.5.281
cd a1
mpirun -np 12 ./a.out > output.txt
cd ../a2
mpirun -np 12 ./a.out > output.txt
cd ../a3
mpirun -np 12 ./a.out > output.txt
cd ../a4
mpirun -np 12 ./a.out > output.txt

Commands in sh (as in any other shell) are blocking, meaning that once you run a command, the shell waits for it to complete before moving on to the next one, unless you append an ampersand & at the end of the command.
Your script should look like this:
#!/bin/sh
#SBATCH --job-name=my_job_name # Job name
#SBATCH --ntasks-per-node=48
#SBATCH --nodes=1
#SBATCH --time=24:00:00 # Time limit hrs:min:sec
#SBATCH -o cpu_srun_new.out
#SBATCH --partition=medium
module load compiler/intel/2019.5.281
cd a1
mpirun -np 12 ./a.out > output1.txt &
cd ../a2
mpirun -np 12 ./a.out > output2.txt &
cd ../a3
mpirun -np 12 ./a.out > output3.txt &
cd ../a4
mpirun -np 12 ./a.out > output4.txt &
wait
Note the & at the end of the mpirun lines, and the addition of the wait command at the end of the script. That command is necessary to make sure the script does not end before the mpirun commands are completed.
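A minimal illustration of the difference, using sleep in place of the real commands:
# Without &, the commands run one after the other (about 4 seconds in total).
sleep 2
sleep 2
# With &, they run concurrently, and wait blocks until both have finished (about 2 seconds in total).
sleep 2 &
sleep 2 &
wait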

Related

linux slurm - separate .out files for tasks run in parallel on 1 node

I am running jobs in parallel on Linux using Slurm by requesting a node and running one task per CPU.
However, the output as specified joins both streams into a single .out file. I tried the %t flag on the expectation that it would separate the tasks, but it just logs everything in one output file with _0 appended (e.g. sample_output__XXX_XX_0.out).
Any advice on how best to generate a separate .out log per task would be much appreciated.
#!/bin/bash
#SBATCH --job-name=recon_all_06172021_1829
#SBATCH --output=/path/recon_all_06172021_1829_%A_%a_%t.out
#SBATCH --error=/path/recon_all_06172021_1829_%A_%a.err
#SBATCH --ntasks=2
#SBATCH --ntasks-per-node=2
#SBATCH --nodes=1
#SBATCH --cpus-per-task=1
#SBATCH --time=23:59:00
#! Always keep the following echo commands to monitor CPU, memory usage
echo "SLURM_MEM_PER_CPU: $SLURM_MEM_PER_CPU"
echo "SLURM_MEM_PER_NODE: $SLURM_MEM_PER_NODE"
echo "SLURM_JOB_NUM_NODES: $SLURM_JOB_NUM_NODES"
echo "SLURM_NNODES: $SLURM_NNODES"
echo "SLURM_NTASKS: $SLURM_NTASKS"
echo "SLURM_CPUS_PER_TASK: $SLURM_CPUS_PER_TASK"
echo "SLURM_JOB_CPUS_PER_NODE: $SLURM_JOB_CPUS_PER_NODE"
command 1 &
command 2
wait
You can redirect the standard output from the command itself, for example:
command 1 > file1 2>&1
command 2 > file2 2>&1
Not as neat as using the sbatch filename patterns, but it will separate the output from each command.
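Applied to the script above (command 1 and command 2 are the placeholders from the question), that would look like:
command 1 > file1 2>&1 &
command 2 > file2 2>&1
wait
The #SBATCH --output file then only receives whatever is not redirected, such as the echo lines at the top.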

slurm/sbatch doesn't work when option `-o` is specified

I'm trying to run the following script with sbatch on our cluster.
#!/bin/bash
#SBATCH -o /SCRATCH-BIRD/users/lindenbaum-p/work/NEXTFLOW/work/chunkaa/work/a4/6d0605f453add1d97d609839cfd318/command.log
#SBATCH --no-requeue
#SBATCH --partition=Bird
set -e
echo "Hello" 1>&2
sbatch displays a job-id on stdout, there is nothing listed in squeue and it looks like nothing was written/executed.
If the line #SBATCH -o /SCRATCH-BIRD/users/... is removed, then the script works.
The directory exists:
$ test -w /SCRATCH-BIRD/users/lindenbaum-p/work/NEXTFLOW/work/chunkaa/work/a4/6d0605f453add1d97d609839cfd318/ && echo OK
OK
could it be a problem with the filesystem ? how can I test this ?
OK, got it: the partition is visible from the login node but not from the cluster nodes.
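One way to check this kind of problem is to run the same writability test through srun, so that it executes on a compute node of the target partition instead of the login node (a quick sketch reusing the path and partition from the question):
srun --partition=Bird test -w /SCRATCH-BIRD/users/lindenbaum-p/work/NEXTFLOW/work/chunkaa/work/a4/6d0605f453add1d97d609839cfd318/ && echo OK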

SLURM job script for multiple nodes

I would like to request two nodes in the same cluster, and it is necessary that both nodes are allocated before the script begins.
In the slurm script, I was wondering if there is a way to launch job-A on a given node and the job-B on the second node with a small delay or simultaneously.
Do you have suggestions on how this could be possible? This is how my script is right now.
#!/bin/bash
#SBATCH --job-name="test"
#SBATCH -D .
#SBATCH --output=./logs_%j.out
#SBATCH --error=./logs_%j.err
#SBATCH --nodelist=nodes[19,23]
#SBATCH --time=120:30:00
#SBATCH --partition=AWESOME
#SBATCH --wait-all-nodes=1
#launched on Node 1
ifconfig > node19.txt
#Launched on Node2
ifconfig >> node23.txt
In other words, if I request two nodes, how do I run two different jobs on the two nodes simultaneously? Could it be that we deploy them as job steps, as described in the last part of the srun manual (MULTIPLE PROGRAM CONFIGURATION)? In that context, "-l" isn't defined.
I'm assuming that when you say job-A and job-B you are referring to the two commands in the script. I'm also assuming that the setup you show is working, but that it starts the jobs on the wrong nodes and serializes their execution (the requested resources are not fully clear to me, but if Slurm does not complain, then everything is OK). You should also be careful with the redirected output: if the first job opens its redirection after the second job, it will truncate the file and you will lose the second job's output.
For them to be started on the appropriate nodes, run the commands through srun:
#!/bin/bash
#SBATCH --job-name="test"
#SBATCH -D .
#SBATCH --output=./logs_%j.out
#SBATCH --error=./logs_%j.err
#SBATCH --nodelist=nodes[19,23]
#SBATCH --time=120:30:00
#SBATCH --partition=AWESOME
#SBATCH --wait-all-nodes=1
#launched on Node 1
srun --nodes=1 echo 'hello from node 1' > test.txt &
#Launched on Node2
srun --nodes=1 echo 'hello from node 2' >> test.txt &
wait
That did the job! The files ./com_19.bash and ./com_23.bash act as the executables:
#!/bin/bash
#SBATCH --job-name="test"
#SBATCH -D .
#SBATCH --output=./logs_%j.out
#SBATCH --error=./logs_%j.err
#SBATCH --nodelist=nodes[19,23]
#SBATCH --time=120:30:00
#SBATCH --partition=AWESOME
#SBATCH --wait-all-nodes=1
# Launch on node 1
srun -lN1 -n1 -r 1 ./com_19.bash &
# launch on node 2
srun -lN1 -r 0 ./com_23.bash &
sleep 1
squeue
squeue -s
wait
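Regarding the MULTIPLE PROGRAM CONFIGURATION mentioned in the question: srun can also launch both programs in a single step with --multi-prog, where a configuration file maps task ranks to commands, and -l (--label) simply prefixes every output line with its task number. A rough sketch, assuming a two-task allocation:
cat > multi.conf <<'EOF'
0 ./com_19.bash
1 ./com_23.bash
EOF
srun -l --ntasks=2 --multi-prog multi.conf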

Handling SLURM .out output

I am using sbatch to run scripts, and I want the output text to be written in a file from a certain point, i.e. I want to echo some text so the user can see, but after a certain command I want all output to be written in a file. Is there a way to do it?
If not, how can I disable entirely the output logging?
EDIT: Example:
#!/bin/bash
#SBATCH --partition analysis
#SBATCH --nodes 1
#SBATCH --ntasks-per-node 1
#SBATCH --exclusive
#SBATCH --time 14-0
#SBATCH -c1
#SBATCH --mem=400M
#SBATCH --job-name jupyter
module load jupyter
## get tunneling info
XDG_RUNTIME_DIR=""
ipnip=$(hostname -i)
echo "
Copy/Paste this in your local terminal to ssh tunnel with remote
-----------------------------------------------------------------
ssh -N -L 7905:$ipnip:7905 USER@HOST
-----------------------------------------------------------------
"
##UP UNTIL HERE ECHO TO TERMINAL
##FROM NOW ON, ECHO TO A FILE
## start an ipcluster instance and launch jupyter server
jupyter-notebook --no-browser --port=7905 --ip=$ipnip
As per my comment above, it's not possible to write to the terminal from an sbatch-submitted job.
You can do that with srun in the following way:
#!/bin/bash
srun --partition analysis --nodes 1 --ntasks-per-node 1 --exclusive --time 14-0 -c1 --mem=400M --job-name jupyter wrapper.sh
wrapper.sh:
#!/bin/bash
module load jupyter
## get tunneling info
XDG_RUNTIME_DIR=""
ipnip=$(hostname -i)
echo "
Copy/Paste this in your local terminal to ssh tunnel with remote
-----------------------------------------------------------------
ssh -N -L 7905:$ipnip:7905 USER@HOST
-----------------------------------------------------------------
"
##UP UNTIL HERE ECHO TO TERMINAL
##FROM NOW ON, ECHO TO A FILE
exec > $SLURM_JOBID.out 2>&1
## start an ipcluster instance and launch jupyter server
jupyter-notebook --no-browser --port=7905 --ip=$ipnip
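If interaction with the terminal is not actually required and the goal is only to silence the Slurm log, another option (a sketch, not part of the answer above) is to discard the default log with --output=/dev/null and redirect explicitly from the point of interest inside the sbatch script:
#!/bin/bash
#SBATCH --output=/dev/null
#SBATCH --error=/dev/null
module load jupyter
ipnip=$(hostname -i)
# nothing above this point is logged anywhere
exec > "$SLURM_JOB_ID.out" 2>&1
# from here on, all output goes to the per-job file
jupyter-notebook --no-browser --port=7905 --ip=$ipnip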

SLURM sbatch multiple parallel calls to executable

I have an executable that takes multiple options and multiple file inputs in order to run. The executable can be called with a variable number of cores to run.
E.g. executable -a -b -c -file fileA --file fileB ... --file fileZ --cores X
I'm trying to create an sbatch file that will enable me to have multiple calls of this executable with different inputs. Each call should be allocated to a different node (in parallel with the rest), using X cores. The parallelization at the core level is taken care of by the executable, while at the node level it is handled by SLURM.
I tried with ntasks and multiple sruns but the first srun was called multiple times.
Another take was to rename the files and use a SLURM process or node number as filename before the extension but it's not really practical.
Any insight on this?
I always do these kinds of jobs with the help of a bash script that I submit with sbatch. The easiest approach would be to have a loop in the sbatch script in which you spawn the different jobs as job steps with srun, specifying e.g. the corresponding node name in your partition with -w. You may also read up on the documentation of Slurm array jobs (if that fits you better). Alternatively, you could store all parameter combinations in a file, loop over them in the script, or have a look at the "array job" manual page.
Maybe the following script (I just wrapped it up) helps you get a feeling for what I have in mind (I hope it's what you need). It is not tested, so don't just copy and paste it!
#!/bin/bash
parameters=(10 5 2)
node_names=(node1 node2 node3)
# let's run one job per node, each one taking one parameter
for parameter in "${parameters[@]}"
do
  # assign the parameter to a node
  # script some if/else condition here to pick the parameters
  # -w specifies the name of the node to use
  # -N specifies the number of nodes
  # assign the first free node in the list to this job
  node=${node_names[0]}
  JOBNAME="myjob$node-$parameter"
  # delete the first node from the list
  unset 'node_names[0]'
  # re-index the list
  node_names=("${node_names[@]}")
  srun -N1 -w "$node" -p somepartition -J "$JOBNAME" executable.sh "$parameter" &
done
You will have the problem that you need to force your sbatch script to wait for the last job step. In this case the following additional while loop might help you.
# Wait for the last job step to complete
while true
do
  # wait for the last job to finish; use the state reported by sacct for that
  echo "waiting for last job to finish"
  sleep 10
  # sacct shows your jobs; -s R,PD restricts it to running and pending steps
  sacct -s R,PD | grep "myjob" # your job name indicator
  # check the status code of grep (1 if nothing was found)
  if [ "$?" == "1" ]
  then
    echo "found no running jobs anymore"
    sacct -s R | grep "myjob"
    echo "stopping loop"
    break
  fi
done
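As an alternative to the explicit loop, the array-job approach mentioned above can be sketched as follows (params.txt and executable.sh are hypothetical; each array task reads one line of parameters and Slurm schedules the tasks onto the available nodes):
#!/bin/bash
#SBATCH --array=1-8
#SBATCH --nodes=1
#SBATCH --ntasks=1
# pick the parameter line that corresponds to this array task
params=$(sed -n "${SLURM_ARRAY_TASK_ID}p" params.txt)
srun executable.sh $params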
I managed to find one possible solution, so I'm posting it for reference:
I declared as many tasks as there are calls to the executable, as well as the number of nodes and the desired number of CPUs per call.
Then I used a separate srun for each call, declaring the number of nodes and tasks for each. All the sruns are sent to the background with ampersands (&), followed by a final wait:
srun -n 1 -N 1 --exclusive executable -a1 -b1 -c1 -file fileA1 --file fileB1 ... --file fileZ1 --cores X1 &
srun -n 1 -N 1 --exclusive executable -a2 -b2 -c2 -file fileA2 --file fileB2 ... --file fileZ2 --cores X2 &
....
srun -n 1 -N 1 --exclusive executable -aN -bN -cN -file fileAN --file fileBN ... --file fileZN --cores XN &
wait
--Edit: After some tests (as I mentioned in a comment below), if the last srun is not backgrounded and its process ends before the rest, it seems to end the whole job, leaving the rest unfinished; hence the trailing & and the wait.
--edited based on the comment by Carles Fenoy
Write a bash script to populate multiple xyz.slurm files and submit each of them using sbatch. The following script uses a nested for loop to create 8 files, then iterates over them to replace a string in those files, and then submits them with sbatch. You might need to modify the script to suit your needs.
#!/usr/bin/env bash
#Path Where you want to create slurm files
slurmpath=~/Desktop/slurms
rm -rf $slurmpath
mkdir -p $slurmpath/sbatchop
mkdir -p /exports/home/schatterjee/reports
echo "Folder /slurms and /reports created"
declare -a threads=("1" "2" "4" "8")
declare -a chunks=("1000" "32000")
declare -a modes=("server" "client")
## now loop through the above array
for i in "${threads[#]}"
{
for j in "${chunks[#]}"
{
#following are the content of each slurm file
cat <<EOF >$slurmpath/net-$i-$j.slurm
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --output=$slurmpath/sbatchop/net-$i-$j.out
#SBATCH --wait-all-nodes=1
echo \$SLURM_JOB_NODELIST
cd /exports/home/schatterjee/cs553-pa1
srun ./MyNETBench-TCP placeholder1 $i $j
EOF
#Now schedule them
for m in "${modes[#]}"
{
for value in {1..5}
do
#Following command replaces placeholder1 with the value of m
sed -i -e 's/placeholder1/'"$m"'/g' $slurmpath/net-$i-$j.slurm
sbatch $slurmpath/net-$i-$j.slurm
done
}
}
}
You can also try this Python wrapper, which can execute your command on the files you provide.
