Slurm only using all CPUs on some software when using srun? - multithreading

I have a script defined like this:
#!/bin/sh
#SBATCH --nodes=1
#SBATCH --cpus-per-task=16
#SBATCH --mem 180000
./program1 --threads 16
./program2 --threads 16
I then submit my job with sbatch job.sh
The thing is that program1 uses all 16 cores/CPUs, but program2 only uses 1 (both are supposedly multi-threaded). If, however, I modify the script to look like this:
#!/bin/sh
#SBATCH --nodes=1
#SBATCH --cpus-per-task=16
#SBATCH --mem 180000
./program1 --threads 16
srun --mpi=openmpi ./program2 --threads 16
then program2 also uses all 16 cores. Why is it necessary to add that srun?
As extra information, program2's multithreading is implemented using std::async.
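One way to see what actually differs between the two launch methods is to print the CPU affinity and visible core count both ways; a minimal diagnostic sketch, assuming taskset (from util-linux) is available on the compute node:
#!/bin/sh
#SBATCH --nodes=1
#SBATCH --cpus-per-task=16
#SBATCH --mem 180000
# Affinity and core count seen by a process started directly by the batch shell
taskset -cp $$
nproc
# Affinity and core count seen by a process started as a job step via srun
srun --cpus-per-task=16 sh -c 'taskset -cp $$; nproc'
If the two printouts differ (e.g. 0-15 in one case and a single CPU in the other), the cluster's CPU-binding configuration for the batch step versus explicit job steps is the place to look.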

Related

Is there any way to run more than one parallel job simultaneously using a single job script?

I have written a script like this. However, it is not processing the four jobs simultaneously: only 12 cores out of 48 are in use, running a single job, and the four codes (from four different directories) run one after the other.
#!/bin/sh
#SBATCH --job-name=my_job_name # Job name
#SBATCH --ntasks-per-node=48
#SBATCH --nodes=1
#SBATCH --time=24:00:00 # Time limit hrs:min:sec
#SBATCH -o cpu_srun_new.out
#SBATCH --partition=medium
module load compiler/intel/2019.5.281
cd a1
mpirun -np 12 ./a.out > output.txt
cd ../a2
mpirun -np 12 ./a.out > output.txt
cd ../a3
mpirun -np 12 ./a.out > output.txt
cd ../a4
mpirun -np 12 ./a.out > output.txt
Commands in sh (as in any other shell) are blocking, meaning that once you run one, the shell waits for it to complete before moving on to the next command, unless you append an ampersand & at the end of the command.
Your script should look like this:
#!/bin/sh
#SBATCH --job-name=my_job_name # Job name
#SBATCH --ntasks-per-node=48
#SBATCH --nodes=1
#SBATCH --time=24:00:00 # Time limit hrs:min:sec
#SBATCH -o cpu_srun_new.out
#SBATCH --partition=medium
module load compiler/intel/2019.5.281
cd a1
mpirun -np 12 ./a.out > output1.txt &
cd ../a2
mpirun -np 12 ./a.out > output2.txt &
cd ../a3
mpirun -np 12 ./a.out > output3.txt &
cd ../a4
mpirun -np 12 ./a.out > output4.txt &
wait
Note the & at the end of the mpirun lines, and the addition of the wait command at the end of the script. That command is necessary to make sure the script does not end before the mpirun commands are completed.
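Depending on how MPI is built on the cluster, the same four runs can also be expressed as Slurm job steps, so that srun places each step on its own CPUs and accounts for them separately; a rough sketch, assuming the MPI library supports direct launch by srun:
cd a1
srun --exclusive -n 12 ./a.out > output1.txt &
cd ../a2
srun --exclusive -n 12 ./a.out > output2.txt &
cd ../a3
srun --exclusive -n 12 ./a.out > output3.txt &
cd ../a4
srun --exclusive -n 12 ./a.out > output4.txt &
wait
Here --exclusive asks each step for CPUs not used by the other steps; the exact option for this behaviour varies between Slurm versions.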

linux slurm - separate .out files for tasks run in parallel on 1 node

I am running jobs in parallel on Linux using Slurm by requesting a node and running one task per CPU.
However, the output as specified joins both streams into a single .out file. I tried the %t flag in the expectation that it would separate the tasks, but it just logs everything in the output file with _0 appended (e.g. sample_output__XXX_XX_0.out).
Any advice on how best to generate a separate .out log per task would be much appreciated.
#!/bin/bash
#SBATCH --job-name=recon_all_06172021_1829
#SBATCH --output=/path/recon_all_06172021_1829_%A_%a_%t.out
#SBATCH --error=/path/recon_all_06172021_1829_%A_%a.err
#SBATCH --ntasks=2
#SBATCH --ntasks-per-node=2
#SBATCH --nodes=1
#SBATCH --cpus-per-task=1
#SBATCH --time=23:59:00
#! Always keep the following echo commands to monitor CPU, memory usage
echo "SLURM_MEM_PER_CPU: $SLURM_MEM_PER_CPU"
echo "SLURM_MEM_PER_NODE: $SLURM_MEM_PER_NODE"
echo "SLURM_JOB_NUM_NODES: $SLURM_JOB_NUM_NODES"
echo "SLURM_NNODES: $SLURM_NNODES"
echo "SLURM_NTASKS: $SLURM_NTASKS"
echo "SLURM_CPUS_PER_TASK: $SLURM_CPUS_PER_TASK"
echo "SLURM_JOB_CPUS_PER_NODE: $SLURM_JOB_CPUS_PER_NODE"
command 1 &
command 2
wait
You can redirect the standard output from the command itself, for example:
command 1 > file1 2>&1
command 2 > file2 2>&1
Not as neat as using the sbatch filename patterns, but it will separate the output from each command.
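Alternatively, if each command is started as its own job step, srun itself can write per-step output files using the same filename-pattern syntax; a sketch, with command1 and command2 standing in for your two commands:
srun --ntasks=1 --output=recon_all_%j_step0.out command1 &
srun --ntasks=1 --output=recon_all_%j_step1.out command2 &
wait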

concurrent.futures.ProcessPoolExecutor does not work on Slurm

I run a program on a macOS system with concurrent.futures.ProcessPoolExecutor. It performs well, and I found there were 8 processes because my CPU has 8 cores. However, when I run it on Slurm it wastes a lot of time. My script is as follows; there are 18 tasks set because I want to create that many processes.
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=18
#SBATCH --cpus-per-task=4
#SBATCH --mem=8g
#SBATCH --tmp=5g
#SBATCH -t 80:00:00
#SBATCH --mail-type=ALL
#SBATCH --mail-user=meng0167#umn.edu
#SBATCH -p amdsmall
#SBATCH -e %j.err
#SBATCH -o %j.out
cd $SLURM_SUBMIT_DIR
module load python3
module load pyrosetta
python3 test_process.py
Please help me, I have been stuck on this for a week.
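As a first diagnostic it can help to print how many CPUs the batch step actually sees before starting Python; a rough sketch (switching from 18 tasks to a single task with 18 CPUs is only an assumption about the intent, not something stated in the question):
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --cpus-per-task=18   # assumption: one task with 18 CPUs for the 18 worker processes
#SBATCH --mem=8g
#SBATCH -t 80:00:00
#SBATCH -p amdsmall
#SBATCH -e %j.err
#SBATCH -o %j.out
cd "$SLURM_SUBMIT_DIR"
module load python3
module load pyrosetta
# Report how many CPUs this step can actually use, then start the script
echo "CPUs visible to this step: $(nproc)"
python3 test_process.py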

SLURM job script for multiple nodes

I would like to request two nodes in the same cluster, and it is necessary that both nodes be allocated before the script begins.
In the Slurm script, I was wondering if there is a way to launch job-A on a given node and job-B on the second node, either with a small delay or simultaneously.
Do you have suggestions on how this could be possible? This is how my script is right now.
#!/bin/bash
#SBATCH --job-name="test"
#SBATCH -D .
#SBATCH --output=./logs_%j.out
#SBATCH --error=./logs_%j.err
#SBATCH --nodelist=nodes[19,23]
#SBATCH --time=120:30:00
#SBATCH --partition=AWESOME
#SBATCH --wait-all-nodes=1
#launched on Node 1
ifconfig > node19.txt
#Launched on Node2
ifconfig >> node23.txt
In other words, if I request two nodes, how do I run two different jobs on the two nodes simultaneously? Could it be that we deploy them as job steps, as described in the last part of the srun manual (MULTIPLE PROGRAM CONFIGURATION)? In that context, "-l" isn't defined.
I'm assuming that when you say job-A and job-B you are referring to the two commands in the script. I'm also assuming that the setup you show us works, but without starting the jobs on the proper nodes and while serializing the execution (I have the feeling that the requested resources are not clear and some information is missing, but if SLURM does not complain, then everything is OK). You should also be careful about how you write the redirected output: if the first job opens the redirection after the second job, it will truncate the file and you will lose the second job's output.
For them to be started on the appropriate nodes, run the commands through srun:
#!/bin/bash
#SBATCH --job-name="test"
#SBATCH -D .
#SBATCH --output=./logs_%j.out
#SBATCH --error=./logs_%j.err
#SBATCH --nodelist=nodes[19,23]
#SBATCH --time=120:30:00
#SBATCH --partition=AWESOME
#SBATCH --wait-all-nodes=1
#launched on Node 1
srun --nodes=1 echo 'hello from node 1' > test.txt &
#Launched on Node2
srun --nodes=1 echo 'hello from node 2' >> test.txt &
That did the job! The files ./com_19.bash and ./com_23.bash are acting as the binaries.
#!/bin/bash
#SBATCH --job-name="test"
#SBATCH -D .
#SBATCH --output=./logs_%j.out
#SBATCH --error=./logs_%j.err
#SBATCH --nodelist=nodes[19,23]
#SBATCH --time=120:30:00
#SBATCH --partition=AWESOME
#SBATCH --wait-all-nodes=1
# Launch on node 1
srun -lN1 -n1 -r 1 ./com_19.bash &
# launch on node 2
srun -lN1 -r 0 ./com_23.bash &
sleep 1
squeue
squeue -s
wait

Setting sbatch environment variables with srun

In a blog post by Pierre Lindenbaum, srun is called within a Makefile to run jobs. I rely on this technique, but it makes no use of sbatch at all, so I am missing the chance to set sbatch-like environment variables. Where can I put the following so SLURM knows what to do?
#SBATCH -J testing
#SBATCH -A account
#SBATCH --time=1:00:00
#SBATCH --cpus-per-task=1
#SBATCH --begin=now
#SBATCH --mem=1G
#SBATCH -C sb
The srun command accepts nearly all of the sbatch parameters (with the notable exception of --array). In the referenced blog post, these arguments are set on the line:
.SHELLFLAGS= -N1 -n1 bash -c
so you would write
.SHELLFLAGS= -J testing -A account --time=1:00:00 --cpus-per-task=1 --begin=now --mem=1G -C sb bash -c
Note that if you specify --cpus-per-task=1 and keep the default of one task, it probably means that nodes are shared in your setup; in that case, --mem-per-cpu=1G makes more sense than --mem=1G.
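Putting it together, a minimal Makefile sketch in the spirit of the blog post (SHELL=srun is the trick from the post; the target and recipe are placeholder examples, and the recipe line must start with a tab; --mem-per-cpu is used here per the note above):
SHELL=srun
.SHELLFLAGS= -J testing -A account --time=1:00:00 --cpus-per-task=1 --begin=now --mem-per-cpu=1G -C sb bash -c

result.txt:
	./my_program > $@
With this in place, each recipe line is executed on a compute node as srun <flags> bash -c '<recipe line>', so the #SBATCH-style options are honored without ever calling sbatch.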
