Problems with OpenMP multithreading under Slurm

#SBATCH --time=8:30:00
#SBATCH -N 1
#SBATCH -c 24
#SBATCH --output=wf22N-%j.out
#SBATCH --error=wf22N-%j.err
#SBATCH --mail-type=all
#SBATCH --mail-user=
module load pre2019
cd $PWD
module load mkl/18.0.4 gcc/5.2.0 openmpi/gnu/3.1.4.4 blas/netlib/intel lapack/netlib/intel
export OMP_NUM_THREADS=24
ulimit -s unlimited
srun /nfs/home/xx/quip
I tried to run a multithreaded calculation with OpenMP and submitted the batch script above, but I get an error:
srun: error: Attempt to run a job step with pack group value of 1, but the job allocation has maximum value of 0
Is there any problem with the batch file? Thanks.
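For comparison, a common way to launch a single multithreaded (OpenMP) process under Slurm is to run exactly one task and hand it all allocated cores. This is only a sketch reusing the quip path from the script above; it does not by itself explain the pack-group message, which typically relates to heterogeneous ("pack") job allocations:
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1}   # follow the -c value instead of hard-coding 24
ulimit -s unlimited
# Launch exactly one task that owns all of the allocated cores
srun --ntasks=1 --cpus-per-task=$SLURM_CPUS_PER_TASK /nfs/home/xx/quip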

Related

How can I see the memory usage for each process or job?

With SLURM, I run a file on the cluster using the script below, and at the end of the run the output file gives me the processing time (real, user, sys).
I also need to know how much memory each process uses. Do you know which command I should add, and where? I need the memory usage of the current job in the SLURM output file, which I defined with #SBATCH at line 8.
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --mem=100G
#SBATCH --time=12:00:00
#SBATCH --error=slurm.%A_%a.err
#SBATCH --output=slurm.%A_%a.out # %A becomes the job ID, %a becomes the array index
#SBATCH --mail-type=END,FAIL
#SBATCH --mail-user=m#uni.de
module purge
module load hpc
if [ "$mdepth" == "" ]
then
mdepth=3
fi
echo "config:"$SLURM_ARRAY_JOB_ID","$SLURM_ARRAY_TASK_ID","$file","$solver
{ time ./ista --max-depth $mdepth --i $file | tee SomeFile.txt; } 2>&1
sacct --format=jobid,MaxRSS,MaxVMSize,start,end,CPUTimeRAW,NodeList
I can add this line at the end of the file, but it gives me the memory usage of previous jobs, not of the current job that has just finished.
sacct --format=jobid,MaxRSS,MaxVMSize,start,end,CPUTimeRAW,NodeList
Try:
program="./ista --max-depth $mdepth --i $file"
/usr/bin/time "--format" "\t%E real,\t%U user,\t%S,\t%K avg tot KB,\t%M max resident KB" ${program}
or the more comprehensive
/usr/bin/time "--verbose" ${program}
However, don't you want to have "time" outside the braces, to get a complete picture of the task, not just the "ista" execution?
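If the aim is the memory of the current job from inside the batch script itself, one option (a sketch, assuming job accounting is enabled on the cluster) is sstat, which reports on steps of the still-running job, together with sacct restricted to this job's ID:
# Peak memory of this job's batch step while the job is still running
sstat --format=JobID,MaxRSS,MaxVMSize -j ${SLURM_JOB_ID}.batch
# Or restrict sacct to the current job instead of listing recent jobs;
# note that the batch step's accounting may only be final after the job ends
sacct -j $SLURM_JOB_ID --format=jobid,MaxRSS,MaxVMSize,start,end,CPUTimeRAW,NodeList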

SLURM error - sbatch HS8_main.sbatch sbatch: error: Unable to open file HS8_main.sbatch

I am trying to submit an sbatch file to run a code on 200 cores on the system. My script is:
#!/bin/sh
#SBATCH --job-name=sm #Job name
#SBATCH --mail-type=ALL # Mail events (NONE, BEGIN, END, FAIL, ALL)
#SBATCH --mail-user=sankalpmathur#ufl.edu # Where to send mail
#SBATCH --mem-per-cpu=3gb # Per processor memory
#SBATCH --array=1-200
#SBATCH -t 199:00:00 # Walltime
#SBATCH -o output_%a.out # Name output file
#
pwd; hostname; date
module load stata
stata-mp -b do array_${SLURM_ARRAY_TASK_ID}.do
When I run the file I get this error
sbatch HS8_main.sbatch
sbatch: error: Unable to open file HS8_main.sbatch
I have run the same sbatch before and it ran fine. What could possibly be the reason for it to not run this time?
Thank you
That's the error one gets when the sbatch script isn't in the current directory, or the name is wrong. Are you sure HS8_main.sbatch is the name of your script, and it's in the same place you're running sbatch from?
Just try "pwd" to check the current working directory.
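For example, a quick check from the directory where sbatch is being run:
pwd                       # which directory am I actually in?
ls -l HS8_main.sbatch     # does the file exist here under exactly this name?
sbatch ./HS8_main.sbatch  # resubmit with an explicit path once it is found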

Slurm: Schedule Job Arrays to Minimal Number of Nodes

I am running Slurm 19.05.2 with Cloud nodes only. I specified
SelectType = select/cons_tres
SelectTypeParameters = CR_CORE_MEMORY,CR_CORE_DEFAULT_DIST_BLOCK
To make sure that a node is fully utilised before allocating a second node.
It seems to work well with jobs that have many tasks. If I have 8 nodes with 16 cores each and I submit a job with 8 tasks, each requiring 2 cores, the job will be scheduled on a single node.
For example the script:
#!/bin/bash
#
#SBATCH --job-name=batch
#SBATCH --output=o_batch.%A.%a.txt
#
#SBATCH --ntasks=8
#SBATCH --time=10:00
#SBATCH --cpus-per-task 2
#SBATCH --mem-per-cpu=100
srun hostname
will output
node-010000
node-010000
node-010000
node-010000
node-010000
node-010000
node-010000
node-010000
If I specify a job array with --array=1-8 (--ntasks=1), each job of the array is scheduled on a different node (even though a single node could satisfy the requirements of all of them):
#!/bin/bash
#
#SBATCH --job-name=array
#SBATCH --output=array.%A.%a.txt
#
#SBATCH --ntasks=1
#SBATCH --time=10:00
#SBATCH --mem-per-cpu=100
#SBATCH --array=1-8
srun hostname
will output
node-010000
node-010001
node-010002
node-010003
node-010004
node-010005
node-010006
node-010007
Is there a way of configuring Slurm to behave the same way with job arrays as it does with tasks?
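No configuration answer is given here, but a common workaround (a sketch, assuming the eight array tasks are independent single-core runs) is to replace the array with one multi-task job and launch each run as its own job step, so the packing behaviour of the first script applies:
#!/bin/bash
#SBATCH --job-name=packed
#SBATCH --output=o_packed.%j.txt
#SBATCH --ntasks=8
#SBATCH --time=10:00
#SBATCH --mem-per-cpu=100
# One job step per former array task; at step level --exclusive means
# "give this step its own CPUs", not whole-node exclusivity
# (newer Slurm releases use --exact for this).
for i in $(seq 1 8); do
    srun --exclusive -n1 -N1 --mem-per-cpu=100 hostname &
done
wait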

Why does Slurm assign more tasks than I asked for when I "sbatch" multiple jobs with a .sh file?

I submit some cluster-mode Spark jobs, which run just fine when I submit them one by one with the sbatch specs below.
#!/bin/bash -l
#SBATCH -J Spark
#SBATCH --time=0-05:00:00 # 5 hour
#SBATCH --partition=batch
#SBATCH --qos qos-batch
###SBATCH -N $NODES
###SBATCH --ntasks-per-node=$NTASKS
### -c, --cpus-per-task=<ncpus>
### (multithreading) Request that ncpus be allocated per process
#SBATCH -c 7
#SBATCH --exclusive
#SBATCH --mem=0
#SBATCH --dependency=singleton
If I use a launcher to submit the same job with different node and task counts, the system gets confused and tries to assign tasks according to $SLURM_NTASKS, which gives 16, even though I ask for, for example, only 1 node and 3 tasks per node.
#!/bin/bash -l
for n in {1..4}
do
for t in {3..4}
do
echo "Running benchmark with ${n} nodes and ${t} tasks per node"
sbatch -N ${n} --ntasks-per-node=${t} spark-teragen.sh
sleep 5
sbatch -N ${n} --ntasks-per-node=${t} spark-terasort.sh
sleep 5
sbatch -N ${n} --ntasks-per-node=${t} spark-teravalidate.sh
sleep 5
done
done
How can I fix the error below and prevent Slurm from assigning a number of tasks per node that exceeds the limit?
Error:
srun: Warning: can't honor --ntasks-per-node set to 3 which doesn't match the
requested tasks 16 with the number of requested nodes 1. Ignoring --ntasks-per-node.
srun: error: Unable to create step for job 233838: More processors requested than
permitted
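One hedged guess, assuming the spark-*.sh scripts size their work from $SLURM_NTASKS: pass an explicit total task count on the sbatch command line so that SLURM_NTASKS matches the per-node request instead of being derived from the exclusive whole-node allocation (only the first of the three sbatch calls is shown):
#!/bin/bash -l
for n in {1..4}; do
  for t in {3..4}; do
    echo "Running benchmark with ${n} nodes and ${t} tasks per node"
    # -n sets the total task count explicitly, so SLURM_NTASKS = n * t
    sbatch -N ${n} --ntasks-per-node=${t} -n $((n * t)) spark-teragen.sh
    sleep 5
  done
done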

How to increase the maximum memory per CPU for a job in Slurm?

When I wanted to submit an array job in Slurm, I specified in the submission script:
#!/bin/sh
#SBATCH --mem-per-cpu=4G
#SBATCH --job-name=dat
#SBATCH --array=0-1000%20
#SBATCH --output=exp-%A_%a.out
#SBATCH --error=exp-%A_%a.err
#SBATCH --partition=32Nodes
#SBATCH --cpus-per-task=10
#SBATCH --ntasks=1
But after submitting the array job, I realized that the maximum memory per CPU that a job can use is 7500M. How can I change the maximum memory per CPU for a job array in Slurm? I am using the Linux terminal.
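Two hedged options, assuming the 7500M cap comes from the partition's MaxMemPerCPU limit (which only administrators can raise): update the memory request of the still-pending array with scontrol, or request more CPUs per task so the task gets more total memory:
# Raise the per-CPU memory request of a pending job or job array
# (<jobid> is a placeholder for the array's job ID; the value is in MB)
scontrol update JobId=<jobid> MinMemoryCPU=7500
# Alternatively, stay within the per-CPU cap but keep several CPUs per task,
# e.g. 10 CPUs x 7500M gives each task 75000M in total
#SBATCH --cpus-per-task=10
#SBATCH --mem-per-cpu=7500M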
