How to get SLURM task ID in program - slurm

I'm running srun -n 100 python foo.py. Inside the Python script, how can it find out which task number/ID/rank it is? Is there an environment variable set?

Have a look at man srun or man sbatch for a list of environment variables. $SLURM_PROCID might be the one you need.
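A quick way to see it in action (a sketch, assuming you have an allocation; the task count of 4 is just for illustration) is to print the variable from each task:
srun -n 4 python -c 'import os; print("task", os.environ["SLURM_PROCID"], "of", os.environ["SLURM_NTASKS"])'
Inside foo.py the same os.environ lookup works; SLURM_PROCID runs from 0 to SLURM_NTASKS - 1.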

Related

I am new to shell scripting in Linux, what is the meaning of a=$a $b [duplicate]

test.sh:
#! /bin/sh
me=I ./test2.sh
test2.sh:
#! /bin/sh
echo $me
Running the first script prints this:
[zhibin@szrnd1 sh]$ ./test.sh
I
[zhibin@szrnd1 sh]$
As you can see, the variable "$me" is passed through to "test2.sh".
I couldn't find this usage of variable assignment by googling; can someone tell me where I can find a tutorial that covers it?
Thanks a lot!
Since this has been mentioned on SO a lot, I'm assuming you're looking for some documentation on it. I'm not sure there is anything more detailed than the BASH documentation about this:
The environment for any simple command or function may be augmented temporarily by prefixing it with parameter assignments, as described in Shell Parameters. These assignment statements affect only the environment seen by that command.
As you've seen by experimenting, when you run "A=B command", the command runs as if "export A=B" had been executed just before it, and A reverts to its previous value after the command completes. It's a very convenient way to pass some environment into a command while ensuring the rest of the script is not affected.
From the bash documentation on the Environment:
The environment for any simple command or function may be augmented temporarily by prefixing it with parameter assignments, as described in Shell Parameters. These assignment statements affect only the environment seen by that command.
So if you put variable assignments before a command, that command is run with those environment variables.
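A minimal demonstration of that behaviour (the variable name FOO is arbitrary):
# FOO is visible only inside the prefixed command
FOO=bar sh -c 'echo "inside: $FOO"'   # prints: inside: bar
echo "outside: $FOO"                  # prints: outside:  (FOO was never set in this shell)
Note the single quotes: they stop the current shell from expanding $FOO before the child command runs.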

Import bash variables into slurm script

I have seen similar questions, such as Use Bash variable within SLURM sbatch script, but they are not exactly the same as mine, because I am not asking about Slurm parameters.
I want to launch a Slurm job for each of my sample files, so imagine I have 3 VCFs and I want to run a job for each of them.
I created a script that loops through a file of sample IDs and runs another script on each sample, which would work perfectly if I ran it directly with bash:
while read line
do
sampleID="${line[0]}"
myscript.sh $sampleID
done < sampleIDs.txt   # file of sample IDs (name illustrative)
The problem is that I need to run the script with Slurm, so is there any way to tell Slurm about the bash variable it should include?
I was trying this, but it is not working:
sbatch myscript.sh --export=$sampleID
Okay, I've solved it:
sbatch --export=sampleID=$sampleID myscript.sh
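Putting it together, the whole submission loop might look like this (a sketch; samples.txt is a made-up name for the file of sample IDs, and adding ALL keeps the rest of the submission environment alongside sampleID):
while read sampleID
do
# ALL propagates the usual environment; sampleID is added on top of it
sbatch --export=ALL,sampleID="$sampleID" myscript.sh
done < samples.txt
Inside myscript.sh the value is then available simply as $sampleID.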

Can I use PBS environment variables inside the PBS directives of my script?

Something like:
#PBS -t 0-99
#PBS -d "~/$PBS_ARRAYID.output"
What I want to do here is to redefine the working directory of each individual job in the job array, using the job's array id. Is this valid code?
I need to know before I send to the cluster, because I can't run tests there.
Yes, you can use any of the environment variables listed in the PBS documentation in -d, -o, or -e.
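For example, a sketch of an array script along those lines (untested, relying on the answer above; note that -d expects a directory, so per-task output files usually go in -o or -e instead):
#!/bin/sh
#PBS -t 0-99
#PBS -o ~/$PBS_ARRAYID.output
#PBS -e ~/$PBS_ARRAYID.error
echo "running array index $PBS_ARRAYID"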

Stop slurm sbatch from copying script to compute node

Is there a way to stop sbatch from copying the script to the compute node? For example, when I run:
sbatch --mem=300 /shared_between_all_nodes/test.sh
test.sh is copied to /var/lib/slurm-llnl/slurmd/etc/ on the executing compute node. The trouble with this is that there are other scripts in /shared_between_all_nodes/ that test.sh needs to use, and I would like to avoid hard-coding the path.
In sge I could use qsub -b y to stop it from copying the script to the compute node. Is there a similar option or config in slurm?
Using sbatch --wrap is a nice solution for this:
sbatch --wrap /shared_between_all_nodes/test.sh
Quotes are required if the script takes parameters:
sbatch --wrap "/shared_between_all_nodes/test.sh param1 param2"
From the sbatch docs (http://slurm.schedmd.com/sbatch.html):
--wrap=
Sbatch will wrap the specified command string in a simple "sh" shell script, and submit that script to the slurm controller. When --wrap is used, a script name and arguments may not be specified on the command line; instead the sbatch-generated wrapper script is used.
The script might be copied there, but the working directory will be the directory in which the sbatch command is launched. So if the command is launched from /shared_between_all_nodes/, it should work.
To be able to launch sbatch from anywhere, use this option:
-D, --workdir=<directory>
Set the working directory of the batch script to directory before
it is executed.
like
sbatch --mem=300 -D /shared_between_all_nodes /shared_between_all_nodes/test.sh
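With either approach, test.sh can then call its neighbours by relative path, with no hard-coded directory (a sketch; helper.sh stands in for whatever other scripts live in the shared directory):
#!/bin/sh
# the working directory is /shared_between_all_nodes at run time,
# either because sbatch was launched there or because -D set it
./helper.sh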

Use Bash variable within SLURM sbatch script

I'm trying to obtain a value from another file and use it within a SLURM submission script. However, I get an error that the value is non-numerical; in other words, the variable is not being expanded.
Here is the script:
#!/bin/bash
# This reads out the number of procs based on the decomposeParDict
numProcs=`awk '/numberOfSubdomains/ {print $2}' ./meshModel/decomposeParDict`
echo "NumProcs = $numProcs"
#SBATCH --job-name=SnappyHexMesh
#SBATCH --output=./logs/SnappyHexMesh.log
#
#SBATCH --ntasks=`$numProcs`
#SBATCH --time=240:00
#SBATCH --mem-per-cpu=4000
#First run blockMesh
blockMesh
#Now decompose the mesh
decomposePar
#Now run snappy in parallel
mpirun -np $numProcs snappyHexMesh -parallel -overwrite
When I run this as a normal Bash shell script, it prints out the number of procs correctly and makes the correct mpirun call. Thus the awk command parses out the number of procs correctly and the variable is dereferenced as expected.
However, when I submit this to SLURM using:
sbatch myScript.sh
I get the error:
sbatch: error: Invalid numeric value "`$numProcs`" for number of tasks.
Can anyone help with this?
This won't work. What happens when you run
sbatch myscript.sh
is that Slurm parses the script for those special #SBATCH lines, generates a job record, and stores the batch script somewhere. The batch script is executed only later, when the job runs.
So you need to structure your workflow in a slightly different way and calculate the number of procs you need before submitting the job. Note that you can use something like
sbatch -n $numProcs myscript.sh
so you don't need to autogenerate the script (also, mpirun should be able to get the number of procs in your allocation automatically, so there is no need to use -np).
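A sketch of that two-step workflow, reusing the file names from the question (the wrapper name submit.sh is made up):
#!/bin/bash
# submit.sh -- run this directly on the login node, not through sbatch
numProcs=$(awk '/numberOfSubdomains/ {print $2}' ./meshModel/decomposeParDict)
echo "Submitting with $numProcs tasks"
sbatch --ntasks="$numProcs" myScript.sh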
Slurm stops processing #SBATCH directives at the first line of executable code in a script. If your #SBATCH directives do not depend on the code you want to run above them, just put the #SBATCH lines at the top.
See the other answer for a workaround/solution if, as with the OP, your sbatch options depend on the commands placed above them.
The batch script may contain options preceded with "#SBATCH" before
any executable commands in the script. sbatch will stop processing
further #SBATCH directives once the first non-comment non-whitespace
line has been reached in the script.
From the sbatch docs, my emphasis.
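For the non-dependent case, a correctly ordered version of the script from the question would simply keep every #SBATCH line above the first command and take the task count from the command line or a wrapper, as in the other answer (a sketch):
#!/bin/bash
#SBATCH --job-name=SnappyHexMesh
#SBATCH --output=./logs/SnappyHexMesh.log
#SBATCH --time=240:00
#SBATCH --mem-per-cpu=4000
# everything below runs once the job starts
blockMesh
decomposePar
mpirun snappyHexMesh -parallel -overwrite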
