set values ​for job variables from another file - slurm

I wanted to indicate the name and other values ​​of some variables of a job from another file but I get an error.
sbatch: error: Unable to open file 10:12:35
file.sh
#!/bin/bash
DATE=`date '+%Y-%m-%d %H:%M:%S'`
name='test__'$DATE
sbatch -J $name -o $name'.out' -e $name'.err' job.sh
job.sh
#!/bin/bash
#SBATCH --job-name=test
#SBATCH --nodes=1 # number of nodes
#SBATCH --ntasks-per-node=2 # number of cores
#SBATCH --output=.out
#SBATCH --error=.err
#module load R
Rscript script.R
script.R
for(i in 1:1e6){print(i)}

You are wrongly quoting the variables and the space requested in the date is creating two arguments to sbatch, hence he is complaining about that wrong parameter.
If I were you, I would avoid the space (as a general rule, cause it is more error prone and always requires quoting):
file.sh:
#!/bin/bash
DATE=$(date '+%Y-%m-%dT%H:%M:%S')
name="test__$DATE"
sbatch -J "$name" -o "${name}.out" -e "${name}.err" job.sh

Related

linux slurm - separate .out files for tasks run in paralell on 1 node

I am running jobs in parallel on linux using slurm by requesting a node and running one task per cpu.
However, the output as specified joins both streams into the single out file. I tried the %t flag on the epxectation it would separate the tasks, but it just logs everything in the output file with _0 appended (e.g. sample_output__XXX_XX_0.out).
Any advice on how to best generate a separate .out log per task would be much appreciated
#!/bin/bash
#SBATCH --job-name=recon_all_06172021_1829
#SBATCH --output=/path/recon_all_06172021_1829_%A_%a_%t.out
#SBATCH --error=/path/recon_all_06172021_1829_%A_%a.err
#SBATCH --ntasks=2
#SBATCH --ntasks-per-node=2
#SBATCH --nodes=1
#SBATCH --cpus-per-task=1
#SBATCH --time=23:59:00
#! Always keep the following echo commands to monitor CPU, memory usage
echo "SLURM_MEM_PER_CPU: $SLURM_MEM_PER_CPU"
echo "SLURM_MEM_PER_NODE: $SLURM_MEM_PER_NODE"
echo "SLURM_JOB_NUM_NODES: $SLURM_JOB_NUM_NODES"
echo "SLURM_NNODES: $SLURM_NNODES"
echo "SLURM_NTASKS: $SLURM_NTASKS"
echo "SLURM_CPUS_PER_TASK: $SLURM_CPUS_PER_TASK"
echo "SLURM_JOB_CPUS_PER_NODE: $SLURM_JOB_CPUS_PER_NODE"
command 1 &
command 2
wait
You can redirect the standard output from the command itself, for example:
command 1 > file1 2>&1
command 2 > file2 2>&1
Not as neat as using the sbatch filename patterns, but it will separate the output from each command.

I got no output using echo $SLURM_NTASKS

I create this batch file myfirst_slurm_job.sh that contain the following code:
#!/bin/bash
#SBATCH --output="slurm1.txt"
cd $HOME/..
echo $PWD
echo $SLURMD_NODENAME
echo $SLURM_NTASKS
and then I run this command line:
sbatch myfirst_slurm_job.sh
note: it's my first post
You need to specify the --ntasks/-n flag;
#SBATCH -n 1
else SLURM won't bother to define this variable for you.

passing a string argument to name a job in SLURM

I want to pass a parameter to as bash script in a cluster in order to name the job. I tried this:
#!/bin/bash
#SBATCH -J "$1" #<--- to name the job with the first parameter
#SBATCH --partition=shortq
#SBATCH -o %x-%j.out
#SBATCH -e %x-%j.err
echo "this is a test job named" $1
Gate main.mac
When I launch the job with
sbatch my_script.sh test_sript
I'm getting a file named $1-23472.out . It appears that "$1" didn't be interpreted. How can I have a file named "test_script-23472.out" ?
Also, is the line Gate main.mac mandatory? Can anyone explains me why we should put it ?
Many thanks
You probably can't do it exactly as you want to, but here's a solution that comes pretty close:
Batch script:
#!/bin/bash
#SBATCH --partition=shortq
#SBATCH -o %x-%j.out
#SBATCH -e %x-%j.err
echo "this is a test job named" $SLURM_JOB_NAME
(rest of your script here)
Submit with:
$ sbatch -J jobname my_script.sh
Slurm will not interpret the Bash variable in the comments. Bash either since it is in a comment.
One solution is a construct like this for submission:
ARG="<something>" sbatch -J "$ARG" my_script.sh test_sript "$ARG"
As for the Gate main.mac line, it is used to start the Gate program with main.mac as argument.
This is how I've been formatting slurm scripts to parse bash variables as job names.
#!/bin/bash
MYVAR=$1
sbatch --export=ALL -J ${MYVAR} --wrap="run something"

Wrapper, This does not look like a batch script

I want to pass arguments in a job. I write a wrapper in the file job.sh but when i use it i get a error:
sbatch: error: This does not look like a batch script. The first
sbatch: error: line must start with #! followed by the path to an interpreter.
sbatch: error: For instance: #!/bin/sh
main.sh
DATE=`date '+%Y-%m-%d %H:%M:%S'`
name='test__'$DATE
./job.sh $name
job.sh
#!/bin/bash
sbatch << EOT
#!/bin/sh
#SBATCH --job-name=$1
#SBATCH --nodes=1 # number of nodes
#SBATCH --ntasks-per-node=1 # number of cores
#module load R
Rscript script.R $1
EOT
script.R
args <- commandArgs(trailingOnly=TRUE)
print((args[[1]]))
(the three files are in the same directory)

Insert system variables to SBATCH

I would like to ask you if is possible to pass global system variables to #SBATCH tags.
I would like to do somethink like that
SBATCH FILE
#!/bin/bash -l
ARG=64.dat
NODES=4
TASK_PER_NODE=8
NP=$((NODES*TASK_PER_NODE))
#SBATCH -J 'MPI'+'_'+$NODES+'_'+$TASK_PER_NODE
#SBATCH -N $NODES
#SBATCH --ntasks-per-node=$TASK_PER_NODE
It isn't workink so that is why I ask you.
Remember that SBATCH parameter lines are viewed by Bash as comments, so it will not try to interpret them at all.
Furthermore, the #SBATCH directives must be before any other Bash command for Slurm to handle them.
Alternatives include setting the parameters in the command line:
NODES=4 sbatch --nodes=$NODES ... submitscript.sh
or passing the submission script through stdin:
#!/bin/bash -l
ARG=64.dat
NODES=4
TASK_PER_NODE=8
NP=$((NODES*TASK_PER_NODE))
sbatch <<EOT
#SBATCH -J "MPI_$NODES_$TASK_PER_NODE"
#SBATCH -N $NODES
#SBATCH --ntasks-per-node=$TASK_PER_NODE
srun ...
EOT
In this later case, you will need to run the submission script rather than handing it to sbatch since it will run sbatch itself. Note also that string concatenation in Bash is not achieved with the + sign.

Resources