Import bash variables into slurm script - slurm

I have seen similar questions, such as Use Bash variable within SLURM sbatch script, but they are not quite the same as mine, because I am not asking about slurm parameters.
I want to launch a slurm job for each of my sample files; imagine I have 3 VCFs and want to run a job for each of them.
I created a script that loops through a file of sample IDs and runs another script for each sample, which would work perfectly if I ran it directly with bash:
while read -r line
do
    sampleID="$line"
    myscript.sh "$sampleID"
done < sampleIDs.txt   # file of sample IDs, one per line (name is illustrative)
The problem is that I need to run the script with slurm, so is there any way to tell slurm which bash variable it should pass in?
I was trying this, but it is not working:
sbatch myscript.sh --export=$sampleID

Okay, I've solved it:
sbatch --export=sampleID=$sampleID myscript.sh
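For completeness, here is a sketch of how the whole submission loop could look, assuming the sample IDs are listed one per line in a hypothetical sampleIDs.txt:

while read -r sampleID
do
    sbatch --export=sampleID="$sampleID" myscript.sh
done < sampleIDs.txt

Inside myscript.sh the value is then available as $sampleID. One caution: when --export names specific variables like this, only those variables (plus the SLURM_* variables) reach the job by default; use --export=ALL,sampleID=$sampleID if the job also needs the rest of the submission environment.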

Related

Bash Script for Submitting Job on Cluster

I am trying to write a script so I can use the 'qsub' command to submit a job to the cluster.
Pretty much, once I get into the cluster, I go to the directory with my files and I do these steps:
export PATH=$PATH:$HOME/program/bin
Then,
program > run.log&
Is there any way to make this into a script so I am able to submit the job to the queue?
Thanks!
Putting the lines into a bash script and then running qsub myscript.sh should do it.
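As a minimal sketch (the #$ directives and job name are illustrative), such a submission script could look like:

#!/bin/bash
#$ -cwd                              # run from the submission directory
#$ -N myprogram                      # job name
export PATH=$PATH:$HOME/program/bin
program > run.log 2>&1

Inside a batch job the trailing & is unnecessary, since the scheduler already runs the script on a compute node rather than in your interactive session.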

Stop slurm sbatch from copying script to compute node

Is there a way to stop sbatch from copying the script to the compute node? For example, when I run:
sbatch --mem=300 /shared_between_all_nodes/test.sh
test.sh is copied to /var/lib/slurm-llnl/slurmd/etc/ on the executing compute node. The trouble with this is there are other scripts in /shared_between_all_nodes/ that test.sh needs to use and I would like to avoid hard coding the path.
In sge I could use qsub -b y to stop it from copying the script to the compute node. Is there a similar option or config in slurm?
Using sbatch --wrap is a nice solution for this
sbatch --wrap /shared_between_all_nodes/test.sh
Quotes are required if the script takes parameters:
sbatch --wrap "/shared_between_all_nodes/test.sh param1 param2"
From the sbatch docs (http://slurm.schedmd.com/sbatch.html):
--wrap=
Sbatch will wrap the specified command string in a simple "sh" shell script, and submit that script to the slurm controller. When --wrap is used, a script name and arguments may not be specified on the command line; instead the sbatch-generated wrapper script is used.
The script might be copied there, but the working directory will be the directory in which the sbatch command is launched. So if the command is launched from /shared_between_all_nodes/ it should work.
To be able to launch sbatch from anywhere, use this option:
-D, --workdir=<directory>
Set the working directory of the batch script to directory before it is executed.
like
sbatch --mem=300 -D /shared_between_all_nodes /shared_between_all_nodes/test.sh
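To illustrate why the working directory matters, suppose test.sh calls a sibling script by a relative path (helper.sh here is hypothetical):

#!/bin/bash
# /shared_between_all_nodes/test.sh
./helper.sh input.dat    # resolved relative to the job's working directory

With -D /shared_between_all_nodes (or when sbatch is launched from that directory), ./helper.sh resolves to /shared_between_all_nodes/helper.sh even though the batch script itself was copied to the slurmd spool directory.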

Difference between different ways of running shell script

Recently I was asked a question: what are the different ways of executing a shell script, and what is the difference between each method?
I said we can run a shell script in the following ways, assuming test.sh is the script name:
sh test.sh
./test.sh
. ./test.sh
I don't know the difference between 1 and 2, but as I understand it, the first two methods spawn a new process and run the script there, whereas the last method does not spawn a new process and instead runs the script in the current one.
Can someone throw more insight on this and correct me if I am wrong?
sh test.sh
Tells the shell to run test.sh with the sh interpreter, regardless of any shebang line in the script.
./test.sh
Executes the script directly. The interpreter needs to be defined in the first line with something like #!/bin/sh or #!/bin/bash. Note (thanks keltar) that in this case the file test.sh needs execute permission for the user running the command; otherwise it will not be executed.
In both cases a new process is spawned, so any variables set by the script are lost once it finishes.
. ./test.sh
Sources the code. That is, it runs it in the current shell, so whatever it executes, the variables it defines, etc., persist in the session.
For further information, you can check the very good answer to What is the difference between executing a bash script and sourcing a bash script?:
The differences are:
When you execute the script, you open a new shell, the commands run in that new shell, the output is copied back to your current shell, and then the new shell is closed. Any changes to the environment take effect only in the new shell and are lost once the new shell is closed.
When you source the script, the commands run in your current shell. Any changes to the environment take effect and stay in your current shell.
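A quick way to see the difference for yourself, assuming a hypothetical demo.sh containing:

#!/bin/bash
FOO=hello     # set a variable
cd /tmp       # change directory

Executing it (./demo.sh) leaves your shell untouched: echo "$FOO" prints nothing and pwd is unchanged, because both changes happened in a child process. Sourcing it (. ./demo.sh) makes echo "$FOO" print hello and pwd report /tmp, because the commands ran in your current shell.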

Crontab Source File

Recently I created a bash script which I am supposed to run in cron.
After preparing the bash script and confirming it worked normally, I put it in cron and found that it was failing. As a second step, I removed all the environment dependencies, i.e. instead of just file.txt I specified /home/blah-blah/file.txt.
I still found the script failing at one step, the one that calls a data processing tool.
The command I executed was /bin/blah-blah/processing_tool -parameter $INDEX, where $INDEX is a variable calculated within the bash script.
The third step was to source the bash profile at the beginning of the script. Voila! The script started executing perfectly from cron.
My question is: why is this happening even after I removed all the environment dependencies from my script? Also, I have heard that sourcing a profile in a cron job is not recommended. If so, is there any other way I can avoid doing this?
Basically: anything started from cron starts with a totally clean slate.
You can make no assumptions whatsoever about the content of environment variables or whichever folder is the current folder at the start of any script run from cron.
Easiest solution:
cd to the desired directory so the script starts from a known location.
source /etc/profile to make sure you get the system-wide environment variables set up.
source ~myuserid/.profile to read your personal environment settings. (~/.profile won't work as that would indicate the cron user.)
Then start executing the actual script.
Of course the approach above requires the cron process to have read access to your home dir, and it's probably doing a lot more work than is actually required.
Slightly more complicated: Figure out which environment variables are required by the script and anything that gets called by the script.
Explicitly export these at the beginning of the cron script.
(P.s. replace /etc/profile and ~myuserid/.profile with whatever are the corresponding files for your shell of choice.)
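Put together, a wrapper along these lines (all paths are illustrative) keeps the crontab entry itself simple:

# crontab entry:
15 2 * * * /home/myuserid/bin/nightly_wrapper.sh

# /home/myuserid/bin/nightly_wrapper.sh
#!/bin/bash
cd /home/myuserid/work              # start from a known directory
source /etc/profile                 # system-wide environment
source /home/myuserid/.profile      # personal environment
/home/myuserid/bin/actual_job.sh    # the real script

For the slightly more complicated approach, replace the two source lines with explicit exports of just the variables the script actually needs (PATH and whatever the processing tool reads).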
A cron job can be thought of as running as a separate user, so this "user" may not "see" or "read" the same files as you do. It is thus essential that all path names etc. be given as absolute paths.
Every script runs within its own process. So, when you run a script, you can change $SHELL and any other variable within it, but the change is lost once you exit. My guess is that the $INDEX variable may have been computed successfully within the script, but its use outside of the script may have failed. Without more information about what the job was, or what you wanted to do, it is hard to tell.
There are two ways to run a cron job:
As root, you can run su - user -c '<job>' in the root crontab.
Sourcing your profile explicitly, as you have done.
You can also set environment variables within the crontab.
As the user, in the user crontab, you can run it like so: . /home/blah/.profile && myScript
That said, there HAS to be something in your environment (apart from file paths) that is not present when the job runs from cron. You will have to execute the script with the -x flag (in bash) and then pore over the output. Diffing your environment variables against those of root/cron might be a pointer. Also, check whether some utilities used in your script live in locations that are not part of the $PATH variable for cron/root.
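One way to get that diff, with illustrative paths, is to let cron dump its own environment once and compare it to your interactive shell:

# temporary crontab entry:
* * * * * env | sort > /tmp/cron_env.txt

# then, in your interactive shell:
env | sort > /tmp/shell_env.txt
diff /tmp/shell_env.txt /tmp/cron_env.txt

Anything that appears only on the interactive side (a PATH entry, a variable your processing tool reads) is a candidate for what the cron run is missing.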

Help debugging a cron job which has the correct script path and works when manually triggered

I'm struggling to debug a cron job which isn't working correctly. The cron job calls a shell script which should unrar a rar file; this works correctly when I run the script manually, but for some reason it's not working via cron. I am using the absolute file path and have verified that the path is correct. Has anyone got any ideas why this could be happening?
Well, you already said that you have used absolute paths, so the number one problem is dealt with.
Next to check are permissions. Which user is the cron job run as? Does it have all the permissions necessary?
Then, a little trick: when a shell script fails and it isn't run in a terminal, I like to redirect its output to a file. Right at the start of the script, add:
exec &>/tmp/my.log
This will redirect STDOUT and STDERR to /tmp/my.log. Then it might be a good idea to also add the line:
set -x
This will make bash print which command it's about to execute, and at what nesting level.
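Put together, the top of the cron script could look like this (the unrar command and paths are illustrative):

#!/bin/bash
exec &>/tmp/my.log      # send all stdout and stderr to a log file
set -x                  # print each command before it is executed
/usr/bin/unrar x /absolute/path/to/archive.rar /absolute/path/to/destination/

After the next cron run, /tmp/my.log shows exactly which command failed and with what output.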
Happy debugging!
The first thing to check when a cron job fails is whether the full environment is available to the script you are trying to execute. A job executed via cron runs as a detached process, meaning it is not associated with a login environment. Therefore, whenever you debug a cron job that works when executed manually, you need to be sure the same environment is available to the cron job as is available to you when you run it by hand. This includes any PATH settings and other environment variables the script may depend on.
For me, the problem was a different shell interpreter in crontab.
