How to make subsection of sbatch jobs run at a time - slurm

I have a bash script (essentially what it does is align all files in a directory to all the files of another specified directory). The number of jobs gets quite large if there are 100 files being aligned individually to another 100 files (10,000 jobs), which all get submitted to slurm individually.
I have been doing them in batches manually but I think there must be a way to include it in a script so that, for example, only 50 jobs are running at a time.
I tried
$ sbatch --array [1-1000]%50
but it didn't work

Related

how to generate batch files in a loop in Python (not a loop in a batch file) with slightly changed parameters per iteration

I am a researcher that needs to run files for a set of years on a SLURM system (high performance computing center). The available nodes for long compute times have a long queue. I have 42 years to run, and the only way to run them that gets my files processed quickly (due to the wait times, and that this is many GB of data, it takes time), is to submit them individually, one batch file per year, as jobs. I cannot include multiple years in a single batch file, or I have to wait a week in the queue to run my data due to the time I have to reserve per batch file. This is the fastest way my university's system lets me run my data.
To do this, I have 2 lines in my batch script that I have to change every time: the name of the job, and the last line which is the python script name plus a parameter being passed to it (the year)
like so: pythonscript.py 2020.
I would like to generate batch files with a python or other script I can run, where it loops over a list of years and just changes the job name to jobNameYEAR and changes the last line to pythonscript.py YEAR, writes that to a file jobNameYEAR.sl, then continues in a loop to output the next batch file. ...Even better if it can write the batch file and submit the job (sjob jobNameYEAR) before continuing in the loop, but I realize maybe this is asking too much. But separately...
Is there a way to submit jobs in a loop once these files are created? E.g. loop through the year list and submit sjob jobName2000.sl, sjob jobName2001.sl, sjob jobName2002.sl
I do not want a loop in the batch file changing the variable, this would mean reserving too many hours on the SLURM system for a single job. I want a loop outside of the batch file that generates multiple batch files I can submit as jobs.
Thank you for your help!
This is what one of my .sl files looks like, it works fine, I just want to generate these files in a loop so I can stop editing them by hand:
#!/bin/bash -l
# The -l above is required to get the full environment with modules
# Set the allocation to be charged for this job
# not required if you have set a default allocation
#SBATCH -A MYFOLDER
# The name of the job
#SBATCH -J jobNameYEAR
# 24 hour wall-clock time will be given to this job
#SBATCH -t 3:00:00
# Job partition
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=6
#SBATCH --mem=30GB
#SBATCH -p main
# load the anaconda module
ml PDC/21.11
ml Anaconda3/2021.05
conda activate myEnv
python pythonfilename.py YEAR
Create a script with the following content (let's call it chainsubmit.sh):
#!/bin/bash
SCRIPT=${1?Usage: $0 script.slum arg1 arg2 ...}
shift
ARG=$1
ID=$(sbatch --job-name=jobName$ARG --parsable $SCRIPT $ARG)
shift
for ARG in "$#"; do
ID=$(sbatch --job-name=jobName$ARG --parsable --dependency=afterok:${ID%%;*} $SCRIPT $ARG)
done
Then, adapt your script so that the last line
python pythonfilename.py YEAR
is replaced with
python pythonfilename.py $1
Finally submit all the jobs with
./chainsubmit.sh jobName.sl {2000..2004}
for instance for YEAR ranging from 2000 to 2004
… script I can run, where it loops over a list of years and just changes the job name to jobNameYEAR and changes the last line to pythonscript.py YEAR, writes that to a file jobNameYEAR.sl… submit the job (sjob jobNameYEAR) before continuing in the loop…
It can easily be done with a few shell commands and sed. Assume you have a template file jobNameYEAR.sl as shown, which literally contains jobNameYEAR and YEAR as the parameters. Then we can substitute YEAR with each given year in the loop, e. g.
seq 2000 2002|while read year
do <jobNameYEAR.sl sed s/YEAR$/$year/ >jobName$year.sl
sjob jobName$year.sl
done
If your years aren't in sequence, we can use e. g. echo 1962 1965 1970 instead of seq ….
Other variants are on Linux also possible, like for year in {2000..2002} instead of seq 2000 2002|while read year, and using envsubst instead of sed.

Scheduling more jobs than MaxArraySize

Let's say I have 6233 simulations to run. The commands are generated and stored in a file, one in each line. I would like to use Slurm to schedule and run these commands. However, the MaxArraySize limit is 2000. So I can't use one job array to schedule all of them.
One solution is given here, where we create four separate jobs and use arithmetic indexing into the file, with the last job having a smaller number of tasks to run (233).
Is it possible to do this using one sbatch script with one job ID?
I set ntasks=1 when using job arrays. Do larger ntasks help in such situations?
Update:
Following Damien's solution and examples given here, I ended up with the following line in my bash script:
curID=$(( ${SLURM_ARRAY_TASK_ID} * ${SLURM_NTASKS} + ${SLURM_PROCID} ))
The same can be done using Python (shown in the referenced page). The only difference is that the environment variables should be imported into the script.
Is it possible to do this using one sbatch script with one job ID?
No. That solution will give you multiple job IDs
I set ntasks=1 when using job arrays. Do larger ntasks help in such situations?
Yes, that is a factor that you can leverage.
Each job in the array can spawn multiple tasks (--ntasks=...). In that case, the line number in the command file must be computed from $SLURM_ARRAY_TASK_ID and $SLURM_PROCID, and the program must be started with srun. Each task in a job member of the array will run in parallel. How large the job can be will depend on the MaxJobsize limit defined on the cluster/partition/qos you have access to.
Another option is to chain the tasks inside each job of the array, with a Bash loop (for i in $seq(...) ; do ...; done). In that case, the line number in the command file must be computed from $SLURM_ARRAY_TASK_ID and $i. Each task in a job member of the array will run serially. How large the job can be will depend on the MaxWall limit defined on the cluster/partition/qos you have access to.

Dealing with large number of small json files using pyspark

I have around 376K of JSON files under a directory in S3. These files are 2.5 KB each and contain only a single record/file. When I tried to load the entire directory via the below code via Glue ETL with 20 workers:
spark.read.json("path")
It just didn't run. There was a Timeout after 5 hrs. So, I developed and ran a shell script to merge the records of these files under a single file, and when I tried to load it, it just displays a single record. The merged file size is 980 MB. It worked fine for 4 records when tested locally after merging those 4 records under a single file. It displayed 4 records as expected.
I used the below command to append the JSON records from different files under a single file:
for f in Agent/*.txt; do cat ${f} >> merged.json;done;
It doesn't have any nested JSON. I even tried the multiline option but didn't work. So, what could be done in this case? As per me, when merged it is not treating records separately hence causing the issue. I even tried head -n 10 to display the top 10 lines but it goes to an infinite loop.
The problem was with my shell script that was being used to merge multiple small files. Post merge, records weren't aligned properly due to which they weren't treated as separate records.
Since I was dealing with a JSON dataset, I used the jq utility to process it. Below is the shell script that would merge a large number of records in a faster way into one file:
find . -name '*.txt' -exec cat '{}' + | jq -s '.' > output.txt
Later on, I was able to load the JSON records as expected with the below code:
spark.read.option("multiline","true").json("path")
I have run into trouble in the past working with thousands of small files. In my case they where csv files not json. One of hte thing I did to try and debug was to create a for loop and load smaller batches then combine all the data frame together. During each iteration I would call an action to force the execution. I would log the progress to get an idea of it was making progress . And monitor how it was slowing down as the job progressed

Does the file get changed in squeue if I modify after being sent into queue? [duplicate]

Say I want to run a job on the cluster: job1.m
Slurm handles the batch jobs and I'm loading Mathematica to save the output file job1.csv
I submit job1.m and it is sitting in the queue. Now, I edit job1.m to have different variables and parameters, and tell it to save data to job1_edited.csv. Then I re-submit job1.m.
Now I have two batch jobs in the queue.
What will happen to my output files? Will job1.csv be data from the original job1.m file? And will job1_edited.csv be data from the edited file? Or will job1.csv and job1_edited.csv be the same output?
:(
Thanks in advance!
I am assuming job1.m is a Mathematica job, run from inside a Bash submission script. In that case, job1.m is read when the job starts so if it is modified after submission but before job start, the modified version will run. If it is modified after the job starts, the original version will run.
If job1.m is the submission script itself (so you run sbatch job1.m), that script is copied in a spool directory specific to the job so if it is modified after the job is submitted, it still will run the original version.
In any case, it is better, for reproducibility and traceability, to make use of a workflow manager such as Fireworks, or Bosco

spark embedded (same JVM) for <400k text file process

Can we do a mini spark job embedded in our app? Any example? Reason : want to process part of a file and give results quicker than submitting a regular job. File is only 500 lines. But do not want to keep 2 code bases - just the one that is used for the large files too. File size is less than an MB.
I want to process the file in the same JVM that my client code is running. Want a single executor kicked off from within same JVM, via a flag in the config. (so a few jobs will have this flag set and others wont. Those that do not will run as usual on the cluster.)

Resources