Submitting an OpenMPI job via the Slurm REST API fails - slurm

Here is my job script content:
#!/bin/bash
#SBATCH --partition=compute
#SBATCH --job-name=demo
#SBATCH --output=job.%j.out
#SBATCH --error=job.%j.err
#SBATCH -N 3
#SBATCH --ntasks-per-node=1
#SBATCH --export=ALL
srun --mpi=pmi2 -n 3 hostname
When I submit this job via sbatch, it runs to completion and returns the hostnames of my nodes SUCCESSFULLY. But if I submit it via the Slurm REST API (slurm/v0.0.37/job/submit), the API returns HTTP 200 with a job id, and the job FAILS with the following stderr:
srun: error: auth_g_unpack: remote plugin_id 101 not found
srun: error: slurm_receive_msgs: [[node1]:6818] auth_g_unpack: Resource temporarily unavailable
srun: error: slurm_receive_msgs: [[node1]:6818] failed: Header lengths are longer than data received
srun: error: auth_g_unpack: remote plugin_id 101 not found
srun: error: slurm_receive_msgs: [[node3]:6818] auth_g_unpack: Resource temporarily unavailable
srun: error: slurm_receive_msgs: [[node3]:6818] failed: Header lengths are longer than data received
srun: error: auth_g_unpack: remote plugin_id 101 not found
srun: error: slurm_receive_msgs: [[node2]:6818] auth_g_unpack: Resource temporarily unavailable
srun: error: slurm_receive_msgs: [[node2]:6818] failed: Header lengths are longer than data received
The body I POST looks like this:
{
  "job": {
    "name": "demo",
    "partition": "compute",
    "standard_output": "%j.out",
    "standard_error": "%j.err",
    "nodes": 3,
    "tasks": 3,
    "tasks_per_node": 1,
    "get_user_environment": 1,
    "current_working_directory": "/SHARED_NFS_STORAGE"
  },
  "script": "#!/bin/bash\nsrun --mpi=pmi2 -n 3 hostname"
}
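For reference, the request is sent roughly like this (a sketch: the head node name and user name are placeholders, and the token step assumes auth/jwt is configured for slurmrestd on my setup):
export $(scontrol token)            # sets SLURM_JWT=...
curl -s -X POST "http://headnode:6820/slurm/v0.0.37/job/submit" \
     -H "X-SLURM-USER-NAME: myuser" \
     -H "X-SLURM-USER-TOKEN: ${SLURM_JWT}" \
     -H "Content-Type: application/json" \
     -d @job.json                   # job.json contains the body shown above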

Related

Is there any reason Slurm will not run more than a certain number of nodes?

I want to run about 400 jobs in GCP Slurm from an array of about 2,000 tasks.
The SBATCH settings in my bash file and my slurm.conf settings are as follows.
run.sh
#SBATCH -o ./out/vs.%j.out
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=16
#SBATCH -W
slurm.conf
MaxArraySize=50000
MaxJobCount=50000
#COMPUTE NODE
NodeName=DEFAULT CPUs=16 RealMemory=63216 State=UNKNOWN
NodeName=node-0-[0-599] State=CLOUD
Currently, 100 nodes are being used for work other than this task.
When I run this task, only about 130-150 node tasks in total are executed and the rest are not.
Are there any additional parameters that need to be set?
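For what it's worth, here is how one can check whether a QOS, association, or cluster limit is capping the number of running jobs (a sketch; it assumes read access to the accounting database):
# Cluster-wide limits that commonly cap running/queued jobs
scontrol show config | grep -iE 'MaxJobCount|MaxArraySize|SchedulerParameters'
# Per-QOS and per-association limits (MaxJobsPU = max running jobs per user)
sacctmgr show qos format=Name,MaxJobsPU,MaxSubmitPU,GrpJobs,GrpTRES
sacctmgr show assoc where user=$USER format=User,Account,MaxJobs,MaxSubmit,GrpJobs
# Why the remaining array tasks are pending (the Reason column)
squeue -u $USER -t PENDING -o "%r" | sort | uniq -c | sort -rn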
-- additional error log
[2022-06-20T01:18:41.294] error: get_addr_info: getaddrinfo() failed: Name or service not known
[2022-06-20T01:18:41.294] error: slurm_set_addr: Unable to resolve "node-333"
[2022-06-20T01:18:41.294] error: fwd_tree_thread: can't find address for host node-333, check slurm.conf
I found a workaround for the additional error.
https://groups.google.com/g/slurm-users/c/y-QZKDbYfIk
Following that article, you can edit slurmctld.service / slurmd.service / slurmdbd.service:
network.target -> network-online.target
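Concretely, the change can be applied as a systemd drop-in instead of editing the packaged unit files (a sketch for slurmd; the same drop-in works for slurmctld and slurmdbd, and the path may differ on your distro):
sudo mkdir -p /etc/systemd/system/slurmd.service.d
sudo tee /etc/systemd/system/slurmd.service.d/network-online.conf <<'EOF'
[Unit]
Wants=network-online.target
After=network-online.target
EOF
sudo systemctl daemon-reload
sudo systemctl restart slurmd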
However, the limit on the number of nodes that actually execute tasks remains.

Memory specification error when submitting a job with sbatch

I am trying to submit a job with the sbatch command and getting the following error:
sbatch --mem-per-cpu=1024 -N2 -n 2 --job-name="test" -p partition script
sbatch: error: Memory specification can not be satisfied
sbatch: error: Batch job submission failed: Requested node configuration is not available
It says the memory specification cannot be satisfied, although I have much more free memory available in the cluster.
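The check below shows what Slurm itself thinks is available, since this error is about configured RealMemory and per-partition memory limits rather than free RAM (a sketch; "partition" is the partition name from the command above):
# Configured (not free) resources per node: nodelist, partition, CPUs, memory in MB
sinfo -N -p partition -o "%N %P %c %m"
# Per-partition memory limits that --mem-per-cpu is checked against
scontrol show partition partition | grep -iE 'MemPer'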

SLURM error - sbatch HS8_main.sbatch sbatch: error: Unable to open file HS8_main.sbatch

I am trying to submit an sbatch file to run a code on 200 cores on the system. My script is:
#!/bin/sh
#SBATCH --job-name=sm #Job name
#SBATCH --mail-type=ALL # Mail events (NONE, BEGIN, END, FAIL, ALL)
#SBATCH --mail-user=sankalpmathur#ufl.edu # Where to send mail
#SBATCH --mem-per-cpu=3gb # Per processor memory
#SBATCH --array=1-200
#SBATCH -t 199:00:00 # Walltime
#SBATCH -o output_%a.out # Name output file
#
pwd; hostname; date
module load stata
stata-mp -b do array_${SLURM_ARRAY_TASK_ID}.do
When I run the file I get this error
sbatch HS8_main.sbatch
sbatch: error: Unable to open file HS8_main.sbatch
I have run the same sbatch before and it ran fine. What could possibly be the reason for it to not run this time?
Thank you
That's the error one gets when the sbatch script isn't in the current directory, or the name is wrong. Are you sure HS8_main.sbatch is the name of your script, and it's in the same place you're running sbatch from?
Just try "pwd" to check the current working directory.
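For example, a quick sanity check before resubmitting:
pwd                      # confirm which directory you are submitting from
ls -l HS8_main.sbatch    # confirm the script exists here under exactly this name
sbatch HS8_main.sbatch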

Why does Slurm assign more tasks than I asked for when I "sbatch" multiple jobs with a .sh file?

I submit some cluster-mode Spark jobs, which run just fine when I submit them one by one with the sbatch specs below.
#!/bin/bash -l
#SBATCH -J Spark
#SBATCH --time=0-05:00:00 # 5 hour
#SBATCH --partition=batch
#SBATCH --qos qos-batch
###SBATCH -N $NODES
###SBATCH --ntasks-per-node=$NTASKS
### -c, --cpus-per-task=<ncpus>
### (multithreading) Request that ncpus be allocated per process
#SBATCH -c 7
#SBATCH --exclusive
#SBATCH --mem=0
#SBATCH --dependency=singleton
If I use a launcher to submit the same job with different node and task numbers, the system gets confused and assigns tasks according to $SLURM_NTASKS, which gives 16, even though I am asking for, for example, only 1 node and 3 tasks per node.
#!/bin/bash -l
for n in {1..4}
do
for t in {3..4}
do
echo "Running benchmark with ${n} nodes and ${t} tasks per node"
sbatch -N ${n} --ntasks-per-node=${t} spark-teragen.sh
sleep 5
sbatch -N ${n} --ntasks-per-node=${t} spark-terasort.sh
sleep 5
sbatch -N ${n} --ntasks-per-node=${t} spark-teravalidate.sh
sleep 5
done
done
How can I fix the error below and prevent Slurm from assigning a strange number of tasks per node that exceeds the limit?
Error:
srun: Warning: can't honor --ntasks-per-node set to 3 which doesn't match the
requested tasks 16 with the number of requested nodes 1. Ignoring --ntasks-per-node.
srun: error: Unable to create step for job 233838: More processors requested than
permitted
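One way to keep SLURM_NTASKS consistent with the intended layout is to pass --ntasks explicitly from the launcher (a sketch, assuming the per-job scripts don't override it with their own #SBATCH directives):
#!/bin/bash -l
for n in {1..4}; do
  for t in {3..4}; do
    echo "Running benchmark with ${n} nodes and ${t} tasks per node"
    # Make the total task count explicit so SLURM_NTASKS matches n*t
    sbatch -N "${n}" --ntasks-per-node="${t}" --ntasks=$((n * t)) spark-teragen.sh
    sleep 5
  done
done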

ArangoDB goes silent after slurm sbatch submission

I am trying to run ArangoDB in cluster mode on a Cray supercomputer.
It runs on a login node.
I followed these instructions:
https://docs.arangodb.com/3.3/Manual/Deployment/Local.html
To make proper use of the Cray cluster, however, I need to submit it as a batch job (Slurm / sbatch).
I am having issues getting it running because "arangod" goes silent, that is, its command-line output does not end up in the Slurm log file.
I have tried to change the log-settings using this link:
https://docs.arangodb.com/3.3/Manual/Administration/Configuration/Logging.html
If I set the log level to "info", I get nothing. If I use "trace" like this:
build/bin/arangod --server.endpoint tcp://0.0.0.0:5003 --agency.my-address tcp://148.187.32.9:5001 --server.authentication false --agency.activate true --agency.size 3 --agency.supervision true --database.directory db_dir/agency_2 --log.level startup=trace --log.level agency=trace --log.level queries=trace --log.level replication=trace --log.level threads=trace
I get something, but it does not print any of the lines I'm interested in, namely whether it created the database directory, whether it ends up in gossip mode, and so on. I don't get a single line of the output I would expect in the console if I just ran it from the terminal.
As I said: on the login-node it all works. I suspect the problem might be in the interaction of Slurm and arangod.
Can you help me?
* EDIT *
I ran a small experiment. First I ran this (expecting an error message):
#!/bin/bash -l
#SBATCH --job-name=slurm_test
#SBATCH --time=00:30:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-core=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=1
#SBATCH --partition=debug
#SBATCH --constraint=mc
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
srun build/bin/arangod --server.endpoint tcp://0.0.0.0:5001
And I got this (the first line is from arangod, which is what we expect):
slurm-....out:
no database path has been supplied, giving up, please use the '--database.directory' option
srun: error: nid00008: task 0: Exited with exit code 1
srun: Terminating job step 8106415.0
Batch Job Summary Report for Job "slurm_test" (8106415) on daint
-----------------------------------------------------------------------------------------------------
Submit Eligible Start End Elapsed Timelimit
------------------- ------------------- ------------------- ------------------- ---------- ----------
2018-06-20T22:41:54 2018-06-20T22:41:54 Unknown Unknown 00:00:00 00:30:00
-----------------------------------------------------------------------------------------------------
Username Account Partition NNodes Energy
---------- ---------- ---------- ------ --------------
peterem g34 debug 1 joules
This job did not utilize any GPUs
----------------------------------------------------------
Scratch File System Files Quota
-------------------- ---------- ----------
/scratch/snx3000 85020 1000000
Then I ran this:
#!/bin/bash -l
#SBATCH --job-name=slurm_test
#SBATCH --time=00:30:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-core=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=1
#SBATCH --partition=debug
#SBATCH --constraint=mc
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
srun build/bin/arangod --server.endpoint tcp://0.0.0.0:5001 --agency.my-address tcp://127.0.0.1:5001 --server.authentication false --agency.activate true --agency.size 1 --agency.supervision true --database.directory agency1
This created the "agency1" directory but did not complete (it ran for over 3 minutes), so after a few minutes I ran "scancel" on the job. This is the only output (slurm-....out):
srun: got SIGCONT
slurmstepd: error: *** STEP 8106340.0 ON nid00008 CANCELLED AT 2018-06-20T22:38:03 ***
slurmstepd: error: *** JOB 8106340 ON nid00008 CANCELLED AT 2018-06-20T22:38:03 ***
srun: forcing job termination
srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
Batch Job Summary Report for Job "slurm_test" (8106340) on daint
-----------------------------------------------------------------------------------------------------
Submit Eligible Start End Elapsed Timelimit
------------------- ------------------- ------------------- ------------------- ---------- ----------
2018-06-20T22:32:15 2018-06-20T22:32:15 Unknown Unknown 00:00:00 00:30:00
-----------------------------------------------------------------------------------------------------
Username Account Partition NNodes Energy
---------- ---------- ---------- ------ --------------
peterem g34 debug 1 joules
This job did not utilize any GPUs
----------------------------------------------------------
Scratch File System Files Quota
-------------------- ---------- ----------
/scratch/snx3000 85020 1000000
So: I know it is running in both cases (it either gives output or creates the folder). But I have no idea why it gives no output in the second case.
I hope this clarifies my issue.
Thanks, Emanuel
Would you please post your entire Slurm job command / job file? arangod logs to stdout. When stdout is redirected to an output file, as cluster batch systems do by default, you should monitor that file. As far as I remember, Slurm by default writes to slurm-$jobid.out.
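If that file stays empty, it can be worth ruling out output buffering and simply watching the file live (a sketch; the job id is taken from the report above, --unbuffered is srun's -u flag, and "..." stands for the remaining options from the script above):
# In the batch script: run the step without srun-side output buffering
srun --unbuffered build/bin/arangod --server.endpoint tcp://0.0.0.0:5001 ...
# On the login node, while the job is running:
tail -f slurm-8106340.out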
