slurmctld.service: Can't open PID file No such file or directory - slurm

I get the following error message after trying to start Slurm on Ubuntu 18.04:
slurmctld.service: Can't open PID file /var/run/slurm-llnl/slurmctld.pid (yet?) after start: No such file or directory
Here is the ownership of the slurm-llnl directory:
drwxr-xr-x 2 slurm slurm 60 juin 22 11:06 slurm-llnl
In this directory I have slurmd.pid, but I do not have slurmctld.pid.
Here is my slurm.conf file:
# slurm.conf file generated by configurator easy.html.
# Put this file on all nodes of your cluster.
# See the slurm.conf man page for more information.
#
ControlMachine=daoud
#ControlAddr=
#
#MailProg=/bin/mail
MpiDefault=none
#MpiParams=ports=#-#
ProctrackType=proctrack/linuxproc
ReturnToService=1
SlurmctldPidFile=/var/run/slurm-llnl/slurmctld.pid
#SlurmctldPort=6817
SlurmdPidFile=/var/run/slurm-llnl/slurmd.pid
#SlurmdPort=6818
SlurmdSpoolDir=/var/lib/slurm-llnl/slurmd
SlurmUser=slurm
#SlurmdUser=root
StateSaveLocation=/var/spool/slurm-llnl
SwitchType=switch/none
TaskPlugin=task/none
#
#
# TIMERS
#KillWait=30
#MinJobAge=300
#SlurmctldTimeout=120
#SlurmdTimeout=300
#
#
# SCHEDULING
FastSchedule=1
SchedulerType=sched/backfill
#SchedulerPort=7321
SelectType=select/cons_res
#
#
# LOGGING AND ACCOUNTING
AccountingStorageType=accounting_storage/filetxt
ClusterName=cluster
#JobAcctGatherFrequency=30
JobAcctGatherType=jobacct_gather/linux
#SlurmctldDebug=3
SlurmctldLogFile=/var/log/slurm-llnl/slurmctld.log
#SlurmdDebug=3
SlurmdLogFile=/var/log/slurm-llnl/slurmd.log
#
#
# COMPUTE NODES
NodeName=daoud CPUs=64 Sockets=2 CoresPerSocket=16 ThreadsPerCore=2 State=UNKNOWN
PartitionName=standard Nodes=daoud Default=YES MaxTime=INFINITE State=UP

This is a message issued by systemd, not Slurm. It is caused by the use of PIDFile= in the systemd unit file, and it should not keep slurmctld from starting.
Newer versions of Slurm switched the unit to Type=simple, so a PID file is no longer needed.
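If you want to silence the warning with the packaged unit, a systemd drop-in can override it. This is only a sketch assuming the stock Ubuntu slurmctld.service; adjust the path to slurmctld if yours differs:
sudo systemctl edit slurmctld
# in the editor that opens, add the following so systemd tracks slurmctld in the foreground without a PID file:
[Service]
Type=simple
PIDFile=
ExecStart=
ExecStart=/usr/sbin/slurmctld -D
# then reload and restart:
sudo systemctl daemon-reload
sudo systemctl restart slurmctld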

Related

SLURM: Cannot use more than 8 CPUs in a job

Sorry for a stupid question. We set up a small cluster using slurm-16.05.9. The sinfo command shows:
NODELIST NODES PARTITION STATE CPUS S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON
g01 1 batch* idle 40 2:20:2 258120 63995 1 (null) none
g02 1 batch* idle 40 2:20:2 103285 64379 1 (null) none
g03 1 batch* idle 40 2:20:2 515734 64379 1 (null) none
So each node has 2 sockets with 20 cores per socket, 40 CPUs in total. However, we cannot submit a job using more than 8 CPUs. For example, with the following job description file:
#!/bin/bash
#SBATCH -J Test # Job name
#SBATCH -p batch
#SBATCH --nodes=1
#SBATCH --tasks=1
#SBATCH --cpus-per-task=10
#SBATCH -o log
#SBATCH -e err
then submitting this job gave the following error message:
sbatch: error: Batch job submission failed: Requested node configuration is not available
even when there are no jobs at all in the cluster, unless we set --cpus-per-task <= 8.
Our slurm.conf has the following contents:
ControlMachine=cbc1
ControlAddr=192.168.80.91
AuthType=auth/munge
CryptoType=crypto/munge
MpiDefault=pmi2
ProctrackType=proctrack/pgid
ReturnToService=0
SlurmctldPidFile=/var/run/slurm-llnl/slurmctld.pid
SlurmctldPort=6817
SlurmdPidFile=/var/run/slurm-llnl/slurmd.pid
SlurmdPort=6818
SlurmdSpoolDir=/var/lib/slurm-llnl/slurmd
SlurmUser=slurm
StateSaveLocation=/var/lib/slurm-llnl/slurmctld
SwitchType=switch/none
TaskPlugin=task/none
InactiveLimit=0
KillWait=30
MinJobAge=300
SlurmctldTimeout=120
SlurmdTimeout=300
Waittime=0
DefMemPerCPU=128000
FastSchedule=0
MaxMemPerCPU=128000
SchedulerType=sched/backfill
SchedulerPort=7321
SelectType=select/cons_res
SelectTypeParameters=CR_Core_Memory
AccountingStorageType=accounting_storage/none
AccountingStoreJobComment=YES
ClusterName=cluster
JobCompType=jobcomp/none
JobAcctGatherFrequency=30
JobAcctGatherType=jobacct_gather/none
SlurmctldDebug=3
SlurmctldLogFile=/var/log/slurm-llnl/slurmctld.log
SlurmdDebug=3
SlurmdLogFile=/var/log/slurm-llnl/slurmd.log
GresTypes=gpu
NodeName=g01 CPUs=40 Sockets=2 CoresPerSocket=20 ThreadsPerCore=2 RealMemory=256000 State=UNKNOWN
NodeName=g02 CPUs=40 Sockets=2 CoresPerSocket=20 ThreadsPerCore=2 RealMemory=256000 State=UNKNOWN Gres=gpu:P100:1
NodeName=g03 CPUs=40 Sockets=2 CoresPerSocket=20 ThreadsPerCore=2 RealMemory=256000 State=UNKNOWN
PartitionName=batch Nodes=g0[1-3] Default=YES MaxTime=UNLIMITED State=UP
Could anyone give us a hint on how to fix this problem?
Thank you very much.
T.H.Hsieh
The problem is most probably not the number of CPUs but the memory. The job script does not specify memory requirements, and the configuration states
DefMemPerCPU=128000
Therefore the job with 10 CPUs requests a total of 1 280 000 MB of RAM on a single node, while the maximum available is 515 734 MB. (The per-job memory computation in that old version of Slurm is possibly based on cores rather than threads, in which case the actual request would be half of that, 640 000 MB, still more than any node offers.)
A job requesting 8 CPUs under the same hypothesis requests 512 000 MB, which does fit. You can confirm this with scontrol show job <JOBID>.
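One way to verify this is to request memory explicitly, so that DefMemPerCPU no longer applies. The sketch below is only illustrative; the --mem value is an arbitrary figure that fits on your nodes, not a recommendation:
#!/bin/bash
#SBATCH -J Test
#SBATCH -p batch
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=10
#SBATCH --mem=100000      # MB; an explicit per-node request overrides DefMemPerCPU
#SBATCH -o log
#SBATCH -e err
srun hostname             # placeholder workload
If this submission is accepted, the memory default was indeed the limiting factor.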

Slurm says drained Low RealMemory

I want to install Slurm on localhost. I already installed Slurm on a similar machine and it works fine, but on this machine I get the following:
transgen@transgen-4:~/galaxy/tools/melanoma_tools$ sinfo
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
transgen-4-partition* up infinite 1 drain transgen-4
transgen@transgen-4:~/galaxy/tools/melanoma_tools$ sinfo -Nel
Fri Jun 25 17:42:56 2021
NODELIST NODES PARTITION STATE CPUS S:C:T MEMORY TMP_DISK WEIGHT AVAIL_FE REASON
transgen-4 1 transgen-4-partition* drained 48 1:24:2 541008 0 1 (null) Low RealMemory
transgen@transgen-4:~/galaxy/tools/melanoma_tools$ srun -n8 sleep 10
srun: Required node not available (down, drained or reserved)
srun: job 5 queued and waiting for resources
^Csrun: Job allocation 5 has been revoked
srun: Force Terminated job 5
I found the following advice:
sudo scontrol update NodeName=transgen-4 State=DOWN Reason=hung_completing
sudo systemctl restart slurmctld slurmd
sudo scontrol update NodeName=transgen-4 State=RESUME
but it had no effect.
slurm.conf:
# slurm.conf file generated by configurator easy.html.
# Put this file on all nodes of your cluster.
# See the slurm.conf man page for more information.
#
SlurmctldHost=localhost
#
#MailProg=/bin/mail
MpiDefault=none
#MpiParams=ports=#-#
ProctrackType=proctrack/cgroup
ReturnToService=1
SlurmctldPidFile=/var/run/slurmctld.pid
#SlurmctldPort=6817
SlurmdPidFile=/var/run/slurmd.pid
#SlurmdPort=6818
SlurmdSpoolDir=/var/spool/slurmd
SlurmUser=slurm
#SlurmdUser=root
StateSaveLocation=/var/spool/slurm.state
SwitchType=switch/none
TaskPlugin=task/cgroup
#
#
# TIMERS
#KillWait=30
#MinJobAge=300
#SlurmctldTimeout=120
#SlurmdTimeout=300
#
#
# SCHEDULING
SchedulerType=sched/backfill
SelectType=select/cons_res
SelectTypeParameters=CR_Core
#
#
# LOGGING AND ACCOUNTING
AccountingStorageType=accounting_storage/none
ClusterName=cluster
#JobAcctGatherFrequency=30
JobAcctGatherType=jobacct_gather/linux
#SlurmctldDebug=info
#SlurmctldLogFile=
#SlurmdDebug=info
#SlurmdLogFile=
#
#
# COMPUTE NODES
NodeName=transgen-4 NodeAddr=localhost CPUs=48 Sockets=1 CoresPerSocket=24 ThreadsPerCore=2 RealMemory=541008 State=UNKNOWN
PartitionName=transgen-4-partition Nodes=transgen-4 Default=YES MaxTime=INFINITE State=UP
cgroup.conf:
###
# Slurm cgroup support configuration file.
###
CgroupAutomount=yes
CgroupMountpoint=/sys/fs/cgroup
ConstrainCores=no
ConstrainDevices=yes
ConstrainKmemSpace=no #avoid known Kernel issues
ConstrainRAMSpace=no
ConstrainSwapSpace=no
TaskAffinity=no #use task/affinity plugin instead
How can I get Slurm working?
Thanks in advance.
The cause could be that RealMemory=541008 in slurm.conf is too high for your system. Try lowering the value. Let's suppose you do have 541 GB of RAM installed: change it to RealMemory=500000, do a scontrol reconfigure and then a scontrol update nodename=transgen-4 state=resume.
If that works, you could try to raise the value a bit.
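To pick a safe value, you can compare what slurmd detects with what the kernel reports. These are standard commands; the grep filter is just a convenience:
slurmd -C                                      # prints the node description slurmd detects, including RealMemory
free -m                                        # total memory in MB; keep RealMemory at or slightly below this
scontrol show node transgen-4 | grep -i mem    # configured versus detected memory values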

Slurmd fails to start with the following error: fatal: Unable to determine this slurmd's NodeName

I'm trying to set up Slurm on a bunch of AWS instances, but whenever I try to start the head node it gives me the following error:
fatal: Unable to determine this slurmd's NodeName
I've set up the instances' /etc/hosts so they can address each other as node1-6, with node6 being the head node. This is the hosts file for node6; all the other nodes have a similar hosts file.
/etc/hosts file:
127.0.0.1 localhost node6
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts
<Node1 IP> node1
<Node2 IP> node2
<Node3 IP> node3
<Node4 IP> node4
<Node5 IP> node5
/etc/slurm-llnl/slurm.conf:
###############################################################################
# Sample configuration file for SLURM 2
###############################################################################
#
# This file holds the system-wide SLURM configuration. It is read
# by SLURM clients, daemons, and the SLURM API to determine where
# and how to contact the SLURM controller, what other nodes reside
# in the current cluster, and various other configuration information.
#
# SLURM configuration parameters take the form Keyword=Value, where
# at this time, no spacing is allowed to surround the equals (=) sign.
# Many of the config values are not mandatory, and so may be left
# out of the config file. We will attempt to list the default
# values for those parameters in this file.
#
# This simple configuration provides a control machine named "laptop"
# to run the Slurm's central management daemon and a single node
# named "server" which execute jobs. Both machine should have Slurm
# installed and use this configuration file. If you have a similar
# configuration just change the values of ControlMachine, for the
# control machine and PartitionName and NodeName for job execution
#
###############################################################################
#
ControlMachine=node6
#ControlAddr=
#BackupController=
#BackupAddr=
#
AuthType=auth/munge
CacheGroups=0
#CheckpointType=checkpoint/none
CryptoType=crypto/munge
#DisableRootJobs=NO
#EnforcePartLimits=NO
#Epilog=
#PrologSlurmctld=
#FirstJobId=1
JobCheckpointDir=/var/lib/slurm-llnl/checkpoint
#JobCredentialPrivateKey=
#JobCredentialPublicCertificate=
#JobFileAppend=0
#JobRequeue=1
#KillOnBadExit=0
#Licenses=foo*4,bar
#MailProg=/usr/bin/mail
#MaxJobCount=5000
MpiDefault=none
#MpiParams=ports:#-#
#PluginDir=
#PlugStackConfig=
#PrivateData=jobs
ProctrackType=proctrack/pgid
#Prolog=
#PrologSlurmctld=
#PropagatePrioProcess=0
#PropagateResourceLimits=
#PropagateResourceLimitsExcept=
ReturnToService=1
#SallocDefaultCommand=
SelectType=select/cons_res
SelectTypeParameters=CR_Core
SlurmctldPidFile=/var/run/slurm-llnl/slurmctld.pid
SlurmctldPort=6817
SlurmdPidFile=/var/run/slurm-llnl/slurmd.pid
SlurmdPort=6818
SlurmdSpoolDir=/var/lib/slurm-llnl/slurmd
SlurmUser=slurm
#SrunEpilog=
#SrunProlog=
StateSaveLocation=/var/lib/slurm-llnl/slurmctld
SwitchType=switch/none
#TaskEpilog=
TaskPlugin=task/none
#TaskPluginParam=
#TaskProlog=
#TopologyPlugin=topology/tree
#TmpFs=/tmp
#TrackWCKey=no
#TreeWidth=
#UnkillableStepProgram=
#UnkillableStepTimeout=
#UsePAM=0
#
#
# TIMERS
#BatchStartTimeout=10
#CompleteWait=0
#EpilogMsgTime=2000
#GetEnvTimeout=2
#HealthCheckInterval=0
#HealthCheckProgram=
InactiveLimit=0
KillWait=30
#MessageTimeout=10
#ResvOverRun=0
MinJobAge=300
#OverTimeLimit=0
SlurmctldTimeout=300
SlurmdTimeout=300
#UnkillableStepProgram=
#UnkillableStepTimeout=60
Waittime=0
#
#
# SCHEDULING
#DefMemPerCPU=0
FastSchedule=1
#MaxMemPerCPU=0
#SchedulerRootFilter=1
#SchedulerTimeSlice=30
SchedulerType=sched/backfill
SchedulerPort=7321
#SelectType=select/linear
#SelectTypeParameters=
#
#
# JOB PRIORITY
#PriorityType=priority/basic
#PriorityDecayHalfLife=
#PriorityFavorSmall=
#PriorityMaxAge=
#PriorityUsageResetPeriod=
#PriorityWeightAge=
#PriorityWeightFairshare=
#PriorityWeightJobSize=
#PriorityWeightPartition=
#PriorityWeightQOS=
#
#
# LOGGING AND ACCOUNTING
#AccountingStorageEnforce=0
#AccountingStorageHost=
#AccountingStorageLoc=
#AccountingStoragePass=
#AccountingStoragePort=
AccountingStorageType=accounting_storage/none
#AccountingStorageUser=
ClusterName=cluster
#DebugFlags=
#JobCompHost=
#JobCompLoc=
#JobCompPass=
#JobCompPort=
JobCompType=jobcomp/none
#JobCompUser=
JobAcctGatherFrequency=30
JobAcctGatherType=jobacct_gather/none
SlurmctldDebug=3
SlurmctldLogFile=/var/log/slurm-llnl/slurmctld.log
SlurmdDebug=3
SlurmdLogFile=/var/log/slurm-llnl/slurmd.log
#
#
# POWER SAVE SUPPORT FOR IDLE NODES (optional)
#SuspendProgram=
#ResumeProgram=
#SuspendTimeout=
#ResumeTimeout=
#ResumeRate=
#SuspendExcNodes=
#SuspendExcParts=
#SuspendRate=
#SuspendTime=
#
#
# COMPUTE NODES
NodeName=node1 Procs=1 State=UNKNOWN
NodeName=node2 Procs=1 State=UNKNOWN
NodeName=node3 Procs=1 State=UNKNOWN
NodeName=node4 Procs=1 State=UNKNOWN
NodeName=node5 Procs=1 State=UNKNOWN
NodeName=node6 Procs=1 State=UNKNOWN
#PartitionName=debug Nodes=server Default=YES MaxTime=INFINITE State=UP
PartitionName=mycluster Nodes=node[1-6] Default=YES MaxTime=INFINITE State=UP
The problem is with your
ControlMachine=node6
Replace node6 with the result of hostname -s. Also specify the IP address as reported by ifconfig:
ControlAddr=<your local ip>
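A quick way to check the mapping slurmd relies on is shown below; these are standard Slurm commands, and the -N flag is only a diagnostic workaround, not a permanent fix:
hostname -s                     # must match one of the NodeName= entries in slurm.conf
slurmd -C                       # prints the node description slurmd derives for this host
sudo slurmd -N node6 -D -vvv    # start slurmd in the foreground with an explicit NodeName to test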

How to make job allocation by node group in a partition in SLURM

I am using the Slurm job scheduler.
The HPC system consists of two groups of nodes, ddcd[00-31] and ddcb[00-31].
The two groups have different hardware specs (40 cores and 16 cores per node) but are in the same partition.
I would like Slurm to allocate a job within one of the node groups instead of mixing or spreading it across the two groups.
For instance, a job of 160 cores should be allocated on 10 nodes of ddcb or 4 nodes of ddcd.
I have set a node weight on each node group, but it does not seem to work; some mixed allocations were observed.
Any help would be appreciated.
my slurm.conf is as follows:
SlurmctldHost=mynode
MpiDefault=none
ProctrackType=proctrack/pgid
ReturnToService=1
SlurmctldPidFile=/var/run/slurmctld.pid
SlurmdPidFile=/var/run/slurmd.pid
SlurmdSpoolDir=/var/spool/slurmd
SlurmUser=slurm
StateSaveLocation=/var/spool/slurmctld
SwitchType=switch/none
TaskPlugin=task/none
JobRequeue=0
# JOB PRIORITY
#PriorityType=priority/multifactor
PriorityDecayHalfLife=14-0
PriorityCalcPeriod=5
PriorityFavorSmall=NO
PriorityMaxAge=14-0
PriorityUsageResetPeriod=NONE
PriorityWeightAge=10000
PriorityWeightFairshare=0
PriorityWeightJobSize=100000
PriorityWeightPartition=0
PriorityWeightQOS=1000000
#
AuthType=auth/munge
CryptoType=crypto/munge
#
PrologFlags=Alloc
#PrologFlags=x11
# SCHEDULING
FastSchedule=1
SchedulerType=sched/backfill
SchedulerParameters=enable_user_top
SelectType=select/linear
#
PropagateResourceLimitsExcept=MEMLOCK
#
# LOGGING AND ACCOUNTING
AccountingStorageType=accounting_storage/slurmdbd
AccountingStorageEnforce=qos,limits,
ClusterName=ssmbhpc
JobAcctGatherType=jobacct_gather/none
SlurmctldLogFile=/var/log/slurmctld.log
SlurmdLogFile=/var/log/slurmd.log
#
#
# COMPUTE NODES
NodeName=ddcd[00-31] Sockets=2 CoresPerSocket=20 ThreadsPercore=1 Weight=10 State=UNKNOWN
NodeName=ddcb[00-31] Sockets=2 CoresPerSocket=8 ThreadsPercore=1 Weight=200 State=UNKNOWN
#
# Partition
PartitionName=debug Nodes=ddcd[00-31] Default=YES MaxTime=INFINITE State=UP
PartitionName=strp Nodes=ddcd[00-31],ddcb[00-31] Default=No MaxTime=INFINITE State=UP QOS=normal
I found that this is achievable with node features and sbatch --constraint; see the sketch below.
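A minimal sketch of that approach, with the feature names ddcd and ddcb chosen here purely as labels:
# slurm.conf: tag each node group with a feature
NodeName=ddcd[00-31] Sockets=2 CoresPerSocket=20 ThreadsPerCore=1 Weight=10 Feature=ddcd State=UNKNOWN
NodeName=ddcb[00-31] Sockets=2 CoresPerSocket=8 ThreadsPerCore=1 Weight=200 Feature=ddcb State=UNKNOWN
# job script: pin the job to one group
#SBATCH --constraint=ddcd
# or let Slurm choose either group, but never mix them:
#SBATCH --constraint="[ddcd|ddcb]"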

rsyslog doesn't send any information

Good day, everyone!
I'm trying to send log files from a Red Hat based server to a Graylog server using rsyslog, but I can't get it to work because rsyslog doesn't send anything.
I would really appreciate it if someone could help!
Graylog receives messages sent with:
echo "Hello Graylog, let's be friends." | nc -w 1 -u "my-graylog-ip" 13101
I have configured rsyslog.conf as follows:
# rsyslog configuration file
# For more information see /usr/share/doc/rsyslog-*/rsyslog_conf.html
# If you experience problems, see http://www.rsyslog.com/doc/troubleshoot.html
#### MODULES ####
# The imjournal module bellow is now used as a message source instead of imuxsock.
$ModLoad imuxsock # provides support for local system logging (e.g. via logger command)
$ModLoad imjournal # provides access to the systemd journal
#$ModLoad imklog # reads kernel messages (the same are read from journald)
#$ModLoad immark # provides --MARK-- message capability
# Provides UDP syslog reception
#$ModLoad imudp
#$UDPServerRun 514
# Provides TCP syslog reception
#$ModLoad imtcp
#$InputTCPServerRun 514
#### GLOBAL DIRECTIVES ####
# Where to place auxiliary files
$WorkDirectory /var/lib/rsyslog
# Use default timestamp format
$ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat
# File syncing capability is disabled by default. This feature is usually not required,
# not useful and an extreme performance hit
#$ActionFileEnableSync on
# Include all config files in /etc/rsyslog.d/
$IncludeConfig /etc/rsyslog.d/*.conf
# Turn off message reception via local log socket;
# local messages are retrieved through imjournal now.
$OmitLocalLogging on
# File to store the position in the journal
$IMJournalStateFile imjournal.state
#### RULES ####
# Log all kernel messages to the console.
# Logging much else clutters up the screen.
#kern.* /dev/console
# Log anything (except mail) of level info or higher.
# Don't log private authentication messages!
*.info;mail.none;authpriv.none;cron.none /var/log/messages
# The authpriv file has restricted access.
authpriv.* /var/log/secure
# Log all the mail messages in one place.
mail.* -/var/log/maillog
# Log cron stuff
cron.* /var/log/cron
# Everybody gets emergency messages
*.emerg :omusrmsg:*
# Save news errors of level crit and higher in a special file.
uucp,news.crit /var/log/spooler
# Save boot messages also to boot.log
local7.* /var/log/boot.log
# ### begin forwarding rule ###
# The statement between the begin ... end define a SINGLE forwarding
# rule. They belong together, do NOT split them. If you create multiple
# forwarding rules, duplicate the whole block!
# Remote Logging (we use TCP for reliable delivery)
#
# An on-disk queue is created for this action. If the remote host is
# down, messages are spooled to disk and sent when it is up again.
#$ActionQueueFileName fwdRule1 # unique name prefix for spool files
#$ActionQueueMaxDiskSpace 1g # 1gb space limit (use as much as possible)
#$ActionQueueSaveOnShutdown on # save messages to disk on shutdown
#$ActionQueueType LinkedList # run asynchronously
#$ActionResumeRetryCount -1 # infinite retries if host is down
# remote host is: name/ip:port, e.g. 192.168.0.1:514, port optional
*.* @http://"my-graylog-ip":13101
# ### end of the forwarding rule ###
Then I restarted rsyslog:
systemctl restart rsyslog
In /etc/rsyslog.d/ there is grid.conf:
# File 1
$ModLoad imfile
$InputFileName /home/ucp/current-envelope/log/envelope.log
$InputFileTag def-grid101-envelope
$InputFileFacility local0
$InputRunFileMonitor
# File 2
ModLoad imfile
$InptFileName /home/ucp/current-envelope/log/envelope-err.log
$InputFileTag def-grid101-envelope
$InputFileFacility local0
$InputRunFileMonitor
# File 3
$ModLoad imfile
$InputFileName /home/ucp/current-envelope/log/performance.log
$InputFileTag def-grid101-envelope
$InputFileFacility local0
$InputRunFileMonitor
journalctl -b | grep rsyslog shows the following messages:
Mar 23 10:24:01 grid101 kernel: type=1130 audit(1490253841.179:135915): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=rsyslog comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Mar 23 10:45:02 grid101 kernel: type=1131 audit(1490255102.506:135925): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=rsyslog comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Mar 23 10:45:02 grid101 kernel: type=1130 audit(1490255102.512:135926): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=rsyslog comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
The IP:port configuration is not right. For TCP it must be changed to something like this:
*.* @@my-graylog-ip:13101
I wonder why your Graylog server is using TCP port 13101; I guess it is behind a NAT or something like that. I also wonder why you want the http:// part in your configuration.
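Since your nc test used UDP (-u), the single-@ UDP form may actually match the existing Graylog input. The snippet below is only a sketch, and the file name is chosen for illustration:
# /etc/rsyslog.d/forward-graylog.conf (hypothetical file name)
*.* @my-graylog-ip:13101       # single @ = UDP, matching the nc -u test
#*.* @@my-graylog-ip:13101     # double @@ = TCP, if a TCP input is configured in Graylog
Then restart rsyslog and send a test message:
systemctl restart rsyslog
logger "test message for Graylog"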
