In PBS, how to specify nodes NOT to use

When you submit jobs to a PBS server, is it possible to specify the nodes that we do NOT want to use?
Thanks

You may use the excludenodes resource to exclude specific hosts, e.g.
#PBS -l excludenodes=host1:host2
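For context, a minimal job script sketch using that directive; this assumes your PBS variant accepts the excludenodes resource shown above, and the job name, resource request and command are placeholders:
#!/bin/bash
#PBS -N my_job                        # placeholder job name
#PBS -l nodes=4:ppn=8                 # placeholder resource request
#PBS -l excludenodes=host1:host2      # skip these two hosts (if your PBS supports it)
cd $PBS_O_WORKDIR
./run_analysis                        # placeholder command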

Related

Bootstrap many new Cassandra nodes into a cluster with no errors

I have a cluster of about 100 nodes and it is growing. I need to add 10-50 nodes on request. As far as I know, Cassandra has cassandra.consistent.rangemovement=true by default, which means multiple nodes cannot bootstrap at the same time.
Anyway, when I add many nodes using Terraform and some kind of default configuration (using Puppet), at least 2-3 of them end up in the UJ state and eventually only one bootstraps successfully. Earlier I used a random delay before starting cassandra.service, but that doesn't work when adding 10+ nodes.
I'm trying to figure out how to implement some kind of "lock" for bootstrapping.
I have Consul and can take a lock for the bootstrap in its KV store, for instance via the systemd ExecStartPre feature, but I can't figure out how to release it after the bootstrap.
I'm looking for any solution to this.
I've done something similar using Rundeck before. Basically, we had Rundeck kick off a bash script, taking parameters about the deployment of our nodes as well as how many to add.
What we did was parse the output of nodetool status. We'd count the total number of nodes as well as the number of UN indicators. If those two numbers didn't match, we'd sleep 30s and try again.
Once those numbers matched, we knew it was safe to add another node. The whole operation could take a while to add all the nodes, but it worked.
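A minimal sketch of that wait loop, assuming nodetool is on the PATH of the host running the script; NODES_TO_ADD and add_next_node are placeholders for however many nodes you want and whatever Terraform/Puppet step brings one up:
#!/bin/bash
# Wait until every node reported by `nodetool status` is Up/Normal (UN)
# before triggering the next bootstrap.
wait_for_stable_ring() {
  while true; do
    total=$(nodetool status | grep -c -E '^[UD][NLJM]')   # lines describing a node (UN, UJ, DN, ...)
    up_normal=$(nodetool status | grep -c '^UN')
    if [ "$total" -eq "$up_normal" ]; then
      break                                  # ring is stable, safe to add the next node
    fi
    sleep 30
  done
}

for i in $(seq 1 "$NODES_TO_ADD"); do
  wait_for_stable_ring
  add_next_node                              # placeholder: your Terraform/Puppet step
done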

Best practices to submit a huge number of jobs with Slurm

I need to submit several thousand jobs to our cluster. Each job needs around six hours to complete. This would take around a week if I used all available resources. Theoretically I could do that, but then I would block all other users for a week, so this is not an option.
I have two ideas that could possibly solve the problem:
Create an array job and limit the maximum number of running jobs. I don't like this option because quite often (overnight, on weekends, etc.) no one uses the cluster and my jobs could not use those idle resources.
Submit all jobs at once but somehow set the priority of each job really low. Ideally anyone could still use the cluster, because when they submit jobs, theirs would start sooner than mine. I do not know whether this is possible in Slurm and whether I would have permission to do that.
Is there a Slurm mechanism I am missing? Is it possible to set the priority of a Slurm job as described above, and would I have permission to do that?
Generally this is a cluster admin problem. They should configure the cluster in a way that prioritizes short and small jobs over long and large ones, and/or prevents large jobs from running on some nodes.
However, you can also manually reduce the priority of your jobs as a non-admin with the nice factor option (higher value -> lower priority):
sbatch --nice=POSITIVE_NUMBER script.sh
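For illustration, here is how the two ideas from the question could be combined; the nice value, the array size and the %50 concurrency cap are arbitrary placeholders, not recommendations:
# Option 2: submit with a low priority so other users' jobs jump ahead
sbatch --nice=10000 script.sh
# Option 1: submit as one array job, with at most 50 array tasks running at once
sbatch --array=1-5000%50 script.sh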

Slurm: split a single node into multiple nodes

I'm setting up a SLURM cluster with two 'physical' nodes.
Each of the two nodes has two GPUs.
I would like to give the option to use only one of the GPUs (and have the other GPU still available for computation).
I managed to set up something with GRES, but I later realized that even if only one of the GPUs is used, the whole node is occupied and the other GPU cannot be used.
Is there a way to set the GPUs as the consumables and have two 'nodes' within a single node? And to assign a limited number of CPUs and memory to each?
I've had the same problem and I managed to make it work by allowing oversubscription.
Here's the documentation about it:
https://slurm.schedmd.com/cons_res_share.html
Not sure if what I did was exactly right, but I set
SelectType=select/cons_tres and SelectTypeParameters=CR_Core in slurm.conf, and put OverSubscribe=FORCE on my partition. Now I can launch several GPU jobs on the same node.
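A sketch of what such a configuration might look like, mirroring the settings above; the node name, CPU/memory counts, GPU device files and the job's resource request are made-up placeholders to adapt to your hardware:
# slurm.conf (fragments)
SelectType=select/cons_tres
SelectTypeParameters=CR_Core
GresTypes=gpu
NodeName=node01 Gres=gpu:2 CPUs=32 RealMemory=128000
PartitionName=gpu Nodes=node01 OverSubscribe=FORCE Default=YES
# gres.conf on node01
Name=gpu File=/dev/nvidia0
Name=gpu File=/dev/nvidia1
# A job asking for roughly half the node: one GPU and 16 CPUs
sbatch --gres=gpu:1 --cpus-per-task=16 job.sh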

Slurm - I/O shared between two nodes? Is that possible?

I am working with NGS data and the newest test files are massive.
Normally our pipeline uses just one node, and the output from the different tools goes to its ./scratch folder.
Using just one node is not possible with the current massive data set. That's why I would like to use at least 2 nodes to solve issues such as speed, not all jobs being submitted, etc.
Using multiple nodes or even multiple partitions is easy - I know which parameters to use for that step.
So my issue is not about missing parameters, but about the logic behind Slurm for solving the following I/O problem:
Let's say I have tool-A. Tool-A is running with 700 jobs across two nodes (340 jobs on node1 and 360 jobs on node2) - the output is saved to ./scratch on each node separately.
Tool-B is using the results from tool-A - which are on two different nodes.
What is the best approach to fix that?
- Is there a parameter which tells Slurm which jobs belong together and where to find the input for tool-B?
- Would it be smarter to change the output location from ./scratch to a local folder?
- Or would it be better to merge the output from tool-A from both nodes onto one node?
- Any other ideas?
I hope I made my issue simple to understand... Please excuse me if that is not the case!
My naive suggestion would be: why not share a scratch NFS volume across all nodes? That way, all of the output data from tool-A would be accessible to tool-B whatever the node. It might not be the best solution for read/write speed, but to my mind it would be the easiest for your situation.
A more software-oriented solution (not too hard to develop) would be to implement a database that tracks where the files have been generated.
I hope it helps!
... just for those coming across this via search engines: if you cannot use any kind of shared filesystem (NFS, GPFS, Lustre, Ceph) and your data sets are not all massive, you could use "staging", meaning data transfer before and after your job really runs.
Though this is termed "cast"ing in the Slurm universe, it generally means you define:
- files to be copied to all nodes assigned to your job BEFORE the job starts
- files to be copied back from the nodes assigned to your job AFTER the job completes
This can be a way to get everything needed back and forth from/to your job's nodes even without a shared file system.
Check the man page of "sbcast" and amend your sbatch job scripts accordingly.
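As a rough sketch of what that staging can look like inside a job script; the file names and scratch path are placeholders, and sgather is used on the assumption that your Slurm installation ships it alongside sbcast:
#!/bin/bash
#SBATCH --nodes=2
# Create a node-local scratch directory on every node of the allocation.
srun --ntasks-per-node=1 mkdir -p /scratch/$SLURM_JOB_ID
# Stage the input out to node-local scratch on every allocated node.
sbcast reads.fastq /scratch/$SLURM_JOB_ID/reads.fastq
# Run tool-A against the node-local copy (placeholder command).
srun tool-A --in /scratch/$SLURM_JOB_ID/reads.fastq --out /scratch/$SLURM_JOB_ID/toolA.out
# Collect the per-node results back to the submit directory;
# sgather appends the source hostname to each destination file name.
sgather /scratch/$SLURM_JOB_ID/toolA.out results/toolA.out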

Multiple identical azure WebJobs with different parameters

I need to run identical jobs on a schedule, and they differ only in a few strings.
As you may know, there is no convenient way to create identical WebJobs with different parameters. For now I prefer a "codeless" way to do this, or one with as little code as possible.
So let's imagine the parameters are stored in rows of a JobsConfigurations table in the website-related database.
How can I get the name of the currently running job, to pick the right configuration from the table?
Thanks for help!
See https://github.com/projectkudu/kudu/wiki/Web-Jobs#environment-settings
The WEBJOBS_NAME environment variable will give you the name of the current WebJob.
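A tiny sketch of reading that variable from a script-based WebJob; run.sh is just an example name, and the actual lookup is whatever query suits your JobsConfigurations table:
#!/bin/bash
# run.sh - every scheduled WebJob copy runs the same script;
# WEBJOBS_NAME tells it which configuration row to load.
job_name="$WEBJOBS_NAME"
echo "Running as WebJob: $job_name"
# e.g. pass $job_name as the key when querying the JobsConfigurations table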
