Scheduling: How to remove slurm cloud nodes without marking them as down? - slurm

Current behavior
When a new instance is started by slurm, that instance is listed as a cloud node in sinfo. Sometimes we don't want to wait until slurm terminates the instance (after SuspendTime) and "releases" the node (so a new instance can claim that node's place), but want to terminate an instance and "release" the node manually. The only way we currently know of is:
1. Terminate the instance.
2. Set the node state to down:
   sudo scontrol update NodeName=$NODE_NAME state=DOWN reason=NoReason
3. Wait until it is no longer shown in sinfo, then resume the node:
   sudo scontrol update NodeName=$NODE_NAME state=RESUME reason=NoReason
After that the node can be used by slurm again.
However, this is not really a good solution, since you still have to wait quite some time.
Wanted behavior
After terminating the instance manually I would like to call a command that instantly "releases" a node, allowing a new instance to take its name and place once it's needed.
Why?
When updating the master, we don't want any active worker instances.

Related

Is it possible to restart a process in Google Cloud Run

We have multiple Google Cloud Run services running for an API. There is one parent service and multiple child services. When the parent service starts it loads a schema from all the children.
Currently there isn't a way to tell the parent process to reload the schema so when a new child is deployed the parent service needs to be restarted to reload the schema.
We understand that there are one or more instances of Google Cloud Run running and have ideas on dealing with this, but are wondering if there is a way to restart the parent process at all. Without a way to achieve that, one or more is irrelevant for now. The only way we have found is redeploying the parent, which seems like overkill.
The containers running in Google Cloud Run are Alpine Linux with Node.js, running an Express application/middleware. I can stop the Node application but not restart it. If I stop the service, Google Cloud Run may still continue to serve traffic to that instance, causing errors.
Perhaps I can stop the Express service so Google Cloud Run will replace that instance? Is this a possibility? Is there a graceful way to do it so it tries to complete any current requests first (not simply kill Express)?
Looking for any approaches to force Google Cloud Run to restart or start new instances. Thoughts?
Your design is, at a high level, a cache system: the parent service gets the data from the child services and caches it.
Therefore, you have all the difficulties of cache management, especially cache invalidation. There is no easy solution for that, but my recommendation is to use Memorystore, where each child service publishes the latest version number of its schema (at container startup, for example). The parent service then checks (on each request, for example) in Memorystore (single-digit-millisecond latency) whether a new version is available. If there is, it requests the child service again and updates its schema cache.
If applicable, you can also set a TTL on your cache and reload it every minute for example.
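The version-check approach can be sketched as plain logic, independent of any particular Memorystore client. In this sketch, the `getVersion` and `fetchSchema` helpers are assumptions for illustration: the first would read a child's published version number from Memorystore, the second would call the child service.

```javascript
// Sketch of the parent's schema cache with version-based invalidation.
// `getVersion` and `fetchSchema` are injected so the caching logic stays
// independent of the Redis/HTTP client libraries actually used.
class SchemaCache {
  constructor(getVersion, fetchSchema) {
    this.getVersion = getVersion;
    this.fetchSchema = fetchSchema;
    this.version = null;
    this.schema = null;
  }

  // Called on each request: cheap version check, reload only on change.
  async get() {
    const latest = await this.getVersion();
    if (latest !== this.version) {
      this.schema = await this.fetchSchema();
      this.version = latest;
    }
    return this.schema;
  }
}
```

With this shape, the per-request cost is a single Memorystore read; the expensive child-service call happens only when a child publishes a new version.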
EDIT 1
If I focus only on Cloud Run: under one condition only, you can restart your container without deploying a new version: set the max-instances parameter to 1 and implement an exit endpoint (which simply calls os.exit() or similar in your code).
OK, you lose all the scale-up capacity, but it's the only case where, with a special exit endpoint, you can exit the container and force Cloud Run to reload it on the next request.
If you have more than one instance, you won't be able to restart all the running instances, only the one that handles the "exit" request.
Therefore, the only solution is to deploy a new revision (simply redeploy, without any code/config change).

Shutdown system or stop AWS EC2 instance from NodeJS

I have AWS EC2 instances running Debian with systemd running Node as a service. (Hereinafter these instances are called the "Node servers".)
The Node servers are started by another instance (hereinafter called "the manager instance") that is permanently on.
When a Node server experiences some predefined period of inactivity, I want it to shut down automatically.
I am considering the following options:
1. (After sensing a period of inactivity in Node) execute a child_process in Node that runs the shutdown now command.
2. (After sensing a period of inactivity in Node) call the AWS SDK's stopInstances with the instance's own resource ID.
3. Expose an HTTP GET endpoint called last-request-time on each Node server, which is periodically polled by the manager instance, which then decides whether/when to call the AWS SDK's stopInstances.
I am unsure which of these approaches to take and would appreciate any advice. Explicitly shutting down a machine from Node running on that same machine feels somehow inappropriate. But option 3 requires periodic HTTP polling, not to mention that it feels more risky to rely on another instance for auto-shutdown. (If the manager is down all the instances keep going.)
Or perhaps it is possible to get systemd to shut down the machine when a particular service exits with a particular code? This, if possible, would feel like the best solution as the Node process would only need to abort itself after the period of inactivity with a particular exit code.
You could create a Lambda function that acts as an API and uses the SDK's stopInstances functionality.
That would also give it the full functionality of a "manager instance" and save even more on instances, since it will only run when needed.
Or you could cut out the middle-man and migrate the "Node servers" to lambda.
(lambda documentation)

How to create a livenessprobe for a node.js container that is not a server?

I have to create a readiness and liveness probe for a Node.js container (Docker) in Kubernetes. My problem is that the container is NOT a server, so I cannot use an HTTP request to see if it is live.
My container runs a node-cron process that downloads some CSV files every 12 h, parses them, and inserts the results into Elasticsearch.
I know I could add Express.js, but I would rather not do that just for a probe.
My question is:
Is there a way to use some kind of liveness command probe? If it is possible, what command can I use?
Inside the container, I have pm2 running the process. Can I use it in any way for my probe and, if so, how?
Liveness command
You can use a liveness command as you describe. However, I would recommend designing your job/task for Kubernetes.
Design for Kubernetes
My container runs a node-cron process that downloads some CSV files every 12 h, parses them, and inserts the results into Elasticsearch.
Your job does not execute very often; if you deploy it as a service, it will take up resources all the time. And since you write that you want to use pm2 for your process, I would recommend another design. As I understand it, PM2 is a process manager, but Kubernetes is also a process manager, in a way.
Kubernetes native CronJob
Instead of handling a process with pm2, package your process as a container image and schedule it with a Kubernetes CronJob, where you specify your image in the jobTemplate. With this design, you don't have any livenessProbe, but your task will be restarted if it fails, e.g. if it fails to insert the result into Elasticsearch due to a network problem.
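A CronJob manifest for a 12-hourly run might look like this (the image name, schedule, and retry counts are placeholders):

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: csv-to-elasticsearch
spec:
  schedule: "0 */12 * * *"        # every 12 hours
  concurrencyPolicy: Forbid       # don't start a run while one is active
  jobTemplate:
    spec:
      backoffLimit: 3             # retry a failed run up to 3 times
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: importer
            image: registry.example.com/csv-importer:latest
```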
First, you should certainly consider a Kubernetes CronJob for this workload. That said, it may not be appropriate for your job, for example if your job takes the majority of the time between scheduled runs to run, or you need more complex interactions between error handling in your job and scheduling. Finally, you may even want a liveness probe running for the container spawned by the CronJob if you want to check that the job is making progress as it runs -- this uses the same syntax as you would use with a normal job.
I'm less familiar with pm2, though I don't think you should need to use the additional job management inside of Kubernetes, which should already provide most of what you need.
That said, it is certainly possible to use an arbitrary command for your liveness probe, and as you noted it is even explicitly covered in the Kubernetes liveness/readiness probe documentation.
You just add an exec member to the livenessProbe stanza for the container, like so:
livenessProbe:
  exec:
    command:
    - cat
    - /tmp/healthy
  initialDelaySeconds: 5
  periodSeconds: 5
If the command returns 0 (i.e. succeeds), then the kubelet considers the container to be alive and healthy. (In this trivial example, the container is considered healthy only while /tmp/healthy exists.)
In your case, I can think of several possibilities to use. As one example, the job could probably be configured to drop a sentinel file that indicates it is making progress in some way. For example, append the name and timestamp of the last file copied. The liveness command would then be a small script that could read that file and ensure that there has been adequate progress (e.g. in the cron job case, that a file has been copied within the last few minutes).
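The sentinel-file idea could look like the following probe: the job touches or appends to a progress file, and the probe checks that the file has been modified recently. The path and time thresholds here are placeholders.

```yaml
livenessProbe:
  exec:
    command:
    - /bin/sh
    - -c
    # healthy only if the sentinel was modified in the last 30 minutes
    - test -n "$(find /tmp/last-progress -mmin -30 2>/dev/null)"
  initialDelaySeconds: 60
  periodSeconds: 300
```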
Readiness probes probably don't make sense in the context of the service you describe, since they're more about not sending the application traffic, but they can also have a similar stanza, just for readinessProbe rather than livenessProbe.

Running command on EC2 launch and shutdown in auto-scaling group

I'm running a Docker swarm deployed on AWS. The setup is an auto-scaling group of EC2 instances that each act as Docker swarm nodes.
When the auto-scaling group scales out (spawns new instance) I'd like to run a command on the instance to join the Docker swarm (i.e. docker swarm join ...) and when it scales in (shuts down instances) to leave the swarm (docker swarm leave).
I know I can do the first one with user data in the launch configuration, but I'm not sure how to act on shutdown. I'd like to make use of lifecycle hooks, and the docs mention I can run custom actions on launch/terminate, but it is never explained how to actually do this. It should be possible without sending SQS/SNS/CloudWatch events, right?
My AMI is a custom one based off of Ubuntu 16.04.
Thanks.
One of the core issues is that removing a node from a Swarm is currently a 2- or 3-step action when done gracefully, and some of those actions can't be done on the node that's leaving:
1. docker node demote, if the leaving node is a manager
2. docker swarm leave on the leaving node
3. docker node rm on a manager
This step 3 is what's tricky because it requires you to do one of three things to complete the removal process:
Put something on a worker that would let it do things on a manager remotely (ssh to a manager with sudo perms, or docker manager API access). Not a good idea. This breaks the security model of "workers can't do manager things" and greatly increases risk, so not recommended. We want our managers to stay secure, and our workers to have no control or visibility into the swarm.
(best if possible) Set up an external solution so that on EC2 node removal, a job runs that SSHes or calls the API into a manager and removes the node from the swarm. I've seen people do this, but can't remember a link/repo with full details on using a Lambda, etc. to deal with the lifecycle hook.
Set up a simple cron on a single manager (or, preferably, a manager-only service running a cron container) that removes workers that are marked down. This is a somewhat blunt approach and has edge cases where you could potentially delete a node that still exists but is considered down/unhealthy by swarm, but I've not heard of that happening. If it were fancy, it could validate with AWS that the node is indeed gone before removing it.
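The cron-based cleanup boils down to parsing `docker node ls` output and removing workers marked Down. A sketch of the parsing step follows; the `--format` template shown in the comment is the assumed input shape, and the actual removal would then be `docker node rm <id>` per returned ID.

```javascript
// Given the output of:
//   docker node ls --format '{{.ID}} {{.Status}} {{.ManagerStatus}}'
// return the IDs of worker nodes whose status is Down. Managers (anything
// with a non-empty third column) are skipped so we never remove a manager.
function downWorkerIds(lsOutput) {
  return lsOutput
    .split('\n')
    .map(line => line.trim())
    .filter(Boolean)
    .map(line => line.split(/\s+/))
    .filter(cols => cols[1] === 'Down' && cols.length < 3)
    .map(cols => cols[0]);
}
```

Wrapping this in a small container run periodically on a manager (or a crond service constrained to manager nodes) gives the blunt-but-workable cleanup described above.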
WORST CASE, if a node goes down hard and doesn't do any of the above, it's not horrible, just not ideal for graceful management of user/db connections. After 30s a node is considered down and Service tasks will be re-created on healthy nodes. A long list of workers marked down in the swarm node list doesn't have an effect on your Services really, it's just unsightly (as long as there are enough healthy workers).
THERE'S A FEATURE REQUEST in GitHub to make this removal easier. I've commented on what I'm seeing in the wild. Feel free to post your story and use case in the SwarmKit repo.

How to make one micro service instance at a time run a script (using dockers)

I'll keep it simple.
I have multiple instances of the same microservice (using Docker), and this microservice is also responsible for syncing a cache.
Every X amount of time it pulls data from some repository and stores it in the cache.
The problem is that I need only one instance of this microservice to do this job, and if it fails, I need another one to take its place.
Any suggestions on how to do this simply?
By the way, is there an option to tag some microservice Docker instance and make it do some extra work?
Thanks!
The responsibility of restarting a failed service or scaling up/down is that of an orchestrator. For example, in my latest project, I used Docker Swarm.
Currently, Docker's restart policies are:
no: Do not automatically restart the container when it exits. This is the default.
on-failure[:max-retries]: Restart only if the container exits with a non-zero exit status. Optionally, limit the number of restart retries the Docker daemon attempts.
unless-stopped: Always restart the container regardless of the exit status, but do not start it on daemon startup if the container has been put into a stopped state before.
always: Always restart the container regardless of the exit status. When you specify always, the Docker daemon will try to restart the container indefinitely. The container will also always start on daemon startup, regardless of its current state.
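With Docker Swarm, both requirements — exactly one instance doing the sync job, and a replacement started if it fails — fall out of the service definition rather than per-container restart policies. The service and image names below are placeholders:

```yaml
# docker-compose.yml fragment (deployed with `docker stack deploy`)
services:
  cache-sync:
    image: registry.example.com/cache-sync:latest
    deploy:
      replicas: 1               # exactly one instance runs the sync job
      restart_policy:
        condition: on-failure   # reschedule it if it exits non-zero
        max_attempts: 3
```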
