I have AWS EC2 instances running Debian, where systemd runs Node as a service. (Hereinafter these instances are called the "Node servers".)
The Node servers are started by another instance (hereinafter called "the manager instance") that is permanently on.
When a Node server experiences some predefined period of inactivity, I want it to shut down automatically.
I am considering the following options:
1. (After sensing a period of inactivity in Node) execute a child_process in Node that runs the shutdown now command (a rough sketch of this option follows the list).
2. (After sensing a period of inactivity in Node) call the AWS SDK's stopInstances with the instance's own instance ID.
3. Expose an HTTP GET endpoint called last-request-time on each Node server, which the manager instance polls periodically and then decides whether/when to call the AWS SDK's stopInstances.
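For reference, here is roughly what I mean by option 1 (an untested sketch; the idle timeout is arbitrary, and the service would need permission to run shutdown, e.g. by running as root or via a passwordless sudo rule):

```
// Rough sketch of option 1 - not production code.
const { execFile } = require('child_process');

const IDLE_LIMIT_MS = 15 * 60 * 1000; // hypothetical: 15 minutes of inactivity
let lastRequestTime = Date.now();

// Call this from every request handler so activity resets the idle clock.
function touch() {
  lastRequestTime = Date.now();
}

setInterval(() => {
  if (Date.now() - lastRequestTime > IDLE_LIMIT_MS) {
    // Option 1: ask the OS to power the machine off.
    execFile('shutdown', ['now'], (err) => {
      if (err) console.error('shutdown failed:', err);
    });
  }
}, 60 * 1000);

module.exports = { touch };
```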
I am unsure which of these approaches to take and would appreciate any advice. Explicitly shutting down a machine from Node running on that same machine feels somehow inappropriate. But option 3 requires periodic HTTP polling, not to mention that it feels more risky to rely on another instance for auto-shutdown. (If the manager is down all the instances keep going.)
Or perhaps it is possible to get systemd to shut down the machine when a particular service exits with a particular code? This, if possible, would feel like the best solution as the Node process would only need to abort itself after the period of inactivity with a particular exit code.
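If this route works, I imagine the unit would look something like the sketch below, though I have not verified these directives on my systemd version (SuccessAction= in particular needs a reasonably recent systemd, and the paths and exit code are placeholders):

```
# /etc/systemd/system/node-server.service -- untested sketch
[Unit]
Description=Node server that powers the machine off after an idle exit
# Shut the machine down when the service terminates "successfully"
SuccessAction=poweroff

[Service]
ExecStart=/usr/bin/node /opt/app/main.js
# Treat exit code 42 (the app's "idle, please shut down" signal) as success
SuccessExitStatus=42
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

The Node process would then just call process.exit(42) once the inactivity period has elapsed.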
You could create a Lambda function that acts as an API and uses the SDK's stopInstances functionality.
That would also allow it to take on the full functionality of the "manager instance" and save even more on instances, since it only runs when needed.
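A minimal sketch of such a handler (assuming the AWS SDK for JavaScript v2 is available in the runtime, that the caller passes the instance ID in the event, and that the function's role has ec2:StopInstances permission):

```
// Hypothetical Lambda handler that stops an EC2 instance - a sketch, not a drop-in solution.
const AWS = require('aws-sdk');
const ec2 = new AWS.EC2();

exports.handler = async (event) => {
  // Assumes the Node server (or API Gateway) sends { "instanceId": "i-..." }.
  const instanceId = event.instanceId;
  await ec2.stopInstances({ InstanceIds: [instanceId] }).promise();
  return { statusCode: 200, body: `Stopping ${instanceId}` };
};
```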
Or you could cut out the middle-man and migrate the "Node servers" to Lambda.
(Lambda documentation)
I have an f1-micro gcloud VM instance running Ubuntu 20.04.
It has 0.2 vCPUs and 600 MB of memory.
When I write freezing/crashing, I mean the instance simply stops responding to anything.
From my monitoring I can see that CPU usage peaks at 40% (usually it stays under 1%), while memory usage is always around 60% (both stats with my Node.js server running).
When I open an SSH connection to my instance and run my Node.js server in the background, everything works fine as long as I keep the SSH connection alive. As soon as I close the connection, it takes a few more minutes until the instance freezes/crashes. Without closing the SSH connection I can keep it running for hours without any problem.
I don't get any crash or freeze information from gcloud itself. The instance has a green checkmark and is, in a sense, still running. I just can't open a new SSH connection, and the only way to do anything with the instance again is to restart it.
I have Cloud Logging active, and there are no messages in there either.
So with this knowledge, my question is: does gcloud somehow boost SSH-connected VMs to keep them alive?
I don't know what else could cause this behaviour.
My Node.js server uses around 120 MB, another service uses 80 MB, and the GCP monitoring agent uses 30 MB. The Linux free command on the instance shows available memory between 60 MB and 100 MB.
In addition to John Hanley's and Mike's comments, you can edit your machine type based on your needs.
In the Google Cloud Console, go to VM instances under Compute Engine.
Select the instance name to open its Overview page.
Make sure to stop the instance before editing it.
Select a machine type that matches your application's needs.
Save.
For more info and guides you may refer to the links below:
Edit Instance
Machine Family Categories
Since there were no answers that explained the strange behaviour I encountered, I'm posting what worked for me.
I still haven't figured out the cause, but at least my servers won't crash/freeze anymore.
I somehow fixed it by running my Node.js application as an actual managed background job using forever, instead of running it with node main.js &.
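Roughly what I ran (from memory, so treat the exact commands as approximate):

```
npm install -g forever   # install forever globally
forever start main.js    # run the app as a managed background process
forever list             # confirm it is still running
```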
We have multiple Google Cloud Run services running for an API. There is one parent service and multiple child services. When the parent service starts, it loads a schema from all the children.
Currently there isn't a way to tell the parent process to reload the schema, so when a new child is deployed the parent service needs to be restarted to reload the schema.
We understand that there can be one or more instances of the Google Cloud Run service running, and we have ideas for dealing with that, but we are wondering whether there is a way to restart the parent process at all. Without a way to achieve it, one instance or more is irrelevant for now. The only way we have found is to redeploy the parent, which seems like overkill.
The containers running in Google Cloud are Alpine Linux with Node.js, running an Express application/middleware. I can stop the running Node application but not restart it. If I stop the service, Google Cloud Run may still continue to route traffic to that instance, causing errors.
Perhaps I can stop the Express server so Google Cloud Run will replace that instance? Is this a possibility? Is there a graceful way to do it so it tries to complete any current requests first (not simply kill Express)?
Looking for any approaches to force Google Cloud Run to restart or start new instances. Thoughts?
Your design seems, at a high level, to be a cache system: the parent service gets the data from the child services and caches it.
Therefore, you have all the difficulties of cache management, especially cache invalidation. There is no easy solution for that, but my recommendation would be to use Memorystore, where every child service publishes the latest version number of its schema (at container startup, for example). Then the parent service checks (on each request, for example) the status in Memorystore (single-digit-millisecond latency) to see whether a new version is available. If one is, it requests the child service and updates the parent service's schema cache.
If applicable, you can also set a TTL on your cache and reload it every minute, for example.
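A rough sketch of that idea, assuming Memorystore for Redis with the ioredis client (the key name and the loadSchemaFromChildren() helper are made up for illustration):

```
// Sketch only: the parent service checks a schema version key before serving a request.
const Redis = require('ioredis');
const redis = new Redis({ host: process.env.REDIS_HOST, port: 6379 }); // Memorystore endpoint

let cachedSchema = null;
let cachedVersion = null;

// Hypothetical helper: fetch and merge the schema from each child service.
async function loadSchemaFromChildren() {
  // ... call each child's schema endpoint here ...
  return {};
}

async function getSchema() {
  // Children would write this key at startup, e.g. SET schema:version <new-version>
  const latestVersion = await redis.get('schema:version');
  if (latestVersion !== cachedVersion) {
    cachedSchema = await loadSchemaFromChildren();
    cachedVersion = latestVersion;
  }
  return cachedSchema;
}
```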
EDIT 1
If I focus only on Cloud Run, there is only one condition under which you can restart your container without deploying a new version: set the max-instances param to 1, and implement an exit endpoint (simply call process.exit() or similar in your code).
OK, you lose all the scale-up capacity, but it's the only case where, with a special exit endpoint, you can exit the container and force Cloud Run to reload it on the next request.
If you have more than one instance, you won't be able to restart all the running instances, only the one that handles the "exit" request.
Therefore, the only solution is to deploy a new revision (simply deploy, without any code/config change).
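For the single-instance approach above, a sketch of such an exit endpoint in Express might look like this (the route name is arbitrary and you would want to protect it):

```
const express = require('express');
const app = express();

// ... normal routes ...

const server = app.listen(process.env.PORT || 8080);

// Hypothetical "exit" endpoint: stop accepting new connections, let in-flight requests finish, then exit.
app.post('/admin/exit', (req, res) => {
  res.status(202).send('Restarting');
  server.close(() => process.exit(0)); // Cloud Run will start a fresh instance on the next request
});
```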
I have a Node.js application that runs inside Docker on AWS ECS Fargate.
It started to consume a lot of CPU, and I wonder if I can profile it.
I couldn't find a way to connect via SSH, and I am not sure whether it helps to run it with the --prof flag.
I am a newbie in AWS myself, so please double-check everything I say. Fargate provisions the underlying instances for you, and you are not allowed to interact with them directly (via SSH), but I think you can use CloudWatch Logs, which records every console.log of your app in the specified log groups. There must be some configuration for this when you create your task definition or container definition (at least in CloudFormation, which I highly recommend using). You can console.log the number of users or function calls and use this info to debug what is happening.
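For example, something along these lines would surface basic CPU/memory numbers in CloudWatch Logs without needing SSH (the interval and field names are arbitrary, and this assumes the task definition uses the awslogs log driver so stdout ends up in CloudWatch):

```
// Periodically log process CPU and memory usage; on Fargate, stdout goes to CloudWatch Logs.
setInterval(() => {
  const cpu = process.cpuUsage();     // cumulative user/system CPU time in microseconds
  const mem = process.memoryUsage();  // bytes
  console.log(JSON.stringify({
    cpuUserMs: Math.round(cpu.user / 1000),
    cpuSystemMs: Math.round(cpu.system / 1000),
    rssMb: Math.round(mem.rss / 1024 / 1024),
  }));
}, 60 * 1000);
```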
I have deployed a Bitnami AMI of NodeJS on an AWS micro instance. After starting my node app, everything works fine.
After some time without any activity, the app, which listens on port 3000, seems to shut down. When this happens and I refresh the page, my browser gives the message:
Network Error (tcp_error)
A communication error occurred: "Connection refused"
The Web Server may be down, too busy, or experiencing other problems preventing it from responding to requests. You may wish to try again at a later time.
The AWS console shows the instance is still running and the Bitnami build still responds with the standard message on port 80.
Forever (https://github.com/nodejitsu/forever) is also a useful tool for this kind of thing, and it gives you a little more control than nohup or screen.
As we discussed in the comments, the problem was that the node process was bound to the SSH session.
You can use nohup or screen to launch the node process so that it is not bound to the session.
I suggest using screen, because being able to return to the launched session is essential for maintenance/updating.
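For example (commands from memory; adjust the script name to yours):

```
# nohup: detach the process from the terminal so it survives the SSH session ending
nohup node app.js > app.log 2>&1 &

# screen: run it inside a named session you can reattach to later
screen -S nodeapp        # start a session named "nodeapp", then run: node app.js
# detach with Ctrl-A then D; reattach later with:
screen -r nodeapp
```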
Related: How to run process as background and never die
Related: Command-Line Interface tool to run node as a service
Besides configuring an EC2 instance, you can also use AWS's PaaS solution, namely Elastic Beanstalk. It also has support for Node.js, and it's super easy to deploy your apps using this service.
I have a NodeJS instance running with Amazon Elastic Beanstalk. I would like to know whether the server will automatically restart if the Node.js process crashes.
Do I have to use foreverjs?
Thank you
TLDR - Use foreverjs.
So there are two types of restart. One is where the code throws an exception and stops node. The OS is still running; in this case, from the OS's perspective, node decided to exit, and that's none of its business. This is where foreverjs plays a role - it'll watch node and restart it if it ever stops due to an exception/error etc.
The second type of restart is a machine reboot. This is something you might want to do if there is a kernel panic, etc. AWS will not automatically reboot; it won't do anything that your desktop wouldn't do. You're going to have to reboot it yourself (but really - try to debug it before having it serve production traffic again). I've run a fair number of servers and this isn't a common issue. The best way to deal with this is to have redundancy, so other servers can step in if one fails in such a stark manner.