I am refactoring a couple of Node.js services. All of them used to be started with forever on virtual servers; if the process crashed, it was simply relaunched.
Now, moving to containerised and stateless application structures, I think the process should exit and the container should be restarted on a failure.
Is that correct? Are there benefits or disadvantages?
My take is: do not use an in-container process supervisor (forever, pm2) and instead use the Docker restart policy via --restart=always (or one of the other flavors of that option). This is more in line with the overall Docker philosophy, and should behave very similarly to in-container process supervision since Docker containers start running very quickly.
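For example, assuming an image named my-node-app (a placeholder), the policy is just a flag when the container is started:

# relaunch the container whenever the process inside exits, and when the daemon comes back up
docker run -d --restart=always --name my-service my-node-app

unless-stopped and on-failure are the other flavors mentioned above.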
The strongest advocate for running in-container process supervision I've seen is in the phusion baseimage-docker README if you want to explore the other position on this topic.
While it's a good idea to use --restart=always as a failsafe, container restarting is relatively slow (5+ seconds with the simple Hello World Node server described here), so you can minimize app downtime using something like forever.
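If you do keep forever in the picture, note that the container's main command has to be the non-daemonizing form (script name is a placeholder here); the daemonizing forever start variant would return immediately and take the container down with it:

# run forever in the foreground so it stays the container's main process
forever server.js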
A downside of restarting the process within the container is that crash recovery can now happen two ways, which might have implications for your monitoring, etc.
Node needs a clustering setup if you are running on a server with multiple CPUs.
With PM2 you get that without writing any extra code. http://pm2.keymetrics.io/docs/usage/cluster-mode/
Unless you are using a bunch of single-CPU server instances, I would say use PM2 in production.
PM2 will also be quicker to restart than Docker.
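As a rough sketch (app name and script path are placeholders), an ecosystem file is all it takes to get one clustered worker per core:

// ecosystem.config.js - illustrative only; point script at your real entry file
module.exports = {
  apps: [{
    name: 'api',
    script: './server.js',
    instances: 'max',     // one worker per available CPU core
    exec_mode: 'cluster'  // PM2 uses Node's cluster module under the hood
  }]
};

Starting it is then just pm2 start ecosystem.config.js.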
My backend is a Node.js application running on Ubuntu Linux. It needs to spawn a Node.js sub-process when there is a request from a client. The sub-process usually takes less than 20 seconds to finish. These processes need to be managed when many concurrent requests come in. I am thinking of moving the spawned process inside a Docker container; that means a new Docker container would be created to run the process whenever a request comes in from a client. That way, I can use Kubernetes to manage these Docker containers. I am not sure whether this is a good design, or whether putting the process inside a Docker container causes any performance issues.
The reason I am thinking of using Docker containers instead of spawn is that Kubernetes offers all the features needed to manage these containers, such as auto-scaling if there are too many requests, limiting the CPU and memory of a container, scheduling, monitoring, etc. I would have to implement this logic myself if I used spawn.
You can easily measure the overhead yourself: get any basic docker image (e.g. a Debian base image) and run
time bash -c true
time docker run debian bash -c true
(Run each a few times and ignore the first runs.)
This will give you the startup and cleanup costs. During actual runtime, there is negligible/no further overhead.
Kubernetes itself may add some more overhead - best measure that too.
From the Docker documentation on network settings:
Compared to the default bridge mode, the host mode gives significantly better networking performance since it uses the host’s native networking stack whereas the bridge has to go through one level of virtualization through the docker daemon. It is recommended to run containers in this mode when their networking performance is critical, for example, a production Load Balancer or a High Performance Web Server.
So, answers which say there is no significant performance difference are incorrect as the Docker docs themselves say there is.
This is just in the case of the network. There may or may not be impacts on accessing disk, memory, CPU, or other kernel resources. I'm not an expert on Docker, but there are other good answers to this question around, for example here, and blogs detailing Docker-specific performance issues.
Ultimately, it will depend on exactly what your application does as to how it is impacted. The best advice will always be that, if you're highly concerned about performance, you should set your own benchmarks and do your own testing in your environment. That doesn't answer your question because there is no generic answer. Importantly, though, "there's virtually no impact" does not appear to be correct.
Docker is in fact just a wrapper around core functionality of Linux itself, so there is no significant impact; it is just separating a process into a container. So the question is more about the levels of virtualisation on your host. If it is Linux inside Windows, or Docker on Windows, it can affect your app somehow, and virtualisation is the heavy part then. Docker lets you separate dependencies with almost no impact on performance.
I have a Node.js web application that runs on my Amazon AWS server using nginx and pm2. The application processes files for the user, which is done using a job system and child processes. In short, when the application starts via pm2, I create a child process for each CPU core of the server. Each child process (worker) then completes jobs from the job queue.
My question is, could I replicate this in Docker, or would I need to modify it somehow? One assumption I had was that I would need to create one container for the database, one container for the application, and then multiple worker containers to do the processing, so that if one crashes I can just spin up another worker.
I have been doing research online, including a Udemy course to get my head around this stuff, but I haven't come across an example or something I can relate to my problem/question.
Any help, reading material or suggestions would be greatly appreciated.
Containers run at the same performance level as the host OS. There is no process performance hit. I created a whitepaper with Docker and HPE on this.
You wouldn't use pm2 or nodemon, which are meant to start multiple processes of your node app and restart them if they fail. That's the job of Docker now.
If in Swarm, you'd just increase the replica count of your service to be similar to the number of CPU/threads you'd want to run at the same time in the swarm.
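A minimal sketch of that with the Swarm CLI (service and image names are placeholders):

# one task per concurrent Node process you want; scale without touching the app code
docker service create --name api --replicas 4 my-node-app
docker service scale api=8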
I don't mention the nodemon/pm2 thing for Swarm in my node-docker-good-defaults, so I'll add that as an issue to update it for.
NodeJS has its own modules for managing clustering and process restart:
the cluster module, which allows Node to run multiple processes based on the number of cores in the machine. It will also spawn new processes when old ones shut down (see the sketch after this list).
the domain module, which allows Node to stop taking requests and shut down the processes after an error has occurred.
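As a minimal sketch of the cluster module (the HTTP handler is only a placeholder), the master process forks one worker per core and re-forks whenever one exits:

// cluster-sketch.js - illustrative only
const cluster = require('cluster');
const http = require('http');
const numCPUs = require('os').cpus().length;

if (cluster.isMaster) {
  for (let i = 0; i < numCPUs; i++) cluster.fork(); // one worker per core
  cluster.on('exit', () => cluster.fork());         // respawn a replacement worker
} else {
  // workers share the same listening port; the master distributes incoming connections
  http.createServer((req, res) => res.end('ok')).listen(3000);
}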
Then there's PM2, and I've seen guides like this one saying that PM2 allows for logging, some stats monitoring, process restart, and clustering for nodejs.
Other than the stats monitoring and logging, can someone explain what the difference between the two is? Are they supposed to be used together or do I pick one or the other?
In a production environment, how does each fare in shutting down + restarting on bootup for the nodejs app:
System needs to restart (applying system patches, etc)
Restarting all nodejs processes to apply new code changes on server.
PM2 uses cluster under the hood, and makes the whole cluster management easier. For your requirements, you want to look at PM2.
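For example (script name is a placeholder), a single command gives you the clustered workers without writing any cluster boilerplate yourself:

pm2 start server.js -i max   # -i max = one clustered worker per CPU core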
I have an API server running Node.js that was using its cluster module, and testing looked to be pretty good. Now our IT department wants to move to using Docker containers, which I am happy about, but I've never actually used it other than just playing around. But I had a thought: the Node.js app runs within a single Docker process, so the cluster module wouldn't really be the best fit, as the single Docker process can be a slow point of the setup until the request is split up within that process by the cluster module.
So really, a cluster of Docker containers, with the ability to start and stop them on the fly, is more important than using Node.js' cluster module, correct?
If I have a cluster of containers, would using Node.js' cluster module get me anything? The api endpoints take less than .5sec to return (usually quite a bit less).
I'm using MySQL (I believe it's a single server, nothing more currently), so there shouldn't be any reason to use a data integrity solution then.
What I've seen as the best solution when using Docker is to keep as few processes per container as possible, since containers are lightweight; you don't want processes trying to use more than one CPU. So, running a cluster in the container won't add any value and might worsen latency.
Here https://medium.com/@CodeAndBiscuits/understanding-nodejs-clustering-in-docker-land-64ce2306afef#.9x6j3b8vw Chad Robinson explains the idea in general terms.
Kubernetes, Rancher, Mesos and other container management layers handle the load-balancing. They provide "scheduling" (moving those Docker container slices around different CPUs and machines to get a good usage across the cluster) and "networking" (load balancing inbound requests to those containers) layers internally.
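In practice, "as few processes per container as possible" usually just means the image's main command is the bare node process; a sketch, with the base tag and file names as assumptions:

# Dockerfile (sketch) - no pm2/forever layer inside the container
FROM node:18-alpine
WORKDIR /app
COPY package*.json ./
RUN npm install --production
COPY . .
CMD ["node", "server.js"]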
Update
I think it's worth adding the link Why it is recommended to run only one process in a container? where people share their ideas and experiences, but chiefly from Jon there are some interesting points:
Provided that you give a single responsibility (single process, function or concern) to a container, which is a good idea (Docker names this a 'concern' ;)):
Scaling containers horizontally is easier.
It can be re-used in different projects.
Identifying issues and troubleshooting is a breeze compared to doing it in an entire application environment. Also, logging and reporting can be more accurate and detailed.
Upgrades/downgrades can be gradual and fully controlled.
Security can be applied to specific resources and at different levels.
You'll have to measure to be sure, but my hunch would be that running with Node's cluster module would be worthwhile. It would get you more CPU utilization with the least amount of extra overhead. No extra containers to manage (start, stop, monitor). Plus the cluster workers have an efficient communication mechanism. The most reasonable evolution (don't skip steps) would seem to me:
1 container, 1 node process
1 container, several clustered node workers
several containers, each with several node workers
I have a system with 4 logical cores, and I ran the following lines on my machine as well as inside a Docker container on the same machine.
const numCPUs = require('os').cpus().length;
console.log(numCPUs)
These lines print 4 on my machine and 2 inside the Docker container, which means that if we use clustering inside the Docker container, only 2 instances would be running. So a Docker container doesn't see the cores the same way the actual machine does. Also, running 5 Docker containers with clustering mode enabled gives 10 instances, which ultimately have to be managed by an OS kernel with only 4 logical cores.
So I think the best approach is to run multiple Docker container instances in swarm mode with Node.js clustering disabled. This should give the best performance.
We have a custom setup which has several daemons (web apps + background tasks) running. I am looking at using a service which helps us monitor those daemons and restart them if their resource consumption exceeds a certain level.
I would appreciate any insight on when one is better than the other. As I understand it, monit spins up a new process while supervisord starts a subprocess. What are the pros and cons of each approach?
I will also be using upstart to monitor monit or supervisord itself. The webapp deployment will be done using capistrano.
Thanks
I haven't used monit but there are some significant flaws with supervisord.
Programs should run in the foreground
This means you can't just execute /etc/init.d/apache2 start. Most times you can just write a one-liner, e.g. "source /etc/apache2/envvars && exec /usr/sbin/apache2 -DFOREGROUND", but sometimes you need your own wrapper script. The problem with wrapper scripts is that you end up with two processes, a parent and a child. See the next flaw...
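For reference, a minimal supervisord entry for that one-liner might look roughly like this (stock Debian/Ubuntu Apache paths assumed):

[program:apache2]
command=/bin/bash -c "source /etc/apache2/envvars && exec /usr/sbin/apache2 -DFOREGROUND"
autorestart=true
; try to signal the whole process group on stop (see the next flaw)
stopasgroup=true
killasgroup=true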
supervisord does not manage child processes
If your program starts child processes, supervisord won't detect this. If the parent process dies (or if it's restarted using supervisorctl), the child processes keep running but will be "adopted" by the init process and stay running. This might prevent future invocations of your program from running or consume additional resources. The recent config options stopasgroup and killasgroup are supposed to fix this, but didn't work for me.
supervisord has no dependency management - see #122
I recently set up squid with qlproxy. qlproxyd needs to start first, otherwise squid can fail. Even though both programs were managed with supervisord, there was no way to ensure this. I needed to write a start script for squid that made it wait for the qlproxyd process. Adding the start script resulted in the orphaned process problem described in flaw 2.
supervisord doesn't allow you to control the delay between startretries
Sometimes when a process fails to start (or crashes), it's because it can't get access to another resource, possibly due to a network wobble. Supervisor can be set to restart the process a number of times. Between restarts the process will enter a "BACKOFF" state but there's no documentation or control over the duration of the backoff.
In its defence supervisor does meet our needs 80% of the time. The configuration is sensible and documentation pretty good.
If you want to additionally monitor resources, you should go for monit. In addition to just checking whether a process is running (availability), monit can also perform checks of resource usage (performance, capacity usage), load levels and even basic security checks (md5sum of a binary file, config file, etc.). It has a rule-based config which is quite easy to comprehend. Also, there are a lot of ready-to-use configs: http://mmonit.com/wiki/Monit/ConfigurationExamples
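An illustrative fragment (process name, pidfile path and thresholds are assumptions) combining an availability check with resource rules:

check process myapp with pidfile /var/run/myapp.pid
  start program = "/etc/init.d/myapp start"
  stop program  = "/etc/init.d/myapp stop"
  if cpu > 80% for 5 cycles then restart
  if totalmem > 500.0 MB for 5 cycles then restart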
Monit requires processes to create PID files, which can be a flaw, because if a process does not create a PID file you have to write some wrapper around it. See http://mmonit.com/wiki/Monit/FAQ#pidfile
Supervisord, on the other hand, is more tightly bound to the process: it spawns it itself. It cannot make resource-based checks like monit. It has a nice CLI, supervisorctl, and a web GUI though.