NodeJS Production Deployment Best Practice

I'm looking for ways in which to deploy some web services into production in a consistent and timely manner.
I'm currently implementing a deployment pipeline that will end with a manual deployment action of a specific version of the software to a number of virtual machines provisioned by Ansible. The idea is to provision x instances running version A while y instances are already running version B, then image them and flip the traffic over. The same mechanism should allow me to scale new VMs into a set using the image I already made.
I have considered the following options but was wondering if there's something I'm overlooking:
TGZ
The CI environment would build a tarball from a project that has passed unit tests and integration tests. Optionally, dependencies would be bundled (removing the need to run npm install on the production machine and to rely on network connectivity to a public or private npm repository).
My main issue here is that any dependencies that depend on system libraries would be built on a different machine (albeit the same image). I don't like this.
NPM
The CI environment would publish to a private NPM repository and the Ansible deployment script would check out a specific version after provisioning. Again, this suffers from a reliance on external services being available when you want to deploy. I don't like this either.
Git
Any system dependent modules become globally installed as part of provisioning and all other dependencies are checked into the repository. This gives me the flexibility of being able to do differential deployments whereby just the deltas are pushed and the application daemon can be restarted automatically by the process manager almost instantly. Dependencies are then absolutely locked down.
This would mean there's no need to spin up a new VM except to scale. Deployments can be pushed straight to all active instances.

First and foremost, regardless of the deployment method, you need to make sure you don't drop requests while deploying new code. One simple approach is removing the node from a load balancer prior to switchover. Before doing so, you may also want to try and evaluate if there are pending requests, open connections, or anything else negatively impacted by premature termination. Or perhaps something like the up module.
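If the nodes sit behind HAProxy, for instance, that drain step can be scripted against the admin socket; a minimal sketch, assuming a stats socket at /var/run/haproxy.sock and a backend/server named app/node1 (both placeholders):

    # take the node out of rotation before deploying to it
    echo "disable server app/node1" | socat stdio /var/run/haproxy.sock
    # give in-flight requests time to drain, then deploy and restart the app
    sleep 30
    echo "enable server app/node1" | socat stdio /var/run/haproxy.sock

Other load balancers (nginx upstreams, ELB, and so on) have equivalent mechanisms.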
Most people would not recommend source-controlling your modules. It seems that a .tgz with your node_modules already filled in from an npm install, combined with a bundledDependencies declaration in your package.json, might cover all your concerns. With this approach, an npm install on your nodes will not download and install everything again; it will, however, rebuild node-gyp native modules, which may address your system-library concern.
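As a rough sketch of that flow, assuming package.json lists the relevant modules under bundledDependencies and that the artifact name below is just a placeholder:

    # on the CI machine, after tests pass
    npm install
    npm pack                 # produces myapp-1.0.0.tgz with the bundled node_modules inside

    # on the target node
    tar xzf myapp-1.0.0.tgz
    cd package
    npm rebuild              # recompiles native (node-gyp) addons against this machine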
You can also make use of git tags to more easily keep track of versions with specific dependencies and payloads. Manually deploying the code may get tedious, so you may want to consider automating the routine by iterating over a known set of server entries in a database from an interface. docker.io may be of interest.
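A small sketch of the tagging side, with the version number as a placeholder:

    # on CI, once the build is green
    git tag -a v1.4.2 -m "release 1.4.2, dependencies locked"
    git push origin v1.4.2

    # on deploy (e.g. from an Ansible task or a deploy script)
    git fetch --tags
    git checkout v1.4.2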

Related

Separate environments for learning or trying out vs production (sandboxes?)

Can you suggest a way of separating learning/trying-out from production on the same computer? I know a lot of JS and have production-ready skills, but I sometimes need to probe or try out simpler stuff or the basics. I presume that a lot of engineers are in a similar place.
This is the situation I am facing right now:
I wanted to install redis and configure it while trying out something interesting.
In a separate project I needed another clean redis configuration and installation.
On the front-end side I tried out and installed a few npm packages globally.
At some point I installed Python 3.4; now I require 3.6.
At some point I installed nginx and configured it; now I need another configuration and to wipe the previous one out.
If I start a big project right now, I feel like my computer will eventually let me down because of all the previous attempts I've made.
Et cetera. These all create friction in both my learning and my exploration.
Now, it crosses my mind to use separate VirtualBox installations for trying things out, but this answer is trivial; please suggest something else.
P.S.: I am using Linux Mint.
You can install and use Docker, which is also trivial; however, since your environment is Linux you can also use LXC.
There isn't really a single good answer to this sort of question of course; but some things that are generally a good idea are:
Use git repos to keep the source "backed up" (obviously your local PC should not be the git server); commit your changes all the time. If you can't hold your breath for as long as the timespan between two commits, you're doing it wrong (or you may have asthma, see a doctor).
Always build your project with there being not just multiple, but a variable number of "deployments" in mind. That means not hardcoding absolute paths, database names/ports/hostnames and things like that. If your project needs database/API credentials, then those should be in a config file of sorts (or in the env); that config file should be stored outside the codebase and shouldn't be checked into your git repos (though there can of course be a config template in there). A minimal sketch of this is included after this list.
Always have at least 2 deployments of any project actually deployed. Next to the (obvious) "live"/"production" deployment, which your clients/users use, you want a "dev" version for yourself where you can freely shit the bed, and for bigger projects you may well want multiple. Each deployment would have its own database and its own copy of the code/assets.
It can be useful to deploy everything inside podman or docker containers, which makes it easier to have a near-identical system in both development and production (in case those are different servers), but that may be too much overhead for you.
Have a method (maybe a script) that makes it very easy to deploy updates from your git repo or dev deployment to the production deployment. Based on your description, I'm guessing that if a client tells you she wants some minor cosmetic changes done, you do them straight on the live version; very convenient and fast, but a horrible thing in practice. Once you switch from that workflow to having a separate dev deployment, you'll feel slowed down by that (which you are), but if you optimize that workflow over time you'll get to the point where you can still deploy cosmetic changes in a minute or so while having fully separated deployments. It is worth the time investment.
Have a personal devtools git repo or something similar. You're likely using an IDE such as VS Code? Back up your VS Code user config in that repo and update it reasonably frequently. Use a text editor, Photoshop or another editor, etc.? Same deal. You hear that ticking sound? That's the bomb that's been placed on your motherboard. It might go off tonight, it might not go off for years, but you never know; always expect it could be today or tomorrow, so have stuff backed up externally and/or on offline media.
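As a minimal sketch of the config-file point above (the paths, variable names and the systemd/docker mentions are just examples, not prescriptions):

    # keep credentials in an env file that lives outside the repo
    cat > /etc/myapp/production.env <<'EOF'
    DB_HOST=db.internal
    DB_NAME=myapp
    DB_PASSWORD=change-me
    EOF

    # make sure such files can never be committed
    echo '*.env' >> .gitignore

    # load it when starting the app (systemd's EnvironmentFile= or docker's --env-file do the same job)
    set -a; . /etc/myapp/production.env; set +a
    node server.js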
There's a lot more but those are some of the basics that spring to mind.
I thought Docker was only for containerizing your app with all the installation files and configurations before pushing to production.
Docker is useful whenever you need to configure the runtime environment in an isolated manner. Production, local development, other environments - all need the same runtime. All benefit from the runtime definition and isolation that docker provides. Arguably docker is even more useful in workstation-centric development, than it is in production.
I wanted to install redis and configure it while trying out something interesting.
Instead of installing redis on your os directly, run the preexisting docker image for redis.
In a separate project I needed another clean redis configuration and installation.
Instantiate the docker image again and now you have 2 isolated redis servers running locally.
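For example (container names and host ports are arbitrary):

    # first project: a throwaway Redis to experiment with
    docker run -d --name redis-experiments -p 6379:6379 redis

    # second project: a completely separate, clean Redis on another host port
    docker run -d --name redis-project2 -p 6380:6379 redis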
On the front-end side I tried out and installed a few npm packages globally.
Run your npm code within a Node.js Docker container.
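For instance (the image tag and mount paths are just examples):

    # run npm inside the official Node image instead of installing packages globally on the host
    docker run --rm -it -v "$PWD":/app -w /app node:18 npm install
    docker run --rm -it -v "$PWD":/app -w /app node:18 npm test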
At some point I installed Python 3.4; now I require 3.6.
Different versions of Python are a great use case for Docker containers, which are tagged with specific Python versions.
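For example:

    # each project pins the interpreter it needs via the image tag
    docker run --rm python:3.4 python --version
    docker run --rm python:3.6 python --version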
At some point I installed nginx and configured it; now I need another configuration and to wipe the previous one out.
Nginx also has a very useful official container.
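Something along these lines (the container name, host port and config path are placeholders):

    # mount a project-specific config into the official nginx image
    docker run -d --name nginx-project1 -p 8080:80 \
      -v "$PWD/nginx.conf":/etc/nginx/conf.d/default.conf:ro nginx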
If I start a big project right now, I feel like my computer will eventually let me down because of all the previous attempts I've made.
Yeah, it gets messy quick. That's why docker is such a great solution. Give every project dedicated services and use docker-compose to simplify the networking and building components. Fight the temptation to use a docker container for more than one service - instead stitch them together with docker networks.
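A hand-rolled sketch of what docker-compose would set up for you, with all names as placeholders:

    # one network per project, one container per service
    docker network create myproject
    docker run -d --name myproject-redis --network myproject redis
    docker run -d --name myproject-web --network myproject -p 3000:3000 \
      -v "$PWD":/app -w /app node:18 node server.js
    # on a user-defined network, the app can reach Redis at the hostname "myproject-redis"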
Read https://docs.docker.com/get-started/overview/ to get started with docker.

chef get converged-attributes without deploying

We are using chef to deploy all of our stacks.
I need to build a runbook for each environment we deploy.
I have been parsing the environment, node and recipe files but the more information I need to extract, the more complex it becomes because I am converging the attributes in my application.
I would like to use the converged-attributes.json file produced by our chef deployment without deploying any code, because we can't deploy to production just to build the runbooks.
We also plan to build the runbook before the environment exists to provide configuration information to the DevOps team (e.g. memory requirements, ports, etc.).
Is there a way to use any of the chef/knife components or libraries to do the following?
Converge the attributes for each node
Write the converged attributes to a location my application can access on Mac OS X.
Quit before attempting to access any servers
This is not possible in the generic case. Chef is executable code at heart and the only way to fully compute the side effects is to actually execute it. This is what chef-client does, you can't "converge" the node externally so step 3 doesn't really make any sense. You could try to use Why Run mode but we really don't recommend it and are probably going to remove the feature as it does more harm than good most of the time. Roles and environments are static data so you can parse and manipulate those, but cookbooks are code and have to be run in-place to know exactly what they will do.
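For what it's worth, if you still want to experiment despite those caveats, both pieces are single commands (the run list and node name below are placeholders):

    # why-run a converge locally without applying changes - results are approximate at best
    chef-client --local-mode --why-run --runlist 'role[webserver]'

    # or dump the attributes saved from a node's last real converge
    knife node show mynode -l -F json > converged-attributes.json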

Forbid npm update in Docker environment

guys,
For various projects, I'm creating single Docker environments. Each Docker container consists of Debian, Nginx, Node.js, etc. and is going to be used by developers as well as in production via Google Cloud's Kubernetes. Since the Node.js/module versions should be the same everywhere, I would like to restrict access to certain npm commands (somehow). Often developers work with different Node.js and project module versions, and that caused a lot of trouble in the past. With the Docker containers, I can provide environments with everything you need for a project. To finish this step, I would like to restrict npm command execution and only allow arguments like install, test, etc.
Please drop me a comment if you know how to resolve this :)
Cheers
It is almost impossible to limit your developers to running only some commands in the container if they have access to the Dockerfiles and can somehow change the build flow.
But because containers provide isolation, and you can build a custom container for each application based on your base image, it is not a big problem if the version of some package changes for one application (for example, in a build step), because it will not affect other apps. They just have different containers.
So you will not have the compatibility problems you get when using one server with many applications sharing a single environment.
The only thing you need to do is make sure that nobody changes the container you are using as a base image.
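Within those constraints, one hedged way to nudge developers toward the allowed commands is to put a wrapper script ahead of the real npm in the image's PATH (all paths here are hypothetical, and this is a speed bump, not a security boundary):

    #!/bin/sh
    # /usr/local/bin/npm - wrapper that only forwards whitelisted subcommands
    # (assumes the real binary has been moved to /usr/local/lib/npm-real)
    case "$1" in
      install|test|run)
        exec /usr/local/lib/npm-real "$@"
        ;;
      *)
        echo "npm $1 is not allowed in this container" >&2
        exit 1
        ;;
    esac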

Continuous Integration - How to run npm install on offline server

Our continuous integration system uses Ansible playbooks to deploy our repo (including node modules) to a list of servers as specified in our Ansible host file. We have numerous environments which are configured in individual host files. Jenkins is our build server, which is used to kick off each Ansible run on a schedule. One of the tasks in the Ansible playbook is npm install.
Our problem arises because only one of the environments is offline, so when the playbook performs the npm install task on this particular environment (read "host file"), it fails due to lack of internet connectivity.
I've seen lots of answers on how to get around this manually, but the whole point of our continuous integration system is to run automatically and consistently (from environment to environment). So I don't want to introduce a bunch of workarounds in the playbook and/or repo to bundle up the node modules, etc., just to get around this particular host's offline issue.
Since this is a lower environment, I am willing to do something specific in advance, as a one-time step, on this server in order to bypass this issue. But since the playbook tasks tar up all existing files under the user account into a rollback zip file prior to installing the new code, anything I introduce on the server under this user account will essentially be removed (with the exception of . files and directories).
So, how can we use CI to run npm install on a single offline server without manual intervention?
I'm not familiar with ansible, but the problem you're experiencing isn't uncommon, and there are definitely automated ways of solving it, e.g.:
Set up a local NPM registry, e.g. sinopia, and configure this as your default registry. Nodes with no internet access will receive the latest cached version of the package (presuming it was previously downloaded by a node)
Run npm install once, package the node modules up and share the artifact across the environments
Personally, I prefer and advocate the second solution because:
Faster CI runs (only one download)
Package versions are guaranteed across all environments (although strict versioning would also solve this); a rough sketch of this approach is at the end of this answer
An alternative would be to set up an internal NPM registry, so that your offline Jenkins server can talk to your internal NPM; however, keeping this in sync with the public NPM repository would likely be a nightmare, presenting a number of problems.
You need to allow "outbound only" internet access to this Jenkins server in order for this to work effectively without workarounds.
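A rough sketch of the second option, with placeholder paths (the copy/unpack steps could equally be Ansible copy/unarchive tasks):

    # on the online build machine, once per build
    npm install --production
    tar czf node_modules.tgz node_modules

    # shipped alongside the code; on the offline host the playbook just unpacks it
    tar xzf node_modules.tgz
    # and the npm install task is skipped (or becomes a no-op) for this environment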

Docker and grunt-based workflow automation

I'm looking for a Docker-based project setup which enables:
Development environment to most closely match production
Best of breed workflow automation tools for all developers
Highly portable/quick to set-up development environment, which supports Linux, OSX, and Windows. Currently we use Vagrant and that seems to be the most obvious choice still.
To satisfy #1:
Same app container (node.js + Apache) for dev, test, staging and production
Do not add any custom workflow tools to the container just for development's sake
To satisfy #3:
Do not require developers to all install their own dev tools for their respective environments/OSes (e.g. getting them to install node.js, npm, grunt, etc within the host)
So then to still satisfy #2, the idea I have is:
have a second "dev" container which shares files with the node/apache container and runs all the workflow automation.
run all the grunt watch/rebuild/reload/browser-sync etc from within that.
If using Vagrant, the file sharing would essentially go as host->dev container->app container
Are there any flaws in the above model, or perhaps better ideas?
One potentially missing point is whether to - and if so then how to - avoid performing a full build of containers in production each time. Without risking a mismatch of production vs other containers, I'd like to "package up" the container so that when new code is pushed to production, the app server only needs to restart, instead of npm install, etc. Particularly, once we're pushing to production, it should no longer have to pull anything from third party servers in order to run.
This is a bit of a broad question where answers will be opinionated rather than backed by objective arguments, but here's what I would change:
Node.js is fine, but I would choose Nginx instead of Apache. Both Node.js and Nginx are event-based and allow much more throughput, which is one of the advantages of Node.js. This might vary, for example if you need certain Apache-only modules, but Nginx seems more natural to put in front of Node.
Why do you want to have a separate container? To minimize the production container by it not having to have dev tools?
I really think that having, say, grunt.js in the production container is not too heavy, but again, you seem to be trying to minimize impact. Anyway, alternatively you can have both the code and grunt watch etc. inside one container and deploy like that. The pros are that you're simplifying setup; the cons are that your production build might install a few extra libs. You can mitigate that by, for example, setting NODE_ENV to production when deploying the production container so that on startup, your scripts will know not to load certain dev tools.
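On the earlier point about not rebuilding in production: one common pattern is to build and tag the image once in CI, push it to a registry, and only pull and run it on production hosts (the registry and image names below are placeholders):

    # CI: bake npm install into an immutable image and publish it
    docker build -t registry.example.com/myapp:1.4.2 .
    docker push registry.example.com/myapp:1.4.2

    # production: no build, no npm install, nothing fetched from third-party servers
    docker pull registry.example.com/myapp:1.4.2
    docker run -d -e NODE_ENV=production -p 80:3000 registry.example.com/myapp:1.4.2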
