Forbid npm update in Docker environment - node.js

guys,
For various projects, I'm creating single Docker environments. Each Docker container consists of Debian, Nginx, Node.js, etc. and is going to be used by developers as well as in production via Google Cloud's Kubernetes. Since the Node.js/module versions should be the same everywhere, I would like to restrict access to certain npm commands (somehow). Developers often work with different Node.js and project module versions, and that has caused a lot of trouble in the past. With the Docker containers, I can provide environments with everything you need for a project. To finish this step, I would like to restrict npm command execution and only allow arguments like install, test, etc.
Please drop me a comment if you know how to resolve this :)
Cheers

It is almost impossible to stop your developers from running certain commands in the container if they have access to the Dockerfiles and can change the build flow.
But because containers provide isolation, and you can build a custom container for each application on top of your base image, it is not a big problem if a package version changes for one application (for example, in a build step): it will not affect the other apps. They simply have different containers.
So you will not have the compatibility problems you get when one server hosts many applications that share an environment.
The only thing you need to do is make sure that nobody changes the container you are using as a base image.
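If you still want a soft guard against an accidental npm update, one option is a small wrapper that sits ahead of the real npm on the PATH. The following is only a sketch: the allow list beyond install and test, the REAL_NPM path, and the function names are assumptions of mine, and anyone who can edit the Dockerfile can still bypass it.

```shell
#!/bin/sh
# Soft guard for npm: pass allowed subcommands through, refuse the rest.
# REAL_NPM is an assumed path to the real npm binary inside the image.
REAL_NPM="${REAL_NPM:-/usr/local/bin/npm}"

# Return 0 if the subcommand is on the allow list (the list is an assumption).
npm_allowed() {
    case "${1:-}" in
        install|ci|test|run|start) return 0 ;;
        *) return 1 ;;
    esac
}

# Run the real npm for allowed subcommands, otherwise print an error.
npm_guard() {
    if npm_allowed "${1:-}"; then
        "$REAL_NPM" "$@"
    else
        echo "npm ${1:-}: not permitted in this image" >&2
        return 1
    fi
}
```

Saved as a script named npm in a directory that comes before the real one on the PATH, and ending with the line npm_guard "$@", this would let npm install and npm test through and reject npm update.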

Related

Separate environments for learning or trying out vs production (sandboxes?)

Can you suggest a way of separating learning/trying things out from production on the same computer? I know a lot of JS and have production-ready skills, but I sometimes need to probe or try out simpler stuff or basics. I presume that a lot of engineers are in a similar place.
This is the situation I am facing right now:
I wanted to install redis and configure it while trying out something interesting.
In a separate project I needed another clean redis configuration and installation.
On the front-end side I tried out and installed a few npm packages globally.
At some point I installed python 3.4; now I require 3.6.
At some point I installed nginx and configured it; now I need another configuration and have to wipe the previous one out.
If I start a big project right now, I feel like my computer will eventually let me down because of the several attempts I have made previously.
Et cetera. All of these create friction in both my learning and exploration.
Now, it crosses my mind to use separate VirtualBox installations for trying things out, but that answer is trivial; please suggest something else.
P.S.: I am using Linux Mint.
You can install and use Docker, which is also trivial.
However, if your environment is Linux, you can also use LXC.
There isn't really a single good answer to this sort of question of course; but some things that are generally a good idea are:
use git repos to keep the source "backed up" (obviously your local pc should not be the git server); commit your changes all the time, if you can't hold your breath for as long as the timespan between 2 commits, then you're doing it wrong (or you may have asthma, see a doctor).
Always build your project with there being not just multiple, but a variable number of "deployments" in mind. That means not hardcoding absolute paths and database names/ports/hostnames and things like that. If your project needs database/API credentials, then those should be in a config file of sorts (or in the env); that config file should be stored outside the codebase and shouldn't be checked into your git repos (though there can of course be a config template in there).
Always have at least 2 deployments of any project actually deployed. Next to the (obvious) "live"/"production" deployment, which your clients/users use, you want a "dev" version for yourself where you can freely shit the bed, and for bigger projects you may well want multiple. Each deployment would have its own database, and its own copy of the code/assets.
It can be useful to deploy everything inside podman or docker containers. That makes it easier to have a near-identical system in both development and production (in case those are different servers), but that may be too much overhead for you.
Have a method (maybe a script) that makes it very easy to deploy updates from your git repo or dev deployment to the production deployment. Based on your description, I'm guessing that if a client tells you she wants some minor cosmetic changes done, you do them straight on the live version; very convenient and fast, but a horrible thing in practice. Once you switch from that workflow to having a separate dev deployment, you'll feel slowed down by that (which you are), but if you optimize that workflow over time you'll get to the point where you can still deploy cosmetic changes in a minute or so while having fully separated deployments. It is worth the time investment.
Have a personal devtools git repo or something similar. You're likely using an IDE such as VS Code? Back up your VS Code user config in that repo and update it reasonably frequently. Use a text editor, Photoshop or another image editor, etc.? Same deal. You hear that ticking sound? That's the bomb that's been placed on your motherboard. It might go off tonight, it might not go off for years, but you never know. Always expect it could be today or tomorrow, so have stuff backed up externally and/or on offline media.
There's a lot more but those are some of the basics that spring to mind.
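To make the "no hardcoded database names/ports/hostnames" point concrete, here is a minimal sketch of reading those settings from the environment with development defaults. The APP_DB_* variable names, the default values, and the db_target function are all hypothetical.

```shell
#!/bin/sh
# Read deployment settings from the environment, falling back to dev defaults.
# The APP_DB_* variable names and the defaults are hypothetical.
db_target() {
    host="${APP_DB_HOST:-localhost}"
    port="${APP_DB_PORT:-5432}"
    name="${APP_DB_NAME:-myapp_dev}"
    # Print host:port/name so callers can build a connection string from it.
    printf '%s:%s/%s\n' "$host" "$port" "$name"
}
```

Each deployment would then set these variables in its own environment (or source a config file that is not tracked in git), so the same code runs unchanged everywhere.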
I thought Docker was only for containerizing your app with all the installation files and configurations before pushing to production.
Docker is useful whenever you need to configure the runtime environment in an isolated manner. Production, local development, other environments - all need the same runtime. All benefit from the runtime definition and isolation that docker provides. Arguably docker is even more useful in workstation-centric development, than it is in production.
I wanted to install redis and configure it while trying out something interesting.
Instead of installing redis on your os directly, run the preexisting docker image for redis.
In a separate project I needed another clean redis configuration and installation.
Instantiate the docker image again and now you have 2 isolated redis servers running locally.
In front-end side I tried and installed a few npm packages globally.
Run your npm code within a Node.js docker container.
At some point I installed python 3.4; now I require 3.6.
Needing different versions of python is a great use case for docker containers: the official images are tagged with specific python versions.
At some point I installed nginx and configured it, now need another configuration and wipe the previous one out,
Nginx also has a very useful official container.
If I start a big project right now I feel like my computer will eventually let me down due to several attempts I previously done
Yeah, it gets messy quick. That's why docker is such a great solution. Give every project dedicated services and use docker-compose to simplify the networking and building components. Fight the temptation to use a docker container for more than one service - instead stitch them together with docker networks.
Read https://docs.docker.com/get-started/overview/ to get started with docker.
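As a rough illustration of the redis and python points above, the two helpers below wrap plain docker run calls against the official redis and python images. The function names, container names, and the /work mount point are placeholders of mine, not anything prescribed by Docker.

```shell
#!/bin/sh
# start_redis NAME HOST_PORT
# Start one isolated Redis server from the official image, published on HOST_PORT.
start_redis() {
    docker run -d --name "$1" -p "$2:6379" redis
}

# python_in_docker VERSION [ARGS...]
# Run a throwaway Python of the given version with the current directory mounted.
python_in_docker() {
    version="$1"
    shift
    docker run --rm -v "$PWD:/work" -w /work "python:$version" python "$@"
}
```

For example, start_redis redis-project-a 6379 followed by start_redis redis-project-b 6380 gives two independent servers, and docker rm -f redis-project-a throws one away without touching the host. Likewise, python_in_docker 3.6 script.py runs a script under 3.6 without installing that version locally.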

Google Cloud Run and system capabilities

I have a docker image which I am running on Google's Cloud Run.
When I want to run the image locally, I have to give my container additional capabilities like the following:
docker run -p 8080:8080 --cap-add=SYS_ADMIN gcr.io/my-project/my-docker-image
Is there a way of configuring Docker's capabilities in Cloud Run?
I stumbled upon this piece of API documentation from Google, but I don't know how to configure my container. I am not even sure that it is relevant to my situation.
Any help would be really appreciated.
Expanding the POSIX capabilities is not an option on Cloud Run or Cloud Run on GKE, because doing so would increase the security exposure of the underlying host.
Adding capabilities is often the easiest way to make something with special system demands work. More complex but frequently doable are modifications within the container environment or to the package configuration to get things working.
If what you're trying to do absolutely requires cap-add, this might be addressed in a feature request to the software package... or it may be a novel use case that Cloud Run cannot support but may in the future with your feedback.

Docker for a one shot CLI application

Since I first knew of Docker, I thought it might be the solution for several problems we are usually facing at the lab. I work as a Data Analyst for a small Biology research group. I am using Snakemake for defining the -usually big and quite complex- workflows for our analyses.
From Snakemake, I usually call small scripts in R, Python, or even Command Line Applications such as aligners or annotation tools. In this scenario, it is not uncommon to suffer from dependency hell, hence I was thinking about wrapping some of the tools in Docker containers.
At this moment I am stuck at a point where I do not know if I have chosen technology badly, or if I am not able to properly assimilate all the information about Docker.
The problem is related to the fact that you have to run the Docker tools as root, which is something I would not like to do at all, since the initial idea was to make the dockerized applications available to every researcher willing to use them.
In AskUbuntu, the most voted answer proposes to add the final user to the docker group, but it seems that this is not good for security. In the security articles at Docker, on the other hand, they explain that running the tools as root is good for your security. I have found similar questions at SO, but related to the environment inside the container.
Ok, I have no problem with this, but as with every moderate-complexity example I happen to find, it seems to be more oriented towards web-application development, where the system could initially start the container once and then forget about it.
Things I am considering right now:
Configuring the Docker daemon as a TLS-enabled, TCP remote service, and provide the corresponding users with certificates. Would there be any overhead in running the applications? Security issues?
Create images that only make available the application to the host by sharing a /usr/local/bin/ volume or similar. Is this secure? How can you create a daemonized container that does not need to execute anything? The only example I have found implies creating an infinite loop.
The nucleotid.es page seems to do something similar to what I want, but I have not found any reference to security issues. Maybe they are running all the containers inside a virtual machine, where they do not have to worry about these issues, due to the fact that they do not need to expose the dockerized applications to more people.
Sorry about my verbosity. I just wanted to write down the mental process (possibly flawed, I know, I know) where I am stuck. To sum up:
Is there any possibility to create a dockerized command line application which does not need to be run using sudo, is available for several people in the same server, and which is not intended to run in a daemonized fashion?
Thank you in advance.
Regards.
If users are able to execute docker run, they are able to control the host system: they can map files from the host into a container, and inside the container they can always be root if they can use docker run or docker exec. So users should not be able to execute docker directly. I think the easiest solution here is to create scripts which run docker; these scripts could either have the suid flag set or users could be given sudo access to them.
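A minimal sketch of such a wrapper, taking the sudo route (Linux ignores the suid bit on interpreted scripts, so sudo is the more practical of the two options). The image name lab/aligner:1.0, the /data mount point, and the function name are hypothetical.

```shell
#!/bin/sh
# Wrapper that runs one fixed image; users never call docker themselves.
# IMAGE is a hypothetical image name.
IMAGE="lab/aligner:1.0"

run_tool() {
    # Under sudo, run as the invoking user so output files are not owned by root.
    uid="${SUDO_UID:-$(id -u)}"
    gid="${SUDO_GID:-$(id -g)}"
    # Only the current directory is mounted; all other docker options are fixed.
    docker run --rm --user "$uid:$gid" -v "$PWD:/data" -w /data "$IMAGE" "$@"
}
```

Installed as a root-owned script that ends with run_tool "$@", it could be opened up through a sudoers entry limited to that single script, so researchers can run the tool without being in the docker group. Because the docker options are fixed in the script, users cannot choose arbitrary mounts or run as root inside the container. Whether that is tight enough depends on what the image itself lets its arguments do.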

Docker and grunt-based workflow automation

I'm looking for a Docker-based project setup which enables:
1. Development environment to most closely match production
2. Best of breed workflow automation tools for all developers
3. Highly portable/quick to set-up development environment, which supports Linux, OSX, and Windows. Currently we use Vagrant and that seems to be the most obvious choice still.
To satisfy #1:
Same app container (node.js + Apache) for dev, test, staging and production
Do not add any custom workflow tools to the container just for development's sake
To satisfy #3:
Do not require developers to all install their own dev tools for their respective environments/OSes (e.g. getting them to install node.js, npm, grunt, etc within the host)
So then to still satisfy #2, the idea I have is:
have a second "dev" container which shares files with the node/apache container and runs all the workflow automation.
run all the grunt watch/rebuild/reload/browser-sync etc from within that.
If using Vagrant, the file sharing would essentially go as host->dev container->app container
Are there any flaws in the above model, or perhaps better ideas?
One potentially missing point is whether to - and if so then how to - avoid performing a full build of containers in production each time. Without risking a mismatch of production vs other containers, I'd like to "package up" the container so that when new code is pushed to production, the app server only needs to restart, instead of npm install, etc. Particularly, once we're pushing to production, it should no longer have to pull anything from third party servers in order to run.
This is a rather broad question where answers will be opinionated rather than backed by objective arguments, but here's what I would change there:
Node.js is fine, but I would choose nginx instead of Apache. Both Node.js and Nginx are event-based and allow much more throughput, which is one of the advantages of Node.js. This might vary, for example if you need certain Apache-only modules, but Nginx seems more natural to put in front of Node.
Why do you want to have a separate container? To minimize the production container by it not having to have dev tools?
I really think that having, say, grunt.js in the production container is not too heavy, but again, you seem to be trying to minimize impact. Anyway, alternatively you can have both the code and grunt watch etc. inside one container and deploy like that. The pros are that you're simplifying the setup; the cons are that your production build might install a few extra libs. You can mitigate that by, for example, setting NODE_ENV to production when deploying the production container, so that on startup your scripts will know not to load certain dev tools.
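A startup script along those lines might look like the sketch below. It assumes the app is started with node server.js and that grunt watch is the dev tool in question; both names are placeholders.

```shell
#!/bin/sh
# Start the app; only launch dev tooling when NODE_ENV is not "production".
# server.js and the grunt "watch" task are hypothetical names.
start_app() {
    if [ "${NODE_ENV:-development}" != "production" ]; then
        # Development only: run the grunt watcher in the background.
        grunt watch &
    fi
    node server.js
}
```

The same image can then be started with NODE_ENV=production in production and without it on a developer machine.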

NodeJS Production Deployment Best Practice

I'm looking for ways in which to deploy some web services into production in a consistent and timely manner.
I'm currently implementing a deployment pipeline that will end with a manual deployment action of a specific version of the software to a number of virtual machines provisioned by Ansible. The idea is to provision x number of instances using version A whilst already having y number of instances running version B. Then image and flick the traffic over. The same mechanism should allow me to scale new vms in a set using the image I already made.
I have considered the following options but was wondering if there's something I'm overlooking:
TGZ
The CI environment would build a tarball from a project that has passed unit tests and integration tests. Optionally, dependencies would be bundled (removing the need to run npm install on the production machine and to rely on network connectivity to a public or private npm repository).
My main issue here is that any dependencies that depend on system libraries would be built on a different machine (albeit the same image). I don't like this.
NPM
The CI environment would publish to a private NPM repository and the Ansible deployment script would check out a specific version after provisioning. Again, this suffers from a reliance on external services being available when you want to deploy. I don't like this.
Git
Any system dependent modules become globally installed as part of provisioning and all other dependencies are checked into the repository. This gives me the flexibility of being able to do differential deployments whereby just the deltas are pushed and the application daemon can be restarted automatically by the process manager almost instantly. Dependencies are then absolutely locked down.
This would mean that there's no need to spin up a new VM unless I need to scale. Deployments can be pushed straight to all active instances.
First and foremost, regardless of the deployment method, you need to make sure you don't drop requests while deploying new code. One simple approach is removing the node from a load balancer prior to switchover. Before doing so, you may also want to try and evaluate if there are pending requests, open connections, or anything else negatively impacted by premature termination. Or perhaps something like the up module.
Most people would not recommend source controlling your modules. It seems that a .tgz with your node_modules already filled in from an npm install while utilizing a bundledDependencies declaration in your package.json might cover all your concerns. With this approach, an npm install on your nodes will not download and install everything again. Though, it will rebuild node-gyp implementations which may cover your system library concern.
You can also make use of git tags to more easily keep track of versions with specific dependencies and payloads. Manually deploying the code may get tedious, so you may want to consider automating the routine by iterating over the known server entries in a database from an interface. docker.io may be of interest.

Resources