I'm looking for a way to securely clone private npm modules from a proxy repository inside a Docker container that is spun up by a Jenkins that runs on Ubuntu. The Docker image will be thrown away, but it is supposed to compile the project and run the unit tests.
The Jenkinsfile used for the build looks, simplified, like this:
node('master') {
stage('Checkout from version control') {
checkout scm
}
stage('Build within Docker') {
docker.build("intermediate-image", ".")
}
}
The Dockerfile at the moment:
FROM node:10-alpine
COPY package.json package-lock.json .npmrc ./
RUN npm ci && \
rm -f .npmrc
COPY . .
RUN npm run build && \
npm run test
The .npmrc file (anonymized):
#domain:registry=https://npm.domain.com/
//npm.domain.com/:_authToken=abcdefg
The problem is that the COPY command creates a layer with the .npmrc file. Should I build outside of my own Jenkins server, the layer would be cached by the build provider.
Building manually, I could specify the token as a docker environment variable. Is there a way to set the environment variable on Ubuntu and have Jenkins pass it through to Docker?
(Maybe) I could inject environment variables into Jenkins and then into the pipeline? The user claims that the plugin is not fully compatible with the pipeline plugin though.
Should I use the fact that Docker and Jenkins run on the same machine and mount something into the container?
Or do I worry too much, considering that the image will not be published and the Jenkins is private too?
What I want to achieve is that a build can use an arbitrary node version that is independent of that of the build server's.
I have decided that, because the docker host is the same (virtual) machine as the Jenkins host, it is no problem if I bake the .npmrc file into a docker layer.
Anyone with access to the Docker host can, currently, steal the local .npmrc token anyway.
Furthermore, the group that has access to our private npm modules is a complete subgroup of people with access to the source control repository. Therefore, exposing the npm token to the build machine, Jenkins, Docker intermediate image, Docker image layer and/or repository poses no additional authentication problems as of now. Revoking access should then go hand in hand with rotating the npmrc token (so that removed developers do not use the build token), but that is a small attack surface, in any case waay smaller than people copying the code to a hard drive.
We will have to re-evaluate our options should this setup change. Hopefully, we will find a solution then, but it is not worth the trouble now. One possible solution could be requesting the token from a different docker container with the sole purpose of answering these (local) calls.
Related
Most/all examples I see online usually copy package.json into the image and then run npm install within the image. Is there a deal breaker reason for not running npm install from outside on the build server and then just copying everything including the node_modules/ folder?
My main motivation for doing this is that, we are using a private npm registry, with security, and running npm from within an image, we would need to figure out how to securely embed credentials. Also, we are using yarn, and we could just leverage the yarn cache across projects if yarn runs on the build server. I suppose there's workarounds for these, but running yarn/npm from the build server where everything is already set up seems very convenient.
thanks
Public Dockerfiles out there are trying to provide generalized solution.
Having dependencies coded in package.json makes it possible to share only one Dockerfile and not depend on anything not public available.
But at runtime Docker does not care how files got to container. So this is up to you, how you push all needed files to your container.
P.S. Consider layering. If you copy stuff under node_modules/, do it in one step, by that only one layer is used.
I am building a NodeJS application using MongoDB as the database. I am thinking that it will make more sense in terms of portability across different platforms and also versioning and comparison to have the application deployed in Docker. Going through various recommendations on internet, here are my specific questions :
(a) Do I copy my application code (nodejs) within Docker? Or do I keep Source code on the host machine and have the code base available to Docker using Volumes? (Just for experimenting, I had docker file instruction pulling the code from repository within the image directly. It works, but is it a good practice, or should I pull the code outside the docker container and make it available to docker container using Volumes / copy the code)?
(b) When I install all my application dependencies, my node_module size explodes to almost 250 MB. So would you recommend that run npm install (for dependencies) as Docker step, which will increase the size of my image ? Or is there any other alternative that you can recommend?
(c) For connecting to the database, what will be the recommendation? Would you recommend, using another docker container with MongoDB image and define the dependency between the web and the db using docker? Along with that have configurable runtime property such that app in different environments (PROD, STAGE, DEV) can have the ability to connect to different database (mongodb).
Thoughts / suggestions greatly appreciated. I am sure, I may be asking questions which all of you may have run into at some point in time and have adopted different approaches, with pros and cons.
Do I copy my application code (nodejs) within Docker? Or do I keep
Source code on the host machine and have the code base available to
Docker using Volumes?
You should have the nodejs code inside the container. Keeping the source code on your machine will make your image not portable since if you switch to another machine, you need to copy the code there.
You can also pull the code directly into the container if you have git installed inside the container. But remember to remove the .git folder to have a smaller image.
When I install all my application dependencies, my node_module size
explodes to almost 250 MB. So would you recommend that run npm install
(for dependencies) as Docker step, which will increase the size of my
image ? Or is there any other alternative that you can recommend?
This is node pulling over all the internet. You have to install you dependencies. However, you should run npm cache clean --force after the install to do some clean up to have a smaller image
For connecting to the database, what will be the recommendation? Would
you recommend, using another docker container with MongoDB image and
define the dependency between the web and the db using docker? Along
with that have configurable runtime property such that app in
different environments (PROD, STAGE, DEV) can have the ability to
connect to different database (mongodb)
It is a good idea to create a new container for the database and connect your app to the database using docker networks. You can have multiple DB at the same time, but preferably keep one db container inside the network, and if you want to use another one, just remove the old one and add the new one to the network.
A
During development
Using a directory in the host is fast. You modify your code, relaunch the docker image and it will start your app quickly.
Docker image for production/deployement
It is good to pull the code from git. it's heavier to run, but easier to deploy.
B
During development
Don't run npm install inside docker, you can handle the dependencies manually.
Docker image for production/deployement
Make a single npm i in image building, because it's supposed to be static anyway.
More explanation
When you are developing, you change your code, use a new package, adapt your package.json, update packages ...
You basically need to control what happen with npm. It is easier to interact with it if you can directly execute commands lines and access the files (outside docker in local directory). You make your change, you relaunch your docker and it get started!
When you are deploying your app, you don't have the need to interact with npm modules. You want a packaged application with an appropriate version number and release date that do not move and that you can rely on.
Because npm is not 100% trustworthy, it happen that with exact same package.json some stuff you get as you npm i makes the application to crash. So I would not recommend to use npm i at every application relaunch or deployement, because imagine some package get fucked up, you gotta rush to find out a soluce. Moreover there is no need at all to reload packages that should be the exact same (they should!). It's not in deployement that you want to update the package! But in your developement environment where you can npm update safely and test everything up.
(Sorry about english!)
C
Use two docker image and connect them using a docker network. So you can deploy easily your app anywhere.
Some commands to help maybe about Docker networking! (i'm actually using it in my company)
// To create your own network with docker
sudo docker network create --subnet=172.42.0.0/24 docker-network
// Run the mondogb docker
sudo docker run -i -t --net docker-network --ip 172.42.0.2 -v ~/DIRECTORY:/database mongodb-docker
// Run the app docker
sudo docker run -i -t --net docker-network --ip 172.42.0.3 -v ~/DIRECTORY:/local-git backend-docker
I wrote an ASP.NET Core application which should run in a container using docker. This works, but the hole build process is relatively slow. The main bottleneck seems to be nuget. A lot of packages are referenced, and it take time to load all of them from the internet. This is done on every build since docker alwas start a new container.
My idea is to create a persistent dictionary on the host, where the packages are stored. So they don't have to be fetched on every build. dotnet restore has a parameter --packages where I can define a cache directory. But for this its required to pass a shared dictionary to the docker build command.
I found out that docker run has a -v parameter where i can pass /host/path:/container/path to share a folder from the host to the container. But this only works for docker run, not docker build. Also the COPY command doesn't fit here since it let me only copy files from the host to the container. First I had to copy the other way round (container to host).
So how can I create a cache directory which doesn't got disposed together with the container?
I found similar issues like this. Its composer there, but the problem of a persistent cache directory is the same. They use the -v parameter on docker run. But I can't understand how this solve the problem: In my understanding of docker, the dockerfile should build the application. This includes installing dependencies like NuGet-Packages for ASP.NET Core, bower and similar. So this should happen in the dockerfile, not when running the container.
If you absolutely must restore packages within the container first then the best you can do is re-use intermediate or previously built docker images that already have the packages restored.
If you use the Visual Studio generated Dockerfile it already does the best it can to re-use the intermediate image that includes the package cache. This is accomplished by copying the .csproj file over first, restoring packages, and then copying over the source files. That way if you didn't change your package references (basically the only thing that changes in .csproj) then docker simply uses the intermediate image it created on the previous build after restoring packages.
You could also create a base image that has packages restored already and occasionally update it. Here's how you'd do that:
1. Build your Dockerfile: docker build .
2. Tag the intermediate container that has the package cache (the one created by dotnet restore): docker tag {intermediate image id} {your project name}:base-1
3. Update Dockerfile to use {your project name}:base-1 as the base image FROM {your project name}:base-1
4. If you're using a build system then publish the base image: docker publish {your project name}:base-1
5. Periodically update your base image, rolling the version number.
I have a dockerized project that has three apps and three databases. The three apps are written in node and use npm as usual.
I have a script that clones the three repos, docker-compose.yaml mounts the three containers and uses a Dockerfile for each of the three projects to basically just do an npm install and run them.
This is all working fine, but the whole point of this exercise is to make the cluster of projects easy to set up and run for the purposes of development. Actually working on the project code is not a problem since it gets cloned by the developer, but npm install is done through docker and thus root. This means that node_modules in the repos is owned by root.
A developer cannot simply do npm install to add a new package to the repo because they won't have permissions on node_modules and the module would possibly be built with a different architecture depending on their host system.
I have thought about creating a script that runs npm install in the container instead, but this has a couple of caveats:
root would own package.json
This breaks a typical node developer's flow ... they are used to just doing npm install
Like I said above, the whole point of this is to make it as easy to jump in and develop as possible, so I want to get as close to a common development experience as I can.
Are there any suggestions for handling installation of node modules in a docker container for development of a project?
A common problem with mounted source folders, the best solution I have come up with so far is to simply match the uid/gid of the host user to some fixed user in the container. Until recently one had to resort to some external tools and dockerfile/compose templating, with the latest docker-compose versions (>=1.6.0) you can do the following now:
Dockerfile:
FROM busybox
ARG HOST_UID=1000
RUN adduser -D -H -u ${HOST_UID} -s /bin/sh npm
USER npm
RUN echo "i'm $(whoami) and have uid: ${HOST_UID}"
Notice the ARG directive. The value of HOST_UID is passed at runtime via docker build --build-arg HOST_UID=${UID}. Then just add a custom npm user with the value of HOST_UID as its uid and set it as default USER for all following commands.
--build-arg is now also supported by docker-compose and the new version 2 yml format:
version: '2'
services:
foo:
build:
context: .
args:
HOST_UID: ${UID}
Provided UID is set on your host, docker-compose up foo will build the image with a default user that matches your uid on the host. The important lesson I learned there was that the uid/gid is all that matters for permissions, the actual user/group names are irrelevant.
Another technique I used a few times is to replace the uid of a fixed user in /etc/passwd/ via sed on container start, if a certain env is set. This avoids image rebuilds and is suitable for images that are expected to run straight from some repository.
Lastly I would recommend to fully embrace the docker philosophy, meaning your devs should only use the project containers for tasks like npm install. You avoid the inevitable version mismatch and other headaches down the road.
I'm trying to identify a good practice for the build process of a nodejs app using grunt/gulp to be deployed inside a docker container.
I'm pretty happy with the following sequence:
build using grunt (or gulp) outside container
add ./dist folder to container
run npm install (with --production flag) inside container
But in every example I find, I see a different approach:
add ./src folder to container
run npm install (with dev dependencies) inside container
run bower install (if required) inside container
run grunt (or gulp) inside container
IMO, the first approach generates a lighter and more efficient container, but all of the examples out there are using the second approach. Am I missing something?
I'd like to suggest a third approach that I have done for a static generated site, the separate build image.
In this approach, your main Dockerfile (the one in project root) becomes a build and development image, basically doing everything in the second approach. However, you override the CMD at run time, which is to tar up the built dist folder into a dist.tar or similar.
Then, you have another folder (something like image) that has a Dockerfile. The role of this image is only to serve up the dist.tar contents. So we do a docker cp <container_id_from_tar_run> /dist. Then the Dockerfile just installs our web server and has a ADD dist.tar /var/www.
The abstract is something like:
Build the builder Docker image (which gets you a working environment without webserver). At thist point, the application is built. We could run the container in development with grunt serve or whatever the command is to start our built in development server.
Instead of running the server, we override the default command to tar up our dist folder. Something like tar -cf /dist.tar /myapp/dist.
We now have a temporary container with a /dist.tar artifact. Copy it to your actual deployment Docker folder we called image using docker cp <container_id_from_tar_run> /dist.tar ./image/.
Now, we can build the small Docker image without all our development dependencies with docker build ./image.
I like this approach because it is still all Docker. All the commands in this approach are Docker commands and you can really slim down the actual image you end up deploying.
If you want to check out an image with this approach in action, check out https://github.com/gliderlabs/docker-alpine which uses a builder image (in the builder folder) to build tar.gz files that then get copied to their respective Dockerfile folder.
The only difference I see is that you can reproduce a full grunt installation in the second approach.
With the first one, you depend on a local action which might be done differently, on different environments.
A container should be based in an image that can be reproduced easily instead of depending on an host folder which contains "what is needed" (not knowing how that part has been done)
If the build environment overhead which comes with the installation is too much for a grunt image, you can:
create an image "app.tar" dedicated for the installation (I did that for Apache, that I had to recompile, creating a deb package in a shared volume).
In your case, you can create an archive ('tar') of the app installed.
creating a container from a base image, using the volume from that first container
docker run --it --name=app.inst --volumes-from=app.tar ubuntu untar /shared/path/app.tar
docker commit app.inst app
Then end result is an image with the app present on its filesystem.
This is a mix between your approach 1 and 2.
A variation of the solution 1 is to have a "parent -> child" that makes the build of the project really fast.
I would have dockerfile like:
FROM node
RUN mkdir app
COPY dist/package.json app/package.json
WORKDIR app
RUN npm install
This will handle the installation of the node dependencies, and have another dockerfile that will handle the application "installation" like:
FROM image-with-dependencies:v1
ENV NODE_ENV=prod
EXPOSE 9001
COPY dist .
ENTRYPOINT ["npm", "start"]
with this you can continue your development and the "build" of the docker image is going to be faster of what it would be if you required to "re-install" the node dependencies. If you install new dependencies on node, just re-build the dependencies image.
I hope this helps someone.
Regards