Jenkins dynamic slaves on Kubernetes - Very high build time - node.js

Background - Migrating a project from CircleCI to Jenkins.
Project technology - typescript (node.js)
I have deployed Jenkins on a freshly created GKE cluster using the official Jenkins Helm chart, taking advantage of dynamic slaves.
The issue I am facing is with one of my applications: it is a group of four microservices that are built and deployed together as a project.
Since all the apps are built and shipped together, I have set up a Jenkins parallel build pipeline that pulls the repo and builds all the applications in parallel to save build time (copying the same logic from the existing CircleCI setup).
In CircleCI it normally takes five to seven minutes to build the app, whereas in Jenkins it takes more than 20 minutes.
I suspected a resource limitation on the node, so I moved to a very high-spec node and monitored the build with the kubectl top pods command; I noticed it never uses more than 3 CPUs during the entire build process.
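(For reference, a check along these lines shows whether the agent pod itself carries a CPU request/limit that could throttle the build even when the node has spare capacity; the pod and node names are placeholders.)
# Inspect the resources section of a running Jenkins agent pod
kubectl get pod <jenkins-agent-pod> -o jsonpath='{.spec.containers[*].resources}'
# Compare with what is actually allocated on the node
kubectl describe node <node-name> | grep -A5 "Allocated resources"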
For further debugging, I thought it could be an IOPS issue, since the project pulls a lot of node modules, so I switched the node disk to SSD for testing, but no luck.
For further debugging, I started provisioning a dynamic PV with every slave that Jenkins spawns, and no luck again.
I am not sure what I am missing; I checked docker stats and the Kubernetes logs, but everything looks normal.
I am running docker build like this (for 4 different applications):
docker build --build-arg NODE_HEAP_SIZE=8096 --build-arg NPM_TOKEN=$NPM_TOKEN -f "test/Dockerfile" -t "test:123" .
This is what my Dockerfile looks like:
FROM node:10.19.0 AS node
WORKDIR /etc/xyz/test
COPY --from=gcr.io/berglas/berglas:0.5.0 /bin/berglas /usr/local/bin/berglas
COPY docker-entrypoint.sh /
ENTRYPOINT ["/docker-entrypoint.sh"]
#
# development stage used in conjunction with docker-compose for local development
#
FROM node AS dev
ENV NODE_ENV="development"
COPY new/package.json new/package-lock.json ../new/
RUN (cd ../new && npm install)
COPY brand/package.json brand/package-lock.json ../brand/
RUN (cd ../brand && npm install)
COPY chain/package.json chain/package-lock.json ./
RUN npm install
COPY chain ./
COPY new ../new/
COPY brand ../brand/
#
# production stage that compiles and runs production artifacts
#
FROM dev AS prod
ENV NODE_ENV="production"
ARG NODE_HEAP_SIZE="4096"
RUN NODE_OPTIONS="--max-old-space-size=${NODE_HEAP_SIZE}" npm run build:prod
To verify the network bandwidth on the nodes, I started an Ubuntu container and ran a network test; the bandwidth is fine.
I even tried passing --cache-from to improve caching during the build, but no luck here as well.
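For reference, a typical --cache-from pattern pulls the previous image first so its layers are available as cache; the registry path here is a placeholder:
# Pull the last pushed image so its layers can be reused as cache
docker pull gcr.io/<project>/test:latest || true
docker build --cache-from gcr.io/<project>/test:latest \
  --build-arg NODE_HEAP_SIZE=8096 --build-arg NPM_TOKEN=$NPM_TOKEN \
  -f "test/Dockerfile" -t "test:123" .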
I have even tried changing NODE_HEAP_SIZE to a very high value but did not get any improvement.
I have seen that most of the time is spent in npm install, npm ci, or npm run build.
Adding further investigation:
I have tried running the same build steps directly on a VM, and also by spinning up a docker container on that VM and running the docker build inside it; both take significantly less time than the Jenkins dynamic slaves. The time difference is more or less double on the dynamic slaves.
The maximum time is spent in the npm install and npm ci steps.
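For completeness, flags along these lines are sometimes used to trim time off npm in CI; whether they help depends on registry latency and the state of the npm cache:
# Skip the audit round-trip and prefer already-cached packages where possible
npm ci --prefer-offline --no-audit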
I don't understand how CircleCI is able to build it faster.
Can someone help me with what else should I debug?

Without checking the logs it is hard to say what is happening in Jenkins. Please take a look at this article about global Jenkins logs and configuring additional log recorders.
I had a similar problem with dynamic Jenkins slaves in AWS, because the "Amazon EC2" plugin's developers changed security settings and it took ~15 minutes to check the SSH keys.

Related

Docker Container yarn install ENOMEM: not enough memory

I'm using docker compose on a remote server, and in the entrypoint of one of the services I have a shell script that runs
yarn install --production --frozen-lockfile
This had to be done because I'm downloading the project files from git inside the entrypoint, using SSH keys for GitHub added from my local machine. The project files are only downloaded if they are not already available in the volume.
I'm using the user home folder /home/appuser as a volume so that it's shared with other services, mounting it like this:
volumes:
- appuser-home-store:/home/appuser
yarn install inside this entrypoint has been very unstable; it randomly works, but most of the time it gives this error:
error https://registry.yarnpkg.com/workbox-background-sync/-/workbox-background-sync-6.5.4.tgz: Extracting tar content of undefined failed, the file appears to be corrupt: "ENOMEM: not enough memory, open '/home/appuser/.cache/yarn/v6/npm-workbox-background-sync-6.5.4-3141afba3cc8aa2ae14c24d0f6811374ba8ff6a9-integrity/node_modules/workbox-background-sync/LICENSE'"
Almost every time, the path after open is different.
In the build I'm using
FROM node:lts
I have tried
export NODE_OPTIONS=--max-old-space-size=8192
But I don't think this is relevant to my problem, and it didn't work anyway.
I have a total of 200 GB of storage on this server, and I see that only 6% is used. I have tried many things: deleting the volume, docker system prune, yarn cache clean and more, but none of them were successful. How do I fix this?
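Since the error is ENOMEM rather than anything disk-related, a quick check along these lines (the container name is a placeholder) shows whether the service container itself is memory-constrained:
# Live memory usage of the container
docker stats --no-stream <service-container>
# Configured memory limit in bytes; 0 means unlimited
docker inspect -f '{{.HostConfig.Memory}}' <service-container>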
This is a genuine error and a known issue on the yarn GitHub:
https://github.com/yarnpkg/berry/issues/4373
The error in that issue is not exactly the same as mine, which is why I could not find it at first, but one of the solutions given on the issue thread worked.
The solution is to set the yarn version to canary, which can be done using
yarn set version canary

Docker in Jenkins and private modules

I'm looking for a way to securely clone private npm modules from a proxy repository inside a Docker container that is spun up by a Jenkins that runs on Ubuntu. The Docker image will be thrown away, but it is supposed to compile the project and run the unit tests.
The Jenkinsfile used for the build looks, simplified, like this:
node('master') {
    stage('Checkout from version control') {
        checkout scm
    }
    stage('Build within Docker') {
        docker.build("intermediate-image", ".")
    }
}
The Dockerfile at the moment:
FROM node:10-alpine
COPY package.json package-lock.json .npmrc ./
RUN npm ci && \
rm -f .npmrc
COPY . .
RUN npm run build && \
npm run test
The .npmrc file (anonymized):
@domain:registry=https://npm.domain.com/
//npm.domain.com/:_authToken=abcdefg
The problem is that the COPY command creates a layer with the .npmrc file. Should I build outside of my own Jenkins server, the layer would be cached by the build provider.
Building manually, I could specify the token as a docker environment variable. Is there a way to set the environment variable on Ubuntu and have Jenkins pass it through to Docker?
(Maybe) I could inject environment variables into Jenkins and then into the pipeline? The user claims that the plugin is not fully compatible with the pipeline plugin though.
Should I use the fact that Docker and Jenkins run on the same machine and mount something into the container?
Or do I worry too much, considering that the image will not be published and the Jenkins is private too?
What I want to achieve is that a build can use an arbitrary node version, independent of the build server's.
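For reference, the build-argument alternative mentioned above might look roughly like this on the Jenkins host; the variable handling is an assumption, not the setup that was actually chosen below:
# NPM_TOKEN is assumed to be injected by Jenkins (e.g. via a credentials binding)
docker build --build-arg NPM_TOKEN="$NPM_TOKEN" -t intermediate-image .
# The Dockerfile would then declare ARG NPM_TOKEN and write a temporary .npmrc from it
Note that build arguments still show up in docker history, so this avoids caching the .npmrc layer rather than hiding the token entirely.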
I have decided that, because the docker host is the same (virtual) machine as the Jenkins host, it is no problem if I bake the .npmrc file into a docker layer.
Anyone with access to the Docker host can, currently, steal the local .npmrc token anyway.
Furthermore, the group that has access to our private npm modules is a strict subset of the people with access to the source control repository. Therefore, exposing the npm token to the build machine, Jenkins, the intermediate Docker image, the Docker image layer and/or the repository poses no additional authentication problems as of now. Revoking access should then go hand in hand with rotating the .npmrc token (so that removed developers cannot use the build token), but that is a small attack surface, in any case way smaller than people copying the code to a hard drive.
We will have to re-evaluate our options should this setup change. Hopefully we will find a solution then, but it is not worth the trouble now. One possible solution could be requesting the token from a different docker container whose sole purpose is answering these (local) calls.

Deploying NodeJS with MongoDB on Docker

I am building a NodeJS application using MongoDB as the database. I am thinking that it will make more sense, in terms of portability across different platforms and also versioning and comparison, to have the application deployed in Docker. Going through various recommendations on the internet, here are my specific questions:
(a) Do I copy my application code (nodejs) into Docker? Or do I keep the source code on the host machine and make the code base available to Docker using volumes? (Just for experimenting, I had a Dockerfile instruction pulling the code from the repository directly into the image. It works, but is it good practice, or should I pull the code outside the docker container and make it available to the container using volumes / copy the code?)
(b) When I install all my application dependencies, my node_modules size explodes to almost 250 MB. So would you recommend running npm install (for dependencies) as a Docker step, which will increase the size of my image? Or is there another alternative that you can recommend?
(c) For connecting to the database, what is the recommendation? Would you recommend using another docker container with a MongoDB image and defining the dependency between the web and the db using docker? Along with that, having a configurable runtime property so that the app in different environments (PROD, STAGE, DEV) can connect to a different database (mongodb)?
Thoughts / suggestions greatly appreciated. I am sure I am asking questions which all of you have run into at some point and have adopted different approaches to, with their pros and cons.
Do I copy my application code (nodejs) into Docker? Or do I keep the source code on the host machine and make the code base available to Docker using volumes?
You should have the nodejs code inside the container. Keeping the source code on your machine makes your image non-portable, since if you switch to another machine you need to copy the code there.
You can also pull the code directly into the container if you have git installed inside it, but remember to remove the .git folder to keep the image smaller.
When I install all my application dependencies, my node_modules size explodes to almost 250 MB. So would you recommend running npm install (for dependencies) as a Docker step, which will increase the size of my image? Or is there another alternative that you can recommend?
That is node pulling in packages from all over the internet; you have to install your dependencies. However, you should run npm cache clean --force after the install to clean up and keep the image smaller.
For connecting to the database, what is the recommendation? Would you recommend using another docker container with a MongoDB image and defining the dependency between the web and the db using docker? Along with that, having a configurable runtime property so that the app in different environments (PROD, STAGE, DEV) can connect to a different database (mongodb)?
It is a good idea to create a separate container for the database and connect your app to it using docker networks. You can have multiple DBs at the same time, but preferably keep one db container inside the network; if you want to use another one, just remove the old one and add the new one to the network.
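A minimal sketch of that setup, assuming the connection string is passed in as an environment variable (all names here are illustrative):
# One user-defined network shared by the app and the database
docker network create app-net
docker run -d --network app-net --name mongo mongo:4
# The same app image can point at a different database per environment (PROD/STAGE/DEV)
docker run -d --network app-net -e MONGO_URL="mongodb://mongo:27017/app_prod" my-node-app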
A
During development
Using a directory on the host is fast. You modify your code, relaunch the docker image, and your app starts quickly.
Docker image for production/deployement
It is good to pull the code from git. It's heavier to run, but easier to deploy.
B
During development
Don't run npm install inside docker; you can handle the dependencies manually.
Docker image for production/deployement
Run a single npm i when building the image, because it's supposed to be static anyway.
More explanation
When you are developing, you change your code, use a new package, adapt your package.json, update packages ...
You basically need to control what happens with npm. It is easier to interact with it if you can directly execute command lines and access the files (outside docker, in a local directory). You make your change, you relaunch your docker, and it gets started!
When you are deploying your app, you don't need to interact with npm modules. You want a packaged application with an appropriate version number and release date that does not move and that you can rely on.
Because npm is not 100% trustworthy, it can happen that, with the exact same package.json, something pulled in by npm i makes the application crash. So I would not recommend running npm i at every application relaunch or deployment, because if some package gets broken you have to rush to find a fix. Moreover, there is no need at all to reload packages that should be exactly the same (they should be!). Deployment is not the time to update packages; that belongs in your development environment, where you can npm update safely and test everything.
C
Use two docker images and connect them using a docker network, so you can easily deploy your app anywhere.
Some commands that may help with Docker networking (I'm actually using this at my company):
# Create your own network with docker
sudo docker network create --subnet=172.42.0.0/24 docker-network
# Run the mongodb container
sudo docker run -i -t --net docker-network --ip 172.42.0.2 -v ~/DIRECTORY:/database mongodb-docker
# Run the app container
sudo docker run -i -t --net docker-network --ip 172.42.0.3 -v ~/DIRECTORY:/local-git backend-docker

npm install with a docker-compose project

I have a dockerized project that has three apps and three databases. The three apps are written in node and use npm as usual.
I have a script that clones the three repos; docker-compose.yaml brings up the three containers and uses a Dockerfile for each of the three projects to basically just do an npm install and run them.
This is all working fine, but the whole point of this exercise is to make the cluster of projects easy to set up and run for development. Actually working on the project code is not a problem, since it gets cloned by the developer, but npm install is done through docker and therefore as root. This means that node_modules in the repos is owned by root.
A developer cannot simply run npm install to add a new package to the repo, because they won't have permissions on node_modules, and the module would possibly be built for a different architecture depending on their host system.
I have thought about creating a script that runs npm install in the container instead, but this has a couple of caveats:
root would own package.json
This breaks a typical node developer's flow ... they are used to just doing npm install
Like I said above, the whole point of this is to make it as easy to jump in and develop as possible, so I want to get as close to a common development experience as I can.
Are there any suggestions for handling installation of node modules in a docker container for development of a project?
This is a common problem with mounted source folders. The best solution I have come up with so far is to simply match the uid/gid of the host user to some fixed user in the container. Until recently you had to resort to external tools and dockerfile/compose templating; with recent docker-compose versions (>= 1.6.0) you can now do the following:
Dockerfile:
FROM busybox
ARG HOST_UID=1000
RUN adduser -D -H -u ${HOST_UID} -s /bin/sh npm
USER npm
RUN echo "i'm $(whoami) and have uid: ${HOST_UID}"
Notice the ARG directive. The value of HOST_UID is passed at build time via docker build --build-arg HOST_UID=${UID}. Then just add a custom npm user with the value of HOST_UID as its uid and set it as the default USER for all following commands.
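Assuming bash on the host, the build invocation might look like this; using id -u is a safe fallback in case ${UID} is not set in your shell:
# Pass the host uid into the image build so the 'npm' user inside matches it
docker build --build-arg HOST_UID=$(id -u) -t my-dev-image .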
--build-arg is now also supported by docker-compose and the new version 2 yml format:
version: '2'
services:
  foo:
    build:
      context: .
      args:
        HOST_UID: ${UID}
Provided UID is set on your host, docker-compose up foo will build the image with a default user that matches your uid on the host. The important lesson I learned there was that the uid/gid is all that matters for permissions, the actual user/group names are irrelevant.
Another technique I have used a few times is to replace the uid of a fixed user in /etc/passwd via sed on container start, if a certain env var is set. This avoids image rebuilds and is suitable for images that are expected to run straight from some repository.
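A minimal sketch of that /etc/passwd trick, assuming an entrypoint script, a fixed user named npm, and an env var named HOST_UID (all of these names are assumptions):
#!/bin/sh
# If HOST_UID is set, rewrite the uid/gid of the 'npm' user before handing off to the real command
if [ -n "$HOST_UID" ]; then
    sed -i "s/^npm:x:[0-9]*:[0-9]*:/npm:x:${HOST_UID}:${HOST_UID}:/" /etc/passwd
fi
exec "$@"
Dropping privileges to that user afterwards (e.g. with su-exec or gosu) is omitted here.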
Lastly, I would recommend fully embracing the docker philosophy, meaning your devs should only use the project containers for tasks like npm install. You avoid the inevitable version mismatches and other headaches down the road.

Run Grunt / Gulp inside Docker container or outside?

I'm trying to identify a good practice for the build process of a nodejs app using grunt/gulp to be deployed inside a docker container.
I'm pretty happy with the following sequence:
build using grunt (or gulp) outside container
add ./dist folder to container
run npm install (with --production flag) inside container
But in every example I find, I see a different approach:
add ./src folder to container
run npm install (with dev dependencies) inside container
run bower install (if required) inside container
run grunt (or gulp) inside container
IMO, the first approach generates a lighter and more efficient container, but all of the examples out there are using the second approach. Am I missing something?
I'd like to suggest a third approach, which I have used for a statically generated site: the separate build image.
In this approach, your main Dockerfile (the one in the project root) becomes a build and development image, basically doing everything in the second approach. However, you override the CMD at run time to tar up the built dist folder into a dist.tar or similar.
Then, you have another folder (something like image) that has a Dockerfile. The role of this image is only to serve up the dist.tar contents, so we docker cp <container_id_from_tar_run>:/dist.tar into that folder, and the Dockerfile just installs our web server and has an ADD dist.tar /var/www.
The abstract is something like:
Build the builder Docker image (which gets you a working environment without a webserver). At this point, the application is built. We could run the container in development with grunt serve or whatever command starts our built-in development server.
Instead of running the server, we override the default command to tar up our dist folder, something like tar -cf /dist.tar /myapp/dist.
We now have a temporary container with a /dist.tar artifact. Copy it to your actual deployment Docker folder (the one we called image) using docker cp <container_id_from_tar_run>:/dist.tar ./image/.
Now, we can build the small Docker image without all our development dependencies with docker build ./image.
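Put together, the flow sketched above might look roughly like this; the image and path names are illustrative:
# 1. Build the builder image containing the dev toolchain and the compiled dist folder
docker build -t myapp-builder .
# 2. Run it, overriding the default command to tar up dist
docker run --name myapp-build myapp-builder tar -cf /dist.tar -C /myapp dist
# 3. Copy the artifact out of the stopped container, next to the runtime Dockerfile
docker cp myapp-build:/dist.tar ./image/
# 4. Build the slim runtime image that only ADDs dist.tar
docker build -t myapp ./image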
I like this approach because it is still all Docker. All the commands in this approach are Docker commands and you can really slim down the actual image you end up deploying.
If you want to check out an image with this approach in action, check out https://github.com/gliderlabs/docker-alpine which uses a builder image (in the builder folder) to build tar.gz files that then get copied to their respective Dockerfile folder.
The only difference I see is that you can reproduce a full grunt installation in the second approach.
With the first one, you depend on a local action which might be done differently on different environments.
A container should be based on an image that can be reproduced easily, instead of depending on a host folder which contains "what is needed" (without knowing how that part was produced).
If the build-environment overhead that comes with the installation is too much for a grunt image, you can:
create an image "app.tar" dedicated to the installation (I did that for Apache, which I had to recompile, creating a deb package in a shared volume).
In your case, you can create an archive ('tar') of the installed app.
create a container from a base image, using the volume from that first container:
docker run -it --name=app.inst --volumes-from=app.tar ubuntu tar -xf /shared/path/app.tar
docker commit app.inst app
The end result is an image with the app present on its filesystem.
This is a mix between your approach 1 and 2.
A variation of solution 1 is to have a "parent -> child" image setup that makes the build of the project really fast.
I would have a dockerfile like:
FROM node
RUN mkdir app
COPY dist/package.json app/package.json
WORKDIR app
RUN npm install
This will handle the installation of the node dependencies; then have another dockerfile that handles the application "installation", like:
FROM image-with-dependencies:v1
ENV NODE_ENV=prod
EXPOSE 9001
COPY dist .
ENTRYPOINT ["npm", "start"]
With this you can continue your development, and the "build" of the docker image is going to be faster than it would be if you had to "re-install" the node dependencies. If you add new node dependencies, just re-build the dependencies image.
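Assuming the two Dockerfiles above are kept as separate files (the file names here are an assumption), the day-to-day commands would be roughly:
# Rebuild only when package.json changes
docker build -t image-with-dependencies:v1 -f Dockerfile.deps .
# Fast rebuild on every code change, layered on top of the dependencies image
docker build -t myapp:latest -f Dockerfile.app .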
I hope this helps someone.
Regards
