Bumping package.json version without invalidating docker cache - node.js

I'm using a pretty standard Dockerfile to containerize a Node.js application:
# Simplified version
FROM node:alpine
# Copy package.json first for docker build's layer caching
COPY package.json package-lock.json foo/
RUN npm install
COPY src/ foo/
RUN npm run build
Breaking up my COPY into two parts was advantageous because it allowed Docker to cache the (long) npm install step.
Recently, however, I started bumping my package.json version using semver. This had the side effect of invalidating the Docker cache for the npm install step, lengthening my build times significantly.
Is there an alternative caching strategy I can use so that npm install only runs when my dependencies change?

Here's my take on this, based on other answers, but shorter and using jq:
Dockerfile:
FROM endeveit/docker-jq AS deps
# https://stackoverflow.com/a/58487433
# To prevent cache invalidation from changes in fields other than dependencies
COPY package.json /tmp
RUN jq '{ dependencies, devDependencies }' < /tmp/package.json > /tmp/deps.json
FROM node:12-alpine
WORKDIR /app
COPY --from=deps /tmp/deps.json ./package.json
COPY package-lock.json .
RUN npm ci
# https://docs.npmjs.com/cli/ci.html#description
COPY . .
RUN npm run build
LABEL maintainer="Alexey Vishnyakov <n3tn0de@gmail.com>"
I extract the dependencies and devDependencies fields to a separate file, then in the next build stage I copy it from the previous stage as package.json (COPY --from=deps /tmp/deps.json ./package.json).
After RUN npm ci, COPY . . will overwrite the gutted package.json with the original one (you can verify this by adding RUN cat package.json after the COPY . . command).
Note that npm-scripts commands like postinstall won't run, since they're not present in the file during npm ci; they're also skipped when npm ci runs as root without --unsafe-perm.
Either run those commands after COPY . ., include them via jq (changing the command will invalidate the cache layer), and/or add --unsafe-perm.
Dockerfile:
FROM endeveit/docker-jq AS deps
COPY package.json /tmp
RUN jq '{ dependencies, devDependencies, peerDependencies, scripts: (.scripts | { postinstall }) }' < /tmp/package.json > /tmp/deps.json
# keep postinstall script
FROM node:12-alpine
WORKDIR /app
COPY --from=deps /tmp/deps.json ./package.json
COPY package-lock.json .
# RUN npm ci --unsafe-perm
# allow postinstall to run from root (security risk)
RUN npm ci
# https://docs.npmjs.com/cli/ci.html#description
RUN npm run postinstall
...
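To see what the jq filter leaves behind, the same extraction can be replicated in plain Node; this is a sketch with made-up package contents, not part of the build itself:

```javascript
// Hypothetical package.json contents, to illustrate what the jq filter
// '{ dependencies, devDependencies }' extracts
const pkg = {
  name: 'my-app',
  version: '1.2.3',
  scripts: { build: 'tsc' },
  dependencies: { express: '^4.17.1' },
  devDependencies: { jest: '^26.0.0' }
};

// Keep only the fields npm ci cares about, so a version bump
// no longer changes the file Docker hashes for its cache key
const deps = {
  dependencies: pkg.dependencies,
  devDependencies: pkg.devDependencies
};

console.log(JSON.stringify(deps));
```

Since name, version, and scripts are dropped, only edits to the dependency fields can invalidate the npm ci layer.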

You can add an additional "preparation" step in your Dockerfile that creates a temporary package.json where the "version" field is fixed. This file is then used while installing dependencies and afterwards replaced by the "real" package.json.
As all of this happens during the Docker build process, your actual source repository is not touched (so you can still use the environment variable npm_package_version both during your build and when running the docker script, e.g. for tagging), and the solution is portable:
Dockerfile:
# PREPARATION
FROM node:lts-alpine as preparation
COPY package.json package-lock.json ./
# Create temporary package.json where version is set to 0.0.0
# – this way the cache of the build step won't be invalidated
# if only the version changed.
RUN ["node", "-e", "\
const pkg = JSON.parse(fs.readFileSync('package.json', 'utf-8'));\
const pkgLock = JSON.parse(fs.readFileSync('package-lock.json', 'utf-8'));\
fs.writeFileSync('package.json', JSON.stringify({ ...pkg, version: '0.0.0' }));\
fs.writeFileSync('package-lock.json', JSON.stringify({ ...pkgLock, version: '0.0.0' }));\
"]
# BUILD
FROM node:lts-alpine as build
# Install deps, using temporary package.json from preparation step
COPY --from=preparation package.json package-lock.json ./
RUN npm ci
# Copy source files (including "real" package.json) and build app
COPY . .
RUN npm run build
If you think inlining the Node script is iffy (I like it, because this way the entire Docker build process can be found in the Dockerfile), you can of course extract it to a separate JS file:
create-tmp-pkg.js:
const fs = require('fs');
const pkg = JSON.parse(fs.readFileSync('package.json', 'utf-8'));
const pkgLock = JSON.parse(fs.readFileSync('package-lock.json', 'utf-8'));
fs.writeFileSync('package.json', JSON.stringify({ ...pkg, version: '0.0.0' }));
fs.writeFileSync('package-lock.json', JSON.stringify({ ...pkgLock, version: '0.0.0' }));
and change your preparation step to:
# PREPARATION
FROM node:lts-alpine as preparation
COPY package.json package-lock.json create-tmp-pkg.js ./
# Create temporary package.json where version is set to "0.0.0"
# – this way the cache of the build step won't be invalidated
# if only the version changed.
RUN node create-tmp-pkg.js
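The reason this keeps the cache warm can be checked directly: two package.json objects that differ only in their version serialize to identical bytes once the version is pinned, so the COPY layer checksum never changes. A minimal sketch (the package contents are made up):

```javascript
// Zero out the version field, as the preparation step does
const zeroVersion = (pkg) => JSON.stringify({ ...pkg, version: '0.0.0' });

// Two releases of the same hypothetical app, differing only in version
const releaseA = { name: 'app', version: '1.4.0', dependencies: { express: '^4' } };
const releaseB = { name: 'app', version: '1.5.2', dependencies: { express: '^4' } };

// Identical serialized output means an identical layer checksum,
// so the npm ci layer stays cached across version bumps
console.log(zeroVersion(releaseA) === zeroVersion(releaseB)); // true
```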

I spent some time thinking about this. Fundamentally, I'm cheating because the package.json file is, in fact, changed, which means anything that circumvents the cache invalidation technically makes the build not reproducible.
For my purposes, however, I care more about build time than strict cache correctness. Here's what I came up with:
build-artifacts.js
/*
Used to keep docker cache fresh despite package.json version bumps.
In this script
- copy package.json to package-artifact.json
- zero package.json version
In Docker
- copy package.json
- run npm install as normal
- copy package-artifact.json to package.json (undo-build-artifacts.js accomplishes this with a conditional check that package-artifact exists)
*/
const fs = require('fs');
const package = fs.readFileSync('package.json', 'utf8');
fs.writeFileSync('package-artifact.json', package);
const modifiedPackage = { ...JSON.parse(package), version: '0.0.0' };
fs.writeFileSync('package.json', JSON.stringify(modifiedPackage));
const packageLock = fs.readFileSync('package-lock.json', 'utf8');
fs.writeFileSync('package-lock-artifact.json', packageLock);
const modifiedPackageLock = { ...JSON.parse(packageLock), version: '0.0.0' };
fs.writeFileSync('package-lock.json', JSON.stringify(modifiedPackageLock));
undo-build-artifacts.js
const fs = require('fs');
const hasBuildArtifacts = fs.existsSync('package-artifact.json');
if (hasBuildArtifacts) {
const package = fs.readFileSync('package-artifact.json', 'utf8');
const packageLock = fs.readFileSync('package-lock-artifact.json', 'utf8');
fs.writeFileSync('package.json', package);
fs.writeFileSync('package-lock.json', packageLock);
fs.unlinkSync('package-artifact.json');
fs.unlinkSync('package-lock-artifact.json');
}
These two files serve to relocate package.json and package-lock.json, replacing them with artifacts that have zeroed-out versions. These artifacts will be used in the docker build, and will be replaced with the original versions upon npm install completion.
I run build-artifacts.js in a Travis CI before_script, and undo-build-artifacts.js in the Dockerfile itself (after npm install). undo-build-artifacts.js checks for the build artifacts first, meaning the Docker container can still build even if build-artifacts.js hasn't run. That keeps the container portable enough in my book. :)

I went about this a bit differently. I just ignore the version in package.json and leave it set to 1.0.0. Instead I add a file version.json, and I use a script like the one below for deploying.
Note that this approach won't work if you need to publish to npm, since the version there will never change.
version.json
{"version":"1.2.3"}
deploy.sh
#!/bin/sh
VERSION=`node -p "require('./version.json').version"`
#docker build
docker pull node:10
docker build . -t mycompany/myapp:v$VERSION
#commit version tag
git add version.json
git commit -m "version $VERSION"
git tag v$VERSION
git push origin
git push origin v$VERSION
#push Docker image to repo
docker push mycompany/myapp:v$VERSION
I normally just update the version file manually, but if you want something that works like npm version you can use a script like this one, which uses the semver package.
patch.js
var semver = require('semver')
var fs = require('fs')
var version = require('./version.json').version
var patch = semver.inc(version, 'patch')
fs.writeFile('./version.json', JSON.stringify({'version': patch}), (err) => {
if (err) {
console.error(err)
} else {
console.log(version + ' -> ' + patch)
}
})
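If you'd rather avoid the extra semver dependency for a plain patch bump, the increment itself is small enough to inline; this is a sketch assuming a bare x.y.z version with no prerelease or build suffix:

```javascript
// Bump the patch component of a bare x.y.z version string
// (no prerelease/build metadata handling, unlike the semver package)
function incPatch(version) {
  const [major, minor, patch] = version.split('.').map(Number);
  return `${major}.${minor}.${patch + 1}`;
}

console.log(incPatch('1.2.3')); // 1.2.4
```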

Based on n3tn0de's answer, I changed the Dockerfile to be:
######## Preparation
FROM node:12-alpine AS deps
COPY package.json package-lock.json ./
RUN npm version --allow-same-version 1.0.0
######## Building
FROM node:12-alpine
WORKDIR /app
COPY --from=deps package.json package-lock.json ./
RUN npm ci
COPY . .
EXPOSE 80
CMD ["npm", "start"]
This approach avoids using two different Docker images (less to download and less to store) and fixes/avoids any issues in package.json.

Another option: pnpm now has pnpm fetch, which uses only the lockfile, so you are free to make other changes to package.json without invalidating the cache.
This requires switching from npm/yarn to pnpm.
Example from: https://pnpm.io/cli/fetch
FROM node:14
RUN curl -f https://get.pnpm.io/v6.16.js | node - add --global pnpm
# pnpm fetch does require only lockfile
COPY pnpm-lock.yaml ./
RUN pnpm fetch --prod
ADD . ./
RUN pnpm install -r --offline --prod
EXPOSE 8080
CMD [ "node", "server.js" ]

Patching the version can be done without jq, using basic sed:
FROM alpine AS temp
COPY package.json /tmp
RUN sed -e 's/"version": "[0-9]\+\.[0-9]\+\.[0-9]\+",/"version": "0.0.0",/' \
    < /tmp/package.json > /tmp/package-v0.json
FROM node:14.5.0-alpine
....
COPY --from=temp /tmp/package-v0.json package.json
...
The sed regex assumes that the version value follows the semver scheme (e.g. 1.23.456).
The other assumption is that the "version": "xx.xx.xx", string does not occur elsewhere in the file. The trailing "," in the pattern helps lower the probability of false positives. To be safe, check the pattern against your package.json file first.
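You can sanity-check the substitution without Docker; here is the equivalent replacement expressed as a JS regex, run against made-up lines (the pattern mirrors the sed one, including the trailing comma):

```javascript
// Same substitution as the sed command, as a JS regex
const line = '  "version": "1.23.456",';
const patched = line.replace(/"version": "\d+\.\d+\.\d+",/, '"version": "0.0.0",');
console.log(patched); // '  "version": "0.0.0",'

// A prerelease version like 1.2.3-beta.1 does NOT match the pattern,
// so it would slip through unpatched; one reason to check your file first
const pre = '  "version": "1.2.3-beta.1",';
console.log(pre.replace(/"version": "\d+\.\d+\.\d+",/, '"version": "0.0.0",') === pre); // true
```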

Steps:
Remove the version from package.json
Install packages for production
Copy to the production image
Benefits:
You can freely patch package.json without invalidating the Docker cache
If the dependencies have not changed, no unnecessary npm install is run for production (packages don't change that frequently)
In practice:
# prepare package
FROM node:14-alpine AS package
COPY ./package.json ./package-lock.json ./
RUN node -e "['./package.json','./package-lock.json'].forEach(n => { \
let p = require(n); \
p.version = '0.0.0'; \
fs.writeFileSync(n, JSON.stringify(p)); \
});"
# install deps
FROM node:14-alpine AS build
COPY --from=package package*.json ./
RUN npm ci --only=production
# production
FROM node:14-alpine
...
COPY . .
COPY --from=build ./node_modules ./node_modules
...

Related

How to cache node_modules on Docker build with version?

To cache node_modules, I add package.json first, then run npm i inside the Docker image,
which works great. But I also need a version inside package.json, and on each deploy/build I increment the version number.
Because package.json has changed, Docker does not cache node_modules.
How can I cache node_modules in this scenario?
FROM node
# If needed, install system dependencies here
# Add package.json before rest of repo for caching
ADD package.json /app/
WORKDIR /app
RUN npm install
ADD . /app
# If needed, add additional RUN commands here
You can achieve this caching using a BUILD_VERSION build argument checked against the package.json version.
ARG BUILD_VERSION=0.0.0
Give BUILD_VERSION a default value; keep it the same as the package.json version to skip the npm installation step.
For example, if the version in package.json is 0.0.0, the build version should also be 0.0.0 to skip installation.
FROM node:alpine
WORKDIR /app
ARG BUILD_VERSION=0.0.0
COPY package.json /app/package.json
RUN echo BUILD_VERSION is $BUILD_VERSION and Package.json version is $(node -e "console.log(require('/app/package.json').version);")
RUN if [ "${BUILD_VERSION}" != "$(node -e "console.log(require('/app/package.json').version);")" ]; then \
      echo "ARG version and package.json version differ, installing node modules"; \
      npm install; \
    else \
      echo "npm installation skipped"; \
    fi
To skip the npm installation during the build, run the build command with:
docker build --no-cache --build-arg BUILD_VERSION=0.0.0 -t test-cache-image .
Now, if you want node_modules to be installed, just change the build argument and it will work as you expect, with more control than relying on cache tracking:
docker build --no-cache --build-arg BUILD_VERSION=0.0.1 -t test-cache-image .
This installs node_modules whenever the package.json version does not match the build version.
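The comparison that RUN step performs can be sketched in plain Node; the version values here are stand-ins, and in the image pkgVersion would come from require('/app/package.json').version:

```javascript
// Same decision the RUN if-statement makes:
// install only when the build arg and package.json version differ
function decide(buildVersion, pkgVersion) {
  return buildVersion !== pkgVersion ? 'npm install' : 'skip install';
}

console.log(decide('0.0.0', '0.0.0')); // skip install
console.log(decide('0.0.1', '0.0.0')); // npm install
```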

Docker + Nodejs Getting Error: Cannot find module "for a module that I wrote"

I am a Docker beginner.
I was able to implement Docker for my Node.js project, but when I try to pull it I get the error
Error: Cannot find module 'my_db'
(my_db is a module that I wrote that handles my MySQL functionality).
So I am guessing my modules are not bundled into the Docker image, right?
I moved my modules to a folder named my_node_modules/ so they won't be ignored.
I also modified the Dockerfile as follows:
FROM node:11.10.1
ENV NODE_ENV production
WORKDIR /usr/src/app
COPY ["package.json", "package-lock.json*", "npm-shrinkwrap.json*", "./my_node_modules/*", "./"]
RUN npm install --production --silent && mv node_modules ../
COPY . .
EXPOSE 3000
CMD node index.js
What am I missing?
Thanks
I would do something like this. First create a .dockerignore:
.git
node_modules
The above ensures that the node_modules folder is excluded from the actual build context.
You should add any temporary things to your .dockerignore. This will also speed up the actual build, since the build context will be smaller.
In my docker file I would then first only copy package.json and any existing lock file in order to be able to cache this layer:
FROM node:11.10.1
ENV NODE_ENV production
WORKDIR /usr/src/app
# Only copy package* before installing to make better use of cache
COPY package*.json .
RUN npm install --production --silent
# Copy everything
COPY . .
EXPOSE 3000
CMD node index.js
As I also wrote in my comment, I have no idea why you are doing mv node_modules ../; this moves the node_modules directory out of the /usr/src/app folder, which is not what you want.
It would also be nice to see how you are actually including your module.
If your own module resides in the folder my_node_modules/my_db, it will be copied when doing COPY . . in the above Dockerfile. Then in your index.js file you should be able to use the module like this:
const db = require('./my_node_modules/my_db');
The COPY . . step will override everything in the current directory, and copying node modules from the host is not recommended; it may break the container if the host binaries were compiled for Windows and you are using a Linux container.
So it is better to refactor your Dockerfile and install the modules inside Docker instead of copying them from the host.
FROM node:11.10.1
ENV NODE_ENV production
WORKDIR /usr/src/app
COPY . .
RUN npm install --production --silent
EXPOSE 3000
CMD node index.js
I would also suggest using a .dockerignore:
# add git-ignore syntax here of things you don't want copied into docker image
.git
*Dockerfile*
*docker-compose*
node_modules

npm unable to find correct version of package within Docker

I am attempting to perform npm install within a docker image. As part of the package.json, I need version 1.8.8 of react-pattern-library. Within the docker image, only version 0.0.1 appears to be available.
If I locally run
npm view react-pattern-library versions
I can see version 1.8.8
However, the same command within my Dockerfile only shows version 0.0.1.
Can anyone tell me what configuration setting I need to be able to find the correct version when attempting my docker build?
docker build -t jhutc/molly-ui
Contents of Dockerfile
FROM node:10
# Create app directory
WORKDIR /usr/src/app
# Install app dependencies
# A wildcard is used to ensure both package.json AND package-lock.json are copied
# where available (npm#5+)
#COPY package*.json ./
COPY package.json ./
RUN npm set strict-ssl false
ENV HTTP_PROXY="http://proxy.company.com:8080"
ENV HTTPS_PROXY="https://proxy.company.com:8080"
RUN echo $HTTP_PROXY
RUN echo $HTTPS_PROXY
RUN npm view react-pattern-library versions
#RUN npm install
Try deleting the package-lock.json and running npm install again.

How to get version value of package.json inside of Dockerfile?

This is what my Dockerfile looks like: first I copy the files, then I run npm install and npm build. As you can see, I do set the production ENV variable.
I would like to get the version of the current package.json file into the running Docker image, so I thought of using an ENV variable, e.g. VERSION.
My package.json file could look like this:
"version": "0.0.1"
"scripts": {
"version": "echo $npm_package_version"
}
So npm run version returns the version value, but I don't know how to use this result as an ENV in my Dockerfile:
COPY . /app
RUN npm install --silent
RUN npm run build
RUN VERSION=$(npm run version)
ENV NODE_ENV production
ENV VERSION ???
CMD ["node", "server.js"]
If you just need the version from inside of your node app.. require('./package.json').version will do the trick.
Otherwise, since you're already building your own container, why not make it easier on yourself and install jq? Then you can run VERSION=$(jq .version package.json -r).
Either way though, you cannot simply export a variable from a RUN command for use in another stage. There is a common workaround though:
FROM node:8-alpine
RUN apk update && apk add jq
COPY package.json .
COPY server.js .
RUN jq .version package.json -r > /root/version.txt
CMD VERSION=$(cat /root/version.txt) node server.js
Results from docker build & run:
{ NODE_VERSION: '8.11.1',
YARN_VERSION: '1.5.1',
HOSTNAME: 'xxxxx',
SHLVL: '1',
HOME: '/root',
VERSION: '1.0.0',
PATH: '/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin',
PWD: '/' }
Have you tried something like:
COPY . /app
RUN npm install --silent
RUN npm run build
RUN export VERSION=$(npm run version)
ENV NODE_ENV production
CMD ["node", "server.js"]
You can simply use npm pkg get version to retrieve the package version in the working directory.
In a Dockerfile that would be something like:
$(npm pkg get version)
or
RUN npm pkg get version
If you want other information from the package.json, change version to whatever key you want.
Try using a bash script (e.g. grep) in a RUN command to read the value and set it as an environment variable. Docker will take a snapshot with the variable set and you're good to go.
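A grep-style extraction like that can be sketched in Node against made-up file contents (in a RUN step you would read the real package.json instead):

```javascript
// Made-up package.json contents; in an image you would use
// fs.readFileSync('package.json', 'utf8') on the real file
const text = '{\n  "name": "app",\n  "version": "2.5.0",\n  "private": true\n}';

// Pull the version value out with a regex, grep/sed style
const match = text.match(/"version":\s*"([^"]+)"/);
console.log(match[1]); // 2.5.0
```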

docker build + private NPM (+ private docker hub)

I have an application which runs in a Docker container. It requires some private modules from the company's private NPM registry (Sinopia), and accessing these requires user authentication. The Dockerfile is FROM iojs:latest.
I have tried:
1) creating an .npmrc file in the project root; this actually makes no difference and npm seems to ignore it
2) using env variables for NPM_CONFIG_REGISTRY, NPM_CONFIG_USER etc., but the user doesn't log in.
Essentially, I seem to have no way of authenticating the user within the docker build process. I was hoping that someone might have run into this problem already (seems like an obvious enough issue) and would have a good way of solving it.
(To top it off, I'm using Automated Builds on Docker Hub (triggered on push) so that our servers can access a private Docker registry with the prebuilt images.)
Are there good ways of either:
1) injecting credentials for NPM at build time (so I don't have to commit credentials to my Dockerfile) OR
2) doing this another way that I haven't thought of
?
I found a somewhat elegant-ish solution: create a base image for your node.js / io.js containers (you/iojs):
log in to your private npm registry with the user you want to use for docker
copy the .npmrc file that this generates
Example .npmrc:
registry=https://npm.mydomain.com/
username=dockerUser
email=docker@mydomain.com
strict-ssl=false
always-auth=true
//npm.mydomain.com/:_authToken="someAuthToken"
create a Dockerfile that copies the .npmrc file appropriately.
Here's my Dockerfile (based on iojs:onbuild):
FROM iojs:2.2.1
MAINTAINER YourSelf
# Exclude the NPM cache from the image
VOLUME /root/.npm
# Create the app directory
RUN mkdir -p /usr/src/app
WORKDIR /usr/src/app
# Copy npm config
COPY .npmrc /root/.npmrc
# Install app
ONBUILD COPY package.json /usr/src/app/
ONBUILD RUN npm install
ONBUILD COPY . /usr/src/app
# Run
CMD [ "npm", "start" ]
Make all your node.js/io.js containers FROM you/iojs and you're good to go.
In 2020 we've got BuildKit available. You don't have to pass secrets via COPY or ENV anymore, as it's not considered safe.
Sample Dockerfile:
# syntax=docker/dockerfile:experimental
FROM node:13-alpine
WORKDIR /app
COPY package.json yarn.lock ./
RUN --mount=type=ssh --mount=type=secret,id=npmrc,dst=$HOME/.npmrc \
yarn install --production --ignore-optional --frozen-lockfile
# More stuff...
Then, your build command can look like this:
docker build --no-cache --progress=plain --secret id=npmrc,src=/path-to/.npmrc .
For more details, check out: https://docs.docker.com/develop/develop-images/build_enhancements/#new-docker-build-secret-information
For those who found this via Google and are still looking for an alternative that doesn't involve leaving private npm tokens in your Docker images and containers:
We were able to get this working by running npm install prior to the docker build (this lets you keep your .npmrc outside of your image/container). Once the private modules have been installed locally, you can copy your files across to the image as part of your build:
# Make sure the node_modules contain only the production modules when building this image
COPY . /usr/src/app
You also need to make sure that your .dockerignore file doesn't exclude the node_modules folder.
Once you have the folder copied into your image, the trick is to do npm rebuild instead of npm install. This will rebuild any native dependencies that are affected by any differences between your build server and your Docker OS:
FROM nodesource/vivid:LTS
# For application location, default from nodesource is /usr/src/app
# Make sure the node_modules contain only the production modules when building this image
COPY . /usr/src/app
WORKDIR /usr/src/app
RUN npm rebuild
CMD npm start
I would recommend not using a .npmrc file but instead use npm config set. This works like a charm and is much cleaner:
ARG AUTH_TOKEN_PRIVATE_REGISTRY
FROM node:latest
ARG AUTH_TOKEN_PRIVATE_REGISTRY
ENV AUTH_TOKEN_PRIVATE_REGISTRY=${AUTH_TOKEN_PRIVATE_REGISTRY}
WORKDIR /home/usr/app
RUN npm config set @my-scope:registry https://my.private.registry && npm config set '//my.private.registry/:_authToken' ${AUTH_TOKEN_PRIVATE_REGISTRY}
RUN npm ci
CMD ["bash"]
The BuildKit answer is correct, except it runs everything as root, which is considered bad security practice.
Here's a Dockerfile that works and uses the correct user node as the node Dockerfile sets up. Note the secret mount has the uid parameter set, otherwise it mounts as root which user node can't read. Note also the correct COPY commands that chown to user:group of node:node
FROM node:12-alpine
USER node
WORKDIR /home/node/app
COPY --chown=node:node package*.json ./
RUN --mount=type=secret,id=npm,target=./.npmrc,uid=1000 npm ci
COPY --chown=node:node index.js .
COPY --chown=node:node src ./src
CMD [ "node", "index.js" ]
@paul-s's answer should be the accepted one now because it's more recent, IMO. Just as a complement: you mentioned you're using the docker/build-push-action action, so your workflow must look like the following:
- uses: docker/build-push-action@v3
  with:
    context: .
    # ... all other config inputs
    secret-files: |
      NPM_CREDENTIALS=./.npmrc
And then, of course, bind the .npmrc file from your dockerfile using the ID you specified. In my case I'm using a Debian based image (uid starts from 1000). Anyways:
RUN --mount=type=secret,id=NPM_CREDENTIALS,target=<container-workdir>/.npmrc,uid=1000 \
npm install --only=production
