I can't get my Dockerfile to cache my npm install. I have it set up the way all the examples specify, and package.json doesn't change, but it still downloads all the dependencies.
Here's what I have:
FROM mf/nodebox
# Maintainer
MAINTAINER Raif Harik <reharik@gmail.com>
RUN rm /bin/sh && ln -s /bin/bash /bin/sh
ADD /app/package.json /tmp/package.json
RUN cd /tmp && npm install && npm install -g babel
RUN cd /tmp && cp -a /tmp/node_modules /opt/app/current/node_modules
# Entrypoint to docker shell
ENTRYPOINT ["docker-shell"]
#this is the flag that tells the docker-shell what mode to execute
# Startup commands
CMD ["-r"]
# set WORKDIR
WORKDIR /opt/app/current
# Add shell script for starting container
ADD ./docker-shell.sh /usr/bin/docker-shell
RUN chmod +x /usr/bin/docker-shell
COPY /app /opt/app/current
Then the output I get is:
Building domain...
Step 0 : FROM mf/nodebox
---> 4ee7c51a410d
Step 1 : MAINTAINER Raif Harik <reharik@gmail.com>
---> Using cache
---> 78d0db67240c
Step 2 : RUN rm /bin/sh && ln -s /bin/bash /bin/sh
---> Using cache
---> d7d360d8f89a
Step 3 : ADD /app/package.json /tmp/package.json
---> 5b373dae5141
Removing intermediate container f037272f49c3
Step 4 : RUN cd /tmp && npm install && npm install -g babel
---> Running in cb89bb6fc2d0
npm WARN package.json MF_Domain@0.0.1 No description
So it's caching the first couple of commands, but the cache stops at Step 3 (the ADD of package.json), and then it goes to npm for Step 4.
Edit:
I guess I should mention that when I deploy a new change in the code (or, for my experimenting with this issue, just the same code), package.json stays the same but is copied over to the deploy folder. I don't know if Docker checks the creation date, the checksum, or does a diff. If it's the creation date, then maybe that's the issue.
The Docker documentation says:
In the case of the ADD and COPY instructions, the contents of the file(s) being put into the image are examined. Specifically, a checksum is done of the file(s) and then that checksum is used during the cache lookup. If anything has changed in the file(s), including its metadata, then the cache is invalidated.
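In older Docker versions that metadata included the file modification time (newer Docker documentation states that last-modified and last-accessed times are not considered in the checksum).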
There are tricks to get around this (for instance, see "docker add cache when git checkout same file").
See also the related discussion on the Docker GitHub project.
Related
I have a Dockerfile for deploying my React project. Here it is:
FROM node:14.2.0
COPY package.json /tmp/package.json
RUN cd /tmp && npm install --silent
RUN mkdir -p /home/node/app/ && cp -a /tmp/node_modules /home/node/app/
WORKDIR /home/node/app/
USER root
COPY . ./
I believed that if there were no change in package.json, Docker would use the cache for npm install (2nd step) and for copying node_modules (3rd step).
But even when package.json has not changed, it does not.
How do I get these steps cached?
Background:
I'm writing code in node.js, using npm and docker. I'm trying to get my docker file to use cache when I build it so it doesn't take too long.
We have a "common" repo that we use to keep logic shared across a variety of repositories, and it gets propagated as npm packages.
The problem:
I want the Dockerfile NOT to use the cache for my "common" package.
Docker file:
FROM node:12-alpine as X
RUN npm i npm@latest -g
RUN mkdir /app && chown node:node /app
WORKDIR /app
RUN apk add --no-cache python3 make g++ tini \
&& apk add --update tzdata
USER node
COPY package*.json ./
COPY .npmrc .npmrc
RUN npm install --no-optional && npm cache clean --force
ENV PATH /app/node_modules/.bin:$PATH
COPY . .
package.json has this line:
"dependencies": {
"@myorg/myorg-common-repo": "~1.0.13",
I have tried adding these lines in a variety of places and nothing seems to work:
RUN npm uninstall @myorg/myorg-common-repo && npm install @myorg/myorg-common-repo
RUN npm update @myorg/myorg-common-repo --force
Any ideas on how I can get Docker to build and not use the cache for @myorg/myorg-common-repo?
So I finally managed to solve this using this answer:
What we want to do is invalidate the cache for a specific block in the Docker file and then run our update command. This is done by adding a build argument to the command (CLI or Makefile) like so:
docker-compose -f docker-compose-dev.yml build --build-arg CACHEBUST=0
And then adding this additional block to the Dockerfile:
ARG CACHEBUST=1
USER node
RUN npm update @myorg/myorg-common-repo
This does what we want.
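When the value passed for CACHEBUST differs from the one used in the previous build, the cache is invalidated for every RUN instruction after the ARG CACHEBUST line, so the npm update command runs again instead of being served from cache. Passing the same value on every build lets Docker reuse the cache, so the value needs to change each time (a timestamp, for example).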
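As a rough sketch of where that block could sit, here is a trimmed version of the Dockerfile from the question. Dropping the apk and npm-upgrade steps is an assumption made for brevity, and the timestamp in the build command is just one hypothetical way to make the value differ between builds.

```dockerfile
# Sketch only: a trimmed version of the Dockerfile from the question.
# Build with a value that changes each time, for example:
#   docker build --build-arg CACHEBUST=$(date +%s) .
FROM node:12-alpine
RUN mkdir /app && chown node:node /app
WORKDIR /app
USER node
COPY package*.json ./
COPY .npmrc .npmrc
# Served from cache as long as package*.json and .npmrc are unchanged
RUN npm install --no-optional
# A new CACHEBUST value causes a cache miss at the next RUN, not at the ARG line
ARG CACHEBUST=1
RUN npm update @myorg/myorg-common-repo
COPY . .
```

Keeping the ARG line as late as possible matters here: everything above it can still be served from cache, and only the update step and the layers after it are rebuilt.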
As my code (nodeJS-application) is changing more often than the (npm) dependencies do, I've tried to build something like a cache in my CI.
I'm using a multi-stage Dockerfile. In that I run npm install for all, and only, prod dependencies. Later they are copied to the final image so that it is much smaller. Great.
Also, the build gets super fast if no dependency has changed.
However, over time the disk fills up, so I have to run docker prune ... to get the space back. But when I do this, the cache is gone.
So if I run a prune after each pipeline in my CI, I do not get the 'cache functionality' of the multi-stage Dockerfile.
### 1. Build
FROM node:10.13 AS build
WORKDIR /home/node/app
COPY ./package*.json ./
COPY ./.babelrc ./
RUN npm set progress=false \
&& npm config set depth 0 \
&& npm install --only=production --silent \
&& cp -R node_modules prod_node_modules
RUN npm install --silent
COPY ./src ./src
RUN ./node_modules/.bin/babel ./src/ -d ./dist/ --copy-files
### 2. Run
FROM node:10.13-alpine
RUN apk --no-cache add --virtual \
builds-deps \
build-base \
python
WORKDIR /home/node/app
COPY --from=build /home/node/app/prod_node_modules ./node_modules
COPY --from=build /home/node/app/dist .
EXPOSE 3000
ENV NODE_ENV production
CMD ["node", "app.js"]
If your CI system lets you have multiple docker build steps, you could split this into two Dockerfiles.
# Dockerfile.dependencies
# docker build -f Dockerfile.dependencies -t me/dependencies .
FROM node:10.13
...
RUN npm install
# Dockerfile
# docker build -t me/application .
FROM me/dependencies:latest AS build
COPY ./src ./src
RUN ./node_modules/.bin/babel ./src/ -d ./dist/ --copy-files
FROM node:10.13-alpine
...
CMD ["node", "app.js"]
If you do this, then you can delete unused images after each build:
docker image prune
The most recent build of the dependencies image will have a tag, so it won't be "dangling" and won't be removed by the prune. On each build the tag gets taken from the previous build (if the image changed), which leaves the previous build dangling, so this sequence cleans up previous builds. This will also delete the "build" images, though as you note, if anything changed to trigger a build it will probably be in the src tree, so forcing a rebuild there is reasonable.
In this specific circumstance, just using the latest tag is appropriate. If the final built images have some more unique tag (based on a version number or timestamp, say) and they're stacking up then you might need to do some more creative filtering of that image list to clean them up.
I have the following Dockerfile:
FROM ubuntu:14.04
#Install Node
RUN apt-get update -y
RUN apt-get upgrade -y
RUN apt-get install nodejs -y
RUN apt-get install nodejs-legacy -y
RUN apt-get install npm -y
RUN update-alternatives --install /usr/bin/node node /usr/bin/nodejs 10
# Create app directory
RUN mkdir -p /usr/src/app
WORKDIR /usr/src/app
# COPY distribution
COPY dist dist
COPY package.json package.json
# Substitute dependencies from environment variables
COPY entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]
EXPOSE 8000
And here is the entrypoint script
#!/bin/sh
cp package.json /usr/src/app/dist/
cd /usr/src/app/dist/
echo "starting server"
exec npm start
When I run the image it fails with this error
sh: 1: http-server: not found
npm ERR! weird error 127
npm WARN This failure might be due to the use of legacy binary "node"
npm WARN For further explanations, please read
/usr/share/doc/nodejs/README.Debian
I tried various kinds of installation but still get the same error. I also checked whether node_modules contains the http-server executable, and it does. I tried forcing 777 permissions on all the files but am still running into the same error.
What could be the problem?
It looks like you're just missing an npm install call somewhere, so neither the node_modules directory nor any of its contents (like http-server) is present on the image. After the COPY package.json package.json line, if you add a RUN npm install line, that might be all you need.
There are a few other things that could be simpler too though, like you probably don't need an ENTRYPOINT script to run the app and copy package.json since that's already done. Here's a simplified version of a Node Docker image I've been running with. I'm using the base Node images which, I believe, are Linux-based, but you could probably keep the Ubuntu stuff if you wanted to and it shouldn't be an issue.
FROM node:6.9.5
# Create non-root user to run app with
RUN useradd --user-group --create-home --shell /bin/bash my-app
# Set working directory
WORKDIR /home/my-app
COPY package.json ./
# Change user so that everything that's npm-installed belongs to it
USER my-app
# Install dependencies
RUN npm install --no-optional && npm cache clean
# Switch to root and copy over the rest of our code
# This is here, after the npm install, so that code changes don't trigger an un-caching
# of the npm install line
USER root
COPY .eslintrc index.js ./
COPY app ./app
RUN chown -R my-app:my-app /home/my-app
USER my-app
CMD [ "npm", "start" ]
It's good practice to make a specific user for owning/running your code and not using root, but, as I understand it, you need to use root to put files onto your image, hence the switching users a couple times here (which is what USER ... does).
I'll also note that I use this image with Docker Compose for local development, which is what the comment about code changes is referring to.
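If the Docker version in use supports the --chown flag on COPY (added in Docker 17.09), the switches back to root can probably be avoided. The following is a minimal sketch under that assumption, reusing the file names from the Dockerfile above. It is a variation for illustration, not the setup the answer was written with.

```dockerfile
FROM node:6.9.5

# Create non-root user to run app with
RUN useradd --user-group --create-home --shell /bin/bash my-app

WORKDIR /home/my-app

# COPY writes files as root by default; --chown hands them to my-app directly
COPY --chown=my-app:my-app package.json ./

USER my-app
RUN npm install --no-optional && npm cache clean

# Still after the npm install, so code changes do not invalidate that layer
COPY --chown=my-app:my-app .eslintrc index.js ./
COPY --chown=my-app:my-app app ./app

CMD [ "npm", "start" ]
```

This also drops the recursive chown step, which otherwise adds a layer that duplicates every file it touches.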
I am currently developing a Node backend for my application.
When dockerizing it (docker build .) the longest phase is the RUN npm install. The RUN npm install instruction runs on every small server code change, which impedes productivity through increased build time.
I found that running npm install where the application code lives and adding node_modules to the container with the ADD instruction solves this issue, but it is far from best practice. It rather breaks the whole idea of dockerizing the app, and it makes the image weigh much more.
Any other solutions?
Ok so I found this great article about efficiency when writing a docker file.
This is an example of a bad docker file adding the application code before running the RUN npm install instruction:
FROM ubuntu
RUN echo "deb http://archive.ubuntu.com/ubuntu precise main universe" > /etc/apt/sources.list
RUN apt-get update
RUN apt-get -y install python-software-properties git build-essential
RUN add-apt-repository -y ppa:chris-lea/node.js
RUN apt-get update
RUN apt-get -y install nodejs
WORKDIR /opt/app
COPY . /opt/app
RUN npm install
EXPOSE 3001
CMD ["node", "server.js"]
By splitting the copy of the application into two COPY instructions (one for the package.json file and the other for the rest of the files) and running npm install before adding the actual code, a code change won't trigger the RUN npm install instruction; only changes to package.json will. Better-practice Dockerfile:
FROM ubuntu
MAINTAINER David Weinstein <david@bitjudo.com>
# install our dependencies and nodejs
RUN echo "deb http://archive.ubuntu.com/ubuntu precise main universe" > /etc/apt/sources.list
RUN apt-get update
RUN apt-get -y install python-software-properties git build-essential
RUN add-apt-repository -y ppa:chris-lea/node.js
RUN apt-get update
RUN apt-get -y install nodejs
# use changes to package.json to force Docker not to use the cache
# when we change our application's nodejs dependencies:
COPY package.json /tmp/package.json
RUN cd /tmp && npm install
RUN mkdir -p /opt/app && cp -a /tmp/node_modules /opt/app/
# From here we load our application's code in, therefore the previous docker
# "layer" that's been cached will be used if possible
WORKDIR /opt/app
COPY . /opt/app
EXPOSE 3000
CMD ["node", "server.js"]
This is where the package.json file is added, its dependencies are installed, and the resulting node_modules are copied into the container WORKDIR, where the app lives:
ADD package.json /tmp/package.json
RUN cd /tmp && npm install
RUN mkdir -p /opt/app && cp -a /tmp/node_modules /opt/app/
To avoid the npm install phase on every docker build, just copy those lines and change /opt/app to the location where your app lives inside the container.
Weird! No one mentions multi-stage build.
# ---- Base Node ----
FROM alpine:3.5 AS base
# install node
RUN apk add --no-cache nodejs-current tini
# set working directory
WORKDIR /root/chat
# Set tini as entrypoint
ENTRYPOINT ["/sbin/tini", "--"]
# copy project file
COPY package.json .
#
# ---- Dependencies ----
FROM base AS dependencies
# install node packages
RUN npm set progress=false && npm config set depth 0
RUN npm install --only=production
# copy production node_modules aside
RUN cp -R node_modules prod_node_modules
# install ALL node_modules, including 'devDependencies'
RUN npm install
#
# ---- Test ----
# run linters, setup and tests
FROM dependencies AS test
COPY . .
RUN npm run lint && npm run setup && npm run test
#
# ---- Release ----
FROM base AS release
# copy production node_modules
COPY --from=dependencies /root/chat/prod_node_modules ./node_modules
# copy app sources
COPY . .
# expose port and define CMD
EXPOSE 5000
CMD npm run start
Awesome tutorial here: https://codefresh.io/docker-tutorial/node_docker_multistage/
I've found that the simplest approach is to leverage Docker's copy semantics:
The COPY instruction copies new files or directories from <src> and adds them to the filesystem of the container at the path <dest>.
This means that if you first explicitly copy the package.json file and then run the npm install step, that step can be cached, and you can copy the rest of the source directory afterwards. If the package.json file has changed, the copied file is new, so npm install re-runs and its result is cached for future builds.
A snippet from the end of a Dockerfile would look like:
# install node modules
WORKDIR /usr/app
COPY package.json /usr/app/package.json
RUN npm install
# install application
COPY . /usr/app
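A variation on the same idea, assuming the project commits a package-lock.json and uses an npm version that provides npm ci (5.7 or later): copy both manifest files first, so the install layer depends on the lockfile as well. The base image tag here is an illustrative choice, not something the snippet above specifies.

```dockerfile
FROM node:14.2.0
WORKDIR /usr/app

# Only these two files feed the cache key of the install layer
COPY package.json package-lock.json ./

# npm ci installs exactly what package-lock.json records
RUN npm ci

# Source changes invalidate the cache only from this point on
COPY . .
```

With this layout, editing application code leaves the npm ci layer cached, while any change to either manifest file triggers a fresh install.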
I imagine you may already know, but you could include a .dockerignore file in the same folder containing
node_modules
npm-debug.log
to avoid bloating your image when you push to Docker Hub.
You don't need to use a tmp folder. Just copy package.json to your container's application folder, run the install there, and copy all the other files later.
COPY app/package.json /opt/app/package.json
RUN cd /opt/app && npm install
COPY app /opt/app
I wanted to use volumes rather than COPY, and to keep using Docker Compose. I was able to do it by chaining the commands in the final CMD:
FROM debian:latest
RUN apt -y update \
&& apt -y install curl \
&& curl -sL https://deb.nodesource.com/setup_12.x | bash - \
&& apt -y install nodejs
RUN apt -y update \
&& apt -y install wget \
build-essential \
net-tools
RUN npm install pm2 -g
RUN mkdir -p /home/services_monitor/ && touch /home/services_monitor/
RUN chown -R root:root /home/services_monitor/
WORKDIR /home/services_monitor/
CMD npm install \
&& pm2-runtime /home/services_monitor/start.json