docker-node: Running as non-root user, file permissions

Following docker-node’s best practices, I want to run my node app as a non-root user. The recommendation is as follows:
FROM node:6.10.3
...
# At the end, set the user to use when running this image
USER node
My simplified Dockerfile currently looks like this:
FROM node:6.10.3
WORKDIR /opt/app
COPY package.json .
RUN npm install
COPY . .
EXPOSE 3000
USER node
CMD ["node", "server.js"]
So, all the files added during image build are owned by root, but node server.js is run as the node user. This seems to work fine.
My question: Is there any additional security benefit from chown-ing the files so that they belong to node instead of root? I.e. doing something like:
RUN chown -R node:node .

It definitely does. However, I would also remove the chown binary (as well as all other admin tools). This would make things harder for someone who gains access to the container, e.g. as root. See here for a related answer.
Also, see this Dockerfile for inspiration.

Related

Should I dockerize a Django app as non-root?

Should I dockerize a Django app as the root user? If not, how can I set up a non-root user for Django? I ask because Node.js images are expected to set USER node, which is considered better practice.
The code example from the official Docker page does not set up a non-root user:
FROM python:3
ENV PYTHONUNBUFFERED=1
WORKDIR /code
COPY requirements.txt /code/
RUN pip install -r requirements.txt
COPY . /code/

It's generally a good practice.
At the start of your Dockerfile, before you COPY anything in, create the user. It doesn't need to have any specific properties and it doesn't need to match any specific host user ID. The only particular reason to do this early is to avoid repeating it on rebuilds.
At the end of your Dockerfile, after you run all of the build steps, only then switch USER to the new user. The code and any installed libraries will be owned by the root user; and that's good, because it means the application can't accidentally overwrite the application code.
FROM python:3
# Create the non-root user. Doing this before any COPY means it won't
# be repeated on rebuild, for marginal savings in space and rebuild time.
# The user can have any name and any uid; it does not need to match any
# particular host system where the image might run.
RUN adduser --system --no-create-home someuser
# Install the application as in the question (still as root).
ENV PYTHONUNBUFFERED=1
WORKDIR /code
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
# Explain how to run the container. Only switch to the non-root user now.
EXPOSE 8000
USER someuser
CMD ["./main.py"]
Do not try to write files inside the container; instead, use a separate database container for persistence. Do not pass a host user ID as a build argument. Do not configure a password for the user or otherwise attempt to set up interactive logins. Do not create a home directory; it won't be used.

Store NodeJS script outside of Docker container, so it can be modified

I'm interested in running a NodeJS script inside a Docker container, because that seems to be the easiest way to run stuff in unRAID (small scripts at least).
My current Dockerfile looks like this:
FROM node:12.9.1
COPY app.js /home/node/
COPY package*.json /home/node/
RUN mkdir /home/node/saves
WORKDIR /home/node
RUN npm install
CMD ["node", "app.js"]
Not perfect, but works well enough. What my script does, is that it scrapes certain websites for data, then places it in a folder called saves.
But because I do COPY app.js /home/node/, every time I want to make a tiny change to my app.js file, I need to rebuild the whole image, delete the container, and start a new one. Kind of irritating, but it has worked for me.. for now.
When I start my container, I want that volume to stay persistent, so I do this:
docker run --net=bridge -h scraper --name scraper -d -v /mnt/user/scripts/scraper/saves:/home/node/saves scraper
This works, but as I said, if I want to change my app.js (like add a new site to scrape), I have to rebuild the image and run the above command again. Every single time.
What's a better approach than this? I could solve this by not copying the files, but instead run npm install and then node app.js every time, but this script runs every 3 minutes, so that would be a huge waste of resources.
I could also store the appropriate data inside my /saves/ folder, then read that in the NodeJS script every time, but I feel like that's kind of a hack.

Use Environment variables.
If you already know what kind of changes app.js will need, you can pass them to docker run as environment variables and use an entrypoint script to apply them to the file.
The content of the entrypoint script will depend on what you want to do.
# Add to your Dockerfile
ENV SITE_TO_SCRAP=example.com
COPY docker-entrypoint.sh /
RUN chmod +x /docker-entrypoint.sh
ENTRYPOINT ["/docker-entrypoint.sh"]
# Run docker with
-e SITE_TO_SCRAP=abc.com
The entrypoint script may look like this:
#!/bin/bash
# Modify app.js
# Assuming changing the line SCRAP_URL="someurl"
sed -i -e "s#SCRAP_URL=\"someurl\"#SCRAP_URL=\"${SITE_TO_SCRAP}\"#" /home/node/app.js
# This will execute the CMD, passing along all of its arguments
exec "$@"
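An alternative to patching app.js with sed is to let the script read the variable itself. A minimal sketch, assuming app.js is free to consult process.env; SITE_TO_SCRAP is the variable name used above, while the helper name resolveSite and the example.com fallback are illustrative and not part of the original script:

```javascript
// Sketch for app.js: take the target site from the environment.
// resolveSite and the "example.com" fallback are hypothetical names/values.
function resolveSite(env) {
  const site = (env.SITE_TO_SCRAP || '').trim();
  // Fall back to a default when the variable is unset or blank.
  return site === '' ? 'example.com' : site;
}

const siteToScrape = resolveSite(process.env);
console.log(`Scraping ${siteToScrape}`);
```

With this in place, docker run -e SITE_TO_SCRAP=abc.com ... changes the target without touching the file inside the image.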
Make node cache folder persistent
Run npm install in the entrypoint script rather than in the Dockerfile, and make the npm cache folder persistent by mounting -v /host/path/to/cache:/root/.npm. That way npm install uses cached files whenever possible. Use docker run --rm node:{version} npm config get cache to find the cache directory inside the container.
Manually mount modified files.
# To your docker run command add required file/directory mount(s)
-v /path/to/modified/app.js:/home/node/app.js

How to dynamically change content in node project run through docker

I have an AngularJS application that I'm running with Docker.
The Dockerfile looks like this:
FROM node:6.2.2
RUN npm install --global gulp-cli && \
npm install --global bower
RUN mkdir -p /usr/src/app
WORKDIR /usr/src/app
COPY package.json /usr/src/app/
COPY bower.json /usr/src/app/
RUN npm install && \
bower install -F --allow-root --config.interactive=false
COPY . /usr/src/app
ENV GULP_COMMAND serve:dist
ENTRYPOINT ["sh", "-c"]
CMD ["gulp $GULP_COMMAND"]
Now when I change, say, an HTML file, the change doesn't show up on the web page. I have to stop the container, remove it, build the image again, remove the earlier image and then restart the container from the new image. Do I have to do this every time? (I'm new to Docker, and I guess this issue is because my source code is not put into a volume, but I don't know how to do that using a Dockerfile.)

You are correct, you should use volumes for stuff like this. During development, give it the same volumes as the COPY directories. It'll override it with whatever is on your machine, no need to rebuild the image, or even restart the container. Perfect for development.
When actually baking your images for production, you remove the volumes, leave the COPY in, and you'll get a deterministic container. I would recommend you read through this article here: https://docs.docker.com/storage/volumes/.
In general, there are 3 ways to do volumes.
1) Define them in your Dockerfile using VOLUME.
Personally, I've never done this. I don't really see the benefits of this over the other two methods. I believe it would be more common to do this when your volume is meant to act as a permanent data store, not so much when you're just trying to use your live dev environment.
2) Define them when calling docker run.
docker run ... -v $(pwd)/src:/usr/src/app ...
This is great, because if the COPY in your Dockerfile is ./src /usr/src/app, then the mount temporarily overrides the directory while running the image, but the copied files are still there for deployment when you don't use -v.
3) Use docker-compose.
My personal recommendation. Docker Compose massively simplifies running containers. Put simply, it just calls docker run ... for you, but fills in the arguments from a given docker-compose.yml config.
Create a dev service specifying the volumes you want to mount, other containers you want it linked to, etc. Then bring it up using docker-compose up ... or docker-compose run ... depending on what you need.
Smart use of volumes will DRAMATICALLY reduce your development cycle. Would really recommend looking into them.

Yes, you need to rebuild every time the files change, since you only modify the files that are outside of the container. In order to apply the changes to the files IN the container, you need to rebuild the image and start a new container from it.
Depending on the use case, you could either make the container dynamically load the files from another repository, or you could mount an external volume to use in the container, but there are some pitfalls associated with either solution.

If you want to keep your container running as you add your files, you could also use a variation:
1) Mount a volume to any other location, e.g. /usr/src/staging.
2) While the container is running, if you need to copy new files into the container, copy them into the location of the mounted volume.
3) Run docker exec -it <container-name> bash to open a bash shell inside the running container.
4) Run a cp /usr/src/staging/* /usr/src/app command to copy all new files into the target folder.

How should I Accomplish a Better Docker Workflow?

Every time I change a file in the nodejs app I have to rebuild the docker image.
This feels redundant and slows my workflow. Is there a proper way to sync the nodejs app files without rebuilding the whole image again, or is this a normal usage?

It sounds like you want to speed up the development process. In that case I would recommend mounting your directory in your container using the docker run -v option: https://docs.docker.com/engine/userguide/dockervolumes/#mount-a-host-directory-as-a-data-volume
Once you are done developing your program, build the image and start the container without the -v option.

What I ended up doing was:
1) Using volumes with the docker run command - so I could change the code without rebuilding the docker image every time.
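2) I had an issue with node_modules being hidden, because a volume acts like a mount over the directory. I fixed it by relying on Node's module path traversal: the modules are installed one directory up, in /usr/src, where Node still finds them.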
Dockerfile:
FROM node:5.2
# Create our app directories
RUN mkdir -p /usr/src/app
WORKDIR /usr/src/app
RUN npm install -g nodemon
# This will cache npm install
# and persist the node_modules
# even after the volume is mounted over /usr/src/app (which overwrites it)
COPY package.json /usr/src/
RUN cd /usr/src && npm install
#Expose node's port
EXPOSE 3000
# Run the app
CMD nodemon server.js
Command-line:
to build:
docker build -t web-image .
to run:
docker run --rm -v $(pwd):/usr/src/app -p 3000:3000 --name web web-image
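Why this works: when server.js calls require('some-package'), Node looks for a node_modules folder in the script's own directory and then in each parent directory in turn. The sketch below is a simplified illustration of that lookup order, assuming POSIX-style absolute paths; the function name is made up for this example and Node's real algorithm has a few extra rules. It shows that /usr/src/node_modules is still searched even when the mounted volume hides /usr/src/app/node_modules:

```javascript
// Simplified sketch of the directories Node tries for a bare require(),
// starting at the directory of the requiring file and walking up to "/".
// nodeModulesCandidates is a hypothetical helper, not a Node API.
function nodeModulesCandidates(startDir) {
  const parts = startDir.split('/').filter(Boolean);
  const candidates = [];
  for (let i = parts.length; i >= 0; i--) {
    // Take the first i path segments and append "node_modules".
    candidates.push('/' + parts.slice(0, i).concat('node_modules').join('/'));
  }
  return candidates;
}

console.log(nodeModulesCandidates('/usr/src/app'));
// [ '/usr/src/app/node_modules',
//   '/usr/src/node_modules',
//   '/usr/node_modules',
//   '/node_modules' ]
```

Because package.json is copied to /usr/src and npm install runs there, the second entry in that list is where the modules end up.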

You could also reorder the instructions: first COPY only the package.json file from the build context (the directory passed to docker build) into the container's working directory, then RUN npm install, and only afterwards COPY over everything else, like so:
# Specify base image
FROM node:alpine
WORKDIR /usr/app
# Install some dependencies
COPY ./package.json ./
RUN npm install
# Copy over everything else
COPY ./ ./
# Setup default command
CMD ["npm", "start"]
You can make as many changes to your source files as you want, and it will not invalidate the cache for the steps up to and including npm install.
The only time that npm install will be executed again is if we make a change to that step or any step above it.
So unless you make a change to the package.json file, npm install will not be executed again.
We can test this by running docker build -t <tagname>/<project-name> .
Now that I have made a change to the Dockerfile, you will see some steps re-run and, eventually, the successfully built and tagged image.
Docker detected the change to the step and every step after it, but not the npm install step.
The lesson here is that the order in which these instructions are placed in a Dockerfile does make a difference.
It's nice to segment out these operations to ensure you are only copying the bare minimum.

npm package.json and docker (mounting it...)

I am using Docker, so this case might look weird. But I want my whole /data directory to be mounted inside my docker container when developing.
My /data folder contains my package.json file, an app directory and a bunch of other stuff.
The problem is that I want my node_modules folder to NOT be persistent, only the package.json file.
I have tried a couple of things, but package.json and npm is giving me a hard time here...
Mounting the package.json file directly will break npm. npm tries to rename the file on save, which is not possible when it's a mounted file.
Mounting the parent folder (/data) will mount the node_modules folder.
I can't find any configuration option to put node_modules in another folder outside /data, for example /dist.
Putting package.json in /data/conf and mounting /data/conf as a volume instead won't work. I can't find any way to specify the package.json path in npmrc.
Putting package.json in /data/conf and symlinking it to /data/package.json won't work. npm breaks the symlink and replaces it with a file.
Copying data back and forth to/from inside the docker container is how I am doing it now. A little tedious, and I also want a clean solution.

As you have already answered, I think that might be the only solution right now.
When you are building your Docker image, do something like:
COPY data/package.json /data/
RUN mkdir -p /dist/node_modules && ln -s /dist/node_modules /data/node_modules && cd /data && npm install
And for other stuff (like bower), do the same thing:
COPY data/.bowerrc /data/
COPY data/bower.json /data/
RUN mkdir -p /dist/vendor && ln -s /dist/vendor /data/vendor && cd /data && bower install --allow-root
Then COPY data/ /data at the end, so that you can use Docker's caching and don't have to redo the npm/bower installation whenever something in data changes.
You will also need to create the symlinks you need and store them in your git repo. They will be invalid on the outside, but will happily work on the inside of your container.
Using this solution, you are able to mount your $PWD/data:/data without getting the npm/bower "junk" outside your container. And you will still be able to build your image as a standalone deployment of your service.

A similar, alternative way is to use the NODE_PATH variable instead of creating a symlink.
RUN mkdir -p /dist/node_modules
RUN cp -r node_modules/* /dist/node_modules/
ENV NODE_PATH /dist/node_modules
Here you first create a new directory for node_modules, copy all modules there, and have Node read the modules from there.
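For context, Node consults NODE_PATH only after the usual node_modules lookup has failed, and on Linux the variable may hold several directories separated by colons. A small sketch of how such a value breaks down into directories, assuming POSIX-style separators; the function name is illustrative and not a Node API:

```javascript
// Split a POSIX-style NODE_PATH value into the extra directories
// that would be searched for modules. Empty entries are ignored.
// nodePathDirs is a hypothetical helper used only for illustration.
function nodePathDirs(env) {
  return (env.NODE_PATH || '').split(':').filter((dir) => dir !== '');
}

// The value set in the Dockerfile lines above yields a single directory.
console.log(nodePathDirs({ NODE_PATH: '/dist/node_modules' }));

// Inspect whatever the current process was started with.
console.log(nodePathDirs(process.env));
```

Running the last line inside the container is a quick way to confirm that the ENV instruction took effect.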

I've been having this problem for some time now, and the accepted solution didn't work for me*
I found this link, which had an edit pointing here and this indeed worked for me:
volumes:
- ./:/data
- /data/node_modules
In this case the Engine creates a volume (see Compose reference on volumes) which is not mounted to your source directory. This was the easiest solution and didn't require me to do any symlinking, setting paths, etc.
For reference, my simple Dockerfile just looks like this:
# install node requirements
WORKDIR /data
COPY ./package.json ./package.json
RUN npm install -qq
# add source code
COPY ./ ./
# run watch script
CMD npm run watch
(The watch script is just webpack --watch -d)
Hope this is able to help someone and save hours of time like it did for me!
* I couldn't get webpack to work from my package.json scripts, and installing anything while inside the container created a node_modules folder containing only whatever I had just installed. (I run npm i --save [packages] from inside the container so that the package is added to package.json until the next rebuild.)

The solution I went with was placing the node_modules folder in /dist/node_modules and making a symlink to it from /data/node_modules. I can create the symlink in my Dockerfile so it is used when building, and I can also commit the symlinks to my git repo. Everything worked out nicely.

Maybe you can save your container, and then rebuild it regularly with a minimal dockerfile
FROM my_container
and a .dockerignore file containing
/data/node_modules
See the doc
http://docs.docker.com/reference/builder/#the-dockerignore-file
