Speeding up the npm install - node.js

I am trying to speed up npm install during the build phase. My package.json lists packages mostly with locked revisions. I've also configured the cache directory using the command
npm config set cache /var/tmp/npm-cache --global
However, when installing with npm install -g --cache, I find that this step isn't reducing install time by loading the packages from the cache, as I would expect. In fact, I doubt it's even consulting the local cache first.
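A quick way to double-check what npm is actually configured to use (standard npm commands; npm cache verify requires npm 5+):
npm config get cache      # should print /var/tmp/npm-cache
npm cache verify          # inspects and garbage-collects the cache contents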

Proposing two more modern approaches:
1) npm ci
Use npm ci, which is available from npm version 5.7.0 (although I recommend 5.7.1 and upwards, because 5.7.0 was a broken release). It requires package-lock.json to be present, and it skips building your dependency tree from your package.json file, respecting the already-resolved dependency URLs in your lock file.
A very quick boost for your CI/CD environments (our build time was cut down to a quarter of the original!) and/or for making sure all your developers use the same dependency versions during development (without having to hard-code strict versions in your package.json file).
Note, however, that npm ci removes the node_modules/ directory before installing, so it won't benefit from caching node_modules/ between builds (npm's own cache directory still helps).
2) npm i --prefer-offline
Use the --prefer-offline flag with your regular npm install / npm i. With this approach, you need to make sure npm's cache directory is persisted between builds (in a CI/CD environment). If a package at the required version isn't found in the local cache, npm safely falls back to the network.
You can also add --no-audit --progress=false to skip the pre-install audit and hide the progress bar (the latter is only a very slight improvement).

For a pure npm solution, you may try:
npm install --prefer-offline --no-audit --progress=false
Note that --prefer-offline will not help on the first run, since the cache starts out empty.
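Putting the two approaches together, a minimal sketch of a CI install step (assuming your CI restores npm's cache directory between builds):
# with a package-lock.json present: clean, reproducible and fast
npm ci --prefer-offline --no-audit --progress=false
# without a lock file: a regular install with the same flags
npm install --prefer-offline --no-audit --progress=false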

As suggested by @Daniel Serodio:
You could also include your node_modules folder inside your repository, but you should probably zip it first before adding it to the repo; while installing, you can unzip it and just run
npm rebuild
(which works cross-platform), which is quite fast.
This would also give you the benefit of full control over all your dependencies.
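A rough sketch of that zip-and-rebuild flow (archive name and paths are illustrative):
# when updating dependencies: snapshot node_modules into the repo
tar -czf node_modules.tar.gz node_modules/
# when installing/deploying: unpack, then rebuild native addons for the target platform
tar -xzf node_modules.tar.gz
npm rebuild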
You can also set the progress flag to false, which can roughly double your install speed:
npm set progress=false
Update:
You can also use pnpm for this:
npm i -g pnpm
It basically reuses locally cached modules (I have heard it's better than Yarn).

It's better to install the pnpm package using the following command:
npm i -g pnpm
pnpm uses hard links and symlinks to save one version of a module only ever once on a disk. When using npm or Yarn for example, if you have 100 projects using the same version of lodash, you will have 100 copies of lodash on disk. With pnpm, lodash will be saved in a single place on the disk and a hard link will put it into the node_modules where it should be installed.
For example, whenever you want to install the dependencies from a package.json file, simply run pnpm i and it handles the rest by itself.
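If you are curious where that single on-disk copy lives, pnpm can print the location of its shared store (the store path command exists in current pnpm versions; older releases may differ):
pnpm store path   # prints the location of the content-addressable store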

UPDATE: The original answer is from 2014. I wouldn't recommend checking in node_modules, as there are definitely better options for speeding up the install, especially for a CI pipeline, e.g. npm ci --only=production.
You could also include your node_modules folder inside your repository (you are probably using git) and just run npm rebuild (which works cross-platform) in your build/deploy processes; it is pretty fast.
This would also give you the benefit of full control over all your dependencies (I know that's what shrinkwrap usually should be used for)
Edit:
You can also set the progress flag to false to increase your speed by at least 20%. This only applies to npm v3.x.x, and there will hopefully be fixes for it soon (see the second link below):
npm set progress=false
Tweet about the finding
GitHub issue identifying the cause
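For the CI-pipeline route mentioned in the update, a minimal sketch (--only=production was the flag at the time; newer npm versions spell it --omit=dev):
# install exactly what the lock file resolves, skipping devDependencies
npm ci --only=production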

As a very modern solution, you can start using Docker.
Docker allows you to virtualize and pre-define, as an image, the current state of your code, including installed npm modules and other goodies.
Once the Docker image for your infrastructure/env is built locally, or retrieved from a remote repository, it is stored on the host machine and you can spin up a server in seconds.
Another benefit is that you use the same virtualized code infrastructure on any machine where you deploy your code.
Docker speeds up install/deployment processes and is a widely used technology.
To start using Docker it is enough to (all the snippets are just mock/example pre-setups and are not by any means the most robust/elegant solution):
1) Install docker and docker-compose using the manuals, and get some basic understanding of them, at docker.com
2) Write a Dockerfile in the root of your application:
FROM node:6.9.5
RUN mkdir /usr/local/app
WORKDIR /usr/local/app
COPY package.json package.json
RUN npm install
3) Create docker-compose.yml in the root of your project with content like this:
version: "2"
services:
  server:
    hostname: server
    container_name: server
    image: server
    build: .
    command: sh -c 'NODE_ENV=development PORT=8080 node app.js'
    ports:
      - "8080:8080"
    volumes: # list of folders and files to mount into the container
      - ${PWD}/server:/usr/local/app/server
      - ${PWD}/app.js:/usr/local/app/app.js
4) To start the server, run docker-compose up -d. To see the logs, run docker-compose logs -f server. Once the image has been built, restarting the server takes only seconds.
Docker also caches build layers locally, so the next build will take only a few seconds.
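The detail that actually speeds up npm install here is Docker's layer caching: because package.json is copied before anything else, the RUN npm install layer is reused until your dependencies change. A sketch of the same pattern with the application code baked into the image as well (the final COPY is an addition; the compose file above mounts the code as volumes instead):
FROM node:6.9.5
WORKDIR /usr/local/app
# copy only the manifest first, so this layer stays cached until dependencies change
COPY package.json package.json
RUN npm install
# copy the source last, so code changes do not invalidate the install layer
COPY . .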
I know this might be a bit of a heavyweight solution, but I am sure it has the most potential/flexibility and is widely used in industry. And while it requires some learning for anyone who has not used Docker before, in my humble opinion, it is the best one for your problem.

Nothing helped me more than disabling the antivirus (Windows Defender in my case); I went from 2:30 down to 1 minute.
With the npm-cache package I got down to ~30 secs.
I also tried yarn, which is very fast, but it was failing randomly in my case.

We have been trying to solve this problem to speed up our deployments.
We have settled on using pac, which follows the principles in the other answers. It zips the npm modules and includes them in your repo, so you don't have a million files in your commits and code reviews, and you can just unzip/rebuild for the target machine.
https://www.npmjs.com/package/pac

Related

Yarn install with multiple git dependencies results in "EINVAL: invalid argument, mkdir ..." error

Node (v14.2.0), Yarn (1.22.4), Windows 10
Context: I have several node projects hosted in a private git repo. I have several cross dependencies between projects, e.g. project C depends on projects A and B, and project D may depend on C and A (perhaps this is my problem?). I generally have my package.json files set up to use the git repos directly, and it works reasonably well for the projects with one or two dependencies.
One of my larger projects has many dependencies on my other projects. Running yarn install on this project gives me this error consistently:
EINVAL: invalid argument, mkdir [some C:\\...Yarn\\Cache\\... directory]
The install ends with that error, and node_modules is not created.
I worked around the issue by removing all (nine) git dependencies from my package.json and then adding them one-by-one and running yarn install each time. No issues, no errors, and in the end I have a fully functioning node project. Great success!
The question here, then, is why I cannot install everything at once (run a single yarn install). I have tried the tricks I found while googling - clearing the yarn cache, using npm install, running npm adduser or npm login, running as administrator... every combination of those actions resulted in the same EINVAL error.
My guess would be that yarn is trying to do "too many things at once" and it's resulting in filesystem errors (trying to mkdir a dir that is locked)... but why is this not documented, and more importantly, why is there no way to tell yarn to install "one thing at a time"? If there is, and I missed it, I would love to know about it.
Cheers!
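For what it's worth, Yarn 1 does have flags that approximate installing "one thing at a time" (whether they avoid this particular EINVAL is untested):
# limit concurrent network requests and serialize concurrent yarn invocations
yarn install --network-concurrency 1 --mutex network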

Force project to use Yarn but not npm

As a team practice, I would like to force my teammates to use yarn install / yarn run, not npm install / npm run.
Is it possible to force a package.json's dependencies to be installed only via yarn install, and package.json's scripts to be run only via yarn run?
If it cannot be done, can I at least get a warning when using npm install?
Again, this is only to align team practice, to reduce the possibility of errors/problems during dev/ops. Thanks
One way I can think of: CI can set a rule that fails the build whenever a new package-lock.json file is created. The developer will then realize they made a mistake when the build fails.
Alternatively, you can rely on a husky pre-commit hook, which essentially runs a command to check whether package-lock.json exists every time a developer tries to run git commit.
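A minimal sketch of such a pre-commit hook (the .husky/pre-commit path follows modern husky's convention; treat the details as an example):
#!/bin/sh
# .husky/pre-commit - block commits that accidentally used npm
if [ -f package-lock.json ]; then
  echo "package-lock.json detected - this project uses yarn, not npm" >&2
  exit 1
fi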

pre-cache node_modules in Docker container

It frustrates me that CI builds for projects which use Node tool chains such as Grunt and Gulp take quite a long time, the bulk of which is consumed by npm install.
I've tried to set up a Docker image, pre-baked with all of the node_modules dependencies in the npm cache (each at the same fixed release as declared in my package.json file), but even then the build still takes a few minutes, when all it really should need to do is copy a few directories from the npm cache into my project's node_modules.
I've set cache-min to 9999999, but it still seems to take much longer than it should need to.
I've looked at local-npm and npm_lazy, but they seem over the top, and the former takes ages to install - I suspect it's trying to download every single npm module in existence. I only need a limited number, and I don't need to run a web server to serve them from within the Docker container.
...am I missing something? There must be a faster way to run a CI build...
I was able to get it to work by using .npmrc to point at the npm cache within the Docker container. I would suggest you docker exec into your container and run npm config list | grep cache to ensure that the cache is being used.
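A minimal sketch of such an .npmrc (the cache path is an example; place the file in the project root or the container user's home):
cache=/usr/local/share/.npm-cache
prefer-offline=true
You can then confirm it from the host with docker exec -it <container> npm config list | grep cache (container name is a placeholder).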

Docker and node_modules - put them in a layer, or a volume?

I'm planning a Docker dev environment and am doubtful whether running npm install as a cached layer is a good idea.
I understand that there are ways to optimize Dockerfiles to avoid rebuilding node_modules unless package.json changes; however, I don't want to completely rebuild node_modules every time package.json changes either. A fresh npm install takes over 5 minutes for us, and changes to package.json happen reasonably frequently. Someone reviewing pull requests and switching branches often could have to suffer through an infuriating number of 5-minute npm installs each day.
Wouldn't it be better in cases like mine to somehow install node_modules into a volume so that it persists across builds, and small changes to package.json don't result in the entire dependency tree being rebuilt?
Yes. Don't rebuild node_modules over and over again. Just stick them in a data container and mount it read only. You can have a central process rebuild node_modules now and then.
As an added benefit, you get a much more predictable build because you can enforce that everyone uses the same node modules. This is critical if you want to be sure that you actually test the same thing that you're planning to put in production.
Something like this (untested!):
docker build -t my/module-container - <<END_DOCKERFILE
FROM busybox
RUN mkdir -p /usr/local/node
VOLUME /usr/local/node
END_DOCKERFILE
docker run --name=module-container my/module-container
# note: an image that provides npm (e.g. node:6) must be named before the command;
# the package.json mount also needs an absolute path
docker run --rm --volumes-from=module-container \
  -v "$(pwd)/package.json":/usr/local/node/package.json \
  node:6 /bin/bash -c "cd /usr/local/node; npm install"
By now, the data container module-container will contain the modules specified by package.json in /usr/local/node/node_modules. It should now be possible to mount them in the production containers using --volumes-from=module-container.

npm install without symlinks option not working

I set up a development environment with Windows 8 and Ubuntu as a virtual machine, using VirtualBox.
I also managed to create a shared folder in VirtualBox.
In this shared folder I try to start a project with Yeoman's ember generator:
yo ember --skip-install --karma
npm install --no-bin-links
To install the npm modules I use the --no-bin-links option so that no symbolic links are created. Unfortunately, I still get errors about symbolic link creation... Am I using this option correctly, or is there a bug?
The NPM docs about parameter "--no-bin-links" say:
will prevent npm from creating symlinks for any binaries the package
might contain.
Which will just cause NPM to not create links in the node_modules/.bin folder. I also searched for a way to prevent NPM from creating symlinks when using npm install ../myPackage, but can't find any solution...
Update: The npm support team said this will reproduce the old behaviour (no symbolic links):
npm install $(npm pack <folder> | tail -1)
Works for me in git-bash on Windows 10.
This Stack Overflow page comes up in Google search results when trying to solve the issue of installing local modules (i.e. npm install ../myPackage) without wanting symbolic links, so I'm adding this answer to help others who end up here.
Solution #1 - For development environment.
Using the solution proposed by the NPM support team as mentioned in the other answer works...
# Reproduces the old behavior of hard copies and not symlinks
npm install $(npm pack <folder> | tail -1)
This is fine in the development environment for manual installs.
Solution #2 - For build environment.
However, in our case the development environment doesn't matter quite as much, because when committing our changes to Git, the ./node_modules/ folder is ignored anyway.
The files ./package.json and ./package-lock.json are what is important, and they are carried into our build environment.
In our build environment (part of our automated CI/CD pipeline), the automation just runs the npm install command and builds from the dependencies listed in the package.json file.
So here is where the problem affects us: the locally referenced files in the dependencies list of the package.json cause symlinks to appear. Now we are back to the old problem, and these symlinks get carried into the build's output, which moves on to the Stage and Production environments.
What we did instead is use rsync in archive mode with the --copy-links option that turns symbolic links into copies of the original.
Here is what the command looks like in the automated build:
# Install dependencies based on ./package.json
npm install
# Make a copy that changes symlinks to hard copies
rsync --archive --verbose --copy-links ./node_modules/ ./node_modules_cp/
# Remove and replace
rm -r ./node_modules/
mv ./node_modules_cp/ ./node_modules/
I have a similar environment. Apparently the VirtualBox (Vagrant) synchronisation has problems when renaming or moving files, which happens when updating modules.
If you do a file listing (ls -alhp) on the command line and see ??? for the file permissions, then it is time to reboot your VirtualBox VM. This will reset the permissions to valid values. Then use the --no-bin-links option when installing a module.
