Check in node_modules vs. shrinkwrap - node.js

Checking in node_modules was the community standard, but now we also have the option to use shrinkwrap. The latter makes more sense to me, but there is always the chance that someone did a "force publish" and introduced a bug. Are there any additional drawbacks?

My favorite post/philosophy on this subject goes all the way back (a long time in node.js land) to 2011:
https://web.archive.org/web/20150116024411/http://www.futurealoof.com/posts/nodemodules-in-git.html
To quote directly:
If you have an application, that you deploy, check in all your dependencies in to node_modules. If you use npm to deploy, only define bundleDependencies for those modules. If you have dependencies that need to be compiled you should still check in the code and just run $ npm rebuild on deploy.
Everyone I’ve told this to tells me I’m an idiot and then a few weeks later tells me I was right and checking node_modules in to git has been a blessing to deployment and development. It’s objectively better, but here are some of the questions/complaints I seem to get.
I think this is still the best advice.
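(For reference, the bundleDependencies the quote mentions is just an array in package.json naming which of your dependencies get packed into the tarball when you publish or npm pack. A minimal sketch, with a placeholder module name:)

{
  "name": "my-app",
  "version": "1.0.0",
  "dependencies": {
    "some-module": "1.0.0"
  },
  "bundleDependencies": ["some-module"]
}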
The force-publish scenario is rare, and npm shrinkwrap would probably work for most people. But if you're deploying to a production environment, nothing gives you peace of mind like checking in the entire node_modules directory.
Alternatively, if you really, really don't want to check in the node_modules directory but want a better guarantee that there hasn't been a force publish, I'd follow the advice in npm help shrinkwrap:
If you want to avoid any risk that a byzantine author replaces a package you're using with code that breaks your application, you could modify the shrinkwrap file to use git URL references rather than version numbers so that npm always fetches all packages from git.
Of course, someone could run a weird git rebase or something and modify a git commit hash... but now we're just getting crazy.
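A rough sketch of what that could look like in npm-shrinkwrap.json, with a placeholder package name and commit (the exact file layout differs between npm versions):

{
  "name": "my-app",
  "version": "1.0.0",
  "dependencies": {
    "some-package": {
      "version": "git+https://github.com/someuser/some-package.git#<commit-sha>"
    }
  }
}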

npm FAQ directly answers this:
Check node_modules into git for things you deploy, such as websites and apps.
Do not check node_modules into git for libraries and modules intended to be reused.
Use npm to manage dependencies in your dev environment, but not in your deployment scripts.
cited from npm FAQ

Related

Why does "npm install" modify package-lock.json? Why commit it to git then?

When I run "npm install" in a project it often modifies package-lock.json, for example if I work on the same project from another computer (with different node or npm version).
But at the same time the documentation suggests that the file is supposed to be added to version control (git in my case):
https://docs.npmjs.com/files/package-lock.json
This file is intended to be committed into source repositories, and
serves various purposes: ...
So should I commit the changes made by npm back and forth when switching work machines or when somebody else does npm install? This would be a nightmare.
Currently I just discard any changes to package-lock.json made by npm, and it's been working fine. So I might as well add it to .gitignore...
Am I doing it wrong? Should I use npm ci instead? I wouldn't call my computer a "CI"; it's just a development machine, so why should I use it there?
Basically I have the same question as this gentleman:
https://github.com/npm/npm/issues/18103#issuecomment-370401935
(Sadly I can't add a comment on that issue or create a new issue at all, the npm repo has issues disabled)
Yes, you want to commit your package-lock.json file to source control. The reasoning behind this is to ensure that the same version of each package is downloaded and installed for every user who pulls down the code. There are other reasons to include the file as well, such as tracking changes to your package tree for auditing.
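As for npm ci: it isn't only for CI servers; it installs exactly what package-lock.json says (and fails if the lockfile and package.json disagree), so it's the right command on any machine where you don't intend to change dependencies. A minimal sketch of that split, with a placeholder package name:

# when you intentionally change dependencies:
npm install some-package          # updates package.json and package-lock.json
git add package.json package-lock.json
git commit -m "Add some-package"

# everywhere else (other machines, CI, deploys): install exactly what the lockfile says
npm ci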

How to work on two npm packages at the same time?

I'm trying to write an npm package that will be published and used as a framework in other projects. The problem is -- I can't figure out a solid workflow for working on it at the same time as working on projects that depend on it.
I know this seems super basic and that npm link solves the issue, but this is a bigger one than just being able to import one local package from another.
I have my framework package scaffolded out; let's call it gumby. It exports a function that does console.log('hello from gumby'). That's all that matters for right now.
Now I'm ready to create a project that will use gumby. Let's call this one client. I set that up too and npm link gumby so client can import from it, etc. OK cool, it's working as expected.
So now it's time to publish gumby. I run npm publish and it goes out to npm as version 0.0.1.
At this point, how do I get the published, npm-hosted version of gumby into the package.json for client? I mean, I could just delete the symlinked copy from my node_modules and then yarn add gumby, but what if I want to go back and work on it locally again? And then run it against the npm version again? And then work on it some more? And then...
You get the point, I imagine. There's no obvious way to switch between the npm copy of a package that you're working on and the local one. There's the additional problem of how to do that without messing with your package.json too much, e.g. what if I accidentally commit it to version control with some weird file:// dependency path? Any suggestions would be much appreciated.
For local development, having the package symlinked is definitely the way to go, the idea of constantly publishing / re-installing the package sounds like a total pain.
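A sketch of the switching workflow, assuming the two checkouts live side by side (the paths are placeholders):

# in the framework checkout: register a global link
cd ~/code/gumby
npm link

# in the client project: point node_modules/gumby at the local checkout
cd ~/code/client
npm link gumby

# done hacking locally? switch back to the published copy from the registry
npm install gumby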
The real issue sounds more like you’re concerned about committing a dev configuration to prod. You could address that with something as simple as a pre-commit hook in your VCS, e.g. block the commit if it detects any local file references in package.json.
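For example, a minimal pre-commit hook along these lines (a sketch, assuming a git repo and that local deps show up as file: or link: specifiers in package.json):

#!/bin/sh
# .git/hooks/pre-commit: refuse to commit a package.json that still points
# at a local checkout via a file: or link: specifier
if git diff --cached -- package.json | grep -E '"(file|link):' >/dev/null; then
  echo "package.json contains a local file:/link: dependency; aborting commit." >&2
  exit 1
fi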

How can I preserve local changes made to an NPM module?

I've pulled down a node module using NPM, and added it to package.json. However there was a need to change some of the module's code as it didn't meet my requirements 100%.
Typically when I'm working with node and git I would ignore the node_modules directory and use npm install when deploying to a server.
I'm wondering what best practice would be in my scenario. Is there a way of defining a module in package.json that should be ignored if it already exists locally when running npm install? Is this already the default behaviour for all modules? How would that work if someone ran npm update? I would assume the latest version of that module would be pulled down and would overwrite my changes.
Alternatively I've thought about forking the original git repo for the module, republishing my fork to NPM and then using that instead.
Tips and ideas would be greatly appreciated :)
Alternatively I've thought about forking the original git repo for the module, republishing my fork to NPM and then using that instead.
You have the right idea here. Under NPM, you definitely don't want to split your concerns between hosted and version control-tracked resources. Fork the repo, and then answer this question: if you add the functionality to the existing module, is the pull request likely to be merged and published to NPM soon enough for you?
If the answer is no because the functionality doesn't meet the intentions of the original module, you're better off creating your own, making sure to note your fork in the README.
If you're waiting on the PR, you have an option in the interim: NPM lets you point the dependency directly at your fork's git repository.
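Something along these lines in the consuming project's package.json (the fork URL and branch name are placeholders):

"dependencies": {
  "some-module": "git+https://github.com/your-username/some-module.git#my-patch-branch"
}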

How do I use npm link with Heroku?

I'm using npm link as described here
http://npmjs.org/doc/link.html
Locally everything works perfectly. When I deploy to Heroku I get the error message
Error: Cannot find module '...'
How can I get this working with Heroku?
I wish there were an elegant solution to this (it would make my life a hell of a lot easier). Your custom package is symlinked into node_modules by npm link, but git doesn't follow symbolic links, so when you git push to Heroku there's no way to make your custom packages go along for the ride.
Note, however, that from my experiments, Heroku will honor any node_modules you do push, instead of trying to install them from the network. It essentially just runs npm install --production. Perhaps a hard link directly to the development source of your package would do the trick, but I'm not sure whether Git would play nicely with that. Use at your own risk!
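One blunt way to do that, sketched here with placeholder paths and remote names: replace the symlink with a real copy of the package and force-add it past any .gitignore rule (note this also copies the package's own node_modules unless you prune them):

rm node_modules/my-local-pkg                 # drop the symlink that npm link created
cp -R ../my-local-pkg node_modules/          # copy the real files in its place
rm -rf node_modules/my-local-pkg/.git        # don't nest a git repo inside the repo
git add -f node_modules/my-local-pkg         # force-add past any .gitignore rule
git commit -m "Vendor my-local-pkg for deployment"
git push heroku master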
EDIT: If you want to know exactly what Heroku does, it's all open source.
The ideal situation would be to get the packages, if they're open source, onto NPM itself, which is pretty painless and automatic.
If you are hosting your private module on GitHub (or BitBucket), you can add the git repo as a dependency in your package.json.
"dependencies": {
// ... all your deps
"my_private_module": "git+ssh://git#github.com:my-username/my-private-module.git"
}
You will, however, need to grant privileges to Heroku to read your repo (assuming it's private -- which is the crux of the issue). Check out this answer for a detailed set of instructions showing how to do so with Github. This might help for Bitbucket.
I've found that the build time increases when doing this. Worth it for my needs.

NodeJS and NPM: problems following the recommendation to check modules into git

I'm having problems following the 'official' recommendation to check all external dependencies into git (article http://www.mikealrogers.com/posts/nodemodules-in-git.html, linked from the FAQ).
How do you make sure that not only top-level dependencies are checked in? Most npm modules currently do not follow the recommendation: they all have node_modules in their .gitignore. Just deleting their .gitignore seems risky.
For compiled modules the article recommends checking in only the sources and running 'npm rebuild' at deploy time. Unfortunately 'npm rebuild' does not do a 'clean make' for all modules (despite bugfix https://github.com/isaacs/npm/issues/1872 being included in npm version 1.0.106, which I'm using). This means I have to prevent compile targets from being checked in (otherwise I would have object code compiled for the developer machine sitting on the production machine without being overwritten by npm rebuild). But how do I do this? Unfortunately the modules don't share a common compile output directory, so just git-ignoring "node_modules/*/build" and "node_modules/*/out/" (as mentioned in this good article: eng.yammer.com/blog/2012/1/4/managing-nodejs-dependencies-and-deployments-at-yammer.html) won't help in every case.
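For what it's worth, that partial approach would look something like this in the top-level .gitignore (it only catches modules that use these conventional output directories):

# keep node_modules tracked, but ignore native build output so that
# `npm rebuild` regenerates it on the target machine
node_modules/*/build/
node_modules/*/out/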
Short version: how do you make sure that production servers use the exact same version of all dependent modules as you use during development?
UPDATE: there is now npm shrinkwrap which solves the problem of locking down exact dependency versions, even of dependencies' dependencies! More info here.
Checking in node_modules can be problematic, as the environment it's running on may differ from user to user, so what is compiled on one environment may not work on another. Plus it would fill up your changelogs and repositories with third-party code. Which I take it is the conclusion you've come to with the short version of your question, so let me address that.
Short version: how do you make sure that production servers use the exact same version of all dependent modules as you use during development?
Inside your package.json there will be a dependencies: {} field; if it is not there, add it. To accomplish what you want, add your dependencies as the keys and their exact versions as the values, e.g. dependencies: { docpad: '2.5.0', mocha: '1.1.0' }.
However, generally (it depends on the author) upgrades to the revision number (the x.x.X number) are just bugfixes and safe. You can allow minor changes by doing dependencies: { docpad: '2.5.x', mocha: '1.1.x' }, which saves you from having to update your package.json and do a release every time there is a bugfix release. You can even do things like 2.x if you wish.
This is the solution I've come to use for all of my modules, as it ensures that even 6 months later or whatever the module will still work - whereas doing something like >= 2.0.0 means when v3 of a dependency comes out, your module will probably be unusable at that time. Ensuring you stick to specific versions "guarantees" stability over time.
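Written out as actual package.json JSON, the answer's example looks something like this (docpad pinned to an exact version, mocha allowed to take patch releases):

{
  "dependencies": {
    "docpad": "2.5.0",
    "mocha": "1.1.x"
  }
}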
For reference you can see how I've done it in my open-source node.js modules here
Regarding the .gitignore files inside your dependencies (in "node_modules"): npm 1.1 excludes them, so they are not installed;
npm 1.1 will exclude .gitignore files from the things it installs.
npm 1.0 did not have this feature, so you have to be careful about that.
Deleting them recursively is fine:
find node_modules -name .gitignore | xargs rm
But, in npm 1.1, you never have to do this, because it excludes them from the install automatically.
That's coming from the chief himself (Isaac), and it's here and seems to cover pretty much everything. The "extraneous" problem I have must be something silly I've done; I'll try a clean setup.
