I only realised relatively recently that running npm install <foo> on my own machine is completely unsafe. Anyone can publish anything to npm, and even if there were some way of verifying that an individual package was not nefarious, it would be extremely difficult to verify all of its dependencies.
npm runs with my user permissions (not root), but in the context of my MacBook, that's pretty much irrelevant, as all of my personal files are also owned by the same user. (The only thing root would add is the ability to destroy the operating system itself, which is vastly less important.)
So: how do people safely run untrusted code from NPM? Obviously running everything inside a VM is safer, but I always seem to have difficulties with local networking.
Or I guess I could create a new user, and make everything NPM-related owned by that user, who doesn't have access to the rest of my system.
Are there any other good options?
Mitigations for this would include (commands sketched after the list):
If you are a user who owns modules, don't stay logged in to npm. (Easy enough: npm logout and npm login.)
Use npm shrinkwrap to lock down your dependencies
Use npm install someModule --ignore-scripts
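A minimal sketch of those mitigations on the command line (someModule is just a placeholder):

    # Don't leave an npm auth token around for install scripts to steal
    npm logout

    # Lock the full dependency tree to known versions
    npm shrinkwrap

    # Install without running preinstall/install/postinstall scripts
    npm install someModule --ignore-scripts

    # Or make that the default for this machine
    npm config set ignore-scripts true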
Be aware that a malicious module could introduce backdoors or vulnerabilities into any code built with it, if not noticed by the community.
Related
I'm trying to write an npm package that will be published and used as a framework in other projects. The problem is -- I can't figure out a solid workflow for working on it at the same time as working on projects that depend on it.
I know this seems super basic and that npm link solves the issue, but this is a bigger one than just being able to import one local package from another.
I have my framework package scaffolded out; let's call it gumby. It exports a function that does console.log('hello from gumby'). That's all that matters for right now.
Now I'm ready to create a project that will use gumby. Let's call this one client. I set that up too and npm link gumby so client can import from it, etc. OK cool, it's working as expected.
So now it's time to publish gumby. I run npm publish and it goes out to npm as version 0.0.1.
At this point, how do I get the published, npm-hosted version of gumby into the package.json for client? I mean, I could just delete the symlinked copy from my node_modules and then yarn add gumby, but what if I want to go back and work on it locally again? And then run it against the npm version again? And then work on it some more? And then...
You get the point, I imagine. There's no obvious way to switch between the npm copy of a package that you're working on and the local one. There's the additional problem of how to do that without messing with your package.json too much, e.g. what if I accidentally commit it to version control with some weird file:// dependency path. Any suggestions would be much appreciated.
For local development, having the package symlinked is definitely the way to go; constantly publishing and re-installing the package sounds like a total pain.
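For switching back and forth, a hedged sketch using the package name from the question:

    # use the local copy (after running npm link once in the gumby repo)
    npm link gumby

    # switch back to the published, registry-hosted version
    npm unlink gumby
    npm install gumby --save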
The real issue sounds more like you're concerned about committing a dev configuration to prod. You could address that with something as simple as a pre-commit hook in your VCS, e.g. block the commit if it detects any local file references in package.json (a sketch follows).
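For example, a minimal pre-commit hook along those lines, assuming git (the grep pattern is a starting point, not a complete check):

    #!/bin/sh
    # .git/hooks/pre-commit: refuse to commit a package.json that still
    # points at a local checkout via a file: dependency
    if git show :package.json 2>/dev/null | grep -q '"file:'; then
        echo "package.json contains a local file: dependency; aborting commit" >&2
        exit 1
    fi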
I'm trying to implement a developer workflow with docker, with the ability to develop offline (as in, not having to run npm install when you switch between branches that have differing dependencies)
The most intuitive way to do that is to store dependencies in source control. This has its own issues especially when using modules that compile dependencies. I have tried nearly everything I could think of and find:
npm packing my project's dependencies and storing them in source control, but this doesn't store my dependencies' dependencies
storing node_modules in source control, copying it to the container and running npm rebuild, but it doesn't actually trigger a rebuild
running npm install --no-registry so it triggers a rebuild without calling out, but it actually calls out to the public registry anyway
other solutions I've seen like Node-PAC seem abandoned
npmbox looks the most promising but it requires that it's installed on the target globally, which would work in a container I can build but not production, unless we start deploying containers in production.
Is this a fruitless effort? Being without network access is rare, and offline installs would only really be needed when installing a new module or moving between revisions that have differing dependencies.
Another option is to set up a private npm registry and configure it to cache the public registry. There are several ways to implement this; I would recommend trying Nexus: https://www.sonatype.com/nexus-repository-oss
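On the client side it is then just a registry switch; a sketch assuming a Nexus proxy on its default port with a repository named npm-proxy (both of these are assumptions):

    # Point npm at the local caching proxy instead of the public registry
    npm config set registry http://localhost:8081/repository/npm-proxy/

    # The first install primes the cache; repeat installs of the same
    # versions can then be served without reaching the public registry
    npm install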
I'm working with NodeJS in a small, standalone, isolated environment, so commands like npm install -g can't reach out to the internet for dependencies. To work around this constraint, I'm doing installs on a regular system and air-gapping the pieces over.
Any given module (and all its dependencies) lives in /usr/local/lib/node_modules/XXXXX, and if I'm actually in that directory and enter node cli.js, it works.
However, on the main system, invoking module XXXXX by name from the command line runs that module; the isolated system does not, and I'm trying to find that last little step of how npm wires things up.
Once one is this far, what's the canonical way to teach npm (or node) about the new command?
I'm willing to resort to hacks or aliases if absolutely necessary, but I'd like to do it the npm way.
Unfortunately, I'm not at liberty to set up a CouchDB or such and run a mirrored repository — I'm working in a tight footprint that's trying to keep things small and sleek.
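For reference, the wiring npm normally does for a global package is just a symlink from the prefix's bin directory to the file named in the package's bin field. A hedged sketch of recreating it by hand, assuming XXXXX declares "bin": { "XXXXX": "cli.js" } in its package.json:

    # Make the entry point executable and link it onto the PATH,
    # mirroring what npm install -g would have done
    chmod +x /usr/local/lib/node_modules/XXXXX/cli.js
    ln -s /usr/local/lib/node_modules/XXXXX/cli.js /usr/local/bin/XXXXX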
Checking in node_modules was the community standard, but now we also have the option to use shrinkwrap. The latter makes more sense to me, but there is always the chance that someone did a "force publish" and introduced a bug. Are there any additional drawbacks?
My favorite post/philosophy on this subject goes all the way back (a long time in node.js land) to 2011:
https://web.archive.org/web/20150116024411/http://www.futurealoof.com/posts/nodemodules-in-git.html
To quote directly:
If you have an application, that you deploy, check in all your dependencies in to node_modules. If you use npm to deploy, only define bundleDependencies for those modules. If you have dependencies that need to be compiled you should still check in the code and just run $ npm rebuild on deploy.
Everyone I've told this to tells me I'm an idiot and then a few weeks later tells me I was right and checking node_modules in to git has been a blessing to deployment and development. It's objectively better, but here are some of the questions/complaints I seem to get.
I think this is still the best advice.
The force-publish scenario is rare and npm shrinkwrap would probably work for most people. But if you're deploying to a production environment, nothing gives you peace of mind like checking in the entire node_modules directory.
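In practice that looks something like this (a sketch; -f is only needed if node_modules is in your .gitignore):

    # Vendor the entire dependency tree into the repository
    git add -f node_modules
    git commit -m "Check in node_modules for deployment"

    # On deploy, recompile any native addons for the target platform
    npm rebuild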
Alternatively, if you really, really don't want to check in the node_modules directory but want a better guarantee there hasn't been a force publish, I'd follow the advice in npm help shrinkwrap:
If you want to avoid any risk that a byzantine author replaces a package you're using with code that breaks your application, you could modify the shrinkwrap file to use git URL references rather than version numbers so that npm always fetches all packages from git.
Of course, someone could run a weird git rebase or something and modify a git commit hash... but now we're just getting crazy.
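A hedged sketch of that approach (the repository URL and commit sha are placeholders):

    # Depend on an exact git commit instead of a registry version...
    npm install --save git+https://github.com/some-org/some-module.git#0123abc

    # ...then regenerate the shrinkwrap from the installed tree
    npm shrinkwrap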
npm FAQ directly answers this:
Check node_modules into git for things you deploy, such as websites and apps.
Do not check node_modules into git for libraries and modules intended to be reused.
Use npm to manage dependencies in your dev environment, but not in your deployment scripts.
cited from npm FAQ
npm allows us to specify bundledDependencies, but what are the advantages of doing so? I guess if we want to make absolutely sure we get the right version even if the module we reference gets deleted, or perhaps there is a speed benefit with bundling?
Anyone know the advantages of bundledDependencies over normal dependencies?
For the quick reader: this Q&A is about the package.json bundledDependencies field, not about a package of the same name.
What bundledDependencies do
"bundledDependencies" are exactly what their name implies. Dependencies that should be inside your project. So the functionality is basically the same as normal dependencies. They will also be packed when running npm pack.
When to use them
Normal dependencies are usually installed from the npm registry.
Thus bundled dependencies are useful when:
you want to re-use a third party library that doesn't come from the npm registry or that was modified
you want to re-use your own projects as modules
you want to distribute some files with your module
This way, you don't have to create (and maintain) your own npm repository, but get the same benefits that you get from npm packages.
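A minimal sketch of what that looks like (my-module and some-local-dep are hypothetical names, and some-local-dep is assumed to already be present in node_modules):

    # List the dependency both normally and as bundled, so that
    # npm pack includes it inside the tarball
    cat > package.json <<'EOF'
    {
      "name": "my-module",
      "version": "1.0.0",
      "dependencies": { "some-local-dep": "1.0.0" },
      "bundledDependencies": ["some-local-dep"]
    }
    EOF
    npm pack    # my-module-1.0.0.tgz now contains node_modules/some-local-dep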
When not to use bundled dependencies
When developing, I don't think the main point is to prevent accidental updates, though. We have better tools for that, namely code repositories (git, Mercurial, SVN...) and, these days, lock files.
To pin your package versions, you can use:
Option 1: Use the newer npm version 5, which ships with node 8. It uses a package-lock.json file (see the node blog and the node 8 release; a quick sketch follows these options).
Option 2: Use yarn instead of npm.
It is a package manager from Facebook, faster than npm, and it uses a yarn.lock file. Otherwise it uses the same package.json.
This is comparable to lockfiles in other package managers like Bundler or Cargo. It's similar to npm's npm-shrinkwrap.json, however it's not lossy and it creates reproducible results.
npm actually copied that feature from yarn, amongst other things.
Option 3: This was the previously recommended approach, which I no longer recommend. The idea was to use npm shrinkwrap most of the time, and sometimes commit the whole thing, including the node_modules folder, to your code repository. Or possibly use shrinkpack. The best practices at the time were discussed on the node.js blog and on the Joyent developer websites.
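A quick sketch of Option 1 in practice:

    # npm 5+ records the exact resolved tree in package-lock.json
    npm install

    # npm 5.7+ can then reproduce that exact tree, e.g. in CI
    npm ci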
See also
This is a bit outside the scope of the question, but I'd like to mention the last kind of dependency (that I know of): peer dependencies. See also the yarn docs on bundledDependencies.
One of the biggest problems right now with Node is how fast it is changing. This means that production systems can be very fragile and an npm update can easily break things.
Using bundledDependencies is a way to get round this issue by ensuring, as you correctly surmise, that you will always deliver the correct dependencies no matter what else may be changing.
You can also use this to bundle up your own, private bundles and deliver them with the install.
Another advantage is that you can put your internal dependencies (application components) there and then just require them in your app as if they were independent modules, instead of cluttering your lib/ and publishing them to npm.
If/when they are matured to the point they could live as separate modules, you can put them on npm easily, without modifying your code.
I'm surprised I didn't see this here already, but when carefully selected, bundledDependencies can be used to produce a distributable package from npm pack that will run on a system where npm is not configured. This is helpful if you have e.g. a system that's not networked / not on the internet: bring your package over on a thumb drive (or whatever) and unpack the tarball, then npm run or node index.js and it Just Works.
Maybe there's a better way to bundle up your application to run "offline", but if there is I haven't found it.
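Roughly what that looks like (my-app and its entry point are placeholders, with its dependencies listed under bundledDependencies):

    # On a networked machine, inside the project:
    npm pack                      # emits my-app-1.0.0.tgz, deps included

    # On the offline machine:
    tar -xzf my-app-1.0.0.tgz     # unpacks into ./package
    cd package
    node index.js                 # runs without npm or network access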
Operationally, I look at bundledDependencies as a module's private module store, whereas dependencies are more public, resolved among your module and its dependencies (and sub-dependencies). Your module may rely on an older version of, say, react, while a dependency requires the latest-and-greatest. Packing/installing will put your pinned version in node_modules/$yourmodule/node_modules/react, while your dependency gets its version in node_modules/react (or node_modules/$dependency/node_modules/react if it's so inclined).
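Concretely, that layout looks something like this (yourmodule as in the description above):

    node_modules/
    ├── react/                      (version resolved for your dependencies)
    └── yourmodule/
        └── node_modules/
            └── react/              (your pinned, bundled version)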
A caveat: I recently ran into a dependency that did not properly configure its dependency on react, and having react in bundledDependencies caused that dependent module to fail at runtime.