Managing multiple NPM modules browser-side - browser

Okay, I've looked and looked and don't quite see a question that looks like mine, nor a project that quite addresses my need. This is probably because I'm doing something insane, and I am also asking for something difficult. But I wanted to see what others think.
I'm building my first large-scale single-page application. The way I've set it up is by breaking it up into a number of NPM modules. I like this because NPM provides a nice environment to build purely node-run unit tests in, a way to reuse some of my code for other projects we do at my company, and a forced separation of concerns. Here's the general idea:
A core data model library
A core UI library
A pseudo-library that provides individual UI components based on the above two…
…another of those…
…etc for each sub-application of my application
A very small central project that pulls all the components provided by the above together into an interface as necessary
This means a lot of libraries, and a number of dependencies that are common (Underscore, moment, EventEmitter2, etc).
Now I need to figure out how to get all that code into a browser. Ideally, I'd want something with some browserify characteristics (rolls modules and dependencies together into single files to cut down on resource callbacks), but has some of requirejs's asynchronous loading DNA (I'd rather not have to load my entire application up front; being able to call down chunks when the user navigates is useful).
I'm having trouble reconciling the above, though. I get what Require is trying to do, but every time I try to use it for already-built NPM modules (not AMD modules, though I'm happy to write that central project in an AMD-ish way) I get really confused and the sense that it's not really meant for me. For a single page application it seems like it's just going to resolve everything into one file anyway, since my dynamic resources are whole dependencies rather than individual files? And of course, Browserify is made with the sole intent of Hulk Smashing all your code into a single file. I could bundle each NPM module separately with Browserify, but then I'm duplicating the common dependencies for each.
I've looked at a bunch of other projects and they all seem to be addressing the client side more than the bundling side. What am I missing here?
[In pipe dream mode, I also like inject, partly because it's written by LinkedIn (who have a good reputation in my mind), but also for its localStorage caching.]
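To make the duplication concern concrete, here is a minimal sketch that works out which dependencies are shared by several bundles and so are candidates for one common bundle. The bundle names and dependency lists are made up for illustration; nothing here is part of Browserify or RequireJS.

```javascript
// Hypothetical dependency lists; the bundle names are illustrative only.
const bundles = {
  'data-model': ['underscore', 'moment'],
  'ui-core': ['underscore', 'eventemitter2'],
  'orders-ui': ['underscore', 'moment', 'eventemitter2'],
};

// Return the dependencies used by at least `minUsers` bundles, sorted by name.
function sharedDependencies(bundleDeps, minUsers = 2) {
  const counts = new Map();
  for (const deps of Object.values(bundleDeps)) {
    // A Set guards against a dependency being listed twice in one bundle.
    for (const dep of new Set(deps)) {
      counts.set(dep, (counts.get(dep) || 0) + 1);
    }
  }
  return [...counts]
    .filter(([, count]) => count >= minUsers)
    .map(([name]) => name)
    .sort();
}

const shared = sharedDependencies(bundles);
console.log(shared); // [ 'eventemitter2', 'moment', 'underscore' ]
```

With a list like this in hand, Browserify's `--require` and `--external` options are one way to build the shared modules once and exclude them from each per-component bundle, though whether that fits the lazy-loading requirement depends on how the bundles are served.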

Related

When should I create my own module package instead of using other packages?

I'm still a new node js developer, currently building a personal project, and I recently found out that there are open source packages available on npm similar to the thing I'm developing.
These packages carry new advanced concepts that I haven't come up with yet and provide more options than I want, but after thinking, it occurred to me why not develop a package that serves me in my project the way I want instead of using packages where I won't use more than 5% of the functions in my project?
Benefits of using an existing, well-supported module:
You save your development time for things that haven't already been written by someone else allowing you to make faster progress on your project
Well tested by the community (pre-tested code saves you lots of time)
Other people finding and fixing bugs (don't underestimate the importance of this)
The code will likely be kept up-to-date as tech changes over time
Possible community of people to ask questions of that knows about that package
Non-issues with using an existing, well-supported module:
Code size is rarely an issue for server-side nodejs development so the fact that a package may contain extra code that you don't need is generally not a practical issue of any consequence. If code size is paramount (like say you were running on a small, embedded system), then nodejs itself might not be the right environment as it's not exactly compact.
Reasons not to use an existing, well-supported module:
You aren't allowed to use open-source code in your project (but then you wouldn't be using nodejs if that was the case).
No existing module does what you want.
Existing modules that do what you want don't appear to be well supported or have many relevant bugs that have been open for a long time. In this case, it still might be worth it for you to clone the repository and use it as a starting point or learning point for your own module.
I'm still a new node js developer, currently building a personal project, and I recently found out that there are open source packages available on npm similar to the thing I'm developing.
IMO, this is part of the magic sauce of doing nodejs development. The huge repository of open source packages (through NPM) that are so easy to use make your development far more productive than developing everything from scratch yourself.
why not develop a package that serves me in my project the way I want instead of using packages where I won't use more than 5% of the functions in my project?
Unused code doesn't really cost you anything of consequence in a server-side environment. If you really wanted you can use bundlers that support tree-shaking which removes the code you're not using.
The question that really matters is whether an existing module meets your needs or is close enough that you only have to write a little bit of code in order to use it. If that's the case, then the question becomes this: "Why should I use my precious development time to write a package from scratch when I could use far less development time by using something that is already available for free, already tested and already proven, and then spend the development time I would have spent on that package on other things that advance my product/service further?"
In many ways, this is really no different than using the fs module built into nodejs. You use it because it's already developed and already tested and saves you time over developing your own file access module. Yes, the fs module contains lots of code you may never need, but that's not the question. The question is whether it already contains the code you DO need.

how to find all of the options for a node package?

This is a general question about node modules. Every time I download a node module, I am scrambling online for hints as to what options I can pass into it. On GitHub there only seem to be a few options given as examples, but what if I want to see what other options are available and what they do? How do I do this? Is there a way in the command prompt to list all of the available options?
For example, how would I see the options for this package?
https://www.npmjs.com/package/gulp-imagemin
The documentation for every Node module (package) is available on npm, e.g.:
https://www.npmjs.com/package/gulp-imagemin
By default what is displayed is the README.md file in the project. Sometimes it contains the entire documentation, sometimes it has links to other documents or websites.
But sometimes it can be empty or outdated, because modules and their documentation are usually created by people in their free time, with no obligation to keep them maintained or well documented.
If there is no documentation available or you think that the documentation is insufficient then you can either post an issue (usually on GitHub) or update the documentation and post a pull request.
See the documentation of a given module to know how to contribute or how to post issues. There should be links to issues and pull requests on the right of the module's page on npm.
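As a rough sketch of where to look once a package is installed: its own package.json usually carries the `homepage`, `repository` and `bugs` fields, and the README normally ships in the package folder. The helper name `docPointers` and the throwaway demo package below are made up for illustration.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Collect the documentation pointers a package declares about itself.
function docPointers(packageDir) {
  const manifest = JSON.parse(
    fs.readFileSync(path.join(packageDir, 'package.json'), 'utf8')
  );
  const repo = manifest.repository;
  const bugs = manifest.bugs;
  return {
    name: manifest.name,
    homepage: manifest.homepage || null,
    // Both fields may be a plain string or an object with a `url` property.
    repository: typeof repo === 'string' ? repo : (repo && repo.url) || null,
    bugs: typeof bugs === 'string' ? bugs : (bugs && bugs.url) || null,
    hasReadme: fs.readdirSync(packageDir).some((f) => /^readme(\.|$)/i.test(f)),
  };
}

// Demo against a throwaway package so the sketch runs anywhere.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'pkg-docs-'));
fs.writeFileSync(
  path.join(demoDir, 'package.json'),
  JSON.stringify({
    name: 'example-pkg',
    homepage: 'https://example.com/docs',
    repository: { type: 'git', url: 'https://example.com/example-pkg.git' },
    bugs: { url: 'https://example.com/example-pkg/issues' },
  })
);
fs.writeFileSync(path.join(demoDir, 'README.md'), '# example-pkg\n');

const pointers = docPointers(demoDir);
console.log(pointers);
```

In a real project you would point it at an installed package instead, for example `docPointers(path.join('node_modules', 'gulp-imagemin'))`. The `npm docs <package>` and `npm repo <package>` commands open the same links from the command prompt.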
I agree with William with respect to the usability of node modules. While most modules have some documentation on npmjs.com, and some in the module's repository (if public, mostly GitHub), there is no standard form in which the capabilities are represented. Also, in many cases, the documentation is not comprehensive.
Ideally I would expect a standard template on npmjs.com with the details below. This would help accelerate the consumption and serviceability of a module when it is deployed in large and complex software systems.
A high level description of the module.
List of its most common use cases.
List of its most common (and desired) topologies
List of exposed APIs, with their input and expected output, side effects, assumptions.
Tips on debugging the potential issues.
Potential side effects (cache, memory, open fd's, leftover disc files, network access)
People can add / refine items which they think will improve the usability of modules, before we take it up with the npm community.

How to prevent malicious *.js scripts from executing in Node.js

I'm using Node.js to create a web service. In the implementation, I consume many third-party modules installed via npm. This is a security issue if any of the consumed modules contain malicious *.js scripts. For example, the malicious code may delete all my files on disk, or silently collect secret data.
I have a couple of questions regarding this.
How can I detect whether a module has a security issue?
What should I do to prevent malicious *.js scripts from executing in Node.js?
I'd really appreciate it if you could share any experience of building a Node.js service.
Thanks,
Jeffrey
One concern you did not raise is that a module might try to make a direct connection to your database itself, or to other services on your internal network. This might be prevented by setting passwords which the module cannot find so easily.
1. Restricting disk access
This project was presented at NodeConf last year. It attempts to restrict filesystem access in precisely the situation you describe.
https://github.com/yahoo/fs-lock
"The goal for this module is to help when you are loading 3rd party modules and you need to restrict their access."
It sounds rather like the proposal Jeffrey made in the comments in Plato's answer.
(If you want to look further into hooking OS calls, this hookit project may present a few ideas. Although in its current form it only wraps the callback function, it might provide inspiration of what to hook, and how. Here is an example of it being used.)
2. Analyse flow of sensitive data
If you are only worried about data-stealing (not filesystem or database access), then you can focus your concerns:
You should be most concerned about those packages which are being passed sensitive data. Presumably some of the data on your web-service is presented to the public anyway!
Most packages will not have access to the full stack of your application, only the bits of data you pass them. If a package is only being passed a small amount of sensitive data, and never passed the rest of the data, it may not be able to do anything malicious with the data it receives. (For example, if you pass all your usernames to one package for processing and all your addresses to a different package, that is a much smaller concern than if you pass all your usernames, addresses and credit-card numbers to the same package!)
Identify the sensitive data in your app, and note which functions in which modules they are passed to.
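A minimal sketch of the "only pass the bits of data it needs" idea: copy the required fields into a fresh object before handing it to a package. `thirdPartyFormatName` is a stand-in for an untrusted package function, and the user record is invented for the example.

```javascript
// Copy only the named fields, so a third-party function never sees the rest.
function pick(record, fields) {
  const out = {};
  for (const field of fields) {
    if (Object.prototype.hasOwnProperty.call(record, field)) {
      out[field] = record[field];
    }
  }
  return out;
}

// Stand-in for a function from a package you do not fully trust (hypothetical).
function thirdPartyFormatName(user) {
  return `${user.firstName} ${user.lastName}`.trim();
}

const user = {
  firstName: 'Ada',
  lastName: 'Lovelace',
  address: '12 Example Street',
  cardNumber: '0000-0000-0000-0000',
};

// The package only ever receives the two name fields.
const safeView = pick(user, ['firstName', 'lastName']);
const label = thirdPartyFormatName(safeView);
console.log(label); // Ada Lovelace
```

This only limits what is passed explicitly as arguments. A module running in the same process can still reach globals, the filesystem and the network, which is what the other measures in this answer are about.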
3. Perform efficient code review
You may not need to go to Github to read the code. The great majority of packages provide all their source-code in their install folder inside node_modules. (There are a few packages which provide binaries however; these are naturally harder to verify.)
If you do want to check the code yourself, there may be ways to reduce the amount of work involved:
To secure your own app, you do not need to read the entire source code of all packages in your project. You only need to review those functions which are actually called.
You may trace the code by reading it, or with the aid of a text-based debugger, or a GUI debugger. (Of course you should look out for branching, where different inputs may cause different parts of the module to be called.)
Set breakpoints when you call into a module which you don't trust, so you can step through the code that is called and see what it does. You may be able to conclude that only a small part of the module is used, so only that code needs to be verified.
Whilst tracing flow should cover concerns about sensitive data at runtime, to check for file access or database access, we should also look at the initialisation code of each module which is required, and all calls (including requires) which are made from there.
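As a first pass over a module's initialisation code, a crude static scan can flag which files pull in built-ins with file, process or network access. This is only a sketch: the list of "sensitive" built-ins is my own assumption, and a regex will miss dynamic requires, `import` statements and anything obfuscated, so it narrows down what to read rather than proving anything is safe.

```javascript
// Built-in modules that grant file, process or network access (assumed list).
const SENSITIVE = new Set([
  'fs', 'child_process', 'net', 'http', 'https', 'dgram', 'dns',
]);

// Find literal require('...') calls that load one of the sensitive built-ins.
function findSensitiveRequires(source) {
  const pattern = /\brequire\(\s*['"](?:node:)?([^'"]+)['"]\s*\)/g;
  const hits = new Set();
  for (const match of source.matchAll(pattern)) {
    if (SENSITIVE.has(match[1])) hits.add(match[1]);
  }
  return [...hits].sort();
}

// Sample source text standing in for a file read from node_modules.
const sample = [
  "const fs = require('fs');",
  'const _ = require("underscore");',
  "const { exec } = require('node:child_process');",
].join('\n');

const found = findSensitiveRequires(sample);
console.log(found); // [ 'child_process', 'fs' ]
```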
4. Other measures
It might be wise to lock the version number of each package in package.json so that you don't accidentally install a new version of a package until you decide that you need to.
You may use social factors to build confidence in a package. Check the respectability of the author. Who is he, and who does he work for? Do the author and his employers have a reputation to uphold? Similarly, who uses his project? If the package is very popular, and used by industry giants, it is likely that others have already reviewed the code.
You may wish to visit github and enable notifications for all the top-level modules you are using, by "watching" the repository. This will inform you if any vulnerabilities are reported in the package in future.
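For the version-locking point, a small sketch that lists dependencies whose version in package.json is a range rather than one exact release. The manifest below is invented, and the exact-version check is a simplified regex, not a full semver parser.

```javascript
// True when the string names exactly one release, e.g. "1.2.3" or "1.2.3-beta.1".
function isExactVersion(range) {
  return /^\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?(?:\+[0-9A-Za-z.-]+)?$/.test(range);
}

// List every dependency that is allowed to float to a newer version.
function unpinnedDependencies(manifest) {
  const result = [];
  for (const section of ['dependencies', 'devDependencies']) {
    for (const [name, range] of Object.entries(manifest[section] || {})) {
      if (!isExactVersion(range)) result.push(`${name}@${range}`);
    }
  }
  return result;
}

// Hypothetical package.json contents.
const manifest = {
  dependencies: { underscore: '1.13.6', moment: '^2.29.4' },
  devDependencies: { mocha: '~10.2.0' },
};

const unpinned = unpinnedDependencies(manifest);
console.log(unpinned); // [ 'moment@^2.29.4', 'mocha@~10.2.0' ]
```

Pinning top-level versions does not pin their own dependencies. A committed package-lock.json installed with `npm ci` covers the whole tree, and `npm install --save-exact` writes exact versions when adding a package.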
Most (all?) modules have source code available on GitHub; you can read through the source and look for security problems, or hire a security professional to do the job.
I just take the risk, although I tend to use popular packages with hundreds of commits, active maintenance, and issue lists.
If your project dependency tree is large enough, reviewing all of your dependencies is not a feasible long-term strategy.
The original answer from Joey has some good countermeasures you can use for specific scenarios. I've also seen https://github.com/berstend/node-safe - could make you slightly safer on mac.
A general solution to the problem is taking shape though.
How to protect a project from malicious packages
make sure you don't run lifecycle (postinstall) scripts unless they're known and necessary (see my talk on this topic)
put 3rdparty code in a compartment, lock down the environment, decide on which powerful APIs to pass to each package.
The second step requires the use of Compartment, which is a work-in-progress in TC39 https://github.com/tc39/proposal-compartments/
But a shim exists, and some tooling was built on top of that shim.
You could use the SES-shim directly and implement your own controls, or use the convenience of LavaMoat
LavaMoat lets you generate and tweak a per-package policy where you can decide which globals and builtins it should have access to.
LavaMoat also offers a tool to manage install scripts.
Here's my talk on SES and LavaMoat with a demo at the end.
How to set up LavaMoat
See LavaMoat docs for more details
disable/allow dependency lifecycle scripts (e.g. "postinstall") via @lavamoat/allow-scripts
npm i --ignore-scripts -D @lavamoat/allow-scripts
npx --no-install allow-scripts setup
npx --no-install allow-scripts auto
then, edit the allow-list in package.json
after every install/reinstall run allow-scripts
run your server or build process in lavamoat-node
npm i -D lavamoat
in your package.json add something like:
"scripts": {
  "lavamoat-policy": "lavamoat app.js --autopolicy",
  "start": "lavamoat app.js"
}
run lavamoat-policy every time you make changes to your dependency tree and review the policy (see also: policy override)
run npm start to start your app
Disclaimer: I contribute to LavaMoat and Endo. They are Open Source projects on permissive licenses.

Modular programming and node

UPDATE 1: I made a lot of progress on this one. I pretty much gave up (at least for now, but maybe long term) on the idea of allowing user-uploaded modules. However, I am developing a structure so that several modules can be defined and loaded. A module will be initialised, set its own routes, and have a "public" directory for Javascript to be served. The more I see it, the more I realise that I can (should) also move the calls that are now system-wide into a module called "system".
UPDATE 2: I have made HUGE progress on this. I am about to commit tons of code on GitHub which will allow people to do really, really good modular programming (with modules exposing both client and server side code) using Node and Express. Please stay tuned.
UPDATE 3: I rewrote this thing as a system to register modules and enable them to communicate via an event/hooks system. It's coming along extremely nicely. I have tons of code already good to go -- I am just porting it to the new system. Feel free to have a look at the project on GitHub: https://github.com/mercmobily/hotplate
UPDATE 4: This is good. It turns out that my idea about a module being client AND server is really working.
UPDATE 5: The module is getting closer to something usable. I implemented a new loader which will take into account what an init() function will invokeAll() -- and will make sure that modules providing that hook will be loaded first. This opens up hotplate to a whole new level.
UPDATE 6: Hotplate is now close to 12000 lines of code. By the time it's finished, sometime in February, I imagine it will be close to 20000 lines of code. It does a lot of stuff, and it all started here on StackOverflow! I need it to develop my own SaaS, so I really need to get it finished by February (so that I can sprint to July and finish the first version of BookingDojo). Thanks everybody!
I am writing something that will probably turn into a pretty big piece of software. The short story is that it's nodejs + Express + Mongodb/Mongoose + Dojo (client side).
NOTE: Questions in this text are marked as [Q1], [Q2], etc.
Coming from a Drupal background (and knowing how coooomplex it has evolved, something I would like to avoid), I am a bit of a module freak. At the moment, I've done the application's boilerplate (hotplate: https://github.com/mercmobily/hotplate ). It does all of the boring stuff (users, workspaces, password reminder, etc.) and it's missing quite a few pieces.
I would like to come up with a design that will allow modules in a similar fashion as Drupal (but possibly better). That is:
Modules can define new routes, and handle them
Modules are installed system-wide, and then each workspace can enable a set list of them
The initial architecture could be something along those lines:
A "modules" directory, where there is one directory per module
Each module has a directory for "public" files for the Javascript side of things
Each module would have public/startup.js which would be included in the app's javascript
Each module would have server/node.js which would be included on the fly by the server if/when needed
There would be one route defined, something like /app/:workspaceid/modules/MODULE_NAME/.* with a middleware that checks if that workspace has MODULE_NAME enabled -- and if it does, calls a module's function with the passed parameter
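The last point could be sketched as an Express-style middleware with the usual `(req, res, next)` signature. This is only an illustration of the idea, not hotplate code: the in-memory registry, the `moduleName` parameter name and the 404 response are all assumptions, and the fake request/response objects exist just so the sketch runs without Express installed.

```javascript
// Hypothetical registry: which modules each workspace has enabled.
const enabledModules = new Map([
  ['ws-1', new Set(['calendar', 'invoices'])],
  ['ws-2', new Set(['calendar'])],
]);

// Build a middleware that lets the request through only when the
// workspace named in the route has the requested module enabled.
function requireEnabledModule(registry) {
  return function (req, res, next) {
    const { workspaceid, moduleName } = req.params;
    const enabled = registry.get(workspaceid);
    if (enabled && enabled.has(moduleName)) return next();
    res.statusCode = 404;
    res.end('Module not enabled for this workspace');
  };
}

// Minimal stand-ins for the request and response objects.
function simulate(workspaceid, moduleName) {
  const res = { statusCode: 200, body: null, end(text) { this.body = text; } };
  let passed = false;
  const middleware = requireEnabledModule(enabledModules);
  middleware({ params: { workspaceid, moduleName } }, res, () => { passed = true; });
  return { passed, statusCode: res.statusCode };
}

console.log(simulate('ws-1', 'invoices')); // { passed: true, statusCode: 200 }
console.log(simulate('ws-2', 'invoices')); // { passed: false, statusCode: 404 }
```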
[Q1]: Does this sound vaguely sane?
Issues:
I want to make this dynamic. I would like modules to be required when needed on the spot. This should be easy enough to do, by requiring things on the fly.
server/node.js would have a function called, but that function feels/looks an awful lot like a router itself
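Requiring on the fly could look roughly like this. The loader name, the name check and the shape of what server/node.js exports are assumptions for the sake of the example; the demo writes a throwaway module to a temp directory so it runs on its own.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Build a loader that requires <modulesDir>/<name>/server/node.js on demand.
function createModuleLoader(modulesDir) {
  return function loadModule(name) {
    // Reject anything that is not a plain directory name, to avoid path traversal.
    if (!/^[A-Za-z0-9_-]+$/.test(name)) {
      throw new Error(`Invalid module name: ${name}`);
    }
    // require() caches by resolved filename, so repeated calls are cheap.
    return require(path.join(modulesDir, name, 'server', 'node.js'));
  };
}

// Demo: create a throwaway "calendar" module on disk.
const modulesDir = fs.mkdtempSync(path.join(os.tmpdir(), 'modules-'));
fs.mkdirSync(path.join(modulesDir, 'calendar', 'server'), { recursive: true });
fs.writeFileSync(
  path.join(modulesDir, 'calendar', 'server', 'node.js'),
  'module.exports = { handle: (action) => "calendar handled " + action };\n'
);

const loadModule = createModuleLoader(modulesDir);
const calendar = loadModule('calendar');
console.log(calendar.handle('list')); // calendar handled list
```

Because the module name would come from the URL, validating it before building the path matters more than the lazy loading itself.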
[Q2] Do you have any specific hints about this one?
These don't seem to be too much of a concern. However, the real question comes when you talk about security.
Privacy. This is a nasty one. At the moment, all the calls will make the right queries to mongoDb filtering by workspaceId. I would like to enforce some way so that there is no clear access to the database by the modules, so that each module doesn't have access to data that belongs to other workspaces
User-defined modules. I would love to give users the ability to upload their own modules (and maybe make them available to other users). But, this effectively means allowing people to upload code that will be executed by node itself! How would you go about this?
[Q3] How would you go about these privacy/security issues? Is there any way for example to run the user-uploaded code in a sort of node sandbox? What about access to file system etc.?
Thanks!
In the end, I answered this myself -- the hard way.
The answer: hotplate, https://github.com/mercmobily/hotplate
It does most of what I describe above. More importantly, with hotPlate (using hotPage and hotClientPages, available by default), you can write a module which
Defines some routes
Defines a "public" directory with the UI
Defines specific CSS and JS files that must be loaded when loading that module
Is able to add route-specific JSes if needed
Status:
I am accepting this answer as I am finished developing Hotplate's "core", which was the point of this answer. I still need to "do" things (for example, once I've written docs, I will make sure "hotplate" is the only directory in the module, without having an example server there). However, the foundation is there. In terms of "core", it's only really missing the "auth" side of the story (which will require a lot of thinking, since I want to make it so that it's db agnostic AND interfacing with passport). The Dojo widgets are a great bonus, although this framework can be used with anything (and in fact backbone-specific code would be sweeeeet).
What hotplate DOESN'T do:
What hotplate DOESN'T do is give users the ability to upload modules which will then be loaded in the application. This is extremely tricky. The client side wouldn't be so bad (the user could define Javascript to upload, and there could be a module to do that, no worries). The server side, however, is tricky at best. There are just too many things that can go wrong (the client might upload a blocking piece of code, or they could start reading the file system, they would have access to the full database, and so on).
Solutions to these issues are possible, but none of them is easy (you can cage the user's node environment and get it to run on a different port, for example), and some problems will stay. But there is always hope.
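To illustrate two of the problems above (blocking code, and access to `require`), here is a small sketch using Node's built-in `vm` module. It is not what hotplate does, and the Node documentation states plainly that `vm` is not a security mechanism, so treat it as a demonstration of the problem space only; real isolation needs a separate process or container, as suggested above. The helper name `runUntrusted` and the 50 ms limit are arbitrary.

```javascript
const vm = require('vm');

// Run a snippet in a fresh context with a time limit.
// NOT a real sandbox: the vm module is not a security boundary.
function runUntrusted(code, timeoutMs = 50) {
  try {
    const value = vm.runInNewContext(code, {}, { timeout: timeoutMs });
    return { ok: true, value };
  } catch (err) {
    return { ok: false, error: err.code || err.name };
  }
}

// Ordinary expressions evaluate normally.
const sum = runUntrusted('1 + 2');
// The new context has no require, so fs and friends are not directly reachable.
const requireType = runUntrusted('typeof require');
// A blocking loop is interrupted by the timeout instead of hanging the server.
const blocked = runUntrusted('while (true) {}');

console.log(sum.value, requireType.value, blocked.ok);
```

The timeout only covers synchronous execution of the snippet, and anything you place in the context object becomes reachable from the uploaded code.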

Why is everybody using Node.js and NPM for compiling JavaScript libraries? [closed]

Closed 10 years ago.
I am really confused with everybody in JS community using Node.js and NPM with their JS libraries. Why do we have to resort to such extreme measures? What problems is this solving for us?
[Edit]
I think my question wasn't to the point.
Frameworks like Ember.js, Batman.js and more recently Yahoo's Mojito require me to use node.js - why this dependency on Node.js and NPM?
Why are we making things complex? "If you haven't already, you'll need to install node.js..." You read messages like this and you're turned off.
Why? There is already a problem of plenty in JS - far too many active JS libs/frameworks to choose from - going by the record of JS libs most will become inactive soon. There are just too many things to look for that often result in multiple frameworks in an app - dependency management, routers, MVC, templating, etc. On top of this we are using Node.js to use these libs/frameworks... How will this push usage of these libraries to new JS developers? JS was meant to be easy!
"If you haven't already, you'll need to install node.js..." You read messages like this and you're turned off. Why?
NodeJS is Google's V8 "running on its own". It's a JS engine with an additional low-level API (network, I/O, etc.). NodeJS provides "the missing platform" for JS developers, who were previously limited to working in a browser.
why this dependency on Node.js and NPM?
Node.js, aside from being used to run apps (servers, proxies, bots etc.), can also be used as a tool to build and aid development. Take for example Grunt, which is a scriptable automation tool similar to Make. Since you script it in plain JS, you need not learn another tool or language to do automation. Another tool is Bower, which is a front-end package management tool. All you need to do is run bower install jquery and it installs jquery with that single command. No need for manual download, copy and paste.
NPM, on the other hand, is Node.js' package manager. It's a program that manages the modules you use on NodeJS. No need to list down your modules manually, and no need to remember them when you develop somewhere else. As long as you have the package list NPM made for you, reinstalling is just a matter of npm install.
Why are we making things complex?
We're not. In fact, we're making them easy for developers. Instead of worrying on your workflow, managing your libraries, or doing stuff manually, you can off-load these tasks to some of the modules that exist on NPM. Then you can just focus on what you are actually doing.
On top of this we are using Node.js to use these libs/frameworks... How will this push usage of these libraries to new JS developers? JS was meant to be easy!
As mentioned above, NodeJS is a versatile platform. It can be used as a server (Connect, Express), an automation tool (Grunt), a package management system (using NPM, Bower etc.), a testing platform (QUnit, Mocha), a proxy, game server, chat bot.
And it's beneficial, especially to the JS developer, since these weren't possible in JS.
There is already a problem of plenty in JS - far too many active JS libs/frameworks to choose from - going by the record of JS libs most will become inactive soon. There are just too many things to look for that often result in multiple frameworks in an app - dependency management, routers, MVC, templating, etc.
Well, it's good to have an abundant set of frameworks. Your work will be cut in half after learning some of them. Implementation diversity is also good, to address different styles of coding and different approaches of implementation. Some libraries rise from differing approaches, while others rise from the incompatibilities and/or incompleteness of others.
The developers are hard at work to make life easier for other developers by normalizing JS quirks (because browser vendors just can't seem to do the right thing of following standards) and most of them are done voluntarily, like free beer - you should be happy for that. Besides, nobody's forcing you to use one anyway.
The CommonJS standard (best implemented, in my opinion, by Node.js and NPM) introduces the concept of modules to Javascript. For years, the Perl and Python communities have demonstrated why modules are awesome:
Unix-style "do one thing and do it well" libraries that are small and heavily tested against bugs, that can be combined easily (with no namespace issues) to solve your particular task.
Central repository of open source modules (CPAN, NPM, etc) that you can easily pull the modules from (NPM takes it one level higher by keeping all of the versions available, so you can specify that your code uses the last known "good" version rather than hope that nothing broke when you redeploy a la CPAN).
Greater peer review of the code (since they are more easily composable, they're used in more varied situations, so this is what helps reduce the bugs, but also what helps improve the modules to be more generalized).
Greater variety of tasks solved. Since the libraries are short, pretty much anyone can write one. That does mean there's a lot more crap to filter through (articles about widely-used libraries help with this), but it also means a library that solves some very specific problem (such as localizing strings and dates ) probably also exists.
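A tiny sketch of what a CommonJS module looks like in practice: the file exposes one function through `module.exports`, and everything else in it stays private to that file. The `slugify` module is invented for the example and is written to a temp directory only so that the whole thing runs as a single script.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Write a one-function module to disk (normally this would be its own file).
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'cjs-demo-'));
fs.writeFileSync(
  path.join(dir, 'slugify.js'),
  [
    '// Private to this module: not visible to whoever requires it.',
    'const separator = "-";',
    'module.exports = function slugify(text) {',
    '  return text.trim().toLowerCase().split(/\\s+/).join(separator);',
    '};',
  ].join('\n')
);

// The consumer gets only what the module chose to export.
const slugify = require(path.join(dir, 'slugify.js'));
const slug = slugify('  Do One Thing Well ');
console.log(slug); // do-one-thing-well

// The module's internals do not leak into this file's scope.
console.log(typeof separator); // undefined
```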
And then a Node module called browserify makes the actual build process for your client-side code incredibly simple, and you can use just about any piece of code you find on NPM.
This breaks away from the "kitchen sink" mentality of libraries like jQuery (who have developed their own custom build system so they can start modularizing their code, too) that believe they need to solve every problem their user might have, rather than just produce results that can be used by other libraries.
Very often you need different builds of your JavaScript. Usually it is spread out across different files; sometimes it's in CoffeeScript.
You often want an AMD-compatible build as well as a CommonJS one, plus regular minified and unminified builds.
There's also the potential for dependency resolution.
I've even seen a library that had a build for jQuery and Prototype...
Edit: noticed I was answering the question as worded in the question body, but missed the compilation question in the title.
What criteria do you have for considering this an "extreme measure"? This has been done for years, for the sake of writing clean, easy to read/write code, but precompiled to be optimized for on-the-wire transfer (and perhaps other optimizations as well). Node.js makes a nice solution for this, simply because it's also using JavaScript and therefore familiar to people using it to compile their JavaScript code. Previously this was typically done in something like Python, which, though working, seems less sensical to me than sticking with a common language.

Resources