Modular programming and node

UPDATE 1: I made a lot of progress on this one. I pretty much gave up (at least for now, but maybe long term) on the idea of allowing user-uploaded modules. However, I am developing a structure so that several modules can be defined and loaded. A module will be initialised, set its own routes, and have a "public" directory for Javascript to be served. The more I see it, the more I realise that I can (should) also move the calls that are now system-wide into a module called "system".
UPDATE 2: I have made HUGE progress on this. I am about to commit tons of code on GitHub which will allow people to do really, really good modular programming (with modules exposing both client and server side code) using Node and Express. Please stay tuned.
UPDATE 3: I rewrote this thing as a system to register modules and enable them to communicate via an event/hooks system. It's coming along extremely nicely. I have tons of code already good to go -- I am just porting it to the new system. Feel free to have a look at the project on GitHub: https://github.com/mercmobily/hotplate
UPDATE 4: This is good. It turns out that my idea about a module being client AND server is really working.
UPDATE 5: The module is getting closer to something usable. I implemented a new loader which takes into account which hooks an init() function will invokeAll() -- and makes sure that the modules providing those hooks are loaded first. This opens up hotplate to a whole new level.
UPDATE 6: Hotplate is now close to 12000 lines of code. By the time it's finished, sometime in February, I imagine it will be close to 20000 lines of code. It does a lot of stuff, and it all started here on StackOverflow! I need it to develop my own SaaS, so I really need to get it finished by February (so that I can sprint to July and finish the first version of BookingDojo). Thanks everybody!
I am writing something that will probably turn into a pretty big piece of software. The short story is that it's nodejs + Express + Mongodb/Mongoose + Dojo (client side).
NOTE: Questions in this text are marked as [Q1], [Q2], etc.
Coming from a Drupal background (and knowing how coooomplex it has evolved, something I would like to avoid), I am a bit of a module freak. At the moment, I've done the application's boilerplate (hotplate: https://github.com/mercmobily/hotplate ). It does all of the boring stuff (users, workspaces, password reminder, etc.) and it's missing quite a few pieces.
I would like to come up with a design that will allow modules in a similar fashion as Drupal (but possibly better). That is:
Modules can define new routes, and handle them
Modules are installed system-wide, and then each workspace can enable a set list of them
The initial architecture could be something along those lines:
A "modules" directory, where there is one directory per module
Each module has a directory for "public" files for the Javascript side of things
Each module would have public/startup.js which would be included in the app's javascript
Each module would have server/node.js which would be included on the fly by the server if/when needed
There would be one route defined, something like /app/:workspaceid/modules/MODULE_NAME/.* with a middleware that checks if that workspace has MODULE_NAME enabled -- and if it does, calls a module's function with the passed parameter
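A rough sketch of what that gating middleware might look like in Express (Workspace, enabledModules and the module's handle() function are illustrative names I'm assuming here, not actual hotplate APIs):

var express = require('express');
var app = express();

// Gate /app/:workspaceid/modules/:moduleName/* on the workspace having
// that module enabled, then hand the request over to the module.
app.all('/app/:workspaceid/modules/:moduleName/*', function (req, res, next) {
  Workspace.findById(req.params.workspaceid, function (err, workspace) {
    if (err) return next(err);
    if (!workspace || workspace.enabledModules.indexOf(req.params.moduleName) === -1) {
      return res.send(403); // module not enabled for this workspace
    }
    // require() caches modules, so loading "on the spot" is cheap after the first hit
    var mod = require('./modules/' + req.params.moduleName + '/server/node.js');
    mod.handle(req, res, next);
  });
});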
[Q1]: Does this seem vaguely sane?
Issues:
I want to make this dynamic. I would like modules to be required when needed on the spot. This should be easy enough to do, by requiring things on the fly.
server/node.js would have a single entry function called by that middleware, but that function feels/looks an awful lot like a router itself
[Q2] Do you have any specific hints about this one?
These don't seem to be too much of a concern. However, the real question comes when you talk about security.
Privacy. This is a nasty one. At the moment, all the calls make the right queries to MongoDB, filtering by workspaceId. I would like to enforce a layer so that modules have no direct access to the database, and each module cannot reach data that belongs to other workspaces.
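One way to enforce that (a minimal sketch; scopedCollection and its method names are illustrative, not an existing API) is to hand each module a wrapped collection that stamps workspaceId onto every query and insert:

// Wrap a collection so every operation is forced into one workspace.
// The module only ever receives the scoped handle, never the raw collection.
function scopedCollection(collection, workspaceId) {
  return {
    find: function (query, cb) {
      query = query || {};
      query.workspaceId = workspaceId; // enforced; the module cannot override it
      return collection.find(query, cb);
    },
    insert: function (doc, cb) {
      doc.workspaceId = workspaceId; // stamp every new document
      return collection.insert(doc, cb);
    }
  };
}

// var users = scopedCollection(db.collection('users'), req.params.workspaceid);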
User-defined modules. I would love to give users the ability to upload their own modules (and maybe make them available to other users). But, this effectively means allowing people to upload code that will be executed by node itself! How would you go about this?
[Q3] How would you go about these privacy/security issues? Is there any way for example to run the user-uploaded code in a sort of node sandbox? What about access to file system etc.?
Thanks!

In the end, I answered this myself -- the hard way.
The answer: hotplate, https://github.com/mercmobily/hotplate
It does most of what I describe above. More importantly, with hotPlate (using hotPage and hotClientPages, available by default), you can write a module which
Defines some routes
Defines a "public" directory with the UI
Defines specific CSS and JS files that must be loaded when loading that module
Can add route-specific JS files if needed
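To make that concrete, here is a purely illustrative sketch of the shape such a module could take (these property names are NOT hotplate's actual API -- see the GitHub repo for the real thing):

// Illustrative module shape only; not hotplate's real interface
module.exports = {
  // server side: register this module's routes on the app
  setRoutes: function (app) {
    app.get('/example', function (req, res) { res.send('hello'); });
  },
  // static assets served for this module
  publicDir: __dirname + '/public',
  // client-side files to load along with this module's pages
  clientJs: ['startup.js'],
  clientCss: ['example.css']
};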
Status:
I am accepting this answer as I am finished developing Hotplate's "core", which was the point of this answer. I still need to "do" things (for example, once I've written docs, I will make sure "hotplate" is the only directory in the module, without having an example server there). However, the foundation is there. In terms of "core", it's only really missing the "auth" side of the story (which will require a lot of thinking, since I want to make it db-agnostic AND interfacing with Passport). The Dojo widgets are a great bonus, although this framework can be used with anything (and in fact backbone-specific code would be sweeeeet).
What hotplate DOESN'T do:
It doesn't give users the ability to upload modules which will then be loaded in the application. This is extremely tricky. The client side wouldn't be so bad (the user could define Javascript to upload, and there could be a module to do that, no worries). The server side, however, is tricky at best. There are just too many things that can go wrong: the client might upload a blocking piece of code, or start reading the file system, or access the full database, and so on.
Solutions to these issues are possible, but none of them are easy (you could cage the user's node environment and run it on a different port, for example), and some problems would remain. But there is always hope.
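For the record, Node's built-in vm module gives a taste of what "caging" could look like, but on its own it is NOT a real security boundary -- blocking code and sandbox escapes remain possible, which is exactly the problem described above. A minimal sketch:

var vm = require('vm');

// Pretend this string was uploaded by a user
var untrustedSource = 'result = input * 2;';

// The sandbox exposes only what we choose: no require, no fs, no db handle
var sandbox = { input: 21, result: null };

// The timeout option (available in reasonably recent Node versions)
// at least bounds synchronous run time
vm.runInNewContext(untrustedSource, sandbox, { timeout: 100 });

console.log(sandbox.result); // 42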

Related

When should I create my own module package instead of using other packages?

I'm still a new node js developer, currently building a personal project, and I recently found out that there are open source packages available on npm similar to the thing I'm developing.
These packages carry advanced concepts that I haven't come up with yet and provide more options than I need. But after thinking about it, it occurred to me: why not develop a package that serves my project exactly the way I want, instead of using packages where I won't use more than 5% of their functions?
Benefits of using an existing, well-supported module:
You save your development time for things that haven't already been written by someone else, allowing you to make faster progress on your project
Well tested by the community (pre-tested code saves you lots of time)
Other people finding and fixing bugs (don't underestimate the importance of this)
The code will likely be kept up-to-date as tech changes over time
Possible community of people to ask questions of that knows about that package
Non-issues with using an existing, well-supported module:
Code size is rarely an issue for server-side nodejs development, so the fact that a package may contain extra code that you don't need is generally not a practical issue of any consequence. If code size is paramount (say you were running on a small, embedded system), then nodejs itself might not be the right environment, as it's not exactly compact.
Reasons not to use an existing, well-supported module:
You aren't allowed to use open-source code in your project (but then you wouldn't be using nodejs if that was the case).
No existing module does what you want.
Existing modules that do what you want don't appear to be well supported or have many relevant bugs that have been open for a long time. In this case, it still might be worthwhile for you to clone the repository and use it as a starting point or learning point for your own module.
I'm still a new node js developer, currently building a personal project, and I recently found out that there are open source packages available on npm similar to the thing I'm developing.
IMO, this is part of the magic sauce of doing nodejs development. The huge repository of open source packages (through NPM) that are so easy to use make your development far more productive than developing everything from scratch yourself.
why not develop a package that serves me in my project the way I want instead of using packages where I won't use more than 5% of the functions in my project?
Unused code doesn't really cost you anything of consequence in a server-side environment. If you really wanted to, you could use a bundler that supports tree-shaking, which removes the code you're not using.
The question that really matters is whether an existing module meets your needs or is close enough that you only have to write a little bit of code in order to use it. If that's the case, then the question becomes this: "Why should I use my precious development time to write a package from scratch when I could use far less development time by using something that is already available for free, is already tested and is already proven, and then spend that development time (I would have spent developing that package) on other things that advance my product/service further?"
In many ways, this is really no different than using the fs module built into nodejs. You use it because it's already developed and already tested and saves you time over developing your own file access module. Yes, the fs module contains lots of code you may never need, but that's not the question. The question is whether it already contains the code you DO need.

How to prevent malicious *.js scripts from executing in Node.js

I'm using Node.js to create a web service. In the implementation, I consumed many third-party modules which are installed via npm. There is a security issue if there are malicious *.js scripts in the consumed modules. For example, the malicious code may delete all my disk files, or collect secret data in silence.
I have a couple of questions regarding this.
How to detect if there is security issue in the module?
What should I do to prevent malicious *.js scripts from executing in Node.js?
I would really appreciate it if you could share any experience of building a Node.js service.
Thanks,
Jeffrey
One concern you did not raise is that a module might try to make a direct connection to your database itself, or to other services on your internal network. This might be prevented by setting passwords which the module cannot find so easily.
1. Restricting disk access
This project was presented at NodeConf last year. It attempts to restrict filesystem access in precisely the situation you describe.
https://github.com/yahoo/fs-lock
"The goal for this module is to help when you are loading 3rd party modules and you need to restrict their access."
It sounds rather like the proposal Jeffrey made in the comments in Plato's answer.
(If you want to look further into hooking OS calls, this hookit project may present a few ideas. Although in its current form it only wraps the callback function, it might provide inspiration of what to hook, and how. Here is an example of it being used.)
2. Analyse flow of sensitive data
If you are only worried about data-stealing (not filesystem or database access), then you can focus your concerns:
You should be most concerned about those packages which are being passed sensitive data. Presumably some of the data on your web-service is presented to the public anyway!
Most packages will not have access to the full stack of your application, only the bits of data you pass them. If a package is only being passed a small amount of sensitive data, and never passed the rest of the data, it may not be able to do anything malicious with the data it receives. (For example, if you pass all your usernames to one package for processing and all your addresses to a different package, that is a much smaller concern than if you pass all your usernames, addresses and credit-card numbers to the same package!)
Identify the sensitive data in your app, and note which functions in which modules they are passed to.
3. Perform efficient code review
You may not need to go to Github to read the code. The great majority of packages provide all their source-code in their install folder inside node_modules. (There are a few packages which provide binaries however; these are naturally harder to verify.)
If you do want to check the code yourself, there may be ways to reduce the amount of work involved:
To secure your own app, you do not need to read the entire source code of all packages in your project. You only need to review those functions which are actually called.
You may trace the code by reading it, or with the aid of a text-based debugger, or a GUI debugger. (Of course you should look out for branching, where different inputs may cause different parts of the module to be called.)
Set breakpoints when you call into a module which you don't trust, so you can step through the code that is called and see what it does. You may be able to conclude that only a small part of the module is used, so only that code needs to be verified.
Whilst tracing flow should cover concerns about sensitive data at runtime, to check for file access or database access, we should also look at the initialisation code of each module which is required, and all calls (including requires) which are made from there.
4. Other measures
It might be wise to lock the version number of each package in package.json so that you don't accidentally install a new version of a package until you decide that you need to.
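For example, exact versions (no ^ or ~ range prefix) in package.json stop npm from silently pulling in newer releases; the version numbers here are just illustrative:

"dependencies": {
  "express": "3.4.8",
  "mongoose": "3.8.5"
}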
You may use social factors to build confidence in a package. Check the respectability of the author. Who is he, and who does he work for? Do the author and his employers have a reputation to uphold? Similarly, who uses his project? If the package is very popular, and used by industry giants, it is likely that others have already reviewed the code.
You may wish to visit github and enable notifications for all the top-level modules you are using, by "watching" the repository. This will inform you if any vulnerabilities are reported in the package in future.
Most (all?) modules have source code available on Github, you can read through the source and look for security problems, or hire a security professional to do the job.
I just take the risk - although I tend to use popular packages with hundreds of commits, active maintenance, and issue lists.
If your project dependency tree is large enough, reviewing all of your dependencies is not a feasible long-term strategy.
The original answer from Joey has some good countermeasures you can use for specific scenarios. I've also seen https://github.com/berstend/node-safe - could make you slightly safer on mac.
A general solution to the problem is taking shape though.
How to protect a project from malicious packages
make sure you don't run lifecycle (postinstall) scripts unless they're known and necessary (see my talk on this topic)
put third-party code in a compartment, lock down the environment, and decide which powerful APIs to pass to each package.
The second step requires the use of Compartment, which is a work-in-progress in TC39 https://github.com/tc39/proposal-compartments/
But a shim exists, and some tooling has been built on top of that shim.
You could use the SES-shim directly and implement your own controls, or use the convenience of LavaMoat
LavaMoat lets you generate and tweak a per-package policy where you can decide which globals and builtins it should have access to.
LavaMoat also offers a tool to manage install scripts.
Here's my talk on SES and LavaMoat with a demo at the end.
How to set up LavaMoat
See LavaMoat docs for more details
disable/allow dependency lifecycle scripts (eg. "postinstall") via @lavamoat/allow-scripts
npm i --ignore-scripts -D @lavamoat/allow-scripts
npx --no-install allow-scripts setup
npx --no-install allow-scripts auto
then, edit the allow-list in package.json (see the sketch after these steps)
after every install/reinstall, run allow-scripts
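The allow-list lives under a lavamoat key in package.json and looks something like this (the package names are illustrative):

"lavamoat": {
  "allowScripts": {
    "some-native-dependency": true,
    "package-whose-postinstall-you-skip": false
  }
}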
run your server or build process in lavamoat-node
npm i -D lavamoat
in your package.json add something like:
"scripts": {
"lavamoat-policy": "lavamoat app.js --autopolicy",
"start": "lavamoat app.js"
run lavamoat-policy every time you make changes to your dependency tree and review the policy (see also: policy override)
run npm start to start your app
Disclaimer: I contribute to LavaMoat and Endo. They are Open Source projects on permissive licenses.

Managing multiple NPM modules browser-side

Okay, I've looked and looked and don't quite see a question that looks like mine, nor a project that quite addresses my need. This is probably because I'm doing something insane, and I am also asking for something difficult. But I wanted to see what others think.
I'm building my first large-scale single-page application. The way I've set it up is by breaking it up into a number of NPM modules. I like this because NPM provides a nice environment to build purely node-run unit tests in, a way to reuse some of my code for other projects we do at my company, and a forced separation of concerns. Here's the general idea:
A core data model library
A core UI library
A pseudo-library that provides individual UI components based on the above two…
…another of those…
…etc for each sub-application of my application
A very small central project that pulls all the components provided by the above together into an interface as necessary
This means a lot of libraries, and a number of dependencies that are common (Underscore, moment, EventEmitter2, etc).
Now I need to figure out how to get all that code into a browser. Ideally, I'd want something with some browserify characteristics (rolls modules and dependencies together into single files to cut down on resource callbacks), but has some of requirejs's asynchronous loading DNA (I'd rather not have to load my entire application up front; being able to call down chunks when the user navigates is useful).
I'm having trouble reconciling the above, though. I get what Require is trying to do, but every time I try to use it for already-built NPM modules (not AMD modules, though I'm happy to write that central project in an AMD-ish way) I get really confused and the sense that it's not really meant for me. For a single page application it seems like it's just going to resolve everything into one file anyway, since my dynamic resources are whole dependencies rather than individual files? And of course, Browserify is made with the sole intent of Hulk Smashing all your code into a single file. I could bundle each NPM module separately with Browserify, but then I'm duplicating the common dependencies for each.
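For what it's worth, Browserify can factor shared dependencies out into one common bundle via its --require/--external flags, so the per-module bundles don't duplicate them; a sketch using the dependencies named above:

# build a common bundle that exposes the shared dependencies
browserify -r underscore -r moment -r eventemitter2 > common.js

# build each application bundle, marking those dependencies as external
browserify -x underscore -x moment -x eventemitter2 main.js > bundle.js

The page then just has to load common.js before any of the per-module bundles.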
I've looked at a bunch of other projects and they all seem to be addressing the client side more than the bundling side. What am I missing here?
[In pipe dream mode, I also like inject, partly because it's written by LinkedIn (who have a good reputation in my mind), but also for its localStorage caching.]

deploying node.js in compiled form

We are considering node.js for our next server side application. But we don't want our client to be able to look into our application's code. Can we deploy application written in node.js in compiled form? If yes, then how?
Maybe you could obfuscate all your code... I know this is not like compiling, but at least it will stop 99% of clients from looking at the code :D
Here is another topic: How can I obfuscate (protect) JavaScript?
good luck
But we don't want our client to be able to look into our application's code.
If you are shipping code to your client, they will be able to "look into your application's code". Technically, the process of "running your code" is "looking into your application's code".
Having a fully compiled version of your code can definitely feel "more safe", but they still have a copy of the code in some usable form. They can still reverse engineer pieces or do other things. This stuff really comes down to the license.
Here's a related answer. His quote is:
Write a license and get a lawyer to go after violators
Otherwise, you should host the stuff yourself and allow for public access.
Any form of obfuscation, minification, compilation is just going to be a speed bump on the way to "stealing your code". It's probably much better to simply have legal recourse.
I don't believe this is possible. I mean, technically I guess you could write everything as native C++ extensions, but that would defeat the purpose of using node.
As mentioned before, there is no true compilation in Node.js because the node executable basically compiles javascript code on the fly.
A lot of developers use Google's Closure Compiler, which really just "minifies" -- removes comments, whitespace, etc. -- and "optimizes" -- converts javascript code to more efficient javascript. However, the resultant code is generally still parsable javascript code (albeit rather hard to read!). Check out this related thread for more info: Getting closure-compiler and Node.js to play nice
A few options that might be helpful:
Develop a custom module for "proprietary" business logic and host it on your secure servers
Wrap "proprietary" business logic into a Java class or executable that is called as an external process in Node.js (see the sketch after this list)
Code "proprietary" business logic as compiled web services available on a separate application server that is called by Node.js.
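As an illustration of the external-process option above, a minimal sketch using Node's child_process module (the 'proprietary-calc' binary is hypothetical):

var execFile = require('child_process').execFile;

// 'proprietary-calc' stands in for a compiled executable holding the secret logic
execFile('./bin/proprietary-calc', ['--input', '42'], function (err, stdout, stderr) {
  if (err) throw err;
  console.log('result from the compiled binary:', stdout.trim());
});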
It's up to you to define what part of your application should be considered "proprietary", but as a general rule I would not classify HTML and related javascript -- sent to the web browser -- as "proprietary". My advice is to be judicious here.
Lastly, I found the following thread with an interesting approach that might be helpful, but it is rather advanced and likely to be rather buggy: Secure distribution of NodeJS applications
Hope that helps...

Common Lisp: What's the best way to use libraries in a shared hosting environment?

I was thinking about this the other day and wanted to see what the SO community had to say about the subject.
As it stands right now Common Lisp is getting some attention as a web development platform, and with good reason (of which I'm sure you are already convinced).
I was wondering how one would go about using a library in a shared environment in a similar fashion to PHP.
If I set up something like SBCL as an interpreter to interpret FASL files like Python or PHP, what would be the best way to use libraries (like clsql, for instance)?
Most come as asdf installable libraries, but it would be a stupid amount of overhead to require and install the library each and every time a request is made.
Keeping in mind this is for shared hosting; would it be best to ..
1) Install system-wide copies of the libraries for use in applications; this reduces space, but there may be problems with using the correct version of the library.
2) Allow users (through a control panel) to install local copies for themselves; more space, no version problems.
3) Tell them to wrap it into a module and load it on demand like Python does (I'm not sure if/how this can be done with Lisp). Just being able to load a library for use would be the best option, but I don't think a lot of them are designed to be used this way.
Anyways, looking to hear your opinions, thanks.
There are two ways I would look at it:
start a Lisp for each request
In that case it would be much better for the Lisp to be a saved image with all necessary libraries and data loaded. But that approach does not look very promising to me.
run a Lisp and let a frontend (web browser, another web server, ...) connect to it
This way you can either start a saved image or a Lisp that loads a bunch of stuff once and serves the requests.
I like to use saved images/applications in a deployment scenario. They can be quickly started, contain all the necessary software and are independent of library changes.
So it might be useful to provide pre-configured Lisp images that contain the necessary software or let the user configure and save an image.
