Node.js: what to use instead of require.paths?

Recent node docs say that modifying require.paths is bad practice. What should I do instead?

I believe the concern is that it can be repeatedly modified at run time, rather than just set once. That can be confusing and cause some quite bizarre bugs. Also, if individual packages modify the path, the results are applied globally, which is really bad and goes against the modular nature of node.
If you have several library paths of your own, the best solution is to set the NODE_PATH environment variable before launching node. Node then picks this up when it's launched and applies it automatically.
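As a rough sketch of the NODE_PATH approach, the script below creates a throwaway library directory and launches a child node process with NODE_PATH pointing at it. The directory, the module name greet-from-lib, and its contents are made up for illustration; in practice you would normally export NODE_PATH in your shell or launcher script instead.

```javascript
const { execFileSync } = require('child_process');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical library directory holding a single module, greet-from-lib.js.
const libDir = fs.mkdtempSync(path.join(os.tmpdir(), 'mylibs-'));
fs.writeFileSync(
  path.join(libDir, 'greet-from-lib.js'),
  "module.exports = () => 'hello from NODE_PATH';\n"
);

// Launch a child node process with NODE_PATH set before it starts.
// Node reads the variable at launch, so the child can require the
// module by bare name without touching require.paths.
const output = execFileSync(
  process.execPath,
  ['-e', "console.log(require('greet-from-lib')())"],
  { env: { ...process.env, NODE_PATH: libDir }, encoding: 'utf8' }
).trim();

console.log(output);
```

Because the variable is set once, outside the program, no package can change the search path behind your back at run time.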

I keep the related modules in the same dir or a sub dir and load them using:
var x = require('./mod/x');
If it's an external module, I install it using npm, which puts the module in node_modules, where require finds it.
I've never changed require.paths.

Have a look at https://github.com/patrick-steele-idem/app-module-path-node. It lets you add a directory to the search path for require statements at the top level, without influencing the paths of sub-modules.

Unless I'm misunderstanding something, the primary limitation of the current system is that you cannot use folders for namespacing when dependencies are non-hierarchical.
What that means in practice...
Consider that you have x/y/z and a/b as well as a/b/c. If both a/b and a/b/c depend on x/y/z, you end up having to either specify that relatively (require('../../x/y/z') and require('../../../x/y/z') respectively) or make every single package a node_module. Failing that, you can probably do horrific things with symlinks or similar.
As far as I can tell, the only alternative is to namespace and organise with filenames rather than folders, such as:
a.b.js
a.b.c.js
x.y.z.js
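To make the relative paths in the x/y/z example concrete, here is a small sketch that computes them with path.posix.relative. The /project root is hypothetical, and a/b, a/b/c and x/y/z are assumed to be directories (each holding its own index.js).

```javascript
const path = require('path');

// Hypothetical project root; the three modules are treated as directories.
const root = '/project';

// Path you would have to write in require() from each dependent module.
const fromAB = path.posix.relative(`${root}/a/b`, `${root}/x/y/z`);
const fromABC = path.posix.relative(`${root}/a/b/c`, `${root}/x/y/z`);

console.log(fromAB);  // ../../x/y/z
console.log(fromABC); // ../../../x/y/z
```

The two results differ only because the dependents sit at different depths, which is exactly the maintenance burden the answer describes.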

Related

Changing PATH based on current directory

I am working on a project that requires a user version of Node.js. I already have it installed with root privileges, and I would like to keep that, so my solution was to install a new version as user using the direct download. With this in mind, I would like to make it so that when I make calls to Node from within my working project directory it uses the user version, and otherwise it defaults to the root version.
So, there are really 2 questions:
Is it possible to have different PATH variables depending on where in your directory structure you are?
Is this a good way of approaching this problem or is there a better way to manage versions of Node? (without too much overhead)
You can approximate what you are asking by putting a relative path in your PATH:
PATH=./localnode:$PATH
Now if ./localnode/node exists in the current directory, it will take precedence over the system-wide node.
I would not particularly recommend this approach. A better or at least less peculiar approach is to run a separate shell with a different PATH (or an overriding function or alias) for the duration you want to override the system version. This also decouples this preference from changing your working directory, which generally should not have side effects like this.

How to avoid double creating directories in /proc?

I'm writing a Linux kernel module, and I'd like to create a subdirectory, /proc/foo/, and then expose several artificial files inside it that will be generated on the fly by my module. I know I can use proc_mkdir to create the foo directory, but if it already exists dmesg will display a warning, and I'd prefer to keep the log clean.
Now you might think on a module teardown it should be removing the /proc/foo/ tree so that a redundant mkdir should never happen. But I'm working on a series of related kernel modules, and I figured I'd have each of them separately expose files in /proc/foo/. Maybe this is atypical? I don't see any functions in proc_fs.h for querying existing files so maybe I'm going about this wrong?
Another option would be to have a module that just creates the directory, and have it export a global containing the proc_dir_entry, and then all of my modules can extern that variable and use it. But then I have to worry about that module getting loaded before all of the others. But maybe that's the way this is usually done? I'm interested in knowing what best practices are.
It is odd. If you really want everything grouped just create a module providing /proc/foo and make everything else depend on it.

How to tell Node.js to use modules from global by default?

Is there any way to tell Node.js to also look in the global modules folder by default, without changing sources?
I am trying to keep my project folders (up to a hundred packages) from being cluttered with thousands of sub-folders (which also brings most IDEs to their knees). I am aware of the npm link trick, but it doesn't work on all platforms or it causes other problems. Also, npm/npm3 is sometimes so slow that I have to wait an entire day before my project is ready to work on (I have a fast computer and broadband).
known solutions:
Changing the NODE_PATH environment variable is out for other reasons, and shell .rc changes are not great either.
Changing core files is easy but also requires patches in many other places (when using nodejs as a dependency, for instance).
Patching node.js's require function, as in other implementations such as require-js, which supports require({cache:{}}) or require({config:{}}).
In the end I went with https://github.com/h2non/requireg. It doesn't need any kiddie hacks like npm link or special environment variables, and it works very well. It comes with a globalize function that makes subsequent require calls also look in the global folders.

Using UglifyJs on the whole Node project?

I need to obfuscate my source code as well as possible, so I decided to use uglifyjs2. My project has nested directories; how can I run uglifyjs2 over the whole project instead of giving it every input file?
I wouldn't mind if it minified the whole project into a single file or something.
I've done something very similar to this in a project I worked on. You have two options:
1. Leave the files in their directory structure.
This is by far the easier option, but provides a much lower level of obfuscation since someone interested enough in your code basically has a copy of the logical organization of files.
An attacker can simply pretty-print all the files and rename the obfuscated variable names in each file until they have an understanding of what is going on.
To do this, use fs.readdir and fs.stat to recursively go through folders, read in every .js file and output the mangled code.
2. Compile everything into a single JS file.
This is much more difficult for you to implement, but does make life harder on an attacker since they no longer have the benefit of your project's organization.
Your main problem is reconciling your require calls with files that no longer exist (since everything is now in the same file).
I did this by using Uglify to perform static analysis of my source code by analyzing the AST for calls to require. I then loaded the source code of the required file and repeated.
Once all code was loaded, I replaced the require calls with calls to a custom function, wrapped each file's source code in a function that emulates how node's module system works, and then mangled everything and compiled it into a single file.
My custom require function does most of what node's require does except that rather than searching the disk for a module, it searches the wrapper functions.
Unfortunately, I can't really share any code for #2 since it was part of a proprietary project, but the gist is:
Parse the source text into an AST using UglifyJS.parse.
Use the TreeWalker to visit every node of the AST and check if
node instanceof UglifyJS.AST_Call && node.start.value == 'require'
Having just completed a huge pure Node.js project of 80+ files, I had the same problem as the OP. I needed at least minimal protection for my hard work, but it seems this very basic need had not been covered by the npm open-source community. To add insult to injury, the JXCore package encryption system was cracked last week in a few hours, so back to obfuscation...
So I created a complete solution that handles file merging and uglifying. You also have the option of leaving specified files/folders out of the merge. These files are then copied to the output location of the merged file, and references to them are rewritten automatically.
node-uglifier on npm
node-uglifier on GitHub
PS: I would be glad if people would contribute to make it even better. This is a war between thieves and hard-working coders like yourself. Let's join forces and increase the pain of reverse engineering!
This isn't supported natively by uglifyjs2.
Consider using webpack to package up your entire app into a single minified .js file, excluding node_modules:
http://jlongster.com/Backend-Apps-with-Webpack--Part-I
I had the same need - for which I created node-optimize and grunt-node-optimize.
https://www.npmjs.com/package/grunt-node-optimize

Recommended FHS compliant application test/install workflow under Linux?

I'm in the process of switching to Linux for development, and I'm puzzled about how to maintain good FHS compliance in my programs.
For example, under Windows, I know that all the resources (bitmaps, audio data, etc.) that my program will need can be found with relative paths from the executable, so it's the same whether I'm running the program from my development directory or from an installation (under "Program Files", for example): the program will be able to locate all its files.
Now, under Linux, I see that usually the executable goes under /usr/local/bin and its resources on /usr/local/share. (And the truth is that I'm not even sure of this)
For convenience reasons (such as version control) I'd like to have all the files pertaining to the project under a same path, say, for example, project/src for the source and project/data for resource files.
Is there any standard or recommended way to let me just rebuild the binary for testing and use the files on the project/data directory, while also being able to locate the files when they are under /usr/local/share?
I thought, for example, of setting a symlink under /usr/local/share pointing to my resources dir, and then just hardcoding that path inside my program, but I feel it's quite hackish and not very portable.
I also thought of running an install script that copies all the resources to /usr/local/share every time I change or add resources, but I feel that's not a good way to do it either.
Could anyone tell me or point me to where it tells how this issue is usually resolved?
Thanks!
"For convenience reasons (such as version control) I'd like to have all the files pertaining to the project under a same path, say, for example, project/src for the source and project/data for resource files."
You can organize your source tree as you wish — it need not bear any resemblance to the FHS layout desired of installed software.
"I see that usually the executable goes under /usr/local/bin and its resources on /usr/local/share. (And the truth is that I'm not even sure of this)"
The standard prefix is /usr. /usr/local is for, well, "local installations" as the FHS spec reiterates.
"Is there any standard or recommended way to let me just rebuild the binary for testing and use the files on the project/data directory"
Definitely. Running ./configure --datadir=$PWD/share, for example, is the way to point your build at the data files from the source tree (substitute the proper path), and you can use something like -DDATADIR='"${datadir}"' in AM_CFLAGS to make the value known to the (presumably C) code. (All of that assumes you are using autoconf/automake. Similar options may be available in other build systems.)
This sort of hardcoding is what is used in practice, and it suffices. For a development build within your own working copy, having a hardcoded path should not be a problem, and final builds (those done by a packager) will simply use the standard FHS paths.
You could just test a few locations. For example, first check if you have a data directory within the directory you're currently running the program from. If so, just go ahead and use it. If not, try /usr/local/share/yourproject/data, and so on.
For developing/testing, you can use the data directory within your project folder, and for deploying, use the stuff in /usr/local/share/. Of course, you can test for even more locations (e.g. /usr/share).
Basically the requirement for this method is that you have a function that builds the correct paths for all filesystem accesses. Instead of fopen("data/blabla.conf", "w") use something like fopen(path("blabla.conf"), "w"). path() will construct the correct path from the path determined using the directory tests when the program started. E.g. if the path was /usr/local/share/yourproject/data/, the string returned by path("blabla.conf") would be "/usr/local/share/yourproject/data/blabla.conf" - and there is your nice absolute path.
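The probing approach above could be sketched like this. The answer's own snippets are C (fopen); this version is written in JavaScript only to match the Node.js examples elsewhere on this page, and the names findDataDir and makePathBuilder are hypothetical.

```javascript
const fs = require('fs');
const path = require('path');

// Return the first candidate that exists and is a directory.
function findDataDir(candidates) {
  for (const dir of candidates) {
    if (fs.existsSync(dir) && fs.statSync(dir).isDirectory()) {
      return dir;
    }
  }
  throw new Error('No data directory found; tried: ' + candidates.join(', '));
}

// Build the equivalent of the answer's path() helper for a chosen directory.
function makePathBuilder(dataDir) {
  return (relative) => path.join(dataDir, relative);
}

// Possible usage at program start (locations taken from the answer):
//   const dataPath = makePathBuilder(findDataDir([
//     path.join(process.cwd(), 'data'),
//     '/usr/local/share/yourproject/data',
//     '/usr/share/yourproject/data',
//   ]));
//   fs.readFileSync(dataPath('blabla.conf'), 'utf8');
```

The directory is chosen once at startup, so every later file access goes through the same helper and resolves against the same base.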
That's how I'd do it. HTH.
My preferred solution in cases like this is to use a configuration file, along with a command-line option that overrides its location.
For example, a configuration file for a fully deployed application named myapp could reside in /etc/myapp/settings.conf and a part of it could look like this:
...
confdir=/etc/myapp/
bindir=/usr/bin/
datadir=/usr/share/myapp/
docdir=/usr/share/doc/myapp/
...
Your application (or a launcher script) can parse this file to determine where to find the rest of the needed files.
I believe that you can reasonably assume in your code that the location of the configuration file is fixed under /etc/myapp - or any other location specified at compile time. Then you provide a command line option to allow that location to be overridden:
myapp --configfile=/opt/myapp/etc/settings.conf ...
It might also make sense to have options for some of the directory paths as well, so that the user can easily override any of the configuration file settings. This approach has a couple of advantages:
Your users can relocate the application very easily - just by moving the files, modifying the paths in the configuration file and then using e.g. a wrapper script to call the main application with the proper --configfile option.
You can easily support FHS, as well as any other scheme you need to.
While developing, you can have your testsuite use a specially crafted configuration file with the paths being wherever you need them to be.
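A minimal sketch of the configuration-file approach, assuming the simple key=value format shown in the sample and the --configfile option from the example command line. The function names are hypothetical, and treating lines that start with # as comments is an assumption, not something the answer specifies.

```javascript
const fs = require('fs');

// Fixed default location, as in the answer's example.
const DEFAULT_CONFIG = '/etc/myapp/settings.conf';

// Pick the config file: --configfile=PATH overrides the default.
function configPathFromArgs(argv, fallback = DEFAULT_CONFIG) {
  const prefix = '--configfile=';
  const arg = argv.find((a) => a.startsWith(prefix));
  return arg ? arg.slice(prefix.length) : fallback;
}

// Parse key=value lines; blank lines, # comments (an assumption) and
// lines without '=' are ignored.
function parseConfig(text) {
  const settings = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (!line || line.startsWith('#')) continue;
    const eq = line.indexOf('=');
    if (eq === -1) continue;
    settings[line.slice(0, eq).trim()] = line.slice(eq + 1).trim();
  }
  return settings;
}

// Look in exactly one place and fail loudly if the file is not there.
function loadConfig(argv) {
  const file = configPathFromArgs(argv);
  try {
    return parseConfig(fs.readFileSync(file, 'utf8'));
  } catch (err) {
    throw new Error(`Cannot read configuration file ${file}: ${err.message}`);
  }
}
```

A program would typically call loadConfig(process.argv.slice(2)) once at startup and read datadir and the other paths from the returned object.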
Some people advocate probing the system at runtime to resolve issues like this. I usually suggest avoiding such solutions for at least the following reasons:
It makes your program non-deterministic. You can never tell at a first glance which configuration file it picks up - especially if you have multiple versions of the application on your system.
At any installation mix-up, the application will remain fat and happy - and so will the user. In my opinion, the application should look at one specific and well-documented location and abort with an informative message if it cannot find what it is looking for.
It's highly unlikely that you will always get everything right. There will always be unexpected rare environments or corner cases that the application will not handle.
Such behaviour is against the Unix philosophy. Even command shells probe multiple locations only because every one of those locations can hold a file that should be parsed.
EDIT:
This method is not mandated by any formal standard that I know of, but it is the prevalent solution in the Unix world. Most major daemons (e.g. BIND, sendmail, postfix, INN, Apache) will look for a configuration file at a certain location, but will allow you to override that location and - through the file - any other path.
This is mostly to allow the system administrator to implement whatever scheme they want or to set up multiple concurrent installations, but it does help during testing as well. This flexibility is what makes it a best practice, if not a proper standard.
