How do the core modules in node.js work? (https://github.com/nodejs/node/blob/master/lib) - node.js

Does the node interpreter look for core modules (let's say "fs") within the node binary? If yes, are these modules packaged as js files. Are the core modules that are referenced within our code converted to c/c++ code first and then executed? For example, I see a method in the _tls_common.js (https://github.com/nodejs/node/blob/master/lib/_tls_common.js) file called "loadPKCS12" and the only place that I see this method being referenced/defined is within the "node_crypto.cc" file (https://github.com/nodejs/node/blob/master/src/node_crypto.cc). So how does node link a method in javascript with the one defined in the c/c++ file?
here is the extract from the _tls_common.js file that makes use of the "loadPKCS12" method:
if (passphrase) {
c.context.loadPKCS12(buf, toBuf(passphrase));
} else {
c.context.loadPKCS12(buf);
}
}
} else {
const buf = toBuf(options.pfx);
const passphrase = options.passphrase;
if (passphrase) {
c.context.loadPKCS12(buf, toBuf(passphrase));
} else {
c.context.loadPKCS12(buf);

There are two different (but seemingly related) questions asked here. The first one is: "How the core modules work?". Second one being "How does NodeJS let c++ code get referenced and executed in JavaScript?". Let's take them one by one.
How the core modules work?
The core modules are packaged with NodeJS binary. And, while they are packaged with the binary, they are not converted to c++ code before packaging. The internal modules are loaded into memory during bootstrap of the node process. When a program executes, lets say require('fs'), the require function simply returns the already loaded module from cache. The actual loading of the internal module obviously happens in c++ code.
How does NodeJS let c++ code get referenced in JS?
This ability comes partly from V8 engine which exposes the ability to create and manage JS constructs in C++, and partly from NodeJS / LibUV which create a wrapper on top of V8 to provide the execution environment. The documentation about such node modules can be accessed here. As the documentation states, these c++ modules can be used in JS file by requiring them, like any other ordinary JS module.
Your example for use of c++ function in JS (loadPKCS12), however, is more special case of internal c++ functionality of NodeJS. loadPKCS12 is called on a object of SecureContext imported from crypto c++ module. If you follow the link to SecureContext import in _tls_common.js above, you will see that the crypto is not loaded using require(), instead a special (global) method internalBinding is used to obtain the reference. At the last line in node_crypto.cc file, initializer for internal module crypto is registered. Following the chain of initialization, node::crypto::Initialize calls node::crypto::SecureContext::Initialize which creates a function template, assigns the appropriate prototype methods and exports it on target. Eventually these exported functionalities from C++ world are imported and used in JS-World using internalBinding.

Related

How to modify NodeJS built-in global objects?

I'd like to add an important functionality that I can use at work without using require() every time. So, I thought modifying built-in objects can make this happen but I could not locate the locations of those objects to do the modification.
Are those objects in NodeJS binary? Would that mean I have to fork the whole NodeJS repo to make this happen?
Please, I don't want to use prototype editing etc which are same as require. I need to have native feeling.
First off, I agree with an earlier comment that this sounds like a bit of an XY problem where we could better help you if you describe what problem you're really trying to solve. Portable node.js programs that work anywhere or work with any future versions of node.js don't rely on some sort of custom configured environment in order to run. They use the built-in capabilities of node.js and they require/import in external things they want to add to the environment.
Are those objects in NodeJS binary?
Yes, they are in the executable.
Would that mean I have to fork the whole NodeJS repo to make this happen?
Yes.
Please, I don't want to use prototype editing etc which are same as require. I need to have native feeling.
"Native feeling"? This sounds like you haven't really bought into the node.js module architecture. It is different than many other environments. It's easy to get used to over time. IMO, it would really be better to go with the flow and architecture of the platform rather than make some custom version of node.js just to save one line of typing in your startup code.
And, the whole concept of adding a number of globals you can use anywhere pretty much shows that you haven't fully understood the design, architectural, code reuse and testability advantages of the module design baked into node.js. If you had, you wouldn't be trying to write a lot of code that can't be reused in other ways that you don't anticipate now.
That said, in searching through the node.js source code on Github, I found this source file node.js which is where lots of things are added to the node.js global object such as setTimeout(), clearTimeout(), setImmediate(), clearImmediate() and so on. So, that source file seems to be where node.js is setting up the global object. If you wanted to add your own things there, that's one place where it would be done.
To provide a sample of that code (you can see the link above for the complete code):
if (!config.noBrowserGlobals) {
// Override global console from the one provided by the VM
// to the one implemented by Node.js
// https://console.spec.whatwg.org/#console-namespace
exposeNamespace(global, 'console', createGlobalConsole(global.console));
const { URL, URLSearchParams } = require('internal/url');
// https://url.spec.whatwg.org/#url
exposeInterface(global, 'URL', URL);
// https://url.spec.whatwg.org/#urlsearchparams
exposeInterface(global, 'URLSearchParams', URLSearchParams);
const {
TextEncoder, TextDecoder
} = require('internal/encoding');
// https://encoding.spec.whatwg.org/#textencoder
exposeInterface(global, 'TextEncoder', TextEncoder);
// https://encoding.spec.whatwg.org/#textdecoder
exposeInterface(global, 'TextDecoder', TextDecoder);
// https://html.spec.whatwg.org/multipage/webappapis.html#windoworworkerglobalscope
const timers = require('timers');
defineOperation(global, 'clearInterval', timers.clearInterval);
defineOperation(global, 'clearTimeout', timers.clearTimeout);
defineOperation(global, 'setInterval', timers.setInterval);
defineOperation(global, 'setTimeout', timers.setTimeout);
defineOperation(global, 'queueMicrotask', queueMicrotask);
// Non-standard extensions:
defineOperation(global, 'clearImmediate', timers.clearImmediate);
defineOperation(global, 'setImmediate', timers.setImmediate);
}
This code is built into the node.js executable so the only way I know of to directly modify it (without hackish patching of the executable itself) would be to modify the file and then rebuild node.js for your platform into a custom build.
On a little more practical note, you can also use the -r module command line argument to tell node.js to run require(module) before starting your main script. So, you could make a different way of starting node.js from a shell file that always passes the -r fullPathToYourModule argument to node.js so it will always run your startup module that adds things to the global object.
Again, you'd be doing this just to save one line of typing in your startup file. It is really worth doing that?

Can I expose a commonjs module object in the global namespace?

We're building a react/flux application using nodejs/webpack and therefore all of our new code is written in commonjs modules.
There are a few isolated cases where we need to access an object from one of our commonjs modules in legacy code. Is there any way to do this...and yes, I assume this is gross. We're eventually going to migrate all of our code to commonjs modules, but this is a stop-gap.
In the legacy JS code (non using CommonJS), create a function in the global namespace:
/* Legacy code NOT using CommonJS */
function getCommonJSObject(obj) {
...
}
In the CommonJS code you can test for this global function and if it exists, call it, passing whichever object you need access to.
/* CommonJS object */
if(window['getCommonJSObject']){
window.getCommonJSObject(obj);
}
By doing this we can still follow our flux pattern from legacy javascript code in our app. If there's a better way to do this, let us know. But this is a handy workaround.

RequireJS Multi Injecting

I am building a modular single page application which consumes multiple require config files from different sources. I would like a way in my application to be able to consume a list of all modules of a specific type. something like this:
define('module-type/an-implementation',...)
define('module-type/another-implementation',...)
require('module-type/*', function(modules){
$.each(modules,function(m){ m.doStuff(); });
})
This is similar patterns dependency injectors use with multiple dependency injection (eg. https://github.com/ninject/ninject/wiki/Multi-injection)
Is there a way to do this (or something similar) with require?
RequireJS doesn't know which modules exist until something requires them. Once a module is required / depended upon RequireJS will figure out where to request the module from based on module's name and RequireJS's configuration. Once the module is loaded it can be examined / executed to find out its dependencies and handle them in turn, until all dependencies are loaded and all module bodies are executed.
In order to be able to "consume a list of all modules of a specific type" something would need to be able to find all such modules. RequireJS doesn't have any means to know which modules exist, so it alone wouldn't be enough to implement "Multi Injection".
Speculation
Some kind of module registry could be created and populated with help of the build system: e.g. a file (say module-registry.js) could be generated each time a file in the source directory is added / removed or renamed, then multi inject could be possible like:
multiRequire('module-type/*', function(modules){
$.each(modules,function(m){ m.doStuff(); });
})
which in turn would call
require(findModules(pattern), function() {
callback(Array.prototype.slice.call(arguments, 0));
});
(where multiRequire and findModules are provided by the module registry).

Write a module that works both in nodejs and requirejs

I wrote a small parser that currently works in node app, but wondering if there is a way that I can make a module that will work both in NodeJS app and client side app that uses requirejs?
path/to/lib/index.js
function someRandom(strings) {
// we are doing something here
return strings
}
exports.someRandom = someRandom;
Right now I'm getting this in client-side
Uncaught ReferenceError: exports is not defined
I know that I can use node requirejs and then change the structure to use define but is there other way without adding node requirejs?
This is my js/main.js file
require(["path/to/lib/index"], function(something) {
// will do something here
});
The way I prefer to do it is to write all my modules in the AMD syntax (use define) and use amd-loader to load them in Node. Note that this solution is not using RequireJS, even though the AMD syntax is used.
However, there's a way to do it without having to use the AMD syntax. You can use r.js to wrap your Node modules. For instance, if you put your tree of Node modules in in, you can do:
$ r.js -convert in out
This will create in out a tree of files that correspond to those in in but wrapped with the define call. You can then load these in the browser using RequireJS. There are limitations. Some are obvious, like not being able to use the Node modules that depend on the Node runtime (like fs, child_process, etc.). Some are more subtle, like the fact that you can't use require(foo) where foo is a variable (RequireJS will handle only string literals there). See the documentation for the details.

On-demand require()

Say I create a library in ./libname which contains one main file: main.js and multiple optional library files which are occasionally used with the main object: a.js and b.js.
I create index.js file with the following:
exports.MainClass = require('main.js').MainClass; // shortcut
exports.a = require('a');
exports.b = require('b');
And now I can use the library as follows:
var lib = require('./libname');
lib.MainClass;
lib.a.Something; // Here I need the optional utility object
lib.b.SomeOtherThing;
However, that means, I load 'a.js' and 'b.js' always and not when I really need them.
Sure I can manually load the optional modules with require('./libname/a.js'), but then I lose the pretty lib.a dot-notation :)
Is there a way to implement on-demand loading with some kind of index file? Maybe, some package.json magic can play here well?
This may possible if you called the "MainClass" to dynamically load the additional modules on-demand. But I suspect this will also mean an adjustment in the api to access the module.
I am guess your motivation is to "avoid" the extra processing used by "non-required modules".
But remember Node is single threaded so the memory footprint of loading a module is not per-connection, it's per-process. Loading a module is a one-off to get it into memory.
In other words the modules are only ever loaded when you start your server not every-time someone makes a request.
I think you looking at this from client-side programming where it's advantages to load scripts when they are required to save on both processing and or http requests.
On the server the most you will be saving is few extra bites in memory.
Looks like the only way is to use getters. In short, like this:
exports = {
MainClass : require('main.js').MainClass,
get a(){ return require('./a.js'); },
get b(){ return require('./a.js'); }
}

Resources