Is there a way to precompile node.js scripts? - node.js

Is there a way to precompile node.js scripts and distribute the binary files instead of source files?

Node already does this.
By "this" I mean creating machine-executable binary code. It does this using the JIT pattern though. More on that after I cover what others Googling for this may be searching for...
OS-native binary executable...
If by binary file instead of source, you mean a native-OS executable, yes. NW.JS and Electron both do a stellar job.
Use binaries in your node.js scripts...
If by binary file instead of source, you mean the ability to compile part of your script into binary, so it's difficult or impossible to utilize, or you want something with machine-native speed, yes.
They are called C/C++ Addons. You can distribute a binary (for your particular OS) and call it just like you would with any other var n = require("blah");
Node uses binaries "Just In Time"
Out of the box, Node pre-compiles your scripts on it's own and creates cached V8 machine code (think "executable" - it uses real machine code native to the CPU Node is running on) it then executes with each event it processes.
Here is a Google reference explaining that the V8 engine actually compiles to real machine code, and not a VM.
Google V8 JavaScript Engine Design
This compiling takes place when your application is first loaded.
It caches these bits of code as "modules" as soon as you invoke a "require('module')" instruction.
It does not wait for your entire application to be processed, but pre-compiles each module as each "require" is encountered.
Everything inside the require is compiled and introduced into memory, including it's variables and active state. Again, contrary to many popular blog articles, this is executed as individual machine-code processes. There is no VM, and nothing is interpreted. The JavaScript source is essentially compiled into an executable in memory.
This is why each module can just reference the same require and not create a bunch of overhead; it's just referencing a pre-compiled and existing object in memory, not "re-requiring" the entire module.
You can force it to recompile any module at any time. It's lesser-known that you actually have control of re-compiling these objects very easily, enabling you to "hot-reload" pieces of your application without reloading the entire thing.
A great use-case for this is creating self-modifying code, i.e. a strategy pattern that loads strategies from folders, for example, and as soon as a new folder is added, your own code can re-compile the folders into an in-line strategy pattern, create a "strategyRouter.js" file, and then invalidate the Node cache for your router which forces Node to recompile only that module, which is then utilized on future client requests.
The end result: Node can hot-reload routes or strategies as soon as you drop a new file or folder into your application. No need to restart your app, no need to separate stateless and stateful operations: Just write responses as regular Node modules and have them recompile when they change.
Note: Before people tell me self-modifying code is as bad or worse than eval, terrible for debugging and impossible to maintain, please note that Node itself does this, and so do many popular Node frameworks. I am not explaining original research, I am explaining the abilities of Google's V8 Engine (and hence Node) by design, as this question asks us to do. Please don't shoot people who R the FM, or people will stop R'ing it and keep to themselves.
"Unix was not designed to stop its users from doing stupid things, as
that would also stop them from doing clever things." – Doug Gwyn
Angular 2, Meteor, the new opensource Node-based Light table IDE and a bunch of other frameworks are headed in this direction in order to further remove the developer from the code and bring them closer to the application.
How do I recompile (hot-reload) a required Node module?
It's actually really easy... Here is a hot-reloading npm, for alternatives just Google "node require hot-reload"
https://www.npmjs.com/package/hot-reload
What if I want to build my own framework and hot-reload in an amazing new way?
That, like many things in Node, is surprisingly easy too. Node is like jQuery for servers! ;D
stackoverflow - invalidating Node's require cache

Related

Create executable and hide application code in Node.JS

I created an application in Node.JS that sends data from a local MySQL database to a MondoDB database in a Cloud.
I want to install it on my clients without them seeing the source code and not even needing to have the Node.JS environment installed. Basically I want to turn it into an executable, as is done in Delphi, then the client would only have the executable and with that in theory it would not have access to my source code and would not even need the Node.JS environment installed, just the executable being enough.
I saw that this is possible using PKG at: https://devpleno.com/hands-on-pkg. But on the other hand I saw many threads on the subject saying that this is not effective, obfuscation techniques, but all aimed at JavaScript for the web, which is not my case, so I was worried.
In my studies on Node.JS all the examples I've seen are kept in production in the Node.JS environment with the code exposed, since they run on the server side and clients don't have access.
But in this case of distributing the app to customers, it made me think, and such doubts arose.
Does Node.JS see this? Is it best suited for this? Are there any more suitable options? Or JavaScript applications, even if they're not for the web, can't be turned into executables that actually hide the source code, and if they can't, why?

Node native add-on, dependent on V8

I’m in the early design phases of an application and trying to reason about expectations and requirements, to make sure I don’t spend substantial time down a dead-end. I’ve never worked with V8 or node native add-ons, so bear with me.
Suppose I were building an Electron application. For performance reasons some functionality would be written in native code in an independent library, which would be invoked from a native Node add-on. The native library needs to execute JavaScript. So:
Electron App -> Native add-on -> Native library -> V8
First, is this feasible? For example, will trying to construct/run a V8 context fail due to its execution inside an Electron V8 context? I’m thinking deadlocks, aborts, etc.
I've come up with my plan-of-action from comments to the main question. Unfortunately that makes it hard to credit any specific user or comment for the answer. Specifically:
Restructure the data flow. Allow the Native Add-on to use its copy of V8 to execute any necessary JavaScript, instead of introducing it as a dependency to the Native Library. (comment)
Entering a separate V8 Isolate within another is supported. (comment) (docs)
The Native Library component was conceived because the document type to be supported is large (file-size) and needs JavaScript processing and expensive rendering. Initially I had conceived this to be one large monolithic library for supporting the document type, but in actuality it can be (and probably should be) broken up.
Parsing the binary file. Due to V8's speed, this can probably be done in my Node.js app itself, and may be faster than marshaling data across language barriers. If not, I may consider an N-API Native Add-on for parsing the document and returning a JS object representing the parsed data. (comment)
Executing document-level JavaScript. This should be doable in a Native Add-on by entering a separate V8 Isolate. (See above.)
Rendering the document to a canvas/bitmap. This is an unknown as it is dictated by drawing performance consisting of complex paths, etc. I will probably try to do this directly in the Node.js app first, and if it isn't performant enough then I will consider something like an N-API Native Add-On with e.g. Skia as a dependency to render the data.

General terminology put into context

I learned Java programming before learning any other programming languages. As I learn Node.js, I'm getting all the terminology confused. I have always thought API as a library of methods, classes, etc. that someone has built to make our lives easier. Then I learned about module, which I basically assume to be the same thing as an API(list of methods already built by someone). THEN, I learned about the Express Framework, which again, is a list of methods just like a module and an API. Moreover, the way we incorporate these features into our program is all by doing something like
Var http = require('http');
Therefore, can someone who understands the distinctions between these terms put these terms in context(examples) that could address my question.
Thanks a lot for the help.
A library is just a collection of numerous modules, classes, functions, etc. that are related to each other.
A framework is either a type of or a part of a library that is setup for you to build on top of rather than just call upon. And the distinction between Library and Framework can sometimes be a bit blurred.
With Express, you build upon the Application and its Router which handles incoming requests and determines when to call your code.
app.get('/', function (req, res) {
// ...
});
Though, frameworks can also span beyond code into tools. compound.js' executable is a good example of this.
A module is an individual piece of a library or framework. With Node, it's a single script file and the Object that is exported from the script.
An API is the summary/description of how you interact with the library, framework, or module.
It's usually what you'll find in documentation and is the accessible members, their name, their type, what arguments they accept, etc.
Express' API is:
express
Type: function
Named Arguments: (none)
Returns: Application
Members
listen
Type: function
Named Arguments:
name: port
Type: Number
etc.
This is largely an opinionated question. But I will attempt to provide the some terms commonly used by the Node community, and roughly the factual differences between them.
Module as it pertains to Node is very similar to what you would associate with a Java Library. It provides a wrapper around things that Node users find they do a lot. Frequently providing wrappers around node library functions for doing things everyone wants to do. A simple example would be a recursive file system reader, like wrench. Modules also extend to files you use to modularize your code. For example, modules aren't only installed via NPM, but separate javascript files you write as part of your code base to separate code functionality, under standard OOP practices.
require('someNPMINStalledModule')
require('./someFileInYourCodeBase.js')
both are modules. One is installed via NPM and located in node_modules directory, in the directory you launched node from. The latter example is a javascript file located in the directory you launched node from.
Then there are frameworks. At the core these do the same thing as modules, however, they are meant to be more wide spread, and really change the way you use node. In the java world frameworks like Express would be similar to things like Grails. You can still include and do everything you can do in Java, but grails wraps some things for you, and provides convenient powerful method calls for doing batches of work in a less verbose way. In the end you end up with functionally equivalent code, but Grails has allowed you to accomplish more in fewer lines of code, by generalizing the language a little more. But it still, as I said, allows you to use native code, when Grails doesn't provide the functionality you need. At the cost of this 'few lines of code' gain, you have added a layer of abstraction, additional function calls, etc. This distinction is unimportant, unless you are one who cares deeply about style. A hardcore ExpressJS developer likely wouldn't like it if you included a plain node http server in your code. Not so much because it is invalid Node, or from a perforamnce view any different, it wrecks the style of your code. If your code uses a framework, you should stick to using the coding conventions as used in this framework. But if you use a module like wrench to recursively search a directory, it is still perfectly stylistically acceptable to use fs.readFile, to read a single file.
Then there are mini applications which is a module that allow you to quickly launch simple things like serving a file. For example: http-server will server a directory of files to any port you wish, with a simple command line. You wouldn't use them in your own code with 'require' but this type of module can honestly be some of the most useful thing node provides, I highly recommend using some. (Nodemon, http-server, and grunt are a few highly useful examples of modules that can help make your development life easier)
Finally there are Native Extensions. The concurrency that Node provides comes from the V8 backend. Replicating this in pure Javscript is impossible, and the only way to write truly asyncrhonous code is to take advantage of asynchronous operations provided by the Node API, do some really wonky logic with process.nextTick, fork child processes, or write native extensions. Native Extensions provide truly concurrent operations that Node does not provide. Database communication is the most obvious example, but anyone can develope a C++ extension that will spawn threads to do work. There is also a very handy module that launches threads to handle bits of Javascript called "threads a gogo". It simplifies the launching of truly concurrent work, though if you're in a position where such things are necessary, you may find that you're using the wrong language for your use case. Ultimately these are no different from Modules in the way that you use them, but being aware of the fact that they can provide additional concurrent method for I/O type of operations not provided by NodeJS APIs is a unique and very important distinction.

hesitation between two technologies for a little program

I want to make a program (more precisely, a service) that periodically scans directories to find some video files (.avi, .mkv, etc) and automatically download some associated files (mostly subtitles) from one or several websites.
This program could run on linux or windows as well.
On one hand, I know well Qt from a long time and I know all its benefits, but on the other hand, I'm attracted by node.js and it extreme flexibility and liveliness.
I need to offer some interactivity with the end user of my program (for instance, chose the scans directories, etc).
What would be the best choice in your opinion in 2013?
I advise against Node.js for "small tools and programs". Especially for iterative tasks.
The long story
The reason is quite simply the way Node.js works. Its asynchronous model makes simple tasks unnecessarily convoluted. Additionally, because many callbacks are called from the Node.js event loop, you can't just use try/catch structures so every tiny error will crash your whole Application.
Of course there are ways to catch those errors or work with them, but the docs advise you against all of them and advise you to restart the application gracefully in any case to prevent memory leaks. This means you have to implement yet another piece of code.
The only real solution in Node.js would be writing your Application as a Cluster, which is a great concept but of course would require you to use some kind of IPC to get your data back to a process that can handle it.
Also, since you wrote about "periodically scan"ning a directory, I want to point out that you should...
Use file system watchers for services
Almost every language kit has those now and I strongly suggest using those and only use a fallback full-scan.
In Qt there is a system-independent class QFileSystemWatcher that provides a handy callback whenever specified files are changed
In Java there is the java.nio.file.FileSystem.getWatchService()
Node.js has the fs.watch function, if you really want to go for it

What are common development issues, pitfalls and suggestions?

I've been developing in Node.js for only 2 weeks and started re-creating a web site previously written in PHP. So far so good, and looks like I can do same thing in Node (with Express) that was done in PHP in same or less time.
I have ran into things you just have to get used to such as using modules, modules not sharing common environment, and getting into a habit of using callbacks for file system and database operations etc.
But is there anything that developer might discover a lot later that is pretty important to development in node? Issues that everyone else developing in Node has but they don't surface until later? Pitfalls? Anything that pros know and noobs don't?
I would appreciate any suggestions and advice.
Here are the things you might not realize until later:
Node will pause execution to run the garbage collector eventually/periodically. Your server will pause for a hiccup when this happens. For most people, this issue is not a significant problem, but it can be a barrier for building near-time systems. See Does Node.js scalability suffer because of garbage collection when under high load?
Node is single process and thus by default will only use 1 CPU. There is built-in clustering support to run multiple processes (typically 1 per CPU), and for the most part the Node community believes this to be a solid approach. You might be surprised by this reality, though.
Stack traces are often lost due to the event queue, so your logging and debugging methodology needs to change significantly
Here are some minor stumbling blocks you may run into for a while (I still bump up against these)
Remembering to do callback(null, value) on a successful callback. Passing null as a first parameter is weird and thus I forget to do it. Instead I accidentally do callback(value), which is interpreted as an error by the caller until I debug into it for a while and slap my forehead.
forgetting to use return when you invoke the callback in a guard clause and don't want a function to continue to execute past that point. Sometimes this results in the callback getting invoked twice, which causes all manner of misbehavior.
Here are some NICE things you might not realize initially
It is much easier in node.js, using one of the awesome flow control libraries, to do complex operations like loading 3 network resources in parallel, then making 2 DB calls in serial, then writing to 2 log files in parallel, then sending an HTTP response. This stuff is trivial and beautiful in node and damn near impossible in many synchronous environments.
ALL of node's modules are new and modern, and for the most part, you can find a beautifully-designed module with a great API to do what you need. Python has great libraries by now, too, but compare Node's cheerio or jsdom module to python's BeautifulSoup and see what I mean. Compare python's requests module to node's superagent.
There's a community benefit that comes from working with a modern platform where people are focused on modern web development. The contrast between the node community and the PHP community cannot be overstated.

Resources