How node IPC works between 2 processes - node.js

Using Node's fork you can perform IPC between the parent process and the child process. Previously I was under the impression that the child process would have an extra environment variable with a file descriptor. I printed the process env, but I can't see any variable with a file descriptor, and I don't see any open sockets either. So my question is: how does Node IPC work behind the scenes?

so my question is how does node IPC (for forked processes) work behind the scenes
The source code for fork uses a Pipe object internally. Looking further into that Pipe object, it is a wrapper over the libuv pipe object. Then, looking into libuv, its pipe abstraction is a domain socket on Unix and a named pipe on Windows.
Now, since these are all undocumented implementation details, there's nothing that says it has to always be done this way in the future - though one would not expect it to change unless there was a really good reason.
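For illustration, here's a minimal sketch of what that channel looks like from user code, whatever transport sits underneath (the file names parent.js and child.js are just placeholders):

// parent.js - forks a child and exchanges messages over the IPC channel
const { fork } = require('child_process');

const child = fork('./child.js'); // sets up the IPC pipe automatically

child.on('message', (msg) => {
  console.log('parent received:', msg);
  child.disconnect(); // close the IPC channel so both processes can exit
});

child.send({ hello: 'child' });

// child.js - process.send() exists only because fork() opened an IPC channel
process.on('message', (msg) => {
  console.log('child received:', msg);
  process.send({ hello: 'parent' });
});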

Related

Node.js multithreading and asynchronous execution

I'm a little confused about multithreading and asynchronous execution in JS. What is the difference between a cluster, a stream, a child process, and a worker thread?
The first thing to remember about multithreading in Node.js is that in user-space there is no concept of threading, and as such you cannot write any code making use of threads. Any Node program is always a single-threaded program (in user-space).
Since a Node program is a single thread and runs as a single process, it uses only a single CPU core. Most modern processors have multiple cores, and in order to make use of all of them and provide better throughput, you can start the same Node program as a cluster.
The cluster module of Node allows you to start a Node program such that the first instance launched becomes the master instance. The master allows you to spawn new workers as separate processes (not threads) using the cluster.fork() method. The actual work that the Node program is meant to do is done by the workers. The example in the Node docs demonstrates this perfectly.
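For reference, a condensed version of that docs example (the port number is arbitrary):

const cluster = require('cluster');
const http = require('http');
const os = require('os');

if (cluster.isMaster) {
  // Master: spawn one worker process per CPU core
  for (let i = 0; i < os.cpus().length; i++) {
    cluster.fork();
  }
  cluster.on('exit', (worker) => {
    console.log(`worker ${worker.process.pid} died`);
  });
} else {
  // Each worker runs this server; they all share port 8000
  http.createServer((req, res) => {
    res.end(`handled by pid ${process.pid}\n`);
  }).listen(8000);
}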
A child process is a process spawned from the current process, with an IPC channel established between the two so they can communicate with each other. The master and workers I described for cluster are examples of child processes. The child_process module in Node allows you to spawn custom child processes as you require.
Streams are not related to multithreading or multiple processes at all. Streams are just a way to handle large amounts of data without loading all of it into working memory at the same time. For example: suppose you want to read a 10GB log file, and your server only has 4GB of memory. Trying to load the file using fs.readFile will crash your process. Instead you use fs.createReadStream and process the file in smaller chunks that can fit into memory.
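A rough sketch of that approach (the file name is made up): counting newlines in a large file one chunk at a time, so only a single chunk is ever held in memory:

const fs = require('fs');

const stream = fs.createReadStream('huge.log', { encoding: 'utf8' });

let lines = 0;
stream.on('data', (chunk) => {
  // Only this chunk is in memory, never the whole 10GB file
  lines += chunk.split('\n').length - 1;
});
stream.on('end', () => console.log(`${lines} lines`));
stream.on('error', (err) => console.error(err));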
Hope this explains it. For further details you really should read the Node docs.
This is a little vague, so I'm just going to give an overview.
Streams are really just data streams like in any other language, similar to iostreams in C++, where you get user input or other types of data. They're usually wrapped by another class, so you don't know you're using a stream. You usually won't mess with these unless you're building a new stream type.
Child processes, worker threads, and clusters are all ways of utilizing multi-core processing in Node applications.
Worker threads are basic multithreading the Node way, with each thread having a way to communicate with the parent, and shared memory possible between threads. You pass in a script and data, and can listen for a message from the thread when it is done processing.
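A minimal sketch of that, assuming a worker script of your own (the file names and the doubling "work" are placeholders):

// main.js
const { Worker } = require('worker_threads');

const worker = new Worker('./worker.js', {
  workerData: { n: 21 } // data handed to the thread at startup
});
worker.on('message', (result) => console.log('result:', result)); // 42
worker.on('error', (err) => console.error(err));

// worker.js
const { parentPort, workerData } = require('worker_threads');

// Do some CPU-bound work off the main thread, then report back
parentPort.postMessage(workerData.n * 2);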
Clusters are more for network sharing. Often used behind a master listener port: a master app listens for connections, then assigns them in a round-robin manner to each cluster worker process. They share the server port(s) across multiple worker processes to even out the load across cores.
Child processes are a way to create a new process, similar to popen. These can be asynchronous or synchronous (non-blocking or blocking the Node event loop), and the child can send to and receive from the parent process via stdout/stderr and stdin, respectively. The parent can register listeners on each child process for updates. You can pass a command, a file, or a module to a child process. They generally do not share memory.
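For example, a sketch of the asynchronous variant ('ls -l' is just a stand-in for any external command):

const { spawn } = require('child_process');

// Run an external command without blocking the event loop
const child = spawn('ls', ['-l']);

child.stdout.on('data', (chunk) => process.stdout.write(chunk)); // child -> parent
child.stderr.on('data', (chunk) => process.stderr.write(chunk));
child.on('close', (code) => console.log(`exited with code ${code}`));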
I'd suggest reading the documentation yourself and coming back with any specific questions you have. You won't get much with vague questions like this; it makes it seem like you didn't do your own part of the work beforehand.
Documentation:
Streams
Worker Threads
Clusters
Child Processes

Parent process <- child processes UNIdirectional communication in "real-world" Haskell?

Goal:
There is an IShell, which is nothing but an ordinary console able to consume commands like do param1=value1 --option.
IShell should orchestrate the whole execution. It does not run commands; the only thing it does is start the appropriate process.
Any process started from the running IShell instance should be able to report back to it what's happening inside. So, say, IShell has started process A to do something complicated; process A should be able to report both progress and results back to the parent IShell. In practice, it means there should be a mechanism to, for example, print a message from process A to the appropriate IShell.
Finally, the code should work on both Windows and Linux.
I really like Haskell and I'd like to promote "real-world" Haskell usage. But I don't know the existing libraries well, and I haven't yet built any "real-world" Haskell app.
Thus, questions:
How can I establish IShell <- its-processes communication? Is there a single library able to handle both the Windows-specific and Linux-specific stuff?
The process package supports Linux and Windows and provides mechanisms for communicating with child processes via their stdin, stdout, stderr, and exit code.
The network package supports Linux and Windows and provides mechanisms for communicating with child processes via sockets.

Is it possible to attach to a running background process with ruby?

I have a nodejs daemon running on my server. I would like to give it some input on stdin and read its stdout from a Rails controller; is that possible with Ruby?
I am looking at Open3, but it seems to only let me spawn a new process.
I need to keep the nodejs process running because the startup overhead is too high to pay on every request.
In general there is no way to attach to a running process's IO streams unless it was set up to do so initially. It is easy if, for example, the process was set up to read from a pipe: just have Ruby write to that pipe like any other file (this is what the Open3 lib does).
For a daemon, there are usually more appropriate ways to interact with it than hijacking its input with a pipe, though it depends on the particular daemon you are running and how it is managed by the OS. For example, sockets are a popular way to communicate with a running process on *nix systems.

How to fork/clone an identical Node child process in the same sense as fork() of Linux system call?

So I was developing a server farm on Node which requires multiple processes per machine to handle the load. Since Windows doesn't quite get along with Node's cluster module, I had to work it out manually.
The real problem is that when forking Node processes, a JS module path is required as the first argument to the child_process.fork() function, and once forked, the child process doesn't inherit anything from its parent. In my case, I want a function that does something similar to the fork() system call in Linux, which clones the parent process, inherits everything, and continues execution from exactly where fork() was called. Can this be achieved on the Node platform?
I don't think node.js is ever going to support fork(2)
Here is the comment from the Node GitHub page on the subject:
https://github.com/joyent/node/issues/2334#issuecomment-3153822
We're not (ever) going to support fork.
not portable to windows
difficult conceptually for users
entire heap will be quickly copied with a compacting VM; no benefits from copy-on-write
not necessary
difficult for us to do
child_process.fork()
This is a special case of the spawn() functionality for spawning Node processes. In addition to having all the methods in a normal ChildProcess instance, the returned object has a communication channel built-in. See child.send(message, [sendHandle]) for details.
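That sendHandle argument is what makes fork() more than plain message passing; here's a sketch of the documented pattern of handing an open server handle to a child (the port and file names are arbitrary):

// parent.js
const { fork } = require('child_process');
const net = require('net');

const child = fork('./child.js');
const server = net.createServer();

server.listen(1337, () => {
  // The second argument transfers the underlying handle over the IPC channel
  child.send('server', server);
});

// child.js
process.on('message', (msg, serverHandle) => {
  if (msg === 'server') {
    // The child can now accept connections on the parent's port
    serverHandle.on('connection', (socket) => {
      socket.end('handled by child\n');
    });
  }
});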

What is different about the way NodeJS handles requests as opposed to a setup like Rails / Passenger?

My understanding is that Node is an 'event'-driven, as opposed to a sequentially driven, server application. I've come to understand this to mean that with event-driven software the user is in command: they can create an event at any time, and the server is in a state such that it can respond. With sequential software (like a DOS prompt), the application tells the user when it's 'ok' to respond, and may at any given time be unavailable (due to some other process).
Further, my understanding is that applications like Node and EventMachine use a reactor of sorts... they wait for an 'event' to occur and, using a callback, delegate the task to some other worker. OK... so then, what about Rails & Passenger?
Rails might use a server like NGINX with Passenger to spawn new processes when 'events' are received by the system. Is this not conceptually the same idea? If it is, is it just the processing overhead that really separates the two, where Passenger might need to spawn a new Rails instance while Node is already waiting to handle the request?
Node.js is an event-driven, non-blocking platform. The key is the non-blocking part. Node doesn't spawn other processes. It runs in one thread (that's for starters... you can actually spawn processes now through some modules, I think, but that's another talk).
Anyway, this is different from other typical programming environments, where you receive a request and the thread is locked until it has an answer. If you assign it to another thread, that thread is still locked...
In Node you never lock. You receive a request and the thread continues to receive requests. When a request has been processed, the callback is called.
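A minimal sketch of that idea (the file and port are placeholders): the thread registers a callback and immediately goes back to accepting requests while the OS performs the read:

const http = require('http');
const fs = require('fs');

http.createServer((req, res) => {
  // The read is handed off; the thread does NOT wait here
  fs.readFile('./page.html', (err, data) => {
    if (err) {
      res.statusCode = 500;
      return res.end('error');
    }
    res.end(data); // runs later, once the I/O has finished
  });
}).listen(3000);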
Hope I made myself understood and used the right terms ;)
Anyway, if you want, this video is nice: http://www.youtube.com/watch?v=jo_B4LTHi3I
The non-blocking/evented I/O as jribeiro described is one part of the answer. Ruby applications tend to be written using blocking I/O, and using processes and threads for concurrency.
However, non-blocking and evented I/O are not inherent to Node. You can achieve the same thing in Ruby by using EventMachine, and an in-process evented server like Thin. If you program directly against EventMachine and Thin, then it is architecturally almost the same as Node. That being said, the Ruby ecosystem does not have as many event-friendly libraries and documentation as Node does, so it takes a bit of skill to achieve the same thing in Ruby.
Conversely, the way Phusion Passenger manages processes - i.e. by spawning multiple processes and load balancing requests between them, and supervising processes - is not unique to Ruby. In fact, Phusion Passenger introduced Node.js support recently. The Node.js support was open sourced today.
