Which process handles async or simultaneous work to happen in node js. Is there any specific api that takes care of all these events to happen in queue?
Is there any specific api that takes care of all these events to happen in queue?
No, not accessible from Javascript. The event queue is completely under the covers. You don't access it directly.
The implementation of asynchronous operations is all handled in native code. When an async operation completes, its native code calls an internal C++ API that inserts the completion event into the node.js event queue. If no Javascript is currently running in node.js at that moment, then inserting the item in the event queue will trigger it to get pulled out of the queue and the callback associated with it will be run. If Javascript is running at the moment, it will stay in the event queue until the current piece of running Javascript finishes at which point the interpreter will check the event queue, see there is an event in there and will pull that event out and run the callback associated with that event.
Which process handles async or simultaneous work to happen in node js.
It is not entirely clear what you mean by this. Each node.js function that is asynchronous has its own implementation. Networking uses OS-level event driven networking (not threads). Async file I/O uses a native thread pool. Timers use OS level timers. Some other asynchronous operation will have its own implementation and do it some other way as it completely depends upon what the async operation is for who it will accomplish its work.
The only three ways (I know of) for you to write your own asynchronous operation are:
Compose your own operation entirely using existing asynchronous operations such as request this data from another server, then write it to this file.
Use native code to write your own node.js add-on that can expose an asynchronous interface and use native code to implement that asynchronous interface in whatever manner is most appropriate for your operation.
Run some other process and communicate back the result from that other process. This can be some other program written in any language or it can be Javascript that you run in another node.js process.
Now, there are a few ways you can influence the event queue timing of some things from Javascript. For example, setTimeout(fn, t), process.nextTick(fn) and setImmediate(fn) all have slightly different ways they insert your callback function into the event queue that determines what (that is already in the event queue) they run before or after. But, these by themselves just schedule a callback sometime in the future - they don't actually implement an asynchronous operation that accomplishes some tasks in a non-blocking way.
You may want to read some of these references:
The Node.js Event Loop, Timers, and process.nextTick()
setImmediate() vs nextTick() vs setTimeout(fn,0) – in depth explanation
Demystifying Asynchronous Programming Part 1: Node.js Event Loop
You might be thinking of child_process.spawn().
From the NodeJS documentation
The child_process.spawn(), child_process.fork(), child_process.exec(),
and child_process.execFile() methods all follow the idiomatic
asynchronous programming pattern typical of other Node.js APIs.
Each of the methods returns a ChildProcess instance. These objects
implement the Node.js EventEmitter API, allowing the parent process to
register listener functions that are called when certain events occur
during the life cycle of the child process.
The child_process.exec() and child_process.execFile() methods
additionally allow for an optional callback function to be specified
that is invoked when the child process terminates.
Related
I have a problem with the concept of async in NodeJS. I have read a lot about the event poll in NodeJS. They say things like:
The event loop is what allows Node.js to perform non-blocking I/O
operations
or
Node uses the Worker Pool to handle "expensive" tasks. This includes
I/O for which an operating system does not provide a non-blocking
version, as well as particularly CPU-intensive tasks.
or
These are the Node module APIs that make use of this Worker Pool such
as File System(fs)
So, I found that Node manages I/O running using a Thread pool. Now my question is, if Node is managing them, why do we need to utilize async programming at all in NodeJS? And whats the reason behind some modules like BlueBird?
tl;dr: You need async to prevent a blocking of the Event-Loop.
NodeJS uses a certain number of threads to handle clients. There basically are two types of threads:
Event Loop (or your main thread)
Worker Pool (or threadpool)
The Event Loop:
Basically the reason why async programming is needed:
Once all events are registered, NodeJS enters the Event Loop and handles all incoming requests as well as outgoing responses. All of them pass through the Event Loop.
Worker Pool
As you already said, NodeJS uses the Worker Pool to perform I/O and CPU intensive tasks .
Asynchronous Code
In order to prevent blocking the main thread, you want to keep your Event Loop clean, and delegate certain task. This is where async code is needed. That way your code becomes non-blocking. The terminology concerning async and non-blocking is a bit vague though. To clarify:
Async Code: Performs certain tasks in parallel
Non-Blocking: Basically Polling without blocking further code.
In NodeJS however, Async is often used for I/O operations. There it doesn't just mean "perform in parallel", because it mostly means "don't block and get the signal".
So in order to make the Event Loop of NodeJS efficient, we don't want to wait for an operation to finish. Therefore we register an async "listener" instead. This allows NodeJS to efficiently manage its own resources.
BlueBird (or Promises in general):
Libraries like BlueBird which you mentioned, aren't required anymore because NodeJS supports promises out of the box (see note here).
Promises are just another way of writing asynchronous code. So are Async/Await and Generator Functions.
Side note: Functions defined with the async keyword actually yield a promise.
I want to know how the engine in node.js know when to call and execute the queued operation. I understand that node.js is single threaded and uses asynchronous non-blocking to execute operations.
But let's say you are calling something from the database and node.js queues the operation and doesn't wait for this to execute further lines of code. Being single thread it stores the data in channels(if I am not wrong). But how does the server know that the network isn't busy and it is the right time to execute the queued operation.
Or it executes the queued operation at the end of the program. This can't be the case. So I am curious to know what happens under the hood and how does the node server know when to execute the queued operation.
It's not clear what you mean by "channels" as that is not a built-in node.js concept or a term that is used internally to describe how node.js works. The tutorial you reference in your comment is talking about the Java language, not the Javascript language so it has absolutely nothing to do with node.js. The concept of "channels" in that tutorial is something that that specific NIO library implements in Java.
But, if you're really just asking how async operations work in node.js, I can explain that generaly concept.
node.js works off an event queue. The node.js interpreter grabs the next event in the event queue and executes it (typically that involves calling a callback function). That callback function runs (single threaded) until it returns. When it returns, the node.js interpreter then looks in the event queue for the next event. If there are any, it grabs that next event and calls the callback associated with it. If there are not events currently in the event queue, then it waits for an event to be put in the event queue (with nothing to do expect perhaps runs some garbage collection).
Now, when you runs some sort of asynchronous operation in your Javascript (such as a DB query), you can your db query function, that initiates the query (by sending the query off to the database) and then immediately returns. This allows your piece of Javascript that started that query to then return and give control back to node.js.
Meanwhile, some native code is managing the connection to your database. When the response comes back from the database, an event is added to the internal node.js event queue.
Whenever the node.js interpreter has finished running other Javascript, it looks in the event queue and pulls out the next event and calls the callback associated with that. In that way, your DB query result gets processed by your code. In all cases here, an asynchronous operation has some sort of callback associated with it that the interpreter can call when it finds the event in the event queue.
Being single thread it stores the data in channels(if I am not wrong).
node.js doesn't have a "channels" concept so I'm not sure what you meant by that.
Or it executes the queued operation at the end of the program.
When the result comes back from the database, some native code managing that will insert an event into the node.js event queue. The Javascript interpreter will service that event, then next time it gets back to the event loop (other Javascript has finished running and it's time to check for the next event to process).
It's important to understand that initiating an asynchronous operation is non-blocking so you get this:
console.log("1");
callSomeAsyncFunction(someData, (err, data) => {
// this callback gets called later when the async operation finishes
// and node.js gets a chance to process the event that was put
// in the event queue by the code that managed the async operation
console.log("2");
});
// right here there is nothing else to do so the interpreter returns back
// to the system, allowing it to process other events
console.log("3");
Because of the non-blocking nature of things, this will log:
1
3
2
Some other references on the topic:
How does JavaScript handle AJAX responses in the background?
Where is the node.js event queue?
Understanding Call backs
Why does a while loop block the node event loop?
Node internally uses the underlying operating systems non-blocking io API-s. See overlapped io, WSAEventSelect for Windows, select for Linux.
The reactor pattern which is utilized by libuv for handling IO is synchronous by design but libuv supports async io. How is this possible? Does libuv extend the reactor's design somehow to support async io? Does using multiple threads/event loops aid in achieving this?
The I/O model of Node and libuv is very similar to what nginx does internally.
The libuv uses a single-threaded event loop and non-blocking asynchronous I/O. All functions are synchronous in a way that they run to completion but some clever hackery with promises and generators can be used to appear that they don't (when in fact both the invocation of the generator function is non-blocking and returns the generator object immediately and the generator methods like .next() run to completion), plus the new async/await syntax makes it very convenient.
For operations that cannot be accomplished in a non-blocking way Node uses a thread pool to run the blocking operations in separate threads but this is done transparently and it is never exposed to the application code written in JavaScript (you need to step down to C++ to work with that directly).
See: http://docs.libuv.org/en/v1.x/design.html
Unlike network I/O, there are no platform-specific file I/O primitives libuv could rely on, so the current approach is to run blocking file I/O operations in a thread pool. [...]
libuv currently uses a global thread pool on which all loops can queue work on. 3 types of operations are currently run on this pool:
File system operations
DNS functions (getaddrinfo and getnameinfo)
User specified code via uv_queue_work()
See also those answers for more details:
what is mean by event loop in node.js ? javascript event loop or libuv event loop?
*NodeJS event loop internal working
Prevent NodeJS from exiting event-loop
How node.js server serve next request, if current request have huge computation?
Which would be better for concurrent tasks on node.js? Fibers? Web-workers? or Threads?
Speed up setInterval
Async.js - Is parallel really parallel?
Node.js: Asynchronous Callback Execution. Is this Zalgo?
See the links and illustration in those answers. There are a lot of resources to read bout that topic.
I want to know that even though the callbacks are done. How will the callbacks and other process on main thread can be executed parallely if the node is single threaded model.
First off, callbacks are not executed in parallel. There is only ever one thread of Javascript execution at a time so there is no actual parallel execution of Javascript code. You can have multiple asynchronous native code operations in-flight at the same time, but when it comes time for them to run some sort of Javascript completion callback, those are run one at a time, not in parallel.
Javascript is an event driven language which means that things like timers, callbacks from asynchronous disk I/O, callbacks from network operations, etc... use an event queue. When the JS engine finishes executing a piece of Javascript and control is returned back to the system, the JS engine pulls the next event off the event queue and executes it. Native code operations can signal that they are complete and want to run a callback by inserting an event into the event queue.
For an async operation like an HTTP request such as http.get(), here's basically what happens:
Your javascript calls http.get()
That call sends the http request and then returns immediately. Whatever callback you passed to http.get() is stored internally by the http module and is associated with that particular network request.
Your javascript code after the http.get() continues to execute until it is done and returns control back to the system.
Then, some time later, the response from the previous HTTP request comes back to the TCP engine. That causes the http module to insert an event in the Javascript event queue to call your completion callback that you previously sent with http.get().
If the Javascript engine is not doing anything else at the moment, then your callback is executed and passed the result response data.
If the Javascript engine is doing something at the moment, then the event sits in the event queue until the current piece of Javascript that is executing finishes and returns control back to the system. At that point, the JS engine checks the event queue and pulls the next event off the queue and executes it. If that is your callback, then you callback is called at that point.
So, various operations such as async disk I/O, network operations, etc... can run in the background because they are being controlled by native code (not by the Javascript engine). That native code has the ability to use threads if required. The fs module uses threads, the net module does not because the native OS already supports asynchronous networking operations.
I'm new to nodeJS and was wondering about the single-instance model of Node.
In a simple nodeJs application, when some blocking operation is asynchronously handled with callbacks, does the main thread running the nodeJs handle the callback as well?.
If the request is to get some data off the database, and there are 100s of concurrent users, and each db operation takes couple of seconds, when the callback is finally fired (for each of the connections), is the main thread accepting these requests used for executing the callback as well? If so, how does nodeJs scale and how does it respond so fast?.
Each instance of nodejs runs in a single thread. Period. When you make an async call to, say, a network request, it doesn't wait around for it, not in your code or anywhere else. It has an event loop that runs through. When the response is ready, it invokes your callback.
This can be incredibly performant, because it doesn't need lots of threads and all the memory overhead, but it means you need to be careful not to do synchronous blocking stuff.
There is a pretty decent explanation of the event loop at http://blog.mixu.net/2011/02/01/understanding-the-node-js-event-loop/ and the original jsconf presentation by Ryan Dahl http://www.youtube.com/watch?v=ztspvPYybIY is worth watching. Ever seen an engineer get a standing ovation for a technical presentation?