Event loop in Node.js - node.js

We all know that in Node.js, functions are handed to a worker thread for execution and then sent to the event queue, and that the event loop then looks at the call stack.
If the call stack is empty, the event loop moves the function's execution context onto the call stack, which processes it and returns the result as a response.
My question: suppose we have multiple functions, each wrapped in a timeout with the same delay. They are all given to a worker thread, which sends their execution contexts to the event queue.
Because the timeouts are identical, they all arrive in the event queue at the same time. If the call stack is empty, the event loop will send all of the functions to the call stack, and a stack is FILO (first in, last out).
So the last function should produce its response first. But that is not what happens: when all the timeouts are the same, the first function responds first. Why?

There are lots of things wrong in how you describe things in your question, but I'll speak to the timeout issue that you ask about.
Node.js has its own timer system. It keeps a sorted list of timers, and the ONLY JavaScript timeout backed by a physical system timer is the next one due to fire. If multiple JavaScript timeouts are set to fire at the same point in time, they all share that one OS timer.
When that OS timer fires, and the main JS thread is free and able to pull the next event from the event loop, it will see that a JS timer is ready to call its callback. If more than one timer is ready to go, all for the same time, the interpreter calls each of their callbacks one after the other, in the order the timers were configured (FIFO).
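A minimal sketch of the FIFO behavior described above, assuming three timers registered back to back with the same delay. The labels and the 10 ms and 50 ms delays are illustrative choices, not taken from the question.

```javascript
// Three timers registered back to back with the same delay.
const order = [];
for (const label of ['first', 'second', 'third']) {
  setTimeout(() => order.push(label), 10);
}

// A later timer reports the order in which the callbacks actually ran.
setTimeout(() => console.log(order.join(' -> ')), 50);
// prints "first -> second -> third"
```

The callbacks run in registration order, not in reverse, which is the opposite of what a FILO stack model would predict.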
"We all know that in Node.js, the functions are handled by worker thread for execution and then send to the event queue and then the event loop looks into the call stack. If the call stack is empty then the event loop takes the function's context environment to call stack, and then call stack process it and give it as a response."
"My question is if we have multiple functions with same timeout function and then all the function is given to worker thread"
That part is wrong. They aren't given to a worker thread.
"Then worker thread sends their context environment to the event queue, and if the timeout of all the functions are same then they all come into the event queue at the same time"
As I described above, timers are a special type of event in the event loop code. They use only one system timer at a time to schedule the next timer to fire. Multiple timers set to fire at the same time are all stored in the same list (a list element for that particular time) and all share the same OS timer when it's their turn to be next. So, they don't all come into the event queue at the same time. Node.js sets one system timer for the group of timers due at the same time. When that system timer fires and the interpreter is free to pull the next event from the event loop, it calls the callback for each timer set for that time, one after another, in FIFO order.
"and then if the call stack is empty then the event loop will send all the functions to call stack, and we all know the property of stack is FILO."
I don't know what "send all the functions to call stack" means. That's not how things work at all. Node.js runs your JavaScript in a single thread (except for Worker Threads, which are not in play here), so it calls the callback for one timer, runs that to completion, then calls the callback for the next timer, runs it to completion, and so on.
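To illustrate run-to-completion, here is a small sketch in which the first timer callback deliberately blocks the thread. The `blockFor` helper and the delays are hypothetical, introduced only for this demonstration.

```javascript
const events = [];

// Hypothetical helper: busy-waits to block the single JS thread for `ms` milliseconds.
function blockFor(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) { /* spin */ }
}

setTimeout(() => {
  events.push('A start');
  blockFor(30); // B is already due, but it cannot interrupt A
  events.push('A end');
}, 10);

setTimeout(() => {
  events.push('B start');
  events.push('B end');
}, 10);

setTimeout(() => console.log(events.join(', ')), 100);
// prints "A start, A end, B start, B end"
```

Callback B never interleaves with callback A, even though both timers expire at the same moment.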
"so if this happened resulting the last function should be sent in response first but this is not happening the first function is coming in response first if all the timeouts are the same?"
As I've said a couple of times above, multiple timers set to fire at the same time use one system timer, and when that system timer fires, the callbacks for each of those timers are called one after the other in FIFO order.
References
You can learn more about the node.js timer architecture here: How does Node.js manage timers.
And, here's some more info about the node.js timer architecture taken from comments in the source code: How many concurrent setTimeouts before performance issues?.
Looking for a solution between setting lots of timers or using a scheduled task queue
Six Part Series on the Node.js Event Loop and How It Works

"My question is if we have multiple functions with same timeout function"
The poll phase controls when timers are executed.
First read here. I highly recommend this series.
So the main question is how does Node decide what code to run next?
As we know, the Event Loop and the Worker Pool maintain queues for pending events and pending tasks, respectively.
(Don't confuse the Worker Pool with Worker Threads. Worker Threads are a different concept. Read here.)
In reality, however, the Event Loop does not actually maintain a queue. Instead, it has a collection of file descriptors that it asks the operating system to monitor, using a mechanism like epoll (Linux), kqueue (OSX), event ports (Solaris), or IOCP (Windows). These file descriptors correspond to network sockets, any files it is watching, and so on. When the operating system says that one of these file descriptors is ready, the Event Loop translates it to the appropriate event and invokes the callback(s) associated with that event.
The Worker Pool uses a real queue whose entries are tasks to be processed. A Worker pops a task from this queue and works on it, and when finished the Worker raises an "At least one task is finished" event for the Event Loop.
When a timer callback is actually called depends on the performance of the system (Node has to check the timer for expiration once before executing the callback, which takes some CPU time) as well as on the processes currently running in the event loop.

Related

Node.js synchronous and blocking demultiplexer

I'm trying to understand the internals of node.js and how it works under the hood
So if I've understood correctly,
The event loop is executing on a single thread, when app generates a new I/O task it makes a non blocking call to the event demultiplexer, the call immediately returns without any data, allowing the event loop to continue with other tasks. When data is ready the demultiplexer pushes an event (or more) in the event queue. Event loop takes out this event (handler with data)...
As I understand it, the demultiplexer executes in the OS (epoll on Linux), not in the application's main thread (where the event loop runs), and by definition the demultiplexer is synchronous and blocking, because the watch() call is blocking. This is exactly where my question lies.
I know that the demultiplexer watches until one or more events are ready, but:
If watch() is executed in the OS and not in the event loop, what does blocking mean? What does it block?
(Sorry for my English, I'm not a native)

Guarantees about latency in node.js

Are there explicit considerations about the latency of any single request in the node.js event loop? AFAICT every IO call returns an eventEmitter which emits an event. The processing of all the events is multiplexed through the use of a pipe. So it is possible that the event that needs to be processed for an important request may be placed too far back into the pipe. Is there some sort of priority queue that can be used to schedule the order of execution of eventHandlers ?
Here's why I asked this question in the first place. I decided to give a gist.github link because the reason is long and related to the technical question
It's not clear exactly what you're asking here. Your Javascript does not directly add things to the event queue (that is only done with native code). Instead, you call some async operation and the native code behind that operation adds something to the event queue when the async operation completes.
This article The Node.js Event Loop, Timers, and process.nextTick() gives you a lot of details about how the event queue is serviced and how it handles different types of events (timers, I/O, etc...).
In general things are FIFO (first in, first out) within an event type with some exceptions.
process.nextTick() will run its callback BEFORE waiting I/O events.
setImmediate() will run its callback AFTER waiting I/O events.
More detail here: setImmediate vs. nextTick and nextTick vs setImmediate, visual explanation and setTimeout vs. setImmediate vs. nextTick
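As a rough illustration of how the three scheduling tools are ordered, the sketch below schedules all of them from inside an I/O callback, where the relative order is well defined. `fs.stat('.')` is used here only as a convenient way to get into an I/O callback; it is an assumption of this example, not part of the question.

```javascript
const fs = require('fs');

const order = [];

// fs.stat is used only to get into an I/O callback; '.' is the current directory.
fs.stat('.', () => {
  setTimeout(() => order.push('setTimeout 0'), 0);
  setImmediate(() => order.push('setImmediate'));
  process.nextTick(() => order.push('process.nextTick'));
});

setTimeout(() => console.log(order.join(' -> ')), 100);
// prints "process.nextTick -> setImmediate -> setTimeout 0"
```

When `setTimeout(fn, 0)` and `setImmediate()` are instead scheduled from the main module, their relative order is not guaranteed, so the I/O callback is what makes this output stable.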
"So it is possible that the event that needs to be processed for an important request may be placed too far back into the pipe."
You'd have to show us the specific situation you're concerned about. If you yourself are scheduling a callback with setTimeout(), setImmediate() or process.nextTick(), then you have some control over when it happens by which of these three you pick. If you aren't scheduling it yourself (e.g. it's the completion callback of some async operation), then you don't control its scheduling in the event loop. It will go into the sub-queue that matches its type and be served FIFO from that phase of the event loop (as described in the above articles).
"Is there some sort of priority queue that can be used to schedule the order of execution of eventHandlers ?"
There is no exposed priority system. Within an event type, things are FIFO. Again, if you give us an actual coding example so we can see exactly what you're trying to do, we could offer some help on what your choices are. You may be able to use the setTimeout(), setImmediate() and process.nextTick() tools that are already available or you may want to implement your own task queuing and prioritization system that runs off some of the above three methods that allows you to prioritize things that are already queued yourself.
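About priorities of execution in the event loop:
setImmediate() runs before setTimeout(fn, 0) when both are scheduled from inside an I/O callback; when they are scheduled from the main module, the order is not guaranteed.
process.nextTick() runs its callback as soon as the current operation completes, before the event loop continues to its next phase, not on the next loop iteration as the name suggests.
Natively, the event loop in Node.js does not support priorities. You can always implement your own priority queue, or use an existing one like here, and assign your functions to it.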
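A minimal sketch of what "implement your own priority queue" could look like, assuming a simple sorted array drained through setImmediate(). The names `schedule` and `drain` are hypothetical helpers invented for this example, and a production version would likely use a heap instead of re-sorting.

```javascript
// Hypothetical helper: a tiny priority scheduler drained via setImmediate().
// Lower `priority` numbers run first; ties keep insertion order (FIFO).
const pending = [];
let seq = 0;
let drainScheduled = false;

function schedule(priority, task) {
  pending.push({ priority, seq: seq++, task });
  if (!drainScheduled) {
    drainScheduled = true;
    setImmediate(drain);
  }
}

function drain() {
  drainScheduled = false;
  // Sort by priority, then by arrival order, and run one task per pass
  // so that I/O callbacks can run between tasks.
  pending.sort((a, b) => a.priority - b.priority || a.seq - b.seq);
  const next = pending.shift();
  if (next) next.task();
  if (pending.length > 0 && !drainScheduled) {
    drainScheduled = true;
    setImmediate(drain);
  }
}

const ran = [];
schedule(5, () => ran.push('low'));
schedule(1, () => ran.push('urgent'));
schedule(5, () => ran.push('low, queued later'));
schedule(3, () => ran.push('normal'));

setTimeout(() => console.log(ran.join(' | ')), 50);
// prints "urgent | normal | low | low, queued later"
```

This only reorders work that you queue yourself. Completion callbacks of built-in async operations are still served by the event loop in its own order.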

SetEvent ResetEvent WaitForMultipleObjectsEx - Race condition?

I don't fully understand PulseEvent or the race condition it causes. To avoid it, I am trying to use SetEvent instead, calling ResetEvent every time before WaitForMultipleObjectsEx.
This is my flow:
Thread ONE - uses CreateEvent to create an auto-reset event, then spawns Thread TWO and tells it about the event.
Thread ONE - tells Thread TWO to run.
Thread TWO calls ResetEvent on the event and then immediately starts WaitForMultipleObjectsEx on the event and some other handles used for file watching. If WaitForMultipleObjectsEx returns and it is not due to the event, the loop restarts immediately. If WaitForMultipleObjectsEx returns because the event became signaled, the loop does not restart.
So now imagine this case, please:
Thread TWO - the loop is running.
Thread ONE - needs to add a path, so it (1) calls SetEvent, then (2) sends a message to Thread TWO to add a path, and then (3) sends a message to Thread TWO to restart the loop.
The add-path and restart-loop messages will not reach Thread TWO unless the loop in TWO is stopped, which is what the SetEvent does. Thread TWO will see that it was stopped by the event, so it won't restart the loop. It will then receive the add-path message, add the path, and restart the loop.
Thread ONE - needs to stop the thread, so it (1) calls SetEvent and then (2) waits for a message from Thread TWO; when it gets that message, it terminates the thread.
Will this avoid race condition?
Thank you
Suppose the loop needs to be interrupted twice in succession. You're imagining a sequence of events something like this, on thread ONE and thread TWO:
Thread ONE realizes that the first interruption is complete.
Thread ONE sends a message telling TWO to restart the wait loop.
Thread TWO reads the message "restart the wait loop".
Thread TWO resets the event.
Thread TWO starts waiting.
Thread ONE now realizes that another interruption is needed.
Thread ONE sets the event to ask for another interruption.
Thread ONE sends message related to the second interruption.
Thread TWO stops the loop, receives the message about the second interruption.
But since you don't have any control over the timing between the two threads, it might instead happen like this:
Thread ONE realizes that the first interruption is complete.
Thread ONE sends a message telling TWO to restart the wait loop.
Thread ONE now realizes that another interruption is needed.
Thread ONE sets the event to ask for another interruption.
Thread TWO reads the message "restart the wait loop".
Thread TWO resets the event.
Thread TWO starts waiting.
Thread ONE sends a message about the second interruption, but TWO isn't listening!
Even if the message passing mechanism is synchronous, so that ONE won't continue until TWO has read the message, it could happen this way:
Thread ONE realizes that the first interruption is complete.
Thread ONE sends a message telling TWO to restart the wait loop.
Thread TWO reads the message "restart the wait loop", but is then swapped out.
Thread ONE now realizes that another interruption is needed.
Thread ONE sets the event to ask for another interruption.
Thread TWO resets the event.
Thread TWO starts waiting.
Thread ONE sends a message about the second interruption, but TWO isn't listening!
(Obviously, a similar thing can happen if you use PulseEvent.)
One quick solution would be to use a second event for TWO to signal ONE at the appropriate point, i.e., after resetting the main event but before waiting on it, but that seems somewhat inelegant and also doesn't generalize very well. If you can guarantee that there will never be two interruptions in close-enough succession, you might simply choose to ignore the race condition, but note that it is difficult to reason about this because there is no theoretical limit to how long it might take for thread TWO to resume running after being swapped out.
The various alternatives depend on how the messages are being passed between the threads and any other constraints. [If you can provide more information about your current implementation I'll update my answer accordingly.]
This is an overview of some of the more obvious options.
If the message-passing mechanism is synchronous (if thread ONE waits for thread TWO to receive the message before proceeding) then using a single auto-reset event should just work. Thread ONE won't set the event until after thread TWO has received the restart-loop message. If the event is already set when thread TWO starts waiting, that just means that there were two interruptions in immediate succession; TWO will never stall waiting for a message that isn't coming. [This potential stall is the only reason I can think of why you might not want to use an auto-reset event. If you have another concern, please edit your question to provide more details.]
If it is OK for sending a message to be non-blocking, and you aren't already locked in to a particular solution, any of these options would probably be sensible:
User mode APCs (the QueueUserAPC function) provide a message-passing mechanism that automatically interrupts alertable waits.
You could implement a simple queue (protected by a critical section) which uses an event to indicate whether there is a message pending or not. In this case you can safely use a manual-reset event provided that you only manipulate it when you hold the same critical section that protects the queue.
You could use an auto-reset event in combination with any sort of thread-safe queue, provided only that the queue allows you to test for emptiness without blocking. The idea here is that thread ONE would always insert the message into the queue before setting the event, and if thread TWO sees that the event is set but it turns out that the queue is empty, the event is ignored. If efficiency is a concern, you might even be able to find a suitable lock-free queue implementation. (I don't recommend attempting that yourself.)
(All of those mechanisms could also be made synchronous by using a second event object.)
I wouldn't recommend the following approaches, but if you happen to already be using one of these for messaging this is how you can make it work:
If you're using named pipes for messaging, you could use asynchronous I/O in thread TWO. Thread TWO would use an auto-reset event internally, you specify the event handle when you issue the I/O call and Windows sets it when I/O arrives. From the point of view of thread ONE, there's only a single operation. From the point of view of thread TWO, if the event is set, a message is definitely available. (I believe this is somewhat similar to your original approach, you just have to issue the I/O call in advance rather than afterwards.)
If you're using a window queue for messaging, the MsgWaitForMultipleObjectsEx() function allows you to wait for a window message and other events simultaneously.
PS:
The other problem with PulseEvent, the one mentioned in the documentation, is that this can happen:
Thread TWO starts waiting.
Thread TWO is preempted by Windows and all user code on the thread stops running.
Thread ONE pulses the event.
Thread TWO is restarted by Windows, and the wait is resumed.
Thread ONE sends a message, but TWO isn't listening.
(Personally I'm a bit disappointed that the kernel doesn't deal with this situation; I would have thought that it would be possible for it to set a flag saying that the wait shouldn't be resumed. But I can only assume that there is a good reason why this is impractical.)
The Auto-Reset Events
Would you please try to change the flow so there is just SetEvent and WaitForMultipleObjectsEx with auto-reset events? You may create more events if you need. For example, each thread will have its own pair of events: one to get notifications and another to report about its state changes - you define the scheme that best suits your needs.
Since there will be auto-reset events, there would be neither ResetEvent nor PulseEvent.
If you will be able to change the logic of the algorithm flow this way - the program will become clear, reliable, and straightforward.
I advise this because this is how our applications work since the times of Windows NT 3.51 – we manage to do everything we need with just SetEvent and WaitForMultipleObjects (without the Ex suffix).
As for the PulseEvent, as you know, it is very unreliable, even though it exists from the very first version of Windows NT - 3.1 - maybe it was reliable then, but not now.
To create the auto-reset events, use the bManualReset argument of the CreateEvent API function (if this parameter is TRUE, the function creates a manual-reset event object, which requires the use of the ResetEvent function to set the event state to non-signaled -- this is not what you need). If this parameter is FALSE, the function creates an auto-reset event object. The system will automatically reset the event state to non-signaled after a single waiting thread has been released, i.e., after WaitForMultipleObjects or WaitForSingleObject or other wait functions that explicitly wait for this event to become signaled.
These auto-reset events are very reliable and easy to use.
Let me make a few additional notes on the PulseEvent. Even Microsoft has admitted that PulseEvent is unreliable and should not be used -- see https://msdn.microsoft.com/en-us/library/windows/desktop/ms684914(v=vs.85).aspx -- because only those threads will be notified that are in the "wait" state when PulseEvent is called. If they are in any other state, they will not be notified, and you may never know for sure what the thread state is, and, even if you are responsible for the program flow, the state can be changed by the operating system contrary to your program logic. A thread waiting on a synchronization object can be momentarily removed from the wait state by a kernel-mode Asynchronous Procedure Call (APC) and returned to the wait state after the APC is complete. If the call to PulseEvent occurs during the time when the thread has been removed from the wait state, the thread will not be released because PulseEvent releases only those threads that are waiting at the moment it is called.
You can find out more about the kernel-mode APC at the following links:
https://msdn.microsoft.com/en-us/library/windows/desktop/ms681951(v=vs.85).aspx
http://www.drdobbs.com/inside-nts-asynchronous-procedure-call/184416590
http://www.osronline.com/article.cfm?id=75
The Manual-Reset Events
The Manual-Reset events are not that bad. :-) You can reliably use them when you need to notify multiple threads of a global state change that occurs only once, for example, application exit. The auto-reset events can only be used to notify one thread (because if several threads are waiting simultaneously for an auto-reset event and you set the event, one arbitrary thread will exit its wait and reset the event, but the behavior of the remaining threads that are also waiting for the event will be undefined). From the Microsoft documentation, we may assume that one and only one thread will exit its wait while the others definitely will not, but this is not very explicitly articulated in the documentation. Anyway, we must take the following quote into consideration: "Do not assume a first-in, first-out (FIFO) order. External events such as kernel-mode APCs can change the wait order" Source - https://msdn.microsoft.com/en-us/library/windows/desktop/ms682655(v=vs.85).aspx
So, when you need to notify all the threads quickly, just set the manual-reset event to the signaled state, rather than signaling each auto-reset event for each thread. Once you have signaled the manual-reset event, do not call ResetEvent afterwards. The drawback of this solution is that the threads need an additional event handle passed in the array given to WaitForMultipleObjects. The array size is limited to MAXIMUM_WAIT_OBJECTS, which is 64, although we have never come close to this limit in practice.
You can get more ideas about auto-reset events and manual reset events from https://www.codeproject.com/Articles/39040/Auto-and-Manual-Reset-Events-Revisited

does user defined callback function uses thread pool in node.js?

This question is to understand how event loop calls thread pool to process task.
Say,
I want to create a function (say, to process a small task) that does not perform any I/O, and I want it to run via a callback so that it is handed to the thread pool, runs concurrently with my main thread, and returns its result in the callback on completion. I understand that this can be done by creating child processes (forking, etc.),
but I am a little confused and want to understand how exactly work executes concurrently in single-threaded Node for I/O operations and not for user-defined operations. What exactly happens in the event loop? Are all events passed to the thread pool, or how does it identify whether something is an I/O operation?
I am new at node.js and totally confused.
Help would be appreciated :)
“Node.js manages its own threads for I/O” by using libuv for operations involving the network, file system, etc. libuv essentially creates a thread pool for I/O that varies in size based on platform. The V8 event loop is a separate thread that processes events in the queue. Those events map to a JavaScript function to execute with the event data. This is how asynchronous I/O is handled by Node.js.
Source: http://www.wintellect.com/blogs/dbanister/stop-fighting-node.js-in-the-enterprise
So each I/O operation executes outside V8 event loop thread, that's why it runs concurrently.
I/O operations run efficiently because, as you mentioned, a thread pool is used - a group of threads that "wait" for incoming tasks from V8 event loop, execute them, and return data to JavaScript callback functions.
As you already stated, the Node runtime is single threaded.
Node is well suited to I/O-bound operations. It is less suitable for CPU-bound work, because such work blocks Node's event loop.
If you really want to do CPU-bound work asynchronously with Node, you can achieve that using a Node cluster, but you'll have to manage the communication between the processes yourself (a simple example here - http://davidherron.com/blog/2014-07-03/easily-offload-your-cpu-intensive-nodejs-code-simple-express-based-rest-server), or using a child process - http://nodejs.org/api/child_process.html
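As a hedged sketch of the child-process route, the example below offloads a CPU-bound loop to a second Node process using `child_process.execFile`. The `sumInChild` function and the inline script are hypothetical, chosen only to keep the example self-contained in one file.

```javascript
const { execFile } = require('child_process');

// Hypothetical helper: sums 1..n in a separate Node process so the
// parent's event loop stays free while the loop runs.
function sumInChild(n) {
  const script =
    `let s = 0; for (let i = 1; i <= ${Number(n)}; i++) s += i; console.log(s);`;
  return new Promise((resolve, reject) => {
    // process.execPath is the path of the currently running node binary.
    execFile(process.execPath, ['-e', script], (err, stdout) => {
      if (err) return reject(err);
      resolve(Number(stdout.trim()));
    });
  });
}

sumInChild(1e6).then((total) => console.log('sum from child:', total));
console.log('main thread keeps running while the child computes');
```

The result comes back through an ordinary callback, so from the caller's point of view the CPU-bound work behaves like any other async operation. For real workloads a dedicated worker script with message passing would be more practical than an inline `-e` string.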

How to Wait and Terminate a TThread in Delphi (notifying the user when finished)

When a user provides information recorded in an Excel file, I use Excel COM to read the data.
However, as the user can repeat the process for N files and the process can take a while, I decided to move these routines to a separate thread.
Therefore, I need your advice on how to do this.
The worker thread cannot be destroyed until there are no more remaining files.
Inside the thread, the data is loaded into a ClientDataSet and, at the end, applied to the database.
I need to somehow notify the user when the task is done, so he can decide whether to load another file and execute the thread again or finish the job.
How to properly destroy the thread and notify the user?
Should I create and destroy the thread for each file?
You can, but that is not a very efficient design. Put the files into a thread-safe queue, then start the thread if it is not already running, and then have the thread loop through the queue until it is empty. At that time, the thread can then be destroyed, or just put to sleep in case more files will be queued later on.
This design also allows you to process multiple files in parallel if you implement a thread pool. When you put a file into the queue, start a new thread if there is not already an idle thread waiting to be used. When a thread starts, pull the next available file from the queue. When that thread finishes, it can pull the next file from the queue, and if there is no file then go back into the pool for later reuse.
"If so, How to properly destroy the thread and notify the user?"
When you are ready to destroy a thread, call its Terminate() method (its Execute() should check its Terminated property periodically and exit when it is set to true), then call its WaitFor() method (or an equivalent, like MsgWaitForMultipleObjects(), which allows you to keep the message queue responsive while waiting for the thread to terminate), then free it from memory. The thread triggers its OnTerminate event after Execute() exits; however, it is not safe to destroy the thread in the OnTerminate event handler. If you want to destroy the thread when the OnTerminate event is triggered (especially if you are not expecting the thread to terminate, such as if it threw an uncaught exception), you can post yourself an asynchronous notification, such as with PostMessage(), PostThreadMessage(), TThread.Queue(), etc., and then destroy the thread when that notification is processed at a later time.
"How to set a thread to notify the user when the work is finished? By assigning the event OnTerminate?"
Yes. Unless the thread is going to process multiple files before terminating, in which case the thread could manually send a notification in between each file.
"It's better to create the thread to each file or create 1 thread and somehow control its execution to every time for different files?"
Creating and destroying a thread is not trivial for the OS, in terms of resources and processing, so you should re-use threads as much as possible. Make them sleep when they have nothing to do, unless they are going to be sleeping for a long time in which case you should destroy them to release their resources.

Resources