Node.js event queues in Cluster - node.js

I am new to node.js. When multiple processes are created using Cluster, does each of them have its own event loop, or do they share one? My doubt comes from this note:
Note: Note that even though a global thread pool which is shared across all events loops is used, the functions are not thread safe.
at the link
http://docs.libuv.org/en/v1.x/threadpool.html#threadpool
which seems to indicate that there are multiple event loops in node.js.

Libuv is a separate project that is used by more than just node.js, so the documentation snippet you posted is speaking generally and not about any specific user of libuv. Each node.js process currently has only one event loop of its own, so every process created by Cluster runs its own event loop and none is shared.

Related

How do worker threads work in Node.js?

Nodejs can not have a built-in thread API like java and .net
do. If threads are added, the nature of the language itself will
change. It’s not possible to add threads as a new set of available
classes or functions.
Node.js 10.x added worker threads as an experimental feature, and they have been stable since 12.x. I have gone through a few blogs but did not understand much, maybe due to lack of knowledge. How are they different from regular threads?
Worker threads in Javascript are somewhat analogous to WebWorkers in the browser. They do not share direct access to any variables with the main thread or with each other and the only way they communicate with the main thread is via messaging. This messaging is synchronized through the event loop. This avoids all the classic race conditions that multiple threads have trying to access the same variables because two separate threads can't access the same variables in node.js. Each thread has its own set of variables and the only way to influence another thread's variables is to send it a message and ask it to modify its own variables. Since that message is synchronized through that thread's event queue, there's no risk of classic race conditions in accessing variables.
Java threads, on the other hand, are similar to C++ or native threads in that they share access to the same variables and the threads are freely timesliced so right in the middle of functionA running in threadA, execution could be interrupted and functionB running in threadB could run. Since both can freely access the same variables, there are all sorts of race conditions possible unless one manually uses thread synchronization tools (such as mutexes) to coordinate and protect all access to shared variables. This type of programming is often the source of very hard to find and next-to-impossible to reliably reproduce concurrency bugs. While powerful and useful for some system-level things or more real-time-ish code, it's very easy for anyone but a very senior and experienced developer to make costly concurrency mistakes. And, it's very hard to devise a test that will tell you if it's really stable under all types of load or not.
node.js attempts to avoid the classic concurrency bugs by separating the threads into their own variable space and forcing all communication between them to be synchronized via the event queue. This means that threadA/functionA is never arbitrarily interrupted and some other code in your process changes some shared variables it was accessing while it wasn't looking.
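As a minimal sketch of that message-only model (the names workerSource and sumInWorker are made up for illustration, and the worker code is passed as a string only to keep everything in one file), the main thread below never touches the worker's total variable. It can only send numbers and read the replies:

```javascript
const { Worker } = require('worker_threads');

// Worker code is passed as a string (eval: true) only to keep the sketch in one file.
const workerSource = `
  const { parentPort } = require('worker_threads');
  let total = 0; // lives in the worker; the main thread cannot touch it
  parentPort.on('message', (n) => {
    total += n;                     // the worker updates its own variable
    parentPort.postMessage(total);  // and reports back with a message
  });
`;

// Sends each value to a fresh worker and resolves with the running totals it replies with.
function sumInWorker(values) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true });
    const replies = [];
    worker.on('error', reject);
    worker.on('message', (runningTotal) => {
      replies.push(runningTotal);
      if (replies.length === values.length) {
        worker.terminate(); // let the process exit once all replies are in
        resolve(replies);
      }
    });
    values.forEach((v) => worker.postMessage(v));
  });
}

const workerReplies = sumInWorker([1, 2, 3]);
workerReplies.then((replies) => console.log(replies)); // [ 1, 3, 6 ]
```

Each reply arrives as an event on the main thread's event loop, so the 'message' handler never runs in the middle of other Javascript.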
node.js also has a backstop that it can run a child_process that can be written in any language and can use native threads if needed or one can actually hook native code and real system level threads right into node.js using the add-on SDK (and it communicates with node.js Javascript through the SDK interface). And, in fact, a number of node.js built-in libraries do exactly this to surface functionality that requires that level of access to the nodejs environment. For example, the implementation of file access uses a pool of native threads to carry out file operations.
So, with all that said, there are still some types of race conditions that can occur and this has to do with access to outside resources. For example if two threads or processes are both trying to do their own thing and write to the same file, they can clearly conflict with each other and create problems.
So, using Workers in node.js still has to be aware of concurrency issues when accessing outside resources. node.js protects the local variable environment for each Worker, but can't do anything about contention among outside resources. In that regard, node.js Workers have the same issues as Java threads and the programmer has to code for that (exclusive file access, file locks, separate files for each Worker, using a database to manage the concurrency for storage, etc...).
This is part of the Node.js architecture. Whenever a request reaches Node, it is placed on the event queue and then picked up by the event loop. The event loop checks whether the request involves blocking I/O or non-blocking I/O (blocking I/O means operations that take time to complete, e.g. fetching data from somewhere else). The event loop passes blocking I/O to the thread pool, which is a collection of worker threads. The operation is assigned to one of those threads, which performs it (e.g. fetching data from a database). When it completes, the result is sent back to the event loop and the associated callback is then executed.

Node.js does not work in only a single thread by default

I have a question. Node.js uses libuv inside its core to manage its event loop, and by default it works with 4 threads and a process queue with a limit of 1024 processes.
Process queue limit
Threads by default
So why do most programmers say it's single threaded?
By default, node.js only uses ONE thread to run your Javascript. Thus your Javascript runs as single threaded. No two pieces of your Javascript are ever running at the same time. This is a critical design element in Javascript and is why it does not generally have concurrency problems with access to shared variables.
The event driven system works by doing this:
1. Fetch event from event queue.
2. Run the Javascript callback associated with the event.
3. Run that Javascript until it returns control back to the system.
4. Fetch the next event from the event queue and go back to step 2.
5. If no event in the event queue, go to sleep until an event is added to the queue, then go to step 1.
In this way, you can see that a given piece of Javascript runs until it returns control back to the system and then, and only then, can another piece of Javascript run. That's where the notion of "single threaded" comes from. One piece of Javascript running at a time. It vastly simplifies concurrency issues and, when combined with the non-blocking I/O model, it makes a very efficient system, even when lots of operations are "in flight" (though only one is actually running at a time).
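A small sketch of that run-to-completion behavior (the 50 ms busy loop and the names order and start are only for illustration): a timer that is already due still has to wait in the queue until the currently running Javascript returns.

```javascript
const order = [];

// Queue a timer event; its callback cannot run until the current code returns.
setTimeout(() => order.push('timer callback'), 0);

// Busy-wait for about 50 ms. This blocks the only Javascript thread,
// so the timer event just sits in the queue even though it is already due.
const start = Date.now();
while (Date.now() - start < 50) { /* spin */ }

order.push('synchronous code done');
console.log(order); // [ 'synchronous code done' ]

// Only after this script returns control does the loop fetch the timer event.
setTimeout(() => console.log(order), 0);
// [ 'synchronous code done', 'timer callback' ]
```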
Yes, node.js has some threads inside of libuv that are used for things like implementing file system access. But those are only for native code inside the library and do NOT make your Javascript multi-threaded in any way.
Now, recent versions of node.js do have Worker Threads, which allow you to actually run multiple threads of Javascript, but each thread is a very separate environment and you must communicate with other threads via messages without the direct sharing of variables. This is relatively new, introduced in node.js version 10.5 (though it's similar in concept to WebWorkers in the browser). These Worker Threads are not used at all unless you specifically engage them with custom programming designed to take advantage of them and live within their specific rules of operation.

Does a node server have multiple event loops?

Does a single node server have multiple event loops? want to know how multiple threads of a server machine gets used.
Is it like if 10 threads are available in a machine and so ten event loops are created?
New to javascript, server-side development and want to understand how things work under the hood.
Does a single node server have multiple event loops?
[..]
Is it like if 10 threads are available in a machine and so ten event loops are created?
No, Node.js has only one event loop.
want to know how multiple threads of a server machine gets used.
Quoting the Node.js site (event-loop-timers-and-nexttick):
Since most modern kernels are multi-threaded, they can handle multiple operations executing in the background. When one of these operations completes, the kernel tells Node.js so that the appropriate callback may be added to the poll queue to eventually be executed [by the event loop]
Since Node.js 10.5 you can use worker threads; their results are managed by the same event loop that manages all async operations such as I/O reads.

Multi-Thread Firebase function

So I am developing a Firebase function that accepts requests from users and updates a few nodes under a branch these users are listening to.
My issue is that if the function receives two client requests at the same time, this triggers two function invocations that execute and update the data at the same time.
I know this is typically solved by a transaction, but my updates are done on various nodes, not only on one value (e.g. a counter).
In traditional multi-threaded programming, this problem is solved by locking the code so that it can only be executed by one thread at a time, with the next thread resuming when the current one finishes.
Is this an option in Firebase Functions? If so, how can it be done?
There is currently no threading in Cloud Functions in any environment, both node and python included. You should not be depending on process level locking in Cloud Function - use a database transaction to ensure that updates are atomic and consistent. Each of your function invocations is going to be completely isolated from each other.
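One possible way to apply the transaction advice when several child nodes change together is to run a single transaction on their common parent, so the whole branch is updated atomically. The sketch below assumes the Realtime Database and the firebase-admin SDK. The function applyRequest, the path, and the field names (count, lastUser, history) are hypothetical. The database call itself appears only in a comment, since it needs a real project.

```javascript
// Pure update function: receives the current value of the shared branch
// (null if nothing is stored yet) and returns the complete new value.
// Field names (count, lastUser, history) are hypothetical.
function applyRequest(current, request) {
  const branch = current || { count: 0, lastUser: null, history: {} };
  return {
    count: branch.count + 1,
    lastUser: request.userId,
    history: { ...branch.history, [request.id]: request.payload },
  };
}

// Assumed usage inside a function, with the Admin SDK already initialized:
//   const branchRef = admin.database().ref('/shared/branch'); // hypothetical path
//   await branchRef.transaction((current) => applyRequest(current, request));
// The database may call the update function more than once if another
// writer changes the branch in between, so it must stay free of side effects.

// Local illustration: two requests applied one after the other.
const afterFirst = applyRequest(null, { id: 'r1', userId: 'alice', payload: 'a' });
const afterSecond = applyRequest(afterFirst, { id: 'r2', userId: 'bob', payload: 'b' });
console.log(afterSecond.count, afterSecond.lastUser); // 2 bob
```

Keeping the update logic in a pure function also makes it testable without a database connection.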

Why is Node.js called single threaded when it maintains threads in thread pool?

Node.js maintains an event loop, but it also has four threads by default for the complicated requests. How is this single threaded when there are more threads available in the thread pool?
Also, the threads that the event loop assigns to complicated tasks are dedicated threads, so how is this different from other multithreading concepts?
In the context to which you're referring, "single threaded" means that your Javascript runs as a single thread. No two pieces of Javascript are ever running at the same time either literally or time sliced (note: as of 2020 node.js does now have WorkerThreads, but those are something different from this original discussion). This massively simplifies Javascript development because there is no need to do thread synchronization for Javascript variables which are shared between different pieces of Javascript because only one piece of Javascript can ever be running at the same time.
All that said, node.js does use threads internal to its implementation. The default four threads you mention are used in a thread pool for disk I/O. Because disk I/O is normally a synchronous operation at the OS level that blocks the calling thread and node.js has a design where all I/O operations should be offered as asynchronous operations, the node.js designers decided to fulfill the asynchronous interface by using a pool of threads in order to implement (in native code), the fs module disk I/O interface (yes there are non-blocking disk I/O operations in some operating systems, but the node.js designers decided not to use them). This all happens under the covers in native code and does not affect the fact that your Javascript runs only in a single thread.
Here's a summary of how a disk I/O call works in node.js. Let's assume there's already an open file handle.
1. Javascript code calls fs.write() on an existing file handle.
2. fs module packages the arguments to the function and then calls native code.
3. Native code gets a thread from the thread pool and initiates the OS call to write data to that file.
4. Native code returns from the function.
5. fs module returns from the fs.write() call.
6. Javascript continues to execute (whatever statements came after the fs.write() call).
7. Some time later the native code fs.write() call on a thread finishes. It obtains a mutex protecting the event loop and inserts an event in the event queue.
8. When the Javascript engine is done executing whatever stream of Javascript it was running, it checks the event queue to see if there are any other events to run.
9. When it finds an event in the event queue, it removes it from the event queue and executes the callback associated with that event, starting a new stream of running Javascript.
Because a new event is never acted upon until the current stream of Javascript is done executing, this is where Javascript gets its event-driven, single threaded nature even though native code threads may be used to implement some library functions. Those threads are used to make a blocking operation into a non-blocking operation, but do not affect the single threaded-ness of Javascript execution itself.
The key here is that node.js is event driven. Every new operation that triggers some Javascript to run is serialized through the event queue and the next event is not serviced until the current stream of Javascript has finished executing.
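The ordering described above can be observed with a small sketch. The temp directory, the file name out.txt, and the steps array are made up for illustration. The statement after fs.write() always runs before the write callback, because the callback is a separate event.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const steps = [];
// Hypothetical scratch file in a fresh temp directory.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'fs-write-demo-'));
const file = path.join(dir, 'out.txt');
const fd = fs.openSync(file, 'w');

const writeDone = new Promise((resolve, reject) => {
  // fs.write() hands the work to native code and returns immediately.
  fs.write(fd, 'hello\n', (err, bytesWritten) => {
    // This runs later, as its own event, once the write has completed.
    if (err) return reject(err);
    steps.push(`callback: wrote ${bytesWritten} bytes`);
    fs.closeSync(fd);
    fs.unlinkSync(file); // clean up the scratch file and directory
    fs.rmdirSync(dir);
    resolve(steps);
  });
});

// Still the original stream of Javascript: runs before the callback above.
steps.push('statement after fs.write()');

writeDone.then((s) => console.log(s));
// [ 'statement after fs.write()', 'callback: wrote 6 bytes' ]
```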
In the node.js architecture the only way to get two pieces of Javascript to run independently and at the same time is to use a separate node.js process for each. Then, they will run as two completely separate operations and the OS will manage them separately. If your computer has at least two cores, then they can literally run at the same time, each on their own core. If your computer has only one core, they will essentially be in their own process thread and the OS will time slice them (sharing the one CPU between them).
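A minimal sketch of the separate-process approach, assuming only the built-in child_process module (the helper name pidOfChildNode is made up): the parent starts a second node.js process, which has its own process id, its own Javascript thread and its own event loop.

```javascript
const { execFile } = require('child_process');

// Starts a second node.js process that just prints its own process id.
function pidOfChildNode() {
  return new Promise((resolve, reject) => {
    execFile(process.execPath, ['-e', 'console.log(process.pid)'], (err, stdout) => {
      if (err) return reject(err);
      resolve(Number(stdout.trim()));
    });
  });
}

const childPid = pidOfChildNode();
childPid.then((pid) => {
  console.log(`parent ${process.pid}, child ${pid}`); // two different process ids
});
```

While the child runs, the parent's event loop stays free to handle other events. The child's output comes back as just another event.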
I will explain it in a clear and simple way to clear up the confusion:
The Node event loop is SINGLE-THREADED, but the other parts of Node are not.
The confusion comes from the C++ that Node uses underneath (Node.js is about 30% JS + 70% C++). By default, the JS part of Node.js is single-threaded, BUT it uses a C++ thread pool. So we have a single JS thread, which is the event loop of Node.js, plus 4 C++ threads used when needed for asynchronous I/O operations.
It is also important to know that the event loop is like a traffic organizer. Every request goes through the loop (which is single-threaded), and the loop dispatches requests to the pool threads when I/O work is needed. So if you have a computationally heavy app that does things like image processing, video editing, audio processing or 3D graphics (which most apps do not need), Node.js will be a bottleneck for that app and the traffic organizer will be unhappy.
Node.js shines for I/O-bound apps (most apps), such as apps dealing with databases and the filesystem.
Again: by default, Node.js uses a pool of 4 threads (PLUS one thread for the event loop itself), so 5 in total by default, because of the underlying C++ system.
As a general idea, a CPU can contain one or more cores, depending on your server (money).
Each core can run threads. Check your activity monitor to discover how many threads you are using.
Each process has multiple threads.
The multi-threading of Node comes from its dependence on V8 and libuv (a C library).
So, long story short:
Node is single-threaded for the event loop itself, but many operations are done outside the event loop, like crypto and the file system (fs). If you have two calls to crypto, each of them goes to its own thread (imagine 3 calls to crypto and 1 to fs: these calls will be distributed one per thread across the 4-thread pool).
Finally: it is very easy to increase the number of threads in the libuv (C library) thread pool, which is 4 by default, by changing the value of process.env.UV_THREADPOOL_SIZE. You could also use clustering (PM2 recommended) to, in effect, clone the event loop, i.e. have multiple event loops in case a single-threaded one is not enough for your high-load app.
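To make the thread pool idea concrete, here is a rough sketch (deriveKey, the password, the salts and the iteration count are arbitrary choices for illustration). It starts four crypto.pbkdf2 calls back to back. The hashing happens on pool threads, and the JS thread only sees the results as callbacks.

```javascript
const crypto = require('crypto');

// Starts one pbkdf2 key derivation; the hashing runs on a libuv pool thread.
function deriveKey(label) {
  return new Promise((resolve, reject) => {
    crypto.pbkdf2('secret', 'salt-' + label, 100000, 32, 'sha256', (err, key) => {
      if (err) return reject(err);
      resolve({ label, hex: key.toString('hex') });
    });
  });
}

// Four calls are started back to back; none blocks the JS thread.
const derivedKeys = Promise.all(['a', 'b', 'c', 'd'].map(deriveKey));
derivedKeys.then((keys) => {
  keys.forEach((k) => console.log(k.label, k.hex.length)); // each hex string has 64 characters
});
```

The environment variable is named UV_THREADPOOL_SIZE (upper case), and the pool size is fixed once the pool is first used, so it is usually set when launching the process, for example UV_THREADPOOL_SIZE=8 node app.js (app.js being a placeholder name).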
What people rarely explain is that the thread pool is a C++ thing controlled mostly by Node.js, not by the developer, which is why people keep asking how it can be single-threaded while having a thread pool!
Hope that simplifies this advanced topic.
By default, the execution of your JavaScript code runs on a single thread.
However, node.js tries to make most long-running calls async. For some, that just involves making async OS calls; for others, node.js executes the call itself on a secondary thread while continuing to run other JS code. Once the async call has finished, the JS callback or Promise handler runs.
Node.js was created explicitly as an experiment in async processing. The belief is that, under typical web loads, async processing on a single thread can achieve more performance and scalability than the typical thread-based implementation.

Resources