Does one Node.js thread block the other? [closed] - node.js

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 8 years ago.
Using Node.js and one single CPU virtual instance, if I put a worker thread and a web thread on the same node, would one block the other? Would I require two CPUs for them to run perfectly in parallel?

Yes, one would block another if it is all synchronous code.
Since you only have one virtual CPU instance (assuming not hyperthreaded), at the core level, the CPU only takes instructions in a synchronous fashion: 1, then 2, then 3, then 4.
Therefore, in theory, if one worker had something like this, it would block:
while (true) {
  doSomething();
}
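Disclaimer: the OS kernel does schedule threads preemptively, so two separate processes or threads on one CPU would take turns rather than one starving the other completely; within a single Node.js process, though, a loop like this monopolizes the event loop.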
However, Node.js runs all I/O in the event loop, along with tasks that you explicitly state to be run in the event loop (process.nextTick(), setTimeout()...). The way the event loop works is explained well here, so I won't go into much detail - however, the only blocking part about Node.js is synchronously running code, like the example above.
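To make the point about synchronous code concrete, here is a minimal sketch. The helper name busyWait and the 50 ms figure are illustrative assumptions, not part of the original answer. A zero-delay timer is scheduled first, then the thread spins; the timer callback cannot run until the synchronous code has returned.

```javascript
// Schedule a callback on the event loop with a 0 ms delay.
let timerFired = false;
setTimeout(() => {
  timerFired = true;
  console.log('timer finally ran');
}, 0);

// Hypothetical helper: spin synchronously for roughly `ms` milliseconds.
function busyWait(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // Synchronous work: nothing else in this process can run meanwhile.
  }
}

busyWait(50);

// Still false here: the timer was due long ago, but the event loop
// only gets control back once this synchronous script has finished.
console.log('timer fired during busy loop?', timerFired);
```

If the busy loop were replaced by asynchronous work (a timer, a socket read), the callback would run as soon as it was due.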
So, long story short: since your web worker uses Node.js, and the http module is an async module, your web worker will not block. Since your worker thread also uses Node.js, assuming it executes code that is launched when an event occurs (a visit to your website, for example), it will not block.
To run synchronous code perfectly in parallel, you would need two CPUs. However, assuming it is asynchronous, it should work just fine on one CPU.

Related

How many threads should I spawn for maximum performance? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 months ago.
I am writing a Rust script that needs to brute force the solution to some calculation and is likely to run 2^80 times. That is a lot! I am trying to make it run as fast as possible and thus want to divide the burden across multiple threads. However, if I understand correctly, this only accelerates my script if the threads actually run on different cores; otherwise they will not truly run simultaneously but switch between one another.
How can I make sure they use different cores, and how can I know that no more cores are available?
TL;DR: Use std::thread::available_parallelism (or alternatively the num_cpus crate) to know how many threads to run and let your OS handle the rest.
Typically when you create a thread, the OS thread scheduler is free to decide where and when it executes, and it will do so in a way that makes good use of CPU resources. So if you use fewer threads than the system has logical cores available, you are potentially missing out on performance. If you use more, that's not particularly a problem, since the scheduler will try its best to balance the threads that have work to do, but the surplus threads are a small waste of memory, OS resources, and context switches. Creating as many threads as there are logical CPU cores on your system is the sweet spot, and the above function will get you that number.
You could tell the OS exactly which cores to run which threads by setting their affinity, however that isn't really advisable since it wouldn't particularly make anything faster unless you start really configuring your kernel or are really taking advantage of your NUMA nodes.
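The same sizing idea can be sketched in JavaScript for Node.js rather than Rust (an assumption made only to keep the examples on this page in one language). os.availableParallelism() plays the role of std::thread::available_parallelism and is only present in newer Node.js releases, hence the fallback. The helper splitRange is hypothetical; it shows one way to divide a 2^80 search space into one contiguous chunk per worker.

```javascript
const os = require('node:os');

// Number of workers to start: one per logical CPU the process may use.
// os.availableParallelism() exists in newer Node.js releases; fall back
// to os.cpus().length on older ones.
function workerCount() {
  if (typeof os.availableParallelism === 'function') {
    return os.availableParallelism();
  }
  return Math.max(1, os.cpus().length);
}

// Hypothetical helper: split the half-open range [0, total) into `parts`
// contiguous chunks whose sizes differ by at most one. BigInt is used
// because a search space like 2^80 does not fit in a Number.
function splitRange(total, parts) {
  const n = BigInt(parts);
  const base = total / n;
  const extra = total % n;
  const chunks = [];
  let start = 0n;
  for (let i = 0n; i < n; i++) {
    const size = base + (i < extra ? 1n : 0n);
    chunks.push({ start, end: start + size });
    start += size;
  }
  return chunks;
}

const chunks = splitRange(2n ** 80n, workerCount());
console.log(`${chunks.length} chunks, first ends at ${chunks[0].end}`);
```

Each chunk would then be handed to its own thread; where those threads run is left to the OS scheduler, as the answer recommends.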

How many tasks can a single thread execute simultaneously? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 2 years ago.
How many tasks can a single thread execute simultaneously?
Concurrently: Zero or one. A thread is a thread. Not a magic yarn.
If by "in parallel" you mean "processed in parallel" and if you consider awaited Tasks, then there is no upper-bound limit on how many tasks are being awaited - but only one will actually be executed per a single CPU hardware-thread (usually 2x the CPU core count due to superscalar simultaneous multithreading, aka Hyper-Threading).
Also remember that Task is very abstract. It does not refer only to concurrently executing/executed (non-blocking) code, but can also refer to pending IO (e.g. disk IO, network IO, etc) that is being handled asynchronously by the host environment (e.g. the operating system) rather than it blocking the thread if it used a "traditional" (non-asynchronous) OS API call.
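The answer is about .NET Tasks; as a rough analogue, the sketch below uses JavaScript promises on Node.js, which is an assumption of this example and not something the answer itself uses. fakeIo is a hypothetical stand-in for an asynchronous I/O call. It shows thousands of operations being awaited at once on a single thread, while the completion callbacks still run strictly one at a time.

```javascript
// Hypothetical stand-in for an asynchronous I/O operation: it resolves
// after `ms` milliseconds without occupying the thread while it waits.
let inFlight = 0;
function fakeIo(id, ms) {
  inFlight++;
  return new Promise((resolve) => {
    setTimeout(() => {
      inFlight--;
      resolve(id);
    }, ms);
  });
}

// Start 5000 operations; none of them has completed yet.
const pending = [];
for (let id = 0; id < 5000; id++) {
  pending.push(fakeIo(id, 10));
}
console.log('awaited at once:', inFlight); // all 5000 are pending

// The single thread runs each completion callback one at a time.
Promise.all(pending).then((ids) => {
  console.log('completed:', ids.length, 'still pending:', inFlight);
});
```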
Re: comment
I just have a problem with handling multiple (it can be 5000, for instance) clients on the server and for each of them, I need to run a separate handling loop. But I'm concerned about the fact that the thread can handle either 0 or 1 tasks. Does it mean I should create a new thread for every new client? I know it does not matter how much threads I'll create, it won't change speed. But speed does not matter - the loop just should be executed independently for each client.
Ugh, this is not quite the same thing as your question - but I'll try my best to explain...
for each of them, I need to run a separate handling loop
Not necessarily. Just because you need to maintain state for each connected client does not mean you need a separate "loop" (i.e. a thread of execution).
In computers today fundamentally almost all network IO goes through the BSD Sockets API ("WinSock" on Windows, and in .NET this is represented via System.Net.Sockets.Socket). Remember that all kinds of computers work with sockets, including simple single-threaded computers. They don't need a blocking-loop for each connection: instead they use select to get information about socket status without blocking and only read data from the socket's input buffer if safe to do so. Voila! Only a single thread is needed. You can do this in .NET by checking Socket.Available, Socket.Select, or better yet: using the newer NetworkStream.ReadAsync method, for example.
If you're using BSD Sockets API (System.Net.Sockets) then you should use Socket.Select
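As a language-neutral illustration of keeping per-client state without a per-client loop, here is a small JavaScript sketch. It does not use the .NET APIs named above; onData is a hypothetical handler that whatever readiness mechanism is in use (select, an event loop, an async read) would call when bytes arrive, and the newline-delimited framing is an assumption for the example.

```javascript
// Per-client state lives in a Map, not in a dedicated thread or loop.
const clients = new Map();

// Hypothetical handler: called when `chunk` arrives for a given client.
function onData(clientId, chunk) {
  let state = clients.get(clientId);
  if (!state) {
    state = { buffer: '', messages: [] };
    clients.set(clientId, state);
  }
  state.buffer += chunk;
  // Extract complete newline-terminated messages; keep any partial tail.
  let newline;
  while ((newline = state.buffer.indexOf('\n')) !== -1) {
    state.messages.push(state.buffer.slice(0, newline));
    state.buffer = state.buffer.slice(newline + 1);
  }
}

// Interleaved input from two clients, handled by one thread.
onData('a', 'hel');
onData('b', 'ping\n');
onData('a', 'lo\nwor');
console.log(clients.get('a').messages, clients.get('b').messages);
```

Client a ends up with one complete message and a partial one still buffered, even though its data arrived in pieces interleaved with client b's.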
Does it mean I should create a new thread for every new client?
NOOOOONONONONONNONONO - no, you do not. Creating and running a new Thread for each connected client (Socket, NetworkStream, TcpClient, etc) is an anti-pattern that will quickly exhaust your available process memory (as each Thread costs 1 MB just for its default stack on Windows desktop, ~250 KB within IIS).
I know it does not matter how much threads I'll create
YES IT DOES! Spawning lots of threads is a good way to torpedo your application's network performance and consume unnecessarily large amounts of memory.
the loop just should be executed independently for each client.
Please learn about Asynchronous Sockets. By using the async feature in C# with NetworkStream or Socket's async methods your code will use as few threads as necessary to handle network data.

Understanding node.js [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I have started reading node.js. I have a few questions:
1. Is node better than multi-threading just because it saves us from caring about deadlocks and reduces thread creation overhead, or are there other factors too? Node does use threads internally, so we can't say that it saves thread creation overhead, just that it is managed internally.
2. Why do we say that node is not good for multi-core processors? It creates threads internally, so it must be getting the benefits of multi-core. Why do we say it is not good for CPU-intensive applications? We can always fork new processes for CPU-intensive tasks.
3. Are only functions with callbacks dispatched as threads, or are there other cases too?
4. Non-blocking I/O can be achieved using threads too. A main thread may always be ready to receive new requests. So what is the benefit?
1. Correct.
2. Node.js does scale with cores, through child processes and the cluster module, among other things.
3. Callbacks are just a common convention developers use to implement asynchronous methods. There is no technical reason why you have to include them. You could, for example, have all your async methods use promises instead.
4. Everything node does could be accomplished with threads, but there is less code/overhead involved with node.js's asynchronous IO than there is with multi-threaded code. You do not, for example, need to create an instance of thread or runnable every time like you would in Java.
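To illustrate the point about callbacks being only a convention, here is a minimal sketch of the same asynchronous operation exposed both ways. The function names doubleLater and doubleLaterAsync are hypothetical, and the timer stands in for real I/O.

```javascript
// Callback style: the result is delivered to a function argument,
// following the Node.js (err, value) convention.
function doubleLater(n, callback) {
  setTimeout(() => callback(null, n * 2), 10);
}

// Promise style: same operation, but the result is returned as a Promise.
function doubleLaterAsync(n) {
  return new Promise((resolve, reject) => {
    doubleLater(n, (err, value) => (err ? reject(err) : resolve(value)));
  });
}

doubleLater(21, (err, value) => console.log('callback:', value));
doubleLaterAsync(21).then((value) => console.log('promise:', value));
```

Neither version starts a thread; both simply register work to be run by the event loop when the timer fires.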

Why is Node.js single threaded? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
In PHP (or Java/ASP.NET/Ruby) based webservers every client request is instantiated on a new thread. But in Node.js all the clients run on the same thread (they can even share the same variables!) I understand that I/O operations are event-based so they don't block the main thread loop.
What I don't understand is WHY the author of Node chose it to be single-threaded? It makes things difficult. For example, I can't run a CPU-intensive function because it blocks the main thread (and new client requests are blocked), so I need to spawn a process (which means I need to create a separate JavaScript file and execute another node process on it). However, in PHP, CPU-intensive tasks do not block other clients because, as I mentioned, each client is on a different thread. What are its advantages compared to multi-threaded web servers?
Note: I've used clustering to get around this, but it's not pretty.
Node.js was created explicitly as an experiment in async processing. The theory was that doing async processing on a single thread could provide more performance and scalability under typical web loads than the typical thread-based implementation.
And you know what? In my opinion that theory's been borne out. A node.js app that isn't doing CPU intensive stuff can run thousands more concurrent connections than Apache or IIS or other thread-based servers.
The single threaded, async nature does make things complicated. But do you honestly think it's more complicated than threading? One race condition can ruin your entire month! Or empty out your thread pool due to some setting somewhere and watch your response time slow to a crawl! Not to mention deadlocks, priority inversions, and all the other gyrations that go with multithreading.
In the end, I don't think it's universally better or worse; it's different, and sometimes it's better and sometimes it's not. Use the right tool for the job.
The issue with the "one thread per request" model for a server is that it doesn't scale well in several scenarios compared to the event loop model.
Typically, in I/O intensive scenarios the requests spend most of the time waiting for I/O to complete. During this time, in the "one thread per request" model, the resources linked to the thread (such as memory) are unused and memory is the limiting factor. In the event loop model, the loop thread selects the next event (I/O finished) to handle. So the thread is always busy (if you program it correctly of course).
The event loop model, like all new things, seems shiny and like the solution to every problem, but which model to use depends on the scenario you need to tackle. If you have an I/O-intensive scenario (like a proxy), the event-based model will rule, whereas a CPU-intensive scenario with a low number of concurrent processes will work best with the thread-based model.
In the real world most scenarios will be somewhere in the middle. You will need to balance the real need for scalability with the development complexity to find the correct architecture (e.g. have an event-based front end that delegates to the back end for the CPU-intensive tasks; the front end will use few resources while waiting for the task result). As with any distributed system, it requires some effort to make it work.
If you are looking for the silver bullet that will fit with any scenario without any effort, you will end up with a bullet in your foot.
Long story short, node draws from V8, which is internally single-threaded. There are ways to work around the constraints for CPU-intensive tasks.
At one point (0.7) the authors tried to introduce isolates as a way of implementing multiple threads of computation, but they were ultimately removed: https://groups.google.com/forum/#!msg/nodejs/zLzuo292hX0/F7gqfUiKi2sJ
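One workaround for CPU-intensive tasks that exists in current Node.js is the worker_threads module, a later addition that these answers do not mention. The sketch below is an assumption about how it might be used, not the approach of any answer here; fibInWorker and the inline worker source are hypothetical. It moves a deliberately slow computation off the main thread so the event loop stays free.

```javascript
const { Worker } = require('node:worker_threads');

// Hypothetical helper: compute fib(n) on a separate thread and
// resolve with the result. The worker source is passed inline via
// the `eval: true` option to keep the example in a single file.
function fibInWorker(n) {
  const source = `
    const { parentPort, workerData } = require('node:worker_threads');
    const fib = (k) => (k < 2 ? k : fib(k - 1) + fib(k - 2));
    parentPort.postMessage(fib(workerData));
  `;
  return new Promise((resolve, reject) => {
    const worker = new Worker(source, { eval: true, workerData: n });
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}

fibInWorker(30).then((value) => console.log('fib(30) =', value));

// This timer is not held up by the computation running in the worker.
setTimeout(() => console.log('main thread still responsive'), 0);
```

On a machine with a single CPU the worker and the main thread still share that CPU, so this keeps the server responsive rather than making the computation faster.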

What is event-driven programming? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
What is event-driven programming, and does it have anything to do with threading? I came to this question while reading about servers and how they handle user requests and manage data. If a user sends a request, the server begins to process data and writes the state in a table. Why is that so? Does the server stop processing data for that user and start processing data for another user, or is processing for each user run in a different thread (a multithreaded server)?
Event driven programming != Threaded programming, but they can (and should) overlap.
Threaded programming is used when multiple actions need to be handled by a system "simultaneously." I use simultaneously loosely, as most OSes use a time-sharing model for threaded activity, or at least they do when there are more threads than processors available. Either way, not germane to your Q.
I would use threaded programming when I need an application to do two or more things - like receiving user input from a keyboard (thread 1) and running calculations based upon the received input (thread 2).
Event driven programming is a little different, but in order for it to scale, it must utilize threaded programming. I could have a single thread that waits for an event / interrupt and then processes things on the event's occurrence. If it were truly single threaded, any additional events coming in would have to wait (or, without a queue, be lost) while the first event was being processed. If I had a multi-threaded event processing model then additional threads would be spun up as events came in. I'm glossing over the producer / worker mechanisms required, but again, not germane to the level of your question.
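The single-threaded case described above can be sketched as a toy event loop. TinyLoop is a hypothetical class written for this illustration, not a real library; in this sketch, events that arrive while another event is being processed wait in a FIFO queue until the thread is free.

```javascript
// Toy single-threaded event loop: events wait in a FIFO queue and are
// dispatched one at a time to the handlers registered for their type.
class TinyLoop {
  constructor() {
    this.queue = [];
    this.handlers = new Map();
  }
  on(type, handler) {
    if (!this.handlers.has(type)) this.handlers.set(type, []);
    this.handlers.get(type).push(handler);
  }
  emit(type, payload) {
    this.queue.push({ type, payload }); // queued, not handled yet
  }
  run() {
    while (this.queue.length > 0) {
      const { type, payload } = this.queue.shift();
      for (const handler of this.handlers.get(type) || []) {
        handler(payload);
      }
    }
  }
}

const loop = new TinyLoop();
const log = [];
loop.on('request', (user) => {
  log.push(`start ${user}`);
  if (user === 'alice') loop.emit('request', 'carol'); // arrives mid-processing
  log.push(`end ${user}`);
});
loop.emit('request', 'alice');
loop.emit('request', 'bob');
loop.run();
console.log(log);
```

Each request runs to completion before the next one starts, and the request emitted while alice was being handled is served after bob, in arrival order.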
Why does a server start processing / storing state information when an event is received? Well, because it was programmed to. :-) State handling may or may not be related to the event processing. State handling is a separate subject from event processing, just like events are different than threads.
That should answer all of the questions you raised. Jonny's first comment / point is worth heeding - being more specific about what you don't understand will get you better answers.
