I am not sure how to phrase this, but any ideas about how to achieve the below behavior would be great.
I have a web server that makes long-running calls to a command-line program. I want the server to handle multiple long-running calls at once, but not return a given request until its call is complete. This is not a website, so it is okay that the calls run for a very long time, and the client will not time out either. Any ideas about how to achieve this?
Is this link relevant? writing a multiplexing server in clojure?
Given that Luminus generates a WAR file (which I assume it does, because it runs on top of Ring and Compojure), it is already "multi-threaded": when you run that WAR file in Tomcat or Jetty, each request gets its own thread.
Clients can set their socket read timeout to infinite and they'll wait forever.
If your calls are long for any reason other than CPU usage, your best option is to use http-kit or aleph in an uberjar. Unlike the other servers, http-kit and aleph use a thread pool rather than a thread per request, and if your bottleneck is anything other than CPU usage (for example an arbitrary sleep time, or network or disk I/O), then a thread pool will perform much better than a thread per request would.
http-kit client / ring server
aleph client / ring server
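To make the behavior asked for above concrete, here is a minimal sketch in Python rather than Clojure, so it shows the shape of the idea and not the Luminus, http-kit or aleph API. It assumes the "command line program" can be stood in for by a child Python process that sleeps; `run_long_command` and the 0.3 second durations are made up for illustration. Each call blocks until its own program exits, while the pool lets several calls run at the same time.

```python
import subprocess
import sys
import time
from concurrent.futures import ThreadPoolExecutor


def run_long_command(seconds: float) -> str:
    """Block until the external program exits, then return its output."""
    # Hypothetical stand-in for the real command-line program:
    # a child Python process that sleeps and then prints one line.
    code = f"import time; time.sleep({seconds}); print('done after {seconds}s')"
    completed = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout.strip()


start = time.monotonic()
# Four "requests" arrive together; each worker thread waits on its own child process.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(run_long_command, [0.3, 0.3, 0.3, 0.3]))
elapsed = time.monotonic() - start

print(results)
print(f"elapsed: {elapsed:.2f}s")
```

The threads spend their time waiting on the child processes, not computing, which is why a modest pool is enough here.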
Related
I have a very simple express server with 1 endpoint listening on a certain port.
I'd like a separate scheduled script to make an API call every 10 minutes and save the data locally (in memory).
When the endpoint on my server is hit, I'd like to serve the locally cached data from memory.
How can I make sure that the scheduled cron job never blocks the processing of incoming requests, i.e. that both continue to happen at the same time?
I've read about child processes and worker threads, but I'm not sure if this would be a good use case for it.
JavaScript is single-threaded, so worker threads and forking notwithstanding, every line of JavaScript code that executes blocks other lines of code from executing. Only one can execute at a time. So forgetting about the scheduled job for a moment, in an express server, multiple requests happening concurrently will compete with each other for CPU resources and block each other.
However, Node.js has asynchronous I/O. A typical request often looks a bit like this:
1. Receive request and run a little bit of JavaScript code (maybe 1ms).
2. Query an external REST API (let's say 50ms).
3. Run a little more JavaScript code and send a response (maybe 1ms).
In steps 1 and 3, all other requests are blocked from executing JavaScript code. They have to wait until the current request is done executing its code. In step 2 however, other requests are free to execute JavaScript code, one after the other, since the network request is non-blocking.
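The three steps can be sketched with a single-threaded event loop. This uses Python's `asyncio` as an analogy for the Node.js event loop, not Node itself; `handle_request` is a made-up handler and the 50ms sleep stands in for the external REST call. The log shows that all three requests get through their first step before any of them finishes, because the wait in the middle does not hold the thread.

```python
import asyncio
import time


async def handle_request(request_id: int, log: list) -> str:
    log.append(f"start {request_id}")    # step 1: a little code, runs without interruption
    await asyncio.sleep(0.05)            # step 2: simulated 50ms API call, yields to other requests
    log.append(f"finish {request_id}")   # step 3: a little more code, then respond
    return f"response {request_id}"


async def main():
    log: list = []
    start = time.monotonic()
    # Three requests arrive at roughly the same time.
    responses = await asyncio.gather(*(handle_request(i, log) for i in range(3)))
    return responses, log, time.monotonic() - start


responses, log, elapsed = asyncio.run(main())
print(log)
print(f"total: {elapsed * 1000:.0f}ms for three 50ms waits")
```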
I explain all this because it sounds like your scheduled job might go through the same three steps mentioned above. If that's the case, then functionally it's not going to block anything any more than simultaneous requests to the server are already blocking each other, and there's little reason to worry about threading or multi-processing unless your server is under such heavy load that it cannot serve all incoming requests.
So this is not a good use case for child processes and worker threads unless the scheduled job has different characteristics than I'm assuming (for example if it spends multiple seconds crunching heavy computations in JavaScript, that would noticeably impact request response time when it runs, and might make sense to break out into a separate process or worker thread).
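As a rough illustration of the setup in the question, here is a sketch of a periodic refresh running next to request handling on one event loop. It is written with Python's `asyncio` as an analogy, not with Express; `fetch_remote_data`, the cache layout and the short intervals (instead of 10 minutes) are all assumptions made for the example.

```python
import asyncio

# In-memory cache shared by the refresher and the request handler.
cache = {"value": None, "refreshes": 0}


async def fetch_remote_data() -> str:
    # Hypothetical stand-in for the external API call (mostly waiting on the network).
    await asyncio.sleep(0.02)
    return "fresh data"


async def refresh_forever(interval: float) -> None:
    while True:
        cache["value"] = await fetch_remote_data()
        cache["refreshes"] += 1
        await asyncio.sleep(interval)


async def handle_request() -> str:
    # Serves straight from memory; it never waits on the remote API.
    return cache["value"] or "warming up"


async def main() -> list:
    refresher = asyncio.create_task(refresh_forever(interval=0.05))
    served = []
    for _ in range(5):
        served.append(await handle_request())
        await asyncio.sleep(0.03)  # gap between incoming requests
    refresher.cancel()
    return served


served = asyncio.run(main())
print(served)
```

The only request that sees no data is the one that arrives before the first refresh has completed, which is why a real server might want to fetch once before it starts listening.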
I was asked in an interview: are there any cases that may force you to use blocking code in a Node.js server?
My answer was: I have never needed that in any project, but I think it may be useful for tasks that need a lot of CPU processing, like image processing or video generation.
So, experts, can you correct that for me: is there any case where blocking code would be a must?
First off, you have to distinguish between the different types of programs. A server that you expect to be responsive to many different incoming requests has very different needs than a single user program you write to do some file management or fetch some content and insert it in a database.
So, if you're not a multi-user server, you may be able to use synchronous I/O everywhere it's offered (most specifically for file access). For example, I have several scripts that do file management on my hard disk. These scripts don't have any server component and are run automatically in the middle of the night to trim backups, trim log files, etc... These scripts are perfectly OK to use synchronous I/O for pretty much anything.
If, on the other hand, you are a multi-user server and you need to be responsive to incoming requests that can arrive at any time, then the only two times you can/should use blocking I/O or blocking crypto are at startup time or in some sort of shut-down scenario. For all other code in service of incoming requests, you have to use non-blocking, asynchronous I/O to avoid locking up your server during a request and making it non-responsive to new incoming requests.
If you have time consuming, CPU-intensive operations such as image processing or video generation, then you will want to offload that processing to another thread or process so that your main server thread is not blocked doing that processing. A typical way of handling that would be to create a worker pool of N processes/threads that can be sent jobs to crunch on. Then, you keep your most CPU-intensive work out of the main nodejs thread, allowing it to stay responsive to incoming requests.
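A minimal sketch of the worker-pool idea, in Python rather than Node.js. `render_thumbnail` is a hypothetical CPU-heavy job, and the heartbeat task only exists to show that the main loop keeps getting turns while the jobs run. A thread pool is used here to keep the example small; for pure-Python number crunching the usual swap-in would be `concurrent.futures.ProcessPoolExecutor`, since Python threads share one interpreter lock.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


def render_thumbnail(size: int) -> int:
    """Hypothetical CPU-heavy job, standing in for image or video processing."""
    return sum(i * i for i in range(size))


async def heartbeat(ticks: list) -> None:
    # Stands in for "the server is still answering other requests".
    while True:
        ticks.append(1)
        await asyncio.sleep(0.01)


async def main():
    ticks: list = []
    beat = asyncio.create_task(heartbeat(ticks))
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Hand the heavy jobs to the pool instead of running them on the loop thread.
        jobs = [
            loop.run_in_executor(pool, render_thumbnail, n)
            for n in (200_000, 300_000)
        ]
        results = await asyncio.gather(*jobs)
    beat.cancel()
    return results, len(ticks)


results, tick_count = asyncio.run(main())
print(results, tick_count)
```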
So, experts, can you correct that for me: is there any case where blocking code would be a must?
Synchronous (blocking) I/O vastly simplifies server startup, as you can do things like read configurations synchronously. You could write that code asynchronously, but then your module interface often ends up having to return promises that indicate when it is actually ready and done with its initialization, which complicates using the module.
For example, require() is synchronous and this really, really helps make initialization a lot simpler.
The only place I know of in a server where blocking code might be required is if you're trying to write something to disk right before your program exits when it's already in the process of exiting. You get notified of an exit event and if you try to use asynchronous file I/O, then your program will exit before the I/O finishes. In that case, you may need to use synchronous file I/O (which is not a problem in that circumstance).
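The exit-time case can be sketched as follows. This is a Python analogy using `atexit`, not the Node.js exit event; `state`, `state_path` and `save_state_sync` are made-up names. The point is the same: the write is a plain blocking call that has finished by the time the handler returns, so the process cannot exit in the middle of it.

```python
import atexit
import json
import os
import tempfile

state = {"requests_served": 0}
# Hypothetical location for the saved state.
state_path = os.path.join(tempfile.mkdtemp(), "state.json")


def save_state_sync() -> None:
    # Plain blocking write: the data is handed to the OS before this returns.
    with open(state_path, "w", encoding="utf-8") as f:
        json.dump(state, f)


# Run the blocking save while the interpreter is shutting down.
atexit.register(save_state_sync)

state["requests_served"] = 42
save_state_sync()  # calling it directly shows what would be on disk at exit

with open(state_path, encoding="utf-8") as f:
    saved = json.load(f)
print(saved)
```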
I know this is a weird question, but hear me out. I'm working on a high-throughput, compute-heavy HTTP backend server in C++. It is quite straightforward:
Spins up an HTTP server
Receives a request
Does a lot of math (this step is parallelized using TBB)
Sends the result back (takes about 20ms)
There's no limit on how soon the response has to get out, but the lower the worst case, the better.
Now my bottleneck is that the server part uses a different thread pool than TBB. Thus, when TBB is busy doing math, the server may suddenly get tens of new requests; the threads on the server side then get scheduled and cause a lot of cache misses and branch mispredictions.
A solution I came up with is to share TBB's thread pool with the server. Then no request will be registered while TBB is busy, and requests will be processed immediately after TBB is free.
Is this a good idea? Or could it have potential problems?
This is difficult to answer without knowing what that other thread pool is doing. If it handles file or network I/O then combining it with a CPU-intensive pool can be a pessimization since I/O does not consume CPU.
Normally there should be a small pool or maybe even a single thread handling the accept loop and async I/O, handing new requests off to the worker pool for processing and sending the results back to the network.
Try to avoid mixing CPU-intensive work with I/O work, as it makes resource utilization difficult to manage. Having said that, sometimes it's just easier and it's never good to run at 100% CPU anyway. So yes, you should try having just one pool. But measure the performance before/after the change.
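The layout described above (one small I/O side that only accepts and hands off, plus a separate worker pool for the math) might be sketched like this. It is Python with plain threads and queues, so it shows the structure only and says nothing about TBB or real C++ performance; `heavy_math`, `io_loop` and the two queues are assumptions made for the example.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor


def heavy_math(n: int) -> int:
    """Hypothetical compute step (the part TBB would parallelize)."""
    return sum(i * i for i in range(n))


def io_loop(incoming: queue.Queue, outgoing: queue.Queue,
            workers: ThreadPoolExecutor) -> None:
    """Single I/O thread: accepts requests and hands them off, never computes."""
    while True:
        request = incoming.get()
        if request is None:  # sentinel: stop accepting
            break
        request_id, n = request
        future = workers.submit(heavy_math, n)
        # When a worker finishes, queue the response for sending.
        future.add_done_callback(
            lambda f, rid=request_id: outgoing.put((rid, f.result()))
        )


incoming: queue.Queue = queue.Queue()
outgoing: queue.Queue = queue.Queue()
sizes = [10, 100, 1000]

with ThreadPoolExecutor(max_workers=4) as workers:
    io_thread = threading.Thread(target=io_loop, args=(incoming, outgoing, workers))
    io_thread.start()
    for request_id, n in enumerate(sizes):
        incoming.put((request_id, n))
    incoming.put(None)
    io_thread.join()
# Leaving the with-block waits for all submitted work to finish.

responses = dict(outgoing.get() for _ in sizes)
print(responses)
```

Merging the two pools would amount to calling `heavy_math` directly inside `io_loop`, which is the variant worth measuring against this one.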
I'm investigating what reactive means, and because it is a fairly low-level difference compared to the common non-reactive approach, I'd like to understand what is going on. Let's take Tomcat as the server (I guess it will be different for Netty).
Non-reactive
Connection from the browser is created.
For each request, a thread is taken from the thread pool to process it.
After the thread has finished processing, it returns the result through the connection back to the other side.
Reactive???
How is it done in Tomcat or Netty? I cannot find any decent article about how Tomcat supports reactive apps and how Netty does it differently (a connection-, thread-, and request-level explanation).
What bothers me is how reactive makes the web server non-blocking when you still need to wait for the response. Maybe you can get the first part of the response quicker with reactive, but is that all? I guess the main point of reactiveness is effective thread utilization and this is what I am asking about.
Your last point, "I guess the main point of reactiveness is effective thread utilization and this is what I am asking about", is exactly what the reactive approach was designed for.
So how is effective utilization achieved?
Well, as an example, let's say you are requesting data from a server multiple times.
In a typical non-reactive approach, you create or use a separate thread (perhaps from a thread pool) for each request, and that thread's only job is to serve that particular request. The thread takes the request, passes it to the server, waits until the data has been fetched, and then brings that data back to the client.
In the reactive approach, a thread is allocated when a request arrives. If another request comes in, no new thread is created; it is served by the same thread. How?
When the thread passes a request to the server, it does not wait for the response; it returns and serves other requests.
When the server has found the data and it is available, an event is raised, and the thread then goes to fetch that data. This is called the event-loop mechanism, because the thread is called back by an event when the data is available.
There is some complexity involved in mapping each response to the right request.
All of this complexity is abstracted away by Spring WebFlux (in Java).
So the whole process becomes non-blocking. And since one thread is enough to serve all the requests, there is no thread switching, and we can have one thread per CPU core, thus achieving effective utilization of threads.
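The difference in thread usage can be made visible with a small experiment. This is a sketch in Python, with `asyncio` standing in for the event loop; it is not Spring WebFlux, Tomcat or Netty code, and `thread_per_request` and `event_loop_requests` are made-up names. Five simulated requests are served once with a thread each and once on an event loop, recording which threads did the work.

```python
import asyncio
import threading


def thread_per_request(n: int) -> set:
    """Non-reactive style: every in-flight request occupies its own thread."""
    thread_ids: set = set()
    barrier = threading.Barrier(n)  # keeps all n requests in flight at once

    def handler() -> None:
        thread_ids.add(threading.get_ident())
        barrier.wait()  # stand-in for blocking until the backend answers

    threads = [threading.Thread(target=handler) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return thread_ids


async def event_loop_requests(n: int) -> set:
    """Reactive style: one thread starts every request and resumes it on an event."""
    thread_ids: set = set()

    async def handler() -> None:
        thread_ids.add(threading.get_ident())
        await asyncio.sleep(0.01)  # non-blocking wait for the backend
        thread_ids.add(threading.get_ident())

    await asyncio.gather(*(handler() for _ in range(n)))
    return thread_ids


blocking_ids = thread_per_request(5)
reactive_ids = asyncio.run(event_loop_requests(5))
print(len(blocking_ids), "threads vs", len(reactive_ids), "thread")
```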
I know Node.js is good at keeping a large number of simultaneous persistent connections open, for example a chat room with many, many chatters.
I am wondering how it achieves this. After all, it uses TCP/IP, which is encapsulated by the underlying OS, so why can it handle persistent connections so much better than others?
What magic does it have?
Node.js makes all I/O asynchronous. It only runs in a single thread, but will do other requests or operations while waiting on I/O.
In contrast, classical web servers will not serve another request until the previous one is fully done. For this reason, Apache runs several processes at the same time; let's say there are 10 httpd processes, which normally means 10 requests can be served at any one time (*). If the processes take more time to complete, you will serve fewer requests, or you will have to spawn more processes, even if a process is doing nothing but waiting for the database to chew on and return data.
A node.js process, faced with a request that will go to the database, leaves the database to work while it goes to serve another request.
*) MPM makes this not quite true, but true enough for all intents and purposes.
Well, the thing is that most web servers (like Apache) work by thread spawning: they spawn a new thread for every incoming HTTP request. These threads are synchronous and blocking in nature, which means they execute the code in the order it is written, and any further computation is blocked by the current I/O or compute task. For example, if you want to listen for an event such as a chat submission by a chatter, you need a dedicated thread per user listening for this event (per user is necessary to maintain the persistent connection; there are a few possible optimization techniques, but you can still assume threads to be per user), and this thread will be blocked waiting for the event to happen. So any thread-spawning, blocking web server needs roughly one blocked thread per connected user.
JavaScript, on the other hand, is non-blocking (and conducive to asynchronous code) by nature: you register a callback for an event, and whenever the event occurs the callback function is executed. It does not block at any point waiting for this event.
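The "register a callback, get called when the event occurs" pattern can be sketched in a few lines. This is a toy written in Python for illustration; the `EventEmitter` class below is made up here and only mimics the general shape of the Node.js one. Nothing waits in a loop for the message: the handlers sit idle in a list until `emit` is called.

```python
from collections import defaultdict
from typing import Callable


class EventEmitter:
    """Toy event registry: handlers run only when their event is emitted."""

    def __init__(self) -> None:
        self._handlers = defaultdict(list)

    def on(self, event: str, handler: Callable[[str], None]) -> None:
        # Registering costs no thread; the handler just waits in a list.
        self._handlers[event].append(handler)

    def emit(self, event: str, payload: str) -> None:
        for handler in self._handlers[event]:
            handler(payload)


chat = EventEmitter()
received: list = []

# Two chatters register interest in new messages.
chat.on("message", lambda text: received.append(f"alice saw: {text}"))
chat.on("message", lambda text: received.append(f"bob saw: {text}"))

chat.emit("message", "hello")
print(received)
```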
You can find more about this by reading about non-blocking or asynchronous servers.