My express server does something very simple: it saves the request URL into a file (via fs.appendFile).
I suppose it works fine when not using pm2, because there is only one process, so no other process/thread is saving to the same file at the same time.
But when using pm2, could two processes end up writing to the same file at the same time? Thanks.
When you use pm2 in cluster mode, requests are routed using a round-robin algorithm. That means the cluster master accepts all incoming connections and routes them to the child processes (one request to one child process).
So, one request will be routed to one child process and the same request won't be processed by another process.
In your case, when you receive two different requests from two different clients, they will be handled by two different processes.
As long as you have logic to create a unique file name, even when requests are handled at the same time, you won't get any issues.
You will get issues only if two different processes try to write files with the same file name.
If you write different files for different clients, with different file names, it won't be an issue.
Note: since a request from one client is processed by exactly one process, two or more processes won't process the same request and won't write the same file twice.
The issue will occur only if requests from different clients write to the same file name.
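For example, here is a minimal sketch (the file name, port, and middleware are my own assumptions) where every pm2 worker appends to its own log file, so two processes never touch the same file:

const express = require('express');
const fs = require('fs');
const path = require('path');

const app = express();

// Each pm2 cluster worker is a separate process, so process.pid is unique
// per worker and every worker gets its own log file (file name is assumed).
const logFile = path.join(__dirname, `requests-${process.pid}.log`);

app.use((req, res, next) => {
  // Only this process ever appends to logFile, so writes from requests
  // handled by other workers cannot interleave into it.
  fs.appendFile(logFile, `${new Date().toISOString()} ${req.url}\n`, (err) => {
    if (err) console.error('could not log request', err);
  });
  next();
});

app.get('/', (req, res) => res.send('ok'));

app.listen(3000);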
Hope you understand :-)
Yes, it can get messy when multiple processes write/append to the same file at the same time. The best way is to have only one process write the file, or you have to synchronize them.
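One way to get a single writer, sketched below, is to manage the workers yourself with Node's core cluster module instead of pm2: the workers forward log lines over the IPC channel and only the primary process ever appends to the file (the file name and port are assumptions):

const cluster = require('cluster');
const fs = require('fs');
const http = require('http');
const os = require('os');

if (cluster.isMaster) {
  // The primary is the only process that ever opens requests.log.
  for (let i = 0; i < os.cpus().length; i++) {
    cluster.fork();
  }
  cluster.on('message', (worker, msg) => {
    if (msg && msg.type === 'log') {
      fs.appendFile('requests.log', msg.line + '\n', () => {});
    }
  });
} else {
  // Workers share the listening port and forward log lines over IPC.
  http.createServer((req, res) => {
    process.send({ type: 'log', line: `${Date.now()} ${req.url}` });
    res.end('ok');
  }).listen(3000);
}

The trade-off is that every log line crosses the IPC channel, which is fine for low-volume request logging.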
Related
I am using Cloud Run with the default settings (80 requests per instance) running a container with node and express.
The service needs to create a temporary file when a request is processed. I'm wondering: when multiple requests arrive at the same time, will they be processed concurrently? If the file is given the same name, could it be overwritten by another request before the first one is completed?
With Node, I don't think we have parallel processes, but I think there could still be a conflict unless Express handles the requests sequentially.
If you set max concurrency = 1, then you can use the same file name.
If you use max concurrency > 1, then you risk multiple requests conflicting over the file if they use the same filename. The best approach is to use a unique temporary filename for each request and to make sure it is deleted at the end.
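A rough sketch of that pattern (the route, the file contents, and the processing step are assumptions): each request creates its own uniquely named temp file and removes it when done:

const express = require('express');
const fs = require('fs/promises');
const os = require('os');
const path = require('path');
const crypto = require('crypto');

const app = express();

app.post('/process', async (req, res) => {
  // A random name per request means concurrent requests on the same
  // instance can never collide on the temp file.
  const tmpFile = path.join(os.tmpdir(), `job-${crypto.randomUUID()}.tmp`);
  try {
    await fs.writeFile(tmpFile, 'intermediate data for this request');
    // ... do the real work with tmpFile here ...
    res.send('done');
  } catch (err) {
    res.status(500).send('failed');
  } finally {
    await fs.rm(tmpFile, { force: true }); // always clean up, even on errors
  }
});

app.listen(process.env.PORT || 8080);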
I'm building a website where a user can upload a CSV file which contains certain IDs. My application will look up those IDs using AJAX requests. The number of IDs in the CSV file may be between 100 and 50,000.
I'm wondering whether it's best practice to spawn a new node.js process to handle each such CSV, because otherwise the event loop will be full of thousands of such AJAX requests and responses. So if the user triggers an AJAX request somewhere else on the website for some small task (not related to the CSV), that request will be queued behind potentially tens of thousands of other requests, which will result in a bad user experience.
Don't worry, Node.js can handle that many API requests.
If you're using pm2 to start processes, you can use pm2 start <entry-file.js> -i max to start as many processes as you have CPU cores.
To avoid a bad user experience, you can immediately send the user a response saying that the work is in progress, and notify them later when it is complete.
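A minimal sketch of that "acknowledge now, notify later" pattern (the route names, the in-memory job store, and the processIds helper are all made up for illustration):

const express = require('express');
const crypto = require('crypto');

const app = express();
app.use(express.json({ limit: '10mb' }));

const jobs = new Map(); // naive in-memory job store (assumption)

app.post('/csv/lookup', (req, res) => {
  const jobId = crypto.randomUUID();
  jobs.set(jobId, { status: 'in progress' });

  // Answer immediately so the browser is not stuck behind the long lookup.
  res.status(202).json({ jobId, status: 'in progress' });

  // The heavy work continues after the response has been sent.
  processIds(req.body.ids || [])
    .then((result) => jobs.set(jobId, { status: 'done', result }))
    .catch((err) => jobs.set(jobId, { status: 'failed', error: String(err) }));
});

// The client polls this route (or you push a notification instead).
app.get('/csv/lookup/:jobId', (req, res) => {
  res.json(jobs.get(req.params.jobId) || { status: 'unknown' });
});

async function processIds(ids) {
  // stand-in for the real per-ID lookups
  return { count: ids.length };
}

app.listen(3000);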
What are your thoughts?
I have a site that makes the standard data-bound calls, but also has a few CPU-intensive tasks which are run a few times per day, mainly by the admin.
These tasks involve grabbing data from the DB, running a few different time-consuming algorithms, then re-uploading the data. What would be the best way to make these calls and have them run without blocking the event loop?
I definitely want to keep the calculations on the server, so web workers wouldn't work here. Would a child process be enough, or should I have a separate thread running in the background handling all /api/admin calls?
The basic answer to this scenario in Node.js land is to use the core cluster module - https://nodejs.org/docs/latest/api/cluster.html
It is a convenient API that lets you:
easily launch worker node.js instances on the same machine (each instance has its own event loop)
keep a live communication channel for short messages between instances
This way, any work done in a child instance will not block your master's event loop.
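A minimal sketch of that setup (the port, route, and heavyCalculation placeholder are assumptions), where the primary only forks workers and relays messages while a worker does the heavy lifting:

const cluster = require('cluster');
const http = require('http');
const os = require('os');

if (cluster.isMaster) {
  // One worker per core; the primary's event loop only forks and listens
  // for short messages, so it is never blocked by the heavy work.
  for (let i = 0; i < os.cpus().length; i++) cluster.fork();

  cluster.on('message', (worker, msg) => {
    console.log(`worker ${worker.process.pid} reported:`, msg);
  });
} else {
  http.createServer((req, res) => {
    if (req.url === '/api/admin/recalculate') {
      // The CPU-heavy work blocks only this worker's event loop;
      // the other workers keep serving the normal data-bound calls.
      const result = heavyCalculation();
      process.send({ done: true, result });
      res.end('recalculated\n');
    } else {
      res.end('ok\n');
    }
  }).listen(3000);
}

function heavyCalculation() {
  let sum = 0;
  for (let i = 0; i < 1e8; i++) sum += i; // stand-in for the real algorithms
  return sum;
}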
I am developing an application that allows users to run AI algorithms on the server remotely. Some of these algorithms take a VERY long time. It is set up such that AJAX calls supply the algorithm parameters and launch a C++ algorithm on the server. The results and status of the computation are tracked via AJAX calls polling status files.
This solution seems to work well for multiple users concurrently using the service, but I am now looking for a way to cancel the computation from the user's browser. I have a stop button that stops the AJAX updating service and ceases any communication between the browser and the running process on the server. The problem is that the process still runs, and I would like to free up the server resources when the user cancels the operation. Below are some more details.
The web service that the AJAX calls hit runs under the user 'tomcat' and its processes can be listed with ps -U tomcat. The algorithm executions are all child processes of 'java' and can be listed with ps --ppid ###.
The browser keeps a record of the time that the current computation began (user system time, not server system time).
Multiple computations may be going on at once from users connected from different locations, resulting in many processes under the same name and parent process.
The RESTful service executes terminal commands via Java's Runtime.exec().
I am not very knowledgeable about shell scripting, so any help would be greatly appreciated. Can anyone think of a way to locate the process, either via the Java Process object or via a shell script/awk, by timestamp (maybe the closest timestamp to the user's system time?) or some other way?
Thanks in advance.
--edit
Is there even a way in Java to get a handle on a given process if you have the PID...? It doesn't seem like it.
--edit
I cannot change the source code of the long running process on the server. :(
Your AJAX call should manipulate some sort of resource (most conveniently a text file) that acts as a semaphore for the process; on every iteration of its polling loop, the process checks whether that semaphore file has been set to the stop status. If the AJAX call changes the semaphore file to 'stop', the process stops, because your application checks it and responds accordingly. This means the functionality needs to be programmed into your Java AI application, rather than figuring out what the PID is and then killing it at the OS level. That, of course, assumes you have access to the source code of the app.
Of course, the semaphore does not have to be a file; it can be a value in the DB, or whatever suits your taste and configuration.
I have finally found a secure solution. From the RESTful Java service, Process p = Runtime.getRuntime().exec(...) gives you a handle on the running process. The only way to get the PID, however, is through a technique called reflection.
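// Note: this reflection hack relies on the returned Process being a
// java.lang.UNIXProcess (Unix platforms), whose private int field "pid"
// holds the operating-system process id.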
Field f = p.getClass().getDeclaredField("pid");
f.setAccessible(true);
String pid = Integer.toString(f.getInt(p));
How unbelievably awkward...
Anyway, since p cannot be passed from the server to the client, and since it would be insecure to let a remote call kill an arbitrary server process by a PID passed as a parameter, the only logical strategy I could come up with was to write the obtained PID to a process-unique file named after the initial client timestamp, and to delete this file when the RESTful service call returns. This unique file can then be used as a termination handle by yet another RESTful service, which reads the file and terminates the process whose PID matches the file's contents.
You could keep the Process instance returned by Runtime.exec and invoke Process.destroy to kill the subprocess. Not knowing much about your web service application, I would assume you can keep the Process instances in a global session map that maps users to process lists. Make sure access to this map is thread-safe. Also, this only works if you have a single web service process, so that such a global session map can be shared across different requests.
Alternatively take a look at Get subprocess id in Java.
I need to create a multi-node web server that allows controlling the number of worker processes in real time and changing each process's UID and GID.
For example, at startup the server starts 5 workers and pushes them into a worker pool.
When the server gets a new request, it looks for a free worker, sets its UID or GID if needed, and hands it the request to process. If there are no free workers, the server creates a new one, sets its UID or GID, pushes it into the pool, and so on.
Can you suggest how this can be implemented?
I've tried this example, http://nodejs.ru/385, but it doesn't allow controlling the number of workers, so I decided there must be another solution, but I can't find it.
If you have any examples or links that will help me resolve this issue, please let me know.
I guess you are looking for this: http://learnboost.github.com/cluster/
I don't think cluster will do it for you.
What you want is to use one process per request.
Bear in mind that this can be very inefficient, and node is designed so you normally don't need this kind of per-request worker process, but if you really must do it, then you must do it.
On the other hand, node is very good at managing processes, so you will want to keep a process pool, which is easily accomplished with node's built-in child_process.spawn API.
Also, you will need a way to communicate with the worker process.
I suggest opening a unix-domain socket and sending the client connection's file descriptor over it, so you can delegate that connection to the new worker.
Also, you will need to handle edge cases such as timeouts.
I use https://github.com/pgte/fugue for this.
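If you would rather hand-roll the pool than use a library, here is a rough sketch; the pool size, the worker.js file, and the use of child_process.fork's built-in handle passing (instead of a hand-made unix-domain socket) are all my own assumptions:

const net = require('net');
const { fork } = require('child_process');

const POOL_SIZE = 5;
const idle = [];

function spawnWorker() {
  // fork() gives us an IPC channel to the child; uid/gid could be set via
  // the options object here if the parent runs with enough privileges.
  const worker = fork('./worker.js' /*, [], { uid: 1001, gid: 1001 } */);
  worker.on('message', (msg) => {
    if (msg === 'done') idle.push(worker); // the worker is free again
  });
  return worker;
}

for (let i = 0; i < POOL_SIZE; i++) idle.push(spawnWorker());

net.createServer({ pauseOnConnect: true }, (socket) => {
  // Grow the pool when every worker is busy.
  const worker = idle.pop() || spawnWorker();
  // Hand the raw client connection over to the chosen worker.
  worker.send('connection', socket);
}).listen(3000);

// worker.js (assumed) would look roughly like:
//   process.on('message', (msg, socket) => {
//     if (msg === 'connection') {
//       socket.end(`handled by ${process.pid}\n`);
//       process.send('done');
//     }
//   });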