Long-running process remote termination? (Linux)

I am developing an application that allows users to run AI algorithms on the server remotely. Some of these algorithms take a VERY long time. It is set up such that AJAX calls supply the algorithm parameters and launch a C++ algorithm on the server. The results and status of the computation are tracked via AJAX calls polling status files. This solution seems to work well for multiple users concurrently using the service, but I am now looking for a way to cancel the computation from the user's browser. I have a stop button that stops the AJAX updating service and ceases any communication between the browser and the running process on the server. The problem is that the process still runs, and I would like to free up the server resources when the user cancels the operation. Below are some more details.
The web service that the AJAX calls hit runs under the user 'tomcat'; its processes can be listed with ps -U tomcat. The algorithm executions are all child processes of 'java' and can be listed with ps --ppid ###.
The browser keeps a record of the time that the current computation began (user system time, not server system time).
Multiple computations may be going on at once from users connected from different locations, resulting in many processes under the same name and parent process.
The RESTful service executes terminal commands via Java's Runtime.exec().
I am not very knowledgeable about shell scripting, so any help would be greatly appreciated. Can anyone think of a way to locate a process via timestamp (maybe the closest timestamp to the user's system time..?) using either the Java Process object or a shell script/awk, or some other way?
Thanks in advance.
--edit
Is there even a way in Java to get a handle for a given process if you have the PID...? Doesn't seem like it.
--edit
I cannot change the source code of the long running process on the server. :(

Your AJAX call should manipulate some sort of resource (most conveniently a text file) that acts as a semaphore for the process, which in every iteration checks whether that semaphore file has been set to the stop status. If the AJAX call changes the semaphore file to stop, then the process stops because your application checks it and responds accordingly. That in turn means the functionality needs to be programmed into the AI application itself rather than figuring out what the PID is and then killing it at the OS level. That, of course, assumes you have access to the source code of the app.
Of course, the semaphore does not have to be a file; it can be a value in the DB, etc., whichever suits your taste and configuration.
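A minimal sketch of that idea, assuming the long-running loop can be modified and polls a per-job stop file (the file name and location are illustrative):
import java.nio.file.Files;
import java.nio.file.Path;

public class StoppableJob {
    public static void main(String[] args) throws Exception {
        String jobId = args.length > 0 ? args[0] : "demo";
        // Per-job semaphore file; the AJAX stop handler creates it.
        Path stopFlag = Path.of("/tmp/job-" + jobId + ".stop");
        while (!Files.exists(stopFlag)) {
            doOneIteration(); // one unit of the long computation
        }
        Files.deleteIfExists(stopFlag); // acknowledge the stop and exit cleanly
    }

    private static void doOneIteration() throws InterruptedException {
        Thread.sleep(100); // placeholder for real work
    }
}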

I have finally found a secure solution. In the RESTful Java service, Process p = Runtime.getRuntime().exec(command) gives you a handle on the running process. The only way, however, to get the PID is through reflection.
import java.lang.reflect.Field;
// On UNIX-like JVMs the concrete class is java.lang.UNIXProcess,
// which stores the process id in a private int field named "pid".
Field f = p.getClass().getDeclaredField("pid");
f.setAccessible(true);
String pid = Integer.toString(f.getInt(p));
How unbelievably awkward...
Anyway, since passing p from the server to the client is impossible, and since letting a remote call kill an arbitrary server process by a PID passed as a parameter would be insecure, the only logical strategy I could come up with was to write the obtained PID to a process-unique file named after the initial client timestamp, and to delete this file when the RESTful service method returns. That unique file then serves as a termination handle: yet another RESTful service reads the file and terminates the process whose PID matches the file's contents.
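A minimal sketch of that second service; the PID file location and naming are assumptions, not part of the original setup:
import java.nio.file.Files;
import java.nio.file.Path;

public class JobTerminator {
    // Hypothetical termination endpoint: read the PID file written at launch
    // and ask the OS to terminate that process.
    public static void stopJob(String clientTimestamp) throws Exception {
        Path pidFile = Path.of("/var/run/myapp/" + clientTimestamp + ".pid"); // assumed location
        String pid = Files.readString(pidFile).trim();
        // Send SIGTERM by default so the process can clean up; escalate only if needed.
        Runtime.getRuntime().exec(new String[] { "kill", pid }).waitFor();
        Files.deleteIfExists(pidFile);
    }
}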

You could keep the Process instance returned by Runtime.exec and invoke Process.destroy to kill the subprocess. Not knowing much about your web service application, I would assume you can keep the process instances in a global session map that maps users to process lists. Make sure access to this map is thread-safe. Also, this only works if you have a single web service process, which allows such a global session map to be shared across different requests.
Alternatively, take a look at Get subprocess id in Java.
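A rough sketch of such a session map, assuming a single JVM; the class and method names are hypothetical:
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// One JVM-wide, thread-safe map from user id to that user's running computations.
final class ProcessRegistry {
    private static final Map<String, List<Process>> RUNNING = new ConcurrentHashMap<>();

    static void register(String user, Process p) {
        RUNNING.computeIfAbsent(user, u -> new CopyOnWriteArrayList<>()).add(p);
    }

    static void cancelAll(String user) {
        List<Process> procs = RUNNING.remove(user);
        if (procs != null) {
            procs.forEach(Process::destroy); // sends SIGTERM on UNIX-like systems
        }
    }
}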

Related

Node.js API that runs a script executing continuously in the background

I need to build a Node.js API that, for each user that calls it, starts running a piece of code (a simple script that sets up a Telegram client, listens for new messages and performs a couple of tasks) that then continuously runs in the background.
My ideas so far have been a) launching a new child process for each API call and b) automatically deploying the script to the cloud on each call.
I assume the first idea wouldn't be scalable; as for the second, I have no experience in the matter.
I have searched dozens of keywords and haven't found anything relevant so far. Is there any handy way to implement this? In which direction should I search?
I look forward to any hints.
I'm not a Node dev, but as a programmer you can do something like this:
when the user is active, it calls a function;
this function must count the seconds that have passed to match the 24h (86400 seconds == 24 hours) and do the tasks;
when the time matches, the program stops.
Node.js is nothing more than an event loop (libuv) whose execution stack runs on V8 (JavaScript). The process will keep running until the stack is empty.
Keep in mind that there is only one thread executing your code (the event loop) and everything will happen as callbacks.
As long as you set up your Telegram client with some listeners, Node.js will wait for new messages and execute the related listener.
Just instantiate a new client on each API call and listen to it; there is no need to spawn a new process.
Anyway, you'll eventually run out of memory if you don't limit the number of parallel clients or if you don't close them after some time (e.g. using setInterval()).

ArangoDB Foxx application poor performance

I have a serious issue with a custom Foxx application.
About the app
The application is a customized algorithm for finding paths in a graph, optimized for public transport. On init it loads all necessary data into a JavaScript variable and then traverses it; that is faster than accessing the DB each time.
The issue
When I access the application through the API for the first time, it is fast, e.g. 300 ms. But when I make the exact same request a second time, it is very slow, e.g. 7000 ms.
Can you please help me with this? I have no idea where to look for bugs.
Without knowing more about the app & the code, I can only speculate about reasons.
Potential reason #1: development mode.
If you are running ArangoDB in development mode, then the init procedure is run for each Foxx route request, making precalculation of values useless.
You can spot whether or not you're running in development mode by inspecting the arangod logs. If you are in development mode, there will be a log message about that.
Potential reason #2: JavaScript variables are per thread
You can run ArangoDB and thus Foxx with multiple threads, each having thread-local JavaScript variables. If you issue a request to a Foxx route, then the server will pick a random thread to answer the request.
If the JavaScript variable is still empty in this thread, it may need to be populated first (this will be your init call).
For the next request, again a random thread will be picked for execution. If the JavaScript variable is already populated in this thread, then the response will be fast. If the variable needs to be populated, then response will be slow.
After a few requests (at least as many as configured via the --server.threads startup option), the JavaScript variables in each thread should have been initialized and the response times should be the same.

How to run cronjobs on local files on cloudControl PaaS?

On cloudControl, I can either run a local task via a worker or run a cronjob.
What if I want to perform a local task on a regular basis (I don't want to call a publicly accessible website)?
I see two possible solutions:
According to the documentation,
"cronjobs on cloudControl are periodical calls to a URL you specify."
So calling the file locally is not possible(?). I'd have to create a page I can call via URL, and I'd have to check whether the client is on localhost (= the server) -- I would like to avoid this approach.
I make the worker sleep() for the desired amount of time and then make it re-run.
// do some arbitrary action
Foo::doSomeAction();
// e.g. sleep 1 day
sleep(86400);
// restart worker
exit(2);
Which one is recommended?
(Or: Can I simply call a local file via cron?)
The first option is not possible, because the URL request is made from a separate web service.
You could use HTTP authentication in the cron task, but the worker solution is also completely valid.
Just keep in mind that the worker can get migrated to a different server (in case of software updates or hardware failure), so doSomeAction() may occasionally get executed more often than once per day.

General question about parallel threading in C++

I haven't used threading in my programs before, but there is a problem I am having with this 3rd-party application.
It is an offsite backup solution with a server and many clients. We have an admin console to manage all the clients, and that is where the problem is.
If one of the client-side applications gets stuck, or is running in a broken state, the admin console waits forever for a response and does not display anything.
for (client = client1; client < last_client; client++) {
    if (getOServConnection(client, &socHandler) != NULL) { .. }
}
I want two things here. First, I want to know if there is any way I can set a timeout for the function getOServConnection, so that I get a response within X seconds.
Second, I want to know how to call this function in parallel for all clients, so that I get responses from all clients within X seconds.
getOServConnection contains a WSAConnect call, and I don't want to set any options on the socket, since it is used by other modules and that would affect the application severely.
First: if you move the call that hangs into a separate thread, you can use the main thread to start a timer and wait for the timeout. If you are using Visual C++ on Win32, you can use the (rather old) MFC-based timer. Once this timer expires it will invoke a function called OnTimer. This timer does not affect your application's main thread, as it works in a separate system-managed thread.
Second: if you need to start any number of threads with that connection, you should start thinking about a design pattern to use for that. You could use a fixed number of threads, in which case you may want to use an object pool. Or, if the number of threads is (relatively) limitless, you may want to use a factory method.
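The first point is language-agnostic; here is a minimal sketch of the run-in-a-worker-with-timeout pattern, shown in Java for consistency with the other examples here. The connect call is simulated and all names are stand-ins:
import java.util.*;
import java.util.concurrent.*;

public class TimeoutDemo {
    // Stand-in for the blocking connect call (hypothetical).
    static String connect(int client) throws InterruptedException {
        // Simulate one hung client (e.g. a broken client-side application).
        Thread.sleep(client == 3 ? 60_000 : 500);
        return "conn-" + client;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        Map<Integer, Future<String>> futures = new LinkedHashMap<>();
        for (int c = 1; c <= 4; c++) {
            final int client = c;
            futures.put(client, pool.submit(() -> connect(client))); // one task per client
        }
        for (Map.Entry<Integer, Future<String>> e : futures.entrySet()) {
            try {
                // Wait at most 2 seconds for this client, then give up on it.
                System.out.println(e.getValue().get(2, TimeUnit.SECONDS));
            } catch (TimeoutException te) {
                e.getValue().cancel(true); // interrupt the hung worker
                System.out.println("client " + e.getKey() + " timed out");
            }
        }
        pool.shutdown();
    }
}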

Node.js multi-node web server

I need to create a multi-node web server that allows controlling the number of nodes in real time and changing the process UID and GID.
For example, at startup the server starts 5 workers and pushes them into a worker pool.
When the server gets a new request, it searches for a free worker, sets the UID or GID if needed, and hands it the request to process. If there are no free workers, the server creates a new one, sets its GID or UID, pushes it into the pool, and so on.
Can you suggest how this can be implemented?
I've tried this example, http://nodejs.ru/385, but it doesn't allow controlling the number of workers, so I decided there must be another solution, but I can't find it.
If you have any examples or links that would help me resolve this issue, please write.
I guess you are looking for this: http://learnboost.github.com/cluster/
I don't think cluster will do it for you.
What you want is to use one process per request.
Keep in mind that this can be very inefficient, and Node is designed to work around these types of worker processing, but if you really must do it, then you must do it.
On the other hand, Node is very good at handling processes, so you need to keep a process pool, which is easily accomplished using Node's internal child_process.spawn API.
Also, you will need a way for you to communicate to the worker process.
I suggest opening a unix-domain socket and sending the client connection's file descriptor, so you can delegate that connection to the new worker.
Also, you will need to handle edge-cases for timeouts, etc.
I use this: https://github.com/pgte/fugue
