How to solve this specific threading problem

Using the code from the following article, I implemented my own ThreadPool:
http://www.developer.com/net/article.php/3783756
This is what I want to achieve:
Triggered through a Timer, a service should query a database every 5 seconds for new jobs to execute. A Job is basically only the information about a commandline program that needs to be run with arguments.
Up to 50 or more of these programs should be able to be executed at the same time. A program can be running a couple of seconds, minutes and also hours. The service needs to be in control of these programs at all times, i.e. it must be able to kill a program on request for instance.
Using the ThreadPool implementation from above, I started to queue the programs to be executed and could see that the service did indeed execute them. No problem so far. However, the mechanism here works like this:
The ThreadPool creates a worker thread and starts it. Whenever a program is queued, the worker thread notices this and calls a delegate that essentially instantiates a System.Diagnostics.Process object and starts the external program. The thread is then finished with its work and would be able to start further programs. However... when there's no program to start, an idle timer makes the thread manager kill the thread and thus interrupt the Process that has been started.
This is not what I need. Does anyone here have an idea, how to handle the scenario I described better?

1) Why is the death of a thread in this process resulting in the death of the other, started process? If you answer that question, you'll have solved your problem.
2) That looks like a pretty lousy, and fairly naive, ThreadPool article. Check out Joe Duffy's series on a custom thread pool (part 1, part 2, and part 3). Code such as that is surprisingly intricate, and a significant maintenance burden on its own.
3) (The real answer) Why are you using a threadpool to do this at all? Your threadpool only ever uses one thread at a time, so why not just use a timer and trigger your main thread? Does your app do other work as well, such as a UI that you need to keep responsive?
Get rid of the threadpool, there's no need for it, it's making your life difficult, and threadpools in general are not designed to host long-running tasks. That's what individual threads are for. If you must trigger the timer on a separate thread, simply create a Thread to handle it and use this one, identical thread to spawn all your processes. You can then track your process state in one, central, sensible location. :)

Are you sure that a thread pool is the optimal way of handling this? Spawning a new thread for each process which will be mostly idle but has to be present until the process terminates seems like a waste of a thread to me.
I would implement all this with a single thread and a dictionary of processes. The thread would periodically query the database and all the processes to see what actions need to be done.
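A minimal sketch of the single-thread-plus-dictionary idea, written in Python for brevity rather than C#. The names ProcessSupervisor, run_forever and fetch_new_jobs are hypothetical; fetch_new_jobs stands in for the database query and is assumed to yield (job id, argument list) pairs.

```python
import subprocess
import time


class ProcessSupervisor:
    """Tracks child processes by job id from one controlling thread."""

    def __init__(self, max_running=50):
        self.max_running = max_running
        self.running = {}  # job id -> subprocess.Popen

    def start(self, job_id, argv):
        """Start a job unless the limit is reached; returns True if started."""
        if job_id in self.running or len(self.running) >= self.max_running:
            return False
        self.running[job_id] = subprocess.Popen(argv)
        return True

    def reap(self):
        """Drop finished processes and return {job id: exit code}."""
        finished = {}
        for job_id, proc in list(self.running.items()):
            code = proc.poll()  # None while the process is still running
            if code is not None:
                finished[job_id] = code
                del self.running[job_id]
        return finished

    def kill(self, job_id):
        """Kill a running job on request; returns False if it is unknown."""
        proc = self.running.pop(job_id, None)
        if proc is None:
            return False
        proc.kill()
        proc.wait()  # collect the exit status of the killed process
        return True


def run_forever(supervisor, fetch_new_jobs, poll_seconds=5.0):
    """Polling loop; fetch_new_jobs is a hypothetical database query."""
    while True:
        for job_id, argv in fetch_new_jobs():
            supervisor.start(job_id, argv)
        supervisor.reap()
        time.sleep(poll_seconds)
```

Because every process handle lives in one dictionary owned by one thread, no locking is needed, and a kill request only has to look up the job id.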

AFAIK, processes spawned by Process.Start will continue to run even if the thread calling Process.Start exits. The following code illustrates this. LongRunningApp.exe will continue to run after the main program exits:
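using System;
using System.Diagnostics;

class Program
{
    static void Main(string[] args)
    {
        Process p = new Process();
        // Verbatim string literal (@"...") so the backslash is not an escape.
        ProcessStartInfo psi = new ProcessStartInfo(@"C:\LongRunningApp.exe");
        psi.CreateNoWindow = true;
        psi.UseShellExecute = false;
        p.StartInfo = psi;
        p.Start();
        Console.ReadLine();
    }
}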

Related

Most optimal way to execute timed functions in parallel using node?

I'm trying to create a timed scheduler that can execute tasks in parallel. For example:
Let's say I'm trying to create a function that will do something after 10 seconds of being called. After calling Process_1(), it will be expected to run its intended functionality after 10 seconds.
But at the 5 second mark while Process_1() is waiting to be executed at the halfway point, I'm now calling Process_2() midway. So at the 10 seconds mark, Process_1() will execute its function and at the 15 seconds mark, Process_2() will execute its function.
I've tried using node-cron for this but it doesn't seem like it can schedule things in parallel. Thanks in advance!

Nodejs runs your Javascript in a single thread unless you explicitly create a WorkerThread (a Worker from the worker_threads module) and run some code in that. True parallel execution, where both jobs are running code that uses the CPU, will only be accomplished if you run each task in a WorkerThread or a child process to get it out of the main thread.
Let me repeat, true parallel execution requires more than one thread or process in nodejs and nodejs does not do that by default so you will have to create a WorkerThread or child_process.
So, if you have code that takes more than a few ms to do its work and you want it to run at a fairly precise time, then you can't count on the main Javascript thread to do that because it might be busy at that precise time. Timers in Javascript will run your code no earlier than the scheduled time, and when that scheduled time comes around, the event loop is ready to run them, but they won't actually run until whatever was running before finishes and returns control back to the event loop so the event loop can run the code attached to your timer.
So, if all you're mostly doing is I/O kind of work (reading/writing files or network), then your actual Javascript execution time is probably only milliseconds and nodejs can be very, very responsive to run your timers pretty close to "on time". But, if you have computationally expensive things that keep the CPU busy for much longer, then you can't count on your timers to run "on time" if you run that CPU-heavy stuff in the main thread.
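The "no earlier than the scheduled time" behaviour is not specific to Node. As an analogy only, here is a small sketch using Python's asyncio event loop, which is also single threaded. The function name measure_timer_delay is hypothetical, and time.sleep stands in for CPU-heavy work that blocks the loop's thread.

```python
import asyncio
import time


async def measure_timer_delay(timer_seconds, busy_seconds):
    """Return how long a timer really took while the loop was kept busy."""
    loop = asyncio.get_running_loop()
    fired = loop.create_future()
    start = time.monotonic()
    # Ask the event loop to run the callback after timer_seconds.
    loop.call_later(timer_seconds,
                    lambda: fired.set_result(time.monotonic() - start))
    # Stand-in for CPU-heavy work on the loop's own thread: nothing else,
    # including the timer callback, can run until this returns.
    time.sleep(busy_seconds)
    return await fired


if __name__ == "__main__":
    delay = asyncio.run(measure_timer_delay(0.05, 0.3))
    print(f"timer set for 0.05 s fired after {delay:.2f} s")
```

The timer is due after 0.05 seconds, but its callback cannot run until the blocking work hands control back to the event loop, so the measured delay is at least as long as the blocking work.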
What you can do, is start up a WorkerThread, set the timer in the WorkerThread and run your code in the worker thread. As long as you don't ask that WorkerThread to run anything else, it should be ready to run that timer pretty much "on time".
Now WorkerThreads do share some resources with the main thread so they aren't 100% independent (though they are close to independent). If you want 100% independence, then you can start a nodejs child process that runs a node script, sets its own timers and runs its own work in that other process.
All that said, the single threaded model works very, very well at reasonably high scale for code that is predominantly I/O code because nodejs uses non-blocking I/O so while it's waiting to read or write from file or network, the main thread is free and available to run other things. So, it will often give the appearance of running things in parallel because progress is being made on multiple fronts. The I/O itself inside the nodejs library is either natively non-blocking (network I/O) or is happening in an OS-native thread (file I/O) and the programming interface to Javascript is callback or promise based so it is also non-blocking.
I mention all this because you don't say what your two operations that you want to run in parallel are (including your actual code allows us to write more complete answers). If they are I/O or even some crypto, then they may already be non-blocking and you may achieve desired parallelism without having to use additional threads or processes.

Confused about threads

I'm studying threads in C and I have this theoretical question in mind that is driving me crazy. Assume the following code:
1) void main() {
2) createThread(...); // create a new thread that does "something"
3) }
After line 2 is executed, two paths of execution exist. However, I believe that immediately after line 2 is executed it doesn't even matter what the new thread created at line 2 does, because the original thread that executed line 2 will end the entire program at its next instruction. Am I wrong? Is there any chance the original thread gets suspended somehow and the new thread gets its chance to do something (assume the code as is, with no synchronization between threads and no join operations)?

It can work out either way. If you have more than one core, the new thread might get its own core. Even if you don't, the scheduler might give the new thread priority over the existing one. The original thread might exhaust its timeslice right after it creates a new thread.
So that code creates a race condition -- one thread is trying to do work, another thread is trying to terminate the process. Which one wins will depend on the threading implementation, the hardware, and perhaps even some random chance.

If main() returns before the spawned threads finish, all of those threads are terminated, because returning from main() ends the whole process.
Calling pthread_exit() at the end of main() ends only the main thread; the process stays alive until the threads it created complete execution.
You can learn more about this here: https://computing.llnl.gov/tutorials/pthreads/

Assuming you are using POSIX pthreads (not clear from your example) then you are right. If you don't want that then indeed pthread_exit from main will mean the program will continue to run until all the threads finish. The "main thread" is special in this regard, as its exit normally causes all threads to terminate.
More typically, you'll do something useful in the main thread after a new thread has been forked. Otherwise, what's the point? So you'll do your own processing, wait on some events, etc. If you want main (or any other thread) to wait for a thread to complete before proceeding, you can call pthread_join() with the handle of the thread of interest.
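As an illustration of the join pattern, here is a minimal sketch in Python rather than C. Python's threading module is not pthreads (for one thing, non-daemon Python threads keep the process alive even without a join), so treat this only as an analogy for pthread_join(). The helper name run_and_wait is hypothetical.

```python
import threading


def run_and_wait(work):
    """Run work() on a new thread and block until it has finished."""
    results = []
    worker = threading.Thread(target=lambda: results.append(work()))
    worker.start()
    # ... the calling thread could do its own processing here ...
    worker.join()  # analogous to pthread_join(): wait for the worker
    return results[0]  # raises IndexError if work() raised in the thread
```

For example, run_and_wait(lambda: 6 * 7) returns 42 only after the worker thread has completed.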
All of this may be beside the point, however, since you are not explicitly using POSIX threads in your example, so I don't know whether that's pseudo-code for the purpose of example or literal code. In Windows, CreateThread has different semantics from POSIX pthreads. However, you didn't use that capitalization for the call in your example, so I don't know if that's what you intended either. Personally I use the pthreads-win32 library even on Windows.

Detecting low user activity and checking email on background

I'm writing an application that must do some things in background: check emails and parse them to inject some data in a database, and connect to a web service to check status for some asynchronous operations.
Right now, my solution is a simple timer that performs these operations on a predefined schedule: email every five minutes, and web service checks every minute (but these are only performed if there is pending activity, so most of the time this does nothing.)
Right now I'm not using a thread for this (I'm early in the development stage.) But my plan is to create a background thread and let it do the work offline.
Couple of questions:
I plan to control everything in the timer(s). Set up a global variable (rudimentary "locking",) start the thread. If the "lock" is already set, ignore it. The thread cleans it up on termination. Should I use a more robust locking / queue mechanism for my threads? (I already have OmniThread installed)
How can I run a thread with low priority? I don't want the application to feel sluggish when the background thread is performing data insertion or networking.
Is there a clean way to verify for user activity and start this thread only when the user is not busy at the keyboard / mouse?
Please have in mind that I'm not experienced with threads. I wrote an FTP sync application once so I'm not a complete newbie, but that was long time ago.

For part 3 of your question, the Windows API has a GetLastInputInfo function which should return information about the last time the user did something. The documentation even says "This function is useful for input idle detection". I did plan to use this for something myself, but haven't had a chance to test it.
Edit: Delphi implementation link
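An untested-on-your-setup sketch of calling GetLastInputInfo, written in Python with ctypes rather than Delphi. The function name idle_seconds is hypothetical, and returning None on non-Windows platforms is an assumption made so the snippet can be imported anywhere.

```python
import ctypes
import sys


class LASTINPUTINFO(ctypes.Structure):
    # Mirrors the Win32 LASTINPUTINFO struct: UINT cbSize; DWORD dwTime.
    _fields_ = [("cbSize", ctypes.c_uint), ("dwTime", ctypes.c_uint32)]


def idle_seconds():
    """Seconds since the last keyboard/mouse input, or None off Windows."""
    if sys.platform != "win32":
        return None
    info = LASTINPUTINFO()
    info.cbSize = ctypes.sizeof(LASTINPUTINFO)
    if not ctypes.windll.user32.GetLastInputInfo(ctypes.byref(info)):
        raise ctypes.WinError()
    get_tick_count = ctypes.windll.kernel32.GetTickCount
    get_tick_count.restype = ctypes.c_uint32
    # Both values are 32-bit millisecond tick counts that wrap around,
    # so the difference is masked back into the 32-bit range.
    elapsed_ms = (get_tick_count() - info.dwTime) & 0xFFFFFFFF
    return elapsed_ms / 1000.0
```

A background job could then be skipped whenever idle_seconds() is below some threshold, although the other answers argue that this check is rarely worth the effort.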

I plan to control everything in the timer(s). Set up a global variable (rudimentary "locking",) start the thread. If the "lock" is already set, ignore it. The thread cleans it up on termination. Should I use a more robust locking / queue mechanism for my threads? (I already have OmniThread installed)
I wouldn't bother with the Timer at all. Make your thread's loop look like this, and you'll have your delays. You will NOT need a lock because there's only one thread, and it will not sleep until the previous job is over.
procedure YourThread;
var N: Integer;
begin
  while not Terminated do
  begin
    // Figure out if there's a job to do
    // Do the job
    // Sleep for a while, but give the thread a chance to notice
    // it needs to terminate.
    for N := 1 to 500 do
      if not Terminated then
        Sleep(100);
  end;
end;
How can I run a thread with low priority? I don't want the application to feel sluggish when the background thread is performing data insertion or networking.
Don't bother. You can easily use SetThreadPriority but it's not worth the trouble. If your background thread is waiting for I/O (networking), then it will not consume any CPU resource. Even if your background thread works full-speed, your GUI will not feel sluggish because Windows does a good job of splitting available CPU time among all available threads.
Is there a clean way to verify for user activity and start this thread only when the user is not busy at the keyboard / mouse?
Again, why bother checking for user activity? Checking for email is network (ie: I/O) bound, the thread checking for email will mostly be idle.

Can you not just do all this in the background thread, getting rid of all the thread micro-management? Seems to me that you could just loop around a sleep(60000) call in the background thread. Check the web service every time round the loop, check the email every 5 times round. You can set the priority to tpLower, if you want, but this thread is going to be sleeping or blocked on I/O nearly all the time, so I don't think it's even worth the typing.
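The loop described above can be sketched as follows, in Python rather than Delphi. BackgroundPoller, check_service and check_email are hypothetical names; the two callbacks stand in for the web service and email checks. An Event is used instead of a plain sleep so that a stop request is noticed immediately.

```python
import threading


class BackgroundPoller:
    """Calls check_service every round and check_email every Nth round."""

    def __init__(self, check_service, check_email,
                 interval=60.0, email_every=5):
        self._check_service = check_service
        self._check_email = check_email
        self._interval = interval
        self._email_every = email_every
        self._stop_requested = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def _run(self):
        rounds = 0
        while not self._stop_requested.is_set():
            self._check_service()
            if rounds % self._email_every == 0:
                self._check_email()
            rounds += 1
            # Sleeps for the interval, but returns early once stop() is called.
            self._stop_requested.wait(self._interval)

    def stop(self):
        self._stop_requested.set()
        self._thread.join()
```

With the default arguments the service is checked once a minute and the email every five minutes, and no lock is needed because only this one thread does the work.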
I would be surprised if such a thread is noticeable at all to the user at the keyboard/mouse, no matter when it runs.
'Set up a global variable (rudimentary "locking",) start the thread' - what is this global variable intended to do? What is there to lock?

writing a thread(educational purpose)

Sorry if this is a duplicate...
I have a task to write a thread. The question is what a good thread class should contain. I looked through the Java implementation and some others, but since it is just an educational project, I wouldn't want to make it too complex. If you can tell me or point me to a source which contains the required information, I would be very grateful.

A simple thread class, together with a ThreadManager class for easier management of multiple threads, consists of the following.
Thread class:
Constructor
Function to execute the thread
Check whether the thread is running and process the thread's output, if any is present. Returns TRUE if the thread is still executing, FALSE if it has finished.
Wait until the thread exits
ThreadManager class:
Constructor
Add an existing thread to the manager queue.
Remove a thread from the manager queue.
Process all threads. Returns the number of threads that are still running.
Create and start a new thread. Returns the ID assigned to the thread or FALSE on error.
Remove a finished thread from the internal queue and return it. Returns FALSE if there are no threads that have completed execution.
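One possible shape for the two classes listed above, as a minimal sketch in Python. The names Worker and WorkerManager and all method names are hypothetical, and the sketch returns None where the list says FALSE. It builds on the standard threading.Thread rather than implementing threads from scratch.

```python
import threading


class Worker:
    """Wraps one thread and remembers the result of its target function."""

    def __init__(self, target, *args):
        self.result = None
        self._thread = threading.Thread(target=self._run, args=(target, args))

    def _run(self, target, args):
        self.result = target(*args)

    def start(self):
        self._thread.start()

    def is_running(self):
        # Also False for a worker that has not been started yet.
        return self._thread.is_alive()

    def wait(self, timeout=None):
        """Block until the thread exits; returns True if it has finished."""
        self._thread.join(timeout)
        return not self.is_running()


class WorkerManager:
    """Keeps workers in a dictionary keyed by an assigned id."""

    def __init__(self):
        self._workers = {}
        self._next_id = 1

    def add(self, worker):
        worker_id = self._next_id
        self._next_id += 1
        self._workers[worker_id] = worker
        return worker_id

    def create(self, target, *args):
        """Create and start a new worker; returns its id."""
        worker = Worker(target, *args)
        worker.start()
        return self.add(worker)

    def remove(self, worker_id):
        return self._workers.pop(worker_id, None)

    def running_count(self):
        return sum(1 for w in self._workers.values() if w.is_running())

    def pop_finished(self):
        """Remove and return one finished worker, or None if there is none."""
        for worker_id, worker in list(self._workers.items()):
            if not worker.is_running():
                del self._workers[worker_id]
                return worker
        return None
```

The manager is meant to be used from a single controlling thread, which is why it does no locking of its own.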

At the highest level of abstraction you can think about a thread as a combination of:
A finite-state machine to represent the thread's state
A queue of tasks to process
A scheduler which can manage threads (start, pause, notify etc.). The scheduler can be an OS-level scheduler or some custom scheduler, for example at the VM level, the so-called "green threads".
To be more specific, I would recommend looking at the Erlang VM. The sources are available online and you can go through their implementation of "green threads", which are extremely lightweight.
Erlang Downloads

Tell if 'elapsed' event thread is still running?

Given a System.Timers.Timer, is there a way from the main thread to tell if the worker thread running the elapsed event code is still running?
In other words, how can one make sure the code running in the worker thread is not currently running before stopping the timer or the main app/service thread the timer is running in?
Is this a matter of ditching Timer for threading timer using state, or is it just time to use threads directly?

Look up ManualResetEvent, as it is made to do specifically what you're asking for.
Your threads create a new reset event, and add it to an accessible queue that your main thread can use to see if any threads are still running.
// main thread owns this
private List<ManualResetEvent> _resetEvents;
...
// main thread does this to wait for executing threads to finish
WaitHandle.WaitAll(_resetEvents.ToArray(), 2000, false);
...
// worker threads do this to signal the thread is done
myResetEvent.Set();
I can give you more sample code if you want, but I basically just copied it from the couple articles I read when I had to do this a year ago or so.
Forgot to mention, you can't add this functionality to the default threads you'll get when your timer fires. So you should make your timer handler be very lean and do nothing more than prepare and start a new worker thread.
...
ThreadPool.QueueUserWorkItem(new WaitCallback(MyWorkerDelegate),
myCustomObjectThatContainsAResetEvent);

For the out of the box solution, there is no way. The main reason is the thread running the TimerCallback function is in all likelihood still alive even if the code running the callback has completed. The TimerCallback is executed by a Thread out of the ThreadPool. When the task is completed the thread does not die, but instead goes back into the queue for the next thread pool task.
In order to get this to work you're going to have to use some form of thread-safe signalling to detect that the operation has completed.
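The signalling idea can be sketched in Python, where threading.Event plays a role similar to ManualResetEvent. This is an analogy, not .NET code: Python has no WaitHandle.WaitAll, so the hypothetical helper wait_all below waits on each event in turn against a shared deadline. start_worker is also a hypothetical name.

```python
import threading
import time


def start_worker(work):
    """Run work() on a new thread; returns an Event set when it is done."""
    done = threading.Event()

    def run():
        try:
            work()
        finally:
            done.set()  # signal completion even if work() raised

    threading.Thread(target=run).start()
    return done


def wait_all(events, timeout):
    """Return True if every event is set before the timeout expires."""
    deadline = time.monotonic() + timeout
    for event in events:
        remaining = max(0.0, deadline - time.monotonic())
        if not event.wait(remaining):
            return False
    return True
```

The main thread keeps the list of events returned by start_worker and calls wait_all on it before stopping the timer or shutting down.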
Timer Documentation
