I am experimenting with threads in Perl. The following code creates n threads and assigns the same function to each of them (which they should execute in parallel).
Twist: the function just prints something, which means the threads can't really do it in parallel. I am honestly fine with that since I am just starting out with threads; however, not all threads seem to finish. I suppose this is because I haven't locked STDOUT, so some conflicts occur, but that may not be the reason. In any case, a different number of threads fail to finish each time.
If I am correct, how can I lock STDOUT? (I get an error when I try to use the lock function.)
If I am wrong, why are not all threads finishing, and how can I fix that?
The code:
use strict;
use threads ('yield',
'stack_size' => 64*4096,
'exit' => 'threads_only',
'stringify');
use threads::shared;
sub PrintTestMessage()
{
print "Hello world\n";
}
my @t;
push @t, threads->new(\&PrintTestMessage) for 1..10;
I get "Hello world" 10 times; however, when the program finishes I get different output on each run:
Perl exited with active threads:
1 running and unjoined
9 finished and unjoined
0 running and detached
Perl exited with active threads:
8 running and unjoined
2 finished and unjoined
0 running and detached
Perl exited with active threads:
5 running and unjoined
5 finished and unjoined
0 running and detached
Why haven't all threads finished? (The "unjoined" part is expected, because I never join them in the code.)
You have to join the threads; otherwise the main thread can (as in your example) finish before its child threads:
$_->join for @t;
From perldoc threads,
$thr->join()
This will wait for the corresponding thread to complete its execution. When the thread finishes, ->join() will return the return value(s) of the entry point function.
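Putting the question's code and the join together, a minimal sketch might look like the following. The names print_test_message, run_workers and $print_lock are illustrative, not from the original post. The sketch also touches on the locking question: lock() from threads::shared only works on shared variables, not on filehandles such as STDOUT, so a shared scalar stands in as a mutex here. That lock is only an assumption about how one might keep lines from interleaving; it is not what makes the threads finish.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

# Hypothetical shared scalar used only as a mutex around print.
# lock() accepts shared variables, not filehandles like STDOUT.
my $print_lock :shared;

sub print_test_message {
    my ($id) = @_;
    {
        lock($print_lock);    # released when this block ends
        print "Hello world from thread $id\n";
    }
    return $id;
}

sub run_workers {
    my ($n) = @_;
    my @t = map { threads->create(\&print_test_message, $_) } 1 .. $n;
    # join() blocks until the thread finishes and returns its return value.
    return map { $_->join } @t;
}

my @results = run_workers(10);
print "Joined ", scalar(@results), " threads\n";
```

Because every thread is joined before the main thread ends, the "Perl exited with active threads" message should no longer appear.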
Related
I am using queues in the context of processes and threads, and there is a constellation with strange behavior which I do not understand.
The queue is defined like this: multiprocessing.Queue(1)
First Constellation (works fine)
initiate queue
Process 1 starts Process 2 (queue in parameters)
Process 1 starts Thread
2 loops (one inside the Thread, one in Process 2) are communicating via the queue: everything works fine
Second Constellation (does not work)
Process 1 starts Process 2
inside Process 2 queue is initiated
Process 2 starts Process 3 (queue in parameters)
Process 2 starts Thread
2 loops (one inside the Thread, one in Process 3) are communicating via the queue:
--> Loop in Thread in Process2 says always queue is full (wants to put something in the queue)
--> Loop in Process 3 says always queue is empty
Third Constellation (works fine)
Process 1 starts Process 2
inside Process 2 queue is initiated
Process 2 starts Process 3 (queue in parameters)
2 loops (one in Process 2, one in Process 3) are communicating via the queue: everything works fine
I do not understand what the problem with the second constellation is. I read that this could happen if one first starts the thread and then the child process. But I start Process 3 first and then start the Thread in Process 2. And if the thread is the problem, why does constellation 1 work?
I am trying to get into Perl's use of threads. Reading the documentation I came across the following code:
use threads;
my $thr = threads->create(\&sub1); # Spawn the thread
$thr->detach(); # Now we officially don't care any more
sleep(15); # Let thread run for awhile
sub sub1 {
my $count = 0;
while (1) {
$count++;
print("\$count is $count\n");
sleep(1);
}
}
The goal, it seems, would be to create one thread running sub1 for 15 seconds, and in the meantime print some strings. However, I don't think I understand what's going on at the end of the programme.
First of all, detach() is defined as follows:
Once a thread is detached, it'll run until it's finished; then Perl
will clean up after it automatically.
However, when does the subroutine finish? while(1) never finishes. Nor do I find any information in sleep() that it'd cause to break a loop. On top of that, from the point we detach we are 'waiting for the script to finish and then clean it up' for 15 seconds, so if we are waiting for the subroutine to finish, why do we need sleep() in the main script? The position is awkward to me; it suggests that the main programme sleeps for 15 seconds. But what is the point of that? The main programme (thread?) sleeps while the sub-thread keeps running, but how is the subroutine then terminated?
I guess the idea is that after the sleep is done, the subroutine ends, after which we can detach/clean up. But how is this syntactically clear? Where in the definition of sleep is it said that sleep terminates a subroutine (and why), and how does it know which one to terminate if there is more than one thread?
All threads end when the program ends. The program ends when the main thread ends. The sleep in the main thread merely keeps the program running for a short time, after which the main thread (and therefore the program, and therefore all created threads) also ends.
So what's up with detach? It just says "I'm never going to bother joining to this thread, and I don't care what it returns". If you don't either detach a thread or join to it, you'd get a warning when the program ends.
Detaching a thread means "I don't care any more", which in practice means that when your process exits, the thread will error and terminate.
Practically speaking - I don't think you ever want to detach a thread in perl - just add a join at the end of your code, so it can exit cleanly, and signal it via a semaphore or Thread::Queue in order to terminate.
$_ -> join for threads -> list;
Will do the trick.
That code example - in my opinion - is a bad example. It's just plain messy to sleep so a detached thread has a chance to complete, when you could just join and know that it's finished. This is especially true of Perl threads: it's deceptive to assume they're lightweight and can therefore be trivially started (and detached). If you're ever spawning so many that the overhead of joining them is too high, then you're using Perl threads wrong, and probably should fork instead.
You're quite right - the thread will never terminate, and so your code will always have a 'dirty' exit.
So instead I'd rewrite:
#!/usr/bin/perl
use strict;
use warnings;
use threads;
use threads::shared;
my $run : shared;
$run = 1;
sub sub1 {
my $count = 0;
while ($run) {
$count++;
print("\$count is $count\n");
sleep(1);
}
print "Terminating\n";
}
my $thr = threads->create( \&sub1 ); # Spawn the thread
sleep(15); # Let thread run for awhile
$run = 0;
$thr->join;
That way your main thread signals the worker to say "I'm done" and waits for it to finish its current loop.
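The answer also mentions Thread::Queue as a way to tell a thread to terminate. A rough sketch of that variant follows. The helper name run_counter and the sample items are made up for illustration, and it assumes a Thread::Queue recent enough to provide end() (3.01 or later). The worker loops on dequeue(), which returns undef once the queue has been ended and drained.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Hypothetical helper: feeds items to one worker thread, then shuts it down.
sub run_counter {
    my ($items) = @_;
    my $q = Thread::Queue->new();

    my $thr = threads->create(sub {
        my $count = 0;
        # dequeue() blocks until an item arrives; after end() is called and
        # the queue is empty it returns undef, which stops the loop.
        while (defined(my $item = $q->dequeue())) {
            $count++;
            print "got $item (\$count is $count)\n";
        }
        print "Terminating\n";
        return $count;
    });

    $q->enqueue($_) for @$items;
    $q->end();              # no more work: lets the worker leave its loop
    return $thr->join();    # wait for the worker to exit cleanly
}

my $processed = run_counter([qw(a b c)]);
print "Worker processed $processed items\n";
```

Compared with the shared $run flag, the queue also carries the work itself, so the worker blocks instead of polling with sleep.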
I have a Perl program that spawns several threads. Each thread processes some task (by firing off other system commands, etc.) and then, when it's all done, waits.
Once all threads are done, they signal the parent. The parent then loads up new jobs and signals the threads to go work on these new tasks.
So ideally, this program would run forever.
Now, if I kill it on the command line with kill -9 MainProgram.pl, it's not killed! I see the output of the jobs the threads are running, and then I also see that after they are done, they get new jobs and just go on and on...
I am absolutely confounded. If I do a kill -9 MainProgram.pl, it is supposed to kill all threads it owns, right?
Regardless of what the threads are out doing....
And even if the threads are doing I/O and so they wait for the IO to get done...I would expect the thread to die after its current task is done..but clearly, Main is reloading jobs too, as threads just keep continuing...
Is this kind of behaviour seen in perl ?
EDIT: Some of the code in mainProgram.pl
use threads;
use threads::shared;
for (my $count = 0; $count <= $threadNum-1; $count++) {
$t = threads->new(\&handleEvent, $count) ;
push(@threads, $t);
}
#Parent thread:
while(1) {
lock($parentSignal);
cond_wait($parentSignal);
getEvents();
while(@eventCount == 0){
sleep($parent_sleep_time);
getEvents(); #Try to get events again until you get some new stuff to process
}
cond_broadcast($threadsDone); # threads go work on this
}
Thanks
From what I understand, you're supposed to either join() or detach() on all threads prior to exiting.
From the POD:
If the program exits without all threads having either been joined or
detached, then a warning will be issued.
Source: http://metacpan.org/pod/threads
I have code that spawns two threads. The first is a system command which launches an application. The second monitors the program. I'm new to perl threads so I have a few questions...
my $thr1 = threads->new(system($cmd));
sleep(FIVEMINUTES);
my $thr2 = threads->new(\&check);
my $rth1 = $thr1->join();
my $rth2 = $thr2->join();
1) Do I need a second thread to monitor the program? You can think of my subroutine call to &check as an infinite while loop which checks a text file for stuff the application produces. Could I just do this:
my $thr1 = threads->new(system($cmd));
sleep(FIVEMINUTES);
2) I'm trying to figure out what my parent is doing after I run this code. So after I launch line 1 it will spawn that new thread, sleep, then spawn that second thread and then sit at that first join and wait. It will not execute the rest of my code until it joins at that first join. Is this correct or am I wrong? If I am wrong, then how does it work?
3) My first thread, the one that launches the application, can be killed unexpectedly. When this happens, I have nothing to catch that and kill the threads. It just says:
"Thread 1 terminated abnormally: Undefined subroutine &main::65280 called at myScript.pl line 109." and then hangs there.
What could I do to get it to end the other threads? I need it to send an email before the program ends as well which I can do by just calling &email (another subroutine I made).
Thanks
First of all,
my $thr1 = threads->new(system($cmd));
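calls system($cmd) immediately in the main thread and hands its return value (65280 in your case) to threads->new as the name of the sub to run, hence the "Undefined subroutine &main::65280" error. It should be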
my $thr1 = threads->new(sub { system($cmd) });
or simply
my $thr1 = async { system($cmd) };
You don't need to start a third thread. As you suspected, the main thread and the one executing system are sufficient.
What if the command finishes executing in less than five minutes? The following replaces sleep with a signal.
use threads;
use threads::shared;
my $done :shared = 0;
my $thr1 = async {
system($cmd);
lock($done);
$done = 1;
cond_signal($done);
};
{ # Wait up to $timeout for the thread to end.
lock($done);
my $timeout = time() + 5*60;
1 while !$done && cond_timedwait($done, $timeout);
if (!$done) {
# ... there was a timeout ...
}
}
$thr1->join();
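The snippet above leaves $cmd and the timeout branch open. As a self-contained sketch of the same pattern, the following wraps it in a helper. The name wait_for_worker and the use of sleep as a stand-in for system($cmd) are assumptions made for illustration only. The helper returns the thread object plus a flag saying whether the work finished before the deadline.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

# Hypothetical helper: runs $work in a thread and waits up to $timeout_secs.
sub wait_for_worker {
    my ($work, $timeout_secs) = @_;
    my $done :shared = 0;

    my $thr = async {
        $work->();
        lock($done);
        $done = 1;
        cond_signal($done);    # wake the waiting main thread
    };

    my $finished;
    {
        lock($done);
        my $deadline = time() + $timeout_secs;    # absolute epoch time
        # cond_timedwait returns false once the deadline has passed.
        1 while !$done && cond_timedwait($done, $deadline);
        $finished = $done;
    }
    return ($thr, $finished);
}

# Stand-in for system($cmd): a job that takes about one second.
my ($thr, $finished) = wait_for_worker(sub { sleep 1 }, 5);
print $finished ? "worker finished in time\n" : "timed out\n";
$thr->join();
```

On a timeout the thread is still running, so the caller still has to decide whether to join it later or detach it.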
In 2004-2006 I had the same challenges for 24/7 running perl app on Winblows... The only approach working was to use xml state files on disk to communicate the status of each component of the system... and make sure if threads are used every stat file handling occurred within a closure code block {} (big gotcha) The app ran at least 3 years on 100 machines 24/7 without errors ...
If you are on a Unix-like OS I would suggest to use forks and interprocess communication.
Use cpan modules, do not reinvent the wheel..
Multithreading in Perl is a little hard to deal with, I would suggest using the fork() commands instead. I will attempt to answer your questions to the best of my ability.
1) It seems to me like two threads/processes are the way to go here, as you need to check your data asynchronously.
2) Your parent works exactly as you describe.
3) The reason for your thread hanging could be that you never terminate your second thread. You said it was an infinite loop, is there any exit condition?
I'm using Thread::Pool::Simple for multi-threading.
I have a couple of questions which are quite general to multi-threading, I guess:
Each of my threads might die if something unexpected happens. This is totally acceptable to me, since it means some of my assertions are wrong and I need to redesign the code. Currently, when any thread dies, the main program (the calling thread) also dies, yielding something like:
Perl exited with active threads:
0 running and unjoined
0 finished and unjoined
4 running and detached
Are these "running and detached" threads zombies? Are they "dangerous" in any way? Is there a way to kill all of them if any of the threads dies? What is the common solution for such scenarios?
Generally, my jobs are independent. However, I pass each of them, as an argument, a unique hash which is taken from one big hash of hashes. The thread might change this personal hash (but it can't get to the large hash; it doesn't even know about it). Hence, I guess I don't need any locks etc. Am I missing anything?
When your main program exits, all threads are terminated.
Perl threads work in one of two ways.
1) You can use join:
my $thr = threads->create(...);
# do something else while thread works
my $return = $thr->join(); # wait for thread to terminate and fetch return value
2) You can use detach:
my $thr = threads->create(...);
$thr->detach(); # thread will discard return value and auto-cleanup when done
That message lists the threads that hadn't been cleaned up before the program terminated.
"Running and unjoined" is case 1, still running. "Finished and unjoined" is case 1, finished but the return value hasn't been fetched yet. "Running and detached" is case 2, still running.
So it's saying you have 4 threads that had been detached but hadn't finished before the program died. You can't tell from that whether they would have finished if the program had run longer, or they were stuck in an infinite loop, or deadlocked, or what.
You shouldn't need any locks for the situation you describe.
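One thing that may be worth checking, assuming the per-job hashes are ordinary (unshared) data: Perl's ithreads copy a thread's arguments into the new thread, so changes a thread makes to "its" hash are not visible in the parent unless the hash is shared or handed back through join(). The sketch below uses plain threads rather than Thread::Pool::Simple, and the names process_job and run_jobs are hypothetical.

```perl
use strict;
use warnings;
use threads;

# Hypothetical job handler: works on the thread's private copy of the hash.
sub process_job {
    my ($job) = @_;
    $job->{status} = 'done';
    return $job;    # the modified copy travels back through join()
}

# Hypothetical driver: one thread per entry of a hash of hashes.
sub run_jobs {
    my ($jobs) = @_;
    my %thr;
    for my $name (sort keys %$jobs) {
        $thr{$name} = threads->create(\&process_job, $jobs->{$name});
    }
    my %result;
    for my $name (sort keys %thr) {
        $result{$name} = $thr{$name}->join();
    }
    return \%result;
}

my %jobs = (a => { status => 'new' }, b => { status => 'new' });
my $result = run_jobs(\%jobs);
print "parent copy of a: $jobs{a}{status}\n";
print "returned copy of a: $result->{a}{status}\n";
```

No lock is involved here because each thread only ever touches its own copy.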