I'm creating a Perl application that runs multiple threads, each of which performs a time-consuming task. This is what I have so far:
use strict;
use warnings;
use threads;
my @file_list = ("file1", "file2", "file3");
my @jobs;
my @failed_jobs;
my $timeout = 10; #10 seconds timeout
foreach my $s (@file_list){
push @jobs, threads->create(sub{
#time consuming task
});
}
$_->join for @jobs;
The problem is that the time-consuming task may sometimes get stuck (or take more than $timeout seconds to run). When that happens, I want to get the name of the file, push it to @failed_jobs, and then kill that thread. However, I want to continue with the other threads. When all threads are either killed or completed, I want to exit.
Can someone tell me how to modify my above code to achieve this?
Thanks
If you want the ability to kill the task, you don't want threads but processes.
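As a rough sketch of the process-based approach, the code below forks one child per file, polls the children with a non-blocking waitpid, and kills any child that is still running after the timeout. The helper name run_with_timeout and the sleep-based stand-in task are assumptions made for illustration; they are not part of the original question.

```perl
use strict;
use warnings;
use POSIX ":sys_wait_h";    # provides WNOHANG

# Hypothetical helper: run $task->($file) in one child process per file.
# Any child still running after $timeout seconds is killed, and the
# names of the files handled by killed children are returned.
sub run_with_timeout {
    my ($timeout, $task, @files) = @_;
    my (%running, @failed);

    for my $file (@files) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {            # child: do the work, then exit
            $task->($file);
            exit 0;
        }
        $running{$pid} = $file;     # parent: remember which file this is
    }

    my $deadline = time + $timeout;
    while (%running) {
        for my $pid (keys %running) {
            if (waitpid($pid, WNOHANG) != 0) {
                delete $running{$pid};              # finished on its own
            }
            elsif (time >= $deadline) {
                kill 'KILL', $pid;                  # stuck: kill it
                waitpid($pid, 0);                   # reap the killed child
                push @failed, delete $running{$pid};
            }
        }
        select(undef, undef, undef, 0.1);           # brief pause between polls
    }
    return @failed;
}

# Stand-in task (an assumption): each "file" just takes one second.
my @failed_jobs = run_with_timeout(10, sub { sleep 1 }, "file1", "file2", "file3");
print "Failed jobs: @failed_jobs\n";
```

Because all children start at roughly the same moment, a single shared deadline is enough here. If jobs were started at different times, a per-child start time would be needed instead.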
I am trying to get into Perl's use of threads. Reading the documentation I came across the following code:
use threads;
my $thr = threads->create(\&sub1); # Spawn the thread
$thr->detach(); # Now we officially don't care any more
sleep(15); # Let thread run for awhile
sub sub1 {
my $count = 0;
while (1) {
$count++;
print("\$count is $count\n");
sleep(1);
}
}
The goal, it seems, would be to create one thread running sub1 for 15 seconds, and in the mean time print some strings. However, I don't think I understand what's going on at the end of the programme.
First of all, detach() is defined as follows:
Once a thread is detached, it'll run until it's finished; then Perl
will clean up after it automatically.
However, when does the subroutine finish? while(1) never finishes. Nor do I find any information in sleep() that it'd cause to break a loop. On top of that, from the point we detach we are 'waiting for the script to finish and then clean it up' for 15 seconds, so if we are waiting for the subroutine to finish, why do we need sleep() in the main script? The position is awkward to me; it suggests that the main programme sleeps for 15 seconds. But what is the point of that? The main programme (thread?) sleeps while the sub-thread keeps running, but how is the subroutine then terminated?
I guess the idea is that after the sleep is done, the subroutine ends, after which we can detach/clean up. But how is this syntactically clear? Where in the definition of sleep is it said that sleep terminates a subroutine (and why), and how does it know which one to terminate if there is more than one thread?
All threads end when the program ends. The program ends when the main thread ends. The sleep in the main thread merely keeps the program running for a short time, after which the main thread (therefore the program, therefore all created threads) also ends.
So what's up with detach? It just says "I'm never going to bother joining to this thread, and I don't care what it returns". If you don't either detach a thread or join to it, you'd get a warning when the program ends.
Detaching a thread means "I don't care any more", and that does actually mean that when your process exits, the thread will error and terminate.
Practically speaking - I don't think you ever want to detach a thread in perl - just add a join at the end of your code, so it can exit cleanly, and signal it via a semaphore or Thread::Queue in order to terminate.
$_->join for threads->list;
Will do the trick.
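As a minimal sketch of signalling termination through Thread::Queue, the worker below keeps dequeuing until the main thread calls end() on the queue, after which dequeue() returns undef and the loop exits so the thread can be joined. It assumes a Perl built with thread support and a Thread::Queue version recent enough to provide end(); the item values and variable names are made up for illustration.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $q = Thread::Queue->new();

# Worker: process items until the queue is ended and drained.
my $thr = threads->create(sub {
    my $count = 0;
    while (defined(my $item = $q->dequeue())) {
        $count++;                   # stand-in for real work on $item
    }
    return $count;                  # handed back to the joiner
});

$q->enqueue($_) for 1 .. 5;         # send some work
$q->end();                          # signal: no more items will arrive
my $processed = $thr->join();       # wait for a clean exit
print "Worker processed $processed items\n";
```

Ending the queue doubles as the termination signal, so no separate shared flag is needed in this arrangement.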
That code example is, in my opinion, a bad one. It's just plain messy to sleep so that a detached thread has a chance to complete, when you could simply join and know that it has finished. This is especially true of Perl threads: it's deceptive to assume they're lightweight and can therefore be trivially started (and detached). If you're ever spawning so many that the overhead of joining them is too high, then you're using Perl threads wrong and should probably fork instead.
You're quite right: the thread will never terminate, and so your code will always have a 'dirty' exit.
So instead I'd rewrite:
#!/usr/bin/perl
use strict;
use warnings;
use threads;
use threads::shared;
my $run : shared;
$run = 1;
sub sub1 {
my $count = 0;
while ($run) {
$count++;
print("\$count is $count\n");
sleep(1);
}
print "Terminating\n";
}
my $thr = threads->create( \&sub1 ); # Spawn the thread
sleep(15); # Let thread run for awhile
$run = 0;
$thr->join;
That way your main thread signals the worker thread to say "I'm done" and waits for it to finish its current loop.
I have 100+ tasks to do, I can do it in a loop, but that will be slow
I want to do these jobs by threading, let's say, 10 threads
There is no dependency between the jobs, each can run independently, and stop if failed
I want these threads to pick up my jobs and do it, there should be no more than 10 threads in total, otherwise it may harm the server
These threads keep doing the jobs until all finished
Stop the job in the thread when timeout
I was searching for information about this on the Internet and found Thread::Pool, Thread::Queue...
But I can't be sure which one is better for my case. Could anyone give me some advice?
You could use Thread::Queue and threads.
IPC (communication) between threads is much easier than between processes.
To fork or not to fork?
use strict;
use warnings;
use threads;
use Thread::Queue;
my $q = Thread::Queue->new(); # A new empty queue
# Worker thread
for (1..10) { # start 10 detached worker threads
threads->create(sub {
while (defined(my $item = $q->dequeue())) {
# Do work on $item
}
})->detach();
}
my $dbh = ...
while (1){
#get items from db
my @items = get_items_from_db($dbh);
# Send work to the threads
$q->enqueue(@items);
print "Pending items: ".$q->pending()."\n";
sleep 15;#check DB in every 15 secs
}
I'd never use Perl threads. The reason is that, conceptually speaking, they aren't threads: you have to specify which data is shared between them, and each thread runs its own Perl interpreter. That's why they are called interpreter threads, or ithreads. Needless to say, this consumes a lot of memory just to run things in parallel. fork() shares all the memory up until the fork point. So if the tasks are independent, always use fork. It's also the most Unix way of doing things.
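A fork-based version of the "no more than 10 at once" requirement might look like the sketch below. The helper name run_pool and the empty stand-in task are assumptions for illustration only. The parent starts children until the limit is reached, then waits for one to exit before starting the next.

```perl
use strict;
use warnings;

# Hypothetical helper: run $task->($job) for every job with at most
# $max child processes alive at once. Returns the number of jobs whose
# child exited with status 0.
sub run_pool {
    my ($max, $task, @jobs) = @_;
    my ($active, $ok) = (0, 0);

    my $reap = sub {
        my $pid = wait();                   # block until any child exits
        if ($pid == -1) { $active = 0; return; }    # no children left
        $active--;
        $ok++ if $? == 0;                   # count clean exits only
    };

    for my $job (@jobs) {
        $reap->() if $active >= $max;       # pool full: wait for a slot
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {                    # child: run one job and exit
            $task->($job);
            exit 0;
        }
        $active++;
    }
    $reap->() while $active > 0;            # wait for the stragglers
    return $ok;
}

# Stand-in task (an assumption): each job does nothing and succeeds.
my $done = run_pool(10, sub { my ($job) = @_; return }, 1 .. 25);
print "$done jobs completed\n";
```

A child that dies exits with a non-zero status, so it is simply not counted and the other jobs carry on. For the per-job timeout, one option would be to call alarm() at the start of each child, since the default action of SIGALRM is to terminate the process.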
I have a Perl program that spawns several threads. Each thread processes some task (by firing off other system commands, etc.) and then, when it's all done, waits.
Once all threads are done, they fire a signal to Parent process. The parent then loads up new jobs, and signals the threads to go work on these new tasks.
So ideally, this program, would run forever.
Now, if I kill it on the command line with kill -9 MainProgram.pl, it's not killed! I see the output of the jobs the threads are running, and then I also see that after they are done, they get new jobs and just go on and on...
I am absolutely confounded. If I do a kill -9 MainProgram.pl, it is supposed to kill all threads it owns, right?
Regardless of what the threads are out doing....
And even if the threads are doing I/O and so they wait for the IO to get done...I would expect the thread to die after its current task is done..but clearly, Main is reloading jobs too, as threads just keep continuing...
Is this kind of behaviour seen in perl ?
EDIT: Some of the code in mainProgram.pl
use threads;
use threads::shared;
for (my $count = 0; $count <= $threadNum-1; $count++) {
$t = threads->new(\&handleEvent, $count) ;
push(@threads, $t);
}
#Parent thread:
while(1) {
lock($parentSignal);
cond_wait($parentSignal);
getEvents();
while(@eventCount == 0){
sleep($parent_sleep_time);
getEvents(); #Try to get events again until you get some new stuff to process
}
cond_broadcast($threadsDone); # threads go work on this
}
Thanks
From what I understand, you're supposed to either join() or detach() on all threads prior to exiting.
From the POD:
If the program exits without all threads having either been joined or
detached, then a warning will be issued.
Source: http://metacpan.org/pod/threads
I wrote a very short Perl script that uses multiple threads.
My problem is that the thread I created is not joinable. So I am wondering: what conditions make a thread joinable?
What are the limits on threads in Perl?
#!/usr/bin/env perl
#
#
use lib "$::XCATROOT/lib/perl";
use strict;
use threads;
use Safe;
sub test
{
my $parm = shift;
}
my $newchassis = ["1", "2", "3"];
my @snmp_threads;
for my $item (@$newchassis)
{
my $thread = threads->create(\&test, $item);
push @snmp_threads, $thread;
}
for my $t (@snmp_threads)
{
$t->join();
}
This can be very tricky, as it works fine on RHEL 6.3 but fails on SLES 11sp2.
Though there is no code, I will go ahead and assume that you are using join foreach @threads; to join the threads. How you join the threads depends on the post-processing. Without seeing your code it's difficult to know what you are doing, but this is how it works:
If the post-processing step needs all threads to finish before
beginning work, then the wait for individual threads is unavoidable.
If the post-processing step is specific to the results of each
thread, it should be possible to make the post-processing part of
the thread itself.
In both cases, $_->join foreach @threads; is the way to go.
If there is no need to wait for the threads to finish, use the
detach command instead of join. However, any results that the
threads may return will be discarded.
Are you sure you have provided a valid post-processing scenario for your activity?
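To illustrate the point about results, here is a small sketch in which each thread returns a value and the main thread collects those values with join. The square subroutine and the input numbers are made up for the example, and a Perl built with thread support is assumed.

```perl
use strict;
use warnings;
use threads;

# Stand-in worker: returns a result for the joiner to collect.
sub square {
    my ($n) = @_;
    return $n * $n;
}

# Start one thread per input value.
my @threads = map { threads->create(\&square, $_) } 1 .. 3;

# join blocks until the thread finishes and hands back its return value.
my @results = map { $_->join() } @threads;
print "@results\n";
```

Had the threads been detached instead, there would be no way to retrieve these return values afterwards.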
I am using Perl ithreads and things work fine, unless I decide to have threads sleep.
Let's say my routine thread_job is passed as the entry point for several threads that start running concurrently.
sub thread_job
{
...
sleep 2;
#do other stuff here
}
If I don't have a sleep, I have no issues with the threads running and they do their tasks fine. If I add a sleep, my script hangs. I am running this from a Windows command prompt, if that helps.
Since I do need to sleep, and I'm guessing there's an issue with using sleep on my current setup, I intend to have the thread do something for a while instead of sleeping. Is there any mathematical operation I could perform for this?
Try using Win32::Sleep instead. (Note that it takes milliseconds as an argument, not seconds.)
Calling sleep() blocks the entire process (that is, all the threads).
You can instead block a single thread by calling select(). Do something like this:
sub thread_job {
...
my $delay = 2; # seconds; fractional values are allowed
select(undef, undef, undef, $delay); # block only this thread
...
}