Threaded script exits prematurely - multithreading

I want to do something along the lines of:
my $sema = Thread::Semaphore->new(8);
while(#compsCopy)
{
my $thread1 = threads->create('Build', (shift #compsCopy), "clean");
}
sub FfsBuild {
$sema->down();
my ($comp, $action) = #_;
my $cmd = "$MAKE $MAKE_INVOCATION_PATH/$comp $action";
my $retCode = system($cmd);
push(#retCodes, $retCode);
print "\n\t\t**** ERROR IN $comp ****\n" unless $retCode == 0;
print "added proc $comp\n";
$sema->up();
return $retCode;
}
This seems to work fine for the first 10-20 directories... but eventually I get:
Perl exited with active threads:
364 running and unjoined
14 finished and unjoined
0 running and detached
C:\dev>make: Leaving directory `/cygdrive/C/dev/dir0'
make: Leaving directory `/cygdrive/C/dev/dir1'
make: Leaving directory `/cygdrive/C/dev/dir2'
make: Leaving directory `/cygdrive/C/dev/dir3'
make: Leaving directory `/cygdrive/C/dev/dir4'
make: Leaving directory `/cygdrive/C/dev/dir5'
make: Leaving directory `/cygdrive/C/dev/dir6'
make: Leaving directory `/cygdrive/C/dev/dir7'
Where am I messing up here?

The problem is that you don't wait for you your threads to finish. Adding the following at the bottom will do the trick:
$_->join for threads->list;
You shouldn't create 378 threads then make it so only 8 execute at a time! What a waste! It's people running code like that that say Perl threads are inefficient. Create a pool of workers and assign the work to them.
use constant NUM_WORKERS => 8;
use Thread::Queue 3.01 qw( );
sub worker {
my ($job) = #_;
FfsBuild($job, 'clean');
}
{
my $q = Thread::Queue->new();
for (1..NUM_WORKERS) {
async {
while (defined(my $job = $q->dequeue()) {
worker($job);
}
};
}
$q->enqueue($_) for #compsCopy;
# When you're done adding to the queue.
$q->end();
$_->join() for threads->list();
}
FfsBuild is what you had without the semaphore.

Threading with Perl is a bit difficult (and also rather inefficient. Don't do it without a good reason).
Every thread except the main thread either has to be joined prior to exit, or be detached. You should still make sure that the thread terminates before you exit the process.
So something like threads->create(...)->detach should work here, but I think it's better to join:
my #threads;
for my $job (#jobs) {
push #threads, threads->create(\&worker, $job);
}
# wait until all have completed
$_->join for #threads;
However, you will probably want to kick of a few workers at the start of the program, and feed them jobs via a Thread::Queue instead – each thread is a complete clone of the current interpreter state, which tends to use unnecessary large amounts of memory. Running hundreds of Perl threads is not likely to be useful.

Related

Perl exited with active threads:

I have a Perl program. Everything is working fine, but I am seeing the errors below. I have also joined threads in the end, and tried checking is_joinable, but nothing is working. This error is not causing any issue at the moment but I want to fix it
Perl exited with active threads:
6 running and unjoined
0 finished and unjoined
0 running and detached
code
my #threads;
open FILE, "$inputCsv" or die $!;
my #records = <FILE>;
foreach $record ( #records ) {
#fields = split( /,/, $record );
$identityDomain = $fields[0];
push( #threads, threads->new( \&populateSubscriptionMap, $identityDomain ) );
}
foreach $thr ( #threads ) {
print "threads - " . $thr;
my %myhash = $thr->join();
}
I have spent almost 3 hours on this and tried various things. I would appreciate if someone can take a look and help me out.
$_->join() for threads->list();
is a simple way of waiting for all threads to end, but your problem is far more likely to be that you don't actually reach the part of your code that reaps the threads. The most likely culprits are:
Your main thread threw an uncaught exception.
Your main thread or one of the other threads used exit.

Using threads in loop

I want to use threads in loops. The way I want to use this is that start threads in a loop and wait for them to finish. Once all threads are finished then sleep for some predefined number of time and then start those threads again.
Actually I want to run these threads once every hour and that is why I am using sleep. Also I know that hourly run can be done via cron but I can't do that so I am using sleep.
I am getting this error when I am trying to run my code:
Thread already joined at ex.pl line 33.
Perl exited with active threads:
5 running and unjoined
0 finished and unjoined
0 running and detached
This is my code:
use strict;
use warnings;
use threads;
use Thread::Queue;
my $queue = Thread::Queue->new();
my #threads_arr;
sub main {
$queue->enqueue($_) for 1 .. 5;
$queue->enqueue(undef) for 1 .. 5;
}
sub thread_body {
while ( my $num = $queue->dequeue() ) {
print "$num is popped by " . threads->tid . " \n";
sleep(5);
}
}
while (1) {
my $main_thread = threads->new( \&main );
push #threads_arr, threads->create( \&thread_body ) for 1 .. 5;
$main_thread->join();
foreach my $x (#threads_arr) {
$x->join();
}
sleep(1);
print "sleep \n";
}
Also I am able to see other questions similar to this but I am not able to get any of them.
Your #threads_arr array never gets cleared after you join the first 5 threads. The old (already joined) threads still exist in the array the second time around the loop, so Perl throws the "Thread already joined" error when attempting to join them. Clearing or locally initializing #threads_arr every time around the loop will fix the problem.
#threads_arr=(); # Reinitialize the array
push #threads_arr, threads->create( \&thread_body ) for 1 .. 5;

Log message in perl every 90 seconds in the parent process as long as the child process still runs

I just passed over from php to perl due to my company's request so even if this may be a silly question is kind of nerve wreaking right now.
I have one little perl script deployed on a server through a debian package. I have this all figured out so that's all cool.
Now this script is called from another server through an SSH connection and the script logs back to that server all its actions. I use Log::Log4perl for that.
One of the tasks takes a very long time and also runs some other scripts in the process. The ssh connection has a set timeout of 5 minutes unless I log something back. So I figured out I would create a child process to run the task and let the parent process log back every 90 (or whatever) seconds. My issue is that I don't want to use sleep because if the task is finished sooner it will mess up the log.
I have also tried using Time, Time::HiRes and alarm, but they all mess up my log one way or another.
This is my code:
$log->info("uid $uid: calling the configure script for operation $mode,on $dst_path");
my $pid = fork();
die "Could not fork\n" if not defined $pid;
if ( $pid == 0 ) {
configure( $script_dir, $mode, $node, $uid, $gid); # this also uses a parallel process in its execution, but we don't have a non blocking wait
}
while ( !waitpid( $pid, WNOHANG ) ) {
sleep(90);
if ( !$pid ) {
$log->info("Still waiting for the process to finish"); # this should come up every 90 seconds of so
}
}
$log->info("uid $uid: configure script executed"); # this should come up only once, now I get it every 90 seconds
# do other stuff here after the execution of the configure sub is done
Unfortunately I inherited this architecture as it is and cannot change it because there are a lot of services based on it.
If you don't want to sleep, you can call select with a timeout. To implement this reliably, you can employ the self-pipe trick which involves creating a pipe, writing to the pipe in a SIGCHLD handler, and making the select call wait on the pipe's read handle.
Here's a simple example:
#!/usr/bin/perl
use strict;
use warnings;
use Errno qw(EINTR);
use Fcntl qw(F_GETFL F_SETFL O_NONBLOCK);
use Symbol qw(gensym);
sub make_non_blocking {
my $handle = shift;
my $flags = fcntl($handle, F_GETFL, 0)
or die("F_GETFL: $!");
fcntl($handle, F_SETFL, $flags | O_NONBLOCK)
or die("F_SETFL: $!");
}
my ($read_handle, $write_handle) = (gensym, gensym);
pipe($read_handle, $write_handle)
or die("pipe: $!");
make_non_blocking($read_handle);
make_non_blocking($write_handle);
local $SIG{CHLD} = sub {
syswrite($write_handle, "\0", 1);
};
my $pid = fork();
die("fork: $!") if !defined($pid);
if ($pid == 0) {
sleep(10);
exit;
}
my $rin = '';
vec($rin, fileno($read_handle), 1) = 1;
while (1) {
my $nfound = select(my $rout = $rin, undef, undef, 2);
if ($nfound < 0) {
# Error. Must restart the select call on EINTR.
die("select: $!") if $! != EINTR;
}
elsif ($nfound == 0) {
# Timeout.
print("still running...\n");
}
else {
# Child exited and pipe was written to.
last;
}
}
waitpid($pid, 0);
close($read_handle);
close($write_handle);
I tried to run the code and noticed a few things that may be your issue, but without knowing what configure does, I can't be sure. Here's what I found:
The child process doesn't exit after calling configure
waitpid does not change the value of $pid, so $pid is always 0 in the child and always the pid of the child in the parent.
What this means is that the parent is never writing out "Still waiting for the process to finish", the child is writing it out every 90 seconds after it completes it's call to configure.
Additionally, the child should print that message ever 90 seconds forever because it's waiting for pid 0 to send it the CHLD signal which won't happen because it doesn't have a child with pid 0.
I updated your code with a few stubs that does what I think you want (on a slightly tighter timeline because I don't like to wait :) ). My code makes the following assumptions that you may wish to change:
Log the waiting message every second
The child always exits with a status value of 0
Here's my code:
#!/usr/bin/env perl
use strict;
use warnings;
use Log::Log4perl qw(:easy);
use POSIX qw(:sys_wait_h);
Log::Log4perl->easy_init();
my ($uid,$mode,$dst_path,$script_dir,$node,$gid) = (0..5);
my $log = get_logger();
$log->info("uid $uid: calling the configure script for operation $mode,on $dst_path");
my $pid = fork();
die "Could not fork\n" if not defined $pid;
if ( $pid == 0 ) {
configure( $script_dir, $mode, $node, $uid, $gid); # this also uses a parallel process in its execution, but we don't have a non blocking wait
exit(0);
}
my $zombie;
while ( ($zombie = waitpid( $pid, WNOHANG ) ) != $pid) {
$log->info("Still waiting for the process to finish"); # this should come up every 90 seconds of so
sleep(1);
}
$log->info("uid $uid: configure script executed"); # this should come up only once, now I get it every 90 seconds
# do other stuff here after the execution of the configure sub is done
sub configure {
sleep 10;
}

perl thread dies after detach

I need to create a thread and let it do some stuff. The main reason is to quit from main programm as fast as possible.
But after I detach the thread and end the main programm, the thread dies. In my example the file doesn´t contain expected data.
If I let the main programm sleep for 5 sec everything works fine.
How can I let run the thread after quitting from the main prog?
use threads;
print "Starting main program\n";
my $t = threads->new(\&sub1)->detach();
print "End of main program\n";
sub sub1 {
print "started thread\n";
open (FILE, ">>./test.txt") or die($!);
print FILE localtime(time())."\n";
foreach (1..3){
print FILE "$_\n";
sleep 1;
}
close(FILE);
return 1;
}
Quite simply, you can't.
$thr->detach()
Makes the thread unjoinable, and causes any eventual return value to be discarded. When the program exits, any detached threads that are still running are silently terminated.
But you could do:
threads->new(\&sub1)->join();
But that hardly buys you much. Because threads are mainly meant to be used to do other things while you continue to do some thing. There is little reason to stop the main line of execution, and then run a thread, when you could just as well have called sub1();
You can't use threads, but you could fork and let parent process exit without waiting for the child process to finish:
my $pid = fork();
if ($pid == 0) {
print "child proc\n";
sleep 5;
print "child, 5 seconds older\n";
exit(0);
} else {
print "parent proc. goodbye\n";
}
I don't think this isn't a very good approach since you've now lost control over the child process. Why not just run it main and background the process?
Edit: This of course also has the potential to generate zombie processes, which will consume system resources. and brains.

Perl multithreading and foreach

I am writing a simple Perl script that should run other Perl scripts simultaneously. I don't know how to make the main program wait for running threads. Sleep IS NOT a suitable solution because of many reason. This is my "main" script:
#!/usr/bin/perl
use threads;
main:
{
if ($#ARGV == -1) { usage(); exit(1); }
my $hostname = $ARGV[0];
my $thrssh = threads ->create(\&ssh, $hostname);
my $thrdns = threads ->create(\&dns, $hostname);
my $thrping = threads ->create(\&ping, $hostname);
my $thrsmtp = threads ->create(\&smtp, $hostname);
my $thrproxy = threads ->create(\&proxy, $hostname);
}
sub ssh {
threads->detach();
my $hostname = #_;
#print "SSH\n";
#LAUNCH SSH SCRIPT
}
#OTHER SUBROUTINES...
sub proxy {
threads->detach();
my $hostname = #_;
#print "PROXY\n";
#LAUNCH PROXY SCRIPT
}
If I try to run this script the first thing I can notice is that prints are "sequential", I thought text was messed up but maybe Print is exclusive I don't know. Main problem is that the last two subroutines don't have time to be executed.
SSH
DNS
PING
Perl exited with active threads:
2 running and unjoined
0 finished and unjoined
0 running and detached
If I use join instead of detach the subroutines become "sequential", for example if I put a sleep in sub ssh other threads will wait before starting. I want them to be parallel and the main program to close ONLY when all threads are finished, any help?
I actually have another question, if I have to run multiple threads in a foreach cicle when I am supposed to join them? ie:
my $thrssh;
foreach $mynode ($nodeset->get_nodelist) {
#...
$thrssh = threads ->create(\&ssh, $port, $hostname);
#...
}
$thssh->join();
Is that right?
You need to join all your threads from the main thread after they have all been started. Don't detatch if you don't want the interpreter to exit while the threads are still running.
...
my $thrssh = threads ->create(\&ssh, $hostname);
my $thrdns = threads ->create(\&dns, $hostname);
...
$thrssh->join();
$thrdns->join();
...
Regarding your edit: no, it's not right. You need to keep a reference to each thread you create, otherwise you can't join them.
Do something like:
my #thrs;
foreach $mynode ($nodeset->get_nodelist) {
#...
$thrssh = threads ->create(\&ssh, $port, $hostname);
push #thrs, $thrssh;
#...
}
$_->join() for #thrs;

Resources