Perl multithreading and foreach - multithreading

I am writing a simple Perl script that should run other Perl scripts simultaneously. I don't know how to make the main program wait for the running threads. Sleep IS NOT a suitable solution, for many reasons. This is my "main" script:
#!/usr/bin/perl
use threads;
main:
{
if ($#ARGV == -1) { usage(); exit(1); }
my $hostname = $ARGV[0];
my $thrssh = threads ->create(\&ssh, $hostname);
my $thrdns = threads ->create(\&dns, $hostname);
my $thrping = threads ->create(\&ping, $hostname);
my $thrsmtp = threads ->create(\&smtp, $hostname);
my $thrproxy = threads ->create(\&proxy, $hostname);
}
sub ssh {
threads->detach();
my ($hostname) = @_;
#print "SSH\n";
#LAUNCH SSH SCRIPT
}
#OTHER SUBROUTINES...
sub proxy {
threads->detach();
my ($hostname) = @_;
#print "PROXY\n";
#LAUNCH PROXY SCRIPT
}
When I run this script, the first thing I notice is that the prints appear "sequentially". I expected the text to be interleaved, but maybe print is exclusive; I don't know. The main problem is that the last two subroutines don't have time to execute.
SSH
DNS
PING
Perl exited with active threads:
2 running and unjoined
0 finished and unjoined
0 running and detached
If I use join instead of detach, the subroutines become "sequential": for example, if I put a sleep in sub ssh, the other threads wait before starting. I want them to run in parallel and the main program to exit ONLY when all threads have finished. Any help?
I actually have another question: if I have to start multiple threads in a foreach loop, when am I supposed to join them? For example:
my $thrssh;
foreach $mynode ($nodeset->get_nodelist) {
#...
$thrssh = threads ->create(\&ssh, $port, $hostname);
#...
}
$thrssh->join();
Is that right?

You need to join all your threads from the main thread after they have all been started. Don't detach them if you don't want the interpreter to exit while the threads are still running.
...
my $thrssh = threads ->create(\&ssh, $hostname);
my $thrdns = threads ->create(\&dns, $hostname);
...
$thrssh->join();
$thrdns->join();
...
Regarding your edit: no, it's not right. You need to keep a reference to each thread you create, otherwise you can't join them.
Do something like:
my @thrs;
foreach my $mynode ($nodeset->get_nodelist) {
#...
my $thrssh = threads->create(\&ssh, $port, $hostname);
push @thrs, $thrssh;
#...
}
$_->join() for @thrs;
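As a rough, self-contained sketch of the same pattern: start every thread first, keep each thread object in an array, and join only afterwards. The sub `check_host` and the node list are hypothetical placeholders, not part of the original script.

```perl
use strict;
use warnings;
use threads;

# Hypothetical stand-in for the real check; it only returns a label.
sub check_host {
    my ($port, $hostname) = @_;
    return "$hostname:$port checked by thread " . threads->tid();
}

# Illustrative data, not taken from the question.
my @nodes = (
    { host => 'node-a', port => 22 },
    { host => 'node-b', port => 22 },
    { host => 'node-c', port => 2222 },
);

# Start every thread first and keep each thread object.
my @thrs;
for my $node (@nodes) {
    push @thrs, threads->create(\&check_host, $node->{port}, $node->{host});
}

# Join only after all threads are running, so their work overlaps.
my @results = map { $_->join() } @thrs;
print "$_\n" for @results;
```

Because join() is called in creation order, the results come back in that order even if the threads finish in a different one.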

Related

Perl: fork from parallel threads

The code below is supposed to run two parallel threads. Each thread calls fork() and waits for the child process to finish; the threads are then expected to join (finish) and the result is printed.
In reality, the first forked child process finishes as expected, but the second one hangs in _mutex_lock() while trying to exit, so the second thread never joins until you kill the second child manually with signal -9.
Could somebody please explain why this happens and how to avoid it?
use strict;
use warnings;
use threads;
use Data::Dumper;
sub Run
{
my $tid = threads->tid();
my $log = {};
$log->{"[$$:$tid]:00"} = "started";
my $pid = fork();
if ($pid == 0)
{
print "In child ($$): started\n";
sleep 3;
print "In child ($$): finished\n";
# system("kill -9 $$"); -- brutal way
exit 0;
}
waitpid($pid, 0);
my $exitCode = $? >> 8;
$log->{"[$$:$tid]:01"} = "finished, code=$exitCode";
return $log;
}
my @threads = ();
foreach (1..2)
{
push @threads, threads->new(sub { return Run() });
}
print Dumper($_->join()) for @threads;
On my Linux box, using _exit from POSIX instead of exit works. This solution might be non-portable to other platforms, though.
The linked documentation says:
Note that when using threads and in Linux this is not a good way to exit a thread because in Linux processes and threads are kind of the same thing (Note: while this is the situation in early 2003 there are projects under way to have threads with more POSIXly semantics in Linux). If you want not to return from a thread, detach the thread.
Similarly, perlthrtut - Tutorial on threads in Perl says
Thinking of mixing fork() and threads? Please lie down and wait until the feeling passes. Be aware that the semantics of fork() vary between platforms. For example, some Unix systems copy all the current threads into the child process, while others only copy the thread that called fork(). You have been warned!
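A minimal sketch of the `_exit` workaround mentioned above, assuming a Linux system and a Perl built with thread support. The sub name `run_child` and the exit codes are made up for illustration. Note that `POSIX::_exit` does not flush Perl's buffered output, so anything the child prints should be flushed explicitly before it leaves.

```perl
use strict;
use warnings;
use threads;
use POSIX ();

# Fork inside a thread; the child leaves through POSIX::_exit so it
# skips Perl's normal interpreter and thread cleanup.
sub run_child {
    my ($code) = @_;
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        # Child: do the real work here, then leave immediately.
        POSIX::_exit($code);
    }
    waitpid($pid, 0);
    return $? >> 8;    # exit status of the child
}

# Two threads, each forking one child (exit codes are illustrative).
my @threads = map { threads->create(\&run_child, $_) } (3, 4);
my @codes   = map { $_->join() } @threads;
print "child exit codes: @codes\n";
```

As the quoted documentation warns, behaviour when mixing fork() and threads varies between platforms, so this should be treated as a platform-specific workaround rather than a portable fix.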

Using threads in loop

I want to use threads in a loop: start the threads, wait for them all to finish, sleep for a predefined amount of time, and then start the threads again.
I actually want to run these threads once every hour, which is why I am using sleep. I know an hourly run can be done with cron, but I can't use it here.
I am getting this error when I am trying to run my code:
Thread already joined at ex.pl line 33.
Perl exited with active threads:
5 running and unjoined
0 finished and unjoined
0 running and detached
This is my code:
use strict;
use warnings;
use threads;
use Thread::Queue;
my $queue = Thread::Queue->new();
my @threads_arr;
sub main {
$queue->enqueue($_) for 1 .. 5;
$queue->enqueue(undef) for 1 .. 5;
}
sub thread_body {
while ( my $num = $queue->dequeue() ) {
print "$num is popped by " . threads->tid . " \n";
sleep(5);
}
}
while (1) {
my $main_thread = threads->new( \&main );
push @threads_arr, threads->create( \&thread_body ) for 1 .. 5;
$main_thread->join();
foreach my $x (@threads_arr) {
$x->join();
}
sleep(1);
print "sleep \n";
}
I have seen other questions similar to this one, but I could not understand any of the answers.
Your @threads_arr array never gets cleared after you join the first 5 threads. The old (already joined) threads are still in the array the second time around the loop, so Perl throws the "Thread already joined" error when it tries to join them again. Clearing @threads_arr, or declaring it locally inside the loop, on every iteration will fix the problem.
@threads_arr = (); # Reinitialize the array
push @threads_arr, threads->create( \&thread_body ) for 1 .. 5;
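One way to make the stale-thread problem impossible is to scope the thread list to a single batch. The sketch below is an assumption-laden variant, not the asker's code: `run_batch` is a hypothetical helper, the loop runs three rounds instead of forever, the pause is shortened, and `$queue->end()` assumes Thread::Queue 3.01 or later.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Run one batch: a fresh queue and a fresh thread list on every call,
# so no already-joined thread can be joined a second time.
sub run_batch {
    my ($workers, @items) = @_;
    my $queue = Thread::Queue->new();
    my @batch = map {
        threads->create(sub {
            my $count = 0;
            # dequeue() returns undef once the queue is ended and empty.
            while (defined(my $item = $queue->dequeue())) {
                $count++;
            }
            return $count;
        });
    } 1 .. $workers;
    $queue->enqueue(@items);
    $queue->end();
    my $total = 0;
    $total += $_ for map { $_->join() } @batch;
    return $total;
}

# Three rounds stand in for the hourly loop.
for my $round (1 .. 3) {
    my $done = run_batch(5, 1 .. 5);
    print "round $round processed $done items\n";
    sleep 1;
}
```

Ending the queue also removes the need to enqueue one undef per worker as a stop marker.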

Threaded script exits prematurely

I want to do something along the lines of:
my $sema = Thread::Semaphore->new(8);
while(@compsCopy)
{
my $thread1 = threads->create('FfsBuild', (shift @compsCopy), "clean");
}
sub FfsBuild {
$sema->down();
my ($comp, $action) = @_;
my $cmd = "$MAKE $MAKE_INVOCATION_PATH/$comp $action";
my $retCode = system($cmd);
push(@retCodes, $retCode);
print "\n\t\t**** ERROR IN $comp ****\n" unless $retCode == 0;
print "added proc $comp\n";
$sema->up();
return $retCode;
}
This seems to work fine for the first 10-20 directories... but eventually I get:
Perl exited with active threads:
364 running and unjoined
14 finished and unjoined
0 running and detached
C:\dev>make: Leaving directory `/cygdrive/C/dev/dir0'
make: Leaving directory `/cygdrive/C/dev/dir1'
make: Leaving directory `/cygdrive/C/dev/dir2'
make: Leaving directory `/cygdrive/C/dev/dir3'
make: Leaving directory `/cygdrive/C/dev/dir4'
make: Leaving directory `/cygdrive/C/dev/dir5'
make: Leaving directory `/cygdrive/C/dev/dir6'
make: Leaving directory `/cygdrive/C/dev/dir7'
Where am I messing up here?
The problem is that you don't wait for your threads to finish. Adding the following at the bottom will do the trick:
$_->join for threads->list;
You shouldn't create 378 threads then make it so only 8 execute at a time! What a waste! It's people running code like that that say Perl threads are inefficient. Create a pool of workers and assign the work to them.
use threads;
use constant NUM_WORKERS => 8;
use Thread::Queue 3.01 qw( );
sub worker {
my ($job) = @_;
FfsBuild($job, 'clean');
}
{
my $q = Thread::Queue->new();
for (1..NUM_WORKERS) {
async {
# dequeue() returns undef once the queue has been ended and drained.
while (defined(my $job = $q->dequeue())) {
worker($job);
}
};
}
$q->enqueue($_) for @compsCopy;
# When you're done adding to the queue.
$q->end();
$_->join() for threads->list();
}
FfsBuild is what you had without the semaphore.
Threading with Perl is a bit difficult (and also rather inefficient. Don't do it without a good reason).
Every thread except the main thread either has to be joined prior to exit, or be detached. You should still make sure that the thread terminates before you exit the process.
So something like threads->create(...)->detach should work here, but I think it's better to join:
my @threads;
for my $job (@jobs) {
push @threads, threads->create(\&worker, $job);
}
# wait until all have completed
$_->join for @threads;
However, you will probably want to kick off a few workers at the start of the program and feed them jobs via a Thread::Queue instead. Each thread is a complete clone of the current interpreter state, which tends to use unnecessarily large amounts of memory. Running hundreds of Perl threads is not likely to be useful.

How do I kill Perl threads?

In this program I create a fork, and then call domultithreading from it. It then creates a few threads.
sub domultithreading {
#Not my function
my ($num) = @_;
my @thrs;
my $i = 0;
my $connectionsperthread = 50;
while ( $i < $num ) {
$thrs[$i] = threads->create( \&doconnections, $connectionsperthread, 1 );
$i += $connectionsperthread;
}
my @threadslist = threads->list();
while ( $#threadslist > 0 ) {
$failed = 0;
}
}
sub kill {
#how can I kill the threads made in domultithreading?
kill 9, $pid;
print "\nkilling $pid\n";
}
I then want to be able to kill the fork and its threads, however I can't figure it out. Any suggestions?
Thanks a lot
Perl provides two concurrency models: processes and threads. While you shouldn't necessarily mix the two without a good reason, threads do model processes quite closely, so we can nearly treat them as such. Specifically, we can send signals to threads.
Processes can be signalled with the kill function: kill SIGNAL => $pid, while threads can be signalled with the kill method: $thr->kill(SIGNAL). This method returns the thread object. Signals can be intercepted when setting signal handlers in the %SIG hash.
This means that the process's TERM signal handler can terminate all the child threads like this:
$_->kill(9)->join() for threads->list;
and each thread's signal handler simply exits the thread, or does some cleanup first:
threads->exit; # exit the current thread
There are actually a few different ways to kill a thread in Perl, depending on what you want to achieve.
Let's take the following code as example:
use strict;
use warnings;
use threads;
use Thread::Queue;
# Create the shared queue (used by the threads):
my $queue = Thread::Queue->new();
sub main {
# Initialise the shared queue:
$queue->enqueue("A", "B", "C", "D", "E", "F");
print "Total number of items: " . $queue->pending() . "\n";
$queue->end(); # signal that there is no more work to be sent...
# Create 3 threads:
threads->create('do') for ( 0..2 );
print "Number of current threads: " . threads->list() . "\n";
foreach my $thread ( threads->list() ) { # for each thread...
$thread->join(); # wait the thread to finish all its work...
print "Number of items in the queue: " . $queue->pending() . "\n" if defined $queue->pending();
print "Number of current threads: " . threads->list() . "\n";
}
}
sub do {
# Retrieve the current thread ID:
my $threadID = threads->self()->tid();
# Setup the thread's kill signal handler:
local $SIG{KILL} = sub { threads->exit() };
while ( defined (my $item = $queue->dequeue()) ) { # for each element in the queue...
print "(Thread-" . $threadID . "): Do something with item '$item'...\n";
sleep 1 + $threadID;
print "(Thread-" . $threadID . "): Finished to use item '$item'...\n";
}
}
main();
The code above spawns 3 threads, each of which will take and process an element of the shared queue till the queue is empty.
In this case, since we declared that no more elements will be added to the queue (i.e. $queue->end()), the threads will be joined (to the main thread) once they have processed all the elements of the queue. Indeed, by calling $thread->join() we are telling the main thread to wait for $thread to finish.
If we omit $queue->end(), the threads will not join the main thread but will stay blocked, waiting for new elements in the queue.
Now, if we want to kill the threads, we have two options: killing the threads but letting them finish what they are doing first, or simply (brutally) killing the threads immediately. In Perl, both are achieved via thread signalling.
In the first case (i.e. if we want to tell the threads to finish their work and, after, to stop processing the shared queue), we should use $thread->kill('KILL')->join():
foreach my $thread ( threads->list() ) { # for each thread...
$thread->kill('KILL')->join(); # wait the thread finish its work and kill it...
print "Number of items in the queue: " . $queue->pending() . "\n" if defined $queue->pending();
print "Number of current threads: " . threads->list() . "\n";
}
On the other hand, in the latter case (i.e. if we want to kill the threads immediately), we should use $thread->kill('KILL')->kill():
foreach my $thread ( threads->list() ) { # for each thread...
$thread->kill('KILL')->kill(); # kill the thread immediately...
print "Number of items in the queue: " . $queue->pending() . "\n" if defined $queue->pending();
print "Number of current threads: " . threads->list() . "\n";
}
Of course, if you want to kill the thread from within itself, you just need to call threads->exit() or simply use return:
sub do {
...
threads->exit(); # kill the thread...
...
}
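For reference, a small sketch of thread signalling along the lines of the cancellation pattern in the threads documentation. One caveat from that documentation: thread signals are emulated at the Perl level rather than sent through the OS, so a signal is acted on only after the thread's current operation completes. The sub name `worker`, the worker count, and the nap length are assumptions chosen for illustration.

```perl
use strict;
use warnings;
use threads;

# Worker loops until it is signalled; the handler ends only this thread.
sub worker {
    local $SIG{'KILL'} = sub { threads->exit(); };
    while (1) {
        # Short naps keep each operation brief, so the emulated
        # signal is noticed quickly.
        select(undef, undef, undef, 0.1);
    }
}

my @workers = map { threads->create(\&worker) } 1 .. 3;
sleep 1;    # give the workers time to install their handlers

# kill() returns the thread object, so join() can be chained.
$_->kill('KILL')->join() for @workers;

print "threads still unjoined: " . scalar(threads->list()) . "\n";
```

A thread without a handler for the signal will not stop, so the join() would then wait indefinitely; that is why the sketch pauses before signalling.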

perl thread dies after detach

I need to create a thread and let it do some work. The main goal is to quit the main program as fast as possible.
But after I detach the thread and the main program ends, the thread dies. In my example the file doesn't contain the expected data.
If I let the main program sleep for 5 seconds, everything works fine.
How can I let the thread keep running after the main program quits?
use threads;
print "Starting main program\n";
my $t = threads->new(\&sub1)->detach();
print "End of main program\n";
sub sub1 {
print "started thread\n";
open (FILE, ">>./test.txt") or die($!);
print FILE localtime(time())."\n";
foreach (1..3){
print FILE "$_\n";
sleep 1;
}
close(FILE);
return 1;
}
Quite simply, you can't.
$thr->detach()
Makes the thread unjoinable, and causes any eventual return value to be discarded. When the program exits, any detached threads that are still running are silently terminated.
But you could do:
threads->new(\&sub1)->join();
But that hardly buys you much, because threads are mainly meant for doing other things while you continue to do something else. There is little reason to stop the main line of execution and then run a thread when you could just as well have called sub1();
You can't use threads, but you could fork and let parent process exit without waiting for the child process to finish:
my $pid = fork();
if ($pid == 0) {
print "child proc\n";
sleep 5;
print "child, 5 seconds older\n";
exit(0);
} else {
print "parent proc. goodbye\n";
}
I don't think this is a very good approach, since you've now lost control over the child process. Why not just run it in the main program and background the process?
Edit: This of course also has the potential to generate zombie processes, which will consume system resources, and brains.
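A hedged sketch of the fork approach applied to the file-writing task from the question. `spawn_background` is a hypothetical helper and the output path is made up; `POSIX::setsid` is used here so the child leaves the parent's session, which is an extra step the answer above does not mention.

```perl
use strict;
use warnings;
use POSIX ();
use File::Spec;

# Fork a child that keeps running after the parent carries on.
# $work is a code reference supplied by the caller.
sub spawn_background {
    my ($work) = @_;
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    return $pid if $pid;    # parent: return immediately

    POSIX::setsid();        # child: detach from the parent's session
    $work->();
    exit 0;
}

# Hypothetical output file in the system temp directory.
my $file = File::Spec->catfile(File::Spec->tmpdir(), "background-demo.$$.txt");

my $pid = spawn_background(sub {
    open(my $fh, '>>', $file) or die "cannot open $file: $!";
    print {$fh} scalar(localtime), "\n";
    for my $n (1 .. 3) {
        print {$fh} "$n\n";
        sleep 1;
    }
    close($fh);
});
print "parent done, child $pid continues\n";
```

If the parent keeps running instead of exiting, it should eventually reap the child with waitpid, or set `$SIG{CHLD} = 'IGNORE'` before forking, to avoid leaving a zombie.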
