Run bash scripts from Perl threads - multithreading

My script should have n subroutines (my_proc) to run simultaneously, each of them runs bash script and one sub (check_procs) checks if subs has finished.
use strict;
use threads;
use threads::shared;
my %proc_status :shared;
my %thr;
foreach my $i (1,2,3,4) {
$proc_status{$i}=0;
}
sub my_proc {
my $arg=shift(#_);
while (1) {
sleep(2);
print "Proc $arg Started\n";
#exec("/bin/bash","sleep_for_10_sec.bash") or die("Can't exec"); # case 1
#`sleep_for_10_sec.bash &`; # case 2
print "Proc $arg Finished\n";
{
lock(%proc_status);
$proc_status{$arg}=1;
}
}
}
sub check_procs {
my $all_finished;
while (! $all_finished) {
sleep 5;
print "CHECK: \n";
$all_finished=1;
foreach my $num (1,2,3,4) {
if ($proc_status{$num} == 1) {
print "CHECK: procedure $num has finished\n";
} else {
$all_finished=0;
}
}
}
print "All jobs finished\n";
}
foreach my $num (1,2,3,4) {
$thr{"$num"} = new threads \&my_proc,$num;
}
my $thr_check= new threads \&check_procs;
$thr_check->join();
And here are the sleep_for_10_sec.bash
ls
# bunch of other stuff
sleep 10
echo "finished sleep"
I don't want my_proc subs to wait "sleep_for_10_sec.bash" command to be executed, after browsing I have found that either #case1 or #case2 should work, but they both fail.
the output of #case1:
Proc 1 Started
[ls result]
finsihed sleep
the output of #case2:
Proc 1 Started
Proc 2 Started
Proc 3 Started
Proc 4 Started
CHECK:
CHECK:
Proc 4 Finished
Proc 2 Finished
Proc 3 Finished
Proc 1 Finished
Proc 3 Started
Proc 1 Started
Proc 2 Started
Proc 4 Started
CHECK:
CHECK: procedure 1 has finished
CHECK: procedure 2 has finished
CHECK: procedure 3 has finished
CHECK: procedure 4 has finished
But I expect something like this :
Proc 1 Started
Proc 2 Started
Proc 3 Started
Proc 4 Started
Proc 1 Finished
Proc 1 Started
Proc 3 Finished
Proc 3 Started
Proc 4 Finished
Proc 4 Started
Proc 2 Finished
Proc 2 Started
CHECK:
CHECK:
CHECK:
CHECK: procedure 1 has finished
CHECK: procedure 2 has finished
CHECK: procedure 3 has finished
CHECK: procedure 4 has finished
Actually I get wanted result in case of redirecting output to " > log", but anyway after:
Proc 1 Started
Proc 2 Started
Proc 3 Started
Proc 4 Started
it waits "sleep_for_10_sec.bash" to finish.
This is my first project where I use "thread" and "exec", could someone help me on this ?

exec shouldn't be combined with threads. exec launches a new program within the current process, so when you call exec from one thread, the program the threads were executing disappears. Since the threads would have no program to execute, exec kills the threads as well.
It's not clear to me why case 2 doesn't work (edit: see ikegami's comment below). I would think it would launch the process, run it in the background, and allow the Perl thread to immediately continue. It doesn't seem to do that, but this code will:
system("/bin/bash sleep_for_10_sec.bash &"); # case 3

exec("/bin/bash","sleep_for_10_sec.bash") or die("Can't exec"); # case 1
exec replaces the program running in the current process with another program. At the same time, the existing threads are terminated (since the program they want to execute is no longer there), replaced with a single thread executing the new program.
This means that exec never returns (except on error). Threads or no threads, exec is not what you want, because you don't want your program to stop running.
But I expect something like this:
Are you sure you want to launch sleep_for_10_sec.bash 4 times every two seconds (meaning you can have up to 20 of them running at a time) as your desired output indicates?
Are you sure you don't care if sleep_for_10_sec.bash completes or not as your desired output indicates?
If so, why are you using threads at all? You could simply use the following:
sub start {
my $num = shift;
say "Proc $num Started";
system('bash -c sleep_for_10_sec.bash &');
say "Proc $num Finished";
}
for my $pass (1..2) {
start($_) for 1..4;
sleep 2;
start($_) for 1..4;
sleep 2;
start($_) for 1..4;
sleep 1;
if ($pass == 1) {
say "CHECK:";
} else {
say "CHECK: procedure $_ has finished" for 1..4;
}
}
I think you want
use threads;
use Thread::Queue qw( ); # 3.01+
use constant NUM_WORKERS => 4;
sub worker {
my $num = shift;
say "Job $num Started\n";
system("sleep_for_10_sec.bash"); # Make sure starts with #! and is executable.
say "Job $num Finished\n";
}
{
my $q = Thread::Queue->new();
for (1..NUM_WORKERS) {
while (defined( my $job = $q->dequeue() )) {
worker($job);
}
}
$q->enqueue(1..4, 1..4);
$q->end();
$_->join() for threads->list;
}

Related

Using threads in loop

I want to use threads in loops. The way I want to use this is that start threads in a loop and wait for them to finish. Once all threads are finished then sleep for some predefined number of time and then start those threads again.
Actually I want to run these threads once every hour and that is why I am using sleep. Also I know that hourly run can be done via cron but I can't do that so I am using sleep.
I am getting this error when I am trying to run my code:
Thread already joined at ex.pl line 33.
Perl exited with active threads:
5 running and unjoined
0 finished and unjoined
0 running and detached
This is my code:
use strict;
use warnings;
use threads;
use Thread::Queue;
my $queue = Thread::Queue->new();
my #threads_arr;
sub main {
$queue->enqueue($_) for 1 .. 5;
$queue->enqueue(undef) for 1 .. 5;
}
sub thread_body {
while ( my $num = $queue->dequeue() ) {
print "$num is popped by " . threads->tid . " \n";
sleep(5);
}
}
while (1) {
my $main_thread = threads->new( \&main );
push #threads_arr, threads->create( \&thread_body ) for 1 .. 5;
$main_thread->join();
foreach my $x (#threads_arr) {
$x->join();
}
sleep(1);
print "sleep \n";
}
Also I am able to see other questions similar to this but I am not able to get any of them.
Your #threads_arr array never gets cleared after you join the first 5 threads. The old (already joined) threads still exist in the array the second time around the loop, so Perl throws the "Thread already joined" error when attempting to join them. Clearing or locally initializing #threads_arr every time around the loop will fix the problem.
#threads_arr=(); # Reinitialize the array
push #threads_arr, threads->create( \&thread_body ) for 1 .. 5;

Perl: Run multiple system commands at once

In perl, I have some code like
my $enter = `curl -s -m 10 http://URL`;
How would I use threading to run this function 10 times at once?
I found this but I am not sure how to use it to set a specific amount of threads
Edit: I guess I misunderstood what Threads::Queue was doing. My original question still stands for simultaneously running multiple commands at once.
You can use fork(). In this example, I use the Parallel::ForkManager module. $max_forks is the number of processes to run simultaneously (set to two for an example), and you'd put your system/curl code after ### add curl logic here, and remove the print() and sleep() example statements from there as well.
#!/usr/bin/perl
use warnings;
use strict;
use Parallel::ForkManager;
my $max_forks = 2;
my $fork = new Parallel::ForkManager($max_forks);
my #urls = (
'http://perlmonks.org',
'http://stackoverflow.com',
'http://slashdot.org',
'http://wired.com',
);
# on start callback
$fork->run_on_start(
sub {
my $pid = shift;
print "Starting PID $pid\n";
}
);
# on finish callback
$fork->run_on_finish(
sub {
my ( $pid, $exit, $ident, $signal, $core) = #_;
if ($core){
print "PID $pid core dumped.\n";
}
else {
print "PID $pid exited with exit code $exit " .
" and signal $signal\n";
}
}
);
# forking code
for my $url (#urls){
$fork->start and next;
### add curl logic here
print "$url\n";
sleep(2);
$fork->finish;
}
$fork->wait_all_children;

Log message in perl every 90 seconds in the parent process as long as the child process still runs

I just passed over from php to perl due to my company's request so even if this may be a silly question is kind of nerve wreaking right now.
I have one little perl script deployed on a server through a debian package. I have this all figured out so that's all cool.
Now this script is called from another server through an SSH connection and the script logs back to that server all its actions. I use Log::Log4perl for that.
One of the tasks takes a very long time and also runs some other scripts in the process. The ssh connection has a set timeout of 5 minutes unless I log something back. So I figured out I would create a child process to run the task and let the parent process log back every 90 (or whatever) seconds. My issue is that I don't want to use sleep because if the task is finished sooner it will mess up the log.
I have also tried using Time, Time::HiRes and alarm, but they all mess up my log one way or another.
This is my code:
$log->info("uid $uid: calling the configure script for operation $mode,on $dst_path");
my $pid = fork();
die "Could not fork\n" if not defined $pid;
if ( $pid == 0 ) {
configure( $script_dir, $mode, $node, $uid, $gid); # this also uses a parallel process in its execution, but we don't have a non blocking wait
}
while ( !waitpid( $pid, WNOHANG ) ) {
sleep(90);
if ( !$pid ) {
$log->info("Still waiting for the process to finish"); # this should come up every 90 seconds of so
}
}
$log->info("uid $uid: configure script executed"); # this should come up only once, now I get it every 90 seconds
# do other stuff here after the execution of the configure sub is done
Unfortunately I inherited this architecture as it is and cannot change it because there are a lot of services based on it.
If you don't want to sleep, you can call select with a timeout. To implement this reliably, you can employ the self-pipe trick which involves creating a pipe, writing to the pipe in a SIGCHLD handler, and making the select call wait on the pipe's read handle.
Here's a simple example:
#!/usr/bin/perl
use strict;
use warnings;
use Errno qw(EINTR);
use Fcntl qw(F_GETFL F_SETFL O_NONBLOCK);
use Symbol qw(gensym);
sub make_non_blocking {
my $handle = shift;
my $flags = fcntl($handle, F_GETFL, 0)
or die("F_GETFL: $!");
fcntl($handle, F_SETFL, $flags | O_NONBLOCK)
or die("F_SETFL: $!");
}
my ($read_handle, $write_handle) = (gensym, gensym);
pipe($read_handle, $write_handle)
or die("pipe: $!");
make_non_blocking($read_handle);
make_non_blocking($write_handle);
local $SIG{CHLD} = sub {
syswrite($write_handle, "\0", 1);
};
my $pid = fork();
die("fork: $!") if !defined($pid);
if ($pid == 0) {
sleep(10);
exit;
}
my $rin = '';
vec($rin, fileno($read_handle), 1) = 1;
while (1) {
my $nfound = select(my $rout = $rin, undef, undef, 2);
if ($nfound < 0) {
# Error. Must restart the select call on EINTR.
die("select: $!") if $! != EINTR;
}
elsif ($nfound == 0) {
# Timeout.
print("still running...\n");
}
else {
# Child exited and pipe was written to.
last;
}
}
waitpid($pid, 0);
close($read_handle);
close($write_handle);
I tried to run the code and noticed a few things that may be your issue, but without knowing what configure does, I can't be sure. Here's what I found:
The child process doesn't exit after calling configure
waitpid does not change the value of $pid, so $pid is always 0 in the child and always the pid of the child in the parent.
What this means is that the parent is never writing out "Still waiting for the process to finish", the child is writing it out every 90 seconds after it completes it's call to configure.
Additionally, the child should print that message ever 90 seconds forever because it's waiting for pid 0 to send it the CHLD signal which won't happen because it doesn't have a child with pid 0.
I updated your code with a few stubs that does what I think you want (on a slightly tighter timeline because I don't like to wait :) ). My code makes the following assumptions that you may wish to change:
Log the waiting message every second
The child always exits with a status value of 0
Here's my code:
#!/usr/bin/env perl
use strict;
use warnings;
use Log::Log4perl qw(:easy);
use POSIX qw(:sys_wait_h);
Log::Log4perl->easy_init();
my ($uid,$mode,$dst_path,$script_dir,$node,$gid) = (0..5);
my $log = get_logger();
$log->info("uid $uid: calling the configure script for operation $mode,on $dst_path");
my $pid = fork();
die "Could not fork\n" if not defined $pid;
if ( $pid == 0 ) {
configure( $script_dir, $mode, $node, $uid, $gid); # this also uses a parallel process in its execution, but we don't have a non blocking wait
exit(0);
}
my $zombie;
while ( ($zombie = waitpid( $pid, WNOHANG ) ) != $pid) {
$log->info("Still waiting for the process to finish"); # this should come up every 90 seconds of so
sleep(1);
}
$log->info("uid $uid: configure script executed"); # this should come up only once, now I get it every 90 seconds
# do other stuff here after the execution of the configure sub is done
sub configure {
sleep 10;
}

Breaking out of a while loop with system commands in Perl using Ctrl-C (SIGINT)?

Consider the following example, test.pl:
#!/usr/bin/env perl
use 5.10.1;
use warnings;
use strict;
$SIG{'INT'} = sub {print "Caught Ctrl-C - Exit!\n"; exit 1;};
$| = 1; # turn off output line buffering
use Getopt::Long;
my $doSystemLoop = 0;
GetOptions( "dosysloop"=>\$doSystemLoop );
print("$0: doSystemLoop is:$doSystemLoop (use " . (($doSystemLoop)?"system":"Perl") . " loop); starting...\n");
my $i=0;
if (not($doSystemLoop)) { # do Perl loop
while ($i < 1e6) {
print("\tTest value is $i");
$i++;
sleep 1;
print(" ... ");
sleep 1;
print(" ... \n");
}
} else { # do system call loop
while ($i < 1e6) {
system("echo","-ne","\tTest value is $i");
$i++;
system("sleep 1");
system("echo","-ne"," ... ");
system("sleep 1");
system("echo","-e"," ... ");
}
}
So, if I call this program, so it uses a usual Perl loop, everything is as expected:
$ perl test.pl
test.pl: doSystemLoop is:0 (use Perl loop); starting...
Test value is 0 ... ...
Test value is 1 ... ...
Test value is 2 ... ^CCaught Ctrl-C - Exit!
$
... that is, I hit Ctrl-C, program exits instantly.
However, if the while loop's commands consist mostly of system calls, then it becomes nearly impossible to exit with Ctrl-C:
$ perl test.pl --dosysloop
test.pl: doSystemLoop is:1 (use system loop); starting...
Test value is 0 ... ...
Test value is 1 ... ...
Test value is 2 ... ^C ...
Test value is 3 ... ^C ...
Test value is 4 ... ^C ...
Test value is 5^C ... ^C ...
Test value is 6^C ... ^C ...
Test value is 7^C ... ^C ...
Test value is 8^C ... ^C ...
Test value is 9^C ... ^C ...
Test value is 10 ... ^C ...
Test value is 11^C ... ^C ...
Test value is 12^C ... ...
Test value is 13^Z
[1]+ Stopped perl test.pl --dosysloop
$ killall perl
$ fg
perl test.pl --dosysloop
Terminated
$
So in the snippet above, I'm hitting Ctrl-C (the ^C) like mad, and the program ignores me completely :/ Then I cheat by hitting Ctrl-Z (the ^Z), which stops the process and sets in the background; then in the resulting shell I do killall perl, and after that I execute the fg command, which places the Perl job back in the foreground - where it finally terminates due to the killall.
What I would like to have, is run a system loop like this, with the possibility to break out of it/exit it with the usual Ctrl-C. Is this possible to do, and how do I do that?
Perl's signal handling mechanism defers the handling of signals until a safe point. Deferred signals are checked between Opcodes of the perl VM. As system and friends count as a single opcode, signals are only checked once the exec'd command has terminated.
This can be circumvented by forking, and then waiting in a loop for the child process to terminate. The child can also be terminated early via a signal handler.
sub launch_and_wait {
my $wait = 1;
my $child;
local $SIG{CHLD} = sub {
$wait = 0;
};
local $SIG{INT} = sub {
$wait = 0;
kill KILL => $child if defined $child;
};
if ($child = fork()) {
# parent
while ($wait) {
print "zzz\n";
sleep 1;
}
wait; # try to join the child
} else {
# child
exec {$_[0]} #_;
}
}
launch_and_wait sleep => 60;
print "Done\n";
There are probably lots of ways this can go wrong (getting a SIGINT before the child was spawned…). I also omitted any error handling.
Check the exit status of the system() command for any signals. An external command interrupted with SIGINT will get a "2" here:
while () {
system("sleep", 1);
if ($? & 127) {
my $sig = $? & 127;
die "Caught signal INT" if $sig == 2; # you may also abort on other signals if you like
}
}

Perl multithreading and foreach

I am writing a simple Perl script that should run other Perl scripts simultaneously. I don't know how to make the main program wait for running threads. Sleep IS NOT a suitable solution because of many reason. This is my "main" script:
#!/usr/bin/perl
use threads;
main:
{
if ($#ARGV == -1) { usage(); exit(1); }
my $hostname = $ARGV[0];
my $thrssh = threads ->create(\&ssh, $hostname);
my $thrdns = threads ->create(\&dns, $hostname);
my $thrping = threads ->create(\&ping, $hostname);
my $thrsmtp = threads ->create(\&smtp, $hostname);
my $thrproxy = threads ->create(\&proxy, $hostname);
}
sub ssh {
threads->detach();
my $hostname = #_;
#print "SSH\n";
#LAUNCH SSH SCRIPT
}
#OTHER SUBROUTINES...
sub proxy {
threads->detach();
my $hostname = #_;
#print "PROXY\n";
#LAUNCH PROXY SCRIPT
}
If I try to run this script the first thing I can notice is that prints are "sequential", I thought text was messed up but maybe Print is exclusive I don't know. Main problem is that the last two subroutines don't have time to be executed.
SSH
DNS
PING
Perl exited with active threads:
2 running and unjoined
0 finished and unjoined
0 running and detached
If I use join instead of detach the subroutines become "sequential", for example if I put a sleep in sub ssh other threads will wait before starting. I want them to be parallel and the main program to close ONLY when all threads are finished, any help?
I actually have another question, if I have to run multiple threads in a foreach cicle when I am supposed to join them? ie:
my $thrssh;
foreach $mynode ($nodeset->get_nodelist) {
#...
$thrssh = threads ->create(\&ssh, $port, $hostname);
#...
}
$thssh->join();
Is that right?
You need to join all your threads from the main thread after they have all been started. Don't detatch if you don't want the interpreter to exit while the threads are still running.
...
my $thrssh = threads ->create(\&ssh, $hostname);
my $thrdns = threads ->create(\&dns, $hostname);
...
$thrssh->join();
$thrdns->join();
...
Regarding your edit: no, it's not right. You need to keep a reference to each thread you create, otherwise you can't join them.
Do something like:
my #thrs;
foreach $mynode ($nodeset->get_nodelist) {
#...
$thrssh = threads ->create(\&ssh, $port, $hostname);
push #thrs, $thrssh;
#...
}
$_->join() for #thrs;

Resources