Invoke multiple threads by perl system function - multithreading

I'd like to invoke multiple perl instances/scripts from one perl script. Please see the simple script below which illustrates the problem nicely:
my @filenames = ("file1.xml", "file2.xml", "file3.xml", "file4.xml");
foreach my $file (@filenames)
{
    # Script which parses the XML file
    system("perl parse.pl $file");
    # Go on, don't wait till parse.pl has finished
}
As I'm on a quad-core CPU and the parsing of a single file takes a while, I want to split the job. Could someone point me in a good direction?
Thanks and best,
Tim

There are several ways to take advantage of multiple cores for an implicitly parallel workload like this.
The most obvious is to append an ampersand to your system call, so the shell kicks it off in the background.
my @filenames = ("file1.xml", "file2.xml", "file3.xml", "file4.xml");
foreach my $file (@filenames)
{
    # Script which parses the XML file
    system("perl parse.pl $file &");
    # Go on, don't wait till parse.pl has finished
}
That's pretty simplistic, but should do the trick. The downside of this approach is it doesn't scale too well - if you had a long list of files (say, 1000?) then they'd all kick off at once, and you may drain system resources and cause problems by doing it.
So if you want a more controlled approach - you can use either forking or threading. Forking uses the fork() system call, and starts duplicate process instances.
use Parallel::ForkManager;

my $manager = Parallel::ForkManager -> new ( 4 ); # number of CPUs
my @filenames = ("file1.xml", "file2.xml", "file3.xml", "file4.xml");
foreach my $file (@filenames)
{
    # Fork a worker; the parent immediately moves on to the next file
    $manager -> start and next;
    # Child: replace itself with the parser (one process per worker)
    exec("perl", "parse.pl", $file) or die "exec: $!";
    $manager -> finish;
}
# and if you want to wait:
$manager -> wait_all_children();
And if you wanted to do something that involved capturing output and post-processing it, I'd suggest thinking in terms of threads and Thread::Queue. But that's unnecessary if there's no synchronisation required.
(If you're thinking that might be useful, I'll offer:
Perl daemonize with child daemons)
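For instance, a rough sketch of that threads-plus-queue pattern (assuming a Thread::Queue recent enough to provide end(), and re-using the parse.pl call from the question) might look like this:

use strict;
use warnings;
use threads;
use Thread::Queue;

my $work    = Thread::Queue->new("file1.xml", "file2.xml", "file3.xml", "file4.xml");
my $results = Thread::Queue->new();
$work->end();                          # nothing more will be queued

my @workers = map {
    threads->create(sub {
        # each worker pulls filenames until the queue is drained
        while (defined(my $file = $work->dequeue())) {
            my $output = `perl parse.pl $file`;    # capture the child's output
            $results->enqueue("$file:\n$output");
        }
    });
} 1 .. 4;                              # one worker per core

$_->join() for @workers;
$results->end();

# post-process the captured output in the parent
while (defined(my $chunk = $results->dequeue_nb())) {
    print $chunk;
}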
Edit: Amended based on comments. Ikegami correctly points out:
system("perl parse.pl $file"); $manager->finish; is wasteful (three processes per worker). Use: exec("perl", "parse.pl", $file) or die "exec: $!"; (one process per worker).

Related

How to speed up Parallel::ForkManager in perl

I am using an Amazon EC2 server to perform data processing of 63 files.
The server I am using has 16 cores, but with Perl's Parallel::ForkManager and the number of workers set equal to the number of cores, it seems like half the cores are sleeping and the working cores are not at 100% - they fluctuate around 25%~50%.
I also checked IO and it is mostly idle.
use Sys::Info;
use Sys::Info::Constants qw( :device_cpu );
my $info = Sys::Info->new;
my $cpu = $info->device( CPU => %options );
use Parallel::ForkManager;
my $manager=new Parallel::ForkManager($cpu->count);
for($i=0;$i<=$#files_l;$i++)
{
    $manager->start and next;
    do_stuff($files_l[$i]);
    $manager->finish;
}
$manager->wait_all_children;
The short answer is - we can't tell you, because it depends entirely on what 'do_stuff' is doing.
The major reasons why parallel code doesn't create linear speed increases are:
Process creation overhead - some 'work' is done to spawn a process, so if the children are trivially small, that 'wastes' effort.
Contended resources - the most common is disk IO, but things like file locks, database handles, sockets, or interprocess communication can also play a part.
Something else causing a 'back off' that stalls a process.
And without knowing what 'do_stuff' does, we can't second guess what it might be.
However I'll suggest a couple of steps:
Double the number of processes to twice the CPU count. That's often a 'sweet spot', because it means that any non-CPU delay in one process just lets one of the others go full speed.
Try strace -fTt <yourprogram> (if you're on Linux; the commands are slightly different on other Unix variants). Then do it again with strace -fTtc, because the c will summarise syscall run times. Look at which ones take the most 'time'.
Profile your code to see where the hot spots are. Devel::NYTProf is one library you can use for this.
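For example, a typical Devel::NYTProf run looks like this (assuming the module is installed from CPAN):

perl -d:NYTProf yourscript.pl    # writes profile data to ./nytprof.out
nytprofhtml                      # turns it into an HTML report under ./nytprof/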
And on a couple of minor points:
my $manager=new Parallel::ForkManager($cpu->count);
Would be better written as:
my $manager = Parallel::ForkManager -> new ( $cpu->count );
rather than using indirect object notation.
If you are just iterating @files_l, then it might be better not to use a loop counter variable and instead write:
foreach my $file ( @files_l ) {
    $manager -> start and next;
    do_stuff($file);
    $manager -> finish;
}

perl fork() doesn't seem to utilize the cores, but cpu only

I've got a perl program that tries to convert a bunch of files from one format to another (via a command-line tool). It works fine, but it's too slow as it converts the files one at a time.
I researched and used the fork() mechanism, trying to spawn off all the conversions as child forks in the hope of utilizing the CPUs/cores.
The coding is done and tested; it does improve performance, but not to the extent I expected.
When looking at /proc/cpuinfo, I have this:
> egrep -e "core id" -e ^physical /proc/cpuinfo|xargs -l2 echo|sort -u
physical id : 0 core id : 0
physical id : 0 core id : 1
physical id : 0 core id : 2
physical id : 0 core id : 3
physical id : 1 core id : 0
physical id : 1 core id : 1
physical id : 1 core id : 2
physical id : 1 core id : 3
That means I have 2 CPUs with four cores each? If so, I should be able to fork 8 child processes, and supposedly an 8-minute job (1 min per file, 8 files) should finish in 1 minute (8 forks, 1 file per fork).
However, when I test run this, it still takes 4 minutes to finish. It appears it only utilized the 2 CPUs, but not the cores?
Hence, my question is:
Is it true that perl's fork() only parallelizes across CPUs, but not cores? Or maybe I didn't do it right? I'm simply using fork() and wait(). Nothing special.
I'd assume perl's fork() should be using the cores; is there a simple bash/perl test I can write to prove whether my OS (i.e. RedHat 4) or Perl is the culprit for this symptom?
To Add:
I even tried running the following command multiple times to simulate multiple processes while monitoring htop.
while true; do echo abc >>devnull; done &
Somehow htop is telling me I've got 16 cores? And when I spawn 4 of the above while-loops, I see 4 of them utilizing ~100% CPU each. When I spawn more, all of them start reducing their CPU utilization percentage evenly (e.g. with 8 running, I see 8 bash processes in htop, each using ~50%). Does this mean something?
Thanks in advance. I tried googling around but was not able to find an obvious answer.
Edit: 2016-11-09
Here is an extract of the Perl code. I'm interested to see what I did wrong here.
my $maxForks = 50;
my $forks = 0;

while (<CIFLIST>) {
    extractPDFByCIF($cifNumFromIndex, $acctTypeFromIndex, $startDate, $endDate);
}

for (1 .. $forks) {
    my $pid = wait();
    print "Child fork exited. PID=$pid\n";
}

sub extractPDFByCIF {
    # doing SQL constructing for the $stmt to do a DB query
    $stmt->execute();
    while ($stmt->fetch()) {
        # fork the copy/afp2web process into a child process
        if ($forks >= $maxForks) {
            my $pid = wait();
            print "PARENTFORK: Child fork exited. PID=$pid\n";
            $forks--;
        }
        my $pid = fork;
        if (not defined $pid) {
            warn "PARENTFORK: Could not fork. Do it sequentially with parent thread\n";
        }
        if ($pid) {
            $forks++;
            print "PARENTFORK: Spawned child fork number $forks. PID=$pid\n";
        } else {
            print "CHILDFORK: Processing child fork. PID=$$\n";
            # prevent child fork from destroying dbh from parent thread
            $dbh->{InactiveDestroy} = 1;
            undef $dbh;
            # perform the conversion as usual
            if ($fileName =~ m/.afp/) {
                system("file-conversion -parameter-list");
            } elsif ($fileName =~ m/.pdf/) {
                system("cp $from-file $to-file");
            } else {
                print ERRORLOG "Problem happened here\r\n";
            }
            exit;
        }
        # end forking
    }
    $stmt->finish();
    close(INDEX);
}
fork() spawns a new process - identical to, and with the same state as the existing one. No more, no less. The kernel schedules it and runs it wherever.
If you do not get the results you're expecting, I would suggest that a far more likely limiting factor is that you are reading files from your disk subsystem - disks are slow, and contending for IO doesn't actually make them any faster - if anything the opposite, because it forces additional drive seeks and makes caching less effective.
So specifically:
1/ No, fork() does nothing more than clone your process.
2/ Largely meaningless unless you want to rewrite most of your algorithm as a shell script. There's no real reason to think that it'll be any different though.
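If you do want a quick sanity check in Perl itself (a minimal sketch, not your real workload), fork a few purely CPU-bound children and watch them in top/htop - each one should sit near 100% on its own core if the scheduler is spreading them out:

#!/usr/bin/perl
use strict;
use warnings;

my $children = 4;
for ( 1 .. $children ) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ( $pid == 0 ) {
        # pure CPU work - no IO, so nothing to contend for
        my $x = 0;
        $x = ( $x + 1 ) % 7 for 1 .. 100_000_000;
        exit 0;
    }
}
wait() for 1 .. $children;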
To follow on from your edit:
system('file-conversion') looks an awful lot like an IO based process, which will be limited by your disk IO. As will your cp.
Have you considered Parallel::ForkManager which greatly simplifies the forking bit?
As a lesser style point, you should probably use 3-arg 'open' (see the short sketch after the code below).
#!/usr/bin/env perl
use strict;
use warnings;
use Parallel::ForkManager;
my $maxForks = 50;
my $manager = Parallel::ForkManager->new($maxForks);
while ($ciflist) {
    ## do something with $_ to parse.
    ## instead of: extractPDFByCIF($cifNumFromIndex, $acctTypeFromIndex, $startDate, $endDate);

    # doing SQL constructing for the $stmt to do a DB query
    $stmt->execute();
    while ( $stmt->fetch() ) {
        # fork the copy/afp2web process into a child process
        $manager->start and next;
        print "CHILDFORK: Processing child fork. PID=$$\n";
        # prevent child fork from destroying dbh from parent thread
        $dbh->{InactiveDestroy} = 1;
        undef $dbh;
        # perform the conversion as usual
        if ( $fileName =~ m/.afp/ ) {
            system("file-conversion -parameter-list");
        } elsif ( $fileName =~ m/.pdf/ ) {
            system("cp $from-file $to-file");
        } else {
            print ERRORLOG "Problem happened here\r\n";
        }
        # end forking
        $manager->finish;
    }
    $stmt->finish();
}
$manager->wait_all_children;
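On the 3-arg open point, a small sketch of what that looks like with a lexical filehandle (the $ciflist_path variable is hypothetical, standing in for wherever your CIF index actually lives):

my $ciflist_path = 'cif_index.txt';    # hypothetical path to your CIF index
open my $ciflist_fh, '<', $ciflist_path
    or die "Cannot open $ciflist_path: $!";
while ( my $line = <$ciflist_fh> ) {
    # ... parse the CIF index line ...
}
close $ciflist_fh;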
Your goal is to parallelize your application in a way that uses multiple cores as independent resources. What you want to achieve is multi-threading, in particular Perl's ithreads, which behave much like a fork() of the current interpreter (and are heavy-weight for that reason). You can teach yourself the Perl way of multi-threading from perlthrtut. Quote from perlthrtut:
When a new Perl thread is created, all the data associated with the current thread is copied to the new thread, and is subsequently private to that new thread! This is similar in feel to what happens when a Unix process forks, except that in this case, the data is just copied to a different part of memory within the same process rather than a real fork taking place.
That being said, regarding your questions:
You're not doing it right (sorry). [see my comment...] With multi-threading you don't need to call fork() yourself; Perl manages the parallel execution for you.
You can check whether your Perl interpreter has been built with thread support, e.g. by running perl -V (note the capital V) and looking for useithreads=define in the output. If there is nothing about threads, then your Perl interpreter is not capable of Perl multi-threading.
The reason your application is already faster using fork(), even with only one CPU core, is likely that while one process has to wait for slow resources such as the file system, another process can use the same core for computation in the meantime.

Run Perl Script on Several files Simultaneously

I have written a Perl script that reads in a data file, line by line, does some computations and returns 3 files as outputs; I have also written it so that it reads through every *.csv file in my directory, one file at a time, returning the 3 separate output files for each input file (so for 10 csv input files, when my script is done, I have 30 output files).
However, when I run my script, I see that it only runs on one core. What I would like to do is make my script run simultaneously on several input files: is this even possible? Or, alternatively, what would be a better option? I'm working on a Windows machine.
There are two (main) options for using more processors in Perl.
Threads and forks. There's a certain amount of similarity between them, but some crucial differences. fork() is a native system call on Unix, and it's extremely efficient (it's used a lot). On Windows you don't have it - perl emulates its functionality quite well though.
fork clones your program exactly - it makes a 'parent' and a 'child', and the only difference is the return code from fork. Code resumes from exactly the same point, so you can end up with some slightly strange behaviour.
Note for example when you run:
#!/usr/bin/perl
use strict;
use warnings;
my $pid = fork();
if ( $pid ) {
    print "$$ is the parent - child is $pid\n";
}
else {
    print "$$ is the child\n";
}
You should be aware - every variable that existed before still exists in each 'fork', but it's a separate copy. This leads you to the next challenge, which is inter-process communication. This is a sufficiently big topic that it has its own perl documentation page: perlipc.
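A minimal demonstration of that copy-on-fork behaviour:

#!/usr/bin/perl
use strict;
use warnings;

my $counter = 0;
my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ( $pid ) {
    $counter++;                       # only the parent's copy changes
    waitpid $pid, 0;
    print "parent copy: $counter\n";  # prints 1
}
else {
    sleep 1;                          # give the parent time to increment first
    print "child copy: $counter\n";   # still prints 0
    exit 0;
}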
When it comes to more parallelism though, fork can be somewhat awkward, because it's a low-level call. How many lines do you think this prints?
#!/usr/bin/perl
use strict;
use warnings;
my @fruits = qw ( apple pear lemon lime cucumber );
foreach my $fruit ( @fruits ) {
    my $pid = fork();
    if ( $pid ) {
        print "Parent $$ with a child of $pid has a fruit of $fruit\n";
    }
    else {
        print "Child of $$ has a fruit of $fruit\n";
    }
}
Because each child carries on running the rest of the loop and forking again, the number of processes doubles on every iteration - with five fruits that's 2 + 4 + 8 + 16 + 32 = 62 lines, far more than you might intuitively guess. It's also easy to fork too much with a loop like this, and you can create a denial of service condition.
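With raw fork() the usual guard is to make each child exit as soon as its work is done, so it never runs the rest of the loop; a minimal sketch:

#!/usr/bin/perl
use strict;
use warnings;

my @fruits = qw ( apple pear lemon lime cucumber );
foreach my $fruit ( @fruits ) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ( $pid == 0 ) {
        print "Child of $$ has a fruit of $fruit\n";
        exit 0;           # the child stops here, so it never forks again
    }
}
wait() for @fruits;       # reap one child per fruit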
Fortunately, there is a solution - Parallel::ForkManager implements some simple mechanisms for controlling forks, which makes it a lot smoother.
#!/usr/bin/perl
use strict;
use warnings;
use Parallel::ForkManager;
my @fruits = qw ( apple pear lemon lime cucumber );
my $manager = Parallel::ForkManager -> new ( 5 );
print "Parent: $$\n";
foreach my $fruit ( @fruits ) {
    $manager -> start and next;
    print "Child of $$ - $fruit\n";
    $manager -> finish;
}
$manager -> wait_all_children;
For the sake of completeness - I'll also mention threads. They're another way of doing things, but, slightly counter-intuitively, they're not lightweight the way they are in other languages. They also have a status of 'discouraged':
The "interpreter-based threads" provided by Perl are not the fast, lightweight system for multitasking that one might expect or hope for. Threads are implemented in a way that make them easy to misuse. Few people know how to use them correctly or will be able to provide help.
The use of interpreter-based threads in perl is officially discouraged.
So where forks are fairly easy to have lots and lots of, threads are basically best thought of as separate processes.
#!/usr/bin/perl
use strict;
use warnings;
use threads;
sub thread_sub {
    print threads -> self -> tid() . ": @_\n";
}
my @fruits = qw ( apple pear lemon lime cucumber );
foreach my $fruit ( @fruits ) {
    threads -> create ( \&thread_sub, $fruit );
}
foreach my $thr ( threads -> list ) {
    $thr -> join;
}
In either case though, you should be aware - parallel processing means your code no longer runs in an obvious sequential fashion. This means you'll get some really fruity and funky bugs that will be horrific to debug if you're not careful. So make sure your code works sequentially first, before even trying to go near parallelism.
You should also be aware - you only get linear performance improvements for as long as your limiting factor is purely CPU. It generally isn't. Disk IO is invariably much slower. You mention processing several files. If the emphasis is on processing, rather than reading data - then parallelism will help.
But disks are pretty slow, and 'thrashing' them by trying to stream data from multiple locations will slow them down more. So you don't gain much - if anything - by parallelising IO-intensive tasks (disk traversals, bulk file reads etc.), and you can quite easily make matters worse.

How can I get a value from a child process?

I have a script and at some part I fork some processes to do a task and the main process waits for all children to complete.
So far all ok.
Problem: I am interested in getting the max time that each child process spent while working on what it had to do.
What I do now is just look at the logs, where I print the time spent on each action the child process did, and try to figure out the times more or less.
I know that the only way to get something back from a child process is via some sort of shared memory, but I was wondering whether, for this specific problem, there is a "ready"/easy solution.
I mean, in order to get the times back and have the parent process print them in a nice fashion in one place.
I thought there could be a better way than just checking all over the logs....
Update based on comments:
I am not interested in the times of the child processes, i.e. which child took the most time to finish. Each child process is working on X tasks, and each of the tasks takes in the worst case Y secs to finish. I am looking to find the Y, i.e. the most time it took for a child process to finish one of the X tasks.
The biggest limitation of fork() is that it doesn't do IPC as easily as threads. Aside from trapping when a process starts and exits, anything more involved has a whole segment of the perl documentation to itself: perlipc.
What I would suggest is that what you probably want is a pipe, connected between the parent and the children.
Something like this (not tested yet, I'm on a Windows box!)
use strict;
use warnings;
use Parallel::ForkManager;
my $manager = Parallel::ForkManager -> new ( 5 ) ;
pipe ( my $read_handle, my $write_handle );
for ( 1..10 ) {
    $manager -> start and next;
    close ( $read_handle );                             # the child only writes
    print {$write_handle} "$$ - child says hello!\n";
    $manager -> finish;
}
close ( $write_handle );    # parent only reads; EOF arrives once every child has exited
while ( <$read_handle> ) { print; }
$manager -> wait_all_children();
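If you're already using Parallel::ForkManager, another option is its built-in data retrieval (available in reasonably recent versions): the child passes a reference to finish() and the parent picks it up in a run_on_finish callback. A sketch of tracking the worst single-task time that way (the inner task loop is a hypothetical stand-in for your real work):

use strict;
use warnings;
use Time::HiRes qw( time );
use Parallel::ForkManager;

my $manager = Parallel::ForkManager->new(5);
my $max_task_time = 0;

# runs in the parent every time a child finishes
$manager->run_on_finish( sub {
    my ( $pid, $exit_code, $ident, $signal, $core_dump, $data ) = @_;
    $max_task_time = $data->{worst}
        if $data && $data->{worst} > $max_task_time;
} );

for my $child ( 1 .. 10 ) {
    $manager->start($child) and next;
    my $worst = 0;
    for my $task ( 1 .. 5 ) {            # X tasks per child
        my $t0 = time;
        # ... do one task here ...
        my $elapsed = time - $t0;
        $worst = $elapsed if $elapsed > $worst;
    }
    $manager->finish( 0, { worst => $worst } );   # ship the result to the parent
}
$manager->wait_all_children;
print "Longest single task across all children: ${max_task_time}s\n";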

why isn't my thread joinable in perl?

I wrote a very short script in Perl and used multi-threading in it.
My problem is that the thread I created is not joinable. So I am wondering: what are the conditions for making a thread joinable?
What are the limits on threads in Perl?
#!/usr/bin/env perl
#
#
use lib "$::XCATROOT/lib/perl";
use strict;
use threads;
use Safe;
sub test
{
    my $parm = shift;
}
my $newchassis = ["1", "2", "3"];
my @snmp_threads;
for my $item (@$newchassis)
{
    my $thread = threads->create(\&test, $item);
    push @snmp_threads, $thread;
}
for my $t (@snmp_threads)
{
    $t->join();
}
This can be very tricky, as it works fine on RHEL 6.3 but fails on SLES 11sp2.
Though there is no code, I will go ahead and assume that you are using $_->join foreach @threads; for joining the threads. Now, joining the threads depends on the post-processing. Without seeing your code it's difficult to know what you are doing, but how it works is:
If the post-processing step needs all threads to finish before
beginning work, then the wait for individual threads is unavoidable.
If the post-processing step is specific to the results of each
thread, it should be possible to make the post-processing part of
the thread itself.
In both cases, $_->join foreach @threads; is the way to go.
If there is no need to wait for the threads to finish, use the
detach command instead of join. However, any results that the
threads may return will be discarded.
Are you sure you have provided a valid post-processing scenario for your activity?
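For reference, a minimal sketch contrasting join and detach (assuming a thread-enabled perl):

#!/usr/bin/env perl
use strict;
use warnings;
use threads;

# join(): the parent blocks until the thread finishes and can collect its result
my @threads = map { threads->create( sub { return "done: $_[0]" }, $_ ) } 1 .. 3;
print $_->join(), "\n" for @threads;

# detach(): fire and forget - the return value is discarded and
# the thread can no longer be joined
my $bg = threads->create( sub { sleep 1 } );
$bg->detach();
sleep 2;    # give the detached thread time to finish before the program exits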
