Thread-safe alternative to File::Tee?

I wanted to implement some logging for a threaded script I have, and I came across File::Tee. However, when attempting to install the module with ppm on a Windows box, it's not found (and according to ActiveState, it is not supported on Windows).
I really liked that you could lock file access though, by doing something like:
tee STDOUT, {mode => '>>', open => "$ENV{DOM}\\threaded_build.log", lock => 1};
tee STDERR, {mode => '>>', open => "$ENV{DOM}\\threaded_debug.log", lock => 1};
Is there a cross-platform, thread-safe alternative?

File::Tee takes extra care to handle output generated by external programs run through system or XS code that doesn't go through perlio. I think that's what makes it incompatible with Windows.
IO::Tee is more cross-platform and I don't think making it thread safe would be too hard to do. The sync code in File::Tee just looks like:
flock($teefh, LOCK_EX) if $target->{lock};
print $teefh $cp;
flock($teefh, LOCK_UN) if $target->{lock};
You could accomplish the same thing in IO::Tee by modifying a couple of methods:
use Fcntl ':flock';
no warnings 'redefine';
sub IO::Tee::PRINT
{
    my $self = shift;
    my $ret = 1;
    foreach my $fh (@$self) {
        flock($fh, LOCK_EX);
        undef $ret unless print $fh @_;
        flock($fh, LOCK_UN);
    }
    return $ret;
}

sub IO::Tee::PRINTF
{
    my $self = shift;
    my $fmt = shift;
    my $ret = 1;
    foreach my $fh (@$self) {
        flock($fh, LOCK_EX);
        undef $ret unless printf $fh $fmt, @_;
        flock($fh, LOCK_UN);
    }
    return $ret;
}
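For completeness, here is a minimal usage sketch (my addition, not from the original answer) showing how the redefined methods would be exercised; the log file name is illustrative:

use IO::Tee;
use IO::File;

# Illustrative log destination; any append-mode handle works.
my $logfh = IO::File->new('>> threaded_build.log')
    or die "cannot open log: $!";

# Every handle in the tee gets the flock/unlock treatment from the
# redefined IO::Tee::PRINT above.
my $tee = IO::Tee->new(\*STDOUT, $logfh);
print $tee "worker done\n";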

Related

Perl threads, sockets and STDIN interference

I am trying to use Perl threads and sockets to create a simple client/server application. The problem is on the client side, when mixing STDIN reading and socket reading in different threads. Note that I did succeed by using Tkx as a workaround for STDIN, but my intention is to build short samples for teaching purposes only, in several programming languages, so I want to keep the program minimal: no UI, as simple as possible.
Here is the code of client, which has the problems:
use strict;
use IO::Socket::INET;
use threads;
our $sock = new IO::Socket::INET ( PeerAddr => 'localhost', PeerPort => 1234, Proto => 'tcp') or die "cannot connect to localhost";
my $thr = threads->create(\&msg_proc, $sock); #, $sock);
for (my $msg; $msg ne "exit";)
{
    $msg = <STDIN>;
    $msg =~ s/[\r\n\s\0]*\Z//g;
    $sock->send("$msg\n");    # or die "can't send";
}
$sock->close();
print "exit main thread\n";
$thr->join() if $thr->is_running();
print "exit all\n";

sub msg_proc
{
    my $svr = shift;
    my $i = 0;
    while ($svr)
    {
        sleep(1);
        $svr->send("{slept $i}\n") or die "server closed connection";
        my $svr_msg = <$svr> or die "server closed connection";
        $svr_msg =~ s/[\r\n\s\0]*\Z//g;
        print "From server: <<$i $svr_msg>>\n";
        $i++;
    }
    print "sock exit\n";
}
The problem starts when I remove the $svr->send line from the thread procedure msg_proc. In that case the client fails to read normally from STDIN; the socket read <$svr> seems to interfere with the <STDIN> read. For some reason the two can't coexist in parallel. Note that the C++ and Java versions of these demos do not have this problem.
This is the code of the server:
use strict;
use IO::Socket::INET;
use threads;
my $sock = IO::Socket::INET -> new (LocalPort => 1234, Proto => 'tcp', Listen => 1, Reuse => 1) or die "Could not create socket: $!\n";
print "server started\n";
my $i = 0;
while (my $s = $sock->accept())
{
    print "New connection\n";
    my $thr = threads->create(\&client_proc, $s, $i);
    $s->send("welcome $i\n");
    $i++;
}

sub client_proc
{
    my $client = shift;
    my $client_no = shift;
    print "## client $client_no started\n";
    $client->send("hello $client_no\n");
    for (my $msg; $msg ne "exit";)
    {
        $msg = <$client> or die "client $client_no closed connection";
        $msg =~ s/[\r\n\s\0]*\Z//;
        print "From client $client_no: '$msg' len: ", length($msg), "\n";
        $client->send("client $client_no: $msg\n") or die "client $client_no closed connection";
    }
    $client->close();
    print "## client $i exit\n";
}
As I understand from the threads documentation, the use of interpreter-based threads in Perl is officially discouraged. But I can't understand what "interpreter-based threads" actually means, and what exactly the alternative is. Does this mean that using threads is discouraged in Perl at all?
Note: I am using Strawberry Perl 5.32.1 64-bit under Windows 11, zip package, no MSI. Under ActiveState Perl the problem was exactly the same.
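One workaround I would sketch here (my own, not from the original thread) is to let a single thread own the socket completely and feed it typed lines through a Thread::Queue, so <STDIN> and the socket read never block in parallel against each other. The host, port, and "exit" convention are taken from the question; everything else is assumed:

use strict;
use warnings;
use threads;
use Thread::Queue;
use IO::Socket::INET;
use IO::Select;

my $out_q = Thread::Queue->new();

# This thread owns the socket: it polls the queue for outgoing lines and
# the socket for incoming data, never blocking indefinitely on either.
my $net = threads->create(sub {
    my $sock = IO::Socket::INET->new(
        PeerAddr => 'localhost', PeerPort => 1234, Proto => 'tcp',
    ) or die "cannot connect to localhost: $!";
    my $sel = IO::Select->new($sock);
    while (1) {
        # drain anything the main thread has typed so far
        while (defined(my $line = $out_q->dequeue_nb())) {
            print {$sock} "$line\n";
        }
        last unless defined $out_q->pending();   # queue ended and drained
        if ($sel->can_read(0.2)) {
            my $n = sysread($sock, my $buf, 4096);
            last unless $n;                      # server closed the connection
            print "From server: $buf";
        }
    }
    close $sock;
});

# The main thread touches only STDIN.
while (defined(my $msg = <STDIN>)) {
    $msg =~ s/\s+\z//;
    $out_q->enqueue($msg);
    last if $msg eq 'exit';
}
$out_q->end();
$net->join();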

Multithreading Perl script and crontab/init script

I have an issue with a Perl script that uses threads.
It works fine when I launch it manually, but when I launch it via crontab I get this output:
Perl exited with active threads:
0 running and unjoined
1 finished and unjoined
0 running and detached
The PATH and SHELL variables are correct in the crontab.
I also tried making an init script (to launch it as a service) and got the same error:
Feb 24 08:04:48 SERVER kernel: perl[103293]: segfault at 4a8 ip 00007f6cfd075dd9 sp 00007fffb93437c0 error 4 in libperl.so[7f6cfcfdf000+183000]
Feb 24 08:04:49 SERVER test_ping[102238]: Perl exited with active threads:
Feb 24 08:04:49 SERVER test_ping[102238]: 0 running and unjoined
Feb 24 08:04:49 SERVER test_ping[102238]: 1 finished and unjoined
Feb 24 08:04:49 SERVER test_ping[102238]: 0 running and detached
So I have also tried modifying the Perl script to use:
for my $thread (threads->list) {
    $thread->join();
}
Instead of
for my $thread (threads->list) {
    $thread->detach();
}
And after this modification, when I launch the script manually, it seems to be stuck/freezing.
To summarize, here is everything I have checked:
Executed manually, it works.
Via crontab it doesn't work; the PATH and SHELL variables are OK.
Via init script, it doesn't work.
I tried modifying the Perl script to join all threads, but then the script freezes.
Does anyone have a suggestion? Something else to check/do?
Thanks
use lib '/usr/local/perf/lib';
use lib '/usr/share/perl5';
use threads;
use Thread::Queue;
use SNMP::Persist qw(&define_oid &start_persister &define_subtree);
use Schedule::ByClock;
use Time::HiRes qw( usleep ualarm gettimeofday tv_interval );
use strict;
#use warnings;
use constant DEBUG => 0;
use constant DEBUG2 => 1;
if ($#ARGV + 1 != 2) {
    print "usage: test_ping.pl OUTPUTFILE INPUTFILE \n";
    exit;
}
my $output_file = $ARGV[0];
my $data_file   = $ARGV[1];
shift @ARGV;
shift @ARGV;

# start the thread serving answers
start_persister();

# create queues for processing commands
my $q_queue = new Thread::Queue;
my $r_queue = new Thread::Queue;

# create threads for processing queues
for (my $i = 0; $i < $thread_num; $i++) {
    threads->create(\&process)->detach();
}

my $datestring = localtime();
my %subtree;
my @raw_data;
my ($q_line, @q_split);
my ($r_line, @r_split);
my $index = 0;

# open file to get data
open(DAT, $data_file) || die("Could not open file!");
@raw_data = <DAT>;
close(DAT);

# enqueue all lines to be processed by threads
foreach $q_line (@raw_data) {
    chomp($q_line);
    $q_line =~ s/^\s+//;
    $q_line =~ s/\s+$//;
    next if ($q_line =~ /^#.*/);
    next if ($q_line eq "");
    next if ($q_line =~ /^\|/);
    @q_split = split(/\|/, $q_line);
    next if (!($q_split[0] eq "icmp" || $q_split[0] eq "tcp" || $q_split[0] eq "ldap" || $q_split[0] eq "dig"));
    $q_queue->enqueue(++$index . "|" . $q_line);
}

while ($index != 0 && ($r_line = $r_queue->dequeue)) {
    open(my $fh, '>>', $output_file) or die "Could not open file '$output_file' $!";
    print $fh $datestring . "|";
    print $fh $r_line . "\n";
    close $fh;
    @r_split = split(/\|/, $r_line);
    $index--;
}

for my $thread (threads->list) {
    $thread->detach();
}
Below is the process function:
sub process {
    # my @hotefqdn = split(/\./, `hostname`);
    # my $hote = $hotefqdn[0];
    my ($q_line, @q_split, $q_index, $q_query);
    my ($q_module, $q_type, $q_name, $q_host, $q_port, $q_ssl, $q_send, $q_expect, $q_quit);
    my ($q_lookup, $q_record);
    my ($q_base_dn, $q_attr, $q_binddn, $q_password, $q_warn_time, $q_crit_time, $q_timeout);
    my ($r_tab);

    while ($q_line = $q_queue->dequeue) {
        @q_split = split(/\|/, $q_line);
        $q_index  = $q_split[0];
        $q_module = $q_split[1];
        if ($q_module eq "icmp") {
            $q_type = $q_split[2];
            $q_name = $q_split[3];
            $q_host = $q_split[4];
            $q_query = "$q_host (ping)";
            print "query=$q_query\n" if (DEBUG);
            $r_tab = icmp_query($q_host);
        }
        elsif ($q_module eq "tcp") {
            $q_type = $q_split[2];
            $q_name = $q_split[3];
            $q_query = "$q_host ($q_type:$q_port)";
            print "query=$q_query\n" if (DEBUG);
            $r_tab = tcp_query($q_host, $q_port, $q_ssl, $q_send, $q_expect, $q_quit);
        }
        elsif ($q_module eq "ldap") {
            $q_type = $q_split[2];
            $q_name = $q_split[3];
            print "query=$q_query\n" if (DEBUG);
            $r_tab = ldap_query($q_host, $q_base_dn, $q_port, $q_attr, $q_binddn, $q_password, $q_warn_time, $q_crit_time, $q_timeout);
        }
        elsif ($q_module eq "dig") {
            $q_type = $q_split[2];
            $q_name = $q_split[3];
            $q_query = "$q_lookup($q_record) @" . $q_host;
            print "query=$q_query\n" if (DEBUG);
            $r_tab = dig_query($q_host, $q_port, $q_lookup, $q_record, $q_expect);
        }
        $r_queue->enqueue($q_index . "|" . $q_name . "|" . $q_type . "|" . $q_query . "|" . $r_tab->{'min'} . "|" . $r_tab->{'med'} . "|" . $r_tab->{'avg'} . "|" . $r_tab->{'max'} . "|" . $r_tab->{'dev'} . "|" . $r_tab->{'loss'} . "|" . $r_tab->{'err'});
    }
}
First of all, don't detach your threads. When you do, you can't wait for them to finish.
for (my $i = 0; $i < $thread_num; $i++) {
    threads->create(\&process)->detach();
}
...
for my $thread (threads->list) {
    $thread->detach();
}
should be
for (1..$thread_num) {
    threads->create(\&process);
}
...
... Tell the threads to finish up ...
for my $thread (threads->list) {
    $thread->join();
}
Now, on to the question: why don't your threads finish? Well, you never tell them to exit, so they never do! You need to ask them to exit, and that can be achieved by adding the following:
$q_queue->end();
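If it helps to see why that matters, here is a tiny self-contained demonstration (my addition, not part of the answer) of how end() lets a blocking dequeue() return undef so the worker loop, and therefore join(), can finish:

use strict;
use warnings;
use threads;
use Thread::Queue;

my $q = Thread::Queue->new();

my $worker = threads->create(sub {
    # dequeue() blocks while the queue is alive; once the queue has been
    # ended and drained it returns undef, which ends this loop.
    while (defined(my $item = $q->dequeue())) {
        print "got $item\n";
    }
});

$q->enqueue($_) for 1 .. 3;
$q->end();        # no more items coming: blocked dequeue() calls return undef
$worker->join();  # completes instead of hanging forever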
What follows is what you get after you apply the above fixes. I've also moved all thread-related code out of process, since it doesn't belong there. Finally, I removed the reliance on $index by moving the output code into its own thread.
sub process {
    my ($q_line) = @_;
    ...
    return join("|", $q_index, $q_name, $q_type, $q_query, @$r_tab{qw( min med avg max dev loss err )});
}
my $request_q  = Thread::Queue->new();
my $response_q = Thread::Queue->new();

my @worker_threads;
for (1..$thread_num) {
    push @worker_threads, async {
        while (defined( my $request = $request_q->dequeue() )) {
            $response_q->enqueue( process($request) );
        }
    };
}

my $output_thread = do {
    my $datestring = localtime();

    open(my $fh, '>', $output_file)
        or die("Can't create file \"$output_file\": $!\n");

    async {
        while (defined( my $response = $response_q->dequeue() )) {
            print($fh "$datestring|$response\n");
        }
    }
};

{
    my %protos = map { $_ => 1 } qw( icmp tcp ldap dig );

    open(my $fh, '<', $data_file)
        or die("Can't open file \"$data_file\": $!\n");

    my $index = 0;
    while (<$fh>) {
        s/^\s+//;
        s/\s+\z//;
        next if $_ eq "" || /^#/;
        my ($proto) = split /\|/;
        next if !$protos{$proto};
        $request_q->enqueue(++$index . "|" . $_);
    }
}

$request_q->end();
$_->join() for @worker_threads;

$response_q->end();
$output_thread->join();

How to write a multi thread perl script to write a log file?

The log file needs to be updated every now and then. The problem is that if I use multiple threads to write to the same log file, some lines come out incorrect.
my $date_string = strftime "%Y%m%d_%H_%M_%S", localtime;
open(LOG, "+>", "log_$date_string.log") or die "cant open the file";

for ($count = 1; $count <= $thread_count; $count++)
{
    my $thread = threads->new(\&process);
    push(@threads, $thread);
}
foreach (@threads)
{
    $_->join;
}
close LOG;

sub process
{
    ......
    ......
    print LOG "$datestring" . "|$process_id|$host|$port|input|Processing $filename|\n";
}
For the above code I am getting output like this:
2014-05-20 12:12:46|19903|172.16.7.109|2000|input|Processing /opt/search_command/_13668.out|
2014-05-20 12:12:47|19903|172.16.7.109|2000|search|Processing /opt/search_command/_13668.out|0.940|
5-20 12:12:46|19903|172.16.7.109|2000|input|Processing /opt/search_command/_11061.out|
2014-05-20 12:12:47|19903|172.16.7.109|2000|search|Processing /opt/search_command/_11061.out|0.952|
As you can see, the date on the third line is not printed properly.
I need a solution for logging status to a log file from multiple threads in Perl.
Thanks in advance
You can lock the file before writing and release the lock afterwards:
sub process {
    lock(\*LOG);
    print LOG ..
    unlock(\*LOG);
}

use Fcntl qw(:flock SEEK_END);

sub lock {
    my ($fh) = @_;
    flock($fh, LOCK_EX) or die $!;
    seek($fh, 0, SEEK_END) or die $!;
}

sub unlock {
    my ($fh) = @_;
    flock($fh, LOCK_UN) or die $!;
}
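Another pattern worth considering (my sketch, not part of the original answer) is to avoid sharing the filehandle between threads at all: workers push finished log lines onto a Thread::Queue and one dedicated logger thread does every write, so lines can never interleave. The file name and worker payload below are illustrative:

use strict;
use warnings;
use threads;
use Thread::Queue;
use IO::Handle;

my $log_q = Thread::Queue->new();

# The only thread that ever touches the log file.
my $logger = threads->create(sub {
    open my $fh, '>>', 'threaded.log' or die "cannot open log: $!";
    $fh->autoflush(1);
    while (defined(my $line = $log_q->dequeue())) {
        print {$fh} $line, "\n";
    }
    close $fh;
});

# Worker threads only ever enqueue complete lines.
my @workers = map {
    threads->create(sub {
        my $tid = threads->tid();
        $log_q->enqueue("worker $tid|processing item $_") for 1 .. 3;
    });
} 1 .. 4;

$_->join() for @workers;
$log_q->end();     # makes the logger's dequeue() return undef so it exits
$logger->join();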

Implementation of a watchdog in perl

I need to constrain execution of an external process (a command-line call) to a fixed time window.
After a bit of reading I coded up this implementation:
#!/bin/perl -w
use IPC::System::Simple qw( capture );
use strict;
use threads;
use threads::shared;
use warnings;

my $timeout = 4;
share($timeout);
my $stdout;
share($stdout);
my $can_proceed = 1;
share($can_proceed);

sub watchdogFork {
    my $time1 = time;
    my $ml = async {
        my $sleepTime = 2;
        my $thr = threads->self();
        $stdout = capture("sleep $sleepTime; echo \"Good morning\n\";");
        print "From ml: " . $stdout;
        $thr->detach();
    };
    my $time2;
    do {
        $time2 = time - $time1;
    } while ( $time2 < $timeout );
    print "\n";
    if ( $ml->is_running() ) {
        print "From watchdog: timeout!\n";
        $can_proceed = 0;
        $ml->detach();
    }
}

my $wd = threads->create('watchdogFork');
$wd->join();
print "From main: " . $stdout if ($can_proceed);
When $timeout > $sleepTime it returns:
From ml: Good morning
From main: Good morning
On the other hand, when $timeout < $sleepTime:
From watchdog: timeout!
The behaviour obtained is correct, but I think this approach is slightly raw.
I was wondering if there are libraries that could help refine the source code, improving readability and performance. Any suggestions?
IPC::Run allows you to run child processes and interact with their stdin, stdout, and stderr. You can also set timeouts, which throw an exception when exceeded:
use IPC::Run qw(harness run timeout);

my @cmd = qw(sleep 10);
my $harness = harness \@cmd, \undef, \my $out, \my $err, timeout(3);

eval {
    run $harness or die "sleep: $?";
};
if ($@) {
    my $exception = $@;   # Preserve $@ in case another exception occurs
    $harness->kill_kill;
    print $exception;     # and continue with the rest of the program
}
Note that there are some limitations when running on Windows.
You can use timeout_system from Proc::Background:
use Proc::Background qw(timeout_system);
my $wait_status = timeout_system($seconds, $command, $arg1, $arg2);
my $exit_code = $wait_status >> 8;
The process will be killed after $seconds seconds.

How to use threads in Perl?

I want to use threads in Perl to increase the speed of my program; for example, I want to use 20 threads in this code:
use IO::Socket;
my $in_file2 = 'rang.txt';
open DAT,$in_file2;
my @ip = <DAT>;
close DAT;
chomp(@ip);

foreach my $ip (@ip)
{
    $host = IO::Socket::INET->new(
        PeerAddr => $ip,
        PeerPort => 80,
        proto    => 'tcp',
        Timeout  => 1
    )
    and open(OUT, ">>port.txt");
    print OUT $ip . "\n";
    close(OUT);
}
In the above code we take a list of IPs and scan a given port on each. I want to use threads in this code. Is there any other way to increase the speed of my code?
Thanks.
Instead of using threads, you might want to look into AnyEvent::Socket, or Coro::Socket, or POE, or Parallel::ForkManager.
Read the Perl threading tutorial.
Perl can do both threading and forking. "threads" is officially not recommended - in no small part because it's not well understood, and - perhaps slightly counterintuitively - isn't lightweight like threads are in some programming languages.
If you are particularly keen to thread, the 'worker' model of threading works much better than spawning a thread per task. You might do the latter in some languages - in Perl it's very inefficient.
As such you might do something like this:
#!/usr/bin/env perl

use strict;
use warnings;

use threads;
use Thread::Queue;
use IO::Socket;

my $nthreads = 20;
my $in_file2 = 'rang.txt';

my $work_q   = Thread::Queue->new;
my $result_q = Thread::Queue->new;

sub ip_checker {
    while ( my $ip = $work_q->dequeue ) {
        chomp($ip);
        my $host = IO::Socket::INET->new(
            PeerAddr => $ip,
            PeerPort => 80,
            proto    => 'tcp',
            Timeout  => 1
        );
        if ( defined $host ) {
            $result_q->enqueue($ip);
        }
    }
}

sub file_writer {
    open( my $output_fh, ">>", "port.txt" ) or die $!;
    while ( my $ip = $result_q->dequeue ) {
        print {$output_fh} "$ip\n";
    }
    close($output_fh);
}

my @workers;
for ( 1 .. $nthreads ) {
    push( @workers, threads->create( \&ip_checker ) );
}
my $writer = threads->create( \&file_writer );

open( my $dat, "<", $in_file2 ) or die $!;
$work_q->enqueue(<$dat>);
close($dat);
$work_q->end;

foreach my $thr (@workers) {
    $thr->join();
}
$result_q->end;
$writer->join();
This uses a queue to feed a set of (20) worker threads with an IP list; they work their way through it, collating and printing results through the writer thread.
But as threads aren't really recommended any more, a better way might be to use Parallel::ForkManager, which with your code might go a bit like this:
#!/usr/bin/env perl

use strict;
use warnings;

use Fcntl qw ( :flock );
use IO::Socket;
use Parallel::ForkManager;

my $in_file2 = 'rang.txt';

open( my $input,  "<", $in_file2 )  or die $!;
open( my $output, ">", "port.txt" ) or die $!;

my $manager = Parallel::ForkManager->new(20);

foreach my $ip (<$input>) {
    $manager->start and next;
    chomp($ip);
    my $host = IO::Socket::INET->new(
        PeerAddr => $ip,
        PeerPort => 80,
        proto    => 'tcp',
        Timeout  => 1
    );
    if ( defined $host ) {
        flock( $output, LOCK_EX );    # exclusive or write lock
        print {$output} $ip, "\n";
        flock( $output, LOCK_UN );    # unlock
    }
    $manager->finish;
}
$manager->wait_all_children;

close($output);
close($input);
You need to be particularly careful of file IO when multiprocessing, because the whole point is your execution sequence is no longer well defined. So it's insanely easy to end up with different threads clobbering files that another thread has open, but hasn't flushed to disk.
Looking at your code: you seem to rely on a failed file open in order to not print to the file. That's not a nice thing to do, especially when your file handle is not lexically scoped.
But in both multiprocessing paradigms I outlined above (there are others, these are the most common) you still have to deal with the file IO serialisation. Note that your 'results' will be in a random order in both, because it'll very much depend on when the task completes. If that's important to you, then you'll need to collate and sort after your threads or forks complete.
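If ordering matters, here is a small sketch of the collate-and-sort idea (names and data made up for illustration): tag every result with the index of the input line that produced it, collect everything, then sort on that index before writing.

use strict;
use warnings;

# Results as the workers produced them: "index|ip|status", in whatever
# order the tasks happened to finish.
my @results = (
    "3|10.0.0.3|open",
    "1|10.0.0.1|open",
    "2|10.0.0.2|closed",
);

# Sort numerically on the leading index, then drop it before output.
for my $r (sort { ($a =~ /^(\d+)/)[0] <=> ($b =~ /^(\d+)/)[0] } @results) {
    my (undef, $rest) = split /\|/, $r, 2;
    print "$rest\n";
}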
It's probably generally better to look towards forking - as the threads documentation says:
The "interpreter-based threads" provided by Perl are not the fast, lightweight system for multitasking that one might expect or hope for. Threads are implemented in a way that make them easy to misuse. Few people know how to use them correctly or will be able to provide help.
The use of interpreter-based threads in perl is officially discouraged.
