What is the Perl equivalent of PHP's proc_open(), proc_close(), etc.?

Using PHP's proc_open(), I can start a process, read from STDOUT and STDERR (separately) an arbitrary number of bytes at a time using fread() while the process is running, detect when the process is done using feof() on the STDOUT and STDERR pipes, and then use proc_close() to get the exit code of the process. I've done all of this in PHP. It works well, and gives me a lot of control.
Is there a way to do all of these things in Perl? To summarize, I need to be able to do the following:
start an external process
read STDOUT and STDERR separately
read STDOUT and STDERR an arbitrary number of bytes at a time while the process is running (i.e. without having to wait for the process to finish)
detect when the process is finished
get the exit code of the process
Thanks in advance for your answers.

You could roll your own solution using Perl's system call interface, but it's easier to use the built-in module IPC::Open3. As for your list:
Start an external process:
use IPC::Open3;
use IO::Handle;
use strict;
my $stdout = IO::Handle->new;
my $stderr = IO::Handle->new;
my $stdin; # left undefined; open3 autovivifies a handle for the child's STDIN
my $pid = open3($stdin, $stdout, $stderr, 'my-command', 'arg1', 'arg2');
Read STDOUT and STDERR separately, an arbitrary number of bytes at a time:
my $line = <$stdout>;               # buffered, one line at a time
# Or
sysread $stderr, my $buffer, 1024;  # unbuffered, up to 1024 bytes at a time
Detect when the process is finished:
use POSIX qw(:sys_wait_h);
waitpid $pid, 0;       # blocks until the process terminates
waitpid $pid, WNOHANG; # non-blocking; returns the PID once it has exited, 0 otherwise
Get the exit code of the process:
my $status = $?;         # raw wait status, valid once waitpid shows the process exited
my $exit_code = $? >> 8; # the actual exit code is in the high byte
Be sure to read the IPC::Open3 documentation; as it warns, it's easy to deadlock yourself when you have separate stdout and stderr pipes if you're not careful: if the child fills one pipe's buffer while the parent is blocked reading the other, both processes will wait on each other forever.
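For instance, here is a minimal sketch of one way to avoid that deadlock with IO::Select, reading whichever pipe has data ready (some-command and its argument are placeholders):
use strict;
use warnings;
use IPC::Open3;
use IO::Select;
use Symbol qw(gensym);

my $stderr = gensym; # open3 needs a real glob in the stderr slot
my $pid = open3(my $stdin, my $stdout, $stderr, 'some-command', 'arg1');
close $stdin; # we send no input

my $sel = IO::Select->new($stdout, $stderr);
while ($sel->count) {
    for my $fh ($sel->can_read) {
        my $n = sysread $fh, my $buf, 4096;
        if (!$n) { # EOF (or read error) on this pipe
            $sel->remove($fh);
            next;
        }
        print $fh == $stdout ? "OUT: $buf" : "ERR: $buf";
    }
}
waitpid $pid, 0; # reap the child
my $exit_code = $? >> 8;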

You want this module: IPC::Open3

You want IPC::Run; it captures the I/O and returns the exit value.
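A minimal sketch (some-command is a placeholder; run returns true when the command exits with status 0, and the raw wait status is left in $? afterwards):
use strict;
use warnings;
use IPC::Run qw(run);

my ($out, $err);
run ['some-command', 'arg1'], '>', \$out, '2>', \$err
    or die "some-command failed with exit code ", $? >> 8;
print "stdout: $out\nstderr: $err";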

Related

Why is File::FcntlLock's l_type always "F_UNLCK" even if the file is locked?

The Perl subroutine below uses File::FcntlLock to check if a file is locked.
Why does it return 0 and print /tmp/test.pid is unlocked. even if the file is locked?
sub getPidOwningLock {
    my $filename = shift;

    my $fs = File::FcntlLock->new;
    $fs->l_type( F_WRLCK );
    $fs->l_whence( SEEK_SET );
    $fs->l_start( 0 );
    $fs->l_len( 0 );

    my $fd;
    if (!open($fd, '+<', $filename)) {
        print "Could not open $filename\n";
        return -1;
    }
    if (!$fs->lock($fd, F_GETLK)) {
        # method calls don't interpolate inside strings, so concatenate
        print "Could not get lock information on $filename, error: " . $fs->error . "\n";
        close($fd);
        return -1;
    }
    close($fd);

    if ($fs->l_type() == F_UNLCK) {
        print "$filename is unlocked.\n";
        return 0;
    }
    return $fs->l_pid();
}
The file is locked as follows (lock.sh):
#!/bin/sh
(
    flock -n 200
    while true; do sleep 1; done
) 200>/tmp/test.pid
The file is indeed locked:
~$ ./lock.sh &
[2] 16803
~$ lsof /tmp/test.pid
COMMAND   PID  USER  FD   TYPE DEVICE SIZE/OFF   NODE NAME
bash    26002 admin 200w   REG    8,5        0 584649 test.pid
sleep   26432 admin 200w   REG    8,5        0 584649 test.pid
fcntl and flock locks are invisible to each other.
This is a big problem for your use case because the flock utility that you're using in your shell script depends on flock semantics: the shell script runs a flock child process, which locks an inherited file descriptor and then exits. The shell keeps that file descriptor open (because the redirection is on a whole sequence of commands) until it wants to release the lock.
That plan can't work with fcntl because fcntl locks are not shared among processes. If there was a utility identical to flock but using fcntl, the lock would be released too early (as soon as the child process exits).
For coordination of a file lock between a perl process and a shell script, some options you can consider are:
port the shell script to zsh and use the zsystem flock builtin from the zsh/system module (note: in the documentation it claims to use fcntl in spite of its name being flock)
rewrite the shell script in perl
just use flock in the perl script (give up byte range locking and the "get locker PID" feature - but you can emulate that on Linux by reading /proc/locks); a sketch of this option appears below
write your own fcntl utility in C for use in the shell script (the usage pattern will be different - the shell script will have to background it and then kill it later to unlock - and it will need some way to tell the parent process when it has obtained or failed to obtain the lock, which will be hard because it's happening asynchronously now... maybe use the coprocess feature that some shells have).
run a small perl script from the shell script to do the locking (will need the same background treatment that a dedicated fcntl utility would need)
For more information on features of the different kinds of locks, see What is the difference between locking with fcntl and flock.
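For the flock-in-Perl option, here is a minimal sketch, probing the /tmp/test.pid file from the question:
use strict;
use warnings;
use Fcntl qw(:flock);

# flock ignores the open mode, so read-only access is enough for probing
open my $fh, '<', '/tmp/test.pid' or die "open: $!";
if (flock($fh, LOCK_EX | LOCK_NB)) {
    print "/tmp/test.pid is unlocked.\n";
    flock($fh, LOCK_UN);
} else {
    print "/tmp/test.pid is locked by another process.\n";
}
close $fh;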

How to get PID of perl daemon in init script?

I have the following perl script:
#!/usr/bin/perl
use strict;
use warnings;
use Proc::Daemon;
Proc::Daemon::Init;
my $continue = 1;
$SIG{TERM} = sub { $continue = 0 };
while ($continue) {
    # stuff
}
I have the following in my init script:
DAEMON='/path/to/perl/script.pl'
start() {
    PID=`$DAEMON > /dev/null 2>&1 & echo $!`
    echo $PID > /var/run/mem-monitor.pid
}
The problem is, this returns the wrong PID! This returns the PID of the parent process which is started when the daemon is run, but that process is immediately killed off. I need to get the PID of the child process!
The Proc::Daemon documentation says
Proc::Daemon does the following:
...
9. The first child transfers the PID of the second child (daemon) to the parent. Additionally the PID of the daemon process can be written into a file if 'pid_file' is defined. Then the first child exits.
and then later, under new ( %ARGS )
pid_file
Defines the path to a file (owned by the parent user) where the PID of the daemon process will be stored. Defaults to undef (= write no file).
Also look at the Init() method description. All of this implies that you may want to use new first.
The point is that it is the grand-child process that is the daemon. However, the first child passes the PID along, and it is available to the parent. If pid_file => $file_name is set in the constructor, the daemon's PID is written to that file.
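A minimal sketch of that approach, reusing the PID-file path from the init script in the question:
#!/usr/bin/perl
use strict;
use warnings;
use Proc::Daemon;

my $daemon = Proc::Daemon->new(
    pid_file => '/var/run/mem-monitor.pid', # path from the init script
);
my $pid = $daemon->Init; # daemon's PID in the parent, 0 in the daemon
exit 0 if $pid;          # parent is done; the PID file has already been written

my $continue = 1;
$SIG{TERM} = sub { $continue = 0 };
while ($continue) {
    # stuff
}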
A comment asks how to avoid having the shell script rely on a file written by another script.
I can see two ways to do that.
Print the PID, returned by $daemon->Init(), from the parent and pick it up in the shell. This is defeated by the redirects in the question, but I don't know why they are needed. The parent and the first child exit as soon as everything is set up, while the daemon is detached from everything.
The shell script can start the Perl script with the desired PID-file name as an argument, letting it write the daemon's PID to that file by the process described above. The file is still written by Perl, but its name and use are decided by the shell script.
I'd like to include a statement from my comment below. I consider these approaches superior to two others that come to mind: picking the filename from a config-style file kept by the shell is more complicated, while parsing the process table may be unreliable.
I've seen this before and had to resort to using STDERR to send the child's PID back to the calling shell script. I've always assumed it was due to the mentioned unreliability of exit codes, but the details were not clear in the documentation. Please try something like this:
#!/usr/bin/perl
use strict;
use warnings;
use Proc::Daemon;
if ( my $pid = Proc::Daemon::Init() ) {
    print STDERR $pid;
    exit;
}

my $continue = 1;
$SIG{TERM} = sub { $continue = 0 };
while ($continue) {
    sleep(20);
    exit;
}
With a calling script like this:
#!/bin/bash
DAEMON='./script.pl'
start() {
    PID=$($DAEMON 2>&1 >/dev/null)
    echo $PID > ./mem-monitor.pid
}
start;
When the bash script is run, it will capture the STDERR output (containing the correct PID) and store it in the file. Any STDOUT the Perl script produces is sent to /dev/null, though there is unlikely to be any, as the first-level Perl script (in this case) exits fairly early on.
Thank you to zdim and Hakon for the suggestions. They are certainly workable and got me on the right track, but ultimately I went a different route. Rather than relying on $!, I used ps and awk to get the PID, as follows:
DAEMON='/path/to/perl/script.pl'
start() {
    $DAEMON > /dev/null 2>&1
    PID=`ps aux | grep -v 'grep' | grep "$DAEMON" | awk '{print $2}'`
    echo $PID > /var/run/mem-monitor.pid
}
This works and satisfies my OCD! Note the double quotes around "$DAEMON" in grep "$DAEMON".

How to handle updates from a continuous process pipe in Perl

I am trying to follow log files in Perl on Fedora, but unfortunately Fedora uses journalctl to read its binary log files, which I cannot parse directly. This, as I understand it, means I can only read Fedora's log files by calling journalctl.
I tried using IO::Pipe to do this, but the problem is that $p->reader(..) waits until journalctl --follow is done writing output (which will be never since --follow is like tail -F) and then allows me to print everything out which is not what I want. I would like to be able to set a callback function to be called each time a new line is printed to the process pipe so that I can parse/handle each new log event.
use IO::Pipe;
my $p = IO::Pipe->new();
$p->reader("journalctl --follow"); # waits for the process to exit
while (<$p>) {
    print;
}
I assume that journalctl is working like tail -f. If this is correct, a simple open should do the job:
use Fcntl qw(SEEK_CUR); # SEEK_CUR is not exported by default
my $pid = open my $fh, '-|', 'journalctl --follow'
    or die "Error $! starting journalctl";

while (kill 0, $pid) {
    while (<$fh>) {
        print $_;           # print log line
    }
    sleep 1;                # wait some time for new lines to appear
    seek($fh, 0, SEEK_CUR); # reset EOF
}
open opens a filehandle for reading the output of the called command: http://perldoc.perl.org/functions/open.html
seek is used to reset the EOF marker: http://perldoc.perl.org/functions/seek.html Without reset, all subsequent <$fh> calls will just return EOF even if the called script issued additional output in the meantime.
kill 0,$pid will be true as long as the child process started by open is alive.
You may replace sleep 1 with usleep from Time::HiRes, or with select undef, undef, undef, $fractional_seconds; to wait less than a second, depending on the frequency of incoming lines.
AnyEvent should also be able to do the job via its AnyEvent::Handle.
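A rough sketch of the AnyEvent route, which invokes a callback per line instead of polling:
use strict;
use warnings;
use AnyEvent;
use AnyEvent::Handle;

open my $fh, '-|', 'journalctl --follow' or die "journalctl: $!";

my $done = AnyEvent->condvar;
my $hdl; $hdl = AnyEvent::Handle->new(
    fh      => $fh,
    on_read => sub {
        my ($h) = @_;
        $h->push_read(line => sub {
            my ($h, $line) = @_;
            print "got: $line\n"; # parse/handle each log event here
        });
    },
    on_eof   => sub { $done->send },
    on_error => sub { my ($h, $fatal, $msg) = @_; warn $msg; $done->send },
);
$done->recv; # run the event loop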
Update:
Adding use POSIX ":sys_wait_h"; at the beginning and a waitpid($pid, WNOHANG) check to the outer loop would also detect (and reap) a zombie journalctl process:
while (kill(0, $pid) and waitpid($pid, WNOHANG) != $pid) {
A daemon might also want to check if $pid is still a child of the current process ($$) and if it's still the original journalctl process.
I have no access to journalctl, but if you avoid IO::Pipe and open the piped output directly, the data will not be buffered:
use strict;
use warnings 'all';
open my $follow_fh, '-|', 'journalctl --follow' or die $!;
print while <$follow_fh>;

How can I change the current directory in a thread-safe manner in Perl?

I'm using Thread::Pool::Simple to create a few working threads. Each working thread does some stuff, including a call to chdir followed by an execution of an external Perl script (from the jbrowse genome browser, if it matters). I use capturex to call the external script and die on its failure.
I discovered that when I use more than one thread, things start to get messy. After some research, it seems that the current directory of some threads is not the correct one.
Perhaps chdir propagates between threads (i.e. isn't thread-safe)?
Or perhaps it's something with capturex?
So, how can I safely set the working directory for each thread?
** UPDATE **
Following the suggestions to change directory while executing, I'd like to ask how exactly I should pass these two commands to capturex.
currently I have:
my @args = ( "bin/flatfile-to-json.pl", "--gff=$gff_file", "--tracklabel=$track_label", "--key=$key", @optional_args );
capturex( [0], @args );
How do I add another command to @args?
Will capturex continue to die on errors of any of the commands?
I think that you can solve your "how do I chdir in the child before running the command" problem pretty easily by abandoning IPC::System::Simple as not the right tool for the job.
Instead of doing
my $output = capturex($cmd, @args);
do something like:
use autodie qw(open close);

my $pid = open my $fh, '-|';
unless ($pid) { # this is the child
    chdir($wherever);
    exec($cmd, @args) or exit 255;
}
my $output = do { local $/; <$fh> };
# If the child exited with an error or couldn't be run, the exception will
# be raised here (via autodie; feel free to replace it with
# your own handling)
close($fh);
If you were getting a list of lines instead of scalar output from capturex, the only thing that needs to change is the second-to-last line (to my @output = <$fh>;).
More info on forking-open is in perldoc perlipc.
The good thing about this in preference to capture("chdir wherever; $cmd @args") is that it doesn't give the shell a chance to do bad things to your @args.
Updated code (doesn't capture output)
my $pid = fork;
die "Couldn't fork: $!" unless defined $pid;
unless ($pid) { # this is the child
    chdir($wherever);
    open STDOUT, '>', '/dev/null'; # optional: silence subprocess output
    open STDERR, '>', '/dev/null'; # even more optional
    exec($cmd, @args) or exit 255;
}
wait;
die "Child error $?" if $?;
I don't think "current working directory" is a per-thread property. I'd expect it to be a property of the process.
It's not clear exactly why you need to use chdir at all though. Can you not launch the external script setting the new process's working directory appropriately instead? That sounds like a more feasible approach.
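IPC::Run, for example, can run a setup callback in the child just before exec, which leaves the parent's process-wide working directory untouched. A hedged sketch, where $work_dir, $gff_file, and the script path are assumptions based on the question:
use strict;
use warnings;
use IPC::Run qw(run);

my $work_dir = '/path/to/jbrowse'; # assumption
my $gff_file = 'data.gff';         # assumption
my ($out, $err);
run ['bin/flatfile-to-json.pl', "--gff=$gff_file"],
    init => sub { chdir $work_dir or die "chdir $work_dir: $!" }, # runs in the child
    '>',  \$out,
    '2>', \$err
    or die "command failed with exit code ", $? >> 8, ":\n$err";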

perl hangs on exit (after closing a filehandle)

I've got a function that does (in short):
my $file = IO::File->new("| some_command >> /dev/null 2>&1")
or die "cannot open some_command for writing: $!\n";
...
undef $file;
Right now I'm not even writing anything to $file; currently there are no other operations on $file at all. When I run the program, it doesn't exit properly. I can see that the handle is closed, but my program is still waiting for the process to finish. Captured with strace:
close(6) = 0
rt_sigaction(SIGHUP, {SIG_IGN}, {SIG_DFL}, 8) = 0
rt_sigaction(SIGINT, {SIG_IGN}, {SIG_DFL}, 8) = 0
rt_sigaction(SIGQUIT, {SIG_IGN}, {SIG_DFL}, 8) = 0
wait4(16861, ^C <unfinished ...>
I don't see this problem if I open the same process for reading.
What do I have to do to make the program to exit?
Edit: Suggestions so far were to use the Expect library or to finish the input stream via ctrl+d. But I do not want to interact with the program in any way at this point. I want it to finish exactly now without any more IO going on. Is that possible?
undef $file removes a reference count from the filehandle and makes it eligible for garbage collection. If $file is a handle to a regular file and there are no other references to the filehandle anywhere else, it should work as documented in IO::File. In this case $file is a handle to a shell command, and there may be some other internal references to the filehandle that keep it from getting destroyed. Using $file->close is safer and makes your intent much clearer.
When closing the filehandle doesn't kill off the command, you need its process ID. If you invoked the command like
my ($file,$pid);
$pid = open($file, "| some_command >> /dev/null 2>&1");
then you could
kill 'TERM',$pid;
at the end of your program. I don't know how to extract the process ID from the return value of IO::File::new though.
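A hedged sketch putting those pieces together, keeping the question's some_command placeholder:
use strict;
use warnings;

# A plain piped open exposes the child's PID, unlike IO::File->new
my $pid = open my $fh, '|-', 'some_command >> /dev/null 2>&1'
    or die "cannot open some_command for writing: $!\n";
# ... write to $fh here ...
kill 'TERM', $pid; # ask the child to exit first, so close won't hang
close $fh;         # close then also reaps the child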
If some_command is waiting for input, it will likely sit there forever doing just that, waiting for input.
From what the docs say, I don't think it makes any difference, but I always do $file->close() instead of/before undef'ing the handle.
EDIT: Send it Control D?
Perhaps some_command is reading tty's instead of stdin, like passwd does. If you are in that realm, I'd suggest looking up Expect.
Control D simply duplicates the zero byte read that close should do anyway for a command line program.
Have you tried using $file->close() instead of the undef?
Does some_command slurp all input and process it? Such as grep?
Or does it prompt? Like, say... chfn?
Does it return any useful information? Like an indication that it's finished?
If it's the latter, you might want to read up on Expect so you that you can interact with it.
This ugly, ugly hack will cause some_command to be parented to init instead of staying in your perl's process tree. Perl no longer has any process to wait for, and the pipe still works -- yay UNIX.
my $file = IO::File->new("| some_command >> /dev/null 2>&1 &")
Cons: The shell will succeed at & even if some_command fails, so you won't get any errors back.
or die "cannot open some_command for writing: $!\n"; # now useless
If some_command exited as soon as it got an EOF on stdin (and never stops reading from stdin), though, I'd expect this wouldn't be necessary.
$ cat | some_command
^D
Does that hang, and can you fix that?
