How to get PID of perl daemon in init script?

How to get PID of perl daemon in init script? - linux

I have the following perl script:
#!/usr/bin/perl
use strict;
use warnings;
use Proc::Daemon;
Proc::Daemon::Init;
my $continue = 1;
$SIG{TERM} = sub { $continue = 0 };
while ($continue) {
# stuff
}
I have the following in my init script:
DAEMON='/path/to/perl/script.pl'
start() {
PID=`$DAEMON > /dev/null 2>&1 & echo $!`
echo $PID > /var/run/mem-monitor.pid
}
The problem is, this returns the wrong PID! This returns the PID of the parent process which is started when the daemon is run, but that process is immediately killed off. I need to get the PID of the child process!

The Proc::Daemon says
Proc::Daemon does the following:
...
9. The first child transfers the PID of the second child (daemon) to the parent. Additionally the PID of the daemon process can be written into a file if 'pid_file' is defined. Then the first child exits.
and then later, under new ( %ARGS )
pid_file
Defines the path to a file (owned by the parent user) where the PID of the daemon process will be stored. Defaults to undef (= write no file).
Also look at Init() method description. This all implies that you may want to use new first.
The point is that it is the grand-child process that is the daemon. However, the childr passes the pid along and it is available to the parent. If pid_file => $file_name is set in the constructor (the daemon's) pid is written to that file.
A comment asks to not have shell script rely on a file written by another script.
I can see two ways to do that.
Print the pid, returned by the $daemon->Init(), from the parent and pick it up in the shell. This is defeated by redirects in the question, but I don't know why they are needed. The parent and child exit right as all is set up, while the daemon is detached from everything.
Shell script can start the Perl script with the desired log-file name as an argument, letting it write the daemon pid to that file by the above process. The file is still output by Perl, but what matters about it is decided by the shell script.
I'd like to include a statement from my comment below. I consider these superior to two other things that come to mind: picking the filename from a config-style file kept by the shell is more complicated, while parsing the process table may be unreliable.

I've seen this before and had to resort to using STDERR to send back the childs PID to the calling shell script. I've always assumed it was due to the mentioned unreliability of exit codes - but details were not clear in the documentation. Please try something like this:
#!/usr/bin/perl
use strict;
use warnings;
use Proc::Daemon;
if( my $pid = Proc::Daemon::Init() ) {
print STDERR $pid;
exit;
}
my $continue = 1;
$SIG{TERM} = sub { $continue = 0 };
while ($continue) {
sleep(20);
exit;
}
With a calling script like this:
#!/bin/bash
DAEMON='./script.pl'
start() {
PID=$($DAEMON 2>&1 >/dev/null)
echo $PID > ./mem-monitor.pid
}
start;
When the bash script is ran, it will capture the STDERR output (containing the correct PID), and store it in the file. Any STDOUT the Perl script produces would be sent to /dev/null - though this is unlikely as the 1st level Perl script does (in this case) exit fairly early on.

Thank you to the suggestions from zdim and Hakon. They are certainly workable, and got me on the right track, but ultimately I went a different route. Rather than relying on $!, I used ps and awk to get the PID, as follows:
DAEMON='/path/to/perl/script.pl'
start() {
$DAEMON > /dev/null 2>&1
PID=`ps aux | grep -v 'grep' | grep "$DAEMON" | awk '{print $2}'`
echo $PID > /var/run/mem-monitor.pid
}
This works and satisfies my OCD! Note the double quotes around "$DAEMON" in grep "$DAEMON".

Related

Can I capture the output of another process that I started?

I'm currently using > /dev/null & to have Perl script A run Perl script B totally independently, and it works fine. Script B runs without throwing back any output and stays alive when Script A ends, even when my terminal session ends.
Not saying I need it, but is there a way to recapture its output if I wanted to?
Thanks

Your code framework may look like this:
#!/usr/bin/perl
# I'm a.pl
#...
system "b.pl > ~/b.out &";
while (1)
{
my $time = localtime;
my ($fsize, $mtime) = (stat "/var/log/syslog")[7,9];
print "syslog: size=$fsize, mtime=$mtime at $time\n";
sleep 60;
}
while the b.pl may look like:
#!/usr/bin/perl
# I'm b.pl
while (1)
{
my $time = localtime;
my $fsize_a = (stat "/var/log/auth.log")[7];
my $fsize_s = (stat "/var/log/syslog")[7];
print "fsize: syslog=$fsize_s auth.log=$fsize_a at $time\n";
sleep 60;
}
a.pl and b.pl do their job independently.
b.pl is called by a.pl as a background job, which sends its output to b.out(won't mess up the screen of a.pl)
You can read b.out from some other terminal, or after a.pl is finished(or when a.pl is put to background temporarily)
About terminating the two scripts:
`ctrl-c` for a.pl
`killall b.pl` for b.pl
Note:
b.pl will never terminate even when you terminate your terminal (Assumed that your terminal is run as a desktop application), so you don't need the `nohup` command to help. (Perhaps only useful in console)
If your b.pl may spit out error messages from time to time, then you still need to deal with its stderr. It's left as your homework.

How to handle updates from an continuous process pipe in Perl

I am trying to follow log files in Perl on Fedora but unfortunately, Fedora uses journalctl to read binary log files that I cannot parse directly. This, according to my understanding, means I can only read Fedora's log files by calling journalctl.
I tried using IO::Pipe to do this, but the problem is that $p->reader(..) waits until journalctl --follow is done writing output (which will be never since --follow is like tail -F) and then allows me to print everything out which is not what I want. I would like to be able to set a callback function to be called each time a new line is printed to the process pipe so that I can parse/handle each new log event.
use IO::Pipe;
my $p = IO::Pipe->new();
$p->reader("journalctl --follow"); #Waits for process to exit
while (<$p>) {
print;
}

I assume that journalctl is working like tail -f. If this is correct, a simple open should do the job:
use Fcntl; # Import SEEK_CUR
my $pid = open my $fh, '|-', 'journalctl --follow'
or die "Error $! starting journalctl";
while (kill 0, $pid) {
while (<$fh>) {
print $_; # Print log line
}
sleep 1; # Wait some time for new lines to appear
seek($fh,0,SEEK_CUR); # Reset EOF
}
open opens a filehandle for reading the output of the called command: http://perldoc.perl.org/functions/open.html
seek is used to reset the EOF marker: http://perldoc.perl.org/functions/seek.html Without reset, all subsequent <$fh> calls will just return EOF even if the called script issued additional output in the meantime.
kill 0,$pid will be true as long as the child process started by open is alive.
You may replace sleep 1 by usleep from Time::HiRes or select undef,undef,undef,$fractional_seconds; to wait less than a second depending on the frequency of incoming lines.
AnyEvent should also be able to do the job via it's AnyEvent::Handle.
Update:
Adding use POSIX ":sys_wait_h"; at the beginning and waitpid $pid, WNOHANG) to the outer loop would also detect (and reap) a zombie journalctl process:
while (kill(0, $pid) and waitpid($pid, WNOHANG) != $pid) {
A daemon might also want to check if $pid is still a child of the current process ($$) and if it's still the original journalctl process.

I have no access to journalctl, but if you avoid IO::Pipe and open the piped output directly then the data will not be buffered
use strict;
use warnings 'all';
open my $follow_fh, '-|', 'journalctl --follow' or die $!;
print while <$follow_fh>;

setting global variable in bash

I have function where I am expecting it to hang sometime. So I am setting one global variable and then reading it, if it didn't come up after few second I give up. Below is not complete code but it's not working as I am not getting $START as value 5
START=0
ineer()
{
sleep 5
START=5
echo "done $START" ==> I am seeing here it return 5
return $START
}
echo "Starting"
ineer &
while true
do
if [ $START -eq 0 ]
then
echo "Not null $START" ==> But $START here is always 0
else
echo "else $START"
break;
fi
sleep 1;
done

You run inner function call in back ground, which means the START will be assigned in a subshell started by current shell. And in that subshell, the START value will be 5.
However in your current shell, which echo the START value, it is still 0. Since the update of START will only be in the subshell.
Each time you start a shell in background, it is just like fork a new process, which will make a copy of all current shell environments, including the variable value, and the new process will be completely isolate from your current shell.
Since the subshell have been forked as a new process, there is no way to directly update the parent shell's START value. Some alternative ways include signals passing when the subshell which runs inner function exit.
common errors:
export
export could only be used to make the variable name available to any subshells forked from current shell. however, once the subshell have been forked. The subshell will have a new copy of the variable and the value, any changes to the exported variable in the shell will not effect the subshell.
Please take the following code for details.
#!/bin/bash
export START=0
ineer()
{
sleep 3
export START=5
echo "done $START" # ==> I am seeing here it return 5
sleep 1
echo "new value $START"
return $START
}
echo "Starting"
ineer &
while true
do
if [ $START -eq 0 ]
then
echo "Not null $START" # ==> But $START here is always 0
export START=10
echo "update value to $START"
sleep 3
else
echo "else $START"
break;
fi
sleep 1;
done

The problem is that ineer & runs the function in a subshell, which is its own scope for variables. Changes made in a subshell will not apply to the parent shell. I recommend looking into kill and signal catching.

Save pid of inner & by:
pid=$!
and use kill -0 $pid (that is zero!!) to detect if your process still alive.
But better redesign inner to use lock file, this is safer check!
UPDATE From KILL(2) man page:
#include <sys/types.h>
#include <signal.h>
int kill(pid_t pid, int sig);
If sig is 0, then no signal is sent, but error checking is still
performed; this can be used to check for the existence
of a process ID or process group ID.

The answer is: in this case you can use export.
This instruction allow all subprocesses to use this variable.
So when you'll call the ineer function it will fork a process that is copying the entire environment, including the START variable taken from the parent process.
You have to change the first line from:
START=0
to:
export START=0
You may also want to read this thread: Defining a variable with or without export

How can I change the current directory in a thread-safe manner in Perl?

I'm using Thread::Pool::Simple to create a few working threads. Each working thread does some stuff, including a call to chdir followed by an execution of an external Perl script (from the jbrowse genome browser, if it matters). I use capturex to call the external script and die on its failure.
I discovered that when I use more then one thread, things start to be messy. after some research. it seems that the current directory of some threads is not the correct one.
Perhaps chdir propagates between threads (i.e. isn't thread-safe)?
Or perhaps it's something with capturex?
So, how can I safely set the working directory for each thread?
** UPDATE **
Following the suggestions to change dir while executing, I'd like to ask how exactly should I pass these two commands to capturex?
currently I have:
my #args = ( "bin/flatfile-to-json.pl", "--gff=$gff_file", "--tracklabel=$track_label", "--key=$key", #optional_args );
capturex( [0], #args );
How do I add another command to #args?
Will capturex continue die on errors of any of the commands?

I think that you can solve your "how do I chdir in the child before running the command" problem pretty easily by abandoning IPC::System::Simple as not the right tool for the job.
Instead of doing
my $output = capturex($cmd, #args);
do something like:
use autodie qw(open close);
my $pid = open my $fh, '-|';
unless ($pid) { # this is the child
chdir($wherever);
exec($cmd, #args) or exit 255;
}
my $output = do { local $/; <$fh> };
# If child exited with error or couldn't be run, the exception will
# be raised here (via autodie; feel free to replace it with
# your own handling)
close ($fh);
If you were getting a list of lines instead of scalar output from capturex, the only thing that needs to change is the second-to-last line (to my #output = <$fh>;).
More info on forking-open is in perldoc perlipc.
The good thing about this in preference to capture("chdir wherever ; $cmd #args") is that it doesn't give the shell a chance to do bad things to your #args.
Updated code (doesn't capture output)
my $pid = fork;
die "Couldn't fork: $!" unless defined $pid;
unless ($pid) { # this is the child
chdir($wherever);
open STDOUT, ">/dev/null"; # optional: silence subprocess output
open STDERR, ">/dev/null"; # even more optional
exec($cmd, #args) or exit 255;
}
wait;
die "Child error $?" if $?;

I don't think "current working directory" is a per-thread property. I'd expect it to be a property of the process.
It's not clear exactly why you need to use chdir at all though. Can you not launch the external script setting the new process's working directory appropriately instead? That sounds like a more feasible approach.

Trace of executed programs called by a Bash script

A script is misbehaving. I need to know who calls that script, and who calls the calling script, and so on, only by modifying the misbehaving script.
This is similar to a stack-trace, but I am not interested in a call stack of function calls within a single bash script.
Instead, I need the chain of executed programs/scripts that is initiated by my script.

A simple script I wrote some days ago...
# FILE : sctrace.sh
# LICENSE : GPL v2.0 (only)
# PURPOSE : print the recursive callers' list for a script
# (sort of a process backtrace)
# USAGE : [in a script] source sctrace.sh
#
# TESTED ON :
# - Linux, x86 32-bit, Bash 3.2.39(1)-release
# REFERENCES:
# [1]: http://tldp.org/LDP/abs/html/internalvariables.html#PROCCID
# [2]: http://linux.die.net/man/5/proc
# [3]: http://linux.about.com/library/cmd/blcmdl1_tac.htm
#! /bin/bash
TRACE=""
CP=$$ # PID of the script itself [1]
while true # safe because "all starts with init..."
do
CMDLINE=$(cat /proc/$CP/cmdline)
PP=$(grep PPid /proc/$CP/status | awk '{ print $2; }') # [2]
TRACE="$TRACE [$CP]:$CMDLINE\n"
if [ "$CP" == "1" ]; then # we reach 'init' [PID 1] => backtrace end
break
fi
CP=$PP
done
echo "Backtrace of '$0'"
echo -en "$TRACE" | tac | grep -n ":" # using tac to "print in reverse" [3]
... and a simple test.
I hope you like it.

You can use Bash Debugger http://bashdb.sourceforge.net/
Or, as mentioned in the previous comments, the caller bash built-in. See: http://wiki.bash-hackers.org/commands/builtin/caller
i=0; while caller $i ;do ((i++)) ;done
Or as a bash function:
dump_stack(){
local i=0
local line_no
local function_name
local file_name
while caller $i ;do ((i++)) ;done | while read line_no function_name file_name;do echo -e "\t$file_name:$line_no\t$function_name" ;done >&2
}
Another way to do it is to change PS4 and enable xtrace:
PS4='+$(date "+%F %T") ${FUNCNAME[0]}() $BASH_SOURCE:${BASH_LINENO[0]}+ '
set -o xtrace # Comment this line to disable tracing.

~$ help caller
caller: caller [EXPR]
Returns the context of the current subroutine call.
Without EXPR, returns "$line $filename". With EXPR,
returns "$line $subroutine $filename"; this extra information
can be used to provide a stack trace.
The value of EXPR indicates how many call frames to go back before the
current one; the top frame is frame 0.

Since you say you can edit the script itself, simply put a:
ps -ef >/tmp/bash_stack_trace.$$
in it, where the problem is occurring.
This will create a number of files in your tmp directory that show the entire process list at the time it happened.
You can then work out which process called which other process by examining this output. This can either be done manually, or automated with something like awk, since the output is regular - you just use those PID and PPID columns to work out the relationships between all the processes you're interested in.
You'll need to keep an eye on the files, since you'll get one per process so they may have to be managed. Since this is something that should only be done during debugging, most of the time that line will be commented out (preceded by #), so the files won't be created.
To clean them up, you can simply do:
rm /tmp/bash_stack_trace.*

UPDATE:
The code below should work. Now I have a newer answer with a newer code version that allows a message inserted in the stacktrace.
IIRC I just couldn't find this answer to update it as well at the time. But now decided code is better kept in git so latest version of the above should be in this gist.
original code-corrected answer below:
There was another answer about this somewhere but here is a function to use for getting stack trace in the sense used for example in the java programming language. You call the function and it puts the stack trace into the variable $STACK. It show the code points that led to get_stack being called. This is mostly useful for complicated execution where single shell sources multiple script snippets and nesting.
function get_stack () {
STACK=""
# to avoid noise we start with 1 to skip get_stack caller
local i
local stack_size=${#FUNCNAME[#]}
for (( i=1; i<$stack_size ; i++ )); do
local func="${FUNCNAME[$i]}"
[ x$func = x ] && func=MAIN
local linen="${BASH_LINENO[(( i - 1 ))]}"
local src="${BASH_SOURCE[$i]}"
[ x"$src" = x ] && src=non_file_source
STACK+=$'\n'" "$func" "$src" "$linen
done
}

adding pstree -p -u `whoami` >>output in your script will probably get you the information you need.

The simplest script which returns a stack trace with all callers:
i=0; while caller $i ;do ((i++)) ;done

You could try something like
strace -f -e execve script.sh

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string