Bash subprocess is getting duplicated - linux

I'm seeing a case where code running in a background bash subshell (between parentheses, followed by &) sometimes appears to be executed twice.
Here is the setup:
# script start.sh
#!/bin/bash
echo "Starting ..."
(
    java -server ...
    ret=$?
    log "Process has stopped returning: [$ret]"
    exit $ret
) &
In the normal case, running start.sh creates two processes: one for start.sh itself and one for the background subshell (the Java program):
#> ps -ef | grep ^user
user 24538 1 0 Oct22 ? 00:00:00 /bin/bash start.sh
user 24539 24538 2 Oct22 ? 06:20:56 java -server ...
But after a few days, a new java process appears that is a child of process 24539 (java):
#> ps -ef | grep ^user
user 24538 1 0 Oct22 ? 00:00:00 /bin/bash start.sh
user 24539 24538 18 Oct22 ? 06:20:56 java -server ...
user 25888 24539 2 Oct25 ? 00:00:00 java -server ...
Does anyone have any idea why/how it's happening?

This has nothing to do with the shell; if bash were involved, the parent process id of the new Java process would be 24538, not 24539. The Java process is forking itself. You'd have to look at the code to see why.
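To check which process is the parent, ps can print the PPID of a given PID directly. A minimal sketch, assuming a ps that supports -o and -p (procps on Linux does); sleep stands in for the Java program, and the variable names are illustrative only:

```shell
#!/bin/bash
# Start a stand-in for the long-running program in the background.
sleep 30 &
child=$!

# Ask ps for the parent PID of that process (the trailing "=" suppresses the header).
parent=$(ps -o ppid= -p "$child" | tr -d ' ')
echo "child=$child parent=$parent"

# Show the same information in a ps -ef like layout.
ps -o pid,ppid,args -p "$child"

kill "$child"
```

Applied to the listing in the question, the same check on 25888 reports 24539, which is what points at the Java process rather than at bash.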

Related

Linux: process of a bash shell launched by crontab still running after the shell is terminated

Here is the issue I would like to solve: I am going to deploy a PHP web application under Apache on a Red Hat Linux 5 production environment (called PENV); I am developing the application on a development environment (called DENV) running Linux Mint 20.3.
On DENV, I created a crontab for the user www-data containing the following scheduled command:
0 4,12 * * * sh /bdir/s_etlShell.sh >/dev/null 2>&1;
The script /bdir/s_etlShell.sh starts every day at 4:00 AM and at noon, and its execution takes between 2 and 10 minutes. It also writes to a log file, /bdir/logshell.txt.
The last two instructions of the script are
echo "SHELL TERMINATED" >> /bdir/logshell.txt
exit
After 4:00 AM and after noon, I find SHELL TERMINATED as the final line of /bdir/logshell.txt, but when I run the following command in a terminal
ps fax | grep "s_etlShell.sh" | grep -v grep
I get the following output (the PIDs vary, obviously):
ps fax | grep "s_etlShell.sh" | grep -v grep
1596 ? Ss 0:00 \_ /bin/sh -c sh /bdir/s_etlShell.sh >/dev/null 2>&1
1605 ? S 0:00 \_ sh /bdir/s_etlShell.sh
The script's processes look as if they were still active even though the script has terminated. I would expect no output instead.
I need to check the status of the script execution in the web application via a PHP script (check_etl_shell_status.php) that is called every 2 seconds by the following JavaScript function
function loadCall() {
    setInterval(function () {
        $("#id_content").load("check_etl_shell_status.php", 'q=');
    }, 2000);
}
The function loadCall() is called when the home page loads.
The content of check_etl_shell_status.php is the following
<?php
$output = shell_exec('ps fax | grep "s_etlShell.sh" | grep -v grep');
if ($output) {
    echo "shell is still running...";
} else {
    echo "shell terminated";
}
?>
and the output message is displayed inside a div of the home page
...
<div id="id_content"></div>
...
Is there a way to make sure that, when the script has terminated, whether it was launched by crontab or on demand by the web application, I get the right information on its status?
Thanks to whoever can help me.

Issue when starting wiremock-standalone using crontab

I have a new regression suite that uses the Wiremock standalone JAR. In order to ensure this is running on the server, I have this script called checkwiremock.sh
#!/bin/bash
cnt=$(ps -eaflc --sort stime | grep wiremock-standalone-2.11.0.jar | grep -v grep | wc -l)
if [ "$cnt" -eq 1 ]; then
    echo "Service already running..."
else
    echo "Starting Service"
    nohup java -jar /etc/opt/wiremock/wiremock-standalone-2.11.0.jar --port 1324 --verbose &
fi
The script works as expected when run manually:
./checkwiremock.sh
However, when started from crontab,
* * * * * /bin/bash /etc/opt/wiremock/checkwiremock.sh
Wiremock returns
No response could be served as there are no stub mappings in this WireMock instance.
The only difference I can see between the manually started process and cron process is the TTY
root 31526 9.5 3.2 1309736 62704 pts/0 Sl 11:28 0:01 java -jar /etc/opt/wiremock/wiremock-standalone-2.11.0.jar --port 1324
root 31729 22.0 1.9 1294104 37808 ? Sl 11:31 0:00 java -jar /etc/opt/wiremock/wiremock-standalone-2.11.0.jar --port 1324
Can't figure out what is wrong here.
Server details:
Red Hat Enterprise Linux Server release 6.5 (Santiago)
*Edit: corrected paths to ones actually used
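cron does not start the job in the script's directory, and WireMock looks for its stub mappings relative to the current working directory. Change to the right directory at the top of checkwiremock.sh: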
cd /path/to/shell/script
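Instead of hard-coding the path, the script can change to its own location. A minimal sketch, assuming the mappings directory sits next to the script; script_dir is an illustrative variable name, not something from the original script:

```shell
#!/bin/bash
# Resolve the directory this script lives in, regardless of the caller's working directory.
script_dir=$(cd -- "$(dirname -- "$0")" && pwd)

# cron typically starts jobs in the user's home directory, so change explicitly.
cd -- "$script_dir" || exit 1
echo "working directory: $(pwd)"
```

With these lines at the top of checkwiremock.sh, the nohup java -jar ... command that follows inherits that working directory. WireMock's --root-dir option should be an alternative way to point it at the mappings without changing directory.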

How to identify crontab child job?

My Unix production server has a test.ksh script that runs every day via a scheduled job.
I want to know which crontab job calls this script. I checked using the command below, but I couldn't find the exact job name:
crontab -l
-- 100 jobs are listed --
I went through those 100 jobs, but none of them mentions test.ksh:
crontab -l | grep "test.ksh"
-- file not found --
The file does exist in a directory, but I can't find which job calls the test.ksh script.
Question:
1. Is it a child job? If yes, how can I identify the child job?
You could use pstree -p xxxx, where xxxx is the PID of crond. You will then get a nice hierarchical overview of all descendant processes of crond.
If it is a child script, use ps -ef and use the ppid of the test.ksh job to identify the calling script.
For example, consider these two scripts, the first just calls the second
parent
#! /bin/sh
# Run child process
./child
child
#! /bin/sh
sleep 60
ps -ef shows (with a lot of other processes removed)
UID PID PPID C STIME TTY TIME CMD
501 5725 5724 0 8:22pm ttys000 0:00.28 -bash
501 6046 5725 0 11:38am ttys000 0:00.01 /bin/sh ./parent
501 6047 6046 0 11:38am ttys000 0:00.00 /bin/sh ./child
501 6048 6047 0 11:38am ttys000 0:00.00 sleep 60
The pid is the process identifier, so child has process id 6047. Its ppid - 6046 - is the process id of its parent as you can see looking at the entry for the parent process.
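Following the PPID by hand can be automated by walking up the chain until it reaches the top. A minimal sketch, assuming a ps that accepts -o and -p; the ancestors function is a hypothetical helper, and the demo runs it on the current shell rather than on test.ksh:

```shell
#!/bin/bash
# Print the chain of ancestors of a PID, one "PID PPID COMMAND" line per level.
ancestors() {
    walk_pid=$1
    while [ -n "$walk_pid" ] && [ "$walk_pid" -gt 0 ]; do
        # Stop if the process no longer exists.
        ps -o pid= -o ppid= -o args= -p "$walk_pid" || break
        walk_pid=$(ps -o ppid= -p "$walk_pid" | tr -d ' ')
    done
}

# Demo: show the ancestry of the shell running this script.
chain=$(ancestors $$)
printf '%s\n' "$chain"
```

While test.ksh is running, calling ancestors with its PID (for example one found with pgrep -f test.ksh) should list the calling script and, further up, the cron process that started it.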

How to get the process ID to kill a nohup process?

I'm running a nohup process on the server. When I try to kill it, my PuTTY console closes instead.
This is how I try to find the process ID:
ps -ef |grep nohup
This is the command I use to kill it:
kill -9 1787 787
When you use nohup and put the task in the background, the background operator (&) makes the shell print the PID at the command prompt. If you plan to manage the process manually, you can save that PID and use it later to kill the process if needed, via kill PID or kill -9 PID (if you need to force the kill). Alternatively, you can find the PID later with ps -ef | grep "command name" and read it from the output. Note that the nohup keyword itself does not appear in the ps output for the command in question.
If you use a script, you could do something like this in the script:
nohup my_command > my.log 2>&1 &
echo $! > save_pid.txt
This will run my_command, saving all output into my.log ($! expands to the PID of the most recent background command, so in the script it is the process that was just started). The 2 is the file descriptor for standard error (stderr), and 2>&1 tells the shell to route standard error to standard output (file descriptor 1). The & in &1 is required so that the shell treats it as a file descriptor rather than as a file named 1. The 2>&1 is needed so that error messages, which are normally written to standard error, end up in my.log together with standard output. See I/O Redirection for more details on handling I/O redirection with the shell.
If the command sends output on a regular basis, you can check the output occasionally with tail my.log, or if you want to follow it "live" you can use tail -f my.log. Finally, if you need to kill the process, you can do it via:
kill -9 `cat save_pid.txt`
rm save_pid.txt
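A gentler variant of the same idea checks that the saved PID is still alive and tries a plain kill before falling back to kill -9. A minimal sketch, not a definitive implementation: sleep 300 stands in for my_command, and pid_file is a hypothetical temporary path standing in for save_pid.txt:

```shell
#!/bin/bash
# pid_file is a hypothetical temporary path standing in for save_pid.txt.
pid_file=$(mktemp)

# Start a stand-in long-running job and record its PID.
nohup sleep 300 >/dev/null 2>&1 &
echo $! > "$pid_file"

pid=$(cat "$pid_file")
if kill -0 "$pid" 2>/dev/null; then      # signal 0 only tests that the PID exists
    kill "$pid"                          # polite SIGTERM first
    for attempt in 1 2 3 4 5; do
        kill -0 "$pid" 2>/dev/null || break
        sleep 1
    done
    # Still there after about 5 seconds: force it.
    if kill -0 "$pid" 2>/dev/null; then
        kill -9 "$pid"
    fi
fi
wait "$pid" 2>/dev/null || true          # reap the child if it belongs to this shell
rm -f "$pid_file"
```

Sending SIGTERM first gives the program a chance to clean up; SIGKILL cannot be caught, so it is kept as the last resort.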
I am using Red Hat Linux on a VPS (over SSH with PuTTY); the following worked for me.
First, list all the running processes:
ps -ef
Then look for your user name in the first column; I found mine three times:
One was the SSH connection
The second was an FTP connection
The last one was the nohup process
The second column then gives you the PID of the nohup process, and you only need to type:
kill PID
(replacing PID with the nohup process's PID, of course)
And that is it!
I hope this answer will be useful to someone. I'm also very new to bash and SSH, but I found 95% of the knowledge I need here :)
Suppose I am running a Ruby script in the background with the command below:
nohup ruby script.rb &
Then I can get the PID of that background process by searching for the command name. In my case the command is ruby.
ps -ef | grep ruby
Output:
ubuntu 25938 25742 0 05:16 pts/0 00:00:00 ruby test.rb
Now you can kill the process using the kill command:
kill 25938
jobs -l should give you the PIDs of the nohup processes started from the current shell.
kill (-9) them gently.
;)
You could try
kill -9 `pgrep [command name]`
Suppose you are running a Java program with nohup; you can get the Java process ID with
ps aux | grep java
Output:
xxxxx 9643 0.0 0.0 14232 968 pts/2
Then you can kill the process by typing
sudo kill 9643
Or, if you need to kill all the Java processes, just use
sudo killall java
This command kills all processes named java. You can use it with any process; just give the process name at the end of the command:
sudo killall {processName}
If your application always uses the same port, you can kill all the processes on that port like this:
kill -9 $(lsof -t -i:8080)
This works on Ubuntu.
Type this to find the PID:
ps aux | grep java
All running processes related to java will be shown.
In my case it is
johnjoe 3315 9.1 4.0 1465240 335728 ? Sl 09:42 3:19 java -jar batch.jar
Now kill it:
kill -9 3315
The zombie process finally stopped.
When you start a job with nohup in the background, the shell tells you the process ID:
nohup sh test.sh &
The output shows the process ID, like
25013
You can then kill it:
kill 25013
I started a Django server with the following command:
nohup manage.py runserver <localhost:port>
This works on CentOS:
:~ ns$netstat -ntlp
:~ ns$kill -9 PID
This works fine for me on a Mac:
kill -9 `ps -ef | awk '/nohup/{ print \$2 }'`
I often do it this way:
ps aux | grep script_Name
Here, script_Name can be any script or file run by nohup.
This command gives you the process ID. Then use the command below to kill the script running under nohup.
kill -9 1787 787
Here, 1787 and 787 are Process ID as mentioned in the question as an example.
This should do what was intended in the question.
If you don't know the PID, first find it using the top command:
top -U userid
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
You will get the PID from top; then perform the kill operation.
$ kill -9 <PID>
Today I ran into the same problem. Since it was a long time ago, I had completely forgotten which command I used and when. I tried three methods:
1. Use the STIME shown by ps -ef. This is the time the process started, and it is quite likely that you ran nohup on your command just before closing SSH (depending on your habits). Unfortunately I don't think the latest command is the one I ran with nohup, so this didn't work for me.
2. Use the PPID, also shown by ps -ef. It is the parent process ID, the ID of the process that created the process. On Ubuntu the PPID is typically 1 for a process run with nohup once the shell that launched it has exited. You can then use ps --ppid "1" to get the list, and check TIME (the total CPU time the process has used) or CMD to find the process's PID.
3. Use lsof -i:port if the process is using a port, and you will get the command name. Then, as in the answer above, use ps -ef | grep command to get the PID.
Once you have found the PID of the process, you can use kill pid to terminate it.
About losing your PuTTY session: often the ps ... | awk/grep/perl/... process itself gets matched, too! So the old-school trick is this:
ps -ef | grep -i [n]ohup
That way the regex search doesn't match the regex search process!
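The effect of the bracket trick can be checked without touching real processes by running grep over text shaped like ps output. A minimal sketch; the sample lines, PIDs and variable names are made up for the demonstration, and ruby is used as the search term because nohup itself does not show up in ps:

```shell
#!/bin/bash
# What ps would show while "grep ruby" is running: the job and the grep itself.
ps_with_plain_grep='ubuntu 25938 25742 ruby script.rb
ubuntu 26001 25742 grep ruby'

# What ps would show while "grep [r]uby" is running.
ps_with_bracket_grep='ubuntu 25938 25742 ruby script.rb
ubuntu 26001 25742 grep [r]uby'

# The pattern [r]uby matches the text "ruby" but not the literal text "[r]uby".
plain_hits=$(printf '%s\n' "$ps_with_plain_grep" | grep -c 'ruby')
bracket_hits=$(printf '%s\n' "$ps_with_bracket_grep" | grep -c '[r]uby')
echo "plain grep matches $plain_hits lines, bracketed grep matches $bracket_hits"
```

The plain pattern counts 2 lines (the job plus the grep), the bracketed one counts 1. pgrep, used in another answer here, never reports itself, so it does not need the trick.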
If you are on a remote server, check memory usage with top and find your process and its ID. After that, just execute kill [your process ID].

Who does the daemonizing?

There are various tricks to daemonize a Linux process, i.e. to keep a command running after the terminal is closed.
nohup is used for this purpose, and a fork()/setsid() combination can be used in a C program to turn itself into a daemon process.
That was my understanding of Linux daemons, but today I noticed that exiting the terminal doesn't actually terminate processes started with & at the end of the command.
$ while :; do echo "hi" >> temp.log ; done &
[1] 11108
$ ps -ef | grep 11108
username 11108 11076 83 15:25 pts/0 00:00:05 /bin/sh
username 11116 11076 0 15:25 pts/0 00:00:00 grep 11108
$ exit
(after reconnecting)
$ ps -ef | grep 11108
username 11108 1 91 15:25 pts/0 00:00:17 /bin/sh
username 11130 11540 0 15:25 pts/0 00:00:00 grep 11108
So apparently the process's PPID changed to 1, meaning that it got daemonized somehow.
This contradicts my understanding that & is not enough and that one must use nohup or some other trick to make a process a daemon.
Does anyone know who is doing this daemonizing?
I'm using a CentOS 6.3 host, and PuTTY, Cygwin and sshclient all produced the same result.
A process ends up daemonized if it is not killed by the SIGHUP signal.
When an interactive bash shell receives SIGHUP (the hangup signal, for example because the terminal was closed) while it has background jobs, it resends SIGHUP to those jobs; on a plain exit it does so only if the huponexit shell option is set. In either case bash does not wait until the child processes have completely terminated. A child that ignores SIGHUP, or never receives it, becomes an orphan process: its parent PID is changed to 1 (the init process), which will reap it when it exits so that it does not linger as a zombie.
That is why your command is still running after you log out of the first shell.
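The role of SIGHUP can be checked with a small experiment: a process that ignores the signal keeps running after it is delivered, which is essentially what nohup arranges. A minimal sketch, with sleep as a stand-in job and illustrative variable names:

```shell
#!/bin/bash
# Start a child that ignores SIGHUP. exec keeps the same PID, and an ignored
# signal stays ignored across exec, so the sleep itself is immune.
( trap '' HUP; exec sleep 30 ) &
immune=$!

sleep 1                  # give the subshell time to install its trap
kill -HUP "$immune"      # simulate the hangup a closing terminal would cause
sleep 1

# Signal 0 only tests whether the process still exists.
if kill -0 "$immune" 2>/dev/null; then
    immune_state=alive
else
    immune_state=dead
fi
echo "after SIGHUP the child is $immune_state"

kill "$immune"                        # clean up with SIGTERM, which is not ignored
wait "$immune" 2>/dev/null || true
```

Removing the trap line should make the child die on SIGHUP instead, which is the default action for that signal.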

Resources