Detect if pid is zombie on Linux

We can detect whether a given process is a zombie from the shell command line:
ps ef -o pid,stat | grep <pid> | grep Z
To get that information in our C/C++ programs we currently use popen(), but we would like to avoid it. Is there a way to get the same result without spawning additional processes?
We are using Linux 2.6.32-279.5.2.el6.x86_64.

You need to use the proc(5) filesystem. Access to files inside it (e.g. /proc/1234/stat ...) is really fast (it does not involve any physical I/O).
You probably want the third field of /proc/1234/stat (which is readable by everyone, but should be read sequentially, since it is unseekable). If that field is Z, the process with pid 1234 is a zombie.
No need to fork a process (e.g. with popen or system); in C you might code:
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

pid_t somepid;
// put the process pid you are interested in into somepid
bool iszombie = false;
// open the /proc/<pid>/stat pseudo-file
char pbuf[32];
snprintf(pbuf, sizeof(pbuf), "/proc/%d/stat", (int) somepid);
FILE* fpstat = fopen(pbuf, "r");
if (!fpstat) { perror(pbuf); exit(EXIT_FAILURE); }
{
    // the first fields are: pid, (comm), state
    int rpid = 0; char rcmd[32]; char rstatc = 0;
    if (fscanf(fpstat, "%d %30s %c", &rpid, rcmd, &rstatc) == 3)
        iszombie = (rstatc == 'Z');
}
fclose(fpstat);
Consider also procps and libproc; see this answer.
(You could also read the second line of /proc/1234/status, but that is probably harder to parse in C or C++ code.)
BTW, I find that the stat file in /proc/ has a weird format: if your executable happens to contain both spaces and parentheses in its name (which is disgusting, but permitted), parsing the /proc/*/stat file becomes tricky.
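If you do need to cope with such names, one trick (a sketch of my own, not part of the code above) is to scan for the last ')' on the line, since everything before it belongs to the pid and comm fields, and read the state character just after it:

#include <stdbool.h>
#include <stdio.h>
#include <string.h>

// returns true if /proc/<pid>/stat reports state 'Z' (zombie)
static bool is_zombie(int pid)
{
    char path[32], line[512];
    snprintf(path, sizeof(path), "/proc/%d/stat", pid);
    FILE *f = fopen(path, "r");
    if (!f) return false;              // no such process, or no access
    bool z = false;
    if (fgets(line, sizeof(line), f)) {
        char *rp = strrchr(line, ')'); // end of the (comm) field
        if (rp && rp[1] == ' ')
            z = (rp[2] == 'Z');        // the state is the char after ") "
    }
    fclose(f);
    return z;
}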

Related

Memory usage when reading from STDIN

If I have a text file (textfile) with lines of text and a Perl script (perlscript) of
#!/usr/bin/perl
use strict;
use warnings;
my $var;
$var .= $_ while (<>);
print $var;
And the command run in terminal
cat ./textfile | ./perlscript | ./perlscript | ./perlscript
If I run the above code on a 1 KB text file, then, other than the program stacks etc., have I used 4 KB of memory? Or when I pull from STDIN, have I freed that memory, so that I would only use 1 KB?
To word the question another way: is copying from STDIN to a variable effectively neutral in memory usage, or does it double memory consumption?
You've already got a good answer, but I wasn't satisfied with my guess, so I decided to test my assumptions.
I made a simple C++ program called streamstream that just takes STDIN and writes it to STDOUT in 1024-byte chunks. It looks like this:
#include <stdio.h>

int main()
{
    const int BUF_SIZE = 1024;
    unsigned char* buf = new unsigned char[BUF_SIZE];

    // shuttle stdin to stdout one chunk at a time
    size_t read = fread(buf, 1, BUF_SIZE, stdin);
    while (read > 0)
    {
        fwrite(buf, 1, read, stdout);
        read = fread(buf, 1, BUF_SIZE, stdin);
    }
    delete[] buf;
}
To test how the program uses memory, I ran it with valgrind while piping the output from one to another as follows:
cat onetwoeightk | valgrind --tool=massif ./streamstream | valgrind --tool=massif ./streamstream | valgrind --tool=massif ./streamstream | hexdump
...where onetwoeightk is just a 128KB file of random bytes. Then I used the ms_print tool on the massif output to aid in interpretation. Obviously there is the overhead of the program itself and its heap, but it starts at about 80KB and never grows beyond that, because it's sipping STDIN just one kilobyte at a time.
The data is passed from process to process 1 kilobyte at a time. Our overall memory usage will peak at 1 kilobyte * the number of instances of the program handling the stream.
Now let's do what your perl program is doing--I'll read the whole stream (growing my buffer each time) and then write it all to STDOUT. Then I'll check the valgrind output again.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main()
{
    const int BUF_INCREMENT = 1024;
    unsigned char* inbuf = (unsigned char*)malloc(BUF_INCREMENT);
    unsigned char* buf = NULL;
    unsigned int bufsize = 0;

    // accumulate the entire stream in one ever-growing buffer
    size_t read = fread(inbuf, 1, BUF_INCREMENT, stdin);
    while (read > 0)
    {
        bufsize += read;
        buf = (unsigned char*)realloc(buf, bufsize);
        memcpy(buf + bufsize - read, inbuf, read);
        read = fread(inbuf, 1, BUF_INCREMENT, stdin);
    }

    // only write it all out once stdin hits EOF
    fwrite(buf, 1, bufsize, stdout);
    free(inbuf);
    free(buf);
}
Unsurprisingly, memory usage climbs to over 128 kilobytes over the execution of the program.
KB
137.0^ :#
| ::#
| ::::#
| :#:::#
| :::#:::#
| :::::#:::#
| :#:::::#:::#
| :#:#:::::#:::#
| ::#:#:::::#:::#
| :#::#:#:::::#:::#
| :#:#::#:#:::::#:::#
| #::#:#::#:#:::::#:::#
| ::#::#:#::#:#:::::#:::#
| :#::#::#:#::#:#:::::#:::#
| #:#::#::#:#::#:#:::::#:::#
| ::#:#::#::#:#::#:#:::::#:::#
| :#::#:#::#::#:#::#:#:::::#:::#
| #::#::#:#::#::#:#::#:#:::::#:::#
| ::#::#::#:#::#::#:#::#:#:::::#:::#
| ::::#::#::#:#::#::#:#::#:#:::::#:::#
0 +----------------------------------------------------------------------->ki
0 210.9
But the question is, what is the total memory usage due to this approach? I can't find a good tool for measuring the memory footprint over time of a set of interacting processes. ps doesn't seem accurate enough here, even when I insert a bunch of sleeps. But we can work it out: the 128KB buffer is only freed at the end of program execution, after the stream is written. But while the stream is being written, another instance of the program builds its own 128KB buffer. So we know our memory usage will climb to 2x 128KB. But it won't rise to 3x or 4x 128KB by chaining more instances of our program, as our instances free their memory and close as soon as they are done writing to STDOUT.
More like 2 KB, but a 1 KB file isn't a very good example, as your read buffer is probably bigger than that. Let's make the file 1 GB instead. Then your peak memory usage would probably be around 2 GB plus some overhead. cat uses negligible memory, just shuffling its input to its output. The first perl process has to read all of that input and store it in $var, using 1 GB (plus a little bit). Then it starts writing it to the second one, which will store it into its own private $var, also using 1 GB (plus a little bit), so we're up to 2 GB. When the first perl process finishes writing, it exits, which closes its stdout, causing the second perl process to get EOF on stdin, which is what makes the while (<>) loop terminate and the second perl process start writing. At this point the third perl process starts reading and storing into its own $var, using another 1 GB, but the first one is gone, so we're still in the neighborhood of 2 GB. Then the second perl process ends, the third starts writing to stdout, and it finally exits itself.

Fork() linux issue

Why does the program print 'do' four times instead of just once?
Code:
#include <stdio.h>
#include <unistd.h>

int main()
{
    printf(" do ");
    if (fork() != 0) printf(" ma ");
    if (fork() == 0) printf(" to \n ");
    else printf("\n");
}
The program prints
do ma
do
do ma to
do to
You call fork twice in your "if" statements:
if(fork()!=0) printf(" ma ");
if(fork()==0) printf(" to \n ");
On the first fork, the parent A spawns a child B; then both the parent and the child invoke fork a second time. The parent spawns child C and the child spawns child D. The result is 4 processes: A, B, C, D.
A ---- B
|      |
C      D
Since your prints are buffered until flushed to stdout, and each forked process gets a copy of this buffer, four "do"s are printed (see #ilkkachu's answer).
If you intend to have a single "do", you should do this instead:
pid_t pid = fork();
if (pid > 0) {
    printf(" do ");
    printf(" ma ");
} else {
    printf(" to \n");
}
Basically, store the return value of fork() in a variable instead of invoking fork twice in your "if" statements.
Because standard output is line-buffered by default (or fully buffered, if you redirect it to a file or a pipe).
The first printf doesn't hit a newline, so it only adds the string do to a buffer internal to the C library. On the first fork, the whole process, including that buffer, is duplicated. Then one of the copies adds ma to its buffer, and both copies are duplicated (since both processes call fork again, not just the parent or the child.)
Finally, either printf(" to \n ") or printf("\n") is called, producing a newline, which triggers the actual writing of whatever was in the buffer.
You can either use fflush(stdout) to force the C library to output any buffered data before you fork, or use setbuf(stdout, NULL) to disable buffering completely.
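For instance, here is the question's program with the buffer drained before each fork(); with this change, " do " (and " ma ") are written exactly once:

#include <stdio.h>
#include <unistd.h>

int main()
{
    printf(" do ");
    fflush(stdout);      /* drain the buffer before it gets duplicated */
    if (fork() != 0) printf(" ma ");
    fflush(stdout);      /* ...and again before the second fork */
    if (fork() == 0) printf(" to \n ");
    else printf("\n");
}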

Continuously monitor linux console logs using C/C++ application

I have a third-party library application which runs continuously and prints to the console when some event occurs.
I want to take some action when a specific event occurs, so I need to monitor the console prints continuously to trigger my action.
Is it possible to write an application which can continuously monitor strings dumped to the console (stdout) and do some processing when a particular line is detected?
I have tried to use popen(), but it keeps waiting until the library application stops execution.
Here is my sample code using popen():
#include <stdio.h>

int main()
{
    FILE *fd = NULL;
    char buf[512] = {0};

    fd = popen("./monitor", "r");
    if (fd == NULL)
        return 1;
    while (fgets(buf, 512, fd) != NULL)
    {
        printf(__FILE__ " : message : %s\n", buf);
    }
    printf("EOF detected!\n");
    pclose(fd);
    return 0;
}
Can anyone please let me know the proper way to monitor console logs and take action?
Here is an example piece of code I've written recently that reads from stdin and prints to stdout.
#include <stdio.h>

void echo(int bufferSize) {
    // Disable output buffering so each line appears immediately.
    setbuf(stdout, NULL);
    char buffer[bufferSize];
    while (fgets(buffer, sizeof(buffer), stdin)) {
        printf("%s", buffer);
    }
}
As I understand it, you have an issue similar to the one I had initially: delayed output, because I didn't use:
setbuf(stdout, NULL);
You can also read from stdin (that's what my example code does): just pipe your command to your C program, or, if you just want to filter output, pipe it to grep. If it's a standard syslog log you could also use tail on the log file:
tail -f <logfile> | <your c program>
or, for just filtering:
tail -f <logfile> | grep "<your string here>"
or, if there is no log file, pipe the stdout logs this way:
<your app> | <your c program>
or
<your app> | grep "<your string here>"
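If you take the stdin route in C, a minimal sketch of the filtering program might look like this (the marker string "EVENT:" and the action are just assumed placeholders for your own):

#include <stdio.h>
#include <string.h>

int main(void)
{
    char line[512];
    setbuf(stdout, NULL);               /* as above: no delayed output */
    while (fgets(line, sizeof(line), stdin)) {
        if (strstr(line, "EVENT:"))     /* hypothetical trigger string */
            printf("action: %s", line); /* put your real action here */
    }
    return 0;
}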
Here is a 3rd-party program simulated by a shell script that writes to stdout:
#!/bin/bash
while true; do
echo "something"
sleep 2
done
You want to write something like this to capture the output from the 3rd party program and then act on the information:
#!/bin/bash
while read line; do
if [[ $line == "something" ]]; then
echo "do action here"
fi
done
Then combine them with a pipe operator:
./dosomething.sh | ./act.sh

How does Shell implement pipe programmatically?

I understand how I/O redirection works in Unix/Linux, and I know the shell uses this feature to pipeline programs with a special type of file - an anonymous pipe. But I'd like to know the details of how the shell implements it programmatically. I'm interested not only in the system calls involved, but also in the whole picture.
For example, in ls | sort, how does the shell perform I/O redirection for ls and sort?
The whole picture is complex and the best way to understand it is to study a small shell. For a limited picture, here goes. Before doing anything, the shell parses the whole command line, so it knows exactly how to chain processes. Let's say it encounters proc1 | proc2.
It sets up a pipe. Long story short, whatever is written to thepipe[1] can be read from thepipe[0]:
int thepipe[2];
pipe(thepipe);
It forks the first process and changes the direction of its stdout before exec
dup2 (thepipe[1], STDOUT_FILENO);
It execs the new program which is blissfully unaware of redirections and just writes to stdout like a well-behaved process
It forks the second process and changes the source of its stdin before exec
dup2 (thepipe[0], STDIN_FILENO);
It execs the new program, which is unaware its input comes from another program
Like I said, this is a limited picture. In a real shell these steps are daisy-chained in a loop, and the shell also remembers to close pipe ends at opportune moments.
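To make those steps concrete, here is a minimal sketch of my own (not a real shell, with error handling mostly omitted) that wires up ls | sort with exactly these calls; note that every process must close the pipe ends it does not use, or sort would never see end-of-file:

#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    int thepipe[2];
    if (pipe(thepipe) == -1) { perror("pipe"); exit(1); }

    if (fork() == 0) {                /* child 1: ls writes into the pipe */
        dup2(thepipe[1], STDOUT_FILENO);
        close(thepipe[0]); close(thepipe[1]);
        execlp("ls", "ls", (char *)NULL);
        perror("execlp ls"); _exit(127);
    }
    if (fork() == 0) {                /* child 2: sort reads from the pipe */
        dup2(thepipe[0], STDIN_FILENO);
        close(thepipe[0]); close(thepipe[1]);
        execlp("sort", "sort", (char *)NULL);
        perror("execlp sort"); _exit(127);
    }

    /* the shell keeps neither end open, otherwise sort never gets EOF */
    close(thepipe[0]); close(thepipe[1]);
    while (wait(NULL) > 0)
        ;
    return 0;
}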
This is a sample program from the book Operating System Concepts by Silberschatz.
The program is self-explanatory if you know the concepts of fork() and related calls. Hope this helps! (If you still want an explanation, I can explain it!)
Obviously some changes (such as exec'ing the two programs after fork()) should be made if you want it to work like
ls | sort
#include <unistd.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

#define BUFFER_SIZE 25
#define READ_END 0
#define WRITE_END 1

int main(void)
{
    char write_msg[BUFFER_SIZE] = "Greetings";
    char read_msg[BUFFER_SIZE];
    int fd[2];
    pid_t pid;

    /* create the pipe */
    if (pipe(fd) == -1) {
        fprintf(stderr, "Pipe failed");
        return 1;
    }

    /* fork a child process */
    pid = fork();
    if (pid < 0) { /* error occurred */
        fprintf(stderr, "Fork Failed");
        return 1;
    }

    if (pid > 0) { /* parent process */
        /* close the unused end of the pipe */
        close(fd[READ_END]);
        /* write to the pipe */
        write(fd[WRITE_END], write_msg, strlen(write_msg) + 1);
        /* close the write end of the pipe */
        close(fd[WRITE_END]);
    }
    else { /* child process */
        /* close the unused end of the pipe */
        close(fd[WRITE_END]);
        /* read from the pipe */
        read(fd[READ_END], read_msg, BUFFER_SIZE);
        printf("read %s", read_msg);
        /* close the read end of the pipe */
        close(fd[READ_END]);
    }
    return 0;
}

How to determine the date-and-time that a Linux process was started?

If I look at /proc/6945/stat then I get a series of numbers, one of which is the number of CPU-centiseconds for which the process has been running.
But I'm running these processes on heavily-loaded boxes, and what I'm interested in is the clock-time when the job will finish, for which I want to know the clock-time that it started.
The timestamps on files in /proc/6945 look to be in the right sort of range but I can't find a particular file which consistently has the right clock-time on it.
As always I can't modify the process.
Timestamps of the directories in /proc are useless.
I was advised to look at 'man proc'; this says that field 22 of /proc/$PID/stat (index 21 when counting fields from zero, as the code below does) records the start-time of the process in kernel jiffies since boot ... so:
open A,"< /proc/stat"; while (<A>) { if (/^btime ([0-9]*)/) { $btime = $1 } }
to obtain the boot time, then
my @statl = split " ",`cat /proc/$i/stat`;
$starttime_jiffies = $statl[21];
$starttime_ut = $btime + $starttime_jiffies / $jiffies_per_second;
$cputime = time-$starttime_ut
but I set $jiffies_per_second to 100 because I don't know how to ask the kernel for its value from perl.
I have a project on github that does this in perl. You can find it here:
https://github.com/cormander/psj
The code you're wanting is in lib/Proc/PID.pm; here is the snippet (with comments removed):
use POSIX qw(ceil sysconf _SC_CLK_TCK);
sub _start_time {
my $pid = shift->pid;
my $tickspersec = sysconf(_SC_CLK_TCK);
my ($secs_since_boot) = split /\./, file_read("/proc/uptime");
$secs_since_boot *= $tickspersec;
my $start_time = (split / /, file_read("/proc/$pid/stat"))[21];
return ceil(time() - (($secs_since_boot - $start_time) / $tickspersec));
}
Beware the non-standard function file_read in here, but that should be pretty straightforward.
Use the creation timestamp of the /proc/6945 directory (or whatever the PID is), rather than looking at the files it contains. For example:
ls -ld /proc/6945
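In C, this approach is just a stat() call on the directory; a minimal sketch (bearing in mind that the asker above found these timestamps unreliable):

#include <stdio.h>
#include <sys/stat.h>
#include <time.h>

int main(void)
{
    struct stat st;
    if (stat("/proc/6945", &st) == 0) { /* same PID as in the question */
        time_t t = st.st_mtime;
        printf("started: %s", ctime(&t));
    }
    return 0;
}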
A Bash command to get the start date of a process (note the @, which tells date the value is seconds since the epoch):
date -d @$(cat /proc/PID/stat | awk "{printf \"%.0f\", $(grep btime /proc/stat | cut -d ' ' -f 2)+\$22/$(getconf CLK_TCK);}")
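A rough C version of the same arithmetic (btime from /proc/stat, plus field 22 of /proc/<pid>/stat divided by CLK_TCK); the skip past the (comm) field is my own sketch:

#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

static time_t proc_start_time(int pid)
{
    char path[64], line[1024];
    long btime = -1;
    unsigned long long starttime;

    /* boot time, in seconds since the epoch */
    FILE *f = fopen("/proc/stat", "r");
    if (!f) return (time_t)-1;
    while (fgets(line, sizeof(line), f))
        if (sscanf(line, "btime %ld", &btime) == 1)
            break;
    fclose(f);
    if (btime < 0) return (time_t)-1;

    snprintf(path, sizeof(path), "/proc/%d/stat", pid);
    f = fopen(path, "r");
    if (!f) return (time_t)-1;
    char *got = fgets(line, sizeof(line), f);
    fclose(f);
    if (!got) return (time_t)-1;

    /* skip past "(comm)" (it may contain spaces), then count spaces
       until we stand on field 22, the start time in clock ticks */
    char *p = strrchr(line, ')');
    if (!p) return (time_t)-1;
    int field = 2;
    for (p++; *p && field < 22; p++)
        if (*p == ' ')
            field++;
    if (sscanf(p, "%llu", &starttime) != 1) return (time_t)-1;

    return btime + (time_t)(starttime / sysconf(_SC_CLK_TCK));
}

int main(void)
{
    time_t t = proc_start_time(getpid());
    printf("this process started at %s", ctime(&t));
    return 0;
}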
