I've executed the following C code in Linux CentOS to create a process.
#include <stdio.h>
#include <unistd.h>
int main ()
{
int i = 0;
while ( 1 )
{
printf ( "\nhello %d\n", i ++ );
sleep ( 2 );
}
}
I've compiled it to hello_count. When I run ./hello_count, the output looks like this:
hello 0
hello 1
hello 2
...
till I kill it.
I've stopped the execution using the following command
kill -s SIGSTOP 2956
When I do
ps -e
the process 2956 ./hello_count is still listed.
Is there any command or any method to resume (not to restart) the process having process id 2956?
Also, when I stop the process, the command line shows:
[1]+ Stopped ./hello_count
What does the [1]+ in the above line mean?
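To resume (continue) a stopped process, send it SIGCONT: kill -SIGCONT PID
Regarding [1]+: that is bash's job-control notation. [1] is the job number, and + marks the current job, the one that fg and bg act on by default. For further information, try help jobs at the bash prompt.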
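For illustration, the same stop/resume sequence can be driven from C with kill(2) and waitpid(2). This is only a minimal sketch, not part of the original answer: the helper name stop_and_resume_demo is made up, and it assumes a Linux/POSIX system where waitpid supports WUNTRACED and WCONTINUED.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork an idle child, stop it, then resume it.
 * Returns 0 if the child was observed stopped and then continued,
 * -1 otherwise. */
int stop_and_resume_demo(void)
{
    int status;
    int ok = 0;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0) {
        for (;;)
            pause();                /* child: idle until it is killed */
    }

    /* Equivalent of: kill -s SIGSTOP <pid> */
    if (kill(pid, SIGSTOP) == 0 &&
        waitpid(pid, &status, WUNTRACED) == pid && WIFSTOPPED(status)) {
        /* Equivalent of: kill -s SIGCONT <pid> */
        if (kill(pid, SIGCONT) == 0 &&
            waitpid(pid, &status, WCONTINUED) == pid && WIFCONTINUED(status))
            ok = 1;
    }

    kill(pid, SIGKILL);             /* clean up the demo child */
    waitpid(pid, &status, 0);
    return ok ? 0 : -1;
}
```

The process keeps its PID and its state across the stop; SIGCONT just lets it run again from where it was.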
I'm thinking about some tool that can pause the program at start.
For example, my_bin starts running at once.
$ ./my_bin
With this tool
$ magic_tool ./my_bin
my_bin will start. I can get the PID. Then I can start the actual running later.
I've just tested my suggestion in the comments and it worked! This is the code in my magic_tool.c:
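#include <stdio.h>
#include <unistd.h>
#include <signal.h>

int main (int argc, char *argv[])
{
    pid_t pid;

    if (argc < 2) {
        fprintf(stderr, "usage: %s program\n", argv[0]);
        return 1;
    }
    printf("Executing %s to wrap %s.\n", argv[0], argv[1]);
    pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        raise(SIGSTOP);                         /* child stops itself until SIGCONT */
        execl(argv[1], argv[1], (char *)NULL);  /* then becomes the target program */
    } else {
        printf("PID == %d\n", (int)pid);
    }
    return 0;
}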
I wrote another test program target.c:
#include <stdio.h>
int main ()
{
puts("It works!\n");
return 0;
}
Running ./magic_tool ./target printed a PID and returned to shell. Only after running kill -SIGCONT <printed_pid> was It works! printed. You'll probably want to have PID saved somewhere else and also perform some checks in the magic_tool, but I think this is nonetheless a good proof of concept.
EDIT:
I was playing around with this a bit more and for some reason it didn't always work (see why below). The solution is simple - just follow a proper fork off and die pattern a bit more closely in magic_tool.c:
#include <stdio.h>
#include <unistd.h>
#include <signal.h>

int main (int argc, char *argv[])
{
    pid_t pid;

    if (argc < 2) {
        fprintf(stderr, "usage: %s program\n", argv[0]);
        return 1;
    }
    printf("Executing %s to wrap %s.\n", argv[0], argv[1]);
    pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        setsid();                   /* start a new session and process group */
        pid = fork();
        if (pid == -1)
            return -1;
        if (pid == 0) {
            raise(SIGSTOP);         /* grandchild stops itself until SIGCONT */
            if (execl(argv[1], argv[1], (char *)NULL))
                return -1;
        }
        printf("PID == %d\n", (int)pid);
    }
    return 0;
}
I found an explanation in this answer:
When you start the root process from your shell, it is a process group leader, and its descendants are members of that group. When that leader terminates, the process group is orphaned. When the system detects a newly-orphaned process group in which any member is stopped, then every member of the process group is sent a SIGHUP followed by a SIGCONT.
So, some of your descendant processes are still stopped when the leader terminates, and thus everyone receives a SIGHUP followed by a SIGCONT, which for practical purposes mean they die of SIGHUP.
Exactly which descendants are still stopped (or even just merrily advancing toward exit()) is a timing race.
The answer also links to IEEE Std 1003.1-2017 _Exit entry which contains more details on the matter.
This is much the same idea as @gst's, but done entirely in the shell. You can spawn a subshell (this forks and creates a new PID) and have the subshell send itself SIGSTOP. When the subshell later receives SIGCONT and resumes, it execs the intended program, which replaces the subshell with that program without creating a new PID. So that the main shell can carry on doing other things, run the subshell in the background with &.
In a nutshell:
(kill -STOP $BASHPID && exec ./my_bin) &
subpid=$! # get the pid of above subshell
... do something else ...
kill -CONT $subpid # resume
Another idea, which doesn't suffer from a race between the main process sending SIGCONT and the subshell SIGSTOP-ing itself, is to use a file descriptor to implement the wait instead:
exec {PIPEFD}<> <(:) # set PIPEFD to the file descriptor of an anonymous pipe
(read -u $PIPEFD && exec ./my_bin) &
subpid=$! # get the pid of above subshell
... do something else ...
echo >&$PIPEFD # resume
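The pipe-based approach carries over to C fairly directly. The sketch below is an illustration under assumptions, not code from the answer: launch_gated and release_gated are hypothetical names, and the child simply blocks in read() until the parent writes one byte, then calls execvp().

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that blocks on a pipe before exec'ing
 * argv[0]. On success returns the child's PID and stores the write end of
 * the pipe in *release_fd; on failure returns -1. */
pid_t launch_gated(char *const argv[], int *release_fd)
{
    int fds[2];
    pid_t pid;

    if (pipe(fds) == -1)
        return -1;
    pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        char byte;
        close(fds[1]);                      /* child only reads */
        if (read(fds[0], &byte, 1) != 1)    /* block until the parent writes */
            _exit(126);                     /* writer closed: do not start */
        close(fds[0]);
        execvp(argv[0], argv);
        _exit(127);                         /* exec failed */
    }
    close(fds[0]);                          /* parent only writes */
    *release_fd = fds[1];
    return pid;
}

/* Let the gated child proceed to exec. Returns 0 on success, -1 on error. */
int release_gated(int release_fd)
{
    int ok = write(release_fd, "x", 1) == 1;
    close(release_fd);
    return ok ? 0 : -1;
}
```

As with the shell version, the PID is known before the program starts, and there is no window in which a resume request can arrive too early: a byte written before the child reaches read() just waits in the pipe.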
I have this simple test:
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main() {
    pid_t res = fork();
    if (res == 0) { // child
        printf("Son running now, pid = %d\n", getpid());
    }
    else { // parent
        printf("Parent running now, pid = %d\n", getpid());
        wait(NULL);
    }
    return 0;
}
When I run it a hundred times, i.e. run this command,
for ((i=0;i<100;i++)); do echo ${i}:; ./test; done
I get:
0:
Parent running now, pid = 1775
Son running now, pid = 1776
1:
Parent running now, pid = 1777
Son running now, pid = 1778
2:
Parent running now, pid = 1779
Son running now, pid = 1780
and so on; whereas when I first write to a file and then read the file, i.e. run this command,
for ((i=0;i<100;i++)); do echo ${i}:; ./test; done > forout
cat forout
I get it flipped! That is,
0:
Son running now, pid = 1776
Parent running now, pid = 1775
1:
Son running now, pid = 1778
Parent running now, pid = 1777
2:
Son running now, pid = 1780
Parent running now, pid = 1779
I know about the scheduler. What does this result mean, in terms of who runs first after forking?
The forking function, do_fork() (at kernel/fork.c) ends with setting the need_resched flag to 1, with the comment by kernel developers saying, "let the child process run first."
I guessed that this has something to do with the buffers that the printf writes to.
Also, is it true to say that output redirection (>) writes everything to a buffer first and only then copies it to the file? And even so, why would this change the order of the prints?
Note: I am running the test on a single-core virtual machine with a Linux kernel v2.4.14.
Thank you for your time.
When you redirect, glibc detects that stdout is not a tty and switches it from line buffering to full buffering for efficiency. The buffer is therefore not written until it fills up or the process exits. You can see this with, e.g.:
#include <stdio.h>
#include <unistd.h>

int main() {
    printf("hello world\n");
    sleep(60);
}
When you run it interactively, it prints "hello world" and waits. When you redirect to a file, you will see that nothing is written for 60 seconds:
$ ./foo > file & tail -f file
(no output for 60 seconds)
Since your parent process waits for the child, it will necessarily always exit last, and therefore flush its output last.
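The effect can be reproduced in isolation. The following is only a sketch under assumptions (the function name write_parent_and_child and the short "Son"/"Parent" strings are placeholders): it forks while writing through a fully buffered FILE, which is how stdout behaves when redirected to a file. The child's line lands in the file first because the parent flushes only after waitpid returns.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: parent and child both write through one fully
 * buffered stream. Returns 0 on success, -1 on error. */
int write_parent_and_child(const char *path)
{
    FILE *out = fopen(path, "w");
    pid_t pid;

    if (out == NULL)
        return -1;
    setvbuf(out, NULL, _IOFBF, BUFSIZ);  /* full buffering, as for a non-tty stdout */

    pid = fork();
    if (pid == -1) {
        fclose(out);
        return -1;
    }
    if (pid == 0) {
        fprintf(out, "Son\n");           /* sits in the child's copy of the buffer */
        fclose(out);                     /* flushed when the child finishes */
        _exit(0);
    }
    fprintf(out, "Parent\n");            /* sits in the parent's buffer */
    waitpid(pid, NULL, 0);               /* child has flushed and exited by now */
    return fclose(out) == 0 ? 0 : -1;    /* parent's line reaches the file last */
}
```

Calling fflush(stdout) right after each printf in the original test would make the file order follow the actual execution order again.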
I'm using gdb.
I run a command like the below to set up the program by sending it input to stdin:
r < <(python -c "print '1\n2\n3'")
I want that command to let me start typing input after it finishes (so I can interact with the debuggee normally) instead of stdin being closed.
This would work in bash but you can't pipe to the gdb r command this way:
cat <(python -c "print '1\n2\n3'") - | r
The below doesn't work either; I assume it waits for EOF before it sends anything to the program.
r < <(cat <(python -c "print '1\n2\n3'") -)
Is there a third option that will work?
This sounds like a job for expect.
Given
#include <stdio.h>
int main()
{
char *cp = NULL;
size_t n = 0;
while(getline(&cp, &n, stdin) >= 0) {
fprintf(stderr, "got: %s", cp);
}
return 0;
}
gcc -g -Wall t.c
And this expect script:
#!/usr/bin/expect
spawn gdb -q ./a.out
send run\n
send 1\n2\n3\n
interact
Here is the session:
$ ./t.exp
spawn gdb -q ./a.out
run
1
2
3
Reading symbols from ./a.out...done.
(gdb) run
Starting program: /tmp/a.out
got: 1
got: 2
got: 3
Now the script is waiting for my input. I provide some:
foo bar baz
got: foo bar baz
I can also interact with GDB:
^C
Program received signal SIGINT, Interrupt.
0x00007ffff7b006b0 in __read_nocancel () at ../sysdeps/unix/syscall-template.S:81
81 ../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb) bt
#0 0x00007ffff7b006b0 in __read_nocancel () at ../sysdeps/unix/syscall-template.S:81
#1 0x00007ffff7a8f5a0 in _IO_new_file_underflow (fp=0x7ffff7dd4640 <_IO_2_1_stdin_>) at fileops.c:613
#2 0x00007ffff7a840d5 in _IO_getdelim (lineptr=0x7fffffffdda0, n=0x7fffffffdda8, delimiter=10, fp=0x7ffff7dd4640 <_IO_2_1_stdin_>) at iogetdelim.c:77
#3 0x000000000040064e in main () at t.c:9
I am analyzing a set of buggy programs that may terminate with a segfault under some tests. The segfault event is logged in /var/log/syslog.
For example the following snippet returns Segmentation fault and it is logged.
#!/bin/bash
./test
My question is how to suppress the segfault such that it does NOT appear in the system log. I tried trap to capture the signal in the following script:
#!/bin/bash
set -bm
trap "echo 'something happened'" {1..64}
./test
It returns:
Segmentation fault
something happened
So the trap does fire, but the segfault is still logged:
kernel: [81615.373989] test[319]: segfault at 0 ip 00007f6b9436d614
sp 00007ffe33fb77f8 error 6 in libc-2.19.so[7f6b942e1000+1bb000]
You can try to change ./test to the following line:
. ./test
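This sources ./test in the current shell instead of starting a new process. Note that this only works if ./test is a shell script; a compiled binary cannot be sourced.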
We can suppress the log message system-wide with, e.g.,
echo 0 >/proc/sys/debug/exception-trace
- see also
Making the Linux kernel shut up about segfaulting user programs
Is there a way to temporarily disable segfault messages in dmesg?
We can suppress the log message for a single process if we run it under ptrace() control, as in a debugger. This program does that:
exe.c
#include <signal.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <sys/ptrace.h>

int main(int argc, char *args[])
{
    pid_t pid;

    if (*++args) {
        if ((pid = fork()))
        {
            /* parent: act as a minimal tracer */
            int status;
            while (wait(&status) > 0)
            {
                if (!WIFSTOPPED(status))
                    return WIFSIGNALED(status) ? 128+WTERMSIG(status)
                                               : WEXITSTATUS(status);
                int sig = WSTOPSIG(status);
                if (sig == SIGTRAP) sig = 0;    /* do not pass on the exec trap */
                ptrace(PTRACE_CONT, pid, 0, sig);
            }
            perror("wait");
        }
        else
        {
            /* child: ask to be traced, then run the buggy program */
            ptrace(PTRACE_TRACEME, 0, 0, 0);
            execvp(*args, args);
            perror(*args);
        }
    }
    return 1;
}
It is called with the buggy program as its argument, in your case
exe ./test
The exit status of exe is then normally the exit status of test, but if test was terminated by signal n (11 for a segmentation fault), it is 128+n.
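The 128+n convention can be seen on its own, without ptrace. This is a small sketch with made-up helper names (shell_style_code, run_child_and_report), assuming a POSIX system; it maps a wait status to the same kind of exit code that exe returns.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Map a wait() status to a shell-style exit code. */
int shell_style_code(int status)
{
    if (WIFSIGNALED(status))
        return 128 + WTERMSIG(status);
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    return -1;                      /* stopped or continued: not a final status */
}

/* Hypothetical demo: run a child that either exits with exit_code or,
 * if sig is nonzero, terminates itself with that signal. */
int run_child_and_report(int exit_code, int sig)
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0) {
        if (sig != 0)
            kill(getpid(), sig);    /* die by signal, like a crashing program */
        _exit(exit_code);
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return shell_style_code(status);
}
```

A child that ends with a segmentation fault (signal 11) would map to 139 under this scheme.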
After I wrote this, I realized that we can also use strace for the purpose, e. g.
strace -enone ./test
In my real gdb script, while analyzing a core file, I try to dereference a pointer and get "Error in sourced command file: Cannot access memory at address " and then my gdb script stops. What I want is for gdb to carry on executing my script without stopping. Is that possible?
This is a test program and a test gdb script that demonstrate my problem. In this situation the pointer has a NULL value, but in a real situation the pointer will likely have a non-null invalid value.
This is test C program:
#include <stdio.h>
struct my_struct {
int v1;
int v2;
};
int main()
{
my_struct *p;
printf("%d %d\n", p->v1, p->v2);
return 0;
}
This is a test gdb script:
>cat analyze.gdb
p p->v1
q
And this is a demonstration of the problem (what I want from gdb here is to print this error message and then go on to process the quit command):
>gdb -silent a.out ./core.22384 -x ./analyze.gdb
Reading symbols from /a.out...done.
[New Thread 22384]
Core was generated by `./a.out'.
Program terminated with signal 11, Segmentation fault.
#0 0x0000000000400598 in main () at main.cpp:11
11 printf("%d %d\n", p->v1, p->v2);
./analyze.gdb:1: Error in sourced command file:
Cannot access memory at address 0x0
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.80.el6.x86_64
Update
Thanks to Tom. This is a gdb script that handles this problem:
>cat ./analyze.v2.gdb
python
def my_ignore_errors(arg):
    try:
        gdb.execute("print \"" + "Executing command: " + arg + "\"")
        gdb.execute (arg)
    except:
        gdb.execute("print \"" + "ERROR: " + arg + "\"")
        pass

my_ignore_errors("p p")
my_ignore_errors("p p->v1")
gdb.execute("quit")
This is how it works:
>gdb -silent ./a.out -x ./analyze.v2.gdb -c ./core.15045
Reading symbols from /import/home/a.out...done.
[New Thread 15045]
Core was generated by `./a.out'.
Program terminated with signal 11, Segmentation fault.
#0 0x0000000000400598 in main () at main.cpp:11
11 printf("%d %d\n", p->v1, p->v2);
$1 = "Executing command: p p"
$2 = (my_struct *) 0x0
$3 = "Executing command: p p->v1"
$4 = "ERROR: p p->v1"
$5 = "Executing command: quit"
gdb's command language doesn't have a way to ignore an error when processing a command.
This is easily done, though, if your gdb was built with the Python extension. Search for the "ignore-errors" script. With that, you can:
(gdb) ignore-errors print *foo
... and any errors from print will be shown but not abort the rest of your script.
You can also do this:
gdb a.out < analyze.v2.gdb
This will execute the commands in analyze.v2.gdb line by line, even if an error occurs.
If you just want to exit if any error occurs, you can use the -batch gdb option:
Run in batch mode. Exit with status 0 after processing all the command
files specified with '-x' (and all commands from initialization files,
if not inhibited with '-n'). Exit with nonzero status if an error
occurs in executing the GDB commands in the command files. [...]