Syscall argument in kprobe with wrong value libbpf - linux

I'm trying to use libbpf to trace calls to the kill syscall. Here is my eBPF program:
SEC("kprobe/__x64_sys_kill")
int BPF_KPROBE(__x64_sys_kill, pid_t pid, int sig)
{
bpf_printk("Pid = %i\n", pid);
return 0;
}
For some reason, the pid argument I read inside the program is negative, whereas strace on the kill command shows a positive pid.
$ ping 8.8.8.8 > /dev/null &
[1] 87120
$ strace kill -9 $(pidof ping)
...
kill(87120, SIGKILL) = 0
...
[1]+ Killed ping 8.8.8.8 > /dev/null
Logs:
bash-83960 [001] d... 42409.690336: bpf_trace_printk: Pid = -1060765864
I can't figure out why the value of the pid argument inside the eBPF program is not the same as the one given by the userland process.

Related

Who runs first in fork, with contradicting results

I have this simple test:
int main() {
int res = fork();
if (res == 0) { // child
printf("Son running now, pid = %d\n", getpid());
}
else { // parent
printf("Parent running now, pid = %d\n", getpid());
wait(NULL);
}
return 0;
}
When I run it a hundred times, i.e. run this command,
for ((i=0;i<100;i++)); do echo ${i}:; ./test; done
I get:
0:
Parent running now, pid = 1775
Son running now, pid = 1776
1:
Parent running now, pid = 1777
Son running now, pid = 1778
2:
Parent running now, pid = 1779
Son running now, pid = 1780
and so on; whereas when I first write to a file and then read the file, i.e. run this command,
for ((i=0;i<100;i++)); do echo ${i}:; ./test; done > forout
cat forout
I get it flipped! That is,
0:
Son running now, pid = 1776
Parent running now, pid = 1775
1:
Son running now, pid = 1778
Parent running now, pid = 1777
2:
Son running now, pid = 1780
Parent running now, pid = 1779
I know about the scheduler. What does this result mean, in terms of who runs first after forking?
The forking function, do_fork() (at kernel/fork.c) ends with setting the need_resched flag to 1, with the comment by kernel developers saying, "let the child process run first."
I guessed that this has something to do with the buffers that the printf writes to.
Also, is it true to say that the output redirection (>) writes everything to a buffer first and only then copies it to the file? And even so, why would this change the order of the prints?
Note: I am running the test on a single-core virtual machine with a Linux kernel v2.4.14.
Thank you for your time.
When you redirect, glibc detects that stdout is not a tty and switches it to full buffering for efficiency. The buffer is therefore not written until it fills up or the process exits. You can see this with e.g.:
int main() {
printf("hello world\n");
sleep(60);
}
When you run it interactively, it prints "hello world" and waits. When you redirect to a file, you will see that nothing is written for 60 seconds:
$ ./foo > file & tail -f file
(no output for 60 seconds)
Since your parent process waits for the child, it will necessarily always exit last, and therefore flush its output last.

execvp appears to change pid of the process

I created a program that does:
1. fork
2. From the son, call execvp with csh. The csh started by execvp runs a script a.sh that prints in an infinite loop.
My problem is that I can't stop or kill the process from the father (using kill(SIGKILL,pid) from the father process didn't work).
I think that the problem is in execvp.
When I print the pid from the script (echo $BASHPID) I get a different pid from the one that I get before the execvp. I know the pid is supposed to remain the same after execvp, but it seems like it doesn't.
Here is the problematic code:
int ExeExternal(char* args[MAX_ARG], char* cmdString, int* fg_pid, char** L_Fg_Cmd){
//int child_status;
int pID;
pID = fork();
switch(pID) {
case -1: //error
perror("fork");
return 1;
case 0 : // Child Process
setpgrp();
printf("son getpid() : %d",getpid());
fflush(stdout);
char* argument_for_cshs[5];
char* cmd="csh";
argument_for_cshs[0]="csh";
argument_for_cshs[1]="-f";
argument_for_cshs[2]="-c";
argument_for_cshs[3]=cmdString;
argument_for_cshs[4]=NULL;
execvp(argument_for_cshs[0],argument_for_cshs);
//if return => there is a problem
Any solution?

How can I continue sending to stdin after input from bash process substitution finishes?

I'm using gdb.
I run a command like the below to set up the program by sending it input to stdin:
r < <(python -c "print '1\n2\n3'")
I want that command to allow me to start typing input after it finishes (so I can interact with the debugee normally) instead of stdin being closed.
This would work in bash but you can't pipe to the gdb r command this way:
cat <(python -c "print '1\n2\n3'") - | r
The below doesn't work; I assume it waits for EOF before it sends the input to the program.
r < <(cat <(python -c "print '1\n2\n3'") -)
Is there a third option that will work?
This sounds like a job for expect.
Given
#include <stdio.h>
int main()
{
char *cp = NULL;
size_t n = 0;
while(getline(&cp, &n, stdin) >= 0) {
fprintf(stderr, "got: %s", cp);
}
return 0;
}
gcc -g -Wall t.c
And this expect script:
#!/usr/bin/expect
spawn gdb -q ./a.out
send run\n
send 1\n2\n3\n
interact
Here is the session:
$ ./t.exp
spawn gdb -q ./a.out
run
1
2
3
Reading symbols from ./a.out...done.
(gdb) run
Starting program: /tmp/a.out
got: 1
got: 2
got: 3
Now the script is waiting for my input. I provide some:
foo bar baz
got: foo bar baz
I can also interact with GDB:
^C
Program received signal SIGINT, Interrupt.
0x00007ffff7b006b0 in __read_nocancel () at ../sysdeps/unix/syscall-template.S:81
81 ../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb) bt
#0 0x00007ffff7b006b0 in __read_nocancel () at ../sysdeps/unix/syscall-template.S:81
#1 0x00007ffff7a8f5a0 in _IO_new_file_underflow (fp=0x7ffff7dd4640 <_IO_2_1_stdin_>) at fileops.c:613
#2 0x00007ffff7a840d5 in _IO_getdelim (lineptr=0x7fffffffdda0, n=0x7fffffffdda8, delimiter=10, fp=0x7ffff7dd4640 <_IO_2_1_stdin_>) at iogetdelim.c:77
#3 0x000000000040064e in main () at t.c:9

Suppressing the segfault signal

I am analyzing a set of buggy programs that may terminate with a segfault under some tests. The segfault event is logged in /var/log/syslog.
For example, the following snippet prints Segmentation fault, and the event is logged.
#!/bin/bash
./test
My question is how to suppress the segfault such that it does NOT appear in the system log. I tried trap to capture the signal in the following script:
#!/bin/bash
set -bm
trap "echo 'something happened'" {1..64}
./test
It returns:
Segmentation fault
something happened
So it does trap the segfault, but the segfault is still logged:
kernel: [81615.373989] test[319]: segfault at 0 ip 00007f6b9436d614
sp 00007ffe33fb77f8 error 6 in libc-2.19.so[7f6b942e1000+1bb000]
You can try to change ./test to the following line:
. ./test
This will execute ./test in the same shell.
We can suppress the log message system-wide with e.g.
echo 0 >/proc/sys/debug/exception-trace
- see also
Making the Linux kernel shut up about segfaulting user programs
Is there a way to temporarily disable segfault messages in dmesg?
We can suppress the log message for a single process if we run it under ptrace() control, as in a debugger. This program does that:
exe.c
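#include <signal.h>
#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(int argc, char *args[])
{
    pid_t pid;

    if (*++args) {
        if ((pid = fork()) != 0) {
            /* Parent (tracer): pass every signal on, except the SIGTRAP
             * that the traced child receives on exec. */
            int status;
            while (wait(&status) > 0) {
                if (!WIFSTOPPED(status))
                    return WIFSIGNALED(status) ? 128 + WTERMSIG(status)
                                               : WEXITSTATUS(status);
                int sig = WSTOPSIG(status);
                if (sig == SIGTRAP)
                    sig = 0;
                ptrace(PTRACE_CONT, pid, 0, sig);
            }
            perror("wait");
        } else {
            /* Child: ask to be traced, then run the buggy program. */
            ptrace(PTRACE_TRACEME, 0, 0, 0);
            execvp(*args, args);
            perror(*args);
        }
    }
    return 1;
}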
It is called with the buggy program as its argument, in your case
exe ./test
- then the exit status of exe normally is the exit status of test, but if test was terminated by signal n (11 for Segmentation fault), it is 128+n.
After I wrote this, I realized that we can also use strace for the purpose, e.g.
strace -enone ./test

ps display thread name

Is there a way for ps (or similar tool) to display the pthread's name?
I wrote the following simple program:
// th_name.c
#define _GNU_SOURCE   // needed for pthread_setname_np()
#include <stdio.h>
#include <pthread.h>
#include <unistd.h>   // sleep()
void *f1(void *arg) {
    (void)arg;
    printf("f1 : Starting sleep\n");
    sleep(30);
    printf("f1 : Done sleep\n");
    return NULL;
}
int main() {
    pthread_t f1_thread;
    pthread_create(&f1_thread, NULL, f1, NULL);
    pthread_setname_np(f1_thread, "f1_thread");
    printf("Main : Starting sleep\n");
    sleep(40);
    printf("Main : Done sleep\n");
    return 0;
}
Is there a command/utility (like ps) that I can use to display the threads of the above program, along with their names?
$ /tmp/th_name > /dev/null &
[3] 2055
$ ps -eLf | egrep "th_name|UID"
UID PID PPID LWP C NLWP STIME TTY TIME CMD
aal 31088 29342 31088 0 2 10:01 pts/4 00:00:00 /tmp/th_name
aal 31088 29342 31089 0 2 10:01 pts/4 00:00:00 /tmp/th_name
aal 31095 29342 31095 0 1 10:01 pts/4 00:00:00 egrep th_name|UID
I am running my program on Ubuntu 12.10.
With procps-ng (https://gitlab.com/procps-ng/procps) there are the output options -L and -T, which print thread names:
$ ps -eL
$ ps -eT
The -l long format may be used with them:
$ ps -eLl
$ ps -eTl
but the -f option replaces the thread name with the full command line, which is the same for all threads.
Note the man page of pthread_setname_np(), which shows how to get the threads' names:
pthread_setname_np() internally writes to the thread specific comm
file under /proc filesystem: /proc/self/task/[tid]/comm.
pthread_getname_np() retrieves it from the same location.
and
Example
The program below demonstrates the use of pthread_setname_np() and
pthread_getname_np().
The following shell session shows a sample run of the program:
$ ./a.out
Created a thread. Default name is: a.out
The thread name after setting it is THREADFOO.
^Z #Suspend the program
[1]+ Stopped ./a.out
$ ps H -C a.out -o 'pid tid cmd comm'
PID TID CMD COMMAND
5990 5990 ./a.out a.out
5990 5991 ./a.out THREADFOO
$ cat /proc/5990/task/5990/comm
a.out
$ cat /proc/5990/task/5991/comm
THREADFOO
Show the thread IDs and names of the process with PID 12345:
ps H -o 'tid comm' 12345
