Monitoring Process Syscalls in Live Environment - linux

I've been working on a project for a little while, and the first step is building a library of syscall traces for processes. Essentially, what I'm trying to do is have a system wherein every time a process requests an OS service via a syscall, relevant information about the event (calling process, time, syscall name) gets logged to a file.
In theory this sounds simple enough; in practice, implementing it is becoming more of a pain as time goes on. I suppose the main thing causing me trouble is a general lack of knowing where to start.
Initially, I thought this could all be handled by adding a few lines of code to the kernel entry point, but after digging through entry_64.S for a little while, I concluded that there must be an easier way. The next idea I had was to overwrite all the services pointed to by sys_call_table with my own wrappers that did the logging and then called the original service. But it turns out this method runs into difficulties on Linux kernel 5.4.18 because sys_call_table is no longer exported. And even after recompiling the kernel so that sys_call_table is exported, the table sits in write-protected memory. Lastly, I've been experimenting with auditd. Specifically, I followed this link, but it doesn't seem to be working: when I executed a kill command, there was a corresponding result in ausearch only about 50% of the time, based on timestamps.
I'm getting a little burned out by all these dead-ends, and am really hoping to finally have this first stage in my project up and running. Does anyone have any pointers as to what I should try?
Solution: BPFTrace was exactly what I was looking for.

I used BPFTrace to log every time the kernel began execution of a syscall (excluding those initiated by BPFTrace itself).
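A minimal sketch of that kind of bpftrace program (the output fields and the comm-based self-filter here are illustrative, not exactly what I ran):

    // trace.bt: at every syscall entry, log a timestamp (ns since boot),
    // the pid, the process name, and the probe name (which encodes the
    // syscall name), skipping events generated by bpftrace itself.
    tracepoint:syscalls:sys_enter_*
    /comm != "bpftrace"/
    {
        printf("%llu %d %s %s\n", nsecs, pid, comm, probe);
    }

Run it with something like sudo bpftrace trace.bt > syscalls.log. Note that attaching to every sys_enter_* tracepoint adds measurable overhead on a busy system.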

Related

replace a process bin file when it is running

I have a server program (compiled with g++) which is running. I changed some code and compiled a new binary. Without killing the running process, I mv'd the newly created binary over the old one.
After a while, the server process crashed. Does this relate to my replace action?
My server is a multi-threaded, highly concurrent server. One crash is a segfault, the other one is a deadlock.
I printed all the parameters in the core dump file and passed exactly the same values to the function that crashed, but it was OK.
And I carefully watched all the thread info in the deadlock core dump; I cannot find any way it could cause a deadlock.
So I suspect the replacement caused these strange things.
According to this question, if a swap action happens, it can indeed generate strange things.
For a simple standard program, even if it is currently open for execution by the running process, moving a new file over it will first unlink the original file, which otherwise remains untouched.
But for long-running servers, many things can happen: some fork new processes, and occasionally some can even exec a fresh version of the binary. In that case, you could have different versions running side by side, which may or may not be supported depending on the change.
Said differently, without more info on what the server program is, how it is designed to run, and what the change was, the only answer I can give is maybe.
If you can make sure that you removed ONLY the bin file, and that the bin file isn't used by any other process (such as some daemon), then the crash isn't related to your replace action.
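To see the unlink semantics from the first answer in action, here is a small C sketch (file names are invented for the demo; error checking omitted for brevity):

    /* Shows that mv (rename(2)) replaces the directory entry while an
     * already-open descriptor keeps the old inode alive. */
    #include <stdio.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/stat.h>

    int main(void) {
        char buf[16] = {0};
        struct stat st;

        /* "running binary": create it and keep it open, as the kernel
         * does for an executing program */
        int fd = open("old_bin", O_RDWR | O_CREAT | O_TRUNC, 0755);
        write(fd, "version1", 8);

        /* build the "new binary" */
        int nfd = open("new_bin", O_RDWR | O_CREAT | O_TRUNC, 0755);
        write(nfd, "version2", 8);
        close(nfd);

        rename("new_bin", "old_bin");   /* this is what mv does */

        /* the old inode is now unlinked (st_nlink == 0) but still readable */
        fstat(fd, &st);
        pread(fd, buf, 8, 0);
        printf("old fd reads \"%s\", link count %ld\n", buf, (long)st.st_nlink);

        close(fd);
        unlink("old_bin");
        return 0;
    }

A plain cp over the running binary behaves differently: it rewrites the same inode (or fails with ETXTBSY for a file currently being executed), which is exactly the dangerous case.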

Is there a supported way to obtain LDT entries of debuggee?

A userspace process can call modify_ldt(2) to alter entries of its LDT. A debugger, to make correct analysis of what the process reads and where, as well as what code it executes currently, needs to know what descriptor a value of e.g. CS=0x7 selects.
Currently the only way I can think of that could possibly work is injecting some code into the debuggee to retrieve the LDT, executing it, and then restoring the original state. But this is quite error-prone and is likely to break the debugger user's workflow, e.g. when signals arrive.
So is there a better way? I've googled for something like PTRACE_LDT, but the pages are from 2005, and grepping the modern Linux source doesn't find anything relevant to x86 other than comments.
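For context, reading your own LDT is straightforward; it is doing this for the debuggee that has no supported interface. A sketch of the in-process read (no glibc wrapper exists, so it goes through syscall(2)):

    /* Sketch: a process reading its *own* LDT with modify_ldt(2).
     * func = 0 means "read"; n is 0 if the process never set up an LDT. */
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/syscall.h>
    #include <asm/ldt.h>        /* LDT_ENTRIES, LDT_ENTRY_SIZE */

    static unsigned char ldt[LDT_ENTRIES * LDT_ENTRY_SIZE];

    int main(void) {
        long n = syscall(SYS_modify_ldt, 0, ldt, sizeof(ldt));
        if (n < 0) {
            perror("modify_ldt");
            return 1;
        }
        printf("read %ld bytes (%ld raw 8-byte descriptors)\n",
               n, n / LDT_ENTRY_SIZE);
        return 0;
    }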

Open MPI Virtual Timer Expired

I'm using Open MPI 1.8 on Gentoo 3.13 to manage the data transfer from one program to another via a server/client concept. Both the server and the clients are launched via mpiexec as separate processes. After some days (this is quite a heavy computation...), I sometimes receive the error
mpiexec noticed that process rank 0 with PID 17213 on node XXX exited on signal 26 (Virtual timer expired).
Unfortunately, the error is not reproducible in a reliable way, i.e., the error does not appear always and not always at the same point in the program flow. I also experienced this error on other machines. I already tracked the issue down to the ITIMER_VIRTUAL which, upon expiration, delivers SIGVTALRM (see, e.g., http://man7.org/linux/man-pages/man2/setitimer.2.html). In the BUGS section of the man page, it says that
Under very heavy loading, an ITIMER_REAL timer may expire before the signal from a previous expiration has been delivered. The second signal in such an event will be lost.
I wonder if something similar might also hold for ITIMER_VIRTUAL? Did anyone experience similar problems and can confirm the error?
The only workaround I can think of is to invoke setitimer(...) and try to manipulate the timer myself. However, I hope there is another way since I can't always modify the clients' source code. Any suggestions?
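Concretely, the sort of thing I mean (a sketch only; whether disarming the timer is safe depends on what armed it, and this has to run inside the client process):

    /* Sketch: neutralize ITIMER_VIRTUAL in the client. */
    #include <signal.h>
    #include <string.h>
    #include <sys/time.h>

    int main(void) {
        struct itimerval zero;
        memset(&zero, 0, sizeof(zero));

        /* disarm the virtual timer so it can no longer expire */
        setitimer(ITIMER_VIRTUAL, &zero, NULL);

        /* and make a stray SIGVTALRM harmless instead of fatal */
        signal(SIGVTALRM, SIG_IGN);

        /* ... rest of the client ... */
        return 0;
    }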
Since this question has not been answered officially, I will do it on behalf of Hristo (@HristoIliev: I hope this is OK with you). As was pointed out in the first comment to my question, there is not a single hint in the Open MPI source code that could have caused the virtual timer expiration. Indeed, the timer problem was related to a third-party library which made the code crash after an unpredictable time (depending on the current load of the machine).

How to see what started a thread in Xcode?

I have been asked to debug, and improve, a complex multithreaded app, written by someone I don't have access to, that uses concurrent queues (both GCD and NSOperationQueue). I don't have access to a plan of the multithreaded architecture, that's to say a high-level design document of what is supposed to happen when. I need to create such a plan in order to understand how the app works and what it's doing.
When running the code and debugging, I can see in Xcode's Debug Navigator the various threads that are running. Is there a way of identifying where in the source-code a particular thread was spawned? And is there a way of determining to which NSOperationQueue an NSOperation belongs?
For example, I can see in the Debug Navigator (or by using LLDB's "thread backtrace" command) a thread's stacktrace, but the 'earliest' user code I can view is the overridden (NSOperation*) start method - stepping back earlier in the stack than that just shows the assembly instructions for the framework that invokes that method (e.g. __block_global_6, _dispatch_call_block_and_release and so on).
I've investigated and sought various debugging methods but without success. The nearest I got was the idea of method swizzling, but I don't think that's going to work for, say, queued NSOperation threads. Forgive my vagueness please: I'm aware that having looked as hard as I have, I'm probably asking the wrong question, and probably therefore haven't formed the question quite clearly in my own mind, but I'm asking the community for help!
Thanks
The best I can think of is to put breakpoints on dispatch_async, -[NSOperation init], -[NSOperationQueue addOperation:] and so on. You could configure those breakpoints to log their stacktrace and possibly some other info (like the block's address for dispatch_async, or the address of the queue and operation for addOperation:), and then continue running. You could then look through the logs when you're curious where a particular block came from and see what was invoked and from where. (It would still take some detective work.)
You could also accomplish something similar with dtrace if the breakpoints method is too slow.
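For example, in the LLDB console the breakpoint setup might look like this (breakpoint numbers are illustrative; bt logs the stacktrace and continue keeps the app running):

    (lldb) breakpoint set -n dispatch_async
    (lldb) breakpoint set -n "-[NSOperationQueue addOperation:]"
    (lldb) breakpoint command add 1
    > bt
    > continue
    > DONE
    (lldb) breakpoint command add 2
    > bt
    > continue
    > DONE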

How do you safely read memory in Unix (or at least Linux)?

I want to read a byte of memory but don't know if the memory is truly readable or not. You can do it under OS X with the vm_read function and under Windows with ReadProcessMemory or with SEH (__try/__except). Under Linux I believe I can use ptrace, but only if the process is not already being debugged.
FYI the reason I want to do this is I'm writing exception handler program state-dumping code, and it helps the user a lot if the user can see what various memory values are, or for that matter know if they were invalid.
If you don't have to be fast, which is typical for state-dump and exception-handler code, then you can put your own signal handler in place before the access attempt, and restore it afterwards. Incredibly painful and slow, but it is done.
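A sketch of that approach (the global jump buffer makes this single-threaded only; it is a sketch, not hardened code):

    /* Probe a single byte, recovering from SIGSEGV/SIGBUS via
     * sigsetjmp/siglongjmp. */
    #include <setjmp.h>
    #include <signal.h>
    #include <stdio.h>
    #include <string.h>

    static sigjmp_buf probe_env;

    static void probe_handler(int sig) {
        (void)sig;
        siglongjmp(probe_env, 1);
    }

    /* returns 1 and stores the byte if readable, 0 otherwise */
    static int try_read_byte(const volatile unsigned char *addr,
                             unsigned char *out) {
        struct sigaction sa, old_segv, old_bus;
        volatile int ok = 0;

        memset(&sa, 0, sizeof(sa));
        sa.sa_handler = probe_handler;
        sigaction(SIGSEGV, &sa, &old_segv);
        sigaction(SIGBUS, &sa, &old_bus);

        if (sigsetjmp(probe_env, 1) == 0) {
            *out = *addr;               /* may fault */
            ok = 1;
        }

        sigaction(SIGSEGV, &old_segv, NULL);   /* restore old handlers */
        sigaction(SIGBUS, &old_bus, NULL);
        return ok;
    }

    int main(void) {
        unsigned char b;
        int x = 42;
        printf("&x readable:   %d\n", try_read_byte((unsigned char *)&x, &b));
        printf("NULL readable: %d\n", try_read_byte(NULL, &b));
        return 0;
    }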
Another approach is to parse the contents of /proc/<pid>/maps to build a map of the memory once, then on each access decide whether the address is inside a mapped region or not.
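A sketch of that check, scanning /proc/self/maps on each query for simplicity (a real implementation would cache the map and refresh it, since mappings change over time):

    /* Each maps line has the form: start-end perms offset dev inode path */
    #include <stdio.h>

    static int addr_is_readable(unsigned long addr) {
        FILE *f = fopen("/proc/self/maps", "r");
        char line[512];
        int readable = 0;

        if (!f) return 0;
        while (fgets(line, sizeof(line), f)) {
            unsigned long start, end;
            char perms[5];
            if (sscanf(line, "%lx-%lx %4s", &start, &end, perms) != 3)
                continue;
            if (addr >= start && addr < end && perms[0] == 'r') {
                readable = 1;
                break;
            }
        }
        fclose(f);
        return readable;
    }

    int main(void) {
        int x = 0;
        printf("stack address:  %d\n", addr_is_readable((unsigned long)&x));
        printf("page at 0x1000: %d\n", addr_is_readable(0x1000));
        return 0;
    }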
If it were me, I'd try to find something that already does this and re-implement it, or copy the code directly if the licence met my needs. It's painful to write from scratch, and it's nice to have something that can give stack traces with symbol resolution.
