Hook for pthread_create

Is there a way (in glibc 2.5 and newer) to define a hook for pthread_create?
There are a lot of binary applications, and I want to write a dynamic library to be loaded via LD_PRELOAD.
I can add a hook on entry to main (__attribute__((constructor))), but how can I force my code to be executed in every thread just before the thread's function runs?

This answer shows how to interpose pthread_create. (Beware: it will work correctly in 64-bit, but not 32-bit programs.)
Once you interpose pthread_create, you can make it call your own function, which will do whatever you want, and then call the original function the user passed to pthread_create.
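A minimal sketch of that approach, assuming a glibc-based Linux system and a dynamically linked program. The names thread_hook, start_info, trampoline and echo_arg are hypothetical and chosen only for illustration; the real pthread_create is looked up with dlsym(RTLD_NEXT, ...). The 32-bit caveat mentioned above still applies.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* glibc exposes RTLD_NEXT only with this */
#endif
#include <dlfcn.h>
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

#ifndef RTLD_NEXT               /* _GNU_SOURCE took effect too late: use glibc's value */
#define RTLD_NEXT ((void *) -1l)
#endif

/* Hypothetical per-thread hook: put here whatever must run in every new
 * thread before the user's start routine. This one only counts calls. */
static atomic_int hook_calls;
static void thread_hook(void)
{
    atomic_fetch_add(&hook_calls, 1);
}

/* Carries the caller's routine and argument into the new thread. */
struct start_info {
    void *(*fn)(void *);
    void *arg;
};

/* Runs in the new thread: hook first, then the original routine. */
static void *trampoline(void *p)
{
    struct start_info info = *(struct start_info *)p;
    free(p);
    thread_hook();
    return info.fn(info.arg);
}

typedef int (*pthread_create_fn)(pthread_t *, const pthread_attr_t *,
                                 void *(*)(void *), void *);

/* Same name and signature as the real function, so callers bind to this one. */
int pthread_create(pthread_t *thread, const pthread_attr_t *attr,
                   void *(*start_routine)(void *), void *arg)
{
    static pthread_create_fn real_create;
    if (!real_create)           /* next definition in load order, normally libc's */
        real_create = (pthread_create_fn)dlsym(RTLD_NEXT, "pthread_create");
    if (!real_create)
        return ENOSYS;

    struct start_info *info = malloc(sizeof *info);
    if (!info)
        return EAGAIN;
    info->fn = start_routine;
    info->arg = arg;

    int rc = real_create(thread, attr, trampoline, info);
    if (rc != 0)
        free(info);             /* the thread never started, so free it here */
    return rc;
}

/* Trivial start routine, used only to exercise the wrapper. */
static void *echo_arg(void *p)
{
    return p;
}
```

Built as a shared object (for example with gcc -shared -fPIC, plus -ldl -lpthread on older glibc versions) and listed in LD_PRELOAD, the wrapper is found before the libc definition. The heap-allocated start_info is needed because pthread_create returns before the new thread has necessarily read its argument.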

Related

How to invoke any kernel function?

I know that Kprobes can be used to probe any kernel function. But after going through its documents I realise that it is mostly a kind of passive entity. It simply puts a probe in the middle of an execution sequence.
But what if I want to invoke any kernel function directly, without bothering about the execution sequence?
How can I achieve that?
Updated:
Note: I want to invoke any kernel function inside my kernel module and not from any user space application.
Kernel functions cannot be simply invoked from applications that live in user space. System calls are the only functions in user space that can request kernel services.
To call kernel functions directly, if you are interested in kernel programming, you must implement a kernel module. This is a starting point.
EDIT
As you have specified that you want to call kernel functions from within a module, then there is no problem at all. Just follow the link I posted above for the documentation.
"what if I want to invoke any kernel function directly"
Not every kernel function can be called directly.
Keep the following points in mind when calling a kernel function from your module:
A kernel function defined in a different module can be used only if it is exported with the EXPORT_SYMBOL family of macros.
static functions cannot be called directly from outside the file that defines them.
Example
Function definition (i2c_smbus_read_byte_data)
http://lxr.free-electrons.com/source/drivers/i2c/i2c-core.c#L2689
Used here
http://lxr.free-electrons.com/source/drivers/i2c/i2c-core.c#L350

How can I get a running thread's start address on linux?

Problem Statement
I'm trying to get the address of a running thread's start_routine as passed in the pthread_create() call.
Research so far
It is apparently not in /proc/[tid]/stat or /proc/[tid]/status.
I found that start_routine is a member of struct pthread and gets set by pthread_create.[1]
If I knew the address of this struct, I could read the start_routine address.
I also found td_thr_get_info defined in the debugging library thread_db.h.[2]
It fills a struct with information about the thread, including the start function.[3] But it needs a struct td_thragent as an argument and I don't know how to create it properly.
Links
[1] http://fxr.watson.org/fxr/source/nptl/pthread_create.c?v=GLIBC27;im=excerpts#L455
[2] http://fxr.watson.org/fxr/source/nptl_db/td_thr_get_info.c?v=GLIBC27#L27
[3] See comment, because I'm not allowed to post more than 2 links.
You probably can't, and I could even imagine a very wild scenario where it could not exist at the moment you are querying it.
Let's suppose that the initial thread start routine void *foo_start(void *) is in some dlopen-ed dynamic shared library libfoo.so.
Let's imagine that foo_start makes a tail call to bar, and that bar dlclose-s libfoo.so and later calls one of your routines that queries the start address. By then it is a wild address in a defunct segment (which has been munmap-ed by the dlclose called from bar)!
So, even if you hack your libc to retrieve the start routine of a thread, the result does not make much sense. BTW, you could look into musl libc; its src/thread/pthread_create.c file is quite readable.
NB: on some occasions, recent GCC (e.g. 4.8 or 4.9), when asked to optimize a lot (e.g. -O3), is able to generate tail calls from C code.

How can I inject a background thread to an application with LD_PRELOAD?

I know that LD_PRELOAD can be used to intercept calls to functions in shared libraries (if the app is not statically linked). However, I do not know how it can be used to add additional features or background threads to applications.
For example, I think Berkeley Lab Checkpoint/Restart (BLCR) uses this method to add a background thread to an application that may be checkpointed later on.
So, now the question is: how can a thread be injected into a compiled app using LD_PRELOAD without knowing beforehand which shared library functions the app calls?
It's a simple enough matter: you can implement the _init function, that is, void _init(void) {}, and call pthread_create in it (assuming you linked your library with -lpthread). Note that defining _init yourself requires linking with -nostartfiles, because the default startup files already provide one. You should link your library against the other -l dependencies you need. GCC also lets you mark another function as the entry point with __attribute__((constructor)) instead of using the hardcoded _init(). Either way, your entry point is called by the dynamic loader (ld.so).
When your library is injected, it gets injected before all others, but its own dependencies do get resolved as well, so whatever calls you make are generally ok (one notable exception being if you intercept functions you later call, for which you'll need to use the dlfcn APIs to do so safely).
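A minimal sketch of the constructor variant described above. The names background_main, preload_init and background_started are hypothetical, and the thread body is a placeholder that only sets a flag and then sleeps in a loop.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <unistd.h>

/* Set by the background thread once it is running (illustration only). */
static atomic_int background_started;

/* Body of the injected thread. A real library would do its periodic
 * work here; this placeholder just sleeps. */
static void *background_main(void *unused)
{
    (void)unused;
    atomic_store(&background_started, 1);
    for (;;)
        sleep(1);
}

/* Called by the dynamic loader when the library is loaded, before the
 * application's main() runs. */
__attribute__((constructor))
static void preload_init(void)
{
    pthread_t tid;
    if (pthread_create(&tid, NULL, background_main, NULL) == 0)
        pthread_detach(tid);    /* nobody will join this thread */
}
```

Assuming the file is called inject.c, it could be built with gcc -shared -fPIC -o libinject.so inject.c -lpthread and loaded with LD_PRELOAD=./libinject.so ./app. The thread is detached because the host application does not know about it and will never join it.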

Replace system call in linux kernel 3

I am interested in replacing a system call with a custom one that I will implement in Linux kernel 3.
I read that the sys call table is no longer exposed.
Any ideas?
Any reference to an example like this one, http://www.linuxtopia.org/online_books/linux_kernel/linux_kernel_module_programming_2.6/x978.html, but for kernel 3 would be appreciated :)
Thank you!
I would recommend using kprobes for this kind of job. You can easily break on any kernel address (or symbol) and alter the execution path, all at runtime, from a kernel module if you need to :)
Kprobes work by dynamically replacing an instruction (e.g. the first instruction of your syscall entry) with a breakpoint (e.g. int3 on x86). Inside the do_int3 handler, a notifier notifies kprobes, which in turn passes execution to your registered function, from which point you can do almost anything.
Very good documentation is available in Documentation/kprobes.txt, along with a small example in samples/kprobes/kprobe_example.c (this example breaks on do_fork to log each fork on the system). Kprobes has a very simple API and is very portable nowadays.
Warning: If you need to alter the execution path, make sure your kprobes are not optimized (i.e. the instruction you break on is replaced by a jmp to your handler instead of an int3), otherwise you won't be able to alter the execution easily (after the ret of your function, the syscall function will still be executed as usual). If you are only interested in tracing, this is fine and you can safely ignore the issue.
Writing an LKM would be a better option. What do you mean by replace? Do you want to add a new one?

LD_PRELOAD not working for printf

I am using LD_PRELOAD to capture the write() system call on Linux.
I can do this successfully for direct calls to write().
But when I call printf(), it does not work. Looking at printf with strace, I found that printf ends up calling the write() system call to write to the console, but my write() wrapper is not called before the actual write() system call is made.
Does anybody have any idea why this is happening?
Function calls made from one library to another, or from the executable to a dynamically loaded library, go through the PLT (Procedure Linkage Table) and can be redirected with LD_PRELOAD. However, function calls within a library can be resolved when the library itself is built and do not go through the PLT, so they cannot be redirected by LD_PRELOAD. Since printf and write are both compiled into libc.so.6, the call to write from printf never goes through the PLT to look for a possible redirection, but when you call write directly from your application (or from another shared library) it does.
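A minimal sketch of such a write() wrapper, assuming glibc and dynamic linking. The counter write_calls is a hypothetical name added only to make the effect observable: it counts calls that reach the wrapper, which, per the explanation above, includes direct calls from the application but not the write that printf issues from inside libc.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* glibc exposes RTLD_NEXT only with this */
#endif
#include <dlfcn.h>
#include <stdatomic.h>
#include <unistd.h>

#ifndef RTLD_NEXT               /* _GNU_SOURCE took effect too late: use glibc's value */
#define RTLD_NEXT ((void *) -1l)
#endif

/* Number of calls that went through this wrapper (illustration only). */
static atomic_int write_calls;

typedef ssize_t (*write_fn)(int, const void *, size_t);

/* Same name and signature as libc's write(), so PLT-resolved calls from the
 * executable and from other shared libraries land here first. */
ssize_t write(int fd, const void *buf, size_t count)
{
    static write_fn real_write;
    if (!real_write)            /* next definition in load order, normally libc's */
        real_write = (write_fn)dlsym(RTLD_NEXT, "write");
    if (!real_write)
        return -1;

    atomic_fetch_add(&write_calls, 1);
    return real_write(fd, buf, count);
}
```

To observe output produced through printf as well, the stdio entry points themselves (printf, puts, fwrite and so on) would have to be wrapped in the same way, since their internal call to write is not visible to the wrapper.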

Resources