How can I get a running thread's start address on Linux?

Problem Statement
I'm trying to get the address of a running thread's start_routine as passed in the pthread_create() call.
Research so far
It is apparently not in /proc/[tid]/stat or /proc/[tid]/status.
I found that start_routine is a member of struct pthread and gets set by pthread_create.[1]
If I knew the address of this struct, I could read the start_routine address.
I also found td_thr_get_info defined in the debugging library thread_db.h.[2]
It fills a struct with information about the thread, including the start function.[3] But it needs a struct td_thragent as an argument and I don't know how to create it properly.
Links
[1] http://fxr.watson.org/fxr/source/nptl/pthread_create.c?v=GLIBC27;im=excerpts#L455
[2] http://fxr.watson.org/fxr/source/nptl_db/td_thr_get_info.c?v=GLIBC27#L27
[3] See comment, because I'm not allowed to post more than 2 links.

You probably can't, and I could even imagine a very wild scenario where it could not exist at the moment you are querying it.
Let's suppose that the thread's start routine void *foo_start(void *) lives in some dlopen-ed shared library libfoo.so.
Let's imagine that foo_start makes a tail call to bar (so no frame of foo_start remains on the stack), and that bar dlclose-s libfoo.so and later calls one of your routines that queries the start address. The result is a wild address in a defunct segment (one that was munmap-ed by the dlclose called from bar)!
So, even if you hack your libc to retrieve the start routine of a thread, the result would not make much sense. BTW, you could look into musl libc; its src/thread/pthread_create.c file is quite readable.
NB: recent versions of GCC (e.g. 4.8 or 4.9), when asked to optimize heavily (e.g. -O3), are able to generate tail calls from C code.

Related

Switching Between More than Two Shared Libraries (LD_PRELOAD)

Suppose that there exist three shared libraries A.so, B.so and C.so, each having a function f(). I want to switch between these functions under certain circumstances, determined at runtime. Using the LD_PRELOAD trick, I will execute the program p (i.e., the user of these libraries) as follows:
LD_PRELOAD=A.so:B.so:C.so ./p
f() in A.so will be the default. In that f() instance, I can access f() of B.so using dlsym(), as follows:
void f() // in A.so
{
...
void *f_in_B = dlsym(RTLD_NEXT, "f");
...
}
How can I access the f() instance in C.so?
UPDATE:
Although yugr's answer works in the simple case, it has problems under more general conditions. I will shed some light on the problem by giving more details about the current situation:
I need two different dynamic memory allocators and do not want to deal with the internal implementation details of glibc's malloc() and related functions. Put simply, I have two separate memory areas, each with its own glibc. I use LD_PRELOAD to switch between the allocators based on some runtime condition. I used yugr's answer to load and access the secondary library.
First, I called ptmalloc_init() in the secondary library to initialize the malloc() data structures. I also manage brk() calls at the system call level in such a way that each library has its own large brk() range, which avoids further conflicts.
The problem is that the solution works only at the application/glibc boundary. For example, when I use malloc() in the secondary library, it internally calls functions such as __default_morecore() in the primary library, and these calls cannot be captured by the LD_PRELOAD trick.
Why does this happen? I thought that the internal symbols of a library were bound when the library was compiled, but here they seem to resolve to the symbols in the primary LD_PRELOAD library. If the resolution is done by the linker, why are these seemingly internal symbols not captured by the LD_PRELOAD trick?
How can I fix the problem? Should I rename all the exported functions in the secondary library? Or is the only feasible approach to use a single library and delve into the implementation details?
You can dlopen each of the three libraries at startup to obtain their handles, and then call dlsym(h, "f") on each handle to get that library's particular implementation of f. If you don't know the library names in advance, you could use dl_iterate_phdr to obtain them (see this code for example).
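The handle-based lookup described above could be sketched roughly as follows. This is only an illustration: lookup_in, f_from and f_fn are hypothetical names, the library paths are whatever you pass in (A.so, B.so and C.so in the question), and error handling is kept to a minimum.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: look up `symbol` through a handle for `lib_path`
 * instead of through the global, LD_PRELOAD-ordered search scope. */
static void *lookup_in(const char *lib_path, const char *symbol)
{
    assert(lib_path != NULL && symbol != NULL);

    /* If the library is already loaded (e.g. via LD_PRELOAD), dlopen
     * returns a handle to the existing copy rather than loading it again. */
    void *handle = dlopen(lib_path, RTLD_NOW);
    if (handle == NULL) {
        fprintf(stderr, "dlopen(%s) failed: %s\n", lib_path, dlerror());
        return NULL;
    }

    dlerror();                          /* clear any stale error state */
    void *addr = dlsym(handle, symbol);
    const char *err = dlerror();
    if (err != NULL) {
        fprintf(stderr, "dlsym(%s) failed: %s\n", symbol, err);
        return NULL;
    }

    /* The handle is deliberately left open: the returned address is only
     * valid while the library stays loaded. */
    return addr;
}

/* Assumed signature of f(), taken from the question: void f(void). */
typedef void (*f_fn)(void);

/* Resolve one library's own f(); returns NULL if it cannot be found. */
static f_fn f_from(const char *lib_path)
{
    return (f_fn)lookup_in(lib_path, "f");
}
```

With this, f_from("C.so") would give the implementation in C.so no matter where that library sits in the LD_PRELOAD order. Note that dlsym on a handle also searches that library's dependencies, so the symbol found is not guaranteed to be defined in the named file itself. On older glibc versions the program has to be linked with -ldl.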

RPG program error: Error MCH3601 was detected in file

We have been facing a very strange issue with one of our RPGLE programs, which bombs intermittently with the error in the subject line.
This happens specifically at a line where a write operation is performed to a subfile record format. I have debugged and checked all the values assigned to variables during runtime and could not find any issues. As per the https://www.ibm.com/support/pages/node/644069 IBM page, I can only assume that this might be related to the parameter definitions of the programs called within the RPG. But I have checked the parameters of each and every prototyped program call and everything seems to be in sync.
Can someone please point me in the right direction to find the root cause of this problem?
"But I have checked the parameters of each and every prototyped program call"
Assuming you're using prototypes properly, i.e. there is one prototype defined in a separate source member and it is /INCLUDEd into BOTH the caller and the callee...
Then prototyped calls aren't the problem, as long as you're properly handling any *OMIT and *NOPASS parameters.
Look at any old-style CALL or CALLB calls and any place you're not using prototypes properly, meaning there's an explicit PR coded in both caller and callee.
Note that it's not just old-style calls made by the program that bombs; it's calls made anywhere down the call chain.
And if the program is repeatedly called with LR=*OFF or without reclaiming resources, then it could be any old style calls up the call chain also.
Lastly, old style calls include any made by CL or CLLE programs.
Good luck!

identifier is undefined in EXE if new'd in DLL and exported

I am newing a heap object in a regular DLL. I export it properly with __declspec(dllexport) and import it in the EXE with __declspec(dllimport) linkage. As long as I am in the DLL the object is defined properly, but when executing/debugging in the EXE, the object is undefined. What am I missing? Name mangling? Should extern "C" demangling help?
Further explanation:
@Colin Robertson My problem stems from the prototype using extension DLLs whose code is integrated with the EXE upon compile. I knew my app would need to access objects directly in the DLL from the EXE which is okay in Windows extension DLLs because of the code integration. But the prototype turned out to be a memory hog as my app creates many DLLs during execution, each of which got integrated, dynamically I might add, into the running executable. Therefore, the production code had to use the regular DLL which has automatic reference counting (dllmain etc) as long as it isn’t statically linked. Which brings me to my current problem of how do I access the DLL object from within the EXE?
As such, the discussion in your links regarding the passing of allocator is not relevant. Point 60 (60. Avoid allocating and deallocating memory in different modules.) in Sutter and Alexandrescu's book does not apply since the EXE is not responsible for object lifecycle. Also, since I am using shared libraries, the following is true: “Specifically, if you use the Dll runtime option, then a single dll - msvcrtxx.dll - manages a single freestore that is shared between all dll's, and the exe, that are linked against that dll.” (see StackOverflow’s “Who allocates heap to my DLL?” whose thread was closed by poke, Linus Kleen, mauris, Cody Gray, miku for some reason). My code does not mix the allocation/deallocation responsibilities of the DLL with the usage requests of the EXE.
I think the problem lies in the fact that in a regular DLL, using a pointer to an object in another module’s allocated heap running in a different thread with its own message pump is disastrous and is censured by the compiler. This is as it should be.
But my problem is still legitimate.
I see two ways Windows solves such a situation. One is the Send/PostMessage call which posts messages on other thread queues and the other is COM marshalling. For the former I would have a problem with the return value. Since what I am doing is basically a remote procedure call, my EXE wants results back from the DLL, and SendMessage only returns an LRESULT. As for the latter, this is exactly what COM does when it marshals a pointer in an Apartment threaded app (see “Single-Threaded Apartments” in MSDN). COM is designed to let you pass pointers between threads, or even processes. There might be a third C++ way which is to use the Pimpl idiom (see http://www.c2.com/cgi/wiki?PimplIdiom), but this method is a lot more work and has drawbacks. Thanks to MVP Scott McPhillips for this suggestion.
Does anyone have advice or experience on which way to proceed?
Don't do that. This is item 60 in Sutter and Alexandrescu's C++ Coding Standards book, which I highly recommend. Separate modules may use their own versions of the run time library, including the basic allocation routines. Things allocated on one module's heap may be inaccessible from another module, or have different conventions for allocating and freeing them. The name mangling conventions can be different, but that's the least of your worries. Here's another StackOverflow question that has more detailed answers for why this is a bad idea, and what to do instead: Is it bad practice to allocate memory in a DLL and give a pointer to it to a client app?

Replace system call in linux kernel 3

I am interested in replacing a system call with a custom one that I will implement in Linux kernel 3.
I read that the sys call table is no longer exposed.
Any ideas?
Any reference to an example like http://www.linuxtopia.org/online_books/linux_kernel/linux_kernel_module_programming_2.6/x978.html but for kernel 3 will be appreciated :)
Thank you!
I would recommend using kprobes for this kind of job, you can easily break on any kernel address (or symbol...) and alter the execution path, all of this at runtime, with a kernel module if you need to :)
Kprobes work by dynamically replacing an instruction (e.g. first instruction of your syscall entry) by a break (e.g. int3 on x86). Inside the do_int3 handler, a notifier notifies kprobes, which in turn passes the execution to your registered function, from which point you can do almost anything.
Very good documentation is given in Documentation/kprobes.txt, as well as a tiny example in samples/kprobes/kprobe_example.c (in this example they break on do_fork to log each fork on the system). It has a very simple API and is very portable nowadays.
Warning: If you need to alter the execution path, make sure your kprobes are not optimized (i.e. a jmp instruction to your handler replaces the instruction you break onto instead of an int3), otherwise you won't be able to really alter the execution easily (after the ret of your function, the syscall function will still be executed as usual). If you are only interested in tracing, then this is fine and you can safely ignore this issue.
Writing an LKM would be a better option. What do you mean by replace? Do you want to add a new one?

Can I get the id of the thread which holds a CriticalSection?

I want to write a few asserts around a complicated multithreaded piece of code.
Is there some way to do a
assert(GetCurrentThreadId() == ThreadOfCriticalSection(sec));
If you want to do this properly, I think you have to use a wrapper object around your critical sections which will track which thread (if any) owns each CS in debug builds.
i.e. Rather than call EnterCriticalSection directly, you'd call a method on your wrapper which did the EnterCriticalSection and then, when it succeeded, stored GetCurrentThreadId in a DWORD which the asserts would check. Another method would zero that thread ID DWORD before calling LeaveCriticalSection.
(In release builds, the wrapper would probably omit the extra stuff and just call Enter/LeaveCriticalSection.)
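A minimal sketch of such a wrapper in C might look like the following. All names here (tracked_cs, raw_lock, self_id and so on) are hypothetical. The non-Windows branch is purely an assumption added so that the sketch also compiles elsewhere: it stands in a pthread mutex for the critical section and uses the address of a thread-local variable as the thread id. Recursive entry is not tracked: EnterCriticalSection lets the owner re-enter, but this sketch would clear the owner on the inner leave.

```c
#include <assert.h>
#include <stdint.h>

#ifdef _WIN32
#include <windows.h>
typedef CRITICAL_SECTION raw_lock;
typedef DWORD owner_id;
static void raw_init(raw_lock *l)   { InitializeCriticalSection(l); }
static void raw_enter(raw_lock *l)  { EnterCriticalSection(l); }
static void raw_leave(raw_lock *l)  { LeaveCriticalSection(l); }
static void raw_delete(raw_lock *l) { DeleteCriticalSection(l); }
static owner_id self_id(void)       { return GetCurrentThreadId(); }
#else
/* Assumption: stand-in so the sketch also builds off Windows. */
#include <pthread.h>
typedef pthread_mutex_t raw_lock;
typedef uintptr_t owner_id;
static void raw_init(raw_lock *l)   { pthread_mutex_init(l, NULL); }
static void raw_enter(raw_lock *l)  { pthread_mutex_lock(l); }
static void raw_leave(raw_lock *l)  { pthread_mutex_unlock(l); }
static void raw_delete(raw_lock *l) { pthread_mutex_destroy(l); }
/* The address of a thread-local object is non-zero and unique per live thread. */
static _Thread_local char tls_marker;
static owner_id self_id(void)       { return (owner_id)&tls_marker; }
#endif

typedef struct {
    raw_lock lock;
    volatile owner_id owner;    /* 0 means "not held" */
} tracked_cs;

static void tracked_cs_init(tracked_cs *cs)
{
    raw_init(&cs->lock);
    cs->owner = 0;
}

static void tracked_cs_delete(tracked_cs *cs)
{
    raw_delete(&cs->lock);
}

static void tracked_cs_enter(tracked_cs *cs)
{
    raw_enter(&cs->lock);
    cs->owner = self_id();      /* recorded only once the lock is ours */
}

static void tracked_cs_leave(tracked_cs *cs)
{
    assert(cs->owner == self_id());
    cs->owner = 0;              /* cleared before the lock is released */
    raw_leave(&cs->lock);
}

/* Non-zero if the calling thread currently holds cs. A thread that does
 * not hold the lock may read a stale value, but never its own id. */
static int tracked_cs_held_by_me(const tracked_cs *cs)
{
    return cs->owner == self_id();
}
```

The assert from the question would then read assert(tracked_cs_held_by_me(&sec)); with sec declared as a tracked_cs. A release build could drop the owner field and the checks and forward straight to the raw calls.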
As Casablanca points out, the owner thread ID is within the current CRITICAL_SECTION structure, so using a wrapper like I suggest would be storing redundant information. But, as Casablanca also points out, the CRITICAL_SECTION structure is not part of any API contract and could change. (In fact, it has changed in past Windows versions.)
Knowing the internal structure is useful for debugging but should not be used in production code.
So which method you use depends on how "proper" you want your solution to be. If you just want some temporary asserts for tracking down problems today, on the current version of Windows, then using the CRITICAL_SECTION fields directly seems reasonable to me. Just don't expect those asserts to be valid forever. If you want something that will last longer, use a wrapper.
(Another advantage of using a wrapper is that you'll get RAII. i.e. The wrapper's constructor and destructor will take care of the InitializeCriticalSection and DeleteCriticalSection calls so you no longer have to worry about them. Speaking of which, I find it extremely useful to have a helper object which enters a CS on construction and then automatically leaves it on destruction. No more critical sections accidentally left locked because a function had an early return hidden in the middle of it...)
As far as I know, there is no documented way to get this information. If you look at the headers, the CRITICAL_SECTION structure contains a thread handle, but I wouldn't rely on such information because internal structures could change without notice. A better way would be to maintain this information yourself whenever a thread enters/exits the critical section.
Your requirement doesn't make sense. If your current thread is not the thread which is in the critical section, then the code within the current thread won't be running, it'll be blocked when trying to lock the critical section.
If your thread is actually inside the critical section, then your assertion will always be true. If it's not, your assertion will always be false!
So what I mean is, assuming you're able to track which thread is in the critical section, if you place your assertion inside the critical section code, it'll always be true. If you place it outside, it'll always be false.
