I am writing a memory leak tracker and want it to print its statistics in the tracked program when the library is unloaded.
The library will be used with the LD_PRELOAD trick. Our programs have static variables that are destroyed very late, so I want to be sure that the statistics are printed after all the static variables in the program have been destroyed, to avoid false alarms.
Will libraries loaded with LD_PRELOAD be unloaded last (later than the hooked program)?
Will libraries loaded with LD_PRELOAD be unloaded last
You are assuming that the LD_PRELOADed library will be unloaded, but there is no guarantee that this will happen at all.
Here is a test case:
// main.c
#include <unistd.h>
int main(int argc, char *argv[]) {
if (argc > 1) _exit(0);
return 0;
}
// preload.c
#include <stdio.h>
__attribute__((constructor)) void init() { fprintf(stderr, "Init\n"); }
__attribute__((destructor)) void fini() { fprintf(stderr, "Fini\n"); }
Build them like so:
gcc main.c && gcc -fPIC -shared -o preload.so preload.c
Now observe the order of initialization and finalization:
LD_DEBUG=files LD_PRELOAD=./preload.so ./a.out |& grep 'calling .*ini'
18310: calling init: /lib/x86_64-linux-gnu/libc.so.6
18310: calling init: ./preload.so
18310: calling fini: ./a.out [0]
18310: calling fini: ./preload.so [0]
LD_DEBUG=files LD_PRELOAD=./preload.so ./a.out 1 |& grep 'calling .*ini'
18312: calling init: /lib/x86_64-linux-gnu/libc.so.6
18312: calling init: ./preload.so
Notice that in the case of _exit() the library is not finalized at all.
Also notice that any shared library will have a dependency on libc.so.6, and thus will be finalized before libc.so.6.
But absent signal or _exit() termination, yes: the libraries will be finalized in reverse order of their initialization (loading and unloading aren't really the right terms to use here), and that does mean that LD_PRELOADed libraries will be finalized last.
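A minimal sketch of how the tracker's report could hang off a destructor, in line with the finalization order described above. The names tracker_note_alloc, tracker_note_free, tracker_outstanding and tracker_report are hypothetical and stand in for whatever bookkeeping the real tracker does. The counters are not thread-safe. As the test case shows, nothing is printed if the program ends via _exit() or a fatal signal.

```c
#include <stdio.h>

/* Hypothetical bookkeeping; a real tracker would record far more. */
static unsigned long alloc_count;
static unsigned long free_count;

void tracker_note_alloc(void) { alloc_count++; }
void tracker_note_free(void)  { free_count++; }

unsigned long tracker_outstanding(void)
{
    return alloc_count - free_count;
}

/* Runs when the library is finalized, i.e. on normal process exit only. */
__attribute__((destructor))
static void tracker_report(void)
{
    fprintf(stderr, "leak tracker: %lu allocations, %lu frees, %lu outstanding\n",
            alloc_count, free_count, tracker_outstanding());
}
```

Built with -fPIC -shared and loaded through LD_PRELOAD, the report would be printed at the point where "calling fini: ./preload.so" appears in the LD_DEBUG output above.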
There is a large number of questions on SO about how to execute a library or dynamically load an executable. As far as I can tell, all the answers come down to: compile your executable as position-independent code and load it with dlopen. This worked great --- and still works great on macOS --- until a recent change in glibc, which explicitly disabled dlopening PIEs. This change is now in the current version of glibc (2.30) on ArchLinux, for example, and trying to dlopen a position-independent executable gives an error: "cannot dynamically load position-independent executable".
It's difficult to guess what prompted such a radical change that breaks so much code and useful use cases. (The explanations on Patchwork and Bugzilla don't make much sense to me.) But there is now a question: what to do if you want to create an executable that's also a dynamic library, or vice versa?
A solution was linked from one of the comments. Reproducing it here for posterity:
#include <stdio.h>
#include <unistd.h>
const char service_interp[] __attribute__((section(".interp"))) = "/lib/ld-linux-x86-64.so.2";
extern "C" {
void lib_entry(void)
{
printf("Entry point of the service library\n");
_exit(0);
}
}
Compiling with g++ -shared test-no-pie.cpp -o test-no-pie -Wl,-e,lib_entry produces a shared object (dynamic library) that can also be executed on Linux.
I have two questions:
What if I want to pass command-line arguments? How can this solution be modified so that it accepts argc, argv?
Are there other alternatives?
It's difficult to guess what prompted such a radical change
Not really: it never worked correctly.
that breaks so much code
That code was broken already in subtle ways. Now you get a clear indication that it will not work.
Are there other alternatives?
Don't do that?
What problem does dlopening an executable solve?
If it's a real problem, open a GLIBC bugzilla feature request, explaining that problem and requesting a supported mechanism to achieve desired result.
Update:
at least say why "it never worked correctly". Is it some triviality like potentially clashing globals between the executables, or something real?
Thread-local variables are one example of something that doesn't work correctly. Whether you think they are "real" or not I have no idea.
Here is the code:
// foo.c
#include <stdio.h>
__thread int var;
__attribute__((constructor))
static void init()
{
var = 42;
printf("foo.c init: %d %p\n", var, &var);
}
int bar() {
printf("foo.c bar: %d %p\n", var, &var);
return var;
}
int main()
{
printf("foo.c main: %d %p bar()=%d\n", var, &var, bar());
return 0;
}
gcc -g foo.c -o foo -Wl,-E -fpie -pie && ./foo
foo.c init: 42 0x7fb5dfd7d4fc
foo.c bar: 42 0x7fb5dfd7d4fc
foo.c main: 42 0x7fb5dfd7d4fc bar()=42
// main.c
// Error checking omitted for brevity
#include <dlfcn.h>
#include <stdio.h>
int main()
{
void *h1 = dlopen("./foo", RTLD_LOCAL|RTLD_LAZY);
int (*bar)(void) = dlsym(h1, "bar");
printf("main.c: %d\n", bar());
return 0;
}
gcc -g main.c -ldl && ./a.out
foo.c init: 42 0x7fb7305da73c
foo.c bar: 0 0x7fb7305da73c <<< what?
main.c: 0 <<< what?
This is using GNU C Library (Debian GLIBC 2.28-10) stable release version 2.28.
Bottom line: this was never designed to work. You just happened not to step on any of the many land mines, so you thought it was working, when in fact you were exercising undefined behavior.
Please see this answer:
https://stackoverflow.com/a/68339111/14760867
The argc, argv question is not answered there, but when I found I needed them, I hacked something together to parse /proc/self/cmdline at runtime for use in pam_cap.so.
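For illustration, here is a rough sketch of the /proc/self/cmdline approach mentioned above. It is not the pam_cap.so code. The function names read_self_cmdline and split_cmdline are made up for this example. It relies only on the documented layout of that file: the arguments stored back to back, each terminated by a NUL byte.

```c
#include <stddef.h>
#include <stdio.h>

/* Read /proc/self/cmdline into buf (at most size - 1 bytes) and append a
   final NUL. Returns the number of bytes read, or -1 on failure. */
long read_self_cmdline(char *buf, size_t size)
{
    if (size == 0)
        return -1;
    FILE *f = fopen("/proc/self/cmdline", "rb");
    if (f == NULL)
        return -1;
    size_t n = fread(buf, 1, size - 1, f);
    fclose(f);
    buf[n] = '\0';
    return (long)n;
}

/* Split a buffer of NUL-terminated strings into argv-style pointers.
   Every string inside buf[0..len) must be NUL-terminated, which
   read_self_cmdline guarantees. Returns the argument count stored. */
int split_cmdline(char *buf, size_t len, char **argv, int max_args)
{
    int argc = 0;
    size_t i = 0;
    while (i < len && argc < max_args) {
        argv[argc++] = &buf[i];
        while (i < len && buf[i] != '\0')
            i++;
        i++; /* step over the terminating NUL */
    }
    return argc;
}
```

An entry point such as lib_entry could call read_self_cmdline on a static buffer and then split_cmdline to obtain an argc/argv pair before doing its real work.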
I have just switched to Ubuntu and am new to gdb and g++. Please forgive me if my question is silly.
This is from Richard Stevens' Advanced Linux Programming. Three files were created in a folder named reciprocal:
main.c:
#include <stdio.h>
#include "reciprocal.hpp"
int main (int argc, char **argv)
{
int i;
i = atoi (argv[1]);
printf ("The reciprocal of %d is %g\n", i, reciprocal (i));
return 0;
}
reciprocal.cpp:
#include <cassert>
#include "reciprocal.hpp"
double reciprocal (int i) {
// I should be non-zero.
assert (i != 0);
return 1.0/i;
}
reciprocal.hpp:
#ifdef __cplusplus
extern "C" {
#endif
extern double reciprocal (int i);
#ifdef __cplusplus
}
#endif
After compiling, I ran gdb reciprocal and then run at the (gdb) prompt. I was expecting something like the output in the book:
Starting program: reciprocal
Program received signal SIGSEGV, Segmentation fault.
__strtol_internal (nptr=0x0, endptr=0x0, base=10, group=0)
at strtol.c:287
287 strtol.c: No such file or directory.
(gdb)
But I got :
Starting program: /home/trafalgar/Desktop/reciprocal/reciprocal
warning: no loadable sections found in added symbol-file system-supplied DSO at 0x7ffff7ffa000
Program received signal SIGSEGV , Segmentation fault.
0x00007ffff7a56ad4 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
What might be happening differently? Is this a version problem or something else?
Here is the Makefile
reciprocal: main.o reciprocal.o
g++ $(CFLAGS) -o reciprocal main.o reciprocal.o
main.o: main.c reciprocal.hpp
gcc $(CFLAGS) -c main.c
reciprocal.o: reciprocal.cpp reciprocal.hpp
g++ $(CFLAGS) -c reciprocal.cpp
clean:
rm -f *.o reciprocal
How did you compile the program?
Use g++ -g programname.c
Also, when you run
gdb reciprocal
note whether there is a message like
loaded symbols from ...
or
could not find symbols
If you get output similar to the second message, then the problem is that you did not compile with the -g flag.
You should compile with all warnings and debug info, i.e.
gcc -Wall -g -c main.c
g++ -Wall -g -c reciprocal.cpp
then link with
g++ -g main.o reciprocal.o -o reciprocal
So add
CFLAGS= -Wall -g
in your Makefile. See also this.
Then run the debugger with
gdb reciprocal
then set a program argument with the set args 12 command at the (gdb) prompt,
and finally start the debugged program with run at the (gdb) prompt.
Of course, if you don't have any program arguments, argc is 1 and argv[1] is NULL, which you should not pass to atoi(3).
The debugger works quite well. The bug is in your code. You should handle correctly the case when argc is 1 and argv[1] is NULL.
If you encounter a segmentation fault inside a C library function, use the bt or backtrace gdb command to understand how you get there.
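The argc check described above could look roughly like the following. This is only a sketch: parse_int_arg is a hypothetical helper, not part of the book's code. It uses strtol instead of atoi so that non-numeric input can be rejected too.

```c
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Parse argv[1] as a decimal int. Returns 0 and stores the value in *out
   on success; returns -1 if the argument is missing or not a valid int. */
int parse_int_arg(int argc, char **argv, int *out)
{
    if (argc < 2 || argv[1] == NULL)
        return -1;                      /* no argument supplied */

    char *end = NULL;
    errno = 0;
    long value = strtol(argv[1], &end, 10);

    if (errno != 0 || end == argv[1] || *end != '\0')
        return -1;                      /* overflow, empty or trailing junk */
    if (value < INT_MIN || value > INT_MAX)
        return -1;                      /* does not fit in an int */

    *out = (int)value;
    return 0;
}
```

In main, a failed parse would print a usage message and return a non-zero status instead of calling reciprocal.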
Everything is working properly, with expected output.
Compare:
1. Starting program: reciprocal
2. Program received signal SIGSEGV, Segmentation fault.
3. __strtol_internal (nptr=0x0, endptr=0x0, base=10, group=0)
4. at strtol.c:287
5. 287 strtol.c: No such file or directory.
A. Starting program: /home/trafalgar/Desktop/reciprocal/reciprocal
B. warning: no loadable sections found in added symbol-file system-supplied DSO at 0x7ffff7ffa000
C. Program received signal SIGSEGV , Segmentation fault.
D. 0x00007ffff7a56ad4 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
Lines 1-5 are your expected output (from Mr Stevens' text), lines A-D are your actual output.
Lines 1 and A are essentially identical: both give the filename of the executable, (1) as a relative path and (A) as a full path. No worries.
Line B... this is NORMAL. gdb is warning about the system-supplied DSO (a small library the kernel maps into every process), NOT YOUR CODE, and the message is harmless.
Line C... Same as (2), easy enough.
Line D... Well, since we don't have debug info for the library functions, it can only point out where the error was as best it can: libc.so.6 (standard library functions, of which strtol is one such)
Essentially, line D is similar to lines 3-5. Without the debug information installed/available, you're not going to get much more information than this.
But everything is working as expected. You're fine.
For help on how to install debug symbols, see here: how-to-use-debug-libraries-on-ubuntu
Fear not! You're doing great. (Technically, the error is on line 6 of your main.c, since argv[1] is NULL when you don't supply an argument. This is perhaps confusing because atoi() is often implemented with strtol() behind the scenes.)
Try:
gdb --args ./reciprocal 15
or similar to test with arguments.
How can I hook file saving on Linux systems (so that my program can show a dialog and then operate on the saved files)?
Just use the inotify interface to get notification of file system changes. See: http://linux.die.net/man/7/inotify
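A small sketch of what that might look like, assuming "file saving" can be approximated by the IN_CLOSE_WRITE event (a file opened for writing was closed) in a single watched directory. The function names open_save_watch and drain_saves are invented for this example; the inotify calls themselves are the ones documented in inotify(7).

```c
#include <stddef.h>
#include <sys/inotify.h>
#include <sys/types.h>
#include <unistd.h>

/* Start watching dir for files that are closed after being written.
   Returns an inotify file descriptor, or -1 on error. */
int open_save_watch(const char *dir)
{
    int fd = inotify_init1(IN_CLOEXEC);
    if (fd < 0)
        return -1;
    if (inotify_add_watch(fd, dir, IN_CLOSE_WRITE) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Block until at least one event arrives, then call on_save with the name
   of each saved file. Returns the number of files reported, -1 on error. */
int drain_saves(int fd, void (*on_save)(const char *name))
{
    char buf[4096]
        __attribute__((aligned(__alignof__(struct inotify_event))));
    ssize_t n = read(fd, buf, sizeof buf);
    if (n <= 0)
        return -1;

    int count = 0;
    for (char *p = buf; p < buf + n; ) {
        const struct inotify_event *ev = (const struct inotify_event *)p;
        if ((ev->mask & IN_CLOSE_WRITE) && ev->len > 0) {
            on_save(ev->name);   /* name is relative to the watched dir */
            count++;
        }
        p += sizeof(struct inotify_event) + ev->len;
    }
    return count;
}
```

Note that inotify only reports the save after it has happened; it cannot intercept or veto it. A program would typically call drain_saves in a loop and open its dialog from the callback.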
You can try the FILE_PRELOAD utility, which generates C++ code with hooks, compiles it and LD_PRELOADs it. A short look at it shows how easy it is to hook Linux. The starting point is this tutorial.
For example, if you want to change 'open call' of file /tmp/some with /tmp/replace_with:
#: FILE_PRELOAD -C "A+f:/tmp/some:/tmp/replace_with" -- bash
#: echo "HaHa" >> /tmp/some
#: ll /tmp/some
ls: cannot access /tmp/some: No such file or directory
#: cat /tmp/replace_with
HaHa
If you want to see the source of generated code just add "-p" to options.
#: FILE_PRELOAD -p -C "A+f:/tmp/some:/tmp/replace_with" -- bash
In addition, all generated .cpp files can be found in /tmp/$USER/FILE_PRELOAD/cpp.
Have fun playing with Linux hooks!
Generated code looks like this:
#include <sys/types.h>
#include <dlfcn.h>
#include <stdio.h>
#include <map>
#include <string>
#define I int
#define C char
#define S string
#define P printf
#define R return
using std::map;
using std::string;
typedef map<S,S> MAP;
static I (*old_open)(const C *p, I flags, mode_t mode);
extern "C"
I open (const C *p, I flags, mode_t mode){
old_open = dlsym(RTLD_NEXT, "open");
P("open hook\n");
MAP files;
files[p]=p;
files["/tmp/some"]="/tmp/replace_with";
S newpath = files[S(p)];
R old_open(newpath.c_str(), flags, mode);
}
# compile
gcc -w -fpermissive -fPIC -c -Wall file.cpp
gcc -shared file.o -ldl -lstdc++ -o wrap_loadfile.so
LD_PRELOAD=./wrap_loadfile.so bash
nm -D /lib/libc.so.6 | grep open # the libc symbols that we hook
If you can compile them, you can link first against a custom library that provides open().
There's a stock way of doing it.
If you can't compile them, this works most of the time:
Write a function _open_posthook that does syscall(SYS_open, ...).
Provide a shared library libopenhook that provides your new open. Remember that the original open is now reached through _open_posthook() here, unless you want recursion. Don't forget to also provide creat().
Load this library with LD_PRELOAD.
EDIT: if you're trying for security, this won't work. You might be able to get away with using strace, but unless you are very careful a determined programmer can overcome that too.
I need to add extra initialization to an existing dynamically linked application.
If you want to hook additional code before running main() in an already-compiled program, you can use a combination of the constructor attribute, and LD_PRELOAD like so:
#include <stdio.h>
void __attribute__((constructor)) init() {
printf("Hello, world!\n");
}
Compile and run:
$ gcc -shared demo_print.c -o demo_print.so -fPIC
$ LD_PRELOAD=$PWD/demo_print.so true
Hello, world!
If you don't want to run normal main() at all, just terminate (with exit() etc) before main() runs. Note that you won't be able to actually get the address of main() to call manually - just return from your constructor to continue normal startup.
If you're writing a shared library that needs specific startup initialisation, you can use the GCC "constructor" extension:
void foo() __attribute__ ((constructor));
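As a small illustration of that extension, a library might set up its state in a constructor so that it is ready before any caller reaches it. The names mylib_init and mylib_is_ready are hypothetical.

```c
/* State that must be set up before the library is used. */
static int mylib_ready;

/* Runs automatically when the library is loaded, before main(). */
__attribute__((constructor))
static void mylib_init(void)
{
    mylib_ready = 1;
}

/* Callers can check that initialisation has already happened. */
int mylib_is_ready(void)
{
    return mylib_ready;
}
```

Compiled with -fPIC -shared, the same code works whether the library is linked normally, opened with dlopen, or injected with LD_PRELOAD.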