Linux: LD_PRELOAD + -z,initfirst

I'm writing a shared object that is meant to be LD_PRELOADed into processes.
In that shared object I have some initialization, like
__attribute__((constructor)) void initFunc();
which I'd like to be called before any other code in the process.
For processes that consist of just an executable this works fine, but if the process depends on other shared objects of its own, those get initialized before my LD_PRELOADed shared object.
I tried passing the linker option -Wl,-z,initfirst, but that doesn't seem to have any effect at all. When I run the process with LD_DEBUG=files, I still see the application's shared objects being initialized before mine.
I'm running CentOS 5.5.

The problem is that the loader only supports one shared library with -z initfirst, and libpthread.so (which is used by almost everything) already has this set. Even if you use LD_PRELOAD to load a library, libpthread's constructors will be called first.
You can get around this by patching the loader to support multiple shared libraries with -z initfirst. Here is a patch against the ld.so from glibc 2.21 that keeps the binary ABI intact but turns the single initfirst slot into a linked list, so that the constructors of LD_PRELOADed initfirst libraries are called first.
diff --git a/elf/dl-load.c b/elf/dl-load.c
index 6dd8550..ac3b079 100644
--- a/elf/dl-load.c
+++ b/elf/dl-load.c
@@ -1387,7 +1387,27 @@ cannot enable executable stack as shared object requires");
/* Remember whether this object must be initialized first. */
if (l->l_flags_1 & DF_1_INITFIRST)
- GL(dl_initfirst) = l;
+ {
+#if 0
+ struct initfirst_list *first = malloc(sizeof(*first));
+ first->which = l;
+ first->next = GL(dl_initfirst);
+ GL(dl_initfirst) = first;
+#else
+ struct initfirst_list *node = malloc(sizeof(*node));
+ node->which = l;
+ node->next = NULL;
+ struct initfirst_list *it = GL(dl_initfirst);
+ if (!it)
+ GL(dl_initfirst) = node;
+ else
+ {
+ while (it->next)
+ it = it->next;
+ it->next = node;
+ }
+#endif
+ }
/* Finally the file information. */
l->l_dev = st.st_dev;
diff --git a/elf/dl-map-segments.h b/elf/dl-map-segments.h
index baaa813..bca961c 100644
--- a/elf/dl-map-segments.h
+++ b/elf/dl-map-segments.h
@@ -55,7 +55,11 @@ _dl_map_segments (struct link_map *l, int fd,
/* Remember which part of the address space this object uses. */
l->l_map_start = (ElfW(Addr)) __mmap ((void *) mappref, maplength,
c->prot,
+#if 0
MAP_COPY|MAP_FILE,
+#else
+ MAP_COPY|MAP_FILE|MAP_32BIT,
+#endif
fd, c->mapoff);
if (__glibc_unlikely ((void *) l->l_map_start == MAP_FAILED))
return DL_MAP_SEGMENTS_ERROR_MAP_SEGMENT;
diff --git a/elf/dl-support.c b/elf/dl-support.c
index 835dcb3..9ea0c05 100644
--- a/elf/dl-support.c
+++ b/elf/dl-support.c
@@ -148,7 +148,7 @@ struct r_search_path_elem *_dl_all_dirs;
struct r_search_path_elem *_dl_init_all_dirs;
/* The object to be initialized first. */
-struct link_map *_dl_initfirst;
+struct initfirst_list *_dl_initfirst;
/* Descriptor to write debug messages to. */
int _dl_debug_fd = STDERR_FILENO;
diff --git a/sysdeps/generic/ldsodefs.h b/sysdeps/generic/ldsodefs.h
index b421931..7bb7a69 100644
--- a/sysdeps/generic/ldsodefs.h
+++ b/sysdeps/generic/ldsodefs.h
@@ -318,7 +318,11 @@ struct rtld_global
EXTERN unsigned long long _dl_load_adds;
/* The object to be initialized first. */
- EXTERN struct link_map *_dl_initfirst;
+ /*EXTERN struct link_map *_dl_initfirst;*/
+ EXTERN struct initfirst_list {
+ struct link_map *which;
+ struct initfirst_list *next;
+ } *_dl_initfirst;
#if HP_SMALL_TIMING_AVAIL || defined HP_TIMING_PAD
/* Start time on CPU clock. */
I suppose you could try hacking libpthread to not use -z initfirst, but patching the loader seems like the simpler option. I have used it successfully to get a constructor called before anything else. You just have to make sure your LD_PRELOADed library does not use libc, because otherwise libc's constructors will be called first, and in a multithreaded program libc depends on libpthread, so libpthread's constructors will be called before that.
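The list handling in the patch can be tried out in isolation. The following is only a user-space sketch, not loader code: struct fake_map is a hypothetical stand-in for glibc's struct link_map, and initfirst_append / initfirst_free are illustrative names that do not exist in glibc. It mirrors the tail-append branch of the patch, which keeps objects in the order they were loaded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for glibc's struct link_map: only a name is kept. */
struct fake_map {
    const char *name;
};

/* Same shape as the initfirst_list node introduced by the patch. */
struct initfirst_list {
    struct fake_map *which;
    struct initfirst_list *next;
};

/* Append at the tail so that objects keep their load order.
   Returns 0 on success, -1 if allocation fails. */
int initfirst_append(struct initfirst_list **head, struct fake_map *map)
{
    assert(head != NULL);

    struct initfirst_list *node = malloc(sizeof(*node));
    if (node == NULL)
        return -1;
    node->which = map;
    node->next = NULL;

    if (*head == NULL) {
        /* First initfirst object seen: it becomes the head. */
        *head = node;
        return 0;
    }

    /* Walk to the last node and hang the new one off it. */
    struct initfirst_list *it = *head;
    while (it->next != NULL)
        it = it->next;
    it->next = node;
    return 0;
}

/* Release every node; the maps themselves are owned by the caller. */
void initfirst_free(struct initfirst_list *head)
{
    while (head != NULL) {
        struct initfirst_list *next = head->next;
        free(head);
        head = next;
    }
}
```

Appending at the tail (rather than pushing at the head, as in the disabled #if 0 branch) means the object that was registered first stays at the front of the list.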
Here's an example. I compile a hello, world program with -pthread (otherwise there are no problems). I write a small library which is meant to be LD_PRELOADed and which does not depend on libc. With the default loader, you can't get your init function called first:
$ cat hello.c
#include <stdio.h>
int main() {
puts("Hello, world!");
return 0;
}
$ gcc hello.c -o hello -pthread
$ cat superearly.c
#include <unistd.h>
#include <sys/syscall.h>
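long write(int fd, const void *buffer, size_t len) {
long ret;
/* The syscall instruction clobbers rcx and r11, and the kernel reads the buffer from memory. */
__asm__ __volatile__("syscall" : "=a"(ret) : "a"(__NR_write), "D"(fd),
"S"(buffer), "d"(len) : "rcx", "r11", "memory");
return ret;
}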
void hello(void) __attribute__((constructor));
void hello(void) {
write(STDOUT_FILENO, "Got in first!\n", 14);
}
$ gcc superearly.c -fPIC -shared -nostdlib -o libsuperearly.so -Wl,-z,initfirst
$ LD_DEBUG=libs LD_PRELOAD=./libsuperearly.so ./hello
19997: find library=libpthread.so.0 [0]; searching
19997: search cache=/etc/ld.so.cache
19997: trying file=/lib/x86_64-linux-gnu/libpthread.so.0
19997:
19997: find library=libc.so.6 [0]; searching
19997: search cache=/etc/ld.so.cache
19997: trying file=/lib/x86_64-linux-gnu/libc.so.6
19997:
19997:
19997: calling init: /lib/x86_64-linux-gnu/libpthread.so.0
19997:
19997:
19997: calling init: /lib/x86_64-linux-gnu/libc.so.6
19997:
19997:
19997: calling init: ./libsuperearly.so
19997:
Got in first!
19997:
19997: initialize program: ./hello
19997:
19997:
19997: transferring control: ./hello
19997:
Hello, world!
19997:
19997: calling fini: ./hello [0]
19997:
19997:
19997: calling fini: /lib/x86_64-linux-gnu/libpthread.so.0 [0]
19997:
$
However, with an ld.so that has the patch above applied:
$ LD_DEBUG=libs LD_PRELOAD=./libsuperearly.so ~/libc/lib/ld-2.21.so ./hello
19986: find library=libpthread.so.0 [0]; searching
19986: search cache=/home/user/libc/etc/ld.so.cache
19986: trying file=/home/user/libc/lib/libpthread.so.0
19986:
19986: find library=libc.so.6 [0]; searching
19986: search cache=/home/user/libc/etc/ld.so.cache
19986: trying file=/home/user/libc/lib/libc.so.6
19986:
19986:
19986: calling init: ./libsuperearly.so
19986:
Got in first!
19986:
19986: calling init: /home/user/libc/lib/libpthread.so.0
19986:
19986:
19986: calling init: /home/user/libc/lib/libc.so.6
19986:
19986:
19986: initialize program: ./hello
19986:
Hello, world!
19986:
19986: calling fini: ./hello [0]
19986:
19986:
19986: calling fini: /home/user/libc/lib/libpthread.so.0 [0]
19986:
$

Related

Rust linker errors: using "-Wl,--as-needed" when linking system libraries

I'm trying to write an application primarily in Rust that uses a GTK-based frontend written in C++. I've gotten pretty far in getting the build set up in Cargo, but it's failing in the linking stage.
I've created a minimal reproducible example.
// build.rs
extern crate cc;
extern crate pkg_config;
fn main() {
let gtk = pkg_config::probe_library("gtk+-3.0").unwrap();
cc::Build::new()
.cpp(true)
.file("src/gui.cc")
.includes(gtk.include_paths)
.compile("gui");
}
// src/gui.cc
// This is essentially the first example from https://docs.gtk.org/gtk3/getting_started.html
#include <gtk/gtk.h>
extern "C" {
int run_gui();
}
static void activate(GtkApplication* app, gpointer) {
GtkWidget* window = gtk_application_window_new (app);
gtk_window_set_title (GTK_WINDOW (window), "Window");
gtk_window_set_default_size (GTK_WINDOW (window), 200, 200);
gtk_widget_show_all (window);
}
int run_gui() {
GtkApplication* app = gtk_application_new ("org.gtk.example", G_APPLICATION_DEFAULT_FLAGS);
g_signal_connect (app, "activate", G_CALLBACK (activate), nullptr);
int status = g_application_run (G_APPLICATION (app), 0, nullptr);
g_object_unref (app);
return status;
}
// src/main.rs
extern "C" {
fn run_gui() -> core::ffi::c_int;
}
fn main() {
unsafe { run_gui(); }
}
When I cargo build the package, compiling the C++ and Rust files seems to work fine, but then I get undefined reference errors at the link stage.
/usr/bin/ld: /home/pete/workspaces/csvtk/so-minimal-gtk/target/debug/build/so-minimal-gtk-75c7a9a61d32d2e6/out/libgui.a(gui.o): in function `activate(_GtkApplication*, void*)':
/home/pete/workspaces/csvtk/so-minimal-gtk/src/gui.cc:9: undefined reference to `gtk_application_window_new'
/usr/bin/ld: /home/pete/workspaces/csvtk/so-minimal-gtk/src/gui.cc:10: undefined reference to `gtk_window_get_type'
...
The link line, which is quite long, is also logged before that error. It includes all the required gtk libraries, -lgtk-3, -lcairo, etc. When I run it outside of cargo, it fails with the same errors, but, when I remove the -Wl,--as-needed flag from the link line, it links correctly and the program runs fine.
Why did rustc add -Wl,--as-needed? Is there a reason those gtk symbols aren't "needed" from the perspective of the linker? Any tips on how to fix this problem elegantly?
In general, using --as-needed as a flag to the linker is extremely helpful because it prevents your binary from being linked to libraries it doesn't use directly. As an example of a reason why this is beneficial, if you depend on libfoo 1.0, which depends on libbar 1.0, and later on libfoo 1.1 updates to libbar 2.0, then as long as you don't use libbar directly, your code will continue to work with --as-needed, but won't work without it (unless libbar has symbol versioning).
The kind of impact is visible here when you use objdump -x on the binary after making the change I suggest below:
NEEDED libgtk-3.so.0
NEEDED libgio-2.0.so.0
NEEDED libgobject-2.0.so.0
NEEDED libgcc_s.so.1
NEEDED libc.so.6
NEEDED ld-linux-x86-64.so.2
ldd lists 70 libraries loaded by your program, but the binary directly requires only 6 of them, including the dynamic linker.
You can usually rely on other crates to specify their link dependencies correctly. However, when you create your own dependency on a library from C or C++ code, you have to specify the libraries to link against yourself, and in the right order. The linker processes its inputs from left to right, and with --as-needed a shared library is dropped if nothing seen so far references it. You can see the problem with cargo build -v: your gui library is linked after the dependencies of libgtk-3, so by the time libgui.a introduces its GTK references, the GTK libraries have already been discarded:
Running `rustc --crate-name test_repo --edition=2021 src/main.rs --error-format=json --json=diagnostic-rendered-ansi,artifacts,future-incompat --crate-type bin --emit=dep-info,link -C embed-bitcode=no -C debuginfo=2 -C metadata=c60f7e6ab8348a08 -C extra-filename=-c60f7e6ab8348a08 --out-dir /tmp/user/1000/test-repo/target/debug/deps -C incremental=/tmp/user/1000/test-repo/target/debug/incremental -L dependency=/tmp/user/1000/test-repo/target/debug/deps -L native=/usr/lib/x86_64-linux-gnu -L native=/tmp/user/1000/test-repo/target/debug/build/test-repo-c89dae0927f2103a/out -l gtk-3 -l gdk-3 -l z -l pangocairo-1.0 -l pango-1.0 -l harfbuzz -l atk-1.0 -l cairo-gobject -l cairo -l gdk_pixbuf-2.0 -l gio-2.0 -l gobject-2.0 -l glib-2.0 -l static=gui -l stdc++`
So you'd want to do something more like this in your build.rs:
// build.rs
extern crate cc;
extern crate pkg_config;
fn main() {
let gtk = pkg_config::probe_library("gtk+-3.0").unwrap();
cc::Build::new()
.cpp(true)
.file("src/gui.cc")
.includes(gtk.include_paths)
.compile("gui");
for path in gtk.link_paths {
println!("cargo:rustc-link-search={}", path.display());
}
for lib in gtk.libs {
println!("cargo:rustc-link-lib={}", lib);
}
}

Undefined symbols in hello world C++ kernel module

I have added C++ support to Linux kernel 4.14.41, compiled it, and booted it successfully. To check that C++ modules work, I am inserting an LKM. This is the module that I am trying to load:
#include<c++/begin_include.h>
#include<linux/module.h>
#include<linux/kernel.h>
#include<c++/end_include.h>
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("LKM in c++");
MODULE_AUTHOR("MOOL");
class hello
{
public:
hello();
void hi();
};
void hello::hi()
{
printk("Hello world!! \n");
}
hello::hello()
{
printk("Constructor is being called \n");
}
extern "C"
{
static int __init test_classes_init()
{
class hello obj;
obj.hi();
printk("Module inserted:\n");
return 0;
}
static void __exit test_classes_fini()
{
printk("Module removed:\n");
}
module_init(test_classes_init);
module_exit(test_classes_fini);
}
The Makefile:
obj-m = helloworld.o
KVERSION=$(shell uname -r)
all:
make -C /lib/modules/$(KVERSION)/build M=$(PWD) modules
clean:
make -C /lib/modules/$(KVERSION)/build M=$(PWD) clean
When I enter the make command, the helloworld.ko is generated with the warnings
WARNING: "begin_fini" [/home/jai/Downloads/helloworld/helloworld.ko] undefined !
WARNING: "end_init" [/home/jai/Downloads/helloworld/helloworld.ko] undefined !
WARNING: "begin_init" [/home/jai/Downloads/helloworld/helloworld.ko] undefined !
But when I try to insert it using insmod helloworld.ko, the undefined symbol error occurs.
dmesg:
loading out-of-tree module taints kernel
Unknown symbol begin_init (err 0)
Unknown symbol end_init (err 0)
Unknown symbol begin_fini (err 0)
These begin_init, end_init and begin_fini are defined in lib/gcc/crtstuff.c (which was ported into the kernel). These functions are declared as extern in both crtstuff.c and linux/module.h. This module.h is included in the helloworld module above, but those symbols are still undefined. So how can I make those functions defined?
Your kernel C++ implementation is incomplete. You will have to implement global constructor and destructor support (processing of .init_array and .fini_array sections properly), or stop using these C++ features in the source code. This needs cooperation from the kernel module loader. Changes to the startup code will not work because the startup code is not linked into kernel modules.
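To illustrate what "processing .init_array" amounts to, here is a minimal sketch in plain user-space C, not kernel code. It assumes the loader has already located the start and end of the table of constructor pointers; in a real kernel implementation the module loader would have to find that section in the loaded module first. The names ctor_fn, run_ctors, ctor_first, ctor_second and demo_init_array are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Type of one entry in a constructor table (user-space model). */
typedef void (*ctor_fn)(void);

/* Call each entry of [begin, end) in order, the way a loader would
   walk .init_array. Returns the number of constructors called. */
size_t run_ctors(const ctor_fn *begin, const ctor_fn *end)
{
    assert(begin != NULL && end != NULL && begin <= end);

    size_t called = 0;
    for (const ctor_fn *p = begin; p != end; ++p) {
        if (*p != NULL) {   /* skip empty slots defensively */
            (*p)();
            ++called;
        }
    }
    return called;
}

/* Hypothetical constructors that record the order in which they ran. */
int ctor_trace[8];
size_t ctor_trace_len = 0;

void ctor_first(void)
{
    if (ctor_trace_len < 8)
        ctor_trace[ctor_trace_len++] = 1;
}

void ctor_second(void)
{
    if (ctor_trace_len < 8)
        ctor_trace[ctor_trace_len++] = 2;
}

/* A hand-built table standing in for a module's .init_array section. */
const ctor_fn demo_init_array[] = { ctor_first, ctor_second };
```

Calling run_ctors(demo_init_array, demo_init_array + 2) runs both constructors in table order. Destructor support would need a similar walk over .fini_array when the module is unloaded.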

How to catch Python 3 stdout in C++ code

In an old question about how to catch Python stdout in C++ code, there is a good answer, and it works, but only in Python 2.
I would like to use something like that with Python 3. Could anyone help me here?
UPDATE
The code I am using is below. It was ported from Mark's answer cited above; the only change was the use of PyBytes_AsString instead of PyString_AsString, as suggested in the documentation.
#include <Python.h>
#include <string>
int main(int argc, char** argv)
{
std::string stdOutErr =
"import sys\n\
class CatchOutErr:\n\
    def __init__(self):\n\
        self.value = ''\n\
    def write(self, txt):\n\
        self.value += txt\n\
catchOutErr = CatchOutErr()\n\
sys.stdout = catchOutErr\n\
sys.stderr = catchOutErr\n\
"; //this is python code to redirect stdouts/stderr
Py_Initialize();
PyObject *pModule = PyImport_AddModule("__main__"); //create main module
PyRun_SimpleString(stdOutErr.c_str()); //invoke code to redirect
PyRun_SimpleString("print(1+1)"); //this is ok stdout
PyRun_SimpleString("1+a"); //this creates an error
PyObject *catcher = PyObject_GetAttrString(pModule,"catchOutErr"); //get our catchOutErr created above
PyErr_Print(); //make python print any errors
PyObject *output = PyObject_GetAttrString(catcher,"value"); //get the stdout and stderr from our catchOutErr object
printf("Here's the output:\n %s", PyBytes_AsString(output)); //it's not in our C++ portion
Py_Finalize();
return 0;
}
I build it using Python 3 library:
g++ -I/usr/include/python3.6m -Wall -Werror -fpic code.cpp -lpython3.6m
and the output is:
Here's the output:
(null)
If someone needs more information about the question, please let me know and I will try provide here.
Your issue is that .value isn't a bytes object, it is a string (i.e. Python2 unicode) object. Therefore PyBytes_AsString fails. We can convert it to a bytes object with PyUnicode_AsEncodedString.
PyObject *output = PyObject_GetAttrString(catcher,"value"); //get the stdout and stderr from our catchOutErr
PyObject* encoded = PyUnicode_AsEncodedString(output,"utf-8","strict");
printf("Here's the output:\n %s", PyBytes_AsString(encoded));
Note that you should check each returned PyObject* against NULL to see whether an error has occurred.

procps cause stack smashing

I've been writing a program trying to find itself using the procps library.
But for some reason it smashes the stack.
This is my code:
int main(){
PROCTAB *ptp;
proc_t task;
pid_t mypid[1];
mypid[0] = getpid();
printf("My id: %d\n", mypid[0]);
ptp = openproc(PROC_PID, mypid, 1);
if(readproc(ptp, &task)){
printf("Task id:%d\n",task.XXXID);
}
else{
printf("Error: could not find currect task\n");
}
closeproc(ptp);
printf("Done\n");
return 0;
}
The output I get when I run the program is:
$ ./test
My id is: 8514
Task id is:8514
Done
*** stack smashing detected ***: ./test terminated
======= Backtrace: =========
/lib/i386-linux-gnu/libc.so.6(__fortify_fail+0x45)[0xb7688dd5]
/lib/i386-linux-gnu/libc.so.6(+0xffd8a)[0xb7688d8a]
./test[0x804863e]
/lib/i386-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0xb75a24d3]
./test[0x80484f1]
======= Memory map: ========
...
Aborted (core dumped)
Does anyone have an idea why this happens?
Am I doing something wrong?
Thanks.
Edit:
I looked at the header file and noticed that I was using the openproc function incorrectly: for PROC_PID, the mypid array has to be zero-terminated. So I changed my code to:
int main(){
PROCTAB *ptp;
proc_t task;
pid_t mypid[2];
mypid[0] = getpid();
memset(&mypid[1], 0, sizeof(pid_t));
printf("My id: %d\n", mypid[0]);
ptp = openproc(PROC_PID, mypid);
if(readproc(ptp, &task)){
printf("Task id:%d\n",task.XXXID);
}
else{
printf("Error: could not find currect task\n");
}
closeproc(ptp);
printf("Done\n");
return 0;
}
and it still smashes the stack.
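For reference, the zero-terminated list convention mentioned in the edit can be sketched in plain C without procps. The helper names pid_list_length and make_single_pid_list are hypothetical and only illustrate the layout: the consumer keeps reading entries until it finds a 0, so a list without a terminator would be read past its end.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Count the entries of a pid list that ends with a 0 entry.
   The terminator itself is not counted. */
size_t pid_list_length(const pid_t *list)
{
    assert(list != NULL);

    size_t n = 0;
    while (list[n] != 0)
        ++n;
    return n;
}

/* Fill a caller-provided two-element array with one pid plus the terminator. */
void make_single_pid_list(pid_t out[2], pid_t pid)
{
    assert(out != NULL);

    out[0] = pid;
    out[1] = 0;   /* terminator expected by a PROC_PID lookup */
}
```

This gives the same two-element layout as the mypid[2] array plus memset in the edited code above.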
It works for me here. After getting that version of procps, it compiled and ran fine:
$ gcc -Wall -Werror -o rp -L. -lproc-3.2.8 rp.c
$ ./rp
My id: 11468
Task id:11468
Done
Update
Try a modified version:
proc_t *result;
...
if((result = readproc(ptp, NULL))){
printf("Task id:%d\n",result->XXXID);
free(result);
}
A possible cause for your crash is the fact that the proc_t struct returned by readproc() has additional dynamically allocated elements, such as environment variables or command line arguments. A safer way is to let readproc() allocate the whole structure, and free it later using freeproc():
while ((proc_info = readproc(proc, NULL)) != NULL) {
// do something with proc_info
freeproc(proc_info);
}

Create a shared lib using another shared lib

I have a shared library "libwiston.so". I am using it to create another shared library called "libAnimation.so", which will be used by another project. However, the second library "libAnimation.so" doesn't work correctly in my test code, so I suspect that it is not being built correctly. The gcc command I use to create it is
g++ -g -shared -Wl,-soname,libwiston.so -o libAnimation.so $(objs) -lc
Has someone come across this problem?
That looks like a weird link line - you are creating libAnimation.so, but its internal DT_SONAME name is libwiston.so.
I don't think that's what you wanted to do. Don't you want to link libAnimation.so against libwiston.so (-lwiston)?
g++ -g -shared -o libAnimation.so $(objs) -lc -lwiston
I think it would be easier to wrap your build in automake/autoconf and rely on libtool to get the shared library creation correct.
I'll do a humble review on the process of creating shared libraries.
Let's begin by creating libwiston.so. First we implement the function we would like to export, and then declare it in a header so other programs know how to call it.
/* file libwiston.cpp
* Implementation of hello_wiston(), called by libAnimation.so
*/
#include "libwiston.h"
#include <iostream>
int hello_wiston(std::string& msg)
{
std::cout << msg << std::endl;
return 0;
}
and:
/* file libwiston.h
* Exports hello_wiston() as a C symbol.
*/
#include <string>
extern "C" {
int hello_wiston(std::string& msg);
};
This code can be compiled with: g++ -fPIC libwiston.cpp -o libwiston.so -shared
Now we implement the second shared library, named libAnimation.so that calls the function exported by the first library.
/* file libAnimation.cpp
* Implementation of call_wiston().
* This function is a simple wrapper around hello_wiston().
*/
#include "libAnimation.h"
#include "libwiston.h"
#include <iostream>
int call_wiston(std::string& param)
{
hello_wiston(param);
return 0;
}
and header:
/* file libAnimation.h
* Exports call_wiston() as a C symbol.
*/
#include <string>
extern "C" {
int call_wiston(std::string& param);
};
Compile it with: g++ -fPIC libAnimation.cpp -o libAnimation.so -shared -L. -lwiston
Finally, we create a small application to test libAnimation.
/* file demo.cpp
* Implementation of the test application.
*/
#include "libAnimation.h"
int main()
{
std::string msg = "hello stackoverflow!";
call_wiston(msg);
}
And compile it with: g++ demo.cpp -o demo -L. -lAnimation
There's an interesting tool named nm that you can use to list the symbols exported by your shared library. Using these examples, you could execute the following commands to check for the symbols:
nm libAnimation.so | grep call_wiston
outputs:
00000634 t _GLOBAL__I_call_wiston
000005dc T call_wiston
and also:
nm libwiston.so | grep hello_wiston
outputs:
0000076c t _GLOBAL__I_hello_wiston
000006fc T hello_wiston
