Problems on injecting into printf using LD_PRELOAD method - hook

I was hooking glibc's printf() in one of my projects and ran into a problem. Could you please give me some clues? One thing that puzzles me is why the same approach works perfectly for malloc/free!
"PrintfHank.c" contains my own version of printf(), which is preloaded ahead of the standard library, and "main.c" just prints a sentence using printf(). After editing the two files, I issued the following commands:
compile main.c
gcc -Wall -o main main.c
create my own library
gcc -Wall -fPIC -shared -o PrintfHank.so PrintfHank.c -ldl
test the new library
LD_PRELOAD="$mypath/PrintfHank.so" $mypath/main
But I received "hello world" instead of "within my own printf" on the console. When hooking the malloc/free functions, it works fine.
I am logged in as "root" and am using kernel 2.6.23.1-42.fc8-i686. Any comments will be highly appreciated!
main.c
#include <stdio.h>
int main(void)
{
printf("hello world\n");
return 0;
}
PrintfHank.c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <stdio.h>
#include <dlfcn.h>
static int (*orig_printf)(const char *format, ...) = NULL;
int printf(const char *format, ...)
{
    if (orig_printf == NULL)
    {
        orig_printf = (int (*)(const char *format, ...))dlsym(RTLD_NEXT, "printf");
    }
    // TODO: print desired message from caller.
    return orig_printf("within my own printf\n");
}

This question is ancient, however:
In your main.c, you've got a newline at the end and aren't using any of the formatting capability of printf.
If I look at the output of LD_DEBUG=all LD_PRELOAD=./printhack.so hello 2>&1 (I've renamed your files somewhat), then near the bottom I can see
17246: transferring control: ./hello
17246:
17246: symbol=puts; lookup in file=./hello [0]
17246: symbol=puts; lookup in file=./printhack.so [0]
17246: symbol=puts; lookup in file=/lib/x86_64-linux-gnu/libc.so.6 [0]
17246: binding file ./hello [0] to /lib/x86_64-linux-gnu/libc.so.6 [0]: normal symbol `puts' [GLIBC_2.2.5]
and no actual mention of printf. puts is basically printf without the formatting and with an automatic line break at the end, so this is evidently the result of gcc being "helpful" by replacing the printf call with a puts call.
To make your example work, I removed the \n from the printf, which gives me output like:
17114: transferring control: ./hello
17114:
17114: symbol=printf; lookup in file=./hello [0]
17114: symbol=printf; lookup in file=./printhack.so [0]
17114: binding file ./hello [0] to ./printhack.so [0]: normal symbol `printf' [GLIBC_2.2.5]
Now I can see that printhack.so is indeed being dragged in with its custom printf.
Alternatively, you can define a custom puts function as well:
static int (*orig_puts)(const char *str) = NULL;
int puts(const char *str)
{
    if (orig_puts == NULL)
    {
        orig_puts = (int (*)(const char *str))dlsym(RTLD_NEXT, "puts");
    }
    // TODO: print desired message from caller.
    return orig_puts("within my own puts");
}
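Both hooks above still carry a TODO for printing the caller's message. One way to fill it in for printf is to forward the variable arguments through a va_list. The sketch below is an assumption about how that could look, not the original poster's code. It swaps the dlsym(RTLD_NEXT, ...) lookup for a direct call to libc's vprintf, which is a different symbol and therefore is not interposed by this library. The stderr marker text is purely illustrative.

```c
#include <stdarg.h>
#include <stdio.h>

/* Interposed printf: tags the call on stderr, then forwards the caller's
 * format string and arguments unchanged to libc's vprintf. */
int printf(const char *format, ...)
{
    va_list ap;
    int written;

    /* Illustrative marker so the interception is visible. */
    fputs("[within my own printf] ", stderr);

    va_start(ap, format);
    written = vprintf(format, ap);   /* vprintf itself is not hooked here */
    va_end(ap);

    return written;                  /* same return value as the real call */
}
```

Built as a shared object with the same gcc -Wall -fPIC -shared command as in the question, this should only be reached when the program really calls printf. A call that gcc has already rewritten to puts still needs the separate puts hook.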

Check the following:
1) The preprocessor output, since printf can be changed to something else:
gcc -E main.c
2) The LD_DEBUG information about the printf symbol and preloading:
LD_DEBUG=help LD_PRELOAD="$mypath/PrintfHank.so" $mypath/main
LD_DEBUG=all LD_PRELOAD="$mypath/PrintfHank.so" $mypath/main

Change
return orig_printf("within my own printf\n");
to
return (*orig_printf)("within my own printf\n");

Related

On Linux, why does this library loaded with LD_PRELOAD catch only some openat() calls?

I am trying to intercept openat() calls with the following library, comm.c. This is a very standard minimal example, with nothing special about it. I compile it with
>gcc -shared -Wall -fPIC -Wl,-init,init comm.c -o comm.so
I am pasting this standard minimal example to show that, I thought, I knew what I was doing.
#define _GNU_SOURCE
#include <dlfcn.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdarg.h>
#include <stdio.h>
typedef int (*openat_type)(int, const char *, int, ...);
static openat_type g_orig_openat;
void init() {
g_orig_openat = (openat_type)dlsym(RTLD_NEXT,"openat");
}
int openat(int dirfd, const char* pathname, int flags, ...) {
    int fd;
    va_list ap;
    if (flags & (O_CREAT)) {
        /* The mode argument is only present when O_CREAT is set. */
        va_start(ap, flags);
        fd = g_orig_openat(dirfd, pathname, flags, va_arg(ap, mode_t));
        va_end(ap);
    }
    else
        fd = g_orig_openat(dirfd, pathname, flags);
    printf("openat dirfd %d pathname %s\n", dirfd, pathname);
    return fd;
}
I am running a tar command, again a minimal example, untarring an archive containing a single file foobar, to a pre-existing subdirectory dir:
>strace -f tar xf foobar.tar -C dir 2>&1 | grep openat
openat(AT_FDCWD, "dir", O_RDONLY|O_NOCTTY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 4
openat(4, "foobar", O_WRONLY|O_CREAT|O_EXCL|O_NOCTTY|O_NONBLOCK|O_CLOEXEC, 0600) = -1 EEXIST (File exists)
openat(4, "foobar", O_WRONLY|O_CREAT|O_EXCL|O_NOCTTY|O_NONBLOCK|O_CLOEXEC, 0600) = 5
However,
>LD_PRELOAD=./comm.so tar xf foobar.tar -C dir
openat dirfd 4 pathname foobar
openat dirfd 4 pathname foobar
OK, I know how to handle this - I have done this before - the reason for this kind of discrepancy, is that the system call openat() that is shown by strace is not done by the same-named user function openat(). To find out what that other user function is, one gets the sources, rebuilds them, and finds out.
So, I got the sources for my tar:
>$(which tar) --version
tar (GNU tar) 1.26
I got the tar 1.26 sources and rebuilt them myself, and, lo and behold, if I use the binary tar that I built, rather than the above installed one, then comm.so does catch all 3 openat calls!
So that means there is no "other user function".
Please help, what is possibly going on here??
NO, the question is not answered by that previous question. That previous answer simply said, the library call may be differently named, than the underlying system call. Here, that is NOT the case because I recompiled the same code myself, and there are no other library calls in there.
According to the discussion mentioned, openat is probably being called through a different symbol or function. The system call reported by a tool such as strace is the raw system call, which may be wrapped by a user function or by glibc. If you want to intercept it with LD_PRELOAD, you need to find that wrapper instead of hooking openat itself. In my experience, you can try intercepting open64 or open, either of which can end up as the openat call that you observe in strace.
The link is one example of wrapping openat from open64.

How to export symbols from POSIX shared library and load using dlopen, dlsym

We are using dlopen to read in a dynamic library on Mac OS X. Update:
This is a posix problem, the same thing fails under cygwin.
First the compile. On cygwin:
extern "C" void foo() { }
g++ -shared foo.c -o libfoo.so
nm -D libfoo.so
displays no public symbols. This appears to be the problem. If I could make them public, nm -D should display them.
Using:
nm libfoo.so | grep foo
000000x0xx0x00x0x0 T _foo
you can see the symbol is there. In Linux, this does seem to work:
nm -D foo.so
0000000000201020 B __bss_start
w __cxa_finalize
0000000000201020 D _edata
0000000000201028 B _end
0000000000000608 T _fini
0000000000000600 T foo
w __gmon_start__
00000000000004c0 T _init
w _ITM_deregisterTMCloneTable
w _ITM_registerTMCloneTable
w _Jv_RegisterClasses
However, even in Linux, we cannot seem to connect to the library. Here is the source code:
#include <dlfcn.h>
#include <iostream>
using namespace std;
int main() {
void* so = dlopen("foo.so", RTLD_NOW);
if (so = nullptr) {
cerr << "Can't open shared library\n";
exit(-1);
}
#if 0
const void* sym = dlsym(so, "foo");
if (sym == nullptr) {
cout << "Symbol not found\n";
}
#endif
dlclose(so);
}
If we remove the #if 0 guard, the above code prints "Symbol not found", and it crashes on the dlclose.
We tried exporting LD_LIBRARY_PATH=. just to see if the library cannot be reached. And the dlopen call seems to work in any case, the return is not nullptr.
So to summarize, the library does not seem to work on Mac and Cygwin. On Linux nm -D shows the symbol in the library, but the code to load the symbol does not work.
In your example, you wrote if (so = nullptr) {, which assigns nullptr to so, and the condition is always false. -Wall is a good idea when debugging!
This alone explains why you can't load the symbol, but I also found that I needed to do dlopen("./foo.so", RTLD_NOW); because dlopen otherwise searches library paths, not the current directory.
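To make the two fixes concrete, here is a minimal sketch in C of a loader that compares the handle with == and reports dlerror() on failure. The helper name load_symbol and its out-parameter are hypothetical, chosen only for illustration. The caller is expected to pass an explicit path such as "./foo.so" so that the current directory is used.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: opens `path`, looks up `name`, and hands the open
 * handle back through `handle_out` so the caller can dlclose() it later.
 * Returns the symbol address, or NULL on any failure. */
void *load_symbol(const char *path, const char *name, void **handle_out)
{
    void *handle;
    void *sym;
    const char *err;

    *handle_out = NULL;

    handle = dlopen(path, RTLD_NOW);
    if (handle == NULL) {            /* comparison, not assignment */
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return NULL;
    }

    dlerror();                       /* clear any stale error state */
    sym = dlsym(handle, name);
    err = dlerror();
    if (err != NULL) {
        fprintf(stderr, "dlsym failed: %s\n", err);
        dlclose(handle);
        return NULL;
    }

    *handle_out = handle;
    return sym;
}
```

Because the handle is only stored when everything succeeded, the caller never ends up calling dlclose on a null pointer, which is what the assignment in the question's code led to. On older glibc versions the program has to be linked with -ldl.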

eclipse CDT /usr/bin/ld: cannot find -l<libname>

This is my code:
#include <stdio.h>
#include <stdlib.h>
#include <libimobiledevice/libimobiledevice.h>
#include <libimobiledevice/lockdown.h>
#include <libimobiledevice/installation_proxy.h>
#include <libimobiledevice/notification_proxy.h>
#include <libimobiledevice/afc.h>
int main(void) {
idevice_t phone = NULL;
char *udid = NULL;
idevice_new(&phone, udid);
puts("!!!hello!!!"); /* prints !!!Hello World!!! */
return EXIT_SUCCESS;
}
I installed the libimobiledevice library, and this is what the library directory contains:
#ls /usr/lib/i386-linux-gnu | grep libimob
libimobiledevice.a
libimobiledevice.so
libimobiledevice.so.4
libimobiledevice.so.4.0.1
But when I configure CDT to use the shared library as in the picture, why does CDT report this error?
/usr/bin/ld: cannot find -llibimobiledevice
Under libraries add imobiledevice instead of libimobiledevice. When you use -lx, linker searches for libx.so. In your case linker searched for liblibimobiledevice.so which it could not find.
In Mars Eclipse, to add third party libraries it was only possible from
C++/Build->Setting->Cross G++ Link-> Miscellaneous.
I wasted two hours trying to add libraries by other methods, but this one worked for me.

Computing memory address of the environment within a process

I got the following code from the lecture-slides of a security course.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
extern char shellcode;
#define VULN "./vuln"
int main(int argc, char **argv) {
void *addr = (char *) 0xc0000000 - 4
- (strlen(VULN) + 1)
- (strlen(&shellcode) + 1);
fprintf(stderr, "Using address: 0x%p\n", addr);
// some other stuff
char *params[] = { VULN, buf, NULL };
char *env[] = { &shellcode, NULL };
execve(VULN, params, env);
perror("execve");
}
This code calls a vulnerable program with the shellcode in its environment. The shellcode is some assembly code in an external file that opens a shell and VULN defines the name of the vulnerable program.
My question: how is the shellcode address computed?
The addr variable holds the address of the shellcode (which is part of the environment). Can anyone explain to me how this address is determined? So:
Where does the 0xc0000000 - 4 come from?
Why are the lengths of the shellcode and the program name subtracted from it?
Note that both this code and the vulnerable program are compiled like this:
$ CFLAGS="-m32 -fno-stack-protector -z execstack -mpreferred-stack-boundary=2"
$ cc $CFLAGS -o vuln vuln.c
$ cc $CFLAGS -o exploit exploit.c shellcode.s
$ echo 0 | sudo tee /proc/sys/kernel/randomize_va_space
0
So address space randomization is turned off.
I understood that the stack is the first thing inside the process (highest memory address). And the stack contains, in this order:
The environment data.
argv
argc
the return address of main
the framepointer
local variables in main
...etc...
Constants and global data are not stored on the stack, which is why I also don't understand how the length of the VULN constant influences the address at which the shellcode is placed.
Hope you can clear this up for me :-)
Note that we're working with a Unix system on an Intel x86 architecture.
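The formula in the slide code can be read as plain arithmetic over the top of the stack. The sketch below spells that out under a stated assumption: on a classic 32-bit Linux process with randomization off, the kernel leaves a 4-byte null word just below 0xc0000000, then a copy of the path that was passed to execve, and then the environment strings, with the last one sitting directly below the path. That would explain why the length of VULN matters even though it is a constant in the exploit program: the path string is copied onto the new process's stack. The function name and parameters are hypothetical.

```c
#include <string.h>

/* Assumed layout near the top of the stack, from high to low addresses:
 *   4-byte null word
 *   path passed to execve, including its terminating NUL
 *   last environment string, including its terminating NUL
 * Returns the address at which that environment string would start. */
unsigned long env_string_address(unsigned long stack_top,
                                 const char *exec_path,
                                 const char *last_env)
{
    unsigned long addr = stack_top;

    addr -= 4;                       /* null word at the very top */
    addr -= strlen(exec_path) + 1;   /* path string plus NUL */
    addr -= strlen(last_env) + 1;    /* environment string plus NUL */

    return addr;
}
```

For example, with a stack top of 0xc0000000, the path "./vuln" (7 bytes with its NUL) and a 4-character environment string (5 bytes with its NUL), the result is 0xc0000000 - 16 = 0xbffffff0.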

Handling __sync_add_and_fetch not being defined

In my open source software project, I call the gcc atomic builtins: __sync_add_and_fetch and __sync_sub_and_fetch to implement atomic increments and decrements on certain variables. I periodically get an email from someone trying to compile my code, but they get the following linker error:
refcountobject.cpp:(.text+0xb5): undefined reference to `__sync_sub_and_fetch_4'
refcountobject.cpp:(.text+0x115): undefined reference to `__sync_add_and_fetch_4'
After some digging, I narrowed down the root cause to the fact that their older version of gcc (4.1) defaults to a target architecture of i386. Evidently, gcc doesn't have an intrinsic for atomic addition on the 80386, so it implicitly emits a call to an undefined __sync_add_and_fetch_4 function in its place. A great description of how this works is here.
The easy workaround, as discussed here, is to tell them to modify the Makefile to append -march=pentium as one of the compiler flags. And all is good.
So what's the long term fix so users don't have to manually fix the Makefile?
I am considering a few ideas:
I don't want to hardcode -march=pentium as a compiler flag in the Makefile; I'm guessing that would break on anything that isn't Intel based. But I could certainly add it if the Makefile had a rule to detect that the default target is i386. I'm thinking about having the Makefile call a script that runs gcc -dumpmachine and parses out the first component of the target triplet. If that string is i386, it would add the compiler flag. I'm assuming no one will actually be building for 80386 machines.
The other alternative is to actually supply an implementation of __sync_add_and_fetch_4 for the linker to fall back on. It could even be compiled conditionally, based on whether the __GCC_HAVE_SYNC_COMPARE_AND_SWAP_* macros are defined. I prototyped an implementation with a global pthread_mutex. It is likely not the best for performance, but it works and resolves the issue nicely. A better idea might be to write the inline assembly myself to call "lock xadd" for the implementation if compiling for x86.
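The mutex-based prototype mentioned above is not shown in the post. A minimal sketch of what such a fallback could look like follows. The fallback_ prefix is hypothetical and is used here so the functions do not collide with the compiler's own builtins; in the actual workaround they would carry the __sync_*_4 names the linker is looking for.

```c
#include <pthread.h>

/* One global lock serializes every fallback operation. */
static pthread_mutex_t g_fallback_lock = PTHREAD_MUTEX_INITIALIZER;

/* Adds `inc` to the 32-bit value at `ptr` and returns the new value. */
unsigned int fallback_add_and_fetch_4(volatile void *ptr, unsigned int inc)
{
    volatile unsigned int *p = (volatile unsigned int *)ptr;
    unsigned int result;

    pthread_mutex_lock(&g_fallback_lock);
    *p += inc;
    result = *p;
    pthread_mutex_unlock(&g_fallback_lock);

    return result;
}

/* Subtraction is addition of the two's complement (unsigned wraparound). */
unsigned int fallback_sub_and_fetch_4(volatile void *ptr, unsigned int inc)
{
    return fallback_add_and_fetch_4(ptr, 0u - inc);
}
```

This is only atomic with respect to other callers of these same functions. Code that touches the variable directly bypasses the lock.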
This is my other working solution. It might have its place in certain situations, but I opted for the Makefile+script solution described in my other answer.
This solution is to provide local definitions for __sync_add_and_fetch_4, __sync_fetch_and_add_4, __sync_sub_and_fetch_4, and __sync_fetch_and_sub_4 in a separate source file. They get linked in only if the compiler couldn't generate them natively. Some assembly is required, but Wikipedia of all places had a reasonable implementation that I could reference. (I also disassembled what the compiler normally generates to check that everything else was correct.)
#if defined(__i386) || defined(i386) || defined(__i386__)
extern "C" unsigned int xadd_4(volatile void* pVal, unsigned int inc)
{
unsigned int result;
unsigned int* pValInt = (unsigned int*)pVal;
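asm volatile(
"lock; xaddl %0, %1"              /* result receives the old value of *pValInt */
: "=r" (result), "+m" (*pValInt)  /* the memory operand is both read and written */
: "0" (inc)
: "memory", "cc" );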
return (result);
}
extern "C" unsigned int __sync_add_and_fetch_4(volatile void* pVal, unsigned int inc)
{
return (xadd_4(pVal, inc) + inc);
}
extern "C" unsigned int __sync_sub_and_fetch_4(volatile void* pVal, unsigned int inc)
{
return (xadd_4(pVal, -inc) - inc);
}
extern "C" unsigned int __sync_fetch_and_add_4(volatile void* pVal, unsigned int inc)
{
return xadd_4(pVal, inc);
}
extern "C" unsigned int __sync_fetch_and_sub_4(volatile void* pVal, unsigned int inc)
{
return xadd_4(pVal, -inc);
}
#endif
With no replies, I struck out on my own to solve it.
There are two possible solutions; this is one of them.
First, add the following script, getfixupflags.sh, to the same directory as the Makefile. This script detects whether the compiler is likely targeting i386 and, if so, echoes "-march=pentium" as its output.
#!/bin/bash
_cxx=$1
_fixupflags=
_regex_i386='^i386'
if [[ ! -n $_cxx ]]; then echo "_cxx var is empty - exiting" >&2; exit; fi
_target=`$_cxx -dumpmachine`
if [[ $_target =~ $_regex_i386 ]]; then
_fixupflags="$_fixupflags -march=pentium"
fi
if [[ -n $_fixupflags ]]; then echo $_fixupflags; fi
Now fix the Makefile to use this script. Add the following line to the Makefile
FIXUP_FLAGS := $(shell ./getfixupflags.sh $(CXX))
Then modify the compiler directives in the Makefile to include the FIXUP_FLAGS when compiling code. For example:
%.o: %.cpp
$(COMPILE.cpp) $(FIXUP_FLAGS) $^

Resources