For a static library (.a file), how to list the module-level dependencies of it?
I know for a shared library (.so), we can use objdump or readelf to do this:
objdump -p test.so
or
readelf -d test.so
I can get something like
NEEDED libOne.so
NEEDED libc.so.6
But for a static library, I can only get the dependencies in symbol-level, for example, by running
objdump -T test.a
I will get some thing like:
00000000 DF UND 00000000 QByteArray::mid(int, int) const
00000000 DF UND 00000000 QUrl::fromEncoded(QByteArray const&)
00000000 DF UND 00000000 QFileInfo::fileName() const
But I need the information in module-level, does anyone know how to get that information?
A static library have no such list of dependencies.
A static library is nothing more than an archive of object files. And as object files doesn't know what libraries they depend on, neither can a static library.
#Some programmer dude is right, but what you can do is build a simple program using this library statically and then check with ldd -v what are the dependencies.
Related
How can I easily find out the direct shared object dependencies of a Linux binary in ELF format?
I'm aware of the ldd tool, but that appears to output all dependencies of a binary, including the dependencies of any shared objects that binary is dependent on.
You can use readelf to explore the ELF headers. readelf -d will list the direct dependencies as NEEDED sections.
$ readelf -d elfbin
Dynamic section at offset 0xe30 contains 22 entries:
Tag Type Name/Value
0x0000000000000001 (NEEDED) Shared library: [libssl.so.1.0.0]
0x0000000000000001 (NEEDED) Shared library: [libc.so.6]
0x000000000000000c (INIT) 0x400520
0x000000000000000d (FINI) 0x400758
...
If you want to find dependencies recursively (including dependencies of dependencies, dependencies of dependencies of dependencies and so on)…
You may use ldd command.
ldd - print shared library dependencies
The objdump tool can tell you this information. If you invoke objdump with the -x option, to get it to output all headers then you'll find the shared object dependencies right at the start in the "Dynamic Section".
For example running objdump -x /usr/lib/libXpm.so.4 on my system gives the following information in the "Dynamic Section":
Dynamic Section:
NEEDED libX11.so.6
NEEDED libc.so.6
SONAME libXpm.so.4
INIT 0x0000000000002450
FINI 0x000000000000e0e8
GNU_HASH 0x00000000000001f0
STRTAB 0x00000000000011a8
SYMTAB 0x0000000000000470
STRSZ 0x0000000000000813
SYMENT 0x0000000000000018
PLTGOT 0x000000000020ffe8
PLTRELSZ 0x00000000000005e8
PLTREL 0x0000000000000007
JMPREL 0x0000000000001e68
RELA 0x0000000000001b38
RELASZ 0x0000000000000330
RELAENT 0x0000000000000018
VERNEED 0x0000000000001ad8
VERNEEDNUM 0x0000000000000001
VERSYM 0x00000000000019bc
RELACOUNT 0x000000000000001b
The direct shared object dependencies are listing as 'NEEDED' values. So in the example above, libXpm.so.4 on my system just needs libX11.so.6 and libc.so.6.
It's important to note that this doesn't mean that all the symbols needed by the binary being passed to objdump will be present in the libraries, but it does at least show what libraries the loader will try to load when loading the binary.
ldd -v prints the dependency tree under "Version information:' section. The first block in that section are the direct dependencies of the binary.
See Hierarchical ldd(1)
I am wondering if it is legal to return with ret from a program's entry point.
Example with NASM:
section .text
global _start
_start:
ret
; Linux: nasm -f elf64 foo.asm -o foo.o && ld foo.o
; OS X: nasm -f macho64 foo.asm -o foo.o && ld foo.o -lc -macosx_version_min 10.12.0 -e _start -o foo
ret pops a return address from the stack and jumps to it.
But are the top bytes of the stack a valid return address at the program entry point, or do I have to call exit?
Also, the program above does not segfault on OS X. Where does it return to?
MacOS Dynamic Executables
When you are using MacOS and link with:
ld foo.o -lc -macosx_version_min 10.12.0 -e _start -o foo
you are getting a dynamically loaded version of your code. _start isn't the true entry point, the dynamic loader is. The dynamic loader as one of its last steps does C/C++/Objective-C runtime initialization, and then calls your specified entry point specified with the -e option. The Apple documentation about Forking and Executing the Process has these paragraphs:
A Mach-O executable file contains a header consisting of a set of load commands. For programs that use shared libraries or frameworks, one of these commands specifies the location of the linker to be used to load the program. If you use Xcode, this is always /usr/lib/dyld, the standard OS X dynamic linker.
When you call the execve routine, the kernel first loads the specified program file and examines the mach_header structure at the start of the file. The kernel verifies that the file appear to be a valid Mach-O file and interprets the load commands stored in the header. The kernel then loads the dynamic linker specified by the load commands into memory and executes the dynamic linker on the program file.
The dynamic linker loads all the shared libraries that the main program links against (the dependent libraries) and binds enough of the symbols to start the program. It then calls the entry point function. At build time, the static linker adds the standard entry point function to the main executable file from the object file /usr/lib/crt1.o. This function sets up the runtime environment state for the kernel and calls static initializers for C++ objects, initializes the Objective-C runtime, and then calls the program’s main function
In your case that is _start. In this environment where you are creating a dynamically linked executable you can do a ret and have it return back to the code that called _start which does an exit system call for you. This is why it doesn't crash. If you review the generated object file with gobjdump -Dx foo you should get:
start address 0x0000000000000000
Sections:
Idx Name Size VMA LMA File off Algn
0 .text 00000001 0000000000001fff 0000000000001fff 00000fff 2**0
CONTENTS, ALLOC, LOAD, CODE
SYMBOL TABLE:
0000000000001000 g 03 ABS 01 0010 __mh_execute_header
0000000000001fff g 0f SECT 01 0000 [.text] _start
0000000000000000 g 01 UND 00 0100 dyld_stub_binder
Disassembly of section .text:
0000000000001fff <_start>:
1fff: c3 retq
Notice that start address is 0. And the code at 0 is dyld_stub_binder. This is the dynamic loader stub that eventually sets up a C runtime environment and then calls your entry point _start. If you don't override the entry point it defaults to main.
MacOS Static Executables
If however you build as a static executable, there is no code executed before your entry point and ret should crash since there is no valid return address on the stack. In the documentation quoted above is this:
For programs that use shared libraries or frameworks, one of these commands specifies the location of the linker to be used to load the program.
A statically built executable doesn't use the dynamic loader dyld with crt1.o embedded in it. CRT = C runtime library which covers C++/Objective-C as well on MacOS. The processes of dealing with dynamic loading are not done, C/C++/Objective-C initialization code is not executed, and control is transferred directly to your entry point.
To build statically drop the -lc (or -lSystem) from the linker command and add -static option:
ld foo.o -macosx_version_min 10.12.0 -e _start -o foo -static
If you run this version it should produce a segmentation fault. gobjdump -Dx foo produces
start address 0x0000000000001fff
Sections:
Idx Name Size VMA LMA File off Algn
0 .text 00000001 0000000000001fff 0000000000001fff 00000fff 2**0
CONTENTS, ALLOC, LOAD, CODE
1 LC_THREAD.x86_THREAD_STATE64.0 000000a8 0000000000000000 0000000000000000 00000198 2**0
CONTENTS
SYMBOL TABLE:
0000000000001000 g 03 ABS 01 0010 __mh_execute_header
0000000000001fff g 0f SECT 01 0000 [.text] _start
Disassembly of section .text:
0000000000001fff <_start>:
1fff: c3 retq
You should notice start_address is now 0x1fff. 0x1fff is the entry point you specified (_start). There is no dynamic loader stub as an intermediary.
Linux
Under Linux when you specify your own entry point it will segmentation fault whether you are building as a static or shared executable. There is good information on how ELF executables are run on Linux in this article and the dynamic linker documentation. The key point that should be observed is that the Linux one makes no mention of doing C/C++/Objective-C runtime initialisation unlike the MacOS dynamic linker documentation.
The key difference between the Linux dynamic loader (ld.so) and the MacOS one (dynld) is that the MacOS dynamic loader performs C/C++/Objective-C startup initialization by including the entry point from crt1.o. The code in crt1.o then transfers control to the entry point you specified with -e (default is main). In Linux the dynamic loader makes no assumption about the type of code that will be run. After the shared objects are processed and initialized control is transferred directly to the entry point.
Stack Layout at Process Creation
FreeBSD (on which MacOS is based) and Linux share one thing in common. When loading 64-bit executables the layout of the user stack when a process is created is the same. The stack for 32-bit processes is similar but pointers and data are 4 bytes wide, not 8.
Although there isn't a return address on the stack, there is other data representing the number of arguments, the arguments, environment variables, and other information. This layout is not the same as what the main function in C/C++ expects. It is part of the C startup code to convert the stack at process creation to something compatible with the C calling convention and the expectations of the function main (argc, argv, envp).
I wrote more information on this subject in this Stackoverflow answer that shows how a statically linked MacOS executable can traverse through the program arguments passed by the kernel at process creation.
Suplementing what Michael Petch already answered:
from runnable Mach-o executable perspective program launch happens either due to load command LC_MAIN (most modern executables since 10.7) which utilises DYLD in the process or backward compatible load command LC_UNIXTHREAD. The former is the variant where your ret is allowed and in fact preferable because you return control to DYLD __mh_execute_header. This will be followed by a buffer flush.
Alternatively to ret you can use system exit call either through undocumented syscall kernel API (64bit, int 0x80 for 32bit) or DYLD wrapper C lib doing it(documented).
If your executable is not utilising LC_MAIN you're left with legacy LC_UNIXTHREAD where you have no alternative to system exit call , ret will cause a segmentation fault.
I have an android .so(example.so) file which is build with some n number of static libraries(.a files). Is there is any way to find what are the static libraries used to built that .so(example.so) file.
It's not possible to figure out which static libraries were used to build the shared library. Shared libraries aren't a container of their components like static libraries are, but are a linked form of all the object files.
I think the best you can do here is figure out which source files are included, and only if the library hasn't been stripped. You can use readelf -sW libfoo.so, and if it was built from foo.cpp you'll see:
...
18: 00000000 0 FILE LOCAL DEFAULT ABS foo.cpp
...
I’d like to run an x86 shared library that I grabbed from an apk on a non-android linux machine.
It’s linked against android libc, so I grabbed the libc.so from the android ndk.
After debugging segfaults for a while, I figured that libc.so is “cheating” and contains only nop implementations of many library functions:
$ objdump -d libc.so | grep memalign -A 8
0000bf82 <memalign>:
bf82: 55 push %ebp
bf83: 89 e5 mov %esp,%ebp
bf85: 5d pop %ebp
bf86: c3 ret
Now the ndk also contains a libc.a that contains actual implementations of these functions, but how do I get my process to load these and override the nop functions of libc.so?
Would also be interested on some more context on why android is doing this trick and how the overriding works there.
As you see libc.so that is taken from NDK contains only stubs, since its purpose it to provide necessary information to linker during creation of your own shared library or executable. Here is nice explanation of why we need stub libraries.
So if you need a real libc.so binary - there are two alternatives:
grab it directly from Android device:
$ adb pull /system/lib/libc.so <local_destination>
Download factory ROM image for your device, unpack it, mount system.img to your local filesystem, and then again copy it from /system/lib of that mounted partition.
But even if you get proper binary it is a really painful exersize - to make it working on your desktop Linux. There are at least two reasons:
Android and desktop Linux ELFs require different interpreters. You can check it with readelf:
$ readelf --all <android_binary> | grep interpreter
[Requesting program interpreter: /system/bin/linker]
$ readelf --all <linux_x64_binary> | grep interpreter
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
(Interpreter is a small program that performs actual loading of your binary and is loaded by kernel) Obviously your Linux system has no /system/bin/linker and kernel will reject loading of such binary. So you must somehow load sections properly and resolve all dependencies by yourself.
Android kernel is not the same as desktop one, it has some extra features that libc.so depends on, so even if you load ELF somehow it is still incompatible with your kernel and surely you'll get problems at some moment.
To top it off: it is practically impossible to reuse android binaries on desktop GNU/Linux even if they are targeted with the same hardware architecture.
I'm building a library that needs to be dynamically linked to my project. The output is a .so file, so I think I'm on the right track. I'm concerned by the way it's being linked at compile time - by specifying the location of its makefile and depending on a bunch of macros, which I've never encountered before.
Can I assume that since I'm building a .so library (rather than a .a) that I'm in fact dynamically linking? Or is it possible for .so libs to be statically linked, in which case I need to rip apart the make/config files to better understand what's going on?
Thanks,
Andrew
I'm not familiar with internal structure of executables and shared objects, so I could only give some practical hints.
Assuming you use gcc, it should have -shared option when linking object files into library - this way ld (called by gcc) makes shared object instead of executable binary.
gcc -shared -o libabc.so *.o ...
When you link some application with this libabc.so it should link without errors and after that with ldd command you should be able to see libabc.so among its dependencies.
$ ldd app
...
libabc.so => ...............