Bionic and libc’s stub implementations

I’d like to run an x86 shared library that I grabbed from an apk on a non-android linux machine.
It’s linked against android libc, so I grabbed the libc.so from the android ndk.
After debugging segfaults for a while, I figured that libc.so is “cheating” and contains only nop implementations of many library functions:
$ objdump -d libc.so | grep memalign -A 8
0000bf82 <memalign>:
bf82: 55 push %ebp
bf83: 89 e5 mov %esp,%ebp
bf85: 5d pop %ebp
bf86: c3 ret
Now the ndk also contains a libc.a that contains actual implementations of these functions, but how do I get my process to load these and override the nop functions of libc.so?
I would also be interested in some more context on why Android does this trick and how the overriding works there.

As you can see, the libc.so taken from the NDK contains only stubs, since its purpose is to provide the necessary information to the linker when it creates your own shared library or executable. Here is a nice explanation of why we need stub libraries.
So if you need a real libc.so binary, there are two alternatives:
Grab it directly from an Android device:
$ adb pull /system/lib/libc.so <local_destination>
Download the factory ROM image for your device, unpack it, mount system.img on your local filesystem, and copy it from /system/lib of the mounted partition.
But even if you get the proper binary, making it work on your desktop Linux is a really painful exercise. There are at least two reasons:
Android and desktop Linux ELFs require different interpreters. You can check it with readelf:
$ readelf --all <android_binary> | grep interpreter
[Requesting program interpreter: /system/bin/linker]
$ readelf --all <linux_x64_binary> | grep interpreter
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
(The interpreter is a small program, loaded by the kernel, that performs the actual loading of your binary.) Obviously your Linux system has no /system/bin/linker, so the kernel will refuse to load such a binary. You must somehow load the sections properly and resolve all dependencies yourself.
The Android kernel is not the same as a desktop one. It has some extra features that libc.so depends on, so even if you manage to load the ELF, it is still incompatible with your kernel and you will surely run into problems at some point.
To sum up: it is practically impossible to reuse Android binaries on desktop GNU/Linux, even if they target the same hardware architecture.
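To check the interpreter without readelf, here is a minimal Python sketch that reads the PT_INTERP program header straight from the file. The function name elf_interpreter is made up for this example, and it only handles the standard ELF32/ELF64 header layouts. It returns None when there is no PT_INTERP entry, which is normal for most shared libraries.

```python
import struct

PT_INTERP = 3  # program header type that holds the interpreter path


def elf_interpreter(data: bytes):
    """Return the PT_INTERP path of an ELF image, or None if it has none."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64 = data[4] == 2                     # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if data[5] == 1 else ">"    # EI_DATA: 1 = little endian
    if is_64:
        e_phoff, = struct.unpack_from(endian + "Q", data, 0x20)
        e_phentsize, e_phnum = struct.unpack_from(endian + "HH", data, 0x36)
    else:
        e_phoff, = struct.unpack_from(endian + "I", data, 0x1C)
        e_phentsize, e_phnum = struct.unpack_from(endian + "HH", data, 0x2A)
    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        p_type, = struct.unpack_from(endian + "I", data, base)
        if p_type != PT_INTERP:
            continue
        if is_64:
            p_offset, = struct.unpack_from(endian + "Q", data, base + 8)
            p_filesz, = struct.unpack_from(endian + "Q", data, base + 32)
        else:
            p_offset, = struct.unpack_from(endian + "I", data, base + 4)
            p_filesz, = struct.unpack_from(endian + "I", data, base + 16)
        # The path is stored as a NUL-terminated string inside the file.
        return data[p_offset:p_offset + p_filesz].rstrip(b"\0").decode()
    return None
```

Typical use would be elf_interpreter(open(path, "rb").read()), where path is whatever binary you want to inspect.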

Related

gdb can't resolve symbols for linux kernel

I have set up a Linux kernel debug environment with VMware Workstation. gdb connects correctly, but I can't set any breakpoint or examine any kernel symbol.
Target machine (debuggee), Ubuntu 18:
I have compiled linux kernel 5.0-0 with the following directives:
CONFIG_DEBUG_INFO=y
# CONFIG_DEBUG_INFO_REDUCED is not set
# CONFIG_DEBUG_INFO_SPLIT is not set
CONFIG_DEBUG_INFO_DWARF4=y
CONFIG_DEBUG_FS=y
# CONFIG_DEBUG_SECTION_MISMATCH is not set
# CONFIG_DEBUG_FORCE_WEAK_PER_CPU is not set
Also my VMX file configuration:
debugStub.listen.guest64 = "TRUE"
debugStub.listen.guest64.remote="TRUE"
After that I transferred vmlinux to the debugger machine and ran gdb:
bash$ gdb vmlinux
gdb-peda$ target remote 10.251.31.28:8864
Remote debugging using 10.251.31.28:8864
Warning: not running or target is remote
0xffffffff9c623f36 in ?? ()
gdb-peda$ disas sys_open
No symbol "do_sys_open" in current context.

First you need to install kernel-debug-devel, kernel-debuginfo and kernel-debuginfo-common for the corresponding kernel version.
Then you can use the crash utility to debug the kernel; it uses gdb internally.

The symbol name you're looking for is sometimes not exactly what you expect it to be. You can use readelf or other similar tools to find the full name of the symbol in the kernel image. These names sometimes differ from the names in the code because of various architecture level differences and their related header and C definitions in kernel code. For example you might be able to disassemble the open() system call by using:
disas __x64_sys_open
if you've compiled it for x86-64 architecture.
Also keep in mind that these naming conventions are subject to change in different versions of kernel.
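If you dump the symbol table to text first (nm and System.map both use an "address type name" layout), a few lines of Python are enough to look for the real name. This is only a sketch: find_symbols is a made-up helper, and the sample listing below is invented for illustration, so the addresses and even the exact names on your kernel will differ.

```python
def find_symbols(listing: str, needle: str):
    """Return (address, type, name) tuples whose name contains `needle`."""
    matches = []
    for line in listing.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank lines and entries without an address
        address, sym_type, name = parts
        if needle in name:
            matches.append((address, sym_type, name))
    return matches


# Invented sample in nm / System.map format; not taken from a real kernel.
SAMPLE = """\
ffffffff812b4a10 T do_sys_open
ffffffff812b4c30 T __x64_sys_open
ffffffff812b4c60 T __ia32_sys_open
ffffffff812b5000 T vfs_read
"""

for address, sym_type, name in find_symbols(SAMPLE, "sys_open"):
    print(address, sym_type, name)
```

Whatever names turn up are the ones to pass to disas.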

Patching and compiling Ext4 as a kernel module

I'm currently patching Ext4 for academic purposes (only linux/fs/ext4/*, such as file.c, ioctl.c, ext4.h). I'm working in a QEMU virtual machine, and to speed up the whole process I've chosen to compile Ext4 as a kernel module. The problem comes when testing new changes: even though I run make modules ARCH=x86 && make modules_install ARCH=x86 and reboot the machine (/ is Ext4), the changes are not visible unless I recompile the whole kernel. That's a little weird, as I have several signs that Ext4 has been compiled as a module:
It is configured as a module:
$ grep EXT4 .config
CONFIG_EXT4_FS=m
It does compile as a module:
$ make modules ARCH=x86
(...)
CC [M] fs/ext4/ioctl.o
LD [M] fs/ext4/ext4.o
Building modules, stage 2.
MODPOST 3 modules
LD [M] fs/ext4/ext4.ko
After $ make modules_install ARCH=x86 the files in /lib/modules/3.13.3/kernel/fs/ have proper time stamp.
Finally:
$ lsmod
Module Size Used by
ext4 340817 1
(...)
For some reason I have to run $ make all ARCH=x86 in order to see my changes at runtime. What have I missed? Thanks!

Most boot processes use an "initial ramdisk" (initrd) which contains all kernel modules which the kernel needs to load to be able to do anything - after all, to read files from an Ext4 file system, the kernel needs a driver for this file system and if the driver is on said file system, well, ...
So the solution is to pack all those files into an archive (the initial ramdisk) and save the hard disk blocks as a list of numbers in the boot loader. It can then use a primitive IDE/SATA driver to load the blocks directly, extract the drivers and load them.
Check the documentation of your linux distro to find out how to update initrd. On my Ubuntu Linux, it's mkinitramfs.
Related:
Linux initial RAM disk (initrd) overview

linux application get Killed

I have a "Seagate Central" NAS with an embedded linux on it
$ cat /etc/*release
MontaVista Linux 6, (.dev-snapshot-20130726)
When I try to run my own application on this NAS, it gets "Killed" without any notification in dmesg or /var/log/messages
$ cat /proc/cpuinfo
Processor : ARMv6-compatible processor rev 4 (v6l)
BogoMIPS : 279.34
Features : swp half thumb fastmult vfp edsp java
CPU implementer : 0x41
CPU architecture: 7
CPU variant : 0x0
CPU part : 0xb02
CPU revision : 4
Hardware : Cavium Networks CNS3420 Validation Board
Revision : 0000
Serial : 0000000000000000
My toolchain is
Sourcery_CodeBench_Lite_for_ARM_GNU_Linux/arm-none-linux-gnueabi
and my compile switches are
-march=armv6k -mcpu=mpcore -mfloat-abi=softfp -mfpu=vfp
How can I find out which process is killing my application, or what setting I have to change?
PS: I have created a simple HelloWorld application which is also not working !
$ ldd Hello
$ not a dynamic executable
readelf -a Hello
=> http://pastebin.com/kT9FvkjE
readelf -a zip
=> http://pastebin.com/3V6kqA9b
UPDATE 1
I have compiled a new binary with hard float
Readelf output
http://pastebin.com/a87bKksY
But no success ;(
I guess it is really some kind of "lock" that prevents my application from executing. How can I find out which application kills mine?
Or how can I disable this kind of function?

Use these compiler switches:
-march=armv6k -Wl,-z,max-page-size=0x10000,-z,common-page-size=0x10000,-Ttext-segment=0x10000
See also this link regarding the toolchain.
You can run readelf -a against one of the built-in binaries (e.g. /usr/bin/nano) to see the proper text-segment offset in the section headers and page size / alignment in the program headers. The above compiler flags make self-compiled programs match the structure of built in binaries, and have been tested to work. It seems the Seagate Central NAS uses a page size / offset of 0x10000 while the default for ARM gcc is 0x8000.
Edit: I see you ran readelf already. Your pastebin shows
HelloWorld:[ 1] .interp PROGBITS 00008134 000134 000013 00 A 0 0 1
zip:[ 1] .interp PROGBITS 00010134 000134 000013 00 A 0 0 1
The value 10134-134=10000 (hex) yields the correct text-segment linker option. Further down (LOAD...) are the alignment specifiers, which are 0x8000 for your HelloWorld, but 0x10000 for the zip built-in. In my experience, soft-float has not caused problems.
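The subtraction above, spelled out as a small Python sketch using the two .interp lines from the pastebin (the function name text_segment_base is made up for this example):

```python
def text_segment_base(section_addr: int, section_offset: int) -> int:
    """Address at which the start of the file is mapped, derived from one
    section's virtual address and its offset within the file."""
    return section_addr - section_offset


# .interp lines quoted above: address 0x8134 / 0x10134, file offset 0x134.
hello_base = text_segment_base(0x00008134, 0x000134)
zip_base = text_segment_base(0x00010134, 0x000134)

print(hex(hello_base))  # 0x8000
print(hex(zip_base))    # 0x10000
```

The second value is what goes into -Ttext-segment.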

Do you see any output at all?
Is your application dynamically linked?
If so, list its shared-library dependencies with the verbose option of ldd (strictly speaking a front end to the dynamic linker rather than the linker itself; on Arch Linux the command is ldd, on other platforms you may have to look up the equivalent):
ldd --verbose 'your_program_name'
That will tell you if you're missing any dependencies (shared libs etc)
Run readelf -a 'your_program_name'
Make sure the file mentioned in Requesting program interpreter: /lib/ld-linux.so.2 exists. In this case, that filename is /lib/ld-linux.so.2
If this fails to help you figure out the problem, post the complete output of ldd --verbose 'your_program_name' and readelf -a 'your_program_name' in your question.
Another issue may be that the NAS software just kills foreign programs. I'm not sure why it would, but we're talking about a big corporation here (Seagate) and they have odd ideas of how the world works at times...
Edit, after looking at the pastebin of readelf:
From what I see, your Hello executable differs in 2 ways from the zip executable:
It is not dynamically linked, so that throws out a whole load of problems to look for.
There's a difference in how the 2 programs are built. zip does not use softfloats and Hello does. I suspect the soft-float dependency is due to one or both of these compiler switches: -mfloat-abi=softfp -mfpu=vfp
Hello Flags: 0x5000202, has entry point, Version5 EABI, soft-float ABI
zip Flags: 0x5000002, has entry point, Version5 EABI
I'd start with either:
Removing the soft-float option from the Hello build or:
make sure the soft-float emulation libraries are on the machine. I don't know what libs this would depend on, but I do remember MontaVista supplying them the last time I touched their software. It's been 8+ years since I touched MontaVista so it's clouded in a bit of old-memory fog.
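For reference, the two Flags values can be decoded by hand. The sketch below uses the EF_ARM_* bit values as I know them from elf.h; describe_arm_flags is a made-up name and only covers the bits that appear in this thread. As far as I know, a binary with neither float bit set simply does not record its float ABI (older toolchains never set these bits), so the zip line by itself does not prove hard float.

```python
EF_ARM_EABIMASK = 0xFF000000        # top byte: EABI version
EF_ARM_HASENTRY = 0x02              # "has entry point"
EF_ARM_ABI_FLOAT_SOFT = 0x200       # "soft-float ABI"
EF_ARM_ABI_FLOAT_HARD = 0x400       # "hard-float ABI"


def describe_arm_flags(e_flags: int) -> str:
    """Render the ARM e_flags bits seen in this thread, readelf style."""
    parts = []
    if e_flags & EF_ARM_HASENTRY:
        parts.append("has entry point")
    parts.append("Version%d EABI" % ((e_flags & EF_ARM_EABIMASK) >> 24))
    if e_flags & EF_ARM_ABI_FLOAT_SOFT:
        parts.append("soft-float ABI")
    if e_flags & EF_ARM_ABI_FLOAT_HARD:
        parts.append("hard-float ABI")
    return ", ".join(parts)


print(describe_arm_flags(0x5000202))  # Hello
print(describe_arm_flags(0x5000002))  # zip
```

The only difference between the two values is the 0x200 bit.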

This is an old thread, but I just wanted to add that I succeeded in compiling a "hello world" for this old NAS today.
Running ld-linux.so.3 <app> told me that
ELF load command alignment not page-aligned
Googling this, I found this: https://github.com/JuliaLang/julia/issues/33293, which pointed me to linker-options:
-Wl,-z,max-page-size=0x10000
Compiling with these options yielded an ELF that actually did work!
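My understanding of that error is that the loader refuses a LOAD segment whose alignment is not a multiple of the system page size. A minimal Python sketch of such a check follows; load_alignment_ok is a made-up name, and the numbers are assumptions taken from the other answers in this thread (0x8000 as the default alignment of the self-built binary, 0x10000 as the page size on the NAS).

```python
def load_alignment_ok(p_align: int, page_size: int) -> bool:
    """True if a LOAD segment's alignment is a multiple of the page size.

    page_size must be a power of two, which it is on Linux.
    """
    return (p_align & (page_size - 1)) == 0


NAS_PAGE_SIZE = 0x10000  # assumption: 64 KiB pages, as suggested in this thread

print(load_alignment_ok(0x8000, NAS_PAGE_SIZE))   # default build
print(load_alignment_ok(0x10000, NAS_PAGE_SIZE))  # built with max-page-size=0x10000
```

Under those assumptions the first call returns False and the second True, which matches what was observed on the device.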

Are you sure your compilation options are correct ?
Try the following :
strace your application (if strace is present on the NAS)
Download one of the NAS binaries and run arm-none-linux-gnueabi-readelf -a on it, do the same on your helloworld program, and see if the ABI tags differ.
It looks like an illegal instruction issue, a floating point issue or an incompatible libc issue.
Edit: according to the readelf output, the NAS programs are compiled without soft float; you should try that.

relocation error & Linux sw distributing

This is my goal: I developed software in Linux and I need to distribute it without source code. The idea is to create a zip file that contains all the necessary items to run the executable. The user will download the zip, extract it, double-click, and the software will start on any Linux-based machine. For motivations that I'm not gonna explain, I can't use deb/rpm/etc or an installer.
The sw has the following dependencies:
some libraries (written by myself that depends on OpenCV), compiled with g++, creating .a files (i.e. static libraries)
OpenCV, as shared libraries, which have several dependencies
I compile my program with gcc, and I link it with:
$ gcc -o my_exe <*.o files> -L<path my_lib> -Wl,--rpath,\$$ORIGIN/lib -lm -lstdc++ -lmy_lib -lopencv
Then I do:
$ ldd my_exe
and I copy all the libraries listed there into ./lib, and I create the .zip.
I copy the zip to another machine. The dependencies listed by ldd my_exe are satisfied and correctly point to ./lib, but when I launch the program, I get the following error:
$ ./my_exe: relocation error: lib/libglib-2.0.so.0: symbol strncmp, version GLIBC_2.2.5 not defined in file libc.so.6 with link time reference
What's wrong? Where is my mistake?
Some additional info:
$ -bash-3.2$ nm -D lib/libc.so.6 |grep strncmp
0000000000083010 T strncmp
$ -bash-3.2$ strings lib/libc.so.6 |grep GLIBC_2.2
GLIBC_2.2.5
GLIBC_2.2.6
I'm using gcc 4.4.5, Ubuntu with a kernel 2.6.35 SMP, 64bit. The machine that I tried is 64bit SMP kernel 2.6 as well.

You seem to be reinventing what package managers (for .deb, .rpm, ...) already do. Why don't you want to make a real package? It would make things simpler and more robust.
And since you code in C++, you will have a hard time making something that works with different versions of libstdc++*.so

Determine target ISA extensions of binary file in Linux (library or executable)

We have an issue related to a Java application running under a (rather old) FC3 on an Advantech POS board with a Via C3 processor. The java application has several compiled shared libs that are accessed via JNI.
The Via C3 processor is supposed to be i686 compatible. Some time ago, after installing Ubuntu 6.10 on a Mini-ITX board with the same processor, I found out that this is not 100% true. The Ubuntu kernel hung on startup because the C3 lacks some specific, optional instructions of the i686 set. GCC uses these instructions by default when compiling with i686 optimizations. The solution, in this case, was to go with an i386-compiled version of the Ubuntu distribution.
The base problem with the Java application is that the FC3 distribution was installed on the HD by cloning from an image of the HD of another PC, this time an Intel P4. Afterwards, the distribution needed some hacking to have it running such as replacing some packages (such as the kernel one) with the i386 compiled version.
The problem is that after working for a while the system completely hangs without a trace. I am afraid that some i686 code is left somewhere in the system and could be executed randomly at any time (for example after recovering from suspend mode or something like that).
My question is:
Is there any tool or way to find out which specific architecture extensions a binary file (executable or library) requires? file does not give enough information.

The Unix/Linux file command is great for this. It can generally detect the target architecture and operating system for a given binary (and has been maintained on and off since 1973. wow!)
Of course, if you're not running under Unix/Linux, you're a bit stuck. I'm currently trying to find a Java-based port that I can call at runtime, but no luck so far.
The unix file command gives information like this:
hex: ELF 32-bit LSB executable, ARM, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.4.17, not stripped
More detail about the architecture is hinted at by the (unix) objdump -f <fileName> command, which returns:
architecture: arm, flags 0x00000112:
EXEC_P, HAS_SYMS, D_PAGED
start address 0x0000876c
This executable was compiled by a gcc cross compiler (compiled on an i86 machine for the ARM processor as a target)

I decided to add one more solution for anyone who ends up here: in my case the information provided by file and objdump wasn't enough, and grep wasn't much help. I resolved my case with readelf -a -W.
Note that this gives you quite a lot of information. The architecture-related information is at the very beginning and the very end. Here's an example:
ELF Header:
Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
Class: ELF32
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: ARM
Version: 0x1
Entry point address: 0x83f8
Start of program headers: 52 (bytes into file)
Start of section headers: 2388 (bytes into file)
Flags: 0x5000202, has entry point, Version5 EABI, soft-float ABI
Size of this header: 52 (bytes)
Size of program headers: 32 (bytes)
Number of program headers: 8
Size of section headers: 40 (bytes)
Number of section headers: 31
Section header string table index: 28
...
Displaying notes found at file offset 0x00000148 with length 0x00000020:
Owner Data size Description
GNU 0x00000010 NT_GNU_ABI_TAG (ABI version tag)
OS: Linux, ABI: 2.6.16
Attribute Section: aeabi
File Attributes
Tag_CPU_name: "7-A"
Tag_CPU_arch: v7
Tag_CPU_arch_profile: Application
Tag_ARM_ISA_use: Yes
Tag_THUMB_ISA_use: Thumb-2
Tag_FP_arch: VFPv3
Tag_Advanced_SIMD_arch: NEONv1
Tag_ABI_PCS_wchar_t: 4
Tag_ABI_FP_rounding: Needed
Tag_ABI_FP_denormal: Needed
Tag_ABI_FP_exceptions: Needed
Tag_ABI_FP_number_model: IEEE 754
Tag_ABI_align_needed: 8-byte
Tag_ABI_align_preserved: 8-byte, except leaf SP
Tag_ABI_enum_size: int
Tag_ABI_HardFP_use: SP and DP
Tag_CPU_unaligned_access: v6

I think you need a tool that checks every instruction, to determine exactly which set it belongs to. Is there even an official name for the specific set of instructions implemented by the C3 processor? If not, it's even hairier.
A quick'n'dirty variant might be to do a raw search in the file, if you can determine the bit pattern of the disallowed instructions. Just test for them directly, could be done by a simple objdump | grep chain, for instance.
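A rough Python sketch of that raw search. It assumes the instructions in question are the conditional moves (CMOVcc, encoded as 0F 40 through 0F 4F); that is my guess at what the C3 lacks, not something stated in the question. The function name find_cmov_candidates is made up. A raw byte scan also matches data and instruction operands, so hits are only candidates that still need to be checked in a real disassembly.

```python
def find_cmov_candidates(code: bytes):
    """Return offsets where the two-byte pattern 0F 4x (a CMOVcc opcode) occurs."""
    offsets = []
    for i in range(len(code) - 1):
        if code[i] == 0x0F and 0x40 <= code[i + 1] <= 0x4F:
            offsets.append(i)
    return offsets


# nop; cmove %ecx,%eax (0F 44 C1); syscall (0F 05); ret
sample = bytes.fromhex("900f44c10f05c3")
print(find_cmov_candidates(sample))
```

On a real file you would pass open(path, "rb").read() and then look at the reported offsets with objdump -d.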

To answer the ambiguity of whether a Via C3 is a i686 class processor: It's not, it's an i586 class processor.
Cyrix never produced a true 686 class processor, despite their marketing claims with the 6x86MX and MII parts. Among other missing instructions, two important ones they didn't have were CMPXCHG8b and CPUID, which were required to run Windows XP and beyond.
National Semiconductor, AMD and VIA have all produced CPU designs based on the Cyrix 5x86/6x86 core (NxP MediaGX, AMD Geode, VIA C3/C7, VIA Corefusion, etc.) which have resulted in oddball designs where you have a 586 class processor with SSE1/2/3 instruction sets.
My recommendation if you come across any of the CPUs listed above and it's not for a vintage computer project (ie. Windows 98SE and prior) then run screaming away from it. You'll be stuck on slow i386/486 Linux or have to recompile all of your software with Cyrix specific optimizations.

Expanding upon @Hi-Angel's answer I found an easy way to check the bit width of a static library:
readelf -a -W libsomefile.a | grep Class: | sort | uniq
Where libsomefile.a is my static library. Should work for other ELF files as well.
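The Class: line comes from byte 4 of the ELF header (EI_CLASS), and the Machine: line from the 16-bit e_machine field at offset 18, so both can be read without readelf. A minimal Python sketch follows; elf_class_and_machine is a made-up name, the machine table only lists a handful of values, and it expects a single ELF file. A .a archive would have to be unpacked first, for example with ar x.

```python
import struct

ELF_CLASSES = {1: "ELF32", 2: "ELF64"}
# Small subset of e_machine values; anything else is reported by number.
MACHINES = {3: "Intel 80386", 40: "ARM", 62: "x86-64", 183: "AArch64"}


def elf_class_and_machine(data: bytes):
    """Return (class, machine) strings for the ELF header in `data`."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    elf_class = ELF_CLASSES.get(data[4], "unknown")
    endian = "<" if data[5] == 1 else ">"   # EI_DATA: 1 = little endian
    e_machine, = struct.unpack_from(endian + "H", data, 18)
    return elf_class, MACHINES.get(e_machine, "machine %d" % e_machine)


# First 20 bytes of a header like the readelf example in this thread:
# the Magic line, then e_type = 2 (EXEC) and e_machine = 40 (ARM).
header = bytes.fromhex("7f454c46010101000000000000000000") + struct.pack("<HH", 2, 40)
print(elf_class_and_machine(header))
```

Only the first 20 bytes of the file are needed, so reading a small prefix is enough.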

Quickest thing to find architecture would be to execute:
objdump -f testFile | grep architecture
This works even for binary.
