Can I use the `ret` instruction from code at _start on MacOS? On Linux?

I am wondering if it is legal to return with ret from a program's entry point.
Example with NASM:
section .text
global _start
_start:
ret
; Linux: nasm -f elf64 foo.asm -o foo.o && ld foo.o
; OS X: nasm -f macho64 foo.asm -o foo.o && ld foo.o -lc -macosx_version_min 10.12.0 -e _start -o foo
ret pops a return address from the stack and jumps to it.
But are the top bytes of the stack a valid return address at the program entry point, or do I have to call exit?
Also, the program above does not segfault on OS X. Where does it return to?

MacOS Dynamic Executables
When you are using MacOS and link with:
you are getting a dynamically loaded version of your code. _start isn't the true entry point, the dynamic loader is. The dynamic loader as one of its last steps does C/C++/Objective-C runtime initialization, and then calls your specified entry point specified with the -e option. The Apple documentation about Forking and Executing the Process has these paragraphs:
A Mach-O executable file contains a header consisting of a set of load commands. For programs that use shared libraries or frameworks, one of these commands specifies the location of the linker to be used to load the program. If you use Xcode, this is always /usr/lib/dyld, the standard OS X dynamic linker.
When you call the execve routine, the kernel first loads the specified program file and examines the mach_header structure at the start of the file. The kernel verifies that the file appears to be a valid Mach-O file and interprets the load commands stored in the header. The kernel then loads the dynamic linker specified by the load commands into memory and executes the dynamic linker on the program file.
The dynamic linker loads all the shared libraries that the main program links against (the dependent libraries) and binds enough of the symbols to start the program. It then calls the entry point function. At build time, the static linker adds the standard entry point function to the main executable file from the object file /usr/lib/crt1.o. This function sets up the runtime environment state for the kernel and calls static initializers for C++ objects, initializes the Objective-C runtime, and then calls the program’s main function.
In your case that is _start. In this environment, where you are creating a dynamically linked executable, you can do a ret and have it return back to the code that called _start, which does an exit system call for you. This is why it doesn't crash. If you review the generated executable with gobjdump -Dx foo you should get:
start address 0x0000000000000000
Sections:
Idx Name Size VMA LMA File off Algn
0 .text 00000001 0000000000001fff 0000000000001fff 00000fff 2**0
CONTENTS, ALLOC, LOAD, CODE
SYMBOL TABLE:
0000000000001000 g 03 ABS 01 0010 __mh_execute_header
0000000000001fff g 0f SECT 01 0000 [.text] _start
0000000000000000 g 01 UND 00 0100 dyld_stub_binder
Disassembly of section .text:
0000000000001fff <_start>:
1fff: c3 retq
Notice that the start address is 0, and that the symbol at 0 is dyld_stub_binder, the dynamic loader stub that eventually sets up a C runtime environment and then calls your entry point _start. If you don't override the entry point, it defaults to main.
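As a hedged illustration (mine, not part of the original answer): since ret in this dynamic build returns into dyld's startup code, which as far as I know passes the value left in eax to exit(), you should even be able to choose the exit status before returning:
; Hedged sketch for the dynamic (-lc) build shown above: pick an exit
; status, then ret back to dyld's startup code.
section .text
global _start
_start:
    mov     eax, 42             ; intended process exit status
    ret                         ; returns to dyld, which should call exit(42)
; Check with: ./foo ; echo $?   (expected to print 42)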
MacOS Static Executables
If, however, you build a static executable, no code is executed before your entry point, and ret should crash since there is no valid return address on the stack. The documentation quoted above includes this:
For programs that use shared libraries or frameworks, one of these commands specifies the location of the linker to be used to load the program.
A statically built executable doesn't use the dynamic loader dyld, and no crt1.o code is embedded in it. (CRT = C runtime library, which on MacOS covers C++/Objective-C as well.) None of the dynamic-loading work is done, no C/C++/Objective-C initialization code is executed, and control is transferred directly to your entry point.
To build statically, drop the -lc (or -lSystem) from the linker command and add the -static option:
ld foo.o -macosx_version_min 10.12.0 -e _start -o foo -static
If you run this version it should produce a segmentation fault. gobjdump -Dx foo produces
start address 0x0000000000001fff
Sections:
Idx Name Size VMA LMA File off Algn
0 .text 00000001 0000000000001fff 0000000000001fff 00000fff 2**0
CONTENTS, ALLOC, LOAD, CODE
1 LC_THREAD.x86_THREAD_STATE64.0 000000a8 0000000000000000 0000000000000000 00000198 2**0
CONTENTS
SYMBOL TABLE:
0000000000001000 g 03 ABS 01 0010 __mh_execute_header
0000000000001fff g 0f SECT 01 0000 [.text] _start
Disassembly of section .text:
0000000000001fff <_start>:
1fff: c3 retq
You should notice that the start address is now 0x1fff, which is the entry point you specified (_start). There is no dynamic loader stub as an intermediary.
Linux
Under Linux, when you specify your own entry point, the program will segfault whether you build it as a static or a dynamically linked executable. There is good information on how ELF executables are run on Linux in this article and in the dynamic linker documentation. The key point is that, unlike the MacOS dynamic linker documentation, the Linux documentation makes no mention of doing C/C++/Objective-C runtime initialisation.
The key difference between the Linux dynamic loader (ld.so) and the MacOS one (dyld) is that the MacOS dynamic loader performs C/C++/Objective-C startup initialization via the entry point from crt1.o. The code in crt1.o then transfers control to the entry point you specified with -e (the default is main). On Linux the dynamic loader makes no assumptions about the type of code that will be run. After the shared objects are processed and initialized, control is transferred directly to the entry point.
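In other words, on Linux you do have to exit explicitly. A minimal sketch (NASM, x86-64, matching the Linux build command from the question), using the exit system call, which is number 60 in the x86-64 Linux syscall ABI:
; Minimal sketch: terminate cleanly on x86-64 Linux instead of using ret.
; Build: nasm -f elf64 foo.asm -o foo.o && ld foo.o
section .text
global _start
_start:
    mov     rax, 60             ; sys_exit
    xor     edi, edi            ; exit status 0
    syscall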
Stack Layout at Process Creation
FreeBSD (on which MacOS is based) and Linux have one thing in common: when loading 64-bit executables, the layout of the user stack at process creation is the same. The stack for 32-bit processes is similar, but pointers and data are 4 bytes wide, not 8.
Although there isn't a return address on the stack, there is other data: the number of arguments, the arguments themselves, the environment variables, and other information. This layout is not the same as what the main function in C/C++ expects. Part of the C startup code's job is to convert the stack as it exists at process creation into something compatible with the C calling convention and the expectations of main (argc, argv, envp).
I wrote more on this subject in this Stack Overflow answer, which shows how a statically linked MacOS executable can traverse the program arguments passed by the kernel at process creation.
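As an illustration (mine, not from the answer linked above): on x86-64 Linux, at _start the qword at [rsp] is argc, followed by the argv pointers, a NULL terminator, and then the envp pointers. A minimal sketch that prints argv[0] and exits:
; Minimal sketch (x86-64 Linux, NASM): the kernel-provided stack at _start.
; [rsp] holds argc, [rsp+8] holds argv[0], further qwords hold argv[1..],
; a NULL terminator, and then the envp pointers.
section .text
global _start
_start:
    mov     rsi, [rsp+8]        ; rsi = argv[0] (NUL-terminated program name)
    xor     rdx, rdx            ; rdx = length counter
.len:
    cmp     byte [rsi+rdx], 0   ; scan for the terminating NUL
    je      .print
    inc     rdx
    jmp     .len
.print:
    mov     rax, 1              ; sys_write
    mov     rdi, 1              ; fd 1 = stdout
    syscall                     ; rsi = buffer, rdx = length already set
    mov     rax, 60             ; sys_exit
    xor     edi, edi            ; status 0
    syscall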

Supplementing what Michael Petch already answered:
From a runnable Mach-O executable's perspective, program launch happens either via the LC_MAIN load command (most modern executables since 10.7), which involves dyld in the process, or via the backward-compatible LC_UNIXTHREAD load command. The former is the variant where your ret is allowed, and in fact preferable, because you return control to dyld (__mh_execute_header); this is followed by a buffer flush.
As an alternative to ret, you can use the exit system call, either through the undocumented kernel syscall API (the syscall instruction on 64-bit, int 0x80 on 32-bit) or through the documented C library wrapper, which dyld binds for you.
If your executable does not use LC_MAIN, you're left with the legacy LC_UNIXTHREAD, where you have no alternative to the exit system call; ret will cause a segmentation fault.
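For reference, here is a minimal sketch (mine) of that exit system call on 64-bit MacOS, assuming the usual BSD convention that 64-bit Unix syscalls are numbered 0x2000000 + n, with exit being n = 1:
; Minimal sketch: exit via the kernel syscall interface on 64-bit MacOS,
; for the LC_UNIXTHREAD / static case where ret is not an option.
; Build with the static ld command from the answer above.
section .text
global _start
_start:
    mov     rax, 0x2000001      ; BSD exit syscall (class 2, number 1)
    xor     edi, edi            ; exit status 0
    syscall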

Related

implementation of elf packer for shared libraries on aarch64

I have implemented an ELF packer for executables on aarch64.
Now, I would like to pack a shared library.
For executable binaries, the entry point is replaced with the address
of the loader. The loader's job is to decrypt the .text section
and finally jump to the decrypted .text section.
However, there is no entry point for shared libraries.
Where should my loader be placed ?
The ELF loader can be triggered by patching an entry in .rela.dyn with the address of the loader.
https://github.com/0xN3utr0n/Noteme/blob/master/injection.c
This trick has been tested on x86_64 and x86.
But failed on aarch64.

Bionic and libc’s stub implementations

I’d like to run an x86 shared library that I grabbed from an apk on a non-android linux machine.
It’s linked against android libc, so I grabbed the libc.so from the android ndk.
After debugging segfaults for a while, I figured that libc.so is “cheating” and contains only nop implementations of many library functions:
$ objdump -d libc.so | grep memalign -A 8
0000bf82 <memalign>:
bf82: 55 push %ebp
bf83: 89 e5 mov %esp,%ebp
bf85: 5d pop %ebp
bf86: c3 ret
Now the ndk also contains a libc.a that contains actual implementations of these functions, but how do I get my process to load these and override the nop functions of libc.so?
Would also be interested on some more context on why android is doing this trick and how the overriding works there.
As you can see, the libc.so taken from the NDK contains only stubs, since its purpose is to provide the necessary information to the linker when creating your own shared library or executable. Here is a nice explanation of why we need stub libraries.
So if you need a real libc.so binary, there are two alternatives:
grab it directly from an Android device:
$ adb pull /system/lib/libc.so <local_destination>
Download the factory ROM image for your device, unpack it, mount system.img on your local filesystem, and then copy it from /system/lib of that mounted partition.
But even if you get the proper binary, it is a really painful exercise to make it work on your desktop Linux. There are at least two reasons:
Android and desktop Linux ELFs require different interpreters. You can check it with readelf:
$ readelf --all <android_binary> | grep interpreter
[Requesting program interpreter: /system/bin/linker]
$ readelf --all <linux_x64_binary> | grep interpreter
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
(The interpreter is a small program, loaded by the kernel, that performs the actual loading of your binary.) Obviously your Linux system has no /system/bin/linker, so the kernel will reject such a binary. You would have to load the sections properly and resolve all dependencies yourself.
The Android kernel is not the same as the desktop one: it has some extra features that libc.so depends on. So even if you manage to load the ELF, it is still incompatible with your kernel and you will surely hit problems at some point.
To top it off: it is practically impossible to reuse Android binaries on desktop GNU/Linux, even if they target the same hardware architecture.

read data from file in assembly AT&T

I want to read data from a file in assembly AT&T but I don't really know where to start.
I haven't found a useful resource on the internet.
My working environment info:
OS: Ubuntu 14 - 64 bit
CPU: Intel
GAS compiler
Assembly Syntax: AT&T
I'll assemble with: as -o hello.o hello.s
I'll link with: ld -o test hello.o
Look up how to do systems programming on POSIX in C (open/read/write/etc.), then use the same system calls in your asm. There's nothing special about asm for this, compared to just doing it in C. (except that in C you'd be using the glibc wrappers instead of the syscall instruction directly.)
See the x86 tag wiki for links documenting how to make system calls from asm.
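To make that concrete, here is a minimal sketch (mine, not from the answer) that opens a file, reads up to 4096 bytes, writes them to stdout, and exits, using raw x86-64 Linux syscalls in AT&T syntax (read = 0, write = 1, open = 2, close = 3, exit = 60). The file name hello.txt is a placeholder, and error checking is omitted:
# Minimal sketch: dump a file to stdout with raw syscalls (GAS, AT&T, x86-64 Linux).
# Build: as -o hello.o hello.s && ld -o test hello.o
.section .data
path:   .asciz "hello.txt"          # placeholder input file
.section .bss
buf:    .skip 4096
.section .text
.globl _start
_start:
    mov $2, %rax                    # sys_open
    lea path(%rip), %rdi            # pathname
    xor %rsi, %rsi                  # flags = O_RDONLY (0)
    xor %rdx, %rdx                  # mode (unused when not creating)
    syscall
    mov %rax, %r12                  # save the file descriptor

    xor %rax, %rax                  # sys_read (0)
    mov %r12, %rdi                  # fd
    lea buf(%rip), %rsi             # buffer
    mov $4096, %rdx                 # max bytes
    syscall
    mov %rax, %rdx                  # bytes read become the write count

    mov $1, %rax                    # sys_write
    mov $1, %rdi                    # fd 1 = stdout
    lea buf(%rip), %rsi
    syscall

    mov $3, %rax                    # sys_close
    mov %r12, %rdi
    syscall

    mov $60, %rax                   # sys_exit
    xor %rdi, %rdi                  # status 0
    syscall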

linux application gets Killed

I have a "Seagate Central" NAS with an embedded linux on it
$ cat /etc/*release
MontaVista Linux 6, (.dev-snapshot-20130726)
When I try to run my own application on this NAS, it gets "Killed"
without any notification in dmesg or /var/log/messages.
$ cat /proc/cpuinfo
Processor : ARMv6-compatible processor rev 4 (v6l)
BogoMIPS : 279.34
Features : swp half thumb fastmult vfp edsp java
CPU implementer : 0x41
CPU architecture: 7
CPU variant : 0x0
CPU part : 0xb02
CPU revision : 4
Hardware : Cavium Networks CNS3420 Validation Board
Revision : 0000
Serial : 0000000000000000
My toolchain is
Sourcery_CodeBench_Lite_for_ARM_GNU_Linux/arm-none-linux-gnueabi
and my compile switches are
-march=armv6k -mcpu=mpcore -mfloat-abi=softfp -mfpu=vfp
How can I find out which process is killing my application, or what setting I have to change?
PS: I have created a simple HelloWorld application which is also not working !
$ ldd Hello
$ not a dynamic executable
readelf -a Hello
=> http://pastebin.com/kT9FvkjE
readelf -a zip
=> http://pastebin.com/3V6kqA9b
UPDATE 1
I have compiled a new binary with hard float.
Readelf output
http://pastebin.com/a87bKksY
But no success ;(
I guess it is really a "lock" topic that is blocking my application from executing. How can I find out which application kills mine?
Or how can I disable that kind of function?
Use these compiler switches:
-march=armv6k -Wl,-z,max-page-size=0x10000,-z,common-page-size=0x10000,-Ttext-segment=0x10000
See also this link regarding the toolchain.
You can run readelf -a against one of the built-in binaries (e.g. /usr/bin/nano) to see the proper text-segment offset in the section headers and the page size / alignment in the program headers. The above compiler flags make self-compiled programs match the structure of the built-in binaries, and have been tested to work. It seems the Seagate Central NAS uses a page size / offset of 0x10000, while the default for ARM gcc is 0x8000.
Edit: I see you ran readelf already. Your pastebin shows
HelloWorld:[ 1] .interp PROGBITS 00008134 000134 000013 00 A 0 0 1
zip:[ 1] .interp PROGBITS 00010134 000134 000013 00 A 0 0 1
The value 0x10134 - 0x134 = 0x10000 yields the correct text-segment linker option. Further down (LOAD...) are the alignment specifiers, which are 0x8000 for your HelloWorld but 0x10000 for the zip built-in. In my experience, soft-float has not caused problems.
Do you see any output at all?
Is your application dynamically linked?
If so, run the dynamic linker with the verbose option (you'll have to figure out the name of the dynamic linker on your platform; for Arch Linux, it is ldd):
ldd --verbose 'your_program_name'
That will tell you if you're missing any dependencies (shared libs etc)
Run readelf -a 'your_program_name'
Make sure the file mentioned in Requesting program interpreter: /lib/ld-linux.so.2 exists. In this case, that filename is /lib/ld-linux.so.2
If this fails to help you figure out the problem, post the complete output of ldd --verbose 'your_program_name' and readelf -a 'your_program_name' in your question.
Another issue may be that the NAS software just kills foreign programs. I'm not sure why it would, but we're talking about a big corporation here (Seagate) and they have odd ideas of how the world works at times...
Edit, after looking at the pastebin of readelf:
From what I see, your Hello executable differs in 2 ways from the zip executable:
It is not dynamically linked, so that throws out a whole load of problems to look for.
There's a difference in how the two programs are built: zip does not use soft floats and Hello does. I suspect the soft-float dependency is due to one or both of these compiler switches: -mfloat-abi=softfp -mfpu=vfp
Hello Flags: 0x5000202, has entry point, Version5 EABI, soft-float ABI
zip Flags: 0x5000002, has entry point, Version5 EABI
I'd start with either:
Removing the soft-float option from the Hello build or:
make sure the soft-float emulation libraries are on the machine. I don't know what libs this would depend on, but I do remember MontaVista supplying them the last time I touched their software. It's been 8+ years since I touched MontaVista so it's clouded in a bit of old-memory fog.
This is an old thread, but I just wanted to add that I succeeded in compiling a "hello world" for this old NAS today.
Running ld-linux.so.3 <app> told me that
ELF load command alignment not page-aligned
Googling this, I found this: https://github.com/JuliaLang/julia/issues/33293, which pointed me to linker-options:
-Wl,-z,max-page-size=0x10000
Compiling with these options yielded an ELF that actually did work!
Are you sure your compilation options are correct?
Try the following:
strace your application (if strace is present on the NAS)
download one of the NAS binaries and run arm-none-linux-gnueabi-readelf -a on it, then do the same on your helloworld program and see if the ABI tags differ.
It looks like an illegal instruction issue, a floating point issue or an incompatible libc issue.
Edit: according to the readelf output, the NAS programs are compiled without soft float; you should try that.

Debug assembly code using Kdbg

I have a project with one .c C source file and one .S assembly source file. Once compiled and linked, is there any way to debug the .S code using Kdbg? I am calling a .S function from the .c file, but no code loads in Kdbg.
Add a .file directive in your source, like: .file "sourceasm.s". Kdbg will then use it as expected.
I just tried kdbg (the KDE front-end for gdb, not the Linux kernel debugger kgdb of almost the same name).
It doesn't seem to have a proper disassembly mode like regular gdb's layout asm. You can set the "memory" window to disassembly and the address to $pc (and it updates as you single step), but that ties up the memory window and isn't very flexible for setting breakpoints or scrolling backwards to instructions before the current RIP/EIP.
Even if you're debugging asm source, you sometimes want to have the debugger show you the real disassembly, as well / instead of the asm source. For example in code that uses macros, or NASM %rep to repeat blocks.
AFAICT, kdbg is not a very good choice for asm debugging. text-mode GDB with layout asm / layout reg is ok; see the bottom of the x86 tag wiki for tips. I've also tried https://github.com/cs01/gdbgui. It has a disassembly mode, but it's not very nice.
As @ivan says, kdbg will let you do source-level debugging of asm source files if you add enough metadata for it to know which source file the object came from.
gcc: Build with gcc -g foo.S
NASM: Assemble with nasm -felf64 -g -Fdwarf to include DWARF debug info. (The NASM default is STABS debug info, which also works.)
YASM: Assemble with yasm -felf64 -gdwarf2.
See Assembling 32-bit binaries on a 64-bit system (GNU toolchain) for more about building static / dynamic binaries from asm source.
