Related
I have a binary file: work_s. I need to disassemble it, but I am getting an error: file truncated
root#debian:/home/l# file work_s
work_s: ELF 32-bit LSB executable, ARM, EABI5 version 1 (GNU/Linux), too many section (65535)
root#debian:/home/l# readelf -h work_s
ELF Header:
Magic: 7f 45 4c 46 01 01 01 03 00 00 00 00 00 00 00 00
Class: ELF32
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - GNU
ABI Version: 0
Type: EXEC (Executable file)
Machine: ARM
Version: 0x1
Entry point address: 0xd984c
Start of program headers: 52 (bytes into file)
Start of section headers: 65535 (bytes into file)
Flags: 0x5000002, Version5 EABI, <unknown>
Size of this header: 52 (bytes)
Size of program headers: 32 (bytes)
Number of program headers: 3
Size of section headers: 40 (bytes)
Number of section headers: 65535
Section header string table index: 65535 (187120167)
readelf: Error: Reading 0x27ffd8 bytes extends past end of file for section headers
root#debian:/home/l# /usr/arm-linux-gnueabihf/bin/objdump -d work_s
/usr/arm-linux-gnueabihf/bin/objdump: work_s: File truncated
Help fix this.
For example, the curl file is normally disassembled
Changed e_shnum and e_shstrndx fields in ELF header on 0xff:
Entry point address is greater than file size.
How to restore ELF?
I am currently writing a compiler (http://curly-lang.org if you're curious), and have been encountering a strange bug when trying to run the generated ELF binaries on the latest Linux kernel. The same binaries run fine on older kernels (I've tried on several Ubuntu boxes, uname 4.4.0-1049-aws), but on my updated Arch box (uname 4.17.11-arch1), I can't even open them under GDB.
The error message given by GDB is During startup program terminated with signal SIGSEGV, Segmentation fault, which as I understand is indicative of a failure to load the program segments before the first instruction is ever run.
I compiled a minimal ELF executable with GCC/NASM to try and reproduce the problem, but the GCC-produced executable loads without a hitch, whereas my programs definitely do not.
Here are the printouts of readelf -a for both executables, for reference. The first is the program generated by my compiler :
$ readelf -a my-program
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x400040
Start of program headers: 2400 (bytes into file)
Start of section headers: 2568 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
Number of program headers: 3
Size of section headers: 64 (bytes)
Number of section headers: 6
Section header string table index: 1
Section Headers:
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
[ 0] NULL 0000000000000000 00000000
0000000000000000 0000000000000000 0 0 0
[ 1] STRTAB 0000000000000000 00000b88
000000000000009e 0000000000000000 0 0 0
[ 2] .init PROGBITS 0000000000400040 00000040
000000000000006b 0000000000000000 AX 0 0 0
[ 3] .text PROGBITS 00000000004000ab 000000ab
0000000000000824 0000000000000000 AX 0 0 0
[ 4] .data PROGBITS 00000000008008cf 000008cf
0000000000000091 0000000000000000 WA 0 0 0
[ 5] .symtab SYMTAB 0000000000000000 00000c26
00000000000000c0 0000000000000018 1 8 0
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
L (link order), O (extra OS processing required), G (group), T (TLS),
C (compressed), x (unknown), o (OS specific), E (exclude),
l (large), p (processor specific)
There are no section groups in this file.
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
LOAD 0x0000000000000040 0x0000000000400040 0x0000000000000000
0x000000000000006b 0x000000000000006b R E 0x1000
LOAD 0x00000000000000ab 0x00000000004000ab 0x0000000000000000
0x0000000000000824 0x0000000000000824 R E 0x1000
LOAD 0x00000000000008cf 0x00000000008008cf 0x0000000000000000
0x0000000000000091 0x0000000000000091 RW 0x1000
Section to Segment mapping:
Segment Sections...
00 .init
01 .text
02 .data
There is no dynamic section in this file.
There are no relocations in this file.
The decoding of unwind sections for machine type Advanced Micro Devices X86-64 is not currently supported.
Symbol table '.symtab' contains 8 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 00000000004004e0 0 NOTYPE LOCAL HIDDEN 3 .text.argument
1: 00000000004005a0 0 NOTYPE LOCAL HIDDEN 3 .text.constant
2: 0000000000400270 0 NOTYPE LOCAL HIDDEN 3 .text.memextend-page
3: 0000000000400210 0 NOTYPE LOCAL HIDDEN 3 .text.memextend-pool-32
4: 00000000004002b0 0 NOTYPE LOCAL HIDDEN 3 .text.unit
5: 00000000004005f0 0 NOTYPE LOCAL HIDDEN 3 .text.write
6: 00000000008008d0 0 NOTYPE LOCAL HIDDEN 4 .data.brkaddr
7: 0000000000400040 0 NOTYPE LOCAL HIDDEN 2 .init.brkaddr-init
No version information found in this file.
And for the GCC-generated program :
$ readelf -a gcc-program
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x400110
Start of program headers: 64 (bytes into file)
Start of section headers: 336 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
Number of program headers: 3
Size of section headers: 64 (bytes)
Number of section headers: 5
Section header string table index: 4
Section Headers:
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
[ 0] NULL 0000000000000000 00000000
0000000000000000 0000000000000000 0 0 0
[ 1] .note.gnu.build-i NOTE 00000000004000e8 000000e8
0000000000000024 0000000000000000 A 0 0 4
[ 2] .text PROGBITS 0000000000400110 00000110
0000000000000010 0000000000000000 AX 0 0 16
[ 3] .data PROGBITS 0000000000600120 00000120
0000000000000001 0000000000000000 WA 0 0 4
[ 4] .shstrtab STRTAB 0000000000000000 00000121
000000000000002a 0000000000000000 0 0 1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
L (link order), O (extra OS processing required), G (group), T (TLS),
C (compressed), x (unknown), o (OS specific), E (exclude),
l (large), p (processor specific)
There are no section groups in this file.
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000
0x0000000000000120 0x0000000000000120 R E 0x200000
LOAD 0x0000000000000120 0x0000000000600120 0x0000000000600120
0x0000000000000001 0x0000000000000001 RW 0x200000
NOTE 0x00000000000000e8 0x00000000004000e8 0x00000000004000e8
0x0000000000000024 0x0000000000000024 R 0x4
Section to Segment mapping:
Segment Sections...
00 .note.gnu.build-id .text
01 .data
02 .note.gnu.build-id
There is no dynamic section in this file.
There are no relocations in this file.
The decoding of unwind sections for machine type Advanced Micro Devices X86-64 is not currently supported.
No version information found in this file.
Displaying notes found in: .note.gnu.build-id
Owner Data size Description
GNU 0x00000014 NT_GNU_BUILD_ID (unique build ID bitstring)
Build ID: 1a3e678b08996ee6a9d289c3f76c7c52cd4a30aa
As you can see, I've tried to mirror GCC's segment placement (~0x400000 for the code, and ~0x800000 for the data), and the two ELF headers are strictly identical. The only meaningful difference I can think of is that my custom binaries have two LOAD segments (one for the initialization code, one for the rest) that share the same page, whereas GCC only produces a single code LOAD segment. That shouldn't pose a problem, though, since they both share the same permissions and don't overlap.
Other than that, I do not see what could possibly keep the first program from loading correctly. If anyone well-versed in the arcanes of the Linux ELF loader could enlighten me, that would be greatly appreciated.
Thank you for your attention,
Nevermind everyone, it was the page-sharing segments that were causing the problem all along.
Thinking the problem was probably in the kernel loader, I should have thought about running dmesg much earlier, where I would have noticed the following message, clear as day :
[54178.211348] 12766 (my-program): Uhuuh, elf segment at 0000000000400000 requested but the memory is mapped already
Apparently, some benevolent mastermind decided 3 months ago that it would be good to actually catch double-mapping errors instead of just letting them go silently as we always did in the ELF loader.
It's not that my binaries used to be correct, it's that the error they were causing wasn't caught before. I don't know if I should be proud or ashamed for my bugs to have eluded detection all this time.
Anyway, I leave this answer to warn anyone foolish enough to map multiple segments on a single page in an ELF binary : do not. There is no try.
PS: #rodrigo: Thanks for your answer, I didn't even notice the PhysAddr before you pointed them out. The manual says that they're used "on systems where physical addressing is relevant", which doesn't seem to be the case here, but I'll remember to keep a lookout for them next time.
I am currently trying to create simple ELF executable using ELFIO library based on their docs (page 8) [1]. This is output from readelf when run on the slightly modified version of their code (nothing important, just different string and address base. Alignments, flags etc. are untouched.)
ELF Header:
Magic: 7f 45 4c 46 01 01 01 03 00 00 00 00 00 00 00 00
Class: ELF32
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - GNU
ABI Version: 0
Type: EXEC (Executable file)
Machine: Intel 80386
Version: 0x1
Entry point address: 0x400000
Start of program headers: 52 (bytes into file)
Start of section headers: 4168 (bytes into file)
Flags: 0x0
Size of this header: 52 (bytes)
Size of program headers: 32 (bytes)
Number of program headers: 2
Size of section headers: 40 (bytes)
Number of section headers: 4
Section header string table index: 1
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .shstrtab STRTAB 00000000 00102e 000017 00 0 0 0
[ 2] .text PROGBITS 00400000 001000 00001d 00 AX 0 0 16
[ 3] .data PROGBITS 00400020 001020 00000e 00 WA 0 0 4
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings)
I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
O (extra OS processing required) o (OS specific), p (processor specific)
There are no section groups in this file.
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
LOAD 0x001000 0x00400000 0x00400000 0x0001d 0x0001d R E 0x1000
LOAD 0x001020 0x00400020 0x00400020 0x0000e 0x0000e RW 0x10
Section to Segment mapping:
Segment Sections...
00 .text
01 .data
This executable file runs fine. I wanted to move data section a little bit further from text section, so I have chosen address 0x400040. here you can see that it is all I did.
ELF Header:
Magic: 7f 45 4c 46 01 01 01 03 00 00 00 00 00 00 00 00
Class: ELF32
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - GNU
ABI Version: 0
Type: EXEC (Executable file)
Machine: Intel 80386
Version: 0x1
Entry point address: 0x400000
Start of program headers: 52 (bytes into file)
Start of section headers: 4168 (bytes into file)
Flags: 0x0
Size of this header: 52 (bytes)
Size of program headers: 32 (bytes)
Number of program headers: 2
Size of section headers: 40 (bytes)
Number of section headers: 4
Section header string table index: 1
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .shstrtab STRTAB 00000000 00102e 000017 00 0 0 0
[ 2] .text PROGBITS 00400000 001000 00001d 00 AX 0 0 16
[ 3] .data PROGBITS 00400040 001020 00000e 00 WA 0 0 4
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings)
I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
O (extra OS processing required) o (OS specific), p (processor specific)
There are no section groups in this file.
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
LOAD 0x001000 0x00400000 0x00400000 0x0001d 0x0001d R E 0x1000
LOAD 0x001020 0x00400040 0x00400040 0x0000e 0x0000e RW 0x10
Section to Segment mapping:
Segment Sections...
00 .text
01 .data
However, this file keeps seg. faulting on its start. I can't really understand what is going on. At first I though it has something to do with alignment of sections and segments but they seem OK to me. I have also suspected ASLR (because the address of Hello World is hardcoded and there are no relocations) and tried to turned it off in /proc/sys/kernel/randomize_va_space. I have also checked linux kernel source code, especially file fs/binfmt_elf.c, but to fully understand what is going on there I will need more time. So I would like to ask for your help. If you know what is wrong with the address 0x400040, why this file keeps seg. faulting, or you can point me to a certain lines of code in kernel, then I would really appreciate that.
Edit: I have also tried gdb but bt provides No stack. Setting read watchpoint on entry point address does not do anything.
Edit 2: After playing around with the file I found out that moving the file offset of second segment from 0x1020 to 0x1040 causes the file to work fine. So now my question is, does file offset and virtual address need to be in some relation?
[1] http://elfio.sourceforge.net/elfio.pdf
So now my question is, does file offset and virtual address need to be in some relation?
Yes: file offset must be congruent to VirtAddr and PhysAddr modulo page size, because the file is mmaped directly (by the kernel loader).
If you do not maintain that relationship, the kernel will create a new process, attempt to map your executable into it, discover that you've violated basic constraint, and will terminate your process (with SIGKILL, I believe). Your process never executes a single instruction in user-space.
No stack frame is created then it means there is some problem occurs before entry point.
Please tell me whether the file is linked with any library modules like libc because they will provide initial setup to call our entry point function.
Use gdb and make breakpoint at _start. If you don't linked with library then the task of setting up stack, arguments, environment will be your responsibility.
Help me please with gnu assembler for arm926ejs cpu.
I try to build a simple program(test.S):
.global _start
_start:
mov r0, #2
bx lr
and success build it:
arm-none-linux-gnueabi-as -mthumb -o test.o test.S
arm-none-linux-gnueabi-ld -o test test.o
but when I run the program in the arm target linux environment, I get an error:
./test
Segmentation fault
What am I doing wrong?
Can _start function be the thumb func?
or
It is always arm func?
Can _start be a thumb function (in a Linux user program)?
Yes it can. The steps are not as simple as you may believe.
Please use the .code 16 as described by others. Also look at ARM Script predicate; my answer shows how to detect a thumb binary. The entry symbol must have the traditional _start+1 value or Linux will decide to call your _start in ARM mode.
Also your code is trying to emulate,
int main(void) { return 2; }
The _start symbol must not do this (as per auselen). To do _start to main() in ARM mode you need,
#include <linux/unistd.h>
static inline void exit(int status)
{
asm volatile ("mov r0, %0\n\t"
"mov r7, %1\n\t"
"swi #7\n\t"
: : "r" (status),
"Ir" (__NR_exit)
: "r0", "r7");
}
/* Wrapper for main return code. */
void __attribute__ ((unused)) estart (int argc, char*argv[])
{
int rval = main(argc,argv);
exit(rval);
}
/* Setup arguments for estart [like main()]. */
void __attribute__ ((naked)) _start (void)
{
asm(" sub lr, lr, lr\n" /* Clear the link register. */
" ldr r0, [sp]\n" /* Get argc... */
" add r1, sp, #4\n" /* ... and argv ... */
" b estart\n" /* Let's go! */
);
}
It is good to clear the lr so that stack traces will terminate. You can avoid the argc and argv processing if you want. The start shows how to work with this. The estart is just a wrapper to convert the main() return code to an exit() call.
You need to convert the above assembler to Thumb equivalents. I would suggest using gcc inline assembler. You can convert to pure assembler source if you get inlines to work. However, doing this in 'C' source is probably more practical, unless you are trying to make a very minimal executable.
Helpful gcc arguements are,
-nostartfiles -static -nostdlib -isystem <path to linux user headers>
Add -mthumb and you should have a harness for either mode.
Your problem is you end with
bx lr
and you expect Linux to take over after that. That exact line must be the cause of Segmentation fault.
You can try to create a minimal executable then try to bisect it to see the guts and understand how an executable is expected to behave.
See below for a working example:
.global _start
.thumb_func
_start:
mov r0, #42
mov r7, #1
svc #0
compile with
arm-linux-gnueabihf-as start.s -o start.o && arm-linux-gnueabihf-ld
start.o -o start_test
and dump to see the guts
$ arm-linux-gnueabihf-readelf -a -W start_test
Now you should notice the odd address of _start
ELF Header:
Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
Class: ELF32
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: EXEC (Executable file)
Machine: ARM
Version: 0x1
Entry point address: 0x8055
Start of program headers: 52 (bytes into file)
Start of section headers: 160 (bytes into file)
Flags: 0x5000000, Version5 EABI
Size of this header: 52 (bytes)
Size of program headers: 32 (bytes)
Number of program headers: 1
Size of section headers: 40 (bytes)
Number of section headers: 6
Section header string table index: 3
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .text PROGBITS 00008054 000054 000006 00 AX 0 0 4
[ 2] .ARM.attributes ARM_ATTRIBUTES 00000000 00005a 000014 00 0 0 1
[ 3] .shstrtab STRTAB 00000000 00006e 000031 00 0 0 1
[ 4] .symtab SYMTAB 00000000 000190 0000e0 10 5 6 4
[ 5] .strtab STRTAB 00000000 000270 000058 00 0 0 1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings)
I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
O (extra OS processing required) o (OS specific), p (processor specific)
There are no section groups in this file.
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
LOAD 0x000000 0x00008000 0x00008000 0x0005a 0x0005a R E 0x8000
Section to Segment mapping:
Segment Sections...
00 .text
There is no dynamic section in this file.
There are no relocations in this file.
There are no unwind sections in this file.
Symbol table '.symtab' contains 14 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 00000000 0 NOTYPE LOCAL DEFAULT UND
1: 00008054 0 SECTION LOCAL DEFAULT 1
2: 00000000 0 SECTION LOCAL DEFAULT 2
3: 00000000 0 FILE LOCAL DEFAULT ABS start.o
4: 00008054 0 NOTYPE LOCAL DEFAULT 1 $t
5: 00000000 0 FILE LOCAL DEFAULT ABS
6: 0001005a 0 NOTYPE GLOBAL DEFAULT 1 _bss_end__
7: 0001005a 0 NOTYPE GLOBAL DEFAULT 1 __bss_start__
8: 0001005a 0 NOTYPE GLOBAL DEFAULT 1 __bss_end__
9: 00008055 0 FUNC GLOBAL DEFAULT 1 _start
10: 0001005a 0 NOTYPE GLOBAL DEFAULT 1 __bss_start
11: 0001005c 0 NOTYPE GLOBAL DEFAULT 1 __end__
12: 0001005a 0 NOTYPE GLOBAL DEFAULT 1 _edata
13: 0001005c 0 NOTYPE GLOBAL DEFAULT 1 _end
No version information found in this file.
Attribute Section: aeabi
File Attributes
Tag_CPU_arch: v4T
Tag_THUMB_ISA_use: Thumb-1
here answer.
Thanks for all.
http://stuff.mit.edu/afs/sipb/project/egcs/src/egcs/gcc/config/arm/README-interworking
Calls via function pointers should use the BX instruction if the call is made in ARM mode:
.code 32
mov lr, pc
bx rX
This code sequence will not work in Thumb mode however, since the mov instruction will not set the bottom bit of the lr register. Instead a branch-and-link to the _call_via_rX functions should be used instead:
.code 16
bl _call_via_rX
where rX is replaced by the name of the register containing the function address.
How can we implement the system call using sysenter/syscall directly in x86 Linux? Can anybody provide help? It would be even better if you can also show the code for amd64 platform.
I know in x86, we can use
__asm__(
" movl $1, %eax \n"
" movl $0, %ebx \n"
" call *%gs:0x10 \n"
);
to route to sysenter indirectly.
But how can we code using sysenter/syscall directly to issue a system call?
I find some material http://damocles.blogbus.com/tag/sysenter/ . But still find it difficult to figure out.
First of all, you can't safely use GNU C Basic asm(""); syntax for this (without input/output/clobber constraints). You need Extended asm to tell the compiler about registers you modify. See the inline asm in the GNU C manual and the inline-assembly tag wiki for links to other guides for details on what things like "D"(1) means as part of an asm() statement.
You also need asm volatile because that's not implicit for Extended asm statements with 1 or more output operands.
I'm going to show you how to execute system calls by writing a program that writes Hello World! to standard output by using the write() system call. Here's the source of the program without an implementation of the actual system call :
#include <sys/types.h>
ssize_t my_write(int fd, const void *buf, size_t size);
int main(void)
{
const char hello[] = "Hello world!\n";
my_write(1, hello, sizeof(hello));
return 0;
}
You can see that I named my custom system call function as my_write in order to avoid name clashes with the "normal" write, provided by libc. The rest of this answer contains the source of my_write for i386 and amd64.
i386
System calls in i386 Linux are implemented using the 128th interrupt vector, e.g. by calling int 0x80 in your assembly code, having set the parameters accordingly beforehand, of course. It is possible to do the same via SYSENTER, but actually executing this instruction is achieved by the VDSO virtually mapped to each running process. Since SYSENTER was never meant as a direct replacement of the int 0x80 API, it's never directly executed by userland applications - instead, when an application needs to access some kernel code, it calls the virtually mapped routine in the VDSO (that's what the call *%gs:0x10 in your code is for), which contains all the code supporting the SYSENTER instruction. There's quite a lot of it because of how the instruction actually works.
If you want to read more about this, have a look at this link. It contains a fairly brief overview of the techniques applied in the kernel and the VDSO. See also The Definitive Guide to (x86) Linux System Calls - some system calls like getpid and clock_gettime are so simple the kernel can export code + data that runs in user-space so the VDSO never needs to enter the kernel, making it much faster even than sysenter could be.
It's much easier to use the slower int $0x80 to invoke the 32-bit ABI.
// i386 Linux
#include <asm/unistd.h> // compile with -m32 for 32 bit call numbers
//#define __NR_write 4
ssize_t my_write(int fd, const void *buf, size_t size)
{
ssize_t ret;
asm volatile
(
"int $0x80"
: "=a" (ret)
: "0"(__NR_write), "b"(fd), "c"(buf), "d"(size)
: "memory" // the kernel dereferences pointer args
);
return ret;
}
As you can see, using the int 0x80 API is relatively simple. The number of the syscall goes to the eax register, while all the parameters needed for the syscall go into respectively ebx, ecx, edx, esi, edi, and ebp. System call numbers can be obtained by reading the file /usr/include/asm/unistd_32.h.
Prototypes and descriptions of the functions are available in the 2nd section of the manual, so in this case write(2).
The kernel saves/restores all the registers (except EAX) so we can use them as input-only operands to the inline asm. See What are the calling conventions for UNIX & Linux system calls (and user-space functions) on i386 and x86-64
Keep in mind that the clobber list also contains the memory parameter, which means that the instruction listed in the instruction list references memory (via the buf parameter). (A pointer input to inline asm does not imply that the pointed-to memory is also an input. See How can I indicate that the memory *pointed* to by an inline ASM argument may be used?)
amd64
Things look different on the AMD64 architecture which sports a new instruction called SYSCALL. It is very different from the original SYSENTER instruction, and definitely much easier to use from userland applications - it really resembles a normal CALL, actually, and adapting the old int 0x80 to the new SYSCALL is pretty much trivial. (Except it uses RCX and R11 instead of the kernel stack to save the user-space RIP and RFLAGS so the kernel knows where to return).
In this case, the number of the system call is still passed in the register rax, but the registers used to hold the arguments now nearly match the function calling convention: rdi, rsi, rdx, r10, r8 and r9 in that order. (syscall itself destroys rcx so r10 is used instead of rcx, letting libc wrapper functions just use mov r10, rcx / syscall.)
// x86-64 Linux
#include <asm/unistd.h> // compile without -m32 for 64 bit call numbers
// #define __NR_write 1
ssize_t my_write(int fd, const void *buf, size_t size)
{
ssize_t ret;
asm volatile
(
"syscall"
: "=a" (ret)
// EDI RSI RDX
: "0"(__NR_write), "D"(fd), "S"(buf), "d"(size)
: "rcx", "r11", "memory"
);
return ret;
}
(See it compile on Godbolt)
Do notice how practically the only thing that needed changing were the register names, and the actual instruction used for making the call. This is mostly thanks to the input/output lists provided by gcc's extended inline assembly syntax, which automagically provides appropriate move instructions needed for executing the instruction list.
The "0"(callnum) matching constraint could be written as "a" because operand 0 (the "=a"(ret) output) only has one register to pick from; we know it will pick EAX. Use whichever you find more clear.
Note that non-Linux OSes, like MacOS, use different call numbers. And even different arg-passing conventions for 32-bit.
Explicit register variables
https://gcc.gnu.org/onlinedocs/gcc-8.2.0/gcc/Explicit-Register-Variables.html#Explicit-Reg-Vars)
I believe this should now generally be the recommended approach over register constraints because:
it can represent all registers, including r8, r9 and r10 which are used for system call arguments: How to specify register constraints on the Intel x86_64 register r8 to r15 in GCC inline assembly?
it's the only optimal option for other ISAs besides x86 like ARM, which don't have the magic register constraint names: How to specify an individual register as constraint in ARM GCC inline assembly? (besides using a temporary register + clobbers + and an extra mov instruction)
I'll argue that this syntax is more readable than using the single letter mnemonics such as S -> rsi
Register variables are used for example in glibc 2.29, see: sysdeps/unix/sysv/linux/x86_64/sysdep.h.
main_reg.c
#define _XOPEN_SOURCE 700
#include <inttypes.h>
#include <sys/types.h>
ssize_t my_write(int fd, const void *buf, size_t size) {
register int64_t rax __asm__ ("rax") = 1;
register int rdi __asm__ ("rdi") = fd;
register const void *rsi __asm__ ("rsi") = buf;
register size_t rdx __asm__ ("rdx") = size;
__asm__ __volatile__ (
"syscall"
: "+r" (rax)
: "r" (rdi), "r" (rsi), "r" (rdx)
: "rcx", "r11", "memory"
);
return rax;
}
void my_exit(int exit_status) {
register int64_t rax __asm__ ("rax") = 60;
register int rdi __asm__ ("rdi") = exit_status;
__asm__ __volatile__ (
"syscall"
: "+r" (rax)
: "r" (rdi)
: "rcx", "r11", "memory"
);
}
void _start(void) {
char msg[] = "hello world\n";
my_exit(my_write(1, msg, sizeof(msg)) != sizeof(msg));
}
GitHub upstream.
Compile and run:
gcc -O3 -std=c99 -ggdb3 -ffreestanding -nostdlib -Wall -Werror \
-pedantic -o main_reg.out main_reg.c
./main.out
echo $?
Output
hello world
0
For comparison, the following analogous to How to invoke a system call via syscall or sysenter in inline assembly? produces equivalent assembly:
main_constraint.c
#define _XOPEN_SOURCE 700
#include <inttypes.h>
#include <sys/types.h>
ssize_t my_write(int fd, const void *buf, size_t size) {
ssize_t ret;
__asm__ __volatile__ (
"syscall"
: "=a" (ret)
: "0" (1), "D" (fd), "S" (buf), "d" (size)
: "rcx", "r11", "memory"
);
return ret;
}
void my_exit(int exit_status) {
ssize_t ret;
__asm__ __volatile__ (
"syscall"
: "=a" (ret)
: "0" (60), "D" (exit_status)
: "rcx", "r11", "memory"
);
}
void _start(void) {
char msg[] = "hello world\n";
my_exit(my_write(1, msg, sizeof(msg)) != sizeof(msg));
}
GitHub upstream.
Disassembly of both with:
objdump -d main_reg.out
is almost identical, here is the main_reg.c one:
Disassembly of section .text:
0000000000001000 <my_write>:
1000: b8 01 00 00 00 mov $0x1,%eax
1005: 0f 05 syscall
1007: c3 retq
1008: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1)
100f: 00
0000000000001010 <my_exit>:
1010: b8 3c 00 00 00 mov $0x3c,%eax
1015: 0f 05 syscall
1017: c3 retq
1018: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1)
101f: 00
0000000000001020 <_start>:
1020: c6 44 24 ff 00 movb $0x0,-0x1(%rsp)
1025: bf 01 00 00 00 mov $0x1,%edi
102a: 48 8d 74 24 f3 lea -0xd(%rsp),%rsi
102f: 48 b8 68 65 6c 6c 6f movabs $0x6f77206f6c6c6568,%rax
1036: 20 77 6f
1039: 48 89 44 24 f3 mov %rax,-0xd(%rsp)
103e: ba 0d 00 00 00 mov $0xd,%edx
1043: b8 01 00 00 00 mov $0x1,%eax
1048: c7 44 24 fb 72 6c 64 movl $0xa646c72,-0x5(%rsp)
104f: 0a
1050: 0f 05 syscall
1052: 31 ff xor %edi,%edi
1054: 48 83 f8 0d cmp $0xd,%rax
1058: b8 3c 00 00 00 mov $0x3c,%eax
105d: 40 0f 95 c7 setne %dil
1061: 0f 05 syscall
1063: c3 retq
So we see that GCC inlined those tiny syscall functions as would be desired.
my_write and my_exit are the same for both, but _start in main_constraint.c is slightly different:
0000000000001020 <_start>:
1020: c6 44 24 ff 00 movb $0x0,-0x1(%rsp)
1025: 48 8d 74 24 f3 lea -0xd(%rsp),%rsi
102a: ba 0d 00 00 00 mov $0xd,%edx
102f: 48 b8 68 65 6c 6c 6f movabs $0x6f77206f6c6c6568,%rax
1036: 20 77 6f
1039: 48 89 44 24 f3 mov %rax,-0xd(%rsp)
103e: b8 01 00 00 00 mov $0x1,%eax
1043: c7 44 24 fb 72 6c 64 movl $0xa646c72,-0x5(%rsp)
104a: 0a
104b: 89 c7 mov %eax,%edi
104d: 0f 05 syscall
104f: 31 ff xor %edi,%edi
1051: 48 83 f8 0d cmp $0xd,%rax
1055: b8 3c 00 00 00 mov $0x3c,%eax
105a: 40 0f 95 c7 setne %dil
105e: 0f 05 syscall
1060: c3 retq
It is interesting to observe that in this case GCC found a slightly shorter equivalent encoding by picking:
104b: 89 c7 mov %eax,%edi
to set the fd to 1, which equals the 1 from the syscall number, rather than a more direct:
1025: bf 01 00 00 00 mov $0x1,%edi
For an in-depth discussion of the calling conventions, see also: What are the calling conventions for UNIX & Linux system calls (and user-space functions) on i386 and x86-64
Tested in Ubuntu 18.10, GCC 8.2.0.