Segmentation fault ASM on linux at printf - linux

The following is a program from a book (Introduction to 64 Bit Intel Assembly Language Programming for Linux, by Seyfarth, 2012), chap 9. The fault (in gdb) is:
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7aa10a5 in __printf_size (fp=0x400400, info=0x0,
args=) at printf_size.c:199
199 printf_size.c: No such file or directory.
Until this chapter, I successfully used the following to "produce an object file", as recommended,
yasm -f elf64 -g dwarf2 -l exit.lst exit.asm
and then,
ld -o prgm prgm.o
This is the program as copied from the book(l 10 push rbp; I had firstly rem'd the ; but had the same result):
segment .text
global main
extern printf
; void print_max ( long a, long b )
; {
a equ 0
b equ 8
print_max:
push rbp; ;normal stack frame
mov rbp, rsp
; leave space for a, b and max
sub rsp, 32
; int max;
max equ 16
mov [rsp+a], rdi ; save a
mov [rsp+b], rsi ; save b
; max = a;
mov [rsp+max], rdi
; if ( b > max ) max = b;
cmp rsi, rdi
jng skip
mov [rsp+max], rsi
skip:
; printf ( "max(%1d,%1d ) = %1d\n",
; a, b, max );
segment .data
fmt db 'max(%1d,%1d) = %1d',0xa,0
segment .text
lea rdi, [fmt]
mov rsi, [rsp+a]
mov rdx, [rsp+b]
mov rcx, [rsp+max]
call printf
; }
leave
ret
main:
push rbp
mov rbp, rsp
; print_max ( 100, 200 );
mov rdi, 100 ;first parameter
mov rsi, 200 ;second parameter
call print_max
xor eax, eax ;to return 0
leave
ret
After a similar segmentation fault with a previous program in this chap ("Hello World" example), I used
gcc -o prgm prgm.o
which had worked until this program.

using gcc to link is the easiest way to go if you are going to use functions from the C Library, since gcc takes care of a few things for you "behind the scenes".
To use just ld, you need to link against ld-linux-x86-64.so.2 and pass it -lc to link to the C Library.
Next, you are using printf wrong. If you are not using floating point registers (which you are not) you need to "zero out" rax.
Also, since you are linking against the C Library, you cannot just ret from the main but call exit.
lea rdi, [fmt]
mov rsi, [rsp+a]
mov rdx, [rsp+b]
mov rcx, [rsp+max]
xor rax, rax ; # of floating point registers used.
call printf
and:
; print_max ( 100, 200 );
mov rdi, 100 ;first parameter
mov rsi, 200 ;second parameter
call print_max
xor eax, eax ;to return 0
leave
xor rdi, rdi
call exit
ld -o $(APP) $(APP).o -lc -I/lib64/ld-linux-x86-64.so.2
and the output:
max(100,200) = 200

Gunner gave an excellent summary. The program should have placed a 0 in rax. This can be done using "xor eax, eax" which is the normal way to zero out a register in x86-64 mode. The top half of the register is zeroed out with xor with a 32 bit register and the lower half depends on the the bits of the 2 registers used (with eax, eax the result is 0).

Related

msvc x64: how to omit extra size check before __chkstk() for _alloca(size_t)

When call _alloca(size) with a runtime-known size, msvc x64 v19.* will call __chkstk(), but emits extra code that checks if size+15 is overflow, if that occurs it make size=0x0ffffffffffffff0, see: godbolt.org/z/YT4xE8s4q
extern void * _alloca(size_t); //x64 msvc v19.*
int f()
{
size_t n = 3 & (size_t)f;
void * p = _alloca(n);
return 3 & (int)(size_t)p;
}
compiled by x64 msvc v19.latest with option -O2 -GS-:
f PROC ; COMDAT
$LN5:
push rbp
sub rsp, 32 ; 00000020H
lea rbp, QWORD PTR [rsp+32]
lea rax, OFFSET FLAT:f
and eax, 3 ; eax = n
lea rcx, QWORD PTR [rax+15] ; rcx = n+15 for 16-byte align
cmp rcx, rax ; checks overflow
ja SHORT $LN3#f ; normally n+15 is above n
mov rcx, 1152921504606846960 ; 0ffffffffffffff0H
$LN3#f:
and rcx, -16 ; align the size
mov rax, rcx ; rax = argument for __chkstk
call __chkstk ; probe stack pages in sequence
sub rsp, rcx ; do allocation after probe
lea rax, QWORD PTR [rsp+32]
and eax, 3
mov rsp, rbp
pop rbp
ret 0
f ENDP
Such "overflow check" is needless for normal program, clang and gcc do not emit such checks.
It is unexpected that for every _alloca it inserts garbage instructions (cmp + ja + mov = 15 bytes).
I tried __assume(n+15>n) and __assume(n<0xFFFFu), but does not help and seems ignored.
I guess msvc backend (c2.dll) use some hardcoded "snippet" to handle _alloca().
So the question is, is there an option, documented or undocumented, to disable the "overflow check"?
Or, is there some global flag that could "control" the compiler backend's "snippet"?

dlsym crash when called from assembler

I have a small program in assembler that loads an .so file using dlopen, and then tries to load a function pointer using dlsym. Calling dlopen seems to be fine but it crashes when I call dlsym.
SECTION .text
;default rel
EXTERN dlopen ; loads a dynamic library
EXTERN dlsym ; retrieves the address for a symbol in the dynamic library
; inputs:
; rdi: rdi the pointer to print
printHex:
sub rsp, 19 ; allocate space for the string 0x0123456789ABCDEF\n
mov BYTE [rsp + 0], '0'
mov BYTE [rsp + 1], 'x'
xor rcx, rcx ; int loop variable to 0
.LOOP1:
lea rsi, [rsp + rcx] ; rsi will we the offset where we will store the next hex charcter
mov rax, rdi
and rax, 0xf
sar rdi, 4 ; shift right 4 bits (divide by 16)
lea rdx, [hexLookUp + rax]
mov bl, [rdx]
mov BYTE [rsi +18], bl
dec rcx ; rcx--
cmp rcx, -16 ; while rcx > -16
jne .LOOP1
mov BYTE [rsp + 18], 10
; print
mov rax, 1 ; syscall: write
mov rdi, 1 ; stdout
mov rsi, rsp
mov rdx, 19
syscall
; release stack memory
add rsp, 19
ret
global _start ; "global" means that the symbol can be accessed in other modules. In order to refer to a global symbol from another module, you must use the "extern" keyboard
_start:
; load the library
mov rdi, str_libX11so
mov rsi, 2; RTLD_NOW=2
call dlopen wrt ..plt
; PLT stands for Procedure Linkage Table:
; used to call external library functions whose address is not know at link time,
; so it must be resolved by the dynamic linker at run time
; more info: https://reverseengineering.stackexchange.com/questions/1992/what-is-plt-got
mov [ptr_libX11so], rax ; the previous function call returned the value in rax
mov rdi, rax
call printHex
; load the function
mov rdi, [str_libX11so]
mov rsi, fstr_XOpenDisplay
call dlsym wrt ..plt
mov [fptr_XOpenDisplay], rax
mov rdi, rax
call printHex
mov rax, 60 ; syscal: exit
mov rdi, 0 ; return code
syscall
hexLookUp: db "0123456789ABCDEF"
str_libX11so: db "libX11.so", 0
; X11 function names
fstr_XOpenDisplay: db "XOpenDisplay", 0
SECTION .data
ptr_libX11so: dq 0 ; ptr to the X11 library
; X11 function ptrs
fptr_XOpenDisplay: dq 0
I have tried to make the same program in C and it seems to work. So I must be doing something wrong.
extern void* dlopen(const char* name, int);
extern void* dlsym(void* restrict handle, const char* restrict name);
int main()
{
void* libX11so = dlopen("libX11.so", 2);
void (*XOpenDisplay)() = dlsym(libX11so, "XOpenDisplay");
}
I tried to disassemble the C version and compare, but I can't still figure out what is the problem.
An interesting thing I noticed is that the pointer returned by dlopen (which is different in each execution), in the asm version is quite small compared to the C version (e.g 0x0000000001A932D vs 0x5555555592d0). But maybe that could be because I'm using the -no-pie flag for linking:
nasm -f elf64 -g -F dwarf minimal.asm && gcc -nostartfiles -no-pie minimal.o -ldl -o minimal && ./minimal
I just noticed my mistake:
; load the function
mov rdi, [str_libX11so]
should be:
; load the function
mov rdi, [ptr_libX11so]

Print ARGC in NASM without printf

Any good NASM/Intel Assembly programmers out there? If so, I have a question for you!
Every tutorial I can find online, shows the usage of "printf" for printing the actual value of ARGC to the screen (fd:/dev/stdout). Is it not possible to simply print it with sys_write() for example:
SEGMENT .data ; nothing here
SEGMENT .text ; sauce
global _start
_start:
pop ECX ; get ARGC value
mov EAX, 4 ; sys_write()
mov EBX, 1 ; /dev/stdout
mov EDX, 1 ; a single byte
int 0x80
mov EAX, 1 ; sys_exit()
mov EBX, 0 ; return 0
int 0x80
SEGMENT .bss ; nothing here
When I run this, I get no output at all. I have tried copying ESP into EBP and tried using byte[EBP+4], (i was told the brackets de-reference the memory address).
I can confirm that the value when compared to a constant, works. For instance,
this code works:
pop ebp ; put the first argument on the stack
mov ebp, esp ; make a copy
cmp byte[ebp+4],0x5 ; does it equal 5?
je _good ; goto _good, &good, good()
jne _bad ; goto _bad, &bad, bad()
When we "pop" the stack, we technically should get the full number of arguments, no? Oh, btw, I compile with:
nasm -f elf test.asm -o test.o
ld -o test test.o
not sure if that is relevant. Let me know if i need to provide more information, or format my code for readability.
At least 2 problems.
You need to pass a pointer to the thing you want to print.
You probably want to convert to text.
Something like this should work:
SEGMENT .text ; sauce
global _start
_start:
mov ecx, esp ; pointer to ARGC on stack
add byte [esp], '0' ; convert to text assuming single digit
mov EAX, 4 ; sys_write()
mov EBX, 1 ; /dev/stdout
mov EDX, 1 ; a single byte
int 0x80
mov EAX, 1 ; sys_exit()
mov EBX, 0 ; return 0
int 0x80
Everyone's comments where very helpful! I am honored that you all pitched in and helped! I have used #Jester's code,
SEGMENT .text ; sauce
global _start
_start:
mov ecx, esp ; pointer to ARGC on stack
add byte [esp], '0' ; convert to text assuming single digit
mov EAX, 4 ; sys_write()
mov EBX, 1 ; /dev/stdout
mov EDX, 1 ; a single byte
int 0x80
mov EAX, 1 ; sys_exit()
mov EBX, 0 ; return 0
int 0x80
Which works perfectly when compiled, linked and loaded. The sys_write() function requires a pointer, such like in the common "Hello World" example, the symbol "msg" is a pointer as seen in the code below.
SECTION .data ; initialized data
msg: db "Hello World!",0xa
SECTION .text ; workflow
global _start
_start:
mov EAX, 4
mov EBX, 1
mov ECX, msg ; a pointer!
So first, we move the stack pointer into the counter register, ECX, with the code,
mov ecx, esp ; ecx now contains a pointer!
and then convert it to a string by adding a '0' char to the value pointed to by ESP (which is ARGC), by de-referencing it with square brackets, as [ESP] like so,
add byte[esp], '0' ; update the value stored at "esp"
Again, thank you all for the great help! <3

How to print a number in assembly NASM?

Suppose that I have an integer number in a register, how can I print it? Can you show a simple example code?
I already know how to print a string such as "hello, world".
I'm developing on Linux.
If you're already on Linux, there's no need to do the conversion yourself. Just use printf instead:
;
; assemble and link with:
; nasm -f elf printf-test.asm && gcc -m32 -o printf-test printf-test.o
;
section .text
global main
extern printf
main:
mov eax, 0xDEADBEEF
push eax
push message
call printf
add esp, 8
ret
message db "Register = %08X", 10, 0
Note that printf uses the cdecl calling convention so we need to restore the stack pointer afterwards, i.e. add 4 bytes per parameter passed to the function.
You have to convert it in a string; if you're talking about hex numbers it's pretty easy. Any number can be represented this way:
0xa31f = 0xf * 16^0 + 0x1 * 16^1 + 3 * 16^2 + 0xa * 16^3
So when you have this number you have to split it like I've shown then convert every "section" to its ASCII equivalent.
Getting the four parts is easily done with some bit magic, in particular with a right shift to move the part we're interested in in the first four bits then AND the result with 0xf to isolate it from the rest. Here's what I mean (soppose we want to take the 3):
0xa31f -> shift right by 8 = 0x00a3 -> AND with 0xf = 0x0003
Now that we have a single number we have to convert it into its ASCII value. If the number is smaller or equal than 9 we can just add 0's ASCII value (0x30), if it's greater than 9 we have to use a's ASCII value (0x61).
Here it is, now we just have to code it:
mov si, ??? ; si points to the target buffer
mov ax, 0a31fh ; ax contains the number we want to convert
mov bx, ax ; store a copy in bx
xor dx, dx ; dx will contain the result
mov cx, 3 ; cx's our counter
convert_loop:
mov ax, bx ; load the number into ax
and ax, 0fh ; we want the first 4 bits
cmp ax, 9h ; check what we should add
ja greater_than_9
add ax, 30h ; 0x30 ('0')
jmp converted
greater_than_9:
add ax, 61h ; or 0x61 ('a')
converted:
xchg al, ah ; put a null terminator after it
mov [si], ax ; (will be overwritten unless this
inc si ; is the last one)
shr bx, 4 ; get the next part
dec cx ; one less to do
jnz convert_loop
sub di, 4 ; di still points to the target buffer
PS: I know this is 16 bit code but I still use the old TASM :P
PPS: this is Intel syntax, converting to AT&T syntax isn't difficult though, look here.
Linux x86-64 with printf
main.asm
default rel ; make [rel format] the default, you always want this.
extern printf, exit ; NASM requires declarations of external symbols, unlike GAS
section .rodata
format db "%#x", 10, 0 ; C 0-terminated string: "%#x\n"
section .text
global main
main:
sub rsp, 8 ; re-align the stack to 16 before calling another function
; Call printf.
mov esi, 0x12345678 ; "%x" takes a 32-bit unsigned int
lea rdi, [rel format]
xor eax, eax ; AL=0 no FP args in XMM regs
call printf
; Return from main.
xor eax, eax
add rsp, 8
ret
GitHub upstream.
Then:
nasm -f elf64 -o main.o main.asm
gcc -no-pie -o main.out main.o
./main.out
Output:
0x12345678
Notes:
sub rsp, 8: How to write assembly language hello world program for 64 bit Mac OS X using printf?
xor eax, eax: Why is %eax zeroed before a call to printf?
-no-pie: plain call printf doesn't work in a PIE executable (-pie), the linker only automatically generates a PLT stub for old-style executables. Your options are:
call printf wrt ..plt to call through the PLT like traditional call printf
call [rel printf wrt ..got] to not use a PLT at all, like gcc -fno-plt.
Like GAS syntax call *printf#GOTPCREL(%rip).
Either of these are fine in a non-PIE executable as well, and don't cause any inefficiency unless you're statically linking libc. In which case call printf can resolve to a call rel32 directly to libc, because the offset from your code to the libc function would be known at static linking time.
See also: Can't call C standard library function on 64-bit Linux from assembly (yasm) code
If you want hex without the C library: Printing Hexadecimal Digits with Assembly
Tested on Ubuntu 18.10, NASM 2.13.03.
It depends on the architecture/environment you are using.
For instance, if I want to display a number on linux, the ASM code will be different from the one I would use on windows.
Edit:
You can refer to THIS for an example of conversion.
I'm relatively new to assembly, and this obviously is not the best solution,
but it's working. The main function is _iprint, it first checks whether the
number in eax is negative, and prints a minus sign if so, than it proceeds
by printing the individual numbers by calling the function _dprint for
every digit. The idea is the following, if we have 512 than it is equal to: 512 = (5 * 10 + 1) * 10 + 2 = Q * 10 + R, so we can found the last digit of a number by dividing it by 10, and
getting the reminder R, but if we do it in a loop than digits will be in a
reverse order, so we use the stack for pushing them, and after that when
writing them to stdout they are popped out in right order.
; Build : nasm -f elf -o baz.o baz.asm
; ld -m elf_i386 -o baz baz.o
section .bss
c: resb 1 ; character buffer
section .data
section .text
; writes an ascii character from eax to stdout
_cprint:
pushad ; push registers
mov [c], eax ; store ascii value at c
mov eax, 0x04 ; sys_write
mov ebx, 1 ; stdout
mov ecx, c ; copy c to ecx
mov edx, 1 ; one character
int 0x80 ; syscall
popad ; pop registers
ret ; bye
; writes a digit stored in eax to stdout
_dprint:
pushad ; push registers
add eax, '0' ; get digit's ascii code
mov [c], eax ; store it at c
mov eax, 0x04 ; sys_write
mov ebx, 1 ; stdout
mov ecx, c ; pass the address of c to ecx
mov edx, 1 ; one character
int 0x80 ; syscall
popad ; pop registers
ret ; bye
; now lets try to write a function which will write an integer
; number stored in eax in decimal at stdout
_iprint:
pushad ; push registers
cmp eax, 0 ; check if eax is negative
jge Pos ; if not proceed in the usual manner
push eax ; store eax
mov eax, '-' ; print minus sign
call _cprint ; call character printing function
pop eax ; restore eax
neg eax ; make eax positive
Pos:
mov ebx, 10 ; base
mov ecx, 1 ; number of digits counter
Cycle1:
mov edx, 0 ; set edx to zero before dividing otherwise the
; program gives an error: SIGFPE arithmetic exception
div ebx ; divide eax with ebx now eax holds the
; quotent and edx the reminder
push edx ; digits we have to write are in reverse order
cmp eax, 0 ; exit loop condition
jz EndLoop1 ; we are done
inc ecx ; increment number of digits counter
jmp Cycle1 ; loop back
EndLoop1:
; write the integer digits by poping them out from the stack
Cycle2:
pop eax ; pop up the digits we have stored
call _dprint ; and print them to stdout
dec ecx ; decrement number of digits counter
jz EndLoop2 ; if it's zero we are done
jmp Cycle2 ; loop back
EndLoop2:
popad ; pop registers
ret ; bye
global _start
_start:
nop ; gdb break point
mov eax, -345 ;
call _iprint ;
mov eax, 0x01 ; sys_exit
mov ebx, 0 ; error code
int 0x80 ; край
Because you didn't say about number representation I wrote the following code for unsigned number with any base(of course not too big), so you could use it:
BITS 32
global _start
section .text
_start:
mov eax, 762002099 ; unsigned number to print
mov ebx, 36 ; base to represent the number, do not set it too big
call print
;exit
mov eax, 1
xor ebx, ebx
int 0x80
print:
mov ecx, esp
sub esp, 36 ; reserve space for the number string, for base-2 it takes 33 bytes with new line, aligned by 4 bytes it takes 36 bytes.
mov edi, 1
dec ecx
mov [ecx], byte 10
print_loop:
xor edx, edx
div ebx
cmp dl, 9 ; if reminder>9 go to use_letter
jg use_letter
add dl, '0'
jmp after_use_letter
use_letter:
add dl, 'W' ; letters from 'a' to ... in ascii code
after_use_letter:
dec ecx
inc edi
mov [ecx],dl
test eax, eax
jnz print_loop
; system call to print, ecx is a pointer on the string
mov eax, 4 ; system call number (sys_write)
mov ebx, 1 ; file descriptor (stdout)
mov edx, edi ; length of the string
int 0x80
add esp, 36 ; release space for the number string
ret
It's not optimised for numbers with base of power of two and doesn't use printf from libc.
The function print outputs the number with a new line. The number string is formed on stack. Compile by nasm.
Output:
clockz
https://github.com/tigertv/stackoverflow-answers/tree/master/8194141-how-to-print-a-number-in-assembly-nasm

x64 bit assembly

I started assembly (nasm) programming not too long ago. Now I made a C function with assembly implementation which prints an integer. I got it working using the extended registers, but when I want to write it with the x64 registers (rax, rbx, ..) my implementation fails. Does any of you see what I missed?
main.c:
#include <stdio.h>
extern void printnum(int i);
int main(void)
{
printnum(8);
printnum(256);
return 0;
}
32 bit version:
; main.c: http://pastebin.com/f6wEvwTq
; nasm -f elf32 -o printnum.o printnum.asm
; gcc -o printnum printnum.o main.c -m32
section .data
_nl db 0x0A
nlLen equ $ - _nl
section .text
global printnum
printnum:
enter 0,0
mov eax, [ebp+8]
xor ebx, ebx
xor ecx, ecx
xor edx, edx
push ebx
mov ebx, 10
startLoop:
idiv ebx
add edx, 0x30
push dx ; With an odd number of digits this will screw up the stack, but that's ok
; because we'll reset the stack at the end of this function anyway.
; Needs fixing though.
inc ecx
xor edx, edx
cmp eax, 0
jne startLoop
push ecx
imul ecx, 2
mov edx, ecx
mov eax, 4 ; Prints the string (from stack) to screen
mov ebx, 1
mov ecx, esp
add ecx, 4
int 80h
mov eax, 4 ; Prints a new line
mov ebx, 1
mov ecx, _nl
mov edx, nlLen
int 80h
pop eax ; returns the ammount of used characters
leave
ret
x64 version:
; main.c : http://pastebin.com/f6wEvwTq
; nasm -f elf64 -o object/printnum.o printnum.asm
; gcc -o bin/printnum object/printnum.o main.c -m64
section .data
_nl db 0x0A
nlLen equ $ - _nl
section .text
global printnum
printnum:
enter 0, 0
mov rax, [rbp + 8] ; Get the function args from the stac
xor rbx, rbx
xor rcx, rcx
xor rdx, rdx
push rbx ; The 0 byte of the string
mov rbx, 10 ; Dividor
startLoop:
idiv rbx ; modulo is in rdx
add rdx, 0x30
push dx
inc rcx ; increase the loop variable
xor rdx, rdx ; resetting the modulo
cmp rax, 0
jne startLoop
push rcx ; push the counter on the stack
imul rcx, 2
mov rdx, rcx ; string length
mov rax, 4
mov rbx, 1
mov rcx, rsp ; the string
add rcx, 4
int 0x80
mov rax, 4
mov rbx, 1
mov rcx, _nl
mov rdx, nlLen
int 0x80
pop rax
leave
ret ; return to the C routine
Thanks in advance!
I think your problem is that you're trying to use the 32-bit calling conventions in 64-bit mode. That won't fly, not if you're calling these assembly routines from C. The 64-bit calling convention is documented here: http://www.x86-64.org/documentation/abi.pdf
Also, don't open-code system calls. Call the wrappers in the C library. That way errno gets set properly, you take advantage of sysenter/syscall, you don't have to deal with the differences between the normal calling convention and the system-call argument convention, and you're insulated from certain low-level ABI issues. (Another of your problems is that write is system call number 1, not 4, for Linux/x86-64.)
Editorial aside: There are two, and only two, reasons to write anything in assembly nowadays:
You are writing one of the very few remaining bits of deep magic that cannot be written in C alone (a good example is the guts of libffi)
You are hand-optimizing an inner-loop subroutine that has been measured to be performance-critical and the C compiler doesn't do a good enough job on.
Otherwise just write whatever it is in C. Your successors will thank you.
EDIT: checked system call numbers.
I'm not sure if this answer is related to the problem you're seeing (since you didn't specify anything about what the failure is), but 64-bit code has a different calling convention than 32-bit code does. Both of the major 64-bit Intel ABIs (Windows & Linux/BSD/Mac OS) pass function parameters in registers and not on the stack. Your program appears to still be expecting them on the stack, which isn't the normal way to go about it.
Edit: Now that I see there is a C main() routine that calls your functions, my answer is exactly about the problem you're having.

Resources