I am new to asm. I am trying to copy a pointer from a register to a .data variable using NASM on 64-bit Linux.
Consider this program:
section .data
ptr: dq 0
section .text
global _start
_start:
mov [ptr], rsp
mov rax, 60
mov rdi, 0
syscall
Here I try to copy the current stack pointer to ptr, which is declared as a quadword. Neither nasm nor the linker complains, but when debugging the program with gdb, I can see that the two addresses are different:
gdb ./test.s
+(gdb) break _start
Breakpoint 1 at 0x4000b0
+(gdb) run
Starting program: test
Breakpoint 1, 0x00000000004000b0 in _start ()
+(gdb) nexti
0x00000000004000b8 in _start ()
+(gdb) info registers
...
rsp 0x7fffffffe460 0x7fffffffe460
...
+(gdb) x ptr
0xffffffffffffe460: Cannot access memory at address 0xffffffffffffe460
From what I understand, mov should copy all 64 bits from rsp to [ptr], but it seems that the most significant 0s are not copied and/or that there is some kind of sign extension, as if only the least significant bits were copied.
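The problem is that there is no debug info describing the type of ptr, so gdb treats it as an int. With x ptr, gdb therefore reads only the low 32 bits of the stored value (0xffffe460), sign-extends them to 0xffffffffffffe460, and then tries to examine memory at that address. You can examine the real contents using: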
(gdb) x/a &ptr
0x600124 <ptr>: 0x7fffffffe950
(gdb) p/a $rsp
$3 = 0x7fffffffe950
Of course I have a different value for rsp than you, but you can see that ptr and rsp match.
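The address gdb complained about can be reproduced by hand. The following is a minimal sketch, assuming gdb read ptr as a 4-byte signed int; the constant is a stand-in for the rsp value from the question's session, and the exit-status check is only there for illustration:

```nasm
default rel

section .data
ptr: dq 0

section .text
global _start
_start:
    mov    rax, 0x7fffffffe460   ; stand-in for the rsp value seen in the question
    mov    [ptr], rax            ; full 64-bit store, like mov [ptr], rsp
    movsxd rdx, dword [ptr]      ; read only the low 4 bytes as a signed int
                                 ; rdx = 0xffffffffffffe460
    xor    edi, edi              ; exit status 0 if the values match
    cmp    rdx, -0x1ba0          ; 0xffffffffffffe460 written as a signed value
    je     .exit
    mov    edi, 1                ; exit status 1 otherwise
.exit:
    mov    eax, 60               ; exit system call
    syscall
```

All eight bytes are stored correctly; only the 4-byte signed read produces the bogus address.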
It looks to me like you're using gdb incorrectly:
section .data
ptr: dq 0
section .text
global main
main:
mov [ptr], rsp
ret
Compiling with:
rm -f test.o && nasm -f elf64 test.asm && gcc -m64 -o test test.o
Then my debugging session looks like this:
gdb ./test
(...)
(gdb) break main
Breakpoint 1 at 0x4004c0
(gdb) run
Starting program: /home/rr-/test
Breakpoint 1, 0x00000000004004c0 in main ()
(gdb) nexti
0x00000000004004c8 in main ()
(gdb) info registers
rax 0x4004c0 4195520
rbx 0x0 0
rcx 0x0 0
rdx 0x7fffffffe388 140737488348040
rsi 0x7fffffffe378 140737488348024
rdi 0x1 1
rbp 0x4004d0 0x4004d0 <__libc_csu_init>
rsp 0x7fffffffe298 0x7fffffffe298
(...)
(gdb) info addr ptr
Symbol "ptr" is at 0x600880 in a file compiled without debugging.
(gdb) x/g 0x600880
0x600880: 140737488347800
140737488347800 evaluates to 0x7FFFFFFFE298 just fine.
+(gdb) x/h ptr
h means halfword, which is two bytes. What you probably want is g (giant word in GDB terminology, which is eight bytes).
Related
I am trying to learn nasm. I want to make a program that prints "Hello, world." n times (in this case 10). I am trying to save the loop register value in a constant so that it is not changed when the body of the loop is executed. When I try to do this I receive a segmentation fault error. I am not sure why this is happening.
My code:
SECTION .DATA
print_str: db 'Hello, world.', 10
print_str_len: equ $-print_str
limit: equ 10
step: dw 1
SECTION .TEXT
GLOBAL _start
_start:
mov eax, 4 ; 'write' system call = 4
mov ebx, 1 ; file descriptor 1 = STDOUT
mov ecx, print_str ; string to write
mov edx, print_str_len ; length of string to write
int 80h ; call the kernel
mov eax, [step] ; moves the step value to eax
inc eax ; Increment
mov [step], eax ; moves the eax value to step
cmp eax, limit ; Compare eax to the limit
jle _start ; Loop while less or equal
exit:
mov eax, 1 ; 'exit' system call
mov ebx, 0 ; exit with error code 0
int 80h ; call the kernel
The result:
Hello, world.
Segmentation fault (core dumped)
The cmd:
nasm -f elf64 file.asm -o file.o
ld file.o -o file
./file
section .DATA is the direct cause of the crash. Lower-case section .data is special, and linked as a read-write (private) mapping of the executable. Section names are case-sensitive.
Upper-case .DATA is not special for nasm or the linker, and it ends up as part of the text segment mapped read+exec without write permission.
Upper-case .TEXT is also weird: by default objdump -drwC -Mintel only disassembles the .text section (to avoid disassembling data as if it were code), so it shows empty output for your executable.
On newer systems, the default flags for a section name NASM doesn't recognize don't include exec permission, so code in .TEXT will segfault. See the related question "Assembly section .code and .text behave differently".
After starting the program under GDB (gdb ./foo, starti), I looked at the process's memory map from another shell.
$ cat /proc/11343/maps
00400000-00401000 r-xp 00000000 00:31 110651257 /tmp/foo
7ffff7ffa000-7ffff7ffd000 r--p 00000000 00:00 0 [vvar]
7ffff7ffd000-7ffff7fff000 r-xp 00000000 00:00 0 [vdso]
7ffffffde000-7ffffffff000 rwxp 00000000 00:00 0 [stack]
As you can see, other than the special VDSO mappings and the stack, there's only the one file-backed mapping, and it has read+exec permission only.
Single-stepping inside GDB, the mov eax,DWORD PTR ds:0x400086 load succeeds, but the mov DWORD PTR ds:0x400086,eax store faults. (See the bottom of the x86 tag wiki for GDB asm tips.)
From readelf -a foo, we can see the ELF program headers that tell the OS's program loader how to map it into memory:
$ readelf -a foo # broken version
...
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000
0x00000000000000bf 0x00000000000000bf R 0x200000
Section to Segment mapping:
Segment Sections...
00 .DATA .TEXT
Notice how both .DATA and .TEXT are in the same segment. This is what you'd want for section .rodata (a standard section name where you should put read-only constant data like your string), but it won't work for mutable global variables.
After fixing your asm to use section .data and .text, readelf shows us:
$ readelf -a foo # fixed version
...
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000
0x00000000000000e7 0x00000000000000e7 R E 0x200000
LOAD 0x00000000000000e8 0x00000000006000e8 0x00000000006000e8
0x0000000000000010 0x0000000000000010 RW 0x200000
Section to Segment mapping:
Segment Sections...
00 .text
01 .data
Notice how segment 00 is R + E without W, and the .text section is in there. Segment 01 is RW (read + write) without exec, and the .data section is there.
The LOAD tag means they're mapped into the process's virtual address space. Some sections (like debug info) aren't, and are just metadata for other tools. But NASM flags unknown section names as progbits, i.e. loaded, which is why it was able to link and have the load not segfault.
After fixing it to use section .data, your program runs without segfaulting.
The loop runs for one iteration, because the 2 bytes following step: dw 1 are not zero. After the dword load, RAX = 0x2c0001 on my system. (cmp between 0x002c0002 and 0xa makes the LE condition false because it's not less or equal.)
dw means "data word" or "define word". Use dd for a data dword.
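To make the size mismatch concrete, here is a small sketch. The labels step_w and step_d and the 0x2c neighbour word are hypothetical stand-ins (the neighbour mimics whatever bytes happen to follow the variable); the program exits with status 0 only if the three loads give the values noted in the comments:

```nasm
default rel

section .data
step_w: dw 1                    ; only 2 bytes belong to this variable
        dw 0x2c                 ; hypothetical neighbour bytes
step_d: dd 1                    ; 4 bytes, matches a 32-bit register

section .text
global _start
_start:
    mov   eax, [step_w]         ; 4-byte load picks up the neighbour: 0x002c0001
    movzx ecx, word [step_w]    ; 2-byte load, zero-extended: 1
    mov   edx, [step_d]         ; 4-byte load of a 4-byte variable: 1

    xor   edi, edi              ; exit status 0 if every check passes
    cmp   eax, 0x002c0001
    jne   .fail
    cmp   ecx, 1
    jne   .fail
    cmp   edx, 1
    je    .exit
.fail:
    mov   edi, 1
.exit:
    mov   eax, 60               ; exit system call (64-bit ABI)
    syscall
```

Either declare the variable with dd, or keep dw and access it with an explicit word-sized load and store.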
BTW, there's no need to keep your loop counter in memory. You're not using RDI, RSI, RBP, or R8..R15 for anything so you could just keep it in a register. Like mov edi, limit before the loop, and dec edi / jnz at the bottom.
But actually you should use the 64-bit syscall ABI if you want to build 64-bit code, not the 32-bit int 0x80 ABI. See "What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code?". Or build 32-bit executables if you're following a guide or tutorial written for that.
Anyway, in that case you'd be able to use ebx as your loop counter, because the syscall ABI uses different registers for args.
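Putting those suggestions together, a sketch of the loop might look like this. It assumes the 64-bit syscall ABI (write = 1, exit = 60), keeps the counter in ebx, and moves the string to .rodata since it is never modified; the .loop label is my own naming:

```nasm
default rel

section .rodata
print_str:     db 'Hello, world.', 10
print_str_len: equ $ - print_str
limit:         equ 10

section .text
global _start
_start:
    mov  ebx, limit             ; loop counter; syscall does not modify rbx
.loop:
    mov  eax, 1                 ; write system call in the 64-bit ABI
    mov  edi, 1                 ; file descriptor 1 = stdout
    lea  rsi, [rel print_str]   ; buffer
    mov  edx, print_str_len     ; length
    syscall                     ; clobbers rcx and r11, returns in rax
    dec  ebx
    jnz  .loop                  ; repeat until the counter reaches zero

    mov  eax, 60                ; exit system call
    xor  edi, edi               ; status 0
    syscall
```

It can be built with the same nasm -f elf64 and ld commands as in the question. Counting down to zero lets dec set the flags for jnz, so no separate cmp is needed.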
I am writing assembly with NASM and want to debug the program using gdb, but it does not work when I set a breakpoint and run the program.
The program assembles and links fine; the problem is gdb.
Here are the commands I use to build it:
nasm -f elf64 -F dwarf -g types.asm
nasm -f elf64 -F dwarf -g functions.asm
nasm -f elf64 -F dwarf -g Hello.asm
ld -g -o Hello Hello.o functions.o types.o
This is the file I want to debug, Hello.asm:
%include "functions.asm"
section .bss
res: resb 1
fout: resb 1
section .text
global _start: ;must be declared for linker (ld)
section .data
msg db 'Hello, world!', 0xa ;string to be printed
len equ $ - msg ;length of the string
filename db 'hello.txt'
_start: ;tells linker entry point
mov ecx,5
mov edx,4
call sum
mov [res],eax
mov edx,1 ;message length
mov ecx,res ;message to write
mov ebx,1 ;file descriptor (stdout)
mov eax,4 ;system call number (sys_write)
int 0x80 ;call kernel
write_string msg,len
create_file filename
mov [fout],eax
close_file [fout]
call print_msg
mov eax,1 ;system call number (sys_exit)
int 0x80 ;call kernel
sum:
mov eax,ecx
add eax,edx
add eax,'0'
ret
Next I open gdb:
gdb Hello
(gdb) break _start
Function "_start" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (_start) pending.
(gdb) run
Starting program: /asm/Hello
9Hello, world!
Hello, world!from another file
[Inferior 1 (process 5811) exited with code 01]
(gdb)
I solved it: I only swapped the positions of section .data and section .text, and the debugger works. I don't know why, but now gdb finds _start.
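A likely explanation is that in the original file the section .data directive sits directly above msg and _start, so _start and all the code after it were assembled into .data rather than .text. A minimal sketch of the corrected layout follows; it is a cut-down stand-in for Hello.asm without the macros from functions.asm, and it uses the 64-bit syscall ABI, which is my assumption rather than what the original file does:

```nasm
default rel

section .data                   ; data only
msg: db 'Hello, world!', 0xa    ; string to be printed
len: equ $ - msg                ; length of the string

section .text                   ; everything executable goes here
global _start                   ; must be declared for the linker (ld)
_start:                         ; entry point, now inside .text
    mov  eax, 1                 ; write system call (64-bit ABI)
    mov  edi, 1                 ; file descriptor (stdout)
    lea  rsi, [rel msg]         ; message to write
    mov  edx, len               ; message length
    syscall

    mov  eax, 60                ; exit system call
    xor  edi, edi               ; status 0
    syscall
```

With _start defined in .text, break _start should resolve to a code address when the file is built with the nasm and ld commands shown earlier.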
When writing some x64 assembly, I stumbled upon something weird. A function call works fine when executed on a main thread, but causes a segmentation fault when executed as a pthread. At first I thought I was invalidating the stack, as it only segfaults on the second call, but this does not match with the fact that it works properly on the main thread yet crashes on a newly-spawned thread.
From gdb:
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Value: 1337
Value: 1337
[New Thread 0x7ffff77f6700 (LWP 8717)]
Return value: 0
Value: 1337
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff77f6700 (LWP 8717)]
__printf (format=0x600570 <fmt> "Value: %d\n") at printf.c:28
28 printf.c: No such file or directory.
Does anyone have an idea about what could be going on here?
extern printf
extern pthread_create
extern pthread_join
extern pthread_exit
section .data
align 4
fmt db "Value: %d", 0x0A, 0
fmt_rval db "Return value: %d", 0x0A, 0
tID dw 0
section .text
global _start
_start:
mov rdi, 1337
call show_value
call show_value ; <- this call works fine
; CREATE THREAD
mov ecx, 0 ; function argument
mov edx, thread_1 ; function pointer
mov esi, 0 ; attributes
mov rdi, tID ; pointer to threadID
call pthread_create
mov rdi, rax
call show_rval
mov rsi, 0 ; return value
mov rdi, [tID] ; id to wait on
call pthread_join
mov rdi, rax
call show_rval
call exit
thread_1:
mov rdi, 1337
call show_value
call show_value ; <- this additional call causes a segfault
ret
show_value:
push rdi
mov rsi, rdi
mov rdi, fmt
call printf
pop rdi
ret
show_rval:
push rdi
mov rsi, rdi
mov rdi, fmt_rval
call printf
pop rdi
ret
exit:
mov rax, 60
mov rdi, 0
syscall
The binary was generated on Ubuntu 14.04 (64-bit of course), with:
nasm -felf64 -g -o $1.o $1.asm
ld -I/lib64/ld-linux-x86-64.so.2 -o $1.out $1.o -lc -lpthread
Functions that take a variable number of parameters like printf require the RAX register to be set properly. You need to set it to the number of vector registers used, which in your case is 0. From Section 3.2.3 Parameter Passing in the System V 64-bit ABI:
RAX
temporary register;
with variable arguments passes information about the number of vector registers used;
1st return register
Section 3.5.7 contains more detailed information about the parameter passing mechanism of functions taking a variable number of arguments. That section says:
When a function taking variable-arguments is called, %rax must be set to the total number of floating point parameters passed to the function in vector registers.
Modify your code to set RAX to zero in your call to printf:
show_value:
push rdi
xor rax, rax ; rax = 0
mov rsi, rdi
mov rdi, fmt
call printf
pop rdi
ret
You have a similar issue with show_rval.
One other observation is that you could simplify linking your executable by using GCC instead of LD.
I would recommend renaming _start to main and simply using GCC to link the final executable. GCC's C runtime code will provide a _start label that does proper initialization of the C runtime, which could potentially be required in some scenarios. When the C runtime code has finished initializing, it transfers control (via a CALL) to the label main. You could then produce your executable with:
nasm -felf64 -g -o $1.o $1.asm
gcc -o $1.out $1.o -lpthread
I don't think this is related to your problem, but was meant more as an FYI.
By not properly setting RAX for the printf call, unwanted behavior may occur in some cases. In this case, the value of RAX not being set properly for the printf call in an environment with threads causes a segmentation fault. The code without threads happened to work because you were lucky.
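For completeness, here is a sketch of the same fix applied to show_rval, wrapped in a small main so it can be linked with gcc as suggested above. The main wrapper, the value 0 passed in, the sub rsp, 8 adjustment (which keeps rsp 16-byte aligned at the printf call) and the wrt ..plt suffix (so it also links as a position-independent executable) are my additions, not part of the original program:

```nasm
default rel
extern printf

section .data
fmt_rval: db "Return value: %d", 0x0A, 0

section .text
global main
main:
    sub  rsp, 8                 ; keep rsp 16-byte aligned for the calls below
    xor  edi, edi               ; hypothetical value to print: 0
    call show_rval
    add  rsp, 8
    xor  eax, eax               ; return 0 from main
    ret

show_rval:
    push rdi                    ; save the argument, as in the original
    mov  rsi, rdi               ; 2nd printf argument: the value
    lea  rdi, [rel fmt_rval]    ; 1st printf argument: the format string
    xor  eax, eax               ; no vector registers used by this variadic call
    call printf wrt ..plt
    pop  rdi
    ret
```

Writing xor eax, eax has the same effect as xor rax, rax, because a write to a 32-bit register zeroes the upper half of the 64-bit register.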
I'm trying to write some assembly programs using nasm on linux. Everything is good, but I make heavy use of local symbols (.loop, .else, etc.), which is a pain when debugging, because these symbols are emitted to the symbol table, e.g.:
[BITS 32]
global main
section .text
main:
do stuff
.else:
do other stuff
will produce a disassembly that looks like:
<main>:
00000000 do stuff
<main.else>:
00000000 do other stuff
which is a bit annoying just because gdb will think these are all separate functions, so when I 'disas' it will only disassemble a couple of instructions before it runs into another label and stops.
Is there a way to suppress emitting these symbols to the ELF symbol table using nasm under linux?
I haven't found a way to do it directly with nasm, however if you link your object with ld, then you have at your disposal a very handy switch.
Quoting from ld's man page:
-x --discard-all
Delete all local symbols.
-X --discard-locals
Delete all temporary local symbols. (These symbols start with
system-specific local label prefixes, typically .L for ELF
systems or L for traditional a.out systems.)
so if you have, for example, this:
section .data
hello: db 'Hello world!',10
helen: equ $-hello
hi: db 'Hi!',10
hilen: equ $-hi
section .text
global _start
_start:
mov eax,4
mov ebx,1
mov ecx,hello
mov edx,helen
int 80h
.there:
mov eax,4
mov ebx,1
mov ecx,hi
mov edx,hilen
int 80h
.end:
mov eax,1
mov ebx,0
int 80h
and then build, link (and run) it like this:
$ nasm -g -f elf32 prog.asm && ld -x prog.o -o prog && ./prog
Hello world!
Hi!
then, when you load it in gdb, you get this:
$ gdb prog
.....
Reading symbols from prog...done.
(gdb) disas _start
Dump of assembler code for function _start:
0x08048080 <+0>: mov $0x4,%eax
0x08048085 <+5>: mov $0x1,%ebx
0x0804808a <+10>: mov $0x80490b8,%ecx
0x0804808f <+15>: mov $0xd,%edx
0x08048094 <+20>: int $0x80
0x08048096 <+22>: mov $0x4,%eax
0x0804809b <+27>: mov $0x1,%ebx
0x080480a0 <+32>: mov $0x80490c5,%ecx
0x080480a5 <+37>: mov $0x4,%edx
0x080480aa <+42>: int $0x80
0x080480ac <+44>: mov $0x1,%eax
0x080480b1 <+49>: mov $0x0,%ebx
0x080480b6 <+54>: int $0x80
End of assembler dump.
(gdb)
where the disassembly is not hindered by the local symbols any more.
I have a function which prints text and a floating point number. Here is a version which does not use main
extern printf
extern _exit
section .data
hello: db 'Hello world! %f',10,0
pi: dq 3.14159
section .text
global _start
_start:
xor eax, eax
lea rdi, [rel hello]
movsd xmm0, [rel pi]
mov eax, 1
call printf
mov rax, 0
jmp _exit
I assemble and link this like this
nasm -felf64 hello.asm
ld hello.o -dynamic-linker /lib64/ld-linux-x86-64.so.2 -lc -melf_x86_64
This runs fine. However, now I want to do this using main.
global main
extern printf
section .data
hello: db 'Hello world! %f',10,0
pi: dq 3.14159
section .text
main:
sub rsp, 8
xor eax, eax
lea rdi, [rel hello]
movsd xmm0, [rel pi]
mov eax, 1
call printf
mov rax, 0
add rsp, 8
ret
I assemble and link it like this
nasm -felf64 hello_main.asm
gcc hello_main.o
This runs fine as well. However, I had to subtract eight bytes from the stack pointer before calling printf and then add eight bytes to the stack pointer after otherwise I get a segmentation fault.
Looking at the stack pointer I see that without using main it's 16-byte aligned but with main it's only eight byte aligned. The fact that eight bytes has to be subtracted and added says that it's always 8-byte aligned and never 16-byte aligned (unless I misunderstand something). Why is this? I thought with x86_64 code we could assume that the stack is 16-byte aligned (at least for standard library function calls which I would think includes main).
According to the ABI, rsp must be 16-byte aligned immediately before every call instruction, which means rsp + 8 is 16-byte aligned on entry to a function. The call that invokes main pushes an 8-byte return address, so inside main the stack is off by 8 until you adjust it. Basically you have to make sure the total stack pointer movement, including the return address, is a multiple of 16. Thus the stack pointer needs to be moved by a multiple of 16, plus 8, before you make another call.
As for _start, it is not called, so no return address is pushed. The x86-64 System V ABI guarantees that rsp is 16-byte aligned at process entry, which is why calling printf directly from _start works without any adjustment.
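Any 8-byte adjustment works for realigning the stack in main, not just sub rsp, 8. As a sketch, the same program can use a push/pop pair instead; the choice of rbp and the wrt ..plt suffix (so it links whether or not gcc builds a position-independent executable) are my assumptions:

```nasm
default rel
global main
extern printf

section .data
hello: db 'Hello world! %f', 10, 0
pi:    dq 3.14159

section .text
main:
    push  rbp                   ; 8 bytes: rsp is 16-byte aligned again
    lea   rdi, [rel hello]      ; format string
    movsd xmm0, [rel pi]        ; first floating-point argument
    mov   eax, 1                ; one vector register used
    call  printf wrt ..plt
    pop   rbp                   ; undo the push before returning
    xor   eax, eax              ; return 0
    ret
```

It assembles and links with the same nasm -felf64 and gcc commands as the main version in the question.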