The value displayed in Kdbg is wrong -- NASM - linux

How can I test to see if the value of k is correct?
section .data
k dw 5
m dw 110
rez dw 0
section .bss
tabela resq 3
section .text
global _start
extern uslov
_start:
mov qword [tabela], k
mov qword [tabela + 8], m
mov qword [tabela + 16], rez
mov rbx, tabela
call uslov
mov rax, 60
mov rdi, 0
syscall
When I inspect the values of k, m and rez in KDbg, the values of m and rez are fine but the value of k is completely different. At first I thought it was random, but it seems as though the debugger reads k as an 8-byte number instead of a 2-byte number, so it pulls in 6 more bytes and picks up all the set bits from m and rez, which is wrong. How can I display it correctly?

I can reproduce this with your source (removing undefined references to uslov) when I compile using this command line:
nasm -f elf64 test.asm -o test.o
ld test.o -o test
Then, in GDB I can indeed see that k appears to have sizeof(k)==4:
gdb ./test -ex 'tb _start' -ex r -ex 'p sizeof(k)'
Reading symbols from ./test...done.
Starting program: /tmp/test
Temporary breakpoint 1, 0x00000000004000b0 in _start ()
$1 = 4
This is because the only information the final binary has about k is that it's a symbol in data area. See:
(gdb) ptype k
type = <data variable, no debug info>
The debugger (KDbg uses GDB under the hood) can't know the symbol's size, so it falls back to treating it as an int, i.e. sizeof(int) == 4 bytes. Even if you enable debug info in NASM via the -F dwarf -g options, it still doesn't appear to emit any actual type information for the variable.
So the only way to get the variables displayed with the right size is to specify it manually with a cast, e.g. (short)k instead of k.

Related

Trivial assembler program segfaults [duplicate]

I am trying to learn nasm. I want to make a program that prints "Hello, world." n times (in this case 10). I am trying to save the loop register value in a constant so that it is not changed when the body of the loop is executed. When I try to do this I receive a segmentation fault error. I am not sure why this is happening.
My code:
SECTION .DATA
print_str: db 'Hello, world.', 10
print_str_len: equ $-print_str
limit: equ 10
step: dw 1
SECTION .TEXT
GLOBAL _start
_start:
mov eax, 4 ; 'write' system call = 4
mov ebx, 1 ; file descriptor 1 = STDOUT
mov ecx, print_str ; string to write
mov edx, print_str_len ; length of string to write
int 80h ; call the kernel
mov eax, [step] ; moves the step value to eax
inc eax ; Increment
mov [step], eax ; moves the eax value to step
cmp eax, limit ; Compare eax to the limit
jle _start ; Loop while less or equal
exit:
mov eax, 1 ; 'exit' system call
mov ebx, 0 ; exit with error code 0
int 80h ; call the kernel
The result:
Hello, world.
Segmentation fault (core dumped)
The cmd:
nasm -f elf64 file.asm -o file.o
ld file.o -o file
./file
section .DATA is the direct cause of the crash. Lower-case section .data is special, and linked as a read-write (private) mapping of the executable. Section names are case-sensitive.
Upper-case .DATA is not special for nasm or the linker, and it ends up as part of the text segment mapped read+exec without write permission.
Upper-case .TEXT is also weird: by default objdump -drwC -Mintel only disassembles the .text section (to avoid disassembling data as if it were code), so it shows empty output for your executable.
On newer systems, the default permissions for a section name NASM doesn't recognize don't include exec, so code in .TEXT will segfault. This is the same issue as in "Assembly section .code and .text behave differently".
After starting the program under GDB (gdb ./foo, starti), I looked at the process's memory map from another shell.
$ cat /proc/11343/maps
00400000-00401000 r-xp 00000000 00:31 110651257 /tmp/foo
7ffff7ffa000-7ffff7ffd000 r--p 00000000 00:00 0 [vvar]
7ffff7ffd000-7ffff7fff000 r-xp 00000000 00:00 0 [vdso]
7ffffffde000-7ffffffff000 rwxp 00000000 00:00 0 [stack]
As you can see, other than the special VDSO mappings and the stack, there's only the one file-backed mapping, and it has read+exec permission only.
Single-stepping inside GDB, the mov eax,DWORD PTR ds:0x400086 load succeeds, but the mov DWORD PTR ds:0x400086,eax store faults. (See the bottom of the x86 tag wiki for GDB asm tips.)
From readelf -a foo, we can see the ELF program headers that tell the OS's program loader how to map it into memory:
$ readelf -a foo # broken version
...
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000
0x00000000000000bf 0x00000000000000bf R 0x200000
Section to Segment mapping:
Segment Sections...
00 .DATA .TEXT
Notice how both .DATA and .TEXT are in the same segment. This is what you'd want for section .rodata (a standard section name where you should put read-only constant data like your string), but it won't work for mutable global variables.
After fixing your asm to use section .data and .text, readelf shows us:
$ readelf -a foo # fixed version
...
Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
LOAD 0x0000000000000000 0x0000000000400000 0x0000000000400000
0x00000000000000e7 0x00000000000000e7 R E 0x200000
LOAD 0x00000000000000e8 0x00000000006000e8 0x00000000006000e8
0x0000000000000010 0x0000000000000010 RW 0x200000
Section to Segment mapping:
Segment Sections...
00 .text
01 .data
Notice how segment 00 is R + E without W, and the .text section is in there. Segment 01 is RW (read + write) without exec, and the .data section is there.
The LOAD tag means they're mapped into the process's virtual address space. Some sections (like debug info) aren't, and are just metadata for other tools. But NASM flags unknown section names as progbits, i.e. loaded, which is why the program linked and the load itself did not segfault.
After fixing it to use section .data, your program runs without segfaulting.
The loop runs for one iteration, because the 2 bytes following step: dw 1 are not zero. After the dword load, RAX = 0x2c0001 on my system. (cmp between 0x002c0002 and 0xa makes the LE condition false because it's not less or equal.)
dw means "data word" or "define word". Use dd for a data dword.
BTW, there's no need to keep your loop counter in memory. You're not using RDI, RSI, RBP, or R8..R15 for anything so you could just keep it in a register. Like mov edi, limit before the loop, and dec edi / jnz at the bottom.
But actually you should use the 64-bit syscall ABI if you want to build 64-bit code, not the 32-bit int 0x80 ABI (see "What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code?"). Or build 32-bit executables if you're following a guide or tutorial written for that.
Anyway, in that case you'd be able to use ebx as your loop counter, because the syscall ABI uses different registers for its arguments (rdi, rsi, rdx, ...) and leaves rbx alone.
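Putting the suggestions above together, here is a minimal sketch of a fixed version. It assumes a 64-bit build with the same nasm -f elf64 / ld commands as in the question. The .loop label, the use of section .rodata for the string, and the choice of ebx as the counter are illustrative choices, not part of the original code.

```nasm
; Sketch only: lower-case section names, 64-bit syscall ABI,
; and the loop counter kept in a register instead of memory.
section .rodata                          ; read-only constant data
print_str:      db 'Hello, world.', 10
print_str_len:  equ $ - print_str
limit:          equ 10

section .text
global _start
_start:
    mov     ebx, limit                   ; loop counter; syscall does not modify rbx
.loop:
    mov     eax, 1                       ; 64-bit 'write' system call = 1
    mov     edi, 1                       ; file descriptor 1 = stdout
    lea     rsi, [rel print_str]         ; address of the string to write
    mov     edx, print_str_len           ; length of the string
    syscall                              ; overwrites rax, rcx and r11 only
    dec     ebx                          ; one iteration done
    jnz     .loop                        ; repeat until the counter reaches 0

    mov     eax, 60                      ; 64-bit 'exit' system call = 60
    xor     edi, edi                     ; exit status 0
    syscall
```

Counting down to zero lets dec set the flags for jnz directly, so no separate cmp against limit is needed.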

GAS to NASM assembly: translate ".rept .set" to NASM (loop and assign incrementing value to label)

GAS assembly knows about the .set-directive which can be combined with .rept to increment a label (variable) in a loop as in the example below:
pd:
.set SPAGE, 0
.rept 512
.quad SPAGE + 0x87 // PRESENT, R/W, USER, 2MB
.set SPAGE, SPAGE + 0x200000
.endr
How can I achieve something similarly convenient in NASM? I know about the TIMES directive, but on its own it doesn't let me achieve what I want. Any ideas? NASM's EQU directive only allows assigning a value once, so it will not solve my problem.
Indeed, this is impossible with the TIMES directive, because the operand to TIMES is a critical expression and TIMES repeats only a single line. To repeat more than one line of code, or a complex macro, use the preprocessor %rep directive instead. Take a look at this silly example:
global _start
section .text
_start:
mov rbx, 0
%assign i 0
%rep 5
mov rbx, [variable]
add rbx, i
mov [variable], rbx
%assign i i+1
%endrep
mov rax, 60 ; system call for exit
mov rdi, [variable]; value of 'variable' = 10
syscall
section .bss
variable: resq 1 ; 8 bytes, to match the 64-bit loads and stores above
Check the answer:
nasm -felf64 ass.asm && ld ass.o && ./a.out
echo $?
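Applied to the table from the question, the same %assign / %rep pattern would look roughly like the sketch below. SPAGE and pd are the names from the GAS version; the section and align lines are assumptions added only so that the fragment assembles on its own.

```nasm
section .data
align 4096                          ; assumption: page tables are usually page-aligned
pd:
%assign SPAGE 0                     ; counterpart of .set SPAGE, 0
%rep 512                            ; counterpart of .rept 512
    dq SPAGE + 0x87                 ; PRESENT, R/W, USER, 2MB
%assign SPAGE SPAGE + 0x200000      ; counterpart of .set SPAGE, SPAGE + 0x200000
%endrep                             ; counterpart of .endr
```

Unlike EQU, %assign may redefine the same name as often as needed, and %rep expands the whole block at preprocessing time, so this should emit 512 quadwords.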

NASM: copying a pointer from a register to a buffer in .data

I am new to asm. I am trying to copy a pointer from a register to a .data variable using NASM on 64-bit Linux.
Consider this program:
section .data
ptr: dq 0
section .text
global _start
_start:
mov [ptr], rsp
mov rax, 60
mov rdi, 0
syscall
Here I try to copy the current stack pointer to ptr. ptr is declared as a quadword. Neither nasm nor the linker complains, but when debugging the program with gdb, I can see that both addresses are different:
gdb ./test.s
+(gdb) break _start
Breakpoint 1 at 0x4000b0
+(gdb) run
Starting program: test
Breakpoint 1, 0x00000000004000b0 in _start ()
+(gdb) nexti
0x00000000004000b8 in _start ()
+(gdb) info registers
...
rsp 0x7fffffffe460 0x7fffffffe460
...
+(gdb) x ptr
0xffffffffffffe460: Cannot access memory at address 0xffffffffffffe460
From what I understand, mov should copy all 64 bits from rsp to [ptr], but it seems that the most significant 0s are not copied and/or that there is some kind of sign extension, as if only the least significant bits were copied.
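The problem is that you don't have debug info describing the type of ptr, so gdb treats it as a 4-byte int. x ptr therefore takes only the low 32 bits of the stored value (0xffffe460), sign-extends them, and tries to use the result as the address to examine. You can examine the real contents using: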
(gdb) x/a &ptr
0x600124 <ptr>: 0x7fffffffe950
(gdb) p/a $rsp
$3 = 0x7fffffffe950
Of course I have a different value for rsp than you, but you can see that ptr and rsp match.
It looks to me like you're using gdb incorrectly:
section .data
ptr: dq 0
section .text
global main
main:
mov [ptr], rsp
ret
Compiling with:
rm -f test.o && nasm -f elf64 test.asm && gcc -m64 -o test test.o
Then my debugging session looks like this:
gdb ./test
(...)
(gdb) break main
Breakpoint 1 at 0x4004c0
(gdb) run
Starting program: /home/rr-/test
Breakpoint 1, 0x00000000004004c0 in main ()
(gdb) nexti
0x00000000004004c8 in main ()
(gdb) info registers
rax 0x4004c0 4195520
rbx 0x0 0
rcx 0x0 0
rdx 0x7fffffffe388 140737488348040
rsi 0x7fffffffe378 140737488348024
rdi 0x1 1
rbp 0x4004d0 0x4004d0 <__libc_csu_init>
rsp 0x7fffffffe298 0x7fffffffe298
(...)
(gdb) info addr ptr
Symbol "ptr" is at 0x600880 in a file compiled without debugging.
(gdb) x/g 0x600880
0x600880: 140737488347800
140737488347800 evaluates to 0x7FFFFFFFE298 just fine.
+(gdb) x/h ptr
h means half-word, which is two bytes. What you want is probably g (Giant words in GDB terminology, which is eight bytes).

linux nasm command line args as integers

I've been banging my head for days trying to figure this out, finally posting here for some help. This exercise is purely academic for me, but it's come to a point where I simply need to understand why this doesn't work or what I'm doing wrong.
section .text
global _start
_start:
pop eax
pop ebx
pop ecx
_exit:
mov eax, 1
mov ebx, 0
int 0x80
Compiling/linking with:
$ nasm -f elf -o test.o test.asm
$ gcc -o test test.o
Running it in gdb with argument of "5":
$ gdb test
...
(gdb) b _exit
Breakpoint 1 at 0x8048063
(gdb) r 5
Starting program: /home/rich/asm/test 5
Breakpoint 1, 0x08048063 in _exit ()
(gdb) i r
eax 0x2 2
ebx 0xbffff8b0 -1073743696
ecx 0xbffff8c8 -1073743672
edx 0x0 0
esp 0xbffff78c 0xbffff78c
ebp 0x0 0x0
...
So eax makes sense here - it's 0x2, or 2, argc. My question is: how do I get the value "5" (or 0x5) into a register? As I understand it, ecx is a pointer to my value 5, so how do I "dereference" it into a usable digit, i.e. one that I can do arithmetic things to?
What do you want to do with it? Your understanding is right: at process entry argc is on the top of the stack, followed at increasing addresses by the pointers argv[0] ... argv[argc-1] (i.e. the slot right after argc, at the lowest memory address, holds the first argument). You can check this with gdb on any binary on the system:
$ echo "int main(){return 0;}" > test.c
$ gcc test.c
$ gdb ./a.out
(gdb) b _start
(gdb) r A B C D E
(gdb) p ((void**)$rsp)[0]
$2 = (void *) 0x6
(gdb) p (char*)((void**)$rsp)[1]
$4 = 0x7fffffffeab7 "/home/andy/a.out"
(gdb) p (char*)((void**)$rsp)[2]
$5 = 0x7fffffffeac8 "A"
(gdb) p (char*)((void**)$rsp)[3]
$6 = 0x7fffffffeaca "B"
(gdb) p (char*)((void**)$rsp)[4]
$7 = 0x7fffffffeacc "C"
(gdb) p (char*)((void**)$rsp)[5]
$8 = 0x7fffffffeace "D"
(gdb) p (char*)((void**)$rsp)[6]
$9 = 0x7fffffffead0 "E"
Are you maybe asking how to parse the strings? That's a more involved question.
I realise this may be a little late and you may have already worked this out or moved on to something else, but I came across this question while googling something related and figured I could help out for anyone else that comes across this.
The problem I think you are facing here is that the "5" you are passing to your program is not stored as an integer 5 as one might assume. The argument is passed to your program as a string of characters, and so, as Andy pointed out, you have a pointer to a byte containing 0x35 - which is the integer value that represents the ASCII character 5 - rather than a pointer to an integer value 5.
To use your argument as an integer, you need to convert the byte to its integer equivalent as defined by the ASCII table - otherwise you will find that you pass in the char 5 but any math you attempt to do with it will use 53 (0x35), because that is what represents a 5 in ASCII.
You can find an example of how to perform that conversion in the rsi_to_bin function of the example asm program here. Once you have converted the ASCII code to its actual integer equivalent, you will have the number you passed in and will be able to perform whatever arithmetic you want with it. An extremely simple example would be to just subtract 48 from the input - this works as long as you only pass in a single digit 0-9.
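As a rough illustration of that conversion for more than one digit, here is a minimal 32-bit sketch in the style of the question's code. It is not the rsi_to_bin routine mentioned above; the file name atoi.asm and the label names are hypothetical, and it assumes the program is linked with ld rather than gcc, so that _start really is the entry point and the stack still holds argc and argv.

```nasm
; Hypothetical file atoi.asm. Assumed build commands:
;   nasm -f elf32 atoi.asm -o atoi.o
;   ld -m elf_i386 atoi.o -o atoi
section .text
global _start
_start:
    pop     eax                     ; eax = argc (not used further)
    pop     esi                     ; esi = argv[0], the program name (ignored)
    pop     esi                     ; esi = argv[1], or NULL if no argument was given
    xor     ebx, ebx                ; ebx = running result, starts at 0
    test    esi, esi
    jz      .done                   ; no argument: exit with status 0
.next_digit:
    movzx   ecx, byte [esi]         ; load one ASCII character
    sub     ecx, '0'                ; '0'..'9' (0x30..0x39) becomes 0..9
    cmp     ecx, 9
    ja      .done                   ; terminating 0 byte or non-digit: stop
    lea     ebx, [ebx + ebx*4]      ; ebx = ebx * 5
    lea     ebx, [ecx + ebx*2]      ; ebx = ebx * 2 + digit, i.e. old * 10 + digit
    inc     esi                     ; advance to the next character
    jmp     .next_digit
.done:
    mov     eax, 1                  ; 'exit' system call (32-bit ABI)
    int     0x80                    ; ebx is the exit status
```

Running ./atoi 5 followed by echo $? should then show the parsed value; note that the shell only reports the low 8 bits of the exit status, so this check is only meaningful for values up to 255.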
