I've been banging my head for days trying to figure this out, finally posting here for some help. This exercise is purely academic for me, but it's come to a point where I simply need to understand why this doesn't work or what I'm doing wrong.
section .text
global _start
_start:
pop eax
pop ebx
pop ecx
_exit:
mov eax, 1
mov ebx, 0
int 0x80
Compiling/linking with:
$ nasm -f elf -o test.o test.asm
$ gcc -o test test.o
Running it in gdb with argument of "5":
$ gdb test
...
(gdb) b _exit
Breakpoint 1 at 0x8048063
(gdb) r 5
Starting program: /home/rich/asm/test 5
Breakpoint 1, 0x08048063 in _exit ()
(gdb) i r
eax 0x2 2
ebx 0xbffff8b0 -1073743696
ecx 0xbffff8c8 -1073743672
edx 0x0 0
esp 0xbffff78c 0xbffff78c
ebp 0x0 0x0
...
So eax makes sense here - it's 0x2, or 2, argc. My question is: how do I get the value "5" (or 0x5) into a register? As I understand it, ecx is a pointer to my value 5, so how do I "dereference" it into a usable digit, i.e. one that I can do arithmetic things to?
What do you want to do with it? Your understanding is right: the kernel puts argc on the top of the stack, followed by the pointers argv[0] ... argv[argc-1] at increasing addresses (i.e. the top of the stack / lowest memory address holds argc, and the slot right after it holds the pointer to the first argument string). You can check this with gdb on any binary on the system:
$ echo "int main(){return 0;}" > test.c
$ gcc test.c
$ gdb ./a.out
(gdb) b _start
(gdb) r A B C D E
(gdb) p ((void**)$rsp)[0]
$2 = (void *) 0x6
(gdb) p (char*)((void**)$rsp)[1]
$4 = 0x7fffffffeab7 "/home/andy/a.out"
(gdb) p (char*)((void**)$rsp)[2]
$5 = 0x7fffffffeac8 "A"
(gdb) p (char*)((void**)$rsp)[3]
$6 = 0x7fffffffeaca "B"
(gdb) p (char*)((void**)$rsp)[4]
$7 = 0x7fffffffeacc "C"
(gdb) p (char*)((void**)$rsp)[5]
$8 = 0x7fffffffeace "D"
(gdb) p (char*)((void**)$rsp)[6]
$9 = 0x7fffffffead0 "E"
Are you maybe asking how to parse the strings? That's a more involved question.
I realise this may be a little late and you may have already worked this out or moved on to something else, but I came across this question while googling something related and figured I could help out for anyone else that comes across this.
The problem I think you are facing here is that the "5" you pass to your program is not stored as the integer 5, as one might assume. The argument is passed to your program as a string of characters, so, as Andy pointed out, you have a pointer to a byte containing 0x35 (the ASCII code for the character 5, followed by a terminating zero byte) rather than a pointer to the integer value 5.
To use your argument as an integer, you need to convert the byte to the digit it represents in the ASCII table. Otherwise you pass in the character 5, but any math you do with it will use 53 (0x35), because that is the ASCII code for 5.
You can find an example of how to perform that conversion in the rsi_to_bin function of the example asm program here. Once you have converted the ASCII code to its integer equivalent you will have the number you passed in, and will be able to perform whatever arithmetic you wanted with it. An extremely simple example would be to just subtract 48 (0x30, the ASCII code for 0) from the byte; this works as long as you pass in only a single digit, 0-9.
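As a rough illustration of that conversion (in C rather than assembly, and not the rsi_to_bin routine mentioned above), the digit-by-digit approach can be sketched like this. The name parse_decimal is made up for this example, and the sketch assumes unsigned decimal input with no overflow checking:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: convert a NUL-terminated string of ASCII digits
   (for example argv[1]) to the integer it spells out.
   Stops at the first character that is not a digit; no overflow check. */
unsigned parse_decimal(const char *s)
{
    assert(s != NULL);
    unsigned value = 0;
    while (*s >= '0' && *s <= '9') {
        /* '5' is 0x35 and '0' is 0x30, so '5' - '0' is the integer 5 */
        value = value * 10 + (unsigned)(*s - '0');
        s++;
    }
    return value;
}
```

Calling parse_decimal("5") gives 5, and the same loop handles multi-digit input such as "123". In assembly the loop body is the same idea: load a byte, subtract 0x30, multiply the running total by 10 and add.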
How can I test to see if the value of k is correct?
section .data
k dw 5
m dw 110
rez dw 0
section .bss
tabela resq 3
section .text
global _start
extern uslov
_start:
mov qword [tabela], k
mov qword [tabela + 8], m
mov qword [tabela + 16], rez
mov rbx, tabela
call uslov
mov rax, 60
mov rdi, 0
syscall
When I try to inspect the values of k, m and rez in KDbg, the values of m and rez are just fine but the value of k is totally different. At first I thought it was random, but it seems as though it reads the value as an 8-byte number instead of a 2-byte number, taking in 6 more bytes, including all the set bits from m and rez, which is wrong. How can I display it correctly?
I can reproduce this with your source (removing undefined references to uslov) when I compile using this command line:
nasm -f elf64 test.asm -o test.o
ld test.o -o test
Then, in GDB I can indeed see that k appears to have sizeof(k)==4:
gdb ./test -ex 'tb _start' -ex r -ex 'p sizeof(k)'
Reading symbols from ./test...done.
Starting program: /tmp/test
Temporary breakpoint 1, 0x00000000004000b0 in _start ()
$1 = 4
This is because the only information the final binary has about k is that it's a symbol in data area. See:
(gdb) ptype k
type = <data variable, no debug info>
The debugger (KDbg uses GDB under the hood) can't know its size, so it just guesses the default size to be sizeof(int). Even if you enable debug info in NASM via -F dwarf -g options, it still doesn't appear to put any actual debug info.
So, your only way to get the variables displayed with the right size is to manually specify it, like (short)k instead of k.
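To see why the guessed size changes the displayed value, here is a small C sketch that decodes the same bytes at different widths. The read_le helper and the data_section array are made up for this illustration; the bytes are what k dw 5, m dw 110, rez dw 0 would look like in memory, assuming the three words are laid out back to back on a little-endian x86 machine:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode `width` bytes at p as a little-endian
   unsigned integer, the way a debugger would for that operand size. */
uint64_t read_le(const uint8_t *p, size_t width)
{
    assert(p != NULL && width <= 8);
    uint64_t v = 0;
    for (size_t i = 0; i < width; i++)
        v |= (uint64_t)p[i] << (8 * i);
    return v;
}

/* Assumed memory image of: k dw 5 / m dw 110 / rez dw 0 */
static const uint8_t data_section[] = {
    0x05, 0x00,   /* k   = 5   */
    0x6e, 0x00,   /* m   = 110 */
    0x00, 0x00    /* rez = 0   */
};
```

read_le(data_section, 2) yields 5, which is what (short)k shows. read_le(data_section, 4) yields 0x006e0005 (7208965), because the two bytes of m land in the upper half of the value; that is the kind of "totally different" number a 4-byte guess produces.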
I have this FASM code:
msg1:
db "hello", 0
msg1_len equ $ - msg1 ; should be 6
Then later in code, mov edx, msg1_len puts its value in a register.
Although msg1_len is supposed to be 6, when I debug it I see a strange, big number such as 4570; that is, msg1_len is equal to 4570.
In other applications it's the same: a big, random-looking number instead of the length of a string.
Why is this? How do I fix it?
TL:DR: in FASM, equ is a text substitution, like NASM %define.
FASM len = $ - msg1 evaluates once, on the spot. (Like equ in most other assemblers, and also like = in MASM and GAS).
Text substitution breaks because $ - msg1 is context-sensitive: $ is the current position so mov edx, $ - msg1 is some large size that depends on the position of the instruction. equ would be fine for something like 8 * myconst in most cases.
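A loose C analogy, not FASM itself: a #define is re-expanded at every use, while an ordinary constant is computed once where it is defined. In the sketch below, the variable here stands in for $ and every name and number is made up for illustration:

```c
#include <assert.h>
#include <stddef.h>

static int here;                       /* stand-in for $ (current position) */
#define MSG_START 100                  /* made-up address of msg1 */
#define LEN_TEXT (here - MSG_START)    /* like equ: plain text substitution */

/* Hypothetical demo: compute the length both ways and report the results. */
void define_and_use(int *fixed_out, int *subst_out)
{
    assert(fixed_out != NULL && subst_out != NULL);

    here = MSG_START + 6;                    /* just past "hello", 0 */
    const int len_fixed = here - MSG_START;  /* like =: evaluated once, here */

    here = 4670;                             /* later, at the mov instruction */
    *fixed_out = len_fixed;                  /* still 6 */
    *subst_out = LEN_TEXT;                   /* here - 100, evaluated now */
}
```

After the call, the fixed value is 6 while the substituted one is 4570, because the expression was evaluated at the point of use rather than at the point of definition.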
EDIT: ooops.... I did use =, not equ.
When I replaced = with equ, I get compile error:
helloworld.asm [13]:
mov edx,msg1_size ; Length of message
error: undefined symbol 'msg1_size'.
(flat assembler version 1.71.51)
Works for me, when I put it into compilable FASM example, I get 6.
The full code I used to verify it works correctly:
format ELF executable 3
entry start
;================== code =====================
segment readable executable
;=============================================
start:
mov eax,4 ; System call 'write'
mov ebx,1 ; 'stdout'
mov ecx,msg1 ; Address of message
mov edx,msg1_size ; Length of message
; ^^ this compiles as mov edx,6, verified in debugger.
int 0x80 ; All system calls are done via this interrupt
mov eax,1 ; System call 'exit'
xor ebx,ebx ; Exitcode: 0 ('xor ebx,ebx' saves time; 'mov ebx, 0' would be slower)
int 0x80
;================== data =====================
segment readable writeable
;=============================================
msg1:
db 'hello', 0
msg1_size = $-msg1
final(?) update:
Check FASM docs about 2.2.1 Numerical constants:
The = directive allows to define the numerical constant. It should be preceded by the name for the constant and followed by the numerical expression providing the value. The value of such constants can be a number or an address, but - unlike labels - the numerical constants are not allowed to hold the register-based addresses. Besides this difference, in their basic variant numerical constants behave very much like labels and you can even forward-reference them (access their values before they actually get defined).
vs 2.3.2 Symbolic constants:
The symbolic constants are different from the numerical constants, before the assembly process they are replaced with their values everywhere in source lines after their definitions, and anything can become their values.
The definition of symbolic constant consists of name of the constant followed by the equ directive. Everything that follows this directive will become the value of constant. If the value of symbolic constant contains other symbolic constants, they are replaced with their values before assigning this value to the new constant.
Conclusion: so you should use = instead of equ (in FASM).
(for calculating numeric constants I mean.. you can still use equ for symbolic constants... sounds to me like macro definition)
You got your big constant because you had that symbol defined ahead of code, and during compilation it did something like mov edx,$ - msg1, where $ is already address of the instruction, not your placement of msg1_len definition.
"printf" returns the number of characters really printed, so I had:
#include<stdio.h>
int main()
{
printf("1");
printf("55555");
printf("10________");
printf("13___________");
printf("18________________");
printf("28__________________________");
}
This program will output
15555510________13___________18________________28__________________________
Then I tried to debug it in gdb and check the return value of printf:
(gdb) b main
Breakpoint 1 at 0x804844c: file testp.c, line 4.
(gdb) r
Starting program: /home/a/cpp/a.out
Breakpoint 1, main () at testp.c:4
4 printf("1");
(gdb) n # will return "1" to $eax
5 printf("55555");
(gdb) p $eax # I expect it will print "1" here, wrong!
$1 = 49
(gdb) n
6 printf("10________");
(gdb) p $eax # I expect it will print "5" here, right!
$2 = 5
(gdb) n
7 printf("13___________");
(gdb) p $eax # I expect it will print "10" here, right!
$3 = 10
As you can see, after the first printf is run, the $eax value is not what I expected. The later values seem correct.
Why is this? Why doesn't the first printf return 1 in $eax? I suppose the C ABI stores the return value in $eax, right?
Thanks
gcc can replace calls to printf with more efficient code, such as calls to puts or putchar, in certain cases where the optimization won't change the documented behavior of the functions (for example, when the output doesn't require any formatting to be done and you don't use the return value). That's what's happening here. You're seeing 49 in %eax because putchar returns either the character that was output, or EOF.
(gdb) disass /m main
Dump of assembler code for function main:
3 {
0x000000000040057d <+0>: push %rbp
0x000000000040057e <+1>: mov %rsp,%rbp
4 printf("1");
0x0000000000400581 <+4>: mov $0x31,%edi
0x0000000000400586 <+9>: callq 0x400450 <putchar@plt>
5 printf("55555");
0x000000000040058b <+14>: mov $0x400664,%edi
0x0000000000400590 <+19>: mov $0x0,%eax
0x0000000000400595 <+24>: callq 0x400460 <printf@plt>
To get gcc to generate calls to printf all the time, you can use the -fno-builtin-printf option.
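The two return conventions can be compared directly in C. This is only a sketch; putchar_result and printf_result are hypothetical wrapper names, and because the return values are actually used here, the compiler has no reason to swap one call for the other:

```c
#include <assert.h>
#include <stdio.h>

/* putchar returns the character it wrote (or EOF on error). */
int putchar_result(int c)
{
    return putchar(c);
}

/* printf returns the number of characters it wrote. */
int printf_result(const char *s)
{
    assert(s != NULL);
    return printf("%s", s);
}
```

putchar_result('1') returns 49, the ASCII code of the character 1, which is the value seen in $eax after the first line. printf_result("55555") returns 5, the character count.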
This is the source code I have:
section .data
msg: db "pppaaa"
len: equ $ - msg
section .text
global main
main:
mov edx,len
mov ecx,msg
mov ebx,1
mov eax,4
int 0x80
And when I debug this code I will see:
(gdb) info register ecx
ecx 0x804a010 134520848
(gdb) x 0x804a010
0x804a010 <msg>: 0x61707070
(gdb) x 0x804a014
0x804a014: 0x00006161
"70" here represents the character 'p' and "61" the character 'a' obviously.
What I am confused about is why the data at location 0x804a010 is 0x61707070 (appp) and, 4 bytes further on at 0x804a014, the data is --aa.
I would expect to see pppa for the first location and aa-- for the second. Why is this the case?
GDB doesn't know that you have a bunch of chars. You are just asking it to look at a memory location and it is displaying what is there, defaulting to a 4-byte integer. It assumes the integer is stored least significant byte first, because that is how it is done on Intel, so you get your bytes reversed.
To fix this, use a format specifier with your x command, like this:
x/10c 0x804a010
(will print 10 chars beginning at 0x804a010).
help x in GDB will give more information.
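The byte reversal can be reproduced by hand. The C sketch below assembles a 32-bit value from four consecutive bytes, least significant byte first, which is what x does by default on x86. The names load_u32_le and msg_bytes are invented for this example:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: combine 4 bytes into a 32-bit little-endian value.
   The byte at the lowest address becomes the lowest 8 bits. */
uint32_t load_u32_le(const unsigned char *p)
{
    assert(p != NULL);
    return (uint32_t)p[0]
         | (uint32_t)p[1] << 8
         | (uint32_t)p[2] << 16
         | (uint32_t)p[3] << 24;
}

/* The string from the question, zero-padded to 8 bytes. */
static const unsigned char msg_bytes[8] = "pppaaa";
```

load_u32_le(msg_bytes) is 0x61707070: the first p (0x70) ends up in the low byte and the first a (0x61) in the high byte, so the hex digits read right to left relative to memory order. load_u32_le(msg_bytes + 4) is 0x00006161 for the same reason.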
extern putchar
extern exit
section .data
section .text
global main
main:
push 's'
mov eax, 2
cmp eax, 2
point:
call putchar
jz point
push 0
call exit
On the console I see only one 's' character.
Compile and run:
nasm -f elf ./prog.asm
gcc -m32 -o prog ./prog.o
./prog
The cmp result is "equal" (that is, it does set the ZF flag). However, the call to putchar on the next line clobbers the flags set by cmp, so your jz no longer depends on the comparison (whether it jumps is more or less an accident). If you want to save the flags for later comparison, you could use pushf and popf; however, this won't really work in your case, since putchar expects the character on top of the stack, not the flags.
Now, to answer the actual problem which you didn't state. I'll assume you want to print 's' two times. Here's how to do it properly:
mov eax, 2 ; init counter
print_loop:
push eax ; save the counter since it will be trashed by putchar
push 's'
call putchar
add esp, 4 ; restore the stack pointer since putchar is cdecl
pop eax ; restore the saved counter
dec eax ; decrement it
jnz print_loop ; if it's not yet zero, do another loop
add esp, 4 can be replaced by another pop eax for slightly shorter code.
The result of doing a cmp is that flags get set, zf for zero, and so on. You can then either branch on whether or not the flag was set or use one of the set? instructions to have a value, e.g. the al register, set based on whether or not the flag was set.