English is not my first language, so apologies for any misspellings. I'm having some trouble understanding the stack. All of the code I post here works.
This code, for example, is simple, and I understand its stack layout.
.globl f
f:
push %ebx
movl 8(%esp), %eax
movl 12(%esp), %ebx
addl %ebx, %eax
ret
STACK
-------
VAR Y --> ESP + 12
-------
VAR X --> ESP + 8
-------
RET --> RETURN
-------
%EBX --> %ESP
-------
But with this code I have some trouble:
.code32
.globl f
f:
pushl %ebx
movl 8(%esp), %ebx
subl $8, %esp # Make room on the stack for the arguments
movl $1, (%esp)
movl $2, 4(%esp)
call a
addl %ebx, %eax
addl $8, %esp # Release the stack space again
popl %ebx
ret
The code works, but I have a question about it: where are %ebx and the return address on the stack now?
The assembly translated to C:
int f(int x){
return x + a(1, 2);
}
And this is the stack layout that I worked out:
STACK
--------
8(%esp) --> x, parameter of function f
--------
4(%esp) --> 2, second parameter of function a
--------
(%esp) --> 1, first parameter of function a
--------
So the question is: where are %ebx and the return address on this stack?
The first code will return to the old ebx value (probably not a valid address) instead of the original return address, because it is missing a pop %ebx before the ret.
In the second code, the memory at address ss:esp, just before the call a instruction, contains:
dword 1 +0 (current esp)
dword 2 +4
dword old_ebx_value +8
dword return_address_from_f +12
dword x +16
... older stack content ...
Your "esp+x" notation doesn't work on its own, because esp changes dynamically. If you want to describe the stack like that, you have to say at which position in the code (for which value of esp) the offsets apply. E.g. at the entry of f, mov eax,[esp+4] loads "x", but just one instruction later, after push ebx, the same thing is achieved by mov eax,[esp+8] (Intel syntax; convert it to that "machine" gas/AT&T syntax yourself, I'm human).
Even if you picture it as memory values, the picture changes with every push or write to memory, so you still have to specify at which point of execution you are describing the stack. I did that for the point just before call a, because after call a the address of the instruction addl %ebx, %eax is written just below the value 1 (at the next lower address), and the code at a is not shown in the question.
Anyway, the old ebx and the return address stay at the same place in memory the whole time (unless a overwrites them); the content does not move. It is the pointer esp that is adjusted by push/pop/add/sub. The memory content stays for some undefined period of time even after you pop it; it is just not safe to assume how long it takes other code to overwrite it. If software interrupt handlers use the app stack, it may be overwritten at any time, although in x86 32-bit mode the app usually has its own stack, so those values will probably stay there until you overwrite them with the next push, call, or some other write.
Finally, just get those things to compile and run them in a debugger. Put the memory view at ss:esp-32 at the beginning, and watch how the memory is written to by instructions like call or push, and how esp changes to point to the "top of stack". It's usually much easier to watch it in a debugger than to read text like my answer.
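If a debugger is not at hand, the same bookkeeping can be replayed in a few lines of C. This is only a rough sketch: the Stack type and the helper names (push32, load32, store32, replay_f_until_call) are made up for illustration, and the model assumes 4-byte slots and a stack that grows toward lower addresses, as on 32-bit x86.

```c
#include <stdint.h>

enum { STACK_WORDS = 16 };  /* hypothetical size of the simulated stack */

typedef struct {
    uint32_t mem[STACK_WORDS]; /* simulated stack memory, 4 bytes per slot */
    uint32_t esp;              /* byte offset of the current top of stack */
} Stack;

/* Like `pushl`: move esp down by 4, then store the value at the new top. */
void push32(Stack *s, uint32_t value) {
    s->esp -= 4;
    s->mem[s->esp / 4] = value;
}

/* Like `movl off(%esp), reg`: read the dword at esp + off. */
uint32_t load32(const Stack *s, uint32_t off) {
    return s->mem[(s->esp + off) / 4];
}

/* Like `movl $value, off(%esp)`: write the dword at esp + off. */
void store32(Stack *s, uint32_t off, uint32_t value) {
    s->mem[(s->esp + off) / 4] = value;
}

/* Replays the second version of f up to the point just before `call a`. */
void replay_f_until_call(Stack *s, uint32_t x, uint32_t ret_addr,
                         uint32_t old_ebx) {
    s->esp = STACK_WORDS * 4; /* empty stack: esp just past the last slot */
    push32(s, x);             /* the caller pushes the argument x */
    push32(s, ret_addr);      /* the caller's `call f` pushes the return address */
    push32(s, old_ebx);       /* pushl %ebx */
    s->esp -= 8;              /* subl $8, %esp */
    store32(s, 0, 1);         /* movl $1, (%esp) */
    store32(s, 4, 2);         /* movl $2, 4(%esp) */
}
```

After replay_f_until_call, load32 with offsets 0, 4, 8, 12 and 16 returns 1, 2, the old ebx, the return address and x, which is the layout listed above. The values never move; only esp does.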
I know that it's putting 0 into ebx, but why? I'm sorry if this looks like a no-brainer question; it's my first week of learning assembly and I have only a few months of programming experience.
I haven't included everything below because it is quite long; let me know if it's necessary.
The assembly is from the book "Programming from the Ground Up", Chapter 6.
Summary of the assembly:
Opens an input and an output file, reads records from the input, increments the age, and writes the new record to the output file.
SYS_EXIT is 1
LINUX_SYSCALL is 0x80
loop_begin:
pushl ST_INPUT_DESCRIPTOR(%ebp)
pushl $record_buffer
call read_record
addl $8, %esp
# Returns the number of bytes read. If it isn’t the same number we requested, then it’s either an end-of-file, or an error, so we’re quitting
cmpl $RECORD_SIZE, %eax
jne loop_end
#Increment the age
incl record_buffer + RECORD_AGE
#Write the record out
pushl ST_OUTPUT_DESCRIPTOR(%ebp)
pushl $record_buffer
call write_record
addl $8, %esp
jmp loop_begin
loop_end:
movl $SYS_EXIT, %eax
movl $0, %ebx # <-- the instruction whose purpose I'm asking about
int $LINUX_SYSCALL
This is the equivalent of _exit(0); in C, except that the Linux kernel uses a different calling convention (parameters are passed in registers, not on the stack).
The movl $0, %ebx loads the exit status (0) into the register where the kernel's calling convention expects it. %eax holds the system call number (SYS_EXIT), and %ebx holds the system call's first argument.
Introduction
I'm following the book "Programming from the Ground Up" and have followed its example of writing a function that raises a number to a power and adds two results: "2^3 + 5^2". However, when I assemble and link the code and then run the program, I get a segmentation fault.
From my understanding, a segmentation fault occurs when a program attempts an illegal read or write of a memory location. I think it could be occurring inside the function itself, but I can't tell where.
Source Code - power.s
#purpose illustrate how functions work. Program will compute 2^3 + 5^2
#using registers so nothing in data section
.section .data
.section .text
.globl _start
_start:
pushl $3 #push 2nd arg on stack
pushl $2 #push 1st arg on stack
call power
addl 8,%esp #move stack pointer back
pushl %eax #push result to top of stack
pushl $2 #push 2nd arg on stack
pushl $5 #push 1st arg on stack
call power
addl 8,%esp #move stack pointer back
popl %ebx #put function1 result into ebx reg
addl %eax , %ebx #add return result of function2 + function1 result
movl $1 , %eax #exit system call
int $0x80
#PURPOSE: power function
#REGISTERS: %ebx - holds base number ; %ecx - holds power; -4(%ebp) -holds current result ;%eax temp storage
.type power, @function
power:
pushl %ebp #save state of base pointer
movl %esp,%ebp #make stack pointer the base pointer
subl $4,%esp #room for local storage
movl 8(%ebp),%ebx #1st arg initialized,
movl 12(%ebp),%ecx #2nd arg initialized,
movl %ebx , -4(%ebp) #store current result
power_loop_start:
cmpl $1,%ecx #if ^1 then jump to end_power & exit
je end_power
movl -4(%ebp),%eax #store current result
imull %ebx,%eax #multiply
movl %eax,-4(%ebp) #store result
decl %ecx #decrement ecx
jmp power_loop_start #loop
end_power: #return
movl -4(%ebp) , %eax #move result in eax for return
movl %ebp , %esp #reset the stack pointer
popl %ebp #reset base pointer to original position
ret #return
Compiling
as --32 power.s -o power.o
ld -m elf_i386 power.o -o power
./power
Segmentation fault
Summary
A segmentation fault occurs somewhere in the code, and I'm not sure exactly where. I'm very new to assembly and have tried to explain this as best I can. By the way, I used "--32" because the code is 32-bit and I'm on a 64-bit machine.
*Also, if my question doesn't meet Stack Overflow standards, please let me know so I can improve it.
Thanks to @Michael Petch for spotting the syntax error. In lines such as "addl 8,%esp" I left out the dollar sign, which marks the operand as an immediate value. Without it, the assembler treats 8 as a memory address, so the instruction reads from address 8 and faults. The correct form is "addl $8,%esp". Thanks for helping.
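For comparison, here is a rough C sketch of what power.s is meant to compute. It is not from the book; it mirrors the assembly loop line by line and, like the assembly, assumes the exponent is at least 1.

```c
/* C sketch of the loop in power.s; assumes exponent >= 1, as the assembly does. */
int power(int base, int exponent) {
    int result = base;          /* movl %ebx, -4(%ebp): result starts as the base */
    while (exponent != 1) {     /* cmpl $1, %ecx / je end_power */
        result *= base;         /* imull %ebx, %eax */
        exponent--;             /* decl %ecx */
    }
    return result;              /* movl -4(%ebp), %eax */
}
```

With this definition, power(2, 3) + power(5, 2) is 8 + 25 = 33, which is the value the assembly program passes to the exit system call in %ebx. Running `echo $?` right after the program should therefore show 33.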
When I write a simple assembly language program, linked with the C library, using gcc 4.6.1 on Ubuntu, and I try to print an integer, it works fine:
.global main
.text
main:
mov $format, %rdi
mov $5, %rsi
mov $0, %rax
call printf
ret
format:
.asciz "%10d\n"
This prints 5, as expected.
But now if I make a small change, and try to print a floating point value:
.global main
.text
main:
mov $format, %rdi
movsd x, %xmm0
mov $1, %rax
call printf
ret
format:
.asciz "%10.4f\n"
x:
.double 15.5
This program seg faults without printing anything. Just a sad segfault.
But I can fix this by pushing and popping %rbp.
.global main
.text
main:
push %rbp
mov $format, %rdi
movsd x, %xmm0
mov $1, %rax
call printf
pop %rbp
ret
format:
.asciz "%10.4f\n"
x:
.double 15.5
Now it works, and prints 15.5000.
My question is: why did pushing and popping %rbp make the application work? According to the ABI, %rbp is one of the registers that the callee must preserve, and so printf cannot be messing it up. In fact, printf worked in the first program, when only an integer was passed to printf. So the problem must be elsewhere?
I suspect the problem doesn't have anything to do with %rbp, but rather has to do with stack alignment. To quote the ABI:
The ABI requires that stack frames be aligned on 16-byte boundaries. Specifically, the end of
the argument area (%rbp+16) must be a multiple of 16. This requirement means that the frame
size should be padded out to a multiple of 16 bytes.
The stack is 16-byte aligned at the point where main() is called. That call pushes an 8-byte return address, so on entry to main() the stack pointer is 8 bytes off a 16-byte boundary, and it is still misaligned when you call printf(). You restore the alignment by pushing another eight bytes onto the stack (which happen to be %rbp but could just as easily be something else).
Here is the code that gcc generates (also on the Godbolt compiler explorer):
.LC1:
.ascii "%10.4f\12\0"
main:
leaq .LC1(%rip), %rdi # format string address
subq $8, %rsp ### align the stack by 16 before a CALL
movl $1, %eax ### 1 FP arg being passed in a register to a variadic function
movsd .LC0(%rip), %xmm0 # load the double itself
call printf
xorl %eax, %eax # return 0 from main
addq $8, %rsp
ret
As you can see, it deals with the alignment requirements by subtracting 8 from %rsp at the start, and adding it back at the end.
You could instead do a dummy push/pop of whatever register you like instead of manipulating %rsp directly; some compilers do use a dummy push to align the stack because this can actually be cheaper on modern CPUs, and saves code size.
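The arithmetic behind this can be checked with a small C sketch. The helper names below are hypothetical, and the model only assumes that %rsp is a multiple of 16 immediately before a call instruction and that call, push, and subq $8 each move %rsp down by 8 bytes.

```c
#include <stdint.h>

/* Returns nonzero when rsp is a multiple of 16. */
int is_aligned16(uint64_t rsp) {
    return (rsp & 15u) == 0;
}

/* Model of `call`: pushes an 8-byte return address. */
uint64_t after_call(uint64_t rsp) {
    return rsp - 8;
}

/* Model of `push %rbp` or `subq $8, %rsp`: moves rsp down by another 8 bytes. */
uint64_t after_adjust8(uint64_t rsp) {
    return rsp - 8;
}
```

Starting from any 16-byte-aligned value, after_call gives a misaligned %rsp (the state on entry to a function), and one further after_adjust8 gives an aligned value again, which is what both the push %rbp version and gcc's subq $8, %rsp achieve before call printf.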
I am trying to read command line arguments with assembly code for IA 32. I found an explanation of how to do it here http://www.paladingrp.com/ia32.shtml. I am able to use the stack pointer to get the number of arguments but I am not able to get the value of the arguments.
Here is what I am trying to do:
movl 8(%esp), %edx # Move pointer to argument 1 to edx
movl (%edx), %ebx # Move value of edx to ebx
movl $1, %eax # opcode for exit system call in eax
int $0x80 # return
Am I getting the correct pointer? If so, how do I get the value of it? If not, how do I get the correct pointer?
movl (%edx), %ebx # Move value of edx to ebx
That doesn't move the value of EDX to EBX (the comment is incorrect).
It dereferences the pointer in EDX and puts the four bytes it points to into EBX. So if you invoked your program with ./a.out foo, then EBX will end up being 0x006f6f66 (== '\0oof' ("foo\0" in little-endian)).
I am guessing that's not what you wanted, but your question is not very clear about what you are expecting to happen where.
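To see where 0x006f6f66 comes from, here is a small C sketch of a 32-bit little-endian load. The function name load_le32 is made up for illustration; it assembles the value byte by byte, so it gives the same result on any host.

```c
#include <stdint.h>

/* Reads four bytes starting at p the way a 32-bit x86 load does:
   the byte at the lowest address becomes the least significant byte. */
uint32_t load_le32(const unsigned char *p) {
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

Calling load_le32 on the bytes of "foo" (including its terminating NUL) yields 0x006f6f66: 'f' is 0x66 and lands in the low byte, the two 'o' characters (0x6f) follow, and the NUL ends up in the top byte.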
I'm trying to learn basic assembly. I wrote a simple program in C to translate to assembly:
void myFunc(int x, int y) {
int z;
}
int main() {
myFunc(20, 10);
return 0;
}
This is what I thought the correct translation of the function would be:
.text
.globl _start
.type myFunc, @function
myFunc:
pushl %ebp #Push old ebp register on to stack
movl %esp, %ebp #Move esp into ebp so we can reference vars
sub $4, %esp #Subtract 4 bytes from esp to make room for 'z' var
movl $2, -4(%ebp) #Move value 2 into 'z'
movl %ebp, %esp #Restore esp
popl %ebp #Set ebp to 0?
ret #Restore eip and jump to next instruction
_start:
pushl $10 #Push 10 onto stack for 'y' var
pushl $20 #Push 20 onto stack for 'x' var
call myFunc #Jump to myFunc (this pushes ret onto stack)
add $8, %esp #Restore esp to where it was before
movl $1, %eax #Exit syscall
movl $0, %ebx #Return 0
int $0x80 #Interrupt
Just to double check it I ran it in gdb and was confused by the results:
(gdb) disas myFunc
Dump of assembler code for function myFunc:
0x08048374 <myFunc+0>: push ebp
0x08048375 <myFunc+1>: mov ebp,esp
0x08048377 <myFunc+3>: sub esp,0x10
0x0804837a <myFunc+6>: leave
0x0804837b <myFunc+7>: ret
End of assembler dump.
Why at 0x08048377 did gcc subtract 0x10 (16 bytes) from the stack when an integer is 4 bytes in length?
Also, is the leave instruction equivalent to the following?
movl %ebp, %esp #Restore esp
popl %ebp #Set ebp to 0?
Using:
gcc version 4.3.2 (Debian 4.3.2-1.1)
GNU gdb 6.8-debian
Depending on the platform, GCC may choose different stack alignments; this can be overridden, but doing so can make the program slower or crash. The default -mpreferred-stack-boundary=4 keeps the stack aligned to 16-byte addresses. Assuming the stack pointer is already suitably aligned at the start of the function, it will remain so aligned after sub $0x10, %esp.
leave is an x86 macro-instruction which is equivalent to mov %ebp, %esp; pop %ebp.
Your GDB is configured to print out Intel instead of AT&T assembly syntax - turn that off before it confuses you more than it already has.
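The equivalence of leave with mov %ebp, %esp; pop %ebp can be traced with a toy model in C. This is a sketch under stated assumptions: the Cpu type and the cpu_* helpers are hypothetical, slots are 4 bytes, and only esp, ebp and a small stack are modeled.

```c
#include <stdint.h>

typedef struct {
    uint32_t mem[16]; /* simulated stack memory, 4 bytes per slot */
    uint32_t esp;     /* byte offset of the top of stack */
    uint32_t ebp;     /* frame pointer (any value on entry) */
} Cpu;

/* Like `pushl`: move esp down by 4, then store the value. */
void cpu_push(Cpu *c, uint32_t value) {
    c->esp -= 4;
    c->mem[c->esp / 4] = value;
}

/* Like `popl`: load the value at the top, then move esp up by 4. */
uint32_t cpu_pop(Cpu *c) {
    uint32_t value = c->mem[c->esp / 4];
    c->esp += 4;
    return value;
}

/* push %ebp; mov %esp, %ebp; sub $locals, %esp */
void cpu_prologue(Cpu *c, uint32_t locals) {
    cpu_push(c, c->ebp);
    c->ebp = c->esp;
    c->esp -= locals;
}

/* leave: mov %ebp, %esp; pop %ebp */
void cpu_leave(Cpu *c) {
    c->esp = c->ebp;
    c->ebp = cpu_pop(c);
}
```

Whatever size is passed to cpu_prologue (4 or 0x10), cpu_leave brings esp and ebp back to the values they had on entry, so the extra reserved bytes never need to be undone separately.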
By default, GCC keeps the stack pointer (%esp) aligned to a 16-byte boundary. That's probably where the sub esp,0x10 is coming from. (It's unnecessary here, but GCC has historically been bad at noticing that stack adjustments are unnecessary.) Also, your function doesn't do anything interesting, so the body has been optimized out. You should have compiled this code:
int myFunc(int x, int y)
{
return x + y;
}
int main(void)
{
return myFunc(20, 30);
}
That'll produce assembly language that's easier to map back to the original C. GCC would still be allowed to produce
main:
movl $50,%eax
ret
and nothing else, but it probably won't unless you use -O3 -fwhole-program ;-)