Introduction
I'm following the book "Programming from the Ground Up" and have worked through the example that creates a function to raise numbers to a power and adds the results: 2^3 + 5^2. The code assembles and links, but when I run the program I get a segmentation fault.
From my understanding, a segmentation fault occurs when a program attempts an illegal read from or write to a memory location. I think it could be occurring inside the function itself, but I can't tell where.
Source Code - power.s
#purpose illustrate how functions work. Program will compute 2^3 + 5^2
#using registers so nothing in data section
.section .data
.section .text
.globl _start
_start:
pushl $3 #push 2nd arg on stack
pushl $2 #push 1st arg on stack
call power
addl 8,%esp #move stack pointer back
pushl %eax #push result to top of stack
pushl $2 #push 2nd arg on stack
pushl $5 #push 1st arg on stack
call power
addl 8,%esp #move stack pointer back
popl %ebx #put function1 result into ebx reg
addl %eax , %ebx #add return result of function2 + function1 result
movl $1 , %eax #exit system call
int $0x80
#PURPOSE: power function
#REGISTERS: %ebx - holds base number ; %ecx - holds power; -4(%ebp) -holds current result ;%eax temp storage
.type power,@function
power:
pushl %ebp #save state of base pointer
movl %esp,%ebp #make stack pointer the base pointer
subl $4,%esp #room for local storage
movl 8(%ebp),%ebx #1st arg initialized,
movl 12(%ebp),%ecx #2nd arg initialized,
movl %ebx , -4(%ebp) #store current result
power_loop_start:
cmpl $1,%ecx #if ^1 then jump to end_power & exit
je end_power
movl -4(%ebp),%eax #store current result
imull %ebx,%eax #multiply
movl %eax,-4(%ebp) #store result
decl %ecx #decrement ecx
jmp power_loop_start #loop
end_power: #return
movl -4(%ebp) , %eax #move result in eax for return
movl %ebp , %esp #reset the stack pointer
popl %ebp #reset base pointer to original position
ret #return
Compiling
as --32 power.s -o power.o
ld -m elf_i386 power.o -o power
./power
Segmentation fault
Summary
A segmentation fault occurs somewhere in the code, but I'm not sure exactly where. I'm very new to assembly and have tried to explain this as best I can. I used "--32" because the code is 32-bit and I'm on a 64-bit machine.
*Also, if my question doesn't meet Stack Overflow standards, please let me know so I can improve.
Thanks to @Michael Petch for spotting the syntax error. In lines such as "addl 8,%esp" I left out the dollar sign that marks an immediate value. Without it, the assembler treats 8 as a memory address, so the instruction adds the contents of memory at address 8 to %esp instead of adding the constant 8. Thanks for helping.
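For comparison, the arithmetic the corrected program performs can be sketched in C. This is only a reference model, not the book's code: the name power mirrors the assembly label, power_sum is a hypothetical helper, and like the assembly loop the sketch assumes the exponent is at least 1.

```c
/* Reference model of the assembly routine: keep multiplying by base
 * until the exponent has been counted down to 1.
 * Assumption: exponent >= 1, as in the assembly version. */
int power(int base, int exponent)
{
    int result = base;           /* plays the role of -4(%ebp) */

    while (exponent > 1) {       /* the assembly stops when %ecx == 1 */
        result *= base;
        exponent--;
    }
    return result;               /* the assembly returns this in %eax */
}

/* Hypothetical helper: the value _start leaves in %ebx as the exit status. */
int power_sum(void)
{
    return power(2, 3) + power(5, 2);
}
```

With the dollar signs restored, running ./power and then echo $? should report 33, the same value power_sum() returns.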
Related
I know that it's putting 0 into ebx, but why? I'm sorry if this looks like a no-brainer question; it's my first week of learning assembly, after a few months of programming.
I haven't included everything below because it is quite long; let me know if more is necessary.
The assembly is from Chapter 6 of the book "Programming from the Ground Up".
summary of assembly:
Opens an input and output file, reads records from the input, increments the age, writes the new record to the output file
SYS_EXIT is 1
LINUX_SYSCALL is 0x80
loop_begin:
pushl ST_INPUT_DESCRIPTOR(%ebp)
pushl $record_buffer
call read_record
addl $8, %esp
# Returns the number of bytes read. If it isn’t the same number we requested, then it’s either an end-of-file, or an error, so we’re quitting
cmpl $RECORD_SIZE, %eax
jne loop_end
#Increment the age
incl record_buffer + RECORD_AGE
#Write the record out
pushl ST_OUTPUT_DESCRIPTOR(%ebp)
pushl $record_buffer
call write_record
addl $8, %esp
jmp loop_begin
loop_end:
movl $SYS_EXIT, %eax
movl $0, %ebx # <-- the instruction whose purpose I'm asking about
int $LINUX_SYSCALL
This is the equivalent of _exit(0); in C, except that the Linux kernel uses a different calling convention (parameters are passed in registers, not on the stack).
The movl $0, %ebx loads the exit status (0), the system call's first argument, into the register the kernel's calling convention expects. The system call number (SYS_EXIT) goes in %eax.
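The same exit call can be sketched from C with the syscall() wrapper. This is a Linux-only illustration under stated assumptions: raw_exit and child_exit_status are hypothetical helper names, and the wrapper uses whatever entry mechanism the platform provides rather than necessarily int $0x80.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>   /* SYS_exit */
#include <sys/types.h>     /* pid_t */
#include <sys/wait.h>      /* waitpid, WIFEXITED, WEXITSTATUS */
#include <unistd.h>        /* syscall, fork */

/* Hypothetical helper: end the process the way the assembly does, by
 * passing the system call number and the exit status to the kernel. */
static void raw_exit(int status)
{
    syscall(SYS_exit, status);   /* number in %eax, status in %ebx in the 32-bit asm */
}

/* Hypothetical helper: run raw_exit in a child process and return the
 * exit status the parent observes, or -1 on failure. */
int child_exit_status(int status)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        raw_exit(status);        /* child: does not return */

    int wstatus = 0;
    if (waitpid(pid, &wstatus, 0) < 0)
        return -1;
    return WIFEXITED(wstatus) ? WEXITSTATUS(wstatus) : -1;
}
```

The status passed to the kernel is what the parent (for example, the shell via echo $?) sees, which is why the book's program zeroes %ebx before the interrupt.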
# Given a number, this program computes its square
# For example, the square of 2 is 4
# This program shows how to call a function recursively
.section .data
#This program has no global data
.section .text
.globl _start
.globl square #this is unneeded unless we need to share this program
#among others
_start:
pushl $4 #The function takes one argument, the number we want
# the square of, so it gets pushed.
call square #run the square function
addl $4, %esp # Scrubs the parameter that was pushed on the stack
movl %eax, %ebx #square returns the answer in %eax, but we want it in %ebx to send it as the exit status
movl $1, %eax #call the kernel's exit function
int $0x80
#This function is meant to compute the square of a number
# It takes one argument and returns its square
.type square, @square
square:
pushl %ebp #standard function stuff -we have to
#restore %ebp to its prior state before
#returning, so we have to push it.
movl %esp,%ebp #This is because we don't want to modify
#the stack pointer, so we use %ebp
movl 8(%ebp), %eax #This moves the first argument to %eax
#4(%ebp) holds the return address, and
#8(%ebp) holds the first parameter.
cmpl $1,%eax #If the number is 1, that is our base
#case, and we simply return(1 is
#already in %eax as the return value.)
je end_square
pushl %eax #Push it for call to square
call square #call square
movl 8(%ebp),%ebx #%eax has the return value, so we
#reload our parameter into %ebx
imull %ebx,%eax #multiply that by the result of the last call
#to square(in %eax) the answer is stored in %eax,
#which is good since that's where return values go.
end_square:
movl %ebp, %esp # standard function stuff-we have to restore %ebp and
popl %ebp # %esp to where they were were before the function started
ret # return from the function (this pops the # return value, too)
Try replacing .type square, @square (which makes no sense) with .type square, @function.
For future reference - don't just put a title on an unformatted code dump. You should take the time to read about the site.
I am learning AT&T x86 assembly language. I am trying to write an assembly program which takes an integer n and returns the result (n/2+n/3+n/4). Here is what I have done:
.text
.global _start
_start:
pushl $24
call profit
movl %eax, %ebx
movl $1, %eax
int $0x80
profit:
popl %ebx
popl %eax
mov $0, %esi
movl $4, %ebp
div %ebp
addl %eax, %esi
movl %ecx, %eax
movl $3, %ebp
div %ebp
addl %eax, %esi
movl %ecx, %eax
movl $2, %ebp
div %ebp
addl %eax, %esi
movl %esi, %eax
cmpl %ecx, %esi
jg end
pushl %ebx
ret
end:
mov %ecx, %eax
ret
The problem is I am getting segmentation fault. Where is the problem?
I think the code fails here:
_start:
pushl $24
call profit
movl %eax, %ebx
movl $1, %eax
int $0x80
profit:
popl %ebx
popl %eax
So, you push $24 (4 bytes) and then call profit, which pushes eip and jumps to profit. Then you pop the value of eip into ebx and the value $24 into eax.
Then, in the end, if jg end branches to end:, then the stack won't hold a valid return address and ret will fail. You probably need pushl %ebx there too.
cmpl %ecx, %esi
jg end
pushl %ebx
ret
end:
mov %ecx, %eax
pushl %ebx # needed here too: put the return address back before ret
ret
You do not appear to be doing function calls correctly. You need to read and understand the x86 ABI (32-bit, 64-bit) particularly the "calling convention" sections.
Also, this is not your immediate problem, but: Don't write _start, write main as if this were a C program. When you start doing something more complicated, you will want the C library to be available, and that means you have to let it initialize itself. Relatedly, do not make your own system calls; call the wrappers in the C library. That insulates you from low-level changes in the kernel interface, ensures that errno is available, and so on.
You use ecx without ever explicitly initializing it (I'm not sure whether Linux guarantees the state of ecx when the process starts; it looks like it's 0 in practice, if not by rule).
When the program takes the jg end jump near the end of the procedure, the return address is no longer on the stack, so ret will transfer control to some garbage address.
Your problem is that you pop the return address off of the stack and when you branch to end you don't restore it. A quick fix is to add push %ebx there as well.
What you should do is modify your procedure so it uses the calling convention correctly. In Linux, the caller function is expected to clean the arguments from the stack, so your procedure should leave them where they are.
Instead of doing this to get the argument and then restoring the return address later
popl %ebx
popl %eax
You should do this and leave the return address and arguments where they are
movl 4(%esp), %eax
and get rid of the code that pushes the return address back onto the stack. You then should add
addl $4, %esp
after the call to the procedure to remove the argument from the stack. It's important to follow this convention correctly if you want to be able to call your assembly procedures from other languages.
It looks to me like you have a single pushl before you call profit, and then the first thing profit does is execute two popl instructions. That pops the return address as well as the value you pushed, so your ret will not work.
Every push should be matched by exactly one pop.
Remember that call pushes the return address onto the stack.
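As a reference for the value the procedure is meant to produce once its stack handling is fixed, the formula from the question can be written in C. This is a sketch of the stated formula only; it does not model the extra comparison against %ecx in the assembly, and unsigned arithmetic is an assumption chosen to match the unsigned div instruction.

```c
/* Reference model of the formula in the question.
 * Integer division truncates, as div does. */
unsigned int profit(unsigned int n)
{
    return n / 2 + n / 3 + n / 4;
}
```

For the pushed value 24 this gives 12 + 8 + 6 = 26.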
Given the following code :
.globl main
.type main, @function
input: .string "%d"
main:
pushl %ebp # save the old frame pointer
movl %esp,%ebp # create the new frame pointer
movl $0,%eax
addl $-4 ,%esp # moving down the stack
pushl %esp # push the address of esp to the stack in order to store the number given by the user
pushl $input # push to the stack the format of the input
call scanf # call scanf to get a number from the user
addl $8,%esp # clear the stack
movl (%esp),%eax # get the selection from the user
subl $50,%eax
jmp *.switching(,%eax,4)
.section .rodata
.align 4
.switching:
.long .L1
.long .L2
.long .L3
.long .L4
.text
.L1:
call case1
jmp .quitTheProgram
.L2:
call case2
jmp .quitTheProgram
.L3:
call case3
jmp .quitTheProgram
.L4:
call case4
jmp .quitTheProgram
case1:
pushl %ebp # save the old frame pointer
movl %esp,%ebp # create the new frame pointer
#
# code of case1
#
movl %ebp,%esp # restore the old stack pointer and release all used memory
popl %ebp # restore the old frame pointer
ret # return to caller function (OS)
The user enters a number between 50 and 54. The problem is that after entering (for example) 50,
I jump to case1, but not to the code itself: execution goes straight to the ret line, and then case1 exits (the same happens for the rest of the cases).
What might be the problem?
Regards, Ron
The problem is after pressing (for example) 50 I jump to case1 , but not to the code itself , but straight to the ret line
I just built your code on Linux, stepped through it in GDB, and did not observe that behavior.
It is somewhat likely that you are mis-interpreting what you actually observed.
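For readers comparing against C, the jmp *.switching(,%eax,4) dispatch in the question corresponds roughly to indexing an array of function pointers. The sketch below is an illustration, not the asker's code: dispatch and the stub bodies are hypothetical, and the bounds check is an addition that the assembly does not have. Note that the table holds four entries, so only selections 50 to 53 map to a case.

```c
/* Hypothetical stubs standing in for the case1..case4 routines. */
static int case1(void) { return 1; }
static int case2(void) { return 2; }
static int case3(void) { return 3; }
static int case4(void) { return 4; }

typedef int (*case_fn)(void);

/* Plays the role of the .switching table in .rodata. */
static const case_fn switching[] = { case1, case2, case3, case4 };

/* Subtract 50 from the selection and use the result as a table index.
 * Returns -1 when the selection falls outside the table. */
int dispatch(int selection)
{
    int index = selection - 50;                     /* subl $50,%eax */
    int count = (int)(sizeof switching / sizeof switching[0]);

    if (index < 0 || index >= count)
        return -1;                                  /* not checked in the assembly */
    return switching[index]();                      /* indirect jump through the table */
}
```

Without such a check, an out-of-range selection makes the indirect jump read an address from whatever follows the table.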
I'm trying to learn basic assembly. I wrote a simple program in C to translate to assembly:
void myFunc(int x, int y) {
int z;
}
int main() {
myFunc(20, 10);
return 0;
}
This is what I thought the correct translation of the function would be:
.text
.globl _start
.type myFunc, @function
myFunc:
pushl %ebp #Push old ebp register on to stack
movl %esp, %ebp #Move esp into ebp so we can reference vars
sub $4, %esp #Subtract 4 bytes from esp to make room for 'z' var
movl $2, -4(%ebp) #Move value 2 into 'z'
movl %ebp, %esp #Restore esp
popl %ebp #Set ebp to 0?
ret #Restore eip and jump to next instruction
_start:
pushl $10 #Push 10 onto stack for 'y' var
pushl $20 #Push 20 onto stack for 'x' var
call myFunc #Jump to myFunc (this pushes ret onto stack)
add $8, %esp #Restore esp to where it was before
movl $1, %eax #Exit syscall
movl $0, %ebx #Return 0
int $0x80 #Interrupt
Just to double check it I ran it in gdb and was confused by the results:
(gdb) disas myFunc
Dump of assembler code for function myFunc:
0x08048374 <myFunc+0>: push ebp
0x08048375 <myFunc+1>: mov ebp,esp
0x08048377 <myFunc+3>: sub esp,0x10
0x0804837a <myFunc+6>: leave
0x0804837b <myFunc+7>: ret
End of assembler dump.
Why at 0x08048377 did gcc subtract 0x10 (16 bytes) from the stack when an integer is 4 bytes in length?
Also, is the leave instruction equivalent to the following?
movl %ebp, %esp #Restore esp
popl %ebp #Set ebp to 0?
Using:
gcc version 4.3.2 (Debian 4.3.2-1.1)
GNU gdb 6.8-debian
Depending on the platform, GCC may choose different stack alignments; this can be overridden, but doing so can make the program slower or crash. The default -mpreferred-stack-boundary=4 keeps the stack aligned to 16-byte addresses. Assuming the stack pointer is already suitably aligned at the start of the function, it will remain so aligned after sub $0x10, %esp.
leave is an x86 macro-instruction which is equivalent to mov %ebp, %esp; pop %ebp.
Your GDB is configured to print out Intel instead of AT&T assembly syntax - turn that off before it confuses you more than it already has.
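The rounding involved can be illustrated with a small C helper. This is a simplified model of rounding a frame size up to an alignment boundary, not GCC's actual frame layout logic; align_up is a hypothetical name, and the alignment is assumed to be a power of two (16 corresponds to -mpreferred-stack-boundary=4, since 2^4 = 16).

```c
#include <stddef.h>

/* Round size up to the next multiple of alignment.
 * Assumption: alignment is a power of two. */
size_t align_up(size_t size, size_t alignment)
{
    return (size + alignment - 1) & ~(alignment - 1);
}
```

Under this model, the 4 bytes needed for one int round up to align_up(4, 16) == 16, which matches the 0x10 in the disassembly.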
By default, GCC tries to keep the stack pointer (%esp) aligned to a 16-byte boundary. That's probably where the sub esp,0x10 is coming from. (It's unnecessary here, but GCC has historically been bad at noticing that stack adjustments are unnecessary.) Also, your function doesn't do anything interesting, so the body has been optimized out. You should have compiled this code:
int myFunc(int x, int y)
{
return x + y;
}
int main(void)
{
return myFunc(20, 30);
}
That'll produce assembly language that's easier to map back to the original C. GCC would still be allowed to produce
main:
movl $50,%eax
ret
and nothing else, but it probably won't unless you use -O3 -fwhole-program ;-)