Bogus Results from Simple Assembly Program on FreeBSD System - linux

I've been having problems getting even the simplest of assembly programs that I write on Linux to run on my FreeBSD machine. Here's the offending code (I'm trying to keep this as simple as possible):
#counts to sixty
.section .data
.section .text
.global _start
_start:
movl $1, %ecx #move $1 into ecx
movl $1, %eax
start_loop:
addl %ecx, %eax #add ecx to eax
cmpl $60, %eax #compare $60 and eax...
je end_loop #if eax = 60 go to end_loop
cmpl $60, %eax #
jle start_loop #jump if eax is < $60...
jmp start_loop #...to start_loop
end_loop:
movl %eax, %ebx #move the value of eax into ebx because ebx holds
#the return value
movb $1, %al #Move $1 into eax (int 1 is the value for the
#exit() syscall
int $0x80
The Linux machine returns the expected result, which is sixty, whereas the FreeBSD machine consistently returns 164 as the return code. Does anybody know why this is? If so, can you please explain to me what is happening? Also, I should mention that both machines are indeed running x86 CPUs. Thanks in advance :)

According to the FreeBSD Developer's Handbook, you need to do the following:
push %eax
mov $1, %eax
push %eax
int $0x80
because:
Only the system call number is passed in the %eax register; all arguments go on the stack.
The default FreeBSD syscall convention expects one additional word on top of the stack, pushed after the arguments. For an inlined int $0x80 that word is a dummy; when the syscall goes through a call kernel_entry trampoline (which can then do int $0x80; ret), it is the return address.
If you want to use the Linux convention (some syscall arguments in registers, called the "Alternate Calling Convention" in the handbook), you have to brand the executable, for example with brandelf -t Linux, so that the system knows you are using Linux-style syscalls.
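For comparison, here is a minimal C sketch of the same exit call made through libc rather than a hand-written int $0x80. The libc wrapper uses whatever convention the host kernel expects, so the same source should behave the same on Linux and FreeBSD. The helper name exit_status_of_child is hypothetical, and the fork/waitpid scaffolding is only there so that the status can be observed from inside one program.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with `code` through the
 * libc wrapper, then return the status the parent observes (or -1). */
int exit_status_of_child(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;              /* fork failed */
    if (pid == 0)
        _exit(code);            /* libc issues the exit syscall for us */

    int status = 0;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status); /* low 8 bits of the exit code */
}
```

Calling exit_status_of_child(60) should report 60 on either system, because no register or stack layout is hard-coded in the source.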

Related

Convert a string of digits to an integer by using a subroutine

Assembly language program to read in a (three-or-more-digit) positive integer as a string and convert the string to the actual value of the integer.
Specifically, create a subroutine to read in a number. Treat this as a string, though it will be composed of digits. Also, create a subroutine to convert a string of digits to an integer.
You do not have to handle input where someone thought i8xc was an integer.
I am doing it like this. Please help.
.section .data
String:
.asciz "1234"
Intg:
.long 0
.section .text
.global _start
_start:
movl $1, %edi
movl $String, %ecx
character_push_loop:
cmpb $0, (%ecx)
je conversion_loop
movzx (%ecx), %eax # move byte from (%ecx) to eax
pushl %eax # Push the byte on the stack
incl %ecx # move to next byte
jmp character_push_loop # loop back
conversion_loop:
popl %eax # pop off a character from the stack
subl $48, %eax # convert to integer
imul %edi, %eax # eax = eax*edi
addl %eax, Intg
imul $10, %edi
decl %ecx
cmpl $String, %ecx # check whether we are back at the front (%ecx == $String)
je end # When done jump to end
jmp conversion_loop
end:
pushl Intg
addl $8, %esp # clean up the stack
movl $0, %eax # return zero from program
ret
Also, I am unable to get the output. I am getting a Segmentation Fault. I am not able to find out what is the error in my code.
The program does not interact properly with the operating system.
At the end, you push the result, but the following addl $8, %esp discards the pushed value, and the final ret then transfers control to whatever was in memory at SS:ESP+4 at program entry. On Linux that slot holds the argv[0] pointer, not a return address, hence the segmentation fault.
Once you increase the stack pointer, you cannot rely on data below ESP surviving.
Your program also does not interact with its user. If you want it to print something, use the write system call, and terminate with the exit system call instead of ret.
print_String:
mov $4, %eax # System function "sys_write".
mov $1, %ebx # Handle of the standard output (console).
mov $String, %ecx # Pointer to the text string.
mov $4, %edx # Number of bytes to print.
int $0x80 # Invoke kernel function.
end:
mov $1, %eax # System function "sys_exit".
mov Intg, %ebx # Exit status is Intg (only its low 8 bits reach the parent).
int $0x80 # Invoke kernel function.
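The conversion itself can be sketched in C as follows. Note that this walks the string left to right, multiplying the running value by ten at each step, which is a different technique from the push/pop approach with growing powers of ten in the question. The function name digits_to_uint is hypothetical, and no input validation is done, as the exercise allows.

```c
#include <stddef.h>

/* Hypothetical helper: convert a NUL-terminated string of ASCII digits
 * to its integer value. No validation, as the exercise allows. */
unsigned int digits_to_uint(const char *s)
{
    unsigned int value = 0;
    for (size_t i = 0; s[i] != '\0'; i++)
        value = value * 10 + (unsigned int)(s[i] - '0'); /* '0' is 48 */
    return value;
}
```

Because no stack is needed for the characters, the equivalent assembly loop would not have to balance pushes and pops at all.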

Loading value at address into register

As a learning exercise, I've been handwriting assembly. I can't seem to figure out how to load the value of an address into a register.
Semantically, I want to do the following:
_start:
# read(0, buffer, 1)
mov $3, %eax # System call 3 is read
mov $0, %ebx # File handle 0 is stdin
mov $buffer, %ecx # Buffer to write to
mov $1, %edx # Length of buffer
int $0x80 # Invoke system call
lea (%ecx, %ecx), %edi # Pull the value at address into %edi
cmp $97, %edi # Compare to 'a'
je done
I've written a higher-level implementation in C:
char buffer[1];
int main()
{
read(0, buffer, 1);
char a = buffer[0];
return (a == 'a') ? 1 : 0;
}
But compiling with gcc -S produces assembly that doesn't port well into my implementation above.
I think lea is the right instruction I should be using to load the value at the given address stored in %ecx into %edi, but upon inspection in gdb, %edi contains a garbage value after this instruction is executed. Is this approach correct?
Instead of the lea instruction, what you need is:
movzbl (%ecx), %edi
That is, zero extending into the edi register the byte at the memory address contained in ecx.
_start:
# read(0, buffer, 1)
mov $3, %eax # System call 3 is read
mov $0, %ebx # File handle 0 is stdin
mov $buffer, %ecx # Buffer to write to
mov $1, %edx # Length of buffer
int $0x80 # Invoke system call
movzbl (%ecx), %edi # Pull the value at address ecx into edi
cmp $97, %edi # Compare to 'a'
je done
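The difference between the two instructions can be illustrated with a rough C analogy. The function names lea_like and load_zero_extended are made up for this sketch: lea only does arithmetic on the address, while movzbl actually reads the byte.

```c
#include <stdint.h>

/* What lea (%ecx,%ecx), %edi computes: base + index, i.e. address
 * arithmetic only. Memory is never read. */
uintptr_t lea_like(const unsigned char *p)
{
    return (uintptr_t)p + (uintptr_t)p;
}

/* What movzbl (%ecx), %edi computes: read one byte and zero-extend. */
uint32_t load_zero_extended(const unsigned char *p)
{
    return (uint32_t)*p;
}
```

This is why %edi looked like garbage in gdb: it held twice the buffer's address rather than the character that was read.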
Some advice
You don't really need the movz instruction: you don't need a separate load operation, since you can compare the byte in memory pointed by ecx directly with cmp:
cmpb $97, (%ecx)
You may want to specify the character to be compared against (i.e., 'a') as $'a' instead of $97 in order to improve readability:
cmpb $'a', (%ecx)
Avoiding conditional branches is usually a good idea. Immediately after performing the system call, you could use the following code that uses cmov for determining the return value, which is stored in eax, instead of performing a conditional jump (i.e., the je instruction):
xor %eax, %eax # set eax to zero
cmpb $'a', (%ecx) # compare to 'a'
cmovz %edx, %eax # conditionally move edx(=1) into eax
ret # eax is either 0 or 1 at this point
edx was set to 1 prior to the system call. Therefore, the approach above relies on the fact that edx is preserved across the system call (i.e., the int $0x80 instruction).
Even better, you could use sete on al after the comparison instead of the cmov:
xor %eax, %eax # set eax to zero
cmpb $'a', (%ecx) # compare to 'a'
sete %al # conditionally set al
ret # eax is either 0 or 1 at this point
The register al, which was set to zero by means of xor %eax, %eax, will be set to 1 if the ZF flag was set by the cmp (i.e., if the byte pointed by ecx is 'a'). With this approach you don't need to care about thinking whether the syscall preserves edx or not, since the outcome doesn't depend on edx.
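In C the same branch-free idea falls out of the language, since a comparison already evaluates to 0 or 1. The sketch below uses a hypothetical helper name; an optimizing x86 compiler will typically lower it to a cmp followed by sete, though that is not guaranteed.

```c
/* Hypothetical helper: 1 if the byte at p is 'a', else 0. */
int is_letter_a(const unsigned char *p)
{
    return *p == 'a';
}
```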

Assembler - adding big (128b) numbers (AT&T assembly syntax) - where to store results?

I am trying to add two 128-bit numbers using AT&T assembly syntax on 64-bit Ubuntu Linux. I am debugging it in gdb, so I know that after every loop iteration the result of adding two parts is correct, but how do I store all four results together? I was considering pushing every result onto the stack, but I can't push a register's contents onto the stack, right? Am I even doing this correctly? I am a real beginner in assembly, but I need it for uni :/
EXIT_SUCCESS = 0
SYSEXIT = 1
number1:
.long 0x10304008, 0x701100FF, 0x45100020, 0x08570030
number2:
.long 0xF040500C, 0x00220026, 0x321000CB, 0x04520031
.global _start
_start:
movl $4, %edx
clc
pushf
_loop:
dec %edx
movl number1(,%edx,4), %eax
movl number2(,%edx,4), %ebx
popf
adcl %ebx, %eax
cmp $0, %edx
jne _loop
popf
jnc _end
push $1
_end:
mov $SYSEXIT, %eax
mov $EXIT_SUCCESS, %ebx
int $0x80
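One way to think about where the results go is a four-word result buffer in memory, with each partial sum stored at the same index as its inputs. Below is a minimal C sketch of that idea, assuming (as the loop in the question does) that index 3 holds the least significant word. The name add128 and the sum array are hypothetical. In the assembly version the flags would also have to be saved with pushf right after adcl, since the cmp that follows overwrites the carry flag.

```c
#include <stdint.h>

/* Limb 0 is the most significant word, limb 3 the least significant,
 * matching the order of the .long lists in the question.
 * Writes a + b into sum and returns the final carry (0 or 1). */
unsigned int add128(const uint32_t a[4], const uint32_t b[4], uint32_t sum[4])
{
    unsigned int carry = 0;
    for (int i = 3; i >= 0; i--) {
        uint64_t t = (uint64_t)a[i] + b[i] + carry; /* like adcl */
        sum[i] = (uint32_t)t;       /* store this limb of the result */
        carry = (unsigned int)(t >> 32);
    }
    return carry;
}
```

For the two numbers in the question this gives the words 0x00709014, 0x70330125, 0x772000EB, 0x0CA90061 with a final carry of 1, so a fifth word (or a flag) is needed to hold the overflow.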

Segmentation fault in Assembly and string

I am trying to write a simple program in assembly, but I get a fault and do not understand why.
I have a 64-bit machine running Ubuntu 12.04 and use "as" as the assembler.
My goal merely is to print the string "Hello" on screen.
I wrote this:
#print.s
.section .data
.globl StringToPrint
StringToPrint: .asciz "Hello"
.globl _start
_start:
movq $4, %rax
movq $1, %rbx
movq $StringToPrint, %rcx
movq $5, %rdx
int $0x80
_done:
ret
But that's what I get:
$ as print.s -o print.o
$ ld print.o -o print
$ ./print
Hello[1] 10679 segmentation fault (core dumped) ./print
Why do you think this happens? Any idea?
Here is the fix:
#print.s
.section .data
.globl StringToPrint
StringToPrint: .asciz "Hello"
.section .text
.globl _start
_start:
movl $5, %edx # string length
movl $StringToPrint, %ecx # pointer to string to write
movl $1, %ebx # file handle (stdout)
movl $4, %eax # system call number (sys_write)
int $0x80 # Passes control to interrupt vector
#sys_exit (return_code)
movl $1, %eax #System call number 1: exit()
movl $0, %ebx #Exits with exit status 0
int $0x80 #Passes control to interrupt vector
As Michael has already said, you need to call sys_exit to avoid the segmentation fault.
Edit:
It is worth mentioning that int $0x80 invokes the 32-bit system call interface.
On x86-64 systems, int $0x80 is kept for backward compatibility so that 32-bit applications can run.
On 64-bit systems, the correct choice is the syscall instruction.
Here is a working version:
.section .data
StringToPrint: .asciz "Hello"
.section .text
.globl _start
_start:
movq $1, %rax # sys_write
movq $1, %rdi # stdout
movq $StringToPrint, %rsi # pointer to string to write
movq $5, %rdx # string length
syscall
movq $60, %rax # sys_exit
movq $0, %rdi # exit code
syscall
The calling conventions differ between 32 and 64 bit applications in Linux as well as other OSs. Additionally, for Linux the system call numbers are also different. This is how you invoke the write system call in Linux amd64:
; sys_write(stdout, message, length)
mov rax, 1 ; sys_write
mov rdi, 1 ; stdout
mov rsi, message ; message address
mov rdx, length ; message string length
syscall
Additionally, your application needs to call sys_exit to terminate, not return using ret. Read the calling conventions for your platform.
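If the goal is just to get the bytes out, a C sketch using the libc write wrapper avoids the 32-bit versus 64-bit numbering question entirely, since libc selects the right system call number and registers for the target. The helper name write_hello is hypothetical.

```c
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write "Hello" to fd through the libc wrapper.
 * Returns the number of bytes written, or -1 on error. */
ssize_t write_hello(int fd)
{
    static const char msg[] = "Hello";
    return write(fd, msg, strlen(msg));
}
```

A program would call write_hello(STDOUT_FILENO) and then return from main, letting the C runtime make the exit call.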

Segmentation Fault in Assembly Language

I am learning AT&T x86 assembly language. I am trying to write an assembly program that takes an integer n and returns the result (n/2+n/3+n/4). Here is what I have done:
.text
.global _start
_start:
pushl $24
call profit
movl %eax, %ebx
movl $1, %eax
int $0x80
profit:
popl %ebx
popl %eax
mov $0, %esi
movl $4, %ebp
div %ebp
addl %eax, %esi
movl %ecx, %eax
movl $3, %ebp
div %ebp
addl %eax, %esi
movl %ecx, %eax
movl $2, %ebp
div %ebp
addl %eax, %esi
movl %esi, %eax
cmpl %ecx, %esi
jg end
pushl %ebx
ret
end:
mov %ecx, %eax
ret
The problem is I am getting segmentation fault. Where is the problem?
I think the code fails here:
_start:
pushl $24
call profit
movl %eax, %ebx
movl $1, %eax
int $0x80
profit:
popl %ebx
popl %eax
So, you push $24 (4 bytes) and then call profit, which pushes eip and jumps to profit. Then you pop the value of eip into ebx and the value $24 into eax.
Then, in the end, if jg end branches to end:, then the stack won't hold a valid return address and ret will fail. You probably need pushl %ebx there too.
cmpl %ecx, %esi
jg end
pushl %ebx
ret
end:
mov %ecx, %eax
# `pushl %ebx` is needed here!
ret
You do not appear to be doing function calls correctly. You need to read and understand the x86 ABI (32-bit, 64-bit) particularly the "calling convention" sections.
Also, this is not your immediate problem, but: Don't write _start, write main as if this were a C program. When you start doing something more complicated, you will want the C library to be available, and that means you have to let it initialize itself. Relatedly, do not make your own system calls; call the wrappers in the C library. That insulates you from low-level changes in the kernel interface, ensures that errno is available, and so on.
You use ecx without ever explicitly initializing it (I'm not sure whether Linux guarantees the state of ecx when the process starts; it looks like it's 0 in practice, if not by rule).
When the program takes the jg end jump near the end of the procedure, the return address is no longer on the stack, so ret will transfer control to some garbage address.
Your problem is that you pop the return address off of the stack and when you branch to end you don't restore it. A quick fix is to add push %ebx there as well.
What you should do is modify your procedure so it uses the calling convention correctly. In Linux, the caller function is expected to clean the arguments from the stack, so your procedure should leave them where they are.
Instead of doing this to get the argument and then restoring the return address later
popl %ebx
popl %eax
You should do this and leave the return address and arguments where they are
movl 4(%esp), %eax
and get rid of the code that pushes the return address back onto the stack. You then should add
addl $4, %esp
after the call to the procedure to remove the argument from the stack. It's important to follow this convention correctly if you want to be able to call your assembly procedures from other languages.
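For reference, here is what the routine is meant to compute according to the question, written as a C sketch. It covers only the n/2+n/3+n/4 formula, not the extra comparison in the posted assembly. A compiler targeting the 32-bit convention described above reads n from the stack without popping it and leaves the cleanup to the caller.

```c
/* Hypothetical C counterpart of the profit routine: n/2 + n/3 + n/4
 * with unsigned integer division, as div performs. */
unsigned int profit(unsigned int n)
{
    return n / 2 + n / 3 + n / 4;
}
```

For the value pushed in the question, profit(24) is 12 + 8 + 6 = 26, which is the exit status the fixed assembly program should report.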
It looks to me like you have a single pushl before you call profit, and then the first thing profit does is execute two popl instructions. I would expect this to pop both the return address and the value you pushed onto the stack, so that your ret would not work.
push and pop should be the same number of times.
call pushes the return address onto the stack.

Resources