The assembly code (x86) with jumps and a syscall read function - linux

I would like to ask for help understanding some assembly code. My problem is this:
the code after the label L2 is important because it calls a subroutine. But it seems to me that the program can never reach the code after L2: as far as I can tell, the read syscall (after L1) always reads 0, and the result is then compared with 1. Zero never equals one, so it seems to me the program never jumps to L2. I guess I must be wrong. I would really appreciate any help.
jmp L1
L2:
movzbl -0x11(%ebp), %eax
movsbl %al, %eax
mov %eax, (%esp)
call SUBROUTINE_FNC
<...>
L1:
mov $0x0, %ebx
lea -0x11(%ebp), %ecx
mov $0x1, %edx
mov $0x3, %eax
int $0x80
mov %eax, -0x10(%ebp)
cmpl $0x1, -0x10(%ebp)
je L2

The syscall corresponds to read (number 3 in %eax) and it looks like you are trying to read one byte at a time. The 0 in %ebx is the file descriptor (stdin), not the result. read returns the number of bytes actually read in %eax, so if the call is successful you will get a return value of 1, the comparison will be true, and you will jump to L2, i.e.
goto L1;
L2:
SUBROUTINE_FNC(...);
L1:
if (read(fd, buff, 1) == 1) // read one byte
goto L2; // if one byte was read, loop to L2
or, in a more structured form:
while (read(fd, buff, 1) == 1)
{
SUBROUTINE_FNC(...);
}
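The structured loop above could be sketched as a complete C function along these lines. This is only an illustration: consume_bytes and handle_byte are made-up names (handle_byte stands in for SUBROUTINE_FNC), and the file descriptor is passed in rather than fixed to 0.

```c
#include <stddef.h>
#include <unistd.h>

/* Hypothetical stand-in for SUBROUTINE_FNC: receives one byte as an int. */
static void handle_byte(int c)
{
    (void)c;
}

/* Reads one byte at a time until read() stops returning 1
   (0 means end of input, -1 means error).
   Returns the number of bytes that were handed to handle_byte. */
size_t consume_bytes(int fd)
{
    char byte;
    size_t handled = 0;

    while (read(fd, &byte, 1) == 1) {
        handle_byte(byte);
        handled++;
    }
    return handled;
}
```

The loop body runs once per byte that read actually delivers, which is exactly the case where the je L2 in the assembly is taken.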

Related

Convert a string of digits to an integer by using a subroutine

Assembly language program to read in a (three-or-more-digit) positive integer as a string and convert the string to the actual value of the integer.
Specifically, create a subroutine to read in a number. Treat this as a string, though it will be composed of digits. Also, create a subroutine to convert a string of digits to an integer.
Do not have to test for input where someone thought i8xc was an integer.
I am doing it like this. Please help.
.section .data
String:
.asciz "1234"
Intg:
.long 0
.section .text
.global _start
_start:
movl $1, %edi
movl $String, %ecx
character_push_loop:
cmpb $0, (%ecx)
je conversion_loop
movzx (%ecx), %eax # move byte from (%ecx) to eax
pushl %eax # Push the byte on the stack
incl %ecx # move to next byte
jmp character_push_loop # loop back
conversion_loop:
popl %eax # pop off a character from the stack
subl $48, %eax # convert to integer
imul %edi, %eax # eax = eax*edi
addl %eax, Intg
imul $10, %edi
decl %ecx
cmpl $String, %ecx # check whether %ecx is back at the front (%ecx == $String)
je end # When done jump to end
jmp conversion_loop
end:
pushl Intg
addl $8, %esp # clean up the stack
movl $0, %eax # return zero from program
ret
Also, I am unable to get any output. I am getting a segmentation fault, and I cannot find the error in my code.
Proper interaction with the operating system is missing.
At the end, you push the result, but the following addl $8, %esp discards the pushed value, and the final ret incorrectly transfers control to whatever happened to be in memory at SS:ESP+4 at program entry. _start is entered directly rather than called, so there is no return address on the stack to return to.
Once you increase the stack pointer, you cannot rely on data below ESP surviving.
Your program does not interact with its user. If you want it to print something, use the sys_write system call.
print_String:
mov $4, %eax # System function "sys_write".
mov $1, %ebx # Handle of the standard output (console).
mov $String, %ecx # Pointer to the text string.
mov $4, %edx # Number of bytes to print.
int $0x80 # Invoke kernel function.
end:
mov $1, %eax # System function "sys_exit".
mov Intg, %ebx # Let your program terminate gracefully with exit status Intg.
int $0x80 # Invoke kernel function.
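One caveat when returning Intg as the exit status: on Linux the parent process only sees the low 8 bits of the value passed to sys_exit. A minimal C sketch of that truncation (visible_exit_status is a made-up name, used only for illustration):

```c
/* Models what a parent process observes after exit:
   only the low 8 bits of the status value survive. */
int visible_exit_status(int value)
{
    return value & 0xFF;
}
```

For the string "1234" this gives 210 rather than 1234, so printing the digits with sys_write is the more reliable way to check the result.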

Loading value at address into register

As a learning exercise, I've been handwriting assembly. I can't seem to figure out how to load the value stored at an address into a register.
Semantically, I want to do the following:
_start:
# read(0, buffer, 1)
mov $3, %eax # System call 3 is read
mov $0, %ebx # File handle 0 is stdin
mov $buffer, %ecx # Buffer to write to
mov $1, %edx # Length of buffer
int $0x80 # Invoke system call
lea (%ecx, %ecx), %edi # Pull the value at address into %edi
cmp $97, %edi # Compare to 'a'
je done
I've written a higher-level implementation in C:
char buffer[1];
int main()
{
read(0, buffer, 1);
char a = buffer[0];
return (a == 'a') ? 1 : 0;
}
But compiling with gcc -S produces assembly that doesn't port well into my implementation above.
I think lea is the right instruction I should be using to load the value at the given address stored in %ecx into %edi, but upon inspection in gdb, %edi contains a garbage value after this instruction is executed. Is this approach correct?
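lea only computes an address (here %ecx + %ecx) and never reads memory, which is why %edi appears to contain garbage. Instead of the lea instruction, what you need is:
movzbl (%ecx), %edi
That is, load the byte at the memory address contained in ecx and zero-extend it into the edi register.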
_start:
# read(0, buffer, 1)
mov $3, %eax # System call 3 is read
mov $0, %ebx # File handle 0 is stdin
mov $buffer, %ecx # Buffer to write to
mov $1, %edx # Length of buffer
int $0x80 # Invoke system call
movzbl (%ecx), %edi # Pull the value at address ecx into edi
cmp $97, %edi # Compare to 'a'
je done
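To make "zero extending" concrete, here is a small C model of what movzbl does to a byte, next to the sign-extending movsbl for contrast. The function names are illustrative only.

```c
#include <stdint.h>

/* movzbl: the byte becomes the low 8 bits, the upper 24 bits are cleared. */
uint32_t zero_extend_byte(uint8_t b)
{
    return (uint32_t)b;
}

/* movsbl: bit 7 of the byte is copied into the upper 24 bits. */
int32_t sign_extend_byte(uint8_t b)
{
    return (b & 0x80) ? (int32_t)b - 256 : (int32_t)b;
}
```

For the byte 'a' (97) both give 97. They differ only for bytes of 0x80 and above: 0xE1 becomes 225 when zero-extended and -31 when sign-extended.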
Some advice
You don't really need the movz instruction or any separate load operation, since cmp can compare the byte in memory pointed to by ecx directly:
cmpb $97, (%ecx)
You may want to specify the character to be compared against (i.e., 'a') as $'a' instead of $97 in order to improve readability:
cmpb $'a', (%ecx)
Avoiding conditional branches is often a good idea, since a mispredicted branch is costly. Immediately after the system call, you could use cmov to compute the return value in eax instead of performing a conditional jump (i.e., the je instruction):
xor %eax, %eax # set eax to zero
cmpb $'a', (%ecx) # compare to 'a'
cmovz %edx, %eax # conditionally move edx(=1) into eax
ret # eax is either 0 or 1 at this point
edx was set to 1 prior to the system call. Therefore, the approach above relies on edx being preserved across the system call (i.e., the int $0x80 instruction).
Even better, you could use sete on al after the comparison instead of the cmov:
xor %eax, %eax # set eax to zero
cmpb $'a', (%ecx) # compare to 'a'
sete %al # conditionally set al
ret # eax is either 0 or 1 at this point
The register al, which was set to zero by means of xor %eax, %eax, will be set to 1 if the ZF flag was set by the cmp (i.e., if the byte pointed to by ecx is 'a'). With this approach you don't need to worry about whether the syscall preserves edx, since the outcome doesn't depend on edx.
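For comparison, here is the same branch-free result in C. A comparison expression already evaluates to 0 or 1, and compilers targeting x86 commonly lower it to a sete-style sequence, though that depends on the compiler and its flags. is_letter_a is a made-up name.

```c
/* Returns 1 if the first byte of buf is 'a', otherwise 0.
   No if/else is needed: == yields an int that is either 0 or 1. */
int is_letter_a(const char *buf)
{
    return buf[0] == 'a';
}
```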

How to read and display a value in Linux assembly?

I'm a beginner in Linux assembly and I have some questions. I'd like to read some characters from the keyboard, convert them to a value (I understand that this conversion should be from ASCII to decimal, right?), do some math (add, sub, multiply, whatever) and display the result in the terminal. How should I do that? I wrote some code but it probably doesn't make sense:
SYSEXIT = 1
EXIT_SUCC = 0
SYSWRITE = 4
SYSCALL = 0x80
SYSREAD = 3
.data
value: .space 5, 0
value_len: .long .-value
result: .long
result_len: .long .-result
.text
.global _start
_start:
movl $SYSREAD, %eax
movl $EXIT_SUCC, %ebx
movl $value, %ecx
movl value_len, %edx
int $SYSCALL
movl $0, %edx
movl value_len, %ecx
for:
movb value(, %edx, 1), %al
subb $48, %al
movb %al, result(, %edx, 1)
inc %edx
loop for
add $10, result
movl $0, %edx
movl result_len, %ecx
for1:
movb result(, %edx, 1), %al
add $48, %al
movb %al, result(, %edx, 1)
inc %edx
loop for1
movl $SYSWRITE, %eax
movl $SYSEXIT, %ebx
movl $result, %ecx
movl result_len, %edx
int $SYSCALL
movl $SYSEXIT, %eax
movl $EXIT_SUCC, %ebx
int $SYSCALL
I don't know whether I should reserve memory with .space directives, or read the characters in a loop.
How do I convert the input so that I can do some math on it, and then convert the result back so that I can display it?
I know that to get the value of an ASCII digit I should subtract 48, but what next?
I had the idea of multiplying each bit by 2^k, where k is 0, 1, 2, ..., n. Is that a good idea? If so, how would I implement something like this?
As you can see, I have a lot of questions, but I only need someone to show me how to do what I am asking about. I saw some similar problems, but nothing like this for Linux.
Thank you in advance for the all information.
All the best.
First, reading from and writing to the console in Linux is done with the file functions, using special console handles that always have the same values: STDIN=0, STDOUT=1 and STDERR=2.
Second, you will need some decent documentation about Linux system calls. Note that C-centric documentation (like "man") is not well suited, because C programs do not use the system calls directly but go through wrappers that often change the arguments and the result values.
On the following site you can download an assembly-centric SDK for Linux, that contains the needed documentation and many examples and include files.
If you only need the help files, you can browse them online: Here
If the problem is the conversion from an ASCII string to a number and then back to a string, here are two simple procedures that can do the job if the requirements are modest. Note that they are written in Intel syntax (the multi-operand push looks FASM-style), not the AT&T syntax of your code. StrToNum is not very advanced; it simply converts an unsigned decimal number:
; Arguments:
; esi - pointer to the string
; Return:
; CF=0
; eax - converted number
; edx - pointer to the byte where conversion ended.
;
; CF=1 - the string contains invalid number.
;
StrToNum:
push ebx esi edi
xor ebx,ebx ; ebx will store our number
xor eax,eax
mov al,[esi]
cmp al,'0'
jb .error
cmp al,'9'
jbe .digit
jmp .error
.digit:
sub al,'0'
add ebx,eax
inc esi
mov al,[esi]
cmp al,'0'
jb .finish
cmp al,'9'
ja .finish
mov edx,ebx ; multiply ebx by 10
shl ebx,3
add ebx,edx
add ebx,edx
jmp .digit
.finish:
mov eax, ebx
mov edx, esi
clc
pop edi esi ebx
ret
.error:
stc
pop edi esi ebx
ret
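For readers more comfortable with C, the same digit-by-digit logic might look roughly like the sketch below. The name str_to_num, the int return code (standing in for the carry flag) and the out-parameters are assumptions made for this illustration, not part of the routine above.

```c
/* Converts the leading decimal digits of s to an unsigned value.
   Returns 0 on success and -1 if s does not start with a digit
   (the CF=1 case). On success, *end points at the first non-digit.
   Like the assembly, it does not check for overflow. */
int str_to_num(const char *s, unsigned *value, const char **end)
{
    unsigned n = 0;

    if (*s < '0' || *s > '9')
        return -1;

    while (*s >= '0' && *s <= '9') {
        n = (n << 3) + n + n;        /* n * 10, as shl 3 plus two adds */
        n += (unsigned)(*s - '0');   /* add the value of this digit */
        s++;
    }
    *value = n;
    *end = s;
    return 0;
}
```

Each step multiplies the running total by 10 and adds the new digit, so "1234" is built up as 1, 12, 123, 1234.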
NumToStr is pretty flexible. It converts a number to a string, with sign, in any radix from 2 to 16 (the cmp/sbb/das trick only produces the digits 0-9 and A-F):
;**********************************************************************************
; NumToStr converts the number in eax to the string in any radix [2..16]
; Arguments:
; edi - pointer to the string buffer
; ecx - radix
; eax - number to convert.
; There is no parameter check, so be careful.
; returns: edi points to the end of a converted number
;**********************************************************************************
NumToStr:
test eax,eax
jns NumToStrU
neg eax
mov byte [edi],"-"
inc edi
NumToStrU:
cmp eax,ecx
jb .lessA
xor edx,edx
div ecx
push edx
call NumToStrU
pop eax
.lessA:
cmp al, 10
sbb al, 69h
das
stosb
ret
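The recursion in NumToStrU (divide, recurse on the quotient, then emit the remainder) may be easier to follow in C. The sketch below mirrors that structure but swaps the cmp/sbb/das trick for a plain lookup table; the function names are made up, and it assumes a radix between 2 and 16 and a buffer that is large enough.

```c
/* Emits the digits of n in the given radix, most significant first,
   and returns a pointer just past the last digit written. */
static char *num_to_str_u(unsigned n, unsigned radix, char *out)
{
    if (n >= radix)
        out = num_to_str_u(n / radix, radix, out);  /* higher digits first */
    *out++ = "0123456789ABCDEF"[n % radix];
    return out;
}

/* Signed wrapper: writes '-' for negative values, then the magnitude.
   Unlike the assembly, it also writes a terminating NUL. */
char *num_to_str(int n, unsigned radix, char *out)
{
    unsigned magnitude = (unsigned)n;

    if (n < 0) {
        *out++ = '-';
        magnitude = 0u - magnitude;  /* negate without signed overflow */
    }
    out = num_to_str_u(magnitude, radix, out);
    *out = '\0';
    return out;
}
```

Because the recursive call happens before the digit is stored, the digits come out in reading order without a separate reversal step.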

IA 32 read command line argument

I am trying to read command line arguments with assembly code for IA 32. I found an explanation of how to do it here http://www.paladingrp.com/ia32.shtml. I am able to use the stack pointer to get the number of arguments but I am not able to get the value of the arguments.
Here is what I am trying to do:
movl 8(%esp), %edx # Move pointer to argument 1 to edx
movl (%edx), %ebx # Move value of edx to ebx
movl $1, %eax # opcode for exit system call in eax
int $0x80 # return
Am I getting the correct pointer? If so, how do I get the value of it? If not, how do I get the correct pointer?
movl (%edx), %ebx # Move value of edx to ebx
That doesn't move value of EDX to EBX (the comment is incorrect).
That dereferences pointer in EDX, and puts the result of dereference into EBX. So if you invoked your program with ./a.out foo, then EBX will end up being 0x006f6f66 (== '\0oof' ("foo\0" in little-endian)).
I am guessing that's not what you wanted, but your question is not very clear about what you are expecting to happen where.
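To see where 0x006f6f66 comes from, here is a small C model of a 32-bit little-endian load. load_le32 is a hypothetical helper written for this illustration; it builds the value with shifts, so it gives the same answer on any host.

```c
#include <stdint.h>

/* Models a 32-bit little-endian load from p, as movl (%edx), %ebx
   does on x86: the byte at the lowest address becomes the low byte. */
uint32_t load_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | (uint32_t)p[1] << 8
         | (uint32_t)p[2] << 16
         | (uint32_t)p[3] << 24;
}
```

Calling it on the four bytes of "foo" (including the terminating NUL) yields 0x006f6f66. Only the low byte, 0x66 (102, the letter 'f'), would then be visible as the exit status.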

Bogus Results from Simple Assembly Program on FreeBSD System

I've been having problems getting even the simplest of assembly programs that I write on Linux to run on my FreeBSD machine. Here's the offending code (I'm trying to keep this as simple as possible):
#counts to sixty
.section .data
.section .text
.global _start
_start:
movl $1, %ecx #move $1 into ecx
movl $1, %eax
start_loop:
addl %ecx, %eax #add ecx to eax
cmpl $60, %eax #compare $60 and eax...
je end_loop #if eax = 60 go to end_loop
cmpl $60, %eax #
jle start_loop #jump if eax is < $60...
jmp start_loop #...to start_loop
end_loop:
movl %eax, %ebx #move the value of eax into ebx because ebx holds
#the return value
movb $1, %al #Move $1 into eax (int 1 is the value for the
#exit() syscall
int $0x80
The Linux machine returns the expected result, which is sixty, whereas the FreeBSD machine consistently returns 164 as the return code. Does anybody know why this is? If so, can you please explain to me what is happening? Also, I should mention that both machines are indeed running x86 CPUs. Thanks in advance :)
Refer to the FreeBSD Developer's Handbook. You need to do:
push %eax
mov $1, %eax
push %eax
int $0x80
because:
only the system call number is passed in register %eax; all arguments are passed on the stack
the FreeBSD default syscall convention expects an additional word on top of the stack. It is a dummy for inlined uses of int $0x80, but a return address when you make the syscall via a call to a kernel_entry trampoline (which can then do int $0x80; ret).
If you want to use the Linux convention (some syscall args in regs, called "Alternative Calling convention" in the manual), you have to brand the executable so that the system knows you're using Linux-style syscalls.
