strlen in NASM Linux - linux

Excuse me again. I am trying to learn assembly language, but I keep running into problems. I am working with strings in NASM: I need to copy a string constant into a string variable whose maximum size is 50, so I want to check that bound. However, this program throws a segmentation fault. I based it on a MASM example, so perhaps I am misusing NASM syntax.
My program is the following:
section .data
MAXTEXTSIZE equ 50
_cte_hola db "Hola", 0
_cte_mundo db "Mundo", 0
section .bss
MAIN_d resb MAXTEXTSIZE+1
section .text
global _start
strlen:
mov bx, 0
strl01:
cmp WORD [SI+BX],0 t
je strend
inc bx
jmp strl01
strend:
ret
strcpy:
call strlen
cmp bx, MAXTEXTSIZE
jle copiarsizeok
mov bx, MAXTEXTSIZE
copiarsizeok:mov cx, bx
cld
rep movsb
mov al,0
mov BYTE [DI], al
ret
_start:
mov ds, ax
mov es, ax
mov si, [MAIN_d]
mov di, [_cte_hola]
call strcpy
mov eax, 1
mov ebx, 0
int 80h
Thanks in advance, and excuse me if my questions seem stupid to an assembly programmer.

I believe you are trying to write a 32-bit program for Linux, but your example uses 16-bit code.
In 32-bit Linux, all pointers are 32 bits wide, so use the extended registers: esi, edi, ebx, etc. You can still use 8- and 16-bit registers for arithmetic and data processing, but not as memory pointers.
In strlen you have to compare byte [esi+ebx], 0, not a word.
Don't set the segment registers in Linux. The OS sets them up and you can't load your own values into them. In Linux all memory is one flat area, so you don't have to use segment registers anymore.
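Putting those points together, a 32-bit version of the copy routine might look like the sketch below. It assumes the 32-bit int 80h ABI used in the question. Beyond the points above it also loads the addresses of the labels (no brackets) rather than their contents, and puts the source in esi and the destination in edi, which is what movsb expects. The file name copy.asm and the labels strlen_loop, strlen_done and size_ok are made up for the illustration.

```nasm
; Assemble and link as 32-bit (copy.asm is a hypothetical file name):
;   nasm -f elf32 copy.asm && ld -m elf_i386 -o copy copy.o
section .data
MAXTEXTSIZE equ 50
_cte_hola   db "Hola", 0

section .bss
MAIN_d      resb MAXTEXTSIZE+1

section .text
global _start

; strlen: esi = string address; returns the length in ebx
; (the terminating 0 is not counted)
strlen:
    xor ebx, ebx
strlen_loop:
    cmp byte [esi+ebx], 0       ; compare one byte, not a word
    je  strlen_done
    inc ebx
    jmp strlen_loop
strlen_done:
    ret

; strcpy: esi = source, edi = destination
; copies at most MAXTEXTSIZE bytes and then appends a terminator
strcpy:
    call strlen
    cmp ebx, MAXTEXTSIZE
    jbe size_ok                 ; unsigned compare: a length is never negative
    mov ebx, MAXTEXTSIZE
size_ok:
    mov ecx, ebx
    cld
    rep movsb                   ; copy ecx bytes from [esi] to [edi]
    mov byte [edi], 0           ; terminate the copy
    ret

_start:
    mov esi, _cte_hola          ; address of the source (no brackets)
    mov edi, MAIN_d             ; address of the destination
    call strcpy
    mov eax, 1                  ; sys_exit
    xor ebx, ebx                ; exit status 0
    int 80h
```

The program prints nothing; it only copies the string and exits, so a debugger is the easiest way to inspect MAIN_d afterwards.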

Here's a more concrete example of how you could write your strlen function (which is the first of your problems):
section .data
MAXTEXTSIZE equ 50
_cte_hola db "Hola", 0xa, 0
_cte_mundo db "Mundo", 0
section .bss
MAIN_d resb MAXTEXTSIZE+1
section .text
global _start
strlen:
mov ebx, 0
strlen_loop:
cmp BYTE [esi+ebx], 0
je strlen_end
inc ebx
jmp strlen_loop
strlen_end:
mov eax, ebx
ret
_start:
mov esi, _cte_hola
call strlen ; Get the length of _cte_hola
mov edx, eax ; The length was stored in eax by strlen
mov ecx, _cte_hola
mov ebx,1
mov eax, 4
int 0x80 ; Write to stdout
mov eax, 1
int 0x80 ; Exit
There are definitely better ways of implementing this (I'd use repne scasb to implement strlen, for example), but I wanted to keep it close to your implementation.
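For reference, the repne-based variant could look roughly like this. It is only a sketch: the routine name strlen_scas is made up, and unlike the version above it takes the string address in edi, because scasb always scans through edi.

```nasm
section .data
_cte_hola db "Hola", 0xa, 0

section .text
global _start

; strlen_scas: edi = address of a zero-terminated string
; returns the length in eax (terminator not counted); clobbers ecx and edi
strlen_scas:
    xor eax, eax            ; al = 0, the byte to search for
    mov ecx, -1             ; effectively no limit on the scan
    cld                     ; scan forward
    repne scasb             ; advance edi until the byte at [edi] equals al
    not ecx                 ; ecx = bytes scanned, including the terminator
    dec ecx                 ; drop the terminator from the count
    mov eax, ecx
    ret

_start:
    mov edi, _cte_hola
    call strlen_scas
    mov edx, eax            ; length for sys_write
    mov ecx, _cte_hola      ; buffer
    mov ebx, 1              ; stdout
    mov eax, 4              ; sys_write
    int 0x80
    mov eax, 1              ; sys_exit
    xor ebx, ebx            ; exit status 0
    int 0x80
```

The not/dec pair works because ecx starts at -1 and is decremented once per byte scanned, terminator included.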
Hope this helps!

Related

Printing binary string in assembly

I'm writing a program to print the binary string of a hardcoded word. Here is how it looks currently:
main.asm
section .text
global _start
extern _print_binary_content
_start:
push word [word_to_print] ; pushing word. Can we push just one byte?
call _print_binary_content
mov rax, 60
mov rdi, 0
syscall
section .data
word_to_print: dw 0xAB0F
printer.asm
SYS_BRK_NUM equ 0x0C
BITS_IN_WORD equ 0x10
SYS_WRITE_NUM equ 0x01
STD_OUT_FD equ 0x01
FIRST_BIT_BIT_MASK equ 0x01
ASCII_NUMBER_OFFSET equ 0x30
section .text
global _print_binary_content
_print_binary_content:
pop rbp
xor ecx, ecx ;zeroing rcx
xor ebx, ebx ;zeroing rbx
pop bx ;the word to print the binary content of
;sys_brk for current location
mov rax, SYS_BRK_NUM
mov rdi, 0
syscall
;end sys_brk
mov r12, rax ;save the current brake location
;sys_brk for memory allocation 16 bytes
lea rdi, [rax + BITS_IN_WORD]
mov rax, SYS_BRK_NUM
syscall
;end sys_brk
xor ecx, ecx
mov cl, byte BITS_IN_WORD - 1; used as a counter in the loop below
loop:
mov dx, bx
and dx, FIRST_BIT_BIT_MASK
add dx, ASCII_NUMBER_OFFSET
mov [r12 + rcx], dl
shr bx, 0x01
dec cl
cmp cl, 0
jge loop
mov rsi, r12
mov rax, SYS_WRITE_NUM
mov rdi, STD_OUT_FD
mov rdx, BITS_IN_WORD
syscall
push rbp ; pushing return address back
ret
If I assemble, link and run this program, it works. My question is about performance and perhaps the conventions for writing assembly programs. In printer.asm I clear ecx twice, which does not look optimal. Some registers may also be used for something other than their intended purpose (I used the Intel manual).
Can you please help me improve this very simple program?

Linux Assembly segmentation fault print using loop

I'm writing an assembly program that prints the even numbers between 0 and 9 using a loop. I get a segmentation fault when I run the code. I checked other answers on the site but couldn't find one that addresses my issue.
I suspect that the function nwLine might be the source of the problem.
;;this program prints even numbers from 0-8 using loop function
section .text
global _start
cr db 10
_start: ;tell linker entry point
mov ecx, 5
mov eax, '0'
evenLoop:
mov [evnum], eax ;add eax to evnum
mov eax, 4
mov ebx, 1
push ecx
mov ecx, evnum
mov edx, 1
int 80h
call nwLine
mov eax, [evnum]
sub eax, '1'
inc eax
add eax, '2'
pop ecx
loop evenLoop
nwLine: ;function to move pointer to next line
mov eax,4 ; System call number(sys_write)
mov ebx,1 ; File descriptor 1 - standard output
mov ecx, cr
mov edx, 1
int 80h ; Call the kernel
ret
mov eax,1 ;system call number (sys_exit)
int 80h ;call kernel
section .bss
evnum resb 1
If anyone knows how to solve the problem with the nwLine function, please tell me.

Conditional jump fails in linux x86 intel syntax(NASM)

STORY (I'M A NEWBIE):
I started reading a PDF tutorial about programming in x86 assembly (Intel syntax) with the NASM assembler, and I have a problem running a very basic program (inspired by an example about loops from the tutorial).
THE PROBLEM (JE FAILS):
This program should read a digit (as a character, i.e. '0' + digit) from stdin and then print "Hello world\n" that many times. It is a simple loop: decrement the digit, and when it equals '0' (the character, not the integer zero), jump (je) to the exit code (mov eax,1 followed by int 0x80).
It sounds easy, but when I run it the output is strange and very long.
It goes through the loop many times and only stops much later, when the digit equals '0' again. That is odd, because by then the condition digit == '0' has been tested many times and should already have been true.
In short, my problem is that the code fails to jump when digit == '0'.
THE CODE (IT IS LONG):
segment .text
global _start
_start:
;Print 'Input a digit:'.
mov eax,4
mov ebx,1
mov ecx,msg1
mov edx,len1
int 0x80
;Input the digit.
mov eax,3
mov ebx,0
mov ecx,dig
mov edx,2
int 0x80
;Mov the first byte(the digit) in the ecx register.
;mov ecx,0
mov ecx,[dig]
;Use ecx to loop dig[0]-'0' times.
loop:
mov [dig],ecx
mov eax,4
mov ebx,1
mov ecx,dig
mov edx,1
int 0x80
mov eax,4
mov ebx,1
mov ecx,Hello
mov edx,Hellolen
int 0x80
;For some debuging (make the loop stop until return pressed)
;mov eax,3
;mov ebx,0
;mov ecx,some
;mov edx,2
;int 0x80
;Just move dig[0](some like character '4' or '7') to ecx register and compare ecx with character '0'.
mov ecx,[dig]
dec ecx
cmp ecx,'0'
;If comparison says ecx and '0' are equal jump to exit(to end the loop)
je exit
;If not jump back to loop
jmp loop
;Other stuff ...(like an exit procedure and a data(data,bss) segment)
exit:
mov eax,1
int 0x80
segment .data
msg1 db "Input a digit:"
len1 equ $-msg1
Hello db ":Hello world",0xa
Hellolen equ $-Hello
segment .bss
dig resb 2
some resb 2
THE OUTPUT:
Input a digit:4
4:Hello world
3:Hello world
2:Hello world
1:Hello world
0:Hello world
...
...(many loops later)
...
5:Hello world
4:Hello world
3:Hello world
2:Hello world
1:Hello world
$
So my question is: what is wrong with this code?
Could you explain it?
I don't need alternative code that magically works without an explanation, because I am trying to learn (I'm a newbie).
That is my problem (and my first question on Stackoverflow.com).
ECX is 32 bits wide, while a character is just 8 bits. Use an 8-bit register, such as CL, instead of ECX.
As Jester mentioned, the value arrives as a single character, so you should probably use cl:
loop:
mov [dig],cl
...
mov cl,[dig]
dec cl
cmp cl,'0'
jne loop
You can also load ecx with movzx, which clears the top bits of the register (i.e. a zero-extending load):
...
movzx ecx, byte [dig]
loop:
mov [dig], cl ; store just the low byte, if you want to store
...
movzx ecx, byte [dig]
dec ecx
cmp ecx, '0'
jne loop
Note that it is often suggested that you avoid the al, bl, cl, dl registers, because accesses to partial registers are not fully optimized on some CPUs. Whether this is still true, I do not know.
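To make the fix concrete, here is a sketch of the whole program with the zero-extending load in place. It keeps the names from the question, except that the loop label is renamed to print_loop (a made-up name) so it does not shadow the loop mnemonic, and the unused buffer some is dropped. It assumes a 32-bit build using the int 0x80 ABI and does no input validation, so entering 0 or a non-digit still loops for a long time.

```nasm
segment .data
msg1     db "Input a digit:"
len1     equ $-msg1
Hello    db ":Hello world", 0xa
Hellolen equ $-Hello

segment .bss
dig      resb 2

segment .text
global _start
_start:
    mov eax, 4              ; sys_write: print the prompt
    mov ebx, 1
    mov ecx, msg1
    mov edx, len1
    int 0x80

    mov eax, 3              ; sys_read: the digit and the newline
    mov ebx, 0
    mov ecx, dig
    mov edx, 2
    int 0x80

    ; "mov ecx, [dig]" would load four bytes here. For the input "4" plus
    ; Enter the low two bytes are 0x34 and 0x0A, so ecx would hold at least
    ; 0x0A34 and "cmp ecx, '0'" could not match when the digit reaches '0'.
    movzx ecx, byte [dig]   ; load only the digit, upper 24 bits cleared
print_loop:
    mov [dig], cl           ; store just the low byte
    mov eax, 4              ; print the current digit
    mov ebx, 1
    mov ecx, dig
    mov edx, 1
    int 0x80
    mov eax, 4              ; print ":Hello world" and a newline
    mov ebx, 1
    mov ecx, Hello
    mov edx, Hellolen
    int 0x80
    movzx ecx, byte [dig]   ; reload the digit as a clean 32-bit value
    dec ecx
    cmp ecx, '0'
    jne print_loop

    mov eax, 1              ; sys_exit
    xor ebx, ebx            ; exit status 0
    int 0x80
```

This also explains the original output: with the newline sitting in the second byte, the 32-bit counter only reaches '0' after the upper byte has counted down as well, which takes a few thousand extra iterations.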

NASM addition program

I am a developer who uses high level languages, learning assembly language in my spare time. Please see the NASM program below:
section .data
section .bss
section .text
global main
main:
mov eax,21
mov ebx,9
add eax,ebx
mov ecx,eax
mov eax,4
mov ebx,1
mov edx,4
int 0x80
push ebp
mov ebp,esp
mov esp,ebp
pop ebp
ret
Here are the commands I use:
ian@ubuntu:~/Desktop/NASM/Program4$ nasm -f elf -o asm.o SystemCalls.asm
ian@ubuntu:~/Desktop/NASM/Program4$ gcc -o program asm.o
ian@ubuntu:~/Desktop/NASM/Program4$ ./program
I don't get any errors, however nothing is printed to the terminal. I used the following link to ensure the registers contained the correct values: http://docs.cs.up.ac.za/programming/asm/derick_tut/syscalls.html
You'll have to convert the integer value to a string to be able to print it with sys_write (syscall 4). The conversion could be done like this (untested):
; Converts the integer value in EAX to a string in
; decimal representation.
; Returns a pointer to the resulting string in EAX.
int_to_string:
mov byte [buffer+9],0 ; add a string terminator at the end of the buffer
lea esi,[buffer+9]
mov ebx,10 ; divisor
int_to_string_loop:
xor edx,edx ; clear edx prior to dividing edx:eax by ebx
div ebx ; EAX /= 10
add dl,'0' ; take the remainder of the division and convert it from 0..9 -> '0'..'9'
dec esi ; store it in the buffer
mov [esi],dl
test eax,eax
jnz int_to_string_loop ; repeat until EAX==0
mov eax,esi
ret
section .bss ; the buffer must live in a writable section, not in .text
buffer: resb 10
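A possible way to wire the routine into the addition program is sketched below. It is an assumption-laden example: it uses _start and sys_exit instead of main and gcc, repeats int_to_string so the listing is complete, and derives the length from the position of the terminator at buffer+9. With a 10-byte buffer there is room for 9 digits plus the terminator, so values of 1,000,000,000 and above would overrun it.

```nasm
section .bss
buffer: resb 10

section .text
global _start

; Converts the integer in EAX to a decimal string at the end of buffer.
; Returns a pointer to the first digit in EAX; clobbers EBX, EDX, ESI.
int_to_string:
    mov byte [buffer+9], 0  ; string terminator at the end of the buffer
    lea esi, [buffer+9]
    mov ebx, 10             ; divisor
int_to_string_loop:
    xor edx, edx            ; clear edx before dividing edx:eax by ebx
    div ebx                 ; eax = quotient, edx = remainder
    add dl, '0'             ; 0..9 -> '0'..'9'
    dec esi
    mov [esi], dl           ; store the digit, filling the buffer backwards
    test eax, eax
    jnz int_to_string_loop  ; repeat until the quotient is 0
    mov eax, esi
    ret

_start:
    mov eax, 21
    mov ebx, 9
    add eax, ebx            ; the sum to print
    call int_to_string      ; eax = address of the first digit
    mov ecx, eax            ; buffer argument for sys_write
    lea edx, [buffer+9]
    sub edx, ecx            ; length = terminator address - first digit
    mov eax, 4              ; sys_write
    mov ebx, 1              ; stdout
    int 0x80
    mov eax, 1              ; sys_exit
    xor ebx, ebx            ; exit status 0
    int 0x80
```

No newline is written after the digits, so the shell prompt appears right after the number.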
Programming in assembly requires a knowledge of ASCII codes and some basic conversion routines. For example, hexadecimal-to-decimal and decimal-to-hexadecimal routines are good to keep at hand.
Register contents cannot be printed as they are; you have to convert them first (a lot).
To be a bit more helpful:
ASCII 0 (NUL) prints nothing, although some text editors (Kate in KDE on Linux) show a placeholder such as a square. In higher-level languages like C and C++ it marks the end of a string.
That makes it useful for calculating string lengths too.
10 is line feed, the end-of-line character. Linux uses it alone, while Windows/DOS puts a carriage return in front of it.
13 is carriage return.
1B (hex) is the ESC key (Linux users will know more about this).
255 is not part of 7-bit ASCII; what it means depends on the code page in use.
Check http://www.asciitable.com/ for the entire list.
Convert the integer value to a string.
Here I have used two macros: pack converts a string to an integer, and unpack does the reverse.
%macro write 2
mov eax, 4
mov ebx, 1
mov ecx, %1
mov edx, %2
int 80h
%endmacro
%macro read 2
mov eax,3
mov ebx,0
mov ecx,%1
mov edx,%2
int 80h
%endmacro
%macro pack 3 ; 1-> string ,2->length ,3->variable
mov esi, %1
mov ebx,0
%%l1:
cmp byte [esi], 10
je %%exit
imul ebx,10
movzx edx,byte [esi]
sub edx,'0'
add ebx,edx
inc esi
jmp %%l1
%%exit:
mov [%3],ebx
%endmacro
%macro unpack 3 ; 1-> string ,2->length ,3->variable
mov esi, %1
mov ebx,0
movzx eax, byte[%3]
mov byte[%2],0
cmp eax, 0
jne %%l1
mov byte[%2],1
push eax
jmp %%exit2
%%l1:
mov ecx,10
mov edx,0
div ecx
add edx,'0'
push edx
inc byte[%2]
cmp eax, 0
je %%exit2
jmp %%l1
%%exit2:
movzx ecx,byte[%2]
%%l2:
pop edx
mov [esi],dl
inc esi
loop %%l2
%endmacro
section .data ; data section
msg1: db "First number : " ;
len1: equ $-msg1 ;
msg2: db "Second number : " ;
len2: equ $-msg2 ;
msg3: db "Sum : " ;
len3: equ $-msg3 ;
ln: db 10
lnl: equ $-ln
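section .bss ; uninitialized storage belongs in .bss, not .data
var1: resb 10
var2: resb 10
str1: resb 10
str2: resb 10
ans: resb 10
ansvar: resb 10
ansl: resb 1 ; one byte each for the length counters
l1: resb 1
l2: resb 1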
section .text ;code
global _start
_start:
write msg1,len1
read str1,10
pack str1,l1,var1
write msg2,len2
read str2,10
pack str2,l2,var2
mov al,[var1]
add al,[var2]
mov [ansvar],al
unpack ans,ansl,ansvar
write msg3,len3
write ans,10
write ln,lnl
mov ebx,0 ; exit code, 0=normal
mov eax,1 ; exit command to kernel
int 0x80 ; interrupt 80 hex, call kernel
To assemble, link and run:
nasm -f elf add.asm
ld -s -o add add.o
./add

Segfault accessing BSS memory

section .data
bufChar: equ 0
section .bss
bufNum: resb 1
bufMult: resb 1
.
.
.
leerNumero:
xor eax,eax
mov [bufNum],eax
add eax,1
mov [bufMult],eax
inicioLeerNumero:
mov edx,1
mov ecx,bufChar
mov ebx,0
mov eax,3
int 80h
cmp byte [ecx + edx - 1],10 ; Segfaults here.
je rLeerNumero
cmp byte [ecx + edx - 1],48
jl noNumero
cmp byte [ecx + edx - 1],57
jg noNumero
sub eax,48
mul word [bufMult]
jo overflow
add [bufNum],eax
jo overflow
mov eax,10
mul word [bufMult]
jo overflow
mov [bufMult],eax
jmp inicioLeerNumero
rLeerNumero:
mov eax,bufNum
ret
noNumero:
mov eax,errorNumero
mov ebx,lErrorNumero
call imprimir
jmp salir
overflow:
mov eax,errorOverflow
mov ebx,lErrorOverflow
call imprimir
jmp salir
This code should work; at least on paper it does. I need to do some homework entirely in assembly without linking the C library, which is why I am reinventing the wheel and writing a routine that reads a number from the console into EAX.
I am getting a mysterious segfault at the line marked with the comment, and I fail to see how I could be trying to access misaligned memory. Any ideas on how this could be failing?
Any chance that int 80h is changing ecx or edx on you causing a bad pointer read? If you can read the registers in a debugger before and after that instruction, you could confirm that.
I had declared bufChar in .data with equ, so it was the constant 0 rather than a buffer. Reading into it and then dereferencing it would obviously segfault. Sadly, I wasted a week wrapping my head around this before I realized it.
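For anyone hitting the same thing, a minimal sketch of the corrected declaration follows. It reserves a real byte in .bss for bufChar and reads one character into it. The done label and the choice to return the byte as the exit status are only there to make the example complete; they are not part of the original routine.

```nasm
section .bss
bufChar: resb 1             ; one writable byte for sys_read to fill

section .text
global _start
_start:
    mov edx, 1              ; read one byte
    mov ecx, bufChar        ; address of the buffer (this was 0 with equ)
    mov ebx, 0              ; stdin
    mov eax, 3              ; sys_read
    int 80h
    xor ebx, ebx            ; default exit status 0
    cmp eax, 1              ; eax = bytes read; anything else is EOF or an error
    jne done
    movzx ebx, byte [bufChar] ; safe now: bufChar is a real address
done:
    mov eax, 1              ; sys_exit, status = the byte read (or 0)
    int 80h
```

Running it and then checking `echo $?` in the shell shows the character code that was read.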
