My goal is to swap the first character with the last character of the string some_str in x86-assembly.
Here is my attempt:
; assemble and link with:
; nasm -f elf32 -g test.asm && ld -melf_i386 test.asm.o -o test
section .text
global _start
extern printf
_start:
mov eax, some_str
_loop:
mov di, [eax + 4] ; ptr to end char
mov si, [eax] ; ptr to start char
mov dl, [di] ; DL = end char
mov al, [si] ; AL = start char
mov [si], dl ; start char = end char
mov [di], al ; end char = char 1
mov edx, len
mov ecx, eax
mov ebx, 1
mov eax, 4
int 0x80
ret
mov eax, 1
int 0x80
section .data
some_str db `abcd`, 0xa
len equ $ - some_str
For some reason that escapes me, the lines:
mov dl, [di] ; DL = end char
mov al, [si] ; AL = start char
cause the program to crash with a segmentation fault.
The expected stdout is:
dbca
Actual stdout:
Segmentation fault (core dumped)
Is there something I am missing? How do I correct this code so that it swaps the first and last character of some_str?
Your code is doing something much more complicated than necessary. After mov eax, some_str, eax points to one of the bytes to be swapped ('a'), and eax+3 points to the other ('d'). Note that eax+4, which your code uses, is the trailing newline, not the last letter. So just load the two bytes into two 8-bit registers and then store them back the other way around.
mov eax, some_str
mov cl, [eax]
mov dl, [eax + 3]
mov [eax + 3], cl
mov [eax], dl
And you're done and can proceed to write out the result.
Note it isn't necessary to load the pointer into eax first; you could also do
mov cl, [some_str]
mov dl, [some_str + 3]
mov [some_str + 3], cl
mov [some_str], dl
If you really wanted two different registers pointing to the two different bytes: first of all, they need to be 32-bit registers. Addressing memory in 32-bit mode through the 16-bit registers si and di is practically never going to work. Second, mov edi, [eax] would load edi with the contents of the memory at location eax, which is some bytes of your string, not a pointer. You'd want simply mov edi, eax. For the second one, you can use lea to do the arithmetic of an effective-address calculation but keep the resulting pointer instead of doing a load. I've also used cl rather than al for the second byte, because writing al would overwrite the low byte of the pointer you still hold in eax. So I think the way to turn your code into something in the original (inefficient) spirit, but correct, would be
mov edi, eax
lea esi, [eax+3]
mov dl, [edi]
mov cl, [esi]
mov [esi], dl
mov [edi], cl
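For readers who find it easier to follow in C, here is a minimal sketch of the same load-two-bytes, store-them-crossed idea. The function name swap_bytes and its index parameters are made up for illustration. It assumes the string is "abcd" followed by a newline, so the last letter sits at offset 3.

```c
#include <assert.h>
#include <stddef.h>

/* Swap the bytes at offsets i and j of buf. Both bytes are loaded
   before either is stored, just like the two-register assembly version. */
void swap_bytes(char *buf, size_t i, size_t j)
{
    assert(buf != NULL);
    char first = buf[i];   /* like: mov cl, [eax + i] */
    char last  = buf[j];   /* like: mov dl, [eax + j] */
    buf[j] = first;        /* like: mov [eax + j], cl */
    buf[i] = last;         /* like: mov [eax + i], dl */
}
```

Calling swap_bytes(s, 0, 3) on a writable buffer holding "abcd\n" leaves "dbca\n", with the newline at offset 4 untouched.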
I'm trying to write a function that reverses the order of characters in a string using x86 NASM assembly. I tried doing it with registers (I know it's more efficient to do it using the stack), but I keep getting a segmentation fault. The C declaration looks as follows:
extern char* reverse(char*);
The assembly segment:
section .text
global reverse
reverse:
push ebp ; prologue
mov ebp, esp
mov eax, [ebp+8] ; eax <- points to string
mov edx, eax
look_for_last:
mov ch, [edx] ; put char from edx in ch
inc edx
test ch, ch
jnz look_for_last ; if char != 0 loop
sub edx, 2 ; found last
swap: ; eax = first, edx = last (characters in string)
test eax, edx
jg end ; if eax > edx reverse is done
mov cl, [eax] ; put char from eax in cl
mov ch, [edx] ; put char from edx in ch
mov [edx], cl ; put cl in edx
mov [eax], ch ; put ch in eax
inc eax
dec edx
jmp swap
end:
mov eax, [ebp+8] ; move char pointer to eax (func return)
pop ebp ; epilogue
ret
It seems like the line causing the segmentation fault is
mov cl, [eax]
Why is that happening? In my understanding eax never goes beyond the bounds of the string so there always is something in [eax]. How come I get a segmentation fault?
OK, I figured it out: I mistakenly used test eax, edx where I should have used cmp eax, edx. It works now.
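To see why the two instructions behave so differently, it may help to model the flags in C. test sets the flags from a bitwise AND of its operands, while cmp sets them from a subtraction, so jg after test asks "is (a AND b) positive?" rather than "is a greater than b?". The sketch below is only a model. The function names and the addresses in the usage note are hypothetical, and the int32_t casts assume the usual two's-complement conversion.

```c
#include <stdint.h>

/* jg after "test a, b": test clears OF, so jg is taken exactly when
   (a AND b), read as a signed 32-bit value, is greater than zero. */
int jg_after_test(uint32_t a, uint32_t b)
{
    return (int32_t)(a & b) > 0;
}

/* jg after "cmp a, b": taken when a > b as signed 32-bit values. */
int jg_after_cmp(uint32_t a, uint32_t b)
{
    return (int32_t)a > (int32_t)b;
}

/* ja after "cmp a, b": taken when a > b as unsigned values,
   which is the natural ordering for addresses. */
int ja_after_cmp(uint32_t a, uint32_t b)
{
    return a > b;
}
```

With stack-like addresses such as 0xbffff103 and 0xbffff101, the AND always has its top bit set, so jg_after_test never fires. In the assembly that means the swap loop never exits and eventually walks off the mapped memory. With low addresses such as 0x0804a010 and 0x0804a013 it fires immediately and nothing is reversed at all.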
I'm making a new post about the same program. I'm sorry, but I think this question is quite different from the previous one. My program takes 2 parameters at startup: the number of repeats and a string. The number of repeats determines how many times the last word of the string is printed. For example:
./a.out 3 "ab cd"
shows in output
cdcdcd
I already have a working program (with Stack users' help :-) ) that uses call printf. It only works for repeat counts 0-9, but that's not as important as the main thing: my question is how to replace this "call printf" with a sys_write call.
I was told that I have to compile this with the
-nostdlib
option, but that doesn't matter if my code isn't correct. I tried my best, and I also found some information about possible methods here, but I can't make it work properly.
Printing the newline works fine, but I have no idea how to pass the string from parameter #2 to sys_write. It would be great if someone more experienced could find the time to point out what I need to change in the code. It took a while to get the "call printf" version working, but then I started experimenting and now I'm totally lost. Here it is:
.intel_syntax noprefix
.globl _start
.text
_start:
push ebp
mov ebp, esp
mov ecx, [ebp + 4] # arg1 int ECX
mov ebx, [ebp + 8] # arg2 string EBX
xor eax, eax
# ARG1 - FROM STRING TO INT
atoi:
movzx edx, byte ptr [ecx]
cmp edx, '0'
jb programend
sub edx, '0'
mov ecx, edx
## =========================== FUNCTION =========================== ##
# SEARCH FOR END OF STRING
findend:
mov dl, byte ptr [ebx + eax] # move through next letters
cmp dl, 0
jz findword
inc eax
jmp findend
# SEARCH FOR LAST SPACE
findword:
dec eax
mov dl, byte ptr [ebx + eax]
cmp dl, ' '
jz foundwordstart
jmp findword
# REMEMBER SPACE POSITION, CHECK COUNTER >0
foundwordstart:
push eax # remember space position
cmp ecx, 0 # check if counter > 0
jz theend
jmp foundword
# PRINT LAST WORD
foundword:
inc eax
mov dl, byte ptr [ebx + eax]
cmp dl, 0
jz checkcount
push ecx
push eax # save current position in word
push edx
push ebx
lea ecx, [ebx+eax] # char * to string
mov eax, 4 # sys_write
mov edx, 1; # how many chars will be printed
mov ebx, 1 # stdout
int 0x80
pop ebx
pop edx
pop eax
pop ecx
jmp foundword
# decrease counter and restore beginning of last word
checkcount:
dec ecx # count = count-1
pop eax # restore beginning of last word
jmp foundwordstart
theend:
pop eax # pop the space position from stack
jmp programend
# END OF PROGRAM
programend:
pop ebp
# new line
mov eax,4
mov ebx,1
mov ecx,offset msgn
mov edx,1
int 0x80
# return 0
mov eax, 1
mov ebx, 0
int 0x80
.data
msgn: .ascii "\n"
It's also really strange to me that I can run it with:
mov ecx, [ebp + 12]
mov ecx, [ecx + 8]
add ecx, eax # char * to string
mov eax, 4 # sys_write
mov edx, 1; # how many chars will be printed
mov ebx, 1 # stdout
int 0x80
and it works well - but only if I don't use -nostdlib (and of course I have to change _start to main)...
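As a reference point for what the assembly is meant to do, here is a rough C sketch of the same behaviour using the write() wrapper for the same system call. The names last_word and print_last_word are invented for this example. Unlike printf, write takes a pointer plus an explicit byte count, so a whole word can go out in one call instead of one character at a time.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>

/* Return a pointer to the start of the last space-separated word of s
   (s itself when it contains no space). */
const char *last_word(const char *s)
{
    assert(s != NULL);
    const char *space = strrchr(s, ' ');
    return space ? space + 1 : s;
}

/* Write the last word of s to fd `count` times, followed by a newline.
   Returns 0 on success, -1 if a write fails or is cut short. */
int print_last_word(int fd, const char *s, int count)
{
    const char *word = last_word(s);
    size_t len = strlen(word);
    for (int i = 0; i < count; i++) {
        if (write(fd, word, len) != (ssize_t)len)
            return -1;
    }
    return write(fd, "\n", 1) == 1 ? 0 : -1;
}
```

For example, print_last_word(1, "ab cd", 3) writes the word cd three times to standard output and then a newline.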
How do I write a variable to a file using NASM?
For example, if I execute some mathematical operation, how do I write the result of the operation to a file?
My output file has remained empty.
My code:
%include "io.inc"
section .bss
result db 2
section .data
filename db "Downloads/output.txt", 0
section .text
global CMAIN
CMAIN:
mov eax,5
add eax,17
mov [result],eax
PRINT_DEC 2,[result]
jmp write
write:
mov EAX, 8
mov EBX, filename
mov ECX, 0700
int 0x80
mov EBX, EAX
mov EAX, 4
mov ECX, [result]
int 0x80
mov EAX, 6
int 0x80
mov eax, 1
int 0x80
jmp exit
exit:
xor eax, eax
ret
You have to implement itoa (integer to ASCII) and then a len routine for this. This code was tested and works properly on Ubuntu.
section .bss
answer resb 64
section .data
filename db "./output.txt", 0
section .text
global main
main:
mov eax,5
add eax,44412
push eax ; Push the new calculated number onto the stack
call itoa
mov EAX, 8
mov EBX, filename
mov ECX, 0o700 ; file mode is octal: read/write/execute for the owner only
int 0x80
push answer
call len
mov EBX, EAX
mov EAX, 4
mov ECX, answer
movzx EDX, di ; EDX = length of the string, zero-extended from DI (set by len)
int 0x80
mov EAX, 6
int 0x80
xor ebx, ebx ; exit status 0 (ebx still held the file descriptor)
mov eax, 1 ; sys_exit
int 0x80
jmp exit
exit:
xor eax, eax
ret
itoa:
; Iterative routine (it uses the stack but never calls itself). Converts the integer to ASCII characters.
push ebp ; Setup a new stack frame
mov ebp, esp
push eax ; Save the registers
push ebx
push ecx
push edx
mov eax, [ebp + 8] ; eax is going to contain the integer
mov ebx, dword 10 ; This is our "stop" value as well as our value to divide with
mov ecx, answer ; Put a pointer to answer into ecx
push ebx ; Push 10 as our "stop" value: no remainder can equal 10, so it marks the bottom of the digit stack
itoa_loop:
cmp eax, ebx ; Compare eax, and ebx
jl itoa_unroll ; Jump if eax is less than ebx (which is 10)
xor edx, edx ; Clear edx
div ebx ; Divide by ebx (10)
push edx ; Push the remainder onto the stack
jmp itoa_loop ; Jump back to the top of the loop
itoa_unroll:
add al, 0x30 ; Add 0x30 to the bottom part of eax to make it an ASCII char
mov [ecx], byte al ; Move the ASCII char into the memory referenced by ecx
inc ecx ; Increment ecx
pop eax ; Pop the next value from the stack
cmp eax, ebx ; Is it the "stop" value (10)?
jne itoa_unroll ; If not, it is another digit, so we jump back to the unroll loop
; else we are done, and we execute the next few commands
mov [ecx], byte 0xa ; Add a newline character to the end of the character array
inc ecx ; Increment ecx
mov [ecx], byte 0 ; Add a null byte to ecx, so that when we pass it to our
; len function it will properly give us a length
pop edx ; Restore registers
pop ecx
pop ebx
pop eax
mov esp, ebp
pop ebp
ret
len:
; Returns the length of a string. The string has to be null terminated. Otherwise this function
; will fail miserably.
; Upon return. edi will contain the length of the string.
push ebp ; Save the previous stack pointer. We restore it on return
mov ebp, esp ; We setup a new stack frame
push eax ; Save registers we are going to use. edi returns the length of the string
push ecx
mov ecx, [ebp + 8] ; Load the string pointer into ecx; [ebp + 8] skips the saved ebp and the return address
mov edi, 0 ; Set the counter to 0. We are going to increment this each loop
len_loop: ; Just a quick label to jump to
movzx eax, byte [ecx + edi] ; Load the next character, zero-extended, into eax.
inc di ; Increase di.
cmp eax, 0 ; Compare eax to 0.
jnz len_loop ; If it is not zero, we jump back to len_loop and repeat.
dec di ; Remove one from the count
pop ecx ; Restore registers
pop eax
mov esp, ebp ; Set esp back to what ebp used to be.
pop ebp ; Restore the stack frame
ret ; Return to caller
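If the stack juggling in itoa is hard to follow, this C sketch models the same idea: divide by 10 repeatedly, stash the remainders (least significant first), then pop them off in reverse to get the digits in reading order. The function name itoa_dec and the local array standing in for the stack are assumptions for illustration, and the sketch assumes a 32-bit unsigned int.

```c
#include <assert.h>
#include <stddef.h>

/* Write value as decimal ASCII into out, followed by '\n' and a NUL.
   out needs room for 12 bytes (10 digits + newline + NUL).
   Returns the number of bytes written before the NUL. */
size_t itoa_dec(unsigned int value, char *out)
{
    assert(out != NULL);
    char stack[10];        /* remainders, least significant digit first */
    size_t depth = 0;

    while (value >= 10) {  /* same test as: cmp eax, ebx / jl itoa_unroll */
        stack[depth++] = (char)(value % 10);   /* push edx */
        value /= 10;                           /* div ebx  */
    }

    size_t n = 0;
    out[n++] = (char)('0' + value);            /* most significant digit */
    while (depth > 0)                          /* pop until the stack is empty */
        out[n++] = (char)('0' + stack[--depth]);
    out[n++] = '\n';
    out[n] = '\0';
    return n;
}
```

The return value plays the role of the len routine: for 5 + 44412 the buffer holds "44417" plus a newline and the function returns 6, which is the byte count to hand to sys_write.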
On NASM in Arch Linux, how can I append the character zero ('0') to a 32 bit variable? My reason for wanting to do this is so that I can output the number 10 by setting a single-digit input to 1 and appending a zero. I need to figure out how to append the zero.
The desirable situation:
Please enter a number: 9
10
Using this method, I want to be able to do this:
Please enter a number: 9999999
10000000
How can I do this?
Thanks in advance,
RileyH
Well, as Bo says... but I was bored. You seem resistant to doing this the easy way (convert your input to a number, add 1, and convert it back to text) so I tried it using characters. This is what I came up with. It's horrid, but "seems to work".
; enter a number and add 1 - the hard way!
; nasm -f elf32 myprog.asm
; ld -o myprog myprog.o -melf_i386
global _start
; you may have these in an ".inc" file
sys_exit equ 1
sys_read equ 3
sys_write equ 4
stdin equ 0
stdout equ 1
stderr equ 2
LF equ 10
section .data
prompt db "Enter a number - not more than 10 digits - no nondigits.", LF
prompt_size equ $ - prompt
errmsg db "Idiot human! Follow instructions next time!", LF
errmsg_size equ $ - errmsg
section .bss
buffer resb 16
fakecarry resb 1
section .text
_start:
nop
mov eax, sys_write
mov ebx, stdout
mov ecx, prompt
mov edx, prompt_size
int 80h
mov eax, sys_read
mov ebx, stdin
mov ecx, buffer + 1 ; leave a space for an extra digit in front
mov edx, 11
int 80h
cmp byte [buffer + 1 + eax - 1], LF
jz goodinput
; pesky user has tried to overflow us!
; flush the buffer, yell at him, and kick him out!
sub esp, 4 ; temporary "buffer"
flush:
mov eax, sys_read
; ebx still okay
mov ecx, esp ; buffer is on the stack
mov edx, 1
int 80h
cmp byte [ecx], LF
jnz flush
add esp, 4 ; "free" our "buffer"
jmp errexit
goodinput:
lea esi, [buffer + eax - 1] ; end of input characters
mov byte [fakecarry], 1 ; only because we want to add 1
xor edx, edx ; count length as we go
next:
; check for valid decimal digit
mov al, [esi]
cmp al, '0'
jb errexit
cmp al, '9'
ja errexit
add al, [fakecarry] ; from previous digit, or first... to add 1
mov byte [fakecarry], 0 ; reset it for next time
cmp al, '9' ; still good digit?
jna nocarry
; fake a "carry" for next digit
mov byte [fakecarry], 1
mov al, '0'
cmp esi, buffer + 1
jnz nocarry
; if first digit entered, we're done
; save last digit and add one ('1') into the space we left
mov [esi], al
inc edx
dec esi
mov byte [esi], '1'
inc edx
dec esi
jmp done
nocarry:
mov [esi], al
inc edx
dec esi
cmp esi, buffer
jnz next
done:
inc edx ; count the trailing LF as well
lea ecx, [esi + 1] ; esi stopped one byte before the first output character (buffer or buffer + 1)
mov ebx, stdout
mov eax, sys_write
int 80h
xor eax, eax ; claim "no error"
exit:
mov ebx, eax
mov eax, sys_exit
int 80h
errexit:
mov edx, errmsg_size
mov ecx, errmsg
mov ebx, stderr
mov eax, sys_write
int 80h
mov ebx, -1
jmp exit
;-----------------------------
Is that what you had in mind?
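In case the register shuffling obscures the idea, here is a small C model of the same character-based increment. It is a sketch under the same layout assumption as the assembly: one spare byte at buf[0], digits in buf[1] through buf[len]. The function name increment_decimal is made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Add 1 to the ASCII decimal number stored in buf[1..len].
   buf[0] is a spare slot for a possible extra leading digit.
   Returns a pointer to the first character of the result (buf or buf + 1),
   or NULL if a non-digit is found (digits to its right may already
   have been updated by then). */
char *increment_decimal(char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    int carry = 1;                        /* the "+1" enters as an initial carry */
    for (size_t i = len; i >= 1; i--) {   /* last digit first */
        if (buf[i] < '0' || buf[i] > '9')
            return NULL;
        int d = buf[i] - '0' + carry;
        carry = d > 9;                    /* 9 + 1 rolls over to 0, carry 1 */
        buf[i] = (char)('0' + (carry ? 0 : d));
    }
    if (carry) {                          /* ran out of digits: use the spare slot */
        buf[0] = '1';
        return buf;
    }
    return buf + 1;
}
```

Starting from a buffer holding a space followed by "9999999", the call returns the start of the buffer, which now reads "10000000".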
I made my own implementation of strlen in assembly, but it doesn't return the correct value. It returns the string length + 4, and I don't see why. I hope one of you does.
Assembly source:
section .text
[GLOBAL stringlen:] ; C function
stringlen:
push ebp
mov ebp, esp ; setup the stack frame
mov ecx, [ebp+8]
xor eax, eax ; loop counter
startLoop:
xor edx, edx
mov edx, [ecx+eax]
inc eax
cmp edx, 0x0 ; null byte
jne startLoop
end:
pop ebp
ret
And the main routine:
#include <stdio.h>
extern int stringlen(char *);
int main(void)
{
printf("%d", stringlen("h"));
return 0;
}
Thanks
You are not accessing bytes (characters), but doublewords. So your code is not looking for a single terminating zero; it is looking for 4 consecutive zeroes. Note that it won't always return the correct value + 4: the result depends on what the memory after your string contains.
To fix, you should use byte accesses, for example by changing edx to dl.
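To make the effect concrete, here is a C model of both loops. The memory layout in the usage note ("h", then "%d", then zero padding) is purely an assumption about what might follow the string. The function names are invented for this sketch, and the dword version assumes the memory stays readable until it finds four zero bytes in a row.

```c
#include <stdint.h>
#include <string.h>

/* Model of the original loop: load 4 bytes at mem + count, bump the
   counter, and stop only when all four bytes are zero. */
size_t len_dword_scan(const unsigned char *mem)
{
    size_t count = 0;
    uint32_t dword;
    do {
        memcpy(&dword, mem + count, sizeof dword);  /* mov edx, [ecx+eax] */
        count++;                                    /* inc eax            */
    } while (dword != 0);                           /* cmp edx, 0 / jne   */
    return count;
}

/* Byte-sized version: look at one character at a time and count
   only the characters before the terminating zero. */
size_t len_byte_scan(const unsigned char *mem)
{
    size_t count = 0;
    while (mem[count] != 0)                         /* mov dl, [ecx+eax] / cmp dl, 0 */
        count++;
    return count;
}
```

On a 12-byte array initialised with { 'h', 0, '%', 'd', 0 } and zeros after that, len_dword_scan returns 5 (the true length plus 4) while len_byte_scan returns 1. With different bytes after the string, the dword version would give a different wrong answer.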
Thanks for your answers. Below is working code for anyone who has the same problem as me.
section .text
[GLOBAL stringlen:]
stringlen:
push ebp
mov ebp, esp
mov edx, [ebp+8] ; the string
xor eax, eax ; loop counter
jmp if
then:
inc eax
if:
mov cl, [edx+eax]
cmp cl, 0x0
jne then
end:
pop ebp
ret
Not sure about the four, but it seems obvious it will always return the proper length + 1, since eax is always increased, even if the first byte read from the string is zero.
Change the line
mov edx, [ecx+eax]
to
mov dl, byte [ecx+eax]
and
cmp edx, 0x0 ; null byte
to
cmp dl, 0x0 ; null byte
Because you have to compare only one byte at a time.
Here is the code with byte-sized accesses. Your original code also has an off-by-one error: for "h" it returns 2, counting the 'h' plus the null character.
section .text
[GLOBAL stringlen:] ; C function
stringlen:
push ebp
mov ebp, esp ; setup the stack frame
mov ecx, [ebp+8]
xor eax, eax ; loop counter
startLoop:
xor dx, dx
mov dl, byte [ecx+eax]
inc eax
cmp dl, 0x0 ; null byte
jne startLoop
end:
dec eax ; the loop also counted the terminating null, so take it back off
pop ebp
ret
An easier way (zero-terminated ASCII strings only):
REPNE SCASB ; with AL = 0: scan forward until a zero byte is found
http://pdos.csail.mit.edu/6.828/2006/readings/i386/REP.htm
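The closest C analogue to a repeated byte scan with AL = 0 is memchr, which also searches forward for one byte value within a maximum count. A minimal sketch (the name scan_len and the max parameter are assumptions for this example, with max playing the role of the count in ECX):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Length of the zero-terminated string at s, looking at no more than
   max bytes. Returns max when no zero byte is found in that range. */
size_t scan_len(const char *s, size_t max)
{
    assert(s != NULL);
    const char *nul = memchr(s, 0, max);   /* search forward for a zero byte */
    return nul ? (size_t)(nul - s) : max;
}
```

For a 16-byte buffer holding "hello", scan_len(buf, 16) gives 5, and scan_len(buf, 3) gives 3 because the scan stops before it reaches the terminator.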
I think your inc should be after the jne. I'm not familiar with this assembly, so I don't really know.