I'm trying to recreate the strcpy function in asm. When I use movsb to move a byte from [rsi] to [rdi], my program segfaults. I'm not sure whether movsb is the right instruction to use here; I'm just beginning to learn assembly. Here is the code:
global ft_strcpy
section .text:
ft_strcpy:
mov rcx, 0
jmp copy
copy:
inc rcx
cmp BYTE[rsi], 0
je exit
cld
movsb
jmp copy
exit:
movsb
sub rdi, rcx
mov rax, rdi
ret
And here's a simple main to test it:
char *ft_strcpy(char *d, char *s);
int main(void)
{
char *s = "hello";
char *d = "world!!!!";
ft_strcpy(d, s); //it crashes here and with lldb it says it's at movsb
return (0);
}
Thanks for your help
Related
My goal is to swap the first character with the last character of the string some_str in x86-assembly.
Here is my attempt:
; assemble and link with:
; nasm -f elf32 -g test.asm && ld -melf_i386 test.asm.o -o test
section .text
global _start
extern printf
_start:
mov eax, some_str
_loop:
mov di, [eax + 4] ; ptr to end char
mov si, [eax] ; ptr to start char
mov dl, [di] ; DL = end char
mov al, [si] ; AL = start char
mov [si], dl ; start char = end char
mov [di], al ; end char = char 1
mov edx, len
mov ecx, eax
mov ebx, 1
mov eax, 4
int 0x80
ret
mov eax, 1
int 0x80
section .data
some_str db `abcd`, 0xa
len equ $ - some_str
For some reason that escapes me, the lines:
mov dl, [di] ; DL = end char
mov al, [si] ; AL = start char
cause the program to crash with a segmentation fault.
The expected stdout is:
dbca
Actual stdout:
Segmentation fault (core dumped)
Is there something I am missing? How do I correct this code so that it swaps the first and last character of some_str?
Your code seems to be doing something much more complicated than necessary. After mov eax, some_str, eax points to one of the bytes to be swapped, and eax+3 points to the other. (The string is `abcd` followed by a newline, so 'd' is at offset 3; offset 4, as used in your code, is the newline.) So just load them into two 8-bit registers and then store them back the other way around.
mov eax, some_str
mov cl, [eax]
mov dl, [eax + 3]
mov [eax + 3], cl
mov [eax], dl
And you're done and can proceed to write out the result.
Note it isn't necessary to load the pointer into eax first; you could also do
mov cl, [some_str]
mov dl, [some_str + 3]
mov [some_str + 3], cl
mov [some_str], dl
If you really wanted to have two different registers to point to the two different bytes: first of all, they need to be 32-bit registers. Trying to address memory in 32-bit mode using the 16-bit registers si, di is practically never going to work. Second, mov edi, [eax] would load edi with the contents of the memory at location eax, which is some bytes of your string, not a pointer. You'd want simply mov edi, eax. For the second one, you can use lea to do the arithmetic of an effective address calculation but keep the resulting pointer instead of doing a load. So I think the way to turn your code into something in the original (inefficient) spirit, but correct, would be
mov edi, eax
lea esi, [eax+3]
mov dl, [edi]
mov cl, [esi] ; cl rather than al, so the pointer in eax survives for the later mov ecx, eax
mov [esi], dl
mov [edi], cl
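The same swap can be sketched in C, which makes the difference between a pointer and the byte it points to explicit. This is only an illustration: swap_bytes and its index parameters are hypothetical names, not part of the question's code. For `abcd` followed by a newline, 'a' is at index 0 and 'd' is at index 3.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: swap the bytes at indices i and j of buf in place. */
void swap_bytes(char *buf, size_t i, size_t j)
{
    assert(buf != NULL);
    char *p = buf + i;  /* address arithmetic only, no load (like lea) */
    char *q = buf + j;
    char a = *p;        /* 8-bit load through a full-width pointer */
    char b = *q;
    *p = b;             /* store the two bytes back the other way around */
    *q = a;
}
```

Calling swap_bytes(s, 0, 3) on a writable array holding "abcd\n" leaves the trailing newline where it was.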
I'm trying to write a function that reverses the order of characters in a string using x86 NASM assembly. I tried doing it using registers (I know it's more efficient to do it using the stack), but I keep getting a segmentation fault. The C declaration looks as follows:
extern char* reverse(char*);
The assembly segment:
section .text
global reverse
reverse:
push ebp ; prologue
mov ebp, esp
mov eax, [ebp+8] ; eax <- points to string
mov edx, eax
look_for_last:
mov ch, [edx] ; put char from edx in ch
inc edx
test ch, ch
jnz look_for_last ; if char != 0 loop
sub edx, 2 ; found last
swap: ; eax = first, edx = last (characters in string)
test eax, edx
jg end ; if eax > edx reverse is done
mov cl, [eax] ; put char from eax in cl
mov ch, [edx] ; put char from edx in ch
mov [edx], cl ; put cl in edx
mov [eax], ch ; put ch in eax
inc eax
dec edx
jmp swap
end:
mov eax, [ebp+8] ; move char pointer to eax (func return)
pop ebp ; epilogue
ret
It seems like the line causing the segmentation fault is
mov cl, [eax]
Why is that happening? In my understanding eax never goes beyond the bounds of the string so there always is something in [eax]. How come I get a segmentation fault?
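OK, I figured it out: I mistakenly used test eax, edx where I should have used cmp eax, edx. test computes a bitwise AND of its operands, whereas cmp subtracts them, so only cmp sets the flags that jg needs to compare the two pointers. It works now.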
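For comparison, here is a minimal C sketch of the same two-pointer reversal. The name reverse_c is made up for this example. The loop condition is an ordinary pointer comparison, which is what cmp followed by a conditional jump expresses in the assembly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical C counterpart of the asm routine: reverse s in place. */
char *reverse_c(char *s)
{
    assert(s != NULL);
    size_t len = strlen(s);
    if (len < 2)
        return s;               /* nothing to swap for "" or one char */

    char *first = s;
    char *last = s + len - 1;   /* last character before the terminator */
    while (first < last) {      /* compare the pointers (cmp, not test) */
        char tmp = *first;
        *first++ = *last;
        *last-- = tmp;
    }
    return s;
}
```

As with the assembly version, the argument has to be a writable buffer, for example a char array, not a string literal.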
extern puts
global main
section .text
main:
mov rax, rdi
label:
test rax, rax
je exit
push rsi
mov rdi, [rsi]
call puts
pop rsi
dec rax
add rsi, 8
jmp label
exit:
pop rsi
ret
I wrote the NASM code above. However, a segmentation fault occurs at the end, and I can't understand why.
rax is not guaranteed to be preserved across function calls, as it is used to return integer results from functions (in the case of puts, "a nonnegative number on success, or EOF on error"). You need to save the value of rax before calling puts, like you're doing with rsi, and restore it afterwards.
You evidently want to get the command-line parameters in a GCC environment on 64-bit Linux, where they are passed according to the GCC calling convention, which follows the Linux calling convention, the "System V AMD64 ABI".
Let's translate the program logic to C:
#include <stdio.h>
int main ( int argc, char** argv )
{
if (argc != 0)
{
do
{
puts (*argv);
argc--;
argv++;
} while (argc);
}
return;
}
The asm program doesn't return an exit code. That exit code should be in RAX when the function returns. BTW: argc is always >0 since the first string of argv holds the program name.
The main function is both "caller" (calls puts) and "callee" (returns to the GCC environment). As caller it has to preserve RAX and RSI before the call to puts and restore them when it needs them. A callee-saved register is not used. Don't forget to align the stack to a 16-byte boundary before the call.
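The alignment arithmetic can be checked with a small C model. This is a sketch under the assumption, taken from the System V AMD64 ABI, that rsp is a multiple of 16 immediately before a call instruction, so it is 8 bytes off on entry to main once the return address has been pushed. The helper names and the sample address are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: value of rsp after `pushes` 8-byte pushes. */
uint64_t rsp_after_pushes(uint64_t rsp, unsigned pushes)
{
    return rsp - 8u * (uint64_t)pushes;  /* the stack grows downward */
}

/* Hypothetical helper: 1 if rsp sits on a 16-byte boundary, else 0. */
int is_aligned16(uint64_t rsp)
{
    return (rsp % 16) == 0;
}

/* Count the pushes needed on entry to a function (rsp % 16 == 8)
   so that rsp is 16-byte aligned again before the next call. */
unsigned pushes_needed(uint64_t entry_rsp, unsigned wanted)
{
    assert(entry_rsp % 16 == 8);
    unsigned n = wanted;
    while (!is_aligned16(rsp_after_pushes(entry_rsp, n)))
        n++;                             /* add one padding push */
    return n;
}
```

With two registers to save (rax and rsi), pushes_needed reports three pushes, which is why a third, otherwise unneeded push is used as padding.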
This works:
extern puts
global main
section .text
main: ; RDI: argc, RSI: argv, stack is unaligned by 8
mov rax, rdi
label:
test rax, rax
je exit
push rbx ; Push 8 bytes to align the stack before the call
push rax ; Save it (caller-saved)
push rsi ; Save it (caller-saved)
mov rdi, [rsi] ; Argument for puts
call puts
pop rsi ; Restore it
pop rax ; Restore it
pop rbx ; "Unalign" the stack
dec rax
add rsi, 8
jmp label
exit:
; pop rsi ; One pop too many
xor eax, eax ; RAX = 0 (return 0)
ret ; RAX: return value
My problem is related to assembly and shellcoding.
I started off by writing my first shellcode and it has worked out pretty well so far. I then wrote an assembly version of the following C code:
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
int main()
{
int fd = open("test.txt", O_CREAT | O_WRONLY);
write(fd, "Hello World!", 6);
return 0;
}
The assembly code for that piece looks like this:
global _start
_start:
xor eax, eax ; null eax reg
push 0x7478742e ; push "test.txt" on stack
push 0x74736574
mov ebx, esp ; first Argument
mov cl, 0x41 ; Flags O_CREAT | O_WRONLY
mov al, 0x5 ; sys_open
int 0x80
push 0x736b6330 ; "shellcodingr0cks"
push 0x72676e69
push 0x646f636c
push 0x6c656853
mov ebx, eax ; file identifier
mov ecx, esp ; string on the stack
mov dl, 0x10 ; 0x10 is the size of the string
mov al, 0x4 ; sys_write
int 0x80
xor eax, eax ; exit proc
inc eax
int 0x80
The program works pretty well and I get the expected output, but there is one problem and I don't know why it occurs.
The filename of the file I'm writing to should be test.txt but it is writing to test.txt^A. I don't know where the ^A is coming from, nor do I know how to fix it.
Does anyone know what is wrong, and how I can fix it?
I made my own implementation of strlen in assembly, but it doesn't return the correct value. It consistently returns the string length + 4. I don't see why, and I hope one of you does.
Assembly source:
section .text
[GLOBAL stringlen:] ; C function
stringlen:
push ebp
mov ebp, esp ; setup the stack frame
mov ecx, [ebp+8]
xor eax, eax ; loop counter
startLoop:
xor edx, edx
mov edx, [ecx+eax]
inc eax
cmp edx, 0x0 ; null byte
jne startLoop
end:
pop ebp
ret
And the main routine:
#include <stdio.h>
extern int stringlen(char *);
int main(void)
{
printf("%d", stringlen("h"));
return 0;
}
Thanks
You are not accessing bytes (characters), but doublewords. So your code is not looking for a single terminating zero; it is looking for 4 consecutive zeroes. Note that it won't always return the correct value + 4; that depends on what the memory after your string contains.
To fix, you should use byte accesses, for example by changing edx to dl.
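The effect of the load width can be modelled in C. The sketch below is illustrative only: scan_dword, scan_byte and the cap bound are hypothetical, and both loops test before incrementing so that only the load width differs. The 4-byte scan stops only when it finds four zero bytes in a row, so its result depends on what follows the string.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scan with 4-byte loads, like mov edx, [ecx+eax] / cmp edx, 0.
   cap bounds the scan so this model never reads outside buf. */
size_t scan_dword(const unsigned char *buf, size_t cap)
{
    assert(buf != NULL);
    size_t i = 0;
    while (i + 4 <= cap) {
        uint32_t v;
        memcpy(&v, buf + i, sizeof v);  /* unaligned 4-byte load */
        if (v == 0)
            break;                      /* needs four zero bytes */
        i++;
    }
    return i;
}

/* Scan with 1-byte loads, like mov dl, [ecx+eax] / cmp dl, 0. */
size_t scan_byte(const unsigned char *buf, size_t cap)
{
    assert(buf != NULL);
    size_t i = 0;
    while (i < cap && buf[i] != 0)
        i++;
    return i;
}
```

On a buffer holding 'h', 0, 'X', 'Y', 'Z' followed by zeroes, the byte scan stops at index 1, while the doubleword scan runs on to index 5.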
Thanks for your answers. Below is working code for anyone who has the same problem.
section .text
[GLOBAL stringlen:]
stringlen:
push ebp
mov ebp, esp
mov edx, [ebp+8] ; the string
xor eax, eax ; loop counter
jmp if
then:
inc eax
if:
mov cl, [edx+eax]
cmp cl, 0x0
jne then
end:
pop ebp
ret
Not sure about the four, but it seems obvious it will always return the proper length + 1, since eax is always increased, even if the first byte read from the string is zero.
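The off-by-one can be seen in a short C sketch of the two loop shapes. Both function names are made up for this illustration. The first increments the counter before testing the byte it just read, the second tests first.

```c
#include <assert.h>
#include <stddef.h>

/* Counter is bumped before the test, as in the question's loop:
   the terminating zero is counted too. */
size_t len_inc_before_test(const char *s)
{
    assert(s != NULL);
    size_t n = 0;
    char c;
    do {
        c = s[n];
        n++;            /* inc eax happens even when c is 0 */
    } while (c != 0);
    return n;
}

/* Counter is bumped only after a non-zero byte was seen. */
size_t len_test_before_inc(const char *s)
{
    assert(s != NULL);
    size_t n = 0;
    while (s[n] != 0)
        n++;
    return n;
}
```

For "h" the first version yields 2 and the second yields 1; for an empty string they yield 1 and 0.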
Change the line
mov edx, [ecx+eax]
to
mov dl, byte [ecx+eax]
and
cmp edx, 0x0 ; null byte
to
cmp dl, 0x0 ; null byte
This is because you have to compare only one byte at a time.
Here is the code with that change. Note that your original code also has an off-by-one error, which this version keeps: for "h" it returns 2 ('h' plus the null character).
section .text
[GLOBAL stringlen:] ; C function
stringlen:
push ebp
mov ebp, esp ; setup the stack frame
mov ecx, [ebp+8]
xor eax, eax ; loop counter
startLoop:
xor dx, dx
mov dl, byte [ecx+eax]
inc eax
cmp dl, 0x0 ; null byte
jne startLoop
end:
pop ebp
ret
An easier way (ASCII zero-terminated strings only), searching for the zero byte in AL:
REPNE SCAS m8
http://pdos.csail.mit.edu/6.828/2006/readings/i386/REP.htm
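As a rough guide to what the string-scan approach does, here is a C model of the common strlen idiom built on the REPNE form of SCAS (AL = 0, ECX = all ones, then not ecx / dec ecx). The function name is invented and the variables only mimic the registers; this is a sketch of the instruction's behaviour as I understand it, not code from the linked page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of: mov ecx, -1 / xor al, al / repne scasb /
   not ecx / dec ecx. */
size_t strlen_scas_model(const char *s)
{
    assert(s != NULL);
    const unsigned char *edi = (const unsigned char *)s;
    const unsigned char al = 0;     /* byte being searched for */
    uint32_t ecx = 0xFFFFFFFFu;     /* maximum repeat count */

    /* repne scasb: compare al with [edi], advance edi, decrement
       ecx, and stop on a match or when ecx reaches zero. */
    while (ecx != 0) {
        int equal = (*edi == al);
        edi++;
        ecx--;
        if (equal)
            break;
    }

    /* not ecx gives the number of bytes scanned (terminator
       included); dec ecx drops the terminator. */
    return (size_t)(~ecx - 1u);
}
```

For "h" the scan touches two bytes, so ecx ends at 0xFFFFFFFD and the result is 1.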
I think your inc should be after the jne. I'm not familiar with this assembly, so I don't really know.