I am new to assembly and am aware that my assembly code may not be efficient or could be better. The comments in the assembly may be slightly out of date due to constant changes. The goal is to print each character of the string individually and, when the code comes across a format specifier like %s, to print a string from one of the parameters in place of %s.
So for example:
String: Hello, %s
Parameter (RSI): Foo
Output: Hello, Foo
The code does what it is supposed to do, but it gives a segmentation fault at the end.
.bss
char: .byte 0
.text
.data
text1: .asciz "%s!\n"
text2: .asciz "My name is %s. I think I’ll get a %u for my exam. What does %r do? And %%?\n"
word1: .asciz "Piet"
.global main
main:
pushq %rbp # push the base pointer (and align the stack)
movq %rsp, %rbp # copy stack pointer value to base pointer
movq $text1, %rdi
movq $word1, %rsi
movq $word1, %rdx
movq $word1, %rcx
movq $word1, %r8
movq $word1, %r9
call myPrint
end:
movq %rbp, %rsp # clear local variables from stack
popq %rbp # restore base pointer location
movq $60, %rax
movq $0, %rdi
syscall
myPrint:
pushq %rbp
movq %rsp, %rbp
pushq %rsi
pushq %rdx
pushq %rcx
pushq %r8
pushq %r9
movq %rdi, %r12
regPush:
movq $0, %rbx
#rbx: counter
printLooper:
movb (%r12), %r14b #Get a byte of r12 to r14
cmpb $0, %r14b #Check if r14 is a null byte
je endPrint #If it is a null byte then go to 'endPrint'
cmpb $37, %r14b
je formatter
incq %r12 #Increment r12 to the next byte
skip:
mov $char, %r15 #Move char address to r15
mov %r14b, (%r15) #Move r14 byte into the value of r15
mov $char, %rcx #Move char address into rcx
movq $1, %r13 #For the number of byte
printer:
movq $0, %rsi #Clearing rsi
mov %rcx, %rsi #Move the address to rsi
movq $1, %rax #Sys write
movq $1, %rdi #Output
movq %r13, %rdx #Number of byte to rdx
syscall
jmp printLooper
formatter:
incq %r12 #Moving to char after "%"
movb (%r12), %r14b #Moving the char byte into r14
cmpb $115, %r14b #Compare 's' with r14
je formatString #If it is equal to 's' then jump to 'formatString'
movb -1(%r12), %r14b #Put back the previous char into r14
jmp skip
####String Formatter Start ##################################################
formatString:
addq $1, %rbx
movq $8, %rax
mulq %rbx
subq %rax, %rbp
movq (%rbp), %r15
pushq %r15 ### into the stack
movq $0, %r13 ### Byte counter
formatStringLoop:
movb (%r15), %r14b #Move char into r14
cmpb $0, %r14b #Compare r14 with null byte
je formatStringEnd #If it is equal, go to 'formatStringEnd'
incq %r15 #Increment to next char
addq $1, %r13 #Add 1 to the byte counter
jmp formatStringLoop#Loop again
formatStringEnd:
popq %rcx #Pop the address into rcx
incq %r12 #Moving r12 to next char
jmp printer
#######String Formatter End #############################################
endPrint:
movq %rbp, %rsp
popq %rbp
ret
In formatString you modify %rbp with subq %rax, %rbp, forgetting that you will later restore %rsp from it. So when you execute movq %rbp, %rsp just before the function returns, %rsp ends up pointing somewhere else, and ret pops the wrong return address.
I guess you are subtracting some offset from %rbp to get some space on the stack. This seems unsafe because you have already pushed lots of other data there. It is safe to use up to 128 bytes below the stack pointer, as this is the red zone, but it would be more natural to use an offset from %rsp instead. Using SIB addressing, you can access data at constant or variable offsets from %rsp without actually changing its value.
How I found this with gdb: by setting breakpoints at myPrint and endPrint, I found that %rsp was different at the ret than it was on entry. Its value could only have come from %rbp, so I did watch $rbp to have the debugger break when %rbp changed, and it pointed straight to the offending instruction in formatString. (Which I could also have found by searching the source code for %rbp.)
Also, your .text at the top of the file is misplaced, so all your code gets placed in the .data section. This actually works but it surely is not what you intended.
.globl start
.section .text
_start:
movq $2, %rbx
movq $3, %rcx
movq $1, %rax
mainloop:
addq $0, %rcx
jz complete
mulq %rbx
decq %rcx
jmp mainloop
complete:
movq %rax, %rdi
movq $60, %rax
syscall
I have been trying to run this code, but I keep getting an "illegal instruction" error through the assembler.
I cannot figure out why; it is supposed to run through the GNU assembler.
For this assignment I have to write an assembly "function" to be called from C code. Given an integer and a memory address (the address of a char array, to be used as a string), the function must convert the integer to a string whose starting address is the given memory address.
I'm on Ubuntu Linux, btw.
Here's the Assembly code (I tried to make it using the Linux x86_64 ABI calling conventions)(It is in AT&T syntax):
.global dec
.type dec, @function
.text
dec:
######################### Subroutine prologue
push %rbp # Save the base pointer
movq %rsp, %rbp # Make the stack pointer the new base pointer
push %rdi # Stack parameter 1
push %rsi # Stack parameter 2
push %rbx # Save callee-saved registers
push %r12
push %r13
push %r14
push %r15
######################### Subroutine body
movq %rdi, %rax
xor %rcx, %rcx
addDigit:
cmp $0, %rax
je putMem
xor %rdx, %rdx
mov $10, %ebx
div %ebx
addq $'0', %rdx
pushq %rdx
inc %rcx
jmp addDigit
putMem:
cmp $0, %rcx
je endProg
popq (%rsi)
add $1, %rsi
dec %rcx
jmp putMem
endProg:
movq $0x0, (%rsi)
movq -16(%rbp), %rsi
mov $1, %rax
######################### Subroutine epilogue
popq %r15 # Restore callee-saved registers
popq %r14
popq %r13
popq %r12
popq %rbx
movq %rbp, %rsp # Reset stack to base pointer.
popq %rbp # Restore the old base pointer
ret # Return to caller
And here is my C code:
extern int dec(int num, char* c);
#include <stdio.h>
int main(){
char* a = "Test\n";
dec(0x100, a);
printf("Num: %s\n", a);
}
It compiles without any problems, but when I try to run, it segfaults.
I've tried debugging it with gdb, and apparently the problem occurs when I try to run the instruction
pop (%rsi)
So, I made a few changes in my C code:
extern int dec(int num, char* c);
#include <stdio.h>
int main(){
char c;
dec(0x100, &c);
printf("Num: %s\n", &c);
}
Now, when I attempt to run it, I get this message:
Num: 256
*** stack smashing detected ***: ./teste.out terminated
Aborted (core dumped)
Can someone help me understand what's going on here and how do I fix my code?
Thanks in advance.
I was writing this simple program to calculate the ith element of a recursive sequence. The sequence basically looks like
a(n)=a(n-1)*a(n-2)
with the first two elements being -1 and -3. I use imul for multiplying and, from what I found on the net, I should be able to use any registers I want, but the program returns 0 for the third element. When I switch to add, it works as intended.
Here's the fragment where I recursively call the function and multiply (as you can see, I use the stack to store my variables)
push %rcx
push %rax
call calculate
pop %rax
pop %rcx
imul %rcx, %rbx
Basically, the question is "why doesn't it work?" :P
PS. In case full code is needed:
.data
STDOUT = 1
SYSWRITE = 1
HOW_MANY = 3 # which number to calculate
SYSEXIT = 60
EXIT_SUCCESS = 0
FIRST = -1 # first element of the sequence
SECOND = -3 # second element of the sequence
NUMBER_BEGIN = 0x30
OUTPUT_BASE = 10
NEW_LINE = '\n'
PLUS = '+'
MINUS = '-'
.bss
.comm textin, 512
.comm textout, 512
.comm text2, 512
.comm znak, 1
.text
.globl _start
_start:
#
# Calling function to calculate ith element
#
mov $HOW_MANY, %r8
sub $1, %r8
push %r8 # push r8 (function argument) to stack
call calculate # call function to calculate
add $8, %rsp # removing parameter from stack
# now we should have the result in rbx
#
mov $0, %r15 # Sign flag (default 0 = +)
cmp $0, %rbx
jge to_ascii # Skip if the number is not negative
not %rbx # Invert the bits of the number and add 1
inc %rbx # to get its absolute value.
mov $1, %r15 # Set the sign flag to 1 = -.
to_ascii:
mov %rbx, %rax # result goes to rax
mov $OUTPUT_BASE, %rbx
mov $0, %rcx
jmp loop
loop:
mov $0, %rdx
div %rbx # divide rax by rbx, remainder in rdx
add $NUMBER_BEGIN, %rdx # the remainder in rdx is the next digit
mov %dl, text2(, %rcx, 1)
inc %rcx
cmp $0, %rax
jne loop
jmp inverse
inverse:
mov $0, %rdi
mov %rcx, %rsi
dec %rsi
jmp inversev2
inversev2:
mov text2(, %rsi, 1), %rax
mov %rax, textout(, %rdi, 1)
inc %rdi
dec %rsi
cmp %rcx, %rdi
jle inversev2
push %rcx # length of the answer goes to stack
mov $0, %r10 # want sign at the first position
movb $PLUS, znak(, %r10, 1)
cmp $0, %r15 # r15 register contains info about the sign
je next # 0 = +, so nothing has to be done
movb $MINUS, znak(, %r10, 1) # otherwise set it to minus
next: # show sign
mov $SYSWRITE, %rax
mov $STDOUT, %rdi
mov $znak, %rsi
mov $1, %rdx
syscall
pop %rcx
movb $NEW_LINE, textout(, %rcx, 1)
inc %rcx
mov $SYSWRITE, %rax
mov $STDOUT, %rdi
mov $textout, %rsi
mov %rcx, %rdx
syscall
mov $SYSEXIT, %rax
mov $EXIT_SUCCESS, %rdi
syscall
# recursive function calculating ith element of a given sequence
# sequence =
# n_1 = -1
# n_2 = -3
# n_i = n_(i-1)*n_(i-2)
calculate:
push %rbp # push rbp to stack to save its value
mov %rsp, %rbp # now stack pointer is stored in rbp
sub $8, %rsp
mov 16(%rbp), %rax
cmp $1, %rax
jl first
je second
mov $0, %rcx
# call for n_(i-1)
dec %rax
push %rcx
push %rax
call calculate
pop %rax
pop %rcx # rewrite to match imul's registers
imul %rcx, %rbx
# call for n_(i-2)
dec %rax
push %rcx
push %rax
call calculate
pop %rax
pop %rcx
imul %rcx, %rbx
return:
mov %rcx, %rbx
mov %rbp, %rsp
pop %rbp
ret
first:
mov $FIRST, %rbx
mov %rbp, %rsp
pop %rbp
ret
second:
mov $SECOND, %rbx
mov %rbp, %rsp
pop %rbp
ret
You are seeding %rcx to zero, then multiplying into that, so you will always have a product of zero.
Perhaps you want to change
mov $0, %rcx
to
mov $1, %rcx
I think you also need to reverse the
imul %rcx, %rbx
to
imul %rbx, %rcx
(I'm not familiar with that flavor of assembler)
I could test using strncpy() with a larger source string than the destination:
#include <stdlib.h>
#include <string.h>
int main() {
char *ptr = malloc(12);
strcpy(ptr,"hello world!");
return 0;
}
Compiling with the flag -fstack-protector and using the -S option I got:
.file "malloc.c"
.text
.globl main
.type main, @function
main:
.LFB2:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
subq $32, %rsp
movl %edi, -20(%rbp)
movq %rsi, -32(%rbp)
movq %fs:40, %rax
movq %rax, -8(%rbp)
xorl %eax, %eax
movq $0, -16(%rbp)
movl $12, %edi
call malloc
movq %rax, -16(%rbp)
movq -16(%rbp), %rax
movabsq $8022916924116329800, %rdx
movq %rdx, (%rax)
movl $560229490, 8(%rax)
movb $0, 12(%rax)
movl $0, %eax
movq -8(%rbp), %rcx
xorq %fs:40, %rcx
je .L3
call __stack_chk_fail
.L3:
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE2:
.size main, .-main
Could someone explain to me how this works? And why isn't the "canary word" also overwritten by the \0 of the hello world! string?
Could someone explain to me how this works?
The canary word is read from %fs:40 and stored at the top of the frame here:
movq %fs:40, %rax
movq %rax, -8(%rbp)
It sits below the return address, so if your code happens to overflow a stack buffer (which will be below -8(%rbp)), the overflow first overwrites the -8(%rbp) location. The code that GCC generates detects this just before ret here:
movq -8(%rbp), %rcx
xorq %fs:40, %rcx # Check that %fs:40 == -8(%rbp)
je .L3 # OK, return
call __stack_chk_fail # Die
since the overwritten contents of -8(%rbp) will most likely differ from the proper value (installed from %fs:40).
And why is not the canary word also overwritten by the \0 of the hello world!?
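Your code has a heap overflow, not a stack buffer overflow, so SSP can't help: the buffer returned by malloc does not live in main's stack frame, so the extra \0 byte never reaches the canary at -8(%rbp).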
I am quite new to Assembly and I am trying to create a program that uses scanf to receive a number from the user. It then outputs "Result: (the number)"
I keep getting a segmentation error upon running the code.
This is the code I have got now:
.global main
mystring: .asciz"input\n"
formatstring: .asciz" %d"
resultstring: .asciz "Result: %ld\n"
main:
movq $0, %rax
movq $mystring, %rdi
call printf
call inout
movq $0, %rax
movq $resultstring, %rdi
call printf
jmp end
inout:
pushq %rbp
subq $8, %rsp
leaq -8(%rbp), %rsi
movq $formatstring, %rdi
movq $0, %rax
call scanf
popq %rbp
ret
end:
movq $0, %rdi
call exit
I suspect there is something wrong with the 'inout' method. Is there a way to make this program work?
leaq -8(%rbp), %rsi
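In this instruction you are referring to the %rbp register, but you forgot to actually initialize it! inout pushes %rbp but never executes movq %rsp, %rbp, so -8(%rbp) is computed from whatever value the caller left in %rbp.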