For this assignment I have to write an assembly "function" to be called from C code. Given an integer and a memory address (the address of a char array, to be used as a string), the function should convert the integer to a string whose starting address is the address that was passed in.
I'm on Ubuntu Linux, btw.
Here's the assembly code (I tried to follow the Linux x86_64 ABI calling conventions; it is in AT&T syntax):
.global dec
.type dec, @function
.text
dec:
######################### Subroutine prologue
push %rbp # Save the base pointer
movq %rsp, %rbp # Make the stack pointer the new base pointer
push %rdi # Stack parameter 1
push %rsi # Stack parameter 2
push %rbx # Save callee-saved registers
push %r12
push %r13
push %r14
push %r15
######################### Subroutine body
movq %rdi, %rax
xor %rcx, %rcx
addDigit:
cmp $0, %rax
je putMem
xor %rdx, %rdx
mov $10, %ebx
div %ebx
addq $'0', %rdx
pushq %rdx
inc %rcx
jmp addDigit
putMem:
cmp $0, %rcx
je endProg
popq (%rsi)
add $1, %rsi
dec %rcx
jmp putMem
endProg:
movq $0x0, (%rsi)
movq -16(%rbp), %rsi
mov $1, %rax
######################### Subroutine epilogue
popq %r15 # Restore callee-saved registers
popq %r14
popq %r13
popq %r12
popq %rbx
movq %rbp, %rsp # Reset stack to base pointer.
popq %rbp # Restore the old base pointer
ret # Return to caller
And here is my C code:
extern int dec(int num, char* c);
#include <stdio.h>
int main(){
char* a = "Test\n";
dec(0x100, a);
printf("Num: %s\n", a);
}
It compiles without any problems, but when I try to run, it segfaults.
I've tried debugging it with gdb, and apparently the problem occurs when I try to run the instruction
pop (%rsi)
So, I made a few changes in my C code:
extern int dec(int num, char* c);
#include <stdio.h>
int main(){
char c;
dec(0x100, &c);
printf("Num: %s\n", &c);
}
Now, when I attempt to run it, I get this message:
Num: 256
*** stack smashing detected ***: ./teste.out terminated
Aborted (core dumped)
Can someone help me understand what's going on here and how to fix my code?
Thanks in advance.
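A likely reading of both failures: in the first version `a` points at a string literal, which lives in read-only memory, so the first store through %rsi faults. In the second version `c` is a single byte, while `popq (%rsi)` and `movq $0x0, (%rsi)` each store 8 bytes, so the routine writes past the variable and trips the stack protector. The sketch below is a C model of what the routine seems intended to do, not the assignment's solution; the name `dec_to_string` and the `cap` parameter are my own assumptions. It stores one byte per digit and needs a caller-supplied buffer big enough for the digits plus the terminating NUL.

```c
#include <stddef.h>

/* Hypothetical C counterpart of the assembly routine: writes the decimal
   digits of num into out (capacity cap bytes) and returns the digit count,
   or -1 if the buffer is too small. */
int dec_to_string(unsigned long num, char *out, size_t cap)
{
    char tmp[20];                 /* 2^64 - 1 has 20 decimal digits */
    size_t n = 0;

    do {                          /* do/while so that 0 yields "0" */
        tmp[n++] = (char)('0' + num % 10);  /* least significant digit first */
        num /= 10;
    } while (num != 0);

    if (n + 1 > cap)              /* digits plus the terminating NUL */
        return -1;

    for (size_t i = 0; i < n; i++)
        out[i] = tmp[n - 1 - i];  /* reverse the order, one byte per store */
    out[n] = '\0';
    return (int)n;
}
```

In the assembly, the equivalent changes would be a writable buffer in main (for example `char buf[32];`) and byte-sized stores (`movb`) in place of `popq (%rsi)` and `movq $0x0, (%rsi)`.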
I am trying to print the character h in assembly, but nothing is output. I cannot see why this is not working.
I suspect it is because I am using %rbp instead of %eax, but I am reasonably new to assembly and do not know whether writing to the %rbp register instead of %eax makes a difference.
.section .text
.global _start
_start:
mov %eax, %edi
call main
movl $1, %eax
int $0x80
main:
pushq %rbp
movq %rsp, %rbp
movl $4, %eax
movl $1, %ebx
push $0x068
movl $5, %edx
movq %rbp, %rsp
syscall
popq %rbp
ret
The code is compiled with
> as $(BIN_DIR)/assembly.asm -o $(BIN_DIR)/a.o
> ld $(BIN_DIR)/a.o -o $(BIN_DIR)/a
I looked up the structure in, e.g., the Free Pascal sources, which illustrate how the parameters of a 64-bit Linux system call are assigned to registers and how success is determined:
movq sysnr, %rax { Syscall number -> rax. }
// for calls that have less parameters, just skip the relevant lines that load it
movq param1, %rdi { shift arg1 - arg5. }
movq param2, %rsi
movq param3, %rdx
movq param4, %r10
movq param5, %r8
movq param6, %r9
syscall { Do the system call. }
cmpq $-4095, %rax { Check %rax for error. }
jnae .LSyscOK { Skip the error handler if there was no error. }
negq %rax
movq %rax,%rdi
call seterrno // call some function to set errno threadvar
movq $-1,%rax
.LSyscOK: // end of procedure
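The convention above can also be sketched in C. On x86-64 Linux the `syscall` instruction uses the 64-bit system call numbers (write is 1, exit is 60) with arguments in %rdi, %rsi, %rdx, while `int $0x80` uses the 32-bit numbers (write is 4, exit is 1) with arguments in %ebx, %ecx, %edx; the code in the question mixes the two. In the sketch below the helper names `raw_syscall_failed` and `write_char` are my own; `syscall()` is the libc wrapper, which already converts an error into a return value of -1 plus errno, so the range check is shown only to mirror the `cmpq $-4095` test.

```c
#define _GNU_SOURCE
#include <unistd.h>
#include <sys/syscall.h>

/* Mirrors the cmpq $-4095 / jnae test: a raw kernel return value in
   [-4095, -1] is a negated errno, anything else is success. */
int raw_syscall_failed(long raw)
{
    return (unsigned long)raw >= (unsigned long)-4095L;
}

/* Writes one character to stdout (fd 1) through the 64-bit interface.
   The libc wrapper loads the number and arguments into the registers
   listed above and returns the number of bytes written, or -1 on error. */
long write_char(char ch)
{
    return syscall(SYS_write, 1, &ch, (size_t)1);
}
```

Calling `write_char('h')` should print h, which suggests that the assembly version needs the 64-bit number in %rax and the address of the character in %rsi before `syscall`.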
I am new to assembly and am aware that my code may not be efficient or could be better. The comments may be slightly out of date due to constant changes. The goal is to print each character of the string individually and, when a format specifier such as %s is encountered, to print a string from one of the parameters in its place.
So for example:
String: Hello, %s
Parameter (RSI): Foo
Output: Hello, Foo
The code does what it is supposed to do, but it gives a segmentation fault at the end.
.bss
char: .byte 0
.text
.data
text1: .asciz "%s!\n"
text2: .asciz "My name is %s. I think I’ll get a %u for my exam. What does %r do? And %%?\n"
word1: .asciz "Piet"
.global main
main:
pushq %rbp # push the base pointer (and align the stack)
movq %rsp, %rbp # copy stack pointer value to base pointer
movq $text1, %rdi
movq $word1, %rsi
movq $word1, %rdx
movq $word1, %rcx
movq $word1, %r8
movq $word1, %r9
call myPrint
end:
movq %rbp, %rsp # clear local variables from stack
popq %rbp # restore base pointer location
movq $60, %rax
movq $0, %rdi
syscall
myPrint:
pushq %rbp
movq %rsp, %rbp
pushq %rsi
pushq %rdx
pushq %rcx
pushq %r8
pushq %r9
movq %rdi, %r12
regPush:
movq $0, %rbx
#rbx: counter
printLooper:
movb (%r12), %r14b #Get a byte of r12 to r14
cmpb $0, %r14b #Check if r14 is a null byte
je endPrint #If it is a null byte then go to 'endPrint'
cmpb $37, %r14b
je formatter
incq %r12 #Increment r12 to the next byte
skip:
mov $char, %r15 #Move char address to r15
mov %r14b, (%r15) #Move r14 byte into the value of r15
mov $char, %rcx #Move char address into rcx
movq $1, %r13 #For the number of byte
printer:
movq $0, %rsi #Clearing rsi
mov %rcx, %rsi #Move the address to rsi
movq $1, %rax #Sys write
movq $1, %rdi #Output
movq %r13, %rdx #Number of byte to rdx
syscall
jmp printLooper
formatter:
incq %r12 #Moving to char after "%"
movb (%r12), %r14b #Moving the char byte into r14
cmpb $115, %r14b #Compare 's' with r14
je formatString #If it is equal to 's' then jump to 'formatString'
movb -1(%r12), %r14b #Put back the previous char into r14
jmp skip
####String Formatter Start ##################################################
formatString:
addq $1, %rbx
movq $8, %rax
mulq %rbx
subq %rax, %rbp
movq (%rbp), %r15
pushq %r15 ### into the stack
movq $0, %r13 ### Byte counter
formatStringLoop:
movb (%r15), %r14b #Move char into r14
cmpb $0, %r14b #Compare r14 with null byte
je formatStringEnd #If it is equal, go to 'formatStringEnd'
incq %r15 #Increment to next char
addq $1, %r13 #Add 1 to the byte counter
jmp formatStringLoop#Loop again
formatStringEnd:
popq %rcx #Pop the address into rcx
incq %r12 #Moving r12 to next char
jmp printer
#######String Formatter End #############################################
endPrint:
movq %rbp, %rsp
popq %rbp
ret
In formatString you modify %rbp with subq %rax, %rbp, forgetting that you will restore %rsp from it. So when you mov %rbp, %rsp just before the function returns, you end up with %rsp pointing somewhere else, and so you get the wrong return address.
I guess you are subtracting some offset from %rbp to get some space on the stack. This seems unsafe because you've pushed lots of other stuff there. It is safe to use up to 128 bytes below the stack pointer as this is the red zone, but it would be more natural to use an offset from %rsp instead. Using SIB addressing you can access data at constant or variable offsets to %rsp without actually changing its value.
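To illustrate the idea of keeping the base fixed and varying only the offset, here is a small C model of the same routine. It is only a sketch: the name `my_format`, the `args` array (standing in for the arguments pushed at the top of myPrint), and the output buffer are my own assumptions, not part of your program.

```c
#include <stddef.h>

/* Hypothetical model of myPrint: expands "%s" using args[0..nargs-1] and
   copies the result into out (capacity cap). Returns the length written. */
size_t my_format(const char *fmt, const char *const args[], size_t nargs,
                 char *out, size_t cap)
{
    size_t len = 0;
    size_t next = 0;              /* plays the role of the %rbx counter */

    if (cap == 0)
        return 0;

    while (*fmt != '\0' && len + 1 < cap) {
        if (fmt[0] == '%' && fmt[1] == 's' && next < nargs) {
            /* args itself never changes; only the index advances */
            const char *s = args[next++];
            while (*s != '\0' && len + 1 < cap)
                out[len++] = *s++;
            fmt += 2;             /* skip the "%s" */
        } else {
            out[len++] = *fmt++;  /* ordinary character, copied as-is */
        }
    }
    out[len] = '\0';
    return len;
}
```

Because `args` is never modified, every lookup is computed from the same base. In the assembly, that corresponds to computing the slot address into a scratch register and leaving %rbp alone.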
How I found this with gdb: by setting breakpoints at myPrint and endPrint, I found that %rsp was different at the ret than it was on entry. Its value could only have come from %rbp, so I did watch $rbp to have the debugger break when %rbp changed, and it pointed straight to the offending instruction in formatString. (Which I could also have found by searching the source code for %rbp.)
Also, your .text at the top of the file is misplaced, so all your code gets placed in the .data section. This actually works but it surely is not what you intended.
I'm studying the x86 assembly language. In order to better understand what's going on behind the scenes of string creation, I have a sample program that just prints a string. GCC produced the following Assembly program, and I'm having trouble understanding the compiler's output:
Assembly Code:
Dump of assembler code for function main:
0x0000000000400596 <+0>: push %rbp
0x0000000000400597 <+1>: mov %rsp,%rbp
0x000000000040059a <+4>: sub $0x10,%rsp
0x000000000040059e <+8>: movq $0x400668,-0x8(%rbp)
0x00000000004005a6 <+16>: mov -0x8(%rbp),%rax
0x00000000004005aa <+20>: mov %rax,%rsi
=> 0x00000000004005ad <+23>: mov $0x400675,%edi
0x00000000004005b2 <+28>: mov $0x0,%eax
0x00000000004005b7 <+33>: callq 0x4004a0 <printf@plt>
0x00000000004005bc <+38>: mov $0x0,%eax
0x00000000004005c1 <+43>: leaveq
0x00000000004005c2 <+44>: retq
C Code:
#include <stdio.h>
int main()
{
char *me = "abcdefghijkl";
printf("%s",me);
}
At the conceptual level, I understand that the stack pointer is being subtracted to allocate memory on the stack, and then somehow, and this is the part I'm having trouble understanding the mechanics of, the program creates the string.
Can someone please help?
Thanks.
It's a lot clearer if you use the -S flag to gcc to create an assembly file for your program (gcc -S asm.c). This generates an asm.s file:
.file "asm.c"
.section .rodata
.LC0:
.string "abcdefghijkl"
.LC1:
.string "%s"
.text
.globl main
.type main, @function
main:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
subq $16, %rsp
movq $.LC0, -8(%rbp)
movq -8(%rbp), %rax
movq %rax, %rsi
movl $.LC1, %edi
movl $0, %eax
call printf
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (GNU) 4.8.5 20150623 (Red Hat 4.8.5-36)"
.section .note.GNU-stack,"",@progbits
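From this you can see that the string is not built at run time: it is just initialized memory in the .rodata section, under the label .LC0. The code only stores the address of .LC0 in the local variable at -8(%rbp) (the subq $16, %rsp makes room for that pointer) and then passes the address to printf in %rsi. Changing the .string line changes the string.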
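The difference between a pointer to a literal and an array that holds a copy can be sketched in C. The function names below are made up for illustration. The first function returns the address of the literal, which stays valid after the function returns because the characters are static data, not stack data. The second declares an array, for which the compiler does have to place the 13 bytes (12 letters plus the NUL) in the stack frame.

```c
#include <stddef.h>

/* Only an address is stored in me; the characters themselves are static
   data, so the returned pointer remains valid after the call. */
const char *literal_address(void)
{
    const char *me = "abcdefghijkl";
    return me;
}

/* A char array is a different case: it is a local object that is
   initialized with a copy of the characters on every call. */
size_t array_copy_size(void)
{
    char copy[] = "abcdefghijkl";
    return sizeof copy;           /* 12 letters plus the terminating NUL */
}
```

Compiling both with gcc -S should show the first one loading an address such as $.LC0, as in the listing above.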
I am quite new to Assembly and I am trying to create a program that uses scanf to receive a number from the user. It then outputs "Result: (the number)"
I keep getting a segmentation fault when I run the code.
This is the code I have got now:
.global main
mystring: .asciz "input\n"
formatstring: .asciz " %d"
resultstring: .asciz "Result: %ld\n"
main:
movq $0, %rax
movq $mystring, %rdi
call printf
call inout
movq $0, %rax
movq $resultstring, %rdi
call printf
jmp end
inout:
pushq %rbp
subq $8, %rsp
leaq -8(%rbp), %rsi
movq $formatstring, %rdi
movq $0, %rax
call scanf
popq %rbp
ret
end:
movq $0, %rdi
call exit
I suspect there is something wrong with the 'inout' routine. How can I make this program work?
leaq -8(%rbp), %rsi
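In this instruction you are referring to the %rbp register, but you forgot to actually initialize it! inout pushes %rbp but never executes movq %rsp, %rbp, so -8(%rbp) is relative to whatever the caller left in %rbp. Also note that the subq $8, %rsp is never undone: popq %rbp then pops the local slot instead of the saved %rbp, and ret uses the saved %rbp as its return address.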
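For reference, here is roughly what inout is trying to do, written in C. This is a sketch under my own assumptions: the name `read_number` is made up, and `sscanf` on a string stands in for `scanf` on stdin so the example can be run without typing input. Two details carry over to the assembly: the conversion should match the size of the slot (`%ld` for an 8-byte slot, whereas `%d` stores only 4 bytes), and the value has to be handed back to the caller, since the second printf in main never gets it in %rsi.

```c
#include <stdio.h>

/* Hypothetical helper: parses one integer from text and stores it in
   *result. Returns 1 on success and 0 if no number could be read. */
int read_number(const char *text, long *result)
{
    long value;                   /* local slot, like -8(%rbp) after setup */

    if (sscanf(text, " %ld", &value) != 1)
        return 0;

    *result = value;              /* hand the value back to the caller */
    return 1;
}
```

A caller would then print it with `printf("Result: %ld\n", value);`, which in assembly means loading the value into %rsi before the call.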
I'm trying to print a floating-point value from assembler by calling printf. It works fine with strings and integer values but fails when printing floats. Here is an example of working code:
global main
extern printf
section .data
message: db "String is: %d %x %s", 10, 0
end_message: db ".. end of string", 0
section .text
main:
mov eax, 0xff
mov edi, message
movsxd rsi, eax
mov rdx, 0xff
mov rcx, end_message
xor rax, rax
call printf
ret
String is: 255 ff .. end of string
So, the parameters are passed through registers: edi contains the address of the format string, rsi and rdx contain the same number to print in decimal and hex, rcx contains the address of the closing string, and rax contains 0 because no floating-point arguments are passed.
This code works fine but something changes while trying to print float:
global main
extern printf
section .data
val: dq 123.456
msg: db "Result is: %fl",10, 0
section .text
main:
mov rdi,msg
movsd xmm0,[val]
mov eax,1
call printf
mov rax, 0
ret
This code snippet assembles but segfaults when executed. It seems that the problem is a wrong value in xmm0, but changing movsd xmm0,[val] to movsd xmm0,val gives an
error: invalid combination of opcode and operands
message.
The assembler is NASM, running on openSUSE 12.3.
Update. I tried to make a c program and produce a .S assembly. It gives a very weird solution:
main:
.LFB2:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
subq $32, %rsp
movl %edi, -4(%rbp)
movq %rsi, -16(%rbp)
movq val(%rip), %rax
movq %rax, -24(%rbp)
movsd -24(%rbp), %xmm0
movl $.LC0, %edi
movl $1, %eax
call printf
movl $0, %eax
leave
.cfi_def_cfa 7, 8
ret
Is it possible to write a simple printf example?
For your assembler problem: you need to align the stack to 16 bytes before calling printf. Insert
sub rsp, 8
right after the main: label, then undo it just before ret:
add rsp, 8
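Some background on why this helps: the x86-64 System V ABI expects %rsp to be a multiple of 16 at the point of a call. On entry to main the return address has just been pushed, so %rsp is 8 bytes off, and the sub rsp, 8 restores the alignment. With a nonzero count in al, printf is likely to save the vector registers with instructions that fault on a misaligned stack. As a simple example, here is the same call in C, where the compiler takes care of both the alignment and the count in al. The wrapper name `format_result` is my own, and `snprintf` is used in place of `printf` only so the result can be inspected.

```c
#include <stdio.h>

/* Hypothetical wrapper around the call from the question. The double
   travels in xmm0 and the compiler sets al to 1 for this variadic call. */
int format_result(char *out, size_t cap, double val)
{
    return snprintf(out, cap, "Result is: %f\n", val);
}
```

Note that %f already accepts a double, so the "%fl" in the question's format string prints the number followed by a literal l.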