How to solve segmentation fault of certain instructions of assembly code? [duplicate] - linux

This question already has an answer here:
Nasm segmentation fault on RET in _start
I wrote some simple assembly code, shown below:
.global _start
.text
_start:
call _sum
subq $0x8, %rsp
popq %rax
ret
_sum:
ret
To see the value of %rax after the popq instruction,
I assembled the code with the as and ld commands,
started gdb with a breakpoint at _start,
and got the following:
B+> │0x400078 <_start> callq 0x400083 <_sum>
│ │0x40007d <_start+5> sub $0x8,%rsp
│ │0x400081 <_start+9> pop %rax
│ │0x400082 <_start+10> retq
│ │0x400083 <_sum> retq
However, before reaching the pop instruction,
I get this error message:
Program received signal SIGSEGV, Segmentation fault.
Cannot access memory at address 0x1
(When I changed $0x8 to any value from $0x0 to $0x7, it all worked.)
It seems the _sum function might be the problem, because it literally does nothing but return.
So, how can I modify this code to get the value of %rax after the popq instruction?
Thanks.

I think this question is probably a duplicate, but in any case there is one problem in your code.
.global _start
.text
_start:
call _sum
subq $0x8, %rsp
popq %rax
ret # <-- return to where?
_sum:
ret
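A main function in C can return because the C runtime's _start eventually calls main, but here you are writing _start yourself. Nothing called _start, so there is no return address on the stack: on Linux, the word at the top of the stack at process entry is argc. A ret there pops that word into the instruction pointer, which matches the fault at address 0x1 when the program is run without arguments.
Instead of ret, use this: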
movl $60, %eax # syscall number for sys_exit
movl $0, %edi # or whatever value you want your process
# to return with from 0 to 255;
# xor %edi, %edi is usually better if you want 0
syscall
Leave a comment if it still crashes.
BTW, I am assuming your platform is Linux (because of the AT&T syntax). Syscall numbers and conventions differ on other platforms.

Related

Assembly -- syscall write returning -38, no output

I'm trying to learn some assembly, and I'm starting out by outputting text to the screen. I'm starting to think it might be my environment and/or compilation: by now, I'm so frustrated that I've literally copy-pasted assembly code but it just won't call the system calls. Here is the source code (mainly adapted from https://en.wikibooks.org/wiki/X86_Assembly/Interfacing_with_Linux)
.section .data
msg: .ascii "Hello World\n"
.section .text
.global main
main:
movq $1, %rdi # write to stdout
movq $msg, %rsi # use string "Hello World"
movq $12, %rdx # write 12 characters
syscall # make syscall
movq $60, %rax # use the _exit syscall
movq $0, %rdi # error code 0
syscall # make syscall
I'm on a 64-bit machine running Kali Linux, and am compiling with GCC. Like so:
gcc -c test.s
gcc test.o -no-pie
I've debugged the program with GDB and the syscall instruction always sets the rax register to 0xffffffffffffffda (-38), which does not seem right...
Can anyone give an insight?
Syscalls usually return a negative value in case of error, the absolute value being the errno value itself.
In your case 38 is ENOSYS: Function not implemented.
But what syscall function are you calling? Let's see, the function number is stored into rax (eax in 32-bit code) before issuing the syscall, and your program loads... nothing!
It looks like you lost one line in your copy/paste:
movq $1, %rax # use the write syscall
Your code is missing the first instruction from the sample code:
movq $1, %rax # use the write syscall
Without this code, it ends up executing an unexpected (and probably invalid) system call, based on whatever happened to be in %rax when main was called.

Problems with accessing command line arguments in linux from x86 asm

I have a basic asm program that checks whether a string is a digit. I was adding code to read command-line arguments, but it keeps segfaulting.
If what I have read is right, this should get the number of arguments passed to the program, which should be stored in 0(%ebp). What am I doing wrong?
The entirety of the code can be found here: http://pastebin.com/kGV2Mxx4
The problem is in the first 3-5 lines of _start.
Looking at lscpu's output, I have an i686 CPU, although it says it can operate in 32-bit and 64-bit modes. I am running 32-bit Linux (Arch Linux x86).
I fixed the issue. I did two pops: one to skip the program's name, the next to get the first argument. The updated code can be found here: http://pastebin.com/xewyeHYf
Can someone please tell me why I could not just do the following:
pushl 8(%ebp)
or
movl 8(%ebp), %eax
Here is a little tutorial I wrote on the subject:
NASM - Linux Getting command line parameters
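You could write this (note that %ebp is not set up at process entry, so copy %esp into it first):
_start:
movl %esp, %ebp # %ebp now points at argc
b1: movl 0(%ebp), %eax
cmpl $1, %eax
je load_msg
b2: pushl 8(%ebp)
b4: call check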
To understand why your previous attempts didn't work, draw stack diagrams.
Compile a small C program that does something like what you want to do, and compile it to assembly language to find out exactly how to access arguments. The x86_32 code doesn't look at all like any of the above, BTW:
int main(int argc, char *argv[])
{
return argv[1][0];
}
gives (yes, some is superfluous stack bookkeeping, but anyway):
.file "tst.c"
.text
.globl main
.type main, @function
main:
.LFB0:
.cfi_startproc
pushl %ebp
.cfi_def_cfa_offset 8
.cfi_offset 5, -8
movl %esp, %ebp
.cfi_def_cfa_register 5
movl 12(%ebp), %eax
addl $4, %eax
movl (%eax), %eax
movzbl (%eax), %eax
movsbl %al, %eax
popl %ebp
.cfi_restore 5
.cfi_def_cfa 4, 4
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (GNU) 4.7.2 20121109 (Red Hat 4.7.2-8)"
.section .note.GNU-stack,"",@progbits

ASM call Printf

movl %ebx, %esi
movl $.LC1, %edi
movl $0, %eax
call printf
I use the asm code above to print what is in the EBX register. When I use
movl $1,%eax
int $0x80
and then echo $?, I get the correct answer, but the first version gives a segmentation fault. I am using the GNU Assembler and AT&T syntax. How can I fix this problem?
Judging by the code, you are probably in 64-bit mode (please confirm), in which case pointers are 64 bits in size. In a position-dependent executable on Linux, movl $.LC1, %edi is safe and is what compilers use, but to make your code position-independent and able to handle symbol addresses outside the low 32 bits, you can use leaq .LC1(%rip), %rdi.
Furthermore, make sure that:
you preserve the value of rbx in your function
the stack pointer is aligned as required
This code works for me in 64 bit:
.globl main
main:
push %rbx
movl $42, %ebx
movl %ebx, %esi
leaq .LC1(%rip), %rdi
movl $0, %eax
call printf
xor %eax, %eax
pop %rbx
ret
.data
.LC1: .string "%d\n"
Edit: As Jester noted, this answer only applies to x86 (32 bits) asm whereas the sample provided is more likely for x86-64.
That's because printf has a variable number of arguments. The printf call doesn't restore the stack for you, you need to do it yourself.
In your example, you'd need to write (32 bits assembly):
push %ebx
push $.LC1
call printf
add $8, %esp # 8 = 2 arguments of 4 bytes each

Segfault running cmp 'A', %al‽

For my own sick pleasure, I'm writing a small program in x86_64 assembly for Linux. However, I've encountered a segfault that makes absolutely no sense to me, in an instruction comparing an immediate operand to a register. What gives?
Here's the code leading up to the crash:
_start:
sub $8, %rsp
mov %rsp, %rbx
lea le_string(%rip), %rsi
mov %rsi, %rdi
add $8, %rdi
mov $26, %cl
mov (%rsi), %al
cmp 'A', %al /* This line segfaults */
/* snip code that never runs */
le_string:
.ascii "YrFgevat"
I'm assembling with gcc -nostdlib, which is calling the GNU assembler.
Dumping the registers after the crash reveals:
%rsi contains the expected pointer to the string
%al contains the expected first character in the string
%rip points to an instruction that doesn't touch memory
Please ignore the lack of normal calling conventions—I'm not calling out to anything besides the syscall interface, and this crashes before it's even gotten that far!
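'A' is being interpreted as an address after all. In AT&T syntax an operand without a $ prefix is a memory reference, so cmp 'A', %al reads the byte at absolute address 0x41, which is not mapped. If you want to use it as a constant, you need to write: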
cmp $'A', %al

why gcc 4.x default reserve 8 bytes for stack on linux when calling a method?

As a beginner in asm, I am reading the code generated by gcc -S to learn.
Why does gcc 4.x reserve 8 bytes of stack by default when calling a function?
func18 is an empty function with no return value, no parameters, and no local variables.
I can't figure out why 8 bytes are reserved here (no forum or site I found explains the reason; people seem to take it for granted).
Is it for the %ebp that was just pushed? Or for the return type? Many thanks!
.globl _func18
_func18:
pushl %ebp
movl %esp, %ebp
subl $8, %esp
.text
Some instructions require certain data types to be aligned to as much as a 16-byte boundary (in particular, the SSE data type __m128). To meet this requirement, gcc ensures that the stack is initially 16-byte aligned, and allocates stack space in multiples of 16 bytes. If only a 4-byte return address and 4-byte frame pointer need to be pushed, 8 additional bytes are needed to keep the stack aligned to a 16-byte boundary. However, if gcc determines that the additional alignment is unnecessary (i.e. the fancy data types are not used and no external functions are called), then it may omit any additional instructions used to align the stack. The analysis necessary to determine this may require certain optimization passes to be performed.
See also the gcc documentation for the option -mpreferred-stack-boundary=num.
As richard mentioned above, it all comes down to optimization, as shown below.
But I still have no idea why reserving 8 bytes is something that gets optimized away.
Original C:
void func18() {}
int main() {return 0;}
Compiled without any optimization flag:
.text
.globl _func18
_func18:
pushl %ebp
movl %esp, %ebp
subl $8, %esp
leave
ret
.globl _main
_main:
pushl %ebp
movl %esp, %ebp
subl $8, %esp
movl $0, %eax
leave
ret
.subsections_via_symbols
With the -Os optimization flag, the stack reservation is gone:
.text
.globl _func18
_func18:
pushl %ebp
movl %esp, %ebp
leave
ret
.globl _main
_main:
pushl %ebp
xorl %eax, %eax
movl %esp, %ebp
leave
ret
.subsections_via_symbols
Easy way to find out: have your empty function call another function with one parameter. If the parameter is stored directly to the stack (no push), then that's what the extra space is for.
