Why didn't I get a segmentation fault?

I'm new to assembly programming and experimenting with simple examples and gdb. Here is the program I wrote:
1.asm
section .text
global _start
extern _print_func
_start:
push str
movzx rdx, byte [str_len]
push dx ; <--- typo here, should be rdx
call _print_func
mov rax, 60
syscall
section .data
str: db 'Some data',0x0A,0x0D
str_len: db $ - str
2.asm
section .text
global _print_func
_print_func:
pop rbx
pop rdx
pop rsi
mov rax, 0x01
mov rdi, 0x01
syscall
push rbx
ret
section .data
str: db 'Some string',0x0A,0x0D
str_len: db $ - str
After compiling, linking (with ld) and running the program, it printed nothing. So I examined the contents of the registers just before the syscall was made.
(gdb) info registers
rax 0x1 1
rbx 0x4000c5 4194501
rcx 0x0 0
rdx 0x6000e4000b 412331802635 ; <-- obviously wrong
rsi 0x10000 65536
rdi 0x1 1
rbp 0x0 0x0
rsp 0x7fffffffdcc6 0x7fffffffdcc6
So the syscall should try to read 412331802635 bytes starting at 0x10000, which I thought should have caused a segmentation fault, since the program is not allowed to access all those bytes.
But it silently printed nothing. Why? Why wasn't a segmentation fault raised? Was that some sort of undefined behavior? I'm using Ubuntu 16.04 LTS on an Intel Core i5.

sys_write does not raise a segfault; when the buffer is not readable it just returns the error code -EFAULT (-14). You should see that in rax after the syscall finishes. See also man 2 write.
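As a minimal sketch of checking that return value (x86-64 Linux, NASM syntax): the file name, the local labels, and the choice of exit status are illustrative assumptions, and the bad pointer 0x10000 simply mirrors the rsi value from the question. It also assumes that nothing is mapped at 0x10000 and that stdout is a terminal, pipe or regular file.

```nasm
; efault.asm (hypothetical file name)
; build: nasm -f elf64 efault.asm && ld efault.o -o efault
section .text
global _start
_start:
    mov     eax, 1              ; __NR_write on x86-64
    mov     edi, 1              ; fd 1 = stdout
    mov     esi, 0x10000        ; deliberately bad buffer address, as in the question
    mov     edx, 16             ; length
    syscall                     ; on failure the kernel returns a negative errno in rax

    cmp     rax, -14            ; -EFAULT
    jne     .other
    mov     edi, 14             ; report "write failed with EFAULT" as exit status 14
    jmp     .exit
.other:
    xor     edi, edi            ; anything else: exit status 0
.exit:
    mov     eax, 60             ; __NR_exit
    syscall
```

Under those assumptions, running ./efault and then echo $? should show 14, and no signal is delivered to the process.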

Related

I'm getting a segmentation fault in my assembly program [duplicate]

The tutorial I am following is for x86 and was written using 32-bit assembly; I'm trying to follow along while learning x64 assembly in the process. This has been going very well up until this lesson, where I have the following simple program which tries to modify a single character in a string. It compiles fine but segfaults when run.
section .text
global _start ; Declare global entry point for ld
_start:
jmp short message ; Jump to where our message is so we can do a call to push the address onto the stack
code:
xor rax, rax ; Clean up the registers
xor rbx, rbx
xor rcx, rcx
xor rdx, rdx
; Try to change the N to a space
pop rsi ; Get address from stack
mov al, 0x20 ; Load 0x20 into RAX
mov [rsi], al; Why segfault?
xor rax, rax; Clear again
; write(rdi, rsi, rdx) = write(file_descriptor, buffer, length)
mov al, 0x01 ; write the command for 64bit Syscall Write (0x01) into the lower 8 bits of RAX
mov rdi, rax ; First Parameter, RDI = 0x01 which is STDOUT, we move rax to ensure the upper 56 bits of RDI are zero
;pop rsi ; Second Parameter, RSI = Popped address of message from stack
mov dl, 25 ; Third Parameter, RDX = Length of message
syscall ; Call Write
; exit(rdi) = exit(return value)
xor rax, rax ; write returns # of bytes written in rax, need to clean it up again
add rax, 0x3C ; 64bit syscall exit is 0x3C
xor rdi, rdi ; Return value is in rdi (First parameter), zero it to return 0
syscall ; Call Exit
message:
call code ; Pushes the address of the string onto the stack
db 'AAAABBBNAAAAAAAABBBBBBBB',0x0A
The culprit is this line:
mov [rsi], al; Why segfault?
If I comment it out, then the program runs fine, outputting the message 'AAAABBBNAAAAAAAABBBBBBBB', why can't I modify the string?
The author's code is the following:
global _start
_start:
jmp short ender
starter:
pop ebx ;get the address of the string
xor eax, eax
mov al, 0x20
mov [ebx+7], al ;put a NULL where the N is in the string
mov al, 4 ;syscall write
mov bl, 1 ;stdout is 1
pop ecx ;get the address of the string from the stack
mov dl, 25 ;length of the string
int 0x80
xor eax, eax
mov al, 1 ;exit the shellcode
xor ebx,ebx
int 0x80
ender:
call starter
db 'AAAABBBNAAAAAAAABBBBBBBB',0x0A
And I've compiled that using:
nasm -f elf <infile> -o <outfile>
ld -m elf_i386 <infile> -o <outfile>
But even that causes a segfault. Images on the page show it working properly and changing the N into a space, but I seem to be stuck in segfault land :( Google isn't really being helpful in this case, so I turn to you, Stack Overflow. Any pointers (no pun intended!) would be appreciated.

I would assume it's because you're trying to write to data that is in the .text section. Usually you're not allowed to write to the code segment, for security reasons. Modifiable data should be in the .data section. (Or .bss if zero-initialized.)
For actual shellcode, where you don't want to use a separate section, see Segfault when writing to string allocated by db [assembly] for alternate workarounds.
Also I would never suggest using the side effects of call pushing the address after it to the stack to get a pointer to data following it, except for shellcode.
This is a common trick in shellcode (which must be position-independent); 32-bit mode needs a call to get EIP somehow. The call must have a backwards displacement to avoid 00 bytes in the machine code, so putting the call somewhere that creates a "return" address you specifically want saves an add or lea.
Even in 64-bit code where RIP-relative addressing is possible, jmp / call / pop is about as compact as jumping over the string for a RIP-relative LEA with a negative displacement.
Outside of the shellcode / constrained-machine-code use case, it's a terrible idea and you should just lea reg, [rel buf] like a normal person with the data in .data and the code in .text. (Or read-only data in .rodata.) This way you're not trying to execute code next to data, or put data next to code.
(Code-injection vulnerabilities that allow shellcode already imply the existence of a page with write and exec permission, but normal processes from modern toolchains don't have any W+X pages unless you do something to make that happen. W^X is a good security feature for this reason, so normal toolchain security features / defaults must be defeated to test shellcode.)
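For the non-shellcode case, the advice above can be sketched as follows (x86-64 Linux, NASM syntax). The file name and the labels msg and msg_len are illustrative assumptions, not taken from the tutorial; the string itself is the one from the question.

```nasm
; modify.asm (hypothetical file name)
; build: nasm -f elf64 modify.asm && ld modify.o -o modify
section .data                       ; writable data lives here, not in .text
msg:     db 'AAAABBBNAAAAAAAABBBBBBBB', 0x0A
msg_len: equ $ - msg                ; assemble-time constant (25 bytes)

section .text
global _start
_start:
    lea     rsi, [rel msg]          ; RIP-relative address of the string
    mov     byte [rsi + 7], ' '     ; overwrite the 'N' (index 7) with a space

    mov     eax, 1                  ; __NR_write
    mov     edi, 1                  ; fd 1 = stdout
    mov     edx, msg_len            ; rsi already points at the buffer
    syscall

    mov     eax, 60                 ; __NR_exit
    xor     edi, edi                ; exit status 0
    syscall
```

Because the store targets a page that is mapped writable, it does not fault, and no jmp / call / pop trick is needed to find the string.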

NASM basic input-output program crashes

Following this thread, "How do i read single character input from keyboard using nasm (assembly) under ubuntu?", I'm trying to compile a program that echoes the input in NASM.
I've made following files:
my_load2.asm:
%include "testio.inc"
global _start
section .text
_start: mov eax, 0
call canonical_off
call canonical_on
testio.inc:
termios: times 36 db 0
stdin: equ 0
ICANON: equ 1<<1
ECHO: equ 1<<3
canonical_off:
call read_stdin_termios
; clear canonical bit in local mode flags
push rax
mov eax, ICANON
not eax
and [termios+12], eax
pop rax
call write_stdin_termios
ret
echo_off:
call read_stdin_termios
; clear echo bit in local mode flags
push rax
mov eax, ECHO
not eax
and [termios+12], eax
pop rax
call write_stdin_termios
ret
canonical_on:
call read_stdin_termios
; set canonical bit in local mode flags
or dword [termios+12], ICANON
call write_stdin_termios
ret
echo_on:
call read_stdin_termios
; set echo bit in local mode flags
or dword [termios+12], ECHO
call write_stdin_termios
ret
read_stdin_termios:
push rax
push rbx
push rcx
push rdx
mov eax, 36h
mov ebx, stdin
mov ecx, 5401h
mov edx, termios
int 80h
pop rdx
pop rcx
pop rbx
pop rax
ret
write_stdin_termios:
push rax
push rbx
push rcx
push rdx
mov eax, 36h
mov ebx, stdin
mov ecx, 5402h
mov edx, termios
int 80h
pop rdx
pop rcx
pop rbx
pop rax
ret
Then I run:
[root@localhost asm]# nasm -f elf64 my_load2.asm
[root@localhost asm]# ld -m elf_x86_64 my_load2.o -o my_load2
When I try to run it I get:
[root@localhost asm]# ./my_load2
Segmentation fault
Debugger says:
(gdb) run
Starting program: /root/asm/my_load2
Program received signal SIGSEGV, Segmentation fault.
0x00000000004000b1 in canonical_off ()
Can someone explain why is it crashing without on "import" step?
Also, I am running RHEL in Virtualbox under Win7 64 bit. Can this cause problems with compilation?

Firstly, let's address the issue of not exiting, as mentioned by Daniel. Let's comment out the two call instructions, so the program essentially does nothing:
%include "testio.inc"
global _start
section .text
_start: mov eax, 0
;call canonical_off
;call canonical_on
When we run this:
$ ./my_load2
Segmentation fault (core dumped)
It still dies! Daniel is right - you need to exit:
%include "testio.inc"
global _start
section .text
_start: mov eax, 0
;call canonical_off
;call canonical_on
mov eax, 1
mov ebx, 0
int 0x80
This time:
$ ./my_load2
$
No segfault. So let's uncomment the calls:
%include "testio.inc"
global _start
section .text
_start: mov eax, 0
call canonical_off
call canonical_on
mov eax, 1
mov ebx, 0
int 0x80
And run it again:
$ ./my_load2
Segmentation fault (core dumped)
We get a segfault again. But at least we can be (fairly) sure that's coming from inside one of the called routines.
Running the executable with strace is also quite informative:
$ strace ./my_load2
execve("./my_load2", ["./my_load2"], [/* 57 vars */]) = 0
setsockopt(0, SOL_IP, 0x400080 /* IP_??? */, NULL, 0) = -1 EFAULT (Bad address)
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_ACCERR, si_addr=0x40008c} ---
+++ killed by SIGSEGV (core dumped) +++
Segmentation fault (core dumped)
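The setsockopt line is how strace renders the ioctl request made in read_stdin_termios: the program issues 32-bit system call 54 (36h, ioctl) through int 80h, but strace decodes the number with the 64-bit table, where 54 is setsockopt. strace tells us the call failed with EFAULT. The setsockopt(2) man page tells us this happens when: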
The address pointed to by optval is not in a valid part of the process address space.
Actually this is telling us that the block of memory into which the termios structure is written is read-only. Frank is correct; everything in the program - including the termios space, and all the code - is in the read-only .text section. You can see this with:
$ objdump -h my_load2.o
my_load2.o: file format elf64-x86-64
Sections:
Idx Name Size VMA LMA File off Algn
0 .text 000000cd 0000000000000000 0000000000000000 000001c0 2**4
CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
i.e. there's only one section, .text, and it's READONLY.
The line that actually causes the segfault, however, is this one:
and [termios+12], eax
because it also tries to write to the (read-only) termios memory.
The quickest way to fix this is to put the termios memory in the .data section, and everything else in the .text section:
section .data
termios: times 36 db 0
section .text
stdin: equ 0
ICANON: equ 1<<1
ECHO: equ 1<<3
canonical_off:
call read_stdin_termios
[...]
(stdin, ICANON, and ECHO can stay where they are: equ defines assemble-time constants that take up no memory in the output, so there is nothing to write to and the section they appear in doesn't matter.)
Having made these changes:
$ ./my_load2
$
The program runs and exits normally.
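As a variation on that fix, here is a hedged sketch of the same idea using the native 64-bit syscall interface instead of int 80h. This is a swapped-in technique, not what the answer above did. The helper name stdin_ioctl and the file name are assumptions made for illustration. The request numbers, the 36-byte buffer size and the +12 offset are taken from the question, and the buffer is placed in .bss, which is writable and zero-initialized.

```nasm
; termios64.asm (hypothetical file name)
; build: nasm -f elf64 termios64.asm && ld termios64.o -o termios64
TCGETS: equ 0x5401              ; ioctl request numbers, same values as in the question
TCSETS: equ 0x5402
ICANON: equ 1<<1

section .bss
termios: resb 36                ; writable, zero-initialized buffer

section .text
global _start
_start:
    call    canonical_off
    call    canonical_on
    mov     eax, 60             ; __NR_exit on x86-64
    xor     edi, edi            ; exit status 0
    syscall

canonical_off:
    mov     esi, TCGETS
    call    stdin_ioctl         ; read the current settings
    and     dword [rel termios + 12], ~ICANON   ; clear the canonical bit in the local mode flags
    mov     esi, TCSETS
    jmp     stdin_ioctl         ; tail call: its ret returns to our caller

canonical_on:
    mov     esi, TCGETS
    call    stdin_ioctl
    or      dword [rel termios + 12], ICANON    ; set the canonical bit again
    mov     esi, TCSETS
    jmp     stdin_ioctl

; stdin_ioctl (hypothetical helper): ioctl(0, request in esi, &termios)
stdin_ioctl:
    mov     eax, 16             ; __NR_ioctl on x86-64
    xor     edi, edi            ; fd 0 = stdin
    lea     rdx, [rel termios]
    syscall                     ; clobbers rax, rcx and r11
    ret
```

The sketch ignores the ioctl return value; if stdin is not a terminal the calls simply fail and the program still exits normally.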

Assembly segmentation fault after making a system call, at the end of my code

I was experimenting and have the following assembly code, which works very well, except that I get a "Segmentation fault (core dumped)" message right before my program ends:
GLOBAL _start
%define ___STDIN 0
%define ___STDOUT 1
%define ___SYSCALL_WRITE 0x04
segment .data
segment .rodata
L1 db "hello World", 10, 0
segment .bss
segment .text
_start:
mov eax, ___SYSCALL_WRITE
mov ebx, ___STDOUT
mov ecx, L1
mov edx, 13
int 0x80
It doesn't matter whether or not I have ret at the end; I still get the message.
What's the problem?
I'm using x86 and nasm.

You can't ret from _start; it isn't a function and there's no return address on the stack. The stack pointer points at argc on process entry.
As n.m. said in the comments, the issue is that you aren't exiting the program, so execution runs off into garbage code and you get a segfault.
What you need is:
;; Linux 32-bit x86
%define ___SYSCALL_EXIT 1
; ... at the end of _start:
mov eax, ___SYSCALL_EXIT
mov ebx, 0
int 0x80
The above is 32-bit code. In 64-bit code you want mov eax, 231 (exit_group) / syscall, with the exit status in EDI. For example:
;; Linux x86-64
xor edi, edi ; or mov edi, eax if you have a ret val in EAX
mov eax, 231 ; __NR_exit_group
syscall
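Putting the 32-bit pieces together, a complete version of the question's program might look like the sketch below. The file name and the L1_len constant are assumptions added for illustration; computing the length at assemble time replaces the hard-coded 13, which also wrote the trailing 0 byte. It assumes a kernel that can run 32-bit executables.

```nasm
; hello32.asm (hypothetical file name)
; build: nasm -f elf32 hello32.asm && ld -m elf_i386 hello32.o -o hello32
section .rodata
L1:     db "hello World", 10
L1_len: equ $ - L1              ; 12 bytes, computed by the assembler

section .text
global _start
_start:
    mov     eax, 4              ; sys_write in the 32-bit ABI
    mov     ebx, 1              ; fd 1 = stdout
    mov     ecx, L1             ; buffer
    mov     edx, L1_len         ; length
    int     0x80

    mov     eax, 1              ; sys_exit in the 32-bit ABI
    xor     ebx, ebx            ; exit status 0
    int     0x80                ; does not return, so nothing runs off the end
```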

NASM x86_64 having trouble writing command line arguments, returning -14 in rax

I am using elf64 compilation and trying to take a parameter and write it out to the console.
I am calling the function as ./test wooop
After stepping through with gdb there seems to be no problem, everything is set up ok:
rax: 0x4
rbx: 0x1
rcx: pointing to string, x/6cb $rcx gives 'w' 'o' 'o' 'o' 'p' 0x0
rdx: 0x5 <---correctly determining length
after the int 80h rax contains -14 and nothing is printed to the console.
If I define a string in .data, it just works. gdb shows the value of $rcx in the same way.
Any ideas? here is my full source
%define LF 0Ah
%define stdout 1
%define sys_exit 1
%define sys_write 4
global _start
section .data
usagemsg: db "test {string}",LF,0
testmsg: db "wooop",0
section .text
_start:
pop rcx ;this is argc
cmp rcx, 2 ;one argument
jne usage
pop rcx
pop rcx ; argument now in rcx
test rcx,rcx
jz usage
;mov rcx, testmsg ;<-----uncomment this to print ok!
call print
jmp exit
usage:
mov rcx, usagemsg
call print
jmp exit
calclen:
push rdi
mov rdi, rcx
push rcx
xor rcx,rcx
not rcx
xor al,al
cld
repne scasb
not rcx
lea rdx, [rcx-1]
pop rcx
pop rdi
ret
print:
push rax
push rbx
push rdx
call calclen
mov rax, sys_write
mov rbx, stdout
int 80h
pop rdx
pop rbx
pop rax
ret
exit:
mov rax, sys_exit
mov rbx, 0
int 80h
Thanks
EDIT: After changing how I make my syscalls as below it works fine. Thanks all for your help!
sys_write is now 1
sys_exit is now 60
stdout now goes in rdi, not rbx
the string to write is now set in rsi, not rcx
int 80h is replaced by syscall

I'm still running 32-bit hardware, so this is a wild asmed guess! As you probably know, 64-bit system call numbers are completely different, and "syscall" is used instead of int 80h. However int 80h and 32-bit system call numbers can still be used, with 64-bit registers truncated to 32-bit. Your tests indicate that this works with addresses in .data, but with a "stack address", it returns -14 (-EFAULT - bad address). The only thing I can think of is that truncating rcx to ecx results in a "bad address" if it's on the stack. I don't know where the stack is in 64-bit code. Does this make sense?
I'd try it with "proper" 64-bit system call numbers and registers and "syscall", and see if that helps.
Best,
Frank

As you said, you're using ELF64 as the target of the compilation. This is, unfortunately, where the problem starts. The "old" system call interface on Linux, int 80h, is intended for 32-bit tasks. A 64-bit task can still invoke it on kernels built with 32-bit emulation, but the kernel then uses the 32-bit system call numbers and only the low 32 bits of each register, so a pointer to the stack (which normally lies above 4 GiB) is truncated and the call fails with -EFAULT. You could simply assemble your source as ELF32, but then you lose the advantages of running tasks in 64-bit mode, namely the extra registers and 64-bit operations.
In order to make system calls in 64-bit tasks, the "new" system call interface must be used. The system call itself is done with the syscall instruction. The kernel destroys registers rcx and r11. The number of the system call is specified in the register rax, while the arguments of the call are passed in rdi, rsi, rdx, r10, r8 and r9. Keep in mind that the system call numbers are different from the ones in 32-bit mode. You can find them in unistd_64.h, which is usually in /usr/include/asm or wherever your distribution stores it.
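A minimal sketch of the question's program rewritten for that interface (x86-64 Linux, NASM syntax). The file name and local labels are illustrative assumptions, and the length is found with a simple byte loop rather than the original repne scasb routine.

```nasm
; args64.asm (hypothetical file name)
; build: nasm -f elf64 args64.asm && ld args64.o -o args64
section .data
usagemsg: db "test {string}", 0x0A, 0

section .text
global _start
_start:
    cmp     qword [rsp], 2          ; argc is at [rsp] on process entry
    jne     .usage
    mov     rsi, [rsp + 16]         ; argv[1] ([rsp + 8] is argv[0])
    jmp     .strlen
.usage:
    lea     rsi, [rel usagemsg]
.strlen:
    xor     edx, edx                ; rdx = length so far
.next:
    cmp     byte [rsi + rdx], 0     ; stop at the terminating 0 byte
    je      .write
    inc     rdx
    jmp     .next
.write:
    mov     eax, 1                  ; __NR_write on x86-64
    mov     edi, 1                  ; fd 1 = stdout
    syscall                         ; rsi = buffer, rdx = length; clobbers rcx and r11

    mov     eax, 60                 ; __NR_exit on x86-64
    xor     edi, edi                ; exit status 0
    syscall
```

Since the full 64-bit value of rsi reaches the kernel, a pointer into the stack works just as well as one into .data.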
