How to compile shared file with .bss section using nasm - linux

I am trying to build a shared object from NASM source, using these commands:
nasm -f elf64 -o asm/asm.o asm/asm.asm
ld -shared -o asm/asm.so asm/asm.o -I/lib64/ld-linux-x86-64.so.2
After the second one I get an error:
ld: asm/asm.o: relocation R_X86_64_32S against `.bss' can not be used when making a shared object; recompile with -fPIC
ld: final link failed: Nonrepresentable section on output
I can't use nasm -fPIC instead of -f elf64 because it's an invalid option. What should I do if I need a .bss section? Maybe I can build it a different way?
Here is my assembly code:
DEFAULT rel
%include "asm/python.inc"
GLOBAL PyInit_asm:function
SECTION .rodata
l_dubois_name db "dubois", 0
l_module_name db "asm", 0
SECTION .bss
digitSpace resb 100
digitSpacePos resb 8
SECTION .data
l_asm_methods:
ISTRUC PyMethodDef
at PyMethodDef.ml_name , dq l_dubois_name
at PyMethodDef.ml_meth , dq asm_dubois
at PyMethodDef.ml_flags , dq METH_NOARGS
at PyMethodDef.ml_doc , dq l_sayit_doc
IEND
NullMethodDef
l_asm_module: ;; struct PyModuleDef *
ISTRUC PyModuleDef
at PyModuleDef.m_base , PyModuleDef_HEAD_INIT
at PyModuleDef.m_name , dq l_module_name
at PyModuleDef.m_doc , dq NULL
at PyModuleDef.m_size , dq -1
at PyModuleDef.m_methods , dq l_asm_methods
at PyModuleDef.m_slots , dq NULL
at PyModuleDef.m_traverse , dq NULL
at PyModuleDef.m_clear , dq 0
at PyModuleDef.m_free , dq NULL
IEND
SECTION .text
asm_dubois:
push rbp
mov rbp, rsp
mov rax, [rbp+62]
mov rsp, rbp
pop rbp
call _printRAX
mov rax, 60
mov rdi, 0
syscall
ret
_printRAX:
mov rcx, digitSpace
mov rbx, 10
mov [rcx], rbx
inc rcx
mov [digitSpacePos], rcx
_printRAXLoop:
mov rdx, 0
mov rbx, 10
div rbx
push rax
add rdx, 48
mov rcx, [digitSpacePos]
mov [rcx], dl
inc rcx
mov [digitSpacePos], rcx
pop rax
cmp rax, 0
jne _printRAXLoop
_printRAXLoop2:
mov rcx, [digitSpacePos]
mov rax, 1
mov rdi, 1
mov rsi, rcx
mov rdx, 1
syscall
mov rcx, [digitSpacePos]
dec rcx
mov [digitSpacePos], rcx
cmp rcx, digitSpace
jge _printRAXLoop2
ret

The recompile with -fPIC error message is assuming that the asm / object file was created by a compiler. With hand-written asm you are the compiler and have to write position-independent code. (Or at least code that inefficiently uses 64-bit absolute addresses like your mov rcx, digitSpace; runtime fixups are supported for those relocations on GNU/Linux.)
Use lea r8, [digitSpace] (or any convenient reg, preferably outside the loop) and compare against that.
cmp rcx, digitSpace uses a static address as a 32-bit immediate sign-extended to 64-bit. This will require a R_X86_64_32S relocation: 64-bit address encoded as a 32-bit signed value. (Same as you'd get for [digitSpace + rdx] for example, that's another thing you can't do in PIC/PIE code)
Only mov allows a 64-bit immediate (which NASM uses by default when you write mov r64, symbol). Of course it would be better to use a RIP-relative LEA like lea rcx, [digitSpace]. (You used default rel, so that will be RIP-relative.)
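As a minimal sketch of the advice above (not the asker's full program): the label print_backwards is hypothetical, digitSpacePos is declared as one qword with resq, and the loop assumes digitSpacePos already points at the last byte to print. The address of digitSpace is formed once with a RIP-relative LEA, so the compare no longer needs an R_X86_64_32S relocation:

```nasm
default rel                        ; [symbol] operands become RIP-relative

section .bss
digitSpace:    resb 100
digitSpacePos: resq 1              ; holds a pointer into digitSpace

section .text
print_backwards:                   ; hypothetical name, stands in for _printRAXLoop2
    lea   r8, [digitSpace]         ; RIP-relative LEA instead of a 32-bit absolute immediate
.next:
    mov   rsi, [digitSpacePos]     ; address of the byte to print
    mov   eax, 1                   ; SYS_write on x86-64 Linux
    mov   edi, 1                   ; fd 1 = stdout
    mov   edx, 1                   ; one byte
    syscall                        ; clobbers rax, rcx and r11, but not r8
    mov   rcx, [digitSpacePos]
    dec   rcx
    mov   [digitSpacePos], rcx
    cmp   rcx, r8                  ; register compare, no relocation involved
    jae   .next                    ; unsigned compare, since these are addresses
    ret
```

The same lea replacement applies to mov rcx, digitSpace at the top of _printRAX; that one links as-is, but only via a runtime fixup.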
Almost exact duplicate of Assembler Error: Mach-O 64 bit does not support absolute 32 bit addresses (MacOS never allows using symbol addresses as 32-bit immediates so it's an assemble-time error, vs. on Linux only an error when you try to link into an ELF shared object instead of a non-PIE executable.)
Also related:
How to load address of function or label into register in GNU Assembler - RIP-relative LEA is the best if you can't use mov r32, imm32
Mach-O 64-bit format does not support 32-bit absolute addresses. NASM Accessing Array - various general stuff about using symbol addresses when disp32 and imm32 aren't allowed.
32-bit absolute addresses no longer allowed in x86-64 Linux? - Linux PIE executables have the same restrictions as shared objects.

Related

dlsym crash when called from assembler

I have a small program in assembler that loads an .so file using dlopen, and then tries to load a function pointer using dlsym. Calling dlopen seems to be fine but it crashes when I call dlsym.
SECTION .text
;default rel
EXTERN dlopen ; loads a dynamic library
EXTERN dlsym ; retrieves the address for a symbol in the dynamic library
; inputs:
; rdi: rdi the pointer to print
printHex:
sub rsp, 19 ; allocate space for the string 0x0123456789ABCDEF\n
mov BYTE [rsp + 0], '0'
mov BYTE [rsp + 1], 'x'
xor rcx, rcx ; int loop variable to 0
.LOOP1:
lea rsi, [rsp + rcx] ; rsi will be the offset where we will store the next hex character
mov rax, rdi
and rax, 0xf
sar rdi, 4 ; shift right 4 bits (divide by 16)
lea rdx, [hexLookUp + rax]
mov bl, [rdx]
mov BYTE [rsi +18], bl
dec rcx ; rcx--
cmp rcx, -16 ; while rcx > -16
jne .LOOP1
mov BYTE [rsp + 18], 10
; print
mov rax, 1 ; syscall: write
mov rdi, 1 ; stdout
mov rsi, rsp
mov rdx, 19
syscall
; release stack memory
add rsp, 19
ret
global _start ; "global" means that the symbol can be accessed in other modules. In order to refer to a global symbol from another module, you must use the "extern" keyword
_start:
; load the library
mov rdi, str_libX11so
mov rsi, 2; RTLD_NOW=2
call dlopen wrt ..plt
; PLT stands for Procedure Linkage Table:
; used to call external library functions whose address is not know at link time,
; so it must be resolved by the dynamic linker at run time
; more info: https://reverseengineering.stackexchange.com/questions/1992/what-is-plt-got
mov [ptr_libX11so], rax ; the previous function call returned the value in rax
mov rdi, rax
call printHex
; load the function
mov rdi, [str_libX11so]
mov rsi, fstr_XOpenDisplay
call dlsym wrt ..plt
mov [fptr_XOpenDisplay], rax
mov rdi, rax
call printHex
mov rax, 60 ; syscal: exit
mov rdi, 0 ; return code
syscall
hexLookUp: db "0123456789ABCDEF"
str_libX11so: db "libX11.so", 0
; X11 function names
fstr_XOpenDisplay: db "XOpenDisplay", 0
SECTION .data
ptr_libX11so: dq 0 ; ptr to the X11 library
; X11 function ptrs
fptr_XOpenDisplay: dq 0
I have tried to make the same program in C and it seems to work. So I must be doing something wrong.
extern void* dlopen(const char* name, int);
extern void* dlsym(void* restrict handle, const char* restrict name);
int main()
{
void* libX11so = dlopen("libX11.so", 2);
void (*XOpenDisplay)() = dlsym(libX11so, "XOpenDisplay");
}
I tried to disassemble the C version and compare, but I still can't figure out what the problem is.
An interesting thing I noticed is that the pointer returned by dlopen (which is different in each execution) is quite small in the asm version compared to the C version (e.g. 0x0000000001A932D vs 0x5555555592d0). But maybe that could be because I'm using the -no-pie flag for linking:
nasm -f elf64 -g -F dwarf minimal.asm && gcc -nostartfiles -no-pie minimal.o -ldl -o minimal && ./minimal
I just noticed my mistake:
; load the function
mov rdi, [str_libX11so]
should be:
; load the function
mov rdi, [ptr_libX11so]
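To spell out why that fixes it: mov rdi, [str_libX11so] loads the first 8 bytes of the string "libX11.so" and passes them as if they were a handle, whereas dlsym needs the value that dlopen returned. A sketch of the sequence with a NULL check added, assuming default rel and a hypothetical function load_xopendisplay that is called like a C function (hence the stack realignment before the calls):

```nasm
default rel
extern dlopen, dlsym

section .rodata
str_libX11so:      db "libX11.so", 0
fstr_XOpenDisplay: db "XOpenDisplay", 0

section .bss
ptr_libX11so:      resq 1           ; handle returned by dlopen

section .text
global load_xopendisplay            ; hypothetical name, not from the original post
load_xopendisplay:
    sub   rsp, 8                    ; rsp is 16n+8 on entry; realign to 16 for the calls
    lea   rdi, [str_libX11so]       ; arg 1: address of the library name
    mov   esi, 2                    ; arg 2: RTLD_NOW
    call  dlopen wrt ..plt
    test  rax, rax                  ; dlopen returns NULL on failure
    jz    .done
    mov   [ptr_libX11so], rax       ; remember the handle
    mov   rdi, rax                  ; arg 1: the handle itself, not string bytes
    lea   rsi, [fstr_XOpenDisplay]  ; arg 2: address of the symbol name
    call  dlsym wrt ..plt           ; rax = address of XOpenDisplay, or NULL
.done:
    add   rsp, 8
    ret
```

Using lea for the string addresses also keeps the code linkable as PIE, so -no-pie would not be needed for these instructions.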

linking functions in shared object - ld

I'm trying to write a library (shared object) in assembly. I'm assembling with nasm and linking with ld. I have two ASM files containing different symbols. I'm trying to call a symbol defined in the first file from the second one, but ld keeps throwing an error: relocation R_X86_64_PC32 against symbol 'strchr' can not be used when making a shared object; recompile with -fPIC.
The first file contains:
BITS 64
section .text
global strchr
strchr:
push rbp
mov rbp, rsp
xor rax, rax
looper:
cmp sil, byte [rdi]
je saveptr
cmp byte [rdi], 0x0
je endloop
inc rdi
jmp looper
saveptr:
mov rax, rdi
endloop:
mov rsp, rbp
pop rbp
ret
The second file contains:
BITS 64
section .text
extern strchr
global strspn
strspn:
push rbp
mov rbp, rsp
xor rax, rax
xor r11, r11
mov rax, rdi
looper:
cmp byte [rsi], 0x0
je endloop
push rax
mov sil, byte [rsi]
call strchr
cmp rax, 0x0
jne increase
inc rsi
mov rax, rdi
jmp looper
increase:
inc r11
inc rax
jmp looper
endloop:
mov rax, r11
mov rsp, rbp
pop rbp
ret
I'm compiling the library through this process:
nasm -f elf64 first_file.asm -o first_file.o
nasm -f elf64 second_file.asm -o second_file.o
ld -shared first_file.o second_file.o -o mylib.so
How can I link the first (compiled) ASM file so that I can call the symbol from the second one?
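One commonly used fix, sketched here under the assumption that strchr should stay a global symbol of the library: route the call through the PLT. A global symbol in a shared object can be overridden at run time, so ld refuses a direct PC-relative call to it. In NASM the PLT form is spelled wrt ..plt. The wrapper name call_strchr below is hypothetical and only shows the call site:

```nasm
BITS 64
default rel

section .text
extern strchr
global call_strchr                 ; hypothetical wrapper, illustrates the call only

call_strchr:
    sub   rsp, 8                   ; keep the stack 16-byte aligned across the call
    call  strchr wrt ..plt         ; PLT-relative call instead of a direct PC32 call
    add   rsp, 8
    ret
```

In second_file.asm this would mean changing call strchr to call strchr wrt ..plt; the build commands stay the same.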

NASM, x86_64, Linux, "hello world" program: when linking, it says "relocation truncated to fit"

[section .data]
strHello db "Hello World"
STRLEN equ $-strHello
MessageLength equ 9
Message db "hi!!!! "
[section .text]
global main
main:
mov edx,STRLEN;
mov ecx,strHello;
mov ebx,1
mov eax,4
int 0x80
call DispStr
mov ebx,0
mov eax,1
int 0x80
DispStr:
mov ax,MessageLength
mov dh,0
mul dh
add ax,Message
mov bp,ax
mov ax,ds
mov es,ax
mov cx,MessageLength
mov ax,01301h
mov bx,0007h
mov dl,0
int 10h
ret
Compile and run:
$ nasm -f elf64 helloworld.asm -o helloworld.o
$ gcc -s -o helloworld helloworld.o
helloworld.o: In function `DispStr':
helloworld.asm:(.text+0x31): relocation truncated to fit: R_X86_64_16 against `.data'
collect2: ld return 1
This exact error happens because at:
add ax,Message
ax is only 16-bit wide, but Message is a 64-bit wide address, so it won't fit during relocation.
I have explained this error in detail at: https://stackoverflow.com/a/32639540/895245
The solution in this case is to use a linker script as mentioned at: Using .org directive with data in .data section: In connection with ld
This repository contains working examples of boot sectors and BIOS: https://github.com/cirosantilli/x86-bare-metal-examples/tree/d217b180be4220a0b4a453f31275d38e697a99e0
Since you're in 64-bit mode, you won't be able to use BIOS functions (i.e. the int 10h instruction). Even if you could, BIOS uses a different addressing mechanism, so attempting to use the address of Message wouldn't work anyway.
Also, wouldn't the first 3 lines of the DispStr function zero out ax? (since you're multiplying by dh, which was just set to zero)
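For comparison, a minimal sketch of a 64-bit Linux "Hello World" that avoids both problems: the address goes into a full 64-bit register, and output goes through the write system call instead of BIOS int 10h. It assumes a standalone _start entry point linked with ld rather than the main plus gcc setup in the question:

```nasm
; assumed build: nasm -f elf64 hello.asm -o hello.o && ld -o hello hello.o
default rel
global _start

section .rodata
strHello: db "Hello World", 10
STRLEN    equ $ - strHello

section .text
_start:
    mov   eax, 1                   ; SYS_write on x86-64 Linux
    mov   edi, 1                   ; fd 1 = stdout
    lea   rsi, [strHello]          ; 64-bit address in a 64-bit register: nothing to truncate
    mov   edx, STRLEN              ; byte count
    syscall
    mov   eax, 60                  ; SYS_exit
    xor   edi, edi                 ; exit status 0
    syscall
```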

NASM x86_64 having trouble writing command line arguments, returning -14 in rax

I am using elf64 compilation and trying to take a parameter and write it out to the console.
I am calling the function as ./test wooop
After stepping through with gdb there seems to be no problem, everything is set up ok:
rax: 0x4
rbx: 0x1
rcx: pointing to string, x/6cb $rcx gives 'w' 'o' 'o' 'o' 'p' 0x0
rdx: 0x5 <---correctly determining length
After the int 80h, rax contains -14 and nothing is printed to the console.
If I define a string in .data, it just works. gdb shows the value of $rcx in the same way.
Any ideas? Here is my full source:
%define LF 0Ah
%define stdout 1
%define sys_exit 1
%define sys_write 4
global _start
section .data
usagemsg: db "test {string}",LF,0
testmsg: db "wooop",0
section .text
_start:
pop rcx ;this is argc
cmp rcx, 2 ;one argument
jne usage
pop rcx
pop rcx ; argument now in rcx
test rcx,rcx
jz usage
;mov rcx, testmsg ;<-----uncomment this to print ok!
call print
jmp exit
usage:
mov rcx, usagemsg
call print
jmp exit
calclen:
push rdi
mov rdi, rcx
push rcx
xor rcx,rcx
not rcx
xor al,al
cld
repne scasb
not rcx
lea rdx, [rcx-1]
pop rcx
pop rdi
ret
print:
push rax
push rbx
push rdx
call calclen
mov rax, sys_write
mov rbx, stdout
int 80h
pop rdx
pop rbx
pop rax
ret
exit:
mov rax, sys_exit
mov rbx, 0
int 80h
Thanks
EDIT: After changing how I make my syscalls as below it works fine. Thanks all for your help!
sys_write is now 1
sys_exit is now 60
stdout now goes in rdi, not rbx
the string to write is now set in rsi, not rcx
int 80h is replaced by syscall
I'm still running 32-bit hardware, so this is a wild asmed guess! As you probably know, 64-bit system call numbers are completely different, and "syscall" is used instead of int 80h. However int 80h and 32-bit system call numbers can still be used, with 64-bit registers truncated to 32-bit. Your tests indicate that this works with addresses in .data, but with a "stack address", it returns -14 (-EFAULT - bad address). The only thing I can think of is that truncating rcx to ecx results in a "bad address" if it's on the stack. I don't know where the stack is in 64-bit code. Does this make sense?
I'd try it with "proper" 64-bit system call numbers and registers and "syscall", and see if that helps.
Best,
Frank
As you said, you're using ELF64 as the target of the compilation. This is, unfortunately, your first mistake. The "old" system call interface on Linux, e.g. int 80h, is only meant for 32-bit tasks: in a 64-bit process it still uses the 32-bit ABI and truncates arguments such as pointers to 32 bits, which is why a stack address fails. Obviously, you could simply assemble your source as ELF32, but then you're going to lose all the advantages of running tasks in 64-bit mode, namely the extra registers and 64-bit operations.
In order to make system calls in 64-bit tasks, the "new" system call interface must be used. The system call itself is done with the syscall instruction. The kernel destroys registers rcx and r11. The system call number is specified in the register rax, while the arguments of the call are passed in rdi, rsi, rdx, r10, r8 and r9. Keep in mind that the syscall numbers are different from the ones in 32-bit mode. You can find them in unistd_64.h, which is usually in /usr/include/asm or wherever your distribution stores it.
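A minimal sketch of the program using that interface, assuming the standard Linux process entry layout (argc at [rsp], argv[0] at [rsp+8], argv[1] at [rsp+16]). The label names are illustrative and the usage message is left out for brevity:

```nasm
; assumed build: nasm -f elf64 test.asm -o test.o && ld -o test test.o
global _start

section .text
_start:
    cmp   qword [rsp], 2           ; argc: program name plus one argument
    jne   .exit
    mov   rsi, [rsp + 16]          ; argv[1]: full 64-bit pointer into the stack area
    xor   edx, edx                 ; rdx = length counter
.len:
    cmp   byte [rsi + rdx], 0      ; scan for the terminating NUL
    je    .write
    inc   rdx
    jmp   .len
.write:
    mov   eax, 1                   ; SYS_write (64-bit number, not 4)
    mov   edi, 1                   ; stdout goes in rdi, not rbx
    syscall                        ; rsi = buffer, rdx = length; clobbers rcx and r11
.exit:
    mov   eax, 60                  ; SYS_exit (64-bit number, not 1)
    xor   edi, edi                 ; exit status 0
    syscall
```

Because the pointer stays in a 64-bit register all the way to the kernel, the -EFAULT from the truncated stack address should no longer occur.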

x64 bit assembly

I started assembly (nasm) programming not too long ago. I have now written an assembly implementation of a C function that prints an integer. I got it working using the extended registers (eax, ebx, ..), but when I write it with the x64 registers (rax, rbx, ..) my implementation fails. Do any of you see what I missed?
main.c:
#include <stdio.h>
extern void printnum(int i);
int main(void)
{
printnum(8);
printnum(256);
return 0;
}
32 bit version:
; main.c: http://pastebin.com/f6wEvwTq
; nasm -f elf32 -o printnum.o printnum.asm
; gcc -o printnum printnum.o main.c -m32
section .data
_nl db 0x0A
nlLen equ $ - _nl
section .text
global printnum
printnum:
enter 0,0
mov eax, [ebp+8]
xor ebx, ebx
xor ecx, ecx
xor edx, edx
push ebx
mov ebx, 10
startLoop:
idiv ebx
add edx, 0x30
push dx ; With an odd number of digits this will screw up the stack, but that's ok
; because we'll reset the stack at the end of this function anyway.
; Needs fixing though.
inc ecx
xor edx, edx
cmp eax, 0
jne startLoop
push ecx
imul ecx, 2
mov edx, ecx
mov eax, 4 ; Prints the string (from stack) to screen
mov ebx, 1
mov ecx, esp
add ecx, 4
int 80h
mov eax, 4 ; Prints a new line
mov ebx, 1
mov ecx, _nl
mov edx, nlLen
int 80h
pop eax ; returns the amount of used characters
leave
ret
x64 version:
; main.c : http://pastebin.com/f6wEvwTq
; nasm -f elf64 -o object/printnum.o printnum.asm
; gcc -o bin/printnum object/printnum.o main.c -m64
section .data
_nl db 0x0A
nlLen equ $ - _nl
section .text
global printnum
printnum:
enter 0, 0
mov rax, [rbp + 8] ; Get the function args from the stack
xor rbx, rbx
xor rcx, rcx
xor rdx, rdx
push rbx ; The 0 byte of the string
mov rbx, 10 ; Divisor
startLoop:
idiv rbx ; modulo is in rdx
add rdx, 0x30
push dx
inc rcx ; increase the loop variable
xor rdx, rdx ; resetting the modulo
cmp rax, 0
jne startLoop
push rcx ; push the counter on the stack
imul rcx, 2
mov rdx, rcx ; string length
mov rax, 4
mov rbx, 1
mov rcx, rsp ; the string
add rcx, 4
int 0x80
mov rax, 4
mov rbx, 1
mov rcx, _nl
mov rdx, nlLen
int 0x80
pop rax
leave
ret ; return to the C routine
Thanks in advance!
I think your problem is that you're trying to use the 32-bit calling conventions in 64-bit mode. That won't fly, not if you're calling these assembly routines from C. The 64-bit calling convention is documented here: http://www.x86-64.org/documentation/abi.pdf
Also, don't open-code system calls. Call the wrappers in the C library. That way errno gets set properly, you take advantage of sysenter/syscall, you don't have to deal with the differences between the normal calling convention and the system-call argument convention, and you're insulated from certain low-level ABI issues. (Another of your problems is that write is system call number 1, not 4, for Linux/x86-64.)
Editorial aside: There are two, and only two, reasons to write anything in assembly nowadays:
You are writing one of the very few remaining bits of deep magic that cannot be written in C alone (a good example is the guts of libffi)
You are hand-optimizing an inner-loop subroutine that has been measured to be performance-critical and the C compiler doesn't do a good enough job on.
Otherwise just write whatever it is in C. Your successors will thank you.
EDIT: checked system call numbers.
I'm not sure if this answer is related to the problem you're seeing (since you didn't specify anything about what the failure is), but 64-bit code has a different calling convention than 32-bit code does. Both of the major 64-bit Intel ABIs (Windows & Linux/BSD/Mac OS) pass the first several function parameters in registers and not on the stack. Your program appears to still be expecting them on the stack, which isn't the normal way to go about it.
Edit: Now that I see there is a C main() routine that calls your functions, my answer is exactly about the problem you're having.
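Putting both answers together, here is a sketch of printnum for the x86-64 System V convention. It assumes the argument is non-negative (it is treated as unsigned), takes it from edi rather than the stack, and calls the C library's write wrapper through the PLT instead of open-coding the system call. The buffer size and local label names are choices made for this sketch:

```nasm
; assumed build: nasm -f elf64 printnum.asm -o printnum.o && gcc -o printnum printnum.o main.c
default rel
extern write
global printnum

section .text
printnum:                          ; void printnum(int i): i arrives in edi
    sub   rsp, 24                  ; digit buffer; also realigns rsp to 16 bytes
    lea   rsi, [rsp + 23]          ; rsi -> last byte of the buffer
    mov   byte [rsi], 10           ; trailing newline
    mov   eax, edi                 ; value to convert
    mov   ecx, 10                  ; divisor
.digit:
    xor   edx, edx                 ; clear the high half of the dividend
    div   ecx                      ; eax = quotient, edx = remainder
    add   dl, '0'                  ; remainder -> ASCII digit
    dec   rsi
    mov   [rsi], dl                ; store digits from the end towards the start
    test  eax, eax
    jnz   .digit
    lea   rdx, [rsp + 24]
    sub   rdx, rsi                 ; arg 3: length = buffer end - string start
    mov   edi, 1                   ; arg 1: stdout (arg 2, the buffer, is already in rsi)
    call  write wrt ..plt          ; libc wrapper, as the first answer recommends
    add   rsp, 24
    ret
```

Only caller-saved registers are used, so nothing such as rbx needs to be preserved for the C caller.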
