Adding 64 bit asm instructions to 32bit asm code - nasm

I am planning to write the smallest binary file in terms of size in binary. I am following the instructions as given here. I have successfully managed to reduce the size of my binary using 32 bit asm to 64 bytes. Now I want to convert the 32 bit assembly language to 64 bit language.
This is the code that I tried:
; tiny.asm
BITS 32
org 0x00200000
db 0x7F, "ELF" ; e_ident
db 1, 1, 1, 0
_start:
mov bl, 42 ;These are the lines where
xor eax, eax ;I want to insert the 64 bit
inc eax ;instructions.
int 0x80 ;
db 0
dw 2 ; e_type
dw 3 ; e_machine
dd 1 ; e_version
dd _start ; e_entry
dd phdr - $$ ; e_phoff
phdr: dd 1 ; e_shoff ; p_type
dd 0 ; e_flags ; p_offset
dd $$ ; e_ehsize ; p_vaddr
; e_phentsize
dw 1 ; e_phnum ; p_paddr
dw 0 ; e_shentsize
dd filesize ; e_shnum ; p_filesz
; e_shstrndx
dd filesize ; p_memsz
dd 5 ; p_flags
dd 0x1000 ; p_align
filesize equ $ - $$
As per the code I did have some wild guesses. However, the compiler is expecting 32 bit instructions when I tried to tinker with it giving registers and syscall of 64 bit asm.
The code I used to execute the program is :
$ nasm -f bin -o a.out tiny.asm
$ chmod +x a.out
$ ./a.out ; echo $?
42
$ wc -c a.out
64 a.out
Any idea or any sort of lead in this will be really appreciated. Thanks.

Related

.text segment bigger than .text section in executable. Why?

I have the following 'uppercaser.asm' assembly program in NASM which converts all lowercase letters input from user into uppercase:
section .bss
Buff resb 1
section .data
section .text
global _start
_start:
nop ; This no-op keeps the debugger happy
Read: mov eax,3 ; Specify sys_read call
mov ebx,0 ; Specify File Descriptor 0: Standard Input
mov ecx,Buff ; Pass offset of the buffer to read to
mov edx,1 ; Tell sys_read to read one char from stdin
int 80h ; Call sys_read
cmp eax,0 ; Look at sys_read's return value in EAX
je Exit ; Jump If Equal to 0 (0 means EOF) to Exit
; or fall through to test for lowercase
cmp byte [Buff],61h ; Test input char against lowercase 'a'
jb Write ; If below 'a' in ASCII chart, not lowercase
cmp byte [Buff],7Ah ; Test input char against lowercase 'z'
ja Write ; If above 'z' in ASCII chart, not lowercase
; At this point, we have a lowercase character
sub byte [Buff],20h ; Subtract 20h from lowercase to give uppercase...
; ...and then write out the char to stdout
Write: mov eax,4 ; Specify sys_write call
mov ebx,1 ; Specify File Descriptor 1: Standard output
mov ecx,Buff ; Pass address of the character to write
mov edx,1 ; Pass number of chars to write
int 80h ; Call sys_write...
jmp Read ; ...then go to the beginning to get another character
Exit: mov eax,1 ; Code for Exit Syscall
mov ebx,0 ; Return a code of zero to Linux
int 80H ; Make kernel call to exit program
The program is then assembled with the -g -F stabs option for the debugger and linked for 32-bit executables in ubuntu 18.04.
Running readelf --segments uppercaser for the segments and readelf -S uppercaser for the sections I see a difference in size of text segment and text section.
readelf --segments uppercaser
Elf file type is EXEC (Executable file)
Entry point 0x8048080
There are 2 program headers, starting at offset 52
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
LOAD 0x000000 0x08048000 0x08048000 0x000db 0x000db R E 0x1000
LOAD 0x0000dc 0x080490dc 0x080490dc 0x00000 0x00004 RW 0x1000
Section to Segment mapping:
Segment Sections...
00 .text
01 .bss
readelf -S uppercaser
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .text PROGBITS 08048080 000080 00005b 00 AX 0 0 16
[ 2] .bss NOBITS 080490dc 0000dc 000004 00 WA 0 0 4
[ 3] .stab PROGBITS 00000000 0000dc 000120 0c 4 0 4
[ 4] .stabstr STRTAB 00000000 0001fc 000011 00 0 0 1
[ 5] .comment PROGBITS 00000000 00020d 00001f 00 0 0 1
[ 6] .shstrtab STRTAB 00000000 00022c 00003e 00 0 0 1
[ 7] .symtab SYMTAB 00000000 0003d4 0000f0 10 8 11 4
[ 8] .strtab STRTAB 00000000 0004c4 000045 00 0 0 1
In the sections description one can see that the size of .text section is 5Bh=91 bytes (the same number one is getting with the size command) whereas in the segments description we see that the size is 0x000DB, a difference of 128 bytes. Why is that?
From the elf man pages for the Elf32_Phdr (program header) structure:
p_filesz
This member holds the number of bytes in the file image of
the segment. It may be zero.
p_memsz
This member holds the number of bytes in the memory image
of the segment. It may be zero.
Is the difference somehow related to the .bss section?
Notice that the first program segment at file address 0 starts at virtual address 0x08048000, not at VA 0x08048080 which corresponds with the .text section.
In fact the segment displayed by readelf as 00 .text covers ELF file header (52 bytes), alignment, two program headers (2*32 bytes) and the netto contents of .text section, alltogether mapped from file address 0 to VA 0x08048000.

Creating a program header with read only flag causes segfault

I've been writing ELF binaries using NASM, and I created a segment with the read-only flag turned on. Running the program causes a segfault. I tested the program in replit, and it ran just fine so what's the problem? I created a regular NASM hello world program with the hello world string inside the .rodata section and that ran fine. I checked the binary with readelf to make sure the string was in a read only segment.
The only solution I've come up with is to set the executable flag in the rodata segment so it has read / execute permissions, but that's hacky and I'd like the rodata segment to be read-only.
This is the code for the ELF-64 hello world.
; hello.asm
[bits 64]
[org 0x400000]
fileHeader:
db 0x7F, "ELF"
db 2 ; ELF-64
db 1 ; little endian
db 1 ; ELF version
db 0 ; System V ABI
db 0 ; ABI version
db 0, 0, 0, 0, 0, 0, 0 ; unused
dw 2 ; executable object file
dw 0x3E ; x86-64
dd 1 ; ELF version
dq text ; entry point
dq 64 ; program header table offset
dq nullSection - $$ ; section header table offset
dd 0 ; flags
dw 64 ; size of file header
dw 56 ; size of program header
dw 3 ; program header count
dw 64 ; size of section header
dw 4 ; section header count
dw 3 ; section header string table index
nullSegment:
times 56 db 0
textSegment:
dd 1 ; loadable segment
dd 0x4 ; read / execute permissions
dq text - $$ ; segment offset
dq text ; virtual address of segment
dq 0 ; physical address of segment
dq textSize ; size of segment in file
dq textSize ; size of segment in memory
dq 0x1000 ; alignment
rodataSegment:
dd 1 ; loadable segment
dd 0x4 ; read permission (setting this flag to 0x5 causes the program to run just fine)
dq rodata - $$ ; segment offset
dq rodata ; virtual address of segment
dq 0 ; physical address of segment
dq rodataSize ; size of segment in file
dq rodataSize ; size of segment in memory
dq 0x1000 ; alignment
text:
mov rax, 1
mov rdi, 1
mov rsi, message
mov rdx, messageLength
syscall
mov rax, 60
xor rdi, rdi
syscall
textSize equ $ - text
rodata:
message db "Hello world!", 0xA, 0
messageLength equ $ - message
rodataSize equ $ - rodata
stringTable:
db 0
db ".text", 0
db ".rodata", 0
db ".shstrtab", 0
stringTableSize equ $ - stringTable
nullSection:
times 64 db 0
textSection:
dd 1 ; index into string table
dd 1 ; program data
dq 0x6 ; occupies memory & executable
dq text ; virtual address of section
dq text - $$ ; offset of section in file
dq textSize ; size of section in file
dq 0 ; unused
dq 0x1000 ; alignment
dq 0 ; unused
rodataSection:
dd 7 ; index into string table
dd 1 ; program data
dq 0x2 ; occupies memory
dq rodata ; virtual address of section
dq rodata - $$ ; offset of section in file
dq rodataSize ; size of section in file
dq 0 ; unused
dq 0x1000 ; no alignment
dq 0 ; unused
stringTableSection:
dd 15 ; index into string table
dd 3 ; string table
dq 0 ; no attributes
dq stringTable ; virtual address of section
dq stringTable - $$ ; offset of section in file
dq stringTableSize ; size of section in file
dq 0 ; unused
dq 0 ; no alignment
dq 0 ; unused
replitHello.asm: https://hastebin.com/ujanoguveq.properties // it should be nearly the same line for line
This is the minimal nasm hello world program.
; helloNasm.asm
section .text
global _start
_start:
mov rax, 1
mov rdi, 1
mov rsi, message
mov rdx, messageLength
syscall
mov rax, 60
xor rdi, rdi
syscall
section .rodata
message db "Hello NASM!", 0xA, 0
messageLength equ $ - message
textSegment:
dd 1 ; loadable segment
dd 0x4 ; read / execute permissions
I assume you meant 0x5 for flags above.
With that fixed, I see the following segments:
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
NULL 0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 0
LOAD 0x0000e8 0x00000000004000e8 0x0000000000000000 0x000025 0x000025 R E 0x1000
LOAD 0x00010d 0x000000000040010d 0x0000000000000000 0x00000e 0x00000e R 0x1000
This asks the kernel to perform two mmaps at the same address (0x400000). The second of these mmaps maps over the first one, resulting in the following /proc/$pid/maps:
00400000-00401000 r--p 00000000 fe:02 22548440 /tmp/t
7ffff7ff9000-7ffff7ffd000 r--p 00000000 00:00 0 [vvar]
7ffff7ffd000-7ffff7fff000 r-xp 00000000 00:00 0 [vdso]
7ffffffdd000-7ffffffff000 rw-p 00000000 00:00 0 [stack]
As you can see, the program text is not executable, and as a result the program SIGSEGVs on the very first instruction:
(gdb) run
Starting program: /tmp/t
Program received signal SIGSEGV, Segmentation fault.
0x00000000004000e8 in ?? ()
(gdb) x/i $pc
=> 0x4000e8: mov $0x1,%eax
To fix this, you must move one of the segments to a different page (as Jester correctly noted).
Also note that sections are completely unnecessary (only segments matter). Setting A X flags in the .text section in particular has no effect on anything.

Segmentation fault when using memory with custom ELF file

I am trying to program a small ELF program with a custom ELF header but have a segmentation fault whenever i am writing to memory.
Why would that code trigger a segmentation fault ?
%assign LOAD_ADDRESS 0x08048000
BITS 32
org LOAD_ADDRESS ; load address
ehdr: ; Elf32_Ehdr
db 0x7F, "ELF", 1, 1, 1 ; e_ident
times 9 db 0 ; some places to run code ?
dw 2 ; e_type
dw 3 ; e_machine
dd 1 ; e_version
dd _start ; e_entry
dd phdr - $$ ; e_phoff
dd shent - $$ ; e_shoff
dd 0 ; e_flags
dw ehdrsz ; e_ehsize
dw phdrsz ; e_phentsize
dw 1 ; e_phnum
dw shentsize ; e_shentsize
dw 3 ; e_shnum
dw 2 ; e_shstrndx
ehdrsz equ $ - ehdr
phdr: ; Elf32_Phdr
dd 1 ; p_type
dd 0 ; p_offset
dd $$ ; p_vaddr
dd $$ ; p_paddr
dd filesz ; p_filesz
dd filesz ; p_memsz
dd 5 ; p_flags
dd 0x1000 ; p_align
phdrsz equ $ - phdr
shent: ; sections table
; data
dd 0 ; unamed
dd 1 ; PROGBITS
dd 2|1 ; ALLOC / WRITE
dd data
dd data - LOAD_ADDRESS
dd datasz
dd 0
dd 0
dd 4
dd 0
shentsize equ $ - shent ; length of a single section entry
; bss
dd 6 ; unamed
dd 8 ; NOBITS
dd 2|1 ; ALLOC / WRITE
dd bss
dd bss - LOAD_ADDRESS
dd bsssz
dd 0
dd 0
dd 4
dd 0
; shstrtab
dd 11 ; unamed
dd 3 ; STRTAB
dd 0
dd shstrtab
dd shstrtab - LOAD_ADDRESS
dd shstrtabsz
dd 0
dd 0
dd 1
dd 0
; ELF end
section .shstrtab
shstrtab:
db ".data",0
db ".bss",0
db ".shstrtab",0
shstrtabsz equ $ - shstrtab
_start:
mov eax,0
mov [test],eax ; segmentation fault
xor eax,eax
inc eax
int 0x80
section .data
data:
test:
db 1
datasz equ $ - data
section .bss
bss:
bsssz equ $ - bss
filesz equ $ - $$
nasm -f bin -o small_program small_program.asm
Okay found out that the program was missing a second program header with R/W flag for the data / bss section, it describe a second memory segment for the OS with the appropriate flags for run time execution.
Here is what to add after phdrsz equ $ - phdr line :
dd 1 ; p_type
dd data - LOAD_ADDRESS ; p_offset
dd data ; p_vaddr
dd data ; p_paddr
dd datasz ; p_filesz
dd datasz + bsssz ; p_memsz
dd 6 ; p_flags (R/W)
dd 0x1000 ; p_align
Note : Before understanding this i was mislead on the importance of the sections, i thought that by describing the sections i could access the memory but it turns out that the program headers is what the OS is looking for, the whole sections code can be dropped and the program still work.

Raising BRK in assembly on i386 Linux

I found and studied x86 memory access segmentation fault and it won't work in my code. The difference perhaps being that I don't use separate .text and .data segments but keep all in a single segment by creating a custom ELF header. Does that explain why the SYS_BRK call fails?
The program then continues by making the memory pages read/write/execute etc.
I tried to find the minimum code sample that illustrated the issue.
In kdbg the sample does work, but not when started from the command line, hence the message printing.
cpu 386
bits 32
; System calls used
%assign SYS_EXIT 1
%assign SYS_WRITE 4
%assign SYS_BRK 45
%assign SYS_MPROTECT 125
; flags for SYS_MPROTECT
%assign PROT_READ 1
%assign PROT_WRITE 2
%assign PROT_EXEC 4
%assign STDOUT 1
memstart: org 0x08048000
ehdr: ; Elf32_Ehdr (see https://en.wikipedia.org/wiki/Executable_and_Linkable_Format)
db 0x7F, "ELF" ; e_ident[EI_MAG0..EI_MAG03]
db 1 ; e_ident[EI_CLASS]
db 1 ; e_ident[EI_DATA]
db 1 ; e_ident[EI_VERSION]
db 0 ; e_ident[EI_OSABI]
db 0 ; e_ident[EI_ABIVERSION]
times 7 db 0 ; e_ident[EI_PAD]
dw 2 ; e_type
dw 3 ; e_machine
dd 1 ; e_version
dd start ; e_entry
dd phdr - $$ ; e_phoff
dd 0 ; e_shoff
dd 0 ; e_flags
dw ehdrsize ; e_ehsize
dw phdrsize ; e_phentsize
dw 1 ; e_phnum
dw 0 ; e_shentsize
dw 0 ; e_shnum
dw 0 ; e_shstrndx
ehdrsize equ $ - ehdr
phdr: ; Elf32_Phdr
dd 1 ; p_type
dd 0 ; p_offset
dd $$ ; p_vaddr
dd $$ ; p_paddr
dd filesize ; p_filesz
dd filesize ; p_memsz
dd 5 ; p_flags
dd 0x1000 ; p_align
phdrsize equ $ - phdr
memsize: dd 16*4096 ; = 16 * 4K pages
memtop: dd 0
start: ;int 3
xor ebx, ebx ; find the amount of allocated memory
mov eax, SYS_BRK
int 0x80 ; eax contains current memtop
sub eax, memstart ; got enough memory?
test eax, [memsize]
ja memgood
mov eax, memstart ; raise memory limit to memstart + memsize
add eax, [memsize]
mov ebx, eax
mov ecx, eax ; save requested memory size in ecx
mov eax, SYS_BRK
int 0x80
cmp eax, ecx
jne brk_error ; raising memory limit failed
memgood: mov edx, (PROT_READ | PROT_WRITE | PROT_EXEC)
mov ecx, [memsize] ; make memory read/write/execute
mov ebx, memstart
mov eax, SYS_MPROTECT
int 0x80
test eax, eax
js bailout
jmp launch ; lets start the party
brk_error: mov edx, brkelen
mov ecx, brke
mov ebx, STDOUT
mov eax, SYS_WRITE
int 0x80
jmp bailout
brke: db 'SYS_BRK failed, bye', 10
brkelen equ $ - brke
bailout: mov eax, SYS_EXIT
xor ebx, ebx
int 0x80
launch: mov edx, succlen
mov ecx, succ
mov ebx, STDOUT
mov eax, SYS_WRITE
int 0x80
jmp bailout
succ: db 'Success with mem config, bye', 10
succlen equ $ - succ
filesize equ $ - $$
You should use cmp here instead of test:
sub eax, memstart ; got enough memory?
test eax, [memsize]
ja memgood
The memory area described by SYS_BRK start a random offset of 0 to 0x02000000 after the executable unless address space layout randomisation is disabled, which I suspect your debugger does. You can use mmap to allocate memory at a specified address (don't set MAP_FIXED unless you want overwrite existing mappings) .
However this whole exercise with brk and mprotect seems rather pointless as apart from the stack upon program start memory is allocated exactly as the ELF header specified, instead you can:
phdr: ; Elf32_Phdr
dd 1 ; p_type
dd 0 ; p_offset
dd $$ ; p_vaddr
dd $$ ; p_paddr
dd filesize ; p_filesz
dd 16*4096 ; p_memsz
dd 7 ; p_flags
dd 0x1000 ; p_align
phdrsize equ $ - phdr

Has anyone been able to create a hybrid of PE COFF and ELF?

I mean could a single binary file run in both Win32 and Linux i386 ?
This is not possible, because the two types have conflicting formats:
The initial two characters of a PE file must be 'M' 'Z';
The initial four characters of an ELF file must be '\x7f' 'E' 'L' 'F'.
Clearly, you can't create one file that satisifies both formats.
In response to the comment about a polyglot binary valid as both a 16 bit COM file and a Linux ELF file, that's possible (although really a COM file is a DOS program, not Windows - and certainly not Win32).
Here's one I knocked together - compile it with NASM. It works because the first two bytes of an ELF file ('\x7f' 'E') happen to also be valid 8086 machine code (a 45 byte relative jump-if-greater-than instruction). Minimal ELF headers cribbed from Brian Raiter.
BITS 32
ORG 0x08048000
ehdr: ; Elf32_Ehdr
db 0x7F, "ELF", 1, 1, 1, 0 ; e_ident
times 8 db 0
dw 2 ; e_type
dw 3 ; e_machine
dd 1 ; e_version
dd _start ; e_entry
dd phdr - $$ ; e_phoff
dd 0 ; e_shoff
dd 0 ; e_flags
dw ehdrsize ; e_ehsize
dw phdrsize ; e_phentsize
dw 1 ; e_phnum
dw 0 ; e_shentsize
dw 0 ; e_shnum
dw 0 ; e_shstrndx
ehdrsize equ $ - ehdr
times 0x47-($-$$) db 0
; DOS COM File code
BITS 16
mov dx, msg1 - $$ + 0x100
mov ah, 0x09
int 0x21
mov ah, 0x00
int 0x21
msg1: db `Hello World (DOS).\r\n$`
BITS 32
phdr: ; Elf32_Phdr
dd 1 ; p_type
dd 0 ; p_offset
dd $$ ; p_vaddr
dd $$ ; p_paddr
dd filesize ; p_filesz
dd filesize ; p_memsz
dd 5 ; p_flags
dd 0x1000 ; p_align
phdrsize equ $ - phdr
; Linux ELF code
_start:
mov eax, 4 ; SYS_write
mov ebx, 1 ; stdout
mov ecx, msg2
mov edx, msg2_len
int 0x80
mov eax, 1 ; SYS_exit
mov ebx, 0
int 0x80
msg2: db `Hello World (Linux).\n`
msg2_len equ $ - msg2
filesize equ $ - $$
The two formats are sufficiently different that a hybrid is unlikely.
However, Linux supports loading different executable formats by "interpreter". This way compiled .exe files containing CIL (compiled C# or other .NET languages) can be executed directly under Linux, for example.
Sure. Use Java.

Resources