Is there a way to execute a flat binary image in Linux, using a syntax something like:
nasm -f bin -o foo.bin foo.asm
runbinary foo.bin
The Linux kernel can load several different binary formats - ELF is just the most common, though the a.out format is also pretty well known.
The supported binary formats are controlled by which binfmt modules are loaded or compiled in to the kernel (they're under the Filesystem section of the kernel config). There's a binfmt_flat for uClinux BFLT flat format binaries which are pretty minimal - they can even be zlib compressed which will let you make your binary even smaller, so this could be a good choice.
It doesn't look like nasm natively supports this format, but it's pretty easy to add the necessary header manually as Jim Lewis describes for ELF. There's a description of the format here.
Is there some reason you don't want to use "-f elf" instead of "-f bin"?
I think Linux won't run a binary that's not in ELF format. I can't find a tool that converts flat binaries to ELF, but you can cheat by putting the ELF information in foo.asm,
using the technique described here :
We can look at the ELF
specification, and
/usr/include/linux/elf.h, and
executables created by the standard
tools, to figure out what our empty
ELF executable should look like. But,
if you're the impatient type, you can
just use the one I've supplied here:
BITS 32
org 0x08048000
ehdr: ; Elf32_Ehdr
db 0x7F, "ELF", 1, 1, 1, 0 ; e_ident
times 8 db 0
dw 2 ; e_type
dw 3 ; e_machine
dd 1 ; e_version
dd _start ; e_entry
dd phdr - $$ ; e_phoff
dd 0 ; e_shoff
dd 0 ; e_flags
dw ehdrsize ; e_ehsize
dw phdrsize ; e_phentsize
dw 1 ; e_phnum
dw 0 ; e_shentsize
dw 0 ; e_shnum
dw 0 ; e_shstrndx
ehdrsize equ $ - ehdr
phdr: ; Elf32_Phdr
dd 1 ; p_type
dd 0 ; p_offset
dd $$ ; p_vaddr
dd $$ ; p_paddr
dd filesize ; p_filesz
dd filesize ; p_memsz
dd 5 ; p_flags
dd 0x1000 ; p_align
phdrsize equ $ - phdr
_start:
; your program here
filesize equ $ - $$
This image contains an ELF header,
identifying the file as an Intel 386
executable, with no section header
table and a program header table
containing one entry. Said entry
instructs the program loader to load
the entire file into memory (it's
normal behavior for a program to
include its ELF header and program
header table in its memory image)
starting at memory address 0x08048000
(which is the default address for
executables to load), and to begin
executing the code at _start, which
appears immediately after the program
header table. No .data segment, no
.bss segment, no commentary — nothing
but the bare necessities.
So, let's add in our little program:
; tiny.asm
org 0x08048000
;
; (as above)
;
_start: mov bl, 42 xor eax, eax inc eax int 0x80 filesize equ $ - $$
and try it out:
$ nasm -f bin -o a.out tiny.asm
$ chmod +x a.out
$ ./a.out ; echo $?
42
Minimally, Linux will need to figure out the format of the executable and it will get that from the first bytes. For example, if it's a script that will be #!, shebang. If it's ELF that will be 0x7F 'E' 'L' 'F'. Those magic numbers will determine the handler from a lookup.
So you're gonna need a header with a recognized magic number. You can get a list of shebang supported formats in /proc/sys/fs/binfmt_misc. Getting a list of native binary formats is (unfortunately) a little trickier.
bFLT may be a good choice. Indeed, it's a popular embedded executable format. But you can also squeeze ELF down quite far. This article got an ELF executable down to 45 bytes. That said, you'd be squeezing it down mostly by hand rather than by tool.
Related
I wrote a simple "Hello world" in assembly under debian linux:
; Define variables in the data section
SECTION .data
hello: db 'Hello world!',10
helloLen: equ $-hello
; Code goes in the text section
SECTION .text
GLOBAL _start
_start:
mov eax,4 ; 'write' system call = 4
mov ebx,1 ; file descriptor 1 = STDOUT
mov ecx,hello ; string to write
mov edx,helloLen ; length of string to write
int 80h ; call the kernel
; Terminate program
mov eax,1 ; 'exit' system call
mov ebx,0 ; exit with error code 0
int 80h ; call the kernel
After assembling
nasm -f elf64 hello.asm -o hello.o
ld -o hello hello.o.
I got a 9048 byte binary.
Then I changed two lines in the code: from .data to .DATA and .text to .TEXT:
SECTION .DATA
SECTION .TEXT
and got a 4856 byte binary.
Changing them to
SECTION .dAtA
SECTION .TeXt
produced a 4856 byte binary too.
NASM is declared to be a case-insensitive compiler. What is the difference then?
You're free to use whatever names you like for ELF sections, but if you don't use standard names, it becomes your responsibility to specify the section flags. (If you use standard names, you get to take advantage of default flag settings for those names.) Section names are case-sensitive, and .data and .text are known to NASM. .DATA, .dAta, etc. are not, and there is nothing which distinguishes these sections from each other, allowing ld to combine them into a single segment.
That automatically makes your executable smaller. With the standard flags for .text and .data, one of those is read-only and the other is read-write, which means that they cannot be placed into the same memory page. In your example program, both sections are quite small, so they could fit in a single memory page. Thus, using non-standard names makes your executable one page smaller, but one of the sections will have incorrect writability.
I'm trying to disassembly app written in assembly. I'm on Linux, x64:
$ objdump -d my_app
my_app: file format elf64-x86-64
That's it. What's wrong with it? It's not a simple hello world of a few lines, it's around 200 lines of code.
The same with gbd:
$ gdb -q my_app
Reading symbols from my_app...(no debugging symbols found)...done.
(gdb)
And
$ radare2 my_app
Warning: Cannot initialize section headers
Warning: Cannot initialize strings table
Warning: Cannot initialize dynamic strings
Warning: Cannot initialize dynamic section
-- Calculate checksums for the current block with the commands starting with '#' (#md5, #crc32, #all, ..)
update:
$ objdump -D my_app
my_app: file format elf64-x86-64
compiling:
$ fasm my_app.asm
# => my_app
update2:
; simplified
format ELF64 executable 3
include "import64.inc"
interpreter "/lib64/ld-linux-x86-64.so.2"
needed "libc.so.6"
import printf, close
segment readable
A equ 123
B equ 222
C equ 333
segment readable writeable
struc s1 a, b, c {
.a1 dw a
.b1 dw b
.c dd c
}
msg:
.m1 db "aaa", 0
.m2 db "bbb", 0
.m3 db "ccc", 0
segment readable executable
entry $
mov rax, 2
mov rdi, "something.txt"
mov rsi, 0
syscall
; .............
; omitted
Asking fasm to directly produce an ELF binary without the use of a linker will only create segments but no sections in the output. This confuses some tools. In particular objdump -d is specifically documented to operate on sections. Note that gdb can still debug and disassemble it, if you give it some addresses, e.g. the entry point.
Ok.
So i have been messing around with Assembly, and i was wondering: just HOW does a linkes ELF64 File look like, and can i directly write a linked file in plain-text? (like create a file e.G "main", write the hex-values of the system-calls, and then run it without linking or assembling.)
I have tried objdump -x main but i don't think, this is the entire ELF-File, because there is too less information, as i think.
Here the output:
main: Dateiformat elf64-x86-64
Inhalt von Abschnitt .text:
4000b0 b8040000 00bb0100 0000b9d0 006000ba .............`..
4000c0 0c000000 cd80b801 000000cd 80 .............
Inhalt von Abschnitt .data:
6000d0 48454c4c 4f2c2057 4f524c44 HELLO, WORLD
my Assembler Code:
section .data
msg: db "HELLO, WORLD"
len: equ $-msg
section .text
;write
mov eax, 4
mov ebx, 1
mov ecx, msg
mov edx, len
int 80h;
;quit
mov eax, 1
int 80h;
EDIT: My Compiler is finished now, I just stuck with assembler and let NASM/ld do the job
If you want to see the entire structure of your executable try:
objdump -D some_exe
and if you want to see your file in hex format do:
xxd some_exe
or
hexdump some_exe
can i directly write a linked file in plain-text?
Well... Theoretically you can if you know exactly the instructions of the executable and you write them in binary to a plaintext file.
For example, for any given executable exe_file you can do this:
touch temp_file plaintext_file
xxd -p exe_file > temp_file
xxd -p -r temp_file > plaintext_file
chmod u+x plaintext_file
The plaintext_file will be an executable exactly the same as your exe_file. If between steps 2 and 3 you modify the temp_file you are directly modifying the executable by hand, although it is not very likely to change something "specific", unless you have very deep understanding of elf64 format (which I don't and I'm not sure what can be achieved with this).
Note: I know step 1 is redundant, I used it for demonstrating that you are starting with 2 simple plaintext files.
i'm new to assembly language, I finished writing a simple program so i ran the follow commends
nasm -o learn.bin learn.asm
to assemble the code then
chmod +x learn.bin
and then finally to run it
./learn.bin
but the last returned an error
bash: ./learn.bin: cannot execute binary file
im running ubuntu with an atom intel CPU
any help would be awesome,
Thanks in advance
The error message sounds like you don't have a proper ELF executable header on it. It IS possible to assemble a file using Nasm's -f bin output format (the default, if you don't specify an output format). But it needs an ELF header stuffed into it.
The usual way would be nasm -f elf32 learn.asm (or perhaps -f elf64 if you've got 64-bit code). This "should" produce "learn.o", if all goes well. Then you've got to link this "linkable object" file using ld -o learn learn.o (add -melf-i386 if you're using 64-bit ld... which you probably are). Or, depending on the code, gcc -o learn learn.o (add -m32 for 64-bit gcc). I see that Jester has just told you that (in fewer words).
Here's an example of a file that "should" work the way you're trying to do it:
[map all hkhw.map] ; optional
;==========================
bits 32
ORIGIN equ 8048000h
org ORIGIN
section .text
code_offset equ 0
code_addr:
;--------------------------- ELF header----------------------
dd $464c457f,$00010101,0,0,$00030002,1,main,$34,0,0,$00200034,2,0
dd 1,code_offset,code_addr,code_addr,code_filez,code_memsz,5,4096
dd 1,data_offset,data_addr,data_addr,data_filez,data_memsz,6,4096
main:
;--------- your code goes here -------------------------------
push byte 4
pop eax
xor ebx, ebx
mov ecx, msg
push byte msg_len
pop edx
int 80h
push byte 1
pop eax
int 80h
;------------ constant data -----------------------
; (note that we're in .text, not .rdata)
align 4
;-------------------------------------------------------------
align 4
code_memsz equ $ - $$
code_filez equ code_memsz
data_addr equ (ORIGIN+code_memsz+4095)/4096*4096 + (code_filez % 4096)
data_offset equ code_filez
section .data vstart=data_addr
;------------ initialized data -------------
msg db "Hello from Nasm, all by itself!", 10
msg_len equ $ - msg
;---------------------------------------------------------------------------
idat_memsz equ $ - $$
bss_addr equ data_addr + ($ - $$)
section .bss vstart=bss_addr
;------------- uninitialized data ----------------------
;-------------------------------------------------
udat_memsz equ $ - $$
data_memsz equ idat_memsz + udat_memsz
data_filez equ idat_memsz
;========================
Well... that didn't format well. Probably unreadable. Try Nasm Forum. We can help you more if you post the code
You can't (normally) run plain binary files under linux. You'll have to create an ELF executable by first asking nasm to produce an object file and then using a linker. Note that your code should also of course be written for linux. There are plenty of examples on the internet, see this tutorial for example.
What's the best tool for converting PE binaries to ELF binaries?
Following is a brief motivation for this question:
Suppose I have a simple C program.
I compiled it using gcc for linux(this gives ELF), and using 'i586-mingw32msvc-gcc' for Windows(this gives a PE binary).
I want to analyze these two binaries for similarities, using Bitblaze's static analysis tool - vine(http://bitblaze.cs.berkeley.edu/vine.html)
Now vine doesn't have a good support for PE binaries, so I wanted to convert PE->ELF, and then carry on with my comparison/analysis.
Since all the analysis has to run on Linux, I would prefer a utility/tool that runs on Linux.
Thanks
It is possible to rebuild an EXE as an ELF binary, but the resulting binary will segfault very soon after loading, due to the missing operating system.
Here's one method of doing it.
Summary
Dump the section headers of the EXE file.
Extract the raw section data from the EXE.
Encapsulate the raw section data in GNU linker script snippets.
Write a linker script to build an ELF binary, including those scripts from the previous step.
Run ld with the linker script to produce the ELF file.
Run the new program, and watch it segfault as it's not running on Windows (and it tries to call functions in the Import Address Table, which doesn't exist).
Detailed Example
Dump the section headers of the EXE file. I'm using objdump from the mingw cross compiler package to do this.
$ i686-pc-mingw32-objdump -h trek.exe
trek.exe: file format pei-i386
Sections:
Idx Name Size VMA LMA File off Algn
0 AUTO 00172600 00401000 00401000 00000400 2**2
CONTENTS, ALLOC, LOAD, READONLY, CODE
1 .idata 00001400 00574000 00574000 00172a00 2**2
CONTENTS, ALLOC, LOAD, DATA
2 DGROUP 0002b600 00576000 00576000 00173e00 2**2
CONTENTS, ALLOC, LOAD, DATA
3 .bss 000e7800 005a2000 005a2000 00000000 2**2
ALLOC
4 .reloc 00013000 0068a000 0068a000 0019f400 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
5 .rsrc 00000a00 0069d000 0069d000 001b2400 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
Use dd (or a hex editor) to extract the raw section data from the EXE. Here, I'm just going to copy the code and data sections (named AUTO and DGROUP in this example). You may want to copy additional sections though.
$ dd bs=512 skip=2 count=2963 if=trek.exe of=code.bin
$ dd bs=512 skip=2975 count=347 if=trek.exe of=data.bin
Note, I've converted the file offsets and section sizes from hex to decimal to use as skip and count, but I'm using a block size of 512 bytes in dd to speed up the process (example: 0x0400 = 1024 bytes = 2 blocks # 512 bytes).
Encapsulate the raw section data in GNU ld linker scripts snippets (using the BYTE directive). This will be used to populate the sections.
cat code.bin | hexdump -v -e '"BYTE(0x" 1/1 "%02X" ")\n"' >code.ld
cat data.bin | hexdump -v -e '"BYTE(0x" 1/1 "%02X" ")\n"' >data.ld
Write a linker script to build an ELF binary, including those scripts from the previous step. Note I've also set aside space for the uninitialized data (.bss) section.
start = 0x516DE8;
ENTRY(start)
OUTPUT_FORMAT("elf32-i386")
SECTIONS {
.text 0x401000 :
{
INCLUDE "code.ld";
}
.data 0x576000 :
{
INCLUDE "data.ld";
}
.bss 0x5A2000 :
{
. = . + 0x0E7800;
}
}
Run the linker script with GNU ld to produce the ELF file. Note I have to use an emulation mode elf_i386 since I'm using 64-bit Linux, otherwise a 64-bit ELF would be produced.
$ ld -o elf_trek -m elf_i386 elf_trek.ld
ld: warning: elf_trek.ld contains output sections; did you forget -T?
$ file elf_trek
elf_trek: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV),
statically linked, not stripped
Run the new program, and watch it segfault as it's not running on Windows.
$ gdb elf_trek
(gdb) run
Starting program: /home/quasar/src/games/botf/elf_trek
Program received signal SIGSEGV, Segmentation fault.
0x0051d8e6 in ?? ()
(gdb) bt
\#0 0x0051d8e6 in ?? ()
\#1 0x00000000 in ?? ()
(gdb) x/i $eip
=> 0x51d8e6: sub (%edx),%eax
(gdb) quit
IDA Pro output for that location:
0051D8DB ; size_t stackavail(void)
0051D8DB proc stackavail near
0051D8DB push edx
0051D8DC call [ds:off_5A0588]
0051D8E2 mov edx, eax
0051D8E4 mov eax, esp
0051D8E6 sub eax, [edx]
0051D8E8 pop edx
0051D8E9 retn
0051D8E9 endp stackavail
For porting binaries to Linux, this is kind of pointless, given the Wine project.
For situations like the OP's, it may be appropriate.
I've found a simpler way to do this. Use the strip command.
Example
strip -O elf32-i386 -o myprogram.elf myprogram.exe
The -O elf32-i386 has it write out the file in that format.
To see supported formats run
strip --info
I am using the strip command from mxe, which on my system is actually named /opt/mxe/usr/bin/i686-w64-mingw32.static-strip.
I don't know whether this totally fits your needs, but is it an option for you to cross-compile with your MinGW version of gcc?
I mean do say: does it suit your needs to have i586-mingw32msvc-gcc compile direct to ELF format binaries (instead of the PEs you're currently getting). A description of how to do things in the other direction can be found here - I imagine it will be a little hacky but entirely possible to make this work for you in the other direction (I must admit I haven't tried it).