Does GNU Assembler add its own entry point? - linux

Say I have the following Assembly code:
.section .text
.globl _start
_start:
If I created an executable file using the following commands:
as 1.s -o 1.o
ld 1.o -o 1
Will the GNU Assembler add its own entry point to my executable which calls _start or will _start be the actual entry point?
See this question for more details.

The file crt0.o (or crt1.o, depending on the system) that contains the startup code mentioned in the other question is itself written in assembly.
The assembler adds nothing of its own. The linker ("ld") searches all object files (which are ultimately all produced by "as") for a symbol named "_start", and that symbol becomes the entry point.
You are of course free to link crt0.o with your assembler-written program when using "ld". In that case, however, you MUST NOT name your symbol "_start"; name it "main" in your assembler file instead:
.globl main
.text
main:
...
Otherwise "ld" will print an error message because it will find two symbols named "_start" and it does not know which one is the entry point!

You can check it this way:
objdump -x 1 # n.b. 1 is the name of your program
This will print, among other things:
start address 0x000000...
Take the address it gives you and search for it elsewhere in the output. I think you will find it matches the start of the .text section as well as the _start symbol. If so, then _start is indeed the ELF entry point.
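The same check can be done without objdump. As a rough sketch (written in Python, which is not part of the original answer), the function below reads the e_entry field directly from the ELF header. The name elf_entry_point is made up for illustration; the offsets assume the standard ELF header layout, where e_entry sits at byte 24 in both ELF32 and ELF64.

```python
import struct
import sys


def elf_entry_point(header: bytes) -> int:
    """Return the e_entry field from the first bytes of an ELF file."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64bit = header[4] == 2                  # EI_CLASS: 1 = ELF32, 2 = ELF64
    endian = "<" if header[5] == 1 else ">"    # EI_DATA: 1 = little, 2 = big
    # e_entry follows e_ident (16 bytes), e_type (2), e_machine (2), e_version (4).
    fmt = endian + ("Q" if is_64bit else "I")
    (entry,) = struct.unpack_from(fmt, header, 24)
    return entry


# Usage: python3 entry.py ./1
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "rb") as f:
        print(hex(elf_entry_point(f.read(64))))
```

Run against the program 1 from the question, this should print the same value that objdump reports as the start address.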

Related

How do you assemble, link and run a .s file in linux?

I'm getting a weird error message when trying to assemble and run a .s file using AT&T syntax. I'm not sure if I'm even using the correct architecture to begin with, if I have syntax errors, or if I'm not using the correct commands to assemble and link. I'm completely lost and do not know where to begin.
So basically, I have a file called yea.s, which contains some simple assembler instructions. I then try to assemble it using the command as yea.s -o yea.o and then link it using ld yea.o -o yea. When running ld, I get this weird message: ld: warning: cannot find entry symbol _start; defaulting to 000000440000.
This is the program I'm trying to run; it is very simple and doesn't really do anything.
resMsg: .asciz "xxxxxxxx"
.text
.global main
main:
pushq $0
ret
I just cannot figure out what's going on. Obviously, this is for school homework. I'm not looking for the answer to the homework, but this is the starting point from which I can actually start coding, and I just can't figure out how to simply run the program, which the assignment doesn't explain. Anyway, thanks in advance guys!
Linux executables require an entry point to be specified. The entry point is the address of the first instruction to be executed in your program. If not specified otherwise, the link editor looks for a symbol named _start to use as an entry point. Your program does not contain such a symbol, thus the linker complains and picks the beginning of the .text section as the entry point. To fix this problem, rename main to _start.
Note further that unlike on DOS, there is nothing to return to from _start. So your attempt to return is going to cause a crash. Instead, call the system call sys_exit to exit the program:
mov $0, %edi # exit status
mov $60, %eax # system call number
syscall # perform exit call
Alternatively, if you want to use the C runtime environment and call functions from the C library, leave your program as is and instead assemble and link using the C compiler driver cc:
cc -o yea yea.s
If you do so, the C runtime environment provides the entry point for you and eventually tries to call a function main which is where your code comes in. This approach is required if you want to call functions from the C library. If you do it this way, make sure that main follows the SysV ABI (calling convention).
Note that even then your code is incorrect. The return value of a function is given in the eax (resp. rax) register and not pushed on the stack. To return zero from main, write
mov $0, %eax # exit status
ret # return from function
In all currently supported versions of Ubuntu open the terminal and type:
sudo apt install as31 nasm
as31: Intel 8031/8051 assembler
This is a fast, simple, easy to use Intel 8031/8051 assembler.
nasm: General-purpose x86 assembler
Netwide Assembler. NASM will currently output flat-form binary files, a.out, COFF and ELF Unix object files, and Microsoft 16-bit DOS and Win32 object files.
If you are using NASM in Ubuntu 18.04, the commands to compile and run an .asm file named example.asm are:
nasm -f elf64 example.asm # assemble the program
ld -s -o example example.o # link the object file nasm produced into an executable file
./example # example is an executable file

Assembly executable on Termux now produces Illegal instruction error [duplicate]

Can you let me know what I'm doing wrong?
I'm new to assembly programming and am unfamiliar with the various options in ld.
I initially tried to use the yasm assembler, but then realised that as is the way to go for the ARM architecture when writing GNU-compatible assembly code.
I had better luck running as from the binutils package, i.e. the GNU assembler, but the assembly code has to be ARM-compliant.
The following is the code within arm.s:
.text /* Start of the program code section */
.global main /* declares the main identifier */
.type main, %function
main: /* Address of the main function */
/* Program code would go here */
BR LR
/* Return to the caller */
.end /* End of the program */
The above was throwing an Illegal Instruction error. That can be fixed by substituting ret for BR LR. This is new to ARMv8.
ARM, a RISC architecture, is not supported by YASM.
My build file is as follows:
#!/usr/bin/env bash
#display usage
[ $# -eq 0 ] && { echo "Usage: $0 <File Name without extension> ";exit 1; }
set +e
rm -f $1.exe $1 $1.o
as -o $1.o $1.s
[ -e $1.o ] && { file $1.o;}
gcc -s -o $1.exe $1.o -fpic
ld -s -o $1 -pie --dynamic-linker /system/bin/linker64 /data/data/com.termux/files/usr/lib/crtbegin_dynamic.o $1.o -lc -lgcc -ldl /data/data/com.termux/files/usr/lib/crtend_android.o
[ -e $1.exe ] && { file $1.exe;nohup ./$1.exe; }
[ -e $1 ] && { file $1;nohup ./$1;}
set -e
The code was causing either a segmentation fault or a bus error earlier.
I was able to run a program or two without any segmentation or bus errors with the updated build file above. I set up the build file to produce two executables, one using gcc and the other ld, since some online tutorials use ld instead of gcc for the linking step. Using the verbose setting of gcc, you can look at the options passed to the linker and thus mimic the same for the linker independently.
There may be some redundant settings that I've missed.
You can access updates to the source code and build file at Learn Assembly.
Check out this resource from Keil here. arm Keil product guides
More resources:
https://thinkingeek.com/2016/10/08/exploring-aarch64-assembler-chapter1/
How to link a gas assembly program that uses the C standard library with ld without using gcc?
While the above problem appears to be fixed for now, I have errors running the following code:
.text
.global main
main:
mov w0, #2
mov w7, #1 // request to exit program
svc 0
I obtain an illegal instruction error when I try to execute the code.
Secondly, if I alter the main to _start (since I don't want to be using main all the time), I have the following error from the buildrun script.
./buildrun myprogram
/data/data/com.termux/files/usr/bin/aarch64-linux-android-ld: myprogram.o: in function `_start':
(.text+0x0): multiple definition of `_start'; /data/data/com.termux/files/usr/lib/crtbegin_dynamic.o:crtbegin.c:(.text+0x0): first defined here
/data/data/com.termux/files/usr/bin/aarch64-linux-android-ld: /data/data/com.termux/files/usr/lib/crtbegin_dynamic.o: in function `_start_main':
crtbegin.c:(.text+0x38): undefined reference to `main'
/data/data/com.termux/files/usr/bin/aarch64-linux-android-ld: crtbegin.c:(.text+0x3c): undefined reference to `main'
clang-8: error: linker command failed with exit code 1 (use -v to see invocation)
ld: myprogram.o: in function `_start':
(.text+0x0): multiple definition of `_start'; /data/data/com.termux/files/usr/lib/crtbegin_dynamic.o:crtbegin.c:(.text+0x0): first defined here
ld: /data/data/com.termux/files/usr/lib/crtbegin_dynamic.o: in function `_start_main':
crtbegin.c:(.text+0x38): undefined reference to `main'
ld: crtbegin.c:(.text+0x3c): undefined reference to `main'
How do I create programs with entry points other than main?
I want to be able to:
Create a statically linked executable that works.
Create an executable that has a function named _start instead of main.
This file builds static executables that don't use main or call any library calls.
Create a dynamically linked executable with an entry point other than main.
My build file handles this, sort of, with the entry point as second parameter.
Create an executable that uses supervisor call svc to exit without throwing an illegal instruction error as against using ret.
I was able to call svc by setting the system call number in register X8 rather than W7, which mirrored the R7 convention of 32-bit ARMv7. Additionally, ARM64 has renumbered the system calls, as per the following header file.
https://github.com/torvalds/linux/blob/v4.17/include/uapi/asm-generic/unistd.h
https://reverseengineering.stackexchange.com/q/16917
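To make the renumbering concrete, here is a small lookup table, sketched in Python purely for illustration. It records which register carries the system call number and the numbers for write and exit on the architectures discussed in these threads. The names SYSCALLS and syscall_setup are hypothetical, and only these two calls are listed.

```python
# Architecture -> (register that holds the syscall number, {call name: number}).
SYSCALLS = {
    "i386":    ("eax", {"write": 4,  "exit": 1}),   # invoked with int $0x80
    "x86_64":  ("rax", {"write": 1,  "exit": 60}),  # invoked with syscall
    "arm":     ("r7",  {"write": 4,  "exit": 1}),   # 32-bit EABI, svc 0
    "aarch64": ("x8",  {"write": 64, "exit": 93}),  # asm-generic numbering, svc 0
}


def syscall_setup(arch: str, name: str) -> tuple:
    """Return (register, number) needed to request `name` on `arch`."""
    register, numbers = SYSCALLS[arch]
    return register, numbers[name]


print(syscall_setup("arm", "exit"))      # the convention the failing program copied
print(syscall_setup("aarch64", "exit"))  # what a 64-bit ARM kernel expects
```

Loading 1 into w7 is therefore the 32-bit ARM convention; on AArch64 the kernel reads the number from x8.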
.data
.balign 8
labs: .asciz "Azeria Labs\n" // .asciz adds a null byte to the end of the string
.balign 8
after_labs:
.set size_of_labs, after_labs - labs
.balign 8
addr_of_labs: .dword labs
.balign 8
.text
.global main
main:
mov x0, #1 // STDOUT
ldr x1, addr_of_labs // memory address of labs
mov w2, #size_of_labs // size of labs
mov x8, #64
svc #0x0 // invoke syscall
_exit:
mov x8, #93 // exit syscall
svc #0x0 // invoke syscall
The above code was ported from the example code listed below.
https://azeria-labs.com/writing-arm-shellcode/
Compacting the data section into one instead of splitting it as in the example from the site mitigates the relocation errors while linking.
Other useful references:
https://thinkingeek.com/2013/01/09/arm-assembler-raspberry-pi-chapter-1/
Check the comment by ehrt74 on the above post for the motivation to explore the svc call further.
Yasm is an x86 assembler. It cannot produce executables for an ARM processor.
The tutorials you are working with are describing x86 assembly. They are intended to be followed on an x86 system.

ld can not find symbol _start error after assemble and link .asm file to 64 bit executables

I have a shellcode file.
Then use ndisasm to build the assembly code.
ndisasm -b 64 shellcode > shellcode.asm
cat shellcode.asm | cut -c29->key.asm
I add 2 lines to the key.asm file
global_start:
_start:
$vi key.asm
global_start:
_start:
xor eax,eax
push rax
push qword 0x79237771
push qword 0x76772427
push qword 0x25747320
. . .
. . .
. . .
push qword 0x20757577
push rsp
pop rsi
mov edi,esi
mov edx,edi
cld
mov ecx,0x80
mov ebx,0x41
xor eax,eax
push rax
lodsb
xor eax,ebx
And then I assemble and link it into a 64-bit executable
$nasm -f elf64 -g -F stabs key.asm
$ld -o key key.o
It gives me a warning
ld: warning: cannot find entry symbol _start; defaulting to 0000000000400080
I tested it out with gcc
gcc -o key key.o
I still get an error almost the same as the first one
/usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crt1.o: In
function `_start':
(.text+0x20): undefined reference to `main'
collect2: error: ld returned 1 exit status
And when I run ./key with gdb after I use $ld NOT $gcc
$gdb -q ./key
$run
I get a seg fault
Starting program: /mnt/c/Users/owner/Documents/UM/Computer_Security/ExtraCredit/key
Program received signal SIGSEGV, Segmentation fault.
0x000000000040013a in global_start ()
If I try to debug after building with gcc, the file is not found, because the link step failed with an error exit status.
Can you explain why this happens, and how I can fix it? Thanks
It gives me a warning
ld: warning: cannot find entry symbol _start; defaulting to 0000000000400080
This isn't actually a problem, as long as you're fine with the entry point being the start of the text segment (i.e. the first instruction in your shellcode).
You got the error because you left out the space between the global keyword and the _start symbol name. i.e. use global _start, or don't bother. What you did defined a label called global_start, as you can see from your later error message.
You segfault on lodsb because you truncated a stack address with mov edi,esi instead of mov rdi, rsi. And if you fix that then you fall off the end of your code into garbage instructions because you don't make an exit system call. You're already running this inside gdb, use it!
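The truncation described above can be modelled in a few lines. This is only a sketch in Python (not the answerer's code): mov_r32 is a made-up helper, and the stack address is a hypothetical value of the kind seen on x86-64 Linux. Writing a 32-bit register keeps the low 32 bits and zeroes the upper half of the 64-bit register.

```python
MASK32 = 0xFFFFFFFF


def mov_r32(src64: int) -> int:
    """Model `mov edi, esi`: copy the low 32 bits; the upper 32 bits of rdi become zero."""
    return src64 & MASK32


rsi = 0x7FFFFFFFE3D0      # hypothetical stack address held in rsi
rdi = mov_r32(rsi)        # what rdi holds after `mov edi, esi`
print(hex(rsi), "->", hex(rdi))   # 0x7fffffffe3d0 -> 0xffffe3d0
```

The result no longer points into the stack, so any later memory access through it faults, whereas mov rdi, rsi would preserve all 64 bits.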
I tested it out with gcc: gcc -o key key.o
I still get an error almost the same as the first one
/usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crt1.o:
In function `_start':
(.text+0x20): undefined reference to `main'
No, that's a totally different error. If you had exported _start correctly, you would have gotten an error for conflicting definitions of _start (between your code and the CRT start files).
This error is that the _start definition in crt1.o (provided by gcc) has a reference to main, but your code doesn't provide a main. This is what happens when you try to compile a C or C++ program that doesn't define main.
To link with gcc, use -nostdlib to omit the CRT start files and all other libraries. (i.e. link pretty much exactly like you were doing manually with ld.)
gcc -nostdlib -static key.o -o key # static executable: just your code
Or dynamically linked without the CRT start files, using your _start:
gcc -nostartfiles -no-pie key.o -o key
You can call libc functions from code linked that way, but only on Linux or other platforms where dynamic linking takes care of running libc initialization functions.
If you statically link libc, you can only call functions like printf if you first call all the libc init functions that the normal CRT startup code does. (Not going into detail here because this code doesn't use libc)
Your code is wrong. There must be a space between global and _start. That is one of your problems.
section .text
global _start
_start:
xor eax,eax
push rax
...
In addition, to find out why the segmentation fault is happening, you have to debug it. You could look at the instruction that causes the segfault (in a 64-bit process the instruction pointer is $rip, not $eip):
x/5i $rip

What sections are necessary in a minimal dynamically-linked ELF program?

I assembled a simple "Hello, world" program and linked it using TCC, after which I got 4196 bytes of an executable.
The program has 31 sections: ['', '.text', '.data', '.bss', '.symtab', '.strtab', '.rel.text', '.rodata', '.rodata.cst4', '.note.GNU-stack', '.init', '.rel.init', '.gnu.linkonce.t.__x86.get_pc_thunk.bx', '.fini', '.rel.fini', '.text.unlikely', '.text.__x86.get_pc_thunk.bx', '.eh_frame', '.rel.eh_frame', '.preinit_array', '.init_array', '.fini_array', '.interp', '.dynsym', '.dynstr', '.hash', '.dynamic', '.got', '.plt', '.rel.got', '.shstrtab']. I feel that's a real lot for such a simple binary - which ones are actually necessary here for my program to run?
Here's the source code and the way I compiled it:
extern printf
global main
section .data
msg: db "Hello World!", 0
section .text
main:
;; printf (msg)
push msg
call printf
add esp, 4
;; return 0
mov eax, 0
ret
nasm main.asm -f elf32 && tcc main.o -o main
Tested on 32bit/ubuntu:16.04 Docker image.
Note: this question is different from this one in that I'm not looking for a teensy Linux ELF, but one that allows me to call dynamic symbols. I believe that due to the nature of dynamic linking, I need some extra sections.
I believe that due to the nature of dynamic linking, I need some extra sections.
Your belief is mistaken. No section is necessary at runtime; only segments matter.
A runnable dynamically-linked ELF binary will have at least one PT_LOAD segment, a PT_INTERP segment, and a PT_DYNAMIC segment.
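To see which segments a given binary actually has, something like readelf -l would normally do. As an illustrative sketch only, the Python function below walks the program header table of a little-endian ELF32 or ELF64 image and names the three segment types mentioned above. The function name segment_types is an assumption for this example, and any other type is reported as a hex number.

```python
import struct

# Segment types named in the answer (values from the ELF specification).
PT_NAMES = {1: "PT_LOAD", 2: "PT_DYNAMIC", 3: "PT_INTERP"}


def segment_types(image: bytes) -> list:
    """List the program header types of a little-endian ELF32 or ELF64 image."""
    if image[:4] != b"\x7fELF" or image[5] != 1:
        raise ValueError("expected a little-endian ELF file")
    if image[4] == 2:   # ELF64: e_phoff at 32, e_phentsize at 54, e_phnum at 56
        (e_phoff,) = struct.unpack_from("<Q", image, 32)
        e_phentsize, e_phnum = struct.unpack_from("<HH", image, 54)
    else:               # ELF32: e_phoff at 28, e_phentsize at 42, e_phnum at 44
        (e_phoff,) = struct.unpack_from("<I", image, 28)
        e_phentsize, e_phnum = struct.unpack_from("<HH", image, 42)
    types = []
    for i in range(e_phnum):
        # p_type is the first 32-bit word of every program header.
        (p_type,) = struct.unpack_from("<I", image, e_phoff + i * e_phentsize)
        types.append(PT_NAMES.get(p_type, hex(p_type)))
    return types
```

Calling segment_types(open("main", "rb").read()) on the binary from the question should list the segments the loader uses, regardless of how many sections the file also carries.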

Why simple exit program do not work?

I am new to assembly language programming. I wrote the following code:
.text
.globl _start
_start:
movl $1,%eax
movl $0,%ebx
int $0x80
and used the command as -o JustExit.o JustExit.asm to create the object file (the assembly file is named JustExit.asm).
After this step I gave the file executable permission using
chmod 777 ./JustExit.o
When I execute the program, it says
-su: ./JustExit.o: cannot execute binary file
I am not able to understand why this simple 'exit' program is not working.
Thanks.
Assembling your source through as produces an object file which is "not yet" executable.
You have to link the object file with a linker such as ld, which will then produce a fully working executable (a.out by default).
Your command line chain would look like this:
$ as -o JustExit.o JustExit.asm
$ ld JustExit.o
$ ./a.out
And it works!

Resources