e_machine field in elf header - linux

In the ELF header, there's an e_machine field. My question is: does it only specify the processor architecture the file can run on, or does it specify the processor architecture that was used to make the ELF file?
I have done some research and found that it specifies the architecture the file requires in order to run.

The job of ELF is to describe the executable, not where it came from, so e_machine names the architecture the file is meant to run on. (Information about the build machine would basically be useless; why would you care?)
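To see the field for yourself, here is a minimal sketch that reads e_machine from the first 20 bytes of an ELF header. The function name read_e_machine and the small EM_NAMES table are my own for illustration; the offsets (e_type at 16, e_machine at 18) and the listed constants come from the ELF specification.

```python
import struct

# A few well-known e_machine values (illustrative subset, not exhaustive).
EM_NAMES = {3: "EM_386", 40: "EM_ARM", 62: "EM_X86_64",
            183: "EM_AARCH64", 243: "EM_RISCV"}

def read_e_machine(header: bytes) -> int:
    """Return e_machine given at least the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF header")
    # e_ident[EI_DATA] (byte 5): 1 = little-endian, 2 = big-endian.
    endian = "<" if header[5] == 1 else ">"
    # e_type sits at offset 16 and e_machine at offset 18; both are 16-bit.
    _e_type, e_machine = struct.unpack_from(endian + "HH", header, 16)
    return e_machine

def machine_name(e_machine: int) -> str:
    """Map a numeric e_machine to a symbolic name where this sketch knows it."""
    return EM_NAMES.get(e_machine, f"unknown ({e_machine})")
```

For a real file you would pass in open(path, "rb").read(20). Note that nothing in the header records the machine the file was built on; a cross-compiled ARM binary built on an x86 host still reports EM_ARM.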

Related

How does file(1) utility discern from an ELF shared object and an ELF executable?

The problem is that ELF shared libraries and normal executables don't differ at all if you look at their ELF headers. On my Linux machine (Debian 11.4), the ht utility reports the e_type field as "shared object file" even when the file in question is an ELF executable. That field looked like the only easy and reliable way to tell ELF executables from ELF shared libraries, yet for some reason GCC always fills it with this value. Nevertheless, the file(1) utility can accurately tell me whether the input file I give it is an executable or a shared library.
The first answer to this question suggests that the file(1)'s code should look for PT_INTERP program header:
distinguish shared objects from position independent executables
This sounds reasonable because shared libraries are loaded only after the executable has been loaded, so they don't need to request the interpreter again; the executable has already done that.
Also I found this in file(1) source code when I was looking how magic.mgc file is compiled:
0 name elf-le
>16 leshort 0 no file type,
!:mime application/octet-stream
>16 leshort 1 relocatable,
!:mime application/x-object
>16 leshort 2 executable,
!:mime application/x-executable
>16 leshort 3 ${x?pie executable:shared object},
and I cannot understand what the last line means but it seems I can find an answer to my question if I understand this.
>16 leshort 3 ${x?pie executable:shared object},
This means "look at a 2-byte little endian word at offset 16 of the file. If it has the value '3' then check if the file is executable or not (permissions bit). If it is, the type is 'pie executable', otherwise it is 'shared object'"
You can look at the magic(5) man page for info about this syntax.
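As a rough illustration of that rule, here is a minimal sketch. It only models the single magic line quoted above, not everything file(1) does, and the function name describe_elf_type is made up for the example.

```python
import os
import struct

def describe_elf_type(path: str) -> str:
    """Mimic '>16 leshort 3 ${x?pie executable:shared object}' for one file."""
    with open(path, "rb") as f:
        header = f.read(18)
    if len(header) < 18 or header[:4] != b"\x7fELF":
        return "not ELF"
    # ">16 leshort": a 2-byte little-endian value at file offset 16 (e_type).
    (e_type,) = struct.unpack_from("<H", header, 16)
    if e_type != 3:
        return f"e_type {e_type}"
    # "${x?a:b}": pick 'a' if any execute permission bit is set, else 'b'.
    mode = os.stat(path).st_mode
    return "pie executable" if mode & 0o111 else "shared object"
```

With this model, running chmod -x on a PIE binary would flip the answer to "shared object", which shows that the rule is a heuristic on file permissions rather than a property stored in the ELF header.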

What is the significance of ".note.ABI-tag" section in ELF?

I see a .note.ABI-tag section when I run objdump -h <binary> on an ELF file.
As per the ELF man page:
.note.ABI-tag
This section is used to declare the expected run-time ABI
of the ELF image. It may include the operating system name
and its run-time versions. This section is of type
SHT_NOTE. The only attribute used is SHF_ALLOC.
Is this section necessary?
What could be the side effects removing this section?
How to remove this section (a gcc flag) from ELF?
Removing it could break the executable on some systems. The section is supposed to say which kernel the binary is compatible with, so that a binary whose ABI does not match the running kernel's ABI can be detected. See more information here: https://refspecs.linuxfoundation.org/LSB_1.2.0/gLSB/noteabitag.html
However, if your binary is not compiled for a specific kernel (not necessarily Linux, as a lot of different targets use ELF output), it doesn't matter, and the section can simply be cut if your goal is to reduce the executable's size. You should be aware, though, that it's already ignored if you are doing an objcopy from ELF to BIN.
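To make the contents concrete, here is a minimal sketch that decodes the raw bytes of such a note, following the layout described in the LSB page linked above (namesz, descsz, type, then the name "GNU" and four 32-bit words: an OS marker followed by the earliest compatible kernel version). It assumes a little-endian file, and parse_abi_tag is a name I made up.

```python
import struct

def parse_abi_tag(note: bytes):
    """Decode one ELF note; returns (name, note_type, desc_words)."""
    # Note header: three 32-bit words (little-endian assumed in this sketch).
    namesz, descsz, note_type = struct.unpack_from("<III", note, 0)
    name_end = 12 + namesz
    name = note[12:name_end].rstrip(b"\0").decode("ascii")
    # The name is padded so that the descriptor starts on a 4-byte boundary.
    desc_off = (name_end + 3) & ~3
    words = struct.unpack_from("<%dI" % (descsz // 4), note, desc_off)
    return name, note_type, words

def describe_abi_tag(note: bytes) -> str:
    """Render the descriptor as 'os=<n> kernel>=<major>.<minor>.<sub>'."""
    _name, _type, words = parse_abi_tag(note)
    os_marker, major, minor, sub = words[:4]
    return f"os={os_marker} kernel>={major}.{minor}.{sub}"
```

The raw bytes could be obtained with something like objcopy -O binary --only-section=.note.ABI-tag, but treat that invocation as an untested suggestion.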

ELF format: is ELF a subset of .o/.so or is ELF basically the entire .o/.so?

I'm currently doing some study on ELF format. I would like to confirm something I think is right.
ELF is a format; the name stands for Executable and Linkable Format. On Linux, compiled binaries are in ELF format.
When using gcc to compile code with the -c and -fPIC flags, it turns the code into a .o file in ELF format.
Is it correct if I say .o/.so and linux executables are ELF files? or is ELF something inside a .o/.so file? In other words, is ELF a subset of .o/.so or is ELF basically the entire .o/.so?
I would like to confirm this, because I'd like to make sure I understand this. Sorry for asking a stupid question.
Is it correct if I say .o/.so and linux executables are ELF files? or is ELF something inside a .o/.so file? In other words, is ELF a subset of .o/.so or is ELF basically the entire .o/.so?
Yes. Object files (.o), shared libraries (.so), and executables (which on Linux usually have no file extension) are three of the four types of ELF files. (The fourth type is core files -- a dump of the state of a crashed process, sometimes used for post-mortem debugging.)
All four types use the same general format, but will have some differences specific to their type. For instance, an executable will typically have an entry point, whereas object files and shared libraries won't.
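The four types correspond to the e_type values 1 to 4 in the ELF header, and the entry point is the e_entry field. A minimal sketch that reads both, with a function name (type_and_entry) that is just for illustration:

```python
import struct

ET_NAMES = {
    1: "ET_REL (object file)",
    2: "ET_EXEC (executable)",
    3: "ET_DYN (shared object or PIE)",
    4: "ET_CORE (core file)",
}

def type_and_entry(header: bytes):
    """Return (type name, e_entry) from the start of an ELF file."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF header")
    is64 = header[4] == 2                      # e_ident[EI_CLASS]: 1=32-bit, 2=64-bit
    endian = "<" if header[5] == 1 else ">"    # e_ident[EI_DATA]
    (e_type,) = struct.unpack_from(endian + "H", header, 16)
    # e_entry follows e_version at offset 24; its width depends on the class.
    (e_entry,) = struct.unpack_from(endian + ("Q" if is64 else "I"), header, 24)
    return ET_NAMES.get(e_type, f"other ({e_type})"), e_entry
```

On a .o file I would expect e_entry to be 0, since relocatable objects have no entry point yet.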

How does the Linux kernel determine ld.so's load address?

I know that the dynamic linker uses mmap() to load libraries. I guess it is the kernel that loads both the executable and its interpreter into the same address space, but how does it determine where? I noticed that ld.so's load address with ASLR disabled is 0x555555554000 (on x86_64). Where does this address come from? I tried following do_execve()'s code path, but it branches so much that I got thoroughly confused.
Read more about ELF, in particular elf(5), and about the execve(2) syscall.
An ELF file may contain an interpreter. elf(5) mentions:
PT_INTERP The array element specifies the location and
size of a null-terminated pathname to invoke
as an interpreter. This segment type is
meaningful only for executable files (though
it may occur for shared objects). However it
may not occur more than once in a file. If
it is present, it must precede any loadable
segment entry.
That interpreter is almost always ld-linux(8) (e.g. with GNU glibc), more precisely (on my Debian/Sid) /lib64/ld-linux-x86-64.so.2. If you compile musl-libc and then build some software with it, you'll get a different interpreter, /lib/ld-musl-x86_64.so.1. That ELF interpreter is the dynamic linker.
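To check which interpreter a given binary asks for, here is a minimal sketch that walks the program headers and returns the PT_INTERP string. It handles 64-bit little-endian ELF only, and find_interp is a name invented for this example; the field offsets follow the Elf64_Ehdr and Elf64_Phdr layouts in elf(5).

```python
import struct

PT_INTERP = 3  # program header type for the interpreter path

def find_interp(data: bytes):
    """Return the PT_INTERP path of a 64-bit little-endian ELF image, or None."""
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("this sketch handles 64-bit little-endian ELF only")
    (e_phoff,) = struct.unpack_from("<Q", data, 32)            # program header table offset
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)  # entry size and count
    for i in range(e_phnum):
        off = e_phoff + i * e_phentsize
        p_type, _p_flags, p_offset = struct.unpack_from("<IIQ", data, off)
        (p_filesz,) = struct.unpack_from("<Q", data, off + 32)
        if p_type == PT_INTERP:
            # The segment holds a null-terminated pathname.
            raw = data[p_offset:p_offset + p_filesz]
            return raw.rstrip(b"\0").decode()
    return None
```

Calling it on the bytes of /bin/ls on a glibc system should give the ld-linux path mentioned above, while a statically linked binary should give None. readelf -l shows the same information.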
The execve(2) syscall is using that interpreter:
If the executable is a dynamically linked ELF executable, the
interpreter named in the PT_INTERP segment is used to load the needed
shared libraries. This interpreter is typically /lib/ld-linux.so.2
for binaries linked with glibc.
See also Levine's book on Linkers and loaders, and Drepper's paper: How To Write Shared Libraries.
Notice that execve also handles the shebang (i.e. a first line starting with #!); see the Interpreter scripts section of execve(2). BTW, for ELF binaries, execve does the equivalent of mmap(2) on some segments.
Read also about vdso(7), proc(5) & ASLR. Type cat /proc/self/maps in your shell.
(I guess, but I am not sure, that the 0x555555554000 address is in the ELF program header of your executable, or perhaps of ld-linux.so; it might also come from the kernel, since 0x55555555 seems to appear in the kernel source code)
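One way to test the "it comes from the kernel" guess is plain arithmetic. The sketch below assumes, from my recollection of the x86-64 kernel headers and not verified against any particular version, that position-independent images are placed at ELF_ET_DYN_BASE = DEFAULT_MAP_WINDOW / 3 * 2, with DEFAULT_MAP_WINDOW = 2^47 - PAGE_SIZE, and that the result is rounded down to a page boundary. Check those definitions against your own kernel source before relying on them.

```python
# Assumed x86-64 constants (recalled from kernel headers; verify for your version).
PAGE_SIZE = 4096
DEFAULT_MAP_WINDOW = (1 << 47) - PAGE_SIZE

# Two thirds of the user address window, using C-style integer division.
ELF_ET_DYN_BASE = DEFAULT_MAP_WINDOW // 3 * 2

# Round down to a page boundary, as a loader would for a segment at vaddr 0.
load_bias = ELF_ET_DYN_BASE & ~(PAGE_SIZE - 1)

print(hex(ELF_ET_DYN_BASE))  # 0x555555554aaa
print(hex(load_bias))        # 0x555555554000
```

If those assumptions hold, the address is simply two thirds of the way up the user address space, rounded down to a page, which would explain the repeating 5s.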

Intel binary to ELF

Really quick question here. I'm working in Ubuntu, and I have a simple "Hello World!" program in assembly which I have assembled into x86 machine code. Now I want to turn that machine code into an ELF executable which my computer can run. I am aware that I could just assemble directly to ELF; the purpose of my inquiry is to discover how to make ELF binaries out of assembled machine code.
Thanks!
Final ELF executable files are typically built out of other ELF files, reorganized by the linker. The easiest way, of course, would be to specify ELF as the output format of your assembler.
1) If you really want to do this, you could start with an "empty" ELF file (that you get from compiling or assembling nothing, etc.). Then you could use objcopy --add-section, which allows you to add an arbitrary file as a section in an existing ELF file.
This will create a minimal ELF file:
$ echo "" | gcc -c -o empty.out -xc -
2) Alternatively, you could include your raw binary into another assembly file using something like nasm's incbin, which would then need to be assembled as an ELF.
3) A third option (the best so far) would be to provide your raw binary to the linker, and use a custom linker script to tell it what section to put it in (determined from the input file name). The -b flag before an input file will tell ld what type of file it is. This should let you use your flat binary file.
One of the first obstacles you're going to face is getting the entry point to point to your code. Off the top of my head I'm not sure how to edit that.
There is a Python library, pyelftools, that may help you in your quest.
If it's really assembled, then it's already in the ELF format (compilers targeting Linux generally store the object code in ELF object files as well).
However, if you want a fully-functioning executable, you have to feed the object file to a linker.

Resources