How does the ELF file format define the stack?

I'm studying the ELF file format, so I compiled a small program and dumped the section headers and their contents from the resulting executable.
The ELF header contains the entry point address, which points to the start of the .text section.
I also found the .data section, which contains the static data, and .rodata, which contains the read-only data. I expected there to be a section for the stack too, but I can't find one.
I also expected ESP to be set to the top of some section at some point, but I can't find anything like that in the disassembly.
So how does ESP get its initial value?

The following figure describes the memory map of a typical C ELF executable on x86.
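The loader maps the .text and .data sections at the base address.
The main stack is a separate region that grows downwards; on Linux it sits near the top of the user address space.
Each thread gets its own stack, and each function call gets its own stack frame within the current stack.
Thread stacks are allocated at run time, typically at lower addresses than the main stack.
Stacks are separated by guard pages to detect stack overflow.
The kernel creates the main stack when it loads the program and sets ESP before jumping to the entry point, so the ELF file does NOT need a dedicated stack section.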
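One way to see that the stack exists only at run time is to look for the [stack] mapping of a live process. The sketch below is a small Python helper (the name find_stack is made up for illustration) that picks that line out of the text of /proc/<pid>/maps; it assumes the usual Linux maps layout of address range, permissions, offset, device, inode and name.

```python
def find_stack(maps_text):
    """Return (start, end, perms) for the [stack] line of a /proc/<pid>/maps dump, or None."""
    for line in maps_text.splitlines():
        fields = line.split()
        # Layout assumed: range perms offset dev inode [name]
        if len(fields) >= 6 and fields[5] == "[stack]":
            start, end = (int(x, 16) for x in fields[0].split("-"))
            return start, end, fields[1]
    return None
```

On a Linux machine, find_stack(open("/proc/self/maps").read()) should report the main stack of the Python process itself, together with its permissions (for example rw-p when the stack is not executable).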
However, the ELF man page does list a couple of things in an ELF file that control stack attributes, mainly whether the stack is executable in memory:
PT_GNU_STACK
    GNU extension which is used by the Linux kernel to control the state of the stack via the flags set in the p_flags member.
.note.GNU-stack
    This section is used in Linux object files for declaring stack attributes. This section is of type SHT_PROGBITS. The only attribute used is SHF_EXECINSTR. This indicates to the GNU linker that the object file requires an executable stack.
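To make the PT_GNU_STACK description concrete, here is a minimal Python sketch that scans the program headers of an ELF image held in memory and reports the p_flags of the PT_GNU_STACK entry. The function name gnu_stack_flags is hypothetical; the constants and field offsets are the standard ones from elf.h. It is meant as an illustration of what readelf -Wl prints on its GNU_STACK line, not as a replacement for it.

```python
import struct

PT_GNU_STACK = 0x6474E551          # p_type value of the GNU stack program header
PF_X, PF_W, PF_R = 0x1, 0x2, 0x4   # p_flags permission bits

def gnu_stack_flags(image):
    """Return PT_GNU_STACK permissions such as 'RW-' or 'RWX', or None if the header is absent."""
    if image[:4] != b"\x7fELF":
        raise ValueError("not an ELF image")
    is64 = image[4] == 2                   # EI_CLASS: 1 = ELF32, 2 = ELF64
    end = "<" if image[5] == 1 else ">"    # EI_DATA: 1 = little-endian, 2 = big-endian
    if is64:
        (e_phoff,) = struct.unpack_from(end + "Q", image, 0x20)
        e_phentsize, e_phnum = struct.unpack_from(end + "HH", image, 0x36)
        flags_off = 4      # p_flags directly follows p_type in Elf64_Phdr
    else:
        (e_phoff,) = struct.unpack_from(end + "I", image, 0x1C)
        e_phentsize, e_phnum = struct.unpack_from(end + "HH", image, 0x2A)
        flags_off = 24     # p_flags is the seventh 32-bit word of Elf32_Phdr
    for i in range(e_phnum):
        base = e_phoff + i * e_phentsize
        (p_type,) = struct.unpack_from(end + "I", image, base)
        if p_type == PT_GNU_STACK:
            (p_flags,) = struct.unpack_from(end + "I", image, base + flags_off)
            return "".join(ch if p_flags & bit else "-"
                           for ch, bit in (("R", PF_R), ("W", PF_W), ("X", PF_X)))
    return None
```

Called as gnu_stack_flags(open(path, "rb").read()) on an executable, a result of "RWX" means the binary requests an executable stack, "RW-" means it does not, and None means the segment is missing and the kernel falls back to its default policy.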

Related

Why does the .bss segment have no executable attribute?

I have a 32-bit ELF executable named orw from pwnable.tw: https://pwnable.tw/challenge/. On my Ubuntu 18.04 machine, the .bss segment is executable.
But on Ubuntu 20 and in IDA Pro, the .bss segment has no execute permission. Why?
In a normal executable .bss should not have execute permissions, so it's the Ubuntu 18.04 result that is strange, not the other way around.
The following are all relevant here:
output from readelf -Wl orw
kernel versions
output from cat /proc/cpuinfo
emulator details (if you are using some kind of emulator).
I suspect that you are using an emulator, and that it is set up to emulate a pre-NX-bit processor (where any readable page is also executable).
Alternatively, the executable lacks a PT_GNU_STACK segment, in which case this answer is likely the correct one: kernel defaults have changed for such binaries.
.bss holds uninitialized global variables, so it is not normally executable (it doesn't need to be). If you want it executable (for example, because you are generating machine code that you want to test), you will probably need to select a special segment, or create two overlapping segments (one executable, the other read/write) so that you can write the code and also execute it. This may already be provided by the standard script you use to link executables (under a different name). If not, you can supply a linker script that creates those segments. Read the linker documentation (in full, sorry) to learn how the linker deals with this and other idiosyncrasies of your processor architecture.
I don't know which architecture you are using, but Intel processors, for example, have an execute permission on segments alongside read and write. An access to an execute-only segment must be an instruction fetch, not a data read that loads a register. If you want to read the text segment as data, you also need to give the text segment read permission so that you can see the code you are executing.

Locate a function from ELF executable file

In a rather obscure use case, instead of loading the whole ELF executable file into memory, I'd like to load only the part of the ELF file that contains a particular function. The difficulty I am facing is that I don't know how to locate the code of this particular function in the ELF file. If I had this piece of information, I would use it to load the disk sector(s) containing this part of the ELF file into memory, and jump to it. But, not being very familiar with the ELF file format and how ld works, I don't know how to get this piece of information. All I know is the function name (a plain C function, no overloads). Is it possible at all to find the position of a function from the headers of an ELF file?
I would greatly appreciate some help locating a particular function in an ELF executable file. It would be perfect if I could get both its starting and ending position, but the starting position alone is also fine. A reference to some reading material towards this goal (if technically feasible at all) for self-study is also OK. The platform I am working on is Ubuntu 20.04 with the GNU development toolchain (the version of ld is 2.34) on an x86 CPU (the ELF format is elf32-i386).
I realize this is a bit late, but for posterity:
Note that in the following text, most of the numbers are going to be in hexadecimal, since that's how they are output by binutils.
Code (usually?) resides in the .text section, so find out where that is in the ELF file: readelf -S <ELF file>
This should output section headers, the relevant output should look like this:
    [Nr] Name              Type             Address           Offset
         Size              EntSize          Flags  Link  Info  Align
    ...
    [13] .text             PROGBITS         0000000000411000  00011000
         0000000000465bac  0000000000000000  AX       0     0     4096
Note that the address is 0x411000 and the offset is 0x11000 in the above output, so for this section address = offset + 0x400000.
Figure out the address of your symbol: nm -gC <ELF file> | grep <func name>
For function main, this got me (among other things): 0000000000411455 T main, which means that the function main is located at the address 0x411455.
Since readelf told us that offset = address - 0x400000, the code for main should start at byte 0x11455 (hexadecimal) from the start of the file.
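The arithmetic above can be written as a small Python helper. This is only a sketch: the function name symbol_file_range is made up, and the symbol size used in the example (0x2f) is an invented value standing in for the size that nm -S or readelf -s would report, which is what gives you the end position as well.

```python
def symbol_file_range(sym_addr, sym_size, sh_addr, sh_offset):
    """Translate a symbol's virtual address into a [start, end) byte range in the ELF file.

    sh_addr and sh_offset are the Address and Offset columns that readelf -S
    prints for the section containing the symbol.
    """
    start = sym_addr - sh_addr + sh_offset
    return start, start + sym_size

# Address and section values from the example above; the size 0x2f is hypothetical.
start, end = symbol_file_range(0x411455, 0x2F, sh_addr=0x411000, sh_offset=0x11000)
print(hex(start), hex(end))  # prints "0x11455 0x11484"
```

Subtracting the section address rather than a fixed 0x400000 keeps the calculation valid for symbols in sections whose address and offset differ by some other amount.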
Also note to OP: AFAIK on modern computers most memory is marked as not executable, so if you just load code into memory and jump to it, it will likely crash. There are certainly ways to allocate executable memory, but they are probably a bit more complicated than calling malloc.

Allocate writable memory in the .text section

Is it possible to allocate memory in other sections of a NASM program, besides .data and .bss?
Say I want to write to a location in the .text section, but I get a segmentation fault.
I'm interested in ways to avoid this and access memory legally. I'm running Ubuntu Linux.
If you want to allocate memory at runtime, reserve some space on the stack with sub rsp, 4096 or something. Or run an mmap system call or call malloc from libc, if you linked against libc.
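As a rough illustration of the mmap route, the sketch below requests an anonymous, private, read/write mapping. It is written in Python rather than NASM for brevity, but the flags and protection bits are the same ones an assembly program would pass to the mmap system call. The helper name alloc_rw is made up, and the code assumes a Unix-like system.

```python
import mmap

def alloc_rw(size=4096):
    """Map `size` bytes of anonymous, private, read/write memory (Unix only)."""
    return mmap.mmap(
        -1, size,                                    # -1: no backing file
        flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
        prot=mmap.PROT_READ | mmap.PROT_WRITE,
    )

buf = alloc_rw()
buf[0:4] = b"\x01\x02\x03\x04"   # writing succeeds because the pages are PROT_WRITE
first = bytes(buf[0:4])
buf.close()
```

Memory obtained this way is writable without touching the permissions of .text, which is usually what you want for run-time storage.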
If you want to test shellcode / self-modifying code, or have some other reason for having a writable .text, link with ld --omagic or gcc -Wl,--omagic. From the ld(1) man page:
    -N
    --omagic
        Set the text and data sections to be readable and writable. Also, do not page-align the data segment, and disable linking against shared libraries. If the output format supports Unix style magic numbers, mark the output as "OMAGIC".
See also How can I make GCC compile the .text section as writable in an ELF binary?
Or probably you can use a linker script. It might also be possible to use NASM section attribute stuff to declare a custom section that has read, write, exec permission.
There's normally (outside of shellcode testing) no reason to do any of this, just put your static storage in .data or .bss, and your static read-only data in .rodata like a normal person.
Putting read/write data near code is actively bad for performance: possible pipeline nukes from the hardware that detects self-modifying-code, and it at least pollutes the iTLB with data and the dTLB with code, if you have a page that includes some of both instead of being full of one or the other.

How does GDB determine base addresses of shared libraries? [internals of the info sharedlibrary command]

I am trying to understand the internal workings behind GDB commands. After some initial homework on ELF, shared libraries and address space randomization, I tried to understand how GDB reconciles the executable with the core file.
solib.c contains the implementation of shared library processing. I am especially interested in the info sharedlibrary command.
The comment in solib.c goes like this:
    /* Relocate the section binding addresses as recorded in the shared
       object's file by the base address to which the object was actually
       mapped.  */
    ops->relocate_section_addresses (so, p);
I could not understand much from this comment. Can somebody explain in plain English how relocation happens? That is, every time an executable loads a shared object, it is loaded at some location, say X, and each symbol inside the shared library sits at a fixed offset, say X+Y, with some size Z. My question is: how does GDB apply the same address relocation so that it matches the load segments in the core file? How does it take that hint from the executable?
"how does GDB apply the same address relocation so that it matches the load segments in the core file?"
In other words, how does GDB find the relocation X?
The answer depends on the operating system.
On Linux, GDB finds the _DYNAMIC[] array of struct Elf{32,64}_Dyn entries in the core file, which contains an element with .d_tag == DT_DEBUG.
The .d_un.d_ptr in that element points to struct r_debug (see /usr/include/link.h), whose r_map member points to a linked list of struct link_map entries, which describe all loaded shared libraries and their relocations in l_addr.
The relevant file in GDB is solib-svr4.c.
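The lookup described above can be sketched in Python. This is not GDB's code: it is a simplified illustration that assumes a 64-bit little-endian target, takes a caller-supplied read(addr, size) function standing in for "read memory from the core file", and uses made-up helper names. The structure layouts (d_tag and d_un in Elf64_Dyn, r_version and r_map in r_debug, l_addr, l_name, l_ld, l_next and l_prev in link_map) follow link.h.

```python
import struct

DT_NULL, DT_DEBUG = 0, 21

def find_r_debug(dynamic):
    """Scan raw Elf64_Dyn entries (d_tag, d_un) and return the DT_DEBUG pointer, or None."""
    for off in range(0, len(dynamic) - 15, 16):
        d_tag, d_ptr = struct.unpack_from("<qQ", dynamic, off)
        if d_tag == DT_NULL:      # DT_NULL terminates the array
            break
        if d_tag == DT_DEBUG:
            return d_ptr          # address of struct r_debug in the process
    return None

def read_cstring(read, addr, limit=4096):
    """Read a NUL-terminated string from target memory via read(addr, size)."""
    out = bytearray()
    while len(out) < limit:
        b = read(addr + len(out), 1)
        if not b or b == b"\0":
            break
        out += b
    return out.decode(errors="replace")

def walk_link_maps(read, r_debug_addr):
    """Yield (l_addr, name) for every link_map reachable from r_debug.r_map."""
    _r_version, lm = struct.unpack("<i4xQ", read(r_debug_addr, 16))
    seen = set()                  # guard against a corrupted, cyclic list
    while lm and lm not in seen:
        seen.add(lm)
        l_addr, l_name, _l_ld, l_next, _l_prev = struct.unpack("<5Q", read(lm, 40))
        yield l_addr, read_cstring(read, l_name)
        lm = l_next
```

Each yielded l_addr is the relocation X from the question: the difference between the addresses recorded in the shared object's file and the addresses at which it was actually mapped.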
EDIT:
"I see that there are no .dynamic sections in the core file."
There shouldn't be. There is a .dynamic section in the executable and a matching LOAD segment in the core (the segment will "cover" the .dynamic section, and have the contents that were there at runtime).
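To illustrate what "cover" means here, the following Python sketch maps a run-time virtual address (such as the address of .dynamic) to an offset in the core file by searching the core's LOAD segments. The helper name core_offset and the segment values in the example are invented; in practice the tuples would come from the program headers shown by readelf -Wl on the core.

```python
def core_offset(load_segments, vaddr):
    """Return the core-file offset backing `vaddr`, or None if no LOAD segment holds it.

    load_segments: iterable of (p_vaddr, p_offset, p_filesz) tuples.
    """
    for p_vaddr, p_offset, p_filesz in load_segments:
        if p_vaddr <= vaddr < p_vaddr + p_filesz:
            return p_offset + (vaddr - p_vaddr)
    return None

# Hypothetical segments: (p_vaddr, p_offset, p_filesz)
segments = [(0x400000, 0x1000, 0x2000), (0x600000, 0x3000, 0x1000)]
dynamic_offset = core_offset(segments, 0x600E28)
```

p_filesz is used rather than p_memsz because only the bytes actually stored in the core file can be read back.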
