Why does the .bss segment have no executable attribute? - linux

I have a 32-bit ELF executable named orw from pwnable.tw: https://pwnable.tw/challenge/. On my Ubuntu 18.04 system, the .bss segment is executable.
But on my Ubuntu 20 system and in IDA Pro, the .bss segment has no executable attribute. Why?

Why does the .bss segment have no executable attribute?
In a normal executable .bss should not have execute permissions, so it's the Ubuntu 18.04 result that is strange, not the other way around.
The following are all relevant here:
output from readelf -Wl orw
kernel versions
output from cat /proc/cpuinfo
emulator details (if you are using some kind of emulator).
I suspect that you are using an emulator, and that it is set up to emulate a pre-NX-bit processor (where the W bit implied the X bit as well).
Alternatively, the executable lacks a PT_GNU_STACK segment, in which case this answer is likely the correct one: kernel defaults have changed for such binaries.
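The PT_GNU_STACK possibility can be checked without readelf. The following is a minimal sketch, assuming a little-endian ELF32 file such as orw; the function name gnu_stack_flags is made up for illustration. It walks the program headers and returns the flags of the PT_GNU_STACK entry, or None if the binary has no such entry.

```python
import struct

PT_GNU_STACK = 0x6474E551  # program header type of the stack-permission marker
PF_X = 0x1                 # "executable" bit in p_flags


def gnu_stack_flags(elf: bytes):
    """Return p_flags of PT_GNU_STACK in a little-endian ELF32 image, or None."""
    # e_ident: magic, then EI_CLASS == 1 (32-bit) and EI_DATA == 1 (little-endian)
    if elf[:4] != b"\x7fELF" or elf[4] != 1 or elf[5] != 1:
        raise ValueError("not a little-endian ELF32 image")
    # In Elf32_Ehdr, e_phoff is at byte 28; e_phentsize and e_phnum at 42 and 44.
    (e_phoff,) = struct.unpack_from("<I", elf, 28)
    e_phentsize, e_phnum = struct.unpack_from("<HH", elf, 42)
    for i in range(e_phnum):
        # Elf32_Phdr: p_type, p_offset, p_vaddr, p_paddr,
        #             p_filesz, p_memsz, p_flags, p_align
        fields = struct.unpack_from("<8I", elf, e_phoff + i * e_phentsize)
        if fields[0] == PT_GNU_STACK:
            return fields[6]
    return None
```

Calling gnu_stack_flags(open("orw", "rb").read()) and testing the result against PF_X would show whether the binary asks for an executable stack; a None result corresponds to the "lacks a PT_GNU_STACK segment" case.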

.bss is a segment for uninitialized global variables, so it is not normally executable (it doesn't need to be). If you want it executable (because you are generating machine code that you want to be able to test), you will probably need to select a special segment, or create two overlapping segments (one executable, the other read/write) so that you can write the code and also execute it. This may already be provided by the standard script you use to link executables (under a different name, of course); if not, you can supply a linker script that creates those segments. Read the linker documentation (in full, sorry) to learn how the linker deals with this (and other) idiosyncrasies of your processor architecture.
I don't know what architecture you are using, but Intel processors, for example, have an execute permission bit on segments, alongside read and write. This means that a memory access to a segment that is only executable must be an opcode fetch, not a data read that loads a data register. If you want to read data from the text segment, you also need to add read permission to it in order to see the code you are executing.

Related

RIP stuck at inc instruction in self-modifying shellcode [duplicate]

Is it possible to allocate memory in other sections of a NASM program, besides .data and .bss?
Say I want to write to a location in the .text section, and I get a segmentation fault.
I'm interested in ways to avoid this and access memory legally. I'm running Ubuntu Linux.
If you want to allocate memory at runtime, reserve some space on the stack with sub rsp, 4096 or something. Or run an mmap system call or call malloc from libc, if you linked against libc.
If you want to test shellcode / self-modifying code, or have some other reason for having a writable .text:
Link with ld --omagic or gcc -Wl,--omagic. From the ld(1) man page:
-N
--omagic
Set the text and data sections to be readable and writable. Also, do not page-align the data segment, and disable linking against shared libraries. If the output format supports Unix style magic numbers, mark the output as "OMAGIC".
See also How can I make GCC compile the .text section as writable in an ELF binary?
Or probably you can use a linker script. It might also be possible to use NASM section attribute stuff to declare a custom section that has read, write, exec permission.
There's normally (outside of shellcode testing) no reason to do any of this; just put your static storage in .data or .bss, and your static read-only data in .rodata like a normal person.
Putting read/write data near code is actively bad for performance: possible pipeline nukes from the hardware that detects self-modifying-code, and it at least pollutes the iTLB with data and the dTLB with code, if you have a page that includes some of both instead of being full of one or the other.

Locate a function from ELF executable file

In a rather obscure use case, instead of loading the whole ELF executable file into memory, I'd like to load only the part of the ELF file that contains a particular function. The difficulty I am facing is that I don't know how to locate the code of this particular function in the ELF file. If I had this piece of information, I would use it to load the disk sector(s) containing this part of the ELF file into memory, and jump to it. But, not being very familiar with the ELF file format and how ld works, I don't know how to get this piece of information. All I know is the function name (just a C function, no overloads). Or, is it possible at all to find the position of a function from the headers of the ELF file?
I would greatly appreciate some help locating a particular function in an ELF executable file. It would be perfect if I could know both its starting and ending position, but the starting position alone is also fine. A reference to some reading materials towards this goal (if technically feasible at all) for self-study is also OK. The platform I am working on is Linux 20.04 with the GNU development toolchain (the version of ld is 2.34) on an x86 CPU (the ELF format is elf32-i386).
I realize this is a bit late, but for posterity:
Note that in the following text, most of the numbers are going to be in hexadecimal, since that's how they are output by binutils.
Code (usually?) resides in the .text section, so find out where that is in the ELF file: readelf -S <ELF file>
This should output section headers, the relevant output should look like this:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  ...
  [13] .text             PROGBITS         0000000000411000  00011000
       0000000000465bac  0000000000000000  AX       0     0     4096
Note that the address is 411000 and the offset is 11000 in the above output, so the address is offset+400000.
Figure out the address of your symbol: nm -gC <ELF file> | grep <func name>
For function main, this got me (among other things): 0000000000411455 T main, which means that the function main is located at the address 411455.
Since readelf told us that offset=address-400000, the code for main should start at byte 11455 from the start of the file.
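The arithmetic above generalizes to file offset = symbol address - section address + section offset. Here is a small sketch of that calculation using the numbers from the readelf and nm output; the function name file_offset is made up for illustration.

```python
def file_offset(sym_addr: int, sec_addr: int, sec_offset: int, sec_size: int) -> int:
    """Translate a symbol's virtual address into a byte offset within the ELF file."""
    # The translation is only valid for addresses that lie inside the section.
    if not sec_addr <= sym_addr < sec_addr + sec_size:
        raise ValueError("symbol address is outside the section")
    return sym_addr - sec_addr + sec_offset


# .text: Address 0x411000, Offset 0x11000, Size 0x465bac; main is at 0x411455.
print(hex(file_offset(0x411455, 0x411000, 0x11000, 0x465BAC)))  # prints 0x11455
```

Checking the section bounds guards against applying the .text delta to a symbol that actually lives in another section with a different address-to-offset relationship.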
Also note to OP: AFAIK on modern computers most memory is marked as not executable, so if you just load code into memory and jump to it, it will likely crash. There are certainly ways to allocate executable memory, but they are probably a bit more complicated than calling malloc.

why non-pic code can't be totally ASLR using run-time fixups?

I understand that PIC code makes ASLR more efficient and easier, since the code can be placed anywhere in memory without changing it. But if I understand the Wikipedia article on relocation correctly, the dynamic linker can apply "fixups" at runtime, so a symbol can be located even though the code is not position-independent. Yet according to many answers I have seen here, non-PIC code can't have ASLR applied to any section except the stack (so the program entry point can't be randomized). If that is correct, what are runtime fixups used for, and why can't we just fix up all locations in the code at runtime, before the program starts, to randomize the program entry point?
TL:DR: Not all uses of absolute address will have relocation info in a non-PIE executable (ELF type EXEC, not DYN). Therefore the kernel's program-loader can't find them all to apply fixups.
Thus there's no way to retroactively enable ASLR for executables built as non-PIE. There's no way for a traditional executable to flag itself as having relocation metadata for every use of an absolute address, and no point in adding such a feature either since if you want text ASLR you'd just build a PIE.
Because ELF-type EXEC Linux executables are guaranteed to be loaded / mapped at the fixed base address chosen by the linker at link time, it would be a waste of space in the executable to make symbol-table entries for internal symbols. So toolchains didn't do that, and there's no reason to start. That's simply how traditional ELF executables were designed; Linux switched from a.out to ELF back in the mid 90s before stack ASLR was a thing, so it wasn't on people's radar.
e.g. the absolute address of static char buf[100] is probably embedded somewhere in the machine code that uses it (if we're talking about 32-bit code, or 64-bit code that puts the address in a register), but there's no way to know where or how many times.
Also, for x86-64 specifically, the default code model for non-PIE executables guarantees that static addresses (text / data / bss) will all be in the low 2GiB of virtual address space, so 32-bit absolute signed or unsigned addresses can work, and rel32 displacements can reach anything from anything. That's why non-PIE compiler output uses mov $symbol, %edi (5 bytes) to put an address in a register, instead of lea symbol(%rip), %rdi (7 bytes). https://godbolt.org/z/89PeK1
So even if you did know where every absolute address was, you could only ASLR it in the low 2GiB, limiting the number of bits of entropy you could introduce. (I think Windows has a mode for this: LargeAddressAware = no. But Linux doesn't; see "32-bit absolute addresses no longer allowed in x86-64 Linux?". Again, PIE is a better way to allow text ASLR, so people (distros) should just compile for that if they want its benefits.)
Unlike Windows, Linux doesn't spend huge effort on things that can be handled better and more efficiently by recompiling binaries from source.
That being said, GNU/Linux does support fixup relocations for 64-bit absolute addresses even in PIC / PIE ELF shared objects. That's why beginner code like NASM mov rdi, BUFFER can work even in a shared library: use objdump -drwC -Mintel to see the relocation info on that use of the symbol in a mov reg, imm64 instruction. An lea rdi, [rel BUFFER] wouldn't need any relocation entry if BUFFER wasn't a global symbol. (Equivalent of C static.)
You might be wondering why metadata is essential:
There's no reliable way to search text/data for possible absolute addresses; false positives would be possible. e.g. /usr/bin/ld probably contains 0x401000 as the default start address for an x86-64 executable. You don't want ASLR of ld's code+data to also change its defaults. Or that integer value could have come up in any number of ways in many programs, e.g. as a bitmap. And of course x86-64 machine code is variable length so there's no reliable way to even distinguish opcodes from immediate operands in the most general case.
And also potentially false negatives. Not super likely that an x86 program would construct an absolute address in a register with multiple instructions, but it's certainly possible. However in non-x86 code, that would be common.
RISC machines with fixed-length instructions can't put a 32-bit address into a 32-bit instruction; there'd be no room left for anything else. So to load from static storage, the absolute addresses would have to be split across multiple instructions, like MIPS lui $t0, %hi(0x612300) / lw $t1, %lo(0x612300)($t0) to load from a static variable at absolute address 0x612300. (There would normally be a symbol name in the asm source, but it wouldn't appear in the final linked binary unless it was .globl, so I used numbers as a reminder.) Instructions like that don't have to come in pairs; the same high-half of the address could be reused by other accesses into the same array or struct in later instructions.
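To make the false-negative case concrete, here is a rough sketch of how an address is split into the two 16-bit immediates of a lui / lw pair. The helper names are made up, and the rounding follows the usual convention that the low half is sign-extended when the load adds it to the register.

```python
def split_hi_lo(addr: int):
    """Split a 32-bit address into the immediates used by lui (hi) and lw (lo)."""
    lo = addr & 0xFFFF
    # The low half is sign-extended by the load, so the high half is
    # rounded up whenever bit 15 of the low half is set.
    hi = ((addr + 0x8000) >> 16) & 0xFFFF
    return hi, lo


def rebuild(hi: int, lo: int) -> int:
    """Recompute the address the hardware forms from (hi << 16) + sign_extend(lo)."""
    signed_lo = lo - 0x10000 if lo & 0x8000 else lo
    return ((hi << 16) + signed_lo) & 0xFFFFFFFF


hi, lo = split_hi_lo(0x612300)
print(hex(hi), hex(lo))  # prints 0x61 0x2300
```

Neither instruction contains the 32-bit value 0x612300 as a contiguous field, so a scan of the binary for that bit pattern would not find this use of the address.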
Let's first have a look at Windows before having a look at Linux:
Windows' .EXE files (programs) typically have a so-called "base relocation table" and they have an "image base".
The "image base" is the "desired" start address of the program; if Windows loads the program to that address, no relocation needs to be done.
The "base relocation table" contains a list of all values in a program which represent addresses. If the program is loaded to a different address than the "image base", Windows must add the difference to all values listed in that table.
If the .EXE file does not contain a "base relocation table" (as far as I know some 32-bit GCC versions generate such files), it is not possible to load the file to another address.
This is because the following C code statements will result in exactly the same machine code (binary code) if the variable someVariable is located at the address 12340000, and it is not possible to distinguish between them:
long myVariable = 12340000;
And:
int * myVariable = &someVariable;
In the first case, the value 12340000 must not be changed in any situation; in the second case, the address (which is 12340000) must be changed to the real address if the program is loaded to another address.
If the "base relocation table" is missing, there is no information if the value 12340000 is an integer value (which must not be changed) or an address (which must be changed).
So the program must be loaded to some fixed address.
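The role of the table can be illustrated with a toy model. This is only a sketch: the real Windows base relocation format is more involved, and the function rebase, the flat list of offsets, and the two-word "image" are simplifications made up for this example. The point is that the loader adjusts exactly the words the table lists and nothing else.

```python
import struct


def rebase(image: bytes, reloc_offsets, delta: int) -> bytes:
    """Add delta to every 32-bit little-endian word listed in reloc_offsets."""
    out = bytearray(image)
    for off in reloc_offsets:
        (value,) = struct.unpack_from("<I", out, off)
        struct.pack_into("<I", out, off, (value + delta) & 0xFFFFFFFF)
    return bytes(out)


# Two words with identical contents: a plain integer and a pointer to someVariable.
image = struct.pack("<II", 12340000, 12340000)
# Only the word at offset 4 is listed as an address; the image moves up by 65536.
moved = rebase(image, [4], 65536)
print(struct.unpack("<II", moved))  # prints (12340000, 12405536)
```

Without the list of offsets, nothing in the bytes of image distinguishes the first word from the second, which is the ambiguity described above.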
I'm not sure about the latest 32-bit Linux releases, but at least in older 32-bit Linux versions there was nothing like a "base relocation table" and programs did not use PIC. This means that these programs had to be loaded to their "favorite" address.
I don't know about 64-bit Linux programs, but if a program is compiled the same way as the (older) 32-bit programs, they also must be loaded to a certain address and ASLR is not possible.

How can we tell an instruction is from application code or library code on Linux x86_64

I wanted to know whether an instruction is from the application itself or from the library code.
I observed some application code/data are located at about 0x000055xxxx while libraries and mmaped regions are by default located at 0x00007fcxxxx. Can I use for example, 0x00007f00...00 as a boundary to tell instruction is from the application itself or from the library?
How can I configure this boundary in Linux kernel?
Updated.
Can I prevent (or detect) a syscall instruction being issued from application code (and only allow it to go through libc)? Maybe we can do a binary scan, but because instructions are variable-length, it's hard to rule out unintended syscall instructions.
Do it the other way. You need to learn a lot.
First, read a lot more about operating systems. So read the Operating Systems: Three Easy Pieces textbook.
Then, learn more about ASLR.
Read also Drepper's How to write shared libraries and Levine's Linkers and loaders book.
You want to use pmap(1) and proc(5).
You probably want to parse the /proc/self/maps pseudo-file from inside your program. Or use dladdr(3).
To get some insight, run cat /proc/$$/maps and cat /proc/self/maps in a Linux terminal.
I wanted to know whether an instruction is from userspace or from library code.
You are confused: both library code and main executable code are userspace.
On Linux x86_64, you can distinguish kernel addresses from userspace addresses, because the kernel addresses are in the FFFF8000'00000000 through FFFFFFFF'FFFFFFFF range on current (48-bit) implementations. See canonical form address description here.
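As a rough illustration of that range check, assuming the 48-bit layout described above (the function names are made up):

```python
KERNEL_HALF_START = 0xFFFF_8000_0000_0000  # lowest address of the upper canonical half


def is_canonical(addr: int) -> bool:
    """With 48-bit virtual addresses, bits 63..47 must be all zeros or all ones."""
    top = (addr & 0xFFFF_FFFF_FFFF_FFFF) >> 47  # the 17 most significant bits
    return top == 0 or top == 0x1FFFF


def is_kernel_half(addr: int) -> bool:
    """True for addresses in FFFF8000'00000000 through FFFFFFFF'FFFFFFFF."""
    return KERNEL_HALF_START <= addr <= 0xFFFF_FFFF_FFFF_FFFF
```

This only separates kernel addresses from userspace addresses; it says nothing about whether a userspace address belongs to the executable or to a library.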
I observed some application code/data are located at about 0x000055xxxx while libraries and mmaped regions are by default located at 0x00007fcxxxx. Can I use for example, 0x00007f00...00 as a boundary to tell instruction is from the application itself or from the library?
No, in general you can't. An application can be linked to load anywhere within canonical address space (though most applications aren't).
As Basile Starynkevitch already answered, you'll need to parse /proc/$pid/maps, or know what address the executable is linked to load at (for non-PIE binary).
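A minimal sketch of the maps-parsing approach follows. The function names are made up, and the sample text is invented for illustration; in practice the text would come from reading /proc/self/maps or /proc/<pid>/maps.

```python
def parse_maps(text: str):
    """Yield (start, end, perms, path) for each line of /proc/<pid>/maps content."""
    for line in text.splitlines():
        # Columns: address range, perms, offset, dev, inode, optional pathname.
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue
        start, end = (int(part, 16) for part in fields[0].split("-"))
        path = fields[5].strip() if len(fields) > 5 else ""
        yield start, end, fields[1], path


def mapping_for(addr: int, text: str):
    """Return (perms, path) of the mapping containing addr, or None."""
    for start, end, perms, path in parse_maps(text):
        if start <= addr < end:
            return perms, path
    return None


# Made-up sample in the format of /proc/<pid>/maps.
SAMPLE = (
    "55d0c0a00000-55d0c0a02000 r-xp 00001000 08:01 131 /usr/bin/app\n"
    "7f1c2a000000-7f1c2a1b0000 r-xp 00000000 08:01 99 /usr/lib/libc.so.6\n"
    "7f1c2a400000-7f1c2a402000 rw-p 00000000 00:00 0\n"
)
print(mapping_for(0x55D0C0A01000, SAMPLE))  # prints ('r-xp', '/usr/bin/app')
```

Comparing the returned path with the path of the main executable then tells application code from library code without relying on any fixed address boundary.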
