What can cause SIGBUS (bus error) on a generic x86 userland application in Linux? All of the discussion I've been able to find online is regarding memory alignment errors, which from what I understand doesn't really apply to x86.
(My code is running on a Geode, in case there are any relevant processor-specific quirks there.)
SIGBUS can happen in Linux for quite a few reasons other than memory alignment faults - for example, if you attempt to access an mmap region beyond the end of the mapped file.
Are you using anything like mmap, shared memory regions, or similar?
You can get a SIGBUS from an unaligned access if you turn on the unaligned access trap, but normally that's off on an x86. You can also get it from accessing a memory mapped device if there's an error of some kind.
Your best bet is using a debugger to identify the faulting instruction (SIGBUS is synchronous), and trying to see what it was trying to do.
SIGBUS on x86 (including x86_64) Linux is a rare beast. It may appear from attempt to access past the end of mmaped file, or some other situations described by POSIX.
But from hardware faults it's not easy to get SIGBUS. Namely, unaligned access from any instruction — be it SIMD or not — usually results in SIGSEGV. Stack overflows result in SIGSEGV. Even accesses to addresses not in canonical form result in SIGSEGV. All this due to #GP being raised, which almost always maps to SIGSEGV.
Now, here're some ways to get SIGBUS due to a CPU exception:
Enable AC bit in EFLAGS, then do unaligned access by any memory read or write instruction. See this discussion for details.
Do canonical violation via a stack pointer register (rsp or rbp), generating #SS. Here's an example for GCC (compile with gcc test.c -o test -masm=intel):
int main()
{
__asm__("mov rbp,0x400000000000000\n"
"mov rax,[rbp]\n"
"ud2\n");
}
Oh yes there's one more weird way to get SIGBUS.
If the kernel fails to page in a code page due to memory pressure (OOM killer must be disabled) or failed IO request, SIGBUS.
This was briefly mentioned above as a "failed IO request", but I'll expand upon it a bit.
A frequent case is when you lazily grow a file using ftruncate, map it into memory, start writing data and then run out of space in your filesystem. Physical space for mapped file is allocated on page faults, if there's none left then process receives a SIGBUS.
If you need your application to correctly recover from this error, it makes sense to explicitly reserve space prior to mmap using fallocate. Handling ENOSPC in errno after fallocate call is much simpler than dealing with signals, especially in a multi-threaded application.
You may see SIGBUS when you're running the binary off NFS (network file system) and the file is changed. See https://rachelbythebay.com/w/2018/03/15/core/.
If you request a mapping backed by hugepages with mmap and the MAP_HUGETLB flag, you can get SIGBUS if the kernel runs out of allocated huge pages and thus cannot handle a page fault.
In this case, you'll need to raise the number of allocated huge pages via
/sys/kernel/mm/hugepages/hugepages-<size>/nr_hugepages or
/sys/devices/system/node/nodeX/hugepages/hugepages-<size>/nr_hugepages on NUMA systems.
A common cause of a bus error on x86 Linux is attempting to dereference something that is not really a pointer, or is a wild pointer. For example, failing to initialize a pointer, or assigning an arbitrary integer to a pointer and then attempting to dereference it will normally produce either a segmentation fault or a bus error.
Alignment does apply to x86. Even though memory on an x86 is byte-addressable (so you can have a char pointer to any address), if you have for example an pointer to a 4-byte integer, that pointer must be aligned.
You should run your program in gdb and determine which pointer access is generating the bus error to diagnose the issue.
It's a bit off the beaten path, but you can get SIGBUS from an unaligned SSE2 (m128) load.
Related
For fun I am just trying to write a program in assembly for Linux on a laptop with an x86 processor to get some system information. So one of the things I am trying to find is how much memory is available to my program, and where e.g. the stack is and if and how I can allocate additional memory if needed.
Long time ago I did things like this on an Atari ST and there was just a system 'malloc' I could ask memory from and there were functions to find the available memory.
I know Linux is set up differently and I kind of have the whole address space to myself, but I guess there are some memory areas I am not allowed to touch.
And somehow a default stack seems to have been setup.
I researched quite a bit for this, but I can't find any 'assembly' system call. Most people point to linking the C malloc for memory management, but I am not looking for a memory manager. I just want to know the memory boundaries of my program.
I find things like getrlimit, setrlimit, prlimit and brk and sbrk, but those seem to be C functions and not system calls.
What am i missing?
Linux uses virtual memory (and ASLR). Atari ST doesn't use either so it did have a fixed memory map for some OS data structures and code. (Because the OS was in ROM and couldn't be easily updated, some people even documented some internal addresses.)
Linux tries to keep the boundary between kernel and user-space rigid, with a well-defined documented API / ABI for user-space to interact with the kernel via system calls. (e.g. on x86-64, via the syscall instruction). User-space doesn't need to care what's on the other side of that wall, and usually not even where its pages are in virtual memory as long as it has pointers to them.
When glibc malloc wants more pages from the OS, it uses mmap(MAP_ANONYMOUS) or brk to get them, and hand out chunks of them for small calls to malloc. It keeps bookkeeping data structures in user-space (so that's per-process of course).
I know Linux is set up differently and I kind of have the whole address space to myself, but I guess there are some memory areas I am not allowed to touch.
Yeah, every process has its own virtual address space. You can only touch the parts you've allocated, otherwise the resulting page fault will be "invalid" (OS knows there isn't supposed to be a physical page for that virtual page) and will deliver a SIGSEGV signal to your process if you try to read or write it. ("valid" page faults happen because of swap space or lazy allocation / copy-on-write; the kernel updates the HW page tables and returns to user-space for it to re-run the instruction that faulted.)
Also, the kernel claims the high half of virtual address space for its own use. (https://wiki.osdev.org/Higher_Half_Kernel). See also https://www.kernel.org/doc/Documentation/x86/x86_64/mm.txt for Linux's x86-64 memory map layout.
I can't find any 'assembly' system call.
mmap and brk are true system calls. See the "notes" section of the brk(2) man page. Section 2 man pages are system calls, section 3 are libc functions.
Of course in C when you call mmap(...), you're actually calling a wrapper function in glibc. glibc provides wrapper functions, not inline asm macros that use the syscall instruction directly.
See also The Definitive Guide to Linux System Calls which explains the asm interface, and also the VDSO pages. Linux maps some kernel memory (read-only) into your user-space process, holding code and data so getpid() and clock_gettime() can run in user-space.
Also various Q&As on Stack Overflow, including What are the calling conventions for UNIX & Linux system calls on i386 and x86-64
So one of the things I am trying to find is how much memory is available to my program
There isn't a system call to query the current memory map of your process. Parsing /proc/self/maps would be your best bet.
See Finding mapped memory from inside a process for some fun ideas on using system calls to scan ranges of virtual address space for mapped pages. e.g. Like Linux's mincore(2) syscall returns -ENOMEM if the specified range contains any unmapped pages.
what will happen if we dereference the null pointer in user space and kernel space?
From my understanding the behaviour is based on compiler,architecture,etc.
but in general for every user space program allocated with virtual memory and the paging is used to translate the virtual address to physical address using page tables.
so if we are dereferencing null pointer in user space,that address is invalid so the context switch will happen and in kernel based on the interrupt for this null pointer dereference 'Segmentation fault will come or page fault error will come'.
In Kernel space:
If we dereference the NULL pointer there is a possibility of crashing the system or kernel may not able to return from that call.
Is my understanding correct?or any other informations missing means please explain.
Ref:I have understood from this "What happens in OS when we dereference a NULL pointer in C?"
The kernel maps the page at virtual address 0 into all processes with no permission bits set. When you try to access that page, you get a page fault. The kernel routine that handles it issues a SIGSEGV signal to your process. If you have no handler for SIGSEGV registered, core is dumped and you see your "Segmentation fault" message.
Kernel side, things are a bit different. After all, the kernel is supposed to be robust:
If the dereference happens and recovery is possible (e.g. your trackpad driver did the offence), a kernel oops is generated. The kernel continues running (for now).
If the dereference occurs so that no recovery is possible, the Oops leads to a kernel panic. Reboot necessary.
If for some reason, there is data mapped at page zero, you will corrupt memory. Which could lead to a panic down the way, go unnoticed or even be abused for a privilege escalation attack.
I have a Linux machine and I am trying to catch all the writes or reads to the memory for a specific amount of time (I basically need the byte address and the value that is being written). Is there any tool that can help me do that or do I have to change the OS code?
You mentioned that you only want to monitor memory reads and writes to a certain physical memory address. I'm going to assume that when you say memory reads/writes, you mean an assembly instruction that reads/writes data to memory and not an instruction fetch.
You would have to modify some paging code in the kernel so it page faults when a certain address range is accessed. Then, in the page fault handler, you could somehow log the access. You could extract the target address and data by decoding the instruction that caused the fault and reading the data off the registers. After logging, the page is configured to not to fault and the instruction is reattempted. Similar to the copy-on-write technique but you're logging each read/write to the region.
The other, hardware, method is to somehow install a bus sniffer or tap into a hardware debugging interface on your platform to monitor which regions of memory is being accessed but I imagine you'll run into trouble with caches with this method.
As mentioned by another poster, you could also modify an emulator to capture certain memory accesses and run your code on that.
I'd say both methods are very platform specific and will take a lot of effort to do. Out of curiosity, what is it that you're hoping to achieve? There must be a better way to solve it than to monitor accesses to physical memory.
Self-introspection is suitable for some types of debugging. For a complete trace of memory access, it is not. How is the debug code supposed to store a trace without performing more memory access?
If you want to stay in software, your best bet is to run the code being traced inside an emulator. Not a virtual machine that uses the MMU to isolate the test code while still providing direct access, but a full emulator. Plenty exist for x86 and most other architectures you would care about.
Well, if you're just interested in memory reads and writes by a particular process (to part/all of that process's virtual memory space), you can use a combination of ptrace and mprotect (mprotect to make the memory not accessable and ptrace run until it accesses the memory and then single step).
Sorry to say, it's just not possible to do what you want, even if you change OS code. Reads and writes to memory do not go through OS system calls.
The closest you could get would be to use accessor functions for the variables of interest. The accessors could be instrumented to put trace info in a separate buffer. Embedded debugging often does this to get a log of I/O register accesses.
I want to emulate the system with prohibited unaligned memory accesses on the x86/x86_64.
Is there some debugging tool or special mode to do this?
I want to run many (CPU-intensive) tests on the several x86/x86_64 PCs when working with software (C/C++) designed for SPARC or some other similar CPU. But my access to Sparc is limited.
As I know, Sparc always checks alignment in memory reads and writes to be natural (reading a byte from any address, but reading a 4-byte word only allowed when address is divisible by 4).
May be Valgrind or PIN has such mode? Or special mode of compiler?
I'm searching for Linux non-commercial tool, but windows tools allowed too.
or may be there is secret CPU flag in EFLAGS?
I've just read question Does unaligned memory access always cause bus errors? which linked to Wikipedia article Segmentation Fault.
In the article, there's a wonderful reminder of rather uncommon Intel processor flags AC aka Alignment Check.
And here's how to enable it (from Wikipedia's Bus Error example, with a red-zone clobber bug fixed for x86-64 System V so this is safe on Linux and MacOS, and converted from Basic asm which is never a good idea inside functions: you want changes to AC to be ordered wrt. memory accesses.
#if defined(__GNUC__)
# if defined(__i386__)
/* Enable Alignment Checking on x86 */
__asm__("pushf\n orl $0x40000,(%%esp)\n popf" ::: "memory");
# elif defined(__x86_64__)
/* Enable Alignment Checking on x86_64 */
__asm__("add $-128, %%rsp \n" // skip past the red-zone, in case there is one and the compiler has local vars there.
"pushf\n"
"orl $0x40000,(%%rsp)\n"
"popf \n"
"sub $-128, %%rsp" // and restore the stack pointer.
::: "memory"); // ordered wrt. other mem access
# endif
#endif
Once enable it's working a lot like ARM alignment settings in /proc/cpu/alignment, see answer How to trap unaligned memory access? for examples.
Additionally, if you're using GCC, I suggest you enable -Wcast-align warnings. When building for a target with strict alignment requirements (ARM for example), GCC will report locations that might lead to unaligned memory access.
But note that libc's handwritten asm for memcpy and other functions will still make unaligned accesses, so setting AC is often not practical on x86 (including x86-64). GCC will sometimes emit asm that makes unaligned accesses even if your source doesn't, e.g. as an optimization to copy or zero two adjacent array elements or struct members at once.
It's tricky and I haven't done it personally, but I think you can do it in the following way:
x86_64 CPUs (specifically I've checked Intel Corei7 but I guess others as well) have a performance counter MISALIGN_MEM_REF which counter misaligned memory references.
So first of all, you can run your program and use "perf" tool under Linux to get a count of the number of misaligned access your code has done.
A more tricky and interesting hack would be to write a kernel module that programs the performance counter to generate an interrupt on overflow and get it to overflow the first unaligned load/store. Respond to this interrupt in your kernel module but sending a signal to your process.
This will, in effect, turn the x86_64 into a core that doesn't support unaligned access.
This wont be simple though - beside your code, the system libraries also use unaligned accesses, so it will be tricky to separate them from your own code.
Both GCC and Clang have UndefinedBehaviorSanitizer built in. One of those checks, alignment, can be enabled with -fsanitize=alignment. It'll emit code to check pointer alignment at runtime and abort if unaligned pointers are dereferenced.
See online documentation at:
https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html
Perhaps you somehow could compile to SSE, with all aligned moves. Unaligned accesses with movaps are illegal and probably would behave as illegal unaligned accesses on other architechtures.
I'm using Slackware 12.2 on an x86 machine. I'm trying to debug/figure out things by dumping specific parts of memory. Unfortunately my knowledge on the Linux kernel is quite limited to what I need for programming/pentesting.
So here's my question: Is there a way to access any point in memory? I tried doing this with a char pointer so that It would only be a byte long. However the program crashed and spat out something in that nature of: "can't access memory location". Now I was pointing at the 0x00000000 location which where the system stores it's interrupt vectors (unless that changed), which shouldn't matter really.
Now my understanding is the kernel will allocate memory (data, stack, heap, etc) to a program and that program will not be able to go anywhere else. So I was thinking of using NASM to tell the CPU to go directly fetch what I need but I'm unsure if that would work (and I would need to figure out how to translate MASM to NASM).
Alright, well there's my long winded monologue. Essentially my question is: "Is there a way to achieve this?".
Anyway...
If your program is running in user-mode, then memory outside of your process memory won't be accessible, by hook or by crook. Using asm will not help, nor will any other method. This is simply impossible, and is a core security/stability feature of any OS that runs in protected mode (i.e. all of them, for the past 20+ years). Here's a brief overview of Linux kernel memory management.
The only way you can explore the entire memory space of your computer is by using a kernel debugger, which will allow you to access any physical address. However, even that won't let you look at the memory of every process at the same time, since some processes will have been swapped out of main memory. Furthermore, even in kernel mode, physical addresses are not necessarily the same as the addresses visible to the process.
Take a look at /dev/mem or /dev/kmem (man mem)
If you have root access you should be able to see your memory there. This is a mechanism used by kernel debuggers.
Note the warning: Examining and patching is likely to lead to unexpected results when read-only or write-only bits are present.
From the man page:
mem is a character device file that is an image of
the main memory of the computer. It may be used, for
example, to examine (and even patch) the system.
Byte addresses in mem are interpreted as physical
memory addresses. References to nonexistent locations
cause errors to be returned.
...
The file kmem is the same as mem, except that the
kernel virtual memory rather than physical memory is
accessed.