Trying to trap all memory reads/writes on a Linux machine

I have a Linux machine and I am trying to catch all the writes or reads to the memory for a specific amount of time (I basically need the byte address and the value that is being written). Is there any tool that can help me do that or do I have to change the OS code?

You mentioned that you only want to monitor memory reads and writes to a certain physical memory address. I'm going to assume that when you say memory reads/writes, you mean an assembly instruction that reads/writes data to memory and not an instruction fetch.
You would have to modify some paging code in the kernel so it page faults when a certain address range is accessed. Then, in the page fault handler, you could somehow log the access. You could extract the target address and data by decoding the instruction that caused the fault and reading the data off the registers. After logging, the page is configured not to fault and the instruction is retried. It is similar to the copy-on-write technique, except that you're logging every read/write to the region.
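A minimal user-space analogue of that idea, using mprotect() and a SIGSEGV handler instead of patching the kernel's fault path (sketch only: it logs the first access to the page via siginfo_t's si_addr and then simply unprotects the page, whereas a real tracer would single-step and re-protect; fprintf in a signal handler is also not strictly async-signal-safe):

    #include <signal.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    /* Log the faulting address, then unprotect the page so the instruction
     * can be re-executed. A real tracer would single-step and re-protect
     * here instead, so that every access gets logged. */
    static void on_fault(int sig, siginfo_t *info, void *ctx)
    {
        (void)sig; (void)ctx;
        fprintf(stderr, "access trapped at %p\n", info->si_addr);
        uintptr_t page = (uintptr_t)info->si_addr & ~(uintptr_t)(getpagesize() - 1);
        mprotect((void *)page, getpagesize(), PROT_READ | PROT_WRITE);
    }

    int main(void)
    {
        struct sigaction sa;
        memset(&sa, 0, sizeof sa);
        sa.sa_sigaction = on_fault;
        sa.sa_flags = SA_SIGINFO;
        sigaction(SIGSEGV, &sa, NULL);

        char *region = mmap(NULL, getpagesize(), PROT_NONE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (region == MAP_FAILED)
            return 1;

        region[100] = 42;                 /* first access faults and is logged */
        printf("value read back: %d\n", region[100]);
        return 0;
    }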
The other, hardware, method is to somehow install a bus sniffer or tap into a hardware debugging interface on your platform to monitor which regions of memory are being accessed, but I imagine you'll run into trouble with caches using this method.
As mentioned by another poster, you could also modify an emulator to capture certain memory accesses and run your code on that.
I'd say both methods are very platform specific and will take a lot of effort to do. Out of curiosity, what is it that you're hoping to achieve? There must be a better way to solve it than to monitor accesses to physical memory.

Self-introspection is suitable for some types of debugging. For a complete trace of memory access, it is not. How is the debug code supposed to store a trace without performing more memory access?
If you want to stay in software, your best bet is to run the code being traced inside an emulator. Not a virtual machine that uses the MMU to isolate the test code while still providing direct access, but a full emulator. Plenty exist for x86 and most other architectures you would care about.

Well, if you're just interested in memory reads and writes by a particular process (to part or all of that process's virtual memory space), you can use a combination of ptrace and mprotect: mprotect the memory so it is not accessible, run the process under ptrace until it faults on that memory, and then single-step it.
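A bare-bones sketch of the ptrace half on x86-64 (single-step the tracee and read its registers at each step; the mprotect part, and decoding the instruction at rip to see whether it touches the watched range, are left out):

    #include <stdio.h>
    #include <sys/ptrace.h>
    #include <sys/types.h>
    #include <sys/user.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        pid_t child = fork();
        if (child == 0) {
            ptrace(PTRACE_TRACEME, 0, NULL, NULL);
            execl("/bin/ls", "ls", (char *)NULL);     /* example target */
            _exit(1);
        }

        int status;
        waitpid(child, &status, 0);                   /* stopped at execve */

        int steps = 0;
        while (!WIFEXITED(status) && steps++ < 1000) {
            struct user_regs_struct regs;
            ptrace(PTRACE_GETREGS, child, NULL, &regs);
            /* Decode the instruction at regs.rip here to check whether it
             * reads or writes the watched (mprotect'ed) range. */
            printf("rip = %llx\n", (unsigned long long)regs.rip);

            ptrace(PTRACE_SINGLESTEP, child, NULL, NULL);
            waitpid(child, &status, 0);
        }
        ptrace(PTRACE_DETACH, child, NULL, NULL);
        return 0;
    }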

Sorry to say, it's just not possible to do what you want, even if you change OS code. Reads and writes to memory do not go through OS system calls.
The closest you could get would be to use accessor functions for the variables of interest. The accessors could be instrumented to put trace info in a separate buffer. Embedded debugging often does this to get a log of I/O register accesses.
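As a trivial illustration of that approach (all names here are made up), every read/write of the variable of interest goes through helpers that append a record to a trace buffer:

    #include <stdint.h>
    #include <stdio.h>

    struct trace_entry { void *addr; uint32_t value; char op; };

    static struct trace_entry trace_buf[1024];
    static unsigned trace_pos;

    static void trace_log(char op, void *addr, uint32_t value)
    {
        if (trace_pos < 1024)
            trace_buf[trace_pos++] = (struct trace_entry){ addr, value, op };
    }

    static uint32_t watched;                 /* the variable of interest */

    static uint32_t watched_read(void)
    {
        trace_log('R', &watched, watched);
        return watched;
    }

    static void watched_write(uint32_t v)
    {
        trace_log('W', &watched, v);
        watched = v;
    }

    int main(void)
    {
        watched_write(7);
        printf("value = %u, trace entries = %u\n", watched_read(), trace_pos);
        return 0;
    }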

How can I force a cache flush for a process from a Linux device driver?

I'm working on a research project that requires me to perform a memory capture from custom hardware. I am working with a Zedboard SoC (dual-core ARM Cortex-A9 with FPGA fabric attached). I have designed a device driver that allows me to perform virtual memory captures and physical memory captures (using an AXI4-Lite peripheral that controls the Xilinx AXI DMA IP).
My goal is to capture all mapped pages, so I check /proc/pid/maps for mapped regions, then obtain PFNs from /proc/pid/pagemaps, pass the physical addresses into my device driver, and then pass them to my custom hardware (which invokes the Xilinx AXI DMA to obtain the contents from physical memory).
NOTE: I am using Xilinx's PetaLinux distribution, which is built on Linux version 4.14.
My device driver implements the following procedure through a series of IOCTL calls:
Stop the target process.
Perform virtual memory capture (using the access_process_vm() function).
Flush the cache (using the flush_user_range() function).
Perform physical memory capture.
Resume the target process.
What I'm noticing, however, is that the virtual memory capture and the physical memory capture differ in the [heap] section (which is the first section that extends past one page). The first page matches, but none of the other pages are even close. The [stack] section does not match at all. I should note that for the first two memory sections, .text and .rodata, the captures match exactly. The conclusion for now is that data that does not change during runtime matches between virtual and physical captures while data that does change during runtime does not match.
So this leaves me wondering: am I using the correct function to ensure coherency between the cache and the RAM? If not, what is the proper function to use to force a cache flush to RAM? It is necessary that the data in RAM is up-to-date to the point when the target process is stopped because I cannot access the caches from the custom hardware.
Edit 1:
Regarding this question being marked as a possible duplicate of this question, I am using a function from the accepted answer to initiate a cache flush. However, from my perspective, it doesn't seem to be working, as the physical memory does not match the virtual memory as I would expect if a cache flush were occurring.
For anyone coming across this question in the future, the problem was not what I thought it was. The flush_user_range() function I mentioned is the correct function to use to push pages back to main memory from cache.
However, what I didn't think about at the time was that pages that are virtually contiguous are not necessarily (and are very often not) physically contiguous. In my kernel code, I passed the length of the mapped region to my hardware, which requested that length of data from the AXI DMA. What I should have done was a virtual-to-physical translation to get the physical address of each page in each region, then requested one page length of data from main memory, repeating that process for each page in each mapped region.
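For anyone needing the per-page translation step, here is a rough user-space sketch using /proc/<pid>/pagemap (each 64-bit entry has bit 63 = page present and bits 0-54 = PFN; reading nonzero PFNs requires root/CAP_SYS_ADMIN on recent kernels). The kernel-side driver would walk page tables instead, but the idea is the same: translate every page separately and never assume physical contiguity.

    #include <fcntl.h>
    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <unistd.h>

    /* Translate one virtual address of a process to a physical address via
     * /proc/<pid>/pagemap. Rough sketch with minimal error handling. */
    static int virt_to_phys(pid_t pid, uintptr_t vaddr, uint64_t *paddr)
    {
        char path[64];
        snprintf(path, sizeof path, "/proc/%d/pagemap", (int)pid);

        int fd = open(path, O_RDONLY);
        if (fd < 0)
            return -1;

        long page = sysconf(_SC_PAGESIZE);
        uint64_t entry;
        off_t off = (off_t)(vaddr / page) * sizeof(entry);

        if (pread(fd, &entry, sizeof(entry), off) != sizeof(entry) ||
            !(entry & (1ULL << 63))) {       /* page not present in RAM */
            close(fd);
            return -1;
        }
        close(fd);

        uint64_t pfn = entry & ((1ULL << 55) - 1);   /* bits 0..54 */
        *paddr = pfn * page + (vaddr % page);
        return 0;
    }

    int main(void)
    {
        /* Example: translate an address in our own address space. */
        static int dummy = 1;
        uint64_t phys;
        if (virt_to_phys(getpid(), (uintptr_t)&dummy, &phys) == 0)
            printf("virtual %p -> physical 0x%" PRIx64 "\n", (void *)&dummy, phys);
        return 0;
    }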
I realize this is a pretty specific problem that likely won't help anyone else doing the same thing I was doing, but hopefully the lesson learned can help someone in the future: Linux allocates pages in physical memory (usually 4 kB in size, though you shouldn't assume that's the case), and a collection of physical pages makes up one mapped region. If you're working on any code that requires you to examine physical memory, be wary of where your data may cross physical page boundaries and act accordingly.

Allocating "temporary" memory (in Linux)

I'm trying to find any system functionality that would allow a process to allocate "temporary" memory - i.e. memory that is considered discardable by the process and can be taken away by the system when memory is needed, but that lets the process benefit from available memory when possible. In other words, the process tells the system it's OK to sacrifice the block of memory when the process is not using it. Freeing the block is also preferable to swapping it out (it's as expensive, or more expensive, to swap it out than to re-constitute its contents).
Systems (e.g. Linux) have such things in the kernel, like the filesystem page cache. I am looking for something like this, but available to user space.
I understand there are ways to do this from the program, but it's really more of a kernel job to deal with this. To some extent, I'm asking the kernel:
if you need to reduce my, or another process's, residency, take these temporary pages away first
if you are taking these temporary pages away, don't swap them out, just unmap them
Specifically, I'm interested on a solution that would work on Linux, but would be interested to learn if any exist for any other O/S.
UPDATE
An example on how I expect this to work:
map a page (over swap). No difference to what's available right now.
tell the kernel that the page is "temporary" (for the lack of a better name), meaning that if this page goes away, I don't want it paged in.
tell the kernel that I need the temporary page "back". If the page was unmapped since I marked it "temporary", I am told that happened. If it hasn't, then it starts behaving as a regular page.
Here are the problems with doing that on top of the existing MM:
To keep pages from being paged in, I have to allocate them over nothing. But then, they can get paged out at any time, without notice. Testing with mincore() doesn't guarantee that the page will still be there by the time mincore() returns. Using mlock() requires elevated privileges.
So, the closest I can get to this is by using mlock(), and anonymous pages. Following the expectations I outlined earlier, it would be:
map an anonymous, locked page. (MAP_ANON|MAP_LOCKED|MAP_NORESERVE). Stamp the page with magic.
to make the page "temporary", unlock it
when needing the page, lock it again. If the magic is there, it's my data, otherwise it's been lost, and I need to reconstitute it.
However, I don't really need for pages to be locked in RAM when I'm using them. Also, MAP_NORESERVE is problematic if memory is overcommitted.
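A rough sketch of that scheme (the MAGIC value and helper names are made up for illustration; MAP_LOCKED may need RLIMIT_MEMLOCK headroom, and whether the kernel ever actually discards the unlocked page rather than swapping it is exactly the open question here):

    #include <stdint.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    #define MAGIC 0xC0FFEEULL

    /* Allocate a locked anonymous block and stamp it. */
    static void *cache_alloc(size_t len)
    {
        void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_LOCKED | MAP_NORESERVE,
                       -1, 0);
        if (p == MAP_FAILED)
            return NULL;
        *(uint64_t *)p = MAGIC;              /* stamp the page */
        return p;
    }

    /* Mark the block "temporary": it may be reclaimed while unlocked. */
    static void cache_release(void *p, size_t len)
    {
        munlock(p, len);
    }

    /* Take the block back; returns 1 if the old contents survived. */
    static int cache_reacquire(void *p, size_t len)
    {
        mlock(p, len);
        return *(uint64_t *)p == MAGIC;
    }

    int main(void)
    {
        size_t len = sysconf(_SC_PAGESIZE);
        void *p = cache_alloc(len);
        if (!p)
            return 1;

        cache_release(p, len);
        printf("data %s\n", cache_reacquire(p, len) ? "survived" : "was lost");
        return 0;
    }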
This is what the VMware ESXi server, aka the Virtual Machine Monitor (VMM) layer, implements. It is used with virtual machines and is a way to reclaim memory from the guests. Guests that have more memory allocated than they actually use are made to release/free it to the VMM so that it can assign it back to the guests that need it.
This technique of Memory Reclamation is mentioned in this paper: http://www.vmware.com/files/pdf/mem_mgmt_perf_vsphere5.pdf
Along similar lines, you could implement something similar in your kernel.
I'm not sure I understand exactly what you need. Remember that processes run in virtual memory (their address space is virtual), and that the kernel handles virtual-to-physical address translation (using the MMU) and paging. So a page fault can happen at any time. The kernel will choose to page in or page out at arbitrary moments, and will choose which page to swap (only the kernel cares about RAM, and it can page out any physical RAM page at will). Perhaps you want the kernel to tell you when a page is genuinely discarded. How would the kernel take away temporary memory from your process without your process being notified? The kernel could take away and later give back some RAM... (so you want to know when the given-back memory is fresh)
You might use mmap(2) with MAP_NORESERVE first, then again (on the same memory range) with MAP_FIXED|MAP_PRIVATE. See also mincore(2) and mlock(2)
You can also later use madvise(2) with MADV_DONTNEED or MADV_WILLNEED etc..
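A minimal illustration of the madvise() route: after MADV_DONTNEED, private anonymous pages read back as zero-filled, which covers the "don't page the old contents back in" half of the request:

    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void)
    {
        size_t len = sysconf(_SC_PAGESIZE);
        char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
        if (p == MAP_FAILED)
            return 1;

        p[0] = 42;
        madvise(p, len, MADV_DONTNEED);              /* discard the contents */
        printf("after MADV_DONTNEED: %d\n", p[0]);   /* prints 0 */
        return 0;
    }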
Perhaps you want to mmap some device like /dev/null, /dev/full, /dev/zero or (more likely) write your own kernel module providing a similar device.
GNU Hurd has an external pager mechanism... You cannot yet get exactly that on Linux. (Perhaps consider mmap on some FUSE mounted file).
I don't understand what you want to happen when the kernel is paging out your memory, or what you want to happen when the kernel pages such a page back in because your process is accessing it. Do you want to get a zeroed page, or a SIGSEGV?

Nvidia Information Disclosure / Memory Vulnerability on Linux and General OS Memory Protection

I thought this was expected behavior?
From: http://classic.chem.msu.su/cgi-bin/ceilidh.exe/gran/gamess/forum/?C35e9ea936bHW-7675-1380-00.htm
Paraphrased summary: "Working on the Linux port we found that cudaHostAlloc/cuMemHostAlloc CUDA API calls return un-initialized pinned memory. This hole may potentially allow one to examine regions of memory previously used by other programs and Linux kernel. We recommend everybody to stop running CUDA drivers on any multiuser system."
My understanding was that "Normal" malloc returns un-initialized memory, so I don't see what the difference here is...
The way I understand how memory allocation works would allow the following to happen:
- userA runs a program on a system that crunches a bunch of sensitive information. When the calculations are done, the results are written to disk, the process exits, and userA logs off.
-userB logs in next. userB runs a program that requests all available memory in the system, and writes the content of his un-initialized memory, which contains some of userA's sensitive information that was left in RAM, to disk.
I have to be missing something here. What is it? Is memory zeroed out somewhere? Is kernel/pinned memory special in a relevant way?
Memory returned by malloc() may be nonzero, but only after being used and freed by other code in the same process. Never another process. The OS is supposed to rigorously enforce memory protections between processes, even after they have exited.
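A small demonstration of that within-process reuse (whether the second allocation actually lands on the freed block is up to the allocator, so treat the output as illustrative only):

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(void)
    {
        char *a = malloc(64);
        void *old = a;
        strcpy(a, "secret data");
        free(a);                         /* contents are not cleared */

        char *b = malloc(64);            /* may hand back the same block */
        printf("old=%p new=%p contents=\"%.16s\"\n", old, (void *)b, b);

        free(b);
        return 0;
    }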
Kernel/pinned memory is only special in that it apparently gave a kernel mode driver the opportunity to break the OS's process protection guarantees.
So no, this is not expected behavior; yes, this was a bug. Kudos to NVIDIA for acting on it so quickly!
The only part of installing CUDA that requires root privileges is the NVIDIA driver. As a result, all operations done using the NVIDIA compiler and linker can be done using regular system calls and standard compiling (provided you have the proper information -lol-). If any security hole lies there, it remains, whether or not cudaHostAlloc/cuMemHostAlloc is modified.
I am dubious about the first answer seen on this post. The man page for malloc specifies that the memory is not cleared. The man page for free does not mention any clearing of the memory.
Clearing memory seems to be the responsibility of the coder of a sensitive section -lol-, which leaves the problem of an unexpected (rare) exit. Apart from VMS (a good but not widely used OS), I don't think any OS accepts the performance cost of systematic clearing. I am not clear on how the system could track, within a newly allocated heap block, what was previously part of the process's area and what was not.
My conclusion is: if you need a strict level of privacy, do not use a multi-user system (or use VMS).

User-space memory editing programs

How do programs that edit memory of other processes work, such as Cheat Engine and iHaxGamez? My understanding is that a process reading from (let alone writing to) another process' memory is immediate grounds for a segmentation fault.
Gaining access to another process's memory under Linux is fairly straightforward (assuming you have sufficient user privileges).
For example, the file /dev/mem provides access to the system's physical memory. Details of the mappings for an individual process can be found in /proc/<pid>/maps.
Another example has been given here.
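As a more targeted Linux example than /dev/mem, process_vm_readv() (Linux 3.2+) copies straight out of another process's address space, given the same privileges ptrace would need; the pid and address below are placeholders you would take from /proc/<pid>/maps:

    #define _GNU_SOURCE
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/types.h>
    #include <sys/uio.h>

    int main(int argc, char **argv)
    {
        if (argc < 3) {
            fprintf(stderr, "usage: %s <pid> <hex-address>\n", argv[0]);
            return 1;
        }
        pid_t pid = (pid_t)atoi(argv[1]);
        void *remote_addr = (void *)(uintptr_t)strtoull(argv[2], NULL, 16);

        char buf[64];
        struct iovec local  = { .iov_base = buf,         .iov_len = sizeof buf };
        struct iovec remote = { .iov_base = remote_addr, .iov_len = sizeof buf };

        /* Needs the same permissions as a ptrace attach
         * (same UID or CAP_SYS_PTRACE). */
        ssize_t n = process_vm_readv(pid, &local, 1, &remote, 1, 0);
        if (n < 0) {
            perror("process_vm_readv");
            return 1;
        }
        printf("read %zd bytes from pid %d at %p\n", n, (int)pid, remote_addr);
        return 0;
    }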
The operation system's hardware abstraction layer usually offers functions to manipulate the memory of other processes. In Windows, the corresponding functions are ReadProcessMemory and WriteProcessMemory.
There is no reason for it to segfault; the OS (kernel) API is used to do the writing.
A segfault is signalled by the OS when a process attempts to access its own memory in an invalid way (e.g. a char[] overflow).
About the games: well, if a value is stored at an address and gets read from time to time, then it can be modified before the next read occurs.
You can use the WinAPI function WriteProcessMemory to write to the memory space of another process.
Also read some PE/COFF documentation and use VirtualQueryEx and ReadProcessMemory to figure out what to write and where.
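For completeness, a minimal Windows counterpart using the calls mentioned above (the pid and address are placeholders; the handle needs PROCESS_VM_WRITE | PROCESS_VM_OPERATION rights):

    #include <stdio.h>
    #include <windows.h>

    int main(void)
    {
        DWORD pid = 1234;                          /* target process id */
        LPVOID addr = (LPVOID)0x00400000;          /* target address    */
        int new_value = 42;

        HANDLE h = OpenProcess(PROCESS_VM_WRITE | PROCESS_VM_OPERATION,
                               FALSE, pid);
        if (!h) {
            fprintf(stderr, "OpenProcess failed: %lu\n", GetLastError());
            return 1;
        }

        SIZE_T written = 0;
        if (WriteProcessMemory(h, addr, &new_value, sizeof new_value, &written))
            printf("wrote %lu bytes\n", (unsigned long)written);
        else
            fprintf(stderr, "WriteProcessMemory failed: %lu\n", GetLastError());

        CloseHandle(h);
        return 0;
    }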

Accessing any memory locations under Linux 2.6.x

I'm using Slackware 12.2 on an x86 machine. I'm trying to debug/figure out things by dumping specific parts of memory. Unfortunately, my knowledge of the Linux kernel is pretty much limited to what I need for programming/pentesting.
So here's my question: Is there a way to access any point in memory? I tried doing this with a char pointer so that it would only be a byte long. However, the program crashed and spat out something to the effect of "can't access memory location". I was pointing at location 0x00000000, which is where the system stores its interrupt vectors (unless that has changed), which shouldn't really matter.
Now, my understanding is that the kernel will allocate memory (data, stack, heap, etc.) to a program, and that program will not be able to go anywhere else. So I was thinking of using NASM to tell the CPU to fetch what I need directly, but I'm unsure whether that would work (and I would need to figure out how to translate MASM to NASM).
Alright, well there's my long winded monologue. Essentially my question is: "Is there a way to achieve this?".
Anyway...
If your program is running in user-mode, then memory outside of your process memory won't be accessible, by hook or by crook. Using asm will not help, nor will any other method. This is simply impossible, and is a core security/stability feature of any OS that runs in protected mode (i.e. all of them, for the past 20+ years). Here's a brief overview of Linux kernel memory management.
The only way you can explore the entire memory space of your computer is by using a kernel debugger, which will allow you to access any physical address. However, even that won't let you look at the memory of every process at the same time, since some processes will have been swapped out of main memory. Furthermore, even in kernel mode, physical addresses are not necessarily the same as the addresses visible to the process.
Take a look at /dev/mem or /dev/kmem (man mem)
If you have root access you should be able to see your memory there. This is a mechanism used by kernel debuggers.
Note the warning: Examining and patching is likely to lead to unexpected results when read-only or write-only bits are present.
From the man page:
mem is a character device file that is an image of the main memory of the computer. It may be used, for example, to examine (and even patch) the system.
Byte addresses in mem are interpreted as physical memory addresses. References to nonexistent locations cause errors to be returned.
...
The file kmem is the same as mem, except that the kernel virtual memory rather than physical memory is accessed.
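A short illustration of reading a physical address through /dev/mem (needs root, and on kernels built with CONFIG_STRICT_DEVMEM only some ranges are accessible; the address below is just a placeholder):

    #include <fcntl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void)
    {
        off_t phys_addr = 0x000F0000;        /* placeholder physical address */
        size_t page = sysconf(_SC_PAGESIZE);

        int fd = open("/dev/mem", O_RDONLY);
        if (fd < 0) {
            perror("open /dev/mem");
            return 1;
        }

        /* Map a page-aligned window around the address and read from it. */
        uint8_t *map = mmap(NULL, page, PROT_READ, MAP_SHARED, fd,
                            phys_addr & ~(off_t)(page - 1));
        if (map == MAP_FAILED) {
            perror("mmap");
            close(fd);
            return 1;
        }

        size_t off = phys_addr & (page - 1);
        printf("bytes at 0x%jx: %02x %02x %02x %02x\n", (uintmax_t)phys_addr,
               map[off], map[off + 1], map[off + 2], map[off + 3]);

        munmap(map, page);
        close(fd);
        return 0;
    }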
