Are memory addresses reported in valgrind absolute, physical addresses?

While running an MPI program with valgrind using mpirun -np 3 valgrind test, I noticed that the addresses of malloc/calloc'ed arrays are sometimes identical for different processes. This would lead me to believe that the addresses reported by Valgrind are either not absolute or do not correspond to the physical memory addresses -- which would make sense, as it uses its own allocator. Could anyone confirm this, or tell me which trivial insight I'm missing here? Thank you.
The code:
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
#include <assert.h>
int main(int argc, char* argv[])
{
int rank, nproc;
/* first let MPI strip off its MPI stuff: */
MPI_Init(&argc, &argv);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &nproc);
double* a;
int n = 40;
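a = calloc(n, sizeof(double));
assert(a != NULL); /* allocate outside assert(): with -DNDEBUG the whole assert expression is compiled out */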
printf("Rank %d: Address of a = %p\n",rank,a);
free(a);
MPI_Finalize();
return 0;
}
Example output:
Rank 0: Address of a = 0x6dad300
Rank 1: Address of a = 0x67a8800
Rank 2: Address of a = 0x67a8800

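The addresses are virtual, not physical. On modern processors and operating systems, each process runs in its own virtual address space, and the OS maps those virtual addresses onto physical memory behind the scenes. Two processes can therefore report the same pointer value while using entirely different physical pages, which is what you are seeing for ranks 1 and 2. This has nothing to do with Valgrind; you would get virtual addresses without it as well. A program has no notion of (or interest in) the physical addresses backing its memory; only device drivers really need them.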

Related

C Programming Segmentation fault (core dumped) error

I am very new to programming with C but I have spent a few semesters in C++. I have a homework assignment that I just started and I ran into an issue within the first few lines of code I have written and I am not sure what is going on. It will compile fine and when I run it I am able to enter in a string but once I hit enter I get the segmentation fault (core dumped) error message. Here is my code. I just started and I will be adding a lot more to it and will also be implementing functions in my program as well but I am taking it in baby steps:
#include <stdio.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>
struct profile {
char *f_Name;
char *l_Name;
int age;
char *email;
char *password;
};
int main(void)
{
struct profile userOne; //creates a variable
printf("Please enter your first name: \n");
fgets(userOne.f_Name, sizeof(userOne.f_Name), stdin);
//takes input from user.
//I want to use fgets because my professor wants us to consider
//bufferoverflows
printf("%s\n", userOne.f_Name); //prints it to the screen
return 0;
}
You need to allocate storage (with malloc, or implicitly via strdup). Also, sizeof(userOne.f_Name) in the fgets call is wrong: it's 4 or 8, because f_Name is a pointer, not a buffer. Try this:
char buf[5000];
fgets(buf, sizeof(buf), stdin);
userOne.f_Name = strdup(buf);
You just declared a pointer variable without allocating any memory for it to point to. Call malloc first to allocate memory, and then read the value from stdin.
userOne.f_Name = malloc(n * sizeof(char));
where n is the maximum number of characters you want to store, plus one for the terminating '\0'. Pass that same n to fgets instead of sizeof(userOne.f_Name).
http://www.tutorialspoint.com/c_standard_library/c_function_malloc.htm
The following link has info on Segmentation fault
What is a segmentation fault?

Is there a way to find the file names of files mapped to the virtual memory area of a process in the linux kernel?

Been working on a project for a few weeks now and I've hit a pretty significant roadblock and I was hoping somebody here might be able to offer some guidance.
All I need to do is write a system call that reports statistics of a process’s virtual address space when called. Those statistics, according to the assignment criteria, need to include the size of the process’s virtual address space, each virtual memory area’s access permissions, and the names of files mapped to these virtual memory areas.
The first two I have working, the last appears to not be possible, at least from what my research and attempts so far have turned up. I've isolated it down to accessing the vm_file struct within the vm_area_struct of the process and using that to get to the f_path, but past that I'm still stuck on how to get from there to a format that can actually be put into a printk statement, and everything I've tried hasn't output anything when I finally get the kernel recompiled.
Here's where the code sits at the moment. Am I even on the right track?
#include <linux/kernel.h>
#include <linux/sched.h>
#include <linux/mm_types.h>
#include <linux/module.h>
#include <linux/fs.h>
#include <linux/path.h>
#include <linux/dcache.h>
asmlinkage int sys_project3a1(unsigned int processID)
{
struct task_struct *task;
for_each_process(task)
{
if (task->pid == processID)
{
unsigned long virtualAddressSpace = 0;
struct vm_area_struct *vmlist;
printk("Process ID: %d", task->pid);
for (vmlist = task->mm->mmap; vmlist!=NULL; vmlist=vmlist->vm_next)
{
unsigned long space = vmlist->vm_end - vmlist->vm_start;
char *tmp;
char *pathname;
struct file *file;
struct path *path;
printk("Process Access Permissions: %lu", (unsigned long)(vmlist->vm_page_prot.pgprot));
file = vmlist->vm_file;
path = &file->f_path;
path_get(path);
tmp = (char *)__get_free_page(GFP_TEMPORARY);
pathname = d_path(path, tmp, PAGE_SIZE);
printk("Path Name: %s", pathname);
free_page((unsigned long)tmp);
virtualAddressSpace += space;
}
printk("Process Virtual Address Space: %lu", virtualAddressSpace);
}
}
return 1;
}
I was in a similar situation and had to find this on my own. Hope others who come here benefit from my answer.
So I am assuming you want the values corresponding to the "Mapping" column of the pmap command's output. The following works for me (tried on v4.6.4):
char filename[50];
if(vmlist->vm_file){
strcpy(filename, vmlist->vm_file->f_path.dentry->d_iname);
}
else{ // implies an anonymous mapping, i.e. not file-backed
strcpy(filename, "[ anon ]");
}
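After getting to the path, follow its dentry field and then d_iname, which holds the mapped file's name. It isn't pretty, but it does the job. Note that d_iname is a fixed-size inline buffer that only holds short names; d_name.name is valid for names of any length.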
struct vm_area_struct *vas;
vas->vm_file->f_path.dentry->d_name.name

malloc large memory never returns NULL

When I run this, it keeps allocating memory without any problem, with cnt going well into the thousands. I don't understand why. Shouldn't I get a NULL at some point? Thanks!
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>
int main(void)
{
long C = pow(10, 9);
int cnt = 0;
int conversion = 8 * 1024 * 1024;
int *p;
while (1)
{
p = (int *)malloc(C * sizeof(int));
if (p != NULL)
cnt++;
else break;
if (cnt % 10 == 0)
printf("number of successful malloc is %d with %ld Mb\n", cnt, cnt * C / conversion);
}
return 0;
}
Are you running this on Linux? Linux has a highly surprising feature known as overcommit. It doesn't actually allocate physical memory when you call malloc(), but rather when you first use that memory. With the default settings, malloc() will happily hand out far more memory than the machine has, and it rarely returns a NULL pointer.
It's only when you actually access the memory that Linux takes you seriously and goes out searching for free memory to give you. Of course there may not actually be enough memory to meet the promise it gave your program. You say, "Give me 8GB," and malloc() says, "Sure." Then you try to write to your pointer and Linux says, "Oops! I lied. How 'bout I just kill off processes (probably yours) until I free up enough memory?"
You're allocating virtual memory. On a 64-bit OS, virtual memory is available in almost unlimited supply.

Is there any API for determining the physical address from virtual address in Linux?

Is there any API for determining the physical address from virtual address in Linux operating system?
Kernel and user space work with virtual addresses (also called linear addresses) that are mapped to physical addresses by the memory management hardware. This mapping is defined by page tables, set up by the operating system.
DMA devices use bus addresses. On an i386 PC, bus addresses are the same as physical addresses, but other architectures may have special address mapping hardware to convert bus addresses to physical addresses.
In Linux, you can use these functions from asm/io.h:
virt_to_phys(virt_addr);
phys_to_virt(phys_addr);
virt_to_bus(virt_addr);
bus_to_virt(bus_addr);
All this is about accessing ordinary memory. There is also "shared memory" on the PCI or ISA bus. It can be mapped inside a 32-bit address space using ioremap(), and then used via the readb(), writeb() (etc.) functions.
Life is complicated by the fact that there are various caches around, so that different ways to access the same physical address need not give the same result.
Also, the real physical address behind virtual address can change. Even more than that - there could be no address associated with a virtual address until you access that memory.
As for the user-land API, there are none that I am aware of.
/proc/<pid>/pagemap userland minimal runnable example
virt_to_phys_user.c
#define _XOPEN_SOURCE 700
#include <fcntl.h> /* open */
#include <stdint.h> /* uint64_t */
#include <stdio.h> /* printf */
#include <stdlib.h> /* size_t */
#include <unistd.h> /* pread, sysconf */
typedef struct {
uint64_t pfn : 55;
unsigned int soft_dirty : 1;
unsigned int file_page : 1;
unsigned int swapped : 1;
unsigned int present : 1;
} PagemapEntry;
/* Parse the pagemap entry for the given virtual address.
*
* @param[out] entry the parsed entry
* @param[in] pagemap_fd file descriptor to an open /proc/pid/pagemap file
* @param[in] vaddr virtual address to get entry for
* @return 0 for success, 1 for failure
*/
int pagemap_get_entry(PagemapEntry *entry, int pagemap_fd, uintptr_t vaddr)
{
size_t nread;
ssize_t ret;
uint64_t data;
uintptr_t vpn;
vpn = vaddr / sysconf(_SC_PAGE_SIZE);
nread = 0;
while (nread < sizeof(data)) {
ret = pread(pagemap_fd, ((uint8_t*)&data) + nread, sizeof(data) - nread,
vpn * sizeof(data) + nread);
if (ret <= 0) {
/* check before accumulating: adding -1 to the unsigned nread would wrap */
return 1;
}
nread += ret;
}
entry->pfn = data & (((uint64_t)1 << 55) - 1);
entry->soft_dirty = (data >> 55) & 1;
entry->file_page = (data >> 61) & 1;
entry->swapped = (data >> 62) & 1;
entry->present = (data >> 63) & 1;
return 0;
}
/* Convert the given virtual address to physical using /proc/PID/pagemap.
*
* @param[out] paddr physical address
* @param[in] pid process to convert for
* @param[in] vaddr virtual address to get entry for
* @return 0 for success, 1 for failure
*/
int virt_to_phys_user(uintptr_t *paddr, pid_t pid, uintptr_t vaddr)
{
char pagemap_file[BUFSIZ];
int pagemap_fd;
snprintf(pagemap_file, sizeof(pagemap_file), "/proc/%ju/pagemap", (uintmax_t)pid);
pagemap_fd = open(pagemap_file, O_RDONLY);
if (pagemap_fd < 0) {
return 1;
}
PagemapEntry entry;
if (pagemap_get_entry(&entry, pagemap_fd, vaddr)) {
close(pagemap_fd); /* don't leak the descriptor on failure */
return 1;
}
close(pagemap_fd);
*paddr = (entry.pfn * sysconf(_SC_PAGE_SIZE)) + (vaddr % sysconf(_SC_PAGE_SIZE));
return 0;
}
int main(int argc, char **argv)
{
pid_t pid;
uintptr_t vaddr, paddr = 0;
if (argc < 3) {
printf("Usage: %s pid vaddr\n", argv[0]);
return EXIT_FAILURE;
}
pid = strtoull(argv[1], NULL, 0);
vaddr = strtoull(argv[2], NULL, 0);
if (virt_to_phys_user(&paddr, pid, vaddr)) {
fprintf(stderr, "error: virt_to_phys_user\n");
return EXIT_FAILURE;
}
printf("0x%jx\n", (uintmax_t)paddr);
return EXIT_SUCCESS;
}
GitHub upstream.
Usage:
sudo ./virt_to_phys_user.out <pid> <virtual-address>
sudo is required to read /proc/<pid>/pagemap even if you have file permissions as explained at: https://unix.stackexchange.com/questions/345915/how-to-change-permission-of-proc-self-pagemap-file/383838#383838
As mentioned at: https://stackoverflow.com/a/46247716/895245 Linux allocates page tables lazily, so make sure that you read and write a byte to that address from the test program before using virt_to_phys_user.
How to test it out
Test program:
#define _XOPEN_SOURCE 700
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
enum { I0 = 0x12345678 };
static volatile uint32_t i = I0;
int main(void) {
printf("vaddr %p\n", (void *)&i);
printf("pid %ju\n", (uintmax_t)getpid());
while (i == I0) {
sleep(1);
}
printf("i %jx\n", (uintmax_t)i);
return EXIT_SUCCESS;
}
The test program outputs the address of a variable it owns, and its PID, e.g.:
vaddr 0x600800
pid 110
and then you can convert the virtual address with:
sudo ./virt_to_phys_user.out 110 0x600800
Finally, the conversion can be tested by using /dev/mem to observe / modify the memory, but you can't do this on Ubuntu 17.04 without recompiling the kernel as it requires: CONFIG_STRICT_DEVMEM=n, see also: How to access physical addresses from user space in Linux? Buildroot is an easy way to overcome that however.
Alternatively, you can run in a virtual machine and inspect memory with the QEMU monitor's xp command: How to decode /proc/pid/pagemap entries in Linux?
See this to dump all pages: How to decode /proc/pid/pagemap entries in Linux?
Userland subset of this question: How to find the physical address of a variable from user-space in Linux?
Dump all process pages with /proc/<pid>/maps
/proc/<pid>/maps lists all the addresses ranges of the process, so we can walk that to translate all pages: /proc/[pid]/pagemaps and /proc/[pid]/maps | linux
Kerneland virt_to_phys() only works for kmalloc() addresses
From a kernel module, virt_to_phys() has been mentioned.
However, it is important to highlight that it has this limitation.
E.g. it fails for module variables. arch/x86/include/asm/io.h documentation:
The returned physical address is the physical (CPU) mapping for
the memory address given. It is only valid to use this function on
addresses directly mapped or allocated via kmalloc().
Here is a kernel module that illustrates that together with an userland test.
So this is not a very general possibility. See: How to get the physical address from the logical one in a Linux kernel module? for kernel module methods exclusively.
As answered before, normal programs should not need to worry about physical addresses, as they run in a virtual address space with all its conveniences. Furthermore, not every virtual address has a physical address; it may belong to a mapped file or a swapped-out page. However, sometimes it may be interesting to see this mapping, even in userland.
For this purpose, the Linux kernel exposes its mapping to userland through a set of files in the /proc. The documentation can be found here. Short summary:
/proc/$pid/maps provides a list of mappings of virtual addresses together with additional information, such as the corresponding file for mapped files.
/proc/$pid/pagemap provides more information about each mapped page, including the physical address if it exists.
This website provides a C program that dumps the mappings of all running processes using this interface and an explanation of what it does.
The suggested C program above usually works, but it can return misleading results in (at least) two ways:
The page is not present (but the virtual address is mapped to a page!). This happens due to lazy mapping by the OS: it maps addresses only when they are actually accessed.
The returned PFN points to some possibly temporary physical page which could be changed soon after due to copy-on-write. For example: for memory mapped files, the PFN can point to the read-only copy. For anonymous mappings, the PFN of all pages in the mapping could be one specific read-only page full of 0s (from which all anonymous pages spawn when written to).
Bottom line is, to ensure a more reliable result: for read-only mappings, read from every page at least once before querying its PFN. For write-enabled pages, write into every page at least once before querying its PFN.
Of course, theoretically, even after obtaining a "stable" PFN, the mappings could always change arbitrarily at runtime (for example when moving pages into and out of swap) and should not be relied upon.
I wonder why there is no user-land API.
Because user land memory's physical address is unknown.
Linux uses demand paging for user land memory. Your user land object will not have physical memory until it is accessed. When the system is short of memory, your user land object may be swapped out and lose physical memory unless the page is locked for the process. When you access the object again, it is swapped in and given physical memory, but it is likely different physical memory from the previous one. You may take a snapshot of page mapping, but it is not guaranteed to be the same in the next moment.
So, looking for the physical address of a user land object is usually meaningless.

Getting stack traces on Unix systems, automatically

What methods are there for automatically getting a stack trace on Unix systems? I don't mean just getting a core file or attaching interactively with GDB, but having a SIGSEGV handler that dumps a backtrace to a text file.
Bonus points for the following optional features:
Extra information gathering at crash time (eg. config files).
Email a crash info bundle to the developers.
Ability to add this in a dlopened shared library
Not requiring a GUI
FYI,
the suggested solution (using backtrace_symbols in a signal handler) is dangerously broken. DO NOT USE IT.
Yes, backtrace and backtrace_symbols will produce a backtrace and translate it to symbolic names, however:
backtrace_symbols allocates memory using malloc and you use free to free it. If you're crashing because of memory corruption, your malloc arena is very likely to be corrupt and cause a double fault.
malloc and free protect the malloc arena with a lock internally. You might have faulted in the middle of a malloc/free with the lock taken, which will cause these functions, or anything that calls them, to deadlock.
You use puts, which uses the standard stream, which is also protected by a lock. If you faulted in the middle of a printf, you once again have a deadlock.
On 32-bit platforms (e.g. your normal PC of 2 years ago), the kernel will plant a return address to an internal glibc function instead of your faulting function in your stack, so the single most important piece of information you are interested in, namely in which function the program faulted, will actually be corrupted on those platforms.
So, the code in the example is the worst kind of wrong - it LOOKS like it's working, but it will really fail you in unexpected ways in production.
BTW, interested in doing it right? check this out.
Cheers,
Gilad.
If you are on a system with the BSD backtrace functionality available (Linux, OS X 10.5, BSD of course), you can do this programmatically in your signal handler.
For example (backtrace code derived from IBM example):
#include <execinfo.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
void sig_handler(int sig)
{
void * array[25];
int nSize = backtrace(array, 25);
char ** symbols = backtrace_symbols(array, nSize);
for (int i = 0; i < nSize; i++)
{
puts(symbols[i]);
}
free(symbols);
signal(sig, &sig_handler);
}
void h()
{
raise(SIGSEGV); /* kill(0, ...) would signal every process in the process group */
}
void g()
{
h();
}
void f()
{
g();
}
int main(int argc, char ** argv)
{
signal(SIGSEGV, &sig_handler);
f();
}
Output:
0 a.out 0x00001f2d sig_handler + 35
1 libSystem.B.dylib 0x95f8f09b _sigtramp + 43
2 ??? 0xffffffff 0x0 + 4294967295
3 a.out 0x00001fb1 h + 26
4 a.out 0x00001fbe g + 11
5 a.out 0x00001fcb f + 11
6 a.out 0x00001ff5 main + 40
7 a.out 0x00001ede start + 54
This doesn't get bonus points for the optional features (except not requiring a GUI), however, it does have the advantage of being very simple, and not requiring any additional libraries or programs.
Here is an example of how to get some more info using a demangler. As you can see this one also logs the stacktrace to file.
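#include <iostream>
#include <sstream>
#include <string>
#include <fstream>
#include <cstdlib> // free
#include <csignal> // signal
#include <cxxabi.h> // abi::__cxa_demangle
#include <execinfo.h> // backtrace, backtrace_symbols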
void sig_handler(int sig)
{
std::stringstream stream;
void * array[25];
int nSize = backtrace(array, 25);
char ** symbols = backtrace_symbols(array, nSize);
for (int i = 0; i < nSize; i++) {
int status;
char *realname;
std::string current = symbols[i];
size_t start = current.find("(");
size_t end = current.find("+");
realname = NULL;
if (start != std::string::npos && end != std::string::npos) {
std::string symbol = current.substr(start+1, end-start-1);
realname = abi::__cxa_demangle(symbol.c_str(), 0, 0, &status);
}
if (realname != NULL)
stream << realname << std::endl;
else
stream << symbols[i] << std::endl;
free(realname);
}
free(symbols);
std::cerr << stream.str();
std::ofstream file("/tmp/error.log");
if (file.is_open()) {
if (file.good())
file << stream.str();
file.close();
}
signal(sig, &sig_handler);
}
Derek's solution is probably the best, but here's an alternative anyway:
Recent Linux kernel versions allow you to pipe core dumps to a script or program. You could write a script to catch the core dump, collect any extra information you need and mail everything back.
This is a global setting though, so it'd apply to any crashing program on the system. It will also require root rights to set up.
It can be configured through the /proc/sys/kernel/core_pattern file. Set that to something like ' | /home/myuser/bin/my-core-handler-script'.
The Ubuntu people use this feature as well.
