Back in the good old DOS 5.0 days I used a resident program for finding (and modifying) the memory location of a program variable, typically lives or ammunition in games (yes, cheating). It took memory snapshots to disk and diffed them; you could also narrow the search with greater-than/less-than comparisons. Once the location was found, it could freeze the value, and so on.
If it's possible, how can I do something similar on current 64-bit Linux? Is there such a tool? I was trying radare2 to track the calls, but the binary is stripped and I'm getting lost.
Thanks.
The memory of a Linux process can be examined and modified through the /proc/<PID>/mem pseudo file. You can discover where the different parts of the process live by reading /proc/<PID>/maps and other similar files.
The problem you'll face is that many things have changed since the good old DOS times. In those times, with just a few tens of kilobytes for your program, global variables were the norm, and those are easy to find.
But now, with hundreds of megabytes, most programs will use dynamic memory, complex hierarchies of classes, virtual functions... and that will make your cheats a lot harder.
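Still, the old snapshot-and-search approach maps quite directly onto /proc. Below is a minimal sketch in C of the first step: a scan of all writable regions for a known 32-bit value. The program layout, the int32 assumption, and the minimal error handling are mine, not from any particular tool:

    /* scan: find a known 32-bit value in another process's writable memory.
     * Usage: ./scan <pid> <value>
     * May require root, or /proc/sys/kernel/yama/ptrace_scope set to 0. */
    #include <fcntl.h>
    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        if (argc != 3) {
            fprintf(stderr, "usage: %s <pid> <value>\n", argv[0]);
            return 1;
        }
        int32_t needle = atoi(argv[2]);

        char path[64];
        snprintf(path, sizeof path, "/proc/%s/maps", argv[1]);
        FILE *maps = fopen(path, "r");
        snprintf(path, sizeof path, "/proc/%s/mem", argv[1]);
        int mem = open(path, O_RDONLY);
        if (!maps || mem < 0) { perror("open"); return 1; }

        char line[512];
        while (fgets(line, sizeof line, maps)) {
            uint64_t start, end;
            char perms[8];
            if (sscanf(line, "%" SCNx64 "-%" SCNx64 " %7s", &start, &end, perms) != 3)
                continue;
            if (perms[1] != 'w')          /* variables live in writable regions */
                continue;
            size_t len = end - start;
            char *buf = malloc(len);
            if (!buf || pread(mem, buf, len, start) != (ssize_t)len) {
                free(buf);                /* some regions (e.g. [vvar]) refuse reads */
                continue;
            }
            for (size_t off = 0; off + sizeof needle <= len; off++)
                if (memcmp(buf + off, &needle, sizeof needle) == 0)
                    printf("match at 0x%" PRIx64 "\n", start + off);
            free(buf);
        }
        return 0;
    }

Run it, change the value in the game (lose a life, fire a shot), and run it again: intersecting the two match lists narrows the candidates, just like the old DOS diffing trick. To patch a location, open /proc/<PID>/mem read-write and pwrite() the new value at the matched address.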
Also, the pmap command will show you the memory regions (start addresses and sizes) used by the process.
For example, if I had a process with id 123:
pmap 123
I observe that each ffmpeg instance doing audio decoding takes about 50 MB of memory. If I record 100 stations, that's 5 GB of RAM.
Now, they all more or less use the same amount of RAM; I suspect they contain the same information over and over again because they are spawned as new processes rather than forked.
Is there a way to avoid this duplication?
I am using Ubuntu 20.04, x64
Now, they all more or less use the same amount of RAM, I suspect they contain the same information over and over again because they are spawned as new processes rather than forked.
Have you considered that the processes may use about the same amount of RAM because they are performing roughly the same computation, with similar parameters?
Have you considered that whatever means you are using to compute memory usage may be insensitive to whether the memory used is attributed uniquely to the process vs. already being shared with other processes?
Is there a way to avoid this duplication?
Programs that rely on shared libraries already share those libraries' executable code among them, saving memory.
Of course, each program does need its own copy of any writable data belonging to the library, some of which may turn out to be unused by a particular client program, and programs typically have memory requirements separate from those of any libraries they use, too. Whatever amount of that 50 MB per process is in fact additive across processes is going to be from these sources. Possibly you could reduce the memory load from these by changing program parameters (or by changing programs), but there's no special way to run the same number of instances of the program you're running now, with the same options and inputs, to reduce the amount of memory they use.
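Incidentally, if you want to see how much of that 50 MB is genuinely private to each instance, the kernel's Pss ("proportional set size") figure already does the fair accounting: each shared page is divided among the processes sharing it. A small sketch in C that prints Rss versus Pss from /proc/<PID>/smaps_rollup (available since Linux 4.14; the output format is just for illustration):

    /* Print Rss and Pss for a given PID from /proc/<pid>/smaps_rollup. */
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        if (argc != 2) { fprintf(stderr, "usage: %s <pid>\n", argv[0]); return 1; }
        char path[64], line[256];
        long kb;
        snprintf(path, sizeof path, "/proc/%s/smaps_rollup", argv[1]);
        FILE *f = fopen(path, "r");
        if (!f) { perror("fopen"); return 1; }
        while (fgets(line, sizeof line, f)) {
            if (sscanf(line, "Rss: %ld kB", &kb) == 1)
                printf("Rss: %ld kB\n", kb);      /* counts shared pages fully */
            else if (sscanf(line, "Pss: %ld kB", &kb) == 1)
                printf("Pss: %ld kB\n", kb);      /* shared pages split fairly */
        }
        fclose(f);
        return 0;
    }

Summing Pss over all hundred ffmpeg processes gives a truthful aggregate; summing Rss counts every shared page once per process.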
I have a process that reads thousands of small files ONE TIME. The cached data is not needed after this. The process proceeds at full speed until most memory is consumed by the file cache and then it slows down. I don't understand the slowdown, since freeing cache memory and allocating space for the next file should be a matter of microseconds. Hard page faults also increase when this threshold is reached. The OS is vanilla Ubuntu 16.04.
I would like to limit the file caching for this process only.
This is a user process, so using a privileged shell command to purge the cache is not a solution. Using fadvise on a per-file level is not a solution, since the files are being read by multiple library routines depending on the file type.
What I need is a process-level option: do not cache, or set a low size limit like 100 MB. I have searched for this and found nothing. Is this really the case? Seems like something big that is missing.
Any insight on the apparent memory management performance issue?
Here's the strict answer to your question. If you are mmap-ing your files, the way to do this is using madvise() and MADV_DONTNEED:
    MADV_DONTNEED
           Do not expect access in the near future. (For the time being, the
           application is finished with the given range, so the kernel can
           free resources associated with it.) Subsequent accesses of pages
           in this range will succeed, but will result either in reloading of
           the memory contents from the underlying mapped file (see mmap(2))
           or zero-fill-on-demand pages for mappings without an underlying
           file.
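For concreteness, here is a sketch of that pattern, assuming the files are small enough to map whole; the helper name and the minimal error handling are mine:

    /* Map a file, consume it once, then hint the kernel to drop the pages. */
    #include <fcntl.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    static int process_file(const char *name)
    {
        int fd = open(name, O_RDONLY);
        if (fd < 0)
            return -1;
        struct stat st;
        if (fstat(fd, &st) < 0 || st.st_size == 0) {
            close(fd);
            return -1;
        }
        void *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        close(fd);                                /* the mapping survives the close */
        if (p == MAP_FAILED)
            return -1;

        /* ... parse the file contents through p, one time ... */

        madvise(p, st.st_size, MADV_DONTNEED);    /* finished with this range */
        munmap(p, st.st_size);
        return 0;
    }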
There is to my knowledge no way of doing it automatically for files that are simply opened, read (using read() or similar) and closed. (The read()-side counterpart is posix_fadvise() with POSIX_FADV_DONTNEED, but that is the per-file approach you have already ruled out.)
However, it sounds to me like this is not in fact the issue. Are you sure it's buffer / cache that is growing here, and not something else? (e.g. perhaps you are reading them into RAM and not freeing that RAM, or not closing them, or similar)
You can tell by doing (as root):
echo 3 > /proc/sys/vm/drop_caches
If you don't get all the memory back, then it's your program that is leaking something.
I am convinced there is no way to stop file caching on a per-process level. The program must have direct control over file I/O, with access to the file descriptors or mappings, so that posix_fadvise() or madvise() can be used. You cannot do this if library functions are doing all the file reading and you are not willing to modify them. This does look like a design gap that should be filled.
HOWEVER: my assertion of a memory management performance issue was wrong. The reason for the process slowing down as the file cache grew and free memory shrank was something else: disk seek distances were growing over the course of the run. Other tests have verified that allocating memory does not significantly slow down as the file cache grows and free memory shrinks.
I have been studying OS concepts and decided to look into how this stuff is actually implemented in Linux. But I am having problems understanding some things related to memory management during the boot process, before the page allocator is turned on; more precisely, how bootmem works. I do not need its exact workings, just an understanding of how some things are (or can be) solved.
So obviously, bootmem cannot use dynamic memory, meaning that its size must be known before runtime so that appropriate steps can be taken; i.e., the maximum size of its bitmap must be known in advance. From what I understand, this is most likely solved by simply mapping enough memory during kernel initialization; if the architecture changes, you simply change the size of the mapped memory. Obviously there is probably a lot more going on, but I guess I got the general idea? What really makes no sense to me, however, is the NUMA case. Everywhere I read, it says that a pg_data_t is created for each memory node. This pg_data is put into a list (how can it know the size of the list? Or is the size fixed for a specific arch?), and for each node a bitmap is allocated. So, basically, it sounds like it can create an undefined number of these pg_data structures, each of which has a memory bitmap of arbitrary size. How? What am I missing?
EDIT: Sorry for not including a reference. Here is the bootmem code; it can also be found in mm/bootmem.c: http://lxr.free-electrons.com/source/mm/bootmem.c
It is architecture-dependent. On the x86 architecture, early in the boot process the kernel issues one BIOS call: the 0xE820 function of the interrupt at vector 0x15. This returns a memory map that the kernel can use to build its memory tables, including holes for non-memory (PCI or ISA) devices, etc. Bootloaders (before the kernel) do the same.
See: Detecting Memory
After looking into this more, I think it works this way: all the necessary things are statically allocated. Using preprocessor defines, it is ensured that certain sections of the bootmem code (as well as other parts of the kernel) either exist or do not exist in the compiled code for a specific architecture, even though the code itself is architecture-independent. These defines are specified in the architecture-dependent sources found under arch/ (e.g. arch/i386/, arch/arm/, etc.). For NUMA architectures there is a define called MAX_NUMNODES, ensuring that the list of structs (more specifically, of pg_data_t structures) representing nodes is allocated as a static array (which is then treated as a list). The bitmaps representing the memory map are relatively small, since each page is represented by only one bit, taking up KBs or maybe MBs. Whatever the case, the architecture-dependent head.S sets up all the structures needed for the system to function (like page tables) and ensures enough physical memory is mapped into virtual memory so the bitmaps fit in it without causing a page fault (in the case of x86, the initial 8 MB of RAM is mapped, which is more than enough for both the kernel and additional structures like the bitmaps).
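A simplified sketch of the idea (illustrative only, not the actual kernel source; the constant values stand in for per-architecture Kconfig settings):

    /* With the node count bounded at compile time, no dynamic allocation is
     * needed: the per-node descriptors are just a fixed-size static array. */
    #define NODES_SHIFT   6                     /* chosen per-arch at build time */
    #define MAX_NUMNODES  (1 << NODES_SHIFT)    /* worst-case number of nodes */

    struct pglist_data {
        unsigned long node_start_pfn;           /* ... plus bitmap pointer, etc. ... */
    };

    static struct pglist_data node_data[MAX_NUMNODES];   /* the "list" of nodes */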
I am writing a small process-monitoring script in Perl that reads values from the /proc file system. Right now I am able to fetch the number of threads, the process state, and the number of bytes read and written, using the /proc/[pid]/status and /proc/[pid]/io files. Now I want to calculate the memory usage of a process. After searching, I came to know that memory usage is present in /proc/[pid]/statm. But I still can't figure out which fields from that file are needed to calculate the memory usage. Can anyone help me with this? Thanks in advance.
You likely want resident or size. From kernel.org:

    size       total program size
               (this is the whole program, including stuff never swapped in)
    resident   resident set size
               (stuff in RAM at the current moment; this does not include pages swapped out)
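The values in statm are counted in pages, so multiply by the page size. A minimal sketch in C (the same two whitespace-separated fields are just as easy to split out in Perl):

    /* Report VM size and RSS of the current process from /proc/self/statm. */
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        long page_kb = sysconf(_SC_PAGESIZE) / 1024;
        long size, resident;                     /* first two statm fields */
        FILE *f = fopen("/proc/self/statm", "r");
        if (!f || fscanf(f, "%ld %ld", &size, &resident) != 2) {
            perror("statm");
            return 1;
        }
        fclose(f);
        printf("size: %ld kB, resident: %ld kB\n", size * page_kb, resident * page_kb);
        return 0;
    }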
It is extremely difficult to know what the "memory usage" of a process is. VM size and RSS are known, measurable values.
But what you probably want is something else. In practice, "VM size" seems too high and RSS often seems too low.
The main problems are:
Multiple processes can share the same pages. You can add up the RSS of all running processes, and end up with much more than the physical memory of your machine (this is before kernel data structures are counted)
Private pages belonging to the process can be swapped out. Or they might not be initialised yet. Do they count?
How exactly do you count memory-mapped file pages? Dirty ones? Clean ones? MAP_SHARED or MAP_PRIVATE ones?
So you really need to think about what counts as "memory usage".
It seems to me that logically:
Private pages which are not shared with any other processes (NB: private pages can STILL be copy-on-write!) must count even if swapped out
Shared pages should count divided by the number of processes they're shared by e.g. a page shared by two processes counts half
File-backed pages which are resident can count in the same way
File-backed non-resident pages can be ignored
If the same page is mapped more than once into the address-space of the same process, it can be ignored the 2nd and subsequent time. This means that if proc 1 has page X mapped twice, and proc 2 has page X mapped once, they are both "charged" half a page.
I don't know of any utility which does this. It seems nontrivial though, and involves (at least) reading /proc/pid/pagemap and possibly some other /proc interfaces, some of which are root-only.
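For what it's worth, here is a rough sketch of how the root-only route could look, charging each resident page 1/mapcount to the process via /proc/<pid>/pagemap and /proc/kpagecount. (This only approximates the scheme above: kpagecount counts mappings across all processes, including duplicate mappings within one process, and unprivileged reads of pagemap return zeroed frame numbers on modern kernels.)

    /* charge: sum pages of <pid>, each weighted by 1/(number of mappings). */
    #include <fcntl.h>
    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        if (argc != 2) { fprintf(stderr, "usage: %s <pid>\n", argv[0]); return 1; }

        char path[64];
        snprintf(path, sizeof path, "/proc/%s/maps", argv[1]);
        FILE *maps = fopen(path, "r");
        snprintf(path, sizeof path, "/proc/%s/pagemap", argv[1]);
        int pagemap = open(path, O_RDONLY);
        int kpagecount = open("/proc/kpagecount", O_RDONLY);
        if (!maps || pagemap < 0 || kpagecount < 0) { perror("open"); return 1; }

        long pagesize = sysconf(_SC_PAGESIZE);
        double charged = 0;                        /* fractionally charged pages */
        char line[512];
        while (fgets(line, sizeof line, maps)) {
            uint64_t start, end;
            if (sscanf(line, "%" SCNx64 "-%" SCNx64, &start, &end) != 2)
                continue;
            for (uint64_t va = start; va < end; va += pagesize) {
                uint64_t entry;
                if (pread(pagemap, &entry, 8, (va / pagesize) * 8) != 8)
                    break;
                if (!(entry >> 63))                /* bit 63: present in RAM */
                    continue;
                uint64_t pfn = entry & ((1ULL << 55) - 1);
                int64_t mapcount;
                if (pread(kpagecount, &mapcount, 8, pfn * 8) == 8 && mapcount > 0)
                    charged += 1.0 / mapcount;     /* split cost among sharers */
            }
        }
        printf("charged: %.1f MiB\n", charged * pagesize / (1024.0 * 1024.0));
        return 0;
    }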
Another (less simple, but more precise) possibility would be to parse the /proc/<pid>/maps file, or to use the pmap utility, which presents the same information. It tells you about the "virtual memory" (i.e. the address space) of the process.
I have a program that repeatedly solves large systems of linear equations using Cholesky decomposition. A characteristic of it is that I sometimes need to store the complete factorisation, which can exceed about 20 GB of memory. The factorisation happens inside a library that I call. Furthermore, the matrix and the resulting factorisation change quite frequently, and with them the memory requirements.
I am not the only person to use this compute-node. Therefore, is there a way to start the program under Linux and preallocate free memory for the process?
Something like: $: prealloc -m 25G ./program
I'll stick my neck out and say that I don't think that there is such a way under Linux. I think that the philosophy of Linux (and every other multi-tasking o/s I've used or heard of) is to provide the programmer (and the program) with the illusion that they have the whole of the computer's memory to play with and to make it very difficult indeed for a programmer to interfere with the o/s.
Instead, I think that you should plan to modify your program to grab the memory it will (or may) require when it starts up, that is, do the memory management yourself in whatever your chosen language is. How easy this might be for you, considering calls to a library, I don't know.
I've never heard of such a way. Usually it would be bad for other users on the node if one program went ahead and hogged all available memory. It's not good practice.
But opinions aside, I would probably write my program in such a way that it acts like a small environment that is able to make multiple runs of the routine in question without ending. It would allocate lots of memory on startup, then wait for user commands (through a minimal shell) and make the runs requested with the allocated memory pool. It would hold on to the pool until the user requests termination.
Of course this requires you to have an interactive session on the node, which you may not have.
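If you do go down that road, here is a minimal sketch of what grabbing the memory up front could look like, assuming an anonymous mmap() pool (the 25 GiB figure and the allocator hand-off are placeholders):

    /* Reserve and pin a large pool at startup, before other users can take it. */
    #include <stdio.h>
    #include <sys/mman.h>

    int main(void)
    {
        size_t pool_size = 25UL << 30;            /* 25 GiB, adjust to taste */
        void *pool = mmap(NULL, pool_size, PROT_READ | PROT_WRITE,
                          MAP_PRIVATE | MAP_ANONYMOUS | MAP_POPULATE, -1, 0);
        if (pool == MAP_FAILED) { perror("mmap"); return 1; }
        if (mlock(pool, pool_size) != 0)
            perror("mlock");                      /* pool usable, just not pinned */

        /* ... hand `pool` to a custom allocator and serve the solver runs ... */

        munmap(pool, pool_size);
        return 0;
    }

MAP_POPULATE pre-faults the pages so the memory is really yours from the start, and mlock() keeps it from being swapped out, though that needs a sufficient RLIMIT_MEMLOCK or CAP_IPC_LOCK.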