Limiting RAM usage during performance tests

I have to run some performance tests to see how my programs work when the system runs out of RAM and starts thrashing. Ideally, I would be able to change the amount of RAM used by the system.
I have tried booting my system (running Ubuntu 10.10) in single-user mode with a limited amount of physical memory, but with the parameters I used (max_addr=300M, max_addr=314572800 or mem=300M) the system did not use my swap partition.
Is there a way to limit the amount of RAM used by the total system, while still using swap space?
The point is to measure the total running time of each program as a function of the input size. I am not trying to pinpoint performance problems; I am trying to compare algorithms, which means I need accuracy.

Write a simple C program which:
allocates a large amount of memory, and
keeps accessing the allocated memory at random, in an infinite loop, to try to keep it in main memory.
Now run this program (one or a few instances) so that you allocate enough memory to cause thrashing in the process you are testing. A minimal sketch follows.
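One possible version of such a memory hog (the 512 MB size, the file name hog.c, and the random page-touching pattern are my own illustrative choices, not from the original answer):

    /* hog.c: allocate a chunk of memory and touch random bytes
       forever, so the pages stay hot and compete for RAM with the
       program under test. */
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        size_t size = 512UL * 1024 * 1024;   /* 512 MB; adjust to taste */
        char *buf = malloc(size);
        if (buf == NULL) {
            perror("malloc");
            return 1;
        }
        for (;;) {
            /* Touch one random byte per iteration; each access may
               fault a page back into physical memory. */
            size_t i = ((size_t)rand() << 16 | (size_t)rand()) % size;
            buf[i] = (char)i;
        }
    }

Compile with something like gcc hog.c -o hog and launch as many copies as needed to push the system into swap.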

Related

Can file copying be CPU-bound?

As far as I know, the CPU is usually faster than an I/O device (like the HDD, the network, RAM, etc.), so when copying a file the operation is usually I/O-bound (right?).
If under some condition that I/O device is faster than the CPU (like in a virtual machine) is it possible to keep the CPU busy moving data (like from buffer to kernel space, from kernel space to user space)? And does it then become CPU-bound?
It depends on the program and the conditions under which it is run.
It would be highly unlikely for the speed of a program copying data to be throttled by the CPU. However, it could happen if, for example, the computer runs other CPU-intensive programs at higher priority than the program executing the copy.
The most common bottleneck would be the speed of the persistent storage medium (e.g. the hard drive).
Then, the amount of RAM available.
Then, the CPU being unavailable.
If, however, an I/O device were so fast that it outperformed the CPU, then the copy could become CPU-bound. This is a largely hypothetical case, though, since the CPU does not usually perform the copy itself, but commands other hardware to do so.
And, in real systems, the bandwidth available to I/O devices is far lower than CPU and RAM bandwidth.
If the copy is done efficiently, copying data from RAM to the HDD should not stress the CPU: data from RAM and the northbridge can be copied to the HDD via the southbridge.
If the copy is done inefficiently, of course, a program could read every single byte with the CPU and write it back out; see the sketch below.
Furthermore, as one can infer, the answer also depends on the hardware and architecture of the system.
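As a hedged illustration (the file names are placeholders, not from the original answer), here are the two extremes side by side: a byte-at-a-time copy that burns a library call per byte, and a block copy that leaves the bulk of the work to the kernel and hardware:

    /* Sketch: byte-at-a-time vs. block copy. File names are
       placeholders; error handling is abbreviated. */
    #include <stdio.h>

    int main(void)
    {
        FILE *in  = fopen("src.bin", "rb");
        FILE *out = fopen("dst.bin", "wb");
        if (!in || !out) { perror("fopen"); return 1; }

        /* Inefficient: one call (and CPU work) per byte.
           int c;
           while ((c = fgetc(in)) != EOF)
               fputc(c, out);
        */

        /* Efficient: large blocks; the CPU mostly just issues I/O. */
        char buf[1 << 16];   /* 64 KB buffer */
        size_t n;
        while ((n = fread(buf, 1, sizeof buf, in)) > 0)
            fwrite(buf, 1, n, out);

        fclose(in);
        fclose(out);
        return 0;
    }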
Wrong answer, I am afraid. At least, not always correct.
If I copy a folder with some 50,000 files (of different sizes) in Windows Explorer, Task Manager reports that the copy is mostly CPU-bound (i.e. it reports low disk usage and very high CPU usage).

Linux using swap instead of RAM with large image processing

I'm processing large images on a Linux server using the R programming language, so I expect much of the RAM to be used in the image processing and file writing.
However, the server starts using swap long before it appears to need to, slowing the processing down significantly. See the following image:
This shows I am using roughly 50% of the RAM for the image processing, about 50% appears to be reserved for disk cache (yellow), and yet 10 GB of swap is being used!
I was watching the swap being eaten up, and it didn't happen while RAM usage was any higher than shown in this image. The swap appears to be consumed while the processed data is being written to a GeoTIFF file.
My working theory is that the disk-writing process is using much of the disk cache area (the yellow region), and therefore the yellow isn't actually available to the server (as is often assumed of disk cache RAM).
Does that sound reasonable? Is there another reason for swap being used when RAM is apparently available?
I believe you may be affected by the swappiness kernel parameter:
When an application needs memory and all the RAM is fully occupied, the kernel has at its disposal two ways to free some memory: it can either reduce the disk cache in RAM by eliminating the oldest data, or it may swap some less-used portions (pages) of programs out to the swap partition on disk. It is not easy to predict which method would be more efficient. The kernel makes a choice by roughly guessing the effectiveness of the two methods at a given instant, based on the recent history of activity.
Swappiness takes a value between 0 and 100 to change the balance between swapping applications and freeing cache. At 100, the kernel will always prefer to find inactive pages and swap them out. A value of 0 gives something close to the old behavior where applications that wanted memory could shrink the cache to a tiny fraction of RAM.
If you want to force the kernel to avoid swapping whenever possible and give the RAM from device buffers and disk cache to your application, you can set swappiness to zero:
echo 0 > /proc/sys/vm/swappiness
Note that you may actually worsen performance with this setting, because your disk cache may shrink to a tiny fraction of what it is now, making disk access slower.
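To check the current value, read the same file. And if you want the setting to survive a reboot, the usual way (an assumption about your setup, not something from the original answer) is a sysctl entry:

cat /proc/sys/vm/swappiness
sysctl -w vm.swappiness=0
echo "vm.swappiness = 0" >> /etc/sysctl.conf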

How to determine the real memory usage of a single process?

How can I calculate the real memory usage of a single process? I am not talking about virtual memory, because it just keeps growing. For instance, there are proc files like smaps, where you can get the mappings of a process, but this is virtual memory, and the values in that file just keep growing for a running process. I would like to see the real memory usage of a process: if you plot the memory usage of a process over time, it should reflect allocations as well as frees, so the plot should move up and down instead of being a function that just keeps growing.
So, how could I calculate the real memory usage? I would appreciate any helpful answer.
It's actually kind of a complicated question. The two most common metrics for a program's memory usage at the OS level are virtual size and resident set size. (These show in the output of ps -u as the VSZ and RSS columns.) Roughly speaking, these tell the total memory the program has assigned to it, versus how much it is currently actively using.
Further complicating the question is that when you use malloc (or the C++ new operator) to allocate memory, it is allocated from a pool in your process, which is grown by occasionally requesting memory from the operating system. When you free memory, it goes back into this pool but is typically not returned to the OS. So as your program allocates and frees memory, you typically will not see its memory footprint go up and down. (However, if it frees a lot of memory and then doesn't allocate any more, you may eventually see its RSS go down.)
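One way to watch this from inside a process on Linux is to read the VmRSS line from /proc/self/status around an allocate/touch/free cycle. A minimal sketch (the block count and size are arbitrary choices of mine, and whether RSS actually drops after the frees depends on the allocator's pooling and trimming behavior):

    /* Sketch: print VmRSS before and after a malloc/free cycle.
       Linux-specific; block count and size are arbitrary. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    static void print_rss(const char *label)
    {
        char line[256];
        FILE *f = fopen("/proc/self/status", "r");
        if (!f) { perror("fopen"); return; }
        while (fgets(line, sizeof line, f))
            if (strncmp(line, "VmRSS:", 6) == 0)
                printf("%s %s", label, line);
        fclose(f);
    }

    int main(void)
    {
        enum { NBLOCKS = 100000, BLOCK = 1024 };
        static char *p[NBLOCKS];

        print_rss("before malloc: ");
        for (int i = 0; i < NBLOCKS; i++) {
            p[i] = malloc(BLOCK);
            if (p[i]) memset(p[i], 1, BLOCK);  /* touch so pages count */
        }
        print_rss("after touching:");
        for (int i = 0; i < NBLOCKS; i++)
            free(p[i]);            /* returns to the pool, not (necessarily) the OS */
        print_rss("after free:    ");
        return 0;
    }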

Necessity to bring program to main memory for execution?

Why is it necessary to bring a program into main memory from secondary memory for execution?
Why can't we execute a program from secondary memory?
It may not be possible currently, but could some mechanism in the future allow us to execute a program directly from secondary memory?
Almost all modern CPUs execute instructions by fetching them from an address in main memory identified by the instruction pointer register, loading the referenced memory through one or more cache levels before the portion of the CPU that executes the instruction even starts its work. Designing a CPU that could, for example, fetch instructions directly from a disk or network stream would be a rather large project, and performance would likely be pathetic. There's a reason you have a main memory that operates orders of magnitude faster than disk/network access, and caches between that and the actual execution cores that are orders of magnitude faster even than the main memory...
Usually, some parts of a program need to be accessed multiple times during its execution. Reading from secondary memory every single time we need particular data would obviously take a lot of time.
It is better to load the program into a faster memory, i.e. main memory, so that whenever a part of the program is required it can be accessed much faster. Similarly, more frequently used variables are stored in cache memory for even faster access. It's all about speed.
If we could somehow develop affordable secondary memory as fast as main memory, we could do without copying the whole program into main memory. However, we would still need some memory to store temporaries during program execution.
The term main memory is used to distinguish it from external mass storage devices such as hard drives. Another term for main memory is RAM. The computer can manipulate only data that is in main memory. So, every program you execute and every file you access must be copied from a storage device into main memory. The amount of main memory on a computer is crucial because it determines how many programs can be executed at one time and how much data can be readily available to a program.

Calculating % memory used on Linux

Linux noob question:
If I have 500MB of RAM, and 500MB of swap space, can the OS and processes then use 1GB of memory?
In other words, is the total amount of memory available to programs and the OS the total of the physical memory size and swap size?
I'm trying to figure out which SNMP counters to query, but need to understand how Linux uses virtual memory a little better first.
Thanks
Actually, it IS essentially correct, but your "virtual" memory does NOT reside beside your "physical memory" (as Matthew Scharley stated).
Your "virtual memory" is an abstraction layer covering both "physical" (as in RAM) and "swap" (as in hard-disk, which is of course as much physical as RAM is) memory.
Virtual memory is in essence an abstraction layer. Your program always addresses a "virtual" address, which your OS translates to an address in RAM or on disk (which needs to be loaded into RAM first), depending on where the data resides. So your program never has to worry about lack of memory.
Nothing is ever quite so simple anymore...
Memory pages are lazily allocated. A process can malloc() a large quantity of memory and never use it. So on your 500MB_RAM + 500MB_SWAP system, I could -- at least in theory -- allocate 2 gig of memory off the heap and things will run merrily along until I try to use too much of that memory. (At which point whatever process couldn't acquire more memory pages gets nuked. Hopefully it's my process. But not always.)
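A minimal sketch of that lazy-allocation behavior (the 2 GB figure just mirrors the example above; whether the malloc succeeds depends on the kernel's overcommit settings, e.g. vm.overcommit_memory):

    /* Sketch: overcommit. malloc of more than RAM+swap may succeed;
       pages are only claimed when touched. */
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        size_t size = 2UL * 1024 * 1024 * 1024;   /* 2 GB */
        char *p = malloc(size);
        printf("malloc(2 GB) %s\n", p ? "succeeded" : "failed");
        /* Nothing is resident yet. Touching the pages one by one
           would claim real memory and, on an overcommitted system,
           could eventually trigger the OOM killer. */
        free(p);
        return 0;
    }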
Individual processes may be limited to 4 gig as a hard address limitation on 32-bit systems. Even when you have more than 4 gig of RAM on the machine and you're using that bizarre segmented 36-bit atrocity from hell addressing scheme, individual processes are still limited to only 4 gigs. Some of that 4 gigs has to go for shared libraries and program code. So yer down to 2-3 gigs of stack+heap as an ADDRESSING limitation.
You can mmap files in, effectively giving you more memory. It basically acts as extra swap: rather than loading a program's binary data into memory and then swapping it out to the swap file, the file is simply mmapped, and pages are swapped into RAM directly from the file as needed. A sketch follows.
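A minimal sketch of reading a file through a mapping (the file name is a placeholder; error handling is abbreviated):

    /* Sketch: map a file and read it through a pointer. Pages are
       faulted in on demand; under memory pressure, clean pages can
       be dropped and re-read later, so the mapping behaves like
       extra, file-backed "swap". */
    #include <stdio.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/mman.h>
    #include <sys/stat.h>

    int main(void)
    {
        int fd = open("data.bin", O_RDONLY);
        if (fd < 0) { perror("open"); return 1; }

        struct stat st;
        if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }

        char *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }

        long sum = 0;
        for (off_t i = 0; i < st.st_size; i++)
            sum += p[i];         /* each access may fault a page in */
        printf("checksum: %ld\n", sum);

        munmap(p, st.st_size);
        close(fd);
        return 0;
    }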
You can get into some interesting stuff with sparse data and mmapped sparse files. I've seen X-windows claim enormous memory usage when in fact it was only using up a tiny bit.
BTW: "free" might help you. As might "cat /proc/meminfo" or the Vm lines in /proc/$PID/status. (Especially VmData and VmStk.) Or perhaps "ps up $PID"
Although it's mostly true, it's not entirely correct. For a particular process, the environment you run it in may limit the memory available to it. Check the output of ulimit -v as well.
Yes, this is essentially correct. The actual numbers might be (very) marginally lower, but for all intents and purposes, if you have x physical memory and y virtual memory (swap in Linux), then you have x + y memory available to the operating system and any programs running under it.
