Virtual Memory in Visual C++

I have a question regarding virtual memory. First of all, I would like to mention that I am new to the field of programming. I have read up on virtual memory.
Now I have a program which opens software that requires large amounts of memory (for example, a picture viewer). The computer concerned, however, cannot spare that much memory for this. And this is all done with Visual C++. The picture viewer is currently running in physical memory.
But once this software is distributed, it will be used on computers which can't spare that much physical memory. So my task is to research and find out how to switch this program from using physical memory to virtual memory. In the end I will probably be implementing this myself.
So my question is: how do I alter the code in such a way that I can prevent the application from using physical memory and instead switch over to virtual memory?
I'm not asking for someone to provide me with copy-paste code, of course, just a method to do so. Also, if someone could explain the logic behind it, I would appreciate it.
Loads of thanks in advance.

The operating system is in charge of deciding what should be stored in RAM and what should be paged out to disk. Under certain abnormal circumstances, it can be useful to provide advice to the OS from your app, but this is only recommended for experts. As a novice, your best bet is to trust that the OS will do the right thing.
Why do you think you need special behavior anyway? Pictures are generally small; unless your app is dealing with thousands and thousands of them, they will fit in RAM.
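If you ever do reach the point where hints help, here is a minimal sketch of the kind of advice a Win32 app can give the OS (the 64 MiB buffer is purely illustrative; none of this is needed in the normal case):

```cpp
// Minimal sketch (Win32): two ways an app can advise the OS about its
// memory instead of trying to manage paging itself. Sizes are illustrative.
#include <windows.h>

int main() {
    // Commit a large buffer.
    const SIZE_T size = 64 * 1024 * 1024; // 64 MiB
    void* buf = VirtualAlloc(nullptr, size, MEM_COMMIT | MEM_RESERVE,
                             PAGE_READWRITE);
    if (!buf) return 1;
    // ... use buf ...

    // 1) MEM_RESET: tell the OS the contents are no longer needed, so the
    //    pages can be dropped instead of being written to the pagefile.
    VirtualAlloc(buf, size, MEM_RESET, PAGE_READWRITE);

    // 2) Ask the OS to trim this process's working set; pages are faulted
    //    back in transparently when touched again.
    SetProcessWorkingSetSize(GetCurrentProcess(), (SIZE_T)-1, (SIZE_T)-1);

    VirtualFree(buf, 0, MEM_RELEASE);
    return 0;
}
```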

You can't use virtual memory without using some physical memory; there's a reason for the name swap file. The processor cannot directly operate on data in secondary storage, such as hard disks. It must first be copied into RAM.

Related

How to implement a swapfile across operating systems

There are use cases where I can't have a lot of RAM: Docker-based services often don't provide more than 512 MB/1 GB of RAM, and if I run multiple Rust-based GUI apps that each normally take 100 MB of RAM, how can I implement a swapfile / virtual RAM to exceed the allotted RAM? Also, OS-level swapfiles don't let users choose which app gets real RAM and which gets the swapfile, so that can become a problem too. I want to use the swapfile as much as possible, and not even real RAM, if possible. Users and hosting services usually provide a lot of storage (more than 10 GB normally), so it would be a good way to use the available storage too!
If a swapfile or anything like that isn't possible, I would like to know if there is any difference in speed and CPU consumption between apps that cache data in RAM and apps that cache data in a file and read it when required. If the latter is normally slow and not as efficient as swapfiles, I would like to know how the OS manages to make swapfiles so much more efficient than apps can.
An application does not control whether the memory it allocates lives in real RAM, on a swap partition, or elsewhere. You just ask for memory, and the OS is responsible for finding available memory to give you.
Besides that, note that using swap (sometimes called swapping) is extremely bad performance-wise. How much slower it is depends a lot on your hardware, but it's roughly three orders of magnitude. This is amplified further when you are interacting with a user: a program that is fetching some resources will not be too bothered if it has to wait one minute to get them instead of a few milliseconds because the system is under heavy load, but a user will generally not be that patient.
Also note that, when swapping, the OS does not choose at random which application gets the faster RAM and which ones get the swap memory. It will try to determine which application should be prioritized, by how much, etc., based on how it was configured (at least for the Linux kernel), so in reality it's the user who, in the end, decides which applications get the most RAM (ahead of time, of course: they are not prompted with a little pop-up each time the kernel has to make that decision...).
Finally, modern OSes overcommit: they allow applications to allocate more memory, in total, than physically exists, as long as each application is not fully using the memory it asked for (which is fairly common). This lets you run applications that in theory require more RAM than you actually have.
That was the OS part; now for the application part. Usually, when you write a program (whose purpose is not specifically RAM-related), you should not really worry about memory consumption (up to a point), especially in Rust. Not only is that usually handled by the OS in case you use a little too much memory, but when it's possible, most people prefer to trade a little more memory usage (even a lot more) for better CPU performance, because RAM is a lot cheaper than CPU.
There are exceptions, of course, where the memory consumption is so high that you can't afford not to pay attention. In those cases, either you let the user deal with the problem (i.e. "this application is known to consume a lot of memory because there is no other way to do this, so if you want to use it, get a lot of memory"), as video games often do; or you rethink your application to reduce memory usage in exchange for some CPU efficiency, as is done, for example, when handling graphs so huge you couldn't even store them on all the hard disks in the world (in which case your application has to be smart enough to work on small parts of the graph at a time); or, finally, you are working with a big resource that can be stored on the hard disk, so you write it to a file and access it chunk by chunk, as some database managers do.
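To illustrate that last option, here is a minimal sketch of chunk-by-chunk access to a big on-disk resource (the file name and chunk size are assumptions):

```cpp
// Minimal sketch: keep a huge resource in a file and read one fixed-size
// chunk at a time on demand. The file name and chunk size are assumptions.
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <vector>

std::vector<char> read_chunk(std::ifstream& file, std::uint64_t index,
                             std::size_t chunk_size) {
    std::vector<char> buffer(chunk_size);
    file.clear(); // reset any EOF state left by a previous short read
    file.seekg(static_cast<std::streamoff>(index * chunk_size));
    file.read(buffer.data(), static_cast<std::streamsize>(chunk_size));
    buffer.resize(static_cast<std::size_t>(file.gcount())); // last chunk may be short
    return buffer;
}

int main() {
    std::ifstream file("big_resource.bin", std::ios::binary);
    if (!file) return 1;
    // Process chunk 0 (1 MiB); earlier chunks can be dropped as you go,
    // keeping the program's RAM usage flat regardless of file size.
    std::vector<char> chunk = read_chunk(file, 0, 1 << 20);
    return 0;
}
```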

Tool to identify an app's data/code most sensitive to memory performance

Context:
-- embedded platform running Linux with some static RAM which is stated to be about 3 times faster than the rest of the RAM (dynamic). The amount of this fast memory is 512kB and the official name is eSRAM. (Details not important for this post: Galileo board; information on eSRAM and the relevant kernel API: https://communities.intel.com/servlet/JiveServlet/previewBody/22488-102-1-26046/Quark_SWDevManLx_330235_001.pdf)
-- eSRAM can be used by an application with some support from the kernel---a simple driver that allocates kernel memory on its behalf, overlays the memory with eSRAM (this is done in physical space) and mmaps it into the app's virtual memory space. This was tested and confirmed to work as expected.
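(For reference, a hedged sketch of what the userspace side of that mechanism might look like; the /dev/esram device path is a made-up assumption, and the real Quark driver interface may differ:)

```cpp
// Hedged sketch of the userspace side: map driver-exposed eSRAM into the
// process. "/dev/esram" is a hypothetical device node; the real driver
// API (see the Quark document above) may differ.
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <cstdio>

int main() {
    const size_t kEsramSize = 512 * 1024; // the 512 kB of fast static RAM
    int fd = open("/dev/esram", O_RDWR);
    if (fd < 0) { perror("open"); return 1; }

    // The driver has already overlaid this region with eSRAM in physical
    // space; mmap gives us a window onto it in our virtual address space.
    void* fast = mmap(nullptr, kEsramSize, PROT_READ | PROT_WRITE,
                      MAP_SHARED, fd, 0);
    if (fast == MAP_FAILED) { perror("mmap"); close(fd); return 1; }

    // ... place the hottest data structures in `fast` ...

    munmap(fast, kEsramSize);
    close(fd);
    return 0;
}
```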
Problem:
Identify which sections of app's data (and possibly code) to map into eSRAM to achieve optimum performance gain. A suitable analysis tool is required.
After some search I'm not sure if any existing tool is actually suited to this task. Currently my best bet is to develop a specialized Valgrind tool. But maybe there is already something in the ecosystem to start with. Any advice/information is welcome even if, for instance, a tool is kind of partially suited etc.
P.S.
Full analysis should probably take a lot of factors into account, like:
-- memory access patterns (cache performance)
-- changes over time (one could consider eSRAM paging)
...
I have taken a look at Valgrind's Cachegrind. It can collect data about data-cache reads and data-cache writes, and cg_annotate can report line-by-line counts for your program. That could be useful for finding the variables in your program that cause the most data-cache operations, and in this way identifying the data that would benefit most from moving to the fast memory: http://valgrind.org/docs/manual/cg-manual.html#cg-manual.line-by-line
You are probably interested in D cache reads (Dr) and D cache writes (Dw), or even (Dr+Dw). That way you can find the places in your code which do the most (Dr+Dw) and try to move the data they touch into your fast memory.

Is it possible to allocate a certain sector of RAM under Linux?

I have recently got some faulty RAM and, despite having already found out which regions are bad, I would like to try a much easier concept: write a program that allocates the faulty regions of RAM and never releases them. It might not work well if they get allocated before the program runs, but it'd be much easier to reboot on failure than to build a kernel with patches.
So the question is:
How to write a program that would allocate given sectors (or pages containing given sectors)
and (if possible) report if it was successful.
This will be problematic. To understand why, you have to understand the relation between physical and virtual memory.
On any modern Operating System, programs will get a very large address space for themselves, with the remainder of the address space being used for the OS itself. Other programs are simply invisible: there's no address at which they're found. How is this possible? Simple: processes use virtual addresses. A virtual address does not correspond directly to physical RAM. Instead, there's an address translation table, managed by the OS. When your process runs, the table only contains mappings for RAM that's allocated to you.
Now, that implies that the OS decides what physical RAM is allocated to your program. It can (and will) change that at runtime. For instance, swapping is implemented using the same mechanism. When swapping out, a page of RAM is written to disk, and its mapping deleted from the translation table. When you try to use the virtual address, the OS detects the missing mapping, restores the page from disk to RAM, and puts back a mapping. It's unlikely that you get back the same page of physical RAM, but the virtual address doesn't change during the whole swap-out/swap-in. So, even if you happened to allocate a page of bad memory, you couldn't keep it. Programs don't own RAM, they own a virtual address space.
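A small illustration of this separation, assuming Linux's /proc/self/pagemap interface (on kernels >= 4.0 the frame number reads as 0 unless the process has CAP_SYS_ADMIN):

```cpp
// Illustration: ask Linux which physical frame currently backs one of our
// virtual pages, via /proc/self/pagemap. This shows why you can't simply
// "allocate" a chosen physical page from userspace.
#include <fcntl.h>
#include <unistd.h>
#include <cstdint>
#include <cstdio>
#include <cstdlib>

int main() {
    long page = sysconf(_SC_PAGESIZE);
    char* p = static_cast<char*>(malloc(page));
    *p = 1; // touch the page so a physical frame is actually assigned

    int fd = open("/proc/self/pagemap", O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    // One 64-bit entry per virtual page: bit 63 = present, bits 0-54 = PFN.
    std::uint64_t entry = 0;
    off_t index = reinterpret_cast<std::uintptr_t>(p) / page * sizeof(entry);
    if (pread(fd, &entry, sizeof(entry), index) != sizeof(entry)) return 1;

    if (entry & (1ULL << 63))
        std::printf("virtual %p -> physical frame %llu\n",
                    static_cast<void*>(p),
                    (unsigned long long)(entry & ((1ULL << 55) - 1)));
    else
        std::printf("page not present (swapped out or unmapped)\n");

    close(fd);
    free(p);
    return 0;
}
```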
Now, Linux does offer some specific kernel functions that allocate memory in a slightly different way, but it seems that you want to bypass the kernel entirely. You can find a much more detailed description in http://lwn.net/images/pdf/LDD3/ch08.pdf
Check out BadRAM: it seems to do exactly what you want.
Well, it's not an answer on how to write a program, but it fixes the issue without compiling a kernel:
Use the memmap or mem kernel boot parameters:
http://gquigs.blogspot.com/2009/01/bad-memory-howto.html
I will edit this answer when I get it running and give details.
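(For the record, a hedged example of the memmap syntax that HOWTO describes; the address and size here are made up, and in a GRUB 2 config the $ typically needs escaping as \$:)

```
# Reserve 64 KiB of known-bad RAM at physical address 0x12340000
# so the kernel never uses it (illustrative values only):
memmap=64K$0x12340000
```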
The thing to do is write your own kernel module, which can allocate memory at a physical address, and make it non-swappable with mlock(2).
I've never tried it. No warranty.
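A sketch of just the mlock(2) half, for what it's worth (locking keeps pages resident, but without the kernel-module half you still can't pick which physical frames back them):

```cpp
// Sketch: lock pages into RAM so they are never swapped out. Note this
// pins *some* physical frames, not chosen ones; picking specific frames
// still needs kernel help.
#include <sys/mman.h>
#include <cstdio>
#include <cstdlib>

int main() {
    const size_t size = 16 * 4096; // 16 pages, illustrative
    void* p = nullptr;
    if (posix_memalign(&p, 4096, size) != 0) return 1;

    if (mlock(p, size) != 0) {
        perror("mlock"); // commonly fails if RLIMIT_MEMLOCK is too low
        return 1;
    }

    // ... the pages backing p now stay resident until munlock()/exit ...

    munlock(p, size);
    free(p);
    return 0;
}
```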

Dynamic memory management under Linux

I know that under Windows, there are API functions like GlobalAlloc(), which allocate memory and return a handle; this handle can then be locked to obtain a pointer, and unlocked again. While unlocked, the system can move the piece of memory around when it runs low on space, optimizing memory usage.
My question is that is there something similar under Linux, and if not, how does Linux optimize its memory usage?
Those Windows functions come from a time when all programs ran in the same address space in real mode. Linux, and modern versions of Windows, run programs in separate address spaces, so the OS can move them about in RAM by remapping which physical address a particular virtual address resolves to in the page tables. There's no need to burden the programmer with such low-level details.
Even on Windows, it's no longer necessary to use such functions except when interacting with a small number of old APIs. I believe Raymond Chen's blog and book have some discussions of the topic if you are interested in more detail; e.g. here's part 4 of a series on the history of GlobalLock.
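For illustration, here is a minimal sketch of the legacy pattern next to the modern one (sizes are purely illustrative):

```cpp
// Illustration: the legacy moveable-memory pattern next to the modern way.
// Today GlobalAlloc is only needed for a few old APIs (e.g. the clipboard).
#include <windows.h>
#include <cstdlib>

int main() {
    // Legacy pattern: allocate a moveable block, lock it to get a pointer.
    HGLOBAL h = GlobalAlloc(GMEM_MOVEABLE, 4096);
    if (!h) return 1;
    void* p = GlobalLock(h); // pins the block and yields a usable pointer
    // ... use p ...
    GlobalUnlock(h);         // historically, the block could now be moved
    GlobalFree(h);

    // Modern equivalent: the OS remaps pages behind your back, so a plain
    // allocation is all that's needed.
    void* q = std::malloc(4096);
    // ... use q ...
    std::free(q);
    return 0;
}
```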
Not sure what the Linux equivalent is, but in AT&T UNIX there are "scatter/gather" memory-management functions in the memory manager of the core OS. In a virtual memory operating environment there are no absolute addresses, so applications don't have an equivalent function. The executable object loader (which loads an executable file into memory, where it becomes a process) uses memory addressing from the memory manager, all of which is kept track of in virtual memory blocks maintained in its page table (which contains the physical memory addresses). The bottom line is that your application's physical memory layout is likely never linear or directly accessible.

Store more than 3GB of video frames in memory on a 32-bit OS

At work we have an application to play 2K (2048*1556px) OpenEXR film sequences. It works well... apart from when sequences are over 3GB (quite common); then it has to unload old frames from memory, despite the fact that all machines have 8-16GB of memory (which is addressable via the Linux BIGMEM stuff).
The frames have to be cached in memory to play back in real time. The OS is a several-year-old 32-bit Fedora distro (upgrading to 64-bit is not possible for the foreseeable future). The per-process limit is 3GB.
Basically, is it possible to cache more than 3GB of data in memory, somehow? My initial idea was to spread the data between multiple processes, but I've no idea if this is possible...
One possibility may be to use mmap. You would map/unmap different parts of your data into the same virtual memory region. You could only have one set mapped at a time, but as long as there was enough physical memory, the data should stay resident.
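A hedged sketch of that windowing idea; the cache file name and the 256 MiB window are assumptions:

```cpp
// Hedged sketch: map different slices of one large frame file into the
// same-sized virtual window, one slice at a time. Names/sizes illustrative.
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
#include <algorithm>
#include <cstdio>

int main() {
    int fd = open("frames.exr.cache", O_RDONLY); // hypothetical cache file
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) != 0) { perror("fstat"); close(fd); return 1; }

    const off_t window = 256LL * 1024 * 1024; // 256 MiB view at a time
    for (off_t offset = 0; offset < st.st_size; offset += window) {
        size_t len = std::min<off_t>(window, st.st_size - offset);
        void* view = mmap(nullptr, len, PROT_READ, MAP_SHARED, fd, offset);
        if (view == MAP_FAILED) { perror("mmap"); break; }

        // ... decode/play the frames visible in this window ...

        munmap(view, len); // releases virtual address space only; the pages
                           // may stay cached in physical RAM by the kernel
    }
    close(fd);
    return 0;
}
```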
How about creating a RAM drive and loading the file into that ... assuming the RAM drive supports the BIGMEM stuff for you.
You could use multiple processes: each process loads a view of the file as a shared memory segment, and the player process then maps the segments in turn as needed.
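Roughly, and with an illustrative segment name and size, the loader side might look like this (the player would shm_open the same name and map it read-only):

```cpp
// Hedged sketch of the loader side: publish one 512 MiB chunk of decoded
// frames as a POSIX shared memory segment. Name and size are illustrative.
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <cstdio>

int main() {
    const size_t chunk = 512u * 1024 * 1024;
    int fd = shm_open("/frames_chunk_0", O_CREAT | O_RDWR, 0600);
    if (fd < 0) { perror("shm_open"); return 1; }
    if (ftruncate(fd, chunk) != 0) { perror("ftruncate"); return 1; }

    void* mem = mmap(nullptr, chunk, PROT_READ | PROT_WRITE,
                     MAP_SHARED, fd, 0);
    if (mem == MAP_FAILED) { perror("mmap"); return 1; }

    // ... decode frames into `mem`; the player process shm_open()s the
    //     same name, maps it, plays it, then unmaps and moves on to
    //     "/frames_chunk_1", staying under the 3GB per-process cap ...

    munmap(mem, chunk);
    close(fd);
    return 0;
}
```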
My, what an interesting problem :)
(EDIT: Oh, I just read Rob's RAM drive post... I got all excited by the problem... but I have a bit more to suggest, so I won't delete)
Would it be possible to...
set up a multi-gigabyte RAM disk, and then
modify the program to do all its reading from the "disk"?
I'd guess the RAM disk part is where all the problems would be, since the size of the RAM disk would be OS- and file-system-dependent. You might have to create multiple RAM disks and have your code jump between them. Or maybe you could set up a RAID-0 stripe set over multiple RAM disks. Or, if there are still OS limitations and you can afford to drop a couple grand ($4k?), set up a hardware RAID-0 stripe set with some of those new blazing-fast solid-state drives. Or...
Fun, fun, fun.
Be sure to follow up!
I assume you can modify the application. If so, the easiest thing would be to start the application several times (once for each 3GB chunk of video), have each one hold a chunk of video, and use another program to synchronize them so they each take control of the framebuffer (or other video output) in turn.
The synchronization is going to be a little messy, perhaps, but it can be simplified if each app has its own framebuffer and the sync program points the video controller to the correct framebuffer in between frames when switching to the next app.
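A hedged sketch of one way to do that hand-off with named POSIX semaphores (the names are illustrative; a small controller would post /player0 to start the chain):

```cpp
// Hedged sketch of the hand-off: instance 0 waits on /player0 and posts
// /player1, instance 1 waits on /player1 and posts /player2, and so on.
#include <fcntl.h>
#include <semaphore.h>
#include <cstdio>

int main() {
    sem_t* mine = sem_open("/player0", O_CREAT, 0600, 0);
    sem_t* next = sem_open("/player1", O_CREAT, 0600, 0);
    if (mine == SEM_FAILED || next == SEM_FAILED) {
        perror("sem_open");
        return 1;
    }

    sem_wait(mine);  // block until the previous instance hands over
    // ... take the framebuffer and play this instance's 3GB chunk ...
    sem_post(next);  // pass control to the next instance

    sem_close(mine);
    sem_close(next);
    return 0;
}
```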
#dbr said:
There is a review machine with an absurd fiber-channel RAID array that can play 2K files direct from the array easily. The issue is with the artist workstations, so it wouldn't be one $4000 RAID array, it'd be hundreds...
Well, if you can accept a limit of ~30GB, then maybe a single 36GB SSD drive would be enough? Those go for ~US$1k each, I think, and the data rates might be sufficient. That may well be cheaper than a pure RAM approach. There are smaller sizes available, too. If ~60GB is enough, you could probably get away with a JBOD array of two for double the cost and skip the RAID controller. Be sure to look only at the higher-end SSD options--the low end is filled with glorified memory sticks. :P
