I'm a little confused about AGP memory and shared graphics memory (http://en.wikipedia.org/wiki/Shared_graphics_memory).
What is the difference between them?
I'm not sure, but hopefully this may help you: AGP is a slot type, like PCI, into which you can put an AGP GPU or an AIMM (AGP Inline Memory Module).
http://en.wikipedia.org/wiki/AGP_Inline_Memory_Module
http://computer.howstuffworks.com/agp.htm
Shared graphics memory is a section of normal RAM (system memory) that is set aside for use by the graphics hardware, typically by an integrated graphics adapter that has little or no dedicated video memory of its own.
Related
I have an OpenGL application that runs on Linux with an Intel GPU. I need to reduce the GPU memory consumption of my application. How can I profile GPU memory effectively?
I would like to be able to identify which parts are eating the most memory. For example, is it the textures, or maybe the geometry? Is there any way I can query the memory that a texture or a buffer consumes in OpenGL? Or are there any useful tools I can use for this?
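(Core OpenGL does not expose the actual driver-side memory use of an individual texture or buffer, so one rough, hedged approach is to estimate it yourself from the object's dimensions and format. The helper below is a hypothetical sketch, not from the question above, and assumes an uncompressed 2D texture at 4 bytes per texel.)

#include <stddef.h>
#include <GL/gl.h>

/* Hypothetical helper: estimate the storage of one 2D texture level by
 * querying its dimensions and assuming 4 bytes per texel (an RGBA8-style
 * internal format). Drivers may pad, tile or compress storage, so treat
 * the result as an estimate, not an exact figure. */
static size_t estimate_texture_level_bytes(GLuint texture, GLint level)
{
    GLint width = 0, height = 0;

    glBindTexture(GL_TEXTURE_2D, texture);
    glGetTexLevelParameteriv(GL_TEXTURE_2D, level, GL_TEXTURE_WIDTH, &width);
    glGetTexLevelParameteriv(GL_TEXTURE_2D, level, GL_TEXTURE_HEIGHT, &height);

    return (size_t)width * (size_t)height * 4;  /* assumed 4 bytes/texel */
}

Summing this over all textures (and doing the same arithmetic for vertex and index buffers from the sizes you pass to glBufferData) gives a per-category breakdown you can track over time.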
Suppose I have a game that does a lot of graphics with OpenGL, and I have a desktop with 32-bit Linux installed, 4GB of RAM, and a 1GB Nvidia graphics card. What does my game application's virtual address space look like? Is graphics card memory mapped in this virtual address space?
Also, is there some relation between RAM and graphics card memory? Does Linux set aside an equal amount of RAM for the graphics card that cannot be used by any process? If so, does that leave only 3GB of RAM available to my game process?
What does my game application's virtual address space look like?
Impossible to tell. OpenGL leaves this detail completely open to the vendor implementation. Anything that satisfies the specification is allowed.
Is graphics card memory mapped in this virtual address space?
Maybe, maybe not. That depends on the actual implementation.
Also, is there some relation between RAM and graphics card memory?
Usually yes. As far as the majority of OpenGL implementations are concerned, the graphics card's RAM is essentially a cache for things that actually live in system memory (CPU RAM + swap space + data memory-mapped from storage). However, this is not pinned down by the specification, and anything that satisfies the OpenGL specification is allowed.
Does Linux set aside an equal amount of RAM for the graphics card that cannot be used by any process?
No, because Linux (the kernel) is not concerned with these things. Your graphics card's driver is, though, and the driver may do it any way it sees fit. It can map OpenGL context data into a separate address space through Physical Address Extension (PAE), place it in a different process, keep it in your game's address space, or…, or…, or…. There is no written-down scheme for this.
If so, does that leave only 3GB of RAM available to my game process?
If so, then it is more like (3GB - 1GB) - x with x > 0, because the top 1GB of your process's address space is reserved for the kernel, and of course your program's text (the binary executed by the CPU) and the text of the libraries it uses take up some address space as well.
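If you want to see what actually ended up in your process's address space on your particular driver, a simple (hedged) way is to dump the process's memory map at runtime and look for regions backed by the GPU driver's device nodes (for example /dev/nvidia* with the proprietary Nvidia driver):

#include <stdio.h>

/* Minimal sketch: print this process's own memory map. Whether any
 * GPU memory shows up here, and where, is entirely up to the driver,
 * as explained above. */
int main(void)
{
    FILE *maps = fopen("/proc/self/maps", "r");
    char line[512];

    if (!maps)
        return 1;

    while (fgets(line, sizeof line, maps))
        fputs(line, stdout);   /* look for entries backed by /dev/nvidia* etc. */

    fclose(maps);
    return 0;
}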
I am trying to hunt down a possible memory leak in my SharpDX / DirectX application.
I am getting the following information from Process Explorer, which I do not know how to interpret.
What is Dedicated GPU Memory?
What is System GPU Memory?
What is Committed GPU Memory?
Dedicated GPU memory is basically the VRAM on board the GPU.
System GPU memory is system memory that the graphics card driver makes accessible to the GPU through the GART (Graphics Address Remapping Table). AGP and PCI Express both provide regions of memory set aside for this purpose (sometimes referred to as aperture segments).
Committed GPU memory refers to the amount of memory mapped into a display device's address space by the display driver. It is a difficult concept to explain, but this number typically does not represent anything worthwhile to anyone but driver developers.
I suggest you look into the following documentation on MSDN as well as this overview of GPU address space segmentation; while they are somewhat technical, they give a general overview of what is going on.
I am trying to test the Contiguous Memory Allocator (CMA) for the DMA mapping framework. I have compiled kernel 3.5.7 with CMA support; I know that it is experimental, but it should work.
My goal is to allocate several 32MB physically contiguous memory chunks in a kernel module, for a device without scatter/gather capability.
I am testing my system with the test patch from Barry Song: http://thread.gmane.org/gmane.linux.kernel/1263136
But when I try to allocate memory with echo 1024 > /dev/cma_test, I get bash: echo: write error: No space left on device, and in dmesg: misc cma_test: no mem in CMA area.
What could be the problem? What am I missing? The system is freshly rebooted, and there should be at least 350MB of free contiguous memory, because the bigphysarea patch on kernel 3.2 was able to allocate that amount on a similar system.
Thank you for your time!
In the end I decided to use kernel 3.5 and the bigphysarea patch (from 3.2). It is easy and works like a charm.
CMA is a great option as well, but it is a bit harder to use and debug (CMA needs an actual device). I used up all my skills trying to find the problem; printk inside the kernel code was the only way to debug this one.
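For reference, a CMA-backed allocation in a kernel module normally goes through the ordinary DMA mapping API rather than a dedicated CMA call. The fragment below is a minimal sketch under that assumption; demo_dev stands for a struct device obtained from a real platform or PCI device, and the CMA region itself still has to be sized at boot (for example via the cma= kernel parameter or CONFIG_CMA_SIZE_MBYTES), otherwise the allocation fails much like described above.

#include <linux/module.h>
#include <linux/device.h>
#include <linux/dma-mapping.h>

#define DEMO_BUF_SIZE (32 * 1024 * 1024)   /* one 32MB contiguous chunk */

static void *demo_cpu_addr;
static dma_addr_t demo_dma_handle;

/* demo_dev is assumed to come from a real device; CMA allocations are
 * made on behalf of a device, which is why debugging without one is hard. */
static int demo_alloc(struct device *demo_dev)
{
        demo_cpu_addr = dma_alloc_coherent(demo_dev, DEMO_BUF_SIZE,
                                           &demo_dma_handle, GFP_KERNEL);
        if (!demo_cpu_addr) {
                dev_err(demo_dev, "no contiguous memory available\n");
                return -ENOMEM;
        }
        dev_info(demo_dev, "allocated 32MB at bus address 0x%llx\n",
                 (unsigned long long)demo_dma_handle);
        return 0;
}

static void demo_free(struct device *demo_dev)
{
        dma_free_coherent(demo_dev, DEMO_BUF_SIZE, demo_cpu_addr,
                          demo_dma_handle);
}

MODULE_LICENSE("GPL");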
Context:
CUDA 4.0, Linux 64-bit, NVIDIA UNIX x86_64 Kernel Module 270.41.19, on a GeForce GTX 480.
I am trying to find a (device) memory leak in my program. I use the runtime API and cudaMemGetInfo(&free, &total) to measure device memory usage. I notice a significant loss (in this case 31MB) after kernel execution. The kernel code itself does not allocate any device memory, so I guess it is the kernel code that remains in device memory, even though I would have thought the kernel isn't that big. (Is there a way to determine the size of a kernel?)
When is the kernel code loaded into device memory? I guess at execution of the host code line:
kernel<<<geom>>>(params);
Right?
And does the code remain in device memory after the call? If so, can I explicitly unload the code?
What concerns me is device memory fragmentation. Think of a long sequence of alternating device memory allocations and kernel executions (different kernels). After a while, device memory gets quite scarce. Even if you free some memory, the kernel code remains, leaving only the space between the kernels free for new allocations. This would result in heavy memory fragmentation after a while. Is this the way CUDA was designed?
The memory allocation you are observing is used by the CUDA context. It doesn't only hold kernel code; it also holds any other static-scope device symbols, textures, per-thread scratch space for local memory, printf and heap storage, constant memory, as well as the GPU memory required by the driver and the CUDA runtime itself. Most of this memory is only ever allocated once, when a binary module is loaded or PTX code is JIT-compiled by the driver. It is probably best to think of it as a fixed overhead rather than a leak. There is a 2 million instruction limit in PTX code, and current hardware uses 32-bit words for instructions, so the memory footprint of even the largest permissible kernel code is small compared to the other global memory overheads it requires.
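A practical way to tell this fixed overhead apart from a genuine leak (a sketch, not the poster's code) is to bracket the suspect phase of the program with cudaMemGetInfo and watch whether the drop happens once, at the first launch, or keeps growing with every iteration:

#include <stdio.h>
#include <cuda_runtime.h>

/* Returns the current amount of free device memory. The first runtime
 * API call implicitly creates the CUDA context, so the context's fixed
 * overhead is already accounted for by the time this returns. */
static size_t free_device_memory(void)
{
    size_t free_bytes = 0, total_bytes = 0;
    cudaMemGetInfo(&free_bytes, &total_bytes);
    return free_bytes;
}

int main(void)
{
    size_t before = free_device_memory();

    /* ... launch kernels, allocate and free buffers, repeat ... */

    size_t after = free_device_memory();
    printf("device memory delta: %lld bytes\n",
           (long long)before - (long long)after);

    /* a one-time drop after the first launch is context overhead;
     * a delta that grows every iteration points to a real leak */
    cudaDeviceReset();
    return 0;
}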
In recent versions of CUDA there is a runtime API call, cudaDeviceSetLimit, which permits some control over the amount of scratch space a given context can consume. Be aware that it is possible to set the limits to values lower than the device code requires, in which case runtime execution failures can result.
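For illustration, here is a minimal sketch of how those limits are set and read back with the runtime API; the sizes are arbitrary examples, and setting them below what your kernels actually need causes launch failures as noted above.

#include <stdio.h>
#include <cuda_runtime.h>

int main(void)
{
    size_t value = 0;

    /* set the limits before the first kernel that relies on them runs */
    cudaDeviceSetLimit(cudaLimitPrintfFifoSize, 1 << 20);  /* 1MB printf FIFO */
    cudaDeviceSetLimit(cudaLimitMallocHeapSize, 4 << 20);  /* 4MB device heap */

    cudaDeviceGetLimit(&value, cudaLimitPrintfFifoSize);
    printf("printf FIFO size: %zu bytes\n", value);
    cudaDeviceGetLimit(&value, cudaLimitMallocHeapSize);
    printf("device malloc heap size: %zu bytes\n", value);

    return 0;
}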