When playing Minecraft with a heavyweight mod pack without increasing the memory allocation, you can sometimes see a lot of stuttering: it runs smoothly for about a second, then pauses for about a second. I assumed this was due to GC, with some part of the code creating a lot of temporary objects in the main loop.
Allocating more memory before launching the game cures it. My question is: what is going on under the hood such that extra memory prevents the stuttering entirely, rather than just increasing the period between pauses?
I am doing some tests using the zlib module provided by Node.js.
When I call the deflate function 5000 times in a loop, there appears to be a memory leak, and the brk syscall is called more than 10000 times on Linux (measured with strace -cfe mmap,munmap,mprotect,brk -p {process Id}).
However, when I make the same 5000 calls spread out with setInterval, one per second, there is no leak and brk is called far less often.
The deflate function is called 5000 times in both cases, so why is the number of brk calls on Linux so different?
My guess is that Linux is reusing already-allocated memory in the second case. Is this correct? If not, what is the actual cause?
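For reference, here is a minimal sketch of the two variants I am comparing (the payload and the exact counts are placeholders, not my real data):

    import * as zlib from "zlib";

    const payload = Buffer.alloc(64 * 1024, "x"); // dummy input

    // Variant 1: 5000 deflate calls issued back-to-back in a loop;
    // the loop itself never yields to the event loop.
    function tightLoop(): void {
      for (let i = 0; i < 5000; i++) {
        zlib.deflate(payload, (err) => {
          if (err) throw err;
        });
      }
    }

    // Variant 2: the same 5000 calls spread out, one per second, via setInterval.
    function spreadOut(): void {
      let count = 0;
      const timer = setInterval(() => {
        zlib.deflate(payload, (err) => {
          if (err) throw err;
        });
        if (++count === 5000) clearInterval(timer);
      }, 1000);
    }

    tightLoop(); // or spreadOut();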
Most likely, you are seeing a side effect of the garbage collector. Analyzing this requires more information than you have posted here, but if I were to hazard an educated guess, Node.js (or V8, via libuv) probably does an opportunistic GC at the bottom of the event loop, to collect garbage at least from its shortest-lived pool.
When you run this in a loop, you never fall off the event loop, which means that garbage collection probably has fewer opportunities to run. It probably also runs based on a timer or an allocation counter, and almost certainly from within the memory allocator when there is memory pressure.
Run it in your loop for a few hundred thousand iterations; I would bet the total process allocation stabilizes at some point.
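A rough sketch of what I mean (the payload and iteration counts are arbitrary): sample process.memoryUsage() every so often and watch whether the numbers level off.

    import * as zlib from "zlib";

    const payload = Buffer.alloc(64 * 1024, "x"); // arbitrary dummy input

    for (let i = 0; i < 300000; i++) {
      zlib.deflateSync(payload); // synchronous variant keeps the sketch simple
      if (i % 10000 === 0) {
        // rss covers the whole process; heapUsed is just the V8-managed heap
        const { rss, heapUsed } = process.memoryUsage();
        console.log(`iter=${i} rss=${(rss / 1e6).toFixed(1)}MB heapUsed=${(heapUsed / 1e6).toFixed(1)}MB`);
      }
    }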
I have a real-time application in Node.js that listens to multiple websockets and reacts to their events by placing HTTPS requests; it runs continuously. I noticed that the response time was, at many points during execution, much higher than the expected network latency, which led me to investigate. It turns out that the garbage collector was running many times in a row, adding significant latency (5s, up to 30s) to the run time.
The problem seems to be related to frequent scheduling of Scavenger rounds to free up resources, due to allocation failure. Although each round takes less than 100ms to execute, thousands of rounds in a row add up to many seconds. My only guess is that at some point the allocated heap is close to its limit, little memory is actually freed in each GC round, and that keeps triggering GC in a long loop. The application does not seem to be leaking memory, because allocation never actually fails; it just seems to be hitting an edge that triggers GC madness.
Could someone with knowledge or experience share some tips on how to use the GC options that V8 provides to overcome this situation? I have tried --max_old_space_size and --nouse-idle-notification (as suggested by the few articles that tackle this problem), but I do not fully understand the internals of Node's GC implementation or the effects of the available options. I would like more control over when a Scavenger round runs, or at least to increase the interval between successive rounds so that each one is more effective.
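To test my guess that the heap is running close to its limit, I have been logging V8's heap statistics roughly like this (a sketch; the 10-second interval is arbitrary):

    import * as v8 from "v8";

    // Periodically compare the used heap against the configured limit, to see
    // whether the GC storms coincide with a nearly full heap.
    setInterval(() => {
      const stats = v8.getHeapStatistics();
      const usedMB = stats.used_heap_size / 1e6;
      const limitMB = stats.heap_size_limit / 1e6;
      console.log(`heap: ${usedMB.toFixed(1)} MB used of ${limitMB.toFixed(1)} MB limit`);
    }, 10000);

If used_heap_size does sit near heap_size_limit while the latency spikes happen, then raising --max_old_space_size at least confirms the diagnosis, even if it only postpones the problem.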
This may have been asked a few different ways, but this is a relatively new field to me so forgive me if it is redundant and point me on my way.
Essentially, I have created a data collection engine that takes high-speed data (up to thousands of points a second) and stores it in a database.
The database is dynamic, so the statements fed to it are dynamically created in code as well, which in turn requires a great deal of string manipulation. All of the strings, however, are declared within the scope of asynchronous event-handler methods, so they should fall out of scope as soon as the method completes.
As the application runs, its memory usage (according to Task Manager / Process Explorer) slowly but steadily increases, so it would seem that something is not getting properly disposed of and/or collected.
If I attach with CDB -p (yes, I am loading sos.dll from the CLR) and do a !dumpheap, I see that the majority of the memory is used by System.String; likewise, if I do !dumpheap -type System.String and then !do on the addresses, I see the exact strings (the SQL statements).
However, if I do a !gcroot on any of the addresses, I get "Found 0 unique roots (run '!GCRoot -all' to see all roots).", and if I then try what it suggests I get "Invalid argument -all" O.o
So after some googling, and some arguments to the effect that unrooted objects will eventually be collected by the GC and that this is therefore not an issue, I looked further, and it appears that 84% of my problem is sitting on the LOH (which, depending on which thread you read, may or may not get collected unless there is memory pressure on the machine or I explicitly tell it to collect, which everything I can find says is bad practice).
So what I need to know is: is this essentially true, that this is not a memory leak, just the system leaving objects in place until the memory HAS to be reclaimed? And if so, how do I tell whether or not I have a legitimate memory leak?
This is my first time working with a debugger external to the application, as I have never had to address this sort of issue before, so I am very new to that part; this is a learning experience.
The application is written in C# in VS2012 Pro, it is multi-threaded, and a console application currently wraps the API for testing, although it will eventually be a Windows service.
What you read is true: managed applications use a memory model where objects pile up until a certain memory threshold is reached (calculated from the amount of physical memory on your system and your application's actual growth rate), after which all(*) "dead" objects are squeezed out by compacting the surviving objects into one contiguous block, which keeps allocation fast.
So yes, don't worry about your memory steadily increasing until you're several tens of MB up and no collection has taken place.
(*) It is actually more complicated than that, because of multiple memory pools (based on object size and lifetime, so the system is not constantly re-scanning very long-lived objects) and because of finalizers. When an object has a finalizer, instead of being freed outright it is placed on a special queue, the finalizer queue, where it waits for its finalizer to run on a dedicated finalizer thread (keep in mind the GC itself also runs separately from your application threads), and only then does its memory finally get freed.
I'm looking into making a real-time game with OpenGL and D, and I'm worried about the garbage collector. From what I'm hearing, this is a possibility:
10 frames run
Garbage collector kicks in automatically and runs for 10ms
10 frames run
Garbage collector kicks in automatically and runs for 10ms
10 frames run
and so on
This could be bad because it causes stuttering. However, if I force the garbage collector to run consistently, like with GC.collect, will it make my game smoother? Like so:
1 frame runs
Garbage collector runs for 1-2ms
1 frame runs
Garbage collector runs for 1-2ms
1 frame runs
and so on
Would this approach actually work and make my framerate more consistent? I'd like to use D but if I can't make my framerate consistent then I'll have to use C++11 instead.
I realize that it might not be as efficient, but the important thing is that it will be smoother, with a more consistent framerate. I'd rather have a smooth 30 fps than a stuttering 35 fps, if you know what I mean.
Yes, but it will likely not make a dramatic difference.
The bulk of the time spent in a GC cycle is the "mark" stage, where the GC transitively visits every allocated memory block that is known to contain pointers, starting from the root areas (static data, TLS, the stack, and registers).
There are several approaches to optimize an application's memory so that D's GC makes a smaller impact on performance:
Use bulk allocation (allocate objects in bulk as arrays)
Use custom allocators (std.allocator is on its way, but you could use your own or third party solutions)
Use manual memory management, like in C++ (you can use RefCounted as you would use shared_ptr)
Avoid memory allocation entirely during gameplay, and preallocate everything beforehand instead
Disable the GC, and run collections manually when it is more convenient
Generally, I would not recommend being concerned about the GC before writing any code. D provides the tools to avoid the bulk of GC allocations. If you keep the managed heap small, GC cycles will likely not take long enough to interfere with your application's responsiveness.
If you were to run the GC every frame, you still would not get a smooth run, because you could have different amounts of garbage every frame.
You're left then with two options, both of which involve turning off the GC:
Use (and re-use) pre-allocated memory (structs, classes, arrays, whatever) so that you do not allocate during a frame, and do not need to.
Just run and eat up memory.
For both of these, you would do a GC.disable() before you start your frames and then a GC.enable() after you're finished with all your frames (at the end of the battle or whatever).
The first option is the one that most high-performance games use anyway, regardless of whether they're written in a language with a GC. They simply do not allocate or deallocate during the main frame run. (Which is why you get the "loading" and "unloading" before and after battles and the like, and why there are usually hard limits on the number of units.)
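To illustrate the first option, here is a language-agnostic sketch of the preallocate-and-reuse idea (written in TypeScript purely for illustration; in D you would do the same thing with preallocated structs or arrays, bracketed by the GC.disable()/GC.enable() calls above). The names and sizes are made up.

    // A fixed-size pool: every object is created up front, and during a frame we
    // only hand out and return slots, so no allocation happens inside the frame loop.
    interface Bullet { x: number; y: number; active: boolean; }

    class BulletPool {
      private readonly slots: Bullet[];

      constructor(capacity: number) {
        // Pay the allocation cost once, before the frames start ("loading").
        this.slots = Array.from({ length: capacity }, () => ({ x: 0, y: 0, active: false }));
      }

      acquire(): Bullet | null {
        const slot = this.slots.find(s => !s.active); // reuse an idle slot
        if (slot) slot.active = true;
        return slot ?? null;                          // null = hard limit reached
      }

      release(slot: Bullet): void {
        slot.active = false;                          // return the slot; nothing is freed
      }
    }

    const bullets = new BulletPool(1024); // the "hard limit on the number of units"

    function frame(): void {
      const b = bullets.acquire();        // no `new`, no allocation, inside the frame
      if (b) {
        b.x += 1;                         // do per-frame work with the pooled object
        bullets.release(b);
      }
    }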
Is there a way to move memory that has been swapped out back into main memory?
EDIT: I ran a process that ate up all my memory, so now, each time I use another app, part of it is in swap and it takes time to load back into memory. The memory-hogging process has now stopped, and I want to force everything back into memory, so that I wait only once for the swapped-out data to be read back in, rather than every time I return to an already-open app.
Not directly; moreover, usually you don't want to, as often what is swapped is the part that is no longer needed (initialization code). The only way to force the issue is to ask the kernel to disable the swap area, and even that is not immediate.
The kernel will automatically and transparently swap that data back into RAM when it's required.
You can avoid getting swapped out using mlock() or mlockall(), but you probably shouldn't do that. If your app ends up getting swapped out it might be using too much memory, or there could be too little memory in the machine, or too many running processes. None of those problems will be improved by your app using mlock().