How does CMS GC ensure that new references are not cleared during the concurrent sweep? - jvm-hotspot

I know that the CMS marking process marks all reachable objects.
After the final remark stage, only the reachable objects have been marked, not the unreachable ones (unless my understanding is incorrect).
Then, while the concurrent sweep is clearing all the unreachable space, a user thread may create a new object.
How does CMS handle this, or is there a problem with my understanding from the start?

Normally, new objects are allocated in the young generation, which does not interfere with CMS running in the old generation. But when an object is promoted from the young generation to the old generation, CMS handles it as follows:
According to the original CMS paper:
Concurrent sweeping phase: Resume the mutators once again, and sweep concurrently over the heap, deallocating unmarked objects. Care must be taken not to deallocate newly-allocated objects. This can be accomplished by allocating objects “live” (i.e., marked), at least during this phase.
To sum up: during the concurrent sweeping phase, CMS allocates new objects as already "marked".
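The idea of allocating objects "live" can be illustrated with a small single-threaded simulation. This is only a sketch under simplifying assumptions, not HotSpot's implementation: the names `OldGen`, `Obj`, `allocate` and `sweep_steps` are hypothetical, the mark is a per-object flag rather than a bitmap, and concurrency is modelled by a generator that pauses after each swept object so that an allocation can be interleaved.

```python
class Obj:
    def __init__(self, name, marked):
        self.name = name
        self.marked = marked


class OldGen:
    """Toy old generation (hypothetical; real CMS uses mark bitmaps and free lists)."""

    def __init__(self):
        self.objects = []      # every object currently in this toy heap
        self.sweeping = False  # True while the "concurrent" sweep is running

    def allocate(self, name):
        # Allocate "live": an object created during the sweep starts out marked.
        obj = Obj(name, marked=self.sweeping)
        self.objects.append(obj)
        return obj

    def sweep_steps(self):
        """Examine one object per step so mutator work can run between steps."""
        self.sweeping = True
        survivors = []
        i = 0
        while i < len(self.objects):  # also reaches objects appended mid-sweep
            obj = self.objects[i]
            if obj.marked:
                obj.marked = False    # reset the mark for the next cycle
                survivors.append(obj)
            i += 1
            yield
        self.objects = survivors
        self.sweeping = False


heap = OldGen()
a = heap.allocate("a")
b = heap.allocate("b")
a.marked = True            # pretend marking found "a" reachable and "b" not

sweep = heap.sweep_steps()
next(sweep)                # the sweeper examines "a"
c = heap.allocate("c")     # an object is promoted in the middle of the sweep
for _ in sweep:            # the sweeper finishes
    pass

names = [o.name for o in heap.objects]
```

Here "b" is reclaimed because it was never marked, while "c" survives even though the marking phase never saw it, because it was created with its mark already set.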

Related

Properly identifying memory leak with GC and LOH

This may have been asked a few different ways, but this is a relatively new field to me so forgive me if it is redundant and point me on my way.
Essentially, I have created a data collection engine that takes high-speed data (up to thousands of points a second) and stores it in a database.
The database is dynamic, so the statements fed to it are created dynamically in code as well, which in turn requires a great deal of string manipulation. All of the strings, however, are declared within the scope of asynchronous event handler methods, so they should fall out of scope as soon as the method completes.
As this application runs, its memory usage according to Task Manager / Process Explorer slowly but steadily increases, so it would seem that something is not getting properly disposed and/or collected.
If I attach CDB -p (yes, I am loading sos.dll from the CLR) and do a !dumpheap, I see that the majority of the memory is used by System.String; likewise, if I run !dumpheap -type System.String and then !do the addresses, I see the exact strings (the SQL statements).
However, if I do a !gcroot on any of the addresses, I get "Found 0 unique roots (run '!GCRoot -all' to see all roots)." If I then try what it suggests, I get "Invalid argument -all" O.o
So after some googling, and some arguments to the effect that unrooted objects will eventually be collected by the GC and so this is not an issue, I looked and found that 84% of my problem is sitting on the LOH (which, depending on which thread you read, may or may not get processed by the GC unless there is a memory constraint on the machine or I explicitly tell it to collect, which is considered bad according to everything I can find).
So what I need to know is: is it essentially true that this is not a memory leak, and that the system is simply leaving stuff there until it HAS to be reclaimed? And if so, how do I tell whether or not I have a legitimate memory leak?
This is my first time working the debugger external to the application as I have never had to address this sort of issue before, so I am very new to that portion, this is a learning experience.
Application is written in VS2012 Pro, C#, it is multi-threaded, and a console application is wrapping the API for testing, but will eventually be a Windows service.
What you read is true: managed applications use a memory model where objects pile up until a certain memory threshold is reached (calculated from the amount of physical memory on your system and your application's real growth rate), after which all(*) "dead" objects are overwritten as the remaining live memory is compacted into one contiguous block for allocation speed.
So yes, don't worry about your memory steadily increasing until you're several tens of MB up and no collection has taken place.
(*) - it is actually more complicated because of multiple memory pools (based on object size and lifetime), so that the system isn't constantly probing very long-lived objects, and because of finalizers. When an object has a finalizer, it is not freed right away: it is moved to a special queue, the finalizer queue, where it waits for its finalizer to run on a dedicated finalizer thread (keep in mind the GC runs on a separate thread), and only then is it finally freed.

Why does the GC (garbage collector) freeze the current execution threads?

I was reading Chapter 12: Garbage collection of C# in a Nutshell where in the section about Concurrent and background collection it says that
The GC must freeze (block) your execution threads for periods during a
collection. This includes the entire period during which a Gen0 or
Gen1 collection takes place.
One thing I understand is that it is probably trying to avoid any new memory allocation at that point in time.
Is there any other specific reason why the GC needs to block the currently executing threads?
The MSDN documentation claims that generation 0 and 1 collections are always performed non-concurrently because they complete very quickly.
Performing a concurrent garbage collection pass will take longer than a non-concurrent one since access to data that is being processed must be synchronized between the GC thread and other threads. This adds overhead which probably outweighs the benefits of concurrency in gen 0 and 1 collections since they typically run very fast.
Beyond removing objects that are marked from memory, the GC also tries to compact the heap after performing a pass. This means that objects may move in memory as a result of a GC pass. For this reason a concurrent pass requires the extra overhead to synchronize data access between the GC thread and other threads of the process.

How to implement a garbage collector?

Could anyone point me to a good source on how to implement garbage collection? I am making a lisp-like interpreted language. It currently uses reference counting, but of course that fails at freeing circularly dependent objects.
I've been reading about mark and sweep, tricolor marking, moving and nonmoving, incremental and stop-the-world, but I don't know the best way to keep the objects neatly separated into sets while keeping per-object memory overhead to a minimum, or how to do things incrementally.
I've read some languages with reference counting use circular reference detection, which I could use. I am aware I could use freely available collectors like Boehm, but I would like to learn how to do it myself.
I would appreciate any online material with some sort of tutorial or help for people with no experience on the topic like myself.
Could anyone point me to a good source on how to implement garbage collection?
There's a lot of advanced material about garbage collection out there. The Garbage Collection Handbook is great. But I found there was precious little basic introductory information so I wrote some articles about it. Prototyping a mark-sweep garbage collector describes a minimal mark-sweep GC written in F#. The Very Concurrent Garbage Collector describes a more advanced concurrent collector. HLVM is a virtual machine I wrote that includes a stop-the-world collector that handles threading.
The simplest way to implement a garbage collector is:
Make sure you can collate the global roots. These are the local and global variables that contain references into the heap. For local variables, push them on to a shadow stack for the duration of their scope.
Make sure you can traverse the heap, e.g. every value in the heap is an object that implements a Visit method that returns all of the references from that object.
Keep the set of all allocated values.
Allocate by calling malloc and inserting the pointer into the set of all allocated values.
When the total size of all allocated values exceeds a quota, kick off the mark and then sweep phases. The mark phase recursively traverses the heap from the global roots, accumulating the set of all reachable values.
The set difference of the allocated values minus the reachable values is the set of unreachable values. Iterate over them calling free and removing them from the set of allocated values.
Set the quota to twice the total size of all allocated values.
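The steps above can be sketched in a few lines. This is a minimal illustration, assuming hypothetical names (`Value`, `Heap`, `alloc`, `collect`, `visit`); a Python set stands in for the set of malloc'ed pointers, an explicit stack replaces the recursion, and the quota check runs just before the new value is inserted so that the value being created is never swept.

```python
class Value:
    def __init__(self, size, refs=()):
        self.size = size
        self.refs = list(refs)

    def visit(self):
        # Return every reference held by this value.
        return self.refs


class Heap:
    def __init__(self, quota):
        self.quota = quota
        self.allocated = set()  # the set of all allocated values
        self.roots = []         # shadow stack of global roots

    def total_size(self):
        return sum(v.size for v in self.allocated)

    def alloc(self, size, refs=()):
        # Collect first if this allocation would push us over the quota.
        if self.total_size() + size > self.quota:
            self.collect(extra_roots=refs)
        value = Value(size, refs)
        self.allocated.add(value)
        return value

    def collect(self, extra_roots=()):
        # Mark: accumulate the set of reachable values.
        reachable = set()
        stack = list(self.roots) + list(extra_roots)
        while stack:
            v = stack.pop()
            if v not in reachable:
                reachable.add(v)
                stack.extend(v.visit())
        # Sweep: allocated minus reachable is the set of unreachable values.
        unreachable = self.allocated - reachable
        self.allocated -= unreachable  # a C implementation would free() each one
        # Set the quota to twice the size of what survived.
        self.quota = 2 * self.total_size()


heap = Heap(quota=100)
a = heap.alloc(40)
heap.roots.append(a)          # "a" is held by a local variable
b = heap.alloc(40)            # never rooted, so it is garbage
c = heap.alloc(40, refs=[a])  # 80 + 40 > 100 triggers a collection
```

After the last call, `b` has been swept, `a` and `c` remain allocated, and the quota has been reset to 80 (twice the 40 units that survived the collection).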
Check out the following page. It has many links. http://lua-users.org/wiki/GarbageCollection
As suggested by delnan, I started with a very naïve stop-the-world tri-color mark and sweep algorithm. I managed to keep the objects in the sets by making them linked-list nodes, but it does add a lot of data to each object (the virtual pointer, two pointers to nodes, one enum to hold the color). It works perfectly, no memory lost on valgrind :) From here I might try to add a free list for recycling, or some sort of thing that detects when it is convenient to stop the world, or an incremental approach, or a special allocator to avoid fragmentation, or something else. If you can point me where to find info or advice (I don't know whether you can comment on an answered question) on how to do these things or what to do, I'd be very thankful. I'll be checking Lua's GC in the meantime.
I have implemented a Cheney-style copying garbage collector in C in about 400 SLOC. I did it for a statically-typed language and, to my surprise, the harder part was actually communicating which things are pointers and which things aren't. In a dynamically typed language this is probably easier since you must already use some form of tagging scheme.
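For readers unfamiliar with the technique, here is a rough sketch of Cheney-style copying. It is not the 400-line C collector mentioned above. It assumes each object is a list of fields, that references are wrapped in a hypothetical `Ref` class (a stand-in for a tagging scheme), and that forwarding addresses live in a side dictionary rather than in the old object's header as a real implementation would do.

```python
class Ref:
    """Tagged pointer: an index into the current semispace (hypothetical)."""

    def __init__(self, index):
        self.index = index


def cheney_collect(from_space, roots):
    to_space = []
    forwarding = {}  # from-space index -> to-space index

    def copy(ref):
        # Copy the target once; later visits reuse the forwarding address.
        if ref.index not in forwarding:
            forwarding[ref.index] = len(to_space)
            to_space.append(list(from_space[ref.index]))
        return Ref(forwarding[ref.index])

    new_roots = [copy(r) for r in roots]

    # The scan pointer chases the allocation pointer (the end of to_space),
    # so to_space itself serves as the breadth-first work queue.
    scan = 0
    while scan < len(to_space):
        obj = to_space[scan]
        for i, field in enumerate(obj):
            if isinstance(field, Ref):  # only tagged fields are pointers
                obj[i] = copy(field)
        scan += 1
    return to_space, new_roots


from_space = [
    [1, Ref(2)],  # 0: root object, points to object 2
    [99],         # 1: garbage, nothing points here
    [2, Ref(0)],  # 2: points back to object 0 (a cycle)
]
to_space, new_roots = cheney_collect(from_space, [Ref(0)])
```

The garbage object is simply never copied, and the cycle between objects 0 and 2 is handled because the second visit finds the forwarding address instead of copying again.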
There also is a new version of the standard book on garbage collection coming out: "The Garbage Collection Handbook: The Art of Automatic Memory Management" by Jones, Hosking, Moss. (The Amazon UK site says 19 Aug 2011.)
One thing I haven't yet seen mentioned is the use of memory handles. One may avoid the need to double-up on memory (as would be needed with the Cheney-style copying algorithm) if each object reference is a pointer to a structure which contains the real address of the object in question. Using handles for memory objects will make certain routines a little slower (one must reread the memory address of an object any time something might have happened that would move it) but for single-threaded systems where garbage collection will only happen at predictable times, this isn't too much of a problem and doesn't require special compiler support (multi-threaded GC systems are likely to require compiler-generated metadata whether they use handles or direct pointers).
If one uses handles, and uses one linked list for live handles (the same storage can be used to hold a linked list for dead handles needing reallocation), one can, after marking the master record for each handle, proceed through the list of handles, in allocation order, and copy the block referred to by that handle to the beginning of the heap. Because handles will be copied in order, there will be no need to use a second heap area. Further, generations may be supported by keeping track of some top-of-heap pointers. When compactifying memory, start by just compactifying items added since the last GC. If that doesn't free up enough space, compactify items added since the last level 1 GC. If that doesn't free up enough space, compactify everything. The marking phase would probably have to act upon objects of all generations, but the expensive compactifying stage would not.
Actually, using a handle-based approach, if one is marking things of all generations, one could if desired compute on each GC pass the amount of space that could be freed in each generation. If half the objects in Gen2 are dead, it may be worthwhile to do a Gen2 collection so as to reduce the frequency of Gen1 collections.
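A small sketch may make the handle idea more concrete. It assumes a hypothetical `HandleHeap` whose handles are indices into a table of `[offset, length, marked]` records kept in allocation order. Dead handles are simply set to `None` here rather than being linked into a reuse list, and generations are left out.

```python
class HandleHeap:
    def __init__(self, size):
        self.memory = bytearray(size)
        self.top = 0       # next free offset
        self.handles = []  # allocation order: [offset, length, marked] or None

    def alloc(self, data):
        if self.top + len(data) > len(self.memory):
            raise MemoryError("heap full; a real system would collect here")
        offset = self.top
        self.memory[offset:offset + len(data)] = data
        self.top += len(data)
        self.handles.append([offset, len(data), False])
        return len(self.handles) - 1  # the handle is a stable index

    def read(self, handle):
        # Always re-read the address through the handle: the block may have moved.
        offset, length, _ = self.handles[handle]
        return bytes(self.memory[offset:offset + length])

    def mark(self, handle):
        self.handles[handle][2] = True

    def compact(self):
        dest = 0
        for i, record in enumerate(self.handles):
            if record is None:
                continue
            offset, length, marked = record
            if not marked:
                self.handles[i] = None  # dead: its space will be overwritten
                continue
            # Slide the block down. Blocks are visited in address order,
            # so dest never passes offset and no second heap area is needed.
            self.memory[dest:dest + length] = self.memory[offset:offset + length]
            self.handles[i] = [dest, length, False]
            dest += length
        self.top = dest


heap = HandleHeap(16)
a = heap.alloc(b"aaaa")
b = heap.alloc(b"bbbb")
c = heap.alloc(b"cccc")
heap.mark(a)
heap.mark(c)    # pretend the mark phase found only a and c
heap.compact()
```

After compaction, `c` has moved from offset 8 to offset 4, yet code holding the handle `c` still reads the same bytes, which is the property that lets the collector move blocks without fixing up every reference.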
Garbage collection implementation in Lisp
Building LISP | http://www.lwh.jp/lisp/
Arcadia | https://github.com/kimtg/arcadia
Read Memory Management: Algorithms and Implementations in C/C++. It's a good place to start.
I'm doing similar work for my PostScript interpreter. More info via my question. I agree with Delnan's comment that a simple mark-sweep algorithm is a good place to start. You'll need functions to set-mark, check-mark, clear-mark, and iterators for all your containers. One easy optimization is to clear-mark whenever allocating a new object, and clear-mark during the sweep; otherwise you'll need an entire pass to clear marks before you start setting them.

How do garbage collectors track all live objects?

Garbage collection involves walking through a list of allocated objects (either all objects or objects in a particular generation) and determining which are reachable.
How is this list maintained? Do runtimes for GC languages keep a giant list of all objects?
Also, from what I understand, GC involves walking the call stack to look for object references - how does the algorithm distinguish between GC-able pointers and primitive data?
The memory management system keeps track of the size of each allocated object, just like it does in C or C++. One way this is commonly done is for the memory management system to allocate an extra size_t before each allocation, which keeps track of the size of each object. The memory manager likewise has to keep track of the size of each free block, so that it can reuse blocks to allocate them.
The garbage collector works in two phases: the mark phase, and the sweep phase. In the mark phase, the garbage collector walks object references in order to find objects that are still reachable. The garbage collector starts at a few basic places where the object references are stored and given names (the stack, and global storage, and static storage), and then traverses references in the objects.
In the sweep phase, the garbage collector walks the heap from bottom to top, jumping from allocation to allocation based on those size_ts, and frees anything that isn't marked.
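The sweep described above can be sketched as follows. The layout is an assumption made for illustration: the heap is a flat list of words, and each block starts with a two-word header holding the block's total size in words and a mark flag. The `sweep` function and this header format are hypothetical and not taken from any particular runtime.

```python
def sweep(heap):
    """Walk the heap bottom to top, hopping from block to block by size.

    Each block is laid out as [size_in_words, mark, payload...], where
    size_in_words includes the two header words.
    """
    freed = []
    pos = 0
    while pos < len(heap):
        size = heap[pos]
        if heap[pos + 1]:
            heap[pos + 1] = 0          # live: clear the mark for the next cycle
        else:
            freed.append((pos, size))  # dead: hand the block back to the allocator
        pos += size                    # jump to the next block header
    return freed


heap = [
    3, 1, 111,       # block at 0: marked (reachable)
    4, 0, 222, 333,  # block at 3: unmarked (garbage)
    2, 1,            # block at 7: marked, empty payload
]
freed = sweep(heap)
```

Note that no separate list of objects is needed: the size headers alone are enough to find every block, live or dead.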
Some languages (like Ruby) tag all of the primitives so that they can be identified separately from the object references at runtime. Other garbage collectors are very conservative and follow primitives as though they were object references (though some checks must be performed to make sure that the garbage collector doesn't stick a mark in the middle of some other object). Still other languages use runtime type information to be more precise about whether they follow primitives.
Ruby's garbage collector is sometimes called "conservative" because it doesn't check whether the space on the stack is actually being used, so it sometimes keeps dead objects alive by following ghost references on the stack. But since it always knows exactly whether the data it's looking at is a reference or a primitive, I don't call it conservative here.
Garbage collection involves walking through a list of allocated objects (either all objects or objects in a particular generation) and determining which are reachable.
Not really. GCs are categorized into tracing and reference counting (see A unified theory of garbage collection). Tracing GCs start from a set of global roots and trace all objects reachable from them. Reference counting GCs count the number of references to each object and reclaim it when the count reaches zero. Neither requires a list including unreachable objects.
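The difference between the two families can be shown with a toy example. This is only a sketch with hypothetical names (`Node`, `add_ref`, `release`, `trace`): reference counts are maintained by hand, and the tracer uses an explicit stack.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.refs = []
        self.count = 0  # number of incoming references


def add_ref(src, dst):
    src.refs.append(dst)
    dst.count += 1


def release(obj, freed):
    # Reference counting: reclaim as soon as the count drops to zero.
    obj.count -= 1
    if obj.count == 0:
        freed.append(obj.name)
        for child in obj.refs:
            release(child, freed)


def trace(roots):
    # Tracing: everything reachable from the roots is live.
    reachable, stack = set(), list(roots)
    while stack:
        obj = stack.pop()
        if obj.name not in reachable:
            reachable.add(obj.name)
            stack.extend(obj.refs)
    return reachable


# A chain c -> d held by one root reference: counting reclaims both.
c, d = Node("c"), Node("d")
c.count = 1              # the root's reference to c
add_ref(c, d)
chain_freed = []
release(c, chain_freed)  # the root drops c

# A cycle a <-> b held by one root reference: counting reclaims neither.
a, b = Node("a"), Node("b")
a.count = 1              # the root's reference to a
add_ref(a, b)
add_ref(b, a)
cycle_freed = []
release(a, cycle_freed)  # the root drops a, but b still points to it

live_after_drop = trace([])  # with no roots left, tracing finds nothing live
```

Neither approach walks a list of dead objects: counting reacts to the count reaching zero, and tracing only ever touches what is reachable. The cycle shows why a pure reference-counting scheme needs extra help.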
How is this list maintained? Do runtimes for GC languages keep a giant list of all objects?
Pedagogical solutions like the one in HLVM can keep a list of all objects because it is simple but this is rare.
Also, from what I understand, GC involves walking the call stack to look for object references - how does the algorithm distinguish between GC-able pointers and primitive data?
Again, there are many different strategies. Conservative GCs are unable to distinguish between pointers and non-pointers so they conservatively consider that non-pointers might be pointers. Pedagogical GCs like the one in HLVM can use algorithms like Henderson's Accurate GC in an uncooperative environment. Production GCs store enough information in the OS thread stack to determine exactly which words are pointers (and which stack frames to skip because they are not affiliated with managed code) and then use a stack walker to find them.
Note that you also have to find local references held in registers as well as on the stack.
This site ( How Java’s Garbage Collector Works? ) has a good, brief explanation on how garbage collectors work, not just the default Java one.

When is garbage collector used in java?

As far as I know GC is used only when JVM needs more memory, but I'm not sure about it. So, please, someone suggest an answer to this question.
Java's garbage collection algorithm is, as I understand it, pretty complex and not straightforward. Also, there is more than one algorithm available for GC, which can be chosen at VM launch time with an argument passed to the JVM.
There's a FAQ about garbage collection here: http://www.oracle.com/technetwork/java/faq-140837.html
Oracle has also published an article "Tuning Garbage Collection with the 5.0 Java[tm] Virtual Machine" which contains deep insights into garbage collection, and will probably help you understand the matter better: http://www.oracle.com/technetwork/java/gc-tuning-5-138395.html
The JVM and Java specs don't say anything about when garbage collection occurs, so it's entirely up to the JVM implementors what policies they wish to use. Each JVM should probably have some documentation about how it handles GC.
In general though, most JVMs will trigger GC when a memory allocation pushes the total amount of allocated memory above some threshold. There may be multiple levels of GC (full vs partial/generational) that occur at different thresholds.
Garbage Collection is deliberately vaguely described in the Java Language Specification, to give the JVM implementors best working conditions to provide good garbage collectors.
Hence garbage collectors and their behaviour are extremely vendor dependent.
The simplest but least fun is the one that stops the whole program when needing to clean. Others are more sophisticated and run quietly along your program cleaning up as you go more or less aggressively.
The most fun way to investigate garbage collection is to run jvisualvm in the Sun 6 JDK. It lets you see a great many internal details, many of them relevant to garbage collection.
https://visualvm.dev.java.net/monitor_tab.html (but the newest version has plenty more)
In the old days, garbage collectors were empirical in nature. At some set interval, or based on a certain condition, they would kick in and examine every object.
Modern collectors are smarter in the sense that they take into account that objects have different lifespans. Objects are divided into young generation objects and tenured generation objects.
Memory is managed in pools according to generation. Whenever the young generation memory pool is filled, a minor collection happens. The surviving objects are moved to tenured generation memory pool. When the tenured generation memory pool gets filled a major collection happens. A third generation is kept which is known as permanent generation and may contain objects defining classes and methods.
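The minor/major distinction can be illustrated with a toy model. It is a deliberate simplification and not how HotSpot works internally: the names (`GenHeap`, `minor`, `major`) are hypothetical, capacities are counted in objects, survivors are promoted after a single minor collection (there are no survivor spaces or age thresholds), and reachability is traced over the whole heap instead of using a remembered set.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.refs = []


class GenHeap:
    def __init__(self, young_capacity, tenured_capacity):
        self.young_capacity = young_capacity
        self.tenured_capacity = tenured_capacity
        self.young, self.tenured, self.roots = [], [], []
        self.minor_count = self.major_count = 0

    def _reachable(self):
        seen, stack = set(), list(self.roots)
        while stack:
            obj = stack.pop()
            if obj not in seen:
                seen.add(obj)
                stack.extend(obj.refs)
        return seen

    def alloc(self, name):
        if len(self.young) >= self.young_capacity:
            self.minor()               # the young pool is full
        obj = Obj(name)
        self.young.append(obj)
        return obj

    def minor(self):
        live = self._reachable()
        # Survivors move to the tenured pool; the young pool is emptied.
        self.tenured.extend(o for o in self.young if o in live)
        self.young = []
        self.minor_count += 1
        if len(self.tenured) >= self.tenured_capacity:
            self.major(live)           # the tenured pool is full

    def major(self, live):
        self.tenured = [o for o in self.tenured if o in live]
        self.major_count += 1


heap = GenHeap(young_capacity=2, tenured_capacity=2)
a = heap.alloc("a")
heap.roots.append(a)
b = heap.alloc("b")      # never rooted: dies young
c = heap.alloc("c")      # young is full, so a minor collection promotes "a"
heap.roots.append(c)
d = heap.alloc("d")      # never rooted
heap.roots.remove(a)     # "a" becomes garbage while sitting in tenured
e = heap.alloc("e")      # minor promotes "c"; tenured is full, so major runs

tenured_names = [o.name for o in heap.tenured]
young_names = [o.name for o in heap.young]
```

In this run "b" and "d" are reclaimed cheaply by minor collections, while "a", which died after promotion, is only reclaimed once the tenured pool fills and a major collection runs.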

Resources