Why do initial mark and concurrent mark need to scan the young generation when final remark scans the whole young generation anyway? - garbage-collection

As I understand CMS, the final remark phase has to scan the entire young generation. So what is the benefit of having initial mark and concurrent mark scan the young generation as well? Why not save that time and leave the work to the remark phase?
As we know, the concurrent-abortable-preclean phase does not always run.

In CMS, initial mark and final remark are stop-the-world (STW) pauses: the application threads stop while the GC does its work. The time spent in these phases is referred to as the pause time.
In the initial mark phase, the GC scans the roots and marks the live objects directly reachable from them. In the concurrent mark phase, the GC marks live objects while the application threads keep running, so it must record any changes made to objects that have already been traced. In the final remark phase, the GC finds and marks the objects that the concurrent phase missed.
If all of this were done in a single stop-the-world step, the pause times (latency) would be huge and the collector would be inefficient. In a nutshell, splitting the work into these phases reduces pause times.
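To make the three phases concrete, here is a minimal sketch in Python of a toy heap. It is not HotSpot code: the names `trace`, `dirty` and the object names are made up for illustration, and the real remark phase also rescans roots and the young generation, which this model leaves out.

```python
# Toy model of CMS-style marking (hypothetical names, not HotSpot code).
# The heap maps an object name to the names it references.

def trace(heap, worklist, marked):
    """Mark every object reachable from the objects in `worklist`."""
    stack = list(worklist)
    while stack:
        obj = stack.pop()
        for child in heap[obj]:
            if child not in marked:
                marked.add(child)
                stack.append(child)

heap = {"a": ["b"], "b": [], "c": [], "d": []}
roots = ["a"]          # objects referenced directly from the roots
dirty = set()          # objects modified after they were traced

# 1. Initial mark (STW, short): mark only what the roots point at.
marked = set(roots)

# 2. Concurrent mark: trace the graph while the application keeps running.
trace(heap, roots, marked)
marked_after_concurrent = set(marked)

# The application stores a reference a -> c after "a" was already traced.
# A write barrier records "a" as dirty so the change is not lost.
heap["a"].append("c")
dirty.add("a")

# 3. Final remark (STW, short): rescan only what changed.
trace(heap, dirty, marked)

garbage = set(heap) - marked
```

In this model the concurrent phase ends with only "a" and "b" marked, the remark phase picks up "c" through the dirty record, and "d" is left unmarked for the sweep. The two pauses only do the small amount of work in steps 1 and 3.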
References:
Concurrent mark
CMS Oracle Document

Related

How does CMS GC ensure that new references are not cleared during the concurrent sweep

I know that the CMS marking process marks all reachable objects.
After the final remark phase, only the reachable objects have been marked, not the unreachable ones (or is my understanding incorrect?).
Then, while the concurrent sweep is clearing all the unreachable space, a user thread may create a new object.
How does CMS handle this, or have I misunderstood something from the beginning?
Normally a new object is allocated in the young generation, which does not interfere with CMS because CMS works on the old generation. But when an object is promoted from the young generation to the old generation, CMS handles it as follows.
According to the original CMS paper:
Concurrent sweeping phase: Resume the mutators once again, and sweep concurrently over the heap,
deallocating unmarked objects. Care must be taken
not to deallocate newly-allocated objects. This can be
accomplished by allocating objects “live” (i.e., marked),
at least during this phase.
To sum up: during the concurrent sweeping phase, objects are allocated already marked.
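The "allocate live" idea from the paper can be sketched in a few lines of Python. This is only a toy model under my own assumptions: `ToyHeap`, `allocate` and `sweep` are hypothetical names, and the concurrent mutator is simulated by allocating inside the sweep call.

```python
class ToyHeap:
    """Toy old generation: maps an object name to its mark bit."""

    def __init__(self):
        self.objects = {}
        self.sweeping = False

    def allocate(self, name):
        # While a sweep is in progress, new objects are born marked
        # ("allocated live"), so the sweeper will not free them.
        self.objects[name] = self.sweeping

    def sweep(self, allocated_during_sweep=()):
        """Free unmarked objects; returns the names that were freed."""
        self.sweeping = True
        # Stand-in for the mutator running concurrently with the sweep.
        for name in allocated_during_sweep:
            self.allocate(name)
        freed = [name for name, is_marked in self.objects.items() if not is_marked]
        for name in freed:
            del self.objects[name]
        self.sweeping = False
        return freed


heap = ToyHeap()
heap.allocate("old_live")
heap.allocate("old_dead")
heap.objects["old_live"] = True      # result of the marking phases

freed = heap.sweep(allocated_during_sweep=["promoted_during_sweep"])
survivors = set(heap.objects)
```

Here only "old_dead" is freed. The object that arrived during the sweep was never reached by marking, but it survives because it was created with its mark bit already set.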

Why does the GC (garbage collector) freeze the current execution threads

I was reading Chapter 12 (Garbage Collection) of C# in a Nutshell, where the section about concurrent and background collection says:
The GC must freeze (block) your execution threads for periods during a
collection. This includes the entire period during which a Gen0 or
Gen1 collection takes place.
One thing I understand is that it is probably trying to avoid any new memory allocation during that time.
Is there any other specific reason why the GC needs to block the currently executing threads?
The MSDN documentation states that collections of generations 0 and 1 are always performed non-concurrently because they finish very quickly.
Performing a concurrent garbage collection pass will take longer than a non-concurrent one since access to data that is being processed must be synchronized between the GC thread and other threads. This adds overhead which probably outweighs the benefits of concurrency in gen 0 and 1 collections since they typically run very fast.
Beyond removing objects that are marked for collection from memory, the GC also tries to compact the heap after performing a pass. This means that objects may move in memory as a result of a GC pass. For this reason a concurrent pass requires extra overhead to synchronize data access between the GC thread and the other threads of the process.
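The point about objects moving can be shown with a small sketch in Python. It is a toy sliding compaction, not the CLR's algorithm: the heap is a list, an "address" is a list index, and `compact` and `forwarding` are names I made up.

```python
def compact(heap, live):
    """Slide live objects to the front of the heap and fix up references.

    heap: list of objects, each a dict with "name" and "refs" (list of indices)
    live: set of indices that survived marking
    Returns the compacted heap and the old-index -> new-index table.
    """
    forwarding = {}
    new_heap = []
    for old_index, obj in enumerate(heap):
        if old_index in live:
            forwarding[old_index] = len(new_heap)
            new_heap.append(obj)
    # Every reference held by a live object must be rewritten to the new address.
    for obj in new_heap:
        obj["refs"] = [forwarding[ref] for ref in obj["refs"]]
    return new_heap, forwarding


heap = [
    {"name": "a", "refs": [2]},     # index 0, points at "b"
    {"name": "dead", "refs": []},   # index 1, garbage
    {"name": "b", "refs": []},      # index 2
]
new_heap, forwarding = compact(heap, live={0, 2})
names = [obj["name"] for obj in new_heap]
```

After compaction "b" lives at index 1 instead of 2. A thread that kept using the old index while the move was in progress would read the wrong slot, which is why the threads are either frozen or have to pay for synchronization.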

Garbage collector in Node.js

According to Google, V8 achieves efficient garbage collection by employing a "stop-the-world, generational, accurate, garbage collector". Part of the claim is that V8 stops program execution while performing a garbage collection cycle.
An obvious question is: how can a GC be efficient when it pauses program execution?
I was trying to find out more about this topic, as I would like to know how the GC affects response time when possibly tens of thousands of requests per second are hitting your node.js server.
Any expert help, personal experience or links would be greatly appreciated.
Thank you
"Efficient" can mean several things. Here it probably refers to high throughput. When looking at response time, you're more interested in latency, which could indeed be worse than with alternative GC strategies.
The main alternatives to stop-the-world GCs are:
incremental GCs, which need not finish a collection cycle before temporarily handing control back to the mutator [1], and
concurrent GCs, which (virtually) operate at the same time as the mutator, interrupting it only very briefly (e.g. to scan the stack).
Both need to perform extra work to be correct in the face of concurrent modification of the heap (e.g. if a new object is created and attached to an already-scanned object, this new reference must be noticed). This reduces total throughput, i.e., it takes longer to actually clean the entire heap. The upside is that they do not (usually) interrupt the program for very long, if at all, so latency is low(er).
Although the V8 documentation still mentions a stop-the-world collector, it seems that the V8 GC has been incremental since 2011. So while it does stop program execution once in a while, it does not [2] stop the program for however long it takes to scan the entire heap. Instead it can scan for, say, a couple of milliseconds, and then let the program resume.
[1] "Mutator" is GC terminology for the program whose heap is garbage collected.
[2] At least in principle, this is probably configurable.
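To illustrate the incremental idea and the "extra work" mentioned above, here is a rough sketch in Python. It is not how V8 is implemented: `IncrementalMarker`, the `budget` parameter and the barrier in `write` are my own simplifications (the barrier is the classic "mark the target when storing into an already marked object" scheme).

```python
class IncrementalMarker:
    """Marks a toy heap in small steps, yielding to the mutator in between."""

    def __init__(self, heap, roots):
        self.heap = heap                  # name -> list of referenced names
        self.marked = set(roots)
        self.worklist = list(roots)       # marked but not yet scanned

    def _mark(self, obj):
        if obj not in self.marked:
            self.marked.add(obj)
            self.worklist.append(obj)

    def step(self, budget):
        """Scan at most `budget` objects. Returns True when marking is done."""
        while budget > 0 and self.worklist:
            obj = self.worklist.pop()
            for child in self.heap[obj]:
                self._mark(child)
            budget -= 1
        return not self.worklist

    def write(self, src, dst):
        """Mutator stores a reference src -> dst, going through a write barrier."""
        self.heap[src].append(dst)
        # Extra work: if src was already seen, dst must not be overlooked.
        if src in self.marked:
            self._mark(dst)


heap = {"root": ["a"], "a": [], "junk": []}
marker = IncrementalMarker(heap, roots=["root"])

marker.step(budget=1)          # short pause: scans "root", then yields

heap["new"] = []               # the mutator runs: creates an object...
marker.write("root", "new")    # ...and attaches it to an already scanned one

steps = 1
while not marker.step(budget=1):
    steps += 1
steps += 1
```

Each call to `step` is a short pause rather than one long one. Without the barrier in `write`, "new" would be attached to an object the marker has already finished with and would be wrongly treated as garbage. The barrier is the throughput cost paid for the shorter pauses.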

What is Gambit-C's GC mechanism?

What is Gambit-C's GC mechanism? I'm curious about this because I want to make an interactive app, and I want to know whether it can avoid bursts of GC activity.
According to these threads:
https://mercure.iro.umontreal.ca/pipermail/gambit-list/2005-December/000521.html
https://mercure.iro.umontreal.ca/pipermail/gambit-list/2008-September/002645.html
Gambit had a traditional stop-the-world GC at least until September 2008. People in those threads recommended pooling pre-allocated objects to avoid GC activity altogether. I couldn't find anything about the current implementation.
*I find it hard to agree with that advice, because I can't pool objects created by code I did not write, and eventually a full GC will happen anyway because of accumulated small, non-pooled temporary objects. But the method mentioned by #Gregory may help to avoid this problem. Still, I wish incremental GC were added to Gambit :)
According to http://dynamo.iro.umontreal.ca/~gambit/wiki/index.php/Debugging#Garbage_collection_threshold gambit has some controls:
Garbage collection threshold
Pay attention to the runtime options h (maximum heapsize in kilobytes) and l (livepercent). See the reference manual for more information. Setting livepercent to five means that garbage collection will take place at the time that there are nineteen times more memory allocated for objects that should be garbage collected, than there is memory allocated for objects that should not. The reason the livepercent option is there, is to give a way to control how sparing/generous the garbage collector should be about memory consumption, vs. how heavy/light it should be in CPU load.
You can always force garbage collection by (##gc).
If you force garbage collection after some small number of operations, or schedule it near-continuously, or set livepercent to something like 90, then presumably the GC will run frequently and not do very much on each run. This is likely to be more expensive overall, but it avoids bursts of expense. You can then fairly easily budget for that expense and keep the service fast despite it.
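The "nineteen times" figure in the wiki text follows from simple arithmetic, which can be checked with a small Python helper. This takes the wiki's description of livepercent at face value. The function name is made up, and the reference manual should be consulted for the exact semantics.

```python
def garbage_to_live_ratio(live_percent):
    """How many times more garbage than live data there is when a
    collection triggers, if live data makes up `live_percent` of the heap."""
    if not 0 < live_percent <= 100:
        raise ValueError("live_percent must be in (0, 100]")
    return (100 - live_percent) / live_percent


ratio_at_5 = garbage_to_live_ratio(5)     # 95 / 5
ratio_at_90 = garbage_to_live_ratio(90)   # 10 / 90
```

With livepercent at 5 the heap holds 19 times more garbage than live data before a collection, so collections are rare but each one has a lot to reclaim. At 90 the ratio drops to about 0.11, so collections are frequent and each one reclaims little, which is the trade-off described above.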

Difference between background and concurrent garbage collection?

I read that with .NET Framework 4 the current garbage collection implementation is replaced:
The .NET Framework 4 provides
background garbage collection. This
feature replaces concurrent garbage
collection in previous versions and
provides better performance.
This page explains how it works, but I am not sure I understood it.
In real-world applications, what is the benefit of this new GC implementation? Is it a feature that could be used to push for a transition from 3.5 or earlier to 4.0?
Here, Microsoft uses the names "concurrent" and "background" to describe two versions of the GC it uses in .NET. In the .NET world, the "background collector" is an enhancement over the "concurrent collector" in that it places fewer restrictions on what application threads can do while the collector is running.
A basic GC uses a "stop-the-world" strategy: application threads allocate memory blocks from a common heap. When the GC must run (e.g. too many blocks have been allocated and some cleanup is needed), all application (managed) threads stop. The last thread to stop runs the GC, and unblocks all the other threads when it has finished. A stop-the-world GC is simple to implement but induces pauses which can be perceptible at the user level.
Microsoft's "concurrent GC" is generational: it uses the stop-the-world strategy for only a limited part of the heap (what they call "generations 0 and 1"). Since that part remains small, pauses remain short (e.g. below 50ms), so that the user will not notice them. The rest of the heap is collected by a dedicated GC thread, which can run concurrently with the application threads (hence the name).
The concurrent GC has some limitations. Namely, there are moments when the GC thread must assume somewhat exclusive control of the heap. During such times, application threads may allocate blocks only from small thread-specific areas. Threads which have bigger needs will soon run into the main heap, which, at that time, is locked by the GC thread. The allocating thread must then block until the GC thread has finished its lock-the-heap phase. This again induces pauses. Fewer pauses than with a stop-the-world GC, and these pauses do not affect all threads. Yet pauses nonetheless.
The "background GC" is an enhanced GC in which the GC thread need not lock the heap. This removes the extra pauses described in the previous paragraph; the only pauses that remain are the limited ones that occur when the young generations are collected (what Microsoft calls "a foreground collection").
Note: there are "hidden costs" with the concurrent GC and the background GC. For these GCs to operate properly, memory accesses from application threads must be done in some very specific ways, which have a slight impact on performance. Also, the GC thread may have an adverse effect on cache memory, thus indirectly degrading performance. For a purely computational task with no need for user interaction, a stop-the-world collector may, on average, yield somewhat better performance (e.g. a twenty-hour computation will complete in nineteen hours). But this is an edge case, and in most situations the concurrent and background GCs are better.
Here is the real world explanation without slur and overinflated feeling of self-importance:
In concurrent GC you were allowed to allocate while in a GC, but you are not allowed to start another GC while in a GC. This in turn means that the maximum you are allowed to allocate while in a GC is whatever space you have left on one segment (currently 16 MB in workstation mode) minus anything that is already allocated there.
The difference in Background mode is that you are allowed to start a new GC (gen 0+1) while in a full background GC, and this allows you to even create a new segment to allocate in if necessary. In short, the blocking that could occur before when you allocated all you could in one segment won’t happen anymore.
From Tess da Man! http://blogs.msdn.com/b/tess/archive/2009/05/29/background-garbage-collection-in-clr-4-0.aspx
The primary benefit will be fewer application freezes due to garbage collection, which in itself could be considered a significant improvement. For most apps this difference will not be noticeable unless you have a HUGE number of long-lived objects in memory.
This change also makes .NET slightly more viable for building timing-sensitive apps (where response times are important). The extreme example is car airbags: you don't want your software to be busy doing garbage collection when they need to be inflated. The changes in 4.0 reduce the number and length of freezes due to GC but do not remove them entirely.

Resources