Please feel free to correct me if I am wrong. The JVM heap has two generations, old and young. A full GC performs heavy operations on the old generation, such as compacting the space and filling holes, which makes the JVM pause. From my search results, the young generation uses a lightweight GC and also involves another area called Eden. However, after searching through a lot of documents, I still have two points of confusion about GC in the young generation:
In the young generation, it seems GC does not work the way old generation GC works (i.e. compacting and filling holes)? If so, how does GC in the young generation work?
What is Eden space and how is it used in the young generation? I would appreciate a recommendation for any document suitable for a newbie.
This is the single most important diagram you have to memorize and understand:
(source: oracle.com)
It comes from Java SE 6 HotSpot[tm] Virtual Machine Garbage Collection Tuning, a one-stop place to learn everything about GC internals. But to address your immediate questions:
Allocating new objects with the new operator (almost) always happens in Eden space. Eden effectively works like a stack: when you create a new object needing N bytes, a single pointer advances by N bytes and that's it. Allocation is that fast: no searching for a free spot, no compacting, nothing else.
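To make the pointer-bump idea concrete, here is a minimal sketch of such an allocator. It is a toy model only: the class name BumpAllocator and its methods are made up for illustration and are not HotSpot's actual implementation (which, among other things, also uses per-thread allocation buffers).

```java
/** Toy model of bump-pointer allocation; names are hypothetical, not HotSpot code. */
final class BumpAllocator {
    private final int capacity; // size of the simulated Eden, in bytes
    private int top = 0;        // next free offset (the "stack pointer")

    BumpAllocator(int capacity) {
        this.capacity = capacity;
    }

    /** Returns the start offset of the new object, or -1 if Eden is full. */
    int allocate(int sizeInBytes) {
        if (sizeInBytes < 0 || sizeInBytes > capacity - top) {
            return -1; // a real JVM would trigger a minor GC at this point
        }
        int start = top;
        top += sizeInBytes; // the whole allocation is a single pointer bump
        return start;
    }

    /** Wipes the simulated Eden by moving the pointer back to 0. */
    void reset() {
        top = 0;
    }

    int used() {
        return top;
    }
}
```

With a capacity of 100, two 40-byte allocations land at offsets 0 and 40, and a third one fails because only 20 bytes remain.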
Of course this stack is not infinite; at some point we reach its end, which triggers a minor GC. By then, most of the objects are likely already garbage. So what the JVM does in a minor GC is the following:
traverse the graph of objects starting from the GC roots
copy all objects reachable from the GC roots to one of the survivor spaces (no gaps, since we know all of them and this is a single pass)
wipe out Eden space (basically just moving the stack pointer back to 0)
In subsequent minor collections there are additional steps:
one of the survivor spaces is examined as well. Live objects from both Eden and that survivor space are copied to the second survivor space. This means there is always exactly one free survivor space.
So how do objects end up in the tenured generation? First, young objects are copied to one of the survivor spaces. Then they are copied to the other one, again and again. Once a given object has jumped back and forth too many times (configurable with -XX:MaxTenuringThreshold; the default depends on the collector and JVM version, commonly 15 in HotSpot), it is promoted to tenured space.
Major GC runs when tenured space is full.
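The copy-and-age cycle described above can be sketched as a small simulation. This is an illustration under simplifying assumptions: reachability is supplied by the caller as a predicate instead of being traced from GC roots, spaces never overflow, and all names (YoungGenSim, Obj, minorGc) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Toy simulation of minor GC with two survivor spaces; not real JVM code. */
final class YoungGenSim {
    static final class Obj {
        final String name;
        int age = 0; // number of minor collections survived so far
        Obj(String name) { this.name = name; }
    }

    final List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>(); // survivor space currently holding objects
    List<Obj> to = new ArrayList<>();   // survivor space kept empty between GCs
    final List<Obj> tenured = new ArrayList<>();
    final int tenuringThreshold;

    YoungGenSim(int tenuringThreshold) {
        this.tenuringThreshold = tenuringThreshold;
    }

    Obj allocate(String name) {
        Obj o = new Obj(name);
        eden.add(o);
        return o;
    }

    /** Copies live objects out of Eden and "from", then wipes both spaces. */
    void minorGc(Predicate<Obj> isReachable) {
        copyLive(eden, isReachable);
        copyLive(from, isReachable);
        eden.clear();
        from.clear();
        List<Obj> tmp = from; // swap roles: the filled space becomes "from"
        from = to;
        to = tmp;
    }

    private void copyLive(List<Obj> space, Predicate<Obj> isReachable) {
        for (Obj o : space) {
            if (!isReachable.test(o)) {
                continue; // garbage is never copied, it is simply left behind
            }
            o.age++;
            if (o.age > tenuringThreshold) {
                tenured.add(o); // survived too many copies: promote
            } else {
                to.add(o);
            }
        }
    }
}
```

With a threshold of 2, an object that stays reachable sits in a survivor space after the first and second collection and is promoted to tenured on the third, while unreachable objects disappear without ever being touched.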
Related
As a follow-up to the question on Java 11 GC logging, I am struggling to understand what the numbers actually mean.
For example:
[2020-07-14T10:01:14.791-0400][gc ] GC(353) Pause Young (Normal) (G1 Evacuation Pause) 163M->16M(248M) 1.689ms
[2020-07-14T10:01:14.790-0400][gc,heap ] GC(353) Eden regions: 147->0(147)
[2020-07-14T10:01:14.790-0400][gc,heap ] GC(353) Survivor regions: 1->1(19)
[2020-07-14T10:01:14.790-0400][gc,heap ] GC(353) Old regions: 16->16
[2020-07-14T10:01:14.790-0400][gc,heap ] GC(353) Humongous regions: 1->1
I know that 147->0 is before/after collection, but what is the unit here and for the ones below? As I see it, the whole young generation is reduced from 163M to 16M, and it also looks like this happens almost entirely within the Eden regions. So had the objects already gone out of scope before even moving to the survivor space?
what is the unit here
A region. Region size varies based on heap size or an explicit setting (-XX:G1HeapRegionSize).
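As a rough cross-check, the region counts can be summed and compared with the heap figures on the pause line. The sketch below assumes the log format shown in the question; the class name G1RegionLog and the regular expression are my own and are not part of any JDK API.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sums the before/after region counts of one G1 pause (hypothetical helper). */
final class G1RegionLog {
    // Matches e.g. "Eden regions: 147->0(147)" and "Old regions: 16->16".
    private static final Pattern REGIONS =
            Pattern.compile("(\\w+) regions: (\\d+)->(\\d+)");

    /** Returns {before, after} region counts summed over all matching lines. */
    static int[] sumRegions(String log) {
        int before = 0;
        int after = 0;
        Matcher m = REGIONS.matcher(log);
        while (m.find()) {
            before += Integer.parseInt(m.group(2));
            after += Integer.parseInt(m.group(3));
        }
        return new int[] {before, after};
    }

    public static void main(String[] args) {
        String log = String.join("\n",
                "GC(353) Eden regions: 147->0(147)",
                "GC(353) Survivor regions: 1->1(19)",
                "GC(353) Old regions: 16->16",
                "GC(353) Humongous regions: 1->1");
        int[] r = sumRegions(log);
        // Heap occupancy from the pause line: 163M before, 16M after.
        System.out.printf("regions: %d -> %d%n", r[0], r[1]);
        System.out.printf("approx. MB per region before GC: %.2f%n", 163.0 / r[0]);
    }
}
```

For the log above this gives 165 regions before and 18 after, against 163M and 16M of heap, which is consistent with regions of about 1 MB. The match is only approximate because the log rounds to whole megabytes and regions are not necessarily full.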
it also looks like this happens almost entirely within the Eden regions - so the objects already went out of scope before even moving to the survivor space?
Most of them. A small amount might still trickle into later generations, but on the other hand those regions may also contain now-dead objects that can be collected, so it is mostly in equilibrium, with only a very small flow towards the old generation. This kind of behavior is what makes generational collectors so efficient.
Does the JVM's old-generation GC mark the whole heap or just the old generation? I ask because young generation objects can reference old generation objects.
It depends on the collector that you use.
Some collectors do not keep any structure that records the references from the young generation to the old generation. In theory, these collectors must scan the young generation to find the garbage. However, some collectors will run a minor GC instead of scanning the young generation. CMS (Concurrent Mark Sweep), for example, scans the young generation, but you can use the options
-XX:+ScavengeBeforeFullGC -XX:+CMSScavengeBeforeRemark
to collect the young generation before doing a full GC or the CMS remark phase.
But some collectors don't need to scan the young generation. They usually use a data structure to record the references. The G1 (Garbage First) collector, for example, uses the RS (remembered set) and scans it to find the references. For instance, if a young region called yr1 has a reference pointing to an object in the old region or1, the RS will add a record like:
yr1 -> or1
(The actual implementation of the RS is really complicated.)
So, in the mark cycle, G1 scans the RS to find all the references pointing into or1.
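The yr1 -> or1 bookkeeping can be sketched with a plain map from a target region to the regions that point into it. This is a deliberate simplification for illustration only: the class and method names are hypothetical, regions are identified by strings, and the real G1 remembered sets are far more involved than this.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy remembered-set table; a simplification, not G1's actual data structure. */
final class RememberedSets {
    // target region -> regions that hold at least one reference into it
    private final Map<String, Set<String>> incoming = new HashMap<>();

    /** Records that some object in fromRegion points into toRegion. */
    void recordReference(String fromRegion, String toRegion) {
        if (fromRegion.equals(toRegion)) {
            return; // only cross-region references need to be remembered
        }
        incoming.computeIfAbsent(toRegion, k -> new HashSet<>()).add(fromRegion);
    }

    /** Regions to scan when looking for references into toRegion. */
    Set<String> regionsPointingInto(String toRegion) {
        return incoming.getOrDefault(toRegion, Collections.emptySet());
    }
}
```

After recordReference("yr1", "or1"), a lookup for or1 returns only yr1, so the collector in this model would scan that one region instead of the whole heap.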
More detail:
hotspot-virtual-machine-garbage-collection-tuning-guide
memorymanagement-whitepaper
G1-One-Garbage-Collector-To-Rule-Them-All
Garbage first Collection
I see that the PS survivor space is almost full (98%) most of the time for my application. I don't know what the PS survivor space is. Is this normal?
What should be done in such scenarios?
First, see e.g. here: What is a survivor space?
Usually, there are 2 survivor spaces in the young generation part of the heap (e.g. for the HotSpot VM). They are there to allow objects to mature before they are promoted to the old generation, because it is more expensive to clean up the old generation.
Collect some statistics to see if the survivor spaces are really full most of the time. You should see that one is always empty while the other one is being populated. See e.g. this question for collecting GC stats.
Once you have the data, look for:
survivor space overflow - this occurs when the survivor space is too small to allow the objects to mature between YoungHeap collections and the objects are overflowing to the OldGeneration without having time to mature (and die before being promoted).
also, monitor the tenuring distribution with -XX:+PrintTenuringDistribution to see how fast the objects are maturing.
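If you want the survivor figures from inside the application, the standard java.lang.management API exposes them. The following is a minimal sketch: the class name SurvivorUsage and the helper percentUsed are made up, and matching pools by the substring "Survivor" is an assumption based on pool names such as "PS Survivor Space".

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

/** Prints how full each survivor pool is (hypothetical helper class). */
final class SurvivorUsage {
    /** Used/committed as a percentage, or -1 if nothing is committed. */
    static double percentUsed(long used, long committed) {
        return committed <= 0 ? -1 : 100.0 * used / committed;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            // Assumption: survivor pools have "Survivor" in their name.
            if (!pool.getName().contains("Survivor")) {
                continue;
            }
            MemoryUsage now = pool.getUsage();
            if (now == null) {
                continue; // the pool is no longer valid
            }
            System.out.printf("%s: %.1f%% of committed in use%n",
                    pool.getName(), percentUsed(now.getUsed(), now.getCommitted()));
        }
    }
}
```

Sampling this periodically gives a trend to compare against the GC log, rather than a single reading.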
UPDATE: Read the HotSpot Memory Management Whitepaper and see the section Serial Collector; it has a nice explanation of the survivor spaces:
Note: If the To space becomes full, the live objects from Eden or From that have not been
copied to it are tenured, regardless of how many young generation collections they have survived. Any
objects remaining in Eden or the From space after live objects have been copied are, by definition, not live, and they do not need to be examined.
The JVM heap is divided into two spaces, the old generation and the young generation. After a major GC, the compacting/sweeping process frees space in the old generation. I am wondering whether that freed space still belongs to the old generation, or whether it can be moved to the young generation?
In other words, I am asking whether there is fixed size/boundary for the space of old generation and space of young generation.
thanks in advance,
Lin
In Hotspot, there are options for that
-XX:+UseAdaptiveSizePolicy
-XX:+UseAdaptiveGCBoundary
However, this can still be ignored by the VM. It's part of the dark auto-tuning magic.
For simplicity, just assume that the division between old and young is fixed. The same applies to Eden and survivor.
I think there is a boundary between the generations, but the size of some generations may change when -Xmx and -Xms are not the same.
When an object is collected, the garbage collector marks the space the object used as available.
It is like deleting a file on your disk: the OS just marks the file path as inaccessible and makes the space available for the next write.
Generations are like disk partitions, except that generations can shrink or grow.
I want to monitor Full GC frequency in JMX. An MBean exposes the GC count.
(cf. http://download.oracle.com/javase/1.5.0/docs/api/java/lang/management/GarbageCollectorMXBean.html - java.lang:type=GarbageCollector,name=).
The problem is that the MBean does not distinguish between minor and full GC.
Does someone have an idea ?
Thanks.
Arnault
I'm not completely sure about this but I assume that the garbage collector that controls all the memory pools (at least the one for Old Gen) is the one used for major gc. e.g.: I have a JVM running with these 2 collectors:
PS MarkSweep
MemoryPoolNames: PS Eden Space, PS Survivor Space, PS Old Gen, PS Perm Gen
CollectionCount: 68
PS Scavenge
MemoryPoolNames: PS Eden Space, PS Survivor Space
CollectionCount: 2690
Taking this into account I would say, PS Scavenge is used for minor gc and PS MarkSweep for major gc.
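A listing like the one above can be produced with the standard GarbageCollectorMXBean API. In this sketch the class name GcBeans and the helper coversOldGen are hypothetical, and the "manages an old-generation pool" test is only the heuristic from this answer, matched against pool names such as "PS Old Gen" or "Tenured Gen"; it may not hold for every collector.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.Arrays;

/** Lists the GC beans of the running JVM (hypothetical helper class). */
final class GcBeans {
    /** Heuristic: does this collector manage an old-generation pool? */
    static boolean coversOldGen(String[] poolNames) {
        for (String pool : poolNames) {
            // Assumption: old-generation pools are named "... Old Gen" or "Tenured Gen".
            if (pool.contains("Old Gen") || pool.contains("Tenured")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            String[] pools = gc.getMemoryPoolNames();
            System.out.printf("%s pools=%s count=%d coversOldGen=%b%n",
                    gc.getName(), Arrays.toString(pools),
                    gc.getCollectionCount(), coversOldGen(pools));
        }
    }
}
```

Run against the JVM from the listing above, this would flag PS MarkSweep and not PS Scavenge; check the output on your own JVM before relying on it.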
UPDATE (based on #ajeanson comment, thanks for your feedback btw):
Indeed, the example I put in there was taken from the information exposed in the MXBeans of the JVM I was using. As you mentioned, these are GC algorithms, and the name the MXBean uses for the GC is based on the algorithm the GC is using. I've been looking for some more information about this; the article at http://download.oracle.com/javase/6/docs/technotes/guides/management/jconsole.html says the following:
The Java HotSpot VM defines two
generations: the young generation
(sometimes called the "nursery") and
the old generation. The young
generation consists of an "Eden space"
and two "survivor spaces." The VM
initially assigns all objects to the
Eden space, and most objects die
there. When it performs a minor GC,
the VM moves any remaining objects
from the Eden space to one of the
survivor spaces. The VM moves objects
that live long enough in the survivor
spaces to the "tenured" space in the
old generation. When the tenured
generation fills up, there is a full
GC that is often much slower because
it involves all live objects. The
permanent generation holds all the
reflective data of the virtual machine
itself, such as class and method
objects.
Taking a look at the collectionCount property on the MXBeans, in the case of my "PS MarkSweep" collector (the one managing the Old Generation pool), the collection count seems to increase only when I get a full GC in the verbose output. I might be wrong and maybe in some cases this Collector performs also minor GC, but I would need to run more tests to be totally sure about this.
Please, let me know if someone finds out something else or you have some more specific information about this issue as I'm quite interested in it.
It does... have a look at the names, e.g. ParNew, ConcurrentMarkSweep, etc.
Some names are for minor GC, some for full GC.
Was looking for the same information and found out after reading https://docs.oracle.com/javase/8/docs/technotes/tools/unix/java.html#BABFAFAE for JAVA 8 that some collectors can be used for both minor/full GCs (such as G1 or SerialGC) but some other collectors are for only minor or full GCs (such as ParNewGC, ConcMarkSweepGC).
And when you use the G1 for example, the two collectors used are quite explicit with their names and the one for full gc is the G1 Old Generation.
But because the MXBean does not say whether a collection was minor or full, either:
you know which GC your app uses and write your monitoring code around its collector names,
or you keep a map of all the possibilities for your chosen JVM version.
I will, in my case, just print the collector name along with the time and count value and let the person reading those data make the analysis. In my case, the data will be graphed (Grafana)
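For the "map of all possibilities" option, a lookup table could look like the sketch below. The class GcKind is hypothetical, and the collector names are ones I believe HotSpot has reported through the MXBeans for the Serial, Parallel, CMS and G1 collectors; treat them as assumptions and verify them against the names your own JVM version prints.

```java
import java.util.HashMap;
import java.util.Map;

/** Classifies a collector by its MXBean name (names are assumptions to verify). */
final class GcKind {
    enum Kind { MINOR, MAJOR, UNKNOWN }

    private static final Map<String, Kind> BY_NAME = new HashMap<>();
    static {
        BY_NAME.put("Copy", Kind.MINOR);                // Serial, young
        BY_NAME.put("MarkSweepCompact", Kind.MAJOR);    // Serial, old
        BY_NAME.put("PS Scavenge", Kind.MINOR);         // Parallel, young
        BY_NAME.put("PS MarkSweep", Kind.MAJOR);        // Parallel, old
        BY_NAME.put("ParNew", Kind.MINOR);              // young collector used with CMS
        BY_NAME.put("ConcurrentMarkSweep", Kind.MAJOR); // CMS, old
        BY_NAME.put("G1 Young Generation", Kind.MINOR);
        BY_NAME.put("G1 Old Generation", Kind.MAJOR);
    }

    /** Returns UNKNOWN for any name that is not in the table. */
    static Kind classify(String collectorName) {
        return BY_NAME.getOrDefault(collectorName, Kind.UNKNOWN);
    }
}
```

Returning UNKNOWN instead of guessing keeps the monitoring honest when a JVM reports a collector name the table does not cover.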
Not sure if the newest JDK improves this...