Young generation is tiny despite NewRatio=2 - garbage-collection

At some point my application starts to create a lot of temporary arrays; this is expected behaviour. I want to give a lot of space to the Young Generation so that those temporary arrays don't get promoted to the Tenured Generation.
JVM options:
java -Xmx240g -XX:+UseConcMarkSweepGC -XX:NewRatio=2 -XX:+PrintGCTimeStamps -verbose:gc -XX:+PrintGCDetails
At some point my GC log starts looking like this:
800.020: [GC 800.020: [ParNew: 559514K->257K(629120K), 0.1486790 secs] 95407039K->94847783K(158690816K), 0.1487540 secs] [Times: user=3.34 sys=0.05, real=0.15 secs]
800.202: [GC 800.202: [ParNew: 559489K->246K(629120K), 0.1665870 secs] 95407015K->94847777K(158690816K), 0.1666610 secs] [Times: user=3.79 sys=0.00, real=0.17 secs]
800.402: [GC 800.402: [ParNew: 559478K->257K(629120K), 0.1536610 secs] 95407009K->94847788K(158690816K), 0.1537290 secs] [Times: user=3.48 sys=0.02, real=0.15 secs]
I'm very confused by the fact that the Young Generation size is 629120K (=629M), while I expect it to be approximately 1/2 (because NewRatio=2) of the Tenured Generation size, which is 158690816K (=158G). The Tenured generation size corresponds to NewRatio and Xmx as expected, i.e. it is 2/3 of the total heap size.
JVM version:
java version "1.7.0_21"
Java(TM) SE Runtime Environment (build 1.7.0_21-b11)
Java HotSpot(TM) 64-Bit Server VM (build 23.21-b01, mixed mode)
Update:
I believe that at this point (800 seconds of running time) the program is at its peak temporary-array usage.
If the program never goes beyond a 629M Young generation size, does that mean I should increase NewRatio? Let's assume I'm planning to give the program more workload, and I expect the ratio of temporary-array volume to permanent-array volume to stay the same.
I previously ran the program with NewRatio=8, and the GC log consists mostly of lines like these:
800.004: [GC 800.004: [ParNew: 186594K->242K(209664K), 0.1059450 secs] 95345881K->95159529K(126655428K), 0.1060110 secs] [Times: user=2.41 sys=0.00, real=0.10 secs]
800.122: [GC 800.122: [ParNew: 186610K->221K(209664K), 0.1073210 secs] 95345897K->95159522K(126655428K), 0.1073900 secs] [Times: user=2.37 sys=0.07, real=0.11 secs]
800.240: [GC 800.240: [ParNew: 186589K->221K(209664K), 0.1026210 secs] 95345890K->95159524K(126655428K), 0.1026870 secs] [Times: user=2.34 sys=0.00, real=0.10 secs]
800.357: [GC 800.357: [ParNew: 186589K->218K(209664K), 0.1043130 secs] 95345892K->95159527K(126655428K), 0.1043810 secs] [Times: user=2.30 sys=0.07, real=0.10 secs]
This makes me think that NewRatio does currently have an impact on the Young generation size, but it shouldn't, because the Young generation is currently far below 1/9 of the heap size.
Update 2: It's a huge scientific calculation and my solution needs up to 240GB of memory. It is not a memory leak, and the algorithm is the best I was able to come up with.

Please, could you try running with AdaptiveSizePolicy disabled, i.e. with -XX:-UseAdaptiveSizePolicy?
This should allow you to preset the size of each heap area and will disable dynamic resizing of the generations at runtime.

First of all, check What is the meaning of the -XX:NewRatio and -XX:OldSize JVM flags?
The NewRatio is the ratio of the old generation to the young generation (e.g. a value of 2 means the max size of old will be twice the max size of young, i.e. young can get up to 1/3 of the heap).
and you will see your young generation size is correct.
I also recommend checking Are ratios between spaces/generations in the Java Heap constant? and setting static sizes from the very beginning using
-Xms240g -Xmx240g -XX:NewSize=120g -XX:MaxNewSize=120g -XX:-UseAdaptiveSizePolicy
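To double-check what sizes the VM actually settled on (independently of the flags), here is a minimal sketch of my own using the standard java.lang.management API; it is an illustration, not part of the original answer, and the pool names mentioned in the comment ("Par Eden Space", "Par Survivor Space", "CMS Old Gen") are what HotSpot typically reports for ParNew+CMS, so verify them on your setup:

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

public class PrintPools {
    public static void main(String[] args) {
        // Iterate over all memory pools the running VM exposes, e.g.
        // "Par Eden Space", "Par Survivor Space", "CMS Old Gen" for ParNew+CMS.
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage u = pool.getUsage();
            System.out.printf("%-20s type=%-8s committed=%6dM max=%6dM%n",
                    pool.getName(), pool.getType(),
                    u.getCommitted() / (1024 * 1024),
                    u.getMax() < 0 ? -1 : u.getMax() / (1024 * 1024));
        }
    }
}

Calling this (or the equivalent over JMX) from inside the application shows directly whether the Eden/Survivor pools grew to the expected fraction of the heap once the flags above are applied.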

Related

How do I measure fragmentation in Hotspot's Metaspace?

I'm looking into debugging an "OutOfMemoryError: Metaspace" error in my application. Right before the OOME I see the following in the gc logs:
{Heap before GC invocations=6104 (full 39):
par new generation total 943744K, used 0K [...)
eden space 838912K, 0% used [...)
from space 104832K, 0% used [...)
to space 104832K, 0% used [...)
concurrent mark-sweep generation total 2097152K, used 624109K [...)
Metaspace used 352638K, capacity 487488K, committed 786432K, reserved 1775616K
class space used 36291K, capacity 40194K, committed 59988K, reserved 1048576K
2015-08-11T20:34:13.303+0000: 105892.129: [Full GC (Last ditch collection) 105892.129: [CMS: 624109K->623387K(2097152K), 3.4208207 secs] 624109K->623387K(3040896K), [Metaspace: 352638K->352638K(1775616K)], 3.4215100 secs] [Times: user=3.42 sys=0.00, real=3.42 secs]
Heap after GC invocations=6105 (full 40):
par new generation total 943744K, used 0K [...)
eden space 838912K, 0% used [...)
from space 104832K, 0% used [...)
to space 104832K, 0% used [...)
concurrent mark-sweep generation total 2097152K, used 623387K [...)
Metaspace used 352638K, capacity 487488K, committed 786432K, reserved 1775616K
class space used 36291K, capacity 40194K, committed 59988K, reserved 1048576K
}
From what I can see, the Metaspace capacity isn't even nearing the committed size (in this case, -XX:MaxMetaspaceSize=768m). So I suspect that fragmentation of the Metaspace is causing the allocator to fail to find a new chunk for the new classloader.
I'm aware of -XX:PrintFLSStatistics but that only covers CMS, not native memory.
So my question is: is there a debugging help similar to PrintFLSStatistics available for Hotspot's native memory?
This is using Java HotSpot(TM) 64-Bit Server VM (25.45-b02) for linux-amd64 JRE (1.8.0_45-b14).
I've just looked into the implementation of the Metaspace in HotSpot. The Metaspace is divided into chunks and managed using a freelist. So fragmentation is indeed a possible reason for your problem.
I've also looked through the flags of the HotSpot VM (-XX:+UnlockDiagnosticVMOptions -XX:+PrintFlagsFinal); there is no flag for this in the release version.
However, there is a dump() method in the Metaspace class which seems to be triggered by setting the -XX:+TraceMetadataChunkAllocation flag. There is also -XX:+TraceMetavirtualspaceAllocation, which sounds like it could be of interest to you. However, those are "develop" flags, meaning you need a debug build of the VM.
#loonytune's answer works just fine, but I want to provide a little bit more detail:
For context, "The Metaspace" is a collection of metaspaces, one per class loader. Each metaspace holds a list of VirtualSpace objects out of which Metachunks of different sizes are allocated. These chunks hold MetaBlocks, which are the real containers for metadata.
I need a debug JRE to run those flags, so following this tutorial I checked out the openjdk repository (I renamed the checkout to vm because the build scripts seem to take issue with the jdk8 folder name), ran
~/vm$ bash configure --enable-debug
~/vm$ DISABLE_HOTSPOT_OS_VERSION_CHECK=ok make all
and used the resulting vm/build/linux-x86_64-normal-server-fastdebug/images/j2re-image as my java runtime.
The log lines generated look like this:
VirtualSpaceNode::take_from_committed() not available 8192 words space # 0x00007fee4cdb9350 128K, 94% used [0x00007fedf5e22000, 0x00007fedf5f13000, 0x00007fedf5f22000, 0x00007fedf6022000)
This indicates that the current VirtualSpace is full and can't hold another chunk of the requested 8192-word size, which causes this metaspace to switch to another VirtualSpace.
ChunkManager::chunk_freelist_allocate: 0x00007fee4c0c39f8 chunk 0x00007fee15397400 size 128 count 0 Free chunk total 7680 count 15
ChunkManager::chunk_freelist_allocate: 0x00007fee4c0c39f8 chunk 0x00007fedf6021000 size 512 count 14 Free chunk total 7168 count 14
This happens when a new Metachunk is allocated; in the first case it is 128 words big and uses up the list of small chunks. As you can see, the next request goes to the medium-sized chunks (of size 512) and leaves 14 chunks free in total. Once the free total reaches 0, a Full GC is needed to increase the total Metaspace size.
Note that specifying -verbose gets you even more output from the above two flags.
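For anyone who wants to see those log lines without waiting for a production incident, the following is a hypothetical reproducer of my own (not taken from the original application): it keeps defining classes in fresh, throwaway class loaders, which is exactly the kind of workload that allocates and releases metaspace chunks per loader. The classpath URL and class name are placeholders you would have to adapt.

import java.net.URL;
import java.net.URLClassLoader;

public class MetaspaceChurn {
    public static void main(String[] args) throws Exception {
        // Placeholder: point this at a jar or directory that contains com.example.SomeClass.
        URL[] classpath = { new URL("file:/tmp/app-classes/") };
        for (int i = 0; i < 100_000; i++) {
            // A fresh loader with a null parent has to define the class again,
            // so every iteration allocates new metaspace chunks for that loader.
            try (URLClassLoader loader = new URLClassLoader(classpath, null)) {
                Class<?> clazz = loader.loadClass("com.example.SomeClass");
                clazz.newInstance();
            }
        }
    }
}

Run against the fastdebug build with the TraceMetadataChunkAllocation flag described above, this produces a steady stream of chunk_freelist_allocate lines and makes chunk reuse (or the lack of it) easy to observe.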

Why is the Java G1 gc spending so much time scanning RS?

I'm currently evaluating the G1 garbage collector and how it performs for our application. Looking at the gc-log, I noticed a lot of collections have very long "Scan RS" phases:
7968.869: [GC pause (mixed), 10.27831700 secs]
[Parallel Time: 10080.8 ms]
(...)
[Scan RS (ms): 4030.4 4034.1 4032.0 4032.0
Avg: 4032.1, Min: 4030.4, Max: 4034.1, Diff: 3.7]
[Object Copy (ms): 6038.5 6033.3 6036.7 6037.1
Avg: 6036.4, Min: 6033.3, Max: 6038.5, Diff: 5.2]
(...)
[Eden: 19680M(19680M)->0B(20512M) Survivors: 2688M->2624M Heap:
75331M(111904M)->51633M(115744M)]
[Times: user=40.49 sys=0.02, real=10.28 secs]
All the log rows I removed show runtimes in the single-digit-millisecond range.
I think most of the time should be spent in copying, right? What could be the reason Scan RS takes so long? Any ideas on how to tweak the G1 settings?
The JVM was started with
-Xms40960M -Xmx128G -XX:+UseG1GC -verbose:gc -XX:+PrintGCDetails -Xloggc:gc.log
Edit: Oh, I forgot... I'm using Java 7u25
Update:
I noticed two other weird things:
16187.740: [GC concurrent-mark-start]
16203.934: [GC pause (young), 2.89871800 secs]
(...)
16218.455: [GC pause (young), 4.61375100 secs]
(...)
16237.441: [GC pause (young), 4.46131800 secs]
(...)
16257.785: [GC pause (young), 4.73922600 secs]
(...)
16275.417: [GC pause (young), 3.87863400 secs]
(...)
16291.505: [GC pause (young), 3.72626400 secs]
(...)
16307.824: [GC pause (young), 3.72921700 secs]
(...)
16325.851: [GC pause (young), 3.91060700 secs]
(...)
16354.600: [GC pause (young), 5.61306000 secs]
(...)
16393.069: [GC pause (young), 17.50453200 secs]
(...)
16414.590: [GC concurrent-mark-end, 226.8497670 sec]
The concurrent GC run is continuing while parallel runs are being performed. I'm not sure if that's intended, but it kinda seems wrong to me. Admittedly, this is an extreme example, but I do see this behaviour all over my log.
Another thing is that my JVM process grew to 160g. Considering a heap-size of 128g, that's a rather large overhead. Is this to be expected, or is G1 leaking memory? Any ideas on how to find that out?
PS: I'm not really sure if I should've made new questions for the updates... if any of you think that this would be beneficial, tell me ;)
Update 2:
I guess the G1 really may be leaking memory: http://printfdebugger.tumblr.com/post/19142660766/how-i-learned-to-love-cms-and-had-my-heart-broken-by-g1
As this is a deal-breaker for now, I'm not going to spend more time on playing with this.
Things I haven't yet tried are configuring the region size (-XX:G1HeapRegionSize) and lowering the heap occupancy threshold (-XX:InitiatingHeapOccupancyPercent).
Let's see.
1 - First clues
It looks like your GC was configured to use 4 threads (or you have 4 vCPUs, but that is unlikely given the size of the heap). That is quite low for a 128GB heap; I was expecting more.
The GC events seem to happen at 25+ second intervals. However, the log extract you gave does not mention the number of regions that were processed.
=> By any chance, did you specify pause time goals to G1GC (-XX:MaxGCPauseMillis=N) ?
2 - Long Scan RSet time
"Scan RSet" means the time the GC spent in scanning the Remembered Sets. Remembered Set of a region contains cards that correspond to the references pointing into that region. This phase scans those cards looking for the references pointing into all the regions of the collection set.
So here, we have one more question :
=> How many regions were processed during that particular collection (i.e. how big is the CSet)
3 - Long Object Copy time
The copy time, as the name suggests, is the time spent by each worker thread copying live objects from the regions in the Collection Set to other regions.
Such a long copy time suggests that a lot of regions were processed and that you may want to reduce that number. It could also suggest swapping, but that is very unlikely given the user/real values at the end of your log.
4 - Now what to do
You should check in the GC log the number of regions that were processed. Correlate this number with your region size and deduce the amount of memory that was scanned.
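As a rough illustration of that correlation (my numbers, not from your log except the Eden size): with a 128 GB heap and no explicit -XX:G1HeapRegionSize, G1 picks its maximum region size of 32 MB, so the ~19.2 GB Eden shown above alone already corresponds to roughly 600 regions, plus survivor and old regions in a mixed pause.

public class G1RegionMath {
    public static void main(String[] args) {
        long regionSizeMb = 32;   // assumed: G1's maximum region size, used for heaps this large
        long edenMb = 19680;      // Eden size taken from the GC log above
        long regions = edenMb / regionSizeMb;
        System.out.println("Eden regions in the collection set: ~" + regions);
        System.out.println("Memory scanned/copied: ~" + (regions * regionSizeMb) + " MB");
    }
}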
You can then set a smaller pause time goal (for instance, to 500ms using -XX:MaxGCPauseMillis=500). This will
increase the number of GC events,
reduce the amount of freed memory per GC cycle
reduce the STW pauses during YGC
Hope that helps !
Sources :
https://blogs.oracle.com/poonam/entry/understanding_g1_gc_logs
http://www.oracle.com/webfolder/technetwork/tutorials/obe/java/G1GettingStarted/index.html
http://jvm-options.tech.xebia.fr/

Understanding garbage collection data

All,
I am using the following VM switches while running my program. The program has a known memory leak.
Initially the heap gets full, and I understand the reason for the OutOfMemoryError. But later (at 124.283) a Full GC reclaims some space. So why am I still getting OutOfMemoryError?
Thanks in advance
VM Arguments
-XX:+PrintGCTimeStamps -XX:+PrintGCDetails -Xms32m -Xmx32m
Here is the GC data
Heap full
123.540: [Full GC 123.540: [Tenured: 21888K->21887K(21888K), 0.1215501 secs] 31679K->31679K(31680K), [Perm : 2054K->2054K(12288K)], 0.1216037 secs] [Times: user=0.13 sys=0.00, real=0.13 secs]
123.665: [Full GC 123.665: [Tenured: 21887K->21887K(21888K), 0.1504579 secs] 31679K->31575K(31680K), [Perm : 2054K->2054K(12288K)], 0.1505627 secs] [Times: user=0.16 sys=0.00, real=0.16 secs]
Exception in thread "RMI TCP Connection(idle)" java.lang.OutOfMemoryError: Java heap space
Exception in thread "AWT-EventQueue-0" java.lang.NullPointerException: disposed component
at sun.java2d.windows.GDIWindowSurfaceData.initOps(Native Method)
at sun.java2d.windows.GDIWindowSurfaceData.(Unknown Source)
at sun.java2d.windows.GDIWindowSurfaceData.createData(Unknown Source)
at sun.awt.Win32GraphicsConfig.createSurfaceData(Unknown Source)
at sun.java2d.ScreenUpdateManager.createScreenSurface(Unknown Source)
at sun.java2d.d3d.D3DScreenUpdateManager.createScreenSurface(Unknown Source)
at sun.awt.windows.WComponentPeer.replaceSurfaceData(Unknown Source)
at sun.awt.windows.WComponentPeer.replaceSurfaceData(Unknown Source)
at sun.awt.windows.WComponentPeer$2.run(Unknown Source)
at javax.swing.RepaintManager.seqPaintDirtyRegions(Unknown Source)
at javax.swing.SystemEventQueueUtilities$ComponentWorkRequest.run(Unknown Source)
at java.awt.event.InvocationEvent.dispatch(Unknown Source)
at java.awt.EventQueue.dispatchEvent(Unknown Source)
at java.awt.EventDispatchThread.pumpOneEventForFilters(Unknown Source)
at java.awt.EventDispatchThread.pumpEventsForFilter(Unknown Source)
at java.awt.EventDispatchThread.pumpEventsForHierarchy(Unknown Source)
at java.awt.EventDispatchThread.pumpEvents(Unknown Source)
at java.awt.EventDispatchThread.pumpEvents(Unknown Source)
at java.awt.EventDispatchThread.run(Unknown Source)
123.829: [Full GC 123.829: [Tenured: 21887K->21887K(21888K), 0.1306163 secs] 31679K->30695K(31680K), [Perm : 2056K->2056K(12288K)], 0.1306809 secs] [Times: user=0.13 sys=0.00, real=0.13 secs]
Exception in thread "RMI TCP Connection(idle)" java.lang.OutOfMemoryError: Java heap space
Exception in thread "RMI TCP Connection(idle)" java.lang.OutOfMemoryError: Java heap space
124.040: [Full GC 124.040: [Tenured: 21888K->21887K(21888K), 0.1259948 secs] 31680K->27400K(31680K), [Perm : 2057K->2057K(12288K)], 0.1260596 secs] [Times: user=0.13 sys=0.00, real=0.13 secs]
Heap memory is reclaimed in the following Full GC
124.283: [Full GC 124.283: [Tenured: 21888K->15215K(21888K), 0.0945810 secs] 31680K->15215K(31680K), [Perm : 2057K->2055K(12288K)], 0.0946383 secs] [Times: user=0.09 sys=0.00, real=0.09 secs]
124.829: [GC 124.829: [DefNew: 8704K->988K(9792K), 0.0079326 secs] 23919K->16203K(31680K), 0.0079854 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
Exception in thread "RMI TCP Connection(idle)" java.lang.OutOfMemoryError: Java heap space
Exception in thread "RMI TCP Connection(idle)" java.lang.OutOfMemoryError: Java heap space
Exception in thread "RMI TCP Connection(idle)" java.lang.OutOfMemoryError: Java heap space
Exception in thread "RMI TCP Connection(idle)" java.lang.OutOfMemoryError: Java heap space
Exception in thread "RMI TCP Connection(idle)" java.lang.OutOfMemoryError: Java heap space
Exception in thread "RMI TCP Connection(idle)" java.lang.OutOfMemoryError: Java heap space
Exception in thread "RMI TCP Connection(idle)" java.lang.OutOfMemoryError: Java heap space
Exception in thread "RMI TCP Connection(idle)" java.lang.OutOfMemoryError: Java heap space
Exception in thread "RMI TCP Connection(idle)" java.lang.OutOfMemoryError: Java heap space

Why does it take three Full GC to garbage collect permgen?

What are the reasons why it would take three successive "Full GC" before perm gen is garbage collected?
The first GC got the heap down from 2.4gb to 761mb, but failed to substantially collect perm gen, though it did appear to recover 6K.
We'll ignore the young generation collection.
The second Full GC does very little for the heap, as expected since the server was lightly loaded at the time. The odd thing is that it did NOTHING for perm gen.
The third Full GC finally takes perm gen from its max of 524mb down to 141mb.
Here's the unedited snippet from the GC logs:
2012-12-07T19:46:40.731-0600: [Full GC [CMS: 2474402K->761372K(2804992K), 4.6386780 secs] 2606228K->761372K(3111680K), [CMS Perm : 524286K->524280K(524288K)], 4.6387670 secs] [Times: user=4.68 sys=0.00, real=4.63 secs]
2012-12-07T19:46:45.374-0600: [GC [ParNew
Desired survivor size 17432576 bytes, new threshold 6 (max 6)
- age 1: 65976 bytes, 65976 total
: 1552K->8827K(306688K), 0.0199700 secs] 762925K->770200K(3111680K), 0.0200340 secs] [Times: user=0.08 sys=0.00, real=0.02 secs]
2012-12-07T19:46:45.395-0600: [Full GC [CMS: 761372K->752917K(2804992K), 3.7379280 secs] 770212K->752917K(3111680K), [CMS Perm : 524287K->524287K(524288K)], 3.7380180 secs] [Times: user=3.77 sys=0.00, real=3.74 secs]
2012-12-07T19:46:49.135-0600: [Full GC [CMS: 752917K->693347K(2804992K), 3.2845870 secs] 752917K->693347K(3111680K), [CMS Perm : 524287K->141759K(524288K)], 3.2846780 secs] [Times: user=3.32 sys=0.00, real=3.29 secs]
System info and GC flags:
Java 1.7.0_07, 64-Bit Server, Ubuntu 12.04
-Xms3g -Xmx3g -XX:PermSize=512m -XX:MaxPermSize=512m
-XX:+UseConcMarkSweepGC
EDIT: we have two app servers; the second one exhibited slightly different behavior: there were only two Full GC entries.
2012-12-07T20:36:31.097-0600: [Full GC [CMS: 2307424K->753901K(2804992K), 5.0783720 secs] 2394279K->753901K(3111680K), [CMS Perm : 524280K->524121K(524288K)], 5.0784780 secs] [Times: user=5.12 sys=0.00, real=5.08 secs]
2012-12-07T20:36:36.178-0600: [Full GC [CMS: 753901K->695698K(2804992K), 3.4488560 secs] 755266K->695698K(3111680K), [CMS Perm : 524121K->140568K(524288K)], 3.4489690 secs] [Times: user=3.48 sys=0.00, real=3.45 secs]
So it looks like the young generation collection was significant. Perhaps, in our particular setup, it requires two successive Full GCs with no other GC (young generation GC) in between to garbage collect perm gen. I've dug a lot, but I haven't found any discussion of this behavior.
It would not astonish me if the collections of the heap and of perm gen did not influence each other, especially since the heap collection is already a complex operation by itself; that would explain why the perm gen is only collected the second time. I'm mainly guessing, though.
It might be interesting to get more details on what's actually collected in the perm gen (unloaded classes, strings?). -XX:+PrintGCDetails would help, and maybe -verbose:class.

Full GC, PSPermGen not cleaned

My Java EE server had been working nicely; then, within 10 minutes, full GCs started to occur more and more frequently, until finally the application was stopped all the time due to GC. PSPermGen was not released.
My JVM settings are:
set JAVA_OPTS=%JAVA_OPTS% -Xms4g -Xmx4g -XX:MaxPermSize=512m -XX:NewRatio=3
2012-09-05T14:03:10.394+0100: 94287.753: [Full GC [PSYoungGen: 843584K->0K(947200K)] [ParOldGen: 3077347K->3117145K(3145728K)] 3920931K->3117145K(4092928K) [PSPermGen: 181533K->181521K(186944K)], 10.9564398 secs] [Times: user=286.14 sys=0.19, real=10.97 secs]
Total time for which application threads were stopped: 10.9678339 seconds
Application time: 0.0023102 seconds
Total time for which application threads were stopped: 0.0088344 seconds
Application time: 0.3052301 seconds
Total time for which application threads were stopped: 0.0085634 seconds
Application time: 0.1125068 seconds
2012-09-05T14:03:21.798+0100: 94299.158: [Full GC [PSYoungGen: 842024K->22409K(947200K)] [ParOldGen: 3117145K->3145232K(3145728K)] 3959170K->3167641K(4092928K) [PSPermGen: 181521K->181521K(186752K)], 11.4649901 secs] [Times: user=372.58 sys=0.11, real=11.47 secs]
Total time for which application threads were stopped: 11.4757898 seconds
Application time: 0.0706553 seconds
Total time for which application threads were stopped: 0.0102510 seconds
Application time: 0.3951514 seconds
2012-09-05T14:03:33.748+0100: 94311.110: [Full GC [PSYoungGen: 843584K->34503K(947200K)] [ParOldGen: 3145232K->3141687K(3145728K)] 3988816K->3176190K(4092928K) [PSPermGen: 181521K->181521K(186112K)], 10.9699419 secs] [Times: user=369.43 sys=0.14, real=10.97 secs]
Total time for which application threads were stopped: 10.9806713 seconds
Application time: 0.0027075 seconds
Any clue what the reason could be? Is it a memory leak, or can the JVM be tuned better?
Well, from the log a few things are clear. Either the system genuinely needs so much memory that it is unable to clear the tenured generation, resulting in a consistent consumption of around 3.1GB. Only you can answer that part.
Or there is a memory leak. A leak may or may not be the cause: the old gen used space is constant at around 3.145GB, and with a memory leak even this value usually keeps increasing.
More of the log would probably help. If that figure increases with time, then rest assured - it is a leak.
If it stays constant, then the application is genuinely short of the memory it needs.
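To make that check concrete, here is a small sketch of my own (the regex matches the ParOldGen pattern in the excerpt above and may need adjusting for other log formats) that prints the old-gen occupancy left after every Full GC, so the trend is easy to see:

import java.io.BufferedReader;
import java.io.FileReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OldGenTrend {
    // Matches e.g. "[ParOldGen: 3077347K->3117145K(3145728K)]" and captures
    // the occupancy after the collection (3117145) and the capacity (3145728).
    private static final Pattern OLD_GEN =
            Pattern.compile("\\[ParOldGen: \\d+K->(\\d+)K\\((\\d+)K\\)\\]");

    public static void main(String[] args) throws Exception {
        try (BufferedReader in = new BufferedReader(new FileReader(args[0]))) {
            String line;
            while ((line = in.readLine()) != null) {
                Matcher m = OLD_GEN.matcher(line);
                if (m.find()) {
                    long usedAfter = Long.parseLong(m.group(1));
                    long capacity = Long.parseLong(m.group(2));
                    System.out.printf("old gen after Full GC: %dK of %dK (%.1f%%)%n",
                            usedAfter, capacity, 100.0 * usedAfter / capacity);
                }
            }
        }
    }
}

If the printed "after" values keep climbing from one Full GC to the next, that points to a leak; if they stay flat around 3.1GB, the application simply needs more old-gen space than -Xmx4g with NewRatio=3 leaves it.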
