Garbage collection log rotation

Will the %t parameter also perform log rotation when collecting GC logs?
-Xloggc:/data/logs/gc-%t.log

No. %t only expands to the JVM start timestamp in the log file name; it does not rotate the log. Enable rotation with -XX:+UseGCLogFileRotation.
See also Rolling garbage collector logs in java
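A minimal sketch of a rotating GC log setup on a JDK 8 HotSpot JVM (the file count and size below are illustrative, not recommendations):
-Xloggc:/data/logs/gc-%t.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=10M
With this, %t stamps the base file name once at startup, and rotation then cycles through the configured number of files of at most the given size.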

Related

GC graph shows there is a memory leak but unable to track in the dump

We have a Java microservice in our application which is connected to Postgres as well as Phoenix. We are using Spring Boot 2.x.
The problem is that during an endurance test of about 8 hours, the used heap keeps increasing even though we applied the recommended VM arguments; it looks like a memory leak. We analysed the heap dump, but the root cause is not clear to us. Can some experts help based on the results?
The VM arguments that we are actually using are:
-XX:ConcGCThreads=8 -XX:+DisableExplicitGC -XX:InitialHeapSize=536870912 -XX:InitiatingHeapOccupancyPercent=45 -XX:MaxGCPauseMillis=1000 -XX:MaxHeapFreeRatio=70 -XX:MaxHeapSize=536870912 -XX:MinHeapFreeRatio=40 -XX:ParallelGCThreads=16 -XX:+PrintAdaptiveSizePolicy -XX:+PrintGC -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:StringDeduplicationAgeThreshold=1 -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseG1GC -XX:+UseStringDeduplication
We expect the used heap to stay flat in the GC log; however, memory is not released and keeps increasing.
Heap Dump:
GC graph:
I'm not sure which tool you are using above, but I would look at the dominator hierarchy in the heap. Eclipse MAT is a good tool for analysing heap dumps; it can point you towards what is actually holding the memory, and you can then decide whether to categorise it as a leak or not. Regardless of the label you attach, if the application is going to crash after a while because it runs out of memory, then it is a problem.
This blog also discusses diagnosing this type of problem.
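If you are not already capturing a dump at the moment of failure, one option (a sketch; the dump path is illustrative) is to have the JVM write one automatically on OutOfMemoryError and open that file in MAT:
-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/data/dumps/
A dump taken at the point of exhaustion usually shows the leaking dominator more clearly than one taken mid-run.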

Filebeat - Failed to publish events caused by: read tcp x.x.x.x:36196->x.x.x.x:5045: i/o timeout

Hi, I'm running into a problem while sending logs via Filebeat to Logstash.
In short: I can't see logs in Kibana, and when tailing the Filebeat log I see a lot of these:
ERROR logstash/async.go:235 Failed to publish events caused by: read tcp x.x.x.x:36246->y.y.y.y:5045: i/o timeout (where y.y.y.y is the Logstash address and 5045 is the open Beats port)
More details:
I have ~60 machines with filebeat 6.1.1 installed and one logstash machine with logstash 6.2.3 installed.
Some Filebeat instances send their logs successfully, while others throw the error mentioned above.
The instances that don't error are sending old logs; in the Logstash debug logs I can see that some log timestamps are 2 or 3 days old.
Logstash memory usage is at 35% and CPU usage is near 75% at peaks.
In the netstat -tupn output on the Filebeat machines I can see established connections from Filebeat to Logstash.
Can someone help me find the problem?
It looks like a Logstash performance issue. CPU usage is probably too high and the heap could be larger. Increase the minimum (Xms) and maximum (Xmx) heap allocation to the total memory of the host minus 1 GB (leave 1 GB to the OS), and set them equal (Xms = Xmx).
You can also run another Logstash instance, balance the Filebeat output across the two, and see what happens.
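A minimal filebeat.yml sketch of that load balancing, assuming a second Logstash host is available (the host names are placeholders):
output.logstash:
  hosts: ["logstash-1:5045", "logstash-2:5045"]
  loadbalance: true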
More things to consider:
Performance Checklist
Check the performance of input sources and output destinations:
Logstash is only as fast as the services it connects to. Logstash can only consume and produce data as fast as its input and output destinations can!
Check system statistics:
CPU
Note whether the CPU is being heavily used. On Linux/Unix, you can run top -H to see process statistics broken out by thread, as well as total CPU statistics.
If CPU usage is high, skip forward to the section about checking the JVM heap and then read the section about tuning Logstash worker settings.
Memory
Be aware of the fact that Logstash runs on the Java VM. This means that Logstash will always use the maximum amount of memory you allocate to it.
Look for other applications that use large amounts of memory and may be causing Logstash to swap to disk. This can happen if the total memory used by applications exceeds physical memory.
I/O Utilization
Monitor disk I/O to check for disk saturation.
Disk saturation can happen if you’re using Logstash plugins (such as the file output) that may saturate your storage.
Disk saturation can also happen if you’re encountering a lot of errors that force Logstash to generate large error logs.
On Linux, you can use iostat, dstat, or something similar to monitor disk I/O.
Monitor network I/O for network saturation.
Network saturation can happen if you’re using inputs/outputs that perform a lot of network operations.
On Linux, you can use a tool like dstat or iftop to monitor your network.
Check the JVM heap:
Often times CPU utilization can go through the roof if the heap size is too low, resulting in the JVM constantly garbage collecting.
A quick way to check for this issue is to double the heap size and see if performance improves. Do not increase the heap size past the amount of physical memory. Leave at least 1GB free for the OS and other processes.
You can make more accurate measurements of the JVM heap by using either the jmap command line utility distributed with Java or by using VisualVM. For more info, see Profiling the Heap.
Always make sure to set the minimum (Xms) and maximum (Xmx) heap allocation size to the same value to prevent the heap from resizing at runtime, which is a very costly process.
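On Logstash 6.x the heap is set in config/jvm.options; a sketch with an illustrative 4 GB heap, pinned so it cannot resize at runtime:
-Xms4g
-Xmx4g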
Tune Logstash worker settings:
Begin by scaling up the number of pipeline workers by using the -w flag. This will increase the number of threads available for filters and outputs. It is safe to scale this up to a multiple of CPU cores, if need be, as the threads can become idle on I/O.
You may also tune the output batch size. For many outputs, such as the Elasticsearch output, this setting corresponds to the size of I/O operations; for the Elasticsearch output it is the size of the bulk requests sent.
More info here.
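Putting the worker and batch settings together, a logstash.yml sketch (the values are illustrative, not tuned recommendations):
pipeline.workers: 8
pipeline.batch.size: 250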

OutOfMemoryException - GC verbose confirmed a memory leak, what now?

I'm monitoring an app whose GC verbose log looks like this:
The graph draws the amount of Used Tenured after the GC runs.
As you can see, there's an obvious memory leak, but I was wondering what would be the best next step to find out which component is holding around 50MB of memory each time the GC runs.
The machine is an AIX 6.1 running an IBM's JVM 5.
Thanks
The pattern in the chart definitely looks like a typical memory leak, building up in tenured space over time. Your best bet would be a heap dump analyzer; take a heap dump, for example like this:
jmap -dump:format=b,file=dump.bin <your java process id>
and analyze the dump file, for example with Eclipse Memory Analyzer.

generate heap dump reduces dramatically after performing manual GC

This is my first post on the Stack Overflow forum. We have recently been experiencing some Java OOME issues, and using the jvisualvm, YourKit and Eclipse MAT tools we were able to identify and fix some of them.
One behavior observed during analysis is that when we create a heap dump manually using JConsole or jvisualvm, the used heap size in the JVM reduces dramatically (from 1.3 GB to 200 MB) after generating the heap dump.
Can someone please advise on this behavior? It is a boon in disguise, since whenever I see the used heap size go above 1.5 GB, I perform a manual GC and the system is back to lower used-heap numbers, resulting in no JVM restarts.
Let me know if you need any additional details.
Thanks,
Guru
When you use JConsole to create the dump file, there are two parameters: the first is the file name to generate (complete path), and the second (true by default) indicates whether to perform a GC before taking the dump. Set it to false if you don't want a full GC before dumping.
This is an old question but I found it while asking a new question of my own, so I figured I'd answer it.
When you generate a heap dump, the JVM performs a System.gc() operation before it generates the heap dump, which collects unreferenced objects and effectively reduces your heap utilization. I am actually looking for a way to disable that System GC so that I can inspect the garbage objects that are churning in my JVM.
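For reference, a similar choice exists with jmap: the live option dumps only reachable objects (effectively collecting the garbage first), while leaving it out keeps unreachable objects in the dump. The file names here are placeholders:
jmap -dump:live,format=b,file=live.hprof <pid>
jmap -dump:format=b,file=all.hprof <pid>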

Is it normal for "rsyslogd" to cost 170M memory?

One of my sites runs extremely slowly,
and I used the top command to see that rsyslogd costs 170M of memory.
Is that normal?
If not, how can I limit the amount of memory rsyslogd uses, or the frequency at which rsyslogd runs?
Yes and No.
Generally you are using file/disk queue mode. It caches writes to a buffer and writes out a block at a time, instead of an inefficient line at a time with open and close, reducing unnecessary small disk accesses.
The problem lies in the fact that it makes a 10MB buffer for every file it logs to. 20 log files means 200+MB. The number of log files can always be reduced, but it is also possible to reduce the buffer size if you are not running a RAID (big-block) or high-demand system. The documentation is here: http://www.rsyslog.com/doc/v8-stable/concepts/queues.html#disk-queues ; use "$<object>QueueMaxFileSize" to reduce the size of each buffer. 4MB can cut you down to 70MB.
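For example, in /etc/rsyslog.conf the main message queue buffer can be lowered with the legacy directive below (the 4m figure is the illustrative value from above, not a general recommendation):
$MainMsgQueueMaxFileSize 4m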
Sounds like you've got some process logging way too much info. You might just look at the logs and see who's doing all the writing and see if you can get them to stop. I've seen logs hit gigabyte sizes when some program has a recurring fault that causes it to log the same error message thousands of times a second. Seriously check the logs and just see who the heck is hammering rsyslogd.
There can be no 'frequency the "rsyslogd" runs', because it is a daemon providing logging facilities. As Robert S. Barnes indicated, you'd better check the logs to determine the application that is clogging up rsyslogd (ha!). The names of the logs are OS-specific, but chances are they are in /var/log and its subdirectories. I've seen rsyslogd consume relatively large amounts of memory, but 170MB is way too much and is not normal at all.
Shameless offtopic edit: I have serverfault and stackoverflow tabs next to each other and, honestly, I was 100% sure I was posting to serverfault until I've actually submitted the answer (that should be a hint for you) :P

Resources