JMX JConsole not showing CMS LastGCInfo - garbage-collection

I was looking at the garbage collection statistics and noticed that LastGcInfo is empty for the ConcurrentMarkSweep collector, even though CollectionCount and CollectionTime are populated for the same bean.
Could someone please help me understand why?
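One way to reproduce the observation outside JConsole is to query the same MBeans in-process. This is only a sketch for inspecting the attributes in question; note that, per the `com.sun.management` javadoc, `getLastGcInfo()` returns null until that collector has completed at least one collection:

```java
import java.lang.management.ManagementFactory;

import com.sun.management.GcInfo;

public class LastGcInfoCheck {
    public static void main(String[] args) {
        for (java.lang.management.GarbageCollectorMXBean gc
                : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.print(gc.getName()
                    + " count=" + gc.getCollectionCount()
                    + " time=" + gc.getCollectionTime() + "ms");
            // The platform interface has no LastGcInfo; it lives on the
            // HotSpot-specific com.sun.management subinterface.
            if (gc instanceof com.sun.management.GarbageCollectorMXBean) {
                // null until this collector has completed at least one collection
                GcInfo last =
                        ((com.sun.management.GarbageCollectorMXBean) gc).getLastGcInfo();
                System.out.print(" lastGcInfo="
                        + (last == null ? "null" : last.getDuration() + "ms"));
            }
            System.out.println();
        }
    }
}
```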

Related

JMX java.lang:<GC>,type=GarbageCollector interpretation

I have set up collection of Java GC data points into an Elastic cluster from Jolokia/JMX on a JBoss server, but browsing the attribute descriptions in Jolokia's JMX console doesn't say much about how to interpret the collected data: neither the units of measurement nor much about the semantics. :)
Could someone please point me to documentation explaining these Java GC data points, e.g. the semantics of CollectionCount and CollectionTime, and the units of LastGcInfo's duration, startTime, and endTime?
TIA!
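For what it's worth, the javadoc for `java.lang.management.GarbageCollectorMXBean` and `com.sun.management.GcInfo` documents the units: CollectionCount is the cumulative number of collections since JVM start, CollectionTime the cumulative elapsed collection time in milliseconds, and LastGcInfo's duration, startTime, and endTime are also milliseconds, with startTime/endTime measured since JVM start. A minimal sketch reading them locally:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcUnits {
    public static void main(String[] args) {
        // RuntimeMXBean.getStartTime() is JVM start time in epoch milliseconds
        long jvmStartMillis = ManagementFactory.getRuntimeMXBean().getStartTime();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // CollectionCount: cumulative collections since JVM start (-1 if unsupported)
            // CollectionTime:  cumulative elapsed collection time, in milliseconds
            System.out.printf("%s: %d collections, %d ms total%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
        // LastGcInfo's startTime/endTime are milliseconds since JVM start, so:
        //   wallClockStart = jvmStartMillis + lastGcInfo.getStartTime()
        System.out.println("JVM started at epoch ms " + jvmStartMillis);
    }
}
```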

spark.streaming.kafka.consumer.cache.enabled property working/ affect on performance of Kafka Consumers

I came across the setting spark.streaming.kafka.consumer.cache.enabled=false in our application's properties, and surprisingly no one on my team knows how it helps us achieve better performance. It was added on the advice of Cloudera support, and I couldn't find any detailed explanation of this property in the Spark docs. Can anyone please help me understand how this configuration affects Kafka consumer performance?
Looking at the source code, you can see that it has a useCache: Boolean flag and appears to put internal KafkaConsumer objects into a cache keyed by group id and topic+partition assignment.
I don't have any idea why not caching consumers would be "more performant", but I would guess that not having them cached allows Kafka consumer group rebalancing to operate "better".
If you think this property is missing necessary documentation, I would suggest opening a JIRA.
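For reference, the property belongs to the spark-streaming-kafka-0-10 integration and can be passed at submit time rather than hard-coded in the app; a sketch (the jar and class names below are placeholders, not from the question):

```shell
# Disable the executor-side KafkaConsumer cache (spark-streaming-kafka-0-10).
# --class and the jar are hypothetical placeholders for your application.
spark-submit \
  --conf spark.streaming.kafka.consumer.cache.enabled=false \
  --class com.example.MyStreamingApp \
  my-streaming-app.jar
```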

Speed of hinted handoff in Cassandra

Given a particular set of configurations and a particular amount of data to be written to a node, can we predict how long the hinted handoff will take to finish?
In my case, as soon as the node came up, I checked with the 'nodetool statushandoff' command that the hinted handoff had started running. However, it seems to run endlessly. Is there any way, by looking at the configuration, the amount of missing data, etc., to know that after a certain amount of time the missing data will have been written to the node?
You should be able to track the progress with the hint metrics. Have a look at this page: http://cassandra.apache.org/doc/latest/operating/metrics.html#hintedhandoff-metrics
TotalHintsInProgress tells you how big the backlog is, and TotalHints tells you the number of hints written on the node since startup. By tracking these two metrics you should be able to give an estimate (good or bad) of how far along it is.
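To turn those metrics into a rough ETA, you can sample the backlog (TotalHintsInProgress) twice and extrapolate linearly. A sketch with made-up sample values; the drain rate is assumed constant, which hinted handoff does not guarantee:

```java
public class HintEta {
    // Estimate remaining hinted-handoff time from two samples of the
    // TotalHintsInProgress metric taken intervalSeconds apart.
    static double etaSeconds(long backlogThen, long backlogNow, double intervalSeconds) {
        double drainRate = (backlogThen - backlogNow) / intervalSeconds; // hints/sec
        if (drainRate <= 0) {
            return Double.POSITIVE_INFINITY; // backlog is not shrinking
        }
        return backlogNow / drainRate;
    }

    public static void main(String[] args) {
        // e.g. backlog dropped from 120,000 to 90,000 hints over 60 seconds
        System.out.println(etaSeconds(120_000, 90_000, 60)); // prints 180.0
    }
}
```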

About Datastax "Monitoring a Cassandra cluster" Documentation

In this [1] document, where it describes the cfstats output, it says that Read count is the "Number of pending read requests". Is that correct? I thought it was the total number of read requests received since the last server restart.
Can someone please clarify this?
[1] http://www.datastax.com/documentation/cassandra/1.2/webhelp/cassandra/operations/ops_monitoring_c.html
Thanks,
Bhathiya
Yes, you're right and the docs are wrong: the cfstats read count is the number of local read requests for the table since startup.

Embedded Brisk? Is it possible?

I'm just getting ramped up on a new application and have decided to try out / learn Cassandra and use it for the back end.
I've got embedded Cassandra working like a charm. Now I want to add Hive on top. Has anyone attempted embedded Brisk (from DataStax) before?
Is this even possible with all the moving parts?
Thanks!
Max
It's possible. The biggest impediments would be:
Dependency management
JVM sizing and tuning
Workload isolation/service segmentation

Resources