Cassandra Prometheus integration

I am using the JMX Prometheus exporter, running as a Java agent, to export Cassandra metrics. Are there any performance issues that I need to be wary of?
Recently I came across https://github.com/criteo/cassandra_exporter.
Can you share your experience with managing Cassandra using Prometheus - specifically with respect to the exporter used?

You can use Telegraf (from the InfluxDB ecosystem) to expose Cassandra metrics to Prometheus.
Just add Jolokia to Cassandra to expose JMX metrics over HTTP, and then use Telegraf's Jolokia input plugin (jolokia2_agent) together with its Prometheus output plugin (prometheus_client).
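To illustrate what such a pipeline does with the data, here is a minimal Python sketch (not Telegraf's actual implementation) that turns one Jolokia "read" response into Prometheus text-format lines. The function name, the "cassandra" prefix, and the shape of the sample response are assumptions made for illustration.

```python
import re


def jolokia_to_prometheus(response, prefix="cassandra"):
    """Convert one Jolokia 'read' response dict into Prometheus text lines.

    Hypothetical helper: the MBean key properties become labels and each
    numeric attribute becomes one sample line.
    """
    mbean = response["request"]["mbean"]
    # "domain:key=value,key=value" -> {"key": "value", ...}
    _, _, props = mbean.partition(":")
    labels = dict(p.split("=", 1) for p in props.split(",") if "=" in p)
    label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))

    value = response["value"]
    if not isinstance(value, dict):
        # Single-attribute read: wrap the scalar so both cases look alike.
        value = {response["request"].get("attribute", "value"): value}

    lines = []
    for attr, v in sorted(value.items()):
        # Skip booleans, strings and nested objects; keep plain numbers.
        if isinstance(v, bool) or not isinstance(v, (int, float)):
            continue
        name = re.sub(r"[^a-zA-Z0-9_]", "_", f"{prefix}_{attr}").lower()
        lines.append(f"{name}{{{label_str}}} {v}")
    return lines
```

Feeding it a response for the ClientRequest read latency MBean with attribute Count yields a single line whose labels are the MBean's name, scope and type properties.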

Related

Exporting metrics from Prometheus to Elasticsearch for better monitoring capabilities

I want to use Prometheus to monitor Spark (using the Spark driver API), but I also want to use Kibana for better investigation capabilities.
So I want to export those metrics from Prometheus to Elasticsearch as well, as records that can be shown in Kibana.
Is it somehow possible?
You can check this blog, which shows various ways to export Prometheus metrics to Elasticsearch.
You can also use Metricbeat to collect data from Prometheus, since it provides a module for that.
Also, if you are using a recent version of Elasticsearch, you can explore Elastic Agent and Fleet, which have an integration for Prometheus.

Is OpsCenter configurable with Scylla?

For Scylla monitoring we need to configure Grafana, but is it possible to integrate Cassandra OpsCenter with Scylla?
TL;DR: No.
OpsCenter is a closed-source product that has not been tested with Scylla. The parts of it that use Apache Cassandra CQL and JMX will probably work, while others might not.
In addition to the open-source Scylla monitoring stack (based on Prometheus and Grafana), ScyllaDB has its own closed-source product for cluster management, named Scylla Manager.
Tzach (Scylla Product Manager)

Apache Cassandra monitoring

What is the best way to monitor whether Cassandra nodes are up? For security reasons, JMX and nodetool are out of the question. I have cluster metrics monitoring via a REST API, but I understand that even if a node goes down, the REST API will only report on the cluster as a whole.
I have integrated a system that lets me monitor all the metrics for every node in my cluster. It may seem complicated, but it is fairly simple to set up. You will need the following components to build a monitoring system for Cassandra:
jolokia jar
telegraf
influxdb
grafana
Here is a short procedure describing how it works.
Step 1: Copy the Jolokia JVM agent jar to install_dir/apache-cassandra-version/lib/. The Jolokia JVM agent is freely available for download.
Step 2: Add the following line to install_dir/apache-cassandra-version/conf/cassandra-env.sh:
JVM_OPTS="$JVM_OPTS -javaagent:<here_goes_the_path_of_your_jolokia_jar>"
Step 3: Install Telegraf on each node, configure the metrics you want to monitor, and start the Telegraf service.
Step 4: Install Grafana, configure its IP, port, and protocol, and start the Grafana service. Grafana gives you a dashboard for watching your nodes; your metrics will be visible there.
Step 5: Install InfluxDB on another server, where you want to store the metrics data sent by the Telegraf agents.
Step 6: In a browser, go to the IP address where you launched Grafana, add the InfluxDB server's IP as a data source, then customize your dashboard.
image source: https://blog.pythian.com/monitoring-cassandra-grafana-influx-db/
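Before configuring Telegraf in Step 3, it can help to confirm that the Jolokia agent from Steps 1 and 2 is answering. The following Python sketch builds a Jolokia read URL and fetches one attribute. It assumes the agent listens on its default port 8778 and that the MBean name contains no characters needing Jolokia's own escaping; the function names and the example host are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def jolokia_read_url(host, mbean, attribute=None, port=8778):
    """Build a Jolokia 'read' URL for one MBean (and optionally one attribute)."""
    # Keep ':', '=' and ',' readable; they are part of the MBean name syntax.
    path = quote(mbean, safe=":=,")
    url = f"http://{host}:{port}/jolokia/read/{path}"
    if attribute:
        url += "/" + quote(attribute, safe="")
    return url


def read_metric(host, mbean, attribute=None, port=8778, timeout=5):
    """Fetch the value from a running Jolokia agent (needs network access)."""
    url = jolokia_read_url(host, mbean, attribute, port)
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)["value"]
```

For example, read_metric("10.0.0.1", "org.apache.cassandra.metrics:type=ClientRequest,scope=Read,name=Latency", "Count") would return the read request count from that node, provided the agent is reachable.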
This is not for monitoring but only for node state.
The Cassandra CQL driver reports whether a particular node is UP or DOWN through the Host.StateListener interface. The driver itself uses this information to mark a node up or down, so it can also be used to detect node state when JMX is not accessible.
Javadoc API: https://docs.datastax.com/en/drivers/java/3.3/
I came up with a script which looks for DN (down) nodes in the cluster and reports them to our monitoring setup, which is integrated with PagerDuty.
The script runs on one of our nodes, executes nodetool status every minute, and reports all down nodes.
Here is the script: https://gist.github.com/johri21/87d4d549d05c3e2162af7929058a00d1
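The idea behind such a script can be sketched in a few lines of Python. This is not the linked gist, only an illustration under the assumption that nodetool status prints one row per node starting with a two-letter status/state code (for example UN or DN) followed by the node address. The function names are hypothetical, and the reporting step is left out.

```python
import subprocess


def down_nodes(status_output):
    """Return the addresses of nodes whose status code starts with 'D'."""
    down = []
    for line in status_output.splitlines():
        fields = line.split()
        # Node rows start with a code such as UN, DN, UL, UJ or UM.
        if (
            len(fields) >= 2
            and len(fields[0]) == 2
            and fields[0][0] in "UD"
            and fields[0][1] in "NLJM"
            and fields[0][0] == "D"
        ):
            down.append(fields[1])
    return down


def check_cluster():
    """Run nodetool on the local node and return the down addresses."""
    result = subprocess.run(
        ["nodetool", "status"], capture_output=True, text=True, check=True
    )
    return down_nodes(result.stdout)
```

Calling check_cluster() once a minute from cron and forwarding a non-empty result to the alerting system would mirror the approach described above.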

Cassandra production Monitoring

I am new to Cassandra and am trying to set up monitoring for a production Cassandra cluster.
Apart from monitoring with nodetool commands in crontab, what else is recommended?
Is it general practice to use Ganglia for monitoring?
Can you direct me to a good resource on setting up monitoring in production?
We are using Apache Cassandra, so OpsCenter was not very useful.
The free version of OpsCenter works with OSS Cassandra and most monitoring capabilities are available. You do miss a good amount of cluster management capabilities if you don't have DSE:
http://www.datastax.com/what-we-offer/products-services/datastax-opscenter/compare

OpsCenter - How does it communicate with agents?

I would like to know how OpsCenter communicates with its Agents and Cassandra Nodes.
Does it use Thrift? Is JMX required?
I'll base my answer on the latest released version of OpsCenter (1.3).
The main OpsCenter process can communicate with the agents in two ways. First, it can query the agents over an HTTP REST API that each agent exposes. It uses this to ask the agent basic things about the Cassandra node, and also to have the agent send JMX commands to the Cassandra process.
The other way is the STOMP protocol (http://stomp.github.com//). Agents send messages over STOMP to a message queue in OpsCenter. These generally contain details about the Cassandra node and metric information.
Hope that helps.
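For readers unfamiliar with STOMP, the wire format is simple: a command line, header lines, a blank line, the body, and a terminating NUL byte. The Python sketch below builds such a frame. It only illustrates the protocol; the destination name and payload are made up and do not reflect what OpsCenter agents actually send.

```python
def stomp_frame(command, headers, body=""):
    """Serialize one STOMP frame: command, headers, blank line, body, NUL."""
    head = "".join(f"{key}:{value}\n" for key, value in headers.items())
    return f"{command}\n{head}\n{body}".encode("utf-8") + b"\x00"


# Hypothetical example: a SEND frame carrying a small metrics payload.
frame = stomp_frame(
    "SEND",
    {"destination": "/queue/example-metrics"},  # assumed name, for illustration
    '{"node": "10.0.0.1", "read_latency_ms": 1.7}',
)
```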
