Cassandra cluster monitoring using Graphite-Grafana

I am new to Cassandra and am trying to set up a monitoring tool for a Cassandra production cluster. I have set up Graphite-Grafana on one of the Cassandra nodes and I am able to see that node's metrics in Grafana, but now I want to fetch metrics from all the Cassandra nodes and display them in Grafana.
Can anyone point me to the structure I should follow, or how to set up Graphite-Grafana for monitoring multiple nodes in production? What changes need to be made to the configuration files, etc.?

I think it is better to run Graphite-Grafana on a separate machine or cluster.
You can send metrics from all your Cassandra nodes to that machine/cluster, and make sure the originating Cassandra node is identified in the metric key (for example, use the key cassandra.nodes.machine01.blahblahblah for a metric from machine01).
After that, you can use the Graphite API to fetch the metrics of all your Cassandra nodes from that Graphite machine/cluster.
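For example, once every node reports under its own key segment, one wildcarded call to Graphite's render API returns a separate series per node. Here is a minimal sketch in Python with requests; the Graphite hostname is a placeholder and the metric path just reuses the example key above:

    import requests

    GRAPHITE_URL = "http://graphite.example.com"  # placeholder for your Graphite web host

    # One wildcard in the node segment fetches the same metric for every Cassandra node.
    resp = requests.get(
        f"{GRAPHITE_URL}/render",
        params={
            "target": "cassandra.nodes.*.blahblahblah",  # placeholder metric path from above
            "from": "-1h",
            "format": "json",
        },
        timeout=10,
    )
    resp.raise_for_status()

    for series in resp.json():
        # Each entry is one node's series: {"target": ..., "datapoints": [[value, ts], ...]}
        print(series["target"], series["datapoints"][-1])

In Grafana you do the same thing by putting that wildcard in the panel's Graphite target, so one panel shows all nodes.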

I found my answer after trial and error. For example, I edited metrics_reporter_graphite.yaml as below:
graphite:
  -
    period: 30
    timeunit: 'SECONDS'
    prefix: 'cassandra-clustername-node1'
    hosts:
      - host: 'localhost'
        port: 2003
    predicate:
      color: 'white'
      useQualifiedName: true
      patterns:
        - '^org.apache.cassandra.+'
        - '^jvm.+'
Replace localhost with your Graphite-Grafana server/VM IP address, and give each node its own prefix (for example cassandra-clustername-node2) so the nodes stay distinguishable in Graphite.

Related

Spark 2.x - Not getting executor metrics after setting metrics.properties

I have a Spark application running on Kubernetes with Spark Operator v1beta2-1.1.2-2.4.5 and Spark 2.4.5.
I have application-level metrics that I would like to expose through the jmx-exporter port so that a Prometheus PodMonitor can scrape them.
I am able to get driver system-level metrics via the exposed port.
I am using Groupon's open-source spark-metrics to publish metrics to the JMX sink. Even after configuring spark.metrics.conf to point to the file below, and adding the file to each of the executors (via sparkContext.addFile(<path to metrics.properties>)), I am still not able to see these metrics reported on my jmx-exporter port (8090/metrics). I am able to see them from the Spark UI endpoint at 4040/metrics/json.
*.sink.jmx.class=org.apache.spark.metrics.sink.JmxSink
master.source.jvm.class=org.apache.spark.metrics.source.JvmSource
worker.source.jvm.class=org.apache.spark.metrics.source.JvmSource
driver.source.jvm.class=org.apache.spark.metrics.source.JvmSource
executor.source.jvm.class=org.apache.spark.metrics.source.JvmSource
Would appreciate any pointers in terms of where I should look.
Below is the monitoring-related section of the deployment config:
monitoring:
  exposeDriverMetrics: true
  exposeExecutorMetrics: true
  metricsPropertiesFile: /opt/spark/metrics.properties
  prometheus:
    jmxExporterJar: "/prometheus/jmx_prometheus_javaagent.jar"
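For debugging, one way to see what each endpoint actually exposes is to dump the metric names from both and compare them. This is only a rough sketch, assuming the two ports from above (8090 for the JMX exporter, 4040 for the Spark UI) are port-forwarded to localhost:

    import requests

    # Assumed port-forwarded addresses, matching the ports mentioned above.
    JMX_EXPORTER_URL = "http://localhost:8090/metrics"           # Prometheus text format
    SPARK_UI_METRICS_URL = "http://localhost:4040/metrics/json"  # Spark's metrics servlet

    # Metric names visible to the JMX exporter.
    jmx_lines = requests.get(JMX_EXPORTER_URL, timeout=5).text.splitlines()
    jmx_names = {line.split("{")[0].split()[0]
                 for line in jmx_lines if line and not line.startswith("#")}

    # Metric names registered in Spark's own MetricsSystem.
    ui = requests.get(SPARK_UI_METRICS_URL, timeout=5).json()
    ui_names = set(ui.get("gauges", {})) | set(ui.get("counters", {})) | \
               set(ui.get("meters", {})) | set(ui.get("timers", {}))

    print(len(jmx_names), "metric names on the JMX exporter endpoint")
    print(len(ui_names), "metric names on the Spark UI endpoint")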

How to send JVM metrics of Spark to Prometheus in Kubernetes

I am using the Spark Operator to run Spark on Kubernetes (https://github.com/GoogleCloudPlatform/spark-on-k8s-operator).
I am trying to run a Java agent in the Spark driver and executor pods and send the metrics through a Kubernetes service to the Prometheus Operator.
I am using this example
https://github.com/GoogleCloudPlatform/spark-on-k8s-operator/blob/master/examples/spark-pi-prometheus.yaml
The Java agent exposes the metrics on port 8090 for a short time (I can validate that with port forwarding: kubectl port-forward <spark-driver-pod-name> 8090:8090), and the service also exposes the metrics for a few minutes (I can validate that with kubectl port-forward svc/<spark-service-name> 8090:8090).
Prometheus is able to register these pods' URLs, but when it tries to scrape the metrics (every 30 seconds), the pods' URLs are already down.
How can I make the Java agent JMX exporter keep running until the driver and executors have completed the job? Could anyone who has come across this scenario before guide or help me here?
Either Prometheus needs to scrape the metrics every 5 seconds (chances are the metrics may still not be accurate), or you need to use the Pushgateway, as mentioned in this blog (https://banzaicloud.com/blog/spark-monitoring/), to push the metrics to Prometheus.
Pushing the metrics (via the Pushgateway) is the best practice for batch jobs.
Letting Prometheus pull the metrics is the best approach for long-running services (e.g. REST services).
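With the Pushgateway approach, the batch job pushes its metrics itself before it exits, and Prometheus scrapes the gateway, so nothing is lost when the driver and executor pods terminate. A minimal sketch with the official prometheus_client library; the gateway address and job name are placeholders:

    from prometheus_client import CollectorRegistry, Gauge, push_to_gateway

    registry = CollectorRegistry()

    # Record when the batch job finished; add whatever application metrics you need.
    last_success = Gauge(
        "spark_batch_last_success_unixtime",
        "Unix time the Spark batch job last completed successfully",
        registry=registry,
    )
    last_success.set_to_current_time()

    # Push to a Pushgateway that Prometheus scrapes; address and job name are placeholders.
    push_to_gateway("pushgateway.monitoring.svc:9091", job="spark-batch", registry=registry)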

How do I select the master Redis pod in this Kubernetes example?

Here's the example I have modeled mine after.
In the Readme's "Delete our manual pod" section:
The redis sentinels themselves realize that the master has disappeared from the cluster and begin the election procedure for selecting a new master. They perform this election and selection, and choose one of the existing redis server replicas to be the new master.
How do I select the new master? All 3 Redis server pods controlled by the redis replication controller from redis-controller.yaml still have the same
labels:
  name: redis
which is what I currently use in my Service to select them. How will the 3 pods be distinguishable so that from Kubernetes I know which one is the master?
How will the 3 pods be distinguishable so that from Kubernetes I know which one is the master?
Kubernetes isn't aware of which pod is the master. You can find it manually by connecting to each pod and running:
redis-cli info
You will get a lot of information about the server, but for our purpose we only need role:
redis-cli info | grep ^role
Output:
role:master
Please note that ReplicationControllers have been replaced by Deployments for stateless services. For stateful services, use StatefulSets.
Your client Redis library can actually handle this. For example with ioredis:
ioredis guarantees that the node you connected to is always a master even after a failover.
So you actually connect to Redis Sentinel instead of connecting to a Redis server directly.
We needed to do the same thing and tried different approaches, like modifying the chart. Finally, we just created a simple Python Docker image that does the labeling, plus a chart that exposes the master Redis as a service. It periodically checks the pods created for redis-ha and labels them according to their role (master/slave).
It uses the same Sentinel commands to find the master/slave.
Helm chart: redis-pod-labeler (see the source repo).
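For reference, the Sentinel lookup such a labeler performs can be reproduced in a few lines with redis-py. This is only a sketch; the Sentinel address and the master-set name ('mymaster') are placeholders for your deployment:

    from redis.sentinel import Sentinel  # pip install redis

    # Placeholder Sentinel service address and master-set name.
    sentinel = Sentinel([("redis-sentinel", 26379)], socket_timeout=1.0)

    # Ask the Sentinels which replica is currently the master.
    host, port = sentinel.discover_master("mymaster")
    print("current master:", host, port)

    # Cross-check by asking the master itself, the same way `redis-cli info` does.
    master = sentinel.master_for("mymaster", socket_timeout=1.0)
    print("role reported:", master.info("replication")["role"])  # expect 'master'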

JMX Monitoring: Possible to collect and visualize JMX/MBeans data saved in Cassandra?

I have managed to collect JMX metrics data from a Java application and save it in a Cassandra database (my project lead told me to do so).
I know that it is possible to collect metrics with JmxTrans directly from JMX endpoints and visualize them in Grafana/Graphite.
My question is: can I collect the JMX metrics data from Cassandra and visualize it in Grafana?
Grafana requires something else (e.g. Graphite, InfluxDB, Cyanite) to store the data. So, to answer your question (what I think you're asking, at least) of whether Grafana can pull the metrics from JMX itself: no.
That said, you can make the collection easier and faster. JMX isn't a very efficient medium. Instead, just have Cassandra send metrics directly to your Graphite (or whichever reporter you use) instances using its graphite reporter; see http://www.datastax.com/dev/blog/pluggable-metrics-reporting-in-cassandra-2-0-2 for details. The steps in the blog post are as follows:
Grab your favorite reporter jar (such as metrics-graphite) and add it to the server's lib directory.
Create a configuration file for the reporters, following the sample.
Start the server with -Dcassandra.metricsReporterConfigFile=yourCoolFile.yaml by adding it to JVM_OPTS in cassandra-env.sh.
Example config:
graphite:
  -
    period: 60
    timeunit: 'SECONDS'
    hosts:
      - host: 'graphite-server.domain.local'
        port: 2003
    predicate:
      color: "white"
      useQualifiedName: true
      patterns:
        - "^org.apache.cassandra.metrics.Cache.+"
        - "^org.apache.cassandra.metrics.ClientRequest.+"
        - "^org.apache.cassandra.metrics.Storage.+"
        - "^org.apache.cassandra.metrics.ThreadPools.+"
The question is old, but if you were to do it nowadays, I'd recommend using Prometheus as the data source for Grafana, along with its JMX Exporter agent on Cassandra.
It seems that you want to use Cassandra as the data store for the JMX metrics that you are collecting from other services; Grafana doesn't have that support yet (the available data sources are listed here).

Is Logstash with Redis Sentinel possible?

I want to build a highly available ELK monitoring system with Redis, but I'm a little confused about how to make Redis HA.
Redis Sentinel provides high availability for Redis.
But I cannot find any configuration for this in the documentation: https://www.elastic.co/guide/en/logstash/current/plugins-inputs-redis.html
So can I use it with Logstash as input and output? Does anyone have experience with this?
The Logstash Redis input plugin supports only one host in its host option.
I think you have 2 ways to get HA:
1) Continue using Redis. You can create a DNS A record (or edit your hosts file) that points to multiple Redis servers, then put that record in the host option.
2) Move from Redis to Kafka:
https://www.elastic.co/blog/logstash-kafka-intro
