How to integrate Dropwizard Metrics for monitoring a Cassandra database

I want to monitor the health of my Cassandra cluster. I learned about Dropwizard Metrics, but I don't know how to integrate it with my Cassandra cluster.
For this I want to use JMX as the metrics reporter, Graphite as the metrics collector, and Grafana as the visualization GUI.
Can anyone help me here, please?

Cassandra itself uses Dropwizard Metrics and has had a pluggable reporting interface since 2.0.2 (announcement post). 'Monitoring Apache Cassandra Metrics With Graphite and Grafana' gives a good overview of how to configure Cassandra to report metrics to Graphite:
1). Download the Graphite metrics reporter jar file
2). Put the downloaded jar file in the Cassandra library folder, e.g. /usr/share/cassandra/lib/ (the default Cassandra library folder for a packaged installation on Ubuntu 14.04)
3). Create a metrics reporter configuration file (e.g. metrics_reporter_graphite.yaml) and put it in the same folder as the cassandra.yaml file, e.g. /etc/cassandra/ (the default Cassandra configuration folder for a packaged installation on Ubuntu 14.04):
graphite:
- period: 30
  timeunit: 'SECONDS'
  prefix: 'cassandra-clustername-node1'
  hosts:
  - host: 'localhost'
    port: 2003
  predicate:
    color: 'white'
    useQualifiedName: true
    patterns:
    - '^org.apache.cassandra.+'
    - '^jvm.+'
4). Modify the cassandra-env.sh file to include the following JVM option:
METRICS_REPORTER_CFG="metrics_reporter_graphite.yaml"
JVM_OPTS="$JVM_OPTS -Dcassandra.metricsReporterConfigFile=$METRICS_REPORTER_CFG"
5). Restart the Cassandra service
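Before restarting Cassandra, you can verify the Graphite side of the pipeline by pushing a hand-rolled metric into its plaintext listener. A minimal sketch, assuming Graphite listens on localhost:2003 as in the YAML above; the metric path here is made up for the test:

```python
# Sketch: push one test metric into Graphite's plaintext protocol (port 2003)
# to confirm Graphite is receiving data before wiring up Cassandra.
# Host, port, and prefix are assumptions matching the YAML config above.
import socket
import time

def graphite_line(path, value, timestamp=None):
    """Format a metric in Graphite's plaintext protocol: 'path value timestamp\\n'."""
    ts = int(timestamp if timestamp is not None else time.time())
    return f"{path} {value} {ts}\n"

def send_metric(path, value, host="localhost", port=2003):
    """Open a TCP connection to Graphite and send a single plaintext metric."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(graphite_line(path, value).encode("ascii"))

# Usage (against a running Graphite):
# send_metric("cassandra-clustername-node1.test.heartbeat", 1)
```

If the point appears under that path in the Graphite web UI, the collector is ready for Cassandra's reporter.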

Related

windows exporter does not collect metrics from other namespaces

I have an AKS cluster with Windows and Linux nodes and containers. For the Linux nodes I collect metrics normally with Prometheus, but Windows metrics are not displayed. I have installed and configured windows_exporter (https://github.com/prometheus-community/windows_exporter). Metrics appeared for pods that are in the same namespace as windows_exporter. Could you please help me collect metrics from other namespaces? Or advise how best to collect metrics from Windows AKS nodes and pods. Thanks.
You can try the below steps:
After downloading the Windows-Exporter, open its folder in a terminal and run:
.\windows_exporter.exe --collectors.enabled "cpu,cs,logical_disk,os,system,net"
Once the Windows-Exporter is started, configure Prometheus to scrape the exporter by adding the below inside the scrape_configs array:
- job_name: "windows_exporter"
  static_configs:
    - targets: ["localhost:9182"]
Now, configure Prometheus to remote-write by adding the below in the root config:
remote_write:
  - url: "https://<PROMETHEUS_SERVER_NAME>/prometheus/remote/write"
    tls_config:
      insecure_skip_verify: true
Once the above steps are performed, you can start Prometheus. If you want one-day data retention (or retention per your requirement), you can run the below:
prometheus.exe --storage.tsdb.retention.time=1d  ## change 1d as per your requirement
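As a quick sanity check that the exporter is actually serving data before Prometheus scrapes it, you can fetch its /metrics endpoint and parse the text exposition format. A rough sketch, assuming the exporter runs on localhost:9182 as in the scrape config above; for simplicity the parser only keeps samples without labels:

```python
# Sketch: fetch windows_exporter's /metrics endpoint and parse the Prometheus
# text exposition format into a simple dict. localhost:9182 matches the scrape
# target configured above.
from urllib.request import urlopen

def parse_metrics(text):
    """Parse Prometheus exposition text into {metric_name: value}.
    Skips comments/blank lines and labelled series, for a quick check only."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, value = line.rpartition(" ")
        if not name or "{" in name:  # ignore malformed or labelled samples
            continue
        samples[name] = float(value)
    return samples

def fetch_metrics(url="http://localhost:9182/metrics"):
    """Download and parse the exporter's current metrics."""
    with urlopen(url, timeout=5) as resp:
        return parse_metrics(resp.read().decode("utf-8"))

# Usage (against a running exporter):
# print(fetch_metrics().get("windows_os_info"))
```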
Reference:
Monitoring a Windows cluster with Prometheus – Sysdig
OR
As RahulKumarShaw-MT suggested, you can refer to How to export metrics from Windows Kubernetes nodes in AKS - Octopus Deploy and aidapsibr/aks-prometheus-windows-exporter

Apache Cassandra monitoring

What is the best way to monitor whether Cassandra nodes are up? Due to security reasons, JMX and nodetool are out of the question. I have cluster metrics monitoring via a REST API, but I understand that even if a node goes down, the REST API will only report on the whole cluster.
Well, I have integrated a system where I can monitor all the metrics of all the nodes in my cluster. This seems complicated but is pretty simple to set up. You will need the following components to build a monitoring system for Cassandra:
jolokia jar
telegraf
influxdb
grafana
Here is a short procedure for how it works.
Step 1: Copy the Jolokia JVM agent jar to install_dir/apache-cassandra-version/lib/. The Jolokia JVM agent can be downloaded from the Jolokia website.
Step 2: Add the following line to install_dir/apache-cassandra-version/conf/cassandra-env.sh:
JVM_OPTS="$JVM_OPTS -javaagent:<here_goes_the_path_of_your_jolokia_jar>"
Step 3: Install Telegraf on each node, configure the metrics you want to monitor, and start the Telegraf service.
Step 4: Install Grafana and configure its IP, port, and protocol, then start the Grafana service. Grafana gives you a dashboard to look after your nodes; your metrics will be visible here.
Step 5: Install InfluxDB on the server where you want to store the metrics data, which will come in through the Telegraf agents.
Step 6: Browse to the IP where you launched Grafana, add the data source (the InfluxDB IP), then customize your dashboard.
image source: https://blog.pythian.com/monitoring-cassandra-grafana-influx-db/
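For Step 3, the Telegraf side might look roughly like the fragment below. This is a sketch, not a tested config: the jolokia2_agent URL assumes Jolokia's default port 8778, and the MBean, measurement name, database name, and InfluxDB host are placeholders to adapt to your setup:

```toml
# Telegraf sketch: read Cassandra JMX metrics via the Jolokia agent
# and write them to InfluxDB.
[[inputs.jolokia2_agent]]
  urls = ["http://localhost:8778/jolokia"]

[[inputs.jolokia2_agent.metric]]
  name  = "cassandra_read_latency"
  mbean = "org.apache.cassandra.metrics:type=ClientRequest,scope=Read,name=Latency"

[[outputs.influxdb]]
  urls = ["http://influxdb-host:8086"]
  database = "cassandra_metrics"
```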
This is not full monitoring, but it covers node state.
The Cassandra CQL driver reports whether a particular node is UP or DOWN through the Host.StateListener interface. The driver itself uses this information to mark a node up or down, so it can also be used to detect node state when JMX is not accessible.
Java driver API docs: https://docs.datastax.com/en/drivers/java/3.3/
I came up with a script which watches for DN (down) nodes in the cluster and reports them to our monitoring setup, which is integrated with PagerDuty.
The script runs on one of our nodes, executes nodetool status every minute, and reports all down nodes.
Here is the script https://gist.github.com/johri21/87d4d549d05c3e2162af7929058a00d1
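The gist's approach can be sketched roughly like this, assuming `nodetool status` is runnable on the host and keeps its usual column layout (state first, address second); the print is a stand-in for your actual pager integration:

```python
# Sketch: parse `nodetool status` output and report nodes in the DN
# (Down/Normal) state. Run from cron or a loop to check every minute.
import subprocess

def down_nodes(status_output):
    """Return addresses of nodes whose state column reads DN."""
    down = []
    for line in status_output.splitlines():
        parts = line.split()
        # Data rows start with a two-letter state code (UN, DN, UL, ...)
        if len(parts) > 1 and parts[0] == "DN":
            down.append(parts[1])  # second column is the node address
    return down

def check_cluster():
    """Run nodetool and alert on any down nodes."""
    out = subprocess.run(["nodetool", "status"],
                         capture_output=True, text=True).stdout
    for addr in down_nodes(out):
        print(f"ALERT: node {addr} is down")  # replace with PagerDuty hook
```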

Cassandra prometheus integration

I am using the JMX Prometheus exporter to export Cassandra metrics via the Java agent. Are there any performance issues that I need to be wary of?
Recently I came across https://github.com/criteo/cassandra_exporter.
Can you share your experience with monitoring Cassandra using Prometheus, specifically with respect to the exporter used?
You can use Telegraf (from the InfluxDB ecosystem) to expose Cassandra metrics to Prometheus.
Just add Jolokia to Cassandra to expose the JMX metrics over HTTP, and then use the JMX input and Prometheus output plugins in Telegraf.
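Once Jolokia is attached, the metrics are plain HTTP/JSON, so you can also inspect them directly without Telegraf. A sketch, assuming Jolokia's default port 8778; the MBean name is one example (read-request latency), not the only metric available:

```python
# Sketch: read a single Cassandra JMX metric over Jolokia's HTTP read API,
# the same endpoint Telegraf's JMX input talks to. Port 8778 is Jolokia's
# default; the MBean below is an example read-latency metric.
import json
from urllib.request import urlopen

JOLOKIA = "http://localhost:8778/jolokia"
MBEAN = "org.apache.cassandra.metrics:type=ClientRequest,scope=Read,name=Latency"

def read_value(response_json):
    """Extract the 'value' field from a Jolokia read response body."""
    body = json.loads(response_json)
    if body.get("status") != 200:
        raise RuntimeError(f"Jolokia error: {body}")
    return body["value"]

def read_mbean(mbean=MBEAN, base=JOLOKIA):
    """GET <base>/read/<mbean> and return the decoded attribute values."""
    with urlopen(f"{base}/read/{mbean}", timeout=5) as resp:
        return read_value(resp.read().decode("utf-8"))

# Usage (against a Cassandra node running the Jolokia agent):
# print(read_mbean())
```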

How to connect Cloudera Manager to existing Spark cluster

I have the following requirement: I need to provision both Cloudera Manager and a Spark cluster via Puppet, in a way that requires minimal (or no) configuration through the Cloudera Manager UI afterwards. The ideal scenario I'm looking for is the following:
Topology: 3 nodes (where node1 is spark-master and node2 and node3 are spark-workers)
Provision spark cluster (this works as expected) and I have working CDH5.5 Spark cluster (verified by running Spark Pi example)
Install CM server on spark-master node
Install CM agent on all nodes
Start CM server and agents
I'm using the razorsedge/cloudera Puppet module to provision Cloudera Manager (https://forge.puppetlabs.com/razorsedge/cloudera), and I have a custom-made Spark Puppet module which supports CDH 5.5 Spark installation.
When I open the Cloudera Manager UI, I can see all three nodes, but I don't see any Spark-related stats on the CM UI dashboard.
When investigating the CM agent and server logs, these are the findings:
CM agent log on spark-master (it was not connected to the CM server and cannot be seen on the CM UI dashboard):
[12/Jan/2016 23:13:11 +0000] 4678 MainThread agent ERROR Heartbeating to EC2_PUBLIC_DNS:7182 failed
CM agent log on spark-workers (connected to the CM server successfully and can be seen on the CM UI dashboard)
CM server log on spark-master:
org.apache.avro.AvroRuntimeException: Unknown datum type: java.lang.IllegalArgumentException: Hostname invalid EC2_LOCAL_IPV4
Any idea what might be the issue here?
I'm also looking for answers to the following:
Is it even possible to provision a CDH service (in my case Spark) without using the Cloudera Manager UI and then have it connected to CM?
If yes, which CM configuration(s) need to be changed to point to the existing Spark cluster?
Any help/guidance would be greatly appreciated.

Spark metrics for gmond / ganglia

OS: CentOS 6.4
ISSUE:
I installed gmond, gmetad, and gweb on a server, and installed a Spark worker on the same server.
I configured metrics in $SPARK_HOME/conf/metrics.properties as below.
CONFIGURATION (metrics.properties in Spark):
org.apache.spark.metrics.sink.GangliaSink
host localhost
port 8649
period 10
unit seconds
ttl 1
mode multicast
We are not able to see any metrics in the Ganglia web UI.
Please help.
-pradeep samudrala
In the first place, those entries are just indications of Ganglia's default settings; you should not simply uncomment them. Taken from the metrics section of the Spark docs (spark page):
To install the GangliaSink you’ll need to perform a custom build of Spark. Note that by embedding this library you will include LGPL-licensed code in your Spark package. For sbt users, set the SPARK_GANGLIA_LGPL environment variable before building. For Maven users, enable the -Pspark-ganglia-lgpl profile. In addition to modifying the cluster’s Spark build user applications will need to link to the spark-ganglia-lgpl artifact.
