Exporting metrics from Prometheus to Elasticsearch for better monitoring capabilities - apache-spark

I want to use Prometheus to monitor Spark (using the Spark driver API), but I also want to use Kibana for better investigation capabilities.
So I want to export those metrics from Prometheus to Elasticsearch as well, as records to show in Kibana.
Is it somehow possible?

You can check this blog, where they show various ways to export Prometheus metrics to Elasticsearch.
You can also use Metricbeat to pull data from Prometheus, as it provides a module for exactly this.
Also, if you are using a recent version of Elasticsearch, you can explore Elastic Agent and Fleet as well, which also ship a Prometheus integration.
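As a rough sketch of the Metricbeat route: the Prometheus module can scrape a Prometheus server's federation endpoint and ship the samples to Elasticsearch. The hosts and ports below are placeholders, so adjust them to your deployment:

```yaml
# metricbeat.yml — minimal sketch, assuming Prometheus on localhost:9090
# and Elasticsearch on localhost:9200
metricbeat.modules:
  - module: prometheus
    period: 10s
    hosts: ["localhost:9090"]
    metrics_path: /federate
    query:
      match[]: '{__name__!=""}'   # federate all series; narrow this in production

output.elasticsearch:
  hosts: ["localhost:9200"]
```

Federating everything can be heavy; restricting `match[]` to the Spark job names you care about keeps the index volume manageable.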

Related

How to monitor Spark with Kibana

I want to have a view of Spark in Kibana such as decommissioned nodes per application, shuffle read and write per application, and more.
I know I can get all this information about these metrics here.
But I don't know how to send them to Elasticsearch, or how to do it the right way. I know I can do it with Prometheus, but I don't think that helps me.
Is there a way of doing so?

Spark metrics - Disable all metrics

I'm building a monitoring system for our Spark. I send the metrics with Spark's Graphite sink. I want the ability to stop all the metrics dynamically, which means I need to set it with sc.set.
How can I just disable all metrics in the Spark configuration? I couldn't find anything like a spark.metrics.enable property.
I couldn't find a way to disable them globally. What I do is set the sink only when I want to monitor (per application). Note that the metrics configuration is read when the SparkContext starts, so it has to go on the SparkConf before the context is created, not on a running context:
conf.set("spark.metrics.conf.*.sink.graphite.class", "org.apache.spark.metrics.sink.GraphiteSink")
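A minimal sketch of that per-application toggle, assuming the host and port are placeholders for your Graphite endpoint: since Spark has no `spark.metrics.enable` flag, a sink is active only when its `*.sink.<name>.class` property is set, so omitting the entries is effectively how you disable it.

```python
def graphite_sink_props(enabled, host="graphite.example.com", port=2003):
    """Return the spark.metrics.* properties enabling the Graphite sink,
    or an empty dict when monitoring is disabled for this application.

    These must be applied to the SparkConf (or passed via --conf flags)
    before the SparkContext starts; metrics config is read at startup,
    so setting it on a running context has no effect.
    """
    if not enabled:
        return {}
    prefix = "spark.metrics.conf.*.sink.graphite."
    return {
        prefix + "class": "org.apache.spark.metrics.sink.GraphiteSink",
        prefix + "host": host,
        prefix + "port": str(port),
    }
```

You would then apply the result with something like `SparkConf().setAll(graphite_sink_props(True).items())` before building the context.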

Cassandra prometheus integration

I am using the JMX prometheus exporter for exporting Cassandra metrics using the java agent. Are there any performance issues that I need to be wary about?
Recently I came across https://github.com/criteo/cassandra_exporter.
Can you share your experience with managing Cassandra using Prometheus - specifically with respect to the exporter used?
You can use Telegraf (from the InfluxDB ecosystem) to expose Cassandra metrics to Prometheus.
Just add Jolokia to Cassandra to expose the JMX metrics over HTTP, then use the Jolokia input and Prometheus output plugins in Telegraf.
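A sketch of that Telegraf wiring, assuming the Jolokia agent listens on its default port 8778 and using one Cassandra MBean as an illustration (pick the MBeans you actually need):

```toml
# telegraf.conf — Jolokia in, Prometheus out; hosts/ports are assumptions
[[inputs.jolokia2_agent]]
  urls = ["http://localhost:8778/jolokia"]

  [[inputs.jolokia2_agent.metric]]
    name     = "cassandra_client_request"
    mbean    = "org.apache.cassandra.metrics:type=ClientRequest,scope=*,name=Latency"
    tag_keys = ["scope"]

[[outputs.prometheus_client]]
  listen = ":9273"   # Prometheus then scrapes Telegraf on this port
```

This sidesteps the java-agent approach entirely, so any overhead lives in the Telegraf process rather than inside the Cassandra JVM.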

Grafana for Spark Structured Streaming

I followed these steps to set up Prometheus, Graphite Exporter and Grafana to plot metrics for Spark 2.2.1 running Structured Streaming. The metrics covered in that post are quite dated and do not include (I believe) any metrics that can be used to monitor Structured Streaming. I am especially interested in the resources used and the duration of the streaming queries that perform various aggregations.
Is there any pre-configured dashboard for Spark? I was a little surprised not to find one on https://grafana.com/dashboards
That makes me suspect that Grafana is not widely used to monitor Spark. If that's the case, what works better?
There doesn't seem to be one in the official Grafana dashboards repository, but you can check this Spark dashboard, which displays metrics collected from Spark applications:
https://github.com/hammerlab/grafana-spark-dashboards
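For reference, the Spark side of the pipeline described above (Spark → Graphite sink → graphite_exporter → Prometheus) can be sketched like this, assuming graphite_exporter runs locally on its default Graphite listen port (9109):

```properties
# conf/metrics.properties — a sketch; host/port are assumptions
*.sink.graphite.class=org.apache.spark.metrics.sink.GraphiteSink
*.sink.graphite.host=localhost
*.sink.graphite.port=9109
*.sink.graphite.period=10
*.sink.graphite.unit=seconds
*.sink.graphite.prefix=spark
```

Prometheus then scrapes graphite_exporter's web endpoint (default :9108), and Grafana queries Prometheus.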

Bluemix Apache Spark Metrics

I have been looking for a way to monitor performance in Spark on Bluemix. I know in the Apache Spark project, they provide a metrics service based on the Coda Hale Metrics Library. This allows users to report Spark metrics to a variety of sinks including HTTP, JMX, and CSV files. Details here: http://spark.apache.org/docs/latest/monitoring.html
Does anyone know of any way to do this in the Bluemix Spark service? Ideally, I would like to save the metrics to a csv file in Object Storage.
Appreciate the help.
Thanks
Saul
Currently, I do not see an option for using the Coda Hale Metrics Library, reporting job history, or accessing that information via a REST API.
However, on the main page of the Spark history server you can see the event log directory. It points to your user directory: file:/gpfs/fs01/user/USER_ID/events/
The files there appear to be JSON (or JSON-like) formatted.
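On a stock Apache Spark deployment, the CSV reporting asked about would be configured through the standard CsvSink; whether the Bluemix service lets you supply this file is not confirmed, and the directory below is just an example:

```properties
# conf/metrics.properties — standard Spark CsvSink (directory is an assumption)
*.sink.csv.class=org.apache.spark.metrics.sink.CsvSink
*.sink.csv.period=10
*.sink.csv.unit=seconds
*.sink.csv.directory=/tmp/spark-metrics
```

Since CsvSink only writes to a local filesystem path, getting the files into Object Storage would still require a separate copy step after the job.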
