reading spark cluster metrics with grafana - apache-spark

I have a standalone Spark cluster and I'm able to get all the built-in metrics for the workers and the driver,
following this guide: https://dzlab.github.io/bigdata/2020/07/03/spark3-monitoring-1/
I have a Prometheus server, and I've set up my scrape targets like this:
- job_name: 'X_master'
  metrics_path: '/metrics/master/prometheus'
  static_configs:
    - targets: ['X:8080']
      labels:
        instance_type: 'master'
        spark_cluster: 'X_CLUSTER'
- job_name: 'X_spark-workers'
  metrics_path: '/metrics/prometheus'
  static_configs:
    - targets: ['X1:8081','X2:8081']
      labels:
        instance_type: 'worker'
        spark_cluster: 'X_CLUSTER'
- job_name: 'X_spark-driver'
  metrics_path: '/metrics/prometheus'
  static_configs:
    - targets: ['X:4040']
      labels:
        instance_type: 'driver'
        spark_cluster: 'X_CLUSTER'
I have Grafana on the same server and tried this dashboard: https://grafana.com/grafana/dashboards/7890
But how can I load data into it?
I've also followed this: https://grafana.com/docs/grafana-cloud/integrations/integrations/integration-apache-spark/?pg=blog&plcmt=body-txt
But how can I create the Grafana Agent and add that yml to it?
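For reference, my understanding is that the Grafana Agent is just a separate binary/container with its own YAML config, into which the scrape jobs above can be dropped almost unchanged and shipped to Grafana Cloud via remote_write. A rough sketch for the Agent's static mode (key names may differ slightly between Agent versions; the remote_write URL, username and API key are placeholders from the Grafana Cloud account, not real values):
# agent.yaml (sketch)
metrics:
  wal_directory: /tmp/grafana-agent-wal
  global:
    scrape_interval: 15s
  configs:
    - name: spark
      scrape_configs:
        - job_name: 'X_master'
          metrics_path: '/metrics/master/prometheus'
          static_configs:
            - targets: ['X:8080']
        - job_name: 'X_spark-workers'
          metrics_path: '/metrics/prometheus'
          static_configs:
            - targets: ['X1:8081','X2:8081']
      remote_write:
        - url: <your Grafana Cloud Prometheus remote_write endpoint>
          basic_auth:
            username: <your Grafana Cloud instance ID>
            password: <your Grafana Cloud API key>
The agent is then started pointing at that file (e.g. grafana-agent -config.file=agent.yaml).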

Related

VictoriaMetrics - pass filters in azure_sd_config like ec2_sd_config

I have to make this work on the Azure platform; the vmagent scrape_config solution was working fine with AWS, but I can't find a similar option for Azure. In this particular snippet we have configured scraping of node_exporter from VMs carrying the tag key mon_exporters with the value node. I checked the official documentation, https://docs.victoriametrics.com/sd_configs.html#azure_sd_configs, but couldn't find any mention of a filter option.
Is there any way I can filter the VMs to my needs? Right now it fetches all the VMs in that particular subscription.
- job_name: 'node_exporter'
  honor_timestamps: true
  scrape_interval: 1m
  scrape_timeout: 15s
  metrics_path: /metrics
  scheme: http
  follow_redirects: true
  azure_sd_configs:
    - subscription_id: 'xxxxx'
      authentication_method: 'ManagedIdentity'
      environment: 'AzurePublicCloud'
      refresh_interval: 5m
      port: 9100
      filters:
        - name: 'tag:mon_exporters'
          values: ["*node*"]
azure_sd_config in VictoriaMetrics doesn't support the filters option, but you can keep only the needed targets with an action: keep relabeling rule on the __meta_azure_machine_tag_mon_exporters label. Try the following config:
- job_name: 'node_exporter'
  scrape_interval: 1m
  azure_sd_configs:
    - subscription_id: 'xxxxx'
      authentication_method: 'ManagedIdentity'
      port: 9100
  relabel_configs:
    - action: keep
      if: '{__meta_azure_machine_tag_mon_exporters="node"}'
See more details about this type of relabeling here
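If the tag value is not always exactly node (the question's filter used the wildcard *node*), the same keep action can match with a regex on the discovered label instead; a sketch using the standard Prometheus-style relabeling form:
relabel_configs:
  - source_labels: [__meta_azure_machine_tag_mon_exporters]
    regex: '.*node.*'
    action: keep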

Unable to fetch Azure VMs in Prometheus using service discovery

I'm trying to monitor Azure VMs in Grafana through Prometheus. I have updated prometheus.yml to use service discovery via azure_sd_configs. Node exporter is configured on all the VMs and is up and running, but this data is not reflected in Prometheus. I did a lot of troubleshooting but couldn't figure it out; any help is highly appreciated. Thank you. The prometheus.yml file is below. After applying the configuration below, Prometheus starts but the VMs do not show up as targets.
global:
  scrape_interval: 10s
scrape_configs:
  - job_name: 'prometheus_master'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'azure-nodes'
    azure_sd_configs:
      - subscription_id: "SUB_ID"
        tenant_id: "tenant_ID"
        client_id: "Client_id"
        client_secret: "Secret"
        port: 9100
remote_write:
  - url: "http://URL:9100/metrics"
    basic_auth:
      username: "****"
      password: "******"

How to monitor Fastify app with Prometheus and Grafana?

I am learning to monitor my Fastify app with Prometheus and Grafana. First, I installed the fastify-metrics package and registered it in the Fastify app.
// app.ts
import metrics from 'fastify-metrics'
...
app.register(metrics, {
  endpoint: '/metrics',
})
Then I set up Prometheus and Grafana in docker-compose.yml:
version: "3.7"
services:
prometheus:
image: prom/prometheus:latest
volumes:
- prometheus_data:/prometheus
- ./prometheus/prometheus.yml:/etc/prometheus/prometheus.yml
command:
- '--config.file=/etc/prometheus/prometheus.yml'
network_mode: host
ports:
- '9090:9090'
grafana:
image: grafana/grafana:latest
volumes:
- grafana_data:/var/lib/grafana
# - ./grafana/provisioning:/etc/grafana/provisioning
# - ./grafana/config.ini:/etc/grafana/config.ini
# - ./grafana/dashboards:/var/lib/grafana/dashboards
environment:
- GF_SECURITY_ADMIN_PASSWORD=ohno
depends_on:
- prometheus
network_mode: host
ports:
- '3000:3000'
volumes:
prometheus_data: {}
grafana_data: {}
I added network_mode: host because the Fastify app will be running at localhost:8081.
Here's the Prometheus config:
# prometheus.yml
global:
  scrape_interval: 15s
  scrape_timeout: 10s
  evaluation_interval: 1m
scrape_configs:
  - job_name: 'prometheus'
    # metrics_path: /metrics
    static_configs:
      - targets: [
          'app:8081',
        ]
  - job_name: 'node_exporter'
    static_configs:
      - targets: [
          'localhost:8081',
        ]
After docker-compose up and npm run dev, the Fastify app is up and running and the target localhost:8081 shows as UP in the Prometheus UI at localhost:9090, where I can run some queries against the metrics.
I imported the Node Exporter Full and Node Exporter Server Metrics dashboards, added a Prometheus data source pointing at localhost:9090, named it Fastify, and saved it successfully; it showed "Data source is working".
However, when I go to the Node Exporter Full dashboard, it shows no data. I selected Fastify as the data source, but the other drop-downs in the upper left corner show None.
Please help, what am I doing wrong?
It looks like you're using a dashboard intended for Linux host (node_exporter) stats. In order to use Prometheus/Grafana with your Fastify app, you'll need a dashboard that's meant for Node.js apps. For example:
https://grafana.com/grafana/dashboards/11159
https://grafana.com/grafana/dashboards/12230
Plugging one of those in should do the trick.
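As a side note, the commented-out provisioning volume in the docker-compose.yml above can be used to register the Prometheus data source automatically instead of clicking through the UI. A minimal provisioning file might look like this (a sketch; the file path matches the commented-out volume, the data source name matches the Fastify one created manually, and localhost:9090 works here because both containers use network_mode: host):
# ./grafana/provisioning/datasources/prometheus.yml (sketch)
apiVersion: 1
datasources:
  - name: Fastify
    type: prometheus
    access: proxy
    url: http://localhost:9090
    isDefault: true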
You should specify the metrics_path in the job to match your fastify-metrics endpoint, and also update the targets parameter:
- job_name: 'node_exporter'
  scrape_interval: 5s
  metrics_path: /metrics
  scheme: http
  static_configs:
    - targets: ['localhost:8081']
      labels:
        group: 'node_exporter'

Node.js + Prometheus - Target Down Connection Refused

I am running a Node.js application locally. It runs on http://localhost:3002, and using prom-client I can see the metrics at the following endpoint: http://localhost:3002/metrics.
I've set up Prometheus in a Docker container and run it.
Dockerfile
FROM prom/prometheus
ADD prometheus.yml /etc/prometheus/
prometheus.yml
scrape_configs:
  - job_name: 'prometheus'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:3002']
        labels:
          service: 'my-service'
          group: 'production'
rule_files:
  - 'alert.rules'
docker build -t my-prometheus .
docker run -p 9090:9090 my-prometheus
When I navigate to http://localhost:9090/targets it shows:
Get http://localhost:3002/metrics: dial tcp 127.0.0.1:3002: connect: connection refused
Can you please tell me what I'm doing wrong here? The node app is definitely running on localhost at that port, because when I go to http://localhost:3002/metrics I can see the metrics.
When you are inside a container, you cannot reach the host via localhost directly. You will need to use docker.for.mac.localhost in your prometheus.yml file. See below:
Your job in the prometheus.yml file:
- job_name: 'prometheus'
  # metrics_path defaults to '/metrics'
  # scheme defaults to 'http'.
  static_configs:
    - targets: ['localhost:9090']
    - targets: ['docker.for.mac.localhost:3002']
and for Windows it would be:
- job_name: 'spring-actuator'
  metrics_path: '/actuator/prometheus'
  scrape_interval: 5s
  static_configs:
    - targets: ['docker.for.win.localhost:8082']
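On current Docker Desktop releases, host.docker.internal resolves to the host on both macOS and Windows, so a single job along these lines should cover either platform (a sketch; the job name is arbitrary, and on plain Linux Docker this hostname is only available if the container is started with --add-host=host.docker.internal:host-gateway):
- job_name: 'node-app'
  scrape_interval: 5s
  static_configs:
    - targets: ['host.docker.internal:3002']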
The applications are not on the same network. First, you can build a Docker image from your Node.js application too. When running the containers, the network parameter (--net) should be passed to both of them.
Run the Prometheus app:
docker run --net basic -p 9090:9090 my-prometheus
Run the Node.js app:
docker run --net basic -p 8080:8080 my-node-app
Now both applications run on the same network, called basic, so the Prometheus container can reach the Node.js app's /metrics endpoint over that network, as sketched below.
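Assuming the Node.js container was started with --name my-node-app (a name chosen here for illustration) and the app still listens on port 3002 as in the question, the scrape config would then point at the container name instead of localhost; Docker's embedded DNS on the shared basic network resolves it to the container's IP:
scrape_configs:
  - job_name: 'node-app'
    scrape_interval: 5s
    static_configs:
      # resolved via Docker's embedded DNS on the shared "basic" network
      - targets: ['my-node-app:3002']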
I did this with localhost... success:
global:
  scrape_interval: 5s
  scrape_timeout: 5s
  evaluation_interval: 1s
scrape_configs:
  - job_name: prometheus
    honor_timestamps: true
    scrape_interval: 15s
    scrape_timeout: 10s
    metrics_path: /metrics
    scheme: http
    follow_redirects: true
    static_configs:
      - targets: ['127.0.0.1:9090']
  - job_name: class
    honor_timestamps: true
    scrape_interval: 5s
    scrape_timeout: 5s
    metrics_path: /metrics
    scheme: http
    follow_redirects: true
    static_configs:
      - targets: ['host.docker.internal:8080']

How to enable Cassandra Password Authentication in Kubernetes deployment file

I've been struggling with this for quite a while now. My effort so far is shown below. The env variable CASSANDRA_AUTHENTICATOR is, as far as I understand, supposed to enable password authentication. However, I'm still able to log in without a password after redeploying with this config. Any ideas on how to enable password authentication in a Kubernetes deployment file?
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: cassandra
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: cassandra
    spec:
      containers:
        - name: cassandra
          image: cassandra
          env:
            - name: CASSANDRA_CLUSTER_NAME
              value: Cassandra
            - name: CASSANDRA_AUTHENTICATOR
              value: PasswordAuthenticator
          ports:
            - containerPort: 7000
              name: intra-node
            - containerPort: 7001
              name: tls-intra-node
            - containerPort: 7199
              name: jmx
            - containerPort: 9042
              name: cql
          volumeMounts:
            - mountPath: /var/lib/cassandra/data
              name: data
      volumes:
        - name: data
          emptyDir: {}
The environment is Google Cloud Platform.
So I made a few changes to the artifact you mentioned:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: cassandra
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: cassandra
    spec:
      containers:
        - name: cassandra
          image: bitnami/cassandra:latest
          env:
            - name: CASSANDRA_CLUSTER_NAME
              value: Cassandra
            - name: CASSANDRA_PASSWORD
              value: pass123
          ports:
            - containerPort: 7000
              name: intra-node
            - containerPort: 7001
              name: tls-intra-node
            - containerPort: 7199
              name: jmx
            - containerPort: 9042
              name: cql
          volumeMounts:
            - mountPath: /var/lib/cassandra/data
              name: data
      volumes:
        - name: data
          emptyDir: {}
The changes I made were: the image was changed to bitnami/cassandra:latest, and the env CASSANDRA_AUTHENTICATOR was replaced with CASSANDRA_PASSWORD.
After you deploy the above artifact, you can authenticate as shown below.
Exec into the pod:
fedora@dhcp35-42:~/tmp/cassandra$ oc exec -it cassandra-2750650372-g8l9s bash
root@cassandra-2750650372-g8l9s:/#
Once inside the pod, try to authenticate with the server:
root@cassandra-2750650372-g8l9s:/# cqlsh 127.0.0.1 9042 -p pass123 -u cassandra
Connected to Cassandra at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.11.0 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
cassandra@cqlsh>
The documentation for this image can be found at https://hub.docker.com/r/bitnami/cassandra/
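If you stay with the Bitnami image, it is also worth taking the password from a Kubernetes Secret rather than hard-coding it in the Deployment; a sketch, assuming a Secret named cassandra-secret with the key cassandra-password has been created separately:
env:
  - name: CASSANDRA_PASSWORD
    valueFrom:
      secretKeyRef:
        name: cassandra-secret
        key: cassandra-password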
If you are not comfortable using a third-party image and want to use the image that the upstream community manages, then look at the following solution, which is more DIY but also more flexible.
To set the password you were trying to use the env CASSANDRA_AUTHENTICATOR, but that proposal has not been merged into the cassandra image yet; you can see the open PRs here.
Right now upstream suggests mounting a cassandra.yaml file at /etc/cassandra/cassandra.yaml, so that people can set whatever settings they want.
So follow these steps:
Download cassandra.yaml.
I made the following changes to the file:
$ diff cassandra.yaml mycassandra.yaml
103c103
< authenticator: AllowAllAuthenticator
---
> authenticator: PasswordAuthenticator
Create a ConfigMap from that file.
We have to create a Kubernetes ConfigMap which we will then mount inside the container; we cannot do a host mount the way we would with plain Docker.
$ cp mycassandra.yaml cassandra.yaml
$ k create configmap cassandraconfig --from-file ./cassandra.yaml
The name of the ConfigMap is cassandraconfig.
Now edit the deployment to use this ConfigMap and mount it in the right place:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: cassandra
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: cassandra
    spec:
      containers:
        - name: cassandra
          image: cassandra
          env:
            - name: CASSANDRA_CLUSTER_NAME
              value: Cassandra
          ports:
            - containerPort: 7000
              name: intra-node
            - containerPort: 7001
              name: tls-intra-node
            - containerPort: 7199
              name: jmx
            - containerPort: 9042
              name: cql
          volumeMounts:
            - mountPath: /var/lib/cassandra/data
              name: data
            - mountPath: /etc/cassandra/
              name: cassandraconfig
      volumes:
        - name: data
          emptyDir: {}
        - name: cassandraconfig
          configMap:
            name: cassandraconfig
Once you have created this deployment, exec into the pod:
$ k exec -it cassandra-1663662957-6tcj6 bash
root@cassandra-1663662957-6tcj6:/#
Try using the client:
root@cassandra-1663662957-6tcj6:/# cqlsh 127.0.0.1 9042
Connection error: ('Unable to connect to any servers', {'127.0.0.1': AuthenticationFailed('Remote end requires authentication.',)})
For more information on creating a ConfigMap and mounting it inside a container, you can read this doc, which helped me with this answer.
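One caveat with mounting the ConfigMap over the whole /etc/cassandra/ directory is that it hides any other configuration files the image ships there. If that is a problem in your setup, mounting just cassandra.yaml with subPath is a possible alternative (a sketch, not part of the steps above):
volumeMounts:
  - mountPath: /etc/cassandra/cassandra.yaml
    subPath: cassandra.yaml
    name: cassandraconfig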
If you really don't want to replace the official cassandra Docker image with Bitnami's version, but you still want to enable password authentication for accessing the CQL shell, you can achieve that by modifying the Cassandra configuration file. Password authentication is enabled by setting the following property in /etc/cassandra/cassandra.yaml: authenticator: PasswordAuthenticator
Since it is irrelevant whether a property is defined once or multiple times (the last definition wins), that line can simply be appended to the Cassandra configuration file. An alternative would be using sed to perform an in-place search-and-replace, but IMHO that would be unnecessary overkill, both performance-wise and readability-wise.
Long story short: specify a container startup command/entrypoint (with its arguments) so that the config file is adapted first and then the image's original entrypoint is executed. Since a container definition in Docker Compose or Kubernetes y(a)ml can only have a single startup command, specify a standard/Bourne shell that executes those two steps as the command.
Therefore the answer would be adding the following two lines:
command: ["/bin/sh"]
args: ["-c", "echo 'authenticator: PasswordAuthenticator' >> /etc/cassandra
/cassandra.yaml && docker-entrypoint.sh cassandra -f"]
so the OP's Kubernetes deployment file would be the following:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: cassandra
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: cassandra
    spec:
      containers:
        - name: cassandra
          image: cassandra
          command: ["/bin/sh"]
          args: ["-c", "echo 'authenticator: PasswordAuthenticator' >> /etc/cassandra/cassandra.yaml && docker-entrypoint.sh cassandra -f"]
          env:
            - name: CASSANDRA_CLUSTER_NAME
              value: Cassandra
          ports:
            - containerPort: 7000
              name: intra-node
            - containerPort: 7001
              name: tls-intra-node
            - containerPort: 7199
              name: jmx
            - containerPort: 9042
              name: cql
          volumeMounts:
            - mountPath: /var/lib/cassandra/data
              name: data
      volumes:
        - name: data
          emptyDir: {}
Disclaimer: if 'latest' is used as the tag of the official Cassandra image and the image's original entrypoint (docker-entrypoint.sh cassandra -f) changes at some point, this container might have issues starting Cassandra. However, since the entrypoint and its args have remained unchanged from the initial version up to the latest version at the time of writing (4.0), it is very likely to stay as-is, so this approach/workaround should work fine.
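Pinning the image to a specific tag instead of the implicit latest avoids that risk altogether, e.g.:
image: cassandra:4.0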
