CloudFront Logs into Prometheus - Logstash

I want to collect AWS CloudFront request-level metrics (count of requests by unique resource) into Prometheus.
I've seen how to use Logstash to forward the logs to ElasticSearch, and I thought of polling/querying ElasticSearch once a minute to get an aggregate, then exporting that result to Prometheus.
But it feels a little sloppy considering potential timing issues or missing/duplicate metric values.
I also saw a metrics filter for Logstash, so maybe I could create a meter for each unique URL and then use the HTTP output plugin to send the metrics to Prometheus.
One more thought -
I've never used CloudFront with CloudWatch. Maybe I could use the CloudWatch exporter for Prometheus, but does it provide request counts at the resource level, or only higher-level aggregates?

You can use cloudwatch_exporter to scrape CloudFront metrics from CloudWatch.
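For example, a minimal cloudwatch_exporter configuration for the AWS/CloudFront namespace could look like the sketch below (the statistic chosen is an assumption; CloudFront reports its CloudWatch metrics in us-east-1 and aggregates them per distribution, so this yields distribution-level totals rather than per-URL counts):
region: us-east-1
metrics:
  - aws_namespace: AWS/CloudFront
    aws_metric_name: Requests
    aws_dimensions: [Region, DistributionId]
    aws_statistics: [Sum]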

Related

Prometheus scraping only some metrics

In my cluster I have an nginx ingress, a Node.js server and Prometheus. Prometheus is scraping all metrics from nginx, no problem, but it seems to be omitting some metrics from my Node.js server.
# HELP nodejs_version_info Node.js version info.
# TYPE nodejs_version_info gauge
nodejs_version_info{version="v16.15.0",major="16",minor="15",patch="0"} 1
This metric is indeed scraped by Prometheus because it has nodejs_ in its name. However, I also have some metrics which look like this:
# HELP http_request_duration_seconds duration histogram of http responses labeled with: status_code, method, path
# TYPE http_request_duration_seconds histogram
http_request_duration_seconds_bucket{le="0.003",status_code="200",method="GET",path="/"} 0
Metrics without nodejs in the name do not appear in the dashboard.
I should mention that I am using https://www.npmjs.com/package/express-prom-bundle for the response time metric. Does anybody know how to fix that?
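For context, a typical express-prom-bundle setup that produces the http_request_duration_seconds histogram shown above might look like this sketch (the route and port are hypothetical):
import express from "express";
import promBundle from "express-prom-bundle";

const app = express();
// exposes /metrics and records http_request_duration_seconds
// with status_code, method and path labels
app.use(promBundle({ includeMethod: true, includePath: true, includeStatusCode: true }));
app.get("/", (_req, res) => { res.send("ok"); });
app.listen(3000);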

How is the Monitor - Cosmos DB (preview) Requests metric calculated?

Azure provides monitoring of incoming requests to Cosmos DB. While I was the only person working on my Cosmos DB, I ran a simple select-vertex statement (e.g., g.V('id')). The incoming request metric then showed around 10, even though I know for sure I was the only one accessing it. When I traversed the graph in a single select query, the request count was huge (around 100).
Has anybody else noticed this with the metrics? We assume the high request count over an hour in production is causing the performance slowness. Is the metric trustworthy, or how else can I find the incoming requests to Cosmos?

How to paginate logs from a Kubernetes pod?

I have a service that displays logs from pods running in my Kubernetes cluster. I receive them via the k8s /pods/{name}/log API. The logs tend to grow big, so I'd like to paginate the response to avoid loading them in full every time. A result similar to how the Kubernetes dashboard displays logs would be perfect.
This dashboard however seems to solve the problem by running a separate backend service that loads the logs, chops them into pieces and prepares for the frontend to consume.
I'd like to avoid that and use only the API with its query parameters, like limitBytes and sinceSeconds, but those seem insufficient to make proper pagination work.
Does anyone have a good solution for that? Or maybe know if k8s plans to implement pagination in logs API?
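For reference, the parameters the question refers to can approximate a windowed read, for example (pod name, namespace and values are hypothetical):
# last ~1 MiB of logs from the past 5 minutes, capped at 1000 lines
kubectl logs my-pod -n my-namespace --since=5m --limit-bytes=1048576 --tail=1000
# the same options on the raw API mentioned above
GET /api/v1/namespaces/my-namespace/pods/my-pod/log?sinceSeconds=300&limitBytes=1048576&tailLines=1000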

Azure App Service - alerting for HTTP errors more than X% of total requests

I am trying to set up an alert based on the percentage of HTTP errors among all requests, e.g. "Notify me when more than 0.5% of all requests end up with an HTTP error."
When I look at the App Service alerting capabilities, I can set up alerts to let me know when the number of HTTP errors is higher than X, and I can also set up alerts to let me know when the total number of requests is higher than Y. But there is nothing that would compare those two numbers.
Any suggestions are appreciated!
AFAIK, there is no built-in feature to achieve this. But you could write your own code to retrieve the metrics and send the alert notification yourself. I assume you could leverage the Azure Monitor REST API for getting metrics from Microsoft.Web/sites, as follows:
GET https://management.azure.com/subscriptions/${subscriptionId}/resourceGroups/${resourceGroupName}/providers/${resourceProviderNamespace}/${resourceType}/${resourceName}/providers/microsoft.insights/metrics?$filter=(name.value eq 'Requests' or name.value eq 'http401' or name.value eq 'http403') and startTime eq 2017-07-11T12:30:00Z and endTime eq 2017-07-11T12:31:00Z&api-version=2016-06-01
For more details, you could refer to Azure Monitoring REST API Walkthrough.
Additionally, you could leverage WebJobs to run your programs or scripts periodically, retrieve the expected metrics, and apply your own logic for sending the alert notification.
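As a minimal sketch of the comparison logic you would run on top of the retrieved metrics (the names, threshold and example values are illustrative, not part of any Azure API):
// totals extracted from the Azure Monitor metrics response for one time window
interface WindowCounts {
  requests: number;    // 'Requests' metric
  httpErrors: number;  // sum of the error metrics, e.g. Http401 + Http403 + ...
}

// alert when HTTP errors exceed 0.5% of all requests in the window
function shouldAlert({ requests, httpErrors }: WindowCounts, threshold = 0.005): boolean {
  return requests > 0 && httpErrors / requests > threshold;
}

// example: 3 errors out of 400 requests is 0.75%, so this prints true
console.log(shouldAlert({ requests: 400, httpErrors: 3 }));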
Have you tried Application Insights? I think it can fulfill your requirement.
Application Insights provides alert rules such as Exception rate, Server exceptions, and Failed requests. Basically, it already configures abnormal-status alerts out of the box and will send email notifications, the so-called Smart Detection.
Unfortunately I don't believe this is possible today.

Use ELK Stack to visualise metrics of Telegraf or StatsD

I am aggregating my logs using the ELK stack. Now I would like to show metrics and create alerts with it too, like current CPU usage, number of requests handled, number of DB queries, etc.
I can collect the metrics using Telegraf or StatsD but how do I plug them into Logstash? There is no Logstash input for either of these two.
Does this approach even make sense, or should I aggregate time-series data in a different system? I would like to have everything under one hood.
I can give you some insight on how to accomplish this with Telegraf:
Option 1: Telegraf output TCP into Logstash. This is what I do personally, because I like to have all of my data go through Logstash for tagging and mutations.
Telegraf output config:
[[outputs.socket_writer]]
  ## URL to connect to
  address = "tcp://$LOGSTASH_IP:8094"
Logstash input config:
input {
  tcp {
    port => 8094
  }
}
Option 2: Telegraf directly to Elasticsearch. The docs for this are good and should tell you what to do!
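A minimal sketch of that output, assuming a local Elasticsearch node (the URL and index pattern are assumptions):
[[outputs.elasticsearch]]
  ## Elasticsearch node(s) to write to
  urls = ["http://localhost:9200"]
  ## daily index, e.g. telegraf-2017.07.11
  index_name = "telegraf-%Y.%m.%d"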
From an ideological perspective, inserting metrics into the ELK stack may or may not be the right thing to do - it depends on your use case. I switched to using Telegraf/InfluxDB because I had a lot of metrics and my consumers preferred the Influx query syntax for time-series data and some other Influx features such as rollups.
But there is something to be said about reducing complexity by having all of your data "under one hood". Elastic is also making the push toward being more suitable for time-series data with Timelion and there were a few talks at Elasticon concerning storing time-series data in Elasticsearch. Here's one. I would say that storing your metrics in ELK is a completely reasonable thing to do. :)
Let me know if this helps.
Here are various options for storing metrics from StatsD to ES:
Using the statsd module of Metricbeat. The metrics can be sent to Metricbeat in StatsD format; Metricbeat then forwards them to ES.
Example of metricbeat configuration:
metricbeat.modules:
- module: statsd
  host: "localhost"
  port: 8125
  enabled: true
  period: 2s
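For illustration, a counter can be sent to that listener in plain StatsD line protocol over UDP (the metric name is made up):
echo "myapp.requests:1|c" | nc -u -w1 localhost 8125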
Elasticsearch as a StatsD backend. The following project allows saving metrics from StatsD to ES:
https://github.com/markkimsal/statsd-elasticsearch-backend
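A minimal sketch of registering that backend in the StatsD daemon's config.js (backend-specific settings such as the Elasticsearch host, port and index options should be taken from the project's README):
{
  port: 8125,
  backends: [ "statsd-elasticsearch-backend" ]
}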
