Prometheus alert with multiple routes: receiver doesn't send for one rule - prometheus-alertmanager

I have this alert configuration and expect the following behavior:
If destination: bloom and severity: info, send to slack-alert-info - this works.
If destination: bloom and severity: warning|critical, send to slack-alert-multi - this is where the error is.
severity: warning is sent to both Slack channels as expected, but critical is sent only to the default channel.
Can someone help me understand my mistake, please?
amtool reports no errors:
amtool config routes test --config.file=/opt/prometheus/etc/alertmanager.yml --tree --verify.receivers=slack-alert-multi severity=warning destination=bloom
Matching routes:
.
└── default-route
    └── {destination=~"^(?:bloom)$",severity=~"^(?:warning|critical)$"} receiver: slack-alert-multi
slack-alert-multi
amtool config routes test --config.file=/opt/prometheus/etc/alertmanager.yml --tree --verify.receivers=slack-alert-multi severity=critical destination=bloom
Matching routes:
.
└── default-route
    └── {destination=~"^(?:bloom)$",severity=~"^(?:warning|critical)$"} receiver: slack-alert-multi
slack-alert-multi
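For completeness, the same routing test can be run for the info case; assuming the same config file, it should resolve to slack-alert-info:
amtool config routes test --config.file=/opt/prometheus/etc/alertmanager.yml --tree --verify.receivers=slack-alert-info severity=info destination=bloom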
Alert configuration
...
labels:
  alerttype: infrastructure
  severity: warning
  destination: bloom
...
---
global:
  resolve_timeout: 30m
route:
  group_by: [ 'alertname', 'cluster', 'severity' ]
  group_wait: 30s
  group_interval: 30s
  repeat_interval: 300s
  receiver: 'slack'
  routes:
    - receiver: 'slack-alert-multi'
      match_re:
        destination: bloom
        severity: warning|critical
    - receiver: 'slack-alert-info'
      match_re:
        destination: bloom
        severity: info
receivers:
  - name: 'slack-alert-multi'
    slack_configs:
      - api_url: 'https://hooks.slack.com/services/T0/B0/V2'
        channel: '#alert-upload'
        send_resolved: true
        icon_url: 'https://avatars3.githubusercontent.com/u/3380462'
        title: '{{ template "custom_title" . }}'
        text: '{{ template "custom_slack_message" . }}'
      - api_url: 'https://hooks.slack.com/services/T0/B0/J1'
        channel: '#alert-exports'
        send_resolved: true
        icon_url: 'https://avatars3.githubusercontent.com/u/3380462'
        title: '{{ template "custom_title" . }}'
        text: '{{ template "custom_slack_message" . }}'
  # Default receiver
  - name: 'slack'
    slack_configs:
      - api_url: 'https://hooks.slack.com/services/T0/B0/2x'
        channel: '#aws-notification'
        send_resolved: true
        icon_url: 'https://avatars3.githubusercontent.com/u/3380462'
        title: '{{ template "custom_title" . }}'
        text: '{{ template "custom_slack_message" . }}'
  - name: 'slack-alert-info'
    slack_configs:
      - api_url: 'https://hooks.slack.com/services/T0/B0/EA'
        channel: '#alert-info'
        send_resolved: true
        icon_url: 'https://avatars3.githubusercontent.com/u/3380462'
        title: '{{ template "custom_title" . }}'
        text: '{{ template "custom_slack_message" . }}'
templates:
  - '/opt/alertmanager_notifications.tmpl'

Try adding
continue: true
into the route:
- receiver: 'slack-alert-info'
  match_re:
    destination: bloom
    severity: info
  continue: true
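For context, continue: true tells Alertmanager to keep evaluating the sibling routes after a match instead of stopping at the first matching one, so a single alert can reach more than one receiver. A sketch of the full routes block from the question with that change applied:
routes:
  - receiver: 'slack-alert-multi'
    match_re:
      destination: bloom
      severity: warning|critical
  - receiver: 'slack-alert-info'
    match_re:
      destination: bloom
      severity: info
    continue: true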

Related

AlertManager not sending mail

I have the following YAML ConfigMap for Alertmanager, but it is not sending mail. I verified that the SMTP settings work in another script.
kind: ConfigMap
apiVersion: v1
metadata:
  name: alertmanager-config
  namespace: monitoring
data:
  config.yml: |-
    global:
      smtp_smarthost: 'smtp.gmail.com:587'
      smtp_from: 'AlertManager@xxx.com'
      smtp_auth_username: 'alertmanager@gmail.com'
      smtp_auth_password: 'xxxxxxxx'
    templates:
      - '/etc/alertmanager/*.tmpl'
    route:
      receiver: alert-emailer
      group_by: ['alertname', 'priority']
      group_wait: 10s
      repeat_interval: 30m
      routes:
        - receiver: slack_demo
          # Send severity=slack alerts to slack.
          match:
            severity: slack
          group_wait: 10s
          repeat_interval: 1m
    receivers:
      - name: alert-emailer
        email_configs:
          - to: alertmanager@gmail.com
            send_resolved: false
            from: alertmanager@gmail.com
            smarthost: smtp.gmail.com:587
            require_tls: false
      - name: slack_demo
        slack_configs:
          - api_url: https://hooks.slack.com/services/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sz4Qk
            channel: '#xxxxxxxx'
Any idea why it is not working?
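One thing to check: Gmail's SMTP submission endpoint on port 587 expects STARTTLS, and Alertmanager's require_tls setting defaults to true, so explicitly setting require_tls: false on that port is a likely cause of the failure. A minimal sketch of the email receiver with TLS left on (same placeholder addresses as above):
receivers:
  - name: alert-emailer
    email_configs:
      - to: alertmanager@gmail.com
        from: alertmanager@gmail.com
        smarthost: smtp.gmail.com:587
        send_resolved: false
        require_tls: true   # Gmail on port 587 requires STARTTLS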

Kube Prometheus Stack Chart - Alertmanager

When I enable Alertmanager, a secret is created with the name alertmanager-{chartName}-alertmanager, but no Alertmanager pods or StatefulSet are created.
When I delete this secret with kubectl delete and upgrade the chart again, new secrets are created: alertmanager-{chartName}-alertmanager and alertmanager-{chartName}-alertmanager-generated. In this case I can see the Alertmanager pods and StatefulSet, but the -generated secret has the default values, which are null, while the alertmanager-{chartName}-alertmanager secret has the updated configuration.
I checked the alertmanager.yml with amtool and it reports it as valid.
Chart - kube-prometheus-stack-36.2.0
# Configuration in my values.yaml
alertmanager:
  enabled: true
  global:
    resolve_timeout: 5m
    smtp_require_tls: false
  route:
    receiver: 'email'
  receivers:
    - name: 'null'
    - name: 'email'
      email_configs:
        - to: xyz@gmail.com
          from: abc@gmail.com
          smarthost: x.x.x.x:25
          send_resolved: true
# Configuration from the secret alertmanager-{chartName}-alertmanager
global:
  resolve_timeout: 5m
  smtp_require_tls: false
inhibit_rules:
  - equal:
      - namespace
      - alertname
    source_matchers:
      - severity = critical
    target_matchers:
      - severity =~ warning|info
  - equal:
      - namespace
      - alertname
    source_matchers:
      - severity = warning
    target_matchers:
      - severity = info
  - equal:
      - namespace
    source_matchers:
      - alertname = InfoInhibitor
    target_matchers:
      - severity = info
receivers:
  - name: "null"
  - email_configs:
      - from: abc@gmail.com
        send_resolved: true
        smarthost: x.x.x.x:25
        to: xyz@gmail.com
    name: email
route:
  group_by:
    - namespace
  group_interval: 5m
  group_wait: 30s
  receiver: email
  repeat_interval: 12h
  routes:
    - matchers:
        - alertname =~ "InfoInhibitor|Watchdog"
      receiver: "null"
templates:
  - /etc/alertmanager/config/*.tmpl
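For reference, the kube-prometheus-stack chart expects the Alertmanager configuration under alertmanager.config in values.yaml, which it then renders into the alertmanager-{chartName}-alertmanager secret. A minimal sketch of that layout, reusing the receivers from above:
alertmanager:
  enabled: true
  config:
    global:
      resolve_timeout: 5m
      smtp_require_tls: false
    route:
      receiver: 'email'
    receivers:
      - name: 'null'
      - name: 'email'
        email_configs:
          - to: xyz@gmail.com
            from: abc@gmail.com
            smarthost: x.x.x.x:25
            send_resolved: true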

Alertmanager randomly getting error message "unexpected status code 422"

I have deployed Prometheus from the community Helm chart (14.6.0), which runs Alertmanager, and it shows errors from time to time (templating issues) with an error message that reveals nothing extra useful. The thing is that I have re-tested the config via amtool and did not receive any error in the config.
level=error ts=2021-08-17T14:43:08.787Z caller=dispatch.go:309 component=dispatcher msg="Notify for alerts failed" num_alerts=2 err="opsgenie/opsgenie[0]: notify retry canceled due to unrecoverable error after 1 attempts: unexpected status code 422: {\"message\":\"Request body is not processable. Please check the errors.\",\"errors\":{\"message\":\"Message can not be empty.\"},\"took\":0.0,\"requestId\":\"38c37c18-5635-48bc-bb69-bda03e232cce\"}"
level=debug ts=2021-08-17T14:43:08.798Z caller=notify.go:685 component=dispatcher receiver=opsgenie integration=opsgenie[0] msg="Notify success" attempts=1
level=error ts=2021-08-17T14:43:08.804Z caller=dispatch.go:309 component=dispatcher msg="Notify for alerts failed" num_alerts=2 err="opsgenie/opsgenie[0]: notify retry canceled due to unrecoverable error after 1 attempts: unexpected status code 422: {\"message\":\"Request body is not processable. Please check the errors.\",\"errors\":{\"message\":\"Message can not be empty.\"},\"took\":0.001,\"requestId\":\"70d2ac84-3422-4fe6-9d8b-e601fdc37b25\"}"
Monitoring is working and alerts are coming through; I would just like to understand how this error can be interpreted and what could be wrong, as enabling debug mode did not provide more information.
alertmanager config:
global: {}
receivers:
  - name: opsgenie
    opsgenie_configs:
      - api_key: XXX
        api_url: https://api.eu.opsgenie.com/
        details:
          Prometheus alert: ' {{ .CommonLabels.alertname }}, {{ .CommonLabels.namespace }}, {{ .CommonLabels.pod }}, {{ .CommonLabels.dimension_CacheClusterId }}, {{ .CommonLabels.dimension_DBInstanceIdentifier }}, {{ .CommonLabels.dimension_DBClusterIdentifier }}'
        http_config: {}
        message: '{{ .CommonAnnotations.message }}'
        priority: '{{ if eq .CommonLabels.severity "critical" }}P2{{ else if eq .CommonLabels.severity "high" }}P3{{ else if eq .CommonLabels.severity "warning" }}P4{{ else }}P5{{ end }}'
        send_resolved: true
        tags: ' Prometheus, {{ .CommonLabels.namespace }}, {{ .CommonLabels.severity }}, {{ .CommonLabels.alertname }}, {{ .CommonLabels.pod }}, {{ .CommonLabels.kubernetes_node }}, {{ .CommonLabels.dimension_CacheClusterId }}, {{ .CommonLabels.dimension_DBInstanceIdentifier }}, {{ .CommonLabels.dimension_Cluster_Name }}, {{ .CommonLabels.dimension_DBClusterIdentifier }} '
  - name: deadmansswitch
    webhook_configs:
      - http_config:
          basic_auth:
            password: XXX
        send_resolved: true
        url: https://api.eu.opsgenie.com/v2/heartbeats/prometheus-nonprod/ping
  - name: blackhole
route:
  group_by:
    - alertname
    - namespace
    - kubernetes_node
    - dimension_CacheClusterId
    - dimension_DBInstanceIdentifier
    - dimension_Cluster_Name
    - dimension_DBClusterIdentifier
    - server_name
  group_interval: 5m
  group_wait: 10s
  receiver: opsgenie
  repeat_interval: 5m
  routes:
    - group_interval: 1m
      match:
        alertname: DeadMansSwitch
      receiver: deadmansswitch
      repeat_interval: 1m
    - match_re:
        namespace: XXX
    - match_re:
        alertname: HighMemoryUsage|HighCPULoad|CPUThrottlingHigh
    - match_re:
        namespace: .+
      receiver: blackhole
    - group_by:
        - instance
      match:
        alertname: PrometheusBlackboxEndpoints
    - match_re:
        alertname: .*
    - match_re:
        kubernetes_node: .*
    - match_re:
        dimension_CacheClusterId: .*
    - match_re:
        dimension_DBInstanceIdentifier: .*
    - match_re:
        dimension_Cluster_Name: .*
    - match_re:
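Given the 422 body "Message can not be empty", the most likely culprit is message: '{{ .CommonAnnotations.message }}': CommonAnnotations only contains annotations shared by every alert in the group, so it renders empty whenever the grouped alerts do not all carry a message annotation. A minimal sketch of a fallback, assuming the alerts at least carry an alertname label:
receivers:
  - name: opsgenie
    opsgenie_configs:
      - api_key: XXX
        api_url: https://api.eu.opsgenie.com/
        # Fall back to the alert name so the Opsgenie "message" field is never empty
        # when the grouped alerts do not all share a "message" annotation.
        message: '{{ if .CommonAnnotations.message }}{{ .CommonAnnotations.message }}{{ else }}{{ .CommonLabels.alertname }}{{ end }}'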

Alertmanager email route

I am trying to configure the "route" of Alertmanager; below is my configuration:
route:
  group_by: ['instance']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 7m
  receiver: pager
  routes:
    - match:
        severity: critical
      receiver: email
    - match_re:
        severity: ^(warning|critical)$
      receiver: support_team
receivers:
  - name: 'email'
    email_configs:
      - to: 'xxxxxx@xx.com'
  - name: 'support_team'
    email_configs:
      - to: 'xxxxxx@xx.com'
  - name: 'pager'
    email_configs:
      - to: 'alert-pager@example.com'
Now the e-mail is only sent to the default receiver "pager"; it is not routed further to the custom ones.
You need this line on each route when you want alerts to also be routed to the other ones.
continue: true
e.g.
route:
  group_by: ['instance']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 7m
  receiver: pager
  routes:
    - match:
        severity: critical
      receiver: email
      continue: true
    - match_re:
        severity: ^(warning|critical)$
      receiver: support_team
      continue: true
receivers:
  - name: 'email'
    email_configs:
      - to: 'xxxxxx@xx.com'
  - name: 'support_team'
    email_configs:
      - to: 'xxxxxx@xx.com'
  - name: 'pager'
    email_configs:
      - to: 'alert-pager@example.com'
By the way, IMHO receiver should be at the same level as match in the YAML structure.

alertmanager won't send out e-mail

I am trying to use e-mail to receive alerts from Prometheus with Alertmanager; however, it keeps printing logs like "Error on notify: EOF" source="notify.go:283" and "Notify for 3 alerts failed: EOF" source="dispatch.go:261". My Alertmanager config is below:
smtp_smarthost: 'smtp.xxx.com:xxx'
smtp_from: 'xxxxx@xxx.com'
smtp_auth_username: 'xxxx@xxx.com'
smtp_auth_password: 'xxxxxxx'
smtp_require_tls: false
route:
  group_by: ['instance']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 7m
  receiver: email
  routes:
    - match:
        severity: critical
      receiver: email
    - match_re:
        severity: ^(warning|critical)$
      receiver: support_team
receivers:
  - name: 'email'
    email_configs:
      - to: 'xxxxxx@xx.com'
  - name: 'support_team'
    email_configs:
      - to: 'xxxxxx@xx.com'
  - name: 'pager'
    email_configs:
      - to: 'alert-pager@example.com'
Any suggestions?
Using smtp.xxx.com:587 fixed the issue, but I also needed to set smtp_require_tls: true.
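In other words, a minimal sketch of that fix with the same placeholders (assuming the smtp_* settings live under global:):
global:
  smtp_smarthost: 'smtp.xxx.com:587'   # STARTTLS submission port
  smtp_from: 'xxxxx@xxx.com'
  smtp_auth_username: 'xxxx@xxx.com'
  smtp_auth_password: 'xxxxxxx'
  smtp_require_tls: true               # the server expects STARTTLS on 587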
