Configuring DingTalk alerts and templates for Alertmanager (prometheus-webhook-dingtalk), based on kube-prometheus


Alertmanager receivers do not support DingTalk URLs directly, so a plugin container, prometheus-webhook-dingtalk, has to be deployed.

One more thing to note: when the receiver is DingTalk (webhook_configs), the alert template is not specified in the Alertmanager configuration file but in the DingTalk plugin, prometheus-webhook-dingtalk.

Writing the prometheus-webhook-dingtalk configuration file and template

vim dingtalk-configmap.yaml. Remember to replace the access_token in the URL with the token of your own DingTalk robot.

# The firing block uses a red heading (#FF0000), the resolved block a green one (#00FF00).
# Fields in the template are separated by blank lines so that DingTalk markdown
# renders each field on its own line.
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-webhook-dingtalk-config
  namespace: monitoring
data:
  config.yml: |-
    templates:
      - /etc/prometheus-webhook-dingtalk/default.tmpl
    targets:
      webhook1:
        # replace with the webhook URL of your DingTalk robot
        url: https://oapi.dingtalk.com/robot/send?access_token=xxxxxxxxxxxx
        message:
          text: '{{ template "default.tmpl" . }}'
  default.tmpl: |
    {{ define "default.tmpl" }}
    {{- if gt (len .Alerts.Firing) 0 -}}
    {{- range $index, $alert := .Alerts.Firing -}}
    ============ = **<font color='#FF0000'>Alert</font>** = =============

    **Alert name:** {{ $alert.Labels.alertname }}

    **Severity:** {{ $alert.Labels.severity }}

    **Status:** {{ $alert.Status }}

    **Instance:** {{ $alert.Labels.instance }} {{ $alert.Labels.device }}

    **Summary:** {{ $alert.Annotations.summary }}

    **Details:** {{ $alert.Annotations.message }}{{ $alert.Annotations.description }}

    **Started at:** {{ ($alert.StartsAt.Add 28800e9).Format "2006-01-02 15:04:05" }}

    ============ = end = =============
    {{- end }}
    {{- end }}
    {{- if gt (len .Alerts.Resolved) 0 -}}
    {{- range $index, $alert := .Alerts.Resolved -}}
    ============ = <font color='#00FF00'>Resolved</font> = =============

    **Instance:** {{ $alert.Labels.instance }}

    **Alert name:** {{ $alert.Labels.alertname }}

    **Severity:** {{ $alert.Labels.severity }}

    **Status:** {{ $alert.Status }}

    **Summary:** {{ $alert.Annotations.summary }}

    **Details:** {{ $alert.Annotations.message }}{{ $alert.Annotations.description }}

    **Started at:** {{ ($alert.StartsAt.Add 28800e9).Format "2006-01-02 15:04:05" }}

    **Resolved at:** {{ ($alert.EndsAt.Add 28800e9).Format "2006-01-02 15:04:05" }}

    ============ = **end** = =============
    {{- end }}
    {{- end }}
    {{- end }}
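
The constant 28800e9 passed to .Add in the template is a duration in nanoseconds, that is 28800 seconds or 8 hours, which shifts the UTC timestamps sent by Alertmanager to UTC+8. The layout string "2006-01-02 15:04:05" is Go's way of writing year-month-day hour:minute:second. The arithmetic can be sketched in Python as follows; the function name to_local_string is only illustrative and is not part of the plugin.

```python
from datetime import datetime, timedelta, timezone

# Same constant as in the template: 28800e9 nanoseconds.
OFFSET_NS = 28800e9


def to_local_string(starts_at_utc: datetime) -> str:
    """Shift a UTC timestamp by the template's offset and format it
    like the Go layout "2006-01-02 15:04:05"."""
    shifted = starts_at_utc + timedelta(seconds=OFFSET_NS / 1e9)
    return shifted.strftime("%Y-%m-%d %H:%M:%S")


# 16:30 UTC on 1 January becomes 00:30 on 2 January after adding 8 hours.
print(to_local_string(datetime(2024, 1, 1, 16, 30, 0, tzinfo=timezone.utc)))
# prints "2024-01-02 00:30:00"
```

For a different time zone, the same idea applies with a different offset: hours × 3600 × 1e9 nanoseconds.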

Then create the ConfigMap: kubectl apply -f dingtalk-configmap.yaml

Deploying the DingTalk plugin

Different versions of the plugin use different flags to specify the configuration file; the version deployed here is v2.1.0.

vim dingtalk-webhook-deploy.yaml. This file does not need to be modified.

apiVersion: v1
kind: Service
metadata:
  name: dingtalk
  namespace: monitoring
  labels:
    app: dingtalk
spec:
  selector:
    app: dingtalk
  ports:
  - name: dingtalk
    port: 8060
    protocol: TCP
    targetPort: 8060
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dingtalk
  namespace: monitoring
spec:
  replicas: 2
  selector:
    matchLabels:
      app: dingtalk
  template:
    metadata:
      name: dingtalk
      labels:
        app: dingtalk
    spec:
      containers:
      - name: dingtalk
        image: timonwong/prometheus-webhook-dingtalk:v2.1.0
        imagePullPolicy: IfNotPresent
        args:
        - --web.listen-address=:8060
        - --config.file=/etc/prometheus-webhook-dingtalk/config.yml
        ports:
        - containerPort: 8060
        volumeMounts:
        - name: config
          mountPath: /etc/prometheus-webhook-dingtalk
      volumes:
      - name: config
        configMap:
          name: prometheus-webhook-dingtalk-config

kubectl apply -f dingtalk-webhook-deploy.yaml
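
Before wiring up Alertmanager, it can help to check the plugin and the template on their own by posting a hand-built alert to it. The sketch below builds a minimal payload in the shape Alertmanager sends to webhook receivers (version "4") and posts it with the standard library. The alert name, labels and annotations are made-up test values, and build_test_payload and send are illustrative helper names, not part of any library.

```python
import json
import urllib.request
from datetime import datetime, timezone


def build_test_payload(status: str = "firing") -> dict:
    """Build a minimal Alertmanager-style webhook payload with one alert.

    All label and annotation values here are hypothetical test data.
    """
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    alert = {
        "status": status,
        "labels": {
            "alertname": "TestAlert",
            "severity": "warning",
            "instance": "demo:9100",
        },
        "annotations": {"summary": "manual test", "description": "sent by hand"},
        "startsAt": now,
        # Firing alerts carry a zero end time; resolved ones carry a real one.
        "endsAt": now if status == "resolved" else "0001-01-01T00:00:00Z",
        "generatorURL": "",
    }
    return {
        "version": "4",
        "groupKey": "test",
        "status": status,
        "receiver": "webhook",
        "groupLabels": {"alertname": "TestAlert"},
        "commonLabels": alert["labels"],
        "commonAnnotations": alert["annotations"],
        "externalURL": "",
        "alerts": [alert],
    }


def send(url: str, payload: dict) -> int:
    """POST the payload as JSON and return the HTTP status code."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status
```

One way to use it, assuming the Service above is reachable locally through kubectl port-forward svc/dingtalk 8060:8060 -n monitoring, is to call send("http://localhost:8060/dingtalk/webhook1/send", build_test_payload()) and then send a second payload built with status "resolved" to exercise the green block of the template.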

Writing the main Alertmanager configuration file

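vim alertmanager.yaml. In this file you need to add a route of your own (or use the default route) together with the matching receiver.

The webhook receiver here is really just the address of the Service of the DingTalk plugin deployed above.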

global:
  resolve_timeout: 5m
  wechat_api_url: 'https://qyapi.weixin.qq.com/cgi-bin/'
  wechat_api_secret: '*****'
  wechat_api_corp_id: '*******'
  smtp_smarthost: 'smtp.163.com:25'
  smtp_from: 'your email address'
  smtp_auth_username: 'email username'
  smtp_auth_password: 'password or authorization code'
  smtp_require_tls: false
route:
  group_by: ['alertname','job']
  group_wait: 30s
  group_interval: 1m
  repeat_interval: 30m
  receiver: 'wechat'
  routes:
  - match:
      job: 'prometheus'
    receiver: 'webhook'
receivers:
- name: 'email'
  email_configs:
  - to: 'email recipient'
- name: 'wechat'
  wechat_configs:
  - send_resolved: true
    to_party: '2'
    agent_id: '1'
- name: 'webhook'
  webhook_configs:
  # The host is the name of the plugin Service (dingtalk); webhook1 is the target name in config.yml.
  # If Alertmanager runs in a different namespace than the plugin, use
  # http://dingtalk.monitoring.svc.cluster.local:8060/dingtalk/webhook1/send
  - url: 'http://dingtalk:8060/dingtalk/webhook1/send'
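
The receiver URL is assembled from the Service name, optionally the namespace, the port, and the target name defined under targets in the plugin's config.yml. A small sketch of that rule, assuming the Service is named dingtalk in namespace monitoring as in the manifest above and that the cluster uses the default cluster.local domain; dingtalk_webhook_url is an illustrative helper, not something the plugin provides.

```python
from typing import Optional


def dingtalk_webhook_url(target: str,
                         service: str = "dingtalk",
                         namespace: Optional[str] = None,
                         port: int = 8060) -> str:
    """Build the URL Alertmanager should post to for a given plugin target.

    Without a namespace the short Service name is used, which only resolves
    from inside the same namespace. With a namespace the fully qualified
    in-cluster DNS name is used (assumes the default cluster.local domain).
    """
    if namespace is None:
        host = service
    else:
        host = f"{service}.{namespace}.svc.cluster.local"
    return f"http://{host}:{port}/dingtalk/{target}/send"


print(dingtalk_webhook_url("webhook1"))
# prints "http://dingtalk:8060/dingtalk/webhook1/send"
print(dingtalk_webhook_url("webhook1", namespace="monitoring"))
# prints "http://dingtalk.monitoring.svc.cluster.local:8060/dingtalk/webhook1/send"
```

If a second robot is added under targets, say as webhook2, only the target segment of the URL changes.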

First delete the existing Secret object

kubectl delete secret alertmanager-main -n monitoring

secret "alertmanager-main" deleted

Create the new Secret object

kubectl create secret generic alertmanager-main --from-file=alertmanager.yaml -n monitoring

secret "alertmanager-main" created

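The DingTalk alert and template configuration is now complete. Note that after you update the ConfigMap, the configuration used by the running pods is not refreshed automatically; the pods have to be recreated, for example with kubectl rollout restart deployment dingtalk -n monitoring. If you run into problems, leave a comment.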
