CloudWatch Log Alert - How to include error / exception / stack trace data in email notification

Question

I just configured CloudWatch Logs on my EC2 instances and am loving it so far. I also set up alerts for certain keywords, like "ERROR". While the email alert seems to be working fine, I was wondering if there's a way to fine-tune the alert email to make it a little more concise and informative. Specifically, I'm looking to:

  1. Get rid of all the boilerplate text in the alert email.

  2. Include some information about the Error/Exception that triggered the alert. This could be something as simple as including the log statement that generated the alert.

Right now, the alert email looks like

You are receiving this email because your Amazon CloudWatch Alarm "App-Error-Alarm" in the US East - N. Virginia region has entered the ALARM state, because "Threshold Crossed: 1 datapoint (1.0) was greater than or equal to the threshold (1.0)." at "Tuesday 07 February, 2017 16:39:43 UTC".

View this alarm in the AWS Management Console: https://console.aws.amazon.com/cloudwatch/home?region=us-east-1#s=Alarms&alarm=App-Error-Alarm

Alarm Details:
- Name: App-Error-Alarm
- Description: Errors in app.log
- State Change: INSUFFICIENT_DATA -> ALARM
- Reason for State Change: Threshold Crossed: 1 datapoint (1.0) was greater than or equal to the threshold (1.0).
- Timestamp: Tuesday 07 February, 2017 16:39:43 UTC
- AWS Account: <>

Threshold:
- The alarm is in the ALARM state when the metric is GreaterThanOrEqualToThreshold 1.0 for 300 seconds.

Monitored Metric:
- MetricNamespace: LogMetrics
- MetricName: ERROR
- Dimensions:
- Period: 300 seconds
- Statistic: Sum
- Unit: not specified

State Change Actions:
- OK:
- ALARM: [arn:aws:sns:us-east-1:<>:support]
- INSUFFICIENT_DATA:

I'd like it to look something like

Alarm: App-Error-Alarm

Keyword: "ERROR"

Reason: ERROR 2017-02-07 07:31:47,375 [SimpleAsyncTaskExecutor-5] com.app.server.rest.Watcher: javax.net.ssl.SSLHandshakeException: Received fatal alert: handshake_failure

It's short, sweet and instantly tells me whether it's something that needs my immediate attention. Can this be done without writing code as suggested here?

Answer

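You have this problem because you configured an alarm, and an alarm is meant for aggregated data, not for a specific log record. It is configured on a metric (here, the number of log records containing the ERROR keyword), so by the time the alarm fires, the individual log lines are no longer available to it.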

You can use a log subscription instead and stream all log records matching a filter to a custom Lambda function. The function can then send notifications by email or to Slack.

To configure log streaming, go to Lambda in the AWS console and create a new function from the blueprint named "cloudwatch-logs-process-data". It provides a basic structure and is easy to customize to your needs.
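As a rough illustration of what such a customized function might look like, here is a minimal Python sketch. It relies on the documented shape of a CloudWatch Logs subscription event (a base64-encoded, gzip-compressed JSON document under `awslogs.data`) and on the boto3 SNS `publish` call. The helper names, the `ALERT_TOPIC_ARN` environment variable, and the email layout are assumptions made for this example, not part of the blueprint.

```python
import base64
import gzip
import json
import os


def decode_log_payload(event):
    """Decode the base64-encoded, gzip-compressed JSON that CloudWatch Logs delivers."""
    compressed = base64.b64decode(event["awslogs"]["data"])
    return json.loads(gzip.decompress(compressed))


def build_notification(payload, keyword="ERROR"):
    """Build a short (subject, body) pair; the layout is a hypothetical choice."""
    # SNS subjects must be shorter than 100 characters, so truncate defensively.
    subject = f"Alarm: {payload['logGroup']}"[:99]
    reasons = [f"Reason: {e['message'].strip()}" for e in payload["logEvents"]]
    header = [f"Log stream: {payload['logStream']}", f'Keyword: "{keyword}"', ""]
    return subject, "\n".join(header + reasons)


def lambda_handler(event, context):
    payload = decode_log_payload(event)
    # Control messages only verify the subscription and carry no real log lines.
    if payload.get("messageType") != "DATA_MESSAGE":
        return None
    subject, body = build_notification(payload)

    import boto3  # bundled with the Lambda Python runtime

    # ALERT_TOPIC_ARN is an assumed environment variable naming the SNS topic
    # that your email address is subscribed to.
    boto3.client("sns").publish(
        TopicArn=os.environ["ALERT_TOPIC_ARN"],
        Subject=subject,
        Message=body,
    )
    return len(payload["logEvents"])
```

With a subscription filter pattern such as `ERROR` on the log group, each matching batch of log lines would arrive in one invocation, so the email contains the actual log statements rather than the alarm boilerplate. The function's execution role would need permission to publish to the topic.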
