Configure an SQS dead-letter queue to raise a CloudWatch alarm on receiving a message


Question

I am working with a dead-letter queue (DLQ) in Amazon SQS. Whenever the queue receives a new message, I want it to raise a CloudWatch alarm. I configured an alarm on the queue's number_of_messages_sent metric, but that metric does not behave as expected for dead-letter queues, as noted in the "Amazon SQS Dead-Letter Queues" page of the Amazon Simple Queue Service documentation.

Some suggestions recommend using number_of_messages_visible instead, but I am not sure how to configure that in an alarm. If I alarm when the metric is > 0, that is not the same as detecting a new message: as long as an old message remains in the queue, the value stays above 0. I could use a math expression to get the change in this metric over a defined period (say, one minute), but I am looking for a better solution.

Recommended answer

I struggled with the same problem, and the answer for me was to use NumberOfMessagesSent instead. That let me set the alarm criteria on new messages that arrived during my configured period. Here is what worked for me in CloudFormation.

Note that no further individual alarm notifications are sent while the alarm stays in the ALARM state because of constant failures. You can set up another alarm to catch those, e.g. alarm when 100 errors occur in 1 hour, using the same method.
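A second alarm of that kind might look like the sketch below. It assumes the NumberOfMessagesSent metric mentioned above and reuses the resource names MyDeadLetterQueue and MyAlarmTopic from the template later in this answer. The name DLQsustainedFailureAlarm and the exact values are illustrative and not part of the original answer; swap in whichever metric fits how messages reach your queue.

```yaml
# Hypothetical companion alarm: fires when 100 or more messages
# are counted within a single one-hour period.
DLQsustainedFailureAlarm:
    Type: AWS::CloudWatch::Alarm
    Properties:
      AlarmDescription: "Alarm when 100 or more DLQ messages are counted in 1 hour"
      Namespace: "AWS/SQS"
      MetricName: "NumberOfMessagesSent"  # assumption: replace if this metric does not fit your queue
      Dimensions:
        - Name: "QueueName"
          Value:
            Fn::GetAtt:
              - "MyDeadLetterQueue"
              - "QueueName"
      Statistic: "Sum"
      Period: 3600            # one hour
      EvaluationPeriods: 1
      Threshold: 100
      ComparisonOperator: "GreaterThanOrEqualToThreshold"
      AlarmActions:
        - !Ref MyAlarmTopic
```

Because this alarm uses a much higher threshold than the per-message alarm, it changes state independently of it, so it can still notify while the first alarm remains in the ALARM state.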

Update: Because the NumberOfMessagesReceived and NumberOfMessagesSent metrics depend on how the message is queued, I devised a new solution for our needs: add a delay to the DLQ settings and alarm on the ApproximateNumberOfMessagesDelayed metric. If you add messages to the queue manually, NumberOfMessagesReceived will work. Otherwise, use ApproximateNumberOfMessagesDelayed after setting up a delay.

MyDeadLetterQueue:
    Type: AWS::SQS::Queue
    Properties:
      MessageRetentionPeriod: 1209600  # 14 days
      DelaySeconds: 60  # for alarms: new messages are delayed, so the Delayed metric sees them

DLQthresholdAlarm:
    Type: AWS::CloudWatch::Alarm
    Properties:
      AlarmDescription: "Alarm dlq messages when we have 1 or more failed messages in 10 minutes"
      Namespace: "AWS/SQS"
      MetricName: "ApproximateNumberOfMessagesDelayed"
      Dimensions:
        - Name: "QueueName"
          Value:
            Fn::GetAtt:
              - "MyDeadLetterQueue"
              - "QueueName"
      Statistic: "Sum"
      Period: 300              # 5 minutes per datapoint
      DatapointsToAlarm: 1     # one breaching datapoint is enough
      EvaluationPeriods: 2     # 2 x 5 minutes = 10-minute window
      Threshold: 1
      ComparisonOperator: "GreaterThanOrEqualToThreshold"
      AlarmActions:
        - !Ref MyAlarmTopic    # notification target defined elsewhere in the template
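
For comparison, the delta approach mentioned in the question can be sketched with CloudWatch metric math. This is only an illustration, not part of the original answer: it assumes the CloudWatch metric ApproximateNumberOfMessagesVisible and the metric math function DIFF, which returns the change between consecutive datapoints. The resource name DLQnewMessageAlarm and the ids visible and delta are made up for the example.

```yaml
# Hypothetical alternative: alarm when the visible message count increases
# from one minute to the next, rather than whenever it is above zero.
DLQnewMessageAlarm:
    Type: AWS::CloudWatch::Alarm
    Properties:
      AlarmDescription: "Alarm when the number of visible DLQ messages increases"
      Metrics:
        - Id: visible
          ReturnData: false    # input only, not evaluated directly
          MetricStat:
            Metric:
              Namespace: "AWS/SQS"
              MetricName: "ApproximateNumberOfMessagesVisible"
              Dimensions:
                - Name: "QueueName"
                  Value:
                    Fn::GetAtt:
                      - "MyDeadLetterQueue"
                      - "QueueName"
            Period: 60
            Stat: "Maximum"
        - Id: delta
          Expression: "DIFF(visible)"  # change since the previous datapoint
          Label: "Change in visible messages"
          ReturnData: true     # the alarm evaluates this series
      EvaluationPeriods: 1
      Threshold: 0
      ComparisonOperator: "GreaterThanThreshold"
      TreatMissingData: "notBreaching"
      AlarmActions:
        - !Ref MyAlarmTopic
```

With this variant an old message sitting in the queue does not keep the alarm firing, since only an increase in the count breaches the threshold. One caveat: if one message leaves the queue in the same minute that another arrives, the delta is zero and the new message goes unnoticed.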
