AlwayOn通过警报监控 - 获取大量电子邮件 [英] AlwayOn Monitoring via Alerts - Getting Lots of Emails

查看:82
本文介绍了AlwayOn通过警报监控 - 获取大量电子邮件的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

您好,

我最近在SQL Server 2017中设置了AlwaysOn,并且已经生产了几周了。 一切都很好。 我目前正在设置SQL警报以监控AlwaysOn AG。 我昨天创建的一个警报
用于监控可用性组角色更改。 我手动从主要故障切换到辅助设备以测试警报。 警报有效但我在1-2秒内收到了大约28封电子邮件通知。 所有电子邮件都是相同的。 
我最初的反应可能是可用性组中每个数据库的电子邮件。 但我们只有大约15个数据库。 那么,当我从小学到中学的失败时,有谁知道为什么我会收到28个警报? 这是假设
工作的方式,还是我错误配置了什么? 附件是这些电子邮件的部分列表。

I have recently setup AlwaysOn in SQL Server 2017 and has been in Production for several weeks now.  Everything is working fine.  I'm currently in the process of setting up SQL Alerts to monitor AlwaysOn AG.  One of the alerts I created yesterday was to monitor Availability Group role change.  I manually failed over from the primary to secondary to test the alert.  The alert worked however I was getting about 28 email notifications within 1-2 seconds.  All the emails were identical.  My initial reaction was maybe an email for each database in the Availability Group.  But we only have about 15 databases.  So does anyone know why I was getting 28 alerts when I failed over from primary to secondary?  Is this how it supposed to work or did I misconfigure something?  Attached is a partial listing of those emails.

谢谢

推荐答案

1。我要做的第一件事是运行以下内容来检查针对AG的历史故障转移。您应该查看" event_timestamp "列并计算具有相似时间戳
值的行数。该计数将是您的AG在故障转移期间经历的状态更改。如果计数结果为28,则可能是每次状态更改时,警报都被触发(虽然不太可能,但我仍然希望您检查一下)。
AG在故障转移期间会经历多个状态更改。
来源

1. The first thing I would do is run the following to check the historical failovers that occurred against the AG in question. You should look at the "event_timestamp" column and count the number of rows that have similar timestamp values. That count will be the state changes that your AG went through during the failover. If the count turns out to be 28 then it could be that on every state change, the alert was being triggered (although unlikely but I'd still like you to check that). An AG goes through multiple state changes during a failover. Source

declare @xel_path varchar(1024);
declare @utc_adjustment int = datediff(hour, getutcdate(), getdate());

-------------------------------------------------------------------------------
------------------- target event_file path retrieval --------------------------
-------------------------------------------------------------------------------
;with target_data_cte as
(
    select  
        target_data = 
            convert(xml, target_data)
    from sys.dm_xe_sessions s
    inner join sys.dm_xe_session_targets st
    on s.address = st.event_session_address
    where s.name = 'alwayson_health'
    and st.target_name = 'event_file'
),
full_path_cte as
(
    select
        full_path = 
            target_data.value('(EventFileTarget/File/@name)[1]', 'varchar(1024)')
    from target_data_cte
)
select
    @xel_path = 
        left(full_path, len(full_path) - charindex('\', reverse(full_path))) + 
        '\AlwaysOn_health*.xel'
from full_path_cte;

-------------------------------------------------------------------------------
------------------- replica state change events -------------------------------
-------------------------------------------------------------------------------
;with state_change_data as
(
    select
        object_name,
        event_data = 
            convert(xml, event_data)
    from sys.fn_xe_file_target_read_file(@xel_path, null, null, null)
)
select
    object_name,
    event_timestamp = 
        dateadd(hour, @utc_adjustment, event_data.value('(event/@timestamp)[1]', 'datetime')),
    ag_name = 
        event_data.value('(event/data[@name = "availability_group_name"]/value)[1]', 'varchar(64)'),
    previous_state = 
        event_data.value('(event/data[@name = "previous_state"]/text)[1]', 'varchar(64)'),
    current_state = 
        event_data.value('(event/data[@name = "current_state"]/text)[1]', 'varchar(64)')
from state_change_data
where object_name = 'availability_replica_state_change'
order by event_timestamp desc;


2. 可能的解决方法:如果您不希望收件箱充满电子邮件警报,只需提高" ; 回复延迟"
中的参数
Alert propertie
s。这样,即使事件在延迟期间触发,也不会发送任何电子邮件,直到指定的延迟期结束。 

2. A possible workaround: If you don't want your inbox to be flooded with email alerts, just bump up the "Delay between responses" parameter in the Alert properties. That way, even if the event triggers during the delay, it won't send any emails out till the specified delay period is exhausted. 


这篇关于AlwayOn通过警报监控 - 获取大量电子邮件的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆