AG Resource Going Into Failed Status


Problem Description


Hello,

Windows Server 2012 R2, SQL Server 2012 Enterprise edition. Two-node failover cluster hosting a 1.5 TB synchronized database (synchronous commit). Both servers are on the same subnet. I am seeing the following error in the cluster events. After a few minutes, the AG resource comes back online again:

Cluster resource 'AOGrp1' of type 'SQL Server Availability Group' in clustered role 'AOGrp1' failed.

Based on the failure policies for the resource and role, the cluster service may try to bring the resource online on this node or move the group to another node of the cluster and then restart it.  Check the resource and group state using Failover Cluster Manager or the Get-ClusterResource Windows PowerShell cmdlet.

At that time, I see the following in the SQL Server logs:

Message
SQL Server hosting availability group 'AOGrp1' did not receive a process event signal from the Windows Server Failover Cluster within the lease timeout period.

Message
Error: 19421, Severity: 16, State: 1.

Here is a partial capture from the cluster log. Is this related to network connectivity or something else?

5c::2019/03/14-10:02:07.561 INFO  [RES] SQL Server Availability Group: [hadrag] SQL Server component 'query_processing' health state has been changed from 'clean' to 'warning' at 2019-03-14 10:02:07.340
00004590.00004838::2019/03/14-10:02:09.248 INFO  [RES] Network Name: Agent: Sending request Netname/RecheckConfig to NN:6c77a32d-387c-4a20-bfd6-8006b879b8c0:Netbios
00004590.00004838::2019/03/14-10:02:14.373 INFO  [RES] Network Name: Agent: Sending request Netname/RecheckConfig to NN:6c77a32d-387c-4a20-bfd6-8006b879b8c0:Netbios
00004590.00004838::2019/03/14-10:02:20.283 INFO  [RES] Network Name: Agent: Sending request Netname/RecheckConfig to NN:6c77a32d-387c-4a20-bfd6-8006b879b8c0:Netbios
00004590.00004838::2019/03/14-10:02:25.358 INFO  [RES] Network Name: Agent: Sending request Netname/RecheckConfig to NN:6c77a32d-387c-4a20-bfd6-8006b879b8c0:Netbios
00004590.00003a6c::2019/03/14-10:02:32.624 INFO  [RES] Network Name: Agent: Sending request Netname/RecheckConfig to NN:6c77a32d-387c-4a20-bfd6-8006b879b8c0:Netbios
00004590.00004838::2019/03/14-10:02:32.780 INFO  [RES] Network Name <AOGrp1_SQL20LS>: Dns: HealthCheck: SQL20LS
00004590.00004838::2019/03/14-10:02:32.780 INFO  [RES] Network Name <AOGrp1_SQL20LS>: Dns: End of Slow Operation, state: Initialized/Reading, prevWorkState: Reading
00004590.00004adc::2019/03/14-10:02:37.889 INFO  [RES] Network Name: Agent: Sending request Netname/RecheckConfig to NN:6c77a32d-387c-4a20-bfd6-8006b879b8c0:Netbios
000036ec.00003458::2019/03/14-10:02:38.874 INFO  [RES] SQL Server Availability Group: [hadrag] Availability Group with resource ID '7a245c40-fd05-4b59-99ac-31c395284899' did not receive healthinformation before HealthCheckTimeout
000036ec.00003458::2019/03/14-10:02:38.874 ERR   [RES] SQL Server Availability Group <AOGrp1>: [hadrag] Availability Group is not healthy with given HealthCheckTimeout and FailureConditionLevel
000036ec.00003458::2019/03/14-10:02:38.874 ERR   [RES] SQL Server Availability Group <AOGrp1>: [hadrag] Resource Alive result 0.
000036ec.00003458::2019/03/14-10:02:38.874 INFO  [RES] SQL Server Availability Group: [hadrag] Availability Group with resource ID '7a245c40-fd05-4b59-99ac-31c395284899' did not receive healthinformation before HealthCheckTimeout
000036ec.00003458::2019/03/14-10:02:38.874 ERR   [RES] SQL Server Availability Group <AOGrp1>: [hadrag] Availability Group is not healthy with given HealthCheckTimeout and FailureConditionLevel
000036ec.00003458::2019/03/14-10:02:38.874 ERR   [RES] SQL Server Availability Group <AOGrp1>: [hadrag] Resource Alive result 0.
000036ec.00003458::2019/03/14-10:02:38.874 WARN  [RHS] Resource AOGrp1 IsAlive has indicated failure.
000028b0.00000840::2019/03/14-10:02:40.774 INFO  [RCM] HandleMonitorReply: FAILURENOTIFICATION for 'AOGrp1', gen(11) result 1/0.
000028b0.00000840::2019/03/14-10:02:40.774 INFO  [RCM] Res AOGrp1: Online -> ProcessingFailure( StateUnknown )
000028b0.00000840::2019/03/14-10:02:40.774 INFO  [RCM] TransitionToState(AOGrp1) Online-->ProcessingFailure.
000028b0.00000840::2019/03/14-10:02:40.774 INFO  [RCM] rcm::RcmGroup::UpdateStateIfChanged: (AOGrp1, Online --> Pending)
000028b0.00000840::2019/03/14-10:02:40.774 ERR   [RCM] rcm::RcmResource::HandleFailure: (AOGrp1)
000028b0.00000840::2019/03/14-10:02:40.999 INFO  [RCM] resource AOGrp1: failure count: 1, restartAction: 2 persistentState: 1.
000028b0.00000840::2019/03/14-10:02:40.999 INFO  [RCM] Greater than restartPeriod time has elapsed since first failure of AOGrp1, resetting failureTime and failureCount.
000028b0.00000840::2019/03/14-10:02:40.999 INFO  [RCM] Will queue immediate restart (500 milliseconds) of AOGrp1 after terminate is complete.
000028b0.00000840::2019/03/14-10:02:40.999 INFO  [RCM] Res AOGrp1: ProcessingFailure -> WaitingToTerminate( DelayRestartingResource )
000028b0.00000840::2019/03/14-10:02:40.999 INFO  [RCM] TransitionToState(AOGrp1) ProcessingFailure-->[WaitingToTerminate to DelayRestartingResource].
000028b0.00000840::2019/03/14-10:02:41.030 INFO  [RCM] Res AOGrp1_FSShare: Online -> WaitingToTerminate( WaitingToComeOnline )
000028b0.00000840::2019/03/14-10:02:41.030 INFO  [RCM] TransitionToState(AOGrp1_FSShare) Online-->[WaitingToTerminate to WaitingToComeOnline].
000028b0.00000840::2019/03/14-10:02:41.030 INFO  [RCM] Res AOGrp1_FSShare: [WaitingToTerminate to WaitingToComeOnline] -> Terminating( WaitingToComeOnline )
000028b0.00000840::2019/03/14-10:02:41.030 INFO  [RCM] TransitionToState(AOGrp1_FSShare) [WaitingToTerminate to WaitingToComeOnline]-->[Terminating to WaitingToComeOnline].
000028b0.00000840::2019/03/14-10:02:41.030 INFO  [RCM] AOGrp1 not yet ready to terminate; dependent AOGrp1_FSShare still terminating.
000028b0.00002540::2019/03/14-10:02:41.483 INFO  [RCM] ignored non-local state Pending for group AOGrp1
000028b0.000046b0::2019/03/14-10:02:41.546 INFO  [GUM] Node 1: executing request locally, gumId:3496, my action: /dm/update, # of updates: 1
000028b0.00002974::2019/03/14-10:02:41.594 ERR   [RCM] [GIM] ResType Virtual Machine has no resources, not collecting local utilization info
000028b0.00002974::2019/03/14-10:02:41.594 INFO  [RCM] [GIM] Scheduling Local Node Crawler to run in 300000 millisec.
000028b0.00003b60::2019/03/14-10:02:41.624 INFO  [GUM] Node 1: executing request locally, gumId:3497, my action: /dm/update, # of updates: 1
000028b0.00003414::2019/03/14-10:02:41.844 INFO  [API] s_ApiUnblockGetNotifyCall: for the HDL( 1a )
000028b0.00003414::2019/03/14-10:02:41.859 INFO  [API] s_ApiGetQuorumResource final status 0.
000028b0.00002974::2019/03/14-10:02:41.859 INFO  [API] s_ApiGetQuorumResource final status 0.
000028b0.000046b0::2019/03/14-10:02:41.874 WARN  [API] s_ApiOpenResourceEx: Resource  not found, status = 5007
000028b0.00003b60::2019/03/14-10:02:41.874 INFO  [RCM] HandleMonitorReply: TERMINATERESOURCE for 'AOGrp1_FSShare', gen(0) result 0/0.
000028b0.00003b60::2019/03/14-10:02:41.874 INFO  [RCM] Res AOGrp1_FSShare: [Terminating to WaitingToComeOnline] -> WaitingToComeOnline( StateUnknown )
000028b0.00003b60::2019/03/14-10:02:41.874 INFO  [RCM] TransitionToState(AOGrp1_FSShare) [Terminating to WaitingToComeOnline]-->WaitingToComeOnline.
000028b0.00003b60::2019/03/14-10:02:41.874 INFO  [RCM-rbtr] giving default token to group AOGrp1
000028b0.00003b60::2019/03/14-10:02:41.874 INFO  [RCM-rbtr] giving default token to group AOGrp1
000028b0.00003b60::2019/03/14-10:02:41.874 INFO  [RCM] Res AOGrp1: [WaitingToTerminate to DelayRestartingResource] -> Terminating( DelayRestartingResource )
000028b0.00003b60::2019/03/14-10:02:41.874 INFO  [RCM] TransitionToState(AOGrp1) [WaitingToTerminate to DelayRestartingResource]-->[Terminating to DelayRestartingResource].
000036ec.00004518::2019/03/14-10:02:41.874 ERR   [RES] SQL Server Availability Group <AOGrp1>: [hadrag] Lease Thread terminated
000028b0.000046b0::2019/03/14-10:02:41.874 INFO  [RCM-rbtr] giving default token to group AOGrp1
000028b0.000046b0::2019/03/14-10:02:41.874 INFO  [RCM-rbtr] giving default token to group AOGrp1
000028b0.000046b0::2019/03/14-10:02:41.874 INFO  [RCM-rbtr] giving default token to group AOGrp1
000028b0.000046b0::2019/03/14-10:02:41.874 INFO  [RCM-rbtr] giving default token to group AOGrp1
000036ec.00003654::2019/03/14-10:02:41.874 INFO  [RES] SQL Server Availability Group: [hadrag] Stopping Health Worker Thread
000036ec.00002c74::2019/03/14-10:02:41.874 INFO  [RES] SQL Server Availability Group: [hadrag] Health worker was asked to terminate
000036ec.00001f5c::2019/03/14-10:02:42.847 INFO  [RES] SQL Server Availability Group: [hadrag] SQLMoreResults() returns -1 with following information
000036ec.00002c74::2019/03/14-10:02:42.847 INFO  [RES] SQL Server Availability Group: [hadrag] Change diagnostics interval worker is stopped
000036ec.00001f5c::2019/03/14-10:02:42.847 ERR   [RES] SQL Server Availability Group: [hadrag] ODBC Error: [HY008] [Microsoft][SQL Server Native Client 11.0]Operation canceled (0)
000036ec.00001f5c::2019/03/14-10:02:42.847 ERR   [RES] SQL Server Availability Group: [hadrag] ODBC Error: [01000] [Microsoft][SQL Server Native Client 11.0][SQL Server]  (0)
000036ec.00001f5c::2019/03/14-10:02:42.847 INFO  [RES] SQL Server Availability Group: [hadrag] No more diagnostics results
000036ec.00001f5c::2019/03/14-10:02:42.847 INFO  [RES] SQL Server Availability Group: [hadrag] Diagnostics is stopped
000036ec.00001f5c::2019/03/14-10:02:42.847 INFO  [RES] SQL Server Availability Group: [hadrag] Disconnect from SQL Server
000036ec.00001f5c::2019/03/14-10:02:42.925 INFO  [RES] SQL Server Availability Group: [hadrag] Extended Event logging is stopped
000036ec.00001f5c::2019/03/14-10:02:43.421 INFO  [RES] SQL Server Availability Group: [hadrag] Extended Event target state:
000036ec.00001f5c::2019/03/14-10:02:43.421 INFO  [RES] SQL Server Availability Group: [hadrag] Extended Event session summary: dropped buffers = 0, dropped events = 0
00004590.00004adc::2019/03/14-10:02:43.624 INFO  [RES] Network Name: Agent: Sending request Netname/RecheckConfig to NN:6c77a32d-387c-4a20-bfd6-8006b879b8c0:Netbios
000036ec.00003654::2019/03/14-10:02:43.764 INFO  [RES] SQL Server Availability Group: [hadrag] Stopping Change Diagnostics interval Worker Thread
000036ec.00003654::2019/03/14-10:02:48.171 INFO  [RES] SQL Server Availability Group <AOGrp1>: [hadrag] Connect to SQL Server ...
00004590.00003a6c::2019/03/14-10:02:48.781 INFO  [RES] Network Name: Agent: Sending request Netname/RecheckConfig to NN:6c77a32d-387c-4a20-bfd6-8006b879b8c0:Netbios
00004590.00004838::2019/03/14-10:02:49.405 INFO  [RES] Network Name <AOGrp1_SQL20LS>: [cxl::Pinger-"SQL20LS"] Host not registered.
00004590.00004838::2019/03/14-10:02:49.405 WARN  [RES] Network Name <AOGrp1_SQL20LS>: [cxl::Pinger-"SQL20LS"] Could not find any endpoints for remote target
00004590.00004838::2019/03/14-10:02:49.421 INFO  [RES] Network Name <AOGrp1_SQL20LS>: Setting resource specific message to Name Resolution Not Yet Available
00004590.00004838::2019/03/14-10:02:54.452 INFO  [RES] Network Name: Agent: Sending request Netname/RecheckConfig to NN:6c77a32d-387c-4a20-bfd6-8006b879b8c0:Netbios
00004590.00004838::2019/03/14-10:03:00.015 INFO  [RES] Network Name: Agent: Sending request Netname/RecheckConfig to NN:6c77a32d-387c-4a20-bfd6-8006b879b8c0:Netbios
000036ec.00003654::2019/03/14-10:03:00.077 INFO  [RES] SQL Server Availability Group <AOGrp1>: [hadrag] The connection was established successfully

Will greatly appreciate your input. Thanks.

Victor



Solution

You need to generate the cluster log.

Then follow the linked article step by step:

https://blogs.msdn.microsoft.com/alwaysonpro/2014/11/26/diagnose-unexpected-failover-or-availability-group-in-resolving-state/
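
Once the cluster log has been generated (to my knowledge the Get-ClusterLog PowerShell cmdlet writes it to a text file), it can help to cut it down to the warning and error entries before reading it line by line. The following is only a minimal sketch in Python, not something taken from the linked article. It assumes the line layout seen in the excerpt above (process and thread id, `::`, timestamp, level, `[component]`, message). The names `LINE_RE`, `Entry`, `parse_entries` and `problems` are hypothetical helpers made up for this illustration.

```python
import re
from typing import Iterable, Iterator, List, NamedTuple

# Assumed line layout, taken from the excerpt in this thread:
# <process id>.<thread id>::<yyyy/mm/dd-hh:mm:ss.fff> <LEVEL> [<component>] <message>
LINE_RE = re.compile(
    r"^(?P<ids>[0-9a-fA-F.]+)::"
    r"(?P<ts>\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<level>INFO|WARN|ERR)\s+"
    r"\[(?P<component>[^\]]+)\]\s*"
    r"(?P<message>.*)$"
)


class Entry(NamedTuple):
    ts: str
    level: str
    component: str
    message: str


def parse_entries(lines: Iterable[str]) -> Iterator[Entry]:
    """Yield one Entry per line that matches the assumed layout; skip the rest."""
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            yield Entry(m["ts"], m["level"], m["component"], m["message"])


def problems(lines: Iterable[str], levels=("WARN", "ERR")) -> List[Entry]:
    """Return only the entries whose level is in `levels`, in log order."""
    return [e for e in parse_entries(lines) if e.level in levels]


if __name__ == "__main__":
    # Three lines copied from the excerpt above.
    sample = [
        "000036ec.00003458::2019/03/14-10:02:38.874 ERR   [RES] SQL Server Availability Group <AOGrp1>: [hadrag] Resource Alive result 0.",
        "000036ec.00003458::2019/03/14-10:02:38.874 WARN  [RHS] Resource AOGrp1 IsAlive has indicated failure.",
        "000028b0.00000840::2019/03/14-10:02:40.774 INFO  [RCM] Res AOGrp1: Online -> ProcessingFailure( StateUnknown )",
    ]
    for e in problems(sample):
        print(e.ts, e.level, f"[{e.component}]", e.message)
```

To use it on a real log, pass the lines of the generated file to `problems` instead of `sample`. Lines that do not match the assumed layout are silently skipped, so the pattern may need adjusting if your cluster log uses a different format.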

