重做队列不断建立长达一个小时 [英] Redo queue building up constantly for up to one hour

查看:316
本文介绍了重做队列不断建立长达一个小时的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述


我们使用SQL Server版本14.0.3103.1永远在可用性组。

We use SQL Server version 14.0.3103.1 with Always on Availability group.


在生产服务器中,我们注意到重做队列不断累积长达一个小时。这是随机发生的,在此期间重做停止。我们尝试使用独立于客户
数据的表和脚本重新创建场景,我们现在有一个类似的场景。

In the production server we noticed Redo queue building up constantly for up to one hour. This occurs randomly and during this time the redo comes to a standstill. We tried to reproduce the scenario in staging with tables and scripts independent of the customer data and we now have a scenario that looks similar.


在这种情况下,我们注意到PARALLEL REDO TASK等待DPT_ENTRY_LOCK和重做队列大小不断增加。由于生产问题我们注意到查询正在等待DPT_ENTRY_LOCK等待类型

In this scenario we notice PARALLEL REDO TASK waiting on DPT_ENTRY_LOCK and redo queue size increasing constantly. With the production issue we noticed queries were waiting on DPT_ENTRY_LOCK wait type


我们使用12个连接重新创建,6个连接更新主要表中的6个连接,6个连接从可读次要(同步和异步)中选择。

We reproduced with 12 connections, 6 connections updating tables in the primary and 6 connections selecting from readable secondary (both sync and async).


下面是我们的监控查询和输出。

Below is our monitoring query and the output.



选择 顶部 10  * 来自 sys dm_os_spinlock_stats


       其中 名称 = 'DP_LIST'



选择 * 来自 sys dm_exec_requests 其中 wait_type  = 'DPT_ENTRY_LOCK'

select*fromsys.dm_exec_requestswherewait_type ='DPT_ENTRY_LOCK'



选择 wait_time_ms / waiting_tasks_count * 来自 sys dm_os_wait_stats

selectwait_time_ms/waiting_tasks_count,*fromsys.dm_os_wait_stats


       其中 wait_type  喜欢 '%DPT_ENTRY_LOCK%'


             waiting_tasks_count  > 0


       订单 wait_time_ms  desc




选择 db_name database_id ), redo_queue_size redo_rate * 来自 sys dm_hadr_database_replica_states

selectdb_name(database_id),redo_queue_size,redo_rate,*fromsys.dm_hadr_database_replica_states




当我们使用跟踪标志3459禁用并行重做时,我们没有问题,重做队列没有建立。

When we disable parallel redo with trace flag 3459, we don’t have the problem, the redo queue does not buildup.


这会导致问题,因为系统在此期间变得无法使用,特别是从中学读取。

This is causing problems because the system becomes unusable during this time, especially reading from secondary.






推荐答案

你好AlexCriss,

Hi AlexCriss,

 

>>在这种情况下,我们注意到PARALLEL REDO TASK在等待DPT_ENTRY_LOCK并且重做队列大小不断增加。出现生产问题我们注意到查询正在等待DPT_ENTRY_LOCK
等待类型

>>In this scenario we notice PARALLEL REDO TASK waiting on DPT_ENTRY_LOCK and redo queue size increasing constantly. With the production issue we noticed queries were waiting on DPT_ENTRY_LOCK wait type

 

此等待类型是指线程在并行重做期间等待访问脏页列表或在可用性组可读辅助副本上读取查询时。当您有
时主服务器上必须刷新到辅助服务器上的大量事务,并且访问相同表的辅助服务器上的读取次数很多,最终会导致此锁定。

This wait type is when a thread is waiting for access to the list of dirty pages during parallel redo or a read query on an Availability Group readable secondary replica. When you have a large number of transaction on your primary that have to be flushed to the secondary, and you have a high number of reads on a secondary that access the same tables, you end up with this lock.

 

如果您遇到此等待类型作为AG辅助设备上最高等待之一,您可以通过使用DBCC TRACEON(3459,-1)启用跟踪标志3459或添加它来禁用并行重做使用配置管理器作为
a启动跟踪标志。跟踪标志在重新启动实例之前不会生效。

If you are experiencing this wait type as one of the highest waits on your AG secondaries, you can disable parallel redo by turning on trace flag 3459 using DBCC TRACEON (3459, -1), or adding it as a startup trace flag using the Configuration Manager. The trace flag does not come into effect until the instance is restarted.

 

有关详情,请参阅
https://www.sqlskills。 com / help / wait / dpt_entry_lock /

 

希望这可以帮到你。

最好的问候,

Dedmon Dai

Dedmon Dai


这篇关于重做队列不断建立长达一个小时的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆