Azure SQL failover group, what does the grace period mean?

Problem description

I am currently reading https://docs.microsoft.com/en-us/azure/sql-database/sql-database-auto-failover-group and I am having a hard time understanding the automatic failover policy:

By default, a failover group is configured with an automatic failover policy. The SQL Database service triggers failover after the failure is detected and the grace period has expired. The system must verify that the outage cannot be mitigated by the built-in high availability infrastructure of the SQL Database service due to the scale of the impact. If you want to control the failover workflow from the application, you can turn off automatic failover.

When defining the failover group in an ARM template:

{
  "condition": "[equals(parameters('redundancyId'), 'pri')]",
  "type": "Microsoft.Sql/servers",
  "kind": "v12.0",
  "name": "[variables('sqlServerPrimaryName')]",
  "apiVersion": "2014-04-01-preview",
  "location": "[parameters('location')]",
  "properties": {
    "administratorLogin": "[parameters('sqlServerPrimaryAdminUsername')]",
    "administratorLoginPassword": "[parameters('sqlServerPrimaryAdminPassword')]",
    "version": "12.0"
  },
  "resources": [
    {
      "condition": "[equals(parameters('redundancyId'), 'pri')]",
      "apiVersion": "2015-05-01-preview",
      "type": "failoverGroups",
      "name": "[variables('sqlFailoverGroupName')]",
      "properties": {
        "serverName": "[variables('sqlServerPrimaryName')]",
        "partnerServers": [
          {
            "id": "[resourceId('Microsoft.Sql/servers', variables('sqlServerSecondaryName'))]"
          }
        ],
        "readWriteEndpoint": {
          "failoverPolicy": "Automatic",
          "failoverWithDataLossGracePeriodMinutes": 60
        },
        "readOnlyEndpoint": {
          "failoverPolicy": "Disabled"
        },
        "databases": [
          "[resourceId('Microsoft.Sql/servers/databases', variables('sqlServerPrimaryName'), variables('sqlDatabaseName'))]"
        ]
      },
      "dependsOn": [
        "[variables('sqlServerPrimaryName')]",
        "[resourceId('Microsoft.Sql/servers/databases', variables('sqlServerPrimaryName'), variables('sqlDatabaseName'))]",
        "[resourceId('Microsoft.Sql/servers', variables('sqlServerSecondaryName'))]"
      ]
    },
    {
      "condition": "[equals(parameters('redundancyId'), 'pri')]",
      "name": "[variables('sqlDatabaseName')]",
      "type": "databases",
      "apiVersion": "2014-04-01-preview",
      "location": "[parameters('location')]",
      "dependsOn": [
        "[variables('sqlServerPrimaryName')]"
      ],
      "properties": {
        "edition": "[variables('sqlDatabaseEdition')]",
        "requestedServiceObjectiveName": "[variables('sqlDatabaseServiceObjective')]"
      }
    }
  ]
},
{
  "condition": "[equals(parameters('redundancyId'), 'pri')]",
  "type": "Microsoft.Sql/servers",
  "kind": "v12.0",
  "name": "[variables('sqlServerSecondaryName')]",
  "apiVersion": "2014-04-01-preview",
  "location": "[variables('sqlServerSecondaryRegion')]",
  "properties": {
    "administratorLogin": "[parameters('sqlServerSecondaryAdminUsername')]",
    "administratorLoginPassword": "[parameters('sqlServerSecondaryAdminPassword')]",
    "version": "12.0"
  }
}

I specify the readWriteEndpoint like this:

    "readWriteEndpoint": {
      "failoverPolicy": "Automatic",
      "failoverWithDataLossGracePeriodMinutes": 60
    }

Here failoverWithDataLossGracePeriodMinutes is set to 60 minutes.

What does this mean? I cannot find a clear answer anywhere. Does it mean that:

  1. When an outage happens in the primary region where my primary database resides, the read/write endpoint keeps pointing to the primary, and only after 60 minutes does it fail over to my secondary, which becomes the new primary. During those 60 minutes, the only way to read my data is to use the readOnlyEndpoint directly? OR
  2. My read/write endpoint is switched over instantly if the service can somehow detect that there is no data left to sync?

I think it boils down to this: if I detect an outage, do not care about data loss, and want to be able to write to my database, do I have to perform the failover manually?

Bonus question: is the grace period there because the primary may hold unsynced data that will be overwritten or thrown away if the secondary becomes the new primary (if I switch manually)?

Sorry, I can't keep it to only one question. I have read a lot and I really need to know this.

Solution

What does this mean?

It means that:

"When an outage is happening in my primary region where my primary database resides, the read/write endpoint points to the primary and only after 60 minutes it fails over to my secondary, which becomes the new primary."

It can't fail over automatically even when the data is synced, because the high-availability solution in the primary region is trying to do the same thing, and almost all of the time your primary database will come back quickly in the primary region. Performing an automatic cross-region failover would interfere with this.

And

"The reason the grace period is present is that there can be unsynced data on the primary that will be overwritten, or tossed away, if the secondary becomes the new primary."

And to allow time for the database to fail over within the primary region.
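
To make the timing concrete, here is a minimal Python sketch of the behavior described in this answer. It is only a toy model, not the service's actual algorithm: the function name `read_write_target` and its parameters are hypothetical, and it ignores outage detection time and the verification step mentioned in the documentation. It assumes the grace period is counted from the start of the outage.

```python
from datetime import datetime, timedelta
from typing import Optional


def read_write_target(
    now: datetime,
    outage_start: datetime,
    grace_period_minutes: int,
    primary_recovered_at: Optional[datetime] = None,
) -> str:
    """Return where the read/write endpoint points in this simplified model.

    Hypothetical helper for illustration only; not part of any Azure SDK.
    """
    grace_deadline = outage_start + timedelta(minutes=grace_period_minutes)

    # The primary came back inside the grace period (the common case,
    # handled by in-region high availability): no cross-region failover.
    if primary_recovered_at is not None and primary_recovered_at <= grace_deadline:
        return "primary"

    # Still inside the grace period: the endpoint keeps pointing at the
    # primary, even though the primary may be unavailable.
    if now < grace_deadline:
        return "primary"

    # Grace period expired with the primary still down:
    # the secondary becomes the new primary.
    return "secondary"


start = datetime(2024, 1, 1, 12, 0)
print(read_write_target(start + timedelta(minutes=30), start, 60))  # primary
print(read_write_target(start + timedelta(minutes=61), start, 60))  # secondary
```

In this model, with failoverWithDataLossGracePeriodMinutes set to 60, writes through the read/write endpoint are not redirected during the first hour of an outage unless a failover is initiated manually.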
