How to handle PDO MySQL failover in AWS using persistent connections

Problem description

We have experienced a problem with AWS Aurora failover and are looking for pointers on how to resolve it.

Scenario

AWS Aurora set up with two end points:

  • Writer:
    • host: stackName-dbcluster-ID.cluster-ID.us-west-2.rds.amazonaws.com
    • resolves to IP: 10.1.0.X
  • Reader:
    • host: stackName-dbcluster-ID.cluster-ro-ID.us-west-2.rds.amazonaws.com
    • resolves to IP: 10.1.0.Y

Our PDO MySQL connection string therefore uses the host stackName-dbcluster-ID.cluster-ID.us-west-2.rds.amazonaws.com (for writing).

After failover

On failover, the DNS entries are flipped to point as follows:

  • Reader:
    • host: stackName-dbcluster-ID.cluster-ro-ID.us-west-2.rds.amazonaws.com
    • resolves to IP: 10.1.0.X
  • Writer:
    • host: stackName-dbcluster-ID.cluster-ID.us-west-2.rds.amazonaws.com
    • resolves to IP: 10.1.0.Y

Critically, the PDO connection string (for writing) remains the same, "stackName-dbcluster-ID.cluster-ID.us-west-2.rds.amazonaws.com", but points to a different IP address.

What Happened

We had error 1290 "SQLSTATE[HY000]: General error: 1290 The MySQL server is running with the --read-only option so it cannot execute this statement".

As the DB engines are stopped and restarted, our initial persistent connections will have "gone away" and been invalidated (something we immediately handle in our reconnect/retry code).

However, the error above means that new connections were made to the old node (now the reader) and were not invalidated once the DNS change propagated. They lasted 10 to 15 minutes (well beyond the TTL of the DNS record).
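To make the failure mode concrete, here is a minimal sketch of how retry code might recognise this specific error. The function name `isReadOnlyError` is hypothetical and not part of PDO. It assumes the connection uses `PDO::ERRMODE_EXCEPTION`, so that the MySQL driver error code is available in `PDOException::$errorInfo[1]`, and it falls back to matching the message text quoted above.

```php
<?php
// Sketch only: isReadOnlyError() is a hypothetical helper, not a PDO API.

/**
 * Returns true when the exception looks like MySQL error 1290
 * ("running with the --read-only option"), i.e. a write was sent
 * to a node that is currently acting as a reader.
 */
function isReadOnlyError(PDOException $e): bool
{
    // errorInfo is [SQLSTATE, driver error code, driver message]; it may be null.
    $driverCode = (int) ($e->errorInfo[1] ?? 0);
    if ($driverCode === 1290) {
        return true;
    }

    // Fallback: match the message text in case errorInfo was not populated.
    return strpos($e->getMessage(), '--read-only') !== false;
}
```

A check like this only identifies the stale connection. It does not remove it from the persistent pool.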

My Questions

  1. Does anyone know whether PDO retrieves a persistent connection based on the connection string, or on something more reliable such as the IP address or another signature? Evidence suggests it is the hostname, but I would like confirmation.
  2. Does anyone know a way to mark a persistent connection as "invalid" in PDO, so that it is not used again?
  3. Or, is there something I missed?

Side notes

We already have code in place to handle the retry, and the retry is told to get a new non-persistent connection (which works). It is at this point that we could "invalidate" the persistent PDO connection, so that the next run of a script does not repeat this cycle over and over.

The failover can happen at any time, so we are not in a position to take manual actions such as restarting PHP (as we had to do this time).

Without persistent connections, performance is notably slower.

FastCGI, CentOS 16, PHP 7.2, MySQLD 5.0.12-dev (which is normal on CentOS - see https://superuser.com/questions/1433346/php-shows-outdated-mysqlnd-version)

Solution

Persistent connections must be terminated and restarted.

Reminds me of a 2-minute TTL that took 20 minutes to be recognized. I don't know whether Amazon does a better job, or even if they have any say in DNS.

5.0.12?? That was released in 2005! Maybe a typo. Anyway, I don't think the version matters in this Question.

DNS may not be the optimal way to fail over; there are several proxy servers out there. I would expect them to flip within seconds. However, they need to know who's who rather than depending on DNS.

Can you modify the code to disconnect+reconnect when that error occurs? (It may not help.)
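The disconnect and reconnect idea could look roughly like the sketch below. All function names (`isReadOnlyNode`, `makeConnector`, `runOnWriter`) are hypothetical. The sketch assumes `PDO::ERRMODE_EXCEPTION` is enabled and that the retry should use a non-persistent connection, as the question says already works. It does not invalidate the stale persistent connection: dropping the PHP reference only returns it to the pool, so the next request may hit error 1290 again and take the same fallback path.

```php
<?php
// Sketch only: these helpers are hypothetical, not part of PDO.

/** True when the exception carries MySQL driver error 1290 (read-only server). */
function isReadOnlyNode(PDOException $e): bool
{
    return (int) ($e->errorInfo[1] ?? 0) === 1290;
}

/** Builds a connection factory; the DSN and credentials are supplied by the caller. */
function makeConnector(string $dsn, string $user, string $password): callable
{
    return function (bool $persistent) use ($dsn, $user, $password): PDO {
        return new PDO($dsn, $user, $password, [
            PDO::ATTR_PERSISTENT => $persistent,
            PDO::ATTR_ERRMODE    => PDO::ERRMODE_EXCEPTION,
        ]);
    };
}

/**
 * Runs $work on a persistent connection first. If the node turns out to be
 * read-only (error 1290), runs $work once more on a fresh non-persistent
 * connection. Any other PDOException is rethrown unchanged.
 */
function runOnWriter(callable $connect, callable $work)
{
    $pdo = $connect(true);
    try {
        return $work($pdo);
    } catch (PDOException $e) {
        if (!isReadOnlyNode($e)) {
            throw $e;
        }
    }

    // Drop our reference to the stale connection, then resolve the
    // writer hostname again with a brand-new, non-persistent connection.
    unset($pdo);
    return $work($connect(false));
}
```

Usage might be `runOnWriter(makeConnector($dsn, $user, $password), function (PDO $pdo) { /* write statements */ });`, where `$dsn`, `$user`, and `$password` are placeholders. `$work` must be safe to run twice, because it is retried in full after a failure.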
