JDBC Connection Pool test query "SELECT 1" does not catch AWS RDS Writer/Reader failover


Problem description


We are running an AWS RDS Aurora/MySQL database in a cluster with a writer and a reader instance where the writer is replicated to the reader.


The application accessing the database is a standard java application using a HikariCP Connection Pool. The pool is configured to use a "SELECT 1" test query on checkout.


What we noticed is that once in a while RDS fails over the writer to the reader. The failover can also be replicated manually by clicking "Instance Actions/Failover" in the AWS console.


The connection pool is not able to detect the failover and the fact that it is now connected to a reader database, as the "SELECT 1" test queries still succeed. However any subsequent database updates fail with "java.sql.SQLException: The MySQL server is running with the --read-only option so it cannot execute this statement" errors.


It appears that instead of a "SELECT 1" test query, the Connection Pool can detect that it is now connected to the reader by using a "SELECT count(1) FROM test_table WHERE 1 = 2 FOR UPDATE" test query.

  1. Has anyone come across the same problem?
  2. Are there any drawbacks to using "FOR UPDATE" in the test query?
  3. Are there any alternative or better ways to handle an AWS RDS cluster writer/reader failover?

Any help is greatly appreciated.

Bernie

Recommended answer

I've been thinking about this a lot in the two months since my original reply ...


When you start up an Aurora cluster you get multiple hostnames to access the cluster. For the purposes of this answer, the only two that we care about are the "cluster endpoint," which is read-write, and the "read-only endpoint," which is (you guessed it) read-only. You also have an endpoint for each node within the cluster, but accessing nodes directly defeats the purpose of using Aurora, so I won't mention them again.


For example, if I create a cluster named "example", I'll get the following endpoints:

  • Cluster endpoint: example.cluster-x91qlr44xxxz.us-east-1.rds.amazonaws.com
  • Read-only endpoint: example.cluster-ro-x91qlr44xxxz.us-east-1.rds.amazonaws.com


You might think that these endpoints would refer to something like an Elastic Load Balancer, which would be smart enough to redirect traffic on failover, but you'd be wrong. In fact, they're simply DNS CNAME entries with a really short time-to-live:

dig example.cluster-x91qlr44xxxz.us-east-1.rds.amazonaws.com


; <<>> DiG 9.11.3-1ubuntu1.3-Ubuntu <<>> example.cluster-x91qlr44xxxz.us-east-1.rds.amazonaws.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 40120
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 65494
;; QUESTION SECTION:
;example.cluster-x91qlr44xxxz.us-east-1.rds.amazonaws.com. IN A

;; ANSWER SECTION:
example.cluster-x91qlr44xxxz.us-east-1.rds.amazonaws.com. 5 IN CNAME example.x91qlr44xxxz.us-east-1.rds.amazonaws.com.
example.x91qlr44xxxz.us-east-1.rds.amazonaws.com. 4 IN CNAME ec2-18-209-198-76.compute-1.amazonaws.com.
ec2-18-209-198-76.compute-1.amazonaws.com. 7199 IN A 18.209.198.76

;; Query time: 54 msec
;; SERVER: 127.0.0.53#53(127.0.0.53)
;; WHEN: Fri Dec 14 18:12:08 EST 2018
;; MSG SIZE  rcvd: 178


When a failover happens, the CNAMEs are updated (from example to example-us-east-1a):

; <<>> DiG 9.11.3-1ubuntu1.3-Ubuntu <<>> example.cluster-x91qlr44xxxz.us-east-1.rds.amazonaws.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 27191
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 65494
;; QUESTION SECTION:
;example.cluster-x91qlr44xxxz.us-east-1.rds.amazonaws.com. IN A

;; ANSWER SECTION:
example.cluster-x91qlr44xxxz.us-east-1.rds.amazonaws.com. 5 IN CNAME example-us-east-1a.x91qlr44xxxz.us-east-1.rds.amazonaws.com.
example-us-east-1a.x91qlr44xxxz.us-east-1.rds.amazonaws.com. 4 IN CNAME ec2-3-81-195-23.compute-1.amazonaws.com.
ec2-3-81-195-23.compute-1.amazonaws.com. 7199 IN A 3.81.195.23

;; Query time: 158 msec
;; SERVER: 127.0.0.53#53(127.0.0.53)
;; WHEN: Fri Dec 14 18:15:33 EST 2018
;; MSG SIZE  rcvd: 187


The other thing that happens during a failover is that all of the connections to the "cluster" endpoint get closed, which will fail any in-process transactions (assuming that you've set reasonable query timeouts).
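As a rough illustration of what "reasonable query timeouts" could look like with this stack, here is a minimal HikariCP setup sketch; the host name, database, and credentials are placeholders, and connectTimeout/socketTimeout are MySQL Connector/J URL properties given in milliseconds:

import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

public class PoolSetup {
    static HikariDataSource createPool() {
        HikariConfig config = new HikariConfig();
        // connectTimeout/socketTimeout make a dead socket fail promptly instead of hanging
        config.setJdbcUrl("jdbc:mysql://example.cluster-x91qlr44xxxz.us-east-1.rds.amazonaws.com:3306/appdb"
                          + "?connectTimeout=3000&socketTimeout=30000");
        config.setUsername("app_user");
        config.setPassword("app_password");
        config.setConnectionTimeout(5000);   // max wait for a connection from the pool
        config.setValidationTimeout(3000);   // max wait for the checkout test
        return new HikariDataSource(config);
    }
}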


The connections to the "read-only" endpoint don't get closed, which means that whatever node gets promoted will get read-write traffic in addition to read-only traffic (assuming, of course, that your application doesn't just send all requests to the cluster endpoint). Since read-only connections are typically used for relatively expensive queries (eg, reporting), this may cause performance problems for your read-write operations.


When failover happens, all in-process transactions will fail (again, assuming that you've set query timeouts). There will be a short amount of time that any new connections will also fail, as the connection pool attempts to connect to the same host before it's done with recovery. In my experience, failover takes around 15 seconds, during which time your application shouldn't expect to get a connection.


After that 15 seconds (or so), everything should return to normal: your connection pool attempts to connect to the cluster endpoint, it resolves to the IP address of the new read-write node, and all is well. But if anything prevents resolving that chain of CNAMEs, you may find that your connection pool makes connections to a read-only endpoint, which will fail as soon as you try an update operation.


In the case of the OP, he had his own CNAME with a longer timeout. So rather than connect to the cluster endpoint directly, he would connect to something like database.example.com. This is a useful technique in a world where you would manually fail-over to a replica database; I suspect it's less useful with Aurora. Regardless, if you use your own CNAMEs to refer to database endpoints, you need them to have short time-to-live values (certainly no more than 5 seconds).


In my original answer, I also pointed out that Java caches DNS lookups, in some cases forever. The behavior of this cache depends on (I believe) the version of Java, and also whether you're running with a security manager installed. With OpenJDK 8 running as an application, it appears that the JVM will delegate all naming lookups and not cache anything itself. However, you should be familiar with the networkaddress.cache.ttl system property, as described in this Oracle doc and this SO question.
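If you want to pin the cache behavior down rather than rely on defaults, a minimal sketch is to set the relevant security properties programmatically before the JVM performs its first lookup of the cluster endpoint (they can also be set in the java.security file):

import java.security.Security;

public class DnsCacheSettings {
    public static void main(String[] args) {
        // cache successful lookups for at most 5 seconds, and don't cache failures;
        // this must run before anything resolves the cluster endpoint
        Security.setProperty("networkaddress.cache.ttl", "5");
        Security.setProperty("networkaddress.cache.negative.ttl", "0");
        // ... start the connection pool / application after this point
    }
}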


However, even after you've eliminated any unexpected caches, there may still be times where the cluster endpoint is resolved to a read-only node. That leaves the question of how you handle this situation.


The OP was hoping to use a database connection test to verify that his application was running on a read-only node. This is surprisingly hard to do: most connection pools (including HikariCP, which is what the OP is using) simply verify that the test query executes successfully; there's no ability to look at what it returns. This means that any test query has to throw an exception to fail.


I haven't been able to come up with a way to make MySQL throw an exception with just a stand-alone query. The best I've come up with is to create a function:

DELIMITER EOF

CREATE FUNCTION throwIfReadOnly() RETURNS INTEGER
BEGIN
    IF @@innodb_read_only THEN
        SIGNAL SQLSTATE 'ERR0R' SET MESSAGE_TEXT = 'database is read_only';
    END IF;
    RETURN 0;
END;
EOF

DELIMITER ;


Then you call that function in your test query:

select throwIfReadOnly() 
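For completeness, wiring that function into HikariCP's checkout test would look roughly like this; it assumes the function above has been installed in the application's schema, and note that setting connectionTestQuery bypasses the faster JDBC4 isValid() check:

import com.zaxxer.hikari.HikariConfig;

public class TestQuerySetup {
    static void configureTestQuery(HikariConfig config) {
        // runs on every checkout; the SIGNAL in throwIfReadOnly() makes the
        // validation fail when the connection points at a read-only node
        config.setConnectionTestQuery("select throwIfReadOnly()");
    }
}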


This works, mostly. When running my test program I could see a series of "failed to validate connection" messages, but then, inexplicably, the update query would run with a read-only connection. Hikari doesn't have a debug message to indicate which connection it hands out, so I couldn't identify whether it had allegedly passed validation.


But aside from that possible problem, there's a deeper issue with this implementation: it hides the fact that there's a problem. A user makes a request, and maybe waits for 30 seconds to get a response. There's nothing in the log (unless you enable Hikari's debug logging) to give a reason for this delay.


Moreover, while the database is inaccessible Hikari is furiously trying to make connections: in my single-threaded test, it would attempt a new connection every 100 milliseconds. And these are real connections, they simply go to the wrong host. Throw in an app-server with a few dozen or hundred threads, and that could cause a significant ripple effect on the database.


Rather than let Hikari silently retry connections, you could wrap the HikariDataSource in your own DataSource implementation and test/retry yourself. This has the benefit that you can actually look at the results of the test query, which means that you can use a self-contained query rather than calling a separately-installed function. It also lets you log the problem using your preferred log levels, lets you pause between attempts, and gives you a chance to change pool configuration.

private static class WrappedDataSource
implements DataSource
{
    private HikariDataSource delegate;

    public WrappedDataSource(HikariDataSource delegate) {
        this.delegate = delegate;
    }

    @Override
    public Connection getConnection() throws SQLException {
        while (true) {
            Connection cxt = delegate.getConnection();
            try (Statement stmt = cxt.createStatement()) {
                try (ResultSet rslt = stmt.executeQuery("select @@innodb_read_only")) {
                    if (rslt.next() && ! rslt.getBoolean(1)) {
                        return cxt;
                    }
                }
            }
            // evict connection so that we won't get it again
            // should also log here
            delegate.evictConnection(cxt);
            try {
                Thread.sleep(1000);
            }
            catch (InterruptedException ignored) {
                // if we're interrupted we just retry
            }
        }
    }

    // all other methods can just delegate to HikariDataSource
}
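As a rough usage sketch (the HikariConfig named config is assumed to be built elsewhere, and the code would live in the same class that declares WrappedDataSource, since the class is private), the application hands out connections through the wrapper while HikariCP keeps doing the pooling underneath:

// assumed to live in the enclosing class of WrappedDataSource
DataSource appDataSource = new WrappedDataSource(new HikariDataSource(config));

try (Connection cxt = appDataSource.getConnection()) {
    // the wrapper has already verified @@innodb_read_only is false for this connection
    // ... run read-write statements as usual
}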


This solution still suffers from the problem that it introduces a delay into user requests. True, you know that it's happening (which you didn't with the on-checkout test), and you could introduce a timeout (limit the number of times through the loop). But it still represents a bad user experience.


Users are incredibly impatient: if it takes more than a few seconds to get a response back, they'll probably try to reload the page, or submit the form again, or do something that doesn't help and may hurt.


So I think the best solution is to fail quickly and let them know that something's wrong. Somewhere near the top of the call stack you should already have some code that responds to exceptions. Maybe you just return a generic 500 page now, but you can do a little better: look at the exception, and return a "sorry, temporarily unavailable, try again in a few minutes" page if it's a read-only database exception.
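As a sketch of what that exception check might look like (the helper and its placement are hypothetical; MySQL reports the read-only failure as error code 1290, ER_OPTION_PREVENTS_STATEMENT):

import java.sql.SQLException;

public class ReadOnlyDetection {
    // walk the cause chain looking for MySQL's "--read-only option" error
    static boolean isReadOnlyFailure(Throwable t) {
        for (Throwable cause = t; cause != null; cause = cause.getCause()) {
            if (cause instanceof SQLException
                    && ((SQLException) cause).getErrorCode() == 1290) {
                return true;
            }
        }
        return false;
    }
}

A top-level exception handler could call a helper like this to decide between the generic 500 page and the "temporarily unavailable" page.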


At the same time, you should send a notification to your ops staff: this may be a normal maintenance-window failover, or it may be something more serious (but don't wake them up unless you have some way of knowing that it's more serious).

