JDBC Connection Pool test query "SELECT 1" does not catch AWS RDS Writer/Reader failover


Problem description



We are running an AWS RDS Aurora/MySQL database in a cluster with a writer and a reader instance where the writer is replicated to the reader.

The application accessing the database is a standard Java application using a HikariCP Connection Pool. The pool is configured to use a "SELECT 1" test query on checkout.

What we noticed is that once in a while RDS fails over the writer to the reader. The failover can also be replicated manually by clicking "Instance Actions/Failover" in the AWS console.

The connection pool is not able to detect the failover and the fact that it is now connected to a reader database, as the "SELECT 1" test queries still succeed. However any subsequent database updates fail with "java.sql.SQLException: The MySQL server is running with the --read-only option so it cannot execute this statement" errors.

It appears that instead of a "SELECT 1" test query, the Connection Pool can detect that it is now connected to the reader by using a "SELECT count(1) FROM test_table WHERE 1 = 2 FOR UPDATE" test query.

  1. Has anybody experienced the same issue?
  2. Are there any downsides on using "FOR UPDATE" in the test query?
  3. Are there any alternate or better approaches of handling an AWS RDS cluster writer/reader failover?

Your help is much appreciated

Bernie

Solution

I've been giving this a lot of thought in the two months since my original reply...


How Aurora endpoints work

When you start up an Aurora cluster you get multiple hostnames to access the cluster. For the purposes of this answer, the only two that we care about are the "cluster endpoint," which is read-write, and the "read-only endpoint," which is (you guessed it) read-only. You also have an endpoint for each node within the cluster, but accessing nodes directly defeats the purpose of using Aurora, so I won't mention them again.

For example, if I create a cluster named "example", I'll get the following endpoints:

  • Cluster endpoint: example.cluster-x91qlr44xxxz.us-east-1.rds.amazonaws.com
  • Read-only endpoint: example.cluster-ro-x91qlr44xxxz.us-east-1.rds.amazonaws.com

You might think that these endpoints would refer to something like an Elastic Load Balancer, which would be smart enough to redirect traffic on failover, but you'd be wrong. In fact, they're simply DNS CNAME entries with a really short time-to-live:

dig example.cluster-x91qlr44xxxz.us-east-1.rds.amazonaws.com


; <<>> DiG 9.11.3-1ubuntu1.3-Ubuntu <<>> example.cluster-x91qlr44xxxz.us-east-1.rds.amazonaws.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 40120
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 65494
;; QUESTION SECTION:
;example.cluster-x91qlr44xxxz.us-east-1.rds.amazonaws.com. IN A

;; ANSWER SECTION:
example.cluster-x91qlr44xxxz.us-east-1.rds.amazonaws.com. 5 IN CNAME example.x91qlr44xxxz.us-east-1.rds.amazonaws.com.
example.x91qlr44xxxz.us-east-1.rds.amazonaws.com. 4 IN CNAME ec2-18-209-198-76.compute-1.amazonaws.com.
ec2-18-209-198-76.compute-1.amazonaws.com. 7199 IN A 18.209.198.76

;; Query time: 54 msec
;; SERVER: 127.0.0.53#53(127.0.0.53)
;; WHEN: Fri Dec 14 18:12:08 EST 2018
;; MSG SIZE  rcvd: 178

When a failover happens, the CNAMEs are updated (from example to example-us-east-1a):

; <<>> DiG 9.11.3-1ubuntu1.3-Ubuntu <<>> example.cluster-x91qlr44xxxz.us-east-1.rds.amazonaws.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 27191
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 65494
;; QUESTION SECTION:
;example.cluster-x91qlr44xxxz.us-east-1.rds.amazonaws.com. IN A

;; ANSWER SECTION:
example.cluster-x91qlr44xxxz.us-east-1.rds.amazonaws.com. 5 IN CNAME example-us-east-1a.x91qlr44xxxz.us-east-1.rds.amazonaws.com.
example-us-east-1a.x91qlr44xxxz.us-east-1.rds.amazonaws.com. 4 IN CNAME ec2-3-81-195-23.compute-1.amazonaws.com.
ec2-3-81-195-23.compute-1.amazonaws.com. 7199 IN A 3.81.195.23

;; Query time: 158 msec
;; SERVER: 127.0.0.53#53(127.0.0.53)
;; WHEN: Fri Dec 14 18:15:33 EST 2018
;; MSG SIZE  rcvd: 187

The other thing that happens during a failover is that all of the connections to the "cluster" endpoint get closed, which will fail any in-process transactions (assuming that you've set reasonable query timeouts).

The connections to the "read-only" endpoint don't get closed, which means that whatever node gets promoted will get read-write traffic in addition to read-only traffic (assuming, of course, that your application doesn't just send all requests to the cluster endpoint). Since read-only connections are typically used for relatively expensive queries (eg, reporting), this may cause performance problems for your read-write operations.

The Problem: DNS Caching

When failover happens, all in-process transactions will fail (again, assuming that you've set query timeouts). There will be a short amount of time that any new connections will also fail, as the connection pool attempts to connect to the same host before it's done with recovery. In my experience, failover takes around 15 seconds, during which time your application shouldn't expect to get a connection.

After that 15 seconds (or so), everything should return to normal: your connection pool attempts to connect to the cluster endpoint, it resolves to the IP address of the new read-write node, and all is well. But if anything prevents resolving that chain of CNAMEs, you may find that your connection pool makes connections to a read-only endpoint, which will fail as soon as you try an update operation.
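As a quick diagnostic, you can ask the JVM what a hostname currently resolves to; this goes through the same resolver (and any caches) that the connection pool uses. A minimal sketch using only the standard library — in practice you would pass your cluster endpoint as the hostname:

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class EndpointCheck {
    // Returns the IP address the JVM currently resolves this hostname to,
    // or null if it doesn't resolve. Note that this lookup is subject to
    // the JVM's own DNS cache, which is exactly what we want to observe.
    public static String resolve(String hostname) {
        try {
            return InetAddress.getByName(hostname).getHostAddress();
        } catch (UnknownHostException e) {
            return null;
        }
    }
}
```

Logging this value next to connection failures makes it obvious when the pool is still resolving the old writer's address.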

In the case of the OP, he had his own CNAME with a longer timeout. So rather than connect to the cluster endpoint directly, he would connect to something like database.example.com. This is a useful technique in a world where you would manually fail-over to a replica database; I suspect it's less useful with Aurora. Regardless, if you use your own CNAMEs to refer to database endpoints, you need them to have short time-to-live values (certainly no more than 5 seconds).

In my original answer, I also pointed out that Java caches DNS lookups, in some cases forever. The behavior of this cache depends on (I believe) the version of Java, and also whether you're running with a security manager installed. With OpenJDK 8 running as an application, it appears that the JVM will delegate all naming lookups and not cache anything itself. However, you should be familiar with the networkaddress.cache.ttl system property, as described in this Oracle doc and this SO question.
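For illustration, here is how the `networkaddress.cache.ttl` property can be set programmatically. It must run before the JVM performs its first name lookup; the 5-second value is just a suggestion, chosen to track Aurora's short CNAME TTLs:

```java
import java.security.Security;

public class DnsCacheConfig {
    public static void main(String[] args) {
        // Value is in seconds. A small positive TTL keeps up with Aurora's
        // short-lived CNAMEs while avoiding a DNS round-trip on every
        // connection attempt.
        Security.setProperty("networkaddress.cache.ttl", "5");

        // Failed lookups can also be cached; keep that short as well, so a
        // transient resolution failure during failover recovers quickly.
        Security.setProperty("networkaddress.cache.negative.ttl", "1");
    }
}
```

The same values can instead be set in the JRE's `java.security` file; setting them in code only works if that code runs before any lookup has happened.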

However, even after you've eliminated any unexpected caches, there may still be times where the cluster endpoint is resolved to a read-only node. That leaves the question of how you handle this situation.

Not-so-good solution: use a read-only test on checkout

The OP was hoping to use a database connection test to verify that his application was running on a read-only node. This is surprisingly hard to do: most connection pools (including HikariCP, which is what the OP is using) simply verify that the test query executes successfully; there's no ability to look at what it returns. This means that any test query has to throw an exception to fail.

I haven't been able to come up with a way to make MySQL throw an exception with just a stand-alone query. The best I've come up with is to create a function:

DELIMITER EOF

CREATE FUNCTION throwIfReadOnly() RETURNS INTEGER
BEGIN
    IF @@innodb_read_only THEN
        SIGNAL SQLSTATE 'ERR0R' SET MESSAGE_TEXT = 'database is read_only';
    END IF;
    RETURN 0;
END;
EOF

DELIMITER ;

Then you call that function in your test query:

select throwIfReadOnly() 
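For reference, here is roughly how that test query would be wired into HikariCP. This is a sketch, not the OP's actual configuration: the JDBC URL, database name, and credentials are placeholders.

```java
import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;
import javax.sql.DataSource;

public class PoolFactory {
    public static DataSource createPool() {
        HikariConfig config = new HikariConfig();
        // Cluster endpoint, database name, and credentials are placeholders.
        config.setJdbcUrl("jdbc:mysql://example.cluster-x91qlr44xxxz.us-east-1.rds.amazonaws.com:3306/appdb");
        config.setUsername("app_user");
        config.setPassword("app_password");
        // Run the function on every checkout; it throws if the node is read-only.
        config.setConnectionTestQuery("select throwIfReadOnly()");
        // Fail checkouts quickly during a failover window instead of queueing.
        config.setConnectionTimeout(5000);
        return new HikariDataSource(config);
    }
}
```

Note that HikariCP's documentation recommends leaving `connectionTestQuery` unset with JDBC4-compliant drivers, in which case the pool calls `Connection.isValid()` instead and this function never runs.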

This works, mostly. When running my test program I could see a series of "failed to validate connection" messages, but then, inexplicably, the update query would run with a read-only connection. Hikari doesn't have a debug message to indicate which connection it hands out, so I couldn't identify whether it had allegedly passed validation.

But aside from that possible problem, there's a deeper issue with this implementation: it hides the fact that there's a problem. A user makes a request, and maybe waits for 30 seconds to get a response. There's nothing in the log (unless you enable Hikari's debug logging) to give a reason for this delay.

Moreover, while the database is inaccessible Hikari is furiously trying to make connections: in my single-threaded test, it would attempt a new connection every 100 milliseconds. And these are real connections, they simply go to the wrong host. Throw in an app-server with a few dozen or hundred threads, and that could cause a significant ripple effect on the database.

Better solution: use a read-only test on checkout, via a wrapper DataSource

Rather than let Hikari silently retry connections, you could wrap the HikariDataSource in your own DataSource implementation and test/retry yourself. This has the benefit that you can actually look at the results of the test query, which means that you can use a self-contained query rather than calling a separately-installed function. It also lets you log the problem using your preferred log levels, lets you pause between attempts, and gives you a chance to change pool configuration.

private static class WrappedDataSource
implements DataSource
{
    private HikariDataSource delegate;

    public WrappedDataSource(HikariDataSource delegate) {
        this.delegate = delegate;
    }

    @Override
    public Connection getConnection() throws SQLException {
        while (true) {
            Connection cxt = delegate.getConnection();
            try (Statement stmt = cxt.createStatement()) {
                try (ResultSet rslt = stmt.executeQuery("select @@innodb_read_only")) {
                    if (rslt.next() && ! rslt.getBoolean(1)) {
                        return cxt;
                    }
                }
            }
            // evict connection so that we won't get it again
            // should also log here
            delegate.evictConnection(cxt);
            try {
                Thread.sleep(1000);
            }
            catch (InterruptedException ignored) {
                // if we're interrupted we just retry
            }
        }
    }

    // all other methods can just delegate to HikariDataSource
}

This solution still suffers from the problem that it introduces a delay into user requests. True, you know that it's happening (which you didn't with the on-checkout test), and you could introduce a timeout (limit the number of times through the loop). But it still represents a bad user experience.

The best (imo) solution: switch into "maintenance mode"

Users are incredibly impatient: if it takes more than a few seconds to get a response back, they'll probably try to reload the page, or submit the form again, or do something that doesn't help and may hurt.

So I think the best solution is to fail quickly and let them know that something's wrong. Somewhere near the top of the call stack you should already have some code that responds to exceptions. Maybe you just return a generic 500 page now, but you can do a little better: look at the exception, and return a "sorry, temporarily unavailable, try again in a few minutes" page if it's a read-only database exception.
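Classifying that exception can be done by walking the cause chain for MySQL's read-only error. A sketch, assuming the driver surfaces error code 1290 (ER_OPTION_PREVENTS_STATEMENT) for this refusal — verify the code against your driver version:

```java
import java.sql.SQLException;

public class ReadOnlyDetector {
    // MySQL's ER_OPTION_PREVENTS_STATEMENT: "The MySQL server is running
    // with the --read-only option so it cannot execute this statement"
    private static final int ER_OPTION_PREVENTS_STATEMENT = 1290;

    public static boolean isReadOnlyFailure(Throwable t) {
        // Walk the cause chain: frameworks often wrap the driver's exception.
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            if (cur instanceof SQLException
                    && ((SQLException) cur).getErrorCode() == ER_OPTION_PREVENTS_STATEMENT) {
                return true;
            }
        }
        return false;
    }
}
```

A top-level handler could then return a 503 "temporarily unavailable" page when this returns true, and fall back to the generic 500 page otherwise.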

At the same time, you should send a notification to your ops staff: this may be a normal maintenance-window failover, or it may be something more serious (but don't wake them up unless you have some way of knowing that it's more serious).
