Configure GlassFish JDBC connection pool to handle Amazon RDS Multi-AZ failover


Problem description

I have a Java EE application running in GlassFish on EC2, with a MySQL database on Amazon RDS. I am trying to configure the JDBC connection pool to minimize downtime in case of a database failover.

My current configuration isn't working correctly during a Multi-AZ failover, as the standby database instance appears to be available in a couple of minutes (according to the AWS console) while my GlassFish instance remains stuck for a long time (about 15 minutes) before resuming work.

The connection pool is configured like this:

asadmin create-jdbc-connection-pool --restype javax.sql.ConnectionPoolDataSource \
--datasourceclassname com.mysql.jdbc.jdbc2.optional.MysqlConnectionPoolDataSource \
--isconnectvalidatereq=true --validateatmostonceperiod=60 --validationmethod=auto-commit \
--property user=$DBUSER:password=$DBPASS:databaseName=$DBNAME:serverName=$DBHOST:port=$DBPORT \
MyPool

If I use a Single-AZ db.m1.small instance and reboot the database from the console, GlassFish will invalidate the broken connections, throw some exceptions and then reconnect as soon as the database is available. In this setup I get less than 1 minute of downtime.

If I use a Multi-AZ db.m1.small instance and reboot with failover from the AWS console, I see no exception at all. The server halts completely, with all incoming requests timing out. After 15 minutes I finally get this:

Communication failure detected when attempting to perform read query outside of a transaction. Attempting to retry query. Error was: Exception [EclipseLink-4002] (Eclipse Persistence Services - 2.3.2.v20111125-r10461): org.eclipse.persistence.exceptions.DatabaseException
Internal Exception: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

The last packet successfully received from the server was 940,715 milliseconds ago.  The last packet sent successfully to the server was 935,598 milliseconds ago.

It appears as if each HTTP thread gets blocked on an invalid connection without getting an exception and so there's no chance to perform connection validation.

Downtime in the Multi-AZ case is always between 15 and 16 minutes, so it looks like a timeout of some sort, but I was unable to change it.

Things I have tried without success:

  • connection leak timeout/reclaim
  • statement leak timeout/reclaim
  • statement timeout
  • using a different validation method
  • using MysqlDataSource instead of MysqlConnectionPoolDataSource

How can I set a timeout on stuck queries so that connections in the pool are reused, validated and replaced? Or how can I let GlassFish detect a database failover?

Solution

As I said in my comment, this happens because the sockets that are open and connected to the database don't realize the connection has been lost, so they stay connected until the OS socket timeout is triggered, which from what I have read is usually around 30 minutes.

To solve the issue, you need to override the socket timeout, either in your JDBC connection string or in the JNDI connection configuration/properties, by setting the socketTimeout parameter to a shorter time.

Keep in mind that any connection that takes longer than the defined value will be killed, even if it is in use (I haven't been able to confirm this; it is what I read).
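To make the effect of a socket read timeout concrete, here is a minimal sketch using only plain java.net sockets, with no MySQL driver involved. It assumes, as far as I understand it, that socketTimeout ends up as a read timeout on the underlying TCP socket. The class and method names (SocketTimeoutSketch, readTimesOut) are made up for illustration, and the silent local server only stands in for a database host that disappeared without closing the connection.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class SocketTimeoutSketch {

    /**
     * Connects to a local server that never sends data and reports whether
     * the blocked read was ended by the read timeout.
     */
    static boolean readTimesOut(int timeoutMillis) throws IOException {
        // A silent server on an ephemeral loopback port stands in for a
        // database host that vanished without closing the TCP connection.
        try (ServerSocket silentServer =
                     new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(silentServer.getInetAddress(),
                                        silentServer.getLocalPort())) {
            // Without this call the timeout is 0, meaning "wait forever".
            client.setSoTimeout(timeoutMillis);
            InputStream in = client.getInputStream();
            try {
                in.read(); // blocks: nothing is ever written to this socket
                return false;
            } catch (SocketTimeoutException e) {
                return true; // the read gave up after timeoutMillis
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("read timed out: " + readTimesOut(200));
    }
}
```

With no timeout set, the same read would stay blocked until the operating system gives up on the connection, which matches the long stall described in the question.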

The other two parameters I mention in my comment are connectTimeout and autoReconnect.

Here's my JDBC Connection String:

jdbc:(...)&connectTimeout=15000&socketTimeout=60000&autoReconnect=true 
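For reference, this is roughly how such a URL could be put together in code. It is only a sketch: the host name, port and database name are placeholders I made up, FailoverJdbcUrl is a hypothetical helper, and the three parameters are the ones from the connection string above (both timeouts are in milliseconds).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class FailoverJdbcUrl {

    /** Builds a MySQL JDBC URL carrying the three failover-related parameters. */
    static String build(String host, int port, String database) {
        // LinkedHashMap keeps the parameters in insertion order.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("connectTimeout", "15000"); // ms allowed for opening the socket
        params.put("socketTimeout", "60000");  // ms a read may block on the socket
        params.put("autoReconnect", "true");

        StringJoiner query = new StringJoiner("&");
        params.forEach((key, value) -> query.add(key + "=" + value));
        return "jdbc:mysql://" + host + ":" + port + "/" + database + "?" + query;
    }

    public static void main(String[] args) {
        // Placeholder endpoint; use your own RDS endpoint and schema name.
        System.out.println(build("mydb.example.rds.amazonaws.com", 3306, "mydb"));
    }
}
```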

I also disabled Java's DNS cache by doing

 java.security.Security.setProperty("networkaddress.cache.ttl" , "0"); 
 java.security.Security.setProperty("networkaddress.cache.negative.ttl" , "0"); 

I do this because Java doesn't honor the DNS TTLs, and when the failover takes place, the DNS name stays the same but the IP address changes.
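In a standalone program, the two calls would go at the very start of main, before anything resolves a host name, because as far as I know the JVM reads these security properties only once, when its address cache policy is initialized. A minimal sketch (the class name DnsCacheSketch and the method disableDnsCache are just for illustration):

```java
import java.security.Security;

public class DnsCacheSketch {

    /** Sets both DNS cache TTL security properties to 0 (no caching). */
    static void disableDnsCache() {
        Security.setProperty("networkaddress.cache.ttl", "0");
        Security.setProperty("networkaddress.cache.negative.ttl", "0");
    }

    public static void main(String[] args) {
        // Call this before any host name lookup happens in the JVM.
        disableDnsCache();
        System.out.println("ttl = "
                + Security.getProperty("networkaddress.cache.ttl"));
        System.out.println("negative ttl = "
                + Security.getProperty("networkaddress.cache.negative.ttl"));
    }
}
```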

Since you are using an application server, the parameters to disable the DNS cache must be passed to the JVM when starting GlassFish with -Dnet, and not set in the application itself.
