How to split read-only and read-write transactions with JPA and Hibernate


Problem Description


I have a quite heavy Java webapp that serves thousands of requests per second. It uses a primary PostgreSQL database that replicates itself to one secondary (read-only) database using streaming (asynchronous) replication.


So, I route requests between the primary and the secondary (read-only) node based on their URLs, to keep read-only calls off the busy primary database, considering that the replication lag is minimal.

Note: I use a single sessionFactory together with Spring's RoutingDataSource, which looks up which database to use based on a key. I'm interested in multi-tenancy, as I'm using Hibernate 4.3.4, which supports it.


I have two questions:

  1. I don't think splitting on the basis of URLs is efficient, as I can only move about 10% of the traffic that way; there are not many read-only URLs. What approach should I consider?
  2. Maybe, somehow, on the basis of URLs I can achieve some level of distribution between both nodes, but what should I do with my Quartz jobs (which even run in a separate JVM)? What pragmatic approach should I take?


I know I might not get a perfect answer here, as this really is broad, but I just want your opinion for context.

My stack:

  • Spring 4
  • Hibernate 4
  • Quartz 2.2
  • Java 7 / Tomcat 7


Please take interest. Thanks in advance.

Recommended Answer

You should have:


  1. a DataSource configured to connect to the Primary node
  2. a DataSource configured to connect to the Follower node or nodes (you can use round-robin access scheduling for those)
  3. a routing DataSource that stands in front of these two and is the one your SessionFactory uses
  4. you can use the @Transactional(readOnly=true) flag to make sure you route read-only transactions to the Follower DataSource
  5. Both the Primary and the Follower DataSource require a connection pooling mechanism, and the fastest one is definitely HikariCP. HikariCP is so fast that in one test of mine I got a 100µs average connection acquisition time.
  6. You need to make sure you set the right size for your connection pools, because that can make a huge difference. For this, I recommend using flexy-pool. You can find more about it here and here.
  7. You need to be very diligent and make sure you mark all read-only transactions accordingly. It's unusual that only 10% of your transactions are read-only. Could it be that you have such a write-mostly application, or that you are using write transactions where you only issue query statements?
  8. Monitor all query executions using an SQL logging framework. The shorter the query execution and the shorter the lock acquisition times, the more transactions per second your system will accommodate.
  9. For batch processing you definitely need write-mostly transactions, but OLTP in general and Hibernate in particular are not the best fit for OLAP. If you still decide to use Hibernate for your Quartz jobs, make sure you enable JDBC batching, and you should have these Hibernate properties set:

<property name="hibernate.order_updates" value="true"/>
<property name="hibernate.order_inserts" value="true"/>
<property name="hibernate.jdbc.batch_versioned_data" value="true"/>
<property name="hibernate.jdbc.fetch_size" value="25"/>
<property name="hibernate.jdbc.batch_size" value="25"/>
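Steps 1–4 above can be sketched in plain Java. This is a minimal illustration only, not Spring's actual API: in a real setup you would extend Spring's AbstractRoutingDataSource and read the read-only flag from TransactionSynchronizationManager. The class and key names below are hypothetical stand-ins.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Minimal stand-in for a routing DataSource: plain String keys represent
// the Primary and Follower targets instead of real DataSource objects.
public class TransactionRouter {

    // @Transactional(readOnly = true) would set this flag per transaction;
    // here it is modeled with a ThreadLocal.
    private static final ThreadLocal<Boolean> READ_ONLY = new ThreadLocal<Boolean>() {
        @Override protected Boolean initialValue() { return Boolean.FALSE; }
    };

    private final String primary;
    private final List<String> followers;
    private final AtomicInteger counter = new AtomicInteger();

    public TransactionRouter(String primary, List<String> followers) {
        this.primary = primary;
        this.followers = followers;
    }

    public static void beginTransaction(boolean readOnly) {
        READ_ONLY.set(readOnly);
    }

    // Equivalent of determineCurrentLookupKey(): read-only transactions are
    // routed round-robin across the Followers, everything else to the Primary.
    public String route() {
        if (READ_ONLY.get()) {
            int i = (counter.getAndIncrement() & Integer.MAX_VALUE) % followers.size();
            return followers.get(i);
        }
        return primary;
    }
}
```

The round-robin counter implements the follower scheduling mentioned in step 2; masking with Integer.MAX_VALUE keeps the index non-negative if the counter ever overflows.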


For batching you can use a separate data source that uses a different connection pool (and because you already said you have a separate JVM, that's what you already have). Just make sure the total connection count across all of your connection pools is less than the number of connections PostgreSQL has been configured with.
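The pool-budget rule from the paragraph above is just an inequality: the sum of all pool sizes, across every JVM, must stay below PostgreSQL's max_connections. A small sanity-check helper (all numbers in the usage below are hypothetical examples):

```java
// Checks that the combined size of all connection pools (the web app's pools
// plus the batch JVM's pool) stays below PostgreSQL's max_connections.
public class PoolBudget {

    public static int totalPoolSize(int... poolSizes) {
        int total = 0;
        for (int size : poolSizes) {
            total += size;
        }
        return total;
    }

    public static boolean fitsWithin(int maxConnections, int... poolSizes) {
        // Strictly less than, leaving headroom for superuser sessions.
        return totalPoolSize(poolSizes) < maxConnections;
    }
}
```

For example, pools of 20 (primary), 20 (follower), and 10 (batch JVM) total 50 connections, which fits within a max_connections of 100 but not within 50.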


So the batch processor uses a separate HikariCP DataSource that connects to the Primary. Each batch job must use a dedicated transaction, so make sure you use a reasonable batch size. You want to hold locks briefly and finish transactions as fast as possible. If the batch processor uses concurrent processing workers, make sure the associated connection pool size equals the number of workers, so they don't have to wait for others to release connections.
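The chunking logic behind "a reasonable batch size" can be sketched without a database. The flush points below are stand-ins for a JDBC executeBatch() plus commit (or a Hibernate flush/clear); the method name and return value are purely illustrative.

```java
import java.util.ArrayList;
import java.util.List;

// Splits a workload into fixed-size chunks so each transaction stays short
// and locks are released quickly. Returns the size of every flushed chunk.
public class ChunkedBatch {

    public static List<Integer> flushSizes(int itemCount, int batchSize) {
        List<Integer> flushes = new ArrayList<Integer>();
        int pending = 0;
        for (int i = 0; i < itemCount; i++) {
            pending++;                     // e.g. PreparedStatement.addBatch()
            if (pending == batchSize) {
                flushes.add(pending);      // e.g. executeBatch() + commit
                pending = 0;
            }
        }
        if (pending > 0) {
            flushes.add(pending);          // flush the final partial chunk
        }
        return flushes;
    }
}
```

With a batch size of 25 (matching the hibernate.jdbc.batch_size property above), 60 items would be flushed as three chunks of 25, 25, and 10, each in its own short transaction.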

