Safely clearing a Hibernate session in the middle of a large transaction


Problem description

I am using Spring+Hibernate for an operation which requires creating and updating literally hundreds of thousands of items. Something like this:

{
   ...
   Foo foo = fooDAO.get(...);
   for (int i=0; i<500000; i++) {
      Bar bar = barDAO.load(i);
      if (bar.needsModification() && foo.foo()) {
         bar.setWhatever("new whatever");
         barDAO.update(bar);
         // commit here
         Baz baz = new Baz();
         bazDAO.create(baz);
         // if (i % 100 == 0), clear
      }
   }
}

To protect myself against losing changes in the middle, I commit the changes immediately after barDAO.update(bar):

HibernateTransactionManager transactionManager = ...; // injected by Spring
DefaultTransactionDefinition def = new DefaultTransactionDefinition();
def.setPropagationBehavior(TransactionDefinition.PROPAGATION_REQUIRED);
TransactionStatus transactionStatus = transactionManager.getTransaction(def);
transactionManager.commit(transactionStatus);

At this point I should mention that the entire process runs in a transaction wrapped in org.springframework.orm.hibernate3.support.ExtendedOpenSessionInViewFilter (yes, this is a webapp).

This all works fine with one exception: after a few thousand updates/commits, the entire process gets really slow, most likely because memory is bloated by the ever-increasing number of objects kept by Spring/Hibernate.

In a Hibernate-only environment this would be easily solved by calling org.hibernate.Session#clear().

Now, the questions:

  • When is a good time to clear()? Does it have a big performance cost?
  • Why aren't objects like bar or baz released/GCd automatically? What's the point of keeping them in the session after the commit (in the next iteration of the loop they're not reachable anyway)? I haven't taken a memory dump to prove this, but my gut feeling is that they stay there until the whole process exits. If the answer to this is "Hibernate cache", then why isn't the cache flushed when available memory runs low?
  • Is it safe/recommended to call org.hibernate.Session#clear() directly (bearing in mind the entire Spring context, things like lazy loading, etc.)? Are there any usable Spring wrappers/counterparts for achieving the same?
  • If the answer to the above question is yes, what will happen to the object foo, assuming clear() is called inside the loop? What if foo.foo() is a lazy-load method?

Thank you for the answers.

Solution

When is a good time to clear()? Does it have a big performance cost?

At regular intervals, ideally the same as the JDBC batch size, after having flushed the changes. The documentation describes common idioms in the chapter about Batch processing:

13.1. Batch inserts

When making new objects persistent flush() and then clear() the session regularly in order to control the size of the first-level cache.

Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();

for ( int i=0; i<100000; i++ ) {
    Customer customer = new Customer(.....);
    session.save(customer);
    if ( i % 20 == 0 ) { //20, same as the JDBC batch size
        //flush a batch of inserts and release memory:
        session.flush();
        session.clear();
    }
}

tx.commit();
session.close();

And this shouldn't have a performance cost, au contraire:

  • it keeps the number of objects tracked for dirty checking low (so flushing should be fast),
  • it should allow memory to be reclaimed.
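The flush-then-clear cadence described above can be factored out of the loop body. The following is a minimal sketch, not part of the original answer: `PeriodicFlusher` is a hypothetical helper name, and the two `Runnable` arguments are stand-ins for `session.flush()` and `session.clear()`, so the class compiles without Hibernate on the classpath. It also flushes a trailing partial batch, which the loop-counter idiom leaves to the final commit.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical helper: counts processed items and runs flush-then-clear
 * every batchSize items. The Runnables stand in for session.flush()
 * and session.clear().
 */
final class PeriodicFlusher {
    private final Runnable flush;
    private final Runnable clear;
    private final int batchSize;
    private int sinceLastFlush = 0;

    PeriodicFlusher(Runnable flush, Runnable clear, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.flush = flush;
        this.clear = clear;
        this.batchSize = batchSize;
    }

    /** Call once after each saved or updated item. */
    void itemProcessed() {
        sinceLastFlush++;
        if (sinceLastFlush == batchSize) {
            flushAndClear();
        }
    }

    /** Call once after the loop so a trailing partial batch is flushed too. */
    void finish() {
        if (sinceLastFlush > 0) {
            flushAndClear();
        }
    }

    private void flushAndClear() {
        flush.run();   // write pending changes first ...
        clear.run();   // ... then drop the tracked instances
        sinceLastFlush = 0;
    }

    /** Demo with recording stand-ins instead of a real session. */
    static List<String> demo(int items, int batchSize) {
        List<String> calls = new ArrayList<>();
        PeriodicFlusher flusher = new PeriodicFlusher(
                () -> calls.add("flush"), () -> calls.add("clear"), batchSize);
        for (int i = 0; i < items; i++) {
            flusher.itemProcessed();
        }
        flusher.finish();
        return calls;
    }
}
```

With a real session this would presumably be wired as `new PeriodicFlusher(session::flush, session::clear, 20)`, where 20 is assumed to match the configured JDBC batch size; in a Spring setup the session might come from `sessionFactory.getCurrentSession()`.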

Why aren't objects like bar or baz released/GCd automatically? What's the point of keeping them in the session after the commit (in the next iteration of the loop they're not reachable anyway)?

You need to clear() the session explicitly if you don't want to keep entities tracked. That's all, that's how it works (one might want to commit a transaction without "losing" the entities).

But from what I can see, bar and baz instances should become candidates for GC after the clear. It would be interesting to analyze a memory dump to see exactly what is happening.

Is it safe/recommended to call org.hibernate.Session#clear() directly?

As long as you flush() the pending changes first so that you don't lose them (unless that is what you want), I don't see any problem with that. Your current code will lose a create every 100 loops, presumably because clear() runs right after bazDAO.create(baz) without a flush in between, but maybe it's just pseudocode.
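Why the order matters can be shown with a toy model. This is a sketch under loud assumptions: `ToySession` is a hypothetical in-memory stand-in that only mimics write-behind (changes are queued until a flush), not Hibernate's actual implementation. It reflects the documented behaviour that clear() cancels pending saves, updates and deletions.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for a session with write-behind; NOT Hibernate's implementation. */
final class ToySession {
    private final List<String> pending = new ArrayList<>();   // queued, not yet written
    private final List<String> database = new ArrayList<>();  // rows actually written

    /** Queues a row; nothing is written until flush(). */
    void save(String row) {
        pending.add(row);
    }

    /** Writes all queued rows. */
    void flush() {
        database.addAll(pending);
        pending.clear();
    }

    /** Discards everything the session tracks, including queued writes. */
    void clear() {
        pending.clear();
    }

    List<String> rows() {
        return new ArrayList<>(database);
    }

    /** clear() without a preceding flush(): the queued insert is discarded. */
    static List<String> clearOnly() {
        ToySession s = new ToySession();
        s.save("baz-1");
        s.clear();
        s.flush();   // nothing left to write
        return s.rows();
    }

    /** flush() before clear(): the insert is written first. */
    static List<String> flushThenClear() {
        ToySession s = new ToySession();
        s.save("baz-1");
        s.flush();
        s.clear();
        return s.rows();
    }
}
```

In this model `clearOnly()` ends with no rows written, while `flushThenClear()` keeps the row, which is the difference between losing and keeping the Baz created just before the clear.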

If the answer to the above question is yes, what will happen to the object foo, assuming clear() is called inside the loop? What if foo.foo() is a lazy-load method?

Calling clear() evicts all loaded instances from the Session, making them detached entities. If a subsequent invocation requires an entity to be "attached", it will fail.
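For an entity like foo that must stay usable across clears, one option is to fetch it again after each clear() rather than keep using the detached instance, whose lazy access would typically fail with a LazyInitializationException. The sketch below is an assumption-laden illustration, not something from the original answer: `Reloadable` is a hypothetical helper, and the `Supplier` stands in for whatever lookup returns a fresh, attached instance.

```java
import java.util.function.Supplier;

/** Hypothetical holder for an entity that must survive session clears. */
final class Reloadable<T> {
    private final Supplier<T> loader;
    private T current;

    /** Loads the initial instance through the given lookup. */
    Reloadable(Supplier<T> loader) {
        this.loader = loader;
        this.current = loader.get();
    }

    /** Returns the most recently loaded instance. */
    T get() {
        return current;
    }

    /** Call right after clearing the session to fetch a fresh instance. */
    void reload() {
        current = loader.get();
    }
}
```

In the question's loop this might look like `Reloadable<Foo> foo = new Reloadable<>(() -> fooDAO.get(fooId));` followed by `foo.reload()` after every clear, where `fooId` is a hypothetical identifier for the arguments elided in the original code.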
