How does Spring Data clean up persisted entities in a transactional method?

Question

I need to receive and save a huge amount of data using Spring Data over Hibernate. Our server does not have enough RAM allocated to hold all the entities in memory at the same time, so we would definitely get an OutOfMemoryError.

So we obviously need to save the data in batches. We also need to use @Transactional to make sure that either all of the data is persisted or none of it is, even if only a single error occurs.

So, the question: during a @Transactional method, does Spring Data keep the entities in RAM, or are entities that have already been flushed eligible for garbage collection?

And what is the best approach for processing a huge amount of data with Spring Data? Maybe Spring Data is not the right tool for problems like this.

Recommended answer

Does Spring Data keep entities in RAM during a @Transactional method, or are entities that were flushed eligible for garbage collection?

The entities stay in RAM (that is, in the EntityManager's persistence context) until the transaction commits or rolls back, or until the EntityManager is cleared. Flushing alone only sends the pending SQL to the database; the entities remain managed. They become eligible for GC only once the transaction commits or rolls back, or entityManager.clear() is called.

So, what is the best approach for processing a huge amount of data with Spring Data?

The general strategy for preventing OOM is to load and process the data batch by batch. At the end of each batch, flush and clear the EntityManager so that it releases its managed entities for GC. The general code flow looks something like this:

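import java.util.List;

// On Jakarta EE based stacks (e.g. Spring Boot 3+), use jakarta.persistence.* instead.
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;
import org.springframework.transaction.annotation.Transactional;

@Component
public class BatchProcessor {

    // Inside a @Transactional method, Spring makes this EntityManager resolve to
    // the one bound to the current transaction, so it shares the persistence
    // context that the repository uses.
    @PersistenceContext
    private EntityManager em;

    @Autowired
    private FooRepository fooRepository;

    @Transactional
    public void startProcess() {
        processBatch(1, 100);
        processBatch(101, 200);
        processBatch(201, 300);
        // ... and so on for the remaining ID ranges
    }

    // Runs inside the transaction started by startProcess().
    private void processBatch(int fromFooId, int toFooId) {
        // findFooIdBetween is the answer's own repository method (declaration not shown).
        List<Foo> foos = fooRepository.findFooIdBetween(fromFooId, toFooId);
        for (Foo foo : foos) {
            // process a foo
        }

        // Flush first so the pending UPDATE statements are sent to the database;
        // otherwise those changes are lost when the EntityManager is cleared.
        em.flush();
        // Detach all managed entities so they become eligible for GC.
        em.clear();
    }
}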
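
The hard-coded calls in startProcess() can be replaced by a loop over computed ID ranges. The following is only a sketch: BatchRanges and its split method are hypothetical names that are not part of Spring Data or the original answer, and it assumes the IDs are contiguous integers.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Spring Data): computes inclusive ID ranges
// so the per-batch calls do not have to be written out by hand.
final class BatchRanges {

    private BatchRanges() {}

    // Returns inclusive {from, to} pairs covering firstId..lastId in steps of batchSize.
    static List<int[]> split(int firstId, int lastId, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<int[]> ranges = new ArrayList<>();
        // long arithmetic avoids int overflow when lastId is close to Integer.MAX_VALUE
        for (long from = firstId; from <= lastId; from += batchSize) {
            long to = Math.min(from + batchSize - 1, lastId);
            ranges.add(new int[] {(int) from, (int) to});
        }
        return ranges;
    }
}
```

Inside startProcess() this would be used as: for (int[] r : BatchRanges.split(1, 300, 100)) { processBatch(r[0], r[1]); }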

Note that this practice only prevents OOM; it does not aim for high performance. So if performance is not your concern, you can safely use this strategy.
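
The same flush-and-clear rhythm applies when the data is being received and saved rather than loaded, which is the case the question describes. Below is a minimal sketch of that loop in plain Java. BatchWriter, writeInBatches, perItem and afterBatch are hypothetical names introduced for illustration; the JPA calls are passed in as callbacks so the sketch has no framework dependencies.

```java
import java.util.Iterator;
import java.util.function.Consumer;

// Hypothetical helper: applies perItem to every element and runs afterBatch
// once per full batch, plus once more for a trailing partial batch.
final class BatchWriter {

    private BatchWriter() {}

    // Returns the total number of items handed to perItem.
    static <T> int writeInBatches(Iterator<T> items, int batchSize,
                                  Consumer<? super T> perItem, Runnable afterBatch) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int total = 0;
        int inBatch = 0;
        while (items.hasNext()) {
            perItem.accept(items.next());
            total++;
            if (++inBatch == batchSize) {
                afterBatch.run();   // e.g. flush and clear the EntityManager
                inBatch = 0;
            }
        }
        if (inBatch > 0) {
            afterBatch.run();       // handle the last, partially filled batch
        }
        return total;
    }
}
```

With JPA, and assuming the call happens inside a @Transactional method, perItem would be em::persist and afterBatch would be () -> { em.flush(); em.clear(); }. Because the source is consumed through an Iterator, only about one batch of entities needs to stay in memory at a time, as long as the source itself does not hold everything in a list.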
