无论如何,我无法在Hibernate中对MySQL INSERT语句进行批处理 [英] No matter what, I can't batch MySQL INSERT statements in Hibernate

查看:111
本文介绍了无论如何,我无法在Hibernate中对MySQL INSERT语句进行批处理的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我目前正面临众所周知的常见Hibernate插入批处理问题。



我需要保存5百万行的批。我首先尝试使用更轻的有效载荷。因为我必须插入只有2种类型的实体(首先是A型的所有记录,然后是B型的所有记录,全部指向常见类型C ManyToOne 父级),我会喜欢从JDBC批量插入中获得最大优势。



我已经阅读了很多文档,但没有一篇我已经尝试过。




  • 我知道为了使用批量插入,我不能使用实体生成器。所以我删除了 AUTO_INCREMENT ID,我用一个技巧设置了ID: SELECT MAX(ID)FROM ENTITIES 并且每次都会增加。

  • 我知道必须定期刷新会话。我会提前发布代码,但无论如何,我每500个元素执行一次交易。
  • 我知道,我必须设置 hibernate。 jdbc.batch_size 与我的应用程序的批量大小一致,所以我将它设置在 LocalSessionFactoryBean (Spring ORM集成)中。
  • 我知道我必须在连接网址中启用重写批量声明。 这里是我的实体

    常见的父实体。这首先被插入到单个事务中。我不在乎这里的自动增量列 。每个批处理作业只有一个记录

      @Entity 
    @Table(...)
    @SequenceGenerator(...)
    public class Deal
    {

    @Id
    @Column(
    name =DEAL_ID,
    nullable = false)
    @GeneratedValue(
    strategy = GenerationType.AUTO)
    保护长ID;

    ................
    }

    其中一个孩子(比方说每批2.5M记录)

      @Entity 
    @Table(
    name =TA_LOANS)
    公共类贷款
    {

    @Id
    @Column(
    name = LOAN_ID,
    nullable = false)
    保护长ID;

    @ManyToOne(
    可选= false,
    targetEntity = Deal.class,
    fetch = FetchType.LAZY)
    @JoinColumn(
    name =DEAL_ID,
    nullable = false)
    保护交易交易;


    .............
    }

    其他孩子类型。假设其他2.5M记录

      @Entity 
    @Table(
    name =TA_BONDS)
    public class Bond
    {

    @Id
    @Column(
    name =BOND_ID)

    @ManyToOne(
    fetch = FetchType.LAZY,
    可选= false,
    targetEntity = Deal.class)
    @JoinColumn(
    name =DEAL_ID,
    可以为null =假,
    updatable = false)
    保护交易协议;


    $ / code>

    插入记录的简化代码

      long loanIdCounter = loanDao.getMaxId(),bondIdCounter = bondDao.getMaxId(); //执行SELECT MAX(ID)

    Deal deal = null;

    列表< Bond> bondList = new ArrayList< Bond>(COMMIT_BATCH_SIZE); // 500常量值
    List< Loan> loanList = new ArrayList< Loan>(COMMIT_BATCH_SIZE);

    for(String msg:inputStreamReader)
    {
    log.debug(msg.toString());

    if(这是一笔交易)
    {
    Deal deal = parseDeal(msg.getMessage());

    deal = dealManager.persist(holder.deal); //使用Spring注解@Transaction(REQUIRES_NEW)

    }
    在一个单独的事务中调用另外如果(这是一个贷款)
    {

    贷款loan = parseLoan(msg.getMessage());
    loan.setId(++ loanIdCounter);
    loan.setDeal(deal);

    loanList.add(loan);

    if(loanList.size()== COMMIT_BATCH_SIZE)
    {
    loanManager.bulkInsert(loanList); //在单个事务中执行批量插入,而不是注释,但是这次手动处理
    loanList.clear();


    else if(this is a bond)
    {
    Bond bond = parseBond(msg.getMessage());
    bond.setId(++ bondIdCounter);
    bond.setDeal(成交);

    bondList.add(bond);



    if(bondList.size()== COMMIT_BATCH_SIZE)//如上
    {
    bondManager.bulkInsert(bondList);
    bondList.clear();




    $ b if(!bondList.isEmpty())
    bondManager.bulkInsert(bondList);
    if(!loanList.isEmpty())
    loanManager.bulkInsert(loanList);
    //刷新其余项目,不重要

    实现 bulkInsert

      @Override 
    public void bulkInsert(Collection< Bond> bonds)
    {
    // StatelessSession session = sessionFactory.openStatelessSession();
    Session session = sessionFactory.openSession();
    尝试
    {
    Transaction t = session.beginTransaction();
    尝试
    {
    (债券债券)
    // session.persist(债券);
    // session.insert(bond);
    session.save(债券);
    }
    catch(RuntimeException ex)
    {
    t.rollback();
    }
    finally
    {
    t.commit();
    }
    }
    finally
    {
    session.close();
    }

    }

    从评论中可以看到,我尝试了几种有状态/无状态 session 的组合。



    我的 dataSource ComboPooledDataSource 与以下URL

     < b:property name =jdbcUrlvalue =jdbc:mysql:// server:3306 / db?autoReconnect = true& amp; rewriteBatchedStatements = true/ > 

    我的 SessionFactory

     < b:bean id =sessionFactoryclass =class.that.extends.org.springframework.orm.hibernate3.LocalSessionFactoryBeanlazy-init =falsedepends-on =dataSource> 
    < b:属性名称=dataSourceref =phoenixDataSource/>
    < b:道具>
    < b:prop key =hibernate.show_sql> $ {hibernate.showSQL}< / b:prop>
    < b:prop key =hibernate.jdbc.batch_size> 500< / b:prop>
    < b:prop key =hibernate.cache.provider_class> org.hibernate.cache.EhCacheProvider< / b:prop>
    < b:prop key =hibernate.order_updates> true< / b:prop>
    < / b:道具>
    < / b:属性>
    < / b:bean>

    即使我的项目范围的类扩展了 LocalSessionFactoryBean ,它不会覆盖它的方法(只增加了几个项目范围的方法)。

    几天后我生气了。我读了一些文章,没有帮助我启用批量插入。我从JUnit测试中运行所有代码,并使用Spring上下文(所以我可以 @Autowire 我的类)。我所有的尝试都只产生大量单独的 INSERT 语句



    我缺少什么?

  • 很可能你的查询正在被重写,但你不知道是否通过厕所国王在Hibernate SQL日志。 Hibernate不重写插入语句 - MySQL驱动程序重写它们。换句话说,Hibernate会向驱动程序发送多个插入语句,然后驱动程序将重写它们。因此,Hibernate日志只会显示SQL Hibernate发送给驱动程序的内容,而不是驱动程序发送到数据库的SQL。



    您可以通过启用MySQL的profileSQL参数来验证连接url:

     < b:property name =jdbcUrlvalue =jdbc:mysql:// server:3306 / db?autoReconnect = true& amp; rewriteBatchedStatements = true& amp; profileSQL = true/> 

    使用类似于您的示例,这就是我的输出结果:



    插入Person(firstName,lastName,id)值(?,?,?)
    插入Person(firstName,lastName,id) (?,?,?)插入到Person(firstName,lastName,id)values(?,?,?)
    插入到Person(firstName,lastName,id)values(?,? ($,?,?)
    插入Person(firstName,lastName,id)values(?,?,?)
    插入Person(firstName,lastName,插入Person(firstName,lastName,id)值(?,?,?)
    插入Person(firstName,lastName,id) (?,?,?)
    星期二05月13:29:52 MST 2014 INFO: Profiler事件:[QUERY] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)持续时间:1 ms,connection-id:81,statement-id:33,resultset-id:0,messag e:插入Person(名字,姓氏,id)值('person1','Name',1),('person2','Name',2),('person3','Name',3),( 'person4','Name',4),('person5','Name',5),('person6','Name',6),('person7','Name',7),('person8 '''Name',8),('person9','Name',9),('person10','Name',10)

    Hibernate会记录前10行,但这不是实际发送到MySQL数据库的内容。最后一行来自MySQL驱动程序,它清楚地显示了具有多个值的单个批量插入,这是实际发送到MySQL数据库的内容。


    I'm currently facing the well-known and common Hibernate insert batch problem.

    I need to save batches 5 millions of rows long. I'm first trying with a much lighter payload. Since I have to insert entities of only 2 types (first all records of type A, then all records of type B, all pointing to common type C ManyToOne parent), I would like to take the most advantage from JDBC batch insert.

    I have already read lots of documentation, but none that I have tried worked.

    • I know that in order to use batch inserts I must not use an entity generator. So I removed the AUTO_INCREMENT ID and I'm setting the ID with a trick: SELECT MAX(ID) FROM ENTITIES and increment every time.
    • I know that I must flush the session regularly. I'll post code ahead, but anyway I perform a transaction every 500 elements.
    • I know that I have to set hibernate.jdbc.batch_size consistent with my application's bulk size, so I set it in the LocalSessionFactoryBean (Spring ORM integration)
    • I know I have to enable rewriting batched statements in connection URL.

    Here are my entities

    Common parent entity. This gets inserted first in a single transaction. I don't care about auto increment column here. Only one record per batch job

    @Entity
    @Table(...)
    @SequenceGenerator(...)
    public class Deal
    {
    
        @Id
        @Column(
                name = "DEAL_ID",
                nullable = false)
        @GeneratedValue(
                strategy = GenerationType.AUTO)
        protected Long id;
    
        ................
    }
    

    One of the children (let's say 2.5M records per batch)

    @Entity
    @Table(
            name = "TA_LOANS")
    public class Loan
    {
    
        @Id
        @Column(
                name = "LOAN_ID",
                nullable = false)
        protected Long id;
    
        @ManyToOne(
                optional = false,
                targetEntity = Deal.class,
                fetch = FetchType.LAZY)
        @JoinColumn(
                name = "DEAL_ID",
                nullable = false)
        protected Deal deal;
    
    
        .............
    }
    

    The other children type. Let's say the other 2.5M records

    @Entity
    @Table(
            name = "TA_BONDS")
    public class Bond
    {
    
        @Id
        @Column(
                name = "BOND_ID")
    
        @ManyToOne(
                fetch = FetchType.LAZY,
                optional = false,
                targetEntity = Deal.class)
        @JoinColumn(
                name = "DEAL_ID",
                nullable = false,
                updatable = false)
        protected Deal deal;
    
    }
    

    Simplified code that inserts records

        long loanIdCounter = loanDao.getMaxId(), bondIdCounter = bondDao.getMaxId(); //Perform SELECT MAX(ID)
    
        Deal deal = null;
    
        List<Bond> bondList = new ArrayList<Bond>(COMMIT_BATCH_SIZE); //500 constant value
        List<Loan> loanList = new ArrayList<Loan>(COMMIT_BATCH_SIZE);
    
        for (String msg: inputStreamReader)
        {
            log.debug(msg.toString());
    
            if (this is a deal)
            {
                Deal deal = parseDeal(msg.getMessage());
    
                deal = dealManager.persist(holder.deal); //Called in a separate transaction using Spring annotation @Transaction(REQUIRES_NEW)
    
            }
            else if (this is a loan)
            {
    
                Loan loan = parseLoan(msg.getMessage());
                loan.setId(++loanIdCounter);
                loan.setDeal(deal);
    
                loanList.add(loan);
    
                if (loanList.size() == COMMIT_BATCH_SIZE)
                {
                    loanManager.bulkInsert(loanList); //Perform a bulk insert in a single transaction, not annotated but handled manually this time
                    loanList.clear();
                }
            }
            else if (this is a bond)
            {
                Bond bond = parseBond(msg.getMessage());
                bond.setId(++bondIdCounter);
                bond.setDeal(deal);
    
                bondList.add(bond);
    
    
    
                if (bondList.size() == COMMIT_BATCH_SIZE) //As above
                {
                    bondManager.bulkInsert(bondList);
                    bondList.clear();
    
                }
            }
        }
    
        if (!bondList.isEmpty())
            bondManager.bulkInsert(bondList);
        if (!loanList.isEmpty())
            loanManager.bulkInsert(loanList);
        //Flush remaining items, not important
    

    Implementation of bulkInsert:

    @Override
    public void bulkInsert(Collection<Bond> bonds)
    {
        // StatelessSession session = sessionFactory.openStatelessSession();
        Session session = sessionFactory.openSession();
        try
        {
            Transaction t = session.beginTransaction();
            try
            {
                for (Bond bond : bonds)
                    // session.persist(bond);
                    // session.insert(bond);
                    session.save(bond);
            }
            catch (RuntimeException ex)
            {
                t.rollback();
            }
            finally
            {
                t.commit();
            }
        }
        finally
        {
            session.close();
        }
    
    }
    

    As you can see from comments, I have tried several combinations of stateful/stateless session. None worked.

    My dataSource is a ComboPooledDataSource with following URL

    <b:property name="jdbcUrl" value="jdbc:mysql://server:3306/db?autoReconnect=true&amp;rewriteBatchedStatements=true" />
    

    My SessionFactory

    <b:bean id="sessionFactory" class="class.that.extends.org.springframework.orm.hibernate3.LocalSessionFactoryBean" lazy-init="false" depends-on="dataSource">
            <b:property name="dataSource" ref="phoenixDataSource" />
            <b:property name="hibernateProperties">
                <b:props>
                    <b:prop key="hibernate.dialect">${hibernate.dialect}</b:prop> <!-- MySQL5InnoDb-->
                    <b:prop key="hibernate.show_sql">${hibernate.showSQL}</b:prop>
                    <b:prop key="hibernate.jdbc.batch_size">500</b:prop>
                    <b:prop key="hibernate.jdbc.use_scrollable_resultset">false</b:prop>
                    <b:prop key="hibernate.cache.use_second_level_cache">false</b:prop>
                    <b:prop key="hibernate.cache.provider_class">org.hibernate.cache.EhCacheProvider</b:prop>
                    <b:prop key="hibernate.cache.use_query_cache">false</b:prop>
                    <b:prop key="hibernate.validator.apply_to_ddl">false</b:prop>
                    <b:prop key="hibernate.validator.autoregister_listeners">false</b:prop>
                    <b:prop key="hibernate.order_inserts">true</b:prop>
                    <b:prop key="hibernate.order_updates">true</b:prop>
                </b:props>
            </b:property>
    </b:bean>
    

    Even if my project-wide class extends LocalSessionFactoryBean, it does not override its methods (only adds few project-wide methods)

    I'm getting mad since a few days. I read a few articles and none helped me enable batch inserts. I run all of my code from JUnit tests instrumented with Spring context (so I can @Autowire my classes). All of my attempts only produce a lots of separate INSERT statements

    What am I missing?

    解决方案

    It's likely your queries are being rewritten but you wouldn't know if by looking at the Hibernate SQL logs. Hibernate does not rewrite the insert statements - the MySQL driver rewrites them. In other words, Hibernate will send multiple insert statements to the driver, and then the driver will rewrite them. So the Hibernate logs only show you what SQL Hibernate sent to the driver, not what SQL the driver sent to the database.

    You can verify this by enabling MySQL's profileSQL parameter in connection url:

    <b:property name="jdbcUrl" value="jdbc:mysql://server:3306/db?autoReconnect=true&amp;rewriteBatchedStatements=true&amp;profileSQL=true" />
    

    Using an example similar to yours, this is what my output looks like:

    insert into Person (firstName, lastName, id) values (?, ?, ?)
    insert into Person (firstName, lastName, id) values (?, ?, ?)
    insert into Person (firstName, lastName, id) values (?, ?, ?)
    insert into Person (firstName, lastName, id) values (?, ?, ?)
    insert into Person (firstName, lastName, id) values (?, ?, ?)
    insert into Person (firstName, lastName, id) values (?, ?, ?)
    insert into Person (firstName, lastName, id) values (?, ?, ?)
    insert into Person (firstName, lastName, id) values (?, ?, ?)
    insert into Person (firstName, lastName, id) values (?, ?, ?)
    insert into Person (firstName, lastName, id) values (?, ?, ?)
    Wed Feb 05 13:29:52 MST 2014 INFO: Profiler Event: [QUERY]  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) duration: 1 ms, connection-id: 81, statement-id: 33, resultset-id: 0, message: insert into Person (firstName, lastName, id) values ('person1', 'Name', 1),('person2', 'Name', 2),('person3', 'Name', 3),('person4', 'Name', 4),('person5', 'Name', 5),('person6', 'Name', 6),('person7', 'Name', 7),('person8', 'Name', 8),('person9', 'Name', 9),('person10', 'Name', 10)
    

    The first 10 lines are being logged by Hibernate though this not what is actually being sent to MySQL database. The last line is coming from MySQL driver and it clearly shows a single batch insert with multiple values and that is what is actually being sent to the MySQL database.

    这篇关于无论如何,我无法在Hibernate中对MySQL INSERT语句进行批处理的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆