Hibernate thread-safe idempotent upsert without constraint exception handling?

Problem description

I have some code that performs an UPSERT, also known as a Merge. I want to clean up this code: specifically, I want to move away from exception handling and reduce the overall verbosity and sheer complexity of the code for such a simple operation. The requirement is to insert each item unless it already exists:

public void batchInsert(IncomingItem[] items) {
    try(Session session = sessionFactory.openSession()) {
        batchInsert(session, items);
    }
    catch(PersistenceException e) {
        if(e.getCause() instanceof ConstraintViolationException) {
            logger.warn("attempting to recover from constraint violation");
            DateTimeFormatter dbFormat = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSS");
            items = Arrays.stream(items).filter(item -> {
                int n = db.queryForObject("select count(*) from rets where source = ? and systemid = ? and updtdate = ?::timestamp",
                        Integer.class,
                        item.getSource().name(), item.getSystemID(), 
                        dbFormat.format(item.getUpdtDateObj()));
                if(n != 0) {
                    logger.warn("REMOVED DUPLICATE: " +
                            item.getSource() + " " + item.getSystemID() + " " + item.getUpdtDate());
                    return false;
                }
                else {
                    return true; // keep
                }
            }).toArray(IncomingItem[]::new);
            try(Session session = sessionFactory.openSession()) {
                batchInsert(session, items);
            }
        }
    }
}

An initial search of Stack Overflow was unsatisfactory:

In the question How to do ON DUPLICATE KEY UPDATE in Spring Data JPA?, which was marked as a duplicate, I noticed an intriguing comment.

That was a dead end: the comment sounds like a clever solution, but I really don't understand it, including its mention of the "actual same SQL statement".

Another promising approach is this: Hibernate and Spring modify query Before Submitting to DB

ON CONFLICT DO NOTHING / ON DUPLICATE KEY UPDATE

Both of the major open-source databases support a mechanism to push idempotency down to the database. The examples below use the PostgreSQL syntax, but can be easily adapted for MySQL.

By following the ideas in Hibernate and Spring modify query Before Submitting to DB, Hooking into Hibernate's query generation, and How I can configure StatementInspector in Hibernate?, I implemented:

import org.hibernate.resource.jdbc.spi.StatementInspector;

@SuppressWarnings("serial")
public class IdempotentInspector implements StatementInspector {

    @Override
    public String inspect(String sql) {
        if(sql.startsWith("insert into rets")) {
            sql += " ON CONFLICT DO NOTHING";
        }
        return sql;
    }

}

with property

        <prop key="hibernate.session_factory.statement_inspector">com.myapp.IdempotentInspector</prop>

Unfortunately this leads to the following error when a duplicate is encountered:

Caused by: org.springframework.orm.hibernate5.HibernateOptimisticLockingFailureException: Batch update returned unexpected row count from update [0]; actual row count: 0; expected: 1; nested exception is org.hibernate.StaleStateException: Batch update returned unexpected row count from update [0]; actual row count: 0; expected: 1

This makes sense if you think about what is going on under the covers: ON CONFLICT DO NOTHING causes zero rows to be inserted, but Hibernate expects each insert to affect exactly one row.
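The mismatch can be modelled in a few lines of plain Java. This is only a simplified illustration of a "one row per statement" expectation, not Hibernate's actual implementation; the class and method names are hypothetical.

```java
public class RowCountCheck {

    /**
     * Simplified model (NOT Hibernate's real code) of checking that every
     * statement in a batch affected the expected number of rows.
     */
    public static void verify(int[] actualRowCounts, int expectedPerStatement) {
        for (int i = 0; i < actualRowCounts.length; i++) {
            if (actualRowCounts[i] != expectedPerStatement) {
                throw new IllegalStateException(
                    "Batch update returned unexpected row count from update [" + i
                    + "]; actual row count: " + actualRowCounts[i]
                    + "; expected: " + expectedPerStatement);
            }
        }
    }
}
```

An insert that is silently skipped by ON CONFLICT DO NOTHING reports a row count of 0, so a check of this kind fails even though nothing went wrong from the database's point of view.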

Is there a solution that enables thread-safe exception-free concurrent idempotent inserts and doesn't require manually defining the entire SQL insert statement to be executed by Hibernate?

For what it's worth, I feel that approaches that push the duplicate check down to the database are the path to a proper solution.

CLARIFICATION: The IncomingItem objects consumed by the batchInsert method originate from a system where records are immutable. Under this special condition, ON CONFLICT DO NOTHING behaves the same as an UPSERT, notwithstanding possible loss of the Nth update.

Solution

Short answer: Hibernate does not support it out of the box (as confirmed by a Hibernate guru in this blog post). You could probably make it work to some extent in some scenarios with the mechanisms you already described, but using native queries directly looks like the most straightforward approach to me for this purpose.
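As a rough sketch of the native-query route, the insert could be written in plain JDBC along these lines. The column list is inferred from the query in the question; the class RetsNativeInsert, the Item holder, and the unique constraint on (source, systemid, updtdate) are assumptions made for illustration.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Timestamp;
import java.time.LocalDateTime;
import java.util.List;

public class RetsNativeInsert {

    // Hypothetical stand-in for IncomingItem, reduced to the three key columns.
    public static final class Item {
        final String source;
        final String systemId;
        final LocalDateTime updtDate;

        public Item(String source, String systemId, LocalDateTime updtDate) {
            this.source = source;
            this.systemId = systemId;
            this.updtDate = updtDate;
        }
    }

    // PostgreSQL syntax; assumes a unique constraint covering these columns.
    public static final String INSERT_SQL =
        "insert into rets (source, systemid, updtdate) values (?, ?, ?) on conflict do nothing";

    /** Inserts all items in one batch; rows that already exist are skipped by the database. */
    public static int insertIgnoringDuplicates(Connection conn, List<Item> items) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            for (Item item : items) {
                ps.setString(1, item.source);
                ps.setString(2, item.systemId);
                ps.setTimestamp(3, Timestamp.valueOf(item.updtDate));
                ps.addBatch();
            }
            return countInserted(ps.executeBatch());
        }
    }

    /** Sums the positive update counts reported by the driver. */
    public static int countInserted(int[] batchResult) {
        int inserted = 0;
        for (int n : batchResult) {
            if (n > 0) {
                inserted += n; // 0 (skipped) and Statement.SUCCESS_NO_INFO (-2) add nothing
            }
        }
        return inserted;
    }
}
```

Within Hibernate, a JDBC Connection for this can be obtained through Session.doWork(...). Some driver configurations report Statement.SUCCESS_NO_INFO instead of a row count, so the returned number is best treated as a lower bound rather than an exact figure.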

The longer answer is that, I guess, it would be hard to support considering all the aspects of Hibernate involved, e.g.:

  • What should happen to instances for which duplicates are found, given that they are supposed to become managed after persisting? Should they be merged into the persistence context?
  • What should happen to associations that have already been persisted, and which cascade operations should apply to them (persist/merge/something_new; or is it too late at that point to make that decision)?
  • Do the databases return enough information from upsert operations to cover all use cases (skipped rows, generated keys for non-skipped rows in batch insert modes, etc.)?
  • What about @Audited entities: are they created or updated, and if updated, what has changed?
  • What about versioning and optimistic locking (where, by definition, you actually want an exception)?

Even if Hibernate supported it in some way, I'm not sure I'd use that feature if there were too many caveats to watch out for and take into consideration.

So, the rule of thumb I follow is:

  • For simple scenarios (which is most of the time): persist + retry. Retries on specific errors (identified by exception type or similar) can be configured globally with AOP-like approaches (annotations, custom interceptors and similar), depending on which frameworks you use in your project. This is good practice anyway, especially in distributed environments.
  • For complex scenarios and performance-intensive operations (especially batching, very complex queries and the like): native queries, to make the most of specific database features.
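For the persist + retry option, a framework-free sketch of the idea might look like this. The Retry class and withRetry method are hypothetical names; in a real project an AOP or interceptor-based mechanism from your framework would typically take their place.

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

public class Retry {

    /**
     * Runs the action and retries it, up to maxAttempts attempts in total,
     * as long as the thrown exception is accepted by the retryable predicate.
     * Any other exception is rethrown immediately.
     */
    public static <T> T withRetry(int maxAttempts,
                                  Predicate<RuntimeException> retryable,
                                  Supplier<T> action) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                if (!retryable.test(e)) {
                    throw e; // not a retryable failure
                }
                last = e; // remember it and try again
            }
        }
        throw last; // all attempts failed with retryable errors
    }
}
```

The predicate could, for example, test e.getCause() instanceof ConstraintViolationException as the code in the question does. What each new attempt does differently (such as skipping rows that now exist) is up to the action.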
