Default @Transactional in Spring and the default lost update


Problem description



There is one big phenomenon in the Spring environment, or I am terribly wrong. The default Spring @Transactional annotation is not ACID but only ACD, lacking the isolation. That means that if you have this method:

@Transactional
public TheEntity updateEntity(TheEntity ent) {
  // load the currently stored state of the entity
  TheEntity storedEntity = loadEntity(ent.getId());
  // copy the incoming change onto it
  storedEntity.setData(ent.getData());
  // persist and return the updated entity
  return saveEntity(storedEntity);
}

What would happen if 2 threads enter with different planned updates? They both load the entity from the db, they both apply their own changes, then the first one is saved and committed, and when the second one is saved and committed the first UPDATE IS LOST. Is that really the case? With the debugger it works exactly like that.
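
A minimal sketch of that race, assuming the updateEntity method above lives in a hypothetical EntityService and that TheEntity can be constructed from an id; both names are illustrative, not from the original post:

import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class LostUpdateDemo {
  // entityService holds the @Transactional updateEntity method from the question
  void race(EntityService entityService) throws InterruptedException {
    ExecutorService pool = Executors.newFixedThreadPool(2);
    CountDownLatch start = new CountDownLatch(1);
    for (String data : new String[] {"from-thread-A", "from-thread-B"}) {
      pool.submit(() -> {
        start.await();                       // wait so both tasks run together
        TheEntity ent = new TheEntity(42L);  // same row, different payloads
        ent.setData(data);
        entityService.updateEntity(ent);     // whichever commits last silently wins
        return null;
      });
    }
    start.countDown();  // release both tasks at once
    pool.shutdown();
  }
}

Both tasks read the same row before either commits, so at the default isolation level the commit that lands last overwrites the other one's data without any error.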

Solution

You are not terribly wrong; your question is a very interesting observation. I believe (based on your comments) that you are thinking about it in your very specific situation, whereas this subject is much broader. Let's take it step by step.

ACID

The I in ACID indeed stands for isolation. But it does not mean that two or more transactions need to be executed one after another; they just need to be isolated to some level. Most relational databases allow you to set an isolation level on a transaction, even allowing you to read data from another, uncommitted transaction. It is up to the specific application whether such a situation is fine or not. See for example the MySQL documentation:

https://dev.mysql.com/doc/refman/5.7/en/innodb-transaction-isolation-levels.html

You can of course set the isolation level to serializable and achieve what you expect.
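
In Spring that is a one-line change on the annotation; as a sketch, the question's method with the isolation raised (only the attribute is new, the body is unchanged):

import org.springframework.transaction.annotation.Isolation;
import org.springframework.transaction.annotation.Transactional;

// Spring asks the underlying connection to run this particular transaction
// at SERIALIZABLE; the rest of the application keeps the database default.
@Transactional(isolation = Isolation.SERIALIZABLE)
public TheEntity updateEntity(TheEntity ent) {
  TheEntity storedEntity = loadEntity(ent.getId());
  storedEntity.setData(ent.getData());
  return saveEntity(storedEntity);
}

Depending on the database, one of two competing transactions at this level may be blocked or aborted with a serialization error, so the caller may need to retry.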

Now, we also have NoSQL databases that don't support ACID. On top of that, if you start working with a cluster of databases, you may need to embrace eventual consistency of data, which might even mean that the same thread that just wrote some data may not receive it when doing a read. Again, this is a question very specific to a particular app: can I afford having inconsistent data for a moment in exchange for a fast write?

You would probably lean towards consistent data handled in a serializable manner in banking or some financial system, and you would probably be fine with less consistent data in a social app in exchange for higher performance.

Update is lost - is that the case?

Yes, that will be the case.

Are we scared of serializable?

Yes, it might get nasty :-) But it is important to understand how it works and what the consequences are. I don't know if this is still the case, but I ran into this in a project about 10 years ago that used DB2. Due to a very specific scenario, DB2 was performing a lock escalation to an exclusive lock on the whole table, effectively blocking any other connection from accessing the table, even for reads. That meant only a single connection could be handled at a time.

So if you choose to go with the serializable level, you need to be sure that your transactions are in fact fast and that the level is in fact needed. Maybe it is fine that some other thread is reading the data while you are writing? Just imagine a scenario where you have a commenting system for your articles. Suddenly a viral article gets published and everyone starts commenting. A single write transaction for a comment takes 100 ms. 100 new comment transactions get queued, which effectively blocks reading the comments for the next 10 s (100 × 100 ms). I am sure that going with read committed here would be absolutely enough and would allow you to achieve two things: store the comments faster and read them while they are being written.
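
As a sketch of that commenting scenario, assuming a Spring Data style CommentRepository and a Comment entity (both hypothetical names):

import java.util.List;

import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Isolation;
import org.springframework.transaction.annotation.Transactional;

// Hypothetical service: on a typical MVCC database, readers at READ_COMMITTED
// are not blocked by the writers and see each comment once its transaction commits.
@Service
public class CommentService {

  private final CommentRepository comments;  // assumed Spring Data repository

  public CommentService(CommentRepository comments) {
    this.comments = comments;
  }

  @Transactional(isolation = Isolation.READ_COMMITTED)
  public Comment addComment(long articleId, String text) {
    return comments.save(new Comment(articleId, text));
  }

  @Transactional(readOnly = true, isolation = Isolation.READ_COMMITTED)
  public List<Comment> commentsFor(long articleId) {
    return comments.findByArticleId(articleId);  // assumed derived query method
  }
}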

Long story short: it all depends on your data access patterns and there is no silver bullet. Sometimes serializable will be required, but it comes with a performance penalty; sometimes read uncommitted will be fine, but it brings inconsistency penalties.
