在Java中,我可以依靠引用分配是原子的来实现写时复制吗? [英] In Java can I depend on reference assignment being atomic to implement copy on write?

查看:80
本文介绍了在Java中,我可以依靠引用分配是原子的来实现写时复制吗?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

如果我在多线程环境中有一个未同步的Java集合,并且我不想强制该集合的读者进行同步 [1] ,这是一种解决方案,其中我可以同步作家并使用参考分配的原子性可行吗?像这样:

If I have an unsynchronized java collection in a multithreaded environment, and I don't want to force readers of the collection to synchronize[1], is a solution where I synchronize the writers and use the atomicity of reference assignment feasible? Something like:

private Collection global = new HashSet(); // start threading after this

void allUpdatesGoThroughHere(Object exampleOperand) {
  // My hypothesis is that this prevents operations in the block being re-ordered
  synchronized(global) {
    Collection copy = new HashSet(global);
    copy.remove(exampleOperand);
    // Given my hypothesis, we should have a fully constructed object here. So a 
    // reader will either get the old or the new Collection, but never an 
    // inconsistent one.
    global = copy;    
  }
}

// Do multithreaded reads here. All reads are done through a reference copy like:
// Collection copy = global;
// for (Object elm: copy) {...
// so the global reference being updated half way through should have no impact 

在这种情况下,滚动自己的解决方案似乎经常会失败,因此我想了解其他可用于防止对象创建和阻塞数据消费者的模式,集合或库.

Rolling your own solution seems to often fail in these type of situations, so I'd be interested in knowing other patterns, collections or libraries I could use to prevent object creation and blocking for my data consumers.

[1]原因是与写入相比,读取所花费的时间占很大比例,并且存在引入死锁的风险.

[1] The reasons being a large proportion of time spent in reads compared to writes, combined with the risk of introducing deadlocks.

在一些答案和评论中有很多有用的信息,其中有一些要点:

A lot of good information in several of the answers and comments, some important points:

  1. 我发布的代码中存在一个错误.交换后,在全局变量(命名错误的变量)上进行同步可能无法保护已同步化的块.
  2. 您可以通过在类上同步(将synced关键字移到方法上)来解决此问题,但是可能还有其他错误.一个更安全,更可维护的解决方案是使用java.util.concurrent中的内容.
  3. 我发布的代码中没有最终一致性保证",使用volatile关键字是确保读者可以一定程度地看到作者的更新的一种方法.
  4. 反思后,引发此问题的一般问题是尝试使用Java中的锁定写入来实现无锁读取,但是我的(已解决)问题与集合有关,这可能会给以后的读者带来不必要的困惑.因此,在不太明显的情况下,我发布的代码可以一次允许一个编写者对某个对象"执行编辑,而该对象正在不受多个读取器线程保护的情况下进行编辑.编辑的提交是通过原子操作完成的,因此读者只能获得编辑前或编辑后的对象".当读取器线程获得更新时,如果读取器线程在对象"的旧副本上进行了读取,则该更新不能在读取过程中发生.在Java中提供更好的并发支持之前,一个可能已经被发现并证明以某种方式被破坏的简单解决方案.
  1. A bug was present in the code I posted. Synchronizing on global (a badly named variable) can fail to protect the syncronized block after a swap.
  2. You could fix this by synchronizing on the class (moving the synchronized keyword to the method), but there may be other bugs. A safer and more maintainable solution is to use something from java.util.concurrent.
  3. There is no "eventual consistency guarantee" in the code I posted, one way to make sure that readers do get to see the updates by writers is to use the volatile keyword.
  4. On reflection the general problem that motivated this question was trying to implement lock free reads with locked writes in java, however my (solved) problem was with a collection, which may be unnecessarily confusing for future readers. So in case it is not obvious the code I posted works by allowing one writer at a time to perform edits to "some object" that is being read unprotected by multiple reader threads. Commits of the edit are done through an atomic operation so readers can only get the pre-edit or post-edit "object". When/if the reader thread gets the update, it cannot occur in the middle of a read as the read is occurring on the old copy of the "object". A simple solution that had probably been discovered and proved to be broken in some way prior to the availability of better concurrency support in java.

推荐答案

而不是尝试推出自己的解决方案,为什么不使用

Rather than trying to roll out your own solution, why not use a ConcurrentHashMap as your set and just set all the values to some standard value? (A constant like Boolean.TRUE would work well.)

我认为这种实现方式在许多读者很少的情况下效果很好.甚至还有.

I think this implementation works well with the many-readers-few-writers scenario. There's even a constructor that lets you set the expected "concurrency level".

更新: Veer建议使用

Update: Veer has suggested using the Collections.newSetFromMap utility method to turn the ConcurrentHashMap into a Set. Since the method takes a Map<E,Boolean> my guess is that it does the same thing with setting all the values to Boolean.TRUE behind-the-scenes.

更新:解决发布者的示例

Update: Addressing the poster's example

这可能是我最终要解决的问题,但是我仍然对我的极简解决方案如何失败感到好奇. – MilesHampson

That is probably what I will end up going with, but I am still curious about how my minimalist solution could fail. – MilesHampson

您的极简解决方案只要稍作调整,就可以正常工作.我担心的是,尽管它现在很小,但将来可能会变得更加复杂.很难记住在进行线程安全操作时要假设的所有条件,尤其是如果您要在数周/月/年之后返回代码以进行看似微不足道的调整时.如果ConcurrentHashMap具有足够的性能来满足您的所有需求,那么为什么不使用它呢?所有令人讨厌的并发详细信息都被封装了起来,甚至从现在开始的6个月,您将很难将其弄乱!

Your minimalist solution would work just fine with a bit of tweaking. My worry is that, although it's minimal now, it might get more complicated in the future. It's hard to remember all of the conditions you assume when making something thread-safe—especially if you're coming back to the code weeks/months/years later to make a seemingly insignificant tweak. If the ConcurrentHashMap does everything you need with sufficient performance then why not use that instead? All the nasty concurrency details are encapsulated away and even 6-months-from-now you will have a hard time messing it up!

在您当前的解决方案生效之前,您至少需要进行一次调整.正如已经指出的那样,您可能应该在global的声明中添加volatile修饰符.我不知道您是否具有C/C ++背景,但是当我得知volatile .如果您打算使用Java进行大量并发编程,那么熟悉 Java内存模型.如果您没有将global的引用作为volatile引用,则很可能没有线程会看到global的值的任何更改,直到他们尝试对其进行更新为止,这时进入synchronized块将刷新本地缓存并获取更新的参考值.

You do need at least one tweak before your current solution will work. As has already been pointed out, you should probably add the volatile modifier to global's declaration. I don't know if you have a C/C++ background, but I was very surprised when I learned that the semantics of volatile in Java are actually much more complicated than in C. If you're planning on doing a lot of concurrent programming in Java then it'd be a good idea to familiarize yourself with the basics of the Java memory model. If you don't make the reference to global a volatile reference then it's possible that no thread will ever see any changes to the value of global until they try to update it, at which point entering the synchronized block will flush the local cache and get the updated reference value.

但是,即使添加了volatile,仍然仍然存在巨大问题.这是一个有两个线程的问题场景:

However, even with the addition of volatile there's still a huge problem. Here's a problem scenario with two threads:

  1. 我们从空集或global={}开始.线程AB在其线程本地缓存的内存中均具有此值.
  2. 线程A获取获得global上的synchronized锁,并通过复制global并将新密钥添加到集合中来开始更新.
  3. 虽然线程A仍位于synchronized块内,但线程B会将其本地值global读取到堆栈上,并尝试进入synchronized块.由于线程A当前位于监视器的线程B块中.
  4. 线程A通过设置引用并退出监视器来完成更新,结果为global={1}.
  5. 线程B现在可以进入监视器并复制global={1}集.
  6. 线程A决定进行另一次更新,读取其本地global引用,然后尝试进入synchronized块. 由于线程B当前持有{}上的锁,因此{1}上没有任何锁,线程A成功进入了监视器!
  7. 线程A还会复制{1}以便进行更新.
  1. We begin with the empty set, or global={}. Threads A and B both have this value in their thread-local cached memory.
  2. Thread A obtains obtains the synchronized lock on global and starts the update by making a copy of global and adding the new key to the set.
  3. While Thread A is still inside the synchronized block, Thread B reads its local value of global onto the stack and tries to enter the synchronized block. Since Thread A is currently inside the monitor Thread B blocks.
  4. Thread A completes the update by setting the reference and exiting the monitor, resulting in global={1}.
  5. Thread B is now able to enter the monitor and makes a copy of the global={1} set.
  6. Thread A decides to make another update, reads in its local global reference and tries to enter the synchronized block. Since Thread B currently holds the lock on {} there is no lock on {1} and Thread A successfully enters the monitor!
  7. Thread A also makes a copy of {1} for purposes of updating.

现在,线程AB都在synchronized块内,并且它们具有global={1}集的相同副本. 这意味着它们的更新之一将会丢失!.这种情况是由于您正在对存储在引用中的对象进行同步而导致的,而该引用将在您的synchronized块中进行更新.您应该始终非常小心地使用要同步的对象.您可以通过添加新变量来充当锁来解决此问题:

Now Threads A and B are both inside the synchronized block and they have identical copies of the global={1} set. This means that one of their updates will be lost! This situation is caused by the fact that you're synchronizing on an object stored in a reference that you're updating inside your synchronized block. You should always be very careful which objects you use to synchronize. You can fix this problem by adding a new variable to act as the lock:

private volatile Collection global = new HashSet(); // start threading after this
private final Object globalLock = new Object(); // final reference used for synchronization

void allUpdatesGoThroughHere(Object exampleOperand) {
  // My hypothesis is that this prevents operations in the block being re-ordered
  synchronized(globalLock) {
    Collection copy = new HashSet(global);
    copy.remove(exampleOperand);
    // Given my hypothesis, we should have a fully constructed object here. So a 
    // reader will either get the old or the new Collection, but never an 
    // inconsistent one.
    global = copy;    
  }
}

此错误非常隐蔽,其他答案尚未解决. 正是这些疯狂的并发细节导致我建议使用已经调试过的java.util.concurrent库中的某些内容,而不是尝试将它们自己整合在一起.我认为上述解决方案可以工作,但是再次拧紧它有多容易?这样会容易得多:

This bug was insidious enough that none of the other answers have addressed it yet. It's these kinds of crazy concurrency details that cause me to recommend using something from the already-debugged java.util.concurrent library rather than trying to put something together yourself. I think the above solution would work—but how easy would it be to screw it up again? This would be so much easier:

private final Set<Object> global = Collections.newSetFromMap(new ConcurrentHashMap<Object,Boolean>());

由于引用为final,因此您无需担心使用陈旧引用的线程,并且由于ConcurrentHashMap内部处理了所有令人讨厌的内存模型问题,因此您不必担心的所有令人讨厌的细节.显示器和内存障碍!

Since the reference is final you don't need to worry about threads using stale references, and since the ConcurrentHashMap handles all the nasty memory model issues internally you don't have to worry about all the nasty details of monitors and memory barriers!

这篇关于在Java中,我可以依靠引用分配是原子的来实现写时复制吗?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆