Is it safe to get values from a java.util.HashMap from multiple threads (no modification)?

Question

There is a case where a map will be constructed, and once it is initialized, it will never be modified again. It will, however, be accessed (via get(key) only) from multiple threads. Is it safe to use a java.util.HashMap in this way?

(Currently, I'm happily using a java.util.concurrent.ConcurrentHashMap, and have no measured need to improve performance, but am simply curious if a simple HashMap would suffice. Hence, this question is not "Which one should I use?" nor is it a performance question. Rather, the question is "Would it be safe?")

Recommended answer

Your idiom is safe if and only if the reference to the HashMap is safely published. Safe publication has nothing to do with the internals of HashMap itself; it deals with how the constructing thread makes the reference to the map visible to other threads.

Basically, the only possible race here is between the construction of the HashMap and any reading threads that may access it before it is fully constructed. Most of the discussion is about what happens to the state of the map object, but this is irrelevant since you never modify it - so the only interesting part is how the HashMap reference is published.

For example, imagine you publish the map like this:

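import java.util.HashMap;

class SomeClass {
   public static HashMap<Object, Object> MAP;

   // The write is locked, but readers access MAP directly without the lock.
   public static synchronized void setMap(HashMap<Object, Object> m) {
     MAP = m;
   }
}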

... and at some point setMap() is called with a map, and other threads are using SomeClass.MAP to access the map, and check for null like this:

HashMap<Object, Object> map = SomeClass.MAP;  // plain, unsynchronized read
if (map != null) {
  // ... use the map
} else {
  // ... some default behavior
}

This is not safe, even though it probably appears to be. The problem is that there is no happens-before relationship between the write to SomeClass.MAP and the subsequent read on another thread (the setter is synchronized, but the read takes no lock), so the reading thread is free to see a partially constructed map. This can do pretty much anything, and in practice it does things like put the reading thread into an infinite loop.

To safely publish the map, you need to establish a happens-before relationship between the writing of the reference to the HashMap (i.e., the publication) and the subsequent readers of that reference (i.e., the consumption). Conveniently, there are only a few easy-to-remember ways to accomplish that[1]:

  1. Exchange the reference through a properly locked field (JLS 17.4.5)
  2. Use static initializer to do the initializing stores (JLS 12.4)
  3. Exchange the reference via a volatile field (JLS 17.4.5), or as the consequence of this rule, via the AtomicX classes
  4. Initialize the value into a final field (JLS 17.5).
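
As a rough illustration of option (1) and of the AtomicX variant of option (3), here is a minimal sketch. The class names LockedHolder, AtomicHolder and PublicationOptionsDemo are hypothetical and not part of the original answer; the point is only that both the write and the read of the reference go through the same synchronization mechanism.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Option (1): every read and write of the reference takes the same lock.
class LockedHolder {
    private static Map<String, Integer> map; // guarded by LockedHolder.class

    static synchronized void setMap(Map<String, Integer> m) {
        map = m;
    }

    static synchronized Map<String, Integer> getMap() {
        return map;
    }
}

// Variant of option (3): AtomicReference.get/set have volatile read/write semantics.
class AtomicHolder {
    private static final AtomicReference<Map<String, Integer>> REF = new AtomicReference<>();

    static void setMap(Map<String, Integer> m) {
        REF.set(m);
    }

    static Map<String, Integer> getMap() {
        return REF.get();
    }
}

class PublicationOptionsDemo {
    public static void main(String[] args) {
        Map<String, Integer> m = new HashMap<>();
        m.put("answer", 42);     // fully populate the map before publishing it
        LockedHolder.setMap(m);  // publish; the map must not be modified afterwards
        AtomicHolder.setMap(m);
        System.out.println(LockedHolder.getMap().get("answer"));
        System.out.println(AtomicHolder.getMap().get("answer"));
    }
}
```

Note that in LockedHolder the getter is synchronized as well; locking only the setter would leave the reader without a happens-before edge.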

The ones most interesting for your scenario are (2), (3) and (4). In particular, (3) applies directly to the code I have above: if you transform the declaration of MAP to:

public static volatile HashMap<Object, Object> MAP;

then everything is kosher: readers who see a non-null value necessarily have a happens-before relationship with the store to MAP and hence see all the stores associated with the map initialization.
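
Put together, the volatile variant might look like the following sketch. SomeClass and MAP come from the example above; the VolatilePublicationDemo class and its publishAndRead() helper are hypothetical, added only to exercise a writer and a reader thread. The synchronized modifier on the setter is dropped here on the assumption that nothing else relies on that lock, since the volatile write already provides the ordering.

```java
import java.util.HashMap;

class SomeClass {
    // volatile: the write in setMap() happens-before any read that observes it
    public static volatile HashMap<Object, Object> MAP;

    public static void setMap(HashMap<Object, Object> m) {
        MAP = m;
    }
}

class VolatilePublicationDemo {
    // Starts a reader that waits for the map, publishes the map, and returns what the reader saw.
    static Object publishAndRead() {
        Object[] seen = new Object[1];
        Thread reader = new Thread(() -> {
            HashMap<Object, Object> map;
            while ((map = SomeClass.MAP) == null) { // volatile read
                Thread.yield();
            }
            seen[0] = map.get("key");
        });
        reader.start();

        HashMap<Object, Object> m = new HashMap<>();
        m.put("key", "value");  // populate first
        SomeClass.setMap(m);    // publish last; m is never modified after this

        try {
            reader.join();      // join() makes the reader's write to seen[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(publishAndRead());
    }
}
```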

The other methods change the semantics of your method, since both (2) (using the static initializer) and (4) (using final) imply that you cannot set MAP dynamically at runtime. If you don't need to do that, then just declare MAP as a static final HashMap<> and you are guaranteed safe publication.
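
A minimal sketch of the final-field approach, for both a static and an instance field. The class names Lookup and Registry and the map contents are made up for illustration. For the instance case, the final-field guarantee assumes that `this` does not escape from the constructor.

```java
import java.util.HashMap;
import java.util.Map;

class Lookup {
    // Options (2) and (4): built in the static initializer, stored in a static final field.
    static final Map<String, Integer> MAP;

    static {
        Map<String, Integer> m = new HashMap<>();
        m.put("one", 1);
        m.put("two", 2);
        MAP = m; // assigned exactly once, during class initialization
    }
}

// Option (4) for instance state: a final field assigned in the constructor.
class Registry {
    private final Map<String, Integer> entries;

    Registry(Map<String, Integer> source) {
        // Copy so that later changes to 'source' cannot affect this instance.
        this.entries = new HashMap<>(source);
    }

    Integer get(String key) {
        return entries.get(key); // read-only access; 'entries' is never modified
    }
}
```

Any thread can then call Lookup.MAP.get("one") or registry.get(key) without further synchronization, provided nothing modifies the maps after construction.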

In practice, the rules are simple for safe access to "never-modified objects":

If you are publishing an object which is not inherently immutable (as in all fields declared final) and:

  • You already can create the object that will be assigned at the moment of declaration[a]: just use a final field (including static final for static members).
  • You want to assign the object later, after the reference is already visible: use a volatile field[b].

That's it!

In practice, it is very efficient. The use of a static final field, for example, allows the JVM to assume the value is unchanged for the life of the program and optimize it heavily. The use of a final member field allows most architectures to read the field in a way equivalent to a normal field read and doesn't inhibit further optimizations[c].

Finally, the use of volatile does have some impact: no hardware barrier is needed on many architectures (such as x86, specifically those that don't allow reads to pass reads), but some optimization and reordering may not occur at compile time - but this effect is generally small. In exchange, you actually get more than what you asked for - not only can you safely publish one HashMap, you can store as many more not-modified HashMaps as you want to the same reference and be assured that all readers will see a safely published map.
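
The "store as many more not-modified HashMaps as you want" point can be sketched as a replace-the-whole-snapshot pattern. The Snapshots class and its replace() and lookup() methods are hypothetical names, not from the original answer; each map is fully built before the volatile write and is never touched afterwards.

```java
import java.util.HashMap;
import java.util.Map;

class Snapshots {
    private static volatile Map<String, Integer> current = new HashMap<>();

    // Build a fresh map privately, then publish it with a single volatile write.
    static void replace(Map<String, Integer> newContents) {
        Map<String, Integer> next = new HashMap<>(newContents); // private copy
        current = next; // from here on, 'next' is never modified
    }

    static Integer lookup(String key) {
        Map<String, Integer> snapshot = current; // one volatile read per lookup
        return snapshot.get(key);
    }
}
```

Readers always see either the old snapshot or the new one in a fully constructed state. What this sketch does not give you is atomic read-modify-write of the contents; updating entries in place would still need ConcurrentHashMap or locking.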

For more gory details, refer to Shipilev or this FAQ by Manson and Goetz.

[1] Directly quoting from Shipilev.

[a] That sounds complicated, but what I mean is that you can assign the reference at construction time - either at the declaration point or in the constructor (member fields) or static initializer (static fields).

[b] Optionally, you can use a synchronized method to get/set, or an AtomicReference or something, but we're talking about the minimum work you can do.

[c] Some architectures with very weak memory models (I'm looking at you, Alpha) may require some type of read barrier before a final read - but these are very rare today.
