Is it safe to get values from a java.util.HashMap from multiple threads (no modification)?


Question

There is a case where a map will be constructed, and once it is initialized, it will never be modified again. It will, however, be accessed (via get(key) only) from multiple threads. Is it safe to use a java.util.HashMap in this way?

(Currently, I'm happily using a java.util.concurrent.ConcurrentHashMap, and have no measured need to improve performance, but am simply curious if a simple HashMap would suffice. Hence, this question is not "Which one should I use?" nor is it a performance question. Rather, the question is "Would it be safe?")

Recommended answer

Your idiom is safe if and only if the reference to the HashMap is safely published. Safe publication has nothing to do with the internals of HashMap itself; it concerns how the constructing thread makes the reference to the map visible to other threads.

Basically, the only possible race here is between the construction of the HashMap and any reading threads that may access it before it is fully constructed. Most of the discussion is about what happens to the state of the map object, but this is irrelevant since you never modify it - so the only interesting part is how the HashMap reference is published.

For example, imagine you publish the map like this:

import java.util.HashMap;

class SomeClass {
   // Plain static field: neither final nor volatile.
   public static HashMap<Object, Object> MAP;

   public static synchronized void setMap(HashMap<Object, Object> m) {
     MAP = m;
   }
}

... and at some point setMap() is called with a map, and other threads are using SomeClass.MAP to access the map, and check for null like this:

HashMap<Object, Object> map = SomeClass.MAP;  // plain, unsynchronized read of the shared reference
if (map != null) {
  // ... use the map
} else {
  // ... some default behavior
}

This is not safe, even though it probably appears to be. The problem is that there is no happens-before relationship between the write to SomeClass.MAP and the subsequent read on another thread: setMap() is synchronized, but the readers access the field without acquiring the same lock. The reading thread is therefore free to see a partially constructed map. The resulting behavior is essentially unpredictable, and in practice it can do things like put the reading thread into an infinite loop.

To safely publish the map, you need to establish a happens-before relationship between the writing of the reference to the HashMap (i.e., the publication) and the subsequent readers of that reference (i.e., the consumption). Conveniently, there are only a few easy-to-remember ways to accomplish that[1]:

  1. Exchange the reference through a properly locked field (JLS 17.4.5)
  2. Use a static initializer to do the initializing stores (JLS 12.4)
  3. Exchange the reference via a volatile field (JLS 17.4.5), or as a consequence of this rule, via the AtomicX classes
  4. Initialize the value into a final field (JLS 17.5).
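The four options above can be sketched side by side. This is only an illustration under stated assumptions: the class name PublicationIdioms, the helper build(), and all field and method names are hypothetical and do not come from the original answer.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical holder class; every name here is illustrative.
class PublicationIdioms {
    // Builds the map; it is never modified after it is returned.
    static Map<String, Integer> build() {
        Map<String, Integer> m = new HashMap<>();
        m.put("one", 1);
        m.put("two", 2);
        return m;
    }

    // (1) Properly locked field: the write and the read hold the same lock.
    private static Map<String, Integer> locked;
    static synchronized void setLocked(Map<String, Integer> m) { locked = m; }
    static synchronized Map<String, Integer> getLocked() { return locked; }

    // (2) Static initializer: the stores happen during class initialization.
    static final Map<String, Integer> FROM_STATIC_INIT;
    static { FROM_STATIC_INIT = build(); }

    // (3) Volatile field, or an AtomicReference, which has the same visibility effect.
    static volatile Map<String, Integer> viaVolatile;
    static final AtomicReference<Map<String, Integer>> VIA_ATOMIC = new AtomicReference<>();

    // (4) Final instance field assigned in the constructor.
    private final Map<String, Integer> viaFinal;
    PublicationIdioms() { this.viaFinal = build(); }
    Map<String, Integer> getViaFinal() { return viaFinal; }
}
```

Note that in variant (1) the getter is synchronized as well; locking only the setter, as in the unsafe example, does not create the required happens-before edge.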

The ones most interesting for your scenario are (2), (3) and (4). In particular, (3) applies directly to the code I have above: if you transform the declaration of MAP to:

public static volatile HashMap<Object, Object> MAP;

then everything is kosher: readers who see a non-null value necessarily have a happens-before relationship with the store to MAP and hence see all the stores associated with the map initialization.
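As a rough end-to-end sketch of the volatile variant, assuming a hypothetical class VolatilePublication with illustrative publish() and lookup() helpers (none of these names are from the original answer):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical example class; names are illustrative only.
class VolatilePublication {
    // volatile creates the happens-before edge between the store and later reads.
    static volatile Map<String, String> MAP;

    static void publish() {
        Map<String, String> m = new HashMap<>();
        m.put("greeting", "hello"); // every put completes before the volatile store
        MAP = m;                    // publication; m is never modified afterwards
    }

    static String lookup(String key, String fallback) {
        Map<String, String> map = MAP; // read the volatile field once into a local
        if (map == null) {
            return fallback;           // not published yet
        }
        String value = map.get(key);
        return value != null ? value : fallback;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread reader = new Thread(() -> {
            // Wait until the reference becomes visible, then read from the map.
            while (MAP == null) {
                Thread.yield();
            }
            System.out.println(lookup("greeting", "default"));
        });
        reader.start();
        publish();
        reader.join();
    }
}
```

Copying the field into a local variable before the null check keeps the check and the subsequent get() on the same map instance, even if the field is reassigned in between.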

The other methods change the semantics of your method, since both (2) (using the static initializer) and (4) (using final) imply that you cannot set MAP dynamically at runtime. If you don't need to do that, then just declare MAP as a static final HashMap<> and you are guaranteed safe publication.
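A minimal sketch of the static final variant, assuming the map contents are known at class-initialization time. The class StaticFinalPublication, the helper buildMap(), and the sample keys are hypothetical:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical example class; names and contents are illustrative only.
class StaticFinalPublication {
    // Assigned once during class initialization and never reassigned.
    static final Map<String, Integer> MAP = buildMap();

    private static Map<String, Integer> buildMap() {
        Map<String, Integer> m = new HashMap<>();
        m.put("red", 1);
        m.put("green", 2);
        m.put("blue", 3);
        return m; // no further modification after this point
    }

    // Read-only access; safe to call from any thread.
    static int codeFor(String name) {
        return MAP.getOrDefault(name, -1);
    }
}
```

Any thread that uses StaticFinalPublication.MAP first goes through class initialization, so no extra synchronization is needed at the call sites.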

In practice, the rules are simple for safe access to "never-modified objects":

If you are publishing an object which is not inherently immutable (as in all fields declared final) and:

  • You can already create the object that will be assigned at the moment of declaration[a]: just use a final field (including static final for static members).
  • You want to assign the object later, after the reference is already visible: use a volatile field[b].

That's it!

In practice, it is very efficient. The use of a static final field, for example, allows the JVM to assume the value is unchanged for the life of the program and optimize it heavily. The use of a final member field allows most architectures to read the field in a way equivalent to a normal field read and doesn't inhibit further optimizations[c].

Finally, the use of volatile does have some impact: no hardware barrier is needed on many architectures (such as x86, specifically those that don't allow reads to pass reads), but some optimization and reordering may not occur at compile time. This effect is generally small. In exchange, you actually get more than you asked for: not only can you safely publish one HashMap, you can store as many more never-modified HashMaps as you want to the same reference and be assured that all readers will see a safely published map.
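The point about storing further maps to the same reference can be sketched as a simple snapshot swap. This is an assumption-laden illustration: the class ReloadableConfig and its reload() and get() methods are hypothetical names, not part of the original answer.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical example class; names are illustrative only.
class ReloadableConfig {
    // Each map stored here is fully built first and never modified afterwards.
    private static volatile Map<String, String> current = new HashMap<>();

    // Build a private copy, then publish it with a single volatile store.
    static void reload(Map<String, String> source) {
        Map<String, String> fresh = new HashMap<>(source);
        current = fresh;
    }

    // Readers see either the old snapshot or the new one, never a half-built map.
    static String get(String key) {
        return current.get(key);
    }
}
```

The defensive copy in reload() is a design choice of this sketch: it ensures that later changes to the caller's map cannot touch the published snapshot, which preserves the "never modified after publication" condition the whole idiom depends on.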

For more gory details, refer to Shipilev or this FAQ by Manson and Goetz.

[1] Directly quoting from Shipilev.

[a] That sounds complicated, but what I mean is that you can assign the reference at construction time: either at the declaration point, in the constructor (member fields), or in a static initializer (static fields).

[b] Optionally, you can use a synchronized method to get/set, or an AtomicReference or something similar, but we're talking about the minimum work you can do.

[c] Some architectures with very weak memory models (I'm looking at you, Alpha) may require some type of read barrier before a final read, but these are very rare today.
