Can multiple threads see writes on a direct mapped ByteBuffer in Java?

Question

I'm working on something that uses ByteBuffers built from memory-mapped files (via FileChannel.map()) as well as in-memory direct ByteBuffers. I am trying to understand the concurrency and memory model constraints.

I have read all of the relevant Javadoc (and source) for things like FileChannel, ByteBuffer, MappedByteBuffer, etc. It seems clear that a particular ByteBuffer (and relevant subclasses) has a bunch of fields and the state is not protected from a memory model point of view. So, you must synchronize when modifying state of a particular ByteBuffer if that buffer is used across threads. Common tricks include using a ThreadLocal to wrap the ByteBuffer, duplicate (while synchronized) to get a new instance pointing to the same mapped bytes, etc.

Given this scenario:

  1. manager has a mapped byte buffer B_all for the entire file (say it's <2gb)
  2. manager calls duplicate(), position(), limit(), and slice() on B_all to create a new smaller ByteBuffer B_1 that is a chunk of the file and gives this to thread T1
  3. manager does all the same stuff to create a ByteBuffer B_2 pointing to the same mapped bytes and gives this to thread T2
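The three steps above can be sketched roughly as follows. This is only an illustration of the scenario, not code from the question: the class name `ChunkManager` and its methods `onTempFile`, `chunk`, and `whole` are hypothetical, and the temporary file stands in for the real mapped file.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical "manager" from the scenario; all names here are assumptions.
final class ChunkManager {
    private final MappedByteBuffer all; // B_all: one mapping for the whole file

    ChunkManager(Path file, int size) {
        MappedByteBuffer mapped;
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // The mapping remains valid after the channel is closed.
            mapped = ch.map(FileChannel.MapMode.READ_WRITE, 0, size);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        this.all = mapped;
    }

    // Convenience factory that maps a fresh temporary file of the given size.
    static ChunkManager onTempFile(int size) {
        try {
            Path file = Files.createTempFile("chunks", ".bin");
            file.toFile().deleteOnExit();
            return new ChunkManager(file, size);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Synchronized because duplicate() reads B_all's position/limit/mark,
    // which are plain, unprotected fields.
    synchronized ByteBuffer chunk(int offset, int length) {
        ByteBuffer dup = all.duplicate(); // independent position/limit, same bytes
        dup.limit(offset + length);       // set limit first so position stays valid
        dup.position(offset);
        return dup.slice();               // position 0, capacity == length
    }

    // A fresh view of the whole mapping, e.g. for a reader thread such as T3.
    synchronized ByteBuffer whole() {
        return all.duplicate();
    }
}
```

Each thread then gets its own buffer object (`chunk(0, n)` for T1, `chunk(n, n)` for T2), so no thread mutates another thread's position or limit; only the underlying bytes are shared.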

My question is: Can T1 write to B_1 and T2 write to B_2 concurrently and be guaranteed to see each other's changes? Could T3 use B_all to read those bytes and be guaranteed to see the changes from both T1 and T2?

I am aware that writes in a mapped file are not necessarily seen across processes unless you use force() to instruct the OS to write the pages down to disk. I don't care about that. Assume for this question that this JVM is the only process writing a single mapped file.

Note: I am not looking for guesses (I can make those quite well myself). I would like references to something definitive about what is (or is not) guaranteed for memory-mapped direct buffers. Or if you have actual experiences or negative test cases, that could also serve as sufficient evidence.

Update: I have done some tests with having multiple threads write to the same file in parallel and so far it seems those writes are immediately visible from other threads. I'm not sure if I can rely on that though.
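A test of this kind might look like the sketch below. It is not the asker's actual test; the class `ParallelWriteCheck` and its `run` method are hypothetical. Note that the reader only inspects the bytes after `Thread.join()`, and `join()` itself establishes a happens-before edge, so a clean result here says nothing about visibility without any synchronization.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical test harness; names are assumptions, not from the question.
final class ParallelWriteCheck {
    /** Returns how many bytes, read through B_all, do not hold the expected writer id. */
    static int run(int writers, int chunkSize) {
        MappedByteBuffer all;
        try {
            Path file = Files.createTempFile("mapped", ".bin");
            file.toFile().deleteOnExit();
            try (FileChannel ch = FileChannel.open(file,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                all = ch.map(FileChannel.MapMode.READ_WRITE, 0, (long) writers * chunkSize);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }

        Thread[] threads = new Thread[writers];
        for (int i = 0; i < writers; i++) {
            // Slices are created on the manager thread before the workers start.
            ByteBuffer dup = all.duplicate();
            dup.limit((i + 1) * chunkSize);
            dup.position(i * chunkSize);
            ByteBuffer chunk = dup.slice();
            byte id = (byte) (i + 1);
            threads[i] = new Thread(() -> {
                for (int j = 0; j < chunk.capacity(); j++) {
                    chunk.put(j, id); // absolute put: no shared position is touched
                }
            });
            threads[i].start();
        }

        try {
            for (Thread t : threads) {
                t.join(); // join() gives the reader a happens-before edge
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }

        int mismatches = 0;
        for (int i = 0; i < writers; i++) {
            for (int j = 0; j < chunkSize; j++) {
                if (all.get(i * chunkSize + j) != (byte) (i + 1)) {
                    mismatches++;
                }
            }
        }
        return mismatches;
    }
}
```

Calling `ParallelWriteCheck.run(4, 1024)` should report zero mismatches; a stricter experiment would have to read concurrently with the writers, where "it worked on my machine" is weaker evidence.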

Recommended answer

Memory mapping with the JVM is just a thin wrapper around CreateFileMapping (Windows) or mmap (POSIX). As such, you have direct access to the buffer cache of the OS. This means that these buffers are what the OS considers the file to contain (and the OS will eventually sync the file to reflect this).

So there is no need to call force() to sync between processes. The processes are already in sync via the OS: even ordinary read/write calls access the same pages. force() only syncs between the OS and the drive controller (there can be some delay between the drive controller and the physical platters, but you have no hardware support to do anything about that).
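For reference, this is roughly what a force() call looks like in Java. The sketch is conservative: it flushes with `MappedByteBuffer.force()` before reading the file back through a separate, non-mapped path. The class `ForceDemo` and its method are hypothetical names used only for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class; not part of the original answer.
final class ForceDemo {
    /** Writes payload through a mapping, flushes it, and reads the file back normally. */
    static byte[] writeAndReadBack(byte[] payload) {
        try {
            Path file = Files.createTempFile("force", ".bin");
            file.toFile().deleteOnExit();
            try (FileChannel ch = FileChannel.open(file,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_WRITE, 0, payload.length);
                map.put(payload); // lands in the OS page cache
                map.force();      // asks the OS to write the dirty pages to the device
            }
            // Ordinary file read, independent of the mapping.
            return Files.readAllBytes(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

On the answer's reasoning, the read-back would see the same bytes even without the force() call, since both paths go through the same OS pages; the call is kept here because the Java API documentation leaves that propagation unspecified.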

Regardless, memory-mapped files are an accepted form of shared memory between threads and/or processes. The only difference between this shared memory and, say, a named block of virtual memory in Windows is the eventual synchronization to disk (in fact, mmap provides file-less virtual memory by mapping /dev/zero).

Reading and writing the same memory from multiple processes or threads still needs some synchronization, because processors can execute out of order (I'm not sure how much this interacts with the JVM, but you can't make assumptions). Writing a byte from one thread, however, has the same guarantees as a normal write to any byte on the heap. Once you have written to it, every thread and every process will see the update (even through an open/read operation).
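One way to supply that synchronization is to publish the buffer write through a volatile variable, exactly as you would for a plain heap field. The sketch below uses an in-memory direct buffer as a stand-in for B_all, and the class `PublishedWrite` is a hypothetical name. It assumes that off-heap buffer accesses are ordered by the same fences as heap accesses, which matches the answer's "same guarantees as the heap" claim but is not spelled out for buffers in the language specification.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class; the flag-based handshake is an assumption, not the answer's code.
final class PublishedWrite {
    static long run() {
        ByteBuffer all = ByteBuffer.allocateDirect(16); // stands in for B_all
        ByteBuffer b1 = all.duplicate();                // writer's own view of the same bytes
        AtomicBoolean ready = new AtomicBoolean(false);

        Thread writer = new Thread(() -> {
            b1.putLong(0, 0xCAFEBABEL); // plain write to the shared bytes
            ready.set(true);            // volatile write publishes everything before it
        });
        writer.start();

        while (!ready.get()) {          // volatile read pairs with set(true)
            Thread.yield();
        }
        return all.getLong(0);          // ordered after the writer's putLong
    }
}
```

Without the flag (or a lock, latch, or `join()`), the reader has no ordering guarantee relative to the writer, even though in practice the bytes usually show up quickly.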

For more information, look up mmap in POSIX (or CreateFileMapping for Windows, which was built in almost the same way).
