Memory barriers and the TLB


Question

Memory barriers guarantee that the data cache will be consistent. However, do they guarantee that the TLB will be consistent?

I am seeing a problem where the JVM (Java 7 update 1) sometimes crashes with memory errors (SIGBUS, SIGSEGV) when passing a MappedByteBuffer between threads.

e.g.

final AtomicReference<MappedByteBuffer> mbbQueue = new AtomicReference<>();

// in a background thread.
MappedByteBuffer map = raf.map(MapMode.READ_WRITE, offset, allocationSize);
Thread.yield();
while (!mbbQueue.compareAndSet(null, map));


// in the main thread. (more than 10x faster than calling map() in the same thread)
MappedByteBuffer mbb = mbbQueue.getAndSet(null);

Without the Thread.yield() I occasionally get crashes in force(), put(), and C's memcpy(), all indicating I am trying to access memory illegally. With the Thread.yield() I haven't had a problem, but that doesn't sound like a reliable solution.
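For reference, here is a self-contained sketch of the handoff pattern above (the class name MappedHandoff and the temp-file sizes are illustrative, not from the question). The AtomicReference CAS safely publishes the buffer reference under the Java memory model; as the question points out, that alone says nothing about the state of the new mapping at the hardware level.

```java
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel.MapMode;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.atomic.AtomicReference;

public class MappedHandoff {
    // Map a region in a background thread and hand it to the calling thread.
    static byte handoff(Path file) throws Exception {
        final AtomicReference<MappedByteBuffer> queue = new AtomicReference<>();
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
            Thread mapper = new Thread(() -> {
                try {
                    // The CAS publishes the reference with a happens-before
                    // edge (JMM); whether the page tables/TLB of the consumer
                    // CPU are ready is a separate, OS-level question.
                    MappedByteBuffer map = raf.getChannel().map(MapMode.READ_WRITE, 0, 4096);
                    while (!queue.compareAndSet(null, map)) ;
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
            });
            mapper.start();

            MappedByteBuffer mbb;                            // consumer side
            while ((mbb = queue.getAndSet(null)) == null) ;  // spin until published
            mbb.put(0, (byte) 7);                            // write through the mapping
            mapper.join();
            return mbb.get(0);
        }
    }

    public static void main(String[] args) throws Exception {
        Path tmp = Files.createTempFile("handoff", ".dat");
        System.out.println(handoff(tmp));   // prints 7
        Files.deleteIfExists(tmp);
    }
}
```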

Has anyone come across this problem? Are there any guarantees about TLB and memory barriers?


EDIT: The OS is CentOS 5.7; I have seen the behaviour on i7 and dual-Xeon machines.

Why do I do this? Because the average time to write a message is 35-100 ns depending on length, and a plain write() isn't as fast. If I memory-map and clean up in the current thread, it takes 50-130 microseconds; using a background thread, it takes about 3-5 microseconds for the main thread to swap buffers. Why do I need to be swapping buffers at all? Because I am writing many GB of data and a ByteBuffer cannot be 2+ GB in size.
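The 2 GB limit comes from ByteBuffer being indexed by int, so a common workaround is to map the file in successive regions, as the question describes. A minimal sketch of that chunking (the ChunkedMapWriter class and the small sizes are illustrative, not from the question):

```java
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel.MapMode;
import java.nio.file.Files;
import java.nio.file.Path;

public class ChunkedMapWriter {
    // Map a file in fixed-size chunks so the total written can exceed the
    // 2 GB limit of a single ByteBuffer (buffer positions are int-indexed).
    static long writeInChunks(Path file, long totalBytes, int chunkSize) throws Exception {
        long written = 0;
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
            for (long offset = 0; offset < totalBytes; offset += chunkSize) {
                int len = (int) Math.min(chunkSize, totalBytes - offset);
                // map() extends the file as needed for READ_WRITE mappings
                MappedByteBuffer mbb = raf.getChannel().map(MapMode.READ_WRITE, offset, len);
                for (int i = 0; i < len; i++) {
                    mbb.put(i, (byte) 1);   // int index, limited to one chunk
                }
                written += len;
                // the previous mapping is only unmapped later, when the
                // buffer is garbage-collected and finalized
            }
        }
        return written;
    }

    public static void main(String[] args) throws Exception {
        Path tmp = Files.createTempFile("chunked", ".dat");
        System.out.println(writeInChunks(tmp, 10_000, 4096));   // prints 10000
        Files.deleteIfExists(tmp);
    }
}
```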

Solution

The mapping is done via mmap64 (FileChannel.map). When the address is accessed there will be a page fault, and the kernel will read/write the page for you. The TLB doesn't need to be updated during mmap.
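This lazy behaviour can be seen from Java: no explicit write() call is ever made, yet a store through the mapping ends up in the file, because the kernel services the page fault on first touch and writes the dirty page back. A small sketch (the PageFaultDemo class and sizes are illustrative):

```java
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel.MapMode;
import java.nio.file.Files;
import java.nio.file.Path;

public class PageFaultDemo {
    // Store a byte through a mapping, then read it back via ordinary file I/O.
    static byte roundTrip(Path file, byte value) throws Exception {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
            MappedByteBuffer mbb = raf.getChannel().map(MapMode.READ_WRITE, 0, 4096);
            mbb.put(0, value);   // page fault on first touch, then a plain memory store
            mbb.force();         // flush the dirty page to the file
        }
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            return raf.readByte();   // read back through the normal read path
        }
    }

    public static void main(String[] args) throws Exception {
        Path tmp = Files.createTempFile("pf", ".dat");
        System.out.println(roundTrip(tmp, (byte) 42));   // prints 42
        Files.deleteIfExists(tmp);
    }
}
```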

The TLB (of all CPUs) is invalidated during munmap, which is handled by the finalization of the MappedByteBuffer; hence munmap is costly.

Mapping involves a lot of synchronization, so the address value will not be corrupted.

Any chance you are trying fancy stuff via Unsafe?
