Understanding how to memcpy with TheUnsafe


Question

I read stuff about TheUnsafe, but I get confused by the fact that, unlike in C/C++, we have to work out the offset of stuff, and there's also the 32-bit VM vs. the 64-bit VM, which may or may not have different pointer sizes depending on a particular VM setting being turned on or off (also, I'm assuming all offsets to data are actually based on pointer arithmetic, so this would influence them too).

Unfortunately, it seems all the stuff ever written about how to use TheUnsafe stems from one article only (the one that happened to be the first), and all the others copy-pasted from it to a certain degree. Not many of them exist, and some are not clear because the author apparently did not speak English.

My question is:

How can I find the offset of a field + the pointer to the instance that owns that field (or field of a field, or field of a field of a field...) using TheUnsafe?

How can I use it to perform a memcpy to another pointer + offset memory address?

Consider that the data may be several GB in size, and that the heap offers no direct control over data alignment, so the data may well be fragmented because:

1) I don't think there's anything stopping the VM from allocating field1 at offset + 10 and field2 at offset + sizeof(field1) + 32, is there?

2) I would also assume the GC would move big chunks of data around, leading to a field with 1GB in size being fragmented sometimes.

So is the memcpy operation as I described even possible?

If data is fragmented because of GC, of course the heap has a pointer to where the next chunk of data is, but using the simple process described above doesn't seem to cover that.

So must the data be off-heap for this to (maybe) work? If so, how do I allocate off-heap data using TheUnsafe, make such data work as a field of an instance, and of course free the allocated memory once done with it?

I encourage anyone who didn't quite understand the question to ask for any specifics they need to know.

I also urge people to refrain from answering if their whole idea is "put all objects you need to copy in an array and use System.arraycopy". I know it's common practice in this wonderful forum to, instead of answering what's been asked, offer a complete alternate solution that, in principle, has nothing to do with the original question apart from the fact that it gets the same job done.

Best regards.

Answer

First a big warning: "Unsafe must die" http://blog.takipi.com/still-unsafe-the-major-bug-in-java-6-that-turned-into-a-java-9-feature/

Some prerequisites

static class DataHolder {
    int i1;
    int i2;
    int i3;
    DataHolder d1;
    DataHolder d2;
    public DataHolder(int i1, int i2, int i3, DataHolder dh) {
        this.i1 = i1;
        this.i2 = i2;
        this.i3 = i3;
        this.d1 = dh;
        this.d2 = this;
    }
}

Field theUnsafe = Unsafe.class.getDeclaredField("theUnsafe");
theUnsafe.setAccessible(true);
Unsafe unsafe = (Unsafe) theUnsafe.get(null);

DataHolder dh1 = new DataHolder(11, 13, 17, null);
DataHolder dh2 = new DataHolder(23, 29, 31, dh1);

The basics

To get the offset of a field (i1), you can use the following code:

Field fi1 = DataHolder.class.getDeclaredField("i1");
long oi1 = unsafe.objectFieldOffset(fi1);

and to access the field value of instance dh1 you can write

System.out.println(unsafe.getInt(dh1, oi1)); // will print 11

You can use similar code to access an object reference (d1):

Field fd1 = DataHolder.class.getDeclaredField("d1");
long od1 = unsafe.objectFieldOffset(fd1);

and you can use it to get the reference to dh1 from dh2:

System.out.println(dh1 == unsafe.getObject(dh2, od1)); // will print true
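The two steps above also cover the "field of a field" case from the question: follow the reference with getObject, then read the primitive from the object you got back. Here is a minimal self-contained sketch, assuming a HotSpot-style JVM where sun.misc.Unsafe is still reachable through reflection (newer JDKs may print deprecation warnings). The class UnsafeFieldDemo, its Holder type, and the helper names are made up for illustration.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical demo class; none of these names come from a library.
final class UnsafeFieldDemo {
    static final class Holder {
        int i1;
        Holder d1;
        Holder(int i1, Holder d1) { this.i1 = i1; this.d1 = d1; }
    }

    // Obtains the Unsafe singleton via reflection, as in the answer.
    static Unsafe unsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Unsafe is not accessible", e);
        }
    }

    // Looks up the offset of a declared instance field by name.
    static long offsetOf(Class<?> type, String fieldName) {
        try {
            return unsafe().objectFieldOffset(type.getDeclaredField(fieldName));
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Reads outer.d1.i1 using (object, offset) pairs only.
    // The reference stays a real object reference, so the GC can track it.
    static int readNestedInt(Holder outer) {
        Unsafe u = unsafe();
        Object inner = u.getObject(outer, offsetOf(Holder.class, "d1"));
        return u.getInt(inner, offsetOf(Holder.class, "i1"));
    }

    public static void main(String[] args) {
        Holder h1 = new Holder(11, null);
        Holder h2 = new Holder(23, h1);
        System.out.println(readNestedInt(h2)); // prints 11
    }
}
```

Because every access goes through an object reference plus an offset, nothing here depends on where the object currently lives in memory.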

Field ordering and alignment

To get the offsets of all declared fields of an object:

for (Field f: DataHolder.class.getDeclaredFields()) {
    if (!Modifier.isStatic(f.getModifiers())) {
        System.out.println(f.getName()+" "+unsafe.objectFieldOffset(f));
    }
}

In my test it seems that the JVM reorders fields as it sees fit (i.e. adding a field can yield completely different offsets on the next run).

An object's address in native memory

It's important to understand that the following code is going to crash your JVM sooner or later, because the Garbage Collector will move your objects at random times, without you having any control over when and why it happens.

Also it's important to understand that the following code depends on the JVM type (32 bits versus 64 bits) and on some start parameters for the JVM (namely, usage of compressed oops on 64 bit JVMs).

On a 32 bit VM a reference to an object has the same size as an int. So what do you get if you call int addr = unsafe.getInt(dh2, od1); instead of unsafe.getObject(dh2, od1)? Could it be the native address of the object?

Let's try:

System.out.println(unsafe.getInt(null, unsafe.getInt(dh2, od1)+oi1));

will print out 11 as expected.

On a 64 bit VM without compressed oops (-XX:-UseCompressedOops), you will need to write

System.out.println(unsafe.getInt(null, unsafe.getLong(dh2, od1)+oi1));

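On a 64 bit VM with compressed oops (-XX:+UseCompressedOops), things are a bit more complicated. This variant has 32 bit object references that are turned into 64 bit addresses by multiplying them by 8L. This assumes the JVM uses a zero heap base and a shift of 3 bits; other compressed oops modes (a non-zero base, or no shift) decode differently:

System.out.println(unsafe.getInt(null, 8L*(0xffffffffL&unsafe.getInt(dh2, od1))+oi1));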
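The arithmetic behind that expression can be looked at on its own, without touching any memory. Below is a rough sketch under the assumption that a compressed reference decodes as base + (unsigned value << shift). The class NarrowOopDecode and its parameters are hypothetical, and the real base and shift depend on how the JVM was started.

```java
// Hypothetical helper that only does the arithmetic; it never dereferences anything.
final class NarrowOopDecode {
    // heapBase and shift are passed in because both depend on the JVM mode (assumption).
    static long decode(int narrowRef, long heapBase, int shift) {
        // Mask first: a plain int-to-long widening would sign-extend
        // references whose top bit is set and yield a negative "address".
        long unsigned = narrowRef & 0xffffffffL;
        return heapBase + (unsigned << shift);
    }

    public static void main(String[] args) {
        // Zero base and a shift of 3 correspond to the multiplication by 8L above.
        System.out.println(decode(0x10, 0L, 3));                              // 128
        System.out.println(Long.toHexString(decode(0x80000000, 0L, 3)));      // 400000000
    }
}
```

The second call shows why the 0xffffffffL mask matters: without it the value would be sign-extended before the shift.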

What is the problem with these accesses

The problem is the interaction between the Garbage Collector and this code. The Garbage Collector can move objects around as it pleases. Since the JVM knows about its object references (the local variables dh1 and dh2, the fields d1 and d2 of these objects), it can adjust these references accordingly, and your code will never notice.

By extracting object references into int/long variables you turn these object references into primitive values that happen to have the same bit pattern as an object reference. The Garbage Collector does not know that these were object references (they could just as well have come from a random generator) and therefore does not adjust these values while moving objects around. So as soon as a Garbage Collection cycle is triggered, your extracted addresses are no longer valid. Accessing memory at these addresses might crash your JVM immediately (the good case), or you might corrupt memory without noticing on the spot (the bad case).
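For the memcpy part of the question, one way to stay clear of this problem is to never extract a heap address at all. Pass the heap object itself as the base and let off-heap memory be the only place where raw addresses are used. The following is a sketch rather than a definitive recipe: it assumes sun.misc.Unsafe is reachable via reflection, and OffHeapCopyDemo and roundTrip are hypothetical names. It uses allocateMemory, the five-argument copyMemory, arrayBaseOffset and freeMemory.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical demo: copy a heap array to off-heap memory and back.
final class OffHeapCopyDemo {
    static Unsafe unsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Unsafe is not accessible", e);
        }
    }

    static long[] roundTrip(long[] src) {
        Unsafe u = unsafe();
        long bytes = (long) src.length * Long.BYTES;
        long arrayBase = u.arrayBaseOffset(long[].class); // offset of element 0 inside the array object
        long addr = u.allocateMemory(bytes);              // raw off-heap address, never moved by the GC
        try {
            // heap -> off-heap: base object + offset on the heap side, null base + address off-heap
            u.copyMemory(src, arrayBase, null, addr, bytes);
            long[] out = new long[src.length];
            // off-heap -> heap
            u.copyMemory(null, addr, out, arrayBase, bytes);
            return out;
        } finally {
            u.freeMemory(addr); // off-heap memory is not garbage collected
        }
    }

    public static void main(String[] args) {
        long[] copy = roundTrip(new long[] {11L, 13L, 17L});
        System.out.println(java.util.Arrays.toString(copy)); // [11, 13, 17]
    }
}
```

The try/finally matters here because memory obtained from allocateMemory is invisible to the Garbage Collector; if freeMemory is skipped, the block leaks until the process exits.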
