Can I allocate objects contiguously in Java?

Problem description

    Assume I have a large array of relatively small objects, which I need to iterate over frequently.
    I would like to optimize my iteration by improving cache performance, so I would like to allocate the objects [and not the references] contiguously in memory. That way I'll get fewer cache misses, and the overall performance could be significantly better.

    In C++, I could just allocate an array of the objects, and it would lay them out as I wanted. In Java, when I allocate an array I only allocate the references, and the objects themselves are allocated one at a time.

    I am aware that if I allocate the objects "at once" [one after the other], the JVM is most likely to place them as contiguously as it can, but that might not be enough if the memory is fragmented.

    My questions:

    1. Is there a way to tell the JVM to defragment the memory just before I start allocating my objects? Will that be enough to ensure [as much as possible] that the objects are allocated contiguously?
    2. Is there a different solution to this issue?

    Solution

    New objects are created in the Eden space. The Eden space is never fragmented; it is always empty after a GC.

    The problem you have is that when a GC is performed, objects can be arranged randomly in memory, or even, surprisingly, in the reverse of the order in which they are referenced.

    A workaround is to store the fields as a series of arrays. I call this a column-based table instead of a row-based table.

    For example, instead of writing

    class PointCount {
        double x, y;
        int count;
    }
    
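    // An array of references to lots of small, separately allocated objects
    // (size is a placeholder for the number of elements).
    PointCount[] pc = new PointCount[size];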
    

    use column-based data types:

    class PointCounts {
        double[] xs, ys;
        int[] counts;
    }
    

    or, with growable primitive lists such as those in the Trove library:

    class PointCounts {
        TDoubleArrayList xs, ys;
        TIntArrayList counts;
    }
    

    The arrays themselves could be in up to three different places, but the data within each array is always contiguous. This can even be marginally more efficient if you perform operations on a subset of the fields.

    public int totalCount() {
       int sum = 0;
       // counts are contiguous, with nothing between the values.
       for (int count : counts) sum += count;
       return sum;
    }
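
The column-based idea above can be sketched as a small self-contained class. This is only an illustration under assumptions: the name `PointCountColumns`, its accessor methods, and the grow-by-doubling policy are hypothetical and not part of the original answer. It shows that summing one column reads a single contiguous `int[]` and never touches the coordinate data.

```java
import java.util.Arrays;

/** Column-based store: one primitive array per field, all grown together. */
final class PointCountColumns {
    private double[] xs = new double[16];
    private double[] ys = new double[16];
    private int[] counts = new int[16];
    private int size = 0;

    /** Appends one logical row and returns its index. */
    int add(double x, double y, int count) {
        if (size == xs.length) {
            // Assumed growth policy: double the capacity of every column.
            int newCapacity = size * 2;
            xs = Arrays.copyOf(xs, newCapacity);
            ys = Arrays.copyOf(ys, newCapacity);
            counts = Arrays.copyOf(counts, newCapacity);
        }
        xs[size] = x;
        ys[size] = y;
        counts[size] = count;
        return size++;
    }

    int size() { return size; }
    double x(int i) { return xs[i]; }
    double y(int i) { return ys[i]; }
    int count(int i) { return counts[i]; }

    /** Sums only the counts column; xs and ys are never read. */
    long totalCount() {
        long sum = 0;
        for (int i = 0; i < size; i++) sum += counts[i];
        return sum;
    }
}

class PointCountColumnsDemo {
    public static void main(String[] args) {
        PointCountColumns pcs = new PointCountColumns();
        for (int i = 0; i < 1000; i++) pcs.add(i, -i, 1);
        System.out.println(pcs.totalCount()); // prints 1000
    }
}
```

The trade-off is that a single logical point is now spread over three arrays, so code that needs all fields of one element touches up to three cache lines instead of one.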
    


    A solution I use to avoid GC overhead when holding large amounts of data is to use an interface to access a direct or memory-mapped ByteBuffer:

    import java.nio.ByteBuffer;
    import java.nio.ByteOrder;
    
    public class MyCounters {
        public static void main(String... args) {
            Runtime rt = Runtime.getRuntime();
            long used1 = rt.totalMemory() - rt.freeMemory();
            long start = System.nanoTime();
            int length = 100 * 1000 * 1000;
            PointCount pc = new PointCountImpl(length);
            for (int i = 0; i < length; i++) {
                pc.index(i);
                pc.setX(i);
                pc.setY(-i);
                pc.setCount(1);
            }
            for (int i = 0; i < length; i++) {
                pc.index(i);
                if (pc.getX() != i) throw new AssertionError();
                if (pc.getY() != -i) throw new AssertionError();
                if (pc.getCount() != 1) throw new AssertionError();
            }
            long time = System.nanoTime() - start;
            long used2 = rt.totalMemory() - rt.freeMemory();
            System.out.printf("Creating an array of %,d used %,d bytes of heap and took %.1f seconds to set and get%n",
                    length, (used2 - used1), time / 1e9);
        }
    }
    
    interface PointCount {
        // set the index of the element referred to.
        public void index(int index);
    
        public double getX();
    
        public void setX(double x);
    
        public double getY();
    
        public void setY(double y);
    
        public int getCount();
    
        public void setCount(int count);
    
        public void incrementCount();
    }
    
    class PointCountImpl implements PointCount {
        static final int X_OFFSET = 0;
        static final int Y_OFFSET = X_OFFSET + 8;
        static final int COUNT_OFFSET = Y_OFFSET + 8;
        static final int LENGTH = COUNT_OFFSET + 4;
    
        final ByteBuffer buffer;
        int start = 0;
    
        PointCountImpl(int count) {
            this(ByteBuffer.allocateDirect(count * LENGTH).order(ByteOrder.nativeOrder()));
        }
    
        PointCountImpl(ByteBuffer buffer) {
            this.buffer = buffer;
        }
    
        @Override
        public void index(int index) {
            start = index * LENGTH;
        }
    
        @Override
        public double getX() {
            return buffer.getDouble(start + X_OFFSET);
        }
    
        @Override
        public void setX(double x) {
            buffer.putDouble(start + X_OFFSET, x);
        }
    
        @Override
        public double getY() {
            return buffer.getDouble(start + Y_OFFSET);
        }
    
        @Override
        public void setY(double y) {
            buffer.putDouble(start + Y_OFFSET, y);
        }
    
        @Override
        public int getCount() {
            return buffer.getInt(start + COUNT_OFFSET);
        }
    
        @Override
        public void setCount(int count) {
            buffer.putInt(start + COUNT_OFFSET, count);
        }
    
        @Override
        public void incrementCount() {
            setCount(getCount() + 1);
        }
    }
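
One detail of the sizing is worth a note. Each record is 20 bytes (two doubles and an int), and `ByteBuffer.allocateDirect` takes an `int` capacity, so the 100,000,000 records above need 2,000,000,000 bytes, just under `Integer.MAX_VALUE`. Any count above 107,374,182 would make the `int` multiplication wrap around silently. A minimal sketch of a guarded allocation follows; the helper class `RecordBuffers` and its method are hypothetical names introduced for illustration, not part of the original answer.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class RecordBuffers {
    /** Bytes per record: two doubles plus one int, matching the layout above. */
    static final int RECORD_LENGTH = 8 + 8 + 4;

    /** Allocates a direct buffer for count records, failing loudly on int overflow. */
    static ByteBuffer allocate(int count) {
        if (count < 0) {
            throw new IllegalArgumentException("count must be >= 0: " + count);
        }
        // Math.multiplyExact throws ArithmeticException instead of wrapping around.
        int bytes = Math.multiplyExact(count, RECORD_LENGTH);
        return ByteBuffer.allocateDirect(bytes).order(ByteOrder.nativeOrder());
    }
}
```

For data sets larger than this limit, one option is to split the records across several buffers and select the buffer from the index.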
    

    Running this with the -XX:-UseTLAB option (to get accurate memory allocation sizes) prints

    Creating an array of 100,000,000 used 12,512 bytes of heap and took 1.8 seconds to set and get

    As it's off-heap, it has next to no GC impact.
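
The code above only exercises the direct-buffer case. Because `PointCountImpl` also has a constructor that accepts any `ByteBuffer`, a memory-mapped buffer could be passed in the same way. The following is a rough sketch of obtaining one with `FileChannel.map`, using the same 20-byte record layout; the class `MappedCounters`, its helper methods, and the temporary file are assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class MappedCounters {
    static final int X_OFFSET = 0, Y_OFFSET = 8, COUNT_OFFSET = 16, LENGTH = 20;

    /** Maps a file large enough for the given number of fixed-size records. */
    static MappedByteBuffer map(Path file, int records) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // The mapping stays valid after the channel is closed.
            MappedByteBuffer buf =
                    ch.map(FileChannel.MapMode.READ_WRITE, 0, (long) records * LENGTH);
            buf.order(ByteOrder.nativeOrder());
            return buf;
        }
    }

    /** Writes one record at the given index using absolute puts. */
    static void put(ByteBuffer buf, int index, double x, double y, int count) {
        int start = index * LENGTH;
        buf.putDouble(start + X_OFFSET, x);
        buf.putDouble(start + Y_OFFSET, y);
        buf.putInt(start + COUNT_OFFSET, count);
    }

    /** Reads the count field of the record at the given index. */
    static int count(ByteBuffer buf, int index) {
        return buf.getInt(index * LENGTH + COUNT_OFFSET);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("pointcounts", ".dat");
        file.toFile().deleteOnExit();
        MappedByteBuffer buf = map(file, 1000);
        for (int i = 0; i < 1000; i++) put(buf, i, i, -i, 1);
        System.out.println(count(buf, 999)); // prints 1
    }
}
```

A mapped buffer keeps the records contiguous and off-heap like the direct buffer, and additionally lets the operating system page the data to and from the file.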
