Fastest way to read/store lots of multidimensional data? (Java)


Problem description

I have three questions about three nested loops:

for (int x=0; x<400; x++)
{
    for (int y=0; y<300; y++)
    {
        for (int z=0; z<400; z++)
        {
             // compute and store value
        }
    }
}

And I need to store all computed values. My standard approach would be to use a 3D-array:

values[x][y][z] = 1; // test value

but this turns out to be slow: it takes 192 ms to complete this loop, whereas a single int assignment

int value = 1; // test value

takes only 66 ms.

1) Why is an array so relatively slow?
2) And why does it get even slower when I put this in the inner loop:

values[z][y][x] = 1; // (notice x and z switched)

This takes more than 4 seconds!

3) Most importantly: Can I use a data structure that is as quick as the assignment of a single integer, but can store as much data as the 3D-array?

Solution

1) Why is an array so relatively slow?

As others have pointed out, you are comparing apples to oranges. The triple array is slow because each access has to dereference three times (internally, at least; yes, "there are no pointers in Java"); but then again, you cannot reference a single integer variable...
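
The three dereferences can be made visible by writing them out by hand. The following is only a sketch: the class name `RowHoistingSketch` and the method `fill` are hypothetical, and whether this is any faster than plain `values[x][y][z]` depends on what the JIT already hoists out of the loops, which is not measured here.

```java
// Illustrative sketch; RowHoistingSketch and fill are hypothetical names.
final class RowHoistingSketch {

    // Fills the grid with x + y + z, looking up each sub-array only once.
    static void fill(int[][][] values) {
        for (int x = 0; x < values.length; x++) {
            int[][] plane = values[x];      // first dereference, once per x
            for (int y = 0; y < plane.length; y++) {
                int[] row = plane[y];       // second dereference, once per (x, y)
                for (int z = 0; z < row.length; z++) {
                    row[z] = x + y + z;     // third dereference: a plain 1D store
                }
            }
        }
    }

    public static void main(String[] args) {
        int[][][] values = new int[4][3][4];
        fill(values);
        System.out.println(values[3][2][1]); // prints 6
    }
}
```

An `int[][][]` in Java is an array of references to arrays of references to `int[]` rows, which is why every `values[x][y][z]` conceptually walks three levels.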

2) And why does it get even slower when I put this in the inner loop:

values[z][y][x] = 1; // (notice x and z switched)

Because you have reduced cache locality. The fastest-changing index should be the last one, so that most memory accesses land next to each other, within the same cache line, instead of forcing the processor to wait while each line is fetched from main memory.
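
How far apart successive accesses land can be estimated with a little arithmetic. The sketch below assumes a flat layout in which the last index varies fastest; a real `int[][][]` is a set of separately allocated rows, so the numbers only indicate the order of magnitude. The class name `StrideSketch` is hypothetical, and the 64-byte cache line mentioned in the comment is an assumption about typical hardware, not something taken from the question.

```java
// Illustrative sketch; StrideSketch is a hypothetical name.
final class StrideSketch {
    static final int NX = 400, NY = 300, NZ = 400;

    // Flat index in which the last coordinate (z) varies fastest.
    static int index(int x, int y, int z) {
        return z + y * NZ + x * NZ * NY;
    }

    public static void main(String[] args) {
        // Distance between successive accesses when the inner loop steps the last index.
        int strideLast = index(0, 0, 1) - index(0, 0, 0);
        // Distance when the inner loop steps the first index instead.
        int strideFirst = index(1, 0, 0) - index(0, 0, 0);

        System.out.println(strideLast);                  // prints 1
        System.out.println(strideFirst);                 // prints 120000
        System.out.println(strideFirst * Integer.BYTES); // prints 480000
        // With an assumed 64-byte cache line, 16 consecutive ints share one line,
        // while a jump of 480000 bytes touches a different line on every access.
    }
}
```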

3) Most importantly: Can I use a data structure that is as quick as the assignment of a single integer, but can store as much data as the 3D-array?

No. There is no such structure, since the integer variable fits into a machine register (quicker even than the processor's memory cache), and can always be accessed faster than anything else you care to mention. Processor speeds are much, much faster than main memory speeds. If your 'working set' (the data that you need to operate on) does not fit into registers or cache, you will have to pay a penalty to fetch it from RAM (or even worse, disk).

That said, Java performs a bounds check on each array access, and does not seem to be too smart about optimizing those checks away. The following comparison may be of interest:

public static long test1(int[][][] array) {
    long start = System.currentTimeMillis();
    for ( int x = 0; x < 400; x++ ) {
        for ( int y = 0; y < 300; y++ ) {
            for ( int z = 0; z < 400; z++ ) {
                array[x][y][z] = x + y + z;
            }
        }
    }
    return System.currentTimeMillis() - start;
}

public static long test2(int [] array) {
    long start = System.currentTimeMillis();
    for ( int x = 0; x < 400; x++ ) {
        for ( int y = 0; y < 300; y++ ) {
            for ( int z = 0; z < 400; z++ ) {
                array[z + y*400 + x*400*300] = x + y + z;
            }
        }
    }
    return System.currentTimeMillis() - start;
}

public static void main(String[] args) {

    int[][][] a1 = new int[400][300][400];
    int[] a2 = new int[400*300*400];
    int n = 20;

    System.err.println("test1");
    for (int i=0; i<n; i++) {
        System.err.print(test1(a1) + "ms ");
    }
    System.err.println();
    System.err.println("test2");
    for (int i=0; i<n; i++) {
        System.err.print(test2(a2) + "ms ");
    }
    System.err.println();
}

The output, on my system, is

test1
164ms 177ms 148ms 149ms 148ms 147ms 150ms 151ms 152ms 154ms 151ms 150ms 148ms 148ms 150ms 148ms 150ms 148ms 148ms 149ms 
test2
141ms 153ms 130ms 130ms 130ms 133ms 130ms 130ms 130ms 132ms 129ms 131ms 130ms 131ms 131ms 130ms 131ms 130ms 130ms 130ms

Therefore, there is some room for improvement... but I really don't think it is worth your while.
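
If the flat layout from test2 is adopted anyway, the index arithmetic can be hidden behind a small wrapper so that callers keep writing three coordinates. This is a minimal sketch, not part of the original benchmark: `FlatGrid3D` and its methods are hypothetical names, and no timing is claimed for it.

```java
// Illustrative sketch; FlatGrid3D is a hypothetical name.
final class FlatGrid3D {
    private final int ny;
    private final int nz;
    private final int[] data;

    FlatGrid3D(int nx, int ny, int nz) {
        this.ny = ny;
        this.nz = nz;
        this.data = new int[nx * ny * nz]; // one contiguous block
    }

    // Same layout as test2: z + y*nz + x*nz*ny, so z varies fastest.
    // Individual coordinates are not validated; only the final index is
    // bounds-checked by the JVM, so an out-of-range y or z can alias another cell.
    private int index(int x, int y, int z) {
        return z + nz * (y + ny * x);
    }

    int get(int x, int y, int z) {
        return data[index(x, y, z)];
    }

    void set(int x, int y, int z, int value) {
        data[index(x, y, z)] = value;
    }

    public static void main(String[] args) {
        FlatGrid3D grid = new FlatGrid3D(400, 300, 400);
        grid.set(399, 299, 399, 7);
        System.out.println(grid.get(399, 299, 399)); // prints 7
    }
}
```

For the sizes in the question, 400 * 300 * 400 is 48,000,000 elements, which still fits in a single Java array.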
