Fastest way to recreate the ArrayList in a for loop


Question

In Java, I use the following function to print the distinct elements of each column of a huge matrix X:

// create the list of distinct values
List<Integer> values = new ArrayList<Integer>();

// X is n * m int[][] matrix
for (int j = 0, x; j < m; j++) {
    values.clear();
    for (int i = 0; i < n; i++) {
        x = X[i][j];
        if (values.contains(x)) continue;
        System.out.println(x);
        values.add(x);
    }
}

First I iterate by columns (index j) and inside by rows (index i).

This function will be called millions of times for different matrices, so the code should be optimized to meet the performance requirements. I'm wondering about the values list: would it be faster to use values = new ArrayList<Integer>(); or values = null instead of values.clear()?

Recommended answer

It would be much more efficient to use a Set instead of a List, for example the HashSet implementation. Its contains method runs in O(1) on average, instead of O(n) for a list. You can also save one call by calling only the add method, whose return value tells you whether the element was already present.

As for your specific question, I would just create a new Set at each loop - object creation is not that expensive, probably less than clearing the set (as confirmed by the benchmark at the bottom - see the most efficient version in EDIT 2):

for (int j = 0, x; j < m; j++) {
    Set<Integer> values = new HashSet<Integer>();
    for (int i = 0; i < n; i++) {
        x = X[i][j];
        if (!values.add(x)) continue; // values.add returns true only if the element was NOT in the set before
        System.out.println(x);
    }
}

However, the only way to know which is quicker (new object vs. clear) is to profile that portion of your code and check the performance of both versions.
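To make that comparison concrete, here is a minimal timing sketch. It is not the benchmark used for the numbers further down; the class name ClearVsNew, the matrix sizes and the random data are assumptions made up for illustration, and System.out.println is replaced by a counter so that console I/O does not dominate the measurement.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

public class ClearVsNew {
    // Counts distinct values per column, allocating a fresh set for every column.
    static long withNewSet(int[][] X, int n, int m) {
        long distinct = 0;
        for (int j = 0; j < m; j++) {
            Set<Integer> values = new HashSet<Integer>();
            for (int i = 0; i < n; i++) {
                if (values.add(X[i][j])) distinct++;
            }
        }
        return distinct;
    }

    // Same work, but reuses one set and clears it after each column.
    static long withClear(int[][] X, int n, int m) {
        long distinct = 0;
        Set<Integer> values = new HashSet<Integer>();
        for (int j = 0; j < m; j++) {
            for (int i = 0; i < n; i++) {
                if (values.add(X[i][j])) distinct++;
            }
            values.clear();
        }
        return distinct;
    }

    public static void main(String[] args) {
        int n = 1000, m = 1000;              // hypothetical matrix size
        int[][] X = new int[n][m];
        Random rnd = new Random(42);         // fixed seed so runs are comparable
        for (int[] row : X) {
            for (int j = 0; j < m; j++) row[j] = rnd.nextInt(500);
        }

        // Warm-up rounds so the JIT has compiled both methods before timing.
        for (int r = 0; r < 20; r++) {
            withNewSet(X, n, m);
            withClear(X, n, m);
        }

        long t0 = System.nanoTime();
        long a = withNewSet(X, n, m);
        long t1 = System.nanoTime();
        long b = withClear(X, n, m);
        long t2 = System.nanoTime();

        System.out.println("new set: " + (t1 - t0) / 1_000_000 + " ms, distinct = " + a);
        System.out.println("clear:   " + (t2 - t1) / 1_000_000 + " ms, distinct = " + b);
    }
}
```

A hand-rolled loop like this only gives a rough indication. For numbers you intend to rely on, a dedicated harness such as JMH is likely to be more trustworthy, and in either case you should run it on data shaped like your real matrices.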

EDIT

I ran a quick benchmark and the clear version seems a little faster than creating a set at each loop (by about 20%). You should still check on your dataset / use case which one is better. Faster code with my dataset:

Set<Integer> values = new HashSet<Integer>();
for (int j = 0, x; j < m; j++) {
    for (int i = 0; i < n; i++) {
        x = X[i][j];
        if (!values.add(x)) continue; // values.add returns true only if the element was NOT in the set before
        System.out.println(x);
    }
    values.clear();
}

EDIT 2

An even faster version of the code is obtained by creating a new set of the right size at each loop:

for (int j = 0, x; j < m; j++) {
    Set<Integer> values = new HashSet<Integer>(n, 1); //right size from the beginning
    for (int i = 0; i < n; i++) {
        x = X[i][j];
        if (!values.add(x)) continue; // values.add returns true only if the element was NOT in the set before
        System.out.println(x);
    }
}
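One plausible reason the pre-sized set wins is that it never has to grow its table while the column is being scanned. The sketch below is a simplified model of that growth, not code from the JDK: it assumes a default capacity of 16 and load factor of 0.75, a requested capacity rounded up to a power of two, and a table that doubles once the element count exceeds capacity * loadFactor. The class and method names (ResizeModel, doublings) are made up for illustration.

```java
public class ResizeModel {
    // Smallest power of two >= cap (assumed rounding of the requested capacity).
    static int roundUpToPowerOfTwo(int cap) {
        int p = 1;
        while (p < cap) p <<= 1;
        return p;
    }

    // Number of table doublings while inserting `distinct` distinct keys.
    static int doublings(int initialCapacity, float loadFactor, int distinct) {
        int capacity = roundUpToPowerOfTwo(initialCapacity);
        int count = 0;
        for (int size = 1; size <= distinct; size++) {
            // Grow once the element count exceeds the resize threshold.
            if (size > capacity * loadFactor) {
                capacity <<= 1;
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int n = 1000; // hypothetical column length with all values distinct
        System.out.println(doublings(16, 0.75f, n)); // default-sized set: prints 7
        System.out.println(doublings(n, 1f, n));     // new HashSet<Integer>(n, 1): prints 0
    }
}
```

Under this model, each doubling re-distributes every element stored so far, which is work the pre-sized version avoids entirely. The actual gain depends on how many distinct values each column contains, so it is still worth measuring on your own data.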

Summary of results

After JVM warm up + JIT:

Set<Integer> values = new HashSet<Integer>(n, 1); =====> 280 ms
values.clear();                                   =====> 380 ms
Set<Integer> values = new HashSet<Integer>();     =====> 450 ms 
