When should streams be preferred over traditional loops for best performance? Do streams take advantage of branch prediction?


Question

I just read about branch prediction and wanted to see how it plays out with Java 8 Streams.

However, the stream versions always turn out slower than the traditional loops.

import java.util.Arrays;
import java.util.Random;

public class StreamVsLoopBenchmark {
    public static void main(String[] args) {
        int totalSize = 32768;
        int filterValue = 1280;
        int[] array = new int[totalSize];
        Random rnd = new Random(0);
        int loopCount = 10000;

        for (int i = 0; i < totalSize; i++) {
            // array[i] = rnd.nextInt() % 2560; // Unsorted Data
            array[i] = i; // Sorted Data
        }

        // Variant 1: plain loop with the conditional (ternary) operator
        long start = System.nanoTime();
        long sum = 0;
        for (int j = 0; j < loopCount; j++) {
            for (int c = 0; c < totalSize; ++c) {
                sum += array[c] >= filterValue ? array[c] : 0;
            }
        }
        long total = System.nanoTime() - start;
        System.out.printf("Conditional Operator Time : %d ns, (%f sec) %n", total, total / Math.pow(10, 9));

        // Variant 2: plain loop with an if statement
        start = System.nanoTime();
        sum = 0;
        for (int j = 0; j < loopCount; j++) {
            for (int c = 0; c < totalSize; ++c) {
                if (array[c] >= filterValue) {
                    sum += array[c];
                }
            }
        }
        total = System.nanoTime() - start;
        System.out.printf("Branch Statement Time : %d ns, (%f sec) %n", total, total / Math.pow(10, 9));

        // Variant 3: sequential stream
        start = System.nanoTime();
        sum = 0;
        for (int j = 0; j < loopCount; j++) {
            sum += Arrays.stream(array).filter(value -> value >= filterValue).sum();
        }
        total = System.nanoTime() - start;
        System.out.printf("Streams Time : %d ns, (%f sec) %n", total, total / Math.pow(10, 9));

        // Variant 4: parallel stream
        start = System.nanoTime();
        sum = 0;
        for (int j = 0; j < loopCount; j++) {
            sum += Arrays.stream(array).parallel().filter(value -> value >= filterValue).sum();
        }
        total = System.nanoTime() - start;
        System.out.printf("Parallel Streams Time : %d ns, (%f sec) %n", total, total / Math.pow(10, 9));
    }
}

Output:


  1. For the sorted array:

Conditional Operator Time : 294062652 ns, (0.294063 sec) 
Branch Statement Time : 272992442 ns, (0.272992 sec) 
Streams Time : 806579913 ns, (0.806580 sec) 
Parallel Streams Time : 2316150852 ns, (2.316151 sec) 


  2. For the unsorted array:

    Conditional Operator Time : 367304250 ns, (0.367304 sec) 
    Branch Statement Time : 906073542 ns, (0.906074 sec) 
    Streams Time : 1268648265 ns, (1.268648 sec) 
    Parallel Streams Time : 2420482313 ns, (2.420482 sec) 
    


  • I tried the same code using a List:
    list.stream() instead of Arrays.stream(array)
    list.get(c) instead of array[c]

    Output:


    1. For the sorted list:

    Conditional Operator Time : 860514446 ns, (0.860514 sec) 
    Branch Statement Time : 663458668 ns, (0.663459 sec) 
    Streams Time : 2085657481 ns, (2.085657 sec) 
    Parallel Streams Time : 5026680680 ns, (5.026681 sec) 
    


    2. For the unsorted list:

    Conditional Operator Time : 704120976 ns, (0.704121 sec) 
    Branch Statement Time : 1327838248 ns, (1.327838 sec) 
    Streams Time : 1857880764 ns, (1.857881 sec) 
    Parallel Streams Time : 2504468688 ns, (2.504469 sec) 
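
The List variant is only described in words above, so here is a minimal sketch of what that substitution might look like. It assumes a boxed List<Integer> backed by an ArrayList; the class and method names (ListVariantSketch, sortedList, loopSum, streamSum) are hypothetical and not taken from the original code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the List-based variant; names are illustrative only.
final class ListVariantSketch {

    // Builds a boxed list holding 0..size-1 (the "sorted data" case).
    static List<Integer> sortedList(int size) {
        List<Integer> list = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            list.add(i);
        }
        return list;
    }

    // Loop version: list.get(c) replaces array[c]; each call unboxes an Integer.
    static long loopSum(List<Integer> list, int filterValue) {
        long sum = 0;
        for (int c = 0; c < list.size(); ++c) {
            int value = list.get(c);
            if (value >= filterValue) {
                sum += value;
            }
        }
        return sum;
    }

    // Stream version: list.stream() replaces Arrays.stream(array).
    static long streamSum(List<Integer> list, int filterValue) {
        return list.stream()
                   .filter(value -> value >= filterValue)
                   .mapToLong(Integer::longValue)
                   .sum();
    }
}
```

Because the elements are boxed Integer objects rather than primitive ints, both versions do extra work per element compared with the array code, which may be part of why every List timing above is higher than its array counterpart.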
    


  • I referred to a few blogs (this & this) which report the same performance issue with streams.


    1. I agree that programming with streams is nicer and easier in some scenarios, but if we lose performance, why should we use them? Is there something I'm missing?
    2. In which scenarios do streams perform as well as loops? Is it only when the function you pass in is expensive enough that the loop overhead becomes negligible?
    3. In none of the scenarios could I see streams taking advantage of branch prediction. (I tried sorted and unordered streams, to no avail; they took more than twice as long as normal streams.)


    Recommended answer


    I agree that programming with streams is nicer and easier in some scenarios, but if we lose performance, why should we use them?

    Performance is rarely an issue. It would be usual for 10% of your streams to need rewriting as loops to get the performance you need.


    Is there something I'm missing?

    Running work in parallel is much easier with streams (via parallelStream()), and possibly more efficient, because it is hard to write efficient concurrent code by hand.
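
To illustrate the point about ease of use, here is a small sketch in which the sequential and parallel versions differ by a single call. The class and method names are hypothetical. Whether the parallel version is actually faster depends on the workload; in the question's own measurements it was slower for this small array.

```java
import java.util.stream.IntStream;

// Hypothetical sketch: the only difference between the two methods is .parallel().
final class ParallelSumSketch {

    // Sequential pipeline: filter, widen to long, sum.
    static long sequentialSum(int[] data, int threshold) {
        return IntStream.of(data)
                        .filter(v -> v >= threshold)
                        .asLongStream()
                        .sum();
    }

    // Same pipeline run in parallel (on the common ForkJoinPool by default).
    static long parallelSum(int[] data, int threshold) {
        return IntStream.of(data)
                        .parallel()
                        .filter(v -> v >= threshold)
                        .asLongStream()
                        .sum();
    }
}
```

Writing the equivalent by hand would mean splitting the array, managing threads or tasks, and combining partial sums, which is where most concurrency bugs tend to creep in.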


    In which scenarios do streams perform as well as loops? Is it only when the function you pass in is expensive enough that the loop overhead becomes negligible?

    Your benchmark is flawed in the sense that the code hasn't been compiled by the JIT when the measurement starts. I would run the whole test repeatedly in a loop, as JMH does, or simply use JMH.
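
As a rough illustration of "running the whole test in a loop", the sketch below times each variant over several rounds so that later rounds run after the JIT has had a chance to compile the hot code. It is a hand-rolled approximation, not a substitute for JMH; the names (WarmupSketch, timeRounds, sink) and the choice of 10 rounds are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.function.LongSupplier;

// Hypothetical warm-up harness; JMH handles this (and much more) properly.
final class WarmupSketch {

    // Results are written here so the JIT cannot discard the computation.
    static volatile long sink;

    static long loopSum(int[] array, int filterValue) {
        long sum = 0;
        for (int c = 0; c < array.length; ++c) {
            if (array[c] >= filterValue) {
                sum += array[c];
            }
        }
        return sum;
    }

    static long streamSum(int[] array, int filterValue) {
        return Arrays.stream(array)
                     .filter(v -> v >= filterValue)
                     .asLongStream()
                     .sum();
    }

    // Runs the task `rounds` times and returns the elapsed nanoseconds per round.
    static long[] timeRounds(LongSupplier task, int rounds) {
        long[] nanos = new long[rounds];
        for (int r = 0; r < rounds; r++) {
            long start = System.nanoTime();
            sink = task.getAsLong();
            nanos[r] = System.nanoTime() - start;
        }
        return nanos;
    }

    public static void main(String[] args) {
        int[] array = new int[32768];
        for (int i = 0; i < array.length; i++) {
            array[i] = i; // sorted data, as in the question
        }
        // Early rounds include interpretation and compilation; compare the later ones.
        System.out.println("loop   ns/round: " + Arrays.toString(timeRounds(() -> loopSum(array, 1280), 10)));
        System.out.println("stream ns/round: " + Arrays.toString(timeRounds(() -> streamSum(array, 1280), 10)));
    }
}
```

Printing every round rather than a single total makes the warm-up effect visible: the first rounds are typically the slowest, and only the later ones say much about steady-state performance.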


    In none of the scenarios could I see streams taking advantage of branch prediction.

    Branch prediction is a CPU feature, not a JVM or streams feature.
