Java 8: performance of Streams vs Collections


Problem description


I'm new to Java 8. I still don't know the API in depth, but I've made a small informal benchmark to compare the performance of the new Streams API vs the good old Collections.

The test consists of filtering a list of Integer and, for each even number, calculating the square root and storing it in a result List of Double.

Here is the code:

    import java.util.ArrayList;
    import java.util.LinkedList;
    import java.util.List;
    import java.util.stream.Collectors;
    import java.util.stream.Stream;

    // Class name is illustrative; the original snippet showed only main().
    public class StreamsVsCollections {
        public static void main(String[] args) {
            // Calculating square roots of even numbers in [min, max)
            int min = 1;
            int max = 1000000;

            List<Integer> sourceList = new ArrayList<>();
            for (int i = min; i < max; i++) {
                sourceList.add(i);
            }

            List<Double> result = new LinkedList<>();

            // Collections approach
            long t0 = System.nanoTime();
            long elapsed = 0;
            for (Integer i : sourceList) {
                if (i % 2 == 0) {
                    result.add(Math.sqrt(i));
                }
            }
            elapsed = System.nanoTime() - t0;
            System.out.printf("Collections: Elapsed time:\t %d ns \t(%f seconds)%n", elapsed, elapsed / Math.pow(10, 9));

            // Stream approach
            Stream<Integer> stream = sourceList.stream();
            t0 = System.nanoTime();
            result = stream.filter(i -> i % 2 == 0).map(i -> Math.sqrt(i)).collect(Collectors.toList());
            elapsed = System.nanoTime() - t0;
            System.out.printf("Streams: Elapsed time:\t\t %d ns \t(%f seconds)%n", elapsed, elapsed / Math.pow(10, 9));

            // Parallel stream approach
            stream = sourceList.stream().parallel();
            t0 = System.nanoTime();
            result = stream.filter(i -> i % 2 == 0).map(i -> Math.sqrt(i)).collect(Collectors.toList());
            elapsed = System.nanoTime() - t0;
            System.out.printf("Parallel streams: Elapsed time:\t %d ns \t(%f seconds)%n", elapsed, elapsed / Math.pow(10, 9));
        }
    }

And here are the results for a dual core machine:

    Collections: Elapsed time:   94338247 ns    (0,094338 seconds)
    Streams: Elapsed time:       201112924 ns   (0,201113 seconds)
    Parallel streams: Elapsed time:  357243629 ns   (0,357244 seconds)

For this particular test, streams are about twice as slow as collections, and parallelism doesn't help (or am I using it the wrong way?).

Questions:

  • Is this test fair? Have I made any mistakes?
  • Are streams slower than collections? Has anyone made a good formal benchmark on this?
  • Which approach should I strive for?

Updated results.

I ran the test 1k times after JVM warmup (1k iterations) as advised by @pveentjer:

    Collections: Average time:      206884437,000000 ns     (0,206884 seconds)
    Streams: Average time:           98366725,000000 ns     (0,098367 seconds)
    Parallel streams: Average time: 167703705,000000 ns     (0,167704 seconds)

In this case streams are more performant. I wonder what would be observed in an app where the filtering function is only called once or twice during runtime.
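For reference, a warmup-then-average loop of the kind described above might look roughly like the sketch below. It is not the code that produced the numbers above: the class `WarmupTiming`, its method names, the smaller list size, and the use of `ArrayList` for the result are all assumptions made for illustration. Hand-rolled timing like this is still easily skewed by the JIT and garbage collector.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;
import java.util.stream.Collectors;

// Hypothetical harness; names and sizes are illustrative, not from the original test.
final class WarmupTiming {

    // Builds the list of integers in [min, max), like sourceList in the question.
    static List<Integer> buildSource(int min, int max) {
        List<Integer> source = new ArrayList<>(max - min);
        for (int i = min; i < max; i++) {
            source.add(i);
        }
        return source;
    }

    // Loop-based version (ArrayList result is an assumption; the question used LinkedList).
    static List<Double> collectionsApproach(List<Integer> source) {
        List<Double> result = new ArrayList<>();
        for (Integer i : source) {
            if (i % 2 == 0) {
                result.add(Math.sqrt(i));
            }
        }
        return result;
    }

    // Stream-based version, same pipeline as in the question.
    static List<Double> streamApproach(List<Integer> source) {
        return source.stream()
                .filter(i -> i % 2 == 0)
                .map(i -> Math.sqrt(i))
                .collect(Collectors.toList());
    }

    // Runs the task `warmup` times untimed, then `runs` times timed,
    // and returns the mean elapsed nanoseconds per timed run.
    static double averageNanos(Supplier<List<Double>> task, int warmup, int runs) {
        long sink = 0; // use every result so the work is less likely to be optimized away
        for (int i = 0; i < warmup; i++) {
            sink += task.get().size();
        }
        long total = 0;
        for (int i = 0; i < runs; i++) {
            long t0 = System.nanoTime();
            List<Double> r = task.get();
            total += System.nanoTime() - t0;
            sink += r.size();
        }
        if (sink < 0) {
            throw new IllegalStateException("result sizes cannot be negative");
        }
        return (double) total / runs;
    }

    public static void main(String[] args) {
        List<Integer> source = buildSource(1, 100_000);
        System.out.printf("Collections: Average time: %.0f ns%n",
                averageNanos(() -> collectionsApproach(source), 200, 200));
        System.out.printf("Streams: Average time: %.0f ns%n",
                averageNanos(() -> streamApproach(source), 200, 200));
    }
}
```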

Solution

  1. Stop using LinkedList for anything except heavy removal from the middle of the list using an iterator.

  2. Stop writing benchmarking code by hand; use JMH.
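To illustrate the one case named in point 1, here is a minimal sketch of removing elements from the middle of a LinkedList through its iterator. The class `LinkedListRemoval` and the method `removeOdds` are hypothetical names chosen for this example.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

// Hypothetical example class; not part of the benchmark.
final class LinkedListRemoval {

    // Copies the input into a LinkedList and removes every odd value in one pass.
    // Iterator.remove() on a LinkedList unlinks the current node in constant time,
    // so no later elements have to be shifted.
    static List<Integer> removeOdds(List<Integer> values) {
        LinkedList<Integer> list = new LinkedList<>(values);
        for (Iterator<Integer> it = list.iterator(); it.hasNext(); ) {
            if (it.next() % 2 != 0) {
                it.remove();
            }
        }
        return list;
    }

    public static void main(String[] args) {
        System.out.println(removeOdds(Arrays.asList(1, 2, 3, 4, 5, 6))); // prints [2, 4, 6]
    }
}
```

For append-only use, as in the benchmark in the question, an ArrayList is the more suitable choice.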

Proper benchmarks:

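    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.TimeUnit;
    import java.util.stream.Collectors;

    import org.openjdk.jmh.annotations.Benchmark;
    import org.openjdk.jmh.annotations.BenchmarkMode;
    import org.openjdk.jmh.annotations.Mode;
    import org.openjdk.jmh.annotations.OperationsPerInvocation;
    import org.openjdk.jmh.annotations.OutputTimeUnit;

    @OutputTimeUnit(TimeUnit.NANOSECONDS)
    @BenchmarkMode(Mode.AverageTime)
    @OperationsPerInvocation(StreamVsVanilla.N) // report time per source element
    public class StreamVsVanilla {
        public static final int N = 10000;

        static List<Integer> sourceList = new ArrayList<>();
        static {
            for (int i = 0; i < N; i++) {
                sourceList.add(i);
            }
        }

        @Benchmark
        public List<Double> vanilla() {
            // Presized so that neither variant pays for list growth
            List<Double> result = new ArrayList<>(sourceList.size() / 2 + 1);
            for (Integer i : sourceList) {
                if (i % 2 == 0) {
                    result.add(Math.sqrt(i));
                }
            }
            return result;
        }

        @Benchmark
        public List<Double> stream() {
            return sourceList.stream()
                    .filter(i -> i % 2 == 0)
                    .map(Math::sqrt)
                    .collect(Collectors.toCollection(
                        () -> new ArrayList<>(sourceList.size() / 2 + 1)));
        }
    }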

Result:

    Benchmark                   Mode   Samples         Mean   Mean error    Units
    StreamVsVanilla.stream      avgt        10       17.588        0.230    ns/op
    StreamVsVanilla.vanilla     avgt        10       10.796        0.063    ns/op
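Because of @OperationsPerInvocation(StreamVsVanilla.N), JMH divides the time of each method call by N = 10000, so one "op" here is one source element rather than one whole call. Converting the scores back to approximate time per call:

$$
\begin{aligned}
t_{\text{stream}} &\approx 17.588\ \text{ns/op} \times 10000\ \text{ops} = 175\,880\ \text{ns} \approx 0.176\ \text{ms} \\
t_{\text{vanilla}} &\approx 10.796\ \text{ns/op} \times 10000\ \text{ops} = 107\,960\ \text{ns} \approx 0.108\ \text{ms} \\
t_{\text{stream}} - t_{\text{vanilla}} &\approx 67\,920\ \text{ns} \approx 0.068\ \text{ms}
\end{aligned}
$$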

Just as I expected, the stream implementation is noticeably slower, at roughly 1.6 times the per-element cost of the vanilla loop. The JIT is able to inline all the lambda code, but it doesn't produce code as tight as the vanilla version does.

Generally, Java 8 streams are not magic. They can't speed up things that are already well implemented (with, probably, plain iteration or Java 5's for-each statements replaced with Iterable.forEach() and Collection.removeIf() calls). Streams are more about coding convenience and safety. The convenience versus speed tradeoff is at work here.
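As a rough illustration of the two Java 8 collection methods mentioned above, the sketch below performs the same filter-and-square-root task once with Collection.removeIf() and Iterable.forEach() and once with a stream. The class `ForEachRemoveIfDemo` and its method names are hypothetical, and the sketch makes no claim about which variant is faster.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical example class; names are illustrative only.
final class ForEachRemoveIfDemo {

    // Collections version: copy the input, drop odd values in place, then map with forEach.
    static List<Double> withCollections(List<Integer> source) {
        List<Integer> evens = new ArrayList<>(source);
        evens.removeIf(i -> i % 2 != 0);               // Collection.removeIf
        List<Double> result = new ArrayList<>(evens.size());
        evens.forEach(i -> result.add(Math.sqrt(i)));  // Iterable.forEach
        return result;
    }

    // Stream version of the same computation.
    static List<Double> withStream(List<Integer> source) {
        return source.stream()
                .filter(i -> i % 2 == 0)
                .map(Math::sqrt)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> source = Arrays.asList(1, 4, 9, 16);
        System.out.println(withCollections(source)); // prints [2.0, 4.0]
        System.out.println(withStream(source));      // prints [2.0, 4.0]
    }
}
```

The stream version states the intent in one expression and never mutates an intermediate list, which is the convenience and safety side of the tradeoff.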
