Performance for Java Stream.concat VS Collection.addAll

Question

For the purpose of combining two sets of data in a stream:

Stream.concat(stream1, stream2).collect(Collectors.toSet());

stream1.collect(Collectors.toSet())
       .addAll(stream2.collect(Collectors.toSet()));

Which one is more efficient, and why?

Answer

For the sake of readability and intent, Stream.concat(a, b).collect(toSet()) is far clearer than the second alternative.

As for the actual question, which is "what is the most efficient", here is a JMH test (I don't use JMH that much, so there may be some room to improve this benchmark).

Using JMH, with the following code:

package stackoverflow;

import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;
import java.util.stream.Stream;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Warmup;
import org.openjdk.jmh.infra.Blackhole;

@State(Scope.Benchmark)
@Warmup(iterations = 2)
@Fork(1)
@Measurement(iterations = 10)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@BenchmarkMode({ Mode.AverageTime})
public class StreamBenchmark {
  private Set<String> s1;
  private Set<String> s2;

  @Setup
  public void setUp() {
    // Build two disjoint sets of 1000 strings each: "0".."999" and "1000".."1999".
    final Set<String> valuesForA = new HashSet<>();
    final Set<String> valuesForB = new HashSet<>();
    for (int i = 0; i < 1000; ++i) {
      valuesForA.add(Integer.toString(i));
      valuesForB.add(Integer.toString(1000 + i));
    }
    s1 = valuesForA;
    s2 = valuesForB;
  }

  @Benchmark
  public void stream_concat_then_collect_using_toSet(final Blackhole blackhole) {
    final Set<String> set = Stream.concat(s1.stream(), s2.stream()).collect(Collectors.toSet());
    blackhole.consume(set);
  }

  @Benchmark
  public void s1_collect_using_toSet_then_addAll_using_toSet(final Blackhole blackhole) {
    final Set<String> set = s1.stream().collect(Collectors.toSet());
    set.addAll(s2.stream().collect(Collectors.toSet()));
    blackhole.consume(set);
  }
}
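
If you would rather launch the benchmark from a main method than through the JMH command-line harness, a minimal runner like the following works (the class name BenchmarkRunner is my own choice, not part of the original answer):

package stackoverflow;

import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

public class BenchmarkRunner {
  public static void main(String[] args) throws RunnerException {
    // Run every @Benchmark method of StreamBenchmark, using the
    // @Warmup/@Fork/@Measurement settings annotated on the class.
    Options options = new OptionsBuilder()
        .include(StreamBenchmark.class.getSimpleName())
        .build();
    new Runner(options).run();
  }
}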

You get these results (I omitted some parts for readability):

Result "s1_collect_using_toSet_then_addAll_using_toSet":
  156969,172 ±(99.9%) 4463,129 ns/op [Average]
  (min, avg, max) = (152842,561, 156969,172, 161444,532), stdev = 2952,084
  CI (99.9%): [152506,043, 161432,301] (assumes normal distribution)

Result "stream_concat_then_collect_using_toSet":
  104254,566 ±(99.9%) 4318,123 ns/op [Average]
  (min, avg, max) = (102086,234, 104254,566, 111731,085), stdev = 2856,171
  CI (99.9%): [99936,443, 108572,689] (assumes normal distribution)

# Run complete. Total time: 00:00:25

Benchmark                                                       Mode  Cnt       Score      Error  Units
StreamBenchmark.s1_collect_using_toSet_then_addAll_using_toSet  avgt   10  156969,172 ± 4463,129  ns/op
StreamBenchmark.stream_concat_then_collect_using_toSet          avgt   10  104254,566 ± 4318,123  ns/op

The version using Stream.concat(a, b).collect(toSet()) should perform faster (if I am reading the JMH numbers correctly).

On the other hand, I think this result makes sense: you don't create an intermediate set (which has some cost, even with a HashSet), and, as said in a comment on the first answer, the Stream is lazily concatenated.
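
To see that laziness in action, here is a small self-contained sketch (the peek calls and print statements are mine, added only to trace when elements are actually pulled):

package stackoverflow;

import java.util.stream.Collectors;
import java.util.stream.Stream;

public class LazyConcatDemo {
  public static void main(String[] args) {
    final Stream<String> a = Stream.of("a1", "a2").peek(s -> System.out.println("pulled " + s));
    final Stream<String> b = Stream.of("b1", "b2").peek(s -> System.out.println("pulled " + s));

    // Nothing is printed here: concat only builds a lazy view over both streams.
    final Stream<String> concatenated = Stream.concat(a, b);
    System.out.println("concatenated, nothing consumed yet");

    // Elements are pulled only when the terminal operation runs, and they go
    // straight into the single result set, with no intermediate collection.
    System.out.println(concatenated.collect(Collectors.toSet()));
  }
}

The "pulled" lines only appear after the "concatenated" message, which is the point: the addAll alternative materializes two HashSets before merging, while the concat pipeline touches each element exactly once.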

Using a profiler, you might see which part is slower. You might also want to use toCollection(() -> new HashSet<>(1000)) instead of toSet() to see whether the problem lies in growing the HashSet's internal hash array; see the sketch below.
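
One possible shape for that experiment, as one more @Benchmark method on the StreamBenchmark class above (the capacity of 2700 is my own assumption, not from the original answer: the HashSet constructor takes a capacity, not an element count, so with the default load factor of 0.75, holding all 2000 combined elements without a resize needs a capacity of at least 2000 / 0.75 ≈ 2667):

  @Benchmark
  public void stream_concat_then_collect_using_presized_hashset(final Blackhole blackhole) {
    // Same pipeline as stream_concat_then_collect_using_toSet, but collecting
    // into a pre-sized HashSet so the internal hash array never has to grow.
    final Set<String> set = Stream.concat(s1.stream(), s2.stream())
        .collect(Collectors.toCollection(() -> new HashSet<>(2700)));
    blackhole.consume(set);
  }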

