Performance of Java Stream.concat vs. Collection.addAll
Question
I want to combine two sets of data that come from streams. I can write either
Stream.concat(stream1, stream2).collect(Collectors.toSet());
or
stream1.collect(Collectors.toSet())
.addAll(stream2.collect(Collectors.toSet()));
Which is more efficient, and why?
Recommended answer
For readability and clarity of intent, Stream.concat(a, b).collect(toSet()) is much clearer than the second alternative.
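To make the readability point concrete, here is a minimal sketch of both variants side by side. The class and method names (ConcatVsAddAll, viaConcat, viaAddAll) are made up for illustration and are not from the question. One detail worth noticing: Collection.addAll returns a boolean, so the second variant needs a local variable to hold the resulting set.

```java
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ConcatVsAddAll {

    // Variant 1: concatenate the two streams, then collect once.
    static Set<String> viaConcat(Stream<String> a, Stream<String> b) {
        return Stream.concat(a, b).collect(Collectors.toSet());
    }

    // Variant 2: collect each stream into its own set, then merge them.
    // addAll returns a boolean, so the merged set must be kept in a variable.
    // Note: toSet() does not formally promise a mutable set; this relies on
    // current JDKs returning a HashSet, as the code in the question does.
    static Set<String> viaAddAll(Stream<String> a, Stream<String> b) {
        Set<String> result = a.collect(Collectors.toSet());
        result.addAll(b.collect(Collectors.toSet()));
        return result;
    }

    public static void main(String[] args) {
        Set<String> first = viaConcat(Stream.of("a", "b"), Stream.of("b", "c"));
        Set<String> second = viaAddAll(Stream.of("a", "b"), Stream.of("b", "c"));
        System.out.println(first.equals(second)); // prints true
    }
}
```

Both methods return the same elements; the difference is in how many steps the reader has to follow.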
As for the question itself, "which is the most efficient", here is a JMH test (I don't use JMH that much, so there may be room to improve my benchmark):
package stackoverflow;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Warmup;
import org.openjdk.jmh.infra.Blackhole;

@State(Scope.Benchmark)
@Warmup(iterations = 2)
@Fork(1)
@Measurement(iterations = 10)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@BenchmarkMode({ Mode.AverageTime })
public class StreamBenchmark {
    private Set<String> s1;
    private Set<String> s2;

    @Setup
    public void setUp() {
        final Set<String> valuesForA = new HashSet<>();
        final Set<String> valuesForB = new HashSet<>();
        for (int i = 0; i < 1000; ++i) {
            valuesForA.add(Integer.toString(i));
            valuesForB.add(Integer.toString(1000 + i));
        }
        s1 = valuesForA;
        s2 = valuesForB;
    }

    @Benchmark
    public void stream_concat_then_collect_using_toSet(final Blackhole blackhole) {
        final Set<String> set = Stream.concat(s1.stream(), s2.stream()).collect(Collectors.toSet());
        blackhole.consume(set);
    }

    @Benchmark
    public void s1_collect_using_toSet_then_addAll_using_toSet(final Blackhole blackhole) {
        final Set<String> set = s1.stream().collect(Collectors.toSet());
        set.addAll(s2.stream().collect(Collectors.toSet()));
        blackhole.consume(set);
    }
}
You get these results (I omitted some parts for readability):
Result "s1_collect_using_toSet_then_addAll_using_toSet":
  156969,172 ±(99.9%) 4463,129 ns/op [Average]
  (min, avg, max) = (152842,561, 156969,172, 161444,532), stdev = 2952,084
  CI (99.9%): [152506,043, 161432,301] (assumes normal distribution)

Result "stream_concat_then_collect_using_toSet":
  104254,566 ±(99.9%) 4318,123 ns/op [Average]
  (min, avg, max) = (102086,234, 104254,566, 111731,085), stdev = 2856,171
  CI (99.9%): [99936,443, 108572,689] (assumes normal distribution)

# Run complete. Total time: 00:00:25

Benchmark                                                       Mode  Cnt       Score      Error  Units
StreamBenchmark.s1_collect_using_toSet_then_addAll_using_toSet  avgt   10  156969,172 ± 4463,129  ns/op
StreamBenchmark.stream_concat_then_collect_using_toSet          avgt   10  104254,566 ± 4318,123  ns/op
The version using Stream.concat(a, b).collect(toSet()) should perform faster, if I am reading the JMH numbers correctly: about 104 µs per operation versus about 157 µs for the addAll version.
I think this result is expected, because you don't create an intermediate set (which has some cost, even with HashSet), and, as said in a comment on the first answer, the Stream is lazily concatenated.
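The lazy concatenation can be observed with a small experiment. This is only a sketch: the class name LazyConcatDemo and the log messages are invented for the demonstration. It records when elements are actually pulled from each source stream, using peek, and shows that nothing is consumed when Stream.concat is called, only when the terminal collect runs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyConcatDemo {

    // Returns a log of events so the order of evaluation can be inspected.
    static List<String> run() {
        List<String> log = new ArrayList<>();
        Stream<String> a = Stream.of("1", "2").peek(s -> log.add("a:" + s));
        Stream<String> b = Stream.of("3", "4").peek(s -> log.add("b:" + s));

        // Building the concatenated stream does not traverse a or b.
        Stream<String> joined = Stream.concat(a, b);
        log.add("concat called");

        // Elements flow only once the terminal operation runs, straight
        // into a single result set with no intermediate set in between.
        Set<String> set = joined.collect(Collectors.toSet());
        log.add("collected " + set.size());
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

In the resulting log, "concat called" appears before any "a:" or "b:" entry, which is what lazily concatenated means here.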
Using a profiler, you might see which part is slower. You might also want to use toCollection(() -> new HashSet<>(1000)) instead of toSet() to see whether the problem lies in growing the HashSet's internal hash array.
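A pre-sized variant might look like the sketch below. The class and method names (PresizedCollect, concatPresized) are hypothetical, and I have not benchmarked it. It also assumes you want to avoid every resize, so the capacity is computed from the expected element count and HashSet's default load factor of 0.75 rather than being the fixed 1000 mentioned above.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class PresizedCollect {

    // Collects both sets into one HashSet whose initial capacity is large
    // enough that the backing table should not need to grow while collecting.
    static Set<String> concatPresized(Set<String> s1, Set<String> s2) {
        int expected = s1.size() + s2.size();
        // With the default load factor of 0.75, a HashSet grows once its size
        // exceeds 75% of its capacity, so request expected / 0.75 up front.
        int capacity = (int) (expected / 0.75f) + 1;
        return Stream.concat(s1.stream(), s2.stream())
                .collect(Collectors.toCollection(() -> new HashSet<String>(capacity)));
    }

    public static void main(String[] args) {
        Set<String> a = new HashSet<>();
        Set<String> b = new HashSet<>();
        for (int i = 0; i < 1000; ++i) {
            a.add(Integer.toString(i));
            b.add(Integer.toString(1000 + i));
        }
        System.out.println(concatPresized(a, b).size());
    }
}
```

To compare it against toSet(), the body of concatPresized could be dropped into a third @Benchmark method of the class above.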