What's the better way to add elements from a Stream to an existing List?
Question
I have to write some code that adds the content of a Java 8 Stream to a List multiple times, and I'm having trouble figuring out what's the best way to do it. Based on what I read on SO (mainly this question: How to add elements of a Java8 stream into an existing List) and elsewhere, I've narrowed it down to the following options:
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

public class Accumulator<S, T> {

    private final Function<S, T> transformation;
    private final List<T> internalList = new ArrayList<>();

    public Accumulator(Function<S, T> transformation) {
        this.transformation = transformation;
    }

    // Option 1: collect into an intermediate list, then add it in one call.
    public void option1(List<S> newBatch) {
        internalList.addAll(newBatch.stream().map(transformation).collect(Collectors.toList()));
    }

    // Option 2: add each element to the internal list directly from the stream.
    public void option2(List<S> newBatch) {
        newBatch.stream().map(transformation).forEach(internalList::add);
    }
}
The idea is that the methods would be called multiple times on the same instance of Accumulator. The choice is between using an intermediate list and calling Collection.addAll() once outside of the stream, or calling Collection.add() from the stream for each element.
I would tend to prefer option 2, which is more in the spirit of functional programming and avoids creating an intermediate list. However, there might be benefits to calling addAll() once instead of calling add() n times when n is large.
Is one of the two options significantly better than the other?
Edit: JB Nizet has a very cool answer (https://stackoverflow.com/a/39495513/425682) that delays the transformation until all batches have been added. In my case, the transformation has to be performed straight away.
PS: In my example code, I've used transformation as a placeholder for whatever operations need to be performed on the stream.
Recommended answer
First of all, your second variant should be:
    public void option2(List<S> newBatch) {
        newBatch.stream().map(transformation).forEachOrdered(internalList::add);
    }
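to be correct, since forEach does not guarantee the encounter order and, for a parallel stream, may invoke the action concurrently from several threads.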
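To illustrate why forEachOrdered matters, here is a minimal sketch that deliberately uses a parallel stream. The class name OrderedAccumulatorDemo, the accumulate helper, and the sample data are assumptions made for illustration and are not part of the original code. Because forEachOrdered processes one element at a time in encounter order, adding to a plain ArrayList stays safe and the order of the batch is preserved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical demo class; names and data are illustrative only.
class OrderedAccumulatorDemo {

    // Maps every element of the batch and appends the results to a new list.
    public static <S, T> List<T> accumulate(List<S> batch, Function<S, T> transformation) {
        List<T> target = new ArrayList<>();
        // forEachOrdered runs the action for one element at a time, in encounter
        // order, even though the mapping itself may run in parallel.
        batch.parallelStream().map(transformation).forEachOrdered(target::add);
        return target;
    }

    public static void main(String[] args) {
        List<Integer> input = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            input.add(i);
        }
        List<String> out = accumulate(input, i -> "item" + i);
        System.out.println(out.size() + " elements, first = " + out.get(0));
    }
}
```

Replacing forEachOrdered with forEach in this sketch would make the result order unspecified and would modify a non-thread-safe ArrayList from several threads.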
Besides that, the advantage of addAll in
    public void option1(List<S> newBatch) {
        internalList.addAll(newBatch.stream().map(transformation).collect(Collectors.toList()));
    }
is moot, as the Collector API does not allow the Stream to provide hints about the expected size to the Collector and requires the Stream to evaluate the accumulator function for every element, which is nothing other than ArrayList::add in the current implementation.
So before this approach could get any benefit from addAll, it has already filled an ArrayList by repeatedly calling add on it, including potential capacity increase operations. So you can stay with option2 without regret.
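The point about the accumulator can be made concrete with a hand-written collector. The following is only a sketch of what such a list collector boils down to; the names ToListSketch and toListSketch are hypothetical, and this is not the JDK source. It shows that the stream has no way to pass a size hint and simply calls add once per element on a default-sized ArrayList.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collector;

// Hypothetical stand-in for a list collector; not the actual JDK implementation.
class ToListSketch {

    public static <T> Collector<T, ?, List<T>> toListSketch() {
        return Collector.<T, List<T>>of(
                ArrayList::new,           // supplier: no size hint is available here
                List::add,                // accumulator: invoked once per stream element
                (left, right) -> {        // combiner: only used for parallel streams
                    left.addAll(right);
                    return left;
                });
    }
}
```

With a collector like this, option1 performs n calls to add on the intermediate list and then an addAll on top of that, whereas option2 performs only the n calls to add.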
An alternative is to use a stream builder for temporary collections:
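import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class Accumulator<S, T> {

    private final Function<S, T> transformation;
    private final Stream.Builder<T> internal = Stream.builder();

    public Accumulator(Function<S, T> transformation) {
        this.transformation = transformation;
    }

    // Stream.Builder is itself a Consumer, so it can be passed to forEachOrdered directly.
    public void addBatch(List<S> newBatch) {
        newBatch.stream().map(transformation).forEachOrdered(internal);
    }

    // Builds the stream and collects it; no more batches can be added afterwards.
    public List<T> finish() {
        return internal.build().collect(Collectors.toList());
    }
}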
The stream builder uses a spined buffer, which does not require copying the contents when increasing its capacity, but the solution still suffers from the fact that the final collection step involves filling an ArrayList without an appropriate initial capacity (in the current implementation).
With the current implementation, it's far more efficient to implement the finishing step as
    public List<T> finish() {
        return Arrays.asList(internal.build().toArray(…));
    }
But this requires either an IntFunction<T[]> provided by the caller (as we can't create an array of a generic type ourselves) or an unchecked operation (treating an Object[] as a T[], which would be fine here but is still a nasty unchecked operation).
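Both variants of the array-based finishing step might look roughly like the sketch below. The class name ArrayFinishAccumulator, the arrayFactory parameter, and the finishUnchecked method are assumptions chosen for illustration; the answer itself leaves the toArray argument open.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.function.IntFunction;
import java.util.stream.Stream;

// Hypothetical variant of the Accumulator class; names are illustrative only.
class ArrayFinishAccumulator<S, T> {

    private final Function<S, T> transformation;
    private final Stream.Builder<T> internal = Stream.builder();

    public ArrayFinishAccumulator(Function<S, T> transformation) {
        this.transformation = transformation;
    }

    public void addBatch(List<S> newBatch) {
        newBatch.stream().map(transformation).forEachOrdered(internal);
    }

    // Variant 1: the caller supplies the array factory, e.g. String[]::new.
    public List<T> finish(IntFunction<T[]> arrayFactory) {
        return Arrays.asList(internal.build().toArray(arrayFactory));
    }

    // Variant 2: unchecked cast from Object[] to T[]. The array never leaves
    // this method except wrapped in the returned list.
    @SuppressWarnings("unchecked")
    public List<T> finishUnchecked() {
        return Arrays.asList((T[]) internal.build().toArray());
    }
}
```

A caller could then write new ArrayFinishAccumulator<Integer, String>(i -> "n" + i), add a few batches, and call finish(String[]::new). Note that Arrays.asList returns a fixed-size list and that only one of the two finishing methods can be called, once, because a Stream.Builder cannot be reused after build().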