What's the better way to add elements from a Stream to an existing List?


Question

I have to write some code that adds the content of a Java 8 Stream to a List multiple times, and I'm having trouble figuring out the best way to do it. Based on what I read on SO (mainly this question: How to add elements of a Java8 stream into an existing List) and elsewhere, I've narrowed it down to the following options:

import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

public class Accumulator<S, T> {


    private final Function<S, T> transformation;
    private final List<T> internalList = new ArrayList<T>();

    public Accumulator(Function<S, T> transformation) {
        this.transformation = transformation;
    }

    public void option1(List<S> newBatch) {
        internalList.addAll(newBatch.stream().map(transformation).collect(Collectors.toList()));
    }

    public void option2(List<S> newBatch) {
        newBatch.stream().map(transformation).forEach(internalList::add);
    }
}

The idea is that the methods would be called multiple times on the same instance of Accumulator. The choice is between using an intermediate list and calling Collection.addAll() once outside of the stream, or calling Collection.add() from the stream for each element.

I would tend to prefer option 2, which is more in the spirit of functional programming and avoids creating an intermediate list. However, there might be benefits to calling addAll() once instead of calling add() n times when n is large.

Is one of the two options significantly better than the other?

Edit: JB Nizet has a very cool answer (https://stackoverflow.com/a/39495513/425682) that delays the transformation until all batches have been added. In my case, the transformation has to be performed straight away.

PS: In my example code, I've used transformation as a placeholder for whatever operations need to be performed on the stream.

Recommended answer

First of all, your second variant should be:

public void option2(List<S> newBatch) {
  newBatch.stream().map(transformation).forEachOrdered(internalList::add);
}

to be correct.
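As a minimal sketch of why forEachOrdered is the safer terminal operation here: forEach makes no ordering guarantee and, on a parallel stream, may invoke the action from several threads at once, whereas forEachOrdered processes elements one at a time in encounter order. The class and method names below (ForEachOrderedDemo, appendOrdered) are hypothetical, and the parallel stream is used only to make the guarantee visible; the question's code uses a sequential stream.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class ForEachOrderedDemo {

    // Appends the doubled values to target in encounter order,
    // even though the source stream is parallel.
    static List<Integer> appendOrdered(List<Integer> source, List<Integer> target) {
        source.parallelStream()
              .map(x -> x * 2)
              .forEachOrdered(target::add); // one element at a time, in encounter order
        return target;
    }

    public static void main(String[] args) {
        List<Integer> source = IntStream.range(0, 1000).boxed().collect(Collectors.toList());
        List<Integer> result = appendOrdered(source, new ArrayList<>());
        List<Integer> expected = source.stream().map(x -> x * 2).collect(Collectors.toList());
        System.out.println(result.equals(expected)); // prints "true"
    }
}
```

Replacing forEachOrdered with forEach in this sketch would call the non-thread-safe ArrayList.add concurrently, so neither the order nor the size of the result would be reliable.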

Besides that, the advantage of addAll in

public void option1(List<S> newBatch) {
  internalList.addAll(newBatch.stream().map(transformation).collect(Collectors.toList()));
}

is moot, because the Collector API does not let the Stream pass a hint about the expected size to the Collector, and it requires the Stream to evaluate the accumulator function for every element. In the current implementation, that accumulator is nothing other than ArrayList::add.

So before this approach can get any benefit from addAll, it has already filled an ArrayList by repeatedly calling add on it, including potential capacity increase operations. So you can stay with option2 without regret.
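To make that argument concrete, here is a rough sketch of a hand-written collector with the same shape the answer attributes to Collectors.toList(): an ArrayList supplier, List::add as the accumulator, and addAll as the combiner. The names ToListSketch, toListSketch, viaAddAll and viaForEachOrdered are made up for illustration, and the collector is an approximation of the behavior described, not the JDK's actual source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collector;

public class ToListSketch {

    // Hypothetical stand-in for Collectors.toList(): every element goes
    // through List::add, and the collector receives no size hint up front.
    static <T> Collector<T, ?, List<T>> toListSketch() {
        return Collector.<T, List<T>>of(
            ArrayList::new,
            List::add,
            (left, right) -> { left.addAll(right); return left; });
    }

    // Option 1 style: n calls to add on a temporary list, then one addAll.
    static List<Integer> viaAddAll(List<Integer> target, List<String> batch) {
        target.addAll(batch.stream().map(String::length).collect(toListSketch()));
        return target;
    }

    // Option 2 style: n calls to add directly on the target list.
    static List<Integer> viaForEachOrdered(List<Integer> target, List<String> batch) {
        batch.stream().map(String::length).forEachOrdered(target::add);
        return target;
    }

    public static void main(String[] args) {
        List<String> batch = Arrays.asList("a", "bb", "ccc");
        System.out.println(viaAddAll(new ArrayList<>(), batch));
        System.out.println(viaForEachOrdered(new ArrayList<>(), batch));
    }
}
```

Both methods leave the target list with the same contents; the first simply pays for an extra temporary list before the bulk copy.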

An alternative is to use a stream builder for temporary collections:

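import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class Accumulator<S, T> {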
    private final Function<S, T> transformation;
    private final Stream.Builder<T> internal = Stream.builder();

    public Accumulator(Function<S, T> transformation) {
        this.transformation = transformation;
    }

    public void addBatch(List<S> newBatch) {
        newBatch.stream().map(transformation).forEachOrdered(internal);
    }

    public List<T> finish() {
        return internal.build().collect(Collectors.toList());
    }
}

The stream builder uses a spined buffer, which does not require copying the contents when its capacity increases. The solution still suffers from the fact that the final collection step fills an ArrayList without an appropriate initial capacity (in the current implementation).

With the current implementation, it's far more efficient to implement the finishing step as

public List<T> finish() {
    return Arrays.asList(internal.build().toArray(…));
}

But this requires either an IntFunction<T[]> provided by the caller (as we can't create an array of a generic type ourselves), or an unchecked operation (pretending an Object[] is a T[], which would be OK here, but is still a nasty unchecked operation).
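The first of those two routes could look roughly like the sketch below, where the caller hands in the array factory at construction time. The class name ArrayBackedAccumulator and the field name arrayFactory are assumptions made for this example; the answer itself leaves the argument to toArray unspecified.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.function.IntFunction;
import java.util.stream.Stream;

public class ArrayBackedAccumulator<S, T> {
    private final Function<S, T> transformation;
    // Hypothetical: supplied by the caller, e.g. Integer[]::new
    private final IntFunction<T[]> arrayFactory;
    private final Stream.Builder<T> internal = Stream.builder();

    public ArrayBackedAccumulator(Function<S, T> transformation, IntFunction<T[]> arrayFactory) {
        this.transformation = transformation;
        this.arrayFactory = arrayFactory;
    }

    public void addBatch(List<S> newBatch) {
        // Stream.Builder is itself a Consumer, so it can be passed directly
        newBatch.stream().map(transformation).forEachOrdered(internal);
    }

    public List<T> finish() {
        // toArray can size the array exactly; Arrays.asList wraps it without copying
        return Arrays.asList(internal.build().toArray(arrayFactory));
    }

    public static void main(String[] args) {
        ArrayBackedAccumulator<String, Integer> acc =
            new ArrayBackedAccumulator<>(String::length, Integer[]::new);
        acc.addBatch(Arrays.asList("a", "bb"));
        acc.addBatch(Arrays.asList("ccc"));
        System.out.println(acc.finish()); // prints "[1, 2, 3]"
    }
}
```

Two caveats apply to this sketch: the list returned by Arrays.asList is fixed-size, and a Stream.Builder cannot accept further elements or be built again once build() has been called, so finish() is a one-shot operation.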
