How to add elements of a Java 8 stream into an existing List

Question

The Javadoc of Collector shows how to collect elements of a stream into a new List. Is there a one-liner that adds the results into an existing ArrayList?

Recommended answer

NOTE: nosid's answer shows how to add to an existing collection using forEachOrdered(). This is a useful and effective technique for mutating existing collections. My answer addresses why you shouldn't use a Collector to mutate an existing collection.
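For reference, the forEachOrdered() technique mentioned in the note might look roughly like the sketch below. The class and method names (ForEachOrderedAppend, appendAll) are made up for illustration and do not come from nosid's answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical demo class; the names are illustrative assumptions.
final class ForEachOrderedAppend {

    // Appends every element of the stream to target in encounter order.
    // forEachOrdered() processes one element at a time, so a plain
    // ArrayList is acceptable as the target even for a parallel stream.
    static <T> List<T> appendAll(Stream<? extends T> source, List<T> target) {
        source.forEachOrdered(target::add);
        return target;
    }

    public static void main(String[] args) {
        List<String> destList = new ArrayList<>(Arrays.asList("foo"));
        List<String> newList = Arrays.asList("0", "1", "2", "3", "4", "5");
        appendAll(newList.parallelStream(), destList);
        System.out.println(destList); // prints [foo, 0, 1, 2, 3, 4, 5]
    }
}
```

No Collector is involved here, so there is no supplier contract to violate.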

The short answer is no: at least in general, you shouldn't use a Collector to modify an existing collection.

The reason is that collectors are designed to support parallelism, even over collections that aren't thread-safe. The way they do this is to have each thread operate independently on its own collection of intermediate results. Each thread gets its own collection by calling Collector.supplier(), which is required to return a new collection each time.

These collections of intermediate results are then merged, again in a thread-confined fashion, until there is a single result collection. This is the final result of the collect() operation.
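The supplier, accumulation, and merge steps described above can be made visible with the three-argument form of Stream.collect(). The following is only a sketch: the class and method names (ThreadConfinedCollect, collectThenAppend) are hypothetical, and appending the finished result with addAll() is one possible way to get the elements into an existing list without breaking the supplier contract.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; the names are illustrative assumptions.
final class ThreadConfinedCollect {

    // Collects source in parallel into a NEW list, then appends that
    // list to target once, from the calling thread.
    static List<String> collectThenAppend(List<String> source, List<String> target) {
        ArrayList<String> collected = source.parallelStream().collect(
                () -> new ArrayList<String>(),         // supplier: a fresh, empty container each call
                (list, s) -> list.add(s),              // accumulator: adds to one thread-confined container
                (left, right) -> left.addAll(right));  // combiner: merges two distinct containers
        target.addAll(collected);
        return target;
    }

    public static void main(String[] args) {
        List<String> destList = new ArrayList<>(Arrays.asList("foo"));
        List<String> newList = Arrays.asList("0", "1", "2", "3", "4", "5");
        System.out.println(collectThenAppend(newList, destList)); // prints [foo, 0, 1, 2, 3, 4, 5]
    }
}
```

Because the supplier returns a new ArrayList every time, no two threads ever touch the same container, and the combiner never merges a list with itself.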

A couple of answers, from Balder and assylias, have suggested using Collectors.toCollection() and then passing a supplier that returns an existing list instead of a new list. This violates the requirement on the supplier, which is that it return a new, empty collection each time.

This will work for simple cases, as the examples in their answers demonstrate. However, it will fail, particularly if the stream is run in parallel. (A future version of the library might change in some unforeseen way that will cause it to fail, even in the sequential case.)

Here's a simple example:

List<String> destList = new ArrayList<>(Arrays.asList("foo"));
List<String> newList = Arrays.asList("0", "1", "2", "3", "4", "5");
newList.parallelStream()
       .collect(Collectors.toCollection(() -> destList));
System.out.println(destList);

When I run this program, I often get an ArrayIndexOutOfBoundsException. This is because multiple threads are operating on an ArrayList, a thread-unsafe data structure. OK, let's make it synchronized:

List<String> destList =
    Collections.synchronizedList(new ArrayList<>(Arrays.asList("foo")));

This will no longer fail with an exception. But instead of the expected result:

[foo, 0, 1, 2, 3, 4, 5]

it gives weird results like this:

[foo, 2, 3, foo, 2, 3, 1, 0, foo, 2, 3, foo, 2, 3, 1, 0, foo, 2, 3, foo, 2, 3, 1, 0, foo, 2, 3, foo, 2, 3, 1, 0]

This is the result of the thread-confined accumulation/merging operations I described above. With a parallel stream, each thread calls the supplier to get its own collection for intermediate accumulation. If you pass a supplier that returns the same collection, each thread appends its results to that collection. Since there is no ordering among the threads, results will be appended in some arbitrary order.

Then, when these intermediate collections are merged, this basically merges the list with itself. Lists are merged using List.addAll(), whose specification says that the behavior is undefined if the source collection is modified during the operation. In this case, ArrayList.addAll() does an array-copy operation, so it ends up duplicating itself, which is sort of what one would expect, I guess. (Note that other List implementations might have completely different behavior.) Anyway, this explains the weird results and duplicated elements in the destination.
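The self-merge can be reproduced in isolation, without any streams or threads. This sketch assumes OpenJDK's ArrayList, whose addAll() takes an array snapshot of the argument before copying; the specification leaves the behavior undefined, so other List implementations may do something else entirely. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; the names are illustrative assumptions.
final class SelfAddAllDemo {

    // Merges a list with itself, which is what the combiner effectively
    // does when the supplier keeps returning the same list.
    static List<String> mergeWithItself(List<String> list) {
        list.addAll(list);
        return list;
    }

    public static void main(String[] args) {
        List<String> destList = new ArrayList<>(Arrays.asList("foo", "0", "1"));
        // On OpenJDK's ArrayList this prints [foo, 0, 1, foo, 0, 1]
        System.out.println(mergeWithItself(destList));
    }
}
```

Each such merge doubles the list, which is consistent with the repeated "foo" runs in the output shown earlier.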

You might say, "I'll just make sure to run my stream sequentially" and go ahead and write code like this

stream.collect(Collectors.toCollection(() -> existingList))

anyway. I'd recommend against doing this. If you control the stream, sure, you can guarantee that it won't run in parallel. I expect that a style of programming will emerge where streams get handed around instead of collections. If somebody hands you a stream and you use this code, it'll fail if the stream happens to be parallel. Worse, somebody might hand you a sequential stream and this code will work fine for a while, pass all tests, etc. Then, some arbitrary amount of time later, code elsewhere in the system might change to use parallel streams, which will cause your code to break.

OK, then just make sure to remember to call sequential() on any stream before you use this code:

stream.sequential().collect(Collectors.toCollection(() -> existingList))

Of course, you'll remember to do this every time, right? :-) Let's say you do. Then the performance team will be wondering why all their carefully crafted parallel implementations aren't providing any speedup. And once again they'll trace it down to your code, which is forcing the entire stream to run sequentially.

Don't do it.
