Encounter order friendly/unfriendly terminal operations vs parallel/sequential vs ordered/unordered streams

Question

Inspired by this question, I started to play with ordered vs unordered streams, parallel vs sequential streams and terminal operations that respect encounter order vs terminal operations that don't respect it.

In one answer to the linked question, code similar to the following is shown:

List<Integer> ordered = Arrays.asList(
    1, 2, 3, 4, 4, 3, 2, 1, 1, 2, 3, 4, 4, 3, 2, 1, 1, 2, 3, 4);
List<Integer> result = new CopyOnWriteArrayList<>();

ordered.parallelStream().forEach(result::add);

System.out.println(ordered);
System.out.println(result);

And the lists are indeed different. The unordered list even changes from one run to another, showing that the result is actually non-deterministic.

So I created this other example:

CopyOnWriteArrayList<Integer> result2 = ordered.parallelStream()
        .unordered()
        .collect(Collectors.toCollection(CopyOnWriteArrayList::new));

System.out.println(ordered);
System.out.println(result2);

And I expected to see similar results, as the stream is both parallel and unordered (maybe unordered() is redundant, since it's already parallel). However, the resulting list is ordered, i.e. it's equal to the source list.

So my question is: why is the collected list ordered? Does collect always respect encounter order, even for parallel, unordered streams? Or is it the specific Collectors.toCollection(...) collector that forces encounter order?

Recommended answer

Collectors.toCollection returns a Collector which lacks the Collector.Characteristics.UNORDERED characteristic. Another collector which specified Collector.Characteristics.UNORDERED might behave differently.
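
One way to check this is to print the characteristics that each collector reports. The sketch below is only illustrative: the class and method names are made up for this example, and Collectors.toSet() is used as a familiar collector that is documented as unordered.

```java
import java.util.ArrayList;
import java.util.Set;
import java.util.stream.Collector;
import java.util.stream.Collectors;

// Hypothetical demo class; the name is not from the original answer.
class CollectorCharacteristicsDemo {

    // Characteristics reported by Collectors.toCollection(ArrayList::new).
    static Set<Collector.Characteristics> toCollectionCharacteristics() {
        Collector<Integer, ?, ArrayList<Integer>> c = Collectors.toCollection(ArrayList::new);
        return c.characteristics();
    }

    // Characteristics reported by Collectors.toSet(), which is documented as an unordered collector.
    static Set<Collector.Characteristics> toSetCharacteristics() {
        Collector<Integer, ?, Set<Integer>> c = Collectors.toSet();
        return c.characteristics();
    }

    public static void main(String[] args) {
        System.out.println("toCollection: " + toCollectionCharacteristics());
        System.out.println("toSet:        " + toSetCharacteristics());
        // Check the UNORDERED flag explicitly for both collectors.
        System.out.println("toCollection is UNORDERED: "
                + toCollectionCharacteristics().contains(Collector.Characteristics.UNORDERED));
        System.out.println("toSet is UNORDERED:        "
                + toSetCharacteristics().contains(Collector.Characteristics.UNORDERED));
    }
}
```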

That said: "unordered" means no guarantees, not guaranteed to vary. If the library finds it easiest to treat an unordered collection as ordered, it is allowed to do so, and that behavior is allowed to change release to release, on Tuesdays, or if there is a full moon.

(Note also that Collectors.toCollection does not require you to use a concurrent collection implementation if you're going to use parallel streams; toCollection(ArrayList::new) would work fine. That's because the collector doesn't have the Collector.Characteristics.CONCURRENT characteristic, so it uses a collection strategy that works for non-concurrent collections even with parallel streams.)
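
As a minimal sketch of that point, the following collects an ordered parallel stream into a plain ArrayList. The class and method names are invented for illustration. Because the source list is ordered and the collector is not UNORDERED, the result should match the source order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical demo class; the name is not from the original answer.
class ParallelToCollectionDemo {

    static final List<Integer> ORDERED = Arrays.asList(
            1, 2, 3, 4, 4, 3, 2, 1, 1, 2, 3, 4, 4, 3, 2, 1, 1, 2, 3, 4);

    // Collects a parallel stream into a non-concurrent ArrayList.
    // No concurrent collection is needed: each worker fills its own list
    // and the partial lists are merged by the collector's combiner.
    static List<Integer> collectInParallel() {
        return ORDERED.parallelStream()
                .collect(Collectors.toCollection(ArrayList::new));
    }

    public static void main(String[] args) {
        List<Integer> result = collectInParallel();
        System.out.println(ORDERED);
        System.out.println(result);
        System.out.println("same order as source: " + result.equals(ORDERED));
    }
}
```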

If you use an unordered stream but a collector that isn't UNORDERED, or vice versa, I doubt you get any guarantees from the framework. If there were a table, it would say "HERE BE UNDEFINED BEHAVIOR" (with "DRAGONS" crossed out). I'd also expect some differences for different kinds of chained operations here, e.g. Eugene mentions that findFirst varies here, even though findFirst is inherently an ordered operation: unordered().findFirst() becomes equivalent to findAny().
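
The findFirst case can be sketched as follows. The class, method names and sample data are assumptions made for this example. On an ordered parallel stream, findFirst() returns the first element in encounter order; after unordered(), the only thing one can rely on is that some element of the source comes back, and which one may differ between runs or JDK versions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical demo class; the name is not from the original answer.
class FindFirstDemo {

    static final List<Integer> SOURCE = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8);

    // Ordered parallel stream: findFirst() respects encounter order.
    static Optional<Integer> orderedFindFirst() {
        return SOURCE.parallelStream().findFirst();
    }

    // Unordered parallel stream: findFirst() may return any element, like findAny().
    static Optional<Integer> unorderedFindFirst() {
        return SOURCE.parallelStream().unordered().findFirst();
    }

    public static void main(String[] args) {
        System.out.println("ordered:   " + orderedFindFirst());
        // The value printed here is not guaranteed to be the first element.
        System.out.println("unordered: " + unorderedFindFirst());
    }
}
```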

For Stream.collect, I believe the current implementation has three strategies it chooses between:


  • Sequential: starts one accumulator, accumulates elements into it (in encounter order, because why would you bother doing work to shuffle the elements? just accept them in the order you get them), calls the finisher.
  • Parallel execution, concurrent collector, and either the stream or the collector is unordered: one accumulator, shards the input, worker threads process the elements from each shard and add elements to the accumulator when they're ready, calls the finisher.
  • Parallel execution, anything else: shards the input into N shards, each shard gets sequentially accumulated into its own distinct accumulator, the accumulators get combined with the combiner function, calls the finisher.
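
The three strategies above can be probed with a rough experiment that counts how often the collector's supplier is invoked. Everything in this sketch is an assumption made for illustration: the class and helper names, the 10,000-element source, and the two hand-written collectors. The counts reflect what current OpenJDK builds appear to do; they are not guaranteed by the specification.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collector;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical demo class; names and sizes are illustrative only.
class CollectStrategyDemo {

    static final List<Integer> SOURCE =
            IntStream.range(0, 10_000).boxed().collect(Collectors.toList());

    // Non-concurrent, ordered list collector that counts created containers.
    static Collector<Integer, List<Integer>, List<Integer>> countingListCollector(AtomicInteger containers) {
        return Collector.of(
                () -> { containers.incrementAndGet(); return new ArrayList<>(); },
                List::add,
                (left, right) -> { left.addAll(right); return left; });
    }

    // CONCURRENT + UNORDERED collector that counts created containers.
    static Collector<Integer, ConcurrentLinkedQueue<Integer>, ConcurrentLinkedQueue<Integer>>
            countingConcurrentCollector(AtomicInteger containers) {
        return Collector.of(
                () -> { containers.incrementAndGet(); return new ConcurrentLinkedQueue<>(); },
                ConcurrentLinkedQueue::add,
                (left, right) -> { left.addAll(right); return left; },
                Collector.Characteristics.CONCURRENT,
                Collector.Characteristics.UNORDERED);
    }

    // Strategy 1: sequential stream, a single container is expected.
    static int containersSequential() {
        AtomicInteger n = new AtomicInteger();
        SOURCE.stream().collect(countingListCollector(n));
        return n.get();
    }

    // Strategy 2: parallel stream with a concurrent, unordered collector,
    // a single shared container is expected.
    static int containersParallelConcurrent() {
        AtomicInteger n = new AtomicInteger();
        SOURCE.parallelStream().collect(countingConcurrentCollector(n));
        return n.get();
    }

    // Strategy 3: parallel stream with a non-concurrent collector,
    // one container per shard, later merged by the combiner.
    static int containersParallelOther() {
        AtomicInteger n = new AtomicInteger();
        SOURCE.parallelStream().collect(countingListCollector(n));
        return n.get();
    }

    // Strategy 3 still yields the elements in encounter order.
    static List<Integer> parallelOtherResult() {
        return SOURCE.parallelStream().collect(countingListCollector(new AtomicInteger()));
    }

    public static void main(String[] args) {
        System.out.println("sequential containers:          " + containersSequential());
        System.out.println("parallel concurrent containers: " + containersParallelConcurrent());
        System.out.println("parallel other containers:      " + containersParallelOther());
        System.out.println("parallel other keeps order:     " + parallelOtherResult().equals(SOURCE));
    }
}
```

The number of containers in the third case depends on how the runtime splits the input, so it can differ between machines and runs.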
