Stream collect accumulator/combiner order


Question

This is basically a follow-up to this answer of mine.

Suppose I am writing a custom collector whose accumulator always adds some element to the collection returned by the supplier. Is there any chance that, when the combiner is called, one of the intermediate results is empty? An example is probably a lot simpler to understand.

Suppose I have a List of numbers and I want to split it into a List of Lists, where 2 is the separator. For example, given 1, 2, 3, 4, 2, 8, the result should be [[1], [3, 4], [8]]. This is not really complicated to achieve (don't judge the code too much, I wrote something quick just so I could ask this question).

List<List<Integer>> result = Stream.of(1, 2, 3, 4, 2, 8)
            .collect(Collector.of(
                    () -> new ArrayList<>(),
                    (list, elem) -> {
                        if (list.isEmpty()) {
                            List<Integer> inner = new ArrayList<>();
                            inner.add(elem);
                            list.add(inner);
                        } else {
                            if (elem == 2) {
                                list.add(new ArrayList<>());
                            } else {
                                List<Integer> last = list.get(list.size() - 1);
                                last.add(elem);
                            }
                        }
                    },
                    (left, right) -> {
                        // This is the real question here:
                        // can left or right be empty here?
                        return left;
                    }));

This is probably irrelevant for this example, but the question is: can one of the elements passed to the combiner be an empty List? I am really inclined to say NO, since in the documentation these are referred to as:


combiner - an associative, non-interfering, stateless function that accepts two partial result containers and merges them.

To me, that "partial" indicates that the accumulator was called on them before they reached the combiner, but I just wanted to be sure.

Recommended answer

There is no guarantee that the accumulator has been applied to a container before merging. In other words, the lists to merge may be empty.

To demonstrate this:

IntStream.range(0, 10).parallel().boxed()
         .filter(i -> i >= 3 && i < 7)
         .collect(ArrayList::new, List::add, (l1,l2)->{
             System.out.println(l1.size()+" + "+l2.size());
             l1.addAll(l2);
         });

On my machine, this prints:

0 + 0
0 + 0
0 + 0
1 + 1
0 + 2
0 + 2
1 + 1
2 + 0
2 + 2

The workload is split at the source, when the outcome of the filter operation is not known yet. Each chunk is processed the same way, without rechecking whether any element has reached the accumulator.
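The point about splitting at the source can be made visible by splitting a Spliterator by hand. The following is only a sketch, not what the stream framework literally does: it uses an ArrayList as a stand-in source, and the class and method names (SplitBeforeFilter, drain, chunks) are made up for illustration. It relies on the ArrayList spliterator handing off the first half of its remaining range on each trySplit, which is how OpenJDK implements it but is not promised by the specification.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Spliterator;

public class SplitBeforeFilter {
    // Drains a spliterator into a list so the chunk's contents can be inspected.
    static List<Integer> drain(Spliterator<Integer> s) {
        List<Integer> out = new ArrayList<>();
        s.forEachRemaining(out::add);
        return out;
    }

    // Splits the source 0..9 twice, before any filtering has happened.
    static List<List<Integer>> chunks() {
        List<Integer> source = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            source.add(i);
        }
        Spliterator<Integer> right = source.spliterator();
        Spliterator<Integer> middle = right.trySplit();  // takes a prefix of the source
        Spliterator<Integer> left = middle.trySplit();   // takes a prefix of that prefix
        return List.of(drain(left), drain(middle), drain(right));
    }

    public static void main(String[] args) {
        for (List<Integer> chunk : chunks()) {
            // Apply the same predicate as the filter in the answer to each chunk.
            List<Integer> kept = new ArrayList<>(chunk);
            kept.removeIf(i -> !(i >= 3 && i < 7));
            System.out.println(chunk + " -> " + kept);
        }
    }
}
```

Any chunk that holds only values below 3 or above 6 yields an empty container after filtering, yet it still takes part in the merge like every other chunk.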

Mind that starting with Java 9, you can also do something like

IntStream.range(0, 10).parallel().boxed()
        .collect(Collectors.filtering(i -> i >= 3 && i < 7, Collectors.toList()));

which is another reason why a collector (here, the toList() collector) should be prepared to encounter empty containers: the filtering happens outside the Stream implementation, and an accept call on the compound collector's accumulator doesn't always imply an accept call on the downstream collector's accumulator.
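To see the filtering collector's behavior without relying on how a parallel run happens to split, its functions can be driven by hand. This is a minimal sketch that assumes Java 9 or later; the class FilteringDemo and the helper collectChunk are hypothetical names, and collectChunk merely imitates how a single chunk would be processed.

```java
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Collectors;

public class FilteringDemo {
    // Runs supplier, accumulator and finisher over one chunk, sequentially.
    static <T, A, R> R collectChunk(Collector<T, A, R> collector, List<T> chunk) {
        A container = collector.supplier().get();
        for (T element : chunk) {
            // The compound accumulator is invoked for every element of the chunk.
            collector.accumulator().accept(container, element);
        }
        return collector.finisher().apply(container);
    }

    // Collects one chunk with the same filtering collector as in the answer.
    static List<Integer> keepThreeToSix(List<Integer> chunk) {
        Collector<Integer, ?, List<Integer>> collector =
                Collectors.filtering(i -> i >= 3 && i < 7, Collectors.toList());
        return collectChunk(collector, chunk);
    }

    public static void main(String[] args) {
        // Three accept calls on the compound accumulator, none reaches toList().
        System.out.println(keepThreeToSix(List.of(0, 1, 2)));
        // Here two of the three elements pass through to the downstream collector.
        System.out.println(keepThreeToSix(List.of(2, 3, 4)));
    }
}
```

In the first call the accumulator has been applied three times, but the downstream list is still empty when it would be handed to a combiner.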

The requirement of being able to handle empty containers is specified in the Collector documentation:


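To ensure that sequential and parallel executions produce equivalent results, the collector functions must satisfy an identity and an associativity constraint.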

The identity constraint says that for any partially accumulated result, combining it with an empty result container must produce an equivalent result. That is, for a partially accumulated result a that is the result of any series of accumulator and combiner invocations, a must be equivalent to combiner.apply(a, supplier.get()).
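Applied to the question's splitting problem, one way to satisfy the identity constraint is to let the supplier return a container that already holds a single empty group, so that merging with a fresh container changes nothing. The sketch below is an illustration and not the code from the question: splitOn and SplitCollector are made-up names, and, unlike the question's accumulator, a leading or trailing separator yields an empty group.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Stream;

public class SplitCollector {
    // Hypothetical collector: splits the elements into groups at each separator.
    static Collector<Integer, List<List<Integer>>, List<List<Integer>>> splitOn(int separator) {
        return Collector.of(
                () -> {
                    // A fresh container holds one open, empty group.
                    List<List<Integer>> groups = new ArrayList<>();
                    groups.add(new ArrayList<>());
                    return groups;
                },
                (groups, elem) -> {
                    if (elem == separator) {
                        groups.add(new ArrayList<>());            // open a new group
                    } else {
                        groups.get(groups.size() - 1).add(elem);  // extend the last group
                    }
                },
                (left, right) -> {
                    // The first group of right continues the last group of left.
                    left.get(left.size() - 1).addAll(right.get(0));
                    left.addAll(right.subList(1, right.size()));
                    return left;
                });
    }

    static List<List<Integer>> split(boolean parallel) {
        Stream<Integer> numbers = Stream.of(1, 2, 3, 4, 2, 8);
        return (parallel ? numbers.parallel() : numbers).collect(splitOn(2));
    }

    public static void main(String[] args) {
        System.out.println(split(false)); // [[1], [3, 4], [8]]
        System.out.println(split(true));  // [[1], [3, 4], [8]]
    }
}
```

Because a fresh container is [[]], combiner.apply(a, supplier.get()) appends nothing to the last group of a and adds no further groups, so a is left equivalent, which is what the documentation asks for.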
